Core Concepts

Repositories

A repository is the top-level container for versioned data. It contains:

  • Multiple tables (data sources)
  • Multiple branches (parallel versions)
  • A history of commits (snapshots in time)

Branches

Branches are independent lines of development. As in Git:

  • main is the default branch
  • Create branches for features, experiments, or environments
  • Branches are zero-copy - creating one copies no table data, so it is effectively instant
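
One way to picture why branch creation is cheap: if a branch is only a named pointer to a commit, creating it copies one identifier and no table data. The sketch below assumes that model; `create_branch`, the `branches` dictionary, and the commit ids are hypothetical and are not Horizon Epoch's API.

```python
from typing import Dict


def create_branch(branches: Dict[str, str], name: str, source: str = "main") -> None:
    """Point a new branch at the commit the source branch currently references."""
    if name in branches:
        raise ValueError(f"branch {name!r} already exists")
    # Only a commit id is copied; no records are read or written.
    branches[name] = branches[source]


# Hypothetical state: "main" points at commit "c41".
branches = {"main": "c41"}
create_branch(branches, "experiment")

# A later commit on "experiment" moves only that branch's pointer.
branches["experiment"] = "c42"
```

After this runs, `main` still references `c41` while `experiment` references `c42`, which is the sense in which the two lines of development are independent.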

Commits

A commit is a snapshot of one or more tables at a point in time:

  • Contains a message describing the changes
  • Links to parent commit(s) forming a history graph
  • Tracks which records changed
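
The three properties above can be sketched as a small data structure. This is an illustrative model only: the `Commit` fields, the `history` helper, and the ids are assumptions made for the example, not the actual storage format.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple


@dataclass(frozen=True)
class Commit:
    commit_id: str
    message: str                    # describes the changes
    parents: Tuple[str, ...]        # empty for the root, two entries for a merge
    changed_keys: FrozenSet[str]    # primary keys of the records that changed


def history(commits: Dict[str, Commit], head: str) -> List[str]:
    """Return every commit id reachable from head by following parent links."""
    seen, order, stack = set(), [], [head]
    while stack:
        cid = stack.pop()
        if cid in seen:
            continue
        seen.add(cid)
        order.append(cid)
        stack.extend(commits[cid].parents)
    return order


# Hypothetical graph: c1 and c2 both build on c0, and c3 merges them.
commits = {
    c.commit_id: c
    for c in [
        Commit("c0", "Initial import", (), frozenset()),
        Commit("c1", "Fix customer emails", ("c0",), frozenset({"customers:42"})),
        Commit("c2", "Add pricing tier", ("c0",), frozenset({"plans:7"})),
        Commit("c3", "Merge pricing work", ("c1", "c2"), frozenset()),
    ]
}

linear = history(commits, "c1")
merged = history(commits, "c3")
```

Because a merge commit has two parents, the history forms a graph rather than a list; walking from `c3` reaches all four commits, while walking from `c1` reaches only `c1` and `c0`.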

Tables

Tables are data sources that Horizon Epoch tracks:

  • Can be database tables (PostgreSQL, MySQL, SQL Server, SQLite) or object storage (S3, Azure, GCS, local)
  • Each table has a schema and primary key
  • Changes are tracked at the record level

Copy-on-Write

Horizon Epoch uses copy-on-write (CoW) semantics:

  • When you create a branch, no data is copied
  • Only modified records are stored separately
  • Queries automatically overlay changes on base data

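Because a branch stores only the records it modifies, creating a branch is fast and storage grows with the number of changes rather than with the size of the base data.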
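
A minimal sketch of the overlay idea, assuming records are keyed by primary key and each branch keeps only its own changes. `BranchView` and its methods are hypothetical names used for illustration, not part of Horizon Epoch.

```python
from typing import Dict, Optional

_DELETED = object()  # marker for a record deleted on the branch


class BranchView:
    """Reads fall through to shared base data unless the branch changed the key."""

    def __init__(self, base: Dict[int, dict]) -> None:
        self._base = base     # shared and never modified by the branch
        self._delta = {}      # only the records this branch touched

    def put(self, key: int, record: dict) -> None:
        self._delta[key] = record

    def delete(self, key: int) -> None:
        self._delta[key] = _DELETED

    def get(self, key: int) -> Optional[dict]:
        if key in self._delta:
            record = self._delta[key]
            return None if record is _DELETED else record
        return self._base.get(key)

    def scan(self) -> Dict[int, dict]:
        """Overlay the branch's changes on the base data."""
        combined = {**self._base, **self._delta}
        return {k: v for k, v in combined.items() if v is not _DELETED}

    def stored_records(self) -> int:
        return len(self._delta)


base = {1: {"name": "Ada"}, 2: {"name": "Bob"}, 3: {"name": "Cy"}}
view = BranchView(base)      # creating the view copies nothing
view.put(2, {"name": "Bob B."})
view.delete(3)
```

Here the branch stores two entries (one update, one deletion marker) regardless of how large `base` is, and `base` itself is left unchanged for other branches to read.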

Three-Way Merge

When merging branches, Horizon Epoch uses three-way merge:

  1. Find the common ancestor
  2. Identify changes from ancestor to each branch
  3. Combine non-conflicting changes
  4. Report conflicts for overlapping changes

Conflicts can be resolved at the field level.
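
Steps 2 to 4 can be sketched for a single record as follows. The sketch assumes the common ancestor version is already known and treats a record as a flat dictionary of fields; `merge_record` is a hypothetical helper, and treating a missing field as `None` is a simplification.

```python
from typing import Dict, List, Tuple


def merge_record(ancestor: Dict, ours: Dict, theirs: Dict) -> Tuple[Dict, List[str]]:
    """Three-way merge of one record, field by field.

    Returns the merged record and the names of the conflicting fields.
    """
    merged, conflicts = {}, []
    for name in sorted(set(ancestor) | set(ours) | set(theirs)):
        a, o, t = ancestor.get(name), ours.get(name), theirs.get(name)
        if o == t:        # both sides agree (or neither changed the field)
            value = o
        elif o == a:      # only their side changed it
            value = t
        elif t == a:      # only our side changed it
            value = o
        else:             # both changed it differently: a conflict
            conflicts.append(name)
            value = o     # placeholder until the conflict is resolved
        if value is not None:
            merged[name] = value
    return merged, conflicts


ancestor = {"id": 7, "email": "a@old.example", "tier": "free", "city": "Oslo"}
ours = {"id": 7, "email": "a@new.example", "tier": "free", "city": "Bergen"}
theirs = {"id": 7, "email": "a@old.example", "tier": "pro", "city": "Tromso"}

merged, conflicts = merge_record(ancestor, ours, theirs)
```

In this example the `email` change from one side and the `tier` change from the other combine cleanly, and only `city`, which both sides changed differently, is reported as a conflict.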

Storage Adapters

Horizon Epoch works with your existing storage through adapters:

Relational Databases:

  • PostgreSQL - Full constraint support, TLS/SSL
  • MySQL - Full constraint support, SSH tunnels
  • SQL Server - Full constraint support, Windows auth
  • SQLite - File-based or in-memory, partial constraint support

Object Storage (Delta Lake format):

  • AWS S3 - Including S3-compatible stores (MinIO, etc.)
  • Azure Blob Storage - Multiple auth methods
  • Google Cloud Storage - Service account or ADC
  • Local Filesystem - For development and edge deployments

Each adapter implements the same interface, so version control operations work identically across storage types.
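
To illustrate what a shared interface buys, the sketch below defines a hypothetical two-method adapter contract and runs the same change-detection function against an in-memory backend and a SQLite backend. The `StorageAdapter` class, its method names, and the `records` table layout are assumptions made for this example; the real adapter interface may look quite different.

```python
import json
import sqlite3
from abc import ABC, abstractmethod
from typing import Dict, List


class StorageAdapter(ABC):
    """Hypothetical adapter contract, for illustration only."""

    @abstractmethod
    def write_record(self, table: str, key: str, record: dict) -> None: ...

    @abstractmethod
    def read_records(self, table: str) -> Dict[str, dict]: ...


class InMemoryAdapter(StorageAdapter):
    def __init__(self) -> None:
        self._tables: Dict[str, Dict[str, dict]] = {}

    def write_record(self, table: str, key: str, record: dict) -> None:
        self._tables.setdefault(table, {})[key] = dict(record)

    def read_records(self, table: str) -> Dict[str, dict]:
        return dict(self._tables.get(table, {}))


class SqliteAdapter(StorageAdapter):
    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS records ("
            "tbl TEXT, pk TEXT, body TEXT, PRIMARY KEY (tbl, pk))"
        )

    def write_record(self, table: str, key: str, record: dict) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO records (tbl, pk, body) VALUES (?, ?, ?)",
            (table, key, json.dumps(record)),
        )
        self._conn.commit()

    def read_records(self, table: str) -> Dict[str, dict]:
        rows = self._conn.execute(
            "SELECT pk, body FROM records WHERE tbl = ?", (table,)
        )
        return {pk: json.loads(body) for pk, body in rows}


def changed_keys(adapter: StorageAdapter, table: str, before: Dict[str, dict]) -> List[str]:
    """Storage-agnostic: keys whose record differs from an earlier snapshot."""
    after = adapter.read_records(table)
    return sorted(k for k in set(before) | set(after) if before.get(k) != after.get(k))


results = []
for adapter in (InMemoryAdapter(), SqliteAdapter()):
    adapter.write_record("users", "1", {"name": "Ada"})
    before = adapter.read_records("users")
    adapter.write_record("users", "1", {"name": "Ada L."})
    adapter.write_record("users", "2", {"name": "Bob"})
    results.append(changed_keys(adapter, "users", before))
```

`changed_keys` never inspects which backend it was given, so both runs produce the same list of changed keys. That is the property the paragraph above describes: logic written against the interface behaves the same on every storage type.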

Metadata Layer

All versioning information is stored in a metadata database (PostgreSQL):

  • Commit graph (directed acyclic graph of commits)
  • Branch pointers (references to commits)
  • Table registrations and schema versions
  • Change tracking indices

The metadata layer is separate from your data storage, meaning Horizon Epoch doesn’t modify how your data is stored.