Copy-on-Write
Copy-on-Write (CoW) is the fundamental mechanism that enables Horizon Epoch to create instant branches and track changes efficiently.
Overview
When you create a branch, no data is copied. Instead:
- The branch is created as a pointer to the current commit
- When you modify a record, only that record is copied
- Queries combine the original data with branch-specific changes
How It Works
Initial State
main branch:
┌─────────────────────────────────────────┐
│ users table │
│ id │ name │ email │
│─────┼─────────┼─────────────────────────│
│ 1 │ Alice │ alice@example.com │
│ 2 │ Bob │ bob@example.com │
│ 3 │ Charlie │ charlie@example.com │
└─────────────────────────────────────────┘
Branch Creation (Zero-Copy)
epoch branch create feature/update-emails
┌─── feature/update-emails
│ (pointer to same data)
▼
main ───────────────►┌─────────────────────────────────────────┐
│ users table │
│ (shared, no copy) │
└─────────────────────────────────────────┘
No data is copied. Both branches point to the same underlying data.
First Modification (Copy-on-Write)
# On feature/update-emails branch
client.execute("UPDATE users SET email = 'alice@company.com' WHERE id = 1")
Now the modified record is stored separately:
main branch: feature/update-emails:
┌───────────────────────┐ ┌─────────────────────────────┐
│ users (base) │ │ users (overlay) │
│ id │ name │ email │ │ id │ name │ email │
│─────┼───────┼─────────│ │─────┼───────┼───────────────│
│ 1 │ Alice │ alice@ │ │ 1 │ Alice │ alice@company │
│ 2 │ Bob │ bob@ │ └─────────────────────────────┘
│ 3 │ Charl │ charlie@│ Only modified records stored
└───────────────────────┘
Querying a Branch
When you query the feature branch:
SELECT * FROM users;
Horizon Epoch:
- Checks the branch overlay for each record
- Returns overlay version if found
- Returns base version otherwise
Result on feature/update-emails:
│ 1 │ Alice │ alice@company.com │ ← from overlay
│ 2 │ Bob │ bob@example.com │ ← from base
│ 3 │ Charlie │ charlie@example.com │ ← from base
Storage Layout: Before and After Modifications
Understanding how data is physically stored helps you reason about performance and storage costs.
PostgreSQL Implementation
Horizon Epoch uses version tables with a naming convention {table}__v_{version_id}:
State 0: Initial Setup (Main Branch Only)
┌─────────────────────────────────────────────────────────────────────────┐
│ Database: my_database │
│ │
│ users (base table) │
│ ┌──────┬─────────┬─────────────────────┬──────────┐ │
│ │ id │ name │ email │ status │ │
│ ├──────┼─────────┼─────────────────────┼──────────┤ │
│ │ 1 │ Alice │ alice@example.com │ active │ │
│ │ 2 │ Bob │ bob@example.com │ active │ │
│ │ 3 │ Charlie │ charlie@example.com │ inactive │ │
│ └──────┴─────────┴─────────────────────┴──────────┘ │
│ │
│ Total storage: ~3 records │
└─────────────────────────────────────────────────────────────────────────┘
State 1: Branch Created (Zero-Copy)
epoch branch create feature/update-users
┌─────────────────────────────────────────────────────────────────────────┐
│ Database: my_database │
│ │
│ users (base table) users__v_feature-update-users │
│ ┌──────┬─────────┬─────────────────────┐ ┌───────────────────────────┐│
│ │ id │ name │ email │ │ (empty table) ││
│ ├──────┼─────────┼─────────────────────┤ │ ││
│ │ 1 │ Alice │ alice@example.com │ │ Created with same schema ││
│ │ 2 │ Bob │ bob@example.com │ │ using LIKE ... INCLUDING ││
│ │ 3 │ Charlie │ charlie@example.com │ │ ALL (copies constraints) ││
│ └──────┴─────────┴─────────────────────┘ └───────────────────────────┘│
│ │
│ Total storage: ~3 records (0 additional for branch) │
│ Branch creation time: O(1) - just creates empty table │
└─────────────────────────────────────────────────────────────────────────┘
State 2: First Modification (Copy-on-Write Triggers)
# On feature branch: Update Alice's email
client.execute("UPDATE users SET email = 'alice@company.com' WHERE id = 1")
┌─────────────────────────────────────────────────────────────────────────┐
│ Database: my_database │
│ │
│ users (base table) users__v_feature-update-users │
│ ┌──────┬─────────┬─────────────────────┐ ┌──────┬───────┬────────────┐│
│ │ id │ name │ email │ │ id │ name │ email ││
│ ├──────┼─────────┼─────────────────────┤ ├──────┼───────┼────────────┤│
│ │ 1 │ Alice │ alice@example.com │ │ 1 │ Alice │ alice@co.. ││
│ │ 2 │ Bob │ bob@example.com │ └──────┴───────┴────────────┘│
│ │ 3 │ Charlie │ charlie@example.com │ │
│ └──────┴─────────┴─────────────────────┘ Only 1 record copied! │
│ │
│ Total storage: ~4 records (1 additional for branch) │
│ Base table: UNCHANGED │
└─────────────────────────────────────────────────────────────────────────┘
State 3: Multiple Modifications
# More changes on feature branch
client.execute("UPDATE users SET status = 'vip' WHERE id = 1") # Updates overlay
client.execute("INSERT INTO users VALUES (4, 'Dana', 'dana@company.com', 'active')")
client.execute("DELETE FROM users WHERE id = 2") # Marks as deleted
┌─────────────────────────────────────────────────────────────────────────┐
│ Database: my_database │
│ │
│ users (base table) users__v_feature-update-users │
│ ┌──────┬─────────┬─────────────────────┐ ┌──────┬───────┬────────────┐│
│ │ id │ name │ email │ │ id │ name │ status ││
│ ├──────┼─────────┼─────────────────────┤ ├──────┼───────┼────────────┤│
│ │ 1 │ Alice │ alice@example.com │ │ 1 │ Alice │ vip (mod) ││
│ │ 2 │ Bob │ bob@example.com │ │ 4 │ Dana │ active(new)││
│ │ 3 │ Charlie │ charlie@example.com │ │ 2 │ NULL │ _DELETED_ ││
│ └──────┴─────────┴─────────────────────┘ └──────┴───────┴────────────┘│
│ │
│ Total storage: ~6 records (3 additional for branch) │
│ Main branch queries: See only original 3 records │
│ Feature branch queries: See Alice(mod) + Charlie + Dana = 3 records │
└─────────────────────────────────────────────────────────────────────────┘
S3/Delta Lake Implementation
Delta Lake uses a similar pattern with separate directories:
State 0: Initial Table
s3://horizon-data/
└── repositories/my-repo/
└── tables/
└── users/ # Base table (main branch)
├── _delta_log/
│ └── 00000000000000000000.json # Transaction log
├── part-00000-abc123.parquet # Data file 1
└── part-00001-def456.parquet # Data file 2
Total: ~100MB (all records)
State 1: Branch Created
s3://horizon-data/
└── repositories/my-repo/
└── tables/
├── users/ # Base table (unchanged)
│ ├── _delta_log/
│ └── *.parquet
│
└── users__v_feature-xyz/ # Version table (empty)
└── _delta_log/
└── 00000000000000000000.json # Schema only, no data files
Additional storage: ~1KB (just metadata)
State 2: After Modifications
s3://horizon-data/
└── repositories/my-repo/
└── tables/
├── users/ # Base table (unchanged: 100MB)
│ ├── _delta_log/
│ └── *.parquet
│
└── users__v_feature-xyz/ # Version table
├── _delta_log/
│ ├── 00000000000000000000.json # Initial schema
│ └── 00000000000000000001.json # Insert/update transaction
└── part-00000-xyz789.parquet # Only modified records (~1MB)
Additional storage: ~1MB (only changed records)
Query Overlay Mechanism
The query overlay mechanism is the heart of CoW - it makes branched data appear seamless to applications.
The UNION ALL + NOT EXISTS Pattern
Horizon Epoch uses a highly optimized SQL pattern that leverages PostgreSQL’s query planner:
-- Overlay query for feature branch
SELECT * FROM schema.users__v_feature_branch
UNION ALL
SELECT b.* FROM schema.users b
WHERE NOT EXISTS (
SELECT 1 FROM schema.users__v_feature_branch v
WHERE b.id = v.id -- Primary key match
)
How it works:
- First SELECT: Returns ALL records from the version table (modified/new records)
- UNION ALL: Combines without deduplication (faster than UNION)
- Second SELECT: Returns base records WHERE they DON’T exist in version table
- NOT EXISTS: Anti-join pattern - highly optimized in PostgreSQL
Query Flow Diagram
┌─────────────────────────────────────────────────────────────────────────┐
│ SELECT * FROM users │
│ (on feature branch) │
└─────────────────────────────────┬───────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────┐
│ Horizon Epoch Query Rewriter │
│ ┌────────────────────────────────────────────────────────────────────┐ │
│ │ 1. Detect current branch context │ │
│ │ 2. Identify version table: users__v_feature_branch │ │
│ │ 3. Rewrite query with UNION ALL pattern │ │
│ └────────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────┬───────────────────────────────────────┘
│
┌─────────────┴─────────────┐
│ │
▼ ▼
┌──────────────────────────────┐ ┌──────────────────────────────────────┐
│ Version Table Scan │ │ Base Table Scan with Anti-Join │
│ users__v_feature_branch │ │ users WHERE NOT EXISTS (version) │
│ │ │ │
│ Returns: Modified records │ │ Returns: Unmodified records │
│ (Alice with new email) │ │ (Bob, Charlie - unchanged) │
└──────────────┬───────────────┘ └──────────────────┬───────────────────┘
│ │
└──────────────┬───────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────┐
│ UNION ALL Result │
│ ┌──────┬─────────┬─────────────────────┐ │
│ │ id │ name │ email │ │
│ ├──────┼─────────┼─────────────────────┤ │
│ │ 1 │ Alice │ alice@company.com │ ← from version table │
│ │ 2 │ Bob │ bob@example.com │ ← from base (not in version) │
│ │ 3 │ Charlie │ charlie@example.com │ ← from base (not in version) │
│ └──────┴─────────┴─────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────┘
Composite Primary Key Handling
For tables with composite primary keys, the pattern adapts:
-- Table with composite PK (order_id, line_item_id)
SELECT * FROM schema.order_lines__v_feature_branch
UNION ALL
SELECT b.* FROM schema.order_lines b
WHERE NOT EXISTS (
SELECT 1 FROM schema.order_lines__v_feature_branch v
WHERE b.order_id = v.order_id
AND b.line_item_id = v.line_item_id
)
Filtered Queries
When you add a WHERE clause, it’s applied to both parts:
-- Original query
SELECT * FROM users WHERE status = 'active'
-- Rewritten overlay query
SELECT * FROM schema.users__v_feature_branch
WHERE status = 'active'
UNION ALL
SELECT b.* FROM schema.users b
WHERE b.status = 'active'
AND NOT EXISTS (
SELECT 1 FROM schema.users__v_feature_branch v
WHERE b.id = v.id
)
Query Optimization Details
Point Lookups (O(1))
SELECT * FROM users WHERE id = 1
Optimized path:
- Check version table first (indexed lookup)
- If found → return immediately
- If not found → check base table
- Return result
-- Horizon Epoch generates:
SELECT * FROM schema.users__v_feature_branch WHERE id = 1
UNION ALL
SELECT * FROM schema.users
WHERE id = 1
AND NOT EXISTS (
SELECT 1 FROM schema.users__v_feature_branch WHERE id = 1
)
With proper indexing, this is two index lookups - very fast.
Full Table Scans
SELECT * FROM users
Performance depends on overlay size:
- Empty overlay: ~0% overhead (NOT EXISTS returns quickly)
- Small overlay (< 1% of base): ~5-10% overhead
- Large overlay (> 10% of base): Consider materialization
Aggregations
SELECT status, COUNT(*) FROM users GROUP BY status
Aggregations work correctly across the union:
SELECT status, COUNT(*) FROM (
SELECT status FROM schema.users__v_feature_branch
UNION ALL
SELECT b.status FROM schema.users b
WHERE NOT EXISTS (...)
) combined
GROUP BY status
S3/Delta Lake Query Pattern
For Delta Lake tables, Horizon Epoch uses DataFusion SQL engine:
-- Register tables in DataFusion context
CREATE EXTERNAL TABLE base STORED AS DELTA LOCATION 's3://bucket/users/'
CREATE EXTERNAL TABLE branch STORED AS DELTA LOCATION 's3://bucket/users__v_feature/'
-- Query with overlay
SELECT * FROM branch
UNION ALL
SELECT * FROM base
WHERE id NOT IN (SELECT id FROM branch)
Performance Benchmarks
| Scenario | Base Query | Overlay Query | Overhead |
|---|---|---|---|
| Point lookup (indexed) | 0.5ms | 0.6ms | +20% |
| Range scan (1000 rows) | 5ms | 5.5ms | +10% |
| Full scan (1M rows, 1K overlay) | 500ms | 520ms | +4% |
| Full scan (1M rows, 100K overlay) | 500ms | 600ms | +20% |
| Aggregation (1M rows) | 200ms | 220ms | +10% |
Key insight: Overhead is proportional to overlay size, not base table size.
Index Recommendations
For optimal overlay query performance:
-- Version tables automatically inherit indexes via LIKE ... INCLUDING ALL
-- But you may want to ensure these exist:
-- Primary key index (automatic)
CREATE INDEX ON users__v_feature_branch (id);
-- Covering indexes for common queries
CREATE INDEX ON users__v_feature_branch (status) INCLUDE (name, email);
Benefits
1. Instant Branch Creation
Creating a branch takes constant time regardless of data size:
- 1 row table: ~1ms
- 1 billion row table: ~1ms
2. Storage Efficiency
Only store what changes:
- Base: 1TB
- Branch with 1000 changes: ~1MB additional
3. Fast Merging
Merge only examines changed records:
- Base: 1 billion rows
- Changes: 1000 rows
- Merge compares: 1000 rows (not 1 billion)
4. Time Travel
Every commit preserves the overlay state:
- Query any historical version
- No additional storage for old versions
- Instant access to history
Considerations
Query Performance
Overlaid queries have slight overhead:
- Must check overlay before/after base
- Overhead is proportional to overlay size
- Negligible for typical branch sizes
For large overlays:
# Materialize long-lived branches
client.materialize_branch("long-running-branch")
Garbage Collection
Old overlays must be cleaned up:
# Merged branches can have overlays removed
client.gc_branch("merged-feature")
# Or automatic cleanup
config = Config(
gc_merged_branches=True,
gc_retention_days=7
)
Deep Branch Hierarchies
main → dev → feature → sub-feature → experiment
Each level adds overlay resolution. Keep hierarchies shallow.
Comparison with Full Copy
| Aspect | Copy-on-Write | Full Copy |
|---|---|---|
| Branch creation | O(1) | O(n) |
| Storage per branch | O(changes) | O(data) |
| Query (point) | O(1) | O(1) |
| Query (scan) | O(n + overlay) | O(n) |
| Merge | O(changes) | O(n) |
Best Practices
- Keep branches short-lived - Smaller overlays are faster
- Merge frequently - Reduces overlay accumulation
- Monitor overlay size - Large overlays may need materialization
- Use appropriate indexes - Index overlay tables for performance
- Clean up merged branches - Run garbage collection regularly