How to Migrate from Snapshots
If you’re currently using database snapshots or manual backups for versioning, this guide helps you migrate to Horizon Epoch.
Current Snapshot-Based Workflow
Many teams use approaches like:
- Daily database dumps
- PostgreSQL pg_dump snapshots
- AWS RDS snapshots
- Manual CREATE TABLE ... AS SELECT copies
These approaches have limitations:
- Full data copies are expensive
- No granular change tracking
- Difficult to compare versions
- Merging changes is manual
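To make the cost of full copies concrete, here is a rough back-of-the-envelope sketch. It compares keeping every snapshot as a full copy against keeping one baseline plus only the changed data for each later version. The function names and all numbers are hypothetical, and the model ignores compression and deduplication, so treat the result as an order-of-magnitude illustration only.

```python
def full_copy_storage_gb(db_size_gb: float, retained_snapshots: int) -> float:
    """Storage used when every retained snapshot is a full copy of the database."""
    return db_size_gb * retained_snapshots


def change_based_storage_gb(
    db_size_gb: float, retained_versions: int, change_fraction: float
) -> float:
    """Storage for one full baseline plus only the changed data of each later version."""
    later_versions = max(retained_versions - 1, 0)
    return db_size_gb + db_size_gb * change_fraction * later_versions


# Hypothetical inputs: 500 GB database, 30 retained days, 2% of the data changes per day.
print("full copies (GB): ", full_copy_storage_gb(500, 30))
print("change-based (GB):", change_based_storage_gb(500, 30, 0.02))
```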
Migration Overview
- Initialize a Horizon Epoch repository
- Import your existing data as the initial commit
- Set up tables for tracking
- Train your team on the new workflow
Step-by-Step Migration
1. Prepare Your Environment
# Start services
docker compose -f docker/docker-compose.yml up -d
# Build Horizon Epoch from source (see Installation guide)
cargo build --release
2. Initialize Repository
epoch init production-data \
--metadata-url "postgresql://localhost/horizon_epoch" \
--description "Migrated from daily snapshots"
3. Register Existing Tables
# Register each table you want to version
epoch table add customers \
--location "postgresql://localhost/mydb/public.customers"
epoch table add orders \
--location "postgresql://localhost/mydb/public.orders"
epoch table add products \
--location "postgresql://localhost/mydb/public.products"
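With more than a handful of tables, typing each registration by hand gets tedious. The sketch below generates the same epoch table add commands from a list of table names. It assumes the location format shown above (database URL, then schema.table) and uses only the --location flag from this guide; the helper name table_add_command is illustrative and not part of Horizon Epoch.

```python
import shlex


def table_add_command(table: str, database_url: str, schema: str = "public") -> list[str]:
    """Build the argument list for one `epoch table add` call."""
    # Assumption: locations follow the <database_url>/<schema>.<table> pattern used in this guide.
    location = f"{database_url}/{schema}.{table}"
    return ["epoch", "table", "add", table, "--location", location]


tables = ["customers", "orders", "products"]
for name in tables:
    # Print the commands for review; pass each list to subprocess.run(..., check=True) to execute.
    print(shlex.join(table_add_command(name, "postgresql://localhost/mydb")))
```

Printing the commands first gives you a chance to review them before anything is registered.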
4. Create Initial Commit
epoch commit -m "Initial import from production snapshot 2024-01-15"
5. Tag Important Snapshots
Tag the migration baseline so you can refer to it by name later, the way you would a dated snapshot:
# Tag the current state
epoch tag snapshot-2024-01-15 \
--message "Migration baseline from daily snapshot"
6. Import Historical Snapshots (Optional)
If you have historical data you want to preserve:
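import asyncio
from horizon_epoch import Client, Author

# Placeholders: replace these with values for your environment.
historical_dates = []  # snapshot dates to import, oldest first, e.g. "2024-01-14"
temp_db = "postgresql://localhost/snapshot_restore"  # scratch database for restores


def restore_snapshot(snapshot_date, target_db):
    """Restore the snapshot for snapshot_date into target_db (site-specific)."""
    raise NotImplementedError("call your existing restore procedure here")


async def import_snapshots():
    async with Client.connect("postgresql://localhost/horizon_epoch") as client:
        # For each historical snapshot
        for snapshot_date in historical_dates:
            # Restore snapshot to a temporary database
            restore_snapshot(snapshot_date, temp_db)

            # Create a branch for this point in time
            branch_name = f"history/{snapshot_date}"
            await client.branch(branch_name)

            # Register tables from restored snapshot
            # (pointing to temp database)

            # Commit
            await client.commit(
                message=f"Historical snapshot: {snapshot_date}",
                author=Author(name="Migration", email="ops@example.com"),
            )

            # Tag for easy reference
            await client.tag_create(
                name=f"snapshot-{snapshot_date}",
                message="Imported from backup",
            )


asyncio.run(import_snapshots())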
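The loop above relies on a restore_snapshot helper that is specific to how your backups are stored. As a minimal sketch, assuming plain-SQL dumps named like the nightly pg_dump files shown later in this guide (/backups/mydb-YYYYMMDD.sql) and a scratch database identified by name rather than URL, it could look like this. The backup directory, file naming, and helper names are assumptions; dropdb, createdb, and psql are the standard PostgreSQL client tools.

```python
import subprocess


def dump_path_for(snapshot_date: str, backup_dir: str = "/backups") -> str:
    """Map a date like 2024-01-15 to the assumed dump file name mydb-20240115.sql."""
    return f"{backup_dir}/mydb-{snapshot_date.replace('-', '')}.sql"


def restore_commands(dump_path: str, temp_db_name: str) -> list[list[str]]:
    """Commands that load a plain-SQL dump into a freshly created scratch database."""
    return [
        ["dropdb", "--if-exists", temp_db_name],
        ["createdb", temp_db_name],
        # ON_ERROR_STOP makes psql exit with an error instead of continuing past a failed statement.
        ["psql", "--dbname", temp_db_name, "-v", "ON_ERROR_STOP=1", "--file", dump_path],
    ]


def restore_snapshot(snapshot_date: str, temp_db_name: str) -> None:
    """Recreate the scratch database from the dump for snapshot_date."""
    for argv in restore_commands(dump_path_for(snapshot_date), temp_db_name):
        subprocess.run(argv, check=True)
```

Dropping and recreating the scratch database on every iteration keeps one snapshot's rows from leaking into the next.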
Mapping Snapshot Workflows
Daily Backup Replacement
Before:
# Nightly cron job
pg_dump mydb > /backups/mydb-$(date +%Y%m%d).sql
After:
# Nightly cron job
epoch commit -m "Daily snapshot $(date +%Y-%m-%d)"
epoch tag daily-$(date +%Y%m%d)
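If the nightly job is driven from Python rather than a shell one-liner, the commit message and tag name can be derived from a single date value. The following is a sketch that only builds the two commands shown above; the function name nightly_commands is illustrative.

```python
import shlex
from datetime import date


def nightly_commands(today: date) -> list[list[str]]:
    """Build the nightly commit and tag commands from one date value."""
    message = f"Daily snapshot {today:%Y-%m-%d}"
    tag = f"daily-{today:%Y%m%d}"
    return [
        ["epoch", "commit", "-m", message],
        ["epoch", "tag", tag],
    ]


for argv in nightly_commands(date.today()):
    # Print for review; pass argv to subprocess.run(argv, check=True) to execute.
    print(shlex.join(argv))
```

Computing the date once means the message and the tag always agree, even if the job happens to run across midnight.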
Pre-Change Backup
Before:
# Before making changes
pg_dump mydb > /backups/pre-migration.sql
# Make changes
# If something goes wrong, restore from pre-migration.sql
After:
# Before making changes, store the tag name once so the rollback uses the same name
TAG="pre-migration-$(date +%Y%m%d)"
epoch tag "$TAG"
# Make changes
epoch commit -m "Applied migration X"
# If something goes wrong
epoch reset --hard "$TAG"
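The tag, change, and reset sequence above can also be wrapped so that the rollback happens automatically when the change fails. This is a minimal sketch built only from the three commands shown in this section; apply_with_rollback and the injectable run parameter are illustrative names, and it assumes epoch reset --hard restores the tagged state as described above.

```python
import subprocess
from typing import Callable, Sequence


def run_checked(argv: Sequence[str]) -> None:
    """Run a command and raise if it exits with a non-zero status."""
    subprocess.run(list(argv), check=True)


def apply_with_rollback(
    tag: str,
    change: Callable[[], None],
    commit_message: str,
    run: Callable[[Sequence[str]], None] = run_checked,
) -> bool:
    """Tag the current state, apply a change, and reset to the tag if anything fails."""
    run(["epoch", "tag", tag])
    try:
        change()
        run(["epoch", "commit", "-m", commit_message])
        return True
    except Exception:
        # Either the change or the commit failed: go back to the tagged state.
        run(["epoch", "reset", "--hard", tag])
        return False
```

Passing a custom run callable makes it possible to dry-run the sequence, for example by collecting the commands in a list instead of executing them.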
Environment Copies
Before:
# Create staging from production
pg_dump prod_db | psql staging_db
After:
# Create staging branch from production
epoch branch create staging --from main
# Staging now has zero-copy access to production data
Handling Large Datasets
For very large databases:
1. Incremental Registration
Register tables in batches:
# Critical tables first
epoch table add customers orders products
epoch commit -m "Core business tables"
# Then secondary tables
epoch table add logs analytics events
epoch commit -m "Add operational tables"
2. Exclude Large/Non-Critical Tables
Some tables might not need versioning:
# Configure exclusions via CLI
# epoch config set exclude_patterns "audit_logs_*,*_archive,temp_*"
# Or skip these tables when registering
# Only register the tables you want to version
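If you go the second route and skip tables at registration time, the same patterns can be applied yourself before calling epoch table add. The sketch below uses Python's fnmatch module for shell-style wildcards. Whether Horizon Epoch's own exclude_patterns setting uses identical matching rules is an assumption, and the table names are made up for illustration.

```python
from fnmatch import fnmatchcase


def tables_to_register(all_tables: list[str], exclude_patterns: str) -> list[str]:
    """Return the tables that match none of the comma-separated wildcard patterns."""
    patterns = [p.strip() for p in exclude_patterns.split(",") if p.strip()]
    return [
        table
        for table in all_tables
        if not any(fnmatchcase(table, pattern) for pattern in patterns)
    ]


# Hypothetical table names, filtered with the patterns from the example above.
candidates = ["customers", "audit_logs_2023", "orders_archive", "temp_import", "orders"]
print(tables_to_register(candidates, "audit_logs_*,*_archive,temp_*"))
```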
3. Use Partitioned Commits
For very large initial imports:
# Commit in chunks
epoch commit --tables customers,orders -m "Batch 1: customer data"
epoch commit --tables products,inventory -m "Batch 2: product data"
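For a long table list, the batches can be generated rather than written out by hand. The following sketch splits a list into fixed-size groups and builds one commit command per group, using the --tables and -m flags shown above. The batch size and the message wording are arbitrary choices for illustration.

```python
import shlex


def chunk(tables: list[str], size: int) -> list[list[str]]:
    """Split a list of table names into consecutive groups of at most `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [tables[i:i + size] for i in range(0, len(tables), size)]


def batch_commit_commands(tables: list[str], size: int) -> list[list[str]]:
    """Build one `epoch commit --tables ...` command per batch."""
    commands = []
    for number, batch in enumerate(chunk(tables, size), start=1):
        message = f"Batch {number}: {', '.join(batch)}"
        commands.append(["epoch", "commit", "--tables", ",".join(batch), "-m", message])
    return commands


for argv in batch_commit_commands(["customers", "orders", "products", "inventory"], 2):
    # Print for review; pass argv to subprocess.run(argv, check=True) to execute.
    print(shlex.join(argv))
```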
Validation
After migration, verify everything works:
# Check repository status
epoch status
# View commit history
epoch log
# Verify tables are registered
epoch table list
# Test branching
epoch branch create test-migration
epoch checkout test-migration
# Make a small change, commit, merge back
epoch checkout main
epoch merge test-migration
epoch branch delete test-migration
Team Training
Key concepts to communicate:
- Branches replace copies - No more copying entire databases
- Commits are lightweight - Only changes are stored
- Tags mark important points - Like naming a backup
- Merging combines changes - No more manual comparison
Rollback Plan
If the migration doesn’t go smoothly:
- Your existing snapshot workflow still works
- Horizon Epoch metadata is separate from your data
- Remove Horizon Epoch without affecting production:
# Remove metadata database
dropdb horizon_epoch
Next Steps
- Branching Workflow - Learn the new workflow
- Environment Promotion - Replace environment copies