Tutorial: Schema and Constraint Versioning
This tutorial demonstrates how to version database constraints alongside your data, including schema introspection, constraint diffing, merge conflict resolution, and DDL migration generation.
Prerequisites
- Horizon Epoch installed (Installation Guide)
- PostgreSQL running with a sample database
- Basic familiarity with Git-like version control concepts
Part 1: Schema Introspection with Constraints
Step 1: Create a Table with Constraints
First, let’s create a table with various constraints in PostgreSQL:
-- Create enum type
CREATE TYPE order_status AS ENUM ('pending', 'processing', 'shipped', 'delivered');
-- Create customers table
CREATE TABLE customers (
id SERIAL PRIMARY KEY,
email VARCHAR(255) NOT NULL,
name VARCHAR(255) NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT uq_customers_email UNIQUE (email),
CONSTRAINT chk_customers_email CHECK (email ~ '@')
);
-- Create orders table with foreign key
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer_id INTEGER NOT NULL,
status order_status DEFAULT 'pending',
total NUMERIC(10,2) NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT fk_orders_customer FOREIGN KEY (customer_id)
REFERENCES customers (id) ON DELETE CASCADE,
CONSTRAINT chk_orders_total CHECK (total >= 0)
);
-- Create index
CREATE INDEX idx_orders_customer ON orders (customer_id);
CREATE INDEX idx_orders_status ON orders (status) WHERE status != 'delivered';
Step 2: Register Tables with Horizon Epoch
# Initialize repository
epoch init ecommerce-repo \
--metadata-url "postgresql://localhost/horizon_epoch"
# Register tables (constraints are automatically detected)
epoch table add customers \
--location "postgresql://localhost/mydb/public.customers"
epoch table add orders \
--location "postgresql://localhost/mydb/public.orders"
# Initial commit captures schema with constraints
epoch commit -m "Initial schema with constraints"
Step 3: View Introspected Schema
# Show schema including constraints
epoch schema show customers
# Output:
# Table: customers
#
# Columns:
# id SERIAL PRIMARY KEY
# email VARCHAR(255) NOT NULL
# name VARCHAR(255) NOT NULL
# created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
#
# Constraints:
# uq_customers_email UNIQUE (email)
# chk_customers_email CHECK (email ~ '@')
Using Python:
import asyncio
from horizon_epoch import Client
async def view_constraints():
async with Client.connect("postgresql://localhost/horizon_epoch") as client:
# Get constraints for a table
constraints = await client.get_constraints("customers", branch="main")
print(f"\nUnique Constraints: {len(constraints.unique_constraints)}")
print(f"Check Constraints: {len(constraints.check_constraints)}")
print(f"Foreign Keys: {len(constraints.foreign_keys)}")
asyncio.run(view_constraints())
Part 2: Constraint Diffing Between Branches
Step 1: Create a Feature Branch
epoch branch create feature/add-phone
Step 2: Modify Constraints on the Branch
Add a phone number column with constraints:
-- On feature/add-phone branch
ALTER TABLE customers ADD COLUMN phone VARCHAR(20);
ALTER TABLE customers ADD CONSTRAINT uq_customers_phone UNIQUE (phone);
ALTER TABLE customers ADD CONSTRAINT chk_customers_phone
CHECK (phone ~ '^\+?[0-9]{10,15}$');
epoch commit -m "Add phone column with constraints"
Step 3: View the Constraint Diff
# Compare branches
epoch diff main feature/add-phone --schema
# Output:
# Schema changes for 'customers':
#
# Added columns:
# + phone VARCHAR(20)
#
# Added constraints:
# + uq_customers_phone UNIQUE (phone)
# + chk_customers_phone CHECK (phone ~ '^\+?[0-9]{10,15}$')
Using Python:
# Get constraint diff
diff = client.diff_schema("main", "feature/add-phone", table="customers")
print("Added constraints:")
for c in diff.constraints.added:
print(f" + {c.constraint_type}: {c.name}")
print("\nRemoved constraints:")
for c in diff.constraints.removed:
print(f" - {c.constraint_type}: {c.name}")
print("\nModified constraints:")
for m in diff.constraints.modified:
print(f" ~ {m.name}")
for change in m.changes:
print(f" {change}")
Part 3: Constraint Merge Conflict Resolution
Constraint conflicts occur when the same constraint is modified differently on two branches.
Step 1: Create Conflicting Changes
# On main branch
epoch checkout main
-- Modify the email check constraint on main
ALTER TABLE customers DROP CONSTRAINT chk_customers_email;
ALTER TABLE customers ADD CONSTRAINT chk_customers_email
CHECK (email ~ '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$');
epoch commit -m "Stricter email validation"
# On feature branch (already has different changes)
epoch checkout feature/add-phone
-- Also modify the email check constraint differently
ALTER TABLE customers DROP CONSTRAINT chk_customers_email;
ALTER TABLE customers ADD CONSTRAINT chk_customers_email
CHECK (email LIKE '%@%.%');
epoch commit -m "Simpler email validation"
Step 2: Attempt Merge
epoch checkout main
epoch merge feature/add-phone
# Output:
# CONFLICT (constraint): chk_customers_email modified in both branches
#
# On main:
# CHECK (email ~ '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$')
#
# On feature/add-phone:
# CHECK (email LIKE '%@%.%')
#
# Automatic merge failed. Fix conflicts and commit.
Step 3: View Conflicts
epoch status
# Output:
# On branch main
# You have unmerged paths.
#
# Constraint conflicts:
# (use "epoch resolve <constraint>" to mark resolution)
#
# both modified: customers.chk_customers_email
Using Python:
# Get merge conflicts
merge_result = client.merge("feature/add-phone", "main", dry_run=True)
for conflict in merge_result.constraint_conflicts:
print(f"Conflict: {conflict.constraint_name}")
print(f" Type: {conflict.constraint_type}")
print(f" Base: {conflict.base}")
print(f" Ours: {conflict.ours}")
print(f" Theirs: {conflict.theirs}")
print(f" Differences: {conflict.differences}")
Step 4: Resolve Conflicts
Option A: Use ours (main branch version)
epoch resolve customers.chk_customers_email --ours
epoch commit -m "Merge feature/add-phone with stricter email validation"
Option B: Use theirs (feature branch version)
epoch resolve customers.chk_customers_email --theirs
epoch commit -m "Merge feature/add-phone with simpler email validation"
Option C: Use custom resolution
# Create custom constraint
epoch resolve customers.chk_customers_email --custom \
"CHECK (email ~ '^[^@]+@[^@]+\.[^@]+$')"
epoch commit -m "Merge feature/add-phone with balanced email validation"
Using Python:
from horizon_epoch import ConstraintResolution
# Resolve using ours
client.resolve_constraint(
"customers",
"chk_customers_email",
resolution=ConstraintResolution.UseOurs
)
# Or use theirs
client.resolve_constraint(
"customers",
"chk_customers_email",
resolution=ConstraintResolution.UseTheirs
)
# Or provide custom
from horizon_epoch import CheckConstraint
custom_constraint = CheckConstraint(
name="chk_customers_email",
expression="email ~ '^[^@]+@[^@]+\\.[^@]+$'"
)
client.resolve_constraint(
"customers",
"chk_customers_email",
resolution=ConstraintResolution.UseCustom(custom_constraint)
)
# Complete the merge
client.commit(message="Merge with resolved constraint conflict")
Part 4: DDL Migration Generation
Horizon Epoch can generate migration scripts for schema changes.
Step 1: View Pending Changes
# Show what migrations would be generated
epoch schema diff main feature/new-indexes --ddl
# Output:
# -- Migration: main -> feature/new-indexes
#
# -- UP
# CREATE INDEX CONCURRENTLY idx_customers_name ON customers (name);
# ALTER TABLE orders ADD CONSTRAINT chk_orders_status
# CHECK (status IN ('pending', 'processing', 'shipped', 'delivered'));
#
# -- DOWN
# DROP INDEX CONCURRENTLY IF EXISTS idx_customers_name;
# ALTER TABLE orders DROP CONSTRAINT IF EXISTS chk_orders_status;
Step 2: Generate Migration Script
# Generate migration file
epoch schema migrate main feature/new-indexes \
--output migrations/V2__add_indexes.sql
# Output:
# Generated migration: migrations/V2__add_indexes.sql
# UP: 2 statements
# DOWN: 2 statements
# Warnings: 0
Using Python:
from horizon_epoch import PostgresDdlGenerator
# Get the schema diff
diff = client.diff_schema("main", "feature/new-indexes")
# Generate migration
generator = PostgresDdlGenerator()
migration = generator.generate_migration("customers", diff.constraints)
print("UP statements:")
for stmt in migration.up:
print(f" {stmt}")
print("\nDOWN statements:")
for stmt in migration.down:
print(f" {stmt}")
if migration.warnings:
print("\nWarnings:")
for warn in migration.warnings:
print(f" ! {warn}")
# Write to file
with open("migrations/V2__add_indexes.sql", "w") as f:
f.write("-- UP\n")
for stmt in migration.up:
f.write(f"{stmt};\n")
f.write("\n-- DOWN\n")
for stmt in migration.down:
f.write(f"{stmt};\n")
Step 3: Understanding Migration Ordering
The DDL generator follows a specific order to handle dependencies:
UP (Apply) Order:
- Create new enum types (before columns using them)
- Add values to modified enums
- Drop constraints being modified/removed (FK first)
- Drop indexes being modified/removed
- Handle primary key changes
- Add new/modified constraints (check, unique, then FK last)
- Create new indexes
DOWN (Reverse) Order:
- Drop new indexes
- Drop new constraints
- Re-add dropped indexes
- Re-add dropped constraints
- Re-add dropped enum types (with warnings)
Step 4: Check for Destructive Changes
# Check if migration has destructive changes
migration = generator.generate_migration("orders", diff.constraints)
if migration.has_destructive_changes():
print("WARNING: Migration contains destructive changes!")
for warn in migration.warnings:
print(f" {warn}")
Destructive changes include:
- Dropping enum values (data may reference them)
- Dropping foreign keys to tables being dropped
- Removing NOT NULL on columns with NULL values
Part 5: Handling Enum Type Changes
Enum types require special handling because PostgreSQL only allows additive changes.
Step 1: Add Enum Values
epoch checkout main
-- Add new enum value (safe operation)
ALTER TYPE order_status ADD VALUE 'cancelled' AFTER 'delivered';
epoch commit -m "Add cancelled status"
Step 2: Check Enum Safety
# Get enum diff
diff = client.diff_schema("main", "feature-branch")
for enum_diff in diff.enum_types:
if enum_diff.is_additive():
print(f"Enum {enum_diff.name}: Safe to merge (additive only)")
else:
print(f"Enum {enum_diff.name}: UNSAFE - values removed or reordered")
print(f" Removed: {enum_diff.removed}")
print(f" Reordered: {enum_diff.reordered}")
Step 3: Handle Non-Additive Enum Changes
If you need to remove or reorder enum values, you must:
- Create a new enum type
- Migrate data to the new type
- Drop the old type
- Rename the new type
-- Create new enum
CREATE TYPE order_status_new AS ENUM ('pending', 'processing', 'shipped');
-- Migrate data (handle 'delivered' and 'cancelled' values)
UPDATE orders SET status = 'shipped'
WHERE status IN ('delivered', 'cancelled');
-- Change column type
ALTER TABLE orders ALTER COLUMN status TYPE order_status_new
USING status::text::order_status_new;
-- Drop old, rename new
DROP TYPE order_status;
ALTER TYPE order_status_new RENAME TO order_status;
Best Practices
1. Always Review Constraint Diffs Before Merge
# Check what constraints changed
epoch diff main feature-branch --schema --constraints-only
2. Use Dry-Run for Merges
# Preview merge without applying
result = client.merge("feature", "main", dry_run=True)
if result.has_conflicts:
print("Resolve these conflicts before merging:")
for c in result.conflicts:
print(f" {c}")
3. Generate and Review Migrations
# Always review generated DDL
epoch schema migrate main feature --output migration.sql --dry-run
cat migration.sql
4. Test Migrations on Branch Data
# Create test branch
epoch branch create test/migration
# Apply migration
psql -f migration.sql
# Verify data integrity
epoch validate test/migration
5. Use Semantic Constraint Names
Follow a consistent naming convention:
- Foreign keys:
fk_{table}_{column}orfk_{table}_{referenced_table} - Unique:
uq_{table}_{columns} - Check:
chk_{table}_{description} - Index:
idx_{table}_{columns}
See Also
- Constraint Reference - Complete constraint type documentation
- Merge Algorithm - How three-way merge works
- Resolve Conflicts - Conflict resolution strategies