Federation & Replication Quick Start
Overview
Federation and replication enable seamless data distribution across multiple nodes, regions, and even clouds while maintaining consistency and high availability.
Key Concepts
Federation vs Replication
- Replication: Full or partial data copies across nodes
- Federation: Distributed queries across multiple databases
- Hybrid: Combine both for flexibility
When to Use
- Multi-region deployments
- Disaster recovery
- Geographic data locality
- High availability (no single point of failure)
- Load distribution
Quick Start
1. Setup Replication
Primary-Secondary Setup
-- On PRIMARY node
CREATE PUBLICATION main_pub FOR ALL TABLES;
-- On SECONDARY node
CREATE SUBSCRIPTION main_sub CONNECTION 'postgresql://primary-host:5432/database' PUBLICATION main_pub;
-- Verify replication
SELECT * FROM pg_stat_replication;
2. Multi-Master Replication
Allows writes on multiple nodes (automatic conflict resolution).
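To make "automatic conflict resolution" more concrete, here is a minimal Python sketch of a last-write-wins merge, one common policy for multi-master setups. The `Version` structure, its field names, and the tie-break on `node_id` are assumptions made for illustration; they are not taken from the product's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """One node's copy of a row (hypothetical structure for illustration)."""
    value: str
    modified_at: float  # seconds since epoch, as recorded by the writing node
    node_id: int        # used only to break exact timestamp ties


def last_write_wins(a: Version, b: Version) -> Version:
    """Return the copy with the newer timestamp; ties go to the higher node_id."""
    return max(a, b, key=lambda v: (v.modified_at, v.node_id))


# Two nodes accepted conflicting writes for the same row.
node1_copy = Version(value="Widget @ 99.99", modified_at=1000.0, node_id=1)
node2_copy = Version(value="Widget @ 89.99", modified_at=1002.5, node_id=2)

winner = last_write_wins(node1_copy, node2_copy)
print(winner.value)  # Widget @ 89.99
```

Last-write-wins depends on reasonably synchronized clocks across nodes, and the losing write is silently discarded, so it suits data where the latest value is all that matters.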
-- Node 1: Enable multi-master
SET multi_master_enabled = true;
SET node_id = 1;
SET master_nodes = 'node2:5432,node3:5432';
-- Node 2: Same configuration
SET multi_master_enabled = true;
SET node_id = 2;
SET master_nodes = 'node1:5432,node3:5432';
-- Write on either node
INSERT INTO products (name, price) VALUES ('Widget', 99.99);
-- Changes propagate automatically
3. Sharding Strategy
Distribute data by key range or hash.
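As a rough illustration of the hash option, the Python sketch below assigns each customer to one of four shards with a plain modulo and then counts rows per node. The shard-to-node mapping for `node3` and `node4` is hypothetical, and a database's internal hash function may place rows differently from a plain modulo.

```python
from collections import Counter

SHARD_COUNT = 4

# Hypothetical placement of shards on nodes, for illustration only.
SHARD_LOCATIONS = {0: "node1", 1: "node2", 2: "node3", 3: "node4"}


def shard_for(customer_id: int, shard_count: int = SHARD_COUNT) -> int:
    """Map a customer to a shard index using a simple modulo."""
    return customer_id % shard_count


def node_for(customer_id: int) -> str:
    """Look up the node that stores the customer's shard."""
    return SHARD_LOCATIONS[shard_for(customer_id)]


# Check how evenly 1,000 sequential customer IDs spread across nodes.
rows_per_node = Counter(node_for(cid) for cid in range(1, 1001))

print(node_for(42))         # node3
print(dict(rows_per_node))  # 250 rows on each of the four nodes
```

Sequential IDs spread evenly under a modulo; skewed keys (for example, a few very large customers) can still unbalance shards, which is why shard balance needs monitoring.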
-- Hash-based sharding
CREATE TABLE orders (
  id BIGINT,
  customer_id INT,
  amount DECIMAL(10,2),
  shard_key INT GENERATED ALWAYS AS (customer_id % 4) STORED,
  PRIMARY KEY (id, customer_id) -- PostgreSQL requires the partition key in the primary key
) PARTITION BY HASH (customer_id);
-- Create shards
CREATE TABLE orders_shard_0 PARTITION OF orders FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE orders_shard_1 PARTITION OF orders FOR VALUES WITH (MODULUS 4, REMAINDER 1);
-- ... repeat for shards 2, 3
-- Distribute to nodes
ALTER TABLE orders_shard_0 SET (shard_location = 'node1');
ALTER TABLE orders_shard_1 SET (shard_location = 'node2');
4. Geographic Distribution
Deploy across multiple regions.
-- Define regions
CREATE REGION 'us-west' WITH (PRIMARY, REPLICA);
CREATE REGION 'eu-central' WITH (PRIMARY, REPLICA);
CREATE REGION 'ap-southeast' WITH (REPLICA);
-- Create geo-replicated table
CREATE TABLE global_config (
  id SERIAL PRIMARY KEY,
  key VARCHAR(100),
  value TEXT,
  modified_at TIMESTAMP
) WITH (
  replication_regions = 'us-west,eu-central,ap-southeast',
  replication_strategy = 'eventual_consistency'
);
-- Ensure regional presence
SELECT region_name, status FROM pg_regions;
5. Failover Configuration
Automatic failover on primary failure.
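The decision logic behind heartbeat-based failover can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the product's algorithm: the 30-second timeout mirrors the `failover_timeout` value used in this section, while the function names and the "promote the standby with the least lag" rule are hypothetical.

```python
from typing import Dict

FAILOVER_TIMEOUT_SECONDS = 30.0  # mirrors failover_timeout = '30s'


def primary_has_failed(last_heartbeat: float, now: float,
                       timeout: float = FAILOVER_TIMEOUT_SECONDS) -> bool:
    """Treat the primary as failed once no heartbeat arrived within the timeout."""
    return (now - last_heartbeat) > timeout


def choose_standby(lag_bytes_by_standby: Dict[str, int]) -> str:
    """Pick the standby with the smallest replication lag (assumed promotion rule)."""
    return min(lag_bytes_by_standby, key=lag_bytes_by_standby.get)


# Example: last heartbeat seen at t=100s, checked at t=120s and t=135s.
print(primary_has_failed(last_heartbeat=100.0, now=120.0))  # False
print(primary_has_failed(last_heartbeat=100.0, now=135.0))  # True
print(choose_standby({"standby-a": 4096, "standby-b": 512}))  # standby-b
```

A shorter timeout reduces downtime but raises the risk of promoting a standby during a brief network blip, so the value is worth validating in failover drills.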
-- Configure failover
ALTER SYSTEM SET auto_failover = true;
ALTER SYSTEM SET failover_timeout = '30s';
ALTER SYSTEM SET failure_detection_method = 'heartbeat';
-- Promote secondary
SELECT pg_promote(); -- On standby
-- Check cluster status
SELECT * FROM pg_nodes WHERE node_type = 'primary';
Common Use Cases
Use Case 1: High Availability
-- Primary in datacenter A, secondary in datacenter B
-- If primary fails, secondary takes over automatically
SET high_availability_mode = 'active_passive';
-- Monitor replication lag
SELECT usename, application_name, write_lag, flush_lag, replay_lag
FROM pg_stat_replication;
Use Case 2: Read Scaling
-- Route reads to closest replica
-- Writes go to primary
-- Distributed load balancing
SELECT * FROM query_result_cache
WHERE node_id = 'nearest_node' -- Automatic routing
LIMIT 100;
Use Case 3: Disaster Recovery
-- Backup in separate region
CREATE TABLE disaster_recovery_config (
  rpo_minutes INT DEFAULT 15, -- Recovery Point Objective
  rto_minutes INT DEFAULT 5, -- Recovery Time Objective
  backup_region VARCHAR(100)
);
-- Regular backups to cloud
BACKUP DATABASE TO 's3://backups/prod';
Use Case 4: Data Locality
-- Store data close to users
-- US customers -> US nodes
-- EU customers -> EU nodes
SELECT * FROM customers
WHERE region = current_region();
-- Automatically routed to regional node
Performance Optimization
1. Replication Settings
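-- Optimize for throughput
-- In stock PostgreSQL, wal_level and max_wal_senders cannot be changed with SET;
-- use ALTER SYSTEM (or postgresql.conf) and restart the server.
ALTER SYSTEM SET wal_level = 'replica'; -- use 'logical' for publication/subscription replication
ALTER SYSTEM SET max_wal_senders = 10;
ALTER SYSTEM SET wal_keep_size = '1GB';
-- Monitor replication performance
SELECT slot_name, restart_lsn, confirmed_flush_lsn
FROM pg_replication_slots;
2. Sharding Considerations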
-- Monitor shard balance
SELECT shard_location, COUNT(*) AS row_count
FROM orders
GROUP BY shard_location;
-- Rebalance if needed
ALTER TABLE orders REBALANCE SHARDS;
3. Conflict Resolution
-- For multi-master, define conflict resolution
SET conflict_resolution_policy = 'last_write_wins';
-- OR
SET conflict_resolution_policy = 'custom_function';
-- anyelement is PostgreSQL's polymorphic type; ANY is not a valid argument type
CREATE FUNCTION resolve_conflict(v1 anyelement, v2 anyelement) RETURNS anyelement AS $$
  SELECT CASE WHEN v1 > v2 THEN v1 ELSE v2 END;
$$ LANGUAGE SQL;
Monitoring & Health
-- Cluster health check
SELECT node_name, status, last_heartbeat, lag_bytes
FROM pg_cluster_health;
-- Replication slots status
SELECT slot_name, slot_type, restart_lsn FROM pg_replication_slots;
-- Lag monitoring
SELECT EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp())) AS replication_lag_seconds;
Troubleshooting
Q: Replication lagging?
A: Check network latency and bandwidth between nodes, increase wal_buffers, or reduce the volume of writes being replicated.
Q: Shards unbalanced?
A: Run REBALANCE SHARDS or redistribute manually.
Q: Failover not automatic?
A: Check auto_failover setting and ensure failure detection is working.
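When investigating lag, it can help to turn the LSN values reported by `pg_replication_slots` into a byte count. The Python sketch below assumes the standard PostgreSQL text format for an LSN (two hexadecimal numbers separated by a slash, the first being the high 32 bits); the function names are hypothetical. In stock PostgreSQL the same subtraction is available server-side as `pg_wal_lsn_diff`.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert an LSN such as '0/3000060' into an absolute byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) + int(low, 16)


def lag_bytes(primary_lsn: str, replica_lsn: str) -> int:
    """Number of WAL bytes the replica still has to receive or replay."""
    return lsn_to_int(primary_lsn) - lsn_to_int(replica_lsn)


# The replica is 0x60 = 96 bytes behind the primary.
print(lag_bytes("0/3000060", "0/3000000"))  # 96
```

A byte lag that keeps growing points to a throughput problem (network or replica I/O), while a small but constant lag is usually just normal propagation delay.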
Best Practices
- Monitor Continuously
  -- Track replication lag
  SELECT * FROM pg_stat_replication ORDER BY write_lag DESC;
- Plan Capacity
  -- Calculate shards needed
  shard_count = total_rows / target_shard_size;
- Test Failover
  -- Regular failover drills
  -- Promote standby in test environment
- Secure Communication
  -- Use SSL for replication
  ALTER SYSTEM SET ssl = on;
  ALTER SYSTEM SET ssl_cert_file = '/path/to/cert.pem';
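The capacity-planning formula under "Plan Capacity" can be made executable. The Python sketch below rounds up so that a partial shard still counts as a whole one, and it never returns fewer than one shard; the function name and the example figures are illustrative assumptions, not sizing recommendations.

```python
def shards_needed(total_rows: int, target_shard_size: int) -> int:
    """Return the number of shards needed to keep each at or below the target size."""
    if target_shard_size <= 0:
        raise ValueError("target_shard_size must be positive")
    if total_rows < 0:
        raise ValueError("total_rows must not be negative")
    # Integer ceiling division avoids float rounding on very large row counts.
    return max(1, -(-total_rows // target_shard_size))


# 10 million rows at roughly 3 million rows per shard needs 4 shards.
print(shards_needed(10_000_000, 3_000_000))  # 4
```

Planning against projected rather than current row counts leaves headroom and postpones the next rebalance.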
Next Steps
- Read /docs/guides/user/REPLICATION_OPERATIONS_GUIDE.md for operational details
- Check sharding strategy guide: /docs/guides/user/SHARDING_STRATEGY_DECISION_GUIDE.md
- Review multi-master guide: /docs/guides/user/MULTI_MASTER_REPLICATION_OPERATIONS.md
Related Features
- High Availability: /docs/guides/user/HIGH_AVAILABILITY_OPERATIONAL_PROCEDURES.md
- Multi-Cloud: /docs/guides/user/MULTI_CLOUD_FEDERATION_GUIDE.md
- Backup & Recovery: /docs/guides/user/BACKUP_RECOVERY_GUIDE.md
Document Version: 1.0
Last Updated: December 30, 2025
Audience: Database architects, DevOps engineers
Reading Time: 10 minutes