PostgreSQL HA: Streaming Replication with Automatic Failover
Introduction
High availability (HA) is a foundational requirement for production PostgreSQL deployments. In this post I'll walk you through a practical, production-ready approach to PostgreSQL HA using streaming replication for data continuity plus automatic failover for resilient operations. You'll get configuration examples, commands, and recommended settings so your database cluster can survive failures with minimal downtime.
Why streaming replication + automatic failover?
Streaming replication continuously ships WAL (Write-Ahead Log) records from a primary to one or more replicas, keeping them near real-time (asynchronous by default, with an optional synchronous mode). Automatic failover tools such as Patroni or repmgr remove human intervention from the process of promoting a replica when the primary fails, restoring write capability quickly and consistently.
Architecture overview
A common HA architecture includes:
- Primary node (read/write)
- One or more replicas (read-only, warm or hot)
- Distributed consensus/leader election layer (etcd/consul/Zookeeper)
- Failover manager (Patroni or repmgrd) which integrates with the consensus layer
Prerequisites
- PostgreSQL 12+ installed on all nodes
- Network connectivity between nodes on the PostgreSQL port (default 5432)
- An etcd/consul cluster or equivalent for leader election (for Patroni)
- A PostgreSQL superuser for configuration tasks
Step-by-step setup
1. Configure the primary
Edit postgresql.conf on the primary and set replication-related parameters:
```
wal_level = replica
max_wal_senders = 10
max_replication_slots = 10
archive_mode = on
archive_command = 'test ! -f /var/lib/postgresql/wal_archive/%f && cp %p /var/lib/postgresql/wal_archive/%f'
synchronous_commit = on
synchronous_standby_names = '*'
```

Allow the replication user in pg_hba.conf (adjust the CIDR to your network):
```
# allow replication connections from replicas
host    replication    replicator    10.0.0.0/24    md5
```

Create a replication role:
```
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'strong_password';
```

2. Take a base backup for the replica
The simplest way to provision a replica is to use pg_basebackup:
```
pg_basebackup -h primary.example.com -D /var/lib/postgresql/12/main \
  -U replicator -v -P --wal-method=stream
```

For PostgreSQL 12 and later, after the base backup completes, create an empty file named standby.signal inside the replica's $PGDATA and add the primary connection info to postgresql.conf or postgresql.auto.conf:
```
# in postgresql.conf on the replica
primary_conninfo = 'host=primary.example.com port=5432 user=replicator password=strong_password'
```

Start PostgreSQL on the replica and confirm replication with:
```
SELECT client_addr, state, sync_state FROM pg_stat_replication;
```

3. Replication slots (optional but recommended)
Create a replication slot on the primary to prevent WAL from being removed before replicas consume it:
```
SELECT * FROM pg_create_physical_replication_slot('replica_slot_1');
```

Configure the replica to use the slot by setting primary_slot_name = 'replica_slot_1' in postgresql.conf on the replica (it is a separate parameter, not part of primary_conninfo), or let Patroni manage slots for you.
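To confirm the slot is being consumed and see how much WAL it is holding back, you can query pg_replication_slots on the primary. A sketch (the slot name matches the one created above; run it on a live primary):

```sql
-- inactive slots retain WAL indefinitely and can fill the disk,
-- so alert when retained_wal keeps growing
SELECT slot_name,
       active,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;
```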
Automatic failover with Patroni
Patroni is a lightweight orchestrator that uses a distributed consensus store (etcd/consul) to manage leader election and coordinates PostgreSQL configuration changes for failover. Below is a minimal Patroni YAML snippet for a node:
```
scope: postgres_cluster
namespace: /service/
name: node1

etcd:
  host: 10.0.0.1:2379

bootstrap:
  dcs:
    ttl: 30
  initdb:
    - encoding: UTF8
```

Start Patroni as a system service on each node. Patroni will handle base backups (using pg_basebackup), create standby signal files, manage replication slots, and perform automatic failover using the consensus store.
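Running Patroni as a system service usually means a systemd unit. A minimal sketch, assuming Patroni is installed at /usr/local/bin/patroni and the config lives at /etc/patroni.yml (both paths are assumptions about your environment):

```
# /etc/systemd/system/patroni.service -- minimal sketch; paths are assumptions
[Unit]
Description=Patroni PostgreSQL HA manager
After=network.target

[Service]
User=postgres
ExecStart=/usr/local/bin/patroni /etc/patroni.yml
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

After `systemctl daemon-reload`, enable and start the unit on each node so Patroni restarts with the machine.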
Testing failover and recovery
1. Simulate primary failure (stop Patroni or PostgreSQL on primary):
```
systemctl stop patroni
```

2. Patroni will detect the failure and elect a new leader. Verify the new primary:
```
patronictl -c /etc/patroni.yml list
```

3. When the old primary returns, Patroni will rejoin it as a replica (unless configured otherwise).
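Besides crash-style tests, Patroni also supports planned switchovers, which are gentler for routine drills because the old leader is demoted cleanly. A sketch, assuming the cluster and candidate node names from the example config above (`postgres_cluster`, and a hypothetical second node `node2`; exact flags may vary by Patroni version):

```shell
# planned switchover to a named candidate; --force skips interactive prompts
patronictl -c /etc/patroni.yml switchover postgres_cluster --candidate node2 --force
```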
Operational tips and best practices
- Use multiple replicas across different availability zones to avoid correlated failures.
- Monitor replication lag via pg_stat_replication and alert on WAL accumulation.
- Secure replication users and network traffic (use TLS for Patroni and PostgreSQL connections where possible).
- Test failover regularly in a staging environment to verify application resilience.
- Consider WAL archiving (WAL-G / WAL-E) for PITR and longer retention.
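When alerting on lag, note that pg_stat_replication reports positions as LSNs like 0/3000148: the part before the slash is the high 32 bits of a 64-bit byte position. On the server you would normally use pg_wal_lsn_diff(), but the same arithmetic can be done in a monitoring script outside the database. A bash sketch (the sample LSN values are illustrative only):

```shell
#!/usr/bin/env bash
# Convert a PostgreSQL LSN such as '0/3000148' to an absolute byte position:
# high 32 bits before the slash, low 32 bits after it (both hexadecimal).
lsn_to_bytes() {
  local hi=${1%/*} lo=${1#*/}
  echo $(( 16#$hi * 4294967296 + 16#$lo ))
}

# Byte lag between the primary's current LSN and a standby's replay LSN.
echo $(( $(lsn_to_bytes '0/3000148') - $(lsn_to_bytes '0/3000000') ))  # prints 328
```

This is handy when LSNs are scraped from different servers and the difference must be computed by the monitoring system rather than by PostgreSQL itself.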
When to choose managed solutions
Running and operating HA PostgreSQL with streaming replication and Patroni is reliable, but it requires operational maturity. If you prefer a managed approach, platforms like Worlber Carbonate or Quick Deploy provide managed PostgreSQL and DaaS options that include encryption, backup, and HA orchestration handled for you. For teams that want transparent data encryption in addition to HA, Worlber's PGEE integrates with PostgreSQL deployments to secure data at rest.
Conclusion
Implementing streaming replication plus an automatic failover manager such as Patroni gives you a robust, production-ready PostgreSQL HA solution. Use replication slots and WAL archiving to protect against data loss, test failovers frequently, and monitor replication lag closely. If you need help designing or operating PostgreSQL HA in Saudi Arabia or the Middle East, Worlber's team of senior DBAs can help accelerate the journey from architecture to production.
Ready to Transform Your Database Infrastructure?
Worlber helps companies across Saudi Arabia and the Middle East build reliable, secure, and cost-effective database solutions.
📧 Email us: contactus@worlber.com
🌐 Visit: worlber.sa/contact
📞 Call: Talk to our database experts today
Worlber - Your trusted PostgreSQL partner in the Middle East