Common PostgreSQL Mistakes — Connection Pooling, Vacuuming, Indexing
In production PostgreSQL deployments I see three recurring issues: connection pooling misconfiguration, neglected vacuuming and autovacuum tuning, and poor indexing practices. These mistakes lead to connection storms, table bloat, slow queries, and high operational costs, problems that compound quickly in the fast-moving digital transformation projects aligned with Saudi Vision 2030 and PDPL compliance requirements.
1. Connection pooling: architecture mistakes and fixes
Symptom: many short-lived clients cause a flood of backend connections. Result: CPU spikes, context switching, out-of-memory processes and connection-limit errors. The right solution is a lightweight pooler such as PgBouncer in front of PostgreSQL.
Essentials:
- Use PgBouncer in transaction pooling mode for typical web apps. Note that transaction (and especially statement) pooling breaks session state: avoid them if you rely on session-level prepared statements or session variables.
- Set reasonable pool sizes per app server. One pool per application tier is simpler than many small ones.
- Monitor active connections and wait events — not just total connections.
# example pgbouncer.ini (minimal)
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb
[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = users.txt
pool_mode = transaction
max_client_conn = 2000
default_pool_size = 50
reserve_pool_size = 5
reserve_pool_timeout = 5
Connect applications to PgBouncer: psql "host=pgbouncer-host port=6432 user=appuser dbname=mydb". If you run serverless apps or many microservices, prefer a centralized pooler or a managed solution; Worlber’s Quick Deploy can provision consistent pooler topologies quickly.
Practical checks
-- check active (non-idle) backend connections on the server
SELECT count(*) FROM pg_stat_activity WHERE state <> 'idle';
-- for PgBouncer, connect to the admin console (psql -p 6432 pgbouncer) and monitor pool stats and wait times
SHOW POOLS;
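To act on the wait-event advice above, a query like the following (against pg_stat_activity, available since PostgreSQL 9.6) groups sessions by what they are waiting on, which quickly separates lock contention from I/O stalls:

```sql
-- group non-idle sessions by wait event to spot lock or I/O contention
SELECT wait_event_type, wait_event, count(*)
FROM pg_stat_activity
WHERE state <> 'idle'
GROUP BY wait_event_type, wait_event
ORDER BY count(*) DESC;
```

A large count under wait_event_type = 'Lock' points at blocking queries rather than pool sizing.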
2. Vacuuming and autovacuum: prevent bloat early
Symptom: tables grow in disk size but not in logical rows, queries slow, and transaction ID wraparound risk increases. Common cause: default autovacuum settings on high-write systems are insufficient.
Best practices:
- Keep autovacuum enabled; tune autovacuum_vacuum_scale_factor and autovacuum_vacuum_threshold per table if necessary.
- Increase autovacuum_max_workers and reduce autovacuum_naptime for busy databases.
- Use manual VACUUM (FULL only rarely) and VACUUM FREEZE for aging clusters when needed.
# recommended postgresql.conf snippets
autovacuum = on
autovacuum_max_workers = 6
autovacuum_naptime = 30
autovacuum_vacuum_scale_factor = 0.02 # 2% before autovacuum kicks in
autovacuum_vacuum_threshold = 50
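For a handful of hot tables, the global scale factor above can still be too coarse. Per-table storage parameters override it; a sketch (the table name is illustrative):

```sql
-- vacuum this table after ~10,000 dead tuples, regardless of table size;
-- a scale factor of 0 makes the fixed threshold the only trigger
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor = 0,
    autovacuum_vacuum_threshold = 10000
);
```

This keeps a busy table on a predictable vacuum cadence without lowering thresholds cluster-wide.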
Inspect bloat and autovacuum activity:
-- find tables with many dead tuples
SELECT relname, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 20;
-- check table sizes (identify bloat candidates)
SELECT relname, pg_total_relation_size(relid) AS total_size
FROM pg_catalog.pg_statio_user_tables
ORDER BY total_size DESC
LIMIT 10;
When autovacuum cannot keep up, schedule targeted VACUUM (VERBOSE, ANALYZE) runs, and consider REINDEX CONCURRENTLY for index bloat or tools like pg_repack to rebuild bloated tables without long locks.
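To gauge the wraparound risk mentioned earlier and to target manual vacuums, queries along these lines help (the table name in the VACUUM example is illustrative):

```sql
-- databases closest to transaction ID wraparound; autovacuum forces an
-- aggressive freeze as age approaches autovacuum_freeze_max_age
-- (200 million by default)
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;

-- targeted manual vacuum with progress output and refreshed statistics
VACUUM (VERBOSE, ANALYZE) orders;
```

Alert well before xid_age reaches the freeze limit; an emergency wraparound vacuum on a large table is far more disruptive than a scheduled one.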
3. Indexing: too many or the wrong indexes
Indexes speed reads and slow writes. Common mistakes: creating indexes without analyzing queries, leaving unused indexes, or relying on defaults instead of partial/expression/BRIN indexes.
How I approach indexing as a senior DBA:
- Always start with EXPLAIN ANALYZE to understand planner choices.
- Create indexes concurrently in production with CREATE INDEX CONCURRENTLY to avoid blocking writes.
- Prefer partial indexes for selective predicates (e.g. WHERE status = 'active'), expression indexes for case-insensitive searches, and BRIN for large append-only time-series tables.
-- examples
-- case-insensitive email searches
CREATE INDEX CONCURRENTLY idx_users_lower_email ON users (lower(email));
-- partial index for active rows
CREATE INDEX CONCURRENTLY idx_orders_active ON orders (created_at) WHERE status = 'active';
-- BRIN for huge time-series
CREATE INDEX CONCURRENTLY idx_events_time_brin ON events USING brin(event_time);
Identify unused indexes and costs:
SELECT schemaname, relname, indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC
LIMIT 20;
Drop unused indexes only after careful verification. Every unnecessary index increases INSERT/UPDATE cost and storage footprint, which matters when PDPL compliance requires retaining large audit tables.
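When dropping, use the concurrent form so the removal does not block concurrent writes (the index name is illustrative):

```sql
-- avoids an exclusive lock on the table; like its CREATE counterpart,
-- this cannot run inside a transaction block
DROP INDEX CONCURRENTLY IF EXISTS idx_orders_legacy;
```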
Quick operational checklist
- Deploy PgBouncer for high-concurrency workloads; avoid connecting hundreds of clients directly to PostgreSQL.
- Monitor autovacuum stats weekly; tune thresholds by table size and write pattern.
- Use EXPLAIN ANALYZE before adding indexes. Prefer CONCURRENTLY and partial/expression indexes where appropriate.
- Automate routine checks with monitoring (pg_stat_* views, PgBouncer metrics) and alert on bloat, autovacuum lag, and connection saturation.
- For cloud or regulated deployments in Saudi Arabia, integrate these practices with encryption and access controls. PGEE by CYBERTEC can add TDE, masking, and auditing for PDPL compliance where needed.
Final notes from the field
Most performance problems are operational rather than theoretical. Connection pooling, timely vacuuming, and purposeful indexing form the backbone of predictable PostgreSQL operations. In GCC and Saudi projects driven by Vision 2030, efficiency and compliance go hand-in-hand: smaller, well-tuned databases cost less to secure, back up, and scale.
Ready to Transform Your Database Infrastructure?
Worlber helps companies across Saudi Arabia and the Middle East build reliable, secure, and cost-effective database solutions.
📧 Email us: contactus@worlber.com
🌐 Visit: worlber.sa/contact
📞 Call: Talk to our database experts today
Worlber — Your trusted PostgreSQL partner in the Middle East