# Troubleshooting
Problem, cause, fix. In that order.
## Connection exhaustion

### Symptom
Applications get "connection refused" or time out after 30 seconds waiting for a connection.
### Cause

All backend connections are in use. The pool is full and the `blocking_timeout` (30s default) expired before a connection became available.
### Fix

- Check for long-running transactions holding connections open
- Check for connection leaks in application code (connections not returned to the pool)
- Increase `MAX_CONNECTIONS` only if the above are ruled out
- Verify PostgreSQL's own `max_connections` is not the bottleneck
Increasing pool size without fixing the root cause just delays the problem. If the pool fills up consistently, the issue is almost always in application code.
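One way to check the first two points from the backend side is to query `pg_stat_activity`. A sketch, assuming a role that can see other sessions; the connection details are hypothetical:

```shell
# Run against the backend directly. 'idle in transaction' rows are the
# classic signature of a connection leak: the application began a
# transaction and never committed or returned the connection.
psql -h db.internal -p 5432 -U postgres -d appdb -c "
  SELECT pid, state,
         now() - xact_start AS transaction_age,
         left(query, 60)    AS last_query
  FROM pg_stat_activity
  WHERE state IN ('active', 'idle in transaction')
    AND xact_start < now() - interval '5 minutes'
  ORDER BY xact_start;"
```

Anything stuck in `idle in transaction` for minutes points at application code, not the pooler.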
## High latency despite pooler

### Symptom
Queries through the pooler are slower than expected. Adding the pooler made things worse, not better.
### Cause
Several possibilities, in order of likelihood:
- The queries themselves are slow (a pooler does not fix slow queries)
- All pool connections are busy and clients are waiting for one to free up
- The network hop between application and pooler is adding latency
- The pool size is too small for the concurrency level
### Fix

- Run the same query directly against PostgreSQL to isolate pooler overhead from query time
- Check if connections are waiting (application-side connection acquire time)
- Deploy the pooler as close to the application as possible (same node, same VPC)
- Review `MAX_CONNECTIONS` relative to actual concurrent query load
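The first check can be as simple as timing the same trivial query on both paths. A sketch against a live deployment; the hostnames are hypothetical:

```shell
# Same query, two paths. A large gap implicates the pooler or the network
# hop to it; similar times mean the query itself is the bottleneck.
time psql -h db.internal     -p 5432 -U app -d appdb -c 'SELECT 1'
time psql -h pooler.internal -p 6432 -U app -d appdb -c 'SELECT 1'
```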
## Incorrect connection sizing

### Symptom
PostgreSQL reports "too many connections" even though the pooler is running. Or: the pool has hundreds of idle connections that are never used.
### Cause

`MAX_CONNECTIONS` in the pooler exceeds PostgreSQL's own `max_connections`. Or: multiple pooler replicas each open their full pool size, and the total exceeds the backend limit.
### Fix

- Calculate: replicas × `MAX_CONNECTIONS` ≤ PostgreSQL `max_connections`
- Leave headroom for admin connections and monitoring tools
- Start with 25–50 per replica and increase only when connections are consistently waiting
Example: 2 replicas × 50 = 100 backend connections. If PostgreSQL has `max_connections = 100`, there is no room for anything else. Set it to `max_connections = 120` or reduce the pool size.
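The arithmetic is easy to mechanize. A minimal sketch; the numbers match the example above and should be replaced with your own deployment's values:

```shell
# Hypothetical deployment numbers; substitute your own.
replicas=2
pool_per_replica=50        # MAX_CONNECTIONS per pooler replica
pg_max_connections=120     # PostgreSQL's max_connections
headroom=10                # reserved for admin sessions and monitoring

total=$((replicas * pool_per_replica))
limit=$((pg_max_connections - headroom))

if [ "$total" -le "$limit" ]; then
    echo "OK: $total backend connections fit under the $limit available"
else
    echo "TOO HIGH: $total exceeds $limit; shrink the pool or raise max_connections"
fi
```

With the example values this prints `OK: 100 backend connections fit under the 110 available`.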
## Misconfigured authentication

### Symptom
Clients get "authentication failed" errors even though credentials work when connecting directly to PostgreSQL.
### Cause

- `PG_USERNAME`/`PG_PASSWORD` were set but do not match the PostgreSQL credentials
- PostgreSQL requires `scram-sha-256` but the pooler or client is attempting `md5`
- PostgreSQL's `pg_hba.conf` rejects connections from the pooler's IP address
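To check the second point, you can inspect how PostgreSQL actually stores each role's password. A sketch, run on the backend as a superuser; connection details are hypothetical:

```shell
# Hashes starting 'SCRAM-SHA-256$' vs 'md5' show how each password is
# stored. An md5-hashed password cannot satisfy a scram-sha-256 requirement;
# such roles need their password reset after changing password_encryption.
psql -h db.internal -p 5432 -U postgres -c "
  SELECT rolname, left(rolpassword, 14) AS hash_prefix
  FROM pg_authid
  WHERE rolpassword IS NOT NULL;"
psql -h db.internal -p 5432 -U postgres -c "SHOW password_encryption;"
```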
### Fix

- Use passthrough mode (no `PG_USERNAME`/`PG_PASSWORD`) unless you specifically need registered users
- Verify that PostgreSQL's `pg_hba.conf` allows connections from the pooler container or pod IP range
- Check that both client and server agree on the auth method
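A minimal `pg_hba.conf` entry for the second point might look like the following; the subnet is hypothetical and should match wherever the pooler actually runs:

```
# Allow the pooler's container/pod subnet (hypothetical: 10.0.0.0/16) with SCRAM auth
host    all    all    10.0.0.0/16    scram-sha-256
```

PostgreSQL matches `pg_hba.conf` entries top to bottom, so this line must appear before any broader `reject` rule.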
## Container and network issues

### Symptom
The container starts but clients cannot connect, or the container keeps restarting.
### Common causes and fixes
| Problem | Fix |
|---|---|
| Container starts, clients get "connection refused" | Port 6432 not published. Check `-p 6432:6432` or the Kubernetes Service. |
| Container restarts repeatedly | Check logs: `docker logs pgagroal`. Usually a config error or OOM. Check resource limits. |
| Pooler starts but cannot reach PostgreSQL | Verify `PG_BACKEND_HOST` is reachable from the container. Check DNS, network policies, security groups. |
| Connections work, then stop after backend restart | pgagroal recovers automatically (typically within 60s). If it does not, check that the backend is fully ready before clients retry. |
| Health check fails but container seems fine | The `ping` command needs access to the Unix socket in `/tmp`. If the filesystem is misconfigured (e.g., a missing `emptyDir`), the socket cannot be created. |
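For the last row, the pod needs a writable `/tmp` for the Unix socket. A minimal pod spec fragment, assuming the container is named `pgagroal`:

```yaml
# Pod spec fragment: give pgagroal a writable /tmp for its Unix socket
spec:
  containers:
    - name: pgagroal
      volumeMounts:
        - name: tmp
          mountPath: /tmp
  volumes:
    - name: tmp
      emptyDir: {}
```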
## Diagnostic checklist
When something is wrong, run through this in order:
1. Is the container running? `docker ps` or `kubectl get pods`
2. Is the daemon healthy? `docker exec pgagroal pgagroal-cli -c /etc/pgagroal/pgagroal.conf ping`
3. What do the logs say? `docker logs pgagroal`
4. Can the container reach PostgreSQL? `docker exec pgagroal pg_isready -h $PG_BACKEND_HOST -p $PG_BACKEND_PORT`
5. Can a client connect through the pooler? `psql -h localhost -p 6432 -U user -d db -c 'SELECT 1'`
If steps 1–3 pass but step 4 fails, the problem is between the pooler and PostgreSQL (network, DNS, firewall). If step 4 passes but step 5 fails, the problem is between the client and the pooler (port not published, auth mismatch).
See also: Configuration for pool sizing and timeouts, or Observability for monitoring pool health.