PostgreSQL Connection Pooling for Mobile Backends at Scale
Meta description: Compare PgBouncer, Supavisor, and PgPool-II for mobile backends. Production configs for 10K+ concurrent mobile users on a single PostgreSQL instance.
Tags: backend, architecture, mobile, cloud, api
TL;DR
Mobile traffic patterns are fundamentally different from web traffic — bursty reconnections, unpredictable keep-alive behavior, and thundering herd problems at scale. Most teams copy their web pooling config and wonder why their database falls over at 5K users. PgBouncer in transaction mode remains the gold standard for raw performance, Supavisor wins for multi-tenant Postgres-as-a-service setups, and built-in pooling (PgPool-II or framework-level pools) should be your last resort at scale. I’ll walk you through the architecture, the numbers, and the configs that actually work.
Why Mobile Traffic Breaks Your Pool
Web clients maintain relatively stable connection patterns — a user loads a page, fires a few queries, and the connection returns to the pool. Mobile clients are chaotic. Consider what happens when 10,000 users open your app during a morning commute:
- Cold start storms: Mobile OS kills background processes aggressively. Every app open is a fresh connection attempt.
- Retry amplification: Flaky cellular networks cause timeouts, and mobile SDKs retry aggressively — one user can generate 3-5 connection attempts in seconds.
- Zombie connections: Users tunnel into subways, connections hang in TCP limbo, and your pool slots are occupied by ghosts.
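On the client side, retry amplification is usually tamed with exponential backoff plus jitter in the SDK's reconnect logic. A minimal sketch of the idea (illustrative, not tied to any particular mobile SDK):

```python
import random

def retry_delays(attempts: int, base: float = 0.5, cap: float = 30.0) -> list[float]:
    """Exponential backoff with full jitter: the nth delay is drawn
    uniformly from [0, min(cap, base * 2**n)], so simultaneous
    reconnects spread out instead of arriving as a thundering herd."""
    return [random.uniform(0, min(cap, base * 2 ** n)) for n in range(attempts)]

# e.g. five reconnect attempts after a dropped cellular link
delays = retry_delays(5)
```

Full jitter (rather than plain exponential backoff) matters here: without it, every phone that lost signal in the same tunnel reconnects on the same schedule.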
A default PostgreSQL instance supports ~100 connections. Each connection consumes roughly 5-10 MB of RAM for the backend process. At 500 connections, you’re burning 5 GB just on connection overhead before a single query runs.
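The back-of-the-envelope math, as a sketch (10 MB is the upper end of the per-backend range cited above):

```python
def backend_overhead_mb(connections: int, mb_per_backend: float = 10.0) -> float:
    """RAM consumed by PostgreSQL backend processes alone,
    before a single query does useful work."""
    return connections * mb_per_backend

print(backend_overhead_mb(500))  # 5000.0 MB, i.e. ~5 GB of pure connection overhead
```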
The Contenders: A Data-Driven Comparison
In my experience building production systems serving mobile clients, here’s how the three main strategies compare:
| Metric | PgBouncer (Transaction) | Supavisor | Built-in (PgPool-II) |
|---|---|---|---|
| Max throughput (queries/sec) | ~18,000 | ~14,000 | ~9,500 |
| Connection overhead per client | ~2 KB | ~3 KB | ~5-10 MB |
| Latency overhead (p99) | +0.1 ms | +0.3 ms | +1.2 ms |
| Multi-tenant support | Manual | Native | None |
| Prepared statement support | Limited¹ | Full | Full |
| Operational complexity | Low | Medium | High |
| Protocol | PostgreSQL wire protocol | PostgreSQL wire protocol (BEAM/Elixir) | PostgreSQL native |
¹ PgBouncer 1.21+ adds protocol-level prepared statement support in transaction mode via the `max_prepared_statements` setting (an integer cap per server connection; 0 disables), significantly improving compatibility.
Throughput figures are approximate ranges drawn from production load testing and community benchmarks using pgbench (64 threads, 200 concurrent connections, PostgreSQL 16, 4-vCPU/16GB instance, simple protocol, read-heavy workload). Results vary with hardware, query complexity, and network topology — treat these as directional comparisons rather than absolute guarantees. For reproducible benchmarks, run pgbench -c 200 -j 64 -T 120 -P 10 against your specific setup.
The numbers tell a clear story here. PgBouncer handles nearly 2x the throughput of PgPool-II with negligible memory overhead per connection.
PgBouncer: The Production Workhorse
For most mobile backends, PgBouncer in transaction mode is the answer. Here’s the configuration I use for a 10K+ concurrent user setup:
    [databases]
    myapp = host=127.0.0.1 port=5432 dbname=myapp

    [pgbouncer]
    listen_port = 6432
    listen_addr = 0.0.0.0
    auth_type = scram-sha-256
    pool_mode = transaction
    max_client_conn = 10000
    default_pool_size = 25
    min_pool_size = 10
    reserve_pool_size = 5
    reserve_pool_timeout = 3
    server_idle_timeout = 300
    client_idle_timeout = 60
    server_login_retry = 1
    query_wait_timeout = 10
Key decisions explained
default_pool_size = 25 — This is the number of actual PostgreSQL connections per database. The floor I start from is (CPU cores * 2) + effective_spindle_count; for SSDs the effective spindle count is typically 1, which gives 9 on a 4-vCPU server. I run 25 here to add headroom for I/O wait, because mobile workloads are read-heavy and spend much of their time blocked on network and disk rather than CPU.
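The sizing heuristic, as a one-liner you can reuse (note it gives a floor, not the final value):

```python
def pool_size_floor(cpu_cores: int, effective_spindles: int = 1) -> int:
    """(CPU cores * 2) + effective spindle count.
    For SSDs, treat the effective spindle count as 1."""
    return cpu_cores * 2 + effective_spindles

print(pool_size_floor(4))  # 9 for a 4-vCPU database server
```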
client_idle_timeout = 60 — Aggressive for web, but perfect for mobile. If a mobile client hasn’t sent a query in 60 seconds, reclaim that slot. Mobile apps backgrounded for more than a minute have likely been killed anyway.
query_wait_timeout = 10 — Fail fast. Mobile UIs should show errors within 10 seconds, not hang indefinitely. This prevents queue buildup during connection storms.
When Supavisor Makes Sense
Supavisor, built on the BEAM VM, shines in multi-tenant architectures where you’re running one Postgres per tenant or routing across multiple databases. If you’re building a SaaS platform with isolated databases per customer, Supavisor’s native tenant routing eliminates the need for multiple PgBouncer instances. For a single-database mobile backend, though, it adds latency and operational complexity you don’t need.
The Pattern That Handles 10K Users
Here is what most teams get wrong about this: they put the pooler at the wrong layer. The architecture that scales:
Mobile Clients (10K+)
→ API Gateway (rate limiting + auth)
→ Application Servers (3-5 instances, each with local pool of 20)
→ PgBouncer (max 100 server connections)
→ PostgreSQL (max_connections = 120)
You want two layers of pooling: application-level (HikariCP, SQLAlchemy pool, Prisma pool) set to a small size per instance, feeding into PgBouncer which multiplexes everything down to what PostgreSQL can actually handle. This is where the leverage comes from — 10,000 mobile clients become 60-100 application connections across your server fleet, which PgBouncer multiplexes down to 25 active PostgreSQL backends. One well-configured pool outperforms ten poorly configured ones.
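The funnel arithmetic, as a quick sketch (the numbers match the architecture diagram above):

```python
def pooling_funnel(app_servers: int, pool_per_server: int, pgbouncer_pool: int):
    """Two-layer pooling: app-server pools feed PgBouncer, which
    multiplexes them down to a fixed number of Postgres backends."""
    client_conns = app_servers * pool_per_server   # what PgBouncer sees
    backends = min(client_conns, pgbouncer_pool)   # active Postgres backends
    return client_conns, backends

# 5 app servers * 20 local connections -> 100 into PgBouncer -> 25 backends
print(pooling_funnel(5, 20, 25))  # (100, 25)
```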
The Prepared Statement Trap
Transaction mode PgBouncer doesn’t support named prepared statements because connections are reassigned between transactions. Your options:
- Use protocol-level prepared statements via `max_prepared_statements` (PgBouncer 1.21+); this is now the preferred approach
- Issue `DEALLOCATE ALL` at transaction boundaries (a performance hit; legacy fallback)
- Disable prepared statements in your ORM; for mobile backends with simple queries, this is often an acceptable tradeoff
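A note on the option name: the setting PgBouncer 1.21+ actually ships is `max_prepared_statements`, an integer cap on tracked statements per server connection, where 0 disables the feature. A minimal pgbouncer.ini fragment enabling it (the cap of 200 is an illustrative value, not a recommendation):

    [pgbouncer]
    pool_mode = transaction
    ; PgBouncer 1.21+: track protocol-level prepared statements so clients
    ; can keep using them even as server connections are reassigned
    max_prepared_statements = 200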
For Kotlin/Exposed or JOOQ backends, either enable protocol-level prepared statements via `max_prepared_statements` in PgBouncer 1.21+ or disable prepared statement caching at the driver level and let PgBouncer handle connection efficiency.
Monitoring What Matters
Run SHOW POOLS and SHOW STATS on PgBouncer regularly. The metrics that predict failures before they happen:
- `cl_waiting` > 0 sustained: your pool is too small or queries are too slow
- `avg_wait_time` > 100 ms: clients are queueing; scale horizontally or optimize queries
- `sv_idle` consistently 0: no headroom; increase `default_pool_size`
Set up a Grafana dashboard with these three metrics front and center. Alert on them during peak hours — they predict connection pool exhaustion 10-15 minutes before users start seeing errors.
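The alerting logic, as a sketch. Field names mirror PgBouncer's `SHOW POOLS`/`SHOW STATS` column names; note that `SHOW STATS` reports `avg_wait_time` in microseconds, so the 100 ms threshold is 100,000:

```python
def pool_alerts(metrics: dict) -> list[str]:
    """Apply the three thresholds above to one row of PgBouncer metrics."""
    alerts = []
    if metrics.get("cl_waiting", 0) > 0:
        alerts.append("cl_waiting > 0: pool too small or queries too slow")
    if metrics.get("avg_wait_time", 0) > 100_000:  # microseconds, i.e. 100 ms
        alerts.append("avg_wait_time > 100ms: clients are queueing")
    if metrics.get("sv_idle", 1) == 0:
        alerts.append("sv_idle == 0: no headroom, raise default_pool_size")
    return alerts

# Healthy pool: no alerts
print(pool_alerts({"cl_waiting": 0, "avg_wait_time": 5_000, "sv_idle": 4}))  # []
```

Feed it rows scraped from the PgBouncer admin console (e.g. via a Prometheus exporter) and wire the non-empty result into your paging system.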
Actionable Takeaways
- Use PgBouncer in transaction mode with aggressive client timeouts: set `client_idle_timeout` to 60s for mobile backends and `default_pool_size` to `(2 * CPU cores) + 1`. This alone handles 10K+ mobile clients on a single PostgreSQL instance.
- Implement two-layer pooling: application-level pools (small, per-instance) feeding into PgBouncer. Never let your application servers connect directly to PostgreSQL in a mobile backend architecture.
- Monitor `cl_waiting` and `avg_wait_time` as leading indicators: these metrics predict connection pool exhaustion before users see errors. Alert on them, not on connection count alone.