Backpressure-Aware SSE Reconnection in Mobile Clients: EventSource Gaps, Exponential Backoff with Jitter, and the Kotlin Flow Architecture That Prevents Message Loss During Network Transitions

TL;DR

Standard EventSource implementations silently drop messages during mobile network transitions. Last-Event-ID alone doesn’t guarantee delivery because servers can evict event history before your client reconnects. What you actually need: a Kotlin Flow-based SSE consumer backed by a local Write-Ahead Log (WAL), exponential backoff with jitter, and explicit backpressure signaling when the UI layer falls behind the event stream.

The problem most teams ignore

I’ve built several production systems that rely on Server-Sent Events. Teams spend 90% of their effort on the server push architecture and roughly 0% thinking about what happens on the client when a user walks from wifi into an elevator.

What actually happens during a cellular-to-wifi handoff:

The TCP connection drops silently (no FIN, no RST)
The OS detects the new network after 2-15 seconds
The EventSource implementation reconnects
Events emitted during the gap are gone

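The EventSource spec (originally published by the W3C, now part of the WHATWG HTML Living Standard) defines Last-Event-ID as the recovery mechanism. On reconnection the client sends the last received ID in a Last-Event-ID request header, and a server that keeps history replays from that point. In theory, this works. In practice, most server implementations use bounded in-memory buffers for event history, and some keep none at all.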
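To make the client half of that handshake concrete, here is a minimal sketch using only the JDK's HttpURLConnection. The function name buildSseRequest and the URL are placeholders of mine, not part of any library, and a production Android client would more likely use an HTTP library such as OkHttp.

```kotlin
import java.net.HttpURLConnection
import java.net.URI

// Builds (but does not open) an SSE request that asks the server to resume
// after lastEventId. buildSseRequest is a hypothetical helper for illustration.
fun buildSseRequest(url: String, lastEventId: String?): HttpURLConnection {
    val connection = URI(url).toURL().openConnection() as HttpURLConnection
    connection.setRequestProperty("Accept", "text/event-stream")
    connection.setRequestProperty("Cache-Control", "no-cache")
    // On the very first connection there is no ID yet, so the header is omitted.
    if (lastEventId != null) {
        connection.setRequestProperty("Last-Event-ID", lastEventId)
    }
    return connection
}
```

Nothing goes over the wire until the caller reads from `connection.inputStream`. Whether the server honors the header is entirely up to the server.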

Server Implementation	Default Buffer	Eviction Policy
Node.js `sse-channel`	500 events	FIFO ring buffer
Go `r3labs/sse`	1000 events	Time-based (5 min)
Spring SseEmitter	None	No replay support
Nginx `nchan`	Configurable	Memory + time

If your client stays disconnected longer than the server retains history, the Last-Event-ID it sends no longer matches anything the server can replay, and the stream simply resumes with new events. No error. No indication of loss. Just silence.
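The silent failure is easy to reproduce with a toy model of a server-side history buffer. The sketch below is a simplification I am assuming for illustration (BoundedHistory and Event are hypothetical names, not the API of any library in the table): a FIFO buffer that replays nothing when the requested ID has already been evicted.

```kotlin
// Hypothetical model of a bounded server-side event history.
data class Event(val id: Long, val data: String)

class BoundedHistory(private val capacity: Int) {
    private val events = ArrayDeque<Event>()

    fun append(event: Event) {
        if (events.size == capacity) events.removeFirst() // FIFO eviction
        events.addLast(event)
    }

    // Naive replay: everything after lastId, or an empty list when lastId
    // is no longer in the buffer. The caller cannot tell the two cases apart.
    fun replayAfter(lastId: Long): List<Event> {
        val index = events.indexOfFirst { it.id == lastId }
        return if (index == -1) emptyList() else events.drop(index + 1)
    }
}
```

With a capacity of 3 and events 1 through 5 appended, `replayAfter(4)` returns event 5, while `replayAfter(1)` returns an empty list even though events 2 through 5 were all emitted after it.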

Exponential backoff with jitter: getting reconnection right

The SSE spec defines a server-sent retry field to control reconnection timing. Without it, most implementations fall back to a fixed delay of around 3 seconds. On mobile, this is wrong for two reasons: it creates thundering herd problems when thousands of clients reconnect after a regional outage, and it wastes battery during extended dead zones.
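For reference, the retry field arrives as an ordinary line in the event stream, next to id and data. The following is a minimal line-oriented parser sketch that follows the field rules of the SSE wire format as I understand them; SseParser and ParsedEvent are illustrative names, and the event field is deliberately ignored to keep it short.

```kotlin
data class ParsedEvent(val id: String?, val data: String)

class SseParser {
    var lastEventId: String? = null
        private set
    var retryMs: Long? = null
        private set
    private val data = StringBuilder()

    // Feed one line without its terminator. Returns an event when a blank
    // line completes one, otherwise null.
    fun feed(line: String): ParsedEvent? {
        if (line.isEmpty()) {
            if (data.isEmpty()) return null // nothing buffered, nothing to dispatch
            val event = ParsedEvent(lastEventId, data.toString().removeSuffix("\n"))
            data.clear()
            return event
        }
        if (line.startsWith(":")) return null // comment, often used as keep-alive
        val field = line.substringBefore(':')
        val value = line.substringAfter(':', "").removePrefix(" ")
        when (field) {
            "data" -> data.append(value).append('\n')
            "id" -> if ('\u0000' !in value) lastEventId = value
            // Only all-digit values are valid; anything else is ignored.
            "retry" -> if (value.isNotEmpty() && value.all { it in '0'..'9' }) {
                retryMs = value.toLongOrNull() ?: retryMs
            }
        }
        return null
    }
}
```

A client that implements its own backoff can treat `retryMs` as a floor or a hint for the base delay rather than as the whole policy.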

The mistake I see constantly: teams implement backoff without jitter, which just shifts the thundering herd to a later time slot.

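import kotlin.math.pow
import kotlin.random.Random

// Delay in milliseconds before reconnect attempt number `attempt` (0-based).
fun reconnectDelay(attempt: Int, baseMs: Long = 1000L, maxMs: Long = 30_000L): Long {
    // Clamp the exponent to 0..5 so the multiplier stays between 1x and 32x.
    val exponential = baseMs * 2.0.pow(attempt.coerceIn(0, 5)).toLong()
    val capped = exponential.coerceAtMost(maxMs)
    // Jitter: pick a random delay between 50% and 100% of the capped value.
    return (capped * Random.nextDouble(0.5, 1.0)).toLong()
}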

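The numbers are stark. With 10,000 clients reconnecting, a fixed 3-second retry concentrates all connections in roughly a single 100ms window. Once clients reach the 30-second cap, the jitter above spreads them across the 15 seconds between 15 s and 30 s, about a 150x reduction in peak load (15,000 ms / 100 ms). Strictly speaking, drawing from 50-100% of the delay is closer to what AWS calls equal jitter; full jitter draws from 0 to the cap and would spread clients across the entire 30 seconds.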
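A quick simulation makes the spread visible. This is a sketch under idealized assumptions: every client drops at the same instant, the fixed-retry clients all fire at exactly 3,000 ms, and the jittered clients are all on an attempt that has already hit the cap. jitteredDelay mirrors the function above but takes an explicit Random so the run is repeatable; peakPerBucket is a helper name I made up.

```kotlin
import kotlin.math.pow
import kotlin.random.Random

// Capped exponential delay scaled by a random factor in [0.5, 1.0).
fun jitteredDelay(attempt: Int, random: Random, baseMs: Long = 1000L, maxMs: Long = 30_000L): Long {
    val exponential = baseMs * 2.0.pow(attempt.coerceIn(0, 5)).toLong()
    val capped = exponential.coerceAtMost(maxMs)
    return (capped * random.nextDouble(0.5, 1.0)).toLong()
}

// Largest number of reconnects that land in any single bucket of bucketMs.
fun peakPerBucket(delaysMs: List<Long>, bucketMs: Long = 100L): Int =
    delaysMs.groupingBy { it / bucketMs }.eachCount().values.maxOrNull() ?: 0

// Returns (peak with fixed retry, peak with jitter) for the given client count.
fun comparePeaks(clients: Int = 10_000, seed: Int = 42): Pair<Int, Int> {
    val random = Random(seed)
    val fixed = List(clients) { 3_000L }
    val jittered = List(clients) { jitteredDelay(attempt = 5, random = random) }
    return peakPerBucket(fixed) to peakPerBucket(jittered)
}
```

The fixed-retry peak equals the client count by construction. The jittered peak should hover near the client count divided by 150, with some random variation above that average.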

The Kotlin Flow architecture

The core architecture uses three layers connected via Kotlin Flows with explicit backpressure:

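import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.channels.BufferOverflow
import kotlinx.coroutines.flow.*

// WalDatabase, ConnectivityMonitor, SseEvent, and sseConnect() are app-defined and not shown here.
class ResilientSseConsumer(
    private val db: WalDatabase,
    private val connectivity: ConnectivityMonitor
) {
    fun events(): Flow<SseEvent> = channelFlow {
        // collectLatest cancels the previous block (and its connection) on every network change.
        connectivity.networkState.collectLatest { state ->
            if (state.isConnected) {
                // Resume from the last ID that was durably persisted, not from memory.
                val lastId = db.walDao().lastEventId()
                // An exception from sseConnect() propagates and fails this flow;
                // retrying with reconnectDelay() (e.g. via retryWhen) is not shown.
                sseConnect(lastId)
                    .onEach { event ->
                        // Persist before delivery so the event survives process death.
                        db.walDao().insert(event.toWalEntry())
                    }
                    // Bounded buffer: when it fills, the upstream read loop suspends.
                    .buffer(capacity = 64, onBufferOverflow = BufferOverflow.SUSPEND)
                    .collect { send(it) }
            }
        }
    }.flowOn(Dispatchers.IO)
}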

WAL-backed message buffer

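Every received event gets written to a write-ahead log (WAL) table in a Room database before delivery to the UI. The log survives process death, so on reconnection the client reads the last persisted event ID from SQLite, not from memory. (This is an application-level log table, distinct from SQLite’s own WAL journal mode.)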
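The ordering is the part that matters: persist first, deliver second, and recover the resume point from storage. The sketch below illustrates that with a plain file instead of Room so it runs anywhere; FileWal is a stand-in I am assuming for illustration, not the article's WalDatabase, and it skips fsync and compaction that a real log would need.

```kotlin
import java.io.File

// Append-only log: one "eventId<TAB>data" record per line.
// A hypothetical stand-in for the Room-backed WAL table.
class FileWal(private val file: File) {

    // Call this before handing the event to the UI.
    fun append(eventId: String, data: String) {
        require('\t' !in eventId && '\n' !in eventId) { "eventId must be a single token" }
        // Escape newlines so each record stays on one line.
        file.appendText("$eventId\t${data.replace("\n", "\\n")}\n")
    }

    // Resume point after a restart: the ID of the last complete record, if any.
    fun lastEventId(): String? =
        if (!file.exists()) null
        else file.readLines().lastOrNull { it.isNotBlank() }?.substringBefore('\t')
}
```

A fresh `FileWal` instance pointed at the same file sees the same `lastEventId()`, which is the property the reconnect path relies on after process death.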

Network transition handling

Instead of relying on EventSource’s built-in reconnection, the architecture observes ConnectivityManager callbacks via collectLatest. When the network changes, the current connection is cancelled and a fresh one is established with the correct Last-Event-ID. collectLatest is the key operator here because it ensures only one active SSE connection exists at any time.
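The cancel-then-restart behavior of collectLatest can be observed in isolation. The following sketch assumes the kotlinx-coroutines-core dependency; the network names, the delays, and simulateHandoff itself are made up for the demonstration, with delay(200) standing in for a long-lived SSE read loop.

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.collectLatest
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.runBlocking

// Simulates a cellular-to-wifi handoff and records what happens to each "connection".
fun simulateHandoff(): List<String> = runBlocking {
    val log = mutableListOf<String>()
    // Hypothetical network-state stream: wifi appears 50 ms after cellular.
    val networks = flow {
        emit("cellular")
        delay(50)
        emit("wifi")
    }
    networks.collectLatest { network ->
        log += "open $network"
        try {
            delay(200) // stands in for a long-lived SSE read loop
            log += "completed $network"
        } finally {
            log += "close $network" // runs on cancellation too
        }
    }
    log
}
```

The returned log is `[open cellular, close cellular, open wifi, completed wifi, close wifi]`: the cellular block is cancelled and its cleanup finishes before the wifi block starts, so the two never overlap.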

Backpressure via `buffer(SUSPEND)`

When the UI can’t consume events fast enough (common during rapid state updates), .buffer(capacity = 64, onBufferOverflow = BufferOverflow.SUSPEND) applies backpressure upstream. The SSE read loop suspends and stops reading from the socket, the receive buffer fills, and TCP flow control pushes back on the server, whose writes then stall or queue depending on its implementation. On the client, no messages are dropped and memory stays bounded.
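The difference between suspending and dropping is easy to see with a fast producer and a slow consumer. This sketch assumes the kotlinx-coroutines-core dependency; consumeSlowly, the capacity of 4, and the 5 ms delay are arbitrary choices for the demonstration rather than values from the architecture above.

```kotlin
import kotlinx.coroutines.channels.BufferOverflow
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.buffer
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.flow.onEach
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.runBlocking

// Runs a fast producer against a slow consumer and returns what the consumer saw.
fun consumeSlowly(overflow: BufferOverflow): List<Int> = runBlocking {
    flow { for (i in 1..20) emit(i) }      // fast producer: stands in for the SSE read loop
        .buffer(capacity = 4, onBufferOverflow = overflow)
        .onEach { delay(5) }               // slow consumer: stands in for the UI
        .toList()
}
```

`consumeSlowly(BufferOverflow.SUSPEND)` returns all of 1 through 20 in order, because `emit` suspends whenever the buffer is full. With `BufferOverflow.DROP_OLDEST` the producer never suspends and older values are discarded to make room, which is the behavior to avoid when every event matters.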

Strategy	Memory behavior	Message loss	Process death recovery
Raw EventSource	Unbounded	Yes, on reconnect	None
EventSource + Last-Event-ID	Unbounded	Server buffer dependent	None
Flow + WAL + Backpressure	Bounded (64 events)	No	Full recovery

Handling the gap between server and client

Even with WAL persistence, there’s a window where the server may have evicted events that the client hasn’t yet received. The defense is a sequence number embedded in each event. On reconnection, the client compares the first received sequence number against its last persisted one. If there’s a gap, it triggers a full state sync via a REST fallback endpoint.

if (firstEvent.sequence - lastPersistedSequence > 1) {
    val fullState = api.getFullState()
    db.walDao().replaceAll(fullState)
}

This hybrid approach, SSE for real-time and REST for gap recovery, is the only pattern I’ve seen work reliably in production across flaky mobile networks.
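The gap check above can be pulled into a small pure function, which makes the edge cases easy to test. This is a sketch that assumes sequence numbers increase by exactly one per event; ResumeDecision, Continue, FullSync, and decideOnResume are names introduced here for illustration.

```kotlin
sealed interface ResumeDecision
object Continue : ResumeDecision
data class FullSync(val missing: LongRange) : ResumeDecision

// Compares the first sequence number seen after reconnecting with the last
// one persisted locally, and reports which sequence numbers were skipped.
fun decideOnResume(lastPersistedSequence: Long, firstReceivedSequence: Long): ResumeDecision =
    if (firstReceivedSequence - lastPersistedSequence > 1) {
        FullSync((lastPersistedSequence + 1) until firstReceivedSequence)
    } else {
        // Contiguous, or a replayed duplicate that the caller should de-duplicate by ID.
        Continue
    }
```

For example, `decideOnResume(41, 42)` yields `Continue`, while `decideOnResume(41, 45)` yields `FullSync(42..44)`, the signal to call the REST fallback.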

What to do with all this

Don’t trust Last-Event-ID alone. Persist event IDs in a local WAL and implement sequence gap detection with a REST fallback for full state recovery.
Use collectLatest with ConnectivityManager for network transitions. Don’t rely on EventSource reconnection. It’s unaware of the Android network lifecycle and can hold on to zombie connections during handoffs.
Apply explicit backpressure with buffer(SUSPEND). Unbounded event buffering on mobile leads to OOM crashes under burst traffic. Let suspension in the Kotlin Flow pipeline stall the socket read loop so that TCP flow control carries the backpressure to the server.
