Stop using OFFSET for pagination — it won't scale
Meta description: LIMIT/OFFSET pagination degrades linearly with dataset size. Keyset cursor pagination uses index seeks to deliver near-constant performance at any page depth.
Tags: backend, api, rest, architecture, graphql
TL;DR
LIMIT/OFFSET pagination forces your database to scan and discard N rows before returning results, so page 10,000 is orders of magnitude slower than page 1. Keyset (cursor) pagination uses indexed column seeks to jump directly to the next result set, delivering consistent performance regardless of depth. If your dataset exceeds a few thousand rows and users (or integrations) page deeply, you need cursor pagination. Full stop.
The hidden cost of OFFSET
Most teams assume the database “skips” rows efficiently when using OFFSET. It doesn’t. When you run:
SELECT * FROM orders ORDER BY id LIMIT 20 OFFSET 100000;
PostgreSQL scans 100,020 rows, discards the first 100,000, and returns the remaining 20. Every discarded row costs CPU and I/O. The deeper you page, the worse it gets. Performance degrades linearly with the offset value.
In my experience, this is the single most common API performance issue in backend services that have outgrown their MVP phase. It starts small. Pagination works fine with 5,000 rows. Then you hit 500,000 rows, and your page-50 request takes 2 seconds. By the time you reach millions of rows, deep pages are timing out entirely.
What EXPLAIN ANALYZE reveals
-- OFFSET approach (offset 100,000 ≈ page 5,000 at 20 rows per page)
EXPLAIN ANALYZE
SELECT * FROM orders ORDER BY id LIMIT 20 OFFSET 100000;
-- Result: Limit over Index Scan using orders_pkey — the index is walked, but rows are still read and discarded
-- Planning Time: 0.08 ms
-- Execution Time: 112.45 ms (scanned 100,020 rows)
-- Keyset approach (same logical page)
EXPLAIN ANALYZE
SELECT * FROM orders WHERE id > 100000 ORDER BY id LIMIT 20;
-- Result: Index Scan using orders_pkey
-- Planning Time: 0.07 ms
-- Execution Time: 0.12 ms (scanned 20 rows)
That’s ~900x faster at this depth. The keyset query performs an index seek directly to the starting point, then reads exactly 20 rows. No scanning, no discarding. Consistent O(log n + k) performance where k is your page size.
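If you want to reproduce these numbers locally, a minimal setup is enough (table shape and row count here are arbitrary; your timings will vary with hardware):

```sql
-- Throwaway table with 1M rows for benchmarking both query shapes
CREATE TABLE orders (
    id         bigserial PRIMARY KEY,
    total      numeric,
    created_at timestamptz NOT NULL DEFAULT now()
);

INSERT INTO orders (total)
SELECT round((random() * 100)::numeric, 2)
FROM generate_series(1, 1000000);

ANALYZE orders;  -- refresh statistics so the planner picks realistic plans
```

Then run both EXPLAIN ANALYZE queries above and compare execution times as you increase the offset.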
OFFSET vs cursor: the full comparison
| Criteria | LIMIT/OFFSET | Keyset cursor |
|---|---|---|
| Performance at depth | Degrades linearly with offset | Near-constant at any depth |
| DB operation | Reads and discards all offset rows | Index seek + bounded scan |
| CPU/IO cost | Grows with page number | Fixed per request |
| Duplicate/skipped rows on concurrent writes | Yes — rows shift between pages | No — cursor position is stable |
| Arbitrary page jumping | Supported (page 1, 50, 200) | Not natively supported |
| Implementation complexity | Trivial | Moderate |
| Suited for infinite scroll / feeds | Poor | Excellent |
| Suited for numbered page UIs | Adequate for small datasets | Requires adaptation |
Implementing cursor pagination
Basic REST API pattern
Your API response includes a next_cursor that the client passes back:
{
  "data": [...],
  "pagination": {
    "next_cursor": "eyJpZCI6MTAwMDIwfQ==",
    "has_more": true
  }
}
The cursor is typically a Base64-encoded representation of the last row’s sort key. On the server side:
// Ktor route example (the query DSL is illustrative — adapt it to your ORM)
@Serializable
data class Cursor(val lastId: Long)

get("/orders") {
    // Decode the opaque cursor: Base64 -> JSON -> Cursor
    val cursor = call.parameters["cursor"]
        ?.let { String(Base64.getDecoder().decode(it)) }
        ?.let { Json.decodeFromString<Cursor>(it) }
    // Fetch one row beyond the page size to compute has_more
    val rows = db.orders
        .run { if (cursor != null) where { id greater cursor.lastId } else this }
        .orderBy(Orders.id)
        .limit(21)
        .toList()
    val orders = rows.take(20)
    // The next cursor is built from the last returned row's sort key
    val nextCursor = if (rows.size > 20)
        Base64.getEncoder().encodeToString(
            Json.encodeToString(Cursor(lastId = orders.last().id)).toByteArray()
        )
    else null
    call.respond(PagedResponse(orders, nextCursor, hasMore = rows.size > 20))
}
Spring Boot follows the same pattern: decode the cursor, apply a WHERE clause, encode the next cursor from the last result.
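The encode/decode step is small enough to sketch in full. This stdlib-only Kotlin version round-trips a cursor token; the helper names (`encodeCursor`, `decodeCursor`) and the single-field JSON payload are illustrative choices, not a library API:

```kotlin
import java.util.Base64

// The token encodes only the last row's sort key. Base64 keeps it URL-safe
// and signals to clients that the contents are opaque, not part of your contract.
data class Cursor(val lastId: Long)

fun encodeCursor(cursor: Cursor): String =
    Base64.getUrlEncoder().withoutPadding()
        .encodeToString("""{"id":${cursor.lastId}}""".toByteArray())

fun decodeCursor(token: String): Cursor {
    val json = String(Base64.getUrlDecoder().decode(token))
    // Minimal parse for this single-field payload; use a real JSON library in practice
    val id = Regex("\"id\":(\\d+)").find(json)
        ?.groupValues?.get(1)?.toLong()
        ?: throw IllegalArgumentException("malformed cursor")
    return Cursor(lastId = id)
}

fun main() {
    val token = encodeCursor(Cursor(lastId = 100020))
    println(token)               // prints eyJpZCI6MTAwMDIwfQ
    println(decodeCursor(token)) // prints Cursor(lastId=100020)
}
```

Rejecting malformed tokens with an error (rather than silently starting from page one) makes cursor bugs in clients visible early.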
The composite cursor edge case
When sorting by a non-unique column (e.g., created_at), you need a composite cursor that includes a tiebreaker:
-- Sorting by created_at (non-unique) + id (unique tiebreaker)
SELECT * FROM orders
WHERE (created_at, id) > (:last_timestamp, :last_id)
ORDER BY created_at ASC, id ASC
LIMIT 20;
This tuple comparison leverages PostgreSQL’s row-value comparison and can use a composite index on (created_at, id) efficiently. Without the tiebreaker, rows with identical timestamps get skipped or duplicated.
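For the tuple comparison to stay a single index range scan, you need a composite index whose column order matches the ORDER BY. A sketch (the index name is arbitrary):

```sql
-- Column order matches ORDER BY created_at ASC, id ASC, so the
-- row-value comparison seeks directly to the cursor position
CREATE INDEX orders_created_at_id_idx ON orders (created_at, id);
```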
GraphQL connections
The Relay Connection specification was built for cursor pagination. If you’re building a GraphQL API, this is the standard approach:
query {
  orders(first: 20, after: "eyJpZCI6MTAwMDIwfQ==") {
    edges {
      node { id total createdAt }
      cursor
    }
    pageInfo { hasNextPage endCursor }
  }
}
When OFFSET is still fine
I’m not dogmatic about this. OFFSET works when your dataset is small (under ~10,000 rows), users never page past the first few pages, and you need numbered page navigation. Admin dashboards with modest data volumes are a classic example. The real problem is when teams pick OFFSET as a default and never revisit the decision as data grows.
What to do now
Run EXPLAIN ANALYZE on your paginated queries at realistic offsets (page 100, 500, 1000). If execution time grows linearly with the page number, you’ve got a problem that will only get worse.
For any API-facing or feed-style pagination, default to keyset cursors. Encode composite cursors (sort column + unique tiebreaker) as opaque Base64 tokens so you handle the non-unique sort column case from day one.
Reserve OFFSET for admin UIs with small, bounded datasets. If you need page-number navigation on larger datasets, consider a hybrid: keyset pagination under the hood with a separate count query (cached) for the page count display.
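One way to sketch that hybrid in PostgreSQL: keyset for the rows, and a cheap estimated count for the page-number display. The estimate comes from planner statistics (refreshed by autovacuum/ANALYZE), so it trades accuracy for speed — fine for a UI label, not for billing:

```sql
-- Rows for the current page: keyset, as usual
SELECT * FROM orders WHERE id > :last_id ORDER BY id LIMIT 20;

-- Approximate total for "Page X of Y" — avoids a full count(*) scan
SELECT reltuples::bigint AS estimated_rows
FROM pg_class
WHERE relname = 'orders';
```

If the display needs an exact total, run `count(*)` on a schedule and cache the result instead of computing it per request.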