Replacing Your Kubernetes Cluster with a Single SQLite-Backed Binary: The Litestream Replication Architecture That Runs Your SaaS on a $5 VPS

TL;DR

Most early-stage SaaS products don’t need Kubernetes. A single Go or Rust binary embedding SQLite in WAL mode, paired with Litestream for continuous S3 replication, handles more concurrent users than you think. I’ll walk through the architecture, the tuning, the deployment model, and the exact traffic thresholds where this stops working.

Here’s what most teams get wrong

I’ve reviewed infrastructure for dozens of early-stage startups. The pattern is predictable: a founding engineer spins up a Kubernetes cluster on day one because “we’ll need to scale.” Six months later, they’re paying $200-$400/month on managed K8s, debugging Helm charts at 2 AM, and serving 50 daily active users.

Here’s what that actually looks like side by side:

Component	Kubernetes Stack	SQLite + Litestream
Compute	3-node cluster (~$150/mo)	Single $5-$10 VPS
Database	Managed PostgreSQL (~$15-$50/mo)	Embedded SQLite ($0)
Replication/Backups	Automated DB backups (~$5/mo)	Litestream to S3 (~$0.50/mo)
Load Balancer	Cloud LB (~$18/mo)	Caddy reverse proxy ($0)
Container Registry	Registry + CI/CD pipeline	`scp` a binary
Monthly Total	$200-$450	$6-$11
Operational Complexity	High	Near zero

That’s a 20-40x cost reduction with far less stuff that can break.

The architecture: single binary, continuous replication

Your application compiles to a single binary that opens an embedded SQLite database. Litestream runs as a sidecar process (or is embedded), continuously streaming WAL changes to S3-compatible storage.

[Your Binary (Go/Rust/etc.)]
        │
        ▼
   [SQLite - WAL Mode]
        │
        ▼
   [Litestream] ──stream──▶ [S3 Bucket]
        │
        ▼
   Point-in-time recovery from any snapshot

WAL-mode tuning for concurrent reads

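In SQLite’s default rollback-journal mode, readers can run alongside each other, but a writer needs exclusive access to commit, so readers and writers block one another. In WAL (Write-Ahead Logging) mode, readers keep reading while a write is in progress; only writes remain serialized. For a typical SaaS workload with heavy reads and modest writes, this is exactly what you want.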
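To see the difference for yourself, here is a minimal sketch using Python's standard `sqlite3` module. It holds a read transaction open on one connection and then attempts a write from another. The function name `write_during_read` and the table `t` are illustrative only, and both connections use a zero busy timeout so a lock conflict fails at once instead of waiting.

```python
import os
import sqlite3
import tempfile


def write_during_read(journal_mode: str) -> bool:
    """Return True if a write commits while another connection holds a read transaction open."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        # isolation_level=None: autocommit, so transactions are exactly what we spell out.
        # timeout=0: fail immediately on a lock conflict instead of retrying.
        writer = sqlite3.connect(path, timeout=0, isolation_level=None)
        writer.execute(f"PRAGMA journal_mode={journal_mode}")
        writer.execute("CREATE TABLE t (v INTEGER)")
        writer.execute("INSERT INTO t VALUES (1)")

        reader = sqlite3.connect(path, timeout=0, isolation_level=None)
        try:
            # Open a read transaction and leave it open.
            reader.execute("BEGIN")
            reader.execute("SELECT count(*) FROM t").fetchone()
            try:
                # Autocommit INSERT: it must commit right now or fail.
                writer.execute("INSERT INTO t VALUES (2)")
                return True
            except sqlite3.OperationalError:
                # "database is locked": the open reader blocked the commit.
                return False
        finally:
            reader.close()
            writer.close()
```

Calling `write_during_read("DELETE")` exercises the default rollback journal, and `write_during_read("WAL")` exercises WAL mode. In a real service you would set a non-zero busy timeout, so that brief conflicts wait instead of surfacing as errors.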

The pragmas to set at connection time:

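PRAGMA journal_mode=WAL;          -- persistent: recorded in the database file
PRAGMA busy_timeout=5000;         -- wait up to 5 s for a lock instead of failing with SQLITE_BUSY
PRAGMA synchronous=NORMAL;        -- fsync at checkpoints rather than on every commit
PRAGMA cache_size=-20000;         -- negative means KiB: roughly 20 MB of page cache
PRAGMA foreign_keys=ON;           -- off by default; must be set on each connection
PRAGMA wal_autocheckpoint=1000;   -- checkpoint after about 1000 WAL pages (the default)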

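Setting synchronous=NORMAL instead of FULL is the key performance lever. In WAL mode the database stays consistent either way; what you give up is durability of the last few commits if the OS crashes or the machine loses power, because the WAL is no longer fsynced on every commit. Litestream narrows that exposure but doesn’t eliminate it: replication is asynchronous, so a commit that hasn’t shipped to S3 yet can still be lost. In my benchmarks, this combination handles 10,000+ reads/second and 1,000+ writes/second on modest hardware.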
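A minimal sketch of applying these pragmas when a connection opens, again with Python's standard `sqlite3` module. The helper names `open_db`, `current_settings`, and `demo_settings` are illustrative; your driver or ORM may offer its own connection hook for this.

```python
import os
import sqlite3
import tempfile

# The pragmas from the list above, applied in order on every new connection.
PRAGMAS = (
    "PRAGMA journal_mode=WAL",
    "PRAGMA busy_timeout=5000",
    "PRAGMA synchronous=NORMAL",
    "PRAGMA cache_size=-20000",
    "PRAGMA foreign_keys=ON",
    "PRAGMA wal_autocheckpoint=1000",
)

SETTING_NAMES = (
    "journal_mode", "busy_timeout", "synchronous",
    "cache_size", "foreign_keys", "wal_autocheckpoint",
)


def open_db(path: str) -> sqlite3.Connection:
    """Open a connection and apply the tuning pragmas before any query runs."""
    conn = sqlite3.connect(path)
    for pragma in PRAGMAS:
        conn.execute(pragma)
    return conn


def current_settings(conn: sqlite3.Connection) -> dict:
    """Read the pragmas back so a typo does not go unnoticed."""
    return {name: conn.execute(f"PRAGMA {name}").fetchone()[0] for name in SETTING_NAMES}


def demo_settings() -> dict:
    """Open a throwaway on-disk database (WAL needs a real file) and report its settings."""
    with tempfile.TemporaryDirectory() as tmp:
        conn = open_db(os.path.join(tmp, "app.db"))
        try:
            return current_settings(conn)
        finally:
            conn.close()
```

Reading the values back is worth the extra query: SQLite silently ignores unknown pragma names, and `synchronous` reports as the integer 1 when set to NORMAL.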

S3-streamed point-in-time recovery

Litestream doesn’t rely on periodic dumps. It takes a snapshot of the database and then continuously ships new WAL frames to the replica, typically about a second behind the live database with default settings. That gives you point-in-time recovery granularity measured in seconds, not hours.

Recovery is a single command:

litestream restore -o /data/app.db s3://your-bucket/app.db

Your entire database restores from S3 in seconds for typical early-stage datasets. Compare that to restoring a PostgreSQL dump or waiting for a managed database snapshot to provision.
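One way to wire the restore into service startup is sketched below, assuming the `litestream` binary is on the PATH and using only the command shown above. The function names and the injectable `run` parameter are illustrative, not part of Litestream.

```python
import os
import sqlite3
import subprocess


def restore_command(db_path: str, replica_url: str) -> list:
    """Build the same restore invocation shown above as an argument list."""
    return ["litestream", "restore", "-o", db_path, replica_url]


def restore_if_missing(db_path: str, replica_url: str, run=subprocess.run) -> bool:
    """Restore from the replica only when no local database exists.

    Returns True if a restore was attempted. `run` is injectable so the
    logic can be exercised without Litestream installed.
    """
    if os.path.exists(db_path):
        return False
    run(restore_command(db_path, replica_url), check=True)
    return True


def passes_integrity_check(db_path: str) -> bool:
    """Ask SQLite to verify the file before the application starts serving from it."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("PRAGMA integrity_check").fetchone()[0] == "ok"
    finally:
        conn.close()
```

On a fresh VPS the startup sequence would be `restore_if_missing(...)`, then `passes_integrity_check(...)`, then starting the application. Skipping the restore when the file already exists avoids overwriting a live database with an older replica state.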

Single-binary deployment

Here’s the deployment script for your entire production infrastructure:

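#!/bin/bash
set -euo pipefail  # stop at the first failed step

# Upload next to the live binary so the swap is a rename on the same filesystem.
scp myapp user@vps:/opt/myapp/myapp-new
# Stop, swap, start. Assumes `user` is allowed to manage the myapp systemd unit.
ssh user@vps 'systemctl stop myapp && \
  mv /opt/myapp/myapp-new /opt/myapp/myapp && \
  systemctl start myapp'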

No container registry. No image layers. No pod scheduling. No rollout strategies. Your CI pipeline builds a binary, copies it to the server, and restarts the service. Downtime is the gap between stop and start, typically about a second for a binary that boots quickly. For an early-stage SaaS with a handful of users, this is honestly better than the alternative. Every layer of orchestration you remove is a layer that can’t page you at 3 AM.

Where this architecture breaks down

This is the section most advocates skip, and it’s the most important one. SQLite is single-writer. Here are the concrete thresholds:

Metric	SQLite comfortable range	When to migrate
Sustained write transactions	< 50/sec	When you consistently exceed this
Database file size	< 10 GB	When queries over large tables slow down
Need for horizontal read scaling	None	When a single box can’t serve reads
Multi-region requirements	None (not feasible)	When latency mandates geo-distribution

In my experience, the write ceiling is what you’ll hit first. A SaaS handling user-generated content (form submissions, API calls, event tracking) typically enters the discomfort zone at around 500-1,000 monthly active users generating write-heavy workloads. Read-heavy products like dashboards or content delivery can push well beyond that.

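The migration path is well-trodden: export to PostgreSQL using pgloader, update your queries, and deploy. Most standard SQL carries over unchanged; the usual friction points are SQLite’s loose typing, date and time handling, and auto-incrementing keys. Budget a weekend for a small schema.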
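A cheap sanity check for the migration is to record per-table row counts on the SQLite side before the export and compare them against PostgreSQL afterwards. The sketch below covers only the SQLite half; `table_row_counts` is an illustrative helper, not part of pgloader.

```python
import sqlite3


def table_row_counts(conn: sqlite3.Connection) -> dict:
    """Return {table_name: row_count} for every user table in the database."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
        # Skip SQLite's internal bookkeeping tables such as sqlite_sequence.
        if not row[0].startswith("sqlite_")
    ]
    counts = {}
    for name in names:
        # Quote the identifier so table names with spaces or quotes still work.
        quoted = '"' + name.replace('"', '""') + '"'
        counts[name] = conn.execute(f"SELECT count(*) FROM {quoted}").fetchone()[0]
    return counts
```

Matching counts do not prove the data converted correctly, since a mistyped column can still carry the right number of rows, but a mismatch is a fast signal that a table was skipped or truncated.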

A note on developer sustainability

I don’t think people talk about this enough: simple infrastructure reduces cognitive load. When your entire backend is one process and one file, you spend less mental energy on ops and more on the product. On that note, during long architecture sessions, I keep HealthyDesk running to remind me to step away from the desk; turns out your deployment architecture matters less than your spine’s architecture.

What to actually do with this

Start with SQLite + Litestream if you’re pre-product-market-fit. The $200+/month you save compounds, and the operational simplicity lets a solo developer or small team move faster. Set WAL mode, tune your pragmas, and point Litestream at an S3 bucket.

Define your migration trigger upfront. Monitor sustained write transactions per second. When you’re consistently above 50 writes/sec with latency creeping up, start planning the PostgreSQL migration. Not before.
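The trigger can be tracked inside the application with a sliding-window counter. This is a minimal sketch: the class name `WriteRateMonitor`, the five-minute default window, and the idea of calling `record()` after each committed write are assumptions for illustration. Only the 50 writes/sec threshold comes from the text above.

```python
import time
from collections import deque


class WriteRateMonitor:
    """Track the average write rate over a sliding time window."""

    def __init__(self, threshold_per_sec: float = 50.0, window_sec: float = 300.0):
        self.threshold_per_sec = threshold_per_sec
        self.window_sec = window_sec
        self._stamps = deque()  # timestamps of recent writes, oldest first

    def record(self, now: float = None) -> None:
        """Call once per committed write transaction."""
        self._stamps.append(time.monotonic() if now is None else now)

    def rate(self, now: float = None) -> float:
        """Average writes per second over the last `window_sec` seconds."""
        now = time.monotonic() if now is None else now
        cutoff = now - self.window_sec
        # Drop timestamps that have aged out of the window.
        while self._stamps and self._stamps[0] <= cutoff:
            self._stamps.popleft()
        return len(self._stamps) / self.window_sec

    def should_plan_migration(self, now: float = None) -> bool:
        """True when the windowed average exceeds the threshold."""
        return self.rate(now) > self.threshold_per_sec
```

Averaging over several minutes keeps a short burst from tripping the alarm, which matches the "consistently above" wording. Until a full window has elapsed after startup, the rate reads low. Pair the check with write latency from your own metrics before acting on it.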

Treat your binary as the artifact, not a container image. A statically-linked binary with an embedded database is the simplest possible deployment unit. Add complexity only when traffic demands it, and let the data drive that decision instead of your anxiety about scale.

Scale your ambition first. Scale your servers when the metrics tell you to.
