MVP Factory

Zero downtime schema migrations in distributed DBs

Krystian Wiewiór · 6 min read

Meta description: How CockroachDB, YugabyteDB, TiDB, and Spanner handle online DDL vs PostgreSQL. Strategies for zero downtime schema migrations on sharded clusters.

TL;DR

Single-node PostgreSQL takes aggressive locks during schema changes that block reads and writes. Distributed databases like CockroachDB, YugabyteDB, TiDB, and Spanner implement online DDL, meaning schema changes propagate across nodes without downtime. But each system handles it differently, with real trade-offs in performance, compatibility, and operational complexity. This is what you need to know before running ALTER TABLE on a 50-node cluster.

The problem: ALTER TABLE on a sharded cluster

The scariest moment in any release isn’t the code deploy. It’s the schema migration. On single-node PostgreSQL, an ALTER TABLE ... ADD COLUMN with a default value required a full table rewrite under an ACCESS EXCLUSIVE lock until PostgreSQL 11, which started storing non-volatile defaults in the catalog instead of rewriting the heap. Even today, operations like adding a NOT NULL constraint with validation or creating an index without CONCURRENTLY will block concurrent queries.
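To make that concrete, here is what the blocking and non-blocking variants of two common changes look like on PostgreSQL (a sketch; the orders table and constraint/index names are hypothetical):

```sql
-- Blocks reads and writes on orders while the index builds:
CREATE INDEX idx_orders_status ON orders (status);

-- Builds online, allowing concurrent reads and writes
-- (cannot run inside a transaction block):
CREATE INDEX CONCURRENTLY idx_orders_status ON orders (status);

-- Safer NOT NULL: add a CHECK constraint as NOT VALID first (brief lock),
-- then validate it in a second step that permits concurrent DML.
ALTER TABLE orders
    ADD CONSTRAINT orders_status_not_null CHECK (status IS NOT NULL) NOT VALID;
ALTER TABLE orders VALIDATE CONSTRAINT orders_status_not_null;
```

The two-step NOT NULL trick trades one long exclusive lock for a short lock plus a full scan that runs alongside normal traffic.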

Now multiply that across a distributed cluster. Dozens of nodes, each holding a subset of data, all needing to agree on what the schema looks like. A naive approach (lock the entire table globally) means seconds or minutes of downtime on large tables. Unacceptable for systems targeting five-nines availability.

How distributed databases solve online DDL

The core insight is schema versioning. Instead of a single atomic flip, distributed databases introduce the new schema gradually, allowing nodes to temporarily operate on different schema versions simultaneously.

Google Spanner’s schema change protocol ensures that at any point in time, no more than two schema versions are active across the cluster. CockroachDB adopted a similar model based on the Spanner paper, implementing a multi-phase state machine for schema changes.

A typical online ADD COLUMN looks like this:

Phase 1: DELETE-ONLY           → New column exists but only delete operations consider it
Phase 2: DELETE-AND-WRITE-ONLY → New column accepts writes but reads ignore it  
Phase 3: PUBLIC                → Column is fully visible to all operations

Each phase propagates across nodes with a lease-based protocol. Nodes that haven’t received the update yet continue operating on the prior version, and the protocol guarantees correctness across both versions.

Online DDL support compared

| Feature | PostgreSQL | CockroachDB | YugabyteDB | TiDB | Spanner |
|---|---|---|---|---|---|
| Add column | Lock required (fast since v11) | Online | Online | Online | Online |
| Add index | CONCURRENTLY option | Online (backfill) | Online | Online (internal backfill worker)¹ | Online |
| Drop column | ACCESS EXCLUSIVE lock | Online, multi-phase | Online | Online | Online |
| Add NOT NULL constraint | Full table scan + lock | Online validation | Online | Online | Online |
| Change column type | Rewrites table + lock | Limited support | Limited support | Supported via ALTER | Limited² |
| Schema versioning | None (single node) | Spanner-based protocol | Raft-based propagation | Schema lease + owner model | Native multi-version |
| Max concurrent DDLs | Limited by lock contention | Multiple (with caveats) | Multiple | Single DDL owner per table | Multiple |

¹ TiDB uses an internal DDL worker (not to be confused with TiDB Data Migration/DM, which handles external replication). See TiDB DDL documentation.

² Spanner supports some in-place type promotions (e.g., widening STRING length) without table rewrites, but does not support arbitrary type changes in-place.

Look at that table. Every distributed database has invested in making DDL non-blocking, yet column type changes remain painful across nearly every system. That’s where you’ll still feel the sting.

Migration strategies that actually work

1. Expand-contract pattern

The most battle-tested pattern across all distributed databases:

-- Phase 1: Expand -- add the new column
ALTER TABLE orders ADD COLUMN status_v2 INT;

-- Phase 2: Dual-write -- application writes to both columns
-- Phase 3: Backfill -- migrate existing data
UPDATE orders SET status_v2 = status WHERE status_v2 IS NULL;

-- Phase 4: Contract -- drop the old column
ALTER TABLE orders DROP COLUMN status;

This decouples the schema change from the application deploy. That matters when your cluster nodes might be running different binary versions during a rolling update.
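One caveat on phase 3: the single-statement backfill above is fine for small tables, but on a large distributed table it creates one enormous transaction. A batched variant keeps each write small (a sketch; the batch size and retry loop would live in your migration tooling, and the subquery form is used because UPDATE ... LIMIT is not standard SQL):

```sql
-- Backfill in bounded batches; re-run until 0 rows are affected.
UPDATE orders SET status_v2 = status
WHERE id IN (
    SELECT id FROM orders
    WHERE status_v2 IS NULL
    LIMIT 1000
);
```

Small batches also play better with the distributed transaction layer: each statement touches fewer ranges, so retries and contention stay localized.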

2. Declarative schema management

Tools like Atlas or CockroachDB’s built-in schema change jobs let you declare the desired state and let the system figure out the migration path. In my experience, this scales better than hand-written migration files once you’re past 100+ migrations. The upfront investment pays off.

3. Guard against long-running backfills

On a 500GB table distributed across 30 nodes, an index backfill can take hours. Both CockroachDB and TiDB support throttling backfills to limit impact on foreground traffic. Always configure this. I’ve seen unthrottled backfills consume enough I/O to cause latency spikes that trigger cascading timeouts.
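The exact knobs differ by system and version, so check the docs for yours. As one illustration, TiDB exposes global variables that control the DDL reorg workers doing the backfill (variable names current as of TiDB 6.x; the values here are illustrative):

```sql
-- TiDB: shrink the DDL reorg worker pool and per-batch row count
-- so index backfills yield I/O to foreground traffic.
SET @@global.tidb_ddl_reorg_worker_cnt = 2;
SET @@global.tidb_ddl_reorg_batch_size = 128;
```

Raising these speeds up the backfill at the cost of foreground latency; lowering them does the opposite. Tune against a staging cluster under realistic load before touching production.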

What most teams get wrong

They treat distributed DDL like single-node DDL. Write a migration, run it in CI, assume it’s safe. But distributed DDL introduces failure modes that don’t exist on a single node.

Node restarts mid-migration can leave schema changes in an intermediate state. In CockroachDB, schema change jobs are resumable. If a node crashes, check and resume the job:

-- CockroachDB: inspect paused/failed schema change jobs
SELECT job_id, status, description
FROM crdb_internal.jobs
WHERE job_type = 'SCHEMA CHANGE' AND status IN ('paused', 'failed');
RESUME JOB <job_id>;  -- paused jobs resume; jobs that already reverted must be re-run

Stuck or conflicting DDLs are common in TiDB when concurrent DDLs target the same table. Use the admin command to identify and cancel:

-- TiDB: view and cancel a stuck DDL
ADMIN SHOW DDL JOBS;
ADMIN CANCEL DDL JOBS job_id;

Clock skew (especially on non-Spanner systems without TrueTime) can cause version propagation delays. Monitor schema lease durations and set alerts when lease renewal latency exceeds your configured threshold.

One hard rule: always test migrations against a multi-node staging cluster that simulates failure injection (kill a node mid-backfill, introduce network partitions). A single-node dev instance tells you almost nothing about how a migration will behave in production.

What to do with all this

Use the expand-contract pattern for any non-trivial schema change. It works universally across PostgreSQL and every distributed database, and it cleanly separates schema changes from application deploys. There’s a reason everyone reaches for it.

Build a DDL compatibility matrix for your specific database version. Online DDL support evolves fast. CockroachDB v23.1 introduced declarative schema changer as the default, and TiDB v6.2 added concurrent DDL execution. What required downtime two versions ago may be fully online today. Check before you assume.

Throttle backfills, instrument migrations, and rehearse failure recovery. Set explicit rate limits on index builds, emit metrics from your migration tooling, and practice the RESUME JOB / ADMIN CANCEL DDL workflows before you need them at 2 AM. You don’t want your first time running these commands to be during an incident.


Tags: distributed-databases, schema-migration, cockroachdb, postgresql, devops

