# Database Migration Plan: Postgres → TiDB

We're hitting vertical scaling limits on our primary Postgres instance. Write throughput is capped at ~8K TPS and we need 25K+ for the holiday season.

## Why TiDB

- MySQL-compatible wire protocol (most ORMs work out of the box)
- Horizontal write scaling via Raft-based sharding
- HTAP: real-time analytics without a separate data warehouse
- Strong consistency (no eventual consistency surprises)

## Migration Strategy

```mermaid
gantt
    title Migration Timeline
    dateFormat  YYYY-MM-DD
    section Phase 1
    Schema conversion     :a1, 2026-04-01, 14d
    Dual-write setup      :a2, after a1, 7d
    section Phase 2
    Shadow traffic         :b1, after a2, 14d
    Data validation        :b2, after b1, 7d
    section Phase 3
    Cutover (read)         :c1, after b2, 3d
    Cutover (write)        :c2, after c1, 3d
    Decommission Postgres  :c3, after c2, 14d
```

## Schema Changes Required

1. Replace `SERIAL` with `AUTO_RANDOM` for distributed ID generation
2. Convert `JSONB` columns to `JSON` (TiDB uses MySQL JSON)
3. Replace Postgres-specific extensions (pg_trgm → full-text index)
4. Adjust isolation level expectations (TiDB uses snapshot isolation)

## Risks

| Risk | Impact | Mitigation |
|------|--------|------------|
| ORM compatibility gaps | High | Run full test suite against TiDB in CI |
| Latency increase on joins | Medium | Pre-optimize slow queries, add covering indexes |
| Operational complexity | Medium | Team training, runbook creation |
| Data loss during dual-write | Critical | Checksums + reconciliation job every 5min |

## Rollback Plan

Dual-write runs for full 14 days. If issues arise after cutover:
1. Flip read traffic back to Postgres (instant)
2. Replay TiDB writes to Postgres via CDC
3. Full rollback within 30 minutes