mirror of
https://github.com/Memo-2023/mana-monorepo.git
synced 2026-05-14 21:41:09 +02:00
Final cleanup of references missed in previous rename commits: - Dockerfiles: PUBLIC_MANA_CORE_AUTH_URL → PUBLIC_MANA_AUTH_URL - Go modules: github.com/manacore/* → github.com/mana/* (7 go.mod files) - launchd plists: com.manacore.* → com.mana.* (14 files renamed + content) - Image assets: *_Manacore_AI_Credits* → *_Mana_AI_Credits* (11 files) - .env.example files: ManaCore brand strings → Mana - .prettierignore: stale apps/manacore/* paths → apps/mana/* - Markdown docs (CLAUDE.md, /docs/*): mana-core-auth → mana-auth, etc. Excluded from rename: .claude/, devlog/, manascore/ (historical content), client testimonials, blueprints, npm package refs (@mana-core/*). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
667 lines
20 KiB
Markdown
667 lines
20 KiB
Markdown
# Database Migration Guide
|
|
|
|
This document describes database migration best practices, procedures, and tooling for the Mana monorepo. **This is a core system concept** - all developers should understand these patterns.
|
|
|
|
## Table of Contents
|
|
|
|
1. [Overview](#overview)
|
|
2. [Drizzle Migration Internals](#drizzle-migration-internals)
|
|
3. [Migration Commands](#migration-commands)
|
|
4. [Development vs Production](#development-vs-production)
|
|
5. [CI/CD Pipeline](#cicd-pipeline)
|
|
6. [Advisory Locks](#advisory-locks)
|
|
7. [Zero-Downtime Migrations](#zero-downtime-migrations)
|
|
8. [Rollback Procedures](#rollback-procedures)
|
|
9. [Troubleshooting](#troubleshooting)
|
|
|
|
---
|
|
|
|
## Overview
|
|
|
|
All backends in the Mana monorepo use **Drizzle ORM** for database schema management. We use two different approaches depending on the environment:
|
|
|
|
| Environment | Command | Purpose |
|
|
|-------------|---------|---------|
|
|
| **Development** | `drizzle-kit push` | Fast iteration, direct schema sync |
|
|
| **Production** | `drizzle-kit generate` + `migrate` | Tracked migrations with history |
|
|
|
|
### Key Principles
|
|
|
|
1. **Migrations run BEFORE code deployment** - Ensures database is ready for new code
|
|
2. **Advisory locks prevent concurrent migrations** - Safe for multi-replica deployments
|
|
3. **Expand-contract pattern for breaking changes** - Zero-downtime schema changes
|
|
4. **Data persistence** - Migrations never delete user data unless explicitly requested
|
|
|
|
### Quick Decision Guide
|
|
|
|
```
|
|
┌─────────────────────────────────────────────────────────────────┐
|
|
│ Which command should I use? │
|
|
├─────────────────────────────────────────────────────────────────┤
|
|
│ │
|
|
│ Local development? │
|
|
│ └── YES → pnpm db:push (fast, no tracking) │
|
|
│ │
|
|
│ Staging/Production? │
|
|
│ └── YES → pnpm db:generate + pnpm db:migrate (tracked) │
|
|
│ │
|
|
│ Need to inspect data? │
|
|
│ └── YES → pnpm db:studio (opens Drizzle Studio) │
|
|
│ │
|
|
│ Schema changed by someone else? │
|
|
│ └── YES → git pull + pnpm db:push (local) │
|
|
│ git pull + pnpm db:migrate (staging/prod) │
|
|
│ │
|
|
└─────────────────────────────────────────────────────────────────┘
|
|
```
|
|
|
|
---
|
|
|
|
## Drizzle Migration Internals
|
|
|
|
Understanding how Drizzle manages migrations is essential for debugging issues.
|
|
|
|
### The Two Modes
|
|
|
|
#### 1. Push Mode (`drizzle-kit push`)
|
|
|
|
**How it works:**
|
|
1. Drizzle introspects your TypeScript schema files
|
|
2. Drizzle introspects the current database schema
|
|
3. Drizzle computes the diff between them
|
|
4. Drizzle generates and **immediately executes** the SQL to sync them
|
|
|
|
**Characteristics:**
|
|
- No migration files created
|
|
- No history tracking
|
|
- Direct database modification
|
|
- Interactive confirmation (use `--force` to skip)
|
|
|
|
**When to use:** Local development, experimentation, prototyping
|
|
|
|
#### 2. Generate + Migrate Mode (`drizzle-kit generate` + `migrate`)
|
|
|
|
**How it works:**
|
|
|
|
**Step 1: Generate** (`drizzle-kit generate`)
|
|
1. Drizzle introspects your TypeScript schema files
|
|
2. Drizzle reads the last snapshot from `migrations/meta/`
|
|
3. Drizzle computes the diff
|
|
4. Drizzle creates migration files (SQL + snapshot)
|
|
|
|
**Step 2: Migrate** (`pnpm db:migrate`)
|
|
1. Script reads `migrations/meta/_journal.json`
|
|
2. Script queries `__drizzle_migrations` table in database
|
|
3. Script determines which migrations haven't been applied
|
|
4. Script executes pending migrations in order
|
|
5. Script records applied migrations in `__drizzle_migrations`
|
|
|
|
**Characteristics:**
|
|
- Creates versioned SQL files
|
|
- Full history tracking
|
|
- Repeatable deployments
|
|
- Can be reviewed before applying
|
|
|
|
**When to use:** Staging, production, CI/CD pipelines
|
|
|
|
### Migration File Structure
|
|
|
|
```
|
|
src/db/migrations/
|
|
├── 0000_initial_schema/
|
|
│ ├── migration.sql # The actual SQL to execute
|
|
│ └── snapshot.json # Schema snapshot AFTER this migration
|
|
├── 0001_add_user_preferences/
|
|
│ ├── migration.sql
|
|
│ └── snapshot.json
|
|
├── 0002_add_credits_table/
|
|
│ ├── migration.sql
|
|
│ └── snapshot.json
|
|
└── meta/
|
|
└── _journal.json # Migration registry (order + metadata)
|
|
```
|
|
|
|
### The Journal File (`_journal.json`)
|
|
|
|
This file tracks all generated migrations:
|
|
|
|
```json
|
|
{
|
|
"version": "7",
|
|
"dialect": "postgresql",
|
|
"entries": [
|
|
{
|
|
"idx": 0,
|
|
"version": "7",
|
|
"when": 1733066521000,
|
|
"tag": "0000_initial_schema",
|
|
"breakpoints": true
|
|
},
|
|
{
|
|
"idx": 1,
|
|
"version": "7",
|
|
"when": 1733152921000,
|
|
"tag": "0001_add_user_preferences",
|
|
"breakpoints": true
|
|
}
|
|
]
|
|
}
|
|
```
|
|
|
|
**Key fields:**
|
|
- `idx`: Sequential index (order matters!)
|
|
- `tag`: Folder name containing the migration
|
|
- `when`: Unix timestamp when generated
|
|
- `breakpoints`: Whether to use statement breakpoints
|
|
|
|
### The Database Tracking Table (`__drizzle_migrations`)
|
|
|
|
Drizzle creates this table automatically to track applied migrations:
|
|
|
|
```sql
|
|
-- Schema: drizzle
|
|
-- Table: __drizzle_migrations
|
|
CREATE TABLE drizzle.__drizzle_migrations (
|
|
id SERIAL PRIMARY KEY,
|
|
hash TEXT NOT NULL,
|
|
created_at BIGINT NOT NULL
|
|
);
|
|
```
|
|
|
|
**Query applied migrations:**
|
|
```sql
|
|
SELECT * FROM drizzle.__drizzle_migrations ORDER BY created_at;
|
|
```
|
|
|
|
### How Migration Tracking Works
|
|
|
|
```
|
|
┌─────────────────┐ ┌─────────────────┐
|
|
│ _journal.json │ │ __drizzle_ │
|
|
│ (filesystem) │ │ migrations (db) │
|
|
└────────┬────────┘ └────────┬────────┘
|
|
│ │
|
|
▼ ▼
|
|
[0000, 0001, 0002] [hash_0000, hash_0001]
|
|
│ │
|
|
└───────────┬───────────┘
|
|
│
|
|
▼
|
|
Pending: [0002]
|
|
│
|
|
▼
|
|
Execute 0002/migration.sql
|
|
│
|
|
▼
|
|
Insert into __drizzle_migrations
|
|
```
|
|
|
|
### Snapshot Files
|
|
|
|
Each migration includes a `snapshot.json` that captures the **complete schema state** after that migration. This allows Drizzle to:
|
|
|
|
1. Compute diffs for the next migration
|
|
2. Detect schema drift
|
|
3. Generate accurate SQL
|
|
|
|
**Important:** Never modify snapshots manually!
|
|
|
|
---
|
|
|
|
## Migration Commands
|
|
|
|
### All Backends
|
|
|
|
```bash
|
|
# Development - push schema directly (fast, no history)
|
|
pnpm db:push
|
|
|
|
# Generate migration files from schema changes
|
|
pnpm db:generate
|
|
|
|
# Run migrations with advisory locks (production-safe)
|
|
pnpm db:migrate
|
|
|
|
# Open Drizzle Studio for database inspection
|
|
pnpm db:studio
|
|
```
|
|
|
|
### Root-Level Commands
|
|
|
|
```bash
|
|
# Setup all databases (creates DBs + pushes schemas)
|
|
pnpm setup:db
|
|
|
|
# Setup specific service
|
|
pnpm setup:db:auth
|
|
pnpm setup:db:chat
|
|
```
|
|
|
|
### Per-Service Commands
|
|
|
|
```bash
|
|
# mana-auth
|
|
pnpm --filter mana-auth db:push
|
|
pnpm --filter mana-auth db:generate
|
|
pnpm --filter mana-auth db:migrate
|
|
|
|
# chat-backend
|
|
pnpm --filter @chat/server db:push
|
|
pnpm --filter @chat/server db:migrate
|
|
```
|
|
|
|
---
|
|
|
|
## Development vs Production
|
|
|
|
### Development Workflow
|
|
|
|
For local development, use `db:push` for fast iteration:
|
|
|
|
```bash
|
|
# 1. Make schema changes in src/db/schema/*.ts
|
|
# 2. Push changes to local database
|
|
pnpm db:push
|
|
|
|
# Or use the full dev command which handles this automatically
|
|
pnpm dev:chat:full
|
|
```
|
|
|
|
**Why `push` for development?**
|
|
- Instant feedback on schema changes
|
|
- No migration file clutter during experimentation
|
|
- Automatically handled by `dev:*:full` commands
|
|
|
|
### Production Workflow
|
|
|
|
For staging/production, use migration files for trackability:
|
|
|
|
```bash
|
|
# 1. Make schema changes in src/db/schema/*.ts
|
|
|
|
# 2. Generate migration file
|
|
pnpm db:generate --name add_user_preferences
|
|
|
|
# 3. Review generated SQL
|
|
cat src/db/migrations/*/migration.sql
|
|
|
|
# 4. Commit migration files
|
|
git add src/db/migrations/
|
|
git commit -m "feat: add user preferences table"
|
|
|
|
# 5. CI/CD runs migrations automatically on deploy
|
|
```
|
|
|
|
**Why migrations for production?**
|
|
- Audit trail of all schema changes
|
|
- Repeatable deployments
|
|
- Rollback capability (with manual down migrations)
|
|
|
|
---
|
|
|
|
## CI/CD Pipeline
|
|
|
|
### Deployment Flow
|
|
|
|
```
|
|
┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
|
|
│ Build │───>│ Create DB │───>│ Migrate │───>│ Deploy │
|
|
│ Images │ │ (if new) │ │ Database │ │ Code │
|
|
└─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘
|
|
```
|
|
|
|
### Migration Step Features
|
|
|
|
1. **Retry logic** - 3 attempts with exponential backoff (10s, 20s, 30s)
|
|
2. **Timeout protection** - 5-minute timeout per migration
|
|
3. **Advisory locks** - Prevents concurrent migrations
|
|
4. **Graceful fallback** - Falls back to `db:push` if `db:migrate` unavailable
|
|
|
|
### Staging Deployment
|
|
|
|
Migrations run automatically after database creation:
|
|
|
|
```yaml
|
|
# .github/workflows/cd-staging.yml
|
|
- name: Run database migrations
|
|
run: |
|
|
docker compose exec -T mana-auth pnpm run db:migrate
|
|
```
|
|
|
|
### Production Deployment
|
|
|
|
Migrations run BEFORE deploying new code:
|
|
|
|
```yaml
|
|
# .github/workflows/cd-production.yml
|
|
- name: Run database migrations
|
|
run: |
|
|
docker compose run --rm mana-auth pnpm run db:migrate
|
|
|
|
- name: Deploy with zero-downtime
|
|
run: |
|
|
docker compose up -d
|
|
```
|
|
|
|
---
|
|
|
|
## Advisory Locks
|
|
|
|
Advisory locks prevent multiple instances from running migrations simultaneously.
|
|
|
|
### How It Works
|
|
|
|
```typescript
|
|
// services/mana-auth/src/db/migrate.ts
|
|
|
|
const MIGRATION_LOCK_ID = 987654321;
|
|
|
|
// Acquire lock before migration
|
|
await db.execute(sql`SELECT pg_try_advisory_lock(${LOCK_ID})`);
|
|
|
|
// Run migrations...
|
|
|
|
// Release lock after migration
|
|
await db.execute(sql`SELECT pg_advisory_unlock(${LOCK_ID})`);
|
|
```
|
|
|
|
### Lock Behavior
|
|
|
|
| Scenario | Behavior |
|
|
|----------|----------|
|
|
| Lock acquired | Migration runs immediately |
|
|
| Lock held by another process | Waits up to 5 minutes, then fails |
|
|
| Lock stuck | Manual release required (see Troubleshooting) |
|
|
|
|
### Lock IDs by Service
|
|
|
|
| Service | Lock ID |
|
|
|---------|---------|
|
|
| mana-auth | `987654321` |
|
|
| chat-backend | (to be assigned) |
|
|
| todo-backend | (to be assigned) |
|
|
|
|
### Migration Script Architecture
|
|
|
|
The production migration script (`src/db/migrate.ts`) is designed for safe, concurrent-safe deployments:
|
|
|
|
```
|
|
┌─────────────────────────────────────────────────────────────────┐
|
|
│ migrate.ts Execution Flow │
|
|
├─────────────────────────────────────────────────────────────────┤
|
|
│ │
|
|
│ 1. Load environment variables (.env) │
|
|
│ └── DATABASE_URL, MIGRATION_TIMEOUT │
|
|
│ │
|
|
│ 2. Create single-connection pool │
|
|
│ └── max: 1 (dedicated migration connection) │
|
|
│ │
|
|
│ 3. Test database connectivity (with retry) │
|
|
│ └── SELECT 1 (max 3 attempts, exponential backoff) │
|
|
│ │
|
|
│ 4. Acquire advisory lock │
|
|
│ ├── pg_try_advisory_lock() - non-blocking attempt │
|
|
│ └── If busy: poll every 5s until timeout (default: 5 min) │
|
|
│ │
|
|
│ 5. Check for migration files │
|
|
│ └── If meta/_journal.json missing: exit gracefully │
|
|
│ │
|
|
│ 6. Run Drizzle migrations │
|
|
│ └── migrate(db, { migrationsFolder }) │
|
|
│ │
|
|
│ 7. Cleanup (always runs, even on error) │
|
|
│ ├── Release advisory lock │
|
|
│ └── Close database connection │
|
|
│ │
|
|
└─────────────────────────────────────────────────────────────────┘
|
|
```
|
|
|
|
**Key Components:**
|
|
|
|
| Component | Purpose | Configuration |
|
|
|-----------|---------|---------------|
|
|
| `withRetry()` | Retry transient errors (network, connection) | 3 attempts, exponential backoff |
|
|
| `acquireLock()` | Non-blocking lock attempt | `pg_try_advisory_lock()` |
|
|
| `waitForLock()` | Polling wait for lock | 5s intervals, configurable timeout |
|
|
| `releaseLock()` | Release lock in finally block | Always runs |
|
|
|
|
**Error Handling:**
|
|
|
|
```typescript
|
|
// Transient errors (will retry):
|
|
- ECONNREFUSED, ETIMEDOUT, ENOTFOUND
|
|
- Connection errors
|
|
- PostgreSQL 57P03 (cannot connect now)
|
|
|
|
// Non-transient errors (immediate failure):
|
|
- Missing DATABASE_URL
|
|
- SQL syntax errors
|
|
- Schema conflicts
|
|
- Lock timeout
|
|
```
|
|
|
|
**Exit Codes:**
|
|
|
|
| Code | Meaning |
|
|
|------|---------|
|
|
| 0 | Success - all migrations applied |
|
|
| 1 | Failure - check logs for details |
|
|
|
|
---
|
|
|
|
## Zero-Downtime Migrations
|
|
|
|
For breaking schema changes, use the **expand-contract pattern**:
|
|
|
|
### Phase 1: Expand
|
|
|
|
Add new schema elements alongside existing ones:
|
|
|
|
```sql
|
|
-- Migration: 001_add_full_name.sql
|
|
ALTER TABLE users ADD COLUMN full_name TEXT;
|
|
```
|
|
|
|
### Phase 2: Migrate
|
|
|
|
Update application to write to both, backfill data:
|
|
|
|
```typescript
|
|
// Application code - dual write
|
|
await db.update(users).set({
|
|
name: newName, // Old column
|
|
fullName: newName, // New column
|
|
});
|
|
|
|
// Backfill script
|
|
UPDATE users SET full_name = name WHERE full_name IS NULL;
|
|
```
|
|
|
|
### Phase 3: Contract
|
|
|
|
After 1-2 weeks, remove old column:
|
|
|
|
```sql
|
|
-- Migration: 002_drop_name_column.sql
|
|
ALTER TABLE users DROP COLUMN name;
|
|
```
|
|
|
|
### Common Patterns
|
|
|
|
| Change Type | Approach |
|
|
|-------------|----------|
|
|
| Add column | Direct `ALTER TABLE ADD COLUMN` |
|
|
| Drop column | Remove from code first, wait 2 weeks, then drop |
|
|
| Rename column | Add new → dual-write → backfill → drop old |
|
|
| Change type | Add new column → backfill with cast → swap |
|
|
| Add NOT NULL | Add nullable → backfill → add constraint |
|
|
|
|
### Index Creation
|
|
|
|
Always use `CONCURRENTLY` to avoid table locks:
|
|
|
|
```sql
|
|
-- Good
|
|
CREATE INDEX CONCURRENTLY idx_users_email ON users(email);
|
|
|
|
-- Bad (locks table)
|
|
CREATE INDEX idx_users_email ON users(email);
|
|
```
|
|
|
|
---
|
|
|
|
## Rollback Procedures
|
|
|
|
### Automatic Rollback (Not Supported)
|
|
|
|
Drizzle ORM does not support automatic rollbacks. Plan your migrations carefully.
|
|
|
|
### Manual Rollback
|
|
|
|
1. **Write down migration scripts** alongside up migrations:
|
|
|
|
```
|
|
src/db/migrations/
|
|
├── 001_add_feature.up.sql
|
|
├── 001_add_feature.down.sql # Manual rollback script
|
|
```
|
|
|
|
2. **Execute rollback manually**:
|
|
|
|
```bash
|
|
# Connect to database
|
|
docker compose exec -T postgres psql -U postgres -d mana_platform
|
|
|
|
# Run down migration
|
|
\i /path/to/001_add_feature.down.sql
|
|
```
|
|
|
|
### Rollback Checklist
|
|
|
|
- [ ] Identify affected migration
|
|
- [ ] Verify rollback script exists and is tested
|
|
- [ ] Create database backup before rollback
|
|
- [ ] Execute rollback in staging first
|
|
- [ ] Monitor for issues after rollback
|
|
- [ ] Update application code if needed
|
|
|
|
---
|
|
|
|
## Troubleshooting
|
|
|
|
### Migration Lock Stuck
|
|
|
|
If a migration lock is stuck (process crashed without releasing):
|
|
|
|
```sql
|
|
-- Check for stuck locks
|
|
SELECT * FROM pg_locks WHERE locktype = 'advisory';
|
|
|
|
-- Release specific lock (replace LOCK_ID)
|
|
SELECT pg_advisory_unlock(987654321);
|
|
|
|
-- Release all advisory locks for current session
|
|
SELECT pg_advisory_unlock_all();
|
|
```
|
|
|
|
### Migration Timeout
|
|
|
|
If migrations time out:
|
|
|
|
1. Check for long-running queries: `SELECT * FROM pg_stat_activity;`
|
|
2. Increase timeout: `MIGRATION_TIMEOUT=600 pnpm db:migrate`
|
|
3. Break large migrations into smaller steps
|
|
|
|
### Schema Drift
|
|
|
|
If staging/production schema differs from expected:
|
|
|
|
```bash
|
|
# Generate migration from current schema
|
|
pnpm db:generate --name sync_schema
|
|
|
|
# Review and apply
|
|
pnpm db:migrate
|
|
```
|
|
|
|
### Connection Issues
|
|
|
|
```bash
|
|
# Test database connectivity
|
|
docker compose exec -T postgres pg_isready -U postgres
|
|
|
|
# Check environment variables
|
|
echo $DATABASE_URL
|
|
|
|
# Manual connection test
|
|
docker compose exec -T postgres psql -U postgres -d mana_platform -c "SELECT 1"
|
|
```
|
|
|
|
### Migration Fails in CI/CD
|
|
|
|
1. Check GitHub Actions logs for specific error
|
|
2. Verify DATABASE_URL is correctly set in secrets
|
|
3. Ensure database exists before migration runs
|
|
4. Check if another migration is running (advisory lock)
|
|
|
|
---
|
|
|
|
## Best Practices
|
|
|
|
### DO
|
|
|
|
- Run migrations before deploying new code
|
|
- Test migrations in staging before production
|
|
- Use `CONCURRENTLY` for index creation
|
|
- Keep migrations small and focused
|
|
- Commit migration files to version control
|
|
- Wait 1-2 weeks before dropping columns
|
|
|
|
### DON'T
|
|
|
|
- Run `db:push` in production
|
|
- Delete migration files after they've been applied
|
|
- Modify migration files after they've been applied
|
|
- Add NOT NULL without default or backfill
|
|
- Create indexes without `CONCURRENTLY`
|
|
- Drop columns immediately after removing from code
|
|
|
|
---
|
|
|
|
## Migration File Structure
|
|
|
|
```
|
|
services/mana-auth/
|
|
├── src/db/
|
|
│ ├── schema/
|
|
│ │ ├── index.ts # Export all schemas
|
|
│ │ ├── auth.schema.ts # User, session tables
|
|
│ │ └── credits.schema.ts # Credit system tables
|
|
│ ├── migrations/
|
|
│ │ ├── 0001_initial/
|
|
│ │ │ ├── snapshot.json
|
|
│ │ │ └── migration.sql
|
|
│ │ └── meta/
|
|
│ │ └── _journal.json # Migration history
|
|
│ ├── connection.ts # Database connection
|
|
│ └── migrate.ts # Migration script with locks
|
|
└── drizzle.config.ts # Drizzle configuration
|
|
```
|
|
|
|
---
|
|
|
|
## Environment Variables
|
|
|
|
| Variable | Description | Default |
|
|
|----------|-------------|---------|
|
|
| `DATABASE_URL` | PostgreSQL connection string | Required |
|
|
| `MIGRATION_TIMEOUT` | Max seconds for migration | `300` |
|
|
|
|
---
|
|
|
|
## References
|
|
|
|
- [Drizzle ORM Migrations](https://orm.drizzle.team/docs/migrations)
|
|
- [PostgreSQL Advisory Locks](https://www.postgresql.org/docs/current/explicit-locking.html#ADVISORY-LOCKS)
|
|
- [Expand-Contract Pattern](https://martinfowler.com/bliki/ParallelChange.html)
|
|
- [Zero-Downtime PostgreSQL Migrations](https://postgres.ai/blog/20210923-zero-downtime-postgres-schema-migrations-lock-timeout-and-retries)
|