mirror of
https://github.com/Memo-2023/mana-monorepo.git
synced 2026-05-14 19:01:08 +02:00
Two CD-pipeline ergonomics fixes that surfaced during the 2026-04-28
schema-drift sweep.
(C) Auto-apply additive Drizzle migrations
========================================
8 services use Drizzle (mana-auth/-credits/-events/-research/-mail/
-subscriptions/-user/-analytics) but the CD pipeline never ran their
`db:push` script, so 4 schema additions stayed undeployed for days
(auth.users.kind, credits.{sync_subscriptions,reservations},
event_discovery.*) until live PostgresErrors surfaced them.
New `scripts/mac-mini/safe-db-push.sh`:
- Uses `drizzle-kit generate` to write a probe SQL file (does NOT
apply yet).
- Greps the generated SQL for destructive patterns (DROP TABLE/
COLUMN/TYPE/SCHEMA/INDEX, ALTER COLUMN ... TYPE, RENAME).
- Refuses to auto-apply if any are found — operator must review and
run `pnpm db:push --force` manually after pg_dump.
- Otherwise applies via `drizzle-kit push --force` and cleans up the
probe artifacts.
CD step "Apply schema migrations" runs between build and container
restart, sourcing each changed service's DATABASE_URL from compose
config (with @postgres → @localhost rewrite for the host runner).
Failure aborts deploy before the new container starts — the old
container keeps running with the old schema, which matches.
(D) Build-time RAM headroom
========================================
mana-web's Vite build needs 8 GiB of Node heap; Colima's VM is sized
at 12 GiB; ~3.5 GiB of other containers run during deploy. The 2026-
04-28 mana-web deploy OOM'd at the Vite step ("cannot allocate
memory") and only succeeded on retry once concurrent traffic settled.
New `scripts/mac-mini/build-memory-headroom.sh`:
- `start`: stops every container matching `^mana-mon-` (the
observability stack — VictoriaMetrics, Loki, Glitchtip, cAdvisor,
umami, blackbox, exporters). Frees ~700 MiB.
- `stop`: restores them from the snapshot list captured at start.
- `wrap <cmd>`: pause + run + always-resume via trap.
CD wraps the build loop with start/stop, but only when mana-web is in
the change set — other services build well below 4 GiB and don't
need the headroom. The monitoring stack resumes before the migration
step so cAdvisor + exporters are back online for the deploy-metrics
collection.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
124 lines
4.6 KiB
Bash
Executable file
124 lines
4.6 KiB
Bash
Executable file
#!/usr/bin/env bash
|
|
#
|
|
# Safely apply Drizzle schema changes to the prod Postgres for a single
|
|
# service.
|
|
#
|
|
# Why this exists: every Drizzle-using service has a `db:push` script
|
|
# but the CD pipeline never ran it, so schema changes drifted silently
|
|
# between the typed Drizzle definition and the live database. Today's
|
|
# audit found four such drifts (auth.users.kind, credits.sync_subscriptions,
|
|
# credits.reservations, event_discovery.*) — all additive, all easily
|
|
# applied once detected, but they should never have stayed undetected.
|
|
#
|
|
# How it works:
|
|
# 1. `drizzle-kit generate` produces a SQL diff file under the
|
|
# service's `drizzle/` dir without applying it.
|
|
# 2. We grep the generated SQL for destructive patterns. If any are
|
|
# found, we ABORT and refuse to apply — the operator must review
|
|
# and run `drizzle-kit push --force` manually.
|
|
# 3. If only additive changes are present, we run `drizzle-kit push
|
|
# --force` to apply them. Then delete the generated marker file
|
|
# so it doesn't pile up in the repo.
|
|
#
|
|
# Destructive patterns we refuse to auto-apply:
|
|
# - DROP TABLE / DROP COLUMN / DROP TYPE / DROP SCHEMA / DROP INDEX
|
|
# - ALTER COLUMN ... TYPE (change column type — usually data-loss)
|
|
# - RENAME COLUMN / RENAME TABLE (data still there, but breaking
|
|
# change for any caller pinned to the old name)
|
|
#
|
|
# Usage: scripts/mac-mini/safe-db-push.sh <service-name>
|
|
# Env requirements:
|
|
# - DATABASE_URL: connection string to apply migrations against
|
|
# - PROJECT_DIR : repo root (the deploy workflow sets this)
|
|
|
|
set -euo pipefail
|
|
|
|
SVC="${1:?usage: $0 <service-name>}"
|
|
PROJECT_DIR="${PROJECT_DIR:-$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)}"
|
|
SVC_DIR="$PROJECT_DIR/services/$SVC"
|
|
|
|
if [ ! -d "$SVC_DIR" ]; then
|
|
echo "[safe-db-push] $SVC: no service dir at $SVC_DIR — skipping"
|
|
exit 0
|
|
fi
|
|
if [ ! -f "$SVC_DIR/drizzle.config.ts" ] && [ ! -f "$SVC_DIR/drizzle.config.js" ]; then
|
|
echo "[safe-db-push] $SVC: no drizzle config — skipping"
|
|
exit 0
|
|
fi
|
|
if ! grep -q '"db:push"' "$SVC_DIR/package.json" 2>/dev/null; then
|
|
echo "[safe-db-push] $SVC: no db:push script — skipping"
|
|
exit 0
|
|
fi
|
|
|
|
if [ -z "${DATABASE_URL:-}" ]; then
|
|
echo "[safe-db-push] $SVC: DATABASE_URL not set — skipping"
|
|
exit 0
|
|
fi
|
|
|
|
cd "$SVC_DIR"
|
|
|
|
# Snapshot the existing migration set before we generate. Anything new
|
|
# afterwards is the diff this push would apply.
|
|
PRE_GEN_FILES=$(find drizzle -maxdepth 2 -name '*.sql' 2>/dev/null | sort || true)
|
|
|
|
# Generate-only — does not touch the database.
|
|
echo "[safe-db-push] $SVC: generating diff…"
|
|
GEN_OUT=$(pnpm exec drizzle-kit generate --name "__ci_safety_check_$$" 2>&1 || true)
|
|
echo "$GEN_OUT" | tail -20
|
|
|
|
POST_GEN_FILES=$(find drizzle -maxdepth 2 -name '*.sql' 2>/dev/null | sort || true)
|
|
|
|
# New SQL files = the diff
|
|
NEW_SQL=$(comm -13 <(echo "$PRE_GEN_FILES") <(echo "$POST_GEN_FILES") | grep -v '^$' || true)
|
|
|
|
if [ -z "$NEW_SQL" ]; then
|
|
echo "[safe-db-push] $SVC: no schema changes — clean."
|
|
exit 0
|
|
fi
|
|
|
|
echo "[safe-db-push] $SVC: schema diff detected:"
|
|
echo "$NEW_SQL"
|
|
|
|
# Trap so we always remove the generated probe files, even on failure.
|
|
cleanup() {
|
|
for f in $NEW_SQL; do
|
|
rm -f "$f"
|
|
done
|
|
# drizzle-kit also writes a meta entry; remove the most recent one.
|
|
if [ -f drizzle/meta/_journal.json ]; then
|
|
# Best-effort cleanup — strip the entry that references our probe tag.
|
|
# If jq isn't available, leave it; the next legitimate `db:push` will
|
|
# overwrite anyway.
|
|
if command -v jq >/dev/null 2>&1; then
|
|
tmp=$(mktemp)
|
|
jq '.entries |= map(select(.tag | test("__ci_safety_check") | not))' \
|
|
drizzle/meta/_journal.json > "$tmp" && mv "$tmp" drizzle/meta/_journal.json || true
|
|
fi
|
|
fi
|
|
}
|
|
trap cleanup EXIT
|
|
|
|
# Refuse to auto-apply destructive changes. The operator must review
|
|
# and either fix the schema (if the diff was unintentional) or run
|
|
# `drizzle-kit push --force` manually after taking a fresh pg_dump.
|
|
DESTRUCTIVE_PATTERN='DROP[[:space:]]+(TABLE|COLUMN|TYPE|SCHEMA|INDEX|VIEW|FUNCTION)|ALTER[[:space:]]+TABLE.*ALTER[[:space:]]+COLUMN.*TYPE|RENAME[[:space:]]+(COLUMN|TABLE|TO)'
|
|
|
|
DESTRUCTIVE_HITS=""
|
|
for sql in $NEW_SQL; do
|
|
hits=$(grep -niE "$DESTRUCTIVE_PATTERN" "$sql" || true)
|
|
if [ -n "$hits" ]; then
|
|
DESTRUCTIVE_HITS="$DESTRUCTIVE_HITS\n=== $sql ===\n$hits"
|
|
fi
|
|
done
|
|
|
|
if [ -n "$DESTRUCTIVE_HITS" ]; then
|
|
echo "[safe-db-push] $SVC: ✗ DESTRUCTIVE changes detected — refusing to auto-apply"
|
|
echo " Review the diff and run \`pnpm db:push --force\` manually after backup."
|
|
echo -e "$DESTRUCTIVE_HITS"
|
|
exit 1
|
|
fi
|
|
|
|
# Additive only — safe to apply.
|
|
echo "[safe-db-push] $SVC: ✓ additive only, applying…"
|
|
pnpm exec drizzle-kit push --force
|
|
echo "[safe-db-push] $SVC: ✓ schema is now in sync"
|