managarten/services/mana-auth/sql/003_recovery_wrap.sql
Till JS f46d1328d8 feat(mana-auth): phase 9 milestone 2 — vault recovery wrap + zero-knowledge
Server-side support for the Phase 9 zero-knowledge opt-in. Adds the
recovery-wrap columns + four new vault operations + the routes that
expose them.

Schema (sql/003_recovery_wrap.sql)
----------------------------------
Adds to auth.encryption_vaults:

  - recovery_wrapped_mk    text                  (NULL until set)
  - recovery_iv            text                  (NULL until set)
  - recovery_format_version smallint NOT NULL DEFAULT 1
  - recovery_set_at        timestamptz
  - zero_knowledge         boolean NOT NULL DEFAULT false

Drops NOT NULL from wrapped_mk + wrap_iv (a vault in zero-knowledge
mode has no server-side wrap at all).

Three CHECK constraints enforce the invariant at the DB level so no
service bug can leave a vault in an inconsistent state:

  - encryption_vaults_has_wrap         — at least one of (wrapped_mk,
                                          recovery_wrapped_mk) is set
  - encryption_vaults_wrap_iv_pair     — ciphertext + IV are paired
                                          (both NULL or both set) on
                                          each wrap form
  - encryption_vaults_zk_consistency   — zero_knowledge=true implies
                                          wrapped_mk IS NULL AND
                                          recovery_wrapped_mk IS NOT NULL

If a code-level bug ever tried to enable ZK without a recovery wrap,
or to leave both wraps empty, Postgres would reject the UPDATE.
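
For illustration, the three constraints can be mirrored as an
application-level predicate. This is a minimal TypeScript sketch, not
code from this commit: VaultRow and violatedConstraints are
hypothetical names, and the fields are assumed to follow the Drizzle
column names.

```typescript
// Hypothetical row shape; field names are assumed to match the Drizzle columns.
interface VaultRow {
  wrappedMk: string | null;
  wrapIv: string | null;
  recoveryWrappedMk: string | null;
  recoveryIv: string | null;
  zeroKnowledge: boolean;
}

// Returns the names of the CHECK constraints the row would violate
// (an empty array means Postgres would accept the row).
function violatedConstraints(row: VaultRow): string[] {
  const violations: string[] = [];

  // encryption_vaults_has_wrap: at least one wrap form must be present.
  if (row.wrappedMk === null && row.recoveryWrappedMk === null) {
    violations.push("encryption_vaults_has_wrap");
  }

  // encryption_vaults_wrap_iv_pair: ciphertext and IV are NULL or set together.
  const serverPaired = (row.wrappedMk === null) === (row.wrapIv === null);
  const recoveryPaired =
    (row.recoveryWrappedMk === null) === (row.recoveryIv === null);
  if (!serverPaired || !recoveryPaired) {
    violations.push("encryption_vaults_wrap_iv_pair");
  }

  // encryption_vaults_zk_consistency: ZK mode means no server wrap and a
  // recovery wrap that exists.
  if (
    row.zeroKnowledge &&
    (row.wrappedMk !== null || row.recoveryWrappedMk === null)
  ) {
    violations.push("encryption_vaults_zk_consistency");
  }

  return violations;
}
```

A check like this could run in unit tests to catch bad writes before
they reach the database; the CHECK constraints remain the source of
truth.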

Drizzle schema (db/schema/encryption-vaults.ts)
-----------------------------------------------
Mirrors the migration: wrappedMk + wrapIv become nullable, and the five
new columns are added with matching defaults. An inline doc comment
explains the zero-knowledge fork.

Service (services/encryption-vault/index.ts)
--------------------------------------------
VaultFetchResult gains optional `requiresRecoveryCode` /
`recoveryWrappedMk` / `recoveryIv` so the route handler can serialize
the right shape. masterKey becomes Uint8Array | null (null in ZK mode).

Existing methods updated:
  - init: branches on row.zeroKnowledge — returns the recovery blob
    instead of an unwrapped MK if the user is already in ZK mode
  - getMasterKey: same fork, with audit context "zk-recovery-blob"
  - rotate: throws ZeroKnowledgeRotateForbidden in ZK mode (the server
    can't re-wrap a key it can't read). Also wipes any stale recovery
    wrap on rotation — the new MK has nothing to do with the old one,
    so the old recovery code would unwrap into garbage.

New methods:
  - setRecoveryWrap(userId, { recoveryWrappedMk, recoveryIv }, ctx)
    Stores (or replaces) the user's recovery wrap. Idempotent.
  - clearRecoveryWrap(userId, ctx)
    Removes the recovery wrap. Forbidden if ZK is active (would lock
    the user out) — throws ZeroKnowledgeActiveError → 409.
  - enableZeroKnowledge(userId, ctx)
    NULLs out wrapped_mk + wrap_iv, sets zero_knowledge=true. Requires
    a recovery wrap to already be present — throws
    RecoveryWrapMissingError → 400 otherwise. Idempotent on already-on.
  - disableZeroKnowledge(userId, mkBytes, ctx)
    Inverse: takes a freshly-unwrapped MK from the client, KEK-wraps
    it, stores as wrapped_mk, flips zero_knowledge=false. The client
    is the only entity that can supply the MK at this point, since
    the server can't decrypt the recovery wrap.
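
The state transitions behind the four new methods can be sketched as
pure functions over an in-memory row. This is an illustrative
TypeScript model under stated assumptions, not the service code:
VaultState and the kekWrap callback are hypothetical, the real methods
also take userId and an audit ctx, and persistence is omitted.

```typescript
// Hypothetical in-memory stand-in for one auth.encryption_vaults row.
interface VaultState {
  wrappedMk: string | null;
  wrapIv: string | null;
  recoveryWrappedMk: string | null;
  recoveryIv: string | null;
  zeroKnowledge: boolean;
}

class RecoveryWrapMissingError extends Error {
  readonly code = "RECOVERY_WRAP_MISSING"; // surfaced as HTTP 400
}

class ZeroKnowledgeActiveError extends Error {
  readonly code = "ZK_ACTIVE"; // surfaced as HTTP 409
}

// Stores or replaces the recovery wrap; calling it twice with the same
// input yields the same state.
function setRecoveryWrap(
  v: VaultState,
  wrap: { recoveryWrappedMk: string; recoveryIv: string },
): VaultState {
  return { ...v, recoveryWrappedMk: wrap.recoveryWrappedMk, recoveryIv: wrap.recoveryIv };
}

// Refuses to drop the recovery wrap while it is the only wrap left.
function clearRecoveryWrap(v: VaultState): VaultState {
  if (v.zeroKnowledge) {
    throw new ZeroKnowledgeActiveError("disable zero-knowledge first");
  }
  return { ...v, recoveryWrappedMk: null, recoveryIv: null };
}

// Drops the server-side wrap; a no-op if ZK is already on.
function enableZeroKnowledge(v: VaultState): VaultState {
  if (v.zeroKnowledge) return v;
  if (v.recoveryWrappedMk === null) {
    throw new RecoveryWrapMissingError("set a recovery wrap first");
  }
  return { ...v, wrappedMk: null, wrapIv: null, zeroKnowledge: true };
}

// kekWrap is a placeholder for the server's KEK wrapping step.
function disableZeroKnowledge(
  v: VaultState,
  mkBytes: Uint8Array,
  kekWrap: (mk: Uint8Array) => { wrappedMk: string; wrapIv: string },
): VaultState {
  const wrapped = kekWrap(mkBytes);
  return { ...v, wrappedMk: wrapped.wrappedMk, wrapIv: wrapped.wrapIv, zeroKnowledge: false };
}
```

Each function returns a state that satisfies the three CHECK
constraints, provided the input state did.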

Three new error classes:
  - RecoveryWrapMissingError → 400 RECOVERY_WRAP_MISSING
  - ZeroKnowledgeActiveError → 409 ZK_ACTIVE
  - ZeroKnowledgeRotateForbidden → 409 ZK_ROTATE_FORBIDDEN

Audit action union extended with:
  - 'recovery_set' | 'recovery_clear' | 'zk_enable' | 'zk_disable'

Routes (routes/encryption-vault.ts)
-----------------------------------
GET /key + POST /init now share a serializeFetchResult helper that
returns either:
  - { masterKey, formatVersion, kekId }                 (standard)
  - { requiresRecoveryCode: true, recoveryWrappedMk,    (ZK mode)
      recoveryIv, formatVersion }
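
A rough sketch of how such a helper could branch between the two
shapes. The field types, the nullable kekId, and the injected encode
callback (standing in for whatever base64 encoder the route uses) are
assumptions made for illustration; only the field names come from
this commit.

```typescript
// Assumed shape of the service result; exact types are not given in the commit.
interface VaultFetchResult {
  masterKey: Uint8Array | null; // null in ZK mode
  formatVersion: number;
  kekId: string | null;
  requiresRecoveryCode?: boolean;
  recoveryWrappedMk?: string;
  recoveryIv?: string;
}

type VaultKeyResponse =
  | { masterKey: string; formatVersion: number; kekId: string | null }
  | {
      requiresRecoveryCode: true;
      recoveryWrappedMk: string;
      recoveryIv: string;
      formatVersion: number;
    };

// Picks the response shape from the service result. `encode` is a
// hypothetical bytes-to-string encoder supplied by the caller.
function serializeFetchResult(
  result: VaultFetchResult,
  encode: (bytes: Uint8Array) => string,
): VaultKeyResponse {
  if (result.requiresRecoveryCode) {
    // ZK mode: hand back the opaque recovery blob, never a master key.
    if (result.recoveryWrappedMk === undefined || result.recoveryIv === undefined) {
      throw new Error("ZK result is missing its recovery blob");
    }
    return {
      requiresRecoveryCode: true,
      recoveryWrappedMk: result.recoveryWrappedMk,
      recoveryIv: result.recoveryIv,
      formatVersion: result.formatVersion,
    };
  }

  // Standard mode: the server unwrapped the MK and returns it encoded.
  if (result.masterKey === null) {
    throw new Error("standard result is missing the master key");
  }
  return {
    masterKey: encode(result.masterKey),
    formatVersion: result.formatVersion,
    kekId: result.kekId,
  };
}
```

Sharing one helper between GET /key and POST /init keeps the two
routes from drifting apart on which fields a ZK response may carry.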

Three new routes:
  - POST   /recovery-wrap   — body: { recoveryWrappedMk, recoveryIv }
                              Stores the wrap. Validates both fields
                              are non-empty strings.
  - DELETE /recovery-wrap   — Removes the wrap. 409 if ZK active.
  - POST   /zero-knowledge  — body: { enable: boolean, masterKey?: base64 }
                              enable=true:  flip on (no body MK needed)
                              enable=false: flip off (MK required)
                              Validates the MK decodes to exactly 32 bytes.
                              Wipes the bytes after handing them to the
                              service.
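
The 32-byte check and the wipe step might look roughly like this. A
sketch only: decodeBase64, parseMasterKey and withWipedKey are
hypothetical helpers, the decoder is hand-rolled to stay
dependency-free and is lenient about padding, and the wipe is shown
around a synchronous call for simplicity.

```typescript
const B64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Decodes standard base64; returns null on characters outside the alphabet
// or on an impossible length.
function decodeBase64(input: string): Uint8Array | null {
  const clean = input.replace(/=+$/, "");
  if (clean.length % 4 === 1) return null;
  const out = new Uint8Array(Math.floor((clean.length * 3) / 4));
  let buffer = 0;
  let bits = 0;
  let pos = 0;
  for (let i = 0; i < clean.length; i++) {
    const value = B64_ALPHABET.indexOf(clean.charAt(i));
    if (value < 0) return null;
    buffer = ((buffer << 6) | value) & 0xffff;
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      out[pos++] = (buffer >> bits) & 0xff;
    }
  }
  return out;
}

type MkParseResult =
  | { ok: true; masterKey: Uint8Array }
  | { ok: false; error: string };

// Accepts the request body's masterKey only if it decodes to exactly 32 bytes.
function parseMasterKey(masterKeyB64: unknown): MkParseResult {
  if (typeof masterKeyB64 !== "string" || masterKeyB64.length === 0) {
    return { ok: false, error: "masterKey is required when enable=false" };
  }
  const bytes = decodeBase64(masterKeyB64);
  if (bytes === null || bytes.length !== 32) {
    if (bytes !== null) bytes.fill(0); // do not leave partial key material around
    return { ok: false, error: "masterKey must decode to exactly 32 bytes" };
  }
  return { ok: true, masterKey: bytes };
}

// Hands the key to `use`, then zeroes the buffer whether or not `use` throws.
// With an async service call, the wipe would have to wait for the promise.
function withWipedKey<T>(masterKey: Uint8Array, use: (mk: Uint8Array) => T): T {
  try {
    return use(masterKey);
  } finally {
    masterKey.fill(0);
  }
}
```

Zeroing the Uint8Array limits how long the raw MK sits in the route
handler's memory; it cannot scrub copies the runtime may have made,
such as the original base64 string.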

POST /rotate now catches ZeroKnowledgeRotateForbidden → 409
ZK_ROTATE_FORBIDDEN so the client can show "disable zero-knowledge
first".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 22:05:49 +02:00


-- Migration: encryption_vaults recovery wrap + zero-knowledge mode
--
-- Phase 9 of the encryption rollout. Adds five new columns + makes
-- wrapped_mk and wrap_iv nullable so a user can opt into "true
-- zero-knowledge" mode where the server can no longer decrypt their data.
--
-- The opt-in flow is:
--   1. Client generates a 32-byte recovery secret (client-only)
--   2. Client wraps the existing master key with a recovery-derived key
--   3. Client posts the wrapped MK + IV to /me/encryption-vault/recovery-wrap
--   4. The server stores recovery_wrapped_mk + recovery_iv (both NULL until
--      the user sets a recovery wrap; the encryption_vaults_wrap_iv_pair
--      CHECK below keeps them NULL or set together)
--   5. Client posts /me/encryption-vault/zero-knowledge with `enable: true`.
--      The server NULLs out wrapped_mk + wrap_iv, sets zero_knowledge=true.
--      The server can no longer decrypt the user's data.
--   6. On the next unlock, GET /key returns the recovery_wrapped_mk blob
--      with `requiresRecoveryCode: true`. The client prompts the user for
--      the recovery code, derives the wrap key, unwraps locally.
--
-- The "disable" flow is the inverse: the client unwraps the MK locally and
-- posts it back; the server wraps it with the KEK, stores the result as
-- wrapped_mk + wrap_iv, and sets zero_knowledge=false.
--
-- Idempotent: re-running on a partially-migrated DB is safe.

-- ─── Add new columns ──────────────────────────────────────────
ALTER TABLE auth.encryption_vaults
  ADD COLUMN IF NOT EXISTS recovery_wrapped_mk TEXT,
  ADD COLUMN IF NOT EXISTS recovery_iv TEXT,
  ADD COLUMN IF NOT EXISTS recovery_format_version SMALLINT NOT NULL DEFAULT 1,
  ADD COLUMN IF NOT EXISTS recovery_set_at TIMESTAMPTZ,
  ADD COLUMN IF NOT EXISTS zero_knowledge BOOLEAN NOT NULL DEFAULT false;

-- ─── Make wrapped_mk + wrap_iv nullable ───────────────────────
-- These were NOT NULL in the Phase 2 migration. After Phase 9, a vault
-- in zero-knowledge mode has no server-side wrap at all, so both columns
-- have to allow NULL. Existing rows are unaffected (they have non-NULL
-- values; the constraint just relaxes).
ALTER TABLE auth.encryption_vaults
  ALTER COLUMN wrapped_mk DROP NOT NULL,
  ALTER COLUMN wrap_iv DROP NOT NULL;

-- ─── Sanity constraint ────────────────────────────────────────
-- A vault row must have AT LEAST one usable wrap form, otherwise the
-- user has lost access to their data and we should have rejected the
-- mutation that left the row in this state. The check enforces that
-- at least one of (wrapped_mk, recovery_wrapped_mk) is populated.
ALTER TABLE auth.encryption_vaults
  DROP CONSTRAINT IF EXISTS encryption_vaults_has_wrap;
ALTER TABLE auth.encryption_vaults
  ADD CONSTRAINT encryption_vaults_has_wrap
  CHECK (wrapped_mk IS NOT NULL OR recovery_wrapped_mk IS NOT NULL);

-- ─── Cross-field consistency ──────────────────────────────────
-- If recovery_wrapped_mk is set, recovery_iv must also be set.
-- If wrapped_mk is set, wrap_iv must also be set.
ALTER TABLE auth.encryption_vaults
  DROP CONSTRAINT IF EXISTS encryption_vaults_wrap_iv_pair;
ALTER TABLE auth.encryption_vaults
  ADD CONSTRAINT encryption_vaults_wrap_iv_pair
  CHECK (
    (wrapped_mk IS NULL) = (wrap_iv IS NULL)
    AND
    (recovery_wrapped_mk IS NULL) = (recovery_iv IS NULL)
  );

-- ─── Zero-knowledge implies the server wrap is gone ───────────
-- If a vault is in zero-knowledge mode, the KEK-wrapped MK MUST be
-- absent — otherwise the "server can no longer decrypt" promise is
-- a lie. The recovery wrap MUST be present, otherwise the user is
-- locked out.
ALTER TABLE auth.encryption_vaults
  DROP CONSTRAINT IF EXISTS encryption_vaults_zk_consistency;
ALTER TABLE auth.encryption_vaults
  ADD CONSTRAINT encryption_vaults_zk_consistency
  CHECK (
    (zero_knowledge = false)
    OR
    (zero_knowledge = true AND wrapped_mk IS NULL AND recovery_wrapped_mk IS NOT NULL)
  );