The F4 server-side singleton bootstrap was fire-and-forget at signup
time — a transient mana_sync outage during registration would leave the
user with no singleton and only the in-store `getOrCreateLocalDoc()`
fallback to race on the first write. The signup-hook is still the
happy-path zero-latency bootstrap; this commit adds a deliberate
reconciliation path that converges on every boot.
- Idempotent `bootstrapUserSingletons` / `bootstrapSpaceSingletons`:
both functions now existence-check sync_changes before INSERT and
return boolean (true=inserted, false=skipped).
- New endpoint `POST /api/v1/me/bootstrap-singletons` — JWT-gated under
the existing `/api/v1/me/*` prefix. Provisions the caller's
userContext and the kontextDoc for every Space they're a member of.
Returns `{ ok, bootstrapped: { userContext, spaces: { id: bool } } }`.
- Webapp `(app)/+layout.svelte` calls the endpoint once per
authenticated boot, after `restoreClientIdFromDexie()` and before
`createUnifiedSync.startAll()`. Best-effort; failures swallow into a
console warning and the in-store fallback still covers the rare
race window.
Plan: docs/plans/sync-field-meta-overhaul.md (F4-robust row).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Symmetrically extends the F4 server-side singleton bootstrap to the
per-Space `kontextDoc`. Every Space-creation — Personal at signup and
brand/club/family/team/practice via the org plugin — now writes an empty
kontextDoc row straight into mana_sync.sync_changes with origin='system',
client_id='system:bootstrap'. Fresh clients pull the row instead of
racing on a local insert that the next pull would clobber.
- New `bootstrapSpaceSingletons(spaceId, ownerUserId, syncSql)` in
services/mana-auth/src/services/bootstrap-singletons.ts; shared
`buildFieldMeta` helper extracted.
- `createBetterAuth(databaseUrl, syncDatabaseUrl, webauthn)` now takes
the sync-DB URL and lazy-creates a module-scoped postgres pool for
the bootstrap inserts.
- Hook into `databaseHooks.user.create.after` (only on `created: true`
from createPersonalSpaceFor) and `organizationHooks.afterCreateOrganization`.
- Webapp `kontextStore.ensureDoc()` made private as `getOrCreateLocalDoc()` —
same fallback role as userContextStore's after F5. Public API is now just
setContent + appendContent.
Plan: docs/plans/sync-field-meta-overhaul.md (F4-fu row in Shipping Log).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the userContext race-on-first-mount that surfaced as a
"10 fields overwritten" conflict toast pre-F2. Adds a fire-and-forget
hook in the /register flow that writes the per-user `userContext`
singleton straight into `mana_sync.sync_changes` with
`client_id='system:bootstrap'` and `origin='system'`.
Behavior:
- On successful `signUpEmail`, `bootstrapUserSingletons(userId, syncSql)`
inserts a `profile/userContext` row with the empty-default shape that
mirrors the webapp's `emptyUserContext()` factory in
`apps/mana/apps/web/src/lib/modules/profile/types.ts`.
- The receiving client treats the change as origin='server-replay'
on apply (per F2 conflict-gate), so no toasts on first pull.
- Failure is logged but does not abort registration — the webapp's
existing `ensureDoc()` fallback still works during the F4→F5
transition.
Module-scoped postgres pool (max=2 connections) lazy-initialized on
first signUp; reused for the lifetime of the process. Same pattern as
`UserDataService.getSyncSql`.
Out of scope for F4:
- `kontextDoc` is per-Space (not per-user) — bootstrap there will be
hooked into the Space-creation flow, not /register. The webapp's
`ensureDoc()` for kontextDoc stays as-is for now.
- Webapp `ensureDoc()` removal is F5.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two interlocking fixes driven by a production lockout incident.
## Bug that motivated this
A fresh schema-drift column (auth.users.onboarding_completed_at) made
every Better Auth query crash with Postgres 42703. The /login wrapper
swallowed the non-2xx and mapped it onto a generic "401 Invalid
credentials" AND bumped the password lockout counter — so 5 legit
login attempts against a broken DB would have locked every real user
out of their own account. Same wrapper pattern on /register, /refresh,
/reset-password etc. The 30-minute hunt ended in a one-off repro
script that finally surfaced the real Postgres error.
The user-facing passkey button additionally returned generic 404s on
every login-page mount because the route wasn't registered (the DB
schema existed, the Better Auth plugin wasn't wired).
## Phase 1 — Error classification (services/mana-auth/src/lib/auth-errors)
- 19-code AuthErrorCode taxonomy (INVALID_CREDENTIALS, EMAIL_NOT_VERIFIED,
ACCOUNT_LOCKED, SERVICE_UNAVAILABLE, PASSKEY_VERIFICATION_FAILED, …)
- classifyFromResponse/classifyFromError handle: Better Auth APIError
(duck-typed on `name === 'APIError'`), Postgres errors (23505 unique,
42703/08xxx → infra), ZodError, fetch/ECONNREFUSED network errors,
bare Error, unknown.
- respondWithError routes the structured response, logs at the right
level, fires the correct security event, and CRITICALLY only bumps
the lockout counter for actual credential failures — SERVICE_UNAVAILABLE
and INTERNAL never touch lockout.
- All 12 endpoints in routes/auth.ts refactored (/login, /register,
/logout, /session-to-token, /refresh, /validate, /forgot-password,
/reset-password, /resend-verification, /profile GET+POST,
/change-email, /change-password, /account DELETE).
- Fixed pre-existing auth.api.forgetPassword typo (→ requestPasswordReset).
- shared-logger + requestLogger middleware wired in index.ts; all
console.* calls in the service removed.
## Phase 2 — Passkey end-to-end (@better-auth/passkey 1.6+)
- sql/007_passkey_bootstrap.sql: idempotent schema alignment —
friendly_name→name, +aaguid, transports jsonb→text, +method column
on login_attempts.
- better-auth.config.ts: passkey plugin wired with rpID/rpName/origin
from new webauthn config section. rpID defaults to mana.how in prod
(from COOKIE_DOMAIN), localhost in dev.
- routes/passkeys.ts: 7 wrapper endpoints (capability probe,
register/options+verify, authenticate/options+verify with JWT mint,
list, delete, rename). Each routes errors through the classifier;
authenticate/verify promotes generic INVALID_CREDENTIALS to
PASSKEY_VERIFICATION_FAILED.
- PasskeyRateLimitService: in-memory per-IP (options: 20/min) and
per-credential (verify: 10 failures/min → 5 min cooldown) buckets.
Deliberately separate from the password lockout — different factor,
different blast radius.
- Client: authService.getPasskeyCapability() async probe, memoised per
session. authStore.passkeyAvailable reactive state. LoginPage gates
on === true so a slow probe doesn't flash the button in.
- AuthResult grew a code: AuthErrorCode field; handleAuthError in
shared-auth prefers the server envelope over the legacy message
heuristics.
## Tests
- 30 unit tests for the classifier covering every branch (including
the exact Postgres 42703 shape that started this).
- 9 unit tests for the rate limiter.
- 14 integration tests for the auth routes — the regression test
explicitly asserts "upstream 500 → 503 + zero lockout bumps".
- 101 tests pass, 0 fail, 30 pre-existing skips unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 1 of the Mission Key-Grant rollout. Webapp can now request a
wrapped per-mission data key; mana-ai can unwrap and (Phase 2) use it.
mana-auth:
- POST /api/v1/me/ai-mission-grant — HKDF-derives MDK from the user
master key, RSA-OAEP-2048-wraps with the mana-ai public key, returns
{ wrappedKey, derivation, issuedAt, expiresAt }
- MissionGrantService refuses zero-knowledge users (409 ZK_ACTIVE) and
returns 503 GRANT_NOT_CONFIGURED when MANA_AI_PUBLIC_KEY_PEM is unset
- TTL clamped to [1h, 30d]
mana-ai:
- configureMissionGrantKey + unwrapMissionGrant with structured failure
reasons (not-configured / expired / malformed / wrap-rejected)
- mana_ai.decrypt_audit table + RLS policy scoped to
app.current_user_id — append-only row per server-side decrypt attempt
- MANA_AI_PRIVATE_KEY_PEM env slot; absent = grants silently disabled
No existing behaviour changes: missions without a grant run exactly as
before. Grant flow is wired end-to-end but unused until Phase 2 lands
the encrypted resolver.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The workbench-registry app id 'inventar' did not match its
@mana/shared-branding MANA_APPS counterpart 'inventory', so the tier-
gating join in apps/web/src/lib/app-registry/registry.ts silently
failed for the inventory module — it fell into the "no MANA_APPS
entry, default visible" fallback and was effectively un-gated. The
codebase had also voted overwhelmingly for 'inventar' (53 files) vs
'inventory' (3 files in shared-branding), so the long-standing
mismatch was just bookkeeping debt waiting to bite.
Pre-release, no live data, so the cleanest fix is to align everything
on the English 'inventory':
- Workbench-registry id, module.config.ts appId, module folder, route
folder and i18n locale folder all renamed via git mv
- Standalone apps/inventar/ workspace package renamed
- All imports, store identifiers (InventarEvents → InventoryEvents,
INVENTAR_GUEST_SEED, inventarModuleConfig), i18n keys and href/goto
paths follow the rename
- The German display label "Inventar" is preserved everywhere it is a
user-visible string (page titles, i18n values, toast labels)
- Dexie table prefixes (invCollections, invItems, …) are unchanged
- Drive-by fix: ListView.svelte was querying non-existent
inventarCollections/inventarItems tables — corrected to the actual
invCollections/invItems names from module.config
- The "inventar ↔ inventory id mismatch" workaround comment in
registry.ts is removed since the mismatch no longer exists
module-registry.ts also picks up the user's parallel newsModuleConfig
addition because both edits land in the same import block — keeping
them split would have left the build in an inconsistent state.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
While adding negative-path integration tests for the auth flow I
discovered that *neither* of the lockout primitives in
services/mana-auth/src/services/security.ts has actually been
working in production. Two independent silent failures that combined
into a "the lockout never triggers, ever" outcome:
1. recordAttempt() inserted into auth.login_attempts with explicit
`id = gen_random_uuid()`, but auth.login_attempts.id is a
`serial integer` column with `nextval('auth.login_attempts_id_seq')`
as default. The UUID-into-integer cast threw a type error every
single time, the bare `catch {}` swallowed it as "non-critical",
and not a single login attempt was ever persisted. Lockout's "5
failures in 15 min" check was running against an empty table.
2. checkLockout() built `attempted_at > ${new Date(...)}` via the
drizzle sql template, but postgres-js cannot bind a JS Date object
directly — it tries to byteLength() the parameter and crashes with
`Received an instance of Date`. Same anti-pattern: bare `catch`,
returns `{locked: false}` (fail-open), no log, completely invisible.
Both are "silent broken since the encryption-vault series of changes"
class — caught only because the integration test for the lockout flow
expected the 6th login attempt to return 429 and got 200 instead.
Fixes:
- recordAttempt(): drop the bogus `id` column from the INSERT (let the
sequence default assign it), default ipAddress to null instead of
letting `${undefined}` collapse the parameter slot, and surface
errors in the catch instead of swallowing them silently.
- checkLockout(): pass `windowStart.toISOString()` instead of the Date
object so postgres-js can serialize it. Same catch upgrade — log the
cause when failing open.
Failure-path test additions (tests/integration/auth-failures.test.ts):
- wrong password: assert 401, no JWT, +1 LOGIN_FAILURE in security_events,
+1 row in auth.login_attempts
- account lockout: 5 failed attempts then 6th returns 429 with
remainingSeconds, even with the correct password
- unverified email login: 403 with code = EMAIL_NOT_VERIFIED
- validate with garbage token: valid !== true
- resend verification: second mail arrives in mailpit
Plus the run-integration-tests.sh helper now runs both .test.ts files
and tests/integration/package.json's `test` script does the same.
Negative-control: reverted the recordAttempt fix (re-added the bogus
gen_random_uuid id), the wrong-password test failed at the
login_attempts assertion. Reverted the checkLockout fix, the lockout
test failed at the 429 assertion. Both fixes verified to be load-bearing.
6 tests, 45 expects, ~1.3s on a warm cache.
logEvent() builds its INSERT via a raw `sql` tagged template:
sql\`INSERT INTO auth.security_events
(..., user_id, ip_address, user_agent, metadata, ...)
VALUES (..., \${params.userId}, \${params.ipAddress},
\${params.userAgent}, \${...metadata}, ...)\`
Most call sites only pass userId+eventType (or only eventType for the
LOGIN_FAILURE / PASSWORD_RESET_REQUESTED / PROFILE_UPDATED /
PASSWORD_CHANGED / ACCOUNT_DELETED events). The other params land in
the template as `undefined`, and postgres-js's tagged-template renderer
collapses `${undefined}` into literal nothing — producing this:
VALUES (gen_random_uuid(), $1, $2, , , $3::jsonb, NOW())
^^^^
Postgres rejects with "syntax error at or near \",\"". The catch block
swallowed it as a `console.warn('Failed to log security event
(non-critical):', params.eventType)` with no error detail, which is why
this has been silently broken for who knows how long — every register,
every login, every password change has been losing its audit row.
Fix:
- Coerce optional params to `null` (`params.userId ?? null`) before
interpolation. NULL is what postgres-js renders for an explicit null.
- Surface the actual error in the catch warn so the next time something
similar happens it shows up in logs instead of just "non-critical".
Verified the diagnosis by toggling `log_statement = all` on the test
postgres, triggering a register, and reading the literal failed
statement out of postgres logs.
Closes backlog #1 from the Phase 9 audit. Adds 28 integration tests
for the EncryptionVaultService against a real Postgres so the
RLS policies, CHECK constraints and audit-row writes are exercised
as the production app actually sees them. The pure-crypto KEK tests
in kek.test.ts already covered the wrap/unwrap primitives — this
new file fills in the service-shaped gaps that need a real DB.
Test infrastructure
-------------------
- Reads TEST_DATABASE_URL from env. Whole suite is SKIPPED via
describe.skip if unset, so unrelated CI runs and `bun test` from
a fresh checkout don't fail on missing connection. The
encryption-vault sub-job has to provision a Postgres explicitly.
- Schema is assumed already migrated (run `pnpm db:push` or apply
sql/002 + sql/003 manually before invoking the suite). Tests
insert a fresh test user per case via beforeEach so cross-test
pollution is impossible despite the FK to auth.users.
- afterAll cleans up the user (CASCADE wipes vault + audit) and
closes the postgres pool so bun test exits cleanly.
Coverage
--------
init (3):
- Mints a fresh vault, wrapped_mk + wrap_iv populated, ZK off
- Idempotent (returns same key)
- Audit rows are written
getStatus (5):
- vaultExists=false for unconfigured user
- vaultExists=true after init, no recovery wrap
- hasRecoveryWrap=true after setRecoveryWrap
- zeroKnowledge=true after enableZK
- Does NOT write an audit row (cheap metadata read)
setRecoveryWrap (4):
- Stores wrap on existing vault
- VaultNotFoundError on missing vault
- Idempotent (replaces previous wrap)
- Writes recovery_set audit row
clearRecoveryWrap (3):
- Removes the wrap
- ZeroKnowledgeActiveError when ZK is on
- VaultNotFoundError on missing vault
enableZeroKnowledge (4):
- Flips zero_knowledge=true and NULLs out wrapped_mk + wrap_iv
- RecoveryWrapMissingError if no recovery wrap is set
- Idempotent (already-on is no-op)
- VaultNotFoundError on missing vault
disableZeroKnowledge (2):
- Restores wrapped_mk from a client-supplied master key,
verifies the round-trip via getMasterKey returns the same bytes
- No-op when ZK is already off
getMasterKey (3):
- Returns unwrapped MK in standard mode
- Returns recovery blob with requiresRecoveryCode=true in ZK mode
- VaultNotFoundError on missing vault
rotate (2):
- Mints fresh MK and wipes any existing recovery wrap
- ZeroKnowledgeRotateForbidden in ZK mode
DB-level invariants (2):
- Setting wrapped_mk back while ZK active is rejected by
encryption_vaults_zk_consistency
- Setting wrap_iv to NULL while wrapped_mk is set is rejected
by encryption_vaults_wrap_iv_pair
Both wrap the Drizzle update in an arrow IIFE so
expect(...).rejects.toThrow() sees a real Promise (Drizzle's
chainable update() only executes on await/then).
Run results
-----------
With TEST_DATABASE_URL set + schema migrated:
28 pass, 0 fail, 64 expect() calls
Without TEST_DATABASE_URL set (default):
0 pass, 30 skip (full suite cleanly skipped)
KEK tests in kek.test.ts still run unaffected.
Drive-by: kek.test.ts header comment updated to point at the new
sibling file instead of saying "tests will live alongside mana-sync"
(which was outdated speculation from Phase 2).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Closes the Phase 9 Milestone 4 known limitation where the settings
page always started in 'idle' state regardless of whether the user
had already enabled zero-knowledge mode. Adds a cheap server-side
status read + hydrates the page on mount.
Server side
-----------
New VaultStatus interface and getStatus(userId) method on
EncryptionVaultService — single SELECT against encryption_vaults,
no decryption, no audit logging (this gets called on every settings
page mount and we don't want to flood the audit log with read-only
metadata fetches). Returns sane defaults when the vault row doesn't
exist yet so the client can avoid a 404 dance.
GET /api/v1/me/encryption-vault/status →
{
vaultExists: boolean,
hasRecoveryWrap: boolean,
zeroKnowledge: boolean,
recoverySetAt: string | null
}
Client side
-----------
vault-client.ts gains a `getStatus()` method that bypasses the
fetchVault retry helper (status reads should be cheap and one-shot;
if they fail we let the caller fall back to defaults). Re-exports
VaultStatus + RecoveryCodeSetupResult from the crypto barrel.
settings/security/+page.svelte
------------------------------
onMount kicks off a getStatus() call. Two things change based on
the response:
1. If the server says zero_knowledge=true, jump zkSetupStep to
'enabled' so the page renders the active-state UI directly
instead of the setup flow.
2. New `hasRecoveryWrap` state tracks whether a wrap is stored,
even if ZK isn't active yet. The idle branch now has TWO
variants:
- hasRecoveryWrap=false: original "Recovery-Code einrichten"
single button (unchanged from milestone 4)
- hasRecoveryWrap=true: amber notice "you have a code stored
but ZK isn't active" with three buttons:
* "Zero-Knowledge jetzt aktivieren" (jumps straight to the
enable call)
* "Neuen Recovery-Code generieren" (rotates the wrap)
* "Recovery-Code entfernen" (with two-click confirmation,
calls DELETE /recovery-wrap)
This handles the previously-orphaned state where a user generated a
code, copied it to their password manager, but never confirmed the
final activation step. Without this branch, after a reload the
settings page would show "Setup" again and the call would fail
with "vault is already in zero-knowledge mode" — except it wouldn't,
because the vault wasn't actually in ZK yet, just had a recovery wrap
stored. Either way the state was confusing.
handleSetupRecoveryCode + handleClearRecoveryCode now keep
hasRecoveryWrap in sync after the round trip.
Fail-quiet on getStatus error: if the network/auth/server-side fetch
fails, the page stays at the idle default. The user can still run
the setup flow, and any inconsistencies surface via the usual
server-side error responses.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Server-side support for the Phase 9 zero-knowledge opt-in. Adds the
recovery-wrap columns + four new vault operations + the routes that
expose them.
Schema (sql/003_recovery_wrap.sql)
----------------------------------
Adds to auth.encryption_vaults:
- recovery_wrapped_mk text (NULL until set)
- recovery_iv text (NULL until set)
- recovery_format_version smallint NOT NULL DEFAULT 1
- recovery_set_at timestamptz
- zero_knowledge boolean NOT NULL DEFAULT false
Drops NOT NULL from wrapped_mk + wrap_iv (a vault in zero-knowledge
mode has no server-side wrap at all).
Three CHECK constraints enforce the invariant at the DB level so no
service bug can leave a vault in an inconsistent state:
- encryption_vaults_has_wrap — at least one of (wrapped_mk,
recovery_wrapped_mk) is set
- encryption_vaults_wrap_iv_pair — ciphertext + IV are paired
(both NULL or both set) on
each wrap form
- encryption_vaults_zk_consistency — zero_knowledge=true implies
wrapped_mk IS NULL AND
recovery_wrapped_mk IS NOT NULL
If a code-level bug ever tried to enable ZK without a recovery wrap,
or to leave both wraps empty, Postgres would reject the UPDATE.
Drizzle schema (db/schema/encryption-vaults.ts)
-----------------------------------------------
Mirrors the migration: wrappedMk + wrapIv become nullable, the four
new columns added with the right defaults. Inline doc comment explains
the zero-knowledge fork.
Service (services/encryption-vault/index.ts)
--------------------------------------------
VaultFetchResult gains optional `requiresRecoveryCode` /
`recoveryWrappedMk` / `recoveryIv` so the route handler can serialize
the right shape. masterKey becomes Uint8Array | null (null in ZK mode).
Existing methods updated:
- init: branches on row.zeroKnowledge — returns the recovery blob
instead of an unwrapped MK if the user is already in ZK mode
- getMasterKey: same fork, with audit context "zk-recovery-blob"
- rotate: throws ZeroKnowledgeRotateForbidden in ZK mode (the server
can't re-wrap a key it can't read). Also wipes any stale recovery
wrap on rotation — the new MK has nothing to do with the old one,
so the old recovery code would unwrap into garbage.
New methods:
- setRecoveryWrap(userId, { recoveryWrappedMk, recoveryIv }, ctx)
Stores (or replaces) the user's recovery wrap. Idempotent.
- clearRecoveryWrap(userId, ctx)
Removes the recovery wrap. Forbidden if ZK is active (would lock
the user out) — throws ZeroKnowledgeActiveError → 409.
- enableZeroKnowledge(userId, ctx)
NULLs out wrapped_mk + wrap_iv, sets zero_knowledge=true. Requires
a recovery wrap to already be present — throws
RecoveryWrapMissingError → 400 otherwise. Idempotent on already-on.
- disableZeroKnowledge(userId, mkBytes, ctx)
Inverse: takes a freshly-unwrapped MK from the client, KEK-wraps
it, stores as wrapped_mk, flips zero_knowledge=false. The client
is the only entity that can supply the MK at this point, since
the server can't decrypt the recovery wrap.
Three new error classes:
- RecoveryWrapMissingError → 400 RECOVERY_WRAP_MISSING
- ZeroKnowledgeActiveError → 409 ZK_ACTIVE
- ZeroKnowledgeRotateForbidden → 409 ZK_ROTATE_FORBIDDEN
Audit action union extended with:
- 'recovery_set' | 'recovery_clear' | 'zk_enable' | 'zk_disable'
Routes (routes/encryption-vault.ts)
-----------------------------------
GET /key + POST /init now share a serializeFetchResult helper that
returns either:
- { masterKey, formatVersion, kekId } (standard)
- { requiresRecoveryCode: true, recoveryWrappedMk, (ZK mode)
recoveryIv, formatVersion }
Three new routes:
- POST /recovery-wrap — body: { recoveryWrappedMk, recoveryIv }
Stores the wrap. Validates both fields
are non-empty strings.
- DELETE /recovery-wrap — Removes the wrap. 409 if ZK active.
- POST /zero-knowledge — body: { enable: boolean, masterKey?: base64 }
enable=true: flip on (no body MK needed)
enable=false: flip off (MK required)
Validates the MK decodes to exactly 32 bytes.
Wipes the bytes after handing them to the
service.
POST /rotate now catches ZeroKnowledgeRotateForbidden → 409
ZK_ROTATE_FORBIDDEN so the client can show "disable zero-knowledge
first".
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds the server side of the per-user encryption vault. Phase 1 shipped
the client foundation (no-op while every table is enabled:false). This
commit lets the client actually fetch a master key when Phase 3 flips
the registry switches.
Schema (Drizzle + raw SQL migration)
- auth.encryption_vaults: per-user wrapped MK + IV + format version +
kek_id stamp + created/rotated timestamps. PK = user_id, ON DELETE
CASCADE so account deletion wipes the vault.
- auth.encryption_vault_audit: append-only trail of init/fetch/rotate
actions with IP, user-agent, HTTP status, free-form context.
- sql/002_encryption_vaults.sql: idempotent CREATE TABLE + ENABLE +
FORCE row-level security with a `current_setting('app.current_user_id')`
policy on both tables. FORCE makes the policy apply to the table
owner too — no bypass via grants.
KEK loader (services/encryption-vault/kek.ts)
- Loads a 32-byte AES-256 KEK from the MANA_AUTH_KEK env var (base64).
- Production: missing or wrong-length input is fatal at boot.
- Development: 32-zero-byte fallback so contributors can run the
service without provisioning a secret. Logs a loud warning.
- wrapMasterKey / unwrapMasterKey use Web Crypto AES-GCM-256 over the
raw 32-byte MK with a fresh 12-byte IV per wrap. Returns base64
pair for storage.
- generateMasterKey + activeKekId helpers used by the service.
- Future migration to KMS / Vault: only loadKek() changes; the
kek_id stamp on each row tracks which KEK produced it.
EncryptionVaultService (services/encryption-vault/index.ts)
- init(userId): idempotent — returns existing MK or mints a new one.
- getMasterKey(userId): unwraps the stored MK; throws VaultNotFoundError
on no-row so the route can return 404 cleanly.
- rotate(userId): mints fresh MK, replaces wrap. Caller is on the
hook for re-encryption — destructive by design.
- withUserScope(userId, fn): wraps every read/write in a Drizzle
transaction with set_config('app.current_user_id', userId, true)
so the RLS policy admits only the matching row. Empty userId is
rejected up-front.
- writeAudit() appends a row to encryption_vault_audit on every
action including failures, so probing attempts leave a trail.
Routes (routes/encryption-vault.ts)
- POST /api/v1/me/encryption-vault/init — idempotent bootstrap
- GET /api/v1/me/encryption-vault/key — fetch the active MK
- POST /api/v1/me/encryption-vault/rotate — destructive rotation
- All return base64-encoded master key bytes plus formatVersion +
kekId. JWT-protected via the existing /api/v1/me/* middleware.
- readAuditContext() pulls X-Forwarded-For + User-Agent off the
request for the audit row.
Bootstrap (index.ts)
- loadKek() runs at top-level await before any route can fire so a
misconfigured KEK fails closed at boot, never at request time.
- encryptionVaultService is mounted under /api/v1/me/encryption-vault
so it inherits the existing JWT middleware and shows up next to the
GDPR self-service endpoints.
Tests (services/encryption-vault/kek.test.ts)
- 11 Bun-test cases covering: KEK load (happy path, wrong length,
idempotent, before-load guard), generateMasterKey randomness,
wrap/unwrap roundtrip, IV uniqueness across repeated wraps,
wrong-MK-length rejection, tampered-ciphertext rejection,
wrong-length IV rejection, wrong-KEK rejection.
- Service-level integration tests deferred — they need a real
Postgres for the RLS behaviour, set up via existing mana-sync
test pattern in CI.
Config + env
- .env.development gains MANA_AUTH_KEK= (empty → dev fallback)
with a comment explaining the production requirement.
- services/mana-auth/package.json gains "test": "bun test".
Verified: 11/11 KEK tests passing, 31/31 Phase 1 client tests still
passing, only pre-existing TS errors remain in mana-auth (auth.ts:281
forgetPassword + api-keys.ts:50 insert overload — both unrelated).
Phase 3: client wires the MemoryKeyProvider to GET /encryption-vault/key
on login, flips registry entries to enabled:true table by table, and
extends the Dexie hooks to call wrapValue/unwrapValue on configured
fields.
Phase 4: settings UI for lock state, key rotation, recovery code opt-in.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract ~120 hardcoded German strings from 14 Svelte components into i18n locale
files using svelte-i18n $t() calls. Add new translation sections (taskForm, filters,
tags, subtasks, durationPicker, kanban, toolbar) across all 5 languages (de/en/fr/es/it).
Also add missing shared common translations for Spanish, French, and Italian
(150+ keys each) in packages/shared-i18n.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Route all AI workloads (Ollama, STT, TTS, Image Gen) to GPU server
(192.168.178.11) via LAN instead of host.docker.internal
- Upgrade default model to gemma3:12b and max concurrent to 5
- Add daily signup limit service (MAX_DAILY_SIGNUPS env var)
- Add GET /api/v1/auth/signup-status public endpoint
- Add k6 load test suite (web-apps, auth, sync-websocket, ollama)
- Add capacity planning documentation
- Fix: add eslint-config to sveltekit-base and calendar Dockerfiles
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Complete the mana-auth Hono service with all remaining endpoints
from mana-core-auth.
Added:
- routes/auth.ts: Full auth flow (register, login, logout, validate,
password reset, profile, change-password, account deletion,
security events) with lockout + security event logging
- routes/guilds.ts: Guild CRUD, member management, invitations
(delegates to Better Auth org plugin + mana-credits for pools)
- routes/api-keys.ts: API key generation, listing, revocation,
validation (sk_live_* format, SHA-256 hashed)
- routes/me.ts: GDPR data export/delete (Articles 17 & 20)
- services/security.ts: SecurityEventsService (fire-and-forget audit)
+ AccountLockoutService (5 failures/15min → 30min lockout)
- services/api-keys.ts: Key generation, validation, scope checks
Updated:
- index.ts: Wire all routes with proper middleware (JWT, service auth)
Service now has ~1,900 LOC covering all functionality from the
original ~11,500 LOC NestJS mana-core-auth (83% reduction).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>