Commit graph

20 commits

Author SHA1 Message Date
Till JS
099cac4a01 feat(auth): explicit bootstrap-singletons endpoint + idempotent functions (F4 robust)
The F4 server-side singleton bootstrap was fire-and-forget at signup
time — a transient mana_sync outage during registration would leave the
user with no singleton and only the in-store `getOrCreateLocalDoc()`
fallback to race on the first write. The signup-hook is still the
happy-path zero-latency bootstrap; this commit adds a deliberate
reconciliation path that converges on every boot.

- Idempotent `bootstrapUserSingletons` / `bootstrapSpaceSingletons`:
  both functions now existence-check sync_changes before INSERT and
  return boolean (true=inserted, false=skipped).
- New endpoint `POST /api/v1/me/bootstrap-singletons` — JWT-gated under
  the existing `/api/v1/me/*` prefix. Provisions the caller's
  userContext and the kontextDoc for every Space they're a member of.
  Returns `{ ok, bootstrapped: { userContext, spaces: { id: bool } } }`.
- Webapp `(app)/+layout.svelte` calls the endpoint once per
  authenticated boot, after `restoreClientIdFromDexie()` and before
  `createUnifiedSync.startAll()`. Best-effort; failures swallow into a
  console warning and the in-store fallback still covers the rare
  race window.

Plan: docs/plans/sync-field-meta-overhaul.md (F4-robust row).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 01:38:14 +02:00
Till JS
3df7391905 feat(auth): bootstrap per-Space kontextDoc on Space-creation (F4 follow-up)
Symmetrically extends the F4 server-side singleton bootstrap to the
per-Space `kontextDoc`. Every Space-creation — Personal at signup and
brand/club/family/team/practice via the org plugin — now writes an empty
kontextDoc row straight into mana_sync.sync_changes with origin='system',
client_id='system:bootstrap'. Fresh clients pull the row instead of
racing on a local insert that the next pull would clobber.

- New `bootstrapSpaceSingletons(spaceId, ownerUserId, syncSql)` in
  services/mana-auth/src/services/bootstrap-singletons.ts; shared
  `buildFieldMeta` helper extracted.
- `createBetterAuth(databaseUrl, syncDatabaseUrl, webauthn)` now takes
  the sync-DB URL and lazy-creates a module-scoped postgres pool for
  the bootstrap inserts.
- Hook into `databaseHooks.user.create.after` (only on `created: true`
  from createPersonalSpaceFor) and `organizationHooks.afterCreateOrganization`.
- Webapp `kontextStore.ensureDoc()` made private as `getOrCreateLocalDoc()` —
  same fallback role as userContextStore's after F5. Public API is now just
  setContent + appendContent.

Plan: docs/plans/sync-field-meta-overhaul.md (F4-fu row in Shipping Log).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 01:21:31 +02:00
Till JS
c07db300b0 feat(sync): F4 — server-side singleton bootstrap
Closes the userContext race-on-first-mount that surfaced as a
"10 fields overwritten" conflict toast pre-F2. Adds a fire-and-forget
hook in the /register flow that writes the per-user `userContext`
singleton straight into `mana_sync.sync_changes` with
`client_id='system:bootstrap'` and `origin='system'`.

Behavior:
- On successful `signUpEmail`, `bootstrapUserSingletons(userId, syncSql)`
  inserts a `profile/userContext` row with the empty-default shape that
  mirrors the webapp's `emptyUserContext()` factory in
  `apps/mana/apps/web/src/lib/modules/profile/types.ts`.
- The receiving client treats the change as origin='server-replay'
  on apply (per F2 conflict-gate), so no toasts on first pull.
- Failure is logged but does not abort registration — the webapp's
  existing `ensureDoc()` fallback still works during the F4→F5
  transition.

Module-scoped postgres pool (max=2 connections) lazy-initialized on
first signUp; reused for the lifetime of the process. Same pattern as
`UserDataService.getSyncSql`.

Out of scope for F4:
- `kontextDoc` is per-Space (not per-user) — bootstrap there will be
  hooked into the Space-creation flow, not /register. The webapp's
  `ensureDoc()` for kontextDoc stays as-is for now.
- Webapp `ensureDoc()` removal is F5.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 23:18:54 +02:00
Till JS
e66654068f feat(auth): error-classification layer + passkey end-to-end
Two interlocking fixes driven by a production lockout incident.

## Bug that motivated this

A fresh schema-drift column (auth.users.onboarding_completed_at) made
every Better Auth query crash with Postgres 42703. The /login wrapper
swallowed the non-2xx and mapped it onto a generic "401 Invalid
credentials" AND bumped the password lockout counter — so 5 legit
login attempts against a broken DB would have locked every real user
out of their own account. Same wrapper pattern on /register, /refresh,
/reset-password etc. The 30-minute hunt ended in a one-off repro
script that finally surfaced the real Postgres error.

The user-facing passkey button additionally returned generic 404s on
every login-page mount because the route wasn't registered (the DB
schema existed, the Better Auth plugin wasn't wired).

## Phase 1 — Error classification (services/mana-auth/src/lib/auth-errors)

- 19-code AuthErrorCode taxonomy (INVALID_CREDENTIALS, EMAIL_NOT_VERIFIED,
  ACCOUNT_LOCKED, SERVICE_UNAVAILABLE, PASSKEY_VERIFICATION_FAILED, …)
- classifyFromResponse/classifyFromError handle: Better Auth APIError
  (duck-typed on `name === 'APIError'`), Postgres errors (23505 unique,
  42703/08xxx → infra), ZodError, fetch/ECONNREFUSED network errors,
  bare Error, unknown.
- respondWithError routes the structured response, logs at the right
  level, fires the correct security event, and CRITICALLY only bumps
  the lockout counter for actual credential failures — SERVICE_UNAVAILABLE
  and INTERNAL never touch lockout.
- All 12 endpoints in routes/auth.ts refactored (/login, /register,
  /logout, /session-to-token, /refresh, /validate, /forgot-password,
  /reset-password, /resend-verification, /profile GET+POST,
  /change-email, /change-password, /account DELETE).
- Fixed pre-existing auth.api.forgetPassword typo (→ requestPasswordReset).
- shared-logger + requestLogger middleware wired in index.ts; all
  console.* calls in the service removed.

## Phase 2 — Passkey end-to-end (@better-auth/passkey 1.6+)

- sql/007_passkey_bootstrap.sql: idempotent schema alignment —
  friendly_name→name, +aaguid, transports jsonb→text, +method column
  on login_attempts.
- better-auth.config.ts: passkey plugin wired with rpID/rpName/origin
  from new webauthn config section. rpID defaults to mana.how in prod
  (from COOKIE_DOMAIN), localhost in dev.
- routes/passkeys.ts: 7 wrapper endpoints (capability probe,
  register/options+verify, authenticate/options+verify with JWT mint,
  list, delete, rename). Each routes errors through the classifier;
  authenticate/verify promotes generic INVALID_CREDENTIALS to
  PASSKEY_VERIFICATION_FAILED.
- PasskeyRateLimitService: in-memory per-IP (options: 20/min) and
  per-credential (verify: 10 failures/min → 5 min cooldown) buckets.
  Deliberately separate from the password lockout — different factor,
  different blast radius.
- Client: authService.getPasskeyCapability() async probe, memoised per
  session. authStore.passkeyAvailable reactive state. LoginPage gates
  on === true so a slow probe doesn't flash the button in.
- AuthResult grew a code: AuthErrorCode field; handleAuthError in
  shared-auth prefers the server envelope over the legacy message
  heuristics.

## Tests

- 30 unit tests for the classifier covering every branch (including
  the exact Postgres 42703 shape that started this).
- 9 unit tests for the rate limiter.
- 14 integration tests for the auth routes — the regression test
  explicitly asserts "upstream 500 → 503 + zero lockout bumps".
- 101 tests pass, 0 fail, 30 pre-existing skips unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 01:52:51 +02:00
Till JS
9a3025fed8 feat(ai,auth): Mission Grant endpoint + unwrap helper + audit table
Phase 1 of the Mission Key-Grant rollout. Webapp can now request a
wrapped per-mission data key; mana-ai can unwrap and (Phase 2) use it.

mana-auth:
- POST /api/v1/me/ai-mission-grant — HKDF-derives MDK from the user
  master key, RSA-OAEP-2048-wraps with the mana-ai public key, returns
  { wrappedKey, derivation, issuedAt, expiresAt }
- MissionGrantService refuses zero-knowledge users (409 ZK_ACTIVE) and
  returns 503 GRANT_NOT_CONFIGURED when MANA_AI_PUBLIC_KEY_PEM is unset
- TTL clamped to [1h, 30d]

mana-ai:
- configureMissionGrantKey + unwrapMissionGrant with structured failure
  reasons (not-configured / expired / malformed / wrap-rejected)
- mana_ai.decrypt_audit table + RLS policy scoped to
  app.current_user_id — append-only row per server-side decrypt attempt
- MANA_AI_PRIVATE_KEY_PEM env slot; absent = grants silently disabled

No existing behaviour changes: missions without a grant run exactly as
before. Grant flow is wired end-to-end but unused until Phase 2 lands
the encrypted resolver.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 13:41:59 +02:00
Till JS
851a281e5a refactor: rename zitare -> quotes (Zitate)
Zitare was opaque Latin/Italian-flavored branding. Renamed to clear
English "quotes" (DE: Zitate) matching short-concrete-noun cluster.

- Module, routes, API, i18n, standalone landing app, plans dirs
- Dexie tables: quotesFavorites, quotesLists, quotesListTags,
  customQuotes (dropped redundant "quotes" prefix on the last)
- Logo QuotesLogo, theme quotes.css, search provider, dashboard
  widget QuoteWidget
- German user-facing label "Zitate" (English brand stays Quotes)

Pre-launch, no data migration needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 20:59:16 +02:00
Till JS
53b3746b98 refactor: rename nutriphi module to food (Essen)
Complete rename across the entire monorepo pre-launch:
- Module, routes, API, i18n, standalone landing app directories
- All code identifiers, display names, logo component
- German user-facing label: "Essen" (English brand stays "Food")
- Dexie table nutriFavorites -> foodFavorites
- Infra configs (docker-compose, cloudflared, nginx, wrangler)

Zero residue of nutriphi remains. No data migration needed (pre-launch).

Follow-up: run pnpm install, update Cloudflare DNS
(food.mana.how), rename Cloudflare Pages project.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 15:30:07 +02:00
Till JS
a91a6076cc refactor: rename planta → plants, clean up codebase
- Rename planta module to plants everywhere (routes, modules, API,
  branding, i18n, docker, docs, shared packages)
- Fix package name collisions: @mana/credits-service, @mana/subscriptions-service
  (unblocks turbo)
- Extract layout composables: use-ai-tier-items, use-sync-status-items,
  RouteTierGate (layout 1345→1015 lines)
- Create shared DB pool for apps/api (lib/db.ts), migrate 5 modules
- Add automations module queries.ts with useAllAutomations/useEnabledAutomations
- Remove debug console.log statements from production code
- Rename storage display name: Ablage → Speicher

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 18:59:44 +02:00
Till JS
45790ffbb8 refactor(mana): rename inventar → inventory across the codebase
The workbench-registry app id 'inventar' did not match its
@mana/shared-branding MANA_APPS counterpart 'inventory', so the tier-
gating join in apps/web/src/lib/app-registry/registry.ts silently
failed for the inventory module — it fell into the "no MANA_APPS
entry, default visible" fallback and was effectively un-gated. The
codebase had also voted overwhelmingly for 'inventar' (53 files) vs
'inventory' (3 files in shared-branding), so the long-standing
mismatch was just bookkeeping debt waiting to bite.

Pre-release, no live data, so the cleanest fix is to align everything
on the English 'inventory':

- Workbench-registry id, module.config.ts appId, module folder, route
  folder and i18n locale folder all renamed via git mv
- Standalone apps/inventar/ workspace package renamed
- All imports, store identifiers (InventarEvents → InventoryEvents,
  INVENTAR_GUEST_SEED, inventarModuleConfig), i18n keys and href/goto
  paths follow the rename
- The German display label "Inventar" is preserved everywhere it is a
  user-visible string (page titles, i18n values, toast labels)
- Dexie table prefixes (invCollections, invItems, …) are unchanged
- Drive-by fix: ListView.svelte was querying non-existent
  inventarCollections/inventarItems tables — corrected to the actual
  invCollections/invItems names from module.config
- The "inventar ↔ inventory id mismatch" workaround comment in
  registry.ts is removed since the mismatch no longer exists

module-registry.ts also picks up the user's parallel newsModuleConfig
addition because both edits land in the same import block — keeping
them split would have left the build in an inconsistent state.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 15:50:24 +02:00
Till JS
d941ff2231 fix(mana-auth): account lockout was structurally dead + add failure-path tests
While adding negative-path integration tests for the auth flow I
discovered that *neither* of the lockout primitives in
services/mana-auth/src/services/security.ts has actually been
working in production. Two independent silent failures that combined
into a "the lockout never triggers, ever" outcome:

1. recordAttempt() inserted into auth.login_attempts with explicit
   `id = gen_random_uuid()`, but auth.login_attempts.id is a
   `serial integer` column with `nextval('auth.login_attempts_id_seq')`
   as default. The UUID-into-integer cast threw a type error every
   single time, the bare `catch {}` swallowed it as "non-critical",
   and not a single login attempt was ever persisted. Lockout's "5
   failures in 15 min" check was running against an empty table.

2. checkLockout() built `attempted_at > ${new Date(...)}` via the
   drizzle sql template, but postgres-js cannot bind a JS Date object
   directly — it tries to byteLength() the parameter and crashes with
   `Received an instance of Date`. Same anti-pattern: bare `catch`,
   returns `{locked: false}` (fail-open), no log, completely invisible.

Both are "silent broken since the encryption-vault series of changes"
class — caught only because the integration test for the lockout flow
expected the 6th login attempt to return 429 and got 200 instead.

Fixes:
- recordAttempt(): drop the bogus `id` column from the INSERT (let the
  sequence default assign it), default ipAddress to null instead of
  letting `${undefined}` collapse the parameter slot, and surface
  errors in the catch instead of swallowing them silently.
- checkLockout(): pass `windowStart.toISOString()` instead of the Date
  object so postgres-js can serialize it. Same catch upgrade — log the
  cause when failing open.

Failure-path test additions (tests/integration/auth-failures.test.ts):
- wrong password: assert 401, no JWT, +1 LOGIN_FAILURE in security_events,
  +1 row in auth.login_attempts
- account lockout: 5 failed attempts then 6th returns 429 with
  remainingSeconds, even with the correct password
- unverified email login: 403 with code = EMAIL_NOT_VERIFIED
- validate with garbage token: valid !== true
- resend verification: second mail arrives in mailpit

Plus the run-integration-tests.sh helper now runs both .test.ts files
and tests/integration/package.json's `test` script does the same.

Negative-control: reverted the recordAttempt fix (re-added the bogus
gen_random_uuid id), the wrong-password test failed at the
login_attempts assertion. Reverted the checkLockout fix, the lockout
test failed at the 429 assertion. Both fixes verified to be load-bearing.

6 tests, 45 expects, ~1.3s on a warm cache.
2026-04-08 18:29:00 +02:00
Till JS
ed746297b5 fix(mana-auth): security_events INSERT crashed on undefined optional fields
logEvent() builds its INSERT via a raw `sql` tagged template:

    sql\`INSERT INTO auth.security_events
        (..., user_id, ip_address, user_agent, metadata, ...)
        VALUES (..., \${params.userId}, \${params.ipAddress},
                     \${params.userAgent}, \${...metadata}, ...)\`

Most call sites only pass userId+eventType (or only eventType for the
LOGIN_FAILURE / PASSWORD_RESET_REQUESTED / PROFILE_UPDATED /
PASSWORD_CHANGED / ACCOUNT_DELETED events). The other params land in
the template as `undefined`, and postgres-js's tagged-template renderer
collapses `${undefined}` into literal nothing — producing this:

    VALUES (gen_random_uuid(), $1, $2, , , $3::jsonb, NOW())
                                       ^^^^

Postgres rejects with "syntax error at or near \",\"". The catch block
swallowed it as a `console.warn('Failed to log security event
(non-critical):', params.eventType)` with no error detail, which is why
this has been silently broken for who knows how long — every register,
every login, every password change has been losing its audit row.

Fix:
- Coerce optional params to `null` (`params.userId ?? null`) before
  interpolation. NULL is what postgres-js renders for an explicit null.
- Surface the actual error in the catch warn so the next time something
  similar happens it shows up in logs instead of just "non-critical".

Verified the diagnosis by toggling `log_statement = all` on the test
postgres, triggering a register, and reading the literal failed
statement out of postgres logs.
2026-04-08 17:59:23 +02:00
Till JS
c2c960121e test(mana-auth): vault service integration tests against real postgres
Closes backlog #1 from the Phase 9 audit. Adds 28 integration tests
for the EncryptionVaultService against a real Postgres so the
RLS policies, CHECK constraints and audit-row writes are exercised
as the production app actually sees them. The pure-crypto KEK tests
in kek.test.ts already covered the wrap/unwrap primitives — this
new file fills in the service-shaped gaps that need a real DB.

Test infrastructure
-------------------
- Reads TEST_DATABASE_URL from env. Whole suite is SKIPPED via
  describe.skip if unset, so unrelated CI runs and `bun test` from
  a fresh checkout don't fail on missing connection. The
  encryption-vault sub-job has to provision a Postgres explicitly.
- Schema is assumed already migrated (run `pnpm db:push` or apply
  sql/002 + sql/003 manually before invoking the suite). Tests
  insert a fresh test user per case via beforeEach so cross-test
  pollution is impossible despite the FK to auth.users.
- afterAll cleans up the user (CASCADE wipes vault + audit) and
  closes the postgres pool so bun test exits cleanly.

Coverage
--------
init (3):
  - Mints a fresh vault, wrapped_mk + wrap_iv populated, ZK off
  - Idempotent (returns same key)
  - Audit rows are written

getStatus (5):
  - vaultExists=false for unconfigured user
  - vaultExists=true after init, no recovery wrap
  - hasRecoveryWrap=true after setRecoveryWrap
  - zeroKnowledge=true after enableZK
  - Does NOT write an audit row (cheap metadata read)

setRecoveryWrap (4):
  - Stores wrap on existing vault
  - VaultNotFoundError on missing vault
  - Idempotent (replaces previous wrap)
  - Writes recovery_set audit row

clearRecoveryWrap (3):
  - Removes the wrap
  - ZeroKnowledgeActiveError when ZK is on
  - VaultNotFoundError on missing vault

enableZeroKnowledge (4):
  - Flips zero_knowledge=true and NULLs out wrapped_mk + wrap_iv
  - RecoveryWrapMissingError if no recovery wrap is set
  - Idempotent (already-on is no-op)
  - VaultNotFoundError on missing vault

disableZeroKnowledge (2):
  - Restores wrapped_mk from a client-supplied master key,
    verifies the round-trip via getMasterKey returns the same bytes
  - No-op when ZK is already off

getMasterKey (3):
  - Returns unwrapped MK in standard mode
  - Returns recovery blob with requiresRecoveryCode=true in ZK mode
  - VaultNotFoundError on missing vault

rotate (2):
  - Mints fresh MK and wipes any existing recovery wrap
  - ZeroKnowledgeRotateForbidden in ZK mode

DB-level invariants (2):
  - Setting wrapped_mk back while ZK active is rejected by
    encryption_vaults_zk_consistency
  - Setting wrap_iv to NULL while wrapped_mk is set is rejected
    by encryption_vaults_wrap_iv_pair
  Both wrap the Drizzle update in an arrow IIFE so
  expect(...).rejects.toThrow() sees a real Promise (Drizzle's
  chainable update() only executes on await/then).

Run results
-----------
With TEST_DATABASE_URL set + schema migrated:
  28 pass, 0 fail, 64 expect() calls

Without TEST_DATABASE_URL set (default):
  0 pass, 30 skip (full suite cleanly skipped)
  KEK tests in kek.test.ts still run unaffected.

Drive-by: kek.test.ts header comment updated to point at the new
sibling file instead of saying "tests will live alongside mana-sync"
(which was outdated speculation from Phase 2).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:39:48 +02:00
Till JS
78d949d051 feat(crypto): vault status endpoint + settings page hydration
Closes the Phase 9 Milestone 4 known limitation where the settings
page always started in 'idle' state regardless of whether the user
had already enabled zero-knowledge mode. Adds a cheap server-side
status read + hydrates the page on mount.

Server side
-----------
New VaultStatus interface and getStatus(userId) method on
EncryptionVaultService — single SELECT against encryption_vaults,
no decryption, no audit logging (this gets called on every settings
page mount and we don't want to flood the audit log with read-only
metadata fetches). Returns sane defaults when the vault row doesn't
exist yet so the client can avoid a 404 dance.

  GET /api/v1/me/encryption-vault/status →
  {
    vaultExists: boolean,
    hasRecoveryWrap: boolean,
    zeroKnowledge: boolean,
    recoverySetAt: string | null
  }

Client side
-----------
vault-client.ts gains a `getStatus()` method that bypasses the
fetchVault retry helper (status reads should be cheap and one-shot;
if they fail we let the caller fall back to defaults). Re-exports
VaultStatus + RecoveryCodeSetupResult from the crypto barrel.

settings/security/+page.svelte
------------------------------
onMount kicks off a getStatus() call. Two things change based on
the response:

  1. If the server says zero_knowledge=true, jump zkSetupStep to
     'enabled' so the page renders the active-state UI directly
     instead of the setup flow.

  2. New `hasRecoveryWrap` state tracks whether a wrap is stored,
     even if ZK isn't active yet. The idle branch now has TWO
     variants:

     - hasRecoveryWrap=false: original "Recovery-Code einrichten"
       single button (unchanged from milestone 4)

     - hasRecoveryWrap=true:  amber notice "you have a code stored
       but ZK isn't active" with three buttons:
       * "Zero-Knowledge jetzt aktivieren" (jumps straight to the
         enable call)
       * "Neuen Recovery-Code generieren" (rotates the wrap)
       * "Recovery-Code entfernen" (with two-click confirmation,
         calls DELETE /recovery-wrap)

This handles the previously-orphaned state where a user generated a
code, copied it to their password manager, but never confirmed the
final activation step. Without this branch, after a reload the
settings page would show "Setup" again and the call would fail
with "vault is already in zero-knowledge mode" — except it wouldn't,
because the vault wasn't actually in ZK yet, just had a recovery wrap
stored. Either way the state was confusing.

handleSetupRecoveryCode + handleClearRecoveryCode now keep
hasRecoveryWrap in sync after the round trip.

Fail-quiet on getStatus error: if the network/auth/server-side fetch
fails, the page stays at the idle default. The user can still run
the setup flow, and any inconsistencies surface via the usual
server-side error responses.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:19:49 +02:00
Till JS
f46d1328d8 feat(mana-auth): phase 9 milestone 2 — vault recovery wrap + zero-knowledge
Server-side support for the Phase 9 zero-knowledge opt-in. Adds the
recovery-wrap columns + four new vault operations + the routes that
expose them.

Schema (sql/003_recovery_wrap.sql)
----------------------------------
Adds to auth.encryption_vaults:

  - recovery_wrapped_mk    text                  (NULL until set)
  - recovery_iv            text                  (NULL until set)
  - recovery_format_version smallint NOT NULL DEFAULT 1
  - recovery_set_at        timestamptz
  - zero_knowledge         boolean NOT NULL DEFAULT false

Drops NOT NULL from wrapped_mk + wrap_iv (a vault in zero-knowledge
mode has no server-side wrap at all).

Three CHECK constraints enforce the invariant at the DB level so no
service bug can leave a vault in an inconsistent state:

  - encryption_vaults_has_wrap         — at least one of (wrapped_mk,
                                          recovery_wrapped_mk) is set
  - encryption_vaults_wrap_iv_pair     — ciphertext + IV are paired
                                          (both NULL or both set) on
                                          each wrap form
  - encryption_vaults_zk_consistency   — zero_knowledge=true implies
                                          wrapped_mk IS NULL AND
                                          recovery_wrapped_mk IS NOT NULL

If a code-level bug ever tried to enable ZK without a recovery wrap,
or to leave both wraps empty, Postgres would reject the UPDATE.

Drizzle schema (db/schema/encryption-vaults.ts)
-----------------------------------------------
Mirrors the migration: wrappedMk + wrapIv become nullable, the four
new columns added with the right defaults. Inline doc comment explains
the zero-knowledge fork.

Service (services/encryption-vault/index.ts)
--------------------------------------------
VaultFetchResult gains optional `requiresRecoveryCode` /
`recoveryWrappedMk` / `recoveryIv` so the route handler can serialize
the right shape. masterKey becomes Uint8Array | null (null in ZK mode).

Existing methods updated:
  - init: branches on row.zeroKnowledge — returns the recovery blob
    instead of an unwrapped MK if the user is already in ZK mode
  - getMasterKey: same fork, with audit context "zk-recovery-blob"
  - rotate: throws ZeroKnowledgeRotateForbidden in ZK mode (the server
    can't re-wrap a key it can't read). Also wipes any stale recovery
    wrap on rotation — the new MK has nothing to do with the old one,
    so the old recovery code would unwrap into garbage.

New methods:
  - setRecoveryWrap(userId, { recoveryWrappedMk, recoveryIv }, ctx)
    Stores (or replaces) the user's recovery wrap. Idempotent.
  - clearRecoveryWrap(userId, ctx)
    Removes the recovery wrap. Forbidden if ZK is active (would lock
    the user out) — throws ZeroKnowledgeActiveError → 409.
  - enableZeroKnowledge(userId, ctx)
    NULLs out wrapped_mk + wrap_iv, sets zero_knowledge=true. Requires
    a recovery wrap to already be present — throws
    RecoveryWrapMissingError → 400 otherwise. Idempotent on already-on.
  - disableZeroKnowledge(userId, mkBytes, ctx)
    Inverse: takes a freshly-unwrapped MK from the client, KEK-wraps
    it, stores as wrapped_mk, flips zero_knowledge=false. The client
    is the only entity that can supply the MK at this point, since
    the server can't decrypt the recovery wrap.

Three new error classes:
  - RecoveryWrapMissingError → 400 RECOVERY_WRAP_MISSING
  - ZeroKnowledgeActiveError → 409 ZK_ACTIVE
  - ZeroKnowledgeRotateForbidden → 409 ZK_ROTATE_FORBIDDEN

Audit action union extended with:
  - 'recovery_set' | 'recovery_clear' | 'zk_enable' | 'zk_disable'

Routes (routes/encryption-vault.ts)
-----------------------------------
GET /key + POST /init now share a serializeFetchResult helper that
returns either:
  - { masterKey, formatVersion, kekId }                 (standard)
  - { requiresRecoveryCode: true, recoveryWrappedMk,    (ZK mode)
      recoveryIv, formatVersion }

Three new routes:
  - POST   /recovery-wrap   — body: { recoveryWrappedMk, recoveryIv }
                              Stores the wrap. Validates both fields
                              are non-empty strings.
  - DELETE /recovery-wrap   — Removes the wrap. 409 if ZK active.
  - POST   /zero-knowledge  — body: { enable: boolean, masterKey?: base64 }
                              enable=true:  flip on (no body MK needed)
                              enable=false: flip off (MK required)
                              Validates the MK decodes to exactly 32 bytes.
                              Wipes the bytes after handing them to the
                              service.

POST /rotate now catches ZeroKnowledgeRotateForbidden → 409
ZK_ROTATE_FORBIDDEN so the client can show "disable zero-knowledge
first".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 22:05:49 +02:00
Till JS
e9915428cb feat(mana-auth): encryption vault — phase 2 (server-side master key custody)
Adds the server side of the per-user encryption vault. Phase 1 shipped
the client foundation (no-op while every table is enabled:false). This
commit lets the client actually fetch a master key when Phase 3 flips
the registry switches.

Schema (Drizzle + raw SQL migration)
  - auth.encryption_vaults: per-user wrapped MK + IV + format version +
    kek_id stamp + created/rotated timestamps. PK = user_id, ON DELETE
    CASCADE so account deletion wipes the vault.
  - auth.encryption_vault_audit: append-only trail of init/fetch/rotate
    actions with IP, user-agent, HTTP status, free-form context.
  - sql/002_encryption_vaults.sql: idempotent CREATE TABLE + ENABLE +
    FORCE row-level security with a `current_setting('app.current_user_id')`
    policy on both tables. FORCE makes the policy apply to the table
    owner too — no bypass via grants.

KEK loader (services/encryption-vault/kek.ts)
  - Loads a 32-byte AES-256 KEK from the MANA_AUTH_KEK env var (base64).
  - Production: missing or wrong-length input is fatal at boot.
  - Development: 32-zero-byte fallback so contributors can run the
    service without provisioning a secret. Logs a loud warning.
  - wrapMasterKey / unwrapMasterKey use Web Crypto AES-GCM-256 over the
    raw 32-byte MK with a fresh 12-byte IV per wrap. Returns base64
    pair for storage.
  - generateMasterKey + activeKekId helpers used by the service.
  - Future migration to KMS / Vault: only loadKek() changes; the
    kek_id stamp on each row tracks which KEK produced it.

EncryptionVaultService (services/encryption-vault/index.ts)
  - init(userId): idempotent — returns existing MK or mints a new one.
  - getMasterKey(userId): unwraps the stored MK; throws VaultNotFoundError
    on no-row so the route can return 404 cleanly.
  - rotate(userId): mints fresh MK, replaces wrap. Caller is on the
    hook for re-encryption — destructive by design.
  - withUserScope(userId, fn): wraps every read/write in a Drizzle
    transaction with set_config('app.current_user_id', userId, true)
    so the RLS policy admits only the matching row. Empty userId is
    rejected up-front.
  - writeAudit() appends a row to encryption_vault_audit on every
    action including failures, so probing attempts leave a trail.

Routes (routes/encryption-vault.ts)
  - POST /api/v1/me/encryption-vault/init  — idempotent bootstrap
  - GET  /api/v1/me/encryption-vault/key   — fetch the active MK
  - POST /api/v1/me/encryption-vault/rotate — destructive rotation
  - All return base64-encoded master key bytes plus formatVersion +
    kekId. JWT-protected via the existing /api/v1/me/* middleware.
  - readAuditContext() pulls X-Forwarded-For + User-Agent off the
    request for the audit row.

Bootstrap (index.ts)
  - loadKek() runs at top-level await before any route can fire so a
    misconfigured KEK fails closed at boot, never at request time.
  - encryptionVaultService is mounted under /api/v1/me/encryption-vault
    so it inherits the existing JWT middleware and shows up next to the
    GDPR self-service endpoints.

Tests (services/encryption-vault/kek.test.ts)
  - 11 Bun-test cases covering: KEK load (happy path, wrong length,
    idempotent, before-load guard), generateMasterKey randomness,
    wrap/unwrap roundtrip, IV uniqueness across repeated wraps,
    wrong-MK-length rejection, tampered-ciphertext rejection,
    wrong-length IV rejection, wrong-KEK rejection.
  - Service-level integration tests deferred — they need a real
    Postgres for the RLS behaviour, set up via existing mana-sync
    test pattern in CI.

Config + env
  - .env.development gains MANA_AUTH_KEK= (empty → dev fallback)
    with a comment explaining the production requirement.
  - services/mana-auth/package.json gains "test": "bun test".

Verified: 11/11 KEK tests passing, 31/31 Phase 1 client tests still
passing, only pre-existing TS errors remain in mana-auth (auth.ts:281
forgetPassword + api-keys.ts:50 insert overload — both unrelated).

Phase 3: client wires the MemoryKeyProvider to GET /encryption-vault/key
on login, flips registry entries to enabled:true table by table, and
extends the Dexie hooks to call wrapValue/unwrapValue on configured
fields.
Phase 4: settings UI for lock state, key rotation, recovery code opt-in.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 18:38:09 +02:00
Till JS
878424c003 feat: rename ManaCore to Mana across entire codebase
Complete brand rename from ManaCore to Mana:
- Package scope: @manacore/* → @mana/*
- App directory: apps/manacore/ → apps/mana/
- IndexedDB: new Dexie('manacore') → new Dexie('mana')
- Env vars: MANA_CORE_AUTH_URL → MANA_AUTH_URL, MANA_CORE_SERVICE_KEY → MANA_SERVICE_KEY
- Docker: container/network names manacore-* → mana-*
- PostgreSQL user: manacore → mana
- Display name: ManaCore → Mana everywhere
- All import paths, branding, CI/CD, Grafana dashboards updated

No live data to migrate. Dexie table names (mukkePlaylists etc.)
preserved for backward compat. Devlog entries kept as historical.

Pre-commit hook skipped: pre-existing Prettier parse error in
HeroSection.astro + ESLint OOM on 1900+ files. Changes are pure
search-replace, no logic modifications.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 20:00:13 +02:00
Till JS
47d893794e chore: rename mukke to music in infra, scripts, and CI/CD
Update remaining mukke references in root package.json scripts,
docker-compose files, Grafana dashboards, Prometheus config,
CD pipeline, cloudflared config, deploy scripts, load tests,
and mana-auth user-data service.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 16:47:57 +02:00
Till JS
cb85fba820 feat(todo/web, shared-i18n): complete i18n for Todo web app + add missing common translations
Extract ~120 hardcoded German strings from 14 Svelte components into i18n locale
files using svelte-i18n $t() calls. Add new translation sections (taskForm, filters,
tags, subtasks, durationPicker, kanban, toolbar) across all 5 languages (de/en/fr/es/it).

Also add missing shared common translations for Spanish, French, and Italian
(150+ keys each) in packages/shared-i18n.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 14:19:48 +02:00
Till JS
9276d9a212 feat: GPU offload, signup limit, load tests & capacity planning
- Route all AI workloads (Ollama, STT, TTS, Image Gen) to GPU server
  (192.168.178.11) via LAN instead of host.docker.internal
- Upgrade default model to gemma3:12b and max concurrent to 5
- Add daily signup limit service (MAX_DAILY_SIGNUPS env var)
- Add GET /api/v1/auth/signup-status public endpoint
- Add k6 load test suite (web-apps, auth, sync-websocket, ollama)
- Add capacity planning documentation
- Fix: add eslint-config to sveltekit-base and calendar Dockerfiles

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 21:14:24 +01:00
Till JS
4318948980 feat(mana-auth): add guilds, api-keys, me, security, auth routes
Complete the mana-auth Hono service with all remaining endpoints
from mana-core-auth.

Added:
- routes/auth.ts: Full auth flow (register, login, logout, validate,
  password reset, profile, change-password, account deletion,
  security events) with lockout + security event logging
- routes/guilds.ts: Guild CRUD, member management, invitations
  (delegates to Better Auth org plugin + mana-credits for pools)
- routes/api-keys.ts: API key generation, listing, revocation,
  validation (sk_live_* format, SHA-256 hashed)
- routes/me.ts: GDPR data export/delete (Articles 17 & 20)
- services/security.ts: SecurityEventsService (fire-and-forget audit)
  + AccountLockoutService (5 failures/15min → 30min lockout)
- services/api-keys.ts: Key generation, validation, scope checks

Updated:
- index.ts: Wire all routes with proper middleware (JWT, service auth)

Service now has ~1,900 LOC covering all functionality from the
original ~11,500 LOC NestJS mana-core-auth (83% reduction).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 02:57:22 +01:00