Commit graph

6 commits

Author SHA1 Message Date
Till JS
8e8b6ac65f fix(mana-auth) + chore: rewrite /api/v1/auth/login JWT mint, remove Matrix stack
This commit bundles two unrelated changes that were swept together by an
accidental `git add -A` in another working session. Documented here so the
history reflects what's actually inside.

═══════════════════════════════════════════════════════════════════════
1. fix(mana-auth): /api/v1/auth/login mints JWT via auth.handler instead
   of api.signInEmail
═══════════════════════════════════════════════════════════════════════

Previous attempt (commit 55cc75e7d) tried to fix the broken JWT mint in
/api/v1/auth/login by switching the cookie name from `mana.session_token`
to `__Secure-mana.session_token` for production. That was necessary but
not sufficient: Better Auth's session cookie value isn't just the raw
session token, it's `<token>.<HMAC>` where the HMAC is derived from the
better-auth secret. Reconstructing the cookie from auth.api.signInEmail's
JSON response only gave us the raw token, so /api/auth/token's
get-session middleware still couldn't validate it and the JWT mint kept
silently failing.

Real fix: do the sign-in via auth.handler (the HTTP path) rather than
auth.api.signInEmail (the SDK path). The handler returns a real fetch
Response with a Set-Cookie header containing the fully signed cookie
envelope. We capture that header verbatim and forward it as the cookie
on the /api/auth/token request, which now passes validation and mints
the JWT correctly.

Verified end-to-end on auth.mana.how:

  $ curl -X POST https://auth.mana.how/api/v1/auth/login \
      -d '{"email":"...","password":"..."}'
  {
    "user": {...},
    "token": "<session token>",
    "accessToken": "eyJhbGciOiJFZERTQSI...",   ← real JWT now
    "refreshToken": "<session token>"
  }

Side benefits:
- Email-not-verified path is now handled by checking
  signInResponse.status === 403 directly, no more catching APIError
  with the comment-noted async-stream footgun.
- X-Forwarded-For is forwarded explicitly so Better Auth's rate limiter
  and our security log see the real client IP.
- The leftover catch block now only handles unexpected exceptions
  (network errors etc); the FORBIDDEN-checking logic in it is dead but
  harmless and left in for defense in depth.

═══════════════════════════════════════════════════════════════════════
2. chore: remove the entire self-hosted Matrix stack (Synapse, Element,
   Manalink, mana-matrix-bot)
═══════════════════════════════════════════════════════════════════════

The Matrix subsystem ran parallel to the main Mana product without any
load-bearing integration: the unified web app never imported matrix-js-sdk,
the chat module uses mana-sync (local-first), and mana-matrix-bot's
plugins duplicated features the unified app already ships natively.
Keeping it alive cost a Synapse + Element + matrix-web + bot container
quartet, three Cloudflare routes, an OIDC provider plugin in mana-auth,
and a steady drip of devlog/dependency churn.

Removed:
- apps/matrix (Manalink web + mobile, ~150 files)
- services/mana-matrix-bot (Go bot with ~20 plugins)
- docker/matrix configs (Synapse + Element)
- synapse/element-web/matrix-web/mana-matrix-bot services in
  docker-compose.macmini.yml
- matrix.mana.how/element.mana.how/link.mana.how Cloudflare tunnel routes
- OIDC provider plugin + matrix-synapse trustedClient + matrixUserLinks
  table from mana-auth (oauth_* schema definitions also removed)
- MatrixService import path in mana-media (importFromMatrix endpoint)
- Matrix notification channel in mana-notify (worker, metrics, config,
  channel_type enum, MatrixOptions handler)
- Matrix entries from shared-branding (mana-apps + app-icons),
  notify-client, the i18n bundle, the observatory map, the credits
  app-label list, the landing footer/apps page, the prometheus + alerts
  + promtail tier mappings, and the matrix-related deploy paths in
  cd-macmini.yml + ci.yml

Devlog/manascore/blueprint entries that mention Matrix are left intact
as historical record. The oauth_* + matrix_user_links Postgres tables
stay on existing prod databases — code can no longer write to them, drop
them in a follow-up migration if you want them gone for real.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 16:32:13 +02:00
Till JS
f46d1328d8 feat(mana-auth): phase 9 milestone 2 — vault recovery wrap + zero-knowledge
Server-side support for the Phase 9 zero-knowledge opt-in. Adds the
recovery-wrap columns + four new vault operations + the routes that
expose them.

Schema (sql/003_recovery_wrap.sql)
----------------------------------
Adds to auth.encryption_vaults:

  - recovery_wrapped_mk    text                  (NULL until set)
  - recovery_iv            text                  (NULL until set)
  - recovery_format_version smallint NOT NULL DEFAULT 1
  - recovery_set_at        timestamptz
  - zero_knowledge         boolean NOT NULL DEFAULT false

Drops NOT NULL from wrapped_mk + wrap_iv (a vault in zero-knowledge
mode has no server-side wrap at all).

Three CHECK constraints enforce the invariant at the DB level so no
service bug can leave a vault in an inconsistent state:

  - encryption_vaults_has_wrap         — at least one of (wrapped_mk,
                                          recovery_wrapped_mk) is set
  - encryption_vaults_wrap_iv_pair     — ciphertext + IV are paired
                                          (both NULL or both set) on
                                          each wrap form
  - encryption_vaults_zk_consistency   — zero_knowledge=true implies
                                          wrapped_mk IS NULL AND
                                          recovery_wrapped_mk IS NOT NULL

If a code-level bug ever tried to enable ZK without a recovery wrap,
or to leave both wraps empty, Postgres would reject the UPDATE.

Drizzle schema (db/schema/encryption-vaults.ts)
-----------------------------------------------
Mirrors the migration: wrappedMk + wrapIv become nullable, the four
new columns added with the right defaults. Inline doc comment explains
the zero-knowledge fork.

Service (services/encryption-vault/index.ts)
--------------------------------------------
VaultFetchResult gains optional `requiresRecoveryCode` /
`recoveryWrappedMk` / `recoveryIv` so the route handler can serialize
the right shape. masterKey becomes Uint8Array | null (null in ZK mode).

Existing methods updated:
  - init: branches on row.zeroKnowledge — returns the recovery blob
    instead of an unwrapped MK if the user is already in ZK mode
  - getMasterKey: same fork, with audit context "zk-recovery-blob"
  - rotate: throws ZeroKnowledgeRotateForbidden in ZK mode (the server
    can't re-wrap a key it can't read). Also wipes any stale recovery
    wrap on rotation — the new MK has nothing to do with the old one,
    so the old recovery code would unwrap into garbage.

New methods:
  - setRecoveryWrap(userId, { recoveryWrappedMk, recoveryIv }, ctx)
    Stores (or replaces) the user's recovery wrap. Idempotent.
  - clearRecoveryWrap(userId, ctx)
    Removes the recovery wrap. Forbidden if ZK is active (would lock
    the user out) — throws ZeroKnowledgeActiveError → 409.
  - enableZeroKnowledge(userId, ctx)
    NULLs out wrapped_mk + wrap_iv, sets zero_knowledge=true. Requires
    a recovery wrap to already be present — throws
    RecoveryWrapMissingError → 400 otherwise. Idempotent on already-on.
  - disableZeroKnowledge(userId, mkBytes, ctx)
    Inverse: takes a freshly-unwrapped MK from the client, KEK-wraps
    it, stores as wrapped_mk, flips zero_knowledge=false. The client
    is the only entity that can supply the MK at this point, since
    the server can't decrypt the recovery wrap.

Three new error classes:
  - RecoveryWrapMissingError → 400 RECOVERY_WRAP_MISSING
  - ZeroKnowledgeActiveError → 409 ZK_ACTIVE
  - ZeroKnowledgeRotateForbidden → 409 ZK_ROTATE_FORBIDDEN

Audit action union extended with:
  - 'recovery_set' | 'recovery_clear' | 'zk_enable' | 'zk_disable'

Routes (routes/encryption-vault.ts)
-----------------------------------
GET /key + POST /init now share a serializeFetchResult helper that
returns either:
  - { masterKey, formatVersion, kekId }                 (standard)
  - { requiresRecoveryCode: true, recoveryWrappedMk,    (ZK mode)
      recoveryIv, formatVersion }

Three new routes:
  - POST   /recovery-wrap   — body: { recoveryWrappedMk, recoveryIv }
                              Stores the wrap. Validates both fields
                              are non-empty strings.
  - DELETE /recovery-wrap   — Removes the wrap. 409 if ZK active.
  - POST   /zero-knowledge  — body: { enable: boolean, masterKey?: base64 }
                              enable=true:  flip on (no body MK needed)
                              enable=false: flip off (MK required)
                              Validates the MK decodes to exactly 32 bytes.
                              Wipes the bytes after handing them to the
                              service.

POST /rotate now catches ZeroKnowledgeRotateForbidden → 409
ZK_ROTATE_FORBIDDEN so the client can show "disable zero-knowledge
first".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 22:05:49 +02:00
Till JS
e9915428cb feat(mana-auth): encryption vault — phase 2 (server-side master key custody)
Adds the server side of the per-user encryption vault. Phase 1 shipped
the client foundation (no-op while every table is enabled:false). This
commit lets the client actually fetch a master key when Phase 3 flips
the registry switches.

Schema (Drizzle + raw SQL migration)
  - auth.encryption_vaults: per-user wrapped MK + IV + format version +
    kek_id stamp + created/rotated timestamps. PK = user_id, ON DELETE
    CASCADE so account deletion wipes the vault.
  - auth.encryption_vault_audit: append-only trail of init/fetch/rotate
    actions with IP, user-agent, HTTP status, free-form context.
  - sql/002_encryption_vaults.sql: idempotent CREATE TABLE + ENABLE +
    FORCE row-level security with a `current_setting('app.current_user_id')`
    policy on both tables. FORCE makes the policy apply to the table
    owner too — no bypass via grants.

KEK loader (services/encryption-vault/kek.ts)
  - Loads a 32-byte AES-256 KEK from the MANA_AUTH_KEK env var (base64).
  - Production: missing or wrong-length input is fatal at boot.
  - Development: 32-zero-byte fallback so contributors can run the
    service without provisioning a secret. Logs a loud warning.
  - wrapMasterKey / unwrapMasterKey use Web Crypto AES-GCM-256 over the
    raw 32-byte MK with a fresh 12-byte IV per wrap. Returns base64
    pair for storage.
  - generateMasterKey + activeKekId helpers used by the service.
  - Future migration to KMS / Vault: only loadKek() changes; the
    kek_id stamp on each row tracks which KEK produced it.

EncryptionVaultService (services/encryption-vault/index.ts)
  - init(userId): idempotent — returns existing MK or mints a new one.
  - getMasterKey(userId): unwraps the stored MK; throws VaultNotFoundError
    on no-row so the route can return 404 cleanly.
  - rotate(userId): mints fresh MK, replaces wrap. Caller is on the
    hook for re-encryption — destructive by design.
  - withUserScope(userId, fn): wraps every read/write in a Drizzle
    transaction with set_config('app.current_user_id', userId, true)
    so the RLS policy admits only the matching row. Empty userId is
    rejected up-front.
  - writeAudit() appends a row to encryption_vault_audit on every
    action including failures, so probing attempts leave a trail.

Routes (routes/encryption-vault.ts)
  - POST /api/v1/me/encryption-vault/init  — idempotent bootstrap
  - GET  /api/v1/me/encryption-vault/key   — fetch the active MK
  - POST /api/v1/me/encryption-vault/rotate — destructive rotation
  - All return base64-encoded master key bytes plus formatVersion +
    kekId. JWT-protected via the existing /api/v1/me/* middleware.
  - readAuditContext() pulls X-Forwarded-For + User-Agent off the
    request for the audit row.

Bootstrap (index.ts)
  - loadKek() runs at top-level await before any route can fire so a
    misconfigured KEK fails closed at boot, never at request time.
  - encryptionVaultService is mounted under /api/v1/me/encryption-vault
    so it inherits the existing JWT middleware and shows up next to the
    GDPR self-service endpoints.

Tests (services/encryption-vault/kek.test.ts)
  - 11 Bun-test cases covering: KEK load (happy path, wrong length,
    idempotent, before-load guard), generateMasterKey randomness,
    wrap/unwrap roundtrip, IV uniqueness across repeated wraps,
    wrong-MK-length rejection, tampered-ciphertext rejection,
    wrong-length IV rejection, wrong-KEK rejection.
  - Service-level integration tests deferred — they need a real
    Postgres for the RLS behaviour, set up via existing mana-sync
    test pattern in CI.

Config + env
  - .env.development gains MANA_AUTH_KEK= (empty → dev fallback)
    with a comment explaining the production requirement.
  - services/mana-auth/package.json gains "test": "bun test".

Verified: 11/11 KEK tests passing, 31/31 Phase 1 client tests still
passing, only pre-existing TS errors remain in mana-auth (auth.ts:281
forgetPassword + api-keys.ts:50 insert overload — both unrelated).

Phase 3: client wires the MemoryKeyProvider to GET /encryption-vault/key
on login, flips registry entries to enabled:true table by table, and
extends the Dexie hooks to call wrapValue/unwrapValue on configured
fields.
Phase 4: settings UI for lock state, key rotation, recovery code opt-in.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 18:38:09 +02:00
Till JS
b737240ec1 feat(auth): add access tier system for phased app releases
Introduces a tiered access control system so apps can be released
gradually (founder → alpha → beta → public) without extra infrastructure.
Users are gated at the AuthGate level based on their tier vs the app's
requiredTier. All apps remain deployed and reachable, but only users
with sufficient tier can enter.

- Add accessTier enum + column to users schema (default: 'public')
- Add tier claim to JWT payload in better-auth config
- Add requiredTier field to ManaApp interface + all 25 apps
- Add hasAppAccess(), getAccessibleManaApps(), ACCESS_TIER_LABELS
- Update AuthGate with tier check + access denied screen
- Update getPillAppItems + Home page to filter by user tier
- Update all 22 app layouts to pass user tier to PillNav
- Add admin API: GET/PUT /api/v1/admin/users/:id/tier
- Document access tier system in CLAUDE.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 21:50:06 +02:00
Till JS
09ccf32091 fix(mana-auth): fix schema import paths (.schema → .ts)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 18:17:57 +01:00
Till JS
61ee1ae269 feat(services): create mana-auth (Hono + Bun) — Phase 5 auth rewrite
Rewrite the central authentication service from NestJS to Hono + Bun.
Uses Better Auth's native fetch-based handler — no Express conversion.

Key architecture changes:
- Better Auth handler mounted directly on Hono (app.all('/api/auth/*'))
- No NestJS DI, modules, guards, decorators — plain TypeScript
- JWT validation via jose (same as extracted services)
- Email via nodemailer (simplified, German templates)
- ~1,400 LOC vs ~11,500 LOC in NestJS (88% reduction)

Service structure:
- auth/better-auth.config.ts — copied from mana-core-auth (framework-agnostic)
- auth/stores.ts — in-memory stores for email redirect URLs
- email/send.ts — nodemailer email functions
- middleware/ — JWT auth, service auth, error handler (shared pattern)
- db/schema/ — copied from mana-core-auth (Drizzle, framework-agnostic)

Port: 3001 (same as mana-core-auth — drop-in replacement)
Database: mana_auth (same DB, same schemas)

Better Auth plugins: Organization, JWT (EdDSA), OIDC Provider,
Two-Factor (TOTP), Magic Link

Note: This is the initial version. Guilds, API keys, Me (GDPR),
security (lockout/audit), and admin endpoints will be added
incrementally. The old mana-core-auth remains until fully replaced.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 02:43:44 +01:00