Fans out the cards-server fix from 08f422340 to every other Bun
service in the monorepo. --hot keeps the port bound across HMR
reloads via globalThis[hmrSymbol]; --watch restarts the process
and races old/new Bun.serve for the port.
Touched: dev:auth, dev:credits, dev:events, dev:analytics,
dev:memoro:server, dev:memoro:audio-server, dev:uload:server in
the root package.json plus the matching `dev` script in each
service's own package.json. All six services already export the
`{ port, fetch }` default that Bun's --hot expects.
Smoke-tested: pnpm dev:cardecky:full boots clean, then touching
auth/credits/cards-server entry files all hot-reload without
dropping their port.
(memoro/apps/audio-server doesn't have a `dev: bun --watch ...`
script in its own package.json, so only the root entry got the
swap there.)
First server-side error-tracking integration. Pattern mirrors the
client-side one in apps/mana/apps/web/src/hooks.client.ts:
- pull @mana/shared-error-tracking into mana-auth's deps (workspace pkg
with @sentry/node + a no-op fallback when GLITCHTIP_DSN is unset)
- call initErrorTracking() at the top of services/mana-auth/src/index.ts
before the rest of the module body executes — this lets Sentry hook
uncaughtException / unhandledRejection before any Hono handlers register
- wrap app.onError so non-HTTPException throws also flow into
captureException with path/method/query context. HTTPExceptions are
intentional 4xx/422 and stay out of the issue list (otherwise every
401 from a stale session would page somebody at 3am)
- compose: pass GLITCHTIP_DSN_MANA_PLATFORM through as GLITCHTIP_DSN per
service so each container's events get tagged with serverName='mana-auth'
DSN itself isn't in the repo; lives in .env.macmini on the Mac Mini and
is referenced from the Glitchtip credentials doc.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two interlocking fixes driven by a production lockout incident.
## Bug that motivated this
A fresh schema-drift column (auth.users.onboarding_completed_at) made
every Better Auth query crash with Postgres 42703. The /login wrapper
swallowed the non-2xx and mapped it onto a generic "401 Invalid
credentials" AND bumped the password lockout counter — so 5 legit
login attempts against a broken DB would have locked every real user
out of their own account. Same wrapper pattern on /register, /refresh,
/reset-password etc. The 30-minute hunt ended in a one-off repro
script that finally surfaced the real Postgres error.
The user-facing passkey button additionally returned generic 404s on
every login-page mount because the route wasn't registered (the DB
schema existed, the Better Auth plugin wasn't wired).
## Phase 1 — Error classification (services/mana-auth/src/lib/auth-errors)
- 19-code AuthErrorCode taxonomy (INVALID_CREDENTIALS, EMAIL_NOT_VERIFIED,
ACCOUNT_LOCKED, SERVICE_UNAVAILABLE, PASSKEY_VERIFICATION_FAILED, …)
- classifyFromResponse/classifyFromError handle: Better Auth APIError
(duck-typed on `name === 'APIError'`), Postgres errors (23505 unique,
42703/08xxx → infra), ZodError, fetch/ECONNREFUSED network errors,
bare Error, unknown.
- respondWithError routes the structured response, logs at the right
level, fires the correct security event, and CRITICALLY only bumps
the lockout counter for actual credential failures — SERVICE_UNAVAILABLE
and INTERNAL never touch lockout.
- All 12 endpoints in routes/auth.ts refactored (/login, /register,
/logout, /session-to-token, /refresh, /validate, /forgot-password,
/reset-password, /resend-verification, /profile GET+POST,
/change-email, /change-password, /account DELETE).
- Fixed pre-existing auth.api.forgetPassword typo (→ requestPasswordReset).
- shared-logger + requestLogger middleware wired in index.ts; all
console.* calls in the service removed.
## Phase 2 — Passkey end-to-end (@better-auth/passkey 1.6+)
- sql/007_passkey_bootstrap.sql: idempotent schema alignment —
friendly_name→name, +aaguid, transports jsonb→text, +method column
on login_attempts.
- better-auth.config.ts: passkey plugin wired with rpID/rpName/origin
from new webauthn config section. rpID defaults to mana.how in prod
(from COOKIE_DOMAIN), localhost in dev.
- routes/passkeys.ts: 7 wrapper endpoints (capability probe,
register/options+verify, authenticate/options+verify with JWT mint,
list, delete, rename). Each routes errors through the classifier;
authenticate/verify promotes generic INVALID_CREDENTIALS to
PASSKEY_VERIFICATION_FAILED.
- PasskeyRateLimitService: in-memory per-IP (options: 20/min) and
per-credential (verify: 10 failures/min → 5 min cooldown) buckets.
Deliberately separate from the password lockout — different factor,
different blast radius.
- Client: authService.getPasskeyCapability() async probe, memoised per
session. authStore.passkeyAvailable reactive state. LoginPage gates
on === true so a slow probe doesn't flash the button in.
- AuthResult grew a code: AuthErrorCode field; handleAuthError in
shared-auth prefers the server envelope over the legacy message
heuristics.
## Tests
- 30 unit tests for the classifier covering every branch (including
the exact Postgres 42703 shape that started this).
- 9 unit tests for the rate limiter.
- 14 integration tests for the auth routes — the regression test
explicitly asserts "upstream 500 → 503 + zero lockout bumps".
- 101 tests pass, 0 fail, 30 pre-existing skips unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Moves the canonical SpaceType + SPACE_MODULE_ALLOWLIST to @mana/shared-types
(framework-free) so the Bun services can consume them without pulling in
Svelte. shared-branding keeps only the UI-facing labels and descriptions
and re-exports the canonical types for frontend convenience.
Wires two Better Auth organization hooks in mana-auth:
- beforeCreateOrganization asserts metadata.type is a valid SpaceType,
rejecting the create with a BAD_REQUEST otherwise.
- beforeDeleteOrganization rejects deletion of the personal space.
Covered by bun tests (11 assertions) for the helper module.
No migration and no schema change — type lives in the existing
organization.metadata jsonb column.
Plan: docs/plans/spaces-foundation.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 1 of the Mission Key-Grant rollout. Webapp can now request a
wrapped per-mission data key; mana-ai can unwrap and (Phase 2) use it.
mana-auth:
- POST /api/v1/me/ai-mission-grant — HKDF-derives MDK from the user
master key, RSA-OAEP-2048-wraps with the mana-ai public key, returns
{ wrappedKey, derivation, issuedAt, expiresAt }
- MissionGrantService refuses zero-knowledge users (409 ZK_ACTIVE) and
returns 503 GRANT_NOT_CONFIGURED when MANA_AI_PUBLIC_KEY_PEM is unset
- TTL clamped to [1h, 30d]
mana-ai:
- configureMissionGrantKey + unwrapMissionGrant with structured failure
reasons (not-configured / expired / malformed / wrap-rejected)
- mana_ai.decrypt_audit table + RLS policy scoped to
app.current_user_id — append-only row per server-side decrypt attempt
- MANA_AI_PRIVATE_KEY_PEM env slot; absent = grants silently disabled
No existing behaviour changes: missions without a grant run exactly as
before. Grant flow is wired end-to-end but unused until Phase 2 lands
the encrypted resolver.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pre-launch theme system audit found multiple parallel layers in themes.css
(--theme-X full hsl strings, --X partial shadcn aliases, --color-X populated
by runtime store with raw channels) plus dead-code companion files. The
inconsistency caused light-mode regressions when scoped-CSS consumers
wrote `var(--color-X)` standalone — the variable holds raw HSL channels
which is invalid as a color value, browser fell back to inherited (white).
Rewrite to one consistent layer:
- Source of truth: --color-X defined as raw HSL channels (e.g.
`0 0% 17%`) in :root, .dark, and all variant [data-theme="..."]
blocks. Matches the format the runtime store
(@mana/shared-theme/src/utils.ts) writes, eliminating the
static-fallback-vs-runtime mismatch and the corresponding flash
of unstyled content on hydration.
- @theme inline uses self-reference + Tailwind v4 <alpha-value>
placeholder so utility classes generate correctly AND opacity
modifiers work: `text-foreground/50` → `hsl(var(--color-foreground) / 0.5)`.
- @layer components (.btn-primary, .card, .badge, etc.) wraps
var(--color-X) refs with hsl() — they were broken in light mode
too for the same reason.
Convention going forward (also documented in the file header):
1. Markup: use Tailwind utility classes (text-foreground, bg-card, …)
2. Scoped CSS: hsl(var(--color-X)) — always wrap with hsl()
3. NEVER raw var(--color-X) in CSS — that's the bug pattern
Net file: 692 → 580 LOC. Single source layer, no indirection.
Also delete dead companion files (zero imports anywhere):
- tailwind-v4.css (had broken self-reference, never imported)
- theme-variables.css (legacy hex-based palette)
- components.css (legacy component utilities)
- index.js / preset.js / colors.js (Tailwind v3 preset format,
irrelevant under Tailwind v4)
package.json exports map shrinks accordingly to just `./themes.css`.
Consumers using `hsl(var(--color-X))` (~379 files across mana-web,
manavoxel-web, arcade-web) keep working unchanged — the public API
name `--color-X` is preserved. Only the broken pattern `var(--color-X)`
(~61 files) needs a follow-up sweep, handled in a separate commit.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mana-auth has been crash-looping in production with:
error: Cannot find package 'nanoid' from
'/app/src/services/encryption-vault/index.ts'
The encryption-vault service imports nanoid for audit row IDs (line 27,
used at line 547 in the audit log writer), but nanoid was never added
to services/mana-auth/package.json. The import was introduced in commit
e9915428c (phase 2 — server-side master key custody) and slipped past
because nanoid happens to exist transitively in the workspace via
postcss → nanoid@3.3.11. Local pnpm store lookups would resolve it just
fine; a strict isolated container build can't.
Fix:
- Add "nanoid": "^5.0.0" to services/mana-auth/package.json deps
- pnpm install pulled nanoid@5.1.7 into services/mana-auth/node_modules
Verified the import resolves locally:
bun -e 'import { nanoid } from "nanoid"; console.log(nanoid())'
→ ok: 6TLuTWlenhC0KnSESn5Ex
The Mac Mini still needs to redeploy mana-auth (rebuild image with the
new lockfile, restart container) to pick this up — production is
currently 502ing on auth.mana.how.
Adds the server side of the per-user encryption vault. Phase 1 shipped
the client foundation (no-op while every table is enabled:false). This
commit lets the client actually fetch a master key when Phase 3 flips
the registry switches.
Schema (Drizzle + raw SQL migration)
- auth.encryption_vaults: per-user wrapped MK + IV + format version +
kek_id stamp + created/rotated timestamps. PK = user_id, ON DELETE
CASCADE so account deletion wipes the vault.
- auth.encryption_vault_audit: append-only trail of init/fetch/rotate
actions with IP, user-agent, HTTP status, free-form context.
- sql/002_encryption_vaults.sql: idempotent CREATE TABLE + ENABLE +
FORCE row-level security with a `current_setting('app.current_user_id')`
policy on both tables. FORCE makes the policy apply to the table
owner too — no bypass via grants.
KEK loader (services/encryption-vault/kek.ts)
- Loads a 32-byte AES-256 KEK from the MANA_AUTH_KEK env var (base64).
- Production: missing or wrong-length input is fatal at boot.
- Development: 32-zero-byte fallback so contributors can run the
service without provisioning a secret. Logs a loud warning.
- wrapMasterKey / unwrapMasterKey use Web Crypto AES-GCM-256 over the
raw 32-byte MK with a fresh 12-byte IV per wrap. Returns base64
pair for storage.
- generateMasterKey + activeKekId helpers used by the service.
- Future migration to KMS / Vault: only loadKek() changes; the
kek_id stamp on each row tracks which KEK produced it.
EncryptionVaultService (services/encryption-vault/index.ts)
- init(userId): idempotent — returns existing MK or mints a new one.
- getMasterKey(userId): unwraps the stored MK; throws VaultNotFoundError
on no-row so the route can return 404 cleanly.
- rotate(userId): mints fresh MK, replaces wrap. Caller is on the
hook for re-encryption — destructive by design.
- withUserScope(userId, fn): wraps every read/write in a Drizzle
transaction with set_config('app.current_user_id', userId, true)
so the RLS policy admits only the matching row. Empty userId is
rejected up-front.
- writeAudit() appends a row to encryption_vault_audit on every
action including failures, so probing attempts leave a trail.
Routes (routes/encryption-vault.ts)
- POST /api/v1/me/encryption-vault/init — idempotent bootstrap
- GET /api/v1/me/encryption-vault/key — fetch the active MK
- POST /api/v1/me/encryption-vault/rotate — destructive rotation
- All return base64-encoded master key bytes plus formatVersion +
kekId. JWT-protected via the existing /api/v1/me/* middleware.
- readAuditContext() pulls X-Forwarded-For + User-Agent off the
request for the audit row.
Bootstrap (index.ts)
- loadKek() runs at top-level await before any route can fire so a
misconfigured KEK fails closed at boot, never at request time.
- encryptionVaultService is mounted under /api/v1/me/encryption-vault
so it inherits the existing JWT middleware and shows up next to the
GDPR self-service endpoints.
Tests (services/encryption-vault/kek.test.ts)
- 11 Bun-test cases covering: KEK load (happy path, wrong length,
idempotent, before-load guard), generateMasterKey randomness,
wrap/unwrap roundtrip, IV uniqueness across repeated wraps,
wrong-MK-length rejection, tampered-ciphertext rejection,
wrong-length IV rejection, wrong-KEK rejection.
- Service-level integration tests deferred — they need a real
Postgres for the RLS behaviour, set up via existing mana-sync
test pattern in CI.
Config + env
- .env.development gains MANA_AUTH_KEK= (empty → dev fallback)
with a comment explaining the production requirement.
- services/mana-auth/package.json gains "test": "bun test".
Verified: 11/11 KEK tests passing, 31/31 Phase 1 client tests still
passing, only pre-existing TS errors remain in mana-auth (auth.ts:281
forgetPassword + api-keys.ts:50 insert overload — both unrelated).
Phase 3: client wires the MemoryKeyProvider to GET /encryption-vault/key
on login, flips registry entries to enabled:true table by table, and
extends the Dexie hooks to call wrapValue/unwrapValue on configured
fields.
Phase 4: settings UI for lock state, key rotation, recovery code opt-in.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace direct Brevo SMTP sending with HTTP calls to mana-notify's
notification API. This centralizes all email configuration in one
service (mana-notify) and removes the nodemailer dependency from
mana-auth. SMTP provider is now swappable via a single env var.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rewrite the central authentication service from NestJS to Hono + Bun.
Uses Better Auth's native fetch-based handler — no Express conversion.
Key architecture changes:
- Better Auth handler mounted directly on Hono (app.all('/api/auth/*'))
- No NestJS DI, modules, guards, decorators — plain TypeScript
- JWT validation via jose (same as extracted services)
- Email via nodemailer (simplified, German templates)
- ~1,400 LOC vs ~11,500 LOC in NestJS (88% reduction)
Service structure:
- auth/better-auth.config.ts — copied from mana-core-auth (framework-agnostic)
- auth/stores.ts — in-memory stores for email redirect URLs
- email/send.ts — nodemailer email functions
- middleware/ — JWT auth, service auth, error handler (shared pattern)
- db/schema/ — copied from mana-core-auth (Drizzle, framework-agnostic)
Port: 3001 (same as mana-core-auth — drop-in replacement)
Database: mana_auth (same DB, same schemas)
Better Auth plugins: Organization, JWT (EdDSA), OIDC Provider,
Two-Factor (TOTP), Magic Link
Note: This is the initial version. Guilds, API keys, Me (GDPR),
security (lockout/audit), and admin endpoints will be added
incrementally. The old mana-core-auth remains until fully replaced.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>