managarten/services/mana-auth/src/config.ts
Till JS e66654068f feat(auth): error-classification layer + passkey end-to-end
Two interlocking fixes driven by a production lockout incident.

## Bug that motivated this

A fresh schema-drift column (auth.users.onboarding_completed_at) made
every Better Auth query crash with Postgres 42703. The /login wrapper
swallowed the non-2xx and mapped it onto a generic "401 Invalid
credentials" AND bumped the password lockout counter — so 5 legit
login attempts against a broken DB would have locked every real user
out of their own account. Same wrapper pattern on /register, /refresh,
/reset-password etc. The 30-minute hunt ended in a one-off repro
script that finally surfaced the real Postgres error.

The user-facing passkey button additionally returned generic 404s on
every login-page mount because the route wasn't registered (the DB
schema existed, the Better Auth plugin wasn't wired).

## Phase 1 — Error classification (services/mana-auth/src/lib/auth-errors)

- 19-code AuthErrorCode taxonomy (INVALID_CREDENTIALS, EMAIL_NOT_VERIFIED,
  ACCOUNT_LOCKED, SERVICE_UNAVAILABLE, PASSKEY_VERIFICATION_FAILED, …)
- classifyFromResponse/classifyFromError handle: Better Auth APIError
  (duck-typed on `name === 'APIError'`), Postgres errors (23505 unique,
  42703/08xxx → infra), ZodError, fetch/ECONNREFUSED network errors,
  bare Error, unknown.
- respondWithError routes the structured response, logs at the right
  level, fires the correct security event, and CRITICALLY only bumps
  the lockout counter for actual credential failures — SERVICE_UNAVAILABLE
  and INTERNAL never touch lockout.
- All 12 endpoints in routes/auth.ts refactored (/login, /register,
  /logout, /session-to-token, /refresh, /validate, /forgot-password,
  /reset-password, /resend-verification, /profile GET+POST,
  /change-email, /change-password, /account DELETE).
- Fixed pre-existing auth.api.forgetPassword typo (→ requestPasswordReset).
- shared-logger + requestLogger middleware wired in index.ts; all
  console.* calls in the service removed.
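
The classification rule described above can be sketched as follows. This is a minimal illustration, not the code in `lib/auth-errors`: the names `classifySketch` and `bumpsLockout` are hypothetical, the error-code union is a small subset of the 19-code taxonomy, and the sketch assumes Postgres errors expose their SQLSTATE as a string `code` and that a Better Auth `APIError` carries a numeric `statusCode`.

```typescript
// Hypothetical subset of the taxonomy; only codes named in this commit message.
type AuthErrorCode = 'INVALID_CREDENTIALS' | 'SERVICE_UNAVAILABLE' | 'INTERNAL';

// Assumption: Postgres driver errors expose the SQLSTATE as a string `code`.
function isInfraPgCode(code: unknown): boolean {
  if (typeof code !== 'string') return false;
  // 42703 = undefined column (schema drift), class 08 = connection exceptions.
  return code === '42703' || code.startsWith('08');
}

function classifySketch(err: unknown): AuthErrorCode {
  if (typeof err !== 'object' || err === null) return 'INTERNAL';
  const e = err as { name?: unknown; code?: unknown; statusCode?: unknown };

  // Infrastructure failures must never look like a wrong password.
  if (isInfraPgCode(e.code)) return 'SERVICE_UNAVAILABLE';

  // Duck-typed APIError check; `statusCode` is an assumed field.
  if (e.name === 'APIError' && e.statusCode === 401) return 'INVALID_CREDENTIALS';

  return 'INTERNAL';
}

// Only genuine credential failures count toward the password lockout.
function bumpsLockout(code: AuthErrorCode): boolean {
  return code === 'INVALID_CREDENTIALS';
}
```

With this shape, a Postgres 42703 during `/login` classifies as `SERVICE_UNAVAILABLE` and leaves the lockout counter untouched, which is the regression the incident exposed.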

## Phase 2 — Passkey end-to-end (@better-auth/passkey 1.6+)

- sql/007_passkey_bootstrap.sql: idempotent schema alignment —
  friendly_name→name, +aaguid, transports jsonb→text, +method column
  on login_attempts.
- better-auth.config.ts: passkey plugin wired with rpID/rpName/origin
  from new webauthn config section. rpID defaults to mana.how in prod
  (from COOKIE_DOMAIN), localhost in dev.
- routes/passkeys.ts: 7 wrapper endpoints (capability probe,
  register/options+verify, authenticate/options+verify with JWT mint,
  list, delete, rename). Each routes errors through the classifier;
  authenticate/verify promotes generic INVALID_CREDENTIALS to
  PASSKEY_VERIFICATION_FAILED.
- PasskeyRateLimitService: in-memory per-IP (options: 20/min) and
  per-credential (verify: 10 failures/min → 5 min cooldown) buckets.
  Deliberately separate from the password lockout — different factor,
  different blast radius.
- Client: authService.getPasskeyCapability() async probe, memoised per
  session. authStore.passkeyAvailable reactive state. LoginPage gates
  on === true so a slow probe doesn't flash the button in.
- AuthResult grew a code: AuthErrorCode field; handleAuthError in
  shared-auth prefers the server envelope over the legacy message
  heuristics.
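
The per-credential verify bucket (10 failures per minute, then a 5 minute cooldown) could look roughly like the sketch below. It is an illustration only: the class name `FailureCooldown`, its methods, and the explicit `now` parameter are hypothetical and are not taken from `PasskeyRateLimitService`.

```typescript
// Hypothetical in-memory failure bucket with a cooldown; the caller passes
// timestamps in explicitly so the behaviour is deterministic.
class FailureCooldown {
  private readonly maxFailures: number;
  private readonly windowMs: number;
  private readonly cooldownMs: number;
  private readonly failures = new Map<string, number[]>();
  private readonly blockedUntil = new Map<string, number>();

  constructor(maxFailures = 10, windowMs = 60000, cooldownMs = 300000) {
    this.maxFailures = maxFailures;
    this.windowMs = windowMs;
    this.cooldownMs = cooldownMs;
  }

  // True while the key is inside its cooldown period.
  isBlocked(key: string, now: number): boolean {
    const until = this.blockedUntil.get(key);
    if (until === undefined) return false;
    if (now >= until) {
      this.blockedUntil.delete(key);
      return false;
    }
    return true;
  }

  // Record one failed verification; start the cooldown at the threshold.
  recordFailure(key: string, now: number): void {
    const recent = (this.failures.get(key) ?? []).filter((t) => now - t < this.windowMs);
    recent.push(now);
    if (recent.length >= this.maxFailures) {
      this.blockedUntil.set(key, now + this.cooldownMs);
      this.failures.delete(key);
    } else {
      this.failures.set(key, recent);
    }
  }
}
```

Keeping this state separate from the password lockout means a burst of failed passkey assertions cannot lock a user out of password login, and vice versa.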

## Tests

- 30 unit tests for the classifier covering every branch (including
  the exact Postgres 42703 shape that started this).
- 9 unit tests for the rate limiter.
- 14 integration tests for the auth routes — the regression test
  explicitly asserts "upstream 500 → 503 + zero lockout bumps".
- 101 tests pass, 0 fail, 30 pre-existing skips unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 01:52:51 +02:00


export interface Config {
  port: number;
  databaseUrl: string;
  syncDatabaseUrl: string;
  baseUrl: string;
  cookieDomain: string;
  nodeEnv: string;
  serviceKey: string;
  cors: { origins: string[] };
  manaNotifyUrl: string;
  manaCreditsUrl: string;
  manaSubscriptionsUrl: string;
  manaMailUrl: string;
  /** Base64-encoded 32-byte AES-256 key encryption key (KEK). Wraps each
   * user's master key in auth.encryption_vaults. Required in production;
   * in development a deterministic dev KEK is filled in so the service
   * still boots, with a loud warning. */
  encryptionKek: string;
  /**
   * PEM-encoded RSA-OAEP-2048 public key for the mana-ai Mission
   * Grant runner. The `/me/ai-mission-grant` endpoint wraps per-mission
   * data keys with this public key so only mana-ai (holder of the
   * paired private key) can unwrap them. Optional at boot: when
   * absent, the endpoint returns 503 so the UI can degrade to
   * foreground-only execution.
   */
  missionGrantPublicKeyPem?: string;
  /** WebAuthn passkey settings. `rpId` is the effective domain the
   * authenticator binds credentials to: `mana.how` in prod (scopes
   * passkeys across all subdomains) and `localhost` in dev. `origin`
   * is the URL (or list of URLs) where the browser made the WebAuthn
   * call; mismatches cause the verification step to fail with
   * `invalid origin`. `rpName` is shown to the user in the
   * authenticator prompt ("Register a passkey for Mana"). */
  webauthn: {
    rpId: string;
    rpName: string;
    origin: string | string[];
  };
}

export function loadConfig(): Config {
  // Empty strings count as unset: `||` falls through to the fallback.
  const env = (key: string, fallback?: string) => process.env[key] || fallback || '';
  const nodeEnv = env('NODE_ENV', 'development');

  // Encryption KEK: in production a missing value is fatal, because the
  // vault service refuses to mint or unwrap any master keys without a
  // real KEK. Only presence is checked here; length is not validated in
  // this function. In development we auto-fill with a deterministic dev
  // key so contributors can run the service without setting up a secret.
  let encryptionKek = env('MANA_AUTH_KEK');
  if (!encryptionKek) {
    if (nodeEnv === 'production') {
      throw new Error(
        'mana-auth: MANA_AUTH_KEK env var is required in production. ' +
          'Set it to a base64-encoded 32-byte random value: ' +
          '`openssl rand -base64 32`'
      );
    }
    // 32 zero bytes: deterministic, obviously not for production. The
    // vault service logs a loud warning at startup when it sees this.
    encryptionKek = 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=';
  }

  const corsOrigins = env('CORS_ORIGINS', 'http://localhost:5173').split(',');

  // WebAuthn: derive sensible defaults from the auth service's
  // BASE_URL + COOKIE_DOMAIN so a dev never has to set three extra
  // env vars. In prod, override explicitly.
  //
  // rpId must be the bare effective domain (no protocol, no port).
  // A mismatch between rpId and the client's origin hostname causes
  // SecurityError at registration time. rpId is derived from
  // COOKIE_DOMAIN with its leading dot stripped (`.mana.how` becomes
  // `mana.how`); when COOKIE_DOMAIN is unset it falls back to the
  // hostname of BASE_URL.
  const cookieDomain = env('COOKIE_DOMAIN');
  const defaultRpId = cookieDomain
    ? cookieDomain.replace(/^\./, '')
    : new URL(env('BASE_URL', 'http://localhost:3001')).hostname;

  return {
    port: parseInt(env('PORT', '3001'), 10),
    databaseUrl: env('DATABASE_URL', 'postgresql://mana:devpassword@localhost:5432/mana_platform'),
    syncDatabaseUrl: env(
      'SYNC_DATABASE_URL',
      'postgresql://mana:devpassword@localhost:5432/mana_sync'
    ),
    baseUrl: env('BASE_URL', 'http://localhost:3001'),
    cookieDomain,
    nodeEnv,
    serviceKey: env('MANA_SERVICE_KEY', 'dev-service-key'),
    cors: { origins: corsOrigins },
    manaNotifyUrl: env('MANA_NOTIFY_URL', 'http://localhost:3013'),
    manaCreditsUrl: env('MANA_CREDITS_URL', 'http://localhost:3061'),
    manaSubscriptionsUrl: env('MANA_SUBSCRIPTIONS_URL', 'http://localhost:3063'),
    manaMailUrl: env('MANA_MAIL_URL', 'http://localhost:3042'),
    encryptionKek,
    missionGrantPublicKeyPem: env('MANA_AI_PUBLIC_KEY_PEM') || undefined,
    webauthn: {
      rpId: env('WEBAUTHN_RP_ID', defaultRpId),
      rpName: env('WEBAUTHN_RP_NAME', 'Mana'),
      // Pass every CORS origin as an allowed WebAuthn origin by default
      // so the same passkey works from any app subdomain. Override
      // with WEBAUTHN_ORIGIN (comma-separated) to restrict further.
      origin: env('WEBAUTHN_ORIGIN') ? env('WEBAUTHN_ORIGIN').split(',') : corsOrigins,
    },
  };
}
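
The WebAuthn defaults in `loadConfig` can be exercised in isolation with a small sketch like the one below. The helper names `deriveRpId` and `resolveOrigins` are hypothetical and do not exist in this file; they only restate the derivation rules from the comments above as pure functions so the expected values are easy to check.

```typescript
// Hypothetical helper: the same rule as `defaultRpId` in loadConfig.
// A leading dot on the cookie domain is stripped; with no cookie
// domain, the hostname of the base URL is used instead.
function deriveRpId(cookieDomain: string, baseUrl: string): string {
  return cookieDomain ? cookieDomain.replace(/^\./, '') : new URL(baseUrl).hostname;
}

// Hypothetical helper: the same rule as `webauthn.origin` in loadConfig.
// An explicit comma-separated override wins; otherwise every CORS
// origin is accepted as a WebAuthn origin.
function resolveOrigins(webauthnOrigin: string, corsOrigins: string[]): string[] {
  return webauthnOrigin ? webauthnOrigin.split(',') : corsOrigins;
}

const prodRpId = deriveRpId('.mana.how', 'https://auth.mana.how');
const devRpId = deriveRpId('', 'http://localhost:3001');
const devOrigins = resolveOrigins('', ['http://localhost:5173']);
```

Here `prodRpId` is `mana.how`, `devRpId` is `localhost`, and `devOrigins` is the CORS list unchanged. Note that neither the original nor this sketch trims whitespace around comma-separated entries, so `WEBAUTHN_ORIGIN` and `CORS_ORIGINS` should be written without spaces.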