mirror of
https://github.com/Memo-2023/mana-monorepo.git
synced 2026-05-15 01:41:08 +02:00
While adding negative-path integration tests for the auth flow I
discovered that *neither* of the lockout primitives in
services/mana-auth/src/services/security.ts has actually been
working in production. Two independent silent failures that combined
into a "the lockout never triggers, ever" outcome:
1. recordAttempt() inserted into auth.login_attempts with explicit
`id = gen_random_uuid()`, but auth.login_attempts.id is a
`serial integer` column with `nextval('auth.login_attempts_id_seq')`
as default. The UUID-into-integer cast threw a type error every
single time, the bare `catch {}` swallowed it as "non-critical",
and not a single login attempt was ever persisted. Lockout's "5
failures in 15 min" check was running against an empty table.
2. checkLockout() built `attempted_at > ${new Date(...)}` via the
drizzle sql template, but postgres-js cannot bind a JS Date object
directly — it tries to byteLength() the parameter and crashes with
`Received an instance of Date`. Same anti-pattern: bare `catch`,
returns `{locked: false}` (fail-open), no log, completely invisible.
Both belong to the "silently broken since the encryption-vault series of
changes" class, caught only because the integration test for the lockout
flow expected the 6th login attempt to return 429 and got 200 instead.
Fixes:
- recordAttempt(): drop the bogus `id` column from the INSERT (let the
sequence default assign it), default ipAddress to null instead of
letting `${undefined}` collapse the parameter slot, and surface
errors in the catch instead of swallowing them silently.
- checkLockout(): pass `windowStart.toISOString()` instead of the Date
object so postgres-js can serialize it. Same catch upgrade — log the
cause when failing open.
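The two binding fixes and the catch upgrade can be sketched roughly as follows. This is not the code in security.ts: the helper names (`buildRecordAttempt`, `buildCheckLockout`, `failOpen`) and the `email`, `successful` and `ip_address` columns are assumptions made for illustration; only `auth.login_attempts`, its `id` column and `attempted_at` come from the description above.

```typescript
// Hypothetical input shape; the real recordAttempt() signature may differ.
interface AttemptInput {
  email: string;
  successful: boolean;
  ipAddress?: string;
}

interface BoundQuery {
  text: string;
  params: (string | boolean | null)[];
}

const WINDOW_MS = 15 * 60 * 1000; // the "15 min" lockout window

// No `id` in the column list, so the sequence default assigns it.
// ipAddress falls back to null so the parameter slot is never undefined.
function buildRecordAttempt(input: AttemptInput): BoundQuery {
  return {
    text:
      "INSERT INTO auth.login_attempts (email, successful, ip_address) " +
      "VALUES ($1, $2, $3)",
    params: [input.email, input.successful, input.ipAddress ?? null],
  };
}

// The window start is bound as an ISO-8601 string, not as a Date object.
function buildCheckLockout(email: string, now: Date): BoundQuery {
  const windowStart = new Date(now.getTime() - WINDOW_MS);
  return {
    text:
      "SELECT count(*)::int AS failures FROM auth.login_attempts " +
      "WHERE email = $1 AND successful = false AND attempted_at > $2",
    params: [email, windowStart.toISOString()],
  };
}

// Failing open is kept, but the cause is logged instead of swallowed.
async function failOpen<T>(
  run: () => Promise<T>,
  fallback: T,
  log: (msg: string, err: unknown) => void = console.error,
): Promise<T> {
  try {
    return await run();
  } catch (err) {
    log("lockout check failed, failing open", err);
    return fallback;
  }
}
```

Keeping the query construction separate from execution also makes the parameter types assertable without a database, which is the kind of check that would have flagged both bugs early.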
Failure-path test additions (tests/integration/auth-failures.test.ts):
- wrong password: assert 401, no JWT, +1 LOGIN_FAILURE in security_events,
+1 row in auth.login_attempts
- account lockout: 5 failed attempts then 6th returns 429 with
remainingSeconds, even with the correct password
- unverified email login: 403 with code = EMAIL_NOT_VERIFIED
- validate with garbage token: valid !== true
- resend verification: second mail arrives in mailpit
Plus the run-integration-tests.sh helper now runs both .test.ts files
and tests/integration/package.json's `test` script does the same.
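The rule the lockout test exercises ("5 failures in 15 min", then 429 with remainingSeconds) can be illustrated with a small pure function. This is a sketch under assumptions: the function name `evaluateLockout` is hypothetical, and the way remainingSeconds is derived here (the lock lifts when the fifth most recent failure leaves the window) is a guess, since the commit does not describe how the service computes it.

```typescript
const MAX_FAILURES = 5;
const LOCKOUT_WINDOW_MS = 15 * 60 * 1000;

interface LockoutStatus {
  locked: boolean;
  remainingSeconds: number;
}

// failureTimes: timestamps of failed login attempts for one account.
function evaluateLockout(failureTimes: Date[], now: Date): LockoutStatus {
  const nowMs = now.getTime();
  const windowStart = nowMs - LOCKOUT_WINDOW_MS;

  // Keep only failures inside the window, newest first.
  const recent = failureTimes
    .map((t) => t.getTime())
    .filter((ms) => ms > windowStart && ms <= nowMs)
    .sort((a, b) => b - a);

  if (recent.length < MAX_FAILURES) {
    return { locked: false, remainingSeconds: 0 };
  }

  // Assumption: the lock lifts once the 5th most recent failure ages out.
  const oldestCounted = Math.min(...recent.slice(0, MAX_FAILURES));
  const unlockAt = oldestCounted + LOCKOUT_WINDOW_MS;
  return {
    locked: true,
    remainingSeconds: Math.ceil((unlockAt - nowMs) / 1000),
  };
}
```

Because a check like this runs before the password is verified, a locked account is rejected even when the 6th attempt carries the correct password, which matches what the lockout test asserts.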
Negative-control: reverted the recordAttempt fix (re-added the bogus
gen_random_uuid id), the wrong-password test failed at the
login_attempts assertion. Reverted the checkLockout fix, the lockout
test failed at the 429 assertion. Both fixes verified to be load-bearing.
6 tests, 45 expects, ~1.3s on a warm cache.
Integration tests
End-to-end tests that exercise real services against real Postgres + Redis + a fake SMTP server (Mailpit), via `docker-compose.test.yml`.
What's covered
| File | Flow under test |
|---|---|
| `auth-flow.test.ts` | register → email verification (via Mailpit) → login → JWT validation → /me/data → encryption vault init/key → logout |
Running locally
./scripts/run-integration-tests.sh
That script:
- Brings up `docker-compose.test.yml` (postgres, redis, mailpit, mana-auth, mana-notify) on isolated ports (5443, 6390, 8026, 3091, 3092)
- Waits for everything to be healthy
- Pushes the `@mana/auth` Drizzle schema into the test database
- Applies the encryption-vault SQL migrations (`002_encryption_vaults.sql`, `003_recovery_wrap.sql`)
- Runs `bun test auth-flow.test.ts` from this directory
- Tears the stack down on exit (success or failure)
The whole thing runs in well under a minute on a warm Docker cache.
Mailpit web UI
While the stack is up you can also browse incoming mail manually at http://127.0.0.1:8026.
Why this exists
Bugs caught by this test the first time it ran:
- `services/mana-auth` imported `nanoid` but didn't declare it in its `package.json` → `Cannot find package 'nanoid'` at startup, register endpoint 500'd. Local `pnpm install` resolved it transitively via `postcss → nanoid@3.3.11`, an isolated container build couldn't.
- `MANA_AUTH_KEK` was never passed through to the mana-auth container in `docker-compose.macmini.yml`, so the prod service hard-failed at startup with `MANA_AUTH_KEK env var is required in production`.
- The encryption-vault SQL migrations (`002`, `003`) had never been applied to prod Postgres, so any vault endpoint 500'd with `relation "auth.encryption_vaults" does not exist`.
- `/api/v1/auth/login` minted a JWT by reconstructing the session cookie under the wrong name (`mana.session_token` instead of `__Secure-mana.session_token`), so the JWT-mint silently fell through and clients got `accessToken: undefined`.
- mana-notify SMTP credentials were misconfigured against Stalwart, so no verification email actually went out; the failure was buried in mana-notify worker logs and the auth flow appeared to "work" only because the user could be flipped to verified by other means.

Each of those would have been a single red `bun test` run instead of a multi-hour debugging session.
Adding more flows
Drop another `<name>.test.ts` next to `auth-flow.test.ts` and update `package.json` to include it. Use the same helpers (`postJson`, `waitForMail`, `pgExec`); they're free to copy.
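For orientation, helpers of this kind might look roughly like the sketch below. The signatures are hypothetical and may not match the real helpers in `auth-flow.test.ts`; in particular, the injected `fetchImpl` parameter and the generic `pollUntil` (on which something like `waitForMail` could be built) are assumptions made so the example is self-contained.

```typescript
// Minimal structural type so a fake can stand in for the global fetch.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ status: number; json(): Promise<unknown> }>;

// POST a JSON body and return the status plus the parsed response (or null).
async function postJson(fetchImpl: FetchLike, url: string, body: unknown) {
  const res = await fetchImpl(url, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(body),
  });
  const data = await res.json().catch(() => null);
  return { status: res.status, data };
}

// Call `probe` repeatedly until it yields a value or the timeout elapses.
async function pollUntil<T>(
  probe: () => Promise<T | undefined>,
  timeoutMs = 10000,
  intervalMs = 250,
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const value = await probe();
    if (value !== undefined) return value;
    if (Date.now() >= deadline) {
      throw new Error(`timed out after ${timeoutMs} ms`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

In a real test file the global `fetch` can be passed as `fetchImpl`, and a mail wait would pass a probe that queries Mailpit and returns `undefined` until the expected message shows up.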
CI
The same script runs in `.github/workflows/ci.yml` as a required PR check. Don't bypass it.