Source default was 3026 but Mac Mini production has been overriding to
3025 via the launchd plist in scripts/mac-mini/setup-image-gen.sh ever
since the service was set up. The override existed in exactly one place
that is not version-controlled in any obvious way — anyone redeploying
without that script would land on 3026 and clients pointing at 3025
would fail to connect.
Source default → 3025 across main.py, setup.sh, README, CLAUDE.md so the
launchd plist is no longer load-bearing. The Mac Mini setup script still
sets PORT=3025 explicitly; that's now belt-and-suspenders rather than the
only thing keeping production alive.
Also added a note clarifying that this Mac Mini service (flux2.c, MPS,
arm64-only) is *not* the same thing as the "image-gen" running on the
Windows GPU server (PyTorch + diffusers + CUDA, port 3023, code lives at
C:\mana\services\mana-image-gen\ outside this repo). Two different
implementations sharing a name was confusing the port-collision audit.
Updated docs/PORT_SCHEMA.md warning block to retract the previous false
claims of two active port collisions:
- image-gen ↔ video-gen on 3026 — wrong: image-gen runs on Mac Mini
on 3025 (now also the source default), video-gen is alone on the
Windows GPU on 3026
- voice-bot ↔ sync on 3050 — latent only: mana-voice-bot is not
deployed anywhere (no launchd, no scheduled task, no cloudflared
route), so the collision is in source defaults but not in production
The voice-bot 3050 default should still be moved before voice-bot is
ever deployed — flagged in the PORT_SCHEMA warning instead of silently
fixed since voice-bot deployment is its own decision.
New service docs:
- services/mana-stt/CLAUDE.md — FastAPI surface with Whisper MLX (local),
WhisperX (rich), and Voxtral (local + Mistral API). Documents the lazy
backend loading and the launchd plist setup on the Mac Mini.
- services/mana-events/CLAUDE.md — Hono/Bun service for public RSVP and
event-sharing. Documents the host (JWT) vs public (token) split, the
rate-limit sweeper, and the createApp factory pattern that lets unit
tests run without bootstrapping the production sweeper.
Stale entries fixed:
- mana-auth: dropped "rewritten from NestJS / drop-in replacement" — the
rewrite is the only mana-auth there is now. Email channel updated from
Brevo SMTP to self-hosted Stalwart (see docs/MAIL_SERVER.md).
- mana-notify: same Brevo → Stalwart fix in the channel table and env
var defaults.
PORT_SCHEMA.md flagged as aspirational:
- The doc was dated 2026-03-28 and presented as "single source of truth",
but cross-checking against actual service source files (config.go,
main.py, start.sh) shows nothing matches. Added a prominent warning at
the top with the real ports + two confirmed collisions:
* mana-image-gen and mana-video-gen both default to PORT 3026
* mana-voice-bot and mana-sync both default to PORT 3050
Today these are masked because image-gen + voice-bot live on the
Windows GPU server while video-gen + sync live on the Mac Mini, but
the moment they share a host they collide. Either execute the planned
reorg or pick non-colliding ports and rewrite the doc to match
reality — flagged as a real follow-up.
PyTorch's `torch.cuda.get_device_properties(0)` returns a
`_CudaDeviceProperties` object whose memory attribute is
`total_memory` (bytes), not `total_mem`. The typo crashed the
service immediately at startup because `get_model_info()` is
called from the FastAPI lifespan handler, not lazily — uvicorn
logged "Application startup failed" before any request could land.
Found while installing mana-video-gen on the Windows GPU box
(192.168.178.11:3026) for the gpu-video.mana.how Cloudflare route.
After the fix the service starts cleanly under the ManaVideoGen
scheduled task and responds 200 on /health both over the LAN and via
the Cloudflare tunnel. status.mana.how now reports 42/42 for the
first time ever.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Five documentation surfaces gained encryption awareness in this
sweep. Before this commit, the only place anyone could learn about
the at-rest encryption layer or the zero-knowledge opt-in was the
internal DATA_LAYER_AUDIT.md. New contributors and self-hosters
would never discover one of the most important features of the
product just by reading the standard onboarding docs.
apps/docs/src/content/docs/architecture/security.mdx (NEW)
----------------------------------------------------------
First-class user-facing security page in the Starlight site,
slotted into the Architecture sidebar between Authentication and
Backend.
Sections:
- What's encrypted (overview table of 27 modules + the
intentional plaintext carve-outs)
- Standard mode flow with ASCII diagram
- "What Mana CAN see" trust statements per mode
- Zero-knowledge mode setup walkthrough (Steps component)
- Unlock flow on a new device
- Recovery code rotation
- Deployment requirements (the loud MANA_AUTH_KEK warning)
- Audit trail action vocabulary
- Threat model summary table
- Implementation file references with paths
services/mana-auth/CLAUDE.md
----------------------------
New "Encryption Vault" section under Key Endpoints, listing all 7
routes (status, init, key, rotate, recovery-wrap POST+DELETE,
zero-knowledge) with their HTTP method, path, error codes, and a
description. Mentions the three CHECK constraints + RLS + audit
table. Points readers at DATA_LAYER_AUDIT.md and the new
security.mdx for the deep dive.
Environment Variables block gains MANA_AUTH_KEK with a multi-line
comment explaining the openssl rand command + dev fallback warning.
apps/mana/CLAUDE.md
-------------------
Full rewrite. The existing file was from the Supabase era and
described things like @supabase/ssr, safeGetSession(), and a
five-table schema with users + organizations + teams that doesn't
exist any more. Replaced with the unified-app architecture:
- Module system layout (collections.ts / queries.ts / stores/)
- Mana Auth (Better Auth + EdDSA JWT) instead of Supabase
- Local-first data layer with the full pipeline diagram
- At-rest encryption section with the "when writing module code
that touches sensitive fields" 4-step guide
- Updated routing structure (no more separate /organizations,
/teams routes)
- Module store pattern code example
- Reference document table at the bottom pointing at the audit,
the new security.mdx, and the auth doc
Root CLAUDE.md
--------------
New "At-Rest Encryption (Phase 1–9)" subsection under the
Local-First Architecture section. Two-mode trust summary table,
production requirement for MANA_AUTH_KEK with the openssl command,
the "when writing module code" 4-step guide, and a reference
table. New contributors reading the root CLAUDE.md from top to
bottom now hit encryption naturally as part of the data layer
discussion.
.env.macmini.example
--------------------
MANA_AUTH_KEK was missing from the production env example
entirely — the macmini deployment would silently boot on the
32-zero-byte dev fallback if you copied this file. Added with a
multi-paragraph comment covering: how to generate, why it's
required, how to store securely (Docker secrets / KMS / Vault),
and the rotation caveat.
apps/docs/src/content/docs/deployment/self-hosting.mdx
------------------------------------------------------
Two changes:
1. Added MANA_AUTH_KEK to the mana-auth service block in the
Compose example with an inline comment pointing at the new
section below.
2. New "Encryption Vault Setup" H2 section with subsections:
- Generating a KEK (with a fake example value labelled DO NOT
USE — generate your own)
- Securing the KEK (Docker secrets, KMS, systemd
LoadCredential, anti-patterns)
- "What if I lose the KEK?" — explains the data is
unrecoverable by design and mitigation via zero-knowledge
mode opt-in
- KEK rotation — calls out the missing background re-wrap
job as a known limitation
apps/docs/astro.config.mjs
--------------------------
Added "Security & Encryption" entry to the Architecture sidebar
between Authentication and Backend so the new page is reachable
from the docs nav.
Astro check: 0 errors, 0 warnings, 0 hints across 4 .astro files.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Closes backlog #1 from the Phase 9 audit. Adds 28 integration tests
for the EncryptionVaultService against a real Postgres so the
RLS policies, CHECK constraints and audit-row writes are exercised
as the production app actually sees them. The pure-crypto KEK tests
in kek.test.ts already covered the wrap/unwrap primitives — this
new file fills in the service-shaped gaps that need a real DB.
Test infrastructure
-------------------
- Reads TEST_DATABASE_URL from env. Whole suite is SKIPPED via
describe.skip if unset, so unrelated CI runs and `bun test` from
a fresh checkout don't fail on missing connection. The
encryption-vault sub-job has to provision a Postgres explicitly.
- Schema is assumed already migrated (run `pnpm db:push` or apply
sql/002 + sql/003 manually before invoking the suite). Tests
insert a fresh test user per case via beforeEach so cross-test
pollution is impossible despite the FK to auth.users.
- afterAll cleans up the user (CASCADE wipes vault + audit) and
closes the postgres pool so bun test exits cleanly.
Coverage
--------
init (3):
- Mints a fresh vault, wrapped_mk + wrap_iv populated, ZK off
- Idempotent (returns same key)
- Audit rows are written
getStatus (5):
- vaultExists=false for unconfigured user
- vaultExists=true after init, no recovery wrap
- hasRecoveryWrap=true after setRecoveryWrap
- zeroKnowledge=true after enableZK
- Does NOT write an audit row (cheap metadata read)
setRecoveryWrap (4):
- Stores wrap on existing vault
- VaultNotFoundError on missing vault
- Idempotent (replaces previous wrap)
- Writes recovery_set audit row
clearRecoveryWrap (3):
- Removes the wrap
- ZeroKnowledgeActiveError when ZK is on
- VaultNotFoundError on missing vault
enableZeroKnowledge (4):
- Flips zero_knowledge=true and NULLs out wrapped_mk + wrap_iv
- RecoveryWrapMissingError if no recovery wrap is set
- Idempotent (already-on is no-op)
- VaultNotFoundError on missing vault
disableZeroKnowledge (2):
- Restores wrapped_mk from a client-supplied master key,
verifies the round-trip via getMasterKey returns the same bytes
- No-op when ZK is already off
getMasterKey (3):
- Returns unwrapped MK in standard mode
- Returns recovery blob with requiresRecoveryCode=true in ZK mode
- VaultNotFoundError on missing vault
rotate (2):
- Mints fresh MK and wipes any existing recovery wrap
- ZeroKnowledgeRotateForbidden in ZK mode
DB-level invariants (2):
- Setting wrapped_mk back while ZK active is rejected by
encryption_vaults_zk_consistency
- Setting wrap_iv to NULL while wrapped_mk is set is rejected
by encryption_vaults_wrap_iv_pair
Both wrap the Drizzle update in an arrow IIFE so
expect(...).rejects.toThrow() sees a real Promise (Drizzle's
chainable update() only executes on await/then).
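The IIFE workaround can be illustrated without a database. The sketch below is a minimal illustration, not Drizzle's actual class: `LazyQuery` and `runAsPromise` are hypothetical names and the error message is invented. It shows a thenable that does nothing until `then()` is called, and an async arrow IIFE that turns it into a real Promise.

```typescript
// Hypothetical stand-in for a lazily executed query builder. Nothing runs
// until then() is called, and the object itself is not a Promise instance.
class LazyQuery implements PromiseLike<never> {
  executed = false;

  then<R1 = never, R2 = never>(
    onfulfilled?: ((value: never) => R1 | PromiseLike<R1>) | null,
    onrejected?: ((reason: unknown) => R2 | PromiseLike<R2>) | null,
  ): Promise<R1 | R2> {
    this.executed = true;
    // Simulate the database rejecting the UPDATE (message is invented).
    const failure: Promise<never> = Promise.reject(
      new Error("simulated CHECK constraint violation"),
    );
    return failure.then(onfulfilled, onrejected);
  }
}

// The async arrow IIFE awaits the builder, which triggers then(), and hands
// back a genuine Promise that rejects if the query fails.
function runAsPromise(query: LazyQuery): Promise<void> {
  return (async () => {
    await query;
  })();
}
```

A test can then pass the result of `runAsPromise(...)` to a `rejects` matcher, since it is a genuine Promise.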
Run results
-----------
With TEST_DATABASE_URL set + schema migrated:
28 pass, 0 fail, 64 expect() calls
Without TEST_DATABASE_URL set (default):
0 pass, 30 skip (full suite cleanly skipped)
KEK tests in kek.test.ts still run unaffected.
Drive-by: kek.test.ts header comment updated to point at the new
sibling file instead of saying "tests will live alongside mana-sync"
(which was outdated speculation from Phase 2).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Closes the Phase 9 Milestone 4 known limitation where the settings
page always started in 'idle' state regardless of whether the user
had already enabled zero-knowledge mode. Adds a cheap server-side
status read + hydrates the page on mount.
Server side
-----------
New VaultStatus interface and getStatus(userId) method on
EncryptionVaultService — single SELECT against encryption_vaults,
no decryption, no audit logging (this gets called on every settings
page mount and we don't want to flood the audit log with read-only
metadata fetches). Returns sane defaults when the vault row doesn't
exist yet so the client can avoid a 404 dance.
GET /api/v1/me/encryption-vault/status →
{
vaultExists: boolean,
hasRecoveryWrap: boolean,
zeroKnowledge: boolean,
recoverySetAt: string | null
}
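As a rough sketch of the "sane defaults" behaviour, the mapping from an optional vault row to the response shape could look like the following. The `VaultRow` fields and the `toVaultStatus` name are assumptions for illustration; only the `VaultStatus` shape comes from this commit.

```typescript
interface VaultStatus {
  vaultExists: boolean;
  hasRecoveryWrap: boolean;
  zeroKnowledge: boolean;
  recoverySetAt: string | null;
}

// Hypothetical row shape; the real Drizzle row has more columns.
interface VaultRow {
  recoveryWrappedMk: string | null;
  zeroKnowledge: boolean;
  recoverySetAt: Date | null;
}

// Maps "no row yet" to all-false defaults so the client never needs a 404 path.
function toVaultStatus(row: VaultRow | undefined): VaultStatus {
  if (row === undefined) {
    return {
      vaultExists: false,
      hasRecoveryWrap: false,
      zeroKnowledge: false,
      recoverySetAt: null,
    };
  }
  return {
    vaultExists: true,
    // Assumption: a wrap counts as present when its ciphertext column is set.
    hasRecoveryWrap: row.recoveryWrappedMk !== null,
    zeroKnowledge: row.zeroKnowledge,
    recoverySetAt: row.recoverySetAt ? row.recoverySetAt.toISOString() : null,
  };
}
```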
Client side
-----------
vault-client.ts gains a `getStatus()` method that bypasses the
fetchVault retry helper (status reads should be cheap and one-shot;
if they fail we let the caller fall back to defaults). Re-exports
VaultStatus + RecoveryCodeSetupResult from the crypto barrel.
settings/security/+page.svelte
------------------------------
onMount kicks off a getStatus() call. Two things change based on
the response:
1. If the server says zero_knowledge=true, jump zkSetupStep to
'enabled' so the page renders the active-state UI directly
instead of the setup flow.
2. New `hasRecoveryWrap` state tracks whether a wrap is stored,
even if ZK isn't active yet. The idle branch now has TWO
variants:
- hasRecoveryWrap=false: original "Recovery-Code einrichten"
single button (unchanged from milestone 4)
- hasRecoveryWrap=true: amber notice "you have a code stored
but ZK isn't active" with three buttons:
* "Zero-Knowledge jetzt aktivieren" (jumps straight to the
enable call)
* "Neuen Recovery-Code generieren" (rotates the wrap)
* "Recovery-Code entfernen" (with two-click confirmation,
calls DELETE /recovery-wrap)
This handles the previously-orphaned state where a user generated a
code, copied it to their password manager, but never confirmed the
final activation step. Without this branch, after a reload the
settings page would show "Setup" again as if no code had ever been
generated. The vault was not in ZK mode yet (it only had a recovery
wrap stored), so re-running setup would not have failed with "vault is
already in zero-knowledge mode", but the page gave no hint that a wrap
already existed.
handleSetupRecoveryCode + handleClearRecoveryCode now keep
hasRecoveryWrap in sync after the round trip.
Fail-quiet on getStatus error: if the network/auth/server-side fetch
fails, the page stays at the idle default. The user can still run
the setup flow, and any inconsistencies surface via the usual
server-side error responses.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Server-side support for the Phase 9 zero-knowledge opt-in. Adds the
recovery-wrap columns + four new vault operations + the routes that
expose them.
Schema (sql/003_recovery_wrap.sql)
----------------------------------
Adds to auth.encryption_vaults:
- recovery_wrapped_mk text (NULL until set)
- recovery_iv text (NULL until set)
- recovery_format_version smallint NOT NULL DEFAULT 1
- recovery_set_at timestamptz
- zero_knowledge boolean NOT NULL DEFAULT false
Drops NOT NULL from wrapped_mk + wrap_iv (a vault in zero-knowledge
mode has no server-side wrap at all).
Three CHECK constraints enforce the invariant at the DB level so no
service bug can leave a vault in an inconsistent state:
- encryption_vaults_has_wrap: at least one of (wrapped_mk,
recovery_wrapped_mk) is set
- encryption_vaults_wrap_iv_pair: ciphertext + IV are paired (both
NULL or both set) on each wrap form
- encryption_vaults_zk_consistency: zero_knowledge=true implies
wrapped_mk IS NULL AND recovery_wrapped_mk IS NOT NULL
If a code-level bug ever tried to enable ZK without a recovery wrap,
or to leave both wraps empty, Postgres would reject the UPDATE.
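The three invariants can be restated as a pure predicate, which may help when reasoning about which state transitions Postgres will reject. This is only an illustrative mirror of the constraints as described here, not the SQL itself; `VaultWrapState` and `violatedConstraints` are hypothetical names.

```typescript
interface VaultWrapState {
  wrappedMk: string | null;
  wrapIv: string | null;
  recoveryWrappedMk: string | null;
  recoveryIv: string | null;
  zeroKnowledge: boolean;
}

// Returns the names of the constraints a given row state would violate.
function violatedConstraints(v: VaultWrapState): string[] {
  const violations: string[] = [];

  // At least one wrap form must exist.
  if (v.wrappedMk === null && v.recoveryWrappedMk === null) {
    violations.push("encryption_vaults_has_wrap");
  }

  // Ciphertext and IV must be both NULL or both set, for each wrap form.
  const kekPaired = (v.wrappedMk === null) === (v.wrapIv === null);
  const recoveryPaired = (v.recoveryWrappedMk === null) === (v.recoveryIv === null);
  if (!kekPaired || !recoveryPaired) {
    violations.push("encryption_vaults_wrap_iv_pair");
  }

  // ZK mode means no server-side wrap and a recovery wrap present.
  if (v.zeroKnowledge && (v.wrappedMk !== null || v.recoveryWrappedMk === null)) {
    violations.push("encryption_vaults_zk_consistency");
  }

  return violations;
}
```

For example, a ZK row that still carries a KEK wrap reports only `encryption_vaults_zk_consistency`, while a row with every wrap column NULL and `zeroKnowledge` set reports both the has_wrap and zk_consistency violations.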
Drizzle schema (db/schema/encryption-vaults.ts)
-----------------------------------------------
Mirrors the migration: wrappedMk + wrapIv become nullable, and the five
new columns are added with the right defaults. Inline doc comment explains
the zero-knowledge fork.
Service (services/encryption-vault/index.ts)
--------------------------------------------
VaultFetchResult gains optional `requiresRecoveryCode` /
`recoveryWrappedMk` / `recoveryIv` so the route handler can serialize
the right shape. masterKey becomes Uint8Array | null (null in ZK mode).
Existing methods updated:
- init: branches on row.zeroKnowledge — returns the recovery blob
instead of an unwrapped MK if the user is already in ZK mode
- getMasterKey: same fork, with audit context "zk-recovery-blob"
- rotate: throws ZeroKnowledgeRotateForbidden in ZK mode (the server
can't re-wrap a key it can't read). Also wipes any stale recovery
wrap on rotation — the new MK has nothing to do with the old one,
so the old recovery code would unwrap into garbage.
New methods:
- setRecoveryWrap(userId, { recoveryWrappedMk, recoveryIv }, ctx)
Stores (or replaces) the user's recovery wrap. Idempotent.
- clearRecoveryWrap(userId, ctx)
Removes the recovery wrap. Forbidden if ZK is active (would lock
the user out) — throws ZeroKnowledgeActiveError → 409.
- enableZeroKnowledge(userId, ctx)
NULLs out wrapped_mk + wrap_iv, sets zero_knowledge=true. Requires
a recovery wrap to already be present — throws
RecoveryWrapMissingError → 400 otherwise. Idempotent on already-on.
- disableZeroKnowledge(userId, mkBytes, ctx)
Inverse: takes a freshly-unwrapped MK from the client, KEK-wraps
it, stores as wrapped_mk, flips zero_knowledge=false. The client
is the only entity that can supply the MK at this point, since
the server can't decrypt the recovery wrap.
Three new error classes:
- RecoveryWrapMissingError → 400 RECOVERY_WRAP_MISSING
- ZeroKnowledgeActiveError → 409 ZK_ACTIVE
- ZeroKnowledgeRotateForbidden → 409 ZK_ROTATE_FORBIDDEN
Audit action union extended with:
- 'recovery_set' | 'recovery_clear' | 'zk_enable' | 'zk_disable'
Routes (routes/encryption-vault.ts)
-----------------------------------
GET /key + POST /init now share a serializeFetchResult helper that
returns either:
- { masterKey, formatVersion, kekId } (standard)
- { requiresRecoveryCode: true, recoveryWrappedMk, recoveryIv,
formatVersion } (ZK mode)
Three new routes:
- POST /recovery-wrap: body { recoveryWrappedMk, recoveryIv }.
Stores the wrap. Validates both fields are non-empty strings.
- DELETE /recovery-wrap: removes the wrap. 409 if ZK active.
- POST /zero-knowledge: body { enable: boolean, masterKey?: base64 }.
enable=true flips ZK on (no body MK needed); enable=false flips it
off (MK required). Validates the MK decodes to exactly 32 bytes.
Wipes the bytes after handing them to the service.
POST /rotate now catches ZeroKnowledgeRotateForbidden → 409
ZK_ROTATE_FORBIDDEN so the client can show "disable zero-knowledge
first".
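The 32-byte check and the wipe on POST /zero-knowledge might look roughly like the sketch below. `decodeMasterKey` and `withMasterKey` are hypothetical helper names, `atob` is assumed to be available as a global (as in Bun, browsers, and recent Node), and the callback is synchronous for brevity where the real route would await the service. Wiping is best-effort: the intermediate decoded string cannot be cleared.

```typescript
const MASTER_KEY_BYTES = 32;

// Decodes a base64 master key and rejects anything that is not exactly 32 bytes.
function decodeMasterKey(base64: string): Uint8Array {
  let binary: string;
  try {
    binary = atob(base64);
  } catch {
    throw new Error("masterKey is not valid base64");
  }
  if (binary.length !== MASTER_KEY_BYTES) {
    throw new Error(`masterKey must decode to exactly ${MASTER_KEY_BYTES} bytes`);
  }
  const bytes = new Uint8Array(MASTER_KEY_BYTES);
  for (let i = 0; i < MASTER_KEY_BYTES; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Hands the decoded key to `use`, then zeroes the buffer even if `use` throws.
function withMasterKey<T>(base64: string, use: (mk: Uint8Array) => T): T {
  const mk = decodeMasterKey(base64);
  try {
    return use(mk);
  } finally {
    mk.fill(0);
  }
}
```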
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add an "eventItems" mini-collection attached to each social event so
hosts can track what each guest is bringing, and so public visitors
on the share-link page can claim an item without an account.
Local-first side
- New eventItems table (Dexie v11), module config update for sync.
- LocalEventItem type + EventItem domain type, useEventItems query.
- eventItemsStore: addItem / updateItem / toggleDone / assign /
deleteItem. Every mutation pushes the full list to the server
snapshot via eventsStore.syncItems if the event is published.
- BringListEditor component on the host DetailView with assign-to-
guest dropdown, quantity, and done-checkbox.
- eventsStore.syncItems + a syncItems call in publishEvent so the
public page sees pre-existing items as soon as the event ships.
Server side
- New event_items_published table (FK cascade from events_published
so unpublishing wipes the bring list along with the snapshot).
- Host endpoints PUT/GET /events/:eventId/items: full-replace upsert
that preserves any existing claimed_by_name across host edits, max
100 items, ownership check.
- Public POST /rsvp/:token/items/:itemId/claim: name-only claim, 1×
per item (first write wins), shares the per-token hourly rate
bucket with RSVP submissions to keep the abuse surface uniform.
- GET /rsvp/:token now also returns the bring list (sorted) so the
public page renders in a single round-trip.
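The two server-side rules for the bring list (host full-replace that keeps existing claims, and first-write-wins public claims) can be sketched in memory as follows. This is an illustration under assumptions: the field names, `replaceItems`, and `claimItem` are hypothetical, and the real service enforces these rules in SQL.

```typescript
interface PublishedItem {
  id: string;
  label: string;
  quantity: number;
  claimedByName: string | null;
}

// What the host sends: everything except the claim, which guests own.
type HostItem = Omit<PublishedItem, "claimedByName">;

const MAX_ITEMS = 100;

// Full replace: items missing from `incoming` are pruned, edited items keep
// whatever claim they already had, new items start unclaimed.
function replaceItems(existing: PublishedItem[], incoming: HostItem[]): PublishedItem[] {
  if (incoming.length > MAX_ITEMS) {
    throw new Error(`at most ${MAX_ITEMS} items per event`);
  }
  const claims = new Map<string, string | null>();
  for (const item of existing) {
    claims.set(item.id, item.claimedByName);
  }
  return incoming.map((item) => ({
    ...item,
    claimedByName: claims.get(item.id) ?? null,
  }));
}

// First write wins: a second claim on the same item is refused.
function claimItem(item: PublishedItem, name: string): boolean {
  if (item.claimedByName !== null) return false;
  item.claimedByName = name;
  return true;
}
```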
Public RSVP page
- Renders the bring list with claim buttons; clicking prompts for a
name and POSTs the claim, then optimistically updates the UI.
- New bring-list i18n keys for all five locales (de/en/it/fr/es).
Tests
- 15 new server tests covering host PUT/GET (insert / update / prune /
ownership / claimed-name preservation / cascade), GET /rsvp item
exposure, and POST /claim (success / double-claim / cross-token /
cancelled / validation). 50 server tests total, all green.
- E2E spec selectors are now scoped to .guest-editor, because the new
BringListEditor introduced a second "Hinzufügen" button label.
Add bun:test integration suite that exercises every public and host
endpoint plus the rate-bucket sweeper against a real Postgres. The
Hono app factory was extracted from index.ts into app.ts so tests can
build their own instance with a header-based auth mock instead of
spinning up mana-auth + JWKS.
Coverage:
- health route smoke
- public RSVP: snapshot fetch (incl. 404, cancelled, summary
privacy), submit, validation (name, status, email, plus-ones,
cancelled), upsert dedup (incl. null/missing email parity), summary
aggregation across yes/no/maybe + plus-ones, rate-limit cap (5/h),
absolute per-token cap (20)
- host events: publish (auth, idempotent token reuse, ownership),
snapshot update (partial, ownership, 404), delete (cascade FK to
rsvps + buckets, ownership, idempotent), get rsvps (ownership)
- sweeper: removes >2h-old buckets, keeps fresh ones, no-op on empty
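The sweeper cases covered above (old buckets removed, fresh ones kept, no-op on empty) boil down to a cutoff comparison. The sketch below models it over an in-memory array; `RateBucket`, its fields, and `sweepBuckets` are hypothetical names, and the real sweeper issues a DELETE against rsvp_rate_buckets instead.

```typescript
interface RateBucket {
  token: string;
  windowStart: Date;
  count: number;
}

const MAX_BUCKET_AGE_MS = 2 * 60 * 60 * 1000; // 2 hours

// Keeps buckets whose window started within the last 2h, drops older ones.
function sweepBuckets(
  buckets: RateBucket[],
  now: Date,
): { kept: RateBucket[]; removed: number } {
  const cutoff = now.getTime() - MAX_BUCKET_AGE_MS;
  const kept = buckets.filter((b) => b.windowStart.getTime() >= cutoff);
  return { kept, removed: buckets.length - kept.length };
}
```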
Mock auth lives in a small helper that injects an X-Test-User header
into a fake middleware, so the same createApp() factory powers both
production (real jwtAuth) and tests (header mock).
Four small follow-ups on Phase 1b:
- docker-compose.macmini.yml: add the mana-events container with the
same shape as mana-credits, expose port 3065, add a Traefik route
for events.mana.how, and inject PUBLIC_MANA_EVENTS_URL into the
mana-web container so the SvelteKit SSR + browser both reach it.
- mana-events: background sweeper that deletes rsvp_rate_buckets
rows older than 2h every hour. Without it, long-published events
accumulate one row per traffic-hour forever (FK cascade only fires
on snapshot delete).
- PublicRsvpList: track consecutiveFailures and only show the error
banner after two failures in a row, so a single mid-poll network
hiccup doesn't flash a 30s error the user can't act on.
- apps/mana/apps/web: declare postgres as a devDep (already imported
by the e2e spec via pnpm hoisting, now explicit).
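The PublicRsvpList change amounts to a small counter that resets on success. A minimal sketch, assuming a hypothetical `PollFailureTracker` class (the actual component keeps this state inline in Svelte):

```typescript
const FAILURES_BEFORE_BANNER = 2;

// Tracks consecutive poll failures; a single hiccup never shows the banner.
class PollFailureTracker {
  private consecutiveFailures = 0;

  recordSuccess(): void {
    this.consecutiveFailures = 0;
  }

  recordFailure(): void {
    this.consecutiveFailures += 1;
  }

  shouldShowBanner(): boolean {
    return this.consecutiveFailures >= FAILURES_BEFORE_BANNER;
  }
}
```

With a 30s poll interval, a banner that appeared on the first failure would stay visible until the next poll, which is the flash the change avoids.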
Adds end-to-end browser voice capture for the Memoro module, mirroring the
existing dreams pattern: MediaRecorder → SvelteKit server proxy → mana-stt
on the Windows GPU box via Cloudflare tunnel.
Recording UI lives in /memoro page header (mic button + live timer + cancel +
sticky-permission retry). Server proxy at /api/v1/memoro/transcribe forwards
the blob with the server-held X-API-Key. memosStore.createFromVoice creates a
placeholder memo with processingStatus='processing' and fires transcribeBlob
in the background, which writes the transcript and flips status on completion
(or 'failed' with error in metadata).
Also corrects the mana-stt hostname across the repo: stt-api.mana.how (which
never existed in DNS) → gpu-stt.mana.how (the actual Cloudflare tunnel route
to the Windows GPU box). Adds an ENVIRONMENT_VARIABLES.md section explaining
how to obtain MANA_STT_API_KEY and where the tunnel terminates. Adds tunnel
health probes to the mac-mini health-check script so we catch tunnel-side
breakage in addition to LAN-side.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds the server side of the per-user encryption vault. Phase 1 shipped
the client foundation (no-op while every table is enabled:false). This
commit lets the client actually fetch a master key when Phase 3 flips
the registry switches.
Schema (Drizzle + raw SQL migration)
- auth.encryption_vaults: per-user wrapped MK + IV + format version +
kek_id stamp + created/rotated timestamps. PK = user_id, ON DELETE
CASCADE so account deletion wipes the vault.
- auth.encryption_vault_audit: append-only trail of init/fetch/rotate
actions with IP, user-agent, HTTP status, free-form context.
- sql/002_encryption_vaults.sql: idempotent CREATE TABLE + ENABLE +
FORCE row-level security with a `current_setting('app.current_user_id')`
policy on both tables. FORCE makes the policy apply to the table
owner too — no bypass via grants.
KEK loader (services/encryption-vault/kek.ts)
- Loads a 32-byte AES-256 KEK from the MANA_AUTH_KEK env var (base64).
- Production: missing or wrong-length input is fatal at boot.
- Development: 32-zero-byte fallback so contributors can run the
service without provisioning a secret. Logs a loud warning.
- wrapMasterKey / unwrapMasterKey use Web Crypto AES-GCM-256 over the
raw 32-byte MK with a fresh 12-byte IV per wrap. Returns base64
pair for storage.
- generateMasterKey + activeKekId helpers used by the service.
- Future migration to KMS / Vault: only loadKek() changes; the
kek_id stamp on each row tracks which KEK produced it.
EncryptionVaultService (services/encryption-vault/index.ts)
- init(userId): idempotent — returns existing MK or mints a new one.
- getMasterKey(userId): unwraps the stored MK; throws VaultNotFoundError
on no-row so the route can return 404 cleanly.
- rotate(userId): mints fresh MK, replaces wrap. Caller is on the
hook for re-encryption — destructive by design.
- withUserScope(userId, fn): wraps every read/write in a Drizzle
transaction with set_config('app.current_user_id', userId, true)
so the RLS policy admits only the matching row. Empty userId is
rejected up-front.
- writeAudit() appends a row to encryption_vault_audit on every
action including failures, so probing attempts leave a trail.
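The withUserScope pattern can be sketched against a deliberately tiny database interface. `Db` and `Tx` below are hypothetical stand-ins, not Drizzle's API; only the set_config call and the up-front rejection of an empty userId come from this commit.

```typescript
// Hypothetical minimal interfaces; a real driver exposes far more.
interface Tx {
  execute(sql: string, params: string[]): Promise<void>;
}

interface Db {
  transaction<T>(fn: (tx: Tx) => Promise<T>): Promise<T>;
}

function withUserScope<T>(
  db: Db,
  userId: string,
  fn: (tx: Tx) => Promise<T>,
): Promise<T> {
  // Reject before touching the database: an empty scope must never reach RLS.
  if (userId.trim() === "") {
    return Promise.reject(new Error("empty userId rejected before any query runs"));
  }
  return db.transaction<T>((tx) =>
    tx
      // Third argument `true` makes the setting local to this transaction.
      .execute("SELECT set_config('app.current_user_id', $1, true)", [userId])
      .then(() => fn(tx)),
  );
}
```

Because the setting is transaction-local, it does not leak to other requests that later reuse the same pooled connection.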
Routes (routes/encryption-vault.ts)
- POST /api/v1/me/encryption-vault/init — idempotent bootstrap
- GET /api/v1/me/encryption-vault/key — fetch the active MK
- POST /api/v1/me/encryption-vault/rotate — destructive rotation
- All return base64-encoded master key bytes plus formatVersion +
kekId. JWT-protected via the existing /api/v1/me/* middleware.
- readAuditContext() pulls X-Forwarded-For + User-Agent off the
request for the audit row.
Bootstrap (index.ts)
- loadKek() runs at top-level await before any route can fire so a
misconfigured KEK fails closed at boot, never at request time.
- encryptionVaultService is mounted under /api/v1/me/encryption-vault
so it inherits the existing JWT middleware and shows up next to the
GDPR self-service endpoints.
Tests (services/encryption-vault/kek.test.ts)
- 11 Bun-test cases covering: KEK load (happy path, wrong length,
idempotent, before-load guard), generateMasterKey randomness,
wrap/unwrap roundtrip, IV uniqueness across repeated wraps,
wrong-MK-length rejection, tampered-ciphertext rejection,
wrong-length IV rejection, wrong-KEK rejection.
- Service-level integration tests deferred — they need a real
Postgres for the RLS behaviour, set up via existing mana-sync
test pattern in CI.
Config + env
- .env.development gains MANA_AUTH_KEK= (empty → dev fallback)
with a comment explaining the production requirement.
- services/mana-auth/package.json gains "test": "bun test".
Verified: 11/11 KEK tests passing, 31/31 Phase 1 client tests still
passing, only pre-existing TS errors remain in mana-auth (auth.ts:281
forgetPassword + api-keys.ts:50 insert overload — both unrelated).
Phase 3: client wires the MemoryKeyProvider to GET /encryption-vault/key
on login, flips registry entries to enabled:true table by table, and
extends the Dexie hooks to call wrapValue/unwrapValue on configured
fields.
Phase 4: settings UI for lock state, key rotation, recovery code opt-in.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add an ON DELETE CASCADE FK from rsvp_rate_buckets.token to
events_published.token. Without it, deleting a snapshot left orphaned
rate-limit rows behind, slowly leaking storage. Verified with a
direct SQL cascade test.
New Hono+Bun service at services/mana-events on port 3065 with two
tables in mana_platform: events_published (snapshots) and public_rsvps
(unauthenticated responses), plus a per-token hourly rate-limit bucket.
- Host endpoints (JWT) for publish/update/unpublish/list-rsvps
- Public endpoints for snapshot fetch + RSVP upsert with rate limiting
- New /rsvp/[token] page outside the auth gate, SSR-loads the snapshot
- Client store wires publishEvent/unpublishEvent to the server, syncs
snapshot updates after edits, and deletes the snapshot on event delete
- DetailView polls GET /events/:id/rsvps every 30s while open and lets
hosts import a public response into their local guest list
- generate-env, setup-databases.sh, .env.development, hooks.server.ts,
package.json wired for local dev
Defense-in-depth on top of the existing application-level WHERE clauses:
- Migrate() now runs ENABLE + FORCE ROW LEVEL SECURITY on sync_changes and
installs a policy that gates rows on current_setting('app.current_user_id').
FORCE makes the policy apply to the table owner too, so the application
role used by mana-sync cannot bypass it regardless of grants.
- New withUser(ctx, userID, fn) helper opens a transaction and calls
set_config('app.current_user_id', userID, true) before running fn.
Empty userIDs are rejected up-front so an unauthenticated request can
never reach the database with an empty RLS scope (which would match
every row).
- RecordChange / GetChangesSince / GetAllChangesSince all run inside
withUser. WITH CHECK on the policy double-validates the user_id column
on insert against the active session, so a future code path that
forgets the WHERE clause cannot leak data.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add curated icon registry (73 Phosphor icons, 8 categories) in shared-icons
- Add DynamicIcon atom and IconPicker molecule in shared-ui
- Migrate habits module from emoji strings to Phosphor icon names
- Add Dexie version(2) migration for emoji→icon field rename
- Replace inline SVGs in habits with Phosphor components
- Add drag-and-drop photo upload to Photos workbench ListView
- Add blob: to CSP img-src for upload previews
- Add dev:media script and include mana-media in dev:manacore:servers
- Add ./toast export to shared-ui package.json
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Gmail rejects emails without a valid Message-ID header (RFC 5322).
Add Message-ID and Date headers to all outgoing emails.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Go's smtp.PlainAuth refuses to send credentials when the hostname
doesn't match the TLS cert (internal Docker hostname 'stalwart' vs
cert CN 'localhost'). Replace with custom LOGIN auth that works with
any SMTP server. Add detailed error logging at each SMTP stage.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add SMTP_INSECURE_TLS env var to skip certificate verification for
internal Docker-network SMTP connections. Stalwart's self-signed cert
uses 'localhost' as CN which doesn't match the 'stalwart' hostname.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The unified web app calls auth.mana.how/api/v1/settings to sync theme,
nav, locale, and device settings — but the endpoint was missing, causing
404 errors in production. Implements all 7 CRUD routes against the
existing auth.user_settings table.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace direct Brevo SMTP sending with HTTP calls to mana-notify's
notification API. This centralizes all email configuration in one
service (mana-notify) and removes the nodemailer dependency from
mana-auth. SMTP provider is now swappable via a single env var.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Bug 1: NotifyUser() early-returned when no WebSocket clients existed,
skipping SSE subscriber notifications entirely. Fixed by restructuring
to check WS clients and SSE subscribers independently.
Bug 2: SSE stream cursor defaulted to client's `since` parameter when
no initial data existed. If `since` was in the future (or very recent),
live updates had created_at < cursor and were silently filtered out.
Fixed by defaulting cursor to now() when no initial data is returned.
Bug 3: NotifyUser iterated over the original sseSubs slice instead of
sseSubsCopy after releasing the read lock (race condition). Fixed by
iterating over the copy.
Verified E2E: Push from client A → SSE stream on client B receives
live change event with correct data within ~1 second.
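A simplified sketch of the shape that addresses Bugs 1 and 3: both subscriber lists are copied under the read lock and then notified independently. Modelling WebSocket clients as channels, and every name other than NotifyUser, sseSubs and sseSubsCopy, are assumptions made to keep the example short:

```go
package main

import (
	"fmt"
	"sync"
)

type Notification struct {
	AppID string
}

type Hub struct {
	mu        sync.RWMutex
	wsClients map[string][]chan Notification // simplified stand-in for WS conns
	sseSubs   map[string][]chan Notification
}

func NewHub() *Hub {
	return &Hub{
		wsClients: map[string][]chan Notification{},
		sseSubs:   map[string][]chan Notification{},
	}
}

// Subscribe registers a buffered SSE channel for userID.
func (h *Hub) Subscribe(userID string) chan Notification {
	ch := make(chan Notification, 8)
	h.mu.Lock()
	h.sseSubs[userID] = append(h.sseSubs[userID], ch)
	h.mu.Unlock()
	return ch
}

// NotifyUser returns how many receivers accepted the notification.
func (h *Hub) NotifyUser(userID string, n Notification) int {
	h.mu.RLock()
	wsCopy := append([]chan Notification(nil), h.wsClients[userID]...)
	sseSubsCopy := append([]chan Notification(nil), h.sseSubs[userID]...)
	h.mu.RUnlock()

	delivered := 0
	// No early return when wsCopy is empty: SSE subscribers are
	// handled on their own, and only the copies are used after unlock.
	for _, ch := range wsCopy {
		if trySend(ch, n) {
			delivered++
		}
	}
	for _, ch := range sseSubsCopy {
		if trySend(ch, n) {
			delivered++
		}
	}
	return delivered
}

// trySend drops the notification instead of blocking on a slow receiver.
func trySend(ch chan Notification, n Notification) bool {
	select {
	case ch <- n:
		return true
	default:
		return false
	}
}

func main() {
	h := NewHub()
	ch := h.Subscribe("user-1")
	count := h.NotifyUser("user-1", Notification{AppID: "todo"})
	fmt.Println(count, (<-ch).AppID)
}
```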
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New endpoint GET /sync/{appId}/stream sends Server-Sent Events with
change data directly, replacing the WebSocket notification + HTTP pull
round-trip pattern.
Server (Go):
- HandleStream() in handler.go: SSE endpoint with initial sync + live streaming
- Hub.Subscribe()/Unsubscribe() in hub.go: channel-based SSE subscriber system
- Notification type for type-safe SSE events
- convertChanges() helper extracted from duplicated code
- WriteTimeout set to 0 so long-lived SSE connections are not cut off
Protocol: Client connects to /sync/{appId}/stream?collections=a,b&since=...
Server sends initial changes, then streams live changes as other clients sync.
Heartbeat every 30s keeps connection alive. Push still uses POST /sync/{appId}.
WebSocket remains available as fallback (not removed).
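A rough sketch of the streaming loop described above, using only net/http. The "change" event name, the ":8080" address, the string-typed change channel and all function names are assumptions; the real HandleStream also performs the initial sync and auth, which are omitted here:

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// formatSSE renders one event frame. It assumes data is single-line JSON.
func formatSSE(event, data string) string {
	return "event: " + event + "\ndata: " + data + "\n\n"
}

// streamHandler forwards changes to the client and emits a comment
// line as heartbeat every 30 seconds.
func streamHandler(changes <-chan string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		flusher, ok := w.(http.Flusher)
		if !ok {
			http.Error(w, "streaming unsupported", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "text/event-stream")
		w.Header().Set("Cache-Control", "no-cache")

		heartbeat := time.NewTicker(30 * time.Second)
		defer heartbeat.Stop()
		for {
			select {
			case <-r.Context().Done(): // client disconnected
				return
			case c, ok := <-changes:
				if !ok {
					return
				}
				fmt.Fprint(w, formatSSE("change", c))
			case <-heartbeat.C:
				fmt.Fprint(w, ": heartbeat\n\n")
			}
			flusher.Flush()
		}
	}
}

// newServer disables WriteTimeout so the stream is not cut off.
func newServer(changes <-chan string) *http.Server {
	mux := http.NewServeMux()
	mux.Handle("/sync/", streamHandler(changes))
	return &http.Server{Addr: ":8080", Handler: mux, WriteTimeout: 0}
}

func main() {
	srv := newServer(make(chan string))
	fmt.Printf("would listen on %s; sample frame:\n%s", srv.Addr, formatSSE("change", `{"id":1}`))
}
```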
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Server now returns hasMore: true when there are more than 1000 changes
pending for a collection. Client continues pulling in a loop until
hasMore is false, using the last row's timestamp as cursor.
Prevents data loss after long offline periods where >1000 changes
accumulated for a single collection.
Server changes (Go):
- GetChangesSince() accepts limit parameter
- HandlePull() fetches limit+1, trims, sets hasMore
- SyncedUntil uses last row's timestamp when paginating
Client changes (TypeScript):
- Pull loop: while (hasMore) { fetch → apply → advance cursor }
- Cursor only persisted after all pages fetched
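The limit+1 trick and the pull loop can be sketched together as follows. The client side is TypeScript in the actual change; it is shown in Go here only to keep one runnable example. The in-memory store, the int64 timestamps and the lowercase function names are assumptions, and the sketch assumes timestamps are strictly increasing so that a timestamp cursor never skips rows:

```go
package main

import "fmt"

type Change struct {
	ID        int
	CreatedAt int64
}

type PullResponse struct {
	Changes     []Change
	HasMore     bool
	SyncedUntil int64
}

// getChangesSince returns up to limit changes newer than since.
// store is assumed to be sorted by CreatedAt.
func getChangesSince(store []Change, since int64, limit int) []Change {
	out := make([]Change, 0, limit)
	for _, c := range store {
		if c.CreatedAt > since {
			out = append(out, c)
			if len(out) == limit {
				break
			}
		}
	}
	return out
}

// handlePull fetches limit+1 rows; the extra row only signals hasMore.
func handlePull(store []Change, since int64, limit int) PullResponse {
	rows := getChangesSince(store, since, limit+1)
	hasMore := len(rows) > limit
	if hasMore {
		rows = rows[:limit]
	}
	syncedUntil := since
	if len(rows) > 0 {
		syncedUntil = rows[len(rows)-1].CreatedAt
	}
	return PullResponse{Changes: rows, HasMore: hasMore, SyncedUntil: syncedUntil}
}

// pullAll loops until hasMore is false; the caller persists the
// returned cursor only after the loop has finished.
func pullAll(store []Change, since int64, limit int) ([]Change, int64, int) {
	var all []Change
	cursor, pages := since, 0
	for {
		resp := handlePull(store, cursor, limit)
		all = append(all, resp.Changes...)
		cursor = resp.SyncedUntil
		pages++
		if !resp.HasMore {
			return all, cursor, pages
		}
	}
}

// seed builds n changes with timestamps 1..n.
func seed(n int) []Change {
	store := make([]Change, n)
	for i := range store {
		store[i] = Change{ID: i, CreatedAt: int64(i + 1)}
	}
	return store
}

func main() {
	all, cursor, pages := pullAll(seed(2500), 0, 1000)
	fmt.Println(len(all), cursor, pages)
}
```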
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Mirrors the frontend unification (single IndexedDB) on the backend.
All services now use pgSchema() for isolation within one shared database,
enabling cross-schema JOINs, simplified ops, and zero DB setup for new apps.
- Migrate 7 services from pgTable() to pgSchema(): mana-user (usr),
mana-media (media), todo, traces, presi, uload, cards
- Update all DATABASE_URLs in .env.development, docker-compose, configs
- Rewrite init-db scripts for 2 databases + 12 schemas
- Rewrite setup-databases.sh for consolidated architecture
- Update shared-drizzle-config default to mana_platform
- Update CLAUDE.md with new database architecture docs
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Create packages/shared-python/manacore_auth/ with:
- auth.py: API key validation, rate limiting, local + external auth
- external_auth.py: mana-core-auth remote validation with caching
- create_auth_dependency(scope): factory for per-service auth deps
Migrated services:
- mana-stt: auth.py now wraps shared auth with scope="stt" (272→42 LOC)
- mana-tts: auth.py now wraps shared auth with scope="tts" (272→42 LOC)
The only difference between services was the scope parameter ("stt" vs "tts").
Both external_auth.py files were 100% identical and are now thin re-exports.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- shared-auth-stores: delete createSupabaseAuthStore (zero usage across monorepo,
all apps use createManaAuthStore). Remove export + types from index.ts.
- services: move ollama-metrics-proxy (stub — just a Grafana dashboard JSON) and
it-landing (Astro landing page, not a service) to services-archived/
- lint-staged: add services-archived/ to eslint ignore pattern
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add unified /ws endpoint that serves all app notifications over a single connection.
The server now includes appId in the sync-available message payload so the client
knows which app to pull. Legacy /ws/{appId} endpoint remains for backward compatibility.
Backend (Go):
- hub.go: Message struct gains AppId field, NotifyUser sends to all user clients
(unified clients receive everything, legacy clients filtered by appId)
- main.go: new GET /ws route (empty appId = unified mode)
Frontend (sync.ts):
- Single connectUnifiedWs() replaces 27 per-app connectWs() calls
- Parses msg.appId from server to pull only the affected app
- Reconnect/offline logic simplified to one WS
This reduces WebSocket connections from 27 per user to 1, cutting server
connection overhead by ~96%.
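The routing rule (an empty appId means unified mode) can be sketched as a small filter. The client type and the recipients helper are hypothetical; only the Message.AppId field and the sync-available message come from the change itself:

```go
package main

import "fmt"

type Message struct {
	Type  string
	AppId string
}

// client is a hypothetical stand-in for a connected WebSocket client.
// An empty appID marks a unified /ws connection.
type client struct {
	appID string
}

// recipients returns the clients that should receive msg: unified
// clients get everything, legacy clients only their own app.
func recipients(clients []client, msg Message) []client {
	var out []client
	for _, c := range clients {
		if c.appID == "" || c.appID == msg.AppId {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	clients := []client{{appID: ""}, {appID: "todo"}, {appID: "calendar"}}
	got := recipients(clients, Message{Type: "sync-available", AppId: "todo"})
	fmt.Println(len(got))
}
```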
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- mana-image-gen: change default port from 3025 to 3026 to avoid conflict with mana-llm
- Dashboard widgets (12): replace APP_URLS.{app}.dev/prod with internal route paths (/todo, /calendar, etc.)
and remove target="_blank" since all apps are now internal routes in the unified app
- Home page: use goto() for internal apps, keep window.open() only for external apps (matrix, arcade)
- AppRow: remove unused APP_URLS import
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New GPU service for fast text-to-video generation using LTX-Video (~2B params)
on the RTX 3090. Generates 480p clips in 10-30 seconds, uses ~10GB VRAM.
Includes Cloudflare Tunnel route, Prometheus monitoring, and health checks.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mana-stt: add WhisperX service with CUDA GPU support, speaker diarization, and auto-fallback chain.
mana-notify: add locale fallback and default templates for task reminders.
CD: update deployment pipeline and docker-compose configuration.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract ~120 hardcoded German strings from 14 Svelte components into i18n locale
files using svelte-i18n $t() calls. Add new translation sections (taskForm, filters,
tags, subtasks, durationPicker, kanban, toolbar) across all 5 languages (de/en/fr/es/it).
Also add missing shared common translations for Spanish, French, and Italian
(150+ keys each) in packages/shared-i18n.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Delete apps/memoro/apps/backend/ (NestJS) and apps/memoro/apps/audio-backend/
(NestJS) — all functionality has been ported to the new Hono/Bun servers
(apps/server/ and apps/audio-server/).
Also clean up root and memoro package.json scripts to remove references
to the old @memoro/backend and @memoro/audio-backend packages.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Accessing (error as any)?.body?.code on a Better Auth APIError triggers an internal
async stream read. When the request body contains special chars like '!', the deferred
JSON parse fails as an unhandled rejection that races with the response, causing 500.
Use only error.status === 'FORBIDDEN' which is a simple string property.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Better Auth uses callbackURL to determine the post-verification redirect target.
Setting only redirectTo left callbackURL=/, which resolved to auth.mana.how/ (404).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Without this, Better Auth's definePayload receives a user object
without the custom accessTier column, causing the JWT tier claim
to always default to 'public'.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The client (shared-auth) calls /api/v1/auth/session-to-token for SSO and
2FA flows, but this endpoint was never implemented. Also, the login endpoint
returned raw Better Auth session data instead of the expected
{ accessToken, refreshToken } format.
- Add POST /api/v1/auth/session-to-token endpoint
- Fix login to generate JWT via Better Auth's /api/auth/token
- Fix refresh to return JWT instead of raw session data
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Introduces a tiered access control system so apps can be released
gradually (founder → alpha → beta → public) without extra infrastructure.
Users are gated at the AuthGate level based on their tier vs the app's
requiredTier. All apps remain deployed and reachable, but only users
with sufficient tier can enter.
- Add accessTier enum + column to users schema (default: 'public')
- Add tier claim to JWT payload in better-auth config
- Add requiredTier field to ManaApp interface + all 25 apps
- Add hasAppAccess(), getAccessibleManaApps(), ACCESS_TIER_LABELS
- Update AuthGate with tier check + access denied screen
- Update getPillAppItems + Home page to filter by user tier
- Update all 22 app layouts to pass user tier to PillNav
- Add admin API: GET/PUT /api/v1/admin/users/:id/tier
- Document access tier system in CLAUDE.md
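The tier comparison behind the gate can be sketched as a rank lookup. The real hasAppAccess() lives in the TypeScript frontend packages; this Go version only illustrates the ordering (founder > alpha > beta > public). Treating an unknown user tier as 'public' and denying on an unknown requiredTier are assumptions:

```go
package main

import "fmt"

// tierRank orders tiers from least to most privileged.
var tierRank = map[string]int{
	"public":  0,
	"beta":    1,
	"alpha":   2,
	"founder": 3,
}

// hasAppAccess reports whether userTier satisfies requiredTier.
func hasAppAccess(userTier, requiredTier string) bool {
	user, ok := tierRank[userTier]
	if !ok {
		user = tierRank["public"] // assumption: unknown tier counts as public
	}
	required, ok := tierRank[requiredTier]
	if !ok {
		return false // assumption: unknown requirement denies access
	}
	return user >= required
}

func main() {
	fmt.Println(hasAppAccess("founder", "beta"), hasAppAccess("public", "alpha"))
}
```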
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>