Local dev secrets like MANA_STT_API_KEY had no persistent home — they
lived only in the gitignored, generator-overwritten per-app .env files.
Every `pnpm setup:env` wiped them, so devs had to re-paste keys after
any env regeneration. Same recurring friction for MANA_LLM_API_KEY,
MANA_AUTH_KEK, OAuth keys, etc.
New layer: `.env.secrets` at the repo root.
- Gitignored, optional, never required for the build to pass
- Read by generate-env.mjs AFTER .env.development; non-empty values
override the matching key, so the merged result drives every per-app
.env the generator writes
- Empty values fall through to the .env.development defaults — a
freshly-copied .env.secrets.example is a no-op
- One source of truth for all dev secrets, propagated to every app
with one `pnpm setup:env`
Files:
- `.env.secrets.example` — committed template documenting all known
secret keys (mana-stt, mana-llm, auth KEK, sync JWT, MinIO, third-
party APIs). Devs `cp .env.secrets.example .env.secrets` and fill in.
- `.gitignore` — ignores .env.secrets, allows .env.secrets.example
- `scripts/generate-env.mjs` — loads .env.secrets if present, prints
"Loaded N secrets from .env.secrets" so devs see the override
taking effect
- `scripts/setup-secrets.mjs` + `pnpm setup:secrets` — convenience
script that SSHes to mana-server, greps the prod .env for the keys
defined in .env.secrets.example, and writes them locally. Confirms
before overwriting an existing .env.secrets unless --force is set;
reports which keys couldn't be found on the remote so devs know
what's left to fill manually
- `docs/LOCAL_DEVELOPMENT.md` + `docs/ENVIRONMENT_VARIABLES.md` —
walk-through and architecture diagram update
Verified end-to-end:
- `rm .env.secrets apps/mana/apps/web/.env && pnpm setup:env` →
STT key empty (no regression for devs who haven't opted in)
- `pnpm setup:secrets --force && pnpm setup:env` →
STT key propagated, "Loaded 3 secrets from .env.secrets" in output
- POST /api/v1/voice/transcribe with a real audio file →
full transcript back via gpu-stt.mana.how, end-to-end working
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The playground route was previously a stub. This turns it into a proper
module:
- A streaming chat surface that talks to mana-llm's OpenAI-compatible
/v1/chat/completions and /v1/models. The SSE chunk parser is hand-rolled
in modules/playground/llm.ts (~30 lines) rather than pulling a dep —
the wire format is straight OpenAI and the playground is the only
consumer right now. If chat / todo enrichment / cycles insights end up
hitting the same surface, this lifts cleanly into $lib/data/llm-client.ts.
- A persisted **snippets** store: name + systemPrompt + (model, temperature)
defaults that the user can pin and reorder. Stateless chat history stays
out — that's what the chat module is for. Both `name` and `systemPrompt`
are encrypted (same pattern as notes/dreams), with a registry entry in
data/crypto/registry.ts and a Dexie schema in data/database.ts.
- Standard module wiring: collections.ts / queries.ts / types.ts /
stores/snippets.svelte.ts / module.config.ts, registered in
module-registry.ts alongside the other 30+ modules.
- ListView.svelte and the (app)/playground/+page.svelte route consume
the new store + the streaming client.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 8 follow-up. Places carries GDPR-sensitive PII so it gets the same
treatment as the rest of Phase 7+8, with one deliberate carve-out:
- `places` encrypts the user-typed surface (name / description / address)
but leaves lat/lng plaintext so the proximity matcher in
tracking.svelte.ts can run during background geolocation logging without
a vault unlock. The trade-off is documented inline in registry.ts: a
handful of named POIs is much less sensitive than the full movement
trail.
- `locationLogs` IS the movement trail, so every coordinate field
(latitude, longitude, accuracy, altitude, speed, heading) is encrypted.
Indexed columns (timestamp, placeId, [placeId+timestamp]) stay plaintext
for the time-range scans in the log view.
- `placeTags` stays out of the registry — pure FK join table, no user
content, same pattern as manaLinks.
queries.useAllPlaces / useLocationLogs now decrypt before mapping to the
DTO. placesStore.create/update snapshot the plaintext DTO before
encryptRecord mutates the local in place — same pattern as
notes/dreams/contacts. trackingStore.logPosition decrypts the place set
before running the nearest-place match (the lat/lng carve-out means this
still works pre-unlock, but downstream consumers want the decrypted name).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
VictoriaMetrics + vmalert previously copied prometheus.yml/alerts.yml from
/mnt/prometheus-config/ into /etc/prometheus/ at container start. The copy
silently drifted from the host file whenever the container wasn't restarted —
which is exactly what hid the matrix/element removal from status.mana.how
until 2026-04-08, when VM was still actively scraping the deleted targets
because its in-container config snapshot pre-dated the cleanup.
Now both containers mount ./docker/prometheus directly into /etc/prometheus
(resp. /etc/alerts) read-only and point the binary at it, and deploy.sh
issues POST /-/reload to both after each deploy so config edits go live
without a container recreate.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The first prod deploy of voice quick-add (3b41b39a3) silently fell
back for every transcript: title=transcript verbatim, dueDate=null,
priority=null, labels=[]. The endpoint code was reaching the
fallback() path even though mana-llm was healthy and reachable from
inside the mana-web container.
Root cause: SvelteKit's $env/dynamic/private explicitly excludes any
env var that starts with the public prefix (default PUBLIC_). The
parse-task code read
env.MANA_LLM_URL || env.PUBLIC_MANA_LLM_URL || 'http://localhost:3025'
expecting to fall back to PUBLIC_MANA_LLM_URL when MANA_LLM_URL was
unset, but $env/dynamic/private treats PUBLIC_MANA_LLM_URL as if it
didn't exist on the server side. So it always fell through to
http://localhost:3025, which from inside mana-web is nothing,
fetch threw, and coerce returned the fallback shape.
Two fixes:
1. docker-compose.macmini.yml — set MANA_LLM_URL (no prefix) on
mana-web alongside PUBLIC_MANA_LLM_URL. The PUBLIC_ var is still
needed for the browser-side playground and status page; the
private one is what the parse endpoints actually read.
2. parse-task and parse-habit — drop the dead env.PUBLIC_MANA_LLM_URL
fallback so the next dev who reads the code doesn't think it'd
ever work. Add a comment explaining the SvelteKit gotcha so the
next person setting up a new env var doesn't repeat this mistake.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds a 13-step integration test that exercises register → email
verification → login → JWT validation → /me/data → encryption-vault
init/key → logout against a real stack of postgres + redis + mailpit +
mana-auth + mana-notify in docker compose.
Verified locally that this catches every regression we hit on
2026-04-08 in well under a second:
- missing nanoid dependency → register endpoint 500
- missing MANA_AUTH_KEK env passthrough → mana-auth never starts
- missing encryption-vault SQL migrations → vault endpoints 500
- wrong cookie name in /api/v1/auth/login → no accessToken in response
- mana-notify SMTP misconfigured → mailpit poll times out
Files:
- docker-compose.test.yml — minimal isolated stack on alt ports
(postgres 5443, redis 6390, mailpit 1026/8026, mana-auth 3091,
mana-notify 3092). Runs alongside the dev stack without collision.
Postgres healthcheck runs a real query rather than just pg_isready
to avoid the race where pg_isready reports healthy while the docker
init scripts are still running on a unix socket.
- tests/integration/auth-flow.test.ts — bun test that drives the full
flow via fetch + mailpit's REST API. Cleans up its test user from
postgres in afterAll. Self-contained, no extra deps.
- tests/integration/README.md — what's covered, why it exists, how
to run locally + extend.
- scripts/run-integration-tests.sh — orchestrator. Brings up the
stack, pushes the @mana/auth Drizzle schema, applies the
encryption-vault SQL migrations (002, 003), restarts mana-auth so
it sees the fresh tables, runs the test, tears down on exit.
KEEP_STACK=1 to leave it up for manual mailpit inspection.
- docker-compose.dev.yml — also adds Mailpit as a regular dev service
(ports 1025/8025) so local development can have a working email
capture without spinning up the test stack.
- .github/workflows/ci.yml — new auth-integration job that runs on
every PR. Calls run-integration-tests.sh; on failure dumps
mana-auth + mana-notify logs and the mailpit message queue. Marked
as a required check via the existing PR validation pipeline.
Reproduced 3 clean runs and 1 negative-control run (removed nanoid
from package.json → mana-auth container exits → script aborts with
non-zero) before committing. Full happy path runs in ~22s on a warm
Docker cache.
SvelteKit's production build forbids non-handler exports from a
+server.ts file — dev runs them fine but `pnpm build` errored with
"Invalid export 'coerce' in /api/v1/voice/parse-task" when trying to
deploy mana-web with the new unit tests.
Move ParseResult, fallback, DATE_TRIGGER_PATTERNS,
PRIORITY_TRIGGER_PATTERNS, transcriptMentions, coerce, and extractJson
into a sibling coerce.ts module. The +server.ts file imports from
there and only exports POST, which is the prod build's hard rule.
Tests now import from ./coerce instead of from the route handler,
which also drops the $env/dynamic/private resolution dance from the
test fast path — coerce.test.ts now runs in ~130ms instead of ~400ms
because it pulls in zero SvelteKit runtime.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fills the gap between the last entry (2026-03-31) and today. The 2026-04-06
slot is intentionally skipped — no commits that day.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The deterministic guards in parse-task's coerce() are the load-bearing
defense against gemma3 hallucinating dueDate / priority on bare tasks.
The integration tests against the live LLM cover the happy path
end-to-end, but they go offline as soon as mana-llm is unreachable —
the unit tests cover the guard logic in isolation with synthetic LLM
responses, so a regression in the rules is caught even when the LLM
itself is dark.
22 cases:
- transcriptMentions: substring matching, case-insensitivity, empty
pattern list, the German + English date words from the few-shot
examples, and the negative cases ("Mülltonnen rausstellen",
"Buy milk") that must NOT trigger.
- coerce: fallback shape on garbage input, transcript-as-title when
the model omits one, time-component stripping ("2026-04-09T14:00:00"
→ "2026-04-09"), malformed dueDate rejection, the dueDate /
priority hallucination guards (drop when the transcript has no
trigger word), real-date / real-priority preservation, label
filtering (cap at 3, drop non-strings, empty array on non-array
input), invalid priority value rejection.
Helpers exported solely for the tests via a __test object — the
production endpoint goes through buildPrompt + coerce as before.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two related changes that fall out of real end-to-end testing against
the now-working local mana-llm.
1. Default model bumped from gemma3:4b to gemma3:12b for both
parse-task and parse-habit. The 4b model gets weekday math
off-by-one ("nächsten Montag" from a Wednesday → 2026-04-14
instead of 2026-04-13), aggressively shortens titles ("Anna
anrufen" → "Anrufen"), and frequently paraphrases habit names
instead of copying verbatim ("Joggen" instead of "Laufen") which
the verbatim-validation in coerce drops, costing an LLM round-trip
for nothing. The 12b variant is roughly 10% slower for these
tiny prompts (~1.1s vs ~1.0s on the GPU box) so the accuracy
win is essentially free.
2. parse-task prompt rewritten as few-shot. Pure rule descriptions
were *worse* than simple examples — the long "Rules — read
carefully" section in the previous prompt actually made the model
compute next Monday as 2026-04-14 even though a direct "what date
is next Monday?" prompt to the same model returned 2026-04-13.
The detailed rules were also priming the model to over-shorten
titles and over-eagerly tag filler words. Five worked examples
(including the previously-failing "Anna nächsten Montag anrufen"
case) plus one novel case ("Mama am Wochenende besuchen") all
come back correct now, including for the novel one.
The deterministic guards in coerce() are kept as a backstop for the
day the GPU box swaps in a weaker model — they're cheap and don't
hurt the happy path.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Same convention as STT_URL — nobody runs mana-llm in local Docker for
dev work, the shared gateway is always reachable, so the path of least
friction is to point at it by default. Devs who want a fully offline
stack can still override the var locally.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Real end-to-end testing against the now-working local mana-llm
surfaced two model behaviours the prompt couldn't talk down:
1. gemma3:4b stamps today's date on every task that doesn't have a
real time anchor. "Mülltonnen rausstellen" came back with
dueDate=2026-04-08 and priority=low even though the prompt
explicitly said "MUST be null when no date is mentioned". After
typing "Buy milk" the user would silently get a today-due task,
which is worse than no parsing at all.
2. The model occasionally returns dueDate as a full ISO timestamp
("2026-04-09T14:00:00") when the transcript mentions a time. The
coerce regex previously matched the prefix and let the timestamp
through unchanged, which then breaks the YYYY-MM-DD-shaped Dexie
field downstream.
Fix: deterministic post-processing in coerce. The prompt is also
tightened with explicit "ONLY when…" rules but the guards are the
load-bearing change since gemma3:4b ignores prompt restrictions.
- Strict YYYY-MM-DD extraction: a leading-anchor regex match keeps
only the date prefix even if the model adds a time component.
- DATE_TRIGGER_PATTERNS: substring scan over the original transcript
for German + English date words. If the LLM returned a dueDate but
the transcript has zero matches, drop the date — it was a
hallucination. False positives are preferable to false negatives:
letting through a fake date is more annoying than suppressing a
real one the user can re-type.
- PRIORITY_TRIGGER_PATTERNS: same idea for priority. The model thinks
taxes are inherently urgent; we don't want to inherit that opinion.
The labels field is left noisy on purpose — "müll", "unbedingt",
"erledigen" all come back from a single transcript and only the ones
that fuzzy-match an existing workspace tag end up on the task, so
filtering filler words at this layer would be wasted work.
Verified against five transcripts spanning bare/explicit/relative
date in DE + EN. Real LLM round-trip via http://localhost:5173 →
https://llm.mana.how → ollama gemma3:4b. Local mana-llm now reaches
its Ollama backend after the gpu-proxy routing fix in 7f382138a.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
A grep audit after the previous matrix removal commits found a handful
of stragglers in non-runtime files that the earlier sweeps missed:
- services/mana-llm/CLAUDE.md: removed matrix-ollama-bot from the
consumer-apps diagram and from the related-services table
- services/mana-video-gen/CLAUDE.md: removed "Matrix Bots" integration
bullet
- packages/notify-client/README.md: removed sendMatrix() doc entry
(the method itself was already gone in the prior cleanup)
- docker/grafana/dashboards/logs-explorer.json: dropped the "Matrix
Stack" log row that queried tier="matrix" (would show no data forever)
- docker/grafana/dashboards/master-overview.json: dropped the "Matrix
Bots" stat panel that counted up{job=~"matrix-.*-bot"}
- apps/mana/apps/landing/src/data/ecosystem-health.json: regenerated via
scripts/ecosystem-audit.mjs to drop matrix from the app list, icon
counts, file analytics, top offenders and authGuard missing list
- .gitignore: removed services/matrix-stt-bot/data/ pattern (the
service itself was deleted long ago)
Production-side stragglers also addressed (not in this commit):
- DROP USER synapse on prod Postgres (the parallel cleanup commit
2514831a3 dropped DATABASE matrix + DATABASE synapse but left the
role behind)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The mana-service-llm container had OLLAMA_URL pointed at the GPU box's
LAN address (192.168.178.11:11434). On the Mac Mini host that route
works fine, but from inside any Colima container the entire
192.168.178.0/24 subnet gets synthesized RST — Colima's VM "claims"
the LAN range without being able to route to it, so every connect()
returns "Connection refused" before a packet ever leaves the box.
mana-llm started cleanly, reported the configured upstream as
"unhealthy", served an empty /v1/models list, and every chat
completion failed with "All connection attempts failed". The most
visible downstream effect: voice quick-add (parse-task, parse-habit)
silently degraded to its no-LLM fallback for everyone hitting the
local stack — same shape as a successful response, no error log,
just no enrichment.
The Mac Mini already runs a gpu-proxy LaunchAgent
(com.mana.gpu-proxy, /Users/mana/gpu-proxy.py) that forwards
127.0.0.1:13434 → 192.168.178.11:11434 alongside several other GPU
service ports. Pointing OLLAMA_URL at host.docker.internal:13434 and
adding the host-gateway extra_hosts mapping puts mana-llm on the
already-running rail. Verified end-to-end: from inside the container,
GET http://host.docker.internal:13434/api/tags now returns the full
model list (gemma3:4b, gemma3:12b, gemma3:27b, qwen2.5-coder:14b,
nomic-embed-text).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reflects the removal of apps/matrix and services/mana-matrix-bot from
the workspace plus the dropped @matrix-org/matrix-sdk-crypto-nodejs
override in package.json. Net -365 lines.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The /api/v1/voice/parse-task and /api/v1/voice/parse-habit endpoints
forwarded transcripts to mana-llm without an X-API-Key header. This
worked against the local mana-llm container (no auth) but silently
fell back to the no-LLM path when pointed at gpu-llm.mana.how, which
requires an API key — voice quick-add would look like it was running
in degraded mode forever with no signal that auth was the cause.
Now both endpoints read MANA_LLM_API_KEY from the server-side env and
attach it as X-API-Key when present, mirroring the pattern already
used by /api/v1/voice/transcribe for mana-stt. When the var is empty
the header is omitted, so local Docker setups without auth still work.
Plumbing: generate-env.mjs writes MANA_LLM_URL + MANA_LLM_API_KEY into
apps/mana/apps/web/.env, .env.development gets the new keys with empty
defaults, ENVIRONMENT_VARIABLES.md documents the gateway and where to
get a key.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The matrix subsystem was removed in a prior commit. This commit cleans
up the small leftovers that grep found:
- docker-compose.macmini.yml: dropped the "Matrix Stack" port-range
comment, the "matrix" category from the naming convention, and a
stale watchtower comment about Matrix notifications.
- packages/credits/src/operations.ts: removed AI_BOT_CHAT credit
operation type and its definition. It was the billing entry for "Chat
with AI via Matrix bot" — no callers left.
- services/mana-credits gifts schema + service + validation: removed the
targetMatrixId column / param / Zod field. The corresponding
PostgreSQL column was dropped manually with
`ALTER TABLE gifts.gift_codes DROP COLUMN target_matrix_id` on prod.
- docker/grafana/dashboards/{master,system}-overview.json: removed the
`up{job="synapse"}` panel queries — they would have shown No Data
forever now that Synapse is gone.
Production-side cleanup performed in parallel (not in this commit):
- Stopped + removed mana-matrix-{synapse,element,web,bot} containers
- Removed mana-matrix-bot:local, matrix-web:latest,
matrixdotorg/synapse:latest, vectorim/element-web:latest images (~3 GB)
- Removed mana-matrix-bots-data Docker volume
- Removed /Volumes/ManaData/matrix/ media store (4.3 MB)
- DROP DATABASE matrix; DROP DATABASE synapse; on Postgres
Cosmetic leftovers intentionally untouched:
- Eisenhower matrix in todo (LayoutMode 'matrix') — productivity concept
- ${{ matrix.service }} in .github/workflows — GitHub Actions strategy
- services/mana-media/apps/api/dist/.../matrix/* — stale build output
(not in git, regenerated next mana-media build)
Two new test files lock in the matching boundary where free-text LLM
hints meet the user's actual workspace data — that's where bugs hide
silently. Both matchers are now pure-function-shaped (the production
wrappers just feed them Dexie data) so the tests run without
fake-indexeddb or any I/O.
todo: 16 cases for matchLabelsToTagsPure covering exact / case /
diacritic / substring / specificity rules + the "never invent tags"
guarantee.
habits: 11 cases for matchHabitToTranscript including the word-
boundary "Bier vs ausprobiert" false-positive, multi-word matching,
and a real bug the test surfaced on the first run:
Without specificity ranking, "Tee" would always beat "Grüner Tee"
because the first matching habit in input order won. The matcher
now collects all candidates and returns the one with the most
matched tokens, so multi-word habits beat single-word substrings
whenever both could fit the transcript.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit bundles two unrelated changes that were swept together by an
accidental `git add -A` in another working session. Documented here so the
history reflects what's actually inside.
═══════════════════════════════════════════════════════════════════════
1. fix(mana-auth): /api/v1/auth/login mints JWT via auth.handler instead
of api.signInEmail
═══════════════════════════════════════════════════════════════════════
Previous attempt (commit 55cc75e7d) tried to fix the broken JWT mint in
/api/v1/auth/login by switching the cookie name from `mana.session_token`
to `__Secure-mana.session_token` for production. That was necessary but
not sufficient: Better Auth's session cookie value isn't just the raw
session token, it's `<token>.<HMAC>` where the HMAC is derived from the
better-auth secret. Reconstructing the cookie from auth.api.signInEmail's
JSON response only gave us the raw token, so /api/auth/token's
get-session middleware still couldn't validate it and the JWT mint kept
silently failing.
Real fix: do the sign-in via auth.handler (the HTTP path) rather than
auth.api.signInEmail (the SDK path). The handler returns a real fetch
Response with a Set-Cookie header containing the fully signed cookie
envelope. We capture that header verbatim and forward it as the cookie
on the /api/auth/token request, which now passes validation and mints
the JWT correctly.
Verified end-to-end on auth.mana.how:
$ curl -X POST https://auth.mana.how/api/v1/auth/login \
-d '{"email":"...","password":"..."}'
{
"user": {...},
"token": "<session token>",
"accessToken": "eyJhbGciOiJFZERTQSI...", ← real JWT now
"refreshToken": "<session token>"
}
Side benefits:
- Email-not-verified path is now handled by checking
signInResponse.status === 403 directly, no more catching APIError
with the comment-noted async-stream footgun.
- X-Forwarded-For is forwarded explicitly so Better Auth's rate limiter
and our security log see the real client IP.
- The leftover catch block now only handles unexpected exceptions
(network errors etc); the FORBIDDEN-checking logic in it is dead but
harmless and left in for defense in depth.
═══════════════════════════════════════════════════════════════════════
2. chore: remove the entire self-hosted Matrix stack (Synapse, Element,
Manalink, mana-matrix-bot)
═══════════════════════════════════════════════════════════════════════
The Matrix subsystem ran parallel to the main Mana product without any
load-bearing integration: the unified web app never imported matrix-js-sdk,
the chat module uses mana-sync (local-first), and mana-matrix-bot's
plugins duplicated features the unified app already ships natively.
Keeping it alive cost a Synapse + Element + matrix-web + bot container
quartet, three Cloudflare routes, an OIDC provider plugin in mana-auth,
and a steady drip of devlog/dependency churn.
Removed:
- apps/matrix (Manalink web + mobile, ~150 files)
- services/mana-matrix-bot (Go bot with ~20 plugins)
- docker/matrix configs (Synapse + Element)
- synapse/element-web/matrix-web/mana-matrix-bot services in
docker-compose.macmini.yml
- matrix.mana.how/element.mana.how/link.mana.how Cloudflare tunnel routes
- OIDC provider plugin + matrix-synapse trustedClient + matrixUserLinks
table from mana-auth (oauth_* schema definitions also removed)
- MatrixService import path in mana-media (importFromMatrix endpoint)
- Matrix notification channel in mana-notify (worker, metrics, config,
channel_type enum, MatrixOptions handler)
- Matrix entries from shared-branding (mana-apps + app-icons),
notify-client, the i18n bundle, the observatory map, the credits
app-label list, the landing footer/apps page, the prometheus + alerts
+ promtail tier mappings, and the matrix-related deploy paths in
cd-macmini.yml + ci.yml
Devlog/manascore/blueprint entries that mention Matrix are left intact
as historical record. The oauth_* + matrix_user_links Postgres tables
stay on existing prod databases — code can no longer write to them, drop
them in a follow-up migration if you want them gone for real.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Users can now define multiple named layouts of the workbench homepage and
switch between them. Each scene holds its own openApps list with per-app
window state (minimized / maximized / size). Scene list syncs cross-device
via mana-sync; the active scene id is per-device (localStorage) so device
A doesn't pull device B into a different scene.
- new `workbenchScenes` Dexie table, registered in manaCoreConfig
- `workbenchScenesStore` (Dexie liveQuery) with scene CRUD + per-scene app
mutations; auto-seeds a default "Home" scene on first run
- SceneTabs pill bar above the carousel with dnd reorder + context menu
(rename / duplicate / delete); SceneRenameDialog and a reusable
ConfirmDialog for the destructive path
- workbench +page.svelte refactored to delegate all openApps mutations to
the store; the carousel itself is unchanged
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The parse-task endpoint already returns free-text label hints from the
LLM ("steuern", "haushalt", …). Now the todo store fuzzy-matches each
hint against the user's existing tags via tagCollection and assigns
the matched IDs to the task's metadata.labelIds.
Match policy is intentionally conservative:
- Normalize via NFD strip + lowercase + collapsed whitespace
- Exact normalized match wins
- Substring fallback only for ≥3 char strings (avoids "ab" hitting
every tag containing "ab")
- Never auto-creates a tag — even if the LLM is sure, an unknown topic
silently drops, because auto-creating would clutter the user's tag
list with one-off duplicates from voice transcripts
Both flows pick this up: voice always (transcripts almost always carry
topic hints) and typed only when there's structured payoff, same
asymmetry as before — typed quick-add now also enriches when the LLM
just finds a tag match without a date or priority.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The custom /api/v1/auth/login route signs the user in via the
better-auth SDK (auth.api.signInEmail) and then forges a request to
/api/auth/token to mint a JWT, passing the session token as a synthetic
cookie header.
The cookie name was hardcoded as `mana.session_token=...`, but in
production better-auth issues the session cookie with the __Secure-
prefix (because secure: true is enabled). Get-session middleware on the
/api/auth/token side couldn't find the session under the unprefixed
name, so it returned 401 silently. Result: tokenResponse.ok was false,
the route fell through, and the response had no `accessToken` field at
all — only the bare { token, user, redirect } from signInEmail.
The frontend in @mana/shared-auth then picked this up as
`data.accessToken === undefined` and stored undefined as the JWT, while
the parallel /api/auth/sign-in/email call masked the visible damage by
setting the SSO cookie. So login *appeared* to work in the browser
(cookie present, session worked) but the JWT path was always broken.
Fix: pick the cookie name based on config.nodeEnv. In production use
__Secure-mana.session_token, in development use mana.session_token (no
__Secure- prefix because secure: false in dev).
Verified end-to-end on auth.mana.how:
POST /api/v1/auth/login → response now includes accessToken (a real
JWT, EdDSA, with sub/email/role/sid/tier/iss/aud claims), refreshToken
(the session token), plus the original signInEmail fields.
The other /api/auth/get-session call sites in this file forward the
incoming request headers verbatim, so they preserve whatever real cookie
the browser sent and don't have this bug.
Press Enter on "Steuererklärung morgen 14 Uhr hoch" and the task lands
instantly with your exact text as the title — then a background pass
through /api/v1/voice/parse-task swaps in dueDate + priority once
mana-llm answers. The title only gets rewritten when the LLM actually
finds structured info (dueDate or priority); for plain titles like
"Mülltonnen rausstellen" the typed text is left alone, since silently
"cleaning up" perfectly fine input is more annoying than helpful.
Pulled the parse + STT-then-parse plumbing apart so both flows share
parseTaskText() and only differ in policy: voice always applies the
LLM title (raw transcripts are noisy), typed only when there's
structured payoff.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The previous voice quick-add dumped the whole transcript into the task
title — fine for "Steuererklärung" but useless for "Steuererklärung
morgen 14 Uhr hoch", which should land as title="Steuererklärung",
dueDate=tomorrow, priority="high".
New endpoint /api/v1/voice/parse-task posts the transcript to mana-llm
(gemma3:4b, temperature 0) with a tight system prompt that asks for
strict JSON: { title, dueDate, priority, labels }. The endpoint coerces
the response back into the typed shape and falls through to
{ title: transcript, … } whenever anything goes wrong — mana-llm down,
JSON garbled, network timeout. Voice quick-add must never fail harder
than typed quick-add, so the fallback path is the rule, not the
exception.
Labels come back from the LLM as free-text topic hints and don't yet
map to the workspace's tag IDs — fuzzy matching against existing tags
is a follow-up.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The per-module /api/v1/memoro/transcribe and /api/v1/dreams/transcribe
endpoints were literal copies that proxied to mana-stt. Now that the
generic /api/v1/voice/transcribe endpoint exists (added with notes),
point both stores at it and delete the duplicates. -200 LOC, one place
to update STT auth or response shape from now on.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Speak a task and it lands in the list as a placeholder while mana-stt
transcribes it; the title swaps in once the transcript returns.
No date/priority/label parsing yet — that's a follow-up that needs an
LLM pass over the transcript. For now the whole transcript becomes the
task title and the user can edit inline.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Drop a mic into Notes — record, transcribe through the new generic
/api/v1/voice/transcribe proxy (mana-stt), then write the result back
into the placeholder note. The first transcript line becomes the title
when it fits in 80 chars, otherwise a generic 'Sprachnotiz' label.
The inline editor refreshes from the live note while the placeholder
'…' content is still on screen, so a transcript that arrives a moment
after the editor opens shows up automatically without overwriting
anything the user has typed.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mana-auth's config.ts has hard-failed startup since commit e9915428c
(phase 2 encryption vault) when MANA_AUTH_KEK is unset in production.
.env.macmini.example documents the variable, but the docker-compose
service definition for mana-auth never had a corresponding
MANA_AUTH_KEK: ${MANA_AUTH_KEK} line in its environment block, so even
when the variable was set in the host .env, it never reached the
container. Result: every restart since yesterday looped on
"MANA_AUTH_KEK env var is required in production".
Added the env passthrough alongside BETTER_AUTH_SECRET with an inline
comment pointing at the generation command + service CLAUDE.md.
Operator action required on the Mac Mini:
KEK=$(openssl rand -base64 32)
echo "MANA_AUTH_KEK=$KEK" >> .env
./scripts/mac-mini/build-app.sh mana-auth # or compose up -d mana-auth
Then back the value up — it cannot be rotated today without re-wrapping
all existing user vaults (no background re-wrap job yet, kek_id column
on encryption_vaults is reserved for the future migration path).
Dreams and Memoro had two literal copies of the MediaRecorder boilerplate
plus parallel mic-button markup, error UI, and requireAuth gating. Lift
the recorder + bar into $lib/components/voice and add it to the memoro
workbench ListView (which had no mic at all). New voice-capture features
just drop in <VoiceCaptureBar> with idleLabel/feature/reason/onComplete.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mana-auth has been crash-looping in production with:
error: Cannot find package 'nanoid' from
'/app/src/services/encryption-vault/index.ts'
The encryption-vault service imports nanoid for audit row IDs (line 27,
used at line 547 in the audit log writer), but nanoid was never added
to services/mana-auth/package.json. The import was introduced in commit
e9915428c (phase 2 — server-side master key custody) and slipped past
because nanoid happens to exist transitively in the workspace via
postcss → nanoid@3.3.11. Local pnpm store lookups would resolve it just
fine; a strict isolated container build can't.
Fix:
- Add "nanoid": "^5.0.0" to services/mana-auth/package.json deps
- pnpm install pulled nanoid@5.1.7 into services/mana-auth/node_modules
Verified the import resolves locally:
bun -e 'import { nanoid } from "nanoid"; console.log(nanoid())'
→ ok: 6TLuTWlenhC0KnSESn5Ex
The Mac Mini still needs to redeploy mana-auth (rebuild image with the
new lockfile, restart container) to pick this up — production is
currently 502ing on auth.mana.how.
The lockfile had drifted out of sync with two package.json files:
- services/mana-events/package.json declared drizzle-orm, hono, jose,
postgres, zod, drizzle-kit, typescript — but mana-events was never
registered as an importer in pnpm-lock.yaml at all. A frozen-lockfile
install would fail.
- apps/mana/apps/web/package.json had "postgres": "^3.4.9" as a
devDependency that the lockfile hadn't picked up.
Both are already declared in their package.json — this commit just
locks them in. No new top-level dependencies are introduced.
The rest of the diff is non-substantive churn from running pnpm install
(jiti peer-version flips between 1.21.7 ↔ 2.6.1, expo-font peer
specifier format becoming more explicit). Net diff is −102 lines
despite registering two new importers, because the peer-format
verbose-ification deduplicates a few entries.
The unified Mana app runs most modules in a "guest mode": you can
open a module, look around, type a quick note, etc. without an
account. But anything that touches an *encrypted* table (dreams
voice capture, memoro recordings, notes, todo, calendar events, …)
needs the user to be logged in — the encryption vault only unlocks
against a Mana Auth session, and writing to those tables without
it throws `VaultLockedError` at the very last step of the action.
Before this commit, every entry point into an encryption-required
action would silently let the guest go through the whole flow
(record audio, wait for transcription, open the dexie write) and
then explode with a stack-trace error. The user lost work and
didn't know why. The dreams voice capture flow surfaced this
during the 2026-04-08 STT debugging session.
The fix is a global imperative gate: `requireAuth({ feature, reason })`.
Call sites await it before the action; it returns immediately if the
user is already authenticated, otherwise pops a global modal that
asks the guest to log in or cancel. Promise-based, so callers
decide what to do with `false` (silent abort, restore state, own
toast).
$lib/auth/require-auth.svelte.ts new — store + helper
$lib/components/auth/AuthRequiredModal.svelte new — global modal
routes/+layout.svelte mount the modal once
packages/shared-utils/src/analytics.ts new ManaEvents.featureBlockedByAuth
event for conversion tracking
Wired into the two voice-capture entry points that actually exhibited
the bug:
modules/dreams/ListView.svelte → feature: 'dreams-voice-capture'
routes/(app)/memoro/+page.svelte → feature: 'memoro-voice-capture'
Both gate on `requireAuth()` BEFORE the mic permission request, so
guests see the friendly "Konto erforderlich" modal instead of
recording → transcribing → crashing.
Design choices documented in detail in the require-auth.svelte.ts
header comment:
- Imperative function (not a button wrapper component) so it
works in event handlers, store actions, keyboard shortcuts,
drag-drop handlers — anywhere async code runs.
- Single global modal mounted once in the root layout, no
portal/z-index gymnastics; two simultaneous prompts replace
each other (the most recent one wins).
- Checks `authStore.isAuthenticated`, not vault-unlocked state —
the user-facing concept is "I need an account", not "I need
a working encryption vault". Vault-unlock failures (network
error etc.) are a separate bug class with their own UX.
- The modal navigates to `/login?next=<current path>` so the
user lands back on the same page after logging in. The
Promise resolves `false` on navigation; the user re-clicks
the original button after coming back, and the second click
sees `isAuthenticated === true` and proceeds without a modal.
Re-triggering the original action across a navigation cycle
would require restoring half-recorded mic state — not worth
the complexity, and the second click is a clean UX.
How to wire a new entry point (4 lines):
import { requireAuth } from '$lib/auth/require-auth.svelte';
async function handleCreateThing() {
const ok = await requireAuth({
feature: 'create-thing',
reason: 'Things werden verschlüsselt gespeichert. Dafür brauchst du ein Mana-Konto.',
});
if (!ok) return;
// ...existing logic
}
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three independent bugs that conspired to make the dreams + memoro mic
buttons completely unusable in production AND in dev. Each one alone
would have been the only blocker; they layered on top of each other so
fixing the top one just exposed the next.
1. Permissions-Policy header blocked the microphone API entirely.
`packages/shared-utils/src/security-headers.ts` set
`microphone=()` which means "no origin, including self, may use
the microphone". `getUserMedia()` throws a `Permissions policy
violation` and the browser never even shows the permission
dialog — no amount of OS / browser / site settings can override
it because the policy blocks the API at the document level.
Fix: change to `microphone=(self)` so mana.how itself can use
the API. Camera stays disallowed (no module needs it).
2. Notification permission was requested at layout mount time.
`(app)/+layout.svelte` called
`notificationService.requestPermission()` from `onMount()`. Modern
browsers require permission requests to come from a user gesture
— calling it without one queues the prompt until the next click.
That meant the user's FIRST click on any button (in this case the
dreams "Traum sprechen" mic button) showed the queued notifications
prompt instead of the action they actually clicked. Worse,
`getUserMedia()` was then silently dropped because Chrome only
shows one permission dialog at a time.
Fix: remove the mount-time call entirely. Notification permission
must be requested from a button the user explicitly clicks
("Benachrichtigungen aktivieren" toggle in Settings or first time
a reminder is created) — the reminder scheduler still runs without
permission, it just won't fire OS notifications until granted.
3. vite-plugin-pwa registered a service worker in dev that cached
the old layout chunks across reloads, so the fix for #2 was
invisible until the user manually unregistered the SW in DevTools.
`vite-plugin-pwa` defaults `devEnabled: true`, which is a
well-known footgun for fast iteration. Production still gets the
full SW (this only flips dev). The 2026-04-08 mic-button hunt
took an extra hour for exactly this reason.
Fix: pass `devEnabled: false` to createPWAConfig in vite.config.ts.
Verified: in a fresh incognito tab on `localhost:5173/`, opening the
Dreams app in the workbench and clicking the mic button now shows the
microphone permission dialog directly (no notifications hijack), and
recording → transcription works end-to-end against the production
mana-stt service on the GPU box.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mana-voice-bot's source default was 3050, which collided with mana-sync.
Today the collision is latent (voice-bot isn't deployed anywhere), but
sooner or later someone is going to start it on a host that's already
running mana-sync and the second one will refuse to bind. Moving to
3024 puts it inside the AI/ML port range alongside its dependencies
(stt 3020, tts 3022, image-gen 3023, llm 3025) and away from sync.
Updated:
- app/main.py — PORT default 3050 → 3024
- start.sh, setup.sh — same fix in the example commands
- CLAUDE.md — full rewrite. Old version described "Mac Mini deployment"
with launchd; the new version explicitly says "not deployed yet" and
documents the seven concrete steps to deploy on the Windows GPU box
alongside the other AI services (Scheduled Task, service.pyw, .env,
firewall rule, cloudflared route, WINDOWS_GPU_SERVER_SETUP.md update).
docs/WINDOWS_GPU_SERVER_SETUP.md:
- Added the missing ManaVideoGen scheduled task to all four
Start-ScheduledTask snippets — video-gen has been running on the
Windows GPU but the doc had never picked it up.
- Added a "mana-video-gen (Port 3026)" service section parallel to the
existing image-gen one, with venv path, repo pointer, model, etc.
- Added a repo-pendants table mapping C:\mana\services\<svc>\ to the
corresponding services/<svc>/ directory in the repo, plus a note that
changes should flow repo→Windows, not the other way around.
docs/PORT_SCHEMA.md:
- Reconciled the warning block with the post-cleanup reality: no more
active or latent port collisions (image-gen ↔ video-gen and
voice-bot ↔ sync are both resolved). Listed the actual ports per host
with public URLs. Kept the planned-vs-actual disclaimer for the
services that still don't match the aspirational ranges (mana-credits
3061 vs planned 3002, etc).
The Mac Mini hasn't run mana-llm/stt/tts/image-gen for a while — those
services live on the Windows GPU server now. The Mac-targeted
installers, plists, and platform-checking setup scripts have been
sitting in the repo as cargo-cult, suggesting Mac Mini deployment is
still a real option. It isn't.
Removed (Mac-Mini deployment infrastructure):
services/mana-stt/
- com.mana.mana-stt.plist (LaunchAgent)
- com.mana.vllm-voxtral.plist (LaunchAgent for the abandoned local Voxtral experiment)
- install-service.sh (single-service launchd installer)
- install-services.sh (mana-stt + vllm-voxtral installer)
- setup.sh (Mac arm64 installer)
- scripts/setup-vllm.sh (vLLM-Voxtral setup)
- scripts/start-vllm-voxtral.sh
services/mana-tts/
- com.mana.mana-tts.plist
- install-service.sh
- setup.sh (Mac arm64 installer)
scripts/mac-mini/
- setup-image-gen.sh (Mac flux2.c launchd installer)
- setup-stt.sh
- setup-tts.sh
- launchd/com.mana.image-gen.plist
- launchd/com.mana.mana-stt.plist
- launchd/com.mana.mana-tts.plist
setup-tts-bot.sh stays — it's the Matrix TTS bot installer (Synapse
side), not the mana-tts service.
Updated:
- services/mana-stt/CLAUDE.md, README.md — fully rewritten for the
Windows GPU reality (CUDA WhisperX, Scheduled Task ManaSTT, .env keys
matching the actual production .env on the box)
- services/mana-tts/CLAUDE.md, README.md — same treatment, documenting
Kokoro/Piper/F5-TTS on the Windows GPU under Scheduled Task ManaTTS
- scripts/mac-mini/README.md — dropped the STT setup section, replaced
with a pointer to docs/WINDOWS_GPU_SERVER_SETUP.md and the per-service
CLAUDE.md files
- docs/MAC_MINI_SERVER.md — expanded the "deactivated launchagents"
list to mention the now-removed plists, added the full GPU service
port table with public URLs, added a cleanup snippet for any old plists
still installed on a Mac Mini somewhere
The repo's mana-image-gen used to be a Mac Mini–only service built on
flux2.c with hard MPS+arm64 platform checks. The actual production
image-gen runs on the Windows GPU server (RTX 3090) using HuggingFace
diffusers + PyTorch CUDA + FLUX.1-schnell — completely different code
that lived only at C:\mana\services\mana-image-gen\ on the GPU box.
This commit pulls the Windows implementation into the repo and deletes
the Mac one, so there's exactly one mana-image-gen and its source of
truth is git rather than one folder on one machine.
Removed:
- setup.sh — Mac-only flux2.c installer with hard arm64 platform check
- app/main.py (Mac flux2.c subprocess wrapper version)
- app/flux_service.py (Mac flux2.c subprocess wrapper version)
Added (pulled from C:\mana\services\mana-image-gen\):
- app/main.py — FastAPI endpoints (/generate, /images/*, /cleanup)
- app/flux_service.py — diffusers FluxPipeline wrapper
- app/api_auth.py — ApiKeyMiddleware (GPU_API_KEY)
- app/vram_manager.py — shared VRAM accounting
- service.pyw — Windows runner used by the ManaImageGen scheduled task
Updated:
- main.py PORT default from 3025 → 3023 to match the production reality
(the service.pyw runner already binds 3023 explicitly via uvicorn.run,
but the source default should match so direct uvicorn invocations and
local tests don't pick the wrong port)
- CLAUDE.md fully rewritten to describe the Windows/CUDA/diffusers stack
- README.md trimmed to a pointer at CLAUDE.md + the public URL
- .env.example written from scratch (didn't exist before — the service's
.env on the GPU box was undocumented)
The setup-image-gen.sh launchd installer in scripts/mac-mini/ and the
actual Mac Mini deployment will be cleaned up in the next commit, along
with the rest of the Mac-Mini AI service infrastructure.
The Windows GPU server has been the actual production home for these
services for some time, and the running code there has drifted ahead of
the repo. This sync pulls the live versions back into the repo so the
Windows box is no longer the only place those changes exist.
Pulled from C:\mana\services\* on mana-server-gpu (192.168.178.11):
mana-llm:
- src/main.py, src/config.py — small fixes (auth wiring, config tweaks)
- src/api_auth.py — NEW (cross-service GPU_API_KEY validator)
- service.pyw — Windows runner used by the ManaLLM scheduled task
(sets up logging redirect, loads .env, calls uvicorn)
mana-stt:
- app/main.py — substantial cleanup (684→392 lines), drops the
whisperx-as-separate-backend branching now that whisper_service.py
rolls whisperx in directly
- app/whisper_service.py — full CUDA + whisperx rewrite (158→358 lines)
- app/auth.py + external_auth.py — significantly expanded auth
- app/vram_manager.py — NEW (shared VRAM accounting helper)
- service.pyw — Windows runner with CUDA pre-init, FFmpeg PATH
injection, .env loading
- removed: app/whisper_service_cuda.py (folded into whisper_service.py)
- removed: app/whisperx_service.py (folded into whisper_service.py)
mana-tts:
- app/auth.py, external_auth.py — same auth expansion as stt
- app/f5_service.py, kokoro_service.py — Windows tweaks
- app/vram_manager.py — NEW (same shared helper as stt)
- service.pyw — Windows runner
mana-video-gen:
- service.pyw — Windows runner (no other changes; the .py code on the
GPU box is byte-identical to what's already in the repo)
The service.pyw files contain absolute Windows paths
(C:\mana\services\<svc>) and a hardcoded FFmpeg PATH for the tills user
profile. Kept as-is intentionally — they exist to be deployed to that
one machine and any abstraction layer would just hide what's actually
happening. Anyone redeploying to a different layout will need to edit
the path strings, which is a known and obvious change.
Mac-Mini infrastructure for these services (launchd plists, install
scripts, scripts/mac-mini/setup-{stt,tts}.sh, the Mac-flux2c image-gen
implementation) is still on disk and will be removed in a follow-up
commit, along with replacing mana-image-gen with the Windows
diffusers+CUDA implementation. This commit is just the live-code sync.
LoginPage cleanup:
- Drop dev pre-fill credentials and the secret logo-as-button trick
- Remove duplicate in-component theme toggle; accept isDark as a prop and let the (auth) layout's global theme toggle drive it
- Move passkey CTA below the password form so the primary flow stays primary
- Remove the dead "Angemeldet bleiben" checkbox (was bound but never forwarded to onSignIn)
- Fix the skip-to-form link to use sr-only/focus:not-sr-only so it only appears on keyboard focus
- Fix the "oder" divider to render its before/after hairlines by setting an explicit color on the parent
- Wire focus-visible outlines on all interactive controls
- Bump 0.6 → 0.75 opacity on subtitle text for AA contrast
- Drop opacity-60 from the headerControls wrapper
Robustness:
- Track all setTimeout IDs in a Set and clear them in an effect cleanup so navigation away doesn't fire stale callbacks (success redirects, error shake, focus restore)
- Replace (result as any) casts with the new typed AuthResult fields
- New resolveErrorCode() helper prefers result.errorCode and falls back to legacy string matching, so rate-limit / account-lock detection survives i18n
- WebAuthn Conditional UI: on mount, if PublicKeyCredential.isConditionalMediationAvailable(), call onSignInWithPasskey({ conditional: true }) so passkeys appear inline in the email autofill dropdown
- Extract the dismissible success-banner markup into a {#snippet successBanner} and reuse it for the verified / verification-sent / magic-link-sent cases (~50 lines of duplicate JSX out)
Page wrappers:
- login/+page.svelte passes isDark={theme.isDark} so the in-app theme store drives both layouts
- register/+page.svelte wraps trackGuestConversion() in queueMicrotask + try/catch so analytics can never block the success redirect
- Drop the dead baseSignupCredits={25} prop from register/+page.svelte (RegisterPage never accepted it)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add AuthErrorCode union and typed twoFactorRedirect/retryAfter fields on AuthResult so the frontend can branch on stable codes instead of locale-dependent error strings.
- Extend signInWithPasskey with an optional { conditional } flag, threaded through to @simplewebauthn/browser via useBrowserAutofill, so hosts can opt into WebAuthn Conditional UI (passkey suggestions inline in the email autofill dropdown).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Source default was 3026 but Mac Mini production has been overriding to
3025 via the launchd plist in scripts/mac-mini/setup-image-gen.sh ever
since the service was set up. The override existed in exactly one place
that is not version-controlled in any obvious way — anyone redeploying
without that script would land on 3026 and clients pointing at 3025
would fail to connect.
Source default → 3025 across main.py, setup.sh, README, CLAUDE.md so the
launchd plist is no longer load-bearing. The Mac Mini setup script still
sets PORT=3025 explicitly; that's now belt-and-suspenders rather than the
only thing keeping production alive.
Also added a note clarifying that this Mac Mini service (flux2.c, MPS,
arm64-only) is *not* the same thing as the "image-gen" running on the
Windows GPU server (PyTorch + diffusers + CUDA, port 3023, code lives at
C:\mana\services\mana-image-gen\ outside this repo). Two different
implementations sharing a name was confusing the port-collision audit.
Updated docs/PORT_SCHEMA.md warning block to retract the previous false
claims of two active port collisions:
- image-gen ↔ video-gen on 3026 — wrong: image-gen runs on Mac Mini
on 3025 (now also the source default), video-gen is alone on the
Windows GPU on 3026
- voice-bot ↔ sync on 3050 — latent only: mana-voice-bot is not
deployed anywhere (no launchd, no scheduled task, no cloudflared
route), so the collision is in source defaults but not in production
The voice-bot 3050 default should still be moved before voice-bot is
ever deployed — flagged in the PORT_SCHEMA warning instead of silently
fixed since voice-bot deployment is its own decision.
New service docs:
- services/mana-stt/CLAUDE.md — FastAPI surface with Whisper MLX (local),
WhisperX (rich), and Voxtral (local + Mistral API). Documents the lazy
backend loading and the launchd plist setup on the Mac Mini.
- services/mana-events/CLAUDE.md — Hono/Bun service for public RSVP and
event-sharing. Documents the host (JWT) vs public (token) split, the
rate-limit sweeper, and the createApp factory pattern that lets unit
tests run without bootstrapping the production sweeper.
Stale entries fixed:
- mana-auth: dropped "rewritten from NestJS / drop-in replacement" — the
rewrite is the only mana-auth there is now. Email channel updated from
Brevo SMTP to self-hosted Stalwart (see docs/MAIL_SERVER.md).
- mana-notify: same Brevo → Stalwart fix in the channel table and env
var defaults.
PORT_SCHEMA.md flagged as aspirational:
- The doc was dated 2026-03-28 and presented as "single source of truth",
but cross-checking against actual service source files (config.go,
main.py, start.sh) shows nothing matches. Added a prominent warning at
the top with the real ports + two confirmed collisions:
* mana-image-gen and mana-video-gen both default to PORT 3026
* mana-voice-bot and mana-sync both default to PORT 3050
Today these are masked because image-gen + voice-bot live on the
Windows GPU server while video-gen + sync live on the Mac Mini, but
the moment they share a host they collide. Either execute the planned
reorg or pick non-colliding ports and rewrite the doc to match
reality — flagged as a real follow-up.
Two changes:
1. New BACKLOG_FILE_BYTES_ENCRYPTION.md captures everything I'd
want to know if I were picking up the file-bytes encryption
work cold in 6 months. ~370 lines, sits next to
DATA_LAYER_AUDIT.md for discoverability.
Sections:
- TL;DR + status (deferred, no production impact yet)
- Goal + non-goals
- Threat model delta table (mode-by-mode)
- Architecture: write path with ASCII flow diagram
- Architecture: read path with ASCII flow diagram
- The six hard parts:
1. Web Crypto AES-GCM doesn't stream → chunked-AEAD wrapper
2. Multipart uploads need coordinated chunking (S3 5 MB minimum
vs. our 1 MB AES-GCM chunks)
3. Resumable uploads + key persistence (new _pendingUploads
table for the in-flight content key)
4. No more server-side thumbnails (three options, recommended:
client-side resize before upload)
5. Sharing complicates the trust model (URL-fragment key
sharing, recommended; Mega.nz / Cryptpad pattern)
6. Migration of existing plaintext files (lazy on-read,
recommended)
- Schema delta (sql + Dexie additions)
- File map (~2200 LoC across 9 new files + 3 touched)
- Testing strategy (unit + integration + e2e per layer)
- Out-of-scope items explicitly listed
- Decision criteria for when to actually do this
- Five open questions for whoever picks it up
- Cross-references to related files
The doc is opinionated where I have a defensible recommendation
and explicit about uncertainty where I don't.
2. DATA_LAYER_AUDIT.md updates:
- Backlog "Offen" item #1 (File-Bytes-Encryption) now points
directly at the new plan doc with a one-line teaser.
- Backlog "Abgeschlossen" gains a row C for the Conflict
Visualization UI shipped in ed8ab4483 (was still listed as
open from the previous audit roll-up).
- List renumbered: Conflict-UI dropped from "Offen", remaining
items shifted up.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Root file cleanup:
- mac-mini-setup.sh → scripts/mac-mini/bootstrap.sh (first-time bootstrap
belongs next to the other mac-mini setup-* scripts)
- test-chat-auth.sh → scripts/test-chat-auth.sh (ad-hoc smoke test, no
reason to live in the repo root)
- cloudflared-config.yml stays in root on purpose — it's the single source
of truth read by scripts/mac-mini/setup-*.sh and scripts/check-status.sh.
Docs:
- docs/POSTMORTEM_2026-04-07.md → docs/postmortems/2026-04-07-memoro-deploy-prod-wipe.md
(creates the postmortems/ home for future entries; descriptive name)
- docs/future/MAIL_SERVER_MAC_MINI_TEMP.md deleted — what it described
("Bereit zur Umsetzung", Stalwart on Mac Mini) is what's actually
running today, documented in docs/MAIL_SERVER.md. The DEDICATED variant
in docs/future/ remains since it's still a real future plan.
Root CLAUDE.md fix:
- @mana/local-store description was wrong — claimed it was legacy/standalone
only, but it's still used by apps/mana/apps/web itself, plus manavoxel,
arcade, and three shared packages.
Not touched (flagged for follow-up):
- NewAppIdeas/ (344K of "Roblox Reimagined" planning notes in repo root) —
user decision: archive externally or move under docs/future/
- Doc giants (PROJECT_OVERVIEW 41k, MATRIX_BOT_ARCHITECTURE 36k, etc.) —
splitting them is its own refactor
- Service CLAUDE.md staleness audit across 18 services — too broad for
this pass
The "Gefällt es dir?" guest nudge was a free-floating fixed element at
bottom: 10rem, so it didn't follow the bottom-stack when the PillNav was
collapsed. Move it inside .bottom-stack as the first child so it shares
the stack's reflow.
NotificationBar now uses the elevation system (--color-surface-elevated,
--color-border-strong, --color-foreground) instead of hardcoded rgba so
it adapts to all themes. Bumped the CTA button (shadow + hover lift) and
container (stronger border, layered shadow) to be more visible.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Removed:
- apps/manacore/ — three Svelte files were byte-identical duplicates of
the apps/mana/ versions, leftover from the 2025 rename. Untracked .env
files in the same dir were also cleared.
- 21 empty apps/*/apps/web-archived/ directories — leftover from the
unification move, never tracked in git.
- services/it-landing/ — empty directory, picked up by the services/*
workspace glob for no reason.
- apps/news/apps/server-archived/ — empty.
Fixed:
- scripts/mac-mini/status.sh: COMPOSE_PROJECT_NAME fallback was still
manacore-monorepo from before the rename.
Documented:
- Root CLAUDE.md now describes apps/api/ (the @mana/api unified backend)
as a top-level peer to apps/mana/. It was completely missing from the
trimmed CLAUDE.md, which made the layout look frontend-only.
Closes backlog C from the Phase 9 audit. The data layer has had
real field-level LWW since Sprint 1, but when the server's value
beat a local edit, the user had no way to know. This commit adds
the missing UI piece: a toast that appears whenever applyServerChanges
overwrites a non-empty local field with a strictly newer server
value, with a one-click "restore my version" path.
sync.ts — detection
-------------------
Two new exports:
- SyncConflictPayload: per-field overwrite event shape
(tableName, recordId, field, wasLocal, nowServer, localTime,
serverTime).
- subscribeSyncConflicts(listener): in-module pub/sub. Returns
an unsubscribe function.
Both LWW branches in applyServerChanges (insert-as-update and the
canonical update-with-fields path) now call notifyConflict() when:
1. The server time is STRICTLY greater (not equal) than the local
field time → there's actually an edit window to lose
2. The local field value is non-null/undefined → user actually
typed something to overwrite
3. The values are not equal (cheap JSON-string compare for objects,
=== for primitives) → there's a real change, not an idempotent
server replay
Why a custom registry instead of CustomEvent + window.dispatchEvent?
The existing sync-telemetry + quota-detect helpers use
window.dispatchEvent which doesn't work in node-based vitest envs
(no DOM EventTarget). The conflict bus is small enough that a plain
Set<listener> is simpler than polyfilling EventTarget — and the
node test path matters because we need automated coverage of the
detection logic.
conflict-store.svelte.ts — UI state
-----------------------------------
Svelte 5 $state-backed store with three responsibilities:
1. Coalescing: a SyncConflict is keyed by `${tableName}|${recordId}`,
so a burst of N field-overwrites on the same record collapses
into ONE toast with all affected fields underneath. The original
wasLocal value is preserved across coalescing (we don't clobber
the user's first typed value if a later field event arrives).
2. Auto-dismiss: each conflict has a 30s TTL after which it
evicts itself. Manual dismiss trumps the timer.
3. Restore: writes wasLocal back to Dexie with a fresh updatedAt
that beats the server's serverTime, plus a __fieldTimestamps
patch so the field-LWW pass on the next sync round will let
our value win. Deferred via setTimeout(0) so it lands AFTER
applyServerChanges releases its per-table apply lock — running
before the lock release would silently drop the restore (the
hook suppression is per-table-set, not per-record).
FIFO eviction at MAX_VISIBLE=8 keeps a bursty server from growing
the visible array unbounded.
SyncConflictToast.svelte — the UI
---------------------------------
Mounts globally in +layout.svelte. Stacks bottom-right above the
OfflineIndicator. Each toast shows:
- Module label ("Aufgabe", "Notiz", "Termin", …) derived from a
table-name → German label map. Unknown tables fall through to
the bare table name.
- Field count summary ("Feld »title«" / "3 Felder") — we
deliberately do NOT render the actual values because some are
encrypted blobs and decrypting them in the toast would be
significant complexity for marginal UX gain. The user knows
what they were just editing.
- Two buttons: "Wiederherstellen" (calls conflictStore.restore)
and "Behalten" (calls dismiss).
Slide-in animation, dark-mode-aware styling, role="alertdialog"
for accessibility.
Wiring
------
data-layer-listeners.ts:
- Imports installConflictListener from conflict-store
- Calls it from installDataLayerListeners() right after the
quota + telemetry handlers
- Adds the disposeConflict() call to the cleanup return
+layout.svelte:
- Imports SyncConflictToast and mounts it next to SuggestionToast
so it inherits the same global-overlay positioning context
Tests
-----
Five new integration tests in sync.test.ts cover:
- Fires when server overwrites a non-empty local field with a
strictly newer value
- Does NOT fire when local field is null/undefined (no edit to lose)
- Does NOT fire when values are equal (idempotent replay)
- Fires once per overwritten field on a multi-field update
- Does NOT fire on a timestamp tie (LWW lets server win silently
when there's no real edit window)
All 25 sync tests + 138 total data-layer tests pass. The new
captureConflicts() helper subscribes via subscribeSyncConflicts()
which works in the node-vitest env without needing a DOM polyfill.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Root CLAUDE.md: 1138 → 169 lines. Removed ghost apps-archived list,
Supabase env examples, duplicate mana-auth row, contradictory "Code
Quality TODO" block. Pushed search/storage/database/landing/manascore
howtos out to docs/ + .claude/guidelines/ pointers.
apps/mana/CLAUDE.md: 259 → 175 lines. Dropped non-existent workbench/
route from the routing diagram. Folded the auth section into a pointer
to root + the mana-specific current-user stamping pattern. Merged the
two module-system sections. Kept the data-flow ASCII diagram and the
encryption 3-step workflow (the part you actually need while writing
stores).
PyTorch's `torch.cuda.get_device_properties(0)` returns a
`_CudaDeviceProperties` object whose memory attribute is
`total_memory` (bytes), not `total_mem`. The typo crashed the
service immediately at startup because `get_model_info()` is
called from the FastAPI lifespan handler, not lazily — uvicorn
logged "Application startup failed" before any request could land.
Found while installing mana-video-gen on the Windows GPU box
(192.168.178.11:3026) for the gpu-video.mana.how Cloudflare route.
After the fix the service starts cleanly under the ManaVideoGen
scheduled task and responds 200 on /health both LAN and via
Cloudflare tunnel. status.mana.how now reports 42/42 — first time
ever.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Five documentation surfaces gained encryption awareness in this
sweep. Before this commit, the only place anyone could learn about
the at-rest encryption layer or the zero-knowledge opt-in was the
internal DATA_LAYER_AUDIT.md. New contributors and self-hosters
would never discover one of the most important features of the
product just by reading the standard onboarding docs.
apps/docs/src/content/docs/architecture/security.mdx (NEW)
----------------------------------------------------------
First-class user-facing security page in the Starlight site,
slotted into the Architecture sidebar between Authentication and
Backend.
Sections:
- What's encrypted (overview table of 27 modules + the
intentional plaintext carve-outs)
- Standard mode flow with ASCII diagram
- "What Mana CAN see" trust statements per mode
- Zero-knowledge mode setup walkthrough (Steps component)
- Unlock flow on a new device
- Recovery code rotation
- Deployment requirements (the loud MANA_AUTH_KEK warning)
- Audit trail action vocabulary
- Threat model summary table
- Implementation file references with paths
services/mana-auth/CLAUDE.md
----------------------------
New "Encryption Vault" section under Key Endpoints, listing all 7
routes (status, init, key, rotate, recovery-wrap GET+DELETE,
zero-knowledge) with their HTTP method, path, error codes, and a
description. Mentions the three CHECK constraints + RLS + audit
table. Points readers at DATA_LAYER_AUDIT.md and the new
security.mdx for the deep dive.
Environment Variables block gains MANA_AUTH_KEK with a multi-line
comment explaining the openssl rand command + dev fallback warning.
apps/mana/CLAUDE.md
-------------------
Full rewrite. The existing file was from the Supabase era and
described things like @supabase/ssr, safeGetSession(), and a
five-table schema with users + organizations + teams that doesn't
exist any more. Replaced with the unified-app architecture:
- Module system layout (collections.ts / queries.ts / stores/)
- Mana Auth (Better Auth + EdDSA JWT) instead of Supabase
- Local-first data layer with the full pipeline diagram
- At-rest encryption section with the "when writing module code
that touches sensitive fields" 4-step guide
- Updated routing structure (no more separate /organizations,
/teams routes)
- Module store pattern code example
- Reference document table at the bottom pointing at the audit,
the new security.mdx, and the auth doc
Root CLAUDE.md
--------------
New "At-Rest Encryption (Phase 1–9)" subsection under the
Local-First Architecture section. Two-mode trust summary table,
production requirement for MANA_AUTH_KEK with the openssl command,
the "when writing module code" 4-step guide, and a reference
table. New contributors reading the root CLAUDE.md from top to
bottom now hit encryption naturally as part of the data layer
discussion.
.env.macmini.example
--------------------
MANA_AUTH_KEK was missing from the production env example
entirely — the macmini deployment would silently boot on the
32-zero-byte dev fallback if you copied this file. Added with a
multi-paragraph comment covering: how to generate, why it's
required, how to store securely (Docker secrets / KMS / Vault),
and the rotation caveat.
apps/docs/src/content/docs/deployment/self-hosting.mdx
------------------------------------------------------
Two changes:
1. Added MANA_AUTH_KEK to the mana-auth service block in the
Compose example with an inline comment pointing at the new
section below.
2. New "Encryption Vault Setup" H2 section with subsections:
- Generating a KEK (with a fake example value labelled DO NOT
USE — generate your own)
- Securing the KEK (Docker secrets, KMS, systemd
LoadCredential, anti-patterns)
- "What if I lose the KEK?" — explains the data is
unrecoverable by design and mitigation via zero-knowledge
mode opt-in
- KEK rotation — calls out the missing background re-wrap
job as a known limitation
apps/docs/astro.config.mjs
--------------------------
Added "Security & Encryption" entry to the Architecture sidebar
between Authentication and Backend so the new page is reachable
from the docs nav.
Astro check: 0 errors, 0 warnings, 0 hints across 4 .astro files.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Marks the four backlog items closed in this session — vault service
integration tests, recovery code rotation, pre-wired insert helpers
for future server-pushed records, and boards/boardItems encryption.
Updates the encrypted-tables list to 27 tables.
Updates
-------
1. Sprint table grows by 4 rows (BL1, BL2, BL3+4, BL5) with the
four backlog commits.
2. Test-Status line bumped:
21 web test files → 21 web + 2 mana-auth
78 vitest crypto tests + 39 bun mana-auth tests
"25+ tables" → "27 tables" (boards + boardItems added)
3. Section 5 encrypted-tables list grows by:
- boards (name, description)
- boardItems (textContent, only when itemType === 'text')
Both labelled "9 BL" in the Phase column to mark them as
backlog-sweep additions.
4. "Tabellen ohne Encryption (bewusst)" subsection: removed the
stale "boards/boardItems are a candidate for later" entry —
they're encrypted now. Added a redirect note pointing readers
at Section 6 where the actual decision is recorded.
5. Section 6 ("Backlog") completely restructured. The flat
"in priority order" list became two subsections:
"Abgeschlossen (Phase 9 Follow-Up Sweep)" — table with the four
commits + a one-line "what" notice each. Item 3+4 is explicitly
marked as a re-frame: the original "server pushes plaintext"
risk turned out to overstate the problem because the
generate/upload UIs are TODO stubs. The fix was pre-wired
insert() helpers, not a server-side rewrite.
"Offen" — five remaining items, reordered:
1. File-Bytes-Encryption (NEW: surfaced as "#4b" while
documenting that filesStore.insert() only protects metadata)
2. Image-Generation / File-Upload Wire-Up (NEW: ensures the
future UIs go through the helpers from #3+4)
3. Conflict Visualization UI (unchanged)
4. Composite Indexes für Multi-Account (unchanged)
5. V3 Migration Tests (unchanged)
6. Eckdaten line bumped from "25+ Tabellen aktiv" to "27 Tabellen
aktiv". Best Practices line for ZK gets the "+ rotate im
Active-State-Support" suffix.
7. Last-update header bumped to today.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>