Commit graph

636 commits

Author SHA1 Message Date
Till JS
5387681936 fix(mana-analytics): copy shared-logger into installer (transitive workspace dep)
shared-hono depends on @mana/shared-logger; without it, the bun runtime
crashes on first import with ENOENT for the workspace symlink target.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 00:52:33 +02:00
Till JS
db01ddccc7 fix(mana-analytics): multi-stage Dockerfile that resolves workspace deps
Switches the build context to repo-root so the pnpm-workspace install
can pull in @mana/shared-hono. Mirrors the mana-auth/mana-ai pattern
(node+pnpm installer stage → bun runtime stage).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 00:50:25 +02:00
Till JS
8b0a943e71 feat(mana-analytics): pseudonym + reactions + public feed + admin
Macht mana-analytics zur Backend-Heimat des Public-Community-Hubs.

- Pseudonym: createDisplayHash(userId, secret) + generateDisplayName()
  produzieren deterministisch "Wachsame Eule #4528" pro User. 100
  Adjektive × 80 Tieren × 10000 Suffixe → ~80M Kombinationen. 7 unit
  tests, 0 PII im Output.
- Schema-Erweiterung user_feedback: display_hash, display_name,
  module_context, parent_id (1-level Threading), reactions jsonb
  (cached emoji→count), score (cached weighted sort).
- feedback_votes ersetzt durch feedback_reactions (Slack-Pattern:
  unique(feedback_id, user_id, emoji), pro User mehrere Emojis möglich).
- Service: createFeedback stempelt display_hash + display_name. Neue
  Methoden getPublicFeed (redacted), getReplies, toggleReaction
  (rebuilt reactions+score). Admin-Methoden adminListAll/adminUpdate
  founder-tier-gated im Route-Layer.
- Routes:
  /api/v1/public/feedback/*  — anonymous reads (kein Auth, kein
                              userId/displayHash/deviceInfo im Output)
  /api/v1/feedback/*         — auth-required Submit/React/Manage,
                              plus :id/replies, :id/react, /admin
- Config: neuer FEEDBACK_PSEUDONYM_SECRET-Env-Var seedet die Hashes;
  Rotation re-keyt nur künftige Pseudonyme, alte Records bleiben stabil.

Migration 0002_public-community-foundation.sql idempotent, lokal +
prod (mana-server) eingespielt.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 00:00:35 +02:00
Till JS
c07db300b0 feat(sync): F4 — server-side singleton bootstrap
Closes the userContext race-on-first-mount that surfaced as a
"10 fields overwritten" conflict toast pre-F2. Adds a fire-and-forget
hook in the /register flow that writes the per-user `userContext`
singleton straight into `mana_sync.sync_changes` with
`client_id='system:bootstrap'` and `origin='system'`.

Behavior:
- On successful `signUpEmail`, `bootstrapUserSingletons(userId, syncSql)`
  inserts a `profile/userContext` row with the empty-default shape that
  mirrors the webapp's `emptyUserContext()` factory in
  `apps/mana/apps/web/src/lib/modules/profile/types.ts`.
- The receiving client treats the change as origin='server-replay'
  on apply (per F2 conflict-gate), so no toasts on first pull.
- Failure is logged but does not abort registration — the webapp's
  existing `ensureDoc()` fallback still works during the F4→F5
  transition.

Module-scoped postgres pool (max=2 connections) lazy-initialized on
first signUp; reused for the lifetime of the process. Same pattern as
`UserDataService.getSyncSql`.

Out of scope for F4:
- `kontextDoc` is per-Space (not per-user) — bootstrap there will be
  hooked into the Space-creation flow, not /register. The webapp's
  `ensureDoc()` for kontextDoc stays as-is for now.
- Webapp `ensureDoc()` removal is F5.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 23:18:54 +02:00
Till JS
6bb9d77be9 feat(sync): F3 — drop updatedAt as a synced data field
Removes `updatedAt` from the wire protocol and from every Local-prefixed
record type. Replaced by two orthogonal mechanisms — deriveUpdatedAt()
for read-side public-facing values, _updatedAtIndex shadow for indexed
sorts.

Local-side:
- New `_updatedAtIndex` shadow column. Stamped by the Dexie creating /
  updating hook on every write. Stripped from the pending-change payload
  so it never travels to mana-sync. Indexed in Dexie v53 on the 22 tables
  that previously indexed `updatedAt`.
- `deriveUpdatedAt(record)` in sync.ts returns max(__fieldMeta[*].at) so
  the public-facing Task / Note / etc. shape keeps an `updatedAt: string`
  property without holding it as data.
- Type-converters across ~60 module/queries.ts and types.ts files now
  call `deriveUpdatedAt(local)` instead of reading `local.updatedAt`.

Module-store sweep:
- Regex codemod removed `updatedAt: new Date().toISOString()` /
  `: now` / `: now()` / `: nowIso()` stamping from 121 store files
  (~382 call sites total). Single-property update calls
  (`{ updatedAt: now }`) collapsed to `{}`; touch-only patterns
  (writing/drafts, writing/generations) kept the call as a no-op
  because the hook now stamps `_updatedAtIndex` automatically on
  any Dexie modification.
- Local* interfaces stripped of `updatedAt: string` (43 types.ts files).
  Public-facing types (Task, Note, Mission, Agent, …) keep
  `updatedAt: string` as a computed read-side property.
- Companion's chat conversation now sorts on a real
  `lastMessageAt` data field instead of touching `updatedAt`.
- Session-only stores (times/session-alarms, session-countdown-timers)
  stamp `updatedAt: now` directly because they're not in Dexie and
  have no field-meta layer to derive from.

Sync engine:
- applyServerChanges sets `_updatedAtIndex` itself when applying
  server changes (max of server-field times for updates, recordTime
  for inserts) so server-replays land orderable.
- Dropped the legacy `localUpdatedAt` fallback — every record now has
  `__fieldMeta`, the per-field at is the canonical source.
- Soft-delete tombstone path stops stamping `updatedAt: serverTime`,
  uses `_updatedAtIndex` instead.

Server-side:
- mana-ai iteration-writer no longer emits `updatedAt` in
  sync_changes.data; receivers derive it from the field-meta map.
- mana-sync types: no change (the wire format already uses
  `field_meta` / `at` from F1).

Out of scope: backend Drizzle schemas (mana-credits, mana-events, …)
keep their `updated_at` columns. Those are pure server-internal — not
part of the sync_changes / __fieldMeta mechanism F3 cleans up.

Tests + checks:
- 0 svelte-check errors over 7652 files.
- 29/29 sync.test.ts (vitest).
- 61 mana-ai bun tests.
- mana-sync go test ./... cached green.

Plan: docs/plans/sync-field-meta-overhaul.md F3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 23:12:22 +02:00
Till JS
f10439e369 fix(mana-analytics): point migration at public.feedback_status, not feedback.*
Drizzle's pgEnum() ohne pgSchema-Wrap landet immer in public — der
schemaFilter versteckt das nur im Diff (siehe Repo-Memory:
"Drizzle enums with schemaFilter must use pgSchema().enum()"). Die
Tabelle feedback.user_feedback referenziert die Enums quer aus public,
das funktioniert; aber die ALTER-TYPE-Statements in der ursprünglichen
Migration zielten auf feedback.feedback_status / feedback.feedback_category
und hätten damit nichts gefunden.

Lokal verifiziert (mana_platform.public.feedback_status,
mana_platform.public.feedback_category):
- 6 Status-Werte umbenannt → submitted/under_review/planned/in_progress/completed/declined
- Default-Status auf 'submitted'
- Category 'onboarding-wish' hinzugefügt
- Re-Run idempotent (DO-Blöcke + ADD VALUE IF NOT EXISTS)

Mittelfristig sollte feedbackSchema.enum(...) verwendet werden, damit
Enums tatsächlich im feedback-Namespace landen — eigener Refactor.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 21:55:45 +02:00
Till JS
ba6274edbe refactor(feedback): align package + DB enums, plan central hub
Macht @mana/feedback zur SSOT für alle Nutzer-Feedback-Categories und
-Status — Voraussetzung dafür, dass Onboarding-Wishes, NPS, Churn-Feedback
etc. künftig dort landen.

- Status-Enum: DB-Werte umbenannt new/reviewed/done/rejected →
  submitted/under_review/completed/declined (Package gewinnt). PG≥10
  ALTER TYPE … RENAME VALUE ist non-destructive.
- Category 'praise' ins Package aufgenommen (war nur in DB).
- Category 'onboarding-wish' neu in Package + DB für den Wish-Step.
- Default status in DB: 'new' → 'submitted'.
- CreateFeedbackInput.isPublic optional → Service reicht durch, default
  bleibt true; private Categories wie onboarding-wish setzen false.
- Schema-Datei mit SSOT-Kommentar versehen, der Drift in Zukunft verhindert.

Hand-authored Migration unter services/mana-analytics/drizzle/0001_*.sql
weil drizzle-kit push Enum-Werte nicht zuverlässig umbenennt. Manuell
einspielen vor nächstem db:push:

  psql "\$DATABASE_URL" -f services/mana-analytics/drizzle/0001_align-feedback-enums.sql

Plan in docs/plans/feedback-hub.md (Phase 0–4); Phase 0 + 1 jetzt, 2-4
deferred.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 21:52:25 +02:00
Till JS
7766ea5021 docs(plans): mark llm-fallback-aliases SHIPPED, add M-by-M commit table
All 5 milestones landed today in one continuous session: registry,
health cache, fallback router, observability, and consumer migration.
115 service-side tests, validator covers 2538 files.
2026-04-26 21:27:57 +02:00
Till JS
fea3adf5fe feat(llm-aliases): M5 — migrate consumers to MANA_LLM aliases
Final milestone of docs/plans/llm-fallback-aliases.md. Every backend
caller now requests models via the `mana/<class>` alias system instead
of hardcoded `ollama/...` strings. mana-llm resolves aliases through
`services/mana-llm/aliases.yaml` with health-aware fallback (M3) and
emits resolved-model + fallback metrics (M4).

SSOT moved to `packages/shared-ai/src/llm-aliases.ts` so apps/api,
apps/mana/apps/web, and services/mana-ai all import the same
`MANA_LLM` constant via the existing `@mana/shared-ai` workspace
dependency. Three additional sites (memoro-server, mana-events,
mana-research) inline the alias string with a SSOT comment because
they don't pull @mana/shared-ai today.

Migrated 14 sites across 10 files:
- apps/api: writing(LONG_FORM), comic(STRUCTURED), context(FAST_TEXT),
  food(VISION), plants(VISION), research orchestrator (3 tiers
  collapsed to STRUCTURED+FAST_TEXT/LONG_FORM)
- apps/mana/apps/web: voice/parse-task + parse-habit (STRUCTURED)
- services/mana-ai: planner llm-client + tick.ts (REASONING)
- services/mana-events: website-extractor (STRUCTURED, inlined)
- services/mana-research: mana-llm client (FAST_TEXT, inlined)
- apps/memoro/apps/server: ai.ts (FAST_TEXT, inlined)

Legacy env-vars removed: WRITING_MODEL, COMIC_STORYBOARD_MODEL,
VISION_MODEL, MANA_LLM_DEFAULT_MODEL. The chain in aliases.yaml is
now the single tuning surface; SIGHUP reloads it without redeploys.

New `scripts/validate-llm-strings.mjs` regex-scans 2538 files for
hardcoded `<provider>/<model>` strings and fails the build if any
land outside the SSOT or the explicitly-allowed paths (image-gen
modules, model-inspector code, this validator itself, the registry).
Wired into `validate:all` next to the i18n + theme validators.

Verified: `pnpm validate:llm-strings` clean, `pnpm --filter @mana/api
type-check` clean, `pnpm --filter @mana/ai-service type-check`
clean. Web type-check has 2 pre-existing errors in
SettingsSidebar.svelte (i18n MessageFormatter type drift, last
touched in 988c17a67 — unrelated to this work).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 21:26:03 +02:00
Till JS
8a49e3ffd5 feat(mana-llm): M4 — observability, debug endpoints, SIGHUP reload
- `X-Mana-LLM-Resolved: <provider>/<model>` header on non-streaming
  responses. Streaming clients read the same info from each chunk's
  `model` field (SSE headers go out before the chain is walked).
- Three new Prometheus metrics: `mana_llm_alias_resolved_total{alias,
  target}` (which concrete model an alias resolved to per request),
  `mana_llm_fallback_total{from_model, to_model, reason}` (each
  fallback transition), `mana_llm_provider_healthy{provider}` (gauge,
  mirrors the circuit-breaker).
- New debug endpoints: `GET /v1/aliases` (registry inspection — chain
  + description per alias, useful for confirming SIGHUP reloads),
  `GET /v1/health` (full per-provider liveness snapshot — failure
  counter, last error, unhealthy-until backoff).
- `kill -HUP <pid>` reloads `aliases.yaml`. Parse errors leave the
  previous good state in memory and log the rejection.
- `ProviderHealthCache.add_listener()` for cache→metrics decoupling:
  the gauge is updated via a transition-only listener wired in main.py
  rather than the cache importing prometheus_client itself.
- Request-side metrics now use the requested model string, success-side
  uses the resolved one. So `mana_llm_llm_requests_total{provider="ollama",
  model="gemma3:12b"}` reflects actual upstream load even when callers
  used `mana/long-form` aliases.

16 new observability tests (test_m4_observability.py): listener
fire-on-transition semantics, exception-isolation, multi-listener,
counter increments, gauge writes, end-to-end alias→metric flow,
v1/aliases + v1/health endpoint shape, response.model carries the
resolved target after fallback. Total suite: 115/115 in 1.6s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 20:52:28 +02:00
Till JS
3046da3b19 feat(mana-llm): M3 — health-aware router with alias + chain fallback
Replaces the old Ollama→Google special-case auto-fallback with the
unified pipeline: caller passes either a direct provider/model or an
alias from the `mana/` namespace; the router resolves to a chain and
walks it skipping unhealthy providers (per ProviderHealthCache from M2),
trying each entry, marking provider unhealthy on retryable errors and
falling through to the next.

Retryable: ConnectError, ReadTimeout, RemoteProtocolError, 5xx,
ProviderRateLimitError. Propagated (don't fall back, don't poison the
cache): ProviderCapabilityError, ProviderAuthError, ProviderBlockedError,
4xx, unknown exception types. The cache stays "what the network told us
about this provider's liveness" — caller errors don't muddy that signal.

Streaming: pre-first-byte fallback only. Once a chunk has been yielded
the provider is committed; mid-stream errors propagate as-is so we
don't splice two voices into one output.

`NoHealthyProviderError` (HTTP 503) carries a structured attempt log —
each chain entry shows up as `(model, reason)` so the cause of a 503
is visible in the response and metrics, not only in service logs.

main.py wires the lifespan: aliases.yaml is loaded, ProviderHealthCache
created, ProviderRouter takes both as constructor deps, HealthProbe
spawned with cheap HTTP probes per configured provider (Ollama
/api/tags, OpenAI-compat /v1/models with Bearer header). Google is
skipped — google-genai SDK has no obvious cheap probe; the call-site
fallback handles real errors.

22 new router tests (test_router_fallback.py): chain walking, capability
& auth propagation, 5xx vs 4xx differentiation, rate-limit retry,
all-fail → NoHealthyProviderError, direct provider strings bypass
aliases, streaming pre-first-byte fallback, mid-stream-failure does
NOT fall back, empty stream commits without retry, cache feedback on
success/failure/non-retryable. Existing test_providers.py updated for
the new constructor signature; all 99 service tests green via the dev
container (Python 3.12).

Legacy purged: `_ollama_concurrent`, `_ollama_health_cache`,
`_can_fallback_to_google`, `_should_use_ollama`, `_fallback_to_google`,
`_get_ollama_health_cached` all gone. The `auto_fallback_enabled` /
`ollama_max_concurrent` settings remain in config.py for now (M5 will
remove them along with the per-feature env-var overrides).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 20:44:16 +02:00
Till JS
59557e62d7 feat(mana-llm): M2 — ProviderHealthCache + background probe loop
Per-provider liveness with circuit-breaker semantics. The router (M3)
will read `is_healthy()` to skip dead providers in a chain; the probe
loop and the call-site fallback handler write state via
`mark_healthy` / `mark_unhealthy`.

State machine: 1st failure stays healthy (transient blips happen);
2nd consecutive failure trips the breaker and sets a 60s backoff
window during which `is_healthy → False`. After the window the
provider is half-open again — next call exercises it, success
resets, failure re-arms.

HealthProbe is the background asyncio.Task that pings every
registered provider every 30s with a 3s timeout. Probes run
concurrently per tick and one bad probe can't sink the loop. Probe
functions are injected (`{name: async-fn}`) so this module stays
decoupled from the provider classes — the wiring lives in main.py
where we already know which providers are configured.

32 new tests (FakeClock for deterministic backoff timing, slow-probe
helpers for parallelism + timeout, lifecycle tests for start/stop
idempotency and tick-after-error survival). 64/64 alias+health tests
green.

Not yet wired into the request path — that's M3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 20:29:57 +02:00
Till JS
dff8629e1d feat(mana-llm): M1 — AliasRegistry + aliases.yaml SSOT
First milestone of the LLM-fallback plan (docs/plans/llm-fallback-aliases.md).
Introduces the `mana/<class>` namespace; the registry parses + validates
aliases.yaml at startup and reloads on demand. Schema-rejects empty
chains, missing provider prefixes, alias names outside the reserved
namespace, default→unknown references, etc.

Reload semantics: parse error keeps the previous good state in memory
so a typo + SIGHUP doesn't take the service down.

5 aliases ship with the initial config: fast-text, long-form, structured,
reasoning, vision. Each chain ends with a cloud provider so the system
keeps working when the GPU server is offline.

32 unit tests covering happy path, schema validation, namespace check,
reload safety, and a guard that the shipped aliases.yaml itself parses.
M2 (health-cache + probe-loop) and M3 (router fallback execution) build
on this; aliases are not yet wired into the request path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 20:23:51 +02:00
Till JS
dff02d24a9 fix(mana-media): HEIC uploads from Chrome — sniff + transcode at the edge
iPhone HEIC photos uploaded through Chrome on macOS landed as
`mimeType: application/octet-stream` because Chrome doesn't recognise
the HEIC MIME and `file.type` was empty. The transform endpoint then
refused with `Transform only supported for images` (HTTP 400) and
the wardrobe Try-On flow surfaced this as `mana-media transform
failed for <id>: HTTP 400`. Even fixing the MIME wouldn't have been
enough — sharp's prebuilt binary ships the heif container format
without a HEVC decoder plugin (libde265 is omitted for patent
reasons), so the actual decode would still throw.

Three-part fix at the upload edge:

1. New `services/sniff.ts` — magic-byte sniffer for image MIMEs.
   Reads the first ~16 bytes and recognises JPEG, PNG, GIF, WebP,
   BMP, TIFF, HEIC, HEIF, AVIF. Returns `null` for everything else
   so the caller can fall back to whatever the browser claimed.

2. Upload route — sniffs every upload before passing the buffer to
   `uploadService.upload`. Trusts magic bytes over `file.type` so
   Chrome's empty-type HEIC still lands with `image/heic`. Removes
   the entire class of `application/octet-stream` rows for files
   that are obviously images.

3. HEIC/HEIF transcoded to JPEG at upload via the new
   `heic-convert` dependency (pure-JS WASM, no system libs needed).
   The original buffer is replaced with the JPEG bytes, the MIME
   becomes `image/jpeg`, and the filename's `.HEIC` extension is
   rewritten to `.jpg`. Downstream code (process pipeline, transform
   endpoint, sharp) then deals exclusively with formats sharp can
   actually decode. Failure path returns HTTP 500 with a clear
   `HEIC conversion failed` error so the client knows it wasn't a
   generic crash.

Bonus, transform endpoint hardening: `mimeType.startsWith('image/')`
gate now also accepts a row whose stored MIME is wrong (legacy
`application/octet-stream` from before this fix) when the actual
bytes sniff as an image. Lets old broken rows still serve where
the format itself is decodable; the upload-side fix prevents new
ones from existing.

Sharp 0.33 on this machine reports `heif: 1.18.2` for the container
but rejects the actual HEVC compressed bitstream — confirmed by the
exact error string `No decoding plugin installed for this
compression format (11.6003)`. Going through `heic-convert` first
sidesteps that entirely.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 13:46:13 +02:00
Till JS
e66654068f feat(auth): error-classification layer + passkey end-to-end
Two interlocking fixes driven by a production lockout incident.

## Bug that motivated this

A fresh schema-drift column (auth.users.onboarding_completed_at) made
every Better Auth query crash with Postgres 42703. The /login wrapper
swallowed the non-2xx and mapped it onto a generic "401 Invalid
credentials" AND bumped the password lockout counter — so 5 legit
login attempts against a broken DB would have locked every real user
out of their own account. Same wrapper pattern on /register, /refresh,
/reset-password etc. The 30-minute hunt ended in a one-off repro
script that finally surfaced the real Postgres error.

The user-facing passkey button additionally returned generic 404s on
every login-page mount because the route wasn't registered (the DB
schema existed, the Better Auth plugin wasn't wired).

## Phase 1 — Error classification (services/mana-auth/src/lib/auth-errors)

- 19-code AuthErrorCode taxonomy (INVALID_CREDENTIALS, EMAIL_NOT_VERIFIED,
  ACCOUNT_LOCKED, SERVICE_UNAVAILABLE, PASSKEY_VERIFICATION_FAILED, …)
- classifyFromResponse/classifyFromError handle: Better Auth APIError
  (duck-typed on `name === 'APIError'`), Postgres errors (23505 unique,
  42703/08xxx → infra), ZodError, fetch/ECONNREFUSED network errors,
  bare Error, unknown.
- respondWithError routes the structured response, logs at the right
  level, fires the correct security event, and CRITICALLY only bumps
  the lockout counter for actual credential failures — SERVICE_UNAVAILABLE
  and INTERNAL never touch lockout.
- All 12 endpoints in routes/auth.ts refactored (/login, /register,
  /logout, /session-to-token, /refresh, /validate, /forgot-password,
  /reset-password, /resend-verification, /profile GET+POST,
  /change-email, /change-password, /account DELETE).
- Fixed pre-existing auth.api.forgetPassword typo (→ requestPasswordReset).
- shared-logger + requestLogger middleware wired in index.ts; all
  console.* calls in the service removed.

## Phase 2 — Passkey end-to-end (@better-auth/passkey 1.6+)

- sql/007_passkey_bootstrap.sql: idempotent schema alignment —
  friendly_name→name, +aaguid, transports jsonb→text, +method column
  on login_attempts.
- better-auth.config.ts: passkey plugin wired with rpID/rpName/origin
  from new webauthn config section. rpID defaults to mana.how in prod
  (from COOKIE_DOMAIN), localhost in dev.
- routes/passkeys.ts: 7 wrapper endpoints (capability probe,
  register/options+verify, authenticate/options+verify with JWT mint,
  list, delete, rename). Each routes errors through the classifier;
  authenticate/verify promotes generic INVALID_CREDENTIALS to
  PASSKEY_VERIFICATION_FAILED.
- PasskeyRateLimitService: in-memory per-IP (options: 20/min) and
  per-credential (verify: 10 failures/min → 5 min cooldown) buckets.
  Deliberately separate from the password lockout — different factor,
  different blast radius.
- Client: authService.getPasskeyCapability() async probe, memoised per
  session. authStore.passkeyAvailable reactive state. LoginPage gates
  on === true so a slow probe doesn't flash the button in.
- AuthResult grew a code: AuthErrorCode field; handleAuthError in
  shared-auth prefers the server envelope over the legacy message
  heuristics.

## Tests

- 30 unit tests for the classifier covering every branch (including
  the exact Postgres 42703 shape that started this).
- 9 unit tests for the rate limiter.
- 14 integration tests for the auth routes — the regression test
  explicitly asserts "upstream 500 → 503 + zero lockout bumps".
- 101 tests pass, 0 fail, 30 pre-existing skips unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 01:52:51 +02:00
Till JS
5aecf8b90d feat(onboarding): M2 — route guard + shell + Screen 1 (name)
- PATCH /api/v1/me/profile in mana-auth (name, image with 1–80 char
  validation) — powers the Screen-1 save
- (app)/+layout.svelte:
  * isOnboarding derived from pathname
  * handleAuthReady loads onboardingStatus, redirects brand-new users
    to /onboarding/name (fire-and-forget so sync/data-layer init keeps
    running in parallel)
  * chrome (PillNav, wallpaper, bottom-stack) hidden in onboarding mode;
    AuthGate still wraps so the flow enforces authentication
- /onboarding/+layout.svelte: full-viewport shell with progress dots
  (1/3, 2/3, 3/3) and a skip-all that marks the flow complete and
  sends the user home
- /onboarding/+page.svelte: redirects bare entry to /onboarding/name
- /onboarding/name/+page.svelte: text input (1–40 chars), Enter = Weiter,
  skip falls back to email local-part so Screen 2's greeting is never
  empty

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 22:49:52 +02:00
Till JS
5a92e1168b feat(onboarding): M1 — data model + endpoints + client store
- auth.users: new nullable `onboarding_completed_at` column
- new /api/v1/me/onboarding routes: GET, POST /complete, PATCH /reset
- onboardingStatus Svelte store in the web app that reads/writes via
  those endpoints (no JWT claim so completing the flow takes effect
  without a token re-mint)
- docs/plans/onboarding-flow.md adjusted: no backfill (launch without
  existing users), better-auth `name` clarified, 7 templates including
  "Arbeit" confirmed

Foundation for the 3-screen first-login flow (Name → Look → Templates).
No UI and no route guard yet — those ship in M2 when the redirect target
actually exists. Schema change is a pure column-add, applied via
`pnpm --filter @mana/auth db:push`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 22:24:49 +02:00
Till JS
f7536bc0b9 feat(shared-ai): route compactor to Haiku-tier model by default (M2.5)
compactHistory() now defaults to DEFAULT_COMPACT_MODEL =
'google/gemini-2.5-flash-lite' when the caller doesn't override. Lite
is ~3–5x cheaper than gemini-2.5-flash with near-identical
summarisation quality — summarisation doesn't need the same tier as
reasoning + tool-calling, and the compactor fires exactly when token
spend is highest, so the cheaper route saves exactly where it matters.

CompactHistoryOptions.model is now optional. All three consumers
(mana-ai tick, webapp Companion, webapp Mission runner) drop their
explicit gemini-2.5-flash override and let the default apply.

This is the pragmatic M2.5: no mana-llm changes. The "tier" abstraction
(X-Model-Tier header, env-routed aliases) from the Claude-Code report
makes sense only once multiple utility tasks need cheaper routing —
topic-detection, classification, command-injection checks. Today only
the compactor wants it, and a model constant is the simplest contract
that works.

2 new tests (default applied + override honoured). 79 shared-ai tests
green, all three consumers type-check clean. One pre-existing unrelated
type error in apps/mana/apps/web/src/lib/modules/wardrobe/queries.ts
(not touched by this commit).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 18:26:50 +02:00
Till JS
89388fb369 refactor(mana-auth): move enums from public to auth schema
pgEnum() defaults to the public schema. Because
drizzle.config.ts sets schemaFilter: ['auth'], push introspection
never saw the enums and kept re-emitting CREATE TYPE access_tier ...,
failing with 42710. This blocked setup-databases.sh from advancing
mana-auth past the enum declarations and silently masked other drift
(e.g. the new `kind` column on auth.users going un-pushed).

Source side: three enums now live on authSchema via
authSchema.enum(...) instead of pgEnum(...). DB side: migration 006
recreates access_tier / user_role / user_kind inside the auth schema,
repoints auth.users.access_tier and auth.users.role via ::text cast
(preserving all data and defaults), and drops the old public types.

After this, `drizzle-kit push --force` reports "No changes detected"
on a clean DB and the broader `pnpm setup:db` run is green without
workarounds.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 15:36:39 +02:00
Till JS
52f53c844b chore(mana-auth): add 005 persona tables migration
Documents the SQL that was applied manually to match the personas.ts
Drizzle schema introduced in 493db0c3b. Idempotent. See
docs/plans/mana-mcp-and-personas.md for the design. Required because
the spaces tables created alongside personas sit outside the auth
schemaFilter, and pre-existing public enums would otherwise trip
drizzle-kit push (resolved separately in migration 006).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 15:36:26 +02:00
Till JS
72f7978ed4 feat(agent-loop): expose compactionsDone + compactedReminder producer
Closes the loop on M2: when the compactor fires, the LLM needs to know
it's now seeing a <compact-summary> instead of raw turns so it
doesn't waste a turn asking about lost details or re-executing tools
whose responses are gone.

shared-ai:
  - LoopState grows `compactionsDone: number` (cap-1 by current loop
    policy, but shape kept as count for future multi-compact cycles).
  - runPlannerLoop populates it on each reminder-channel call. New
    loop test asserts [0, 1] sequence: round 1 before compaction,
    round 2 after.

mana-ai:
  - New producer `compactedReminder` — fires severity=info when
    compactionsDone >= 1, wrapped in a German one-liner ("frag nicht
    nach verlorenen Details").
  - Injected FIRST in buildReminderChannel so the LLM frames the rest
    of the round with "I'm looking at a summary" context. Metric
    surface stays `{producer='compacted', severity='info'}`.

4 new reminder tests (3 pure producer + 1 composition-ordering) +
1 loop-wiring test. 77 shared-ai, 20 reminders.test.ts — green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 15:36:21 +02:00
Till JS
eb8fac23ec fix(personas): exact tool_use_id pairing + CI drift audit
Two loose ends from M3/M4:

1. Tool_use_id-based error attribution in the persona-runner
-----------------------------------------------------------
The previous collectActionsFromMessage() flipped the *most recent*
ActionRow to 'error' when a tool_result carried is_error:true. That was
fine as long as Claude invoked tools strictly in sequence, but when
the planner pipelines multiple tools in one turn, a later tool_result
carries an earlier tool_use_id — the last-action fallback mis-
attributes the error.

runMainTurn() now keeps a tool_use_id → action-index Map for the
duration of the tick. On tool_use we stash block.id, on tool_result we
look up the exact ActionRow via tool_use_id and flip that one. The
"flip last" path survives as a pure fallback if a future SDK ever
ships a block without an id.

2. New audit:encrypted-tools script
-----------------------------------
scripts/audit-encrypted-tools.ts — loads registerAllModules() and
apps/mana/…/crypto/registry.ts, diffs every ToolSpec.encryptedFields
against the authoritative web-app ENCRYPTION_REGISTRY.

Catches three classes of drift:
- missing-table : tool declares a table the web-app doesn't encrypt
- field-drift   : both agree a table is encrypted but the field lists
                  differ (half-encryption in the wire is silent death)
- disabled      : web-app has enabled:false while the tool still
                  encrypts — advisory warning, not a fail

Negative-tested by injecting a deliberate drift on todo.create +
todo.list (shortened ENCRYPTED_FIELDS to ['title']); the auditor
flagged both tools with full field diffs, restore returned to green.

Wired into `pnpm run validate:all` so the contract survives future
edits on either side. Fills the M4 audit gap noted in
project_mana_mcp_personas.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 15:34:52 +02:00
Till JS
83a4606a9a feat(mana-ai): wire context-window compactor into mission runner (M2.3)
The Claude-Code wU2 pattern goes live. Every mission run now passes a
compactor into runPlannerLoop that will fire once if cumulative token
usage crosses 92% of MANA_AI_COMPACT_MAX_CTX (default 1_000_000, the
gemini-2.5-flash ceiling). Override via env for deployments on smaller
models; set to 0 to disable entirely.

The compactor reuses the planner's own LlmClient + gemini-2.5-flash
model for now. When mana-llm grows a Haiku tier we'll route the
compactor there — it's pure summarisation and a cheaper model saves
tokens exactly where they matter.

New metrics:
  - mana_ai_compactions_triggered_total — counter, one per firing
  - mana_ai_compacted_turns — histogram, how many middle turns got
    folded each time (< 3 ⇒ maxCtx is probably misconfigured)

Logs print a 60-char tail of the summary.goal so the "what was this
mission doing again" question survives a compaction.

No new tests here — compactHistory and the loop wiring are already
covered by the 22 tests in shared-ai (M2.1 + M2.2). The 57 existing
mana-ai bun tests stay green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 15:28:20 +02:00
Till JS
5a5e24f582 chore(docker): drop obsolete services/mana-search/docker-compose.dev.yml
Superseded by the top-level docker-compose.dev.yml (which defines
searxng + redis as part of the unified dev stack via `pnpm docker:up`).
This per-service file was an artefact from before the unified setup
and no script / doc / README still references it.

An orphan `mana-searxng-dev` + `mana-search-redis-dev` had been running
from this file for ~2 weeks, squatting on the host's port 8080. Every
first `pnpm dev:mana:all` after a cold machine start would fail with

  Bind for 0.0.0.0:8080 failed: port is already allocated

because the top-level compose's `mana-searxng` service couldn't take
8080 while the orphan held it. The second invocation silently
"worked" — docker saw the freshly-created mana-searxng container and
skipped the bind step on the idempotent up, leaving it healthy but
only reachable inside the docker network (8080/tcp, no external
publish).

Cleanup already done out-of-band:
  docker compose -f services/mana-search/docker-compose.dev.yml down
  docker compose -f docker-compose.dev.yml up -d --force-recreate searxng

Deleting the file so a stale `docker compose -f …/mana-search/dev.yml up`
can't resurrect the orphan.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 15:27:19 +02:00
Till JS
3edf680ea0 feat(mana-ai): telemetry for reminder producers (mana_ai_reminders_emitted_total)
Producers now return structured {producer, severity, text} objects
instead of raw strings. buildReminderChannel collects them, increments
mana_ai_reminders_emitted_total{producer, severity} per emission, and
maps back to strings for the shared-ai loop input.

Why structured: the Prometheus label "severity" lets dashboards split
75-99% token-budget warnings (severity=warn) from 100%+ escalations
(severity=escalate) without NLP on the reminder text. Adding a new
producer that emits only info-level state (e.g. stale-sync warning)
falls out for free.

Active producer labels today:
  - token-budget (warn, escalate)
  - retry-loop (warn)

With this plus the scrape job (d087b4744), we can finally answer:
"does the budget warning actually change LLM behaviour?" — correlate
reminders_emitted_total{producer='token-budget'} with
tick_duration_seconds or planner_rounds_histogram.

3 tests updated to assert the new {producer, severity, text} shape
(16 reminder tests total, all green).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 15:10:27 +02:00
Till JS
8f283726b1 feat(agent-loop): activate retryLoopReminder via LoopState.recentCalls
Extends LoopState with a sliding window of the last N ExecutedCalls
(oldest-first), capped at LOOP_STATE_RECENT_CALLS_WINDOW = 5. The loop
maintains the window automatically; reminderChannel producers read it
without touching internal state.

This activates retryLoopReminder which was shape-only in faa472be9.
The guard now fires end-to-end: when round >= 3 and the tail-2 calls
both returned success:false, the LLM sees a "stop retrying, write a
summary instead" <reminder> on the next turn. The tail-2 check rather
than window-wide is deliberate — a flaky run with intermittent success
(F, F, F, OK, F) is not a retry loop, just flaky tools.

Why window=5: retry loops usually manifest within 2-3 consecutive
rounds; a 5-deep window gives room for burst-detection and
stale-tool heuristics without bloating the reminder channel. Cap
keeps the reminder producers O(5) regardless of loop length.

Tests: 3 new (sliding-window cap + slide + order in shared-ai, retry
composition + budget+retry chain + tail-only heuristic in mana-ai).
Total agent-loop tests now 74 across both packages.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 15:02:40 +02:00
Till JS
25c3bb6cdf docs(mana-mcp,mana-ai): CLAUDE.md coverage for M1 agent-loop primitives
mana-mcp:
  - Policy-gate section: POLICY_MODE semantics, the four decision
    rules, where to find soak metrics during log-only burn-in.
  - /metrics section pointing at the Prometheus job.

mana-ai:
  - New v0.8 status block: reminderChannel wiring, the two live
    producers (tokenBudgetReminder active, retryLoopReminder dormant
    pending LoopState extension), why POLICY_MODE here is limited to
    freetext inspection, why parallel-reads have no effect until the
    tool-registry absorbs the full AI_TOOL_CATALOG (M4 of personas).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 14:25:14 +02:00
Till JS
c94ab01c69 feat(mana-mcp): Prometheus metrics for policy gate + tool invocations
Replaces the stub /metrics endpoint with a real prom-client registry
(mana_mcp_ prefix, {service="mana-mcp"} default label). Default
process metrics come along for free.

Policy-gate telemetry is the whole point — without it we can't soak
POLICY_MODE=log-only safely or decide when to flip to enforce. New
counter mana_mcp_policy_decisions_total{decision, reason, mode} buckets
every evaluatePolicy() call:

  decision ∈ {allow, deny, flagged}
  reason   ∈ {admin-scope-not-invokable, destructive-not-allowed,
              rate-limit-exceeded, injection-marker, clean, unknown}
  mode     ∈ {log-only, enforce}

So the rate of "would have been denied" during soak is visible directly
as policy_decisions_total{decision="deny", mode="log-only"}.

Also:
  - mana_mcp_tool_invocations_total{tool, outcome} — success |
    handler-error | input-invalid. Policy denies are NOT counted here
    (they're in policy_decisions_total above); this counter only counts
    calls that actually reached the handler or tripped zod validation.
  - mana_mcp_tool_duration_seconds histogram per tool/outcome.

Dep: prom-client ^15.1.3 (same version mana-ai pins).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 14:23:08 +02:00
Till JS
f07eae3c01 feat(personas): M3.b-d — tick loop + Claude Agent SDK + persistence (real)
Previous commit 38dc80654 carries this M3 title but its payload is an
unrelated apps/api/picture change — shared-.git-index race with a
parallel session (see feedback_git_workflow.md). This commit holds the
actual M3.b/c/d code. Leaving the misnamed commit for the user to
re-attribute / revert as they prefer.

Closes the M3 loop from docs/plans/mana-mcp-and-personas.md. The
runner picks up due personas, drives each through Claude + MCP for
one simulated turn, collects actions + ratings, persists through
service-key internal endpoints in mana-auth.

Internal endpoints (mana-auth, service-key-gated)

- GET  /api/v1/internal/personas/due
    Returns personas whose tickCadence + lastActiveAt say they're
    due. Rules: hourly > 1h, daily > 24h, weekdays > 24h mon-fri.
    NULLS FIRST so never-run personas go ahead of stale ones.

- POST /api/v1/internal/personas/:id/actions
    Batch ≤ 500. Row ids are deterministic
    `${tickId}-${i}-${toolName}` + ON CONFLICT DO NOTHING so the
    runner can retry a tick without doubling audit rows. Also
    bumps personas.last_active_at so the next /due call sees it.

- POST /api/v1/internal/personas/:id/feedback
    Batch ≤ 100. Row id is `${tickId}-${module}` — natural key is
    one rating per module per tick.

Runner tick pipeline (services/mana-persona-runner/src/runner/)

- claude-session.ts
    Two phases per tick. runMainTurn feeds the persona's system
    prompt + a German "simulate a day" user prompt to Claude Agent
    SDK's query(), with mana-mcp wired in as a streamable-HTTP MCP
    server. We iterate the returned AsyncGenerator and extract
    tool_use blocks into ActionRows; a tool_result with
    is_error=true flips the most recent action. runRatingTurn is a
    fresh query() with tools:[] asking Claude in character to rate
    each used module 1-5 as strict JSON. We parse with tolerance
    for whitespace / fences. Unparseable output becomes a synthetic
    '__parse' feedback row so operators see the failure.

- tick.ts
    Orchestrator. Skips when config.paused. Fetches /due, processes
    in batches of config.concurrency via Promise.allSettled so a
    single persona failure never kills the batch. Returns
    {due, ranSuccessfully, failed[], durationMs}.

- types.ts
    ActionRow + FeedbackRow shapes shared between claude-session
    and the internal client.

Runner bootstrap (src/index.ts)

- setInterval(config.tickIntervalMs) starts the tick loop on boot.
  tickInFlight guards against overlap when Claude latency >
  interval. If MANA_SERVICE_KEY or ANTHROPIC_API_KEY is missing,
  loop is disabled with a warn line — /health + /diag/login still
  work.
- POST /diag/tick (dev-only) fires one tick on demand, returns
  the result. Avoids waiting a full interval during testing.
- Graceful SIGTERM/SIGINT shutdown clears the interval.

Client

- clients/mana-auth-internal.ts
    X-Service-Key client for the three endpoints above.
    Constructor throws on empty serviceKey — fail loud.

Boot smoke verified: /health returns ok, /diag/tick 500s with
descriptive messages when keys absent. Warning lines on boot when
keys are missing. Type-check green across mana-auth, tool-registry,
mcp, persona-runner.

M3 exit gate is the end-to-end smoke recipe (docker up → db:push →
seed:personas → diag/tick → psql) documented in
services/mana-persona-runner/CLAUDE.md.

M2.d (cross-space family/team memberships) still deferred.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 14:18:31 +02:00
Till JS
a1caeaa7f3 feat(personas): M3.a — scaffold mana-persona-runner service on :3070
First concrete piece of M3 (docs/plans/mana-mcp-and-personas.md). The
tick loop itself and the Claude Agent SDK + MCP integration are M3.b;
the action/feedback persistence endpoints are M3.c. This commit just
stands up the service so the remaining pieces have a shell to land in.

Service shape (Bun/Hono on :3070)

- src/config.ts
    Env-driven configuration: auth URL, MCP URL, service key for
    action/feedback callbacks (M3.c), Anthropic API key, deterministic
    PERSONA_SEED_SECRET (must match scripts/personas/password.ts so the
    runner can log back in without any stored credentials), tick
    interval and concurrency, RUNNER_PAUSED kill-switch. Production
    start asserts all secrets are set and the dev fallback secret is
    rotated.

- src/password.ts
    Bit-for-bit identical HMAC-SHA256 password derivation to
    scripts/personas/password.ts. Duplicated deliberately: the two
    sides can't share code (one is a repo-root utility script, the
    other is a workspace service) but must stay in sync — comment
    at the top calls this out.

- src/clients/auth.ts
    Two upstream calls the runner needs for one tick: POST /auth/login
    and GET /api/auth/organization/list. loginAndResolvePersonalSpace()
    wraps both and picks the persona's auto-created personal space as
    the write target (throws if none exists — Spaces-Foundation should
    always have seeded one on signup).

- src/index.ts
    Hono app: /health, /metrics (stub), and a dev-only /diag/login
    endpoint that takes a persona email, derives the password, logs
    in, resolves the personal space, and returns {userId, spaceId} as
    an end-to-end sanity check. Disabled in production.

No tick loop yet — RUNNER_PAUSED prints an info line on boot, but
nothing fires. The dispatcher + Claude Agent SDK + MCP client land in
M3.b; the internal POST callbacks into mana-auth for persona_actions /
persona_feedback land in M3.c.

Infra

- Port 3070 added to docs/PORT_SCHEMA.md.
- Service listed in root CLAUDE.md next to mana-mcp.
- services/mana-persona-runner/CLAUDE.md documents what's built today,
  what lands in M3.b/c, and the local diag smoke recipe.

Boot smoke verified: /health returns ok + paused/interval/concurrency,
/diag/login without email returns 400.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 14:00:43 +02:00
Till JS
faa472be91 feat(mana-ai): first live reminder producers — token budget + retry-loop
Wires the M1 reminderChannel into the mana-ai mission runner with two
initial producers in services/mana-ai/src/planner/reminders.ts:

- tokenBudgetReminder — warns at 75% of the agent's daily cap, emits a
  stronger "wrap up NOW" message at/above 100%. Uses pretick usage +
  accumulated round usage so the warning tracks drift during a long
  plan.
- retryLoopReminder — shape is in place (round≥3 + last 2 failures),
  currently limited to the single lastCall LoopState exposes. Extends
  cleanly once LoopState carries the full failure window.

buildReminderChannel composes active producers; the tick hoists
pretickUsage24h so the channel has the baseline. Each round the loop
re-evaluates the producers, so usage drift across rounds surfaces on
the NEXT turn.

Also exports LoopState + ReminderChannel from @mana/shared-ai top-level
so consumers don't need to reach into /planner.

Tests: 13 new bun tests covering thresholds, pretick+round summing,
composition, and per-round re-evaluation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 14:00:04 +02:00
Till JS
e5d230e599 feat(agent-loop): M1 — policy gate + reminder channel + parallel reads
Three Claude-Code-inspired primitives for runPlannerLoop, derived from the
reverse-engineering reports in docs/reports/:

1. **Policy gate** (@mana/tool-registry) — evaluatePolicy() gates every tool
   dispatch: denies admin-scope, denies destructive tools not in the user's
   opt-in list, rate-limits per tool (30/60s default), flags prompt-injection
   markers in freetext without blocking. Wired into mana-mcp with a
   per-user rolling invocation log and POLICY_MODE env (off|log-only|enforce,
   default log-only). mana-ai uses detectInjectionMarker only — tool dispatch
   there is plan-only, so rate-limit/destructive checks don't apply yet.

2. **Reminder channel** (packages/shared-ai/src/planner/loop.ts) — new
   reminderChannel callback in PlannerLoopInput. Called once per round with
   LoopState snapshot (round, toolCallCount, usage, lastCall); returned
   strings wrap in <reminder> tags and inject as transient system messages
   into THIS LLM request only. Never pushed to messages[] — the Claude-Code
   <system-reminder> pattern that keeps the KV-cache prefix stable.

3. **Parallel reads** (loop.ts) — isParallelSafe predicate enables
   Promise.all dispatch when every tool_call in a round is parallel-safe,
   in batches of PARALLEL_TOOL_BATCH_SIZE=10. Any non-safe call downgrades
   the whole round to sequential. messages[] always appends in source
   order, never completion order, so the debug log stays linear.
   Default-off (undefined predicate) preserves pre-M1 behaviour.

Tests: 21 new in tool-registry (policy), 9 new in shared-ai (5 parallel,
4 reminder). All 74 green, type-check clean across 4 packages.

Design/plan: docs/plans/agent-loop-improvements-m1.md
Reports: docs/reports/claude-code-architecture.md,
         docs/reports/mana-agent-improvements-from-claude-code.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 13:56:40 +02:00
Till JS
493db0c3b2 feat(personas): M2.a-c — persona schemas + admin endpoints + seed pipeline
Continuation of docs/plans/mana-mcp-and-personas.md. Personas are the
auto-test users the M3 runner will drive — they're real Mana users
(kind='persona', tier='founder'), registered through the same Better
Auth pipeline as humans, just stamped differently and metadata-tracked
so the persona-runner knows how to role-play them.

Schemas (auth namespace — personas are 1:1 with users, no reason for a
separate platform.* schema that the plan originally sketched)

- userKindEnum ('human' | 'persona' | 'system') + users.kind column,
  wired into better-auth additionalFields so the JWT/user object carry
  the flag. Default 'human' keeps every existing user untouched.
- auth.personas — 1:1 descriptor (archetype, systemPrompt, moduleMix
  jsonb, tickCadence, lastActiveAt). CASCADE from users.id.
- auth.persona_actions — tick-grouped audit of every tool call the
  runner makes (toolName, inputHash for dedup, result, latency).
- auth.persona_feedback — structured 1-5 ratings per module per tick,
  plus free-text notes. This is where the runner writes the
  self-reflection step at end of each tick.

Admin endpoints (/api/v1/admin/personas, admin-tier-gated)

- POST /            create-or-update by email. Uses auth.api.signUpEmail
                    if the user's new, then stamps kind+tier+verified
                    and upserts the personas row. Idempotent — safe to
                    re-run after catalog edits.
- GET  /            list with 7-day action count per persona.
- GET  /:id         detail + recent 20 actions + per-module feedback
                    aggregate.
- DELETE /:id       hard delete. Refuses non-persona users as
                    defense-in-depth: an admin typo here would cascade
                    through the full user-delete chain.

Catalog + seed pipeline (scripts/personas/)

- catalog.json      10 handwritten personas spanning 7 archetypes
                    (adhd-student, ceo-busy, creative-parent, solo-dev,
                    researcher, freelancer, overwhelmed-newbie).
                    Five pairs of personas that will later share
                    family/team spaces (cross-space setup is deferred
                    to M2.d per the plan).
- catalog.ts        zod-validated loader. Refines email to require
                    @mana.test TLD — non-existent, no bounce risk.
- password.ts       deterministic HMAC-SHA256(PERSONA_SEED_SECRET,
                    email). No stored per-persona credentials; the
                    runner re-derives on every login. Refuses the
                    dev-fallback secret in production.
- seed.ts           POST /admin/personas per catalog entry. Flags:
                    --auth=, --jwt=, --dry-run.
- cleanup.ts        Hard-delete every live persona. Warns when the
                    live set drifts from the catalog.

Root package.json:
  pnpm seed:personas
  pnpm seed:personas:cleanup

Extends the ESLint root-ignore list with `scripts/**` so Bun-typed
utility scripts don't fail the typed-parser check they weren't opted
into. Consistent with the rest of scripts/ being .mjs+.sh.

To go live (user action):
  pnpm docker:up
  cd services/mana-auth && bun run db:push
  export MANA_ADMIN_JWT=...
  pnpm seed:personas

M2.d deferred: cross-space (family/team/practice) memberships between
persona pairs. Better Auth's org-invite flow is multi-step and would
roughly double the M2 scope; the persona-runner (M3) can operate in
personal spaces first, shared-space tests land as their own milestone.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 13:55:14 +02:00
Till JS
16c8818338 feat(mcp): M1+M1.5 MCP gateway + tool-registry + shared-crypto
Foundation for autonomous Claude-driven testing. Plan:
docs/plans/mana-mcp-and-personas.md.

New packages
- @mana/tool-registry — schema-first ToolSpec<InputSchema, OutputSchema>
  with zod generics, scope ('user-space' | 'admin') and policyHint
  ('read' | 'write' | 'destructive'). sync-client helpers speak the
  mana-sync push/pull protocol directly so RLS and field-level LWW are
  preserved. MasterKeyClient fetches per-user MKs via the existing
  mana-auth GET /api/v1/me/encryption-vault/key endpoint (JWT-gated,
  ZK-aware, already audited) — no new service-key endpoint built.
  ZeroKnowledgeUserError surfaced as a typed throw.
- @mana/shared-crypto — AES-GCM-256 primitives extracted from the web
  app's $lib/data/crypto/aes.ts so the server-side tool handlers and the
  browser produce byte-for-byte identical wire format
  (enc:1:{b64(iv)}.{b64(ct)}). Web app aes.ts now re-exports from
  shared-crypto — 5 existing importers unchanged, svelte-check stays
  green.

New service
- services/mana-mcp (:3069, Bun/Hono) — MCP Streamable HTTP gateway.
  JWKS auth against mana-auth, per-user session isolation (session-id
  belongs to the user who opened it — cross-user access returns 403),
  admin-scoped tools filtered out before registration. MasterKeyClient
  cached per process with a 5-minute TTL.

11 tools registered
- habits.{create,list,update,archive}, spaces.list (plaintext, M1)
- todo.{create,list,complete}, notes.{create,search}, journal.add
  (encrypted — field lists match
  apps/mana/apps/web/src/lib/data/crypto/registry.ts verbatim)

Infra
- Port 3069 added to docs/PORT_SCHEMA.md
- services/mana-mcp/CLAUDE.md with architecture, auth model,
  tool-authoring recipe, local smoke-test steps
- Root CLAUDE.md services list updated

Type-check green across shared-crypto, mana-tool-registry, mana-mcp.
svelte-check on apps/mana/apps/web stays at 0 errors / 0 warnings.
Boot smoke verified: /health returns registry.loaded=true, unauthed
/mcp → 401, invalid-JWT /mcp → 401 with descriptive message.

Decisions locked in for later milestones (per plan D1–D10):
- Personas will be real mana-auth users (users.kind='persona'), no
  service-key bypass (D1, D2)
- Tool-registry is the SSOT; mana-ai and the legacy
  apps/api/src/mcp/server.ts get merged into it in M4 (three current
  parallel tool catalogs collapse to one)
- Persona-runner (:3070) will be a separate service using the Claude
  Agent SDK + MCP client (D5)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 13:18:35 +02:00
Till JS
c1498c1099 fix(infra): include shared-types in mana-auth Dockerfile installer
mana-auth's package.json declares @mana/shared-types as a workspace
dependency, but the Dockerfile's install stage never copied its source
into the build context. pnpm then silently failed to create the
workspace symlink under node_modules, and bun hit ENOENT on every
import at runtime: "reading /app/services/mana-auth/node_modules/
@mana/shared-types".

The broken image sat undetected as long as the long-running container
didn't restart. Tonight's deploy recreated it and every mana-auth
container immediately crash-looped — taking mana-api and mana-web
down with it via depends_on.

Same class of bug as 70c62e758 (shared-logger).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 02:31:20 +02:00
Till JS
fd1ea47075 feat(backup): client-driven v2 snapshot export, drop server-side backup
Replaces the mana-sync event-stream export (GET /backup/export) with a
fully client-driven `.mana` v2 archive: webapp reads Dexie, decrypts
per-field, packages JSONL + manifest, optionally PBKDF2+AES-GCM seals
with a passphrase.

- New: backup/v2/{format,passphrase,export,import}.ts + format.test.ts
  (10 tests: round-trip, sealed path, 3 failure modes incl. wrong-
  passphrase vs. tamper distinction).
- UI: ExportImportPanel with module multi-select, optional passphrase,
  progress + sealed-file detection — replaces the old backup flow in
  Settings → MyData.
- Removes services/mana-sync/internal/backup/ and the corresponding
  client helpers + v1 tests. No parallel paths, no legacy shim.
- Why client-driven: zero-knowledge users hold their vault key only
  client-side, so a server exporter cannot produce plaintext archives;
  GDPR Art. 20 portability is better served by plaintext-by-default.
- Cross-account restore works via re-encryption under the target
  vault key (no MK transfer needed).

DATA_LAYER_AUDIT.md §8 rewritten to reflect the new architecture.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 18:46:29 +02:00
Till JS
3a7bc7f1c3 test(mana-research): fixture-based tests for Gemini poll-response parser
Re-commit of c413ab7dd (reverted in c31dcdd66) without the unrelated
files that accidentally got swept into the original stage. Parser
content is identical.

The real Gemini /v1beta/interactions/:id completed shape bit us once
already during the initial smoke-test (we had OpenAI-style nested
`output.message.content[]` coded; reality is a flat `outputs` array
of thought|text|image items, with url_citations that carry no title
and usage fields named `total_input_tokens` rather than `input_tokens`).

This test pins the parser against a synthetic fixture covering the
cases we saw in the wild plus the failure modes that are hard to
provoke from a live API call:

  - status dispatch (queued, in_progress, failed, cancelled, incomplete)
  - completed body concatenated across text items, skipping thought/image
  - empty/missing `outputs` without crashing
  - missing usage
  - citations deduped by url, hostname extracted as title
  - wrong-type annotations and those without url skipped
  - real vertexaisearch redirect URLs Gemini emits
  - fallback to url as title when the URL is unparseable
  - trimming of leading/trailing whitespace

To make this testable I pulled the completed-branch of
pollGeminiDeepResearch into a standalone parseInteractionResponse
helper — same behaviour, now reachable without mocking global fetch.

Also adds the `test` script to package.json so `pnpm --filter
@mana/research-service test` works.

17 pass / 0 fail.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 18:44:21 +02:00
Till JS
c31dcdd66c Revert "test(mana-research): fixture-based tests for Gemini poll-response parser"
This reverts commit c413ab7dd3.
2026-04-22 18:43:48 +02:00
Till JS
c413ab7dd3 test(mana-research): fixture-based tests for Gemini poll-response parser
The real Gemini /v1beta/interactions/:id completed shape bit us once
already during the initial smoke-test (we had OpenAI-style nested
`output.message.content[]` coded; reality is a flat `outputs` array
of thought|text|image items, with url_citations that carry no title
and usage fields named `total_input_tokens` rather than `input_tokens`).

This test pins the parser against a synthetic fixture covering the
cases we saw in the wild plus the failure modes that are hard to
provoke from a live API call:

  - status dispatch (queued, in_progress, failed, cancelled, incomplete)
  - completed body concatenated across text items, skipping thought/image
  - empty/missing `outputs` without crashing
  - missing usage
  - citations deduped by url, hostname extracted as title
  - wrong-type annotations and those without url skipped
  - real vertexaisearch redirect URLs Gemini emits
  - fallback to url as title when the URL is unparseable
  - trimming of leading/trailing whitespace

To make this testable I pulled the completed-branch of
pollGeminiDeepResearch into a standalone parseInteractionResponse
helper — same behaviour, now reachable without mocking global fetch.

Also adds the `test` script to package.json so `pnpm --filter
@mana/research-service test` works.

17 pass / 0 fail.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 18:34:33 +02:00
Till JS
2a18cb5ee4 feat(mana-ai): v0.7 — cross-tick Deep Research Max pre-planning
Opt-in path for missions that want Gemini Deep Research Max (up to 60 min
per task) instead of the shallow RSS pre-research. Because Max runs well
past a single 60-second tick, the state is carried across ticks:

  tick N:   submit → INSERT mission_research_jobs row → skip planner
  tick N+k: poll → still running → skip planner (metric pending_skips)
  tick N+m: poll → completed → inject as ResolvedInput, DELETE row, plan

- ManaResearchClient talks to mana-research's new internal
  /v1/internal/research/async endpoints with X-Service-Key +
  X-User-Id. Graceful-null on transport errors so a flaky
  mana-research never crashes the tick loop.
- New table mana_ai.mission_research_jobs with PK (user_id, mission_id)
  — presence is the "pending" flag; delete-on-terminal keeps queries
  trivial.
- handleDeepResearch() encapsulates the state machine; planOneMission
  now returns a discriminated union (planned | skipped | failed) so
  "research pending" isn't miscounted as a parse failure.
- Opt-in at TWO gates to keep cost in check ($3–7/task, 1500 credits
  per run):
    1. MANA_AI_DEEP_RESEARCH_ENABLED=true server-side (default off)
    2. DEEP_RESEARCH_TRIGGER regex matches the mission objective
       (strict: "deep research", "tiefe recherche", "umfassende
       recherche", "hintergrundrecherche", "deep dive")
  Falls back to shallow RSS when either gate fails or the submit
  errors upstream.
- Prom metrics: mana_ai_research_jobs_{submitted,completed,failed}_total
  labelled by provider, plus _pending_skips_total.
- docker-compose wires MANA_RESEARCH_URL + the opt-in flag and adds
  mana-research to depends_on.
- Full write-up with real API response shape (outputs plural, not
  OpenAI-style), step-3 MCP-server plan (security-gated, not built),
  ops + kill-switch: docs/reports/gemini-deep-research.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 17:56:06 +02:00
Till JS
f10a95e842 feat(mana-research): add Gemini 3.1 Pro Deep Research async providers
- New providers gemini-deep-research + gemini-deep-research-max on the
  Interactions API (preview-04-2026). Submit/poll split, tier parameter
  selects between standard (~minutes, $1–3) and max (up to 60 min, $3–7).
- Parser matches the real response shape: flat `outputs` array of
  thought|text|image items, url_citation annotations without title,
  `usage.total_input_tokens` / `total_output_tokens`.
- Route generalisation: /v1/research/async accepts `provider` with
  default 'openai-deep-research' (backward compatible) and dispatches
  to the right submit/poll pair.
- New internal service-to-service endpoint /v1/internal/research/async
  gated by X-Service-Key + X-User-Id for credit accounting. Enables
  mana-ai to drive deep-research jobs on the mission owner's wallet
  without requiring a user JWT.
- Pricing: 300 credits (standard) / 1500 credits (max). Conservative
  markup over the ~$3/$7 ceiling so the first runs can't surprise us.
- Docs: AGENT_PROVIDER_IDS + pricing + env map + auto-router stay in
  sync; CLAUDE.md Phase 3b now current; API_KEYS.md references the
  new providers under GOOGLE_GENAI_API_KEY.

Verified with a real smoke test against the Gemini API: submit + poll
both succeed, completed response parsed cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 17:55:30 +02:00
Till JS
177734a860 fix(tsconfig): unblock shared-types consumers
shared-types/src/index.ts re-exports with explicit .ts extensions
(Tailwind v4 module resolver needs them). TS 5.7 requires consumers
to opt in via allowImportingTsExtensions. The flag only type-checks
when noEmit:true; the NestJS builder also needs
rewriteRelativeImportExtensions so tsc still emits valid JS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 18:53:55 +02:00
Till JS
12be75e6a6 fix(broadcast): track route paths + shared-branding tsconfig
Two fixes surfaced by the end-to-end smoke test.

1. broadcast-track.ts: inner route paths double-prefixed.
   Routes were declared as '/track/open/:token' etc, then
   mounted at '/api/v1/track', yielding '/api/v1/track/track/open/:token'
   — every tracking endpoint returned 404. Dropped the redundant
   '/track/' prefix so the full path is now
   '/api/v1/track/{open,click,unsubscribe}/:token' as the
   orchestrator + client both expect.

   Verified with live curl:
   - /track/open/BAD → 200 image/gif 42 bytes (graceful no-signal)
   - /track/click/?url missing → 400 missing url
   - /track/click?url=javascript: → 400 bad url
   - /track/click?url=https://ok + bad token → 302 graceful
   - /track/unsubscribe/BAD GET → 400 HTML
   - /track/unsubscribe/BAD POST → 400 (RFC 8058)

2. shared-branding/tsconfig.json: allowImportingTsExtensions
   missing. shared-types/src/index.ts uses explicit .ts
   imports (intentional, for Tailwind's module resolver); any
   downstream tsconfig without allowImportingTsExtensions emits
   8 errors. shared-auth already had this fix — shared-branding
   gets the same treatment. noEmit:true is set, so no rewrite
   flag needed.

   Verified: shared-branding pnpm check → 0 errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 18:30:47 +02:00
Till JS
1861e89d45 chore(broadcast): wire mana-mail into env pipeline + push schema
The three final pre-dogfood items:

1. drizzle.config: schemaFilter now includes 'broadcast' alongside
   'mail'. Without this, `bun run db:push` skipped the broadcast
   tables — schema existed in code but not in Postgres. Tested via
   db:push + psql \dt (3 tables created: campaigns, events, sends).

2. .env.development: new MANA-MAIL SERVICE section with Stalwart
   knobs + broadcast config (tracking secret, rate limits, send
   throttle). DEV secret is explicitly labelled non-production —
   prod rotates via env.

3. generate-env.mjs: new block writes services/mana-mail/.env on
   `pnpm setup:env`. Mirrors the invoices / research / events
   pattern. All 16 broadcast/mail vars flow through from SSOT.

Verified end-to-end:
- pnpm setup:env → services/mana-mail/.env contains
  BROADCAST_TRACKING_SECRET + rate limits
- bun run src/index.ts → /health returns 200 with the new config
- psql → broadcast.campaigns / events / sends are materialised

Broadcast module is now fully ready to send real mail — nothing
else required before the first dogfood campaign.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 16:21:57 +02:00
Till JS
260dd312a9 feat(broadcast): M8 DNS auth check (SPF / DKIM / DMARC)
Closes the last plan milestone. Users can verify their sending-domain
setup without leaving the broadcast settings page.

Server (mana-mail)
- services/dns-check.ts: parseSpf / parseDkim / parseDmarc are pure
  functions. SPF accepts include:<mailDomain>, flags weak (+all) and
  wrong (include missing) and multi-record (RFC 7208 §3.2). DKIM
  needs v=DKIM1 + a p= public-key segment. DMARC requires v=DMARC1,
  flags p=none as weak (monitoring only), ok on quarantine/reject.
  All three are case-insensitive.
- lookupTxt(): DNS-over-HTTPS against Cloudflare 1.1.1.1 — avoids
  the Bun/container udp-resolver flakiness and works everywhere.
  Multi-string TXT (`"a" "b"`) get concatenated before parsing.
- checkDomain(): one call, three parallel DoH lookups, returns a
  structured result with suggested copy-paste records scoped to the
  user's actual mail domain from config.
- Route: GET /v1/mail/dns-check?domain=&selector= (JWT auth). Zod
  validates the domain looks sensible before hitting DoH.
- 16 unit tests covering all three parsers + multi-record edge case.

Client
- api.ts: runDnsCheck(domain, selector?) helper with typed result.
- components/DnsCheckBanner.svelte: derives domain from the default
  from-email (after @), calls the check on-demand, renders per-record
  status chips (ok / weak / wrong / missing) with messages, exposes
  copy-pasteable SPF + DMARC records when anything's off. DKIM setup
  is provider-specific so we show a hint rather than a canned record.
  Last-check timestamp persists to settings.dnsCheck so the banner
  survives a reload without re-hitting the API.
- Wired into SettingsForm between Impressum and Standard-Footer —
  where the user is already thinking about "what's required to
  actually send".

All checks clean:
- webapp pnpm check: 0 broadcast errors (4 pre-existing articles errors
  from parallel Spaces work, unrelated)
- mana-mail tests: 36/36 across tracking-token + link-rewriter + dns-check
- mana-mail build: 2.51 MB (+8 KB for juice — dns-check itself is ~3 KB)

Plan: docs/plans/broadcast-module.md §M8. All 10 milestones now done.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 15:48:03 +02:00
Till JS
a312d98f09 feat(broadcast): click-link tracking + send throttle
Closes the last two dogfood blockers before real-campaign use.

link-rewriter.ts
- rewriteClickLinks(): walks <a href="http…"> in the HTML body and
  replaces each URL with /api/v1/track/click/{token}?url={original}
  so clicks go through the tracking endpoint. Regex-based because
  Tiptap output is well-formed; returns a count for debugging.
- Leaves mailto: / tel: / anchor fragments alone — wrapping those
  breaks the recipient's native handler and accomplishes nothing.
- `skipUrls` param carries the unsubscribe + web-view URLs (already
  tracking endpoints themselves) so they don't get double-wrapped.
- 11 unit tests covering http/https rewriting, skip list, non-http
  schemes, attribute preservation, multi-link count, quoted-attr
  variants, idempotency.

Orchestrator wiring
- substituteUrls now calls rewriteClickLinks after the preview-
  placeholder swap and before the open-pixel injection. The
  unsubscribe + web-view URLs from this same function are passed
  in as skip entries so they survive the pass untouched.
- Constructor gains `sendThrottleMs` param (default 150ms).
- Main send loop awaits sleep(throttleMs) between iterations. 150ms
  = ~6/sec = ~360/min, safely below most SMTP provider limits.
  100-recipient campaign = ~15s extra wall-clock but that's fine
  for MVP (and most campaigns are way smaller).

Config
- New env BROADCAST_SEND_THROTTLE_MS (default 150). Wired from
  loadConfig to the orchestrator constructor.

The broadcast module is now functionally complete for dogfooding.
Remaining before a real campaign can actually go out: run
`cd services/mana-mail && bun run db:push` to materialise the
broadcast.* schema tables.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 15:07:58 +02:00
Till JS
d887fc125d feat(broadcast): settings + detail view + compliance polish
Closes the "could actually dogfood" gap: legal address can be set,
sent campaigns have a proper view with live stats, and the send path
respects DSGVO.

Webapp
- components/SettingsForm.svelte: sender defaults + Impressum (required,
  highlighted amber until filled) + footer. Matches the invoices
  SenderProfileForm pattern — immediate save, dedicated section per
  concern.
- /broadcasts/settings/+page.svelte: mounts the form. ComposeView step
  3's "Einstellungen öffnen" CTA now lands somewhere.
- views/DetailView.svelte: read-only view for sent/scheduled/cancelled
  campaigns. 5-card stats grid (sent, open, click, bounce, unsub) with
  rate percentages. Polls mana-mail every 30s for up to 30 min after
  mount, persists back to Dexie via applyServerStatus so the list view
  + widget catch up. Includes a preview of the actual rendered campaign
  so "what went out" is visible after the fact.
- /broadcasts/[id]/+page.svelte: DetailView for non-drafts; drafts
  bounce to /edit via $effect-triggered goto.
- ListView row-click now routes by status (draft → edit, else → detail).

mana-mail compliance
- Orchestrator loadUnsubscribedEmails(): queries broadcast.sends WHERE
  status='unsubscribed' scoped to the user, filters the recipient list
  BEFORE any send rows get written. Campaign's totalRecipients reflects
  the post-skip count so open rates aren't inflated by "virtual sends".
  Skipped count surfaces in result.errors for the UI to show.
- jmap-client.submitEmail: new extraHeaders param. Sets custom headers
  via JMAP's `header:<Name>:asText` property convention.
- Orchestrator sets RFC 8058 headers per recipient:
    List-Unsubscribe: <https://.../track/unsubscribe/{token}>
    List-Unsubscribe-Post: List-Unsubscribe=One-Click
  This is what makes Gmail / Apple Mail show their native "Abmelden"
  button in the message header (not just a body link).

All checks clean: 0 TS errors, 37/37 webapp tests, 9/9 tracking-token
tests, mana-mail bun build = 2.50 MB.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 14:43:36 +02:00
Till JS
f17383f9f2 feat(broadcast): M4 bulk-send via mana-mail + tracking infrastructure
End-to-end send path lives: click "Jetzt senden" in step 4 → client
resolves recipients → POST /v1/mail/bulk-send → mana-mail loops through
JMAP with per-recipient signed URLs → status flips draft → sent.

mana-mail (backend)
- New Postgres schema `broadcast.{campaigns,sends,events}` in Drizzle.
  Campaigns + sends keyed on the webapp's local ids so joins are free;
  events append-only with send_id FK, dedup at query-time not write-time
  so tracking pixel hits don't contend on a transaction.
- tracking-token.ts: HMAC-SHA256 over JSON({campaignId, sendId, nonce}),
  base64url.base64url encoded. JSON inner payload instead of delimiter
  splits so IDs can contain any character. timingSafeEqual for the HMAC
  comparison. 9 unit tests covering roundtrip / tamper / malformed.
- broadcast-orchestrator.ts: takes pre-resolved recipient list, inlines
  CSS once via juice (webResources.images=false so no external fetches
  slow the loop), per-recipient substitutes `{{unsubscribe_url}}` /
  `{{web_view_url}}` + injects open pixel, submits each mail through
  the user's own JMAP account. Writes sends rows first (status=queued)
  so a crash mid-loop leaves truthful DB state. Returns aggregate
  stats + per-email errors.
- Routes: POST /v1/mail/bulk-send (JWT, cap at 5000 recipients via
  zod + config), GET /v1/mail/campaigns/:id/events (JWT, aggregates
  opens + clicks + unsubscribes with COUNT DISTINCT for the "unique"
  metric), GET/POST /v1/track/{open,click,unsubscribe}/:token (public,
  no auth, signed URL is the only gate).
- Track routes mounted OUTSIDE /api/v1/mail/* because the JWT
  middleware guards that subtree — recipients aren't logged in.
- Config: BROADCAST_TRACKING_SECRET (separate from SERVICE_KEY so the
  blast radius of a leak stays narrow),
  BROADCAST_MAX_RECIPIENTS_PER_CAMPAIGN (default 5000),
  BROADCAST_MAX_RECIPIENTS_PER_HOUR (default 500, not yet enforced).
- Added juice@^11 dependency.

Webapp (client)
- api.ts: sendCampaign() resolves the audience from Dexie contacts,
  renders the full email HTML + plaintext with placeholders, POSTs to
  mana-mail. Contacts NEVER leave the client decrypted — the server
  only sees the flat recipient list the user's client produced.
- fetchCampaignStats() for M7 dashboard/detail polling.
- ComposeView step 4 replaced: confirmation modal with "sicher?"
  question, sending state with spinner, done state with delivered
  count + expandable per-email error list + "Zur Übersicht" button.
- Status transitions to 'sent' with cached stats after successful
  send via applyServerStatus.

Known M4 gaps (fill in M5)
- Open/click/unsubscribe track endpoints return valid responses but
  event dedup is rough — one insert per hit, dedup at query time
  only. M5 adds windowed IP-hash dedup.
- Synchronous send loop. 100 recipients ≈ 15s blocking. M5/M6 moves
  this to an async job queue with SSE progress.
- Each recipient generates a "Sent" folder entry in the user's
  Stalwart mailbox. Fine for 50-recipient newsletters, silly for
  5000. Phase 2 carves out a dedicated broadcast mailbox.

Plan: docs/plans/broadcast-module.md §M4.
Next: M5 open/click tracking with dedup + rate-limits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 13:53:13 +02:00
Till JS
1d3794f96c feat(mana-ai): Prometheus metrics for tool-calls, loop rounds, provider errors
Three new counters + one histogram fill the observability gap from
the function-calling migration:

- mana_ai_tool_calls_total{tool, policy, outcome} — one tick per
  tool_call the planner produced. `outcome` is `deferred` on the
  server (stub onToolCall records for later client execution);
  webapp runner will emit success/failure once it grows its own
  Prom surface.
- mana_ai_planner_rounds (histogram, buckets 1..5) — distribution of
  rounds consumed per iteration. Runs close to the cap signal a
  planner struggling with the mission objective.
- mana_ai_provider_errors_total{provider, kind} — structured errors
  surfaced from mana-llm. Kind mirrors the ProviderError hierarchy
  added in commit 1 of the migration (blocked/truncated/auth/
  rate_limit/capability/unknown).

Plumbing:
- llm-client.ts parses mana-llm's `{detail: {kind, message}}` 4xx/5xx
  body shape and re-throws as ProviderCallError carrying the kind.
- tick.ts observes metrics at the natural emission points — rounds
  + per-call counter after runPlannerLoop returns, provider_errors
  in the catch block.

Grafana dashboards + status.mana.how already pick up the
collectDefaultMetrics prefix, so these metrics land in the existing
mana-ai panel without scraper changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 20:48:29 +02:00
Till JS
38d35247cd feat(spaces): end-to-end shared-space sync (membership lookup + plaintext)
Closes the gap between "invite flow UI exists" and "two users in the
same space actually see each other's data". Three pieces land together
because they're meaningless without each other.

mana-auth — new internal endpoint:
  GET /api/v1/internal/users/:userId/memberships
  Returns [{organizationId, role}, ...] for the user. mana-sync uses
  this to populate the multi-member RLS session config.

mana-sync — membership lookup:
  new internal/memberships package with an HTTP client + 5 min
  per-user cache, fail-open (empty list = pre-Spaces behavior).
  Config gets MANA_AUTH_URL (default http://localhost:3001).
  Handler.NewHandler takes the Lookup. Every Push/Pull/Stream call
  now passes spaceIDsFor(userID) to Store methods.
  GetChangesSince + GetAllChangesSince extend their WHERE clause:
    WHERE (user_id = $1 OR space_id = ANY($memberSpaces))
  so co-members see each other's rows, not just the author.

apps/web — encryption skip for shared-space records:
  encryptRecord now checks record.spaceId:
    - `_personal:<userId>` sentinel OR no active shared space → encrypt
      with user master key (E2E as today).
    - Active space resolves to non-personal type AND spaceId matches
      that space → skip encryption; write lands plaintext.
  decryptRecord is unchanged because its per-field isEncrypted() guard
  already passes plaintext through.
  Phase-1 compromise: shared-space data is protected by server RLS
  only, not E2E. Phase 2 adds per-Space shared keys with per-member
  wrap — tracked in docs/plans/spaces-foundation.md.

Plus docs/plans/shared-space-smoketest.md: step-by-step Zwei-User-Test
mit erwarteten Ergebnissen und Debugging-Hinweisen bei Problemen.

Build + go test + web check all green.

Plan: docs/plans/spaces-foundation.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 20:46:53 +02:00