Tests:
- packages/feedback/src/avatar.test.ts — 10 unit tests (determinism,
mirror-symmetry, color contrast, padding-resilience, pseudonym-
integration, density-sanity).
- services/mana-analytics/src/services/feedback-redact.test.ts —
9 privacy-boundary tests verifying:
* anonymous path NEVER includes realName, even when author opted in
* auth path NEVER includes realName when author opted OUT
* realName only when (opted-in AND auth-path) — both gates required
* userId / deviceInfo / voteCount stripped from output
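The two-gate rule those tests pin down can be sketched as follows. This is an illustrative reduction, not the real module: `RedactInput` and `redactForPath` are hypothetical names, and the shipped redactor handles more fields.

```typescript
// Hypothetical sketch of the two-gate realName rule the tests verify.
type Path = "anonymous" | "auth";

interface RedactInput {
  realName?: string;
  optedIn: boolean; // community_show_real_name
  userId: string;
  deviceInfo?: string;
  voteCount?: number;
  body: string;
}

interface RedactedItem {
  body: string;
  realName?: string; // present only when BOTH gates pass
}

function redactForPath(item: RedactInput, path: Path): RedactedItem {
  const out: RedactedItem = { body: item.body };
  // Both gates required: author opted in AND the caller is on the auth path.
  if (path === "auth" && item.optedIn && item.realName) {
    out.realName = item.realName;
  }
  // userId / deviceInfo / voteCount are never copied to the output.
  return out;
}
```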
Plan-Doc:
- docs/plans/feedback-rewards-and-identity.md status → shipped (3.A,
3.B, 3.C, 3.F live; 3.D, 3.E open) mit Commit-Hashes.
Service-Layer minor: REWARD-const + redact als __TEST__-Export
publik gemacht (nur fürs Testen, kein Verhaltensänderung).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Turns the pseudonyms into real characters without forcing real names.
Pixel identicon avatar (3.C.2):
- generateAvatarSvg(displayHash) — pure function, deterministic.
  5×5 left-mirrored identicon with HSL foreground/background derived
  from the hash. Inline SVG, no storage, no img-load flicker.
- <EulenAvatar> component in the package, rendered in ItemCard next to
  the pseudonym.
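A minimal sketch of how such a deterministic mirrored identicon can be derived from a hex hash — the hue mapping, bit layout, and helper structure here are assumptions, not the shipped implementation:

```typescript
// Illustrative 5×5 left-mirrored identicon, assuming displayHash is a
// hex string (e.g. SHA-256). Columns 3-4 mirror columns 1-0.
function generateAvatarSvg(displayHash: string, size = 80): string {
  const bytes = displayHash.match(/../g)!.map((h) => parseInt(h, 16));
  const hue = ((bytes[0] << 8) | bytes[1]) % 360; // hue from the hash
  const fg = `hsl(${hue} 70% 45%)`;
  const bg = `hsl(${hue} 30% 92%)`;
  const cell = size / 5;
  let rects = "";
  // 15 bits drive the left 3 columns of the 5 rows.
  for (let row = 0; row < 5; row++) {
    for (let col = 0; col < 3; col++) {
      if (((bytes[2 + row] >> col) & 1) === 0) continue;
      for (const c of col === 2 ? [2] : [col, 4 - col]) {
        rects += `<rect x="${c * cell}" y="${row * cell}" width="${cell}" height="${cell}" fill="${fg}"/>`;
      }
    }
  }
  return `<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 ${size} ${size}">` +
    `<rect width="${size}" height="${size}" fill="${bg}"/>${rects}</svg>`;
}
```

Because the output is a pure function of the hash, two renders of the same pseudonym are byte-identical, which is what the determinism tests check.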
Real-name toggle (3.C.1):
- auth.users + community_show_real_name boolean (default off, opt-in).
- PATCH /api/v1/me/profile accepts communityShowRealName.
- mana-analytics LEFT JOINs auth.users → when opted in, the auth-
  required /public + /me/reacted endpoints additionally return realName.
- Anonymous /api/v1/public/feedback/* NEVER shows realName — not even
  when opted in. The public mirror stays safe for SEO + privacy.
- Migration 008_community_identity.sql applied locally + in prod.
Karma system (3.C.3):
- auth.users + community_karma int. toggleReaction increments/decrements
  on the author's user (self-reactions don't count — no self-farming).
- KARMA_THRESHOLDS + tierFromKarma() in the package: Bronze (0-9) /
  Silver (10-49) / Gold (50-199) / Platinum (200+).
- ItemCard shows a tier dot next to the pseudonym, with a title tooltip
  showing the karma count. Floor-clamped at 0.
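The tier mapping can be sketched like this — the threshold values come from this commit, but the shape of the exported `KARMA_THRESHOLDS` constant is an assumption:

```typescript
// Sketch of the tier mapping; the real KARMA_THRESHOLDS export may
// differ in shape. Thresholds: Bronze 0-9, Silver 10-49, Gold 50-199,
// Platinum 200+.
const KARMA_THRESHOLDS = [
  { tier: "platinum", min: 200 },
  { tier: "gold", min: 50 },
  { tier: "silver", min: 10 },
  { tier: "bronze", min: 0 },
] as const;

type KarmaTier = (typeof KARMA_THRESHOLDS)[number]["tier"];

function tierFromKarma(karma: number): KarmaTier {
  const clamped = Math.max(0, karma); // floor-clamped at 0
  return KARMA_THRESHOLDS.find((t) => clamped >= t.min)!.tier;
}
```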
Eule profile (3.C.4):
- GET /api/v1/public/feedback/eule/{hash} — all public posts by this
  Eule + aggregated karma. SHA-256 format validation.
- /community/eule/[hash] public SSR route with avatar hero, tier badge,
  karma counter, and post list. Clicking the author in ItemCard
  navigates there.
- publicFeedbackService.getEulenProfile() in the package.
PublicFeedbackItem extended with displayHash (public pseudonym ID;
SHA-256 is one-way → safe to expose) + karma + optional realName.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the loop between submit and ship. Users now get:
- a toast on the next app start when one of their own or supported
  wishes was moved to ›planned/in_progress/completed/declined‹
- /profile/my-wishes as a personal roadmap with three tabs:
  Own · Supported · Inbox
Server (mana-analytics):
- New table feedback_notifications with ON DELETE CASCADE on
  user_feedback. Migration 0004 applied locally + in prod.
- adminUpdate enqueues author notifications on every status transition.
  AdminResponse edits fire their own 'admin_response' notify.
  tryGrantShipBonus additionally attaches notifications for reactors
  (›Dein Like ist gelandet, +25 Mana‹ — "your like shipped").
- Endpoints:
GET /api/v1/feedback/me/notifications?unread_only=true&limit=N
POST /api/v1/feedback/me/notifications/:id/read
POST /api/v1/feedback/me/notifications/read-all
GET  /api/v1/feedback/me/reacted (for the My-Wishes page)
Package (@mana/feedback):
- FeedbackNotification + NotificationKind types exported
- service.getNotifications/markNotificationRead/markAllNotificationsRead
- service.getMyReactedItems
Web:
- lib/notifications/feedback-toaster.svelte.ts: boot pull + 60s poll,
  renders unread notifications via the toast store, marks them read
  immediately. Started/stopped in (app)/+layout.svelte's authReady hook.
- /profile/my-wishes: tab view over getMyFeedback + getMyReactedItems
  + getNotifications. Tabs show counter badges, plus an unread badge in
  the Inbox section. An ›Alle als gelesen markieren‹ (mark all as read)
  action is available.
Clean pre-launch solution — no polling spam (60s), mark-read right
after toast display, fail-soft in several places.
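The boot-pull + poll + mark-read loop can be sketched like this. Method names follow the package API listed above; `toast()` and the option shape are stand-ins:

```typescript
// Minimal sketch of the toaster loop: pull once at boot, then every
// intervalMs; mark each notification read right after displaying it.
interface FeedbackNotification { id: string; message: string; readAt: string | null }

interface NotificationClient {
  getNotifications(opts: { unreadOnly: boolean }): Promise<FeedbackNotification[]>;
  markNotificationRead(id: string): Promise<void>;
}

function startFeedbackToaster(
  service: NotificationClient,
  toast: (msg: string) => void,
  intervalMs = 60_000,
): () => void {
  const pull = async () => {
    try {
      const unread = await service.getNotifications({ unreadOnly: true });
      for (const n of unread) {
        toast(n.message);
        await service.markNotificationRead(n.id); // read right after display
      }
    } catch {
      // fail-soft: a failed poll never breaks the app shell
    }
  };
  void pull(); // boot pull
  const timer = setInterval(pull, intervalMs);
  return () => clearInterval(timer); // stop on auth teardown
}
```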
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Build was succeeding-by-luck because the wrong filter resolved to
nothing → pnpm installed all workspace deps. After Phase 3.A added
the new grant route, the install pruning must have changed enough
that the build started failing with /app/node_modules: not found.
Fix the filter to match the real package name.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 3.A of the feedback-rewards-and-identity plan. Direct reciprocity
loop: users immediately get something back for contributing, original-
wish Eulen are rewarded on ship, and reactors get a share.
mana-credits:
- New endpoint POST /api/v1/internal/credits/grant + grantCredits()
  service method with idempotency via metadata.referenceId.
- transaction_type enum extended with 'grant' (its own type instead of
  mismatching 'refund').
- Migration 0001_grant_transaction_type.sql + partial index on
  metadata->>'referenceId' for O(log n) idempotency lookups.
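The idempotency contract can be sketched as check-then-insert keyed on the referenceId. This is a simplified illustration: the table and column names are assumptions, and the `sql` tagged-template client stands in for whatever driver the service uses.

```typescript
// Sketch of referenceId idempotency: a repeated grant with the same
// referenceId is a no-op. The partial index on metadata->>'referenceId'
// makes the lookup O(log n).
interface GrantRequest { userId: string; amount: number; referenceId: string }

async function grantCredits(sql: any, req: GrantRequest): Promise<{ granted: boolean }> {
  const existing = await sql`
    SELECT id FROM credit_transactions
    WHERE transaction_type = 'grant'
      AND metadata->>'referenceId' = ${req.referenceId}
    LIMIT 1`;
  if (existing.length > 0) return { granted: false }; // already granted
  await sql`
    INSERT INTO credit_transactions (user_id, amount, transaction_type, metadata)
    VALUES (${req.userId}, ${req.amount}, 'grant',
            ${JSON.stringify({ referenceId: req.referenceId })})`;
  return { granted: true };
}
```

Note this shape is only race-free under a unique (or serialized) constraint on the referenceId; the check alone documents the contract, not the concurrency story.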
mana-analytics:
- FeedbackService immediately grants +5 credits on createFeedback (top-
  level only, replies get nothing) when the 20-character minimum is met
  and the rate limit (10/user/24h via feedback_grant_log) is not
  exceeded.
- adminUpdate triggers on a FRESH transition to 'completed':
  +500 credits to the original wisher + +25 to everyone who reacted
  with 👍 or 🚀. Double-pay is structurally impossible via referenceId
  (`<id>_shipped`, `<id>_reaction_<userId>`).
- Founder whitelist via FEEDBACK_FOUNDER_USER_IDS env (prevents
  self-reward).
- Dropped the voteCount column (replaced by reactions/score since 0002).
- Migration 0003_grant_log_drop_vote_count.sql idempotent, applied
  locally + in prod.
Plan: docs/plans/feedback-rewards-and-identity.md (Phase 3.A-3.F).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The F4 server-side singleton bootstrap was fire-and-forget at signup
time — a transient mana_sync outage during registration would leave the
user with no singleton and only the in-store `getOrCreateLocalDoc()`
fallback to race on the first write. The signup-hook is still the
happy-path zero-latency bootstrap; this commit adds a deliberate
reconciliation path that converges on every boot.
- Idempotent `bootstrapUserSingletons` / `bootstrapSpaceSingletons`:
both functions now existence-check sync_changes before INSERT and
return boolean (true=inserted, false=skipped).
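The existence-check-then-insert contract can be sketched as below. This omits the real payload construction (`buildFieldMeta`, the empty-default document shape) and assumes illustrative column names on `sync_changes`:

```typescript
// Sketch of the idempotent bootstrap: returns true when the singleton
// row was inserted, false when it already existed and was skipped.
async function bootstrapUserSingletons(
  userId: string,
  syncSql: any, // tagged-template SQL client, stand-in type
): Promise<boolean> {
  const existing = await syncSql`
    SELECT 1 FROM sync_changes
    WHERE user_id = ${userId} AND record_type = 'profile/userContext'
    LIMIT 1`;
  if (existing.length > 0) return false; // already bootstrapped — skip
  await syncSql`
    INSERT INTO sync_changes (user_id, record_type, origin, client_id, data)
    VALUES (${userId}, 'profile/userContext', 'system', 'system:bootstrap', '{}')`;
  return true; // inserted
}
```

Idempotence is what makes the endpoint safe to call on every boot: the second and all later calls are cheap reads.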
- New endpoint `POST /api/v1/me/bootstrap-singletons` — JWT-gated under
the existing `/api/v1/me/*` prefix. Provisions the caller's
userContext and the kontextDoc for every Space they're a member of.
Returns `{ ok, bootstrapped: { userContext, spaces: { id: bool } } }`.
- Webapp `(app)/+layout.svelte` calls the endpoint once per
authenticated boot, after `restoreClientIdFromDexie()` and before
`createUnifiedSync.startAll()`. Best-effort; failures swallow into a
console warning and the in-store fallback still covers the rare
race window.
Plan: docs/plans/sync-field-meta-overhaul.md (F4-robust row).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Symmetrically extends the F4 server-side singleton bootstrap to the
per-Space `kontextDoc`. Every Space-creation — Personal at signup and
brand/club/family/team/practice via the org plugin — now writes an empty
kontextDoc row straight into mana_sync.sync_changes with origin='system',
client_id='system:bootstrap'. Fresh clients pull the row instead of
racing on a local insert that the next pull would clobber.
- New `bootstrapSpaceSingletons(spaceId, ownerUserId, syncSql)` in
services/mana-auth/src/services/bootstrap-singletons.ts; shared
`buildFieldMeta` helper extracted.
- `createBetterAuth(databaseUrl, syncDatabaseUrl, webauthn)` now takes
the sync-DB URL and lazy-creates a module-scoped postgres pool for
the bootstrap inserts.
- Hook into `databaseHooks.user.create.after` (only on `created: true`
from createPersonalSpaceFor) and `organizationHooks.afterCreateOrganization`.
- Webapp `kontextStore.ensureDoc()` made private as `getOrCreateLocalDoc()` —
same fallback role as userContextStore's after F5. Public API is now just
setContent + appendContent.
Plan: docs/plans/sync-field-meta-overhaul.md (F4-fu row in Shipping Log).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
shared-hono depends on @mana/shared-logger; without it, the bun runtime
crashes on first import with ENOENT for the workspace symlink target.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Switches the build context to repo-root so the pnpm-workspace install
can pull in @mana/shared-hono. Mirrors the mana-auth/mana-ai pattern
(node+pnpm installer stage → bun runtime stage).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the userContext race-on-first-mount that surfaced as a
"10 fields overwritten" conflict toast pre-F2. Adds a fire-and-forget
hook in the /register flow that writes the per-user `userContext`
singleton straight into `mana_sync.sync_changes` with
`client_id='system:bootstrap'` and `origin='system'`.
Behavior:
- On successful `signUpEmail`, `bootstrapUserSingletons(userId, syncSql)`
inserts a `profile/userContext` row with the empty-default shape that
mirrors the webapp's `emptyUserContext()` factory in
`apps/mana/apps/web/src/lib/modules/profile/types.ts`.
- The receiving client treats the change as origin='server-replay'
on apply (per F2 conflict-gate), so no toasts on first pull.
- Failure is logged but does not abort registration — the webapp's
existing `ensureDoc()` fallback still works during the F4→F5
transition.
Module-scoped postgres pool (max=2 connections) lazy-initialized on
first signUp; reused for the lifetime of the process. Same pattern as
`UserDataService.getSyncSql`.
Out of scope for F4:
- `kontextDoc` is per-Space (not per-user) — bootstrap there will be
hooked into the Space-creation flow, not /register. The webapp's
`ensureDoc()` for kontextDoc stays as-is for now.
- Webapp `ensureDoc()` removal is F5.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Removes `updatedAt` from the wire protocol and from every Local-prefixed
record type. Replaced by two orthogonal mechanisms — deriveUpdatedAt()
for read-side public-facing values, _updatedAtIndex shadow for indexed
sorts.
Local-side:
- New `_updatedAtIndex` shadow column. Stamped by the Dexie creating /
updating hook on every write. Stripped from the pending-change payload
so it never travels to mana-sync. Indexed in Dexie v53 on the 22 tables
that previously indexed `updatedAt`.
- `deriveUpdatedAt(record)` in sync.ts returns max(__fieldMeta[*].at) so
the public-facing Task / Note / etc. shape keeps an `updatedAt: string`
property without holding it as data.
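The derivation is small enough to sketch in full — field names below follow the commit's description, the exact meta shape is an assumption:

```typescript
// Sketch of deriveUpdatedAt: updatedAt is no longer stored data, it's
// the max per-field timestamp from the record's __fieldMeta map.
interface FieldMeta { at: string } // ISO timestamp of the field's last write

interface LocalRecord {
  __fieldMeta?: Record<string, FieldMeta>;
}

function deriveUpdatedAt(record: LocalRecord): string {
  const metas = Object.values(record.__fieldMeta ?? {});
  if (metas.length === 0) return new Date(0).toISOString(); // no meta yet
  // ISO-8601 strings in the same format compare correctly as strings.
  return metas.map((m) => m.at).reduce((a, b) => (a > b ? a : b));
}
```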
- Type-converters across ~60 module/queries.ts and types.ts files now
call `deriveUpdatedAt(local)` instead of reading `local.updatedAt`.
Module-store sweep:
- Regex codemod removed `updatedAt: new Date().toISOString()` /
`: now` / `: now()` / `: nowIso()` stamping from 121 store files
(~382 call sites total). Single-property update calls
(`{ updatedAt: now }`) collapsed to `{}`; touch-only patterns
(writing/drafts, writing/generations) kept the call as a no-op
because the hook now stamps `_updatedAtIndex` automatically on
any Dexie modification.
- Local* interfaces stripped of `updatedAt: string` (43 types.ts files).
Public-facing types (Task, Note, Mission, Agent, …) keep
`updatedAt: string` as a computed read-side property.
- Companion's chat conversation now sorts on a real
`lastMessageAt` data field instead of touching `updatedAt`.
- Session-only stores (times/session-alarms, session-countdown-timers)
stamp `updatedAt: now` directly because they're not in Dexie and
have no field-meta layer to derive from.
Sync engine:
- applyServerChanges sets `_updatedAtIndex` itself when applying
server changes (max of server-field times for updates, recordTime
for inserts) so server-replays land orderable.
- Dropped the legacy `localUpdatedAt` fallback — every record now has
`__fieldMeta`, the per-field at is the canonical source.
- Soft-delete tombstone path stops stamping `updatedAt: serverTime`,
uses `_updatedAtIndex` instead.
Server-side:
- mana-ai iteration-writer no longer emits `updatedAt` in
sync_changes.data; receivers derive it from the field-meta map.
- mana-sync types: no change (the wire format already uses
`field_meta` / `at` from F1).
Out of scope: backend Drizzle schemas (mana-credits, mana-events, …)
keep their `updated_at` columns. Those are pure server-internal — not
part of the sync_changes / __fieldMeta mechanism F3 cleans up.
Tests + checks:
- 0 svelte-check errors over 7652 files.
- 29/29 sync.test.ts (vitest).
- 61 mana-ai bun tests.
- mana-sync go test ./... cached green.
Plan: docs/plans/sync-field-meta-overhaul.md F3.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drizzle's pgEnum() without a pgSchema wrap always lands in public — the
schemaFilter merely hides that in the diff (see repo memory:
"Drizzle enums with schemaFilter must use pgSchema().enum()"). The
table feedback.user_feedback references the enums across schemas from
public, which works; but the ALTER TYPE statements in the original
migration targeted feedback.feedback_status / feedback.feedback_category
and would therefore have found nothing.
Verified locally (mana_platform.public.feedback_status,
mana_platform.public.feedback_category):
- 6 status values renamed → submitted/under_review/planned/in_progress/completed/declined
- default status set to 'submitted'
- category 'onboarding-wish' added
- re-run is idempotent (DO blocks + ADD VALUE IF NOT EXISTS)
Medium-term, feedbackSchema.enum(...) should be used so the enums
actually land in the feedback namespace — separate refactor.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Makes @mana/feedback the SSOT for all user-feedback categories and
statuses — a prerequisite for onboarding wishes, NPS, churn feedback,
etc. landing there in the future.
- Status enum: DB values renamed new/reviewed/done/rejected →
  submitted/under_review/completed/declined (the package wins). PG≥10
  ALTER TYPE … RENAME VALUE is non-destructive.
- Category 'praise' added to the package (was DB-only).
- Category 'onboarding-wish' new in package + DB for the wish step.
- Default status in DB: 'new' → 'submitted'.
- CreateFeedbackInput.isPublic made optional → the service passes it
  through, default stays true; private categories like onboarding-wish
  set false.
- Schema file annotated with an SSOT comment to prevent future drift.
Hand-authored migration under services/mana-analytics/drizzle/0001_*.sql
because drizzle-kit push doesn't rename enum values reliably. Apply
manually before the next db:push:
psql "$DATABASE_URL" -f services/mana-analytics/drizzle/0001_align-feedback-enums.sql
Plan in docs/plans/feedback-hub.md (Phases 0-4); Phases 0 + 1 now, 2-4
deferred.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
All 5 milestones landed today in one continuous session: registry,
health cache, fallback router, observability, and consumer migration.
115 service-side tests, validator covers 2538 files.
Final milestone of docs/plans/llm-fallback-aliases.md. Every backend
caller now requests models via the `mana/<class>` alias system instead
of hardcoded `ollama/...` strings. mana-llm resolves aliases through
`services/mana-llm/aliases.yaml` with health-aware fallback (M3) and
emits resolved-model + fallback metrics (M4).
SSOT moved to `packages/shared-ai/src/llm-aliases.ts` so apps/api,
apps/mana/apps/web, and services/mana-ai all import the same
`MANA_LLM` constant via the existing `@mana/shared-ai` workspace
dependency. Three additional sites (memoro-server, mana-events,
mana-research) inline the alias string with an SSOT comment because
they don't pull @mana/shared-ai today.
Migrated 14 sites across 10 files:
- apps/api: writing(LONG_FORM), comic(STRUCTURED), context(FAST_TEXT),
food(VISION), plants(VISION), research orchestrator (3 tiers
collapsed to STRUCTURED+FAST_TEXT/LONG_FORM)
- apps/mana/apps/web: voice/parse-task + parse-habit (STRUCTURED)
- services/mana-ai: planner llm-client + tick.ts (REASONING)
- services/mana-events: website-extractor (STRUCTURED, inlined)
- services/mana-research: mana-llm client (FAST_TEXT, inlined)
- apps/memoro/apps/server: ai.ts (FAST_TEXT, inlined)
Legacy env-vars removed: WRITING_MODEL, COMIC_STORYBOARD_MODEL,
VISION_MODEL, MANA_LLM_DEFAULT_MODEL. The chain in aliases.yaml is
now the single tuning surface; SIGHUP reloads it without redeploys.
New `scripts/validate-llm-strings.mjs` regex-scans 2538 files for
hardcoded `<provider>/<model>` strings and fails the build if any
land outside the SSOT or the explicitly-allowed paths (image-gen
modules, model-inspector code, this validator itself, the registry).
Wired into `validate:all` next to the i18n + theme validators.
Verified: `pnpm validate:llm-strings` clean, `pnpm --filter @mana/api
type-check` clean, `pnpm --filter @mana/ai-service type-check`
clean. Web type-check has 2 pre-existing errors in
SettingsSidebar.svelte (i18n MessageFormatter type drift, last
touched in 988c17a67 — unrelated to this work).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- `X-Mana-LLM-Resolved: <provider>/<model>` header on non-streaming
responses. Streaming clients read the same info from each chunk's
`model` field (SSE headers go out before the chain is walked).
- Three new Prometheus metrics: `mana_llm_alias_resolved_total{alias,
target}` (which concrete model an alias resolved to per request),
`mana_llm_fallback_total{from_model, to_model, reason}` (each
fallback transition), `mana_llm_provider_healthy{provider}` (gauge,
mirrors the circuit-breaker).
- New debug endpoints: `GET /v1/aliases` (registry inspection — chain
+ description per alias, useful for confirming SIGHUP reloads),
`GET /v1/health` (full per-provider liveness snapshot — failure
counter, last error, unhealthy-until backoff).
- `kill -HUP <pid>` reloads `aliases.yaml`. Parse errors leave the
previous good state in memory and log the rejection.
- `ProviderHealthCache.add_listener()` for cache→metrics decoupling:
the gauge is updated via a transition-only listener wired in main.py
rather than the cache importing prometheus_client itself.
- Request-side metrics now use the requested model string, success-side
uses the resolved one. So `mana_llm_llm_requests_total{provider="ollama",
model="gemma3:12b"}` reflects actual upstream load even when callers
used `mana/long-form` aliases.
16 new observability tests (test_m4_observability.py): listener
fire-on-transition semantics, exception-isolation, multi-listener,
counter increments, gauge writes, end-to-end alias→metric flow,
v1/aliases + v1/health endpoint shape, response.model carries the
resolved target after fallback. Total suite: 115/115 in 1.6s.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the old Ollama→Google special-case auto-fallback with the
unified pipeline: caller passes either a direct provider/model or an
alias from the `mana/` namespace; the router resolves to a chain and
walks it skipping unhealthy providers (per ProviderHealthCache from M2),
trying each entry, marking provider unhealthy on retryable errors and
falling through to the next.
Retryable: ConnectError, ReadTimeout, RemoteProtocolError, 5xx,
ProviderRateLimitError. Propagated (don't fall back, don't poison the
cache): ProviderCapabilityError, ProviderAuthError, ProviderBlockedError,
4xx, unknown exception types. The cache stays "what the network told us
about this provider's liveness" — caller errors don't muddy that signal.
Streaming: pre-first-byte fallback only. Once a chunk has been yielded
the provider is committed; mid-stream errors propagate as-is so we
don't splice two voices into one output.
`NoHealthyProviderError` (HTTP 503) carries a structured attempt log —
each chain entry shows up as `(model, reason)` so the cause of a 503
is visible in the response and metrics, not only in service logs.
main.py wires the lifespan: aliases.yaml is loaded, ProviderHealthCache
created, ProviderRouter takes both as constructor deps, HealthProbe
spawned with cheap HTTP probes per configured provider (Ollama
/api/tags, OpenAI-compat /v1/models with Bearer header). Google is
skipped — google-genai SDK has no obvious cheap probe; the call-site
fallback handles real errors.
22 new router tests (test_router_fallback.py): chain walking, capability
& auth propagation, 5xx vs 4xx differentiation, rate-limit retry,
all-fail → NoHealthyProviderError, direct provider strings bypass
aliases, streaming pre-first-byte fallback, mid-stream-failure does
NOT fall back, empty stream commits without retry, cache feedback on
success/failure/non-retryable. Existing test_providers.py updated for
the new constructor signature; all 99 service tests green via the dev
container (Python 3.12).
Legacy purged: `_ollama_concurrent`, `_ollama_health_cache`,
`_can_fallback_to_google`, `_should_use_ollama`, `_fallback_to_google`,
`_get_ollama_health_cached` all gone. The `auto_fallback_enabled` /
`ollama_max_concurrent` settings remain in config.py for now (M5 will
remove them along with the per-feature env-var overrides).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per-provider liveness with circuit-breaker semantics. The router (M3)
will read `is_healthy()` to skip dead providers in a chain; the probe
loop and the call-site fallback handler write state via
`mark_healthy` / `mark_unhealthy`.
State machine: 1st failure stays healthy (transient blips happen);
2nd consecutive failure trips the breaker and sets a 60s backoff
window during which `is_healthy → False`. After the window the
provider is half-open again — next call exercises it, success
resets, failure re-arms.
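The breaker semantics are compact enough to sketch (the service itself is Python; this TypeScript rendering and the class/method names are illustrative, with the clock injected for deterministic testing, like the FakeClock mentioned below):

```typescript
// Sketch of the circuit-breaker state machine: 1st failure tolerated,
// 2nd consecutive failure opens the breaker for one backoff window,
// after which the provider is half-open.
class ProviderHealth {
  private failures = 0;
  private unhealthyUntil = 0;

  constructor(
    private now: () => number = Date.now,
    private backoffMs = 60_000,
  ) {}

  markHealthy(): void {
    this.failures = 0;       // success resets the breaker
    this.unhealthyUntil = 0;
  }

  markUnhealthy(): void {
    this.failures += 1;
    if (this.failures >= 2) {
      // 2nd consecutive failure trips the breaker for one window;
      // a failure during half-open re-arms it immediately.
      this.unhealthyUntil = this.now() + this.backoffMs;
    }
  }

  isHealthy(): boolean {
    // Once the window expires, report healthy so the next call
    // exercises the provider (half-open).
    return this.now() >= this.unhealthyUntil;
  }
}
```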
HealthProbe is the background asyncio.Task that pings every
registered provider every 30s with a 3s timeout. Probes run
concurrently per tick and one bad probe can't sink the loop. Probe
functions are injected (`{name: async-fn}`) so this module stays
decoupled from the provider classes — the wiring lives in main.py
where we already know which providers are configured.
32 new tests (FakeClock for deterministic backoff timing, slow-probe
helpers for parallelism + timeout, lifecycle tests for start/stop
idempotency and tick-after-error survival). 64/64 alias+health tests
green.
Not yet wired into the request path — that's M3.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
First milestone of the LLM-fallback plan (docs/plans/llm-fallback-aliases.md).
Introduces the `mana/<class>` namespace; the registry parses + validates
aliases.yaml at startup and reloads on demand. Schema-rejects empty
chains, missing provider prefixes, alias names outside the reserved
namespace, default→unknown references, etc.
Reload semantics: parse error keeps the previous good state in memory
so a typo + SIGHUP doesn't take the service down.
5 aliases ship with the initial config: fast-text, long-form, structured,
reasoning, vision. Each chain ends with a cloud provider so the system
keeps working when the GPU server is offline.
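The config surface might look roughly like this. A sketch only: the field names and the concrete chain entries are assumptions (the model strings are borrowed from elsewhere in this log); only the five alias names, the `mana/` namespace, the default key, and the chain-ends-in-a-cloud-provider rule come from this commit.

```yaml
# aliases.yaml — illustrative shape, not the shipped file
default: mana/fast-text
aliases:
  mana/fast-text:
    description: cheap low-latency text tasks
    chain:
      - ollama/gemma3:12b            # local GPU first
      - google/gemini-2.5-flash-lite # cloud tail keeps the alias alive offline
  mana/long-form:
    chain:
      - ollama/gemma3:12b
      - google/gemini-2.5-flash
```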
32 unit tests covering happy path, schema validation, namespace check,
reload safety, and a guard that the shipped aliases.yaml itself parses.
M2 (health-cache + probe-loop) and M3 (router fallback execution) build
on this; aliases are not yet wired into the request path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
iPhone HEIC photos uploaded through Chrome on macOS landed as
`mimeType: application/octet-stream` because Chrome doesn't recognise
the HEIC MIME and `file.type` was empty. The transform endpoint then
refused with `Transform only supported for images` (HTTP 400) and
the wardrobe Try-On flow surfaced this as `mana-media transform
failed for <id>: HTTP 400`. Even fixing the MIME wouldn't have been
enough — sharp's prebuilt binary ships the heif container format
without a HEVC decoder plugin (libde265 is omitted for patent
reasons), so the actual decode would still throw.
Three-part fix at the upload edge:
1. New `services/sniff.ts` — magic-byte sniffer for image MIMEs.
Reads the first ~16 bytes and recognises JPEG, PNG, GIF, WebP,
BMP, TIFF, HEIC, HEIF, AVIF. Returns `null` for everything else
so the caller can fall back to whatever the browser claimed.
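A condensed sketch of such a sniffer, covering a subset of the formats listed above (the shipped `services/sniff.ts` also handles BMP, TIFF, HEIF, and AVIF brands beyond what's shown):

```typescript
// Magic-byte sniffer sketch: trusts the first bytes of the buffer over
// whatever MIME the browser claimed; returns null for unknown formats.
function sniffImageMime(buf: Uint8Array): string | null {
  const matches = (sig: number[], offset = 0) =>
    sig.every((b, i) => buf[offset + i] === b);

  if (matches([0xff, 0xd8, 0xff])) return "image/jpeg";
  if (matches([0x89, 0x50, 0x4e, 0x47])) return "image/png"; // \x89PNG
  if (matches([0x47, 0x49, 0x46, 0x38])) return "image/gif"; // GIF8
  // RIFF....WEBP
  if (matches([0x52, 0x49, 0x46, 0x46]) && matches([0x57, 0x45, 0x42, 0x50], 8))
    return "image/webp";
  // ISO-BMFF container: size(4) + "ftyp"(4) + brand(4)
  if (matches([0x66, 0x74, 0x79, 0x70], 4)) {
    const brand = String.fromCharCode(...buf.slice(8, 12));
    if (brand.startsWith("hei")) return "image/heic"; // heic/heix…
    if (brand === "avif") return "image/avif";
  }
  return null; // unknown — caller falls back to the browser-claimed type
}
```

This is exactly the property the fix relies on: Chrome's empty `file.type` for a HEIC upload doesn't matter, because the `ftypheic` brand in the first 12 bytes identifies it anyway.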
2. Upload route — sniffs every upload before passing the buffer to
`uploadService.upload`. Trusts magic bytes over `file.type` so
Chrome's empty-type HEIC still lands with `image/heic`. Removes
the entire class of `application/octet-stream` rows for files
that are obviously images.
3. HEIC/HEIF transcoded to JPEG at upload via the new
`heic-convert` dependency (pure-JS WASM, no system libs needed).
The original buffer is replaced with the JPEG bytes, the MIME
becomes `image/jpeg`, and the filename's `.HEIC` extension is
rewritten to `.jpg`. Downstream code (process pipeline, transform
endpoint, sharp) then deals exclusively with formats sharp can
actually decode. Failure path returns HTTP 500 with a clear
`HEIC conversion failed` error so the client knows it wasn't a
generic crash.
Bonus, transform endpoint hardening: `mimeType.startsWith('image/')`
gate now also accepts a row whose stored MIME is wrong (legacy
`application/octet-stream` from before this fix) when the actual
bytes sniff as an image. Lets old broken rows still serve where
the format itself is decodable; the upload-side fix prevents new
ones from existing.
Sharp 0.33 on this machine reports `heif: 1.18.2` for the container
but rejects the actual HEVC compressed bitstream — confirmed by the
exact error string `No decoding plugin installed for this
compression format (11.6003)`. Going through `heic-convert` first
sidesteps that entirely.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two interlocking fixes driven by a production lockout incident.
## Bug that motivated this
A fresh schema-drift column (auth.users.onboarding_completed_at) made
every Better Auth query crash with Postgres 42703. The /login wrapper
swallowed the non-2xx and mapped it onto a generic "401 Invalid
credentials" AND bumped the password lockout counter — so 5 legit
login attempts against a broken DB would have locked every real user
out of their own account. Same wrapper pattern on /register, /refresh,
/reset-password etc. The 30-minute hunt ended in a one-off repro
script that finally surfaced the real Postgres error.
The user-facing passkey button additionally returned generic 404s on
every login-page mount because the route wasn't registered (the DB
schema existed, the Better Auth plugin wasn't wired).
## Phase 1 — Error classification (services/mana-auth/src/lib/auth-errors)
- 19-code AuthErrorCode taxonomy (INVALID_CREDENTIALS, EMAIL_NOT_VERIFIED,
ACCOUNT_LOCKED, SERVICE_UNAVAILABLE, PASSKEY_VERIFICATION_FAILED, …)
- classifyFromResponse/classifyFromError handle: Better Auth APIError
(duck-typed on `name === 'APIError'`), Postgres errors (23505 unique,
42703/08xxx → infra), ZodError, fetch/ECONNREFUSED network errors,
bare Error, unknown.
- respondWithError routes the structured response, logs at the right
level, fires the correct security event, and CRITICALLY only bumps
the lockout counter for actual credential failures — SERVICE_UNAVAILABLE
and INTERNAL never touch lockout.
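The lockout gate reduces to a tiny predicate — sketched here with a trimmed version of the taxonomy (the real module has 19 codes and richer routing):

```typescript
// Sketch of the "only credential failures bump lockout" invariant.
type AuthErrorCode =
  | "INVALID_CREDENTIALS"
  | "EMAIL_NOT_VERIFIED"
  | "ACCOUNT_LOCKED"
  | "SERVICE_UNAVAILABLE"
  | "INTERNAL";

const LOCKOUT_BUMPING: ReadonlySet<AuthErrorCode> = new Set([
  "INVALID_CREDENTIALS",
]);

function shouldBumpLockout(code: AuthErrorCode): boolean {
  // A broken DB (Postgres 42703 → SERVICE_UNAVAILABLE) must never count
  // as a failed credential attempt — that was the production lockout bug.
  return LOCKOUT_BUMPING.has(code);
}
```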
- All 12 endpoints in routes/auth.ts refactored (/login, /register,
/logout, /session-to-token, /refresh, /validate, /forgot-password,
/reset-password, /resend-verification, /profile GET+POST,
/change-email, /change-password, /account DELETE).
- Fixed pre-existing auth.api.forgetPassword typo (→ requestPasswordReset).
- shared-logger + requestLogger middleware wired in index.ts; all
console.* calls in the service removed.
## Phase 2 — Passkey end-to-end (@better-auth/passkey 1.6+)
- sql/007_passkey_bootstrap.sql: idempotent schema alignment —
friendly_name→name, +aaguid, transports jsonb→text, +method column
on login_attempts.
- better-auth.config.ts: passkey plugin wired with rpID/rpName/origin
from new webauthn config section. rpID defaults to mana.how in prod
(from COOKIE_DOMAIN), localhost in dev.
- routes/passkeys.ts: 7 wrapper endpoints (capability probe,
register/options+verify, authenticate/options+verify with JWT mint,
list, delete, rename). Each routes errors through the classifier;
authenticate/verify promotes generic INVALID_CREDENTIALS to
PASSKEY_VERIFICATION_FAILED.
- PasskeyRateLimitService: in-memory per-IP (options: 20/min) and
per-credential (verify: 10 failures/min → 5 min cooldown) buckets.
Deliberately separate from the password lockout — different factor,
different blast radius.
- Client: authService.getPasskeyCapability() async probe, memoised per
session. authStore.passkeyAvailable reactive state. LoginPage gates
on === true so a slow probe doesn't flash the button in.
- AuthResult grew a code: AuthErrorCode field; handleAuthError in
shared-auth prefers the server envelope over the legacy message
heuristics.
## Tests
- 30 unit tests for the classifier covering every branch (including
the exact Postgres 42703 shape that started this).
- 9 unit tests for the rate limiter.
- 14 integration tests for the auth routes — the regression test
explicitly asserts "upstream 500 → 503 + zero lockout bumps".
- 101 tests pass, 0 fail, 30 pre-existing skips unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- PATCH /api/v1/me/profile in mana-auth (name, image with 1–80 char
validation) — powers the Screen-1 save
- (app)/+layout.svelte:
* isOnboarding derived from pathname
* handleAuthReady loads onboardingStatus, redirects brand-new users
to /onboarding/name (fire-and-forget so sync/data-layer init keeps
running in parallel)
* chrome (PillNav, wallpaper, bottom-stack) hidden in onboarding mode;
AuthGate still wraps so the flow enforces authentication
- /onboarding/+layout.svelte: full-viewport shell with progress dots
(1/3, 2/3, 3/3) and a skip-all that marks the flow complete and
sends the user home
- /onboarding/+page.svelte: redirects bare entry to /onboarding/name
- /onboarding/name/+page.svelte: text input (1–40 chars), Enter = Weiter,
skip falls back to email local-part so Screen 2's greeting is never
empty
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- auth.users: new nullable `onboarding_completed_at` column
- new /api/v1/me/onboarding routes: GET, POST /complete, PATCH /reset
- onboardingStatus Svelte store in the web app that reads/writes via
those endpoints (no JWT claim so completing the flow takes effect
without a token re-mint)
- docs/plans/onboarding-flow.md adjusted: no backfill (launch without
existing users), better-auth `name` clarified, 7 templates including
"Arbeit" confirmed
Foundation for the 3-screen first-login flow (Name → Look → Templates).
No UI and no route guard yet — those ship in M2 when the redirect target
actually exists. Schema change is a pure column-add, applied via
`pnpm --filter @mana/auth db:push`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
compactHistory() now defaults to DEFAULT_COMPACT_MODEL =
'google/gemini-2.5-flash-lite' when the caller doesn't override. Lite
is ~3–5x cheaper than gemini-2.5-flash with near-identical
summarisation quality — summarisation doesn't need the same tier as
reasoning + tool-calling, and the compactor fires exactly when token
spend is highest, so the cheaper route saves exactly where it matters.
CompactHistoryOptions.model is now optional. All three consumers
(mana-ai tick, webapp Companion, webapp Mission runner) drop their
explicit gemini-2.5-flash override and let the default apply.
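The contract is just an optional field with a fallback — sketched below with an illustrative options shape (only the constant's value comes from this commit):

```typescript
// Sketch of the optional-model default in compactHistory's options.
const DEFAULT_COMPACT_MODEL = "google/gemini-2.5-flash-lite";

interface CompactHistoryOptions {
  model?: string; // optional — default applies when omitted
  maxTokens?: number;
}

function resolveCompactModel(opts: CompactHistoryOptions = {}): string {
  return opts.model ?? DEFAULT_COMPACT_MODEL;
}
```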
This is the pragmatic M2.5: no mana-llm changes. The "tier" abstraction
(X-Model-Tier header, env-routed aliases) from the Claude-Code report
makes sense only once multiple utility tasks need cheaper routing —
topic-detection, classification, command-injection checks. Today only
the compactor wants it, and a model constant is the simplest contract
that works.
2 new tests (default applied + override honoured). 79 shared-ai tests
green, all three consumers type-check clean. One pre-existing unrelated
type error in apps/mana/apps/web/src/lib/modules/wardrobe/queries.ts
(not touched by this commit).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
pgEnum() defaults to the public schema. Because
drizzle.config.ts sets schemaFilter: ['auth'], push introspection
never saw the enums and kept re-emitting CREATE TYPE access_tier ...,
failing with 42710. This blocked setup-databases.sh from advancing
mana-auth past the enum declarations and silently masked other drift
(e.g. the new `kind` column on auth.users going un-pushed).
Source side: three enums now live on authSchema via
authSchema.enum(...) instead of pgEnum(...). DB side: migration 006
recreates access_tier / user_role / user_kind inside the auth schema,
repoints auth.users.access_tier and auth.users.role via ::text cast
(preserving all data and defaults), and drops the old public types.
After this, `drizzle-kit push --force` reports "No changes detected"
on a clean DB and the broader `pnpm setup:db` run is green without
workarounds.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Documents the SQL that was applied manually to match the personas.ts
Drizzle schema introduced in 493db0c3b. Idempotent. See
docs/plans/mana-mcp-and-personas.md for the design. Required because
the spaces tables created alongside personas sit outside the auth
schemaFilter, and pre-existing public enums would otherwise trip
drizzle-kit push (resolved separately in migration 006).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the loop on M2: when the compactor fires, the LLM needs to know
it's now seeing a <compact-summary> instead of raw turns so it
doesn't waste a turn asking about lost details or re-executing tools
whose responses are gone.
shared-ai:
- LoopState grows `compactionsDone: number` (cap-1 by current loop
policy, but shape kept as count for future multi-compact cycles).
- runPlannerLoop populates it on each reminder-channel call. New
loop test asserts [0, 1] sequence: round 1 before compaction,
round 2 after.
mana-ai:
- New producer `compactedReminder` — fires severity=info when
compactionsDone >= 1, wrapped in a German one-liner ("frag nicht
nach verlorenen Details" — "don't ask about lost details").

- Injected FIRST in buildReminderChannel so the LLM frames the rest
of the round with "I'm looking at a summary" context. Metric
surface stays `{producer='compacted', severity='info'}`.
4 new reminder tests (3 pure producer + 1 composition-ordering) +
1 loop-wiring test. 77 shared-ai, 20 reminders.test.ts — green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two loose ends from M3/M4:
1. Tool_use_id-based error attribution in the persona-runner
-----------------------------------------------------------
The previous collectActionsFromMessage() flipped the *most recent*
ActionRow to 'error' when a tool_result carried is_error:true. That was
fine as long as Claude invoked tools strictly in sequence, but when
the planner pipelines multiple tools in one turn, a later tool_result
carries an earlier tool_use_id — the last-action fallback
misattributes the error.
runMainTurn() now keeps a tool_use_id → action-index Map for the
duration of the tick. On tool_use we stash block.id, on tool_result we
look up the exact ActionRow via tool_use_id and flip that one. The
"flip last" path survives as a pure fallback if a future SDK ever
ships a block without an id.
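The mechanism can be sketched with simplified block and ActionRow shapes (the real SDK types are richer, and `collectActions` here is illustrative, not the production function):

```typescript
// Id-based error attribution: tool_use stashes its block id, tool_result
// looks up the exact action; "flip last" survives only as a fallback.
type Block =
  | { type: "tool_use"; id: string; name: string }
  | { type: "tool_result"; tool_use_id?: string; is_error?: boolean };

interface ActionRow { toolName: string; status: "ok" | "error" }

function collectActions(blocks: Block[]): ActionRow[] {
  const actions: ActionRow[] = [];
  const indexById = new Map<string, number>(); // tool_use_id → action index

  for (const block of blocks) {
    if (block.type === "tool_use") {
      indexById.set(block.id, actions.length);
      actions.push({ toolName: block.name, status: "ok" });
    } else if (block.type === "tool_result" && block.is_error) {
      const idx = block.tool_use_id !== undefined
        ? indexById.get(block.tool_use_id)
        : undefined;
      const target = idx ?? actions.length - 1; // fallback: flip last
      if (target >= 0) actions[target].status = "error";
    }
  }
  return actions;
}
```

With pipelined tools, an error result carrying the FIRST tool's id now flips the first action, not the most recent one.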
2. New audit:encrypted-tools script
-----------------------------------
scripts/audit-encrypted-tools.ts — loads registerAllModules() and
apps/mana/…/crypto/registry.ts, diffs every ToolSpec.encryptedFields
against the authoritative web-app ENCRYPTION_REGISTRY.
Catches three classes of drift:
- missing-table : tool declares a table the web-app doesn't encrypt
- field-drift : both agree a table is encrypted but the field lists
differ (half-encryption in the wire is silent death)
- disabled : web-app has enabled:false while the tool still
encrypts — advisory warning, not a fail
Negative-tested by injecting a deliberate drift on todo.create +
todo.list (shortened ENCRYPTED_FIELDS to ['title']); the auditor
flagged both tools with full field diffs, restore returned to green.
Wired into `pnpm run validate:all` so the contract survives future
edits on either side. Fills the M4 audit gap noted in
project_mana_mcp_personas.md.
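The three drift classes reduce to a small set-diff; a hedged sketch with simplified registry shapes (names loosely follow the commit, not the real auditor):

```typescript
// Diff ToolSpec.encryptedFields against the authoritative registry and
// classify the drift: missing-table, field-drift, disabled.
interface RegistryEntry { enabled: boolean; fields: string[] }

type Drift =
  | { kind: "missing-table"; table: string }
  | { kind: "field-drift"; table: string; toolOnly: string[]; webOnly: string[] }
  | { kind: "disabled"; table: string };

function diffEncryptedFields(
  toolFields: Record<string, string[]>,    // table → fields per ToolSpec
  registry: Record<string, RegistryEntry>, // web-app ENCRYPTION_REGISTRY
): Drift[] {
  const drifts: Drift[] = [];
  for (const [table, fields] of Object.entries(toolFields)) {
    const entry = registry[table];
    if (!entry) { drifts.push({ kind: "missing-table", table }); continue; }
    if (!entry.enabled) { drifts.push({ kind: "disabled", table }); continue; }
    const toolOnly = fields.filter((f) => !entry.fields.includes(f));
    const webOnly = entry.fields.filter((f) => !fields.includes(f));
    if (toolOnly.length || webOnly.length)
      drifts.push({ kind: "field-drift", table, toolOnly, webOnly });
  }
  return drifts;
}
```

The negative test from the commit (shortening a tool's field list) would surface here as a field-drift with the dropped field on the web-only side.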
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Claude-Code wU2 pattern goes live. Every mission run now passes a
compactor into runPlannerLoop that will fire once if cumulative token
usage crosses 92% of MANA_AI_COMPACT_MAX_CTX (default 1_000_000, the
gemini-2.5-flash ceiling). Override via env for deployments on smaller
models; set to 0 to disable entirely.
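The firing condition itself is small; a sketch under the numbers stated above (`shouldCompact` is a hypothetical name, the 0.92 ratio and 1_000_000 default come from the commit):

```typescript
// Fire once when cumulative usage crosses 92% of the ceiling;
// maxCtx = 0 disables entirely.
const DEFAULT_MAX_CTX = 1_000_000; // gemini-2.5-flash ceiling

function shouldCompact(
  usedTokens: number,
  maxCtx: number = DEFAULT_MAX_CTX,
  compactionsDone = 0,
): boolean {
  if (maxCtx === 0) return false;         // kill-switch via env
  if (compactionsDone >= 1) return false; // fires at most once per run
  return usedTokens >= maxCtx * 0.92;
}
```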
The compactor reuses the planner's own LlmClient + gemini-2.5-flash
model for now. When mana-llm grows a Haiku tier we'll route the
compactor there — it's pure summarisation and a cheaper model saves
tokens exactly where they matter.
New metrics:
- mana_ai_compactions_triggered_total — counter, one per firing
- mana_ai_compacted_turns — histogram, how many middle turns got
folded each time (< 3 ⇒ maxCtx is probably misconfigured)
Logs print a 60-char tail of the summary.goal so the "what was this
mission doing again" question survives a compaction.
No new tests here — compactHistory and the loop wiring are already
covered by the 22 tests in shared-ai (M2.1 + M2.2). The 57 existing
mana-ai bun tests stay green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Superseded by the top-level docker-compose.dev.yml (which defines
searxng + redis as part of the unified dev stack via `pnpm docker:up`).
This per-service file was an artefact from before the unified setup
and no script / doc / README still references it.
Orphan `mana-searxng-dev` + `mana-search-redis-dev` containers had been
running from this file for ~2 weeks, squatting on the host's port 8080.
Every
first `pnpm dev:mana:all` after a cold machine start would fail with
Bind for 0.0.0.0:8080 failed: port is already allocated
because the top-level compose's `mana-searxng` service couldn't take
8080 while the orphan held it. The second invocation silently
"worked" — docker saw the freshly-created mana-searxng container and
skipped the bind step on the idempotent up, leaving it healthy but
only reachable inside the docker network (8080/tcp, no external
publish).
Cleanup already done out-of-band:
docker compose -f services/mana-search/docker-compose.dev.yml down
docker compose -f docker-compose.dev.yml up -d --force-recreate searxng
Deleting the file so a stale `docker compose -f …/mana-search/dev.yml up`
can't resurrect the orphan.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Producers now return structured {producer, severity, text} objects
instead of raw strings. buildReminderChannel collects them, increments
mana_ai_reminders_emitted_total{producer, severity} per emission, and
maps back to strings for the shared-ai loop input.
Why structured: the Prometheus label "severity" lets dashboards split
75-99% token-budget warnings (severity=warn) from 100%+ escalations
(severity=escalate) without NLP on the reminder text. Adding a new
producer that emits only info-level state (e.g. stale-sync warning)
falls out for free.
Active producer labels today:
- token-budget (warn, escalate)
- retry-loop (warn)
With this plus the scrape job (d087b4744), we can finally answer:
"does the budget warning actually change LLM behaviour?" — correlate
reminders_emitted_total{producer='token-budget'} with
tick_duration_seconds or planner_rounds_histogram.
3 tests updated to assert the new {producer, severity, text} shape
(16 reminder tests total, all green).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extends LoopState with a sliding window of the last N ExecutedCalls
(oldest-first), capped at LOOP_STATE_RECENT_CALLS_WINDOW = 5. The loop
maintains the window automatically; reminderChannel producers read it
without touching internal state.
This activates retryLoopReminder which was shape-only in faa472be9.
The guard now fires end-to-end: when round >= 3 and the tail-2 calls
both returned success:false, the LLM sees a "stop retrying, write a
summary instead" <reminder> on the next turn. The tail-2 check rather
than window-wide is deliberate — a flaky run with intermittent success
(F, F, F, OK, F) is not a retry loop, just flaky tools.
Why window=5: retry loops usually manifest within 2-3 consecutive
rounds; a 5-deep window gives room for burst-detection and
stale-tool heuristics without bloating the reminder channel. Cap
keeps the reminder producers O(5) regardless of loop length.
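Window maintenance plus the tail-2 check can be sketched as follows, assuming a minimal ExecutedCall shape (the constant name is from the commit, function names are illustrative):

```typescript
// Oldest-first sliding window capped at 5, and the deliberate tail-2
// heuristic: only the two most recent failures count as a retry loop.
const LOOP_STATE_RECENT_CALLS_WINDOW = 5;

interface ExecutedCall { tool: string; success: boolean }

function pushRecentCall(window: ExecutedCall[], call: ExecutedCall): ExecutedCall[] {
  const next = [...window, call];
  return next.length > LOOP_STATE_RECENT_CALLS_WINDOW ? next.slice(1) : next;
}

function isRetryLoop(round: number, window: ExecutedCall[]): boolean {
  if (round < 3 || window.length < 2) return false;
  // (F, F, F, OK, F) is flaky tooling, not a retry loop — tail-2 only.
  const [a, b] = window.slice(-2);
  return !a.success && !b.success;
}
```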
Tests: 3 new (sliding-window cap + slide + order in shared-ai, retry
composition + budget+retry chain + tail-only heuristic in mana-ai).
Total agent-loop tests now 74 across both packages.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mana-mcp:
- Policy-gate section: POLICY_MODE semantics, the four decision
rules, where to find soak metrics during log-only burn-in.
- /metrics section pointing at the Prometheus job.
mana-ai:
- New v0.8 status block: reminderChannel wiring, the two live
producers (tokenBudgetReminder active, retryLoopReminder dormant
pending LoopState extension), why POLICY_MODE here is limited to
freetext inspection, why parallel-reads have no effect until the
tool-registry absorbs the full AI_TOOL_CATALOG (M4 of personas).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the stub /metrics endpoint with a real prom-client registry
(mana_mcp_ prefix, {service="mana-mcp"} default label). Default
process metrics come along for free.
Policy-gate telemetry is the whole point — without it we can't soak
POLICY_MODE=log-only safely or decide when to flip to enforce. New
counter mana_mcp_policy_decisions_total{decision, reason, mode} buckets
every evaluatePolicy() call:
decision ∈ {allow, deny, flagged}
reason ∈ {admin-scope-not-invokable, destructive-not-allowed,
rate-limit-exceeded, injection-marker, clean, unknown}
mode ∈ {log-only, enforce}
So the rate of "would have been denied" during soak is visible directly
as policy_decisions_total{decision="deny", mode="log-only"}.
Also:
- mana_mcp_tool_invocations_total{tool, outcome} — success |
handler-error | input-invalid. Policy denies are NOT counted here
(they're in policy_decisions_total above); this counter only counts
calls that actually reached the handler or tripped zod validation.
- mana_mcp_tool_duration_seconds histogram per tool/outcome.
Dep: prom-client ^15.1.3 (same version mana-ai pins).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previous commit 38dc80654 carries this M3 title but its payload is an
unrelated apps/api/picture change — shared-.git-index race with a
parallel session (see feedback_git_workflow.md). This commit holds the
actual M3.b/c/d code. Leaving the misnamed commit for the user to
re-attribute / revert as they prefer.
Closes the M3 loop from docs/plans/mana-mcp-and-personas.md. The
runner picks up due personas, drives each through Claude + MCP for
one simulated turn, collects actions + ratings, persists through
service-key internal endpoints in mana-auth.
Internal endpoints (mana-auth, service-key-gated)
- GET /api/v1/internal/personas/due
Returns personas whose tickCadence + lastActiveAt say they're
due. Rules: hourly > 1h, daily > 24h, weekdays > 24h mon-fri.
NULLS FIRST so never-run personas go ahead of stale ones.
- POST /api/v1/internal/personas/:id/actions
Batch ≤ 500. Row ids are deterministic
`${tickId}-${i}-${toolName}` + ON CONFLICT DO NOTHING so the
runner can retry a tick without doubling audit rows. Also
bumps personas.last_active_at so the next /due call sees it.
- POST /api/v1/internal/personas/:id/feedback
Batch ≤ 100. Row id is `${tickId}-${module}` — natural key is
one rating per module per tick.
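The deterministic-id trick is what makes retries safe; a sketch with the conflict handling simulated by a Map (the id template is from the commit, everything else is illustrative):

```typescript
// Same tick retried twice → same row ids → ON CONFLICT DO NOTHING
// (here: Map.has) cannot double the audit rows.
function actionRowId(tickId: string, i: number, toolName: string): string {
  return `${tickId}-${i}-${toolName}`;
}

function insertIgnoreConflicts(
  table: Map<string, { toolName: string }>,
  tickId: string,
  actions: { toolName: string }[],
): void {
  actions.forEach((a, i) => {
    const id = actionRowId(tickId, i, a.toolName);
    if (!table.has(id)) table.set(id, a); // ON CONFLICT DO NOTHING
  });
}
```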
Runner tick pipeline (services/mana-persona-runner/src/runner/)
- claude-session.ts
Two phases per tick. runMainTurn feeds the persona's system
prompt + a German "simulate a day" user prompt to Claude Agent
SDK's query(), with mana-mcp wired in as a streamable-HTTP MCP
server. We iterate the returned AsyncGenerator and extract
tool_use blocks into ActionRows; a tool_result with
is_error=true flips the most recent action. runRatingTurn is a
fresh query() with tools:[] asking Claude in character to rate
each used module 1-5 as strict JSON. We parse with tolerance
for whitespace / fences. Unparseable output becomes a synthetic
'__parse' feedback row so operators see the failure.
- tick.ts
Orchestrator. Skips when config.paused. Fetches /due, processes
in batches of config.concurrency via Promise.allSettled so a
single persona failure never kills the batch. Returns
{due, ranSuccessfully, failed[], durationMs}.
- types.ts
ActionRow + FeedbackRow shapes shared between claude-session
and the internal client.
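The batch step in tick.ts can be sketched like this — `Promise.allSettled` is what guarantees one rejecting persona never kills the rest (names and the return shape follow the commit loosely):

```typescript
// Process personas in slices of `concurrency`; count fulfilled vs
// rejected instead of letting a single rejection abort the batch.
async function runBatch<T>(
  personas: T[],
  concurrency: number,
  run: (p: T) => Promise<void>,
): Promise<{ ranSuccessfully: number; failed: number }> {
  let ranSuccessfully = 0;
  let failed = 0;
  for (let i = 0; i < personas.length; i += concurrency) {
    const slice = personas.slice(i, i + concurrency);
    const results = await Promise.allSettled(slice.map(run));
    for (const r of results) r.status === "fulfilled" ? ranSuccessfully++ : failed++;
  }
  return { ranSuccessfully, failed };
}
```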
Runner bootstrap (src/index.ts)
- setInterval(config.tickIntervalMs) starts the tick loop on boot.
tickInFlight guards against overlap when Claude latency >
interval. If MANA_SERVICE_KEY or ANTHROPIC_API_KEY is missing,
loop is disabled with a warn line — /health + /diag/login still
work.
- POST /diag/tick (dev-only) fires one tick on demand, returns
the result. Avoids waiting a full interval during testing.
- Graceful SIGTERM/SIGINT shutdown clears the interval.
Client
- clients/mana-auth-internal.ts
X-Service-Key client for the three endpoints above.
Constructor throws on empty serviceKey — fail loud.
Boot smoke verified: /health returns ok, /diag/tick 500s with
descriptive messages when keys absent. Warning lines on boot when
keys are missing. Type-check green across mana-auth, tool-registry,
mcp, persona-runner.
M3 exit gate is the end-to-end smoke recipe (docker up → db:push →
seed:personas → diag/tick → psql) documented in
services/mana-persona-runner/CLAUDE.md.
M2.d (cross-space family/team memberships) still deferred.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
First concrete piece of M3 (docs/plans/mana-mcp-and-personas.md). The
tick loop itself and the Claude Agent SDK + MCP integration are M3.b;
the action/feedback persistence endpoints are M3.c. This commit just
stands up the service so the remaining pieces have a shell to land in.
Service shape (Bun/Hono on :3070)
- src/config.ts
Env-driven configuration: auth URL, MCP URL, service key for
action/feedback callbacks (M3.c), Anthropic API key, deterministic
PERSONA_SEED_SECRET (must match scripts/personas/password.ts so the
runner can log back in without any stored credentials), tick
interval and concurrency, RUNNER_PAUSED kill-switch. Production
start asserts all secrets are set and the dev fallback secret is
rotated.
- src/password.ts
Bit-for-bit identical HMAC-SHA256 password derivation to
scripts/personas/password.ts. Duplicated deliberately: the two
sides can't share code (one is a repo-root utility script, the
other is a workspace service) but must stay in sync — comment
at the top calls this out.
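The derivation both sides must agree on is a one-liner over node:crypto; a sketch under the assumption that the digest is hex-encoded (the HMAC-SHA256(secret, email) construction is from the commit, the encoding is an assumption):

```typescript
import { createHmac } from "node:crypto";

// Deterministic: same secret + email → same password on both sides,
// so no per-persona credentials are ever stored.
function derivePersonaPassword(seedSecret: string, email: string): string {
  return createHmac("sha256", seedSecret).update(email).digest("hex");
}
```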
- src/clients/auth.ts
Two upstream calls the runner needs for one tick: POST /auth/login
and GET /api/auth/organization/list. loginAndResolvePersonalSpace()
wraps both and picks the persona's auto-created personal space as
the write target (throws if none exists — Spaces-Foundation should
always have seeded one on signup).
- src/index.ts
Hono app: /health, /metrics (stub), and a dev-only /diag/login
endpoint that takes a persona email, derives the password, logs
in, resolves the personal space, and returns {userId, spaceId} as
an end-to-end sanity check. Disabled in production.
No tick loop yet — RUNNER_PAUSED prints an info line on boot, but
nothing fires. The dispatcher + Claude Agent SDK + MCP client land in
M3.b; the internal POST callbacks into mana-auth for persona_actions /
persona_feedback land in M3.c.
Infra
- Port 3070 added to docs/PORT_SCHEMA.md.
- Service listed in root CLAUDE.md next to mana-mcp.
- services/mana-persona-runner/CLAUDE.md documents what's built today,
what lands in M3.b/c, and the local diag smoke recipe.
Boot smoke verified: /health returns ok + paused/interval/concurrency,
/diag/login without email returns 400.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wires the M1 reminderChannel into the mana-ai mission runner with two
initial producers in services/mana-ai/src/planner/reminders.ts:
- tokenBudgetReminder — warns at 75% of the agent's daily cap, emits a
stronger "wrap up NOW" message at/above 100%. Uses pretick usage +
accumulated round usage so the warning tracks drift during a long
plan.
- retryLoopReminder — shape is in place (round≥3 + last 2 failures),
currently limited to the single lastCall LoopState exposes. Extends
cleanly once LoopState carries the full failure window.
buildReminderChannel composes active producers; the tick hoists
pretickUsage24h so the channel has the baseline. Each round the loop
re-evaluates the producers, so usage drift across rounds surfaces on
the NEXT turn.
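The threshold logic can be sketched against the structured reminder shape, using pretick baseline plus round usage as described (thresholds from the commit; the text strings and exact ratios at the boundaries are assumptions):

```typescript
// 75% of the daily cap → warn; at/above 100% → "wrap up NOW" escalate.
interface Reminder { producer: string; severity: "warn" | "escalate"; text: string }

function tokenBudgetReminder(
  pretickUsage: number,
  roundUsage: number,
  dailyCap: number,
): Reminder | null {
  const ratio = (pretickUsage + roundUsage) / dailyCap;
  if (ratio >= 1)
    return { producer: "token-budget", severity: "escalate", text: "Budget exhausted — wrap up NOW." };
  if (ratio >= 0.75)
    return { producer: "token-budget", severity: "warn", text: `At ${Math.round(ratio * 100)}% of daily token budget.` };
  return null;
}
```

Because the producers re-evaluate each round with fresh round usage, the same producer can go from silent to warn to escalate inside one plan.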
Also exports LoopState + ReminderChannel from @mana/shared-ai top-level
so consumers don't need to reach into /planner.
Tests: 13 new bun tests covering thresholds, pretick+round summing,
composition, and per-round re-evaluation.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three Claude-Code-inspired primitives for runPlannerLoop, derived from the
reverse-engineering reports in docs/reports/:
1. **Policy gate** (@mana/tool-registry) — evaluatePolicy() gates every tool
dispatch: denies admin-scope, denies destructive tools not in the user's
opt-in list, rate-limits per tool (30/60s default), flags prompt-injection
markers in freetext without blocking. Wired into mana-mcp with a
per-user rolling invocation log and POLICY_MODE env (off|log-only|enforce,
default log-only). mana-ai uses detectInjectionMarker only — tool dispatch
there is plan-only, so rate-limit/destructive checks don't apply yet.
2. **Reminder channel** (packages/shared-ai/src/planner/loop.ts) — new
reminderChannel callback in PlannerLoopInput. Called once per round with
LoopState snapshot (round, toolCallCount, usage, lastCall); returned
strings wrap in <reminder> tags and inject as transient system messages
into THIS LLM request only. Never pushed to messages[] — the Claude-Code
<system-reminder> pattern that keeps the KV-cache prefix stable.
3. **Parallel reads** (loop.ts) — isParallelSafe predicate enables
Promise.all dispatch when every tool_call in a round is parallel-safe,
in batches of PARALLEL_TOOL_BATCH_SIZE=10. Any non-safe call downgrades
the whole round to sequential. messages[] always appends in source
order, never completion order, so the debug log stays linear.
Default-off (undefined predicate) preserves pre-M1 behaviour.
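The all-or-nothing downgrade plus source-order appends can be sketched as (batch size from the commit; `dispatchRound` and its signature are hypothetical):

```typescript
// Promise.all only when EVERY call in the round is parallel-safe;
// results always append in source order, never completion order.
const PARALLEL_TOOL_BATCH_SIZE = 10;

interface ToolCall { name: string }

async function dispatchRound(
  calls: ToolCall[],
  isParallelSafe: ((c: ToolCall) => boolean) | undefined,
  exec: (c: ToolCall) => Promise<string>,
): Promise<string[]> {
  const allSafe = isParallelSafe !== undefined && calls.every(isParallelSafe);
  if (!allSafe) {
    // Default-off / any unsafe call → sequential, pre-M1 behaviour.
    const out: string[] = [];
    for (const c of calls) out.push(await exec(c));
    return out;
  }
  const out: string[] = [];
  for (let i = 0; i < calls.length; i += PARALLEL_TOOL_BATCH_SIZE) {
    const batch = calls.slice(i, i + PARALLEL_TOOL_BATCH_SIZE);
    // Promise.all resolves in input order regardless of completion order.
    out.push(...(await Promise.all(batch.map(exec))));
  }
  return out;
}
```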
Tests: 21 new in tool-registry (policy), 9 new in shared-ai (5 parallel,
4 reminder). All 74 green, type-check clean across 4 packages.
Design/plan: docs/plans/agent-loop-improvements-m1.md
Reports: docs/reports/claude-code-architecture.md,
docs/reports/mana-agent-improvements-from-claude-code.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Continuation of docs/plans/mana-mcp-and-personas.md. Personas are the
auto-test users the M3 runner will drive — they're real Mana users
(kind='persona', tier='founder'), registered through the same Better
Auth pipeline as humans, just stamped differently and metadata-tracked
so the persona-runner knows how to role-play them.
Schemas (auth namespace — personas are 1:1 with users, no reason for a
separate platform.* schema that the plan originally sketched)
- userKindEnum ('human' | 'persona' | 'system') + users.kind column,
wired into better-auth additionalFields so the JWT/user object carry
the flag. Default 'human' keeps every existing user untouched.
- auth.personas — 1:1 descriptor (archetype, systemPrompt, moduleMix
jsonb, tickCadence, lastActiveAt). CASCADE from users.id.
- auth.persona_actions — tick-grouped audit of every tool call the
runner makes (toolName, inputHash for dedup, result, latency).
- auth.persona_feedback — structured 1-5 ratings per module per tick,
plus free-text notes. This is where the runner writes the
self-reflection step at end of each tick.
Admin endpoints (/api/v1/admin/personas, admin-tier-gated)
- POST / create-or-update by email. Uses auth.api.signUpEmail
if the user's new, then stamps kind+tier+verified
and upserts the personas row. Idempotent — safe to
re-run after catalog edits.
- GET / list with 7-day action count per persona.
- GET /:id detail + recent 20 actions + per-module feedback
aggregate.
- DELETE /:id hard delete. Refuses non-persona users as
defense-in-depth: an admin typo here would cascade
through the full user-delete chain.
Catalog + seed pipeline (scripts/personas/)
- catalog.json 10 handwritten personas spanning 7 archetypes
(adhd-student, ceo-busy, creative-parent, solo-dev,
researcher, freelancer, overwhelmed-newbie).
Five pairs of personas that will later share
family/team spaces (cross-space setup is deferred
to M2.d per the plan).
- catalog.ts zod-validated loader. Refines email to require
the @mana.test domain — .test never resolves, so
no bounce risk.
- password.ts deterministic HMAC-SHA256(PERSONA_SEED_SECRET,
email). No stored per-persona credentials; the
runner re-derives on every login. Refuses the
dev-fallback secret in production.
- seed.ts POST /admin/personas per catalog entry. Flags:
--auth=, --jwt=, --dry-run.
- cleanup.ts Hard-delete every live persona. Warns when the
live set drifts from the catalog.
Root package.json:
pnpm seed:personas
pnpm seed:personas:cleanup
Extends the ESLint root-ignore list with `scripts/**` so Bun-typed
utility scripts don't fail the typed-parser check they weren't opted
into. Consistent with the rest of scripts/ being .mjs+.sh.
To go live (user action):
pnpm docker:up
cd services/mana-auth && bun run db:push
export MANA_ADMIN_JWT=...
pnpm seed:personas
M2.d deferred: cross-space (family/team/practice) memberships between
persona pairs. Better Auth's org-invite flow is multi-step and would
roughly double the M2 scope; the persona-runner (M3) can operate in
personal spaces first, shared-space tests land as their own milestone.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Foundation for autonomous Claude-driven testing. Plan:
docs/plans/mana-mcp-and-personas.md.
New packages
- @mana/tool-registry — schema-first ToolSpec<InputSchema, OutputSchema>
with zod generics, scope ('user-space' | 'admin') and policyHint
('read' | 'write' | 'destructive'). sync-client helpers speak the
mana-sync push/pull protocol directly so RLS and field-level LWW are
preserved. MasterKeyClient fetches per-user MKs via the existing
mana-auth GET /api/v1/me/encryption-vault/key endpoint (JWT-gated,
ZK-aware, already audited) — no new service-key endpoint built.
ZeroKnowledgeUserError surfaced as a typed throw.
- @mana/shared-crypto — AES-GCM-256 primitives extracted from the web
app's $lib/data/crypto/aes.ts so the server-side tool handlers and the
browser produce byte-for-byte identical wire format
(enc:1:{b64(iv)}.{b64(ct)}). Web app aes.ts now re-exports from
shared-crypto — 5 existing importers unchanged, svelte-check stays
green.
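The envelope framing both sides must produce byte-for-byte is simple to pin down; a sketch of just the codec (the `enc:1:{b64(iv)}.{b64(ct)}` format is quoted from the commit, the AES-GCM step itself is omitted):

```typescript
// Encode/parse the enc:1 wire envelope. Base64 never contains '.',
// so splitting on the first '.' after the prefix is unambiguous.
function encodeWire(iv: Uint8Array, ct: Uint8Array): string {
  const b64 = (u: Uint8Array) => Buffer.from(u).toString("base64");
  return `enc:1:${b64(iv)}.${b64(ct)}`;
}

function decodeWire(wire: string): { iv: Uint8Array; ct: Uint8Array } {
  if (!wire.startsWith("enc:1:")) throw new Error("not an enc:1 payload");
  const [ivB64, ctB64] = wire.slice("enc:1:".length).split(".");
  return {
    iv: new Uint8Array(Buffer.from(ivB64, "base64")),
    ct: new Uint8Array(Buffer.from(ctB64, "base64")),
  };
}
```

A mismatch anywhere in this framing is exactly the "half-encryption in the wire" failure the later audit script guards against.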
New service
- services/mana-mcp (:3069, Bun/Hono) — MCP Streamable HTTP gateway.
JWKS auth against mana-auth, per-user session isolation (session-id
belongs to the user who opened it — cross-user access returns 403),
admin-scoped tools filtered out before registration. MasterKeyClient
cached per process with a 5-minute TTL.
11 tools registered
- habits.{create,list,update,archive}, spaces.list (plaintext, M1)
- todo.{create,list,complete}, notes.{create,search}, journal.add
(encrypted — field lists match
apps/mana/apps/web/src/lib/data/crypto/registry.ts verbatim)
Infra
- Port 3069 added to docs/PORT_SCHEMA.md
- services/mana-mcp/CLAUDE.md with architecture, auth model,
tool-authoring recipe, local smoke-test steps
- Root CLAUDE.md services list updated
Type-check green across shared-crypto, mana-tool-registry, mana-mcp.
svelte-check on apps/mana/apps/web stays at 0 errors / 0 warnings.
Boot smoke verified: /health returns registry.loaded=true, unauthed
/mcp → 401, invalid-JWT /mcp → 401 with descriptive message.
Decisions locked in for later milestones (per plan D1–D10):
- Personas will be real mana-auth users (users.kind='persona'), no
service-key bypass (D1, D2)
- Tool-registry is the SSOT; mana-ai and the legacy
apps/api/src/mcp/server.ts get merged into it in M4 (three current
parallel tool catalogs collapse to one)
- Persona-runner (:3070) will be a separate service using the Claude
Agent SDK + MCP client (D5)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mana-auth's package.json declares @mana/shared-types as a workspace
dependency, but the Dockerfile's install stage never copied its source
into the build context. pnpm then silently failed to create the
workspace symlink under node_modules, and bun hit ENOENT on every
import at runtime: "reading /app/services/mana-auth/node_modules/
@mana/shared-types".
The broken image sat undetected as long as the long-running container
didn't restart. Tonight's deploy recreated it and every mana-auth
container immediately crash-looped — taking mana-api and mana-web
down with it via depends_on.
Same class of bug as 70c62e758 (shared-logger).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the mana-sync event-stream export (GET /backup/export) with a
fully client-driven `.mana` v2 archive: webapp reads Dexie, decrypts
per-field, packages JSONL + manifest, optionally PBKDF2+AES-GCM seals
with a passphrase.
- New: backup/v2/{format,passphrase,export,import}.ts + format.test.ts
(10 tests: round-trip, sealed path, 3 failure modes incl. wrong-
passphrase vs. tamper distinction).
- UI: ExportImportPanel with module multi-select, optional passphrase,
progress + sealed-file detection — replaces the old backup flow in
Settings → MyData.
- Removes services/mana-sync/internal/backup/ and the corresponding
client helpers + v1 tests. No parallel paths, no legacy shim.
- Why client-driven: zero-knowledge users hold their vault key only
client-side, so a server exporter cannot produce plaintext archives;
GDPR Art. 20 portability is better served by plaintext-by-default.
- Cross-account restore works via re-encryption under the target
vault key (no MK transfer needed).
DATA_LAYER_AUDIT.md §8 rewritten to reflect the new architecture.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Re-commit of c413ab7dd (reverted in c31dcdd66) without the unrelated
files that accidentally got swept into the original stage. Parser
content is identical.
The real Gemini /v1beta/interactions/:id completed shape bit us once
already during the initial smoke-test (we had OpenAI-style nested
`output.message.content[]` coded; reality is a flat `outputs` array
of thought|text|image items, with url_citations that carry no title
and usage fields named `total_input_tokens` rather than `input_tokens`).
This test pins the parser against a synthetic fixture covering the
cases we saw in the wild plus the failure modes that are hard to
provoke from a live API call:
- status dispatch (queued, in_progress, failed, cancelled, incomplete)
- completed body concatenated across text items, skipping thought/image
- empty/missing `outputs` without crashing
- missing usage
- citations deduped by url, hostname extracted as title
- wrong-type annotations and those without url skipped
- real vertexaisearch redirect URLs Gemini emits
- fallback to url as title when the URL is unparseable
- trimming of leading/trailing whitespace
To make this testable I pulled the completed-branch of
pollGeminiDeepResearch into a standalone parseInteractionResponse
helper — same behaviour, now reachable without mocking global fetch.
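A minimal sketch of the completed-shape handling described above — flat `outputs`, text-only concatenation, citation dedupe by url with hostname-as-title fallback. Field names follow the commit; the newline join and the function name are assumptions, and the real parseInteractionResponse also handles status dispatch and usage:

```typescript
interface OutputItem { type: "thought" | "text" | "image"; text?: string }
interface Citation { url: string; title?: string }

function parseCompleted(outputs: OutputItem[] | undefined, citations: Citation[] = []) {
  // Concatenate text items only; thought/image items are skipped.
  const body = (outputs ?? [])
    .filter((o) => o.type === "text" && o.text)
    .map((o) => (o.text as string).trim())
    .join("\n")
    .trim();

  // Dedupe by url; derive title from hostname, fall back to the raw
  // url when it is unparseable.
  const seen = new Set<string>();
  const deduped: { url: string; title: string }[] = [];
  for (const c of citations) {
    if (!c.url || seen.has(c.url)) continue;
    seen.add(c.url);
    let title = c.title;
    if (!title) {
      try { title = new URL(c.url).hostname; } catch { title = c.url; }
    }
    deduped.push({ url: c.url, title });
  }
  return { body, citations: deduped };
}
```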
Also adds the `test` script to package.json so `pnpm --filter
@mana/research-service test` works.
17 pass / 0 fail.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The real Gemini /v1beta/interactions/:id completed shape bit us once
already during the initial smoke-test (we had OpenAI-style nested
`output.message.content[]` coded; reality is a flat `outputs` array
of thought|text|image items, with url_citations that carry no title
and usage fields named `total_input_tokens` rather than `input_tokens`).
This test pins the parser against a synthetic fixture covering the
cases we saw in the wild plus the failure modes that are hard to
provoke from a live API call:
- status dispatch (queued, in_progress, failed, cancelled, incomplete)
- completed body concatenated across text items, skipping thought/image
- empty/missing `outputs` without crashing
- missing usage
- citations deduped by url, hostname extracted as title
- wrong-type annotations and those without url skipped
- real vertexaisearch redirect URLs Gemini emits
- fallback to url as title when the URL is unparseable
- trimming of leading/trailing whitespace
To make this testable I pulled the completed-branch of
pollGeminiDeepResearch into a standalone parseInteractionResponse
helper — same behaviour, now reachable without mocking global fetch.
Also adds the `test` script to package.json so `pnpm --filter
@mana/research-service test` works.
17 pass / 0 fail.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Opt-in path for missions that want Gemini Deep Research Max (up to 60 min
per task) instead of the shallow RSS pre-research. Because Max runs well
past a single 60-second tick, the state is carried across ticks:
tick N: submit → INSERT mission_research_jobs row → skip planner
tick N+k: poll → still running → skip planner (metric pending_skips)
tick N+m: poll → completed → inject as ResolvedInput, DELETE row, plan
- ManaResearchClient talks to mana-research's new internal
/v1/internal/research/async endpoints with X-Service-Key +
X-User-Id. Graceful-null on transport errors so a flaky
mana-research never crashes the tick loop.
- New table mana_ai.mission_research_jobs with PK (user_id, mission_id)
— presence is the "pending" flag; delete-on-terminal keeps queries
trivial.
- handleDeepResearch() encapsulates the state machine; planOneMission
now returns a discriminated union (planned | skipped | failed) so
"research pending" isn't miscounted as a parse failure.
- Opt-in at TWO gates to keep cost in check ($3–7/task, 1500 credits
per run):
1. MANA_AI_DEEP_RESEARCH_ENABLED=true server-side (default off)
2. DEEP_RESEARCH_TRIGGER regex matches the mission objective
(strict: "deep research", "tiefe recherche", "umfassende
recherche", "hintergrundrecherche", "deep dive")
Falls back to shallow RSS when either gate fails or the submit
errors upstream.
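The two gates compose into one predicate; a sketch covering exactly the trigger phrases listed above (case-insensitive matching is an assumption, as is the function name):

```typescript
// Gate 1: server-side opt-in flag. Gate 2: strict objective match.
const DEEP_RESEARCH_TRIGGER =
  /deep research|tiefe recherche|umfassende recherche|hintergrundrecherche|deep dive/i;

function deepResearchRequested(objective: string, envEnabled: boolean): boolean {
  return envEnabled && DEEP_RESEARCH_TRIGGER.test(objective);
}
```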
- Prom metrics: mana_ai_research_jobs_{submitted,completed,failed}_total
labelled by provider, plus _pending_skips_total.
- docker-compose wires MANA_RESEARCH_URL + the opt-in flag and adds
mana-research to depends_on.
- Full write-up with real API response shape (outputs plural, not
OpenAI-style), step-3 MCP-server plan (security-gated, not built),
ops + kill-switch: docs/reports/gemini-deep-research.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- New providers gemini-deep-research + gemini-deep-research-max on the
Interactions API (preview-04-2026). Submit/poll split, tier parameter
selects between standard (~minutes, $1–3) and max (up to 60 min, $3–7).
- Parser matches the real response shape: flat `outputs` array of
thought|text|image items, url_citation annotations without title,
`usage.total_input_tokens` / `total_output_tokens`.
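The flat shape could be consumed roughly like this (interface and helper names are assumptions based on the fields named above):

```typescript
// Hedged sketch of the response shape: a flat `outputs` array of
// thought|text|image items plus top-level token usage.
interface ResearchOutputItem {
  type: "thought" | "text" | "image";
  text?: string;
}
interface ResearchResponse {
  outputs: ResearchOutputItem[];
  usage: { total_input_tokens: number; total_output_tokens: number };
}

// Concatenate only the user-visible text items, skipping thoughts/images.
function extractReport(res: ResearchResponse): string {
  return res.outputs
    .filter((o) => o.type === "text" && typeof o.text === "string")
    .map((o) => o.text)
    .join("\n");
}
```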
- Route generalisation: /v1/research/async accepts `provider` with
default 'openai-deep-research' (backward compatible) and dispatches
to the right submit/poll pair.
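The dispatch might look like this (handler names and the registry shape are assumptions; only the default value is taken from the message above):

```typescript
// Sketch of the backward-compatible provider dispatch: a missing `provider`
// falls back to 'openai-deep-research', so existing callers keep working.
type Pair = { submit: (prompt: string) => string; poll: (id: string) => string };

const providers: Record<string, Pair> = {
  "openai-deep-research": { submit: (p) => `openai:${p}`, poll: (id) => `openai-poll:${id}` },
  "gemini-deep-research": { submit: (p) => `gemini:${p}`, poll: (id) => `gemini-poll:${id}` },
};

function dispatchSubmit(body: { prompt: string; provider?: string }): string {
  const name = body.provider ?? "openai-deep-research"; // backward-compatible default
  const pair = providers[name];
  if (!pair) throw new Error(`unknown provider: ${name}`);
  return pair.submit(body.prompt);
}
```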
- New internal service-to-service endpoint /v1/internal/research/async
gated by X-Service-Key + X-User-Id for credit accounting. Enables
mana-ai to drive deep-research jobs on the mission owner's wallet
without requiring a user JWT.
- Pricing: 300 credits (standard) / 1500 credits (max). Conservative
markup over the ~$3/$7 ceiling so the first runs can't surprise us.
- Docs: AGENT_PROVIDER_IDS + pricing + env map + auto-router stay in
sync; CLAUDE.md Phase 3b now current; API_KEYS.md references the
new providers under GOOGLE_GENAI_API_KEY.
Verified with a real smoke test against the Gemini API: submit + poll
both succeed, completed response parsed cleanly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
shared-types/src/index.ts re-exports with explicit .ts extensions
(Tailwind v4's module resolver needs them). TS 5.7 requires every
consumer to opt in via allowImportingTsExtensions. That flag is
normally only allowed with noEmit:true, so the emitting NestJS build
additionally sets rewriteRelativeImportExtensions so tsc still emits
valid JS specifiers.
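A sketch of the two tsconfig shapes involved (both flags are real TypeScript options; the exact placement in the repo's configs is an assumption):

```jsonc
// Type-check-only consumer (noEmit): the extension flag alone is enough.
{ "compilerOptions": { "noEmit": true, "allowImportingTsExtensions": true } }

// Emitting build (e.g. the NestJS service): TS 5.7's rewrite flag makes tsc
// rewrite "./foo.ts" specifiers to "./foo.js" in the emitted output.
{ "compilerOptions": { "allowImportingTsExtensions": true, "rewriteRelativeImportExtensions": true } }
```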
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two fixes surfaced by the end-to-end smoke test.
1. broadcast-track.ts: inner route paths double-prefixed.
Routes were declared as '/track/open/:token' etc., then
mounted at '/api/v1/track', yielding '/api/v1/track/track/open/:token',
so every tracking endpoint returned 404. Dropped the redundant
'/track/' prefix; the full path is now
'/api/v1/track/{open,click,unsubscribe}/:token', as both the
orchestrator and the client expect.
Verified with live curl:
- /track/open/BAD → 200 image/gif 42 bytes (graceful no-signal)
- /track/click with ?url missing → 400 missing url
- /track/click?url=javascript: → 400 bad url
- /track/click?url=https://ok + bad token → 302 graceful
- /track/unsubscribe/BAD GET → 400 HTML
- /track/unsubscribe/BAD POST → 400 (RFC 8058)
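The double-prefix bug can be shown dependency-free (a sketch; the actual service mounts a router in its web framework rather than joining strings):

```typescript
// Naive prefix mounting: the mount point already contributes '/track',
// so repeating it in the inner path duplicates the segment.
function mount(prefix: string, innerPath: string): string {
  return prefix.replace(/\/$/, "") + innerPath;
}

const buggy = mount("/api/v1/track", "/track/open/:token"); // "/api/v1/track/track/open/:token"
const fixed = mount("/api/v1/track", "/open/:token");       // "/api/v1/track/open/:token"
```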
2. shared-branding/tsconfig.json: allowImportingTsExtensions
missing. shared-types/src/index.ts uses explicit .ts
imports (intentional, for Tailwind's module resolver); any
downstream tsconfig without allowImportingTsExtensions emits
8 errors. shared-auth already had this fix — shared-branding
gets the same treatment. noEmit:true is set, so no rewrite
flag needed.
Verified: shared-branding pnpm check → 0 errors.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>