The Kleiderschrank module shipped end-to-end (M1–M5 + M4.1) but was
never surfaced on the workbench homepage — it was reachable only via
direct /wardrobe URLs. This adds the tile so users can add it to a
scene and open it from the launcher like every other module.
- apps.ts: registerApp({ id: 'wardrobe', name: 'Kleiderschrank',
color: #e11d48, icon: CoatHanger }) — list view loads
$lib/modules/wardrobe/ListView.svelte (tab switcher Kleidung /
Outfits). Detail routes stay SvelteKit-based
(/wardrobe/garment/[id], /wardrobe/outfit/[id],
/wardrobe/compose/[[outfitId]]) so the workbench only needs the
root list slot.
- categories.ts: wardrobe → 'creative' (next to picture, library,
playground, quiz).
Color matches the shared-branding entry in mana-apps.ts. Icon is the
phosphor CoatHanger (there is no bare "Hanger" in phosphor-svelte).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
getInScopeSpaceIds() used getCurrentUserId() (null for guests), so
guest-created rows stamped `_personal:guest` by the write hook
became invisible — empty scene, "App hinzufügen" silently no-op'd
because activeSceneIdState resolved to null.
Switch to getEffectiveUserId() so the read filter always matches
what the hook stamps. Four regression tests cover guest-only,
signed-in-no-space, non-personal active space, and personal-sentinel-
is-active collapsing to a single id.
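A minimal sketch of the scope resolution after the fix. The session shape, the `guest` fallback id, and the function bodies are assumptions for illustration; only the function names and the `_personal:guest` sentinel come from this message.

```typescript
// Hypothetical sketch: bodies and the Session shape are assumptions.
interface Session {
  userId: string | null; // null for guests
  activeSpaceId: string | null;
}

// Falls back to a guest id so reads match what the write hook stamps.
function getEffectiveUserId(session: Session): string {
  return session.userId ?? 'guest';
}

function getInScopeSpaceIds(session: Session): string[] {
  const personal = `_personal:${getEffectiveUserId(session)}`;
  const ids = new Set<string>([personal]);
  // An active personal sentinel collapses into the same single id.
  if (session.activeSpaceId !== null) ids.add(session.activeSpaceId);
  return Array.from(ids);
}
```

With a null user id the filter previously had no personal id to match against, which is why guest rows dropped out of every read.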
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the one checklist item M4 left for later — "TryOnButton auf
DetailGarmentView (mit impliziten 'Solo-Outfit')". A user can now open
a single garment's detail page, see "An mir anprobieren · 10 Credits",
and get an inline preview of themselves wearing just that one item
(or just that accessory, for glasses/jewelry/hat/accessory).
Client:
- api/try-on.ts: extracts a shared callGenerateWithReference() helper
and a dimsForSize() utility from runOutfitTryOn so the new
runGarmentTryOn can share the HTTP-error matrix + picture.images
row shape without a refactor of the outfit path.
- runGarmentTryOn({ garment, faceRefMediaId, bodyRefMediaId?, prompt?,
quality? }): auto-detects accessoryOnly from the garment's category
(FACE_ONLY_CATEGORIES), composes the DE default prompt ("im/in
<Name>", "mit <Name>" für Accessoires), writes a picture.images row
with wardrobeOutfitId=null so it doesn't pollute any outfit's
try-on history. Does NOT update any outfit.lastTryOn — it's a
standalone preview, on purpose.
- GarmentTryOnButton.svelte: thinner sibling of TryOnButton. Same
three states (ready / missing-refs / loading), same non-personal-
space disclaimer. Extra: inline preview panel showing the last
rendered result, with a link to the Picture gallery ("Gefunden in
der Picture-Galerie als normale Generierung.").
- DetailGarmentView now puts the try-on action above the existing
wear-tracking button. Try-on is the more engaging action for this
page; demoting "heute getragen" to a secondary-styled button
respects that without removing it.
Plan docs:
- docs/plans/wardrobe-module.md — rewrites the Status block to M1-M5
with actual commit hashes, and checks off the per-milestone task
lists. Adds a new M4.1 block for solo-garment try-on.
- docs/plans/me-images-and-reference-generation.md — adds the v40
space-scope migration (cb9a9bb42) as its own row in the commit
table, with a pointer to the sub-plan.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
83 new tests across 5 files — pure-logic, fast, run on every
push. Caught one real bug + motivated one small refactor.
Coverage:
- apps/mana/.../website/constants.test.ts (8): isValidSlug + RESERVED_SLUGS
+ isValidPath. Caught the 1-char-slug bug (regex allowed length 1;
UI + plan say min 2). Fixed the regex in both the webapp and the
mirrored server list.
- apps/mana/.../website/publish.test.ts extended (8 total): adds
self-parent cycle, 3-level nesting, all-orphans, empty-input cases
on top of the original determinism + orphan-drop tests.
- apps/mana/.../website/templates.test.ts (7): parameterised over each
of the 4 bundled templates — clone produces fresh UUIDs, page +
block counts match, navConfig populated. Plus unknown-template and
duplicate-slug rejection. Container-nesting is punted to the smoke
test (none of the bundled templates use columns yet).
- packages/website-blocks/src/schemas.test.ts (38): every block
(11) + sanity-checks (defaults satisfy own schema, enum + length
bounds, required fields). Pure Zod — no Svelte runtime needed.
- packages/website-blocks/src/themes/themes.test.ts (12): preset
parity, resolveTheme overrides, themeCssVars output format +
heading-font fallback.
- apps/api/src/modules/website/reserved-slugs.test.ts (10): mirror of
the client tests for the server SSOT, plus new hostname validation
cases (.mana.how reservation, length, malformed edges).
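For the 1-char-slug fix noted under constants.test.ts, a rough sketch of a validator with a two-character minimum. The pattern and the reserved entries here are assumptions, not the shipped regex or the real RESERVED_SLUGS list.

```typescript
// Hypothetical reserved list; the real RESERVED_SLUGS contents are not shown here.
const RESERVED_SLUGS = new Set(['admin', 'api', 'www']);

// First and last character are both mandatory, so the minimum length is 2.
// Making the tail group optional, e.g. /^[a-z0-9]([a-z0-9-]*[a-z0-9])?$/,
// would accept a single character, which is the kind of bug the test caught.
const SLUG_PATTERN = /^[a-z0-9][a-z0-9-]*[a-z0-9]$/;

function isValidSlug(slug: string): boolean {
  return SLUG_PATTERN.test(slug) && !RESERVED_SLUGS.has(slug);
}
```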
Refactor:
- apps/api/src/modules/website/reserved-slugs.ts now owns
isValidHostname + RESERVED_HOSTNAMES. domains.ts imports them.
Pure functions live next to the other pure validators; easier to
test + share.
All 83 new tests green. Web-app svelte-check + apps/api type-check
both clean. Existing publish.test.ts / website-blocks tests still
pass (the monorepo-wide count is now well above 83 — these are
the new ones from this commit).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the M3 sub-agent loop. Both webapp consumers of runPlannerLoop
now expose the `task` tool to their planner LLM and route matching
calls to a session-bound sub-agent handler.
Pattern (identical in both files):
1. Hoist the regular tool dispatcher into a local `dispatchTool`
so both the main loop AND the sub-agent executor can share it.
The parent's guardrail, executor, actor attribution, and
domain-event emission happen exactly once — sub-agent tool
calls route through the same function.
2. Build a per-session taskHandler via createTaskToolHandler()
with parentDepth=0 (sub-agents themselves refuse to recurse)
and model=google/gemini-2.5-flash-lite (cheap tier —
sub-agents are summarisation-heavy, no reason to burn primary
budget on them).
3. toolsWithTask = [...regular tools, TASK_TOOL_SCHEMA].
4. onToolCall branches on `call.name === TASK_TOOL_NAME` →
taskHandler.handle; else dispatchTool. Both return
ToolResult, loop doesn't care which route was taken.
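Step 4 of the pattern can be sketched roughly as follows. The `ToolCall` and `ToolResult` shapes and the `createOnToolCall` name are simplified assumptions; only TASK_TOOL_NAME, `dispatchTool`, and `taskHandler.handle` are named in this message.

```typescript
// Simplified, assumed shapes; the real types live in @mana/shared-ai.
interface ToolCall { name: string; args: Record<string, unknown> }
interface ToolResult { success: boolean; message: string }
type ToolHandler = (call: ToolCall) => Promise<ToolResult>;

const TASK_TOOL_NAME = 'task';

// Hypothetical helper: `task` calls go to the sub-agent handler, everything
// else goes through the shared dispatcher. Both return a ToolResult.
function createOnToolCall(
  dispatchTool: ToolHandler,
  taskHandler: { handle: ToolHandler },
): ToolHandler {
  return (call) =>
    call.name === TASK_TOOL_NAME ? taskHandler.handle(call) : dispatchTool(call);
}
```

Because the sub-agent executor receives the same `dispatchTool`, guardrail and attribution logic stays in one place.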
Companion:
- parentTools = AI_TOOL_CATALOG (full catalog)
- Token tracking via taskHandler.cumulativeUsage() available if
we later want to attribute sub-agent tokens to a companion-
session counter
Mission runner:
- parentTools = availableTools (agent-policy-filtered)
- Sub-agent inherits the same filter — a research sub-agent in a
mission that already had policy:deny on `list_events` still
can't see `list_events`, defense-in-depth
- runToolCall still gets aiActor → sub-agent tool executions are
attributed to the same mission/iteration as the parent
mana-ai deliberately NOT wired: its onToolCall is a no-op recorder
(plans get staged, executed client-side on sync). Sub-agents there
would produce no value since the sub-agent couldn't execute tools
either, just plan. When the tool-registry fully absorbs AI_TOOL_CATALOG
(Personas-plan M4), mana-ai will get sub-agent support in that same
migration.
No new tests — shared-ai's 107 tests cover the primitive + handler
exhaustively. Existing 31 companion+mission tests remain green;
svelte-check clean across 7427 files.
Completes M3. runPlannerLoop now has Claude-Code's five big patterns:
policy-gate (M1) / reminder-channel (M1) / parallel-reads (M1) /
compactor (M2) / sub-agents (M3).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Exposes runSubAgent() as a tool the planner LLM can call natively,
matching Claude Code's `Task` tool shape: { subagent_type, description,
prompt } -> single-string summary.
New exports from @mana/shared-ai:
- TASK_TOOL_NAME = 'task'
- TASK_TOOL_SCHEMA — ToolSchema ready to drop into a runPlannerLoop
`tools` array. subagent_type enum = research|plan|general;
description+prompt required; defaultPolicy: 'auto' (control-flow,
not a user-data write).
- createTaskToolHandler(opts) — factory returning:
- handle(call): structured ToolResult with the sub-agent's
summary as message + data {subAgentType, toolsCalled,
rounds, stopReason, usage}
- cumulativeUsage(): rolled-up TokenUsage across all sub-agent
invocations — parent budget accounting reads from here
- invocationCount(): metric-ready counter
Why not in mana-tool-registry: `task` is a loop-internal control-flow
primitive, not a user-data operation. Registry is for habits/notes/etc.
where MCP exposure and space-scoping matter. task never touches mana-
sync and never crosses the MCP boundary.
Recursion guard is defense-in-depth: the primitive throws
SubAgentRecursionError, this handler catches parentDepth >=
MAX_SUB_AGENT_DEPTH up front and returns a structured ToolResult
instead so the LLM sees it as regular tool-feedback.
Exceptions from the sub-agent (provider down, network) get wrapped
as `{ success: false, message: 'Sub-agent failed: ...' }`. The parent
loop's round continues.
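The recursion guard and exception wrapping described above, as a rough sketch. `handleTaskCall`, its parameters, and the refusal wording are hypothetical; the real factory is createTaskToolHandler(opts), whose internals are not shown here.

```typescript
const MAX_SUB_AGENT_DEPTH = 1;

interface ToolResult { success: boolean; message: string }

// Hypothetical stand-in for handle(call): `runSubAgent` is whatever
// produces the sub-agent's single-string summary.
async function handleTaskCall(
  parentDepth: number,
  runSubAgent: () => Promise<string>,
): Promise<ToolResult> {
  // Up-front guard: return structured feedback instead of throwing.
  if (parentDepth >= MAX_SUB_AGENT_DEPTH) {
    return { success: false, message: 'Sub-agents cannot spawn sub-agents.' };
  }
  try {
    return { success: true, message: await runSubAgent() };
  } catch (err) {
    // Provider or network failures become regular tool feedback.
    const reason = err instanceof Error ? err.message : String(err);
    return { success: false, message: `Sub-agent failed: ${reason}` };
  }
}
```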
14 new tests covering schema shape, recursion rejection, argument
validation (4 cases), happy path with tool dispatch, cumulative
usage tracking across multiple invocations, exception wrapping,
and parent-dispatcher routing.
107 shared-ai tests green total (was 93).
M3.3 consumer wiring follows.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
M5 of docs/plans/wardrobe-module.md — exposes the Wardrobe feature
through the shared tool-registry so MCP clients (Claude Desktop)
and the mana-ai mission runner can browse, compose, and try on
outfits alongside the built-in UI. Follows the pattern M5 of the
me-images plan established in packages/mana-tool-registry/src/
modules/me.ts — encrypted reads via mana-sync pull + client-side
filter on `row.spaceId === ctx.spaceId`, writes via pushInsert
with encryptRecordFields, HTTP proxy for the try-on endpoint.
Four tools in packages/mana-tool-registry/src/modules/wardrobe.ts:
- wardrobe.listGarments(category?, tags?, limit?) — read. Pulls
wardrobeGarments from mana-sync, filters to the active space,
decrypts name/brand/color/size/material/tags/notes, applies
optional category + intersection-tag filters, caps at 200 rows
(50 default). Archived + soft-deleted items excluded.
- wardrobe.listOutfits(occasion?, favoriteOnly?, limit?) — read.
Same shape, filters by occasion (closed enum, plaintext —
unencrypted filter) and favorite. garmentIds arrive plaintext
so the agent can immediately resolve them via listGarments when
it needs more than ids.
- wardrobe.createOutfit({ name, garmentIds, occasion?, tags?,
description? }) — write. Encrypts name/description/tags, pushes
an insert tagged with ctx.spaceId. No cross-space validation of
the garmentIds — the calling agent is expected to have called
listGarments first; dangling refs surface visually in the UI
rather than as a hard server error.
- wardrobe.tryOn({ outfitId, prompt?, accessoryOnly?, quality? }) —
write (consumes credits). Biggest tool of the set: pulls the
outfit, its garments, and the caller's meImages in three
separate mana-sync pulls, resolves the primary face-ref +
body-ref, auto-detects accessoryOnly from garment categories
(FACE_ONLY_CATEGORIES: accessory/glasses/jewelry/hat), composes
refs respecting the 8-slot server cap, composes a default DE
prompt from the outfit name + occasion, and proxies to
/api/v1/picture/generate-with-reference with the user's JWT.
Returns the resulting image's URL + mediaId + prompt + model.
Deliberately does NOT persist a picture.images row or update
outfit.lastTryOn from the tool — those live on the client's
imagesStore / wardrobeOutfitsStore and doing them server-side
would race with a user who's also looking at the outfit page.
Agents use tryOn as a preview/inspection primitive; the user
commits from the UI.
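The read-side filtering in wardrobe.listGarments might look roughly like this once rows are pulled. Decryption is omitted, the row shape is an assumption, and treating the tag filter as "every requested tag must be present" is one reading of "intersection", not a confirmed detail.

```typescript
// Assumed row shape after the mana-sync pull.
interface GarmentRow {
  id: string;
  spaceId: string;
  category: string;
  tags: string[];
  isArchived: boolean;
  deletedAt: string | null;
}
interface ListGarmentsInput { category?: string; tags?: string[]; limit?: number }

const DEFAULT_LIMIT = 50;
const MAX_LIMIT = 200;

function clampLimit(limit?: number): number {
  return Math.min(limit ?? DEFAULT_LIMIT, MAX_LIMIT);
}

function filterGarments(rows: GarmentRow[], spaceId: string, input: ListGarmentsInput = {}): GarmentRow[] {
  const wanted = input.tags ?? [];
  return rows
    .filter((r) => r.spaceId === spaceId && !r.isArchived && r.deletedAt === null)
    .filter((r) => input.category === undefined || r.category === input.category)
    // Assumption: a garment must carry every requested tag.
    .filter((r) => wanted.every((t) => r.tags.indexOf(t) !== -1))
    .slice(0, clampLimit(input.limit));
}
```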
Types: 'wardrobe' added to the ModuleId union. registerWardrobeTools
wired into registerAllModules — mana-mcp's createMcpServerForUser
iterates the registry and exposes any user-space tool automatically.
Credit model: quality defaults to 'medium' (10 credits per render),
same rate as text-to-image generation. The agent pays for the
generation out of the calling user's credit balance via the
standard validateCredits/consumeCredits chain on the server endpoint.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New packages/shared-ai/src/planner/sub-agent.ts implementing the
"one level deep, fresh messages, restricted tools, single-string
return" sub-agent contract from Claude Code's KN5/I2A launcher.
Four invariants enforced at the primitive level:
1. FRESH messages[] — parent's history never leaks in. The sub-agent
only sees its own system prompt + the task description. Hundreds
of scanned files stay inside the sub-agent.
2. RESTRICTED tool-whitelist — parent's full catalog is filtered
per SubAgentType ('research' = auto-policy only, 'general' =
everything, 'plan' = auto-policy + 3-round cap). Custom filter
overrides the type default.
3. SINGLE RETURN VALUE — sub-agent returns summary:string for
the parent to render as task-tool-result. Individual tool calls
stay in rawResult for debug capture but never cross the boundary.
4. ONE LEVEL DEEP — MAX_SUB_AGENT_DEPTH = 1. parentDepth >= 1 throws
SubAgentRecursionError; the consumer task-tool handler will
also check, this is defense-in-depth.
Model is required (no default) — routing to a cheaper tier like the
compactor does is an explicit decision, not a sneaky default.
Belt-and-suspenders wrapper on onToolCall rejects any tool call
whose name isn't in the whitelist, even if the LLM fabricates one.
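Invariant 2 and the whitelist wrapper, sketched under assumptions. The `ToolSchema` shape, the `'propose'` policy value, and all function names are hypothetical; the per-type rules follow the list above.

```typescript
type SubAgentType = 'research' | 'plan' | 'general';

// Assumed minimal schema; 'propose' is a made-up non-auto policy value.
interface ToolSchema { name: string; defaultPolicy: 'auto' | 'propose' }
interface ToolResult { success: boolean; message: string }

function filterToolsFor(type: SubAgentType, catalog: ToolSchema[]): ToolSchema[] {
  // 'general' sees everything; 'research' and 'plan' only auto-policy tools.
  return type === 'general' ? catalog : catalog.filter((t) => t.defaultPolicy === 'auto');
}

function maxRoundsFor(type: SubAgentType): number | undefined {
  return type === 'plan' ? 3 : undefined;
}

// Rejects any call whose name is not whitelisted, even a fabricated one.
function guardToolCalls(
  allowed: ToolSchema[],
  dispatch: (name: string) => ToolResult,
): (name: string) => ToolResult {
  const names = new Set(allowed.map((t) => t.name));
  return (name) =>
    names.has(name)
      ? dispatch(name)
      : { success: false, message: `Tool "${name}" is not available to this sub-agent.` };
}
```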
14 new tests covering recursion guard, tool filtering per type,
custom filter, whitelist rejection, fresh-messages isolation, usage
roll-up, default summary on max-rounds, type-specific system prompt,
system-prompt override, and end-to-end tool-call -> result -> summary.
93 shared-ai tests green total (was 79).
M3.2 (task tool in registry) and M3.3 (consumer wiring) follow.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
M4 of docs/plans/wardrobe-module.md — the loop closes. A user with at
least a face-ref in the active space can click "Anprobieren" on an
outfit detail page; the client composes a reference call against the
existing M3 `/generate-with-reference` endpoint, persists the result
into the Picture gallery with a `wardrobeOutfitId` back-reference,
and pins a `lastTryOn` snapshot on the outfit so its card instantly
shows the AI preview next time.
Server side — picture/routes.ts:
- verifyMediaOwnership now accepts `apps: string | readonly string[]`.
Under the hood it runs one list() per app-tag and unions the owned
set before the missing-id check. Preserves the 500-row per-app
sanity cap. Single-tag callers unchanged — it's an additive widen.
- Picture /generate-with-reference passes `['me', 'wardrobe']` so
face/body portraits (me-images) and garment photos (wardrobe) can
ride in the same referenceMediaIds array. Anything outside those
two tags still 404s — no expansion of the trust surface.
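The widened ownership check can be pictured roughly like this. `findMissingIds` and the injected lister are hypothetical, and the real list() call is presumably async; it is synchronous here to keep the sketch short.

```typescript
// Hypothetical helper: returns the ids the caller does NOT own under any tag.
function findMissingIds(
  ids: readonly string[],
  apps: string | readonly string[],
  listOwnedIds: (app: string) => readonly string[],
): string[] {
  const tags: readonly string[] = typeof apps === 'string' ? [apps] : apps;
  const owned = new Set<string>();
  // One list() per app tag, unioned before the missing-id check.
  for (const app of tags) {
    for (const id of listOwnedIds(app)) owned.add(id);
  }
  return ids.filter((id) => !owned.has(id));
}
```

A non-empty result is what would translate into the 404 for ids outside the allowed tags.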
Client side — wardrobe/api/try-on.ts:
- `runOutfitTryOn({ outfit, garments, faceRefMediaId, bodyRefMediaId?, ... })`
composes the ref list (face → body → up to 6 garments, respecting
the 8-slot server cap), picks portrait 1024x1536 by default (or
1024x1024 in accessory-only mode), and POSTs with
`model='openai/gpt-image-2'`, `quality='medium'`, `n=1`. One render
per click; multi-variant is a future Generator-style extension.
- Default prompts are composed in DE from the outfit meta (name +
occasion); callers can override via `prompt`. Accessory-only mode
uses a tighter studio-portrait phrasing since the fullbody ref is
dropped there.
- `isAccessoryOnlyOutfit()` helper — iff every garment is in
FACE_ONLY_CATEGORIES, skip body-ref and render square. Covers the
Brille-Try-On headline use case.
- On success: inserts a `picture.images` row with generationMode=
'reference', referenceImageIds, and wardrobeOutfitId set; then
calls wardrobeOutfitsStore.setLastTryOn() with imageId + imageUrl
so OutfitCard + DetailOutfitView immediately flip to the AI cover.
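A sketch of the reference composition described above (face, then body, then up to 6 garments, capped at 8). `composeTryOnRequest`, the `GarmentRef` shape, and the return shape are assumptions; the category list, the caps, and the two sizes come from this message.

```typescript
const FACE_ONLY_CATEGORIES = new Set(['accessory', 'glasses', 'jewelry', 'hat']);
const MAX_REFERENCE_IMAGES = 8;
const MAX_GARMENT_REFS = 6;

// Assumed minimal garment shape; mediaIds[0] is the primary photo.
interface GarmentRef { category: string; mediaIds: string[] }

function isAccessoryOnlyOutfit(garments: GarmentRef[]): boolean {
  return garments.every((g) => FACE_ONLY_CATEGORIES.has(g.category));
}

function composeTryOnRequest(faceRef: string, bodyRef: string | undefined, garments: GarmentRef[]) {
  const accessoryOnly = isAccessoryOnlyOutfit(garments);
  const refs: string[] = [faceRef];
  // Accessory-only renders skip the body reference and go square.
  if (bodyRef !== undefined && !accessoryOnly) refs.push(bodyRef);
  for (const g of garments.slice(0, MAX_GARMENT_REFS)) {
    const primary = g.mediaIds[0];
    if (primary !== undefined) refs.push(primary);
  }
  return {
    referenceMediaIds: refs.slice(0, MAX_REFERENCE_IMAGES),
    size: accessoryOnly ? '1024x1024' : '1024x1536',
  };
}
```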
TryOnButton — wardrobe/components/TryOnButton.svelte:
- Three states: ready (click to render), missing-references (shows
UserCircle + link to /profile/me-images, with the right hint for
accessory-only vs. fullbody), loading (spinner).
- Credit estimate on the button (10c medium quality).
- Hints: accessory-only, too-many-garments (>6, over server cap),
and non-personal-space disclosure — the family-space case gets its
own sentence since "Try-On rendert dich, nicht dein Kind" is
non-obvious.
- Reads face-ref/body-ref via useImageByPrimary (space-scoped after
the v40 meImages migration — brand/club/family spaces need their
own references uploaded).
UI wiring:
- DetailOutfitView replaces the M3 stub button with <TryOnButton/>.
The existing "Try-On Verlauf"-Strip already reads
`useOutfitTryOns(outfit.id)` which filters `picture.images` by
wardrobeOutfitId — it lights up automatically on first render.
Not in M4 (punted to follow-ups):
- Solo-garment try-on on DetailGarmentView ("nur diese Brille auf
mein Gesicht"). Plan called it out as optional; the outfit flow
already covers it when the outfit contains only that one garment.
- Multi-variant rendering (n=2/4). Usable "show me 3 looks" needs a
picker UI on top, not just a param bump.
- Quality + prompt override in the button. A power-user panel can
come later; default medium + auto-prompt keeps M4's click-to-try-on
one-tap.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
M3 of docs/plans/wardrobe-module.md — layers outfit composition on top
of M2's garment grid. Users can now combine their garments into named
outfits, see them in a second tab under /wardrobe, open a per-outfit
detail page, and edit via the same composer route.
Routes:
- /wardrobe/compose — empty composer, creates a new outfit
- /wardrobe/compose/[outfitId] — composer pre-populated with an
existing outfit, saves back into it (SvelteKit optional-param
`[[outfitId]]` folder name). Both wrap OutfitComposer in
`{#key outfitId ?? 'new'}` so create→edit navigation cleanly
re-mounts with the right initial state.
- /wardrobe/outfit/[id] — outfit detail; wrapped in `{#key id}`
for the same reason as the garment detail route.
Components:
- OutfitCard — grid tile. Cover precedence: lastTryOn.imageUrl
(M4 payload) → 2×2 garment-thumbnail collage → empty state.
Shows name + "<n> Stücke · <occasion>" line + favorite heart
overlay when set.
- OutfitComposer — two-column editor. Left: garments grouped by
category with +/✓ overlay toggles and a scroll container capped
at 70vh so the right-hand editor doesn't disappear below the
fold on long libraries. Right: name + description + occasion
dropdown + season pill-toggles + comma-tags + composition chips
with hover-× to remove. Click-to-add (no drag-drop — simpler
mental model, keyboard-accessible for free, 100% of the
workflow covered).
- OutfitsView — sibling to GridView, renders the outfit grid and
the "+ Neues Outfit" CTA. Shows a garments-first empty state
when the user has no clothing at all, an outfit-only empty state
when they do but haven't composed anything yet.
- DetailOutfitView — cover + metadata card + "Zusammenstellung"
grid (each garment tile links back to its own detail page).
Try-On button is a stub for M4 ("kommt bald"); the Try-On
history strip reads from picture.images via the existing
useOutfitTryOns query and renders once M4 starts writing those
back-references.
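The OutfitCard cover precedence can be expressed as a small pure function. This is a sketch only: the `Cover` union and `pickCover` are hypothetical names, and the component may compute this differently.

```typescript
type Cover =
  | { kind: 'tryOn'; url: string }
  | { kind: 'collage'; urls: string[] }
  | { kind: 'empty' };

// Precedence: try-on image, then a 2x2 collage of garment thumbs, then empty.
function pickCover(lastTryOnUrl: string | undefined, garmentThumbs: string[]): Cover {
  if (lastTryOnUrl) return { kind: 'tryOn', url: lastTryOnUrl };
  if (garmentThumbs.length > 0) return { kind: 'collage', urls: garmentThumbs.slice(0, 4) };
  return { kind: 'empty' };
}
```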
ListView now toggles between Garments (GridView, default) and
Outfits (OutfitsView) tabs; local state, lost on hard reload,
kept across in-app navigation.
Types: OutfitTryOn gains `imageUrl: string` (mana-media URL cached
alongside the picture.images.id pointer). Needed so the OutfitCard
renders the try-on thumb with one HTTP round-trip instead of a
Dexie→picture.images→mana-media lookup chain. Source of truth
remains the picture.images row; this is just a cache.
No M1 data shapes break: the only change is an additive field on OutfitTryOn,
and that type wasn't used anywhere in shipped code yet.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The code is shipped (M1–M7) but nothing has run against real
Postgres + mana-sync + mana-media + a browser. This smoke-test doc is
the click-through a human needs to do before we trust the feature in
production.
- docs/plans/website-builder-smoketest.md — 10 scenarios end-to-end
from migrations + dev-stack through create/publish, block coverage
(image upload, gallery lightbox, columns container), forms with
honeypot + rate-limit, moduleEmbed with public-flag enforcement,
templates + AI tools, subdomain rewrite, custom-domain DNS verify,
rollback + analytics, metrics + GC script, edge-cases + security.
Lists known limits (CF SaaS gap, target-delivery, AiProposalInbox)
explicitly so the tester knows what NOT to expect.
- docs/optimizable/manual-test-backlog.md — new release-blocker entry
pointing at the walkthrough. Follows the same format as the Shared
Space + Data Export entries.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Build plan for the M1 polish gap: today evaluatePolicy() rejects every
destructive call because settingsFor() returns a hardcoded
{ allowDestructive: [] }. In POLICY_MODE=enforce that would lock all users
out of destructive tools. Not a problem so far: the registry contains no
destructive tools. As soon as the first one lands, this plan applies.
Key decisions:
- Scope: per-SPACE, not per-user. Fits the space-scoped data
model; one admin opts in, all members of the space benefit.
- Authority: server-authoritative. Not in Dexie. JWT-gated
PUT, role check (only owner/admin may change it), RLS.
- Storage: dedicated table mana_spaces.space_policy_preferences
plus an append-only space_policy_audit for diff tracking. NOT a
JSON column on spaces: typed, indexable, and leaves room for
later per-space rate limits etc.
- Fail-closed: if mana-mcp cannot reach apps-api, destructive
calls are blocked. 30s TTL cache, short enough for revoke speed.
- Acknowledgement enforced at API: PUT requires acknowledged:true
when new tools are added to the list. Anti-click-through by construction.
Includes:
- Schema + RLS (Postgres)
- GET/PUT/audit-GET + internal service-key GET
- SpacePolicyClient pattern in mana-mcp (like MasterKeyClient)
- UI /s/:space/settings/ai-policy with audit section
- Metrics extension (policy-changes counter)
- Rollout order + tests + open questions
Build trigger: the first PR that adds a tool with policyHint:'destructive'
to the registry.
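The fail-closed decision with its 30s TTL cache could be sketched as below. The class name, the synchronous fetcher, and the injected clock are assumptions for illustration; the planned SpacePolicyClient would call apps-api asynchronously.

```typescript
const POLICY_TTL_MS = 30000;

interface CacheEntry { allow: Set<string>; expiresAt: number }

// Hypothetical cache: the fetcher throws when the policy API is unreachable.
class SpacePolicyCache {
  private readonly entries = new Map<string, CacheEntry>();
  private readonly fetchAllowList: (spaceId: string) => string[];
  private readonly now: () => number;

  constructor(fetchAllowList: (spaceId: string) => string[], now: () => number = Date.now) {
    this.fetchAllowList = fetchAllowList;
    this.now = now;
  }

  isDestructiveAllowed(spaceId: string, toolName: string): boolean {
    const t = this.now();
    const cached = this.entries.get(spaceId);
    if (cached && cached.expiresAt > t) return cached.allow.has(toolName);
    try {
      const allow = new Set(this.fetchAllowList(spaceId));
      this.entries.set(spaceId, { allow, expiresAt: t + POLICY_TTL_MS });
      return allow.has(toolName);
    } catch {
      // Fail closed: no fresh policy means destructive calls are blocked.
      this.entries.delete(spaceId);
      return false;
    }
  }
}
```

A revoked tool stays usable for at most one TTL window, which is the trade-off the 30s value is meant to bound.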
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Third encrypted module in @mana/tool-registry, brings the registry to
16 tools across 7 modules. Lets the Anna / Sofia / Maya personas
(whose moduleMix puts mood at 20–30 %) actually exercise their
daily-tracking routine when the runner ticks.
Three tools, all encrypted per the web-app registry
(moodEntries: entry<LocalMoodEntry>(['withWhom', 'notes'])):
- mood.log
Write a mood entry. `level` 1–10, `emotion` + `secondaryEmotions`
from the taxonomy copied verbatim from apps/mana/.../modules/mood/
types.ts (keep in sync if new emotions/activities get added). date
+ time default to server-clock now; personas logging
retrospectively pass them explicitly.
- mood.today
Return every entry for today (or `{ date }`) sorted by time.
Multiple entries per day are normal — the web app timelines them.
- mood.recent
Last N days (default 7), newest first. Useful for
self-reflection turns like "how has your week been?".
Scope decisions
Calendar was on the shortlist but dropped: `events` writes couple to
`timeBlocks` (a separate table/appId), so one tool call becomes two
sync pushes with a shared transaction concern — worth a careful
session, not a drive-by. Goals dropped because `companionGoals` is
owned by the Companion Brain, not a regular module, and has no clear
mana-sync appId convention. Both candidates for a focused follow-up.
Verified
- `pnpm run validate:all` green (crypto registry 202/202, encrypted-
tools audit 9/9 including the 3 new mood tools)
- type-check across tool-registry + mcp + runner green
- registerAllModules → 16 tools, 7 modules:
habits: create/list/update/archive
journal: add 🔐
me: listReferenceImages 🔐 / generateWithReference
mood: log 🔐 / today 🔐 / recent 🔐
notes: create 🔐 / search 🔐
spaces: list
todo: create 🔐 / list 🔐 / complete
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
M2 of docs/plans/wardrobe-module.md — the first interactive surface on
top of the M1 data layer. Users can now upload photos, browse their
garment grid filtered by category, and edit/archive/delete individual
items. Outfits (M3) and Try-On (M4) are still placeholders.
Route:
- /wardrobe — grid view with active-space badge in the intro card
(identical pattern to /profile/me-images since the pool IS per-
space). Category tabs across the top: "Alle" + eleven categories
with live counts. Dropping files while a category tab is active
creates garments with that category preselected; dropping on
"Alle" defaults to `other` and the user edits on the detail page.
- /wardrobe/garment/[id] — detail view. Renders the primary photo
+ metadata card; a pencil toggles into GarmentForm for inline
edit. Three actions: "Heute getragen" (bumps wearCount + stamps
lastWornAt, prominent primary button), Archive, and Delete with
confirm. The route wraps DetailGarmentView in `{#key id}` so
navigating between different garments cleanly remounts the
liveQuery + form state.
Components:
- CategoryTabs — horizontal pill row with per-category count
badges. Stays compact on mobile via overflow-x-auto.
- GarmentCard — tile with primary photo + name + brand + wear-
count hint; click navigates to detail.
- GarmentForm — inline edit sheet (name, category, brand, color,
size, material, tags comma-separated, notes, price+currency).
Comma→array for tags because that's how most users think about
them; the store normalizes on save.
- GridView — orchestrates queries, filter tabs, drop zone (reuses
MeImageUploadZone from profile since it's already generic about
what "files" mean), and the empty states (no garments at all vs.
no garments in this category).
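The comma-to-array handling for GarmentForm tags might be normalized along these lines. `parseTags` is a hypothetical name, and trimming, dropping empties, and de-duplicating are assumptions about what "normalizes on save" means.

```typescript
// Hypothetical normalizer: "summer, linen ,,summer" -> ["summer", "linen"]
function parseTags(input: string): string[] {
  const seen = new Set<string>();
  const tags: string[] = [];
  for (const raw of input.split(',')) {
    const tag = raw.trim();
    if (tag !== '' && !seen.has(tag)) {
      seen.add(tag);
      tags.push(tag);
    }
  }
  return tags;
}
```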
Small conveniences:
- api/upload.ts wraps the M1 POST /api/v1/wardrobe/garments/upload
endpoint with fetchWithAuth; same shape as profile's me-images
client (mediaId/storagePath/publicUrl/thumbnailUrl).
- api/media-url.ts — tiny mediaId → URL resolver using the same
inline PUBLIC_MANA_MEDIA_URL pattern wallpaper and invoices/
pdf/logo already use. Worth a shared helper later but premature
while three call sites disagree on which variant to default to.
- constants.ts — CATEGORY_ORDER / CATEGORY_LABELS plus
OCCASION_LABELS and SEASON_LABELS for M3 to pick up.
Svelte 5 note: GarmentForm's `$state(garment.xxx)` initializers
trip the state_referenced_locally check, but the intent is
correct — the parent uses `{#key id}` to remount on navigation,
so the captures are a feature, not a bug. Suppressed per-line
with `svelte-ignore` and a comment pointing at the remount
mechanism.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the plan. Prometheus metrics across the website endpoints, a
cookieless analytics block users can opt in to, a read-only orphan-
asset scan script, plus two M2 debts (rollback UI + determinism test).
apps/api:
- New /metrics endpoint (unauth; internal-network only via reverse proxy).
Scrape with the existing Prometheus config that already covers mana-ai.
- lib/metrics.ts with prom-client Registry and default-metrics prefix
`mana_api_`. Website-specific counters/histograms:
website_publish_total{result=success|slug_taken|invalid|error}
website_publish_duration_seconds (Histogram)
website_submissions_total{result=received|spam|rate_limit|not_found|invalid}
website_host_resolve_total{result=hit|miss|error}
website_domain_verify_total{result=verified|failed}
website_public_reads_total{result=hit|not_found}
website_public_read_age_seconds (Histogram — age of served snapshot)
- Instrument publish.ts, submit.ts, public-routes.ts, domains.ts with
.inc() calls on every code path.
packages/website-blocks:
- New `analytics` block: Plausible + Umami support with self-hosted
script-URL override. Hidden in edit/preview, emits exactly one
<script> in public mode. No cookies, no PII. Registered in block-
registry; 11 blocks total now.
apps/api/scripts/gc-website-assets.ts:
- Read-only scan: walks published_snapshots.blob + submissions.payload
for /api/v1/media/{id}/ references, asks mana-media for items scoped
to app=website, flags orphans older than 30d. Writes report to
/tmp/gc-website-assets-<ts>.json. Deletion toggle is a future commit.
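The core of the read-only scan, reduced to two pure functions. This is a sketch under assumptions: the regex, the `MediaItem` shape, and the function names are illustrative and not taken from gc-website-assets.ts.

```typescript
const ORPHAN_AGE_MS = 30 * 24 * 60 * 60 * 1000; // 30 days

// Assumed shape of what mana-media returns for app=website.
interface MediaItem { id: string; createdAt: Date }

// Collects ids from /api/v1/media/{id}/ references inside serialized blobs.
function collectReferencedIds(blobs: string[]): Set<string> {
  const ids = new Set<string>();
  for (const blob of blobs) {
    const re = new RegExp('/api/v1/media/([^/"\\s]+)/', 'g');
    let match = re.exec(blob);
    while (match !== null) {
      const id = match[1];
      if (id) ids.add(id);
      match = re.exec(blob);
    }
  }
  return ids;
}

// Flags unreferenced items older than 30 days; nothing is deleted.
function findOrphans(items: MediaItem[], referenced: Set<string>, now: Date): string[] {
  return items
    .filter((i) => !referenced.has(i.id) && now.getTime() - i.createdAt.getTime() > ORPHAN_AGE_MS)
    .map((i) => i.id);
}
```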
apps/mana/apps/web:
- RollbackDialog component + PublishBar integration. Closes the M2
debt "Rollback funktioniert" (API + store were there; UI was missing).
- publish.test.ts: snapshot determinism + orphan-drop tests. 4/4 pass.
docs:
- observability/website.md: metric reference, PromQL queries, alert
suggestions, Grafana dashboard pointer.
- plans/website-builder.md: M7 checklist updated (Per-site-stats +
submission-retention explicitly deferred with reason), shipping log
table completed with all M1→M7 commits.
Validation:
- apps/mana/apps/web: pnpm check → 0 errors 0 warnings
- apps/api: tsc --noEmit → clean
- website-blocks tsc → clean
- publish.test.ts → 4/4 pass
Note: validate:all's check:crypto fails on unrelated WIP (wardrobe
module's Dexie tables aren't classified yet in encryption-registry).
Pre-existing failure, not introduced by this commit — the pre-commit
lint-staged run does NOT include check:crypto so it doesn't block.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
M1 of docs/plans/wardrobe-module.md — pure data layer + backend plumbing,
zero UI (that's M2). A user can now hold a digital wardrobe per space:
brand merch, club jerseys, family wardrobe, team costumes, practice
dress code, and personal closet all live as separate pools under the same
Dexie tables, space-scoped like tags/scenes/agents after Phase 2c.
Data model — two tables, no join:
- wardrobeGarments (Dexie v41): single clothing items / accessories.
Indexed on `category` + `createdAt` + `isArchived`. Encrypted:
name/brand/color/size/material/tags/notes. Plaintext: category,
mediaIds, counters, timestamps — all indexed or structural.
`mediaIds[0]` is the primary photo used for try-on; additional
ids are alternate views (back, detail) for M7.
- wardrobeOutfits (Dexie v41): named compositions referencing
garment ids. Encrypted: name/description/tags. Plaintext:
garmentIds (FK array), occasion (closed enum — useful for
undecrypted filtering), season, booleans, lastTryOn snapshot.
- picture.images gains `wardrobeOutfitId?: string | null` as a
plaintext back-reference. Try-on results land in the Picture
gallery like any other generation; the outfit detail view
queries them via this id rather than maintaining a third table.
Space scope:
- `wardrobe` added to all five explicit allowlists in shared-types/
spaces.ts (personal is wildcard, no edit needed). Each space type
gets a one-line comment explaining the real-world use case.
- App registry: `wardrobe` entry in shared-branding/mana-apps.ts
with a rose→fuchsia gradient icon (T-shirt on hanger silhouette),
color #e11d48, tier 'beta', status 'beta'.
- Module registry: wardrobeModuleConfig imported + appended to
MODULE_CONFIGS so SYNC_APP_MAP picks it up automatically.
Backend:
- MAX_REFERENCE_IMAGES bumped 4 → 8 in picture/generate-with-
reference (plus the client-side default in ReferenceImagePicker).
Justified with a comment: face + body + top + bottom + shoes +
outerwear + 2 accessories = 8. Cost doesn't scale with ref count
(OpenAI bills per output), so the bump is a pure capability
expansion with no credit-side risk.
- New POST /api/v1/wardrobe/garments/upload wraps uploadImageToMedia
with app='wardrobe'. Registered under /api/v1/wardrobe in index.ts.
Pattern 1:1 with the profile/me-images/upload endpoint; tier-gating
falls out of wardrobe NOT being in RESOURCE_MODULES (tier='guest'
works — consistent with picture's plain CRUD).
Stores emit domain events (WardrobeGarmentAdded, WardrobeOutfitCreated,
WardrobeOutfitTryOn, etc.) so later mana-ai missions can observe
activity without polling.
No UI in this commit. M2 (Garments-Grundlayer) wires the route + grid
+ upload-zone; M3 the Outfit composer; M4 the Try-On integration.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
compactHistory() now defaults to DEFAULT_COMPACT_MODEL =
'google/gemini-2.5-flash-lite' when the caller doesn't override. Lite
is ~3–5x cheaper than gemini-2.5-flash with near-identical
summarisation quality — summarisation doesn't need the same tier as
reasoning + tool-calling, and the compactor fires exactly when token
spend is highest, so the cheaper route saves exactly where it matters.
CompactHistoryOptions.model is now optional. All three consumers
(mana-ai tick, webapp Companion, webapp Mission runner) drop their
explicit gemini-2.5-flash override and let the default apply.
This is the pragmatic M2.5: no mana-llm changes. The "tier" abstraction
(X-Model-Tier header, env-routed aliases) from the Claude-Code report
makes sense only once multiple utility tasks need cheaper routing —
topic-detection, classification, command-injection checks. Today only
the compactor wants it, and a model constant is the simplest contract
that works.
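A sketch of the "model constant as contract" idea, assuming a small
resolver function. `resolveCompactModel` is hypothetical; only the
constant name and the optional `model` field come from this commit:

```typescript
// DEFAULT_COMPACT_MODEL and the optional `model` field are from the commit
// text; `resolveCompactModel` is a hypothetical helper for illustration.
const DEFAULT_COMPACT_MODEL = 'google/gemini-2.5-flash-lite';

interface CompactHistoryOptions {
  model?: string; // optional: callers may still override
  keepRecent?: number;
}

function resolveCompactModel(opts: CompactHistoryOptions = {}): string {
  return opts.model ?? DEFAULT_COMPACT_MODEL;
}
```

Consumers that omit `model` pick up the cheaper default; passing one
explicitly keeps the old behaviour available.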
2 new tests (default applied + override honoured). 79 shared-ai tests
green, all three consumers type-check clean. One pre-existing unrelated
type error in apps/mana/apps/web/src/lib/modules/wardrobe/queries.ts
(not touched by this commit).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Update the plan doc to match reality:
- Title + intro: "M1 + M2 (core)" instead of just M1.
- Exit criteria: mark the two achievable ones DONE with commit
refs; flag POLICY_MODE=enforce soak as ops-blocked; correct the
parallel-read-speedup criterion that was misformulated (mana-ai
SERVER_TOOLS are all propose-policy, so parallelisation
actually kicks in on the webapp side, covered by 54a12ffd5).
- New M2 section: 5-row status table (M2.1-M2.4 + bonus shipped;
M2.5 Haiku-tier pending).
- M2 config table (MANA_AI_COMPACT_MAX_CTX).
- M2 metrics listed (compactions_triggered_total, compacted_turns).
- Open polish items: allowDestructive still hardcoded to [].
No code changes. Future sessions reading the plan now see the
actual shipped surface instead of a stale M1-only snapshot.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two entries:
- **MCP gateway + Persona-runner — end-to-end live smoke** (🟠)
Covers M1+M1.5+M2+M3 commits. Unit tests verified ~2600 LOC at
the type/shape level, but nothing has ever talked to a real
Postgres + mana-auth + Anthropic. 11-step recipe walks through
seed → tick → verify in psql, including the encryption-on-wire
check (enc:1: prefix in sync_changes, plaintext in web app).
- **Persona visual regression — capture first baselines** (🟡)
Depends on the smoke run above succeeding (empty personas produce
meaningless baselines). Eyeball-check step is explicit — the
first PNG IS the reference, no CI can catch "baseline was wrong".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Flips `meImages` out of USER_LEVEL_TABLES so it lives under the same
tenancy model as every other data table (tags, scenes, tasks, …).
Precursor to the Wardrobe module, which is space-scoped across all
six space types. Keeping meImages user-global would create an
inconsistency: the Wardrobe catalog would be per-space while its
reference input stayed cross-space. It would also carry a latent
privacy leak in shared spaces (agents in a brand-space would see the
owner's entire pool).
Plan: docs/plans/me-images-space-scope-migration.md.
Key decisions:
- Strict scope, no cross-space fallback. Switching into a brand-space
with no uploaded face shows an empty state and links back to
/profile/me-images; it does not quietly reach into the personal-
space pool. Keeps the mental model clean.
- auth.users.image remains pinned to personal-space primary-avatar.
Only a primary change inside personal space triggers the Better
Auth sync; brand/club/family/team/practice primaries stay local.
- Single Dexie v40 upgrade: stamps `spaceId=_personal:<uid>`
sentinel, `authorId=<uid>`, `visibility='space'` on every existing
row and drops the legacy `userId` column. Dexie upgrades block app
startup, so by the time the new code's scopedForModule reads run,
every row is already space-stamped. reconcileSentinels() on the
next active-space bootstrap rewrites `_personal:<uid>` to the real
personal-space id, same path v28 used.
- Legacy-avatar migration (M2.5) now pins its row to
`_personal:<uid>` explicitly — the legacy avatar is the user's
global SSO identity and belongs in the personal space even if the
migration happens to fire while the user is in a brand space.
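The per-row transformation in the v40 upgrade can be sketched as a pure
function. This assumes the sentinel is derived from each row's legacy
`userId`; the row shapes and `stampForV40` are illustrative, and the
Dexie upgrade plumbing is left out:

```typescript
// Hypothetical pre- and post-migration row shapes.
interface LegacyMeImage {
  id: string;
  userId: string;
  kind: string;
}

interface ScopedMeImage {
  id: string;
  kind: string;
  spaceId: string;
  authorId: string;
  visibility: 'space';
}

// Stamp the personal-space sentinel and drop the legacy userId column.
function stampForV40(row: LegacyMeImage): ScopedMeImage {
  const { userId, ...rest } = row;
  return {
    ...rest,
    spaceId: `_personal:${userId}`,
    authorId: userId,
    visibility: 'space',
  };
}
```

The sentinel is only a placeholder: reconcileSentinels() later swaps it
for the real personal-space id.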
Code changes:
- types.ts: LocalMeImage gains spaceId/authorId/visibility (all
optional — stamped by hook). Public MeImage exposes spaceId for
queries that want to branch on space type.
- database.ts: meImages out of USER_LEVEL_TABLES; new v40 upgrade
block that stamps sentinels + drops userId in one pass.
- queries.ts: all four hooks (useAllMeImages, useMeImagesByKind,
useReferenceImages, useImageByPrimary) read via scopedForModule.
Scope-switch triggers automatic re-render via the existing
scopedTable filter path.
- stores/me-images.svelte.ts: setPrimaryInTx uses scopedForModule so
a setPrimary in Brand-space never clears Personal-space's holder.
syncAvatarToAuth gates on activeSpace.type==='personal' so non-
personal primary changes don't leak into Better Auth.
createMeImage accepts optional spaceId override — the legacy-
avatar migration uses it, regular uploads let the hook stamp the
active space.
- migration/legacy-avatar.ts: explicitly passes
spaceId=_personal:<uid> to pin the legacy row into personal space.
- MeImagesView.svelte: subtle badge in the intro card shows the
active space ("Persönlich" for personal, space name otherwise) so
users notice when the pool changes on space switch.
- packages/mana-tool-registry/src/modules/me.ts: me.listReferenceImages
filters pulled rows by row.spaceId === ctx.spaceId. mana-sync
returns all spaces the user belongs to; the tool only wants the
active space's subset.
No schema/index change on meImages (non-indexed fields, pool size
small enough for in-memory scopedTable filter). If perf matters
later, adding [spaceId+kind] is a 5-minute follow-up.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
One focused dashboard covering the M1+M2 instrumentation in a single
view. Sections top-to-bottom:
1. Service Health — mana-mcp + mana-ai up/down, 1h deny rate,
compactions/h. The deny rate is the single most important
number during POLICY_MODE=log-only soak: a non-zero
deny/min in log-only means real traffic that enforce mode
would reject.
2. Policy Gate (mana-mcp)
- Decisions / sec by outcome (allow/deny/flagged)
- Deny reasons breakdown — the soak signal for flipping to
enforce. If one reason dominates, address it before the flip.
- Tool invocations / sec by outcome (success / handler-error /
input-invalid)
- Top 10 invoked tools (24h) — usage heatmap for prioritising
which tools deserve the best policy-hint tuning.
- Handler p50/p95/p99 latency per tool.
3. Reminder Channel (mana-ai)
- Rate by producer (token-budget, retry-loop, compacted)
- Rate by severity. The interesting signal is whether
warn/escalate trend DOWN over time — it means the LLM is
actually reacting to the hints. If warn stays flat, the
producer wording probably isn't landing.
4. Context Compactor (mana-ai)
- Triggers/h cumulative
- Turns folded per compaction (p50/p95). Values < 3 flag
MANA_AI_COMPACT_MAX_CTX misconfig — the threshold is firing
on already-short histories.
5. Mission Runner Baseline — tick duration + planner rounds for
correlation (e.g. "did enabling the compactor change mean
tick duration?").
Dashboard provisioning already auto-loads anything in /var/lib/grafana/
dashboards (docker/grafana/provisioning/dashboards/default.yml), so
this is live after the next grafana restart. Dashboard UID: agent-loop.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
pgEnum() defaults to the public schema. Because
drizzle.config.ts sets schemaFilter: ['auth'], push introspection
never saw the enums and kept re-emitting CREATE TYPE access_tier ...,
failing with 42710. This blocked setup-databases.sh from advancing
mana-auth past the enum declarations and silently masked other drift
(e.g. the new `kind` column on auth.users going un-pushed).
Source side: three enums now live on authSchema via
authSchema.enum(...) instead of pgEnum(...). DB side: migration 006
recreates access_tier / user_role / user_kind inside the auth schema,
repoints auth.users.access_tier and auth.users.role via ::text cast
(preserving all data and defaults), and drops the old public types.
After this, `drizzle-kit push --force` reports "No changes detected"
on a clean DB and the broader `pnpm setup:db` run is green without
workarounds.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Documents the SQL that was applied manually to match the personas.ts
Drizzle schema introduced in 493db0c3b. Idempotent. See
docs/plans/mana-mcp-and-personas.md for the design. Required because
the spaces tables created alongside personas sit outside the auth
schemaFilter, and pre-existing public enums would otherwise trip
drizzle-kit push (resolved separately in migration 006).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the loop on M2: when the compactor fires, the LLM needs to know
it's now seeing a <compact-summary> instead of raw turns so it
doesn't waste a turn asking about lost details or re-executing tools
whose responses are gone.
shared-ai:
- LoopState grows `compactionsDone: number` (capped at 1 by the current
loop policy, but kept as a count for future multi-compact cycles).
- runPlannerLoop populates it on each reminder-channel call. New
loop test asserts [0, 1] sequence: round 1 before compaction,
round 2 after.
mana-ai:
- New producer `compactedReminder` — fires severity=info when
compactionsDone >= 1, wrapped in a German one-liner ("frag nicht
nach verlorenen Details").
- Injected FIRST in buildReminderChannel so the LLM frames the rest
of the round with "I'm looking at a summary" context. Metric
surface stays `{producer='compacted', severity='info'}`.
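A sketch of the producer and its ordering, assuming simplified types.
The reminder text below is an English placeholder (the real producer
uses a German one-liner), and the `Producer` signature is hypothetical:

```typescript
type Severity = 'info' | 'warn' | 'escalate';

interface Reminder {
  producer: string;
  severity: Severity;
  text: string;
}

interface CompactAwareState {
  round: number;
  compactionsDone: number;
}

type Producer = (state: CompactAwareState) => Reminder | null;

// Fires once the history has been folded at least once.
const compactedReminder: Producer = (state) =>
  state.compactionsDone >= 1
    ? {
        producer: 'compacted',
        severity: 'info',
        // Placeholder wording, not the shipped text.
        text: 'History was compacted into a summary; do not ask for lost details.',
      }
    : null;

// The compacted producer runs first so later reminders are read in that light.
function buildReminderChannel(state: CompactAwareState, others: Producer[]): Reminder[] {
  return [compactedReminder, ...others]
    .map((produce) => produce(state))
    .filter((r): r is Reminder => r !== null);
}
```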
4 new reminder tests (3 pure producer + 1 composition-ordering) +
1 loop-wiring test. 77 shared-ai, 20 reminders.test.ts — green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
push_schema used to print "Failed (may not have db:push script)" for
every non-zero exit, lumping real failures (stuck rename prompts,
pre-existing public enums) in with missing scripts. Now it prints the
real exit code and tails the last 5 lines of drizzle-kit output so the
root cause is visible without re-running by hand.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Before: guests had to open the user-menu dropdown to find the login
button. Now the login CTA renders as a visible primary pill immediately
right of the (icon-only) user-menu trigger, so signing in is one click.
Removed the duplicate Anmelden entry from userMenuBarItems — theme,
mode toggle, and language stay in the bar for signed-out users.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two loose ends from M3/M4:
1. Tool_use_id-based error attribution in the persona-runner
-----------------------------------------------------------
The previous collectActionsFromMessage() flipped the *most recent*
ActionRow to 'error' when a tool_result carried is_error:true. That was
fine as long as Claude invoked tools strictly in sequence, but when
the planner pipelines multiple tools in one turn, a later tool_result
carries an earlier tool_use_id — the last-action fallback mis-
attributes the error.
runMainTurn() now keeps a tool_use_id → action-index Map for the
duration of the tick. On tool_use we stash block.id, on tool_result we
look up the exact ActionRow via tool_use_id and flip that one. The
"flip last" path survives as a pure fallback if a future SDK ever
ships a block without an id.
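The attribution logic can be sketched as follows. The block and row
shapes are simplified stand-ins, and `collectActions` is a hypothetical
reduction of what runMainTurn() does, not its actual code:

```typescript
// Hypothetical, simplified block and row shapes for illustration.
type Block =
  | { type: 'tool_use'; id?: string; name: string }
  | { type: 'tool_result'; tool_use_id?: string; is_error?: boolean };

interface ActionRow {
  tool: string;
  status: 'ok' | 'error';
}

function collectActions(blocks: Block[]): ActionRow[] {
  const actions: ActionRow[] = [];
  const indexByToolUseId = new Map<string, number>();

  for (const block of blocks) {
    if (block.type === 'tool_use') {
      actions.push({ tool: block.name, status: 'ok' });
      if (block.id) indexByToolUseId.set(block.id, actions.length - 1);
    } else if (block.is_error) {
      // Exact attribution via tool_use_id; "flip last" only as a fallback.
      const idx =
        (block.tool_use_id !== undefined
          ? indexByToolUseId.get(block.tool_use_id)
          : undefined) ?? actions.length - 1;
      const row = actions[idx];
      if (row) row.status = 'error';
    }
  }
  return actions;
}
```

With pipelined tools, an error result for the first call that arrives
after the second call's result now flips the first row, not the last.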
2. New audit:encrypted-tools script
-----------------------------------
scripts/audit-encrypted-tools.ts — loads registerAllModules() and
apps/mana/…/crypto/registry.ts, diffs every ToolSpec.encryptedFields
against the authoritative web-app ENCRYPTION_REGISTRY.
Catches three classes of drift:
- missing-table : tool declares a table the web-app doesn't encrypt
- field-drift : both agree a table is encrypted but the field lists
differ (half-encryption on the wire is silent death)
- disabled : web-app has enabled:false while the tool still
encrypts — advisory warning, not a fail
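A sketch of how the three drift classes could be told apart for one
table. The `RegistryEntry` shape, the `auditTable` name, and the order
of the checks are assumptions; the real script may differ:

```typescript
type Finding =
  | { kind: 'missing-table'; table: string }
  | { kind: 'field-drift'; table: string; toolOnly: string[]; registryOnly: string[] }
  | { kind: 'disabled'; table: string };

// Hypothetical shape of one web-app registry entry.
interface RegistryEntry {
  enabled: boolean;
  fields: string[];
}

function auditTable(
  table: string,
  toolFields: string[],
  registry: Record<string, RegistryEntry>,
): Finding | null {
  const entry = registry[table];
  if (!entry) return { kind: 'missing-table', table };
  if (!entry.enabled) return { kind: 'disabled', table }; // advisory only
  const toolOnly = toolFields.filter((f) => !entry.fields.includes(f));
  const registryOnly = entry.fields.filter((f) => !toolFields.includes(f));
  return toolOnly.length || registryOnly.length
    ? { kind: 'field-drift', table, toolOnly, registryOnly }
    : null;
}
```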
Negative-tested by injecting a deliberate drift on todo.create +
todo.list (shortened ENCRYPTED_FIELDS to ['title']); the auditor
flagged both tools with full field diffs, and restoring the original
field list returned the audit to green.
Wired into `pnpm run validate:all` so the contract survives future
edits on either side. Fills the M4 audit gap noted in
project_mana_mcp_personas.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Symmetrical to 83a4606a9 which wired the compactor into mana-ai. Both
webapp consumers of runPlannerLoop (Companion chat engine, Mission
runner) now pass a compactor that folds the middle of messages into
a <compact-summary> when cumulative token usage hits 92% of
maxContextTokens.
COMPACT_MAX_CTX is a module constant — gemini-2.5-flash's 1M-token
ceiling — not env-wired. Vite builds for the browser and PUBLIC_*
flags are the wrong tool for a value that only matters to the loop
runtime; changing the model means changing the constant alongside the
model reference anyway.
Uses the same LlmClient + model as the planner's own calls. A cheaper
compactor-tier model (Haiku) is the optional M2.5 follow-up and does
not require changing this wiring — only the compactHistory `opts.model`
gets swapped.
Type-check clean (svelte-check 0 errors 0 warnings across 7389 files).
All 31 companion + mission tests green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SvelteKit hook + new DB table + founder-gated API + UI section. Ships
the code path for public-site routing on {slug}.mana.how and custom
hostnames. Cloudflare SaaS Hostnames integration is stubbed — see
plan §M6 "Offene Enden".
apps/api/src/modules/website:
- schema.ts: new `customDomains` table. Fields: id, site_id, hostname
(unique), status (pending | verifying | verified | failed),
verification_token, dns_target, verified_at.
- drizzle/website/0002_custom_domains.sql: manual migration with
partial unique index on (hostname) WHERE status='verified'.
- domains.ts (new, authenticated + founder-gated via
`requireTier('founder')`): POST/GET/DELETE /sites/:id/domains,
POST /sites/:id/domains/:domainId/verify. Verify runs CNAME + TXT
checks via node:dns/promises with an apex-domain A-record fallback.
Reserved-hostname list prevents users from binding mana.how subdomains.
- public-routes.ts: new GET /public/resolve-host?host= — unauthenticated
resolver used by hooks.server.ts. Returns { slug, siteId } only for
verified bindings tied to a currently-published site.
apps/mana/apps/web/src/hooks.server.ts:
- After the existing https/app-subdomain guards, a new
`resolveWebsiteRewrite()` step rewrites `event.url.pathname`:
{slug}.mana.how/path → /s/{slug}/path (pure string)
custom-host.com/path → /s/{resolved}/path (API call, 60s LRU)
- Browser URL stays on the custom host — this is a server-side rewrite,
not a 302. APP_SUBDOMAINS + RESERVED_WEBSITE_SUBDOMAINS win over
website routing. Localhost and apex mana.how are skipped.
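The pure-string half of the rewrite can be sketched like this. The two
sets contain placeholder values only (the real lists are not reproduced
here), and `rewriteSlugHost` is a hypothetical name; the custom-host
path with its cached API lookup is left out:

```typescript
// Placeholder sets; the real APP_SUBDOMAINS / RESERVED_WEBSITE_SUBDOMAINS
// live in the web app and are not reproduced here.
const APP_SUBDOMAINS = new Set(['app']);
const RESERVED_WEBSITE_SUBDOMAINS = new Set(['www', 'api']);
const ROOT = 'mana.how';

// Returns the rewritten pathname, or null when the request is left alone.
function rewriteSlugHost(host: string, pathname: string): string | null {
  const hostname = (host.split(':')[0] ?? '').toLowerCase(); // strip port
  if (hostname === 'localhost' || hostname === ROOT) return null;
  if (!hostname.endsWith(`.${ROOT}`)) return null; // custom host: not handled here
  const slug = hostname.slice(0, -(ROOT.length + 1));
  if (slug.includes('.')) return null; // only one label deep
  if (APP_SUBDOMAINS.has(slug) || RESERVED_WEBSITE_SUBDOMAINS.has(slug)) return null;
  return `/s/${slug}${pathname}`;
}
```

Since only the pathname changes, the browser keeps showing the original
host, which is what distinguishes this from a redirect.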
apps/mana/apps/web/src/lib/modules/website:
- domains.ts (new): typed client for list/add/verify/remove. Handles
200 + expected 400 (verification-failed) separately.
- components/DomainsSection.svelte: add-input, per-domain status pill,
DNS-instructions box (CNAME + TXT with copy-to-clipboard), Verify
button. Mounted inside SiteSettingsDialog as its own section — the
existing theme/footer controls stay put.
docs/plans/website-builder.md:
- M6 checklist updated with what shipped vs. ops-gap (CF SaaS).
- `mana-landing-builder` consolidation: DECIDED to keep parallel. Four
reasons in the plan. Revisit-criterion stated.
- Shipping log table seeded with M1→M6 commits.
Validation:
- pnpm run validate:all: 6/6 gates green
- pnpm run check (web): 0 errors, 0 warnings
- apps/api type-check: green
Apply schema with:
psql "$DATABASE_URL" -f apps/api/drizzle/website/0002_custom_domains.sql
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Claude-Code wU2 pattern goes live. Every mission run now passes a
compactor into runPlannerLoop that will fire once if cumulative token
usage crosses 92% of MANA_AI_COMPACT_MAX_CTX (default 1_000_000, the
gemini-2.5-flash ceiling). Override via env for deployments on smaller
models; set to 0 to disable entirely.
The compactor reuses the planner's own LlmClient + gemini-2.5-flash
model for now. When mana-llm grows a Haiku tier we'll route the
compactor there — it's pure summarisation and a cheaper model saves
tokens exactly where they matter.
New metrics:
- mana_ai_compactions_triggered_total — counter, one per firing
- mana_ai_compacted_turns — histogram, how many middle turns got
folded each time (< 3 ⇒ maxCtx is probably misconfigured)
Logs print a 60-char tail of the summary.goal so the "what was this
mission doing again" question survives a compaction.
No new tests here — compactHistory and the loop wiring are already
covered by the 22 tests in shared-ai (M2.1 + M2.2). The 57 existing
mana-ai bun tests stay green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Superseded by the top-level docker-compose.dev.yml (which defines
searxng + redis as part of the unified dev stack via `pnpm docker:up`).
This per-service file was an artefact from before the unified setup
and no script / doc / README still references it.
An orphan `mana-searxng-dev` + `mana-search-redis-dev` had been running
from this file for ~2 weeks, squatting on the host's port 8080. Every
first `pnpm dev:mana:all` after a cold machine start would fail with
Bind for 0.0.0.0:8080 failed: port is already allocated
because the top-level compose's `mana-searxng` service couldn't take
8080 while the orphan held it. The second invocation silently
"worked" — docker saw the freshly-created mana-searxng container and
skipped the bind step on the idempotent up, leaving it healthy but
only reachable inside the docker network (8080/tcp, no external
publish).
Cleanup already done out-of-band:
docker compose -f services/mana-search/docker-compose.dev.yml down
docker compose -f docker-compose.dev.yml up -d --force-recreate searxng
Deleting the file so a stale `docker compose -f …/mana-search/dev.yml up`
can't resurrect the orphan.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PlannerLoopInput grows an optional compactor:
compactor?: {
maxContextTokens: number;
threshold?: number; // default 0.92, matches Claude Code wU2
compact: (messages) => Promise<{ messages, compactedTurns }>;
}
Before each LLM call the loop checks whether cumulative prompt +
completion token usage has crossed threshold × maxContextTokens. If so,
and we haven't compacted this run yet, the callback runs, its returned
messages REPLACE the live history, and compactionsDone flips to 1 so a
runaway tool can't re-trigger.
Design choices:
- Fires at most ONCE per loop run. If the fresh (compacted)
history hits the threshold again in the same run, the LLM
round budget will hit first; better to terminate than to
recursively compact a summary.
- No reminder emitted automatically — the caller can wire
that via reminderChannel by reading compactionsDone from
LoopState (next PR; compactionsDone isn't exposed yet to
keep the state surface small).
- compactor callback is injectable, not hardcoded to
compactHistory() from compact.ts. Lets mana-ai route the
compactor LLM call to a cheaper model (Haiku) without
changing the loop.
- Zero maxContextTokens → skip silently (same contract as
shouldCompact()).
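The design choices above can be sketched as one pre-call check. The
`RunState` shape and the `maybeCompact` name are hypothetical; the
compactor fields mirror the option block at the top of this message:

```typescript
interface LoopMessage {
  role: string;
  content: string;
}

interface Compactor {
  maxContextTokens: number;
  threshold?: number; // default 0.92
  compact: (
    messages: LoopMessage[],
  ) => Promise<{ messages: LoopMessage[]; compactedTurns: number }>;
}

// Hypothetical slice of the loop's internal state.
interface RunState {
  messages: LoopMessage[];
  tokensUsed: number;
  compactionsDone: number;
}

// Called before each LLM call; mutates state only when compaction happens.
async function maybeCompact(state: RunState, compactor?: Compactor): Promise<boolean> {
  if (!compactor || compactor.maxContextTokens <= 0) return false; // zero cap: skip
  if (state.compactionsDone >= 1) return false; // at most once per run
  const limit = (compactor.threshold ?? 0.92) * compactor.maxContextTokens;
  if (state.tokensUsed < limit) return false;
  const result = await compactor.compact(state.messages);
  if (result.compactedTurns === 0) return false; // nothing folded: no-op
  state.messages = result.messages;
  state.compactionsDone = 1;
  return true;
}
```

Keeping `compact` as a callback is what lets a caller point it at a
cheaper model without touching this check.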
Also cleaned up the isParallelSafe non-null-assertion warning by
hoisting the predicate to a local with proper narrowing.
5 new loop tests: below-threshold no-op, single-fire replacement,
once-per-run idempotency, zero-cap bail, no-op when compactor
returns 0 turns. 76 shared-ai tests total, green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Revises the wardrobe plan's space-scope decision from "only personal"
to the full matrix — brand has merch, clubs have Trikots, families
have shared kids' wardrobes, teams have costumes/uniforms, practices
have Dresscode items. All six space types get wardrobe in the
allowlist; garments + outfits are stamped with spaceId/authorId/
visibility like tags/scenes/agents (post Phase 2c).
Adds a sixth decision block (Space-scoped catalog, user-scoped
Try-On subject): the catalog lives in its space, but Try-On
references are always the *calling user's* meImages — one human,
one identity, brought into every space. A brand team member trying
on merch sees themselves wearing it; a club member trying on a
Trikot sees themselves wearing it; natural and correct.
The single edge case is family spaces where a parent might want
"try on kid's shirt" — the plan punts that explicitly. The
catalog side (adding items, composing outfits) works unrestricted;
Try-On shows a hint that it renders the calling user. If real
demand shows up later, a separate plan can introduce per-space
subject references (spaceMembers[].faceMediaId or similar); this plan
does not speculate on that today.
Membership gating falls out of the existing scopedForModule/
mana-sync-RLS stack; no extra code in wardrobe.
M1 checklist updated: wardrobe is NOT in USER_LEVEL_TABLES, queries
go through scopedForModule, allowlist entry covers all six types.
M4 checklist gains the "in non-personal spaces show the subject
hint" item.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Claude-Code wU2 pattern: when token usage hits ~92% of the provider's
context budget, fold all pre-tail turns into a single structured summary
(Goal / Decisions / Tools Called / Current Progress) so subsequent
rounds see a synopsis instead of the raw log.
This commit ships ONLY the primitive. Wiring it into runPlannerLoop
(auto-trigger before the next LLM call when shouldCompact() fires)
is M2.2 so the surface stays small and testable.
New exports from @mana/shared-ai:
- shouldCompact(totalTokens, maxContextTokens, threshold?)
→ boolean; DEFAULT_COMPACT_THRESHOLD = 0.92, matching Claude Code.
Bails safely when maxContextTokens is missing (local models often
don't report usage).
- compactHistory(messages, { llm, model, keepRecent?, temperature? })
→ { messages, summary, compactedTurns, usage? }
Preserves: [0]=system, [1]=first user, [last N]=recent turns
(default 4). Everything between gets sent through the compact
agent with COMPACT_SYSTEM_PROMPT — a fixed 4-section Markdown
schema. Temperature default 0.2 because we want summarisation,
not creativity.
- parseCompactSummary / renderCompactSummary — round-trip helpers.
Parser is tolerant (missing sections → empty string) so a partial
compaction still produces a usable summary.
The summary replaces the middle as a single role='assistant' message
wrapped in <compact-summary> tags. Assistant role (not system) because
some providers reject arbitrary system messages deep in history.
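A sketch of the trigger and the structural fold, leaving out the LLM
call that produces the summary. `foldMiddle` is a hypothetical helper
that takes an already-rendered summary string; the shipped
compactHistory() has a richer signature:

```typescript
const DEFAULT_COMPACT_THRESHOLD = 0.92;

function shouldCompact(
  totalTokens: number,
  maxContextTokens: number | undefined,
  threshold: number = DEFAULT_COMPACT_THRESHOLD,
): boolean {
  if (!maxContextTokens || maxContextTokens <= 0) return false; // bail safely
  return totalTokens >= threshold * maxContextTokens;
}

interface ChatMessage {
  role: 'system' | 'user' | 'assistant' | 'tool';
  content: string;
}

// Keep [0]=system, [1]=first user and the last `keepRecent` turns;
// everything in between is replaced by one assistant summary message.
function foldMiddle(messages: ChatMessage[], summary: string, keepRecent = 4): ChatMessage[] {
  const tailStart = Math.max(2, messages.length - keepRecent);
  if (tailStart <= 2) return messages; // nothing between anchors and tail
  return [
    ...messages.slice(0, 2),
    { role: 'assistant', content: `<compact-summary>\n${summary}\n</compact-summary>` },
    ...messages.slice(tailStart),
  ];
}
```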
Tests: 17 new across the 4 exports (trigger logic, Markdown round-trip,
structural preservation of anchors + tail, usage passthrough, custom
keepRecent). All 71 shared-ai tests green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two things:
1. AI tools (9) in the website module — writes go through the standard
proposal flow, reads run auto during planning.
- shared-ai/src/tools/schemas.ts: AI_TOOL_CATALOG entries with
defaultPolicy propose/auto.
- webapp modules/website/tools.ts: execute functions wired to the
existing stores. ModuleTool[] registered in data/tools/init.ts.
- Propose: create_website, apply_website_template, create_website_page,
add_website_block, update_website_block, publish_website
- Auto: list_websites, list_website_pages, list_website_blocks
Server-side mana-tool-registry integration (mana-mcp, mana-ai) is
an M5.x follow-up; the webapp flow unblocks the missions-based use case.
2. Starter templates — clone into a fresh site with new UUIDs.
- templates/types.ts: SiteTemplate shape with localId / parentLocalId
so container→child references survive the clone.
- 4 templates: portfolio (4 pages), personal-linktree (1 page, 6 CTAs),
event (3 pages incl. RSVP form), blank (1 empty page). Deferred:
smb-corporate + product-landing (need team/pricing/testimonials
blocks, M6+).
- sitesStore.applyTemplate: walks template, bulk-inserts new rows,
remaps parent refs. Sets navConfig items from template pages.
- TemplatePicker component + /website/new route. Replaces the old
quick-create modal; ListView now links to /new. AppRegistry
context-menu action points there too.
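The localId / parentLocalId remapping can be sketched as a two-pass
clone. The row shapes and `cloneBlocks` are illustrative only, and the
id generator is injected here so the example stays deterministic:

```typescript
// Hypothetical, simplified template and row shapes.
interface TemplateBlock {
  localId: string;
  parentLocalId?: string;
  type: string;
}

interface BlockRow {
  id: string;
  parentId: string | null;
  type: string;
}

// `newId` is injected for testability; a real caller would likely
// pass a UUID generator.
function cloneBlocks(blocks: TemplateBlock[], newId: () => string): BlockRow[] {
  // Pass 1: allocate a fresh id for every template-local id.
  const idByLocal = new Map<string, string>();
  for (const b of blocks) idByLocal.set(b.localId, newId());

  // Pass 2: emit rows with parent references remapped to the new ids.
  return blocks.map((b) => ({
    id: idByLocal.get(b.localId) as string,
    parentId: b.parentLocalId ? idByLocal.get(b.parentLocalId) ?? null : null,
    type: b.type,
  }));
}
```

Allocating all ids before emitting rows means a child may appear before
its container in the template without breaking the reference.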
AiProposalInbox integration deferred — the component doesn't exist in
the webapp yet (the plan mentions it aspirationally). defaultPolicy
'propose' is already set so writes stage correctly once the UI catches
up.
Validation:
- pnpm run validate:all: 6/6 gates green
- pnpm run check (web): 0 errors, 0 warnings
- apps/api + packages/shared-ai type-check: green
Plan: docs/plans/website-builder.md (M5 shipped)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Producers now return structured {producer, severity, text} objects
instead of raw strings. buildReminderChannel collects them, increments
mana_ai_reminders_emitted_total{producer, severity} per emission, and
maps back to strings for the shared-ai loop input.
Why structured: the Prometheus label "severity" lets dashboards split
75-99% token-budget warnings (severity=warn) from 100%+ escalations
(severity=escalate) without NLP on the reminder text. Adding a new
producer that emits only info-level state (e.g. stale-sync warning)
falls out for free.
Active producer labels today:
- token-budget (warn, escalate)
- retry-loop (warn)
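A sketch of the structured shape and the per-emission count, assuming
an in-memory map as a stand-in for the Prometheus counter. The function
names and reminder wording are hypothetical; the percentage bands come
from the text above:

```typescript
type BudgetSeverity = 'warn' | 'escalate';

interface StructuredReminder {
  producer: string;
  severity: BudgetSeverity;
  text: string;
}

// Thresholds follow the commit text: 75-99% warns, 100%+ escalates.
function tokenBudgetReminder(used: number, budget: number): StructuredReminder | null {
  if (budget <= 0) return null;
  const pct = Math.floor((used * 100) / budget);
  if (pct >= 100) {
    return { producer: 'token-budget', severity: 'escalate', text: `Budget exhausted (${pct}%).` };
  }
  if (pct >= 75) {
    return { producer: 'token-budget', severity: 'warn', text: `Budget at ${pct}%.` };
  }
  return null;
}

// Stand-in for a counter labelled by {producer, severity}.
const emittedTotal = new Map<string, number>();

function emitReminders(reminders: StructuredReminder[]): string[] {
  for (const r of reminders) {
    const key = `${r.producer}|${r.severity}`;
    emittedTotal.set(key, (emittedTotal.get(key) ?? 0) + 1);
  }
  return reminders.map((r) => r.text); // the loop still receives plain strings
}
```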
With this plus the scrape job (d087b4744), we can finally answer:
"does the budget warning actually change LLM behaviour?" — correlate
reminders_emitted_total{producer='token-budget'} with
tick_duration_seconds or planner_rounds_histogram.
3 tests updated to assert the new {producer, severity, text} shape
(16 reminder tests total, all green).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two plan updates as a set:
- me-images-and-reference-generation.md: rewrites the "Status" block
to reflect what actually shipped (M1 89258eb45, M2 a64a7e39c, M2.5
e2b5ac38c, M3 in 38dc80654, M4 in d087b4744, M5 fc635f983) and
adds an "Offen" section listing the small follow-ups that didn't
make the M1-M5 cut — global aiUsesReferenceImages kill-switch,
kind-editor on existing tiles, reference-display in picture
detail view, legacy-avatar re-upload hint — plus the three
optional later tracks (M6 local FLUX+PuLID, M7 inpainting masks,
M8 zero-knowledge blobs). Milestones checklist is now
✅-annotated per shipped item with actual decisions (Dexie v38
instead of v27, no me-storage bucket after all, generation_log
deferred, etc.).
- wardrobe-module.md: new plan. Data layer sketch (two tables:
wardrobeGarments + wardrobeOutfits, reuses me-images + picture
as dependencies), UI breakdown (/wardrobe, /wardrobe/compose,
garment + outfit detail routes), Try-On as a thin wrapper over
the M3 endpoint (with the cap bumped from 4 → 8 references, so
face + body + up-to-6 garments fits one call), four MCP tools
in a new wardrobe.ts module, and two optional later tracks
(Persona Stil-Coach template, context-driven outfit suggestion
mission). The explicit non-goals block keeps the scope tight:
no product DB, no replacement for inventory, no shopping, no
style-coaching that feels judgmental.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three edge-level fixes applied live to the Mac Mini today, now
committed so the canonical state matches:
1. apps/mana/apps/web/Dockerfile: add COPY for @mana/shared-crypto
(added recently as a workspace dep but the Dockerfile missed it,
so pnpm install failed with ERR_PNPM_WORKSPACE_PKG_NOT_FOUND on
every rebuild — same class as the shared-types / shared-ai /
shared-rss fixes earlier today).
2. docker-compose.macmini.yml (mana-web service): set
PUBLIC_MANA_RESEARCH_URL + PUBLIC_MANA_RESEARCH_URL_CLIENT. Without
this pair the SSR-injected window.__PUBLIC_MANA_RESEARCH_URL__ was
empty and research fetches 404'd against the current origin.
3. docker-compose.macmini.yml (umami service): pin image to
postgresql-v2.18.0. The rolling `postgresql-latest` tag jumped to
Umami 3.1.0 (Next.js 16) which crashed the container on every
POST /api/send; page loads in the browser hung for up to 10s on the
failing tracker request. v2.18.0 is the last known-stable v2;
DB schema is still v2-compatible so the downgrade is clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extends LoopState with a sliding window of the last N ExecutedCalls
(oldest-first), capped at LOOP_STATE_RECENT_CALLS_WINDOW = 5. The loop
maintains the window automatically; reminderChannel producers read it
without touching internal state.
This activates retryLoopReminder which was shape-only in faa472be9.
The guard now fires end-to-end: when round >= 3 and the tail-2 calls
both returned success:false, the LLM sees a "stop retrying, write a
summary instead" <reminder> on the next turn. Checking only the last
two calls, rather than the whole window, is deliberate: a flaky run
with intermittent success (F, F, F, OK, F) is not a retry loop, just
flaky tools.
Why window=5: retry loops usually manifest within 2-3 consecutive
rounds; a 5-deep window gives room for burst-detection and
stale-tool heuristics without bloating the reminder channel. Cap
keeps the reminder producers O(5) regardless of loop length.
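A sketch of the window maintenance and the tail-2 check. The
`ExecutedCall` shape is reduced to two fields, and both function names
are hypothetical; the constant and the round >= 3 rule come from the
text above:

```typescript
const LOOP_STATE_RECENT_CALLS_WINDOW = 5;

// Simplified stand-in for the real ExecutedCall type.
interface ExecutedCall {
  tool: string;
  success: boolean;
}

// Append a call and keep only the newest N entries, oldest first.
function pushRecentCall(recent: ExecutedCall[], call: ExecutedCall): ExecutedCall[] {
  return [...recent, call].slice(-LOOP_STATE_RECENT_CALLS_WINDOW);
}

// Tail-2 check: only two consecutive failures at the end count as a retry loop.
function looksLikeRetryLoop(round: number, recentCalls: ExecutedCall[]): boolean {
  if (round < 3 || recentCalls.length < 2) return false;
  return recentCalls.slice(-2).every((c) => !c.success);
}
```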
Tests: 3 new (sliding-window cap + slide + order in shared-ai, retry
composition + budget+retry chain + tail-only heuristic in mana-ai).
Total agent-loop tests now 74 across both packages.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes M5 of docs/plans/me-images-and-reference-generation.md —
exposes the meImages feature through the shared tool-registry so MCP
clients (Claude Desktop) and the mana-ai mission runner can drive it
alongside the built-in webapp UI.
Two tools in packages/mana-tool-registry/src/modules/me.ts:
- me.listReferenceImages(kind?) — scope: user-space, read. Pulls the
user's meImages rows from mana-sync (app='profile'), filters to
usage.aiReference=true and soft-live records, decrypts the `label`
and `tags` fields with the caller's master key (same pattern as
notes.search). Returns mediaIds + kind + primary-slot info so a
persona can pick references intelligently. ZK users will see this
fail at getMasterKey() — correct, because the label is truly
unrecoverable server-side for them.
- me.generateWithReference({prompt, referenceMediaIds, quality,
size, n}) — scope: user-space, write. Thin proxy over the M3
endpoint POST /api/v1/picture/generate-with-reference in apps/api:
forwards the JWT, lets apps/api re-verify ownership, and returns
the generated images' mediaIds + URLs. Credits are consumed at
the same 3/10/25 tariff as text-to-image, so a persona plan pass
should gate this behind explicit budget rather than leaving it on
auto-policy.
Registered in modules/index.ts + adds 'me' to the ModuleId union in
types.ts. No other wiring needed — mana-mcp's createMcpServerForUser
iterates the registry and exposes any user-space tool, so both tools
become available to Claude Desktop immediately on next deploy.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Hard follow-up to M1's soft Dexie schema landing (plan
docs/plans/me-images-and-reference-generation.md). After this commit
the source of truth for the avatar is meImages(primaryFor='avatar');
auth.users.image becomes a derived mirror that gets pushed back to
Better Auth whenever the primary changes.
Changes:
- New migration/legacy-avatar.ts: one-shot, idempotent bootstrap. On
first visit to /profile/me-images it reads profile.image via
profileService.getProfile() and writes a single meImage with
kind='face', primaryFor='avatar', usage.aiReference=false. The
mediaId is a sentinel `legacy-avatar:<uid>` — the original bytes
never went through mana-media, so verifyMediaOwnership (M3) will
naturally bounce if the user ever flips aiReference on without
re-uploading. Guarded per user via localStorage +
existing-avatar-holder check so reruns are no-ops.
- Store avatar autosync: setPrimary and deleteMeImage now push
meImages(primaryFor='avatar').publicUrl back to
profileService.updateProfile({ image }). The avatar slot is
coupled to face-ref — setting a new face-ref primary also claims
the avatar on the same row, so users don't need a second UI
control to keep their profile picture fresh. Failures are logged
but swallowed; meImages stays authoritative for in-app rendering.
- MeImagesView triggers the migration once on mount.
- EditProfileModal replaces the broken inline avatar upload (the old
POST /api/v1/storage/avatar/upload endpoint never existed in the
unified API) with a read-only preview + a button that closes the
modal and navigates to /profile/me-images. Name + email flows are
untouched.
- profileService.uploadAvatar + AvatarUploadResponse + its test are
deleted (no callers left after the modal rewrite).
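Sketch of the two idempotency guards from the migration bullet above. The row shape is trimmed to the fields named in this message, and the flag key, the `FlagStore` interface and the function signature are assumptions for illustration; the real code talks to Dexie, localStorage and profileService.

```typescript
// Hypothetical minimal row; the real meImages record has more fields.
interface MeImage {
  mediaId: string;
  kind: 'face';
  primaryFor: 'avatar' | null;
  usage: { aiReference: boolean };
  publicUrl: string;
}

// Subset of the localStorage API, so the sketch runs outside a browser.
interface FlagStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Returns true only when a legacy row was actually written.
function migrateLegacyAvatar(
  uid: string,
  profileImage: string | null,
  rows: MeImage[],
  flags: FlagStore,
): boolean {
  const flagKey = `legacy-avatar-migrated:${uid}`; // assumed key name
  if (flags.getItem(flagKey) === '1') return false; // guard 1: per-user flag
  const hasHolder = rows.some((row) => row.primaryFor === 'avatar'); // guard 2
  const shouldWrite = profileImage !== null && !hasHolder;
  if (shouldWrite) {
    rows.push({
      mediaId: `legacy-avatar:${uid}`, // sentinel, never a real mana-media id
      kind: 'face',
      primaryFor: 'avatar',
      usage: { aiReference: false },
      publicUrl: profileImage,
    });
  }
  flags.setItem(flagKey, '1');
  return shouldWrite;
}
```

Either guard alone would make a rerun a no-op; having both covers a cleared localStorage as well as a second device that already synced the row.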
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds two new block types and the server-side infrastructure for
untrusted input + cross-module data embedding.
Forms:
- packages/website-blocks/src/form: declarative fields (text, email,
tel, url, textarea, number) with required / maxLength / placeholder
per field. Honeypot hidden input in the renderer; public-mode POST
to a same-origin SvelteKit proxy that forwards to mana-api.
- apps/api: website.submissions table (schema.ts + 0001_submissions.sql)
+ POST /public/submit/:siteSlug/:blockId. Loads the current published
snapshot, finds the form block, validates payload against its
declared fields (trim, type check, length cap), rejects honeypot
submissions silently, rate-limits per IP (10 / 5 min) in-memory.
Unknown keys are dropped — clients can only submit declared fields.
- Owner-facing: GET/DELETE /sites/:id/submissions + SubmissionsView
component + /(app)/website/[siteId]/submissions route. Shows
incoming submissions with status pill + payload preview + delete.
- apps/mana/.../routes/s/[siteSlug]/__submit/[blockId]/+server.ts:
same-origin proxy so form posts don't trigger CORS and IP / user-agent
headers are forwarded via SvelteKit's trusted getClientAddress.
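Sketch of the server-side payload check described in the Forms list (trim, type check, length cap, unknown keys dropped). The `FieldDef` shape and the `validateSubmission` name are assumptions; the real schema lives in packages/website-blocks/src/form. Only `number` and `email` get a type check here, and the honeypot check is left out.

```typescript
type FieldType = 'text' | 'email' | 'tel' | 'url' | 'textarea' | 'number';

// Hypothetical declared-field shape.
interface FieldDef {
  name: string;
  type: FieldType;
  required?: boolean;
  maxLength?: number;
}

type Result =
  | { ok: true; values: Record<string, string | number> }
  | { ok: false; error: string };

function validateSubmission(fields: FieldDef[], payload: Record<string, unknown>): Result {
  const values: Record<string, string | number> = {};
  // Iterating declared fields only is what drops unknown keys.
  for (const field of fields) {
    const raw = payload[field.name];
    const text =
      typeof raw === 'string' ? raw.trim() : typeof raw === 'number' ? String(raw) : '';
    if (text === '') {
      if (field.required) return { ok: false, error: `${field.name} is required` };
      continue;
    }
    if (field.maxLength !== undefined && text.length > field.maxLength) {
      return { ok: false, error: `${field.name} is too long` };
    }
    if (field.type === 'number') {
      const n = Number(text);
      if (!Number.isFinite(n)) return { ok: false, error: `${field.name} must be a number` };
      values[field.name] = n;
    } else if (field.type === 'email' && !/^[^\s@]+@[^\s@]+$/.test(text)) {
      return { ok: false, error: `${field.name} must be an email address` };
    } else {
      values[field.name] = text;
    }
  }
  return { ok: true, values };
}
```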
M4 first-pass does NOT wire target-module delivery (contacts / notify).
Submissions stay in the inbox until owner-side tool handlers land
(M4.x). `target` enum is intentionally `['inbox']` only for now.
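Sketch of an in-memory per-IP limiter matching the "10 / 5 min" figure from the Forms list. The message does not say whether the real limiter uses a fixed or a sliding window; this version assumes a sliding log of timestamps, and `allowRequest` is an illustrative name.

```typescript
const LIMIT = 10;
const WINDOW_MS = 5 * 60 * 1000;

// ip -> timestamps (ms) of accepted requests still inside the window
const hits = new Map<string, number[]>();

function allowRequest(ip: string, now: number = Date.now()): boolean {
  // Drop timestamps that have aged out of the window.
  const recent = (hits.get(ip) ?? []).filter((t) => now - t < WINDOW_MS);
  if (recent.length >= LIMIT) {
    hits.set(ip, recent);
    return false;
  }
  recent.push(now);
  hits.set(ip, recent);
  return true;
}
```

Being in-memory, the counts reset on restart and are not shared across instances, which is acceptable for a first pass behind a single apps/api process.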
moduleEmbed:
- packages/website-blocks/src/moduleEmbed: source dropdown
(picture.board | library.entries), max-items, layout (grid | list),
optional filter object. The `resolved` field on props is populated at
publish time by the editor-side resolver — public renderer reads it
directly, no Dexie / API round-trip needed.
- apps/mana/.../website/embeds.ts: per-source resolvers. picture.board
enforces `isPublic=true`; library.entries respects filter.isFavorite
/ kind / status so owners can expose a subset (e.g. "my favorites").
- buildSnapshot() walks the tree after assembly and fills in
block.props.resolved for every moduleEmbed. Publishing gets slower,
public visits stay fast: no cross-service call at render time.
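Sketch of the publish-time resolve pass. It assumes the assembled snapshot is a nested tree with `children` arrays and that each resolver takes the block props and returns a list of items; `resolveEmbeds`, `Block` and `Resolver` are illustrative names, not the real buildSnapshot internals.

```typescript
// Hypothetical snapshot node.
interface Block {
  id: string;
  type: string;
  props: Record<string, unknown>;
  children?: Block[];
}

// Hypothetical per-source resolver signature.
type Resolver = (props: Record<string, unknown>) => Promise<unknown[]>;

// Walk the assembled tree and freeze embed data into props.resolved.
async function resolveEmbeds(
  blocks: Block[],
  resolvers: Record<string, Resolver>,
): Promise<void> {
  for (const block of blocks) {
    if (block.type === 'moduleEmbed') {
      const resolve = resolvers[String(block.props['source'])];
      // Unknown sources resolve to an empty list instead of failing publish.
      block.props['resolved'] = resolve ? await resolve(block.props) : [];
    }
    if (block.children) await resolveEmbeds(block.children, resolvers);
  }
}
```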
Validation:
- pnpm run validate:all: 6/6 gates green
- pnpm run check (web): 0 errors, 0 warnings
- apps/api type-check: green
Apply the Postgres migration with:
psql "$DATABASE_URL" -f apps/api/drizzle/website/0001_submissions.sql
Plan: docs/plans/website-builder.md (M4 shipped)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Smallest possible foundation for the persona-driven visual regression
suite (M5 in docs/plans/mana-mcp-and-personas.md). One flow, two
viewports, one persona — enough to prove the stack end-to-end:
seed-script → mana-auth → API login → cookie injection → web app →
screenshot → disk. Extending is copy-paste per flow.
tests/personas/
playwright.config.ts
Own config separate from the root tests/e2e/ suite. Two viewports
(1440×900 desktop Chrome + Pixel 5 mobile) — more can be added
once baselines settle without quadrupling the review load.
Diff threshold 0.2 %, animations disabled, snapshots land under
__snapshots__/{spec}/{arg}-{project}.png. No auto-webServer —
the whole point is to catch regressions against the real stack
the user runs, not a hermetic one; if the stack is down, tests
fail loud.
fixtures/persona-auth.ts
Typed Playwright `test.extend` with a `personaKey` worker option
and a `personaPage` fixture that returns a pre-logged-in Page
pointed at `/`. Login is API-side: POST /api/v1/auth/login with
the deterministic HMAC-SHA256 password, parse Set-Cookie headers,
inject into the browser context. Derivation is a bit-identical
mirror of scripts/personas/password.ts and
services/mana-persona-runner/src/password.ts — a 3-way contract.
Changing one without the others locks the suite out of every
persona. PERSONAS map exports all 10 catalog emails for typed
access.
flows/home.spec.ts
One smoke flow. Asserts the persona isn't redirected to /login,
hides any [data-testid="live-time"] so clock widgets don't
invalidate diffs, captures a full-page screenshot. When this
goes green, the whole pipeline is plumbed. Copy this file to
add per-module tours.
package.json
@mana/tests-personas workspace. Scripts: `test`, `test:update`,
`report` (HTML diff viewer).
README.md
Prerequisites (stack up + seeded + ideally persona-runner ticked
once), run recipe, env vars, architecture diagram, extension
pattern.
root package.json: `pnpm test:personas` + `:update`.
.gitignore: playwright-report-personas/ + test-results/ so generated
artefacts never get committed.
Type-check / list: `playwright test --list` succeeds, 2 tests (one
per viewport) registered for home.spec.ts.
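Sketch of the two building blocks behind fixtures/persona-auth.ts: a deterministic HMAC-SHA256 password and a Set-Cookie parser. The HMAC inputs (a shared secret as key, the persona email as message) and both function names are assumptions; the authoritative derivation is scripts/personas/password.ts and must be mirrored from there, not from this sketch.

```typescript
import { createHmac } from 'node:crypto';

// Assumed inputs: shared seed secret as key, persona email as message.
function derivePersonaPassword(secret: string, email: string): string {
  return createHmac('sha256', secret).update(email).digest('hex');
}

// Reduce one Set-Cookie header to the name/value pair; attributes such as
// Path or HttpOnly after the first ';' are ignored here.
function parseSetCookie(header: string): { name: string; value: string } {
  const first = header.split(';')[0] ?? '';
  const eq = first.indexOf('=');
  if (eq < 0) throw new Error(`malformed Set-Cookie header: ${header}`);
  return { name: first.slice(0, eq).trim(), value: first.slice(eq + 1).trim() };
}
```

The parsed pairs can then be handed to Playwright's `context.addCookies([{ name, value, url }])` before the first navigation.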
Not attempted in this commit (user action to run the stack):
- Actual baseline capture (needs docker up + db:push + seed:personas
+ ANTHROPIC_API_KEY + diag/tick).
- Additional flows (todo, journal, notes, habits, calendar). They're
copy-paste per README. Land when the stack is smoked.
- Nightly CI job. Will land once baselines are stable.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Expands the builder from 3 M1 blocks to 8. Containers (columns) and
media blocks (image, gallery) are the structural additions; cta and faq
round out the content coverage.
packages/website-blocks:
- image, cta, faq, columns (container), gallery — each with Zod schema,
renderer (mode-aware for edit/preview/public), and fallback inspector.
- Block type extended with optional `children` + `renderChild` snippet
so containers render their children through the same chrome the
outer renderer provides (click-to-select, public-path tagging).
- themes/: 3 presets (classic light, modern dark, warm) with
`resolveTheme` + `themeCssVars` helpers. Public layout now emits
CSS vars via `style=` on the root; block components read
`var(--wb-primary)` / `var(--wb-bg)` / `var(--wb-fg)` / etc.
- Registry updated; new exports + `./themes` subpath export.
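Sketch of how a preset plus overrides could become the `--wb-*` custom properties. `resolveTheme` and `themeCssVars` are named in this commit, but the signatures, the `Theme` shape and all color values below are guesses for illustration.

```typescript
interface Theme {
  primary: string;
  bg: string;
  fg: string;
}

// Hypothetical preset values; only the three preset names come from the commit.
const CLASSIC: Theme = { primary: '#2563eb', bg: '#ffffff', fg: '#111827' };
const PRESETS: Record<string, Theme> = {
  classic: CLASSIC,
  modern: { primary: '#38bdf8', bg: '#0f172a', fg: '#f8fafc' },
  warm: { primary: '#c2410c', bg: '#fff7ed', fg: '#431407' },
};

// Preset first, then per-site overrides; unknown presets fall back to classic.
function resolveTheme(preset: string, overrides: Partial<Theme> = {}): Theme {
  return { ...(PRESETS[preset] ?? CLASSIC), ...overrides };
}

// Serialise to a string suitable for a style attribute on the layout root.
function themeCssVars(theme: Theme): string {
  return `--wb-primary: ${theme.primary}; --wb-bg: ${theme.bg}; --wb-fg: ${theme.fg}`;
}
```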
apps/mana/apps/web/src/lib/modules/website:
- upload.ts: multipart POST to mana-media with `app=website` scope,
returns { mediaId, url }. 25 MB cap, non-image rejection client-side.
- components/ImageInspector + GalleryInspector: app-side overrides
wired to upload. Registered via `CUSTOM_INSPECTORS` in BlockInspector
so block.type → app-side inspector, fallback to registry otherwise.
- components/SiteSettingsDialog: theme preset picker + color overrides
for primary/bg/fg + footer text. Mounted from a ⚙ button in the
editor's left pane.
- components/BlockRenderer: rebuilt around a byParent map + recursive
`renderBlock` snippet so container blocks can render their children
through the same click-to-select wrapper as top-level blocks.
- routes/s/[siteSlug]: rename `[[...path]]` → `[...path]` (SvelteKit
treats rest segments as optional automatically — double-bracket form
errored at sync time). +page.svelte renders snapshot trees
recursively so published pages match the editor.
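Sketch of the byParent grouping and the recursive render used by BlockRenderer. The flat row shape (`parentId`, `order`) is an assumption, and the string-returning `renderBlock` is only a stand-in for the Svelte snippet.

```typescript
// Hypothetical flat editor row.
interface FlatBlock {
  id: string;
  parentId: string | null;
  type: string;
  order: number;
}

// Group once, so each container looks up its children in O(1).
function groupByParent(blocks: FlatBlock[]): Map<string | null, FlatBlock[]> {
  const byParent = new Map<string | null, FlatBlock[]>();
  for (const block of blocks) {
    const list = byParent.get(block.parentId) ?? [];
    list.push(block);
    byParent.set(block.parentId, list);
  }
  byParent.forEach((list) => list.sort((a, b) => a.order - b.order));
  return byParent;
}

// Every block, nested or top-level, goes through the same wrapper.
function renderBlock(block: FlatBlock, byParent: Map<string | null, FlatBlock[]>): string {
  const children = (byParent.get(block.id) ?? [])
    .map((child) => renderBlock(child, byParent))
    .join('');
  return `<div data-block="${block.id}">${block.type}${children}</div>`;
}

function renderPage(blocks: FlatBlock[]): string {
  const byParent = groupByParent(blocks);
  return (byParent.get(null) ?? []).map((b) => renderBlock(b, byParent)).join('');
}
```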
apps/api: unchanged.
Validation:
- pnpm run validate:all: all 6 gates green
- pnpm run check (web): 0 errors, 0 warnings
- apps/api type-check: green
- website-blocks tsc: green
Plan: docs/plans/website-builder.md (M3 block shipped)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mana-mcp:
- Policy-gate section: POLICY_MODE semantics, the four decision
rules, where to find soak metrics during log-only burn-in.
- /metrics section pointing at the Prometheus job.
mana-ai:
- New v0.8 status block: reminderChannel wiring, the two live
producers (tokenBudgetReminder active, retryLoopReminder dormant
pending LoopState extension), why POLICY_MODE here is limited to
freetext inspection, why parallel-reads have no effect until the
tool-registry absorbs the full AI_TOOL_CATALOG (M4 of personas).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pairs with c94ab01c6 which added the real /metrics endpoint. Without a
scrape job the policy_decisions_total counter has nowhere to go and
the soak period is flying blind.
30s interval to match mana-ai. Same job shape as mana-ai — any Grafana
dashboard that auto-discovers services via labels will pick this up.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the stub /metrics endpoint with a real prom-client registry
(mana_mcp_ prefix, {service="mana-mcp"} default label). Default
process metrics come along for free.
Policy-gate telemetry is the whole point — without it we can't soak
POLICY_MODE=log-only safely or decide when to flip to enforce. New
counter mana_mcp_policy_decisions_total{decision, reason, mode} buckets
every evaluatePolicy() call:
decision ∈ {allow, deny, flagged}
reason ∈ {admin-scope-not-invokable, destructive-not-allowed,
rate-limit-exceeded, injection-marker, clean, unknown}
mode ∈ {log-only, enforce}
So the rate of "would have been denied" during soak is visible directly
as policy_decisions_total{decision="deny", mode="log-only"}.
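Sketch of the label space as TypeScript unions, with a tiny in-memory counter to show how the soak query selects its series. This is a dependency-free stand-in, not the prom-client Counter the service registers; `recordDecision` and `total` are illustrative names.

```typescript
type Decision = 'allow' | 'deny' | 'flagged';
type Reason =
  | 'admin-scope-not-invokable'
  | 'destructive-not-allowed'
  | 'rate-limit-exceeded'
  | 'injection-marker'
  | 'clean'
  | 'unknown';
type Mode = 'log-only' | 'enforce';

interface Labels {
  decision: Decision;
  reason: Reason;
  mode: Mode;
}

// One numeric series per distinct label combination.
const series = new Map<string, number>();

function recordDecision(labels: Labels): void {
  const key = `${labels.decision}|${labels.reason}|${labels.mode}`;
  series.set(key, (series.get(key) ?? 0) + 1);
}

// Sum every series whose labels match the given subset.
function total(match: Partial<Labels>): number {
  let sum = 0;
  series.forEach((count, key) => {
    const [decision, reason, mode] = key.split('|');
    if (match.decision !== undefined && match.decision !== decision) return;
    if (match.reason !== undefined && match.reason !== reason) return;
    if (match.mode !== undefined && match.mode !== mode) return;
    sum += count;
  });
  return sum;
}
```

`total({ decision: 'deny', mode: 'log-only' })` is the in-memory analogue of the selector quoted above.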
Also:
- mana_mcp_tool_invocations_total{tool, outcome} — success |
handler-error | input-invalid. Policy denies are NOT counted here
(they're in policy_decisions_total above); this counter only counts
calls that actually reached the handler or tripped zod validation.
- mana_mcp_tool_duration_seconds histogram per tool/outcome.
Dep: prom-client ^15.1.3 (same version mana-ai pins).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previous commit 38dc80654 carries this M3 title but its payload is an
unrelated apps/api/picture change — shared-.git-index race with a
parallel session (see feedback_git_workflow.md). This commit holds the
actual M3.b/c/d code. Leaving the misnamed commit for the user to
re-attribute / revert as they prefer.
Closes the M3 loop from docs/plans/mana-mcp-and-personas.md. The
runner picks up due personas, drives each through Claude + MCP for
one simulated turn, collects actions + ratings, persists through
service-key internal endpoints in mana-auth.
Internal endpoints (mana-auth, service-key-gated)
- GET /api/v1/internal/personas/due
Returns personas whose tickCadence + lastActiveAt say they're
due. Rules (time since lastActiveAt): hourly > 1 h, daily > 24 h,
weekdays > 24 h and only Mon-Fri.
NULLS FIRST so never-run personas go ahead of stale ones.
- POST /api/v1/internal/personas/:id/actions
Batch ≤ 500. Row ids are deterministic
`${tickId}-${i}-${toolName}` + ON CONFLICT DO NOTHING so the
runner can retry a tick without doubling audit rows. Also
bumps personas.last_active_at so the next /due call sees it.
- POST /api/v1/internal/personas/:id/feedback
Batch ≤ 100. Row id is `${tickId}-${module}` — natural key is
one rating per module per tick.
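Sketch of the due rule and the deterministic row id from the endpoint list above. The real check presumably runs in SQL inside mana-auth; `isDue`, the `Cadence` type and the use of UTC for the weekday test are assumptions made for this example.

```typescript
type Cadence = 'hourly' | 'daily' | 'weekdays';

const HOUR_MS = 60 * 60 * 1000;

function isDue(cadence: Cadence, lastActiveAt: Date | null, now: Date): boolean {
  if (cadence === 'weekdays') {
    const day = now.getUTCDay(); // 0 = Sunday, 6 = Saturday; UTC is an assumption
    if (day === 0 || day === 6) return false;
  }
  if (lastActiveAt === null) return true; // never-run personas are always due
  const elapsed = now.getTime() - lastActiveAt.getTime();
  return elapsed > (cadence === 'hourly' ? HOUR_MS : 24 * HOUR_MS);
}

// Same tick + same position + same tool => same id, so a retried batch
// collides with the first attempt and ON CONFLICT DO NOTHING skips it.
function actionRowId(tickId: string, index: number, toolName: string): string {
  return `${tickId}-${index}-${toolName}`;
}
```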
Runner tick pipeline (services/mana-persona-runner/src/runner/)
- claude-session.ts
Two phases per tick. runMainTurn feeds the persona's system
prompt + a German "simulate a day" user prompt to Claude Agent
SDK's query(), with mana-mcp wired in as a streamable-HTTP MCP
server. We iterate the returned AsyncGenerator and extract
tool_use blocks into ActionRows; a tool_result with
is_error=true flips the most recent action. runRatingTurn is a
fresh query() with tools:[] asking Claude in character to rate
each used module 1-5 as strict JSON. We parse with tolerance
for whitespace / fences. Unparseable output becomes a synthetic
'__parse' feedback row so operators see the failure.
- tick.ts
Orchestrator. Skips when config.paused. Fetches /due, processes
in batches of config.concurrency via Promise.allSettled so a
single persona failure never kills the batch. Returns
{due, ranSuccessfully, failed[], durationMs}.
- types.ts
ActionRow + FeedbackRow shapes shared between claude-session
and the internal client.
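Sketch of the batching in tick.ts: fixed-size batches, each awaited with Promise.allSettled so a rejected persona is recorded instead of aborting its siblings. The `runOne` callback and the choice to store persona ids in `failed[]` are assumptions; the paused check and the /due fetch are omitted.

```typescript
interface TickResult {
  due: number;
  ranSuccessfully: number;
  failed: string[]; // assumed to hold persona ids
  durationMs: number;
}

async function runTick<P extends { id: string }>(
  due: P[],
  concurrency: number,
  runOne: (persona: P) => Promise<void>,
): Promise<TickResult> {
  const started = Date.now();
  let ranSuccessfully = 0;
  const failed: string[] = [];
  const step = Math.max(1, concurrency);
  for (let i = 0; i < due.length; i += step) {
    const batch = due.slice(i, i + step);
    // allSettled never rejects, so one bad persona cannot kill the batch.
    const settled = await Promise.allSettled(batch.map((p) => runOne(p)));
    settled.forEach((result, j) => {
      if (result.status === 'fulfilled') ranSuccessfully += 1;
      else failed.push(batch[j]?.id ?? 'unknown');
    });
  }
  return { due: due.length, ranSuccessfully, failed, durationMs: Date.now() - started };
}
```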
Runner bootstrap (src/index.ts)
- setInterval(config.tickIntervalMs) starts the tick loop on boot.
tickInFlight guards against overlap when Claude latency >
interval. If MANA_SERVICE_KEY or ANTHROPIC_API_KEY is missing,
loop is disabled with a warn line — /health + /diag/login still
work.
- POST /diag/tick (dev-only) fires one tick on demand, returns
the result. Avoids waiting a full interval during testing.
- Graceful SIGTERM/SIGINT shutdown clears the interval.
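Sketch of the tickInFlight guard. The wrapper name `makeGuardedTick` and the 'skipped' return value are assumptions for illustration; the point is that the flag is set before the first await and cleared in `finally`.

```typescript
function makeGuardedTick<T>(runTick: () => Promise<T>): () => Promise<T | 'skipped'> {
  let tickInFlight = false;
  return async () => {
    // A previous tick is still running (Claude latency > interval): skip.
    if (tickInFlight) return 'skipped';
    tickInFlight = true;
    try {
      return await runTick();
    } finally {
      // Cleared on success and on failure, so one crash cannot wedge the loop.
      tickInFlight = false;
    }
  };
}
```

The guarded function is what would be handed to `setInterval(guarded, config.tickIntervalMs)`, with the returned timer passed to `clearInterval` on SIGTERM/SIGINT.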
Client
- clients/mana-auth-internal.ts
X-Service-Key client for the three endpoints above.
Constructor throws on empty serviceKey — fail loud.
Boot smoke verified: /health returns ok, /diag/tick 500s with
descriptive messages when keys absent. Warning lines on boot when
keys are missing. Type-check green across mana-auth, tool-registry,
mcp, persona-runner.
M3 exit gate is the end-to-end smoke recipe (docker up → db:push →
seed:personas → diag/tick → psql) documented in
services/mana-persona-runner/CLAUDE.md.
M2.d (cross-space family/team memberships) still deferred.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the M3 loop from docs/plans/mana-mcp-and-personas.md. The
runner now picks up due personas, drives them through Claude + MCP
for one simulated turn, collects actions + ratings, and persists
them through service-key internal endpoints in mana-auth.
Internal endpoints (mana-auth, service-key-gated)
- GET /api/v1/internal/personas/due
Returns personas whose tickCadence + lastActiveAt say they're
due. Rules (time since lastActiveAt): hourly > 1 h, daily > 24 h,
weekdays > 24 h and only Mon-Fri.
NULLS FIRST so never-run personas go ahead of stale ones.
- POST /api/v1/internal/personas/:id/actions
Batch ≤ 500. Row ids are deterministic
(`${tickId}-${i}-${toolName}`) + ON CONFLICT DO NOTHING so the
runner can retry a tick without doubling audit rows. Also
bumps personas.last_active_at so the next /due call sees it.
- POST /api/v1/internal/personas/:id/feedback
Batch ≤ 100. Row id is `${tickId}-${module}` — natural key is
one rating per module per tick.
Runner tick pipeline (services/mana-persona-runner/src/runner/)
- claude-session.ts
Two phases per tick. runMainTurn feeds the persona's system
prompt + a German "simulate a day" user prompt to Claude Agent
SDK's query(), with mana-mcp wired in as a streamable-HTTP MCP
server. We iterate the returned AsyncGenerator and extract
tool_use blocks into ActionRows; tool_result with is_error=true
flips the most recent action. runRatingTurn is a fresh query()
with tools:[] asking Claude in character to rate each used
module 1-5 as strict JSON, which we parse with tolerance for
surrounding whitespace / fences. Unparseable output becomes a
synthetic '__parse' feedback row so operators see the failure.
- tick.ts
Orchestrator. Skips if config.paused. Fetches /due, processes
in batches of config.concurrency (Promise.allSettled so one
failure doesn't kill the batch), returns {due, ranSuccessfully,
failed[], durationMs}.
- types.ts
ActionRow and FeedbackRow shapes shared between claude-session
and the internal client; mirrors the mana-auth schema but in
narrow plain TS for the wire.
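Sketch of the tolerant rating parse from runRatingTurn. The expected JSON shape (an array of `{ module, rating }`), the `FeedbackRow` fields and the function names are assumptions; only the '__parse' sentinel and the whitespace / fence tolerance come from this message.

```typescript
// Hypothetical narrow row; the real FeedbackRow lives in types.ts.
interface FeedbackRow {
  module: string;
  rating: number | null;
  note?: string;
}

const TICKS = '`'.repeat(3); // markdown code-fence marker

// Remove surrounding whitespace and an optional fenced wrapper.
function stripFence(raw: string): string {
  let text = raw.trim();
  if (text.startsWith(TICKS)) {
    const firstNewline = text.indexOf('\n');
    text = firstNewline === -1 ? '' : text.slice(firstNewline + 1);
  }
  if (text.endsWith(TICKS)) text = text.slice(0, -TICKS.length);
  return text.trim();
}

function parseRatings(raw: string): FeedbackRow[] {
  try {
    const data: unknown = JSON.parse(stripFence(raw));
    if (!Array.isArray(data)) throw new Error('expected an array');
    return data.map((item) => {
      const { module, rating } = item as { module?: unknown; rating?: unknown };
      if (typeof module !== 'string' || typeof rating !== 'number' || rating < 1 || rating > 5) {
        throw new Error('bad row');
      }
      return { module, rating };
    });
  } catch {
    // Surface the failure as data so operators see it in the feedback table.
    return [{ module: '__parse', rating: null, note: raw.slice(0, 200) }];
  }
}
```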
Runner bootstrap (src/index.ts)
- setInterval(config.tickIntervalMs) starts the tick loop on boot.
tickInFlight guards against overlap when Claude latency > interval.
If MANA_SERVICE_KEY or ANTHROPIC_API_KEY is missing, loop is
disabled with a warn line — /health still works, /diag/login
still works.
- New dev-only POST /diag/tick fires a single tick on demand and
returns the result, so you can verify without waiting 60 s.
- Graceful SIGTERM/SIGINT shutdown clears the interval.
Client
- clients/mana-auth-internal.ts
X-Service-Key client for the three endpoints above. Constructor
throws if serviceKey is empty — fail loud, not silent.
Boot smoke: /health returns ok; /diag/tick returns descriptive 500s when
keys are absent, 200/JSON when present. Warning lines show up on
boot for missing keys. Type-check green across mana-auth, tool-registry,
mcp, persona-runner.
End-to-end smoke recipe (docker up → db:push → seed:personas →
diag/tick → psql) documented in
services/mana-persona-runner/CLAUDE.md. That's the M3 exit gate.
M2.d (cross-space family/team memberships) still deferred.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Enables the M1 parallel-reads optimisation on the webapp side. Both
consumers of runPlannerLoop pass an isParallelSafe predicate derived
from the tool catalog:
isParallelSafe: (name) =>
AI_TOOL_CATALOG_BY_NAME.get(name)?.defaultPolicy === 'auto'
Auto-policy tools (list_tasks, get_habits, nutrition_summary, …) run
via Promise.all in batches of 10 when the LLM fans them out in one
round. Propose-policy tools — which surface to the user as Proposal
cards — stay sequential so intent ordering in the inbox is preserved
and pre-execute guardrails can reason about prior-step state.
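Sketch of one round under that predicate. The catalog entries, the `runRound` name and the `exec` callback are hypothetical, and running the parallel-safe calls before the sequential ones is an assumption of this example; the actual ordering is defined by runPlannerLoop in the M1 commit.

```typescript
interface ToolCall {
  name: string;
  args: unknown;
}
type Policy = 'auto' | 'propose';

// Hypothetical catalog slice; create_task is an invented propose-policy entry.
const AI_TOOL_CATALOG_BY_NAME = new Map<string, { defaultPolicy: Policy }>([
  ['list_tasks', { defaultPolicy: 'auto' }],
  ['get_habits', { defaultPolicy: 'auto' }],
  ['create_task', { defaultPolicy: 'propose' }],
]);

const isParallelSafe = (name: string): boolean =>
  AI_TOOL_CATALOG_BY_NAME.get(name)?.defaultPolicy === 'auto';

const BATCH = 10;

// Results come back in the same order as the input calls.
async function runRound<R>(calls: ToolCall[], exec: (c: ToolCall) => Promise<R>): Promise<R[]> {
  const results: R[] = new Array(calls.length);
  const parallel = calls
    .map((call, index) => ({ call, index }))
    .filter(({ call }) => isParallelSafe(call.name));
  // Auto-policy calls: Promise.all in chunks of BATCH.
  for (let start = 0; start < parallel.length; start += BATCH) {
    const chunk = parallel.slice(start, start + BATCH);
    const out = await Promise.all(chunk.map(({ call }) => exec(call)));
    chunk.forEach(({ index }, k) => {
      results[index] = out[k] as R;
    });
  }
  // Everything else: strictly one after another, in the LLM's order.
  for (let i = 0; i < calls.length; i++) {
    const call = calls[i];
    if (call && !isParallelSafe(call.name)) results[i] = await exec(call);
  }
  return results;
}
```

Unknown tool names fail the predicate and therefore take the sequential path, which is the conservative default.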
Tests: 31 existing companion + mission tests pass unchanged; the
parallel path is exercised via the new loop.test.ts cases shipped
with the M1 commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>