Two entries:
- **MCP gateway + Persona-runner — end-to-end live smoke** (🟠)
Covers M1+M1.5+M2+M3 commits. Unit tests verified ~2600 LOC at
the type/shape level, but nothing has ever talked to a real
Postgres + mana-auth + Anthropic. 11-step recipe walks through
seed → tick → verify in psql, including the encryption-on-wire
check (enc:1: prefix in sync_changes, plaintext in web app).
- **Persona visual regression — capture first baselines** (🟡)
Depends on the smoke run above succeeding (empty personas produce
meaningless baselines). The eyeball-check step is explicit: the
first PNG IS the reference, and no CI can catch "baseline was wrong".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Flips `meImages` out of USER_LEVEL_TABLES so it lives under the same
tenancy model as every other data table (tags, scenes, tasks, …).
Precursor to the Wardrobe module, which is space-scoped across all
six space types. Keeping meImages user-global would create an
inconsistency (the Wardrobe catalog is per-space but its reference
input is cross-space) plus a latent privacy leak in shared spaces
(agents in a brand-space would see the owner's entire pool).
Plan: docs/plans/me-images-space-scope-migration.md.
Key decisions:
- Strict scope, no cross-space fallback. Switching into a brand-space
with no uploaded face shows an empty state and links back to
/profile/me-images; it does not quietly reach into the personal-
space pool. Keeps the mental model clean.
- auth.users.image remains pinned to personal-space primary-avatar.
Only a primary change inside personal space triggers the Better
Auth sync; brand/club/family/team/practice primaries stay local.
- Single Dexie v40 upgrade: stamps `spaceId=_personal:<uid>`
sentinel, `authorId=<uid>`, `visibility='space'` on every existing
row and drops the legacy `userId` column. Dexie upgrades block app
startup, so by the time the new code's scopedForModule reads run,
every row is already space-stamped. reconcileSentinels() on the
next active-space bootstrap rewrites `_personal:<uid>` to the real
personal-space id, same path v28 used.
- Legacy-avatar migration (M2.5) now pins its row to
`_personal:<uid>` explicitly — the legacy avatar is the user's
global SSO identity and belongs in the personal space even if the
migration happens to fire while the user is in a brand space.
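A minimal sketch of the per-row transformation the v40 upgrade is
described as doing. The row types and the `stampLegacyRow` name are
hypothetical, and only the fields named in this commit are modelled;
the real upgrade block in database.ts may look different.

```typescript
// Hypothetical row shapes for illustration only.
interface LegacyMeImageRow {
  id: string;
  userId: string;
  kind: string;
}

interface StampedMeImageRow {
  id: string;
  kind: string;
  spaceId: string;
  authorId: string;
  visibility: 'space';
}

// Stamp the personal-space sentinel and drop the legacy userId in one pass.
function stampLegacyRow(row: LegacyMeImageRow, uid: string): StampedMeImageRow {
  const { userId: _dropped, ...rest } = row;
  return {
    ...rest,
    spaceId: `_personal:${uid}`, // rewritten later by reconcileSentinels()
    authorId: uid,
    visibility: 'space',
  };
}
```

The sentinel is kept as a plain string here so the later rewrite to the
real personal-space id stays a simple equality match.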
Code changes:
- types.ts: LocalMeImage gains spaceId/authorId/visibility (all
optional — stamped by hook). Public MeImage exposes spaceId for
queries that want to branch on space type.
- database.ts: meImages out of USER_LEVEL_TABLES; new v40 upgrade
block that stamps sentinels + drops userId in one pass.
- queries.ts: all four hooks (useAllMeImages, useMeImagesByKind,
useReferenceImages, useImageByPrimary) read via scopedForModule.
Scope-switch triggers automatic re-render via the existing
scopedTable filter path.
- stores/me-images.svelte.ts: setPrimaryInTx uses scopedForModule so
a setPrimary in Brand-space never clears Personal-space's holder.
syncAvatarToAuth gates on activeSpace.type==='personal' so non-
personal primary changes don't leak into Better Auth.
createMeImage accepts optional spaceId override — the legacy-
avatar migration uses it, regular uploads let the hook stamp the
active space.
- migration/legacy-avatar.ts: explicitly passes
spaceId=_personal:<uid> to pin the legacy row into personal space.
- MeImagesView.svelte: subtle badge in the intro card shows the
active space ("Persönlich" for personal, space name otherwise) so
users notice when the pool changes on space switch.
- packages/mana-tool-registry/src/modules/me.ts: me.listReferenceImages
filters pulled rows by row.spaceId === ctx.spaceId. mana-sync
returns all spaces the user belongs to; the tool only wants the
active space's subset.
No schema/index change on meImages (non-indexed fields, pool size
small enough for in-memory scopedTable filter). If perf matters
later, adding [spaceId+kind] is a 5-minute follow-up.
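For illustration, the strict-scope filter amounts to something like the
sketch below. `PulledRow`, `ToolContextLike` and `rowsForActiveSpace`
are hypothetical names, not the actual scopedForModule or tool-registry
code.

```typescript
// Hypothetical shapes; the real rows carry many more fields.
interface PulledRow {
  id: string;
  spaceId?: string;
}

interface ToolContextLike {
  spaceId: string;
}

// Strict scope: rows from other spaces are never used as a fallback,
// so a space without uploads yields an empty list.
function rowsForActiveSpace<T extends PulledRow>(rows: readonly T[], ctx: ToolContextLike): T[] {
  return rows.filter((row) => row.spaceId === ctx.spaceId);
}
```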
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
One focused dashboard covering the M1+M2 instrumentation in a single
view. Sections top-to-bottom:
1. Service Health — mana-mcp + mana-ai up/down, 1h deny rate,
compactions/h. The deny rate is the single most important
number during POLICY_MODE=log-only soak: a non-zero
deny/min in log-only means real traffic that enforce mode
would reject.
2. Policy Gate (mana-mcp)
- Decisions / sec by outcome (allow/deny/flagged)
- Deny reasons breakdown — the soak signal for flipping to
enforce. If one reason dominates, address it before the flip.
- Tool invocations / sec by outcome (success / handler-error /
input-invalid)
- Top 10 invoked tools (24h) — usage heatmap for prioritising
which tools deserve the best policy-hint tuning.
- Handler p50/p95/p99 latency per tool.
3. Reminder Channel (mana-ai)
- Rate by producer (token-budget, retry-loop, compacted)
- Rate by severity. The interesting signal is whether
warn/escalate trend DOWN over time — it means the LLM is
actually reacting to the hints. If warn stays flat, the
producer wording probably isn't landing.
4. Context Compactor (mana-ai)
- Triggers/h cumulative
- Turns folded per compaction (p50/p95). Values < 3 flag
MANA_AI_COMPACT_MAX_CTX misconfig — the threshold is firing
on already-short histories.
5. Mission Runner Baseline — tick duration + planner rounds for
correlation (e.g. "did enabling the compactor change mean
tick duration?").
Dashboard provisioning already auto-loads anything in /var/lib/grafana/
dashboards (docker/grafana/provisioning/dashboards/default.yml), so
this is live after the next grafana restart. UID agent-loop.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
pgEnum() defaults to the public schema. Because
drizzle.config.ts sets schemaFilter: ['auth'], push introspection
never saw the enums and kept re-emitting CREATE TYPE access_tier ...,
failing with 42710. This blocked setup-databases.sh from advancing
mana-auth past the enum declarations and silently masked other drift
(e.g. the new `kind` column on auth.users going un-pushed).
Source side: three enums now live on authSchema via
authSchema.enum(...) instead of pgEnum(...). DB side: migration 006
recreates access_tier / user_role / user_kind inside the auth schema,
repoints auth.users.access_tier and auth.users.role via ::text cast
(preserving all data and defaults), and drops the old public types.
After this, `drizzle-kit push --force` reports "No changes detected"
on a clean DB and the broader `pnpm setup:db` run is green without
workarounds.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Documents the SQL that was applied manually to match the personas.ts
Drizzle schema introduced in 493db0c3b. Idempotent. See
docs/plans/mana-mcp-and-personas.md for the design. Required because
the spaces tables created alongside personas sit outside the auth
schemaFilter, and pre-existing public enums would otherwise trip
drizzle-kit push (resolved separately in migration 006).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the loop on M2: when the compactor fires, the LLM needs to know
it's now seeing a <compact-summary> instead of raw turns so it
doesn't waste a turn asking about lost details or re-executing tools
whose responses are gone.
shared-ai:
- LoopState grows `compactionsDone: number` (cap-1 by current loop
policy, but shape kept as count for future multi-compact cycles).
- runPlannerLoop populates it on each reminder-channel call. New
loop test asserts [0, 1] sequence: round 1 before compaction,
round 2 after.
mana-ai:
- New producer `compactedReminder` — fires severity=info when
compactionsDone >= 1, wrapped in a German one-liner ("frag nicht
nach verlorenen Details").
- Injected FIRST in buildReminderChannel so the LLM frames the rest
of the round with "I'm looking at a summary" context. Metric
surface stays `{producer='compacted', severity='info'}`.
4 new reminder tests (3 pure producer + 1 composition-ordering) +
1 loop-wiring test. 77 shared-ai, 20 reminders.test.ts — green.
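A rough sketch of the producer and its ordering, under assumptions: the
types, the `buildChannelSketch` name and the English reminder text are
placeholders (the real producer emits a German one-liner), and the real
buildReminderChannel signature may differ.

```typescript
type Severity = 'info' | 'warn' | 'escalate';

interface Reminder {
  producer: string;
  severity: Severity;
  text: string;
}

// Only the LoopState field this producer reads is modelled.
interface LoopStateLike {
  compactionsDone: number;
}

type ReminderProducer = (state: LoopStateLike) => Reminder | null;

// Fires once at least one compaction has happened in this run.
const compactedReminder: ReminderProducer = (state) =>
  state.compactionsDone >= 1
    ? {
        producer: 'compacted',
        severity: 'info',
        text: 'History was compacted into a summary; do not ask about lost details.',
      }
    : null;

// The compacted reminder goes first so it frames the rest of the round.
function buildChannelSketch(state: LoopStateLike, others: ReminderProducer[]): Reminder[] {
  return [compactedReminder, ...others]
    .map((produce) => produce(state))
    .filter((r): r is Reminder => r !== null);
}
```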
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
push_schema used to print "Failed (may not have db:push script)" for
every non-zero exit, lumping real failures (stuck rename prompts,
pre-existing public enums) in with missing scripts. Now it prints the
real exit code and tails the last 5 lines of drizzle-kit output so the
root cause is visible without re-running by hand.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Before: guests had to open the user-menu dropdown to find the login
button. Now the login CTA renders as a visible primary pill immediately
right of the (icon-only) user-menu trigger, so signing in is one click.
Removed the duplicate Anmelden entry from userMenuBarItems — theme,
mode toggle, and language stay in the bar for signed-out users.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two loose ends from M3/M4:
1. Tool_use_id-based error attribution in the persona-runner
-----------------------------------------------------------
The previous collectActionsFromMessage() flipped the *most recent*
ActionRow to 'error' when a tool_result carried is_error:true. That was
fine as long as Claude invoked tools strictly in sequence, but when
the planner pipelines multiple tools in one turn, a later tool_result
carries an earlier tool_use_id — the last-action fallback mis-
attributes the error.
runMainTurn() now keeps a tool_use_id → action-index Map for the
duration of the tick. On tool_use we stash block.id, on tool_result we
look up the exact ActionRow via tool_use_id and flip that one. The
"flip last" path survives as a pure fallback if a future SDK ever
ships a block without an id.
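The attribution logic can be sketched roughly as follows. The block and
row shapes are simplified, and `collectActionsSketch` plus the 'ok'
status are hypothetical; this is not the actual runMainTurn() code.

```typescript
// Simplified content blocks; ids are optional to model the fallback path.
type Block =
  | { type: 'tool_use'; id?: string; name: string }
  | { type: 'tool_result'; tool_use_id?: string; is_error?: boolean };

interface ActionRow {
  tool: string;
  status: 'ok' | 'error';
}

function collectActionsSketch(blocks: Block[]): ActionRow[] {
  const actions: ActionRow[] = [];
  // tool_use_id -> index into actions, kept for the duration of the tick.
  const indexByToolUseId = new Map<string, number>();

  for (const block of blocks) {
    if (block.type === 'tool_use') {
      actions.push({ tool: block.name, status: 'ok' });
      if (block.id !== undefined) indexByToolUseId.set(block.id, actions.length - 1);
    } else if (block.is_error) {
      const exact =
        block.tool_use_id !== undefined ? indexByToolUseId.get(block.tool_use_id) : undefined;
      // Fallback: flip the most recent action if the id is missing or unknown.
      const target = actions[exact ?? actions.length - 1];
      if (target) target.status = 'error';
    }
  }
  return actions;
}
```

With two pipelined tools, an error result for the first tool_use_id now
flips the first row even though the second row is the most recent one.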
2. New audit:encrypted-tools script
-----------------------------------
scripts/audit-encrypted-tools.ts — loads registerAllModules() and
apps/mana/…/crypto/registry.ts, diffs every ToolSpec.encryptedFields
against the authoritative web-app ENCRYPTION_REGISTRY.
Catches three classes of drift:
- missing-table : tool declares a table the web-app doesn't encrypt
- field-drift : both agree a table is encrypted but the field lists
differ (half-encryption on the wire is silent death)
- disabled : web-app has enabled:false while the tool still
encrypts — advisory warning, not a fail
Negative-tested by injecting a deliberate drift on todo.create +
todo.list (shortened ENCRYPTED_FIELDS to ['title']); the auditor
flagged both tools with full field diffs, restore returned to green.
Wired into `pnpm run validate:all` so the contract survives future
edits on either side. Fills the M4 audit gap noted in
project_mana_mcp_personas.md.
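The three drift classes can be illustrated with a small diff function.
This is a sketch only: `ToolSpecLike`, `RegistryEntry` and `auditTool`
are hypothetical shapes, and the real script reads ToolSpec and
ENCRYPTION_REGISTRY, whose structures are not shown in this message.

```typescript
type Drift =
  | { kind: 'missing-table'; tool: string; table: string }
  | { kind: 'field-drift'; tool: string; table: string; toolOnly: string[]; registryOnly: string[] }
  | { kind: 'disabled'; tool: string; table: string };

interface ToolSpecLike {
  name: string;
  table: string;
  encryptedFields: string[];
}

interface RegistryEntry {
  enabled: boolean;
  fields: string[];
}

function auditTool(spec: ToolSpecLike, registry: Map<string, RegistryEntry>): Drift[] {
  const entry = registry.get(spec.table);
  if (!entry) return [{ kind: 'missing-table', tool: spec.name, table: spec.table }];
  // Assumption: a disabled table is reported as advisory and not diffed further.
  if (!entry.enabled) return [{ kind: 'disabled', tool: spec.name, table: spec.table }];

  const toolOnly = spec.encryptedFields.filter((f) => !entry.fields.includes(f));
  const registryOnly = entry.fields.filter((f) => !spec.encryptedFields.includes(f));
  if (toolOnly.length === 0 && registryOnly.length === 0) return [];
  return [{ kind: 'field-drift', tool: spec.name, table: spec.table, toolOnly, registryOnly }];
}
```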
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Symmetrical to 83a4606a9 which wired the compactor into mana-ai. Both
webapp consumers of runPlannerLoop (Companion chat engine, Mission
runner) now pass a compactor that folds the middle of messages into
a <compact-summary> when cumulative token usage hits 92% of
maxContextTokens.
COMPACT_MAX_CTX is a module constant — gemini-2.5-flash's 1M-token
ceiling — not env-wired. Vite builds for the browser and PUBLIC_*
flags are the wrong tool for a value that only matters to the loop
runtime; changing the model means changing the constant alongside the
model reference anyway.
Uses the same LlmClient + model as the planner's own calls. A cheaper
compactor-tier model (Haiku) is the optional M2.5 follow-up and does
not require changing this wiring — only the compactHistory `opts.model`
gets swapped.
Type-check clean (svelte-check 0 errors 0 warnings across 7389 files).
All 31 companion + mission tests green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SvelteKit hook + new DB table + founder-gated API + UI section. Ships
the code path for public-site routing on {slug}.mana.how and custom
hostnames. Cloudflare SaaS Hostnames integration is stubbed — see
plan §M6 "Offene Enden".
apps/api/src/modules/website:
- schema.ts: new `customDomains` table. Fields: id, site_id, hostname
(unique), status (pending | verifying | verified | failed),
verification_token, dns_target, verified_at.
- drizzle/website/0002_custom_domains.sql: manual migration with
partial unique index on (hostname) WHERE status='verified'.
- domains.ts (new, authenticated + founder-gated via
`requireTier('founder')`): POST/GET/DELETE /sites/:id/domains,
POST /sites/:id/domains/:domainId/verify. Verify runs CNAME + TXT
checks via node:dns/promises with an apex-domain A-record fallback.
Reserved-hostname list prevents users from binding mana.how subdomains.
- public-routes.ts: new GET /public/resolve-host?host= — unauthenticated
resolver used by hooks.server.ts. Returns { slug, siteId } only for
verified bindings tied to a currently-published site.
apps/mana/apps/web/src/hooks.server.ts:
- After the existing https/app-subdomain guards, a new
`resolveWebsiteRewrite()` step rewrites `event.url.pathname`:
{slug}.mana.how/path → /s/{slug}/path (pure string)
custom-host.com/path → /s/{resolved}/path (API call, 60s LRU)
- Browser URL stays on the custom host — this is a server-side rewrite,
not a 302. APP_SUBDOMAINS + RESERVED_WEBSITE_SUBDOMAINS win over
website routing. Localhost and apex mana.how are skipped.
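A sketch of the pure-string half of the rewrite, assuming a helper
named `rewriteSlugHost` (hypothetical) and a caller-supplied reserved
set. The custom-host branch needs the API lookup and is not covered
here.

```typescript
const BASE_DOMAIN = 'mana.how';

// Returns the rewritten pathname, or null when website routing does not apply.
function rewriteSlugHost(
  host: string,
  pathname: string,
  reserved: ReadonlySet<string>,
): string | null {
  const hostname = host.toLowerCase().replace(/:\d+$/, ''); // strip a port if present
  if (hostname === 'localhost' || hostname === BASE_DOMAIN) return null;

  const suffix = `.${BASE_DOMAIN}`;
  if (!hostname.endsWith(suffix)) return null; // custom host: resolved elsewhere

  const slug = hostname.slice(0, -suffix.length);
  // Nested subdomains and reserved names never count as site slugs.
  if (slug === '' || slug.includes('.') || reserved.has(slug)) return null;
  return `/s/${slug}${pathname}`;
}
```

Because only the pathname changes, the browser URL keeps showing the
original host.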
apps/mana/apps/web/src/lib/modules/website:
- domains.ts (new): typed client for list/add/verify/remove. Handles
200 + expected 400 (verification-failed) separately.
- components/DomainsSection.svelte: add-input, per-domain status pill,
DNS-instructions box (CNAME + TXT with copy-to-clipboard), Verify
button. Mounted inside SiteSettingsDialog as its own section — the
existing theme/footer controls stay put.
docs/plans/website-builder.md:
- M6 checklist updated with what shipped vs. ops-gap (CF SaaS).
- `mana-landing-builder` consolidation: DECIDED to keep parallel. Four
reasons in the plan. Revisit-criterion stated.
- Shipping log table seeded with M1→M6 commits.
Validation:
- pnpm run validate:all: 6/6 gates green
- pnpm run check (web): 0 errors, 0 warnings
- apps/api type-check: green
Apply schema with:
psql "$DATABASE_URL" -f apps/api/drizzle/website/0002_custom_domains.sql
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Claude-Code wU2 pattern goes live. Every mission run now passes a
compactor into runPlannerLoop that will fire once if cumulative token
usage crosses 92% of MANA_AI_COMPACT_MAX_CTX (default 1_000_000, the
gemini-2.5-flash ceiling). Override via env for deployments on smaller
models; set to 0 to disable entirely.
The compactor reuses the planner's own LlmClient + gemini-2.5-flash
model for now. When mana-llm grows a Haiku tier we'll route the
compactor there — it's pure summarisation and a cheaper model saves
tokens exactly where they matter.
New metrics:
- mana_ai_compactions_triggered_total — counter, one per firing
- mana_ai_compacted_turns — histogram, how many middle turns got
folded each time (< 3 ⇒ maxCtx is probably misconfigured)
Logs print a 60-char tail of the summary.goal so the "what was this
mission doing again" question survives a compaction.
No new tests here — compactHistory and the loop wiring are already
covered by the 22 tests in shared-ai (M2.1 + M2.2). The 57 existing
mana-ai bun tests stay green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Superseded by the top-level docker-compose.dev.yml (which defines
searxng + redis as part of the unified dev stack via `pnpm docker:up`).
This per-service file was an artefact from before the unified setup
and no script / doc / README still references it.
An orphan `mana-searxng-dev` + `mana-search-redis-dev` had been running
from this file for ~2 weeks, squatting on the host's port 8080. Every
first `pnpm dev:mana:all` after a cold machine start would fail with
Bind for 0.0.0.0:8080 failed: port is already allocated
because the top-level compose's `mana-searxng` service couldn't take
8080 while the orphan held it. The second invocation silently
"worked" — docker saw the freshly-created mana-searxng container and
skipped the bind step on the idempotent up, leaving it healthy but
only reachable inside the docker network (8080/tcp, no external
publish).
Cleanup already done out-of-band:
docker compose -f services/mana-search/docker-compose.dev.yml down
docker compose -f docker-compose.dev.yml up -d --force-recreate searxng
Deleting the file so a stale `docker compose -f …/mana-search/dev.yml up`
can't resurrect the orphan.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PlannerLoopInput grows an optional compactor:
compactor?: {
maxContextTokens: number;
threshold?: number; // default 0.92, matches Claude Code wU2
compact: (messages) => Promise<{ messages, compactedTurns }>;
}
Before each LLM call the loop checks whether cumulative prompt +
completion token usage has crossed threshold × maxContextTokens. If
yes AND we haven't compacted this run yet, the callback runs, its
returned messages REPLACE the live history, and compactionsDone flips
to 1 so a runaway tool can't re-trigger.
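The trigger decision can be sketched as a pure predicate. The names
`UsageState` and `shouldFireCompactor` are hypothetical, and treating
"crossed" as >= is an assumption; the real loop may compare
differently.

```typescript
interface CompactorConfig {
  maxContextTokens: number;
  threshold?: number;
}

// Only the fields the check needs; field names are illustrative.
interface UsageState {
  promptTokens: number;
  completionTokens: number;
  compactionsDone: number;
}

const DEFAULT_THRESHOLD = 0.92;

function shouldFireCompactor(state: UsageState, compactor?: CompactorConfig): boolean {
  // No compactor or a zero cap: skip silently.
  if (!compactor || compactor.maxContextTokens <= 0) return false;
  // At most once per loop run.
  if (state.compactionsDone >= 1) return false;
  const used = state.promptTokens + state.completionTokens;
  return used >= (compactor.threshold ?? DEFAULT_THRESHOLD) * compactor.maxContextTokens;
}
```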
Design choices:
- Fires at most ONCE per loop run. If the fresh (compacted)
history hits the threshold again in the same run, the LLM
round budget will hit first; better to terminate than to
recursively compact a summary.
- No reminder emitted automatically — the caller can wire
that via reminderChannel by reading compactionsDone from
LoopState (next PR; compactionsDone isn't exposed yet to
keep the state surface small).
- compactor callback is injectable, not hardcoded to
compactHistory() from compact.ts. Lets mana-ai route the
compactor LLM call to a cheaper model (Haiku) without
changing the loop.
- Zero maxContextTokens → skip silently (same contract as
shouldCompact()).
Also cleaned up the isParallelSafe non-null-assertion warning by
hoisting the predicate to a local with proper narrowing.
5 new loop tests: below-threshold no-op, single-fire replacement,
once-per-run idempotency, zero-cap bail, no-op when compactor
returns 0 turns. 76 shared-ai tests total, green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Revises the wardrobe plan's space-scope decision from "only personal"
to the full matrix: brand has merch, clubs have jerseys, families
have shared kids' wardrobes, teams have costumes/uniforms, practices
have dress-code items. All six space types get wardrobe in the
allowlist; garments + outfits are stamped with spaceId/authorId/
visibility like tags/scenes/agents (post Phase 2c).
Adds a sixth decision block (Space-scoped catalog, user-scoped
Try-On subject): the catalog lives in its space, but Try-On
references are always the *calling user's* meImages — one human,
one identity, brought into every space. A brand team member trying
on merch sees themselves wearing it; a club member trying on a
jersey sees themselves wearing it; natural and correct.
The single edge case is family spaces where a parent might want
"try on kid's shirt" — the plan punts that explicitly. The
catalog side (adding items, composing outfits) works unrestricted;
Try-On shows a hint that it renders the calling user. If real
demand shows up later, a separate plan can introduce per-space
subject references (spaceMembers[].faceMediaId or similar); not
speculating on that today.
Membership gating falls out of the existing scopedForModule/
mana-sync-RLS stack; no extra code in wardrobe.
M1 checklist updated: wardrobe is NOT in USER_LEVEL_TABLES, queries
go through scopedForModule, allowlist entry covers all six types.
M4 checklist gains the "in non-personal spaces show the subject
hint" item.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Claude-Code wU2 pattern: when token usage hits ~92% of the provider's
context budget, fold all pre-tail turns into a single structured summary
(Goal / Decisions / Tools Called / Current Progress) so subsequent
rounds see a synopsis instead of the raw log.
This commit ships ONLY the primitive. Wiring it into runPlannerLoop
(auto-trigger before the next LLM call when shouldCompact() fires)
is M2.2 so the surface stays small and testable.
New exports from @mana/shared-ai:
- shouldCompact(totalTokens, maxContextTokens, threshold?)
→ boolean; DEFAULT_COMPACT_THRESHOLD = 0.92, matching Claude Code.
Bails safely when maxContextTokens is missing (local models often
don't report usage).
- compactHistory(messages, { llm, model, keepRecent?, temperature? })
→ { messages, summary, compactedTurns, usage? }
Preserves: [0]=system, [1]=first user, [last N]=recent turns
(default 4). Everything between gets sent through the compact
agent with COMPACT_SYSTEM_PROMPT — a fixed 4-section Markdown
schema. Temperature default 0.2 because we want summarisation,
not creativity.
- parseCompactSummary / renderCompactSummary — round-trip helpers.
Parser is tolerant (missing sections → empty string) so a partial
compaction still produces a usable summary.
The summary replaces the middle as a single role='assistant' message
wrapped in <compact-summary> tags. Assistant role (not system) because
some providers reject arbitrary system messages deep in history.
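The structural part of the fold (anchors, summary, tail) might look
like the sketch below. It takes an already-rendered summary string
instead of calling an LLM, and `replaceMiddleWithSummary` is a
hypothetical helper, not the compactHistory implementation.

```typescript
interface ChatMessage {
  role: 'system' | 'user' | 'assistant' | 'tool';
  content: string;
}

function replaceMiddleWithSummary(
  messages: ChatMessage[],
  summaryMarkdown: string,
  keepRecent = 4,
): { messages: ChatMessage[]; compactedTurns: number } {
  // Anchors: [0]=system, [1]=first user. Tail: the last keepRecent messages.
  const tailStart = Math.max(2, messages.length - keepRecent);
  const middle = messages.slice(2, tailStart);
  if (middle.length === 0) return { messages, compactedTurns: 0 };

  const summary: ChatMessage = {
    role: 'assistant', // assistant rather than system, see the rationale above
    content: `<compact-summary>\n${summaryMarkdown}\n</compact-summary>`,
  };
  return {
    messages: [...messages.slice(0, 2), summary, ...messages.slice(tailStart)],
    compactedTurns: middle.length,
  };
}
```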
Tests: 17 new across the 4 exports (trigger logic, Markdown round-trip,
structural preservation of anchors + tail, usage passthrough, custom
keepRecent). All 71 shared-ai tests green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two things:
1. AI tools (9) in the website module — writes go through the standard
proposal flow, reads run auto during planning.
- shared-ai/src/tools/schemas.ts: AI_TOOL_CATALOG entries with
defaultPolicy propose/auto.
- webapp modules/website/tools.ts: execute functions wired to the
existing stores. ModuleTool[] registered in data/tools/init.ts.
- Propose: create_website, apply_website_template, create_website_page,
add_website_block, update_website_block, publish_website
- Auto: list_websites, list_website_pages, list_website_blocks
Server-side mana-tool-registry integration (mana-mcp, mana-ai) is
a M5.x follow-up — webapp flow unblocks the missions-based use case.
2. Starter templates — clone into a fresh site with new UUIDs.
- templates/types.ts: SiteTemplate shape with localId / parentLocalId
so container→child references survive the clone.
- 4 templates: portfolio (4 pages), personal-linktree (1 page, 6 CTAs),
event (3 pages incl. RSVP form), blank (1 empty page). Deferred:
smb-corporate + product-landing (need team/pricing/testimonials
blocks, M6+).
- sitesStore.applyTemplate: walks template, bulk-inserts new rows,
remaps parent refs. Sets navConfig items from template pages.
- TemplatePicker component + /website/new route. Replaces the old
quick-create modal; ListView now links to /new. AppRegistry
context-menu action points there too.
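The parent-reference remap behind the clone can be sketched as a
two-pass walk. The block shapes, `cloneBlocks` and the injected `newId`
generator are hypothetical; the real sitesStore.applyTemplate also
handles pages and navConfig, which are left out here.

```typescript
interface TemplateBlock {
  localId: string;
  parentLocalId?: string;
  type: string;
}

interface ClonedBlock {
  id: string;
  parentId: string | null;
  type: string;
}

function cloneBlocks(blocks: TemplateBlock[], newId: () => string): ClonedBlock[] {
  // Pass 1: allocate a fresh id for every template-local id.
  const idByLocalId = new Map<string, string>();
  for (const block of blocks) idByLocalId.set(block.localId, newId());

  // Pass 2: emit rows with parent references remapped to the fresh ids.
  return blocks.map((block) => ({
    id: idByLocalId.get(block.localId) as string,
    parentId:
      block.parentLocalId !== undefined ? (idByLocalId.get(block.parentLocalId) ?? null) : null,
    type: block.type,
  }));
}
```

Allocating all ids before remapping means a child listed ahead of its
container still resolves its parent.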
AiProposalInbox integration deferred — the component doesn't exist in
the webapp yet (the plan mentions it aspirationally). defaultPolicy
'propose' is already set so writes stage correctly once the UI catches
up.
Validation:
- pnpm run validate:all: 6/6 gates green
- pnpm run check (web): 0 errors, 0 warnings
- apps/api + packages/shared-ai type-check: green
Plan: docs/plans/website-builder.md (M5 shipped)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Producers now return structured {producer, severity, text} objects
instead of raw strings. buildReminderChannel collects them, increments
mana_ai_reminders_emitted_total{producer, severity} per emission, and
maps back to strings for the shared-ai loop input.
Why structured: the Prometheus label "severity" lets dashboards split
75-99% token-budget warnings (severity=warn) from 100%+ escalations
(severity=escalate) without NLP on the reminder text. Adding a new
producer that emits only info-level state (e.g. stale-sync warning)
falls out for free.
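As a sketch of the collect, count, map-back-to-strings flow: the Map
below merely stands in for the Prometheus counter, and `emitReminders`
is a hypothetical name, not the actual buildReminderChannel code.

```typescript
type Severity = 'info' | 'warn' | 'escalate';

interface Reminder {
  producer: string;
  severity: Severity;
  text: string;
}

// Counts one emission per {producer, severity} pair and returns plain
// strings for the loop input.
function emitReminders(reminders: Reminder[], counter: Map<string, number>): string[] {
  for (const reminder of reminders) {
    const labels = `producer=${reminder.producer},severity=${reminder.severity}`;
    counter.set(labels, (counter.get(labels) ?? 0) + 1);
  }
  return reminders.map((reminder) => reminder.text);
}
```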
Active producer labels today:
- token-budget (warn, escalate)
- retry-loop (warn)
With this plus the scrape job (d087b4744), we can finally answer:
"does the budget warning actually change LLM behaviour?" — correlate
reminders_emitted_total{producer='token-budget'} with
tick_duration_seconds or planner_rounds_histogram.
3 tests updated to assert the new {producer, severity, text} shape
(16 reminder tests total, all green).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two plan updates as a set:
- me-images-and-reference-generation.md: rewrites the "Status" block
to reflect what actually shipped (M1 89258eb45, M2 a64a7e39c, M2.5
e2b5ac38c, M3 in 38dc80654, M4 in d087b4744, M5 fc635f983) and
adds an "Offen" section listing the small follow-ups that didn't
make the M1-M5 cut — global aiUsesReferenceImages kill-switch,
kind-editor on existing tiles, reference-display in picture
detail view, legacy-avatar re-upload hint — plus the three
optional later tracks (M6 local FLUX+PuLID, M7 inpainting masks,
M8 zero-knowledge blobs). Milestones checklist is now
✅-annotated per shipped item with actual decisions (Dexie v38
instead of v27, no me-storage bucket after all, generation_log
deferred, etc.).
- wardrobe-module.md: new plan. Data layer sketch (two tables:
wardrobeGarments + wardrobeOutfits, reuses me-images + picture
as dependencies), UI breakdown (/wardrobe, /wardrobe/compose,
garment + outfit detail routes), Try-On as a thin wrapper over
the M3 endpoint (with the cap bumped from 4 → 8 references, so
face + body + up-to-6 garments fits one call), four MCP tools
in a new wardrobe.ts module, and two optional later tracks
(Persona Stil-Coach template, context-driven outfit suggestion
mission). The explicit non-goals block keeps the scope tight:
no product DB, no replacement for inventory, no shopping, no
style-coaching that feels judgmental.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three edge-level fixes applied live to the Mac Mini today, now
committed so the canonical state matches:
1. apps/mana/apps/web/Dockerfile: add COPY for @mana/shared-crypto
(added recently as a workspace dep but the Dockerfile missed it,
so pnpm install failed with ERR_PNPM_WORKSPACE_PKG_NOT_FOUND on
every rebuild — same class as the shared-types / shared-ai /
shared-rss fixes earlier today).
2. docker-compose.macmini.yml (mana-web service): set
PUBLIC_MANA_RESEARCH_URL + PUBLIC_MANA_RESEARCH_URL_CLIENT. Without
this pair the SSR-injected window.__PUBLIC_MANA_RESEARCH_URL__ was
empty and research fetches 404'd against the current origin.
3. docker-compose.macmini.yml (umami service): pin image to
postgresql-v2.18.0. The rolling `postgresql-latest` tag jumped to
Umami 3.1.0 (Next.js 16) which crashed the container on every
POST /api/send; browser page loaders hung for up to 10s on the
failing tracker request. v2.18.0 is the last known-stable v2;
DB schema is still v2-compatible so the downgrade is clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extends LoopState with a sliding window of the last N ExecutedCalls
(oldest-first), capped at LOOP_STATE_RECENT_CALLS_WINDOW = 5. The loop
maintains the window automatically; reminderChannel producers read it
without touching internal state.
This activates retryLoopReminder which was shape-only in faa472be9.
The guard now fires end-to-end: when round >= 3 and the tail-2 calls
both returned success:false, the LLM sees a "stop retrying, write a
summary instead" <reminder> on the next turn. Checking only the
tail-2 calls rather than the whole window is deliberate: a flaky run
with intermittent success (F, F, F, OK, F) is not a retry loop, just
flaky tools.
Why window=5: retry loops usually manifest within 2-3 consecutive
rounds; a 5-deep window gives room for burst-detection and
stale-tool heuristics without bloating the reminder channel. Cap
keeps the reminder producers O(5) regardless of loop length.
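A minimal sketch of the window maintenance and the tail-2 guard.
`ExecutedCallLike`, `pushRecentCall` and `isRetryLoop` are hypothetical
names; only LOOP_STATE_RECENT_CALLS_WINDOW and the round >= 3 rule come
from this commit.

```typescript
interface ExecutedCallLike {
  tool: string;
  success: boolean;
}

const LOOP_STATE_RECENT_CALLS_WINDOW = 5;

// Append oldest-first and keep only the last N calls.
function pushRecentCall(
  window: readonly ExecutedCallLike[],
  call: ExecutedCallLike,
): ExecutedCallLike[] {
  return [...window, call].slice(-LOOP_STATE_RECENT_CALLS_WINDOW);
}

// Retry loop: round >= 3 and the two most recent calls both failed.
function isRetryLoop(round: number, recentCalls: readonly ExecutedCallLike[]): boolean {
  if (round < 3 || recentCalls.length < 2) return false;
  return recentCalls.slice(-2).every((call) => !call.success);
}
```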
Tests: 3 new (sliding-window cap + slide + order in shared-ai, retry
composition + budget+retry chain + tail-only heuristic in mana-ai).
Total agent-loop tests now 74 across both packages.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes M5 of docs/plans/me-images-and-reference-generation.md —
exposes the meImages feature through the shared tool-registry so MCP
clients (Claude Desktop) and the mana-ai mission runner can drive it
alongside the built-in webapp UI.
Two tools in packages/mana-tool-registry/src/modules/me.ts:
- me.listReferenceImages(kind?) — scope: user-space, read. Pulls the
user's meImages rows from mana-sync (app='profile'), filters to
usage.aiReference=true and soft-live records, decrypts the `label`
and `tags` fields with the caller's master key (same pattern as
notes.search). Returns mediaIds + kind + primary-slot info so a
persona can pick references intelligently. ZK users will see this
fail at getMasterKey() — correct, because the label is truly
unrecoverable server-side for them.
- me.generateWithReference({prompt, referenceMediaIds, quality,
size, n}) — scope: user-space, write. Thin proxy over the M3
endpoint POST /api/v1/picture/generate-with-reference in apps/api:
forwards the JWT, lets apps/api re-verify ownership, and returns
the generated images' mediaIds + URLs. Credits are consumed at
the same 3/10/25 tariff as text-to-image, so a persona plan pass
should gate this behind explicit budget rather than leaving it on
auto-policy.
Registered in modules/index.ts + adds 'me' to the ModuleId union in
types.ts. No other wiring needed — mana-mcp's createMcpServerForUser
iterates the registry and exposes any user-space tool, so both tools
become available to Claude Desktop immediately on next deploy.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Hard follow-up to M1's soft Dexie schema landing (plan
docs/plans/me-images-and-reference-generation.md). After this commit
the source of truth for the avatar is meImages(primaryFor='avatar');
auth.users.image becomes a derived mirror that gets pushed back to
Better Auth whenever the primary changes.
Changes:
- New migration/legacy-avatar.ts: one-shot, idempotent bootstrap. On
first visit to /profile/me-images it reads profile.image via
profileService.getProfile() and writes a single meImage with
kind='face', primaryFor='avatar', usage.aiReference=false. The
mediaId is a sentinel `legacy-avatar:<uid>` — the original bytes
never went through mana-media, so verifyMediaOwnership (M3) will
naturally bounce if the user ever flips aiReference on without
re-uploading. Guarded per user via localStorage +
existing-avatar-holder check so reruns are no-ops.
- Store avatar autosync: setPrimary and deleteMeImage now push
meImages(primaryFor='avatar').publicUrl back to
profileService.updateProfile({ image }). The avatar slot is
coupled to face-ref — setting a new face-ref primary also claims
the avatar on the same row, so users don't need a second UI
control to keep their profile picture fresh. Failures are logged
but swallowed; meImages stays authoritative for in-app rendering.
- MeImagesView triggers the migration once on mount.
- EditProfileModal replaces the broken inline avatar upload (the old
POST /api/v1/storage/avatar/upload endpoint never existed in the
unified API) with a read-only preview + a button that closes the
modal and navigates to /profile/me-images. Name + email flows are
untouched.
- profileService.uploadAvatar + AvatarUploadResponse + its test are
deleted (no callers left after the modal rewrite).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds two new block types and the server-side infrastructure for
untrusted input + cross-module data embedding.
Forms:
- packages/website-blocks/src/form: declarative fields (text, email,
tel, url, textarea, number) with required / maxLength / placeholder
per field. Honeypot hidden input in the renderer; public-mode POST
to a same-origin SvelteKit proxy that forwards to mana-api.
- apps/api: website.submissions table (schema.ts + 0001_submissions.sql)
+ POST /public/submit/:siteSlug/:blockId. Loads the current published
snapshot, finds the form block, validates payload against its
declared fields (trim, type check, length cap), rejects honeypot
submissions silently, rate-limits per IP (10 / 5 min) in-memory.
Unknown keys are dropped — clients can only submit declared fields.
- Owner-facing: GET/DELETE /sites/:id/submissions + SubmissionsView
component + /(app)/website/[siteId]/submissions route. Shows
incoming submissions with status pill + payload preview + delete.
- apps/mana/.../routes/s/[siteSlug]/__submit/[blockId]/+server.ts:
same-origin proxy so form posts don't trigger CORS and IP / user-
agent headers are forwarded via SvelteKit's trusted getClientAddress.
M4 first-pass does NOT wire target-module delivery (contacts / notify).
Submissions stay in the inbox until owner-side tool handlers land
(M4.x). `target` enum is intentionally `['inbox']` only for now.
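The validation step described above (trim, type check, length cap, unknown keys dropped) can be sketched roughly as follows. The `FormField` shape, the `validateSubmission` name and the email regex are assumptions for illustration, not the shipped schema; honeypot handling and rate limiting are left out.

```typescript
// Hypothetical field shape; the real block schema lives in packages/website-blocks.
type FieldType = 'text' | 'email' | 'tel' | 'url' | 'textarea' | 'number';

interface FormField {
  name: string;
  type: FieldType;
  required?: boolean;
  maxLength?: number;
}

type ValidationResult =
  | { ok: true; values: Record<string, string | number> }
  | { ok: false; error: string };

// Deliberately loose email check; an assumption, not the production rule.
const EMAIL_RE = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

function validateSubmission(
  fields: FormField[],
  payload: Record<string, unknown>
): ValidationResult {
  const values: Record<string, string | number> = {};
  // Only declared fields are read, so unknown payload keys are dropped implicitly.
  for (const field of fields) {
    const raw = payload[field.name];
    const text =
      typeof raw === 'string' ? raw.trim() : typeof raw === 'number' ? String(raw) : '';
    if (text === '') {
      if (field.required) return { ok: false, error: `${field.name} is required` };
      continue;
    }
    if (field.maxLength !== undefined && text.length > field.maxLength) {
      return { ok: false, error: `${field.name} is too long` };
    }
    if (field.type === 'number') {
      const n = Number(text);
      if (!Number.isFinite(n)) return { ok: false, error: `${field.name} must be a number` };
      values[field.name] = n;
      continue;
    }
    if (field.type === 'email' && !EMAIL_RE.test(text)) {
      return { ok: false, error: `${field.name} must be an email address` };
    }
    values[field.name] = text;
  }
  return { ok: true, values };
}
```

Iterating over the declared fields instead of the payload keys is what makes "clients can only submit declared fields" hold without a separate allow-list step.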
moduleEmbed:
- packages/website-blocks/src/moduleEmbed: source dropdown
(picture.board | library.entries), max-items, layout (grid | list),
optional filter object. The `resolved` field on props is populated at
publish time by the editor-side resolver — public renderer reads it
directly, no Dexie / API round-trip needed.
- apps/mana/.../website/embeds.ts: per-source resolvers. picture.board
enforces `isPublic=true`; library.entries respects filter.isFavorite
/ kind / status so owners can expose a subset (e.g. "my favorites").
- buildSnapshot() walks the tree after assembly and fills in
block.props.resolved for every moduleEmbed. Publishing gets slower, but
public visits stay fast: no cross-service call at render time.
Validation:
- pnpm run validate:all: 6/6 gates green
- pnpm run check (web): 0 errors, 0 warnings
- apps/api type-check: green
Apply Postgres with:
psql "$DATABASE_URL" -f apps/api/drizzle/website/0001_submissions.sql
Plan: docs/plans/website-builder.md (M4 shipped)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Smallest possible foundation for the persona-driven visual regression
suite (M5 in docs/plans/mana-mcp-and-personas.md). One flow, two
viewports, one persona — enough to prove the stack end-to-end:
seed-script → mana-auth → API login → cookie injection → web app →
screenshot → disk. Extending is copy-paste per flow.
tests/personas/
playwright.config.ts
Own config separate from the root tests/e2e/ suite. Two viewports
(1440×900 desktop Chrome + Pixel 5 mobile) — more can be added
once baselines settle without quadrupling the review load.
Diff threshold 0.2 %, animations disabled, snapshots land under
__snapshots__/{spec}/{arg}-{project}.png. No auto-webServer —
the whole point is to catch regressions against the real stack
the user runs, not a hermetic one; if the stack is down, tests
fail loud.
fixtures/persona-auth.ts
Typed Playwright `test.extend` with a `personaKey` worker option
and a `personaPage` fixture that returns a pre-logged-in Page
pointed at `/`. Login is API-side: POST /api/v1/auth/login with
the deterministic HMAC-SHA256 password, parse Set-Cookie headers,
inject into the browser context. Derivation is a bit-identical
mirror of scripts/personas/password.ts and
services/mana-persona-runner/src/password.ts — a 3-way contract.
Changing one without the others locks the suite out of every
persona. PERSONAS map exports all 10 catalog emails for typed
access.
flows/home.spec.ts
One smoke flow. Asserts the persona isn't redirected to /login,
hides any [data-testid="live-time"] so clock widgets don't
invalidate diffs, captures a full-page screenshot. When this
goes green, the whole pipeline is plumbed. Copy this file to
add per-module tours.
package.json
@mana/tests-personas workspace. Scripts: `test`, `test:update`,
`report` (HTML diff viewer).
README.md
Prerequisites (stack up + seeded + ideally persona-runner ticked
once), run recipe, env vars, architecture diagram, extension
pattern.
root package.json: `pnpm test:personas` + `:update`.
.gitignore: playwright-report-personas/ + test-results/ so generated
artefacts never get committed.
Type-check / list: `playwright test --list` succeeds, 2 tests (one
per viewport) registered for home.spec.ts.
Not attempted in this commit (user action to run the stack):
- Actual baseline capture (needs docker up + db:push + seed:personas
+ ANTHROPIC_API_KEY + diag/tick).
- Additional flows (todo, journal, notes, habits, calendar). They're
copy-paste per README. Land when the stack is smoked.
- Nightly CI job. Will land once baselines are stable.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Expands the builder from 3 M1 blocks to 8. Containers (columns) and
media blocks (image, gallery) are the structural additions; cta and faq
round out the content coverage.
packages/website-blocks:
- image, cta, faq, columns (container), gallery — each with Zod schema,
renderer (mode-aware for edit/preview/public), and fallback inspector.
- Block type extended with optional `children` + `renderChild` snippet
so containers render their children through the same chrome the
outer renderer provides (click-to-select, public-path tagging).
- themes/: 3 presets (classic light, modern dark, warm) with
`resolveTheme` + `themeCssVars` helpers. Public layout now emits
CSS vars via `style=` on the root; block components read
`var(--wb-primary)` / `var(--wb-bg)` / `var(--wb-fg)` / etc.
- Registry updated; new exports + `./themes` subpath export.
apps/mana/apps/web/src/lib/modules/website:
- upload.ts: multipart POST to mana-media with `app=website` scope,
returns { mediaId, url }. 25 MB cap, non-image rejection client-side.
- components/ImageInspector + GalleryInspector: app-side overrides
wired to upload. Registered via `CUSTOM_INSPECTORS` in BlockInspector
so block.type → app-side inspector, fallback to registry otherwise.
- components/SiteSettingsDialog: theme preset picker + color overrides
for primary/bg/fg + footer text. Mounted from a ⚙ button in the
editor's left pane.
- components/BlockRenderer: rebuilt around a byParent map + recursive
`renderBlock` snippet so container blocks can render their children
through the same click-to-select wrapper as top-level blocks.
- routes/s/[siteSlug]: rename `[[...path]]` → `[...path]` (SvelteKit
treats rest segments as optional automatically — double-bracket form
errored at sync time). +page.svelte renders snapshot trees
recursively so published pages match the editor.
apps/api: unchanged.
Validation:
- pnpm run validate:all: all 6 gates green
- pnpm run check (web): 0 errors, 0 warnings
- apps/api type-check: green
- website-blocks tsc: green
Plan: docs/plans/website-builder.md (M3 block shipped)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mana-mcp:
- Policy-gate section: POLICY_MODE semantics, the four decision
rules, where to find soak metrics during log-only burn-in.
- /metrics section pointing at the Prometheus job.
mana-ai:
- New v0.8 status block: reminderChannel wiring, the two live
producers (tokenBudgetReminder active, retryLoopReminder dormant
pending LoopState extension), why POLICY_MODE here is limited to
freetext inspection, why parallel-reads have no effect until the
tool-registry absorbs the full AI_TOOL_CATALOG (M4 of personas).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pairs with c94ab01c6 which added the real /metrics endpoint. Without a
scrape job the policy_decisions_total counter has nowhere to go and
the soak period is flying blind.
30s interval to match mana-ai. Same job shape as mana-ai — any Grafana
dashboard that auto-discovers services via labels will pick this up.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the stub /metrics endpoint with a real prom-client registry
(mana_mcp_ prefix, {service="mana-mcp"} default label). Default
process metrics come along for free.
Policy-gate telemetry is the whole point — without it we can't soak
POLICY_MODE=log-only safely or decide when to flip to enforce. New
counter mana_mcp_policy_decisions_total{decision, reason, mode} buckets
every evaluatePolicy() call:
decision ∈ {allow, deny, flagged}
reason ∈ {admin-scope-not-invokable, destructive-not-allowed,
rate-limit-exceeded, injection-marker, clean, unknown}
mode ∈ {log-only, enforce}
So the rate of "would have been denied" during soak is visible directly
as policy_decisions_total{decision="deny", mode="log-only"}.
Also:
- mana_mcp_tool_invocations_total{tool, outcome} — success |
handler-error | input-invalid. Policy denies are NOT counted here
(they're in policy_decisions_total above); this counter only counts
calls that actually reached the handler or tripped zod validation.
- mana_mcp_tool_duration_seconds histogram per tool/outcome.
Dep: prom-client ^15.1.3 (same version mana-ai pins).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previous commit 38dc80654 carries this M3 title but its payload is an
unrelated apps/api/picture change — shared-.git-index race with a
parallel session (see feedback_git_workflow.md). This commit holds the
actual M3.b/c/d code. Leaving the misnamed commit for the user to
re-attribute / revert as they prefer.
Closes the M3 loop from docs/plans/mana-mcp-and-personas.md. The
runner picks up due personas, drives each through Claude + MCP for
one simulated turn, collects actions + ratings, persists through
service-key internal endpoints in mana-auth.
Internal endpoints (mana-auth, service-key-gated)
- GET /api/v1/internal/personas/due
Returns personas whose tickCadence + lastActiveAt say they're
due. Rules: hourly > 1h, daily > 24h, weekdays > 24h mon-fri.
NULLS FIRST so never-run personas go ahead of stale ones.
- POST /api/v1/internal/personas/:id/actions
Batch ≤ 500. Row ids are deterministic
`${tickId}-${i}-${toolName}` + ON CONFLICT DO NOTHING so the
runner can retry a tick without doubling audit rows. Also
bumps personas.last_active_at so the next /due call sees it.
- POST /api/v1/internal/personas/:id/feedback
Batch ≤ 100. Row id is `${tickId}-${module}` — natural key is
one rating per module per tick.
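A minimal sketch of why the deterministic row ids make retries safe. A `Map` stands in for the persona_actions table, and skipping existing keys mimics ON CONFLICT DO NOTHING; the `ActionInput` / `AuditRow` shapes and function names are illustrative assumptions, not the mana-auth code.

```typescript
interface ActionInput {
  toolName: string;
}

interface AuditRow {
  id: string;
  toolName: string;
}

// Same tick, same position, same tool => same id on every retry.
function actionRowId(tickId: string, index: number, toolName: string): string {
  return `${tickId}-${index}-${toolName}`;
}

// The Map plays the role of a table with a primary key on id.
// Returns how many rows were actually inserted.
function insertActions(
  table: Map<string, AuditRow>,
  tickId: string,
  actions: ActionInput[]
): number {
  let inserted = 0;
  actions.forEach((action, i) => {
    const id = actionRowId(tickId, i, action.toolName);
    if (table.has(id)) return; // conflict: do nothing
    table.set(id, { id, toolName: action.toolName });
    inserted += 1;
  });
  return inserted;
}
```

Replaying the same batch for the same tick inserts nothing the second time, which is the property the runner relies on when it retries.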
Runner tick pipeline (services/mana-persona-runner/src/runner/)
- claude-session.ts
Two phases per tick. runMainTurn feeds the persona's system
prompt + a German "simulate a day" user prompt to Claude Agent
SDK's query(), with mana-mcp wired in as a streamable-HTTP MCP
server. We iterate the returned AsyncGenerator and extract
tool_use blocks into ActionRows; a tool_result with
is_error=true flips the most recent action. runRatingTurn is a
fresh query() with tools:[] asking Claude in character to rate
each used module 1-5 as strict JSON. We parse with tolerance
for whitespace / fences. Unparseable output becomes a synthetic
'__parse' feedback row so operators see the failure.
- tick.ts
Orchestrator. Skips when config.paused. Fetches /due, processes
in batches of config.concurrency via Promise.allSettled so a
single persona failure never kills the batch. Returns
{due, ranSuccessfully, failed[], durationMs}.
- types.ts
ActionRow + FeedbackRow shapes shared between claude-session
and the internal client.
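The tolerant parse of the rating turn might look something like this. The JSON shape (an array of `{ module, rating }`), the `notes` field and the `null` rating on the synthetic row are assumptions; only the '__parse' module name comes from the description above.

```typescript
interface FeedbackRowSketch {
  module: string;
  rating: number | null;
  notes?: string;
}

// Built at runtime so this sketch can itself live inside a fenced block.
const FENCE = '`'.repeat(3);
const FENCED_RE = new RegExp('^' + FENCE + '(?:json)?\\s*([\\s\\S]*?)\\s*' + FENCE + '$');

function parseRatings(output: string): FeedbackRowSketch[] {
  // Tolerate surrounding whitespace and an optional code fence around the JSON.
  const trimmed = output.trim();
  const fenced = trimmed.match(FENCED_RE);
  const body = fenced?.[1] ?? trimmed;
  try {
    const parsed: unknown = JSON.parse(body);
    if (!Array.isArray(parsed)) throw new Error('expected a JSON array');
    return parsed.map((entry) => {
      const { module, rating } = entry as { module?: unknown; rating?: unknown };
      if (
        typeof module !== 'string' ||
        typeof rating !== 'number' ||
        !Number.isInteger(rating) ||
        rating < 1 ||
        rating > 5
      ) {
        throw new Error('malformed rating entry');
      }
      return { module, rating };
    });
  } catch {
    // Surface the failure as data so operators see it next to real feedback.
    return [{ module: '__parse', rating: null, notes: output.slice(0, 200) }];
  }
}
```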
Runner bootstrap (src/index.ts)
- setInterval(config.tickIntervalMs) starts the tick loop on boot.
tickInFlight guards against overlap when Claude latency >
interval. If MANA_SERVICE_KEY or ANTHROPIC_API_KEY is missing,
loop is disabled with a warn line — /health + /diag/login still
work.
- POST /diag/tick (dev-only) fires one tick on demand, returns
the result. Avoids waiting a full interval during testing.
- Graceful SIGTERM/SIGINT shutdown clears the interval.
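A minimal sketch of the tickInFlight overlap guard, assuming the tick is an async function. `createGuardedTick` and its return shape are hypothetical names chosen so the skip is observable; the real bootstrap in src/index.ts may be structured differently.

```typescript
type Tick = () => Promise<void>;

interface GuardedResult {
  started: boolean; // false when a previous tick was still running
  done: Promise<void>;
}

function createGuardedTick(tick: Tick): () => GuardedResult {
  let tickInFlight = false;
  return () => {
    if (tickInFlight) {
      // Previous tick still running (e.g. Claude latency > interval): skip this one.
      return { started: false, done: Promise.resolve() };
    }
    tickInFlight = true;
    const done = Promise.resolve()
      .then(tick)
      .catch((err: unknown) => {
        console.warn('tick failed', err);
      })
      .finally(() => {
        // Always release the guard, even when the tick throws.
        tickInFlight = false;
      });
    return { started: true, done };
  };
}
```

Wiring it up would then be along the lines of `const timer = setInterval(createGuardedTick(runTick), config.tickIntervalMs)` with `clearInterval(timer)` in the SIGTERM/SIGINT handler, where `runTick` is a placeholder for the orchestrator call.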
Client
- clients/mana-auth-internal.ts
X-Service-Key client for the three endpoints above.
Constructor throws on empty serviceKey — fail loud.
Boot smoke verified: /health returns ok, /diag/tick 500s with
descriptive messages when keys absent. Warning lines on boot when
keys are missing. Type-check green across mana-auth, tool-registry,
mcp, persona-runner.
M3 exit gate is the end-to-end smoke recipe (docker up → db:push →
seed:personas → diag/tick → psql) documented in
services/mana-persona-runner/CLAUDE.md.
M2.d (cross-space family/team memberships) still deferred.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the M3 loop from docs/plans/mana-mcp-and-personas.md. The
runner now picks up due personas, drives them through Claude + MCP
for one simulated turn, collects actions + ratings, and persists
them through service-key internal endpoints in mana-auth.
Internal endpoints (mana-auth, service-key-gated)
- GET /api/v1/internal/personas/due
Returns personas whose tickCadence + lastActiveAt say they're
due. Rules: hourly > 1h, daily > 24h, weekdays > 24h mon-fri.
NULLS FIRST so never-run personas go ahead of stale ones.
- POST /api/v1/internal/personas/:id/actions
Batch ≤ 500. Row ids are deterministic
(`${tickId}-${i}-${toolName}`) + ON CONFLICT DO NOTHING so the
runner can retry a tick without doubling audit rows. Also
bumps personas.last_active_at so the next /due call sees it.
- POST /api/v1/internal/personas/:id/feedback
Batch ≤ 100. Row id is `${tickId}-${module}` — natural key is
one rating per module per tick.
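One possible reading of the /due rules, as a sketch. The message only gives "hourly > 1h, daily > 24h, weekdays > 24h mon-fri", so the UTC weekday check and the treatment of never-run weekday personas on weekends are assumptions; in the service this logic presumably lives in SQL.

```typescript
type TickCadence = 'hourly' | 'daily' | 'weekdays';

interface PersonaSketch {
  id: string;
  tickCadence: TickCadence;
  lastActiveAt: Date | null;
}

const HOUR_MS = 60 * 60 * 1000;

function isDue(persona: PersonaSketch, now: Date): boolean {
  const day = now.getUTCDay(); // assumption: weekdays are evaluated in UTC
  const isWeekday = day >= 1 && day <= 5;
  if (persona.tickCadence === 'weekdays' && !isWeekday) return false;
  if (persona.lastActiveAt === null) return true; // never run: always due
  const elapsed = now.getTime() - persona.lastActiveAt.getTime();
  const threshold = persona.tickCadence === 'hourly' ? HOUR_MS : 24 * HOUR_MS;
  return elapsed > threshold;
}

// NULLS FIRST: never-run personas sort ahead of stale ones, oldest first after that.
function duePersonas(personas: PersonaSketch[], now: Date): PersonaSketch[] {
  return personas
    .filter((p) => isDue(p, now))
    .sort((a, b) => {
      if (a.lastActiveAt === null) return b.lastActiveAt === null ? 0 : -1;
      if (b.lastActiveAt === null) return 1;
      return a.lastActiveAt.getTime() - b.lastActiveAt.getTime();
    });
}
```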
Runner tick pipeline (services/mana-persona-runner/src/runner/)
- claude-session.ts
Two phases per tick. runMainTurn feeds the persona's system
prompt + a German "simulate a day" user prompt to Claude Agent
SDK's query(), with mana-mcp wired in as a streamable-HTTP MCP
server. We iterate the returned AsyncGenerator and extract
tool_use blocks into ActionRows; tool_result with is_error=true
flips the most recent action. runRatingTurn is a fresh query()
with tools:[] asking Claude in character to rate each used
module 1-5 as strict JSON, which we parse with tolerance for
surrounding whitespace / fences. Unparseable output becomes a
synthetic '__parse' feedback row so operators see the failure.
- tick.ts
Orchestrator. Skips if config.paused. Fetches /due, processes
in batches of config.concurrency (Promise.allSettled so one
failure doesn't kill the batch), returns {due, ranSuccessfully,
failed[], durationMs}.
- types.ts
ActionRow and FeedbackRow shapes shared between claude-session
and the internal client; mirrors the mana-auth schema but in
narrow plain TS for the wire.
Runner bootstrap (src/index.ts)
- setInterval(config.tickIntervalMs) starts the tick loop on boot.
tickInFlight guards against overlap when Claude latency > interval.
If MANA_SERVICE_KEY or ANTHROPIC_API_KEY is missing, loop is
disabled with a warn line — /health still works, /diag/login
still works.
- New dev-only POST /diag/tick fires a single tick on demand and
returns the result, so you can verify without waiting 60 s.
- Graceful SIGTERM/SIGINT shutdown clears the interval.
Client
- clients/mana-auth-internal.ts
X-Service-Key client for the three endpoints above. Constructor
throws if serviceKey is empty — fail loud, not silent.
Boot smoke: /health returns ok regardless of keys; /diag/tick returns a
descriptive 500 when keys are absent and 200/JSON when present. Warning
lines show up on boot for missing keys. Type-check green across
mana-auth, tool-registry, mcp, persona-runner.
End-to-end smoke recipe (docker up → db:push → seed:personas →
diag/tick → psql) documented in
services/mana-persona-runner/CLAUDE.md. That's the M3 exit gate.
M2.d (cross-space family/team memberships) still deferred.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Enables the M1 parallel-reads optimisation on the webapp side. Both
consumers of runPlannerLoop pass an isParallelSafe predicate derived
from the tool catalog:
isParallelSafe: (name) =>
AI_TOOL_CATALOG_BY_NAME.get(name)?.defaultPolicy === 'auto'
Auto-policy tools (list_tasks, get_habits, nutrition_summary, …) run
via Promise.all in batches of 10 when the LLM fans them out in one
round. Propose-policy tools — which surface to the user as Proposal
cards — stay sequential so intent ordering in the inbox is preserved
and pre-execute guardrails can reason about prior-step state.
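To illustrate how the predicate decides a round's dispatch mode, here is a small sketch. The two catalog entries, the `ToolCall` shape and `planRound` are stand-ins; the batch size of 10 and the "one unsafe call keeps the round sequential" rule follow the M1 loop description.

```typescript
interface CatalogEntry {
  defaultPolicy: 'auto' | 'propose';
}

interface ToolCall {
  id: string;
  name: string;
}

type RoundPlan =
  | { mode: 'sequential'; calls: ToolCall[] }
  | { mode: 'parallel'; batches: ToolCall[][] };

// Stand-in catalog with two illustrative entries; the real one is AI_TOOL_CATALOG_BY_NAME.
const CATALOG_BY_NAME = new Map<string, CatalogEntry>([
  ['list_tasks', { defaultPolicy: 'auto' }],
  ['create_task', { defaultPolicy: 'propose' }],
]);

const isParallelSafe = (name: string): boolean =>
  CATALOG_BY_NAME.get(name)?.defaultPolicy === 'auto';

const PARALLEL_TOOL_BATCH_SIZE = 10;

function planRound(calls: ToolCall[], safe?: (name: string) => boolean): RoundPlan {
  // No predicate, or any non-safe call, keeps the whole round sequential.
  if (!safe || calls.length === 0 || !calls.every((c) => safe(c.name))) {
    return { mode: 'sequential', calls };
  }
  const batches: ToolCall[][] = [];
  for (let i = 0; i < calls.length; i += PARALLEL_TOOL_BATCH_SIZE) {
    batches.push(calls.slice(i, i + PARALLEL_TOOL_BATCH_SIZE));
  }
  return { mode: 'parallel', batches };
}
```

Unknown tool names resolve to `undefined` in the lookup and therefore count as not parallel-safe, which is the conservative default.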
Tests: 31 existing companion + mission tests pass unchanged; the
parallel path is exercised via the new loop.test.ts cases shipped
with the M1 commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
M2 of docs/plans/me-images-and-reference-generation.md — the Settings
surface that sits on top of the M1 data layer. Users can now upload
a Face and a Fullbody reference into two primary slots, toss extra
references into a grid, and toggle each image's "KI darf nutzen" flag
individually.
Route placement: /profile/me-images (not /settings/me-images as the
plan originally proposed). The repo convention is per-module subroutes
(/todo/settings, /invoices/settings, …) — there is no global /settings
namespace to hang this off. Plan doc updated accordingly.
- MeImageUploadZone: drag-and-drop + file-picker, pattern from
picture/ListView but refactored into a reusable component. Fires
onFiles(File[]) so the parent decides kind + slot.
- MeImageSlotCard: large card for Face / Fullbody primary slots.
When filled it shows the portrait + the image's AI-toggle + delete
+ a compact "Neues Bild setzen" replacement zone. When empty it
collapses into a large drop-zone.
- MeImageTile: grid tile for everything that isn't currently holding
a primary slot — thumbnail, kind badge, Robot-AI-toggle, Star
primary-promotion (only enabled for kinds that map to a slot),
Trash delete.
- MeImagesView: orchestrates queries (useImageByPrimary for each
slot + useAllMeImages for the rest), upload flow (readDimensions →
uploadMeImageFile → store.createMeImage → optional setPrimary in
the same tick), and the three write actions (toggleAi, togglePrimary,
delete). Dropping a file on a slot drop-zone both uploads and claims
the slot, so the old holder automatically falls into the grid.
- Client: profile/api/me-images.ts wraps the M1 endpoint with
authStore.getValidToken() → Bearer header and a small
readImageDimensions helper that exposes natural width/height
synchronously (mana-media reports them later but we want them for
the Dexie row's first write).
- Discoverability: profile ListView "Konto" tab gains a "Meine Bilder"
action button that navigates to the new route with a one-line hint.
Still open (later commits): the hard-migration that rewrites
auth.users.image → meImages(primaryFor='avatar'), the global
aiUsesReferenceImages kill-switch (lives on profile singleton), and
the Picture-generator's Reference picker (M4, rides on top of M3's
backend endpoint).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
First concrete piece of M3 (docs/plans/mana-mcp-and-personas.md). The
tick loop itself and the Claude Agent SDK + MCP integration are M3.b;
the action/feedback persistence endpoints are M3.c. This commit just
stands up the service so the remaining pieces have a shell to land in.
Service shape (Bun/Hono on :3070)
- src/config.ts
Env-driven configuration: auth URL, MCP URL, service key for
action/feedback callbacks (M3.c), Anthropic API key, deterministic
PERSONA_SEED_SECRET (must match scripts/personas/password.ts so the
runner can log back in without any stored credentials), tick
interval and concurrency, RUNNER_PAUSED kill-switch. Production
start asserts all secrets are set and the dev fallback secret is
rotated.
- src/password.ts
Bit-for-bit identical HMAC-SHA256 password derivation to
scripts/personas/password.ts. Duplicated deliberately: the two
sides can't share code (one is a repo-root utility script, the
other is a workspace service) but must stay in sync — comment
at the top calls this out.
- src/clients/auth.ts
Two upstream calls the runner needs for one tick: POST /auth/login
and GET /api/auth/organization/list. loginAndResolvePersonalSpace()
wraps both and picks the persona's auto-created personal space as
the write target (throws if none exists — Spaces-Foundation should
always have seeded one on signup).
- src/index.ts
Hono app: /health, /metrics (stub), and a dev-only /diag/login
endpoint that takes a persona email, derives the password, logs
in, resolves the personal space, and returns {userId, spaceId} as
an end-to-end sanity check. Disabled in production.
No tick loop yet — RUNNER_PAUSED prints an info line on boot, but
nothing fires. The dispatcher + Claude Agent SDK + MCP client land in
M3.b; the internal POST callbacks into mana-auth for persona_actions /
persona_feedback land in M3.c.
Infra
- Port 3070 added to docs/PORT_SCHEMA.md.
- Service listed in root CLAUDE.md next to mana-mcp.
- services/mana-persona-runner/CLAUDE.md documents what's built today,
what lands in M3.b/c, and the local diag smoke recipe.
Boot smoke verified: /health returns ok + paused/interval/concurrency,
/diag/login without email returns 400.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wires the M1 reminderChannel into the mana-ai mission runner with two
initial producers in services/mana-ai/src/planner/reminders.ts:
- tokenBudgetReminder — warns at 75% of the agent's daily cap, emits a
stronger "wrap up NOW" message at/above 100%. Uses pretick usage +
accumulated round usage so the warning tracks drift during a long
plan.
- retryLoopReminder — shape is in place (round≥3 + last 2 failures),
but it can currently inspect only the single lastCall that LoopState
exposes. It extends cleanly once LoopState carries the full failure window.
buildReminderChannel composes active producers; the tick hoists
pretickUsage24h so the channel has the baseline. Each round the loop
re-evaluates the producers, so usage drift across rounds surfaces on
the NEXT turn.
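A rough sketch of the threshold logic and the composition. `LoopStateSketch` is a simplified stand-in for the real LoopState (which carries round, toolCallCount, usage, lastCall), and the message wording is invented for illustration.

```typescript
interface LoopStateSketch {
  round: number;
  usageTokens: number; // tokens accumulated over the rounds of this plan so far
}

type ReminderProducer = (state: LoopStateSketch) => string | null;

// Baseline (pretick 24h usage) plus round usage, compared against the daily cap.
function tokenBudgetReminder(pretickUsage24h: number, dailyCap: number): ReminderProducer {
  return (state) => {
    if (dailyCap <= 0) return null;
    const ratio = (pretickUsage24h + state.usageTokens) / dailyCap;
    if (ratio >= 1) return 'Daily token budget exhausted: wrap up NOW.';
    if (ratio >= 0.75) return 'Daily token budget is at 75% or more: start wrapping up.';
    return null;
  };
}

// Runs every producer each round and keeps only the reminders that fired.
function buildReminderChannel(
  producers: ReminderProducer[]
): (state: LoopStateSketch) => string[] {
  return (state) =>
    producers
      .map((produce) => produce(state))
      .filter((msg): msg is string => msg !== null);
}
```

Because the channel is called with a fresh state snapshot every round, a plan that starts below the threshold can still trigger the warning a few rounds later.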
Also exports LoopState + ReminderChannel from @mana/shared-ai top-level
so consumers don't need to reach into /planner.
Tests: 13 new bun tests covering thresholds, pretick+round summing,
composition, and per-round re-evaluation.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three Claude-Code-inspired primitives for runPlannerLoop, derived from the
reverse-engineering reports in docs/reports/:
1. **Policy gate** (@mana/tool-registry) — evaluatePolicy() gates every tool
dispatch: denies admin-scope, denies destructive tools not in the user's
opt-in list, rate-limits per tool (30/60s default), flags prompt-injection
markers in freetext without blocking. Wired into mana-mcp with a
per-user rolling invocation log and POLICY_MODE env (off|log-only|enforce,
default log-only). mana-ai uses detectInjectionMarker only — tool dispatch
there is plan-only, so rate-limit/destructive checks don't apply yet.
2. **Reminder channel** (packages/shared-ai/src/planner/loop.ts) — new
reminderChannel callback in PlannerLoopInput. Called once per round with
LoopState snapshot (round, toolCallCount, usage, lastCall); returned
strings wrap in <reminder> tags and inject as transient system messages
into THIS LLM request only. Never pushed to messages[] — the Claude-Code
<system-reminder> pattern that keeps the KV-cache prefix stable.
3. **Parallel reads** (loop.ts) — isParallelSafe predicate enables
Promise.all dispatch when every tool_call in a round is parallel-safe,
in batches of PARALLEL_TOOL_BATCH_SIZE=10. Any non-safe call downgrades
the whole round to sequential. messages[] always appends in source
order, never completion order, so the debug log stays linear.
Default-off (undefined predicate) preserves pre-M1 behaviour.
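The per-tool rate limit in the policy gate (30 calls per 60 s by default) can be pictured as a rolling invocation log. This is a sketch under assumptions: the class name, the key format and the injectable clock are illustrative, not the @mana/tool-registry implementation.

```typescript
class RollingRateLimiter {
  private readonly log = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number = 30, windowMs: number = 60_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and records the call if the key is under its limit.
  tryAcquire(key: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Drop timestamps that have rolled out of the window.
    const recent = (this.log.get(key) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < this.limit;
    if (allowed) recent.push(now);
    this.log.set(key, recent);
    return allowed;
  }
}
```

A per-user, per-tool key such as `${userId}:${toolName}` would give each user an independent window per tool. In log-only mode the caller would record the would-be denial and still dispatch the tool.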
Tests: 21 new in tool-registry (policy), 9 new in shared-ai (5 parallel,
4 reminder). All 74 green, type-check clean across 4 packages.
Design/plan: docs/plans/agent-loop-improvements-m1.md
Reports: docs/reports/claude-code-architecture.md,
docs/reports/mana-agent-improvements-from-claude-code.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Continuation of docs/plans/mana-mcp-and-personas.md. Personas are the
auto-test users the M3 runner will drive — they're real Mana users
(kind='persona', tier='founder'), registered through the same Better
Auth pipeline as humans, just stamped differently and metadata-tracked
so the persona-runner knows how to role-play them.
Schemas (auth namespace — personas are 1:1 with users, no reason for a
separate platform.* schema that the plan originally sketched)
- userKindEnum ('human' | 'persona' | 'system') + users.kind column,
wired into better-auth additionalFields so the JWT/user object carry
the flag. Default 'human' keeps every existing user untouched.
- auth.personas — 1:1 descriptor (archetype, systemPrompt, moduleMix
jsonb, tickCadence, lastActiveAt). CASCADE from users.id.
- auth.persona_actions — tick-grouped audit of every tool call the
runner makes (toolName, inputHash for dedup, result, latency).
- auth.persona_feedback — structured 1-5 ratings per module per tick,
plus free-text notes. This is where the runner writes the
self-reflection step at end of each tick.
Admin endpoints (/api/v1/admin/personas, admin-tier-gated)
- POST / create-or-update by email. Uses auth.api.signUpEmail
if the user's new, then stamps kind+tier+verified
and upserts the personas row. Idempotent — safe to
re-run after catalog edits.
- GET / list with 7-day action count per persona.
- GET /:id detail + recent 20 actions + per-module feedback
aggregate.
- DELETE /:id hard delete. Refuses non-persona users as
defense-in-depth: an admin typo here would cascade
through the full user-delete chain.
Catalog + seed pipeline (scripts/personas/)
- catalog.json 10 handwritten personas spanning 7 archetypes
(adhd-student, ceo-busy, creative-parent, solo-dev,
researcher, freelancer, overwhelmed-newbie).
Five pairs of personas will later share
family/team spaces (cross-space setup is deferred
to M2.d per the plan).
- catalog.ts zod-validated loader. Refines email to require the
@mana.test domain (.test is a reserved TLD, so no bounce risk).
- password.ts deterministic HMAC-SHA256(PERSONA_SEED_SECRET,
email). No stored per-persona credentials; the
runner re-derives on every login. Refuses the
dev-fallback secret in production.
- seed.ts POST /admin/personas per catalog entry. Flags:
--auth=, --jwt=, --dry-run.
- cleanup.ts Hard-delete every live persona. Warns when the
live set drifts from the catalog.
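The deterministic derivation in password.ts boils down to a keyed hash over the email. A minimal sketch, assuming hex output and no email normalisation (the actual encoding and any prefix or suffix are not stated here, and the function name is hypothetical):

```typescript
import { createHmac } from 'node:crypto';

// HMAC-SHA256(PERSONA_SEED_SECRET, email), hex-encoded (encoding is an assumption).
function derivePersonaPassword(seedSecret: string, email: string): string {
  if (seedSecret.length === 0) throw new Error('PERSONA_SEED_SECRET must not be empty');
  return createHmac('sha256', seedSecret).update(email).digest('hex');
}
```

Since nothing is stored, every consumer that needs to log in as a persona has to reproduce this function exactly; any divergence in secret, input or encoding yields a different password.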
Root package.json:
pnpm seed:personas
pnpm seed:personas:cleanup
Extends the ESLint root-ignore list with `scripts/**` so Bun-typed
utility scripts don't fail the typed-parser check they weren't opted
into. Consistent with the rest of scripts/ being .mjs+.sh.
To go live (user action):
pnpm docker:up
cd services/mana-auth && bun run db:push
export MANA_ADMIN_JWT=...
pnpm seed:personas
M2.d deferred: cross-space (family/team/practice) memberships between
persona pairs. Better Auth's org-invite flow is multi-step and would
roughly double the M2 scope; the persona-runner (M3) can operate in
personal spaces first, shared-space tests land as their own milestone.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two subdomains the webapp references in its SSR-injected config but
that had no tunnel entry:
- events.mana.how → mana-events on :3065. The container itself was
also missing (defined in compose but never started); started
today so the route now terminates somewhere real.
- research.mana.how → mana-research on :3068. The webapp was built
with PUBLIC_MANA_RESEARCH_URL empty, which made research fetches
fall back to mana.how and 404. The env-var side is still pending
a rebuild, but the tunnel side is live now.
Cloudflare CNAMEs already created via `tunnel route dns`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
M1 of docs/plans/me-images-and-reference-generation.md — a user-owned
pool of reference images (face, fullbody, hands, …) that will back
image generation where the user appears as themselves (outfit try-on,
glasses, portraits) via OpenAI /v1/images/edits. Data layer only in
this commit; UI lands in M2, the edits endpoint in M3.
- Dexie v38: meImages table with id/kind/primaryFor/createdAt indices.
Added to USER_LEVEL_TABLES so the hook stamps userId and skips the
spaceId/authorId/visibility trio (one human = one face across every
Space, not per-Space).
- Encryption registry: label + tags encrypted; kind/primaryFor/usage
stay plaintext because they drive the indexed queries and the
Reference picker's filtering. mediaId/URLs/dimensions are structural.
- Profile module store: createMeImage, updateMeImage,
setAiReferenceEnabled (per-image KI opt-in — plan decision #5),
setPrimary (transactional slot swap — only one row per primary slot),
deleteMeImage. Emits MeImage* domain events.
- Queries: useAllMeImages, useMeImagesByKind, useReferenceImages
(only the rows the user opted in for KI), useImageByPrimary.
- POST /api/v1/profile/me-images/upload: thin wrapper over mana-media
with app='me' as the reference tag. No new MinIO bucket — plan
decision #1 revised after verifying mana-media uses one bucket and
only tags references by app.
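The "only one row per primary slot" invariant behind setPrimary can be sketched as a pure function over rows. In the store this would run inside a Dexie transaction; the row shape and slot strings here are simplified assumptions.

```typescript
interface MeImageRowSketch {
  id: string;
  primaryFor: string | null;
}

// Claims `slot` for row `id` and releases it from whichever row held it before.
function setPrimary(rows: MeImageRowSketch[], id: string, slot: string): MeImageRowSketch[] {
  if (!rows.some((row) => row.id === id)) throw new Error(`unknown meImage ${id}`);
  return rows.map((row) => {
    if (row.id === id) return { ...row, primaryFor: slot };
    if (row.primaryFor === slot) return { ...row, primaryFor: null };
    return row;
  });
}
```

Doing both updates in one step is what prevents a window in which two rows, or none, hold the slot.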
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two Playwright-based diagnostic scripts for investigating
production-only browser issues that curl can't reproduce:
- scripts/smoke-prod.mjs: loads mana.how like a fresh incognito
tab, waits a configurable budget, reports every console error,
request failure, still-pending request, and slow resource.
- scripts/smoke-prod-load.mjs: measures DOMContentLoaded + load
event timing explicitly. Distinguishes "app interactive" from
"browser tab spinner stops".
Run: `node apps/mana/apps/web/scripts/smoke-prod.mjs`
MANA_URL=https://mana.how/login MANA_WAIT_MS=45000 node ...
Used today to rule out server-side issues in a loader-hang report
that reproduced only in one specific browser profile.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Foundation for autonomous Claude-driven testing. Plan:
docs/plans/mana-mcp-and-personas.md.
New packages
- @mana/tool-registry — schema-first ToolSpec<InputSchema, OutputSchema>
with zod generics, scope ('user-space' | 'admin') and policyHint
('read' | 'write' | 'destructive'). sync-client helpers speak the
mana-sync push/pull protocol directly so RLS and field-level LWW are
preserved. MasterKeyClient fetches per-user MKs via the existing
mana-auth GET /api/v1/me/encryption-vault/key endpoint (JWT-gated,
ZK-aware, already audited) — no new service-key endpoint built.
ZeroKnowledgeUserError surfaced as a typed throw.
- @mana/shared-crypto — AES-GCM-256 primitives extracted from the web
app's $lib/data/crypto/aes.ts so the server-side tool handlers and the
browser produce byte-for-byte identical wire format
(enc:1:{b64(iv)}.{b64(ct)}). Web app aes.ts now re-exports from
shared-crypto — 5 existing importers unchanged, svelte-check stays
green.
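For orientation, the enc:1:{b64(iv)}.{b64(ct)} wire format could be produced and consumed on the server roughly like this. This sketch uses node:crypto rather than the shared-crypto code itself, and it assumes a 12-byte IV, standard base64, and the GCM auth tag appended to the ciphertext (the WebCrypto layout); none of those details are confirmed by the message above.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

const WIRE_PREFIX = 'enc:1:';
const IV_BYTES = 12; // assumption: the common GCM nonce size
const TAG_BYTES = 16;

function encryptField(key: Buffer, plaintext: string): string {
  const iv = randomBytes(IV_BYTES);
  const cipher = createCipheriv('aes-256-gcm', key, iv);
  // Ciphertext followed by the auth tag, mirroring what WebCrypto returns.
  const body = Buffer.concat([
    cipher.update(plaintext, 'utf8'),
    cipher.final(),
    cipher.getAuthTag(),
  ]);
  return `${WIRE_PREFIX}${iv.toString('base64')}.${body.toString('base64')}`;
}

function decryptField(key: Buffer, wire: string): string {
  if (!wire.startsWith(WIRE_PREFIX)) throw new Error('not an enc:1 value');
  const [ivB64, bodyB64] = wire.slice(WIRE_PREFIX.length).split('.');
  if (!ivB64 || !bodyB64) throw new Error('malformed enc:1 value');
  const iv = Buffer.from(ivB64, 'base64');
  const body = Buffer.from(bodyB64, 'base64');
  if (body.length < TAG_BYTES) throw new Error('ciphertext too short');
  const decipher = createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(body.subarray(body.length - TAG_BYTES));
  return Buffer.concat([
    decipher.update(body.subarray(0, body.length - TAG_BYTES)),
    decipher.final(), // throws if the tag does not verify
  ]).toString('utf8');
}
```

The standard base64 alphabet contains no `.`, so splitting on the dot is unambiguous.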
New service
- services/mana-mcp (:3069, Bun/Hono) — MCP Streamable HTTP gateway.
JWKS auth against mana-auth, per-user session isolation (session-id
belongs to the user who opened it — cross-user access returns 403),
admin-scoped tools filtered out before registration. MasterKeyClient
cached per process with a 5-minute TTL.
11 tools registered
- habits.{create,list,update,archive}, spaces.list (plaintext, M1)
- todo.{create,list,complete}, notes.{create,search}, journal.add
(encrypted — field lists match
apps/mana/apps/web/src/lib/data/crypto/registry.ts verbatim)
Infra
- Port 3069 added to docs/PORT_SCHEMA.md
- services/mana-mcp/CLAUDE.md with architecture, auth model,
tool-authoring recipe, local smoke-test steps
- Root CLAUDE.md services list updated
Type-check green across shared-crypto, mana-tool-registry, mana-mcp.
svelte-check on apps/mana/apps/web stays at 0 errors / 0 warnings.
Boot smoke verified: /health returns registry.loaded=true, unauthed
/mcp → 401, invalid-JWT /mcp → 401 with descriptive message.
Decisions locked in for later milestones (per plan D1–D10):
- Personas will be real mana-auth users (users.kind='persona'), no
service-key bypass (D1, D2)
- Tool-registry is the SSOT; mana-ai and the legacy
apps/api/src/mcp/server.ts get merged into it in M4 (three current
parallel tool catalogs collapse to one)
- Persona-runner (:3070) will be a separate service using the Claude
Agent SDK + MCP client (D5)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two pieces of the same cleanup:
1. build-app.sh now passes `--env-file .env.macmini` explicitly via a
shared COMPOSE_ARGS array. Without it, docker compose silently fell
back to `.env` in the project root — a separate file that happened
to hold MANA_AUTH_KEK and other secrets that `.env.macmini` lacked.
deploy.sh, restart.sh, and the CD workflow already used the flag;
this aligns build-app.sh with the rest. Server-side .env.macmini
was reconciled 2026-04-23 with the union of both files, so the
duplicate `.env` is no longer needed.
2. .env.macmini.example now documents 7 keys the prod stack actually
depends on but that had never been listed: GOOGLE_GEMINI_API_KEY /
GOOGLE_GENAI_API_KEY (SDK aliases for Deep-Research + mana-ai),
MANA_AI_PRIVATE_KEY_PEM / MANA_AI_PUBLIC_KEY_PEM (Mission-Grant
keypair), MANA_AI_DEEP_RESEARCH_ENABLED + PUBLIC_AI_MISSION_GRANTS
(feature flags), MANA_CORE_SERVICE_KEY (legacy alias), and the STT/
TTS internal shared secrets.
Matrix-bot tokens deliberately left undocumented — no Matrix homeserver
in the current running stack.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
apps/api/package.json lists @mana/shared-ai and @mana/shared-rss as
workspace deps, but the Dockerfile's builder stage never copied their
source. pnpm silently skipped the symlinks, and bun hit ENOENT on every
articles / ai import at runtime. Same class as 70c62e758 (shared-logger
in mana-auth) and the shared-types fix one commit earlier.
Without this, any push that triggered a mana-api rebuild failed its
health check and took mana-web offline with it via depends_on.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mana-auth's package.json declares @mana/shared-types as a workspace
dependency, but the Dockerfile's install stage never copied its source
into the build context. pnpm then silently failed to create the
workspace symlink under node_modules, and bun hit ENOENT on every
import at runtime: "reading /app/services/mana-auth/node_modules/
@mana/shared-types".
The broken image sat undetected as long as the long-running container
didn't restart. Tonight's deploy recreated it and every mana-auth
container immediately crash-looped — taking mana-api and mana-web
down with it via depends_on.
Same class of bug as 70c62e758 (shared-logger).
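One way to catch this class of bug before deploy is a small check that compares the manifest's workspace dependencies against the Dockerfile's COPY lines. The sketch below is hypothetical and not part of this commit; the function name, the name-to-directory map, and the example paths are all assumptions.

```typescript
// Hypothetical pre-deploy check; not part of the repository.
interface PackageManifest {
  dependencies?: Record<string, string>;
}

// workspaceDirs maps a package name to its repo-relative source directory
// (the layout used in the usage example is an assumption).
function missingWorkspaceCopies(
  manifest: PackageManifest,
  workspaceDirs: Record<string, string>,
  dockerfile: string,
): string[] {
  const copyLines = dockerfile
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.startsWith("COPY "));

  const missing: string[] = [];
  for (const [name, range] of Object.entries(manifest.dependencies ?? {})) {
    if (!range.startsWith("workspace:")) continue; // only pnpm workspace deps
    const dir = workspaceDirs[name];
    // Unknown directory or no COPY line mentioning it: report the package.
    if (dir === undefined || !copyLines.some((line) => line.includes(dir))) {
      missing.push(name);
    }
  }
  return missing;
}
```

For example, with a Dockerfile that copies only `packages/shared-logger`, a manifest that also declares `@mana/shared-types` as `workspace:*` would be reported as missing.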
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two bugs made the Mac Mini auto-deploy silently miss everything on a
multi-commit push:
1. Diff range was HEAD~1..HEAD, so a push with N commits only checked
the tip. Now uses github.event.before..sha, with a safe fallback to
HEAD~1 when the before SHA is absent (first push, force reset).
2. Service list was still the legacy per-product web/backend apps
(todo-web, chat-web, calendar-web, …) that were consolidated into
`mana-web` + `mana-api` months ago. The unified services didn't
exist in the workflow, so a push touching apps/mana/apps/web or
apps/api never rebuilt them.
Rewrite:
- Collapse per-service outputs into one `services` output driven by a
SERVICE_SOURCES array (add a new service by adding one line).
- Expanded service surface: mana-ai, mana-research, mana-events,
mana-user, mana-subscriptions, mana-analytics, mana-llm, mana-api,
mana-web, mana-credits, mana-geocoding, manavoxel-web — alongside
the Go services + memoro + landing-builder.
- Removed dead entries: todo/chat/calendar/clock/contacts/music/
storage/memoro-web variants.
- Expanded sveltekit-base trigger (any commit to shared-pwa /
shared-vite-config / root Dockerfile / pnpm-lock forces a base
rebuild — those were invisible before).
- Updated health-check URLs to match the running containers' actual host
ports (PORT_SCHEMA.md prose + table disagreed; docker ps wins).
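The change-detection logic described above can be sketched as two small functions. This is an illustration only, not the workflow's code: the function names are hypothetical, the two SERVICE_SOURCES entries use just the paths mentioned in this message, and treating GitHub's all-zero SHA as "absent" is an assumption about how the fallback is triggered.

```typescript
// Hypothetical sketch of the diff-range and service-mapping logic.
const ZERO_SHA = "0".repeat(40); // sent as `before` when a branch is created

function diffRange(before: string | undefined, sha: string): string {
  const usable = before !== undefined && before !== "" && before !== ZERO_SHA;
  // Cover every commit in the push; fall back to the tip-only range otherwise.
  return usable ? `${before}..${sha}` : "HEAD~1..HEAD";
}

// One line per service: [service name, source path prefix].
const SERVICE_SOURCES: ReadonlyArray<readonly [string, string]> = [
  ["mana-web", "apps/mana/apps/web/"],
  ["mana-api", "apps/api/"],
];

function affectedServices(changedFiles: readonly string[]): string[] {
  return SERVICE_SOURCES
    .filter(([, prefix]) => changedFiles.some((file) => file.startsWith(prefix)))
    .map(([service]) => service);
}
```

Driving the output from a single table means that adding a service is a one-line change, and a service missing from the table is easy to spot in review.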
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Merges the feature-rich gallery (search, tag filters, favorites toggle,
view-mode toggles, detail modal) that previously lived in
routes/(app)/picture/+page.svelte INTO modules/picture/ListView.svelte,
and keeps the upload affordances (drag-and-drop, upload button, progress
chips) from the old ListView.
Route shrinks to a 3-liner: <RoutePage appId="picture"><ListView /></RoutePage>.
Responsive behaviour uses CSS container queries (@container inline-size)
on the ListView root. Below ~560px (carousel card width) the search bar,
tag chips and view-mode toggles hide; action-strip buttons drop to
icon-only. Above that breakpoint (route context, ≥~720px up to the
layout's max-w-7xl) everything is visible.
Drag-over handler distinguishes file drags from cross-module drag data
via dataTransfer.types.includes('Files'), so the upload overlay only
appears for real file drops — workbench card-to-card drags pass through
to the wrapping AppPage's dropTarget.
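The file-versus-card distinction reduces to a check on the drag's type list. A minimal sketch, with hypothetical function names; a real handler would pass in `event.dataTransfer?.types`:

```typescript
// Hypothetical helpers; the component's actual handler is not shown here.
function isFileDrag(types: readonly string[] | undefined): boolean {
  // Browsers include the literal entry "Files" when the drag carries files.
  return types !== undefined && types.includes("Files");
}

function dragOverAction(
  types: readonly string[] | undefined,
): "show-upload-overlay" | "pass-through" {
  // Non-file drags are left alone so an outer drop target can handle them.
  return isFileDrag(types) ? "show-upload-overlay" : "pass-through";
}
```

Checking the type list during drag-over works because browsers expose the types before the drop, even though the file contents are not readable yet.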
Data source changes from context-based (getContext('allImages')) to
direct Dexie live-queries via ./queries, so the component works in both
the carousel (no layout context) and the route (layout still provides
context for /picture/archive and /picture/board).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Every +page.svelte under routes/(app) now renders inside workbench-card
chrome. Before, sub-routes floated directly on the app-shell background
— card-style paper/border/shadow only existed on the homepage carousel,
leaving /library, /notes, /picture, /finance etc. visually disconnected
from the rest of the app.
Coverage:
- 28 SIMPLE routes (body is a single <ListView />): <RoutePage appId="...">
- 43 top-level main routes: <RoutePage> with preserved internal markup
- 122 sub-routes (/X/[id], /X/new, /X/settings, …): <RoutePage> with
backHref pointing at the parent listing. Title overrides for detail
pages (e.g. "Rechnung", "Deck", "Eintrag").
- Articles tab children (/articles/list, /favorites, /highlights, /stats)
get explicit title overrides ("Artikel · Leseliste", etc.).
A handful of special cases:
- calc/standard: <svelte:window> hoisted outside RoutePage (Svelte forbids
window bindings inside component children).
- agents/templates: {#snippet templateCard} hoisted outside so both {#each}
blocks inside RoutePage can @render it via page-scope lookup.
- citycorners redirect-stubs (add/, locations/[id]/, map/): left unwrapped,
since they only call goto() from onMount and have no body to wrap.
- 3 carousel routes (/, /todo, /contacts) keep their PageCarousel wrapping
untouched — they already provide card chrome.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a third provider path to /api/v1/picture/generate that calls OpenAI
gpt-image-2 when model starts with "openai/". Supports batch generation
(n=1..4) with character continuity. The base64 response is decoded
server-side and uploaded to mana-media for dedup + thumbnails. Credit
cost scales with quality (low=3, medium=10, high=25) × n.
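The routing and pricing rule can be sketched as follows. The function names and the error handling are hypothetical; the prefix test, the per-quality credit values, and the 1..4 batch range come from the description above.

```typescript
// Hypothetical sketch; the route's actual code is not shown here.
type Quality = "low" | "medium" | "high";

const QUALITY_CREDITS: Record<Quality, number> = { low: 3, medium: 10, high: 25 };

function isOpenAiModel(model: string): boolean {
  return model.startsWith("openai/");
}

function creditCost(quality: Quality, n: number): number {
  // Rejecting out-of-range batch sizes is an assumption about validation.
  if (!Number.isInteger(n) || n < 1 || n > 4) {
    throw new RangeError("n must be an integer from 1 to 4");
  }
  return QUALITY_CREDITS[quality] * n;
}
```

Under this rule a batch of four high-quality images costs 25 × 4 = 100 credits, while a single low-quality image costs 3.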
Env plumbing:
- scripts/generate-env.mjs: new apps/api/.env stanza propagates
OPENAI_API_KEY + REPLICATE_API_TOKEN from .env.secrets
- .env.macmini.example: documents OPENAI_API_KEY for prod
Frontend /picture/generate: model + quality + aspect-ratio + batch-count
selectors, real fetch with auth, persists each image via imagesStore.insert
(encrypted + synced). Wrapped in ModuleShell variant=fill with back-arrow
to /picture and a live credit badge in the header actions slot.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the old PageShell (workbench-only) with a single ModuleShell that
serves both carousel cards (variant=card, width-sized, window actions) and
sub-routes (variant=fill, fills main area, optional back button). RoutePage
wraps ModuleShell with auto-metadata lookup from the app-registry so every
(app)/*/+page.svelte can stay a three-liner.
Drops the dead onMinimize prop-drilling that was declared on PageShell but
never rendered — TodoPage/ContactPage callers cleaned up too.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
c413ab7dd was reverted by c31dcdd66; the re-apply (3a7bc7f1c) only
brought back the mana-research tests, not my sweep. Restored in
af4fd2776. Update the shipping-log row + the attribution note so
future readers find the actual payload.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>