managarten

mirror of https://github.com/Memo-2023/mana-monorepo.git synced 2026-05-14 20:41:09 +02:00

Author	SHA1	Message	Date
Till JS	79d112657c	feat(personas): M5.a - Playwright visual suite scaffold	2026-04-23 14:33:06 +02:00

Smallest possible foundation for the persona-driven visual regression suite (M5 in docs/plans/mana-mcp-and-personas.md). One flow, two viewports, one persona, enough to prove the stack end-to-end: seed-script → mana-auth → API login → cookie injection → web app → screenshot → disk. Extending is copy-paste per flow.

tests/personas/

  playwright.config.ts
    Own config separate from the root tests/e2e/ suite. Two viewports (1440×900 desktop Chrome + Pixel 5 mobile); more can be added once baselines settle without quadrupling the review load. Diff threshold 0.2%, animations disabled, snapshots land under __snapshots__/{spec}/{arg}-{project}.png. No auto-webServer: the whole point is to catch regressions against the real stack the user runs, not a hermetic one; if the stack is down, tests fail loud.

  fixtures/persona-auth.ts
    Typed Playwright `test.extend` with a `personaKey` worker option and a `personaPage` fixture that returns a pre-logged-in Page pointed at `/`. Login is API-side: POST /api/v1/auth/login with the deterministic HMAC-SHA256 password, parse Set-Cookie headers, inject into the browser context. Derivation is a bit-identical mirror of scripts/personas/password.ts and services/mana-persona-runner/src/password.ts, a 3-way contract. Changing one without the others locks the suite out of every persona. PERSONAS map exports all 10 catalog emails for typed access.

  flows/home.spec.ts
    One smoke flow. Asserts the persona isn't redirected to /login, hides any [data-testid="live-time"] so clock widgets don't invalidate diffs, captures a full-page screenshot. When this goes green, the whole pipeline is plumbed. Copy this file to add per-module tours.

  package.json
    @mana/tests-personas workspace. Scripts: `test`, `test:update`, `report` (HTML diff viewer).

  README.md
    Prerequisites (stack up + seeded + ideally persona-runner ticked once), run recipe, env vars, architecture diagram, extension pattern.

root package.json: `pnpm test:personas` + `:update`.
.gitignore: playwright-report-personas/ + test-results/ so generated artefacts never get committed.

Type-check / list: `playwright test --list` succeeds, 2 tests (one per viewport) registered for home.spec.ts.

Not attempted in this commit (user action to run the stack):
- Actual baseline capture (needs docker up + db:push + seed:personas + ANTHROPIC_API_KEY + diag/tick).
- Additional flows (todo, journal, notes, habits, calendar). They're copy-paste per README. Land when the stack is smoked.
- Nightly CI job. Will land once baselines are stable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
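
The "parse Set-Cookie headers, inject into the browser context" step that 79d112657c describes for fixtures/persona-auth.ts could look roughly like the sketch below. It is not the code from the commit: `parseSetCookie`, the `InjectableCookie` shape, and the base URL are hypothetical names chosen for illustration, and only the cookie fields needed for a login session are filled in.

```typescript
// Subset of the cookie object that Playwright's BrowserContext.addCookies()
// accepts. Passing `url` lets Playwright derive domain and path itself.
interface InjectableCookie {
  name: string;
  value: string;
  url: string;
  httpOnly: boolean;
  secure: boolean;
}

// Turns ONE Set-Cookie header value into an injectable cookie.
// Hypothetical helper, not taken from the repository.
function parseSetCookie(header: string, baseUrl: string): InjectableCookie {
  const parts = header.split(";").map((part) => part.trim());
  const pair = parts[0] ?? "";
  const eq = pair.indexOf("=");
  if (eq < 1) {
    throw new Error(`Malformed Set-Cookie header: ${header}`);
  }
  // Attribute names are case-insensitive; values (Path=/ etc.) are ignored here.
  const flags = new Set(
    parts.slice(1).map((attr) => (attr.split("=")[0] ?? "").toLowerCase()),
  );
  return {
    name: pair.slice(0, eq),
    // Everything after the first "=" is the value, including further "=" signs.
    value: pair.slice(eq + 1),
    url: baseUrl,
    httpOnly: flags.has("httponly"),
    secure: flags.has("secure"),
  };
}
```

In a fixture, the headers would typically come from the login response via the standard `Headers.getSetCookie()` method, with each value mapped through `parseSetCookie` before calling `context.addCookies(...)`. Whether the real fixture keeps attributes such as SameSite or Expires is not stated in the commit.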
Till JS	f07eae3c01	feat(personas): M3.b-d - tick loop + Claude Agent SDK + persistence (real)	2026-04-23 14:18:31 +02:00

Previous commit `38dc80654` carries this M3 title but its payload is an unrelated apps/api/picture change (shared-.git-index race with a parallel session, see feedback_git_workflow.md). This commit holds the actual M3.b/c/d code. Leaving the misnamed commit for the user to re-attribute / revert as they prefer.

Closes the M3 loop from docs/plans/mana-mcp-and-personas.md. The runner picks up due personas, drives each through Claude + MCP for one simulated turn, collects actions + ratings, persists through service-key internal endpoints in mana-auth.

Internal endpoints (mana-auth, service-key-gated)
- GET /api/v1/internal/personas/due
  Returns personas whose tickCadence + lastActiveAt say they're due. Rules: hourly > 1h, daily > 24h, weekdays > 24h mon-fri. NULLS FIRST so never-run personas go ahead of stale ones.
- POST /api/v1/internal/personas/:id/actions
  Batch ≤ 500. Row ids are deterministic `${tickId}-${i}-${toolName}` + ON CONFLICT DO NOTHING so the runner can retry a tick without doubling audit rows. Also bumps personas.last_active_at so the next /due call sees it.
- POST /api/v1/internal/personas/:id/feedback
  Batch ≤ 100. Row id is `${tickId}-${module}`; natural key is one rating per module per tick.

Runner tick pipeline (services/mana-persona-runner/src/runner/)
- claude-session.ts
  Two phases per tick. runMainTurn feeds the persona's system prompt + a German "simulate a day" user prompt to Claude Agent SDK's query(), with mana-mcp wired in as a streamable-HTTP MCP server. We iterate the returned AsyncGenerator and extract tool_use blocks into ActionRows; a tool_result with is_error=true flips the most recent action. runRatingTurn is a fresh query() with tools:[] asking Claude in character to rate each used module 1-5 as strict JSON. We parse with tolerance for whitespace / fences. Unparseable output becomes a synthetic '__parse' feedback row so operators see the failure.
- tick.ts
  Orchestrator. Skips when config.paused. Fetches /due, processes in batches of config.concurrency via Promise.allSettled so a single persona failure never kills the batch. Returns {due, ranSuccessfully, failed[], durationMs}.
- types.ts
  ActionRow + FeedbackRow shapes shared between claude-session and the internal client.

Runner bootstrap (src/index.ts)
- setInterval(config.tickIntervalMs) starts the tick loop on boot. tickInFlight guards against overlap when Claude latency > interval. If MANA_SERVICE_KEY or ANTHROPIC_API_KEY is missing, loop is disabled with a warn line; /health + /diag/login still work.
- POST /diag/tick (dev-only) fires one tick on demand, returns the result. Avoids waiting a full interval during testing.
- Graceful SIGTERM/SIGINT shutdown clears the interval.

Client
- clients/mana-auth-internal.ts
  X-Service-Key client for the three endpoints above. Constructor throws on empty serviceKey: fail loud.

Boot smoke verified: /health returns ok, /diag/tick 500s with descriptive messages when keys absent. Warning lines on boot when keys are missing.

Type-check green across mana-auth, tool-registry, mcp, persona-runner.

M3 exit gate is the end-to-end smoke recipe (docker up → db:push → seed:personas → diag/tick → psql) documented in services/mana-persona-runner/CLAUDE.md.

M2.d (cross-space family/team memberships) still deferred.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
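
The rating-turn parsing that f07eae3c01 describes (tolerate whitespace and code fences, fall back to a synthetic '__parse' row) might be sketched as follows. The JSON shape (an array of `{module, rating, notes}` objects), the function name `parseRatings`, and the use of rating 0 for the failure row are assumptions made for this illustration; the commit only states the 1-5 scale, the "strict JSON" requirement, and the '__parse' module name.

```typescript
// Assumed row shape; the real FeedbackRow lives in runner/types.ts and may differ.
interface FeedbackRow {
  module: string;
  rating: number; // 1-5 for real ratings, 0 for the synthetic failure row (assumption)
  notes: string;
}

const FENCE = "`".repeat(3);

// Parses the model's rating reply, tolerating surrounding whitespace and a
// Markdown code fence. Any failure yields a single '__parse' row, so the
// problem shows up in the feedback table instead of being dropped.
function parseRatings(raw: string): FeedbackRow[] {
  let body = raw.trim();
  if (body.startsWith(FENCE)) {
    // Drop the opening fence line (with or without a language tag).
    body = body.slice(body.indexOf("\n") + 1).trimEnd();
    if (body.endsWith(FENCE)) {
      body = body.slice(0, -FENCE.length);
    }
  }
  try {
    const parsed: unknown = JSON.parse(body);
    if (!Array.isArray(parsed)) {
      throw new Error("expected a JSON array");
    }
    return parsed.map((item) => {
      const entry = item as Record<string, unknown>;
      const rating = entry.rating;
      if (
        typeof entry.module !== "string" ||
        typeof rating !== "number" ||
        !Number.isInteger(rating) ||
        rating < 1 ||
        rating > 5
      ) {
        throw new Error("invalid rating entry");
      }
      return {
        module: entry.module,
        rating,
        notes: typeof entry.notes === "string" ? entry.notes : "",
      };
    });
  } catch (err) {
    return [{ module: "__parse", rating: 0, notes: `unparseable: ${String(err)}` }];
  }
}
```

Returning a row rather than throwing keeps one malformed reply from failing the whole tick, which matches the stated goal that operators see the failure.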
Till JS	493db0c3b2	feat(personas): M2.a-c - persona schemas + admin endpoints + seed pipeline	2026-04-23 13:55:14 +02:00

Continuation of docs/plans/mana-mcp-and-personas.md. Personas are the auto-test users the M3 runner will drive: they're real Mana users (kind='persona', tier='founder'), registered through the same Better Auth pipeline as humans, just stamped differently and metadata-tracked so the persona-runner knows how to role-play them.

Schemas (auth namespace: personas are 1:1 with users, no reason for a separate platform.* schema that the plan originally sketched)
- userKindEnum ('human' | 'persona' | 'system') + users.kind column, wired into better-auth additionalFields so the JWT/user object carry the flag. Default 'human' keeps every existing user untouched.
- auth.personas: 1:1 descriptor (archetype, systemPrompt, moduleMix jsonb, tickCadence, lastActiveAt). CASCADE from users.id.
- auth.persona_actions: tick-grouped audit of every tool call the runner makes (toolName, inputHash for dedup, result, latency).
- auth.persona_feedback: structured 1-5 ratings per module per tick, plus free-text notes. This is where the runner writes the self-reflection step at end of each tick.

Admin endpoints (/api/v1/admin/personas, admin-tier-gated)
- POST /
  create-or-update by email. Uses auth.api.signUpEmail if the user's new, then stamps kind+tier+verified and upserts the personas row. Idempotent: safe to re-run after catalog edits.
- GET /
  list with 7-day action count per persona.
- GET /:id
  detail + recent 20 actions + per-module feedback aggregate.
- DELETE /:id
  hard delete. Refuses non-persona users as defense-in-depth: an admin typo here would cascade through the full user-delete chain.

Catalog + seed pipeline (scripts/personas/)
- catalog.json
  10 handwritten personas spanning 7 archetypes (adhd-student, ceo-busy, creative-parent, solo-dev, researcher, freelancer, overwhelmed-newbie). Five pairs of personas that will later share family/team spaces (cross-space setup is deferred to M2.d per the plan).
- catalog.ts
  zod-validated loader. Refines email to require @mana.test TLD: non-existent, no bounce risk.
- password.ts
  deterministic HMAC-SHA256(PERSONA_SEED_SECRET, email). No stored per-persona credentials; the runner re-derives on every login. Refuses the dev-fallback secret in production.
- seed.ts
  POST /admin/personas per catalog entry. Flags: --auth=, --jwt=, --dry-run.
- cleanup.ts
  Hard-delete every live persona. Warns when the live set drifts from the catalog.

Root package.json:
  pnpm seed:personas
  pnpm seed:personas:cleanup

Extends the ESLint root-ignore list with `scripts/**` so Bun-typed utility scripts don't fail the typed-parser check they weren't opted into. Consistent with the rest of scripts/ being .mjs+.sh.

To go live (user action):
  pnpm docker:up
  cd services/mana-auth && bun run db:push
  export MANA_ADMIN_JWT=...
  pnpm seed:personas

M2.d deferred: cross-space (family/team/practice) memberships between persona pairs. Better Auth's org-invite flow is multi-step and would roughly double the M2 scope; the persona-runner (M3) can operate in personal spaces first, shared-space tests land as their own milestone.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
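
The password derivation that 493db0c3b2 lists under password.ts, HMAC-SHA256(PERSONA_SEED_SECRET, email) with a production guard, can be illustrated with a minimal sketch using Node's built-in crypto module. The function name `derivePersonaPassword`, the placeholder fallback secret, the NODE_ENV check, and the hex output encoding are all assumptions; the commit does not say how the digest is encoded or what the dev-fallback value is.

```typescript
import { createHmac } from "node:crypto";

// Hypothetical placeholder. The real dev-fallback secret is not given in the commit.
const DEV_FALLBACK_SECRET = "dev-only-persona-seed";

// Derives a persona's password from its email. Nothing is stored: any party
// holding the same secret can re-derive the same password at login time.
function derivePersonaPassword(
  email: string,
  secret: string = process.env.PERSONA_SEED_SECRET ?? DEV_FALLBACK_SECRET,
  nodeEnv: string | undefined = process.env.NODE_ENV,
): string {
  // Guard from the commit message: never run production on the fallback secret.
  if (nodeEnv === "production" && secret === DEV_FALLBACK_SECRET) {
    throw new Error("PERSONA_SEED_SECRET must be set in production");
  }
  // Key = secret, message = email. Hex encoding is an assumption of this sketch.
  return createHmac("sha256", secret).update(email).digest("hex");
}
```

Because the same derivation is said to exist in three places (seed script, persona-runner, and the Playwright fixture), any change to the key, message, or output encoding has to be made in all three at once, or the derived passwords stop matching.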
Till JS	16c8818338	feat(mcp): M1+M1.5 MCP gateway + tool-registry + shared-crypto	2026-04-23 13:18:35 +02:00

Foundation for autonomous Claude-driven testing. Plan: docs/plans/mana-mcp-and-personas.md.

New packages
- @mana/tool-registry: schema-first ToolSpec<InputSchema, OutputSchema> with zod generics, scope ('user-space' | 'admin') and policyHint ('read' | 'write' | 'destructive'). sync-client helpers speak the mana-sync push/pull protocol directly so RLS and field-level LWW are preserved. MasterKeyClient fetches per-user MKs via the existing mana-auth GET /api/v1/me/encryption-vault/key endpoint (JWT-gated, ZK-aware, already audited); no new service-key endpoint built. ZeroKnowledgeUserError surfaced as a typed throw.
- @mana/shared-crypto: AES-GCM-256 primitives extracted from the web app's $lib/data/crypto/aes.ts so the server-side tool handlers and the browser produce byte-for-byte identical wire format (enc:1:{b64(iv)}.{b64(ct)}). Web app aes.ts now re-exports from shared-crypto; 5 existing importers unchanged, svelte-check stays green.

New service
- services/mana-mcp (:3069, Bun/Hono): MCP Streamable HTTP gateway. JWKS auth against mana-auth, per-user session isolation (session-id belongs to the user who opened it; cross-user access returns 403), admin-scoped tools filtered out before registration. MasterKeyClient cached per process with a 5-minute TTL.

11 tools registered
- habits.{create,list,update,archive}, spaces.list (plaintext, M1)
- todo.{create,list,complete}, notes.{create,search}, journal.add (encrypted; field lists match apps/mana/apps/web/src/lib/data/crypto/registry.ts verbatim)

Infra
- Port 3069 added to docs/PORT_SCHEMA.md
- services/mana-mcp/CLAUDE.md with architecture, auth model, tool-authoring recipe, local smoke-test steps
- Root CLAUDE.md services list updated

Type-check green across shared-crypto, mana-tool-registry, mana-mcp. svelte-check on apps/mana/apps/web stays at 0 errors / 0 warnings.

Boot smoke verified: /health returns registry.loaded=true, unauthed /mcp → 401, invalid-JWT /mcp → 401 with descriptive message.

Decisions locked in for later milestones (per plan D1–D10):
- Personas will be real mana-auth users (users.kind='persona'), no service-key bypass (D1, D2)
- Tool-registry is the SSOT; mana-ai and the legacy apps/api/src/mcp/server.ts get merged into it in M4 (three current parallel tool catalogs collapse to one)
- Persona-runner (:3070) will be a separate service using the Claude Agent SDK + MCP client (D5)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
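
To make the enc:1:{b64(iv)}.{b64(ct)} wire format from 16c8818338 concrete, here is a small round-trip sketch using Node's crypto module. It is not the @mana/shared-crypto code: the function names, the 12-byte IV, standard (non-URL-safe) base64, and the choice to append the 16-byte GCM tag to the ciphertext (the layout WebCrypto's AES-GCM produces) are assumptions of this example.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

const PREFIX = "enc:1:";
const IV_BYTES = 12; // assumption: the conventional GCM nonce size
const TAG_BYTES = 16;

// Encrypts a UTF-8 string and packs it as enc:1:{b64(iv)}.{b64(ct)}.
function encryptField(key: Buffer, plaintext: string): string {
  if (key.length !== 32) throw new Error("AES-256 needs a 32-byte key");
  const iv = randomBytes(IV_BYTES);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const body = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  // Tag appended to the ciphertext, mirroring WebCrypto's output layout.
  const ct = Buffer.concat([body, cipher.getAuthTag()]);
  return `${PREFIX}${iv.toString("base64")}.${ct.toString("base64")}`;
}

// Unpacks and decrypts a value produced by encryptField.
// Throws if the envelope is malformed or authentication fails.
function decryptField(key: Buffer, wire: string): string {
  if (!wire.startsWith(PREFIX)) throw new Error("not an enc:1 value");
  // Standard base64 never contains ".", so the separator is unambiguous.
  const [ivB64, ctB64] = wire.slice(PREFIX.length).split(".");
  if (!ivB64 || !ctB64) throw new Error("malformed enc:1 value");
  const iv = Buffer.from(ivB64, "base64");
  const ct = Buffer.from(ctB64, "base64");
  if (ct.length < TAG_BYTES) throw new Error("ciphertext too short");
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(ct.subarray(ct.length - TAG_BYTES));
  const plain = Buffer.concat([
    decipher.update(ct.subarray(0, ct.length - TAG_BYTES)),
    decipher.final(), // throws when the tag does not verify
  ]);
  return plain.toString("utf8");
}
```

The version marker in the prefix leaves room for a later format change without ambiguity about how older values were encoded. Any detail above that differs from the real package would break the byte-for-byte compatibility the commit relies on, so treat the constants as placeholders.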

4 commits