Catches up all docs with the current state of the AI tool system.
services/mana-ai/CLAUDE.md:
- New v0.6 status section documenting NewsResearchClient,
pre-planning research injection, config.manaApiUrl, and the full
28-tool / 11-module inventory (17 propose + 11 auto).
apps/mana/CLAUDE.md:
- New "Tool Coverage" table in the AI Workbench section listing all
tools per module with their policy (propose vs auto).
- New "Templates" subsection documenting the two-section gallery
(agent vs workbench templates), the seed-handler registry, and
the current handlers (meditate, habits, goals).
- Architecture cross-reference updated to include §23.
docs/architecture/COMPANION_BRAIN_ARCHITECTURE.md:
- §23.2 gains a "Server-Side Research (mana-ai, ab v0.6)" subsection
explaining how NewsResearchClient mirrors the client-side research
pre-step: same endpoints, same trigger regex, but HTTP-direct from
the Docker network instead of SvelteKit-internal.
docs/plans/README.md:
- workbench-templates.md added to the roadmap table (T1 shipped).
- Multi-agent description updated to mention 28 tools + server-side
web-research.
- Architecture cross-reference includes §23.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Multi-Agent Workbench shipped end-to-end (commits 1771063df through
7c89eb625). This commit turns the plan doc into a proper history + post-
mortem and captures the deferred Team-Workbench as its own forward plan
so the architectural breadcrumbs don't rot.
docs/plans/multi-agent-workbench.md:
- Status bumped to ✅ Shipped; every phase checkbox flipped.
- Open-questions section rewritten with the decisions that were
actually made (name-unique via store write-time check, per-source
system principalIds, policy fully migrated, scene binding default-
empty with smart suggestion).
- New "Shipping-Historie" table mapping each phase to its commit, the
number of files touched, and the test outcome.
- New "Lessons Learnt + Follow-Up Ideen" with:
* What went better than expected (L3 Actor cutover, getOrCreate
instead of unique index, displayName caching)
* Thin spots worth revisiting (avatar not on Actor, missing token
counter for budget, no missions list on agent detail, no
drag-reassign, scene binding doesn't drive filters yet)
* Five deferred follow-up projects (team features, agent memory
self-update, agent-to-agent messaging, meta-planner, per-agent
encryption domains)
docs/plans/team-workbench.md (NEW):
- Full forward-looking plan for the deferred Team-Workbench.
- Two use-cases (human multi-user vs multi-agent sharing team
context) with the observation that they share the same infra.
- Decision candidates table (still open — meant as T0 RFC fodder,
not baked in).
- Architecture sketch with data-model deltas over the current
single-user shape.
- Encryption subsection dedicated to the hardest problems: team-key
wrapping per member (reuses Mission-Grant pattern), member-removal
rotation (lazy vs eager), Zero-Knowledge-mode incompatibility.
- T0..T6 phasing (~7 weeks for a clean first-pass).
- Section "Wie Multi-Agent dafür den Weg geebnet hat" enumerating
the four invariants the shipped Phase 0-7 deliberately preserved
to make this plan cheap when it lands.
docs/plans/README.md (NEW):
- Index doc with the AI/Workbench roadmap as an ASCII flow so future
contributors can locate themselves in the sequence without reading
three 400-line plans first.
docs/future/AI_AGENTS_IDEAS.md:
- Header marks Point 1 (encrypted tables) as shipped via the Mission
Grant plan; points 2-8 stay relevant. Cross-link to all three plan
docs so this stays the go-to backlog.
services/mana-ai/CLAUDE.md:
- Design-context header expanded to link to all four related docs
(arch §20-22, both shipped plans, forward team plan, ideas backlog).
No code changes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 6 — Multi-Agent observability:
- AI Workbench timeline gets a per-agent filter (dropdown with avatars)
alongside module + mission. TimelineBucket gains agentId +
agentDisplayName, projected off the bucket's first AI actor.
- Bucket header now leads with the agent's avatar + name (lookup via
the live useAgents query so renamed agents reflect instantly) and
falls back to Actor.displayName for deleted agents.
- AiProposalInbox card header replaces the generic Sparkle + "KI
schlägt vor" with an agent chip "🤖 Cashflow Watcher schlägt vor"
using the cached Actor.displayName. Ghost-agent label preserved
via the cached displayName even when the agent record is gone.
Phase 7 — Docs:
- docs/architecture/COMPANION_BRAIN_ARCHITECTURE.md §22 added:
data model, identity flow, tick gate order, Scene-Agent binding
semantics, non-goals.
- services/mana-ai/CLAUDE.md status bumped to v0.5 (Multi-Agent
Workbench) with the per-agent runner features + metrics listed.
- apps/mana/CLAUDE.md AI Workbench section rewritten to cover the
Agent primitive, per-agent policy, scene lens, and the updated
timeline header.
Multi-Agent rollout is code-complete end-to-end:
Phase 0 Plan ✓ Phase 4 Policy-per-agent ✓
Phase 1 Actor identity ✓ Phase 5 Agent UI + Scene lens ✓
Phase 2 Agent CRUD ✓ Phase 6 Observability ✓
Phase 3 Tick agent-aware ✓ Phase 7 Docs ✓
Tests: webapp svelte-check 0 errors, 0 warnings.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 2 of Mission Key-Grant. The tick loop now honours a mission's
grant by unwrapping the MDK and passing it + the record allowlist into
the resolvers. Encrypted modules (notes, tasks, calendar, journal,
kontext) resolve server-side instead of returning null.
- crypto/decrypt-value.ts: mirror of webapp AES-GCM wire format
(enc:1:<iv>.<ct>) — read-only, server never wraps
- db/resolvers/encrypted.ts: factory + 5 concrete resolvers. Scope-
violation bumps a metric + writes a structured audit row, decrypt
failures same. Zero-decrypt (no grant, or record absent) = silent
null, no audit noise.
- db/audit.ts: best-effort append to mana_ai.decrypt_audit; write
failures never cascade into tick failures.
- cron/tick.ts: buildResolverContext unwraps grant per mission; MDK
reference only lives for the scope of planOneMission.
- ResolverContext plumbed through resolveServerInputs; existing goals
resolver unchanged semantically.
- Metrics: mana_ai_decrypts_total{table}, mana_ai_grant_skips_total
{reason}, mana_ai_grant_scope_violations_total{table} (alert > 0).
Missions without a grant still run exactly as before — plaintext
resolvers fire, encrypted ones short-circuit to null. No behaviour
regression for existing users.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Wires mana-ai into the existing observability stack so tick throughput,
plan-failure rates, planner latencies, and snapshot refresh health are
visible in Grafana + Prometheus, and the service's uptime surfaces on
status.mana.how under the "Internal" section.
- `src/metrics.ts` — prom-client Registry with `mana_ai_` prefix.
Counters: ticks_total, plans_produced_total, plans_written_back_total,
parse_failures_total, mission_errors_total, snapshots_new/updated,
snapshot_rows_applied_total, http_requests_total.
Histograms: tick_duration_seconds (0.1–120s), planner_request_
duration_seconds (0.25–60s), http_request_duration_seconds (0.005–10s).
- `src/index.ts` — HTTP middleware labels every request by
method/path/status; `/metrics` serves the Prometheus text format.
- `src/cron/tick.ts` — increments counters + wraps the tick with
`tickDuration.startTimer()`. Snapshot stats fold through.
- `src/planner/client.ts` — wraps `complete()` in a latency histogram
timer so planner tail latency shows up separately from tick duration.
- `docker/prometheus/prometheus.yml` —
1. New `mana-ai` scrape job against `mana-ai:3066/metrics` (30s).
2. `/health` added to the `blackbox-internal` job so uptime shows on
status.mana.how alongside mana-geocoding.
- `scripts/generate-status-page.sh` — friendly label for the new probe:
`mana-ai:3066/health` → "Mana AI Runner" (generator already iterates
`blackbox-internal`, no other changes needed).
- `package.json` — prom-client ^15.1.3
All 17 Bun tests still pass; tsc clean.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Background Hono/Bun service that scans mana_sync for due Missions and
will plan them via mana-llm without requiring an open browser tab.
Complements the foreground `startMissionTick` in the webapp.
v0.1 scope — scaffold that's deployable, boots cleanly, and reads real
data. Execution write-back is tracked as the next PR so we don't commit
a half-baked proposal-sync design.
Shipped:
- Hono app on :3066 with `/health` + service-key-gated `/internal/tick`
- `src/db/missions-projection.ts` — field-level LWW replay of
`sync_changes` for appId='ai' / table='aiMissions' → live Mission
records. Mirrors the webapp's `applyServerChanges` semantics against
Postgres instead of Dexie.
- `src/db/connection.ts` — bounded `postgres.js` pool (max 4, idle 30s)
- `src/cron/tick.ts` — overlap-guarded scheduler, `runTickOnce()` also
reachable via HTTP for CI/ops triggering
- `src/planner/client.ts` — mana-llm HTTP client shape
(OpenAI-compatible `/v1/chat/completions`)
- `src/middleware/service-auth.ts` — X-Service-Key gate, no end-user JWTs
reach this service
- Dockerfile + graceful SIGTERM shutdown (stops timer + releases pool)
Not yet implemented (documented in CLAUDE.md with design trade-offs):
- Prompt/parser server-side copies — today they live in the webapp.
Recommended next step: extract `@mana/shared-ai` package.
- Input resolvers for notes / kontext / goals — need projections or a
mana-sync internal endpoint
- Plan → Mission-iteration write-back + how proposals get back to the
user's device (leaning option (a): server writes iterations, the
webapp's sync effect translates them into local Proposals)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>