Opt-in path for missions that want Gemini Deep Research Max (up to 60 min
per task) instead of the shallow RSS pre-research. Because Max runs well
past a single 60-second tick, the state is carried across ticks:
tick N: submit → INSERT mission_research_jobs row → skip planner
tick N+k: poll → still running → skip planner (metric pending_skips)
tick N+m: poll → completed → inject as ResolvedInput, DELETE row, plan
- ManaResearchClient talks to mana-research's new internal
/v1/internal/research/async endpoints with X-Service-Key +
X-User-Id. Graceful-null on transport errors so a flaky
mana-research never crashes the tick loop.
- New table mana_ai.mission_research_jobs with PK (user_id, mission_id)
— presence is the "pending" flag; delete-on-terminal keeps queries
trivial.
- handleDeepResearch() encapsulates the state machine; planOneMission
now returns a discriminated union (planned | skipped | failed) so
"research pending" isn't miscounted as a parse failure.
- Opt-in at TWO gates to keep cost in check ($3–7/task, 1500 credits
per run):
1. MANA_AI_DEEP_RESEARCH_ENABLED=true server-side (default off)
2. DEEP_RESEARCH_TRIGGER regex matches the mission objective
(strict: "deep research", "tiefe recherche", "umfassende
recherche", "hintergrundrecherche", "deep dive")
Falls back to shallow RSS when either gate fails or the submit
errors upstream.
- Prom metrics: mana_ai_research_jobs_{submitted,completed,failed}_total
labelled by provider, plus _pending_skips_total.
- docker-compose wires MANA_RESEARCH_URL + the opt-in flag and adds
mana-research to depends_on.
- Full write-up with real API response shape (outputs plural, not
OpenAI-style), step-3 MCP-server plan (security-gated, not built),
ops + kill-switch: docs/reports/gemini-deep-research.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mana-ai
Background runner for the AI Workbench. Picks up due Missions from the mana_sync Postgres and plans/proposes next steps without requiring an open browser tab. Complements the foreground startMissionTick in the webapp (apps/mana/apps/web/src/lib/data/ai/missions/setup.ts).
Design context:
- docs/architecture/COMPANION_BRAIN_ARCHITECTURE.md §20 (AI Workbench base), §21 (Mission Key-Grants), §22 (Multi-Agent Workbench)
- docs/plans/ai-mission-key-grant.md — Shipped (per-mission key-grant for encrypted inputs)
- docs/plans/multi-agent-workbench.md — Shipped (named agents, per-agent policy/memory, scene lens)
- docs/plans/team-workbench.md — Forward-looking (multi-user + shared team context)
- docs/future/AI_AGENTS_IDEAS.md — Unshipped improvement backlog
Status: v0.3 (full close-the-loop)
What works end-to-end:
- Boots as a Hono/Bun service on port 3067
- Exposes `/health` and service-key-gated `/internal/tick`
- Replays `sync_changes` for `appId='ai'` / `table='aiMissions'` into live Mission records via field-level LWW (`src/db/missions-projection.ts`)
- Lists due missions (`state='active' && nextRunAt <= now()`)
- For each due mission: shared `buildPlannerPrompt` (from `@mana/shared-ai`) → mana-llm `/v1/chat/completions` → strict `parsePlannerResponse`
- Per-mission try/catch so one flaky LLM response doesn't abort the queue; stats differentiate `plansProduced` / `plansWrittenBack` / `parseFailures`
- Server-side tool allow-list (`src/planner/tools.ts`) mirrors the webapp's `DEFAULT_AI_POLICY` `propose` subset
- Write-back: `db/iteration-writer.ts` appends the server-produced iteration to `Mission.iterations[]` via a `sync_changes` INSERT under an RLS-scoped `withUser` transaction. The row is attributed with actor `{kind:'system', source:'mission-runner'}`.
- Webapp staging effect (`server-iteration-staging.ts`) picks up the synced iteration and translates each PlanStep into a local Proposal with full AI-actor attribution (missionId + iterationId + rationale). Idempotent via durable `proposalId` markers.
- Server-side input resolvers for plaintext tables — `db/resolvers/` with a pluggable registry + single-record LWW replay (`record-replay.ts`). The `goals` resolver ships by default. Encrypted tables (notes, kontext, tasks, events, journal, …) are intentionally not resolved server-side; those missions depend on the foreground runner, which decrypts client-side. See `resolvers/types.ts` for the privacy rationale.
- Materialized mission snapshots — `mana_ai.mission_snapshots` table with per-tick incremental refresh (`db/snapshot-refresh.ts`). `listDueMissions` is now a single indexed SELECT; the prior O(N changes) LWW replay remains only in `mergeAndFilter` for tests. An idempotent `migrate()` on boot creates the schema.
- Prometheus metrics on `/metrics` — process defaults with `mana_ai_` prefix + counters (`mana_ai_ticks_total`, `mana_ai_plans_produced_total`, `mana_ai_plans_written_back_total`, `mana_ai_parse_failures_total`, `mana_ai_mission_errors_total`, `mana_ai_snapshots_*`) and histograms (`mana_ai_tick_duration_seconds`, `mana_ai_planner_request_duration_seconds`, `mana_ai_http_request_duration_seconds`). Scraped every 30s by `docker/prometheus/prometheus.yml`'s `mana-ai` job. `/health` is also blackbox-probed and surfaces on status.mana.how under "Internal" as "Mana AI Runner".
All v0.3 roadmap items shipped. Future polish (not blockers):
- Multi-instance deploy with advisory locks on snapshot refresh (today single-process)
- Read-only `/internal/missions/:userId` endpoint for ops inspection
Status: v0.4 (Mission Key-Grants, in progress)
Opt-in mechanism for decrypting the encrypted input tables (notes, tasks, events, journal, kontext) server-side. Plan: docs/plans/ai-mission-key-grant.md. Architecture: docs/architecture/COMPANION_BRAIN_ARCHITECTURE.md §21.
What's in place (Phases 0-2, backend):
- RSA-OAEP-2048 keypair slots — `MANA_AI_PRIVATE_KEY_PEM` (ai) / `MANA_AI_PUBLIC_KEY_PEM` (auth). Without the env vars the service runs unchanged; grants are simply skipped.
- Canonical HKDF in `@mana/shared-ai` (`missions/grant.ts`). Scope binding (tables + recordIds) via the `info` string → a scope change means a new key, so any existing grant is automatically invalidated.
- `POST /api/v1/me/ai-mission-grant` on mana-auth — derives the MDK, RSA-wraps it, rejects zero-knowledge users, TTL clamped to [1h, 30d].
- `mana_ai.decrypt_audit` table + RLS (`user_scope` via `app.current_user_id`). Append-only.
- `crypto/unwrap-grant.ts` — private-key import, grant unwrapping with structured reasons (not-configured/expired/wrap-rejected/malformed).
- `crypto/decrypt-value.ts` — mirrors the webapp's AES-GCM wire format (`enc:1:<iv>.<ct>`).
- Encrypted resolver (`db/resolvers/encrypted.ts`) for notes / tasks / calendar / journal / kontext. Checks the recordId allowlist, replays the record, decrypts `enc:1:` fields, and writes one audit row per record.
- Tick-loop integration (`cron/tick.ts`) — unwraps the grant per mission and builds a `ResolverContext` with `mdk + allowlist`; the key lives only for the duration of `planOneMission`.
- Metrics: `mana_ai_decrypts_total{table}`, `mana_ai_grant_scope_violations_total{table}` (alert on > 0!), `mana_ai_grant_skips_total{reason}`.
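The scope-binding idea — serialize the grant scope into HKDF's `info` parameter so a scope change derives a different key — can be sketched like this. A minimal sketch using Node's `hkdfSync`, not the actual `missions/grant.ts`; the canonical serialization and salt handling are assumptions.

```typescript
import { hkdfSync } from 'node:crypto';

// Derive a grant key bound to its scope. Any change to tables or
// recordIds changes the `info` string, hence the derived key, which
// silently invalidates previously wrapped grants.
function deriveGrantKey(
  mdk: Buffer,
  scope: { tables: string[]; recordIds: string[] },
): Buffer {
  const info = JSON.stringify({
    tables: [...scope.tables].sort(),       // canonical ordering (assumed)
    recordIds: [...scope.recordIds].sort(),
  });
  return Buffer.from(hkdfSync('sha256', mdk, Buffer.alloc(0), info, 32));
}
```

The same MDK plus the same scope always yields the same key; widening the recordId list yields an unrelated one.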
What's still open (Phase 3, frontend):
- Webapp `MissionGrantDialog` + consent flow in the mission detail view.
- Revoke button + a "Data access" audit tab in the Workbench.
- `GET /api/v1/me/ai-audit` JWT-gated endpoint live.
- Feature flag `PUBLIC_AI_MISSION_GRANTS` + Cloudflare tunnel.
- Production keypair on the Mac mini under `secrets/mana-ai/`.
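For reference, the `enc:1:<iv>.<ct>` wire format mirrored by `crypto/decrypt-value.ts` (Phases 0-2 above) can be sketched as follows. Assumed details, since the README only names the envelope: base64 encoding, AES-256-GCM, and a 16-byte auth tag appended to the ciphertext — the real webapp format may differ.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

// Encrypt a plaintext field into the assumed enc:1:<iv>.<ct> envelope.
function encryptValue(plain: string, key: Buffer): string {
  const iv = randomBytes(12);
  const c = createCipheriv('aes-256-gcm', key, iv);
  // Auth tag is appended to the ciphertext blob (assumption).
  const ct = Buffer.concat([c.update(plain, 'utf8'), c.final(), c.getAuthTag()]);
  return `enc:1:${iv.toString('base64')}.${ct.toString('base64')}`;
}

// Mirror of the decrypt side: parse the envelope, split off the tag,
// and fail loudly on unknown versions.
function decryptValue(wire: string, key: Buffer): string {
  const [prefix, version, payload] = wire.split(':');
  if (prefix !== 'enc' || version !== '1') throw new Error('unknown wire format');
  const [ivB64, ctB64] = payload.split('.');
  const iv = Buffer.from(ivB64, 'base64');
  const blob = Buffer.from(ctB64, 'base64');
  const tag = blob.subarray(blob.length - 16);
  const d = createDecipheriv('aes-256-gcm', key, iv);
  d.setAuthTag(tag);
  return Buffer.concat([d.update(blob.subarray(0, blob.length - 16)), d.final()]).toString('utf8');
}
```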
Status: v0.5 (Multi-Agent Workbench)
The runner becomes agent-aware — missions belong to a named agent, policy and memory live on the agent, and concurrency + budget are respected per agent.
- `mana_ai.agent_snapshots` table (LWW projection of `agents` from `sync_changes`). `refreshAgentSnapshots` + `loadActiveAgents` run alongside the mission snapshot refresh.
- `ServerMission.agentId` + `ServerAgent.policy` passed through.
- The tick resolves the agent per mission, gates on `archived`/`paused`/concurrency, and writes the iteration under the `makeAgentActor(agent)` identity.
- `<agent_context>` prompt block with plaintext `role` + `systemPrompt` + `memory` (ciphertext is skipped).
- `filterToolsByAgentPolicy` strips `deny` tools before the planner sees them.
- Metric: `mana_ai_agent_decisions_total{decision}`.
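The per-agent tool filtering can be sketched like this. A hypothetical reduction of `filterToolsByAgentPolicy` — the real policy shape may carry more than a deny list.

```typescript
type Tool = { name: string; module: string };
type AgentPolicy = { deny?: string[] }; // tool names this agent may never propose (assumed shape)

// Strip denied tools before the planner ever sees them, so the LLM
// cannot propose a step the agent is not allowed to take.
function filterToolsByAgentPolicy(tools: Tool[], policy: AgentPolicy): Tool[] {
  const denied = new Set(policy.deny ?? []);
  return tools.filter((t) => !denied.has(t.name));
}
```

Filtering before the prompt is built (rather than validating after parsing) means a denied tool never even appears in the model's option set.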
Status: v0.7 (Cross-Tick Deep Research, 2026-04-22)
Opt-in asynchronous deep-research path for missions that explicitly want deep research. Calls mana-research's new Gemini Deep Research Max providers (gemini-deep-research / gemini-deep-research-max) via the internal service-to-service endpoint /api/v1/internal/research/async. Because Max runs for up to 60 minutes while our tick is 60 seconds, the flow spans multiple ticks.
- `ManaResearchClient` (`clients/mana-research.ts`) — HTTP client for mana-research's internal async endpoints. `X-Service-Key` + `X-User-Id`. Graceful-null on error.
- `mana_ai.mission_research_jobs` table — one row per pending job per mission, PK `(user_id, mission_id)`. Presence = "currently running". The row is deleted after `completed`/`failed`.
- Cross-tick state machine in `cron/tick.ts` (`handleDeepResearch`):
  - Pending job → poll → `queued`/`running` skip, `completed` inject result, `failed` fall through to shallow
  - No job + `DEEP_RESEARCH_TRIGGER` match + `config.deepResearchEnabled` → submit + insert → skip
- The new `DEEP_RESEARCH_TRIGGER` is stricter than the existing `RESEARCH_TRIGGER` — it matches only "deep research", "tiefe recherche", "umfassende recherche", "hintergrundrecherche", "deep dive". Additionally gated via env (`MANA_AI_DEEP_RESEARCH_ENABLED=true`, default off).
- The `planOneMission` return type is now a discriminated union `{outcome:'planned'|'skipped'|'failed'}`; `'skipped'` (= research pending) is not counted as a parse failure.
- Metrics: `mana_ai_research_jobs_submitted_total{provider}`, `_completed_total{provider}`, `_failed_total{provider}`, `_pending_skips_total`.
- Docker Compose: `MANA_RESEARCH_URL`, `MANA_AI_DEEP_RESEARCH_ENABLED`, `depends_on: mana-research`. `@mana/shared-research` as a workspace dep + a `type-check` script in `package.json`.
Deliberately not done (open):
- Mission config flag in the webapp. The trigger is regex-based today, not explicitly configurable. That's enough for the pilot; once we open this up, we'll need a UI checkbox in the mission detail view.
- Image output (charts, Nano-Banana). It lands in `providerRaw` but is not rendered in the answer text.
- Streaming thought summaries. Would need a dedicated SSE bridge to the frontend.
Details on the deep-research flow: docs/reports/gemini-deep-research.md §3.2.
Status: v0.6 (Server-side web research + extended tools)
The runner can now run web research on its own before the planner call (no browser needed). Server-side, 31 propose tools across 16 modules are offered to the planner (auto tools run exclusively in the webapp reasoning loop — the server only sees propose).
- `NewsResearchClient` (`planner/news-research-client.ts`) — HTTP client for mana-api's `/api/v1/news-research/discover` + `/search`. Timeouts 15s/30s, graceful-null on error.
- Pre-planning research step in `cron/tick.ts` — for mission objectives containing research keywords (recherchier|research|news|today|historisch|...), RSS discovery + search runs automatically before the planner call. Results are injected as a `ResolvedInput` with `id='__web-research__'`.
- `config.manaApiUrl` + Docker Compose wiring (`MANA_API_URL: http://mana-api:3060`, `depends_on: mana-api`).
- 31 propose tools across 16 modules (server view — auto tools are webapp-only):
  - todo: `create_task`, `complete_task`, `complete_tasks_by_title`
  - calendar: `create_event`
  - notes: `create_note`, `update_note`, `append_to_note`, `add_tag_to_note`
  - places: `create_place`, `visit_place`
  - drink: `undo_drink`
  - news: `save_news_article`
  - news-research: `research_news`
  - journal: `create_journal_entry`
  - habits: `create_habit`, `log_habit`
  - contacts: `create_contact`
  - quiz: `create_quiz`, `update_quiz`, `add_quiz_question`, `update_quiz_question`, `delete_quiz_question`
  - goals: `create_goal`, `pause_goal`, `resume_goal`, `complete_goal`
  - mood: `log_mood`
  - events: `suggest_event`
  - finance: `add_transaction`
  - times: `start_timer`, `stop_timer`
- Full tool list incl. the 28 auto tools: see `apps/mana/CLAUDE.md` §Tool Coverage. The single source of truth is `AI_TOOL_CATALOG` in `@mana/shared-ai/src/tools/schemas.ts`; both sides derive from it, and a drift guard in `src/planner/tools.ts` blocks regressions.
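The drift-guard idea — every server propose tool must exist in the shared catalog — can be sketched as below. Illustrative only; the real guard in `src/planner/tools.ts` and the actual `AI_TOOL_CATALOG` shape may differ.

```typescript
// Assumed catalog entry shape: name plus whether the tool is
// propose-only (server-visible) or auto (webapp reasoning loop only).
type CatalogEntry = { name: string; mode: 'propose' | 'auto' };

// Throw at boot/test time if the server's tool list drifts from the
// shared catalog, so a renamed or auto-only tool fails loudly.
function assertNoDrift(serverTools: string[], catalog: CatalogEntry[]): void {
  const proposable = new Set(
    catalog.filter((e) => e.mode === 'propose').map((e) => e.name),
  );
  const unknown = serverTools.filter((name) => !proposable.has(name));
  if (unknown.length > 0) {
    throw new Error(`tool drift: not in shared catalog as propose: ${unknown.join(', ')}`);
  }
}
```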
Port: 3067
Tech Stack
| Layer | Technology |
|---|---|
| Runtime | Bun |
| Framework | Hono |
| Database | PostgreSQL via postgres driver (read-only against mana_sync) |
| Auth | Service-to-service key; no end-user JWTs |
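The service-key auth model in the table above can be illustrated with a dependency-free sketch. The real gate is a Hono middleware in `src/middleware/service-auth.ts`; this just shows the decision it makes.

```typescript
// Guard /internal/* routes with a shared service key; everything else
// (e.g. /health, /metrics) passes through unauthenticated.
function checkServiceAuth(
  path: string,
  headers: Record<string, string | undefined>,
  serviceKey: string,
): { ok: boolean; status: number } {
  if (!path.startsWith('/internal/')) return { ok: true, status: 200 };
  if (headers['x-service-key'] === serviceKey) return { ok: true, status: 200 };
  return { ok: false, status: 401 };
}
```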
Quick Start
# Requires mana_sync DB reachable
cd services/mana-ai
bun run dev
# Smoke test
curl http://localhost:3067/health
curl -X POST -H "X-Service-Key: dev-service-key" http://localhost:3067/internal/tick
Environment Variables
PORT=3067
SYNC_DATABASE_URL=postgresql://mana:devpassword@localhost:5432/mana_sync
MANA_LLM_URL=http://localhost:3020
MANA_API_URL=http://localhost:3060 # news-research (RSS, shallow)
MANA_RESEARCH_URL=http://localhost:3068 # gemini-deep-research (deep, v0.7+)
MANA_AI_DEEP_RESEARCH_ENABLED=false # opt-in gate for Max tasks
MANA_SERVICE_KEY=dev-service-key
TICK_INTERVAL_MS=60000
TICK_ENABLED=true # flip to false to boot HTTP-only (for Docker health-check)
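Loading these variables with their documented defaults might look like the following. A minimal sketch; the real `src/config.ts` may validate and expose more fields.

```typescript
// Env loading with the defaults from the list above. Note the two
// deliberate asymmetries: TICK_ENABLED defaults on, the deep-research
// flag defaults off (opt-in).
function loadConfig(env: Record<string, string | undefined>) {
  return {
    port: Number(env.PORT ?? 3067),
    tickIntervalMs: Number(env.TICK_INTERVAL_MS ?? 60000),
    tickEnabled: (env.TICK_ENABLED ?? 'true') !== 'false',
    deepResearchEnabled: env.MANA_AI_DEEP_RESEARCH_ENABLED === 'true',
    serviceKey: env.MANA_SERVICE_KEY ?? 'dev-service-key',
  };
}
```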
Architecture
┌────────────────────┐
│ mana-ai (Bun) │
│ :3067 │
│ │ 60s interval
│ ┌─────────────┐ │────────────────┐
│ │ tick loop │ │ │
│ │ runTickOnce │ │ │
│ └─────────────┘ │ │
│ │ │ │
│ │ SELECT │ │
│ ▼ │ │
│ ┌─────────────┐ │ │
│ │ missions- │ │ │
│ │ projection │ │ │
│ │ (LWW replay)│ │ │
│ └─────────────┘ │ ▼
│ │ ┌──────────────┐
│ ┌─────────────┐ │ │ mana_sync │
│ │ planner │───┼─────────▶│ (Postgres) │
│ │ client │ │ └──────────────┘
│ └─────────────┘ │
│ │ │
└───────┼────────────┘
│ POST /v1/chat/completions
▼
┌────────────────────┐
│ mana-llm (Python) │
│ :3020 │
└────────────────────┘
Open design questions (for next PR)
1. How do plan results get back to the user's device?
Proposals live in a local-only Dexie table (pendingProposals) — they don't sync. So the server can't just write proposals directly.
Options:
(a) Write iteration + plan to aiMissions, let the browser stage proposals on arrival.
Server appends an iteration with overallStatus: 'server-planned' and the plan steps. When the webapp next syncs, an effect subscribed to iteration changes translates each step into a local Proposal using the existing createProposal(). Clean: preserves the "proposals are local" invariant. Risk: duplicate proposals if multiple devices pick up the same iteration.
(b) Introduce aiProposedSteps as a synced table.
Server writes here directly; the webapp treats it as a source for its local pendingProposals. Requires a migration step + duplicates the proposal model.
(c) Make pendingProposals sync.
Simplest schema change, most invasive: approvals + rejections now race across devices. Would need server-authoritative state transitions.
Leaning (a) — minimal schema change, single source of truth. Implementation sketch: add iteration.source: 'browser' | 'server' and a "staging queue" on the webapp that dedups via iterationId.
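The dedup sketch from option (a) could look like this. All names here (`stagedIterations`, `createProposal`, the iteration shape) are illustrative, not real APIs; in practice the staged-iteration markers would live in Dexie, not in memory.

```typescript
// Stage a server-produced iteration's steps as local proposals at most
// once per device, keyed by the durable iterationId.
const stagedIterations = new Set<string>();

function stageServerIteration(
  iteration: { id: string; steps: string[] },
  createProposal: (step: string) => void,
): boolean {
  if (stagedIterations.has(iteration.id)) return false; // already staged here
  for (const step of iteration.steps) createProposal(step);
  stagedIterations.add(iteration.id);
  return true;
}
```

This makes re-sync idempotent on one device, but it does not by itself solve the multi-device duplicate risk option (a) mentions — each device dedups only against its own markers.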
2. Does the server need full LWW replay?
The projection replays every sync_changes row for aiMissions on every tick. For a small user base this is fine; past ~100 users × hundreds of rows it becomes wasteful.
Option: materialized view refreshed on sync-change insert via a trigger or a per-user ai_mission_snapshot table the service maintains. Defer until the load shows up.
3. Planner prompt: duplicate or share?
prompt.ts + parser.ts live in the webapp's @mana/web/src/lib/data/ai/missions/planner/. Server-side copies would drift. Options:
- Extract a `@mana/shared-ai` package with the prompt/parser
- Keep two copies with a contract test
- Only the webapp plans; the server just triggers the browser via push
The first is cleanest; it's TS source and imports cleanly in both Bun and Vite.
Writing code in here
- No database of its own — this service is a pure consumer. If you need persistent state (retry queues, per-user cursors), add a separate table namespace under the `mana_ai.*` schema on the `mana_sync` database, not a new DB.
- `src/db/missions-projection.ts` is the ONLY place that does LWW replay. Don't duplicate the logic; add new projection helpers there.
- Follow the foreground-runner contract: injected deps (planner, write-back) for tests. Bun's `bun test` runs `src/**/*.test.ts`.
Files
services/mana-ai/
├── src/
│ ├── index.ts — Hono bootstrap + tick scheduler wiring
│ ├── config.ts — Env loading
│ ├── cron/tick.ts — Scan loop, overlap-guarded. v0.7: cross-tick
│ │ deep-research state machine in
│ │ handleDeepResearch()
│ ├── clients/
│ │ └── mana-research.ts — v0.7: HTTP client for mana-research's
│ │ internal /research/async endpoints
│ ├── db/
│ │ ├── connection.ts — postgres.js pool
│ │ ├── migrate.ts — schema bootstrap (mission_snapshots,
│ │ │ decrypt_audit, agent_snapshots,
│ │ │ token_usage, mission_research_jobs)
│ │ ├── missions-projection.ts — sync_changes → Mission LWW replay
│ │ └── research-jobs.ts — v0.7: CRUD for mission_research_jobs
│ ├── planner/
│ │ ├── llm-client.ts — mana-llm HTTP client (OpenAI-compatible)
│ │ └── news-research-client.ts — mana-api RSS-based news-research
│ │ (shallow pre-planning step)
│ └── middleware/service-auth.ts — X-Service-Key gate for /internal/*
├── Dockerfile
├── package.json
├── tsconfig.json
└── CLAUDE.md