Commit graph

6 commits

Author SHA1 Message Date
Till JS
e5d230e599 feat(agent-loop): M1 — policy gate + reminder channel + parallel reads
Three Claude-Code-inspired primitives for runPlannerLoop, derived from the
reverse-engineering reports in docs/reports/:

1. **Policy gate** (@mana/tool-registry) — evaluatePolicy() gates every tool
   dispatch: denies admin-scope, denies destructive tools not in the user's
   opt-in list, rate-limits per tool (default 30 calls per 60s), flags prompt-injection
   markers in freetext without blocking. Wired into mana-mcp with a
   per-user rolling invocation log and POLICY_MODE env (off|log-only|enforce,
   default log-only). mana-ai uses detectInjectionMarker only — tool dispatch
   there is plan-only, so rate-limit/destructive checks don't apply yet.

2. **Reminder channel** (packages/shared-ai/src/planner/loop.ts) — new
   reminderChannel callback in PlannerLoopInput. Called once per round with a
   LoopState snapshot (round, toolCallCount, usage, lastCall); returned
   strings are wrapped in <reminder> tags and injected as transient system messages
   into THIS LLM request only. Never pushed to messages[] — the Claude-Code
   <system-reminder> pattern that keeps the KV-cache prefix stable.

3. **Parallel reads** (loop.ts) — isParallelSafe predicate enables
   Promise.all dispatch when every tool_call in a round is parallel-safe,
   in batches of PARALLEL_TOOL_BATCH_SIZE=10. Any non-safe call downgrades
   the whole round to sequential. messages[] always appends in source
   order, never completion order, so the debug log stays linear.
   Default-off (undefined predicate) preserves pre-M1 behaviour.
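
A minimal sketch of the policy gate in (1), assuming simplified shapes: the
PolicyDecision/PolicyContext types and the isAdminScoped/isDestructive helpers
are illustrative, not the actual @mana/tool-registry API.

```ts
declare function isAdminScoped(tool: string): boolean;      // assumed helpers
declare function isDestructive(tool: string): boolean;
declare function detectInjectionMarker(text: string): boolean;

type PolicyDecision =
  | { allow: true; flags: string[] }
  | { allow: false; reason: 'admin_scope' | 'destructive_not_opted_in' | 'rate_limited' };

interface PolicyContext {
  toolName: string;
  destructiveOptIns: Set<string>; // the user's opt-in list
  recentCalls: number;            // rolling invocation count for this tool
  freetext?: string;              // argument text scanned for injection markers
}

export function evaluatePolicy(ctx: PolicyContext): PolicyDecision {
  if (isAdminScoped(ctx.toolName))
    return { allow: false, reason: 'admin_scope' };
  if (isDestructive(ctx.toolName) && !ctx.destructiveOptIns.has(ctx.toolName))
    return { allow: false, reason: 'destructive_not_opted_in' };
  if (ctx.recentCalls >= 30) // default window: 30 calls per 60s
    return { allow: false, reason: 'rate_limited' };
  const flags: string[] = [];
  if (ctx.freetext && detectInjectionMarker(ctx.freetext))
    flags.push('injection_marker'); // flagged, never blocked
  return { allow: true, flags };
}
```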
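
A sketch of the reminder channel in (2); the LoopState fields match the list
above, while the Msg shape and the withReminders name are hypothetical.

```ts
type LoopState = {
  round: number;
  toolCallCount: number;
  usage: { inputTokens: number; outputTokens: number };
  lastCall?: { tool: string; ok: boolean };
};
type Msg = { role: 'system' | 'user' | 'assistant' | 'tool'; content: string };
export type ReminderChannel = (state: LoopState) => string[] | undefined;

// Builds the request for THIS round only: reminders ride along as transient
// system messages and are never pushed to messages[], so the conversation
// prefix (and with it the KV-cache) stays byte-identical across rounds.
function withReminders(messages: Msg[], state: LoopState, channel?: ReminderChannel): Msg[] {
  const notes = channel?.(state) ?? [];
  return [
    ...messages,
    ...notes.map((n): Msg => ({ role: 'system', content: `<reminder>${n}</reminder>` })),
  ];
}
```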
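
A sketch of the parallel dispatch in (3), assuming PARALLEL_TOOL_BATCH_SIZE = 10
and hypothetical ToolCall/ToolResult shapes.

```ts
type ToolCall = { id: string; name: string; args: unknown };
type ToolResult = { callId: string; output: string };

async function dispatchRound(
  calls: ToolCall[],
  exec: (c: ToolCall) => Promise<ToolResult>,
  isParallelSafe?: (c: ToolCall) => boolean, // undefined = default-off, pre-M1 behaviour
): Promise<ToolResult[]> {
  const out: ToolResult[] = [];
  if (!isParallelSafe || !calls.every(isParallelSafe)) {
    // any non-safe call downgrades the whole round to sequential
    for (const call of calls) out.push(await exec(call));
    return out;
  }
  for (let i = 0; i < calls.length; i += 10) {
    const batch = calls.slice(i, i + 10);
    // Promise.all resolves in input order, so results (and the messages[]
    // appends built from them) stay in source order, never completion order
    out.push(...(await Promise.all(batch.map(exec))));
  }
  return out;
}
```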

Tests: 21 new in tool-registry (policy), 9 new in shared-ai (5 parallel,
4 reminder). All 74 green, type-check clean across 4 packages.

Design/plan: docs/plans/agent-loop-improvements-m1.md
Reports: docs/reports/claude-code-architecture.md,
         docs/reports/mana-agent-improvements-from-claude-code.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 13:56:40 +02:00
Till JS
2a18cb5ee4 feat(mana-ai): v0.7 — cross-tick Deep Research Max pre-planning
Opt-in path for missions that want Gemini Deep Research Max (up to 60 min
per task) instead of the shallow RSS pre-research. Because Max runs well
past a single 60-second tick, the state is carried across ticks:

  tick N:   submit → INSERT mission_research_jobs row → skip planner
  tick N+k: poll → still running → skip planner (metric pending_skips)
  tick N+m: poll → completed → inject as ResolvedInput, DELETE row, plan
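
One plausible shape for the handleDeepResearch() state machine; the client and
job helpers plus the ResearchStep union are assumptions, only the transitions
mirror the flow above.

```ts
declare const client: { // ManaResearchClient: graceful-null on transport errors
  submit(userId: string, missionId: string): Promise<string | null>;
  poll(userId: string, jobId: string): Promise<
    { state: 'running' } | { state: 'completed'; report: string } | { state: 'failed' } | null
  >;
};
declare function findJob(userId: string, missionId: string): Promise<{ providerJobId: string } | null>;
declare function insertJob(userId: string, missionId: string, jobId: string): Promise<void>;
declare function deleteJob(userId: string, missionId: string): Promise<void>;

type ResearchStep =
  | { step: 'submitted' }             // tick N: row inserted, planner skipped
  | { step: 'pending' }               // tick N+k: still running, planner skipped
  | { step: 'ready'; report: string } // tick N+m: inject as ResolvedInput, plan
  | { step: 'fallback' };             // submit failed: shallow RSS path

async function handleDeepResearch(userId: string, missionId: string): Promise<ResearchStep> {
  const job = await findJob(userId, missionId); // PK (user_id, mission_id): presence = pending
  if (!job) {
    const jobId = await client.submit(userId, missionId);
    if (!jobId) return { step: 'fallback' };
    await insertJob(userId, missionId, jobId);
    return { step: 'submitted' };
  }
  const status = await client.poll(userId, job.providerJobId);
  if (!status || status.state === 'running') return { step: 'pending' };
  await deleteJob(userId, missionId); // delete-on-terminal
  return status.state === 'completed'
    ? { step: 'ready', report: status.report }
    : { step: 'fallback' };
}
```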

- ManaResearchClient talks to mana-research's new internal
  /v1/internal/research/async endpoints with X-Service-Key +
  X-User-Id. Graceful-null on transport errors so a flaky
  mana-research never crashes the tick loop.
- New table mana_ai.mission_research_jobs with PK (user_id, mission_id)
  — presence is the "pending" flag; delete-on-terminal keeps queries
  trivial.
- handleDeepResearch() encapsulates the state machine; planOneMission
  now returns a discriminated union (planned | skipped | failed) so
  "research pending" isn't miscounted as a parse failure.
- Opt-in at TWO gates to keep cost in check ($3–7/task, 1500 credits
  per run):
    1. MANA_AI_DEEP_RESEARCH_ENABLED=true server-side (default off)
    2. DEEP_RESEARCH_TRIGGER regex matches the mission objective
       (strict: "deep research", "tiefe recherche", "umfassende
       recherche", "hintergrundrecherche", "deep dive")
  Falls back to shallow RSS when either gate fails or the submit
  errors upstream (the two-gate check is sketched below).
- Prom metrics: mana_ai_research_jobs_{submitted,completed,failed}_total
  labelled by provider, plus _pending_skips_total.
- docker-compose wires MANA_RESEARCH_URL + the opt-in flag and adds
  mana-research to depends_on.
- Full write-up with real API response shape (outputs plural, not
  OpenAI-style), step-3 MCP-server plan (security-gated, not built),
  ops + kill-switch: docs/reports/gemini-deep-research.md.
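
The two gates compose into one predicate; a sketch assuming the trigger list
above is compiled into a single case-insensitive regex.

```ts
const DEEP_RESEARCH_TRIGGER =
  /deep research|tiefe recherche|umfassende recherche|hintergrundrecherche|deep dive/i;

function deepResearchRequested(objective: string): boolean {
  const serverEnabled = process.env.MANA_AI_DEEP_RESEARCH_ENABLED === 'true'; // gate 1, default off
  return serverEnabled && DEEP_RESEARCH_TRIGGER.test(objective);              // gate 2, strict match
}
```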

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 17:56:06 +02:00
Till JS
76577869e1 feat(mana-ai): OpenTelemetry tracing + Grafana Tempo backend
Add distributed tracing to the mana-ai background runner so mission
execution can be visualized end-to-end in Grafana.

Instrumentation (services/mana-ai/):
- tracing.ts: OTel provider setup with OTLP/HTTP exporter, withSpan() helper
- tick.ts: tick.planMission span with mission/agent/user attributes
- client.ts: planner.complete span with LLM model, tokens, latency
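
One plausible shape for the withSpan() helper in tracing.ts, built on the
public @opentelemetry/api surface (the shipped helper may differ).

```ts
import { trace, SpanStatusCode, type Attributes } from '@opentelemetry/api';

const tracer = trace.getTracer('mana-ai');

export async function withSpan<T>(
  name: string,
  attributes: Attributes,
  fn: () => Promise<T>,
): Promise<T> {
  return tracer.startActiveSpan(name, { attributes }, async (span) => {
    try {
      return await fn();
    } catch (err) {
      span.recordException(err as Error);             // surface the error in Tempo
      span.setStatus({ code: SpanStatusCode.ERROR });
      throw err;
    } finally {
      span.end();                                     // always close, success or failure
    }
  });
}
```

tick.planMission then becomes roughly `withSpan('tick.planMission', attrs, ...)`
wrapped around the planning body, yielding the trace tree below.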

Infrastructure:
- docker/tempo/tempo.yaml: Grafana Tempo config (OTLP HTTP on 4318)
- docker-compose: tempo service + tempo_data volume + mana-ai env var
- docker/grafana/provisioning/datasources/tempo.yml: auto-provisioned

Trace flow:
  tick.planMission (root span)
    └── planner.complete (child span)
        ├── llm.model = "gpt-4o-mini"
        ├── llm.tokens.total = 1234
        └── llm.response.length = 567

Enable: set OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
View: Grafana → Explore → Tempo datasource

Also fixes: removed broken @mana/subscriptions workspace ref from arcade.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 15:21:23 +02:00
Till JS
0bf01f434e feat(mana-ai): Prometheus /metrics endpoint + status.mana.how integration
Wires mana-ai into the existing observability stack so tick throughput,
plan-failure rates, planner latencies, and snapshot refresh health are
visible in Grafana + Prometheus, and the service's uptime surfaces on
status.mana.how under the "Internal" section.

- `src/metrics.ts` — prom-client Registry with `mana_ai_` prefix.
  Counters: ticks_total, plans_produced_total, plans_written_back_total,
  parse_failures_total, mission_errors_total, snapshots_new/updated,
  snapshot_rows_applied_total, http_requests_total.
  Histograms: tick_duration_seconds (0.1–120s),
  planner_request_duration_seconds (0.25–60s),
  http_request_duration_seconds (0.005–10s).
- `src/index.ts` — HTTP middleware labels every request by
  method/path/status; `/metrics` serves the Prometheus text format.
- `src/cron/tick.ts` — increments counters + wraps the tick with
  `tickDuration.startTimer()`. Snapshot stats fold through.
- `src/planner/client.ts` — wraps `complete()` in a latency histogram
  timer so planner tail latency shows up separately from tick duration.
- `docker/prometheus/prometheus.yml` —
  1. New `mana-ai` scrape job against `mana-ai:3066/metrics` (30s).
  2. `/health` added to the `blackbox-internal` job so uptime shows on
     status.mana.how alongside mana-geocoding.
- `scripts/generate-status-page.sh` — friendly label for the new probe:
  `mana-ai:3066/health` → "Mana AI Runner" (generator already iterates
  `blackbox-internal`, no other changes needed).
- `package.json` — prom-client ^15.1.3
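
A minimal sketch of the `src/metrics.ts` shape with prom-client; the exact
bucket spacing inside the ranges above is an assumption.

```ts
import { Registry, Counter, Histogram } from 'prom-client';

export const registry = new Registry();

export const ticksTotal = new Counter({
  name: 'mana_ai_ticks_total',
  help: 'Scheduler ticks executed',
  registers: [registry],
});

export const tickDuration = new Histogram({
  name: 'mana_ai_tick_duration_seconds',
  help: 'Wall-clock duration of one tick',
  buckets: [0.1, 0.5, 2, 10, 30, 60, 120], // assumed spacing within 0.1–120s
  registers: [registry],
});

// /metrics handler body: prom-client renders the Prometheus text format
export async function renderMetrics(): Promise<string> {
  return registry.metrics();
}
```

In tick.ts the histogram is then used as `const end = tickDuration.startTimer()`
with `end()` called when the tick finishes.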

All 17 Bun tests still pass; tsc clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 01:41:40 +02:00
Till JS
203fe3ef05 feat(mana-ai): wire shared-ai planner + real mana-llm calls (v0.2)
Service now produces plans end-to-end for due missions. Takes the
shared prompt/parser from @mana/shared-ai, calls mana-llm's
OpenAI-compatible endpoint, parses + validates the response against a
server-side tool allow-list.

- `src/planner/tools.ts` — hardcoded subset of webapp tools where
  policy === 'propose'. Mirror of `DEFAULT_AI_POLICY` in the webapp;
  drift just means the server doesn't suggest newly-added tools
  (graceful degradation). A contract test between the two lists is a
  sensible follow-up.
- `src/cron/tick.ts`
  - Iterates due missions, builds the shared Planner prompt per mission,
    parses the LLM response, logs the resulting plan
  - Per-mission try/catch so one flaky LLM response doesn't abort the
    queue; stats now track `plansProduced` + `parseFailures`
  - `serverMissionToSharedMission()` converts the projection shape to
    the shared-ai Mission type at the boundary
- `resolvedInputs: []` today — the Planner sees concept + objective +
  iteration history only. Full resolvers (notes/kontext/goals via
  Postgres replay) land alongside write-back in the next PR.
- No write-back yet: the plan is logged but not persisted to
  `sync_changes`. Write-back needs an RLS-scoped helper mirroring
  mana-sync's `withUser` pattern — tracked explicitly as the remaining
  open piece in CLAUDE.md.
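
A sketch of the per-mission guard in tick.ts; the shared-ai helper names here
(buildPlannerPrompt, parsePlannerResponse) are stand-ins, not confirmed exports.

```ts
declare const dueMissions: unknown[];
declare const SERVER_TOOL_ALLOW_LIST: string[];
declare const plannerClient: { complete(prompt: string): Promise<string> };
declare function serverMissionToSharedMission(row: unknown): { id: string };
declare function buildPlannerPrompt(input: { mission: unknown; resolvedInputs: unknown[] }): string;
declare function parsePlannerResponse(raw: string, allowList: string[]): unknown;

const stats = { plansProduced: 0, parseFailures: 0 };

for (const row of dueMissions) {
  try {
    const mission = serverMissionToSharedMission(row);
    const prompt = buildPlannerPrompt({ mission, resolvedInputs: [] }); // no resolvers yet
    const raw = await plannerClient.complete(prompt);
    const plan = parsePlannerResponse(raw, SERVER_TOOL_ALLOW_LIST);
    console.log('plan', mission.id, plan); // logged only: write-back lands in the next PR
    stats.plansProduced++;
  } catch {
    stats.parseFailures++; // one flaky response never aborts the queue
  }
}
```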

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 00:06:22 +02:00
Till JS
b9710e6c11 feat(mana-ai): scaffold server-side Mission Runner (v0.1)
Background Hono/Bun service that scans mana_sync for due Missions and
will plan them via mana-llm without requiring an open browser tab.
Complements the foreground `startMissionTick` in the webapp.

v0.1 scope — scaffold that's deployable, boots cleanly, and reads real
data. Execution write-back is tracked as the next PR so we don't commit
a half-baked proposal-sync design.

Shipped:
- Hono app on :3066 with `/health` + service-key-gated `/internal/tick`
- `src/db/missions-projection.ts` — field-level LWW replay of
  `sync_changes` for appId='ai' / table='aiMissions' → live Mission
  records. Mirrors the webapp's `applyServerChanges` semantics against
  Postgres instead of Dexie.
- `src/db/connection.ts` — bounded `postgres.js` pool (max 4, idle 30s)
- `src/cron/tick.ts` — overlap-guarded scheduler, `runTickOnce()` also
  reachable via HTTP for CI/ops triggering
- `src/planner/client.ts` — mana-llm HTTP client shape
  (OpenAI-compatible `/v1/chat/completions`)
- `src/middleware/service-auth.ts` — X-Service-Key gate, no end-user JWTs
  reach this service
- Dockerfile + graceful SIGTERM shutdown (stops timer + releases pool)
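
A sketch of the field-level LWW replay idea in missions-projection.ts, assuming
lexicographically sortable per-field clocks (the real change-row shape differs).

```ts
type SyncChange = {
  rowId: string;
  field: string;
  value: unknown;
  clock: string; // per-field logical timestamp; higher wins (assumed string-sortable)
};

function replayLww(changes: SyncChange[]): Map<string, Record<string, unknown>> {
  const records = new Map<string, Record<string, unknown>>();
  const fieldClock = new Map<string, string>(); // `${rowId}:${field}` -> latest clock seen
  for (const c of changes) {
    const key = `${c.rowId}:${c.field}`;
    const seen = fieldClock.get(key);
    if (seen !== undefined && seen >= c.clock) continue; // older write loses, field by field
    fieldClock.set(key, c.clock);
    const rec = records.get(c.rowId) ?? {};
    rec[c.field] = c.value;
    records.set(c.rowId, rec);
  }
  return records;
}
```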
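
And a sketch of the overlap-guarded scheduler plus shutdown path; helper and
pool names are assumptions.

```ts
declare function scanAndPlanDueMissions(): Promise<void>; // the real tick body
declare const pool: { end(): Promise<void> };             // bounded postgres.js pool

let inFlight = false;

export async function runTickOnce(): Promise<boolean> {
  if (inFlight) return false; // overlap guard: a slow tick never stacks on itself
  inFlight = true;
  try {
    await scanAndPlanDueMissions();
    return true;
  } finally {
    inFlight = false;
  }
}

const timer = setInterval(() => void runTickOnce(), 60_000);

process.on('SIGTERM', async () => {
  clearInterval(timer); // stop scheduling new ticks
  await pool.end();     // release the pool
  process.exit(0);
});
```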

Not yet implemented (documented in CLAUDE.md with design trade-offs):
- Prompt/parser server-side copies — today they live in the webapp.
  Recommended next step: extract `@mana/shared-ai` package.
- Input resolvers for notes / kontext / goals — need projections or a
  mana-sync internal endpoint
- Plan → Mission-iteration write-back + how proposals get back to the
  user's device (leaning option (a): server writes iterations, the
  webapp's sync effect translates them into local Proposals)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:48:30 +02:00