managarten

mirror of https://github.com/Memo-2023/mana-monorepo.git synced 2026-05-15 01:01:09 +02:00

Author	SHA1	Message	Date
Till JS	e5d230e599	feat(agent-loop): M1 — policy gate + reminder channel + parallel reads Three Claude-Code-inspired primitives for runPlannerLoop, derived from the reverse-engineering reports in docs/reports/: 1. Policy gate (@mana/tool-registry) — evaluatePolicy() gates every tool dispatch: denies admin-scope, denies destructive tools not in the user's opt-in list, rate-limits per tool (30/60s default), flags prompt-injection markers in freetext without blocking. Wired into mana-mcp with a per-user rolling invocation log and POLICY_MODE env (off|log-only|enforce, default log-only). mana-ai uses detectInjectionMarker only — tool dispatch there is plan-only, so rate-limit/destructive checks don't apply yet. 2. Reminder channel (packages/shared-ai/src/planner/loop.ts) — new reminderChannel callback in PlannerLoopInput. Called once per round with LoopState snapshot (round, toolCallCount, usage, lastCall); returned strings wrap in <reminder> tags and inject as transient system messages into THIS LLM request only. Never pushed to messages[] — the Claude-Code <system-reminder> pattern that keeps the KV-cache prefix stable. 3. Parallel reads (loop.ts) — isParallelSafe predicate enables Promise.all dispatch when every tool_call in a round is parallel-safe, in batches of PARALLEL_TOOL_BATCH_SIZE=10. Any non-safe call downgrades the whole round to sequential. messages[] always appends in source order, never completion order, so the debug log stays linear. Default-off (undefined predicate) preserves pre-M1 behaviour. Tests: 21 new in tool-registry (policy), 9 new in shared-ai (5 parallel, 4 reminder). All 74 green, type-check clean across 4 packages. Design/plan: docs/plans/agent-loop-improvements-m1.md Reports: docs/reports/claude-code-architecture.md, docs/reports/mana-agent-improvements-from-claude-code.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 13:56:40 +02:00
Till JS	d5b889ac58	docs(gemini-deep-research): Mac-Mini deploy log 2026-04-22 Capture the surprises from the first deploy so the next rollout (or rollback) has the full picture without spelunking logs: - mana-research had never been started on the Mac-Mini, even though it was defined in compose. First-boot via `docker compose up -d`. - research.* schema is not auto-migrated on service boot — drizzle push must be triggered explicitly: `docker exec mana-research bun run db:push`. 5 tables created. - GOOGLE_GENAI_API_KEY was missing in /Users/mana/.../mana-monorepo/.env. Copied the local key over, with `.env.bak.pre-gemini-deep-research` as rollback anchor. - Redis NOAUTH fix (commit `4867300d0`) referenced here. - Smoke-test outcome documented: the 500 was mana-credits 404 on a test user without a wallet row — expected, and it proves the whole auth/dispatch chain up to the credits hop works. - Also noted: mana-llm has the same bare REDIS_URL in compose (out-of-scope for this deploy), and /providers/health does not list async providers (known design gap). Status header updated to reflect deploy completion. Flag stays off (MANA_AI_DEEP_RESEARCH_ENABLED=false) pending explicit enablement. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 18:22:31 +02:00
Till JS	2a18cb5ee4	feat(mana-ai): v0.7 — cross-tick Deep Research Max pre-planning Opt-in path for missions that want Gemini Deep Research Max (up to 60 min per task) instead of the shallow RSS pre-research. Because Max runs well past a single 60-second tick, the state is carried across ticks: tick N: submit → INSERT mission_research_jobs row → skip planner tick N+k: poll → still running → skip planner (metric pending_skips) tick N+m: poll → completed → inject as ResolvedInput, DELETE row, plan - ManaResearchClient talks to mana-research's new internal /v1/internal/research/async endpoints with X-Service-Key + X-User-Id. Graceful-null on transport errors so a flaky mana-research never crashes the tick loop. - New table mana_ai.mission_research_jobs with PK (user_id, mission_id) — presence is the "pending" flag; delete-on-terminal keeps queries trivial. - handleDeepResearch() encapsulates the state machine; planOneMission now returns a discriminated union (planned | skipped | failed) so "research pending" isn't miscounted as a parse failure. - Opt-in at TWO gates to keep cost in check ($3–7/task, 1500 credits per run): 1. MANA_AI_DEEP_RESEARCH_ENABLED=true server-side (default off) 2. DEEP_RESEARCH_TRIGGER regex matches the mission objective (strict: "deep research", "tiefe recherche", "umfassende recherche", "hintergrundrecherche", "deep dive") Falls back to shallow RSS when either gate fails or the submit errors upstream. - Prom metrics: mana_ai_research_jobs_{submitted,completed,failed}_total labelled by provider, plus _pending_skips_total. - docker-compose wires MANA_RESEARCH_URL + the opt-in flag and adds mana-research to depends_on. - Full write-up with real API response shape (outputs plural, not OpenAI-style), step-3 MCP-server plan (security-gated, not built), ops + kill-switch: docs/reports/gemini-deep-research.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 17:56:06 +02:00
Till JS	11f768b8e5	docs(invoices): ClubDesk vs. Mana comparison + invoices module plan Competitive analysis of ClubDesk (reeweb ag, ~20'000 DACH clubs) with a dual-use roadmap identifying features that benefit both clubs and general users (freelancers/creators). First chosen step: invoices module with Swiss QR-Bill as the CH-differentiator. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 15:27:57 +02:00
Till JS	2bdb48bdd1	feat(research): add mana-research service — Phase 1 + 2 New Bun/Hono service on port 3068 that bundles many web-research providers behind a unified interface for side-by-side comparison. All eval runs persist in research.* (mana_platform) so quality can be reviewed later. Providers (Phase 1+2): search: searxng, duckduckgo, brave, tavily, exa, serper extract: readability (via mana-search), jina-reader, firecrawl Endpoints: POST /v1/search, /v1/search/compare — single + fan-out POST /v1/extract, /v1/extract/compare — single + fan-out GET /v1/runs, /v1/runs/:id — history POST /v1/runs/:run/results/:id/rate — manual eval GET /v1/providers, /v1/providers/health — catalog + readiness Auto-routing: when `provider` is omitted, queries are classified via regex (fast path, 0ms) with optional mana-llm fallback, then routed to the first available provider for that query type (news → tavily, academic → exa, semantic → exa, etc.). Credits: server-key calls go through mana-credits reserve → commit/refund so failed provider calls don't charge the user. BYO-keys supported via research.provider_configs (UI arrives in Phase 4). Cache: Redis with graceful degradation (1h TTL for search, 24h for extract). Pay-per-use APIs only — no subscription-gated providers. Docs: docs/plans/mana-research-service.md + docs/reports/web-research-capabilities.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 14:42:25 +02:00
Till JS	a1bb703086	docs: final report update — 7/10 roadmap items done, all tables consistent Fix missing strikethroughs in §6 table (#1, #2, #6) and update Fazit (conclusion) to reflect final state: 7 of 10 items done. Document remaining 3 long-term items with context on dependencies and priorities. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 15:23:37 +02:00
Till JS	62fc566693	docs: mark OTel tracing (#7) as done in architecture report 7 of 10 roadmap items now complete. Remaining: Agent-to-Agent (#4), A2A Agent Cards (#8), Graph Workflows (#9), Agent Memory (#10). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 15:21:45 +02:00
Till JS	acd7e0d6b0	docs: update architecture comparison — 5/10 roadmap items done Update report to reflect all completed work: - Matrix: streaming ✅, tool registration updated to 29 tools + MCP - §5.2 Streaming: marked done - §5.3 Tool System: marked done - §6 Table: items 1-3 + 5 struck through with commit refs - §8 Fazit: updated gaps and recommendations 5 of 10 roadmap items complete in one session: 1. SSE Streaming, 2. Dynamic Tool Registry, 3. Budget Enforcement, 5. MCP Server Export (27/29 tools with DB ops), plus Tool Drift Fix. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 15:00:09 +02:00
Till JS	04c806fbb2	feat(mcp): implement remaining 19 tool handlers (27/29 total) Complete tool handler coverage for the MCP server: Todo: complete_tasks_by_title Calendar: create_event (with timeBlock) Notes: update_note, append_to_note, add_tag_to_note Places: create_place, visit_place, get_places Drink: log_drink, get_drink_progress, undo_drink Food: log_meal, nutrition_summary Journal: create_journal_entry Habits: create_habit, log_habit (get_habits improved) News: save_news_article 27 of 29 tools now have real implementations. Remaining 2 (research_news, get_current_location) need external service calls that aren't available in the API server context. Also updates architecture comparison report to mark MCP as done. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 14:08:57 +02:00
Till JS	be81d11dc3	feat(ai): SSE streaming for foreground Mission Runner Enable real-time token streaming during the planner "calling-llm" phase so the user sees live progress ("empfange Plan… 128 tokens") instead of a static spinner. The parser still receives the full text once complete — no partial-JSON risk. Changes: - Extract shared SSE parser from playground into @mana/shared-llm/sse-parser - remote.ts: use stream:true when onToken callback is provided - AiPlanInput: add optional onToken field (shared-ai) - ai-plan task: pass onToken through to backend.generate() - runner.ts: throttled (500ms) phaseDetail updates during streaming - Playground: refactored to use shared SSE parser Also includes: AI agent architecture comparison report (docs/reports/) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 12:32:43 +02:00
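
The parallel-reads rule described in commit e5d230e599 (all-or-nothing parallel dispatch, batches of 10, results kept in source order) can be sketched roughly as follows. This is an illustration only, not the code in loop.ts: `ToolCall`, `ToolResult`, `dispatchRound`, and the `run` callback are hypothetical names; only `isParallelSafe` and `PARALLEL_TOOL_BATCH_SIZE` come from the commit message.

```typescript
// Hypothetical shapes for illustration; the real types in loop.ts may differ.
type ToolCall = { id: string; name: string; args: unknown };
type ToolResult = { callId: string; output: string };

// Batch size named in the commit message.
const PARALLEL_TOOL_BATCH_SIZE = 10;

// Dispatch one round of tool calls. Results are always returned in source
// order, regardless of which call finishes first.
async function dispatchRound(
  calls: ToolCall[],
  run: (call: ToolCall) => Promise<ToolResult>,
  isParallelSafe?: (call: ToolCall) => boolean,
): Promise<ToolResult[]> {
  const results: ToolResult[] = [];

  // An undefined predicate (default-off) or a single non-safe call
  // downgrades the whole round to sequential dispatch.
  const parallel =
    isParallelSafe !== undefined && calls.every((c) => isParallelSafe(c));

  if (!parallel) {
    for (const call of calls) {
      results.push(await run(call));
    }
    return results;
  }

  // Promise.all resolves to an array in input order, so appending each
  // batch keeps source order even when completion order differs.
  for (let i = 0; i < calls.length; i += PARALLEL_TOOL_BATCH_SIZE) {
    const batch = calls.slice(i, i + PARALLEL_TOOL_BATCH_SIZE);
    const batchResults = await Promise.all(batch.map((c) => run(c)));
    results.push(...batchResults);
  }
  return results;
}
```

Because the caller only ever sees `results` in source order, appending them to a message list afterwards would keep a debug log linear, which matches the behaviour the commit message describes.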

10 commits
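
The cross-tick flow from commit 2a18cb5ee4 (submit on one tick, poll on later ticks, delete the row on a terminal state) can be sketched as a small state machine. This is a minimal sketch under stated assumptions, not the actual `handleDeepResearch()`: an in-memory `Map` stands in for the `mana_ai.mission_research_jobs` table, and `ResearchClient`, `PollResult`, `Outcome`, and the choice to treat a failed poll transport as "still pending" are all hypothetical.

```typescript
type JobStatus = "running" | "completed" | "failed";
type PollResult = { status: JobStatus; report?: string };

// Hypothetical client interface; null models the "graceful-null on
// transport errors" behaviour mentioned in the commit message.
interface ResearchClient {
  submit(objective: string): Promise<string | null>;
  poll(jobId: string): Promise<PollResult | null>;
}

type Outcome =
  | { kind: "skipped"; reason: "submitted" | "pending" }
  | { kind: "ready"; report: string }
  | { kind: "fallback" }; // caller would use the shallow pre-research path

// Stand-in for the jobs table keyed by (user_id, mission_id):
// presence of a key is the "pending" flag.
const pendingJobs = new Map<string, string>();

async function handleDeepResearch(
  client: ResearchClient,
  userId: string,
  missionId: string,
  objective: string,
): Promise<Outcome> {
  const key = `${userId}:${missionId}`;
  const jobId = pendingJobs.get(key);

  // Tick N: nothing pending yet, so submit and skip planning this tick.
  if (jobId === undefined) {
    const submitted = await client.submit(objective);
    if (submitted === null) return { kind: "fallback" };
    pendingJobs.set(key, submitted);
    return { kind: "skipped", reason: "submitted" };
  }

  // Tick N+k: poll. Assumption: a transport error counts as still pending.
  const polled = await client.poll(jobId);
  if (polled === null || polled.status === "running") {
    return { kind: "skipped", reason: "pending" };
  }

  // Terminal state: delete the row so later ticks see no pending job.
  pendingJobs.delete(key);
  if (polled.status === "completed" && polled.report !== undefined) {
    return { kind: "ready", report: polled.report };
  }
  return { kind: "fallback" };
}
```

Returning a tagged union keeps "research pending" distinct from a real failure, which is the same motivation the commit message gives for the discriminated union returned by `planOneMission`.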