Commit graph

12 commits

Author SHA1 Message Date
Till JS
fc49198992 docs(geocoding): post-migration log + Photon weekly-refresh operator scripts
- Decision report: status flipped to MIGRATED; added migration log with
  five WSL2 gotchas (bzip2 missing, no official Photon image,
  firewall=true blocks cross-LAN, vmIdleTimeout=-1 ineffective,
  PowerShell pre-expansion of bash $(...)) and resource snapshot.
- mana-geocoding CLAUDE.md: PHOTON_SELF_API_URL note now reflects live
  primary status on mana-gpu since 2026-04-28.
- photon-self/: operator scripts for the weekly DB refresh — update.sh
  (atomic-swap with rollback), systemd unit + timer (Sun 03:30 +30min
  jitter, Persistent=true), README with re-installation instructions
  for DR. Currently installed and enabled on mana-gpu.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 21:31:37 +02:00
Till JS
6f83fba66a docs(reports): geocoding self-hosting decision — recommend Photon on mana-gpu
Compares Pelias / Nominatim / Photon for self-hosting on the GPU
server, with current (2026-04-28) numbers from upstream docs +
GraphHopper's Photon-data downloads:

  Photon Europe pre-built dump: 30.6 GB, weekly refresh
  Photon Germany pre-built dump: 5.8 GB, weekly refresh
  Nominatim Germany import:     ~100 GB disk, 8–12 h, 12 GB RAM
  Pelias DACH (current):         3 GB RAM, 4 services, JS patch hack

Recommendation: Photon Europe-wide on mana-gpu. Single Java process,
embedded OpenSearch, no PBF import (download a tarball, restart),
weekly auto-updates from GraphHopper, integrates with the wrapper's
existing PhotonProvider via just an env-var change.

Once self-hosted, Photon registers as `privacy: 'local'` — the
sensitive-query block (Hausarzt, Klinikum, …) gets a real local
backend and no longer has to return empty results when Pelias is
down. Public Photon stays in the chain as a `privacy: 'public'`
last-resort fallback.

Migration plan included (~3–4 h total, ~1 h waiting), with
phase-by-phase risk assessment.

Pelias does not return — the 3 GB RAM + multi-container + patched
JS combination has no operational case once we have a self-hosted
Photon that already matches our wrapper's wire format.
2026-04-28 17:04:30 +02:00
Till JS
e5d230e599 feat(agent-loop): M1 — policy gate + reminder channel + parallel reads
Three Claude-Code-inspired primitives for runPlannerLoop, derived from the
reverse-engineering reports in docs/reports/:

1. **Policy gate** (@mana/tool-registry) — evaluatePolicy() gates every tool
   dispatch: denies admin-scope, denies destructive tools not in the user's
   opt-in list, rate-limits per tool (30/60s default), flags prompt-injection
   markers in freetext without blocking. Wired into mana-mcp with a
   per-user rolling invocation log and POLICY_MODE env (off|log-only|enforce,
   default log-only). mana-ai uses detectInjectionMarker only — tool dispatch
   there is plan-only, so rate-limit/destructive checks don't apply yet.

2. **Reminder channel** (packages/shared-ai/src/planner/loop.ts) — new
   reminderChannel callback in PlannerLoopInput. Called once per round with
   LoopState snapshot (round, toolCallCount, usage, lastCall); returned
   strings wrap in <reminder> tags and inject as transient system messages
   into THIS LLM request only. Never pushed to messages[] — the Claude-Code
   <system-reminder> pattern that keeps the KV-cache prefix stable.

3. **Parallel reads** (loop.ts) — isParallelSafe predicate enables
   Promise.all dispatch when every tool_call in a round is parallel-safe,
   in batches of PARALLEL_TOOL_BATCH_SIZE=10. Any non-safe call downgrades
   the whole round to sequential. messages[] always appends in source
   order, never completion order, so the debug log stays linear.
   Default-off (undefined predicate) preserves pre-M1 behaviour.

Tests: 21 new in tool-registry (policy), 9 new in shared-ai (5 parallel,
4 reminder). All 74 green, type-check clean across 4 packages.

Design/plan: docs/plans/agent-loop-improvements-m1.md
Reports: docs/reports/claude-code-architecture.md,
         docs/reports/mana-agent-improvements-from-claude-code.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 13:56:40 +02:00
Till JS
d5b889ac58 docs(gemini-deep-research): Mac-Mini deploy log 2026-04-22
Capture the surprises from the first deploy so the next rollout
(or rollback) has the full picture without spelunking logs:

- mana-research had never been started on the Mac-Mini, even though
  it was defined in compose. First-boot via `docker compose up -d`.
- research.* schema is not auto-migrated on service boot — drizzle
  push must be triggered explicitly: `docker exec mana-research
  bun run db:push`. 5 tables created.
- GOOGLE_GENAI_API_KEY was missing in /Users/mana/.../mana-monorepo/.env.
  Copied the local key over, with `.env.bak.pre-gemini-deep-research`
  as rollback anchor.
- Redis NOAUTH fix (commit 4867300d0) referenced here.
- Smoke-test outcome documented: the 500 was mana-credits 404 on a
  test user without a wallet row — expected, and it proves the whole
  auth/dispatch chain up to the credits hop works.
- Also noted: mana-llm has the same bare REDIS_URL in compose
  (out-of-scope for this deploy), and /providers/health does not list
  async providers (known design gap).

Status header updated to reflect deploy completion. Flag stays off
(MANA_AI_DEEP_RESEARCH_ENABLED=false) pending explicit enablement.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 18:22:31 +02:00
Till JS
2a18cb5ee4 feat(mana-ai): v0.7 — cross-tick Deep Research Max pre-planning
Opt-in path for missions that want Gemini Deep Research Max (up to 60 min
per task) instead of the shallow RSS pre-research. Because Max runs well
past a single 60-second tick, the state is carried across ticks:

  tick N:   submit → INSERT mission_research_jobs row → skip planner
  tick N+k: poll → still running → skip planner (metric pending_skips)
  tick N+m: poll → completed → inject as ResolvedInput, DELETE row, plan

- ManaResearchClient talks to mana-research's new internal
  /v1/internal/research/async endpoints with X-Service-Key +
  X-User-Id. Graceful-null on transport errors so a flaky
  mana-research never crashes the tick loop.
- New table mana_ai.mission_research_jobs with PK (user_id, mission_id)
  — presence is the "pending" flag; delete-on-terminal keeps queries
  trivial.
- handleDeepResearch() encapsulates the state machine; planOneMission
  now returns a discriminated union (planned | skipped | failed) so
  "research pending" isn't miscounted as a parse failure.
- Opt-in at TWO gates to keep cost in check ($3–7/task, 1500 credits
  per run):
    1. MANA_AI_DEEP_RESEARCH_ENABLED=true server-side (default off)
    2. DEEP_RESEARCH_TRIGGER regex matches the mission objective
       (strict: "deep research", "tiefe recherche", "umfassende
       recherche", "hintergrundrecherche", "deep dive")
  Falls back to shallow RSS when either gate fails or the submit
  errors upstream.
- Prom metrics: mana_ai_research_jobs_{submitted,completed,failed}_total
  labelled by provider, plus _pending_skips_total.
- docker-compose wires MANA_RESEARCH_URL + the opt-in flag and adds
  mana-research to depends_on.
- Full write-up with real API response shape (outputs plural, not
  OpenAI-style), step-3 MCP-server plan (security-gated, not built),
  ops + kill-switch: docs/reports/gemini-deep-research.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 17:56:06 +02:00
Till JS
11f768b8e5 docs(invoices): ClubDesk vs. Mana comparison + invoices module plan
Competitive analysis of ClubDesk (reeweb ag, ~20'000 DACH clubs) with a
dual-use roadmap identifying features that benefit both clubs and general
users (freelancers/creators). First chosen step: invoices module with
Swiss QR-Bill as the CH-differentiator.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 15:27:57 +02:00
Till JS
2bdb48bdd1 feat(research): add mana-research service — Phase 1 + 2
New Bun/Hono service on port 3068 that bundles many web-research providers
behind a unified interface for side-by-side comparison. All eval runs
persist in research.* (mana_platform) so quality can be reviewed later.

Providers (Phase 1+2):
  search:  searxng, duckduckgo, brave, tavily, exa, serper
  extract: readability (via mana-search), jina-reader, firecrawl

Endpoints:
  POST /v1/search, /v1/search/compare       — single + fan-out
  POST /v1/extract, /v1/extract/compare     — single + fan-out
  GET  /v1/runs, /v1/runs/:id               — history
  POST /v1/runs/:run/results/:id/rate       — manual eval
  GET  /v1/providers, /v1/providers/health  — catalog + readiness

Auto-routing: when `provider` is omitted, queries are classified via regex
(fast path, 0ms) with optional mana-llm fallback, then routed to the first
available provider for that query type (news → tavily, academic → exa,
semantic → exa, etc.).

Credits: server-key calls go through mana-credits reserve → commit/refund
so failed provider calls don't charge the user. BYO-keys supported via
research.provider_configs (UI arrives in Phase 4).

Cache: Redis with graceful degradation (1h TTL for search, 24h for
extract). Pay-per-use APIs only — no subscription-gated providers.

Docs: docs/plans/mana-research-service.md + docs/reports/web-research-capabilities.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 14:42:25 +02:00
Till JS
a1bb703086 docs: final report update — 7/10 roadmap items done, all tables consistent
Fix missing strikethroughs in §6 table (#1, #2, #6) and update Fazit
to reflect final state: 7 of 10 items done. Document remaining 3
langfristige Punkte with context on dependencies and priorities.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 15:23:37 +02:00
Till JS
62fc566693 docs: mark OTel tracing (#7) as done in architecture report
7 of 10 roadmap items now complete. Remaining: Agent-to-Agent (#4),
A2A Agent Cards (#8), Graph Workflows (#9), Agent Memory (#10).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 15:21:45 +02:00
Till JS
acd7e0d6b0 docs: update architecture comparison — 5/10 roadmap items done
Update report to reflect all completed work:
- Matrix: streaming , tool registration updated to 29 tools + MCP
- §5.2 Streaming: marked done
- §5.3 Tool System: marked done
- §6 Table: items 1-3 + 5 struck through with commit refs
- §8 Fazit: updated gaps and recommendations

5 of 10 roadmap items complete in one session:
1. SSE Streaming, 2. Dynamic Tool Registry, 3. Budget Enforcement,
5. MCP Server Export (27/29 tools with DB ops), plus Tool Drift Fix.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 15:00:09 +02:00
Till JS
04c806fbb2 feat(mcp): implement remaining 19 tool handlers (27/29 total)
Complete tool handler coverage for the MCP server:

Todo: complete_tasks_by_title
Calendar: create_event (with timeBlock)
Notes: update_note, append_to_note, add_tag_to_note
Places: create_place, visit_place, get_places
Drink: log_drink, get_drink_progress, undo_drink
Food: log_meal, nutrition_summary
Journal: create_journal_entry
Habits: create_habit, log_habit (get_habits improved)
News: save_news_article

27 of 29 tools now have real implementations. Remaining 2
(research_news, get_current_location) need external service
calls that aren't available in the API server context.

Also updates architecture comparison report to mark MCP as done.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 14:08:57 +02:00
Till JS
be81d11dc3 feat(ai): SSE streaming for foreground Mission Runner
Enable real-time token streaming during the planner "calling-llm" phase
so the user sees live progress ("empfange Plan… 128 tokens") instead of
a static spinner. The parser still receives the full text once complete —
no partial-JSON risk.

Changes:
- Extract shared SSE parser from playground into @mana/shared-llm/sse-parser
- remote.ts: use stream:true when onToken callback is provided
- AiPlanInput: add optional onToken field (shared-ai)
- ai-plan task: pass onToken through to backend.generate()
- runner.ts: throttled (500ms) phaseDetail updates during streaming
- Playground: refactored to use shared SSE parser

Also includes: AI agent architecture comparison report (docs/reports/)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 12:32:43 +02:00