Commit graph

5 commits

Author SHA1 Message Date
Till JS
8acf35eecf chore(dev): finish --watch → --hot sweep across remaining Bun services
Some checks failed
CD Mac Mini / Detect Changes (push) Failing after 11s
CI / Detect Changes (push) Successful in 7s
CI / Validate (push) Has been skipped
CI / Build mana-auth (push) Waiting to run
CI / Build mana-search (push) Waiting to run
CI / Build mana-sync (push) Waiting to run
CI / Build mana-notify (push) Waiting to run
CI / Build mana-api-gateway (push) Waiting to run
CI / Build mana-crawler (push) Waiting to run
CI / Build mana-media (push) Waiting to run
CI / Build mana-credits (push) Waiting to run
CI / Auth flow integration test (push) Has been skipped
Docker Validate / Validate Dockerfiles (push) Failing after 1m35s
Docker Validate / Build calendar-web (push) Has been skipped
Docker Validate / Build quotes-web (push) Has been skipped
Docker Validate / Build todo-backend (push) Has been skipped
Docker Validate / Build todo-web (push) Has been skipped
Docker Validate / Build mana-auth (push) Has been skipped
Docker Validate / Build mana-sync (push) Has been skipped
Docker Validate / Build mana-media (push) Has been skipped
Mirror to Forgejo / Push to Forgejo (push) Failing after 1s
CD Mac Mini / Deploy (push) Has been cancelled
Catches the service-level package.json files that the previous
sweep (4cca25ed0) missed — they don't appear in any dev:*:full
orchestrator but get invoked when someone runs `pnpm --filter
@mana/<service> dev` directly.

Touched: mana-geocoding, mana-mail, mana-subscriptions, mana-mcp,
news-ingester, mana-persona-runner, mana-research, mana-user,
plus apps/memoro (server + audio-server).

mana-ai stays on --watch on purpose: its entry uses an explicit
`Bun.serve({...})` call instead of `export default { port,
fetch }`, plus a SIGTERM/SIGINT handler that calls
`server.stop()`. --hot would replace the module without releasing
the old server reference and produce exactly the EADDRINUSE we're
trying to avoid. If mana-ai gets refactored to the standard
default-export shape, flip its dev script too.
2026-05-08 14:33:27 +02:00
Till JS
3a7bc7f1c3 test(mana-research): fixture-based tests for Gemini poll-response parser
Re-commit of c413ab7dd (reverted in c31dcdd66) without the unrelated
files that accidentally got swept into the original stage. Parser
content is identical.

The real Gemini /v1beta/interactions/:id completed shape bit us once
already during the initial smoke-test (we had OpenAI-style nested
`output.message.content[]` coded; reality is a flat `outputs` array
of thought|text|image items, with url_citations that carry no title
and usage fields named `total_input_tokens` rather than `input_tokens`).

This test pins the parser against a synthetic fixture covering the
cases we saw in the wild plus the failure modes that are hard to
provoke from a live API call:

  - status dispatch (queued, in_progress, failed, cancelled, incomplete)
  - completed body concatenated across text items, skipping thought/image
  - empty/missing `outputs` without crashing
  - missing usage
  - citations deduped by url, hostname extracted as title
  - wrong-type annotations and those without url skipped
  - real vertexaisearch redirect URLs Gemini emits
  - fallback to url as title when the URL is unparseable
  - trimming of leading/trailing whitespace

To make this testable I pulled the completed-branch of
pollGeminiDeepResearch into a standalone parseInteractionResponse
helper — same behaviour, now reachable without mocking global fetch.

Also adds the `test` script to package.json so `pnpm --filter
@mana/research-service test` works.

17 pass / 0 fail.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 18:44:21 +02:00
Till JS
c31dcdd66c Revert "test(mana-research): fixture-based tests for Gemini poll-response parser"
This reverts commit c413ab7dd3.
2026-04-22 18:43:48 +02:00
Till JS
c413ab7dd3 test(mana-research): fixture-based tests for Gemini poll-response parser
The real Gemini /v1beta/interactions/:id completed shape bit us once
already during the initial smoke-test (we had OpenAI-style nested
`output.message.content[]` coded; reality is a flat `outputs` array
of thought|text|image items, with url_citations that carry no title
and usage fields named `total_input_tokens` rather than `input_tokens`).

This test pins the parser against a synthetic fixture covering the
cases we saw in the wild plus the failure modes that are hard to
provoke from a live API call:

  - status dispatch (queued, in_progress, failed, cancelled, incomplete)
  - completed body concatenated across text items, skipping thought/image
  - empty/missing `outputs` without crashing
  - missing usage
  - citations deduped by url, hostname extracted as title
  - wrong-type annotations and those without url skipped
  - real vertexaisearch redirect URLs Gemini emits
  - fallback to url as title when the URL is unparseable
  - trimming of leading/trailing whitespace

To make this testable I pulled the completed-branch of
pollGeminiDeepResearch into a standalone parseInteractionResponse
helper — same behaviour, now reachable without mocking global fetch.

Also adds the `test` script to package.json so `pnpm --filter
@mana/research-service test` works.

17 pass / 0 fail.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 18:34:33 +02:00
Till JS
2bdb48bdd1 feat(research): add mana-research service — Phase 1 + 2
New Bun/Hono service on port 3068 that bundles many web-research providers
behind a unified interface for side-by-side comparison. All eval runs
persist in research.* (mana_platform) so quality can be reviewed later.

Providers (Phase 1+2):
  search:  searxng, duckduckgo, brave, tavily, exa, serper
  extract: readability (via mana-search), jina-reader, firecrawl

Endpoints:
  POST /v1/search, /v1/search/compare       — single + fan-out
  POST /v1/extract, /v1/extract/compare     — single + fan-out
  GET  /v1/runs, /v1/runs/:id               — history
  POST /v1/runs/:run/results/:id/rate       — manual eval
  GET  /v1/providers, /v1/providers/health  — catalog + readiness

Auto-routing: when `provider` is omitted, queries are classified via regex
(fast path, 0ms) with optional mana-llm fallback, then routed to the first
available provider for that query type (news → tavily, academic → exa,
semantic → exa, etc.).

Credits: server-key calls go through mana-credits reserve → commit/refund
so failed provider calls don't charge the user. BYO-keys supported via
research.provider_configs (UI arrives in Phase 4).

Cache: Redis with graceful degradation (1h TTL for search, 24h for
extract). Pay-per-use APIs only — no subscription-gated providers.

Docs: docs/plans/mana-research-service.md + docs/reports/web-research-capabilities.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 14:42:25 +02:00