- New providers gemini-deep-research + gemini-deep-research-max on the Interactions API (preview-04-2026). Submit/poll split, tier parameter selects between standard (~minutes, $1–3) and max (up to 60 min, $3–7).
- Parser matches the real response shape: flat `outputs` array of thought|text|image items, url_citation annotations without title, `usage.total_input_tokens` / `total_output_tokens`.
- Route generalisation: /v1/research/async accepts `provider` with default 'openai-deep-research' (backward compatible) and dispatches to the right submit/poll pair.
- New internal service-to-service endpoint /v1/internal/research/async gated by X-Service-Key + X-User-Id for credit accounting. Enables mana-ai to drive deep-research jobs on the mission owner's wallet without requiring a user JWT.
- Pricing: 300 credits (standard) / 1500 credits (max). Conservative markup over the ~$3/$7 ceiling so the first runs can't surprise us.
- Docs: AGENT_PROVIDER_IDS + pricing + env map + auto-router stay in sync; CLAUDE.md Phase 3b now current; API_KEYS.md references the new providers under GOOGLE_GENAI_API_KEY.

Verified with a real smoke test against the Gemini API: submit + poll both succeed, completed response parsed cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mana-research
Web research orchestration service. Bundles 16+ providers (search, extract, agent) behind one interface. Pay-per-use APIs only, integrated with mana-credits 2-phase debit.
Plan: docs/plans/mana-research-service.md
Related analysis: docs/reports/web-research-capabilities.md
API keys setup guide: API_KEYS.md (step-by-step per provider, pricing, signup URLs)
Tech Stack
| Layer | Technology |
|---|---|
| Runtime | Bun |
| Framework | Hono |
| Database | PostgreSQL + Drizzle ORM (research.* schema in mana_platform) |
| Cache | Redis (ioredis, graceful degradation) |
| Auth | JWT via JWKS from mana-auth, plus X-Service-Key for service-to-service |
Quick Start
# From repo root: ensure postgres + redis are up, then run
pnpm --filter @mana/research-service dev
# Database schema (creates research.* tables)
cd services/mana-research
bun run db:push
bun run db:studio
Port: 3068
Phases
- Phase 1 ✅: 4 search providers (`searxng`, `duckduckgo`, `brave`, `tavily`), `/v1/search`, `/v1/search/compare`, `/v1/runs`, `/v1/providers`, `mana-credits` reserve/commit/refund.
- Phase 2 ✅: +2 search providers (`exa`, `serper`), 3 extract providers (`readability`, `jina-reader`, `firecrawl`), `/v1/extract`, `/v1/extract/compare`, query classifier + auto-router, `/v1/providers/health`.
- Phase 3a ✅: 4 sync research agents (`perplexity-sonar`, `claude-web-search`, `openai-responses`, `gemini-grounding`), `/v1/research`, `/v1/research/compare`, agent auto-router.
- Phase 3b (current) ✅: async agents `openai-deep-research`, `gemini-deep-research`, `gemini-deep-research-max` via `research.async_jobs` queue. User-facing `/v1/research/async`, service-to-service `/v1/internal/research/async` (used by mana-ai's cross-tick deep-research flow). See `docs/reports/gemini-deep-research.md`.
- Phase 4: Research Lab UI + Settings for BYO-keys.
API Endpoints
User-facing (JWT auth)
| Method | Path | Description |
|---|---|---|
| POST | /api/v1/search | Single-provider search, or auto-routed if provider omitted. Body: { query, provider?, options?, useLlmClassifier? }. |
| POST | /api/v1/search/compare | Fan-out to N providers (max 5), persist eval_run. Body: { query, providers[], options? }. |
| POST | /api/v1/extract | Single-provider extract, auto-routed if provider omitted. Body: { url, provider?, options? }. |
| POST | /api/v1/extract/compare | Fan-out to N extract providers (max 4). Body: { url, providers[], options? }. |
| POST | /api/v1/research | Single-agent research. Auto-routed if provider omitted. Body: { query, provider?, options? }. |
| POST | /api/v1/research/compare | Fan-out to N agents (max 4). Body: { query, providers[], options? }. |
| GET | /api/v1/runs | List user's eval runs. Query: ?limit=50&offset=0. |
| GET | /api/v1/runs/:id | Run + all results. |
| POST | /api/v1/runs/:runId/results/:resultId/rate | Body: { rating: 1-5, notes? }. |
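The fan-out limits on the three compare endpoints can be checked before any provider is called. The following is a minimal sketch under stated assumptions: `validateCompareProviders` and `COMPARE_LIMITS` are hypothetical names, and de-duplicating the provider list is an assumed design choice, not documented behaviour. Only the limits themselves (5 for search, 4 for extract and research) come from the table.

```typescript
// Hypothetical helper: names are illustrative, not the service's actual code.
type CompareKind = "search" | "extract" | "research";

// Limits as documented for the three /compare endpoints.
const COMPARE_LIMITS: Record<CompareKind, number> = {
  search: 5,
  extract: 4,
  research: 4,
};

type ValidationResult =
  | { ok: true; providers: string[] }
  | { ok: false; error: string };

function validateCompareProviders(
  kind: CompareKind,
  providers: string[],
): ValidationResult {
  // Assumption: duplicates are collapsed (order preserved) so that the same
  // provider is not called twice within one compare run.
  const unique = providers.filter((p, i) => providers.indexOf(p) === i);
  if (unique.length === 0) {
    return { ok: false, error: "providers[] must not be empty" };
  }
  const max = COMPARE_LIMITS[kind];
  if (unique.length > max) {
    return { ok: false, error: `${kind}/compare accepts at most ${max} providers` };
  }
  return { ok: true, providers: unique };
}
```

Rejecting an oversized list up front means no credits need to be reserved for a request that would fail anyway.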
Public
| Method | Path | Description |
|---|---|---|
| GET | /health | Health check. |
| GET | /metrics | Prometheus stub (wired up later). |
| GET | /api/v1/providers | List registered providers + capabilities + pricing. |
| GET | /api/v1/providers/health | Per-provider readiness check (free / ready / needs-key). |
Service-to-service (X-Service-Key)
All /api/v1/internal/* routes require X-Service-Key: <MANA_SERVICE_KEY>. Endpoints that touch per-user state additionally require X-User-Id: <userId> so credit reservations + eval-run rows land on the right user.
| Method | Path | Description |
|---|---|---|
| GET | /api/v1/internal/health | Placeholder health probe. |
| POST | /api/v1/internal/research/async | Submit async research job. Body: { query, provider, options? } where provider ∈ { openai-deep-research, gemini-deep-research, gemini-deep-research-max }. Requires X-User-Id. |
| GET | /api/v1/internal/research/async/:id | Poll status / read completed result. Requires X-User-Id (same user as submit). |
Caller today: mana-ai (ManaResearchClient), which fires deep-research-max tasks from the tick-loop's pre-planning step for missions that opt in via DEEP_RESEARCH_TRIGGER.
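For a service caller, the submit and poll requests differ only in method, path, and body; both carry the same two headers. The sketch below builds those requests as plain objects. The function names and the `RequestSpec` shape are hypothetical, and this is not the actual `ManaResearchClient`. The response format (job id field, status values) is not specified in this document, so the sketch stops at request construction.

```typescript
// Sketch only: function names and RequestSpec are hypothetical.
type AsyncProvider =
  | "openai-deep-research"
  | "gemini-deep-research"
  | "gemini-deep-research-max";

interface RequestSpec {
  url: string;
  method: "GET" | "POST";
  headers: Record<string, string>;
  body?: string;
}

// X-Service-Key authenticates the calling service; X-User-Id selects the
// user whose credits are reserved and who owns the eval-run rows.
function internalHeaders(serviceKey: string, userId: string): Record<string, string> {
  return { "X-Service-Key": serviceKey, "X-User-Id": userId };
}

function buildSubmit(
  baseUrl: string,
  serviceKey: string,
  userId: string,
  query: string,
  provider: AsyncProvider,
  options?: Record<string, unknown>,
): RequestSpec {
  return {
    url: `${baseUrl}/api/v1/internal/research/async`,
    method: "POST",
    headers: {
      ...internalHeaders(serviceKey, userId),
      "Content-Type": "application/json",
    },
    // JSON.stringify drops `options` when it is undefined.
    body: JSON.stringify({ query, provider, options }),
  };
}

function buildPoll(
  baseUrl: string,
  serviceKey: string,
  userId: string,
  jobId: string,
): RequestSpec {
  return {
    url: `${baseUrl}/api/v1/internal/research/async/${encodeURIComponent(jobId)}`,
    method: "GET",
    // Must be the same user id that submitted the job.
    headers: internalHeaders(serviceKey, userId),
  };
}
```

Each spec can be sent with `fetch(spec.url, { method: spec.method, headers: spec.headers, body: spec.body })`. Keeping construction separate from transport makes the header requirements easy to unit-test.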
Providers
Search (6)
| Provider | Key | Cost (credits) | Notes |
|---|---|---|---|
| searxng | none | 0 | Wraps mana-search (SearXNG). Self-hosted. |
| duckduckgo | none | 0 | Instant Answer API. Rate-limited. |
| brave | BRAVE_API_KEY | 5 | $5/1k PAYG. Independent index. |
| tavily | TAVILY_API_KEY | 8 | Agent-optimized, returns content. |
| exa | EXA_API_KEY | 6 | Semantic/neural, best for papers + semantic similarity. |
| serper | SERPER_API_KEY | 1 | Google SERP as JSON. $0.30–1/1k. |
Extract (3)
| Provider | Key | Cost (credits) | Notes |
|---|---|---|---|
| readability | none | 0 | Wraps mana-search /extract (go-readability). |
| jina-reader | optional JINA_API_KEY | 1 | r.jina.ai, JS-rendering + PDF, Markdown out. |
| firecrawl | FIRECRAWL_API_KEY | 10 | Playwright-based, best for JS-heavy sites. Self-hostable. |
Research Agents (4 sync, 3 async)
| Provider | Key | Cost (credits) | Notes |
|---|---|---|---|
| perplexity-sonar | PERPLEXITY_API_KEY | 50 | 4 models: sonar, sonar-pro, sonar-reasoning, sonar-deep-research. Best plug-and-play. |
| gemini-grounding | GOOGLE_GENAI_API_KEY | 100 | Gemini + Google Search grounding. Single-step. |
| openai-responses | OPENAI_API_KEY | 200 | Responses API with web_search_preview tool. Multi-step. |
| claude-web-search | ANTHROPIC_API_KEY | 200 | Claude + web_search_20250305 tool, up to 5 searches/call. |
| openai-deep-research | OPENAI_API_KEY | 1000 | async, returns taskId to poll. |
| gemini-deep-research | GOOGLE_GENAI_API_KEY | 300 | async, Gemini 3.1 Pro preview (04-2026). Standard tier, ~minutes. |
| gemini-deep-research-max | GOOGLE_GENAI_API_KEY | 1500 | async, Gemini 3.1 Pro preview (04-2026). Max tier, up to 60 min, deep synthesis. |
Auto-routing
When provider is omitted from POST /v1/search, the service classifies the query via regex (fast path, ~0ms) and optionally the LLM (useLlmClassifier: true), then picks the first available provider from SEARCH_ROUTE_MAP[type]:
- `news` → tavily, brave, serper, searxng, duckduckgo
- `general` → brave, tavily, serper, searxng
- `semantic` → exa, tavily, brave
- `academic` → exa, searxng, brave
- `code` → exa, serper, brave
- `conversational` → tavily, brave, serper
Extract auto-routing prefers firecrawl (best quality) → jina-reader → readability.
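The routing rule described here (classify, then take the first available provider in the preference list) can be sketched as follows. The route map and the extract preference order are copied from this section. The regex patterns in `classifyByRegex` are invented placeholders, since the real classifier patterns are not documented here, and `pickFirstAvailable` / `routeSearch` are hypothetical names.

```typescript
type QueryType =
  | "news" | "general" | "semantic" | "academic" | "code" | "conversational";

// Preference lists as documented in this section.
const SEARCH_ROUTE_MAP: Record<QueryType, string[]> = {
  news: ["tavily", "brave", "serper", "searxng", "duckduckgo"],
  general: ["brave", "tavily", "serper", "searxng"],
  semantic: ["exa", "tavily", "brave"],
  academic: ["exa", "searxng", "brave"],
  code: ["exa", "serper", "brave"],
  conversational: ["tavily", "brave", "serper"],
};

const EXTRACT_PREFERENCE = ["firecrawl", "jina-reader", "readability"];

// Toy stand-in for the regex fast path: these patterns are assumptions.
function classifyByRegex(query: string): QueryType {
  if (/\b(news|today|latest|breaking)\b/i.test(query)) return "news";
  if (/\b(arxiv|paper|doi|study)\b/i.test(query)) return "academic";
  if (/\b(error|stack trace|npm|function|api)\b/i.test(query)) return "code";
  return "general";
}

// Returns the first candidate that is currently usable, or undefined if
// none of the preferred providers is available.
function pickFirstAvailable(
  candidates: string[],
  available: Set<string>,
): string | undefined {
  for (const name of candidates) {
    if (available.has(name)) return name;
  }
  return undefined;
}

function routeSearch(query: string, available: Set<string>): string | undefined {
  return pickFirstAvailable(SEARCH_ROUTE_MAP[classifyByRegex(query)], available);
}
```

With only `brave` and `searxng` configured, a news-style query falls through `tavily` and lands on `brave`; extract routing reuses the same helper with `EXTRACT_PREFERENCE`.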
Credits Integration
Server-key mode uses mana-credits 2-phase debit:
reserve → provider call → (commit on success | refund on error)
BYO-key mode bypasses credits entirely (user brings their own API key, Phase 4 UI).
Pricing map: src/lib/pricing.ts.
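The reserve → provider call → commit/refund flow can be wrapped around any provider call. This is a minimal sketch, assuming a hypothetical `CreditsClient` interface; the real mana-credits client API is not shown in this document, and skipping the reservation for zero-cost providers is likewise an assumption.

```typescript
// Hypothetical interface: method names and signatures are assumptions.
interface CreditsClient {
  reserve(userId: string, amount: number): Promise<string>; // reservation id
  commit(reservationId: string): Promise<void>;
  refund(reservationId: string): Promise<void>;
}

async function withTwoPhaseDebit<T>(
  credits: CreditsClient,
  userId: string,
  cost: number,
  call: () => Promise<T>,
): Promise<T> {
  // Assumption: free providers (cost 0) need no reservation at all.
  if (cost === 0) return call();

  // Phase 1: hold the credits before spending money on the provider.
  const reservationId = await credits.reserve(userId, cost);

  let result: T;
  try {
    result = await call();
  } catch (err) {
    // Provider failed: release the hold, then surface the original error.
    await credits.refund(reservationId);
    throw err;
  }

  // Phase 2: the provider call succeeded, so finalize the debit.
  await credits.commit(reservationId);
  return result;
}
```

The commit sits outside the `try` block on purpose: a failing commit should not trigger a refund of a call that actually succeeded.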
Database
Schema research in mana_platform:
- `eval_runs`: one per request (single/compare/auto mode).
- `eval_results`: one per provider response. Raw + normalized output, latency, cost, optional user rating.
- `provider_configs`: per-user BYO-key + budget. `userId=null` reserved for server defaults.
- `provider_stats`: rolled-up daily metrics for admin dashboard + auto-router.
All eval runs are permanent by design — this is the comparison engine's point.
Environment Variables
PORT=3068
DATABASE_URL=postgresql://mana:devpassword@localhost:5432/mana_platform
REDIS_URL=redis://localhost:6379
MANA_AUTH_URL=http://localhost:3001
MANA_LLM_URL=http://localhost:3025
MANA_CREDITS_URL=http://localhost:3061
MANA_SEARCH_URL=http://localhost:3021
MANA_SERVICE_KEY=dev-service-key
CACHE_TTL_SECONDS=3600
CORS_ORIGINS=http://localhost:5173
# Provider keys (optional in dev — providers without keys are unavailable)
BRAVE_API_KEY=
TAVILY_API_KEY=
EXA_API_KEY=
SERPER_API_KEY=
JINA_API_KEY=
FIRECRAWL_API_KEY=
SCRAPINGBEE_API_KEY=
PERPLEXITY_API_KEY=
ANTHROPIC_API_KEY=
OPENAI_API_KEY=
GOOGLE_GENAI_API_KEY=
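Since providers without keys are unavailable, the readiness states reported by /api/v1/providers/health (free / ready / needs-key) can be derived from the environment alone. The sketch below is one way to do that, not the service's actual check: `PROVIDER_KEYS` and `readiness` are hypothetical names, the key assignments are taken from the provider tables, and treating `jina-reader` without a key as `free` is an assumption based on its key being listed as optional.

```typescript
type Readiness = "free" | "ready" | "needs-key";

interface KeySpec {
  envVar?: string;    // undefined: provider needs no key
  optional?: boolean; // key improves the provider but is not required
}

// Key requirements as listed in the provider tables.
const PROVIDER_KEYS: Record<string, KeySpec> = {
  searxng: {},
  duckduckgo: {},
  readability: {},
  "jina-reader": { envVar: "JINA_API_KEY", optional: true },
  brave: { envVar: "BRAVE_API_KEY" },
  tavily: { envVar: "TAVILY_API_KEY" },
  exa: { envVar: "EXA_API_KEY" },
  serper: { envVar: "SERPER_API_KEY" },
  firecrawl: { envVar: "FIRECRAWL_API_KEY" },
  "perplexity-sonar": { envVar: "PERPLEXITY_API_KEY" },
  "claude-web-search": { envVar: "ANTHROPIC_API_KEY" },
  "openai-responses": { envVar: "OPENAI_API_KEY" },
  "openai-deep-research": { envVar: "OPENAI_API_KEY" },
  "gemini-grounding": { envVar: "GOOGLE_GENAI_API_KEY" },
  "gemini-deep-research": { envVar: "GOOGLE_GENAI_API_KEY" },
  "gemini-deep-research-max": { envVar: "GOOGLE_GENAI_API_KEY" },
};

function readiness(
  provider: string,
  env: Record<string, string | undefined>,
): Readiness {
  const spec = PROVIDER_KEYS[provider];
  if (spec === undefined) throw new Error(`unknown provider: ${provider}`);
  if (spec.envVar === undefined) return "free";
  // An empty value such as `BRAVE_API_KEY=` counts as "not configured".
  const hasKey = (env[spec.envVar] ?? "").trim() !== "";
  if (hasKey) return "ready";
  return spec.optional ? "free" : "needs-key";
}
```

Under Bun or Node, `readiness("brave", process.env)` would report `needs-key` for the empty dev defaults listed in this section.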