diff --git a/docs/plans/mana-research-service.md b/docs/plans/mana-research-service.md
index bc717b238..1554c17a2 100644
--- a/docs/plans/mana-research-service.md
+++ b/docs/plans/mana-research-service.md
@@ -430,17 +430,21 @@ Classification is optional and falls back to `'general'` on LLM timeout.
 - [x] Run list endpoints already delivered in Phase 1
 - [x] ~~Nightly job~~: live aggregation in the `addResult()` path via `onConflictDoUpdate` is sufficient for Phase 2.
 
-### Phase 3 — Research Agents + mana-ai Migration (≈ 1–2 weeks)
+### Phase 3a — Sync Research Agents ✅ (2026-04-17)
 
-- [ ] Provider adapters:
+- [x] Provider adapters (via direct HTTP, no SDK deps):
   - `PerplexitySonarProvider` (4 models: sonar, sonar-pro, sonar-reasoning, sonar-deep-research)
-  - `ClaudeWebSearchProvider` (via Anthropic SDK + tool-use)
-  - `OpenAIResponsesProvider` (via OpenAI SDK + `web_search_preview` tool)
-  - `GeminiGroundingProvider` (via google-genai SDK with Search grounding)
-  - `OpenAIDeepResearchProvider` — **async**, via BullMQ/inline job queue, response endpoint `GET /v1/research/tasks/:id`
-- [ ] `POST /v1/research` + `POST /v1/research/compare`
-- [ ] Auto-router for `conversational` queries → agent mode
-- [ ] Extend `mana-llm` with Anthropic and OpenAI providers (only for Claude/OpenAI agents; the rest of the LLM workflow stays Ollama-first)
+  - `ClaudeWebSearchProvider` (Anthropic Messages API with the `web_search_20250305` tool)
+  - `OpenAIResponsesProvider` (OpenAI Responses API with the `web_search_preview` tool)
+  - `GeminiGroundingProvider` (Google GenAI v1beta with Google Search grounding)
+- [x] `POST /v1/research` + `POST /v1/research/compare`
+- [x] Agent auto-router (`pickAgent` picks the first provider with a key: perplexity → gemini → openai → claude → deep-research)
+- [x] Agents integrated into `/v1/providers` + `/v1/providers/health`
+
+### Phase 3b — Async + Migrations (open)
+
+- [ ] `OpenAIDeepResearchProvider` — async, via job queue, `GET /v1/research/tasks/:id` polling endpoint
+- [ ] Auto-router for `conversational` queries → agent mode in `/v1/search` (currently separate endpoints)
 - [ ] **Migration:** `apps/api/src/modules/news-research/routes.ts` becomes a thin adapter over `mana-research`
 - [ ] **Migration:** `services/mana-ai/src/planner/news-research-client.ts` now calls `mana-research` directly instead of `mana-api`
 - [ ] **Migration:** `research_news` tool gets a `depth: 'shallow' | 'deep'` option; `deep` invokes agent mode
diff --git a/services/mana-research/CLAUDE.md b/services/mana-research/CLAUDE.md
index 18a128c47..b56a39b50 100644
--- a/services/mana-research/CLAUDE.md
+++ b/services/mana-research/CLAUDE.md
@@ -32,8 +32,9 @@ bun run db:studio
 ## Phases
 
 - **Phase 1** ✅ — 4 search providers (`searxng`, `duckduckgo`, `brave`, `tavily`), `/v1/search`, `/v1/search/compare`, `/v1/runs`, `/v1/providers`, `mana-credits` reserve/commit/refund.
-- **Phase 2 (current)** ✅ — +2 search providers (`exa`, `serper`), 3 extract providers (`readability`, `jina-reader`, `firecrawl`), `/v1/extract`, `/v1/extract/compare`, query classifier + auto-router, `/v1/providers/health`.
-- **Phase 3** — Research agents (`perplexity-sonar`, `claude-web-search`, `openai-responses`, `gemini-grounding`, `openai-deep-research`). mana-ai migration to use this service.
+- **Phase 2** ✅ — +2 search providers (`exa`, `serper`), 3 extract providers (`readability`, `jina-reader`, `firecrawl`), `/v1/extract`, `/v1/extract/compare`, query classifier + auto-router, `/v1/providers/health`.
+- **Phase 3a (current)** ✅ — 4 sync research agents (`perplexity-sonar`, `claude-web-search`, `openai-responses`, `gemini-grounding`), `/v1/research`, `/v1/research/compare`, agent auto-router.
+- **Phase 3b** — `openai-deep-research` (async via job queue), mana-ai migration to call mana-research, `research_news` tool gets `depth: shallow|deep` option, mana-api news-research becomes thin adapter.
 - **Phase 4** — Research Lab UI + Settings for BYO-keys.
 
 ## API Endpoints
@@ -46,6 +47,8 @@ bun run db:studio
 | POST | `/api/v1/search/compare` | Fan-out to N providers (max 5), persist eval_run. Body: `{ query, providers[], options? }`. |
 | POST | `/api/v1/extract` | Single-provider extract, auto-routed if `provider` omitted. Body: `{ url, provider?, options? }`. |
 | POST | `/api/v1/extract/compare` | Fan-out to N extract providers (max 4). Body: `{ url, providers[], options? }`. |
+| POST | `/api/v1/research` | Single-agent research. Auto-routed if `provider` omitted. Body: `{ query, provider?, options? }`. |
+| POST | `/api/v1/research/compare` | Fan-out to N agents (max 4). Body: `{ query, providers[], options? }`. |
 | GET | `/api/v1/runs` | List user's eval runs. Query: `?limit=50&offset=0`. |
 | GET | `/api/v1/runs/:id` | Run + all results. |
 | POST | `/api/v1/runs/:runId/results/:resultId/rate` | Body: `{ rating: 1-5, notes? }`. |
@@ -84,6 +87,16 @@ Reserved for Phase 3 when `mana-ai` migrates to call this service directly. `/ap
 | `jina-reader` | optional `JINA_API_KEY` | 1 | `r.jina.ai`, JS-rendering + PDF, Markdown out. |
 | `firecrawl` | `FIRECRAWL_API_KEY` | 10 | Playwright-based, best for JS-heavy sites. Self-hostable. |
 
+### Research Agents (4 sync, 1 async planned)
+
+| Provider | Key | Cost | Notes |
+|---|---|---|---|
+| `perplexity-sonar` | `PERPLEXITY_API_KEY` | 50 | 4 models: sonar, sonar-pro, sonar-reasoning, sonar-deep-research. Best plug-and-play. |
+| `gemini-grounding` | `GOOGLE_GENAI_API_KEY` | 100 | Gemini + Google Search grounding. Single-step. |
+| `openai-responses` | `OPENAI_API_KEY` | 200 | Responses API with `web_search_preview` tool. Multi-step. |
+| `claude-web-search` | `ANTHROPIC_API_KEY` | 200 | Claude + `web_search_20250305` tool, up to 5 searches/call. |
+| `openai-deep-research` | `OPENAI_API_KEY` | 1000 | ⏳ Phase 3b — async, returns taskId to poll. |
+
 ## Auto-routing
 
 When `provider` is omitted from `POST /v1/search`, the service classifies the query via regex (fast path, ~0ms) and optionally the LLM (`useLlmClassifier: true`), then picks the first available provider from `SEARCH_ROUTE_MAP[type]`:
diff --git a/services/mana-research/src/executor/execute-research.ts b/services/mana-research/src/executor/execute-research.ts
new file mode 100644
index 000000000..bcf54e18b
--- /dev/null
+++ b/services/mana-research/src/executor/execute-research.ts
@@ -0,0 +1,148 @@
+/**
+ * Agent-side executor. Same shape as search/extract but for long-running
+ * LLM-backed research calls with citations.
+ */
+
+import type {
+	AgentAnswer,
+	AgentOptions,
+	BillingMode,
+	ProviderId,
+	ProviderMeta,
+	ResearchAgent,
+} from '@mana/shared-research';
+import type { CreditsClient } from '../clients/mana-credits';
+import type { Config } from '../config';
+import { ProviderNotConfiguredError } from '../lib/errors';
+import { priceFor } from '../lib/pricing';
+import type { ConfigStorage } from '../storage/configs';
+import { cacheGet, cacheKey, cacheSet } from '../lib/cache';
+import { mapEnvKey } from './env-map';
+
+export interface ExecuteResearchInput {
+	provider: ResearchAgent;
+	query: string;
+	options: AgentOptions;
+	userId: string;
+	signal?: AbortSignal;
+}
+
+export interface ExecuteResearchOutput {
+	success: boolean;
+	data?: { answer: AgentAnswer };
+	meta: ProviderMeta;
+}
+
+export interface ExecutorDeps {
+	credits: CreditsClient;
+	configs: ConfigStorage;
+	config: Config;
+}
+
+export async function executeResearch(
+	input: ExecuteResearchInput,
+	deps: ExecutorDeps
+): Promise<ExecuteResearchOutput> {
+	const { provider, query, options, userId, signal } = input;
+	const providerId = provider.id;
+	const t0 = performance.now();
+
+	let apiKey: string | null = null;
+	let billingMode: BillingMode = 'free';
+
+	if (provider.requiresApiKey) {
+		const userConfig = await deps.configs.getForUser(userId, providerId);
+		if (userConfig?.enabled && userConfig.apiKeyEncrypted) {
+			apiKey = await deps.configs.decryptKey(userConfig);
+			if (apiKey) billingMode = 'byo-key';
+		}
+		if (!apiKey) {
+			apiKey = deps.config.providerKeys[mapEnvKey(providerId)] ?? null;
+			if (apiKey) billingMode = 'server-key';
+		}
+		if (!apiKey) {
+			return makeError(providerId, t0, new ProviderNotConfiguredError(providerId));
+		}
+	}
+
+	// Agent responses depend on query + model — include model in cache key
+	const ckey = cacheKey('agent', providerId, query, options);
+	const cached = await cacheGet<{ answer: AgentAnswer }>(ckey);
+	if (cached) {
+		return {
+			success: true,
+			data: cached,
+			meta: {
+				provider: providerId,
+				category: 'agent',
+				latencyMs: Math.round(performance.now() - t0),
+				costCredits: 0,
+				cacheHit: true,
+				billingMode,
+			},
+		};
+	}
+
+	const price = billingMode === 'server-key' ? priceFor(providerId, 'research') : 0;
+
+	let reservationId: string | null = null;
+	if (price > 0 && billingMode === 'server-key') {
+		try {
+			const reservation = await deps.credits.reserve(
+				userId,
+				price,
+				`research:${providerId}:research`
+			);
+			reservationId = reservation.reservationId;
+		} catch (err) {
+			return makeError(providerId, t0, err as Error);
+		}
+	}
+
+	try {
+		const res = await provider.research(query, options, { apiKey, userId, signal });
+		await cacheSet(ckey, { answer: res.answer }, deps.config.cacheTtlSeconds);
+
+		if (reservationId) {
+			await deps.credits
+				.commit(reservationId, `research ${providerId}`)
+				.catch((err) => console.warn('[executor] commit failed:', err));
+		}
+
+		return {
+			success: true,
+			data: { answer: res.answer },
+			meta: {
+				provider: providerId,
+				category: 'agent',
+				latencyMs: Math.round(performance.now() - t0),
+				costCredits: price,
+				cacheHit: false,
+				billingMode,
+			},
+		};
+	} catch (err) {
+		if (reservationId) {
+			await deps.credits
+				.refund(reservationId)
+				.catch((refundErr) => console.warn('[executor] refund failed:', refundErr));
+		}
+		return makeError(providerId, t0, err as Error);
+	}
+}
+
+function makeError(providerId: ProviderId, t0: number, err: Error): ExecuteResearchOutput {
+	const code = (err as { code?: string }).code ?? err.name ?? 'ERROR';
+	return {
+		success: false,
+		meta: {
+			provider: providerId,
+			category: 'agent',
+			latencyMs: Math.round(performance.now() - t0),
+			costCredits: 0,
+			cacheHit: false,
+			billingMode: 'free',
+			errorCode: code,
+		},
+	};
+}
diff --git a/services/mana-research/src/index.ts b/services/mana-research/src/index.ts
index d41b3944c..cf4630dda 100644
--- a/services/mana-research/src/index.ts
+++ b/services/mana-research/src/index.ts
@@ -18,6 +18,7 @@ import { serviceAuth } from './middleware/service-auth';
 import { healthRoutes } from './routes/health';
 import { createSearchRoutes } from './routes/search';
 import { createExtractRoutes } from './routes/extract';
+import { createResearchRoutes } from './routes/research';
 import { createProvidersRoutes } from './routes/providers';
 import { createRunsRoutes } from './routes/runs';
 import { buildRegistry } from './providers/registry';
@@ -84,6 +85,9 @@ app.route(
 app.use('/api/v1/extract/*', jwtAuth(config.manaAuthUrl));
 app.route('/api/v1/extract', createExtractRoutes(registry, runStorage, executorDeps, config));
 
+app.use('/api/v1/research/*', jwtAuth(config.manaAuthUrl));
+app.route('/api/v1/research', createResearchRoutes(registry, runStorage, executorDeps, config));
+
 app.use('/api/v1/runs/*', jwtAuth(config.manaAuthUrl));
 app.route('/api/v1/runs', createRunsRoutes(runStorage));
 
diff --git a/services/mana-research/src/providers/agent/claude-web-search.ts b/services/mana-research/src/providers/agent/claude-web-search.ts
new file mode 100644
index 000000000..5e31bb484
--- /dev/null
+++ b/services/mana-research/src/providers/agent/claude-web-search.ts
@@ -0,0 +1,135 @@
+/**
+ * Claude with server-side web_search tool.
+ * Docs: https://docs.anthropic.com/en/docs/build-with-claude/tool-use/web-search-tool
+ *
+ * Anthropic charges per tool invocation + tokens. No subscription.
+ */
+
+import type { ResearchAgent, Citation } from '@mana/shared-research';
+import { ProviderError, ProviderNotConfiguredError } from '../../lib/errors';
+
+const DEFAULT_MODEL = 'claude-opus-4-7';
+const DEFAULT_MAX_SEARCHES = 5;
+
+type ContentBlock =
+	| { type: 'text'; text: string; citations?: CitationBlock[] }
+	| { type: 'tool_use'; id: string; name: string; input: unknown }
+	| { type: 'server_tool_use'; id: string; name: string; input: unknown }
+	| { type: 'web_search_tool_result'; tool_use_id: string; content: WebSearchResult[] };
+
+interface CitationBlock {
+	type: 'web_search_result_location';
+	url?: string;
+	title?: string;
+	cited_text?: string;
+}
+
+interface WebSearchResult {
+	type: 'web_search_result';
+	url: string;
+	title: string;
+	page_age?: string;
+	encrypted_content?: string;
+}
+
+interface AnthropicResponse {
+	content?: ContentBlock[];
+	usage?: {
+		input_tokens?: number;
+		output_tokens?: number;
+	};
+}
+
+export function createClaudeWebSearchProvider(): ResearchAgent {
+	return {
+		id: 'claude-web-search',
+		requiresApiKey: true,
+		capabilities: {
+			multiStep: true,
+			async: false,
+			withCitations: true,
+		},
+		async research(query, options, ctx) {
+			if (!ctx.apiKey) throw new ProviderNotConfiguredError('claude-web-search');
+			const t0 = performance.now();
+
+			const model = options.model ?? DEFAULT_MODEL;
+
+			const res = await fetch('https://api.anthropic.com/v1/messages', {
+				method: 'POST',
+				headers: {
+					'x-api-key': ctx.apiKey,
+					'anthropic-version': '2023-06-01',
+					'Content-Type': 'application/json',
+				},
+				body: JSON.stringify({
+					model,
+					max_tokens: options.maxTokens ?? 2048,
+					temperature: options.temperature ?? 0.3,
+					system: options.systemPrompt,
+					messages: [{ role: 'user', content: query }],
+					tools: [
+						{
+							type: 'web_search_20250305',
+							name: 'web_search',
+							max_uses: DEFAULT_MAX_SEARCHES,
+						},
+					],
+				}),
+				signal: ctx.signal,
+			});
+
+			if (!res.ok) {
+				const body = await res.text().catch(() => '');
+				throw new ProviderError('claude-web-search', `HTTP ${res.status} ${body.slice(0, 200)}`);
+			}
+
+			const data = (await res.json()) as AnthropicResponse;
+
+			const textParts: string[] = [];
+			const citationsMap = new Map<string, Citation>();
+
+			for (const block of data.content ?? []) {
+				if (block.type === 'text') {
+					textParts.push(block.text);
+					for (const cit of block.citations ?? []) {
+						if (cit.url && !citationsMap.has(cit.url)) {
+							citationsMap.set(cit.url, {
+								url: cit.url,
+								title: cit.title ?? cit.url,
+								snippet: cit.cited_text,
+							});
+						}
+					}
+				}
+				if (block.type === 'web_search_tool_result') {
+					const results = Array.isArray(block.content) ? block.content : []; // a failed search returns an error object, not an array
+					for (const r of results) {
+						if (r.url && !citationsMap.has(r.url)) {
+							citationsMap.set(r.url, { url: r.url, title: r.title });
+						}
+					}
+				}
+			}
+
+			const tokenUsage = data.usage
+				? {
+						input: data.usage.input_tokens ?? 0,
+						output: data.usage.output_tokens ?? 0,
+					}
+				: undefined;
+
+			return {
+				answer: {
+					query,
+					answer: textParts.join('\n\n'),
+					citations: [...citationsMap.values()],
+					tokenUsage,
+					providerRaw: data,
+				},
+				rawLatencyMs: Math.round(performance.now() - t0),
+				tokenUsage,
+			};
+		},
+	};
+}
diff --git a/services/mana-research/src/providers/agent/gemini-grounding.ts b/services/mana-research/src/providers/agent/gemini-grounding.ts
new file mode 100644
index 000000000..e290fa5a0
--- /dev/null
+++ b/services/mana-research/src/providers/agent/gemini-grounding.ts
@@ -0,0 +1,108 @@
+/**
+ * Gemini with Google Search grounding.
+ * Docs: https://ai.google.dev/gemini-api/docs/grounding
+ *
+ * Pay-per-use (tokens + per-grounding-query fee). No subscription.
+ */
+
+import type { Citation, ResearchAgent } from '@mana/shared-research';
+import { ProviderError, ProviderNotConfiguredError } from '../../lib/errors';
+
+const DEFAULT_MODEL = 'gemini-2.0-flash';
+
+interface GeminiResponse {
+	candidates?: Array<{
+		content?: {
+			parts?: Array<{ text?: string }>;
+		};
+		groundingMetadata?: {
+			groundingChunks?: Array<{
+				web?: { uri: string; title?: string };
+			}>;
+			webSearchQueries?: string[];
+		};
+	}>;
+	usageMetadata?: {
+		promptTokenCount?: number;
+		candidatesTokenCount?: number;
+	};
+}
+
+export function createGeminiGroundingProvider(): ResearchAgent {
+	return {
+		id: 'gemini-grounding',
+		requiresApiKey: true,
+		capabilities: {
+			multiStep: false,
+			async: false,
+			withCitations: true,
+		},
+		async research(query, options, ctx) {
+			if (!ctx.apiKey) throw new ProviderNotConfiguredError('gemini-grounding');
+			const t0 = performance.now();
+
+			const model = options.model ?? DEFAULT_MODEL;
+			const url = `https://generativelanguage.googleapis.com/v1beta/models/${model}:generateContent?key=${encodeURIComponent(
+				ctx.apiKey
+			)}`;
+
+			const contents = [{ role: 'user', parts: [{ text: query }] }];
+
+			const res = await fetch(url, {
+				method: 'POST',
+				headers: { 'Content-Type': 'application/json' },
+				body: JSON.stringify({
+					contents,
+					tools: [{ googleSearch: {} }],
+					generationConfig: {
+						temperature: options.temperature ?? 0.3,
+						maxOutputTokens: options.maxTokens ?? 2048,
+					},
+					...(options.systemPrompt
+						? { systemInstruction: { parts: [{ text: options.systemPrompt }] } }
+						: {}),
+				}),
+				signal: ctx.signal,
+			});
+
+			if (!res.ok) {
+				const body = await res.text().catch(() => '');
+				throw new ProviderError('gemini-grounding', `HTTP ${res.status} ${body.slice(0, 200)}`);
+			}
+
+			const data = (await res.json()) as GeminiResponse;
+			const candidate = data.candidates?.[0];
+			const answer = (candidate?.content?.parts ?? []).map((p) => p.text ?? '').join('');
+
+			const citationsMap = new Map<string, Citation>();
+			for (const chunk of candidate?.groundingMetadata?.groundingChunks ?? []) {
+				if (chunk.web?.uri && !citationsMap.has(chunk.web.uri)) {
+					citationsMap.set(chunk.web.uri, {
+						url: chunk.web.uri,
+						title: chunk.web.title ?? chunk.web.uri,
+					});
+				}
+			}
+
+			const tokenUsage = data.usageMetadata
+				? {
+						input: data.usageMetadata.promptTokenCount ?? 0,
+						output: data.usageMetadata.candidatesTokenCount ?? 0,
+					}
+				: undefined;
+
+			return {
+				answer: {
+					query,
+					answer,
+					citations: [...citationsMap.values()],
+					followUpQueries: candidate?.groundingMetadata?.webSearchQueries,
+					tokenUsage,
+					providerRaw: data,
+				},
+				rawLatencyMs: Math.round(performance.now() - t0),
+				tokenUsage,
+			};
+		},
+	};
+}
diff --git a/services/mana-research/src/providers/agent/openai-responses.ts b/services/mana-research/src/providers/agent/openai-responses.ts
new file mode 100644
index 000000000..fadf72cf5
--- /dev/null
+++ b/services/mana-research/src/providers/agent/openai-responses.ts
@@ -0,0 +1,125 @@
+/**
+ * OpenAI Responses API with web_search_preview tool.
+ * Docs: https://platform.openai.com/docs/guides/tools-web-search
+ *
+ * Pay-per-use (tokens + per-search fees). No subscription.
+ */
+
+import type { Citation, ResearchAgent } from '@mana/shared-research';
+import { ProviderError, ProviderNotConfiguredError } from '../../lib/errors';
+
+const DEFAULT_MODEL = 'gpt-4o';
+
+interface ResponsesApiResponse {
+	output?: Array<OutputItem>;
+	output_text?: string;
+	usage?: {
+		input_tokens?: number;
+		output_tokens?: number;
+	};
+}
+
+type OutputItem =
+	| { type: 'message'; role: string; content: MessageContent[] }
+	| { type: 'web_search_call'; id: string; status: string };
+
+type MessageContent = {
+	type: 'output_text';
+	text: string;
+	annotations?: Array<{
+		type: string;
+		url?: string;
+		title?: string;
+		start_index?: number;
+		end_index?: number;
+	}>;
+};
+
+export function createOpenAIResponsesProvider(): ResearchAgent {
+	return {
+		id: 'openai-responses',
+		requiresApiKey: true,
+		capabilities: {
+			multiStep: true,
+			async: false,
+			withCitations: true,
+		},
+		async research(query, options, ctx) {
+			if (!ctx.apiKey) throw new ProviderNotConfiguredError('openai-responses');
+			const t0 = performance.now();
+
+			const model = options.model ?? DEFAULT_MODEL;
+
+			const body: Record<string, unknown> = {
+				model,
+				input: options.systemPrompt
+					? [
+							{ role: 'system', content: options.systemPrompt },
+							{ role: 'user', content: query },
+						]
+					: query,
+				tools: [{ type: 'web_search_preview' }],
+				max_output_tokens: options.maxTokens ?? 2048,
+			};
+
+			const res = await fetch('https://api.openai.com/v1/responses', {
+				method: 'POST',
+				headers: {
+					Authorization: `Bearer ${ctx.apiKey}`,
+					'Content-Type': 'application/json',
+				},
+				body: JSON.stringify(body),
+				signal: ctx.signal,
+			});
+
+			if (!res.ok) {
+				const errBody = await res.text().catch(() => '');
+				throw new ProviderError('openai-responses', `HTTP ${res.status} ${errBody.slice(0, 200)}`);
+			}
+
+			const data = (await res.json()) as ResponsesApiResponse;
+
+			const textParts: string[] = [];
+			const citationsMap = new Map<string, Citation>();
+
+			if (data.output_text) textParts.push(data.output_text);
+
+			for (const item of data.output ?? []) {
+				if (item.type !== 'message') continue;
+				const msgItem = item as { content: MessageContent[] };
+				for (const content of msgItem.content ?? []) {
+					if (content.type === 'output_text') {
+						if (!data.output_text) textParts.push(content.text);
+						for (const ann of content.annotations ?? []) {
+							if (ann.url && !citationsMap.has(ann.url)) {
+								citationsMap.set(ann.url, {
+									url: ann.url,
+									title: ann.title ?? ann.url,
+								});
+							}
+						}
+					}
+				}
+			}
+
+			const tokenUsage = data.usage
+				? {
+						input: data.usage.input_tokens ?? 0,
+						output: data.usage.output_tokens ?? 0,
+					}
+				: undefined;
+
+			return {
+				answer: {
+					query,
+					answer: textParts.join('\n\n'),
+					citations: [...citationsMap.values()],
+					tokenUsage,
+					providerRaw: data,
+				},
+				rawLatencyMs: Math.round(performance.now() - t0),
+				tokenUsage,
+			};
+		},
+	};
+}
diff --git a/services/mana-research/src/providers/agent/perplexity-sonar.ts b/services/mana-research/src/providers/agent/perplexity-sonar.ts
new file mode 100644
index 000000000..b123123de
--- /dev/null
+++ b/services/mana-research/src/providers/agent/perplexity-sonar.ts
@@ -0,0 +1,101 @@
+/**
+ * Perplexity Sonar — chat-completions API with built-in web search.
+ * Docs: https://docs.perplexity.ai/api-reference/chat-completions
+ *
+ * Models (pay-per-use):
+ *   sonar — cheap, fast
+ *   sonar-pro — balanced
+ *   sonar-reasoning — chain-of-thought, deeper answers
+ *   sonar-deep-research — longest/most comprehensive
+ */
+
+import type { ResearchAgent } from '@mana/shared-research';
+import { ProviderError, ProviderNotConfiguredError } from '../../lib/errors';
+
+const DEFAULT_MODEL = 'sonar';
+const ALLOWED_MODELS = new Set(['sonar', 'sonar-pro', 'sonar-reasoning', 'sonar-deep-research']);
+
+interface SonarResponse {
+	choices?: Array<{
+		message?: {
+			content?: string;
+		};
+	}>;
+	citations?: string[];
+	search_results?: Array<{ url: string; title?: string; snippet?: string }>;
+	usage?: {
+		prompt_tokens?: number;
+		completion_tokens?: number;
+	};
+}
+
+export function createPerplexitySonarProvider(): ResearchAgent {
+	return {
+		id: 'perplexity-sonar',
+		requiresApiKey: true,
+		capabilities: {
+			multiStep: true,
+			async: false,
+			withCitations: true,
+		},
+		async research(query, options, ctx) {
+			if (!ctx.apiKey) throw new ProviderNotConfiguredError('perplexity-sonar');
+			const t0 = performance.now();
+
+			const model =
+				options.model && ALLOWED_MODELS.has(options.model) ? options.model : DEFAULT_MODEL;
+
+			const res = await fetch('https://api.perplexity.ai/chat/completions', {
+				method: 'POST',
+				headers: {
+					Authorization: `Bearer ${ctx.apiKey}`,
+					'Content-Type': 'application/json',
+				},
+				body: JSON.stringify({
+					model,
+					messages: [
+						...(options.systemPrompt ? [{ role: 'system', content: options.systemPrompt }] : []),
+						{ role: 'user', content: query },
+					],
+					max_tokens: options.maxTokens ?? 1024,
+					temperature: options.temperature ?? 0.2,
+				}),
+				signal: ctx.signal,
+			});
+
+			if (!res.ok) {
+				const body = await res.text().catch(() => '');
+				throw new ProviderError('perplexity-sonar', `HTTP ${res.status} ${body.slice(0, 200)}`);
+			}
+
+			const data = (await res.json()) as SonarResponse;
+			const answer = data.choices?.[0]?.message?.content ?? '';
+
+			const citations =
+				data.search_results?.map((r) => ({
+					url: r.url,
+					title: r.title ?? r.url,
+					snippet: r.snippet,
+				})) ?? (data.citations ?? []).map((url) => ({ url, title: url }));
+
+			const tokenUsage = data.usage
+				? {
+						input: data.usage.prompt_tokens ?? 0,
+						output: data.usage.completion_tokens ?? 0,
+					}
+				: undefined;
+
+			return {
+				answer: {
+					query,
+					answer,
+					citations,
+					tokenUsage,
+					providerRaw: data,
+				},
+				rawLatencyMs: Math.round(performance.now() - t0),
+				tokenUsage,
+			};
+		},
+	};
+}
diff --git a/services/mana-research/src/providers/registry.ts b/services/mana-research/src/providers/registry.ts
index 8cbc79422..e06f99e41 100644
--- a/services/mana-research/src/providers/registry.ts
+++ b/services/mana-research/src/providers/registry.ts
@@ -7,6 +7,7 @@ import type {
 	ExtractProvider,
 	ExtractProviderId,
 	ProviderId,
+	ResearchAgent,
 	SearchProvider,
 	SearchProviderId,
 } from '@mana/shared-research';
@@ -21,10 +22,15 @@ import { createTavilyProvider } from './search/tavily';
 import { createFirecrawlProvider } from './extract/firecrawl';
 import { createJinaReaderProvider } from './extract/jina-reader';
 import { createReadabilityProvider } from './extract/readability';
+import { createPerplexitySonarProvider } from './agent/perplexity-sonar';
+import { createClaudeWebSearchProvider } from './agent/claude-web-search';
+import { createOpenAIResponsesProvider } from './agent/openai-responses';
+import { createGeminiGroundingProvider } from './agent/gemini-grounding';
 
 export interface ProviderRegistry {
 	search: Map<SearchProviderId, SearchProvider>;
 	extract: Map<ExtractProviderId, ExtractProvider>;
+	agent: Map<AgentProviderId, ResearchAgent>;
 }
 
 export function buildRegistry(deps: { manaSearch: ManaSearchClient }): ProviderRegistry {
@@ -41,7 +47,13 @@ export function buildRegistry(deps: { manaSearch: ManaSearchClient }): ProviderR
 	extract.set('jina-reader', createJinaReaderProvider());
 	extract.set('firecrawl', createFirecrawlProvider());
 
-	return { search, extract };
+	const agent = new Map<AgentProviderId, ResearchAgent>();
+	agent.set('perplexity-sonar', createPerplexitySonarProvider());
+	agent.set('claude-web-search', createClaudeWebSearchProvider());
+	agent.set('openai-responses', createOpenAIResponsesProvider());
+	agent.set('gemini-grounding', createGeminiGroundingProvider());
+
+	return { search, extract, agent };
 }
 
 export function getSearchProvider(registry: ProviderRegistry, id: string): SearchProvider {
@@ -64,6 +76,16 @@ export function getExtractProvider(registry: ProviderRegistry, id: string): Extr
 	return provider;
 }
 
+export function getAgent(registry: ProviderRegistry, id: string): ResearchAgent {
+	const provider = registry.agent.get(id as AgentProviderId);
+	if (!provider) {
+		throw new BadRequestError(`Unknown research agent: ${id}`, {
+			available: [...registry.agent.keys()],
+		});
+	}
+	return provider;
+}
+
 export function listProviders(registry: ProviderRegistry) {
 	return {
 		search: [...registry.search.values()].map((p) => ({
@@ -78,7 +100,12 @@ export function listProviders(registry: ProviderRegistry) {
 			requiresApiKey: p.requiresApiKey,
 			capabilities: p.capabilities,
 		})),
-		agent: [] as Array<{ id: AgentProviderId; category: 'agent'; requiresApiKey: boolean }>,
+		agent: [...registry.agent.values()].map((p) => ({
+			id: p.id,
+			category: 'agent' as const,
+			requiresApiKey: p.requiresApiKey,
+			capabilities: p.capabilities,
+		})),
 	};
 }
 
diff --git a/services/mana-research/src/router/auto-route.ts b/services/mana-research/src/router/auto-route.ts
index 8d52c3974..b1f1ea388 100644
--- a/services/mana-research/src/router/auto-route.ts
+++ b/services/mana-research/src/router/auto-route.ts
@@ -3,7 +3,7 @@
  * list. The first provider in the returned list that has a valid API key wins.
  */
 
-import type { SearchProviderId, ExtractProviderId } from '@mana/shared-research';
+import type { AgentProviderId, ExtractProviderId, SearchProviderId } from '@mana/shared-research';
 import type { Config } from '../config';
 import type { QueryType } from './classify';
 
@@ -68,3 +68,29 @@ export function pickExtractProvider(
 	}
 	return null;
 }
+
+/**
+ * Preference order for agents when caller doesn't specify one. Cheaper +
+ * fastest first, then better quality if keys are available.
+ */
+export const AGENT_DEFAULT_ORDER: AgentProviderId[] = [
+	'perplexity-sonar', // best plug-and-play, moderate cost
+	'gemini-grounding', // cheap with Google Search
+	'openai-responses', // Responses API + web_search_preview
+	'claude-web-search', // high quality, higher cost
+	'openai-deep-research', // last: async, very expensive
+];
+
+export function pickAgent(config: Config): AgentProviderId | null {
+	const envMap: Record<AgentProviderId, keyof Config['providerKeys']> = {
+		'perplexity-sonar': 'perplexity',
+		'claude-web-search': 'anthropic',
+		'openai-responses': 'openai',
+		'gemini-grounding': 'googleGenai',
+		'openai-deep-research': 'openai',
+	};
+	for (const id of AGENT_DEFAULT_ORDER) {
+		if (config.providerKeys[envMap[id]]) return id;
+	}
+	return null;
+}
diff --git a/services/mana-research/src/routes/providers.ts b/services/mana-research/src/routes/providers.ts
index dcdd85366..47f6bcb81 100644
--- a/services/mana-research/src/routes/providers.ts
+++ b/services/mana-research/src/routes/providers.ts
@@ -16,7 +16,7 @@ export function createProvidersRoutes(registry: ProviderRegistry, config: Config
 			return c.json({
 				search: list.search.map((p) => ({ ...p, pricing: PROVIDER_PRICING[p.id] })),
 				extract: list.extract.map((p) => ({ ...p, pricing: PROVIDER_PRICING[p.id] })),
-				agent: list.agent,
+				agent: list.agent.map((p) => ({ ...p, pricing: PROVIDER_PRICING[p.id] })),
 			});
 		})
 		.get('/health', (c) => {
@@ -46,6 +46,11 @@ export function createProvidersRoutes(registry: ProviderRegistry, config: Config
 				'jina-reader': !!keys.jina,
 				firecrawl: !!keys.firecrawl,
 				scrapingbee: !!keys.scrapingbee,
+				'perplexity-sonar': !!keys.perplexity,
+				'claude-web-search': !!keys.anthropic,
+				'openai-responses': !!keys.openai,
+				'openai-deep-research': !!keys.openai,
+				'gemini-grounding': !!keys.googleGenai,
 			};
 
 			const list = listProviders(registry);
@@ -64,6 +69,13 @@ export function createProvidersRoutes(registry: ProviderRegistry, config: Config
 					serverKeyAvailable: !!keyMap[p.id],
 					status: check(p.id, p.requiresApiKey, !!keyMap[p.id]),
 				})),
+				...list.agent.map((p) => ({
+					id: p.id,
+					category: 'agent' as const,
+					requiresApiKey: p.requiresApiKey,
+					serverKeyAvailable: !!keyMap[p.id],
+					status: check(p.id, p.requiresApiKey, !!keyMap[p.id]),
+				})),
 			];
 
 			return c.json({
diff --git a/services/mana-research/src/routes/research.ts b/services/mana-research/src/routes/research.ts
new file mode 100644
index 000000000..69c47fb44
--- /dev/null
+++ b/services/mana-research/src/routes/research.ts
@@ -0,0 +1,161 @@
+/**
+ * POST /v1/research — single-agent research (or auto-routed)
+ * POST /v1/research/compare — fan-out to multiple agents
+ */
+
+import { Hono } from 'hono';
+import { z } from 'zod';
+import type { AgentAnswer } from '@mana/shared-research';
+import { AGENT_PROVIDER_IDS, agentOptionsSchema } from '@mana/shared-research';
+import type { ExecutorDeps } from '../executor/execute-research';
+import { executeResearch } from '../executor/execute-research';
+import type { HonoEnv } from '../lib/hono-env';
+import type { ProviderRegistry } from '../providers/registry';
+import { getAgent } from '../providers/registry';
+import type { RunStorage } from '../storage/runs';
+import { BadRequestError } from '../lib/errors';
+import type { Config } from '../config';
+import { pickAgent } from '../router/auto-route';
+
+const MAX_COMPARE_AGENTS = 4;
+
+const researchBodySchema = z.object({
+	query: z.string().min(1).max(2000),
+	provider: z.enum(AGENT_PROVIDER_IDS).optional(),
+	options: agentOptionsSchema.optional(),
+});
+
+const compareBodySchema = z.object({
+	query: z.string().min(1).max(2000),
+	providers: z.array(z.enum(AGENT_PROVIDER_IDS)).min(1).max(MAX_COMPARE_AGENTS),
+	options: agentOptionsSchema.optional(),
+});
+
+export function createResearchRoutes(
+	registry: ProviderRegistry,
+	storage: RunStorage,
+	deps: ExecutorDeps,
+	config: Config
+) {
+	return new Hono<HonoEnv>()
+		.post('/', async (c) => {
+			const user = c.get('user');
+			const body = researchBodySchema.parse(await c.req.json());
+
+			const providerId = body.provider ?? pickAgent(config);
+			if (!providerId) {
+				throw new BadRequestError(
+					'No research agent configured. Set at least one of PERPLEXITY_API_KEY, ANTHROPIC_API_KEY, OPENAI_API_KEY, GOOGLE_GENAI_API_KEY.'
+				);
+			}
+
+			const provider = getAgent(registry, providerId);
+
+			const run = await storage.createRun({
+				userId: user.userId,
+				query: body.query,
+				mode: body.provider ? 'single' : 'auto',
+				category: 'agent',
+				providersRequested: [providerId],
+				billingMode: provider.requiresApiKey ? 'server-key' : 'free',
+			});
+
+			const out = await executeResearch(
+				{
+					provider,
+					query: body.query,
+					options: body.options ?? {},
+					userId: user.userId,
+				},
+				deps
+			);
+
+			await storage.addResult({
+				runId: run.id,
+				providerId,
+				success: out.success,
+				latencyMs: out.meta.latencyMs,
+				costCredits: out.meta.costCredits,
+				billingMode: out.meta.billingMode,
+				cacheHit: out.meta.cacheHit,
+				normalizedResult: out.data ?? null,
+				errorCode: out.meta.errorCode ?? null,
+			});
+
+			if (out.meta.costCredits > 0) {
+				await storage.finalizeRunCost(run.id, out.meta.costCredits);
+			}
+
+			return c.json({
+				runId: run.id,
+				query: body.query,
+				provider: providerId,
+				success: out.success,
+				data: out.data,
+				meta: out.meta,
+			});
+		})
+		.post('/compare', async (c) => {
+			const user = c.get('user');
+			const body = compareBodySchema.parse(await c.req.json());
+
+			if (new Set(body.providers).size !== body.providers.length) {
+				throw new BadRequestError('Duplicate providers in request');
+			}
+
+			const providers = body.providers.map((id) => getAgent(registry, id));
+			const anyPaid = providers.some((p) => p.requiresApiKey);
+
+			const run = await storage.createRun({
+				userId: user.userId,
+				query: body.query,
+				mode: 'compare',
+				category: 'agent',
+				providersRequested: body.providers,
+				billingMode: anyPaid ? 'mixed' : 'free',
+			});
+
+			const settled = await Promise.all(
+				providers.map((provider) =>
+					executeResearch(
+						{
+							provider,
+							query: body.query,
+							options: body.options ?? {},
+							userId: user.userId,
+						},
+						deps
+					)
+				)
+			);
+
+			let totalCost = 0;
+			for (let i = 0; i < providers.length; i++) {
+				const out = settled[i];
+				totalCost += out.meta.costCredits;
+				await storage.addResult({
+					runId: run.id,
+					providerId: providers[i].id,
+					success: out.success,
+					latencyMs: out.meta.latencyMs,
+					costCredits: out.meta.costCredits,
+					billingMode: out.meta.billingMode,
+					cacheHit: out.meta.cacheHit,
+					normalizedResult: out.data ?? null,
+					errorCode: out.meta.errorCode ?? null,
+				});
+			}
+			if (totalCost > 0) await storage.finalizeRunCost(run.id, totalCost);
+
+			return c.json({
+				runId: run.id,
+				query: body.query,
+				results: providers.map((provider, i) => ({
+					provider: provider.id,
+					success: settled[i].success,
+					data: settled[i].data as { answer: AgentAnswer } | undefined,
+					meta: settled[i].meta,
+				})),
+			});
+		});
+}
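
Review note: the agent auto-router in this patch walks a fixed preference list and returns the first agent whose server key is set. The sketch below restates that idea as a self-contained function so the ordering behaviour can be checked in isolation. It is an illustration, not the patched code: `AgentId`, `KeyName`, `ProviderKeys`, `ORDER`, `KEY_FOR` and `pickFirstConfigured` are hypothetical names, and the key names are assumed to match the `envMap` entries in `auto-route.ts`.

```typescript
// Illustrative sketch only; every name here is hypothetical and not part of the patch.
type AgentId =
	| 'perplexity-sonar'
	| 'gemini-grounding'
	| 'openai-responses'
	| 'claude-web-search'
	| 'openai-deep-research';

type KeyName = 'perplexity' | 'googleGenai' | 'openai' | 'anthropic';
type ProviderKeys = Partial<Record<KeyName, string>>;

// Same preference order as AGENT_DEFAULT_ORDER in the patch.
const ORDER: AgentId[] = [
	'perplexity-sonar',
	'gemini-grounding',
	'openai-responses',
	'claude-web-search',
	'openai-deep-research',
];

// Which server key each agent needs (assumed to mirror envMap in pickAgent).
const KEY_FOR: Record<AgentId, KeyName> = {
	'perplexity-sonar': 'perplexity',
	'gemini-grounding': 'googleGenai',
	'openai-responses': 'openai',
	'claude-web-search': 'anthropic',
	'openai-deep-research': 'openai',
};

// Return the first agent in ORDER whose key is present and non-empty, else null.
function pickFirstConfigured(keys: ProviderKeys): AgentId | null {
	for (const id of ORDER) {
		if (keys[KEY_FOR[id]]) return id;
	}
	return null;
}
```

One consequence of this ordering: `openai-responses` and `openai-deep-research` share the same key and the former comes first, so the async agent is never auto-selected and has to be requested explicitly via `provider`.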