Mirror of https://github.com/Memo-2023/mana-monorepo.git
(synced 2026-05-18 08:49:39 +02:00)
feat(questions): deep-research module — mana-search + mana-llm pipeline
End-to-end deep-research feature for the questions module: a fire-and-
forget orchestrator in apps/api that plans sub-queries with mana-llm,
retrieves sources via mana-search (with optional Readability extraction),
and streams a structured synthesis back to the web app over SSE.
Backend (apps/api/src/modules/research):
- schema.ts: pgSchema('research') with research_results + sources
- orchestrator.ts: three-phase pipeline (plan / retrieve / synthesise)
  with depth-aware config (quick=1×, standard=3×, deep=6× sub-queries)
- pubsub.ts: in-process event bus, single-node, swappable for Redis
- routes.ts: POST /start (202, fire-and-forget), GET /:id/stream (SSE),
  POST /start-sync (test only), GET /:id, GET /:id/sources
- Credit gating via @mana/shared-hono/credits — validate up-front,
  consume best-effort on `done`. Failed runs cost nothing.
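The in-process event bus in pubsub.ts can be pictured roughly as follows. This is a minimal sketch, not the code from this commit: the `InProcessBus` class, the `ResearchEvent` shape and the keying by run id are all assumptions made for illustration. It shows why a single-node bus stays swappable for Redis: callers only ever see `subscribe` and `publish`.

```typescript
// Hypothetical event shape; the real pubsub.ts may carry different fields.
type ResearchEvent =
	| { type: 'phase'; phase: string }
	| { type: 'token'; delta: string }
	| { type: 'done' }
	| { type: 'error'; message: string };

type Listener = (event: ResearchEvent) => void;

// Single-node bus: listeners live in this process only (assumed design).
class InProcessBus {
	private listeners = new Map<string, Set<Listener>>();

	// Register a listener for one research run; returns an unsubscribe function.
	subscribe(runId: string, listener: Listener): () => void {
		let set = this.listeners.get(runId);
		if (!set) {
			set = new Set<Listener>();
			this.listeners.set(runId, set);
		}
		set.add(listener);
		return () => {
			const current = this.listeners.get(runId);
			if (!current) return;
			current.delete(listener);
			// Drop the empty set so finished runs do not accumulate.
			if (current.size === 0) this.listeners.delete(runId);
		};
	}

	// Deliver an event to every listener of that run; no-op without listeners.
	publish(runId: string, event: ResearchEvent): void {
		const set = this.listeners.get(runId);
		if (!set) return;
		// Copy first so a listener may unsubscribe during delivery.
		Array.from(set).forEach((listener) => listener(event));
	}
}
```

Under these assumptions the SSE route would subscribe when a client connects and unsubscribe when the connection closes, while the orchestrator publishes without knowing who is listening.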
Helpers (apps/api/src/lib):
- llm.ts: llmJson() + llmStream() over mana-llm OpenAI-compat API
- search.ts: webSearch() + bulkExtract() over mana-search Go service
- responses.ts: shared errorResponse / listResponse / validationError
Schema deployment:
- drizzle.config.ts (research-scoped) + drizzle/research/0000_init.sql
  hand-authored migration, deployable via psql -f or drizzle-kit push.
- drizzle-kit added as devDep with db:generate / db:push scripts.
Web client (apps/mana/apps/web/src/lib/api/research.ts):
- Typed start() / get() / listSources() / streamProgress(). The stream
  uses fetch + ReadableStream (not EventSource) so we can attach the
  JWT via Authorization header. Special-cases 402 for friendly toast.
- New PUBLIC_MANA_API_URL plumbing in hooks.server.ts + config.ts.
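Reading SSE through fetch instead of EventSource means the client has to split the byte stream into events itself. The sketch below illustrates that step under stated assumptions: `SseParser` and `parseFrame` are hypothetical names, the `token` and `done` event names in the usage comment are guesses, and only LF and CRLF line endings are handled. It is not the code in research.ts.

```typescript
interface SseEvent {
	event: string; // "message" when the frame has no event field
	data: string; // data lines joined with "\n"
}

// Parse one frame (the text between two blank lines) into an event.
function parseFrame(frame: string): SseEvent | null {
	let event = 'message';
	const data: string[] = [];
	for (const line of frame.split('\n')) {
		// Lines starting with ":" are comments, often used as keep-alives.
		if (line === '' || line.startsWith(':')) continue;
		const colon = line.indexOf(':');
		const field = colon === -1 ? line : line.slice(0, colon);
		let value = colon === -1 ? '' : line.slice(colon + 1);
		if (value.startsWith(' ')) value = value.slice(1); // strip one leading space
		if (field === 'event') event = value;
		else if (field === 'data') data.push(value);
	}
	return data.length > 0 ? { event, data: data.join('\n') } : null;
}

// Stateful parser: chunks may split an event anywhere, so keep a buffer.
class SseParser {
	private buffer = '';

	feed(chunk: string): SseEvent[] {
		this.buffer = (this.buffer + chunk).replace(/\r\n/g, '\n');
		const events: SseEvent[] = [];
		let boundary = this.buffer.indexOf('\n\n');
		while (boundary !== -1) {
			const parsed = parseFrame(this.buffer.slice(0, boundary));
			this.buffer = this.buffer.slice(boundary + 2);
			if (parsed) events.push(parsed);
			boundary = this.buffer.indexOf('\n\n');
		}
		return events;
	}
}

// Usage sketch (not executed here; url and jwt are placeholders):
//   const res = await fetch(url, { headers: { Authorization: `Bearer ${jwt}` } });
//   const reader = res.body!.getReader();
//   const decoder = new TextDecoder();
//   const parser = new SseParser();
//   for (;;) {
//     const { done, value } = await reader.read();
//     if (done) break;
//     for (const ev of parser.feed(decoder.decode(value, { stream: true }))) {
//       /* dispatch on ev.event, e.g. 'token' or 'done' */
//     }
//   }
```

EventSource cannot set request headers, which is why the header-based JWT forces this manual parsing path.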
Module store (modules/questions/stores/answers.svelte.ts):
- New write-side store with createManual / startResearch / accept /
  softDelete. startResearch creates an optimistic empty answer, opens
  the SSE stream, debounces token deltas in 100ms batches into the
  encrypted local row, and on `done` replaces the streamed text with
  the parsed { summary, keyPoints, followUps } payload + citations
  resolved against research.sources.id.
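The 100ms batching of token deltas can be sketched as below. This is an illustration, not the store's actual code: `DeltaBatcher` is a hypothetical name, and the strategy shown (one write per window, opened by the first delta) is an assumption about what "debounces ... in 100ms batches" means. The scheduler is injected so the logic can be exercised without real timers.

```typescript
// Anything that runs `fn` after roughly `ms` milliseconds, e.g. setTimeout.
type Schedule = (fn: () => void, ms: number) => unknown;

class DeltaBatcher {
	private pending = '';
	private scheduled = false;
	private readonly write: (text: string) => void;
	private readonly delayMs: number;
	private readonly schedule: Schedule;

	constructor(write: (text: string) => void, delayMs: number, schedule: Schedule) {
		this.write = write;
		this.delayMs = delayMs;
		this.schedule = schedule;
	}

	// Buffer a delta; the first delta of a window arms a single timer.
	push(delta: string): void {
		this.pending += delta;
		if (this.scheduled) return;
		this.scheduled = true;
		this.schedule(() => this.flush(), this.delayMs);
	}

	// Write everything buffered so far. Safe to call directly, for example
	// right before the final `done` payload replaces the streamed text.
	flush(): void {
		this.scheduled = false;
		if (this.pending === '') return;
		const batch = this.pending;
		this.pending = '';
		this.write(batch);
	}
}

// Usage sketch (appendToLocalRow is a placeholder for the encrypted write):
//   const batcher = new DeltaBatcher(appendToLocalRow, 100, (fn, ms) => setTimeout(fn, ms));
//   onToken((delta) => batcher.push(delta));
//   onDone(() => batcher.flush());
```

Batching matters here because each write to the local row involves encryption, so writing once per window is far cheaper than writing once per token.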
Citation rendering (modules/questions/components/AnswerCitations.svelte):
- Tokenises [n] markers in the answer body into clickable pills with
  hover popovers showing title / host / snippet / external link.
- Lazy-loaded via a session-scoped source cache (stores/sources.svelte.ts)
  that deduplicates concurrent fetches.
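Tokenising [n] markers amounts to splitting the answer body into alternating text and citation segments. The function below is a minimal sketch of that idea: `tokeniseCitations` and the `Segment` type are hypothetical and are not taken from AnswerCitations.svelte.

```typescript
type Segment =
	| { kind: 'text'; text: string }
	| { kind: 'citation'; index: number };

// Split an answer body into plain-text runs and [n] citation markers.
function tokeniseCitations(body: string): Segment[] {
	const segments: Segment[] = [];
	const marker = /\[(\d+)\]/g;
	let last = 0;
	let match: RegExpExecArray | null;
	while ((match = marker.exec(body)) !== null) {
		// Emit the text between the previous marker and this one, if any.
		if (match.index > last) {
			segments.push({ kind: 'text', text: body.slice(last, match.index) });
		}
		segments.push({ kind: 'citation', index: Number(match[1]) });
		last = match.index + match[0].length;
	}
	// Emit whatever text follows the last marker.
	if (last < body.length) {
		segments.push({ kind: 'text', text: body.slice(last) });
	}
	return segments;
}
```

A component could then render each `text` segment as-is and each `citation` segment as a pill, looking up the source for `index` only when the popover opens.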
UI (routes/(app)/questions/[id]/+page.svelte):
- Recherche card with three-state button (start / cancel / re-run),
  animated phase indicator, source counter.
- Confirmation dialog warning about web/LLM transmission since the
  question itself is locally encrypted.
- Toasts for success / error / cancel via @mana/shared-ui/toast.
- Re-run flow soft-deletes prior research-driven answers but keeps
  manual ones intact.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
parent 30787e36d2
commit e82851985b
18 changed files with 2221 additions and 4 deletions
73
apps/api/src/modules/research/schema.ts
Normal file
@@ -0,0 +1,73 @@
/**
 * Research module — DB schema (Drizzle / pgSchema 'research')
 *
 * Server-side store for deep-research runs orchestrated by apps/api.
 * Lives in mana_platform under its own pgSchema.
 *
 * - research_results: one row per research run, holds plan + final synthesis
 * - sources: one row per web source consumed by a run
 *
 * The local-first questions module references research_results.id from
 * LocalAnswer.researchResultId; sources are fetched on-demand via the API
 * and never mirrored into IndexedDB (they're public web content).
 */

import { drizzle } from 'drizzle-orm/postgres-js';
import postgres from 'postgres';
import { pgSchema, uuid, text, timestamp, integer, jsonb } from 'drizzle-orm/pg-core';

const DATABASE_URL =
	process.env.DATABASE_URL ?? 'postgresql://mana:devpassword@localhost:5432/mana_platform';

export const researchSchema = pgSchema('research');

/**
 * One row per research run. Created in `planning` state immediately on
 * /start, then updated as the orchestrator advances through phases.
 */
export const researchResults = researchSchema.table('research_results', {
	id: uuid('id').defaultRandom().primaryKey(),
	userId: text('user_id').notNull(),
	questionId: text('question_id').notNull(), // mirrors local LocalQuestion.id (UUID)
	depth: text('depth').notNull(), // 'quick' | 'standard' | 'deep'
	status: text('status').notNull(), // 'planning' | 'searching' | 'extracting' | 'synthesizing' | 'done' | 'error'
	subQueries: jsonb('sub_queries').$type<string[]>(),
	summary: text('summary'),
	keyPoints: jsonb('key_points').$type<string[]>(),
	followUpQuestions: jsonb('follow_up_questions').$type<string[]>(),
	errorMessage: text('error_message'),
	startedAt: timestamp('started_at', { withTimezone: true }).defaultNow().notNull(),
	finishedAt: timestamp('finished_at', { withTimezone: true }),
});

/**
 * Sources consumed during a research run. Rank reflects ordering in the
 * synthesis prompt so citation [n] in the summary maps to sources[n-1].
 */
export const sources = researchSchema.table('sources', {
	id: uuid('id').defaultRandom().primaryKey(),
	researchResultId: uuid('research_result_id')
		.notNull()
		.references(() => researchResults.id, { onDelete: 'cascade' }),
	url: text('url').notNull(),
	title: text('title'),
	snippet: text('snippet'),
	extractedContent: text('extracted_content'),
	category: text('category'),
	rank: integer('rank').notNull(),
	createdAt: timestamp('created_at', { withTimezone: true }).defaultNow().notNull(),
});

const connection = postgres(DATABASE_URL, { max: 5, idle_timeout: 20 });
export const db = drizzle(connection, { schema: { researchResults, sources } });

export type ResearchResult = typeof researchResults.$inferSelect;
export type Source = typeof sources.$inferSelect;
export type ResearchDepth = 'quick' | 'standard' | 'deep';
export type ResearchStatus =
	| 'planning'
	| 'searching'
	| 'extracting'
	| 'synthesizing'
	| 'done'
	| 'error';
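The comment on the sources table says that citation [n] maps to sources[n-1] once rows are ordered by rank. A consumer of this schema could apply that rule roughly as below. This is a sketch under assumptions: `SourceRow` is a trimmed stand-in for the inferred `Source` type and `resolveCitation` is a hypothetical helper, neither of which appears in the commit. Sorting by rank before indexing means the helper works whether ranks start at 0 or at 1.

```typescript
// Trimmed stand-in for the inferred `Source` row type (assumption).
interface SourceRow {
	id: string;
	url: string;
	title: string | null;
	rank: number;
}

// Resolve citation marker [n] to a source: order by rank, then take n-1.
function resolveCitation(rows: SourceRow[], n: number): SourceRow | undefined {
	if (!Number.isInteger(n) || n < 1) return undefined;
	// Copy before sorting so the caller's array is left untouched.
	const ordered = [...rows].sort((a, b) => a.rank - b.rank);
	return ordered[n - 1];
}
```

Markers that point past the end of the list resolve to `undefined`, which lets a renderer fall back to plain text instead of a broken pill.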