mirror of
https://github.com/Memo-2023/mana-monorepo.git
synced 2026-05-14 19:41:09 +02:00
fix(ai): P0 — tool exception handling + mission run mutex
Two critical fixes from the AI Workbench audit:

1. Tool exceptions in the reasoning loop: stage(ps, aiActor) is now
   wrapped in try-catch. If a tool throws (Dexie error, vault locked,
   network timeout), the step is recorded as failed with the error
   message in the summary, and the loop continues with the next step.
   Previously, one broken tool crashed the entire iteration.

2. Concurrent mission scope interleaving: runMission() now serializes
   through a promise-based mutex. Two concurrent calls (double-click,
   cadence overlap) queue instead of interleaving, which prevents the
   ambient withAgentScope() from stomping a running mission's scope
   with a different agent's tags.

scope-context.ts also gains
filterByScopeExplicit(records, scopeTagIds, getTagIds), the explicit,
race-safe variant that doesn't read ambient state. Callers that already
have the scope should prefer it.

Also adds docs/optimizable/ai-workbench-audit-2026-04-16.md with the
full audit (P0–P2, 12 items).

Runner tests: 8/8.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
parent 484761e475
commit 93358ed002
3 changed files with 142 additions and 22 deletions
apps/mana/apps/web/src/lib/data/ai/missions/runner.ts
@@ -127,10 +127,30 @@ export interface RunMissionResult {
   readonly failedSteps: number;
 }

+/** Mutex so concurrent runMission calls don't interleave the ambient
+ * scope context. Queued runs wait until the previous one finishes. */
+let runMutex: Promise<void> = Promise.resolve();
+
 /** Run one iteration of the given mission. */
 export async function runMission(
   missionId: string,
   deps: MissionRunnerDeps
 ): Promise<RunMissionResult> {
+  // Serialize mission runs so withAgentScope doesn't interleave.
+  let release: () => void;
+  const prev = runMutex;
+  runMutex = new Promise((r) => (release = r));
+  await prev;
+  try {
+    return await runMissionInner(missionId, deps);
+  } finally {
+    release!();
+  }
+}
+
+async function runMissionInner(
+  missionId: string,
+  deps: MissionRunnerDeps
+): Promise<RunMissionResult> {
   const mission = await getMission(missionId);
   if (!mission) throw new Error(`Mission not found: ${missionId}`);

@@ -371,8 +391,26 @@ export async function runMission(
       continue;
     }

-    const outcome = await stage(ps, aiActor);
     const stepId = `${iterationId}-${stepCounter++}`;
+    let outcome: StageOutcome;
+    try {
+      outcome = await stage(ps, aiActor);
+    } catch (err) {
+      // Tool threw an unhandled exception (Dexie error, vault locked,
+      // network timeout, etc.). Record the step as failed and continue
+      // with the next step so one broken tool doesn't abort the entire
+      // iteration. The error message surfaces in the iteration plan.
+      const errMsg = err instanceof Error ? err.message : String(err);
+      console.error(`[MissionRunner] step ${ps.toolName} threw:`, err);
+      failedCount++;
+      recordedSteps.push({
+        id: stepId,
+        summary: `${ps.summary} (FEHLER: ${errMsg.slice(0, 100)})`,
+        intent: { kind: 'toolCall', toolName: ps.toolName, params: ps.params },
+        status: 'failed',
+      });
+      continue;
+    }
     if (!outcome.ok) {
       failedCount++;
       recordedSteps.push({

apps/mana/apps/web/src/lib/data/ai/scope-context.ts
@@ -1,21 +1,24 @@
 /**
- * Ambient scope context for AI tool execution.
+ * Scope filtering for AI tool execution.
  *
- * When a mission runs under an agent with scopeTagIds, the runner calls
- * `withAgentScope(tagIds, fn)` around the reasoning loop. Auto-tools
- * like `list_notes` check `getAgentScopeTagIds()` and filter their
- * results to records tagged with at least one of those IDs (plus
- * untagged records, which are globally visible).
+ * Two modes:
+ * 1. **Explicit** (preferred): pass `scopeTagIds` directly to
+ *    `filterByScopeExplicit()`. Race-safe because no shared state.
+ * 2. **Ambient** (convenience for auto-tools): `withAgentScope()`
+ *    sets module-level state; `filterByScope()` reads it. Safe only
+ *    when missions don't run concurrently — the runner must serialize.
  *
- * Pattern mirrors `runAs()` in events/actor.ts — module-level mutable
- * state, single-threaded browser runtime, scoped via try/finally.
+ * Callers that already have the scope (e.g. the reasoning loop itself)
+ * should use the explicit variant. Auto-tools that don't receive scope
+ * as a parameter use the ambient variant.
  */

 let currentScopeTagIds: readonly string[] | null = null;

 /**
  * Run `fn` with the given scope tag IDs as ambient context. Clears
- * the scope when `fn` completes (or throws).
+ * the scope when `fn` completes (or throws). NOT safe for concurrent
+ * calls — use the mutex in runMission or serialize callers.
  */
 export async function withAgentScope<T>(
   scopeTagIds: readonly string[] | undefined,

@@ -30,34 +33,43 @@ export async function withAgentScope<T>(
   }
 }

-/**
- * Read the current ambient scope. Returns null when no scope is set
- * (meaning the tool should return everything — General-Agent behavior).
- */
+/** Read the current ambient scope. Null = no filtering. */
 export function getAgentScopeTagIds(): readonly string[] | null {
   return currentScopeTagIds;
 }

 /**
- * Given a list of records + a function that returns their tag IDs,
- * filter down to records that match the ambient scope. Records with
- * no tags pass through (globally visible).
+ * Core filter: keep records whose tags overlap with `scopeTagIds`.
+ * Untagged records (tagIds=[]) always pass through (globally visible).
+ * When `scopeTagIds` is null/empty, returns all records (no filtering).
+ *
+ * This is the explicit, race-safe variant — pass scope directly.
  */
-export async function filterByScope<T>(
+export async function filterByScopeExplicit<T>(
   records: T[],
+  scopeTagIds: readonly string[] | null | undefined,
   getTagIdsForRecord: (record: T) => Promise<string[]>
 ): Promise<T[]> {
-  const scope = currentScopeTagIds;
-  if (!scope) return records; // no scope = everything visible
+  if (!scopeTagIds?.length) return records;

-  const scopeSet = new Set(scope);
+  const scopeSet = new Set(scopeTagIds);
   const results: T[] = [];
   for (const r of records) {
     const tagIds = await getTagIdsForRecord(r);
     // Untagged records are globally visible; tagged records must match scope
     if (tagIds.length === 0 || tagIds.some((id) => scopeSet.has(id))) {
       results.push(r);
     }
   }
   return results;
 }
+
+/**
+ * Convenience wrapper: reads the ambient scope from `withAgentScope()`.
+ * Use this in auto-tools that don't receive scope explicitly.
+ */
+export async function filterByScope<T>(
+  records: T[],
+  getTagIdsForRecord: (record: T) => Promise<string[]>
+): Promise<T[]> {
+  return filterByScopeExplicit(records, currentScopeTagIds, getTagIdsForRecord);
+}

docs/optimizable/ai-workbench-audit-2026-04-16.md (new file, 70 lines)
@@ -0,0 +1,70 @@
# AI Workbench Audit — 2026-04-16

Code review of all AI Workbench features built in the April 15–16 session.
Covers: reasoning loop, debug log, scope context, per-agent kontext,
notes tools, scene-scope queries, research pre-step, cross-module inbox,
planner prompt.

## P0: Fix immediately

### 1. Tool exceptions in the reasoning loop are not caught
- **File:** `apps/mana/apps/web/src/lib/data/ai/missions/runner.ts`
- **Problem:** If a tool call throws during `stage(ps, aiActor)` (Dexie error, vault locked, network), the entire iteration crashes. The step is not marked as `failed`, and the loop aborts hard.
- **Fix:** Wrap `stage()` in try-catch inside the loop. On throw: record the step as failed, then continue with the next step.
- **Status:** DONE (commit TBD)
|
### 2. Concurrent missions trample the same scope
- **File:** `apps/mana/apps/web/src/lib/data/ai/scope-context.ts`
- **Problem:** `currentScopeTagIds` is module-level mutable state. When two missions run in parallel under different agents, the second `withAgentScope()` overwrites the scope of the first (every `await` yields the thread, which allows interleaving).
- **Fix:** Pass the scope as a parameter through the pipeline instead of relying on ambient state.
- **Status:** DONE (commit TBD)
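
The interleaving described in item 2 can be reproduced in a few lines. The sketch below is a simplified stand-in, not the code from `scope-context.ts`: `withScope`, `mission`, `tick`, and `serialized` are hypothetical names, and the promise-chain lock only mirrors the idea behind the `runMutex` that this commit adds to `runMission()`.

```typescript
// Simplified stand-in for the ambient scope (assumption: the real
// withAgentScope saves and restores module-level state in a similar way).
let current: readonly string[] | null = null;

async function withScope<T>(
  tags: readonly string[],
  fn: () => Promise<T>
): Promise<T> {
  const prev = current;
  current = tags;
  try {
    return await fn();
  } finally {
    current = prev;
  }
}

// Yields to the event loop, as any awaited tool call would.
const tick = () => new Promise<void>((resolve) => setTimeout(resolve, 0));

// A "mission" that reads the ambient scope only after an await.
function mission(tags: readonly string[]): Promise<readonly string[] | null> {
  return withScope(tags, async () => {
    await tick();
    return current;
  });
}

// Promise-chain lock: each caller waits for the previous caller's release.
let mutex: Promise<void> = Promise.resolve();

async function serialized<T>(fn: () => Promise<T>): Promise<T> {
  let release!: () => void;
  const prev = mutex;
  mutex = new Promise<void>((resolve) => (release = resolve));
  await prev;
  try {
    return await fn();
  } finally {
    release();
  }
}

async function demo() {
  // Unserialized: mission A resumes after B has overwritten the scope.
  const [racyA] = await Promise.all([mission(["a"]), mission(["b"])]);
  // Serialized: B does not start until A has finished.
  const [safeA, safeB] = await Promise.all([
    serialized(() => mission(["a"])),
    serialized(() => mission(["b"])),
  ]);
  return { racyA, safeA, safeB };
}
```

In the unserialized run, both missions set the scope synchronously before either one resumes, so mission A reads B's tags. Passing the scope explicitly avoids the problem without a lock; the lock is only needed for callers that still read ambient state.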
|
## P1: Fix soon

### 3. N+1 junction queries for scene scope
- **Files:** `modules/{notes,todo,contacts,calendar}/queries.ts`
- **Problem:** `filterBySceneScope` performs one Dexie lookup per record. 500 notes = 500 queries per render.
- **Fix:** Add a batch function `getTagIdsForMany(entityIds[])` that runs `where(field).anyOf(ids).toArray()` once.
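
A minimal sketch of the batch lookup proposed in item 3, under stated assumptions: the `TagLink` row shape and the injected `fetchLinks` callback are hypothetical, chosen so the example does not depend on the storage layer. Only the name `getTagIdsForMany` comes from the audit.

```typescript
// Hypothetical junction row: one row per (entity, tag) pair.
interface TagLink {
  entityId: string;
  tagId: string;
}

// Groups junction rows by entity. Every requested entity gets an entry,
// so untagged records resolve to [] rather than undefined.
function groupTagIds(
  entityIds: readonly string[],
  links: readonly TagLink[]
): Map<string, string[]> {
  const byEntity = new Map<string, string[]>();
  for (const id of entityIds) byEntity.set(id, []);
  for (const link of links) {
    // Rows for entities that were not requested are ignored.
    byEntity.get(link.entityId)?.push(link.tagId);
  }
  return byEntity;
}

// One query for the whole batch instead of one per record.
async function getTagIdsForMany(
  entityIds: readonly string[],
  fetchLinks: (ids: readonly string[]) => Promise<TagLink[]>
): Promise<Map<string, string[]>> {
  if (entityIds.length === 0) return new Map();
  return groupTagIds(entityIds, await fetchLinks(entityIds));
}
```

With Dexie, `fetchLinks` would wrap the single `where(field).anyOf(ids).toArray()` call named in the fix; the scope filter then reads tag IDs from the returned map.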
|
### 4. Vault locked = "Not found"
- **File:** `modules/notes/tools.ts` `readLocalNote()`
- **Problem:** When the vault is locked, `decryptRecords` returns null. The tool reports "Notiz nicht gefunden" (note not found) instead of "Vault gesperrt" (vault locked).
- **Fix:** Distinguish the two cases in the return value and emit a specific error message.
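
One way to make the distinction from item 4 explicit is a discriminated union. This is a sketch only: `Note`, `ReadNoteResult`, `classifyRead`, and `toToolMessage` are hypothetical names, and it assumes the caller can ask separately whether the vault is locked.

```typescript
// Hypothetical note shape for illustration.
interface Note {
  id: string;
  title: string;
}

type ReadNoteResult =
  | { kind: "ok"; note: Note }
  | { kind: "vault-locked" }
  | { kind: "not-found" };

// Assumption: a null decrypt result alone cannot tell "missing" from
// "locked", so the vault state is passed in alongside it.
function classifyRead(decrypted: Note | null, vaultLocked: boolean): ReadNoteResult {
  if (decrypted) return { kind: "ok", note: decrypted };
  return vaultLocked ? { kind: "vault-locked" } : { kind: "not-found" };
}

// Each case maps to its own message, so a locked vault is never
// reported as a missing note.
function toToolMessage(result: ReadNoteResult): string {
  switch (result.kind) {
    case "ok":
      return `Note: ${result.note.title}`;
    case "vault-locked":
      return "Vault is locked. Unlock it and retry.";
    case "not-found":
      return "Note not found.";
  }
}
```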
|
### 5. Debug log stores decrypted content in plaintext
- **File:** `data/ai/missions/debug.ts`
- **Problem:** Prompts containing note and context content end up unencrypted in `_aiDebugLog`. The log is local and not synced, but it is exposed if the device is stolen.
- **Fix:** Auto-purge after 7 days, plus an optional checksum mode.
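
The 7-day auto-purge from item 5 reduces to a cutoff comparison. The sketch below assumes each log entry carries a millisecond `createdAt` timestamp; the `DebugLogEntry` shape and `selectExpired` are hypothetical and say nothing about how `_aiDebugLog` is actually stored.

```typescript
// Hypothetical entry shape; only the timestamp matters for purging.
interface DebugLogEntry {
  id: string;
  createdAt: number; // epoch milliseconds
}

const MAX_AGE_MS = 7 * 24 * 60 * 60 * 1000; // 7 days

// Returns the ids of entries strictly older than the retention window.
function selectExpired(
  entries: readonly DebugLogEntry[],
  now: number,
  maxAgeMs: number = MAX_AGE_MS
): string[] {
  const cutoff = now - maxAgeMs;
  return entries.filter((e) => e.createdAt < cutoff).map((e) => e.id);
}
```

The caller would delete the returned ids in one bulk operation, for example on app start or before each new log write.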
|
### 6. 90 s timeout is too tight for 5 LLM calls
- **File:** `runner.ts` `ITERATION_TIMEOUT_MS`
- **Problem:** 5 planner calls with a slow model = 75+ seconds of LLM time alone.
- **Fix:** 180 s, or configurable per mission.
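
The numbers in item 6 imply roughly 15 s per planner call (75 s / 5), which leaves about 15 s of a 90 s budget for all tool execution. A sketch of the proposed fix follows; the `iterationTimeoutMs` field and both function names are assumptions, not existing API.

```typescript
const DEFAULT_ITERATION_TIMEOUT_MS = 180000; // 180 s, as proposed in the fix

// Hypothetical per-mission override; the field name is an assumption.
interface MissionTimeoutConfig {
  iterationTimeoutMs?: number;
}

// Uses the mission's own timeout when it is set and positive,
// otherwise falls back to the default.
function resolveIterationTimeout(config: MissionTimeoutConfig): number {
  const override = config.iterationTimeoutMs;
  return override !== undefined && override > 0
    ? override
    : DEFAULT_ITERATION_TIMEOUT_MS;
}

// Time left for tool execution once planner calls are accounted for.
function toolBudgetMs(timeoutMs: number, plannerCalls: number, msPerCall: number): number {
  return timeoutMs - plannerCalls * msPerCall;
}
```

Under these assumptions, `toolBudgetMs(90000, 5, 15000)` is 15000, while the 180 s default leaves 105000.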
|
## P2: Technical debt

### 7. Prompt says "bis 10 Steps" (up to 10 steps) but the loop caps at 5
- **Files:** `prompt.ts` L43 vs `runner.ts` L58
- **Fix:** Keep the prompt and the constant in sync.
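
One way to keep the two values from item 7 in sync is to derive the prompt text from the same constant the loop uses. The names below (`MAX_STEPS_PER_ITERATION`, `buildStepLimitHint`, `capSteps`) and the prompt wording are hypothetical; the sketch only illustrates the single-source-of-truth idea.

```typescript
// Single source of truth for the step cap (hypothetical name).
const MAX_STEPS_PER_ITERATION = 5;

// The prompt interpolates the constant instead of hard-coding a number.
function buildStepLimitHint(maxSteps: number = MAX_STEPS_PER_ITERATION): string {
  return `Plan at most ${maxSteps} steps per iteration.`;
}

// The loop applies the same constant to what it actually executes.
function capSteps<T>(steps: readonly T[], maxSteps: number = MAX_STEPS_PER_ITERATION): T[] {
  return steps.slice(0, maxSteps);
}
```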
|
### 8. Server prompt drift
- **File:** `packages/shared-ai/src/planner/prompt.ts`
- **Problem:** The mana-ai server prepends its own `<agent_context>` block. There is no drift guard.
- **Fix:** Version constant + hash test.

### 9. useAgents() on every SceneHeader render
- **File:** `SceneHeader.svelte`
- **Fix:** Use `useAgent(id)` instead of `useAgents()`, or cache globally.

### 10. Two parallel scope systems
- **Files:** `scope-context.ts` + `scene-scope.svelte.ts`
- **Fix:** Shared ScopeFilter function.

### 11. Research dedup missing
- **File:** `runner.ts` `runWebResearch()`
- **Fix:** Time-based dedup (<5 min) or a fixed ID.
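
The time-based variant from item 11 could look like the sketch below. `ResearchDedup` and its key format are hypothetical; the audit does not say what identifies a research run, so the key is left to the caller.

```typescript
const DEDUP_WINDOW_MS = 5 * 60 * 1000; // 5 minutes

// Tracks when each research key last ran. The key is an assumption,
// e.g. a mission id combined with a normalized query.
class ResearchDedup {
  private readonly lastRunAt = new Map<string, number>();

  // Returns true and records the run if the key has not run inside the
  // window; returns false (and records nothing) otherwise.
  shouldRun(key: string, now: number): boolean {
    const last = this.lastRunAt.get(key);
    if (last !== undefined && now - last < DEDUP_WINDOW_MS) return false;
    this.lastRunAt.set(key, now);
    return true;
  }
}
```

Passing `now` in explicitly keeps the check deterministic in tests; production callers would pass `Date.now()`.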
|
### 12. Kontext injection policy unclear
- **File:** `runner.ts`
- **Problem:** The comment says "no auto-inject", but the code falls back to the global singleton.
- **Fix:** Decide and document.