fix(personas): exact tool_use_id pairing + CI drift audit

Two loose ends from M3/M4:

1. Tool_use_id-based error attribution in the persona-runner
-----------------------------------------------------------

The previous collectActionsFromMessage() flipped the *most recent*
ActionRow to 'error' when a tool_result carried is_error:true. That was
fine as long as Claude invoked tools strictly in sequence, but when the
planner pipelines multiple tools in one turn, a later tool_result
carries an earlier tool_use_id, and the last-action fallback
misattributes the error.

runMainTurn() now keeps a tool_use_id → action-index Map for the
duration of the tick. On tool_use we stash block.id; on tool_result we
look up the exact ActionRow via tool_use_id and flip that one. The
"flip last" path survives as a pure fallback if a future SDK ever ships
a block without an id.

2. New audit:encrypted-tools script
-----------------------------------

scripts/audit-encrypted-tools.ts loads registerAllModules() and
apps/mana/…/crypto/registry.ts, then diffs every
ToolSpec.encryptedFields against the authoritative web-app
ENCRYPTION_REGISTRY. Catches three classes of drift:

  - missing-table : tool declares a table the web-app doesn't encrypt
  - field-drift   : both agree a table is encrypted but the field lists
                    differ (half-encryption in the wire is silent death)
  - disabled      : web-app has enabled:false while the tool still
                    encrypts (advisory warning, not a fail)

Negative-tested by injecting a deliberate drift on todo.create +
todo.list (shortened ENCRYPTED_FIELDS to ['title']); the auditor flagged
both tools with full field diffs, restore returned to green.

Wired into `pnpm run validate:all` so the contract survives future edits
on either side. Fills the M4 audit gap noted in
project_mana_mcp_personas.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
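The three drift classes named in the commit message could be sketched roughly as below. This is an illustration only, not the contents of scripts/audit-encrypted-tools.ts, which this diff does not show. The types `RegistryEntry`, `ToolDecl` and `Finding`, the function `auditTool`, the table name `todos` and the field `notes` are all hypothetical. The sketch also assumes that a disabled table yields only the advisory warning and skips the field comparison.

```typescript
// Hypothetical shapes; the real ToolSpec / ENCRYPTION_REGISTRY types are not shown in this commit.
type RegistryEntry = { enabled: boolean; fields: string[] };
type ToolDecl = { tool: string; table: string; encryptedFields: string[] };
type DriftKind = 'missing-table' | 'field-drift' | 'disabled';
type Finding = {
	tool: string;
	table: string;
	kind: DriftKind;
	severity: 'fail' | 'warn';
	missing: string[]; // fields the registry encrypts but the tool omits
	extra: string[]; // fields the tool encrypts but the registry omits
};

function auditTool(decl: ToolDecl, registry: Map<string, RegistryEntry>): Finding[] {
	const base = { tool: decl.tool, table: decl.table };
	const entry = registry.get(decl.table);

	// missing-table: the tool names a table the registry does not know about.
	if (entry === undefined) {
		return [{ ...base, kind: 'missing-table', severity: 'fail', missing: [], extra: [...decl.encryptedFields] }];
	}

	// disabled: advisory only. Assumption: no field comparison for disabled tables.
	if (!entry.enabled) {
		return [{ ...base, kind: 'disabled', severity: 'warn', missing: [], extra: [] }];
	}

	// field-drift: both sides encrypt the table but disagree on the field list.
	const missing = entry.fields.filter((f) => !decl.encryptedFields.includes(f));
	const extra = decl.encryptedFields.filter((f) => !entry.fields.includes(f));
	if (missing.length > 0 || extra.length > 0) {
		return [{ ...base, kind: 'field-drift', severity: 'fail', missing, extra }];
	}
	return [];
}

// Mirrors the negative test in spirit: the tool's field list is shortened to ['title'].
const exampleRegistry = new Map<string, RegistryEntry>([
	['todos', { enabled: true, fields: ['title', 'notes'] }],
]);
const exampleFindings = auditTool(
	{ tool: 'todo.create', table: 'todos', encryptedFields: ['title'] },
	exampleRegistry
);
console.log(JSON.stringify(exampleFindings));
```

A CI wrapper in this style would presumably exit non-zero only when some finding has severity 'fail', which matches the "advisory warning, not a fail" wording for the disabled case.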
2026-05-18 21:41:23 +02:00 · 2026-04-23 15:34:52 +02:00 · eb8fac23ec
commit eb8fac23ec
parent 703ef69ca9
3 changed files with 150 additions and 12 deletions
--- a/services/mana-persona-runner/src/runner/claude-session.ts
+++ b/services/mana-persona-runner/src/runner/claude-session.ts
@@ -96,8 +96,16 @@ export async function runMainTurn(input: SessionInput): Promise<SessionResult> {
 		},
 	});

+	// Per-turn Map so tool_result blocks can flip the *right* ActionRow
+	// (the one whose tool_use_id they reference), not "the most recent".
+	// Anthropic's stream interleaves tool_use and tool_result blocks
+	// deterministically, but when Claude pipelines multiple tools in one
+	// assistant turn, a later tool_result carries an earlier tool_use_id
+	// — the last-action fallback gets that wrong.
+	const toolUseIndex = new Map<string, number>();
+
 	for await (const msg of q as AsyncIterable<SDKMessage>) {
-		collectActionsFromMessage(msg, input.tickId, actions, modulesUsed);
+		collectActionsFromMessage(msg, input.tickId, actions, modulesUsed, toolUseIndex);
 	}

 	return { actions, feedback: [], modulesUsed };
@@ -143,7 +151,8 @@ function collectActionsFromMessage(
 	msg: SDKMessage,
 	tickId: string,
 	actions: ActionRow[],
-	modulesUsed: Set<string>
+	modulesUsed: Set<string>,
+	toolUseIndex: Map<string, number>
 ): void {
 	// SDKMessage is a big union; we only care about assistant messages
 	// that contain tool_use blocks, and user messages that contain
@@ -168,19 +177,23 @@ function collectActionsFromMessage(
 				inputHash: hashInput(block.input),
 				result: 'ok', // provisional; rewritten on matching tool_result if it was an error
 			});
+			// Record the Anthropic-assigned id so a later tool_result can
+			// find its row even if other tools finished in between.
+			const toolUseId = typeof block.id === 'string' ? block.id : null;
+			if (toolUseId) toolUseIndex.set(toolUseId, actions.length - 1);
 		} else if (blockType === 'tool_result') {
 			const isError = block.is_error === true;
 			if (!isError) continue;
-			// Flip the most recent action that matches this tool_use_id.
 			const toolUseId = typeof block.tool_use_id === 'string' ? block.tool_use_id : null;
-			if (!toolUseId) continue;
-			// We didn't store tool_use_id (would require pairing state); cheap
-			// fallback: mark the last action as error. Good enough for the
-			// audit dashboard; precise attribution lands in a later iteration.
-			const last = actions[actions.length - 1];
-			if (last) {
-				last.result = 'error';
-				last.errorMessage = stringifyBlock(block);
+			// Exact pairing via tool_use_id → fall back to last action only
+			// if the id is missing (shouldn't happen with the current SDK,
+			// but the fallback keeps the pipeline honest if Anthropic ever
+			// ships a block without an id).
+			const idx = toolUseId !== null ? toolUseIndex.get(toolUseId) : undefined;
+			const target = idx !== undefined ? actions[idx] : actions[actions.length - 1];
+			if (target) {
+				target.result = 'error';
+				target.errorMessage = stringifyBlock(block);
 			}
 		}
 	}
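
To see why the id-based lookup matters, the following standalone sketch replays a pipelined turn against a simplified version of the pairing logic in the hunk above. `Row`, `Block` and `applyBlocks` are hypothetical stand-ins for ActionRow, the SDK content blocks and collectActionsFromMessage(); the ids `toolu_a` and `toolu_b` are made up for the example.

```typescript
// Simplified stand-ins; the real ActionRow and SDK block types carry more fields.
type Row = { tool: string; result: 'ok' | 'error'; errorMessage?: string };
type Block =
	| { type: 'tool_use'; id?: string; name: string }
	| { type: 'tool_result'; tool_use_id?: string; is_error?: boolean; content?: string };

function applyBlocks(blocks: Block[], rows: Row[], index: Map<string, number>): void {
	for (const block of blocks) {
		if (block.type === 'tool_use') {
			rows.push({ tool: block.name, result: 'ok' }); // provisional 'ok'
			// Remember which row this id belongs to.
			if (block.id !== undefined) index.set(block.id, rows.length - 1);
		} else {
			if (block.is_error !== true) continue;
			// Exact pairing first; last-row fallback only when the lookup fails.
			const idx = block.tool_use_id !== undefined ? index.get(block.tool_use_id) : undefined;
			const target = idx !== undefined ? rows[idx] : rows[rows.length - 1];
			if (target) {
				target.result = 'error';
				target.errorMessage = block.content ?? '';
			}
		}
	}
}

// Two tools issued in one turn; the error belongs to the FIRST one.
const demoRows: Row[] = [];
const demoIndex = new Map<string, number>();
applyBlocks(
	[
		{ type: 'tool_use', id: 'toolu_a', name: 'todo.create' },
		{ type: 'tool_use', id: 'toolu_b', name: 'todo.list' },
		{ type: 'tool_result', tool_use_id: 'toolu_a', is_error: true, content: 'validation failed' },
		{ type: 'tool_result', tool_use_id: 'toolu_b' },
	],
	demoRows,
	demoIndex
);
console.log(demoRows.map((r) => `${r.tool}:${r.result}`).join(' '));
```

With the earlier "flip last" behaviour, the error in this sequence would have landed on the todo.list row, because that was the most recent action when the failing tool_result arrived.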