mirror of
https://github.com/Memo-2023/mana-monorepo.git
synced 2026-05-16 04:39:39 +02:00
The Claude-Code wU2 pattern: when token usage hits ~92% of the provider's
context budget, fold all pre-tail turns into a single structured summary
(Goal / Decisions / Tools Called / Current Progress) so subsequent
rounds see a synopsis instead of the raw log.
This commit ships ONLY the primitive. Wiring it into runPlannerLoop
(auto-trigger before the next LLM call when shouldCompact() fires)
is M2.2 so the surface stays small and testable.
New exports from @mana/shared-ai:
- shouldCompact(totalTokens, maxContextTokens, threshold?)
→ boolean; DEFAULT_COMPACT_THRESHOLD = 0.92, matching Claude Code.
Bails safely when maxContextTokens is missing (local models often
don't report usage).
- compactHistory(messages, { llm, model, keepRecent?, temperature? })
→ { messages, summary, compactedTurns, usage? }
Preserves: [0]=system, [1]=first user, [last N]=recent turns
(default 4). Everything between gets sent through the compact
agent with COMPACT_SYSTEM_PROMPT — a fixed 4-section Markdown
schema. Temperature default 0.2 because we want summarisation,
not creativity.
- parseCompactSummary / renderCompactSummary — round-trip helpers.
Parser is tolerant (missing sections → empty string) so a partial
compaction still produces a usable summary.
The summary replaces the middle as a single role='assistant' message
wrapped in <compact-summary> tags. Assistant role (not system) because
some providers reject arbitrary system messages deep in history.
Tests: 17 new across the 4 exports (trigger logic, Markdown round-trip,
structural preservation of anchors + tail, usage passthrough, custom
keepRecent). All 71 shared-ai tests green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|---|---|---|
| .. | ||
| src | ||
| package.json | ||
| tsconfig.json | ||