mirror of
https://github.com/Memo-2023/mana-monorepo.git
synced 2026-05-15 00:41:09 +02:00
User test on browser tier (Gemma 4 E2B) showed two compounding bugs:
1. The LLM produces no usable content: the cleanup chain strips its
output to "" and title generation falls through to runRules.
2. runRules takes the first 7 words of the transcript. For short
voice memos like "So erneut eine kleine Testaufnahme hier"
(6 words; roughly "So, once again a small test recording here")
that means the entire transcript becomes the title: not actually
a title, just the recording verbatim.
User log:
    [memoro] enqueued title task ...
    [generateTitle] LLM returned empty after cleanup, falling back to rules
    [memoro-llm-watcher] writing title to memo X: "So erneut eine kleine Testaufnahme hier"
Three changes to fix the actual quality, not just the empty-string
symptom from the previous commit:
1. Rewrite the LLM prompt as few-shot
Replace the previous "Du erstellst kurze Titel — kein Markdown,
keine Anführungszeichen, keine Vorrede, kein Punkt am Ende" prompt
(a wall of negative constraints that small instruct models like
Gemma 4 E2B handle poorly) with a few-shot user-only message:
    Erstelle einen kurzen Titel (3-5 Wörter) für die folgende Aufnahme.
    Beispiel 1:
    Aufnahme: "Erinnere mich daran, morgen Vormittag den Müll
    rauszubringen, bevor die Müllabfuhr kommt."
    Titel: Erinnerung Müll rausbringen
    Beispiel 2: ... (Idee Präsentation Demo-Start)
    Beispiel 3: ... (Steuererklärung 2025)
    Aufnahme: "<user transcript>"
    Titel:
Small instruct models complete the pattern much more reliably
than they obey negative constraints. The expected continuation is
just the title text, no punctuation, no markdown, no preamble.
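A minimal sketch of how such a few-shot prompt could be assembled. The names buildTitlePrompt, FewShotExample and EXAMPLE_1 are hypothetical and not taken from the commit. Only Example 1 is spelled out in this message, so the sketch takes the examples as a parameter rather than guessing at the elided ones:

```typescript
// Hypothetical type and names, for illustration only.
interface FewShotExample {
  transcript: string;
  title: string;
}

// Example 1 as quoted in this commit message; Examples 2 and 3 are elided there.
const EXAMPLE_1: FewShotExample = {
  transcript:
    "Erinnere mich daran, morgen Vormittag den Müll rauszubringen, bevor die Müllabfuhr kommt.",
  title: "Erinnerung Müll rausbringen",
};

function buildTitlePrompt(transcript: string, examples: FewShotExample[]): string {
  const lines: string[] = [
    "Erstelle einen kurzen Titel (3-5 Wörter) für die folgende Aufnahme.",
  ];
  examples.forEach((ex, i) => {
    lines.push(`Beispiel ${i + 1}:`);
    lines.push(`Aufnahme: "${ex.transcript}"`);
    lines.push(`Titel: ${ex.title}`);
  });
  // End on an open "Titel:" so the model's continuation is the title itself.
  lines.push(`Aufnahme: "${transcript}"`);
  lines.push("Titel:");
  return lines.join("\n");
}
```

Ending the message on an open "Titel:" is what turns the task into pattern completion instead of instruction following.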
2. Rolling cleanup that won't go to empty
The previous cleanup chain (`.trim().replace(quotes).replace(dots).trim()`)
could end up with "" if the model emitted only `.` or `**.**` or
similar. Replace it with a four-stage chain that keeps the result of
the LAST stage that is still non-empty, checking from the bottom up:
    trimmed     = result.content.trim()
    stripFences = first line only (kills any model rambling)
    stripQuotes = strip surrounding quotes/markdown markers
    stripDots   = strip trailing dots
    cleaned     = stripDots || stripQuotes || stripFences || trimmed
This way "Test." → "Test" but `"."` → `"."` (kept as-is rather
than stripped to empty). The runRules fallback only fires when
the model truly emits nothing usable in any stage.
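The four stages above can be sketched as a single function. This is an illustration under assumptions: the function name cleanTitle and the exact set of quote and markdown characters in the regex are hypothetical, not the committed code.

```typescript
// Hypothetical helper; the character set below is an assumption.
function cleanTitle(raw: string): string {
  const trimmed = raw.trim();
  // Stage 2: keep only the first line, dropping any model rambling after it.
  const firstLine = (trimmed.split("\n")[0] ?? "").trim();
  // Stage 3: strip surrounding quotes and markdown markers (* _ `).
  const stripQuotes = firstLine
    .replace(/^["'`*_„“”«»]+|["'`*_„“”«»]+$/g, "")
    .trim();
  // Stage 4: strip trailing dots.
  const stripDots = stripQuotes.replace(/\.+$/, "").trim();
  // Prefer the most processed stage that is still non-empty.
  return stripDots || stripQuotes || firstLine || trimmed;
}
```

Because `||` treats "" as falsy, each stage only wins if it left something behind; an empty return value therefore means the raw output was empty or whitespace.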
3. runRules is smarter about short transcripts
For voice memos with ≤8 words in the first sentence, the "title"
would just be the whole transcript echoed back, which is not useful.
New rule: transcripts at or below that threshold get a date label
("Memo vom 9. April 2026", i.e. "Memo from 9 April 2026"); longer
ones still get the first-N-words snippet. The threshold is empirical:
short voice memos benefit from a date marker, longer ones can spare
a few words for a snippet.
Extracted dateLabel() to a module-scope function so both rulesImpl
(for empty/short transcripts) and the watcher's last-resort
backstop can format dates consistently.
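A rough sketch of the rule described above. The dateLabel signature, the rulesTitle name, the sentence-splitting regex and the default snippet length of 7 words are assumptions for illustration; only the ≤8-word threshold and the label format come from this message. Month names are listed explicitly so the output does not depend on the runtime's locale data.

```typescript
const GERMAN_MONTHS = [
  "Januar", "Februar", "März", "April", "Mai", "Juni",
  "Juli", "August", "September", "Oktober", "November", "Dezember",
];

// Assumed signature: formats a date as "Memo vom 9. April 2026".
function dateLabel(date: Date): string {
  return `Memo vom ${date.getDate()}. ${GERMAN_MONTHS[date.getMonth()]} ${date.getFullYear()}`;
}

// Hypothetical stand-in for the rules fallback.
function rulesTitle(
  transcript: string,
  now: Date,
  shortThreshold = 8, // from the commit message
  snippetWords = 7,   // assumed: same N as the old first-7-words rule
): string {
  const firstSentence = transcript.trim().split(/[.!?]+/)[0] ?? "";
  const words = firstSentence.split(/\s+/).filter((w) => w.length > 0);
  // Empty and short transcripts get a date label instead of an echo.
  if (words.length <= shortThreshold) return dateLabel(now);
  return words.slice(0, snippetWords).join(" ");
}
```

With these defaults, the 6-word test memo maps to the date label, while a 12-word first sentence still yields a 7-word snippet.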
Diagnostic: log the RAW LLM output before cleanup so the next test
session shows exactly what Gemma is producing. If the model is still
emitting only punctuation despite the few-shot prompt, the log will
show `"\n"` or `"."` and we'll know the bug is in the inference path
rather than the cleanup.
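One way to make whitespace-only or punctuation-only output visible in the log is to serialize the raw string before printing it. This is a sketch, not the committed code: formatRawForLog and the "raw LLM output" wording are hypothetical.

```typescript
// Hypothetical helper: JSON.stringify escapes control characters, so a bare
// newline shows up as the two characters \n inside quotes instead of a blank line.
function formatRawForLog(raw: string): string {
  return `[generateTitle] raw LLM output: ${JSON.stringify(raw)}`;
}

console.log(formatRawForLog("\n"));
console.log(formatRawForLog("."));
```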
After this commit, the user-visible result for a 6-word transcript
on the browser tier should be:
- LLM produces something real ("Test der Sprachaufnahme") → write it
- LLM produces nothing → rules → "Memo vom 9. April 2026"
- both fail somehow → watcher's date backstop → same
- never the verbatim transcript
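The fallback order in the list above can be pictured as a simple first-non-empty chain. This sketch is synchronous for brevity (a real LLM call would be awaited), and the names TitleStep and firstUsableTitle are hypothetical:

```typescript
// Each step returns a candidate title, or null / "" if it has nothing usable.
type TitleStep = () => string | null;

function firstUsableTitle(steps: TitleStep[], lastResort: string): string {
  for (const step of steps) {
    const candidate = step()?.trim();
    if (candidate) return candidate;
  }
  // Mirrors the watcher's date backstop: always returns something.
  return lastResort;
}

// Usage with stand-in values: the LLM step yields nothing, the rules step wins.
const title = firstUsableTitle(
  [() => "", () => "Memo vom 9. April 2026"],
  "Memo vom 9. April 2026",
);
console.log(title);
```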
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>