Mirror of https://github.com/Memo-2023/mana-monorepo.git, synced 2026-05-14 21:21:10 +02:00
Two surprises came out of "why do we still use Gemma 3 instead of 4":
1. The hardcoded default in ManaServerBackend was `gemma3:4b`, which
was even smaller than mana-llm's actual server-side default of
`gemma3:12b`. My initial guess from docs/LOCAL_LLM_MODELS.md was
conservative.
2. The mana-llm OLLAMA_URL points at host.docker.internal:13434,
which is NOT the Mac Mini's local Ollama — it's a Python TCP
forwarder (~/gpu-proxy.py) that proxies to 192.168.178.11:11434
on the Windows GPU server. So title generation has been running
on the RTX 3090 the whole time, not on the M4 Metal GPU. The
Mac Mini's brew-installed ollama 0.15.4 wasn't even being used
for inference — only as a CLI to inspect the proxied Ollama.
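
The forwarding arrangement in point 2 can be sketched as follows. This is not the actual ~/gpu-proxy.py (a Python script whose contents are not shown in this commit); it is a minimal TypeScript illustration of the same idea using Node's `net` module. The helper name `createForwarder` is hypothetical; the addresses and ports are the ones quoted above.

```typescript
import * as net from "node:net";

// Hypothetical helper: returns a TCP server that pipes every incoming
// connection to targetHost:targetPort and pipes the replies back.
function createForwarder(targetHost: string, targetPort: number): net.Server {
  return net.createServer((client) => {
    const upstream = net.connect(targetPort, targetHost);
    client.pipe(upstream);
    upstream.pipe(client);
    // Tear down both sockets if either side fails.
    client.on("error", () => upstream.destroy());
    upstream.on("error", () => client.destroy());
  });
}

// Target from the text: the GPU server's Ollama at 192.168.178.11:11434.
const forwarder = createForwarder("192.168.178.11", 11434);
// forwarder.listen(13434); // uncomment to expose it on the local port 13434
```

Because the forwarder works at the TCP level, a client talking to port 13434 cannot tell from the protocol alone that inference happens on a different machine, which is likely why this went unnoticed.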
To get to Gemma 4, both Ollama instances needed an upgrade:
- Mac Mini (brew): 0.15.4 → 0.20.4. Cosmetic, since the binary isn't on
  the inference path; upgraded for consistency.
- GPU server: 0.18.2 → 0.20.4 via winget. This required restarting the
  daemon via the already-configured OllamaServe scheduled task.
Then `ollama pull gemma4:e4b` on the GPU server (9.6 GB, ~10 min on
the LAN). Verified end-to-end via the proxy with a real chat
completion request to mana-llm — gemma4:e4b answered with a clean
4-word German title for a sample voice memo prompt:
prompt: "Erstelle einen kurzen 3-Wort Titel für: Es ist ein
schöner Tag heute am 9. April"
→ "Schöner Tag, neuntes April"
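
A request like the one used for this check might look like the sketch below. The OpenAI-style `messages` payload and the helper name `buildTitleRequest` are assumptions for illustration; this commit does not show mana-llm's actual request schema. Only the model tag and the prompt text are taken from the message above.

```typescript
// Assumed OpenAI-style chat payload; mana-llm's real schema may differ.
interface ChatRequest {
  model: string;
  messages: { role: "system" | "user" | "assistant"; content: string }[];
}

// Hypothetical helper that wraps a transcript in the title prompt.
function buildTitleRequest(model: string, transcript: string): ChatRequest {
  return {
    model,
    messages: [
      {
        role: "user",
        content: `Erstelle einen kurzen 3-Wort Titel für: ${transcript}`,
      },
    ],
  };
}

const titleRequest = buildTitleRequest(
  "gemma4:e4b",
  "Es ist ein schöner Tag heute am 9. April",
);
console.log(JSON.stringify(titleRequest, null, 2));
```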
Changes in this commit:
packages/shared-llm/src/backends/mana-server.ts
- defaultModel: 'gemma3:4b' → 'gemma4:e4b'
- Updated docstring to explain why E4B is the right Mana-Server
tier default: 9.6 GB on disk, 128K context, "Effective 4B"
arch punches above its weight class for German prompts, and
the family stays consistent with the browser tier (Gemma 4
E2B is the smaller sibling) so the source label and prompt
behavior remain coherent across tiers.
apps/mana/apps/web/src/lib/modules/memoro/views/DetailView.svelte
- TITLE_SOURCE_LABELS map updated:
browser → "Auf deinem Gerät (Gemma 4 E2B)" (was "(Gemma 4)")
mana-server → "Mana-Server (Gemma 4 E4B)" (was "(gemma3:4b)")
- The label now reflects that BOTH the browser and the mana-server
tier are running Gemma 4 variants, which is more honest than
the previous mix.
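
For illustration, the updated map could look roughly like this. The label strings are the ones listed above; the `TitleSource` type and the `labelFor` helper are hypothetical, and the real map in DetailView.svelte may contain further entries.

```typescript
// Hypothetical type for the tiers named in this commit.
type TitleSource = "browser" | "mana-server";

// Label strings as listed in the commit message.
const TITLE_SOURCE_LABELS: Record<TitleSource, string> = {
  browser: "Auf deinem Gerät (Gemma 4 E2B)",
  "mana-server": "Mana-Server (Gemma 4 E4B)",
};

// Hypothetical lookup helper for rendering the source of a title.
function labelFor(source: TitleSource): string {
  return TITLE_SOURCE_LABELS[source];
}

console.log(labelFor("mana-server"));
```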
Did NOT change:
- The OLLAMA_DEFAULT_MODEL env var in docker-compose.macmini.yml
(still gemma3:12b). That's the fallback for callers who don't
specify a model in their request. Our generate-title task always
sends an explicit model string, so it's unaffected. Bumping the
global default is a separate decision — it would change behavior
for the playground module and any other consumer that relies on
the implicit fallback.
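
The fallback behavior described here can be sketched as below. The function `resolveModel` is hypothetical and only illustrates the described rule (an explicit model wins, otherwise the server default applies); it is not mana-llm's actual code.

```typescript
// Hypothetical resolution rule: use the requested model if one was sent,
// otherwise fall back to the server-side default.
function resolveModel(
  requested: string | undefined,
  serverDefault: string,
): string {
  return requested !== undefined && requested.trim() !== ""
    ? requested
    : serverDefault;
}

// Value of OLLAMA_DEFAULT_MODEL according to the text above.
const serverDefault = "gemma3:12b";

// generate-title always sends an explicit model, so the default is ignored.
const titleModel = resolveModel("gemma4:e4b", serverDefault);

// A caller that omits the model gets the implicit fallback.
const implicitModel = resolveModel(undefined, serverDefault);

console.log(titleModel, implicitModel);
```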
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| Name |
|---|
| credits |
| eslint-config |
| feedback |
| help |
| local-llm |
| local-store |
| notify-client |
| qr-export |
| shared-auth |
| shared-auth-ui |
| shared-branding |
| shared-drizzle-config |
| shared-error-tracking |
| shared-go |
| shared-hono |
| shared-i18n |
| shared-icons |
| shared-landing-ui |
| shared-links |
| shared-llm |
| shared-logger |
| shared-pwa |
| shared-python/manacore_auth |
| shared-storage |
| shared-stores |
| shared-tags |
| shared-tailwind |
| shared-theme |
| shared-theme-ui |
| shared-types |
| shared-ui |
| shared-uload |
| shared-utils |
| shared-vite-config |
| spiral-db |
| subscriptions |
| test-config |
| wallpaper-generator |