managarten/apps
Till JS 1f26aa4f2f feat(local-llm): swap WebLLM/Qwen for transformers.js + Gemma 4 E2B
Replace the entire @mana/local-llm engine with a transformers.js-based
implementation backed by Google's Gemma 4 E2B (released 2026-04-02).
The external API of LocalLLMEngine — load(), generate(), prompt(),
extractJson(), classify(), onStatusChange(), isSupported() — is
preserved 1:1, so the /llm-test page, the playground module, and the
Svelte 5 reactive bindings in svelte.svelte.ts need no changes
beyond updating the default model key.

Why the engine swap: MLC has not (and as of today still hasn't)
published Gemma 4 builds for WebLLM. The webml-community team and
HuggingFace's onnx-community already have Gemma 4 E2B running in
the browser via transformers.js + WebGPU, with a documented
Gemma4ForConditionalGeneration class shipped in @huggingface/transformers
v4.0.0. Going through the ONNX route gets us the latest Google model
six days after release instead of waiting on MLC compilation.

Trade-offs accepted (discussed before this commit):
- transformers.js is a more generic ONNX runtime, so per-token
  throughput will be ~20-40% lower than WebLLM would deliver for the
  same model size. For a 2B model on a modern WebGPU device that's
  still well above interactive latency.
- The JS bundle gains ~2-3 MB (the ONNX runtime). Negligible compared
  to the 500 MB model download.
- transformers.js v4 is brand new (released alongside Gemma 4) so the
  Gemma4ForConditionalGeneration code path has very little battle
  testing yet. The risk is partially offset by webml-community's
  reference implementation.

What changed file by file:

- packages/local-llm/package.json: drop @mlc-ai/web-llm, add
  @huggingface/transformers ^4.0.0; bump version 0.1.0 → 0.2.0; rewrite
  description.

- packages/local-llm/src/types.ts: add `dtype` field to ModelConfig
  ('fp32' | 'fp16' | 'q8' | 'q4' | 'q4f16') so each model can request
  the quantization that matches its uploaded ONNX shards.

- packages/local-llm/src/models.ts: replace the old Qwen 2.5 + Gemma 2
  registry with a single `gemma-4-e2b` entry pointing at
  onnx-community/gemma-4-E2B-it-ONNX with q4f16 quantization. Future
  models can be added by appending entries — the /llm-test picker
  reads MODELS dynamically and picks them up automatically.

- packages/local-llm/src/cache.ts: replace the WebLLM-specific
  hasModelInCache helper with a generic Cache API probe that looks for
  `https://huggingface.co/{model_id}/resolve/main/tokenizer.json` in
  any open cache. tokenizer.json is small, downloaded first, and
  always present, so its presence is a reliable proxy for "model has
  been loaded before".

- packages/local-llm/src/engine.ts: full rewrite. Internally we now
  hold a transformers.js model + processor pair (created via
  AutoProcessor.from_pretrained + Gemma4ForConditionalGeneration.from_pretrained
  with `device: 'webgpu'`), and translate our LoadingStatus union from
  the library's `progress_callback` shape. generate() applies Gemma's
  chat template via the processor, runs model.generate() with optional
  TextStreamer for streaming, then slices the prompt tokens off the
  output tensor to compute per-call usage. The convenience methods
  (prompt, extractJson, classify) are unchanged because they only call
  generate() under the hood.

- packages/local-llm/src/generate.ts and status.svelte.ts: deleted.
  These were orphaned from a much earlier engine API (referenced
  `getEngine()` / `subscribe()` / `LlmState` symbols that haven't
  existed for a while) and were never re-exported from index.ts —
  they only showed up because `tsc --noEmit` was crawling the src
  tree. Their functionality lives in engine.ts + svelte.svelte.ts now.

- apps/mana/apps/web/package.json: swap the direct dep from
  @mlc-ai/web-llm to @huggingface/transformers. This is the same
  trick we used for the previous adapter-node externals warning —
  having it as a direct dep makes adapter-node's Rollup pass treat
  it as external automatically.

- apps/mana/apps/web/vite.config.ts: swap ssr.external entry from
  @mlc-ai/web-llm to @huggingface/transformers. Add a comment
  explaining the why so the next person doesn't wonder.

- apps/mana/apps/web/src/routes/(app)/llm-test/+page.svelte: change
  the default selectedModel from 'qwen-2.5-1.5b' to 'gemma-4-e2b'.
  All other model display strings come from the MODELS registry, so
  this is the single hard-coded reference that needed updating.

- pnpm-lock.yaml: regenerated. Confirmed @mlc-ai/web-llm is gone (0
  references) and @huggingface/transformers is in (4 references).

CSP: no header changes needed. We already opened connect-src for
huggingface.co + cdn-lfs.huggingface.co + raw.githubusercontent.com
when fixing the WebLLM blockers earlier today, and 'wasm-unsafe-eval'
is already in script-src — both transformers.js (ONNX runtime) and
WebLLM (MLC runtime) need that. If transformers.js spawns its
inference into a Web Worker via a blob URL we may need to add
`worker-src 'self' blob:` once we hit the first runtime test, but
the existing CSP should be enough for the synchronous path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 22:22:32 +02:00
..
api fix(research): handle zero-hit retrieval — skip empty insert + graceful summary 2026-04-08 22:19:17 +02:00
calc/packages/shared chore: delete 25 web-archived directories, remove stale stubs, clean workspace config 2026-04-03 13:03:49 +02:00
calendar chore: complete ManaCore → Mana rename (docs, go modules, plists, images) 2026-04-07 12:26:10 +02:00
cards chore: complete ManaCore → Mana rename (docs, go modules, plists, images) 2026-04-07 12:26:10 +02:00
chat chore: complete ManaCore → Mana rename (docs, go modules, plists, images) 2026-04-07 12:26:10 +02:00
citycorners chore: complete ManaCore → Mana rename (docs, go modules, plists, images) 2026-04-07 12:26:10 +02:00
contacts chore: complete ManaCore → Mana rename (docs, go modules, plists, images) 2026-04-07 12:26:10 +02:00
context chore: complete ManaCore → Mana rename (docs, go modules, plists, images) 2026-04-07 12:26:10 +02:00
docs docs: Phase 9 documentation roundup — close encryption-shaped doc gaps 2026-04-08 11:47:59 +02:00
guides chore: delete 25 web-archived directories, remove stale stubs, clean workspace config 2026-04-03 13:03:49 +02:00
inventar chore: complete ManaCore → Mana rename (docs, go modules, plists, images) 2026-04-07 12:26:10 +02:00
mana feat(local-llm): swap WebLLM/Qwen for transformers.js + Gemma 4 E2B 2026-04-08 22:22:32 +02:00
manavoxel chore(workspace): unify vitest to ^4.1.2 across all packages 2026-04-07 13:58:29 +02:00
memoro chore(workspace): unify vitest to ^4.1.2 across all packages 2026-04-07 13:58:29 +02:00
moodlit feat: rename ManaCore to Mana across entire codebase 2026-04-05 20:00:13 +02:00
mukke feat: rename ManaCore to Mana across entire codebase 2026-04-05 20:00:13 +02:00
news chore: complete ManaCore → Mana rename (docs, go modules, plists, images) 2026-04-07 12:26:10 +02:00
nutriphi chore(workspace): unify vitest to ^4.1.2 across all packages 2026-04-07 13:58:29 +02:00
photos chore: complete ManaCore → Mana rename (docs, go modules, plists, images) 2026-04-07 12:26:10 +02:00
picture chore: complete ManaCore → Mana rename (docs, go modules, plists, images) 2026-04-07 12:26:10 +02:00
planta chore: complete ManaCore → Mana rename (docs, go modules, plists, images) 2026-04-07 12:26:10 +02:00
presi chore: complete ManaCore → Mana rename (docs, go modules, plists, images) 2026-04-07 12:26:10 +02:00
questions feat: rename ManaCore to Mana across entire codebase 2026-04-05 20:00:13 +02:00
skilltree chore: delete 25 web-archived directories, remove stale stubs, clean workspace config 2026-04-03 13:03:49 +02:00
storage chore: complete ManaCore → Mana rename (docs, go modules, plists, images) 2026-04-07 12:26:10 +02:00
times chore: complete ManaCore → Mana rename (docs, go modules, plists, images) 2026-04-07 12:26:10 +02:00
todo chore: complete ManaCore → Mana rename (docs, go modules, plists, images) 2026-04-07 12:26:10 +02:00
traces feat: rename ManaCore to Mana across entire codebase 2026-04-05 20:00:13 +02:00
uload chore: complete ManaCore → Mana rename (docs, go modules, plists, images) 2026-04-07 12:26:10 +02:00
zitare/packages/content chore: delete 25 web-archived directories, remove stale stubs, clean workspace config 2026-04-03 13:03:49 +02:00