mirror of
https://github.com/Memo-2023/mana-monorepo.git
synced 2026-05-14 20:01:09 +02:00
Mirror of github.com/Memo-2023/mana-monorepo
Replace the entire @mana/local-llm engine with a transformers.js-based
implementation backed by Google's Gemma 4 E2B (released 2026-04-02).
The external API of LocalLLMEngine — load(), generate(), prompt(),
extractJson(), classify(), onStatusChange(), isSupported() — is
preserved 1:1, so the /llm-test page, the playground module, and the
Svelte 5 reactive bindings in svelte.svelte.ts need no changes
beyond updating the default model key.
Why the engine swap: MLC has not (and as of today still hasn't)
published Gemma 4 builds for WebLLM. The webml-community team and
HuggingFace's onnx-community already have Gemma 4 E2B running in
the browser via transformers.js + WebGPU, with a documented
Gemma4ForConditionalGeneration class shipped in @huggingface/transformers
v4.0.0. Going through the ONNX route gets us the latest Google model
six days after release instead of waiting on MLC compilation.
Trade-offs accepted (discussed before this commit):
- transformers.js is a more generic ONNX runtime, so per-token
throughput will be ~20-40% lower than WebLLM would deliver for the
same model size. For a 2B model on a modern WebGPU device that's
still well above interactive latency.
- The JS bundle gains ~2-3 MB (the ONNX runtime). Negligible compared
to the 500 MB model download.
- transformers.js v4 is brand new (released alongside Gemma 4) so the
Gemma4ForConditionalGeneration code path has very little battle
testing yet. The risk is partially offset by webml-community's
reference implementation.
What changed file by file:
- packages/local-llm/package.json: drop @mlc-ai/web-llm, add
@huggingface/transformers ^4.0.0; bump version 0.1.0 → 0.2.0; rewrite
description.
- packages/local-llm/src/types.ts: add `dtype` field to ModelConfig
('fp32' | 'fp16' | 'q8' | 'q4' | 'q4f16') so each model can request
the quantization that matches its uploaded ONNX shards.
- packages/local-llm/src/models.ts: replace the old Qwen 2.5 + Gemma 2
registry with a single `gemma-4-e2b` entry pointing at
onnx-community/gemma-4-E2B-it-ONNX with q4f16 quantization. Future
models can be added by appending entries — the /llm-test picker
reads MODELS dynamically and picks them up automatically.
- packages/local-llm/src/cache.ts: replace the WebLLM-specific
hasModelInCache helper with a generic Cache API probe that looks for
`https://huggingface.co/{model_id}/resolve/main/tokenizer.json` in
any open cache. tokenizer.json is small, downloaded first, and
always present, so its presence is a reliable proxy for "model has
been loaded before".
- packages/local-llm/src/engine.ts: full rewrite. Internally we now
hold a transformers.js model + processor pair (created via
AutoProcessor.from_pretrained + Gemma4ForConditionalGeneration.from_pretrained
with `device: 'webgpu'`), and translate our LoadingStatus union from
the library's `progress_callback` shape. generate() applies Gemma's
chat template via the processor, runs model.generate() with optional
TextStreamer for streaming, then slices the prompt tokens off the
output tensor to compute per-call usage. The convenience methods
(prompt, extractJson, classify) are unchanged because they only call
generate() under the hood.
- packages/local-llm/src/generate.ts and status.svelte.ts: deleted.
These were orphaned from a much earlier engine API (referenced
`getEngine()` / `subscribe()` / `LlmState` symbols that haven't
existed for a while) and were never re-exported from index.ts —
they only showed up because `tsc --noEmit` was crawling the src
tree. Their functionality lives in engine.ts + svelte.svelte.ts now.
- apps/mana/apps/web/package.json: swap the direct dep from
@mlc-ai/web-llm to @huggingface/transformers. This is the same
trick we used for the previous adapter-node externals warning —
having it as a direct dep makes adapter-node's Rollup pass treat
it as external automatically.
- apps/mana/apps/web/vite.config.ts: swap ssr.external entry from
@mlc-ai/web-llm to @huggingface/transformers. Add a comment
explaining the why so the next person doesn't wonder.
- apps/mana/apps/web/src/routes/(app)/llm-test/+page.svelte: change
the default selectedModel from 'qwen-2.5-1.5b' to 'gemma-4-e2b'.
All other model display strings come from the MODELS registry, so
this is the single hard-coded reference that needed updating.
- pnpm-lock.yaml: regenerated. Confirmed @mlc-ai/web-llm is gone (0
references) and @huggingface/transformers is in (4 references).
CSP: no header changes needed. We already opened connect-src for
huggingface.co + cdn-lfs.huggingface.co + raw.githubusercontent.com
when fixing the WebLLM blockers earlier today, and 'wasm-unsafe-eval'
is already in script-src — both transformers.js (ONNX runtime) and
WebLLM (MLC runtime) need that. If transformers.js spawns its
inference into a Web Worker via a blob URL we may need to add
`worker-src 'self' blob:` once we hit the first runtime test, but
the existing CSP should be enough for the synchronous path.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
||
|---|---|---|
| .changeset | ||
| .claude | ||
| .github | ||
| .husky | ||
| apps | ||
| docker | ||
| docs | ||
| games | ||
| load-tests | ||
| NewAppIdeas/Roblox Reimagined | ||
| packages | ||
| patches | ||
| scripts | ||
| services | ||
| tests | ||
| .dockerignore | ||
| .editorconfig | ||
| .env.development | ||
| .env.macmini.example | ||
| .env.secrets.example | ||
| .gitignore | ||
| .npmrc | ||
| .nvmrc | ||
| .prettierignore | ||
| .prettierrc.json | ||
| CLAUDE.md | ||
| cloudflared-config.yml | ||
| docker-compose.dev.yml | ||
| docker-compose.macmini.yml | ||
| docker-compose.test.yml | ||
| eslint.config.mjs | ||
| gift-codes-2026-02-14.txt | ||
| lint-staged.config.js | ||
| package.json | ||
| playwright.config.ts | ||
| pnpm-lock.yaml | ||
| pnpm-workspace.yaml | ||
| README.md | ||
| TROUBLESHOOTING.md | ||
| turbo.json | ||
| vitest.config.ts | ||
Mana Monorepo
Monorepo containing all Mana projects — a self-hosted multi-app ecosystem with shared packages and unified tooling.
Projects
| Project | Description | Apps |
|---|---|---|
| mana | Multi-app ecosystem platform | Expo mobile, SvelteKit web |
| chat | AI chat application | NestJS backend, Expo mobile, SvelteKit web, Astro landing |
| todo | Task management | NestJS backend, SvelteKit web, Astro landing |
| calendar | Calendar & scheduling | NestJS backend, SvelteKit web, Astro landing |
| clock | Pomodoro & time tracking | NestJS backend, SvelteKit web, Astro landing |
| contacts | Contact management | NestJS backend, SvelteKit web |
| picture | AI image generation | NestJS backend, Expo mobile, SvelteKit web, Astro landing |
| cards | Card/deck management | NestJS backend, Expo mobile, SvelteKit web |
| zitare | Daily inspiration quotes | NestJS backend, Expo mobile, SvelteKit web, Astro landing |
| mukke | Music player | NestJS backend, SvelteKit web |
| planta | Plant care tracker | NestJS backend, SvelteKit web |
| storage | Cloud storage | NestJS backend, SvelteKit web |
| questions | Q&A with web search | SvelteKit web |
| skilltree | Skill tree visualization | NestJS backend, SvelteKit web |
| nutriphi | Nutrition tracking | NestJS backend, SvelteKit web |
| citycorners | City guide | NestJS backend, SvelteKit web, Astro landing |
| presi | Presentation tool | NestJS backend, SvelteKit web |
| photos | Photo management | NestJS backend, SvelteKit web |
Getting Started
Prerequisites
- Node.js 20+
- pnpm 9.15.0+
- Docker (for PostgreSQL, Redis, MinIO)
Installation
pnpm install
Development
# Start infrastructure (PostgreSQL, Redis, MinIO)
pnpm docker:up
# Start any app with auto DB setup
pnpm dev:chat:full
pnpm dev:todo:full
pnpm dev:calendar:full
pnpm dev:contacts:full
# Build & quality
pnpm run build
pnpm run type-check
pnpm run format
See CLAUDE.md for comprehensive development documentation.
Architecture
mana-monorepo/
├── apps/ # Product applications
├── services/ # Microservices (auth, search, LLM, bots)
├── packages/ # Shared packages
├── docker/ # Docker configuration
└── scripts/ # Development & deployment scripts
Tooling
- Package Manager: pnpm 9.15.0
- Build System: Turborepo
- Formatting: Prettier (tabs, single quotes, 100 char width)
- Hosting: Mac Mini (self-hosted) via Docker + Cloudflare Tunnel
- Analytics: Umami (stats.mana.how)
License
Private - All rights reserved