WebLLM requests were blocked by the CSP connect-src directive: model
config and weight shards live on huggingface.co (plus cdn-lfs.* for LFS),
and the WebGPU model_lib WASM comes from raw.githubusercontent.com
(binary-mlc-llm-libs). Also wire Gemma 2 2B/9B into the model registry so
/llm-test picks them up.
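
For illustration, a minimal sketch of how the connect-src value for the
origins named above might be assembled. buildConnectSrc and
WEBLLM_CONNECT_SOURCES are hypothetical names, not the project's code.
CSP host sources only accept a leading wildcard, so "cdn-lfs.*" cannot be
written literally; the sketch assumes the LFS hosts are subdomains of
huggingface.co.

```typescript
// Hypothetical list of origins WebLLM needs to reach, per the message above.
const WEBLLM_CONNECT_SOURCES: readonly string[] = [
  "'self'",
  "https://huggingface.co", // model config and weight shards
  "https://*.huggingface.co", // assumption: stands in for "cdn-lfs.*"
  "https://raw.githubusercontent.com", // WebGPU model_lib WASM
];

// Join the sources into a single connect-src directive, dropping duplicates
// while keeping first-seen order.
function buildConnectSrc(sources: readonly string[]): string {
  const unique = Array.from(new Set(sources));
  return `connect-src ${unique.join(" ")}`;
}

const connectSrcDirective = buildConnectSrc(WEBLLM_CONNECT_SOURCES);
```

The resulting string would be merged into whatever Content-Security-Policy
header or meta tag the app already emits.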
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move hasModelInCache into the local-llm package behind a dynamic import
wrapper so the browser-only dependency doesn't break server-side builds.
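
A rough sketch of what such a wrapper could look like, not the actual
local-llm code. makeHasModelInCache, CacheModule, and the isBrowser probe
are hypothetical names. The loader is injected so the browser-only module
is never referenced statically; in the app it would presumably be
() => import("@mlc-ai/web-llm").

```typescript
// Shape of the one function this wrapper needs from the browser-only module.
type CacheModule = { hasModelInCache: (modelId: string) => Promise<boolean> };

// Default environment probe: true only where a global `window` exists.
const defaultIsBrowser = (): boolean => "window" in globalThis;

// Returns a server-safe cache check. The loader runs lazily, on first use
// in a browser, so server-side builds never evaluate the dependency.
function makeHasModelInCache(
  load: () => Promise<CacheModule>,
  isBrowser: () => boolean = defaultIsBrowser,
): (modelId: string) => Promise<boolean> {
  return async (modelId) => {
    // On the server there is no browser cache to inspect.
    if (!isBrowser()) return false;
    const mod = await load();
    return mod.hasModelInCache(modelId);
  };
}

// Assumed usage in the app:
//   const hasModelInCache = makeHasModelInCache(() => import("@mlc-ai/web-llm"));
```

Returning false on the server is one possible choice; throwing would also
be reasonable if callers are expected to guard the call themselves.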
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add generate.ts with helpers for streaming chat completions, JSON
extraction, and text classification. Add status.svelte.ts, a Svelte 5
runes-based reactive wrapper around LLM engine state.
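
As an illustration of the JSON extraction idea, a minimal sketch that pulls
the first parseable JSON object out of free-form model output. extractJson
is a hypothetical name and this is one possible approach, not necessarily
what generate.ts does.

```typescript
// Scan for the first balanced {...} span that parses as JSON.
// Braces inside string literals are skipped so they do not affect depth.
function extractJson(text: string): unknown {
  for (
    let start = text.indexOf("{");
    start !== -1;
    start = text.indexOf("{", start + 1)
  ) {
    let depth = 0;
    let inString = false;
    let escaped = false;
    for (let i = start; i < text.length; i++) {
      const ch = text[i];
      if (inString) {
        if (escaped) escaped = false;
        else if (ch === "\\") escaped = true;
        else if (ch === '"') inString = false;
        continue;
      }
      if (ch === '"') inString = true;
      else if (ch === "{") depth++;
      else if (ch === "}" && --depth === 0) {
        try {
          return JSON.parse(text.slice(start, i + 1));
        } catch {
          break; // balanced but not valid JSON; try the next "{"
        }
      }
    }
  }
  return undefined; // no parseable object found
}
```

Small local models often wrap JSON in extra prose, which is why the sketch
scans for a candidate span instead of calling JSON.parse on the whole reply.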
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add a new shared package for browser-based LLM inference using Qwen 2.5
1.5B via WebLLM. Includes Svelte 5 reactive stores, engine management, and
type definitions for local AI features that run without server round trips.
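
To make the engine-state typing concrete, a hedged sketch of one way the
state could be modelled as a discriminated union with a pure transition
function. EngineState, EngineEvent, and reduceEngineState are hypothetical
names, and the progress/text fields assume a loader that reports a 0..1
fraction plus a status message; the package's real types may differ.

```typescript
// Hypothetical engine lifecycle, one variant per phase.
type EngineState =
  | { status: "idle" }
  | { status: "loading"; progress: number; text: string }
  | { status: "ready"; modelId: string }
  | { status: "error"; message: string };

// Hypothetical events emitted while a model is being loaded.
type EngineEvent =
  | { type: "load-start" }
  | { type: "progress"; progress: number; text: string }
  | { type: "loaded"; modelId: string }
  | { type: "failed"; message: string };

// Pure transition function; a reactive store can simply hold its result.
function reduceEngineState(state: EngineState, event: EngineEvent): EngineState {
  switch (event.type) {
    case "load-start":
      return { status: "loading", progress: 0, text: "" };
    case "progress":
      // Ignore stray progress reports unless a load is in flight.
      if (state.status !== "loading") return state;
      return {
        status: "loading",
        progress: Math.min(1, Math.max(0, event.progress)), // clamp to 0..1
        text: event.text,
      };
    case "loaded":
      return { status: "ready", modelId: event.modelId };
    case "failed":
      return { status: "error", message: event.message };
  }
}
```

Keeping the transitions pure means the Svelte-specific wrapper only has to
store the current state and reassign it on each event.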
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>