managarten/packages/local-llm
Till JS 7f1513b5a3 fix(local-llm): handle null model.generate() return + bogus return_tensor
First end-to-end Gemma 4 inference attempt threw "Cannot read
properties of null (reading 'dims')" the moment a chat message was
sent. Two bugs piled on top of each other:

1. apply_chat_template() was being called with `return_tensor: 'pt'`,
   which is the Python `transformers` convention. The transformers.js
   equivalent is a plain boolean (true by default), so the string
   'pt' is not a recognized value. Older versions silently ignored it,
   but the v4 code path now produces a less predictable input shape
   when it sees the unknown value. Drop it.

2. model.generate() in transformers.js v4 returns null (not a tensor)
   when a streamer is attached. The previous engine code only attached
   a streamer if the caller passed an `onToken` callback, then
   unconditionally tried to slice the tensor return for token counting
   — which crashed because the chat tab DOES pass onToken for live
   streaming. The streamer collected the text fine, but generate()
   returned null and our tensor read blew up.
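
The crash in bug 2 comes down to reading `.dims` on a null return. A
minimal sketch of the guard, assuming a hypothetical `TensorLike`
stand-in (not the real transformers.js tensor type) and a hypothetical
helper name `sequenceLength`:

```typescript
// Hypothetical stand-in for whatever generate() returns; only the
// `dims` field from the error message is modelled here.
interface TensorLike {
  dims: number[];
}

// Returns the last dimension (assumed to be the sequence length), or
// null when generate() handed back nothing, e.g. with a streamer attached.
function sequenceLength(output: TensorLike | null): number | null {
  if (output === null || output.dims.length === 0) {
    return null;
  }
  return output.dims[output.dims.length - 1];
}
```

Callers treat a null result as "no tensor available" instead of
dereferencing it unconditionally.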

Restructure so the streamer is always attached and is the canonical
text channel. The tensor return is now only used for token counting
when present, and falls back to a chars/4 estimate when it isn't, so
the /llm-test UI still shows roughly meaningful prompt/completion
counts on either v3 (returns tensor) or v4 (returns null with
streamer). The user-facing GenerateResult.content now always comes
from the streamer's accumulated string instead of decoding the
tensor's sliced suffix, which is more robust across versions.
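
The token-count fallback described above could look roughly like the
sketch below. The names `TokenCounts`, `estimateTokens` and
`countTokens` are hypothetical, and rounding the chars/4 estimate up is
an assumption, not something the package is known to do:

```typescript
// Hypothetical stand-in for the tensor returned by generate() on v3.
interface TensorLike {
  dims: number[];
}

interface TokenCounts {
  promptTokens: number;
  completionTokens: number;
  estimated: boolean; // true when the chars/4 heuristic was used
}

// Rough heuristic: about four characters per token (rounding up is assumed).
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function countTokens(
  output: TensorLike | null,     // null on v4 when a streamer is attached
  promptTokens: number | null,   // exact prompt length, if known
  promptText: string,
  streamedText: string,          // text accumulated by the streamer
): TokenCounts {
  if (output !== null && promptTokens !== null && output.dims.length > 0) {
    // Exact path: total sequence length minus the prompt length.
    const total = output.dims[output.dims.length - 1];
    return {
      promptTokens,
      completionTokens: Math.max(0, total - promptTokens),
      estimated: false,
    };
  }
  // Fallback path: estimate both counts from character lengths.
  return {
    promptTokens: estimateTokens(promptText),
    completionTokens: estimateTokens(streamedText),
    estimated: true,
  };
}
```

Either way the displayed content comes from `streamedText`; the tensor
only refines the counts when it exists.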

Also wrap the model.generate() call in try/catch so that versions
of transformers.js that throw at end-of-streaming (after the
streamer has already delivered all tokens) don't lose the answer.
We only re-throw if the streamer collected nothing.
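
A sketch of that error handling, with the actual model call abstracted
behind a hypothetical `runGenerate` callback so the example does not
depend on any particular transformers.js signature:

```typescript
interface StreamedResult {
  content: string;  // always taken from the streamer's accumulated text
  output: unknown;  // whatever generate() returned, or null
}

async function generateWithStreamer(
  // Hypothetical wrapper around model.generate(); it must call `emit`
  // for every decoded text chunk.
  runGenerate: (emit: (chunk: string) => void) => Promise<unknown>,
  onToken?: (chunk: string) => void,
): Promise<StreamedResult> {
  let content = '';
  // The collector is always attached, whether or not the caller streams.
  const collect = (chunk: string): void => {
    content += chunk;
    onToken?.(chunk);
  };

  let output: unknown = null;
  try {
    output = await runGenerate(collect);
  } catch (err) {
    // A throw after the text has arrived is tolerated; a throw before
    // any text arrived is a real failure and is re-thrown.
    if (content.length === 0) {
      throw err;
    }
  }
  return { content, output };
}
```

With this shape, a late exception still yields the full answer, while a
failure before the first chunk surfaces to the caller unchanged.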

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 23:15:33 +02:00
src fix(local-llm): handle null model.generate() return + bogus return_tensor 2026-04-08 23:15:33 +02:00
package.json feat(local-llm): swap WebLLM/Qwen for transformers.js + Gemma 4 E2B 2026-04-08 22:22:32 +02:00
tsconfig.json feat(local-llm): add client-side LLM inference package with WebLLM 2026-04-02 01:53:54 +02:00