Mirror of https://github.com/Memo-2023/mana-monorepo.git (synced 2026-05-14 19:01:08 +02:00)
First end-to-end Gemma 4 inference attempt threw "Cannot read properties of null (reading 'dims')" the moment a chat message was sent. Two bugs piled on top of each other:

1. apply_chat_template() was being called with `return_tensor: 'pt'`, which is the Python `transformers` convention. transformers.js's equivalent option is just a boolean (the default), and the string 'pt' is unrecognized — older versions silently ignored it, but the v4 code path now produces a less predictable input shape when it sees the unknown value. Drop it.

2. model.generate() in transformers.js v4 returns null (not a tensor) when a streamer is attached. The previous engine code only attached a streamer if the caller passed an `onToken` callback, then unconditionally tried to slice the tensor return for token counting — which crashed because the chat tab DOES pass onToken for live streaming. The streamer collected the text fine, but generate() returned null and our tensor read blew up.

Restructure so the streamer is always attached and is the canonical text channel. The tensor return is now only used for token counting when present, and falls back to a chars/4 estimate when it isn't, so the /llm-test UI still shows roughly meaningful prompt/completion counts on either v3 (returns tensor) or v4 (returns null with streamer). The user-facing GenerateResult.content now always comes from the streamer's accumulated string instead of decoding the tensor's sliced suffix, which is more robust across versions.

Also wrap the model.generate() call in try/catch so that versions of transformers.js that throw at end-of-streaming (after the streamer has already delivered all tokens) don't lose the answer. We only re-throw if the streamer collected nothing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
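The restructured control flow described above can be sketched roughly as follows. This is a minimal illustration, not the repo's actual engine code: `GenerateResult`, `runGenerate`, and the `generateFn` stand-in for `model.generate()` are all hypothetical names, and the real transformers.js call takes a `TextStreamer` and model inputs rather than a bare callback.

```typescript
// Hypothetical sketch of the restructured generate path. `generateFn`
// stands in for transformers.js model.generate(): it may return a token
// tensor (v3), return null when a streamer is attached (v4), or throw
// after streaming has already finished.
interface GenerateResult {
  content: string;
  completionTokens: number;
}

type GenerateFn = (onText: (t: string) => void) => { dims: number[] } | null;

function runGenerate(
  generateFn: GenerateFn,
  onToken?: (t: string) => void,
): GenerateResult {
  // The streamer is ALWAYS attached and is the canonical text channel.
  let collected = "";
  const streamer = (t: string) => {
    collected += t;
    onToken?.(t); // forward to the caller's live-streaming callback, if any
  };

  let tensor: { dims: number[] } | null = null;
  try {
    tensor = generateFn(streamer);
  } catch (err) {
    // Some versions throw at end-of-streaming after the streamer has
    // already delivered every token; only re-throw if we got nothing.
    if (collected.length === 0) throw err;
  }

  // Token counting: use the tensor when present (v3), otherwise fall
  // back to a rough chars/4 estimate (v4 returns null with a streamer).
  const completionTokens =
    tensor !== null
      ? tensor.dims[tensor.dims.length - 1]
      : Math.ceil(collected.length / 4);

  // Content always comes from the streamer's accumulated string, never
  // from decoding a sliced tensor suffix.
  return { content: collected, completionTokens };
}
```

The key design point is that the streamer is no longer conditional on `onToken`: it always runs, so the accumulated string exists regardless of which version's return-value behavior we hit.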