Mirror of https://github.com/Memo-2023/mana-monorepo.git, synced 2026-05-14 23:01:09 +02:00
fix(mana-llm): route Ollama through gpu-proxy instead of LAN IP
The mana-service-llm container had OLLAMA_URL pointed at the GPU box's LAN address (192.168.178.11:11434). On the Mac Mini host that route works fine, but from inside any Colima container the entire 192.168.178.0/24 subnet gets a synthesized RST: Colima's VM "claims" the LAN range without being able to route to it, so every connect() returns "Connection refused" before a packet ever leaves the box.

As a result, mana-llm started cleanly, reported the configured upstream as "unhealthy", served an empty /v1/models list, and every chat completion failed with "All connection attempts failed". The most visible downstream effect: voice quick-add (parse-task, parse-habit) silently degraded to its no-LLM fallback for everyone hitting the local stack. Same shape as a successful response, no error log, just no enrichment.

The Mac Mini already runs a gpu-proxy LaunchAgent (com.mana.gpu-proxy, /Users/mana/gpu-proxy.py) that forwards 127.0.0.1:13434 → 192.168.178.11:11434 alongside several other GPU service ports. Pointing OLLAMA_URL at host.docker.internal:13434 and adding the host-gateway extra_hosts mapping puts mana-llm on that already-running rail.

Verified end-to-end: from inside the container, GET http://host.docker.internal:13434/api/tags now returns the full model list (gemma3:4b, gemma3:12b, gemma3:27b, qwen2.5-coder:14b, nomic-embed-text).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
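The gpu-proxy script itself isn't part of this commit, so purely as orientation: a minimal sketch of the kind of TCP splice the commit message describes, listening on 127.0.0.1:13434 and forwarding to the GPU box's 11434. The two address pairs come from the commit message; the code is an assumption about the general shape, not the actual contents of /Users/mana/gpu-proxy.py.

# Hypothetical sketch of the forwarding pattern the commit message
# attributes to gpu-proxy.py. Only the listen/target addresses are
# taken from the commit; everything else is illustrative.
import asyncio

LISTEN_HOST, LISTEN_PORT = "127.0.0.1", 13434       # exposed by the LaunchAgent
TARGET_HOST, TARGET_PORT = "192.168.178.11", 11434  # Ollama on the GPU box

async def pump(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    # Copy bytes in one direction until EOF, then close the peer so the
    # opposite pump sees EOF too.
    try:
        while data := await reader.read(65536):
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()

async def handle(client_r: asyncio.StreamReader, client_w: asyncio.StreamWriter) -> None:
    # For each inbound connection, dial the GPU box and splice the two
    # sockets together in both directions.
    upstream_r, upstream_w = await asyncio.open_connection(TARGET_HOST, TARGET_PORT)
    await asyncio.gather(pump(client_r, upstream_w), pump(upstream_r, client_w))

async def main() -> None:
    server = await asyncio.start_server(handle, LISTEN_HOST, LISTEN_PORT)
    async with server:
        await server.serve_forever()

if __name__ == "__main__":
    asyncio.run(main())

The real script reportedly forwards several GPU service ports as well, which in this shape would just mean one start_server() call per (listen, target) pair.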
parent da6e2f39da
commit 7f382138a1
1 changed file with 15 additions and 1 deletion
@@ -952,10 +952,24 @@ services:
     depends_on:
       redis:
         condition: service_healthy
+    # Ollama lives on the Windows GPU box at 192.168.178.11:11434, but
+    # Colima containers can't reach the LAN range — the entire
+    # 192.168.178.0/24 subnet gets synthesized RST from inside any
+    # container, even though the macOS host routes there fine. The
+    # gpu-proxy LaunchAgent on the Mac Mini host (com.mana.gpu-proxy,
+    # see /Users/mana/gpu-proxy.py) bridges 127.0.0.1:13434 → GPU
+    # box's 11434, so we go through host.docker.internal:13434 to
+    # reach Ollama. Without this hop the local mana-llm starts
+    # cleanly but reports an empty model list and every chat
+    # completion fails with "All connection attempts failed", which
+    # cascades into voice quick-add silently degrading to its no-LLM
+    # fallback for everyone hitting the local stack.
+    extra_hosts:
+      - "host.docker.internal:host-gateway"
     environment:
       PORT: 3025
       LOG_LEVEL: info
-      OLLAMA_URL: ${OLLAMA_URL:-http://192.168.178.11:11434}
+      OLLAMA_URL: ${OLLAMA_URL:-http://host.docker.internal:13434}
       OLLAMA_DEFAULT_MODEL: ${OLLAMA_MODEL:-gemma3:12b}
       OLLAMA_TIMEOUT: 120
       REDIS_URL: redis://redis:6379
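For reference, the end-to-end check from the commit message, re-expressed as a stdlib-only probe that could be run inside the mana-llm container. The two URLs and the /api/tags endpoint (Ollama's model-list endpoint) are taken from the commit; the script itself is a hypothetical sketch, not something shipped in the repo.

# Probe both the old LAN route and the new gpu-proxy route from inside
# the container; /api/tags returns Ollama's installed-model list.
import json
import urllib.request

ROUTES = {
    "old (LAN, refused from Colima)": "http://192.168.178.11:11434/api/tags",
    "new (gpu-proxy via host-gateway)": "http://host.docker.internal:13434/api/tags",
}

for label, url in ROUTES.items():
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            models = [m["name"] for m in json.load(resp).get("models", [])]
        print(f"{label}: OK, {len(models)} models: {', '.join(models)}")
    except OSError as exc:  # ConnectionRefusedError, timeouts, URLError, ...
        print(f"{label}: FAILED ({exc})")

Per the commit message, the LAN route should still fail from inside Colima while the proxy route lists gemma3:4b, gemma3:12b, gemma3:27b, qwen2.5-coder:14b, and nomic-embed-text.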