From 7f382138a10336b5b25f01f041bcb5d9a212f559 Mon Sep 17 00:00:00 2001
From: Till JS
Date: Wed, 8 Apr 2026 16:46:14 +0200
Subject: [PATCH] fix(mana-llm): route Ollama through gpu-proxy instead of LAN IP
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The mana-service-llm container had OLLAMA_URL pointed at the GPU box's
LAN address (192.168.178.11:11434). On the Mac Mini host that route
works fine, but from inside any Colima container the entire
192.168.178.0/24 subnet gets synthesized RST — Colima's VM "claims" the
LAN range without being able to route to it, so every connect() returns
"Connection refused" before a packet ever leaves the box.

mana-llm started cleanly, reported the configured upstream as
"unhealthy", served an empty /v1/models list, and every chat completion
failed with "All connection attempts failed". The most visible
downstream effect: voice quick-add (parse-task, parse-habit) silently
degraded to its no-LLM fallback for everyone hitting the local stack —
same shape as a successful response, no error log, just no enrichment.

The Mac Mini already runs a gpu-proxy LaunchAgent (com.mana.gpu-proxy,
/Users/mana/gpu-proxy.py) that forwards 127.0.0.1:13434 →
192.168.178.11:11434 alongside several other GPU service ports.
Pointing OLLAMA_URL at host.docker.internal:13434 and adding the
host-gateway extra_hosts mapping puts mana-llm on the already-running
rail.

Verified end-to-end: from inside the container, GET
http://host.docker.internal:13434/api/tags now returns the full model
list (gemma3:4b, gemma3:12b, gemma3:27b, qwen2.5-coder:14b,
nomic-embed-text).

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 docker-compose.macmini.yml | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/docker-compose.macmini.yml b/docker-compose.macmini.yml
index c7a0b689f..874cea160 100644
--- a/docker-compose.macmini.yml
+++ b/docker-compose.macmini.yml
@@ -952,10 +952,24 @@ services:
     depends_on:
       redis:
         condition: service_healthy
+    # Ollama lives on the Windows GPU box at 192.168.178.11:11434, but
+    # Colima containers can't reach the LAN range — the entire
+    # 192.168.178.0/24 subnet gets synthesized RST from inside any
+    # container, even though the macOS host routes there fine. The
+    # gpu-proxy LaunchAgent on the Mac Mini host (com.mana.gpu-proxy,
+    # see /Users/mana/gpu-proxy.py) bridges 127.0.0.1:13434 → GPU
+    # box's 11434, so we go through host.docker.internal:13434 to
+    # reach Ollama. Without this hop the local mana-llm starts
+    # cleanly but reports an empty model list and every chat
+    # completion fails with "All connection attempts failed", which
+    # cascades into voice quick-add silently degrading to its no-LLM
+    # fallback for everyone hitting the local stack.
+    extra_hosts:
+      - "host.docker.internal:host-gateway"
     environment:
       PORT: 3025
       LOG_LEVEL: info
-      OLLAMA_URL: ${OLLAMA_URL:-http://192.168.178.11:11434}
+      OLLAMA_URL: ${OLLAMA_URL:-http://host.docker.internal:13434}
       OLLAMA_DEFAULT_MODEL: ${OLLAMA_MODEL:-gemma3:12b}
       OLLAMA_TIMEOUT: 120
       REDIS_URL: redis://redis:6379
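
For context on the hop this patch relies on, here is a minimal sketch
of the kind of loopback-to-LAN relay gpu-proxy.py performs. The actual
/Users/mana/gpu-proxy.py is not included in this patch, so the
structure, names, and port map below are illustrative assumptions;
only the 127.0.0.1:13434 → 192.168.178.11:11434 Ollama pair comes from
the commit message above.

    #!/usr/bin/env python3
    # Illustrative loopback -> LAN TCP relay in the spirit of gpu-proxy.py.
    # The real script is not shown in this patch; only the 13434 -> 11434
    # Ollama pair is taken from the commit message, the rest is assumed.
    import asyncio

    LISTEN_HOST = "127.0.0.1"   # loopback on the Mac Mini host
    GPU_BOX = "192.168.178.11"  # Windows GPU box on the LAN

    # local listen port -> remote port on the GPU box; the real proxy
    # forwards several other GPU service ports the same way.
    PORT_MAP = {13434: 11434}

    async def pump(reader, writer):
        # Copy bytes one direction until EOF, then tear down that side.
        try:
            while data := await reader.read(65536):
                writer.write(data)
                await writer.drain()
        finally:
            writer.close()

    def make_handler(remote_port):
        async def handle(client_reader, client_writer):
            # Dial the LAN side, then shuttle bytes both ways until
            # either end hangs up.
            remote_reader, remote_writer = await asyncio.open_connection(
                GPU_BOX, remote_port)
            await asyncio.gather(
                pump(client_reader, remote_writer),
                pump(remote_reader, client_writer),
                return_exceptions=True,  # a reset on one leg shouldn't crash the other
            )
        return handle

    async def main():
        servers = [
            await asyncio.start_server(make_handler(remote), LISTEN_HOST, local)
            for local, remote in PORT_MAP.items()
        ]
        await asyncio.gather(*(s.serve_forever() for s in servers))

    if __name__ == "__main__":
        asyncio.run(main())

With a relay like this listening on the host's loopback, the
host-gateway extra_hosts entry added above lets the container reach it
as host.docker.internal:13434, which is what the new OLLAMA_URL default
points at; the /api/tags check from the commit message exercises that
full path.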