fix(mana-llm): route Ollama through gpu-proxy instead of LAN IP

The mana-service-llm container had OLLAMA_URL pointed at the GPU box's
LAN address (192.168.178.11:11434). On the Mac Mini host that route
works fine, but from inside any Colima container the entire
192.168.178.0/24 subnet gets synthesized RST — Colima's VM "claims"
the LAN range without being able to route to it, so every connect()
returns "Connection refused" before a packet ever leaves the box.

mana-llm started cleanly, reported the configured upstream as
"unhealthy", served an empty /v1/models list, and every chat
completion failed with "All connection attempts failed". The most
visible downstream effect: voice quick-add (parse-task, parse-habit)
silently degraded to its no-LLM fallback for everyone hitting the
local stack — same shape as a successful response, no error log,
just no enrichment.
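
For context on why the degradation was silent: the quick-add path follows the
usual try-the-LLM-then-fall-back pattern. The sketch below is a hypothetical
reconstruction (the real parse-task handler is not part of this commit, and
ParsedTask / llm.extract_task are made-up names), but it shows how a connection
failure yields a structurally valid, unenriched response with no error logged.

    from dataclasses import dataclass, field

    @dataclass
    class ParsedTask:                       # hypothetical response shape
        title: str
        due: str | None = None
        tags: list[str] = field(default_factory=list)

    async def parse_task(text: str, llm) -> ParsedTask:
        try:
            # With the broken OLLAMA_URL this raised
            # "All connection attempts failed" on every call.
            return await llm.extract_task(text)
        except Exception:
            # Fallback: same shape, no enrichment, nothing in the error log.
            return ParsedTask(title=text.strip())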

The Mac Mini already runs a gpu-proxy LaunchAgent
(com.mana.gpu-proxy, /Users/mana/gpu-proxy.py) that forwards
127.0.0.1:13434 → 192.168.178.11:11434 alongside several other GPU
service ports. Pointing OLLAMA_URL at host.docker.internal:13434 and
adding the host-gateway extra_hosts mapping puts mana-llm on the
already-running rail. Verified end-to-end: from inside the container,
GET http://host.docker.internal:13434/api/tags now returns the full
model list (gemma3:4b, gemma3:12b, gemma3:27b, qwen2.5-coder:14b,
nomic-embed-text).
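
For reference, the forwarding gpu-proxy.py performs can be pictured as a small
asyncio TCP relay. This is only an illustrative sketch under that assumption
(the real LaunchAgent script is not included in this commit and also bridges
other GPU service ports), using the one port pair named above.

    import asyncio

    UPSTREAM = ("192.168.178.11", 11434)   # Ollama on the GPU box
    LISTEN = ("127.0.0.1", 13434)          # reachable from containers via host-gateway

    async def pump(reader, writer):
        try:
            while data := await reader.read(65536):
                writer.write(data)
                await writer.drain()
        finally:
            writer.close()

    async def handle(client_reader, client_writer):
        upstream_reader, upstream_writer = await asyncio.open_connection(*UPSTREAM)
        # Copy bytes in both directions until either side closes.
        await asyncio.gather(
            pump(client_reader, upstream_writer),
            pump(upstream_reader, client_writer),
        )

    async def main():
        server = await asyncio.start_server(handle, *LISTEN)
        async with server:
            await server.serve_forever()

    asyncio.run(main())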

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Author: Till JS
Date:   2026-04-08 16:46:14 +02:00
Parent: da6e2f39da
Commit: 7f382138a1

@@ -952,10 +952,24 @@ services:
     depends_on:
       redis:
         condition: service_healthy
+    # Ollama lives on the Windows GPU box at 192.168.178.11:11434, but
+    # Colima containers can't reach the LAN range — the entire
+    # 192.168.178.0/24 subnet gets synthesized RST from inside any
+    # container, even though the macOS host routes there fine. The
+    # gpu-proxy LaunchAgent on the Mac Mini host (com.mana.gpu-proxy,
+    # see /Users/mana/gpu-proxy.py) bridges 127.0.0.1:13434 → GPU
+    # box's 11434, so we go through host.docker.internal:13434 to
+    # reach Ollama. Without this hop the local mana-llm starts
+    # cleanly but reports an empty model list and every chat
+    # completion fails with "All connection attempts failed", which
+    # cascades into voice quick-add silently degrading to its no-LLM
+    # fallback for everyone hitting the local stack.
+    extra_hosts:
+      - "host.docker.internal:host-gateway"
     environment:
       PORT: 3025
       LOG_LEVEL: info
-      OLLAMA_URL: ${OLLAMA_URL:-http://192.168.178.11:11434}
+      OLLAMA_URL: ${OLLAMA_URL:-http://host.docker.internal:13434}
       OLLAMA_DEFAULT_MODEL: ${OLLAMA_MODEL:-gemma3:12b}
       OLLAMA_TIMEOUT: 120
       REDIS_URL: redis://redis:6379
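
The end-to-end check described above can also be scripted from inside the
container with nothing but the stdlib; the snippet assumes Ollama's usual
/api/tags payload with a top-level "models" list.

    import json
    import urllib.request

    # Run inside the mana-llm container once the extra_hosts mapping is live.
    with urllib.request.urlopen("http://host.docker.internal:13434/api/tags", timeout=10) as resp:
        payload = json.load(resp)

    # Should now print gemma3:4b, gemma3:12b, gemma3:27b, qwen2.5-coder:14b,
    # and nomic-embed-text.
    for model in payload.get("models", []):
        print(model["name"])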