refactor: remove local AI services from Mac Mini, GPU-only architecture

- Deactivate Ollama, FLUX.2, and Telegram Bot LaunchAgents on Mac Mini
- Remove extra_hosts from mana-llm (no longer needs host.docker.internal)
- Update health-check.sh to monitor GPU server services instead of local services
- Update status.sh to show GPU server status instead of native service status
- Rewrite MAC_MINI_SERVER.md: remove ~400 lines of Ollama/FLUX/Bot docs,
  add GPU server architecture diagram and deactivation notes
- Update CAPACITY_PLANNING.md with post-offload numbers (~80-150 peak users)

The Mac Mini is now a pure hosting server (Web, API, DB, Sync).
All AI workloads run on the GPU server (RTX 3090) over the LAN.
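
For illustration only, a remote check of the kind health-check.sh now performs might look like the sketch below. It is not the script from this commit: `GPU_HOST`, its default address, and the helper names `classify_status` and `check_service` are hypothetical, and the example URL simply uses Ollama's default port.

```shell
#!/usr/bin/env bash
# Sketch only: GPU_HOST and the helper names are placeholders, not values from the repo.
GPU_HOST="${GPU_HOST:-192.168.0.50}"   # hypothetical LAN address of the GPU server

# Map an HTTP status code to "up" or "down".
# curl reports 000 when it cannot connect at all.
classify_status() {
  case "$1" in
    2??|3??) echo "up" ;;
    *)       echo "down" ;;
  esac
}

# Probe one HTTP endpoint with a 3-second timeout and print "<name>: up|down".
check_service() {
  local name="$1" url="$2" code
  code="$(curl -s -o /dev/null -m 3 -w '%{http_code}' "$url" 2>/dev/null)"
  echo "${name}: $(classify_status "${code:-000}")"
}

# Example call (11434 is Ollama's default port; adjust to the real deployment):
# check_service "ollama" "http://${GPU_HOST}:11434/api/tags"
```

Keeping the status classification separate from the network probe means the up/down logic can be exercised without reaching the GPU server.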

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Till JS 2026-03-28 21:23:37 +01:00
parent 99f15955fe
commit b45ddbbb83
5 changed files with 109 additions and 369 deletions

@@ -1311,8 +1311,6 @@ services:
       AUTO_FALLBACK_ENABLED: "true"
       OLLAMA_MAX_CONCURRENT: 5
       CORS_ORIGINS: https://playground.mana.how,https://mana.how,https://chat.mana.how
-    extra_hosts:
-      - "host.docker.internal:host-gateway"
     ports:
       - "3020:3020"
     healthcheck: