fix(macmini): clean up container conflicts in build-app.sh restart cycle

Hit "container name already in use" / "removal in progress" errors
three times during today's Phase 5 deploys. The previous restart
pattern was just `compose up -d --no-deps`, which fails when:

  1. A previous interrupted recreate left a stale container under
     the canonical name. The new `up` tries to claim the name and
     gets a conflict.
  2. Compose's recovery from #1 sometimes creates a hash-prefixed
     orphan container (`<hash>_<container_name>`), which then
     blocks the next clean run too.
  3. Even `--force-recreate` can't always handle the case because
     the old container is in the middle of being removed when the
     new one is being created (race).

Three-step replacement that's reliable across all three failure modes:

  Step 1 — `docker compose rm -fs SERVICES`
    Stops + force-removes the canonical compose-managed container.
    Idempotent: does nothing if already gone. The script pipes the
    output through `grep -v` to drop the "No stopped containers"
    log noise, so the deploy log stays clean.

  Step 2 — orphan sweep via `docker rm -f`
    For each service, look up its container_name from the
    compose config (falls back to the service name if not set),
    then `docker ps -aq --filter name=^${cname}$` for the canonical
    one and `name=_${cname}$` for hash-prefixed orphans. Anything
    found gets nuked. This catches the case where compose's own
    state has lost track of an orphan it created earlier.

  Step 3 — `docker compose up -d --no-deps --remove-orphans`
    Creates the fresh container. The `--remove-orphans` flag also
    silences the "Found orphan containers ([mana-game-whopixels])"
    warning we kept seeing — that's a leftover from a removed
    service that nobody had cleaned up.
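
The two name filters in Step 2 are regular expressions matched against
container names. A minimal sketch of which names each pattern selects,
assuming hypothetical sample names (only `mana-app-web` comes from this
commit) and using `grep -E` as a stand-in for `docker ps --filter` so it
runs without a Docker daemon:

```shell
#!/usr/bin/env bash
# Hypothetical container names; only mana-app-web appears in this commit.
cname="mana-app-web"
names="mana-app-web
a1b2c3d4e5f6_mana-app-web
mana-app-web-old
other-mana-app-web"

# Canonical container: anchored on both ends, like name=^${cname}$.
canonical=$(printf '%s\n' "$names" | grep -E "^${cname}\$" || true)

# Hash-prefixed orphan: underscore + name anchored at the end,
# like name=_${cname}$.
orphans=$(printf '%s\n' "$names" | grep -E "_${cname}\$" || true)

echo "canonical: $canonical"
echo "orphans:   $orphans"
```

The anchors keep the sweep from touching containers that merely contain
the name as a substring, such as the `-old` and `other-` samples here.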

The container_name extraction uses awk on `compose config` output
(verified locally: `mana-web` → `mana-app-web`) so the script doesn't
need a hard-coded service→container mapping.
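
A self-contained sketch of that lookup, run against a hypothetical
excerpt of `compose config` output instead of the real file. The
`mana-web` → `mana-app-web` pair is from this commit; the other
services, images, and the `lookup_cname` helper name are made up for
illustration. It assumes services are indented two spaces in the
rendered config:

```shell
#!/usr/bin/env bash
# Hypothetical excerpt of `docker compose config` output.
config='name: mana
services:
  mana-auth:
    image: mana-auth:latest
  mana-web:
    container_name: mana-app-web
    image: mana-web:latest
networks:
  default:
    name: mana_default'

# Print the container_name of service "$1", or nothing if it is not set.
lookup_cname() {
  printf '%s\n' "$config" | awk -v s="$1:" '
    $0 ~ "^  "s {found=1; next}     # entered the block of the service
    found && /^  [a-z]/ {found=0}   # reached the next sibling key: stop
    found && /container_name:/ {print $2; exit}
  '
}

web=$(lookup_cname mana-web)
auth=$(lookup_cname mana-auth)

# Same fallback as the script: use the service name if nothing was found.
echo "mana-web  -> ${web:-mana-web}"
echo "mana-auth -> ${auth:-mana-auth}"
```

A service without an explicit `container_name` yields an empty result,
which is why the script falls back to the service name.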

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Till JS 2026-04-09 20:22:52 +02:00
parent 3c7bfc6a00
commit 83eaf71e9f

@@ -191,7 +191,44 @@ build_services() {
     $DOCKER compose -f "$COMPOSE_FILE" build --no-cache "${services[@]}" 2>&1
     echo ""
     echo "=== Restarting: ${services[*]} ==="
-    $DOCKER compose -f "$COMPOSE_FILE" up -d --no-deps "${services[@]}" 2>&1
+    # Tear down existing containers BEFORE the up cycle. We hit "container
+    # name already in use" errors repeatedly during the Phase 5 deploys:
+    # a previous interrupted recreate would leave a stale container that
+    # the next `up -d` couldn't replace, even with --force-recreate.
+    #
+    # The three-step pattern (rm -fs, orphan sweep, up -d) is reliable because:
+    #   1. `compose rm -fs` stops + force-removes the canonical container
+    #      managed by compose. Idempotent: does nothing if already gone.
+    #   2. A separate `docker rm -f` pass catches hash-prefixed orphans
+    #      (`<hash>_<container_name>`) that compose sometimes creates when
+    #      a previous recreate failed mid-cycle. Those aren't tracked by
+    #      compose's own state, so `compose rm` misses them.
+    #   3. `up -d --remove-orphans` then creates a clean new container
+    #      and silences the "Found orphan containers" warning we kept
+    #      seeing for the unrelated mana-game-whopixels leftover.
+    $DOCKER compose -f "$COMPOSE_FILE" rm -fs "${services[@]}" 2>&1 \
+        | grep -v 'No stopped containers' || true
+    for svc in "${services[@]}"; do
+        # Map compose service name → container_name from the compose config.
+        # Falls back to the service name itself if container_name isn't set.
+        local cname
+        cname=$($DOCKER compose -f "$COMPOSE_FILE" config 2>/dev/null \
+            | awk -v s="$svc:" '
+                $0 ~ "^  "s {found=1; next}
+                found && /^  [a-z]/ {found=0}
+                found && /container_name:/ {print $2; exit}
+            ')
+        cname="${cname:-$svc}"
+        local orphans
+        orphans=$($DOCKER ps -aq --filter "name=^${cname}$" 2>/dev/null || true)
+        orphans+=" $($DOCKER ps -aq --filter "name=_${cname}$" 2>/dev/null || true)"
+        orphans=$(echo "$orphans" | tr ' ' '\n' | grep -v '^$' || true)
+        if [ -n "$orphans" ]; then
+            echo "  Removing leftover containers for $svc: $(echo $orphans | tr '\n' ' ')"
+            echo "$orphans" | xargs -r $DOCKER rm -f 2>/dev/null || true
+        fi
+    done
+    $DOCKER compose -f "$COMPOSE_FILE" up -d --no-deps --remove-orphans "${services[@]}" 2>&1
 }
 # --- Main ---