managarten/services/mana-image-gen/CLAUDE.md
Till JS c7b4388cec feat(mana-image-gen): replace Mac flux2.c implementation with Windows GPU diffusers
The repo's mana-image-gen used to be a Mac Mini–only service built on
flux2.c with hard MPS+arm64 platform checks. The actual production
image-gen runs on the Windows GPU server (RTX 3090) using HuggingFace
diffusers + PyTorch CUDA + FLUX.1-schnell — completely different code
that lived only at C:\mana\services\mana-image-gen\ on the GPU box.

This commit pulls the Windows implementation into the repo and deletes
the Mac one, so there's exactly one mana-image-gen and its source of
truth is git rather than one folder on one machine.

Removed:
- setup.sh — Mac-only flux2.c installer with hard arm64 platform check
- app/main.py (Mac flux2.c subprocess wrapper version)
- app/flux_service.py (Mac flux2.c subprocess wrapper version)

Added (pulled from C:\mana\services\mana-image-gen\):
- app/main.py — FastAPI endpoints (/generate, /images/*, /cleanup)
- app/flux_service.py — diffusers FluxPipeline wrapper
- app/api_auth.py — ApiKeyMiddleware (GPU_API_KEY)
- app/vram_manager.py — shared VRAM accounting
- service.pyw — Windows runner used by the ManaImageGen scheduled task

Updated:
- main.py PORT default from 3025 → 3023 to match the production reality
  (the service.pyw runner already binds 3023 explicitly via uvicorn.run,
  but the source default should match so direct uvicorn invocations and
  local tests don't pick the wrong port)
- CLAUDE.md fully rewritten to describe the Windows/CUDA/diffusers stack
- README.md trimmed to a pointer at CLAUDE.md + the public URL
- .env.example written from scratch (didn't exist before — the service's
  .env on the GPU box was undocumented)

The setup-image-gen.sh launchd installer in scripts/mac-mini/ and the
actual Mac Mini deployment will be cleaned up in the next commit, along
with the rest of the Mac-Mini AI service infrastructure.
2026-04-08 13:02:42 +02:00


# mana-image-gen
AI image generation microservice using FLUX models via HuggingFace `diffusers` on NVIDIA CUDA. Lives on the Windows GPU server (`mana-server-gpu`, RTX 3090).
> ⚠️ **Earlier history**: this directory used to contain a Mac Mini-only
> implementation built on `flux2.c` (MPS, Apple Silicon arm64). That
> version was removed when the service moved fully onto the Windows GPU.
> If you're looking for the old code, see git history before this commit.
## Tech Stack
| Layer | Technology |
|-------|------------|
| **Runtime** | Python 3.11 + uvicorn (Windows) |
| **Framework** | FastAPI |
| **Inference** | HuggingFace `diffusers` + PyTorch CUDA |
| **Default model** | FLUX.1-schnell (BFL, Apache 2.0, 4-step distilled) |
| **GPU** | NVIDIA RTX 3090 (24 GB VRAM) |
| **Auth** | `GPU_API_KEY` middleware (`app/api_auth.py`) |
| **Process supervision** | Windows Scheduled Task `ManaImageGen` (AtLogOn) |
## Port: 3023
## Where it runs
| Host | Path on disk | Entrypoint |
|------|--------------|------------|
| Windows GPU server (`192.168.178.11`) | `C:\mana\services\mana-image-gen\` | `service.pyw` via Scheduled Task `ManaImageGen` |

The service is exposed publicly via Cloudflare Tunnel + the Mac Mini TCP-proxy (`gpu-proxy.py`):
```
Internet → Cloudflare → Mac Mini (gpu-proxy.py) → 192.168.178.11:3023
```
Public URL: `https://gpu-img.mana.how`
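
The Mac Mini hop in the chain above is a plain TCP forwarder. The following is a minimal sketch of that idea using `asyncio`. It is not the actual `gpu-proxy.py`, whose source is not part of this directory; the function names `pipe`, `handle_client` and `start_proxy`, and the listen address in the usage comment, are assumptions made for illustration.

```python
import asyncio


async def pipe(reader, writer):
    """Copy bytes from reader to writer until EOF, then close the writer."""
    try:
        while data := await reader.read(65536):
            writer.write(data)
            await writer.drain()
    except ConnectionError:
        pass  # peer went away mid-transfer; just stop copying
    finally:
        writer.close()


async def handle_client(client_reader, client_writer, upstream_host, upstream_port):
    """Open one upstream connection per client and relay in both directions."""
    try:
        up_reader, up_writer = await asyncio.open_connection(upstream_host, upstream_port)
    except OSError:
        client_writer.close()  # upstream unreachable: drop the client
        return
    await asyncio.gather(
        pipe(client_reader, up_writer),
        pipe(up_reader, client_writer),
    )


async def start_proxy(listen_host, listen_port, upstream_host, upstream_port):
    """Start listening and return the asyncio server object."""
    async def on_connect(reader, writer):
        await handle_client(reader, writer, upstream_host, upstream_port)

    return await asyncio.start_server(on_connect, listen_host, listen_port)


# Hypothetical usage on the proxy host (listen address and port are assumptions):
#
#   async def main():
#       server = await start_proxy("0.0.0.0", 3023, "192.168.178.11", 3023)
#       async with server:
#           await server.serve_forever()
#
#   asyncio.run(main())
```

Because the forwarder only relays bytes, the `Authorization` header passes through unchanged and is still checked on the GPU box.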
## Quick Start (Windows GPU)
```powershell
# As tills on mana-server-gpu
cd C:\mana\services\mana-image-gen
C:\mana\venvs\image-gen\Scripts\python.exe service.pyw
# Or kick the scheduled task
Start-ScheduledTask -TaskName "ManaImageGen"
# Health
curl http://localhost:3023/health
```
The Scheduled Task runs:
```
Execute: C:\mana\venvs\image-gen\Scripts\python.exe
Arguments: C:\mana\services\mana-image-gen\service.pyw
WorkingDir: C:\mana\services\mana-image-gen
```
## API Endpoints
| Method | Path | Description |
|--------|------|-------------|
| GET | `/health` | Liveness + GPU + model status |
| GET | `/models` | Loaded model info |
| POST | `/generate` | Generate an image (returns `{image_url, ...}`) |
| GET | `/images/{filename}` | Serve a generated image |
| DELETE | `/images/{filename}` | Delete a generated image |
| POST | `/cleanup?max_age_hours=24` | Sweep old images |

All non-health endpoints are gated by `ApiKeyMiddleware`: clients must send `Authorization: Bearer $GPU_API_KEY` (header name and verification details in `app/api_auth.py`).
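
To make the gating rule concrete, here is a hedged sketch of a pure-ASGI gate that lets `/health` through and requires a bearer token everywhere else. It is a hypothetical stand-in, not the contents of `app/api_auth.py`; the class name `ApiKeyGate`, the `open_paths` parameter and the 401 response body are assumptions.

```python
import hmac


class ApiKeyGate:
    """Hypothetical ASGI wrapper: pass open paths, require a bearer token otherwise."""

    def __init__(self, app, api_key, open_paths=("/health",)):
        self.app = app
        self.expected = f"Bearer {api_key}".encode("latin-1")
        self.open_paths = set(open_paths)

    async def __call__(self, scope, receive, send):
        # Non-HTTP scopes (lifespan etc.) and open paths go straight through.
        if scope["type"] != "http" or scope["path"] in self.open_paths:
            await self.app(scope, receive, send)
            return

        # ASGI delivers header names as lower-cased bytes.
        headers = dict(scope.get("headers", []))
        supplied = headers.get(b"authorization", b"")

        # Constant-time comparison avoids leaking the key through timing.
        if hmac.compare_digest(supplied, self.expected):
            await self.app(scope, receive, send)
            return

        await send({
            "type": "http.response.start",
            "status": 401,
            "headers": [(b"content-type", b"application/json")],
        })
        await send({"type": "http.response.body", "body": b'{"detail":"unauthorized"}'})
```

A wrapper of this shape could be applied as `app = ApiKeyGate(app, os.environ["GPU_API_KEY"])`; check `app/api_auth.py` for how the real middleware is registered and which paths it exempts.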
### Generate request
```json
{
  "prompt": "A futuristic city skyline at sunset",
  "width": 1024,
  "height": 1024,
  "steps": 4,
  "seed": -1
}
```
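
A client call can be sketched with only the standard library. The request body and the `Authorization: Bearer` header come from this document; the helper names `build_generate_request` and `generate`, and the assumption that the response is a JSON object containing `image_url`, are illustrative and should be checked against `app/main.py`.

```python
import json
import urllib.request


def build_generate_request(base_url, api_key, prompt,
                           width=1024, height=1024, steps=4, seed=-1):
    """Build (but do not send) a POST /generate request."""
    payload = {
        "prompt": prompt,
        "width": width,
        "height": height,
        "steps": steps,
        "seed": seed,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/generate",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def generate(base_url, api_key, prompt, timeout=120, **options):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_generate_request(base_url, api_key, prompt, **options)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical usage (requires a running service and a valid key):
#   result = generate("http://localhost:3023", os.environ["GPU_API_KEY"],
#                     "A futuristic city skyline at sunset")
#   print(result["image_url"])
```

The 120-second client timeout mirrors `GENERATION_TIMEOUT=120` from the configuration; a shorter client timeout would give up before the server does.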
## Code layout
```
services/mana-image-gen/
├── app/
│   ├── __init__.py
│   ├── main.py            # FastAPI endpoints
│   ├── flux_service.py    # diffusers pipeline + generate_image()
│   ├── api_auth.py        # ApiKeyMiddleware (GPU_API_KEY)
│   └── vram_manager.py    # shared VRAM accounting helper
└── service.pyw            # Windows runner (used by Scheduled Task)
```
## Configuration (`.env` on the Windows GPU box)
```env
PORT=3023
IMAGE_MODEL_ID=black-forest-labs/FLUX.1-schnell
DEFAULT_STEPS=4
DEFAULT_WIDTH=1024
DEFAULT_HEIGHT=1024
MAX_STEPS=8
GUIDANCE_SCALE=0.0
GENERATION_TIMEOUT=120
OUTPUT_DIR=C:\mana\services\mana-image-gen\outputs
CORS_ORIGINS=https://mana.how,https://chat.mana.how
# cross-service auth, also used by mana-llm
GPU_API_KEY=...
```
The `service.pyw` runner loads `.env` from the service directory before
starting uvicorn.
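
How the runner loads `.env` is not shown in this file. As a rough sketch of the general approach, a standard-library loader might look like the following; `parse_env` and `load_env` are hypothetical names, and the real `service.pyw` may use `python-dotenv` or different parsing rules.

```python
import os
from pathlib import Path


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blanks, comments and malformed lines."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment of the form "value # note".
        value = value.split(" #", 1)[0].strip()
        # Strip one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def load_env(path, environ=os.environ):
    """Load a .env file without overriding variables that are already set."""
    env_file = Path(path)
    if not env_file.is_file():
        return {}
    values = parse_env(env_file.read_text(encoding="utf-8"))
    for key, value in values.items():
        environ.setdefault(key, value)
    return values


# Hypothetical runner usage, before uvicorn is started:
#   load_env(Path(__file__).with_name(".env"))
#   port = int(os.environ.get("PORT", "3023"))
```

Using `setdefault` means a variable set on the Scheduled Task or in the shell takes precedence over the file, which is one common convention rather than a documented behavior of this service.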
## Operations
```powershell
# Status
Get-ScheduledTask -TaskName "ManaImageGen" | Format-List TaskName, State
Get-NetTCPConnection -LocalPort 3023 -State Listen
# Restart
Stop-ScheduledTask -TaskName "ManaImageGen"
Start-ScheduledTask -TaskName "ManaImageGen"
# Logs
Get-Content C:\mana\services\mana-image-gen\service.log -Tail 50
```
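
The same status checks can be scripted from Python, for example from a monitoring job on another host. This is a small sketch using only the standard library; `is_listening` and `fetch_health` are hypothetical helper names, and the shape of the `/health` response body is not assumed.

```python
import socket
import urllib.request


def is_listening(host="127.0.0.1", port=3023, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def fetch_health(base_url="http://localhost:3023", timeout=5.0):
    """Return (HTTP status, raw body text) from the unauthenticated /health endpoint."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as response:
        return response.status, response.read().decode("utf-8", errors="replace")


# Hypothetical usage:
#   if is_listening("192.168.178.11", 3023):
#       print(fetch_health("http://192.168.178.11:3023"))
```

A listening port only shows that uvicorn is up; `/health` is the endpoint that reports GPU and model status.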
## Model details
| Field | Value |
|-------|-------|
| Model | `black-forest-labs/FLUX.1-schnell` |
| Parameters | ~12B |
| License | Apache 2.0 (commercial use OK) |
| Weights size | ~24 GB on disk |
| VRAM footprint | ~12 GB (with the default precision/optimization settings) |
| Optimal sampling steps | 4 (distilled "schnell" variant) |
| HuggingFace gate | Requires HF login + license accept |
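
As a back-of-envelope check on the size figures in the table, the raw parameter footprint is roughly parameter count times bytes per parameter. The sketch below assumes 16-bit weights for the on-disk figure; that precision is an assumption, and it counts only the ~12B parameters listed above, not any auxiliary components.

```python
def weights_size_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate size in GB (10^9 bytes) of the raw parameters alone."""
    return params_billion * bytes_per_param


for label, width in [("16-bit (bf16/fp16)", 2), ("8-bit", 1)]:
    print(f"{label}: ~{weights_size_gb(12, width):.0f} GB")
# 16-bit (bf16/fp16): ~24 GB
# 8-bit: ~12 GB
```

Whether the ~12 GB VRAM figure reflects 8-bit weights, CPU offloading, or some other optimization is not stated in this file; `app/flux_service.py` is the place to confirm the actual settings.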
## Reference
- `docs/WINDOWS_GPU_SERVER_SETUP.md` — full Windows GPU box setup, all
AI services, scheduled task setup, firewall rules, Cloudflare tunnel
- `docs/PORT_SCHEMA.md` — port assignments across services