The repo's mana-image-gen used to be a Mac Mini–only service built on flux2.c with hard MPS+arm64 platform checks. The actual production image-gen runs on the Windows GPU server (RTX 3090) using HuggingFace diffusers + PyTorch CUDA + FLUX.1-schnell — completely different code that lived only at C:\mana\services\mana-image-gen\ on the GPU box. This commit pulls the Windows implementation into the repo and deletes the Mac one, so there's exactly one mana-image-gen and its source of truth is git rather than one folder on one machine.

Removed:
- setup.sh — Mac-only flux2.c installer with hard arm64 platform check
- app/main.py (Mac flux2.c subprocess wrapper version)
- app/flux_service.py (Mac flux2.c subprocess wrapper version)

Added (pulled from C:\mana\services\mana-image-gen\):
- app/main.py — FastAPI endpoints (/generate, /images/*, /cleanup)
- app/flux_service.py — diffusers FluxPipeline wrapper
- app/api_auth.py — ApiKeyMiddleware (GPU_API_KEY)
- app/vram_manager.py — shared VRAM accounting
- service.pyw — Windows runner used by the ManaImageGen scheduled task

Updated:
- main.py PORT default from 3025 → 3023 to match the production reality (the service.pyw runner already binds 3023 explicitly via uvicorn.run, but the source default should match so direct uvicorn invocations and local tests don't pick the wrong port)
- CLAUDE.md fully rewritten to describe the Windows/CUDA/diffusers stack
- README.md trimmed to a pointer at CLAUDE.md + the public URL
- .env.example written from scratch (didn't exist before — the service's .env on the GPU box was undocumented)

The setup-image-gen.sh launchd installer in scripts/mac-mini/ and the actual Mac Mini deployment will be cleaned up in the next commit, along with the rest of the Mac Mini AI service infrastructure.
# mana-image-gen
AI image generation microservice using FLUX models via HuggingFace diffusers on NVIDIA CUDA. Lives on the Windows GPU server (mana-server-gpu, RTX 3090).
> ⚠️ Earlier history: this directory used to contain a Mac Mini–only implementation built on `flux2.c` (MPS, Apple Silicon arm64). That version was removed when the service moved fully onto the Windows GPU. If you're looking for the old code, see git history before this commit.
## Tech Stack
| Layer | Technology |
|---|---|
| Runtime | Python 3.11 + uvicorn (Windows) |
| Framework | FastAPI |
| Inference | HuggingFace diffusers + PyTorch CUDA |
| Default model | FLUX.1-schnell (BFL, Apache 2.0, 4-step distilled) |
| GPU | NVIDIA RTX 3090 (24 GB VRAM) |
| Auth | GPU_API_KEY middleware (app/api_auth.py) |
| Process supervision | Windows Scheduled Task ManaImageGen (AtLogOn) |
**Port:** 3023
## Where it runs

| Host | Path on disk | Entrypoint |
|---|---|---|
| Windows GPU server (192.168.178.11) | C:\mana\services\mana-image-gen\ | service.pyw via Scheduled Task ManaImageGen |

The service is exposed publicly via Cloudflare Tunnel + the Mac Mini TCP-proxy (gpu-proxy.py):

```
Internet → Cloudflare → Mac Mini (gpu-proxy.py) → 192.168.178.11:3023
```

Public URL: https://gpu-img.mana.how
## Quick Start (Windows GPU)

```powershell
# As tills on mana-server-gpu
cd C:\mana\services\mana-image-gen
C:\mana\venvs\image-gen\Scripts\python.exe service.pyw

# Or kick the scheduled task
Start-ScheduledTask -TaskName "ManaImageGen"

# Health
curl http://localhost:3023/health
```

The Scheduled Task runs:

```
Execute:    C:\mana\venvs\image-gen\Scripts\python.exe
Arguments:  C:\mana\services\mana-image-gen\service.pyw
WorkingDir: C:\mana\services\mana-image-gen
```
## API Endpoints

| Method | Path | Description |
|---|---|---|
| GET | /health | Liveness + GPU + model status |
| GET | /models | Loaded model info |
| POST | /generate | Generate an image (returns {image_url, ...}) |
| GET | /images/* | Serve a generated image |
| DELETE | /images/* | Delete a generated image |
| POST | /cleanup?max_age_hours=24 | Sweep old images |
All non-health endpoints are gated by ApiKeyMiddleware — clients must send Authorization: Bearer $GPU_API_KEY (header name and verification details in app/api_auth.py).
### Generate request

```json
{
  "prompt": "A futuristic city skyline at sunset",
  "width": 1024,
  "height": 1024,
  "steps": 4,
  "seed": -1
}
```
## Code layout

```
services/mana-image-gen/
├── app/
│   ├── __init__.py
│   ├── main.py           # FastAPI endpoints
│   ├── flux_service.py   # diffusers pipeline + generate_image()
│   ├── api_auth.py       # ApiKeyMiddleware (GPU_API_KEY)
│   └── vram_manager.py   # shared VRAM accounting helper
└── service.pyw           # Windows runner (used by Scheduled Task)
```
## Configuration (.env on the Windows GPU box)

```ini
PORT=3023
IMAGE_MODEL_ID=black-forest-labs/FLUX.1-schnell
DEFAULT_STEPS=4
DEFAULT_WIDTH=1024
DEFAULT_HEIGHT=1024
MAX_STEPS=8
GUIDANCE_SCALE=0.0
GENERATION_TIMEOUT=120
OUTPUT_DIR=C:\mana\services\mana-image-gen\outputs
CORS_ORIGINS=https://mana.how,https://chat.mana.how
GPU_API_KEY=...              # cross-service auth, also used by mana-llm
```

The service.pyw runner loads .env from the service directory before starting uvicorn.
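The actual runner is service.pyw in this directory; purely as a sketch of that load-.env-then-bind sequence, such a script might look like this (`load_env` and its KEY=VALUE parsing rules are assumptions, not the real implementation):

```python
# service.pyw -- hypothetical sketch; the .pyw suffix makes pythonw.exe
# run it without opening a console window on the GPU box.
import os
from pathlib import Path

SERVICE_DIR = Path(__file__).resolve().parent


def load_env(path: Path) -> dict:
    """Parse simple KEY=VALUE lines (comments and blanks skipped) into os.environ."""
    loaded = {}
    if path.exists():
        for line in path.read_text().splitlines():
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            loaded[key.strip()] = value.strip()
    os.environ.update(loaded)
    return loaded


if __name__ == "__main__":
    load_env(SERVICE_DIR / ".env")
    import uvicorn

    # Binds 3023 explicitly, matching the production port noted above.
    uvicorn.run("app.main:app", host="0.0.0.0", port=3023)
```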
## Operations

```powershell
# Status
Get-ScheduledTask -TaskName "ManaImageGen" | Format-List TaskName, State
Get-NetTCPConnection -LocalPort 3023 -State Listen

# Restart
Stop-ScheduledTask -TaskName "ManaImageGen"
Start-ScheduledTask -TaskName "ManaImageGen"

# Logs
Get-Content C:\mana\services\mana-image-gen\service.log -Tail 50
```
## Model details
| Field | Value |
|---|---|
| Model | black-forest-labs/FLUX.1-schnell |
| Parameters | ~12B |
| License | Apache 2.0 (commercial use OK) |
| Weights size | ~24 GB on disk |
| VRAM footprint | ~12 GB (with the default precision/optimization settings) |
| Optimal sampling steps | 4 (distilled "schnell" variant) |
| HuggingFace gate | Requires HF login + license accept |
## Reference

- `docs/WINDOWS_GPU_SERVER_SETUP.md` — full Windows GPU box setup, all AI services, scheduled task setup, firewall rules, Cloudflare tunnel
- `docs/PORT_SCHEMA.md` — port assignments across services