Mirror of https://github.com/Memo-2023/mana-monorepo.git, synced 2026-05-17 12:09:41 +02:00
The repo's mana-image-gen used to be a Mac Mini–only service built on flux2.c with hard MPS+arm64 platform checks. The actual production image-gen runs on the Windows GPU server (RTX 3090) using HuggingFace diffusers + PyTorch CUDA + FLUX.1-schnell — completely different code that lived only at C:\mana\services\mana-image-gen\ on the GPU box. This commit pulls the Windows implementation into the repo and deletes the Mac one, so there's exactly one mana-image-gen and its source of truth is git rather than one folder on one machine.

Removed:
- setup.sh — Mac-only flux2.c installer with hard arm64 platform check
- app/main.py (Mac flux2.c subprocess wrapper version)
- app/flux_service.py (Mac flux2.c subprocess wrapper version)

Added (pulled from C:\mana\services\mana-image-gen\):
- app/main.py — FastAPI endpoints (/generate, /images/*, /cleanup)
- app/flux_service.py — diffusers FluxPipeline wrapper
- app/api_auth.py — ApiKeyMiddleware (GPU_API_KEY)
- app/vram_manager.py — shared VRAM accounting
- service.pyw — Windows runner used by the ManaImageGen scheduled task

Updated:
- main.py PORT default from 3025 → 3023 to match the production reality (the service.pyw runner already binds 3023 explicitly via uvicorn.run, but the source default should match so direct uvicorn invocations and local tests don't pick the wrong port)
- CLAUDE.md fully rewritten to describe the Windows/CUDA/diffusers stack
- README.md trimmed to a pointer at CLAUDE.md + the public URL
- .env.example written from scratch (didn't exist before — the service's .env on the GPU box was undocumented)

The setup-image-gen.sh launchd installer in scripts/mac-mini/ and the actual Mac Mini deployment will be cleaned up in the next commit, along with the rest of the Mac-Mini AI service infrastructure.
147 lines · 4.4 KiB · Markdown
# mana-image-gen

AI image generation microservice using FLUX models via HuggingFace `diffusers` on NVIDIA CUDA. Lives on the Windows GPU server (`mana-server-gpu`, RTX 3090).

> ⚠️ **Earlier history**: this directory used to contain a Mac Mini–only
> implementation built on `flux2.c` (MPS, Apple Silicon arm64). That
> version was removed when the service moved fully onto the Windows GPU.
> If you're looking for the old code, see git history before this commit.

## Tech Stack

| Layer | Technology |
|-------|------------|
| **Runtime** | Python 3.11 + uvicorn (Windows) |
| **Framework** | FastAPI |
| **Inference** | HuggingFace `diffusers` + PyTorch CUDA |
| **Default model** | FLUX.1-schnell (BFL, Apache 2.0, 4-step distilled) |
| **GPU** | NVIDIA RTX 3090 (24 GB VRAM) |
| **Auth** | `GPU_API_KEY` middleware (`app/api_auth.py`) |
| **Process supervision** | Windows Scheduled Task `ManaImageGen` (AtLogOn) |

## Port: 3023

## Where it runs

| Host | Path on disk | Entrypoint |
|------|--------------|------------|
| Windows GPU server (`192.168.178.11`) | `C:\mana\services\mana-image-gen\` | `service.pyw` via Scheduled Task `ManaImageGen` |

The service is exposed publicly via Cloudflare Tunnel + the Mac Mini TCP-proxy (`gpu-proxy.py`):

```
Internet → Cloudflare → Mac Mini (gpu-proxy.py) → 192.168.178.11:3023
```

Public URL: `https://gpu-img.mana.how`
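`gpu-proxy.py` itself lives on the Mac Mini and is not part of this directory, so its exact code isn't shown here. A minimal sketch of a TCP forwarder in its spirit — splicing incoming connections through to the GPU box — could look like this (all names and the listen port are assumptions):

```python
import asyncio

# Hypothetical minimal TCP forwarder in the spirit of gpu-proxy.py.
# The real proxy on the Mac Mini may differ; host/port are assumptions.
UPSTREAM_HOST = "192.168.178.11"
UPSTREAM_PORT = 3023

async def _pipe(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    """Copy bytes one direction until EOF, then half-close the write side."""
    try:
        while data := await reader.read(65536):
            writer.write(data)
            await writer.drain()
    finally:
        if writer.can_write_eof():
            writer.write_eof()

def make_handler(host: str, port: int):
    """Return a per-connection handler that splices the client to host:port."""
    async def handle(client_reader, client_writer):
        up_reader, up_writer = await asyncio.open_connection(host, port)
        # Run both directions concurrently; tolerate one side failing.
        await asyncio.gather(
            _pipe(client_reader, up_writer),
            _pipe(up_reader, client_writer),
            return_exceptions=True,
        )
        up_writer.close()
        client_writer.close()
    return handle

async def serve(listen_port: int = 3023) -> None:
    server = await asyncio.start_server(
        make_handler(UPSTREAM_HOST, UPSTREAM_PORT), "0.0.0.0", listen_port)
    async with server:
        await server.serve_forever()

# To run the forwarder: asyncio.run(serve())
```

This forwards raw TCP, so TLS termination stays with Cloudflare and the HTTP auth check stays with the service itself.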
## Quick Start (Windows GPU)

```powershell
# As tills on mana-server-gpu
cd C:\mana\services\mana-image-gen
C:\mana\venvs\image-gen\Scripts\python.exe service.pyw

# Or kick the scheduled task
Start-ScheduledTask -TaskName "ManaImageGen"

# Health
curl http://localhost:3023/health
```

The Scheduled Task runs:

```
Execute:    C:\mana\venvs\image-gen\Scripts\python.exe
Arguments:  C:\mana\services\mana-image-gen\service.pyw
WorkingDir: C:\mana\services\mana-image-gen
```
## API Endpoints

| Method | Path | Description |
|--------|------|-------------|
| GET | `/health` | Liveness + GPU + model status |
| GET | `/models` | Loaded model info |
| POST | `/generate` | Generate an image (returns `{image_url, ...}`) |
| GET | `/images/{filename}` | Serve a generated image |
| DELETE | `/images/{filename}` | Delete a generated image |
| POST | `/cleanup?max_age_hours=24` | Sweep old images |

All non-health endpoints are gated by `ApiKeyMiddleware` — clients must send `Authorization: Bearer $GPU_API_KEY` (header name and verification details in `app/api_auth.py`).
### Generate request

```json
{
  "prompt": "A futuristic city skyline at sunset",
  "width": 1024,
  "height": 1024,
  "steps": 4,
  "seed": -1
}
```
## Code layout

```
services/mana-image-gen/
├── app/
│   ├── __init__.py
│   ├── main.py           # FastAPI endpoints
│   ├── flux_service.py   # diffusers pipeline + generate_image()
│   ├── api_auth.py       # ApiKeyMiddleware (GPU_API_KEY)
│   └── vram_manager.py   # shared VRAM accounting helper
└── service.pyw           # Windows runner (used by Scheduled Task)
```
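`vram_manager.py` is described above only as a shared VRAM accounting helper. A minimal sketch of what such accounting could look like — the class name, API, and the 24 GB budget are assumptions, and the real helper may coordinate across processes rather than threads:

```python
import threading

# Hypothetical sketch of shared VRAM accounting; app/vram_manager.py
# may use a different API or a cross-process mechanism.
class VramBudget:
    """Track reservations against a fixed VRAM budget (in MiB)."""

    def __init__(self, total_mib: int = 24_000):  # RTX 3090 ≈ 24 GB
        self.total = total_mib
        self.used = 0
        self._lock = threading.Lock()

    def reserve(self, mib: int) -> bool:
        """Reserve mib if it fits; return False instead of overcommitting."""
        with self._lock:
            if self.used + mib > self.total:
                return False
            self.used += mib
            return True

    def release(self, mib: int) -> None:
        """Give a reservation back; never go below zero."""
        with self._lock:
            self.used = max(0, self.used - mib)
```

The point of centralizing this is that the GPU box runs more than one AI service (the config notes `GPU_API_KEY` is shared with mana-llm), so each service needs to know how much of the 3090 it may claim.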
## Configuration (`.env` on the Windows GPU box)

```env
PORT=3023
IMAGE_MODEL_ID=black-forest-labs/FLUX.1-schnell
DEFAULT_STEPS=4
DEFAULT_WIDTH=1024
DEFAULT_HEIGHT=1024
MAX_STEPS=8
GUIDANCE_SCALE=0.0
GENERATION_TIMEOUT=120
OUTPUT_DIR=C:\mana\services\mana-image-gen\outputs
CORS_ORIGINS=https://mana.how,https://chat.mana.how
GPU_API_KEY=...  # cross-service auth, also used by mana-llm
```

The `service.pyw` runner loads `.env` from the service directory before starting uvicorn.
## Operations

```powershell
# Status
Get-ScheduledTask -TaskName "ManaImageGen" | Format-List TaskName, State
Get-NetTCPConnection -LocalPort 3023 -State Listen

# Restart
Stop-ScheduledTask -TaskName "ManaImageGen"
Start-ScheduledTask -TaskName "ManaImageGen"

# Logs
Get-Content C:\mana\services\mana-image-gen\service.log -Tail 50
```
## Model details

| Field | Value |
|-------|-------|
| Model | `black-forest-labs/FLUX.1-schnell` |
| Parameters | ~12B |
| License | Apache 2.0 (commercial use OK) |
| Weights size | ~24 GB on disk |
| VRAM footprint | ~12 GB (with the default precision/optimization settings) |
| Optimal sampling steps | 4 (distilled "schnell" variant) |
| HuggingFace gate | Requires HF login + license accept |

## Reference

- `docs/WINDOWS_GPU_SERVER_SETUP.md` — full Windows GPU box setup, all AI services, scheduled task setup, firewall rules, Cloudflare tunnel
- `docs/PORT_SCHEMA.md` — port assignments across services