managarten/services/mana-image-gen/CLAUDE.md
Till JS c7b4388cec feat(mana-image-gen): replace Mac flux2.c implementation with Windows GPU diffusers
The repo's mana-image-gen used to be a Mac Mini–only service built on
flux2.c with hard MPS+arm64 platform checks. The actual production
image-gen runs on the Windows GPU server (RTX 3090) using HuggingFace
diffusers + PyTorch CUDA + FLUX.1-schnell — completely different code
that lived only at C:\mana\services\mana-image-gen\ on the GPU box.

This commit pulls the Windows implementation into the repo and deletes
the Mac one, so there's exactly one mana-image-gen and its source of
truth is git rather than one folder on one machine.

Removed:
- setup.sh — Mac-only flux2.c installer with hard arm64 platform check
- app/main.py (Mac flux2.c subprocess wrapper version)
- app/flux_service.py (Mac flux2.c subprocess wrapper version)

Added (pulled from C:\mana\services\mana-image-gen\):
- app/main.py — FastAPI endpoints (/generate, /images/*, /cleanup)
- app/flux_service.py — diffusers FluxPipeline wrapper
- app/api_auth.py — ApiKeyMiddleware (GPU_API_KEY)
- app/vram_manager.py — shared VRAM accounting
- service.pyw — Windows runner used by the ManaImageGen scheduled task

Updated:
- main.py PORT default from 3025 → 3023 to match the production reality
  (the service.pyw runner already binds 3023 explicitly via uvicorn.run,
  but the source default should match so direct uvicorn invocations and
  local tests don't pick the wrong port)
- CLAUDE.md fully rewritten to describe the Windows/CUDA/diffusers stack
- README.md trimmed to a pointer at CLAUDE.md + the public URL
- .env.example written from scratch (didn't exist before — the service's
  .env on the GPU box was undocumented)

The setup-image-gen.sh launchd installer in scripts/mac-mini/ and the
actual Mac Mini deployment will be cleaned up in the next commit, along
with the rest of the Mac-Mini AI service infrastructure.
2026-04-08 13:02:42 +02:00


# mana-image-gen

AI image generation microservice using FLUX models via HuggingFace diffusers on NVIDIA CUDA. Lives on the Windows GPU server (mana-server-gpu, RTX 3090).

⚠️ Earlier history: this directory used to contain a Mac Mini-only implementation built on flux2.c (MPS, Apple Silicon arm64). That version was removed when the service moved fully onto the Windows GPU. If you're looking for the old code, see git history before this commit.

## Tech Stack

| Layer | Technology |
| --- | --- |
| Runtime | Python 3.11 + uvicorn (Windows) |
| Framework | FastAPI |
| Inference | HuggingFace diffusers + PyTorch CUDA |
| Default model | FLUX.1-schnell (BFL, Apache 2.0, 4-step distilled) |
| GPU | NVIDIA RTX 3090 (24 GB VRAM) |
| Auth | `GPU_API_KEY` middleware (`app/api_auth.py`) |
| Process supervision | Windows Scheduled Task `ManaImageGen` (AtLogOn) |

Port: 3023

## Where it runs

| Host | Path on disk | Entrypoint |
| --- | --- | --- |
| Windows GPU server (192.168.178.11) | `C:\mana\services\mana-image-gen\` | `service.pyw` via Scheduled Task `ManaImageGen` |

The service is exposed publicly via Cloudflare Tunnel + the Mac Mini TCP-proxy (gpu-proxy.py):

```
Internet → Cloudflare → Mac Mini (gpu-proxy.py) → 192.168.178.11:3023
```

Public URL: https://gpu-img.mana.how
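The Mac Mini hop is a plain TCP forwarder. A minimal asyncio sketch of that role (hypothetical: the real gpu-proxy.py is the source of truth, and `start_proxy` is an illustrative name):

```python
import asyncio

async def _pipe(reader, writer):
    # Copy bytes one direction until the peer closes.
    try:
        while data := await reader.read(65536):
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()

async def start_proxy(listen_port, upstream_host, upstream_port):
    # For each client connection, dial the upstream (the GPU box)
    # and shuttle bytes in both directions until either side closes.
    async def handle(client_r, client_w):
        up_r, up_w = await asyncio.open_connection(upstream_host, upstream_port)
        await asyncio.gather(_pipe(client_r, up_w), _pipe(up_r, client_w))

    return await asyncio.start_server(handle, "127.0.0.1", listen_port)
```

In production the proxy listens on the Mac Mini and forwards to 192.168.178.11:3023, with the Cloudflare Tunnel terminating TLS in front of it.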

## Quick Start (Windows GPU)

```powershell
# As tills on mana-server-gpu
cd C:\mana\services\mana-image-gen
C:\mana\venvs\image-gen\Scripts\python.exe service.pyw

# Or kick the scheduled task
Start-ScheduledTask -TaskName "ManaImageGen"

# Health
curl http://localhost:3023/health
```

The Scheduled Task runs:

```
Execute:    C:\mana\venvs\image-gen\Scripts\python.exe
Arguments:  C:\mana\services\mana-image-gen\service.pyw
WorkingDir: C:\mana\services\mana-image-gen
```

## API Endpoints

| Method | Path | Description |
| --- | --- | --- |
| GET | `/health` | Liveness + GPU + model status |
| GET | `/models` | Loaded model info |
| POST | `/generate` | Generate an image (returns `{image_url, ...}`) |
| GET | `/images/(unknown)` | Serve a generated image |
| DELETE | `/images/(unknown)` | Delete a generated image |
| POST | `/cleanup?max_age_hours=24` | Sweep old images |
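The `/cleanup` sweep amounts to an mtime check over the output directory. A minimal sketch (assumption: the actual handler's file matching and return shape may differ; `cleanup_outputs` is an illustrative name):

```python
import time
from pathlib import Path

def cleanup_outputs(output_dir, max_age_hours=24):
    """Delete generated images older than max_age_hours; return count removed."""
    cutoff = time.time() - max_age_hours * 3600
    removed = 0
    for f in Path(output_dir).glob("*.png"):
        if f.stat().st_mtime < cutoff:
            f.unlink()
            removed += 1
    return removed
```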

All non-health endpoints are gated by `ApiKeyMiddleware` — clients must send `Authorization: Bearer $GPU_API_KEY` (header name and verification details in `app/api_auth.py`).
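The check itself boils down to a constant-time bearer-token comparison. A sketch of what the middleware likely does (hypothetical; see `app/api_auth.py` for the real logic):

```python
import hmac

def check_bearer(auth_header, expected_key):
    # Reject missing or non-Bearer Authorization headers outright.
    if not auth_header or not auth_header.startswith("Bearer "):
        return False
    token = auth_header[len("Bearer "):]
    # Constant-time comparison avoids leaking the key via timing.
    return hmac.compare_digest(token, expected_key)
```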

### Generate request

```json
{
  "prompt": "A futuristic city skyline at sunset",
  "width": 1024,
  "height": 1024,
  "steps": 4,
  "seed": -1
}
```
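A client call pairs that body with the bearer header. A stdlib-only sketch (the endpoint and field names come from the table above; `build_generate_request` is an illustrative helper, not part of the service):

```python
import json
import urllib.request

def build_generate_request(base_url, api_key, prompt, width=1024, height=1024,
                           steps=4, seed=-1):
    # Assemble a POST /generate request with the required bearer auth.
    body = json.dumps({"prompt": prompt, "width": width, "height": height,
                       "steps": steps, "seed": seed}).encode()
    return urllib.request.Request(
        f"{base_url}/generate",
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )

# To send: urllib.request.urlopen(build_generate_request(
#     "https://gpu-img.mana.how", api_key, "A futuristic city skyline at sunset"))
```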

## Code layout

```
services/mana-image-gen/
├── app/
│   ├── __init__.py
│   ├── main.py            # FastAPI endpoints
│   ├── flux_service.py    # diffusers pipeline + generate_image()
│   ├── api_auth.py        # ApiKeyMiddleware (GPU_API_KEY)
│   └── vram_manager.py    # shared VRAM accounting helper
└── service.pyw            # Windows runner (used by Scheduled Task)
```
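The shared-accounting idea behind `app/vram_manager.py` (coordinating the 24 GB budget with other GPU services such as mana-llm) can be sketched like this; the class and method names are illustrative, not the actual API:

```python
import threading

class VramBudget:
    """Track reservations against a fixed VRAM budget, in MiB."""

    def __init__(self, total_mib):
        self.total = total_mib
        self.used = 0
        self._lock = threading.Lock()

    def reserve(self, mib):
        # Atomically claim VRAM; refuse if the budget would be exceeded.
        with self._lock:
            if self.used + mib > self.total:
                return False
            self.used += mib
            return True

    def release(self, mib):
        with self._lock:
            self.used = max(0, self.used - mib)
```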

## Configuration (.env on the Windows GPU box)

```ini
PORT=3023
IMAGE_MODEL_ID=black-forest-labs/FLUX.1-schnell
DEFAULT_STEPS=4
DEFAULT_WIDTH=1024
DEFAULT_HEIGHT=1024
MAX_STEPS=8
GUIDANCE_SCALE=0.0
GENERATION_TIMEOUT=120
OUTPUT_DIR=C:\mana\services\mana-image-gen\outputs
CORS_ORIGINS=https://mana.how,https://chat.mana.how
GPU_API_KEY=...                # cross-service auth, also used by mana-llm
```

The service.pyw runner loads .env from the service directory before starting uvicorn.
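That startup sequence can be sketched as follows (hypothetical reconstruction; the real service.pyw on the GPU box is the source of truth, and `load_env` is an illustrative helper):

```python
import os
from pathlib import Path

def load_env(path):
    # Minimal KEY=VALUE parser; already-set environment variables win.
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        os.environ.setdefault(key.strip(), value.strip())

def main():
    load_env(Path(__file__).parent / ".env")
    import uvicorn  # imported late so load_env runs first
    uvicorn.run("app.main:app", host="0.0.0.0",
                port=int(os.environ.get("PORT", "3023")))

# The real runner invokes main() when launched by the scheduled task.
```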

## Operations

```powershell
# Status
Get-ScheduledTask -TaskName "ManaImageGen" | Format-List TaskName, State
Get-NetTCPConnection -LocalPort 3023 -State Listen

# Restart
Stop-ScheduledTask -TaskName "ManaImageGen"
Start-ScheduledTask -TaskName "ManaImageGen"

# Logs
Get-Content C:\mana\services\mana-image-gen\service.log -Tail 50
```

## Model details

| Field | Value |
| --- | --- |
| Model | black-forest-labs/FLUX.1-schnell |
| Parameters | ~12B |
| License | Apache 2.0 (commercial use OK) |
| Weights size | ~24 GB on disk |
| VRAM footprint | ~12 GB (with the default precision/optimization settings) |
| Optimal sampling steps | 4 (distilled "schnell" variant) |
| HuggingFace gate | Requires HF login + license accept |

## Reference

- `docs/WINDOWS_GPU_SERVER_SETUP.md` — full Windows GPU box setup, all AI services, scheduled task setup, firewall rules, Cloudflare tunnel
- `docs/PORT_SCHEMA.md` — port assignments across services