feat(wardrobe,picture): Google Nano Banana as a Try-On option

Add Google's Gemini image-edit family (Nano Banana) as a user-
selectable model for Wardrobe Try-On next to the existing OpenAI
path. The Solo and Outfit Try-On buttons now expose three concrete
choices:

  - openai/gpt-image-2          (default, falls back to gpt-image-1
                                 server-side when the org isn't
                                 verified)
  - google/gemini-3-pro-image-preview   (Nano Banana Pro — premium
                                 identity / character consistency)
  - google/gemini-3.1-flash-image-preview (Nano Banana 2 — newest,
                                 fast, cheapest)

All three accept multi-image refs (face + body + garment) through
the same /api/v1/picture/generate-with-reference endpoint; the only
differences are the provider-specific request/response shape and
the model-id routing.
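
A minimal sketch of the model-id routing described above, assuming the
id is always `provider/model`. `routeEditModel` and `RoutedModel` are
hypothetical names for illustration, not the identifiers used in
routes.ts:

```typescript
// Hypothetical helper: split "provider/model" and accept only the two
// provider prefixes this commit supports for edits.
type Provider = 'openai' | 'google';

interface RoutedModel {
  provider: Provider;
  modelName: string;
}

function routeEditModel(modelId: string): RoutedModel {
  const slash = modelId.indexOf('/');
  const provider = slash === -1 ? '' : modelId.slice(0, slash);
  const modelName = slash === -1 ? '' : modelId.slice(slash + 1);

  if (provider === 'openai' || provider === 'google') {
    if (modelName !== '') {
      return { provider, modelName };
    }
  }
  // Everything else is rejected, mirroring the "not supported for edits" guard.
  throw new Error(`Model "${modelId}" is not supported for edits`);
}
```

The returned `provider` would then pick between callOpenAiEdits and
callGeminiEdits, with `modelName` passed through unchanged.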

Server (apps/api/src/modules/picture/routes.ts):
- Guard now accepts `openai/*` and `google/*` prefixes and rejects
  everything else as "not supported for edits". Each provider's key
  is validated separately so missing GEMINI_API_KEY doesn't break
  OpenAI calls and vice versa.
- New `callGeminiEdits(modelName)` helper mirrors the shape of
  callOpenAiEdits: encodes the normalized PNG refs as base64
  inline_data parts, POSTs to
  generativelanguage.googleapis.com/v1beta/models/{model}:generateContent
  with responseModalities=["TEXT","IMAGE"] and imageConfig
  (aspectRatio + imageSize), pulls the generated image out of
  candidates[].content.parts[].inlineData.
- Our internal size strings map cleanly: 1024x1024 → 1:1 / 1K,
  1024x1536 → 2:3 / 1K, 1536x1024 → 3:2 / 1K. Gemini 1K is enough
  for the thumbnail sizes Wardrobe renders; going higher bloats
  payload without visible gain.
- creditsFor() gains a google/ branch proportional to upstream
  pricing (pro ≈ 18, 3.1-flash ≈ 6, 2.5-flash ≈ 5).
- Response `model` reports `${provider}/${modelUsed}` so the picture
  row's model metadata is accurate across providers.
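
The request and response handling above can be sketched roughly as
follows. This makes no network call and is not the code in routes.ts:
`buildGeminiEditBody`, `extractInlineImage`, and the interface names
are hypothetical, and nesting `responseModalities` / `imageConfig`
under `generationConfig` is an assumption about the upstream request
shape:

```typescript
// Size mapping taken from the list above; every entry uses 1K.
const SIZE_TO_IMAGE_CONFIG: Record<string, { aspectRatio: string; imageSize: string }> = {
  '1024x1024': { aspectRatio: '1:1', imageSize: '1K' },
  '1024x1536': { aspectRatio: '2:3', imageSize: '1K' },
  '1536x1024': { aspectRatio: '3:2', imageSize: '1K' },
};

// Hypothetical builder: prompt text first, then one inline_data part
// per normalized PNG reference (already base64-encoded).
function buildGeminiEditBody(prompt: string, pngRefsBase64: string[], size: string) {
  const imageConfig = SIZE_TO_IMAGE_CONFIG[size];
  if (!imageConfig) {
    throw new Error(`Unsupported size: ${size}`);
  }
  return {
    contents: [
      {
        parts: [
          { text: prompt },
          ...pngRefsBase64.map((data) => ({
            inline_data: { mime_type: 'image/png', data },
          })),
        ],
      },
    ],
    // Assumed placement of the two options named in the commit message.
    generationConfig: { responseModalities: ['TEXT', 'IMAGE'], imageConfig },
  };
}

// Minimal response shape covering only the fields read below.
interface GeminiPart {
  text?: string;
  inlineData?: { mimeType?: string; data?: string };
}
interface GeminiResponse {
  candidates?: Array<{ content?: { parts?: GeminiPart[] } }>;
}

// Return the first base64 image found in candidates[].content.parts[].inlineData,
// or null when the response carries only text.
function extractInlineImage(response: GeminiResponse): string | null {
  for (const candidate of response.candidates ?? []) {
    for (const part of candidate.content?.parts ?? []) {
      if (part.inlineData?.data) {
        return part.inlineData.data;
      }
    }
  }
  return null;
}
```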

Client (apps/mana/apps/web/src/lib/modules/wardrobe):
- api/try-on.ts: export `TryOnModel` union + `DEFAULT_TRY_ON_MODEL`.
  RunGarmentTryOnParams / RunOutfitTryOnParams gain an optional
  `model` field, threaded through `callGenerateWithReference`.
- components/TryOnModelPicker.svelte: new segmented control, three
  options with label + one-line hint. Uses a grid auto-fit layout so
  it reflows on the narrow workbench card.
- components/GarmentTryOnButton.svelte + TryOnButton.svelte: both
  mount the picker above the Sparkle CTA. `estimatedCredits` on the
  button label updates live when the user switches model so the
  cost signal matches what the server will actually charge.
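
A sketch of the client-side typing, assuming the union lists exactly
the three ids named at the top of this message. The `garmentId` field
and the `withDefaultModel` helper are hypothetical; only `TryOnModel`,
`DEFAULT_TRY_ON_MODEL`, and the optional `model` field come from this
commit:

```typescript
type TryOnModel =
  | 'openai/gpt-image-2'
  | 'google/gemini-3-pro-image-preview'
  | 'google/gemini-3.1-flash-image-preview';

const DEFAULT_TRY_ON_MODEL: TryOnModel = 'openai/gpt-image-2';

// Trimmed-down stand-in for RunGarmentTryOnParams; the real type
// carries more fields than shown here.
interface GarmentTryOnParamsSketch {
  garmentId: string; // hypothetical field, for illustration only
  model?: TryOnModel;
}

// Hypothetical helper: fill in the default so the request always
// carries an explicit model id.
function withDefaultModel(
  params: GarmentTryOnParamsSketch,
): GarmentTryOnParamsSketch & { model: TryOnModel } {
  return { ...params, model: params.model ?? DEFAULT_TRY_ON_MODEL };
}
```

Keeping `model` optional means existing callers that never pass it
stay on the OpenAI default.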

Env (scripts/generate-env.mjs): GEMINI_API_KEY and GOOGLE_API_KEY
now propagate from the root `.env.development` into `apps/api/.env`
so mana-api can pick them up at boot. The route reads GEMINI_API_KEY
with GOOGLE_API_KEY as fallback, matching how mana-llm ships today.
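
The key lookup order can be sketched like this. `resolveGeminiApiKey`
is a hypothetical name, and the env is passed in as a plain record so
the sketch stays self-contained:

```typescript
// GEMINI_API_KEY takes precedence; GOOGLE_API_KEY is the fallback.
// generate-env.mjs writes '' for missing keys, so empty strings are
// treated the same as unset.
function resolveGeminiApiKey(env: Record<string, string | undefined>): string | null {
  const key = env.GEMINI_API_KEY || env.GOOGLE_API_KEY || '';
  return key === '' ? null : key;
}
```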

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Till JS 2026-04-24 16:04:21 +02:00
parent 90915b7879
commit 8a882a3760
6 changed files with 370 additions and 53 deletions

@@ -97,6 +97,12 @@ const APP_CONFIGS = [
// Picture module providers
OPENAI_API_KEY: (env) => env.OPENAI_API_KEY || '',
REPLICATE_API_TOKEN: (env) => env.REPLICATE_API_TOKEN || '',
// Gemini Nano Banana image edits (Wardrobe Try-On + any future
// reference-generation surface). Either key name works — we
// read both inside the route with GEMINI_API_KEY taking
// precedence, matching how mana-llm ships today.
GEMINI_API_KEY: (env) => env.GEMINI_API_KEY || '',
GOOGLE_API_KEY: (env) => env.GOOGLE_API_KEY || '',
},
},