Moved the Web-Research-Orchestrator (16+ search/LLM providers) to the
GPU-Box. Cross-LAN access for mana-auth/mana-credits/mana-llm/mana-search/
postgres/redis (192.168.178.131). research.mana.how now routes to the
mana-gpu-server tunnel (CF config v29). Mini container count 42 → 41.
PUBLIC_MANA_RESEARCH_URL in mana-app-web switched to the https URL, since
Mini containers cannot reach 192.168.178.11 directly (Colima NAT); the
cross-LAN bridge therefore goes through the Cloudflare tunnel, as with mana-ai.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mana-ai's /metrics endpoint is no longer exposed on Mini's
192.168.178.131:3067 (service moved to GPU-Box, no public /metrics
tunnel since the endpoint is internal). The blackbox-api job
already probes mana-ai.mana.how/health for liveness, which gives
us up/down without needing the metrics scrape.
The status page is now 58/58 UP after VictoriaMetrics rolled past the
stale 3067 samples.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After Phase 2f-3, mana-ai lives on the GPU-Box, so the
blackbox-internal docker-DNS probe (http://mana-ai:3066/health) is
gone: that target sits in a Docker network the blackbox-exporter
can't reach across the LAN. Move the probe into blackbox-api against
the public hostname; this gives the same up/down signal and also
exercises the Cloudflare-tunnel hop.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two cleanups against the status-page DOWN list:
photon-self (photon.mana.how route):
mana-geocoding's /health/photon-self pings the photon backend, which
lives as a Docker container on the GPU-Box (port 2322). PHOTON_SELF_API_URL
was http://192.168.178.11:2322 — the Mini host can hit that fine, but
Docker containers on the Mini can't (a Colima NAT quirk we keep running
into). Routed photon through the mana-gpu-server tunnel (config v26) and
flipped the env var to https://photon.mana.how. The probe goes UP, and
geocoding for sensitive queries (privacy: 'local' provider tier) actually
works now too; it was effectively orphaned before.
whopxl removed everywhere it still lingered:
Container hasn't existed on the Mini in months (no compose service,
no source dir under apps/, no listener on :5100 — only the dead
cloudflared route + a stale CORS_ORIGINS entry on mana-auth). Cleaned
cloudflared-config.yml, prometheus.yml blackbox-web target, and the
mana-auth CORS list. Old DNS CNAME for whopxl.mana.how stays for now;
no harm.
Plus, while we were here: who-api.mana.how/api/decks is the right probe
for who-server's deck catalogue (the /api/decks route lives on who-api,
not on who.mana.how, which is the SSR shell).
Live: status.mana.how shows 58/59 UP; the last 'whopxl' entry will
fall off after VM's TSDB rolls past the probe_success staleness window.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Audit revealed status.mana.how was probing only the unified mana-app
path-routes (mana.how/{module}) plus a couple of GPU services. None
of the standalone deployments were monitored, and three probe targets
were stale.
Changes:
- prometheus.yml blackbox-web: drop mana.how/{context,who} (context
module was dropped 2026-04-29; mana.how/who never existed —
/who is a standalone stack on its own subdomain). Add the eight
hosts that DO have separate deployments today: whopxl, manavoxel,
memoro (landing), cards (Phase-1 spinoff), who.mana.how/cantina,
npm (Verdaccio).
- prometheus.yml blackbox-api: add memoro-api/health,
memoro-audio/health, who-api.mana.how/api/decks,
admin.mana.how/health (admin's root is auth-walled, only /health
returns 200).
- prometheus.yml blackbox-gpu: add gpu-llm.mana.how/health (was
missing; gpu-stt/tts/img/video were in, gpu-llm was somehow not).
- cloudflared-config.yml: restore who.mana.how → :5092 +
who-api.mana.how → :3092. The DNS CNAME points at the Mini tunnel
but the route entries had been lost during a previous compose
cleanup, so every who.* request was hitting the catch-all 404 and
the standalone Bun stack was effectively orphaned at the edge
(PM2 + LaunchAgent all healthy on Mini, just no public route).
Live state after rollout: status.mana.how shows 57/59 services UP,
the two remaining DOWN are pre-existing — photon-self (Phase-2c
cross-LAN routing limitation, documented in PLAN_OPTION_C.md) and
whopxl-web (container not running on the Mini, separate issue).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Arcade lives as its own pnpm workspace at ~/Documents/Code/arcade
now, with no @mana/* coupling. This drops every reference and the
games/ directory from the monorepo.
Removes:
- games/ directory (89 files: web + server + 22 HTML games + screenshots)
- @arcade/web, @arcade/server pnpm workspace entries (games/* globs)
- arcade scripts in root package.json (4 scripts)
- arcade.mana.how from mana-auth trusted origins + CORS_ORIGINS
- arcade entries in mana-apps registry, app-icons, URL overrides
- arcade.mana.how from cloudflared tunnel + prometheus blackbox probes
- arcade-web service block in docker-compose.macmini.yml
- generate-env.mjs entries for arcade server + web
- BRANDING_ONLY 'arcade' entry in registry consistency spec
- dead arcade translation keys in GuestWelcomeModal (DE+EN)
- arcade mention in CLAUDE.md, authentication guideline, MODULE_REGISTRY
Verified:
- services/mana-auth/src/auth/sso-config.spec.ts: 8/8 pass
- pnpm install regenerates lockfile cleanly (-536 lines)
- no remaining 'arcade' refs outside historical snapshot docs
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pelias was retired from the Mac mini on 2026-04-28; photon-self
(self-hosted Photon on mana-gpu) has been the live primary since then.
This removes the now-dead Pelias adapter, config, tests, and the
services/mana-geocoding/pelias/ stack — the entire compose file, the
geojsonify_place_details.js patch, the setup.sh import script.
Provider chain is now `photon-self → photon → nominatim`. The chain
keeps its `privacy: 'local' | 'public'` split, sensitive-query
blocking, coord quantization, and aggressive caching unchanged.
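As an illustration of how that chain behaves (the types and function shapes
below are assumptions for the sketch, not the actual mana-geocoding code):

```ts
// Illustrative sketch only; not the real mana-geocoding implementation.
type PrivacyTier = 'local' | 'public';

interface GeoResult { lat: number; lon: number; label: string }

interface Provider {
  name: 'photon-self' | 'photon' | 'nominatim';
  privacy: PrivacyTier;              // photon-self is the only 'local'-tier provider
  search: (q: string) => Promise<GeoResult[]>;
}

async function geocode(chain: Provider[], q: string, sensitive: boolean): Promise<GeoResult[]> {
  for (const provider of chain) {
    // Sensitive queries never leave the LAN: only 'local' providers are eligible.
    if (sensitive && provider.privacy !== 'local') continue;
    try {
      const results = await provider.search(q);
      if (results.length > 0) return results;   // first non-empty answer wins
    } catch {
      // transient provider failure: fall through to the next tier
    }
  }
  return [];   // every eligible provider failed or returned nothing
}
```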
Three direct calls to nominatim.openstreetmap.org that bypassed
mana-geocoding now route through the wrapper:
- citycorners/add-city + citycorners/cities/[slug]/add use the shared
searchAddress() client (browser → same-origin proxy → mana-geocoding
→ photon-self).
- memoro mobile drops its OSM reverse-geocoding fallback entirely;
Expo's on-device reverse-geocoding stays as the sole path. Routing
through the wrapper would require a memoro-server proxy endpoint —
a follow-up if Expo's quality proves insufficient.
Other behavioral changes:
- CACHE_PUBLIC_TTL_MS dropped from 7d → 1h. The long TTL was a
privacy-amplification trick from the Pelias era; with photon-self
serving the bulk of traffic, a transient cross-LAN blip was pinning
cached fallback answers for days. 1h gives quick recovery.
- /health/pelias renamed to /health/photon-self; prometheus blackbox
config + status-page generator updated.
- mana-geocoding container no longer needs `extra_hosts:
host.docker.internal:host-gateway` (was only there for the
Pelias-on-host-network era).
113 tests passing. CLAUDE.md rewritten to reflect the post-Pelias
architecture.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pairs with c94ab01c6 which added the real /metrics endpoint. Without a
scrape job the policy_decisions_total counter has nowhere to go and
the soak period is flying blind.
30s interval to match mana-ai. Same job shape as mana-ai — any Grafana
dashboard that auto-discovers services via labels will pick this up.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wires mana-ai into the existing observability stack so tick throughput,
plan-failure rates, planner latencies, and snapshot refresh health are
visible in Grafana + Prometheus, and the service's uptime surfaces on
status.mana.how under the "Internal" section.
- `src/metrics.ts` — prom-client Registry with `mana_ai_` prefix.
Counters: ticks_total, plans_produced_total, plans_written_back_total,
parse_failures_total, mission_errors_total, snapshots_new/updated,
snapshot_rows_applied_total, http_requests_total.
Histograms: tick_duration_seconds (0.1–120s), planner_request_duration_seconds
(0.25–60s), http_request_duration_seconds (0.005–10s); see the sketch below.
- `src/index.ts` — HTTP middleware labels every request by
method/path/status; `/metrics` serves the Prometheus text format.
- `src/cron/tick.ts` — increments counters + wraps the tick with
`tickDuration.startTimer()`. Snapshot stats fold through.
- `src/planner/client.ts` — wraps `complete()` in a latency histogram
timer so planner tail latency shows up separately from tick duration.
- `docker/prometheus/prometheus.yml` —
1. New `mana-ai` scrape job against `mana-ai:3066/metrics` (30s).
2. `/health` added to the `blackbox-internal` job so uptime shows on
status.mana.how alongside mana-geocoding.
- `scripts/generate-status-page.sh` — friendly label for the new probe:
`mana-ai:3066/health` → "Mana AI Runner" (generator already iterates
`blackbox-internal`, no other changes needed).
- `package.json` — prom-client ^15.1.3
All 17 Bun tests still pass; tsc clean.
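For reference, a minimal sketch of how the prom-client registry and the
`/metrics` route can be wired (help strings, buckets, and file layout here
are illustrative, not copied from the actual sources):

```ts
// src/metrics.ts (sketch)
import { Counter, Histogram, Registry } from 'prom-client';

export const registry = new Registry();

export const ticksTotal = new Counter({
  name: 'mana_ai_ticks_total',
  help: 'Total runner ticks executed',
  registers: [registry],
});

export const tickDuration = new Histogram({
  name: 'mana_ai_tick_duration_seconds',
  help: 'Wall-clock duration of one tick',
  buckets: [0.1, 1, 5, 15, 60, 120],   // spans the 0.1–120s range mentioned above
  registers: [registry],
});

// src/index.ts (sketch): serve the Prometheus text format
import { Hono } from 'hono';

const app = new Hono();
app.get('/metrics', async (c) =>
  c.text(await registry.metrics(), 200, { 'Content-Type': registry.contentType })
);

// src/cron/tick.ts (sketch): count the tick and time it
export async function runTick(tick: () => Promise<void>) {
  ticksTotal.inc();
  const end = tickDuration.startTimer();
  try {
    await tick();
  } finally {
    end();
  }
}
```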
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
eventstream was confusingly branded "Events" in the app registry,
colliding with the real events calendar module. Renamed to activity
(DE: Aktivität) since it's a live activity feed across all modules.
cycles -> period (DE: Periode) makes the menstrual-tracking module
self-describing. Tables cycles/cycleDayLogs/cycleSymptoms renamed to
periods/periodDayLogs/periodSymptoms; field cycleId -> periodId;
TimeBlockType 'cycle' -> 'period'; domain event CycleDayLogged ->
PeriodDayLogged. Generic "cycle" usages (billing, lifecycle, breath,
bicycle, import cycles) left untouched.
Constant disambiguation: prior DEFAULT_PERIOD_LENGTH (bleeding days)
renamed to DEFAULT_BLEEDING_DAYS; prior DEFAULT_CYCLE_LENGTH (28d full
cycle) is now DEFAULT_PERIOD_LENGTH.
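Expressed in code (the bleeding-days value is illustrative; only the 28-day
full cycle is stated above):

```ts
// Before the rename (ambiguous): DEFAULT_PERIOD_LENGTH meant bleeding days,
// DEFAULT_CYCLE_LENGTH meant the 28-day full cycle.
export const DEFAULT_BLEEDING_DAYS = 5;    // was DEFAULT_PERIOD_LENGTH (value illustrative)
export const DEFAULT_PERIOD_LENGTH = 28;   // full cycle in days, was DEFAULT_CYCLE_LENGTH
```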
Pre-launch, no data migration needed.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
blackbox-exporter can't resolve host.docker.internal on Colima, so
probes of host.docker.internal:4000 and :9200 always fail. Instead,
add a /health/pelias endpoint on the Hono wrapper that proxies to
the Pelias API, and update prometheus.yml to probe the wrapper's
proxied health endpoint.
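A hedged sketch of what that wrapper endpoint can look like (env var name
and timeout are assumptions; the /v1/status path matches the Pelias API
status endpoint probed elsewhere in this log):

```ts
// Sketch of the Hono-side health proxy; not the literal mana-geocoding code.
import { Hono } from 'hono';

const app = new Hono();
const PELIAS_API_URL = process.env.PELIAS_API_URL ?? 'http://host.docker.internal:4000';

app.get('/health/pelias', async (c) => {
  try {
    // The wrapper container maps host.docker.internal via extra_hosts, so it can
    // reach the Pelias API even though the blackbox-exporter cannot resolve it.
    const res = await fetch(`${PELIAS_API_URL}/v1/status`, {
      signal: AbortSignal.timeout(3000),
    });
    return c.json({ ok: res.ok }, res.ok ? 200 : 502);
  } catch {
    return c.json({ ok: false }, 502);
  }
});

export default app;
```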
Also simplifies the status page friendly_name() now that we don't
need to display the host.docker.internal targets.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Production deployment + observability for the self-hosted geocoding stack:
**docker-compose.macmini.yml**
- New mana-geocoding container (port 3018, internal-only — no traefik
labels, no Cloudflare route). Uses host.docker.internal to reach the
Pelias API on the host's pelias compose stack. Dockerfile added under
services/mana-geocoding/ using the same Bun/Hono pattern as mana-events.
**Prometheus**
- New blackbox-internal job probing mana-geocoding:3018/health, the
Pelias API on host.docker.internal:4000/v1/status, and Elasticsearch
at host.docker.internal:9200/_cluster/health. Kept separate from
blackbox-api which is reserved for public HTTPS endpoints.
**status.mana.how (generate-status-page.sh)**
- Include blackbox-internal in the metric query and add an "Interne
Dienste" section with its own summary card, right between Infrastruktur
and GPU Dienste. Summary grid goes from 4 to 5 columns with a
900px breakpoint.
- friendly_name() now handles http:// URLs and rewrites container-name
hosts like mana-geocoding:3018/health → "Mana Geocoding",
host.docker.internal:4000 → "Pelias API",
host.docker.internal:9200 → "Pelias Elasticsearch".
**Grafana uptime dashboard**
- Add an "Internal" series to the "Alle Dienste — Uptime-Verlauf" panel
- New "Interne Dienste Status" table panel showing per-instance up/down
- New "Geocoding Ø Latenz" stat panel for probe_duration_seconds
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Blackbox web probes were missing: body, journal, dreams, firsts,
cycles, events, finance, places, who, news, mail. These modules
exist in mana-apps.ts and are deployed but were never added to
prometheus.yml — so they didn't show on status.mana.how.
Also adds mana-geocoding and mana-events to the internal SvelteKit
status page health checks.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The workbench-registry app id 'inventar' did not match its
@mana/shared-branding MANA_APPS counterpart 'inventory', so the tier-
gating join in apps/web/src/lib/app-registry/registry.ts silently
failed for the inventory module — it fell into the "no MANA_APPS
entry, default visible" fallback and was effectively un-gated. The
codebase had also voted overwhelmingly for 'inventar' (53 files) vs
'inventory' (3 files in shared-branding), so the long-standing
mismatch was just bookkeeping debt waiting to bite.
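Roughly what that silent fallback looked like, simplified (tier names and
the helper are assumptions, not the literal registry.ts code):

```ts
// Simplified illustration of the un-gating; not the actual tier-gating join.
type Tier = 'free' | 'plus' | 'pro';
interface BrandingEntry { minTier: Tier }

function isVisible(brandingById: Record<string, BrandingEntry>, appId: string, userTier: Tier): boolean {
  const branding = brandingById[appId];   // 'inventar' found no 'inventory' entry → undefined
  if (!branding) return true;             // "no MANA_APPS entry, default visible" fallback
  const order: Tier[] = ['free', 'plus', 'pro'];
  return order.indexOf(userTier) >= order.indexOf(branding.minTier);
}
```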
Pre-release, no live data, so the cleanest fix is to align everything
on the English 'inventory':
- Workbench-registry id, module.config.ts appId, module folder, route
folder and i18n locale folder all renamed via git mv
- Standalone apps/inventar/ workspace package renamed
- All imports, store identifiers (InventarEvents → InventoryEvents,
INVENTAR_GUEST_SEED, inventarModuleConfig), i18n keys and href/goto
paths follow the rename
- The German display label "Inventar" is preserved everywhere it is a
user-visible string (page titles, i18n values, toast labels)
- Dexie table prefixes (invCollections, invItems, …) are unchanged
- Drive-by fix: ListView.svelte was querying non-existent
inventarCollections/inventarItems tables — corrected to the actual
invCollections/invItems names from module.config
- The "inventar ↔ inventory id mismatch" workaround comment in
registry.ts is removed since the mismatch no longer exists
module-registry.ts also picks up the user's parallel newsModuleConfig
addition because both edits land in the same import block — keeping
them split would have left the build in an inconsistent state.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit bundles two unrelated changes that were swept together by an
accidental `git add -A` in another working session. Documented here so the
history reflects what's actually inside.
═══════════════════════════════════════════════════════════════════════
1. fix(mana-auth): /api/v1/auth/login mints JWT via auth.handler instead
of api.signInEmail
═══════════════════════════════════════════════════════════════════════
Previous attempt (commit 55cc75e7d) tried to fix the broken JWT mint in
/api/v1/auth/login by switching the cookie name from `mana.session_token`
to `__Secure-mana.session_token` for production. That was necessary but
not sufficient: Better Auth's session cookie value isn't just the raw
session token, it's `<token>.<HMAC>` where the HMAC is derived from the
better-auth secret. Reconstructing the cookie from auth.api.signInEmail's
JSON response only gave us the raw token, so /api/auth/token's
get-session middleware still couldn't validate it and the JWT mint kept
silently failing.
Real fix: do the sign-in via auth.handler (the HTTP path) rather than
auth.api.signInEmail (the SDK path). The handler returns a real fetch
Response with a Set-Cookie header containing the fully signed cookie
envelope. We capture that header verbatim and forward it as the cookie
on the /api/auth/token request, which now passes validation and mints
the JWT correctly.
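A hedged sketch of the handler-based flow (paths, helper names, and error
handling here are illustrative, not the exact mana-auth code):

```ts
import { auth } from '../auth'; // Better Auth instance (import path illustrative)

// Sketch: sign in via the HTTP handler to obtain the fully signed
// `<token>.<HMAC>` cookie, then forward it verbatim to the JWT mint endpoint.
export async function loginAndMintJwt(baseURL: string, email: string, password: string, clientIp: string) {
  const signInResponse = await auth.handler(
    new Request(`${baseURL}/api/auth/sign-in/email`, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'X-Forwarded-For': clientIp,   // let the rate limiter / security log see the real client IP
      },
      body: JSON.stringify({ email, password }),
    })
  );

  if (signInResponse.status === 403) {
    // email-not-verified path, handled directly off the status code
    throw new Error('EMAIL_NOT_VERIFIED');
  }

  // The Set-Cookie header carries the signed cookie envelope; rebuilding the
  // cookie from the JSON body would lose the HMAC half.
  const cookie = signInResponse.headers.get('set-cookie') ?? '';
  const tokenResponse = await auth.handler(
    new Request(`${baseURL}/api/auth/token`, { headers: { cookie } })
  );
  return (await tokenResponse.json()) as { token: string };
}
```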
Verified end-to-end on auth.mana.how:
$ curl -X POST https://auth.mana.how/api/v1/auth/login \
    -d '{"email":"...","password":"..."}'
{
  "user": {...},
  "token": "<session token>",
  "accessToken": "eyJhbGciOiJFZERTQSI...",   ← real JWT now
  "refreshToken": "<session token>"
}
Side benefits:
- Email-not-verified path is now handled by checking
signInResponse.status === 403 directly, no more catching APIError
with the comment-noted async-stream footgun.
- X-Forwarded-For is forwarded explicitly so Better Auth's rate limiter
and our security log see the real client IP.
- The leftover catch block now only handles unexpected exceptions
(network errors etc); the FORBIDDEN-checking logic in it is dead but
harmless and left in for defense in depth.
═══════════════════════════════════════════════════════════════════════
2. chore: remove the entire self-hosted Matrix stack (Synapse, Element,
Manalink, mana-matrix-bot)
═══════════════════════════════════════════════════════════════════════
The Matrix subsystem ran parallel to the main Mana product without any
load-bearing integration: the unified web app never imported matrix-js-sdk,
the chat module uses mana-sync (local-first), and mana-matrix-bot's
plugins duplicated features the unified app already ships natively.
Keeping it alive cost a Synapse + Element + matrix-web + bot container
quartet, three Cloudflare routes, an OIDC provider plugin in mana-auth,
and a steady drip of devlog/dependency churn.
Removed:
- apps/matrix (Manalink web + mobile, ~150 files)
- services/mana-matrix-bot (Go bot with ~20 plugins)
- docker/matrix configs (Synapse + Element)
- synapse/element-web/matrix-web/mana-matrix-bot services in
docker-compose.macmini.yml
- matrix.mana.how/element.mana.how/link.mana.how Cloudflare tunnel routes
- OIDC provider plugin + matrix-synapse trustedClient + matrixUserLinks
table from mana-auth (oauth_* schema definitions also removed)
- MatrixService import path in mana-media (importFromMatrix endpoint)
- Matrix notification channel in mana-notify (worker, metrics, config,
channel_type enum, MatrixOptions handler)
- Matrix entries from shared-branding (mana-apps + app-icons),
notify-client, the i18n bundle, the observatory map, the credits
app-label list, the landing footer/apps page, the prometheus + alerts
+ promtail tier mappings, and the matrix-related deploy paths in
cd-macmini.yml + ci.yml
Devlog/manascore/blueprint entries that mention Matrix are left intact
as historical record. The oauth_* + matrix_user_links Postgres tables
stay on existing prod databases — code can no longer write to them; drop
them in a follow-up migration if you want them gone for real.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three Mac Mini infrastructure follow-ups bundled:
1. docker-compose.macmini.yml — drop ghost backend env vars from
the mana-app-web service (todo, calendar, contacts, chat, storage,
cards, music, nutriphi `PUBLIC_*_API_URL{,_CLIENT}` plus the memoro
server URLs). The matching consumers were removed in the earlier
ghost-API cleanup commits, so these env entries had been wiring
nothing into the running container for several deploys. Force-
recreating mana-app-web after pulling this commit will pick up
the slimmer env automatically.
2. docker-compose.macmini.yml — bump `mana-mon-blackbox` mem_limit
from 32m to 128m. blackbox-exporter v0.25 sits north of 32m
under load and was OOM-restart-looping every ~90 seconds, which
in turn made `status.mana.how` and the prometheus probe metrics
stale (since the scraper was missing every other window).
3. docker/prometheus/prometheus.yml — split `blackbox-gpu` into two
jobs:
- `blackbox-gpu` now probes `/health` via the http_health
module, because the GPU services (whisper STT, FLUX image
gen, Coqui TTS) return 401/404 on `/` by design (auth or
API-only). The previous http_2xx-on-`/` probe was reporting
all four as down even though they answered `/health` with
200, which inflated the down count on status.mana.how.
- `blackbox-gpu-root` keeps the http_2xx-on-`/` probe for
Ollama, which has no `/health` endpoint but does answer
2xx on its root.
Both jobs share the same blackbox-exporter relabel rewrite so
the targets are routed through the exporter container, not
scraped directly by VictoriaMetrics.
Verified post-fix: status.mana.how reports 41/42 services up (only
`gpu-video` remains down — LTX Video Gen is intentionally not
deployed yet on the Windows GPU box).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
All web app subdomains (chat.mana.how, todo.mana.how, etc.) were removed
when the unified app launched, but monitoring configs still referenced them.
Update blackbox targets to use mana.how/route URLs, remove stale API backend
routes from cloudflared, clean up CORS origins, and fix status page generator
to handle route-based URLs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New GPU service for fast text-to-video generation using LTX-Video (~2B params)
on the RTX 3090. Generates 480p clips in 10-30 seconds, uses ~10GB VRAM.
Includes Cloudflare Tunnel route, Prometheus monitoring, and health checks.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract feedback, analytics, and AI modules from mana-core-auth into
standalone mana-analytics service (Hono + Bun, Port 3064).
New service (services/mana-analytics/):
- User feedback CRUD with voting
- AI-powered feedback title generation via mana-llm
- Simplified from DuckDB analytics to pure PostgreSQL
- ~550 LOC
Removed from mana-core-auth:
- feedback/ module (6 files)
- analytics/ module (4 files)
- ai/ module (3 files)
- db/schema/feedback.schema.ts
mana-core-auth now contains ONLY pure auth:
- Better Auth (JWT, Sessions, 2FA, Passkeys, OIDC, Magic Links)
- Organizations/Guilds (membership management)
- API Keys, Security, Me (GDPR), Health, Metrics
- Ready for Phase 5: Hono rewrite
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Photos NestJS backend was a proxy to mana-media that enriched
responses with local album/favorite/tag data. Now:
- Albums store → local-first via albumCollection + albumItemCollection
- Favorites → local-first via favoriteCollection (toggle in IndexedDB)
- Photo tags → local-first via photoTagCollection
- Photo listing/stats → direct mana-media API calls from frontend
- Upload → direct mana-media upload from frontend
- Delete → direct mana-media delete from frontend
Removed 27 TypeScript files, 1 Docker container, 1 port (3039).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Presi NestJS backend (40 source files, 50 deps) was a CRUD wrapper
around decks, slides, and themes — all now handled by local-first sync.
Only the share-link feature requires server-side state (public URLs
without auth), so a minimal Hono + Bun server replaces the entire
NestJS backend:
- apps/presi/apps/server/ — Hono server with share routes + GDPR admin
Uses @manacore/shared-hono for auth (JWKS), health, admin, errors
- Web app API client stripped to share-only (was 270 lines → 90 lines)
- Removed from docker-compose, CI/CD, Prometheus, env generation
- NestJS backend deleted (40 TS files, 8 test specs, 3038 lines)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Both apps are fully local-first via Dexie.js + mana-sync. Their NestJS
backends were pure CRUD wrappers (20 + 31 source files) that are no
longer needed.
Changes:
- Add packages/shared-hono: JWT auth via JWKS (jose), Drizzle DB factory,
health route, generic GDPR admin handler, error middleware (JWKS auth sketch below)
- Migrate zitare lists page from fetch() to listsStore (local-first)
- Rewrite clock timers store from API-based to timerCollection (Dexie)
- Update clock +layout.svelte CommandBar search to use local collections
- Remove zitare-backend + clock-backend from docker-compose, CI/CD,
Prometheus, env generation, setup scripts
- Add docs/TECHNOLOGY_AUDIT_2026_03.md with full repo analysis
Net result: -2 Docker containers, -2 ports, -2728 lines of code
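For reference, a minimal sketch of what the JWKS-based auth middleware in
shared-hono might look like (env var name and JWKS path are assumptions):

```ts
// Hedged sketch; not the literal @manacore/shared-hono code.
import { createRemoteJWKSet, jwtVerify } from 'jose';
import type { MiddlewareHandler } from 'hono';

type AuthEnv = { Variables: { userId: string } };

// mana-auth publishes its signing keys as a JWKS; the URL here is an assumption.
const jwks = createRemoteJWKSet(new URL(`${process.env.AUTH_URL}/api/auth/jwks`));

export const requireAuth: MiddlewareHandler<AuthEnv> = async (c, next) => {
  const token = c.req.header('authorization')?.replace(/^Bearer /i, '');
  if (!token) return c.json({ error: 'missing token' }, 401);
  try {
    const { payload } = await jwtVerify(token, jwks);
    c.set('userId', payload.sub ?? '');   // verified subject for downstream handlers
    await next();
  } catch {
    return c.json({ error: 'invalid token' }, 401);
  }
};
```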
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Cloudflare Tunnel: api.mana.how → localhost:3060 (Go API Gateway)
- Prometheus: scrape targets for mana-api-gateway:3060 and mana-matrix-bot:4000
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Wire mana-llm service into the monitoring stack:
Prometheus (docker/prometheus/prometheus.yml):
- Add mana-llm scrape job (port 3025, 15s interval)
- Include mana-llm in ServiceDown alert expression
Alerts (docker/prometheus/alerts.yml):
- New llm_alerts group with 4 rules:
- LLMServiceDown: mana-llm down > 1 min (critical)
- LLMHighErrorRate: > 10% errors for 5 min (warning)
- OllamaProviderDown: > 50% requests via Google fallback (warning)
- LLMSlowResponses: p95 > 30s for 5 min (warning)
Grafana Dashboard (docker/grafana/dashboards/mana-llm.json):
- 6 stat panels: status, req/min, error rate, fallback rate, latency, tokens/min
- Requests by Provider (stacked area: Ollama vs Google vs OpenRouter)
- Tokens by Type (prompt vs completion)
- Latency Percentiles (p50, p90, p99)
- Latency by Provider comparison
- Requests by Model breakdown
- Errors by Type
- Google Fallback Rate over time (with threshold coloring)
- Provider Distribution pie chart (24h)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add MetricsModule to 8 backends missing it (photos, zitare, mukke,
planta, picture, storage, presi, nutriphi)
- Enable Prometheus scraping for all 15 backends in prometheus.yml
(was only 6, with 3 commented out and 6 missing entirely)
- Update ServiceDown alert rule to cover all 15 backends
- Update Grafana dashboards (backends, master-overview, system-overview)
with all backend services in health panels
- Fix imprecise regex in application-details dashboard
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Instrument the CD pipeline to record per-deploy and per-service metrics
(build time, image size, startup time, health status) into PostgreSQL and
push gauges to Pushgateway. Adds a Grafana dashboard with 13 panels covering
deploy frequency, build performance, service health, and history.
New files:
- scripts/mac-mini/init-deploy-tracking.sql (idempotent DDL)
- scripts/deploy-metrics.sh (bash library for CI)
- docker/grafana/provisioning/datasources/deploy-tracking.yml
- docker/grafana/dashboards/deploy-tracking.json
Modified:
- docker/prometheus/prometheus.yml (pushgateway scrape job)
- .github/workflows/cd-macmini.yml (build/health instrumentation)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Medium priority stability improvements:
Alerting:
- Add vmalert for evaluating Prometheus alert rules
- Add alertmanager for alert routing and grouping
- Add alert-notifier service for Telegram/ntfy notifications
- Enable cadvisor scraping in prometheus config
Disk Monitoring:
- Add check-disk-space.sh for hourly disk monitoring
- Alert on 80% (warning) and 90% (critical) thresholds
- Auto-cleanup Docker when disk is critical
- Add com.manacore.disk-check.plist for launchd
Weekly Reports:
- Add weekly-report.sh for system health summary
- Includes: backup status, disk usage, container health,
database stats, error log summary
- Runs every Sunday at 10 AM via launchd
Health Check Updates:
- Add checks for vmalert, alertmanager, alert-notifier
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add node-exporter service to docker-compose for CPU/Memory/Disk monitoring
- Enable node-exporter scrape target in Prometheus config
- Update System Overview dashboard with Host System section:
- CPU, Memory, Disk usage gauges
- Total RAM, Total Disk, Uptime, Load stats
- CPU & Memory over time graph
- Network I/O graph
- Add Node Exporter to service status panel
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Commented out:
- node-exporter (container not deployed)
- cadvisor (container not deployed)
- storage/presi/nutriphi-backend (no /metrics endpoint yet)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Reverting 618c58c5 which broke the CI workflow.
Will re-add notifications after fixing the issue.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add notify-start job with Telegram notification for build start
- Add notify-complete job with build status and duration notification
- Push CI metrics to Prometheus Pushgateway for Grafana visualization
- Create CI/CD Grafana dashboard with build status, duration, and history
- Add Pushgateway scrape config to Prometheus
Requires TELEGRAM_BOT_TOKEN, TELEGRAM_CHAT_ID, and PUSHGATEWAY_URL secrets.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
New dashboards:
- Application Details: Node.js runtime (heap, event loop, GC),
HTTP details (status codes, methods, top routes), error analysis
- Database Details: PostgreSQL and Redis metrics with detailed breakdowns
Alerting rules (docker/prometheus/alerts.yml):
- Service: down, high/very high error rate, slow response time
- Infrastructure: high CPU/memory/disk usage
- Database: PostgreSQL/Redis down, high connections, low cache hit
- Container: high CPU/memory, restarts
All dashboards include service selector variable for filtering.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>