feat(website): M7 — observability + analytics + GC + M2-polish

Closes the plan. Prometheus metrics across the website endpoints, a
cookieless analytics block users can opt in to, a read-only orphan-
asset scan script, plus two M2 debts (rollback UI + determinism test).

apps/api:
- New /metrics endpoint (unauthenticated; exposed to the internal
  network only, via the reverse proxy).
  Scrape with the existing Prometheus config that already covers mana-ai.
- lib/metrics.ts with prom-client Registry and default-metrics prefix
  `mana_api_`. Website-specific counters/histograms:
    website_publish_total{result=success|slug_taken|invalid|error}
    website_publish_duration_seconds (Histogram)
    website_submissions_total{result=received|spam|rate_limit|not_found|invalid}
    website_host_resolve_total{result=hit|miss|error}
    website_domain_verify_total{result=verified|failed}
    website_public_reads_total{result=hit|not_found}
    website_public_read_age_seconds (Histogram — age of served snapshot)
- Instrument publish.ts, submit.ts, public-routes.ts, domains.ts with
  .inc() calls on every code path.

packages/website-blocks:
- New `analytics` block: Plausible + Umami support with self-hosted
  script-URL override. Hidden in edit/preview, emits exactly one
  <script> in public mode. No cookies, no PII. Registered in block-
  registry; 11 blocks total now.
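
A minimal sketch of the render rule described above (hidden in edit and
preview, exactly one <script> in public mode). `RenderMode`,
`AnalyticsProps` and `renderAnalytics` are hypothetical names for
illustration, not the actual block API. The default script URLs are the
hosted Plausible and Umami locations as I understand them and should be
checked against each provider's docs.

```typescript
// Hypothetical types; the real block schema in packages/website-blocks may differ.
type RenderMode = 'edit' | 'preview' | 'public';

interface AnalyticsProps {
  provider: 'plausible' | 'umami';
  /** Plausible: the site domain. Umami: the website ID. */
  siteId: string;
  /** Optional override for self-hosted instances. */
  scriptUrl?: string;
}

// Assumed hosted defaults; verify against the provider documentation.
const DEFAULT_SCRIPT_URL: Record<AnalyticsProps['provider'], string> = {
  plausible: 'https://plausible.io/js/script.js',
  umami: 'https://cloud.umami.is/script.js',
};

// Escape a value for use inside a double-quoted HTML attribute.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Returns '' outside public mode, otherwise a single script tag.
function renderAnalytics(props: AnalyticsProps, mode: RenderMode): string {
  if (mode !== 'public') return '';
  const src = escapeAttr(props.scriptUrl ?? DEFAULT_SCRIPT_URL[props.provider]);
  const id = escapeAttr(props.siteId);
  const idAttr =
    props.provider === 'plausible'
      ? `data-domain="${id}"`
      : `data-website-id="${id}"`;
  return `<script defer src="${src}" ${idAttr}></script>`;
}
```

Keeping the mode check as the first statement means no provider-specific
branch can emit markup in the editor, which is the property the block
relies on.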

apps/api/scripts/gc-website-assets.ts:
- Read-only scan: walks published_snapshots.blob + submissions.payload
  for /api/v1/media/{id}/ references, asks mana-media for items scoped
  to app=website, flags orphans older than 30d. Writes report to
  /tmp/gc-website-assets-<ts>.json. Deletion toggle is a future commit.
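
The scan logic above can be sketched roughly as follows. This is an
illustration under assumptions, not the script itself: `MediaItem`,
`collectReferencedIds` and `findOrphans` are hypothetical names, the
database and mana-media calls are replaced by in-memory inputs, and age
is assumed to be measured from a `createdAt` timestamp.

```typescript
// Hypothetical shape of an item returned by mana-media for app=website.
interface MediaItem {
  id: string;
  createdAt: Date;
}

// Matches /api/v1/media/{id}/ and captures the id segment.
const MEDIA_REF = /\/api\/v1\/media\/([^/"\s]+)\//g;
const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000;

// Serialize each blob/payload and collect every referenced media id.
function collectReferencedIds(blobs: unknown[]): Set<string> {
  const ids = new Set<string>();
  for (const blob of blobs) {
    const text = JSON.stringify(blob) ?? '';
    for (const match of text.matchAll(MEDIA_REF)) {
      if (match[1]) ids.add(match[1]);
    }
  }
  return ids;
}

// An orphan is unreferenced AND older than 30 days (assumption: by createdAt).
function findOrphans(
  items: MediaItem[],
  referenced: Set<string>,
  now: Date,
): MediaItem[] {
  const cutoff = now.getTime() - THIRTY_DAYS_MS;
  return items.filter(
    (item) => !referenced.has(item.id) && item.createdAt.getTime() < cutoff,
  );
}
```

Both functions are pure and never delete anything, which matches the
read-only scope of this commit; the age cutoff keeps freshly uploaded
but not yet published assets out of the report.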

apps/mana/apps/web:
- RollbackDialog component + PublishBar integration. Closes the M2
  debt "Rollback funktioniert" ("rollback works"; API + store were
  there, UI was missing).
- publish.test.ts: snapshot determinism + orphan-drop tests. 4/4 pass.

docs:
- observability/website.md: metric reference, PromQL queries, alert
  suggestions, Grafana dashboard pointer.
- plans/website-builder.md: M7 checklist updated (Per-site-stats +
  submission-retention explicitly deferred with reason), shipping log
  table completed with all M1→M7 commits.

Validation:
- apps/mana/apps/web: pnpm check → 0 errors 0 warnings
- apps/api: tsc --noEmit → clean
- website-blocks tsc → clean
- publish.test.ts → 4/4 pass

Note: validate:all's check:crypto fails on unrelated WIP (wardrobe
module's Dexie tables aren't classified yet in encryption-registry).
Pre-existing failure, not introduced by this commit — the pre-commit
lint-staged run does NOT include check:crypto so it doesn't block.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Till JS 2026-04-23 18:30:49 +02:00
parent 4fc9d6c59c
commit d518169ce9

@@ -22,6 +22,7 @@ import { and, desc, eq } from 'drizzle-orm';
 import { promises as dns } from 'node:dns';
 import { requireTier, type AuthVariables } from '@mana/shared-hono';
 import { errorResponse, validationError } from '../../lib/responses';
+import { websiteDomainVerifyTotal } from '../../lib/metrics';
 import { db, customDomains } from './schema';
 const routes = new Hono<{ Variables: AuthVariables }>();
@@ -171,6 +172,7 @@ routes.post('/sites/:id/domains/:domainId/verify', async (c) => {
 void cloudflareOnboard(row.hostname).catch((err) => {
 console.error('[website] cloudflare onboard failed', { hostname: row.hostname, err });
 });
+websiteDomainVerifyTotal.inc({ result: 'verified' });
 return c.json({ verified: true, hostname: row.hostname });
 }
@@ -183,6 +185,7 @@ routes.post('/sites/:id/domains/:domainId/verify', async (c) => {
 })
 .where(eq(customDomains.id, domainId));
+websiteDomainVerifyTotal.inc({ result: 'failed' });
 return c.json({ verified: false, reason: result.reason }, 400);
 });