Mirror of https://github.com/Memo-2023/mana-monorepo.git (synced 2026-05-14 18:01:09 +02:00)
feat(ai): Mission Grant rollout gating — flag, alerts, runbook, user docs

Phase 4 — everything needed to flip the Mission Key-Grant feature on
safely per deployment. No new behaviour; purely operational plumbing.

- PUBLIC_AI_MISSION_GRANTS feature flag (default off). hooks.server.ts
  injects window.__PUBLIC_AI_MISSION_GRANTS__, api/config.ts exposes
  isMissionGrantsEnabled(). Grant UI (dialog + status box) and the
  Workbench "Datenzugriff" tab both hide when the flag is off.
- PUBLIC_MANA_AI_URL added to the injection set so the webapp can reach
  the new audit endpoint from production.
- Prometheus alerts (new mana_ai_alerts group):
  - ManaAIServiceDown (warning, 2m)
  - ManaAIGrantScopeViolation (critical, 0m) — MUST stay at 0; any
    increment pages immediately
  - ManaAIGrantSkipsHigh (warning, 15m) — flags keypair drift
  - ManaAIPlannerParseFailures (warning, 10m) — prompt/LLM drift
- Runbook in docs/plans/ai-mission-key-grant.md: initial keypair gen,
  leak-response procedure (rotate + invalidate all grants + audit),
  scope-violation triage.
- User-facing doc in apps/docs security.mdx: new "AI Mission Grants"
  section with the three hard constraints (ZK users blocked, scope
  changes invalidate cryptographically, revocation is one click) plus
  an honest threat-model comparison column showing where grants shift
  the tradeoff.

Rollout remaining (not code): generate keypair on Mac Mini, provision
MANA_AI_PRIVATE_KEY_PEM + MANA_AI_PUBLIC_KEY_PEM via Docker secrets,
flip PUBLIC_AI_MISSION_GRANTS=true starting with till-only.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

parent 74bbfda212
commit bb3da78d5c
7 changed files with 204 additions and 15 deletions

@@ -184,6 +184,58 @@ Each row carries the IP address, user-agent, HTTP status code, and a free-form c
| User loses recovery code | n/a | ❌ Data lost |
| User loses password but vault is in ZK mode | Recovery via password reset | ❌ Data lost (vault is keyed to recovery code) |

## AI Mission Grants (opt-in, per mission)

By default, AI missions that depend on encrypted data (notes, tasks,
calendar events, journal entries, your Kontext document) run **only
when your browser tab is open** — the background runner on our server
sees ciphertext and physically cannot read them.

Some missions are more useful when they run continuously, even while
you're offline. For those, you can opt in — per mission, not globally
— to a **Mission Key-Grant**. Here is exactly what that does:

1. Your browser derives a fresh key that is bound to:
   - The mission's ID.
   - The specific table names referenced.
   - The specific record IDs referenced.
2. The derived key is wrapped with the mana-ai service's public key
   and attached to the mission record.
3. When the mana-ai runner ticks for that mission, it unwraps the
   key in memory, decrypts **only the allowlisted records**, plans
   the next iteration, and forgets the key at the end of the tick.
4. Every decrypt is logged. You see the full log under **Workbench
   → Datenzugriff**.
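
The scope binding in step 1 can be illustrated with a short sketch. It assumes HKDF-SHA256 with the scope serialised into the `info` parameter; the `GrantScope` shape, the `scopeInfo` and `deriveGrantKey` helpers, the all-zero salt, and the use of Node's `crypto` module (rather than the browser's WebCrypto) are assumptions made for illustration, not the actual implementation.

```typescript
import { hkdfSync, randomBytes } from 'node:crypto';

// Hypothetical scope shape; field names are illustrative only.
interface GrantScope {
	missionId: string;
	tables: string[];
	recordIds: string[];
}

// Canonicalise the scope so the same set of tables and records
// always serialises to the same bytes, regardless of ordering.
function scopeInfo(scope: GrantScope): Buffer {
	const canonical = JSON.stringify({
		missionId: scope.missionId,
		tables: [...scope.tables].sort(),
		recordIds: [...scope.recordIds].sort(),
	});
	return Buffer.from(canonical, 'utf8');
}

// Derive a 32-byte key whose value depends on the scope (passed as HKDF `info`).
function deriveGrantKey(masterKey: Buffer, scope: GrantScope): Buffer {
	const salt = Buffer.alloc(32); // fixed all-zero salt: an assumption for this sketch
	return Buffer.from(hkdfSync('sha256', masterKey, salt, scopeInfo(scope), 32));
}

const masterKey = randomBytes(32);
const scope: GrantScope = { missionId: 'm1', tables: ['notes'], recordIds: ['r1', 'r2'] };
const widened: GrantScope = { ...scope, recordIds: [...scope.recordIds, 'r3'] };

const keyA = deriveGrantKey(masterKey, scope);
const keyB = deriveGrantKey(masterKey, widened);

// Adding a record changes the derived key, so a grant wrapped for the
// old scope cannot decrypt anything under the new one.
console.log(keyA.equals(keyB)); // false
```

Under these assumptions, widening the scope by a single record ID yields an unrelated key, which is the property that forces a fresh consent step.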

Hard constraints — enforced by the code, not by policy:

- **Zero-knowledge users cannot issue grants.** The mana-auth server
  has no usable master key in ZK mode; the endpoint refuses.
- **Scope changes invalidate the key cryptographically.** Add a new
  record to a mission → the derived key is different → the existing
  grant stops working → you're prompted to re-consent. It is not
  possible for the runner to "silently expand" its scope.
- **Grants expire.** Default lifetime is 7 days, renewed on every
  successful run. Missions that go idle lose their grant automatically;
  you re-consent on the next edit.
- **Revocation is one click.** The lock icon in the Workbench removes
  the grant; the mission keeps its history but stops running
  server-side until you re-grant.
- **The runner never writes under a grant** — it only reads. All
  changes still go through the normal proposal-approve flow you
  control.
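
The expiry rule can be sketched as follows. This is a minimal illustration that assumes a grant carries an ISO `expiresAt` timestamp; `GRANT_TTL_MS`, `renewGrant`, and `statusAt` are hypothetical names introduced here, and the 7-day value is the default lifetime stated above.

```typescript
// Default lifetime from the text above: 7 days, expressed in milliseconds.
const GRANT_TTL_MS = 7 * 24 * 60 * 60 * 1000;

// Hypothetical minimal grant shape: only the expiry matters for this sketch.
interface Grant {
	expiresAt: string; // ISO-8601 timestamp
}

// Called after a successful run: push the expiry a full TTL into the future.
function renewGrant(nowMs: number): Grant {
	return { expiresAt: new Date(nowMs + GRANT_TTL_MS).toISOString() };
}

// Classify a grant relative to a given point in time.
function statusAt(grant: Grant | undefined, nowMs: number): 'none' | 'active' | 'expired' {
	if (!grant) return 'none';
	return Date.parse(grant.expiresAt) < nowMs ? 'expired' : 'active';
}

const DAY_MS = 24 * 60 * 60 * 1000;
const lastRun = Date.UTC(2026, 0, 1);
const grant = renewGrant(lastRun);

console.log(statusAt(grant, lastRun + 6 * DAY_MS)); // 'active'
console.log(statusAt(grant, lastRun + 8 * DAY_MS)); // 'expired'
console.log(statusAt(undefined, lastRun)); // 'none'
```

A mission that keeps running keeps renewing; one that goes idle for longer than the TTL drops to `expired` without any further action.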

| Threat | Standard | With a Mission Grant | Zero-Knowledge |
|--------|----------|----------------------|----------------|
| Mana operator reads an unrelated record of the same user | ⚠️ Could decrypt with KEK | ✅ Cannot — key is scoped | ✅ Cannot |
| Mana operator reads the granted records of the grant-enabled mission | ⚠️ Could decrypt with KEK | ⚠️ Could decrypt with the grant key + record ciphertext | ✅ Cannot |
| Court order against Mana for the granted-mission records | ⚠️ Could be compelled | ⚠️ Could be compelled (while grant is active) | ✅ Mana physically cannot comply |
| Runner RAM-dump during the 60s tick | ⚠️ n/a | ⚠️ Could expose the grant key for one tick window | ✅ n/a |

The tradeoff is deliberate: you exchange a small, scoped privacy
reduction for autonomy on one mission. Missions without a grant keep
the full standard / ZK guarantees.

## Implementation references

For the architectural deep dive, code locations, and the complete rollout history (Phases 1–9 + the backlog sweep), see [`DATA_LAYER_AUDIT.md`](https://github.com/mana-how/mana-monorepo/blob/main/apps/mana/apps/web/src/lib/data/DATA_LAYER_AUDIT.md).

@@ -47,6 +47,12 @@ const PUBLIC_MANA_API_URL_CLIENT =
	process.env.PUBLIC_MANA_API_URL_CLIENT || process.env.PUBLIC_MANA_API_URL || '';
const PUBLIC_MANA_CREDITS_URL_CLIENT =
	process.env.PUBLIC_MANA_CREDITS_URL_CLIENT || process.env.PUBLIC_MANA_CREDITS_URL || '';
const PUBLIC_MANA_AI_URL_CLIENT =
	process.env.PUBLIC_MANA_AI_URL_CLIENT || process.env.PUBLIC_MANA_AI_URL || '';
// Feature flag for the Mission Key-Grant UI (server-side execution of
// encrypted missions). Default off — flip to 'true' per deployment once
// the MANA_AI_PUBLIC/PRIVATE_KEY_PEM pair is provisioned on both services.
const PUBLIC_AI_MISSION_GRANTS = process.env.PUBLIC_AI_MISSION_GRANTS === 'true' ? 'true' : 'false';

// Map of app subdomains to internal paths
const APP_SUBDOMAINS = new Set([

@@ -126,6 +132,8 @@ window.__PUBLIC_MANA_LLM_URL__ = ${JSON.stringify(PUBLIC_MANA_LLM_URL_CLIENT)};
window.__PUBLIC_MANA_EVENTS_URL__ = ${JSON.stringify(PUBLIC_MANA_EVENTS_URL_CLIENT)};
window.__PUBLIC_MANA_API_URL__ = ${JSON.stringify(PUBLIC_MANA_API_URL_CLIENT)};
window.__PUBLIC_MANA_CREDITS_URL__ = ${JSON.stringify(PUBLIC_MANA_CREDITS_URL_CLIENT)};
window.__PUBLIC_MANA_AI_URL__ = ${JSON.stringify(PUBLIC_MANA_AI_URL_CLIENT)};
window.__PUBLIC_AI_MISSION_GRANTS__ = ${JSON.stringify(PUBLIC_AI_MISSION_GRANTS)};
window.__PUBLIC_GLITCHTIP_DSN__ = ${JSON.stringify(PUBLIC_GLITCHTIP_DSN)};
</script>`;
	return injectUmamiAnalytics(html.replace('<head>', `<head>${envScript}`));

@@ -74,6 +74,22 @@ export function getManaAiUrl(): string {
	return process.env.PUBLIC_MANA_AI_URL || 'http://localhost:3066';
}

/**
 * Feature flag for the AI Mission Key-Grant UI. When false, the consent
 * dialog + "Server-Zugriff" box are hidden even on missions with
 * encrypted inputs — missions simply stay foreground-only. Flip on per-
 * deployment after the MANA_AI_PUBLIC/PRIVATE_KEY_PEM keypair is
 * provisioned on both mana-auth and mana-ai.
 */
export function isMissionGrantsEnabled(): boolean {
	if (browser && typeof window !== 'undefined') {
		const flag = (window as unknown as { __PUBLIC_AI_MISSION_GRANTS__?: string })
			.__PUBLIC_AI_MISSION_GRANTS__;
		return flag === 'true';
	}
	return process.env.PUBLIC_AI_MISSION_GRANTS === 'true';
}

/**
 * Get the mana-mail service URL.
 * Hosts mail threads, send, labels, accounts.

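The flag's round trip from server environment to browser can be sketched in isolation. The helper names `normalizeFlag`, `renderFlagScript`, and `readFlag` are hypothetical and exist only for this illustration; the strict `=== 'true'` comparison mirrors the hunks above.

```typescript
// Server side: collapse any environment value to the literal strings
// 'true' or 'false'. Only the exact string 'true' enables the feature.
function normalizeFlag(raw: string | undefined): 'true' | 'false' {
	return raw === 'true' ? 'true' : 'false';
}

// Server side: build the inline script that exposes the flag on `window`.
function renderFlagScript(raw: string | undefined): string {
	const value = JSON.stringify(normalizeFlag(raw));
	return `<script>window.__PUBLIC_AI_MISSION_GRANTS__ = ${value};</script>`;
}

// Client side: read the injected value with the same strict comparison.
function readFlag(win: { __PUBLIC_AI_MISSION_GRANTS__?: string }): boolean {
	return win.__PUBLIC_AI_MISSION_GRANTS__ === 'true';
}

console.log(normalizeFlag('true')); // 'true'
console.log(normalizeFlag('TRUE')); // 'false' (case-sensitive)
console.log(normalizeFlag('1')); // 'false'
console.log(normalizeFlag(undefined)); // 'false' (unset means off)
console.log(readFlag({ __PUBLIC_AI_MISSION_GRANTS__: normalizeFlag('true') })); // true
console.log(readFlag({})); // false
```

Because both ends compare against the exact string `'true'`, a misspelt or missing value fails closed and the grant UI stays hidden.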
@@ -19,6 +19,7 @@
	import { productionDeps } from '$lib/data/ai/missions/setup';
	import MissionInputPicker from '$lib/components/ai/MissionInputPicker.svelte';
	import MissionGrantDialog from '$lib/components/ai/MissionGrantDialog.svelte';
	import { isMissionGrantsEnabled } from '$lib/api/config';
	import type { Mission, MissionCadence, MissionInputRef } from '$lib/data/ai/missions/types';

	const missions = $derived(useMissions());

@@ -106,6 +107,7 @@
	function hasEncryptedInputs(m: Mission): boolean {
		return m.inputs.some((i) => ENCRYPTED_SERVER_TABLES.has(i.table));
	}
	const grantsEnabled = $derived(isMissionGrantsEnabled());
	function grantStatus(m: Mission): 'none' | 'active' | 'expired' {
		if (!m.grant) return 'none';
		return Date.parse(m.grant.expiresAt) < Date.now() ? 'expired' : 'active';

@@ -305,7 +307,7 @@
			</details>
		{/if}

-		{#if hasEncryptedInputs(selected)}
+		{#if grantsEnabled && hasEncryptedInputs(selected)}
			<section class="grant-box">
				<div class="grant-head">
					<span class="grant-title">🔑 Server-Zugriff</span>

@@ -8,6 +8,7 @@
	import { useMissions } from '$lib/data/ai/missions/queries';
	import { revertIteration } from '$lib/data/ai/revert/revert-iteration';
	import { fetchDecryptAudit, type AuditRow } from '$lib/data/ai/audit/queries';
	import { isMissionGrantsEnabled } from '$lib/api/config';
	import type { DomainEvent } from '$lib/data/events/types';

	let moduleFilter = $state<string | null>(null);

@@ -41,6 +42,7 @@
	}

	// ── Tab switcher: timeline ↔ decrypt audit ─────────────
	const grantsEnabled = $derived(isMissionGrantsEnabled());
	let tab = $state<'timeline' | 'audit'>('timeline');
	let auditRows = $state<AuditRow[]>([]);
	let auditLoading = $state(false);

@@ -110,16 +112,18 @@
		>
			Timeline
		</button>
-		<button
-			type="button"
-			role="tab"
-			class="tab"
-			class:tab-active={tab === 'audit'}
-			aria-selected={tab === 'audit'}
-			onclick={() => (tab = 'audit')}
-		>
-			Datenzugriff
-		</button>
+		{#if grantsEnabled}
+			<button
+				type="button"
+				role="tab"
+				class="tab"
+				class:tab-active={tab === 'audit'}
+				aria-selected={tab === 'audit'}
+				onclick={() => (tab = 'audit')}
+			>
+				Datenzugriff
+			</button>
+		{/if}
	</div>

	<div class="filters">

@@ -465,3 +465,53 @@ groups:
        annotations:
          summary: "LLM responses are slow"
          description: "LLM p95 latency is {{ $value | humanizeDuration }}."

  - name: mana_ai_alerts
    rules:
      # mana-ai background runner down
      - alert: ManaAIServiceDown
        expr: up{job="mana-ai"} == 0
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "mana-ai background runner is down"
          description: "mana-ai has been down for 2+ minutes. Missions fall back to the browser-only Runner — users with closed tabs stop receiving proposals."

      # Grant scope violation — MUST remain at 0 in steady state.
      # Any increment is a serious signal: either a runtime bug bypassed
      # the cryptographic scope binding, or a compromised service tried
      # to decrypt outside its allowlist. Page on first occurrence.
      - alert: ManaAIGrantScopeViolation
        expr: increase(mana_ai_grant_scope_violations_total[5m]) > 0
        for: 0m
        labels:
          severity: critical
        annotations:
          summary: "mana-ai Mission Grant scope violation detected"
          description: "mana-ai attempted to decrypt a record outside a Mission Grant's allowlist on table {{ $labels.table }}. Steady-state value MUST be 0. Investigate: (1) look for a resolver bug on the named table, (2) check recent grant issuance, (3) dump the most recent rows from mana_ai.decrypt_audit WHERE status='scope-violation'."

      # Chronic grant failures — expired TTLs are fine, but a flood of
      # wrap-rejected / malformed / not-configured means the keypair is
      # misconfigured or rotated without re-consent.
      # Aggregated per reason so that {{ $labels.reason }} resolves in the
      # annotations; a plain sum() would drop the label. The threshold
      # therefore applies to each reason separately.
      - alert: ManaAIGrantSkipsHigh
        expr: |
          sum by (reason) (rate(mana_ai_grant_skips_total{reason!="expired"}[15m])) > 0.1
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "mana-ai grant skips trending high ({{ $labels.reason }})"
          description: "mana-ai is skipping grants at {{ $value | humanize }}/s with reason={{ $labels.reason }}. Likely causes: MANA_AI_PRIVATE_KEY_PEM mis-set, keypair out of sync with mana-auth's public key, or client producing malformed grants."

      # Planner parse failures — too many means the prompt / LLM drifted.
      - alert: ManaAIPlannerParseFailures
        expr: |
          sum(rate(mana_ai_parse_failures_total[10m]))
            / (sum(rate(mana_ai_plans_produced_total[10m])) + sum(rate(mana_ai_parse_failures_total[10m])) + 0.0001) > 0.2
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "mana-ai planner parse-failure rate high"
          description: "{{ $value | humanizePercentage }} of Planner responses failed to parse — prompt drift or LLM degradation likely."

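The arithmetic behind the `ManaAIPlannerParseFailures` expression can be checked with a small sketch. The function names are hypothetical; the constants `0.0001` and `0.2` are taken from the rule above, and the inputs stand for the two per-second rates that Prometheus would compute.

```typescript
// Share of planner responses that failed to parse. The small constant in
// the denominator avoids division by zero when the runner is idle.
function parseFailureRatio(failuresPerSec: number, plansPerSec: number): number {
	return failuresPerSec / (plansPerSec + failuresPerSec + 0.0001);
}

// The alert condition: more than 20% of responses failed to parse.
function exceedsThreshold(failuresPerSec: number, plansPerSec: number): boolean {
	return parseFailureRatio(failuresPerSec, plansPerSec) > 0.2;
}

console.log(exceedsThreshold(0, 0)); // false: idle runner, ratio is 0
console.log(exceedsThreshold(0.01, 0.09)); // false: roughly 10% failures
console.log(exceedsThreshold(0.03, 0.07)); // true: roughly 30% failures
console.log(exceedsThreshold(0.00002, 0)); // false: epsilon damps very low traffic
```

The last case shows a side effect of the epsilon term: at very low request rates the computed ratio stays below the threshold even if every response fails, so the rule is mostly meaningful once traffic is well above `0.0001` requests per second.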
@@ -85,10 +85,11 @@ Goal: the user can grant and revoke access, and the UX is honest.

### Phase 4: Rollout (1–2 days)

-- [ ] **Feature flag**: `PUBLIC_AI_MISSION_GRANTS=false` by default. Dogfood first (till only), then beta tier, then alpha.
-- [ ] **Status page**: a blackbox probe on `mana-ai` `/health` already exists; additionally alert on `mana_ai_grant_scope_violations_total > 0` (must never happen).
-- [ ] **Runbook**: What to do if `MANA_AI_PRIVATE_KEY` leaks? → Rotate the keypair, invalidate all grants (a simple `UPDATE aiMissions SET grant=null`), users receive re-consent prompts.
-- [ ] **Docs update**: [`apps/docs/src/content/docs/architecture/security.mdx`](../../apps/docs/src/content/docs/architecture/security.mdx), new section "AI Mission Grants".
+- [x] **Feature flag**: `PUBLIC_AI_MISSION_GRANTS=false` by default; dialog + audit tab are gated. Dogfood first (till only), then beta tier, then alpha.
+- [x] **Alerting**: `ManaAIGrantScopeViolation` (critical, any increment), `ManaAIGrantSkipsHigh` (warning, non-expired skips), `ManaAIPlannerParseFailures` in `docker/prometheus/alerts.yml`. The status-page blackbox probe on `/health` is already running.
+- [x] **Runbook**: initial keypair generation + keypair-leak procedure + scope-violation response, further down in this document.
+- [x] **Docs update**: [`apps/docs/src/content/docs/architecture/security.mdx`](../../apps/docs/src/content/docs/architecture/security.mdx), section "AI Mission Grants" including the extended threat-model rows.
+- [ ] **Actually generate the keypair** on the Mac Mini and store it in secrets (not in this repo; out-of-band).

---

@@ -131,6 +132,62 @@ Goal: the user can grant and revoke access, and the UX is honest.

---

## Runbook

### Generating the initial keypair (once per deployment)

```bash
# On the Mac Mini (or another secure working environment):
openssl genpkey -algorithm RSA -pkeyopt rsa_keygen_bits:2048 -out mana-ai.priv.pem
openssl pkey -in mana-ai.priv.pem -pubout -out mana-ai.pub.pem

# Export as env vars (Docker Compose env_file / secrets):
#   MANA_AI_PRIVATE_KEY_PEM → mana-ai   (never outside the service!)
#   MANA_AI_PUBLIC_KEY_PEM  → mana-auth

# Then in the webapp build:
#   PUBLIC_AI_MISSION_GRANTS=true       (enables dialog + audit tab)
```

Both services log at boot whether the feature is active; the `GET /health` status does not change.

### What to do if `MANA_AI_PRIVATE_KEY_PEM` leaks

The private key is the only secret that can decrypt all active grants. If it leaks, an attacker **in possession of the encrypted grant blob + the encrypted records** can reconstruct the plaintext. Without the encrypted records the key alone is useless, but that is a thin line; when in doubt, rotate.

Procedure:

1. **Generate a new keypair** (see above). Under no circumstances reuse the old one.
2. Replace **`MANA_AI_PRIVATE_KEY_PEM`** on `mana-ai` → restart the service. From now on, every existing grant fails to unwrap with `wrap-rejected` (the new private key does not match the old wrap).
3. Replace **`MANA_AI_PUBLIC_KEY_PEM`** on `mana-auth` → restart the service.
4. **Invalidate all existing grants**: they are wrapped with the old public key and no longer functional. In Postgres:
   ```sql
   -- Clears the grant on every mission, for all users.
   -- "grant" is quoted because GRANT is a reserved word in PostgreSQL.
   UPDATE aiMissions SET "grant" = NULL
   WHERE "grant" IS NOT NULL;
   ```
   (In the Mana model this lives as a `sync_changes` row on `appId='ai'/table='aiMissions'`; a quiet migration in the `mana-sync` admin backend is simpler.)
5. **Document the audit trail**: the times at which the leak was discovered, the keys were swapped, and the grants were invalidated. Post-mortem in `docs/postmortems/`.
6. **Notify users**: missions stay active but only run in the foreground until the user grants access again. This is by design; the re-consent prompt appears automatically on the next mission edit.
7. **Check monitoring**: the rate of `mana_ai_grant_skips_total{reason="wrap-rejected"}` must spike briefly after step 2 (old grants) and then fall back to 0 once all of them have been removed via step 4.
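
Why step 2 turns every existing grant into a `wrap-rejected` skip can be seen in a short sketch. It assumes the grant key is wrapped with RSA-OAEP (SHA-256); the padding scheme and the `wrap` and `unwrap` helpers are assumptions for illustration, since the runbook only states that the key is wrapped with the service's public key.

```typescript
import {
	constants,
	generateKeyPairSync,
	privateDecrypt,
	publicEncrypt,
	randomBytes,
	type KeyObject,
} from 'node:crypto';

// Wrap a grant key with the service's public key (RSA-OAEP is an assumption).
function wrap(publicKey: KeyObject, grantKey: Buffer): Buffer {
	return publicEncrypt(
		{ key: publicKey, padding: constants.RSA_PKCS1_OAEP_PADDING, oaepHash: 'sha256' },
		grantKey
	);
}

// Unwrap with the private key; any decryption error is reported as a skip reason.
function unwrap(privateKey: KeyObject, wrapped: Buffer): Buffer | 'wrap-rejected' {
	try {
		return privateDecrypt(
			{ key: privateKey, padding: constants.RSA_PKCS1_OAEP_PADDING, oaepHash: 'sha256' },
			wrapped
		);
	} catch {
		return 'wrap-rejected';
	}
}

const oldPair = generateKeyPairSync('rsa', { modulusLength: 2048 });
const newPair = generateKeyPairSync('rsa', { modulusLength: 2048 });

const grantKey = randomBytes(32);
const wrapped = wrap(oldPair.publicKey, grantKey); // grant issued before the rotation

const beforeRotation = unwrap(oldPair.privateKey, wrapped);
const afterRotation = unwrap(newPair.privateKey, wrapped);

console.log(Buffer.isBuffer(beforeRotation) && beforeRotation.equals(grantKey)); // true
console.log(afterRotation); // 'wrap-rejected'
```

The old wraps stay undecryptable for the rotated service, which is why step 4 removes them rather than trying to migrate them.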

### Responding to a scope-violation alert

The Prometheus alert `ManaAIGrantScopeViolation` (critical, see `docker/prometheus/alerts.yml`) fires on any increase of `mana_ai_grant_scope_violations_total`. The steady-state value must be 0; every firing is either a bug or an attack.

1. Read the most recent scope violations:
   ```sql
   SELECT * FROM mana_ai.decrypt_audit
   WHERE status = 'scope-violation'
   ORDER BY ts DESC LIMIT 20;
   ```
2. Check `record_id`: does the record actually belong to the user? If not → the mission-grant creation is compromised; lock the user.
3. If it does: resolver bug. Check `services/mana-ai/src/db/resolvers/encrypted.ts`. The HKDF binding should make the runtime check redundant, so if the check triggers, something is wrong in the derivation.
4. Pause the mission temporarily:
   ```sql
   -- "grant" is quoted because GRANT is a reserved word in PostgreSQL.
   UPDATE aiMissions SET state = 'paused', "grant" = NULL
   WHERE id = '<missionId>';
   ```

## Non-goals

- **Zero-knowledge users do not get this.** They stay on the foreground runner. If they want autonomy, they have to turn ZK off; that is the decision ZK implies.