#5 — SYSTEM_ARTICLES_IMPORT_WORKER hoisted into @mana/shared-ai
The worker built its actor inline, bypassing the SystemSource union
that's the blessed list for system-write principals. Now uses
makeSystemActor(SYSTEM_ARTICLES_IMPORT_WORKER) like every other
server-side system writer (mission-runner, projection, …).
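
A minimal sketch of the pattern, for orientation only. Only the names
`SystemSource`, `makeSystemActor` and `SYSTEM_ARTICLES_IMPORT_WORKER`
come from this commit; the string values and the actor shape below are
assumptions, not the real @mana/shared-ai exports.

```typescript
// Hypothetical union members; the real string values are not shown in this commit.
type SystemSource =
  | 'system:mission-runner'
  | 'system:projection'
  | 'system:articles-import-worker';

const SYSTEM_ARTICLES_IMPORT_WORKER: SystemSource = 'system:articles-import-worker';

// Assumed actor shape, for illustration only.
interface SystemActor {
  kind: 'system';
  source: SystemSource;
}

// Accepting only SystemSource means an ad-hoc inline principal
// cannot be constructed without first being added to the union.
function makeSystemActor(source: SystemSource): SystemActor {
  return { kind: 'system', source };
}

const workerActor = makeSystemActor(SYSTEM_ARTICLES_IMPORT_WORKER);
```

Under these assumptions, a string that is not a member of the union is
rejected by the type checker instead of slipping through at runtime.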
#7 — sync-db helper hoisted out of mcp/ into lib/
Implementation moved to apps/api/src/lib/sync-db.ts; mcp/sync-db.ts
is a re-export shim so existing MCP imports keep working. Articles
bulk-import + future modules import from lib/ directly — no more
"articles depending on mcp" layering smell.
#11 — Prometheus metrics for the worker
New counters + histogram in lib/metrics.ts under
mana_api_articles_import_*:
- ticks_total{result=processed|skipped|error}
- items_total{result=extracted|error|consent_wall|cancelled}
- extract_duration_seconds (histogram, 0.25–30s buckets)
- jobs_completed_total{result=done}
- pickup_gc_rows_total
Worker tick + extractor are instrumented at their state transitions.
A pickup_gc_rows_total that keeps increasing in steady state signals
a stuck consumer somewhere, which makes it a useful operator alert.
Plan: docs/plans/articles-bulk-import.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four cross-cutting fixes that make the bulk-import worker safe to run
under real production load. All four were called out as live-rollout
risks in the post-ship review of docs/plans/articles-bulk-import.md.
#1 — Same fieldMetaTime bug fixed in mana-ai
The articles fix in 054b9e5be hoists the helper to its own file
`apps/api/src/modules/articles/field-meta.ts`. The same naive
`rowFM[k] >= localTime` LWW comparison existed in three more
projections under services/mana-ai (missions-projection,
snapshot-refresh, agents-projection). Once any F3 stamp lands
beside a legacy-string stamp, the object is coerced to
`'[object Object]'` and compared lexicographically against
`'ISO-…'`, so the result no longer depends on the timestamps
and the older value can win.
New `services/mana-ai/src/db/field-meta.ts` — same helper,
deliberately duplicated (each service treats sync_changes as a
read-only event log; sharing infra across services is out of
scope here). All 61 mana-ai bun tests still pass.
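
A minimal sketch of the failure mode and of a normalising helper. The
name `fieldMetaTime` is from this commit; its signature, the
`{ at: string }` shape of an F3 stamp and the `rowWins` wrapper are
assumptions made for illustration.

```typescript
// Assumed shapes: legacy stamps are bare ISO strings; the F3 object
// shape ({ at: string }) is hypothetical.
type LegacyStamp = string;
interface F3Stamp {
  at: string;
}
type FieldStamp = LegacyStamp | F3Stamp | undefined;

// Normalise either stamp format to epoch milliseconds
// (0 when the stamp is missing or unparseable).
function fieldMetaTime(stamp: FieldStamp): number {
  if (stamp === undefined) return 0;
  const iso = typeof stamp === 'string' ? stamp : stamp.at;
  const ms = Date.parse(iso);
  return Number.isNaN(ms) ? 0 : ms;
}

// LWW check: does the stored row stamp win against the incoming one?
function rowWins(rowStamp: FieldStamp, incoming: FieldStamp): boolean {
  return fieldMetaTime(rowStamp) >= fieldMetaTime(incoming);
}

const olderF3: F3Stamp = { at: '2024-01-01T00:00:00.000Z' };
const newerLegacy: LegacyStamp = '2024-06-01T00:00:00.000Z';

// Naive comparison: the object stringifies to '[object Object]', so the
// outcome is decided by string order, not by the two timestamps.
const naiveRowWins = String(olderF3) >= newerLegacy;

// Normalised comparison: the newer incoming stamp wins, as intended.
const fixedRowWins = rowWins(olderF3, newerLegacy);
```

In this sketch the naive check lets the older row win, while the
normalised check picks the newer stamp regardless of its format.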
#2 — Stale 'extracting' items recycle
If the worker dies mid-fetch (OOM, pod restart), items stay in
state='extracting' forever and the job never completes. New sweep
at the start of `processOneJob`: items whose lastAttemptAt is
older than 5 minutes get bounced back to 'pending' so the next
tick re-claims them. STALE_EXTRACTING_MS tuned for the 15s
shared-rss fetch + JSDOM-parse worst case.
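
The sweep can be sketched roughly as below. This is an in-memory
illustration, not the worker's actual query: the `ImportItem` shape,
the epoch-millisecond representation of `lastAttemptAt` and the
function name `recycleStaleExtracting` are assumptions.

```typescript
type ItemState =
  | 'pending'
  | 'extracting'
  | 'extracted'
  | 'error'
  | 'consent_wall'
  | 'cancelled';

// Assumed item shape; lastAttemptAt as epoch milliseconds is an assumption.
interface ImportItem {
  id: string;
  state: ItemState;
  lastAttemptAt: number;
}

// 5 minutes, as described above.
const STALE_EXTRACTING_MS = 5 * 60 * 1000;

// Bounce items that have been 'extracting' for too long back to 'pending'
// so the next tick can re-claim them. Returns how many were recycled.
function recycleStaleExtracting(items: ImportItem[], now: number): number {
  let recycled = 0;
  for (const item of items) {
    if (item.state !== 'extracting') continue;
    if (now - item.lastAttemptAt > STALE_EXTRACTING_MS) {
      item.state = 'pending';
      recycled += 1;
    }
  }
  return recycled;
}
```

Items in any other state, and items still inside the 5-minute window,
are left untouched, so an in-flight fetch is not claimed twice.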
#3 — Pickup-row GC
Every 30 ticks (~once per minute) the worker hard-deletes
articleExtractPickup rows older than 24h. Without this a stuck
pickup-consumer (all tabs closed, Web-Lock mismatch) would let
sync_changes accumulate without bound. Logs the row count when
non-zero so we can spot stuck consumers in the wild.
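
A rough sketch of the cadence and the age cutoff, using an in-memory
array in place of the real table. The `PickupRow` shape, the
`createdAt` field and the `createPickupGc` factory are assumptions;
only the 30-tick interval and the 24h cutoff come from this commit.

```typescript
const GC_EVERY_N_TICKS = 30;
const PICKUP_MAX_AGE_MS = 24 * 60 * 60 * 1000;

// Assumed row shape; createdAt is epoch milliseconds.
interface PickupRow {
  id: string;
  createdAt: number;
}

interface PickupStore {
  rows: PickupRow[];
}

// Returns a tick handler that runs the GC on every 30th call and
// reports how many rows it deleted (0 on ticks where GC is skipped).
function createPickupGc(
  store: PickupStore,
  log: (message: string) => void,
): (now: number) => number {
  let tickCount = 0;
  return (now: number): number => {
    tickCount += 1;
    if (tickCount % GC_EVERY_N_TICKS !== 0) return 0;

    const cutoff = now - PICKUP_MAX_AGE_MS;
    const before = store.rows.length;
    store.rows = store.rows.filter((row) => row.createdAt >= cutoff);

    const deleted = before - store.rows.length;
    // Only log when something was actually removed.
    if (deleted > 0) log(`pickup GC deleted ${deleted} row(s)`);
    return deleted;
  };
}
```

The returned count is the natural value to feed into a counter such as
pickup_gc_rows_total.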
#4 — DRY consent-wall heuristic
Identical CONSENT_KEYWORDS + threshold lived in routes.ts AND
import-extractor.ts. Hoisted to
`apps/api/src/modules/articles/consent-wall.ts`; both call sites
now share one heuristic.
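
For illustration, a shared module along these lines is enough for both
call sites. The keyword list, the threshold value and the function
name `looksLikeConsentWall` are placeholders; the real values in
consent-wall.ts are not shown in this commit.

```typescript
// Placeholder keywords and threshold; the actual CONSENT_KEYWORDS differ.
const CONSENT_KEYWORDS: readonly string[] = [
  'cookie',
  'consent',
  'privacy settings',
  'accept all',
];
const CONSENT_KEYWORD_THRESHOLD = 2;

// Treat the extracted text as a consent wall when enough distinct
// keywords appear in it (case-insensitive substring match).
function looksLikeConsentWall(text: string): boolean {
  const haystack = text.toLowerCase();
  let hits = 0;
  for (const keyword of CONSENT_KEYWORDS) {
    if (haystack.includes(keyword)) hits += 1;
  }
  return hits >= CONSENT_KEYWORD_THRESHOLD;
}
```

With a single definition, routes.ts and import-extractor.ts cannot
drift apart when the keyword list or threshold is tuned.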
Plan: docs/plans/articles-bulk-import.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>