Till JS 7e931b1c6d refactor(services): rename Go services, remove -go suffix
mana-search-go → mana-search
mana-notify-go → mana-notify
mana-crawler-go → mana-crawler
mana-api-gateway-go → mana-api-gateway

Legacy NestJS versions are deleted, suffix no longer needed.
Updated all references in docker-compose, CLAUDE.md, package.json,
Forgejo workflows, and service package.json files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 10:18:40 +01:00

mana-crawler (Go)

A Go web crawler that replaces the NestJS mana-crawler. It uses a goroutine-based worker pool instead of BullMQ.

Architecture

  • Language: Go 1.25
  • HTML Parsing: goquery (jQuery-like selectors)
  • Robots.txt: temoto/robotstxt with 24h cache
  • Job Queue: Goroutine worker pool + channels (replaces BullMQ)
  • Database: PostgreSQL (pgx v5)
  • Port: 3023

Endpoints

  • POST /api/v1/crawl — Start crawl job
  • GET /api/v1/crawl — List jobs
  • GET /api/v1/crawl/{jobId} — Job status
  • GET /api/v1/crawl/{jobId}/results — Paginated results
  • DELETE /api/v1/crawl/{jobId} — Cancel job
  • GET /health — Health check
  • GET /metrics — Prometheus metrics

Commands

go run ./cmd/server    # Dev
go build ./cmd/server  # Build
go test ./...          # Test