managarten/services/mana-crawler-go
Till JS 64f7f768eb feat(infra): add Go web crawler (mana-crawler-go)
Goroutine-based crawler replacing NestJS mana-crawler:
- goquery for HTML parsing (title, content, links, metadata)
- robots.txt checker with 24h cache
- Worker pool with configurable concurrency + rate limiting
- PostgreSQL for job/result storage
- Same API surface: POST/GET/DELETE /api/v1/crawl

Result: an 11 MB binary and a ~15 MB Docker image, versus ~200 MB for the NestJS version.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 22:10:45 +01:00
Contents (each entry last changed by "feat(infra): add Go web crawler (mana-crawler-go)", 2026-03-27 22:10:45 +01:00):
cmd/server
internal
.gitignore
CLAUDE.md
Dockerfile
go.mod
go.sum
package.json