Mirror of https://github.com/Memo-2023/mana-monorepo.git (synced 2026-05-16 00:39:39 +02:00)
Goroutine-based crawler replacing NestJS mana-crawler:

- goquery for HTML parsing (title, content, links, metadata)
- robots.txt checker with 24h cache
- Worker pool with configurable concurrency + rate limiting
- PostgreSQL for job/result storage
- Same API surface: POST/GET/DELETE /api/v1/crawl

11 MB binary, ~15 MB Docker image vs ~200 MB NestJS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

| File |
|---|
| main.go |