managarten/services/mana-crawler/src
Till-JS 4a3295d1d0 feat(mana-crawler): add web crawler service
NestJS-based web crawler service for structured content extraction.

Features:
- Depth-controlled crawling with URL pattern filtering
- robots.txt compliance
- HTML/PDF/Markdown content extraction
- BullMQ job queue for async processing
- Redis caching layer
- Prometheus metrics
2026-01-29 22:00:36 +01:00
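
The first feature in the commit message, depth-controlled crawling with URL pattern filtering, can be illustrated with a small sketch. This is not the code from the crawler directory; every name below (CrawlOptions, LinkFetcher, isAllowed, crawl) and the example.com URLs are hypothetical, and the link graph is held in memory so the example runs without network access. It assumes a breadth-first traversal, which the commit message does not specify.

```typescript
// Hypothetical types for illustration only; not taken from the mana-crawler source.
interface CrawlOptions {
  maxDepth: number;   // 0 means "visit only the start URL"
  include: RegExp[];  // URL must match at least one pattern (empty = allow all)
  exclude: RegExp[];  // URL must match none of these patterns
}

// Returns the outgoing links of a page. A real crawler would fetch and parse
// the page here; this sketch takes the function as a parameter instead.
type LinkFetcher = (url: string) => string[];

// Patterns should not carry the `g` flag, because RegExp.prototype.test is
// stateful for global regexes.
function isAllowed(url: string, opts: CrawlOptions): boolean {
  if (opts.exclude.some((re) => re.test(url))) return false;
  return opts.include.length === 0 || opts.include.some((re) => re.test(url));
}

// Breadth-first traversal that stops expanding links once maxDepth is reached.
function crawl(start: string, getLinks: LinkFetcher, opts: CrawlOptions): string[] {
  const visited = new Set<string>();
  const order: string[] = [];
  const queue: Array<{ url: string; depth: number }> = [{ url: start, depth: 0 }];

  while (queue.length > 0) {
    const { url, depth } = queue.shift()!;
    if (visited.has(url) || !isAllowed(url, opts)) continue;
    visited.add(url);
    order.push(url);

    if (depth >= opts.maxDepth) continue; // do not follow links any deeper
    for (const link of getLinks(url)) {
      if (!visited.has(link)) queue.push({ url: link, depth: depth + 1 });
    }
  }
  return order;
}

// Usage with a made-up in-memory link graph.
const graph: Record<string, string[]> = {
  "https://example.com/": ["https://example.com/docs", "https://example.com/login"],
  "https://example.com/docs": ["https://example.com/docs/api"],
  "https://example.com/docs/api": ["https://example.com/docs/api/v2"],
};

const pages = crawl("https://example.com/", (u) => graph[u] ?? [], {
  maxDepth: 2,
  include: [],
  exclude: [/\/login$/],
});
console.log(pages);
```

In this sketch the login page is skipped by the exclude pattern and the v2 page is never reached because it sits at depth 3. In the service described above, a queue such as BullMQ would presumably take the place of the in-memory array, but that wiring is not shown in this listing.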
cache feat(mana-crawler): add web crawler service 2026-01-29 22:00:36 +01:00
common/filters feat(mana-crawler): add web crawler service 2026-01-29 22:00:36 +01:00
config feat(mana-crawler): add web crawler service 2026-01-29 22:00:36 +01:00
crawler feat(mana-crawler): add web crawler service 2026-01-29 22:00:36 +01:00
db feat(mana-crawler): add web crawler service 2026-01-29 22:00:36 +01:00
health feat(mana-crawler): add web crawler service 2026-01-29 22:00:36 +01:00
metrics feat(mana-crawler): add web crawler service 2026-01-29 22:00:36 +01:00
parser feat(mana-crawler): add web crawler service 2026-01-29 22:00:36 +01:00
queue feat(mana-crawler): add web crawler service 2026-01-29 22:00:36 +01:00
robots feat(mana-crawler): add web crawler service 2026-01-29 22:00:36 +01:00
app.module.ts feat(mana-crawler): add web crawler service 2026-01-29 22:00:36 +01:00
main.ts feat(mana-crawler): add web crawler service 2026-01-29 22:00:36 +01:00