managarten/services/mana-search/CLAUDE.md
Claude bd72b4d6d5
feat(search): implement mana-search microservice
Central search microservice for all ManaCore apps featuring:

- NestJS API on port 3021
- SearXNG meta-search engine integration (40+ search engines)
- Redis caching layer for search results and extracted content
- Content extraction with markdown conversion
- Prometheus metrics for monitoring

API Endpoints:
- POST /api/v1/search - Web search with categories/engines
- POST /api/v1/extract - Content extraction from URLs
- POST /api/v1/extract/bulk - Bulk extraction
- GET /health - Health check
- GET /metrics - Prometheus metrics

Search categories: general, news, science, it, images, videos
Supported engines: Google, Bing, DuckDuckGo, Wikipedia, arXiv,
GitHub, StackOverflow, and many more.

https://claude.ai/code/session_01Rk3YVJCU3nM8uvVPghRz6r
2026-01-28 20:41:59 +00:00

5.9 KiB

Mana Search Service

Central search microservice providing web search and content extraction for all ManaCore apps.

Overview

  • Port: 3021
  • Technology: NestJS + SearXNG + Redis
  • Purpose: Unified search and extraction API

Architecture

┌─────────────────────────────────────────────────────────────┐
│                 Consumer Apps                                │
│   Questions │ Chat │ Project Doc Bot │ Future Apps          │
└─────────────────────────┬───────────────────────────────────┘
                          ▼
┌─────────────────────────────────────────────────────────────┐
│              mana-search (Port 3021)                         │
│   Search API │ Extract API │ Redis Cache                    │
└─────────────────────────┬───────────────────────────────────┘
                          ▼
┌─────────────────────────────────────────────────────────────┐
│              SearXNG (Port 8080, internal)                   │
│   Google │ Bing │ DuckDuckGo │ Wikipedia │ arXiv │ ...      │
└─────────────────────────────────────────────────────────────┘

Quick Start

Development (Local NestJS + Docker SearXNG/Redis)

# 1. Start SearXNG and Redis
docker-compose -f docker-compose.dev.yml up -d

# 2. Install dependencies
pnpm install

# 3. Start NestJS in watch mode
pnpm dev

Production (Full Docker)

docker-compose up -d

API Endpoints

# Web search
POST /api/v1/search
{
  "query": "quantum computing",
  "options": {
    "categories": ["general", "science"],
    "engines": ["google", "wikipedia"],
    "language": "de-DE",
    "limit": 10
  }
}

# Get available engines
GET /api/v1/search/engines

# Search health check
GET /api/v1/search/health

# Clear search cache
DELETE /api/v1/search/cache

Extract

# Extract content from URL
POST /api/v1/extract
{
  "url": "https://example.com/article",
  "options": {
    "includeMarkdown": true,
    "maxLength": 5000
  }
}

# Bulk extract (max 20 URLs)
POST /api/v1/extract/bulk
{
  "urls": ["https://...", "https://..."],
  "options": { "includeMarkdown": true },
  "concurrency": 5
}

Health & Metrics

# Health check
GET /health

# Prometheus metrics
GET /metrics

Configuration

Environment Variables

Variable Default Description
PORT 3021 API port
SEARXNG_URL http://localhost:8080 SearXNG URL
SEARXNG_TIMEOUT 15000 Search timeout (ms)
SEARXNG_DEFAULT_LANGUAGE de-DE Default language
REDIS_HOST localhost Redis host
REDIS_PORT 6379 Redis port
CACHE_SEARCH_TTL 3600 Search cache TTL (seconds)
CACHE_EXTRACT_TTL 86400 Extract cache TTL (seconds)
EXTRACT_TIMEOUT 10000 Extraction timeout (ms)
EXTRACT_MAX_LENGTH 50000 Max extracted text length

SearXNG Configuration

Edit searxng/settings.yml to:

  • Enable/disable search engines
  • Configure rate limits
  • Set default language
  • Adjust timeouts

Development Commands

# Install dependencies
pnpm install

# Start development server
pnpm dev

# Build for production
pnpm build

# Start production server
pnpm start

# Type checking
pnpm type-check

# Linting
pnpm lint

# Run tests
pnpm test

Docker Commands

# Start all services (production)
docker-compose up -d

# Start SearXNG + Redis only (development)
docker-compose -f docker-compose.dev.yml up -d

# View logs
docker-compose logs -f

# Stop services
docker-compose down

# Rebuild
docker-compose build --no-cache

Testing the API

# Search test
curl -X POST http://localhost:3021/api/v1/search \
  -H "Content-Type: application/json" \
  -d '{"query": "typescript tutorial"}'

# Extract test
curl -X POST http://localhost:3021/api/v1/extract \
  -H "Content-Type: application/json" \
  -d '{"url": "https://en.wikipedia.org/wiki/TypeScript", "options": {"includeMarkdown": true}}'

# Health check
curl http://localhost:3021/health

Search Categories

Category Engines
general Google, Bing, DuckDuckGo, Brave, Wikipedia
news Google News, Bing News
science arXiv, Google Scholar, PubMed, Semantic Scholar
it GitHub, StackOverflow, NPM, MDN
images Google Images, Bing Images, Unsplash
videos YouTube, Vimeo, PeerTube

Integration Example

// In another service
const response = await fetch('http://mana-search:3021/api/v1/search', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    query: 'machine learning basics',
    options: {
      categories: ['general', 'science'],
      limit: 5
    }
  })
});

const { results, meta } = await response.json();

Troubleshooting

SearXNG not responding

# Check SearXNG health
curl http://localhost:8080/healthz

# Check logs
docker logs mana-searxng-dev

Redis connection issues

# Check Redis
docker exec mana-search-redis-dev redis-cli ping

# Clear Redis data
docker exec mana-search-redis-dev redis-cli FLUSHALL

High memory usage

SearXNG can use significant memory. Adjust maxmemory in docker-compose if needed.