mirror of
https://github.com/Memo-2023/mana-monorepo.git
synced 2026-05-16 06:19:39 +02:00
Three improvements to the unified LLM infrastructure:

1. Ollama memory optimization (scripts/mac-mini/configure-ollama.sh):
   - OLLAMA_KEEP_ALIVE=5m → models unload after 5 min idle (saves 3-16 GB RAM)
   - OLLAMA_NUM_PARALLEL=1 → predictable memory usage
   - OLLAMA_MAX_LOADED_MODELS=1 → at most one model in RAM at a time

2. Request-level metrics in @manacore/shared-llm:
   - LlmRequestMetrics interface (model, latency, tokens, fallback detection)
   - LlmMetricsCollector class with summary stats (for health endpoints)
   - Optional onMetrics callback in LlmModuleOptions
   - Automatic metrics emission in chatMessages() (success + error)

3. Chat streaming (token-by-token SSE):
   - Backend: POST /chat/completions/stream SSE endpoint
   - OllamaService.createStreamingCompletion() via llm.chatStreamMessages()
   - ChatService.createStreamingCompletion() with upfront credit consumption
   - Web: chatApi.createStreamingCompletion() SSE consumer
   - Chat store: sendMessage() now streams tokens into the assistant message
   - UI updates reactively as each token arrives

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
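The Ollama memory settings can be sketched as plain environment exports; the variable names and values are taken from the commit message, but the actual configure-ollama.sh may apply them differently (e.g. via launchd on macOS):

```shell
# Sketch of the Ollama settings from configure-ollama.sh, expressed as plain
# environment exports (the real script may use launchd instead).
export OLLAMA_KEEP_ALIVE=5m          # unload idle models after 5 minutes
export OLLAMA_NUM_PARALLEL=1         # serve one request at a time
export OLLAMA_MAX_LOADED_MODELS=1    # keep at most one model resident in RAM
```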
39 lines
825 B
TypeScript
// Module
export { LlmModule } from './llm.module';
export { LlmClientService } from './llm-client.service';
export { LLM_MODULE_OPTIONS } from './llm.constants';

// Core client (for advanced use cases)
export { LlmClient } from './llm-client';

// Interfaces
export type {
  LlmModuleOptions,
  LlmModuleAsyncOptions,
  LlmOptionsFactory,
  ResolvedLlmOptions,
} from './interfaces';
export { resolveOptions } from './interfaces';

// Types
export type {
  ChatMessage,
  ContentPart,
  TextContentPart,
  ImageContentPart,
  ChatOptions,
  JsonOptions,
  VisionOptions,
  TokenUsage,
  ChatResult,
  JsonResult,
  ModelInfo,
  HealthStatus,
} from './types';

// Utilities
export { extractJson } from './utils';

// Metrics
export { LlmMetricsCollector } from './utils';
export type { LlmRequestMetrics, MetricsCallback } from './utils';
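The exported LlmMetricsCollector and LlmRequestMetrics are not shown here, so the following is a minimal sketch of what a collector with summary stats for a health endpoint could look like. All names and fields (RequestMetrics, latencyMs, summary(), and so on) are illustrative assumptions, not the package's actual API:

```typescript
// Hypothetical sketch of a request-metrics collector; field and method names
// are assumptions, not the real @manacore/shared-llm API.
interface RequestMetrics {
  model: string;
  latencyMs: number;
  totalTokens: number;
  usedFallback: boolean; // true when a fallback model served the request
}

class MetricsCollector {
  private readonly entries: RequestMetrics[] = [];

  record(m: RequestMetrics): void {
    this.entries.push(m);
  }

  // Summary stats of the kind a health endpoint might expose.
  summary() {
    const n = this.entries.length;
    const avg = (f: (m: RequestMetrics) => number) =>
      n === 0 ? 0 : this.entries.reduce((sum, m) => sum + f(m), 0) / n;
    return {
      requestCount: n,
      avgLatencyMs: avg((m) => m.latencyMs),
      avgTokens: avg((m) => m.totalTokens),
      fallbackRate: avg((m) => (m.usedFallback ? 1 : 0)),
    };
  }
}

// Usage: record one metric per request, e.g. from an onMetrics-style callback.
const collector = new MetricsCollector();
collector.record({ model: 'llama3', latencyMs: 120, totalTokens: 200, usedFallback: false });
collector.record({ model: 'llama3', latencyMs: 80, totalTokens: 100, usedFallback: true });
const stats = collector.summary();
```

A callback-based design like the optional onMetrics hook described in the commit keeps the shared library free of any opinion about where metrics go; the host app can forward each record to a collector like this one, to logs, or to a metrics backend.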