Oskar Kapala
|
dcacac6965
|
fix(planner-agent): rename OLLAMA_HOST → OLLAMA_API_BASE (litellm convention)
LiteLLM reads OLLAMA_API_BASE, not OLLAMA_HOST.
- llm_router.py: DEFAULT_OLLAMA_HOST → DEFAULT_OLLAMA_API_BASE, param ollama_host → ollama_api_base
- planner.py: env var os.getenv("OLLAMA_HOST") → os.getenv("OLLAMA_API_BASE"), param renamed accordingly
- /opt/homelab/config/planner-agent/.env on SOLARIA updated in-place (not in git)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-05-28 11:34:08 +02:00 |
|
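Note on the rename above: a minimal sketch of how the renamed variable might be resolved, assuming a plain environment lookup with a default. The helper name `resolve_ollama_api_base` and the default URL (Ollama's standard local port) are assumptions for illustration; the actual value of DEFAULT_OLLAMA_API_BASE and the code in planner.py are not shown in the commit.

```python
import os
from typing import Mapping, Optional

# Assumed default: Ollama's standard local endpoint. The real constant in
# llm_router.py is not shown in the commit and may differ.
DEFAULT_OLLAMA_API_BASE = "http://localhost:11434"


def resolve_ollama_api_base(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Ollama base URL from OLLAMA_API_BASE, or the default.

    Hypothetical helper: the legacy OLLAMA_HOST name is deliberately ignored,
    mirroring the rename described in the commit message.
    """
    env = os.environ if env is None else env
    value = env.get("OLLAMA_API_BASE", "").strip()
    return value or DEFAULT_OLLAMA_API_BASE
```

Passing a mapping instead of reading `os.environ` directly keeps the lookup easy to test without touching the process environment.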
Oskar Kapala
|
1bbc511bb7
|
feat(planner-agent): add llm_router.py with local-first fallback chain
services/planner-agent/src/llm_router.py:
- LLMRouter: async routing via litellm; chain = Qwen/Ollama → haiku → sonnet
- Timeouts: 8s local, 30s cloud; asyncio.wait_for belt-and-suspenders
- Rejection triggers: timeout, API error, refusal patterns, JSON schema fail
- JSON fence extraction: recovers valid JSON from fenced code blocks in model output
- ModelMetrics: per-model success/fallback/error counters + success_rate()
- Redis publish to 'llm_router_metrics' after every call (failure-safe)
- redis_url=None disables Redis (useful in tests / edge nodes)
- context= param adds caller label to all log lines for tracing
services/planner-agent/tests/test_llm_router.py:
- 34 tests, 0 network calls (litellm + Redis fully mocked)
- Covers: primary success, JSON error fallback, refusal fallback,
timeout fallback, API exception fallback, all-fail RuntimeError,
schema validation, fence extraction, metrics recording, Redis publish,
Redis failure isolation
services/planner-agent/requirements.txt:
- litellm>=1.40.0, redis>=5.0.0, jsonschema>=4.21.0
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-05-27 18:38:06 +02:00 |
|
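Note on the fallback chain above: the following is a simplified sketch of the local-first pattern the commit describes, not the actual LLMRouter. It assumes each model is an async callable returning raw text, and it leaves out litellm, Redis publishing, refusal patterns, and schema validation. The names `extract_json`, `route`, and the reduced `ModelMetrics` (success and error counters only) are illustrative.

```python
import asyncio
import json
import re
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Dict, List, Tuple

# Matches a Markdown code fence, optionally tagged "json", and captures its body.
FENCE_RE = re.compile(r"`{3}(?:json)?\s*(.*?)`{3}", re.DOTALL)

# A model is any async callable that takes a prompt and returns raw text.
ModelCall = Callable[[str], Awaitable[str]]


def extract_json(text: str) -> Any:
    """Parse JSON from the first fenced block if present, else from the whole text."""
    match = FENCE_RE.search(text)
    candidate = match.group(1) if match else text
    return json.loads(candidate)


@dataclass
class ModelMetrics:
    """Reduced per-model counters (the commit's version also tracks fallbacks)."""
    success: int = 0
    error: int = 0

    def success_rate(self) -> float:
        total = self.success + self.error
        return self.success / total if total else 0.0


async def route(
    prompt: str,
    chain: List[Tuple[str, ModelCall, float]],
    metrics: Dict[str, ModelMetrics],
) -> Tuple[str, Any]:
    """Try each (name, call, timeout_s) in order; return the first valid JSON result."""
    for name, call, timeout_s in chain:
        stats = metrics.setdefault(name, ModelMetrics())
        try:
            # wait_for enforces the per-model timeout regardless of the client's own.
            raw = await asyncio.wait_for(call(prompt), timeout=timeout_s)
            result = extract_json(raw)
        except Exception:
            # Timeout, API error, or unparseable output: fall through to the next model.
            stats.error += 1
            continue
        stats.success += 1
        return name, result
    raise RuntimeError("all models in the chain failed")
```

With the timeouts listed in the commit, a chain would look like `[("local", call_local, 8.0), ("haiku", call_haiku, 30.0), ("sonnet", call_sonnet, 30.0)]`, where the three callables are placeholders for whatever client wrappers are in use.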