homelab-codex-ws/services/ha-diag-agent/src/ha_diag/checks/base.py
Oskar Kapala 20f6761a67 feat(ha-diag-agent): UnavailableEntitiesCheck with root cause dedup
- shared aiohttp ClientSession in HAClient (Phase 1 Flag #2 fixed):
  make_session() factory, session injected at startup, closed on shutdown
- Check.run() → list[CheckResult]: clean multi-event interface
- first real diagnostic check: entity unavailable > 24h
  (INSERT OR IGNORE baseline preserves first-seen timestamp)
- root cause grouping: emit ha_integration_failed instead of N entity
  events when ≥50% of integration's entities are unavailable (≥3 min)
- alert deduplication via SQLite cooldown window (default 6h)
- recovery clears baseline + dedup for immediate re-alert
- configurable thresholds: duration, integration %, cooldown
- 38 unit tests + 7 integration tests (42 pass, 3 skip w/o live HA)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 13:41:55 +02:00

21 lines
584 B
Python

from __future__ import annotations
from abc import ABC, abstractmethod
from ..models import CheckResult
class Check(ABC):
"""Base class for all HA diagnostic checks."""
name: str # unique slug used in /trigger/<name> and check_history
@abstractmethod
async def run(self) -> list[CheckResult]:
"""Execute the check and return results.
Empty list means the check passed cleanly.
Each CheckResult with event_type set causes an event to be emitted.
The caller (runner in main.py) handles emission and history recording.
"""