CHELSTY Stability Agent
The stability-agent on CHELSTY provides local observability and health monitoring for the node's services and infrastructure.
Purpose
It acts as a filesystem-first watchdog that detects anomalies in the local runtime environment without taking autonomous destructive actions (like restarts). It serves as the primary data source for node-level stability metrics.
Monitoring Scope
- Docker Containers: Monitors all local containers. If a container is not in the `running` state, a `containers_not_running` event is generated.
- Disk Usage: Monitors the root filesystem. Generates `disk_usage_high` events if usage exceeds the configured threshold.
- Connectivity:
  - Checks if the Tailscale socket or interface is available.
  - Checks reachability of the local Mosquitto MQTT broker.
- Zigbee2MQTT: Specifically tracks the presence and status of the Zigbee2MQTT service.
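
Two of the checks above (disk usage and MQTT reachability) can be sketched with the Python standard library alone. This is a minimal illustration, not the agent's actual code: the function names and the event fields other than the `disk_usage_high` type are assumptions, and "reachable" is assumed here to mean that a TCP connection to the broker can be opened.

```python
import shutil
import socket
from typing import Optional


def disk_usage_pct(path: str = "/") -> float:
    """Return used space on the filesystem holding `path`, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def check_disk(path: str = "/", threshold_pct: float = 90.0) -> Optional[dict]:
    """Return a disk_usage_high event if usage exceeds the threshold, else None."""
    pct = disk_usage_pct(path)
    if pct > threshold_pct:
        # Only the event type comes from the README; the other fields are hypothetical.
        return {"type": "disk_usage_high", "path": path, "usage_pct": round(pct, 1)}
    return None


def mqtt_reachable(host: str = "mosquitto", port: int = 1883, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the broker can be opened (assumed definition)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, DNS failures, and timeouts.
        return False
```

Both functions only observe and report, which matches the agent's stated policy of not taking destructive actions on its own.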
Storage and Integration
- Heartbeat: Updated every cycle at `/opt/homelab/state/stability-agent.heartbeat`.
- State Summary: A JSON summary of all latest checks at `/opt/homelab/state/stability-agent.json`.
- Events: Append-only JSON lines at `/opt/homelab/events/YYYY-MM-DD/chelsty/events.jsonl`.
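
The layout above could be produced by something like the following sketch. The paths come from this README; everything else is an assumption made for illustration, including the heartbeat file's content (an ISO timestamp), the use of UTC for the date directory, the `ts` field, and the write-then-rename step for the state file.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

STATE_DIR = Path("/opt/homelab/state")
EVENTS_DIR = Path("/opt/homelab/events")
NODE = "chelsty"


def write_heartbeat(state_dir: Path = STATE_DIR, now: datetime = None) -> Path:
    """Overwrite the heartbeat file with the current timestamp (assumed content)."""
    now = now or datetime.now(timezone.utc)
    state_dir.mkdir(parents=True, exist_ok=True)
    path = state_dir / "stability-agent.heartbeat"
    path.write_text(now.isoformat() + "\n", encoding="utf-8")
    return path


def write_state(summary: dict, state_dir: Path = STATE_DIR) -> Path:
    """Write the JSON summary via a temp file so readers never see a partial file."""
    state_dir.mkdir(parents=True, exist_ok=True)
    path = state_dir / "stability-agent.json"
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(summary, indent=2), encoding="utf-8")
    tmp.replace(path)  # rename within the same directory
    return path


def append_event(event: dict, events_dir: Path = EVENTS_DIR,
                 node: str = NODE, now: datetime = None) -> Path:
    """Append one event as a single JSON line under events/YYYY-MM-DD/<node>/."""
    now = now or datetime.now(timezone.utc)
    path = events_dir / now.strftime("%Y-%m-%d") / node / "events.jsonl"
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        # "ts" is a hypothetical field name; the event's own keys follow it.
        fh.write(json.dumps({"ts": now.isoformat(), **event}) + "\n")
    return path
```

Because events are append-only and partitioned by date, a consumer can tail the current day's file without coordinating with the agent.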
Deployment
The service is deployed via Docker Compose on CHELSTY.
cd services/stability-agent
docker compose up -d
Configuration
Configuration is managed via environment variables in `docker-compose.override.yml` on the host.
| Variable | Description | Default |
|---|---|---|
| `STABILITY_CHECK_INTERVAL` | Seconds between checks | 60 |
| `DISK_THRESHOLD_PCT` | Disk usage alert threshold (percent) | 90 |
| `MQTT_HOST` | MQTT broker hostname | mosquitto |
| `MQTT_PORT` | MQTT broker port | 1883 |
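
As a rough sketch of how these variables and their defaults might be read at startup: the variable names and default values are taken from the table, while the `Config` class and `load_config` function are hypothetical names introduced for this example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Config:
    check_interval_s: int
    disk_threshold_pct: int
    mqtt_host: str
    mqtt_port: int


def load_config(env: Optional[Mapping[str, str]] = None) -> Config:
    """Build a Config from environment variables, using the documented defaults."""
    env = os.environ if env is None else env
    return Config(
        check_interval_s=int(env.get("STABILITY_CHECK_INTERVAL", "60")),
        disk_threshold_pct=int(env.get("DISK_THRESHOLD_PCT", "90")),
        mqtt_host=env.get("MQTT_HOST", "mosquitto"),
        mqtt_port=int(env.get("MQTT_PORT", "1883")),
    )
```

Passing a plain mapping instead of `os.environ` makes the defaults easy to check, for example `load_config({"MQTT_PORT": "8883"})` overrides only the port.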