### CHELSTY Stability Agent

The stability-agent on CHELSTY provides local observability and health monitoring for the node's services and infrastructure.

#### Purpose

It acts as a filesystem-first watchdog that detects anomalies in the local runtime environment without taking autonomous destructive actions (like restarts). It serves as the primary data source for node-level stability metrics.

#### Monitoring Scope

* **Docker Containers**: Monitors all local containers. If a container is not in the `running` state, a `containers_not_running` event is generated.
* **Disk Usage**: Monitors the root filesystem. Generates `disk_usage_high` events if usage exceeds the configured threshold.
* **Connectivity**:
  * Checks if the Tailscale socket or interface is available.
  * Checks reachability of the local Mosquitto MQTT broker.
* **Zigbee2MQTT**: Specifically tracks the presence and status of the Zigbee2MQTT service.

#### Storage and Integration

* **Heartbeat**: Updated every cycle at `/opt/homelab/state/stability-agent.heartbeat`.
* **State Summary**: A JSON summary of all latest checks at `/opt/homelab/state/stability-agent.json`.
* **Events**: Append-only JSON lines at `/opt/homelab/events/YYYY-MM-DD/chelsty/events.jsonl`.

#### Deployment

The service is deployed via Docker Compose on CHELSTY.

```bash
cd services/stability-agent
docker compose up -d
```

#### Configuration

Configuration is managed via environment variables in `docker-compose.override.yml` on the host.

| Variable | Description | Default |
|----------|-------------|---------|
| `STABILITY_CHECK_INTERVAL` | Seconds between checks | `60` |
| `DISK_THRESHOLD_PCT` | Disk usage alert threshold | `90` |
| `MQTT_HOST` | MQTT broker hostname | `mosquitto` |
| `MQTT_PORT` | MQTT broker port | `1883` |
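
As an illustration of how the disk check, the storage paths, and the `DISK_THRESHOLD_PCT` setting described above could fit together, here is a minimal Python sketch of a single check cycle. It is not the agent's actual code: the function names (`disk_usage_pct`, `append_event`, `run_cycle`), the event fields (`ts`, `node`, `type`, `details`), the summary keys, and the heartbeat file contents are all assumptions made for the example. Only the file paths, the `disk_usage_high` event name, and the default threshold come from this document.

```python
import json
import os
import shutil
import time
from datetime import datetime, timezone
from pathlib import Path

# Default mirrors the configuration table; the parsing itself is an assumption.
DISK_THRESHOLD_PCT = float(os.environ.get("DISK_THRESHOLD_PCT", "90"))


def disk_usage_pct(path="/"):
    """Return used space on the filesystem containing `path`, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def append_event(base_dir, event_type, details, now=None):
    """Append one JSON line to <base_dir>/events/YYYY-MM-DD/chelsty/events.jsonl."""
    now = now or datetime.now(timezone.utc)
    day_dir = Path(base_dir) / "events" / now.strftime("%Y-%m-%d") / "chelsty"
    day_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical event schema; the real agent's field names may differ.
    event = {
        "ts": now.isoformat(),
        "node": "chelsty",
        "type": event_type,
        "details": details,
    }
    # Mode "a" keeps the log append-only: earlier events are never rewritten.
    with open(day_dir / "events.jsonl", "a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")
    return event


def run_cycle(base_dir="/opt/homelab", threshold=DISK_THRESHOLD_PCT, root="/"):
    """Run one disk check, record an event if needed, refresh state files."""
    state_dir = Path(base_dir) / "state"
    state_dir.mkdir(parents=True, exist_ok=True)

    pct = disk_usage_pct(root)
    summary = {"disk_usage_pct": round(pct, 1), "disk_ok": pct <= threshold}
    if pct > threshold:
        append_event(
            base_dir,
            "disk_usage_high",
            {"pct": round(pct, 1), "threshold": threshold},
        )

    # Observe and record only; no restarts or other corrective actions.
    (state_dir / "stability-agent.json").write_text(json.dumps(summary))
    # Assumed heartbeat format: the current Unix timestamp as text.
    (state_dir / "stability-agent.heartbeat").write_text(str(int(time.time())))
    return summary
```

A caller would invoke `run_cycle()` in a loop and sleep for `STABILITY_CHECK_INTERVAL` seconds between iterations. Passing `base_dir` explicitly makes it possible to try the sketch against a scratch directory instead of `/opt/homelab`.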