homelab-codex-ws/docs/chelsty-stability-agent.md
oskar dc483ae31a docs(chelsty): update docs and topology for site/node split
- chelsty-runtime.md: references chelsty-infra and chelsty-ha nodes
- chelsty-stability-agent.md: scoped to chelsty-infra
- topology.yaml: chelsty monolith replaced with chelsty-infra + chelsty-ha
2026-05-20 14:23:57 +02:00

### CHELSTY Stability Agent
The stability-agent runs on the `chelsty-infra` node of the CHELSTY site and provides local observability and health monitoring for the node's services and infrastructure.
#### Purpose
The agent is a filesystem-first watchdog: it detects anomalies in the local runtime environment and records them on disk, but it never takes autonomous destructive actions such as restarting services. It serves as the primary data source for node-level stability metrics.
#### Monitoring Scope
* **Docker Containers**: Monitors all local containers. If a container is not in the `running` state, a `containers_not_running` event is generated.
* **Disk Usage**: Monitors the root filesystem. Generates a `disk_usage_high` event if usage exceeds the threshold set by `DISK_THRESHOLD_PCT` (default `90`).
* **Connectivity**:
    * Checks whether the Tailscale socket or interface is available.
    * Checks whether the local Mosquitto MQTT broker is reachable.
* **Zigbee2MQTT**: Specifically tracks the presence and status of the Zigbee2MQTT service.
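
The checks above can be sketched as small functions that return an event dictionary when something is wrong and `None` otherwise. This is only an illustration, not the agent's actual code: the function names, the event fields other than the event type, and the plain TCP probe used for broker reachability are all assumptions, and how the container list is obtained from Docker is left out.

```python
import shutil
import socket


def disk_event(used, total, threshold_pct=90.0):
    """Return a disk_usage_high event if used/total exceeds the threshold, else None."""
    if total <= 0:
        return None  # nothing meaningful to report for an empty or unknown filesystem
    used_pct = 100.0 * used / total
    if used_pct > threshold_pct:
        # Field names are hypothetical; only the event type comes from the docs.
        return {"type": "disk_usage_high", "used_pct": round(used_pct, 1)}
    return None


def check_disk(path="/", threshold_pct=90.0):
    """Measure the filesystem that contains `path` and apply the threshold."""
    usage = shutil.disk_usage(path)
    return disk_event(usage.used, usage.total, threshold_pct)


def check_containers(containers):
    """containers: iterable of {"name": ..., "state": ...} dicts (assumed shape)."""
    not_running = [c["name"] for c in containers if c["state"] != "running"]
    if not_running:
        return {"type": "containers_not_running", "containers": not_running}
    return None


def check_tcp(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

With the documented defaults, broker reachability would be probed as `check_tcp("mosquitto", 1883)`. Note that every function only observes and reports, in line with the agent's rule of never restarting anything itself.
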
#### Storage and Integration
* **Heartbeat**: Updated on every check cycle (every `STABILITY_CHECK_INTERVAL` seconds) at `/opt/homelab/state/stability-agent.heartbeat`.
* **State Summary**: A JSON summary of all latest checks at `/opt/homelab/state/stability-agent.json`.
* **Events**: Append-only JSON lines at `/opt/homelab/events/YYYY-MM-DD/chelsty-infra/events.jsonl`.
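
To make the file layout concrete, here is a minimal sketch of how one cycle's results could be written. Only the paths and file names come from this document. The record fields (`ts`, `node`), the use of UTC for the date directory, the heartbeat file's content, and the write-then-rename step for the summary are assumptions.

```python
import json
import os
from datetime import datetime, timezone
from pathlib import Path

NODE = "chelsty-infra"


def append_event(events_root, event, now=None):
    """Append one event as a JSON line under <root>/YYYY-MM-DD/<node>/events.jsonl."""
    now = now or datetime.now(timezone.utc)  # assumption: date directories use UTC
    day_dir = Path(events_root) / now.strftime("%Y-%m-%d") / NODE
    day_dir.mkdir(parents=True, exist_ok=True)
    record = {"ts": now.isoformat(), "node": NODE, **event}  # hypothetical schema
    path = day_dir / "events.jsonl"
    with open(path, "a", encoding="utf-8") as fh:  # append-only: never rewrite old lines
        fh.write(json.dumps(record) + "\n")
    return path


def write_state(state_dir, summary, now=None):
    """Replace the JSON summary and refresh the heartbeat file."""
    now = now or datetime.now(timezone.utc)
    state_dir = Path(state_dir)
    state_dir.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file first, then rename, so readers never see a partial summary.
    tmp = state_dir / "stability-agent.json.tmp"
    tmp.write_text(json.dumps(summary, indent=2), encoding="utf-8")
    os.replace(tmp, state_dir / "stability-agent.json")
    # Assumption: the heartbeat file simply holds the timestamp of the last cycle.
    (state_dir / "stability-agent.heartbeat").write_text(now.isoformat() + "\n", encoding="utf-8")
```

On the host these would be called with `/opt/homelab/events` and `/opt/homelab/state`. A consumer can then treat a stale heartbeat as a sign that the agent itself has stopped.
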
#### Deployment
The service is deployed via Docker Compose on CHELSTY.
```bash
cd services/stability-agent
docker compose up -d
```
#### Configuration
Configuration is managed via environment variables in `docker-compose.override.yml` on the host.
| Variable | Description | Default |
|----------|-------------|---------|
| `STABILITY_CHECK_INTERVAL` | Seconds between checks | `60` |
| `DISK_THRESHOLD_PCT` | Disk usage alert threshold, in percent of the root filesystem | `90` |
| `MQTT_HOST` | MQTT broker hostname | `mosquitto` |
| `MQTT_PORT` | MQTT broker port | `1883` |
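
As an illustration of how these variables and their defaults fit together, the following sketch reads them from the environment. The `AgentConfig` class and `load_config` function are hypothetical names, and the agent's real parsing may differ. The variable names and default values are taken from the table above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentConfig:
    check_interval_s: int
    disk_threshold_pct: float
    mqtt_host: str
    mqtt_port: int


def load_config(env=None):
    """Build the config from environment variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return AgentConfig(
        check_interval_s=int(env.get("STABILITY_CHECK_INTERVAL", "60")),
        disk_threshold_pct=float(env.get("DISK_THRESHOLD_PCT", "90")),
        mqtt_host=env.get("MQTT_HOST", "mosquitto"),
        mqtt_port=int(env.get("MQTT_PORT", "1883")),
    )
```

Passing a plain dictionary, for example `load_config({"DISK_THRESHOLD_PCT": "85"})`, overrides a single value while the other settings keep their defaults.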