### Stability Agent

A lightweight filesystem-first watchdog and observer agent for CHELSTY.

#### Features

* **Continuous Monitoring**: Runs as a background service.
* **Docker Inspection**: Checks container status through the Docker socket, which is accessed read-only.
* **Disk Usage**: Monitors local disk utilization.
* **Tailscale Check**: Verifies Tailscale availability.
* **MQTT Reachability**: Checks connectivity to the local MQTT broker.
* **Zigbee2MQTT Monitoring**: Specifically monitors the Zigbee2MQTT container.
* **Redis Publishing** (optional): Publishes runtime state and events to a central Redis server.
* **Event Logging**: Writes append-only JSON events to `/opt/homelab/events/YYYY-MM-DD/chelsty/`.
* **State Reporting**: Writes a heartbeat and status summary to `/opt/homelab/state/`.

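As a rough illustration of two of the checks listed above, the sketch below measures disk utilization and probes a TCP port. It is not the agent's actual code: the function names `disk_usage_pct` and `tcp_reachable` are hypothetical, Python is assumed as the language, and a plain TCP connect is only an approximation of "MQTT reachability" (it does not perform an MQTT handshake).

```python
import shutil
import socket


def disk_usage_pct(path: str = "/") -> float:
    """Return the used share of the filesystem holding `path`, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`.

    Assumption: an open TCP port is treated as "broker reachable".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

A check loop could compare `disk_usage_pct("/")` against the configured threshold and call `tcp_reachable` with the broker's host and port, emitting an event when either check fails.
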
#### Configuration

Environment variables:

* `STABILITY_CHECK_INTERVAL`: Interval between checks in seconds (default: 60).
* `DISK_THRESHOLD_PCT`: Disk usage percentage to trigger warning (default: 90).
* `MQTT_HOST`: Hostname or IP of the MQTT broker to check.
* `MQTT_PORT`: Port of the MQTT broker (default: 1883).
* `REDIS_HOST`: Hostname or IP of the Redis server (e.g., PIHA at 100.108.208.3).
* `REDIS_PORT`: Port of the Redis server (default: 6379).
* `REDIS_ENABLED`: Whether to enable Redis publishing (default: `true` if `REDIS_HOST` is set).
* `NODE_NAME`: Name of the current node (default: `chelsty`).

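The defaults above could be resolved along the following lines. This is a minimal sketch, not the agent's real loader: `AgentConfig` and `load_config` are hypothetical names, Python is assumed, and the set of accepted truthy spellings for `REDIS_ENABLED` is a guess.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AgentConfig:
    check_interval: int
    disk_threshold_pct: int
    mqtt_host: Optional[str]
    mqtt_port: int
    redis_host: Optional[str]
    redis_port: int
    redis_enabled: bool
    node_name: str


def load_config(env: Mapping[str, str] = os.environ) -> AgentConfig:
    """Build the configuration from environment variables, applying defaults."""
    redis_host = env.get("REDIS_HOST") or None
    raw_enabled = env.get("REDIS_ENABLED")
    if raw_enabled is None:
        # Documented default: enabled only when REDIS_HOST is set.
        redis_enabled = redis_host is not None
    else:
        # Assumption: these spellings count as "true"; anything else is false.
        redis_enabled = raw_enabled.strip().lower() in {"1", "true", "yes", "on"}
    return AgentConfig(
        check_interval=int(env.get("STABILITY_CHECK_INTERVAL", "60")),
        disk_threshold_pct=int(env.get("DISK_THRESHOLD_PCT", "90")),
        mqtt_host=env.get("MQTT_HOST") or None,
        mqtt_port=int(env.get("MQTT_PORT", "1883")),
        redis_host=redis_host,
        redis_port=int(env.get("REDIS_PORT", "6379")),
        redis_enabled=redis_enabled,
        node_name=env.get("NODE_NAME", "chelsty"),
    )
```

Passing the environment as a mapping keeps the loader easy to exercise without touching the real process environment, for example `load_config({"REDIS_HOST": "100.108.208.3"})`.
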
#### Verification

You can verify the Redis publishing using `redis-cli`:

```bash
# Check node state
redis-cli -h 100.108.208.3 HGETALL homelab:nodes:chelsty

# Check service discovery
redis-cli -h 100.108.208.3 HGETALL homelab:services:chelsty:stability-agent

# Check event stream
redis-cli -h 100.108.208.3 XRANGE homelab:events - +
```

#### Safety

* No automatic restarts are performed.
* Read-only access to the Docker socket.
* No configuration mutation.
* No secrets stored in the repository.

#### Event Schema

Events are written as JSON lines with the following fields:

* `id`: Unique event UUID.
* `timestamp`: ISO 8601 timestamp (UTC).
* `node`: Name of the reporting node (`chelsty`).
* `source`: `stability-agent`.
* `type`: Type of event (e.g., `disk_usage_high`, `containers_not_running`).
* `severity`: `info`, `warning`, or `error`.
* `message`: Human-readable description.
* `details`: Object containing specific check results.

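To make the schema concrete, here is a minimal sketch of building one event and appending it as a JSON line under the dated events directory. It is illustrative only: `build_event` and `append_event` are hypothetical helpers, Python is assumed, the UUID version (v4) is a guess, and the file name `events.jsonl` is invented because the README only specifies the directory layout.

```python
import json
import uuid
from datetime import datetime, timezone
from pathlib import Path

SEVERITIES = {"info", "warning", "error"}


def build_event(event_type: str, severity: str, message: str,
                details: dict, node: str = "chelsty") -> dict:
    """Return an event dict carrying the fields listed in the schema."""
    if severity not in SEVERITIES:
        raise ValueError(f"unknown severity: {severity!r}")
    return {
        "id": str(uuid.uuid4()),  # assumption: random (v4) UUID
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "node": node,
        "source": "stability-agent",
        "type": event_type,
        "severity": severity,
        "message": message,
        "details": details,
    }


def append_event(event: dict, root: Path = Path("/opt/homelab/events")) -> Path:
    """Append the event as one JSON line under root/YYYY-MM-DD/<node>/."""
    day = event["timestamp"][:10]  # date part of the ISO 8601 timestamp
    directory = root / day / event["node"]
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / "events.jsonl"  # hypothetical file name
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")
    return path
```

Opening the file in append mode and writing one serialized object per line keeps the log append-only, and each line can be parsed independently with `json.loads`.
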