# CHELSTY Runtime

This document describes the runtime environment and deployment flow for CHELSTY, an offline-capable home automation edge node split across two VMs.

| Node | Role | Services |
|------|------|----------|
| `chelsty-infra` | LTE edge hypervisor | Mosquitto, Zigbee2MQTT, stability-agent, node-agent |
| `chelsty-ha` | Home Assistant VM | homeassistant (no node-agent; see below) |

Both nodes share an LTE uplink and must function fully offline (Zigbee, MQTT, HA automations) without any connectivity to SATURN, VPS, or Forgejo.

## Runtime Layout

```
/opt/homelab/
├── config/            # Service-specific configs and secrets (not in Git)
│   ├── mosquitto/
│   └── zigbee2mqtt/
├── data/              # Persistent service data
│   ├── mosquitto/     # Persistence DB, password file
│   └── zigbee2mqtt/
│       └── data/      # z2m config, coordinator backup, network key
└── logs/
```

## SLZB-06U Integration

CHELSTY uses a SMLIGHT SLZB-06U Zigbee coordinator connected over Ethernet/TCP.

- **Coordinator IP**: `192.168.1.105`
- **Port**: `6638`
- **Adapter**: `ezsp` (deprecated; migrating to `ember` is recommended and only requires setting `adapter: ember` in `configuration.yaml`)
- **Zigbee2MQTT config key**: `serial.port: tcp://192.168.1.105:6638`

⚠️ Never use `/dev/ttyUSB0`: the coordinator is always TCP-only on this site.

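The adapter switch described above is a one-line change, so it can be scripted. The following is a minimal sketch, not the project's own tooling: `migrate_adapter` is a hypothetical helper name, and it assumes the `adapter:` key sits on its own line as in the minimal template in this document. It keeps a `.bak` copy next to the config before rewriting it.

```bash
# Hypothetical helper: switch the Zigbee2MQTT adapter from ezsp to ember.
# Keeps a backup copy at <config>.bak and leaves every other line untouched.
migrate_adapter() {
  local config="$1"
  if [ ! -w "$config" ]; then
    echo "not writable or missing: $config" >&2
    return 1
  fi
  cp -p "$config" "$config.bak" || return 1
  # Rewrite only a line of the form "<indent>adapter: ezsp".
  sed 's/^\([[:space:]]*adapter:[[:space:]]*\)ezsp[[:space:]]*$/\1ember/' \
    "$config.bak" > "$config" || return 1
  # Succeed only if the config now declares the ember adapter.
  grep -q '^[[:space:]]*adapter:[[:space:]]*ember' "$config"
}
```

A possible invocation on `chelsty-infra` would be `migrate_adapter /opt/homelab/data/zigbee2mqtt/data/configuration.yaml`. Zigbee2MQTT presumably needs a restart afterwards to pick up the change.
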
## Networking Constraints

### Mosquitto — `network_mode: host`
Mosquitto runs with `network_mode: host`, so the broker listens directly on the host's network stack: host processes and host-networked containers reach it at `localhost:1883`, while bridge-networked containers reach it through the Docker host gateway. **Do not change this.**

### Zigbee2MQTT — bridge network + extra_hosts
Zigbee2MQTT runs in a bridge-networked container (needed for port mapping compatibility with docker-compose v1). To reach the host-networked Mosquitto:

```yaml
# hosts/chelsty-infra/runtime/zigbee2mqtt/docker-compose.override.yml
services:
  zigbee2mqtt:
    extra_hosts:
      - "mosquitto:host-gateway"
```

This maps the `mosquitto` hostname inside the z2m container to the Docker host gateway IP, so `mqtt://mosquitto:1883` reaches the host-networked Mosquitto process.

**Why not `network_mode: host` for z2m?**
chelsty-infra runs docker-compose v1 (1.29.2). In v1, `network_mode: host` cannot coexist with `ports:` declared in the base `docker-compose.yml`; the combination raises `InvalidArgument`. The `extra_hosts` approach avoids this.

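One way to confirm that the `extra_hosts` mapping took effect is to look for the `mosquitto` entry in the container's `/etc/hosts`. The sketch below is illustrative only: `hosts_lookup` is a hypothetical helper that reads hosts-file content on stdin and prints the address mapped to a name, exiting non-zero when the name is absent.

```bash
# Hypothetical helper: print the IP mapped to a hostname in hosts-file
# content read from stdin; exit status 1 if the name is not present.
hosts_lookup() {
  local name="$1"
  awk -v name="$name" '
    $1 !~ /^#/ {
      for (i = 2; i <= NF; i++)
        if ($i == name) { print $1; found = 1; exit }
    }
    END { exit found ? 0 : 1 }
  '
}
```

Assuming the container is named `zigbee2mqtt`, as in the recovery commands in this document, it could be used as `docker exec zigbee2mqtt cat /etc/hosts | hosts_lookup mosquitto`. An empty result would suggest the override file was not applied.
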
## Zigbee2MQTT Config Location

The `configuration.yaml` **must be writable**: z2m migrates and rewrites it on startup. It lives in the data directory:

```
/opt/homelab/data/zigbee2mqtt/data/configuration.yaml
```

This path is mounted read-write by the base `docker-compose.yml`:
```yaml
volumes:
  - /opt/homelab/data/zigbee2mqtt/data:/app/data
```

Do **not** mount `configuration.yaml` as a separate `:ro` volume, because z2m will fail with `EROFS`.

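A read-only mount over the config is easy to miss in a layered compose setup. As a rough sketch, the hypothetical helper `ro_mounts_under_app_data` below filters a list of `<destination> <true|false>` lines (the container path and its read-write flag) and prints any read-only mount at or below `/app/data`.

```bash
# Hypothetical helper: read "<destination> <rw-flag>" lines from stdin and
# print destinations at or below /app/data that are mounted read-only.
ro_mounts_under_app_data() {
  awk '$2 == "false" && ($1 == "/app/data" || index($1, "/app/data/") == 1) { print $1 }'
}
```

The input can be produced with `docker inspect -f '{{range .Mounts}}{{println .Destination .RW}}{{end}}' zigbee2mqtt | ro_mounts_under_app_data`, assuming the container is named `zigbee2mqtt`. Any output line points at a mount that would trigger the `EROFS` failure.
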
### Minimal configuration.yaml
```yaml
homeassistant: true
permit_join: false
mqtt:
  base_topic: zigbee2mqtt
  server: mqtt://mosquitto:1883
serial:
  port: tcp://192.168.1.105:6638
  adapter: ezsp
frontend:
  port: 8080
advanced:
  log_level: info
```

## chelsty-ha — No node-agent

`chelsty-ha` does not have a node-agent deployed. Home Assistant is monitored indirectly: if MQTT goes silent on `chelsty-infra`, HA is likely down.

In `hosts/chelsty-ha/services.yaml`:
```yaml
services:
  homeassistant:
    monitor: false  # No node-agent; suppresses supervisor action generation
```

Remove `monitor: false` once node-agent is bootstrapped on this VM.

## Deployment Flow

### Initial Bootstrap
```bash
./scripts/bootstrap/chelsty-runtime.sh
```

### Deploy services
```bash
./scripts/deploy/deploy-node.sh chelsty-infra
./scripts/deploy/deploy-node.sh chelsty-ha
```

### Manual (SSH) — chelsty-infra uses docker-compose v1
```bash
ssh oskar@100.122.201.22
cd ~/homelab-codex-ws/services/<service>
docker-compose -f docker-compose.yml \
  -f ../../hosts/chelsty-infra/runtime/<service>/docker-compose.override.yml \
  up -d --build --force-recreate
```

> **Note:** `docker compose` (v2) is **not** available on chelsty-infra. Always use `docker-compose` (hyphenated, v1 1.29.2).

## Recovery Procedures

### Mosquitto stopped
```bash
# Start the container and ensure the restart policy is correct (both run on chelsty-infra):
ssh oskar@100.122.201.22 "docker start mosquitto && docker update --restart unless-stopped mosquitto"
```

### Zigbee2MQTT won't start
1. Check logs: `docker logs zigbee2mqtt --tail 50`
2. Verify the SLZB-06U is reachable from the host: `nc -zv 192.168.1.105 6638`
3. Verify the config is not empty: `cat /opt/homelab/data/zigbee2mqtt/data/configuration.yaml`
4. If the config is missing, recreate it from the minimal template above

### SLZB-06U unreachable
An `EHOSTUNREACH` error for `192.168.1.105:6638` means the coordinator is offline or the LAN is down. Zigbee2MQTT will keep retrying, so no restart is needed once the coordinator returns.

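The two failure signatures named in this document, `EROFS` and `EHOSTUNREACH`, can be told apart mechanically from the log tail. The following is a small sketch under that assumption: `classify_z2m_log` is a hypothetical helper, and the labels it prints are illustrative rather than anything Zigbee2MQTT itself emits.

```bash
# Hypothetical helper: classify a Zigbee2MQTT log excerpt read from stdin
# by the error codes this runbook documents.
classify_z2m_log() {
  local log
  log="$(cat)"
  case "$log" in
    *EROFS*)        echo "config-read-only" ;;        # configuration.yaml not writable
    *EHOSTUNREACH*) echo "coordinator-unreachable" ;; # SLZB-06U offline or LAN down
    *)              echo "unknown" ;;
  esac
}
```

On `chelsty-infra` this could be fed with `docker logs zigbee2mqtt --tail 50 2>&1 | classify_z2m_log`. An `unknown` result means the log needs to be read by hand.
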
## Critical Backup Sets

| Data | Path |
|------|------|
| HA config + DB | `/opt/homelab/data/homeassistant/` on chelsty-ha |
| z2m config + coordinator backup + network key | `/opt/homelab/data/zigbee2mqtt/data/` |
| Mosquitto persistence + password file | `/opt/homelab/data/mosquitto/` |
| SLZB-06U coordinator state | Backup via SLZB-06U web UI at `192.168.1.105` |

> ⚠️ The Zigbee network key is in `configuration.yaml` or `coordinator_backup.json`; losing it requires re-pairing all devices.

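The file-based backup sets above can be archived with plain `tar`. This is a minimal sketch, not an existing script in the repository: `backup_dir` is a hypothetical helper and the destination directory in the usage example is an assumed path. It does not stop the services first, so files that are being written at that moment may be captured mid-update.

```bash
# Hypothetical helper: archive one directory into a timestamped .tar.gz
# under the destination directory and print the archive path.
backup_dir() {
  local src="$1" dest_dir="$2" name archive
  if [ ! -d "$src" ]; then
    echo "no such directory: $src" >&2
    return 1
  fi
  mkdir -p "$dest_dir" || return 1
  name="$(basename "$src")"
  archive="$dest_dir/$name-$(date +%Y%m%d-%H%M%S).tar.gz"
  # Store paths relative to the parent so the archive unpacks as "<name>/...".
  tar -czf "$archive" -C "$(dirname "$src")" "$name" && echo "$archive"
}
```

For example, `backup_dir /opt/homelab/data/zigbee2mqtt/data /opt/homelab/backups` would archive the z2m config, coordinator backup, and network key in one file. The SLZB-06U coordinator state still has to be exported through its web UI.
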