- dockerized ken + chelsty HA test instances with template fixtures
- snapshot/reset/wait scripts for fixture management
- integration test infrastructure with separate marker
- location_tag promoted from metadata to event payload (Phase 1 flag #3)
- chelsty-infra target_url points to chelsty-ha via tailnet (Phase 1 flag #1)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- new per-host service, follows node-agent pattern
- 7 new HA event types defined (routing in supervisor — Phase 5)
- HeartbeatCheck as pipeline validator (pings /api/, emits ha_websocket_dead)
- service.yaml + host configs for piha (ken) and chelsty-infra (chelsty)
- test scaffolding with aiohttp/aiosqlite mocks (15/15 passing)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
services/agent-system/runtime-materializer/materializer.py:
- Add materialize_from_api(), which fetches all world-state endpoints
  from the control-plane HTTP API (CONTROL_PLANE_URL env var)
- When CONTROL_PLANE_URL is set, use the API as source of truth instead of Redis
- Redis path preserved as fallback for backward compatibility
hosts/piha/runtime/agent-system/docker-compose.override.yml (new):
- Inject CONTROL_PLANE_URL=http://100.95.58.48:18180 for runtime-materializer
- piha webui /snapshot now mirrors VPS observer output (clean, ghost-free)
Root cause: the materializer read from Redis, which held 80 stale service entries
with hash-prefixed ghost keys (e.g. 0ccb8a88e079_control-plane-supervisor).
Redis is never updated by the current observer pipeline; the control-plane API
is the single authoritative world-state source.
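
The source selection and the ghost-key shape can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in materializer.py: the `/services` endpoint path, the `materialize()` and `fetch_json()` helpers, and the 12-hex-character prefix rule (inferred from the single example key above) are hypothetical.

```python
import json
import re
import urllib.request
from typing import Any, Callable, Mapping, Tuple

# Assumed shape of a ghost key: 12 hex characters, an underscore, then the
# real service name, e.g. "0ccb8a88e079_control-plane-supervisor".
GHOST_KEY = re.compile(r"^[0-9a-f]{12}_")


def is_ghost_key(key: str) -> bool:
    return GHOST_KEY.match(key) is not None


def fetch_json(url: str, timeout: float = 5.0) -> Any:
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def materialize(
    env: Mapping[str, str],
    read_redis: Callable[[], Any],
    fetch: Callable[[str], Any] = fetch_json,
) -> Tuple[str, Any]:
    """Prefer the control-plane API when CONTROL_PLANE_URL is set, else Redis."""
    base = env.get("CONTROL_PLANE_URL")
    if base:
        # "/services" is a placeholder; the real endpoint names are not
        # given in this commit message.
        return "api", fetch(base.rstrip("/") + "/services")
    return "redis", read_redis()
```

In the real service `env` would presumably be `os.environ`; passing it in explicitly keeps the fallback path easy to exercise.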
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
z2m migrates configuration.yaml on startup and needs write access.
Remove the separate :ro config mount; rely on the base compose's
/opt/homelab/data/zigbee2mqtt/data:/app/data read-write mount instead.
configuration.yaml must exist at that path on the node before first run.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
docker-compose v1 cannot clear the base compose's ports list with
ports: [] in an override, so network_mode: host still raised InvalidArgument.
Use extra_hosts with host-gateway instead: this maps the 'mosquitto' hostname
to the Docker bridge gateway IP, so mqtt://mosquitto:1883 reaches the
host-networked mosquitto process from within the bridge-networked z2m container.
Requires Docker 20.10+ (present on chelsty-infra).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
docker-compose v1 (1.29.2 on chelsty-infra) raises InvalidArgument when
network_mode: host is combined with port_bindings from the base compose file.
Add ports: [] in the override to clear the base ports list.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The _check_control_plane_health() method probes localhost:18180, which
is the control-plane's mapped port on the host. Inside a bridged container,
localhost resolves to the container's own loopback, so the probe always fails.
Host network mode shares the VPS host's network namespace, so
localhost:18180 reaches the control-plane.
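
For illustration, a probe of this kind might look like the sketch below. It is not the actual _check_control_plane_health() implementation: the function name, the `/` path, and the "any 2xx means healthy" rule are assumptions. The point is that the result depends entirely on whose loopback `localhost` refers to.

```python
import urllib.error
import urllib.request


def control_plane_healthy(
    host: str = "localhost",
    port: int = 18180,
    path: str = "/",  # placeholder; the real health path is not stated here
    timeout: float = 2.0,
) -> bool:
    """Return True if an HTTP server answers on host:port with a 2xx status."""
    url = f"http://{host}:{port}{path}"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused is what a bridged container sees: nothing
        # listens on port 18180 inside its own network namespace.
        return False
```

Run from a bridge-networked container, this returns False even while the control-plane is up on the host; under host networking the same call reaches the mapped port.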
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
100.108.208.3 is piha's Tailscale IP (piha hosts Forgejo+Redis).
VPS's actual Tailscale IP is 100.95.58.48. All three node-agent
overrides were pointing at piha itself, causing containers to SSH
to their own host and fail auth.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Nodes ship events to VPS via rsync+SSH. The container runs as root
and uses the default SSH identity, which must be at /root/.ssh/.
Mount /home/oskar/.ssh from the host read-only so the existing key,
already authorized on the VPS, is available inside the container.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- piha: NODE_TYPE=sd_card (rate-limited docker prune, once per day)
- solaria: NODE_TYPE=ai_node (dangling+containers+build cache; never -a to preserve Ollama images)
- chelsty-infra: NODE_TYPE=lte_node (NO cleanup, events-only)
- All three: VPS_EVENTS_HOST set for event shipping via rsync+SSH
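
The per-node-type policy above can be sketched as a small lookup. This is a hedged illustration, not the node-agent's code: the function name, the 24-hour window arithmetic, and the choice of `docker system prune -f` for sd_card nodes are assumptions. The ai_node commands avoid `-a` so that unused but tagged images are kept.

```python
from typing import List, Optional

DAY_SECONDS = 24 * 60 * 60


def cleanup_commands(
    node_type: str, now: float, last_prune: Optional[float] = None
) -> List[List[str]]:
    """Return the docker commands a node of this type should run right now."""
    if node_type == "lte_node":
        return []  # events-only: no cleanup at all
    if node_type == "ai_node":
        # No "-a" anywhere, so tagged images (e.g. Ollama) are preserved.
        return [
            ["docker", "image", "prune", "-f"],      # dangling images only
            ["docker", "container", "prune", "-f"],  # stopped containers
            ["docker", "builder", "prune", "-f"],    # build cache
        ]
    if node_type == "sd_card":
        # Rate limit: at most one prune per day to spare the SD card.
        if last_prune is not None and now - last_prune < DAY_SECONDS:
            return []
        return [["docker", "system", "prune", "-f"]]  # assumed prune variant
    raise ValueError(f"unknown NODE_TYPE: {node_type!r}")
```

Returning argument lists rather than shell strings lets a caller pass each command straight to `subprocess.run` without quoting concerns.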
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- observer: store trigger_type on incidents for supervisor routing
- supervisor: route containers_not_running/mqtt_unreachable to container_restart instead of redeploy
- supervisor: fix node alias normalization via NODE_ALIAS_MAP
- supervisor: fix pending action dedup (scan by content, not filename)
- executor: implement container_restart via SSH docker restart with retry
- control-plane override: configure NODE_ALIAS_MAP for production
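
A rough sketch of the three supervisor changes listed above: routing by trigger_type, alias normalization, and content-based dedup. All names except the two trigger types, `container_restart`, `redeploy`, and `NODE_ALIAS_MAP` are hypothetical. The alias entries are invented examples, and treating `redeploy` as the default for every other trigger is an assumption.

```python
import hashlib
import json
from typing import Iterable, Mapping

# Hypothetical alias entries; the production map lives in the override file.
NODE_ALIAS_MAP = {"piha.local": "piha", "chelsty": "chelsty-infra"}

RESTART_TRIGGERS = {"containers_not_running", "mqtt_unreachable"}


def normalize_node(name: str, alias_map: Mapping[str, str] = NODE_ALIAS_MAP) -> str:
    """Map a node alias to its canonical name; unknown names pass through."""
    return alias_map.get(name, name)


def route(trigger_type: str) -> str:
    """Pick the remediation action for an incident's trigger_type."""
    if trigger_type in RESTART_TRIGGERS:
        return "container_restart"
    return "redeploy"  # assumed default for all other triggers


def action_fingerprint(action: Mapping[str, object]) -> str:
    """Hash the action's content so dedup does not depend on its filename."""
    canonical = json.dumps(action, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_duplicate(action: Mapping[str, object], pending: Iterable[Mapping[str, object]]) -> bool:
    fp = action_fingerprint(action)
    return any(action_fingerprint(p) == fp for p in pending)
```

Serializing with `sort_keys=True` means two pending actions with the same fields in a different order still compare as duplicates.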
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
VAAPI decode via Intel UHD 630, CPU detection, 2x Reolink RLC-540
placeholders. MQTT to local mosquitto (127.0.0.1), 7-day recording
retention. Secrets in /opt/homelab/config/frigate/frigate.env on node.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>