- node_agent: emit service_healthy for all running managed containers so
observer populates services.json (previously empty → supervisor flooded
action queue with missing_service redeploys for healthy services)
- node_agent: VPS-only _check_control_plane_health() probes the HTTP
endpoint to emit service_healthy/unhealthy for the 'control-plane' logical
service (multi-container stack, container names don't match service name)
- node_agent: fix _cleanup_control_plane_fs() to read new node_checkpoints
format from observer checkpoint (was reading old last_processed_file key,
always found nothing, never cleaned up old events)
- observer: handle service_healthy event type → sets service status healthy
without resolving incidents (unlike service_recovered which also resolves)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When ~/.ssh is mounted from the host oskar user into a container that
runs as root, OpenSSH rejects ~/.ssh/config with 'Bad owner or
permissions' because the file UID doesn't match the running process.
Add -F /dev/null to the rsync SSH command to skip the config file
entirely. Also add UserKnownHostsFile=/dev/null so no known_hosts
write is attempted into a potentially read-only mounted .ssh dir.
The key itself (/root/.ssh/id_rsa) is still read as an implicit
default identity and is not affected by -F.
Reproduces on chelsty-infra (has ~/.ssh/config); safe for all nodes.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>