homelab-codex-ws/services/node-agent
Oskar Kapala 4e8968f9c7 Fix service health tracking: emit service_healthy, control-plane endpoint check, cleanup checkpoint migration
- node_agent: emit service_healthy for all running managed containers so
  observer populates services.json (previously empty → supervisor flooded
  action queue with missing_service redeploys for healthy services)
- node_agent: VPS-only _check_control_plane_health() probes the HTTP
  endpoint to emit service_healthy/unhealthy for the 'control-plane' logical
  service (multi-container stack, container names don't match service name)
- node_agent: fix _cleanup_control_plane_fs() to read new node_checkpoints
  format from observer checkpoint (was reading old last_processed_file key,
  always found nothing, never cleaned up old events)
- observer: handle service_healthy event type → sets service status healthy
  without resolving incidents (unlike service_recovered which also resolves)
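
A minimal sketch of the observer distinction described in the last bullet, not the actual observer code. The event type names come from this commit. The `ObserverState` class, the `apply_event` function, the event dict shape (`type`, `service`), and the way incidents are stored are all hypothetical, chosen only to illustrate that `service_healthy` updates status while `service_recovered` also resolves the incident.

```python
from dataclasses import dataclass, field


@dataclass
class ObserverState:
    # Hypothetical in-memory stand-in for services.json: service name -> status.
    services: dict = field(default_factory=dict)
    # Hypothetical incident store: service name -> open incident id.
    open_incidents: dict = field(default_factory=dict)


def apply_event(state: ObserverState, event: dict) -> ObserverState:
    """Apply one node-agent event to the observer state (illustrative only)."""
    service = event["service"]
    kind = event["type"]

    if kind == "service_healthy":
        # Periodic liveness report: mark healthy, leave incidents untouched.
        state.services[service] = "healthy"
    elif kind == "service_recovered":
        # Recovery: mark healthy and resolve any open incident for the service.
        state.services[service] = "healthy"
        state.open_incidents.pop(service, None)
    elif kind == "service_unhealthy":
        # Assumption: an unhealthy report only updates the status here.
        state.services[service] = "unhealthy"

    return state
```

With this shape, a `service_healthy` event for a service that still has an open incident sets its status to healthy but keeps the incident, and a later `service_recovered` event for the same service removes it.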

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-27 14:49:56 +02:00
src Fix service health tracking: emit service_healthy, control-plane endpoint check, cleanup checkpoint migration 2026-05-27 14:49:56 +02:00
docker-compose.yml feat(node-agent): implement health monitor and safe cleanup policy 2026-05-27 13:15:06 +02:00
Dockerfile feat(node-agent): implement health monitor and safe cleanup policy 2026-05-27 13:15:06 +02:00