homelab-codex-ws/services
Oskar Kapala 4e8968f9c7 Fix service health tracking: emit service_healthy, control-plane endpoint check, cleanup checkpoint migration
- node_agent: emit service_healthy for all running managed containers so
  observer populates services.json (previously empty → supervisor flooded
  action queue with missing_service redeploys for healthy services)
- node_agent: VPS-only _check_control_plane_health() probes the HTTP
  endpoint to emit service_healthy/unhealthy for the 'control-plane' logical
  service (multi-container stack, container names don't match service name)
- node_agent: fix _cleanup_control_plane_fs() to read new node_checkpoints
  format from observer checkpoint (was reading old last_processed_file key,
  always found nothing, never cleaned up old events)
- observer: handle service_healthy event type → sets service status healthy
  without resolving incidents (unlike service_recovered which also resolves)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-27 14:49:56 +02:00
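
The observer change described in the commit message above (service_healthy sets the status without resolving incidents, while service_recovered does both) can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the `WorldState` class, the `apply_event` function, and the `type` / `service` event keys are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class WorldState:
    """Hypothetical stand-in for what the observer persists (e.g. services.json)."""
    # service name -> status string ("healthy" / "unhealthy")
    services: Dict[str, str] = field(default_factory=dict)
    # names of services that currently have an open incident
    open_incidents: Set[str] = field(default_factory=set)


def apply_event(state: WorldState, event: dict) -> WorldState:
    """Apply one event to the world state.

    Assumes each event carries a 'type' and a 'service' key; the real
    event schema may differ.
    """
    kind = event["type"]
    service = event["service"]

    if kind == "service_healthy":
        # Routine heartbeat: record the status, leave incidents untouched.
        state.services[service] = "healthy"
    elif kind == "service_recovered":
        # Recovery: record the status and resolve any open incident.
        state.services[service] = "healthy"
        state.open_incidents.discard(service)
    elif kind == "service_unhealthy":
        # Assumption: only the status is updated here; incident creation
        # is left out of this sketch.
        state.services[service] = "unhealthy"
    return state
```

Keeping the two event types separate means a periodic health report from node_agent can populate the service list without silently closing an incident that someone still needs to look at.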
agent-system feat: add Copy for AI snapshot button to webui 2026-05-21 12:05:37 +02:00
control-plane fix(observer+operator-ui): fix stale world state, dict→list API, event time filter 2026-05-27 13:51:03 +02:00
forgejo Add node capability model 2026-05-11 20:46:50 +02:00
mosquitto Implement filesystem-first runtime event system 2026-05-12 13:38:25 +02:00
node-agent Fix service health tracking: emit service_healthy, control-plane endpoint check, cleanup checkpoint migration 2026-05-27 14:49:56 +02:00
npm Add node capability model 2026-05-11 20:46:50 +02:00
ollama Add node capability model 2026-05-11 20:46:50 +02:00
stability-agent Fix stability agent fleet deploy scripts 2026-05-17 21:09:06 +02:00
zigbee2mqtt Adapt zigbee2mqtt for SLZB coordinator 2026-05-14 16:37:18 +02:00
.gitkeep Add infrastructure standards and deployment conventions 2026-05-07 21:16:03 +02:00