- node_agent: emit service_healthy for all running managed containers so
observer populates services.json (previously empty → supervisor flooded
action queue with missing_service redeploys for healthy services)
- node_agent: VPS-only _check_control_plane_health() probes the HTTP
endpoint to emit service_healthy/unhealthy for the 'control-plane' logical
service (multi-container stack, container names don't match service name)
- node_agent: fix _cleanup_control_plane_fs() to read new node_checkpoints
format from observer checkpoint (was reading old last_processed_file key,
always found nothing, never cleaned up old events)
- observer: handle service_healthy event type → sets service status healthy
without resolving incidents (unlike service_recovered which also resolves)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>