1.3 KiB
1.3 KiB
Stale State Semantics
In a local-first, filesystem-backed control plane, visibility depends on the freshness of the runtime state files.
Detection
The Operator Control Plane monitors the modification time (mtime) of the runtime-summary.json file in /opt/homelab/state/.
- Live: If the file was updated within the last 60 seconds.
- Stale: If the file is older than 60 seconds.
UI Representation
When the state is detected as stale:
- A Critical Warning Banner appears at the top of the console.
- The exact time of the last successful update is displayed.
- Health badges and metrics should be treated with caution as they represent the last known good state, not necessarily the current live state.
Causes of Staleness
- Runtime Observer Failure: The process responsible for writing state to the filesystem has crashed.
- Node Isolation: The node where the observer is running is offline or disconnected.
- Filesystem Latency: Issues with the underlying storage layer (e.g., SD card degradation).
Operator Response
- Check the Nodes view to identify if a specific observer node is offline.
- Investigate the
homelab-codex-wsruntime logs. - Manually verify critical services if the control plane remains stale for an extended period.