# Stale State Semantics
In a local-first, filesystem-backed control plane, the console can only show what the runtime state files contain, so its accuracy depends on how recently those files were written.
## Detection
The Operator Control Plane monitors the modification time (mtime) of the `runtime-summary.json` file in `/opt/homelab/state/`.
- **Live**: The file was last modified 60 seconds ago or less.
- **Stale**: The file was last modified more than 60 seconds ago.
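
The mtime check described above can be sketched roughly as follows. This is an illustrative example, not the control plane's actual code: the function name `classify_state`, the constant names, and the choice to treat a missing file as stale are assumptions. Only the file path and the 60-second threshold come from this document.

```python
import os
import time

# Path and threshold as stated in this document.
STATE_FILE = "/opt/homelab/state/runtime-summary.json"
STALE_AFTER_SECONDS = 60


def classify_state(path=STATE_FILE, now=None, threshold=STALE_AFTER_SECONDS):
    """Return (status, age_seconds), where status is "live" or "stale".

    `now` is injectable so the check can be exercised without waiting.
    """
    if now is None:
        now = time.time()
    try:
        mtime = os.stat(path).st_mtime
    except FileNotFoundError:
        # Assumption: a missing state file counts as stale.
        # The document does not specify this case.
        return "stale", None
    age = now - mtime
    # "Within the last 60 seconds" is read as age <= threshold.
    status = "live" if age <= threshold else "stale"
    return status, age


if __name__ == "__main__":
    status, age = classify_state()
    print(status, age)
```

Comparing against mtime keeps the check independent of the file's contents, so it still works when the JSON is partially written or unreadable.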
## UI Representation
When the state is detected as stale:
1. A **Critical Warning Banner** appears at the top of the console.
2. The exact time of the last successful update is displayed.
3. Health badges and metrics continue to show the last known state, which may no longer match the live state, so treat them with caution.
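
As a rough illustration of the banner content listed above, the sketch below builds a warning string from the last update time. The function name `stale_banner`, the message wording, and the use of UTC ISO 8601 timestamps are all assumptions made for the example. The console's real rendering is not described in this document.

```python
from datetime import datetime, timezone


def stale_banner(last_update_epoch, now_epoch):
    """Build hypothetical banner text for a stale state.

    Both arguments are Unix timestamps in seconds.
    """
    last = datetime.fromtimestamp(last_update_epoch, tz=timezone.utc)
    age = int(now_epoch - last_update_epoch)
    return (
        "CRITICAL: runtime state is stale. "
        f"Last successful update: {last.isoformat()} ({age}s ago). "
        "Health badges and metrics show the last known state."
    )


if __name__ == "__main__":
    print(stale_banner(0, 120))
```

Showing both the absolute timestamp and the age lets an operator judge at a glance how far the displayed data may have drifted.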
## Causes of Staleness
- **Runtime Observer Failure**: The process responsible for writing state to the filesystem has crashed.
- **Node Isolation**: The node where the observer is running is offline or disconnected.
- **Filesystem Latency**: Issues with the underlying storage layer (e.g., SD card degradation).
## Operator Response
1. Check the **Nodes** view to determine whether a specific observer node is offline.
2. Investigate the `homelab-codex-ws` runtime logs.
3. Manually verify critical services if the control plane remains stale for an extended period.