VPS Control Plane

The VPS Control Plane is the orchestration brain of the homelab platform. It runs on the Hetzner VPS and provides observability, automated reconciliation, and a web-based operator interface.

Architecture

The control plane consists of four core services running as a Docker Compose stack:

  1. Observer: Synthesizes world state from events.
  2. Supervisor: Detects drift between desired and actual state.
  3. Executor: Executes approved actions from the queue.
  4. Operator UI: Web interface for system monitoring and action approval.

All services adhere to filesystem-first semantics, using /opt/homelab/ as the primary data exchange and persistence layer.

Deployment Flow

1. Prerequisites

  • Target VPS node must be onboarded (Tailscale active, Docker installed).
  • Repository cloned to /home/oskar/homelab-codex-ws.

2. Bootstrap

On the VPS, from the repository root, run the local deployment script to initialize the runtime filesystem and start the stack:

cd services/control-plane
bash deploy-local.sh
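
The internals of deploy-local.sh are not shown in this document. As a rough sketch only, initializing the runtime filesystem could amount to creating the directories referenced elsewhere on this page. The function name init_runtime_dirs and the exact set of subdirectories are assumptions, not the script's actual behavior.

```shell
# Hypothetical helper: create the runtime directories this page mentions
# (events, world, state, logs) under a root that defaults to /opt/homelab.
init_runtime_dirs() {
  root="${1:-/opt/homelab}"
  for sub in events world state logs; do
    # -p makes the call idempotent: existing directories are left untouched.
    mkdir -p "$root/$sub" || return 1
  done
  echo "runtime directories ready under $root"
}

# Usage (on the VPS; may need sudo depending on ownership of /opt/homelab):
#   init_runtime_dirs /opt/homelab
```

Because mkdir -p never fails on an existing directory, a helper like this can be re-run safely before every restart.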

3. Verification

Verify that the stack is healthy, either with the deployment script or by checking container status directly on the VPS:

# Check status via deploy script
./scripts/deploy/deploy-control-plane.sh --ssh

# Manual status check on VPS
docker ps --filter "name=control-plane"
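
The manual check above can be turned into a pass/fail test. The following is a minimal sketch, assuming every container in the stack has "control-plane" in its name; report_unhealthy is a hypothetical helper, while -a, --filter, and --format are standard docker ps options.

```shell
# Read "name<TAB>status" lines on stdin (as produced by the docker ps call
# in the usage note), print every container that is not up or that reports
# itself unhealthy, and return non-zero if any was found.
report_unhealthy() {
  bad=0
  tab="$(printf '\t')"
  while IFS="$tab" read -r name status; do
    [ -n "$name" ] || continue
    case "$status" in
      Up*unhealthy*) echo "unhealthy: $name ($status)"; bad=1 ;;
      Up*) ;;  # running (and not flagged unhealthy by a health check)
      *) echo "not running: $name ($status)"; bad=1 ;;
    esac
  done
  return "$bad"
}

# Usage on the VPS (-a includes stopped containers, which plain docker ps hides):
#   docker ps -a --filter "name=control-plane" \
#     --format '{{.Names}}\t{{.Status}}' | report_unhealthy
```

Keeping the parsing separate from the docker call means the function can be exercised with canned input, and its exit code can gate later steps in a script.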

Operational Workflows

Action Approval

  1. Access the Operator UI (via Tailscale IP or Nginx Proxy Manager).
  2. Navigate to Action Queue.
  3. Review Pending actions recommended by the Supervisor.
  4. Click Approve to move actions to the execution queue.

Recovery Flow

In case of control plane failure:

  1. Check the container logs with docker logs (or docker compose logs for the whole stack).
  2. Restart the stack with the local deployment script: bash deploy-local.sh.
  3. Rebuild world state: delete /opt/homelab/state/observer_checkpoint.json and redeploy.
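
The checkpoint reset in step 3 can be scripted. The sketch below deliberately differs from the documented step: instead of deleting the checkpoint it moves it aside under a timestamped name, so the old file can still be inspected. That variation, the helper name reset_observer_checkpoint, and the backup naming are assumptions; whether the Observer ignores such backup files has not been verified here.

```shell
# Hypothetical helper: move the Observer checkpoint out of the way so that
# the next deploy rebuilds world state. $1 is the runtime root
# (default: /opt/homelab).
reset_observer_checkpoint() {
  root="${1:-/opt/homelab}"
  ckpt="$root/state/observer_checkpoint.json"
  if [ ! -f "$ckpt" ]; then
    echo "no checkpoint at $ckpt; nothing to reset"
    return 0
  fi
  backup="$ckpt.bak.$(date +%Y%m%dT%H%M%S)"
  mv "$ckpt" "$backup" && echo "moved $ckpt to $backup"
}

# Usage on the VPS, followed by the redeploy from step 2:
#   reset_observer_checkpoint /opt/homelab && bash deploy-local.sh
```

To follow the documented procedure exactly, replace the mv line with rm "$ckpt".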

Upgrade Flow

To deploy updates from the SOLARIA/control host:

./scripts/deploy/deploy-control-plane.sh --ssh

Rollback Semantics

Because the runtime is filesystem-first and append-only, a rollback reverts the desired state in the repository rather than rewriting runtime data:

  1. Roll back the repository state to a previous commit.
  2. Restart the control plane stack.
  3. The Supervisor will detect drift against the older (rolled-back) desired state and recommend actions to restore it.
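
The first two steps can be sketched with standard git commands. This is a minimal sketch that assumes the rollback target is given as a commit hash or tag and that checking it out in detached-HEAD state is acceptable. The function name rollback_repo is hypothetical, and the restart in the usage note reuses the bootstrap script from the Deployment Flow section.

```shell
# Hypothetical helper: check out an earlier commit of the repository.
# Refuses to run when the working tree has uncommitted changes, so that
# local edits are not silently carried into the rolled-back state.
rollback_repo() {
  repo="$1"
  commit="$2"
  if [ -n "$(git -C "$repo" status --porcelain)" ]; then
    echo "working tree at $repo has uncommitted changes; aborting" >&2
    return 1
  fi
  # --detach leaves branches untouched; HEAD simply points at the old commit.
  git -C "$repo" checkout --quiet --detach "$commit"
}

# Usage on the VPS (replace <commit> with the target revision):
#   rollback_repo /home/oskar/homelab-codex-ws <commit> &&
#     (cd /home/oskar/homelab-codex-ws/services/control-plane && bash deploy-local.sh)
```

A detached checkout keeps the branch history intact, so returning to the newer state later is a matter of checking out the branch again and restarting the stack.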

Runtime Safety

  • Read-only Mounts: Most services mount the repository as :ro to prevent accidental mutations.
  • Least Privilege: UI, Observer, and Supervisor run as the non-root homelab user (UID 1000).
  • Filesystem Isolation: Clear separation between /repo (code/inventory) and /opt/homelab (runtime state).

Integration

Nginx Proxy Manager

Configure a proxy host in NPM that points to http://control-plane-ui:8080. Enable WebSocket support on the proxy host if the UI uses WebSockets.

Log Locations

  • Container logs: docker compose logs
  • Runtime events: /opt/homelab/events/YYYY-MM-DD/
  • World state: /opt/homelab/world/
  • Diagnostics: /opt/homelab/logs/
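
As a small convenience, the dated event directory can be computed rather than typed by hand. The sketch below assumes the directory names follow the local date of the VPS (this document does not say whether the dates are local or UTC); event_dir is a hypothetical helper name.

```shell
# Hypothetical helper: print the event directory for a given day.
#   $1  day in YYYY-MM-DD form (default: today, in the machine's local time)
#   $2  runtime root (default: /opt/homelab)
event_dir() {
  day="${1:-$(date +%Y-%m-%d)}"
  root="${2:-/opt/homelab}"
  printf '%s/events/%s\n' "$root" "$day"
}

# Usage: list today's events, or those of a specific day.
#   ls -l "$(event_dir)"
#   ls -l "$(event_dir 2024-05-01)"
```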