Oskar Kapala 52607a7cdd feat(control-plane): shadow_mode for HA event auto-actions + deploy docs		2026-05-29 17:12:33 +02:00
- HA_DIAG_SHADOW_MODE env flag in supervisor (default true)
- shadow_mode downgrades container_restart actions to alert_only with [SHADOW MODE] note; same action_id and 30-min cooldown apply
- alert_only events unaffected (always routed normally)
- 3 new tests: shadow on/off for ha_websocket_dead, alert-only unaffected
- DEPLOY.md with token gen, per-host config, verification, 48h observation, production-mode enablement, rollback
- README.md updated with shadow mode flag summary and DEPLOY.md link
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

backups/zigbee	Add Zigbee coordinator backup	2026-05-14 18:24:26 +02:00
docs	docs: add planner-agent docs and session summary 2026-05-27	2026-05-27 22:35:59 +02:00
dotfiles	add shared zshrc	2026-05-10 20:52:44 +02:00
hosts	feat(ha-diag-agent): test environment with dual HA Docker instances	2026-05-29 12:56:13 +02:00
inventory	ops: align vps desired state with control-plane architecture, remove legacy agent-system references	2026-05-21 11:40:55 +02:00
scripts	Fix ghost service keys from hash-prefixed Docker container names	2026-05-27 15:41:13 +02:00
services	feat(control-plane): shadow_mode for HA event auto-actions + deploy docs	2026-05-29 17:12:33 +02:00
.codex	Document current homelab state	2026-04-15 17:37:25 +02:00
.gitignore	chore: gitignore *.egg-info, remove committed egg-info	2026-05-29 12:26:57 +02:00
CLAUDE.md	feat(control-plane): route ha-diag-agent events through supervisor	2026-05-29 15:59:23 +02:00
codex_context	Add session context state	2026-04-20 22:10:39 +02:00
codex_context.yaml	add shared context lock	2026-05-05 17:25:50 +02:00
deploy_agent.py	Add deploy escalation output	2026-04-22 22:08:26 +02:00
ollama_client.py	Initial shared homelab agent workspace	2026-05-03 19:37:40 +02:00
README.md	docs: add planner-agent docs and session summary 2026-05-27	2026-05-27 22:35:59 +02:00
start-aider.sh	Initial shared homelab agent workspace	2026-05-03 19:37:40 +02:00
start-codex.sh	Initial shared homelab agent workspace	2026-05-03 19:37:40 +02:00
sync-context.sh	add shared context lock	2026-05-05 17:25:50 +02:00
tech-debt.md	docs: add tech-debt.md, forgejo_runner temp disabled	2026-05-21 10:37:42 +02:00
update-context.md	Initial shared homelab agent workspace	2026-05-03 19:37:40 +02:00
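
The latest commit message describes shadow mode: with HA_DIAG_SHADOW_MODE enabled (the default), container_restart actions are downgraded to alert_only with a [SHADOW MODE] note, the action_id is kept, and alert_only events pass through unchanged. The following is a minimal sketch of that rule, not the supervisor's actual code. The `Action` dataclass, the function names, and the set of strings that disable the flag are assumptions made for illustration.

```python
import os
from dataclasses import dataclass, replace

SHADOW_NOTE = "[SHADOW MODE]"


def shadow_mode_enabled(env=None) -> bool:
    """Return True unless HA_DIAG_SHADOW_MODE is explicitly disabled."""
    env = os.environ if env is None else env
    # Assumption: these spellings turn the flag off; the real parser may differ.
    value = env.get("HA_DIAG_SHADOW_MODE", "true").strip().lower()
    return value not in {"0", "false", "no", "off"}


@dataclass(frozen=True)
class Action:
    """Hypothetical action shape; the real schema is not shown in the source."""
    action_id: str
    kind: str  # "container_restart" or "alert_only"
    note: str = ""


def apply_shadow_mode(action: Action, shadow: bool) -> Action:
    """Downgrade restarts to alerts in shadow mode; leave everything else alone."""
    if shadow and action.kind == "container_restart":
        note = f"{SHADOW_NOTE} {action.note}".strip()
        # action_id is untouched, so a cooldown keyed on it still applies.
        return replace(action, kind="alert_only", note=note)
    return action
```

Because the downgraded action keeps its action_id, any cooldown keyed on that id (30 minutes according to the commit message) behaves the same in both modes.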

README.md

Homelab Codex

GitOps-lite orchestration for a distributed homelab environment.

Architecture

The homelab consists of several nodes connected via a Tailscale internal mesh.

| Host | Role | Description |
| --- | --- | --- |
| SATURN | Primary Node | Development, orchestration, and git source of truth (commit node). |
| SOLARIA | Compute Node | GPU, inference, and heavy compute workloads. |
| PIHA | Infra Node | Core infrastructure services, automation, and monitoring. |
| VPS | Edge Node | Public ingress, reverse proxy, and edge services. |

Agent System

The homelab uses a multi-agent orchestration model with a human in the loop for destructive actions:

| Agent | Node | Role |
| --- | --- | --- |
| stability-agent | all nodes | Per-node watchdog: monitors Docker, disk, Tailscale, MQTT; emits events |
| node-agent | all nodes | Publishes container health events to Redis pub/sub |
| observer | VPS | Synthesizes world state from events into `/opt/homelab/world/*.json` |
| supervisor | VPS | Detects drift between desired and actual state; writes `pending` actions |
| planner-agent | SOLARIA | LLM-powered diagnosis: listens to Redis, proposes remediation actions |
| executor | VPS | Executes actions only after operator approval |
| operator-ui + telegram-bot | VPS / PIHA | Operator reviews and approves/rejects pending actions |
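
The supervisor's role (compare desired and actual state, then write `pending` actions) can be illustrated with a small sketch. This is not the supervisor's implementation: the state shape (service name mapped to a status string), the `detect_drift` name, and the fields of each action are assumptions chosen for the example.

```python
from __future__ import annotations


def detect_drift(desired: dict[str, str], actual: dict[str, str]) -> list[dict]:
    """Return one pending action per service whose actual status differs."""
    actions = []
    for service, want in sorted(desired.items()):
        # A service absent from the observed state is reported as "missing".
        have = actual.get(service, "missing")
        if have != want:
            actions.append(
                {
                    "service": service,
                    "desired": want,
                    "actual": have,
                    "status": "pending",  # nothing runs until an operator approves
                }
            )
    return actions


if __name__ == "__main__":
    desired = {"mqtt": "running", "traefik": "running"}
    actual = {"mqtt": "exited", "traefik": "running"}
    for action in detect_drift(desired, actual):
        print(action)
```

The function only proposes actions; it never executes anything, which matches the split between supervisor and executor in the table.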

Action approval flow: pending/ → operator approves → approved/ → executor runs.
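
The approval flow above can be sketched as a move between two directories. This is an illustration under assumptions, not the repository's code: it assumes each action is a JSON file named `<action_id>.json`, that `pending/` and `approved/` sit under a common root, and that `approve` and `run_approved` are hypothetical helper names.

```python
import json
from pathlib import Path


def approve(root: Path, action_id: str) -> Path:
    """Move an action from pending/ to approved/ (the operator's step)."""
    src = root / "pending" / f"{action_id}.json"
    dst_dir = root / "approved"
    dst_dir.mkdir(parents=True, exist_ok=True)
    dst = dst_dir / src.name
    # Raises FileNotFoundError if the action is not pending.
    src.replace(dst)
    return dst


def run_approved(root: Path, execute) -> list[str]:
    """Call execute() on every approved action; pending/ is never read."""
    done = []
    for path in sorted((root / "approved").glob("*.json")):
        action = json.loads(path.read_text())
        execute(action)
        done.append(path.stem)
    return done
```

Because `run_approved` only looks inside `approved/`, an action cannot run unless an operator has moved it there. A real executor would also need to record completed actions so they are not run twice; the sketch leaves that out.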

Repository Structure

docs/: Infrastructure Standards and Deployment Conventions.
hosts/: Host-specific configurations and service assignments.
services/: Reusable Docker Compose service definitions.
scripts/: Deployment and management scripts.

Getting Started

Standardization: Follow the Infrastructure Standards.
Deployment: See Deployment Conventions for how to roll out changes.
SATURN: Remember that SATURN is the only node where commits should be made.

Documentation Index

Infrastructure Standards
Agent Operating Procedures (For AI/Non-Human Agents)
Deployment Conventions
Hardware
Networking
Services
Node Capabilities
Action Model

Note: This repository documents the state of the homelab. Runtime state lives outside the repository in /opt/homelab.