Agent Operating Procedures
This document defines the operating procedures, constraints, and interaction protocols for non-human agents (AI agents, autonomous scripts) within the Homelab Codex ecosystem.
1. Core Principles for Agents
- Read-Only by Default: Agents should assume read-only access to the /opt/homelab runtime unless explicitly executing an approved action.
- Git as Authority: The repository on SATURN is the source of truth. Agents must not modify the runtime state on nodes directly without corresponding (or pending) Git state, unless it's an emergency mitigation.
- Human-in-the-Loop (HIL): All destructive or structural changes (restarts, deployments, config changes) must follow the Action Approval Model.
- Idempotency: All scripts and actions proposed or executed by agents MUST be idempotent.
- Context-Awareness: Agents MUST read README.md and docs/agents.md at the start of every session to align with current infrastructure standards.
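As an illustration of the idempotency principle above, here is a minimal sketch in Python. The helper name `ensure_line` and the file it edits are hypothetical and not part of the Homelab Codex; the point is only that the function describes a desired end state, so running it a second time changes nothing.

```python
from pathlib import Path


def ensure_line(path: Path, line: str) -> bool:
    """Make sure `line` is present in `path` (hypothetical helper).

    Returns True if the file was changed, False if it was already in the
    desired state. Calling it repeatedly converges on the same content.
    """
    text = path.read_text() if path.exists() else ""
    if line in text.splitlines():
        return False  # already in the desired state: do nothing

    path.parent.mkdir(parents=True, exist_ok=True)
    # Start on a fresh line if the existing file lacks a trailing newline.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with path.open("a") as fh:
        fh.write(f"{prefix}{line}\n")
    return True
```

The boolean return value lets a caller report "changed" versus "already OK" without having to diff the file itself.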
2. Agent Roles
| Role | Responsibility | Scope |
|---|---|---|
| Observer | Monitors health, logs, and events. | Read-only access to /opt/homelab/events and logs. |
| Stability Agent | Local node watchdog, event emitter. | Local node runtime, service.yaml healthchecks. |
| Orchestrator | High-level planning, workload placement. | Repository-wide, multi-node topology. |
| Materializer | Translates high-level intent into Docker/System state. | Execution of approved actions. |
3. Discovery Protocol
Agents must use the following entry points to understand the system:
- Topology: inventory/topology.yaml for node list and roles.
- Capabilities: hosts/<node>/capabilities.yaml to understand hardware/software constraints.
- Service Contract: services/<service>/service.yaml to understand how to check health and manage a service.
- Operational State: /opt/homelab/state/ on local nodes for real-time status.
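The entry points above could be resolved by an agent roughly as follows. This is a sketch under assumptions: it supposes the three YAML paths are relative to the repository root, and the function names, the `repo_root` argument, and the example node and service names are hypothetical.

```python
from pathlib import Path


def discovery_paths(repo_root, node, service, runtime_root="/opt/homelab"):
    """Build the discovery entry points for one node and one service.

    Assumption: the YAML files live relative to the repository root.
    """
    repo = Path(repo_root)
    return {
        "topology": repo / "inventory" / "topology.yaml",
        "capabilities": repo / "hosts" / node / "capabilities.yaml",
        "service_contract": repo / "services" / service / "service.yaml",
        "operational_state": Path(runtime_root) / "state",
    }


def missing_entry_points(paths):
    """Return the names of entry points that do not exist on this machine."""
    return sorted(name for name, p in paths.items() if not p.exists())
```

An agent might call `missing_entry_points(discovery_paths(".", "saturn", "example-service"))` at session start and refuse to plan anything if the result is non-empty.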
4. Interaction with Humans
Agents communicate with the operator via the agent-system/telegram-bot.
- Alerting: Agents emit events to the event system. Critical events are forwarded to Telegram.
- Proposals: When an agent identifies a need for change (e.g., "Service X is failing, suggest restart"), it creates a pending action in /opt/homelab/actions/pending/.
- Approval: Agents must wait for the action status to transition to approved before execution.
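The propose-then-wait flow described above might look like the sketch below. The on-disk action format is not specified in this document, so the JSON layout, the `status` field living inside the same file, and all function names are assumptions made for illustration only.

```python
import json
import time
from pathlib import Path


def propose_action(pending_dir, action_id, summary):
    """Write a pending action file (assumed JSON format) if it is not there yet."""
    pending_dir = Path(pending_dir)
    pending_dir.mkdir(parents=True, exist_ok=True)
    action_file = pending_dir / f"{action_id}.json"
    if not action_file.exists():  # re-proposing the same action is a no-op
        record = {"id": action_id, "summary": summary, "status": "pending"}
        action_file.write_text(json.dumps(record, indent=2))
    return action_file


def read_status(action_file):
    """Read the assumed `status` field from an action file."""
    return json.loads(Path(action_file).read_text())["status"]


def wait_for_approval(action_file, timeout_s=300.0, poll_s=5.0):
    """Poll until the action is approved; return False if the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if read_status(action_file) == "approved":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

The agent executes only when `wait_for_approval` returns True; a False result means the action stays unexecuted and the proposal remains visible to the operator.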
5. Decision Logic (Reasoning)
When making decisions, agents MUST prioritize:
- Safety: Do not violate power constraints (see capabilities.yaml).
- Stability: Prefer keeping services on their owner_node unless it's down.
- Connectivity: On intermittent nodes (CHELSTY), avoid actions requiring heavy WAN traffic during low-signal periods.
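One way to read the priorities above is as an ordered filter: safety is a hard constraint, stability is the first preference, and connectivity breaks ties among the remaining nodes. The sketch below encodes that reading. Treating the list as strictly ordered is an assumption, as are the field names `power_headroom_w`, `power_draw_w`, `intermittent`, `low_signal`, and `heavy_wan`; only `owner_node` comes from this document.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    up: bool
    power_headroom_w: float   # hypothetical: spare power budget on the node
    intermittent: bool = False
    low_signal: bool = False


@dataclass
class Workload:
    name: str
    owner_node: str
    power_draw_w: float       # hypothetical: expected power cost
    heavy_wan: bool = False


def choose_node(workload: Workload, nodes: List[Node]) -> Optional[str]:
    # 1. Safety: discard nodes that are down or would exceed their power budget.
    safe = [n for n in nodes
            if n.up and n.power_headroom_w >= workload.power_draw_w]

    # 2. Stability: keep the service on its owner node whenever that is safe.
    for n in safe:
        if n.name == workload.owner_node:
            return n.name

    # 3. Connectivity: among the rest, prefer nodes where heavy WAN traffic
    #    would not hit an intermittent link during a low-signal period.
    def wan_risk(n: Node) -> bool:
        return workload.heavy_wan and n.intermittent and n.low_signal

    safe.sort(key=wan_risk)  # stable sort: risk-free nodes come first
    return safe[0].name if safe else None
```

Returning `None` when no node is safe leaves the decision to the operator instead of forcing a placement that violates a power constraint.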
6. Access Control for Agents
- Filesystem: Agents should run as the homelab user or equivalent with restricted sudo access to docker compose.
- Secrets: Agents MUST NOT attempt to read .env files unless specifically tasked with credential rotation. They should treat secrets as opaque handles.
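The secrets rule above can be enforced in an agent's own file-access layer with a small guard. This is only a sketch: the function name `may_read`, the task label `"credential_rotation"`, and the filename patterns treated as secret files are assumptions, not part of the documented system.

```python
from pathlib import Path

# Hypothetical task label that unlocks access to secret files.
ROTATION_TASK = "credential_rotation"


def may_read(path, task=""):
    """Return True if an agent working on `task` may read `path`.

    Files named `.env`, `.env.<suffix>`, or `<name>.env` are treated as
    secret files (an assumed, deliberately conservative pattern).
    """
    name = Path(path).name
    is_env = name == ".env" or name.startswith(".env.") or name.endswith(".env")
    return (not is_env) or task == ROTATION_TASK
```

An agent would call `may_read` before opening any file and skip, not fail on, paths that return False, so secrets stay opaque during ordinary discovery.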