docs: supplement documentation for AI agents
Co-authored-by: Junie <junie@jetbrains.com>
parent f65698925e
commit 8a12b7ff17
|
|
@@ -29,10 +29,13 @@ The homelab consists of several nodes connected via a Tailscale internal mesh.
|
||||||
## Documentation Index
|
## Documentation Index
|
||||||
|
|
||||||
- [Infrastructure Standards](docs/standards.md)
|
- [Infrastructure Standards](docs/standards.md)
|
||||||
|
- [Agent Operating Procedures](docs/agents.md) (For AI/Non-Human Agents)
|
||||||
- [Deployment Conventions](docs/deployment.md)
|
- [Deployment Conventions](docs/deployment.md)
|
||||||
- [Hardware](docs/hardware.md)
|
- [Hardware](docs/hardware.md)
|
||||||
- [Networking](docs/networking.md)
|
- [Networking](docs/networking.md)
|
||||||
- [Services](docs/services.md)
|
- [Services](docs/services.md)
|
||||||
|
- [Node Capabilities](docs/capabilities.md)
|
||||||
|
- [Action Model](services/agent-system/action-model.md)
|
||||||
|
|
||||||
---
|
---
|
||||||
*Note: This repository documents the state of the homelab. Runtime state lives outside the repository in `/opt/homelab`.*
|
*Note: This repository documents the state of the homelab. Runtime state lives outside the repository in `/opt/homelab`.*
|
||||||
|
|
|
||||||
49
docs/agents.md
Normal file
|
|
@@ -0,0 +1,49 @@
|
||||||
|
# Agent Operating Procedures
|
||||||
|
|
||||||
|
This document defines the operating procedures, constraints, and interaction protocols for non-human agents (AI agents, autonomous scripts) within the Homelab Codex ecosystem.
|
||||||
|
|
||||||
|
## 1. Core Principles for Agents
|
||||||
|
|
||||||
|
1. **Read-Only by Default**: Agents should assume read-only access to the `/opt/homelab` runtime unless explicitly executing an approved action.
|
||||||
|
2. **Git as Authority**: The repository on **SATURN** is the source of truth. Agents must not modify the runtime state on nodes directly without corresponding (or pending) Git state, unless it's an emergency mitigation.
|
||||||
|
3. **Human-in-the-Loop (HIL)**: All destructive or structural changes (restarts, deployments, config changes) must follow the [Action Approval Model](../services/agent-system/action-model.md).
|
||||||
|
4. **Idempotency**: All scripts and actions proposed or executed by agents MUST be idempotent.
|
||||||
|
5. **Context-Awareness**: Agents MUST read the `README.md` and `docs/agents.md` at the start of every session to align with current infrastructure standards.
|
||||||
|
|
||||||
|
## 2. Agent Roles
|
||||||
|
|
||||||
|
| Role | Responsibility | Scope |
|
||||||
|
|------|----------------|-------|
|
||||||
|
| **Observer** | Monitors health, logs, and events. | Read-only access to `/opt/homelab/events` and `logs`. |
|
||||||
|
| **Stability Agent** | Local node watchdog, event emitter. | Local node runtime, `service.yaml` healthchecks. |
|
||||||
|
| **Orchestrator** | High-level planning, workload placement. | Repository-wide, multi-node topology. |
|
||||||
|
| **Materializer** | Translates high-level intent into Docker/System state. | Execution of `approved` actions. |
|
||||||
|
|
||||||
|
## 3. Discovery Protocol
|
||||||
|
|
||||||
|
Agents must use the following entry points to understand the system:
|
||||||
|
|
||||||
|
1. **Topology**: `inventory/topology.yaml` for node list and roles.
|
||||||
|
2. **Capabilities**: `hosts/<node>/capabilities.yaml` to understand hardware/software constraints.
|
||||||
|
3. **Service Contract**: `services/<service>/service.yaml` to understand how to check health and manage a service.
|
||||||
|
4. **Operational State**: `/opt/homelab/state/` on local nodes for real-time status.
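
The four entry points above can be resolved programmatically before an agent reads anything. The following is a minimal sketch, not part of the commit: the function names, the `repo_root` argument, and the dictionary keys are hypothetical, and only the path layout is taken from the list above.

```python
from pathlib import Path


def discovery_entry_points(repo_root, node, service, runtime_root="/opt/homelab"):
    """Return the discovery paths for one node and one service.

    The layout mirrors the Discovery Protocol list; the key names are
    illustrative and not defined by the repository.
    """
    repo = Path(repo_root)
    return {
        "topology": repo / "inventory" / "topology.yaml",
        "capabilities": repo / "hosts" / node / "capabilities.yaml",
        "service_contract": repo / "services" / service / "service.yaml",
        "operational_state": Path(runtime_root) / "state",
    }


def missing_entry_points(entry_points):
    """List the entry points that do not exist on this machine."""
    return [name for name, path in entry_points.items() if not path.exists()]
```

An agent could call `missing_entry_points(...)` at session start and refuse to plan when the topology or the service contract is absent.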
|
||||||
|
|
||||||
|
## 4. Interaction with Humans
|
||||||
|
|
||||||
|
Agents communicate with the operator via the `agent-system/telegram-bot`.
|
||||||
|
|
||||||
|
- **Alerting**: Agents emit events to the event system. Critical events are forwarded to Telegram.
|
||||||
|
- **Proposals**: When an agent identifies a need for change (e.g., "Service X is failing, suggest restart"), it creates a `pending` action in `/opt/homelab/actions/pending/`.
|
||||||
|
- **Approval**: Agents must wait for the action status to transition to `approved` before execution.
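
The proposal step could look roughly like the sketch below. It is an illustration under stated assumptions: the `propose_action` helper and the `id` and `status` fields are hypothetical, since the full action schema is not shown here. Only the `pending/` directory, the `{action_id}.json` naming, and the `details.reason` and `details.diff` fields come from the documentation in this commit.

```python
import json
import os
import tempfile
import uuid
from pathlib import Path


def propose_action(reason, diff="", actions_root="/opt/homelab/actions"):
    """Write a pending action file and return its path.

    Field names other than details.reason and details.diff are assumptions.
    """
    action_id = uuid.uuid4().hex
    pending_dir = Path(actions_root) / "pending"
    pending_dir.mkdir(parents=True, exist_ok=True)

    payload = {
        "id": action_id,        # hypothetical field
        "status": "pending",    # hypothetical field, mirrors the directory name
        "details": {"reason": reason, "diff": diff},
    }

    # Write to a temporary file first, then rename it into place, so that a
    # reader never sees a half-written JSON document.
    fd, tmp_name = tempfile.mkstemp(dir=pending_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(payload, handle, indent=2)
    target = pending_dir / f"{action_id}.json"
    os.replace(tmp_name, target)
    return target
```

The agent only creates the file; it does not execute anything until the operator has moved the action to `approved`.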
|
||||||
|
|
||||||
|
## 5. Decision Logic (Reasoning)
|
||||||
|
|
||||||
|
When making decisions, agents MUST prioritize:
|
||||||
|
1. **Safety**: Do not violate power constraints (see `capabilities.yaml`).
|
||||||
|
2. **Stability**: Prefer keeping services on their `owner_node` unless it's down.
|
||||||
|
3. **Connectivity**: On intermittent nodes (CHELSTY), avoid actions requiring heavy WAN traffic during low-signal periods.
|
||||||
|
|
||||||
|
## 6. Access Control for Agents
|
||||||
|
|
||||||
|
- **Filesystem**: Agents should run as the `homelab` user or equivalent with restricted sudo access to `docker compose`.
|
||||||
|
- **Secrets**: Agents MUST NOT attempt to read `.env` files unless specifically tasked with credential rotation. They should treat secrets as opaque handles.
|
||||||
|
|
@@ -83,3 +83,10 @@ Future autonomous agents will use this metadata to:
|
||||||
2. **Generate Plans:** Create step-by-step deployment or migration plans based on hardware compatibility.
|
2. **Generate Plans:** Create step-by-step deployment or migration plans based on hardware compatibility.
|
||||||
3. **Validate Topology:** Ensure that a proposed multi-node setup doesn't violate networking or operational constraints (e.g., don't put a DB on an intermittent node).
|
3. **Validate Topology:** Ensure that a proposed multi-node setup doesn't violate networking or operational constraints (e.g., don't put a DB on an intermittent node).
|
||||||
4. **Propose Failover:** Automatically suggest the best alternative node during an outage.
|
4. **Propose Failover:** Automatically suggest the best alternative node during an outage.
|
||||||
|
|
||||||
|
## Agent Reasoning Logic
|
||||||
|
|
||||||
|
When an agent parses `capabilities.yaml`, it should apply these heuristics:
|
||||||
|
- **Intermittent Connectivity**: If `operational.connectivity == "intermittent"`, do not schedule high-bandwidth syncs or critical cloud-dependent services.
|
||||||
|
- **Power Constraints**: If `operational.power_constraint == "low-power"`, avoid heavy LLM inference or continuous high-CPU tasks.
|
||||||
|
- **Availability Target**: If `availability_target == "high"`, this node is a candidate for hosting control-plane failovers.
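
As an illustration, the three heuristics can be expressed as a small pure function over the parsed YAML. This is a sketch only: the `placement_hints` name and the returned keys are hypothetical, and it assumes `availability_target` is a top-level key while `connectivity` and `power_constraint` sit under `operational`, as the expressions above suggest.

```python
def placement_hints(capabilities):
    """Derive scheduling hints from a parsed capabilities.yaml mapping.

    The input is a plain dict, for example the result of loading the YAML
    file with a YAML parser. Missing keys simply yield False.
    """
    operational = capabilities.get("operational") or {}
    return {
        # Intermittent nodes should not get high-bandwidth syncs or
        # critical cloud-dependent services.
        "avoid_high_bandwidth": operational.get("connectivity") == "intermittent",
        # Low-power nodes should not get heavy LLM inference or
        # continuous high-CPU tasks.
        "avoid_heavy_compute": operational.get("power_constraint") == "low-power",
        # Highly available nodes may host control-plane failovers.
        "failover_candidate": capabilities.get("availability_target") == "high",
    }
```

Keeping the function free of I/O makes it easy to unit-test against sample capability files.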
|
||||||
|
|
|
||||||
|
|
@@ -49,9 +49,10 @@ Runtime state must live outside the repository to keep it immutable and clean.
|
||||||
## Service Standards
|
## Service Standards
|
||||||
|
|
||||||
1. **Normalization**: Every service MUST follow the `services/<service>/` layout.
|
1. **Normalization**: Every service MUST follow the `services/<service>/` layout.
|
||||||
2. **Metadata**: Every service MUST have a `service.yaml` defining its operational contract.
|
2. **Metadata**: Every service MUST have a `service.yaml` defining its operational contract. This is the primary source of truth for AI agents.
|
||||||
3. **Healthchecks**: Every service MUST have a `healthcheck.sh` for verification.
|
3. **Healthchecks**: Every service MUST have a `healthcheck.sh` for verification. Agents use this to emit stability events.
|
||||||
4. **Secrets**: NEVER commit secrets to Git. Use `env.example` as a template and populate `/opt/homelab/config/<service>/.env` on the host.
|
4. **Actionability**: Any automated recovery action proposed by an agent must be backed by a `service.yaml` definition.
|
||||||
|
5. **Secrets**: NEVER commit secrets to Git. Use `env.example` as a template and populate `/opt/homelab/config/<service>/.env` on the host. Agents must treat these as "black box" configurations.
|
||||||
|
|
||||||
## Docker Compose Standards
|
## Docker Compose Standards
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@@ -13,3 +13,22 @@ services:
|
||||||
config_path: /opt/homelab/config/stability-agent
|
config_path: /opt/homelab/config/stability-agent
|
||||||
data_path: /opt/homelab/state
|
data_path: /opt/homelab/state
|
||||||
logs_path: /opt/homelab/events
|
logs_path: /opt/homelab/events
|
||||||
|
|
||||||
|
control-plane:
|
||||||
|
role: management-and-orchestration
|
||||||
|
deployment_model: docker-compose
|
||||||
|
exposure: tailscale-internal
|
||||||
|
offline_required: false
|
||||||
|
depends_on:
|
||||||
|
local:
|
||||||
|
- stability-agent
|
||||||
|
external:
|
||||||
|
- piha:agent-system-redis
|
||||||
|
ports:
|
||||||
|
- name: http
|
||||||
|
container_port: 18180
|
||||||
|
protocol: tcp
|
||||||
|
runtime:
|
||||||
|
config_path: /opt/homelab/config/control-plane
|
||||||
|
data_path: /opt/homelab/data/control-plane
|
||||||
|
logs_path: /opt/homelab/logs/control-plane
|
||||||
|
|
|
||||||
|
|
@@ -3,13 +3,20 @@
|
||||||
Actions are JSON files stored in `/opt/homelab/actions/{status}/{action_id}.json`.
|
Actions are JSON files stored in `/opt/homelab/actions/{status}/{action_id}.json`.
|
||||||
|
|
||||||
#### Statuses
|
#### Statuses
|
||||||
- `pending`: Waiting for operator approval.
|
- `pending`: Waiting for operator approval. AI agents create actions in this state.
|
||||||
- `approved`: Approved by operator, ready for execution.
|
- `approved`: Approved by operator, ready for execution.
|
||||||
- `rejected`: Rejected by operator, will not be executed.
|
- `rejected`: Rejected by operator, will not be executed.
|
||||||
- `running`: Currently being executed by an agent.
|
- `running`: Currently being executed by an agent (e.g. `materializer`).
|
||||||
- `completed`: Successfully executed.
|
- `completed`: Successfully executed.
|
||||||
- `failed`: Execution failed.
|
- `failed`: Execution failed.
|
||||||
|
|
||||||
|
#### Human-in-the-Loop (HIL) Protocol
|
||||||
|
1. **Request**: Agent identifies a required change and writes a JSON to `actions/pending/`.
|
||||||
|
2. **Notification**: System notifies the human operator.
|
||||||
|
3. **Audit**: Human reviews `details.reason` and `details.diff`.
|
||||||
|
4. **Authorization**: Human moves file to `approved/`.
|
||||||
|
5. **Execution**: Agent monitors `approved/` and executes the task.
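
The execution step of the protocol can be sketched as a directory-based state machine. The code below is an assumption-laden illustration, not the actual materializer: `run_approved_actions` and the `execute` callback are hypothetical names, and it assumes that all status directories live on the same filesystem so that a rename is atomic.

```python
import os
from pathlib import Path


def _move(path, dest_dir):
    """Move an action file into another status directory and return its new path."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / path.name
    os.replace(path, target)
    return target


def run_approved_actions(actions_root, execute):
    """Execute every approved action once and record the outcome.

    `execute` is a caller-supplied function that receives the path of the
    action file while it sits in running/ and raises on failure.
    """
    root = Path(actions_root)
    results = {}
    for approved in sorted((root / "approved").glob("*.json")):
        running = _move(approved, root / "running")
        try:
            execute(running)
        except Exception:
            final_status = "failed"
        else:
            final_status = "completed"
        _move(running, root / final_status)
        results[approved.stem] = final_status
    return results
```

Because a file leaves `approved/` before execution starts, a second pass over the directory does not pick up the same action again, which keeps the loop in line with the idempotency principle.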
|
||||||
|
|
||||||
#### Schema
|
#### Schema
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
|
|
|
||||||