Action Queue System

The Action Queue System provides a safe, filesystem-first lifecycle for operational actions in the homelab platform. It enables controlled execution with mandatory approval for high-risk operations.

Action Lifecycle

Actions move through six states, each represented by a directory under /opt/homelab/actions/:

Pending (pending/): Actions proposed by the Supervisor or other agents.
Approved (approved/): Actions that have been reviewed and approved for execution.
Running (running/): Actions currently being processed by the Executor.
Completed (completed/): Successfully executed actions.
Failed (failed/): Actions that encountered errors during execution.
Rejected (rejected/): Proposed actions that were explicitly denied.
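
Because each state is a directory, a state change can be modeled as a file move. The following is a minimal sketch of that idea, not the platform's actual code. The `<action_id>.json` file naming, the `ALLOWED` transition table, and the helper names `init_queue` and `transition` are all assumptions made for illustration; the list above names the states but does not spell out which transitions are legal.

```python
import os
from pathlib import Path

STATES = ("pending", "approved", "running", "completed", "failed", "rejected")

# Assumed transition table: inferred from the state descriptions, not confirmed.
ALLOWED = {
    "pending": {"approved", "rejected"},
    "approved": {"running"},
    "running": {"completed", "failed"},
}


def init_queue(root: Path) -> None:
    """Create one directory per lifecycle state under root."""
    for state in STATES:
        (root / state).mkdir(parents=True, exist_ok=True)


def transition(root: Path, action_id: str, src: str, dst: str) -> Path:
    """Move an action file from the src state directory to the dst one."""
    if dst not in ALLOWED.get(src, set()):
        raise ValueError(f"illegal transition {src} -> {dst}")
    source = root / src / f"{action_id}.json"  # hypothetical file naming
    target = root / dst / f"{action_id}.json"
    # os.replace is atomic when both paths are on the same filesystem,
    # so an action is never visible in two states at once.
    os.replace(source, target)
    return target
```

With a queue rooted at /opt/homelab/actions/, `transition(root, action_id, "pending", "approved")` would correspond to an approval, and any transition missing from the table raises `ValueError`.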

Action Schema

Actions are stored as JSON documents with the following structure:

{
  "action_id": "uuid",
  "created_at": 1620000000.0,
  "proposed_by": "supervisor",
  "correlation_id": "uuid",
  "node": "node-name",
  "service": "service-name",
  "action_type": "redeploy_service",
  "risk_level": "guarded",
  "confidence": 0.9,
  "approval_required": true,
  "autonomous_eligible": false,
  "status": "pending",
  "payload": { ... },
  "rollback_reference": null
}
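
A structural check can catch malformed documents before they enter the queue. The sketch below validates a parsed action against the fields shown above. The field types are read off the example values, while the allowed `risk_level` and `status` values and the 0 to 1 range for `confidence` are assumptions drawn from the rest of this document rather than a published specification. The name `validate_action` is hypothetical.

```python
# Field names come from the schema above; types are inferred from the example.
FIELD_TYPES = {
    "action_id": str,
    "created_at": (int, float),
    "proposed_by": str,
    "correlation_id": str,
    "node": str,
    "service": str,
    "action_type": str,
    "risk_level": str,
    "confidence": (int, float),
    "approval_required": bool,
    "autonomous_eligible": bool,
    "status": str,
    "payload": dict,
}

# Assumed value sets: safety classes and lifecycle directory names, lowercased.
RISK_LEVELS = {"safe", "guarded", "dangerous"}
STATUSES = {"pending", "approved", "running", "completed", "failed", "rejected"}


def validate_action(doc: dict) -> list[str]:
    """Return a list of problems; an empty list means the document looks valid."""
    errors = []
    for name, expected in FIELD_TYPES.items():
        if name not in doc:
            errors.append(f"missing field: {name}")
        elif not isinstance(doc[name], expected):
            errors.append(f"wrong type for field: {name}")
    # rollback_reference must be present but may be null (None).
    if "rollback_reference" not in doc:
        errors.append("missing field: rollback_reference")
    risk = doc.get("risk_level")
    if isinstance(risk, str) and risk not in RISK_LEVELS:
        errors.append(f"unknown risk_level: {risk}")
    status = doc.get("status")
    if isinstance(status, str) and status not in STATUSES:
        errors.append(f"unknown status: {status}")
    confidence = doc.get("confidence")
    if isinstance(confidence, (int, float)) and not 0.0 <= confidence <= 1.0:
        errors.append("confidence outside [0, 1]")
    return errors
```

Returning a list of problems instead of raising on the first one lets a reviewer see every defect in a proposed action at once.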

Safety Model

Actions are categorized into safety classes:

Safe: Low-risk actions that may be eligible for autonomous execution in the future (e.g., collect_diagnostics, rerun_healthcheck).
Guarded: Actions that default to requiring approval but could be automated under strict conditions (e.g., redeploy_service, rerun_deployment_stage).
Dangerous: High-risk actions that ALWAYS require manual approval.

Currently, the platform operates in a Recommendation-Only mode where even safe actions require explicit approval.
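
The safety classes and the Recommendation-Only mode can be expressed as a single decision function. This is an illustrative sketch only: the `RiskLevel` enum, the `RISK_BY_ACTION_TYPE` mapping, the `strict_conditions_met` flag, and the choice to treat unknown action types as dangerous are assumptions, not documented behavior.

```python
from enum import Enum


class RiskLevel(Enum):
    SAFE = "safe"
    GUARDED = "guarded"
    DANGEROUS = "dangerous"


# Classification taken from the examples given for each safety class.
RISK_BY_ACTION_TYPE = {
    "collect_diagnostics": RiskLevel.SAFE,
    "rerun_healthcheck": RiskLevel.SAFE,
    "redeploy_service": RiskLevel.GUARDED,
    "rerun_deployment_stage": RiskLevel.GUARDED,
}


def approval_required(
    action_type: str,
    recommendation_only: bool = True,
    strict_conditions_met: bool = False,
) -> bool:
    """Decide whether an action needs manual approval."""
    # Assumption: anything not explicitly classified is treated as dangerous.
    risk = RISK_BY_ACTION_TYPE.get(action_type, RiskLevel.DANGEROUS)
    # Recommendation-Only mode and dangerous actions always need approval.
    if recommendation_only or risk is RiskLevel.DANGEROUS:
        return True
    # Guarded actions skip approval only under strict conditions.
    if risk is RiskLevel.GUARDED:
        return not strict_conditions_met
    return False
```

With the default `recommendation_only=True`, every call returns `True`, which matches the current mode of the platform.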

Initial Action Types

redeploy_service: Restarts or redeploys a service container.
rerun_healthcheck: Triggers an immediate health check.
rerun_deployment_stage: Retries a specific stage of a failed deployment.
collect_diagnostics: Gathers logs and metrics for troubleshooting.

Executor

The Executor (scripts/executor/executor.py) processes approved actions. Its key properties are:

Process Approved Only: New work is taken only from the approved/ directory.
Recommendation-Safe: Execution is simulated; the Executor logs the mutations it would make and causes no side effects.
Idempotency: Designed to be safe to run multiple times.
Resumable State: If interrupted, it picks up the actions left in running/ on its next run.
Append-Only History: Every action transition is appended to history.log.
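
The properties above can be illustrated with a simplified processing loop. This is a sketch under stated assumptions, not the contents of scripts/executor/executor.py: the function names, the JSON-lines format and location of history.log, and the `*.json` file naming are all hypothetical. For brevity it also leaves the `status` field inside each document untouched.

```python
import json
import os
import time
from pathlib import Path


def record(root: Path, action_id: str, src: str, dst: str) -> None:
    """Append one transition to history.log (assumed JSON-lines format)."""
    entry = {"ts": time.time(), "action_id": action_id, "from": src, "to": dst}
    with open(root / "history.log", "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")


def move(root: Path, path: Path, src: str, dst: str) -> Path:
    """Move an action file between state directories and log the transition."""
    target = root / dst / path.name
    os.replace(path, target)
    record(root, path.stem, src, dst)
    return target


def simulate(action: dict) -> str:
    """Describe the intended mutation without performing it."""
    return f"would run {action['action_type']} on {action['node']}/{action['service']}"


def run_once(root: Path) -> list[str]:
    """Process interrupted and approved actions; return the simulated intents."""
    intents = []
    resumed = sorted((root / "running").glob("*.json"))    # left by a prior run
    approved = sorted((root / "approved").glob("*.json"))  # the only new work
    for path in resumed + approved:
        if path.parent.name == "approved":
            path = move(root, path, "approved", "running")
        try:
            action = json.loads(path.read_text(encoding="utf-8"))
            intents.append(simulate(action))
            move(root, path, "running", "completed")
        except (KeyError, TypeError, ValueError):
            # Malformed documents end up in failed/ instead of stopping the loop.
            move(root, path, "running", "failed")
    return intents
```

Files in pending/ are never read, and a second call to `run_once` finds nothing left in approved/ or running/, so repeating the run has no further effect.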

Rollback Concepts

Every action includes a rollback_reference field. In future iterations, it will point to either the previous stable state or a reverse action that can be triggered if the current action fails or causes further instability.

Future Autonomous Execution

The system is designed to transition to autonomous execution by:

Identifying safe actions with high confidence scores.
Matching them against a policy engine.
Automatically moving them from pending/ to approved/ when the safety guardrails allow it.
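
The steps above could be sketched as a policy predicate evaluated against each pending action. Since autonomous execution is not yet implemented, everything here is hypothetical: the `Policy` class, its 0.95 confidence threshold, and the allow-list of action types are placeholders chosen for illustration, not values from the platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical guardrails for automatic approval."""
    min_confidence: float = 0.95  # placeholder threshold, not a platform value
    allowed_action_types: frozenset = frozenset(
        {"collect_diagnostics", "rerun_healthcheck"}
    )


def auto_approvable(action: dict, policy: Policy) -> bool:
    """Return True only if every guardrail in the policy is satisfied."""
    return (
        action.get("risk_level") == "safe"
        and action.get("autonomous_eligible") is True
        and action.get("confidence", 0.0) >= policy.min_confidence
        and action.get("action_type") in policy.allowed_action_types
    )
```

An action for which `auto_approvable` returns `True` would then be moved from pending/ to approved/; everything else stays in pending/ for manual review.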
