# Deployment Conventions

This document describes the GitOps-lite deployment process for the homelab.

## Principles

1. **Git as Source of Truth**: All infrastructure definitions (Docker Compose, configurations) are stored in Git.
2. **Unidirectional Flow**: Changes flow from **SATURN** (commit node) to execution nodes.
3. **Lightweight**: No complex orchestrators (no Kubernetes). Use `docker compose` and simple shell scripts.
4. **Tailscale Mesh**: All hosts are connected via Tailscale, allowing secure communication without public port exposure.
5. **Host Autonomy**: Services that must operate during WAN or Git outages keep their runtime dependencies on the execution node or local LAN.

## Staged Deployment Framework

The homelab uses a staged deployment framework located at `scripts/deploy/deploy.sh`. This script is designed to be resumable, stage-aware, and observable.

### Deployment Stages

1. **prepare**: Pulls the latest changes from Git, validates inventory, and prepares the local environment. It is tolerant of network failures to support intermittently connected nodes like CHELSTY.
2. **validate**: Ensures all required service definitions and metadata are present.
3. **deploy**: Executes `docker compose` commands for all assigned services. Supports `.env` files and `docker-compose.override.yml` under `/opt/homelab/config/<service>/`.
4. **verify**: Executes service-specific `healthcheck.sh` scripts or checks container status.
5. **diagnose**: Automatically triggered on failure; collects container status and logs for troubleshooting.
6. **complete**: Finalizes the deployment and marks the state as finished.
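
The stage order above can be pictured as a simple loop. The following is a minimal sketch, not the actual contents of `scripts/deploy/deploy.sh`: the names `run_stages` and `demo_runner`, the `FAIL_AT` variable, and the `stage=... status=...` output format are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of a stage loop; the real deploy.sh may differ.
STAGES="prepare validate deploy verify complete"

# run_stages RUNNER
# Calls "RUNNER <stage>" for each stage in order. On the first failure it
# calls "RUNNER diagnose" once and returns non-zero.
run_stages() {
    runner=$1
    for stage in $STAGES; do
        echo "stage=$stage status=start"
        if ! "$runner" "$stage"; then
            echo "stage=$stage status=failed"
            echo "stage=diagnose status=triggered"
            "$runner" diagnose || true
            return 1
        fi
        echo "stage=$stage status=ok"
    done
}

# Placeholder runner for illustration only: it fails for the stage named
# in FAIL_AT and succeeds for every other stage.
demo_runner() {
    [ "$1" != "${FAIL_AT:-}" ]
}
```

Calling `run_stages demo_runner` walks all five stages. With `FAIL_AT=verify` set, the loop stops at `verify`, triggers `diagnose` once, and never reaches `complete`.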

### State Tracking and Logging

- **State**: Local node state is tracked in `/opt/homelab/state/deploy/current_stage`. The last successfully processed service in the `deploy` stage is tracked in `last_service` to support granular resumption.
- **Logs**: Detailed execution logs are stored in `/opt/homelab/logs/deploy/deploy_<timestamp>.log`. Structured log entries prefixed with `[STRUCT]` provide machine-parseable event data.
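
As a rough illustration of how such state files and structured log lines could be written, consider the sketch below. The directory paths come from this document. Everything else is an assumption: the helper names, the timestamp formats, and the `key=value` fields after the `[STRUCT]` prefix are hypothetical, since only the prefix itself is specified here.

```shell
#!/bin/sh
# Hypothetical helpers; paths come from this document, formats are assumed.
STATE_DIR=${STATE_DIR:-/opt/homelab/state/deploy}
LOG_FILE=${LOG_FILE:-/opt/homelab/logs/deploy/deploy_$(date +%Y%m%d_%H%M%S).log}

# Record the stage currently being processed.
set_stage() {
    mkdir -p "$STATE_DIR" && printf '%s\n' "$1" > "$STATE_DIR/current_stage"
}

# Record the last service that deployed successfully.
set_last_service() {
    mkdir -p "$STATE_DIR" && printf '%s\n' "$1" > "$STATE_DIR/last_service"
}

# Append one machine-parseable line: log_struct EVENT "key=value ..."
log_struct() {
    mkdir -p "$(dirname "$LOG_FILE")" &&
        printf '[STRUCT] ts=%s event=%s %s\n' \
            "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$1" "$2" >> "$LOG_FILE"
}

# Print only the structured entries of a log file.
struct_entries() {
    grep '^\[STRUCT\]' "$1"
}
```

With helpers like these, `struct_entries "$LOG_FILE"` yields only the lines intended for tooling, while free-form log output stays in the same file for human readers.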

### Resume Semantics

If a deployment is interrupted (e.g., due to LTE disconnect on CHELSTY):
1. Rerun the script with the `--resume` flag: `scripts/deploy/deploy.sh --resume`.
2. The script reads the last incomplete stage and continues from there.
3. In the `deploy` stage, it specifically resumes from the first service that was not successfully completed.
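
The per-service resumption in step 3 can be sketched as follows. This is a hypothetical helper, not code taken from `deploy.sh`: the function name `remaining_services` and the choice to restart from the first service when the recorded name is unknown are assumptions.

```shell
#!/bin/sh
# Hypothetical sketch of per-service resumption; not taken from deploy.sh.

# remaining_services LAST SERVICE...
# Prints the services that still need deploying, one per line.
# LAST is the content of the last_service state file (may be empty).
remaining_services() {
    last=$1
    shift
    known=0
    for svc in "$@"; do
        if [ "$svc" = "$last" ]; then known=1; fi
    done
    # Nothing recorded, or the recorded name is not in the list:
    # start from the first service to stay on the safe side.
    if [ "$known" -eq 0 ]; then
        if [ "$#" -gt 0 ]; then printf '%s\n' "$@"; fi
        return 0
    fi
    # Skip everything up to and including the last successful service.
    skipping=1
    for svc in "$@"; do
        if [ "$skipping" -eq 1 ]; then
            if [ "$svc" = "$last" ]; then skipping=0; fi
            continue
        fi
        printf '%s\n' "$svc"
    done
}
```

For example, `remaining_services mosquitto mosquitto zigbee2mqtt homeassistant` prints `zigbee2mqtt` and `homeassistant`, because `mosquitto` is recorded as already deployed. The service order used here is illustrative only.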

### Operational Semantics

Deployment is **hybrid**:
- **SATURN** acts as the orchestrator and source of truth.
- **Nodes** execute the deployment locally using the `deploy.sh` script.
- A human operator must trigger and confirm each deployment.

### Recovery Workflow

If a deployment fails:
1. Run `deploy.sh --stage diagnose` to identify the issue.
2. Use the `recover-node` AI prompt to analyze logs and get recommendations.
3. Fix the issue (e.g., update a secret in `.env`) and run `deploy.sh --resume`.
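
To give a feel for what step 1 gathers, here is a minimal sketch of a diagnostics collector for a single Compose project. It is an assumption about the kind of data involved, not the script's real behavior: the function name, the output file names, and the 200-line log tail are all hypothetical. The `DOCKER` variable exists only so the command can be swapped out when trying the sketch without Docker.

```shell
#!/bin/sh
# Hypothetical sketch of a diagnose step; deploy.sh may collect more or less.
DOCKER=${DOCKER:-docker}

# collect_diagnostics COMPOSE_FILE OUT_DIR
# Saves container status and recent logs for one Compose project.
collect_diagnostics() {
    compose_file=$1
    out_dir=$2
    mkdir -p "$out_dir" || return 1
    # "ps -a" also lists containers that have already exited.
    "$DOCKER" compose -f "$compose_file" ps -a > "$out_dir/status.txt" 2>&1
    # Keep only the tail of the logs so the report stays small.
    "$DOCKER" compose -f "$compose_file" logs --no-color --tail 200 \
        > "$out_dir/logs.txt" 2>&1
    echo "diagnostics written to $out_dir"
}
```

Files like `status.txt` and `logs.txt` are the sort of material that step 2 would then analyze.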

## Onboarding New Nodes

Refer to `inventory/templates/how_to_add_new_node.yaml` for a detailed guide on adding new hardware to the mesh. The general flow is:
1. Define node in `hosts/` and `inventory/topology.yaml` on SATURN.
2. Bootstrap the node (Docker, Tailscale, Git).
3. Run the staged deployment framework starting with `prepare`.
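
Before moving from step 2 to step 3, it can help to confirm that the bootstrap actually left the required tools on the node. The check below is a small sketch under that assumption. The function name `missing_tools` is hypothetical and is not part of the onboarding guide.

```shell
#!/bin/sh
# Hypothetical pre-flight check for a freshly bootstrapped node.

# missing_tools TOOL...
# Prints every tool that is not on PATH; returns non-zero if any is missing.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return "$rc"
}
```

A typical call would be `missing_tools docker tailscale git || echo "bootstrap incomplete"`, which lists whatever is still absent before the first `prepare` run.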

## Host-Local Overrides

If a service requires host-specific configuration (e.g., unique device paths for GPUs on SOLARIA):

1. Create a `docker-compose.override.yml` in `/opt/homelab/config/<service>/`.
2. The `deploy` stage includes this override when the file exists.
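
Docker Compose merges files passed with repeated `-f` options in order, and it only auto-loads an override that sits next to the base file. An override kept under `/opt/homelab/config/<service>/` therefore has to be passed explicitly. The sketch below shows one way to build those arguments. The helper name `compose_file_args` and the `CONFIG_ROOT` variable are hypothetical, the base file path in the example is a placeholder, and the helper assumes paths without spaces.

```shell
#!/bin/sh
# Hypothetical helper; the directory layout follows this document.
CONFIG_ROOT=${CONFIG_ROOT:-/opt/homelab/config}

# compose_file_args BASE_FILE SERVICE
# Prints the -f arguments for docker compose, adding the host-local
# override only when it exists on this node.
compose_file_args() {
    base=$1
    override="$CONFIG_ROOT/$2/docker-compose.override.yml"
    printf '%s %s' -f "$base"
    if [ -f "$override" ]; then
        printf ' %s %s' -f "$override"
    fi
    printf '\n'
}

# Example (placeholder base path, relies on word splitting):
#   docker compose $(compose_file_args path/to/docker-compose.yml myservice) up -d
```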

For CHELSTY Home Assistant infrastructure, host-local configuration is the
authority for runtime identity, secrets, and local device endpoints:

- Home Assistant config: `/opt/homelab/config/homeassistant`
- Zigbee2MQTT config: `/opt/homelab/config/zigbee2mqtt`
- Mosquitto config: `/opt/homelab/config/mosquitto`

CHELSTY services must not require SATURN, VPS, or Forgejo to be reachable after
deployment has completed. Docker Compose definitions can still come from Git,
but Home Assistant automation, Zigbee control, and MQTT messaging must continue
locally while LTE or Tailscale connectivity is unavailable.

## Exposure Classes

Service inventory may declare one of these exposure classes:

- `local-only`: bind only to host, LAN, or container networks. This is the default for Zigbee2MQTT and Mosquitto.
- `tailscale-internal`: reachable over Tailscale only. This is appropriate for Home Assistant remote administration.
- `public`: reachable from the public internet through a deliberate ingress path, normally the VPS edge role.

Public exposure is not implied by a service existing in Git. It must be explicit
in host inventory and ingress configuration.
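
The "explicit or nothing" rule lends itself to a small guard in validation code. The sketch below is illustrative only: how the inventory stores the exposure class is not shown in this document, and the function names `is_valid_exposure` and `may_publish` are hypothetical.

```shell
#!/bin/sh
# Hypothetical validation helpers; the inventory format itself is not shown here.

# is_valid_exposure CLASS
# Succeeds only for the three documented classes.
is_valid_exposure() {
    case "$1" in
        local-only|tailscale-internal|public) return 0 ;;
        *) return 1 ;;
    esac
}

# may_publish CLASS
# Only an explicit "public" allows internet exposure; an empty or
# unknown value never does.
may_publish() {
    [ "$1" = "public" ]
}
```

Treating a missing value as "not public" keeps the failure mode safe: a service that forgets to declare a class is never exposed by accident.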

## CHELSTY Home Automation Deployment Notes

CHELSTY remains a Docker Compose execution node. No Kubernetes, Helm, Ansible,
or additional orchestration layer is required for Home Assistant infrastructure.

The SLZB-06U coordinator is network-connected over Ethernet or WiFi. Compose
files and host overrides should configure Zigbee2MQTT for a TCP/network
coordinator endpoint, not a USB serial device. Avoid `/dev/ttyUSB0` mappings.
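
Zigbee2MQTT takes the adapter location from the `port` setting under `serial` in its `configuration.yaml`, and network adapters are addressed with a `tcp://host:port` value. A quick lint for the rule above might look like the sketch below. The function name is hypothetical, and the check is a plain text match rather than a YAML parser, so it is an approximation only.

```shell
#!/bin/sh
# Hypothetical lint for a Zigbee2MQTT configuration.yaml (text match only).

# z2m_uses_network_coordinator CONFIG_FILE
# Fails if a "port:" line points at a /dev/ device path.
# Succeeds only if a "port:" line uses a tcp:// endpoint.
z2m_uses_network_coordinator() {
    conf=$1
    if grep -Eq "^[[:space:]]*port:[[:space:]]*[\"']?/dev/" "$conf"; then
        echo "USB/serial device path found in $conf" >&2
        return 1
    fi
    grep -Eq "^[[:space:]]*port:[[:space:]]*[\"']?tcp://" "$conf"
}

# Example: z2m_uses_network_coordinator /opt/homelab/config/zigbee2mqtt/configuration.yaml
```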

Runtime paths follow the standard layout:

- `/opt/homelab/data/homeassistant`
- `/opt/homelab/config/homeassistant`
- `/opt/homelab/logs/homeassistant`
- `/opt/homelab/data/zigbee2mqtt`
- `/opt/homelab/config/zigbee2mqtt`
- `/opt/homelab/logs/zigbee2mqtt`
- `/opt/homelab/data/mosquitto`
- `/opt/homelab/config/mosquitto`
- `/opt/homelab/logs/mosquitto`
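
Because the layout is regular (`data`, `config`, and `logs` per service), the directories can be created in one pass. The helper below is a sketch of that idea. The name `ensure_service_dirs` and the `HOMELAB_ROOT` variable are hypothetical, and ownership or permission handling is left out.

```shell
#!/bin/sh
# Hypothetical helper that creates the standard per-service layout.
HOMELAB_ROOT=${HOMELAB_ROOT:-/opt/homelab}

# ensure_service_dirs SERVICE...
# Creates data/, config/ and logs/ directories for each service.
ensure_service_dirs() {
    for svc in "$@"; do
        for kind in data config logs; do
            mkdir -p "$HOMELAB_ROOT/$kind/$svc" || return 1
        done
    done
}

# Example: ensure_service_dirs homeassistant zigbee2mqtt mosquitto
```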

Recommended backup coverage:

- Home Assistant config and persistent data before upgrades or major integration changes.
- Zigbee2MQTT config, database, coordinator backup files, and Zigbee network key material.
- SLZB-06U firmware version, exported configuration, network address reservation, and coordinator state.
- Mosquitto config, ACL/password files, persistence data, and bridge configuration if enabled.
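
For the file-based items in this list, a per-service archive of the `config` and `data` directories is one straightforward option. The sketch below assumes the standard layout described above. The function name, the archive naming scheme, and the destination directory are hypothetical, and it does not cover the SLZB-06U device itself, whose settings live on the coordinator.

```shell
#!/bin/sh
# Hypothetical backup helper; retention, encryption and off-host copies
# are out of scope for this sketch.
HOMELAB_ROOT=${HOMELAB_ROOT:-/opt/homelab}

# backup_service SERVICE DEST_DIR
# Archives config/ and data/ for one service and prints the archive path.
backup_service() {
    svc=$1
    dest=$2
    archive="$dest/${svc}_$(date +%Y%m%d_%H%M%S).tar.gz"
    mkdir -p "$dest" || return 1
    # -C keeps paths inside the archive relative to the homelab root.
    tar -czf "$archive" -C "$HOMELAB_ROOT" "config/$svc" "data/$svc" || return 1
    echo "$archive"
}
```

Stopping the service first, or at least taking the archive while it is idle, reduces the chance of capturing a database file mid-write.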

## Secrets Management

- **Do NOT commit secrets to Git.**
- Secrets should be placed in `/opt/homelab/config/<service>/.env` on the target host.
- The `deploy` stage loads this `.env` file when it runs Docker Compose.
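
One way to hand a host-local `.env` file to Docker Compose is the `--env-file` option, which supplies values for variable interpolation in the Compose file. The sketch below also tightens the file permissions before use. The helper name `env_file_args` and the `CONFIG_ROOT` variable are hypothetical, and treating a missing file as an error is an assumption rather than documented behavior.

```shell
#!/bin/sh
# Hypothetical helper; the .env location follows this document.
CONFIG_ROOT=${CONFIG_ROOT:-/opt/homelab/config}

# env_file_args SERVICE
# Restricts the service's .env file to its owner, then prints
# "--env-file PATH". Fails if the file is missing.
env_file_args() {
    env_file="$CONFIG_ROOT/$1/.env"
    if [ ! -f "$env_file" ]; then
        echo "missing $env_file" >&2
        return 1
    fi
    chmod 600 "$env_file" || return 1
    printf '%s %s\n' --env-file "$env_file"
}

# Example (relies on word splitting, so paths must not contain spaces):
#   docker compose $(env_file_args mosquitto) -f docker-compose.yml up -d
```

Keeping the file at mode `600` means other local users cannot read the secrets, which complements keeping them out of Git.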