Service Lifecycle and Recovery
This document defines the lifecycle of a service in the homelab and the procedures for operational recovery.
Service Lifecycle
- Onboarding:
  - Create the `services/<service>/` directory.
  - Define `docker-compose.yml`, `service.yaml`, `README.md`, `env.example`, and `healthcheck.sh`.
  - Register the service in `inventory/topology.yaml` or the relevant host configs.
- Provisioning:
  - Ensure `/opt/homelab/data/<service>` exists.
  - Ensure `/opt/homelab/config/<service>` exists and contains the required secrets and configs.
  - Set up the environment variables from `env.example` in `/opt/homelab/config/<service>/.env`.
- Deployment:
  - `scripts/deploy/deploy.sh` (starts fresh)
  - `scripts/deploy/deploy.sh --resume` (continues after interruption)
- Verification:
  - Automatic: runs as part of the `deploy.sh` pipeline (`verify` stage).
  - Manual: `scripts/deploy/deploy.sh --stage verify`.
- Maintenance:
  - Periodic updates via `docker compose pull`.
  - Log monitoring via `docker compose logs -f`.
- Decommissioning:
  - Stop and remove the containers: `docker compose down`.
  - Archive `/opt/homelab/data/<service>` if necessary.
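
The onboarding and provisioning steps above can be sketched as two shell functions. This is a minimal sketch under stated assumptions, not the repository's actual tooling: the function names `check_service_layout` and `provision_service`, the optional repo-root and base-directory arguments, and the `chmod 600` on the copied `.env` are all introduced here for illustration.

```shell
# Hypothetical helpers; names and parameters are assumptions, not part of the repo.

# Verify that services/<service>/ contains the files required at onboarding.
check_service_layout() {
  local service="$1"
  local repo_root="${2:-.}"
  local missing=0
  local f
  for f in docker-compose.yml service.yaml README.md env.example healthcheck.sh; do
    if [ ! -f "$repo_root/services/$service/$f" ]; then
      echo "missing: services/$service/$f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Create the data and config directories and seed .env from env.example.
# An existing .env is never overwritten, so real secrets stay in place.
provision_service() {
  local service="$1"
  local repo_root="${2:-.}"
  local base="${3:-/opt/homelab}"
  local env_file="$base/config/$service/.env"

  mkdir -p "$base/data/$service" "$base/config/$service"

  if [ ! -f "$env_file" ]; then
    cp "$repo_root/services/$service/env.example" "$env_file"
    chmod 600 "$env_file"   # assumption: secrets readable by the owner only
    echo "created $env_file; replace the example values before deploying"
  fi
}
```

For a service named `myservice` (a placeholder), `check_service_layout myservice && provision_service myservice` would run both steps from the repository root. Leaving an existing `.env` untouched makes the provisioning step safe to repeat.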
Operational Recovery
1. Container Failure
If a service is unhealthy:
- Check `docker compose logs`.
- Restart: `docker compose restart`.
- Recreate: `docker compose up -d --force-recreate`.
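
The escalation above (logs, then restart, then recreate) can be sketched as a single function. This is an illustration only: the name `recover_service`, the settle delay, and the `--tail 100` log limit are assumptions, and so is the convention that `healthcheck.sh` can be run with `sh` and exits 0 when the service is healthy.

```shell
# Hypothetical helper; assumes healthcheck.sh exits 0 when the service is healthy.

is_healthy() {
  sh "$1/healthcheck.sh" >/dev/null 2>&1
}

# recover_service <service_dir> [settle_seconds]
# Escalates from the least to the most disruptive step and stops
# as soon as the healthcheck passes.
recover_service() {
  local service_dir="$1"
  local settle="${2:-10}"

  if is_healthy "$service_dir"; then
    echo "already healthy"
    return 0
  fi

  # Step 1: show recent logs so the failure cause is on record.
  (cd "$service_dir" && docker compose logs --tail 100) || true

  # Step 2: restart the existing containers.
  (cd "$service_dir" && docker compose restart)
  sleep "$settle"
  if is_healthy "$service_dir"; then
    echo "recovered after restart"
    return 0
  fi

  # Step 3: recreate the containers from the compose definition.
  (cd "$service_dir" && docker compose up -d --force-recreate)
  sleep "$settle"
  if is_healthy "$service_dir"; then
    echo "recovered after recreate"
    return 0
  fi

  echo "still unhealthy after recreate; investigate manually" >&2
  return 1
}
```

Called as `recover_service services/myservice` from the repository root (`myservice` is a placeholder), the function returns non-zero only when all three steps fail, which makes it usable from a cron job or a wrapper script.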
2. Node Failure
If a host node fails:
- Services whose `owner_node` matches the failed node must be recovered on a backup node, or the node itself must be restored.
- Persistent data must be restored from backups to `/opt/homelab/data/<service>`.
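
The two node-failure steps can be sketched as follows. Several details here are assumptions that the document does not confirm: that `owner_node` is a top-level key in each `services/<service>/service.yaml`, that backups are gzip-compressed tar archives holding the contents of the service's data directory, and the function names themselves.

```shell
# Hypothetical helpers; the service.yaml layout and backup format are assumptions.

# services_on_node <node> [services_root]
# Print the services whose service.yaml has a top-level "owner_node: <node>".
# The node name is used inside a regular expression, so it should be a plain hostname.
services_on_node() {
  local node="$1"
  local root="${2:-services}"
  local f
  for f in "$root"/*/service.yaml; do
    [ -f "$f" ] || continue
    if grep -Eq "^owner_node:[[:space:]]*[\"']?${node}[\"']?[[:space:]]*\$" "$f"; then
      basename "$(dirname "$f")"
    fi
  done
}

# restore_service_data <archive.tar.gz> <service> [data_root]
# Unpack a backup archive into the service's data directory.
restore_service_data() {
  local archive="$1"
  local service="$2"
  local data_root="${3:-/opt/homelab/data}"

  if [ ! -f "$archive" ]; then
    echo "backup archive not found: $archive" >&2
    return 1
  fi

  mkdir -p "$data_root/$service"
  tar -xzf "$archive" -C "$data_root/$service"
}
```

On the backup node, `services_on_node <failed-node>` lists what needs to be recovered, and each listed service can then be restored with `restore_service_data` before it is deployed.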
3. Dependency Recovery
If a dependency fails:
- Services that depend on it may report an unhealthy status.
- Recover the dependency first.
- Re-verify dependent services.
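
The ordering above (dependency first, dependents second) can be enforced with a small check. This is a sketch under assumptions: the function name is hypothetical, the dependents are passed in by hand because the document does not say where dependencies are declared, and `healthcheck.sh` is assumed to exit 0 when healthy.

```shell
# Hypothetical helper; the list of dependents is supplied by the caller.

service_healthy() {
  sh "$1/healthcheck.sh" >/dev/null 2>&1
}

# verify_after_dependency_recovery <services_root> <dependency> <dependent>...
# Refuses to check dependents until the dependency itself is healthy.
# Returns 2 if the dependency is unhealthy, 1 if any dependent is, 0 otherwise.
verify_after_dependency_recovery() {
  local root="$1"
  local dependency="$2"
  shift 2

  if ! service_healthy "$root/$dependency"; then
    echo "dependency $dependency is unhealthy; recover it first" >&2
    return 2
  fi

  local failed=0
  local svc
  for svc in "$@"; do
    if service_healthy "$root/$svc"; then
      echo "ok: $svc"
    else
      echo "unhealthy: $svc"
      failed=1
    fi
  done
  return "$failed"
}
```

Distinct return codes let a caller tell "the dependency still needs work" apart from "a dependent needs its own recovery".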
Persistent Data Conventions
- Data: `/opt/homelab/data/<service>` - Primary persistent state.
- Config: `/opt/homelab/config/<service>` - Local overrides and secrets.
- Backups: Standard backup routines should target `/opt/homelab/data/`.
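
A backup routine that targets the data directory could look like the sketch below. The document does not specify a backup tool or format, so the timestamped gzip-compressed tar archive, the destination-directory argument, and the function name are assumptions made for illustration.

```shell
# Hypothetical helper; archive format and naming scheme are assumptions.

# backup_service_data <service> <dest_dir> [data_root]
# Archive the contents of <data_root>/<service> and print the archive path.
backup_service_data() {
  local service="$1"
  local dest_dir="$2"
  local data_root="${3:-/opt/homelab/data}"
  local stamp
  local archive

  if [ ! -d "$data_root/$service" ]; then
    echo "no data directory for $service under $data_root" >&2
    return 1
  fi

  stamp="$(date +%Y%m%d-%H%M%S)"
  archive="$dest_dir/$service-$stamp.tar.gz"

  mkdir -p "$dest_dir"
  # -C stores paths relative to the service's data directory,
  # so the archive can be unpacked straight into a fresh one.
  tar -czf "$archive" -C "$data_root/$service" .
  echo "$archive"
}
```

For services that write continuously, such as databases, stopping the service or using its native dump tool before archiving is likely to give a more consistent backup than copying live files.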