docs(claude): add Definition of Done for services (smoke test + pytest)

Lesson from brain-watchdog: code that was never run had a packaging bug
that caused a crash loop in production. New rule: docker build + short
smoke-run + pytest before any commit or deploy.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
Oskar Kapala 2026-06-01 20:38:39 +02:00
parent cb4ae756ab
commit f381023206

View file

@ -106,6 +106,48 @@ When exploring the system, use these files in order:
3. `hosts/<node>/services.yaml` — desired services and exposure classes for that host
4. `services/<service>/service.yaml` — operational contract for a service
## VPS-Specific Rules
VPS has **4 GiB RAM, no swap**. Every repo-managed service must declare memory limits in its `hosts/vps/runtime/<service>/docker-compose.override.yml`.
### Memory limit convention
Use top-level Compose properties (not `deploy.resources.limits`, which requires Swarm mode):
```yaml
services:
myservice:
mem_limit: 256m # cgroup ceiling; Docker restarts on breach
oom_score_adj: -900 # host kernel OOM-killer will not pick this container
```
Rules:
- **Control-plane containers** (executor, observer, supervisor, operator-ui), **node-agent**, **stability-agent**: always set `oom_score_adj: -900` — these must never be a system-level OOM victim.
- `mem_limit` still applies even with `oom_score_adj: -900`; the cgroup OOM killer is independent of the host OOM killer and will restart the container via Docker when the limit is exceeded.
- Budget: OS+Docker reserves ~800 MiB; sum of all `mem_limit` values must stay ≤ 3200 MiB (3.1 GiB).
### Repo-managed services on VPS
All VPS services are now GitOps-managed. Service definitions live in `services/<name>/docker-compose.yml`; host-specific overrides (mem_limit, env) live in `hosts/vps/runtime/<name>/docker-compose.override.yml`.
| Service | Compose stack | Data path |
|---|---|---|
| npm | `services/npm/` | `/home/dockeruser/docker/npm/{data,letsencrypt}` (bind mount) |
| outline | `services/outline/` | Docker named volumes: `outline_outline_storage`, `outline_postgres_data`, `outline_redis_data` |
| joplin | `services/joplin/` | Docker named volume: `joplin_postgres_data` |
| ai-cluster | `services/ai-cluster/` | Mosquitto config bind: `/home/dockeruser/docker/ai-cluster/mosquitto/` |
**Data migration rule**: data paths stay in place at cutover. Never move volumes or bind-mount sources without a dedicated migration plan.
**Cutover checklist** (before running `docker compose up` for any migrated service):
1. `git pull` on VPS
2. Populate `/opt/homelab/config/<service>/.env` from the `env.example` template
3. For ai-cluster: copy `/home/dockeruser/docker/ai-cluster/.env` to `/opt/homelab/config/ai-cluster/.env`
4. For mosquitto: config stays at old bind path until explicitly migrated
5. Verify named volumes exist: `docker volume ls | grep <project>`
**ai-cluster architectural note**: compute workloads (codex-worker, planner-worker) belong on SOLARIA (GPU/compute node), not the 4 GB ingress VPS. Migrate when feasible; for now, hard mem_limits contain the blast radius.
## CHELSTY-Specific Rules
- Zigbee coordinator is **SLZB-06U** over TCP (`192.168.1.105:6638`, `ezsp` adapter). Never use `/dev/ttyUSB0`.
@ -124,6 +166,14 @@ When exploring the system, use these files in order:
- `world/` — Observer output (synthesized state)
- `actions/` — pending / approved / running / completed / failed
## Definition of Done (serwisy)
Before any new or changed service is considered ready:
1. **docker build + smoke run** — build the image locally and run it for a few seconds; confirm the process starts its main loop without crashing. This catches packaging/import errors (e.g. `ModuleNotFoundError`) before they reach a node.
2. **pytest** — run the service's test suite. If no tests exist yet, add a minimal one (at minimum: import passes, core logic has at least one case). Tests live in `services/<service>/tests/`.
3. **Never commit or deploy code that has never been run.** If a smoke run or test fails, fix it first.
## Naming Conventions
- Hosts: ALL CAPS (`SATURN`, `PIHA`)