Was: malformed event (bad JSON / truncated / corrupted bytes) wedged the
node's checkpoint forever — every cycle re-tried, logged, never advanced
past the bad file; all subsequent good events for that node lost.
Now: first parse failure -> atomic os.replace to STATE_DIR/observer_failed_events/<node>/
with collision handling. Checkpoint advances, downstream events flow.
Move failures themselves are logged but don't crash the loop.
Complementary to yesterday's atomic_write_json fix (state files);
this addresses the same race-pattern on event files instead.
Regression test asserts: bad event quarantined to failed_events dir,
removed from hot path, subsequent good event processed (node online),
checkpoint moves to good event.
ECC-format skill for the node onboarding workflow. Covers full step
sequence, operational rules, node.yaml key fields, gotchas from LUSTRO
session, and Definition of Done. Marked as living doc — SCAFFOLD sections
to be promoted to PROVEN as steps land on real nodes.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Encodes branch hygiene for CC running in task worktrees: commit only to
assigned branch, no push origin master, no touching main checkout, no
git add -A, no worktree management, mandatory final report.
Records session facts (git log, diff --stat, deploys from transcript)
by appending to docs/sessions/YYYY-MM-DD.md with a mandatory narrative
placeholder. Never touches backlog.md or CLAUDE.md without explicit
instruction. Commits only the session file.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Instructs CC to always route deploy/redeploy/ship/wdróż requests through
scripts/deploy/deploy.sh, maps exit codes to required actions, and
enforces no-bypass rules for gate and branch checks.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>