4 commits

Author SHA1 Message Date
Oskar Kapala c255a021d1 fix(observer): quarantine malformed event files to prevent processing wedge
Was: a malformed event (bad JSON / truncated / corrupted bytes) wedged the
node's checkpoint forever: every cycle retried it, logged the failure, and
never advanced past the bad file, so all subsequent good events for that
node were lost.

Now: first parse failure -> atomic os.replace to STATE_DIR/observer_failed_events/<node>/
with collision handling. Checkpoint advances, downstream events flow.
Move failures themselves are logged but don't crash the loop.

Complementary to yesterday's atomic_write_json fix (state files);
this addresses the same race-pattern on event files instead.

Regression test asserts: bad event quarantined to failed_events dir,
removed from hot path, subsequent good event processed (node online),
checkpoint moves to good event.
2026-06-12 11:22:56 +02:00
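
The quarantine behaviour described in c255a021d1 can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function names `quarantine_event` and `process_events`, the `*.json` file pattern, the numeric collision suffix, and the use of the file name as the checkpoint are all assumptions. Only `os.replace` and the `observer_failed_events/<node>/` layout come from the commit message.

```python
import json
import logging
import os
from pathlib import Path
from typing import Optional

log = logging.getLogger("observer")


def quarantine_event(event_path: Path, state_dir: Path, node: str) -> Optional[Path]:
    """Move a malformed event file out of the hot path (hypothetical helper).

    Returns the new location, or None if the move itself failed.
    """
    failed_dir = state_dir / "observer_failed_events" / node
    try:
        failed_dir.mkdir(parents=True, exist_ok=True)
        target = failed_dir / event_path.name
        # Collision handling (assumed scheme): os.replace silently overwrites
        # an existing target, so pick a free name first. The check is not
        # race-free; a single observer process per node is assumed.
        counter = 0
        while target.exists():
            counter += 1
            target = failed_dir / f"{event_path.name}.{counter}"
        # Atomic only when source and target are on the same filesystem.
        os.replace(event_path, target)
        return target
    except OSError as exc:
        # A failed move is logged but must not crash the processing loop.
        log.error("could not quarantine %s: %s", event_path, exc)
        return None


def process_events(event_dir: Path, state_dir: Path, node: str):
    """Parse events in name order; quarantine on the first parse failure."""
    processed = []
    checkpoint = None
    for event_path in sorted(event_dir.glob("*.json")):
        try:
            # JSONDecodeError and UnicodeDecodeError both subclass ValueError,
            # so bad JSON, truncation, and corrupted bytes land here.
            event = json.loads(event_path.read_bytes())
        except ValueError as exc:
            log.warning("malformed event %s: %s", event_path, exc)
            quarantine_event(event_path, state_dir, node)
        else:
            processed.append(event)
        # The checkpoint advances either way, so one bad file cannot wedge it.
        checkpoint = event_path.name
    return processed, checkpoint
```

With a truncated `001.json` followed by a valid `002.json`, the first file should end up under `observer_failed_events/<node>/`, the second should be processed, and the checkpoint should point at `002.json`, which mirrors what the regression test in the commit message asserts.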
Oskar Kapala a0bfd96870 docs: session 2026-06-11 — lustro ssh shipping fix + ha-diag-agent piha + backlog/flota-bomba
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 14:18:00 +02:00
Oskar Kapala 5c2516d097 docs: session 2026-06-09 + skill/backlog update
- docs/sessions/2026-06-09-flota-recovery-lustro-register.md: fleet
  recovery (root cause: aerbot group, 3 masking layers), lustro register
  state + plan, fix-event-bloat and OOM pending, worktree gotcha
- docs/backlog.md: new file, a tech-debt tracker; entries: --omit-dir-times,
  oskar∈aerbot declaratively, worktree per task, observer staleness
- .claude/skills/node-onboarding/SKILL.md: step table update (PROVEN:
  20-base, 30-node-agent; WRITTEN: 40-register, 50-verify), 3 new gotchas
  (rsync perm, observer restart, worktree branch)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-09 20:38:35 +02:00
Oskar Kapala c466ed28d1 docs(skills): add node-onboarding skill (living doc)
ECC-format skill for the node onboarding workflow. Covers full step
sequence, operational rules, node.yaml key fields, gotchas from LUSTRO
session, and Definition of Done. Marked as living doc — SCAFFOLD sections
to be promoted to PROVEN as steps land on real nodes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-09 10:14:42 +02:00