Files

T

admin 5dc363771b doc 03 §8/§9: slice 8B.2 implemented — resume at snapshotted (downtime ~24s->~3s) (2026-06-10)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-10 15:02:14 +02:00

1.8 KiB

Raw Blame History

felhom.eu — task reports

Overwrite this file with a summary of the most recent task only (uniform with the other repos; not cumulative). The cumulative hub history lives in hub/CHANGELOG.md.

REPORT — Slice 8B.2 docs: quiesce downtime optimization (resume at `snapshotted`) (2026-06-10)

Type

Documentation update for slice 8B.2 (implementation: felhom-agent v0.13.0 + felhom-controller v0.38.0; no hub change).

What changed (doc 03 — host-agent)

§8 — the 8B.2 downtime optimization is now implemented (was a fast-follow note): in snapshot mode the agent watches the vzdump task log for the snapshot marker (create storage snapshot, validated PVE 9.2.2) and emits a snapshotted phase on /backup/status; the controller resumes its app at snapshotted (not done), cutting app downtime from whole-backup to until-snapshot with no loss of app-consistency (the snapshot froze the app-stopped state). Noted the snapshot-capable-storage dependency + the stop-mode fallback to resume-at-done, and that the controller keeps tracking to done/failed after early resume.
§9 slice table — the 8B row notes 8B.2 implemented.

Live validation (cross-repo, on the demo)

A provisioned controller + postgres stack: quiescing → snapshotted — resuming app early → backup done. App downtime ≈ 3s (resume at snapshot) vs ≈ 23s if it had waited for done (~87% cut). The snapshot backup restored clean (database system was shut down, no WAL replay) — the early resume preserved app-consistency. See the agent + controller REPORTs.

Deferred

Snapshot-capable storage required for the win; stop/downgraded storage falls back to resume-at-done (8B). No hub change → no deploy. No secrets committed.

1.8 KiB Raw Blame History