Files
felhom.eu/REPORT.md
T

36 lines
1.8 KiB
Markdown

# felhom.eu — task reports
> **Overwrite** this file with a summary of the most recent task only (uniform with the other repos; not cumulative). The cumulative hub history lives in [hub/CHANGELOG.md](hub/CHANGELOG.md).
---
# REPORT — Slice 8B.2 docs: quiesce downtime optimization (resume at `snapshotted`) (2026-06-10)
## Type
Documentation update for **slice 8B.2** (implementation: `felhom-agent` v0.13.0 + `felhom-controller`
v0.38.0; no hub change).
## What changed (doc 03 — host-agent)
- **§8** — the **8B.2 downtime optimization is now implemented** (was a fast-follow note): in snapshot
mode the agent watches the vzdump task log for the snapshot marker (`create storage snapshot`,
validated PVE 9.2.2) and emits a **`snapshotted`** phase on `/backup/status`; the controller
**resumes its app at `snapshotted`** (not `done`), cutting app downtime from *whole-backup* to
*until-snapshot* with **no loss of app-consistency** (the snapshot froze the app-stopped state).
Noted the snapshot-capable-storage dependency + the stop-mode **fallback to resume-at-`done`**, and
that the controller keeps tracking to `done`/`failed` after early resume.
- **§9 slice table** — the 8B row notes 8B.2 implemented.
## Live validation (cross-repo, on the demo)
A provisioned controller + postgres stack: `quiescing``snapshotted — resuming app early`
`backup done`. **App downtime ≈ 3s** (resume at snapshot) vs **≈ 23s** if it had waited for `done`
(~87% cut). The snapshot backup restored **clean** (`database system was shut down`, no WAL replay) —
the early resume preserved app-consistency. See the agent + controller REPORTs.
## Deferred
Snapshot-capable storage required for the win; stop/downgraded storage falls back to resume-at-`done`
(8B). No hub change → no deploy. No secrets committed.