slice 8B.2 (controller): resume app at snapshotted, keep tracking to done (v0.38.0)

The quiesce loop now resumes the app (StartStack + clear marker) at the
snapshotted phase instead of at done, so downtime shrinks from the whole backup
to the time until the snapshot, with no consistency loss. It keeps polling to
done/failed, so no overlapping backup starts and a post-snapshot failure is
still observed. Stop-mode fallback (resume at done) and crash-safety are preserved.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 14:54:19 +02:00
parent 6ac7167dfd
commit e4b69ac9e5
3 changed files with 173 additions and 2 deletions
@@ -1,5 +1,26 @@
## Changelog
### v0.38.0 — slice 8B.2: quiesce downtime optimization (resume at `snapshotted`) (2026-06-10)
The controller half of slice 8B.2. Pairs with `felhom-agent` v0.13.0. The quiesce loop now resumes
the app at the **`snapshotted`** phase (storage snapshot taken) instead of `done` — app downtime
drops from *whole-backup* to *until-snapshot* (seconds), with no loss of app-consistency (the
snapshot froze the app-stopped state).
#### Changed (`internal/quiesce`)
- The status-poll loop **resumes (`StartStack` + clears the marker) at `snapshotted`**, then **keeps
polling to `done`/`failed`** — so a new backup isn't started until this one truly finishes, and a
post-snapshot failure is observed (the backup isn't "successful" until `done`; resuming early does
not mark it done).
- **Fallback preserved:** if `snapshotted` never arrives (stop/downgraded mode), the loop resumes at `done`
exactly as in 8B. **Crash-safety unchanged:** marker written before stop; guaranteed unquiesce;
startup `Recover()`. A backup that fails *after* `snapshotted` is harmless to the app, which is already up.
#### Tests
- resume at `snapshotted` (RESUME event before `done`, marker cleared, then tracked to `done`);
stop-mode fallback (resume at `done`, no `snapshotted`); fail-after-`snapshotted` (one resume, app
stays up); the 8B crash-safety tests stay green.
### v0.37.0 — slice 8C: controller de-privileging + disk management via the agent (2026-06-10)
The in-guest controller half of slice 8C (closes slice 8). The disk-execution subsystem moves to