doc 03 §8/§9: slice 8B.2 implemented — resume at snapshotted (downtime ~24s->~3s) (2026-06-10)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
@@ -4,42 +4,32 @@
|
||||
|
||||
---
|
||||
|
||||
# REPORT — Slice 8C docs: controller de-privileging + disk classifier (slice 8 CLOSED) (2026-06-10)
|
||||
# REPORT — Slice 8B.2 docs: quiesce downtime optimization (resume at `snapshotted`) (2026-06-10)
|
||||
|
||||
## Type
|
||||
|
||||
Documentation update for **slice 8C** (the implementation is in `felhom-agent` v0.12.0 +
|
||||
`felhom-controller` v0.37.0; no hub change). Slice 8 is now **CLOSED**.
|
||||
Documentation update for **slice 8B.2** (implementation: `felhom-agent` v0.13.0 + `felhom-controller`
|
||||
v0.38.0; no hub change).
|
||||
|
||||
## What changed (doc 03 — host-agent)
|
||||
|
||||
- **§6** — added the disk-management endpoints (`GET /disks`, `POST /disks/{assign,eject,format}`)
|
||||
and **reframed the principle**: a controller may do *non-data-destructive* storage setup self-serve
|
||||
(list / assign / eject / format-blank); **anything that can lose customer data stays
|
||||
operator-signed (§4)**, with the **classifier (agent-internal device inspection)** as the enforcer.
|
||||
The 8C invariant: the agent decides data-bearing-ness by inspecting the device itself, never the
|
||||
caller's claim; a data-bearing format → `ClassStorageWipe` → gate → `pending_signature` (signed
|
||||
completion is slice 10). Marked **implemented**.
|
||||
- **§4** — added: data-bearing-ness is **agent-internal evidence, never the caller's claim**
|
||||
(mirrors the agent-internal scratch-provenance rule); destructive completion → slice 10.
|
||||
- **§9 slice table** — **8C implemented → slice 8 CLOSED**: agent v0.12.0 (`/disks` + classifier
|
||||
gate + `mkfs`); controller v0.37.0 (~12.3k LOC disk-execution retired, `backup.Manager` split to
|
||||
app-data, disk mgmt rewired to the agent, container de-privileged). §13 + doc changelog updated.
|
||||
|
||||
## What changed (doc 02 — controller module map)
|
||||
|
||||
- Added an **EXECUTED** banner: the map's target state is realized — the disk subsystem is deleted,
|
||||
`backup.Manager` split, disk mgmt rewired to the agent, the container de-privileged. The in-guest
|
||||
controller is now Docker-only with no disk/Proxmox privileges.
|
||||
- **§8** — the **8B.2 downtime optimization is now implemented** (was a fast-follow note): in snapshot
|
||||
mode the agent watches the vzdump task log for the snapshot marker (`create storage snapshot`,
|
||||
validated PVE 9.2.2) and emits a **`snapshotted`** phase on `/backup/status`; the controller
|
||||
**resumes its app at `snapshotted`** (not `done`), cutting app downtime from *whole-backup* to
|
||||
*until-snapshot* with **no loss of app-consistency** (the snapshot froze the app-stopped state).
|
||||
Noted the snapshot-capable-storage dependency + the stop-mode **fallback to resume-at-`done`**, and
|
||||
that the controller keeps tracking to `done`/`failed` after early resume.
|
||||
- **§9 slice table** — the 8B row notes 8B.2 implemented.
|
||||
|
||||
## Live validation (cross-repo, on the demo)
|
||||
|
||||
A provisioned **de-privileged** controller v0.37.0 (`Privileged=false`; mounts only bootstrap + data
|
||||
+ docker.sock) drove the agent disk API: `GET /disks` returned data-bearing flags, and a
|
||||
**data-bearing format was refused** (`pending_signature`, nothing formatted) — the security
|
||||
centerpiece, proven live. See the agent + controller REPORTs.
|
||||
A provisioned controller + postgres stack: `quiescing` → `snapshotted — resuming app early` →
|
||||
`backup done`. **App downtime ≈ 3s** (resume at snapshot) vs **≈ 23s** if it had waited for `done`
|
||||
(~87% cut). The snapshot backup restored **clean** (`database system was shut down`, no WAL replay) —
|
||||
the early resume preserved app-consistency. See the agent + controller REPORTs.
|
||||
|
||||
## Deferred
|
||||
|
||||
The operator-signed completion of a data-bearing wipe/format → **slice 10**. No hub change → no
|
||||
deploy. No secrets committed.
|
||||
Snapshot-capable storage required for the win; stop/downgraded storage falls back to resume-at-`done`
|
||||
(8B). No hub change → no deploy. No secrets committed.
|
||||
|
||||
Reference in New Issue
Block a user