diff --git a/CHANGELOG.md b/CHANGELOG.md
index 6d93e72..e871b87 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -22,11 +22,17 @@ from secrets stored in the unit (there are none), and **regenerating nothing**.
 - **Tests:** the gate (all recovered / data-key missing → refuse / empty data-key → refuse / resettable missing → proceed+warn, recovered values used verbatim) and `data_key` parsing from `.felhom.yml` (`Metadata.DataKeyEnvVars()`).
-- **Validation status:** the gate + reconciliation + data_key parsing are unit-tested (authoritative for
- the refuse/proceed/regenerate-nothing behaviour); the capture side is live-validated (v0.53.1, RomM).
- The full live **readable-data e2e** against AdventureLog (deploy → back up → restore → confirm the
- data decrypts) requires triggering the **auth-gated** `/backup/restore` from the dashboard — pending an
- operator-run on the demo.
+- **Live-validated on guest 9201 (AdventureLog, a real data_key app):** its recovery-unit manifest
+ correctly carries `data_key_env_vars: [SECRET_KEY]` (catalog→metadata→manifest flow proven live); and
+ with `SECRET_KEY` made unrecoverable, `POST /backup/restore` **refused** with the exact fail-closed
+ message ("…[SECRET_KEY] could not be recovered … a PBS whole-guest restore is required first…"),
+ **before any compose-up** (no side effects). The demo has no dashboard password, so the API is open
+ (auth + CSRF are both skipped in that mode) — this was driven via the public URL. Gate + reconciliation +
+ orchestration + data_key parsing are also unit-tested.
+- **One e2e not run (environment limit, not a code gap):** the full "deploy with data → restore →
+ confirm data decrypts" — AdventureLog's images don't fit the **8 GB guest rootfs** (the deploy hit "no
+ space left on device"). This is exactly the Phase 3 rootfs-headroom concern, now observed live.
+ Key-preservation/regenerate-nothing is covered by the gate's verbatim-recovery unit test.
 ### v0.53.1 — Phase 2: recovery units refresh on the periodic cache cycle (idempotent) (2026-06-13)
diff --git a/CONTEXT.md b/CONTEXT.md
index 282cce3..8df12c7 100644
--- a/CONTEXT.md
+++ b/CONTEXT.md
@@ -34,8 +34,13 @@ Last updated: 2026-06-12 (storage UX polish)
 > `stacks.RedeployFromEnv`), regenerating nothing. `reconcileRestoreSecrets` (pure, unit-tested) is the
 > fail-closed gate: missing/empty data-key → REFUSE (needs PBS whole-guest restore); missing resettable
 > secret → warn+proceed. Wired into `/backup/restore`. Gate + orchestration + data_key parsing
-> unit/integration-tested; deployed v0.54.0 healthy. **PENDING:** live readable-data e2e vs AdventureLog
-> needs the auth-gated dashboard restore (no web cred in bootstrap.json) — operator-run.
+> unit/integration-tested; deployed v0.54.0 healthy.
+> - **LIVE-validated (9201, AdventureLog):** unit manifest `data_key_env_vars:[SECRET_KEY]`
+> (catalog→manifest live); with SECRET_KEY made unrecoverable, `POST /backup/restore` REFUSED with the
+> exact fail-closed message BEFORE any compose-up. Demo has NO dashboard password → API open (auth+CSRF
+> skipped), driven via public URL. NOTE: full deploy-with-data→restore e2e blocked because AdventureLog
+> images don't fit the 8G guest rootfs ("no space left") — that's the Phase 3 rootfs-headroom concern
+> seen live. Demo left clean (AdventureLog reverted to not-deployed).
 > - Next: Phase 3 (Tier 2 auto off-drive, rootfs-headroom guard), Phase 4 (FileBrowser + UI).
 >
 > **2026-06-13 — v0.52.0 Phase 1 GATE: deploy-side double-nest fix (catalog) + path-agreement test:**
diff --git a/REPORT.md b/REPORT.md
index c37d240..36e7002 100644
--- a/REPORT.md
+++ b/REPORT.md
@@ -71,15 +71,19 @@ persist. (This is what "self-update handles version drift" refers to.)
 →proceed, values used verbatim), the full orchestration (success→recreate-with-merged-env; data-key-missing→refused, recreate never called), and `data_key` parsing from `.felhom.yml`.
-## Validation status (honest)
+## Validation status
 - **Unit/integration-tested (authoritative):** the fail-closed gate, the restore orchestration, secret reconciliation (regenerate-nothing), and the catalog→metadata `data_key` flow.
-- **Live-validated:** the capture side (v0.53.1, RomM — secret-free unit, NO_LEAK grep); v0.54.0 deployed
- + healthy + capture regression clean.
-- **PENDING (auth-gated):** the full live **readable-data e2e** vs AdventureLog (deploy with an
- encryption key → back up → restore → confirm data decrypts) needs triggering the session-authed
- `/backup/restore` from the dashboard. `bootstrap.json` carries no web credential and the password is a
- bcrypt hash, so this needs an operator-run (or the demo dashboard password).
+- **Live-validated (guest 9201):** the capture side (v0.53.1, RomM — secret-free, NO_LEAK). For Phase 2b
+ on **AdventureLog** (a real data_key app): its unit manifest carries `data_key_env_vars: [SECRET_KEY]`
+ (catalog→manifest flow live); and with `SECRET_KEY` made unrecoverable, `POST /backup/restore`
+ **refused** with the exact fail-closed message **before any compose-up** (no side effects). The demo
+ has no dashboard password → the API is open (auth + CSRF skipped), driven via the public URL.
+- **One e2e not run — environment limit, not a code gap:** the full "deploy with data → restore →
+ confirm decrypts" — AdventureLog's images do not fit the **8 GB guest rootfs** (deploy hit "no space
+ left on device"). That is precisely the Phase 3 rootfs-headroom concern, now observed live.
+ Key-preservation is covered by the gate's verbatim-recovery unit test. Demo left clean (AdventureLog
+ reverted to not-deployed, no leftovers).
 ## Still ahead
 Phase 3 (auto off-drive Tier 2 with rootfs-headroom guard) and Phase 4 (FileBrowser scoping + deploy-UI