Slice 7 — PBS recovery-code escrow round-trip: Findings
Host: demo-felhom (192.168.0.162) + PBS on DooPlex (192.168.0.180), PVE 9.2.2 / Debian 13. proxmox-backup-client 4.x.
Date: 2026-06-10. Driver: SPIKE — validate the wrap → lose → unwrap → restore flow
end-to-end on a fenced throwaway before specing the agent's escrow creation (slice-6 verifyfail
discipline). No real key K or datastore was touched.
REDACTED by policy. No K value, no recovery code value, and no token secret appears here: only command shapes, blob size/format, fingerprint matching (not contents), and R entropy/format (not the value). Throwaway key Kt / throwaway passphrase Rt only.
1. Setup (all throwaway, torn down)
- Throwaway datastore escrowspike on DooPlex (/mnt/5_hdd/pbs-escrowspike), ACLs for felhom@pbs + felhom@pbs!n100. The real felhom-spike datastore was never used.
- Throwaway PBS client key Kt (key create --kdf none; this mirrors the live K posture: stored unencrypted so the agent backs up + restore-tests unattended).
- The felhom@pbs!n100 token (real) was used for auth only to the throwaway datastore; the encryption key under test (Kt) is throwaway. (Same separation as the verifyfail runbook.)
2. The validated command sequence (this is the Phase-B contract)
PBS's key+passphrase path is the wrap mechanism (no bespoke crypto). The blob is a PBS key file
re-keyed from kdf=none to kdf=scrypt under the recovery code; recovery reverses it.
- Wrap K under R (escrow create). Copy the live key, then re-key the copy:

      cp <K-keyfile> <blob>
      proxmox-backup-client key change-passphrase <blob> --kdf scrypt   # prompts: New + Verify

- Unwrap (recover K from the blob with R):

      proxmox-backup-client key change-passphrase <blob> --kdf none   # prompts: Encryption Key Password
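The copy step above can be sketched as follows. This is a minimal illustration in Python (the language the spike runner used), not the agent's implementation. The helper names copy_key_private, wrap_argv and unwrap_argv are hypothetical; only the command shapes come from the sequence above. Because the live key is stored with kdf=none, the copy is a plaintext key until it has been re-keyed, so the sketch creates it with mode 0600 and refuses to overwrite an existing file.

```python
import os

CLIENT = "proxmox-backup-client"


def copy_key_private(keyfile, blob):
    """Copy the key file to `blob`, created 0600, never overwriting."""
    with open(keyfile, "rb") as src:
        data = src.read()
    # O_EXCL makes the call fail if `blob` already exists.
    fd = os.open(blob, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as dst:
        dst.write(data)


def wrap_argv(blob):
    """Command shape for wrapping: re-key the copy to kdf=scrypt."""
    return [CLIENT, "key", "change-passphrase", blob, "--kdf", "scrypt"]


def unwrap_argv(blob):
    """Command shape for recovery: re-key the blob back to kdf=none."""
    return [CLIENT, "key", "change-passphrase", blob, "--kdf", "none"]
```

Both argument lists still have to be run on a pty, since the passphrase prompts are not satisfiable through arguments or the environment.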
F-A1 — change-passphrase is TTY-only; PBS_ENCRYPTION_PASSWORD is NOT consulted
Both directions prompt on the controlling terminal and fail with "unable to change passphrase - no tty"
when run non-interactively; the env var does not supply the new/old passphrase. The agent
must drive it via a pty (Go: a pty pair; the spike used pty.fork()), feeding the passphrase
once per prompt: wrap → twice (New + Verify), unwrap → once (Encryption Key Password).
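A minimal sketch of such a driver, in Python because the spike used pty.fork(); the agent itself would use a Go pty pair. The function name drive_prompts and the prompt markers passed to it are assumptions for illustration: the exact prompt text of proxmox-backup-client should be confirmed before relying on it. Each answer is sent only after its marker has appeared, because a program may flush pending terminal input before it prompts.

```python
import os
import pty
import select
import signal
import time


def drive_prompts(argv, steps, timeout=30.0):
    """Run argv on a fresh pty and answer each prompt exactly once.

    steps: list of (marker, answer) byte pairs. An answer is written only
    after its marker shows up in the child's output. Output is discarded.
    Returns the exit code (negative signal number if the child was killed).
    """
    pid, master = pty.fork()
    if pid == 0:  # child: the pty slave is its controlling terminal
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed

    pending = list(steps)
    seen = b""  # output since the last answer; dropped, never logged
    deadline = time.monotonic() + timeout
    try:
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                os.kill(pid, signal.SIGKILL)
                break
            ready, _, _ = select.select([master], [], [], remaining)
            if not ready:
                continue  # the deadline check above ends the loop
            try:
                chunk = os.read(master, 1024)
            except OSError:  # EIO on Linux once the child has exited
                break
            if not chunk:
                break
            if pending:
                seen += chunk
                marker, answer = pending[0]
                if marker in seen:
                    os.write(master, answer + b"\n")
                    pending.pop(0)
                    seen = b""
    finally:
        os.close(master)
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)
```

Under these assumptions a wrap would pass two steps (one for the New prompt, one for Verify) and an unwrap a single step, with a non-zero return value treated as failure.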
F-A2 — the pty echoes the passphrase → the driver MUST discard pty output
The pty's line discipline echoes the fed passphrase back on the master fd. The wrapper must
discard the pty's output (never copy it to stdout/log) and ideally run echo-off, so R cannot
leak through captured output. (The spike's redacted runner returns the child output only to satisfy
pty.spawn's progress loop and sends the whole invocation's stdout to /dev/null.)
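The echo behaviour can be reproduced without PBS at all. The sketch below is an illustration only (helper names are hypothetical, and the default terminal settings described are those of a Linux pty): it writes a dummy value to a pty master, reads it back from the same fd, then clears the ECHO flag and shows that nothing comes back.

```python
import os
import select
import termios


def read_available(fd, wait=0.2):
    """Return whatever becomes readable on fd within `wait` seconds."""
    out = b""
    while select.select([fd], [], [], wait)[0]:
        try:
            chunk = os.read(fd, 1024)
        except OSError:
            break
        if not chunk:
            break
        out += chunk
    return out


def disable_echo(slave_fd):
    """Clear ECHO/ECHONL in the pty's local flags (index 3 of the attrs)."""
    attrs = termios.tcgetattr(slave_fd)
    attrs[3] &= ~(termios.ECHO | termios.ECHONL)
    termios.tcsetattr(slave_fd, termios.TCSANOW, attrs)


def echo_demo(secret=b"not-a-real-code"):
    """Return (bytes echoed with echo on, bytes echoed with echo off)."""
    master, slave = os.openpty()
    try:
        os.write(master, secret + b"\n")
        with_echo = read_available(master)     # line discipline echoes input
        disable_echo(slave)
        os.write(master, secret + b"\n")
        without_echo = read_available(master)  # nothing is echoed now
        return with_echo, without_echo
    finally:
        os.close(master)
        os.close(slave)
```

Echo-off is a second layer only: the child may change the terminal attributes itself, so never forwarding the master's output remains the primary safeguard.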
F-A3 — blob format + size
The blob is the standard PBS key JSON (kdf: scrypt, scrypt params, data, fingerprint,
created). ~383 bytes, opaque. The fingerprint is preserved across wrap→unwrap (it
identifies the underlying key, not the passphrase) — the spike used it to prove same-key recovery.
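The same-key check can be done without ever handling key material, for example as in this sketch. It assumes only that the key file is a JSON object with a top-level fingerprint field, as listed above; the function names are hypothetical.

```python
import json
import os


def key_fingerprint(path):
    """Read only the fingerprint field from a PBS key JSON file."""
    with open(path, "r", encoding="utf-8") as fh:
        doc = json.load(fh)
    return doc["fingerprint"]


def same_underlying_key(path_a, path_b):
    """True if both files carry the same fingerprint.

    The fingerprint identifies the key, not the passphrase, so a wrapped
    blob and the recovered key should compare equal.
    """
    return key_fingerprint(path_a) == key_fingerprint(path_b)


def blob_size(path):
    """Size in bytes, for sanity-checking against the expected few hundred."""
    return os.path.getsize(path)
```

Comparing fingerprints rather than file contents matters because the wrapped and unwrapped files differ in every other field that depends on the passphrase.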
3. Results (round-trips — the actual tests)
- Crypto round-trip: create Kt (kdf=none) → wrap to kdf=scrypt (383 B) → remove Kt → unwrap with Rt → recovered key kdf=none with fingerprint identical to the original (match=True). A wrong passphrase is rejected (change-passphrase exits non-zero; blob stays scrypt).
- Backup → recover → restore (the load-bearing test): wrote a canary file → encrypted backup to escrowspike with Kt (--crypt-mode encrypt) → wrap Kt under Rt → remove Kt → unwrap with Rt → proxmox-backup-client restore <snap> data.pxar <out> --keyfile <recovered> → the canary content came back byte-identical (canary-match=True). The R-recovered key decrypts a real encrypted snapshot, so slice-10 recovery is pre-validated a slice early.
- Gotcha (test-harness only, not the mechanism): a snapshot must be restored with the same key it was made with, so selecting the newest snapshot by backup-time matters when stale snapshots exist.
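For the test-harness gotcha, selecting the newest snapshot could look like the sketch below. It assumes the harness already has the snapshot list as parsed JSON records with backup-type, backup-id and an epoch backup-time; those field names and the type/id/timestamp path shape are assumptions to check against the client's actual JSON output.

```python
from datetime import datetime, timezone


def newest_snapshot(snapshots, backup_type="host", backup_id=None):
    """Pick the record with the largest backup-time in the given group."""
    candidates = [
        s for s in snapshots
        if s["backup-type"] == backup_type
        and (backup_id is None or s["backup-id"] == backup_id)
    ]
    if not candidates:
        raise ValueError("no matching snapshot")
    return max(candidates, key=lambda s: s["backup-time"])


def snapshot_path(snap):
    """Format a record as type/id/UTC-timestamp for use as <snap>."""
    ts = datetime.fromtimestamp(snap["backup-time"], tz=timezone.utc)
    stamp = ts.strftime("%Y-%m-%dT%H:%M:%SZ")
    return "{}/{}/{}".format(snap["backup-type"], snap["backup-id"], stamp)
```

Picking by timestamp rather than list order avoids restoring a stale snapshot that was encrypted with an earlier throwaway key.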
4. R (recovery code) — chosen entropy/format (implemented in Phase B)
- Generated with crypto/rand; ≥128 bits.
- Word-list form for off-paper transcription by a non-technical household: EFF large wordlist (7776 words, 12.92 bits/word), 10 words → ~129 bits, space/hyphen separated. (Raw base32 invites typos; the diceware form is the standard for human-entered passphrases.)
- Surfaced to the customer exactly once (selftest stdout on the demo; enrollment UX later). Never logged/persisted/committed.
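The entropy arithmetic and the generation step can be sketched as follows. This is a Python stand-in using the secrets module; Phase B is specified with Go's crypto/rand, and the function names and the wordlist path argument here are hypothetical. The loader assumes the EFF file layout of a dice number, whitespace, then the word on each line.

```python
import math
import secrets


def load_wordlist(path):
    """Read a wordlist file, keeping the last whitespace-separated token."""
    with open(path, "r", encoding="utf-8") as fh:
        return [line.split()[-1] for line in fh if line.strip()]


def entropy_bits(wordlist_size, words):
    """Entropy of `words` independent uniform picks from the list."""
    return words * math.log2(wordlist_size)


def generate_recovery_code(wordlist, words=10, sep="-", min_bits=128):
    """Draw words with a CSPRNG; refuse parameters below min_bits."""
    if len(set(wordlist)) != len(wordlist):
        raise ValueError("wordlist contains duplicates")
    if entropy_bits(len(wordlist), words) < min_bits:
        raise ValueError("not enough entropy for the requested length")
    return sep.join(secrets.choice(wordlist) for _ in range(words))
```

With 7776 words, entropy_bits(7776, 10) is about 129.2, while 9 words would give about 116.3 and be rejected by the 128-bit floor. The returned string should be shown once and never logged.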
5. Teardown
escrowspike datastore removed (--destroy-data true) + ACLs deleted + dir removed;
felhom-spike (real) untouched; all throwaway keys/blobs/scripts on the demo host removed. No
R/K/token value was written anywhere.
6. Verdict
READY to implement Phase B (agent escrow creation) and Phase C (hub opaque storage). The
PBS-native wrap is validated, recovery is proven to restore real encrypted data, and the two
implementation constraints are pinned: drive change-passphrase via a pty (F-A1), and discard the
pty output so R can't leak (F-A2). Escrow consumption / restore-mode serving stays slice 10
(but is now de-risked by this round-trip).