# Phase 5 spike — PBS mechanism validation (DooPlex server ← N100 client)

> **Status: empirical findings from a live spike (2026-06-09).** PBS was never validated in
> any prior spike (proxmox-platform.md §4.6). This establishes the *real* mechanisms before
> the slice-6 **Phase B** spec is written. No production data; probe → record → teardown. The
> PBS server + datastore + the N100's `felhom-pbs` storage are **left in place** for Phase B's
> live runbook to reuse.

## Topology under test

- **PBS server: DooPlex** `192.168.0.180` (Debian 13 trixie, separate box, backup-only — runs no guests). PBS `proxmox-backup-server 4.2.1-1`.
- **Client: the N100** `demo-felhom` `192.168.0.162` (PVE 9.2.2, `proxmox-backup-client 4.2.0`). It backs up to DooPlex and restores from it.

## Pre-flight

| Check | Result |
|---|---|
| P1 DooPlex can host PBS | ✅ Debian 13 trixie; `proxmox-backup-server` **not** in stock apt — needed the Proxmox **PBS no-subscription** repo (`deb http://download.proxmox.com/debian/pbs trixie pbs-no-subscription`). **Surprise:** there is **no standalone `proxmox-release-trixie.gpg`** (404; only bookworm/bullseye are published) — the trixie key ships in the **`proxmox-archive-keyring`** package (key `24B30F06…0BFE778E`). I copied that keyring from the N100 (PVE9/trixie already has it). 126 GB free on `/`. |
| P2 N100 → DooPlex:8007 | ✅ reachable (8007 closed pre-install, open after). |
| P3 N100 PBS client | ✅ `proxmox-backup-client 4.2.0`, PVE `PBSPlugin.pm` present. |

## Stand-up (DooPlex)

- **S1** — `proxmox-backup-manager datastore create felhom-spike /var/lib/pbs-spike` → `TASK OK`. Services `proxmox-backup-proxy` + `proxmox-backup` active, listening on `:8007`. **Server cert fingerprint:** `3b:95:5a:fa:9e:0e:4a:54:f3:64:08:e5:a2:a2:6c:66:e9:86:44:64:40:8e:c2:f7:6e:41:d2:c2:1e:86:48:c4`.
- **S2** — created PBS user `felhom@pbs` + **API token** `felhom@pbs!n100`, ACL `DatastoreAdmin` on `/datastore/felhom-spike`.
  **⚠️ PBS privsep gotcha (mirrors PVE):** an API token's effective rights = token-ACL ∩ **user**-ACL. Granting only the token wasn't enough — `pvesm add` failed with *"Cannot find datastore"* until `DatastoreAdmin` was **also** granted to the `felhom@pbs` user. Phase B's enrollment must grant the role on **both** the user and the token.

## Adding as an encrypted PVE storage (N100)

- **A1** — `pvesm add pbs felhom-pbs --server 192.168.0.180 --datastore felhom-spike --fingerprint --username 'felhom@pbs!n100' --password --encryption-key autogen --content backup`. Resulting `/etc/pve/storage.cfg`:

  ```
  pbs: felhom-pbs
          datastore felhom-spike
          server 192.168.0.180
          content backup
          encryption-key 01:36:e9:fe:e1:ee:3d:7a:9d:bf:3d:63:d0:68:fd:24:45:b7:5f:bc:b6:82:bc:6d:d2:b4:7a:b0:1a:86:6d:a1
          fingerprint 3b:95:5a:fa:…:48:c4
          username felhom@pbs!n100
  ```

  **Where the keys live on the box (the "live key on box"):**
  - **client encryption key** → `/etc/pve/priv/storage/felhom-pbs.enc` (root:www-data **0600**, 255 B). The `encryption-key` line in storage.cfg is only the key's **fingerprint** (`01:36:e9:fe…`), not the key.
  - **PBS token secret** → `/etc/pve/priv/storage/felhom-pbs.pw` (0600, 37 B).
- **A2** — the slice-5 agent observe (`--selftest=storage`) sees the target with the **fingerprint-pinned durable_id** exactly as designed: `durable=192.168.0.180:felhom-spike#3b:95:5a:fa:…:48:c4`, `type=pbs`, `state=attached`. No agent change needed for observation.

## Probes (B1–B6)

### B1 — backup to PBS

`vzdump 9001 --storage felhom-pbs --mode snapshot` →

- PBS snapshot id **`ct/9001/2026-06-09T14:18:33Z`** (`//`); the underlying `proxmox-backup-client backup … --repository felhom@pbs!n100@192.168.0.180:felhom-spike`.
- **Encrypted client-side**: `--crypt-mode=encrypt`, *"Using encryption key from file descriptor"*, *"Encryption key fingerprint: 01:36:e9:fe:e1:ee:3d:7a"* (matches the storage key). Incremental/deduped (*"reused 41 MiB"*). ~19 s for ~1 GiB.
- **Surprise vs Phase A:** vzdump **chose `stop` mode** for the (stopped) guest even though `snapshot` was requested (`INFO: backup mode: stop`). PVE picks the actual mode; the reported `Backup.mode` is what was *requested*. For a running guest on lvm-thin it would snapshot. (Still crash-consistent only — no fsfreeze, per slice 6.)

### B2 — snapshot inventory → the `PBSSnapshot` wire shape

- **PVE volid** (`pvesm list felhom-pbs`): **`felhom-pbs:backup/ct/9001/2026-06-09T14:18:33Z`**, format `pbs-ct`, type `backup`. This is the exact volid `pct restore` consumes (B3).
- **PBS native** (`proxmox-backup-client snapshot list --output-format json`) per snapshot: `backup-type` (ct|vm), `backup-id`, `backup-time` (epoch int), `size`, `owner` (`felhom@pbs!n100`), `protected` (bool), `fingerprint` (the encryption-key fp), and `files[]` each with `filename` + `size` + **`crypt-mode`** (`encrypt` for data, `sign-only` for `index.json`). **`verification` is ABSENT until a verify runs** (see B4). **Namespace**: not shown → the default (root) namespace; a `ns` field appears only for non-root namespaces.
- → **Proposed `PBSSnapshot`**: `namespace`, `backup_type`, `backup_id`, `backup_time`, `size_bytes`, `owner`, `protected`, `encrypted` (derive from `files[].crypt-mode`), `verify_state` (ok|failed|none), `verified_at`/`verify_upid`.

### B3 — restore from PBS

`pct restore 990001 'felhom-pbs:backup/ct/9001/2026-06-09T14:18:33Z' --storage local-lvm` → restored + booted to `running`. **The existing restore path works UNCHANGED against a pbs-sourced volid** — same `volid` + `--storage` shape the agent's `RestoreLXC` already uses (`ostemplate=`, `restore=1`). **No agent restore code change needed for PBS.** PVE pulls + decrypts using the storage's `.enc` key automatically.

### B4 — verify mechanism (the big unknown — resolved)

- **`proxmox-backup-client` has NO `verify` subcommand** — verify is **server-side**.
- Triggers: server CLI `proxmox-backup-manager verify [--ignore-verified] [--outdated-after N]` **on DooPlex**, OR the **PBS API** `POST /api2/json/admin/datastore/ /verify` (whole datastore; per-snapshot params available).
- **The agent on the N100 CAN drive it remotely** via the PBS API + token (no DooPlex shell needed). Proven: `curl -X POST …/admin/datastore/felhom-spike/verify` with header `Authorization: PBSAPIToken=felhom@pbs!n100:` returned a task UPID `UPID:dooplex:…:verify:felhom\x2dspike:felhom@pbs!n100:`. Needs `Datastore.Verify` (in `DatastoreAdmin`).
- **Result read-back:** after verify, the snapshot's **`verification` field** appears: `{"state":"ok","upid":"UPID:dooplex:…"}` (read via `snapshot list`). So the agent triggers via API → polls/re-lists → reads `verification.state` (`ok`/`failed`). (Task-status polling needs the PBS **node name** — it's `dooplex`, embedded in the UPID; `localhost` returns `exitstatus: unknown`.)

### B5 — agent-token (`felhom-agent@pve`) privileges — **no gap**

Driven by the agent (operator token, not root@pam):

- **Backup to PBS** (`--selftest=backup`): ✅ `felhom-pbs:backup/ct/9001/2026-06-09T14:22:30Z`, crash_consistent, success.
- **Restore from PBS** (`--selftest=restore-test`): ✅ restored into scratch 990000, booted, verified `running`, torn down — pass.
- The **FelhomAgent role's existing `Datastore.{Audit,Allocate,AllocateSpace}` + `VM.Backup` suffice** for both backup-to-PBS and restore-from-PBS. **No role widening needed.** (Two auth layers: the *PVE* operator token authorizes the vzdump/restore API call; the *PBS* token in storage.cfg authenticates PVE→PBS. The spike exercised both.)

### B6 — zero-knowledge confirmed

- All data files are `crypt-mode=encrypt` (B2); `index.json` is `sign-only`.
- **Without the key**, an authenticated restore **fails to decrypt**: `proxmox-backup-client restore … pct.conf.blob -` (no `--keyfile`) → `Error: missing key - manifest was created with key 01:36:e9:fe:e1:ee:3d:7a`.
- **With** `--keyfile /etc/pve/priv/storage/felhom-pbs.enc` → decrypts (returns the guest config). The key is the *only* gate.
- **The PBS server holds no client key** — `find /etc/proxmox-backup /var/lib/pbs-spike` for key material returns only the server's own `csrf.key`, never the client encryption key. So DooPlex can store + serve chunks but cannot read guest data. Zero-knowledge holds: the live key on the N100 is the irreducible residual (the operator/hub can't read the data).

## Implications for the Phase B spec (flagged surprises vs the dir-storage assumptions)

1. **Enrollment must grant the PBS role on BOTH the user AND the token** (PBS privsep), and add the `pbs` storage with `--encryption-key autogen` → the live key lands at `/etc/pve/priv/storage/.enc` (the "live PBS key on the box", doc 03 §8). The hub holds only the recovery-code-wrapped escrow (out of scope here).
2. **Backup + restore need NO new code** beyond targeting a `pbs` storage — `Vzdump` and `RestoreLXC`/`pct restore` work against pbs volids unchanged. The agent's `LatestBackupVolID` (StorageContent filter) already resolved the pbs volid.
3. **Verify is a NEW capability to build**: a server-side op the agent triggers **remotely via the PBS API** (`POST …/datastore//verify`) using the storage's token, then reads back `verification.state` from the snapshot list. This is the "lighter frequent integrity check" (§8) — it does NOT need the encryption key (ciphertext-level), unlike the full self-restore-test. Phase B needs a small PBS-API client (token auth, fingerprint pin) for verify + snapshot-list-with-verify-state; the existing `proxmox.Client` (PVE API) does not cover it.
4. **`PBSSnapshot` wire shape** = the B2 fields; `verify_state` is the load-bearing one and is `none` until a verify runs.
5. **vzdump mode** is PVE's choice (stop for stopped guests) — report requested-vs-actual if it matters, or read the actual mode from the task log.
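
Implications 3 and 4 can be illustrated with a minimal Python sketch, assuming the snapshot-list JSON fields recorded in B2 and the verify request shape recorded in B4. Everything named here is hypothetical and is not the agent's actual code: the function names `to_pbs_snapshot`, `verify_request`, and `upid_node`, the dict return shape, the empty string standing for the root namespace, and the rule used to derive `encrypted` (every file except the `index.json` manifest has `crypt-mode` `encrypt`). `verified_at` is omitted because the spike did not record where that timestamp would come from. No network call or fingerprint pinning is shown.

```python
from __future__ import annotations

from typing import Any


def to_pbs_snapshot(entry: dict[str, Any]) -> dict[str, Any]:
    """Map one native snapshot-list entry (B2) to the proposed PBSSnapshot fields."""
    files = entry.get("files", [])
    # Assumption: the manifest (index.json) is sign-only by design, so it is
    # excluded; every remaining file must be encrypted for `encrypted` to hold.
    data_files = [f for f in files if not f.get("filename", "").startswith("index.json")]
    encrypted = bool(data_files) and all(
        f.get("crypt-mode") == "encrypt" for f in data_files
    )

    # `verification` is absent until a verify has run (B2, B4).
    verification = entry.get("verification") or {}
    state = verification.get("state")
    verify_state = state if state in ("ok", "failed") else "none"

    return {
        # `ns` appears only for non-root namespaces; "" is this sketch's root marker.
        "namespace": entry.get("ns", ""),
        "backup_type": entry["backup-type"],
        "backup_id": entry["backup-id"],
        "backup_time": entry["backup-time"],
        "size_bytes": entry.get("size"),
        "owner": entry.get("owner"),
        "protected": bool(entry.get("protected", False)),
        "encrypted": encrypted,
        "verify_state": verify_state,
        "verify_upid": verification.get("upid"),
    }


def verify_request(
    server: str, datastore: str, token_id: str, secret: str, port: int = 8007
) -> tuple[str, dict[str, str]]:
    """Build the URL and headers for the datastore verify POST described in B4."""
    url = f"https://{server}:{port}/api2/json/admin/datastore/{datastore}/verify"
    headers = {"Authorization": f"PBSAPIToken={token_id}:{secret}"}
    return url, headers


def upid_node(upid: str) -> str:
    """Return the node name embedded in a UPID (the field after the 'UPID' prefix)."""
    parts = upid.split(":")
    if len(parts) < 3 or parts[0] != "UPID" or not parts[1]:
        raise ValueError(f"not a UPID: {upid!r}")
    return parts[1]
```

In this sketch a caller would POST to the URL from `verify_request`, take the node for task-status polling from `upid_node` on the returned UPID (B4 notes that `localhost` does not work), then re-list snapshots and read `verify_state` from `to_pbs_snapshot`. Keeping the mapping a pure function means it can be tested against captured `snapshot list` output without a live PBS.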
## Teardown / left-in-place

- Throwaway restore guest **990001 destroyed**; agent restore-test scratch self-torn-down; `pct list` → **no leftover guests**. Agent config reverted (`backup.local_backup_target` → `local`). Token-secret temp files removed from both boxes.
- **Left in place for Phase B:** the PBS server on DooPlex, the `felhom-spike` datastore (with two test snapshots of 9001), the `felhom@pbs!n100` token + ACLs, and the N100's `felhom-pbs` encrypted storage (+ its `.enc`/`.pw` under `/etc/pve/priv/storage/`).
- **No secrets committed** — the encryption key, token secret, and PBS password live only in `/etc/pve/priv/storage/` (0600) on the N100; this doc references them by location/fingerprint only.