# Phase 1 + 2 — Privilege Model & Backup/Restore Round-Trip: Findings

**Host:** `demo-felhom` (192.168.0.162) — Proxmox VE 9.2.2, node confirmed via `pvesh get /nodes` → `demo-felhom`.
Storage: `local` (dir, content `iso,vztmpl,backup,import`), `local-lvm` (LVM-thin, `rootdir,images`).

**Subject:** LXC `9001` (`spike-lxc`, unprivileged, `nesting=1,keyctl=1`, Docker + postgres/redis/nginx stack).

**Date:** 2026-06-07.

> Data and observations only — **no recommendation or verdict**.

## Hypotheses — verdicts at a glance

| | Hypothesis | Result |
|---|---|---|
| **H1** | Backup scopes to one VMID; restore/create needs node/pool allocate → denied to narrow token | **CONFIRMED** (create CT = 403) |
| **H2** | An LXC vzdump captures the Docker volumes (they live in the container rootfs) | **CONFIRMED** (sentinel survived both restores) |
| **H3** | Crash-consistent (running) *and* quiesced (stopped) backups both restore cleanly | **CONFIRMED** (A via WAL recovery, B clean start) |
| **H4** | Running unprivileged LXC snapshots on LVM-thin; restored CT keeps unprivileged+nesting/keyctl | **CONFIRMED** (live snapshot OK; config survived) |

---

## 1. Phase 1 — Privilege model

### 1.1 Setup (operator side, root)

```
pveum role add FelhomSelfBackup -privs "VM.Audit VM.Snapshot VM.Backup Datastore.AllocateSpace Datastore.Audit"
pveum user add felhom-ctl@pve --comment "spike in-guest controller"
pveum user token add felhom-ctl@pve ctl --privsep 1      # secret: b6547d9d-... (ephemeral, spike-only)
pveum acl modify /vms/9001      -token 'felhom-ctl@pve!ctl' -role FelhomSelfBackup
pveum acl modify /storage/local -token 'felhom-ctl@pve!ctl' -role FelhomSelfBackup
```

Privilege names were verified against `PVEVMAdmin` / `PVEDatastoreUser` via `pveum role list` first.
**Note:** the reference doc's introspection command `pveum role info` **does not exist in PVE 9** — only `pveum role list` works.
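
The privilege check against `pveum role list` can also be scripted. The following is a minimal sketch, assuming the comma-separated privilege string printed for the role (as in the §4.2 log) is passed in as plain text. The function name `role_has_privs` is hypothetical and not part of any PVE tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a PVE command): succeed only if every required
# privilege appears in a comma-separated privilege list, which is the format
# shown for a role in the `pveum role list` output.
role_has_privs() {
  local have priv
  have=",$1,"          # wrap in commas so each privilege matches as a whole token
  shift
  for priv in "$@"; do
    case "$have" in
      *",$priv,"*) ;;                                  # present, keep checking
      *) echo "missing: $priv" >&2; return 1 ;;        # first gap ends the check
    esac
  done
  return 0
}

# Privilege string copied from the FelhomSelfBackup row in the §4.2 log.
privs="Datastore.AllocateSpace,Datastore.Audit,VM.Audit,VM.Backup,VM.Snapshot"

role_has_privs "$privs" VM.Audit VM.Snapshot VM.Backup && echo "role covers self-ops"
role_has_privs "$privs" VM.PowerMgmt 2>/dev/null || echo "VM.PowerMgmt absent"
```

Matching on whole comma-delimited tokens avoids a prefix such as `VM.Back` being accepted as `VM.Backup`.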
### 1.2 ⚠️ Privsep gotcha — the doc's runbook is incomplete

With `--privsep 1`, a token's effective rights are the **intersection of the backing user's permissions AND the token's own ACLs**. The reference doc (§3) grants ACLs to the **token only**. With the user `felhom-ctl@pve` holding **no** permissions, the intersection was **empty** — the first self-audit call returned:

```
HTTP 403 {"message":"Permission check failed (/vms/9001, VM.Audit)\n"}
```

**Fix applied:** also grant the user the role on the same paths (`pveum acl modify /vms/9001 -user felhom-ctl@pve -role FelhomSelfBackup`, same for `/storage/local`). After that the self-calls succeeded. **A privsep token needs the permission present on *both* the user and the token** (the token ACL is what keeps the token ≤ user / narrowly scoped). This must be reflected in the controller provisioning.

### 1.3 Test matrix (every call run from **inside** the unprivileged LXC, `pct exec 9001`)

`H=192.168.0.162 N=demo-felhom AUTH="PVEAPIToken=felhom-ctl@pve!ctl="`

| # | Call | Expected | **Actual** | Notes |
|---|---|---|---|---|
| 1 | `GET /version` | 200 | **200** | reachable + auth from inside LXC (no privilege needed) |
| 2 | `GET /nodes/$N/lxc/9001/status/current` | 200 | **200**¹ | self audit (after privsep fix) |
| 3 | `POST /nodes/$N/lxc/9001/snapshot snapname=spk1` | 200/UPID→OK | **200, task exitstatus OK** | **running-LXC self-snapshot (H4)** |
| 4 | `POST /nodes/$N/vzdump vmid=9001 storage=local mode=snapshot` | 200/UPID→OK | **200, task exitstatus OK** | self backup, archive produced |
| 5 | `GET /nodes/$N/qemu/9000/status/current` | 403 | **403** | `Permission check failed (/vms/9000, VM.Audit)` |
| 6 | `POST /nodes/$N/vzdump vmid=9000 storage=local` | 403 | **200 POST → task exitstatus 403**² | see note |
| 7 | `POST /nodes/$N/lxc` (create CT) | 403 | **403** | `Permission check failed` — **proves create/allocate is operator-tier (H1)** |

¹ before the privsep fix this was 403; see §1.2.
² **Important nuance:** the `vzdump` endpoint accepts the POST and returns a UPID even for an unauthorized vmid; the authorization failure surfaces at **task execution**, not at the HTTP layer. Polled from root: `exitstatus: "403 Permission check failed (/vms/9000, VM.Backup)"`, and **no 9000 archive was created**. The boundary holds — but a controller must **poll the task exitstatus**, not trust the POST's 200, to know a cross-guest backup was actually refused.

**Pass criteria met:** self-ops (1–4) succeed; cross-guest read (5), cross-guest backup (6, at task level), and create/allocate (7) are denied. The controller-as-guest boundary and the two-tier split are validated.

### 1.4 Final minimal role — `VM.PowerMgmt` **not** required

The doc left an open question: "does Tier A need `VM.PowerMgmt` for stop-mode backups? Likely yes". **Tested and refuted:** a **stop-mode** self-vzdump submitted by the token (`vmid=9001 mode=stop`) completed with **`exitstatus: OK`** using the role *without* `VM.PowerMgmt`. `vzdump` performs the guest shutdown/restart internally under `VM.Backup`; no separate power privilege is needed.

> **Final minimal role (`FelhomSelfBackup`) — satisfies self-audit, self-snapshot, and
> both `snapshot`- and `stop`-mode self-backup:**
> `VM.Audit, VM.Snapshot, VM.Backup, Datastore.AllocateSpace, Datastore.Audit`
> (`VM.PowerMgmt` deliberately omitted — confirmed unnecessary.)

### 1.5 TLS observation

From inside the LXC, `curl` **without** `-k`:

```
curl: (60) SSL certificate problem: unable to get local issuer certificate
```

The host serves the default self-signed PVE cert; all tests used `-k`. Production trust (pin the PVE CA / issue a proper cert) is a separate design decision, flagged here.

### 1.6 Running-LXC snapshot (H4)

Call #3 snapshotted the **running** unprivileged LXC on LVM-thin (`exitstatus OK`). `pct listsnapshot 9001` shows `spk1` with `pct status 9001 = running`.
**No stop required** — the snapshot-before-update rollback flow is viable on a live container.

---

## 2. Phase 2 — Backup → real restore round-trip

Sentinel written pre-flight into the `pgdata` volume: `restore_check(42,'phase2-sentinel')` → clean read `42|phase2-sentinel`.

### 2.1 Backups (operator/root side)

| Variant | Mode | Stack state | Task time | Wall | Archive | Size (zstd) |
|---|---|---|---|---|---|---|
| **A — crash-consistent** | `snapshot` | **running** | 00:00:24 | 25 s | `vzdump-lxc-9001-2026_06_07-20_13_43.tar.zst` | **934 MB** (979,718,569 B) |
| **B — quiesced** | `snapshot` | **stopped** (`docker compose stop`) | 00:00:21 | 22 s | `vzdump-lxc-9001-2026_06_07-20_14_40.tar.zst` | **934 MB** (979,671,582 B) |

Both from a source reported as 2.5 GiB (2,585,589,760 B and 2,585,825,280 B written); zstd → ~934 MB (~2.6:1 on the exact byte counts). The stack was restarted after Variant B. **LXC snapshot-mode vzdump does *not* fsfreeze** (no guest agent in an LXC — consistent with the Phase 0 finding) → Variant A is genuinely crash-consistent.

### 2.2 Restore → fresh VMID → boot → verify

| Check | 9002 (Variant A) | 9003 (Variant B) |
|---|---|---|
| Restore time (`pct restore … --storage local-lvm`) | **12 s** | **11 s** |
| `unprivileged: 1` survived | **yes** | **yes** |
| `features: nesting=1,keyctl=1` survived | **yes** | **yes** |
| Containers after boot | `exited` (no restart policy) → `docker compose up -d` | same |
| 3 containers healthy | **yes** | **yes** |
| `curl localhost:8080` | **HTTP 200** | **HTTP 200** |
| **Sentinel `(42,'phase2-sentinel')`** | **PRESENT** | **PRESENT** |
| Postgres first-start | **WAL crash recovery** (see below) | **clean start, no recovery** |

> Restored CTs inherit 9001's fixed `hwaddr`. To avoid a MAC clash with the still-running
> 9001 on `vmbr0`, `net0` was reset to auto-generate a fresh MAC before boot. All
> verification (stack health, `curl localhost`, sentinel) is guest-internal and needs no
> external network — and the Docker images are inside the restored rootfs, so no pulls.
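
The MAC reset described above amounts to re-applying `net0` without its `hwaddr` key, as the `pct set` call in §4.5 did. A minimal sketch of that string handling follows, assuming the restored container's `net0` value is available as a comma-separated option string. The helper name `strip_hwaddr` and the example MAC are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: drop the hwaddr=... key from a net0 option string so the
# value can be re-applied without the MAC inherited from the source container.
strip_hwaddr() {
  # First expression removes the key (with its leading comma, if any);
  # second expression cleans up a leading comma when hwaddr was the first key.
  printf '%s\n' "$1" | sed -E 's/(^|,)hwaddr=[^,]*//; s/^,//'
}

# Illustrative value only; the MAC below is not taken from the spike.
net0='name=eth0,bridge=vmbr0,hwaddr=BC:24:11:00:00:01,ip=dhcp'
strip_hwaddr "$net0"   # prints: name=eth0,bridge=vmbr0,ip=dhcp

# On the PVE host the result could feed the same call used in §4.5, e.g.:
#   pct set 9002 -net0 "$(strip_hwaddr "$net0")"
```

Keeping every other key untouched means bridge and IP settings survive the reset; hostname and static-IP clashes would still need separate handling.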
**Variant A — Postgres automatic WAL recovery on 9002 (verbatim, post-restore boot):**

```
LOG: database system was interrupted; last known up at 2026-06-07 18:13:21 UTC
LOG: database system was not properly shut down; automatic recovery in progress
LOG: redo starts at 0/CB12838
LOG: invalid record length at 0/CB12870: expected at least 24, got 0   # normal end-of-WAL
LOG: redo done at 0/CB12838 ...
LOG: checkpoint starting: end-of-recovery immediate wait
LOG: database system is ready to accept connections
```

**Variant B — clean start on 9003 (verbatim, post-restore boot):**

```
LOG: database system was shut down at 2026-06-07 18:14:39 UTC
LOG: database system is ready to accept connections
```

**H2 confirmed:** one LXC vzdump captured the whole customer including the Docker named volume — the sentinel data restored in both guests.

**H3 confirmed:** both variants restored to a bootable guest with intact data; the crash-consistent one recovered via WAL with no manual intervention, the quiesced one started clean.

**H4 confirmed:** restored config preserved `unprivileged` + `nesting/keyctl`, so Docker ran in the restored CT.

---

## 3. Observations & confounds

1. **Privsep token needs perms on user *and* token** (§1.2) — the single most important correction to the reference runbook; without it every scoped call 403s.
2. **vzdump authorization is task-level, not POST-level** (§1.3 note ²) — a 200 + UPID does **not** mean authorized. The controller must poll `exitstatus`. This is also the general async-task lesson: every backup/snapshot/restore returns a UPID and the real result is in the task status.
3. **`pveum role info` is gone in PVE 9** — use `pveum role list`. Minor doc drift.
4. **`VM.PowerMgmt` not needed for stop-mode backup** (§1.4) — narrower role than the doc assumed.
5. **No fsfreeze for LXC** — Variant A relied on Postgres's own WAL crash recovery, which worked here for an idle-at-backup DB.
   Under heavy write load, app-consistency for LXC still rests on the controller quiescing first (or stop-mode), exactly as the reference warned. This single test is not a durability guarantee under load.
6. **Restore MAC collision** (§2.2) — `pct restore` preserves the source `hwaddr`; restoring while the original runs needs a MAC reset (or the original stopped). The controller's restore flow must handle identity (MAC/hostname/IP) to avoid clashes.
7. **No restart policy on the compose services** — restored containers came up `exited`; `docker compose up -d` (or a restart policy / systemd unit) is required for the stack to return automatically after a restore or guest reboot.
8. **Restore is fast, backup dominated by I/O** — restores were 11–12 s (extract at ~524 MiB/s); backups ~22–25 s (read 2.5 GiB at ~108–119 MiB/s + zstd). Single runs, idle host, ~150 MB DB; not a throughput benchmark.
9. **Sequencing artifact:** a Phase-1 stop-mode self-backup ran before Phase 2 and stopped/started 9001; the stack was brought back up and the sentinel re-verified before the Variant A/B backups, so it does not affect the round-trip results.

---

## 4. Raw command log (appendix)

### 4.1 Pre-flight

```
$ pvesh get /nodes                        -> node: demo-felhom
$ cat /etc/pve/storage.cfg
  dir: local ... content iso,vztmpl,backup,import      # 'backup' present
  lvmthin: local-lvm ... content rootdir,images        # no backup (expected)
$ pct start 9001 ; docker compose up -d   -> 3 containers Started
$ curl localhost:8080                     -> HTTP 200
# sentinel: CREATE TABLE ; INSERT 0 1 ; SELECT count -> 1 ; SELECT * -> 42 | phase2-sentinel
```

### 4.2 Phase 1 — role/user/token/ACL

```
$ pveum role add FelhomSelfBackup -privs "VM.Audit VM.Snapshot VM.Backup Datastore.AllocateSpace Datastore.Audit"   -> role-ok
$ pveum user add felhom-ctl@pve --comment "spike in-guest controller"   -> user-ok
$ pveum user token add felhom-ctl@pve ctl --privsep 1
  {"full-tokenid":"felhom-ctl@pve!ctl","info":{"privsep":"1"},"value":"b6547d9d-08ec-4f22-beb8-a551dc2cd69d"}
$ pveum acl modify /vms/9001 -token 'felhom-ctl@pve!ctl' -role FelhomSelfBackup        -> ok
$ pveum acl modify /storage/local -token 'felhom-ctl@pve!ctl' -role FelhomSelfBackup   -> ok
$ pveum role list | grep FelhomSelfBackup
  FelhomSelfBackup | Datastore.AllocateSpace,Datastore.Audit,VM.Audit,VM.Backup,VM.Snapshot
$ pveum role info FelhomSelfBackup   -> ERROR: unknown command 'pveum role info'   # PVE9 has no 'role info'
```

### 4.3 Phase 1 — matrix (from inside LXC)

```
# TLS without -k:
curl: (60) SSL certificate problem: unable to get local issuer certificate

# BEFORE privsep fix:
#2 GET self status -> HTTP 403 {"message":"Permission check failed (/vms/9001, VM.Audit)\n"}

# privsep fix:
$ pveum acl modify /vms/9001 -user 'felhom-ctl@pve' -role FelhomSelfBackup        -> ok
$ pveum acl modify /storage/local -user 'felhom-ctl@pve' -role FelhomSelfBackup   -> ok

# AFTER fix:
#1 GET /version -> HTTP 200
#2 GET /nodes/.../lxc/9001/status/current -> HTTP 200 {"data":{...,"status":"running",...}}
#5 GET /nodes/.../qemu/9000/status/current -> HTTP 403 (/vms/9000, VM.Audit)
#6 POST vzdump vmid=9000 -> HTTP 200 {"data":"UPID:...vzdump:9000:felhom-ctl@pve!ctl:"}
     root poll: exitstatus="403 Permission check failed (/vms/9000, VM.Backup)"
     task log: TASK ERROR: 403 Permission check failed (/vms/9000, VM.Backup)
     /var/lib/vz/dump: no 9000 archive created
#7 POST /nodes/.../lxc (create CT vmid=9009) -> HTTP 403 {"message":"Permission check failed\n"}
#3 POST lxc/9001/snapshot snapname=spk1 -> HTTP 200 UPID:...vzsnapshot:9001...
     root: exitstatus "OK" ; pct listsnapshot 9001 -> spk1 ; pct status 9001 -> running
#4 POST vzdump vmid=9001 storage=local mode=snapshot -> HTTP 200 UPID:...vzdump:9001...
     root: exitstatus "OK"
     token can read own task status: HTTP 200 {"...exitstatus":"OK"}
     # earlier poll TIMEOUTs were a shell-quoting bug in the helper, not a perms issue

# stop-mode self-backup (VM.PowerMgmt test):
$ token POST vzdump vmid=9001 storage=local mode=stop -> HTTP 200 UPID:...vzdump:9001...
     root poll: exitstatus "OK"   # SUCCEEDED without VM.PowerMgmt in the role
```

### 4.4 Phase 2 — backups

```
# Variant A (running):
$ vzdump 9001 --mode snapshot --storage local --compress zstd
  INFO: Total bytes written: 2585589760 (2.5GiB, 108MiB/s)
  INFO: archive file size: 934MB
  INFO: Finished Backup of VM 9001 (00:00:24) ; WALL_SECONDS=25
  -> vzdump-lxc-9001-2026_06_07-20_13_43.tar.zst (979718569 B)

# Variant B (stopped):
$ docker compose stop    (cache,db,web Stopped)
$ vzdump 9001 --mode snapshot --storage local --compress zstd
  INFO: Total bytes written: 2585825280 (2.5GiB, 119MiB/s)
  INFO: Finished Backup of VM 9001 (00:00:21) ; WALL_SECONDS=22
  -> vzdump-lxc-9001-2026_06_07-20_14_40.tar.zst (979671582 B)
$ docker compose start   (db,cache,web Started)
```

### 4.5 Phase 2 — restores + verification

```
# A -> 9002:
$ pct restore 9002 .../20_13_43.tar.zst --storage local-lvm
  Total bytes read: 2585589760 (2.5GiB, 524MiB/s) ; RESTORE_A_SECONDS=12
$ pct config 9002 -> features: nesting=1,keyctl=1 ; unprivileged: 1
$ pct set 9002 -net0 name=eth0,bridge=vmbr0,ip=dhcp   # fresh MAC BC:24:11:E3:F4:64
$ pct start 9002 ; docker compose up -d -> 3 running ; curl -> HTTP 200
$ psql SELECT * FROM restore_check -> 42 | phase2-sentinel
  db log: "was interrupted ... not properly shut down; automatic recovery in progress
           redo starts/redo done ... database system is ready to accept connections"

# B -> 9003:
$ pct restore 9003 .../20_14_40.tar.zst --storage local-lvm
  Total bytes read: 2585825280 (2.5GiB, 524MiB/s) ; RESTORE_B_SECONDS=11
$ pct config 9003 -> features: nesting=1,keyctl=1 ; unprivileged: 1
$ pct set 9003 -net0 ... (fresh MAC) ; pct start 9003 ; docker compose up -d -> 3 running ; curl 200
$ psql SELECT * FROM restore_check -> 42 | phase2-sentinel
  db log: "database system was shut down at ... ; database system is ready to accept connections"   # clean
```

---

## 5. Teardown (executed — see §6 for what was left)

Restore targets destroyed; Phase 1 objects and spike artifacts removed; `9000`/`9001` left **stopped-but-present**.

```bash
pct destroy 9002 --purge ; pct destroy 9003 --purge
pveum acl delete /vms/9001 -user 'felhom-ctl@pve' ; pveum acl delete /vms/9001 -token 'felhom-ctl@pve!ctl'
pveum acl delete /storage/local -user 'felhom-ctl@pve' ; pveum acl delete /storage/local -token 'felhom-ctl@pve!ctl'
pveum user token remove felhom-ctl@pve ctl ; pveum user delete felhom-ctl@pve ; pveum role delete FelhomSelfBackup
pct delsnapshot 9001 spk1
rm -f /var/lib/vz/dump/vzdump-lxc-9001-*.tar.zst /var/lib/vz/dump/vzdump-lxc-9001-*.log
pct stop 9001      # back to stopped-but-present
```

## 6. To destroy 9000/9001 later (NOT run — left stopped-but-present)

```bash
qm destroy 9000 --purge      # VM (Phase 0 subject)
pct destroy 9001 --purge     # LXC (Phase 0/1/2 subject)
# Debian 13 CT template left in place: local:vztmpl/debian-13-standard_13.1-2_amd64.tar.zst
```