Phase 3 — vzdump exclusion (B2) & agent operator role + root boundary (B3): Findings

Host: demo-felhom (192.168.0.162), Proxmox VE 9.2.2; node name confirmed via pvesh get /nodes → demo-felhom. Date: 2026-06-08. Throwaway resources (VMIDs 9010-9023, role/user FelhomAgent/felhom-agent@pve); all torn down (only the pre-existing 9000/9001 remain, stopped). Every Proxmox op was polled to its task exitstatus (not the POST return).
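Since every result below depends on the task exit status rather than the POST return, a minimal shell sketch of such a polling loop may help. It assumes a caller-supplied fetch command (hypothetical, for example a wrapper around GET /nodes/<node>/tasks/<upid>/status) that prints the task's status and exitstatus on one line. The function names are illustrative and not taken from the actual test harness.

```shell
# Map a task's (status, exitstatus) pair to a verdict.
# A task is "running" or "stopped"; a clean finish carries exitstatus "OK".
# Anything else (including warnings) is treated as a failure in this sketch.
task_verdict() {
  case "$1" in
    running) echo pending ;;
    stopped) if [ "$2" = "OK" ]; then echo ok; else echo failed; fi ;;
    *)       echo unknown ;;
  esac
}

# Poll until the task stops.
# $1 = hypothetical fetch command printing "<status> <exitstatus>" for the UPID in $2
# $3 = maximum number of tries (default 60), $4 = delay in seconds (default 2)
wait_task() {
  wt_fetch=$1; wt_upid=$2; wt_tries=${3:-60}; wt_delay=${4:-2}
  while [ "$wt_tries" -gt 0 ]; do
    wt_out=$("$wt_fetch" "$wt_upid")
    wt_status=${wt_out%% *}
    case "$wt_out" in
      *" "*) wt_exit=${wt_out#* } ;;
      *)     wt_exit= ;;
    esac
    case "$(task_verdict "$wt_status" "$wt_exit")" in
      ok)      return 0 ;;
      pending) ;;
      *)       echo "task $wt_upid ended: $wt_status $wt_exit" >&2; return 1 ;;
    esac
    wt_tries=$((wt_tries - 1))
    sleep "$wt_delay"
  done
  echo "task $wt_upid still running, giving up" >&2
  return 2
}
```

The return codes separate "task failed" (1) from "gave up waiting" (2), so a caller can retry the latter without treating it as a test failure.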

Validates the two items the design review (_design-review.md) flagged as unvalidated: B2 (what vzdump includes/excludes per LXC mount type + how to keep bulk out) and B3 (the least-privilege operator role + the root-vs-API boundary). Data only.


B2 — vzdump inclusion/exclusion matrix

Setup: one unprivileged LXC 9010 (nesting=1,keyctl=1, overlayfs), Docker 29.5.3 installed, with six sentinel files placed as follows:

| # | location | config |
|---|----------|--------|
| 1 | rootfs file /SENTINEL_ROOTFS | rootfs (local-lvm:8) |
| 2 | Docker named volume b2vol SENTINEL_DOCKERVOL | default driver |
| 3 | mp1 volume mount /mnt/mp1 SENTINEL_MP1 | local-lvm:1,backup=1 |
| 4 | mp2 volume mount /mnt/mp2 SENTINEL_MP2 | local-lvm:1,backup=0 |
| 5 | mp3 bind mount /mnt/mp3 SENTINEL_MP3 | host /root/b2-bindsrc |
| 6 | bulk Docker vol bulkvol bound onto mp2 → SENTINEL_BULK | --driver local -o type=none -o o=bind -o device=/mnt/mp2 |

The "trap" confirmed at setup: the Docker named volume's on-disk path is /var/lib/docker/volumes/b2vol/_data, inside the LXC rootfs.

Result matrix (stop-mode vzdump → local, verified 3 ways: vzdump log, archive grep, restore to 9011)

| Sentinel | location | flag | in archive? | restored 9011 |
|----------|----------|------|-------------|---------------|
| SENTINEL_ROOTFS | rootfs | | INCLUDED | present |
| SENTINEL_DOCKERVOL | Docker named vol (in rootfs) | | INCLUDED ⚠️ the trap | present |
| SENTINEL_MP1 | volume mp | backup=1 | INCLUDED | present |
| SENTINEL_MP2 | volume mp | backup=0 | EXCLUDED | absent (vol recreated empty) |
| SENTINEL_MP3 | bind mount | n/a | EXCLUDED | reappears via re-bind only¹ |
| SENTINEL_BULK | Docker vol on mp2 | backup=0 | EXCLUDED | absent |

¹ The bind-mount data is not in the archive (archive grep shows no mp3 path). It reappears in the restored 9011 only because pct restore preserves the bind config mp3: /root/b2-bindsrc and re-attaches the same host dir. On a different host (true DR) the bind data would be gone unless backed up separately — important for DR planning.

vzdump log (verbatim) — the authoritative per-mount decision:

INFO: including mount point rootfs ('/') in backup
INFO: including mount point mp1 ('/mnt/mp1') in backup
INFO: excluding volume mount point mp2 ('/mnt/mp2') from backup (disabled)
INFO: excluding bind mount point mp3 ('/mnt/mp3') from backup (not a volume)
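Because these log lines are the authoritative record, it can be handy to turn them into a machine-checkable list. The following is a rough sketch that assumes the log wording matches the four lines above exactly; the function name is hypothetical.

```shell
# Read a vzdump log on stdin and print "<mount-id> <guest-path> <included|excluded>"
# for every per-mount decision line. Lines that do not match are ignored.
vzdump_mount_decisions() {
  sed -n \
    -e "s/^.*INFO: including mount point \([^ ]*\) ('\([^']*\)') in backup\$/\1 \2 included/p" \
    -e "s/^.*INFO: excluding [a-z]* mount point \([^ ]*\) ('\([^']*\)') from backup.*\$/\1 \2 excluded/p"
}
```

Fed the log above, this should yield one line per mount point (rootfs and mp1 included, mp2 and mp3 excluded), which a test can compare against the intended placement.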

Archive contents (verbatim) — tar --zstd -tf … | grep SENTINEL:

./var/lib/docker/volumes/b2vol/_data/SENTINEL_DOCKERVOL
./SENTINEL_ROOTFS
./mnt/mp1/SENTINEL_MP1

Restore verification (verbatim) — sentinels in restored 9011:

PRESENT : /SENTINEL_ROOTFS
PRESENT : /var/lib/docker/volumes/b2vol/_data/SENTINEL_DOCKERVOL
PRESENT : /mnt/mp1/SENTINEL_MP1
ABSENT  : /mnt/mp2/SENTINEL_MP2
ABSENT  : /mnt/mp2/SENTINEL_BULK
PRESENT : /mnt/mp3/SENTINEL_MP3   # via re-bind to same host dir, NOT from archive
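A check that produces output in the same PRESENT/ABSENT shape could look roughly like this. It is a sketch, not the script used in the test: the function name is hypothetical, and it assumes the restored guest's filesystem is visible under some root directory (pass / when running inside the guest).

```shell
# Print "PRESENT : <path>" or "ABSENT  : <path>" for each sentinel path,
# resolved under the root directory given as the first argument.
check_sentinels() {
  cs_root=${1%/}; shift
  for cs_path in "$@"; do
    if [ -e "$cs_root$cs_path" ]; then
      echo "PRESENT : $cs_path"
    else
      echo "ABSENT  : $cs_path"
    fi
  done
}
```

Note that a file-existence check alone cannot tell archive-restored data from re-bound host data; the mp3 case still needs the archive listing to settle it.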

Proven bulk-exclusion recipe

A "bulk" Docker volume is kept out of the guest vzdump by binding it onto a volume mountpoint with backup=0:

  1. Attach a Proxmox volume mountpoint with the flag: pct set <id> -mpN <storage>:<size>,mp=/mnt/bulk,backup=0
  2. Realize the Docker volume on that path: docker volume create --driver local -o type=none -o o=bind -o device=/mnt/bulk bulkvol (or a compose bind to /mnt/bulk).
  3. Data written through bulkvol lands on the backup=0 mountpoint → excluded from vzdump, while rootfs/hot sentinels are included. Verified: SENTINEL_BULK absent from archive and restore; SENTINEL_ROOTFS present.
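The recipe can be sketched as a small helper that only prints the two commands, so they can be reviewed before anything is changed. The helper name and argument order are assumptions; the command forms are the ones from steps 1 and 2.

```shell
# Print (do not run) the two commands of the bulk-exclusion recipe.
# Arguments: <vmid> <mp-slot> <storage> <size> <guest-path> <docker-volume-name>
# The first command is meant for the Proxmox host, the second for inside the guest.
bulk_volume_plan() {
  echo "pct set $1 -mp$2 $3:$4,mp=$5,backup=0"
  echo "docker volume create --driver local -o type=none -o o=bind -o device=$5 $6"
}
```

For example, `bulk_volume_plan 9010 2 local-lvm 1 /mnt/mp2 bulkvol` prints the pair of commands used for the mp2 sentinel in this test.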

The trap, stated for the placement component

backup=<boolean> is only honoured for volume mount points (confirmed: pct manpage + vzdump log "excluding volume mount point … (disabled)"). A Docker named volume uses the default driver and lands in the rootfs, which is always backed up — so a "bulk" volume left as an ordinary named volume is silently swept into the whole-guest image. The per-volume placement component must realize every bulk volume as a dedicated backup=0 mountpoint (or external bind mount), never a default named volume.
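One way the placement component might guard against the trap is to verify that a bulk volume's bind device really sits on a backup=0 mount point. The sketch below assumes the usual `pct config <id>` line layout (mpN: <volume>,mp=<path>,backup=0,...); both function names are hypothetical, and the external-bind-mount alternative is not covered.

```shell
# Read `pct config <id>` output on stdin; print the guest paths of
# volume mount points that carry backup=0.
unbacked_mount_paths() {
  grep -E '^mp[0-9]+:' | grep -E '(,|: )backup=0(,|$)' \
    | sed -n 's/.*[ ,]mp=\([^,]*\).*/\1/p'
}

# Succeed if device path $1 equals, or lies under, one of the paths on stdin.
device_is_excluded() {
  de_dev=$1
  while IFS= read -r de_mp; do
    [ -n "$de_mp" ] || continue
    case "$de_dev" in
      "$de_mp"|"$de_mp"/*) return 0 ;;
    esac
  done
  return 1
}
```

A default named volume has no bind device at all, so its data path under /var/lib/docker/volumes fails this check, which is the intended outcome.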


B3 — agent operator role + root-vs-API boundary

Caveat applied (Phase 1): privsep token needs the role on both user and token. Setup: user felhom-agent@pve + privsep token agent, role FelhomAgent, dual-granted at /. All ops driven as the token via the REST API; task exitstatus polled.
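For orientation, a token-driven request might be issued roughly as follows. This is a sketch under assumptions: PVE_HOST and PVE_TOKEN_SECRET are hypothetical environment variables, the helper names are illustrative, and TLS trust for the host certificate is left out.

```shell
# Build the API token header: "Authorization: PVEAPIToken=<user>!<tokenid>=<secret>".
pve_auth_header() {
  printf 'Authorization: PVEAPIToken=%s!%s=%s\n' "$1" "$2" "$3"
}

# GET an API path (e.g. /nodes/demo-felhom/status) as the felhom-agent@pve!agent token.
# PVE_HOST and PVE_TOKEN_SECRET are assumed to be set by the caller.
pve_get() {
  curl -fsS \
    -H "$(pve_auth_header 'felhom-agent@pve' agent "$PVE_TOKEN_SECRET")" \
    "https://$PVE_HOST:8006/api2/json$1"
}
```

With -f, curl exits non-zero on a 403, which makes the permission checks in the matrix below easy to script.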

⚠️ Terminology: the Phase-1 FelhomSelfBackup role is the discarded guest-side self-backup role (scoped to one guest, denied create/allocate). FelhomAgent here is its operator-tier replacement — a different, broader role. Do not conflate.

Op matrix (as the scoped token)

| # | Operation | API call | Result |
|---|-----------|----------|--------|
| | read host status | `GET /nodes/$N/status` | 200 (needs Sys.Audit) |
| | read storage list | `GET /storage` | 200 (Datastore.Audit) |
| 1 | create LXC, nesting=1,keyctl=1 | `POST /nodes/$N/lxc` | 403: changing feature flags (except nesting) is only allowed for root@pam |
| 1′ | create LXC, nesting-only | `POST /nodes/$N/lxc` | 200 / OK |
| 2 | set config (mem/cpu/options + mountpoint w/ backup flag) | `PUT /nodes/$N/lxc/<id>/config` | 200 |
| 3 | allocate volume | `POST /nodes/$N/storage/local-lvm/content` | 200 (Datastore.AllocateSpace) |
| 4 | start | `POST …/status/start` | OK (VM.PowerMgmt) |
| 5 | stop | `POST …/status/stop` | OK |
| 6a | snapshot | `POST …/snapshot` | OK (VM.Snapshot) |
| 6b | rollback | `POST …/snapshot/s1/rollback` | OK (VM.Snapshot.Rollback) |
| 7 | stop-mode backup | `POST /nodes/$N/vzdump` mode=stop | OK (VM.Backup) |
| 8 | restore → fresh vmid | `POST /nodes/$N/lxc` restore=1 | OK; restored CT kept features: nesting=1,keyctl=1 |
| 9 | destroy CT | `DELETE /nodes/$N/lxc/<id>?purge=1` | OK (VM.Allocate) |
| 9b | add storage definition (dir) | `POST /storage` | 200 (Datastore.Allocate, no root) |

The two headline results:

  1. keyctl=1 on create is root@pam-only. Verbatim: Permission check failed (changing feature flags (except nesting) is only allowed for root@pam). Confirmed this is not token-fixable: a non-privsep root@pam token got the same 403. Only an actual root@pam session (OS root / pct create as root) can set it. nesting alone is allowed for a scoped token.
  2. Restore preserves keyctl. A token-authorized vzrestore of a keyctl archive produced 9021 with features: nesting=1,keyctl=1, unprivileged: 1. So the DR/restore path is fully token-covered; only fresh provisioning needs root for the keyctl flag.

Paring (each drop shown to still pass, or proven needed)

| Privilege | Verdict | Evidence |
|-----------|---------|----------|
| Datastore.AllocateTemplate | DROP (unnecessary) | create-from-template succeeded without it (200/OK) |
| Sys.Audit | KEEP | GET /nodes/$N/status → 403 without it (host metrics, 03 §5) |
| VM.Config.Network | KEEP | create with net0 → 403 (/vms/…, VM.Config.Network) without it |
| VM.Config.Options | KEEP | config onboot=1 → 403 (/vms/…, VM.Config.Options) without it |
| SDN.Use | KEEP (added vs review sketch) | create → 403 (/sdn/zones/localnetwork/vmbr0, SDN.Use) without it |

Corrections to the review's candidate sketch: VM.Config.CPUMemory is not a real privilege — split into VM.Config.CPU + VM.Config.Memory. SDN.Use was missing and is required (PVE 9 gates bridge use behind it). Datastore.AllocateTemplate is not needed.

Final minimal FelhomAgent role (proven sufficient for ops 1′–9b)

VM.Allocate  VM.Audit  VM.Config.Disk  VM.Config.CPU  VM.Config.Memory
VM.Config.Network  VM.Config.Options  VM.PowerMgmt  VM.Snapshot  VM.Snapshot.Rollback
VM.Backup  Datastore.Allocate  Datastore.AllocateSpace  Datastore.Audit  Sys.Audit  SDN.Use

(16 privileges. Datastore.Allocate is for the storage-definition add; drop it if the agent never creates Proxmox storage entries via the API. VM.PowerMgmt is for start/stop lifecycle — not for the backup itself, consistent with proxmox-platform.md §3.4.)
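As a cross-check on the list above, the role can be kept as a single variable from which both the count and the creation command are derived. This is a sketch only; the variable and function names are hypothetical, and the pveum form mirrors the one in the appendix.

```shell
# The final minimal privilege set, copied from the list above.
FELHOM_AGENT_PRIVS="VM.Allocate VM.Audit VM.Config.Disk VM.Config.CPU VM.Config.Memory \
VM.Config.Network VM.Config.Options VM.PowerMgmt VM.Snapshot VM.Snapshot.Rollback \
VM.Backup Datastore.Allocate Datastore.AllocateSpace Datastore.Audit Sys.Audit SDN.Use"

# Count the whitespace-separated privileges in $1.
count_privs() { set -- $1; echo "$#"; }

# Print (do not run) the role-creation command for review.
role_add_cmd() {
  echo "pveum role add FelhomAgent -privs \"$FELHOM_AGENT_PRIVS\""
}

count_privs "$FELHOM_AGENT_PRIVS"   # prints 16
```

Deriving the command from one variable avoids the drift seen in the review sketch, where a non-existent privilege name and a missing SDN.Use went unnoticed until the API rejected the calls.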

Root-vs-API boundary table (answers 03 §3)

| Agent host operation | Coverage | Notes |
|----------------------|----------|-------|
| Create unprivileged LXC, nesting-only | API token | VM.Allocate+VM.Config.*+Datastore.AllocateSpace+SDN.Use |
| Create with keyctl=1 (Docker needs it, per Phase 0) | OS root root@pam (pct create as root / sudoers) | no API token works, incl. a root@pam token |
| Set config (mem/cpu/net/options/mountpoint + backup flag) | API token | |
| Allocate guest volume | API token | Datastore.AllocateSpace |
| Start / stop / snapshot / rollback | API token | VM.PowerMgmt / VM.Snapshot(.Rollback) |
| vzdump backup (stop/snapshot mode) | API token | VM.Backup |
| Restore from vzdump (preserves keyctl) | API token | DR path needs no root |
| Destroy guest (scratch + compensating rollback, B1) | API token | VM.Allocate |
| Add Proxmox storage definition (dir/nfs/cifs/pbs) | API token | Datastore.Allocate; the definition only |
| Host status / metrics report | API token | Sys.Audit |
| USB physical mount-by-UUID / systemd mount unit / fstab | OS root / narrow sudoers | not a Proxmox API op (host-level mount; not tested here) |
| SMART / hardware sensors | OS root | not API-exposed |

Boundary summary: nearly the entire guest lifecycle — including restore — is covered by the scoped token. The genuine OS-root residual is narrow: (1) fresh creation of a Docker-capable LXC (the keyctl flag), (2) physical USB mount-by-UUID / systemd mount units / fstab, (3) hardware/SMART. This supports 03 §3's "non-root service + scoped token + narrow sudoers" model — with the specific sudoers/root entries being: pct create (or just the keyctl-setting step) and the host mount operations.


Raw command log (appendix)

B2

pct create 9010 ... --features nesting=1,keyctl=1 --unprivileged 1   # rootfs local-lvm:8
pct set 9010 -mp1 local-lvm:1,mp=/mnt/mp1,backup=1
pct set 9010 -mp2 local-lvm:1,mp=/mnt/mp2,backup=0
pct set 9010 -mp3 /root/b2-bindsrc,mp=/mnt/mp3
# docker named vol: docker volume inspect b2vol -> /var/lib/docker/volumes/b2vol/_data
# bulk: docker volume create --driver local -o type=none -o o=bind -o device=/mnt/mp2 bulkvol
vzdump 9010 --mode stop --storage local --compress zstd
#   INFO: including mount point rootfs ('/') in backup
#   INFO: including mount point mp1 ('/mnt/mp1') in backup
#   INFO: excluding volume mount point mp2 ('/mnt/mp2') from backup (disabled)
#   INFO: excluding bind mount point mp3 ('/mnt/mp3') from backup (not a volume)
tar --zstd -tf <archive> | grep SENTINEL   # -> rootfs, dockervol, mp1 only
pct restore 9011 <archive> --storage local-lvm   # -> mp2/bulk absent, mp3 via re-bind

B3

pveum role add FelhomAgent -privs "VM.Allocate VM.Audit VM.Config.Disk VM.Config.CPU VM.Config.Memory VM.Config.Network VM.Config.Options VM.PowerMgmt VM.Snapshot VM.Snapshot.Rollback VM.Backup Datastore.Allocate Datastore.AllocateSpace Datastore.AllocateTemplate Datastore.Audit Sys.Audit"   # candidate (pre-SDN)
pveum user add felhom-agent@pve ; pveum user token add felhom-agent@pve agent --privsep 1
pveum acl modify / -user  'felhom-agent@pve'        -role FelhomAgent
pveum acl modify / -token 'felhom-agent@pve!agent'  -role FelhomAgent

# token create with keyctl:
POST /nodes/demo-felhom/lxc ... features=nesting=1,keyctl=1
  -> 403 "changing feature flags (except nesting) is only allowed for root@pam"
# + SDN.Use missing initially:
  -> 403 "Permission check failed (/sdn/zones/localnetwork/vmbr0, SDN.Use)"
# root@pam non-privsep token, keyctl create:
  -> 403 (same "only allowed for root@pam")   # tokens never qualify

# token nesting-only create / config(PUT) / start / stop / snapshot / rollback /
# vzdump(stop) / restore->9021 (kept keyctl) / destroy / POST /storage  -> all 200/OK

# paring:
GET /nodes/$N/status  without Sys.Audit            -> 403   (KEEP)
create net0           without VM.Config.Network     -> 403   (KEEP)
config onboot=1       without VM.Config.Options      -> 403   (KEEP)
create from template  without Datastore.AllocateTemplate -> OK (DROP)

Teardown

for id in 9010 9011 9021; do pct destroy "$id" --purge; done   # pct destroy takes one VMID per call; 9020/9022/9023 already destroyed during tests
pveum user token remove felhom-agent@pve agent ; pveum user delete felhom-agent@pve
pveum role delete FelhomAgent        # ACLs at / auto-invalidated
rm -f /var/lib/vz/dump/vzdump-lxc-9010-* /var/lib/vz/dump/vzdump-lxc-9020-*
# verified: only 9000/9001 remain (stopped-but-present); no felhom-agent user/role; dump dir empty