# Phase 3 — vzdump exclusion (B2) & agent operator role + root boundary (B3): Findings
**Host:** `demo-felhom` (192.168.0.162), Proxmox VE 9.2.2; node name confirmed via
`pvesh get /nodes` → `demo-felhom`. **Date:** 2026-06-08. Throwaway resources (VMIDs
9010-9023, role/user `FelhomAgent`/`felhom-agent@pve`) were all torn down; only the
pre-existing 9000/9001 remain, stopped. Every Proxmox op was polled to its task
`exitstatus` rather than trusting the POST return.
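
Polling to the task `exitstatus` can be sketched as a small verdict helper. This is a minimal sketch, assuming the task status record carries a `status` of `running` or `stopped` and, once stopped, an `exitstatus` of `OK` on success. The function name `task_verdict` is hypothetical, and the HTTP call that fetches the record is deliberately left out.

```shell
# Hypothetical helper: map a task's (status, exitstatus) pair to a verdict.
# Assumption: status is "running" or "stopped"; exitstatus is "OK" on success.
task_verdict() {
    status=$1
    exitstatus=$2
    case "$status" in
        running) echo "pending" ;;          # task not finished, keep polling
        stopped)
            if [ "$exitstatus" = "OK" ]; then
                echo "ok"
            else
                echo "failed"               # any other exitstatus is treated as failure
            fi
            ;;
        *) echo "unknown" ;;
    esac
}

task_verdict running ""                          # pending
task_verdict stopped OK                          # ok
task_verdict stopped "command 'vzdump' failed"   # failed
```

The point of the helper is that an HTTP 200 on the POST only means the task was queued; the verdict comes from the task record.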
> Validates the two items the design review (`_design-review.md`) flagged as unvalidated:
> **B2** (what vzdump includes/excludes per LXC mount type + how to keep bulk out) and **B3**
> (the least-privilege operator role + the root-vs-API boundary). Data only.
---
## B2 — vzdump inclusion/exclusion matrix
**Setup:** one unprivileged LXC `9010` (`nesting=1,keyctl=1`, overlayfs), Docker 29.5.3
installed, with six sentinel locations:
| # | location | config |
|---|---|---|
| 1 | rootfs file `/SENTINEL_ROOTFS` | rootfs (`local-lvm:8`) |
| 2 | Docker **named** volume `b2vol` → `SENTINEL_DOCKERVOL` | default driver |
| 3 | `mp1` volume mount `/mnt/mp1` → `SENTINEL_MP1` | `local-lvm:1,backup=1` |
| 4 | `mp2` volume mount `/mnt/mp2` → `SENTINEL_MP2` | `local-lvm:1,backup=0` |
| 5 | `mp3` **bind** mount `/mnt/mp3` → `SENTINEL_MP3` | host `/root/b2-bindsrc` |
| 6 | bulk Docker vol `bulkvol` bound onto mp2 → `SENTINEL_BULK` | `--driver local -o type=none -o o=bind -o device=/mnt/mp2` |
**The "trap" confirmed at setup:** the Docker named volume's on-disk path is
`/var/lib/docker/volumes/b2vol/_data`, which is **inside the LXC rootfs**.
### Result matrix (stop-mode vzdump → `local`, verified 3 ways: vzdump log, archive grep, restore to 9011)
| Sentinel | location | flag | **in archive?** | restored 9011 |
|---|---|---|---|---|
| `SENTINEL_ROOTFS` | rootfs | — | **INCLUDED** | present |
| `SENTINEL_DOCKERVOL` | Docker named vol (in rootfs) | — | **INCLUDED** ⚠️ the trap | present |
| `SENTINEL_MP1` | volume mp | `backup=1` | **INCLUDED** | present |
| `SENTINEL_MP2` | volume mp | `backup=0` | **EXCLUDED** | absent (vol recreated empty) |
| `SENTINEL_MP3` | bind mount | n/a | **EXCLUDED** | reappears via re-bind only¹ |
| `SENTINEL_BULK` | Docker vol on mp2 | `backup=0` | **EXCLUDED** | absent |
¹ The bind-mount **data is not in the archive** (archive grep shows no mp3 path). It
reappears in the restored 9011 only because `pct restore` preserves the bind config
`mp3: /root/b2-bindsrc` and re-attaches the **same host dir**. On a *different* host (true DR)
the bind data would be gone unless backed up separately — important for DR planning.
**vzdump log (verbatim) — the authoritative per-mount decision:**
```
INFO: including mount point rootfs ('/') in backup
INFO: including mount point mp1 ('/mnt/mp1') in backup
INFO: excluding volume mount point mp2 ('/mnt/mp2') from backup (disabled)
INFO: excluding bind mount point mp3 ('/mnt/mp3') from backup (not a volume)
```
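
The per-mount decisions in this log can be extracted mechanically. The following is a minimal sketch, assuming the `INFO:` lines keep exactly the wording shown above; `classify_vzdump_log` is a hypothetical name, not part of vzdump.

```shell
# Hypothetical helper: reduce vzdump INFO lines on stdin to "<mount> <decision>" pairs.
# Assumption: the log wording matches the verbatim lines quoted in this document.
classify_vzdump_log() {
    sed -n \
        -e 's/^INFO: including mount point \([^ ]*\) .* in backup$/\1 included/p' \
        -e 's/^INFO: excluding .*mount point \([^ ]*\) .* from backup .*$/\1 excluded/p'
}

classify_vzdump_log <<'EOF'
INFO: including mount point rootfs ('/') in backup
INFO: including mount point mp1 ('/mnt/mp1') in backup
INFO: excluding volume mount point mp2 ('/mnt/mp2') from backup (disabled)
INFO: excluding bind mount point mp3 ('/mnt/mp3') from backup (not a volume)
EOF
```

Lines that match neither pattern are dropped, so the helper can be fed a full task log.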
**Archive contents (verbatim) — `tar --zstd -tf … | grep SENTINEL`:**
```
./var/lib/docker/volumes/b2vol/_data/SENTINEL_DOCKERVOL
./SENTINEL_ROOTFS
./mnt/mp1/SENTINEL_MP1
```
**Restore verification (verbatim) — sentinels in restored 9011:**
```
PRESENT : /SENTINEL_ROOTFS
PRESENT : /var/lib/docker/volumes/b2vol/_data/SENTINEL_DOCKERVOL
PRESENT : /mnt/mp1/SENTINEL_MP1
ABSENT : /mnt/mp2/SENTINEL_MP2
ABSENT : /mnt/mp2/SENTINEL_BULK
PRESENT : /mnt/mp3/SENTINEL_MP3 # via re-bind to same host dir, NOT from archive
```
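
A listing in this shape could be produced by a check like the one below. It is only a sketch: `check_sentinels` is a hypothetical helper, and the root directory argument stands for wherever the restored guest's filesystem is reachable (an assumption, demonstrated here against a temporary directory).

```shell
# Hypothetical helper: report PRESENT/ABSENT for each sentinel path under a root dir.
check_sentinels() {
    root=$1
    shift
    for path in "$@"; do
        if [ -e "$root$path" ]; then
            printf 'PRESENT : %s\n' "$path"
        else
            printf 'ABSENT : %s\n' "$path"
        fi
    done
}

# Demo against a throwaway directory standing in for the restored guest.
demo=$(mktemp -d)
: > "$demo/SENTINEL_ROOTFS"
check_sentinels "$demo" /SENTINEL_ROOTFS /mnt/mp2/SENTINEL_MP2
rm -rf "$demo"
```

Note that a check like this cannot tell archive-restored data from re-bound data, which is why the `mp3` line above carries an explicit annotation.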
### Proven bulk-exclusion recipe
A "bulk" Docker volume is kept out of the guest vzdump by binding it onto a **volume
mountpoint with `backup=0`**:
1. Attach a Proxmox volume mountpoint with the flag:
`pct set <id> -mpN <storage>:<size>,mp=/mnt/bulk,backup=0`
2. Realize the Docker volume on that path:
`docker volume create --driver local -o type=none -o o=bind -o device=/mnt/bulk bulkvol`
(or a compose bind to `/mnt/bulk`).
3. Data written through `bulkvol` lands on the `backup=0` mountpoint → **excluded** from
vzdump, while rootfs/hot sentinels are **included**. Verified: `SENTINEL_BULK` absent from
archive and restore; `SENTINEL_ROOTFS` present.
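
The recipe can be captured as a dry-run helper that prints the two commands rather than running them. This is a sketch under stated assumptions: `bulk_volume_commands` and its argument order are hypothetical, while the flags are the ones used in the steps above.

```shell
# Hypothetical dry-run helper: print the two recipe commands instead of executing them.
# Args: <ctid> <mp slot> <storage> <size> <guest path> <docker volume name>
bulk_volume_commands() {
    ctid=$1 slot=$2 storage=$3 size=$4 path=$5 volume=$6
    # Step 1 (on the Proxmox host): volume mountpoint with backup=0.
    printf 'pct set %s -mp%s %s:%s,mp=%s,backup=0\n' \
        "$ctid" "$slot" "$storage" "$size" "$path"
    # Step 2 (inside the guest): Docker volume bound onto that path.
    printf 'docker volume create --driver local -o type=none -o o=bind -o device=%s %s\n' \
        "$path" "$volume"
}

bulk_volume_commands 9010 2 local-lvm 1 /mnt/bulk bulkvol
```

Printing instead of executing keeps the host-side and guest-side steps visibly separate, since they run in different places.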
### The trap, stated for the placement component
`backup=<boolean>` is **only honoured for volume mount points** (confirmed: pct manpage +
vzdump log "excluding volume mount point … (disabled)"). A Docker **named volume uses the
default driver and lands in the rootfs**, which is **always backed up** — so a "bulk" volume
left as an ordinary named volume is **silently swept into the whole-guest image**. The
per-volume placement component **must** realize every `bulk` volume as a dedicated `backup=0`
mountpoint (or external bind mount), never a default named volume.
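
A placement component could guard against the trap by checking where a volume's data path actually lands. The sketch below assumes `mpN:` config lines in the comma-separated form shown in the sample; the helper names and the sample volume IDs (`vm-9010-disk-1`, `vm-9010-disk-2`) are illustrative, not taken from the test host.

```shell
# Hypothetical helper: print the guest paths of mountpoints flagged backup=0.
# Reads "mpN: <volume>,mp=<path>,backup=<0|1>,..." lines on stdin (assumed format).
excluded_mountpoints() {
    awk '/^mp[0-9]+:/ {
        n = split($2, opts, ","); path = ""; excluded = 0
        for (i = 1; i <= n; i++) {
            if (opts[i] ~ /^mp=/) path = substr(opts[i], 4)
            if (opts[i] == "backup=0") excluded = 1
        }
        if (excluded && path != "") print path
    }'
}

# Hypothetical helper: succeed if $1 is, or lies under, a backup=0 mountpoint.
is_excluded_path() {
    target=$1
    excluded_mountpoints | while read -r mp; do
        case "$target" in
            "$mp"|"$mp"/*) echo yes ;;
        esac
    done | grep -q yes
}

config='mp1: local-lvm:vm-9010-disk-1,mp=/mnt/mp1,backup=1,size=1G
mp2: local-lvm:vm-9010-disk-2,mp=/mnt/mp2,backup=0,size=1G
mp3: /root/b2-bindsrc,mp=/mnt/mp3'

for dev in /mnt/mp2 /var/lib/docker/volumes/b2vol/_data; do
    if printf '%s\n' "$config" | is_excluded_path "$dev"; then
        echo "$dev: on a backup=0 mountpoint"
    else
        echo "$dev: NOT on a backup=0 mountpoint"
    fi
done
```

The default named-volume path fails the check because it sits in the rootfs, which matches the trap described above. Bind mounts are intentionally not treated as `backup=0` here; they are excluded by a different mechanism.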
---
## B3 — agent operator role + root-vs-API boundary
**Caveat applied (Phase 1):** privsep token needs the role on **both** user and token. Setup:
user `felhom-agent@pve` + privsep token `agent`, role `FelhomAgent`, dual-granted at `/`.
All ops driven **as the token** via the REST API; task `exitstatus` polled.
> ⚠️ **Terminology:** the Phase-1 `FelhomSelfBackup` role is the discarded **guest-side
> self-backup** role (scoped to one guest, *denied* create/allocate). `FelhomAgent` here is
> its **operator-tier replacement** — a different, broader role. Do not conflate.
### Op matrix (as the scoped token)
| # | Operation | API call | Result |
|---|---|---|---|
| read | host status | `GET /nodes/$N/status` | **200** (needs `Sys.Audit`) |
| read | storage list | `GET /storage` | **200** (`Datastore.Audit`) |
| 1 | **create LXC, `nesting=1,keyctl=1`** | `POST /nodes/$N/lxc` | **403**: `changing feature flags (except nesting) is only allowed for root@pam` |
| 1 | create LXC, **nesting-only** | `POST /nodes/$N/lxc` | **200 / OK** |
| 2 | set config (mem/cpu/options + mountpoint w/ `backup` flag) | `PUT /nodes/$N/lxc/<id>/config` | **200** |
| 3 | allocate volume | `POST /nodes/$N/storage/local-lvm/content` | **200** (`Datastore.AllocateSpace`) |
| 4 | start | `POST …/status/start` | **OK** (`VM.PowerMgmt`) |
| 5 | stop | `POST …/status/stop` | **OK** |
| 6a | snapshot | `POST …/snapshot` | **OK** (`VM.Snapshot`) |
| 6b | rollback | `POST …/snapshot/s1/rollback` | **OK** (`VM.Snapshot.Rollback`) |
| 7 | stop-mode backup | `POST /nodes/$N/vzdump mode=stop` | **OK** (`VM.Backup`) |
| 8 | restore → fresh vmid | `POST /nodes/$N/lxc restore=1` | **OK** — and **restored CT kept `features: nesting=1,keyctl=1`** |
| 9 | destroy CT | `DELETE /nodes/$N/lxc/<id>?purge=1` | **OK** (`VM.Allocate`) |
| 9b | add storage definition (dir) | `POST /storage` | **200** (`Datastore.Allocate`, **no root**) |
**The two headline results:**
1. **`keyctl=1` on create is `root@pam`-only.** Verbatim:
`Permission check failed (changing feature flags (except nesting) is only allowed for root@pam)`.
Confirmed this is **not** token-fixable: a **non-privsep `root@pam` token** got the **same
403**. Only an actual `root@pam` session (OS root / `pct create` as root) can set it.
`nesting` alone is allowed for a scoped token.
2. **Restore preserves `keyctl`.** A token-authorized `vzrestore` of a keyctl archive produced
`9021` with `features: nesting=1,keyctl=1` and `unprivileged: 1`. So the **DR/restore path is
fully token-covered**; only *fresh provisioning* needs root for the keyctl flag.
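
The first result suggests a simple pre-flight rule for the agent: any feature flag other than `nesting` on a fresh create routes to the root path. A minimal sketch of that rule follows; `needs_root_for_features` is a hypothetical helper, and it encodes only the behaviour observed in the 403 message above.

```shell
# Hypothetical helper: succeed (exit 0) if a features string needs root@pam on create.
# Rule taken from the observed 403: every flag except nesting is root-only.
needs_root_for_features() {
    old_ifs=$IFS
    IFS=,
    set -- $1            # split the comma-separated features string
    IFS=$old_ifs
    for flag in "$@"; do
        case "$flag" in
            nesting=*|"") ;;      # allowed for a scoped token
            *) return 0 ;;        # e.g. keyctl=1
        esac
    done
    return 1
}

for f in "nesting=1" "nesting=1,keyctl=1"; do
    if needs_root_for_features "$f"; then
        echo "$f: create as root"
    else
        echo "$f: create via API token"
    fi
done
```

The rule applies to creation only; the restore path kept `keyctl` under the token and needs no such check.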
### Paring (each drop shown to still pass, or proven needed)
| Privilege | Verdict | Evidence |
|---|---|---|
| `Datastore.AllocateTemplate` | **DROP** (unnecessary) | create-from-template succeeded without it (200/OK) |
| `Sys.Audit` | **KEEP** | `GET /nodes/$N/status` → **403** without it (host metrics, `03` §5) |
| `VM.Config.Network` | **KEEP** | create with `net0` → **403 (/vms/…, VM.Config.Network)** without it |
| `VM.Config.Options` | **KEEP** | config `onboot=1` → **403 (/vms/…, VM.Config.Options)** without it |
| `SDN.Use` | **KEEP (added vs review sketch)** | create → **403 (/sdn/zones/localnetwork/vmbr0, SDN.Use)** without it |
> Corrections to the review's candidate sketch: `VM.Config.CPUMemory` is **not a real
> privilege** — split into `VM.Config.CPU` + `VM.Config.Memory`. `SDN.Use` was **missing** and
> is **required** (PVE 9 gates bridge use behind it). `Datastore.AllocateTemplate` is **not
> needed**.
### Final minimal `FelhomAgent` role (proven sufficient for ops 1′–9b)
```
VM.Allocate VM.Audit VM.Config.Disk VM.Config.CPU VM.Config.Memory
VM.Config.Network VM.Config.Options VM.PowerMgmt VM.Snapshot VM.Snapshot.Rollback
VM.Backup Datastore.Allocate Datastore.AllocateSpace Datastore.Audit Sys.Audit SDN.Use
```
(16 privileges. `Datastore.Allocate` is for the storage-definition add; drop it if the agent
never creates Proxmox storage entries via the API. `VM.PowerMgmt` is for start/stop lifecycle
— not for the backup itself, consistent with `proxmox-platform.md` §3.4.)
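
The final set can be sanity-checked and turned into a role definition with a few lines of shell. This is a sketch, assuming the same `pveum role add … -privs` form used for the candidate role in the appendix; `count_privs` and `has_priv` are hypothetical helpers, and the command is printed, not run.

```shell
# Final privilege list, copied from the block above.
FINAL_PRIVS="VM.Allocate VM.Audit VM.Config.Disk VM.Config.CPU VM.Config.Memory \
VM.Config.Network VM.Config.Options VM.PowerMgmt VM.Snapshot VM.Snapshot.Rollback \
VM.Backup Datastore.Allocate Datastore.AllocateSpace Datastore.Audit Sys.Audit SDN.Use"

# Hypothetical helpers: count entries, and test exact membership.
count_privs() { set -- $1; echo $#; }
has_priv() { case " $1 " in *" $2 "*) return 0 ;; *) return 1 ;; esac; }

echo "privileges: $(count_privs "$FINAL_PRIVS")"
has_priv "$FINAL_PRIVS" SDN.Use && echo "SDN.Use present"
has_priv "$FINAL_PRIVS" Datastore.AllocateTemplate || echo "Datastore.AllocateTemplate dropped"

# Dry run: print the role definition instead of applying it.
printf 'pveum role add FelhomAgent -privs "%s"\n' "$FINAL_PRIVS"
```

Membership is matched on whole words, so `Datastore.Allocate` does not count as `Datastore.AllocateTemplate`.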
### Root-vs-API boundary table (answers `03` §3)
| Agent host operation | Coverage | Notes |
|---|---|---|
| Create unprivileged LXC, **nesting-only** | **API token** | `VM.Allocate`+`VM.Config.*`+`Datastore.AllocateSpace`+`SDN.Use` |
| **Create with `keyctl=1` (Docker needs it — Phase 0)** | **OS root `root@pam`** (`pct create` as root / sudoers) | no API token works, incl. a root@pam token |
| Set config (mem/cpu/net/options/mountpoint + `backup` flag) | API token | |
| Allocate guest volume | API token | `Datastore.AllocateSpace` |
| Start / stop / snapshot / rollback | API token | `VM.PowerMgmt` / `VM.Snapshot(.Rollback)` |
| vzdump backup (stop/snapshot mode) | API token | `VM.Backup` |
| **Restore from vzdump (preserves keyctl)** | **API token** | DR path needs no root |
| Destroy guest (scratch + compensating rollback, B1) | API token | `VM.Allocate` |
| Add Proxmox **storage definition** (dir/nfs/cifs/pbs) | API token | `Datastore.Allocate`; the *definition* only |
| Host status / metrics report | API token | `Sys.Audit` |
| **USB physical mount-by-UUID / systemd mount unit / fstab** | **OS root / narrow sudoers** | not a Proxmox API op (host-level mount; not tested here) |
| **SMART / hardware sensors** | OS root | not API-exposed |
**Boundary summary:** nearly the entire guest lifecycle — including **restore** — is covered
by the scoped token. The genuine OS-root residual is narrow: **(1) fresh creation of a
Docker-capable LXC (the `keyctl` flag), (2) physical USB mount-by-UUID / systemd mount units /
fstab, (3) hardware/SMART.** This supports `03` §3's "non-root service + scoped token + narrow
sudoers" model — with the **specific** sudoers/root entries being: `pct create` (or just the
keyctl-setting step) and the host mount operations.
---
## Raw command log (appendix)
### B2
```
pct create 9010 ... --features nesting=1,keyctl=1 --unprivileged 1 # rootfs local-lvm:8
pct set 9010 -mp1 local-lvm:1,mp=/mnt/mp1,backup=1
pct set 9010 -mp2 local-lvm:1,mp=/mnt/mp2,backup=0
pct set 9010 -mp3 /root/b2-bindsrc,mp=/mnt/mp3
# docker named vol: docker volume inspect b2vol -> /var/lib/docker/volumes/b2vol/_data
# bulk: docker volume create --driver local -o type=none -o o=bind -o device=/mnt/mp2 bulkvol
vzdump 9010 --mode stop --storage local --compress zstd
# INFO: including mount point rootfs ('/') in backup
# INFO: including mount point mp1 ('/mnt/mp1') in backup
# INFO: excluding volume mount point mp2 ('/mnt/mp2') from backup (disabled)
# INFO: excluding bind mount point mp3 ('/mnt/mp3') from backup (not a volume)
tar --zstd -tf <archive> | grep SENTINEL # -> rootfs, dockervol, mp1 only
pct restore 9011 <archive> --storage local-lvm # -> mp2/bulk absent, mp3 via re-bind
```
### B3
```
pveum role add FelhomAgent -privs "VM.Allocate VM.Audit VM.Config.Disk VM.Config.CPU VM.Config.Memory VM.Config.Network VM.Config.Options VM.PowerMgmt VM.Snapshot VM.Snapshot.Rollback VM.Backup Datastore.Allocate Datastore.AllocateSpace Datastore.AllocateTemplate Datastore.Audit Sys.Audit" # candidate (pre-SDN)
pveum user add felhom-agent@pve ; pveum user token add felhom-agent@pve agent --privsep 1
pveum acl modify / -user 'felhom-agent@pve' -role FelhomAgent
pveum acl modify / -token 'felhom-agent@pve!agent' -role FelhomAgent
# token create with keyctl:
POST /nodes/demo-felhom/lxc ... features=nesting=1,keyctl=1
-> 403 "changing feature flags (except nesting) is only allowed for root@pam"
# + SDN.Use missing initially:
-> 403 "Permission check failed (/sdn/zones/localnetwork/vmbr0, SDN.Use)"
# root@pam non-privsep token, keyctl create:
-> 403 (same "only allowed for root@pam") # tokens never qualify
# token nesting-only create / config(PUT) / start / stop / snapshot / rollback /
# vzdump(stop) / restore->9021 (kept keyctl) / destroy / POST /storage -> all 200/OK
# paring:
GET /nodes/$N/status without Sys.Audit -> 403 (KEEP)
create net0 without VM.Config.Network -> 403 (KEEP)
config onboot=1 without VM.Config.Options -> 403 (KEEP)
create from template without Datastore.AllocateTemplate -> OK (DROP)
```
### Teardown
```
for id in 9010 9011 9021; do pct destroy "$id" --purge; done # one VMID per call; 9020/9022/9023 already destroyed during tests
pveum user token remove felhom-agent@pve agent ; pveum user delete felhom-agent@pve
pveum role delete FelhomAgent # ACLs at / auto-invalidated
rm -f /var/lib/vz/dump/vzdump-lxc-9010-* /var/lib/vz/dump/vzdump-lxc-9020-*
# verified: only 9000/9001 remain (stopped-but-present); no felhom-agent user/role; dump dir empty
```