2026-06-08 08:21:07 +02:00
parent c8837d442e
commit 8ae6e8abf3
2 changed files with 295 additions and 0 deletions
@@ -207,6 +207,39 @@ Poll `GET /nodes/<node>/tasks/<upid>/status` until `status: stopped`, then read
(The task owner — including a token — can read its own task status: 200.)
### 3.6 Operator-tier agent role & root-vs-API boundary (validated)
The operator-tier **host agent** (`03-host-agent.md`) needs a far broader role than the
Phase-1 *guest self-backup* role (which is denied create/allocate — §3.4). The minimal role
that drives the full guest lifecycle via an API token, validated by paring the privilege
list down [[phase3 §B3](tests/phase3-findings.md)]:
> **`FelhomAgent` (operator-tier, 16 privileges):**
> `VM.Allocate, VM.Audit, VM.Config.Disk, VM.Config.CPU, VM.Config.Memory, VM.Config.Network,
> VM.Config.Options, VM.PowerMgmt, VM.Snapshot, VM.Snapshot.Rollback, VM.Backup,
> Datastore.Allocate, Datastore.AllocateSpace, Datastore.Audit, Sys.Audit, SDN.Use`
>
> Paring proved: `SDN.Use` is **required** (PVE 9 gates bridge use; omitting it → `403
> (/sdn/zones/localnetwork/vmbr0, SDN.Use)`); `Sys.Audit` required for host metrics
> (`GET /nodes/<node>/status`); `VM.Config.Network`/`VM.Config.Options` required for NIC/onboot
> config; `Datastore.AllocateTemplate` **not** needed (drop it). NB `VM.Config.CPUMemory` is
> not a real privilege — it is `VM.Config.CPU` + `VM.Config.Memory`.
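
The privilege list above lends itself to a quick drift check against whatever a role actually grants. The following is a minimal sketch, not part of the validated findings: the names `FELHOM_AGENT_PRIVS`, `parse_privs`, and `diff_role` are hypothetical, and the input is assumed to be a comma-separated privilege string copied by hand from the role definition.

```python
# Privileges of the FelhomAgent role, copied from the list above.
FELHOM_AGENT_PRIVS = frozenset({
    "VM.Allocate", "VM.Audit", "VM.Config.Disk", "VM.Config.CPU",
    "VM.Config.Memory", "VM.Config.Network", "VM.Config.Options",
    "VM.PowerMgmt", "VM.Snapshot", "VM.Snapshot.Rollback", "VM.Backup",
    "Datastore.Allocate", "Datastore.AllocateSpace", "Datastore.Audit",
    "Sys.Audit", "SDN.Use",
})


def parse_privs(csv):
    """Split a comma-separated privilege string into a set of names."""
    return {p.strip() for p in csv.split(",") if p.strip()}


def diff_role(granted):
    """Return (missing, extra) privilege names relative to FelhomAgent.

    `missing` are privileges the role needs but does not have;
    `extra` are privileges it has beyond the pared list.
    Both lists are sorted for stable output.
    """
    granted = set(granted)
    missing = sorted(FELHOM_AGENT_PRIVS - granted)
    extra = sorted(granted - FELHOM_AGENT_PRIVS)
    return missing, extra
```

A role that still carries `Datastore.AllocateTemplate`, or that names the non-existent `VM.Config.CPUMemory`, would show up in `extra`; a role without `SDN.Use` would show it in `missing`.
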
**Root-vs-API boundary** [[phase3 §B3](tests/phase3-findings.md)] — nearly the entire guest
lifecycle, **including restore**, is API-token-covered; the genuine OS-root residual is narrow:
| Operation | Coverage |
|---|---|
| Create LXC (nesting-only), config, allocate, start/stop, snapshot/rollback, vzdump, **restore**, destroy, add storage definition, host metrics | **scoped API token** (the `FelhomAgent` role) |
| ⚠️ **Create LXC with `keyctl=1`** (Docker needs it — §2.3) | **OS root `root@pam` only** |
| USB physical mount-by-UUID / systemd mount unit / fstab; SMART/sensors | OS root / narrow sudoers |
> ⚠️ **`keyctl=1` (and any feature flag except `nesting`) can be set only by an actual
> `root@pam` session** — `changing feature flags (except nesting) is only allowed for
> root@pam`. **No API token qualifies**, not even a non-privsep `root@pam` token (same 403).
> So *fresh provisioning* of a Docker-capable LXC needs `pct create` as OS root (or a narrow
> sudoers entry). **Restore is exempt:** a token-authorized `vzrestore` **preserves
> `keyctl=1`** from the archive — the DR path needs no root.
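
An agent can apply this rule before it attempts a create call, routing the request to the root path instead of collecting a 403. The sketch below is illustrative only: `requires_root_session` is a hypothetical helper, and it makes the conservative assumption that any feature key other than `nesting` needs root, whatever its value (the findings above only cover setting such flags, not clearing them).

```python
def requires_root_session(features):
    """Return True if a `features` string names any flag other than nesting.

    `features` is the comma-separated value of the LXC `features` option,
    e.g. "nesting=1,keyctl=1". Assumption: the mere presence of a
    non-nesting key is treated as root-only, regardless of its value.
    """
    for item in features.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty strings and trailing commas
        name, _, _value = item.partition("=")
        if name.strip() != "nesting":
            return True
    return False
```

A `True` result means fresh provisioning has to go through `pct create` as OS root (or the narrow sudoers entry). Restore does not need this check, since the archive carries the flags.
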
---
## 4. Backup & restore (`vzdump` / `pct restore`)
@@ -267,6 +300,29 @@ snapshot-before-change rollback flow.
findings above are for `vzdump` to a `dir` storage. PBS (dedup, incremental, remote,
dirty-bitmap) is pending.
### 4.7 vzdump scope by LXC mount type (validated)
A stop-mode `vzdump` includes/excludes each LXC mount point by **type and the `backup` flag**
[[phase3 §B2](tests/phase3-findings.md)]. Validated three ways (vzdump log, archive grep,
restore):
| Location | `backup` flag | In the vzdump? |
|---|---|---|
| rootfs (and anything inside it) | — | **included** (always) |
| **Docker named volume** (default driver) | — | **included** — it lives in the rootfs (`/var/lib/docker/volumes/<v>/_data`) |
| volume mount point (`mpN`) | `backup=1` | included |
| volume mount point (`mpN`) | `backup=0` | **excluded** (vol recreated empty on restore) |
| bind mount point (`mpN: /host/path`) | n/a | **excluded** ("not a volume"); data is *not* in the archive |
> ⚠️ **The `backup=<boolean>` flag is honoured ONLY for *volume* mount points.** A **Docker
> named volume is in the rootfs and is always captured** — so a "bulk" volume left as a
> default named volume is silently swept into the whole-guest image. To keep bulk data **out**,
> realize it as a dedicated `backup=0` volume mount point (proven recipe:
> `pct set <id> -mpN <storage>:<size>,mp=/mnt/bulk,backup=0` then
> `docker volume create --driver local -o type=none -o o=bind -o device=/mnt/bulk bulkvol`).
> A **bind mount's** data is excluded from the archive entirely; on same-host restore it
> reappears only because the bind config re-attaches the same host dir — on a *different* host
> (true DR) it is gone unless backed up separately.
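
The table above can be expressed as a small predicate over a container's config entries, which is handy for auditing which mounts a backup will actually carry. This is a sketch under stated assumptions, not a validated tool: `mount_in_vzdump` is a hypothetical name, the entry format (`<source>,key=value,...`) is assumed to match `pct config` output, a source starting with `/` is treated as a non-volume (bind) mount, and a volume mount point with **no** `backup` key is assumed to be excluded, a case the validated table does not cover.

```python
# Spellings treated as boolean true (assumption; pct config normally shows 1/0).
_TRUE_VALUES = {"1", "yes", "true", "on"}


def mount_in_vzdump(key, value):
    """Predict whether a mount entry's data lands in the vzdump archive.

    key:   config key, e.g. "rootfs" or "mp0"
    value: config value, e.g. "local-lvm:vm-101-disk-1,mp=/mnt/bulk,backup=0"
    """
    if key == "rootfs":
        return True  # rootfs is always included

    parts = [p.strip() for p in value.split(",")]
    source = parts[0]
    if source.startswith("/"):
        return False  # host path: "not a volume", data is not archived

    # Remaining parts are key=value options; only `backup` matters here.
    opts = dict(p.split("=", 1) for p in parts[1:] if "=" in p)
    # Assumption: an unset backup flag on a volume mount point means excluded.
    return opts.get("backup", "0") in _TRUE_VALUES
```

Docker named volumes never appear as separate entries here: with the default driver they sit inside the rootfs, so they follow the `rootfs` branch. The storage and disk names used with this helper are placeholders.
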
---
## 5. Gotchas & operational notes (quick reference)
@@ -283,6 +339,9 @@ bitmap) is pending.
| **`pveum role info` gone** | use `pveum role list` in PVE 9 | [phase1-2 §1.1](tests/phase1-2-findings.md) |
| **`pveum acl delete` needs `--roles`** | bare `-user`/`-token` path errors `400 roles: property is missing` | [phase1-2 §5](tests/phase1-2-findings.md) |
| **`VM.PowerMgmt` not needed** | stop-mode backup works under `VM.Backup` alone | [phase1-2 §1.4](tests/phase1-2-findings.md) |
| **`keyctl=1` is root-only** | feature flags except `nesting` need a `root@pam` session; no API token (even root's) can set them; restore preserves them | [phase3 §B3](tests/phase3-findings.md) |
| **`SDN.Use` gates bridge use** | PVE 9 needs `SDN.Use` to attach a NIC to `vmbr0`; omit it → 403 | [phase3 §B3](tests/phase3-findings.md) |
| **Docker named vol = always backed up** | named volumes live in rootfs; only *volume mountpoints* honour `backup=0`; bulk must be a dedicated `backup=0` mp | [phase3 §B2](tests/phase3-findings.md) |
---
@@ -301,6 +360,8 @@ bitmap) is pending.
| Running, unprivileged LXC snapshots on LVM-thin (no stop) | [phase1-2 §1.6](tests/phase1-2-findings.md) |
| `vzdump` → `pct restore` round-trip; one backup captures Docker volumes; config survives | [phase1-2 §2](tests/phase1-2-findings.md) |
| Crash-consistent restore recovers via Postgres WAL; quiesced restores clean | [phase1-2 §2.2](tests/phase1-2-findings.md) |
| LXC vzdump scope by mount type; `backup=0` excludes volume mps; Docker named vols ride rootfs; proven bulk-exclusion recipe | [phase3 §B2](tests/phase3-findings.md) |
| Operator agent role (16 privs); guest lifecycle incl. restore is API-token-covered; `keyctl` create is `root@pam`-only | [phase3 §B3](tests/phase3-findings.md) |
### Not yet validated (do not assume)
| Open item | Why it matters |