felhom-agent/internal/proxmox/doc.go
admin a042316d6d feat(agent): scaffold + proxmox interaction layer (slice 1)
Stand up the felhom-agent project (module gitea.dooplex.hu/admin/felhom-agent,
binary felhom-agent) and the internal/proxmox package: the typed library every
other agent module calls to talk to Proxmox.

- API-first Client (hand-rolled REST over net/http, PVEAPIToken auth) with typed
  read ops (version/nodes/status/lxc/config/storage) and async mutating ops
  (restore/vzdump/snapshot/rollback/delete-snapshot/setconfig/start/stop), each
  returning a UPID. WaitTask polls task status until stopped and asserts
  exitstatus OK (authz can surface at task exec, not the POST — phase1-2 §1.3).
- Fenced Privileged (root-CLI) backend for the THREE proven exceptions only
  (keyctl pct create, USB mount/fstab, SMART/sensors); each cites why it can't be
  the API. Fence is structural (Client never shells out, Privileged never HTTPs)
  and asserted in routing_test.go.
- TLS: SHA-256 leaf-cert pinning or CA file; insecure mode explicit + off by
  default. No blanket verification disable.
- 403 -> privilege-named APIError; failed task -> privilege-named TaskError.
- JSON config + env overrides (token never logged); slog logging.
- cmd/felhom-agent --selftest (read-only health report) + gated --selftest=task
  (reversible snapshot/rollback/delete exercise of WaitTask). No daemon loop yet.
- Types grounded in the spike findings and exact JSON shapes captured live from
  demo-felhom (PVE 9.2.2). Unit tests use a mock transport + runner.
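
A minimal sketch of the SHA-256 leaf-cert pinning described in the TLS bullet above, assuming the pin is supplied as a hex digest of the leaf certificate's DER bytes. The names fingerprint, verifyPin and pinnedTLSConfig are hypothetical and are not taken from the package. Chain verification is switched off only because the pin check in VerifyConnection takes its place.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"crypto/tls"
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// fingerprint returns the lowercase hex SHA-256 digest of a DER certificate.
func fingerprint(der []byte) string {
	sum := sha256.Sum256(der)
	return hex.EncodeToString(sum[:])
}

// verifyPin returns an error unless the certificate digest equals wantHex.
// Colons are stripped so openssl-style "AB:CD:..." pins can be pasted in.
func verifyPin(der []byte, wantHex string) error {
	want := strings.ToLower(strings.ReplaceAll(wantHex, ":", ""))
	if len(want) != hex.EncodedLen(sha256.Size) {
		return fmt.Errorf("pin must be %d hex characters", hex.EncodedLen(sha256.Size))
	}
	if subtle.ConstantTimeCompare([]byte(fingerprint(der)), []byte(want)) != 1 {
		return errors.New("leaf certificate does not match pinned SHA-256 fingerprint")
	}
	return nil
}

// pinnedTLSConfig builds a tls.Config that accepts only the pinned leaf.
// Hypothetical helper: the real package's API and wiring may differ.
func pinnedTLSConfig(wantHex string) (*tls.Config, error) {
	clean := strings.ReplaceAll(wantHex, ":", "")
	if raw, err := hex.DecodeString(clean); err != nil || len(raw) != sha256.Size {
		return nil, errors.New("pin must be a hex-encoded SHA-256 digest")
	}
	return &tls.Config{
		MinVersion: tls.VersionTLS12,
		// Not a blanket disable: the pin check below replaces chain verification.
		InsecureSkipVerify: true,
		// VerifyConnection runs on every handshake, including resumed sessions.
		VerifyConnection: func(cs tls.ConnectionState) error {
			if len(cs.PeerCertificates) == 0 {
				return errors.New("server presented no certificate")
			}
			return verifyPin(cs.PeerCertificates[0].Raw, clean)
		},
	}, nil
}

func main() {
	// Stand-in bytes; a real pin is computed over the server's leaf DER.
	pin := fingerprint([]byte("stand-in for DER bytes"))
	fmt.Println(verifyPin([]byte("stand-in for DER bytes"), pin) == nil)
	fmt.Println(verifyPin([]byte("a different certificate"), pin) == nil)
}
```

The returned config would be set as TLSClientConfig on an http.Transport. The CA-file mode would instead populate RootCAs and leave InsecureSkipVerify false.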

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 14:34:32 +02:00


// Package proxmox is the typed interaction layer the host agent uses to talk to
// a single Proxmox VE host. Every other agent module calls this package; it owns
// the API-first + fenced-root-CLI model the spikes proved
// (felhom.eu/documentation/proxmox-platform.md and tests/phase{0,1-2,3}-findings.md).
//
// # Two backends, one routing policy
//
// The package has two independent backends. Which path an operation takes is a
// fixed policy, not a per-call choice:
//
//   - Client (API backend): the default for everything the scoped FelhomAgent
//     token can do. A hand-rolled REST client over https://<host>:8006/api2/json,
//     auth header "Authorization: PVEAPIToken=USER@REALM!TOKENID=SECRET". Every
//     mutating call is async: it returns a UPID and the caller polls the task with
//     WaitTask until it stops, then asserts exitstatus == "OK". Authorization can
//     fail at task execution rather than at the HTTP POST (phase1-2 §1.3), so the
//     POST's 200 is never treated as proof of success.
//
//   - Privileged (root-CLI backend): fenced to the three proven exceptions ONLY:
//     (a) keyctl `pct create` for golden-image builds, (b) USB mount-by-UUID /
//     fstab, (c) SMART / sensors reads. Each method cites why it cannot be the API.
//
// Client never shells out and Privileged never makes an HTTP call: the fence is
// structural (separate types, separate dependencies), and asserted in
// routing_test.go.
//
// # API-vs-root routing table (phase3-findings.md §B3 boundary)
//
//    Operation                                          Backend       Why
//    -------------------------------------------------  ------------  -------------------------------------------
//    node status / resources / metrics                  Client (API)  Sys.Audit
//    list guests + per-guest status/config              Client (API)  VM.Audit
//    storage list + content                             Client (API)  Datastore.Audit
//    task status / log                                  Client (API)  task owner can read own task
//    restore LXC from archive (PRIMARY create path)     Client (API)  VM.Allocate; restore preserves keyctl
//    vzdump backup (stop/snapshot mode)                 Client (API)  VM.Backup (stop-mode needs no PowerMgmt)
//    snapshot / rollback / delete-snapshot              Client (API)  VM.Snapshot / VM.Snapshot.Rollback
//    set config (mem/cpu/net/options/mountpoint)        Client (API)  VM.Config.*
//    start / stop guest                                 Client (API)  VM.PowerMgmt
//    -------------------------------------------------  ------------  -------------------------------------------
//    golden-image `pct create` with keyctl=1            Privileged    keyctl is root@pam-only; no token qualifies
//    USB mount-by-UUID / systemd mount unit / fstab     Privileged    host-level mount, not a Proxmox API op
//    SMART / hardware sensors                           Privileged    not API-exposed
//
// # Grounding notes for later slices (do not act on these here)
//
//   - Provision-by-restore is the primary create path: a token-authorized restore
//     preserves features=nesting=1,keyctl=1 (phase3 §B3); fresh `pct create` with
//     keyctl is the only root-fenced create.
//   - A Docker NAMED volume lives in the LXC rootfs (/var/lib/docker/volumes/<v>/_data)
//     and is ALWAYS captured by vzdump. The backup=<bool> flag is honoured only for
//     *volume* mount points; a bulk volume must be a dedicated backup=0 mountpoint or
//     it is silently swept into the whole-guest image (phase3 §B2).
//   - `pct restore` preserves the source MAC + hostname, so reset the restored
//     guest's network identity before starting it alongside the original
//     (phase1-2 §2.2).
//   - An LXC has no guest agent, so snapshot-mode vzdump does NOT fsfreeze: an
//     agent-initiated backup is crash-consistent only; app-consistency is the
//     controller's job (quiesce, then POST /backup) (proxmox-platform.md §4.2).
//
// This slice (slice 1) wraps only the proven, read-tested op set. No reconcile
// loop, hub client, or signing — those are later slices.
package proxmox
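
The WaitTask contract in the package comment (poll until the task stops, then require exitstatus "OK") can be sketched as below. This is an illustration under stated assumptions, not the package's implementation: taskStatus, statusFunc, waitTask, scripted and waitScripted are hypothetical names, and the HTTP transport is replaced by a scripted status source so the sketch runs without a Proxmox host. Over HTTP, the fetch step would be a GET on the node's task status endpoint, carrying the PVEAPIToken Authorization header.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// taskStatus holds the two fields of a task status reply that matter here.
type taskStatus struct {
	Status     string `json:"status"`     // "running" or "stopped"
	ExitStatus string `json:"exitstatus"` // set once stopped; "OK" on success
}

// statusFunc fetches the current status of one task (hypothetical seam).
type statusFunc func(ctx context.Context) (taskStatus, error)

// waitTask polls until the task stops, then requires exitstatus "OK".
// interval must be positive because time.NewTicker panics otherwise.
func waitTask(ctx context.Context, fetch statusFunc, interval time.Duration) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		st, err := fetch(ctx)
		if err != nil {
			return fmt.Errorf("fetch task status: %w", err)
		}
		if st.Status == "stopped" {
			if st.ExitStatus != "OK" {
				// A task can fail (for example on authorization) even
				// though the POST that started it returned 200.
				return fmt.Errorf("task failed: exitstatus %q", st.ExitStatus)
			}
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
		}
	}
}

// scripted replays the given statuses in order and then repeats the last
// one; it stands in for the HTTP transport in this sketch.
func scripted(seq ...taskStatus) statusFunc {
	i := 0
	return func(context.Context) (taskStatus, error) {
		if len(seq) == 0 {
			return taskStatus{}, errors.New("no scripted status")
		}
		st := seq[i]
		if i < len(seq)-1 {
			i++
		}
		return st, nil
	}
}

// waitScripted runs waitTask against a scripted sequence with a short timeout.
func waitScripted(seq ...taskStatus) error {
	ctx, cancel := context.WithTimeout(context.Background(), 200*time.Millisecond)
	defer cancel()
	return waitTask(ctx, scripted(seq...), time.Millisecond)
}

func main() {
	running := taskStatus{Status: "running"}
	// The failing exitstatus string is illustrative, not a captured value.
	fmt.Println(waitScripted(running, running, taskStatus{Status: "stopped", ExitStatus: "OK"}))
	fmt.Println(waitScripted(running, taskStatus{Status: "stopped", ExitStatus: "permission check failed"}))
}
```

Taking the status source as a function keeps the polling loop testable with a mock transport, in line with the unit-test approach the commit message mentions. The context bounds how long a stuck task can block the caller.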