From 43b7e96905afd1418674ff04965910beb38ee2e5 Mon Sep 17 00:00:00 2001
From: kisfenyo
Date: Mon, 8 Jun 2026 14:47:38 +0200
Subject: [PATCH] docs(agent): add REPORT.md (latest-task report, overwritten each change)

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 REPORT.md | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)
 create mode 100644 REPORT.md

diff --git a/REPORT.md b/REPORT.md
new file mode 100644
index 0000000..3288481
--- /dev/null
+++ b/REPORT.md
@@ -0,0 +1,53 @@
+# felhom-agent — latest task report
+
+> This file holds the report for the **most recent** change, fully overwritten each task.
+> Cumulative history lives in [CHANGELOG.md](CHANGELOG.md).
+
+## Task: Agent scaffold + `proxmox` interaction package (slice 1) — v0.1.0
+
+Stood up the host-agent project and its foundation — the typed `proxmox` interaction
+layer every other agent module will call — with a runnable read-only `--selftest`.
+Pushed to `main` (main-only repo). Build/vet/test green; verified live against the demo host.
+
+### Public surface
+
+**`proxmox.Client`** (API backend):
+- Read: `Version`, `Nodes`, `NodeStatus`, `ListLXC`, `GuestStatus`, `GuestConfig`, `ListStorage`, `NodeStorage`, `StorageContent`
+- Async mutating (return a UPID): `RestoreLXC` (primary create path), `Vzdump`, `Snapshot`, `Rollback`, `DeleteSnapshot`, `SetConfig`, `Start`, `Stop`
+- Tasks: `WaitTask(ctx, upid, WaitOptions)`, `TaskStatusOnce`, `TaskLogTail`
+- Errors: `*APIError` (parses the offending privilege from a 403), `*TaskError` (parses it from a failed task `exitstatus` + log tail)
+- Types: `Version, Node, NodeStatus, Guest, GuestConfig (+Extra/MountPoints/Nets), Storage, StorageContent, TaskStatus, UPID`
+
+**`proxmox.Privileged`** (fenced root-CLI; `Runner` iface, `ExecRunner` direct/`sudo -n`): `CreateGoldenLXC` (keyctl), `MountUSBByUUID`, `SMART`, `Sensors` — each documents *why it can't be the API*.
+
+### API-vs-root routing table
+
+| Backend | Ops | Why |
+|---|---|---|
+| **API** | node status, list/status/config guests, storage list+content, task status/log, **restore**, vzdump, snapshot/rollback/delete-snap, set-config, start/stop | FelhomAgent 16-priv token |
+| **root-CLI (fenced)** | golden `pct create` (keyctl=1), USB mount-by-UUID/fstab, SMART/sensors | keyctl is `root@pam`-only; host mounts + SMART aren't API ops |
+
+Fence is **structural** (`Client` has no runner, `Privileged` has no HTTP client) and asserted in `routing_test.go`.
+
+### OPEN-item choices
+- **Config:** JSON file + `FELHOM_AGENT_*` env overrides (stdlib, zero-dep; swappable to `yaml.v3` if YAML house-style is preferred). Token never logged (`Redacted()`).
+- **Privileged runner / uid:** `Runner` iface; `ExecRunner{Mode: sudo|direct}`, default `sudo -n`. Proposed (not finalized): non-root service user + narrow sudoers allowlist for the 3 fenced commands.
+- **Polling:** first poll immediate, then 1s → exponential backoff capped 5s, default total timeout 10m; honors ctx cancellation. Tunable via `WaitOptions`.
+- **`--selftest=task`:** included (gated behind the flag + `-vmid`). Unit-tested via mocks; not run live (the live token was read-only).
+- **Versioning:** `version` var in `main.go` (default `0.1.0`, `-ldflags -X main.version=`), `--version` flag.
+
+### What the live host revealed (recorded, not guessed)
+- Node name is **`demo-felhom`**; `felhom-pve` is only the SSH alias.
+- `/nodes/{node}/status`: `cpu` is a 0..1 fraction, **`loadavg` is an array of strings**; `memory`/`rootfs`/`swap` nested.
+- `vmid` is an **integer** in list/status; `status/current` carries no `vmid` (set from the path arg).
+- Task: `status` ∈ {running, stopped}, `exitstatus` only once stopped; task log is `[{"n":N,"t":"…"}]`. UPID = `UPID:node:pid(hex):pstart(hex):starttime(hex):worker:id:user:`.
+- `pveum user token add … --output-format json` returns `{"value":"…"}`.
+- **No spike fact failed in practice** — 16-priv role, async/UPID model, keyctl boundary, dual-grant privsep all held. Teardown logged `ignore invalid acl token …`, confirming ACL auto-invalidation (phase1-2 §5).
+
+### Verification
+- `go build/vet/test` green twice: locally (Go 1.26) and on the build server (Go 1.24.4).
+- **Live read-only `--selftest`** (built on 192.168.0.180, against `https://192.168.0.162:8006`, **TLS fingerprint-pinned** — no insecure mode): version, nodes, node status, guests, storage all `[ ok ]`. slog confirmed the token rendered as `…=********`. Throwaway token created + torn down.
+- Mutating ops + live `WaitTask` are unit-tested only (live run used a read-only token); `--selftest=task` is ready to exercise them against a real `FelhomAgent` token.
+
+### Repo state
+- Branch: `main` only (feature branch merged + deleted, local & remote). Latest: `chore(agent): add CHANGELOG, version the agent at 0.1.0`.
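
Reviewer note on the **Polling** item in the report: the schedule it describes (first poll immediate, then 1s with exponential backoff capped at 5s, a total timeout, and context cancellation) could look roughly like the sketch below. This is not the agent's actual `WaitTask`. The names `pollTask`, `nextDelay`, `statusFunc`, and `errTimeout` are hypothetical, and the doubling factor is an assumption, since the report only says "exponential".

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// nextDelay doubles the current delay and clamps it to max.
// Assumption: a factor of 2; the report does not state the factor.
func nextDelay(cur, max time.Duration) time.Duration {
	next := cur * 2
	if next > max {
		return max
	}
	return next
}

// statusFunc is a hypothetical stand-in for a single task-status query.
// It reports whether the task has reached the "stopped" state.
type statusFunc func(ctx context.Context) (stopped bool, err error)

// errTimeout is a hypothetical sentinel for the total-timeout case.
var errTimeout = errors.New("task did not stop before the total timeout")

// pollTask polls once immediately, then waits initial, 2*initial, ...
// (capped at max) between polls until the task stops, the total timeout
// elapses, or the caller cancels ctx. It returns the number of polls made.
func pollTask(ctx context.Context, poll statusFunc, initial, max, total time.Duration) (int, error) {
	ctx, cancel := context.WithTimeout(ctx, total)
	defer cancel()

	delay := initial
	for polls := 1; ; polls++ {
		stopped, err := poll(ctx)
		if err != nil {
			return polls, err
		}
		if stopped {
			return polls, nil
		}

		timer := time.NewTimer(delay)
		select {
		case <-ctx.Done():
			timer.Stop()
			if errors.Is(ctx.Err(), context.DeadlineExceeded) {
				return polls, errTimeout
			}
			return polls, ctx.Err()
		case <-timer.C:
		}
		delay = nextDelay(delay, max)
	}
}

func main() {
	// Waits between polls for the defaults named in the report (1s start, 5s cap).
	d := time.Second
	for i := 0; i < 5; i++ {
		fmt.Println(d)
		d = nextDelay(d, 5*time.Second)
	}

	// A fake task that reports "stopped" on the third poll.
	// Millisecond delays keep the demo fast.
	calls := 0
	fake := func(ctx context.Context) (bool, error) {
		calls++
		return calls >= 3, nil
	}
	polls, err := pollTask(context.Background(), fake, time.Millisecond, 5*time.Millisecond, time.Second)
	fmt.Println(polls, err)
}
```

With a 1s start and a 5s cap the waits run 1s, 2s, 4s, 5s, 5s, so a long-running task settles into one status request every five seconds. Keeping the delay calculation in a pure function lets the schedule be unit-tested without sleeping.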