# felhom-agent: latest task report

> This file holds the report for the **most recent** change, fully overwritten each task.
> Cumulative history lives in [CHANGELOG.md](CHANGELOG.md).

## Task: Agent scaffold + `proxmox` interaction package (slice 1), v0.1.0

Stood up the host-agent project and its foundation, the typed `proxmox` interaction
layer every other agent module will call, with a runnable read-only `--selftest`.
Pushed to `main` (main-only repo). Build/vet/test green; verified live against the demo host.

### Public surface

**`proxmox.Client`** (API backend):
- Read: `Version`, `Nodes`, `NodeStatus`, `ListLXC`, `GuestStatus`, `GuestConfig`, `ListStorage`, `NodeStorage`, `StorageContent`
- Async mutating (return a UPID): `RestoreLXC` (primary create path), `Vzdump`, `Snapshot`, `Rollback`, `DeleteSnapshot`, `SetConfig`, `Start`, `Stop`
- Tasks: `WaitTask(ctx, upid, WaitOptions)`, `TaskStatusOnce`, `TaskLogTail`
- Errors: `*APIError` (parses the offending privilege from a 403), `*TaskError` (parses it from a failed task `exitstatus` + log tail)
- Types: `Version, Node, NodeStatus, Guest, GuestConfig (+Extra/MountPoints/Nets), Storage, StorageContent, TaskStatus, UPID`

**`proxmox.Privileged`** (fenced root-CLI; `Runner` iface, `ExecRunner` direct/`sudo -n`): `CreateGoldenLXC` (keyctl), `MountUSBByUUID`, `SMART`, `Sensors`; each documents *why it can't be the API*.

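
As a rough illustration of the "parses the offending privilege from a 403" behaviour listed for `*APIError`, the sketch below extracts the ACL path and privilege from a permission-failure message. The message shape (`Permission check failed (<path>, <privilege>)`) and the helper name `parsePrivilege` are assumptions made for this example; the real `*APIError` may parse differently.

```go
package main

import (
	"fmt"
	"regexp"
)

// permCheckRe matches a message such as
// "Permission check failed (/vms/101, VM.Audit)".
// ASSUMPTION: this wording is illustrative, not taken from the report.
var permCheckRe = regexp.MustCompile(`Permission check failed \(([^,]+),\s*([^)]+)\)`)

// parsePrivilege is a hypothetical helper. It returns the ACL path and the
// missing privilege, or ok=false when the message has a different shape.
func parsePrivilege(msg string) (path, priv string, ok bool) {
	m := permCheckRe.FindStringSubmatch(msg)
	if m == nil {
		return "", "", false
	}
	return m[1], m[2], true
}

func main() {
	path, priv, ok := parsePrivilege("403 Permission check failed (/vms/101, VM.Audit)")
	fmt.Println(path, priv, ok) // prints "/vms/101 VM.Audit true"
}
```

Returning `ok=false` instead of an error lets a caller fall back to the raw message when the server's wording does not match.
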
### API-vs-root routing table

| Backend | Ops | Why |
|---|---|---|
| **API** | node status, list/status/config guests, storage list+content, task status/log, **restore**, vzdump, snapshot/rollback/delete-snap, set-config, start/stop | FelhomAgent 16-priv token |
| **root-CLI (fenced)** | golden `pct create` (keyctl=1), USB mount-by-UUID/fstab, SMART/sensors | keyctl is `root@pam`-only; host mounts + SMART aren't API ops |

Fence is **structural** (`Client` has no runner, `Privileged` has no HTTP client) and asserted in `routing_test.go`.

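
One way a structural fence like this could be asserted is with reflection over the struct fields. The sketch below is not the contents of `routing_test.go`; the field layouts of `Client` and `Privileged` and the helper `hasFieldOfType` are simplified stand-ins assumed for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"reflect"
)

// Runner, Client and Privileged are simplified stand-ins (ASSUMPTION):
// the real types in the proxmox package will have more fields.
type Runner interface {
	Run(name string, args ...string) ([]byte, error)
}

type Client struct {
	http    *http.Client
	baseURL string
}

type Privileged struct {
	runner Runner
}

var (
	runnerType = reflect.TypeOf((*Runner)(nil)).Elem() // the Runner interface type
	httpType   = reflect.TypeOf((*http.Client)(nil))   // the *http.Client type
)

// hasFieldOfType reports whether struct value v has a direct field of type t.
func hasFieldOfType(v any, t reflect.Type) bool {
	st := reflect.TypeOf(v)
	for i := 0; i < st.NumField(); i++ {
		if st.Field(i).Type == t {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hasFieldOfType(Client{}, runnerType))   // prints "false"
	fmt.Println(hasFieldOfType(Privileged{}, httpType)) // prints "false"
}
```

A check of this kind only inspects direct fields, so it would not catch a runner hidden inside a nested struct; a real test might walk fields recursively.
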
### OPEN-item choices
- **Config:** JSON file + `FELHOM_AGENT_*` env overrides (stdlib, zero-dep; swappable to `yaml.v3` if YAML house-style is preferred). Token never logged (`Redacted()`).
- **Privileged runner / uid:** `Runner` iface; `ExecRunner{Mode: sudo|direct}`, default `sudo -n`. Proposed (not finalized): non-root service user + narrow sudoers allowlist for the 3 fenced commands.
- **Polling:** first poll immediate, then 1s → exponential backoff capped at 5s, default total timeout 10m; honors ctx cancellation. Tunable via `WaitOptions`.
- **`--selftest=task`:** included (gated behind the flag + `-vmid`). Unit-tested via mocks; not run live (the live token was read-only).
- **Versioning:** `version` var in `main.go` (default `0.1.0`, `-ldflags -X main.version=`), `--version` flag.

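
The polling schedule described above (immediate first poll, then 1s with exponential backoff capped at 5s, bounded by a total timeout and by context cancellation) can be sketched as follows. This is not the actual `WaitTask` code: the names `nextDelay` and `pollUntil` are hypothetical, and the doubling factor is an assumption, since the report only says "exponential".

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// nextDelay returns the initial delay on the first call (prev <= 0), then
// doubles the previous delay, clamped to max.
// ASSUMPTION: a growth factor of 2; the report does not state the factor.
func nextDelay(prev, initial, max time.Duration) time.Duration {
	if prev <= 0 {
		return initial
	}
	if d := prev * 2; d < max {
		return d
	}
	return max
}

// pollUntil calls check immediately, then waits with backoff between calls
// until check reports done, returns an error, the context is cancelled, or
// the total timeout elapses.
func pollUntil(ctx context.Context, initial, max, timeout time.Duration,
	check func(context.Context) (bool, error)) error {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	var delay time.Duration
	for {
		done, err := check(ctx)
		if err != nil {
			return err
		}
		if done {
			return nil
		}
		delay = nextDelay(delay, initial, max)
		timer := time.NewTimer(delay)
		select {
		case <-ctx.Done():
			timer.Stop()
			return ctx.Err()
		case <-timer.C:
		}
	}
}

func main() {
	calls := 0
	// Short durations keep the demo fast; the report's defaults are 1s, 5s and 10m.
	err := pollUntil(context.Background(), 10*time.Millisecond, 50*time.Millisecond, time.Second,
		func(context.Context) (bool, error) {
			calls++
			return calls == 3, nil
		})
	fmt.Println(calls, err) // prints "3 <nil>"
}
```

With the report's defaults and a factor of 2, the waits between polls would be 1s, 2s, 4s, 5s, 5s, and so on until the 10m budget runs out.
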
### What the live host revealed (recorded, not guessed)
- Node name is **`demo-felhom`**; `felhom-pve` is only the SSH alias.
- `/nodes/{node}/status`: `cpu` is a 0..1 fraction, **`loadavg` is an array of strings**; `memory`/`rootfs`/`swap` nested.
- `vmid` is an **integer** in list/status; `status/current` carries no `vmid` (set from the path arg).
- Task: `status` ∈ {running, stopped}, `exitstatus` only once stopped; task log is `[{"n":N,"t":"…"}]`. UPID = `UPID:node:pid(hex):pstart(hex):starttime(hex):worker:id:user:`.
- `pveum user token add … --output-format json` returns `{"value":"…"}`.
- **No spike fact failed in practice:** 16-priv role, async/UPID model, keyctl boundary, dual-grant privsep all held. Teardown logged `ignore invalid acl token …`, confirming ACL auto-invalidation (phase1-2 §5).

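
To make the recorded UPID layout concrete, here is a minimal parsing sketch for `UPID:node:pid(hex):pstart(hex):starttime(hex):worker:id:user:`. The type `UPIDParts`, the function `parseUPID`, and the sample UPID string are invented for illustration; the package's own `UPID` type may look different.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// UPIDParts is a hypothetical decoded form of a UPID string.
type UPIDParts struct {
	Node                   string
	PID, PStart, StartTime uint64 // hex in the wire format
	Worker, ID, User       string
}

// parseUPID splits a UPID on ':' and decodes the three hex fields.
// The trailing ':' in the layout yields an empty ninth field.
func parseUPID(s string) (UPIDParts, error) {
	f := strings.Split(s, ":")
	if len(f) != 9 || f[0] != "UPID" || f[8] != "" {
		return UPIDParts{}, fmt.Errorf("malformed UPID %q", s)
	}
	var nums [3]uint64
	for i := range nums {
		n, err := strconv.ParseUint(f[2+i], 16, 64)
		if err != nil {
			return UPIDParts{}, fmt.Errorf("UPID field %d: %w", 2+i, err)
		}
		nums[i] = n
	}
	return UPIDParts{
		Node: f[1], PID: nums[0], PStart: nums[1], StartTime: nums[2],
		Worker: f[5], ID: f[6], User: f[7],
	}, nil
}

func main() {
	// Sample value made up for this example, not captured from the live host.
	p, err := parseUPID("UPID:demo-felhom:00001A2B:0000FFFF:65000000:vzdump:101:root@pam:")
	fmt.Println(p.Node, p.PID, p.Worker, p.ID, err) // prints "demo-felhom 6699 vzdump 101 <nil>"
}
```
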
### Verification
- `go build/vet/test` green twice: locally (Go 1.26) and on the build server (Go 1.24.4).
- **Live read-only `--selftest`** (built on 192.168.0.180, against `https://192.168.0.162:8006`, **TLS fingerprint-pinned**, no insecure mode): version, nodes, node status, guests, storage all `[ ok ]`. slog confirmed the token rendered as `…=********`. Throwaway token created + torn down.
- Mutating ops + live `WaitTask` are unit-tested only (live run used a read-only token); `--selftest=task` is ready to exercise them against a real `FelhomAgent` token.

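
The `…=********` rendering noted above suggests the token ID stays visible while the secret is masked. A minimal sketch of such a redaction follows, assuming the token has the form `<id>=<secret>`; the function `redactToken` and the sample token are hypothetical and are not the agent's `Redacted()` implementation.

```go
package main

import (
	"fmt"
	"strings"
)

const mask = "********"

// redactToken keeps everything up to and including the first '=' and masks
// the rest. A token without '=' is masked entirely, so no part of an
// unexpected format leaks into logs.
// ASSUMPTION: tokens look like "<id>=<secret>".
func redactToken(tok string) string {
	if tok == "" {
		return ""
	}
	if i := strings.Index(tok, "="); i >= 0 {
		return tok[:i+1] + mask
	}
	return mask
}

func main() {
	fmt.Println(redactToken("agent@pve!selftest=0123-4567")) // prints "agent@pve!selftest=********"
}
```
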
### Repo state
- Branch: `main` only (feature branch merged + deleted, local & remote). Latest: `chore(agent): add CHANGELOG, version the agent at 0.1.0`.