New internal/reconcile package: the agent-side control core's structural half.
- Per-guest serializer Queue (doc 03 §10): the single choke point all mutation
sources funnel through; same-vmid serial in submit order, different vmids
parallel (cond-var FIFO lanes).
- Desired-state model + DesiredProvider seam; EmptyProvider is the only live
source at slice 4 (no hub serving until slice 10) so the live engine computes
an empty action set and performs zero mutations.
- Normalization layer (FieldNormalizers): compares normalized desired and actual
values so Proxmox round-trip quirks don't read as drift. normDesc promoted out of
main.go to reconcile.NormDescription; selftest uses the shared helper.
- Plan (pure diff): minimal benign action set (Start/Stop/SetConfig) for guests
in both desired and actual; provision/destroy out of scope here.
- Engine: dispatches onto the shared queue; honors the dual-mode SetConfig
contract (UPID -> WaitTask; empty UPID -> synchronous success).
- Durable op journal + idempotency store (mirrors authz.FileNonceStore):
in-flight task ids for crash detection + AlreadyApplied dedupe across restart.
- Wired into runDaemon alongside the hub loop, sharing the queue; runs cleanly
with no desired state and no signers.
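Illustration of the per-guest serializer described above. This is a minimal sketch, not the package's code: the names Queue, Do and lane are hypothetical, and it assumes one shared condition variable with ticket-numbered lanes per vmid. Work for the same vmid runs in ticket order; different vmids never wait on each other.

```go
package main

import (
	"fmt"
	"sync"
)

// lane is one guest's FIFO: tickets are handed out in submit order and
// served strictly in that order. (Hypothetical type, for illustration.)
type lane struct {
	next    uint64 // next ticket to hand out
	serving uint64 // ticket currently allowed to run
}

// Queue serializes work per vmid while letting different vmids run in parallel.
type Queue struct {
	mu    sync.Mutex
	cond  *sync.Cond
	lanes map[int]*lane
}

func NewQueue() *Queue {
	q := &Queue{lanes: make(map[int]*lane)}
	q.cond = sync.NewCond(&q.mu)
	return q
}

// Do blocks until every earlier submission for vmid has finished, then runs fn.
// The queue mutex is NOT held while fn runs, so other lanes make progress.
func (q *Queue) Do(vmid int, fn func() error) error {
	q.mu.Lock()
	l, ok := q.lanes[vmid]
	if !ok {
		l = &lane{}
		q.lanes[vmid] = l
	}
	ticket := l.next
	l.next++
	for l.serving != ticket {
		q.cond.Wait()
	}
	q.mu.Unlock()

	defer func() {
		q.mu.Lock()
		l.serving++
		if l.serving == l.next {
			delete(q.lanes, vmid) // lane is idle: drop it
		}
		q.cond.Broadcast() // wake waiters; only the next ticket proceeds
		q.mu.Unlock()
	}()
	return fn()
}

func main() {
	q := NewQueue()
	vmids := []int{100, 101}
	counts := make([]int, len(vmids)) // one unsynchronized counter per guest
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		idx := i % len(vmids)
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = q.Do(vmids[idx], func() error {
				counts[idx]++ // safe only because the lane serializes this vmid
				return nil
			})
		}()
	}
	wg.Wait()
	fmt.Println(counts)
}
```

In this sketch "submit order" is the order in which callers take a ticket under the mutex. A single Broadcast wakes every waiter, which is simple but wasteful with many lanes; a per-lane condition variable would avoid that.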
Full module race-clean and vet-clean on the Linux build server.
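Illustration of the dual-mode SetConfig contract named in the Engine bullet. A hedged sketch only: the configAPI interface, applyConfig, fakeAPI and the "UPID:example" value are hypothetical stand-ins, not the real internal/proxmox signatures. A non-empty UPID is waited on; an empty UPID is treated as synchronous success.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// configAPI is a hypothetical narrow view of the Proxmox client.
type configAPI interface {
	SetConfig(ctx context.Context, vmid int, params map[string]string) (upid string, err error)
	WaitTask(ctx context.Context, upid string) error
}

// errTaskFailed stands in for a task that stopped with a non-OK exitstatus.
var errTaskFailed = errors.New("task exitstatus not OK")

// applyConfig honors both modes: UPID -> wait for the task; empty UPID -> done.
func applyConfig(ctx context.Context, api configAPI, vmid int, params map[string]string) error {
	upid, err := api.SetConfig(ctx, vmid, params)
	if err != nil {
		return fmt.Errorf("setconfig vmid %d: %w", vmid, err)
	}
	if upid == "" {
		return nil // synchronous mode: the change is already applied
	}
	if err := api.WaitTask(ctx, upid); err != nil {
		return fmt.Errorf("setconfig vmid %d task %s: %w", vmid, upid, err)
	}
	return nil
}

// fakeAPI is a test double that records which UPIDs were waited on.
type fakeAPI struct {
	upid    string
	taskErr error
	waited  []string
}

func (f *fakeAPI) SetConfig(context.Context, int, map[string]string) (string, error) {
	return f.upid, nil
}

func (f *fakeAPI) WaitTask(_ context.Context, upid string) error {
	f.waited = append(f.waited, upid)
	return f.taskErr
}

// demo runs applyConfig against the fake and reports whether WaitTask was called.
func demo(upid string, taskErr error) (waited bool, err error) {
	f := &fakeAPI{upid: upid, taskErr: taskErr}
	err = applyConfig(context.Background(), f, 100, map[string]string{"description": "demo"})
	return len(f.waited) > 0, err
}

func main() {
	fmt.Println(demo("", nil))                       // synchronous path
	fmt.Println(demo("UPID:example", nil))           // async path, task OK
	fmt.Println(demo("UPID:example", errTaskFailed)) // async path, task failed
}
```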
CHECKPOINT: Phase A only. Awaiting validation before Phase B (the reversibility
gate + signed-op consuming layer, landing in v0.4.0).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
internal/hub: the agent's first daemon, a periodic read-only host report POSTed to
the hub (this report is the heartbeat; there is no separate ping).
- HostReport wire contract (shared field-for-field with the hub ingest): host
metrics, guests (vmid + spec), cloudflared status; storage/backups/restore-tests/
pbs/audit collections DEFINED but emitted empty (slices 5/6 fill).
- Collector over a read-only proxmoxReader (adapted to the real proxmox surface;
no proxmox changes) + a CloudflaredProber. Partial failure: a NodeStatus failure
is hard (the POST is skipped); a per-guest GuestConfig failure reports that guest
with status "unknown" and the report is still sent.
- Client: Bearer-auth POST, standard TLS (system roots / optional ca_file), typed
TransportError/HTTPError, token never in errors.
- Loop: immediate first report, adopt hub poll_interval (clamp [60,3600]), resilient
to collect/report errors, clean ctx-cancel shutdown.
- ControlEnvelope: only poll_interval_seconds acted on; blocked/desired_generation/
has_signed_ops parsed-but-ignored (slice 4).
- config: HubConfig + FELHOM_AGENT_HUB_* overlay + mode-aware HubConfig.Validate +
WithDefaults + hub-key redaction; example config updated.
- main: no-selftest mode is now the daemon; added --selftest=hub. Version -> 0.3.0.
Tests: report serialization, client (incl. token-redaction), collector partial-
failure, loop continuation+interval adoption, config. internal/proxmox + internal/
authz untouched.
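Illustration of the poll_interval adoption with the [60,3600] clamp from the Loop bullet. A minimal sketch: adoptInterval and the constant names are hypothetical, and treating a non-positive hub value as "keep the current interval" is an assumption, not something this commit states.

```go
package main

import (
	"fmt"
	"time"
)

const (
	minPollSeconds = 60   // lower clamp bound from the commit message
	maxPollSeconds = 3600 // upper clamp bound from the commit message
)

// adoptInterval returns the interval to use after a hub response.
// ASSUMPTION: hubSeconds <= 0 means "no instruction", so current is kept.
func adoptInterval(current time.Duration, hubSeconds int) time.Duration {
	if hubSeconds <= 0 {
		return current
	}
	if hubSeconds < minPollSeconds {
		hubSeconds = minPollSeconds
	}
	if hubSeconds > maxPollSeconds {
		hubSeconds = maxPollSeconds
	}
	return time.Duration(hubSeconds) * time.Second
}

func main() {
	current := 5 * time.Minute
	for _, s := range []int{0, 10, 300, 86400} {
		fmt.Printf("hub=%d -> %s\n", s, adoptInterval(current, s))
	}
}
```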
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Stand up the felhom-agent project (module gitea.dooplex.hu/admin/felhom-agent,
binary felhom-agent) and the internal/proxmox package: the typed library every
other agent module calls to talk to Proxmox.
- API-first Client (hand-rolled REST over net/http, PVEAPIToken auth) with typed
read ops (version/nodes/status/lxc/config/storage) and async mutating ops
(restore/vzdump/snapshot/rollback/delete-snapshot/setconfig/start/stop), each
returning a UPID. WaitTask polls task status until stopped and asserts
exitstatus OK, because authz failures can surface at task exec rather than at
the POST (phase1-2 §1.3).
- Fenced Privileged (root-CLI) backend for the THREE proven exceptions only
(keyctl pct create, USB mount/fstab, SMART/sensors); each cites why it can't be
the API. Fence is structural (Client never shells out, Privileged never HTTPs)
and asserted in routing_test.go.
- TLS: SHA-256 leaf-cert pinning or CA file; insecure mode explicit + off by
default. No blanket verification disable.
- 403 -> privilege-named APIError; failed task -> privilege-named TaskError.
- JSON config + env overrides (token never logged); slog logging.
- cmd/felhom-agent --selftest (read-only health report) + gated --selftest=task
(reversible snapshot/rollback/delete exercise of WaitTask). No daemon loop yet.
- Types grounded in the spike findings and exact JSON shapes captured live from
demo-felhom (PVE 9.2.2). Unit tests use a mock transport + runner.
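Illustration of SHA-256 leaf-cert pinning as named in the TLS bullet. This is a sketch under assumptions, not the package's code: fingerprint, matchesPin and pinnedTLSConfig are hypothetical names, and it assumes the pin replaces chain and hostname verification via crypto/tls's VerifyConnection hook (the usual approach for a self-signed Proxmox certificate). Verification is substituted, not disabled: a connection whose leaf does not match the pin is rejected.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"crypto/tls"
	"encoding/hex"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// fingerprint returns the lowercase hex SHA-256 of a DER-encoded certificate.
func fingerprint(rawCert []byte) string {
	sum := sha256.Sum256(rawCert)
	return hex.EncodeToString(sum[:])
}

// matchesPin reports whether rawCert hashes to pinHex ("AA:BB:.." or plain hex).
func matchesPin(rawCert []byte, pinHex string) bool {
	want, err := hex.DecodeString(strings.ReplaceAll(pinHex, ":", ""))
	if err != nil || len(want) != sha256.Size {
		return false
	}
	got := sha256.Sum256(rawCert)
	return subtle.ConstantTimeCompare(got[:], want) == 1
}

// pinnedTLSConfig accepts a server only if its leaf certificate matches the pin.
func pinnedTLSConfig(pinHex string) *tls.Config {
	return &tls.Config{
		MinVersion: tls.VersionTLS12,
		// Chain/hostname checks are replaced by the pin check below.
		InsecureSkipVerify: true,
		VerifyConnection: func(cs tls.ConnectionState) error {
			if len(cs.PeerCertificates) == 0 {
				return errors.New("tls pin: no peer certificate")
			}
			if !matchesPin(cs.PeerCertificates[0].Raw, pinHex) {
				return errors.New("tls pin: leaf certificate does not match pinned SHA-256")
			}
			return nil
		},
	}
}

func main() {
	srv := httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	defer srv.Close()

	good := fingerprint(srv.Certificate().Raw)
	bad := fingerprint([]byte("not the server certificate"))
	for _, pin := range []string{good, bad} {
		client := &http.Client{Transport: &http.Transport{TLSClientConfig: pinnedTLSConfig(pin)}}
		resp, err := client.Get(srv.URL)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		resp.Body.Close()
		fmt.Println("accepted:", resp.Status)
	}
}
```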
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>