felhom-agent/REPORT.md
admin ab77fa3544 feat(hub): host-report client + collector + first daemon loop (slice 3, v0.3.0)
internal/hub: the agent's first daemon — a periodic read-only host-report POSTed to
the hub (the heartbeat; no separate ping).

- HostReport wire contract (shared field-for-field with the hub ingest): host
  metrics, guests (vmid + spec), cloudflared status; storage/backups/restore-tests/
  pbs/audit collections DEFINED but emitted empty (slices 5/6 fill).
- Collector over a read-only proxmoxReader (adapted to the real proxmox surface;
  no proxmox changes) + a CloudflaredProber. Partial-failure: NodeStatus fail = hard
  (skip POST); per-guest GuestConfig fail = status "unknown", still report.
- Client: Bearer-auth POST, standard TLS (system roots / optional ca_file), typed
  TransportError/HTTPError, token never in errors.
- Loop: immediate first report, adopt hub poll_interval (clamp [60,3600]), resilient
  to collect/report errors, clean ctx-cancel shutdown.
- ControlEnvelope: only poll_interval_seconds acted on; blocked/desired_generation/
  has_signed_ops parsed-but-ignored (slice 4).
- config: HubConfig + FELHOM_AGENT_HUB_* overlay + mode-aware HubConfig.Validate +
  WithDefaults + hub-key redaction; example config updated.
- main: no-selftest mode is now the daemon; added --selftest=hub. Version -> 0.3.0.

Tests: report serialization, client (incl. token-redaction), collector partial-
failure, loop continuation+interval adoption, config. internal/proxmox + internal/
authz untouched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 16:20:09 +02:00


# felhom-agent — latest task report
> This file holds the report for the **most recent** change, fully overwritten each task.
> Cumulative history lives in [CHANGELOG.md](CHANGELOG.md).
## Task: hub client + host-report + first daemon loop (slice 3) — v0.3.0
The agent's **first daemon**: a periodic, read-only host-report POSTed to the hub — which
**is** the heartbeat (its server-side `received_at` is the dead-man's-switch signal). New
`internal/hub` package + config additions + `main.go` daemon wiring. Pushed to `main`;
build/vet/test green locally (go1.26) and on the build server.
### `internal/hub` public surface
- **`HostReport`** + sub-types (`HostMetrics`, `Guest`, `GuestSpec`, `Cloudflared`,
`ControlEnvelope`) — the JSON wire contract shared field-for-field with the hub ingest.
- **`Collector`** — `NewCollector(px proxmoxReader, cf CloudflaredProber, hostID, agentVersion, logger)`;
`Collect(ctx) (*HostReport, error)`.
- **`CloudflaredProber`** interface + **`SystemctlProber`** (`systemctl is-active`).
- **`Client`** — `NewClient(cfg config.HubConfig, logger) (*Client, error)`;
`Report(ctx, *HostReport) (*ControlEnvelope, error)`; typed `*TransportError` / `*HTTPError`.
- **`Loop`** — `NewLoop(collector, client, interval, logger)`; `Run(ctx) error`. Constants
`MinPollSeconds=60` / `MaxPollSeconds=3600`.
### Config additions (`internal/config`)
- `HubConfig{URL, HostID, APIKey, PollSeconds, TimeoutSeconds, CAFile}` on `Config.Hub`.
- `FELHOM_AGENT_HUB_{URL,HOST_ID,API_KEY,POLL_SECONDS,TIMEOUT_SECONDS,CA_FILE}` overlay
(int parse errors warn to stderr + keep file value, never crash).
- `HubConfig.Validate()` (mode-aware — proxmox-only selftests unaffected; https required
except loopback for tests), `HubConfig.WithDefaults()` (900s/30s), `Redacted()` blanks the key.
- `configs/agent.example.json` gains `hub` (and `authz`) blocks.
### Daemon-loop behaviour (`main.go`)
- No `--selftest` flag → **daemon**: validate proxmox + hub config → build read-path proxmox
client, collector, hub client, loop → `signal.NotifyContext(SIGINT, SIGTERM)` → `loop.Run`.
- **Immediate first report**, then tick at the interval; adopt the hub's
`poll_interval_seconds` (clamped [60,3600], reset the ticker on change).
- **Resilient**: any collect/report error is logged and the loop continues (survives hub 5xx
and transient proxmox read errors). Clean `nil` return on context cancel.
- **`--selftest=hub`**: one collect + report; prints the report it would send + the envelope.
- Startup line logs host_id/url/interval with the **key redacted**; no secret ever logged.
### Explicitly deferred (defined now, not active)
- **Defined-but-EMPTY** this slice (slices 5/6 fill): `storage_targets`, `backups`,
`restore_tests`, `pbs_snapshots`, `audit_tail` — emitted as typed empty `[]`.
- **Parsed-but-IGNORED** (slice 4 / reconcile consumes): the envelope's `blocked`,
`desired_generation`, `has_signed_ops` — logged at most, never acted on.
- No per-guest work queue (zero Proxmox mutations this slice); no canonical JSON (nothing
signs the report); no controller_version (slice 8) — emitted `""`.
### proxmox surface
**No changes to `internal/proxmox` or `internal/authz`.** No new proxmox surface was needed:
`ListLXC` already returns status/maxmem/maxdisk and `GuestConfig` returns cores. The task's
`proxmoxReader` sketch (node-arg / pointer returns / `LXC` type) was **adapted to the real
exports** — `Node()` on the client, value returns, `proxmox.Guest` — per its instruction.
### Test matrix (all green)
- **report**: field names match §4; empty collections serialize as `[]` not `null`; spec
omitted when unknown.
- **client**: sets `Bearer`; non-2xx → `*HTTPError` (status preserved); transport → `*TransportError`;
**asserts the bearer token never appears in any error string**.
- **collector**: `NodeStatus`→host block; `ListLXC`+`GuestConfig`→guest spec; a failing
`GuestConfig` → `status="unknown"` + omitted spec + **still returns a report**; a failing
`NodeStatus` → hard error; cloudflared probe error → `"unknown"`.
- **loop**: immediate first report; continues after an injected report error (≥3 cycles);
adopts + clamps the envelope interval (cycle-level) and applies a slower interval in `Run`.
- **config**: hub validate cases, key redaction, env overlay + defaults.
### Verification
- `go build/vet/test` green locally (go1.26.0) and on the build server (go1.26.0). No live hub
or `systemctl` in unit tests (mock transport + fake prober/collector/reporter).
### Repo state
- Branch: `main` only. Version 0.3.0. Dep unchanged (`golang.org/x/crypto v0.52.0`).