feat(hub): host-report client + collector + first daemon loop (slice 3, v0.3.0)
internal/hub: the agent's first daemon — a periodic read-only host-report POSTed to
the hub (the heartbeat; no separate ping).

- HostReport wire contract (shared field-for-field with the hub ingest): host metrics,
  guests (vmid + spec), cloudflared status; storage/backups/restore-tests/pbs/audit
  collections DEFINED but emitted empty (slices 5/6 fill).
- Collector over a read-only proxmoxReader (adapted to the real proxmox surface; no
  proxmox changes) + a CloudflaredProber. Partial-failure: NodeStatus fail = hard
  (skip POST); per-guest GuestConfig fail = status "unknown", still report.
- Client: Bearer-auth POST, standard TLS (system roots / optional ca_file), typed
  TransportError/HTTPError, token never in errors.
- Loop: immediate first report, adopt hub poll_interval (clamp [60,3600]), resilient
  to collect/report errors, clean ctx-cancel shutdown.
- ControlEnvelope: only poll_interval_seconds acted on;
  blocked/desired_generation/has_signed_ops parsed-but-ignored (slice 4).
- config: HubConfig + FELHOM_AGENT_HUB_* overlay + mode-aware HubConfig.Validate +
  WithDefaults + hub-key redaction; example config updated.
- main: no-selftest mode is now the daemon; added --selftest=hub. Version -> 0.3.0.

Tests: report serialization, client (incl. token-redaction), collector
partial-failure, loop continuation+interval adoption, config. internal/proxmox +
internal/authz untouched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -3,6 +3,62 @@
All notable changes to **felhom-agent** are recorded here. Update on every code
change that gets pushed.

## v0.3.0 — hub client + host-report + first daemon loop (slice 3) (2026-06-08)

The agent's first daemon: a periodic read-only host-report POSTed to the hub (the
heartbeat). No Proxmox mutations, no desired-state/signed-op consumption, no
storage/backup collection yet — those are slices 4/5/6.

### Added
- **`internal/hub`** package:
  - **`HostReport`** wire contract (`report.go`) shared field-for-field with the hub
    ingest: host metrics, guests (`vmid` + spec), `cloudflared` status, and the
    `storage_targets`/`backups`/`restore_tests`/`pbs_snapshots`/`audit_tail`
    collections **defined but emitted empty** (typed `[]`, slices 5/6 fill them).
  - **`Collector`** (`collect.go`) builds the report from a read-only `proxmoxReader`
    (adapted to the real `internal/proxmox` surface — node held by the client, value
    returns, `proxmox.Guest`) + a `CloudflaredProber`. Partial-failure policy: a
    failed `NodeStatus` is a hard error (skip the POST); a failed per-guest
    `GuestConfig` degrades that guest to `status="unknown"` (spec omitted) but still
    sends; a cloudflared probe failure → `"unknown"`, never fatal.
  - **`CloudflaredProber`** + `SystemctlProber` (`systemctl is-active cloudflared`;
    read-only — NOT a Privileged/root op; tunnel management is a later slice).
  - **`Client`** (`client.go`): `POST /api/v1/host-report` with
    `Authorization: Bearer <key>`, standard TLS (system roots or optional `ca_file`;
    verification always on). Typed `*TransportError` / `*HTTPError`; the bearer token
    never appears in any error.
  - **`Loop`** (`loop.go`): the daemon — immediate first report then tick; adopts the
    hub's `poll_interval_seconds` clamped to [60,3600]; resilient (a collect/report
    error is logged and the loop continues); clean shutdown on context cancel.
  - **`ControlEnvelope`**: only `poll_interval_seconds` is acted on; `blocked` /
    `desired_generation` / `has_signed_ops` are parsed-but-ignored (logged at most)
    pending reconcile (slice 4).
- **Config**: `HubConfig` (url/host_id/api_key/poll_seconds/timeout_seconds/ca_file),
  `FELHOM_AGENT_HUB_*` env overlay, `HubConfig.Validate()` (mode-aware — proxmox-only
  `--selftest=read|task` still runs without hub config), `WithDefaults()`, and
  `Redacted()` now also blanks the hub key. `configs/agent.example.json` gains `hub`
  (and `authz`) blocks.
- **`cmd/felhom-agent`**: the no-`--selftest` mode is now the **daemon** (poll loop);
  added **`--selftest=hub`** (one collect+report, prints the report + envelope).
  Version 0.2.0 → 0.3.0.
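
The partial-failure policy in the `Collector` entry can be sketched as follows. This is a minimal illustration, not the code in `collect.go`: the `source`, `guest`, and `report` types, the function-valued fields, and the hard-coded `"running"` status are all hypothetical stand-ins, since the real `proxmoxReader` signatures are not shown in this changelog.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnavailable is a hypothetical failure used to exercise the policy.
var errUnavailable = errors.New("unavailable")

// guest is a simplified, hypothetical report entry (not the real wire type).
type guest struct {
	VMID   int
	Status string
	Cores  *int // nil means the spec is omitted
}

// report is a trimmed-down, hypothetical host report.
type report struct {
	Guests      []guest
	Cloudflared string
}

// source is a hypothetical stand-in for the read-only reader plus the
// cloudflared prober; the real signatures are assumptions.
type source struct {
	nodeStatus  func() error
	vmids       []int
	guestConfig func(vmid int) (cores int, err error)
	cloudflared func() (string, error)
}

// collect applies the policy: node failure is fatal, guest and probe
// failures only degrade the affected field.
func collect(s source) (*report, error) {
	// A failed node status is a hard error: the caller skips the POST.
	if err := s.nodeStatus(); err != nil {
		return nil, fmt.Errorf("node status: %w", err)
	}
	rep := &report{Guests: []guest{}}
	for _, id := range s.vmids {
		cores, err := s.guestConfig(id)
		if err != nil {
			// Degrade this guest only; the report is still sent.
			rep.Guests = append(rep.Guests, guest{VMID: id, Status: "unknown"})
			continue
		}
		c := cores
		// "running" is a placeholder; a real collector would use the listed status.
		rep.Guests = append(rep.Guests, guest{VMID: id, Status: "running", Cores: &c})
	}
	// A probe failure is never fatal; it maps to "unknown".
	state, err := s.cloudflared()
	if err != nil {
		state = "unknown"
	}
	rep.Cloudflared = state
	return rep, nil
}

func main() {
	s := source{
		nodeStatus: func() error { return nil },
		vmids:      []int{100, 101},
		guestConfig: func(vmid int) (int, error) {
			if vmid == 101 {
				return 0, errUnavailable
			}
			return 2, nil
		},
		cloudflared: func() (string, error) { return "", errUnavailable },
	}
	rep, err := collect(s)
	if err != nil {
		fmt.Println("skip POST:", err)
		return
	}
	for _, g := range rep.Guests {
		fmt.Println(g.VMID, g.Status, "spec present:", g.Cores != nil)
	}
	fmt.Println("cloudflared:", rep.Cloudflared)
}
```

Returning an error only for the node-level failure keeps the caller's decision simple: a non-nil error means no POST, and anything else is a report worth sending even if parts of it are marked `"unknown"`.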

### Tests
- Report serialization (field names; empty collections are `[]` not `null`; spec
  omitted when unknown); client (Bearer header, non-2xx→`*HTTPError`,
  transport→`*TransportError`, **token never in error**); collector (host mapping,
  guest spec, per-guest failure degrades-but-still-reports, NodeStatus hard error,
  cloudflared error→unknown); loop (immediate first report, continuation after an
  injected error, interval adoption + clamp); config (hub validate/redact/env).
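
The serialization points in the list above (`[]` rather than `null`, spec omitted when unknown) follow from how Go's `encoding/json` treats nil slices and nil pointers. The sketch below assumes trimmed-down, hypothetical types: only the JSON names `vmid`, `storage_targets`, and `backups` come from this changelog, and the `[]string` element type is a placeholder rather than the real wire shape.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// spec and guest are hypothetical shapes; the "cores" and "status" JSON
// names are assumptions for illustration.
type spec struct {
	Cores int `json:"cores"`
}

type guest struct {
	VMID   int    `json:"vmid"`
	Status string `json:"status"`
	Spec   *spec  `json:"spec,omitempty"` // a nil pointer is dropped from the output
}

// hostReport is a trimmed-down, hypothetical report.
type hostReport struct {
	Guests         []guest  `json:"guests"`
	StorageTargets []string `json:"storage_targets"` // placeholder element type
	Backups        []string `json:"backups"`         // placeholder element type
}

// newReport initialises every slice so it marshals as [] instead of null.
func newReport() hostReport {
	return hostReport{
		Guests:         []guest{},
		StorageTargets: []string{},
		Backups:        []string{},
	}
}

// encode marshals a report to its JSON text.
func encode(r hostReport) (string, error) {
	b, err := json.Marshal(r)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	r := newReport()
	// A guest whose config could not be read: status "unknown", no spec.
	r.Guests = append(r.Guests, guest{VMID: 101, Status: "unknown"})
	out, err := encode(r)
	if err != nil {
		fmt.Println("encode:", err)
		return
	}
	fmt.Println(out)
}
```

A zero-value `hostReport{}` would marshal its nil slices as `null`, which is why a constructor (or an equivalent normalisation step before marshalling) is a common way to keep the wire shape stable for the receiving side.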

### Notes
- `internal/proxmox` and `internal/authz` were **not touched** — no new proxmox
  surface was needed (`ListLXC` already exposes status/maxmem/maxdisk; `GuestConfig`
  exposes cores). The task's `proxmoxReader` sketch (node-arg/pointer/`LXC`) was
  adapted to the real exports as instructed.
- **Defined-but-empty** this slice: `storage_targets`, `backups`, `restore_tests`,
  `pbs_snapshots`, `audit_tail` (slices 5/6). **Parsed-but-ignored**: the envelope's
  `blocked`/`desired_generation`/`has_signed_ops` (slice 4).

## v0.2.0 — `authz` signed-op verifier (slice 2) (2026-06-08)

Production form of the Phase-4 signing primitive: a key-type-agnostic SSHSIG