slice 10A: hub desired-state serving + signed-jobs queue (Down channel) (hub v0.9.0)

Serve operator intent to authenticated hosts: PUT /admin/hosts/{id}/desired-state
(global key) bumps desired_generation; GET /hosts/{id}/desired-state + /jobs are
per-host self-scoped; the host-report envelope now carries the real generation +
has_signed_ops. New signed_jobs table + store methods. Desired-state stored/served
opaquely (agent owns the schema). Cross-repo golden (envelope + desired-state)
byte-identical with felhom-agent; doc 03 §4/§9 updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 19:03:14 +02:00
parent f9af3243b9
commit e54f882e70
8 changed files with 669 additions and 30 deletions
+36 -26
@@ -4,41 +4,51 @@
---
# REPORT — Slice 9 (hub + docs): host metrics to the controller — `cpu_temp_c` wire field + docs (2026-06-10)
# REPORT — Slice 10A (hub half): desired-state serving — the "Down" channel (hub v0.9.0) (2026-06-10)
## Type
Cross-repo wire-contract + documentation update for **slice 9** (implementation: `felhom-agent`
v0.14.0 + `felhom-controller` v0.39.0). **No hub code change, no hub version bump.**
TASK (CC-implemented). The hub half of slice 10A. Pairs with `felhom-agent` v0.15.0.
## What changed (hub)
- **Cross-repo host-report golden** (`hub/internal/api/testdata/host-report.golden.json`) gained
**`host.cpu_temp_c: 47`**, kept **byte-identical** with
`felhom-agent/internal/hub/testdata/host-report.golden.json` (the duplicated-contract discipline;
manual diff confirmed identical). No code change: the full `report_json` already persists the field
verbatim, and the hub's host parse-struct ignores the extra key — the golden-contract test
(`host_test.go`) still passes. CPU temp on the operator dashboard is an optional later freebie.
- `hub/CHANGELOG.md` records the contract update (no version bump).
The hub now **serves operator intent** down to already-authenticated hosts; the control envelope stops
returning placeholders and carries the host's real generation + signed-jobs flag.
## What changed (doc 03 — host-agent)
### Store (`internal/store`)
- New `signed_jobs` table (per-host **opaque** signed-op blob queue). New methods: `SetHostDesired`
(set desired-state + **atomically bump `desired_generation`**), `EnqueueSignedJob` / `GetSignedJobs`
/ `CountSignedJobs`. The `hosts` table's previously-inert `desired_json` / `desired_generation`
columns are now live.
- **§6** — added **`GET /host/metrics`** to the local-API surface: host-wide health
(cpu%/mem/load/uptime/`cpu_temp_c`) + per-storage capacity for the customer's monitoring view.
Reuses the slice-4 collector (no duplicate collection); **host-wide, token-authed, fresh** (not the
15-min hub snapshot); noted the **one-customer-per-host** assumption.
- **§9 slice table** — **defined + marked slice 9** (the roadmap previously jumped 8→10; this fills
it), incl. the assumption + out-of-scope items (multi-tenant filtering, time-series history). Added
a slice-9 entry to the doc changelog.
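A minimal sketch of the store behaviour listed under "Store" above, assuming an in-memory map in place of the real tables. The method names `SetHostDesired`, `EnqueueSignedJob` and `CountSignedJobs` come from this report; the signatures, the `memStore` and `hostRow` types, and the error values are hypothetical. The well-formed-JSON check is folded into the store method only to keep the example short; where the hub actually performs it is not stated here.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"sync"
)

// hostRow stands in for the two hosts-table columns named in the report.
type hostRow struct {
	desiredJSON       json.RawMessage
	desiredGeneration int64
}

// memStore is a hypothetical in-memory stand-in for the real store.
type memStore struct {
	mu    sync.Mutex
	hosts map[string]*hostRow
	jobs  map[string][][]byte // per-host opaque signed-op blobs, oldest first
}

var (
	errUnknownHost = errors.New("unknown host")
	errMalformed   = errors.New("desired state is not well-formed JSON")
)

func newMemStore(hostIDs ...string) *memStore {
	s := &memStore{hosts: map[string]*hostRow{}, jobs: map[string][][]byte{}}
	for _, id := range hostIDs {
		s.hosts[id] = &hostRow{}
	}
	return s
}

// SetHostDesired stores the body verbatim and bumps the generation under a
// single lock, so a reader never sees a new body with an old generation.
func (s *memStore) SetHostDesired(id string, body []byte) (int64, error) {
	if !json.Valid(body) { // only well-formedness is checked; the schema stays opaque
		return 0, errMalformed
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	h, ok := s.hosts[id]
	if !ok {
		return 0, errUnknownHost
	}
	h.desiredJSON = append(json.RawMessage(nil), body...)
	h.desiredGeneration++
	return h.desiredGeneration, nil
}

// EnqueueSignedJob appends a pre-signed blob without inspecting it.
func (s *memStore) EnqueueSignedJob(id string, blob []byte) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.hosts[id]; !ok {
		return errUnknownHost
	}
	s.jobs[id] = append(s.jobs[id], append([]byte(nil), blob...))
	return nil
}

// CountSignedJobs is what a has_signed_ops flag could be derived from.
func (s *memStore) CountSignedJobs(id string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.jobs[id])
}

func main() {
	s := newMemStore("host-a")
	gen, err := s.SetHostDesired("host-a", []byte(`{"any":"shape"}`))
	fmt.Println("generation:", gen, "err:", err)
	_ = s.EnqueueSignedJob("host-a", []byte("opaque-blob"))
	fmt.Println("has_signed_ops:", s.CountSignedJobs("host-a") > 0)
}
```

In a SQL-backed store the same guarantee would more likely come from a single `UPDATE` or a transaction than from a mutex; the lock here only illustrates the "set and bump together" contract.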
### API (`internal/api`)
- **`PUT /api/v1/admin/hosts/{id}/desired-state`** (global key) — set + bump generation; body stored +
served **opaquely** (validated only as well-formed JSON — the agent owns the schema).
- **`GET /api/v1/hosts/{id}/desired-state`** (per-host key, **self-scoped**) — `{generation,
desired_state}`; host A's key cannot read host B (403); global key may read any.
- **`GET /api/v1/hosts/{id}/jobs`** (per-host key, self-scoped) — serves the host's pending opaque
signed-op blobs, oldest first (verify+execute is 10B).
- **`POST /api/v1/admin/hosts/{id}/jobs`** (global key) — enqueue a pre-signed opaque blob (the hub
holds no signing key).
- The host-report **control envelope** now reports the real `desired_generation` + `has_signed_ops`,
degrading safely to defaults on a store error.
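The self-scoping rule for the per-host `GET` endpoints can be sketched as a small decision function. This is an illustration under assumptions, not the hub's handler code: the `principal` type and `authorizeHostRead` are hypothetical names, and only the status codes (401 without a token, 403 for host A reading host B, success for the global key) are taken from this report.

```go
package main

import (
	"fmt"
	"net/http"
)

// principal is a hypothetical result of token lookup: either the global
// (operator) key or a key bound to exactly one host.
type principal struct {
	global bool
	hostID string
}

// authorizeHostRead returns the HTTP status a self-scoped read would yield.
// A nil principal models a request with no valid token.
func authorizeHostRead(p *principal, targetHostID string) int {
	switch {
	case p == nil:
		return http.StatusUnauthorized // no token
	case p.global:
		return http.StatusOK // global key may read any host
	case p.hostID == targetHostID:
		return http.StatusOK // a host reading its own state
	default:
		return http.StatusForbidden // host A asking for host B
	}
}

func main() {
	hostA := &principal{hostID: "host-a"}
	fmt.Println(authorizeHostRead(hostA, "host-a"))
	fmt.Println(authorizeHostRead(hostA, "host-b"))
	fmt.Println(authorizeHostRead(&principal{global: true}, "host-b"))
	fmt.Println(authorizeHostRead(nil, "host-a"))
}
```

The admin `PUT` and `POST` routes invert the rule (global key only, per-host keys get 403), which would be a separate check rather than a flag on this one.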
## Why (the slice 9 thesis)
## Tests (green)
- admin-set bumps the generation + serves the latest body; global-key-only (per-host 403, malformed
400, unknown host 404); `GET /desired-state` self-scoped (A→B 403, global any, no-token 401);
envelope carries generation + `has_signed_ops` flips on enqueue; `GET /jobs` self-scoped oldest-first;
cross-repo golden round-trip (set → fetched back unchanged), **byte-identical** with felhom-agent.
The de-privileged controller (slice 8C) sees only its own cgroup — it can't read the host. Slice 9
re-serves the agent's existing host + storage observation to the customer, plus the one new collector
(CPU/chassis temp, graceful-null). On-ethos for a data-sovereignty product: the customer sees their
own box's health.
## Docs
- Doc 03 §4 (control loop live: heartbeat → envelope generation/jobs → fetch-on-change → reconcile
benign / gate destructive) + §9 slice table (**10A done**; 10B signed-op execution / 10C escrow
consumption / 10D DR capstone pending; the `restore_directive` field exists now, consumed in 10D).
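The fetch-on-change step of the control loop can be sketched from the agent's point of view. The field names `desired_generation` and `has_signed_ops` appear in this report; the flat envelope layout, the `envelope` and `plan` types, and `planAfterHeartbeat` are assumptions made for illustration, and the actual agent logic may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envelope holds the two control fields named in the report. The flat JSON
// shape is an assumption; the real envelope may nest or carry more fields.
type envelope struct {
	DesiredGeneration int64 `json:"desired_generation"`
	HasSignedOps      bool  `json:"has_signed_ops"`
}

// plan says which follow-up requests the agent would make after a heartbeat.
type plan struct {
	fetchDesired bool // GET /hosts/{id}/desired-state
	fetchJobs    bool // GET /hosts/{id}/jobs
}

// planAfterHeartbeat fetches desired state only when the hub's generation
// differs from the one last applied, and jobs only when the flag is set.
func planAfterHeartbeat(env envelope, lastApplied int64) plan {
	return plan{
		fetchDesired: env.DesiredGeneration != lastApplied,
		fetchJobs:    env.HasSignedOps,
	}
}

func main() {
	raw := []byte(`{"desired_generation": 4, "has_signed_ops": false}`)
	var env envelope
	if err := json.Unmarshal(raw, &env); err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("%+v\n", planAfterHeartbeat(env, 3))
}
```

Comparing with `!=` rather than `>` is a design choice in this sketch: it also triggers a fetch if the hub's generation ever moves backwards, for example after a restore.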
## Deferred / not built
## Deferred / out of scope
- Signed-op **execution** + signature verification → **10B** (10A only serves the queue + flag).
- **Restore-mode / re-enroll** consumption (a new box's first directive) → **10D**; 10A serves
already-authenticated hosts only. Rich desired-state editing UX → doc-05 (10A's admin-set is minimal).
Multi-tenant host-metric filtering (one-customer-per-host assumed); historical/time-series metric
storage (this is a live snapshot view). No secrets committed.
## Pending
- Build + deploy hub v0.9.0 (+ agent v0.15.0) and live-validate against the demo host (admin-set
benign+destructive → generation bump → agent fetch → reconcile/gate; self-scope refusal).