doc 03: slice 8A implemented — §6a local-API impl, §9 back-half row, §13 (2026-06-10)
§6a (new): the local-API implementation — stable leaf-SHA-256 pin, token->guest self-scoping (cross-guest 403), bootstrap.json contract + controller ingestion (c), baked-controller deploy (no registry cred in guest), firewall narrowing.
§9 slice table: back-half = slice 8A implemented (8B quiesce / 8C de-priv split out); build-golden.sh bakes the controller.
§13 + doc changelog.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@@ -124,6 +124,34 @@ A controller can only `POST /rollback` (or snapshot/backup) **its own** guest
 token → guest and authorizes per guest, so a compromised controller's blast radius is
 **self-scoped and bounded** to its own guest.

+### 6a. Implementation (slice 8A — implemented)
+
+**Status: implemented** (agent v0.10.0 `internal/localapi`; controller v0.35.0 `internal/bootstrap`
++ `internal/agentapi`). Grounded by `documentation/tests/slice8a-channel-deploy-spike-findings.md`
+(commit `4a81a96`). The 7 endpoints above are live; `GET /backup/due` is **thin** in 8A (the
+quiesce-on-due consumer is 8B), the rest wrap the existing slice-5/6/7 machinery.
+
+- **Transport / pin.** The agent serves a **persisted self-signed leaf** bound to the host bridge IP
+on a fixed port (default `:8443`). The controller pins the **leaf-cert SHA-256** (decision:
+consistency with the agent's Proxmox/PBS cert pinning), carried in its bootstrap. The leaf is
+generated **once and persisted**, so its fingerprint is stable across agent restarts (a fresh cert
+each boot would invalidate every already-issued bootstrap pin). Defense-in-depth: the listener
+binds the **bridge IP** (not `0.0.0.0`) and a host firewall rule narrows the port to the guest
+bridge subnet (`configs/felhom-localapi-firewall.example`) — the **per-guest token stays the gate**.
+- **Token custody.** The per-guest token is minted by the back-half (§9), persisted as a **SHA-256
+hash** only (the plaintext exists transiently at mint→write-to-mount, then is discarded), in a
+durable last-write-wins map. **Self-scoping** is enforced by the token→guest map alone: the VMID is
+resolved from the token, never from a caller-supplied id; an explicit `vmid` that disagrees is
+refused (**403**) and the Proxmox op is never issued for the other guest. Absent/unknown token → 401.
+- **The bootstrap contract `(c)`.** The agent emits a stable `bootstrap.json`
+(`schema: felhom.bootstrap/v1`: customer identity, hub, and the local-API `{endpoint, fingerprint,
+token}`) into a read-only config mount; the controller **ingests it on first run and seeds its own
+`controller.yaml`, skipping setup mode** (idempotent — never clobbers an existing config; fail-safe
+— a malformed/absent bootstrap stays in setup). The agent emits the contract; the controller owns
+the translation — they stay decoupled (no shared config schema). **No registry credential ever
+enters a guest**: the controller image is **baked into the golden** (§9), so deploy does no
+`docker login`/`pull`.
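
Reviewer sketch (not part of the commit): the token-custody rules in the bullets above can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions, not the agent's actual code. `TokenStore` and `authorize` are hypothetical names, the store is an in-memory dict rather than the durable map the doc describes, and keying the last-write-wins map by guest (so that re-minting replaces the old hash) is an assumption about how that map behaves.

```python
import hashlib
import hmac
import secrets
from typing import Dict, Optional, Tuple


class TokenStore:
    """Hypothetical store: keeps only SHA-256(token) per guest, never the plaintext."""

    def __init__(self) -> None:
        # vmid -> hex SHA-256 of that guest's current token (last write wins).
        self._hash_by_vmid: Dict[int, str] = {}

    @staticmethod
    def _digest(token: str) -> str:
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def mint(self, vmid: int) -> str:
        """Mint a token, persist only its hash, and hand the plaintext back once."""
        token = secrets.token_urlsafe(32)
        self._hash_by_vmid[vmid] = self._digest(token)
        return token  # the caller writes it to the config mount, then discards it

    def resolve(self, token: Optional[str]) -> Optional[int]:
        """Return the VMID the token belongs to, or None if absent or unknown."""
        if not token:
            return None
        presented = self._digest(token)
        for vmid, stored in self._hash_by_vmid.items():
            # Constant-time comparison of the two hex digests.
            if hmac.compare_digest(presented, stored):
                return vmid
        return None


def authorize(
    store: TokenStore, token: Optional[str], requested_vmid: Optional[int] = None
) -> Tuple[int, Optional[int]]:
    """Return (http_status, vmid_to_act_on); the VMID always comes from the token."""
    vmid = store.resolve(token)
    if vmid is None:
        return 401, None  # absent or unknown token
    if requested_vmid is not None and requested_vmid != vmid:
        return 403, None  # explicit cross-guest id: refuse, issue no operation
    return 200, vmid
```

In this sketch a caller-supplied `vmid` can only confirm the guest the token already resolves to. It never selects a target, which is what keeps a compromised controller bounded to its own guest.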
+
 ## 7. Storage manifest & reconciliation

 The manifest is the load-bearing contract. It absorbs the **persisted** disk-state fields that
||||||
@@ -307,7 +335,7 @@ identity" is shorthand for two different operations:
 NOT auto-regenerate host keys after a restore, so the golden carries the regeneration, keeping
 the agent host-side-only). It then receives a **fresh** controller identity (host-id, local
 token, hub channel), **fresh restic repo identity**, and a fresh tunnel association — all minted
-in the back half (slice 8).
+in the back half (slice 8A — implemented).
 - **Guest-loss DR (customer backup) → preserve continuity identity, reset only what would
 collide.** The restored guest must *continue* the customer's world: **keep** the restic repo
 identity (resetting it orphans the existing backup chain — a silent data-continuity bug), the
@@ -332,11 +360,13 @@ this path — bring up + reattach external storage and it is whole. This is full

 | Capability | Slice | Status |
 |---|---|---|
-| Golden base image build (root@pam, at enrollment) | **7** | **recipe implemented** (`felhom-agent/configs/build-golden.sh`, incl. the F3 host-key unit); golden archived at enrollment |
+| Golden base image build (root@pam, at enrollment) | **7** | **recipe implemented** (`felhom-agent/configs/build-golden.sh`, incl. the F3 host-key unit; **now also bakes the controller image + a controller-bootstrap unit**, slice 8A); golden archived at enrollment |
 | Unified bring-up **front half** (restore→reset identity→size→attach storage), journaled + compensating rollback | **7** | **implemented** (agent v0.8.0, `internal/reconcile/bringup.go`) |
 | **Guest-loss DR** (front half + DR identity policy; no controller deploy) | **7** | **implemented** (v0.8.0, `dr_guest_loss` mode — continuity identity preserved) |
 | PBS recovery-code escrow **creation** + **hub opaque storage** (§8a) | **7** | **implemented** (agent v0.9.0 `internal/escrow`; hub v0.8.0 `PUT /hosts/{id}/escrow`) |
-| Provisioning **back half** — deploy controller, hand bootstrap config, mint per-guest local token | **8** | deferred — needs the controller-deploy path + agent↔controller local API (§6) |
+| **Local API** server (§6) + provisioning **back half** — deploy controller, hand bootstrap config, mint per-guest local token | **8A** | **implemented** (agent v0.10.0 `internal/localapi` + `internal/provision`; controller v0.35.0 `internal/bootstrap` + `internal/agentapi`). The controller image is **baked into the golden** (no registry cred in any guest); the back-half mints the token, writes a 0600 `bootstrap.json` to a `chown 100000:100000` config mount, and `pct set`-attaches it read-only; the golden's baked unit deploys the controller, which ingests the bootstrap, comes up configured, and reaches the agent over the bridge (leaf-pin + token). Validated live end-to-end on the demo. |
+| **Quiesced app-consistent backup** (`/backup/due`-driven stack-stop) | **8B** | deferred — `/backup/due` is thin in 8A; the controller quiesce-then-`POST /backup` loop is 8B |
+| **Controller de-privileging** (retire the disk-execution subsystem; new customer disk endpoints behind the slice-4 data-bearing classifier) | **8C** | deferred |
 | **Host/hardware loss** DR — re-enroll in "restore mode"; hub serves identity / PBS namespace / tunnel token / storage manifest / restore directive | **10** | deferred — needs hub desired-state serving; hub store today holds only `{host_id, customer_id, api_key}` (slice 3) |
 | PBS escrow **consumption** (recover `K` on a new box) | **10** | deferred — exercised by host-loss DR |
 | Golden base refresh cadence + fleet versioning | post-launch | operational, non-blocking (§13) |
@@ -386,10 +416,13 @@ argument for §3's root-minimization and a small, auditable agent.

 Resolved here: tunnel placement (host, agent-managed, own systemd service), the
 reconcile-vs-jobs fork (hybrid, gated by reversibility), agent process model, self-update
-ownership, the local-API surface, the storage-manifest schema, **provision-by-restore**, the
-**provision/DR slice boundary** (7 front-half + guest-loss DR + escrow creation; 8 provisioning
-back-half; 10 host-loss DR + escrow consumption — §9 table), the **PBS recovery-code escrow
-design** (§8a), and the **root-vs-API boundary** (Phase 3, B3).
+ownership, the local-API surface (**implemented, slice 8A — §6a**), the storage-manifest schema,
+**provision-by-restore**, the **provision/DR slice boundary** (7 front-half + guest-loss DR +
+escrow creation; **8A provisioning back-half + local API — implemented**; 8B quiesced backup; 8C
+controller de-privileging; 10 host-loss DR + escrow consumption — §9 table), the **PBS
+recovery-code escrow design** (§8a), and the **root-vs-API boundary** (Phase 3, B3 — the slice-8A
+back-half's host-side `chown`/`pct set` bind-mount is a deliberate, narrow addition OUTSIDE the
+API token, in `internal/provision`, not the 3-exception `proxmox.Privileged` fence).

 Still open:

@@ -413,6 +446,19 @@ This doc hands the implementation three contracts it was waiting on:

 ## Changelog — design-review + Phase-3 fold-in (2026-06-08)

+### Slice-8A implemented: local API + provisioning back-half (2026-06-10)
+- NEW §6a: the **local-API implementation** (agent v0.10.0 `internal/localapi`; controller v0.35.0
+`internal/bootstrap` + `internal/agentapi`) — persisted self-signed leaf with a **stable
+leaf-SHA-256 pin**, the **token→guest self-scoping** (explicit cross-guest id → 403, op never
+issued), the stable **`bootstrap.json` contract + controller ingestion `(c)`** (seed
+`controller.yaml`, skip setup; idempotent + fail-safe), and the **baked-controller deploy** (no
+registry credential in any guest). Firewall narrowing = defense-in-depth; the token stays the gate.
+- §9: the provisioning **back half** row is now **slice 8A — implemented** (split from the old "8");
+`build-golden.sh` now **bakes the controller + a bootstrap unit**; quiesced backup → 8B, controller
+de-privileging → 8C. The host-side `chown`/`pct set` bind-mount is a deliberate narrow surface in
+`internal/provision` (NOT the 3-exception `proxmox.Privileged` fence). Validated live end-to-end.
+- §13 updated accordingly.
+
 ### Slice-7 scope + escrow design (2026-06-09)
 - §9 rewritten: the bring-up primitive is a **shared front half only** — identity-reset policy is
 **scenario-specific** (provision = fresh everything; guest-loss DR = preserve restic/tunnel/hub