spike(slice7): golden base build + live bring-up front-half findings

SPIKE-RUNBOOK Slice 7 Phase 0, executed live on demo-felhom. Golden base
(Debian 13 + Docker, nesting=1,keyctl=1, identity-cleaned) built as root@pam,
archived, then token-restored to a throwaway guest and brought up LINK-UP with
the FelhomAgent token (restore/config/resize/start all token-covered).

Key findings:
- MAC reset is UNCONDITIONAL — vzrestore preserves the archived MAC (F1).
- hostname reset is host-side token config (F2).
- machine-id auto-regenerates on first boot (free); SSH host keys do NOT —
  ssh.service fails, agent must run ssh-keygen -A guest-side OR bake a first-boot
  unit (F3, the one surface-widening design consequence).
- keyctl-through-restore is functional (Docker hello-world in the restored guest);
  storage driver overlayfs (F5/F6).
- Settles the §9 / doc-13 identity-reset field list for the provision path.

Verdict: READY to spec the unified bring-up reconcile job (Phase 7.1).
Golden archive kept; both spike guests torn down.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 20:48:50 +02:00
parent e7ed8a8483
commit 33429933af
@@ -0,0 +1,141 @@
# Slice 7 Phase 0 — Golden base build + live bring-up (front half): Findings
**Host:** `demo-felhom` (192.168.0.162) — Proxmox VE 9.2.2, Debian 13 (Trixie). Bridge `vmbr0`,
LAN DHCP (router at 192.168.0.1).
**Date:** 2026-06-09. **Driver:** SPIKE-RUNBOOK (root@pam CLI for the golden build; the
`FelhomAgent` API token for the per-customer front-half ops — restore/config/resize/start).
**VMIDs:** golden-build `9100`, restored-test `9101` (both torn down; golden archive kept).
> This document presents **data, observations, and the resulting design deliverable** (the
> identity-reset field list). It feeds the spec of the unified bring-up reconcile job (Phase 7.1).
---
## 1. Provenance / setup
| Component | Value |
|---|---|
| Template | `local:vztmpl/debian-13-standard_13.1-2_amd64.tar.zst` |
| Restore storage | `local-lvm` (lvmthin) |
| Archive storage | `local` (dir, `/var/lib/vz/dump`) |
| Token | `felhom@pve!agent` (the `FelhomAgent` 16-priv role; by reference) |
| Golden archive (KEPT) | `local:backup/vzdump-lxc-9100-2026_06_09-20_41_10.tar.zst` (298 MB) |
| openssh-server (in guest) | `1:10.0p1-7` |
| Docker storage driver | **`overlayfs`** (not `overlay2`/`vfs`) — consistent with phase0 |
Token API smoke (S0): `GET /version` → 200, `GET /nodes/demo-felhom/lxc` → 200. Token holds
`VM.Allocate`, `Datastore.Allocate`/`AllocateSpace`, `VM.Config.{Disk,Network,Options,…}`,
`VM.PowerMgmt`, `VM.Backup`, etc. (full set confirmed via `/access/permissions`).
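The S0 smoke check can be reproduced with plain `curl`. The sketch below is an illustration under stated assumptions, not the runbook's own script: `PVE_HOST`, `PVE_TOKEN_ID`, and `PVE_TOKEN_SECRET` are placeholder variable names, port 8006 and the `/api2/json` prefix are the Proxmox VE defaults, and `--insecure` assumes the host still serves its self-signed certificate.

```shell
# Sketch only: variable and function names are placeholders, not part of the runbook.

# Build the Authorization header Proxmox VE expects for API tokens:
#   PVEAPIToken=<user>@<realm>!<tokenid>=<secret>
pve_auth_header() {
  printf 'Authorization: PVEAPIToken=%s=%s' "$1" "$2"
}

# Print only the HTTP status code of a GET against the API.
# Assumption: self-signed certificate on the host, hence --insecure.
pve_get_status() {
  curl --silent --insecure --output /dev/null --write-out '%{http_code}' \
    --header "$(pve_auth_header "$PVE_TOKEN_ID" "$PVE_TOKEN_SECRET")" \
    "https://${PVE_HOST}:8006/api2/json$1"
}

# Usage (both calls are expected to print 200, as in S0):
#   pve_get_status /version
#   pve_get_status /nodes/demo-felhom/lxc
```

Printing only the status code keeps the check easy to script, and the token secret never appears in the output.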
## 2. Golden recipe (validated — build the real golden from this)
1. **Create (root@pam — the one root step; `keyctl=1` is root-only, phase3 #1):**
```
pct create 9100 local:vztmpl/debian-13-standard_13.1-2_amd64.tar.zst \
--hostname felhom-golden --unprivileged 1 --features nesting=1,keyctl=1 \
--rootfs local-lvm:8 --cores 2 --memory 2048 \
--net0 name=eth0,bridge=vmbr0,ip=dhcp --onboot 0
```
(`pct create` auto-generates SSH host keys — these get wiped in step 3.)
2. **Docker (official apt repo, `trixie` channel):** install `ca-certificates curl`, add Docker's
apt keyring and repo, then install `docker-ce docker-ce-cli containerd.io`. Confirmed working in
the build guest: `docker run --rm hello-world` → "Hello from Docker!", **storage driver `overlayfs`**.
3. **Identity-clean + minimize (guest-internal, run during build):**
```
systemctl stop docker containerd            # quiesce Docker before cleaning
apt-get clean; rm -rf /var/lib/apt/lists/*  # drop apt caches to shrink the archive
rm -f /etc/ssh/ssh_host_*                   # SSH host keys (NOT regenerated on restore, see F3)
truncate -s 0 /etc/machine-id               # systemd regenerates on first boot
rm -f /var/lib/dbus/machine-id; ln -sf /etc/machine-id /var/lib/dbus/machine-id  # D-Bus ID follows machine-id
rm -rf /var/log/*; : > /root/.bash_history  # clear logs and root shell history
rm -f /etc/hostname                         # set per-guest at provision
```
4. **Stop + archive (root vzdump is fine for the build):**
`pct stop 9100; vzdump 9100 --storage local --mode stop --compress zstd`.
5. **Archive carries keyctl (verified, phase3 method — embedded `./etc/vzdump/pct.conf`):**
`features: nesting=1,keyctl=1` · `unprivileged: 1`. **It also carries the build guest's
baked MAC** `BC:24:11:63:43:F4` and `hostname: felhom-golden` — see §4.
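The archive check in step 5 can be scripted roughly as follows. This is a sketch, not the phase3 method verbatim: the function names are hypothetical, and `tar --zstd` assumes a GNU tar with zstd support on the host. The embedded path `./etc/vzdump/pct.conf` is the one named above.

```shell
# Sketch only: function names are hypothetical.

# Print the container config embedded in a vzdump archive.
archive_pct_conf() {
  tar --zstd -xOf "$1" ./etc/vzdump/pct.conf
}

# Read a pct.conf on stdin; succeed only if it carries
# nesting=1 and keyctl=1 in features, plus unprivileged: 1.
conf_has_keyctl() {
  conf="$(cat)"
  printf '%s\n' "$conf" | grep -Eq '^features:.*keyctl=1(,|[[:space:]]*$)' &&
  printf '%s\n' "$conf" | grep -Eq '^features:.*nesting=1(,|[[:space:]]*$)' &&
  printf '%s\n' "$conf" | grep -Eq '^unprivileged:[[:space:]]*1[[:space:]]*$'
}

# Read a pct.conf on stdin; print the identity-bearing lines
# (hostname and netN) that the archive carries along.
conf_identity_fields() {
  grep -E '^(hostname|net[0-9]+):'
}
```

On the spike host this would be run as `archive_pct_conf /var/lib/vz/dump/vzdump-lxc-9100-2026_06_09-20_41_10.tar.zst | conf_has_keyctl`. Piping the same output through `conf_identity_fields` lists the baked hostname and MAC.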
## 3. Result matrix
| Property | As-restored (9101, stopped, pre-reset) | Front-half reset (token) | After link-up boot |
|---|---|---|---|
| keyctl / nesting / unpriv | **preserved** `nesting=1,keyctl=1,unprivileged:1` | — | **Docker runs** (`hello-world` OK) — keyctl *functional*, not just flag-present |
| **MAC** | **KEPT golden's** `BC:24:11:63:43:F4` | reset → fresh `BC:24:11:A6:C0:DE` (PUT net0, **omit hwaddr** → PVE regenerates) | DHCP lease `192.168.0.109`; MAC unique; no LAN collision |
| **hostname** | **KEPT golden's** `felhom-golden` (config field; `/etc/hostname` file absent) | reset → `felhom-spike-9101` (PUT hostname) | **propagated** inside (`hostname` = `felhom-spike-9101`) |
| **machine-id** | **empty** (baked `truncate`) | — | **auto-regenerated by systemd** → `faeffb0bc1b8403089cdd0b981cff109` (unique) |
| **SSH host keys** | **absent** (baked `rm`) | — | **NOT regenerated; `ssh.service` FAILED** — see Finding F3 |
| rootfs | 8 G | **resize → 10 G** (`PUT /resize disk=rootfs size=+2G`) | — |
| mp0 mount | n/a | attached `local-lvm:1,mp=/mnt/spike-test` (transient 500 → retry 200, F4) | present + **writable** (ext4) |
Token ops all ran as `felhom-agent@pve!agent` (restore `vzrestore` OK, start `vzstart` OK) —
the per-customer front half is **fully token-covered**.
## 4. Findings
- **F1 — MAC reset is UNCONDITIONAL.** A token `vzrestore` **preserves the archived MAC**
(9101 came up with the golden's `BC:24:11:63:43:F4`). Every guest restored from the golden
would therefore share one MAC → guaranteed L2 collision. The reconcile job **must** reset MAC
on every provision (host-side: `PUT net0` with `hwaddr` omitted → PVE generates a fresh
`BC:24:11:xx:xx:xx`). This settles the §9 "MAC handling" question for the *provision* path:
always reset. (DR-restore of a *customer* backup is the separate continuity case — §9.)
- **F2 — hostname is carried in the config and must be reset host-side.** The archive's
`hostname:` field restored verbatim (`felhom-golden`); `PUT hostname=` resets it and it
**propagates into the guest** on boot. Host-side, token-covered — no guest-internal step.
- **F3 — machine-id regenerates for free; SSH host keys do NOT (design consequence).**
- `machine-id`: bake `truncate -s 0` → **systemd regenerates it on first boot** (confirmed
non-empty + unique). No agent action needed. ✓ free.
- SSH host keys: bake `rm` → on Debian 13 they are **not** regenerated at boot (the keygen is
a `pct create` hook + a package-install action; **`pct restore` runs neither**). Result:
`openssh-server` is installed and `ssh.service` is **enabled but FAILED** on first boot (no
host keys). `ssh-keygen -A` regenerates them cleanly (unique fingerprint
`SHA256:MAX191…ED25519`, `root@felhom-spike-9101`).
→ **The bring-up reconcile job must regenerate SSH host keys guest-side** (`ssh-keygen -A`,
or `dpkg-reconfigure openssh-server`). **This widens the agent's guest-internal surface**
beyond pure host-side config — the one real design consequence this spike surfaced.
*Alternative to consider in the spec:* bake a one-shot first-boot unit into the golden that
runs `ssh-keygen -A` (keeps regeneration guest-internal-but-baked, so the agent stays
host-side-only). Either way it must be decided; it is **not** free like machine-id.
- **F4 — transient config-lock 500 on back-to-back PUTs.** A `mp0` attach issued immediately
after a `resize` returned **HTTP 500**, then succeeded (200) on retry seconds later — a
config-lock contention, **not** a permission issue (token holds `VM.Config.Disk` +
`Datastore.AllocateSpace`). The reconcile job's existing **per-guest serialization** avoids
this; add a **retry on transient 500** for safety.
- **F5 — keyctl-through-restore is *functional*, not just flag-present.** Docker started and
ran `hello-world` in the *restored* guest — re-confirms phase3 #8 on the golden specifically.
- **F6 — Docker storage driver is `overlayfs`** (not `overlay2`), matching phase0's LXC result.
No extra config beyond `nesting=1,keyctl=1` was needed.
- **F7 — live link-up surfaced no DHCP/ARP problem.** Fresh MAC → fresh lease `192.168.0.109`;
the golden's old MAC only lingered as a STALE IPv6-neighbour cache entry from the (stopped)
build guest. No active collision.
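The retry recommended in F4 could look like the sketch below. It is one possible shape, not the reconcile job's implementation: `retry_on_500` is a hypothetical helper, and it assumes the wrapped command prints nothing but the HTTP status code (for example a `curl --write-out '%{http_code}'` call).

```shell
# Sketch only: retry_on_500 is a hypothetical helper.
# Usage: retry_on_500 MAX_TRIES DELAY_SECONDS command [args...]
# The command must print only an HTTP status code on stdout.
retry_on_500() {
  max="$1"
  delay="$2"
  shift 2
  try=1
  while :; do
    code="$("$@")"
    # Stop on anything other than 500, or once the tries are used up.
    if [ "$code" != 500 ] || [ "$try" -ge "$max" ]; then
      printf '%s\n' "$code"
      [ "$code" = 200 ]   # exit status reflects final success
      return
    fi
    try=$((try + 1))
    sleep "$delay"
  done
}
```

Only a 500 is retried. Any other status is returned immediately, so a genuine permission error (for example a 403) is not masked by the retry loop.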
## 5. Identity-reset deliverable (the §9 / doc-13 open item — settled for the *provision* path)
| Field | Restore leaves it as | Who resets it | Where | Cost |
|---|---|---|---|---|
| MAC | golden's archived MAC | reconcile job (unconditional) | **host-side** token `PUT net0` (omit hwaddr) | cheap |
| hostname | golden's archived hostname | reconcile job | **host-side** token `PUT hostname` | cheap |
| machine-id | empty (baked) | **systemd, first boot** | guest first-boot regen (golden bake) | **free** |
| SSH host keys | absent (baked) | reconcile job | **guest-side** `ssh-keygen -A` (or baked first-boot unit) | **surface-widening — flag** |
**Reconcile-job front-half reset set (provision):** host-side `{MAC, hostname}` via token config;
guest-side `{SSH host keys}` via `ssh-keygen -A` (or a baked first-boot unit); `{machine-id}` is
handled for free by the bake-clean golden. Restic / tunnel / hub identity are **out of scope**
here (back half, slice 8 / DR policy §9).
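The host-side half of this reset set can be sketched as a dry-run plan. The helper names are hypothetical, and the sketch assumes both resets go through the standard `/nodes/{node}/lxc/{vmid}/config` endpoint. Nothing is sent; the functions only print what would be sent.

```shell
# Sketch only: helper names are hypothetical; nothing here talks to the API.

# Drop the hwaddr=... key from a net0 value, so that a PUT of the
# result lets PVE assign a fresh MAC.
net0_without_hwaddr() {
  printf '%s\n' "$1" | tr ',' '\n' | grep -v '^hwaddr=' | paste -sd, -
}

# Print the host-side reset calls for one guest as "METHOD PATH DATA" lines.
# Args: node vmid new-hostname archived-net0
reset_plan() {
  printf 'PUT /nodes/%s/lxc/%s/config net0=%s\n' "$1" "$2" "$(net0_without_hwaddr "$4")"
  printf 'PUT /nodes/%s/lxc/%s/config hostname=%s\n' "$1" "$2" "$3"
}

# Usage:
#   reset_plan demo-felhom 9101 felhom-spike-9101 \
#     'name=eth0,bridge=vmbr0,hwaddr=BC:24:11:63:43:F4,ip=dhcp'
```

Deriving the new `net0` from the archived value keeps the bridge and IP settings intact while guaranteeing the golden's MAC is never carried forward.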
## 6. Verdict
**READY to spec the unified bring-up reconcile job (Phase 7.1).** The golden recipe is validated
end-to-end and the token-covered front half (restore → reset MAC+hostname → resize → attach
mount → start link-up) works with Docker functional in the restored guest. **One design change
the findings force:** the front half is **not** purely host-side — SSH-host-key regeneration is a
guest-internal step (F3). The spec must choose between an agent-run `ssh-keygen -A` (widening the
guest-internal surface) and a baked first-boot unit in the golden (keeping the agent host-side).
machine-id needs no such step. MAC reset is unconditional (F1).
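For the baked-unit option, a first-boot unit might be installed into the golden along these lines. This was not exercised in the spike: the unit name `felhom-ssh-hostkeys.service` and the helper `write_hostkey_unit` are hypothetical, and the condition on the ed25519 key is just one way to make the unit a no-op once keys exist.

```shell
# Sketch only: unit name and helper are hypothetical, untested in this spike.
# $1 is the systemd unit directory (in the golden: /etc/systemd/system).
write_hostkey_unit() {
  cat > "$1/felhom-ssh-hostkeys.service" <<'EOF'
[Unit]
Description=Regenerate missing SSH host keys
Before=ssh.service
ConditionPathExists=!/etc/ssh/ssh_host_ed25519_key

[Service]
Type=oneshot
ExecStart=/usr/bin/ssh-keygen -A

[Install]
WantedBy=multi-user.target
EOF
}

# In the golden build, before the identity-clean step:
#   write_hostkey_unit /etc/systemd/system
#   systemctl enable felhom-ssh-hostkeys.service
```

`ssh-keygen -A` only creates host keys that are missing, so the unit is safe to leave enabled. Ordering it before `ssh.service` is what would avoid the first-boot failure seen in F3.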
## 7. Out of scope (not done here — note for the implementation)
- Controller deploy / bootstrap / per-guest local-token mint — **slice 8** (back half).
- Restic / tunnel / hub identity handling — DR identity policy (§9) + slice 8/10.
- Reconcile-job journaling + compensating rollback — the **implementation** (Phase 7.1),
specced from these findings; this spike restored/destroyed manually without the journal.
- PBS escrow (§8a) — separate slice-7 thread.