docs(v0.50.0): REPORT — controller slice-10 (P2C + activation-UX + P4); validated on 9201
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -1,91 +1,48 @@
-# REPORT — USB drive availability (diagnosis) + backup-page whole-guest wiring
+# REPORT — felhom-controller: slice 10 (external user-data drives) — controller side
 
-**Repo:** `felhom-controller` · **Version:** 0.47.0 · **Date:** 2026-06-12 · agent untouched
+**Version:** 0.50.0 · **Date:** 2026-06-12 · pairs with **felhom-agent v0.25.0–v0.27.0**
 
----
+The controller's half of slice 10 (the agent owns the host-side execution + self-heal; see
+felhom-agent's REPORT for P1 spike / P2 passthrough / P2 activation endpoint / P3 self-heal). All
+live-validated on guest 9201; golden rebaked to 0.50.0.
 
-## PART 1 — USB drive availability: DIAGNOSIS (GATED — Branch A, no blind build)
+## P2C — enroll passes the drive into the guest (v0.48.0)
+- `agentapi.GuestAttach(where)` → agent `POST /disks/guest-attach`. `runStorageInit` /
+`runStorageAttach` / `handleStorageRegister` call `attachIntoGuest` after recording the StoragePath
+(best-effort; a transient failure is logged — P3 self-heal completes it). Closes Branch A: an enrolled
+drive becomes usable in the guest (app `HDD_PATH` writes land on `/dev/sdb1`; the "nem elérhető"
+banner clears).
 
-**Conclusion: BRANCH A confirmed — felhom-usb is NOT passed into guest 9201.** The drive is mounted on
-the **host** only; the guest (and therefore the controller container and any app) cannot reach it. This
-is the slice-10 additive-mount passthrough — **reported, not built.** The banner is correct.
+## Activation-UX (v0.49.0)
+- The host-side live inject is blocked on unprivileged LXC, so a drive enrolled into a *running* guest
+activates at the next guest boot. Per decision: enroll persists (no forced reboot) + a user-triggered
+restart.
+- `pendingActivationDrives()` flags registered drives the agent reports present+attached but which
+aren't a live mount in the container. The settings page shows a banner + a batched **"Újraindítás
+most (~30 mp)"** button → `POST /api/storage/activate` → `agentapi.GuestReboot` → agent
+`POST /guest/reboot`. Live-validated: activate → guest reboots → drive active.
 
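The pending-activation rule described for `pendingActivationDrives()` can be sketched as a simple filter. This is an illustration only: the `Drive` type, its fields and the `liveMounts` map are assumptions, not the controller's real data model.

```go
package main

import "fmt"

// Drive is a hypothetical view of one registered drive.
type Drive struct {
	Path     string // registered mount path, e.g. /mnt/felhom-usb
	Present  bool   // agent reports the device as present on the host
	Attached bool   // agent reports the drive as attached to the guest
}

// pendingActivation returns the registered drives that the agent reports as
// present and attached but that are not yet a live mount in the container,
// i.e. the drives that only become usable after the next guest boot.
func pendingActivation(drives []Drive, liveMounts map[string]bool) []string {
	var pending []string
	for _, d := range drives {
		if d.Present && d.Attached && !liveMounts[d.Path] {
			pending = append(pending, d.Path)
		}
	}
	return pending
}

func main() {
	drives := []Drive{
		{Path: "/mnt/felhom-usb", Present: true, Attached: true},
		{Path: "/mnt/other", Present: true, Attached: true},
		{Path: "/mnt/unplugged", Present: false, Attached: true},
	}
	live := map[string]bool{"/mnt/other": true}
	fmt.Println(pendingActivation(drives, live)) // prints [/mnt/felhom-usb]
}
```

A non-empty result is what would drive the banner and the batched restart button.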
-### Captured evidence (live, guest 9201 / felhom-pve)
-**Host:**
-- `pct config 9201` → mountpoints: only `mp9: …/bootstrap → /etc/felhom-bootstrap (ro)` and
-`rootfs: local-lvm:vm-9201-disk-0,size=8G`. **No felhom-usb mp.**
-- `findmnt /mnt/felhom-usb` → `/dev/sdb1 ext4` (mounted on the host).
-- `/etc/pve/storage.cfg` → `dir: felhom-usb, path /mnt/felhom-usb, content backup, is_mountpoint 1`.
+## P4 — dual-role drives + backup-aware wipe warning (v0.50.0)
+- **4A:** a user-data drive is appdata AND backup-target-eligible (not locked to one role) — surfaced
+in the drive overview's per-card purpose note. `felhom-pbs`/system/backup roles unchanged.
+- **4B:** `handleStorageImpact` now also returns `backup_copies` — apps whose cross-drive (secondary)
+backups are stored on the drive (`backupCopiesOnPath` scans `felhom-data/backups/secondary/<app>`,
+skipping the shared restic repo / `_infra`). The type-to-confirm wipe/eject modal names them ("Ez a
+meghajtó más alkalmazások biztonsági másolatait is tárolja — a törlés ezeket is eltávolítja"). The
+wipe stays **customer-confirmable** (the copies are redundant; originals live on the source drive).
+- **OUT OF SCOPE:** the cross-drive backup ENGINE (restic USB1↔USB2, scheduling, pruning) — a follow-on
+slice (needs a 2nd physical drive to validate). The 4B detection is forward-compatible (empty until
+the engine writes there).
 
-**Guest:**
-- `findmnt /mnt/felhom-usb` → **nothing** (not a mount in the guest).
-- `ls -la /mnt/felhom-usb` → empty dir (just `.`/`..`), created `Jun 12 07:46`.
-- `df -h /mnt/felhom-usb /` → **both** `/dev/mapper/pve-vm--9201--disk--0` (the 8 GB rootfs) — i.e.
-`/mnt/felhom-usb` in the guest is a plain directory **on the rootfs**, not the external drive.
+## Live validation (9201)
+- P2C: app bytes on `/dev/sdb1`, banner `[PASS] 1 connected, 0 disconnected`.
+- Activation: `/api/storage/activate` → reboot → drive active.
+- P4: `/api/storage/impact?where=/mnt/felhom-usb` → `backup_copies:[]`; after creating
+`felhom-data/backups/secondary/immich` → `backup_copies:["immich"]` (detection live).
+- `go test ./internal/{web,agentapi}/` green; golden rebaked to 0.50.0, build guest purged.
 
-**Controller container:**
-- `stat /mnt/felhom-usb` → **No such file or directory** (the de-privileged container has no `/mnt`
-bind) → `os.Stat` in `monitor/healthcheck.go` fails → the "Adattároló nem elérhető" banner.
-- logs: `Storage paths: 0 connected, 1 disconnected`.
-
+## Note (carried from P2)
+The controller's app/backup path helpers still join `felhom-data` under the registered drive path; in
+Model A the in-guest mount IS the felhom-data namespace, so backup paths double-nest
+(`felhom-data/felhom-data/...`) — functional but untidy. Reconcile when wiring app-data-backup-to-drive
+(not in this slice; app `HDD_PATH` data lands correctly today).
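One way to read the double-nesting note is as a path-mapping effect, sketched below. The host-side source directory `/mnt/felhom-usb/felhom-data` and the helper `hostLocation` are assumptions made for illustration; the report does not state the exact bind source.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// hostLocation maps a path inside the guest mount back to where it would
// land on the host, assuming the guest mount point is backed by hostSource.
func hostLocation(hostSource, guestMount, guestPath string) string {
	rel := strings.TrimPrefix(guestPath, guestMount)
	return path.Join(hostSource, rel)
}

func main() {
	guestMount := "/mnt/felhom-usb"
	// Assumption: the in-guest mount is backed by the drive's felhom-data directory.
	hostSource := "/mnt/felhom-usb/felhom-data"

	// The helper still joins "felhom-data" under the registered drive path.
	backupDir := path.Join(guestMount, "felhom-data", "backups")

	fmt.Println(hostLocation(hostSource, guestMount, backupDir))
	// prints /mnt/felhom-usb/felhom-data/felhom-data/backups
}
```

Under that reading the extra segment comes from the helper's join, which is why the note calls it functional but untidy.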
-**Host drive contents (real data exists on the host):** `du -sh /mnt/felhom-usb` = 8.0 G;
-contains `Dokumentumok/`, `felhom_data/`, `storage/`, `dump/`, `lost+found/` (Feb–Jun timestamps) —
-a prior bare-metal layout. **Apps in the guest cannot see any of it.**
-
-### Flags (must be addressed by the slice-10 passthrough)
-1. **App-data location correctness:** an app configured with `HDD_PATH=/mnt/felhom-usb` would write to
-the **8 GB rootfs (local-lvm)**, silently, NOT the external drive. The only deployed app
-(`actualbudget`) has **0 HDD mounts** (`ParseComposeHDDMounts: found 0`), so nothing is mislanding
-*today* — but the risk is real the moment any app is placed on "external" storage.
-2. **The banner was surfaced by the `Regisztrálás` shortcut** (controller v0.45.0, prior task):
-registering `/mnt/felhom-usb` (mounted on the host, not the guest) created the phantom empty dir on
-rootfs (the `Jun 12 07:46` dir) and the `1 disconnected` probe. Registering a host-only drive should
-arguably be refused until passthrough exists — a candidate guard for the slice-10 work.
-
-### Slice-10 scope (NOT built here)
-assign drive → attach as an LXC `mpN` on the guest (`reconcile/bringup.go:GuestMount`, marked "slice 10
-wires") → mount propagation into the controller container → register. Gated per the spec; to be scoped
-with this evidence in hand.
-
----
-
-## PART 2 — Backups page: whole-guest backup visibility + manual trigger (v0.47.0, BUILT)
-
-### 2C gate — quiesce ownership (confirmed before wiring)
-**The CONTROLLER owns quiescing.** The `quiesce.Loop` (slice 8B) stops its app stacks → `POST /backup`
-→ polls `/backup/status` → resumes; the agent's vzdump is crash-consistent only (an LXC has no
-fsfreeze). So the manual trigger goes **through the loop**, never a bare `StartBackup` (which would be
-crash-consistent and wouldn't stop apps). Guest 9201 is on lvm-thin → **snapshot mode**, so downtime is
-the until-snapshot window (~10 s here), with 8B.2 early-resume.
-
-### What shipped
-- **2A agentapi:** `StatusResponse.Backup *BackupRecord`, `DueResponse.AgeSecs`, new
-`RestoreTestStatus()`. Non-hollow tests (`backup_test.go`): parse the documented JSON; assert
-`StartBackup` POSTs `/backup`.
-- **2B Section "Rendszermentés (teljes mentés)":** read-only cards — last whole-guest backup (time +
-size + **target PBS-vs-local**, from the archive volid), next-due (`/backup/due` age vs cadence),
-restore-test, running phase. Agent-unreachable degrades to a note.
-- **2C "Mentés most":** `quiesce.Loop` gains a mutex + `TriggerNow()` (single-flight via `TryLock` +
-the persisted marker; `ErrBackupInProgress` on overlap; async, bounded by max-quiesce). New
-`POST /api/guest-backup/trigger` + `GET /api/guest-backup/status` (distinct prefix from apiRouter's
-app-data `/api/backup/{run,status}` — verified the collision and avoided shadowing). Button warns per
-mode.
-- **2D:** existing per-app DB-dump UI relabeled under an "Alkalmazás-mentések (adatbázis + konfiguráció)"
-divider, distinct from the whole-guest tier.
-- **2E config:** OUT OF SCOPE (hub-served policy, slice 10) — no agent config surface added.
-
-### Live validation (guest 9201, against the agent API — not REPORT)
-- Agent `curl`: `/backup/status` → done, backup `local:backup/vzdump-lxc-9201-…`, snapshot, 1.4 GB,
-success; `/backup/due` → due=false, within cadence; `/restore-test/status` → null.
-- `GET /backups` → **200**; Section 1 renders "Utolsó teljes mentés", "Helyi tároló (local)",
-"Visszaállítás ellenőrizve / Még nem futott", "Mentés most"; "Alkalmazás-mentések" divider present.
-- **Manual trigger:** `POST /api/guest-backup/trigger` → `{started:true}`; quiesce logs show
-quiesce `actualbudget` → job started → **snapshotted → early-resume (8B.2) → done**; phase polled
-`snapshotted → done`; a **new backup recorded** (`…11_17_38.tar.zst`); `actualbudget` back up+healthy;
-quiesce marker cleared (no stranded quiesce).
-- **Single-flight:** concurrent double-trigger → one `{started:true}`, one
-`{"error":"mentés már folyamatban van"}` (409).
-- `go test ./internal/{web,agentapi,quiesce}/` green; `go build ./...` clean.
-
-### Deploy
-Built + pushed `felhom-controller:0.47.0`; deployed to guest 9201 (healthy). Golden rebaked to 0.47.0.