diff --git a/REPORT.md b/REPORT.md
index 12c074b..f70082b 100644
--- a/REPORT.md
+++ b/REPORT.md
@@ -1,91 +1,48 @@
-# REPORT — USB drive availability (diagnosis) + backup-page whole-guest wiring
+# REPORT — felhom-controller: slice 10 (external user-data drives) — controller side
 
-**Repo:** `felhom-controller` · **Version:** 0.47.0 · **Date:** 2026-06-12 · agent untouched
+**Version:** 0.50.0 · **Date:** 2026-06-12 · pairs with **felhom-agent v0.25.0–v0.27.0**
 
----
+The controller's half of slice 10 (the agent owns the host-side execution + self-heal; see
+felhom-agent's REPORT for P1 spike / P2 passthrough / P2 activation endpoint / P3 self-heal). All
+live-validated on guest 9201; golden rebaked to 0.50.0.
 
-## PART 1 — USB drive availability: DIAGNOSIS (GATED — Branch A, no blind build)
+## P2C — enroll passes the drive into the guest (v0.48.0)
+- `agentapi.GuestAttach(where)` → agent `POST /disks/guest-attach`. `runStorageInit` /
+  `runStorageAttach` / `handleStorageRegister` call `attachIntoGuest` after recording the StoragePath
+  (best-effort; a transient failure is logged — P3 self-heal completes it). Closes Branch A: an enrolled
+  drive becomes usable in the guest (app `HDD_PATH` writes land on `/dev/sdb1`; the "nem elérhető"
+  banner clears).
 
-**Conclusion: BRANCH A confirmed — felhom-usb is NOT passed into guest 9201.** The drive is mounted on
-the **host** only; the guest (and therefore the controller container and any app) cannot reach it. This
-is the slice-10 additive-mount passthrough — **reported, not built.** The banner is correct.
+## Activation-UX (v0.49.0)
+- The host-side live inject is blocked on unprivileged LXC, so a drive enrolled into a *running* guest
+  activates at the next guest boot. Per decision: enroll persists (no forced reboot) + a user-triggered
+  restart.
+- `pendingActivationDrives()` flags registered drives the agent reports present+attached but which
+  aren't a live mount in the container. The settings page shows a banner + a batched **"Újraindítás
+  most (~30 mp)"** button → `POST /api/storage/activate` → `agentapi.GuestReboot` → agent
+  `POST /guest/reboot`. Live-validated: activate → guest reboots → drive active.
 
-### Captured evidence (live, guest 9201 / felhom-pve)
-**Host:**
-- `pct config 9201` → mountpoints: only `mp9: …/bootstrap → /etc/felhom-bootstrap (ro)` and
-  `rootfs: local-lvm:vm-9201-disk-0,size=8G`. **No felhom-usb mp.**
-- `findmnt /mnt/felhom-usb` → `/dev/sdb1 ext4` (mounted on the host).
-- `/etc/pve/storage.cfg` → `dir: felhom-usb, path /mnt/felhom-usb, content backup, is_mountpoint 1`.
+## P4 — dual-role drives + backup-aware wipe warning (v0.50.0)
+- **4A:** a user-data drive is appdata AND backup-target-eligible (not locked to one role) — surfaced
+  in the drive overview's per-card purpose note. `felhom-pbs`/system/backup roles unchanged.
+- **4B:** `handleStorageImpact` now also returns `backup_copies` — apps whose cross-drive (secondary)
+  backups are stored on the drive (`backupCopiesOnPath` scans `felhom-data/backups/secondary/`,
+  skipping the shared restic repo / `_infra`). The type-to-confirm wipe/eject modal names them ("Ez a
+  meghajtó más alkalmazások biztonsági másolatait is tárolja — a törlés ezeket is eltávolítja"). The
+  wipe stays **customer-confirmable** (the copies are redundant; originals live on the source drive).
+- **OUT OF SCOPE:** the cross-drive backup ENGINE (restic USB1↔USB2, scheduling, pruning) — a follow-on
+  slice (needs a 2nd physical drive to validate). The 4B detection is forward-compatible (empty until
+  the engine writes there).
 
-**Guest:**
-- `findmnt /mnt/felhom-usb` → **nothing** (not a mount in the guest).
-- `ls -la /mnt/felhom-usb` → empty dir (just `.`/`..`), created `Jun 12 07:46`.
-- `df -h /mnt/felhom-usb /` → **both** `/dev/mapper/pve-vm--9201--disk--0` (the 8 GB rootfs) — i.e.
-  `/mnt/felhom-usb` in the guest is a plain directory **on the rootfs**, not the external drive.
+## Live validation (9201)
+- P2C: app bytes on `/dev/sdb1`, banner `[PASS] 1 connected, 0 disconnected`.
+- Activation: `/api/storage/activate` → reboot → drive active.
+- P4: `/api/storage/impact?where=/mnt/felhom-usb` → `backup_copies:[]`; after creating
+  `felhom-data/backups/secondary/immich` → `backup_copies:["immich"]` (detection live).
+- `go test ./internal/{web,agentapi}/` green; golden rebaked to 0.50.0, build guest purged.
 
-**Controller container:**
-- `stat /mnt/felhom-usb` → **No such file or directory** (the de-privileged container has no `/mnt`
-  bind) → `os.Stat` in `monitor/healthcheck.go` fails → the "Adattároló nem elérhető" banner.
-- logs: `Storage paths: 0 connected, 1 disconnected`.
-
-**Host drive contents (real data exists on the host):** `du -sh /mnt/felhom-usb` = 8.0 G;
-contains `Dokumentumok/`, `felhom_data/`, `storage/`, `dump/`, `lost+found/` (Feb–Jun timestamps) —
-a prior bare-metal layout. **Apps in the guest cannot see any of it.**
-
-### Flags (must be addressed by the slice-10 passthrough)
-1. **App-data location correctness:** an app configured with `HDD_PATH=/mnt/felhom-usb` would write to
-   the **8 GB rootfs (local-lvm)**, silently, NOT the external drive. The only deployed app
-   (`actualbudget`) has **0 HDD mounts** (`ParseComposeHDDMounts: found 0`), so nothing is mislanding
-   *today* — but the risk is real the moment any app is placed on "external" storage.
-2. **The banner was surfaced by the `Regisztrálás` shortcut** (controller v0.45.0, prior task):
-   registering `/mnt/felhom-usb` (mounted on the host, not the guest) created the phantom empty dir on
-   rootfs (the `Jun 12 07:46` dir) and the `1 disconnected` probe. Registering a host-only drive should
-   arguably be refused until passthrough exists — a candidate guard for the slice-10 work.
-
-### Slice-10 scope (NOT built here)
-assign drive → attach as an LXC `mpN` on the guest (`reconcile/bringup.go:GuestMount`, marked "slice 10
-wires") → mount propagation into the controller container → register. Gated per the spec; to be scoped
-with this evidence in hand.
-
----
-
-## PART 2 — Backups page: whole-guest backup visibility + manual trigger (v0.47.0, BUILT)
-
-### 2C gate — quiesce ownership (confirmed before wiring)
-**The CONTROLLER owns quiescing.** The `quiesce.Loop` (slice 8B) stops its app stacks → `POST /backup`
-→ polls `/backup/status` → resumes; the agent's vzdump is crash-consistent only (an LXC has no
-fsfreeze). So the manual trigger goes **through the loop**, never a bare `StartBackup` (which would be
-crash-consistent and wouldn't stop apps). Guest 9201 is on lvm-thin → **snapshot mode**, so downtime is
-the until-snapshot window (~10 s here), with 8B.2 early-resume.
-
-### What shipped
-- **2A agentapi:** `StatusResponse.Backup *BackupRecord`, `DueResponse.AgeSecs`, new
-  `RestoreTestStatus()`. Non-hollow tests (`backup_test.go`): parse the documented JSON; assert
-  `StartBackup` POSTs `/backup`.
-- **2B Section "Rendszermentés (teljes mentés)":** read-only cards — last whole-guest backup (time +
-  size + **target PBS-vs-local**, from the archive volid), next-due (`/backup/due` age vs cadence),
-  restore-test, running phase. Agent-unreachable degrades to a note.
-- **2C "Mentés most":** `quiesce.Loop` gains a mutex + `TriggerNow()` (single-flight via `TryLock` +
-  the persisted marker; `ErrBackupInProgress` on overlap; async, bounded by max-quiesce). New
-  `POST /api/guest-backup/trigger` + `GET /api/guest-backup/status` (distinct prefix from apiRouter's
-  app-data `/api/backup/{run,status}` — verified the collision and avoided shadowing). Button warns per
-  mode.
-- **2D:** existing per-app DB-dump UI relabeled under an "Alkalmazás-mentések (adatbázis + konfiguráció)"
-  divider, distinct from the whole-guest tier.
-- **2E config:** OUT OF SCOPE (hub-served policy, slice 10) — no agent config surface added.
-
-### Live validation (guest 9201, against the agent API — not REPORT)
-- Agent `curl`: `/backup/status` → done, backup `local:backup/vzdump-lxc-9201-…`, snapshot, 1.4 GB,
-  success; `/backup/due` → due=false, within cadence; `/restore-test/status` → null.
-- `GET /backups` → **200**; Section 1 renders "Utolsó teljes mentés", "Helyi tároló (local)",
-  "Visszaállítás ellenőrizve / Még nem futott", "Mentés most"; "Alkalmazás-mentések" divider present.
-- **Manual trigger:** `POST /api/guest-backup/trigger` → `{started:true}`; quiesce logs show
-  quiesce `actualbudget` → job started → **snapshotted → early-resume (8B.2) → done**; phase polled
-  `snapshotted → done`; a **new backup recorded** (`…11_17_38.tar.zst`); `actualbudget` back up+healthy;
-  quiesce marker cleared (no stranded quiesce).
-- **Single-flight:** concurrent double-trigger → one `{started:true}`, one
-  `{"error":"mentés már folyamatban van"}` (409).
-- `go test ./internal/{web,agentapi,quiesce}/` green; `go build ./...` clean.
-
-### Deploy
-Built + pushed `felhom-controller:0.47.0`; deployed to guest 9201 (healthy). Golden rebaked to 0.47.0.
+## Note (carried from P2)
+The controller's app/backup path helpers still join `felhom-data` under the registered drive path; in
+Model A the in-guest mount IS the felhom-data namespace, so backup paths double-nest
+(`felhom-data/felhom-data/...`) — functional but untidy. Reconcile when wiring app-data-backup-to-drive
+(not in this slice; app `HDD_PATH` data lands correctly today).
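Review note on the double-nesting described under "Note (carried from P2)": the following is a minimal Go sketch of one way the reconcile could look, not the controller's actual code. The `Drive` type, its `MountIsDataRoot` flag, and the functions `naiveSecondaryDir` and `SecondaryDir` are all hypothetical names introduced for illustration; only the `felhom-data` segment, the `backups/secondary` layout, and the `/mnt/felhom-usb` path come from the report.

```go
package main

import (
	"fmt"
	"path"
)

// dataDir is the namespace segment the report says the helpers always join.
const dataDir = "felhom-data"

// Drive is a hypothetical stand-in for a registered storage path.
type Drive struct {
	Path            string // registered in-guest path
	MountIsDataRoot bool   // assumption: true when the mount already is the felhom-data namespace (Model A)
}

// naiveSecondaryDir reproduces the behaviour the note describes:
// dataDir is joined under the drive path unconditionally.
func naiveSecondaryDir(drivePath string) string {
	return path.Join(drivePath, dataDir, "backups", "secondary")
}

// SecondaryDir skips the extra dataDir segment when the mount itself
// is already the namespace, so the host-side path nests only once.
func (d Drive) SecondaryDir() string {
	root := path.Clean(d.Path)
	if !d.MountIsDataRoot {
		root = path.Join(root, dataDir)
	}
	return path.Join(root, "backups", "secondary")
}

func main() {
	d := Drive{Path: "/mnt/felhom-usb", MountIsDataRoot: true}
	fmt.Println(naiveSecondaryDir(d.Path)) // in-guest path with the extra segment
	fmt.Println(d.SecondaryDir())          // in-guest path without it
}
```

An explicit flag is used here instead of inspecting the path, because in Model A the in-guest path (`/mnt/felhom-usb`) does not itself end in `felhom-data`; whether the real controller would carry such a flag on the StoragePath record is an open design question.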