REPORT — USB drive availability (diagnosis) + backup-page whole-guest wiring

Repo: felhom-controller · Version: 0.47.0 · Date: 2026-06-12 · agent untouched


PART 1 — USB drive availability: DIAGNOSIS (GATED — Branch A, no blind build)

Conclusion: BRANCH A confirmed — felhom-usb is NOT passed into guest 9201. The drive is mounted on the host only; the guest (and therefore the controller container and any app) cannot reach it. This is the slice-10 additive-mount passthrough — reported, not built. The banner is correct.

Captured evidence (live, guest 9201 / felhom-pve)

Host:

  • pct config 9201 → mountpoints: only mp9: …/bootstrap → /etc/felhom-bootstrap (ro) and rootfs: local-lvm:vm-9201-disk-0,size=8G. No felhom-usb mp.
  • findmnt /mnt/felhom-usb → /dev/sdb1 ext4 (mounted on the host).
  • /etc/pve/storage.cfg → dir: felhom-usb, path /mnt/felhom-usb, content backup, is_mountpoint 1.

Guest:

  • findmnt /mnt/felhom-usb → nothing (not a mount in the guest).
  • ls -la /mnt/felhom-usb → empty dir (just ./..), created Jun 12 07:46.
  • df -h /mnt/felhom-usb / → both /dev/mapper/pve-vm--9201--disk--0 (the 8 GB rootfs); i.e. /mnt/felhom-usb in the guest is a plain directory on the rootfs, not the external drive.

Controller container:

  • stat /mnt/felhom-usb → No such file or directory (the de-privileged container has no /mnt bind) → os.Stat in monitor/healthcheck.go fails → the "Adattároló nem elérhető" ("Data storage is not available") banner.
  • logs: Storage paths: 0 connected, 1 disconnected.

Host drive contents (real data exists on the host): du -sh /mnt/felhom-usb = 8.0 G; contains Dokumentumok/, felhom_data/, storage/, dump/, lost+found/ (Feb–Jun timestamps), a prior bare-metal layout. Apps in the guest cannot see any of it.

Flags (must be addressed by the slice-10 passthrough)

  1. App-data location correctness: an app configured with HDD_PATH=/mnt/felhom-usb would silently write to the 8 GB rootfs (local-lvm), NOT the external drive. The only deployed app (actualbudget) has 0 HDD mounts (ParseComposeHDDMounts: found 0), so nothing is landing in the wrong place today, but the risk becomes real the moment any app is placed on "external" storage.
  2. The banner was surfaced by the Regisztrálás shortcut (controller v0.45.0, prior task): registering /mnt/felhom-usb (mounted on the host, not the guest) created the phantom empty dir on rootfs (the Jun 12 07:46 dir) and the 1 disconnected probe. Registering a host-only drive should arguably be refused until passthrough exists — a candidate guard for the slice-10 work.

Slice-10 scope (NOT built here)

assign drive → attach as an LXC mpN on the guest (reconcile/bringup.go:GuestMount, marked "slice 10 wires") → mount propagation into the controller container → register. Gated per the spec; to be scoped with this evidence in hand.


PART 2 — Backups page: whole-guest backup visibility + manual trigger (v0.47.0, BUILT)

2C gate — quiesce ownership (confirmed before wiring)

The CONTROLLER owns quiescing. The quiesce.Loop (slice 8B) stops its app stacks → POST /backup → polls /backup/status → resumes; the agent's vzdump alone is only crash-consistent (an LXC has no fsfreeze). The manual trigger therefore goes through the loop, never a bare StartBackup (which would be crash-consistent and would not stop apps). Guest 9201 is on lvm-thin → snapshot mode, so downtime is limited to the window until the snapshot is taken (~10 s here), with 8B.2 early-resume.

What shipped

  • 2A agentapi: StatusResponse.Backup *BackupRecord, DueResponse.AgeSecs, new RestoreTestStatus(). Non-hollow tests (backup_test.go): parse the documented JSON; assert StartBackup POSTs /backup.
  • 2B Section "Rendszermentés (teljes mentés)" ("System backup (full backup)") shows read-only cards: last whole-guest backup (time + size + target PBS-vs-local, from the archive volid), next-due (/backup/due age vs cadence), restore-test, running phase. An unreachable agent degrades to a note.
  • 2C "Mentés most" ("Back up now"): quiesce.Loop gains a mutex + TriggerNow() (single-flight via TryLock + the persisted marker; ErrBackupInProgress on overlap; async, bounded by max-quiesce). New POST /api/guest-backup/trigger + GET /api/guest-backup/status (a distinct prefix from apiRouter's app-data /api/backup/{run,status}; the collision was verified and shadowing avoided). Button warns per mode.
  • 2D: existing per-app DB-dump UI relabeled under an "Alkalmazás-mentések (adatbázis + konfiguráció)" ("Application backups (database + configuration)") divider, distinct from the whole-guest tier.
  • 2E config: OUT OF SCOPE (hub-served policy, slice 10) — no agent config surface added.

Live validation (guest 9201, against the agent API — not REPORT)

  • Agent curl: /backup/status → done, backup local:backup/vzdump-lxc-9201-…, snapshot, 1.4 GB, success; /backup/due → due=false, within cadence; /restore-test/status → null.
  • GET /backups → 200; Section 1 renders "Utolsó teljes mentés" ("Last full backup"), "Helyi tároló (local)" ("Local storage"), "Visszaállítás ellenőrizve / Még nem futott" ("Restore verified / Not yet run"), "Mentés most" ("Back up now"); "Alkalmazás-mentések" ("Application backups") divider present.
  • Manual trigger: POST /api/guest-backup/trigger → {started:true}; quiesce logs show quiesce actualbudget → job started → snapshotted → early-resume (8B.2) → done; phase polled snapshotted → done; a new backup recorded (…11_17_38.tar.zst); actualbudget back up and healthy; quiesce marker cleared (no stranded quiesce).
  • Single-flight: concurrent double-trigger → one {started:true}, one {"error":"mentés már folyamatban van"} ("backup already in progress") (409).
  • go test ./internal/{web,agentapi,quiesce}/ green; go build ./... clean.

Deploy

Built + pushed felhom-controller:0.47.0; deployed to guest 9201 (healthy). Golden rebaked to 0.47.0.