# REPORT — USB drive availability (diagnosis) + backup-page whole-guest wiring
**Repo:** `felhom-controller` · **Version:** 0.47.0 · **Date:** 2026-06-12 · agent untouched
---
## PART 1 — USB drive availability: DIAGNOSIS (GATED — Branch A, no blind build)
**Conclusion: BRANCH A confirmed — felhom-usb is NOT passed into guest 9201.** The drive is mounted on
the **host** only; the guest (and therefore the controller container and any app) cannot reach it. This
is the slice-10 additive-mount passthrough — **reported, not built.** The banner is correct.
### Captured evidence (live, guest 9201 / felhom-pve)
**Host:**
- `pct config 9201` → mountpoints: only `mp9: …/bootstrap → /etc/felhom-bootstrap (ro)` and
`rootfs: local-lvm:vm-9201-disk-0,size=8G`. **No felhom-usb mp.**
- `findmnt /mnt/felhom-usb` → `/dev/sdb1 ext4` (mounted on the host).
- `/etc/pve/storage.cfg` → `dir: felhom-usb, path /mnt/felhom-usb, content backup, is_mountpoint 1`.
**Guest:**
- `findmnt /mnt/felhom-usb` → **nothing** (not a mount in the guest).
- `ls -la /mnt/felhom-usb` → empty dir (just `.`/`..`), created `Jun 12 07:46`.
- `df -h /mnt/felhom-usb /` → **both** `/dev/mapper/pve-vm--9201--disk--0` (the 8 GB rootfs), i.e.
`/mnt/felhom-usb` in the guest is a plain directory **on the rootfs**, not the external drive.
**Controller container:**
- `stat /mnt/felhom-usb` → **No such file or directory** (the de-privileged container has no `/mnt`
bind) → `os.Stat` in `monitor/healthcheck.go` fails → the "Adattároló nem elérhető" ("Storage unavailable") banner.
- logs: `Storage paths: 0 connected, 1 disconnected`.
**Host drive contents (real data exists on the host):** `du -sh /mnt/felhom-usb` = 8.0 G;
contains `Dokumentumok/`, `felhom_data/`, `storage/`, `dump/`, `lost+found/` (Feb–Jun timestamps),
a prior bare-metal layout. **Apps in the guest cannot see any of it.**
### Flags (must be addressed by the slice-10 passthrough)
1. **App-data location correctness:** an app configured with `HDD_PATH=/mnt/felhom-usb` would write to
the **8 GB rootfs (local-lvm)**, silently, NOT the external drive. The only deployed app
(`actualbudget`) has **0 HDD mounts** (`ParseComposeHDDMounts: found 0`), so nothing is mislanding
*today* — but the risk is real the moment any app is placed on "external" storage.
2. **The banner was surfaced by the `Regisztrálás` ("Register") shortcut** (controller v0.45.0, prior task):
registering `/mnt/felhom-usb` (mounted on the host, not the guest) created the phantom empty dir on
rootfs (the `Jun 12 07:46` dir) and the `1 disconnected` probe. Registering a host-only drive should
arguably be refused until passthrough exists — a candidate guard for the slice-10 work.
### Slice-10 scope (NOT built here)
Planned flow: assign the drive → attach it as an LXC `mpN` on the guest (`reconcile/bringup.go:GuestMount`, marked
"slice 10 wires") → propagate the mount into the controller container → register. Gated per the spec; to be scoped
with this evidence in hand.
---
## PART 2 — Backups page: whole-guest backup visibility + manual trigger (v0.47.0, BUILT)
### 2C gate — quiesce ownership (confirmed before wiring)
**The CONTROLLER owns quiescing.** The `quiesce.Loop` (slice 8B) stops its app stacks → `POST /backup`
→ polls `/backup/status` → resumes; the agent's vzdump is crash-consistent only (an LXC has no
fsfreeze). So the manual trigger goes **through the loop**, never a bare `StartBackup` (which would be
crash-consistent and wouldn't stop apps). Guest 9201 is on lvm-thin → **snapshot mode**, so downtime is
limited to the window before the snapshot is taken (~10 s here), with 8B.2 early-resume.
### What shipped
- **2A agentapi:** `StatusResponse.Backup *BackupRecord`, `DueResponse.AgeSecs`, new
`RestoreTestStatus()`. Non-hollow tests (`backup_test.go`): parse the documented JSON; assert
`StartBackup` POSTs `/backup`.
- **2B Section "Rendszermentés (teljes mentés)" ("System backup (full backup)"):** read-only cards: last whole-guest backup (time +
size + **target PBS-vs-local**, from the archive volid), next-due (`/backup/due` age vs cadence),
restore-test, running phase. Agent-unreachable degrades to a note.
- **2C "Mentés most" ("Back up now"):** `quiesce.Loop` gains a mutex + `TriggerNow()` (single-flight via `TryLock` +
the persisted marker; `ErrBackupInProgress` on overlap; async, bounded by max-quiesce). New
`POST /api/guest-backup/trigger` + `GET /api/guest-backup/status` (distinct prefix from apiRouter's
app-data `/api/backup/{run,status}` — verified the collision and avoided shadowing). Button warns per
mode.
- **2D:** existing per-app DB-dump UI relabeled under an "Alkalmazás-mentések (adatbázis + konfiguráció)"
("Application backups (database + configuration)") divider, distinct from the whole-guest tier.
- **2E config:** OUT OF SCOPE (hub-served policy, slice 10) — no agent config surface added.
### Live validation (guest 9201, against the agent API — not REPORT)
- Agent `curl`: `/backup/status` → done, backup `local:backup/vzdump-lxc-9201-…`, snapshot, 1.4 GB,
success; `/backup/due` → due=false, within cadence; `/restore-test/status` → null.
- `GET /backups` → **200**; Section 1 renders "Utolsó teljes mentés" ("Last full backup"), "Helyi tároló (local)" ("Local storage"),
"Visszaállítás ellenőrizve / Még nem futott" ("Restore verified / Not yet run"), "Mentés most"; "Alkalmazás-mentések" divider present.
- **Manual trigger:** `POST /api/guest-backup/trigger` → `{started:true}`; quiesce logs show
quiesce `actualbudget` → job started → **snapshotted → early-resume (8B.2) → done**; phase polled
`snapshotted → done`; a **new backup recorded** (`…11_17_38.tar.zst`); `actualbudget` back up+healthy;
quiesce marker cleared (no stranded quiesce).
- **Single-flight:** concurrent double-trigger → one `{started:true}`, one
`{"error":"mentés már folyamatban van"}` ("a backup is already in progress") (409).
- `go test ./internal/{web,agentapi,quiesce}/` green; `go build ./...` clean.
### Deploy
Built + pushed `felhom-controller:0.47.0`; deployed to guest 9201 (healthy). Golden rebaked to 0.47.0.