admin 88ca1178ae docs: Phase 3 off-drive Tier 2 — REPORT/CONTEXT/README for v0.55.0
REPORT (Tier 2 engine + rootfs-headroom guard + live validation: happy path RomM->SSD
off felhom-usb, refuse path 1G dummy -> honest "needs 2nd HDD", UI card). CONTEXT entry.
README Tier 2 subsection.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 13:29:33 +02:00


REPORT — felhom-controller v0.55.0 (Phase 3: auto off-drive Tier 2)

Tier 2 makes an off-drive copy of each HDD app's recovery unit and bulk userdata on a different physical disk. This is the only off-drive protection that browsable HDD userdata can get, because PBS can't reach bind mounts. It is auto-enabled and auto-targeted, and, crucially, it refuses rather than fills the small guest rootfs. Built, unit-tested, shipped, deployed, and live-validated on guest 9201. (Phases 1/2/2b shipped as v0.52–0.54; see git history.)

What shipped

  • Engine (internal/backup/tier2.go, RunTier2/RunAllTier2): an rsync -a --delete mirror of the recovery unit (backups/primary/<app>/) and the app's appdata/<app>/ into <target>/backups/secondary/<app>/. restic is not revived; the result is a plain, browsable mirror.
  • Auto target selection: prefer another registered user-data drive on a different physical disk (can hold bulk userdata); else the internal SSD for small units only. Off-disk enforced by system.SamePhysicalDevice (block-device identity — new exported helper, linux + non-linux stub), re-checked before the copy (defense in depth).
  • Rootfs-headroom guard (the safety): the SSD target is the ~8 GB guest rootfs, so a size-aware guard (tier2FitsHeadroom, unit-tested) refuses unless the unit fits leaving a reserve free (max(2 GB, 20% of total)). When nothing fits, it records an honest "needs a 2nd HDD" status — never silently no-ops, never endangers the rootfs.
  • Status + UI: results persist via the surviving settings.CrossDriveBackup. buildAppBackupRows now populates the "2. mentés" ("2nd backup") card with the real target on success (either "belső SSD (csak DB/konfiguráció)", i.e. "internal SSD (DB/configuration only)", or an external drive), or with the honest no-target reason. Notifications go through the surviving NotifyCrossDrive{Completed,Failed} hooks.
  • Scheduling + trigger: daily tier2-backup (03:30, after the DB dump); manual POST /api/backup/tier2.
  • Updated a stale pre-existing test (TestBackupCopiesOnPath, which still used the old felhom-data/backups/secondary layout) to use the Model-A in-guest layout that Tier 2 actually uses.
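
The rootfs-headroom rule above (refuse unless the unit fits while leaving max(2 GB, 20% of total) free) can be sketched roughly as follows. This is an illustration, not the project's tier2FitsHeadroom: the names reserveBytes and fitsHeadroom, the byte-based signature, and the use of binary units (GiB) are all assumptions.

```go
package main

import "fmt"

// gib is one gibibyte. Whether the real guard uses binary or decimal
// units is an assumption of this sketch.
const gib = uint64(1) << 30

// reserveBytes returns the space that must stay free on the target:
// the larger of 2 GiB and 20% of the filesystem's total size.
func reserveBytes(totalBytes uint64) uint64 {
	fifth := totalBytes / 5
	if fifth > 2*gib {
		return fifth
	}
	return 2 * gib
}

// fitsHeadroom reports whether a unit of unitBytes can be copied onto a
// filesystem with freeBytes available and still leave the reserve free.
// (Hypothetical helper; the real guard's signature is not shown in the report.)
func fitsHeadroom(unitBytes, freeBytes, totalBytes uint64) bool {
	reserve := reserveBytes(totalBytes)
	if freeBytes < reserve {
		// Already below the reserve: nothing may be copied.
		return false
	}
	return unitBytes <= freeBytes-reserve
}

func main() {
	total := 8 * gib // an 8 GiB rootfs, so the reserve is 2 GiB
	free := 3 * gib  // illustrative free space, not a measured value

	fmt.Println(fitsHeadroom(100<<10, free, total)) // 100 KiB unit
	fmt.Println(fitsHeadroom(2*gib, free, total))   // 2 GiB unit
}
```

Checking `freeBytes < reserve` before subtracting avoids unsigned underflow when the target is already below its reserve, which is exactly the case where the guard most needs to refuse.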

Live validation (guest 9201)

  • Happy path: triggered Tier 2 → "Tier 2 copied romm → /mnt/sys_drive/felhom-data/backups/secondary/romm (77.1 KB) [SSD: DB/config only]". The recovery unit landed on the SSD, off the felhom-usb source (block devices 2065 vs 64518, so off-disk is confirmed). With no 2nd drive present, the SSD was picked automatically.
  • Refuse path (rootfs-headroom guard): placed a 1 GB userdata dummy (SSD had 2.3 GB free) → Tier 2 refused with "nincs elég hely a belső SSD-n — a nagy fájlok off-drive mentéséhez 2. meghajtó (vagy távoli tárhely) szükséges" ("not enough space on the internal SSD; off-drive backup of large files needs a 2nd drive (or remote storage)"), and did not copy the 1 GB to the rootfs. After the dummy was removed, a re-trigger restored the successful small-unit copy.
  • UI end-to-end: the backups page "2. mentés" card renders Sikeres ("Successful") → belső SSD (csak DB/konfiguráció) for RomM.
  • Demo left clean (dummy removed; RomM's intended small Tier 2 copy remains on the SSD).
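
The refuse-path result is consistent with the reserve rule max(2 GB, 20% of total). As a rough worked check, assuming the rootfs total is exactly 8 GB (the report only says ~8 GB) and that the guard compares the free space left after the copy against the reserve:

$$
\begin{aligned}
\text{reserve} &= \max(2\ \text{GB},\ 0.20 \times 8\ \text{GB}) = \max(2,\ 1.6)\ \text{GB} = 2\ \text{GB} \\
\text{1 GB dummy:}\quad 2.3\ \text{GB} - 1\ \text{GB} &= 1.3\ \text{GB} < 2\ \text{GB} \;\Rightarrow\; \text{refuse} \\
\text{77.1 KB unit:}\quad 2.3\ \text{GB} - 0.0000771\ \text{GB} &\approx 2.3\ \text{GB} > 2\ \text{GB} \;\Rightarrow\; \text{copy}
\end{aligned}
$$

Under these assumptions the SSD had only about 0.3 GB of usable headroom, which is why the small recovery unit was accepted and the 1 GB dummy was not.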

Notes / follow-ups

  • Off-disk identity uses block-device (Stat_t.Dev) equality — correct for the felhom layout (external drive vs system rootfs). Two partitions on one physical disk would look "different"; the agent's DiskInfo.DurableID is the stronger guarantee for that case (future hardening).
  • Non-HDD apps (data on the rootfs, already in PBS) are skipped by Tier 2; their "2. mentés" card shows "Nincs 2." ("No 2nd"). Cosmetically, the card could be hidden for non-HDD apps (Phase 4 polish).
  • The single-drive demo can only Tier 2 to the SSD (small units); a 2nd HDD would let bulk userdata copy off-drive — the engine already prefers it when present.
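
The block-device identity check from the first note can be sketched as below. This is an illustration, not the project's system.SamePhysicalDevice: the names deviceID and sameDevice are hypothetical, and the sketch assumes a Unix-like system where os.FileInfo.Sys() returns a *syscall.Stat_t.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// deviceID returns the st_dev value of the filesystem that holds path.
// (Hypothetical helper for illustration.)
func deviceID(path string) (uint64, error) {
	fi, err := os.Stat(path)
	if err != nil {
		return 0, err
	}
	st, ok := fi.Sys().(*syscall.Stat_t)
	if !ok {
		return 0, fmt.Errorf("no Stat_t available for %s", path)
	}
	// Dev has a platform-dependent integer type, so normalize it.
	return uint64(st.Dev), nil
}

// sameDevice reports whether two paths live on the same block device,
// judged purely by Stat_t.Dev equality.
func sameDevice(a, b string) (bool, error) {
	da, err := deviceID(a)
	if err != nil {
		return false, err
	}
	db, err := deviceID(b)
	if err != nil {
		return false, err
	}
	return da == db, nil
}

func main() {
	// Compare the root filesystem with the temp directory; the result
	// depends on how the machine is partitioned.
	same, err := sameDevice("/", os.TempDir())
	fmt.Println(same, err)
}
```

As the note says, this comparison treats two partitions on one physical disk as different devices, so it proves "different filesystem" rather than "different disk"; a durable per-disk identifier would be needed for the stronger guarantee.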

Still ahead

Phase 4: FileBrowser scoping (hide recovery units), deploy-UI "DB runs on the fast internal drive" note, monitoring storage-bar sort + descriptions. The README backup-paths section's stale restic/secondary text should be rewritten alongside.