# REPORT — felhom-controller v0.55.0 (Phase 3: auto off-drive Tier 2)

Tier 2 makes an **off-drive copy** of each HDD app's recovery unit and bulk userdata to a **different
physical disk**: the only off-drive protection browsable HDD userdata can get (PBS can't reach bind
mounts). It is auto-enabled and auto-targeted, and, crucially, it **refuses rather than fills** the small guest
rootfs. Built, unit-tested, shipped, deployed, and live-validated on guest 9201. (Phases 1/2/2b shipped
as v0.52–0.54; see git history.)

## What shipped
- **Engine** (`internal/backup/tier2.go`, `RunTier2`/`RunAllTier2`): rsync `-a --delete` mirror of the
  recovery unit (`backups/primary/<app>/`) and the app's `appdata/<app>/` →
  `<target>/backups/secondary/<app>/`. restic is **not** revived; the copy is a plain, browsable mirror.
- **Auto target selection:** prefer another registered user-data drive on a **different physical disk**
  (can hold bulk userdata); else the internal SSD for **small units only**. Off-disk placement is enforced by
  `system.SamePhysicalDevice` (block-device identity; a new exported helper with a linux implementation
  and a non-linux stub) and re-checked before the copy (defense in depth).
- **Rootfs-headroom guard (the safety):** the SSD target is the ~8 GB guest rootfs, so a size-aware
  guard (`tier2FitsHeadroom`, unit-tested) **refuses** unless the unit fits while leaving a reserve free
  (`max(2 GB, 20% of total)`). When nothing fits, it records an **honest** "needs a 2nd HDD" status:
  it never silently no-ops and never endangers the rootfs.
- **Status + UI:** results persist via the surviving `settings.CrossDriveBackup`. `buildAppBackupRows`
  now **populates** the "2. mentés" ("2nd backup") card with the real target on success ("belső SSD
  (csak DB/konfiguráció)", i.e. "internal SSD (DB/config only)", vs an external drive), or with the
  honest no-target reason. Notifications go via the surviving `NotifyCrossDrive{Completed,Failed}` hooks.
- **Scheduling + trigger:** daily `tier2-backup` (03:30, after the DB dump); manual `POST /api/backup/tier2`.
- Updated a stale pre-existing test (`TestBackupCopiesOnPath`, which still used the old
  `felhom-data/backups/secondary` layout) to the Model-A in-guest layout that Tier 2 actually uses.
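
The target choice and the headroom rule in the list above can be sketched in Go as follows. This is a minimal illustration, not the code in `internal/backup/tier2.go`: the names `target`, `pickTarget`, `fitsHeadroom`, and `reserveBytes` are hypothetical, "GB" is treated as GiB, the guard is assumed to compare the unit size against free space minus the reserve, and free-space checks on an external drive are left out.

```go
package main

import "fmt"

// gb treats "GB" as GiB; the report does not say which unit the real guard uses.
const gb uint64 = 1 << 30

// reserveBytes returns the space to keep free on a target: max(2 GB, 20% of total).
func reserveBytes(total uint64) uint64 {
	reserve := total / 5
	if reserve < 2*gb {
		reserve = 2 * gb
	}
	return reserve
}

// fitsHeadroom reports whether copying unit bytes still leaves the reserve free.
// The comparison avoids computing free-unit, which could underflow for unsigned values.
func fitsHeadroom(unit, free, total uint64) bool {
	reserve := reserveBytes(total)
	return free >= reserve && unit <= free-reserve
}

// target is a hypothetical description of one candidate destination.
type target struct {
	Name             string
	IsRootfs         bool // true for the internal SSD (guest rootfs)
	SameDiskAsSource bool // result of a physical-device check against the source
	Free, Total      uint64
}

// pickTarget prefers a drive on a different physical disk, falls back to the
// rootfs SSD only when the unit fits the headroom, and otherwise returns a reason.
func pickTarget(unit uint64, candidates []target) (name, reason string) {
	for _, c := range candidates {
		if !c.IsRootfs && !c.SameDiskAsSource {
			return c.Name, "" // free-space check on the external drive omitted here
		}
	}
	for _, c := range candidates {
		if c.IsRootfs && !c.SameDiskAsSource && fitsHeadroom(unit, c.Free, c.Total) {
			return c.Name, ""
		}
	}
	return "", "needs a 2nd HDD"
}

func main() {
	ssd := target{Name: "internal SSD", IsRootfs: true, Free: 23 * gb / 10, Total: 8 * gb}
	for _, unit := range []uint64{77 * 1024, 1 * gb} {
		name, reason := pickTarget(unit, []target{ssd})
		fmt.Printf("unit=%d bytes target=%q reason=%q\n", unit, name, reason)
	}
}
```

With 2.3 GB free on an 8 GB rootfs, the sketch selects the SSD for a roughly 77 KB unit and returns the "needs a 2nd HDD" reason for a 1 GB unit, because the 2 GB floor is the effective reserve at that size.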

## Live validation (guest 9201)
- **Happy path:** triggered Tier 2 → *"Tier 2 copied romm → /mnt/sys_drive/felhom-data/backups/secondary/romm
  (77.1 KB) [SSD: DB/config only]"*. The recovery unit landed on the SSD, **off** the felhom-usb
  source (block devices 2065 vs 64518, so off-disk is confirmed). The SSD was auto-picked because no 2nd
  drive is present.
- **Refuse path (rootfs-headroom guard):** placed a 1 GB userdata dummy (SSD had 2.3 GB free) → Tier 2
  **refused**: *"nincs elég hely a belső SSD-n — a nagy fájlok off-drive mentéséhez 2. meghajtó (vagy
  távoli tárhely) szükséges"* ("not enough space on the internal SSD; off-drive backup of large files
  needs a 2nd drive (or remote storage)"), and did **not** copy the 1 GB to the rootfs. After the dummy
  was removed, a re-trigger restored the successful small-unit copy.
- **UI end-to-end:** the backups page "2. mentés" card renders *Sikeres → belső SSD (csak
  DB/konfiguráció)* ("Successful → internal SSD (DB/config only)") for RomM.
- Demo left clean (dummy removed; RomM's intended small Tier 2 copy remains on the SSD).
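
The refusal of the 1 GB dummy with 2.3 GB free follows from the reserve rule. A worked check, assuming the rootfs total is the approximate 8 GB stated under "What shipped" and that the guard requires free space minus the unit size to stay at or above the reserve (the exact comparison inside `tier2FitsHeadroom` is not shown in this report):

```latex
\begin{aligned}
\text{reserve} &= \max(2\,\text{GB},\ 0.20 \times 8\,\text{GB}) = \max(2\,\text{GB},\ 1.6\,\text{GB}) = 2\,\text{GB} \\
\text{free} - \text{unit} &= 2.3\,\text{GB} - 1\,\text{GB} = 1.3\,\text{GB} < 2\,\text{GB} \quad\Rightarrow\quad \text{refuse} \\
\text{free} - \text{unit} &\approx 2.3\,\text{GB} - 77.1\,\text{KB} \approx 2.3\,\text{GB} \ge 2\,\text{GB} \quad\Rightarrow\quad \text{copy}
\end{aligned}
```

Under these assumptions, the SSD can accept only about 0.3 GB of Tier 2 data before the guard starts refusing.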

## Notes / follow-ups
- **Off-disk identity** uses block-device (`Stat_t.Dev`) equality, which is correct for the felhom layout
  (external drive vs system rootfs). Two partitions on one physical disk would look "different"; the
  agent's `DiskInfo.DurableID` is the stronger guarantee for that case (future hardening).
- Non-HDD apps (data on the rootfs, already in PBS) are skipped by Tier 2; their "2. mentés" card shows
  "Nincs 2." ("No 2nd"). Cosmetically, the card could be hidden for non-HDD apps (Phase 4 polish).
- The single-drive demo can only run Tier 2 against the SSD (small units); a 2nd HDD would let bulk
  userdata copy off-drive, and the engine already prefers it when present.

## Still ahead
Phase 4: FileBrowser scoping (hide recovery units), deploy-UI "DB runs on the fast internal drive" note,
monitoring storage-bar sort + descriptions. The README backup-paths section's stale restic/secondary
text should be rewritten alongside.