diff --git a/CONTEXT.md b/CONTEXT.md
index 89257eb..3ba67bf 100644
--- a/CONTEXT.md
+++ b/CONTEXT.md
@@ -13,6 +13,19 @@ Last updated: 2026-06-12 (storage UX polish)
 > is tracked in `CHANGELOG.md`, `controller/README.md`, and the auto-memory `MEMORY.md`. Live version:
 > **v0.45.0**.
 >
+> **2026-06-13 — v0.57.0 UI fixes (Part A of the UI-fixes/storage-spike spec):**
+> - A1: fixed the RIGHT storage list — `#host-storage-bars` (the JS-filled, agent-PVE-storage list:
+>   `local`/`local-lvm`/`felhom-pbs`/`felhom-usb`), which reordered on every poll. Now
+>   `enrichHostStorageTargets` sorts `/api/host-metrics` server-side + adds friendly Hungarian
+>   labels/purpose. Display-only — PVE storage ids never renamed. (v0.56.0's 4C had sorted the OTHER,
+>   server-rendered user-data list.)
+> - A2: per-app Tier-2 config panel at `GET/POST /stacks/{name}/backup`; the dead-end "Beállítás" button
+>   (was → deploy page) is repointed there. Pin a target drive / toggle Tier 2 off; prefs
+>   (`UserDisabled`/`PreferredTarget`) persist on `CrossDriveBackup` and survive the runner's status
+>   writes (`withTier2Prefs`). Always visible incl. single-SSD + non-HDD (PBS-context) apps.
+> - Part B (storage OS/data split spike) = build-nothing; findings → `felhom-agent/REPORT-storage-split-spike.md`.
+> - Live-validated on guest 9201; build/deploy = golden bootstrap (`/etc/felhom-controller-image` + restart `felhom-controller-bootstrap.service`).
+>
 > **2026-06-13 — v0.56.0 Phase 4: FileBrowser scoping + UI polish (SLICE COMPLETE):**
 > - 4A: FileBrowser bind scoped to `/appdata` (recovery units + Tier 2 copies under `backups/`
 >   NOT mounted → customer can't browse/delete the restore source). 4B: deploy storage step states
diff --git a/REPORT.md b/REPORT.md
index d1b0ef8..22564f9 100644
--- a/REPORT.md
+++ b/REPORT.md
@@ -1,46 +1,63 @@
-# REPORT — felhom-controller v0.56.0 (Phase 4: FileBrowser scoping + UI polish) — SLICE COMPLETE
+# REPORT — felhom-controller v0.57.0 (UI fixes: stable host-storage list + Tier-2 config panel)
 
-Phase 4 closes the per-app-recovery-unit / Tier-2 slice. Built, deployed, and live-validated on guest
-9201. With this, **all five phases of the spec (1, 2, 2b, 3, 4) are shipped and live-validated** — see
-git history (v0.52 → v0.56).
+Part A of the "UI fixes (Part A) + storage-restructure spike (Part B)" spec. Part A ships a single
+controller version bump (v0.57.0); Part B builds nothing (findings report — see below). Built,
+deployed, and live-validated on guest 9201.
 
-## Phase 4 — what shipped (v0.56.0)
-- **4A FileBrowser scoping (safety):** the FileBrowser bind mount is scoped to each drive's `appdata/`
-  subtree (`/appdata:/srv/`) instead of the whole drive root. The recovery units + Tier 2
-  copies under `backups/` are **not mounted into FileBrowser at all** — the customer browses their
-  userdata but cannot reach (or see) the thing that restores them. `syncFileBrowserMounts` mkdir's the
-  appdata dir before binding; it runs on controller startup, so the scoping applies immediately.
-- **4B Deploy-UI communication:** the storage-selection step states plainly (Hungarian) that the chosen
-  drive holds the app's **files**, while its **database runs on the fast internal SSD** and is backed up
-  alongside the app — so the DB-on-SSD split stops being a surprise.
-- **4C Monitoring storage list:** `buildStorageBars` sorts deterministically (by path) and carries a
-  **purpose description** for the user-data drives, rendered on the monitoring "Tárolók kapacitása" list.
-  (Correction to the spec's premise: this list is the controller's registered **user-data** drives only —
-  the agent's local/local-lvm/pbs storage is not in this registry, so the role-tier sort and
-  `local`-vs-`local-lvm` descriptions live on the agent-backed storage-management page, not here.)
+## A1 — host-storage list no longer reorders (item 2)
+The monitoring page has two storage sections. v0.56.0's 4C sorted the **server-rendered, user-data-only**
+list (`buildStorageBars`). The list the customer actually saw reordering was the **other** one:
+`#host-storage-bars`, filled client-side from the agent's PVE-storage list (`local`, `local-lvm`,
+`felhom-pbs`, `felhom-usb` with thin-pool % + temperature). The agent enumerates `pvesm` in a
+non-deterministic order and this list never passed through a Go sort, so it reshuffled on every 8 s poll.
 
-## Live validation (guest 9201)
-- **4A:** after deploy, FileBrowser's mount is `/mnt/felhom-usb/appdata -> /srv/felhom-usb`; `/srv/
-  felhom-usb` lists `romm` (userdata) and the recovery units at `/mnt/felhom-usb/backups` are outside the
-  mount — confirmed via `docker inspect`.
-- **4B:** the deploy page (nextcloud) renders "…adatbázis a gyors belső SSD-n…".
-- **4C:** the monitoring page renders "Külső adattároló — … az adatbázisok a belső SSD-n vannak."
+Fix (`internal/web/agent_host_metrics_handler.go`): `enrichHostStorageTargets` sorts the
+`/api/host-metrics` response **server-side** — user-data (`usb`/`local-dir`) → system+apps
+(`lvmthin`/`lvm`) → builtin local (`local`) → backup (`pbs`/`nfs`/`cifs`) → other; alphabetical by id
+within a tier — and attaches a **friendly Hungarian label + one-line purpose** per entry. The raw PVE id
+stays in `Name` and is rendered muted in `monitoring.html`. **Display labels only — PVE storage ids are
+never renamed** (vzdump/PBS targets reference them by name). Decision flagged for the owner: a follow-up
+could replace the raw breakdown with customer-meaningful aggregates and push the raw `local`/`local-lvm`/
+`pbs` detail to an operator (hub) view; this pass takes the smallest change that fixes the confusion
+(friendly labels + descriptions on the existing list).
 
-## Slice summary (all live-validated on guest 9201)
-- **Phase 1 (v0.52.0):** deploy-side Model-A double-nest fix (catalog templates) + deploy↔backup path
-  agreement test; RomM migrated.
-- **Phase 2 (v0.53.x):** per-app **secret-free** recovery unit (compose + secret-stripped app.yaml +
-  db-dumps + volume-dumps + manifest), idempotent capture.
-- **Phase 2b (v0.54.0):** restore-from-unit recreate + **fail-closed `data_key` gate** (proven live on
-  AdventureLog: refused when the encryption key was unrecoverable).
-- **Phase 3 (v0.55.0):** auto **off-drive Tier 2** with the **rootfs-headroom guard** (refuse-not-fill,
-  proven live).
-- **Phase 4 (v0.56.0):** FileBrowser scoping + deploy DB-on-SSD note + monitoring descriptions.
+## A2 — per-app Tier-2 config panel (item 4)
+The "2. mentés" row's **Beállítás** button linked to the app deploy page (no backup-location setting — a
+dead end). New surface: `GET/POST /stacks/{name}/backup` (`tier2_config_handler.go` +
+`templates/tier2_config.html`), wired in `server.go` behind RequireAuth + CsrfProtect; the button is
+repointed there on **every** "2. mentés" branch (configured / disconnected / inactive / unconfigured /
+disabled). The panel shows the current/effective off-drive target, whether it's the size-limited internal
+SSD, the last-run reason, and lets the customer **pin a different registered drive** or **turn Tier 2
+off**. It is **always visible**: with only the internal SSD it shows "automatikus: belső SSD — csak
+DB/konfiguráció" + the rootfs-headroom note; for a non-HDD app it shows honest "already in the PBS
+whole-guest snapshot; the off-drive copy is supplementary" context (no active control, since Tier 2 does
+not run for rootfs apps).
 
-## Known follow-ups (small, optional)
-- Off-disk identity uses block-device equality; the agent's `DiskInfo.DurableID` is stronger for the
-  same-disk-multiple-partitions case.
-- Non-HDD apps' "2. mentés" card shows "Nincs 2." (they're in PBS); could be hidden for them.
-- The README backup-paths section still has stale restic/secondary text (flagged inline) — worth a pass.
-- Full readable-data restore e2e vs AdventureLog couldn't run on the 8 GB demo rootfs (images too big);
-  the gate + recreate are unit/integration-tested and the fail-closed path is proven live.
+Persistence: two preference fields on `settings.CrossDriveBackup` — `UserDisabled` + `PreferredTarget`
+— set via `SetTier2Preference` and **preserved across the Tier-2 runner's status writes**
+(`withTier2Prefs` in `tier2.go`). `selectTier2Target` now honors a valid pinned target (registered,
+schedulable, off physical disk) before the auto-pick; an invalid pin silently falls back to auto.
+`RunTier2` skips a customer-disabled app. Saving with Tier 2 on for an HDD app triggers an immediate run.
+
+## Live validation (guest 9201, v0.57.0, public URL https://felhom.demo-felhom.eu)
+- **A1:** `/api/host-metrics` returns a stable order across repeated polls — `felhom-usb` →
+  `local-lvm` → `local` → `felhom-pbs` — each entry carrying `label` + `purpose` (e.g. `local-lvm` →
+  "Belső SSD – rendszer és alkalmazások"). Confirmed 3 consecutive polls identical.
+- **A2 (HDD app, RomM):** panel shows Tier 2 **Bekapcsolva**, effective target "belső SSD (rendszer) —
+  csak DB/konfiguráció", the SSD-only banner, the "Nincs másik adatmeghajtó" help (single-drive demo),
+  and the enabled checkbox + target dropdown. POST disable → 303 + flash → GET shows **Kikapcsolva**;
+  POST re-enable → GET shows **Bekapcsolva** and the controller log records an immediate successful
+  Tier 2 run (RomM → SSD, 77 KB, DB/config only).
+- **A2 (non-HDD app, ActualBudget):** panel shows the PBS-coverage context ("már szerepelnek a teljes
+  rendszermentésben (PBS) … nincs külön teendő"), no active form.
+- Both `/backups` "Beállítás" buttons now target `/stacks/{name}/backup`; logs clean (no errors/panics).
+
+## Tests
+`enrichHostStorageTargets` (order, labels, determinism, unknown-type fallback);
+`selectTier2Target` (honors a pin / falls back on an invalid pin); status writes preserve the
+preference. Full `go test ./...` green.
+
+## Part B — storage-restructure spike
+Build-nothing investigation; the findings report gating the provisioning spec is at
+`felhom-agent/REPORT-storage-split-spike.md` (and summarised in that repo's CHANGELOG). Nothing in the
+controller changed for Part B.
diff --git a/controller/README.md b/controller/README.md
index 0885ea1..76d387f 100644
--- a/controller/README.md
+++ b/controller/README.md
@@ -390,6 +390,18 @@ reach bind mounts). Auto-targeted: **prefer another registered user-data drive**
 `settings.CrossDriveBackup` and drives the "2. mentés" card. Runs daily (`tier2-backup`, 03:30) or via
 `POST /api/backup/tier2`. restic is **not** used — a plain browsable mirror.
 
+**Per-app Tier-2 config panel (v0.57.0)** — `GET/POST /stacks/{name}/backup`
+(`internal/web/tier2_config_handler.go` + `templates/tier2_config.html`). The "2. mentés" row's
+**Beállítás** button links here (was the dead-end deploy page). Shows the effective off-drive target
+(pinned or auto), whether it's the size-limited internal SSD, the last-run reason, and lets the customer
+**pin a registered drive** (off physical disk) or **toggle Tier 2 off**. Always visible — single-SSD apps
+get the "csak DB/konfiguráció" note, non-HDD apps the "already in the PBS whole-guest snapshot" context.
+Two preference fields on `CrossDriveBackup` — `UserDisabled` + `PreferredTarget` (set via
+`Settings.SetTier2Preference`) — are **preserved across the runner's status writes** (`withTier2Prefs`):
+`selectTier2Target` honors a valid pin before auto-picking; `RunTier2` skips a disabled app. The runner
+re-validates the pin off-disk at run time. `Manager.Tier2Info(stackName)` is the read-only panel view
+(effective target + eligible alternative drives).
+
 **Phase 1 — Database Dumps** (`internal/backup/dbdump.go`, scheduled 02:30)
 
 - **Auto-discovery** of PostgreSQL and MariaDB containers via `docker ps` + `docker inspect`
@@ -789,6 +801,15 @@ The de-privileged controller (slice 8C) sees only its own cgroup and cannot read
 
 Path: `GET /api/host-metrics` → `Client.HostMetrics()` (leaf-pinned, per-guest-token agentapi client) → agent `GET /host/metrics`. Host-wide and token-authed (assumption: **one customer per host** — the home-server model). It is a **live** fetch (a fresh agent collect, not the 15-minute hub snapshot), so the page polls it every **8 s** while open. When the agent is unconfigured/unreachable the card shows a "nem elérhető" banner; the controller's own metric charts are unaffected.
 
+**Storage-bar ordering + labels (v0.57.0):** the agent enumerates storages via `pvesm` in a
+non-deterministic order, so the per-storage capacity list (`#host-storage-bars`) reordered on every poll.
+`enrichHostStorageTargets` (`agent_host_metrics_handler.go`) sorts the response **server-side** —
+user-data (`usb`/`local-dir`) → system+apps (`lvmthin`/`lvm`) → builtin `local` → backup
+(`pbs`/`nfs`/`cifs`) → other, alphabetical by id within a tier — and attaches a friendly Hungarian
+`label` + one-line `purpose` per entry (rendered by `monitoring.html`, with the raw PVE id shown muted).
+**Display labels only — the PVE storage ids are never renamed** (vzdump/PBS configs reference them by
+name). This is distinct from the server-rendered, user-data-only `buildStorageBars` "Tárhely" list.
+
 #### Alert System (`internal/web/alerts.go`)
 
 State-based alerts displayed on all pages: