# REPORT — felhom-controller v0.46.0: fix /backups 500 (stale disk-tier template fields)
**Repo:** `felhom-controller` · **Version:** 0.46.0 · **Date:** 2026-06-12
## v0.46.0 — /backups 500 fix (most recent)
**Symptom:** `GET /backups` → HTTP 500 ("Internal error").
**Diagnosis (from the live controller log, not source-guessing):**
`backups.html:64: executing "backups" at <.Backup.RepoStats>: can't evaluate field RepoStats in type
interface {}`. Not a panic / not a funcmap nil-deref. The 8C de-privileging slimmed `FullBackupStatus`
to app-data-only (DB dumps + Docker-volume tars); the disk-tier restic/cross-drive backup moved to the
host agent. But `backups.html` still carried the pre-8C restic UI, referencing `.Backup.X` **struct
fields that no longer exist**: `RepoStats, LastBackup, ResticSchedule, NextBackup, PruneSchedule,
Retention, SnapshotHistory, LastCheckTime, LastCheckOK`. While those fields existed-but-nil, the
`{{if .Backup.X}}` guards short-circuited; once removed from the struct, the field access itself errors.
(Root-level keys such as `.PerDriveRepoStats`/`.Tier2DriveGroups`/`.ResticPassword` are map lookups,
which yield nil on a miss and are therefore safe; the `Tier1*/Tier2*` fields live on `AppBackupRows`,
which the handler still supplies. So the user's observation that the "Tier2/cross-drive fields are
consistent" was correct; only the `.Backup.*` restic stats broke.)
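
The difference between the two kinds of lookup can be reproduced in isolation. The following is a minimal sketch, not the controller's code: `FullBackupStatus` here is a hypothetical one-field stand-in, `render` and `pageData` are illustrative helpers, and it assumes the handler passes a `map[string]any` as the template root (which the `interface {}` in the logged error suggests). It uses `text/template`, whose execution engine `html/template` shares.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// FullBackupStatus is a hypothetical stand-in for the slimmed struct:
// only an app-data field remains, RepoStats is gone.
type FullBackupStatus struct {
	LastDBDump string
}

// render parses and executes src against data, returning the output or the error.
func render(src string, data any) (string, error) {
	t, err := template.New("backups").Parse(src)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// pageData mimics a handler that passes a root map holding the struct pointer.
func pageData() map[string]any {
	return map[string]any{"Backup": &FullBackupStatus{LastDBDump: "2026-06-12"}}
}

func main() {
	// Missing root-level map key: evaluates to "no value", so the guard is false.
	out, err := render(`{{if .PerDriveRepoStats}}stats{{else}}no stats{{end}}`, pageData())
	fmt.Printf("map miss:    %q, err=%v\n", out, err)

	// Missing struct field: the access itself fails at execution time,
	// even though it sits inside an {{if}} guard.
	_, err = render(`{{if .Backup.RepoStats}}stats{{end}}`, pageData())
	fmt.Println("struct miss:", err)

	// A field that still exists renders normally.
	out, err = render(`{{.Backup.LastDBDump}}`, pageData())
	fmt.Printf("live field:  %q, err=%v\n", out, err)
}
```

Both failing and passing templates parse cleanly; the struct-field case only fails once the template is executed against real data.
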
**Fix (template-only, no Go change — the struct was already correct):** removed the dead disk-tier UI
from `backups.html`, keeping the app-data backup view — storage overview (DB dumps), DB-dump status card
+ schedule + table, per-app backup rows (Tier1/Tier2 via `AppBackupRows`), restore. Removed: the restic
"Mentési tároló"/"Tároló méret"/snapshot-history/per-drive-repo-stats/integrity/retention blocks, and
the restic schedule rows. The status card + schedule summary now key on `.Backup.LastDBDump`.
**Validation (guest 9201):** built + deployed `felhom-controller:0.46.0`; `GET /backups` → **HTTP 200**,
renders all sections (Tárhely áttekintés / Adatbázisok / Ütemezés), no template error in logs.
`TestTemplatesParse` + `TestSortDisksForView` green; golden rebaked to 0.46.0.
`settings.html`'s `.ResticSchedule`/`.LastCheckTime` are unaffected (root-map lookups, nil-safe; the
latter is the self-update checker, not backup).
---
## v0.45.0 — storage UX polish (order, init filter, register shortcut, clarity) · pairs with **felhom-agent v0.24.0**
Part B of the storage-fixes spec — controller-side ordering/filter/clarity polish on top of v0.44.0's
role-aware drive management. (Part A, the eject role-gate, lives at the agent — see felhom-agent
v0.24.0 / its REPORT.)
## What changed
### B1 — deterministic disk order (`internal/web/agent_disk_handlers.go`)
`agentDisksListHandler` sorts the agent's drive list server-side before rendering (`sortDisksForView`):
**user-data → system → backup** (then unrecognized), alphabetical by storage name within each tier.
The agent's storage view iterates an unordered Go map, so the list previously reordered on every reload
(CLAUDE.md lesson #3). A stable Go-side contract beats relying on map order or template JS.
Test: `TestSortDisksForView`.
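
The ordering rule above can be sketched as a tier rank plus a name tiebreak. This is an illustration under assumptions, not the body of `sortDisksForView`: the `disk` struct, its field names, and the `tierRank`/`sortDisks` helpers are hypothetical, and the real agent payload may be shaped differently.

```go
package main

import (
	"fmt"
	"sort"
)

// disk is a hypothetical view model; the real agent payload may differ.
type disk struct {
	Storage string // storage name, e.g. "local-lvm"
	Role    string // agent-reported role: "user-data", "system", "backup"
}

// tierRank maps a role to its display tier; unrecognized roles sort last.
func tierRank(role string) int {
	switch role {
	case "user-data":
		return 0
	case "system":
		return 1
	case "backup":
		return 2
	default:
		return 3
	}
}

// sortDisks orders disks by tier, then alphabetically by storage name.
func sortDisks(disks []disk) {
	sort.SliceStable(disks, func(i, j int) bool {
		ri, rj := tierRank(disks[i].Role), tierRank(disks[j].Role)
		if ri != rj {
			return ri < rj
		}
		return disks[i].Storage < disks[j].Storage
	})
}

func main() {
	// Input order is arbitrary, as it would be when ranging over a Go map.
	disks := []disk{
		{"felhom-pbs", "backup"},
		{"local-lvm", "system"},
		{"felhom-usb", "user-data"},
		{"local", "system"},
	}
	sortDisks(disks)
	for _, d := range disks {
		fmt.Printf("%s(%s)\n", d.Storage, d.Role)
	}
}
```

Because the comparison depends only on role and name, any input permutation yields the same output order, which is what makes the list stable across reloads.
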
### B2 — init wizard excludes mounted drives (`templates/storage_init.html`)
The `formattable` filter gained `&& !d.mount_path` (matching the attach wizard): an already-mounted
drive (e.g. `felhom-usb`) no longer appears as an "initialize" candidate. Eject it first to make it an
init target.
### B3 — register shortcut for a mounted-but-unregistered user-data drive
- New `POST /api/storage/register` → `handleStorageRegister` → reuses `registerStoragePath` (the
manual-add path): records the existing mount into the `StoragePath` registry (no format, no eject),
then FileBrowser-syncs. `AddStoragePath` dedupes (clean error on double-register).
- `settings.html`: a mounted, **unregistered** user-data drive now shows **Regisztrálás** as its
PRIMARY per-card action; Leválasztás/Törlés stay secondary.
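
The register flow (record an existing mount, reject a duplicate cleanly) can be sketched as below. Everything here is an assumption for illustration: the in-memory `registry`, the `registerHandler` and `post` helpers, and the choice of 409 for a double-register are not taken from the controller. Only the route, the `where` request field, and the `registered` response key come from this report. The FileBrowser sync step is omitted.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// errAlreadyRegistered is a hypothetical sentinel for the dedupe case.
var errAlreadyRegistered = errors.New("storage path already registered")

// registry is a hypothetical in-memory stand-in for the StoragePath registry.
type registry struct {
	mu    sync.Mutex
	paths map[string]bool
}

// add records an existing mount path; it never formats or ejects anything.
func (r *registry) add(path string) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.paths == nil {
		r.paths = make(map[string]bool)
	}
	if r.paths[path] {
		return errAlreadyRegistered
	}
	r.paths[path] = true
	return nil
}

// registerHandler accepts POST {"where": "<mount path>"}.
func registerHandler(reg *registry) http.HandlerFunc {
	return func(w http.ResponseWriter, req *http.Request) {
		if req.Method != http.MethodPost {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		var body struct {
			Where string `json:"where"`
		}
		if err := json.NewDecoder(req.Body).Decode(&body); err != nil || body.Where == "" {
			http.Error(w, `missing or invalid "where"`, http.StatusBadRequest)
			return
		}
		if err := reg.add(body.Where); err != nil {
			// Assumed mapping: a duplicate becomes a 409 rather than a 500.
			http.Error(w, err.Error(), http.StatusConflict)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(map[string]bool{"registered": true})
	}
}

// post sends one register request to h and returns the HTTP status code.
func post(h http.Handler, where string) int {
	payload, _ := json.Marshal(map[string]string{"where": where})
	req := httptest.NewRequest(http.MethodPost, "/api/storage/register", bytes.NewReader(payload))
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := registerHandler(&registry{})
	fmt.Println(post(h, "/mnt/felhom-usb")) // first call registers the mount
	fmt.Println(post(h, "/mnt/felhom-usb")) // second call hits the dedupe path
}
```

Keeping the dedupe check inside the registry rather than the handler means every caller of the shared path gets the same behavior, which mirrors the report's note that the shortcut reuses the manual-add path.
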
### B4 — system-storage clarity (presentation only, `settings.html` + `style.css`)
`local` and `local-lvm` are both kept (not collapsed). Each card now carries:
- a plain-Hungarian **purpose description** keyed on the agent's `type`/`role` (`purposeDesc`):
local-lvm → internal SSD (system, Docker, app **databases**); local → host storage, **no app data**;
pbs → backups; user-data → external store for large app files.
- an **app-backing tag** (`appBackingTag`): `local-lvm` → "Alkalmazás-rendszer"; user-data →
"Alkalmazás-adatok".
- a one-line **tiering note** above the list answering "which storage do the apps use?".
Role/type stay authoritative from the agent — no agent contract change.
### B5 — eject confirmation names affected apps
Already in place before this release: `confirmEject` → the type-to-confirm modal fetches `/api/storage/impact`
and lists, by name, the deployed apps that lose their storage (parity with the wipe warning). Verified,
no code change needed.
## Build / deploy
- Built `felhom-controller:0.45.0` on the build server (`build.sh 0.45.0 --push`).
- Deployed to **guest 9201** on `felhom-pve`: `docker pull`, updated `/etc/felhom-controller-image`,
re-ran `felhom-controller-bootstrap.sh` → container recreated on `0.45.0`, **healthy**;
`[selfupdate] Current version 0.45.0 is up to date`.
- CHANGELOG (newest on top) + `controller/README.md` (Storage Management section) updated.
## Live validation (guest 9201, auth disabled on this demo → API reachable)
- **B1 order:** `GET /api/disks` returns `[felhom-usb(user-data), local(system), local-lvm(system),
felhom-pbs(backup)]` — **stable across 3 reloads** (user-data first, system alpha, backup last).
- **B2 filter:** the only user-data drive (`felhom-usb`) has a `mount_path`, so the new filter excludes
it from init candidates (confirmed by the live disk data).
- **B3 register:** `POST /api/storage/register {where:/mnt/felhom-usb}` → `{registered:true}`;
`settings.json` now lists `/mnt/felhom-usb` (label "Tárhely (felhom-usb)", schedulable);
`FileBrowser mounts synced — 1 storage path(s)`.
- **B5 impact:** `GET /api/storage/impact?where=/mnt/felhom-usb` → `{apps:[], where:…}` (valid wiring;
no deployed app currently routes data there).
- **B4 clarity:** pure presentation rendered client-side from the agent's role/type (templates
parse-tested; the disk data carries the keys the JS branches on). Visual confirmation is a browser
check.
- `go test ./...` green for the whole module.
## Golden rebake — DONE
- Rebaked on `felhom-pve` (the gitea registry allows anonymous pull — no creds needed):
`bash /root/build-golden.sh 9100 local:vztmpl/debian-13-standard_13.1-2_amd64.tar.zst local-lvm local
vmbr0 gitea.dooplex.hu/admin/felhom-controller:0.45.0`.
- Verified the build guest baked `/etc/felhom-controller-image` = `…/felhom-controller:0.45.0` and the
docker image `…/felhom-controller:0.45.0` before archiving.
- Golden archive: `local:backup/vzdump-lxc-9100-2026_06_12-09_53_03.tar.zst` (876 MB). Transient build
guest 9100 stopped + `pct destroy 9100 --purge`'d. New provisions now bring up controller 0.45.0
directly (self-update still covers any later drift).