Part A of the UI-fixes/storage-spike spec.
A1: enrichHostStorageTargets sorts /api/host-metrics storage_targets
server-side and attaches friendly Hungarian labels + purpose, fixing the
#host-storage-bars reorder-on-poll bug. Display labels only — PVE storage
ids are never renamed.
A2: new GET/POST /stacks/{name}/backup Tier-2 config panel; the "2. mentés"
Beállítás button is repointed there from the dead-end deploy page. Customer
can pin a target drive or disable Tier 2; preference is preserved across the
runner's status writes. Always visible (single-SSD + non-HDD apps included).
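A minimal sketch of the A1 idea (sort server-side, attach display-only labels), not the actual enrichHostStorageTargets. The StorageTarget shape, the displayInfo table, the example ids and the example labels are all assumptions for illustration:

```go
package main

import "sort"

// StorageTarget is a hypothetical shape for one storage_targets entry.
type StorageTarget struct {
	ID      string // storage id, passed through unchanged
	Label   string // display label only
	Purpose string
}

type display struct {
	Rank    int
	Label   string
	Purpose string
}

// displayInfo is an assumed lookup table with placeholder ids and labels.
var displayInfo = map[string]display{
	"internal-ssd": {Rank: 0, Label: "Internal SSD (example)", Purpose: "apps and databases"},
	"usb-hdd-1":    {Rank: 1, Label: "Data drive (example)", Purpose: "user files"},
}

// rank orders known targets first; unknown ids sort after all known ones.
func rank(id string) int {
	if d, ok := displayInfo[id]; ok {
		return d.Rank
	}
	return len(displayInfo)
}

// enrichAndSort returns a labelled copy in a deterministic order, so the
// order no longer depends on how the upstream response happened to arrive.
func enrichAndSort(targets []StorageTarget) []StorageTarget {
	out := make([]StorageTarget, len(targets))
	copy(out, targets)
	for i := range out {
		if d, ok := displayInfo[out[i].ID]; ok {
			out[i].Label, out[i].Purpose = d.Label, d.Purpose
		} else {
			out[i].Label = out[i].ID // fall back to the raw id
		}
	}
	sort.SliceStable(out, func(a, b int) bool {
		ra, rb := rank(out[a].ID), rank(out[b].ID)
		if ra != rb {
			return ra < rb
		}
		return out[a].ID < out[b].ID // tie-break keeps the order total
	})
	return out
}
```

Because the comparison is a total order on the id, two polls that return the same targets in different orders render identically.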
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
4A: scope FileBrowser bind to <drive>/appdata (recovery units + Tier 2 copies under
backups/ are no longer mounted into FileBrowser — customer can't browse/delete the
thing that restores them). 4B: deploy storage-selection step states the chosen drive
holds files while the DB runs on the fast internal SSD + is backed up with the app.
4C: buildStorageBars stable sort + purpose description on the monitoring storage list.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Tier 2 rsync-mirrors each HDD app's recovery unit + appdata to a DIFFERENT physical
disk (the only off-drive protection bind-mounted userdata can get; PBS can't reach it).
Auto-enabled, auto-target: prefer another registered drive (different physical disk via
system.SamePhysicalDevice), else the internal SSD for SMALL units only — with a
size-aware headroom guard that REFUSES rather than fill the ~8G guest rootfs, recording
an honest "needs 2nd HDD" status. Status persisted via the surviving CrossDriveBackup;
"2. mentés" UI card now populated. Daily tier2-backup job + POST /api/backup/tier2.
- backup/tier2.go (engine+selection+headroom), tier2_test.go (headroom arithmetic)
- system.SamePhysicalDevice (linux Stat_t.Dev + stub)
- handlers.go Tier2 UI population + tier2DestLabel; backups.html honest no-target reason
- fixed stale TestBackupCopiesOnPath (old felhom-data layout -> in-guest layout)
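A rough sketch of the target selection and headroom guard described above, under stated assumptions: the Drive type, chooseTarget, the DevID comparison (standing in for system.SamePhysicalDevice) and the reserve parameter are hypothetical, and the real engine in backup/tier2.go may weigh things differently.

```go
package main

// Drive is a hypothetical view of a candidate destination.
type Drive struct {
	Path      string
	DevID     uint64 // assumed stand-in for the physical-device identity
	FreeBytes uint64
}

// fitsWithHeadroom reports whether copying unitBytes still leaves at least
// reserveBytes free. The first check avoids unsigned underflow.
func fitsWithHeadroom(unitBytes, freeBytes, reserveBytes uint64) bool {
	return unitBytes <= freeBytes && freeBytes-unitBytes >= reserveBytes
}

// chooseTarget prefers another registered drive on a different physical
// disk, then the internal SSD if the unit is small enough to keep the
// reserve, and otherwise refuses (ok == false).
func chooseTarget(src Drive, registered []Drive, ssd Drive, unitBytes, ssdReserve uint64) (path string, ok bool) {
	for _, d := range registered {
		if d.DevID != src.DevID && fitsWithHeadroom(unitBytes, d.FreeBytes, 0) {
			return d.Path, true
		}
	}
	if ssd.DevID != src.DevID && fitsWithHeadroom(unitBytes, ssd.FreeBytes, ssdReserve) {
		return ssd.Path, true
	}
	return "", false // caller records the "needs 2nd HDD" status
}
```

Refusing instead of copying is the point of the guard: a failed backup is recoverable, a full guest rootfs is not.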
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds an in-process orchestration test for RestoreFromRecoveryUnit: success path
calls recreate with non-secret env + recovered secrets merged; data-key-missing
path is REFUSED and recreate is never called. Makes Manager.isDebug nil-safe
(behavior-neutral in prod; cfg is always set) so the gate/orchestration are testable.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Restore recreates an app from its on-drive unit + the guest's own secrets,
regenerating nothing. reconcileRestoreSecrets (pure, unit-tested) merges the unit's
non-secret env with secrets recovered from the live app.yaml and FAILS CLOSED if a
data-encrypting key is unrecoverable (refuse — a PBS whole-guest restore is needed —
rather than regenerate and corrupt). Resettable secrets missing → warn + proceed.
- backup: RestoreFromRecoveryUnit (manifest -> recover secrets -> gate -> restore
volumes -> recreate definition + redeploy w/ re-pull); falls back to volume-only.
- seams: RecoverStackSecrets/RecreateStackFromUnit (adapter +encKey),
stacks.RedeployFromEnv. Wired into /backup/restore.
- tests: gate (refuse/proceed/verbatim) + data_key parsing.
Gate + reconcile + data_key parsing unit-tested; capture live-validated (v0.53.1).
Full readable-data e2e vs AdventureLog needs the auth-gated dashboard restore — pending.
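The fail-closed gate can be sketched roughly as below. This is an illustration, not the real reconcileRestoreSecrets: the SecretSpec type, the DataKey flag, the function signature and the error value are assumptions.

```go
package main

import "errors"

// ErrDataKeyMissing is a hypothetical sentinel for the refuse path.
var ErrDataKeyMissing = errors.New("data-encrypting key unrecoverable: whole-guest restore needed")

// SecretSpec is an assumed description of one secret the app needs.
type SecretSpec struct {
	Name    string
	DataKey bool // true if existing data is encrypted with this secret
}

// reconcile merges the unit's non-secret env with recovered secrets.
// Recovered values are used verbatim; nothing is regenerated here.
func reconcile(unitEnv map[string]string, specs []SecretSpec, recovered map[string]string) (map[string]string, []string, error) {
	env := make(map[string]string, len(unitEnv)+len(specs))
	for k, v := range unitEnv {
		env[k] = v
	}
	var warnings []string
	for _, s := range specs {
		v, ok := recovered[s.Name]
		switch {
		case ok:
			env[s.Name] = v
		case s.DataKey:
			// Fail closed: a fresh key would make existing data unreadable.
			return nil, nil, ErrDataKeyMissing
		default:
			// Resettable secret: warn and proceed.
			warnings = append(warnings, s.Name)
		}
	}
	return env, warnings, nil
}
```

Keeping the function pure (maps in, maps out) is what makes the refuse/proceed/verbatim cases cheap to unit-test.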
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
CaptureRecoveryUnit now builds content in memory and skips writes when the unit
is already current (checksum + dump-set + version), so it can run from RefreshCache
(startup + every 5m) without thrashing the USB drive. Units now exist shortly after
startup and track config changes without waiting for the daily DB dump. +idempotency test.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
REPORT.md overwritten with the Phase-1 gate run (catalog template fix + agreement
test + live RomM migration on guest 9201, gate PASSED). CONTEXT.md dated entry.
README HDD_PATH/felhom-data convention note corrected for Model-A single-nesting.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The deploy-side double-nest fix lives in the app catalog (templates dropped the
extra felhom-data segment). This adds the controller-side invariant test that
ties the deploy path (ParseComposeHDDMounts) to the backup path
(AppDataDir/NamespaceRoot) so they can't drift again, plus the v0.52.0 CHANGELOG.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Backups page: whole-guest backup shown as real DR — target label "Biztonsági szerver –
külön hardver (PBS)"; app-data "Távoli mentés" card now reflects the PBS offsite tier
(guestBackupView.Offsite) instead of "nincs beállítva".
- Model-A double-nest fix: appbackup path helpers take a felhom-data NAMESPACE ROOT (no
internal felhom-data join); backup.Manager.namespaceRoot/AppNamespaceRoot resolve
HDD-vs-systemDataPath provenance so a drive-resident app's backups land single-nested
(<drive>/backups/... on the guest = <drive>/felhom-data/backups/... on the host) instead
of .../felhom-data/felhom-data/.... Writes, deletion (GetStackBackupData/RemoveStack/
ProtectedHDDPaths), wipe-warning scan, and export updated coherently; legacy double-nest
dirs kept protected. New appbackup test asserts no doubled segment.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
4A: user-data drives are backup-target-eligible (not role-locked) — surfaced in
the drive purpose note. 4B: handleStorageImpact returns backup_copies (apps whose
cross-drive backups live on the drive, via backupCopiesOnPath); the wipe/eject
modal warns they'd be destroyed (stays customer-confirmable — copies redundant).
Cross-drive backup engine remains out of scope. Test: TestBackupCopiesOnPath.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
pendingActivationDrives() flags registered drives the agent shows attached but not
live-mounted in the container; settings banner + "Újraindítás most" button →
/api/storage/activate → agentapi.GuestReboot. Batches all pending into one restart.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
agentapi GuestAttach(where) → POST /disks/guest-attach; runStorageInit/Attach +
handleStorageRegister call attachIntoGuest after register (best-effort, P3 heals).
Closes Branch A: enrolled drives become usable in the guest, banner clears.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Part 2 of the USB/backup spec. agentapi: StatusResponse.Backup record, DueResponse
age_seconds, RestoreTestStatus(). New "Rendszermentés (teljes mentés)" section
(read-only: last backup/target PBS-vs-local/next-due/restore-test) + "Mentés most"
manual trigger that goes through the quiesce loop (controller owns quiescing):
quiesce.Loop gains mutex + TriggerNow() (single-flight, async). New
/api/guest-backup/{trigger,status} (distinct from apiRouter's /api/backup/*).
App-data rows relabeled under an "Alkalmazás-mentések" divider. Config → slice 10.
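A minimal sketch of a single-flight, async TriggerNow guarded by a mutex. The Loop fields and the bool return value are assumptions; the real quiesce.Loop carries more state.

```go
package main

import "sync"

// Loop is a hypothetical cut-down version of the quiesce loop.
type Loop struct {
	mu      sync.Mutex
	running bool
	run     func() // one full quiesce + backup cycle
}

// TriggerNow starts a cycle in the background unless one is already in
// flight. It returns immediately; the result is read via a status endpoint.
func (l *Loop) TriggerNow() bool {
	l.mu.Lock()
	if l.running {
		l.mu.Unlock()
		return false // single-flight: ignore the duplicate request
	}
	l.running = true
	l.mu.Unlock()

	go func() {
		defer func() {
			l.mu.Lock()
			l.running = false
			l.mu.Unlock()
		}()
		l.run()
	}()
	return true
}
```

The flag is cleared in a deferred function so a panicking or failing cycle cannot leave the loop permanently "busy".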
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
backups.html still referenced .Backup.{RepoStats,LastBackup,ResticSchedule,
NextBackup,PruneSchedule,Retention,SnapshotHistory,LastCheckTime,LastCheckOK} —
fields removed from FullBackupStatus in the 8C de-privileging (disk-tier backup
moved to the agent). Accessing them on the slimmed struct made the page return a 500. Removed the dead
restic/snapshot/repo-stat sections; kept the app-data (DB dumps + per-app) view.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Empirically (staging on 9201): traefik v3 issues a cert from a router-level
tls.domains but NOT from the entrypoint http.tls.domains. So the wildcard moves
to RenderControllerRoute (the always-present anchor): when DNS-01 ACME is
configured it carries tls.certResolver+domains *.<domain>+apex, and every other
router serves that wildcard by SNI (no per-app labels). Reverts v0.42.0's dead
entrypoint-domains + TraefikData.Domain.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
traefik's websecure entrypoint now declares http.tls.domains *.<domain>+apex so
it proactively obtains the wildcard via Cloudflare DNS-01 at startup (cert ready
before first client, every router serves it by SNI). Gated on CFAPIToken (DNS-01).
TraefikData gains Domain; ensureTraefik wires cfg.Customer.Domain.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
containerOnNetwork misread the absent-key '<nil>' as "already attached", so
wireController skipped docker network connect -> traefik 502'd felhom.<domain>.
Now lists network names and matches exactly. Also removed dashboard.html's dead
CrossDrive* block (slice-8C leftover) that 500'd the dashboard via gt <nil> 0,
exposed once v0.41.1 made the dashboard reachable.
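The exact-match fix can be sketched as follows, assuming the attached network names arrive as a whitespace-separated string. The onNetwork name and that input format are assumptions for illustration:

```go
package main

import "strings"

// onNetwork reports whether want is one of the listed network names.
// Matching whole names means a placeholder such as "<nil>" or a name that
// merely contains want is never mistaken for "already attached".
func onNetwork(names, want string) bool {
	for _, n := range strings.Fields(names) {
		if n == want {
			return true
		}
	}
	return false
}
```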
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
EnsureBaseStack now writes a traefik file-provider route
(Host(felhom.<domain>) -> http://felhom-controller:8080) and joins the
controller to traefik-public. Done post-pull (domain known) and idempotently
(write-if-changed + skip-if-connected), so felhom.<domain> reaches the
controller. Completes the v0.41.0 base-infra bring-up.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
New internal/infra package renders traefik/cloudflared/filebrowser from config
(pinned images, single source of truth; web filebrowser path delegates here).
stacks.EnsureBaseStack deploys the traefik-public network + the three stacks,
single-flight + idempotent + non-fatal; wired to first boot and every health
tick. monitor.EffectiveProtected drops cloudflared when no tunnel token.
Section-G fix lives in felhom-agent build-golden.sh (same-path stacks bind).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Fix the onboarding 401: instead of seeding controller.yaml from the agent's
HOST hub key (which the hub's customer-scoped /api/v1/report rejects), the
controller now PULLS its full controller.yaml from the hub on first boot using
the bootstrap's retrieval passphrase (yielding the customer-scoped key) and
MERGES in the per-guest local_api block.
- internal/bootstrap: contract v1->v2 (customer.id + hub.url +
hub.retrieval_password + local_api; drop host key/identity). MaybeIngest gains
an injected PullFunc (keeps bootstrap free of the heavy report package),
pulls with bounded transient-only retry, merges local_api at YAML-map level
(preserves all hub-emitted fields), idempotent + fail-safe + never-crash.
- main.go: wire report.PullConfig as the pull adapter (maps ErrHubUnreachable
-> ErrPullTransient; auth/not-found permanent).
- Lockstep with felhom-agent v0.19.0.
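A sketch of the bounded, transient-only retry around the injected pull. ErrPullTransient and PullFunc are named in this commit, but the PullFunc signature, ErrPullPermanent and pullWithRetry are assumptions:

```go
package main

import (
	"errors"
	"time"
)

var (
	ErrPullTransient = errors.New("pull: hub unreachable")
	// ErrPullPermanent is a hypothetical stand-in for auth/not-found errors.
	ErrPullPermanent = errors.New("pull: rejected by hub")
)

// PullFunc is the injected fetcher; the signature here is assumed.
type PullFunc func() ([]byte, error)

// pullWithRetry retries only transient failures, at most attempts times.
// Permanent errors return at once, since retrying cannot fix them.
func pullWithRetry(pull PullFunc, attempts int, delay time.Duration) ([]byte, error) {
	if attempts < 1 {
		attempts = 1
	}
	var lastErr error
	for i := 0; i < attempts; i++ {
		data, err := pull()
		if err == nil {
			return data, nil
		}
		if !errors.Is(err, ErrPullTransient) {
			return nil, err
		}
		lastErr = err
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return nil, lastErr
}
```

Injecting the fetcher as a function value is what keeps the retry logic testable without a hub.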
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Remove five orphaned HTML templates left behind when slice 8C retired the
disk/storage/restore web handlers (storage_handlers.go, handler_restore.go and
the /api/storage/* + /api/restore/* routes): storage_init, storage_attach,
migrate, migrate_drive, restore. Zero .go references, zero cross-template
references, no route, no nav entry; embed is a glob so deletion is safe (14
templates remain, build + tests green). No behaviour change; the deleted pages
were already unreachable.
Also ships the live demo validation (v0.39.0) writeup in REPORT.md.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add agentapi HostMetrics() + a thin /api/host-metrics proxy to the agent's
new GET /host/metrics, and a 'Szerver allapota (gazdagep)' card on the
monitoring page rendering host CPU%/load/mem/CPU-temp(n/a)/uptime + per-
storage capacity bars (thin-pool fill, disk temp/wear). Polls every 8s.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Quiesce loop now resumes the stack (StartStack + clear marker) at the snapshotted phase
instead of at done, so downtime shrinks from the whole backup to just the time until the
snapshot, with no consistency loss. It keeps polling to done/failed (no overlapping backup;
a post-snapshot failure is still observed). Stop-mode fallback to done + crash-safety preserved.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Repo renamed on Gitea (admin/deploy-felhom-compose -> admin/felhom-controller).
Updates clone URLs, clone dirs, the customer bootstrap URL, build.sh, BUILDING.md,
README.md, CLAUDE.md, CONTEXT.md and TASK.md to the new name. No functional change:
Go module path and Docker image path (both already 'felhom-controller') untouched.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Extract the stateless, keep-side app-data backup primitives out of
internal/backup/ into a new self-contained internal/appbackup/ package:
- dbdump.go: DB dump discovery/execution (DiscoverDatabases, DumpOne, ...)
- appdata.go: StackDataProvider + app-data/volume discovery, HumanizeBytes
- paths.go: keep-side path helpers (AppDBDumpPath, AppVolumeDumpPath, AppDataDir)
backup/ keeps every name available via type/const aliases + one-line function
forwarders (appbackup_bridge.go), so the still-present delete-side code
(restic, cross-drive, drive-mount) and the both-side consumers (web/api/report)
compile unchanged. The keep-only consumers appexport and storage are rewired to
import appbackup directly and no longer import backup.
This is the Part-2 prerequisite for the Proxmox port: appbackup has zero
references to restic/cross-drive/drive-mount and does not import backup, so the
delete-side can later be removed without breaking app-data backup or appexport.
Behaviour-preserving: pure move + import/qualifier rewrites, no logic edits.
The four Manager methods (RunDBDumps/DumpAppVolumes/DumpAppVolumesSafe share the
delete-side mutex/status state; RestoreAppFromTier2 reads the cross-drive mirror)
intentionally stay on Manager and delegate to appbackup; moving them is left to the re-platform step.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The backups page template references .HasVolumeData on the status table
rows but the AppBackupRow struct was missing this field, causing a
template error.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add Docker named volume backup to Tier 1 (dump to tar, include in restic)
and Tier 2 (copy tars to rsync mirror _volumes/ dir)
- Fix volume name resolution: use project-prefixed names (mealie_mealie_data)
- Fix double Tier 1 in restore dropdown: filter snapshots by app's home drive
- Add Tier 2 restore: RestoreAppFromTier2() restores from rsync mirror
- Show Tier 2 entry in restore dropdown when cross-drive backup succeeded
- Add .fab import link in restore section
- Volume-aware restore type banners and backup content labels
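For the volume name fix, a tiny sketch of the project-prefixed naming. It assumes the Docker Compose default of <project>_<volume> and does not cover volumes with an explicit name or external volumes; composeVolumeName is a hypothetical helper:

```go
package main

// composeVolumeName returns the name Docker Compose gives a named volume
// by default: the project name, an underscore, then the declared volume.
func composeVolumeName(project, volume string) string {
	return project + "_" + volume
}
```

For the mealie example above, composeVolumeName("mealie", "mealie_data") yields mealie_mealie_data.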
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Users couldn't find metadata provider fields (IGDB, ScreenScraper, etc.)
on the app info page. Move them to the deploy page where all other
settings (integrations, geo-restriction) already live.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The gtstef/filebrowser image bakes FILEBROWSER_CONFIG=/home/filebrowser/data/config.yaml,
but controller mounts config at /home/filebrowser/config.yaml. Override the env var in both
generateFileBrowserCompose() and docker-setup.sh so FileBrowser reads the controller-managed
config with proper sources and database path.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Detect and offer to format empty (no filesystem) partitions on the system
disk. Adds IsSystemPartition() for granular per-partition safety checks
instead of blocking the entire system disk. Init wizard shows formattable
partitions with appropriate warnings. Adds the felhotest demo node to docs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Deactivated drives (Schedulable=false) are now treated like disconnected drives for
Tier2 backups. New IsStoragePathSchedulable() checks active+connected+not
decommissioned. UI shows yellow "Cél meghajtó inaktív" badge, scheduler
skips silently with WARN log.
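A sketch of the combined check, assuming a StoragePath record with these fields. Schedulable and Disconnected appear in this log; the Decommissioned field name and the function shape are assumptions:

```go
package main

// StoragePath is a hypothetical cut-down registry entry.
type StoragePath struct {
	Path           string
	Schedulable    bool
	Disconnected   bool
	Decommissioned bool
}

// isSchedulable is true only for a known path that is active, connected
// and not decommissioned. Unknown paths are treated as not schedulable.
func isSchedulable(paths []StoragePath, path string) bool {
	for _, p := range paths {
		if p.Path == path {
			return p.Schedulable && !p.Disconnected && !p.Decommissioned
		}
	}
	return false
}
```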
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Previously, a removed storage drive was detected as disconnected only if its
StoragePath entry still existed with Disconnected:true.
Drives removed entirely from storage_paths were invisible to the check,
so the Tier2 backup UI showed a green "Sikeres" and the scheduler attempted
backups to a no-longer-managed destination.
New IsStoragePathKnown() method covers both cases. UI shows yellow
"Cél meghajtó leválasztva" and scheduler skips silently.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- IsUSBDevice/diskModel: strip findmnt bind-mount suffix [/subdir] before
parsing device path (fixes USB badge not showing for attach-wizard drives)
- crossdrive.go: skip disconnected src/dest drives with WARN log instead of
returning error (prevents noisy error status in settings.json)
- handlers.go: detect Tier2 destination disconnection, set yellow status dot
instead of red, skip ValidateDestination for disconnected paths
- backups.html: new template branch showing "Cél meghajtó leválasztva" badge
with grayed-out info and hidden "Futtatás most" button
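The bind-mount suffix fix from the first bullet can be sketched as below. findmnt reports a bind-mounted source as device[/subdir]; stripBindSuffix is a hypothetical helper name, not the actual function:

```go
package main

import "strings"

// stripBindSuffix removes a trailing "[/subdir]" from a findmnt SOURCE
// value so the remainder can be parsed as a plain device path.
func stripBindSuffix(source string) string {
	if i := strings.IndexByte(source, '['); i >= 0 && strings.HasSuffix(source, "]") {
		return source[:i]
	}
	return source
}
```

For example, "/dev/sdb1[/appdata]" becomes "/dev/sdb1", while a plain "/dev/sdb1" is returned unchanged.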
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>