From 82ef3b15cfe2d508533c69300827a0535c8dbf99 Mon Sep 17 00:00:00 2001 From: kisfenyo Date: Tue, 17 Feb 2026 09:14:15 +0100 Subject: [PATCH] updated .md files --- CHANGELOG.md | 568 ++++++++++++++++++++++++++++++++++++++++++++------- CLAUDE.md | 8 +- CONTEXT.md | 526 +---------------------------------------------- 3 files changed, 502 insertions(+), 600 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 0586b29..0b22c2b 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,99 +1,521 @@ -# Changelog +## Changelog -## v0.9.0 — Storage Paths Foundation & Backup Toggle Fix (2026-02-17) +### What was just completed (2026-02-17 session 26) +- **v0.9.0 — Phase A: Storage Paths Foundation & Backup Toggle Fix:** + - **Root cause:** Per-app backup toggles (v0.8.0) didn't appear because `controller.yaml` had no `paths.hdd_path` set → `ParseComposeHDDMounts` returned nil. Even with global hdd_path, apps with different HDD_PATH values wouldn't match. + - **Core fix: Per-app HDD_PATH resolution** — `stackAdapter.GetStackHDDMounts()` now reads each app's own `HDD_PATH` from its `app.yaml` env section (Priority 1), falling back to all registered storage paths (Priority 2). Removed dependency on global `cfg.Paths.HDDPath`. + - **Storage paths registry** (`settings.json`) — new `StoragePath` struct with Path, Label, IsDefault, Schedulable, AddedAt. Thread-safe CRUD methods in `settings.go` (Get/Add/Remove/SetDefault/SetSchedulable). Multiple external storage paths supported. + - **Auto-discovery** — On startup, `discoverHDDPaths()` scans deployed apps' `app.yaml` for `HDD_PATH` values. `AutoDiscoverStoragePaths()` registers discovered paths with inferred labels. Legacy `cfg.Paths.HDDPath` used as fallback. + - **Mount-point validation** — New `mounts_linux.go` (build-tagged): `IsMountPoint()` via `syscall.Stat_t.Dev` comparison, `IsWritable()`, `PathsOverlap()`, `GetDiskUsage()` via `syscall.Statfs`. Non-Linux stubs in `mounts_other.go`. 
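The mount-point validation described above can be sketched roughly as follows. This is a minimal Linux-only illustration, not the code in `mounts_linux.go`: the lowercase `isMountPoint` helper and its exact rules are assumptions. It compares the device ID of a directory with that of its parent, which is the `syscall.Stat_t.Dev` comparison the entry mentions.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// isMountPoint reports whether dir is the root of a mounted filesystem by
// comparing its device ID with the device ID of its parent directory.
// Hypothetical helper: the real IsMountPoint() may differ in detail.
func isMountPoint(dir string) (bool, error) {
	abs, err := filepath.Abs(dir)
	if err != nil {
		return false, err
	}
	self, err := os.Stat(abs)
	if err != nil {
		return false, err
	}
	if !self.IsDir() {
		return false, nil
	}
	parent, err := os.Stat(filepath.Dir(abs))
	if err != nil {
		return false, err
	}
	selfStat, ok1 := self.Sys().(*syscall.Stat_t)
	parentStat, ok2 := parent.Sys().(*syscall.Stat_t)
	if !ok1 || !ok2 {
		return false, fmt.Errorf("no stat_t available for %s", abs)
	}
	if selfStat.Dev != parentStat.Dev {
		return true, nil // different device: a filesystem is mounted here
	}
	// Same device: only "/" (which is its own parent) counts as a mount point.
	return os.SameFile(self, parent), nil
}

func main() {
	for _, p := range []string{"/", os.TempDir()} {
		ok, err := isMountPoint(p)
		fmt.Printf("%s mount=%v err=%v\n", p, ok, err)
	}
}
```

A device-ID comparison does not detect a bind mount taken from the same filesystem, since both sides share one device ID, so a check like this is best treated as a guard against writing to an unmounted placeholder directory rather than as a complete mount detector.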
+ - **Settings page "Adattárolók" section** — Lists registered paths with label, path, disk usage bar, app count, badges (default/active/unmounted). Actions: set default, toggle schedulable, remove (with guards). Expandable "Új adattároló hozzáadása" form with 5-step validation (exists, mount point, writable, no overlap, no duplicate). + - **Deploy page storage dropdown** — `path` field type renders as `<select>` dropdown of schedulable storage paths. Falls back to text input with warning if no paths registered. - - **Health check storage monitoring** — `RunHealthCheck()` now accepts `storagePaths` parameter. Checks: path accessible (warning), not a mount point (issue — data writes to SSD!), disk usage ≥95% (issue) / ≥90% (warning). - - **Controller docker-compose.yml** — Changed HDD mount from `${HDD_PATH:-/mnt/hdd_placeholder}:...:ro` to `/mnt:/mnt:rw` for multi-storage support + restore capability. - - **Removed unused `hddPath` param** from `DiscoverAppData()` signature in backup/appdata.go. - - **Files created (2):** `system/mounts_linux.go`, `system/mounts_other.go` - - **Files modified (11):** `settings.go`, `main.go`, `appdata.go`, `backup.go`, `handlers.go`, `server.go`, `settings.html`, `deploy.html`, `style.css`, `healthcheck.go`, `docker-compose.yml`, `report/builder.go` - -### What was previously completed (2026-02-16 session 25) -- **v0.8.0 — Phase 7: Storage Overview, Per-App Backup Toggles & Limited Restore:** - - **Storage overview on backup page** — new "Tárhely áttekintés" section as first section on backup page showing SSD/HDD progress bars + backup repo stats (repo size, dump file count, snapshot count). Reuses existing `system.GetInfo()` and `RepoStats`. - - **Restic password visibility** — new "Titkosítási kulcs" section inside the repository card. Masked password field with show/copy buttons (JS toggle). Password synced to hub via periodic report for disaster recovery (`ResticPassword` field added to `BackupReport`). 
- - **App data discovery** — new `internal/backup/appdata.go`: - - `StackDataProvider` interface to avoid circular imports between backup and stacks packages - - `AppBackupInfo`, `AppDataPath`, `AppDockerVolume` structs - - `DiscoverAppData()` iterates deployed stacks, discovers HDD bind mounts (via adapter calling `ParseComposeHDDMounts`), Docker named volumes (via `parseComposeNamedVolumes` using YAML parser), and DB dump status - - Stack adapter in `main.go` implements `StackDataProvider` using `stacks.Manager` - - **Per-app backup toggles** — new "Alkalmazás adatok" section on backup page: - - Toggle checkbox per app (only for apps with HDD data) - - Shows HDD paths with sizes, Docker volume info, DB dump notes - - `POST /settings/app-backup` handler saves preferences to `settings.json` - - `AppBackupPrefs` struct + bulk getter/setter in `settings.go` - - `RefreshCache()` populates `AppDataInfo` via `DiscoverAppData()` - - **Dynamic backup paths** — `RunBackup()` now includes enabled app HDD data paths: - - `resolveAppBackupPaths()` reads enabled apps from settings, resolves HDD paths via provider - - Paths logged at INFO level, included in restic snapshot - - `BackupPaths` display on backup page includes app data paths - - **Limited app restore** — new restore section on backup page: - - `RestoreApp()` in `restore.go`: validates enabled, resolves HDD paths, validates snapshot exists, uses running mutex - - `RestoreAppData()` on `ResticManager`: runs `restic restore` with `--include` flags for specific paths - - `POST /backup/restore` web handler with confirmation flow - - `GET /api/backup/snapshots` JSON endpoint for restore dropdown - - UI: app/snapshot dropdowns, warning box, confirmation checkbox, JS-driven form submission - - **Exported `ParseComposeHDDMounts`** from stacks package (was unexported `parseComposeHDDMounts`) - - **Flash messages** on backup page via query params (success/error redirects from handlers) - - **CSS**: New styles for storage 
overview grid, app backup toggles, encryption key field, restore section, flash messages - - **Files created**: `appdata.go`, `restore.go` - - **Files modified**: `backup.go`, `restic.go`, `handlers.go`, `server.go`, `backups.html`, `style.css`, `settings.go`, `delete.go`, `router.go`, `types.go`, `builder.go`, `main.go` - -### What was previously completed (2026-02-16 session 24) -- **v0.7.2 — Fix Notification Preferences Sync (Controller → Hub):** - - **Two repos changed** (deploy-felhom-compose + felhom.eu): - - **Hub: `POST /api/v1/preferences` endpoint** (`hub/internal/api/handler.go`): - - New route in API handler: same Bearer token auth as /report and /notify - - Accepts JSON payload: `{customer_id, email, enabled_events}` - - Calls existing `store.SaveNotificationPrefs()` — no store changes needed - - Logs preference updates at INFO level - - **Hub: Notification section on customer detail page** (`hub/internal/web/`, `hub/internal/store/store.go`): - - New `GetRecentNotifications()` store method returns last N notification_log entries - - `handleCustomerDetail()` loads NotifPrefs + RecentNotifications - - `joinStrings` template function added for event list display - - `customer.html` template: new "Notifications" section showing email, events, and last 10 notification log entries (time, event, status, message) - - **Controller: `SyncPreferences` method** (`internal/notify/notifier.go`): - - New `preferencesRequest` struct for JSON payload - - `SyncPreferences(email, enabledEvents)` — synchronous POST to hub `/api/v1/preferences` - - `IsEnabled()` getter for checking hub connectivity - - Hungarian error messages for user-facing feedback - - **Controller: Sync on settings save** (`internal/web/handlers.go`): - - `settingsNotificationsHandler` now calls `SyncPreferences` after saving to `settings.json` - - Three flash message variants: success (synced), warning (local save OK, sync failed), error (save failed) - - Local save always succeeds even if hub sync 
fails - - **Controller: Sync on startup** (`cmd/controller/main.go`): - - Non-blocking goroutine syncs preferences to hub when controller starts - - Only runs if hub is enabled and email is configured - - Handles hub DB rebuild recovery (re-populates preferences after hub redeployment) - - **Files changed**: hub (3 files: handler.go, store.go, server.go, customer.html), controller (3 files: notifier.go, handlers.go, main.go) - - **Documentation**: README.md updated (version, notify module, phase checklist), CONTEXT.md updated - -### What was previously completed (2026-02-16 session 23) -- **v0.7.1 — Phase 2: Monitoring Warnings, Dashboard Alerts & Notification System:** - - **Three workstreams across two repos** (deploy-felhom-compose + felhom.eu): - - **Monitoring page "Távoli monitoring" section** (`monitoring.html`, `handlers.go`): - - New section between System Overview and System Metrics showing healthcheck ping UUID status - - 5 rows: Heartbeat, System Health, DB Dump, Backup, Backup Integrity — each shows ✅ configured or ⚠️ missing - - Banner: green (all configured), yellow (some missing), red (monitoring disabled) - - `isPingConfigured()` helper checks non-empty AND not "CHANGEME" prefix - - **Dashboard alert banners** (new `alerts.go`, `layout.html`): - - `AlertManager` struct with `Refresh()` + `GetAlerts()` — generates alerts from health report, missing pings, backup disabled - - Alert types: `Alert{ID, Level, Message, Link, LinkText}` — levels: error/warning/info - - Renders colored banners (red/yellow/blue) after `
` on all pages - - Caps at 5 alerts with "+N more" overflow; monitoring page excludes "pings-missing" (shown in table instead) - - Refreshed every 5 min via system-health scheduler task + once at startup - - **Hub notification relay** (felhom.eu repo — `hub/internal/api/handler.go`, `hub/internal/store/store.go`): - - `POST /api/v1/notify` endpoint: Bearer auth, JSON payload (customer_id, event_type, severity, message, details) - - New `customer_notifications` table (email, enabled_events JSON) + `notification_log` audit table - - Resend email integration: direct HTTP POST to `https://api.resend.com/emails` - - Hungarian email template with event details, timestamp, severity - - `hub.yaml.example` updated with notifications config section - - **Controller-side notifier** (new `internal/notify/notifier.go`): - - `Notifier` struct: fires HTTP POST to hub `/api/v1/notify`, non-blocking (goroutine) - - Cooldown tracking per event type (default 6h, configurable via UI) - - Checks notification preferences (email configured + event enabled) before sending - - `NotifyHealthChange()`: only notifies on status degradation (ok→warn, ok→fail, warn→fail) - - `NotifyBackupFailed/NotifyDBDumpFailed/NotifyIntegrityFailed` convenience methods - - `SendTest()` for test email flow - - Wired into scheduler: system-health task calls `NotifyHealthChange()`, backup tasks call failure notifiers - - **Notification preferences UI** (`settings.html`, `handlers.go`): - - New "Értesítések" Section C on Settings page (only shown when hub enabled) - - Email input, 4 event checkboxes (disk_warning, backup_failed, update_available, security_update) - - Cooldown hours input (default 6) - - "Mentés" + "Teszt email küldése" buttons - - Saved to `settings.json` via `NotificationPrefs` struct (Email, EnabledEvents, CooldownHours) - - **Settings persistence expanded** (`settings.go`): - - `NotificationPrefs` struct with Email, EnabledEvents, CooldownHours - - `DefaultEnabledEvents`: disk_warning, 
backup_failed, update_available - - `GetNotificationPrefs()` returns defaults if nil, `SetNotificationPrefs()` saves atomically - - **Files changed**: 3 new (alerts.go, notifier.go, notify package), ~12 modified across both repos - - **Deployed:** Controller v0.7.1 to demo-felhom.eu, verified healthy (0 alerts on clean system) - -### What was previously completed (2026-02-16 session 22) -- **v0.7.0 — Phase 1: Authentication, Persistence & Settings Page:** - - **New `internal/settings/settings.go`:** Shared persistence layer via `settings.json` in the data directory. Atomic writes (tmp + rename), thread-safe with `sync.RWMutex`. Stores password hash overrides and DB validation cache. Graceful handling if file doesn't exist. - - **Auth improvements:** - - Password resolution priority: `settings.json` → `controller.yaml` → none (open dashboard) - - Startup logs which source is active: `Auth: using password from settings.json/controller.yaml/no password configured` - - Session duration extended to 7 days (was 24h) - - `?next=` redirect after session expiry — returns user to the page they were on - - Flash messages on login page (green info box, used after password change) - - Conditional logout link — hidden when auth is disabled (no password configured) - - `invalidateAllSessions()` method for password change flow - - **New Settings page (`/settings`):** - - "Rendszer konfiguráció" section: read-only display of controller.yaml values (customer ID/name/domain, git repo/sync interval, backup enabled/schedule, monitoring, healthchecks URL, hub status, controller version) - - "Jelszó módosítás" section: form with current password, new password, confirm — validates min 8 chars, match check, bcrypt comparison - - Password saved to `settings.json`, all sessions invalidated, redirect to login with flash message - - Only shown if auth is enabled; otherwise shows info message to contact operator - - **Sidebar update:** - - "Beállítások" menu item with ⚙ icon pinned to bottom 
(above version/logout) - - Version and logout link separated from nav links - - Logout link conditionally shown only when auth is enabled - - **DB validation persistence:** - - After each successful dump, validation results saved to `settings.json` (`db_validations` map keyed by filename) - - Cached data survives container restarts - - `DBValidationCache` struct with `validated_at`, `table_count`, `has_header`, `error` - - **10 files changed** (3 new: settings.go, settings.html; 7 modified: main.go, backup.go, auth.go, handlers.go, server.go, layout.html, login.html, style.css) - - **Deployed:** Controller v0.7.0 to demo-felhom.eu, verified healthy - -### What was previously completed (2026-02-16 session 21) -- **v0.6.3 — Bug fixes from v0.6.2 code scan (4 minor fixes):** - - **Bug 1:** `--hdd-path` in `docker-setup.sh` now uses `require_arg` validation like all other flags. Previously, `--hdd-path` as the last argument without a value would crash with a cryptic bash error under `set -u` instead of a friendly message. - - **Bug 2:** `stackAction()` in `layout.html` now receives `event` as an explicit parameter instead of relying on the deprecated implicit `window.event`. All 10 onclick call sites in `dashboard.html` and `stacks.html` updated to pass `event` as first argument. - - **Bug 3:** Page `<title>` now has an em dash separator: `"Vezérlőpult — Felhom.eu"` instead of `"VezérlőpultFelhom.eu"`. - - **Bug 4:** `nextPruneLabel()` in `funcmap.go` now returns `"ma"` (Hungarian for "today") on Sunday before 4am, consistent with the `nextRunLabel` function. Previously returned the date in `"2006-01-02"` format. 
- - **Deployed:** Controller v0.6.3 to demo-felhom.eu, verified healthy - -### What was previously completed (2026-02-16 session 20) -- **Hub Dashboard Bugs + Backup Validation Fix (3 bugs):** - - **Bug 1&2 (Hub repo, felhom-hub v0.1.2):** Hub timestamp parsing failure — `time.Parse` with single hardcoded format silently failed for formats returned by `modernc.org/sqlite`. Added `parseSQLiteTime()` that tries 6 common formats. Fixed: hub main page showing DOWN despite OK status, and report history timestamps showing 00:00:00. - - **Bug 3 (Controller repo, v0.6.2):** Backup page showing "Hiba" for all DB validations — zero-value `DumpValidation{}` (never assigned) hit the `{{else}}` branch in template. Three fixes: - - Template: 4-branch guard (Valid → OK / Error → Hiba / zero-value → "–" with tooltip) - - Debug logging: Added `[DEBUG]` and `[WARN]` log lines to all `ValidateDump()` code paths - - Re-validation: `RefreshCache()` now cross-checks `lastDBDump` results against fresh `ListDumpFiles()` validation, healing stale in-memory state - - **Deployed:** Hub v0.1.2 to k3s, Controller v0.6.2 to demo-felhom - - **Verified:** Controller logs show `ValidateDump OK` for all 3 databases (immich: 60 tables, paperless: 67 tables, romm: 14 tables) - -### What was previously completed (2026-02-16 session 19) -- **v0.6.1 — Code Review Bugfixes (7 fixes):** - - **Fix 1:** `http.NotFound(w, nil)` → pass actual `*http.Request` in `deployHandler` and `appDetailHandler` - - **Fix 2:** Dashboard running/stopped counts now computed from the filtered `deployedStacks` set (was counting ALL stacks including non-deployed) - - **Fix 3:** Session cookie `Secure` flag now dynamic based on `r.TLS != nil || X-Forwarded-Proto == "https"`. `SameSite` changed from `Strict` to `Lax` (Strict breaks Cloudflare Tunnel redirects) - - **Fix 4:** Removed misleading `subtle.ConstantTimeCompare` from `isValidSession()` (map lookup already leaks timing; comparing token to itself is meaningless). 
Removed unused `token` field from `session` struct. Removed `crypto/subtle` import. - - **Fix 5:** Replaced `time.Tick()` (goroutine leak) with proper `time.NewTicker` + `done` channel in `cleanupSessions()`. Added `Close()` method to Server. Added `done chan struct{}` to Server struct. - - **Fix 6:** Added `http.MaxBytesReader(w, req.Body, 1<<20)` (1MB limit) to `deployStack`, `updateOptionalConfig`, `deleteStack` API handlers via `limitBody()` helper. - - **Fix 7:** Cached `time.LoadLocation("Europe/Budapest")` once at top of `templateFuncMap()`, removed 5 per-function `LoadLocation` calls (timeAgo, fmtTime, fmtTimeShort, nextRunLabel, nextPruneLabel). - - **Post-fix verification:** All 4 grep checks pass (0 results for NotFound(w,nil), ConstantTimeCompare, time.Tick(, Secure:.*true). `go vet ./...` clean. - - **Controller version:** v0.6.1 — deployed and verified on demo-felhom.eu - -### What was previously completed (2026-02-16 session 18) -- **v0.6.0 — Healthcheck Implementation + Central Push + Hub Dashboard:** - - **Part 1 — Healthcheck enhancements (controller-side):** - - Added `heartbeat` ping — lightweight "I'm alive" signal every 5 min (no logic, just ping) - - Added `backup_integrity` ping — weekly `restic check` on Sunday 04:00, pings healthchecks with result - - Added `Heartbeat` and `BackupIntegrity` fields to `PingUUIDsConfig` - - Added `RunIntegrityCheck()` to backup Manager (calls restic Check(), updates lastCheckTime/lastCheckOK, pings) - - Updated `controller.yaml.example` with new monitoring ping_uuids - - Created `monitoring/DEPRECATED.md` for legacy bash monitoring scripts - - **Part 2 — Central hub reporting (controller-side):** - - New `internal/report/` package: types.go (Report struct), builder.go (BuildReport), pusher.go (HTTP push) - - Report builder gathers data from all subsystems: system info (via metrics.GetStaticInfo + system.GetInfo), container stats (via metricsStore.QueryContainerSummary), backup status (via 
backupMgr.GetFullStatus), health (via monitor.RunHealthCheck), stacks (via stackMgr.GetStacks) - - Report pusher: POST JSON to hub with Bearer token auth, 3 retries with 5s backoff, never fails caller - - Added `HubConfig` to config.go (enabled, url, api_key, push_interval) - - Wired hub reporting into scheduler (configurable interval, default 15m) - - Hub reporting disabled by default (hub.enabled: false) - - **Part 3 — Hub service (felhom.eu repo, new `hub/` subfolder):** - - Full Go service: `cmd/hub/main.go`, `internal/api/handler.go`, `internal/store/store.go`, `internal/web/server.go` - - SQLite store with WAL mode, auto-migration, denormalized fields for fast queries - - REST API: POST /api/v1/report (Bearer token auth), GET /api/v1/customers, GET /api/v1/customers/{id}, GET /api/v1/customers/{id}/history - - Dark theme dashboard (English): multi-customer overview table with status indicators, customer detail page with system/storage/containers/backup/health sections - - Color coding: green (OK, <30min), yellow (warn or 30-60min), red (fail or >60min) - - K8s manifest: Deployment + Service + Ingress for hub.felhom.eu in felhom-system namespace - - Dockerfile, Makefile, hub.yaml.example config - - 90-day report retention with daily auto-prune - - **Controller version:** v0.6.0 — deployed and verified on demo-felhom.eu (9 scheduler jobs, all new jobs registered) - - **Manual steps remaining for Viktor (Part 4 of TASK.md):** - - Create 5 healthcheck checks on status.felhom.eu (heartbeat, system-health, db-dump, backup, backup-integrity) - - Update controller.yaml on demo-felhom with real UUIDs - - Build and deploy felhom-hub to k3s cluster - - Configure hub.felhom.eu DNS in Cloudflare - - Enable hub reporting on demo-felhom controller.yaml - -### What was previously completed (2026-02-16 session 17) -- **v0.5.4 — Monitoring Page Frontend Fixes (4 bugs, frontend-only):** - - **Bug 1: Tooltip "Invalid Date"** — `items[0].parsed.x` unreliable across Chart.js 
versions. Fixed tooltip callback to use `items[0].raw.x` (direct {x,y} data access) with `parsed.x` as fallback. - - **Bug 2: Charts fill full width regardless of data density** — `setChartXBounds()` setting `min/max` at runtime was ignored because the scale was created without them. Fixed by including `min: now - defaultRangeMs, max: now` in the initial `chartOpts()` options. Now "7 nap" shows full 7-day x-axis with data clustered on the right. - - **Bug 3: Sysinfo values not consistently right-aligned** — `.sysinfo-grid` used `auto-fill` creating variable-width cells. Fixed to `1fr 1fr` (fixed 2-column). Added `align-items: baseline`, `gap: 1rem`, `white-space: nowrap` on labels, `font-weight: 600` + `word-break: break-word` on values. Removed redundant `<style>` block from monitoring.html (styles now in style.css). - - **Bug 4: Charts overflow on mobile** — Added `min-width: 0` on `.chart-box` (critical CSS grid fix), `overflow: hidden` + `max-width: 100%` on `.chart-wrap` and `.chart-wrap-bar`, `max-width: 100%` on canvas. - - **Controller version:** v0.5.4 — deployed and verified on demo-felhom.eu - -### What was previously completed (2026-02-16 session 16) -- **v0.5.1 — Monitoring Page Bugfixes:** - - **Bug 1: Hostname** — `os.Hostname()` returns the container ID inside Docker. Fixed by mounting `/etc/hostname:/host/etc/hostname:ro` and reading it first in `sysinfo.go`. Now shows `demo-felhom`. - - **Bug 2: Tooltip timestamps** — Chart.js tooltip callback used `items[0].parsed.x` (category index 0,1,2...) instead of `items[0].label` (actual timestamp). Index 0 worked by accident (`0 || label` falls through), but all other points showed 1970-01-01. - - **Bug 3+4: Default range + empty charts** — Default range was `24h` but new system had only minutes of data. Changed to `1h` default for both system and container detail charts. Moved `active` class to "1 óra" button. 
- - **Controller version:** v0.5.1 — deployed and verified on demo-felhom.eu - -### What was previously completed (2026-02-16 session 15) -- **v0.5.0 — Backup Bugfixes + Monitoring Page with Metrics Store:** - - **Task 1: Fixed "Helyi mentés" showing "–" after restart** — `GetFullStatus()` now synthesizes `LastBackup` from `SnapshotHistory` and `LastDBDump` from `DumpFiles` on disk when the in-memory values are nil (e.g., after controller restart). Dashboard handler also updated to use `GetFullStatus()` instead of `GetStatus()` for consistent behavior. - - **Task 2: Verified backup page caching** — Already implemented in v0.4.7 (`RefreshCache`, scheduler job, `AfterBackup` callback). No changes needed. - - **Task 3: New Monitoring Page ("Rendszermonitor")** — Full system monitoring subsystem: - - **SQLite metrics store** (`internal/metrics/store.go`, `types.go`): WAL-mode SQLite via `modernc.org/sqlite` (pure Go, no CGO). Stores system metrics (CPU%, memory, temperature, load) and container metrics (CPU%, memory, net/block I/O) with timestamp. Downsampled queries via bucket-based `GROUP BY` for Chart.js. 30-day auto-prune via daily scheduler job at 04:00. - - **Metrics collector** (`internal/metrics/collector.go`): Background goroutine collects system + container metrics every 60 seconds. System data from `system.GetInfo()`, container data from `docker stats --no-stream` with tab-separated format parsing. - - **System info provider** (`internal/metrics/sysinfo.go`, `sysinfo_other.go`): Reads hostname, OS, kernel, CPU model/cores, uptime from `/proc` filesystem. Linux-specific with build-tag fallback for cross-compilation. - - **REST API endpoints** (4 new routes in `router.go`): `GET /api/metrics/system` (time-series with range presets), `GET /api/metrics/containers/summary` (current stats), `GET /api/metrics/containers/{name}` (per-container time-series), `GET /api/metrics/sysinfo` (static system info). 
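The bucket-based downsampling mentioned for the metrics store can be pictured with an in-memory equivalent. This is only a sketch of the idea behind a `GROUP BY` on a time bucket: the `sample` type and `downsample` function are hypothetical, and the actual store performs the grouping in SQLite.

```go
package main

import (
	"fmt"
	"sort"
)

// sample is one stored measurement (hypothetical shape).
type sample struct {
	TS    int64   // Unix seconds, assumed non-negative
	Value float64 // for example CPU percent
}

// downsample averages samples into fixed-width time buckets, similar to what
// a SQL query grouping by (ts / bucket) would return for a chart.
func downsample(samples []sample, bucketSec int64) []sample {
	if bucketSec <= 0 || len(samples) == 0 {
		return samples
	}
	sums := make(map[int64]float64)
	counts := make(map[int64]int)
	for _, s := range samples {
		start := s.TS - s.TS%bucketSec // start of the bucket this sample falls in
		sums[start] += s.Value
		counts[start]++
	}
	starts := make([]int64, 0, len(sums))
	for start := range sums {
		starts = append(starts, start)
	}
	sort.Slice(starts, func(i, j int) bool { return starts[i] < starts[j] })

	out := make([]sample, 0, len(starts))
	for _, start := range starts {
		out = append(out, sample{TS: start, Value: sums[start] / float64(counts[start])})
	}
	return out
}

func main() {
	raw := []sample{{TS: 0, Value: 10}, {TS: 30, Value: 20}, {TS: 60, Value: 30}}
	for _, p := range downsample(raw, 60) {
		fmt.Printf("%d %.1f\n", p.TS, p.Value)
	}
}
```

Wider buckets for longer ranges keep the number of points sent to the chart roughly constant, which is the usual reason for this kind of grouping.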
- - **Monitoring page template** (`monitoring.html`): 5 sections — System Overview (sysinfo via API), System Metrics Charts (4 line charts: CPU, Memory, Temperature, Load in 2×2 grid), Container Resources (2 horizontal bar charts: CPU% and Memory), Per-container Detail (click to expand with historical charts), Storage (server-rendered progress bars). Time range selectors (1h/6h/24h/7d/30d). Auto-refresh every 60s. - - **Chart.js 4.4.7** embedded locally (offline environments, ~200KB UMD), dark theme configuration matching site design. - - **CSS**: ~100 lines added for monitoring page (`.monitor-card`, `.charts-grid`, `.chart-box`, `.container-charts-row`, `.storage-bars`, responsive rules). - - **Wiring**: 4th sidebar nav item "Rendszermonitor", metrics DB path in named volume (`data/metrics.db`), `/etc/os-release:/host/etc/os-release:ro` volume mount in docker-compose.yml, Dockerfile updated to `golang:1.24-bookworm` (required by `modernc.org/sqlite`), `go.mod` upgraded to `go 1.24.0`. - - **Controller version:** v0.5.0 — deployed and verified on demo-felhom.eu (metrics collecting, 16 containers reporting, sysinfo showing Intel N100 correctly) - -### What was previously completed (2026-02-16 session 14) -- **v0.4.7 — Protected Stack Detail Pages + Backup Page Caching:** - - **Protected stacks clickable** — `data-href` gating changed from `{{if not .Protected}}` to `{{if .Meta.Slug}}` on both `stacks.html` and `dashboard.html`. Protected stacks with `.felhom.yml` (i.e. a slug) are now clickable, linking to `/apps/{slug}`. Stacks without `.felhom.yml` remain non-clickable. - - **"Részletek" button for protected stacks** — Protected stack action section in `stacks.html` now shows a "Részletek" link when the stack has a slug, next to the restart button. 
- - **FileBrowser `.felhom.yml` resources** — Added `resources` section (mem_request: 128M, mem_limit: 256M, pi_compatible: true, needs_hdd: true) to both `install_filebrowser()` in `docker-setup.sh` and manually on the demo node. FileBrowser detail page now shows memory/Pi/HDD badges. - - **Backup page caching** — `GetFullStatus()` no longer runs expensive subprocess calls (restic stats, docker inspect, disk listing) on every page load. Instead, a new `RefreshCache()` method runs these in the background: - - Every 5 minutes via `backup-cache` scheduler job - - After each successful backup via `AfterBackup` callback - - On startup via a goroutine (non-blocking) - - `GetFullStatus()` returns the cached `FullBackupStatus` instantly, updating only dynamic fields (running flag, next run times, snapshot history). Falls back to a minimal status if cache hasn't populated yet. - - **Controller version:** v0.4.7 — deployed and verified on demo-felhom.eu - -### What was previously completed (2026-02-16 session 13) -- **v0.4.6 — MariaDB Validation Fix + Dashboard & Protected Stack UX:** - - **Bugfix: MariaDB dump validation false positive** — MariaDB 11.4+ prepends `/*M!999999\- enable the sandbox mode */` before the dump header comment. `ValidateDump()` now scans the first 10 lines for the expected header pattern instead of just checking line 1. Accepts `-- MariaDB dump`, `-- MySQL dump`, `-- mysqldump` for MariaDB and `-- PostgreSQL database dump` for PostgreSQL. - - **Dashboard shows deployed apps only** — `dashboardHandler()` filters to deployed + protected stacks only. Non-deployed apps remain on the Alkalmazások page. Section heading changed to "Telepített alkalmazások". `TotalCount` stat card still shows all 52 apps. - - **Protected stack restart button** — Protected stacks (traefik, cloudflared, felhom-controller, filebrowser) now show an "Újraindítás" restart button when operational, on both dashboard (compact ↻) and Alkalmazások page (full button). 
"Védett" / "Védett rendszerkomponens" badge still shown. - - **API protection guard** — Centralized guard in `actionStack()` blocks all actions except `restart` on protected stacks (HTTP 403). Defense-in-depth: `StopStack()` and `DeleteStack()` retain their own guards. - - **FileBrowser `.felhom.yml`** — `install_filebrowser()` in `docker-setup.sh` now creates `.felhom.yml` with `subdomain: files` metadata, so the controller shows the `files.DOMAIN ↗` URL link. Manually created on demo node. - - **Controller version:** v0.4.6 — deployed and verified on demo-felhom.eu - -### What was previously completed (2026-02-16 session 12) -- **v0.4.5 — Dedicated Backup Page ("Biztonsági mentés"):** - - **New `/backups` page** with full backup system visibility — 5 sections: - 1. **Status overview cards**: Local backup status (green/gray), remote placeholder (gray), DB count, repo size - 2. **Schedule section**: DB dump/restic/prune schedule with next-run times, last backup time + duration, retention policy, "Mentés most" button - 3. **Database table**: Lists all discovered DBs with type badge (PostgreSQL/MariaDB), dump file size, last dump time, validation (table count), status - 4. **Snapshot history table**: Last 20 snapshots with ID, time, data added, files new/changed - 5. 
**Repository info card**: Path, size, snapshot count, integrity check status, backed-up paths list, remote copy placeholder - - **Backend extensions:** - - `SnapshotRecord` type + ring buffer (20 entries) in Manager for per-snapshot stats - - `DumpValidation` — scans dump files for CREATE TABLE statements, validates header and file size - - `ValidateDump()` runs after each successful dump in `DumpOne()` - - `ListDumpFiles()` scans dump directory for existing `.sql` files (fallback when in-memory results empty) - - `ListSnapshots()` on ResticManager — returns all snapshots from restic (newest first) - - `GetFullStatus()` on Manager — single call returns everything the page needs - - `LoadSnapshotHistory()` populates history from restic on startup (without delta stats) - - Restic check result tracking (`lastCheckTime`, `lastCheckOK`) - - `NextDailyRun()` exported from scheduler for next-run time calculation - - **Server wiring:** - - `Server` struct now holds `*scheduler.Scheduler` - - `NewServer()` accepts scheduler parameter - - `/backups` route + `backupsHandler()` in handlers.go - - **New template functions** (`funcmap.go`): `timeAgo`, `fmtTime`, `fmtTimeShort`, `dbTypeLabel`, `nextRunLabel`, `pruneLabel`, `nextPruneLabel`, `fmtDuration`, `fmtBytes`, `shortID` - - **Navigation**: Sidebar now has 3 items (Vezérlőpult, Alkalmazások, Biztonsági mentés) - - **Dashboard**: Backup card title is now a clickable link to `/backups` - - **Auto-refresh**: Page polls `/api/backup/status` every 3s during backup-in-progress, reloads when complete - - **CSS**: Full dark-theme styles for schedule card, database table, snapshot table, repository card, validation badges, DB type badges, empty state - - **Controller version:** v0.4.5 — deployed and verified on demo-felhom.eu (2 historical snapshots loaded) - -### What was previously completed (2026-02-15 session 11) -- **v0.4.1 — App Filtering + Bugfixes:** - - **Filter bar on Alkalmazások page**: Four pill-shaped filter buttons 
(Mind/Futó/Leállítva/Telepíthető) with live count badges computed from DOM. Filters stack cards via `display: none`, updates URL with `?filter=running` via `history.replaceState`. Reads filter from URL on page load for deep-linking support. - - **New `filterCategory` template function** (`funcmap.go`): Maps container state + deployed flag to filter categories (running/stopped/available). Each stack card gets a `data-filter-state` attribute for client-side filtering. - - **Clickable dashboard stat cards**: Stat cards (Futó/Leállítva/Összes) changed from `<div>` to `<a>` with `href` linking to `/stacks?filter=running`, `/stacks?filter=stopped`, `/stacks` respectively. Hover effect with translateY + box-shadow. - - **docker-compose.yml synced to demo node**: Fixed the stale compose file that still had `dashboard.${DOMAIN}` Traefik label (from pre-v0.3.0). Now uses correct `felhom.${DOMAIN}` label + `/sys:/host/sys:ro` mount. - - **Controller version:** v0.4.1 — deployed and verified on demo-felhom.eu - - **Remaining manual tasks for Viktor (Task 2 & 3 from TASK.md):** - - Verify `felhom.demo-felhom.eu` resolves correctly (Cloudflare Tunnel public hostname may need updating from `dashboard.*` to `felhom.*`) - - Update Pi-hole local DNS if applicable - - Enable backup in `controller.yaml` on demo node (`backup.enabled: true`) - - Create `/srv/backups` directories on demo node - -### What was previously completed (2026-02-15 session 10) -- **v0.4.0 — Monitoring & Health + Backups (Phase 2 & 3):** - - **Central job scheduler** (`internal/scheduler/scheduler.go`): - - Replaces ad-hoc goroutines in main.go with a unified scheduler - - `Every(name, interval, fn)` for periodic jobs, `Daily(name, timeStr, fn)` for scheduled tasks - - Panic recovery, skip-if-running, quiet mode for high-frequency jobs (≤30s) - - Daily jobs use `Europe/Budapest` timezone with `time.Timer` for DST correctness - - Graceful shutdown with 30s timeout for running jobs - - **CPU usage collector** 
(`internal/system/cpu_linux.go`):
-    - Background goroutine samples `/proc/stat` every 5s, computes delta-based CPU %
-    - Platform stubs for non-Linux in `cpu_other.go`
-  - **Temperature & load metrics** (`internal/system/info_linux.go`):
-    - Reads `/proc/loadavg` for 1/5/15 min load averages
-    - Reads thermal zones from `/host/sys/class/thermal/` (Docker mount) with `/sys/` fallback
-    - Handles millidegree values, picks highest zone, with hwmon fallback
-  - **Healthchecks.io pinger** (`internal/monitor/pinger.go`):
-    - HTTP ping client for Healthchecks.io-compatible endpoints
-    - POST to `/ping/{uuid}` (success), `/fail` (failure), `/start` (started)
-    - 10s timeout, 3 retries with 2s backoff, skips CHANGEME UUIDs
-  - **System health checks** (`internal/monitor/healthcheck.go`):
-    - Checks disk, memory, CPU, temperature, Docker reachability, protected containers
-    - Returns HealthReport with status "ok"/"warn"/"fail" + formatted message for pings
-  - **Database dump engine** (`internal/backup/dbdump.go`):
-    - Auto-discovers PostgreSQL/MariaDB containers via `docker ps` + `docker inspect`
-    - Dumps via `docker exec pg_dump`/`mariadb-dump` with 5min timeout
-    - Atomic writes (`.tmp` → `.sql`), empty file detection, stale temp cleanup
-  - **Restic integration** (`internal/backup/restic.go`):
-    - Auto-generates repository password (32 random bytes, base64url)
-    - Init, snapshot (JSON output), prune, check, stats, latest snapshot
-    - Stale lock detection with automatic unlock + retry
-  - **Backup orchestrator** (`internal/backup/backup.go`):
-    - DB dumps + restic snapshots, weekly prune on Sundays
-    - Thread-safe running flag, Healthchecks.io pings with results
-    - `RunFullBackup()` for manual trigger (sequential: dumps → snapshot)
-  - **Wiring updates:**
-    - `main.go`: scheduler-based job registration, cpuCollector lifecycle, pinger + backupMgr init
-    - `api/router.go`: `GET /api/backup/status`, `POST /api/backup/run`
-    - `web/server.go` + `handlers.go`: pass
cpuCollector to GetInfo(), backup status on dashboard
-    - `funcmap.go`: `tempColor`, `fmtTemp`, `fmtLoad` template functions
-  - **Dashboard UI enhancements:**
-    - CPU usage bar with load average display below
-    - Temperature with colored indicator dot (green/yellow/red at 60°/75°C)
-    - Backup status card: last run time, DB count, repo size/snapshots
-    - "Mentés most" button triggers manual backup via API
-  - **Config updates:**
-    - `controller.yaml.example`: added `system_health_interval`, `hdd_path`, `system.reserved_memory_mb`
-    - `docker-compose.yml`: added `/sys:/host/sys:ro` mount for temperature reading
-    - `restic_password_file` default changed to `data/` subdir (auto-generated in named volume)
-- **Controller version:** v0.4.0 — deployed and verified on demo-felhom.eu
-
-### What was previously completed (2026-02-15 session 9)
-- **v0.3.0 — Structural refactoring (templates + server split + domain rename):**
-  - **Templates: go:embed migration** — moved all 7 HTML templates + CSS from Go string constants to individual files in `internal/web/templates/`. Created `embed.go` with `//go:embed` directive. Template loading now uses `ParseFS()` instead of `Parse()`. CSS served from embed.FS via `ReadFile()`. Zero runtime file dependencies — still compiled into the binary.
-  - **Server decomposition** — split monolithic `server.go` (540 lines) into focused files:
-    - `auth.go`: session struct, auth middleware, login/logout handlers, session management
-    - `handlers.go`: page handlers (dashboard, stacks, logs, deploy, app detail)
-    - `funcmap.go`: template FuncMap with 14 custom functions
-    - `server.go`: Server struct, NewServer, loadTemplates (3-liner), ServeHTTP routing, render helper, static file serving
-  - **Domain rename** — controller subdomain changed from `dashboard.*` to `felhom.*` in Traefik labels and setup script
-  - **Documentation updated** — CLAUDE.md, README.md, CONTEXT.md all reflect new file structure
-  - **Reminder for Viktor:** Update Cloudflare Tunnel public hostname (`dashboard.demo-felhom.eu` → `felhom.demo-felhom.eu`) and Pi-hole DNS if needed
-- **Controller version:** v0.3.0
-
-### What was previously completed (2026-02-15 session 8)
-- **FileBrowser as infrastructure service:**
-  - Created `scripts/hdd-setup.sh` (adapted from deploy-portainer) — sets up HDD folder structure with `Dokumentumok` user dir
-  - Created `scripts/docker-setup.sh` (adapted from deploy-portainer) — installs Docker, Traefik, FileBrowser as infra services
-  - Added `filebrowser` to protected stacks in `controller.yaml.example`
-  - Removed `templates/filebrowser/` from app-catalog-felhom.eu (no longer a catalog app)
-- **Orphan stack detection and deletion:**
-  - Added `Orphaned` field to Stack struct + `getCatalogTemplateSlugs()` helper
-  - Orphan detection in `ScanStacks()` — deployed stacks with no matching catalog template marked as orphaned
-  - New `delete.go`: `DeleteStack()` (compose down + HDD cleanup + dir removal), `GetStackHDDData()`, `parseComposeHDDMounts()`
-  - Safety: protected HDD paths (root, media, storage, Dokumentumok, appdata) can never be deleted
-  - New API endpoints: `DELETE /api/stacks/{name}` and `GET /api/stacks/{name}/hdd-data`
-  - UI: orange "Elavult" badge on orphaned stacks, "Törlés" button, delete
confirmation modal
-  - Modal shows HDD data paths/sizes, checkbox for "Felhasználói adatok törlése a merevlemezről"
-  - Hides "Frissítés" and "Részletek" buttons for orphaned stacks
-- **Verified:** 1 orphaned stack detected on startup (filebrowser — now infra, removed from catalog)
-- **Controller version:** v0.2.15
-
-### Previously completed (2026-02-14 session 7)
-- **Fixed YAML parse error in romm `.felhom.yml`** (app-catalog repo):
-  - Root cause: Hungarian opening quote `„` (U+201E) paired with ASCII `"` (0x22) inside YAML double-quoted strings terminated the string prematurely
-  - Affected lines: `help_text` for IGDB Client Secret and SteamGridDB API Key fields
-  - Fix: escaped inner ASCII double quotes with `\"` in the YAML strings
-  - This caused `LoadMetadata()` to silently fail and return empty defaults for ALL romm metadata (tagline, resources, category — everything)
-- **Added error logging to `LoadMetadata()`** in `metadata.go`:
-  - `[ERROR]` log on YAML parse failure (was silently swallowed — critical bug)
-  - Temporary `[DEBUG]` log used for diagnosis, then removed
-- **Fixed deploy command in CLAUDE.md**:
-  - `sed` pattern now targets only `image:` lines (was matching service name too, breaking YAML)
-  - Added `sudo` for both sed and docker compose (directory is root-owned)
-- **Controller version:** v0.2.14
-
-### Previously completed (2026-02-14 session 6)
-- **Bug fix: App info logo SVG rendering** — `.app-info-logo` CSS in `templates.go`:
-  - Added `min-width`, `min-height`, `max-width`, `max-height: 80px` and `overflow: hidden`
-  - Prevents SVG images with explicit dimensions or no viewBox from overflowing container
-  - Logo now reliably renders at 80x80 regardless of SVG intrinsic size
-- **Controller version:** v0.2.12
-
-### Previously completed (2026-02-14 session 5)
-- **App detail/info pages** — new feature:
-  - New route: `GET /apps/{slug}` renders a full info page (was redirect to deploy page)
-  - Hero section with logo, tagline,
resource badges
-  - Screenshots section (graceful — hidden via `onerror` if assets don't exist)
-  - Info cards: use cases, first steps, prerequisites, default credentials, docs link
-  - Optional config form with AJAX save (POST `/api/stacks/{name}/optional-config`)
-  - New `.felhom.yml` fields: `app_info` (tagline, use_cases, first_steps, prerequisites, default_creds, docs_url) and `optional_config` (groups of env var fields)
-  - New structs in `metadata.go`: `AppInfo`, `OptionalConfigGroup`, `OptionalConfigField`
-  - `UpdateOptionalConfig` in `deploy.go`: saves optional env vars to `app.yaml`, restarts deployed stacks with `docker compose up -d` to pick up new env vars
-  - Navigation updated: stack cards on dashboard/stacks pages now link to `/apps/{slug}`, deploy page has "Részletek" link back to info page
-- **RoMM metadata updated** (app-catalog repo):
-  - Full `app_info` section: tagline, 5 use cases, 6 first steps, 3 prerequisites, default creds, docs URL
-  - 6 optional config fields for metadata providers: IGDB (client_id + secret), SteamGridDB, ScreenScraper (user + password), MobyGames
-  - docker-compose.yml updated with SCREENSCRAPER_USER, SCREENSCRAPER_PASSWORD, MOBYGAMES_API_KEY env vars
-  - Display name fixed: "ROMM" → "RomM"
-- **Controller version:** v0.2.11
-
-### Previously completed (2026-02-14 session 4)
-- **Fixed deploy race condition** in `internal/stacks/deploy.go`:
-  - In-memory `Deployed` flag now set BEFORE `docker compose up -d` (compose up can take 30-60s for image pulls)
-  - On failure: both in-memory state and disk (app.yaml) are reverted
-  - Eliminates stale "Telepítés" button during long compose operations
-- **Added `checkBeforeDeploy()` JS guard** in `internal/web/templates.go`:
-  - Telepítés buttons on Vezérlőpult and Alkalmazások pages now fetch live state from `/api/stacks/{name}` before navigating
-  - If app is already deployed (e.g., another tab deployed it), shows alert and reloads page instead of navigating to deploy form
- Catches stale UI state gracefully
-
-### Previously completed (2026-02-14 session 3)
-- **Enhanced debug logging** across all stack operations in `internal/stacks/`:
-  - **Operation timing**: All stack ops (start, stop, restart, update, deploy) now log elapsed time
-  - **Post-start container state check**: Async goroutine after start/restart/update/deploy
-  - **Image pull detection**: Checks local images before deploy/update (debug level)
-  - **GetLogs/ScanStacks improvements**: Byte count logging, deployed/available counts
-  - All verbose checks gated on `cfg.Logging.Level == "debug"`; timing always at INFO
-- **UI improvements** in `internal/web/templates.go` and `server.go`:
-  - **Memory bar fix on deploy page**: Bar segments now always visible (min-width: 3px), new app segment uses translucent green with distinct border for clear visual separation from committed memory
-  - **Clickable app cards**: Cards on Vezérlőpult and Alkalmazások pages are now clickable (navigates to deploy/detail page). Uses `data-href` attribute + delegated click handler. Protected stacks excluded. Actions area (buttons, state labels) excluded from click-to-navigate
-  - **Live-scrolling logs**: Logs page now auto-refreshes every 3s via AJAX polling (`?raw=1` returns plain text). Fixed-height container (70vh) with auto-scroll to bottom. Pulsing green "Élő" indicator. Pause/resume toggle ("Szüneteltetés"/"Folytatás"). User scroll position preserved when scrolled up to read history
-  - **Deployment progress UI**: Deploy button no longer shows alert+redirect immediately. Instead shows 3-step progress panel: config saved → containers starting → app initializing. Polls `GET /api/stacks/{name}` every 3s to track actual container health state. Handles running (auto-redirect), starting (keep polling), unhealthy (warning), exited (error), and 120s timeout.
Shows elapsed time counter
-- **Mealie healthcheck fix** (app-catalog-felhom.eu):
-  - `wget --spider` replaced with Python TCP socket check — mealie image doesn't include wget
-  - `start_period` increased to 60s (DB migrations take ~40s on first start)
-- **Healthcheck audit**: filebrowser (Alpine, has BusyBox wget — OK), stirling-pdf (Ubuntu, has wget — OK)
-
-### Previously completed (2026-02-15 session 2)
-- **Phase 4: Git Sync + App Catalog Audit** — major milestone
-- **Git sync module** (`internal/sync/sync.go`):
-  - Clones/pulls app-catalog-felhom.eu repo to local cache on startup
-  - Periodic sync based on `git.sync_interval` (default 15m)
-  - Copies `docker-compose.yml` + `.felhom.yml` to stacks dir (never overwrites `app.yaml`/`.env`)
-  - SHA-256 content comparison — only writes changed files
-  - Triggers `ScanStacks()` after sync so dashboard updates immediately
-  - Uses `os/exec` git CLI — no Go git library dependency
-- **Manual sync button** ("Sablonok frissítése") on Alkalmazások page:
-  - `POST /api/sync` endpoint with 30s debounce
-  - Toast notification shows result (success/failure/what changed)
-  - Auto-reloads page if new apps or updates detected
-- **Sync status** added to `/api/system/info` (last_sync, last_status, syncing flag)
-- **.felhom.yml files created for all 10 apps** (paperless-ngx already had one):
-  - actualbudget, docmost, filebrowser, homebox, immich, mealie, romm, stirling-pdf, vaultwarden
-  - All follow the same format: display_name, description, category, subdomain, resources, deploy_fields
-- **Docker Compose templates audited and fixed** for all 10 apps:
-  - Fixed `{{DOMAIN}}` → `${DOMAIN}` syntax in homebox, mealie, romm, stirling-pdf
-  - Fixed `{{HDD_PATH}}` → `${HDD_PATH}` in romm
-  - Added `deploy.resources.limits.memory` to all services across all templates
-  - Added `TZ=Europe/Budapest` to all sidecar services (postgres, redis, mariadb)
-  - Added healthcheck to romm main service
-  - Added `romm-redis` `condition:
service_healthy` (was `service_started`)
-  - Standardized header comment blocks across all templates
-- **Documentation updated**: app-catalog README, CLAUDE.md, CONTEXT.md
-
-### Previously completed (2026-02-15 session 1)
-- **Memory validation during deployment**:
-  - Pre-deploy memory check: compares `mem_request` sum against usable system RAM
-  - Hard block if requests exceed usable memory (total - 384MB reserved)
-  - Soft warning if `mem_limit` sum exceeds total RAM (overcommit OK for limits)
-  - `ParseMemoryMB()` supports "500M", "1G", "1.5G", "1024" formats
-  - `CommittedMemory()` sums requests/limits across all deployed stacks
-  - Memory summary bar shown on deploy page before user clicks deploy
-  - `system.reserved_memory_mb` configurable in controller.yaml (default: 384)
-- **Display: `~` prefix on mem_request** in UI badges (display-only, exact value stored)
-- **Felhom.eu logo** replaced text logos in sidebar and login page with actual SVG logo
-  - Logo SVG embedded as Go string constant, served at `/static/felhom-logo.svg`
-
-### Previously completed (2026-02-14)
-- **System info bar on Vezérlőpult dashboard**: RAM, SSD, and optional HDD usage
-  - Progress bars with color coding (green < 70%, yellow 70-85%, red > 85%)
-  - New `internal/system` package reads `/proc/meminfo` + `syscall.Statfs`
-  - Platform-specific: Linux impl + non-Linux stub (build tags)
-  - Hungarian labels: "Memória", "SSD tárhely", "Külső HDD"
-- **Docker Compose memory limits** on paperless-ngx template:
-  - paperless-webserver: 768M, postgres: 256M, redis: 128M
-  - Added `mem_limit` field to `.felhom.yml` ResourceHints (total: 1152M)
-- **`/api/system/info` endpoint** now returns live system metrics (was customer info)
-- **Config**: Added `paths.hdd_path` for external HDD monitoring
-- Controller image builds via build.sh, pushes to Gitea container registry
-
-### Previously completed (2026-02-13)
-- Built the entire felhom-controller from scratch (Go, no frameworks)
-- Debugged
and fixed 7 issues during first real deployment:
-  1. Password validation (empty passwords accepted)
-  2. In-memory Deployed flag not updating after deploy
-  3. Health-aware state parsing (starting/unhealthy detection)
-  4. Random card ordering (Go map iteration)
-  5. "Részletek" button redirect for deployed apps
-  6. Paperless OCR language installation (LANGUAGES vs LANGUAGE env var)
-  7. Documentation: restart vs up -d for image updates
-
-### What's next (priorities)
-1. **Test per-app backup** — enable backup for Paperless-ngx HDD data, trigger manual backup, verify restic snapshot includes HDD paths
-2. **Test restore** — restore app data from snapshot, verify file recovery (now possible with /mnt:rw mount)
-3. **Deploy Immich** — tests HDD path + secrets + multi-storage (biggest real-world test)
-4. Add `app_info` + `optional_config` to more apps (Immich, Mealie, Vaultwarden)
-5. Test on Raspberry Pi (pi-customer-1)
-6. Self-update mechanism
-7. Hub alerting (webhook to Healthchecks for stale customers)
-8. Docker volume backup (mount `/var/lib/docker/volumes:ro` into controller)
-
 ## Architecture decisions
 | Decision | Rationale |