## Changelog
### v0.30.4 — Deep Bug Hunt II: Concurrency, Security & Optimization (2026-02-25)
#### Fixed (Critical)
- **Watchdog mutex panic** — Wrapped `handleDisconnect` call in anonymous func with deferred re-lock to guarantee mutex re-acquisition even on panic (C1)
- **SetGeoAppOverride nil crash** — Added nil guard; passing nil override now correctly deletes the entry instead of panicking (C2)
- **SSD-only app DB restore** — `restoreDBDumps` now falls back to `app.DrivePath` when `HDDPath` is empty (C3)
#### Fixed (High)
- **Double deploy race** — Added atomic check-and-set of `Deploying` flag with `clearDeploying()` helper on all error paths (H1)
- **Delete/Remove during deploy** — Both `DeleteStack` and `RemoveStack` now reject operations while stack is deploying (H2)
- **ScanStacks overwrite** — Skips updating `Deployed`/`AppConfig` for stacks with active deploy in progress (H3)
- **FileBrowser mount race** — Added `fileBrowserMu` mutex to prevent concurrent `SyncFileBrowserMounts` calls (H5)
- **PushEvent history gap** — Added `recordHistory` calls on both success and failure paths in PushEvent goroutine (H6)
- **PushOnce silent failure** — Now returns error for non-2xx HTTP responses instead of nil (H7)
- **DB dump file corruption** — Added `tmpFile.Sync()` and `tmpFile.Close()` before rename in `DumpOne` (H8)
- **Restic retry timeout** — Creates fresh 30-minute context for retry after unlock instead of reusing near-expired original (H9)
- **Encrypt failure silent** — Added warning log when encryption fails in `SaveAppConfig` (H10)
- **Cross-backup path traversal** — Validates destination path against registered storage paths in both web and API handlers (H11)
- **deepCopyStack incomplete** — Now deep-copies `Meta.OptionalConfig`, `Meta.HealthCheck`, and `DeployField.Options` (H12)
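The double-deploy and delete-during-deploy guards (H1, H2) both come down to checking and setting the `Deploying` flag inside one critical section. The following is a sketch under stated assumptions: the `manager` and `stack` types, `beginDeploy`, and the error value are hypothetical; only `Deploying`, `clearDeploying()`, and `DeleteStack` are named above.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errAlreadyDeploying = errors.New("deploy already in progress")

// stack and manager are simplified, hypothetical stand-ins.
type stack struct{ Deploying bool }

type manager struct {
	mu     sync.Mutex
	stacks map[string]*stack
}

// beginDeploy checks and sets the flag under one lock acquisition, so two
// concurrent callers cannot both observe Deploying == false.
func (m *manager) beginDeploy(name string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	s, ok := m.stacks[name]
	if !ok {
		return fmt.Errorf("unknown stack %q", name)
	}
	if s.Deploying {
		return errAlreadyDeploying
	}
	s.Deploying = true
	return nil
}

// clearDeploying must be called on every error path and on completion.
func (m *manager) clearDeploying(name string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if s, ok := m.stacks[name]; ok {
		s.Deploying = false
	}
}

// deleteStack refuses to run while a deploy is in progress.
func (m *manager) deleteStack(name string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	s, ok := m.stacks[name]
	if !ok {
		return fmt.Errorf("unknown stack %q", name)
	}
	if s.Deploying {
		return errAlreadyDeploying
	}
	delete(m.stacks, name)
	return nil
}

func main() {
	m := &manager{stacks: map[string]*stack{"wiki": {}}}
	fmt.Println(m.beginDeploy("wiki"))
	fmt.Println(m.beginDeploy("wiki"))
	fmt.Println(m.deleteStack("wiki"))
	m.clearDeploying("wiki")
	fmt.Println(m.deleteStack("wiki"))
}
```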
#### Security
- **Constant-time API key** — Replaced `==` with `subtle.ConstantTimeCompare` for API key comparison, preventing timing attacks (M1)
- **Login rate limiting** — Added per-IP rate limiter (5 attempts/minute) to login handler (M8)
- **Git credential masking** — Applied `maskRepoURL()` in `runGitInDir` log output to prevent credential leakage (M23)
- **Path prefix traversal** — Fixed `storageAttachBrowseHandler` prefix check to require trailing `/`, preventing sibling directory matches (M24)
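Two of the security fixes above (M1, M24) can be illustrated in a few lines. This is a sketch only: `apiKeyMatches` and `underRoot` are hypothetical helper names, not the controller's actual functions. `subtle.ConstantTimeCompare` is the standard-library call named in the entry.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"path/filepath"
	"strings"
)

// apiKeyMatches compares two keys without an early exit on the first
// differing byte. Note that ConstantTimeCompare still returns immediately
// when the lengths differ, so the key length itself is not hidden.
func apiKeyMatches(got, want string) bool {
	return subtle.ConstantTimeCompare([]byte(got), []byte(want)) == 1
}

// underRoot reports whether path is root itself or lies below it.
// Appending the separator before the prefix check is what stops a sibling
// such as /mnt/hdd2 from matching the root /mnt/hdd.
func underRoot(path, root string) bool {
	sep := string(filepath.Separator)
	path = filepath.Clean(path)
	root = filepath.Clean(root)
	if path == root {
		return true
	}
	return strings.HasPrefix(path, strings.TrimSuffix(root, sep)+sep)
}

func main() {
	fmt.Println(apiKeyMatches("s3cret", "s3cret"))
	fmt.Println(underRoot("/mnt/hdd/photos", "/mnt/hdd"))
	fmt.Println(underRoot("/mnt/hdd2/photos", "/mnt/hdd"))
}
```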
#### Concurrency & Logic
- **MigrateEncryption race** — Moved `encKey == nil` check inside the mutex lock (M5)
- **SubdomainInUse I/O under lock** — Collect stack dirs under RLock, release, then perform disk I/O outside (M4)
- **Scheduler late jobs** — Jobs registered after `Start()` now immediately get their goroutine launched (M10)
- **SQLite WAL verification** — WAL pragma now verified via `QueryRow` + `Scan` instead of silent `Exec` (M13)
- **Metrics shutdown** — `sampleContainers` now uses parent context instead of `context.Background()` for clean shutdown (M14)
- **Telemetry scan logging** — Row scan errors now logged instead of silently swallowed (M15)
- **Asset sync lock** — Refactored to hold mutex only for status updates, not during entire HTTP download (M22)
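The `SubdomainInUse` change (M4) follows a two-phase pattern: snapshot what is needed under the read lock, release it, then do the slow disk I/O. The sketch below assumes a hypothetical `manager` type, an injectable `readFile` field for testability, and an `app.yaml` that contains a `SUBDOMAIN: <value>` line; none of these details are confirmed by the changelog.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
	"sync"
)

// manager is a hypothetical stand-in; stacks maps stack name to directory.
type manager struct {
	mu       sync.RWMutex
	stacks   map[string]string
	readFile func(string) ([]byte, error) // os.ReadFile outside of tests
}

func (m *manager) subdomainInUse(sub string) bool {
	// Phase 1: copy the directory list while holding the read lock.
	m.mu.RLock()
	dirs := make([]string, 0, len(m.stacks))
	for _, dir := range m.stacks {
		dirs = append(dirs, dir)
	}
	m.mu.RUnlock()

	// Phase 2: disk I/O with no lock held, so writers are not blocked.
	for _, dir := range dirs {
		data, err := m.readFile(filepath.Join(dir, "app.yaml"))
		if err != nil {
			continue // missing or unreadable config: treat as not using it
		}
		for _, line := range strings.Split(string(data), "\n") {
			if strings.TrimSpace(line) == "SUBDOMAIN: "+sub {
				return true
			}
		}
	}
	return false
}

func main() {
	m := &manager{
		stacks:   map[string]string{"wiki": "/nonexistent/wiki"},
		readFile: os.ReadFile,
	}
	fmt.Println(m.subdomainInUse("wiki"))
}
```

The trade-off is that the snapshot can be slightly stale by the time the files are read, which is usually acceptable for a validation check.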
#### Optimization
- **DB dump copy** — Replaced `os.ReadFile`/`os.WriteFile` with streaming `io.Copy` via `copyFile` helper for large dumps (M16)
- **Restic stats dedup** — Per-drive stats now computed once and aggregated, eliminating duplicate restic subprocess calls (M17)
- **Infra config atomic** — `syncInfraConfig` controller.yaml copy now uses atomic write via `copyFile` (M20)
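The `copyFile` helper mentioned in M16 and M20 is not shown in this changelog. A minimal sketch of what a streaming, atomic copy could look like, combining `io.Copy` with the sync, close, then rename order from H8 (the signature and the `demo` helper are assumptions):

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// copyFile streams src into a temp file next to dst, flushes it to disk,
// and renames it into place, so readers never see a partial dst.
func copyFile(src, dst string) (err error) {
	in, err := os.Open(src)
	if err != nil {
		return err
	}
	defer in.Close()

	tmp, err := os.CreateTemp(filepath.Dir(dst), filepath.Base(dst)+".tmp-*")
	if err != nil {
		return err
	}
	// On any failure, close and remove the temp file.
	defer func() {
		if err != nil {
			tmp.Close()
			os.Remove(tmp.Name())
		}
	}()

	if _, err = io.Copy(tmp, in); err != nil { // constant memory use
		return err
	}
	if err = tmp.Sync(); err != nil { // flush data before the rename
		return err
	}
	if err = tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), dst)
}

// demo copies a small file inside a temp directory and returns the copy.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "copyfile")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	src := filepath.Join(dir, "dump.sql")
	dst := filepath.Join(dir, "dump.copy.sql")
	if err := os.WriteFile(src, []byte("-- dump"), 0o600); err != nil {
		return "", err
	}
	if err := copyFile(src, dst); err != nil {
		return "", err
	}
	out, err := os.ReadFile(dst)
	return string(out), err
}

func main() {
	fmt.Println(demo())
}
```

The temp file is created in the destination directory because a rename is only atomic within one filesystem.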
### v0.30.3 — Comprehensive Bug Hunt Fixes (2026-02-25)
#### Fixed (Critical — P0)
- **Encrypted env vars** — `UpdateStackConfig` now uses decrypted values when building compose env, preventing `ENC:...` literals in containers (C01)
- **Silent decrypt failures** — `DecryptMap` now logs warnings on decrypt failure instead of silently returning empty values (C02)
- **Deploy race condition** — `Deployed = false` flag now set inside the mutex lock in `runComposeDeploy` (C03)
- **Shared state mutation** — `GetStack`/`GetStacks` now return deep copies preventing callers from mutating cached state (C04)
- **Watchdog races** — Added per-state mutex to `pathProbeState` for thread-safe probe state access (C05)
- **Metrics double-start** — `MetricsCollector.Start()` guarded with `sync.Once` (C06)
- **Raw mount race** — `diskJobMu` now held across entire cleanup+mount+set operation (C07)
- **Encryption key race** — Added mutex to `SetEncryptionKey` (C08)
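The shared-state fix (C04) depends on returning deep copies, because a plain struct copy in Go still shares maps and slices with the original. A sketch with a hypothetical, trimmed-down `Stack` type (the real struct has more fields):

```go
package main

import "fmt"

// Stack is a hypothetical, reduced version of the cached stack record.
type Stack struct {
	Name    string
	Env     map[string]string
	Options []string
}

// deepCopyStack returns a copy that shares no mutable state with s.
func deepCopyStack(s *Stack) *Stack {
	if s == nil {
		return nil
	}
	cp := *s // copies scalar fields; map and slice headers still alias s
	if s.Env != nil {
		cp.Env = make(map[string]string, len(s.Env))
		for k, v := range s.Env {
			cp.Env[k] = v
		}
	}
	// Appending to a nil slice allocates a fresh backing array.
	cp.Options = append([]string(nil), s.Options...)
	return &cp
}

func main() {
	cached := &Stack{
		Name:    "wiki",
		Env:     map[string]string{"SUBDOMAIN": "wiki"},
		Options: []string{"a"},
	}
	got := deepCopyStack(cached)
	got.Env["SUBDOMAIN"] = "changed"
	got.Options[0] = "z"
	fmt.Println(cached.Env["SUBDOMAIN"], cached.Options[0])
}
```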
#### Fixed (High — P1)
- **Restic lock detection** — `Snapshot()` now extracts stderr from `*exec.ExitError` and checks `unlockCmd.Run()` error (H01)
- **Disconnected drives in backup** — `activeDrives()` now skips disconnected/decommissioned drives (H02)
- **Template rendering** — Buffered via `bytes.Buffer` to prevent partial HTML on error (H07)
- **Sync stop panic** — `Stop()` uses `sync.Once` for safe channel close (H08)
- **Sync race** — `syncing = true` set before releasing lock in `TriggerSync` (H09)
- **Cloudflare context** — Threaded `context.Context` through all Cloudflare API calls for cancellation support (H10)
- **Cross-drive collision** — Replaced flawed leaf-name dedup with proper `seen` map (H15)
- **CSRF bypass** — Bearer token now validated against Hub API key before skipping CSRF (H16)
- **Nil pointer** — Added nil check for `crossDriveRunner` in handlers (H17)
- **Selftest panic** — Replaced `out[:len(out)-1]` with `strings.TrimSpace` (H18)
- **Stderr goroutine** — Added `sync.WaitGroup` in `MigrateDrive` (H19)
- **UUID slice** — Guarded `uuid[:8]` with length check (H20)
- **Fstab matching** — Parse fields exactly instead of loose `strings.Contains` (H21)
- **Atomic save** — `SaveAppConfig` writes to `.tmp` then renames (H04)
- **Deploy failure** — `SaveAppConfig` on failure now includes `encKey` (H05)
- **Encryption migration** — Uses write lock instead of read lock (H03)
- **Deep copy** — `GetFullStatus` deep-copies `lastDBDump`/`lastBackup` (H11)
- **IPv6** — TCP health probe uses `net.JoinHostPort` for IPv6 compatibility
- **Backup path validation** — `RemoveStack` validates paths under expected directory (M12)
- **Updater race** — `SetBackupRunningCheck` protected by mutex (M18)
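The fstab fix (H21) is a good example of why substring matching is fragile: `strings.Contains(line, "/mnt/hdd")` also matches an entry for `/mnt/hdd2`. A sketch of field-exact matching, assuming the check is against the mount point in the second column (the changelog does not say which field is compared, and `fstabHasMount` is a hypothetical name):

```go
package main

import (
	"fmt"
	"strings"
)

// fstabHasMount reports whether the fstab content has an entry whose
// mount point (second whitespace-separated field) equals mountPoint.
func fstabHasMount(fstab, mountPoint string) bool {
	for _, line := range strings.Split(fstab, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blanks and comments
		}
		fields := strings.Fields(line)
		if len(fields) >= 2 && fields[1] == mountPoint {
			return true
		}
	}
	return false
}

func main() {
	fstab := "# /dev/sdb1 /mnt/old ext4 defaults 0 2\n" +
		"UUID=abcd-1234 /mnt/hdd2 ext4 defaults,nofail 0 2\n"
	fmt.Println(fstabHasMount(fstab, "/mnt/hdd2"))
	fmt.Println(fstabHasMount(fstab, "/mnt/hdd"))
	fmt.Println(fstabHasMount(fstab, "/mnt/old"))
}
```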
#### Fixed (Medium — P2)
- **Config env overrides** — `LoadFromBytes` now calls `applyEnvOverrides` (M05)
- **Selfupdate state** — Compose-up failure now sets `state.Status = "failed"` (M16)
- **Memory check** — `usableMB` clamped to min 0 (M22)
- **Cross-backup trigger** — Removed invalid "manual" schedule from `triggerAllCrossBackups` (M23)
- **mmcblk support** — Partition path and `stripPartition` now handle mmcblk devices (M21, L25)
- **Scheduler** — `Start()` guarded against double-start, `Stop()` acquires mutex (M14, L24)
- **Pending events** — Events restored on save failure in `DrainPendingEvents` (M03)
- **Duplicate storage** — `AddStoragePath` rejects already-registered paths (M04)
- **Setup scan** — `CleanupTempMounts` called after drive scan (H13)
- **Setup state** — `SetStep` now logs save errors (M25)
#### Fixed (Low — P3)
- **UTF-8 truncation** — `TruncateStr` now operates on runes and handles negative maxLen (L05/L06)
- **AllDone** — Returns false for empty restore plans (L14)
- **PushOnce** — Returns actual errors instead of swallowing them (L39)
- **CSRF token** — Panics on `crypto/rand.Read` failure instead of using static fallback (L40)
- **Logout** — Requires POST method (L32)
- **Server.Close** — Uses `sync.Once` to prevent double-close panic (L49)
- **Log cap** — `lines` query parameter capped at 10000 (L31)
- **Hash function** — Replaced custom `simpleHash` with `crc32.ChecksumIEEE` (L48)
- **hasPrefix** — Replaced custom implementation with `strings.HasPrefix` (L13)
- **DefaultEnabledEvents** — Copied in `GetNotificationPrefs` early return (L09)
- **Variable shadowing** — Renamed `copy` to `cp` in `SetNotificationPrefs` (L07)
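The UTF-8 truncation fix (L05/L06) matters for this project because the UI text is Hungarian: slicing a string by bytes can cut a character such as `ö` in half. A sketch of rune-based truncation follows. The lowercase `truncateStr` is a hypothetical reimplementation, and returning an empty string for a non-positive `maxLen` is an assumption about how the negative case is handled.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateStr returns at most maxLen runes of s. A zero or negative
// maxLen yields the empty string instead of panicking on a bad slice.
func truncateStr(s string, maxLen int) string {
	if maxLen <= 0 {
		return ""
	}
	r := []rune(s)
	if len(r) <= maxLen {
		return s
	}
	return string(r[:maxLen])
}

func main() {
	s := "Földrajzi korlátozás"
	fmt.Println(truncateStr(s, 3))
	// Byte slicing at index 2 cuts the two-byte "ö" in half.
	fmt.Println(utf8.ValidString(s[:2]))
	fmt.Println(utf8.ValidString(truncateStr(s, 2)))
}
```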
#### Removed
- Dead `imageName` function in selfupdate (L02)
- Dead `detectHostIPViaRoute` function in setup (L03)
- Custom `hasPrefix` function in restore_scan (L13)
### v0.30.2 — Report geo-restriction + logo/favicon update (2026-02-25)
#### Added
- **Geo-restriction in reports** (`internal/report/`) — New `GeoRestrictionReport` struct and `geo_restriction` field in the Report JSON. Hub can now display current geo-blocking status (enabled, allowed countries, per-app overrides, sync state) on customer detail pages.
- **Favicon route** (`/static/favicon.svg`) — Separate favicon SVG served from synced assets or embedded fallback. Uses the cloud icon from `logo_favicon_2.svg`.
- **Hub Bearer auth for geo API** — `/api/geo/` routes are now wrapped in `selfUpdateAuthMiddleware` (session auth OR Hub API key), allowing the Hub to send geo-disable commands to controllers.
#### Changed
- **Logo SVG updated** (`internal/web/templates.go`) — Replaced embedded logo with the latest `logo.svg` from the website (white text variant).
- **Favicon link** — Layout and catch-all templates now reference `/static/favicon.svg` instead of the full logo.
### v0.30.1 — Geo-Restriction fix (2026-02-25)
#### Fixed
- **WAF rule creation** — Removed custom block response body from WAF rules (requires paid Cloudflare plan). Block action now uses Cloudflare's default 403 page.
### v0.30.0 — Geo-Restriction via Cloudflare WAF (2026-02-25)
#### Added
- **Geo-restriction feature** (`internal/cloudflare/`) — New package for managing Cloudflare WAF Custom Rules. Allows restricting access to apps by country using the `http_request_firewall_custom` phase. Rules are identified by `[felhom-geo]` description prefix — other WAF rules are untouched.
- **Cloudflare API client** (`internal/cloudflare/client.go`) — HTTP client with Bearer token auth for the Cloudflare v4 API. Supports zone lookup, ruleset management, and rule CRUD operations.
- **Country data** (`internal/cloudflare/countries.go`) — Embedded map of ~250 ISO 3166-1 alpha-2 country codes with Hungarian names. Includes search helpers for the UI.
- **Geo sync manager** (`internal/cloudflare/geosync.go`) — Orchestrator that diffs desired vs existing Cloudflare rules and applies changes. Runs on settings change, after app deploy/remove, and every 6 hours for verification.
- **Settings page UI** (`templates/settings.html`) — New "Földrajzi korlátozás" section with searchable country selector (autocomplete dropdown → tag chips), enable/disable toggle, per-app override summary, and sync status display. Hungary removal triggers a confirmation warning.
- **Per-app override** (`templates/app_info.html`) — Each app's detail page now has a "Földrajzi korlátozás" section (when the feature is globally enabled) to set app-specific allowed countries.
- **Geo API endpoints** (`internal/api/geo.go`) — `GET /api/geo/status`, `POST /api/geo/settings`, `POST /api/geo/sync`, `GET /api/geo/countries`, `POST/DELETE /api/stacks/{name}/geo/override`.
- **Settings model** (`internal/settings/settings.go`) — New `GeoRestriction` struct with `AllowedCountries`, `AppOverrides`, and sync state (zone ID, ruleset ID, last sync). Thread-safe getter/setter methods following existing RWMutex pattern.
#### Changed
- **Router** (`internal/api/router.go`) — Added `OnGeoRelevantChange` callback triggered after app deploy/remove to re-sync geo rules when hostnames change.
- **Main wiring** (`cmd/controller/main.go`) — Cloudflare client, geo sync manager, and scheduler job initialized when `cf_api_token` is configured. New `geoStackAdapter` provides deployed app hostnames.
#### Hub Changes
- **Config form** (`hub/internal/web/templates/config_form.html`) — Updated CF API token help text to indicate Zone WAF:Edit permission is needed for geo-restriction.
#### Notes
- The existing `cf_api_token` needs **Zone WAF:Edit** permission added (in addition to existing Zone DNS:Edit for ACME). No new token field is needed.
- Local network access is inherently unaffected — local traffic bypasses Cloudflare entirely.
- Cloudflare Free plan supports up to 5 custom rules, which is sufficient for a global rule + a few per-app overrides.
### v0.29.3 — Controller-side Health Probes (2026-02-25)
#### Added
- **HTTP/TCP health probes** (`internal/stacks/healthprobe.go`) — The controller now probes deployed apps directly over the Docker network to verify services are actually responding, not just that containers are running. The probe loop runs every minute; each app's probe interval is configurable (default 5 min).
- **Three probe types**: `http` (any response = alive), `api` (validates status code and response body), `tcp` (port reachability). Multiple checks per app supported.
- **`.felhom.yml` healthcheck config** (`internal/stacks/metadata.go`) — New `healthcheck:` section with `interval`, `checks[]` (type, port, path, method, expect). Parsed from app catalog metadata.
- **State override** (`internal/stacks/manager.go`) — If a running container's health probe fails, the stack state is overridden to "unhealthy". Clears automatically when probe passes again.
#### Fixed
- **Vikunja healthcheck** — Removed Docker-level healthcheck (distroless image has no wget/curl). Controller-side API probe to `:3456/api/v1/info` replaces it.
### v0.29.2 — Dynamic Logo & Favicon (2026-02-25)
#### Changed
- **Logo served from synced assets** (`internal/web/server.go`) — `serveLogoHandler` now checks the Hub-synced assets directory for `felhom-logo.svg` first, falling back to the embedded SVG constant if not found. This allows logo updates via Hub without a controller rebuild.
#### Added
- **SVG favicon** (`templates/layout.html`, `templates/catchall.html`) — Added `<link rel="icon" type="image/svg+xml">` pointing to `/static/felhom-logo.svg` so browsers display the Felhom logo as a tab icon.
### v0.29.1 — Fix Git Lock File Stale After Interrupted Sync (2026-02-24)
#### Fixed
- **Stale git lock file recovery** — Catalog sync now removes stale `.git/index.lock`, `.git/shallow.lock`, and `.git/HEAD.lock` files before running `git fetch`/`git reset`. Previously, if the container was killed mid-sync, the leftover lock file would block all subsequent syncs until manual intervention.
### v0.29.0 — Encrypt Sensitive Values in app.yaml (2026-02-23)
#### Added
- **AES-256-GCM encryption for app.yaml secrets** — Sensitive deploy field values (`type: password` and `type: secret`) are now encrypted at rest in each stack's `app.yaml` using a per-node 32-byte key. Encrypted values are stored as `ENC:base64(nonce+ciphertext)`. New `internal/crypto` package provides `Encrypt`, `Decrypt`, `LoadOrCreateKey`, `DecryptMap`, and `IsEncrypted` helpers.
- **Encryption key in infra backup** — The encryption key (`encryption.key`) is included in the Hub infra backup bundle (`encryption_key_b64` field) and local drive infra backups for disaster recovery.
- **Encryption key restore** — The setup wizard's infra restore flow restores `encryption.key` from the backup bundle so encrypted app.yaml values remain readable after disaster recovery.
- **Startup migration** — On first start after upgrade, existing plaintext sensitive values in deployed stacks' `app.yaml` files are automatically encrypted in-place.
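The `ENC:base64(nonce+ciphertext)` format described above maps directly onto Go's standard AES-GCM API. The sketch below is not the `internal/crypto` package itself: the lowercase `encrypt`/`decrypt` functions, their signatures, the choice of standard (not URL-safe) base64, and the plaintext pass-through are all assumptions for illustration.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

const encPrefix = "ENC:"

func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key) // a 32-byte key selects AES-256
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encrypt returns "ENC:" + base64(nonce || ciphertext+tag).
func encrypt(key []byte, plaintext string) (string, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return "", err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	// Seal appends to its first argument, so the nonce ends up in front.
	sealed := gcm.Seal(nonce, nonce, []byte(plaintext), nil)
	return encPrefix + base64.StdEncoding.EncodeToString(sealed), nil
}

// decrypt reverses encrypt; values without the prefix pass through.
func decrypt(key []byte, value string) (string, error) {
	if !strings.HasPrefix(value, encPrefix) {
		return value, nil
	}
	raw, err := base64.StdEncoding.DecodeString(strings.TrimPrefix(value, encPrefix))
	if err != nil {
		return "", err
	}
	gcm, err := newGCM(key)
	if err != nil {
		return "", err
	}
	if len(raw) < gcm.NonceSize() {
		return "", errors.New("ciphertext too short")
	}
	nonce, ct := raw[:gcm.NonceSize()], raw[gcm.NonceSize():]
	pt, err := gcm.Open(nil, nonce, ct, nil) // fails on wrong key or tampering
	if err != nil {
		return "", err
	}
	return string(pt), nil
}

func main() {
	key := make([]byte, 32) // all-zero demo key; real keys must be random
	enc, _ := encrypt(key, "hunter2")
	dec, err := decrypt(key, enc)
	fmt.Println(strings.HasPrefix(enc, encPrefix), dec, err)
}
```

Because GCM is authenticated, a wrong key or a modified value produces an error instead of garbage output, which is what makes the decrypt-failure warning in v0.30.3 (C02) possible.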
#### Changed
- **`SaveAppConfig` signature** — Now accepts `encKey []byte` and `sensitiveVars []string` parameters for encryption. All callers (deploy, update, optional config, inject missing fields, HDD path update, storage handlers) updated.
- **`LoadAppConfigDecrypted`** — New helper that loads app.yaml and transparently decrypts all `ENC:` values for docker-compose env injection and web UI display.
- **`SensitiveEnvVars`** — New exported helper that identifies sensitive env vars from `.felhom.yml` metadata (`type: password` or `type: secret` deploy fields).
- **Manager struct** — Added `encKey` field and `SetEncryptionKey()` / `MigrateEncryption()` methods.
- **Web Server struct** — Added `encKey` field and `SetEncryptionKey()` method; deploy handler decrypts values before template rendering.
### v0.28.8 — Password UX Polish (2026-02-23)
#### Fixed
- **Password fields empty after deployment** (`templates/deploy.html`) — Password-type deploy fields now read their stored value from `DeployedFieldValues` (app.yaml env) when viewing settings for an already-deployed app, instead of always using the field's `.Default` (which was empty).
- **Post-deploy credentials masked** — Passwords on the post-deploy success card are now shown as `••••••••••••` with "Megjelenítés" (reveal) and "Másolás" (copy to clipboard) buttons, instead of displaying plaintext.
#### Changed
- **Settings page: initial password hint** — Deployed password fields show a note: *"Telepítéskor beállított kezdeti jelszó — ha az alkalmazásban megváltoztattad, az itt nem frissül."* (in English: "Initial password set at deployment; if you changed it in the app, it is not updated here.") Generate button is hidden for already-deployed apps.
- **Post-deploy credential detection** — Added EMAIL to the username-detection heuristic (catches Kimai's `ADMIN_EMAIL`).
### v0.28.7 — Password Field UX (2026-02-23)
#### Changed
- **Password deploy fields: masked input with reveal & confirmation** (`templates/deploy.html`) — `type: password` fields now render as masked inputs (hidden by default) with an eye toggle button to reveal/hide. Added a "Jelszó megerősítése" confirmation field below each password input. The "Generálás" button fills both fields simultaneously. Form validation checks that both fields match before allowing deploy. Confirmation fields are only shown for new deployments.
- **App catalog: admin passwords use `type: password`** (separate repo: `app-catalog-felhom.eu`) — Changed 4 apps (Nextcloud, Grafana, Kimai, Code-server) from `type: secret` to `type: password` so users can see/edit/generate admin passwords during deployment (matching the existing Paperless-ngx pattern).
### v0.28.6 — Filebrowser Link, Appdata Paths & Log Timestamps (2026-02-23)
#### Added
- **Post-deploy credential display** (`templates/deploy.html`) — The success page now shows actual username/password values from the deploy form instead of a generic message. Reads from deploy field metadata, filtering out internal DB passwords and secret keys. Falls back to `defaultCreds` for apps without typed deploy fields.
#### Fixed
- **Filebrowser "open" link on stacks page** (`web/handlers.go`) — Protected stacks like filebrowser have no `.felhom.yml` or `app.yaml`, so the subdomain lookup found nothing. Added `protectedStackSubdomains` fallback map for programmatically managed protected stacks (filebrowser → "files"). Now shows `files.<domain> ↗` link on both the stacks page and dashboard.
- **App catalog: appdata volume paths** (separate repo: `app-catalog-felhom.eu`) — 4 compose templates (nextcloud, immich, paperless-ngx, romm) used `${HDD_PATH}/appdata/` instead of `${HDD_PATH}/felhom-data/appdata/` as designed in the v0.26.0+ storage structure. Fixed all templates. Existing deployments need redeployment or manual volume path update.
- **Debug log viewer timestamps** (`web/logbuffer.go`, `templates/debug.html`) — The log viewer showed relative times like "-3586mp" (mp = seconds; negative due to a timezone bug: `time.Parse` assumed UTC but `log.LstdFlags` outputs local time). Now uses `time.ParseInLocation` with `time.Local`, and displays absolute `HH:MM:SS` timestamps.
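The log-timestamp bug above is easy to reproduce: `time.Parse` treats a zone-less timestamp as UTC, so on a UTC+1 host every parsed log line appears one hour in the future. A sketch using a fixed UTC+1 zone so the result does not depend on the machine's timezone (`parseLogTimestamp`, `skew`, and `exampleZone` are hypothetical names; the real fix passes `time.Local`):

```go
package main

import (
	"fmt"
	"time"
)

// stdLogLayout is the date and time format written by log.LstdFlags.
const stdLogLayout = "2006/01/02 15:04:05"

// exampleZone stands in for time.Local on a host running at UTC+1.
var exampleZone = time.FixedZone("UTC+1", 3600)

// parseLogTimestamp interprets s as wall-clock time in loc.
func parseLogTimestamp(s string, loc *time.Location) (time.Time, error) {
	return time.ParseInLocation(stdLogLayout, s, loc)
}

// skew returns how far a naive time.Parse (which assumes UTC) lands
// from the location-aware parse of the same string.
func skew(s string, loc *time.Location) (time.Duration, error) {
	naive, err := time.Parse(stdLogLayout, s)
	if err != nil {
		return 0, err
	}
	local, err := parseLogTimestamp(s, loc)
	if err != nil {
		return 0, err
	}
	return naive.Sub(local), nil
}

func main() {
	d, err := skew("2026/02/23 10:00:00", exampleZone)
	fmt.Println(d, err)
}
```

A one-hour skew into the future is consistent with the "-3586mp" reading quoted above: a line logged 14 seconds ago would show as 3586 seconds ahead of now.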
### v0.28.5 — Post-Deploy Info Card (2026-02-23)
#### Added
- **Post-deploy success page** (`web/templates/deploy.html`) — After a successful deploy, instead of auto-redirecting to the apps list, shows a rich info card with: direct app link ("Alkalmazás megnyitása ↗"), first steps from catalog metadata (with DOMAIN placeholders replaced), default credentials info, documentation link, and a link to the settings page where passwords can be revealed. Also shown for unhealthy/timeout states since apps may still be usable during initialization.
### v0.28.4 — Telemetry: Skip Stopped Apps (2026-02-23)
#### Fixed
- **Stopped apps no longer send zero-value telemetry to hub** (`report/telemetry.go`) — Previously, deployed-but-stopped apps were included in the telemetry report with all-zero memory/CPU values, which dragged down hub-side averages. Now `buildAppTelemetry` checks `isStackRunning()` and only includes apps in running, starting, unhealthy, or restarting states.
### v0.28.3 — Catch-All Page, Deploy Controls, Dashboard Open (2026-02-23)
#### Added
- **Catch-all page for stopped/undeployed apps** — When a user visits a stopped app's subdomain (e.g., `travel.demo-felhom.eu`), they now see a branded felhom page with the app name and status ("Az alkalmazás jelenleg le van állítva", i.e. "The app is currently stopped") instead of Traefik's raw 404. Implemented via a low-priority (1) Traefik catch-all router on the controller container + `CatchAllMiddleware` in `server.go` that intercepts non-controller hosts and renders standalone `catchall.html` without auth.
- **Start/Stop/Restart buttons on deploy settings page** — Deployed apps now show Indítás/Leállítás/Újraindítás buttons in the page header, plus a "Megnyitás ↗" link to the app's subdomain (visible when running). Previously the deploy page had no state controls.
- **"Megnyitás ↗" button on Vezérlőpult** — Running apps on the dashboard now show an open button that launches the app in a new tab. Uses the `Subdomains` map built from `app.yaml` SUBDOMAIN env with metadata fallback.
- **`findStackBySubdomain()`** helper in `server.go` — looks up stacks by subdomain, checking deployed `app.yaml` env first, then `.felhom.yml` metadata.
#### Changed
- **Subdomain links on Alkalmazások page** — Links now only shown for deployed apps (previously shown for all apps including non-deployed ones where the subdomain isn't final yet).
- **`docker-compose.yml`** — Added 6 catch-all Traefik router labels (`traefik.http.routers.catchall.*`) with `priority=1` and `certresolver=letsencrypt`.
### v0.28.2 — Async Deploy & AdventureLog Fix (2026-02-23)
#### Changed
- **Async deploy** — `DeployStack()` now runs `docker compose up -d` in a background goroutine instead of blocking the HTTP response. The deploy API returns immediately after validation + config save, so the UI switches to the progress panel instantly (previously waited 30-60s for image pulls). New `StateDeploying` container state shown while compose-up is in progress. On failure, the goroutine reverts both disk and in-memory state and stores the error in `DeployError` for the polling UI to display.
- **Deploy progress UI** — Polling now handles the `deploying` state ("Képek letöltése, konténerek indítása...") and `deploy_error` (shows error message with links to logs). Previous behavior only showed progress after compose-up completed.
#### Fixed
- **RestartStack uses `up -d` with env vars** — `RestartStack()` previously used bare `docker compose restart` which only sends SIGTERM+start without re-reading the compose file or injecting env vars from `app.yaml`. Now uses `docker compose up -d` with full env, matching `StartStack()` behavior. This ensures template changes (images, healthchecks) and env var updates are picked up on restart.
- **AdventureLog backend healthcheck** — Replaced `wget` (not available in v0.11.0 image) with `python urllib.request`. Also uses `127.0.0.1` instead of `localhost` to avoid IPv6 resolution issues.
- **AdventureLog frontend healthcheck** — Changed `localhost` to `127.0.0.1` to fix IPv6 resolution causing connection refused (Node.js only listens on IPv4).
- **AdventureLog SECRET_KEY** — Added `SECRET_KEY=${SECRET_KEY}` env var alongside `DJANGO_SECRET_KEY` for v0.11.0 compatibility (Django settings now reads `SECRET_KEY` directly).
### v0.28.1 — Telemetry Debug Section (2026-02-23)
#### Added
- **Telemetria teszt section on Debug page** — New collapsible section between "Hub & Kapcsolatok" and "Önfrissítés teszt". Click "Telemetria futtatása" to run the full telemetry collection pipeline on-demand without waiting for the 15-minute report cycle.
- **`GET /api/debug/telemetry`** — New debug endpoint in `handler_debug.go`. Invokes `GetTelemetryPreview` callback, returns per-app data: container list, memory (current/avg/peak), CPU avg, catalog limit, log error/warning counts, top issues, and overall latency. Response: `{latency_ms, app_count, total_errors, total_warnings, app_telemetry[]}`.
- **`GetTelemetryPreview` callback** added to `DebugCallbacks` struct. Wired in `main.go` debug-mode block: calls `report.BuildAppTelemetryForDebug(stackMgr, metricsStore, logger)`. Available regardless of hub configuration.
- **`report.BuildAppTelemetryForDebug()`** — Exported wrapper in `internal/report/telemetry.go` around the private `buildAppTelemetrySection()`. Allows debug endpoint access without exposing internal package details.
- **JS rendering** — `runTelemetryTest()` fetches the endpoint and shows a summary message. `renderTelemetryDetail()` builds a table with per-app rows (color-coded errors in red, warnings in yellow) and sub-rows for top issues. Includes a collapsible "Nyers JSON" section showing the exact payload that would go to the hub.
### v0.28.0 — App Telemetry & Analytics (2026-02-23)
#### Added
- **App telemetry in Hub reports** — `Report.AppTelemetry` (new field in `report/types.go`) carries per-stack memory/CPU metrics and log scan results to the Hub on every report push. Backward-compatible: old Hub versions silently ignore the new field.
- **`internal/metrics/telemetry.go`** — New `MetricsStore.GetContainerTelemetry(since)` method aggregates container memory (current/avg/peak) and CPU averages from the existing `container_metrics` SQLite table over the last 15 minutes.
- **`internal/metrics/logscanner.go`** — New `ScanContainerLogs(containerNames, since, logger)` function runs `docker logs --since=15m --tail=1000` on each non-protected deployed container. Detects errors/warnings by keyword matching, deduplicates via fingerprinting (strips timestamps, replaces 6+ digit numbers with `<N>`, hex with `<HEX>`, UUIDs with `<UUID>`). Returns `[]ContainerLogSummary` with counts and `RecentIssues` (top 10 per container).
- **`internal/report/telemetry.go`** — New `buildAppTelemetrySection()` and `buildAppTelemetry()` functions assemble per-stack `AppTelemetry` records by aggregating container-level metrics and log summaries. Only non-protected, deployed stacks are included.
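The fingerprinting step described for `ScanContainerLogs` can be sketched with a handful of regular expressions. The exact patterns are not given in this changelog, so the ones below are assumptions: an ISO-style leading timestamp, `0x`-prefixed hex, and canonical 8-4-4-4-12 UUIDs. The function names are hypothetical as well.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Assumed timestamp shape: "2026-02-23T10:15:00Z " at the line start.
	tsRe   = regexp.MustCompile(`^\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:?\d{2})?\s*`)
	uuidRe = regexp.MustCompile(`[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}`)
	hexRe  = regexp.MustCompile(`0x[0-9a-fA-F]+`)
	numRe  = regexp.MustCompile(`\d{6,}`)
)

// fingerprint normalizes a log line so that repeats of the same issue
// with different IDs collapse to one key. UUIDs and hex are replaced
// before long numbers so their digits are not rewritten first.
func fingerprint(line string) string {
	line = tsRe.ReplaceAllString(line, "")
	line = uuidRe.ReplaceAllString(line, "<UUID>")
	line = hexRe.ReplaceAllString(line, "<HEX>")
	line = numRe.ReplaceAllString(line, "<N>")
	return line
}

// countIssues groups lines by fingerprint and counts occurrences.
func countIssues(lines []string) map[string]int {
	counts := make(map[string]int)
	for _, l := range lines {
		counts[fingerprint(l)]++
	}
	return counts
}

func main() {
	lines := []string{
		"2026-02-23T10:15:00Z ERROR job 1000001 failed",
		"2026-02-23T10:15:07Z ERROR job 1000002 failed",
		"2026-02-23T10:16:00Z WARN disk 85% full",
	}
	for fp, n := range countIssues(lines) {
		fmt.Printf("%dx %s\n", n, fp)
	}
}
```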
#### Changed
- **`internal/report/builder.go`** — `BuildReport()` now calls `buildAppTelemetrySection()` after the stacks section, populating `r.AppTelemetry`.
- **`internal/report/types.go`** — Added `AppTelemetry []AppTelemetry` field to `Report` struct. Added new `AppTelemetry` type with fields: app_name, display_name, containers, memory metrics, catalog estimate/limit, log error/warning counts, and top issues.
### v0.27.3 — Real System Memory Everywhere (2026-02-23)
#### Changed
- **Deploy page uses real system memory** — Memory bar now shows actual `/proc/meminfo` usage instead of declared `mem_request` sums. Labels changed from "Jelenlegi foglalás" (current allocation) to "Jelenlegi használat" (current usage). `system.GetMemoryMB()` provides real-time total and used memory.
- **Pre-start memory check uses real memory** — `actionStack("start")` in `router.go` and `DeployStack()` in `deploy.go` now check real used memory (`usedMB + newReqMB > usableMB`) instead of declared committed sums. `CommittedMemory()` kept only for soft overcommit warnings.
#### Added
- **`system.GetMemoryMB()` helper** — Lightweight function in `internal/system/info_linux.go` that returns real total and used memory from `/proc/meminfo` without the overhead of full `GetInfo()` (no disk/CPU/temp). Stub in `info_other.go` for non-Linux.
- **Monitoring page memory distribution bar** — New stacked bar on `/monitoring` showing per-container memory usage (colored segments), OS/system overhead (gray), and free memory. Built dynamically from container summary data + real-time `/api/system/info`. Color-coded legend with per-app labels.
### v0.27.2 — Comprehensive Fixes and New Labels (2026-02-23)
#### Fixed
- **Deploy error popups now copyable** — Replaced all native `alert()` calls with a custom modal (`showAlert()` in layout.html) using a `<pre>` block with `user-select:text`. Error messages can now be selected and copied. Applied across deploy.html and layout.html.
- **Manual Tier2 backup now reports to Hub** — Added `OnCrossDriveComplete` callback to `Router` (`internal/api/router.go`). Both `triggerCrossBackup` (single-app) and `triggerAllCrossBackups` (run-all) now call `pushInfraBackup()` + `writeLocalInfraBackup()` after completion, matching the automatic scheduled path.
- **Memory bar excludes stopped apps** — `CommittedMemory()` in `internal/stacks/manager.go` now skips apps with `StateStopped` or `StateExited`. Only running/starting/unhealthy apps count toward committed memory.
- **Pre-start memory check** — `actionStack("start")` in `internal/api/router.go` now validates available memory before starting a stopped app. Returns 409 Conflict with a descriptive Hungarian error if insufficient.
#### Added
- **`hungarian_ui` metadata field** — New `HungarianUI bool` field in `ResourceHints` (`internal/stacks/metadata.go`). Shows "Magyar felület" green badge on deploy, stacks, and app info pages when `hungarian_ui: true` in `.felhom.yml`.
- **USB badge on storage cards** — Settings page storage cards now show an orange "USB" badge next to Aktív/Alapértelmezett when the drive is USB-attached (using existing `IsUSB` sysfs detection).
- **`StackMemoryMB()` helper** — New method on `Manager` to get a specific stack's memory request.
#### App Catalog (app-catalog-felhom.eu)
- **AdventureLog** — Fixed image tags from `v0.12.0` (non-existent) to `v0.11.0` for both backend and frontend.
### v0.27.1 — Fix FileBrowser Mount Sync (2026-02-22)
#### Fixed
- **`internal/web/handlers.go`** — `SyncFileBrowserMounts()` was reading the domain from a `.env` file that doesn't exist in the filebrowser stack directory (domain is baked into the compose labels by `docker-setup.sh`). It always logged `[WARN] Cannot read DOMAIN from FileBrowser .env — skipping mount sync` and returned early, so storage paths were never synced to FileBrowser's config.yaml or docker-compose.yml. Fixed by using `s.cfg.Customer.Domain` directly from the controller config.
### v0.27.0 — User-Configurable App Subdomains (2026-02-22)
#### Added
- **User-configurable subdomains**: Users can now customize the subdomain (e.g., `wiki`, `cloud`, `my-notes`) for each app during deployment, instead of using a fixed value. The deploy page shows an editable text input with the default subdomain pre-filled and the base domain as a suffix (e.g., `[wiki] .demo-felhom.eu`).
- **New deploy field type `"subdomain"`** — `internal/stacks/metadata.go`, `deploy.go`: A new field type that is user-editable with a default value, validated, and locked after deployment. Changing the subdomain requires removing the app (clean install) and redeploying.
- **Subdomain validation** — `internal/stacks/deploy.go`: Three-layer validation: DNS-safe format (lowercase alphanumeric + hyphens, max 63 chars), reserved name blocklist (`felhom`, `files`, `traefik`, `api`, `www`, `mail`, `admin`, etc.), and uniqueness check across all deployed stacks.
- **Backward compatibility** — `internal/stacks/deploy.go`: `InjectMissingFields()` auto-fills `SUBDOMAIN` from the `.felhom.yml` default for existing deployed apps when templates are synced, so no manual intervention is needed.
- **`internal/web/handlers.go`** — `stacksHandler()` builds an effective subdomain lookup map (stored env → metadata fallback). `appDetailHandler()` passes `EffectiveSubdomain` to templates.
- **`internal/web/templates/deploy.html`** — New `.subdomain-input-group` widget with inline `.domain` suffix. Client-side validation enforces DNS-safe format with real-time lowercasing.
- **`internal/web/templates/stacks.html`**, **`app_info.html`** — Subdomain links now read from stored `app.yaml` env (via lookup map) instead of hardcoded metadata, showing the user's actual chosen subdomain.
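
The three validation layers described above (format, reserved names, uniqueness) can be sketched roughly as follows. This is an illustration, not the code in `internal/stacks/deploy.go`: the function name `validateSubdomain`, the `inUse` set, and the error values are hypothetical; the blocklist contains only the names quoted in this entry; and rejecting a leading or trailing hyphen is an assumption taken from general DNS label rules.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// dnsLabel matches 1-63 lowercase alphanumerics or hyphens. Rejecting a
// leading or trailing hyphen is an assumption borrowed from DNS label rules.
var dnsLabel = regexp.MustCompile(`^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$`)

// reserved holds only the names the changelog lists; the real blocklist is longer.
var reserved = map[string]bool{
	"felhom": true, "files": true, "traefik": true, "api": true,
	"www": true, "mail": true, "admin": true,
}

var (
	errFormat   = errors.New("subdomain is not DNS-safe")
	errReserved = errors.New("subdomain is reserved")
	errInUse    = errors.New("subdomain is already in use")
)

// validateSubdomain applies the three layers in order: format, blocklist,
// uniqueness. inUse is a hypothetical set of subdomains of deployed stacks.
func validateSubdomain(name string, inUse map[string]bool) error {
	switch {
	case !dnsLabel.MatchString(name):
		return errFormat
	case reserved[name]:
		return errReserved
	case inUse[name]:
		return errInUse
	}
	return nil
}

func main() {
	deployed := map[string]bool{"wiki": true}
	for _, s := range []string{"my-notes", "My-Notes", "admin", "wiki"} {
		fmt.Printf("%-10s %v\n", s, validateSubdomain(s, deployed))
	}
}
```

Checking the cheap, local rules first means the uniqueness lookup only runs for names that could be accepted at all.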
#### Changed
- **`internal/stacks/deploy.go`** — `PreviewDeployValues()` domain case simplified: shows just the base domain now (subdomain is a separate field).
- **`internal/web/handlers.go`** — Deploy page domain auto-field no longer prepends `meta.Subdomain + "."`. Passes `DeployedFieldValues` for rendering stored subdomain on settings page.
#### App Catalog (app-catalog-felhom.eu)
- All 51 template `docker-compose.yml` files updated: hardcoded `{subdomain}.${DOMAIN}` replaced with `${SUBDOMAIN}.${DOMAIN}` in Traefik labels, app env vars (APP_URL, trusted domains, webhook URLs, etc.), and comments.
- All 51 `.felhom.yml` files updated: added `SUBDOMAIN` deploy field with `type: subdomain` and `default:` matching the existing `subdomain:` metadata value.
### v0.26.2 — Show Full App URL on Deploy Page (2026-02-22)
#### Fixed
- **`internal/stacks/deploy.go`** — `PreviewDeployValues()` now shows the full reachable URL (`subdomain.base_domain`) for domain-type fields instead of just the base domain. Informational only — stored env var remains the base domain.
- **`internal/web/handlers.go`** — Same fix applied to the already-deployed settings page: domain field displays `subdomain.base_domain` matching what the app card shows.
### v0.26.1 — Show Auto-Generated Values on Deploy Page (2026-02-22)
#### Changed
- **`internal/stacks/deploy.go`** — Added `PreviewDeployValues()` method: pre-generates domain and secret field values when the deploy page is loaded, so the user can see (and note down) exact values before deploying. Updated `DeployStack()` to accept pre-generated secret values from the form instead of always regenerating.
- **`internal/web/handlers.go`** — `deployHandler` now calls `PreviewDeployValues()` for non-deployed apps and populates `AutoFieldValues` (previously empty for pre-deploy).
- **`internal/web/templates/deploy.html`** — "Automatikusan generált értékek" section now shows actual values on the pre-deploy page too: domain as a readonly text input, secrets as readonly password inputs with a "Megjelenítés" reveal button. Updated section description to inform the user to note down passwords. Pre-generated secret values are submitted as hidden inputs so the same values shown to the user are saved to `app.yaml`.
### scripts — Hub Mode + FileBrowser Controller-Managed Volumes (2026-02-22)
#### `scripts/docker-setup.sh` — v6.0.0
- **Hub mode** (`--hub-customer` / `--hub-password`): downloads `controller.yaml` from Hub API early in setup, extracts `domain`, `email`, `cf_api_token`, `cf_tunnel_token` and auto-populates all infrastructure settings. Single one-liner deploys fully configured Traefik + TLS + Cloudflare Tunnel with no additional flags needed. CLI flags always override hub values.
- **`yaml_get()` helper**: strips leading whitespace before key comparison — required because Go's `yaml.v3` uses 4-space indentation.
- **`apply_hub_config()`**: called before `print_banner` in `main()` so hub-sourced values are reflected in the plan display.
- **FileBrowser initial install**: removed drive auto-discovery from `install_filebrowser()`. FileBrowser is now installed with no drive volumes and a minimal `config.yaml` with `/srv` fallback. Drive volumes are managed entirely by the controller (`SyncFileBrowserMounts()`) after storage is registered via the dashboard.
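- **Bug fix**: `((found_mounts++))` → `found_mounts=$(( found_mounts + 1 ))`. Under `set -euo pipefail`, the post-increment form aborts the script when the variable is 0, because `(( ))` returns exit code 1 when the expression evaluates to 0. Same fix applied to `step_num` in `install_filebrowser()`.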
#### `scripts/felhom-wipe.sh`
- **`cleanup_scan_dir()`**: removes `/mnt/.felhom-scan/` (ephemeral DR scan directory) — called from `full` level onwards.
- **`cleanup_raw_mounts()`**: removes raw helper mount infrastructure (`/mnt/.felhom-raw/`) at `nuclear` level: unmounts bind mounts first, then raw mounts, strips fstab entries, removes empty directories. Physical drive data untouched.
- **Bug fix**: `do_soft_wipe()` used `[ -f "$f" ] && rm -f "$f" && info "..."` — with `set -euo pipefail`, when a state file doesn't exist `[ -f ]` returns 1, the whole `&&` chain returns 1, and `set -e` exits the script. Nuclear wipe was silently stopping after removing only the first two state files that existed. Fixed with `if [ -f "$f" ]; then ...; fi`.
#### `scripts/README.md`
- Hub mode quick start simplified to one-liner
- Updated installation steps table: step 7 reflects controller-managed FileBrowser volumes
- Added "Raw helper mounts" section explaining two-level mount architecture
- Updated wipe levels table for `full` (scan dir) and `nuclear` (raw mounts + scan dir)
### v0.26.0 — Storage Namespace `felhom-data/` + Test Node Wipe Script (2026-02-22)
All felhom-managed data on external drives now lives under a `felhom-data/` subdirectory, cleanly separating controller-managed data from user files. Plus a multi-level wipe script for repeatable test node cleanup.
**Key design principle:** `HDD_PATH` env var stays as the mount point (e.g., `/mnt/hdd_1`). The `felhom-data` segment is embedded in path helpers and compose templates — not in `HDD_PATH`.
#### Changed
- **`internal/backup/paths.go`** — Added `FelhomDataDir = "felhom-data"` constant. Updated 8 path functions to insert `felhom-data` between the drive root and data subdirectory:
- `PrimaryBackupPath` → `<drive>/felhom-data/backups/primary`
- `PrimaryResticRepoPath` → `<drive>/felhom-data/backups/primary/restic`
- `AppDBDumpPath` → `<drive>/felhom-data/backups/primary/<stack>/db-dumps`
- `SecondaryBackupPath` → `<drive>/felhom-data/backups/secondary`
- `AppSecondaryRsyncPath` → `<drive>/felhom-data/backups/secondary/<stack>/rsync`
- `SecondaryResticRepoPath` → `<drive>/felhom-data/backups/secondary/restic`
- `SecondaryInfraPath` → `<drive>/felhom-data/backups/secondary/_infra`
- `AppDataDir` → `<drive>/felhom-data/appdata/<stack>`
- `InfraBackupDir` **unchanged** — stays at drive root for DR scanner
- **`internal/stacks/delete.go`** — Added local `felhomDataDir = "felhom-data"` constant (cannot import `backup` due to architectural boundary). Updated `ProtectedHDDPaths()` to protect `<drive>/felhom-data`, `<drive>/felhom-data/appdata`, `<drive>/felhom-data/backups`. Fixed hardcoded paths in `GetStackBackupData()`.
- **`internal/storage/migrate_drive.go`** — Added `backup` package import. Fixed 4 issues:
- Conflict check: uses `backup.AppDataDir()` instead of hardcoded `appdata/`
- Verify step: uses `backup.AppDataDir()` instead of hardcoded `appdata/`
- rsync excludes: updated from `backups/primary/restic/` to `felhom-data/backups/primary/restic/`
- Size estimation: now scans inside `felhom-data/` namespace, skipping restic repos correctly
- **`internal/storage/migrate.go`** — Added `backup` package import. Post-migration DB dump copy now uses `backup.AppDBDumpPath()` instead of hardcoded paths.
- **`internal/web/handlers.go`** — Fixed legacy `"storage"` path in storage app detail size calculation (was dead code — path never existed); now uses `backup.AppDataDir()`.
- **`internal/storage/format_linux.go`** — Format wizard creates `felhom-data/` subdirectory instead of legacy `storage/`.
- **`internal/storage/attach_linux.go`** — Attach wizard creates `felhom-data/` subdirectory instead of legacy `storage/`.
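
As a rough illustration of the design principle above (the namespace lives in the path helpers, not in `HDD_PATH`), three of the helpers from `internal/backup/paths.go` might look like this. The function names and resulting paths come from this entry; the parameter lists are assumptions.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// FelhomDataDir is the namespace inserted between the drive root and the
// controller-managed subdirectories.
const FelhomDataDir = "felhom-data"

// The function names come from the changelog; the parameter lists are assumed.
func PrimaryBackupPath(drive string) string {
	return filepath.Join(drive, FelhomDataDir, "backups", "primary")
}

func AppDBDumpPath(drive, stack string) string {
	return filepath.Join(PrimaryBackupPath(drive), stack, "db-dumps")
}

func AppDataDir(drive, stack string) string {
	return filepath.Join(drive, FelhomDataDir, "appdata", stack)
}

func main() {
	drive := "/mnt/hdd_1" // HDD_PATH stays the bare mount point
	fmt.Println(PrimaryBackupPath(drive))
	fmt.Println(AppDBDumpPath(drive, "wiki"))
	fmt.Println(AppDataDir(drive, "wiki"))
}
```

Because callers only ever go through helpers like these, moving the namespace again would mean changing one constant rather than every call site.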
#### Added
- **`scripts/felhom-wipe.sh`** — Test node cleanup script with 4 wipe levels:
- `soft` — Removes controller state files (settings.json, metrics.db, session/setup/update/snapshot state)
- `controller` — Soft + removes all app containers, volumes, and stack directories (skips protected stacks by default)
- `full` — `controller`-level cleanup + removes `felhom-data/` on all storage drives (also removes old-style `appdata/` and `backups/` for migration compatibility); infra containers preserved, controller restarted after cleanup
- `nuclear` — Full + removes controller.yaml, all infrastructure containers (controller, traefik, cloudflared, portainer), DR markers, and runs `docker system prune -af --volumes`
- Auto-detects paths from `controller.yaml` and `settings.json`
- Dry-run by default; requires `--yes` to execute
- Interactive confirmation prompt when executing with `--yes`
#### Notes
- **Migration**: Pre-v0.26.0 restic snapshots reference old paths (without `felhom-data/`). Existing installations need data migration before upgrading.
- **App catalog**: Compose templates need separate update: `${HDD_PATH}/appdata/` → `${HDD_PATH}/felhom-data/appdata/` (tracked as separate task).
- All backup, crossdrive, and restore logic automatically picks up new paths via `paths.go` helpers — no changes needed in `backup.go`, `crossdrive.go`, or `restore.go`.
---
### v0.25.0 — Debug Page: Operator Testing & Diagnostics Dashboard (2026-02-21)
**Full debug dashboard with 8 sections for testing all controller subsystems in debug mode.**
Only available when `logging.level: "debug"` — sidebar link, page, and all `/api/debug/*` endpoints return 404 otherwise.
#### New files
- `internal/web/logbuffer.go` — Ring buffer (1000 entries) implementing `io.Writer` for capturing log output. Parses Go standard log format (with/without `Lshortfile`), extracts level/source/timestamp. Supports filtered retrieval by level and timestamp.
- `internal/web/handler_debug.go` — Debug page handler + 20 API endpoint handlers organized in 8 sections. `DebugCallbacks` struct (6 fields) for wiring main.go closures.
- `internal/web/templates/debug.html` — Full debug dashboard template with 8 collapsible sections, complete JS framework (lazy-load, polling, action buttons, log viewer with filter/auto-refresh).
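
The idea behind `logbuffer.go` (a fixed-size ring buffer that satisfies `io.Writer` so it can sit next to stdout) can be sketched as below. This is a simplified illustration, not the actual file: it stores raw lines only and skips the level, source, and timestamp parsing, and the `NewLogBuffer` constructor and `Entries` method are hypothetical names. It relies on the documented behaviour that `log.Logger` makes one `Write` call per logging operation.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"os"
	"sync"
)

// LogBuffer keeps the most recent log lines in a fixed-size ring.
type LogBuffer struct {
	mu      sync.Mutex
	entries []string
	next    int  // index the next entry is written to
	full    bool // true once the ring has wrapped at least once
}

// NewLogBuffer is a hypothetical constructor; capacity below 1 is raised to 1.
func NewLogBuffer(capacity int) *LogBuffer {
	if capacity < 1 {
		capacity = 1
	}
	return &LogBuffer{entries: make([]string, capacity)}
}

// Write implements io.Writer. Each call is stored as one entry, overwriting
// the oldest entry once the ring is full.
func (b *LogBuffer) Write(p []byte) (int, error) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.entries[b.next] = string(p)
	b.next = (b.next + 1) % len(b.entries)
	if b.next == 0 {
		b.full = true
	}
	return len(p), nil
}

// Entries returns a copy of the retained entries, oldest first.
func (b *LogBuffer) Entries() []string {
	b.mu.Lock()
	defer b.mu.Unlock()
	if !b.full {
		return append([]string(nil), b.entries[:b.next]...)
	}
	out := append([]string(nil), b.entries[b.next:]...)
	return append(out, b.entries[:b.next]...)
}

func main() {
	buf := NewLogBuffer(3)
	// Same wiring idea as a MultiWriter over stdout and the buffer.
	logger := log.New(io.MultiWriter(os.Stdout, buf), "", 0)
	for i := 1; i <= 5; i++ {
		logger.Printf("[INFO] message %d", i)
	}
	fmt.Println(len(buf.Entries()), "entries retained")
}
```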
#### Debug page sections
1. **Rendszer diagnosztika** — Diagnostic dump (migrated from `api/router.go`) with structured UI rendering: controller info, storage paths, deployed stacks, scheduler jobs, alerts. JSON download button.
2. **Értesítés teszt** — Send test events with configurable type/severity, view event history ring buffer (last 50 events, newest first).
3. **Mentés teszt** — Trigger individual backup phases: full backup, DB dump only, cross-drive only, restic integrity check, infrastructure backup.
4. **Tárhely teszt** — Storage watchdog status table with per-path probe state. Simulate disconnect (stops apps, marks disconnected, skips unmount) and reconnect (cleans locks, clears state). 5s auto-refresh.
5. **Hub & Kapcsolatok** — Hub report push, infra backup push, Hub/Gitea connectivity tests with latency, preference sync.
6. **Önfrissítés teszt** — Version check + dry-run (shows current/new image lines, compose writability, backup status).
7. **DR / Telepítő varázsló** — Infra backup status per drive (files, timestamps). "RESET" confirmation + infra backup pre-check before triggering setup mode via marker file.
8. **Naplóviewer** — In-memory log viewer with level filter (DEBUG/INFO/WARN/ERROR), 2s auto-refresh, color-coded entries, clear display.
#### Module additions
- `notify/notifier.go`: `PushTestEventSync()` (synchronous, returns Hub status), `GetEventHistory()` (ring buffer), `recordHistory()` for debug page.
- `backup/crossdrive.go`: `RunAllConfigured()` — runs all enabled apps ignoring schedule filter.
- `selfupdate/updater.go`: `DryRun()` — checks update availability, compose writability, backup status without performing changes.
- `monitor/watchdog.go`: `SimulateDisconnect()` / `SimulateReconnect()` with `simulatedPaths` map, `GetDebugStatus()` for per-path probe state. Watchdog `Check()` skips simulated paths.
- `setup/setup.go`: `NeedsSetup()` now checks `.needs-setup` marker file. `ClearSetupMarker()` for cleanup.
#### Routing changes
- **Mux carve-out**: `/api/debug/` routes to web server (same pattern as `/api/storage/`), with auth + CSRF.
- **Removed** `SetDebugDumpDeps()` from `api/router.go` and the `/api/debug/dump` route — dump handler migrated to `handler_debug.go` using Server's existing fields.
#### Infrastructure
- `setupLogger()` now returns `(*log.Logger, *web.LogBuffer)`. In debug mode, creates `io.MultiWriter(os.Stdout, logBuffer)` so all log output is captured from the start.
- Debug CSS: ~170 lines of styles for sections, result badges, log viewer, confirm input, danger button, spinner.
### v0.24.0 — Pre-Testing Observability (2026-02-21)
**Three features for pre-testing diagnostics: verbose debug logging, diagnostic dump endpoint, and startup self-test.**
#### Feature 1: Debug logging across all modules
All `[DEBUG]` log lines are gated behind `logging.level: "debug"` — zero overhead at `info` level.
- **New** `internal/util/strings.go`: shared `TruncateStr()` for safely truncating command output in logs.
- **Backup** (`backup.go`, `dbdump.go`, `crossdrive.go`, `restore.go`, `local_infra.go`): added `isDebug()` method and per-operation debug logging. DB dump logs container discovery, per-dump command details (passwords masked as `***`), validation results. Cross-drive logs source/dest paths, rsync results, auto-enable decisions. Restore logs step-by-step progress.
- **Storage** (`scan_linux.go`, `format_linux.go`, `attach_linux.go`, `migrate.go`): added `Logger`/`Debug` fields to request structs. Logs raw lsblk output (truncated), per-disk classification, pipeline steps for format/attach, rsync progress for migrate. Updated `*_other.go` stubs.
- **Sync** (`sync.go`): logs masked clone URLs, per-file hash comparison, post-sync hook triggers.
- **Self-update** (`updater.go`): logs registry API calls, tag parsing, version comparison, compose file edits.
- **Monitor** (`watchdog.go`): smart logging — periodic 60-probe summaries (~5 min), immediate log on unexpected failures, reconnect attempt details. (`healthcheck.go`): logs raw check values and per-check results.
- **Notify** (`notifier.go`): logs event push URL/type/response, preference sync details.
- **Report** (`pusher.go`, `builder.go`): logs payload sizes, section summaries, push responses.
- **Assets** (`syncer.go`): logs manifest fetch, per-file hash comparison, download/removal actions.
- **Setup** (`scanner.go`, `handlers.go`): logs drive scan details, hub recovery/config write operations.
#### Feature 2: Diagnostic dump endpoint (`GET /api/debug/dump`)
Returns a comprehensive JSON snapshot of all controller state. Only available when `logging.level: "debug"` — returns 404 otherwise.
- Sections: `controller` (version, uptime, config hash, PID), `storage` (per-path usage), `stacks` (deployed/running/stopped counts + list), `backup` (status, repo stats), `hub` (push status, consecutive failures), `scheduler` (all jobs with last_run/running/errors), `health` (fresh check), `notifications`, `self_update`, `alerts`.
- API router expanded with `SetDebugDumpDeps()` setter for scheduler, hub pusher, alert manager, version, and start time.
#### Feature 3: Startup self-test
- **New** `internal/selftest/selftest.go`: runs 9 diagnostic checks on boot with 5s timeout each.
- Checks: Docker socket, stacks directory, data directory (write test), system data path (mount point), storage paths (connected vs disconnected), git catalog (.felhom.yml files), Hub connectivity (/healthz), restic repos, metrics DB.
- Results logged in a clear block: `[PASS]`/`[WARN]`/`[FAIL]` per check, summary at end.
- Self-test summary (pass/warn/fail counts) sent to Hub via `NotifyControllerStarted` details map.
- Never blocks startup — purely diagnostic.
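
A minimal sketch of how a per-check timeout and the `[PASS]`/`[WARN]`/`[FAIL]` summary could fit together is shown below. It is not the code in `internal/selftest/selftest.go`: the `check` struct, the `critical` flag that decides between WARN and FAIL, and the demo checks are all assumptions made for illustration, and the timeout is shortened from the 5s mentioned above.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// check is a hypothetical shape for one diagnostic. Whether a failure counts
// as FAIL or WARN is modelled with a critical flag, which is an assumption.
type check struct {
	name     string
	critical bool
	run      func(ctx context.Context) error
}

// runCheck runs one check under its own timeout and returns PASS, WARN or FAIL.
func runCheck(c check, timeout time.Duration) string {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	done := make(chan error, 1) // buffered so a late check cannot block forever
	go func() { done <- c.run(ctx) }()

	var err error
	select {
	case err = <-done:
	case <-ctx.Done():
		err = ctx.Err()
	}
	switch {
	case err == nil:
		return "PASS"
	case c.critical:
		return "FAIL"
	default:
		return "WARN"
	}
}

// runAll logs one line per check and returns the pass/warn/fail counts.
func runAll(checks []check, timeout time.Duration) map[string]int {
	summary := map[string]int{}
	for _, c := range checks {
		status := runCheck(c, timeout)
		fmt.Printf("[%s] %s\n", status, c.name)
		summary[status]++
	}
	return summary
}

// demoTimeout is shortened from the 5s in the changelog so the demo ends quickly.
const demoTimeout = 50 * time.Millisecond

// demoChecks returns one passing, one failing non-critical, and one hanging check.
func demoChecks() []check {
	return []check{
		{"data directory", true, func(context.Context) error { return nil }},
		{"hub connectivity", false, func(context.Context) error { return errors.New("unreachable") }},
		{"docker socket", true, func(ctx context.Context) error { <-ctx.Done(); return ctx.Err() }},
	}
}

func main() {
	fmt.Println(runAll(demoChecks(), demoTimeout))
}
```

Since `runAll` only returns counts and never an error, the caller can log or forward the summary and continue booting, which matches the "never blocks startup" behaviour.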
#### Constructor/signature changes
- `notify.New()`: added `debug bool` param. `NotifyControllerStarted()`: added `details map[string]interface{}` param.
- `report.NewPusher()`: added `debug bool` param. `BuildReport()`: added `logger *log.Logger` param.
- `monitor.RunHealthCheck()`: added `logger *log.Logger` param (5 call sites in main.go).
- `selfupdate.NewUpdater()`: added `debug bool` param.
- `assets.New()`: added `debug bool` param.
- `backup.NewCrossDriveRunner()`: added `debug bool` param. `WriteLocalInfraBackup()`: added `debug bool` param.
- `backup.DiscoverDatabases()`, `DumpOne()`: added `debug bool` param.
- `storage.ScanDisks()`: added `logger, debug` params. `FormatRequest`, `AttachRequest`, `MigrateRequest`: added `Logger`/`Debug` fields.
- `setup.ScanDrivesForInfraBackups()`: added `debug bool` param.
### v0.23.0 — CSRF Protection (2026-02-21)
**CSRF (Cross-Site Request Forgery) protection on all browser-facing POST endpoints — controller and hub.**
**Controller changes:**
- New `internal/web/csrf.go`: `CsrfProtect` HTTP middleware validates CSRF tokens on all state-mutating requests (POST/DELETE/PATCH).
- Reads token from `_csrf` form field or `X-CSRF-Token` request header.
- Exempt requests: those carrying `Authorization: Bearer` (selfupdate, config/apply hub→controller calls). Browsers cannot auto-send Bearer headers, so there is no CSRF risk.
- Auth-disabled mode (no password set): CSRF check is skipped entirely.
- On rejection: JSON error for `/api/` paths, HTTP 403 text for page routes.
- `internal/web/auth.go`: `session` struct gains a `csrfToken string` field. `createSession()` generates a second 32-byte random CSRF token alongside the session token. New `csrfTokenForSession(sessionToken)` method returns the CSRF token for a given session.
- `internal/web/server.go`: New `executeTemplate(w, r, name, data)` wrapper auto-injects `CSRFField` (`template.HTML` hidden input) and `CSRFToken` (raw string) into every page render data map.
- `cmd/controller/main.go`: All route registrations wrapped with `webServer.CsrfProtect(...)` middleware. Version bumped to `v0.23.0`.
- All handlers (`handlers.go`, `storage_handlers.go`, `handler_restore.go`): Switched from `s.render(w, ...)` to `s.executeTemplate(w, r, ...)`.
- All templates updated:
- `layout.html`: Added `<meta name="csrf-token">` and inline `csrfHeaders()` JS helper (returns `{'X-CSRF-Token': ...}`) in `<head>` (before page-specific scripts). Updated 4 fetch POST/DELETE calls.
- `settings.html`: Added `{{$.CSRFField}}` to 5 forms inside `{{range .StoragePaths}}` (must use `$` for outer scope inside range). Added `{{.CSRFField}}` to 3 page-level forms. Inline-label form uses `document.querySelector('meta[name="csrf-token"]').content`. Updated 5 fetch calls.
- `deploy.html`: Added `{{.CSRFField}}` to cross-backup form. Updated 3 fetch calls.
- `backups.html`: Updated 3 fetch calls. Dynamically-created restore form injects `_csrf` from meta tag.
- `storage_init.html`, `storage_attach.html`, `migrate.html`, `migrate_drive.html`, `app_info.html`, `restore.html`: All fetch calls updated.
- `storage_attach.html`: Replaced `navigator.sendBeacon()` with `fetch(..., {keepalive: true})` because `sendBeacon` cannot send custom headers, so it cannot carry the CSRF token.
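
The controller-side middleware behaviour listed above can be approximated with the sketch below. It is not the code in `internal/web/csrf.go`: the `csrfProtect` signature, the `tokenFor` lookup, and the `tryRequest` helper are hypothetical, the constant-time comparison is an assumption, and the auth-disabled bypass and the JSON error for `/api/` paths are left out for brevity.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// csrfProtect rejects state-mutating requests without a valid token.
// tokenFor is a hypothetical lookup of the token bound to the caller's session.
func csrfProtect(tokenFor func(*http.Request) string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		mutating := r.Method == http.MethodPost ||
			r.Method == http.MethodDelete ||
			r.Method == http.MethodPatch
		// Bearer-authenticated calls are exempt: browsers never attach them implicitly.
		bearer := strings.HasPrefix(r.Header.Get("Authorization"), "Bearer ")
		if !mutating || bearer {
			next.ServeHTTP(w, r)
			return
		}
		got := r.Header.Get("X-CSRF-Token")
		if got == "" {
			got = r.FormValue("_csrf")
		}
		want := tokenFor(r)
		if want == "" || subtle.ConstantTimeCompare([]byte(got), []byte(want)) != 1 {
			http.Error(w, "invalid CSRF token", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// tryRequest sends one request through the middleware and returns the status code.
func tryRequest(method, csrfHeader, authHeader string) int {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	h := csrfProtect(func(*http.Request) string { return "secret-token" }, ok)

	req := httptest.NewRequest(method, "/api/stacks", nil)
	if csrfHeader != "" {
		req.Header.Set("X-CSRF-Token", csrfHeader)
	}
	if authHeader != "" {
		req.Header.Set("Authorization", authHeader)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println("POST without token:", tryRequest(http.MethodPost, "", ""))
	fmt.Println("POST with token:   ", tryRequest(http.MethodPost, "secret-token", ""))
	fmt.Println("POST with Bearer:  ", tryRequest(http.MethodPost, "", "Bearer hub-key"))
	fmt.Println("GET without token: ", tryRequest(http.MethodGet, "", ""))
}
```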
**Hub changes (v0.3.8):**
- `internal/web/server.go`: Replaced insecure literal `hub_session=authenticated` cookie with proper server-side session map.
- New `hubSession` struct with `csrfToken string` and `expiresAt time.Time`.
- `sessions map[string]*hubSession` + `sessionsMu sync.RWMutex` on `Server` struct.
- `handleLogin`: Generates cryptographically random 64-char hex session token + 64-char hex CSRF token. Cookie gains `SameSite=Lax` and `Secure` (when TLS) attributes. Session expires after 7 days.
- `RequireAuth`: Validates session token against map (constant-time compare), redirects to `/login` on failure.
- `CleanupSessions(ctx)`: Goroutine that purges expired sessions every hour.
- CSRF validation block at top of `ServeHTTP`: checks `X-CSRF-Token` header or `_csrf` form field on POST/DELETE/PATCH. Skips when no session cookie (Basic Auth / API path).
- `csrfToken(r)`, `csrfField(r)` helpers for template data injection.
- `internal/web/configs.go`: Added `html/template` import. All template render calls pass `CSRFField template.HTML` and/or `CSRFToken string`. `renderConfigForm` gains `r *http.Request` parameter.
- Templates updated:
- `config_form.html`: Added `{{.CSRFField}}` inside the `<form>`.
- `customer_unified.html`: Added `<meta name="csrf-token">` + inline `csrfHeaders()` in `<head>`. Added `{{.CSRFField}}` to all 5 POST forms (unblock, block, delete config, create-config, regen-password). Updated 3 JS fetch POST calls (trigger-update, push-config, pull-config).
- `cmd/hub/main.go`: Started `go webServer.CleanupSessions(ctx)` goroutine.
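
For illustration, the Hub session handling described above (random 64-character hex tokens, a 7-day expiry, and periodic purging) could be sketched like this. The `hubSession` fields are taken from this entry; the `sessionStore` type, its methods, and the `demo` helper are hypothetical, and the hourly goroutine is reduced to a `purgeExpired` call that such a goroutine could invoke from a ticker.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"sync"
	"time"
)

const sessionTTL = 7 * 24 * time.Hour

type hubSession struct {
	csrfToken string
	expiresAt time.Time
}

// sessionStore mirrors the sessions map plus RWMutex described above; the
// type name and its methods are hypothetical.
type sessionStore struct {
	mu       sync.RWMutex
	sessions map[string]*hubSession
}

// randomHex returns 32 random bytes as a 64-character hex string.
func randomHex() (string, error) {
	b := make([]byte, 32)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// create registers a new session and returns its session token.
func (s *sessionStore) create(now time.Time) (string, error) {
	token, err := randomHex()
	if err != nil {
		return "", err
	}
	csrf, err := randomHex()
	if err != nil {
		return "", err
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	s.sessions[token] = &hubSession{csrfToken: csrf, expiresAt: now.Add(sessionTTL)}
	return token, nil
}

// purgeExpired removes sessions whose expiry is at or before now.
func (s *sessionStore) purgeExpired(now time.Time) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	removed := 0
	for token, sess := range s.sessions {
		if !sess.expiresAt.After(now) {
			delete(s.sessions, token)
			removed++
		}
	}
	return removed
}

// demo creates one session, then purges just before and just after its expiry.
func demo() (tokenLen, removedEarly, removedLate int) {
	s := &sessionStore{sessions: map[string]*hubSession{}}
	now := time.Now()
	token, err := s.create(now)
	if err != nil {
		panic(err)
	}
	removedEarly = s.purgeExpired(now.Add(sessionTTL - time.Second))
	removedLate = s.purgeExpired(now.Add(sessionTTL + time.Second))
	return len(token), removedEarly, removedLate
}

func main() {
	fmt.Println(demo())
}
```

Passing `now` into `create` and `purgeExpired` keeps the expiry logic testable without waiting for real time to pass.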
### v0.22.3 — Hub Asset Sync (2026-02-21)
**Hub-managed asset downloads**
- New `internal/assets` package: downloads and caches app assets (logos, screenshots) from the Hub API with SHA-256 change detection.
- Asset syncer resolves files from downloaded cache first, falls back to baked-in `/usr/share/felhom/assets/` directory.
- Config: `assets.sync_enabled: true` + `assets.sync_schedule: "05:00"` to enable daily sync.
- API: `POST /api/assets/sync` triggers on-demand sync, `GET /api/assets/status` returns sync status.
- Web server's `serveAsset()` now routes through syncer's `Resolve()` when available.
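
The SHA-256 change detection mentioned above boils down to comparing a remote manifest against the hashes of cached files. The sketch below shows one plausible shape for that comparison; `planSync`, `sha256Hex`, and the manifest layout (file name mapped to hex digest) are assumptions for illustration, not the `internal/assets` API.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// sha256Hex returns the hex-encoded SHA-256 digest of data.
func sha256Hex(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// planSync compares a remote manifest (file name -> SHA-256) with the hashes
// of locally cached files. It returns the files to download (new or changed)
// and the cached files to remove (no longer in the manifest), both sorted.
func planSync(remote, local map[string]string) (download, remove []string) {
	for name, hash := range remote {
		if local[name] != hash {
			download = append(download, name)
		}
	}
	for name := range local {
		if _, ok := remote[name]; !ok {
			remove = append(remove, name)
		}
	}
	sort.Strings(download)
	sort.Strings(remove)
	return download, remove
}

func main() {
	local := map[string]string{
		"wiki.svg": sha256Hex([]byte("old logo")),
		"gone.png": sha256Hex([]byte("stale")),
	}
	remote := map[string]string{
		"wiki.svg":  sha256Hex([]byte("new logo")),
		"cloud.svg": sha256Hex([]byte("cloud logo")),
	}
	download, remove := planSync(remote, local)
	fmt.Println("download:", download)
	fmt.Println("remove:", remove)
}
```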
### v0.22.2 — Setup Logo Fix (2026-02-21)
- **Fix setup wizard logo**: Logo failed to load because `handleLogo()` tried to read it as a file from the filesystem, but it only exists as an embedded string constant. Now imports and serves `web.FelhomLogoSVG` directly.
### v0.22.1 — Setup Wizard Bugfixes (2026-02-21)
- **Fix setup mode detection**: Remove `demo-felhom` from `NeedsSetup()` check — only empty `customer.id` triggers setup mode. Previously the demo customer was stuck in setup mode.
- **Fix CSRF nil pointer panic**: `renderError()` was passing `nil` instead of `*http.Request` to `ensureCSRFToken()`, causing panic when rendering error pages.
- **Fix double-v version display**: Welcome page showed "vv0.22.0" — removed redundant `v` prefix from template.
- **Fix IP detection in Docker**: Setup wizard showed container bridge IP (172.x) instead of host LAN IP. Now reads `HOST_IP` env var (set by docker-setup.sh).
- **Add Hub download logging**: Log Hub config download attempts and errors for easier debugging.
- **docker-setup.sh**: Inject `HOST_IP` env var into generated docker-compose.yml.
### v0.22.0 — First-Run Setup Wizard & Local Infra Backup (2026-02-21)
Major feature release: moves ALL initial configuration and disaster recovery setup from `docker-setup.sh` into the controller itself as a web-based wizard.
**Setup Wizard (`internal/setup/`):**
- New web-based setup wizard replaces interactive CLI wizard from `docker-setup.sh`
- Dual listener: `:8080` (behind Traefik) + `:8081` (direct HTTP for LAN access before DNS is configured)
- Setup mode detection: controller enters wizard when `customer.id` is empty
- Two paths: "Restore from backup" (local drive scan + Hub recovery) and "Fresh install" (Hub download or manual config)
- Drive scanner: detects `.felhom-infra-backup/` on all connected drives, validates checksums
- Hub recovery: `GET /api/v1/recovery/{id}` with retrieval password auth — returns combined config + infra backup
- CSRF protection (cookie + hidden field) for all wizard POST endpoints
- State persistence (`setup-state.json`) survives browser crashes
- All UI text in Hungarian, uses existing dark theme CSS
- After setup: writes `controller.yaml`, creates `settings.json`, `os.Exit(0)` → Docker restart into normal mode
**Local Infra Backup (`internal/backup/local_infra.go`):**
- Writes infrastructure backup to all connected drives as `.felhom-infra-backup/backup.json` + `metadata.json`
- Schema-versioned with SHA256 checksum validation
- Runs on startup and after each nightly backup cycle
- Enables disaster recovery without Hub connectivity — any drive can bootstrap a new controller
**Hub Verification:**
- Pusher parses Hub report response for `customer_blocked` field
- Updates `hub_verified` / `hub_verified_at` in settings on each successful push
- `IsLimitedMode()` checks verification state + 7-day grace period
**Recovery Info:**
- New `internal/recovery/` package generates `recovery-info.txt` in data directory
- Settings page shows recovery info section (customer ID, Hub URL, masked retrieval password)
- Recovery file auto-regenerated on each startup when retrieval password is set
**Pending Events:**
- New `PendingEvent` type in settings with `AddPendingEvent()` / `DrainPendingEvents()`
- Events queued during setup (e.g., DR completed) are drained and pushed to Hub on first successful report push
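
A minimal sketch of the queue-then-drain pattern behind `AddPendingEvent()` / `DrainPendingEvents()` follows. The method names come from this entry; the `PendingEvent` fields and the reduced `Settings` struct are assumptions, and persisting the queue to `settings.json` is omitted.

```go
package main

import (
	"fmt"
	"sync"
)

// PendingEvent carries an event queued while the Hub is not yet reachable.
// The fields are assumptions; the changelog names only the type.
type PendingEvent struct {
	Type    string
	Message string
}

// Settings is reduced to the pending queue for this sketch.
type Settings struct {
	mu      sync.RWMutex
	pending []PendingEvent
}

// AddPendingEvent appends an event to the queue.
func (s *Settings) AddPendingEvent(e PendingEvent) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.pending = append(s.pending, e)
}

// DrainPendingEvents returns all queued events and empties the queue in one
// locked step, so an event cannot be handed out twice.
func (s *Settings) DrainPendingEvents() []PendingEvent {
	s.mu.Lock()
	defer s.mu.Unlock()
	out := s.pending
	s.pending = nil
	return out
}

func main() {
	s := &Settings{}
	s.AddPendingEvent(PendingEvent{Type: "disaster_recovery_completed", Message: "restored during setup"})

	// After the first successful report push, drain and forward the events.
	for _, e := range s.DrainPendingEvents() {
		fmt.Println("pushing:", e.Type)
	}
	fmt.Println("left in queue:", len(s.DrainPendingEvents()))
}
```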
**Config & Settings Schema:**
- `config.go`: Added `SetupListen` field (default `:8081`), `LoadPermissive()`, `Default()`
- `settings.go`: Added `hub_verified`, `hub_verified_at`, `retrieval_password`, `pending_events` fields with RWMutex accessors
**Infrastructure:**
- `docker-compose.yml`: Added port `8081:8081` mapping for setup wizard
- Removed old fresh-deployment auto-restore code from `main.go` (lines 70-141)
- Removed `restoreSettingsFromHub()` and `restorePasswordsFromHub()` helpers
### v0.21.3 — Config Apply Infra Push + Fixes (2026-02-20)
- **Push infra backup after config apply**: After a successful `POST /api/config/apply`, the controller immediately pushes an infra backup to the Hub so the config sync status updates right away.
- **Fix double "v" prefix in startup event**: "Controller elindult (vv0.21.2)" → "Controller elindult (v0.21.3)".
### v0.21.2 — Config Apply Bind Mount Fix (2026-02-20)
- **Fix config apply on Docker bind mounts**: `POST /api/config/apply` failed with "device or resource busy" because `os.Rename()` doesn't work on bind-mounted files. Now falls back to direct write when rename fails.
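
The rename-with-fallback approach can be sketched as follows. This is an illustration of the technique rather than the controller's code: `writeConfig`, the `.tmp` suffix placement, the file mode, and the `demo` helper are assumptions. Note that the fallback path is a plain overwrite and is therefore not atomic.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeConfig writes data via a temp file plus rename, and falls back to a
// direct write when the rename fails (as it does on a bind-mounted file).
func writeConfig(path string, data []byte) error {
	tmp := path + ".tmp"
	if err := os.WriteFile(tmp, data, 0o600); err != nil {
		return err
	}
	if err := os.Rename(tmp, path); err == nil {
		return nil
	}
	// Rename failed: drop the temp file and overwrite in place (not atomic).
	os.Remove(tmp)
	return os.WriteFile(path, data, 0o600)
}

// demo writes a config into a temporary directory and reads it back.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "cfg")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "controller.yaml")
	if err := writeConfig(path, []byte("customer:\n  id: demo\n")); err != nil {
		return "", err
	}
	got, err := os.ReadFile(path)
	return string(got), err
}

func main() {
	got, err := demo()
	fmt.Printf("%q %v\n", got, err)
}
```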
### v0.21.1 — Config Content Endpoint (2026-02-20)
- **`GET /api/config`**: New endpoint returning raw controller.yaml content (text/yaml). Used by Hub for live config diff and pull operations. Same auth as other config endpoints (Bearer token or session cookie).
### What was just completed (2026-02-20 session 64)
- **v0.21.0 — Hub Monitoring Takeover (Controller-side, Phases 5+6):**
Replaces external Healthchecks.io dependency with Hub-native event system. The controller now pushes structured events directly to the Hub's `/api/v1/event` endpoint, and the Hub handles dead man's switch detection, notification dispatch, and cooldown management.
**Phase 5 — Event Push System (`internal/notify/notifier.go`):**
- New core method `PushEvent(eventType, severity, message, details)` — non-blocking goroutine, 2 retries with 3s backoff, POSTs to Hub `/api/v1/event`
- 8 typed detail structs: `BackupDetails`, `DBDumpDetails`, `DiskDetails`, `HealthDetails`, `StorageDetails`, `UpdateDetails`, `AppDetails`, `CrossDriveDetails`
- Replaced all old `Notify*` methods with event-based equivalents:
- `NotifyBackupCompleted/Failed` → `backup_completed`/`backup_failed` events
- `NotifyDBDumpCompleted/Failed` → `db_dump_completed`/`db_dump_failed` events
- `NotifyIntegrityOK/Failed` → `backup_integrity_ok`/`backup_integrity_failed` events
- `NotifyHealthChange` → detects transitions, pushes `health_degraded`/`health_critical`/`health_recovered`
- `NotifyStorageDisconnected/Reconnected` → `storage_disconnected`/`storage_reconnected` events
- `NotifyControllerStarted``controller_started` event on startup
- `NotifyControllerUpdated``controller_updated` event (replaces `NotifyUpdateSuccess/Failed`)
- `NotifyAppDeployed/Removed` → `app_deployed`/`app_removed` events
- `NotifyCrossDriveCompleted/Failed` → `crossdrive_completed`/`crossdrive_failed` events
- `NotifyDRStarted/Completed` → `disaster_recovery_started`/`disaster_recovery_completed` events
- Removed old `/api/v1/notify` relay, `classifyWarning()`, and client-side cooldown logic (Hub handles cooldowns now)
- `SendTest()` now pushes `test` event type via `PushEvent`
- `SyncPreferences` updated to include `cooldownHours` parameter
**Phase 5 — Event Wiring:**
- `main.go`: Wired success events for backup, db-dump, integrity check; startup event with 5s delay; update event after `VerifyStartup()`
- `router.go`: Added `NotifyAppDeployed`/`NotifyAppRemoved` after successful deploy/remove via API
- `handler_restore.go`: Added `NotifyDRStarted`/`NotifyDRCompleted` in DR restore flow
- `server.go`: New `HubPushStatusData` struct and `SetHubPushStatus` callback for monitoring page
**Phase 5 — Hub Connection Monitoring:**
- `pusher.go`: Added `PushStatus` tracking (LastAttempt, LastSuccess, LastError, Consecutive failures) to report Pusher
- `handlers.go`: Monitoring page now shows Hub connection status (connected/unreachable, URL, customer ID, last success, last error) instead of Healthchecks ping UUIDs
- `monitoring.html`: Replaced "Távoli monitoring" section with "Hub kapcsolat" section
- `alerts.go`: Replaced "Missing ping UUIDs" alert with Hub connection alerts (`hub-disabled` warning, `hub-unreachable` error)
**Phase 5 — Expanded Notification Settings:**
- `settings.html`: Expanded from 4 checkboxes to 11 grouped toggles in two categories:
- "Hibák és figyelmeztetések": backup_failed, db_dump_failed, backup_integrity_failed, crossdrive_failed, disk alerts, storage_disconnected, node_down, health_critical, expected missed
- "Tájékoztató": storage_reconnected, health_recovered
- Compound toggles: "Lemez figyelmeztetés" maps to `disk_warning` + `disk_critical`; "Elvárt mentés elmaradt" maps to `expected_backup_missed` + `expected_dbdump_missed`
- `settings.go`: Updated `DefaultEnabledEvents` to new Hub event types
- `handlers.go`: Updated settings POST handler for expanded event names and compound toggles
**Phase 6 — Config Cleanup:**
- `main.go`: Deprecation log on startup when ping UUIDs are configured: `[INFO] Healthchecks ping UUIDs configured but no longer used — monitoring is now handled by the Hub`
- Pinger still runs for transitional backward compatibility
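
The delivery behaviour of `PushEvent` described in Phase 5 (a non-blocking goroutine with 2 retries and a fixed backoff) might be structured roughly as below. This is a sketch under assumptions: `pushWithRetry`, `pushAsync`, `sendFunc`, and `errTransient` are hypothetical names, "2 retries" is read as one initial attempt plus two more, and the HTTP POST to `/api/v1/event` is replaced by a caller-supplied function.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient stands in for a failed delivery attempt in this sketch.
var errTransient = errors.New("hub unreachable")

// sendFunc performs one delivery attempt; in the controller this would be
// the HTTP POST to the Hub.
type sendFunc func() error

// pushWithRetry makes one initial attempt plus up to `retries` more, sleeping
// `backoff` between attempts. It returns the attempts made and the last error.
func pushWithRetry(send sendFunc, retries int, backoff time.Duration) (int, error) {
	var err error
	for attempt := 1; attempt <= retries+1; attempt++ {
		if err = send(); err == nil {
			return attempt, nil
		}
		if attempt <= retries {
			time.Sleep(backoff)
		}
	}
	return retries + 1, err
}

// pushAsync runs the retry loop in a goroutine so the caller is not blocked.
// The final result is delivered on the returned buffered channel.
func pushAsync(send sendFunc, retries int, backoff time.Duration) <-chan error {
	done := make(chan error, 1)
	go func() {
		_, err := pushWithRetry(send, retries, backoff)
		done <- err
	}()
	return done
}

func main() {
	calls := 0
	flaky := func() error {
		calls++
		if calls < 3 {
			return errTransient
		}
		return nil
	}
	// The changelog uses a 3s backoff; shortened here so the demo ends quickly.
	err := <-pushAsync(flaky, 2, 10*time.Millisecond)
	fmt.Println("attempts:", calls, "error:", err)
}
```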
### What was just completed (2026-02-20 session 63)
- **v0.20.0 — Hub Config Management (Phase B):**
Two new features enabling the Hub to manage and compare controller configuration remotely.
**Feature A — Config Apply Endpoint:**
- `router.go`: Added `POST /api/config/apply` — accepts YAML body from Hub, validates it's parseable via `config.LoadFromBytes()`, writes atomically to controller.yaml (`.tmp` + `os.Rename`), returns success JSON. Restart required to apply.
- `router.go`: Added `GET /api/config/hash` — returns SHA256 hex digest of current controller.yaml
- `router.go`: Router struct gained `configPath string` field; `NewRouter()` signature updated
- `config.go`: Added `LoadFromBytes([]byte)` — parses YAML without file I/O (for validation)
- `config.go`: Added `FileHash(path)` — SHA256 hex digest helper
- `main.go`: Config endpoints use same dual auth middleware as self-update (session OR Hub API key Bearer token)
- `main.go`: Added `/api/config/` mux entry with `selfUpdateAuthMiddleware`
**Feature B — Config Hash in Reports:**
- `types.go`: Added `ConfigHash string` field to `Report` struct (JSON: `config_hash`)
- `builder.go`: `BuildReport()` now accepts `configPath string` parameter, computes SHA256 of controller.yaml and includes it in every report
- `main.go`: All 4 `BuildReport()` call sites updated to pass `*configPath`
- Hub uses this hash to compare against its generated YAML — shows "In sync" / "Config mismatch" / "Unknown" on the unified customer detail page
### What was just completed (2026-02-20 session 62)
- **docker-setup.sh — Hub Config Download:**
- Added `--hub-customer` and `--hub-password` CLI flags for downloading pre-configured controller.yaml from Felhom Hub
- Added `HUB_URL` global variable (default: `https://hub.felhom.eu`)
- Hub download logic at start of `run_config_wizard()`: downloads YAML via `curl` with `X-Retrieval-Password` header, validates response, extracts key variables (domain, CF tokens, email), sets global variables for subsequent setup steps
- Falls back to interactive wizard if download fails or credentials not provided
### What was just completed (2026-02-20 session 61)
- **v0.19.0 — Deployed App Removal + Missing Field Injection:**
Two new features: "Eltávolítás" (Remove) action for deployed stacks and automatic missing deploy field injection on template updates.
**Feature A — Deployed App Removal ("Eltávolítás"):**
- `delete.go`: Added `RemoveStack()` — removes deployed (non-orphaned) stack: `docker compose down --volumes`, optional HDD data cleanup, optional backup data cleanup (DB dumps + cross-drive rsync), removes `app.yaml` only (template files preserved for redeploy); stack reverts to "Nincs telepítve" state
- `delete.go`: Added `GetStackBackupData()` — returns backup path info (DB dump dir + cross-drive rsync dir) with sizes and existence status
- `delete.go`: Added `RemoveResponse`, `BackupDataResponse` structs, `buildPathInfo()` helper
- `router.go`: Added `POST /api/stacks/{name}/remove` endpoint — accepts `{remove_hdd_data, remove_backups}`, computes backup paths via `AppDBDumpPath()`/`AppSecondaryRsyncPath()`, cleans cross-drive config on success
- `router.go`: Added `GET /api/stacks/{name}/backup-data` endpoint — returns backup data paths with sizes
- `crossdrive.go`: Made `getAppDrivePath` → `GetAppDrivePath` (public) for use by router
- `stacks.html`: Added "Eltávolítás" button for stopped, deployed, non-orphaned, non-protected stacks
- `dashboard.html`: Same button in compact card layout
- `layout.html`: Added `removeStack()` modal — fetches HDD + backup data in parallel, 3-section layout (always removed / HDD data with checkbox / backup data with checkbox), reimport warning for preserved HDD data, restic retention note
- `layout.html`: Added `confirmRemoveStack()` — POST to `/remove`, shows result summary with removed/preserved paths
**Feature B — Missing Deploy Field Injection:**
- `deploy.go`: Added `InjectMissingFields(stackNames)` — iterates deployed stacks, compares `.felhom.yml` deploy_fields against `app.yaml` env vars, auto-generates values for missing `secret` (using generator spec) and `domain` fields, saves updated `app.yaml`
- `deploy.go`: Added `base64key` generator type — produces `base64:<N random bytes base64-encoded>` (for Laravel APP_KEY and similar)
- `deploy.go`: Added `containsStr()` helper
- `manager.go`: Added `DeployedStackNames()` — returns names of all deployed stacks
- `sync.go`: Added `postSyncHook func(updated []string)` field to `Syncer`; `New()` accepts optional hook; hook called in `doSync()` after rescan with names of updated stacks
- `main.go`: Wired injection on startup (all deployed stacks) and after sync (updated stacks only)
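A generator producing the `base64:<N random bytes base64-encoded>` format could look roughly like the sketch below. The function name `generateBase64Key` is hypothetical, and the use of standard (padded) base64 is an assumption based on the Laravel `APP_KEY` convention mentioned above.

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
)

// generateBase64Key returns "base64:" followed by n cryptographically random
// bytes in standard base64 encoding. (Hypothetical name; padded encoding
// is an assumption.)
func generateBase64Key(n int) (string, error) {
	buf := make([]byte, n)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return "base64:" + base64.StdEncoding.EncodeToString(buf), nil
}
```

With `n = 32` the result is a 7-character prefix followed by 44 base64 characters.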
### v0.18.0 (2026-02-19 session 60)
- **v0.18.0 — Drive Migration & Tier 2 Restic Deprecation:**
Full drive replacement workflow with decommissioned state, enhanced per-app migration with backup awareness, and deprecation of restic as a Tier 2 cross-drive backup method (rsync only).
**Phase 1 — Restic Tier 2 Deprecation:**
- `settings.go`: Auto-migrate restic→rsync on startup via `migrateResticToRsync()` in `Load()`
- `crossdrive.go`: Removed `runResticBackup()`, `pruneResticRepo()`, `ensureResticRepo()`; `RunAppBackup()` calls rsync directly
- `backup.go`: Removed Tier 2 secondary restic scanning from `ListAllSnapshots()`
- `settings.go`: Removed cross-drive restic password methods (`GetOrCreateCrossDrivePassword`, etc.)
- `deploy.html`: Removed method dropdown (rsync/restic selector)
- `handlers.go`: Simplified `Tier2DriveGroup` (flat `Items` list), removed method handling from `settingsCrossBackupHandler()`
- `backups.html`: Removed method split in Tier 2 details section
- `router.go`: Always set method to "rsync" in cross-backup API
- `infra_backup.go`: Removed cross-drive password block from `CollectInfraBackup()`
- `main.go`: Removed `SetCrossDriveResticPassword` restore block
**Phase 2 — Enhanced Per-App Migration:**
- `backup.go`: Extracted `backupDrive()` from `runBackupInternal()` loop; added `TryRunDriveBackup()` with non-blocking lock
- `crossdrive.go`: Added `AnyRunning()` method
- `migrate.go`: Added `BackupTrigger` interface, `MigrateOrchestrator`, `RunEnhancedMigration()` with post-migration steps (DB dump copy, Tier 2 conflict clearing, auto-delete stale data, immediate Tier 1 backup)
- `storage_handlers.go`: Wired orchestrator into migration handler with `auto_delete_stale` support
- `migrate.html`: Added auto-delete checkbox, "cleaning" + "backing_up" progress steps
**Phase 3 — Full Drive Migration:**
- `settings.go`: Added `Decommissioned`/`DecommissionedAt`/`MigratedTo` fields to `StoragePath`; added `SetDecommissioned()`, `ClearDecommissioned()`, `IsDecommissioned()`, `GetDecommissionedPaths()`, `GetStorageLabel()`; `GetConnectedPaths()`/`GetSchedulableStoragePaths()` exclude decommissioned
- `migrate_drive.go` (NEW): `DriveMigrator` with `MigrateDrive()` 10-step flow (validate→stop→rsync→verify→configure→decommission→Tier2→start→backup→notify), `migrationTx` rollback pattern, excludes restic repos from rsync
- `settings.html`: Decommissioned card variant with "Kiváltva" badge, "Összes adat átköltöztetése" button on connected cards
- `migrate_drive.html` (NEW): Drive migration wizard (form + progress + done cards)
- `storage_handlers.go`: Added `/api/storage/migrate-drive`, `/api/storage/migrate-drive/status`, `/api/storage/decommission/remove` endpoints
- `server.go`: Added `/settings/storage/migrate-drive` route, `SetDriveMigrator()` setter
- `watchdog.go`: Skip decommissioned drives in `Check()`; block `SafeDisconnect()` for decommissioned
- `healthcheck.go`: Skip decommissioned paths in `checkStoragePaths()`
- `backup.go`: Skip decommissioned drives in `backupDrive()`/`runDBDumpsInternal()`; added `MigrationActiveCheck` callback to skip nightly backup during migration
- `crossdrive.go`: Reject decommissioned destinations in `ValidateDestination()`; skip decommissioned paths in `AutoEnableSmallApps()`
- `handlers.go`: Skip decommissioned drives in `buildStorageBars()`; made `SyncFileBrowserMounts()` public
- `main.go`: Added `driveMigrateStackAdapter`, wired `DriveMigrator` with all dependencies
**Phase 4 — Hub Changes:**
- `report/types.go`: Added `Decommissioned`/`MigratedTo` fields to `StorageReport`
- `report/builder.go`: Include decommissioned drives in report with flag
**Files modified:** 21 modified + 2 new (`migrate_drive.go`, `migrate_drive.html`).
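The `migrationTx` rollback pattern mentioned for `MigrateDrive()` is not spelled out in this log. One common shape for such a pattern, shown here purely as an illustrative sketch with hypothetical `step`, `runSteps`, and `errStepFailed` names, is to record an undo function after each successful step and replay them in reverse order on failure:

```go
package main

import (
	"errors"
	"fmt"
)

// errStepFailed is a placeholder error used only for illustration.
var errStepFailed = errors.New("step failed")

// migrationTx collects undo functions as the migration advances.
type migrationTx struct {
	undo []func()
}

// rollback runs the recorded undo functions in reverse order.
func (tx *migrationTx) rollback() {
	for i := len(tx.undo) - 1; i >= 0; i-- {
		tx.undo[i]()
	}
	tx.undo = nil
}

// step is one unit of the migration flow with an optional undo action.
type step struct {
	name string
	do   func() error
	undo func()
}

// runSteps executes steps in order; on the first error it rolls back every
// step that already succeeded and returns the wrapped error.
func runSteps(steps []step) error {
	tx := &migrationTx{}
	for _, s := range steps {
		if err := s.do(); err != nil {
			tx.rollback()
			return fmt.Errorf("step %s: %w", s.name, err)
		}
		if s.undo != nil {
			tx.undo = append(tx.undo, s.undo)
		}
	}
	return nil
}
```

If the third of three steps fails, the undo actions of the second and first steps run in that order, leaving the system as it was before the migration started.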
### What was just completed (2026-02-19 session 59)
- **v0.16.1 + hub v0.1.8 — Hub Update Trigger + Controller URL Reporting:**
Controller now includes its external URL (`controller_url`) in periodic hub reports so the hub can trigger self-updates remotely. Hub tracks the URL in a new `controller_url` DB column, checks the Gitea registry for the latest controller image version (VersionChecker goroutine, `web/version.go`), and shows a "Controller Update" card on the customer detail page.
**Controller (v0.16.1):**
- `internal/report/types.go`: Added `ControllerURL string` field to Report struct.
- `internal/report/builder.go`: Sets `ControllerURL` from `cfg.Customer.Domain` → `https://felhom.<domain>`.
- `internal/api/router.go`: **Bug fix** — moved selfupdate routes to before `hasSuffix(path, "/update")` stack case (which was catching `/selfupdate/update` first).
**Hub (v0.1.8):**
- `cmd/hub/main.go`: Added `Registry` config section + defaults; creates `VersionChecker` goroutine if credentials configured; passes `apiKey` to `web.New()`.
- `internal/store/store.go`: Added `ControllerURL` to `CustomerSummary`; idempotent `ALTER TABLE reports ADD COLUMN controller_url TEXT` migration; updated `SaveReport`, `GetCustomers`, `GetCustomer`, `GetCustomerHistory` queries.
- `internal/web/version.go` (NEW): `VersionChecker` type — polls Gitea Docker Registry V2 API (`/v2/<owner>/<repo>/tags/list`) every 6h; parses semver tags; stores latest version thread-safely.
- `internal/web/server.go`: Added `apiKey`, `versionChecker` fields; updated `New()` signature; added `SetVersionChecker()`; added `handleTriggerUpdate` handler that proxies POST to controller's `/api/selfupdate/update`; added trigger-update route (before `/customers/` catch-all); updated `handleCustomerDetail` with `ControllerURL`, `LatestVersion`, `UpdateAvailable` template data; added `compareVersions` helper.
- `internal/web/templates/customer.html`: New "Controller Update" section between Health and Notifications — shows current/latest version with update indicator, controller URL link, and conditional "Trigger Update" button with JS.
- `internal/api/handler.go`: Added `ControllerURL` to `/api/v1/customers` JSON response.
- Hub config (`hub.yaml`): Added `registry:` section with Gitea admin credentials.
**Files modified/created:** controller: 3 files; hub: 5 modified + 1 created (version.go).
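The routing bug fixed in `router.go` above is an ordering problem: a generic suffix check placed first also matches the more specific self-update path. The sketch below reproduces the effect with hypothetical `routeBuggy` and `routeFixed` functions; the stack path `/api/stacks/immich/update` is an assumed example, not taken from the router.

```go
package main

import "strings"

// routeBuggy checks the generic "/update" suffix first, so it also captures
// /api/selfupdate/update before the self-update case is reached.
func routeBuggy(path string) string {
	switch {
	case strings.HasSuffix(path, "/update"):
		return "stack-update"
	case strings.HasPrefix(path, "/api/selfupdate/"):
		return "selfupdate"
	}
	return "not-found"
}

// routeFixed checks the specific self-update prefix before the generic suffix.
func routeFixed(path string) string {
	switch {
	case strings.HasPrefix(path, "/api/selfupdate/"):
		return "selfupdate"
	case strings.HasSuffix(path, "/update"):
		return "stack-update"
	}
	return "not-found"
}
```

Go evaluates `switch` cases top to bottom and stops at the first match, which is why moving the self-update case earlier is sufficient.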
### What was just completed (2026-02-19 session 58)
- **v0.16.0 — Controller Self-Update:**
Watchtower-style self-update mechanism. New package `internal/selfupdate/` with 3 files: `version.go` (semver parsing/comparison), `state.go` (audit log state file I/O), `updater.go` (registry check via Gitea V2 API, update trigger, startup verification).
**Flow:** Gitea registry tag list → `docker pull` → atomic compose file rewrite → `docker compose up -d` → process replaced. State file (`update-state.json`) persists across restart as audit log; verified on next startup to detect success/failure.
**Config:** `SelfUpdateConfig` extended with `AutoUpdateTime` field + defaults for `Image` and `AutoUpdateTime`. Scheduler jobs: periodic check every `check_interval` (default 6h); optional daily auto-update at `auto_update_time` (default 04:30).
**API:** 3 new endpoints under `/api/selfupdate/` (`status`, `check`, `update`). Auth via session cookie OR `Authorization: Bearer <hub_api_key>` header (for external triggering from build scripts).
**UI:** Settings page "Verzió és frissítés" card shows current/latest version, check time, auto-update status, last update result. "Frissítés keresése" button queries registry; "Frissítés telepítése" button appears when update is available. `pollUntilBack()` JS polls `/api/health` after triggering update and reloads when container is back up.
**Notifications:** `NotifyUpdateSuccess()` and `NotifyUpdateFailed()` added to notifier for post-update startup verification results.
**Alert:** Dashboard shows "Új controller verzió elérhető" info alert when update is available.
**docker-compose.yml:** Added `/opt/docker/felhom-controller:/opt/docker/felhom-controller` directory bind mount (required for compose file access during self-update); named volume and read-only config override on top.
**Files modified/created (12):** `internal/selfupdate/version.go` (NEW), `internal/selfupdate/state.go` (NEW), `internal/selfupdate/updater.go` (NEW), `internal/config/config.go`, `internal/notify/notifier.go`, `internal/api/router.go`, `internal/web/server.go`, `internal/web/handlers.go`, `internal/web/alerts.go`, `internal/web/templates/settings.html`, `cmd/controller/main.go`, `docker-compose.yml`
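Picking the latest version from a registry tag list needs numeric, not lexical, comparison (otherwise `v0.9.12` would sort above `v0.16.0`). A minimal sketch, assuming plain `MAJOR.MINOR.PATCH` tags with an optional `v` prefix; the function names and the decision to skip non-semver tags such as `latest` are assumptions, not the contents of `version.go`:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion parses "v1.2.3" or "1.2.3" into its three numeric parts.
func parseVersion(s string) ([3]int, error) {
	var v [3]int
	parts := strings.Split(strings.TrimPrefix(s, "v"), ".")
	if len(parts) != 3 {
		return v, fmt.Errorf("invalid version %q", s)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return v, fmt.Errorf("invalid version %q", s)
		}
		v[i] = n
	}
	return v, nil
}

// compareVersions returns -1, 0, or 1 when a is lower than, equal to,
// or higher than b.
func compareVersions(a, b [3]int) int {
	for i := range a {
		if a[i] < b[i] {
			return -1
		}
		if a[i] > b[i] {
			return 1
		}
	}
	return 0
}

// latestVersion returns the highest parseable tag, skipping non-semver tags.
func latestVersion(tags []string) (string, bool) {
	var best [3]int
	bestTag, found := "", false
	for _, t := range tags {
		v, err := parseVersion(t)
		if err != nil {
			continue
		}
		if !found || compareVersions(v, best) > 0 {
			best, bestTag, found = v, t, true
		}
	}
	return bestTag, found
}
```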
### What was just completed (2026-02-19 session 57)
- **v0.15.7 — Fix backup page storage display & rename system drive label:**
Backup page ("Biztonsági mentés") now shows all registered storage paths instead of only a single "Külső HDD". Added `data["StorageBars"] = s.buildStorageBars()` to `backupsHandler` (was missing unlike dashboard/monitoring handlers). Updated `backups.html` storage bars section to use `StorageBars` loop (same pattern as monitoring page), replacing the old `{{if .HDDConfigured}}` single-HDD block.
Renamed system root partition label from "SSD (/)" to "Rendszer (/)" on all three pages (backup, monitoring, dashboard), as the root filesystem is not necessarily on an SSD.
**Files modified (4):** `internal/web/handlers.go`, `internal/web/templates/backups.html`, `internal/web/templates/monitoring.html`, `internal/web/templates/dashboard.html`
### What was just completed (2026-02-19 session 56)
- **v0.15.6 (controller) + hub v0.1.7 — Bug hunt fixes (BUGHUNT.md):**
**Controller — Restore race conditions (P0-P1):** All 4 restore handlers (`restorePageHandler`, `apiRestoreStatus`, `apiRestoreAll`, `apiRestoreSkip`) now hold `restoreMu.RLock()` across nil-check and field reads. `apiRestoreAll` uses new `TryStartRestore()` method for atomic check-and-set (eliminates double-restore race). `executeAllRestores()` snapshots plan under lock, uses `SetStatus("done")` instead of direct write. Removed dead no-op goroutine.
**Controller — restore_scan.go:** `dirIsEmpty()` now returns `false` on read errors (was silently treating unreadable dirs as empty, losing backup data). `Snapshot()` deep-copies Apps and Drives slices. Added `TryStartRestore()`, `SetStatus()`, `GetStatus()` helper methods.
**Controller — infra_backup.go (P0):** `controller.yaml` read failure now returns a real error (was silently creating empty backup). `settings.json` and restic password read failures now logged. Added `logger *log.Logger` parameter to `BuildInfraBackup`.
**Controller — main.go DR wiring:** Fixed ordering — `restoreSettingsFromHub` + settings reload now happens before `restorePasswordsFromHub` (prevents cross-drive password loss). Nil check after `ScanDrivesForBackups`. `os.MkdirAll` error now logged. `os.MkdirAll` added to `restoreSettingsFromHub` before write.
**Hub — store.go (P2):** 5 `json.Unmarshal` calls now log `[WARN]` on failure. `GetInfraBackupMeta` logs unmarshal error instead of silently returning wrong counts.
**docker-setup.sh (P0-P2):** DRY_RUN check moved to top of `run_config_wizard()` with dummy values (was prompting interactively even in dry-run). CF tunnel token quoted in docker-compose env. `htpasswd` uses `cut -d: -f2` + bcrypt format validation. `grep -qF` for literal path matching. Volume paths quoted in YAML output. Post-wizard validation rejects default `demo-felhom`/`homeserver.local` values.
**restore.html (P2-P3):** Error text uses `textContent` instead of `innerHTML`. Poll errors counted; after 10 failures shows "Kapcsolat megszakadt" message instead of polling silently forever.
**Files modified (controller, 6):** `internal/backup/restore_scan.go`, `internal/web/handler_restore.go`, `internal/report/infra_backup.go`, `cmd/controller/main.go`, `internal/web/templates/restore.html`, `scripts/docker-setup.sh`
**Files modified (hub, 1):** `hub/internal/store/store.go`
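The double-restore race is closed by doing the status check and the status change under one lock acquisition. A rough sketch of that idea follows; the `restorePlan` type and the `"pending"` / `"running"` status values are assumptions for illustration (only `"done"` appears in the notes above).

```go
package main

import "sync"

// restorePlan holds the restore state guarded by a read-write mutex.
// (Hypothetical type; status values other than "done" are assumptions.)
type restorePlan struct {
	mu     sync.RWMutex
	status string
}

// TryStartRestore moves the plan from "pending" to "running" in a single
// critical section. It returns false if a restore has already been started.
func (p *restorePlan) TryStartRestore() bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.status != "pending" {
		return false
	}
	p.status = "running"
	return true
}

// SetStatus updates the status under the write lock.
func (p *restorePlan) SetStatus(s string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.status = s
}

// GetStatus reads the status under the read lock.
func (p *restorePlan) GetStatus() string {
	p.mu.RLock()
	defer p.mu.RUnlock()
	return p.status
}
```

With a separate "check, then set" sequence, two concurrent requests can both pass the check before either writes; here only one caller can ever see `true`.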
### What was just completed (2026-02-19 session 55)
- **v0.15.5 — Fix startup hub report silently failing:**
`Push()` now returns actual errors instead of always `nil`. Previously, push failures were logged internally but the caller could never detect them, leading to a misleading `[INFO] Startup hub report sent` log even when the push actually failed (e.g., hub returning HTTP 503 during simultaneous deployment). Removed the "Never returns error to caller" behavior: marshal error returns a wrapped error, and after 3 failed retries the error is returned to the caller (the internal `[WARN]` log before `return nil` is gone).
Startup hub push now retries 3 times with 15-second delays between outer attempts, giving the hub time to come up when both are deployed together. Each outer attempt uses `Push()`'s own internal 3-retry logic (5s backoff), so the hub gets up to ~40s total to become ready. If all 3 outer attempts fail, logs a clear warning with the next scheduled push interval.
**Files modified (2):** `internal/report/pusher.go`, `cmd/controller/main.go`
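The two retry layers described above (an inner retry inside `Push()` and an outer retry at startup) can both be expressed with one small helper that returns the last error instead of swallowing it. This is a sketch under assumptions: `retry` and `errHubUnavailable` are hypothetical names, and `attempts` is expected to be at least 1.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errHubUnavailable is a placeholder error used only for illustration.
var errHubUnavailable = errors.New("hub unavailable")

// retry calls fn up to attempts times (attempts >= 1), sleeping delay between
// failed attempts. It returns nil on the first success, otherwise the last
// error wrapped with the attempt count.
func retry(attempts int, delay time.Duration, fn func() error) error {
	var err error
	for i := 1; i <= attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		if i < attempts {
			time.Sleep(delay)
		}
	}
	return fmt.Errorf("failed after %d attempts: %w", attempts, err)
}
```

Nesting the helper, for example `retry(3, 15*time.Second, func() error { return retry(3, 5*time.Second, send) })` with a hypothetical `send` function, gives the outer/inner structure from the notes, and the caller can log the real failure rather than a misleading success message.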
### What was just completed (2026-02-19 session 54)
- **v0.15.4 (controller) + hub v0.1.6 — Hub reporting improvements:**
**Controller:** When `hub.enabled: false` but URL+API key are configured, the controller now creates the `Pusher` and sends a one-time "disabled" notification on startup (`health.status = "disabled"`, `reporting_disabled: true`). This replaces the old behavior where a disabled controller was indistinguishable from a crashed node. Added `PushOnce()` method to `Pusher` (bypasses the `enabled` flag). Added `ReportingDisabled` field to the `Report` struct.
**Hub:** Added "disabled" status handling — when the latest report has `health_status = "disabled"`, the overall status is "disabled" (checked BEFORE the stale-time logic, so it stays "PAUSED" even after 30min+). Dashboard shows gray "PAUSED" badge. Customer detail shows "Reporting has been disabled on this node" with a hint to re-enable. Storage labels now shown (`label` field with fallback to `mount`). Report history timestamps now show date + time ("Feb 19 09:46" instead of "09:46:54"). New `.status-badge-disabled` CSS (neutral gray `#475569`).
**Files modified (controller):** `internal/report/types.go`, `internal/report/pusher.go`, `cmd/controller/main.go`
**Files modified (hub):** `hub/internal/web/server.go`, `hub/internal/web/templates/dashboard.html`, `hub/internal/web/templates/customer.html`, `hub/internal/web/templates/style.css`
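The ordering of the "disabled" check relative to the stale-time check is what keeps a paused node from being shown as stale. A minimal sketch of that precedence; the function name, the `"stale"` status string, and the exact 30-minute threshold constant are assumptions:

```go
package main

import "time"

// staleAfter is the assumed age after which a report counts as stale.
const staleAfter = 30 * time.Minute

// overallStatus derives the status shown for a node from its latest report.
// "disabled" is checked first so that a paused node stays paused no matter
// how old its last report is.
func overallStatus(healthStatus string, sinceLastReport time.Duration) string {
	if healthStatus == "disabled" {
		return "disabled"
	}
	if sinceLastReport > staleAfter {
		return "stale"
	}
	return healthStatus
}
```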
### What was just completed (2026-02-19 session 53)
- **v0.15.3 — Show all storage paths on dashboard + fix hub report:**
Dashboard ("Vezérlőpult") and monitoring ("Rendszermonitor") pages now show usage bars for ALL registered storage paths instead of just one hardcoded "Külső HDD" bar. New `StorageBarInfo` type and `buildStorageBars()` helper build bars from `settings.GetStoragePaths()`. Each bar shows the storage label and live disk usage.
Hub storage report now correctly includes all registered storage paths with proper mount paths and labels. Previously it sent only root `/` plus one HDD entry using the deprecated (empty) `cfg.Paths.HDDPath`. Now uses `system.GetDiskUsage()` per storage path, same as the dashboard bars. Added `Label` field to `StorageReport` in `types.go`.
**Files modified (5):** `internal/web/handlers.go`, `internal/web/templates/dashboard.html`, `internal/web/templates/monitoring.html`, `internal/report/builder.go`, `internal/report/types.go`
### What was just completed (2026-02-19 session 52)
- **v0.15.2 — Fix data loss on container restart (2 bugs):**
**Bug 1:** Snapshot history delta stats (HOZZÁADOTT, ÚJ FÁJL, VÁLTOZOTT) showed 0 after container restart because restic doesn't store these stats — they were only in memory. Fixed by persisting the snapshot history ring buffer to `data/snapshot-history.json`. On startup, persisted stats are merged with restic repo snapshots. Added `saveSnapshotHistory()` (atomic write via tmp+rename), `loadSnapshotHistoryFromFile()`, updated `appendSnapshotRecord()` to save after each backup, and updated `LoadSnapshotHistory()` to merge persisted + restic data.
**Bug 2:** DB validation (ÉRVÉNYESÍTÉS column) showed no result after restart because the synthesized `LastDBDump.Results` didn't copy `Validation` from `DumpFileInfo`. One-line fix: added `Validation: f.Validation` to the synthesized `DumpResult` in `GetFullStatus()`.
**Files modified:** `internal/backup/backup.go`
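The merge of persisted stats with snapshots listed from the restic repo might look something like the sketch below. All type and field names here are hypothetical, and keying the merge on the snapshot ID is an assumption; the notes above only state that the two sources are merged.

```go
package main

// SnapshotStats holds the per-backup delta figures that restic does not
// store itself. (Hypothetical type and fields.)
type SnapshotStats struct {
	AddedBytes   int64
	NewFiles     int
	ChangedFiles int
}

// SnapshotRecord is one row of the snapshot history.
type SnapshotRecord struct {
	ID       string
	Stats    SnapshotStats
	HasStats bool
}

// mergeSnapshotHistory returns the snapshots listed by restic, enriched with
// stats from the persisted history wherever the snapshot IDs match.
func mergeSnapshotHistory(fromRestic, persisted []SnapshotRecord) []SnapshotRecord {
	byID := make(map[string]SnapshotStats, len(persisted))
	for _, p := range persisted {
		if p.HasStats {
			byID[p.ID] = p.Stats
		}
	}
	merged := make([]SnapshotRecord, len(fromRestic))
	copy(merged, fromRestic)
	for i := range merged {
		if s, ok := byID[merged[i].ID]; ok {
			merged[i].Stats = s
			merged[i].HasStats = true
		}
	}
	return merged
}
```

Snapshots that exist only in the repo keep zero stats, and persisted entries whose snapshot has been pruned are dropped from the result.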
### What was just completed (2026-02-19 session 51)
- **v0.15.1 — Backup Page "Részletek" Overhaul:**
Replaced the "Tároló" section on the backup page with a new "Részletek" section containing 3 collapsible tier sections with per-drive breakdowns.
**Tier 1 (Helyi mentés):** Shows per-drive restic repo stats (size, snapshot count) with storage labels. Includes aggregated totals when multiple drives exist, plus DB dump summary, integrity check, and encryption key (all carried over).
**Tier 2 (Másodlagos másolat):** Groups cross-drive backup items by destination drive, separated into restic and rsync method sections with per-app sizes.
**Tier 3 (Távoli mentés):** Placeholder for future B2/S3/SFTP remote backup.
**Restore UI improvements:** Snapshot dropdown now groups by tier (optgroup), shows tier label + drive name per snapshot (e.g., "1. szint, hdd_1"), and marks Tier 1 as recommended. Also lists Tier 2 (secondary restic) snapshots for visibility.
**Backend:** New `DriveRepoInfo` struct, `perDriveRepoStats()` method, `ListAllSnapshots()` that includes secondary restic repos, and `Tier2DriveGroup` handler struct. `SnapshotInfo` now carries `Tier` and `DriveLabel` fields.
**Files modified (6):** `internal/backup/backup.go`, `internal/backup/restic.go`, `internal/web/handlers.go`, `internal/api/router.go`, `internal/web/templates/backups.html`, `internal/web/templates/style.css`
### What was just completed (2026-02-18 session 50)
- **v0.15.0 — Attach Existing Drive (bind mount wizard):**
New feature: Settings → "Meglévő meghajtó csatolása" wizard. Allows attaching a drive that already has a filesystem (ext4, etc.) without formatting. Solves the real-world scenario where a customer's drive contains existing data that must be preserved.
**How it works:** The partition is mounted read-only at a hidden staging path (`/mnt/.felhom-raw/<label>`). A directory browser lets the user navigate the drive's contents and create a new folder. The selected folder is bind-mounted at `/mnt/<hdd-name>`, keeping the controller's data isolated from existing files. Two fstab entries (raw + bind, both with `nofail`) ensure the mount survives reboots.
**Wizard flow:** Scan → Select partition (only shows partitions with existing FS) → Mount raw + Browse directories → Create folder if needed → Configure mount name + label → Finalize (bind mount + fstab + permissions + register). Cancel cleans up the temp mount.
**New files (4):** `internal/storage/attach.go`, `internal/storage/attach_linux.go`, `internal/storage/attach_other.go`, `internal/web/templates/storage_attach.html`
**Modified files (3):** `internal/web/storage_handlers.go` (6 new API handlers), `internal/web/server.go` (route + activeRawMount field), `internal/web/templates/settings.html` (button)
### What was just completed (2026-02-18 session 49)
- **v0.14.2 — Backup Bug Fixes (4 fixes from code review):**
**Bug 1 (HIGH):** rsync `--delete` was destroying `_db/` and `_config/` directories on every single-mount run. Fixed by adding `--exclude _*` to the rsync command in `runRsyncBackup()`. Controller-managed directories (underscore prefix) are now excluded from `--delete` cleanup. (`crossdrive.go`)
**Bug 2 (MEDIUM):** Scheduled backups (`RunBackup`, `RunDBDumps`) did not set `m.running`, so UI showed "not running" during nightly jobs and restore could overlap. Fixed by extracting `acquireRunning()` / `releaseRunning()` helpers and `runDBDumpsInternal()` / `runBackupInternal()` internal methods. All three public entry points now guard with the running flag; `RunFullBackup()` calls the internal methods directly to avoid deadlock. (`backup.go`)
**Bug 3 (MEDIUM):** `ValidateDestination` silently succeeded when `GetDiskUsage` returned nil (exotic filesystems, FUSE, NFS). Fixed by logging `[WARN]` and returning nil (backward-compatible). (`crossdrive.go`)
**Bug 4 (MEDIUM):** Empty `systemDataPath` produced relative dump paths. Fixed with: startup `[WARN]` in `NewManager()`, `[ERROR]` log in `GetAppDrivePath()`, and explicit guard in `DumpStackDB()` that returns an error when path is empty or non-absolute. (`backup.go`)
**Files modified (2):** `internal/backup/backup.go`, `internal/backup/crossdrive.go`
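The split between guarded public entry points and unguarded internal methods from Bug 2 can be sketched as follows. The `manager` type, the `log` field, and `errAlreadyRunning` are hypothetical; in this flag-based sketch a nested public call would be rejected as "already running", whereas with a blocking lock it would deadlock, which is the case the notes describe.

```go
package main

import (
	"errors"
	"sync"
)

// errAlreadyRunning is a placeholder error used only for illustration.
var errAlreadyRunning = errors.New("backup already running")

// manager is a stripped-down stand-in for the backup manager.
type manager struct {
	mu      sync.Mutex
	running bool
	log     []string // records which internal steps ran, for illustration
}

// acquireRunning sets the running flag if it is free.
func (m *manager) acquireRunning() bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.running {
		return false
	}
	m.running = true
	return true
}

// releaseRunning clears the running flag.
func (m *manager) releaseRunning() {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.running = false
}

// Internal methods do the work and never touch the running flag.
func (m *manager) runDBDumpsInternal() { m.log = append(m.log, "dbdump") }
func (m *manager) runBackupInternal()  { m.log = append(m.log, "backup") }

// RunBackup is a public entry point guarded by the running flag.
func (m *manager) RunBackup() error {
	if !m.acquireRunning() {
		return errAlreadyRunning
	}
	defer m.releaseRunning()
	m.runBackupInternal()
	return nil
}

// RunFullBackup acquires the flag once and calls the internal methods
// directly instead of the guarded public ones.
func (m *manager) RunFullBackup() error {
	if !m.acquireRunning() {
		return errAlreadyRunning
	}
	defer m.releaseRunning()
	m.runDBDumpsInternal()
	m.runBackupInternal()
	return nil
}
```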
### What was just completed (2026-02-18 session 48)
- **v0.13.1 — UI Polish Fixes Round 2 (4 fixes):**
**Fix 1:** Deploy page "Biztonsági mentés" section now has proper card border. Root cause: `.deploy-cross-drive` used undefined CSS variables `--card-bg` and `--border` (only `--bg-secondary` and `--border-color` exist). Fixed by using correct vars (`style.css`).
**Fix 2:** Auto-generated env values section cleaned up (`deploy.html`, `style.css`). Badge moved inline with label. "Másolás" buttons removed (native select+copy sufficient). Secret fields keep show/hide toggle. Non-secret fields now plain readonly input without button wrapper. Removed `copyAutoField()` JS. CSS updated: `.form-group-auto` now block layout (was flex row), label uses `display: flex; gap: .5rem`, badge downsized to `0.75rem / normal weight`, readonly inputs get muted background.
**Fix 3:** Snapshot table n/a → 0 (`backups.html`). Replaced `<span class="col-na" title="...">n/a</span>` with plain `0` in all three stats columns. Removed `.col-na` CSS class (no longer used).
**Fix 4:** Disk warnings moved from top banner to inline under storage bars (`alerts.go`, `layout.html`, `handlers.go`, `dashboard.html`, `monitoring.html`, `style.css`). Added `Inline bool` field to `Alert` struct. Disk-related warnings set `Inline: true`. Layout banner skips inline alerts. New `GetInlineAlerts(page)` method on `AlertManager`. Dashboard and monitoring handlers pass `DiskWarnings`. Inline warning block rendered below storage bars. New `.inline-warning*` CSS classes (compact, subtle, colored).
**Files modified (8):** `alerts.go`, `handlers.go`, `templates/style.css`, `templates/dashboard.html`, `templates/backups.html`, `templates/deploy.html`, `templates/monitoring.html`, `templates/layout.html`
### What was just completed (2026-02-18 session 47)
- **v0.13.0 — UI Polish Fixes (8 independent fixes):**
**Fix 1:** backup-status-card border already correct (verified same styling as system-info-card).
**Fix 2:** Deploy page auto-generated fields now show actual values for deployed apps (`deploy.html`, `handlers.go`). Secrets show as password fields with show/hide toggle; domain/plain values show as readonly text with copy button. JS helpers `toggleAutoField()` / `copyAutoField()` added.
**Fix 3:** Temperature display made more prominent (`dashboard.html`, `style.css`). Dot enlarged to 11px; value wrapped in colored pill badge (`.temp-value-pill` / `.temp-pill-{green|yellow|red}`).
**Fix 4:** Dashboard backup card reworked (`dashboard.html`, `handlers.go`). Removed "Mentés most" button and `triggerBackup()` JS. Removed "Tároló méret" line. Added Tier 2 status line (configured/total apps) + warning row for failed cross-drive backups. Handler now computes `CrossDriveTotal`, `CrossDriveConfigured`, `CrossDriveFailed`.
**Fix 5:** HDD warning banner scoped to dashboard + monitoring pages only (`alerts.go`, `layout.html`, `funcmap.go`). Added `PageOnly []string` field to `Alert` struct. Disk-related warnings (keywords "meghajtón", "adattároló") get stable ID `"disk-not-separate"` + `PageOnly: ["dashboard", "monitoring"]`. `pageMatch()` template function added. Layout renders alerts conditionally.
**Fix 6:** Tárhely section moved up in Rendszermonitor — now appears right after "Rendszer áttekintés", before "Távoli monitoring" (`monitoring.html`).
**Fix 7:** Snapshot table improvements (`backups.html`, `style.css`). "MÉRET" renamed to "HOZZÁADOTT (új adat)". The placeholder shown for unavailable data was replaced with `n/a` (with tooltip explaining restic limitations). New `.col-subtitle` and `.col-na` CSS classes.
**Fix 8:** Tároló section restructured into tiers (`backups.html`, `handlers.go`, `style.css`). Tier 1 (restic local), Tier 2 (cross-drive, only shown if configured), DB dump directory + total size. Removed "Távoli másolat: Nincs beállítva" placeholder. Handler passes `DBDumpDir`, `DBDumpTotalBytes`, `Tier2Dests` (deduplicated). New `.repo-tier` / `.repo-tier-title` CSS.
**Files modified (9):** `alerts.go`, `funcmap.go`, `handlers.go`, `templates/style.css`, `templates/dashboard.html`, `templates/backups.html`, `templates/deploy.html`, `templates/monitoring.html`, `templates/layout.html`
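The page scoping from Fix 5 boils down to a membership test on `PageOnly`. A possible shape is sketched below; the `pageMatch` signature, the trimmed-down `Alert` struct, and the rule that an empty `PageOnly` means "show everywhere" are assumptions, as is the `visibleAlerts` helper.

```go
package main

// Alert is a trimmed-down stand-in for the alert struct.
type Alert struct {
	ID       string
	Message  string
	PageOnly []string // assumed: empty means the alert is shown on every page
}

// pageMatch reports whether an alert restricted to pageOnly should be
// rendered on the given page.
func pageMatch(pageOnly []string, page string) bool {
	if len(pageOnly) == 0 {
		return true
	}
	for _, p := range pageOnly {
		if p == page {
			return true
		}
	}
	return false
}

// visibleAlerts filters alerts down to those that apply to page.
func visibleAlerts(alerts []Alert, page string) []Alert {
	var out []Alert
	for _, a := range alerts {
		if pageMatch(a.PageOnly, page) {
			out = append(out, a)
		}
	}
	return out
}
```

A function with this signature could be registered in a `template.FuncMap` and called from the layout template, which matches how the notes describe `pageMatch()` being used.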
### What was just completed (2026-02-18 session 46)
- **v0.12.9 — Tier 2 for All Apps + Status Dot Update:**
**Fix 1: Tier 2 now configurable for ALL apps — not just HDD apps (`crossdrive.go`)**
- Removed `len(mounts) == 0` error gate from `RunAppBackup()` — empty mounts = config-only backup
- rsync: DB dump copy (`_db/`) + config rsync (`_config/`) still runs even with zero HDD mounts
- restic: config dir + DB dump dir still appended even without mount paths
- Non-HDD apps (Mealie, Gokapi, etc.) can now be protected against drive failure via Tier 2
**Fix 2: Status dot logic updated, HasHDDData gate removed (`handlers.go`)**
- `buildAppBackupRows()`: "auto" (gray) status removed — all apps start yellow ("Csak helyi mentés")
- Green requires Tier 2 configured + last status "ok" (not just "configured but never run")
- Tier2 section is now unconditional — no `if app.HasHDDData` gate
- Cross-drive summary loop: removed `if !app.HasHDDData { continue }` — all apps in summary
**Fix 3: Backup page template updates (`backups.html`)**
- Tier 2 row shown for all apps (removed `{{if .HasHDDData}}` gate)
- Meta badge: non-HDD apps show "Konfig" or "Konfig + DB" instead of "Auto"
- Tier 3 placeholder row added (grayed out "Hamarosan / távoli offsite")
- Button text: "Összes HDD mentés" → "Összes 2. mentés futtatása most"
**Fix 4: Deploy page cross-drive section visible for all deployed apps (`deploy.html`)**
- Removed `{{if .StorageInfo}}` double-gate — section now shows for all deployed apps
- Updated heading: "Másolat másik meghajtóra (felhasználói adatok)" → "2. mentés — másolat másik meghajtóra"
- Updated hint: "mint az alkalmazás adattárolója" → "a meghibásodás elleni védelem érdekében"
**Files modified (4):** `internal/backup/crossdrive.go`, `internal/web/handlers.go`, `internal/web/templates/backups.html`, `internal/web/templates/deploy.html`
### What was just completed (2026-02-18 session 45)
- **v0.12.8 — Complete Cross-Drive Backup + Per-Tier UI:**
**Fix 1: Cross-drive backup now includes DB dumps + app config (`crossdrive.go`, `main.go`)**
- `CrossDriveRunner` gets `dbDumpDir` field + `SetDBDumpDir(dir string)` setter
- `copyStackDBDumps()` helper copies `<stackName>_*.sql` files to `_db/` subfolder in rsync dest
- `runRsyncBackup()`: after HDD mount rsync loop, copies DB dumps to `_db/` and rsyncs config dir to `_config/` — both non-fatal on error
- `runResticBackup()`: appends config dir and full DB dump dir to restic paths (restic deduplicates)
- rsync destination layout: `backups/rsync/<app>/_db/` (dumps) + `_config/` (compose+yaml) + user data
- `main.go`: `crossDriveRunner.SetDBDumpDir(cfg.Paths.DBDumpDir)` wired after runner init
**Fix 2: UI restructured from per-layer to per-tier (`handlers.go`, `backups.html`, `style.css`)**
- `AppBackupRow` struct rebuilt: dropped old `DBLastRun/Status`, `VolumeLastRun/Status`, `HasUserData`, `UserDataConfigured/Method/Dest/Schedule/LastRun/LastStatus/LastError/StatusBadge` fields
- New fields: `BackupContents` (e.g., "DB + Konfig + Adatok"), `Tier1LastRun/LastStatus/DBStatus`, `Tier2Configured/Method/MethodLabel/Dest/Schedule/LastRun/LastStatus/LastError/StatusBadge/SizeHuman/Browsable`
- `buildAppBackupRows()` rewritten: destination health now via `s.crossDriveRunner.ValidateDestination()` instead of `system.CheckBackupDestination()`
- `backups.html`: two tier rows (1. mentés / 2. mentés) replace the old three layer rows (DB / Konfig / Userdata)
- `style.css`: added `.tier-label`, `.tier-location`, `.tier-contents`, `.tier-size`, `.tier-browsable` classes
**Fix 3: Cleanup (`router.go`)**
- `filterSnapshotsByPaths()` and `pathCovers()` deleted (were unused since v0.12.7a)
**Files modified (6):** `internal/backup/crossdrive.go`, `cmd/controller/main.go`, `internal/web/handlers.go`, `internal/web/templates/backups.html`, `internal/web/templates/style.css`, `internal/api/router.go`
### What was just completed (2026-02-18 session 44)
- **v0.12.7a — Post-deploy fixes:**
**Fix A: Restore now shows snapshots for all apps (`internal/api/router.go`)**
- Root cause: `filterSnapshotsByPaths` filtered older snapshots (pre-v0.12.7) by HDD paths. Older snapshots don't contain HDD paths (backup wasn't mandatory yet), so Immich got zero snapshots.
- Fix: removed HDD path filtering entirely from `backupSnapshots`. All snapshots contain config + DB dumps and are useful for any app. `RestoreApp` extracts whatever paths are available from the chosen snapshot.
- `filterSnapshotsByPaths` and `pathCovers` functions kept (unused, no compile error).
**Fix B: Clarified "no cross-drive" warning (`internal/web/handlers.go`, `backups.html`, `style.css`)**
- Root cause: "Nincs beállítva" / red dot implied no backup at all — misleading since nightly restic now always covers HDD data.
- `handlers.go`: status `"red"` → `"yellow"`, StatusText → `"Nincs második másolat (csak helyi mentés)"`
- `backups.html`: added `✓ Helyi mentés auto` badge before the `⚠ Nincs 2. másolat` warning
- `style.css`: `.layer-auto-ok` class added (green text for the auto badge)
**Files modified (4):** `internal/api/router.go`, `internal/web/handlers.go`, `internal/web/templates/backups.html`, `internal/web/templates/style.css`
### What was just completed (2026-02-18 session 43)
- **v0.12.7 — Backup Architecture Overhaul (mandatory HDD backup, pre-dump, restore for all apps):**
**Fix 1: HDD data backup now mandatory (`backup.go`, `appdata.go`, `settings.go`)**
- `resolveAppBackupPaths()` rewritten to iterate ALL deployed stacks via `ListDeployedStacks()`; it no longer reads `GetAppBackupMap()` or checks the `Enabled` flag
- `DiscoverAppData()` signature simplified: dropped `backupPrefs map[string]bool` parameter; `BackupEnabled` is now derived from `HasHDDData` (if app has HDD data, it's always backed up)
- `RefreshCache()` updated to call new `DiscoverAppData(m.stackProvider, status.DiscoveredDBs)` signature
- 5 dead settings methods deleted: `IsAppBackupEnabled`, `SetAppBackup`, `GetAppBackupMap`, `SetAppBackupBulk`, `GetAppBackupPrefs`; the `AppBackupPrefs.Enabled` field is kept in the struct for backward-compatible JSON loading
**Fix 2: Cross-drive backup triggers fresh DB dump first (`crossdrive.go`, `backup.go`, `main.go`)**
- New `DBDumper` interface with `DumpStackDB(ctx, stackName)` in `crossdrive.go`
- `CrossDriveRunner` gets `dbDumper` field + `SetDBDumper(d DBDumper)` setter
- `Manager.DumpStackDB()` discovers containers for that stack via `DiscoverDatabases()`, runs `DumpAll()`, persists validation cache — same logic as nightly dump but scoped to one stack
- `RunAppBackup()` calls `DumpStackDB()` before `ValidateDestination()` — non-fatal on failure (logs warn, proceeds with user data)
- `main.go` wires `crossDriveRunner.SetDBDumper(backupMgr)` after both are initialized
**Fix 3: Restore dropdown shows ALL deployed apps (`backups.html`, `restore.go`, `router.go`)**
- `restore.go` rewritten: no `IsAppBackupEnabled()` check; resolves `GetStackComposePath` + `DBDumpDir` + HDD mounts; always restores config+DB, adds user data if `hasHDD`; logs restore type (`config+DB` vs `full (config+DB+userdata)`)
- Restore dropdown template: removed `{{if and .HasHDDData .BackupEnabled}}` filter; every app gets an `<option>` with `data-has-hdd` and `data-has-db` attributes
- New `#restore-type-info` div added between snapshot selector and warnings
- `onRestoreAppChange()` JS updated: reads `data-has-hdd`/`data-has-db` from selected option, shows Hungarian restore type banner (full / config+DB / config only) with color-coded styling
- `router.go` `backupSnapshots`: added clarifying comment for non-HDD apps (no filter = all snapshots returned)
**Fix 4: Honest UI label (`backups.html`)**
- "Docker kötetek" renamed to "Konfiguráció" — Docker named volumes at `/var/lib/docker/volumes/` are NOT in the restic backup paths; what's actually backed up is compose files + app.yaml + .felhom.yml
**CSS: `.restore-info` and `.restore-info-partial` classes added to `style.css`**
**Files modified (9):** `internal/backup/backup.go`, `internal/backup/appdata.go`, `internal/settings/settings.go`, `internal/backup/crossdrive.go`, `internal/backup/restore.go`, `cmd/controller/main.go`, `internal/web/templates/backups.html`, `internal/web/templates/style.css`, `internal/api/router.go`
### What was just completed (2026-02-18 session 42)
- **v0.12.6 — Cross-Drive Backup Rsync Fixes:**
**Context:** After fixing mount-point validation and system-drive thresholds (v0.12.5), testing revealed two more rsync issues for Immich.
**Fix 3: Simplified rsync destination path structure (`internal/backup/crossdrive.go` `runRsyncBackup`)**
- Old logic stripped only the first 2 path segments and kept the rest as a subpath, producing redundant nesting: `backups/rsync/immich/storage/immich/<data>` instead of `backups/rsync/immich/<data>`
- New logic: if app has a single mount, rsync directly into the stack folder (`backups/rsync/immich/`); if multiple mounts, use each mount's leaf directory name as subfolder
- Duplicate leaf names disambiguated by appending `_N` index suffix
- Loop variable changed from `_, srcMount` to `i, srcMount` to support the index-based disambiguation
- Old nested `storage/immich/` folder will remain orphaned after first run (no data loss; `--delete` only affects the target subtree)
**Fix 4: Exclude app-internal DB dump files from rsync (`internal/backup/crossdrive.go` `runRsyncBackup`)**
- Apps like Immich store their own periodic DB dumps in `<data>/backups/*.sql.gz` (~16 MB/day)
- The controller already handles DB backups via `pg_dump` separately — copying these again via rsync is redundant and wastes space
- Added `--exclude backups/*.sql.gz`, `--exclude backups/*.sql`, `--exclude backups/*.dump` to rsync command
- The `backups/` directory itself and non-dump files within it are preserved
**Files modified (1):** `internal/backup/crossdrive.go`
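The destination layout from Fix 3 can be approximated like this. It is a sketch under assumptions: `rsyncDestDirs` is a hypothetical name, and the exact suffix rule (only later duplicates get `_<index>`) is a guess at how the `_N` disambiguation behaves.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// rsyncDestDirs returns one destination directory per source mount.
// A single mount maps straight onto the stack folder; several mounts
// each get a subfolder named after the mount's leaf directory.
func rsyncDestDirs(stackRoot string, mounts []string) []string {
	if len(mounts) == 1 {
		return []string{stackRoot}
	}
	seen := make(map[string]bool, len(mounts))
	dests := make([]string, 0, len(mounts))
	for i, srcMount := range mounts {
		leaf := filepath.Base(filepath.Clean(srcMount))
		if seen[leaf] {
			// Assumed rule: a repeated leaf name gets the loop index appended.
			leaf = fmt.Sprintf("%s_%d", leaf, i)
		}
		seen[leaf] = true
		dests = append(dests, filepath.Join(stackRoot, leaf))
	}
	return dests
}
```

For example, `rsyncDestDirs("backups/rsync/immich", []string{"/mnt/hdd_1/storage/immich"})` yields just the stack folder, which removes the redundant `storage/immich/` nesting described above.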
### What was just completed (2026-02-18 session 41)
- **v0.12.5 — Cross-Drive Backup Validation Fix:**
**Root cause:** Immich cross-drive backup failed with `destination /mnt/hdd_placeholder is not a mount point` because `ValidateDestination()` hard-blocked non-mount-point destinations. The `/mnt/hdd_placeholder` folder is on the internal SSD (not a separate mount), so the device-ID check returned false.
**Fix 1: Drive-type-aware space checks in `ValidateDestination` (`internal/backup/crossdrive.go`)**
- `onSystemDrive` flag replaces the previous boolean-only mount-point check
- System-drive destinations: require **≥10 GB free** and **<90% usage** to protect OS stability
- External-drive destinations: require **≥100 MB free** (original threshold)
- Updated function comment to reflect the new tiered logic
**Fix 2: Aligned `CheckBackupDestination` UI thresholds for system drives (`internal/system/mounts_linux.go`)**
- Tier 4 disk checks now branch on `h.SystemDrive` flag (set in Tier 3)
- System drive: block at <10 GB free OR ≥90% used (matches runner enforcement); Hungarian warning messages
- External drive: warn at ≥90% used, block at ≥95% used (unchanged)
- Removed the `&& h.Severity == "ok"` guard that prevented system-drive warnings from being overridden properly
**Files modified (2):** `internal/backup/crossdrive.go`, `internal/system/mounts_linux.go`
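The tiered thresholds from Fix 1 amount to a small decision function. The sketch below is illustrative only: `checkDestinationSpace` is a hypothetical name, and it assumes binary units (GiB/MiB) for the "10 GB" and "100 MB" limits.

```go
package main

import "fmt"

const (
	mib uint64 = 1 << 20
	gib uint64 = 1 << 30
)

// checkDestinationSpace applies stricter limits when the backup destination
// lives on the system drive, so a backup cannot fill the OS disk.
func checkDestinationSpace(onSystemDrive bool, freeBytes, totalBytes uint64) error {
	if totalBytes == 0 || freeBytes > totalBytes {
		return fmt.Errorf("invalid disk figures: free=%d total=%d", freeBytes, totalBytes)
	}
	if !onSystemDrive {
		// External drive: original 100 MB threshold.
		if freeBytes < 100*mib {
			return fmt.Errorf("external drive has less than 100 MB free")
		}
		return nil
	}
	// System drive: at least 10 GB free and below 90% usage.
	if freeBytes < 10*gib {
		return fmt.Errorf("system drive has less than 10 GB free")
	}
	usedPct := float64(totalBytes-freeBytes) / float64(totalBytes) * 100
	if usedPct >= 90 {
		return fmt.Errorf("system drive is %.0f%% full (limit 90%%)", usedPct)
	}
	return nil
}
```

Both system-drive conditions matter: a 200 GiB disk with 12 GiB free passes the free-space check but is 94% used, so it is still rejected.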
### What was just completed (2026-02-18 session 40)
- **v0.12.4 — Correctness & Robustness Bug Fixes (TASK.md — 15 bugs fixed):**
**CRITICAL fixes (data loss, panics):**
- **C1: `SetAppBackupBulk` data loss + nil map panic** — Fixed: now updates map IN PLACE instead of replacing it, so stacks absent from the input are preserved. Added nil guard for `s.AppBackup`. (`internal/settings/settings.go`)
- **C2: `UpdateStackConfig` nil Env map panic** — Added nil check `if appCfg.Env == nil { appCfg.Env = make(...) }` before the field assignment loop. (`internal/stacks/deploy.go`)
- **C3: `ValidateDump` missing scanner.Err() check** — Added `if err := scanner.Err()` check after the scan loop so I/O errors don't silently mark a partial dump as valid. (`internal/backup/dbdump.go`)
**HIGH fixes (logic errors, resource leaks):**
- **H1: `nextDailyRun` DST bug** — Replaced `next.Add(24 * time.Hour)` with `time.Date(day+1, ...)` for correct scheduling across Europe/Budapest DST transitions. (`internal/scheduler/scheduler.go`)
- **H2: `nextDailyRun` repeated `LoadLocation`** — Cached timezone in package-level `sync.Once` variable; `getBudapestLocation()` now loaded only once. (`internal/scheduler/scheduler.go`)
- **H3: `settings.save()` .tmp file leak** — Added `os.Remove(tmpPath)` cleanup on `WriteFile` failure path. (`internal/settings/settings.go`)
- **H4: `SetNotificationPrefs` nil pointer panic** — Added nil guard at start of function, returns error instead of panicking. (`internal/settings/settings.go`)
- **H5: `appDirSize` ignores `Sscanf` return value** — Now checks `n != 1` and returns `(0, "?")` on parse failure. Same fix applied to `getDirSizeBytes` in `stacks/delete.go`. (`internal/backup/appdata.go`, `internal/stacks/delete.go`)
- **H6: `getDirSizeBytes` no timeout** — Added `exec.CommandContext` with 30s timeout. Added `"context"` import. (`internal/stacks/delete.go`)
- **H7: `dbdump.go` tmpFile not using `defer Close`** — Replaced explicit `tmpFile.Close()` call with `defer tmpFile.Close()` so the file handle is released even on panic. (`internal/backup/dbdump.go`)
- **H8: `UpdateCrossDriveStatus` misleading comment** — Updated comment to accurately describe the "does nothing if nil" behavior instead of claiming it "creates one if nil". (`internal/settings/settings.go`)
**MEDIUM fixes (code quality, edge cases):**
- **M1: Custom `contains`/`containsBytes` replaced** — Removed bespoke `containsBytes` and simplified `contains` to delegate to `strings.Contains`. Added `"strings"` import. (`internal/notify/notifier.go`)
- **M2: `scheduler.Every()` doesn't validate interval** — Added early return with error log if `interval <= 0` to prevent panic in `time.NewTicker`. (`internal/scheduler/scheduler.go`)
- **M3: `executeJob` panic recovery missing `LastRun`** — Panic recovery defer now also sets `job.LastRun = time.Now()` so the job status shows a timestamp after a panic. (`internal/scheduler/scheduler.go`)
- **M4: `logPostStartStatus` goroutine captures env by reference** — Copies the env slice before launching the goroutine (`envCopy`). (`internal/stacks/manager.go`)
- **M5: Multiple `time.LoadLocation` calls in web package** — Added package-level `getTimezone()` with `sync.Once` in `funcmap.go`. Replaced all `time.LoadLocation("Europe/Budapest")` calls in the web package with `getTimezone()`. (`internal/web/funcmap.go`, `internal/web/handlers.go`)
**Files modified (10):** `internal/settings/settings.go`, `internal/stacks/deploy.go`, `internal/backup/dbdump.go`, `internal/scheduler/scheduler.go`, `internal/backup/appdata.go`, `internal/stacks/delete.go`, `internal/stacks/manager.go`, `internal/notify/notifier.go`, `internal/web/funcmap.go`, `internal/web/handlers.go`
### What was just completed (2026-02-17 session 39)
- **v0.12.3 — Security & Correctness Bug Fixes (TASK.md — 33 bugs fixed):**
**CRITICAL fixes (data races, security vulnerabilities):**
- **C1: Data race in RefreshCache** — Moved `m.lastDBDump.Results` mutation inside `m.mu.Lock()`. Was previously mutating shared state without the lock, causing potential torn writes visible to `GetFullStatus()` goroutines. (`internal/backup/backup.go`)
- **C2: SnapshotHistory reversed after unlock** — Moved snapshot reversal loop before `m.cachedStatus = status` (inside the lock). Previously reversed after `Unlock()`, so `m.cachedStatus.SnapshotHistory` was reversed without protection. (`internal/backup/backup.go`)
- **C3: SetStackProvider write without lock** — `m.stackProvider = provider` now wrapped in `m.mu.Lock()`. Read by `resolveAppBackupPaths()` concurrently. (`internal/backup/backup.go`)
- **C4: GetFullStatus shallow-copies mutable pointers** — `LastDBDump` and `LastBackup` are now deep-copied (struct + Results slice) so callers cannot mutate shared manager state. (`internal/backup/backup.go`)
- **C5: IsSystemDisk 8-bit major mask** — Replaced `>> 8 & 0xff` with `unix.Major()`/`unix.Minor()` (12-bit extraction). Also compares disk-portion of minor (groups of 16) to correctly distinguish physical disks of the same type. Adds `golang.org/x/sys/unix` import. (`internal/storage/safety_linux.go`)
- **C6: No /dev/ prefix validation on DevicePath** — `FormatAndMount` now validates `DevicePath` starts with `/dev/` and does not contain `..` before any disk operations. (`internal/storage/format_linux.go`)
- **C7: Path traversal in extractName** — `extractName()` now rejects empty string, `.`, `..`, and names containing `/` or `\`. (`internal/api/router.go`)
- **C8: Path traversal in TargetPath** — Migration API validates `TargetPath` against registered storage paths from settings before starting migration job. (`internal/web/storage_handlers.go`)
- **C9: Path traversal in DestinationPath** — Cross-drive backup config API validates `DestinationPath` against registered storage paths when `enabled=true`. (`internal/api/router.go`)
- **C10: Path traversal in ParseComposeHDDMounts** — `filepath.Clean()` applied before prefix check; uses separator-aware check `cleanHDD + string(filepath.Separator)` to prevent `${HDD_PATH}/../../etc/passwd` escaping. (`internal/stacks/delete.go`)
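The separator-aware check described in C10 (and reused in H13 and M10 below) can be sketched as a single helper. `isUnder` is a hypothetical name; the actual functions are spread across several packages.

```go
package main

import (
	"path/filepath"
	"strings"
)

// isUnder reports whether candidate is root itself or lies inside it.
// Both paths are cleaned first so ".." segments cannot escape, and the
// prefix comparison includes the separator so "/mnt/hdd" does not match
// "/mnt/hdd_backup".
func isUnder(root, candidate string) bool {
	cleanRoot := filepath.Clean(root)
	cleanCand := filepath.Clean(candidate)
	if cleanCand == cleanRoot {
		return true
	}
	return strings.HasPrefix(cleanCand, cleanRoot+string(filepath.Separator))
}
```

Cleaning before comparing is the important ordering: `/mnt/hdd/../../etc/passwd` passes a raw prefix test but resolves to `/etc/passwd` once cleaned.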
**HIGH fixes (logic errors, resource leaks):**
- **H1: ValidateDump reads entire file into memory** — Replaced `os.ReadFile` with `bufio.Scanner` reading line-by-line. 256KB per-line buffer prevents OOM on large (500MB+) SQL dumps during 5-min cache refresh. (`internal/backup/dbdump.go`)
- **H2/H3: Double du invocation per mount + no timeout** — Replaced `appDirSizeHuman()`+`appDirSizeBytes()` with single `appDirSize()` function using `exec.CommandContext` with 30s timeout. Halves subprocess calls per mount point. (`internal/backup/appdata.go`)
- **H4: Snapshot validation only checks first 100** — Replaced `ListSnapshots(100)` existence check with regex validation (`^[0-9a-f]{8,64}$`). Allows restoring any snapshot; `restic restore` returns a clear error for non-existent IDs. (`internal/backup/restore.go`)
- **H5: No pruning for cross-drive restic repos** — Added `pruneResticRepo()` called after each successful cross-drive restic backup (`forget --keep-daily 7 --keep-weekly 4 --prune`). Non-fatal — logs warning on failure. (`internal/backup/crossdrive.go`)
- **H6: Temp password file management** — Reorganized temp file lifecycle: close before deferred remove, remove-on-write-error cleanup. (`internal/backup/crossdrive.go`)
- **H7: dirSizeBytes swallows walk errors** — `filepath.Walk` callback now returns errors instead of `nil`, propagating permission/IO issues. (`internal/backup/crossdrive.go`)
- **H8: Non-atomic fstab write** — `AppendFstabEntry` now reads existing fstab, writes to `.tmp`, then atomically renames. Crash-safe. (`internal/storage/safety_linux.go`)
- **H9: IsDeviceMounted naive prefix matching** — After prefix check, next character must be digit (`0-9`) or `p` (partition marker). Prevents `/dev/sdb` matching `/dev/sdba`. (`internal/storage/safety_linux.go`)
- **H10: eMMC device mapping bug** — `partitionToParentDisk` now handles `mmcblk0p1 → mmcblk0` and `nvme0n1p1 → nvme0n1` patterns. Uses `LastIndex("p")` with digit-suffix check before falling back to `TrimRight("0-9")`. (`internal/storage/scan_linux.go`)
- **H11: Data race on bytesCopied in rsync error path** — Error return path in `runRsync` now reads `bytesCopied` under mutex lock. (`internal/storage/migrate.go`)
- **H13: Path prefix match without separator** — Migration source path check now uses `srcPath == req.CurrentHDDPath || strings.HasPrefix(srcPath, req.CurrentHDDPath+"/")`. Prevents `/mnt/hdd` matching `/mnt/hdd_backup/data`. (`internal/storage/migrate.go`)
- **H14: DeleteStack continues after failed compose down** — `docker compose down` failure now returns an error immediately, preventing deletion of files while containers are still running. (`internal/stacks/delete.go`)
- **H16: exec.Command("docker") without timeout** — `syncFileBrowserMounts()` now uses `exec.CommandContext` with 60s timeout. (`internal/web/handlers.go`)
- **H17: SetNotificationPrefs stores caller's pointer** — Deep-copies `NotificationPrefs` struct and `EnabledEvents` slice before storing. (`internal/settings/settings.go`)
- **H18: wipefs error silently discarded** — wipefs failure logged as warning via progress channel; continues (wipefs may not be installed). (`internal/storage/format_linux.go`)
- **H19: Orphaned fstab entry on mount failure** — New `RemoveFstabEntry()` function atomically removes UUID entry. Called as rollback on `mount` failure and `findmnt` verify failure. (`internal/storage/safety_linux.go`, `format_linux.go`)
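The H9 rule can be illustrated with a short helper. This is a sketch of the matching rule only; `deviceMatches` is a hypothetical name, and how `IsDeviceMounted` obtains the list of mounted devices is not shown.

```go
package main

import "strings"

// deviceMatches reports whether mounted refers to disk itself or to one of
// its partitions. After the disk prefix, the next character must be a digit
// (sdb1) or 'p' (nvme0n1p1, mmcblk0p1); anything else is a different disk.
func deviceMatches(disk, mounted string) bool {
	if mounted == disk {
		return true
	}
	if !strings.HasPrefix(mounted, disk) {
		return false
	}
	next := mounted[len(disk)]
	return (next >= '0' && next <= '9') || next == 'p'
}
```

With this rule `/dev/sdb` matches `/dev/sdb1` but no longer matches `/dev/sdba`.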
**MEDIUM fixes (edge cases, code quality):**
- **M1: formatBytes duplicate in dbdump.go** — Removed `formatBytes()` from `dbdump.go`; all callers (backup.go, restic.go, dbdump.go) now use `humanizeBytes()` from appdata.go. (`internal/backup/dbdump.go`, `backup.go`, `restic.go`)
- **M2: Dead code .tmp suffix check** — Reordered filter in `ListDumpFiles`: `.tmp` check now comes before `.sql` check to correctly skip `.sql.tmp` temp files (was unreachable before). (`internal/backup/dbdump.go`)
- **M3: sizeBytes() returns 0 for string types** — Added `case string:` to `sizeBytes()` using `strconv.ParseUint`. (`internal/storage/scan_linux.go`)
- **M6: Dead elapsed variable** — Removed `_ = elapsed`; elapsed time now shown inline in the "done" progress message. (`internal/storage/migrate.go`)
- **M7: time.LoadLocation error silently discarded** — Two locations in handlers.go now handle `LoadLocation` error, falling back to `time.UTC`. (`internal/web/handlers.go`)
- **M10: filterSnapshotsByPaths imprecise prefix** — Added `pathCovers()` helper using separator-aware prefix check. Prevents `/mnt/hdd_1` matching `/mnt/hdd_10/data`. (`internal/api/router.go`)
- **M11: XSS in editStorageLabel innerHTML** — `cancelEditLabel()` in settings.html now uses DOM manipulation (`document.createElement`, `.textContent`) instead of `innerHTML` for the label text. (`internal/web/templates/settings.html`)
**Files modified (16):** `internal/backup/backup.go`, `internal/backup/appdata.go`, `internal/backup/dbdump.go`, `internal/backup/restore.go`, `internal/backup/crossdrive.go`, `internal/backup/restic.go`, `internal/storage/safety_linux.go`, `internal/storage/format_linux.go`, `internal/storage/scan_linux.go`, `internal/storage/migrate.go`, `internal/stacks/delete.go`, `internal/api/router.go`, `internal/web/handlers.go`, `internal/web/storage_handlers.go`, `internal/settings/settings.go`, `internal/web/templates/settings.html`
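The snapshot ID check from H4 is a plain format validation. A minimal sketch using the regex quoted above; the names `snapshotIDPattern` and `validSnapshotID` are hypothetical.

```go
package main

import "regexp"

// snapshotIDPattern accepts 8 to 64 lowercase hex characters, which covers
// both short and full restic snapshot IDs.
var snapshotIDPattern = regexp.MustCompile(`^[0-9a-f]{8,64}$`)

// validSnapshotID checks the format only. Whether the snapshot exists is
// left to restic, which reports unknown IDs itself.
func validSnapshotID(id string) bool {
	return snapshotIDPattern.MatchString(id)
}
```

Validating the shape instead of looking the ID up in a fixed-size listing removes the 100-snapshot ceiling while still keeping arbitrary strings out of the restic command line.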
### What was just completed (2026-02-17 session 38)
- **v0.12.2 — Restore Section Simplification (Bug 4 from v0.12.1 TASK.md):**
- **Feature: Snapshot filtering by app** — `GET /api/backup/snapshots?stack={name}` now filters snapshots to those whose `Paths` overlap with the app's HDD mount paths. Uses prefix matching (snapshot path is prefix of required, or vice versa). New `filterSnapshotsByPaths()` helper in `internal/api/router.go`. Manager gains `GetStackHDDMounts()` method to expose stackProvider's mount resolution.
- **Feature: Auto-stop/restart on restore** — `RestoreApp()` now stops the app's containers before running `restic restore` and restarts them after (even on failure). Avoids data corruption from live writes during restore. Eliminates the "Javasoljuk az alkalmazás leállítását" advisory from the UI.
- **Interface extension: StackDataProvider** — Added `StopStack(name string) error` and `StartStack(name string) error` to the `backup.StackDataProvider` interface in `internal/backup/appdata.go`. `stackAdapter` in `cmd/controller/main.go` wires these through to `stacks.Manager`.
- **UI simplification: Restore section** — Removed confusing "Visszaállítandó útvonalak" path list (technical detail not needed by customer). Snapshot dropdown now populated per-app (filtered) with human-friendly format: `2026-02-17 hétfő 03:00 (a3f2b1)`. Single calm warning replacing the triple-exclamation block. Empty filtered result shows inline message instead of empty dropdown. `data-paths` attribute removed from app dropdown options.
- **Files modified (6):** `internal/backup/appdata.go`, `internal/backup/backup.go`, `internal/backup/restore.go`, `internal/api/router.go`, `internal/web/templates/backups.html`, `cmd/controller/main.go`
### What was just completed (2026-02-17 session 37)
- **v0.12.0 — Backup Page Overhaul — Unified App Backup Status & Bug Fixes:**
- **Bug Fix 1: Duplicate unconfigured apps** — `GetFullStatus()` now returns a deep copy of the cached status. `CrossDriveSummary`, `UnconfiguredApps`, and `CrossDriveWarnings` slices are always nil in the returned copy so the handler builds them fresh on every page load. Previously the handler appended to the cached slices, causing 3× duplication on 3 page loads.
- **Bug Fix 2: Misleading "drive disconnected" error** — Replaced the binary `IsMountPoint || !IsWritable` check with tiered `CheckBackupDestination()` validation (new in `internal/system/mounts_linux.go` and stub in `mounts_other.go`). Tiers: path doesn't exist (critical/blocked), not writable (critical/blocked), same block device as `/` (warning/allowed with note about system drive), disk >95% full (critical/blocked), disk >90% (warning/allowed). `isSameBlockDevice()` replaces `IsMountPoint()` for source/dest same-device detection. Used in both `deployHandler()` and `backupsHandler()` for display, and in `crossdrive.go` logic via `CheckBackupDestination()`.
- **Bug Fix 3: Dead BackupEnabled toggle** — Removed `settingsAppBackupHandler()` from handlers.go and its `POST /settings/app-backup` route from server.go. The toggle wrote to settings.json but nothing read it to skip apps. UI nightly backup section in deploy.html now shows an informational note instead of the toggle.
- **Architecture: Unified per-app backup rows** — New `AppBackupRow` struct and `buildAppBackupRows()` in handlers.go. Replaces old "Alkalmazás adatok" + "Másolatok másik meghajtóra" sections with a single expandable row per app showing all 3 backup layers (DB, Docker volumes, user data). Status dot: green=fully covered, yellow=warning (failed run, system drive, disk full), red=HDD data without cross-drive configured, auto=no user data. Expandable JS toggle with ▶/▼ icon.
- **Architecture: Sequential backup chaining** — Removed independent `cross-drive-daily` (03:30) and `cross-drive-weekly` (04:30) scheduler jobs. Cross-drive backups now run immediately after the restic backup completes (daily jobs every night; weekly jobs on Sunday). This ensures DB dump → restic → cross-drive happen in the same window for file/DB consistency on restore.
- **Architecture: Deploy page schedule dropdown** — Removed "Csak kézi indítás" option (schedule="manual"). Two options remain: "Naponta (az éjszakai mentés után)" and "Hetente, vasárnap (az éjszakai mentés után)". Weekly option shows informational note about DB consistency implications. Existing "manual" configs treated as "weekly" in the dropdown.
- **CSS added:** `.app-backup-row`, `.app-backup-row-header`, `.app-backup-row-name`, `.app-backup-row-meta`, `.app-backup-row-detail`, `.status-dot` (green/yellow/red/auto), `.backup-layers`, `.backup-layer-row`, `.layer-label`, `.layer-badge`, `.layer-na`, `.layer-method`, `.layer-dest`, `.layer-schedule`, `.layer-last`, `.layer-unconfigured`, `.layer-actions`, `.layer-warnings`, `.backup-layer-warning`, `.btn-xs`, `.text-ok`, `.text-error`.
- **Files modified (9):** `internal/backup/backup.go`, `internal/system/mounts_linux.go`, `internal/system/mounts_other.go`, `internal/web/handlers.go`, `internal/web/server.go`, `internal/web/templates/backups.html`, `internal/web/templates/deploy.html`, `internal/web/templates/style.css`, `cmd/controller/main.go`
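The duplication bug from Bug Fix 1 comes from appending to slices that are shared with the cache. A rough sketch of the copy-on-read fix, with assumed types: the `Status` struct and its fields are reduced to what the example needs, and the mutex type is a guess.

```go
package main

import "sync"

// Status is a trimmed, hypothetical version of the cached backup status.
type Status struct {
	SnapshotHistory  []string
	UnconfiguredApps []string
}

// Manager holds the cached status behind a mutex.
type Manager struct {
	mu           sync.Mutex
	cachedStatus *Status
}

// GetFullStatus returns a copy the caller may freely modify. Slices the
// handler rebuilds on every page load are reset to nil, and the remaining
// slices are cloned so they do not alias the cache.
func (m *Manager) GetFullStatus() *Status {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.cachedStatus == nil {
		return nil
	}
	cp := *m.cachedStatus
	cp.SnapshotHistory = append([]string(nil), m.cachedStatus.SnapshotHistory...)
	cp.UnconfiguredApps = nil
	return &cp
}
```

With this in place, a handler that appends to `UnconfiguredApps` on each request always starts from an empty slice, so three page loads no longer produce three copies of each entry.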
### What was just completed (2026-02-17 session 36)
- **v0.11.9 — UI Polish Fixes for deploy/settings backup section:**
- **Fix 1: Spacing** — `.deploy-cross-drive` `margin-bottom` increased from `1rem` to `1.5rem` for consistent spacing before deploy form.
- **Fix 2: Tooltip on "Módszer"** — Renamed "Verziózott mentés (restic)" to "Titkosított mentés (restic)". Added info `(i)` tooltip explaining rsync vs restic tradeoffs.
- **Fix 3: Nightly backup indicator** — Replaced disabled checkbox (with confusing pointer cursor) with a non-interactive green/gray dot indicator.
- **Fix 4: Progressive disclosure** — Dest/method/schedule selects are disabled until "Engedélyezve" is checked. JS `toggleCrossDriveFields()` enables/disables them. Backend handler updated to preserve existing config when disabling (disabled fields not submitted).
- **Fix 5: Emoji cleanup** — Removed all emoji from `deploy.html` backup section (h4, warning, status, hint, stale data) and `backups.html` cross-drive summary (status badges, schedule badge, unconfigured warning). JS callbacks also cleaned up.
- **CSS added:** `.info-tooltip`, `.info-icon`, `.info-tooltip-text`, `.cross-drive-nightly-status`, `.nightly-status-indicator`, `.nightly-enabled`, `.nightly-disabled`, `.meta-badge-fail`.
- **Files modified (4):** `web/templates/deploy.html`, `web/templates/backups.html`, `web/templates/style.css`, `web/handlers.go`
### What was just completed (2026-02-17 session 35)
- **v0.11.8 — Per-App Cross-Drive Backup (3-2-1 rule, second copy on different media):**
- **Feature: CrossDriveBackup data model** — `AppBackupPrefs` extended with `CrossDrive *CrossDriveBackup` field in `settings.go`. New methods: `GetCrossDriveConfig`, `SetCrossDriveConfig`, `UpdateCrossDriveStatus`, `GetAllCrossDriveConfigs`, `GetOrCreateCrossDrivePassword`. Existing `SetAppBackup`/`SetAppBackupBulk` now preserve cross-drive config. Auto-generated restic password stored in `settings.json`.
- **Feature: CrossDriveRunner** — New `internal/backup/crossdrive.go`. Supports rsync (simple mirror with `--delete`) and restic (versioned, deduplicated, shared repo). Safety guards: destination ≠ source, mount point check, writable check, per-app concurrency lock. `RunAllScheduled(ctx, schedule)` iterates all apps matching the given schedule. Status (last_run, last_status, last_error, last_duration, last_size_human) persisted to settings.json after each run.
- **Feature: Scheduler jobs** — Two new daily jobs: `cross-drive-daily` at 03:30 (for apps with `schedule: daily`), `cross-drive-weekly` at 04:30 Sundays only (for `schedule: weekly`).
- **Feature: API endpoints** — 4 new routes: `POST /api/stacks/{name}/cross-backup`, `POST /api/stacks/{name}/cross-backup/run`, `GET /api/stacks/{name}/cross-backup/status`, `POST /api/backup/cross-drive/run-all`.
- **Feature: Deploy/Settings page UI** — New "Biztonsági mentés" card on the deploy page for apps with HDD data. Shows nightly backup toggle (read-only link), cross-drive dropdowns (destination, method, schedule), last run status, manual trigger button. States: no other storage (info message), configured, destination unreachable (warning). Flash messages on save redirect.
- **Feature: Backup page summary** — New "Másolatok másik meghajtóra" section showing all configured apps with method, destination, last status, size. Warns about unconfigured apps with HDD data. Destination health warnings. "Összes futtatása most" button.
- **CSS:** `margin-bottom: 1.5rem` added to `.deploy-stale-data`. New styles: `.deploy-cross-drive`, `.cross-drive-list`, `.cross-drive-item`, `.cross-drive-header`, `.cross-drive-meta`, `.cross-drive-actions`.
- **Files modified (10):** `settings/settings.go`, `backup/crossdrive.go` (new), `backup/backup.go`, `api/router.go`, `web/handlers.go`, `web/server.go`, `web/templates/deploy.html`, `web/templates/backups.html`, `web/templates/style.css`, `cmd/controller/main.go`
### What was just completed (2026-02-17 session 34)
- **v0.11.7 — Stale Data Cleanup + FileBrowser Sync + UI Title Fix:**
- **Feature: Stale data cleanup** — After app data migration, the deploy/settings page now shows leftover data on previous storage paths with size info and a delete button. Two-step confirmation required before deletion. Protected paths (storage root, media, Dokumentumok, appdata) cannot be deleted. Also available immediately after migration on the migration-done page.
- **Fix: FileBrowser sync after migration** — `syncFileBrowserMounts()` now called after successful data migration, ensuring FileBrowser mounts reflect the current storage layout.
- **Fix: Deploy page title** — Already-deployed apps now show "Beállítások" (Settings) instead of "Telepítés" (Deploy) in both the browser page title and the `<h2>` heading.
- **Internal: Exported `ProtectedHDDPaths()`** from stacks package for reuse in web handlers.
- **Files modified (6):** `internal/stacks/delete.go`, `internal/web/handlers.go`, `internal/web/storage_handlers.go`, `internal/web/templates/deploy.html`, `internal/web/templates/migrate.html`, `internal/web/templates/style.css`
### What was just completed (2026-02-17 session 33)
- **v0.11.6 — FileBrowser Auto-Mount Sync + UI Polish (3 fixes):**
- **Feature: FileBrowser auto-mount sync** — Added `syncFileBrowserMounts()` and `generateFileBrowserCompose()` to `handlers.go`. After a storage path is added (via storage init wizard) or removed, the controller regenerates `/opt/docker/stacks/filebrowser/docker-compose.yml` with volume mounts for all registered paths (`/mnt/hdd_1:/srv/hdd_1` etc.), then recreates the FileBrowser container. Domain is read from FileBrowser's `.env`. If FileBrowser isn't deployed, the function silently returns. The generated compose is self-contained (no env vars).
- **UI Fix 1: Badge color fix** — `settings.html`: changed "Nincs csatolva!" (red `state-red`) badge to "Rendszermeghajtón" (yellow `badge-warn`). The path is on the system SSD, which isn't an error — just informational. Added `.badge-warn { background: rgba(250, 204, 21, 0.15); color: #facc15; }` to `style.css`.
- **UI Fix 2: Progress bar fix** — `storage_init.html`: replaced the disk-usage gradient progress bar (green→yellow→red zones, alarming at 30%) with a clean single-color `progress-bar-task` bar. Added `.progress-bar-task` and `.progress-bar-task .progress-fill` CSS classes to `style.css`.
- **UI Fix 3: Button text fix** — `settings.html`: "Alapértelmezett" button (reads as status, confusing) → "Legyen alapértelmezett" (clear action verb).
- **Files modified (5):** `web/handlers.go`, `web/storage_handlers.go`, `web/templates/settings.html`, `web/templates/storage_init.html`, `web/templates/style.css`
### What was just completed (2026-02-17 session 32)
- **v0.11.4 — Bugfix: Storage Initialization (FormatAndMount) — 3 bugs + 4 safety improvements:**
- **Bug 1 (sfdisk):** Added `wipefs -a` before sfdisk; changed sfdisk input from `,,,L` (unsupported GPT type shorthand) to `,,` (default Linux GUID); added `--force --wipe always` flags. The previous partition table confused sfdisk, and the `L` type was not accepted for GPT.
- **Bug 2 (mount):** Replaced `mount mountPath` (fstab lookup — uses container's /etc/fstab, not host's) with explicit `mount -t ext4 -o defaults,noatime /host-dev/sdb1 /mnt/hdd_1`. fstab entry still written to `/host-fstab` for host reboot persistence.
- **Bug 3 (mount propagation):** Changed `/mnt` volume in compose to long-form bind with `propagation: rshared`. Also ran `mount --bind /mnt /mnt && mount --make-rshared /mnt` on demo host. Confirmed `Propagation=rshared` in `docker inspect`. Mounts created inside container now propagate to host.
- **Safety 1 (post-mount verification):** Added `findmnt` check after mount — fails with clear error if mount isn't actually visible.
- **Safety 2 (ASCII label):** Use `req.MountName` (always ASCII) for ext4 `-L` label (16-byte limit). Display label (`req.Label`, may contain UTF-8 Hungarian chars) stays only in settings.json.
- **Safety 3 (smart partition):** In `storageInitAPIHandler`, if disk has exactly 1 empty partition (no filesystem), skip wipefs+sfdisk entirely and format existing partition directly. Handles demo sdb case (sdb1 exists, no FS).
- **Safety 4 (progress messages):** Updated `send()` calls to include command details (device paths, flags) for remote debugging via UI progress panel.
- **Files modified (3):** `storage/format_linux.go`, `docker-compose.yml`, `web/storage_handlers.go`
### What was just completed (2026-02-17 session 31)
- **v0.11.3 — Bugfix: Missing sfdisk in container (fdisk package):**
- `sfdisk` is in the `fdisk` package on Debian bookworm, not `util-linux`. Dockerfile had `util-linux` but not `fdisk`, so `sfdisk` was missing and partitioning failed.
- Added `fdisk` to Dockerfile's `apt-get install` list. Updated comment to clarify which package provides what.
- Verified: all six disk tools now present in container (`sfdisk`, `mkfs.ext4`, `blkid`, `mount`, `lsblk`, `partprobe`).
- **Files modified (1):** `Dockerfile`
### What was just completed (2026-02-17 session 30)
- **v0.11.2 — Bugfix: /dev/sdb not accessible inside container:**
- **Root cause:** Docker always creates a fresh tmpfs at `/dev` inside containers. Even with `privileged: true`, the bind mount `- /dev:/dev` is silently dropped. Block device nodes like `/dev/sdb` don't exist inside the container.
- **Fix:** Mount host `/dev` at `/host-dev` instead. With `privileged: true`, the kernel allows I/O to the device nodes regardless of path inside the container.
- **docker-compose.yml:** Changed `- /dev:/dev` → `- /dev:/host-dev:rw`. Also applied the missing `privileged: true`, `/etc/fstab:/host-fstab`, and `/run/udev:/run/udev:ro` settings to the demo node's live compose file (never applied after v0.11.0).
- **safety.go:** Added `HostDevPath = "/host-dev"` constant and `HostDevicePath(devPath) string` helper (`/dev/sdb` → `/host-dev/sdb`).
- **format_linux.go:** All device operations (os.Stat, sfdisk, partprobe, mkfs.ext4, blkid UUID) use `HostDevicePath()`.
- **safety_linux.go:** `IsSystemDisk()` stats device via `HostDevicePath()`.
- **scan_linux.go:** `enrichWithBlkid()` probes each partition individually (`blkid -o value -s TYPE/UUID/LABEL /host-dev/sdXN`) instead of batch `blkid -o export` (which fails when `/dev` is Docker's minimal tmpfs).
- **Verified:** `/host-dev/sda`, `/host-dev/sdb`, partitions visible; `blkid /host-dev/sdb1` returns correct UUID/fstype/label.
- **Files modified (5):** `storage/safety.go`, `storage/safety_linux.go`, `storage/format_linux.go`, `storage/scan_linux.go`, `docker-compose.yml`
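The path rewrite described for `safety.go` can be sketched as follows. This is a minimal illustration, not the actual controller code: the constant and function names come from the entry above, while the behavior for paths outside `/dev` (returned unchanged) is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// HostDevPath is where the host's /dev is bind-mounted inside the container.
const HostDevPath = "/host-dev"

// HostDevicePath maps a host device path such as /dev/sdb to its location
// under /host-dev. Paths that are not under /dev are returned unchanged
// (an assumption made for this sketch).
func HostDevicePath(devPath string) string {
	if devPath == "/dev" {
		return HostDevPath
	}
	if strings.HasPrefix(devPath, "/dev/") {
		return HostDevPath + "/" + strings.TrimPrefix(devPath, "/dev/")
	}
	return devPath
}

func main() {
	fmt.Println(HostDevicePath("/dev/sdb"))  // /host-dev/sdb
	fmt.Println(HostDevicePath("/dev/sdb1")) // /host-dev/sdb1
}
```

Keeping the rewrite in one helper means callers such as `sfdisk`, `mkfs.ext4`, and `blkid` wrappers never build `/host-dev` paths by hand.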
### What was just completed (2026-02-17 session 29)
- **v0.11.1 — Bugfix: Storage Scan — System Disk Detection & FSType in Container:**
- **Bug 1 fix: System disk detection** — Replaced mount-point string comparison (`== "/"`, `"/boot"`, `"/boot/efi"`) with host fstab parsing. Inside the container, `lsblk` reports container mount points (e.g. `/opt/docker/felhom-controller/data`), not host mount points. New `getSystemDiskNames()` reads `/host-fstab` (fallback: `/etc/fstab`), finds system entries (`/`, `/boot`, `/boot/efi`, `swap`), resolves `UUID=` entries to device paths via `blkid -U`, and marks parent disks as system. `partitionToParentDisk()` handles both standard (`sda2→sda`) and NVMe (`nvme0n1p2→nvme0n1`) naming.
- **Bug 2 fix: FSType enrichment** — `lsblk` returns null fstype in containers (udev/blkid cache incomplete). New `enrichWithBlkid()` runs `blkid -o export` after lsblk scan and fills in missing `FSType`, `UUID`, `Label` per partition from direct device probing. Runs on both `AvailableDisks` and `SystemDisks`.
- **Result:** sda (system SSD) now correctly appears in SystemDisks; sdb (USB HDD) appears in AvailableDisks; partition fstypes (vfat/ext4/swap) correctly shown; sdb1 genuinely shows "(nincs fájlrendszer)".
- **Files modified (1):** `storage/scan_linux.go`
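One way to implement the partition-to-disk mapping mentioned above is with two regular expressions, one per naming scheme. This is a sketch only: the function name is taken from the entry, but the patterns and the fallback (return the input unchanged) are assumptions.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// NVMe partitions look like nvme0n1p2: disk name, then "p<number>".
	nvmePart = regexp.MustCompile(`^(nvme\d+n\d+)p\d+$`)
	// Standard partitions look like sda2: letters, then a partition number.
	stdPart = regexp.MustCompile(`^([a-z]+)\d+$`)
)

// partitionToParentDisk returns the whole-disk name for a partition name.
// Names that do not look like a partition are returned unchanged.
func partitionToParentDisk(name string) string {
	if m := nvmePart.FindStringSubmatch(name); m != nil {
		return m[1]
	}
	if m := stdPart.FindStringSubmatch(name); m != nil {
		return m[1]
	}
	return name
}

func main() {
	for _, n := range []string{"sda2", "nvme0n1p2", "sdb", "nvme0n1"} {
		fmt.Printf("%s -> %s\n", n, partitionToParentDisk(n))
	}
}
```

The NVMe pattern is checked first because an NVMe disk name itself ends in a digit, so a plain "strip trailing digits" rule would turn `nvme0n1` into `nvme0n`.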
### What was just completed (2026-02-17 session 28)
- **v0.11.0 — Phase C: Storage Init, Data Migration & Startup Fixes:**
- **Step 0: Startup ping + hub report** — Controller now fires heartbeat ping, system_health ping, and hub report immediately on startup (5s delay) instead of waiting for first scheduler tick (5-15 min). `hubPusher` instance created once and reused for both startup and periodic reports. Prevents Healthchecks showing stale "Last Ping: X ago" after restarts.
- **Step 1-3: Storage initialization wizard** — New `internal/storage/` package (`scan.go`, `format.go`, `safety.go`, `format_linux.go`, `safety_linux.go`, `scan_linux.go` + non-linux stubs). `ScanDisks()` via `lsblk -J`. `FormatAndMount()` with progress channel (partition via sfdisk → mkfs.ext4 → blkid UUID → fstab backup + UUID-based entry → mount → chown + subdirs). Safety guards: system disk detection via major device numbers, mount path conflict, confirmation "FORMÁZÁS" required. New wizard page at `/settings/storage/init`. JSON API endpoints at `/api/storage/scan`, `/api/storage/init`, `/api/storage/init/status`. Auto-registers storage path in settings.json after success.
- **Step 4-5: Data migration** — New `MigrateAppData()` in `internal/storage/migrate.go`. Per-app "Mozgatás" button on deploy page (for deployed apps with HDD data) and settings page storage app list. Migration flow: stop app → rsync with `--info=progress2` progress parsing → update `app.yaml` HDD_PATH → start app. Rollback on failure (revert config + restart with original path). Old data preserved. New migration page at `/stacks/{name}/migrate`. JSON API at `/api/storage/migrate`, `/api/storage/migrate/status`.
- **Step 6: Per-app storage display** — Deploy page (read-only mode) now shows "Adattárolás" section for deployed apps: current path + label, data size, free space. "Mozgatás" link shown when other storage paths exist.
- **Step 7: Container setup** — Added `privileged: true` to `docker-compose.yml`. New volume mounts: `/dev:/dev`, `/etc/fstab:/host-fstab`, `/run/udev:/run/udev:ro`. Docker socket changed from `:ro` to writable. `Dockerfile` adds: `util-linux`, `e2fsprogs`, `rsync`, `parted`.
- **Storage API routing** — New `/api/storage/` prefix registered in `main.go` before `/api/` catch-all (longer prefix takes priority in Go ServeMux). `ServeStorageAPI` method on web.Server handles all storage JSON endpoints.
- **CSS additions** — `.disk-step`, `.disk-step-active`, `.disk-step-done`, `.disk-progress-steps`, `.disk-progress-bar-wrap`, `.deploy-storage-info` styles.
- **Files created (13):** `storage/scan.go`, `storage/scan_linux.go`, `storage/scan_other.go`, `storage/safety.go`, `storage/safety_linux.go`, `storage/safety_other.go`, `storage/format.go`, `storage/format_linux.go`, `storage/format_other.go`, `storage/migrate.go`, `web/storage_handlers.go`, `templates/storage_init.html`, `templates/migrate.html`
- **Files modified (8):** `main.go`, `web/server.go`, `web/handlers.go`, `templates/settings.html`, `templates/deploy.html`, `templates/style.css`, `docker-compose.yml`, `Dockerfile`
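The routing note above relies on `http.ServeMux` preferring the longest matching pattern. The sketch below demonstrates that behavior with two placeholder handlers; the handler bodies and the `route` helper are illustrative and not part of the controller.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// newMux registers the generic /api/ catch-all first and the more specific
// /api/storage/ prefix second, to show that ServeMux picks the longest match.
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/api/", func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "generic")
	})
	mux.HandleFunc("/api/storage/", func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "storage")
	})
	return mux
}

// route sends a GET request through the mux and returns the response body.
func route(mux *http.ServeMux, path string) string {
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Body.String()
}

func main() {
	mux := newMux()
	fmt.Println(route(mux, "/api/storage/scan")) // storage
	fmt.Println(route(mux, "/api/stacks"))       // generic
}
```

Because `ServeMux` matches on pattern length rather than registration order, registering `/api/storage/` before the catch-all is a readability choice, not a requirement.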
### What was just completed (2026-02-17 session 27)
- **v0.10.0 — Phase B: Storage Management UI Polish & Health Severity Fix:**
- **Step 0: Health severity fix** — `checkStoragePaths()` mount-point check reclassified from **issue** (FAIL) to **warning** (WARN). All storage health messages translated to Hungarian. Added `.monitoring-banner-warn` CSS class for yellow warning banners. Prevents false FAIL status on demo/test environments where storage is intentionally on SSD.
- **Step 1: Success flash messages** — All 4 storage handlers (add/remove/set-default/toggle-schedulable) now redirect with `?storage_msg=success&storage_detail=...` query params. Settings page displays green "alert-info" flash on success. Consistent with backup page flash pattern.
- **Step 2: Edit storage path labels** — New `SetStorageLabel()` method in `settings.go`. New `POST /settings/storage/label` route + handler. Inline edit UI with ✏️ button, text input, OK/Cancel. Added `.btn-ghost` CSS class.
- **Step 3: App details per storage path** — Settings page now shows expandable `<details>` list per storage path with app names, sizes, and links to deploy page. New `StorageAppDetail` struct + `appDetailsForPath()` helper. Added CSS for `.storage-app-details`, `.storage-app-list`, `.storage-app-row`.
- **Step 4: Storage badge on stacks page** — Deployed app cards show "💾 Label" badge indicating which registered storage path the app uses. `StorageLabels` map built from deployed apps' HDD_PATH → registered storage path label lookup. Added `.meta-badge-storage` CSS.
- **Step 5: Deploy dropdown enhancements** — Storage path dropdown now shows free space ("234 GB szabad"). `DeployStoragePath` struct wraps `StoragePath` with `FreeHuman`/`FreePercent` from `GetDiskUsage()`. JS `checkStorageSpace()` shows yellow warning when selected storage has <20% free.
- **Step 6: Filesystem & disk info** — New `FSInfo` struct + `GetFSInfo()` in `mounts_linux.go` using `findmnt` command + `/sys/block/` sysfs reads for disk model. Settings page shows "ext4 · /dev/sdb1 · WD Elements" below disk usage bar. Non-Linux stub returns nil.
- **Step 7: Backup page storage context** — Added `StorageLabel` field to `AppBackupInfo`. Backup page shows storage label badge per app by matching HDD path prefixes against registered storage paths. Uses existing `.meta-badge-storage` CSS.
- **Files modified (12):** `healthcheck.go`, `settings.go`, `mounts_linux.go`, `mounts_other.go`, `appdata.go`, `handlers.go`, `server.go`, `settings.html`, `stacks.html`, `deploy.html`, `backups.html`, `style.css`
### What was previously completed (2026-02-17 session 26)
- **v0.9.0 — Phase A: Storage Paths Foundation & Backup Toggle Fix:**
- **Root cause:** Per-app backup toggles (v0.8.0) didn't appear because `controller.yaml` had no `paths.hdd_path` set → `ParseComposeHDDMounts` returned nil. Even with global hdd_path, apps with different HDD_PATH values wouldn't match.
- **Core fix: Per-app HDD_PATH resolution** — `stackAdapter.GetStackHDDMounts()` now reads each app's own `HDD_PATH` from its `app.yaml` env section (Priority 1), falling back to all registered storage paths (Priority 2). Removed dependency on global `cfg.Paths.HDDPath`.
- **Storage paths registry** (`settings.json`) — new `StoragePath` struct with Path, Label, IsDefault, Schedulable, AddedAt. Thread-safe CRUD methods in `settings.go` (Get/Add/Remove/SetDefault/SetSchedulable). Multiple external storage paths supported.
- **Auto-discovery** — On startup, `discoverHDDPaths()` scans deployed apps' `app.yaml` for `HDD_PATH` values. `AutoDiscoverStoragePaths()` registers discovered paths with inferred labels. Legacy `cfg.Paths.HDDPath` used as fallback.
- **Mount-point validation** — New `mounts_linux.go` (build-tagged): `IsMountPoint()` via `syscall.Stat_t.Dev` comparison, `IsWritable()`, `PathsOverlap()`, `GetDiskUsage()` via `syscall.Statfs`. Non-Linux stubs in `mounts_other.go`.
- **Settings page "Adattárolók" section** — Lists registered paths with label, path, disk usage bar, app count, badges (default/active/unmounted). Actions: set default, toggle schedulable, remove (with guards). Expandable "Új adattároló hozzáadása" form with 5-step validation (exists, mount point, writable, no overlap, no duplicate).
- **Deploy page storage dropdown** — `path` field type renders as `<select>` dropdown of schedulable storage paths. Falls back to text input with warning if no paths registered.
- **Health check storage monitoring** — `RunHealthCheck()` now accepts `storagePaths` parameter. Checks: path accessible (warning), not a mount point (issue — data writes to SSD!), disk usage ≥95% (issue) / ≥90% (warning).
- **Controller docker-compose.yml** — Changed HDD mount from `${HDD_PATH:-/mnt/hdd_placeholder}:...:ro` to `/mnt:/mnt:rw` for multi-storage support + restore capability.
- **Removed unused `hddPath` param** from `DiscoverAppData()` signature in backup/appdata.go.
- **Files created (2):** `system/mounts_linux.go`, `system/mounts_other.go`
- **Files modified (12):** `settings.go`, `main.go`, `appdata.go`, `backup.go`, `handlers.go`, `server.go`, `settings.html`, `deploy.html`, `style.css`, `healthcheck.go`, `docker-compose.yml`, `report/builder.go`
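The "no overlap" validation step can be illustrated with a pure path comparison. This is a hedged sketch: `PathsOverlap` is named in the entry above, but its real signature and semantics are not shown there, so the version below (two paths overlap if they are equal or one contains the other) is an assumption.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// isUnder reports whether child lies strictly inside parent.
// Both arguments must already be cleaned.
func isUnder(child, parent string) bool {
	prefix := strings.TrimSuffix(parent, "/") + "/"
	return child != parent && strings.HasPrefix(child, prefix)
}

// PathsOverlap reports whether a and b are the same directory or one is
// nested inside the other. Comparison is done on whole path components,
// so /mnt/hdd_1 and /mnt/hdd_10 do not overlap.
func PathsOverlap(a, b string) bool {
	a, b = path.Clean(a), path.Clean(b)
	return a == b || isUnder(a, b) || isUnder(b, a)
}

func main() {
	fmt.Println(PathsOverlap("/mnt/hdd_1", "/mnt/hdd_1/data")) // true
	fmt.Println(PathsOverlap("/mnt/hdd_1", "/mnt/hdd_10"))     // false
}
```

Comparing with a trailing separator avoids the classic prefix bug where sibling directories that share a name prefix are treated as nested.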
### What was previously completed (2026-02-16 session 25)
- **v0.8.0 — Phase 7: Storage Overview, Per-App Backup Toggles & Limited Restore:**
- **Storage overview on backup page** — new "Tárhely áttekintés" section as first section on backup page showing SSD/HDD progress bars + backup repo stats (repo size, dump file count, snapshot count). Reuses existing `system.GetInfo()` and `RepoStats`.
- **Restic password visibility** — new "Titkosítási kulcs" section inside the repository card. Masked password field with show/copy buttons (JS toggle). Password synced to hub via periodic report for disaster recovery (`ResticPassword` field added to `BackupReport`).
- **App data discovery** — new `internal/backup/appdata.go`:
- `StackDataProvider` interface to avoid circular imports between backup and stacks packages
- `AppBackupInfo`, `AppDataPath`, `AppDockerVolume` structs
- `DiscoverAppData()` iterates deployed stacks, discovers HDD bind mounts (via adapter calling `ParseComposeHDDMounts`), Docker named volumes (via `parseComposeNamedVolumes` using YAML parser), and DB dump status
- Stack adapter in `main.go` implements `StackDataProvider` using `stacks.Manager`
- **Per-app backup toggles** — new "Alkalmazás adatok" section on backup page:
- Toggle checkbox per app (only for apps with HDD data)
- Shows HDD paths with sizes, Docker volume info, DB dump notes
- `POST /settings/app-backup` handler saves preferences to `settings.json`
- `AppBackupPrefs` struct + bulk getter/setter in `settings.go`
- `RefreshCache()` populates `AppDataInfo` via `DiscoverAppData()`
- **Dynamic backup paths** — `RunBackup()` now includes enabled app HDD data paths:
- `resolveAppBackupPaths()` reads enabled apps from settings, resolves HDD paths via provider
- Paths logged at INFO level, included in restic snapshot
- `BackupPaths` display on backup page includes app data paths
- **Limited app restore** — new restore section on backup page:
- `RestoreApp()` in `restore.go`: validates enabled, resolves HDD paths, validates snapshot exists, uses running mutex
- `RestoreAppData()` on `ResticManager`: runs `restic restore` with `--include` flags for specific paths
- `POST /backup/restore` web handler with confirmation flow
- `GET /api/backup/snapshots` JSON endpoint for restore dropdown
- UI: app/snapshot dropdowns, warning box, confirmation checkbox, JS-driven form submission
- **Exported `ParseComposeHDDMounts`** from stacks package (was unexported `parseComposeHDDMounts`)
- **Flash messages** on backup page via query params (success/error redirects from handlers)
- **CSS**: New styles for storage overview grid, app backup toggles, encryption key field, restore section, flash messages
- **Files created**: `appdata.go`, `restore.go`
- **Files modified**: `backup.go`, `restic.go`, `handlers.go`, `server.go`, `backups.html`, `style.css`, `settings.go`, `delete.go`, `router.go`, `types.go`, `builder.go`, `main.go`
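The limited restore described above passes one `--include` flag per path to `restic restore`. A minimal sketch of how such an argument list might be assembled follows; `restoreArgs` and the `--target` value are hypothetical, and repository and password options are omitted.

```go
package main

import (
	"fmt"
	"strings"
)

// restoreArgs builds the argument list for a path-limited restic restore.
// snapshotID and target are supplied by the caller; each entry in paths
// becomes its own --include flag.
func restoreArgs(snapshotID, target string, paths []string) []string {
	args := []string{"restore", snapshotID, "--target", target}
	for _, p := range paths {
		args = append(args, "--include", p)
	}
	return args
}

func main() {
	args := restoreArgs("latest", "/", []string{
		"/mnt/hdd_1/immich",
		"/mnt/hdd_1/paperless",
	})
	fmt.Println("restic " + strings.Join(args, " "))
}
```

Building the arguments as a slice (rather than a shell string) keeps paths with spaces intact when they are later handed to `exec.Command`.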
### What was previously completed (2026-02-16 session 24)
- **v0.7.2 — Fix Notification Preferences Sync (Controller → Hub):**
- **Two repos changed** (deploy-felhom-compose + felhom.eu):
- **Hub: `POST /api/v1/preferences` endpoint** (`hub/internal/api/handler.go`):
- New route in API handler: same Bearer token auth as /report and /notify
- Accepts JSON payload: `{customer_id, email, enabled_events}`
- Calls existing `store.SaveNotificationPrefs()` — no store changes needed
- Logs preference updates at INFO level
- **Hub: Notification section on customer detail page** (`hub/internal/web/`, `hub/internal/store/store.go`):
- New `GetRecentNotifications()` store method returns last N notification_log entries
- `handleCustomerDetail()` loads NotifPrefs + RecentNotifications
- `joinStrings` template function added for event list display
- `customer.html` template: new "Notifications" section showing email, events, and last 10 notification log entries (time, event, status, message)
- **Controller: `SyncPreferences` method** (`internal/notify/notifier.go`):
- New `preferencesRequest` struct for JSON payload
- `SyncPreferences(email, enabledEvents)` — synchronous POST to hub `/api/v1/preferences`
- `IsEnabled()` getter for checking hub connectivity
- Hungarian error messages for user-facing feedback
- **Controller: Sync on settings save** (`internal/web/handlers.go`):
- `settingsNotificationsHandler` now calls `SyncPreferences` after saving to `settings.json`
- Three flash message variants: success (synced), warning (local save OK, sync failed), error (save failed)
- Local save always succeeds even if hub sync fails
- **Controller: Sync on startup** (`cmd/controller/main.go`):
- Non-blocking goroutine syncs preferences to hub when controller starts
- Only runs if hub is enabled and email is configured
- Handles hub DB rebuild recovery (re-populates preferences after hub redeployment)
- **Files changed**: hub (4 files: handler.go, store.go, server.go, customer.html), controller (3 files: notifier.go, handlers.go, main.go)
- **Documentation**: README.md updated (version, notify module, phase checklist), CONTEXT.md updated
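The preferences payload described above (`customer_id`, `email`, `enabled_events`) maps naturally onto a small struct. The sketch below shows one plausible shape; the struct name follows the entry, while the Go field names and the `encodePreferences` helper are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// preferencesRequest mirrors the JSON body sent to /api/v1/preferences.
type preferencesRequest struct {
	CustomerID    string   `json:"customer_id"`
	Email         string   `json:"email"`
	EnabledEvents []string `json:"enabled_events"`
}

// encodePreferences serializes the payload; the HTTP POST itself is left out.
func encodePreferences(customerID, email string, events []string) (string, error) {
	raw, err := json.Marshal(preferencesRequest{
		CustomerID:    customerID,
		Email:         email,
		EnabledEvents: events,
	})
	return string(raw), err
}

func main() {
	body, err := encodePreferences("demo", "user@example.com",
		[]string{"disk_warning", "backup_failed"})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(body)
}
```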
### What was previously completed (2026-02-16 session 23)
- **v0.7.1 — Phase 2: Monitoring Warnings, Dashboard Alerts & Notification System:**
- **Three workstreams across two repos** (deploy-felhom-compose + felhom.eu):
- **Monitoring page "Távoli monitoring" section** (`monitoring.html`, `handlers.go`):
- New section between System Overview and System Metrics showing healthcheck ping UUID status
- 5 rows: Heartbeat, System Health, DB Dump, Backup, Backup Integrity — each shows ✅ configured or ⚠️ missing
- Banner: green (all configured), yellow (some missing), red (monitoring disabled)
- `isPingConfigured()` helper checks non-empty AND not "CHANGEME" prefix
- **Dashboard alert banners** (new `alerts.go`, `layout.html`):
- `AlertManager` struct with `Refresh()` + `GetAlerts()` — generates alerts from health report, missing pings, backup disabled
- Alert types: `Alert{ID, Level, Message, Link, LinkText}` — levels: error/warning/info
- Renders colored banners (red/yellow/blue) after `<main class="content">` on all pages
- Caps at 5 alerts with "+N more" overflow; monitoring page excludes "pings-missing" (shown in table instead)
- Refreshed every 5 min via system-health scheduler task + once at startup
- **Hub notification relay** (felhom.eu repo — `hub/internal/api/handler.go`, `hub/internal/store/store.go`):
- `POST /api/v1/notify` endpoint: Bearer auth, JSON payload (customer_id, event_type, severity, message, details)
- New `customer_notifications` table (email, enabled_events JSON) + `notification_log` audit table
- Resend email integration: direct HTTP POST to `https://api.resend.com/emails`
- Hungarian email template with event details, timestamp, severity
- `hub.yaml.example` updated with notifications config section
- **Controller-side notifier** (new `internal/notify/notifier.go`):
- `Notifier` struct: fires HTTP POST to hub `/api/v1/notify`, non-blocking (goroutine)
- Cooldown tracking per event type (default 6h, configurable via UI)
- Checks notification preferences (email configured + event enabled) before sending
- `NotifyHealthChange()`: only notifies on status degradation (ok→warn, ok→fail, warn→fail)
- `NotifyBackupFailed/NotifyDBDumpFailed/NotifyIntegrityFailed` convenience methods
- `SendTest()` for test email flow
- Wired into scheduler: system-health task calls `NotifyHealthChange()`, backup tasks call failure notifiers
- **Notification preferences UI** (`settings.html`, `handlers.go`):
- New "Értesítések" Section C on Settings page (only shown when hub enabled)
- Email input, 4 event checkboxes (disk_warning, backup_failed, update_available, security_update)
- Cooldown hours input (default 6)
- "Mentés" + "Teszt email küldése" buttons
- Saved to `settings.json` via `NotificationPrefs` struct (Email, EnabledEvents, CooldownHours)
- **Settings persistence expanded** (`settings.go`):
- `NotificationPrefs` struct with Email, EnabledEvents, CooldownHours
- `DefaultEnabledEvents`: disk_warning, backup_failed, update_available
- `GetNotificationPrefs()` returns defaults if nil, `SetNotificationPrefs()` saves atomically
- **Files changed**: 3 new (alerts.go, notifier.go, notify package), ~12 modified across both repos
- **Deployed:** Controller v0.7.1 to demo-felhom.eu, verified healthy (0 alerts on clean system)
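Two of the small predicates described in this session lend themselves to a short sketch: the ping UUID check and the "notify only on degradation" rule. The function `isPingConfigured` is named above; `isDegradation` and the status strings `ok`/`warn`/`fail` are assumptions used for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// isPingConfigured treats empty values and CHANGEME placeholders as missing.
func isPingConfigured(uuid string) bool {
	return uuid != "" && !strings.HasPrefix(uuid, "CHANGEME")
}

// severity orders health states from best to worst.
var severity = map[string]int{"ok": 0, "warn": 1, "fail": 2}

// isDegradation is true only when the status gets worse
// (ok→warn, ok→fail, warn→fail). Unknown states never notify.
func isDegradation(prev, cur string) bool {
	p, okPrev := severity[prev]
	c, okCur := severity[cur]
	return okPrev && okCur && c > p
}

func main() {
	fmt.Println(isPingConfigured("CHANGEME-heartbeat")) // false
	fmt.Println(isDegradation("ok", "warn"))            // true
	fmt.Println(isDegradation("fail", "ok"))            // false
}
```

Ranking the states numerically keeps the degradation rule to a single comparison and makes recoveries (fail→ok) silent by construction.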
### What was previously completed (2026-02-16 session 22)
- **v0.7.0 — Phase 1: Authentication, Persistence & Settings Page:**
- **New `internal/settings/settings.go`:** Shared persistence layer via `settings.json` in the data directory. Atomic writes (tmp + rename), thread-safe with `sync.RWMutex`. Stores password hash overrides and DB validation cache. Graceful handling if file doesn't exist.
- **Auth improvements:**
- Password resolution priority: `settings.json` → `controller.yaml` → none (open dashboard)
- Startup logs which source is active: `Auth: using password from settings.json/controller.yaml/no password configured`
- Session duration extended to 7 days (was 24h)
- `?next=` redirect after session expiry — returns user to the page they were on
- Flash messages on login page (green info box, used after password change)
- Conditional logout link — hidden when auth is disabled (no password configured)
- `invalidateAllSessions()` method for password change flow
- **New Settings page (`/settings`):**
- "Rendszer konfiguráció" section: read-only display of controller.yaml values (customer ID/name/domain, git repo/sync interval, backup enabled/schedule, monitoring, healthchecks URL, hub status, controller version)
- "Jelszó módosítás" section: form with current password, new password, confirm — validates min 8 chars, match check, bcrypt comparison
- Password saved to `settings.json`, all sessions invalidated, redirect to login with flash message
- Only shown if auth is enabled; otherwise shows info message to contact operator
- **Sidebar update:**
- "Beállítások" menu item with ⚙ icon pinned to bottom (above version/logout)
- Version and logout link separated from nav links
- Logout link conditionally shown only when auth is enabled
- **DB validation persistence:**
- After each successful dump, validation results saved to `settings.json` (`db_validations` map keyed by filename)
- Cached data survives container restarts
- `DBValidationCache` struct with `validated_at`, `table_count`, `has_header`, `error`
- **10 files changed** (2 new: settings.go, settings.html; 8 modified: main.go, backup.go, auth.go, handlers.go, server.go, layout.html, login.html, style.css)
- **Deployed:** Controller v0.7.0 to demo-felhom.eu, verified healthy
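The persistence pattern described for `settings.go` (atomic tmp + rename writes, `sync.RWMutex`, tolerating a missing file) can be sketched as below. This is a simplified stand-in, not the real package: the `Store` type, its string map, and the `demo` helper are all hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
	"sync"
)

// Store is a tiny key/value settings file guarded by a RWMutex.
type Store struct {
	mu   sync.RWMutex
	path string
	data map[string]string
}

// Load reads the file if it exists; a missing file yields an empty store.
func Load(path string) (*Store, error) {
	s := &Store{path: path, data: map[string]string{}}
	raw, err := os.ReadFile(path)
	if os.IsNotExist(err) {
		return s, nil
	}
	if err != nil {
		return nil, err
	}
	if err := json.Unmarshal(raw, &s.data); err != nil {
		return nil, err
	}
	if s.data == nil { // file contained JSON null
		s.data = map[string]string{}
	}
	return s, nil
}

func (s *Store) Get(key string) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.data[key]
	return v, ok
}

// Set updates the value and rewrites the file atomically: write to a temp
// file in the same directory, sync, close, then rename over the target.
func (s *Store) Set(key, value string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[key] = value
	raw, err := json.MarshalIndent(s.data, "", "  ")
	if err != nil {
		return err
	}
	tmp, err := os.CreateTemp(filepath.Dir(s.path), ".settings-*.tmp")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // harmless after a successful rename
	if _, err := tmp.Write(raw); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Sync(); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), s.path)
}

// demo writes a value, reloads the file from disk, and returns what it read.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "settings-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "settings.json")
	s, err := Load(path)
	if err != nil {
		return "", err
	}
	if err := s.Set("password_hash", "placeholder-hash"); err != nil {
		return "", err
	}
	reloaded, err := Load(path)
	if err != nil {
		return "", err
	}
	v, _ := reloaded.Get("password_hash")
	return v, nil
}

func main() {
	fmt.Println(demo())
}
```

Creating the temp file in the same directory as the target matters: `os.Rename` is only atomic within a single filesystem.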
### What was previously completed (2026-02-16 session 21)
- **v0.6.3 — Bug fixes from v0.6.2 code scan (4 minor fixes):**
- **Bug 1:** `--hdd-path` in `docker-setup.sh` now uses `require_arg` validation like all other flags. Previously, `--hdd-path` as the last argument without a value would crash with a cryptic bash error under `set -u` instead of a friendly message.
- **Bug 2:** `stackAction()` in `layout.html` now receives `event` as an explicit parameter instead of relying on the deprecated implicit `window.event`. All 10 onclick call sites in `dashboard.html` and `stacks.html` updated to pass `event` as first argument.
- **Bug 3:** Page `<title>` now has an em dash separator: `"Vezérlőpult — Felhom.eu"` instead of `"VezérlőpultFelhom.eu"`.
- **Bug 4:** `nextPruneLabel()` in `funcmap.go` now returns `"ma"` (Hungarian for "today") on Sunday before 4am, consistent with the `nextRunLabel` function. Previously returned the date in `"2006-01-02"` format.
- **Deployed:** Controller v0.6.3 to demo-felhom.eu, verified healthy
### What was previously completed (2026-02-16 session 20)
- **Hub Dashboard Bugs + Backup Validation Fix (3 bugs):**
- **Bug 1&2 (Hub repo, felhom-hub v0.1.2):** Hub timestamp parsing failure — `time.Parse` with single hardcoded format silently failed for formats returned by `modernc.org/sqlite`. Added `parseSQLiteTime()` that tries 6 common formats. Fixed: hub main page showing DOWN despite OK status, and report history timestamps showing 00:00:00.
- **Bug 3 (Controller repo, v0.6.2):** Backup page showing "Hiba" for all DB validations — zero-value `DumpValidation{}` (never assigned) hit the `{{else}}` branch in template. Three fixes:
- Template: 4-branch guard (Valid → OK / Error → Hiba / zero-value → "" with tooltip)
- Debug logging: Added `[DEBUG]` and `[WARN]` log lines to all `ValidateDump()` code paths
- Re-validation: `RefreshCache()` now cross-checks `lastDBDump` results against fresh `ListDumpFiles()` validation, healing stale in-memory state
- **Deployed:** Hub v0.1.2 to k3s, Controller v0.6.2 to demo-felhom
- **Verified:** Controller logs show `ValidateDump OK` for all 3 databases (immich: 60 tables, paperless: 67 tables, romm: 14 tables)
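The multi-format timestamp parsing fix can be illustrated as follows. The function name `parseSQLiteTime` comes from the entry above, but the six formats it actually tries are not listed there, so the layouts below are assumptions chosen to cover common SQLite timestamp shapes.

```go
package main

import (
	"fmt"
	"time"
)

// sqliteLayouts lists candidate layouts, tried in order.
// The real hub code may use a different set.
var sqliteLayouts = []string{
	time.RFC3339Nano,
	"2006-01-02 15:04:05-07:00",
	"2006-01-02T15:04:05",
	"2006-01-02 15:04:05",
	"2006-01-02",
}

// parseSQLiteTime returns the first successful parse, or an error if no
// layout matches, so callers cannot silently end up with a zero time.
func parseSQLiteTime(s string) (time.Time, error) {
	for _, layout := range sqliteLayouts {
		if t, err := time.Parse(layout, s); err == nil {
			return t, nil
		}
	}
	return time.Time{}, fmt.Errorf("unrecognized timestamp format: %q", s)
}

func main() {
	for _, s := range []string{"2026-02-16 10:20:30", "2026-02-16T10:20:30Z", "garbage"} {
		t, err := parseSQLiteTime(s)
		fmt.Println(t, err)
	}
}
```

Returning an explicit error is the design point here: the original bug came from a failed parse being swallowed, which left a zero timestamp that rendered as 00:00:00.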
### What was previously completed (2026-02-16 session 19)
- **v0.6.1 — Code Review Bugfixes (7 fixes):**
- **Fix 1:** `http.NotFound(w, nil)` → pass actual `*http.Request` in `deployHandler` and `appDetailHandler`
- **Fix 2:** Dashboard running/stopped counts now computed from the filtered `deployedStacks` set (was counting ALL stacks including non-deployed)
- **Fix 3:** Session cookie `Secure` flag now dynamic based on `r.TLS != nil || X-Forwarded-Proto == "https"`. `SameSite` changed from `Strict` to `Lax` (Strict breaks Cloudflare Tunnel redirects)
- **Fix 4:** Removed misleading `subtle.ConstantTimeCompare` from `isValidSession()` (map lookup already leaks timing; comparing token to itself is meaningless). Removed unused `token` field from `session` struct. Removed `crypto/subtle` import.
- **Fix 5:** Replaced `time.Tick()` (goroutine leak) with proper `time.NewTicker` + `done` channel in `cleanupSessions()`. Added `Close()` method to Server. Added `done chan struct{}` to Server struct.
- **Fix 6:** Added `http.MaxBytesReader(w, req.Body, 1<<20)` (1MB limit) to `deployStack`, `updateOptionalConfig`, `deleteStack` API handlers via `limitBody()` helper.
- **Fix 7:** Cached `time.LoadLocation("Europe/Budapest")` once at top of `templateFuncMap()`, removed 5 per-function `LoadLocation` calls (timeAgo, fmtTime, fmtTimeShort, nextRunLabel, nextPruneLabel).
- **Post-fix verification:** All 4 grep checks pass (0 results for `NotFound(w,nil)`, `ConstantTimeCompare`, `time.Tick(`, `Secure:.*true`). `go vet ./...` clean.
- **Controller version:** v0.6.1 — deployed and verified on demo-felhom.eu
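Fix 6 above caps request bodies at 1 MB with `http.MaxBytesReader`. A minimal sketch of such a `limitBody()` helper and its effect is shown below; the handler, the `/api/deploy` path, and the `postStatus` helper are illustrative assumptions.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

const maxBodyBytes = 1 << 20 // 1 MB

// limitBody wraps the request body so reads beyond maxBodyBytes fail.
func limitBody(w http.ResponseWriter, r *http.Request) {
	r.Body = http.MaxBytesReader(w, r.Body, maxBodyBytes)
}

// handler stands in for an API handler that reads the whole body.
func handler(w http.ResponseWriter, r *http.Request) {
	limitBody(w, r)
	if _, err := io.ReadAll(r.Body); err != nil {
		http.Error(w, "request body too large", http.StatusRequestEntityTooLarge)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// postStatus posts a body of the given size and returns the status code.
func postStatus(size int) int {
	body := strings.NewReader(strings.Repeat("x", size))
	req := httptest.NewRequest(http.MethodPost, "/api/deploy", body)
	rec := httptest.NewRecorder()
	handler(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(postStatus(1024))    // 200
	fmt.Println(postStatus(2 << 20)) // 413
}
```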
### What was previously completed (2026-02-16 session 18)
- **v0.6.0 — Healthcheck Implementation + Central Push + Hub Dashboard:**
- **Part 1 — Healthcheck enhancements (controller-side):**
- Added `heartbeat` ping — lightweight "I'm alive" signal every 5 min (no logic, just ping)
- Added `backup_integrity` ping — weekly `restic check` on Sunday 04:00, pings healthchecks with result
- Added `Heartbeat` and `BackupIntegrity` fields to `PingUUIDsConfig`
- Added `RunIntegrityCheck()` to backup Manager (calls restic Check(), updates lastCheckTime/lastCheckOK, pings)
- Updated `controller.yaml.example` with new monitoring ping_uuids
- Created `monitoring/DEPRECATED.md` for legacy bash monitoring scripts
- **Part 2 — Central hub reporting (controller-side):**
- New `internal/report/` package: types.go (Report struct), builder.go (BuildReport), pusher.go (HTTP push)
- Report builder gathers data from all subsystems: system info (via metrics.GetStaticInfo + system.GetInfo), container stats (via metricsStore.QueryContainerSummary), backup status (via backupMgr.GetFullStatus), health (via monitor.RunHealthCheck), stacks (via stackMgr.GetStacks)
- Report pusher: POST JSON to hub with Bearer token auth, 3 retries with 5s backoff, never fails caller
- Added `HubConfig` to config.go (enabled, url, api_key, push_interval)
- Wired hub reporting into scheduler (configurable interval, default 15m)
- Hub reporting disabled by default (hub.enabled: false)
- **Part 3 — Hub service (felhom.eu repo, new `hub/` subfolder):**
- Full Go service: `cmd/hub/main.go`, `internal/api/handler.go`, `internal/store/store.go`, `internal/web/server.go`
- SQLite store with WAL mode, auto-migration, denormalized fields for fast queries
- REST API: POST /api/v1/report (Bearer token auth), GET /api/v1/customers, GET /api/v1/customers/{id}, GET /api/v1/customers/{id}/history
- Dark theme dashboard (English): multi-customer overview table with status indicators, customer detail page with system/storage/containers/backup/health sections
- Color coding: green (OK, <30min), yellow (warn or 30-60min), red (fail or >60min)
- K8s manifest: Deployment + Service + Ingress for hub.felhom.eu in felhom-system namespace
- Dockerfile, Makefile, hub.yaml.example config
- 90-day report retention with daily auto-prune
- **Controller version:** v0.6.0 — deployed and verified on demo-felhom.eu (9 scheduler jobs, all new jobs registered)
- **Manual steps remaining for Viktor (Part 4 of TASK.md):**
- Create 5 healthcheck checks on status.felhom.eu (heartbeat, system-health, db-dump, backup, backup-integrity)
- Update controller.yaml on demo-felhom with real UUIDs
- Build and deploy felhom-hub to k3s cluster
- Configure hub.felhom.eu DNS in Cloudflare
- Enable hub reporting on demo-felhom controller.yaml
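The hub dashboard color rule from Part 3 (green under 30 minutes, yellow for 30 to 60 minutes or a warning, red beyond 60 minutes or a failure) can be expressed as one small function. The name `statusColor` and the lowercase status strings are assumptions for this sketch.

```go
package main

import (
	"fmt"
	"time"
)

// statusColor combines the reported status with the age of the last report.
// The worse of the two signals decides the color.
func statusColor(status string, age time.Duration) string {
	switch {
	case status == "fail" || age > 60*time.Minute:
		return "red"
	case status == "warn" || age >= 30*time.Minute:
		return "yellow"
	default:
		return "green"
	}
}

func main() {
	fmt.Println(statusColor("ok", 10*time.Minute))   // green
	fmt.Println(statusColor("ok", 45*time.Minute))   // yellow
	fmt.Println(statusColor("warn", 5*time.Minute))  // yellow
	fmt.Println(statusColor("ok", 2*time.Hour))      // red
}
```

With a 15-minute default push interval, the 30-minute threshold corresponds to two missed reports and the 60-minute threshold to four.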
### What was previously completed (2026-02-16 session 17)
- **v0.5.4 — Monitoring Page Frontend Fixes (4 bugs, frontend-only):**
- **Bug 1: Tooltip "Invalid Date"** — `items[0].parsed.x` unreliable across Chart.js versions. Fixed tooltip callback to use `items[0].raw.x` (direct {x,y} data access) with `parsed.x` as fallback.
- **Bug 2: Charts fill full width regardless of data density** — `setChartXBounds()` setting `min/max` at runtime was ignored because the scale was created without them. Fixed by including `min: now - defaultRangeMs, max: now` in the initial `chartOpts()` options. Now "7 nap" shows full 7-day x-axis with data clustered on the right.
- **Bug 3: Sysinfo values not consistently right-aligned** — `.sysinfo-grid` used `auto-fill` creating variable-width cells. Fixed to `1fr 1fr` (fixed 2-column). Added `align-items: baseline`, `gap: 1rem`, `white-space: nowrap` on labels, `font-weight: 600` + `word-break: break-word` on values. Removed redundant `<style>` block from monitoring.html (styles now in style.css).
- **Bug 4: Charts overflow on mobile** — Added `min-width: 0` on `.chart-box` (critical CSS grid fix), `overflow: hidden` + `max-width: 100%` on `.chart-wrap` and `.chart-wrap-bar`, `max-width: 100%` on canvas.
- **Controller version:** v0.5.4 — deployed and verified on demo-felhom.eu
### What was previously completed (2026-02-16 session 16)
- **v0.5.1 — Monitoring Page Bugfixes:**
- **Bug 1: Hostname** — `os.Hostname()` returns the container ID inside Docker. Fixed by mounting `/etc/hostname:/host/etc/hostname:ro` and reading it first in `sysinfo.go`. Now shows `demo-felhom`.
- **Bug 2: Tooltip timestamps** — Chart.js tooltip callback used `items[0].parsed.x` (category index 0,1,2...) instead of `items[0].label` (actual timestamp). Index 0 worked by accident (`0 || label` falls through), but all other points showed 1970-01-01.
- **Bug 3+4: Default range + empty charts** — Default range was `24h` but new system had only minutes of data. Changed to `1h` default for both system and container detail charts. Moved `active` class to "1 óra" button.
- **Controller version:** v0.5.1 — deployed and verified on demo-felhom.eu
### What was previously completed (2026-02-16 session 15)
- **v0.5.0 — Backup Bugfixes + Monitoring Page with Metrics Store:**
- **Task 1: Fixed "Helyi mentés" showing "" after restart** — `GetFullStatus()` now synthesizes `LastBackup` from `SnapshotHistory` and `LastDBDump` from `DumpFiles` on disk when the in-memory values are nil (e.g., after controller restart). Dashboard handler also updated to use `GetFullStatus()` instead of `GetStatus()` for consistent behavior.
- **Task 2: Verified backup page caching** — Already implemented in v0.4.7 (`RefreshCache`, scheduler job, `AfterBackup` callback). No changes needed.
- **Task 3: New Monitoring Page ("Rendszermonitor")** — Full system monitoring subsystem:
- **SQLite metrics store** (`internal/metrics/store.go`, `types.go`): WAL-mode SQLite via `modernc.org/sqlite` (pure Go, no CGO). Stores system metrics (CPU%, memory, temperature, load) and container metrics (CPU%, memory, net/block I/O) with timestamp. Downsampled queries via bucket-based `GROUP BY` for Chart.js. 30-day auto-prune via daily scheduler job at 04:00.
- **Metrics collector** (`internal/metrics/collector.go`): Background goroutine collects system + container metrics every 60 seconds. System data from `system.GetInfo()`, container data from `docker stats --no-stream` with tab-separated format parsing.
- **System info provider** (`internal/metrics/sysinfo.go`, `sysinfo_other.go`): Reads hostname, OS, kernel, CPU model/cores, uptime from `/proc` filesystem. Linux-specific with build-tag fallback for cross-compilation.
- **REST API endpoints** (4 new routes in `router.go`): `GET /api/metrics/system` (time-series with range presets), `GET /api/metrics/containers/summary` (current stats), `GET /api/metrics/containers/{name}` (per-container time-series), `GET /api/metrics/sysinfo` (static system info).
- **Monitoring page template** (`monitoring.html`): 5 sections — System Overview (sysinfo via API), System Metrics Charts (4 line charts: CPU, Memory, Temperature, Load in 2×2 grid), Container Resources (2 horizontal bar charts: CPU% and Memory), Per-container Detail (click to expand with historical charts), Storage (server-rendered progress bars). Time range selectors (1h/6h/24h/7d/30d). Auto-refresh every 60s.
- **Chart.js 4.4.7** embedded locally (offline environments, ~200KB UMD), dark theme configuration matching site design.
- **CSS**: ~100 lines added for monitoring page (`.monitor-card`, `.charts-grid`, `.chart-box`, `.container-charts-row`, `.storage-bars`, responsive rules).
- **Wiring**: 4th sidebar nav item "Rendszermonitor", metrics DB path in named volume (`data/metrics.db`), `/etc/os-release:/host/etc/os-release:ro` volume mount in docker-compose.yml, Dockerfile updated to `golang:1.24-bookworm` (required by `modernc.org/sqlite`), `go.mod` upgraded to `go 1.24.0`.
- **Controller version:** v0.5.0 — deployed and verified on demo-felhom.eu (metrics collecting, 16 containers reporting, sysinfo showing Intel N100 correctly)
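
The bucket-based downsampling used for the charts can be illustrated in plain Go. This is an in-memory sketch of what a `GROUP BY` over `timestamp / bucketSeconds` with an average would produce; the `Sample` type and `downsample` function are assumptions for illustration, and the real store does this in SQLite.

```go
package main

import "fmt"

// Sample is a hypothetical stand-in for one stored metric row.
type Sample struct {
	Unix  int64   // timestamp in seconds
	Value float64 // e.g. CPU percent
}

// downsample averages time-sorted samples into fixed-width buckets and
// returns one point per non-empty bucket, stamped with the bucket start.
func downsample(samples []Sample, bucketSeconds int64) []Sample {
	if bucketSeconds <= 0 {
		return samples
	}
	var out []Sample
	var sum float64
	var n int
	var cur int64
	for i, s := range samples {
		b := s.Unix / bucketSeconds
		if i > 0 && b != cur {
			// The bucket changed: emit the average of the finished bucket.
			out = append(out, Sample{Unix: cur * bucketSeconds, Value: sum / float64(n)})
			sum, n = 0, 0
		}
		cur = b
		sum += s.Value
		n++
	}
	if n > 0 {
		out = append(out, Sample{Unix: cur * bucketSeconds, Value: sum / float64(n)})
	}
	return out
}

func main() {
	// Six one-minute samples reduced to two three-minute buckets.
	raw := []Sample{{0, 10}, {60, 20}, {120, 30}, {180, 40}, {240, 50}, {300, 60}}
	fmt.Println(downsample(raw, 180))
}
```

Choosing the bucket width from the selected range (for example, range divided by a target point count) keeps the number of chart points roughly constant for 1h and 30d views alike.
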
### What was previously completed (2026-02-16 session 14)
- **v0.4.7 — Protected Stack Detail Pages + Backup Page Caching:**
- **Protected stacks clickable** — `data-href` gating changed from `{{if not .Protected}}` to `{{if .Meta.Slug}}` on both `stacks.html` and `dashboard.html`. Protected stacks with `.felhom.yml` (i.e. a slug) are now clickable, linking to `/apps/{slug}`. Stacks without `.felhom.yml` remain non-clickable.
- **"Részletek" button for protected stacks** — Protected stack action section in `stacks.html` now shows a "Részletek" link when the stack has a slug, next to the restart button.
- **FileBrowser `.felhom.yml` resources** — Added a `resources` section (mem_request: 128M, mem_limit: 256M, pi_compatible: true, needs_hdd: true) to the `.felhom.yml` generated by `install_filebrowser()` in `docker-setup.sh`, and applied the same change manually on the demo node. The FileBrowser detail page now shows memory/Pi/HDD badges.
- **Backup page caching** — `GetFullStatus()` no longer runs expensive subprocess calls (restic stats, docker inspect, disk listing) on every page load. Instead, a new `RefreshCache()` method runs these in the background:
- Every 5 minutes via `backup-cache` scheduler job
- After each successful backup via `AfterBackup` callback
- On startup via a goroutine (non-blocking)
- `GetFullStatus()` returns the cached `FullBackupStatus` instantly, updating only dynamic fields (running flag, next run times, snapshot history). Falls back to a minimal status if cache hasn't populated yet.
- **Controller version:** v0.4.7 — deployed and verified on demo-felhom.eu
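
The caching pattern described for `GetFullStatus()` and `RefreshCache()` might look roughly like this. It is a simplified sketch: the `FullStatus` fields, the `statusCache` type, and the `slow` callback (standing in for restic stats, docker inspect, and disk listing) are all assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

// FullStatus is a reduced, hypothetical version of the page's status struct.
type FullStatus struct {
	RepoSize string // expensive to compute, served from the cache
	Running  bool   // cheap, always filled in live
}

type statusCache struct {
	mu     sync.RWMutex
	cached *FullStatus
	slow   func() FullStatus // placeholder for the subprocess calls
}

// Refresh runs the expensive work outside the lock, then swaps the result in.
func (c *statusCache) Refresh() {
	fresh := c.slow()
	c.mu.Lock()
	c.cached = &fresh
	c.mu.Unlock()
}

// Get returns immediately: cached data if present, otherwise a minimal
// fallback, with the dynamic field overlaid on a copy.
func (c *statusCache) Get(running bool) FullStatus {
	c.mu.RLock()
	defer c.mu.RUnlock()
	st := FullStatus{RepoSize: "unknown"}
	if c.cached != nil {
		st = *c.cached
	}
	st.Running = running
	return st
}

func main() {
	c := &statusCache{slow: func() FullStatus { return FullStatus{RepoSize: "1.2 GiB"} }}
	fmt.Println(c.Get(false)) // before the first refresh: fallback
	c.Refresh()               // would be called by the scheduler job or AfterBackup
	fmt.Println(c.Get(true))
}
```

Because `Get` copies the cached struct before overlaying live fields, page loads never mutate the shared cache.
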
### What was previously completed (2026-02-16 session 13)
- **v0.4.6 — MariaDB Validation Fix + Dashboard & Protected Stack UX:**
- **Bugfix: MariaDB dump validation false positive** — MariaDB 11.4+ prepends `/*M!999999\- enable the sandbox mode */` before the dump header comment. `ValidateDump()` now scans the first 10 lines for the expected header pattern instead of just checking line 1. Accepts `-- MariaDB dump`, `-- MySQL dump`, `-- mysqldump` for MariaDB and `-- PostgreSQL database dump` for PostgreSQL.
- **Dashboard shows deployed apps only** — `dashboardHandler()` filters to deployed + protected stacks only. Non-deployed apps remain on the Alkalmazások page. Section heading changed to "Telepített alkalmazások". `TotalCount` stat card still shows all 52 apps.
- **Protected stack restart button** — Protected stacks (traefik, cloudflared, felhom-controller, filebrowser) now show an "Újraindítás" restart button when operational, on both dashboard (compact ↻) and Alkalmazások page (full button). "Védett" / "Védett rendszerkomponens" badge still shown.
- **API protection guard** — Centralized guard in `actionStack()` blocks all actions except `restart` on protected stacks (HTTP 403). Defense-in-depth: `StopStack()` and `DeleteStack()` retain their own guards.
- **FileBrowser `.felhom.yml`** — `install_filebrowser()` in `docker-setup.sh` now creates `.felhom.yml` with `subdomain: files` metadata, so the controller shows the `files.DOMAIN ↗` URL link. Manually created on demo node.
- **Controller version:** v0.4.6 — deployed and verified on demo-felhom.eu
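
The header check from the MariaDB validation fix can be sketched as below. The accepted prefixes and the 10-line window come from the entry above; the function name `hasDumpHeader`, the `dbType` keys, and taking the dump as a string (rather than reading the file) are simplifications for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// dumpHeaders lists the accepted header prefixes per database type.
var dumpHeaders = map[string][]string{
	"mariadb":  {"-- MariaDB dump", "-- MySQL dump", "-- mysqldump"},
	"postgres": {"-- PostgreSQL database dump"},
}

// hasDumpHeader scans at most the first 10 lines for an expected header, so a
// leading sandbox-mode comment no longer causes a false negative.
// Note: bufio.Scanner stops at lines longer than 64 KiB; fine for headers.
func hasDumpHeader(dump, dbType string) bool {
	sc := bufio.NewScanner(strings.NewReader(dump))
	for line := 0; line < 10 && sc.Scan(); line++ {
		for _, prefix := range dumpHeaders[dbType] {
			if strings.HasPrefix(sc.Text(), prefix) {
				return true
			}
		}
	}
	return false
}

func main() {
	dump := "/*M!999999\\- enable the sandbox mode */\n-- MariaDB dump 10.19\n"
	fmt.Println(hasDumpHeader(dump, "mariadb"))
}
```
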
### What was previously completed (2026-02-16 session 12)
- **v0.4.5 — Dedicated Backup Page ("Biztonsági mentés"):**
- **New `/backups` page** with full backup system visibility — 5 sections:
1. **Status overview cards**: Local backup status (green/gray), remote placeholder (gray), DB count, repo size
2. **Schedule section**: DB dump/restic/prune schedule with next-run times, last backup time + duration, retention policy, "Mentés most" button
3. **Database table**: Lists all discovered DBs with type badge (PostgreSQL/MariaDB), dump file size, last dump time, validation (table count), status
4. **Snapshot history table**: Last 20 snapshots with ID, time, data added, files new/changed
5. **Repository info card**: Path, size, snapshot count, integrity check status, backed-up paths list, remote copy placeholder
- **Backend extensions:**
- `SnapshotRecord` type + ring buffer (20 entries) in Manager for per-snapshot stats
- `DumpValidation` — scans dump files for CREATE TABLE statements, validates header and file size
- `ValidateDump()` runs after each successful dump in `DumpOne()`
- `ListDumpFiles()` scans dump directory for existing `.sql` files (fallback when in-memory results empty)
- `ListSnapshots()` on ResticManager — returns all snapshots from restic (newest first)
- `GetFullStatus()` on Manager — single call returns everything the page needs
- `LoadSnapshotHistory()` populates history from restic on startup (without delta stats)
- Restic check result tracking (`lastCheckTime`, `lastCheckOK`)
- `NextDailyRun()` exported from scheduler for next-run time calculation
- **Server wiring:**
- `Server` struct now holds `*scheduler.Scheduler`
- `NewServer()` accepts scheduler parameter
- `/backups` route + `backupsHandler()` in handlers.go
- **New template functions** (`funcmap.go`): `timeAgo`, `fmtTime`, `fmtTimeShort`, `dbTypeLabel`, `nextRunLabel`, `pruneLabel`, `nextPruneLabel`, `fmtDuration`, `fmtBytes`, `shortID`
- **Navigation**: Sidebar now has 3 items (Vezérlőpult, Alkalmazások, Biztonsági mentés)
- **Dashboard**: Backup card title is now a clickable link to `/backups`
- **Auto-refresh**: Page polls `/api/backup/status` every 3s during backup-in-progress, reloads when complete
- **CSS**: Full dark-theme styles for schedule card, database table, snapshot table, repository card, validation badges, DB type badges, empty state
- **Controller version:** v0.4.5 — deployed and verified on demo-felhom.eu (2 historical snapshots loaded)
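
A fixed-size ring buffer like the 20-entry snapshot history could be written as follows. The `SnapshotRecord` fields shown and the `snapshotRing` type are assumptions; only the capacity of 20 and the newest-first ordering are taken from the notes above.

```go
package main

import "fmt"

const historySize = 20

// SnapshotRecord is a reduced, hypothetical version of the real record.
type SnapshotRecord struct {
	ID       string
	FilesNew int
}

// snapshotRing keeps the most recent historySize records and overwrites
// the oldest one once it is full.
type snapshotRing struct {
	buf   [historySize]SnapshotRecord
	next  int // index where the next record is written
	count int // number of valid records, at most historySize
}

func (r *snapshotRing) Add(rec SnapshotRecord) {
	r.buf[r.next] = rec
	r.next = (r.next + 1) % historySize
	if r.count < historySize {
		r.count++
	}
}

// Newest returns the stored records, newest first.
func (r *snapshotRing) Newest() []SnapshotRecord {
	out := make([]SnapshotRecord, 0, r.count)
	for i := 1; i <= r.count; i++ {
		out = append(out, r.buf[(r.next-i+historySize)%historySize])
	}
	return out
}

func main() {
	var ring snapshotRing
	for i := 0; i < 25; i++ {
		ring.Add(SnapshotRecord{ID: fmt.Sprintf("snap-%02d", i), FilesNew: i})
	}
	recent := ring.Newest()
	fmt.Println(len(recent), recent[0].ID, recent[len(recent)-1].ID)
}
```
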
### What was previously completed (2026-02-15 session 11)
- **v0.4.1 — App Filtering + Bugfixes:**
- **Filter bar on Alkalmazások page**: Four pill-shaped filter buttons (Mind/Futó/Leállítva/Telepíthető) with live count badges computed from DOM. Filters stack cards via `display: none`, updates URL with `?filter=running` via `history.replaceState`. Reads filter from URL on page load for deep-linking support.
- **New `filterCategory` template function** (`funcmap.go`): Maps container state + deployed flag to filter categories (running/stopped/available). Each stack card gets a `data-filter-state` attribute for client-side filtering.
- **Clickable dashboard stat cards**: Stat cards (Futó/Leállítva/Összes) changed from `<div>` to `<a>` with `href` linking to `/stacks?filter=running`, `/stacks?filter=stopped`, `/stacks` respectively. Hover effect with translateY + box-shadow.
- **docker-compose.yml synced to demo node**: Fixed the stale compose file that still had `dashboard.${DOMAIN}` Traefik label (from pre-v0.3.0). Now uses correct `felhom.${DOMAIN}` label + `/sys:/host/sys:ro` mount.
- **Controller version:** v0.4.1 — deployed and verified on demo-felhom.eu
- **Remaining manual tasks for Viktor (Task 2 & 3 from TASK.md):**
- Verify `felhom.demo-felhom.eu` resolves correctly (Cloudflare Tunnel public hostname may need updating from `dashboard.*` to `felhom.*`)
- Update Pi-hole local DNS if applicable
- Enable backup in `controller.yaml` on demo node (`backup.enabled: true`)
- Create `/srv/backups` directories on demo node
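
For reference, the `filterCategory` mapping introduced in this session could look like the sketch below. The state strings and the rule that any deployed but non-running stack counts as "stopped" are assumptions; the three category names match the `?filter=` values mentioned above.

```go
package main

import "fmt"

// filterCategory maps the deployed flag and container state to the value
// written into each card's data-filter-state attribute.
// Assumption: every deployed stack that is not "running" is shown as stopped.
func filterCategory(deployed bool, state string) string {
	switch {
	case !deployed:
		return "available"
	case state == "running":
		return "running"
	default:
		return "stopped"
	}
}

func main() {
	fmt.Println(filterCategory(true, "running"))
	fmt.Println(filterCategory(true, "exited"))
	fmt.Println(filterCategory(false, ""))
}
```
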
### What was previously completed (2026-02-15 session 10)
- **v0.4.0 — Monitoring & Health + Backups (Phase 2 & 3):**
- **Central job scheduler** (`internal/scheduler/scheduler.go`):
- Replaces ad-hoc goroutines in main.go with a unified scheduler
- `Every(name, interval, fn)` for periodic jobs, `Daily(name, timeStr, fn)` for scheduled tasks
- Panic recovery, skip-if-running, quiet mode for high-frequency jobs (≤30s)
- Daily jobs use `Europe/Budapest` timezone with `time.Timer` for DST correctness
- Graceful shutdown with 30s timeout for running jobs
- **CPU usage collector** (`internal/system/cpu_linux.go`):
- Background goroutine samples `/proc/stat` every 5s, computes delta-based CPU %
- Platform stubs for non-Linux in `cpu_other.go`
- **Temperature & load metrics** (`internal/system/info_linux.go`):
- Reads `/proc/loadavg` for 1/5/15 min load averages
- Reads thermal zones from `/host/sys/class/thermal/` (Docker mount) with `/sys/` fallback
- Handles millidegree values, picks highest zone, with hwmon fallback
- **Healthchecks.io pinger** (`internal/monitor/pinger.go`):
- HTTP ping client for Healthchecks.io-compatible endpoints
- POST to `/ping/{uuid}` (success), `/fail` (failure), `/start` (started)
- 10s timeout, 3 retries with 2s backoff, skips CHANGEME UUIDs
- **System health checks** (`internal/monitor/healthcheck.go`):
- Checks disk, memory, CPU, temperature, Docker reachability, protected containers
- Returns HealthReport with status "ok"/"warn"/"fail" + formatted message for pings
- **Database dump engine** (`internal/backup/dbdump.go`):
- Auto-discovers PostgreSQL/MariaDB containers via `docker ps` + `docker inspect`
- Dumps via `docker exec pg_dump`/`mariadb-dump` with 5min timeout
- Atomic writes (`.tmp` → `.sql`), empty file detection, stale temp cleanup
- **Restic integration** (`internal/backup/restic.go`):
- Auto-generates repository password (32 random bytes, base64url)
- Init, snapshot (JSON output), prune, check, stats, latest snapshot
- Stale lock detection with automatic unlock + retry
- **Backup orchestrator** (`internal/backup/backup.go`):
- DB dumps + restic snapshots, weekly prune on Sundays
- Thread-safe running flag, Healthchecks.io pings with results
- `RunFullBackup()` for manual trigger (sequential: dumps → snapshot)
- **Wiring updates:**
- `main.go`: scheduler-based job registration, cpuCollector lifecycle, pinger + backupMgr init
- `api/router.go`: `GET /api/backup/status`, `POST /api/backup/run`
- `web/server.go` + `handlers.go`: pass cpuCollector to GetInfo(), backup status on dashboard
- `funcmap.go`: `tempColor`, `fmtTemp`, `fmtLoad` template functions
- **Dashboard UI enhancements:**
- CPU usage bar with load average display below
- Temperature with colored indicator dot (green/yellow/red at 60°/75°C)
- Backup status card: last run time, DB count, repo size/snapshots
- "Mentés most" button triggers manual backup via API
- **Config updates:**
- `controller.yaml.example`: added `system_health_interval`, `hdd_path`, `system.reserved_memory_mb`
- `docker-compose.yml`: added `/sys:/host/sys:ro` mount for temperature reading
- `restic_password_file` default changed to `data/` subdir (auto-generated in named volume)
- **Controller version:** v0.4.0 — deployed and verified on demo-felhom.eu
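
The delta-based CPU percentage from `/proc/stat` can be sketched like this. It is not the code in `cpu_linux.go`; the `cpuTimes` type and both function names are hypothetical, and counting `iowait` as idle time is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// cpuTimes holds cumulative jiffies from the aggregate "cpu" line.
type cpuTimes struct{ idle, total uint64 }

// parseCPULine parses the first line of /proc/stat:
// cpu user nice system idle iowait irq softirq steal guest guest_nice
func parseCPULine(line string) (cpuTimes, error) {
	fields := strings.Fields(line)
	if len(fields) < 5 || fields[0] != "cpu" {
		return cpuTimes{}, fmt.Errorf("not an aggregate cpu line: %q", line)
	}
	var t cpuTimes
	for i, f := range fields[1:] {
		v, err := strconv.ParseUint(f, 10, 64)
		if err != nil {
			return cpuTimes{}, err
		}
		if i < 8 { // guest and guest_nice are already included in user and nice
			t.total += v
		}
		if i == 3 || i == 4 { // idle + iowait
			t.idle += v
		}
	}
	return t, nil
}

// cpuPercent returns the busy share between two samples, in percent.
func cpuPercent(prev, cur cpuTimes) float64 {
	if cur.total <= prev.total {
		return 0
	}
	dt := float64(cur.total - prev.total)
	var di float64
	if cur.idle > prev.idle {
		di = float64(cur.idle - prev.idle)
	}
	if di > dt {
		di = dt
	}
	return 100 * (dt - di) / dt
}

func main() {
	prev, _ := parseCPULine("cpu  100 0 100 800 0 0 0 0 0 0")
	cur, _ := parseCPULine("cpu  150 0 150 900 0 0 0 0 0 0")
	fmt.Printf("%.1f%%\n", cpuPercent(prev, cur))
}
```

A collector would call `parseCPULine` on a fresh read of `/proc/stat` every 5 seconds and keep the previous sample for the next delta.
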
### What was previously completed (2026-02-15 session 9)
- **v0.3.0 — Structural refactoring (templates + server split + domain rename):**
- **Templates: go:embed migration** — moved all 7 HTML templates + CSS from Go string constants to individual files in `internal/web/templates/`. Created `embed.go` with `//go:embed` directive. Template loading now uses `ParseFS()` instead of `Parse()`. CSS served from embed.FS via `ReadFile()`. Zero runtime file dependencies — still compiled into the binary.
- **Server decomposition** — split monolithic `server.go` (540 lines) into focused files:
- `auth.go`: session struct, auth middleware, login/logout handlers, session management
- `handlers.go`: page handlers (dashboard, stacks, logs, deploy, app detail)
- `funcmap.go`: template FuncMap with 14 custom functions
- `server.go`: Server struct, NewServer, loadTemplates (3-liner), ServeHTTP routing, render helper, static file serving
- **Domain rename** — controller subdomain changed from `dashboard.*` to `felhom.*` in Traefik labels and setup script
- **Documentation updated** — CLAUDE.md, README.md, CONTEXT.md all reflect new file structure
- **Reminder for Viktor:** Update Cloudflare Tunnel public hostname (`dashboard.demo-felhom.eu` → `felhom.demo-felhom.eu`) and Pi-hole DNS if needed
- **Controller version:** v0.3.0
### What was previously completed (2026-02-15 session 8)
- **FileBrowser as infrastructure service:**
- Created `scripts/hdd-setup.sh` (adapted from deploy-portainer) — sets up HDD folder structure with `Dokumentumok` user dir
- Created `scripts/docker-setup.sh` (adapted from deploy-portainer) — installs Docker, Traefik, FileBrowser as infra services
- Added `filebrowser` to protected stacks in `controller.yaml.example`
- Removed `templates/filebrowser/` from app-catalog-felhom.eu (no longer a catalog app)
- **Orphan stack detection and deletion:**
- Added `Orphaned` field to Stack struct + `getCatalogTemplateSlugs()` helper
- Orphan detection in `ScanStacks()` — deployed stacks with no matching catalog template marked as orphaned
- New `delete.go`: `DeleteStack()` (compose down + HDD cleanup + dir removal), `GetStackHDDData()`, `parseComposeHDDMounts()`
- Safety: protected HDD paths (root, media, storage, Dokumentumok, appdata) can never be deleted
- New API endpoints: `DELETE /api/stacks/{name}` and `GET /api/stacks/{name}/hdd-data`
- UI: orange "Elavult" badge on orphaned stacks, "Törlés" button, delete confirmation modal
- Modal shows HDD data paths/sizes, checkbox for "Felhasználói adatok törlése a merevlemezről"
- Hides "Frissítés" and "Részletek" buttons for orphaned stacks
- **Verified:** 1 orphaned stack detected on startup (filebrowser — now infra, removed from catalog)
- **Controller version:** v0.2.15
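
The protected-path safety rule for HDD cleanup could be expressed along these lines. This is a sketch under assumptions: the function name `isDeletable` and the example root `/mnt/hdd` are hypothetical, and it assumes that only the listed directories themselves are protected while app subfolders beneath them may be removed.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// protectedDirs are top-level HDD folders that must never be removed.
var protectedDirs = []string{"media", "storage", "Dokumentumok", "appdata"}

// isDeletable reports whether target may be removed during stack deletion.
// It rejects the HDD root, anything outside the root, and protected folders.
func isDeletable(hddRoot, target string) bool {
	rel, err := filepath.Rel(filepath.Clean(hddRoot), filepath.Clean(target))
	if err != nil || rel == "." || rel == ".." ||
		strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return false
	}
	for _, dir := range protectedDirs {
		if rel == dir {
			return false
		}
	}
	return true
}

func main() {
	for _, p := range []string{"/mnt/hdd", "/mnt/hdd/appdata", "/mnt/hdd/appdata/romm", "/mnt/hdd/../etc"} {
		fmt.Println(p, isDeletable("/mnt/hdd", p))
	}
}
```

Cleaning both paths before comparing means inputs such as `/mnt/hdd/appdata/../media` are resolved first and still hit the guard.
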
### Previously completed (2026-02-14 session 7)
- **Fixed YAML parse error in romm `.felhom.yml`** (app-catalog repo):
- Root cause: Hungarian opening quote `„` (U+201E) paired with ASCII `"` (0x22) inside YAML double-quoted strings terminated the string prematurely
- Affected lines: `help_text` for IGDB Client Secret and SteamGridDB API Key fields
- Fix: escaped inner ASCII double quotes with `\"` in the YAML strings
- This caused `LoadMetadata()` to silently fail and return empty defaults for ALL romm metadata (tagline, resources, category — everything)
- **Added error logging to `LoadMetadata()`** in `metadata.go`:
- `[ERROR]` log on YAML parse failure (was silently swallowed — critical bug)
- Temporary `[DEBUG]` log used for diagnosis, then removed
- **Fixed deploy command in CLAUDE.md**:
- `sed` pattern now targets only `image:` lines (was matching service name too, breaking YAML)
- Added `sudo` for both sed and docker compose (directory is root-owned)
- **Controller version:** v0.2.14
### Previously completed (2026-02-14 session 6)
- **Bug fix: App info logo SVG rendering** — `.app-info-logo` CSS in `templates.go`:
- Added `min-width`, `min-height`, `max-width`, `max-height: 80px` and `overflow: hidden`
- Prevents SVG images with explicit dimensions or no viewBox from overflowing container
- Logo now reliably renders at 80x80 regardless of SVG intrinsic size
- **Controller version:** v0.2.12
### Previously completed (2026-02-14 session 5)
- **App detail/info pages** — new feature:
- New route: `GET /apps/{slug}` renders a full info page (previously a redirect to the deploy page)
- Hero section with logo, tagline, resource badges
- Screenshots section (graceful — hidden via `onerror` if assets don't exist)
- Info cards: use cases, first steps, prerequisites, default credentials, docs link
- Optional config form with AJAX save (POST `/api/stacks/{name}/optional-config`)
- New `.felhom.yml` fields: `app_info` (tagline, use_cases, first_steps, prerequisites, default_creds, docs_url) and `optional_config` (groups of env var fields)
- New structs in `metadata.go`: `AppInfo`, `OptionalConfigGroup`, `OptionalConfigField`
- `UpdateOptionalConfig` in `deploy.go`: saves optional env vars to `app.yaml`, restarts deployed stacks with `docker compose up -d` to pick up new env vars
- Navigation updated: stack cards on dashboard/stacks pages now link to `/apps/{slug}`, deploy page has "Részletek" link back to info page
- **RoMM metadata updated** (app-catalog repo):
- Full `app_info` section: tagline, 5 use cases, 6 first steps, 3 prerequisites, default creds, docs URL
- 6 optional config fields for metadata providers: IGDB (client_id + secret), SteamGridDB, ScreenScraper (user + password), MobyGames
- docker-compose.yml updated with SCREENSCRAPER_USER, SCREENSCRAPER_PASSWORD, MOBYGAMES_API_KEY env vars
- Display name fixed: "ROMM" → "RomM"
- **Controller version:** v0.2.11
### Previously completed (2026-02-14 session 4)
- **Fixed deploy race condition** in `internal/stacks/deploy.go`:
- In-memory `Deployed` flag now set BEFORE `docker compose up -d` (compose up can take 30-60s for image pulls)
- On failure: both in-memory state and disk (app.yaml) are reverted
- Eliminates stale "Telepítés" button during long compose operations
- **Added `checkBeforeDeploy()` JS guard** in `internal/web/templates.go`:
- Telepítés buttons on Vezérlőpult and Alkalmazások pages now fetch live state from `/api/stacks/{name}` before navigating
- If app is already deployed (e.g., another tab deployed it), shows alert and reloads page instead of navigating to deploy form
- Catches stale UI state gracefully
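
The ordering fix in `deploy.go` (set the flag first, revert on failure) can be illustrated as follows. The `Stack` type here is heavily reduced, and the `persist` and `composeUp` callbacks are hypothetical stand-ins for writing `app.yaml` and running `docker compose up -d`.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errComposeFailed = errors.New("docker compose up failed") // demo error only

type Stack struct {
	mu       sync.Mutex
	deployed bool
}

func (s *Stack) setDeployed(v bool) {
	s.mu.Lock()
	s.deployed = v
	s.mu.Unlock()
}

func (s *Stack) IsDeployed() bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.deployed
}

// deploy marks the stack as deployed before the slow compose step so the UI
// never shows a stale install button, and reverts memory and disk on failure.
func deploy(s *Stack, persist func(deployed bool) error, composeUp func() error) error {
	if err := persist(true); err != nil {
		return err
	}
	s.setDeployed(true)
	if err := composeUp(); err != nil {
		s.setDeployed(false)
		if perr := persist(false); perr != nil {
			return fmt.Errorf("%w (disk revert also failed: %v)", err, perr)
		}
		return err
	}
	return nil
}

func main() {
	s := &Stack{}
	err := deploy(s,
		func(bool) error { return nil },
		func() error { return errComposeFailed })
	fmt.Println(err, s.IsDeployed())
}
```
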
### Previously completed (2026-02-14 session 3)
- **Enhanced debug logging** across all stack operations in `internal/stacks/`:
- **Operation timing**: All stack ops (start, stop, restart, update, deploy) now log elapsed time
- **Post-start container state check**: Async goroutine after start/restart/update/deploy
- **Image pull detection**: Checks local images before deploy/update (debug level)
- **GetLogs/ScanStacks improvements**: Byte count logging, deployed/available counts
- All verbose checks gated on `cfg.Logging.Level == "debug"`; timing always at INFO
- **UI improvements** in `internal/web/templates.go` and `server.go`:
- **Memory bar fix on deploy page**: Bar segments now always visible (min-width: 3px), new app segment uses translucent green with distinct border for clear visual separation from committed memory
- **Clickable app cards**: Cards on Vezérlőpult and Alkalmazások pages are now clickable (navigates to deploy/detail page). Uses `data-href` attribute + delegated click handler. Protected stacks excluded. Actions area (buttons, state labels) excluded from click-to-navigate
- **Live-scrolling logs**: Logs page now auto-refreshes every 3s via AJAX polling (`?raw=1` returns plain text). Fixed-height container (70vh) with auto-scroll to bottom. Pulsing green "Élő" indicator. Pause/resume toggle ("Szüneteltetés"/"Folytatás"). User scroll position preserved when scrolled up to read history
- **Deployment progress UI**: Deploy button no longer shows alert+redirect immediately. Instead shows 3-step progress panel: config saved → containers starting → app initializing. Polls `GET /api/stacks/{name}` every 3s to track actual container health state. Handles running (auto-redirect), starting (keep polling), unhealthy (warning), exited (error), and 120s timeout. Shows elapsed time counter
- **Mealie healthcheck fix** (app-catalog-felhom.eu):
- `wget --spider` replaced with Python TCP socket check — mealie image doesn't include wget
- `start_period` increased to 60s (DB migrations take ~40s on first start)
- **Healthcheck audit**: filebrowser (Alpine, has BusyBox wget — OK), stirling-pdf (Ubuntu, has wget — OK)
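
The per-operation timing added to the stack operations might be wrapped in a small helper like this one. The helper name `timed` and the log message format are assumptions, not taken from the controller source.

```go
package main

import (
	"errors"
	"log"
	"time"
)

var errDemo = errors.New("compose exited with status 1") // demo error only

// timed runs fn, logs the elapsed time at INFO level through logf, and
// passes fn's error through unchanged.
func timed(op string, logf func(format string, args ...interface{}), fn func() error) error {
	start := time.Now()
	err := fn()
	elapsed := time.Since(start).Round(time.Millisecond)
	if err != nil {
		logf("[INFO] %s failed after %s: %v", op, elapsed, err)
		return err
	}
	logf("[INFO] %s completed in %s", op, elapsed)
	return nil
}

func main() {
	_ = timed("restart paperless-ngx", log.Printf, func() error {
		time.Sleep(20 * time.Millisecond) // stands in for the docker compose call
		return nil
	})
	_ = timed("start romm", log.Printf, func() error { return errDemo })
}
```
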
### Previously completed (2026-02-15 session 2)
- **Phase 4: Git Sync + App Catalog Audit** — major milestone
- **Git sync module** (`internal/sync/sync.go`):
- Clones/pulls app-catalog-felhom.eu repo to local cache on startup
- Periodic sync based on `git.sync_interval` (default 15m)
- Copies `docker-compose.yml` + `.felhom.yml` to stacks dir (never overwrites `app.yaml`/`.env`)
- SHA-256 content comparison — only writes changed files
- Triggers `ScanStacks()` after sync so dashboard updates immediately
- Uses `os/exec` git CLI — no Go git library dependency
- **Manual sync button** ("Sablonok frissítése") on Alkalmazások page:
- `POST /api/sync` endpoint with 30s debounce
- Toast notification shows result (success/failure/what changed)
- Auto-reloads page if new apps or updates detected
- **Sync status** added to `/api/system/info` (last_sync, last_status, syncing flag)
- **`.felhom.yml` files now cover all 10 apps** (9 created in this session; paperless-ngx already had one):
- actualbudget, docmost, filebrowser, homebox, immich, mealie, romm, stirling-pdf, vaultwarden
- All follow the same format: display_name, description, category, subdomain, resources, deploy_fields
- **Docker Compose templates audited and fixed** for all 10 apps:
- Fixed `{{DOMAIN}}` → `${DOMAIN}` syntax in homebox, mealie, romm, stirling-pdf
- Fixed `{{HDD_PATH}}` → `${HDD_PATH}` in romm
- Added `deploy.resources.limits.memory` to all services across all templates
- Added `TZ=Europe/Budapest` to all sidecar services (postgres, redis, mariadb)
- Added healthcheck to romm main service
- Added `romm-redis` `condition: service_healthy` (was `service_started`)
- Standardized header comment blocks across all templates
- **Documentation updated**: app-catalog README, CLAUDE.md, CONTEXT.md
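
The SHA-256 comparison used by the sync module to avoid rewriting unchanged files can be sketched as below. `writeIfChanged` and `syncDemo` are hypothetical names, and the demo writes to a temporary directory rather than the real stacks directory.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"os"
	"path/filepath"
)

// writeIfChanged writes content to path only when the file is missing or its
// SHA-256 digest differs. It reports whether a write happened.
func writeIfChanged(path string, content []byte) (bool, error) {
	existing, err := os.ReadFile(path)
	switch {
	case err == nil && sha256.Sum256(existing) == sha256.Sum256(content):
		return false, nil // identical content, leave the file untouched
	case err != nil && !os.IsNotExist(err):
		return false, err
	}
	if err := os.WriteFile(path, content, 0o644); err != nil {
		return false, err
	}
	return true, nil
}

// syncDemo applies three template versions in a row to a temp directory.
func syncDemo() ([]bool, error) {
	dir, err := os.MkdirTemp("", "felhom-sync")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	target := filepath.Join(dir, "docker-compose.yml")
	var results []bool
	for _, content := range []string{"image: a:1\n", "image: a:1\n", "image: a:2\n"} {
		changed, err := writeIfChanged(target, []byte(content))
		if err != nil {
			return nil, err
		}
		results = append(results, changed)
	}
	return results, nil
}

func main() {
	results, err := syncDemo()
	fmt.Println(results, err) // prints "[true false true] <nil>"
}
```

The returned flag is what would let a caller decide whether a rescan or page reload is worth triggering.
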
### Previously completed (2026-02-15 session 1)
- **Memory validation during deployment**:
- Pre-deploy memory check: compares `mem_request` sum against usable system RAM
- Hard block if requests exceed usable memory (total - 384MB reserved)
- Soft warning if `mem_limit` sum exceeds total RAM (overcommit OK for limits)
- `ParseMemoryMB()` supports "500M", "1G", "1.5G", "1024" formats
- `CommittedMemory()` sums requests/limits across all deployed stacks
- Memory summary bar shown on deploy page before user clicks deploy
- `system.reserved_memory_mb` configurable in controller.yaml (default: 384)
- **Display: `~` prefix on mem_request** in UI badges (display-only, exact value stored)
- **Felhom.eu logo** replaced text logos in sidebar and login page with actual SVG logo
- Logo SVG embedded as Go string constant, served at `/static/felhom-logo.svg`
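
The memory parsing and the hard-block rule can be approximated as follows. This is a sketch of the behavior described above, not the controller's `ParseMemoryMB()`: the lowercase `parseMemoryMB` and `fitsInMemory` are illustrative names, and treating a bare number such as "1024" as megabytes is an assumption.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// parseMemoryMB converts "500M", "1G", "1.5G" or "1024" into megabytes.
func parseMemoryMB(s string) (int, error) {
	s = strings.ToUpper(strings.TrimSpace(s))
	if s == "" {
		return 0, fmt.Errorf("empty memory value")
	}
	mult := 1.0
	switch s[len(s)-1] {
	case 'M':
		s = s[:len(s)-1]
	case 'G':
		mult = 1024
		s = s[:len(s)-1]
	}
	v, err := strconv.ParseFloat(s, 64)
	if err != nil || v < 0 || math.IsNaN(v) || math.IsInf(v, 0) {
		return 0, fmt.Errorf("invalid memory value %q", s)
	}
	return int(v * mult), nil
}

// fitsInMemory applies the hard block: committed requests plus the new
// request must not exceed total RAM minus the reserved amount.
func fitsInMemory(committedMB, requestMB, totalMB, reservedMB int) bool {
	return committedMB+requestMB <= totalMB-reservedMB
}

func main() {
	req, _ := parseMemoryMB("1.5G")
	fmt.Println(req, fitsInMemory(2000, req, 4096, 384))
}
```

With 4096 MB total and the default 384 MB reserve, 3712 MB is usable, so a 1536 MB request on top of 2000 MB committed still fits.
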
### Previously completed (2026-02-14)
- **System info bar on Vezérlőpult dashboard**: RAM, SSD, and optional HDD usage
- Progress bars with color coding (green < 70%, yellow 70-85%, red > 85%)
- New `internal/system` package reads `/proc/meminfo` + `syscall.Statfs`
- Platform-specific: Linux impl + non-Linux stub (build tags)
- Hungarian labels: "Memória", "SSD tárhely", "Külső HDD"
- **Docker Compose memory limits** on paperless-ngx template:
- paperless-webserver: 768M, postgres: 256M, redis: 128M
- Added `mem_limit` field to `.felhom.yml` ResourceHints (total: 1152M)
- **`/api/system/info` endpoint** now returns live system metrics (was customer info)
- **Config**: Added `paths.hdd_path` for external HDD monitoring
- Controller image builds via build.sh, pushes to Gitea container registry
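
The RAM bar and its color coding could be derived as in this sketch. The thresholds come from the notes above; the function names are hypothetical, and using `MemAvailable` (rather than `MemFree`) as the basis for "used" is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// memUsedPercent computes used memory from /proc/meminfo content.
func memUsedPercent(meminfo string) (float64, error) {
	var total, avail float64
	for _, line := range strings.Split(meminfo, "\n") {
		f := strings.Fields(line)
		if len(f) < 2 {
			continue
		}
		v, err := strconv.ParseFloat(f[1], 64)
		if err != nil {
			continue
		}
		switch f[0] {
		case "MemTotal:":
			total = v
		case "MemAvailable:":
			avail = v
		}
	}
	if total <= 0 {
		return 0, fmt.Errorf("MemTotal not found")
	}
	return 100 * (total - avail) / total, nil
}

// usageColor applies the thresholds: green < 70%, yellow 70-85%, red > 85%.
func usageColor(percent float64) string {
	switch {
	case percent < 70:
		return "green"
	case percent <= 85:
		return "yellow"
	default:
		return "red"
	}
}

func main() {
	sample := "MemTotal:       8000000 kB\nMemFree:         500000 kB\nMemAvailable:   2000000 kB\n"
	pct, err := memUsedPercent(sample)
	fmt.Println(pct, usageColor(pct), err)
}
```
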
### Previously completed (2026-02-13)
- Built the entire felhom-controller from scratch (Go, no frameworks)
- Debugged and fixed 7 issues during first real deployment:
1. Password validation (empty passwords accepted)
2. In-memory Deployed flag not updating after deploy
3. Health-aware state parsing (starting/unhealthy detection)
4. Random card ordering (Go map iteration)
5. "Részletek" button redirect for deployed apps
6. Paperless OCR language installation (LANGUAGES vs LANGUAGE env var)
7. Documentation: restart vs up -d for image updates
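
Issue 4 (random card ordering) is the usual consequence of ranging over a Go map, whose iteration order is unspecified. One common fix, shown here as a sketch with a hypothetical `Stack` type and function name, is to sort the keys before rendering.

```go
package main

import (
	"fmt"
	"sort"
)

// Stack is a reduced, hypothetical stand-in for the controller's stack type.
type Stack struct{ Deployed bool }

// sortedStackNames returns the map keys in a stable, alphabetical order so
// the cards render identically on every page load.
func sortedStackNames(stacks map[string]Stack) []string {
	names := make([]string, 0, len(stacks))
	for name := range stacks {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	stacks := map[string]Stack{"romm": {}, "immich": {}, "mealie": {Deployed: true}}
	fmt.Println(sortedStackNames(stacks))
}
```
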
### What's next (priorities)
1. **Test per-app backup** — enable backup for Paperless-ngx HDD data, trigger manual backup, verify restic snapshot includes HDD paths
2. **Test restore** — restore app data from snapshot, verify file recovery (now possible with /mnt:rw mount)
3. **Deploy Immich** — tests HDD path + secrets + multi-storage (biggest real-world test)
4. Add `app_info` + `optional_config` to more apps (Immich, Mealie, Vaultwarden)
5. Test on Raspberry Pi (pi-customer-1)
6. Self-update mechanism
7. Hub alerting (webhook to Healthchecks for stale customers)
8. Docker volume backup (mount `/var/lib/docker/volumes:ro` into controller)