deploy-felhom-compose/CHANGELOG.md
admin 1a8036d055 v0.11.8 — Per-App Cross-Drive Backup (3-2-1 rule)
New feature: backup app data to a secondary storage drive to satisfy
the "different media" requirement of the 3-2-1 backup rule.

- settings.go: CrossDriveBackup struct, AppBackupPrefs.CrossDrive field,
  getter/setter methods, GetOrCreateCrossDrivePassword, preserves
  cross-drive config when toggling nightly backup

- crossdrive.go (new): CrossDriveRunner with rsync and restic backends.
  Validates destination (mount point, writable), prevents source/dest
  overlap, per-app concurrency lock, persists last_run/status/size.

- main.go: wire CrossDriveRunner, register cross-drive-daily (03:30)
  and cross-drive-weekly (04:30 Sundays) scheduler jobs

- router.go: 4 new API endpoints — save config, trigger run, get status,
  run-all. Router now accepts Settings and CrossDriveRunner.

- server.go: Server struct accepts CrossDriveRunner, new web route
  POST /settings/cross-backup/{name}

- handlers.go: deployHandler populates CrossDriveConfig, BackupDestPaths,
  BackupDestWarning, AppBackupEnabled. settingsCrossBackupHandler saves
  config. backupsHandler builds CrossDriveSummary, UnconfiguredApps,
  CrossDriveWarnings for backup page.

- deploy.html: "Biztonsági mentés" card with destination/method/schedule
  dropdowns, last-run status, manual trigger button, flash messages.

- backups.html: "Másolatok másik meghajtóra" section with per-app
  status rows, unconfigured app warnings, "Összes futtatása most" button.

- style.css: margin-bottom fix for .deploy-stale-data, new cross-drive
  card and list styles.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 15:45:31 +01:00


Changelog

What was just completed (2026-02-17 session 35)

  • v0.11.8 — Per-App Cross-Drive Backup (3-2-1 rule, second copy on different media):
    • Feature: CrossDriveBackup data model. AppBackupPrefs extended with CrossDrive *CrossDriveBackup field in settings.go. New methods: GetCrossDriveConfig, SetCrossDriveConfig, UpdateCrossDriveStatus, GetAllCrossDriveConfigs, GetOrCreateCrossDrivePassword. Existing SetAppBackup/SetAppBackupBulk now preserve cross-drive config. Auto-generated restic password stored in settings.json.
    • Feature: CrossDriveRunner — New internal/backup/crossdrive.go. Supports rsync (simple mirror with --delete) and restic (versioned, deduplicated, shared repo). Safety guards: destination ≠ source, mount point check, writable check, per-app concurrency lock. RunAllScheduled(ctx, schedule) iterates all apps matching the given schedule. Status (last_run, last_status, last_error, last_duration, last_size_human) persisted to settings.json after each run.
    • Feature: Scheduler jobs — Two new daily jobs: cross-drive-daily at 03:30 (for apps with schedule: daily), cross-drive-weekly at 04:30 Sundays only (for schedule: weekly).
    • Feature: API endpoints — 4 new routes: POST /api/stacks/{name}/cross-backup, POST /api/stacks/{name}/cross-backup/run, GET /api/stacks/{name}/cross-backup/status, POST /api/backup/cross-drive/run-all.
    • Feature: Deploy/Settings page UI — New "Biztonsági mentés" card on the deploy page for apps with HDD data. Shows nightly backup toggle (read-only link), cross-drive dropdowns (destination, method, schedule), last run status, manual trigger button. States: no other storage (info message), configured, destination unreachable (warning). Flash messages on save redirect.
    • Feature: Backup page summary — New "Másolatok másik meghajtóra" section showing all configured apps with method, destination, last status, size. Warns about unconfigured apps with HDD data. Destination health warnings. "Összes futtatása most" button.
    • CSS: margin-bottom: 1.5rem added to .deploy-stale-data. New styles: .deploy-cross-drive, .cross-drive-list, .cross-drive-item, .cross-drive-header, .cross-drive-meta, .cross-drive-actions.
    • Files modified (10): settings/settings.go, backup/crossdrive.go (new), backup/backup.go, api/router.go, web/handlers.go, web/server.go, web/templates/deploy.html, web/templates/backups.html, web/templates/style.css, cmd/controller/main.go
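
The safety guards listed for CrossDriveRunner (destination ≠ source, per-app concurrency lock) could look roughly like the sketch below. It is a minimal illustration, not the code in crossdrive.go: the names pathsOverlap, within, and appLocks are hypothetical, and the overlap test assumes plain path comparison without resolving symlinks.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
	"sync"
)

// pathsOverlap reports whether a and b are the same directory
// or one of them contains the other (hypothetical helper).
func pathsOverlap(a, b string) bool {
	a, b = filepath.Clean(a), filepath.Clean(b)
	return within(a, b) || within(b, a)
}

// within reports whether child equals parent or lies beneath it.
func within(child, parent string) bool {
	if child == parent {
		return true
	}
	sep := string(filepath.Separator)
	if !strings.HasSuffix(parent, sep) {
		parent += sep // avoids /mnt/hdd_1 matching /mnt/hdd_10
	}
	return strings.HasPrefix(child, parent)
}

// appLocks hands out one non-blocking lock per app name (assumed design).
type appLocks struct {
	mu      sync.Mutex
	running map[string]bool
}

func newAppLocks() *appLocks { return &appLocks{running: make(map[string]bool)} }

// tryLock returns false if a run for this app is already in progress.
func (l *appLocks) tryLock(app string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.running[app] {
		return false
	}
	l.running[app] = true
	return true
}

// unlock releases the app so a later run can start.
func (l *appLocks) unlock(app string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	delete(l.running, app)
}

func main() {
	fmt.Println(pathsOverlap("/mnt/hdd_1/immich", "/mnt/hdd_1"))         // true
	fmt.Println(pathsOverlap("/mnt/hdd_1/immich", "/mnt/hdd_10/backup")) // false
	locks := newAppLocks()
	fmt.Println(locks.tryLock("immich"), locks.tryLock("immich")) // true false
}
```

Appending the separator before the prefix test is what keeps sibling mounts such as /mnt/hdd_1 and /mnt/hdd_10 from being treated as overlapping.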

What was just completed (2026-02-17 session 34)

  • v0.11.7 — Stale Data Cleanup + FileBrowser Sync + UI Title Fix:
    • Feature: Stale data cleanup — After app data migration, the deploy/settings page now shows leftover data on previous storage paths with size info and a delete button. Two-step confirmation required before deletion. Protected paths (storage root, media, Dokumentumok, appdata) cannot be deleted. Also available immediately after migration on the migration-done page.
    • Fix: FileBrowser sync after migration. syncFileBrowserMounts() is now called after a successful data migration, ensuring FileBrowser mounts reflect the current storage layout.
    • Fix: Deploy page title — Already-deployed apps now show "Beállítások" (Settings) instead of "Telepítés" (Deploy) in both the browser page title and the <h2> heading.
    • Internal: Exported ProtectedHDDPaths() from stacks package for reuse in web handlers.
    • Files modified (6): internal/stacks/delete.go, internal/web/handlers.go, internal/web/storage_handlers.go, internal/web/templates/deploy.html, internal/web/templates/migrate.html, internal/web/templates/style.css
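
The protected-path rule for stale data deletion can be sketched as below. This is an assumption-laden illustration: it treats media, Dokumentumok, and appdata as direct children of the storage root, and isProtected is a hypothetical name (the changelog only states that ProtectedHDDPaths() exists in the stacks package).

```go
package main

import (
	"fmt"
	"path/filepath"
)

// protectedSubdirs are assumed to be direct children of the storage root.
var protectedSubdirs = []string{"media", "Dokumentumok", "appdata"}

// isProtected reports whether target is the storage root itself
// or one of the protected top-level directories.
func isProtected(storageRoot, target string) bool {
	root := filepath.Clean(storageRoot)
	t := filepath.Clean(target)
	if t == root {
		return true
	}
	for _, sub := range protectedSubdirs {
		if t == filepath.Join(root, sub) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isProtected("/mnt/hdd_1", "/mnt/hdd_1/appdata/")) // true
	fmt.Println(isProtected("/mnt/hdd_1", "/mnt/hdd_1/immich"))   // false
}
```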

What was just completed (2026-02-17 session 33)

  • v0.11.6 — FileBrowser Auto-Mount Sync + UI Polish (3 fixes):
    • Feature: FileBrowser auto-mount sync — Added syncFileBrowserMounts() and generateFileBrowserCompose() to handlers.go. After a storage path is added (via storage init wizard) or removed, the controller regenerates /opt/docker/stacks/filebrowser/docker-compose.yml with volume mounts for all registered paths (/mnt/hdd_1:/srv/hdd_1 etc.), then recreates the FileBrowser container. Domain is read from FileBrowser's .env. If FileBrowser isn't deployed, the function silently returns. The generated compose is self-contained (no env vars).
    • UI Fix 1: Badge color fix. settings.html: changed "Nincs csatolva!" (red state-red) badge to "Rendszermeghajtón" (yellow badge-warn). The path is on the system SSD, which isn't an error, just informational. Added .badge-warn { background: rgba(250, 204, 21, 0.15); color: #facc15; } to style.css.
    • UI Fix 2: Progress bar fix. storage_init.html: replaced the disk-usage gradient progress bar (green→yellow→red zones, alarming at 30%) with a clean single-color progress-bar-task bar. Added .progress-bar-task and .progress-bar-task .progress-fill CSS classes to style.css.
    • UI Fix 3: Button text fix. settings.html: "Alapértelmezett" button (reads as status, confusing) → "Legyen alapértelmezett" (clear action verb).
    • Files modified (5): web/handlers.go, web/storage_handlers.go, web/templates/settings.html, web/templates/storage_init.html, web/templates/style.css
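
The mount mapping described for the generated FileBrowser compose file (/mnt/hdd_1:/srv/hdd_1 etc.) could be derived as in the sketch below. The helpers volumeMounts and renderVolumes are hypothetical, and the YAML indentation is only illustrative of a service-level volumes list.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// volumeMounts maps each registered storage path to a
// "<host path>:/srv/<last path element>" bind mount string.
func volumeMounts(storagePaths []string) []string {
	mounts := make([]string, 0, len(storagePaths))
	for _, p := range storagePaths {
		p = filepath.Clean(p)
		mounts = append(mounts, p+":/srv/"+filepath.Base(p))
	}
	return mounts
}

// renderVolumes formats the mounts as a YAML "volumes:" list fragment.
func renderVolumes(mounts []string) string {
	var b strings.Builder
	b.WriteString("    volumes:\n")
	for _, m := range mounts {
		fmt.Fprintf(&b, "      - %s\n", m)
	}
	return b.String()
}

func main() {
	fmt.Print(renderVolumes(volumeMounts([]string{"/mnt/hdd_1", "/mnt/hdd_2/"})))
}
```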

What was just completed (2026-02-17 session 32)

  • v0.11.4 — Bugfix: Storage Initialization (FormatAndMount) — 3 bugs + 4 safety improvements:
    • Bug 1 (sfdisk): Added wipefs -a before sfdisk; changed sfdisk input from ,,,L (unsupported GPT type shorthand) to ,, (default Linux GUID); added --force --wipe always flags. The previous partition table was confusing sfdisk, and the L type was not accepted for GPT.
    • Bug 2 (mount): Replaced mount mountPath (fstab lookup — uses container's /etc/fstab, not host's) with explicit mount -t ext4 -o defaults,noatime /host-dev/sdb1 /mnt/hdd_1. fstab entry still written to /host-fstab for host reboot persistence.
    • Bug 3 (mount propagation): Changed /mnt volume in compose to long-form bind with propagation: rshared. Also ran mount --bind /mnt /mnt && mount --make-rshared /mnt on demo host. Confirmed Propagation=rshared in docker inspect. Mounts created inside container now propagate to host.
    • Safety 1 (post-mount verification): Added findmnt check after mount — fails with clear error if mount isn't actually visible.
    • Safety 2 (ASCII label): Use req.MountName (always ASCII) for ext4 -L label (16-byte limit). Display label (req.Label, may contain UTF-8 Hungarian chars) stays only in settings.json.
    • Safety 3 (smart partition): In storageInitAPIHandler, if disk has exactly 1 empty partition (no filesystem), skip wipefs+sfdisk entirely and format existing partition directly. Handles demo sdb case (sdb1 exists, no FS).
    • Safety 4 (progress messages): Updated send() calls to include command details (device paths, flags) for remote debugging via UI progress panel.
    • Files modified (3): storage/format_linux.go, docker-compose.yml, web/storage_handlers.go
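
Safety 2 relies on the ext4 volume label being at most 16 bytes and plain ASCII. A minimal validation sketch, assuming a hypothetical ext4Label helper that rejects rather than truncates unsuitable names:

```go
package main

import (
	"errors"
	"fmt"
)

// maxExt4LabelBytes is the ext4 volume label limit.
const maxExt4LabelBytes = 16

// ext4Label validates that mountName is usable as an ext4 label:
// non-empty, ASCII only, and at most 16 bytes.
func ext4Label(mountName string) (string, error) {
	if mountName == "" {
		return "", errors.New("label must not be empty")
	}
	if len(mountName) > maxExt4LabelBytes {
		return "", fmt.Errorf("label %q is %d bytes, limit is %d",
			mountName, len(mountName), maxExt4LabelBytes)
	}
	for i := 0; i < len(mountName); i++ {
		if mountName[i] >= 0x80 {
			return "", fmt.Errorf("label %q contains non-ASCII bytes", mountName)
		}
	}
	return mountName, nil
}

func main() {
	fmt.Println(ext4Label("hdd_1"))
	fmt.Println(ext4Label("Külső meghajtó"))
}
```

Checking the byte length rather than the rune count matters here, because a UTF-8 display label with Hungarian characters can exceed 16 bytes while looking shorter.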

What was just completed (2026-02-17 session 31)

  • v0.11.3 — Bugfix: Missing sfdisk in container (fdisk package):
    • sfdisk is in the fdisk package on Debian bookworm, not util-linux. Dockerfile had util-linux but not fdisk, so sfdisk was missing and partitioning failed.
    • Added fdisk to Dockerfile's apt-get install list. Updated comment to clarify which package provides what.
    • Verified: all six disk tools now present in container (sfdisk, mkfs.ext4, blkid, mount, lsblk, partprobe).
    • Files modified (1): Dockerfile

What was just completed (2026-02-17 session 30)

  • v0.11.2 — Bugfix: /dev/sdb not accessible inside container:
    • Root cause: Docker always creates a fresh tmpfs at /dev inside containers. Even with privileged: true, the bind mount - /dev:/dev is silently dropped. Block device nodes like /dev/sdb don't exist inside the container.
    • Fix: Mount host /dev at /host-dev instead. With privileged: true, the kernel allows I/O to the device nodes regardless of path inside the container.
    • docker-compose.yml: Changed - /dev:/dev → - /dev:/host-dev:rw. Also applied missing privileged: true, /etc/fstab:/host-fstab, and /run/udev:/run/udev:ro to demo node's live compose (never applied after v0.11.0).
    • safety.go: Added HostDevPath = "/host-dev" constant and HostDevicePath(devPath) string helper (/dev/sdb → /host-dev/sdb).
    • format_linux.go: All device operations (os.Stat, sfdisk, partprobe, mkfs.ext4, blkid UUID) use HostDevicePath().
    • safety_linux.go: IsSystemDisk() stats device via HostDevicePath().
    • scan_linux.go: enrichWithBlkid() probes each partition individually (blkid -o value -s TYPE/UUID/LABEL /host-dev/sdXN) instead of batch blkid -o export (which fails when /dev is Docker's minimal tmpfs).
    • Verified: /host-dev/sda, /host-dev/sdb, partitions visible; blkid /host-dev/sdb1 returns correct UUID/fstype/label.
    • Files modified (5): storage/safety.go, storage/safety_linux.go, storage/format_linux.go, storage/scan_linux.go, docker-compose.yml
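
The path translation performed by HostDevicePath() can be sketched as follows. The constant and function names come from the entry above; the body is an assumption about how the mapping might be written, including the choice to return non-/dev paths unchanged.

```go
package main

import (
	"fmt"
	"strings"
)

// HostDevPath is where the host's /dev is mounted inside the container.
const HostDevPath = "/host-dev"

// HostDevicePath rewrites a host device path such as /dev/sdb to its
// location inside the container. Other paths are returned unchanged
// (an assumption made for this sketch).
func HostDevicePath(devPath string) string {
	if strings.HasPrefix(devPath, "/dev/") {
		return HostDevPath + strings.TrimPrefix(devPath, "/dev")
	}
	return devPath
}

func main() {
	fmt.Println(HostDevicePath("/dev/sdb"))  // /host-dev/sdb
	fmt.Println(HostDevicePath("/dev/sdb1")) // /host-dev/sdb1
}
```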

What was just completed (2026-02-17 session 29)

  • v0.11.1 — Bugfix: Storage Scan — System Disk Detection & FSType in Container:
    • Bug 1 fix: System disk detection — Replaced mount-point string comparison (== "/", "/boot", "/boot/efi") with host fstab parsing. Inside the container, lsblk reports container mount points (e.g. /opt/docker/felhom-controller/data), not host mount points. New getSystemDiskNames() reads /host-fstab (fallback: /etc/fstab), finds system entries (/, /boot, /boot/efi, swap), resolves UUID= entries to device paths via blkid -U, and marks parent disks as system. partitionToParentDisk() handles both standard (sda2→sda) and NVMe (nvme0n1p2→nvme0n1) naming.
    • Bug 2 fix: FSType enrichment. lsblk returns null fstype in containers (udev/blkid cache incomplete). New enrichWithBlkid() runs blkid -o export after lsblk scan and fills in missing FSType, UUID, Label per partition from direct device probing. Runs on both AvailableDisks and SystemDisks.
    • Result: sda (system SSD) now correctly appears in SystemDisks; sdb (USB HDD) appears in AvailableDisks; partition fstypes (vfat/ext4/swap) correctly shown; sdb1 genuinely shows "(nincs fájlrendszer)".
    • Files modified (1): storage/scan_linux.go
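
The two naming schemes handled by partitionToParentDisk() (sda2→sda, nvme0n1p2→nvme0n1) can be illustrated with the sketch below. The function name is taken from the entry above; the regular expressions are an assumption about one way to implement it.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// NVMe partitions end in p<N>, for example nvme0n1p2.
	nvmePart = regexp.MustCompile(`^(nvme\d+n\d+)p\d+$`)
	// Standard partitions are letters followed by digits, for example sda2.
	stdPart = regexp.MustCompile(`^([a-z]+)\d+$`)
)

// partitionToParentDisk returns the whole-disk name for a partition name.
// Names that match neither scheme are returned unchanged.
func partitionToParentDisk(part string) string {
	if m := nvmePart.FindStringSubmatch(part); m != nil {
		return m[1]
	}
	if m := stdPart.FindStringSubmatch(part); m != nil {
		return m[1]
	}
	return part
}

func main() {
	fmt.Println(partitionToParentDisk("sda2"))      // sda
	fmt.Println(partitionToParentDisk("nvme0n1p2")) // nvme0n1
	fmt.Println(partitionToParentDisk("nvme0n1"))   // nvme0n1
}
```

The NVMe pattern is tried first because a whole-disk name such as nvme0n1 also ends in a digit and must not be shortened.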

What was just completed (2026-02-17 session 28)

  • v0.11.0 — Phase C: Storage Init, Data Migration & Startup Fixes:
    • Step 0: Startup ping + hub report — Controller now fires heartbeat ping, system_health ping, and hub report immediately on startup (5s delay) instead of waiting for first scheduler tick (5-15 min). hubPusher instance created once and reused for both startup and periodic reports. Prevents Healthchecks showing stale "Last Ping: X ago" after restarts.
    • Step 1-3: Storage initialization wizard — New internal/storage/ package (scan.go, format.go, safety.go, format_linux.go, safety_linux.go, scan_linux.go + non-linux stubs). ScanDisks() via lsblk -J. FormatAndMount() with progress channel (partition via sfdisk → mkfs.ext4 → blkid UUID → fstab backup + UUID-based entry → mount → chown + subdirs). Safety guards: system disk detection via major device numbers, mount path conflict, confirmation "FORMÁZÁS" required. New wizard page at /settings/storage/init. JSON API endpoints at /api/storage/scan, /api/storage/init, /api/storage/init/status. Auto-registers storage path in settings.json after success.
    • Step 4-5: Data migration — New MigrateAppData() in internal/storage/migrate.go. Per-app "Mozgatás" button on deploy page (for deployed apps with HDD data) and settings page storage app list. Migration flow: stop app → rsync with --info=progress2 progress parsing → update app.yaml HDD_PATH → start app. Rollback on failure (revert config + restart with original path). Old data preserved. New migration page at /stacks/{name}/migrate. JSON API at /api/storage/migrate, /api/storage/migrate/status.
    • Step 6: Per-app storage display — Deploy page (read-only mode) now shows "Adattárolás" section for deployed apps: current path + label, data size, free space. "Mozgatás" link shown when other storage paths exist.
    • Step 7: Container setup — Added privileged: true to docker-compose.yml. New volume mounts: /dev:/dev, /etc/fstab:/host-fstab, /run/udev:/run/udev:ro. Docker socket changed from :ro to writable. Dockerfile adds: util-linux, e2fsprogs, rsync, parted.
    • Storage API routing — New /api/storage/ prefix registered in main.go before /api/ catch-all (longer prefix takes priority in Go ServeMux). ServeStorageAPI method on web.Server handles all storage JSON endpoints.
    • CSS additions: .disk-step, .disk-step-active, .disk-step-done, .disk-progress-steps, .disk-progress-bar-wrap, .deploy-storage-info styles.
    • Files created (13): storage/scan.go, storage/scan_linux.go, storage/scan_other.go, storage/safety.go, storage/safety_linux.go, storage/safety_other.go, storage/format.go, storage/format_linux.go, storage/format_other.go, storage/migrate.go, web/storage_handlers.go, templates/storage_init.html, templates/migrate.html
    • Files modified (8): main.go, web/server.go, web/handlers.go, templates/settings.html, templates/deploy.html, templates/style.css, docker-compose.yml, Dockerfile
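
The migration flow parses rsync's --info=progress2 output to report progress. A minimal sketch of extracting the percentage from one output line is shown below; parseProgress is a hypothetical name, and the sample line is only illustrative of the usual progress2 layout.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// percentRe captures the overall percentage printed by rsync.
var percentRe = regexp.MustCompile(`(\d{1,3})%`)

// parseProgress extracts the percentage from one rsync progress line.
// The second result is false when the line carries no percentage.
func parseProgress(line string) (int, bool) {
	m := percentRe.FindStringSubmatch(line)
	if m == nil {
		return 0, false
	}
	n, err := strconv.Atoi(m[1])
	if err != nil || n > 100 {
		return 0, false
	}
	return n, true
}

func main() {
	// Illustrative sample line, not captured from a real run.
	line := "  1,234,567  45%   12.34MB/s    0:00:10 (xfr#1, to-chk=0/1)"
	fmt.Println(parseProgress(line))                            // 45 true
	fmt.Println(parseProgress("sending incremental file list")) // 0 false
}
```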

What was just completed (2026-02-17 session 27)

  • v0.10.0 — Phase B: Storage Management UI Polish & Health Severity Fix:
    • Step 0: Health severity fix. checkStoragePaths() mount-point check reclassified from issue (FAIL) to warning (WARN). All storage health messages translated to Hungarian. Added .monitoring-banner-warn CSS class for yellow warning banners. Prevents false FAIL status on demo/test environments where storage is intentionally on SSD.
    • Step 1: Success flash messages — All 4 storage handlers (add/remove/set-default/toggle-schedulable) now redirect with ?storage_msg=success&storage_detail=... query params. Settings page displays green "alert-info" flash on success. Consistent with backup page flash pattern.
    • Step 2: Edit storage path labels — New SetStorageLabel() method in settings.go. New POST /settings/storage/label route + handler. Inline edit UI with ✏️ button, text input, OK/Cancel. Added .btn-ghost CSS class.
    • Step 3: App details per storage path — Settings page now shows expandable <details> list per storage path with app names, sizes, and links to deploy page. New StorageAppDetail struct + appDetailsForPath() helper. Added CSS for .storage-app-details, .storage-app-list, .storage-app-row.
    • Step 4: Storage badge on stacks page — Deployed app cards show "💾 Label" badge indicating which registered storage path the app uses. StorageLabels map built from deployed apps' HDD_PATH → registered storage path label lookup. Added .meta-badge-storage CSS.
    • Step 5: Deploy dropdown enhancements — Storage path dropdown now shows free space ("234 GB szabad"). DeployStoragePath struct wraps StoragePath with FreeHuman/FreePercent from GetDiskUsage(). JS checkStorageSpace() shows yellow warning when selected storage has <20% free.
    • Step 6: Filesystem & disk info — New FSInfo struct + GetFSInfo() in mounts_linux.go using findmnt command + /sys/block/ sysfs reads for disk model. Settings page shows "ext4 · /dev/sdb1 · WD Elements" below disk usage bar. Non-Linux stub returns nil.
    • Step 7: Backup page storage context — Added StorageLabel field to AppBackupInfo. Backup page shows storage label badge per app by matching HDD path prefixes against registered storage paths. Uses existing .meta-badge-storage CSS.
    • Files modified (12): healthcheck.go, settings.go, mounts_linux.go, mounts_other.go, appdata.go, handlers.go, server.go, settings.html, stacks.html, deploy.html, backups.html, style.css
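
Steps 4 and 7 resolve an app's HDD path to the label of the registered storage path that contains it. A sketch of that prefix lookup follows; the StoragePath type here is trimmed to two fields, and labelForPath and its longest-prefix rule are assumptions for illustration.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// StoragePath is a trimmed version of the registry entry (Path and Label only).
type StoragePath struct {
	Path  string
	Label string
}

// labelForPath returns the label of the registered storage path that
// contains hddPath, preferring the longest match. It returns "" if none match.
func labelForPath(hddPath string, registered []StoragePath) string {
	clean := filepath.Clean(hddPath)
	best, bestLen := "", -1
	for _, sp := range registered {
		root := filepath.Clean(sp.Path)
		inside := clean == root ||
			strings.HasPrefix(clean, root+string(filepath.Separator))
		if inside && len(root) > bestLen {
			best, bestLen = sp.Label, len(root)
		}
	}
	return best
}

func main() {
	paths := []StoragePath{
		{Path: "/mnt/hdd_1", Label: "Main"},
		{Path: "/mnt/hdd_10", Label: "Archive"},
	}
	fmt.Println(labelForPath("/mnt/hdd_10/immich", paths)) // Archive
	fmt.Println(labelForPath("/mnt/hdd_1/paperless", paths)) // Main
}
```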

What was previously completed (2026-02-17 session 26)

  • v0.9.0 — Phase A: Storage Paths Foundation & Backup Toggle Fix:
    • Root cause: Per-app backup toggles (v0.8.0) didn't appear because controller.yaml had no paths.hdd_path set → ParseComposeHDDMounts returned nil. Even with global hdd_path, apps with different HDD_PATH values wouldn't match.
    • Core fix: Per-app HDD_PATH resolution. stackAdapter.GetStackHDDMounts() now reads each app's own HDD_PATH from its app.yaml env section (Priority 1), falling back to all registered storage paths (Priority 2). Removed dependency on global cfg.Paths.HDDPath.
    • Storage paths registry (settings.json) — new StoragePath struct with Path, Label, IsDefault, Schedulable, AddedAt. Thread-safe CRUD methods in settings.go (Get/Add/Remove/SetDefault/SetSchedulable). Multiple external storage paths supported.
    • Auto-discovery — On startup, discoverHDDPaths() scans deployed apps' app.yaml for HDD_PATH values. AutoDiscoverStoragePaths() registers discovered paths with inferred labels. Legacy cfg.Paths.HDDPath used as fallback.
    • Mount-point validation — New mounts_linux.go (build-tagged): IsMountPoint() via syscall.Stat_t.Dev comparison, IsWritable(), PathsOverlap(), GetDiskUsage() via syscall.Statfs. Non-Linux stubs in mounts_other.go.
    • Settings page "Adattárolók" section — Lists registered paths with label, path, disk usage bar, app count, badges (default/active/unmounted). Actions: set default, toggle schedulable, remove (with guards). Expandable "Új adattároló hozzáadása" form with 5-step validation (exists, mount point, writable, no overlap, no duplicate).
    • Deploy page storage dropdown. The path field type renders as a <select> dropdown of schedulable storage paths. Falls back to text input with warning if no paths registered.
    • Health check storage monitoring. RunHealthCheck() now accepts storagePaths parameter. Checks: path accessible (warning), not a mount point (issue: data writes to SSD!), disk usage ≥95% (issue) / ≥90% (warning).
    • Controller docker-compose.yml — Changed HDD mount from ${HDD_PATH:-/mnt/hdd_placeholder}:...:ro to /mnt:/mnt:rw for multi-storage support + restore capability.
    • Removed unused hddPath param from DiscoverAppData() signature in backup/appdata.go.
    • Files created (2): system/mounts_linux.go, system/mounts_other.go
    • Files modified (12): settings.go, main.go, appdata.go, backup.go, handlers.go, server.go, settings.html, deploy.html, style.css, healthcheck.go, docker-compose.yml, report/builder.go
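
The disk usage thresholds in the storage health check (≥95% issue, ≥90% warning) amount to a small classification step. The sketch below uses hypothetical names (severity, classifyDiskUsage) and covers only the usage check, not the accessibility or mount-point checks.

```go
package main

import "fmt"

// severity is the outcome of a single health check (names are illustrative).
type severity string

const (
	sevOK      severity = "ok"
	sevWarning severity = "warning"
	sevIssue   severity = "issue"
)

// classifyDiskUsage applies the thresholds from the changelog entry:
// 95% and above is an issue, 90% and above is a warning.
func classifyDiskUsage(usedPercent float64) severity {
	switch {
	case usedPercent >= 95:
		return sevIssue
	case usedPercent >= 90:
		return sevWarning
	default:
		return sevOK
	}
}

func main() {
	for _, pct := range []float64{42, 90, 94.9, 95} {
		fmt.Printf("%.1f%% -> %s\n", pct, classifyDiskUsage(pct))
	}
}
```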

What was previously completed (2026-02-16 session 25)

  • v0.8.0 — Phase 7: Storage Overview, Per-App Backup Toggles & Limited Restore:
    • Storage overview on backup page — new "Tárhely áttekintés" section as first section on backup page showing SSD/HDD progress bars + backup repo stats (repo size, dump file count, snapshot count). Reuses existing system.GetInfo() and RepoStats.
    • Restic password visibility — new "Titkosítási kulcs" section inside the repository card. Masked password field with show/copy buttons (JS toggle). Password synced to hub via periodic report for disaster recovery (ResticPassword field added to BackupReport).
    • App data discovery — new internal/backup/appdata.go:
      • StackDataProvider interface to avoid circular imports between backup and stacks packages
      • AppBackupInfo, AppDataPath, AppDockerVolume structs
      • DiscoverAppData() iterates deployed stacks, discovers HDD bind mounts (via adapter calling ParseComposeHDDMounts), Docker named volumes (via parseComposeNamedVolumes using YAML parser), and DB dump status
      • Stack adapter in main.go implements StackDataProvider using stacks.Manager
    • Per-app backup toggles — new "Alkalmazás adatok" section on backup page:
      • Toggle checkbox per app (only for apps with HDD data)
      • Shows HDD paths with sizes, Docker volume info, DB dump notes
      • POST /settings/app-backup handler saves preferences to settings.json
      • AppBackupPrefs struct + bulk getter/setter in settings.go
      • RefreshCache() populates AppDataInfo via DiscoverAppData()
    • Dynamic backup paths. RunBackup() now includes enabled app HDD data paths:
      • resolveAppBackupPaths() reads enabled apps from settings, resolves HDD paths via provider
      • Paths logged at INFO level, included in restic snapshot
      • BackupPaths display on backup page includes app data paths
    • Limited app restore — new restore section on backup page:
      • RestoreApp() in restore.go: validates enabled, resolves HDD paths, validates snapshot exists, uses running mutex
      • RestoreAppData() on ResticManager: runs restic restore with --include flags for specific paths
      • POST /backup/restore web handler with confirmation flow
      • GET /api/backup/snapshots JSON endpoint for restore dropdown
      • UI: app/snapshot dropdowns, warning box, confirmation checkbox, JS-driven form submission
    • Exported ParseComposeHDDMounts from stacks package (was unexported parseComposeHDDMounts)
    • Flash messages on backup page via query params (success/error redirects from handlers)
    • CSS: New styles for storage overview grid, app backup toggles, encryption key field, restore section, flash messages
    • Files created: appdata.go, restore.go
    • Files modified: backup.go, restic.go, handlers.go, server.go, backups.html, style.css, settings.go, delete.go, router.go, types.go, builder.go, main.go
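
RestoreAppData() is described as running restic restore with --include flags for specific paths. A sketch of assembling such an argument list is below; restoreArgs is a hypothetical helper, and restoring to target "/" is an assumption about how in-place restore might be requested.

```go
package main

import (
	"fmt"
	"strings"
)

// restoreArgs builds the argument list for "restic restore", limiting
// the restore to the given paths with one --include flag per path.
func restoreArgs(snapshotID, target string, paths []string) []string {
	args := []string{"restore", snapshotID, "--target", target}
	for _, p := range paths {
		args = append(args, "--include", p)
	}
	return args
}

func main() {
	args := restoreArgs("latest", "/", []string{"/mnt/hdd_1/immich", "/mnt/hdd_1/paperless"})
	fmt.Println("restic " + strings.Join(args, " "))
}
```

Passing the arguments as a slice to exec.Command, rather than joining them into a shell string, avoids quoting problems with paths that contain spaces.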

What was previously completed (2026-02-16 session 24)

  • v0.7.2 — Fix Notification Preferences Sync (Controller → Hub):
    • Two repos changed (deploy-felhom-compose + felhom.eu):
    • Hub: POST /api/v1/preferences endpoint (hub/internal/api/handler.go):
      • New route in API handler: same Bearer token auth as /report and /notify
      • Accepts JSON payload: {customer_id, email, enabled_events}
      • Calls existing store.SaveNotificationPrefs() — no store changes needed
      • Logs preference updates at INFO level
    • Hub: Notification section on customer detail page (hub/internal/web/, hub/internal/store/store.go):
      • New GetRecentNotifications() store method returns last N notification_log entries
      • handleCustomerDetail() loads NotifPrefs + RecentNotifications
      • joinStrings template function added for event list display
      • customer.html template: new "Notifications" section showing email, events, and last 10 notification log entries (time, event, status, message)
    • Controller: SyncPreferences method (internal/notify/notifier.go):
      • New preferencesRequest struct for JSON payload
      • SyncPreferences(email, enabledEvents) — synchronous POST to hub /api/v1/preferences
      • IsEnabled() getter for checking hub connectivity
      • Hungarian error messages for user-facing feedback
    • Controller: Sync on settings save (internal/web/handlers.go):
      • settingsNotificationsHandler now calls SyncPreferences after saving to settings.json
      • Three flash message variants: success (synced), warning (local save OK, sync failed), error (save failed)
      • Local save always succeeds even if hub sync fails
    • Controller: Sync on startup (cmd/controller/main.go):
      • Non-blocking goroutine syncs preferences to hub when controller starts
      • Only runs if hub is enabled and email is configured
      • Handles hub DB rebuild recovery (re-populates preferences after hub redeployment)
    • Files changed: hub (4 files: handler.go, store.go, server.go, customer.html), controller (3 files: notifier.go, handlers.go, main.go)
    • Documentation: README.md updated (version, notify module, phase checklist), CONTEXT.md updated
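
The preferences sync is a Bearer-authenticated JSON POST to /api/v1/preferences with the fields customer_id, email, and enabled_events. The sketch below only builds the request; the struct name matches the entry above, but the field types, the newPreferencesRequest helper, and the example hub URL are assumptions.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// preferencesRequest mirrors the JSON payload named in the changelog.
// EnabledEvents as a string slice is an assumption.
type preferencesRequest struct {
	CustomerID    string   `json:"customer_id"`
	Email         string   `json:"email"`
	EnabledEvents []string `json:"enabled_events"`
}

// newPreferencesRequest builds the POST request without sending it.
func newPreferencesRequest(hubURL, token string, p preferencesRequest) (*http.Request, error) {
	body, err := json.Marshal(p)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, hubURL+"/api/v1/preferences", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newPreferencesRequest("https://hub.example", "secret-token", preferencesRequest{
		CustomerID:    "demo",
		Email:         "admin@example.com",
		EnabledEvents: []string{"disk_warning", "backup_failed"},
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL)
}
```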

What was previously completed (2026-02-16 session 23)

  • v0.7.1 — Phase 2: Monitoring Warnings, Dashboard Alerts & Notification System:
    • Three workstreams across two repos (deploy-felhom-compose + felhom.eu):
    • Monitoring page "Távoli monitoring" section (monitoring.html, handlers.go):
      • New section between System Overview and System Metrics showing healthcheck ping UUID status
      • 5 rows: Heartbeat, System Health, DB Dump, Backup, Backup Integrity — each shows configured or ⚠️ missing
      • Banner: green (all configured), yellow (some missing), red (monitoring disabled)
      • isPingConfigured() helper checks non-empty AND not "CHANGEME" prefix
    • Dashboard alert banners (new alerts.go, layout.html):
      • AlertManager struct with Refresh() + GetAlerts() — generates alerts from health report, missing pings, backup disabled
      • Alert types: Alert{ID, Level, Message, Link, LinkText} — levels: error/warning/info
      • Renders colored banners (red/yellow/blue) after <main class="content"> on all pages
      • Caps at 5 alerts with "+N more" overflow; monitoring page excludes "pings-missing" (shown in table instead)
      • Refreshed every 5 min via system-health scheduler task + once at startup
    • Hub notification relay (felhom.eu repo — hub/internal/api/handler.go, hub/internal/store/store.go):
      • POST /api/v1/notify endpoint: Bearer auth, JSON payload (customer_id, event_type, severity, message, details)
      • New customer_notifications table (email, enabled_events JSON) + notification_log audit table
      • Resend email integration: direct HTTP POST to https://api.resend.com/emails
      • Hungarian email template with event details, timestamp, severity
      • hub.yaml.example updated with notifications config section
    • Controller-side notifier (new internal/notify/notifier.go):
      • Notifier struct: fires HTTP POST to hub /api/v1/notify, non-blocking (goroutine)
      • Cooldown tracking per event type (default 6h, configurable via UI)
      • Checks notification preferences (email configured + event enabled) before sending
      • NotifyHealthChange(): only notifies on status degradation (ok→warn, ok→fail, warn→fail)
      • NotifyBackupFailed/NotifyDBDumpFailed/NotifyIntegrityFailed convenience methods
      • SendTest() for test email flow
      • Wired into scheduler: system-health task calls NotifyHealthChange(), backup tasks call failure notifiers
    • Notification preferences UI (settings.html, handlers.go):
      • New "Értesítések" Section C on Settings page (only shown when hub enabled)
      • Email input, 4 event checkboxes (disk_warning, backup_failed, update_available, security_update)
      • Cooldown hours input (default 6)
      • "Mentés" + "Teszt email küldése" buttons
      • Saved to settings.json via NotificationPrefs struct (Email, EnabledEvents, CooldownHours)
    • Settings persistence expanded (settings.go):
      • NotificationPrefs struct with Email, EnabledEvents, CooldownHours
      • DefaultEnabledEvents: disk_warning, backup_failed, update_available
      • GetNotificationPrefs() returns defaults if nil, SetNotificationPrefs() saves atomically
    • Files changed: 3 new (alerts.go, notifier.go, notify package), ~12 modified across both repos
    • Deployed: Controller v0.7.1 to demo-felhom.eu, verified healthy (0 alerts on clean system)
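
Two notifier rules from this entry, notifying only on status degradation (ok→warn, ok→fail, warn→fail) and a per-event cooldown, can be sketched as below. The names isDegradation and cooldown are hypothetical, and the real notifier presumably also guards its state with a mutex, which this single-goroutine sketch omits.

```go
package main

import (
	"fmt"
	"time"
)

// statusRank orders health statuses from best to worst.
var statusRank = map[string]int{"ok": 0, "warn": 1, "fail": 2}

// isDegradation reports whether the status got strictly worse.
// Unknown statuses never count as a degradation.
func isDegradation(prev, cur string) bool {
	p, okPrev := statusRank[prev]
	c, okCur := statusRank[cur]
	return okPrev && okCur && c > p
}

// cooldown suppresses repeated notifications per event type.
type cooldown struct {
	window time.Duration
	last   map[string]time.Time
}

// allow returns true and records the time if the event may be sent now.
func (c *cooldown) allow(event string, now time.Time) bool {
	if t, seen := c.last[event]; seen && now.Sub(t) < c.window {
		return false
	}
	c.last[event] = now
	return true
}

func main() {
	fmt.Println(isDegradation("ok", "warn"), isDegradation("fail", "warn")) // true false

	cd := &cooldown{window: 6 * time.Hour, last: map[string]time.Time{}}
	now := time.Now()
	fmt.Println(cd.allow("backup_failed", now), cd.allow("backup_failed", now.Add(time.Hour))) // true false
}
```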

What was previously completed (2026-02-16 session 22)

  • v0.7.0 — Phase 1: Authentication, Persistence & Settings Page:
    • New internal/settings/settings.go: Shared persistence layer via settings.json in the data directory. Atomic writes (tmp + rename), thread-safe with sync.RWMutex. Stores password hash overrides and DB validation cache. Graceful handling if file doesn't exist.
    • Auth improvements:
      • Password resolution priority: settings.json → controller.yaml → none (open dashboard)
      • Startup logs which source is active: Auth: using password from settings.json/controller.yaml/no password configured
      • Session duration extended to 7 days (was 24h)
      • ?next= redirect after session expiry — returns user to the page they were on
      • Flash messages on login page (green info box, used after password change)
      • Conditional logout link — hidden when auth is disabled (no password configured)
      • invalidateAllSessions() method for password change flow
    • New Settings page (/settings):
      • "Rendszer konfiguráció" section: read-only display of controller.yaml values (customer ID/name/domain, git repo/sync interval, backup enabled/schedule, monitoring, healthchecks URL, hub status, controller version)
      • "Jelszó módosítás" section: form with current password, new password, confirm — validates min 8 chars, match check, bcrypt comparison
      • Password saved to settings.json, all sessions invalidated, redirect to login with flash message
      • Only shown if auth is enabled; otherwise shows info message to contact operator
    • Sidebar update:
      • "Beállítások" menu item with ⚙ icon pinned to bottom (above version/logout)
      • Version and logout link separated from nav links
      • Logout link conditionally shown only when auth is enabled
    • DB validation persistence:
      • After each successful dump, validation results saved to settings.json (db_validations map keyed by filename)
      • Cached data survives container restarts
      • DBValidationCache struct with validated_at, table_count, has_header, error
    • 10 files changed (2 new: settings.go, settings.html; 8 modified: main.go, backup.go, auth.go, handlers.go, server.go, layout.html, login.html, style.css)
    • Deployed: Controller v0.7.0 to demo-felhom.eu, verified healthy
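
The persistence pattern described for settings.json (atomic tmp + rename writes, sync.RWMutex, tolerating a missing file) can be sketched as follows. The Settings type here stores a flat string map purely for illustration; the real schema is richer, and load, saveLocked, and roundTrip are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
	"sync"
)

// Settings is a simplified, thread-safe key/value store backed by a JSON file.
type Settings struct {
	mu   sync.RWMutex
	path string
	data map[string]string
}

// load reads the file if it exists; a missing file yields empty settings.
func load(path string) (*Settings, error) {
	s := &Settings{path: path, data: map[string]string{}}
	raw, err := os.ReadFile(path)
	if os.IsNotExist(err) {
		return s, nil
	}
	if err != nil {
		return nil, err
	}
	if err := json.Unmarshal(raw, &s.data); err != nil {
		return nil, err
	}
	if s.data == nil { // file contained JSON null
		s.data = map[string]string{}
	}
	return s, nil
}

// Get returns the stored value under a read lock.
func (s *Settings) Get(key string) string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.data[key]
}

// Set updates the value and persists the whole file atomically.
func (s *Settings) Set(key, value string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[key] = value
	return s.saveLocked()
}

// saveLocked writes to a temp file, then renames it over the target.
func (s *Settings) saveLocked() error {
	raw, err := json.MarshalIndent(s.data, "", "  ")
	if err != nil {
		return err
	}
	tmp := s.path + ".tmp"
	if err := os.WriteFile(tmp, raw, 0o600); err != nil {
		return err
	}
	return os.Rename(tmp, s.path)
}

// roundTrip saves a value in a temp directory and reads it back.
func roundTrip() (string, error) {
	dir, err := os.MkdirTemp("", "settings-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "settings.json")
	s, err := load(path) // file does not exist yet
	if err != nil {
		return "", err
	}
	if err := s.Set("password_hash", "demo-hash"); err != nil {
		return "", err
	}
	reloaded, err := load(path)
	if err != nil {
		return "", err
	}
	return reloaded.Get("password_hash"), nil
}

func main() {
	fmt.Println(roundTrip())
}
```

Because the rename happens within one directory, a reader sees either the old file or the complete new one, never a partially written settings.json.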

What was previously completed (2026-02-16 session 21)

  • v0.6.3 — Bug fixes from v0.6.2 code scan (4 minor fixes):
    • Bug 1: --hdd-path in docker-setup.sh now uses require_arg validation like all other flags. Previously, --hdd-path as the last argument without a value would crash with a cryptic bash error under set -u instead of a friendly message.
    • Bug 2: stackAction() in layout.html now receives event as an explicit parameter instead of relying on the deprecated implicit window.event. All 10 onclick call sites in dashboard.html and stacks.html updated to pass event as first argument.
    • Bug 3: Page <title> now has an em dash separator: "Vezérlőpult — Felhom.eu" instead of "VezérlőpultFelhom.eu".
    • Bug 4: nextPruneLabel() in funcmap.go now returns "ma" (Hungarian for "today") on Sunday before 4am, consistent with the nextRunLabel function. Previously returned the date in "2006-01-02" format.
    • Deployed: Controller v0.6.3 to demo-felhom.eu, verified healthy

What was previously completed (2026-02-16 session 20)

  • Hub Dashboard Bugs + Backup Validation Fix (3 bugs):
    • Bug 1&2 (Hub repo, felhom-hub v0.1.2): Hub timestamp parsing failure — time.Parse with single hardcoded format silently failed for formats returned by modernc.org/sqlite. Added parseSQLiteTime() that tries 6 common formats. Fixed: hub main page showing DOWN despite OK status, and report history timestamps showing 00:00:00.
    • Bug 3 (Controller repo, v0.6.2): Backup page showing "Hiba" for all DB validations — zero-value DumpValidation{} (never assigned) hit the {{else}} branch in template. Three fixes:
      • Template: 4-branch guard (Valid → OK / Error → Hiba / zero-value → "" with tooltip)
      • Debug logging: Added [DEBUG] and [WARN] log lines to all ValidateDump() code paths
      • Re-validation: RefreshCache() now cross-checks lastDBDump results against fresh ListDumpFiles() validation, healing stale in-memory state
    • Deployed: Hub v0.1.2 to k3s, Controller v0.6.2 to demo-felhom
    • Verified: Controller logs show ValidateDump OK for all 3 databases (immich: 60 tables, paperless: 67 tables, romm: 14 tables)
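
A multi-format parser like the parseSQLiteTime() mentioned for Bug 1&2 could look roughly like this. The changelog does not list the six formats, so the layouts below are assumptions chosen to cover common SQLite timestamp shapes.

```go
package main

import (
	"fmt"
	"time"
)

// sqliteTimeLayouts lists candidate layouts in the order they are tried.
// The exact set used by the hub is not given in the changelog; these six
// are assumptions.
var sqliteTimeLayouts = []string{
	time.RFC3339Nano,
	"2006-01-02 15:04:05.999999999-07:00",
	"2006-01-02T15:04:05.999999999",
	"2006-01-02 15:04:05.999999999",
	"2006-01-02 15:04:05",
	"2006-01-02",
}

// parseSQLiteTime tries each layout in turn and returns the first match.
// Unlike a single hardcoded time.Parse call, an unknown format is reported
// as an error instead of silently producing the zero time.
func parseSQLiteTime(s string) (time.Time, error) {
	for _, layout := range sqliteTimeLayouts {
		if t, err := time.Parse(layout, s); err == nil {
			return t, nil
		}
	}
	return time.Time{}, fmt.Errorf("unrecognized timestamp format: %q", s)
}
```

Returning an explicit error matters here: a swallowed parse failure is what produced the 00:00:00 timestamps described above.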

What was previously completed (2026-02-16 session 19)

  • v0.6.1 — Code Review Bugfixes (7 fixes):
    • Fix 1: http.NotFound(w, nil) → pass actual *http.Request in deployHandler and appDetailHandler
    • Fix 2: Dashboard running/stopped counts now computed from the filtered deployedStacks set (was counting ALL stacks including non-deployed)
    • Fix 3: Session cookie Secure flag now dynamic based on r.TLS != nil || X-Forwarded-Proto == "https". SameSite changed from Strict to Lax (Strict breaks Cloudflare Tunnel redirects)
    • Fix 4: Removed misleading subtle.ConstantTimeCompare from isValidSession() (map lookup already leaks timing; comparing token to itself is meaningless). Removed unused token field from session struct. Removed crypto/subtle import.
    • Fix 5: Replaced time.Tick() (goroutine leak) with proper time.NewTicker + done channel in cleanupSessions(). Added Close() method to Server. Added done chan struct{} to Server struct.
    • Fix 6: Added http.MaxBytesReader(w, req.Body, 1<<20) (1MB limit) to deployStack, updateOptionalConfig, deleteStack API handlers via limitBody() helper.
    • Fix 7: Cached time.LoadLocation("Europe/Budapest") once at top of templateFuncMap(), removed 5 per-function LoadLocation calls (timeAgo, fmtTime, fmtTimeShort, nextRunLabel, nextPruneLabel).
    • Post-fix verification: All 4 grep checks pass (0 results for NotFound(w,nil), ConstantTimeCompare, time.Tick(, Secure:.*true). go vet ./... clean.
    • Controller version: v0.6.1 — deployed and verified on demo-felhom.eu
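
The ticker-plus-done-channel pattern from Fix 5 can be sketched as below. The field and method names follow the changelog where it gives them (cleanupSessions, Close, done); addSession, purgeExpired and the struct layout are hypothetical and only there to make the sketch self-contained.

```go
package main

import (
	"sync"
	"time"
)

type session struct{ expires time.Time }

// Server holds sessions and a done channel that stops the cleanup goroutine.
type Server struct {
	mu        sync.Mutex
	sessions  map[string]session
	done      chan struct{}
	closeOnce sync.Once
}

func NewServer(cleanupInterval time.Duration) *Server {
	s := &Server{sessions: map[string]session{}, done: make(chan struct{})}
	go s.cleanupSessions(cleanupInterval)
	return s
}

// addSession is a hypothetical helper that registers a token with a TTL.
func (s *Server) addSession(token string, ttl time.Duration) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.sessions[token] = session{expires: time.Now().Add(ttl)}
}

// purgeExpired removes expired sessions and returns how many were dropped.
func (s *Server) purgeExpired() int {
	now := time.Now()
	s.mu.Lock()
	defer s.mu.Unlock()
	removed := 0
	for token, sess := range s.sessions {
		if now.After(sess.expires) {
			delete(s.sessions, token)
			removed++
		}
	}
	return removed
}

// cleanupSessions uses time.NewTicker so the ticker can be stopped, and
// exits as soon as the done channel is closed.
func (s *Server) cleanupSessions(interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			s.purgeExpired()
		case <-s.done:
			return
		}
	}
}

// Close stops the cleanup goroutine; it is safe to call more than once.
func (s *Server) Close() {
	s.closeOnce.Do(func() { close(s.done) })
}
```

The sync.Once guard is a design choice of this sketch so that a double Close() does not panic on an already closed channel.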

What was previously completed (2026-02-16 session 18)

  • v0.6.0 — Healthcheck Implementation + Central Push + Hub Dashboard:
    • Part 1 — Healthcheck enhancements (controller-side):
      • Added heartbeat ping — lightweight "I'm alive" signal every 5 min (no logic, just ping)
      • Added backup_integrity ping — weekly restic check on Sunday 04:00, pings healthchecks with result
      • Added Heartbeat and BackupIntegrity fields to PingUUIDsConfig
      • Added RunIntegrityCheck() to backup Manager (calls restic Check(), updates lastCheckTime/lastCheckOK, pings)
      • Updated controller.yaml.example with new monitoring ping_uuids
      • Created monitoring/DEPRECATED.md for legacy bash monitoring scripts
    • Part 2 — Central hub reporting (controller-side):
      • New internal/report/ package: types.go (Report struct), builder.go (BuildReport), pusher.go (HTTP push)
      • Report builder gathers data from all subsystems: system info (via metrics.GetStaticInfo + system.GetInfo), container stats (via metricsStore.QueryContainerSummary), backup status (via backupMgr.GetFullStatus), health (via monitor.RunHealthCheck), stacks (via stackMgr.GetStacks)
      • Report pusher: POST JSON to hub with Bearer token auth, 3 retries with 5s backoff, never fails caller
      • Added HubConfig to config.go (enabled, url, api_key, push_interval)
      • Wired hub reporting into scheduler (configurable interval, default 15m)
      • Hub reporting disabled by default (hub.enabled: false)
    • Part 3 — Hub service (felhom.eu repo, new hub/ subfolder):
      • Full Go service: cmd/hub/main.go, internal/api/handler.go, internal/store/store.go, internal/web/server.go
      • SQLite store with WAL mode, auto-migration, denormalized fields for fast queries
      • REST API: POST /api/v1/report (Bearer token auth), GET /api/v1/customers, GET /api/v1/customers/{id}, GET /api/v1/customers/{id}/history
      • Dark theme dashboard (English): multi-customer overview table with status indicators, customer detail page with system/storage/containers/backup/health sections
      • Color coding: green (OK, <30min), yellow (warn or 30-60min), red (fail or >60min)
      • K8s manifest: Deployment + Service + Ingress for hub.felhom.eu in felhom-system namespace
      • Dockerfile, Makefile, hub.yaml.example config
      • 90-day report retention with daily auto-prune
    • Controller version: v0.6.0 — deployed and verified on demo-felhom.eu (9 scheduler jobs, all new jobs registered)
    • Manual steps remaining for Viktor (Part 4 of TASK.md):
      • Create 5 healthcheck checks on status.felhom.eu (heartbeat, system-health, db-dump, backup, backup-integrity)
      • Update controller.yaml on demo-felhom with real UUIDs
      • Build and deploy felhom-hub to k3s cluster
      • Configure hub.felhom.eu DNS in Cloudflare
      • Enable hub reporting on demo-felhom controller.yaml
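
The hub's color coding from Part 3 (green for OK under 30 minutes, yellow for warn or 30-60 minutes, red for fail or over 60 minutes) reduces to a small pure function. The status strings and the handling of the exact 30 and 60 minute boundaries are assumptions of this sketch.

```go
package main

import "time"

// statusColor maps a customer's last reported status and the age of that
// report to a dashboard color. The worse of the two signals wins.
func statusColor(status string, age time.Duration) string {
	switch {
	case status == "fail" || age > 60*time.Minute:
		return "red"
	case status == "warn" || age >= 30*time.Minute:
		return "yellow"
	default:
		return "green"
	}
}
```

With the default 15m push interval, a customer turns yellow after roughly two missed reports and red after four.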

What was previously completed (2026-02-16 session 17)

  • v0.5.4 — Monitoring Page Frontend Fixes (4 bugs, frontend-only):
    • Bug 1: Tooltip "Invalid Date": items[0].parsed.x unreliable across Chart.js versions. Fixed tooltip callback to use items[0].raw.x (direct {x,y} data access) with parsed.x as fallback.
    • Bug 2: Charts fill full width regardless of data density: setChartXBounds() setting min/max at runtime was ignored because the scale was created without them. Fixed by including min: now - defaultRangeMs, max: now in the initial chartOpts() options. Now "7 nap" shows full 7-day x-axis with data clustered on the right.
    • Bug 3: Sysinfo values not consistently right-aligned: .sysinfo-grid used auto-fill creating variable-width cells. Fixed to 1fr 1fr (fixed 2-column). Added align-items: baseline, gap: 1rem, white-space: nowrap on labels, font-weight: 600 + word-break: break-word on values. Removed redundant <style> block from monitoring.html (styles now in style.css).
    • Bug 4: Charts overflow on mobile — Added min-width: 0 on .chart-box (critical CSS grid fix), overflow: hidden + max-width: 100% on .chart-wrap and .chart-wrap-bar, max-width: 100% on canvas.
    • Controller version: v0.5.4 — deployed and verified on demo-felhom.eu

What was previously completed (2026-02-16 session 16)

  • v0.5.1 — Monitoring Page Bugfixes:
    • Bug 1: Hostname: os.Hostname() returns the container ID inside Docker. Fixed by mounting /etc/hostname:/host/etc/hostname:ro and reading it first in sysinfo.go. Now shows demo-felhom.
    • Bug 2: Tooltip timestamps — Chart.js tooltip callback used items[0].parsed.x (category index 0,1,2...) instead of items[0].label (actual timestamp). Index 0 worked by accident (0 || label falls through), but all other points showed 1970-01-01.
    • Bug 3+4: Default range + empty charts — Default range was 24h but new system had only minutes of data. Changed to 1h default for both system and container detail charts. Moved active class to "1 óra" button.
    • Controller version: v0.5.1 — deployed and verified on demo-felhom.eu

What was previously completed (2026-02-16 session 15)

  • v0.5.0 — Backup Bugfixes + Monitoring Page with Metrics Store:
    • Task 1: Fixed "Helyi mentés" showing "" after restart: GetFullStatus() now synthesizes LastBackup from SnapshotHistory and LastDBDump from DumpFiles on disk when the in-memory values are nil (e.g., after controller restart). Dashboard handler also updated to use GetFullStatus() instead of GetStatus() for consistent behavior.
    • Task 2: Verified backup page caching — Already implemented in v0.4.7 (RefreshCache, scheduler job, AfterBackup callback). No changes needed.
    • Task 3: New Monitoring Page ("Rendszermonitor") — Full system monitoring subsystem:
      • SQLite metrics store (internal/metrics/store.go, types.go): WAL-mode SQLite via modernc.org/sqlite (pure Go, no CGO). Stores system metrics (CPU%, memory, temperature, load) and container metrics (CPU%, memory, net/block I/O) with timestamp. Downsampled queries via bucket-based GROUP BY for Chart.js. 30-day auto-prune via daily scheduler job at 04:00.
      • Metrics collector (internal/metrics/collector.go): Background goroutine collects system + container metrics every 60 seconds. System data from system.GetInfo(), container data from docker stats --no-stream with tab-separated format parsing.
      • System info provider (internal/metrics/sysinfo.go, sysinfo_other.go): Reads hostname, OS, kernel, CPU model/cores, uptime from /proc filesystem. Linux-specific with build-tag fallback for cross-compilation.
      • REST API endpoints (4 new routes in router.go): GET /api/metrics/system (time-series with range presets), GET /api/metrics/containers/summary (current stats), GET /api/metrics/containers/{name} (per-container time-series), GET /api/metrics/sysinfo (static system info).
      • Monitoring page template (monitoring.html): 5 sections — System Overview (sysinfo via API), System Metrics Charts (4 line charts: CPU, Memory, Temperature, Load in 2×2 grid), Container Resources (2 horizontal bar charts: CPU% and Memory), Per-container Detail (click to expand with historical charts), Storage (server-rendered progress bars). Time range selectors (1h/6h/24h/7d/30d). Auto-refresh every 60s.
      • Chart.js 4.4.7 embedded locally (offline environments, ~200KB UMD), dark theme configuration matching site design.
      • CSS: ~100 lines added for monitoring page (.monitor-card, .charts-grid, .chart-box, .container-charts-row, .storage-bars, responsive rules).
      • Wiring: 4th sidebar nav item "Rendszermonitor", metrics DB path in named volume (data/metrics.db), /etc/os-release:/host/etc/os-release:ro volume mount in docker-compose.yml, Dockerfile updated to golang:1.24-bookworm (required by modernc.org/sqlite), go.mod upgraded to go 1.24.0.
    • Controller version: v0.5.0 — deployed and verified on demo-felhom.eu (metrics collecting, 16 containers reporting, sysinfo showing Intel N100 correctly)
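
The bucket-based downsampling used for the Chart.js queries is done in SQL (GROUP BY) in the metrics store. The same idea, shown here as an in-memory Go analog rather than the store's actual query, averages samples that fall into the same fixed-width time bucket. The Sample type and function name are hypothetical.

```go
package main

// Sample is a hypothetical metric point: a Unix timestamp in seconds and a value.
type Sample struct {
	Unix  int64
	Value float64
}

// downsample averages samples into buckets of bucketSec seconds, mirroring
// what a GROUP BY (timestamp / bucket) query would return. The input must
// be sorted by Unix ascending; each output point carries the bucket start.
func downsample(samples []Sample, bucketSec int64) []Sample {
	if bucketSec <= 0 {
		return samples
	}
	var out []Sample
	var sum float64
	var count int
	var current int64
	for _, s := range samples {
		bucket := s.Unix / bucketSec * bucketSec
		if count > 0 && bucket != current {
			out = append(out, Sample{Unix: current, Value: sum / float64(count)})
			sum, count = 0, 0
		}
		current = bucket
		sum += s.Value
		count++
	}
	if count > 0 {
		out = append(out, Sample{Unix: current, Value: sum / float64(count)})
	}
	return out
}
```

With 60-second collection and a 7-day range, a bucket of a few minutes keeps the number of points sent to the browser manageable.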

What was previously completed (2026-02-16 session 14)

  • v0.4.7 — Protected Stack Detail Pages + Backup Page Caching:
    • Protected stacks clickable: data-href gating changed from {{if not .Protected}} to {{if .Meta.Slug}} on both stacks.html and dashboard.html. Protected stacks with .felhom.yml (i.e. a slug) are now clickable, linking to /apps/{slug}. Stacks without .felhom.yml remain non-clickable.
    • "Részletek" button for protected stacks — Protected stack action section in stacks.html now shows a "Részletek" link when the stack has a slug, next to the restart button.
    • FileBrowser .felhom.yml resources — Added resources section (mem_request: 128M, mem_limit: 256M, pi_compatible: true, needs_hdd: true) to both install_filebrowser() in docker-setup.sh and manually on the demo node. FileBrowser detail page now shows memory/Pi/HDD badges.
    • Backup page caching: GetFullStatus() no longer runs expensive subprocess calls (restic stats, docker inspect, disk listing) on every page load. Instead, a new RefreshCache() method runs these in the background:
      • Every 5 minutes via backup-cache scheduler job
      • After each successful backup via AfterBackup callback
      • On startup via a goroutine (non-blocking)
    • GetFullStatus() returns the cached FullBackupStatus instantly, updating only dynamic fields (running flag, next run times, snapshot history). Falls back to a minimal status if cache hasn't populated yet.
    • Controller version: v0.4.7 — deployed and verified on demo-felhom.eu
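
The caching split between RefreshCache() and GetFullStatus() can be sketched like this. Only the two method names and FullBackupStatus come from the changelog; the fields, the collect hook and the Cached marker are hypothetical.

```go
package main

import "sync"

// FullBackupStatus is reduced to a few illustrative fields.
type FullBackupStatus struct {
	RepoSize string
	Running  bool
	Cached   bool // hypothetical marker: false when the minimal fallback is returned
}

type Manager struct {
	mu      sync.RWMutex
	cache   *FullBackupStatus
	running bool
	// collect stands in for the expensive subprocess calls
	// (restic stats, docker inspect, disk listing).
	collect func() FullBackupStatus
}

// RefreshCache does the slow work outside the lock and then swaps the cache.
// It is meant to be called from a scheduler job, after a backup, and on startup.
func (m *Manager) RefreshCache() {
	st := m.collect()
	st.Cached = true
	m.mu.Lock()
	m.cache = &st
	m.mu.Unlock()
}

// GetFullStatus never runs subprocesses: it returns a copy of the cached
// status with dynamic fields refreshed, or a minimal status before the
// first refresh has completed.
func (m *Manager) GetFullStatus() FullBackupStatus {
	m.mu.RLock()
	defer m.mu.RUnlock()
	if m.cache == nil {
		return FullBackupStatus{Running: m.running}
	}
	st := *m.cache
	st.Running = m.running
	return st
}
```

Returning a copy rather than the cached pointer keeps page handlers from mutating shared state.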

What was previously completed (2026-02-16 session 13)

  • v0.4.6 — MariaDB Validation Fix + Dashboard & Protected Stack UX:
    • Bugfix: MariaDB dump validation false positive — MariaDB 11.4+ prepends /*M!999999\- enable the sandbox mode */ before the dump header comment. ValidateDump() now scans the first 10 lines for the expected header pattern instead of just checking line 1. Accepts -- MariaDB dump, -- MySQL dump, -- mysqldump for MariaDB and -- PostgreSQL database dump for PostgreSQL.
    • Dashboard shows deployed apps only: dashboardHandler() filters to deployed + protected stacks only. Non-deployed apps remain on the Alkalmazások page. Section heading changed to "Telepített alkalmazások". TotalCount stat card still shows all 52 apps.
    • Protected stack restart button — Protected stacks (traefik, cloudflared, felhom-controller, filebrowser) now show an "Újraindítás" restart button when operational, on both dashboard (compact ↻) and Alkalmazások page (full button). "Védett" / "Védett rendszerkomponens" badge still shown.
    • API protection guard — Centralized guard in actionStack() blocks all actions except restart on protected stacks (HTTP 403). Defense-in-depth: StopStack() and DeleteStack() retain their own guards.
    • FileBrowser .felhom.yml: install_filebrowser() in docker-setup.sh now creates .felhom.yml with subdomain: files metadata, so the controller shows the files.DOMAIN ↗ URL link. Manually created on demo node.
    • Controller version: v0.4.6 — deployed and verified on demo-felhom.eu
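
The header scan from the MariaDB validation fix can be illustrated as follows. The four accepted header strings are taken from the entry above; the function name, the string-based signature and the prefix matching are assumptions of this sketch, not the real ValidateDump() code.

```go
package main

import (
	"bufio"
	"strings"
)

// dumpHeaders are the header comments listed in the changelog entry.
var dumpHeaders = []string{
	"-- MariaDB dump",
	"-- MySQL dump",
	"-- mysqldump for MariaDB",
	"-- PostgreSQL database dump",
}

// hasDumpHeader reports whether one of the first 10 lines starts with a
// known dump header. Scanning several lines tolerates preambles such as
// the sandbox-mode comment that MariaDB 11.4+ writes before the header.
func hasDumpHeader(content string) bool {
	sc := bufio.NewScanner(strings.NewReader(content))
	for line := 0; line < 10 && sc.Scan(); line++ {
		text := strings.TrimSpace(sc.Text())
		for _, h := range dumpHeaders {
			if strings.HasPrefix(text, h) {
				return true
			}
		}
	}
	return false
}
```

A real implementation would read from the dump file rather than a string, so that only the first few lines are loaded.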

What was previously completed (2026-02-16 session 12)

  • v0.4.5 — Dedicated Backup Page ("Biztonsági mentés"):
    • New /backups page with full backup system visibility — 5 sections:
      1. Status overview cards: Local backup status (green/gray), remote placeholder (gray), DB count, repo size
      2. Schedule section: DB dump/restic/prune schedule with next-run times, last backup time + duration, retention policy, "Mentés most" button
      3. Database table: Lists all discovered DBs with type badge (PostgreSQL/MariaDB), dump file size, last dump time, validation (table count), status
      4. Snapshot history table: Last 20 snapshots with ID, time, data added, files new/changed
      5. Repository info card: Path, size, snapshot count, integrity check status, backed-up paths list, remote copy placeholder
    • Backend extensions:
      • SnapshotRecord type + ring buffer (20 entries) in Manager for per-snapshot stats
      • DumpValidation — scans dump files for CREATE TABLE statements, validates header and file size
      • ValidateDump() runs after each successful dump in DumpOne()
      • ListDumpFiles() scans dump directory for existing .sql files (fallback when in-memory results empty)
      • ListSnapshots() on ResticManager — returns all snapshots from restic (newest first)
      • GetFullStatus() on Manager — single call returns everything the page needs
      • LoadSnapshotHistory() populates history from restic on startup (without delta stats)
      • Restic check result tracking (lastCheckTime, lastCheckOK)
      • NextDailyRun() exported from scheduler for next-run time calculation
    • Server wiring:
      • Server struct now holds *scheduler.Scheduler
      • NewServer() accepts scheduler parameter
      • /backups route + backupsHandler() in handlers.go
    • New template functions (funcmap.go): timeAgo, fmtTime, fmtTimeShort, dbTypeLabel, nextRunLabel, pruneLabel, nextPruneLabel, fmtDuration, fmtBytes, shortID
    • Navigation: Sidebar now has 3 items (Vezérlőpult, Alkalmazások, Biztonsági mentés)
    • Dashboard: Backup card title is now a clickable link to /backups
    • Auto-refresh: Page polls /api/backup/status every 3s during backup-in-progress, reloads when complete
    • CSS: Full dark-theme styles for schedule card, database table, snapshot table, repository card, validation badges, DB type badges, empty state
    • Controller version: v0.4.5 — deployed and verified on demo-felhom.eu (2 historical snapshots loaded)
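
The 20-entry ring buffer for SnapshotRecord could be structured along these lines. SnapshotRecord is reduced to an ID here, and the snapshotRing type with its methods is a hypothetical shape, not the Manager's actual fields.

```go
package main

// SnapshotRecord is reduced to an ID for this sketch.
type SnapshotRecord struct {
	ID string
}

// snapshotRing keeps the most recent len(buf) records, overwriting the oldest.
type snapshotRing struct {
	buf   []SnapshotRecord
	next  int // index where the next record will be written
	count int // number of valid records, at most len(buf)
}

// newSnapshotRing creates a ring; capacity must be positive (20 in the controller).
func newSnapshotRing(capacity int) *snapshotRing {
	return &snapshotRing{buf: make([]SnapshotRecord, capacity)}
}

// Add stores a record, replacing the oldest one once the ring is full.
func (r *snapshotRing) Add(rec SnapshotRecord) {
	r.buf[r.next] = rec
	r.next = (r.next + 1) % len(r.buf)
	if r.count < len(r.buf) {
		r.count++
	}
}

// Newest returns the stored records, newest first.
func (r *snapshotRing) Newest() []SnapshotRecord {
	out := make([]SnapshotRecord, 0, r.count)
	for i := 1; i <= r.count; i++ {
		idx := (r.next - i + len(r.buf)) % len(r.buf)
		out = append(out, r.buf[idx])
	}
	return out
}
```

A fixed-size ring keeps memory bounded no matter how many snapshots the repository accumulates.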

What was previously completed (2026-02-15 session 11)

  • v0.4.1 — App Filtering + Bugfixes:
    • Filter bar on Alkalmazások page: Four pill-shaped filter buttons (Mind/Futó/Leállítva/Telepíthető) with live count badges computed from DOM. Filters stack cards via display: none, updates URL with ?filter=running via history.replaceState. Reads filter from URL on page load for deep-linking support.
    • New filterCategory template function (funcmap.go): Maps container state + deployed flag to filter categories (running/stopped/available). Each stack card gets a data-filter-state attribute for client-side filtering.
    • Clickable dashboard stat cards: Stat cards (Futó/Leállítva/Összes) changed from <div> to <a> with href linking to /stacks?filter=running, /stacks?filter=stopped, /stacks respectively. Hover effect with translateY + box-shadow.
    • docker-compose.yml synced to demo node: Fixed the stale compose file that still had dashboard.${DOMAIN} Traefik label (from pre-v0.3.0). Now uses correct felhom.${DOMAIN} label + /sys:/host/sys:ro mount.
    • Controller version: v0.4.1 — deployed and verified on demo-felhom.eu
    • Remaining manual tasks for Viktor (Task 2 & 3 from TASK.md):
      • Verify felhom.demo-felhom.eu resolves correctly (Cloudflare Tunnel public hostname may need updating from dashboard.* to felhom.*)
      • Update Pi-hole local DNS if applicable
      • Enable backup in controller.yaml on demo node (backup.enabled: true)
      • Create /srv/backups directories on demo node

What was previously completed (2026-02-15 session 10)

  • v0.4.0 — Monitoring & Health + Backups (Phase 2 & 3):
    • Central job scheduler (internal/scheduler/scheduler.go):
      • Replaces ad-hoc goroutines in main.go with a unified scheduler
      • Every(name, interval, fn) for periodic jobs, Daily(name, timeStr, fn) for scheduled tasks
      • Panic recovery, skip-if-running, quiet mode for high-frequency jobs (≤30s)
      • Daily jobs use Europe/Budapest timezone with time.Timer for DST correctness
      • Graceful shutdown with 30s timeout for running jobs
    • CPU usage collector (internal/system/cpu_linux.go):
      • Background goroutine samples /proc/stat every 5s, computes delta-based CPU %
      • Platform stubs for non-Linux in cpu_other.go
    • Temperature & load metrics (internal/system/info_linux.go):
      • Reads /proc/loadavg for 1/5/15 min load averages
      • Reads thermal zones from /host/sys/class/thermal/ (Docker mount) with /sys/ fallback
      • Handles millidegree values, picks highest zone, with hwmon fallback
    • Healthchecks.io pinger (internal/monitor/pinger.go):
      • HTTP ping client for Healthchecks.io-compatible endpoints
      • POST to /ping/{uuid} (success), /fail (failure), /start (started)
      • 10s timeout, 3 retries with 2s backoff, skips CHANGEME UUIDs
    • System health checks (internal/monitor/healthcheck.go):
      • Checks disk, memory, CPU, temperature, Docker reachability, protected containers
      • Returns HealthReport with status "ok"/"warn"/"fail" + formatted message for pings
    • Database dump engine (internal/backup/dbdump.go):
      • Auto-discovers PostgreSQL/MariaDB containers via docker ps + docker inspect
      • Dumps via docker exec pg_dump/mariadb-dump with 5min timeout
      • Atomic writes (.tmp.sql), empty file detection, stale temp cleanup
    • Restic integration (internal/backup/restic.go):
      • Auto-generates repository password (32 random bytes, base64url)
      • Init, snapshot (JSON output), prune, check, stats, latest snapshot
      • Stale lock detection with automatic unlock + retry
    • Backup orchestrator (internal/backup/backup.go):
      • DB dumps + restic snapshots, weekly prune on Sundays
      • Thread-safe running flag, Healthchecks.io pings with results
      • RunFullBackup() for manual trigger (sequential: dumps → snapshot)
    • Wiring updates:
      • main.go: scheduler-based job registration, cpuCollector lifecycle, pinger + backupMgr init
      • api/router.go: GET /api/backup/status, POST /api/backup/run
      • web/server.go + handlers.go: pass cpuCollector to GetInfo(), backup status on dashboard
      • funcmap.go: tempColor, fmtTemp, fmtLoad template functions
    • Dashboard UI enhancements:
      • CPU usage bar with load average display below
      • Temperature with colored indicator dot (green/yellow/red at 60°/75°C)
      • Backup status card: last run time, DB count, repo size/snapshots
      • "Mentés most" button triggers manual backup via API
    • Config updates:
      • controller.yaml.example: added system_health_interval, hdd_path, system.reserved_memory_mb
      • docker-compose.yml: added /sys:/host/sys:ro mount for temperature reading
      • restic_password_file default changed to data/ subdir (auto-generated in named volume)
  • Controller version: v0.4.0 — deployed and verified on demo-felhom.eu
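
The delta-based CPU percentage from the CPU usage collector can be sketched as below: two samples of the aggregate "cpu" line in /proc/stat are compared, and the busy share of the elapsed jiffies is reported. Function names are hypothetical, and treating idle + iowait as idle time is an assumption about how the collector counts.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseCPULine extracts total and idle jiffies from the aggregate "cpu"
// line of /proc/stat (fields: user nice system idle iowait irq softirq steal ...).
func parseCPULine(line string) (total, idle uint64, err error) {
	fields := strings.Fields(line)
	if len(fields) < 5 || fields[0] != "cpu" {
		return 0, 0, fmt.Errorf("not an aggregate cpu line: %q", line)
	}
	values := fields[1:]
	if len(values) > 8 {
		// guest and guest_nice are already included in user and nice.
		values = values[:8]
	}
	for i, f := range values {
		v, perr := strconv.ParseUint(f, 10, 64)
		if perr != nil {
			return 0, 0, perr
		}
		total += v
		if i == 3 || i == 4 { // idle and iowait
			idle += v
		}
	}
	return total, idle, nil
}

// cpuPercent computes CPU usage between two samples of the "cpu" line.
func cpuPercent(prev, cur string) (float64, error) {
	prevTotal, prevIdle, err := parseCPULine(prev)
	if err != nil {
		return 0, err
	}
	curTotal, curIdle, err := parseCPULine(cur)
	if err != nil {
		return 0, err
	}
	if curTotal <= prevTotal || curIdle < prevIdle {
		return 0, nil // no elapsed time or counters went backwards
	}
	deltaTotal := float64(curTotal - prevTotal)
	deltaIdle := float64(curIdle - prevIdle)
	return (deltaTotal - deltaIdle) / deltaTotal * 100, nil
}
```

The counters in /proc/stat are cumulative since boot, which is why a single reading cannot give a current percentage and two samples taken 5s apart are needed.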

What was previously completed (2026-02-15 session 9)

  • v0.3.0 — Structural refactoring (templates + server split + domain rename):
    • Templates: go:embed migration — moved all 7 HTML templates + CSS from Go string constants to individual files in internal/web/templates/. Created embed.go with //go:embed directive. Template loading now uses ParseFS() instead of Parse(). CSS served from embed.FS via ReadFile(). Zero runtime file dependencies — still compiled into the binary.
    • Server decomposition — split monolithic server.go (540 lines) into focused files:
      • auth.go: session struct, auth middleware, login/logout handlers, session management
      • handlers.go: page handlers (dashboard, stacks, logs, deploy, app detail)
      • funcmap.go: template FuncMap with 14 custom functions
      • server.go: Server struct, NewServer, loadTemplates (3-liner), ServeHTTP routing, render helper, static file serving
    • Domain rename — controller subdomain changed from dashboard.* to felhom.* in Traefik labels and setup script
    • Documentation updated — CLAUDE.md, README.md, CONTEXT.md all reflect new file structure
    • Reminder for Viktor: Update Cloudflare Tunnel public hostname (dashboard.demo-felhom.eu → felhom.demo-felhom.eu) and Pi-hole DNS if needed
  • Controller version: v0.3.0

What was previously completed (2026-02-15 session 8)

  • FileBrowser as infrastructure service:
    • Created scripts/hdd-setup.sh (adapted from deploy-portainer) — sets up HDD folder structure with Dokumentumok user dir
    • Created scripts/docker-setup.sh (adapted from deploy-portainer) — installs Docker, Traefik, FileBrowser as infra services
    • Added filebrowser to protected stacks in controller.yaml.example
    • Removed templates/filebrowser/ from app-catalog-felhom.eu (no longer a catalog app)
  • Orphan stack detection and deletion:
    • Added Orphaned field to Stack struct + getCatalogTemplateSlugs() helper
    • Orphan detection in ScanStacks() — deployed stacks with no matching catalog template marked as orphaned
    • New delete.go: DeleteStack() (compose down + HDD cleanup + dir removal), GetStackHDDData(), parseComposeHDDMounts()
    • Safety: protected HDD paths (root, media, storage, Dokumentumok, appdata) can never be deleted
    • New API endpoints: DELETE /api/stacks/{name} and GET /api/stacks/{name}/hdd-data
    • UI: orange "Elavult" badge on orphaned stacks, "Törlés" button, delete confirmation modal
    • Modal shows HDD data paths/sizes, checkbox for "Felhasználói adatok törlése a merevlemezről"
    • Hides "Frissítés" and "Részletek" buttons for orphaned stacks
  • Verified: 1 orphaned stack detected on startup (filebrowser — now infra, removed from catalog)
  • Controller version: v0.2.15
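
The orphan rule described above (a deployed stack with no matching catalog template) is simple enough to show directly. The Stack fields other than Orphaned, the markOrphans name and the map-based slug set are assumptions; the real check lives in ScanStacks().

```go
package main

// Stack is reduced to the fields the orphan check needs.
type Stack struct {
	Name     string
	Deployed bool
	Orphaned bool
}

// markOrphans flags deployed stacks whose name has no matching catalog
// template slug and returns how many orphans were found. Stacks that are
// not deployed are never orphaned.
func markOrphans(stacks []Stack, catalogSlugs map[string]bool) int {
	orphans := 0
	for i := range stacks {
		s := &stacks[i]
		s.Orphaned = s.Deployed && !catalogSlugs[s.Name]
		if s.Orphaned {
			orphans++
		}
	}
	return orphans
}
```

Recomputing the flag on every scan means a stack stops being orphaned as soon as its template returns to the catalog.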

Previously completed (2026-02-14 session 7)

  • Fixed YAML parse error in romm .felhom.yml (app-catalog repo):
    • Root cause: Hungarian opening quote (U+201E) paired with ASCII " (0x22) inside YAML double-quoted strings terminated the string prematurely
    • Affected lines: help_text for IGDB Client Secret and SteamGridDB API Key fields
    • Fix: escaped inner ASCII double quotes with \" in the YAML strings
    • This caused LoadMetadata() to silently fail and return empty defaults for ALL romm metadata (tagline, resources, category — everything)
  • Added error logging to LoadMetadata() in metadata.go:
    • [ERROR] log on YAML parse failure (was silently swallowed — critical bug)
    • Temporary [DEBUG] log used for diagnosis, then removed
  • Fixed deploy command in CLAUDE.md:
    • sed pattern now targets only image: lines (was matching service name too, breaking YAML)
    • Added sudo for both sed and docker compose (directory is root-owned)
  • Controller version: v0.2.14

Previously completed (2026-02-14 session 6)

  • Bug fix: App info logo SVG rendering (.app-info-logo CSS in templates.go):
    • Added min-width, min-height, max-width, max-height: 80px and overflow: hidden
    • Prevents SVG images with explicit dimensions or no viewBox from overflowing container
    • Logo now reliably renders at 80x80 regardless of SVG intrinsic size
  • Controller version: v0.2.12

Previously completed (2026-02-14 session 5)

  • App detail/info pages — new feature:
    • New route: GET /apps/{slug} renders a full info page (was redirect to deploy page)
    • Hero section with logo, tagline, resource badges
    • Screenshots section (graceful — hidden via onerror if assets don't exist)
    • Info cards: use cases, first steps, prerequisites, default credentials, docs link
    • Optional config form with AJAX save (POST /api/stacks/{name}/optional-config)
    • New .felhom.yml fields: app_info (tagline, use_cases, first_steps, prerequisites, default_creds, docs_url) and optional_config (groups of env var fields)
    • New structs in metadata.go: AppInfo, OptionalConfigGroup, OptionalConfigField
    • UpdateOptionalConfig in deploy.go: saves optional env vars to app.yaml, restarts deployed stacks with docker compose up -d to pick up new env vars
    • Navigation updated: stack cards on dashboard/stacks pages now link to /apps/{slug}, deploy page has "Részletek" link back to info page
  • RoMM metadata updated (app-catalog repo):
    • Full app_info section: tagline, 5 use cases, 6 first steps, 3 prerequisites, default creds, docs URL
    • 6 optional config fields for metadata providers: IGDB (client_id + secret), SteamGridDB, ScreenScraper (user + password), MobyGames
    • docker-compose.yml updated with SCREENSCRAPER_USER, SCREENSCRAPER_PASSWORD, MOBYGAMES_API_KEY env vars
    • Display name fixed: "ROMM" → "RomM"
  • Controller version: v0.2.11

Previously completed (2026-02-14 session 4)

  • Fixed deploy race condition in internal/stacks/deploy.go:
    • In-memory Deployed flag now set BEFORE docker compose up -d (compose up can take 30-60s for image pulls)
    • On failure: both in-memory state and disk (app.yaml) are reverted
    • Eliminates stale "Telepítés" button during long compose operations
  • Added checkBeforeDeploy() JS guard in internal/web/templates.go:
    • Telepítés buttons on Vezérlőpult and Alkalmazások pages now fetch live state from /api/stacks/{name} before navigating
    • If app is already deployed (e.g., another tab deployed it), shows alert and reloads page instead of navigating to deploy form
    • Catches stale UI state gracefully
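
The ordering behind the deploy race fix can be sketched as follows: the in-memory flag is set before the slow compose step, and both the flag and the persisted state are reverted on failure. The persist and composeUp hooks and the ErrDeployFailed sentinel are hypothetical stand-ins for writing app.yaml and running docker compose up -d.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrDeployFailed is a hypothetical sentinel that callers can test with errors.Is.
var ErrDeployFailed = errors.New("deploy failed")

type Stack struct {
	mu       sync.Mutex
	Deployed bool
}

func (s *Stack) setDeployed(v bool) {
	s.mu.Lock()
	s.Deployed = v
	s.mu.Unlock()
}

// Deploy marks the stack as deployed before starting containers, so other
// requests see the new state during a long image pull. If either step
// fails, the in-memory flag and the persisted state are rolled back.
func (s *Stack) Deploy(persist func(deployed bool) error, composeUp func() error) error {
	s.setDeployed(true)
	if err := persist(true); err != nil {
		s.setDeployed(false)
		return fmt.Errorf("%w: save config: %v", ErrDeployFailed, err)
	}
	if err := composeUp(); err != nil {
		s.setDeployed(false)
		_ = persist(false) // best-effort revert of the on-disk state
		return fmt.Errorf("%w: compose up: %v", ErrDeployFailed, err)
	}
	return nil
}
```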

Previously completed (2026-02-14 session 3)

  • Enhanced debug logging across all stack operations in internal/stacks/:
    • Operation timing: All stack ops (start, stop, restart, update, deploy) now log elapsed time
    • Post-start container state check: Async goroutine after start/restart/update/deploy
    • Image pull detection: Checks local images before deploy/update (debug level)
    • GetLogs/ScanStacks improvements: Byte count logging, deployed/available counts
    • All verbose checks gated on cfg.Logging.Level == "debug"; timing always at INFO
  • UI improvements in internal/web/templates.go and server.go:
    • Memory bar fix on deploy page: Bar segments now always visible (min-width: 3px), new app segment uses translucent green with distinct border for clear visual separation from committed memory
    • Clickable app cards: Cards on Vezérlőpult and Alkalmazások pages are now clickable (navigates to deploy/detail page). Uses data-href attribute + delegated click handler. Protected stacks excluded. Actions area (buttons, state labels) excluded from click-to-navigate
    • Live-scrolling logs: Logs page now auto-refreshes every 3s via AJAX polling (?raw=1 returns plain text). Fixed-height container (70vh) with auto-scroll to bottom. Pulsing green "Élő" indicator. Pause/resume toggle ("Szüneteltetés"/"Folytatás"). User scroll position preserved when scrolled up to read history
    • Deployment progress UI: Deploy button no longer shows alert+redirect immediately. Instead shows 3-step progress panel: config saved → containers starting → app initializing. Polls GET /api/stacks/{name} every 3s to track actual container health state. Handles running (auto-redirect), starting (keep polling), unhealthy (warning), exited (error), and 120s timeout. Shows elapsed time counter
  • Mealie healthcheck fix (app-catalog-felhom.eu):
    • wget --spider replaced with Python TCP socket check — mealie image doesn't include wget
    • start_period increased to 60s (DB migrations take ~40s on first start)
  • Healthcheck audit: filebrowser (Alpine, has BusyBox wget — OK), stirling-pdf (Ubuntu, has wget — OK)

Previously completed (2026-02-15 session 2)

  • Phase 4: Git Sync + App Catalog Audit — major milestone
  • Git sync module (internal/sync/sync.go):
    • Clones/pulls app-catalog-felhom.eu repo to local cache on startup
    • Periodic sync based on git.sync_interval (default 15m)
    • Copies docker-compose.yml + .felhom.yml to stacks dir (never overwrites app.yaml/.env)
    • SHA-256 content comparison — only writes changed files
    • Triggers ScanStacks() after sync so dashboard updates immediately
    • Uses os/exec git CLI — no Go git library dependency
  • Manual sync button ("Sablonok frissítése") on Alkalmazások page:
    • POST /api/sync endpoint with 30s debounce
    • Toast notification shows result (success/failure/what changed)
    • Auto-reloads page if new apps or updates detected
  • Sync status added to /api/system/info (last_sync, last_status, syncing flag)
  • .felhom.yml files created for all 10 apps (paperless-ngx already had one):
    • actualbudget, docmost, filebrowser, homebox, immich, mealie, romm, stirling-pdf, vaultwarden
    • All follow the same format: display_name, description, category, subdomain, resources, deploy_fields
  • Docker Compose templates audited and fixed for all 10 apps:
    • Fixed {{DOMAIN}} → ${DOMAIN} syntax in homebox, mealie, romm, stirling-pdf
    • Fixed {{HDD_PATH}} → ${HDD_PATH} in romm
    • Added deploy.resources.limits.memory to all services across all templates
    • Added TZ=Europe/Budapest to all sidecar services (postgres, redis, mariadb)
    • Added healthcheck to romm main service
    • Added romm-redis condition: service_healthy (was service_started)
    • Standardized header comment blocks across all templates
  • Documentation updated: app-catalog README, CLAUDE.md, CONTEXT.md
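
The SHA-256 content comparison in the git sync module, which decides which template files actually need writing, can be sketched as below. The in-memory maps stand in for the stacks directory and the catalog cache, and filesToWrite is a hypothetical name.

```go
package main

import (
	"crypto/sha256"
	"sort"
)

// filesToWrite returns the names of incoming files that are missing from
// current or whose SHA-256 digest differs, sorted for stable output.
// Files with identical content are skipped so they are not rewritten.
func filesToWrite(current, incoming map[string][]byte) []string {
	var changed []string
	for name, data := range incoming {
		old, exists := current[name]
		if !exists || sha256.Sum256(old) != sha256.Sum256(data) {
			changed = append(changed, name)
		}
	}
	sort.Strings(changed)
	return changed
}
```

An empty result means the sync found nothing to copy, which is also the signal that no rescan or page reload is needed.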

Previously completed (2026-02-15 session 1)

  • Memory validation during deployment:
    • Pre-deploy memory check: compares mem_request sum against usable system RAM
    • Hard block if requests exceed usable memory (total - 384MB reserved)
    • Soft warning if mem_limit sum exceeds total RAM (overcommit OK for limits)
    • ParseMemoryMB() supports "500M", "1G", "1.5G", "1024" formats
    • CommittedMemory() sums requests/limits across all deployed stacks
    • Memory summary bar shown on deploy page before user clicks deploy
    • system.reserved_memory_mb configurable in controller.yaml (default: 384)
  • Display: ~ prefix on mem_request in UI badges (display-only, exact value stored)
  • Felhom.eu logo replaced text logos in sidebar and login page with actual SVG logo
    • Logo SVG embedded as Go string constant, served at /static/felhom-logo.svg
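
The memory parsing and the hard-block check from the memory validation entry can be sketched as follows. The accepted formats come from the changelog; treating a bare number as megabytes, using 1G = 1024 MB, and the fitsInMemory helper are assumptions.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// ParseMemoryMB converts "500M", "1G", "1.5G" or a bare number ("1024",
// assumed to be megabytes) into whole megabytes.
func ParseMemoryMB(s string) (int, error) {
	raw := strings.ToUpper(strings.TrimSpace(s))
	mult := 1.0
	switch {
	case strings.HasSuffix(raw, "G"):
		mult, raw = 1024, strings.TrimSuffix(raw, "G")
	case strings.HasSuffix(raw, "M"):
		raw = strings.TrimSuffix(raw, "M")
	}
	v, err := strconv.ParseFloat(raw, 64)
	// The negated range check also rejects NaN, which fails every comparison.
	if err != nil || !(v >= 0 && v <= 1<<30) {
		return 0, fmt.Errorf("invalid memory value %q", s)
	}
	return int(v * mult), nil
}

// fitsInMemory is a hypothetical helper for the hard block: the sum of all
// mem_request values must not exceed total RAM minus the reserved amount.
func fitsInMemory(requestsMB []int, totalMB, reservedMB int) bool {
	sum := 0
	for _, r := range requestsMB {
		sum += r
	}
	return sum <= totalMB-reservedMB
}
```

For example, on a 2048 MB machine with the default 384 MB reserved, requests of 1152 MB fit, while adding a further 768 MB request would be blocked.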

Previously completed (2026-02-14)

  • System info bar on Vezérlőpult dashboard: RAM, SSD, and optional HDD usage
    • Progress bars with color coding (green < 70%, yellow 70-85%, red > 85%)
    • New internal/system package reads /proc/meminfo + syscall.Statfs
    • Platform-specific: Linux impl + non-Linux stub (build tags)
    • Hungarian labels: "Memória", "SSD tárhely", "Külső HDD"
  • Docker Compose memory limits on paperless-ngx template:
    • paperless-webserver: 768M, postgres: 256M, redis: 128M
    • Added mem_limit field to .felhom.yml ResourceHints (total: 1152M)
  • /api/system/info endpoint now returns live system metrics (was customer info)
  • Config: Added paths.hdd_path for external HDD monitoring
  • Controller image builds via build.sh, pushes to Gitea container registry

Previously completed (2026-02-13)

  • Built the entire felhom-controller from scratch (Go, no frameworks)
  • Debugged and fixed 7 issues during first real deployment:
    1. Password validation (empty passwords accepted)
    2. In-memory Deployed flag not updating after deploy
    3. Health-aware state parsing (starting/unhealthy detection)
    4. Random card ordering (Go map iteration)
    5. "Részletek" button redirect for deployed apps
    6. Paperless OCR language installation (LANGUAGES vs LANGUAGE env var)
    7. Documentation: restart vs up -d for image updates

What's next (priorities)

  1. Test per-app backup — enable backup for Paperless-ngx HDD data, trigger manual backup, verify restic snapshot includes HDD paths
  2. Test restore — restore app data from snapshot, verify file recovery (now possible with /mnt:rw mount)
  3. Deploy Immich — tests HDD path + secrets + multi-storage (biggest real-world test)
  4. Add app_info + optional_config to more apps (Immich, Mealie, Vaultwarden)
  5. Test on Raspberry Pi (pi-customer-1)
  6. Self-update mechanism
  7. Hub alerting (webhook to Healthchecks for stale customers)
  8. Docker volume backup (mount /var/lib/docker/volumes:ro into controller)