diff --git a/TASK.md b/TASK.md index 4b844c4..92eada7 100644 --- a/TASK.md +++ b/TASK.md @@ -420,6 +420,15 @@ Update the summary to reflect: - Show hub status (enabled) - Remove the `CUSTOMER_ID` display bug (the "Note: No --customer specified" message is inside the `if [[ -n "$CUSTOMER_ID" ]]` block — wrong logic) +- Add DR/reinstallation note: + ``` + If this is a reinstallation, the controller will automatically: + 1. Contact the Hub for your previous configuration + 2. Mount your existing storage drives + 3. Detect and restore your applications + + Open https://felhom. to monitor the restore process. + ``` ### 12. Update `print_help()` diff --git a/TASK2.md b/TASK2.md new file mode 100644 index 0000000..5f733e8 --- /dev/null +++ b/TASK2.md @@ -0,0 +1,1442 @@ +# TASK2: Disaster Recovery — Hub-Based Infrastructure Restore + +## Overview + +Add the ability to fully restore a Felhom deployment after a system drive failure. +The controller pushes an **infrastructure snapshot** to the central Hub during +each backup cycle. When a fresh controller is deployed on a replacement system, +it pulls the snapshot from the Hub, auto-mounts surviving drives using stored +disk UUIDs, and restores all applications and their data. + +**This is a phased implementation:** + +| Phase | Scope | Where | Status | +|-------|-------|-------|--------| +| **Phase 1** | Hub infra-backup endpoints + controller push | Hub + Controller | **DONE** | +| **Phase 2** | New-deployment detection + Hub pull + auto-mount | Controller | **DONE** | +| **Phase 3** | Restore UI + app data restoration | Controller | **DONE** | +| **Phase 4** | docker-setup.sh integration | Script | **DONE** | + +Phases 1-2 can be deployed independently. Phase 3 depends on Phase 2. +Phase 4 depends on Phase 1 (needs Hub endpoints). 
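
The UUID-based matching mentioned in the overview can be illustrated with a small sketch. It is not the controller's actual code: `storedMount` and `matchSurvivingDrives` are hypothetical names, and the `present` map (filesystem UUID to device path) is assumed to have been built beforehand from a tool such as `blkid`.

```go
package main

import (
	"fmt"
	"sort"
)

// storedMount is a hypothetical, trimmed-down view of one entry in the
// snapshot's disk layout; only the fields needed for matching are kept.
type storedMount struct {
	UUID       string
	MountPoint string
}

// matchSurvivingDrives splits the stored mounts into those whose UUID is
// present on the replacement system and those that are missing.
// present maps filesystem UUID -> device path.
func matchSurvivingDrives(stored []storedMount, present map[string]string) (found map[string]string, missing []string) {
	found = make(map[string]string)
	for _, m := range stored {
		if m.UUID == "" {
			continue // nothing to match on
		}
		if dev, ok := present[m.UUID]; ok {
			found[m.MountPoint] = dev
		} else {
			missing = append(missing, m.MountPoint)
		}
	}
	sort.Strings(missing) // stable order for logging
	return found, missing
}

func main() {
	stored := []storedMount{
		{UUID: "277a2179-a764-4758-b840-9ea741517914", MountPoint: "/mnt/hdd_1"},
		{UUID: "242ee4da-d9f8-40ce-b3fa-8e4860204790", MountPoint: "/mnt/sys_drive"},
	}
	// Assumed scan result: only the HDD survived the system drive failure.
	present := map[string]string{
		"277a2179-a764-4758-b840-9ea741517914": "/dev/sdb1",
	}
	found, missing := matchSurvivingDrives(stored, present)
	fmt.Println("found:", found)
	fmt.Println("missing:", missing)
}
```

Matching on UUID rather than device name matters here because device names such as `/dev/sdb1` can change after the system drive is replaced, while the filesystem UUID stays with the surviving disk.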
+ +### Phase 1 — What was deployed + +**Hub changes** (`e:/git/felhom.eu/hub/`): +- `internal/store/store.go` — new `infra_backups` table (CREATE TABLE in migrate()), `SaveInfraBackup()`, `GetInfraBackup()`, `GetInfraBackupMeta()` + `InfraBackupMeta` struct +- `internal/api/handler.go` — `POST /api/v1/infra-backup` (push) + `GET /api/v1/infra-backup/{customer_id}` (pull), both with Bearer auth +- `internal/web/server.go` — `handleCustomerDetail()` loads `InfraBackupMeta` and passes to template +- `internal/web/templates/customer.html` — "Infra Backup" card showing last-updated age, stack count, disk count + +**Controller changes** (`controller/`): +- `internal/settings/settings.go` — new `GetCrossDriveResticPassword()` read-only getter +- `internal/report/infra_backup.go` — `InfraBackup`, `DiskLayout`, `DiskMount`, `InfraStack` types + `BuildInfraBackup()` builder +- `internal/report/infra_backup_linux.go` — `collectDiskLayout()` parses /host-fstab + blkid/lsblk for disk topology +- `internal/report/infra_backup_other.go` — no-op stub for non-Linux compilation +- `internal/report/pusher.go` — `PushInfraBackup()` method (3 retries, 5s backoff) +- `cmd/controller/main.go` — `pushInfraBackup()` helper; called after nightly backup cycle and on startup; `hubPusher` declaration moved earlier for closure access + +### Phase 2 — What was deployed + +**Controller changes** (`controller/`): +- `internal/backup/disk_layout.go` — **NEW** — `DiskLayout` and `DiskMount` types (moved from report to avoid circular import: report→backup, backup→report) +- `internal/report/infra_backup.go` — updated `DiskLayout` field to use `backup.DiskLayout` +- `internal/report/infra_backup_linux.go` — updated to return `backup.DiskLayout` +- `internal/report/infra_backup_other.go` — updated to return `backup.DiskLayout` +- `internal/report/infra_pull.go` — **NEW** — `PullInfraBackup(hubURL, apiKey, customerID)` HTTP GET from Hub, returns `*InfraBackup` or nil/nil for 404 +- 
`internal/backup/restore_drives_linux.go` — **NEW** — `MountDrivesFromLayout(ctx, layout, logger)` scans block devices by UUID, mounts using two-layer pattern (raw+bind), updates /host-fstab; includes `scanBlockDeviceUUIDs()` (lsblk+blkid), `mountDirect()`, `mountRawAndBind()`, `addDRFstabEntries()`, `isMountedPath()`, `hostDevPath()` +- `internal/backup/restore_drives_other.go` — **NEW** — no-op stub for non-Linux compilation +- `internal/settings/settings.go` — added `SetCrossDriveResticPassword(password)` setter (RWMutex + atomic save) +- `cmd/controller/main.go` — added fresh-deployment detection (`!fileExists(settings.json)`), Hub pull, password restoration, settings restoration, drive mounting (with 2min timeout), settings re-load after restore; helper functions: `fileExists()`, `restorePasswordsFromHub()`, `restoreSettingsFromHub()` + +### Phase 3 — Implementation plan + +**Context:** After Phase 2, drives are mounted and local backup data is accessible. +The Hub infra backup has the `deployed_stacks` manifest and cross-drive backup data +lives at `/backups/secondary//rsync/` with `_config/` and `_db/` subdirs. + +**Key insight:** In the common DR scenario (system drive died, HDDs survived), app data +is already on the HDD. The main thing to restore is stack configs (compose files + +app.yaml with deployed flag + env vars). Cross-drive rsync backups include `_config/` +which has the full stack directory. 
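
To make the key insight concrete, here is a minimal sketch of how a backup directory could be probed for the `_config/` and `_db/` subdirectories. The names `backupContents`, `classifyEntries`, and `probeBackupDir` are hypothetical, and the sketch assumes both subdirectories sit directly inside the directory being probed.

```go
package main

import (
	"fmt"
	"os"
)

// backupContents records which restorable parts were found in one backup dir.
type backupContents struct {
	HasConfig bool // a "_config" subdirectory exists (copy of the stack directory)
	HasDB     bool // a "_db" subdirectory exists (database dumps)
}

// classifyEntries inspects the names of the subdirectories of a backup dir.
func classifyEntries(dirNames []string) backupContents {
	var c backupContents
	for _, name := range dirNames {
		switch name {
		case "_config":
			c.HasConfig = true
		case "_db":
			c.HasDB = true
		}
	}
	return c
}

// probeBackupDir lists dir on disk and classifies its subdirectories.
func probeBackupDir(dir string) (backupContents, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return backupContents{}, err
	}
	var names []string
	for _, e := range entries {
		if e.IsDir() {
			names = append(names, e.Name())
		}
	}
	return classifyEntries(names), nil
}

func main() {
	// Defaults to the current directory; pass a backup directory as the
	// first argument to probe a real location.
	dir := "."
	if len(os.Args) > 1 {
		dir = os.Args[1]
	}
	c, err := probeBackupDir(dir)
	if err != nil {
		fmt.Println("probe failed:", err)
		return
	}
	fmt.Printf("config found: %v, DB dumps found: %v\n", c.HasConfig, c.HasDB)
}
```

Keeping the classification separate from the directory listing means the "config found" and "DB dump found" columns of a restore UI can be unit-tested without real drives.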
+ +**Files (NEW):** +- `internal/backup/restore_scan.go` — `RestorePlan`, `RestorableApp` types + `ScanDrivesForBackups()` + `BuildRestorePlan()` +- `internal/backup/restore_app_linux.go` — `RestoreAppFromBackup()` (restore config + data + DB dump + docker compose up) +- `internal/backup/restore_app_other.go` — non-Linux stub +- `internal/web/handler_restore.go` — restore page handler + JSON API endpoints +- `internal/web/templates/restore.html` — full-page DR restore UI (standalone, no sidebar) + +**Files (MODIFIED):** +- `internal/web/server.go` — `restoreMode` + `restorePlan` state; `SetRestoreState()`; route interception (redirect all to /restore) +- `cmd/controller/main.go` — after Phase 2 drive mount, scan for backups + build restore plan + pass to web server + +**Restore page behavior:** +- When `restoreMode` is active, ALL web routes redirect to `/restore` (except `/static/*`, `/api/health`, `/api/restore/*`, `/login`, `/logout`) +- Page shows: domain/customer info, drive status, per-app table (config found, data found, DB dump found), restore all / skip buttons +- POST `/api/restore/all` starts sequential restore of all apps +- POST `/api/restore/skip` exits restore mode → normal dashboard +- GET `/api/restore/status` returns current plan with per-app status for JS polling +- All text in Hungarian + +**Per-app restore sequence:** +1. Restore stack config from `_config/` → `/opt/docker/stacks//` +2. Verify app data exists on HDD (it should if HDD survived) +3. If app data missing but rsync backup exists → rsync data back +4. If DB dumps in `_db/` → copy to primary dump dir +5. `docker compose pull` (pull images) +6. `docker compose up -d` (start app) +7. 
Update status → next app + +**Post-restore:** re-scan stacks, clear restoreMode, normal dashboard operation + +### Phase 3 — What was deployed + +**Controller changes** (`controller/`): +- `internal/backup/restore_scan.go` — **NEW** — `RestorePlan`, `RestorableApp`, `DriveInfo`, `InfraStackInfo` types; `ScanDrivesForBackups()` scans mount paths for cross-drive backup dirs, correlates with Hub manifest; `Snapshot()` for thread-safe JSON serialization; `UpdateApp()` for progress tracking +- `internal/backup/restore_app_linux.go` — **NEW** — `RestoreAppFromBackup()` restores a single app: rsyncs `_config/` to stack dir, verifies/restores user data, copies DB dumps, runs `docker compose pull && up -d` +- `internal/backup/restore_app_other.go` — **NEW** — non-Linux stub +- `internal/web/handler_restore.go` — **NEW** — `restorePageHandler()` renders DR page; `apiRestoreStatus()` returns plan+app statuses as JSON; `apiRestoreAll()` triggers sequential restore in goroutine; `apiRestoreSkip()` exits restore mode; `executeAllRestores()` drives the restore loop with per-app timeout +- `internal/web/templates/restore.html` — **NEW** — standalone full-page DR UI (no sidebar); shows customer info, drive status cards, app table with config/data/DB columns, progress bar, restore all / skip buttons; JS polling every 2s during restore +- `internal/web/server.go` — added `restorePlan *backup.RestorePlan` + `restoreMu`; `SetRestoreState()` and `InRestoreMode()` methods; route interception in `ServeHTTP()` redirects all non-static/non-restore routes to `/restore` when in restore mode +- `internal/web/funcmap.go` — added `statusText` template function (Hungarian labels for restore status codes) +- `cmd/controller/main.go` — after Phase 2 drive mount, builds `[]InfraStackInfo` from Hub data, calls `ScanDrivesForBackups()`, sets `restorePlan` metadata, calls `webServer.SetRestoreState()` + +### Phase 4 — What was deployed + +**Script changes:** +- `scripts/docker-setup.sh` — 
`print_summary()` now shows a "Disaster Recovery" block when `$CUSTOMER_ID` is set, informing the operator that the controller will automatically contact the Hub, mount drives, and offer restore + +**README updates:** +- `controller/README.md` — version bump to v0.15.5; repo layout updated with new DR files (restore_scan.go, restore_app_linux.go, restore_drives_linux.go, infra_pull.go, handler_restore.go); roadmap marks DR as completed +- Hub README (`felhom.eu/hub/README.md`) — already had complete DR documentation, no changes needed + +--- + +## Architecture + +### The problem (catch-22) + +When the system drive dies, the backup data lives on surviving HDDs. But a freshly +installed OS doesn't know about those drives — they aren't in `/etc/fstab`, aren't +mounted, and the controller can't scan them. Even if we stored mount info in the +local backup, we can't read the local backup without mounting the drives first. + +### The solution: Hub as infra backup store + +The Hub (`hub.felhom.eu`) is always reachable. During normal operation, the +controller pushes its infrastructure state to the Hub. 
On a fresh deployment: + +``` +[1] docker-setup.sh deploys controller with Hub details (customer_id + API key) +[2] Controller starts → detects empty data dir → "I'm a fresh deployment" +[3] Controller calls Hub: GET /api/v1/infra-backup/{customer_id} +[4] Hub responds with: disk layout, controller.yaml, manifest, restic passwords +[5] Controller scans /dev/ for disks matching stored UUIDs +[6] Controller mounts surviving drives (using its existing disk management) +[7] Local backups on mounted drives are now accessible +[8] Controller auto-restores stack configs → apps appear in dashboard +[9] User opens dashboard → "Restore from backup" wizard +[10] User confirms → controller restores data + starts apps +``` + +### Fallback: local-only detection + +If the Hub is unreachable (no internet, Hub down), the controller falls back to +scanning already-mounted drives for `_infra/manifest.json` — the existing local +backup path. This is less automated (drives must be manually mounted first) but +still works. 
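
The fallback scan described above could look roughly like the following sketch. It assumes `_infra/manifest.json` sits directly under each mount point, which the text does not confirm; `findLocalManifests` and `regularFileExists` are hypothetical names, and the mount points in `main` are examples only.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// findLocalManifests returns the manifest path for every mount point that
// contains _infra/manifest.json. The exists function is injected so the
// logic can be exercised without touching the real filesystem.
func findLocalManifests(mountPoints []string, exists func(string) bool) []string {
	var found []string
	for _, mp := range mountPoints {
		candidate := filepath.Join(mp, "_infra", "manifest.json")
		if exists(candidate) {
			found = append(found, candidate)
		}
	}
	return found
}

// regularFileExists reports whether path is an existing regular file.
func regularFileExists(path string) bool {
	info, err := os.Stat(path)
	return err == nil && info.Mode().IsRegular()
}

func main() {
	// Example mount points; in the fallback case these are whatever the
	// operator mounted by hand before starting the controller.
	mounts := []string{"/mnt/hdd_1", "/mnt/hdd_2"}
	manifests := findLocalManifests(mounts, regularFileExists)
	if len(manifests) == 0 {
		fmt.Println("no local infra manifest found")
		return
	}
	for _, m := range manifests {
		fmt.Println("local infra manifest:", m)
	}
}
```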
+ +--- + +## Data stored on Hub per customer + +The infra-backup payload is a single JSON blob (~20-50KB per customer): + +```json +{ + "customer_id": "demo-felhom", + "domain": "demo-felhom.eu", + "controller_version": "v0.15.5", + "timestamp": "2026-02-19T03:05:00Z", + + "controller_config_b64": "", + "settings_json_b64": "", + + "disk_layout": { + "mounts": [ + { + "uuid": "242ee4da-d9f8-40ce-b3fa-8e4860204790", + "label": "userdate", + "mount_point": "/mnt/sys_drive", + "fs_type": "ext4", + "size_bytes": 350073856000, + "fstab_options": "defaults,noatime", + "role": "system_data", + "bind_subdir": "", + "raw_mount": "" + }, + { + "uuid": "277a2179-a764-4758-b840-9ea741517914", + "label": "hdd_1", + "mount_point": "/mnt/hdd_1", + "fs_type": "ext4", + "size_bytes": 1000204886016, + "fstab_options": "defaults,nofail,noatime", + "role": "hdd_storage", + "bind_subdir": "felhom_data", + "raw_mount": "/mnt/.felhom-raw/hdd_1" + } + ] + }, + + "deployed_stacks": [ + { + "name": "immich", + "display_name": "Immich", + "hdd_path": "/mnt/hdd_1", + "needs_hdd": true + }, + { + "name": "docmost", + "display_name": "Docmost", + "hdd_path": "", + "needs_hdd": false + } + ], + + "restic_password": "base64-encoded-primary-restic-password", + "cross_drive_password": "hex-encoded-cross-drive-password" +} +``` + +**Security:** The Hub is operator-managed infrastructure. The connection is HTTPS +with Bearer token auth. The infra backup contains sensitive data (CF tokens, +restic passwords) but the Hub already receives all system health data. The +operator trusts the Hub with this data. 
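
As an illustration of how a consumer might unpack this payload, the sketch below parses a subset of the fields and base64-decodes `settings_json_b64` and `restic_password`. The type and function names (`snapshotEnvelope`, `decodedSnapshot`, `decodeSnapshot`) are hypothetical. `cross_drive_password` is left out because the example payload describes it as hex-encoded, so it would be carried as a plain string.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
)

// snapshotEnvelope mirrors a subset of the payload fields shown above.
type snapshotEnvelope struct {
	CustomerID      string `json:"customer_id"`
	SettingsJSONB64 string `json:"settings_json_b64"`
	ResticPassword  string `json:"restic_password"`
}

// decodedSnapshot holds the decoded form of the base64 fields.
type decodedSnapshot struct {
	CustomerID     string
	SettingsJSON   []byte
	ResticPassword []byte
}

// decodeSnapshot parses the payload and base64-decodes the encoded fields.
// Absent (empty) fields decode to empty byte slices.
func decodeSnapshot(payload []byte) (*decodedSnapshot, error) {
	var env snapshotEnvelope
	if err := json.Unmarshal(payload, &env); err != nil {
		return nil, fmt.Errorf("parsing payload: %w", err)
	}
	if env.CustomerID == "" {
		return nil, errors.New("payload has no customer_id")
	}
	settings, err := base64.StdEncoding.DecodeString(env.SettingsJSONB64)
	if err != nil {
		return nil, fmt.Errorf("decoding settings_json_b64: %w", err)
	}
	password, err := base64.StdEncoding.DecodeString(env.ResticPassword)
	if err != nil {
		return nil, fmt.Errorf("decoding restic_password: %w", err)
	}
	return &decodedSnapshot{
		CustomerID:     env.CustomerID,
		SettingsJSON:   settings,
		ResticPassword: password,
	}, nil
}

func main() {
	// Made-up sample values, not real credentials.
	payload := []byte(`{"customer_id":"demo-felhom","settings_json_b64":"eyJhIjoxfQ==","restic_password":"czNjcmV0"}`)
	snap, err := decodeSnapshot(payload)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("customer=%s settings=%s\n", snap.CustomerID, snap.SettingsJSON)
}
```

Because the decoded values are secrets, a real caller would avoid logging `ResticPassword` and would write it to disk with restrictive permissions.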
+ +--- + +## Phase 1: Hub infra-backup storage + controller push + +### 1A: Hub — new SQLite table + +**File:** `hub/internal/store/store.go` + +Add migration for a new table: + +```sql +CREATE TABLE IF NOT EXISTS infra_backups ( + customer_id TEXT PRIMARY KEY, + backup_json TEXT NOT NULL, + updated_at DATETIME NOT NULL DEFAULT (datetime('now')) +); +``` + +Add store methods: + +```go +// SaveInfraBackup upserts the infra backup for a customer. +func (s *Store) SaveInfraBackup(customerID string, backupJSON []byte) error { + _, err := s.db.Exec(` + INSERT INTO infra_backups (customer_id, backup_json, updated_at) + VALUES (?, ?, datetime('now')) + ON CONFLICT(customer_id) DO UPDATE SET + backup_json = excluded.backup_json, + updated_at = datetime('now') + `, customerID, string(backupJSON)) + return err +} + +// GetInfraBackup returns the infra backup for a customer, or nil if not found. +func (s *Store) GetInfraBackup(customerID string) ([]byte, error) { + var data string + err := s.db.QueryRow(` + SELECT backup_json FROM infra_backups WHERE customer_id = ? + `, customerID).Scan(&data) + if err == sql.ErrNoRows { + return nil, nil + } + if err != nil { + return nil, err + } + return []byte(data), nil +} +``` + +### 1B: Hub — new API endpoints + +**File:** `hub/internal/api/handler.go` + +Add two endpoints to the existing router: + +```go +// POST /api/v1/infra-backup +// Controller pushes its infrastructure snapshot to the Hub. 
+func (h *Handler) handleInfraBackupPush(w http.ResponseWriter, r *http.Request) { + // Read body (limit to 1MB) + body, err := io.ReadAll(io.LimitReader(r.Body, 1<<20)) + if err != nil { + writeJSON(w, http.StatusBadRequest, map[string]string{"status": "error", "error": "read body: " + err.Error()}) + return + } + + // Validate JSON structure — extract customer_id + var payload struct { + CustomerID string `json:"customer_id"` + } + if err := json.Unmarshal(body, &payload); err != nil || payload.CustomerID == "" { + writeJSON(w, http.StatusBadRequest, map[string]string{"status": "error", "error": "invalid payload or missing customer_id"}) + return + } + + if err := h.store.SaveInfraBackup(payload.CustomerID, body); err != nil { + writeJSON(w, http.StatusInternalServerError, map[string]string{"status": "error", "error": err.Error()}) + return + } + + h.logger.Printf("[INFO] Infra backup saved for %s (%d bytes)", payload.CustomerID, len(body)) + writeJSON(w, http.StatusOK, map[string]string{"status": "ok"}) +} + +// GET /api/v1/infra-backup/{customer_id} +// Fresh controller pulls the infra backup for its customer. 
+func (h *Handler) handleInfraBackupGet(w http.ResponseWriter, r *http.Request) { + customerID := strings.TrimPrefix(r.URL.Path, "/api/v1/infra-backup/") + if customerID == "" { + writeJSON(w, http.StatusBadRequest, map[string]string{"status": "error", "error": "missing customer_id"}) + return + } + + data, err := h.store.GetInfraBackup(customerID) + if err != nil { + writeJSON(w, http.StatusInternalServerError, map[string]string{"status": "error", "error": err.Error()}) + return + } + if data == nil { + writeJSON(w, http.StatusNotFound, map[string]string{"status": "error", "error": "no infra backup found"}) + return + } + + w.Header().Set("Content-Type", "application/json") + w.Write(data) +} +``` + +Register routes in the existing `ServeHTTP()` or router setup: + +```go +case r.Method == http.MethodPost && path == "/api/v1/infra-backup": + h.handleInfraBackupPush(w, r) +case r.Method == http.MethodGet && strings.HasPrefix(path, "/api/v1/infra-backup/"): + h.handleInfraBackupGet(w, r) +``` + +Both endpoints use the existing Bearer token auth (same `report_api_key`). + +### 1C: Hub — add infra backup info to dashboard + +**File:** `hub/internal/web/templates/customer.html` + +Add a section to the customer detail page showing infra backup status: + +```html + +
+ Infra Backup + {{if .InfraBackup}} + Last updated: {{.InfraBackupAge}} ago + Deployed stacks: {{.InfraBackupStackCount}} + Disks: {{.InfraBackupDiskCount}} + {{else}} + No infra backup received yet + {{end}} + 
+``` + +Add store method and web handler logic to load infra backup metadata for the +customer detail page. + +### 1D: Controller — push infra snapshot to Hub + +**File:** `controller/internal/report/infra_backup.go` (NEW) + +```go +package report + +import ( + "encoding/base64" + "encoding/json" + "os" + "time" + + "gitea.dooplex.hu/admin/felhom-controller/internal/backup" + "gitea.dooplex.hu/admin/felhom-controller/internal/settings" +) + +// InfraBackup is the payload pushed to the Hub for disaster recovery. +type InfraBackup struct { + CustomerID string `json:"customer_id"` + Domain string `json:"domain"` + ControllerVersion string `json:"controller_version"` + Timestamp string `json:"timestamp"` + + ControllerConfigB64 string `json:"controller_config_b64"` + SettingsJSONB64 string `json:"settings_json_b64,omitempty"` + + DiskLayout DiskLayout `json:"disk_layout"` + DeployedStacks []InfraStack `json:"deployed_stacks"` + + ResticPassword string `json:"restic_password,omitempty"` + CrossDrivePassword string `json:"cross_drive_password,omitempty"` +} + +type DiskLayout struct { + Mounts []DiskMount `json:"mounts"` +} + +type DiskMount struct { + UUID string `json:"uuid"` + Label string `json:"label"` + MountPoint string `json:"mount_point"` + FSType string `json:"fs_type"` + SizeBytes int64 `json:"size_bytes"` + FstabOptions string `json:"fstab_options"` + Role string `json:"role"` // "system_data", "hdd_storage", "root" + BindSubdir string `json:"bind_subdir"` // e.g., "felhom_data" for HDD bind mounts + RawMount string `json:"raw_mount"` // e.g., "/mnt/.felhom-raw/hdd_1" +} + +type InfraStack struct { + Name string `json:"name"` + DisplayName string `json:"display_name"` + HDDPath string `json:"hdd_path,omitempty"` + NeedsHDD bool `json:"needs_hdd"` +} + +// BuildInfraBackup collects all infrastructure state for Hub backup. 
+func BuildInfraBackup( + customerID, domain, version string, + controllerYAMLPath string, + settingsPath string, + resticPasswordFile string, + sett *settings.Settings, + stackProvider backup.StackDataProvider, +) (*InfraBackup, error) { + ib := &InfraBackup{ + CustomerID: customerID, + Domain: domain, + ControllerVersion: version, + Timestamp: time.Now().UTC().Format(time.RFC3339), + } + + // Read and encode controller.yaml + if data, err := os.ReadFile(controllerYAMLPath); err == nil { + ib.ControllerConfigB64 = base64.StdEncoding.EncodeToString(data) + } + + // Read and encode settings.json + if data, err := os.ReadFile(settingsPath); err == nil { + ib.SettingsJSONB64 = base64.StdEncoding.EncodeToString(data) + } + + // Read restic password + if data, err := os.ReadFile(resticPasswordFile); err == nil { + ib.ResticPassword = base64.StdEncoding.EncodeToString(data) + } + + // Read cross-drive password + if pw := sett.GetCrossDriveResticPassword(); pw != "" { + ib.CrossDrivePassword = pw + } + + // Collect disk layout (see implementation note below) + ib.DiskLayout = collectDiskLayout() + + // Collect deployed stacks + deployed := stackProvider.ListDeployedStacks() + for _, s := range deployed { + ib.DeployedStacks = append(ib.DeployedStacks, InfraStack{ + Name: s.Name, + DisplayName: s.DisplayName, + HDDPath: stackProvider.GetStackHDDPath(s.Name), + NeedsHDD: s.NeedsHDD, + }) + } + + return ib, nil +} + +// collectDiskLayout reads /etc/fstab and lsblk to build the disk layout. +// This runs inside the container which has /host-fstab mounted and access to +// /host-dev/ for block device info. +func collectDiskLayout() DiskLayout { + // Implementation: parse /host-fstab (mounted from host /etc/fstab) + // and correlate with lsblk -J output. + // + // The controller already has disk management code in internal/stacks/ + // or similar — reuse the existing lsblk parsing. 
+ // + // For each non-root, non-swap, non-boot mount in fstab: + // - Extract UUID, mount point, fs_type, options + // - Detect role: "system_data" if mount_point matches system_data_path, + // "hdd_storage" if it's under /mnt/.felhom-raw/ or /mnt/hdd_* + // - Detect bind mounts (type=none, options contain "bind") + // - Get size from lsblk + // + // Return the DiskLayout struct. + // + // See the detailed implementation note in the "Implementation details" section. + return DiskLayout{} +} +``` + +### 1E: Controller — push infra backup after each backup cycle + +**File:** `controller/cmd/controller/main.go` + +Add the infra backup push to the backup scheduler (after Tier1 + Tier2 complete): + +```go +// In the "backup" daily scheduler: +sched.Daily("backup", cfg.Backup.ResticSchedule, func(ctx context.Context) error { + err := backupMgr.RunBackup(ctx) + crossDriveRunner.RunAllScheduled(ctx, "daily") + if time.Now().Weekday() == time.Sunday { + crossDriveRunner.RunAllScheduled(ctx, "weekly") + } + + // NEW: Push infra backup to Hub + if hubPusher != nil && cfg.Hub.Enabled { + go pushInfraBackup(cfg, sett, stackProv, hubPusher, logger) + } + + return err +}) +``` + +```go +func pushInfraBackup(cfg *config.Config, sett *settings.Settings, + stackProv backup.StackDataProvider, pusher *report.Pusher, logger *log.Logger) { + + ib, err := report.BuildInfraBackup( + cfg.Customer.ID, cfg.Customer.Domain, Version, + "/opt/docker/felhom-controller/controller.yaml", + filepath.Join(cfg.Paths.DataDir, "settings.json"), + cfg.Backup.ResticPasswordFile, + sett, stackProv, + ) + if err != nil { + logger.Printf("[WARN] Failed to build infra backup: %v", err) + return + } + + data, err := json.Marshal(ib) + if err != nil { + logger.Printf("[WARN] Failed to marshal infra backup: %v", err) + return + } + + if err := pusher.PushInfraBackup(data); err != nil { + logger.Printf("[WARN] Failed to push infra backup to Hub: %v", err) + } else { + logger.Printf("[INFO] Infra backup pushed 
to Hub (%d bytes)", len(data)) + } +} +``` + +### 1F: Controller — add `PushInfraBackup` to Pusher + +**File:** `controller/internal/report/pusher.go` + +Add a new method alongside the existing `Push()`: + +```go +// PushInfraBackup sends the infrastructure backup to the Hub. +func (p *Pusher) PushInfraBackup(data []byte) error { + if !p.enabled { + return nil + } + + url := p.hubURL + "/api/v1/infra-backup" + + var lastErr error + for attempt := 0; attempt < 3; attempt++ { + if attempt > 0 { + time.Sleep(5 * time.Second) + } + + req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(data)) + if err != nil { + lastErr = err + continue + } + req.Header.Set("Content-Type", "application/json") + if p.apiKey != "" { + req.Header.Set("Authorization", "Bearer "+p.apiKey) + } + + resp, err := p.httpClient.Do(req) + if err != nil { + lastErr = err + continue + } + io.Copy(io.Discard, resp.Body) + resp.Body.Close() + + if resp.StatusCode >= 200 && resp.StatusCode < 300 { + return nil + } + lastErr = fmt.Errorf("HTTP %d", resp.StatusCode) + } + + return fmt.Errorf("infra backup push failed after 3 attempts: %w", lastErr) +} +``` + +--- + +## Phase 2: New-deployment detection + Hub pull + auto-mount + +### 2A: Controller — detect fresh deployment + +**File:** `controller/cmd/controller/main.go` + +The controller uses a Docker named volume (`controller-data`) at +`/opt/docker/felhom-controller/data`. On a fresh deployment, this volume is +empty — no `settings.json`, no `session_secret`, no `snapshot-history.json`. 
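
The empty-volume check can be reduced to a single file probe. The sketch below is one possible shape for it, not the controller's confirmed implementation: the behavior of `fileExists` is an assumption, and `isFreshDeployment` is written as a standalone function for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// fileExists reports whether path exists and is a regular file.
// Any stat error (including permission problems) counts as "not there".
func fileExists(path string) bool {
	info, err := os.Stat(path)
	return err == nil && info.Mode().IsRegular()
}

// isFreshDeployment treats a data directory without settings.json as a
// new install.
func isFreshDeployment(dataDir string) bool {
	return !fileExists(filepath.Join(dataDir, "settings.json"))
}

func main() {
	// Simulate the named volume with a temporary directory.
	dir, err := os.MkdirTemp("", "felhom-data-")
	if err != nil {
		fmt.Println("temp dir:", err)
		return
	}
	defer os.RemoveAll(dir)

	fmt.Println("empty volume, fresh:", isFreshDeployment(dir))

	// Simulate the first settings save.
	if err := os.WriteFile(filepath.Join(dir, "settings.json"), []byte("{}"), 0o600); err != nil {
		fmt.Println("write:", err)
		return
	}
	fmt.Println("after first save, fresh:", isFreshDeployment(dir))
}
```

One consequence of this design is that an unreadable data directory also looks like a fresh deployment, so a stricter variant might distinguish "file does not exist" from other stat errors before triggering a Hub pull.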
+ +Add detection after settings initialization: + +```go +// Detect fresh deployment (empty data directory = new install) +isFreshDeployment := !fileExists(filepath.Join(cfg.Paths.DataDir, "settings.json")) + +if isFreshDeployment { + logger.Println("[INFO] Fresh deployment detected — checking Hub for infra backup") + + // Write a marker so we don't re-trigger on next restart + // (settings.json will be created by Settings.save() soon anyway) +} +``` + +**Important:** The marker to distinguish "fresh" from "restarted" is the absence +of `settings.json`. Once the Settings package creates it (on first save), subsequent +restarts won't trigger the fresh-deployment path. + +### 2B: Controller — pull infra backup from Hub + +**File:** `controller/internal/report/infra_pull.go` (NEW) + +```go +package report + +import ( + "encoding/json" + "fmt" + "io" + "net/http" + "time" +) + +// PullInfraBackup fetches the infrastructure backup from the Hub. +// Returns nil, nil if no backup exists for this customer. 
+func PullInfraBackup(hubURL, apiKey, customerID string) (*InfraBackup, error) { + url := hubURL + "/api/v1/infra-backup/" + customerID + + client := &http.Client{Timeout: 30 * time.Second} + + req, err := http.NewRequest(http.MethodGet, url, nil) + if err != nil { + return nil, err + } + if apiKey != "" { + req.Header.Set("Authorization", "Bearer "+apiKey) + } + + resp, err := client.Do(req) + if err != nil { + return nil, fmt.Errorf("hub request failed: %w", err) + } + defer resp.Body.Close() + + if resp.StatusCode == http.StatusNotFound { + return nil, nil // no backup for this customer + } + if resp.StatusCode != http.StatusOK { + return nil, fmt.Errorf("hub returned HTTP %d", resp.StatusCode) + } + + body, err := io.ReadAll(io.LimitReader(resp.Body, 5<<20)) // 5MB limit + if err != nil { + return nil, fmt.Errorf("reading response: %w", err) + } + + var ib InfraBackup + if err := json.Unmarshal(body, &ib); err != nil { + return nil, fmt.Errorf("parsing infra backup: %w", err) + } + + return &ib, nil +} +``` + +### 2C: Controller — auto-mount drives from Hub disk layout + +**File:** `controller/internal/backup/restore_drives.go` (NEW) + +```go +package backup + +import ( + "context" + "encoding/json" + "fmt" + "log" + "os" + "os/exec" + "path/filepath" + "strings" + + "gitea.dooplex.hu/admin/felhom-controller/internal/report" +) + +// MountDrivesFromLayout scans block devices for disks matching the Hub's +// stored disk layout and mounts them. Uses the controller's existing +// two-layer mount pattern: raw mount → bind mount. +// +// The controller container has: +// - /host-dev:/dev (rw) — block device access +// - /host-fstab:/etc/fstab — can update fstab +// - privileged: true — can mount filesystems +// +// Returns the list of successfully mounted paths. +func MountDrivesFromLayout(ctx context.Context, layout report.DiskLayout, logger *log.Logger) ([]string, error) { + // 1. 
Get current block devices with UUIDs + lsblkDevices, err := getLsblkDevices(ctx) + if err != nil { + return nil, fmt.Errorf("scanning block devices: %w", err) + } + + var mounted []string + + for _, diskMount := range layout.Mounts { + if diskMount.UUID == "" { + continue + } + + // Skip system partitions (root, boot, swap) + if diskMount.Role == "root" || diskMount.Role == "boot" || diskMount.Role == "swap" { + continue + } + + // Find matching device by UUID + device := findDeviceByUUID(lsblkDevices, diskMount.UUID) + if device == "" { + logger.Printf("[WARN] Disk UUID %s (%s) not found — drive may be missing", + diskMount.UUID, diskMount.Label) + continue + } + + // Check if already mounted + if isMounted(diskMount.MountPoint) || isMounted(diskMount.RawMount) { + logger.Printf("[INFO] %s already mounted", diskMount.MountPoint) + mounted = append(mounted, diskMount.MountPoint) + continue + } + + // Log the full UUID: slicing it with [:12] would panic on short UUIDs such as vfat's XXXX-XXXX form + logger.Printf("[INFO] Found disk %s (UUID=%s, label=%s) — mounting to %s", + device, diskMount.UUID, diskMount.Label, diskMount.MountPoint) + + // Mount using the felhom two-layer pattern: + // Layer 1: raw mount → /mnt/.felhom-raw/