v0.15.5: Disaster recovery — Hub-based infra backup, auto-mount, restore UI
Complete DR implementation (TASK2.md Phases 1-4):
- Hub infra-backup push/pull endpoints (controller.yaml, disk layout, stacks)
- Fresh-deployment detection pulls config from Hub, auto-mounts drives by UUID
- Full-page restore UI with drive status, app table, sequential restore
- docker-setup.sh shows DR instructions when customer_id is configured

New files: disk_layout.go, restore_scan.go, restore_app_linux.go, restore_drives_linux.go, infra_backup.go, infra_pull.go, handler_restore.go, restore.html

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -420,6 +420,15 @@ Update the summary to reflect:
 - Show hub status (enabled)
 - Remove the `CUSTOMER_ID` display bug (the "Note: No --customer specified"
   message is inside the `if [[ -n "$CUSTOMER_ID" ]]` block — wrong logic)
+- Add DR/reinstallation note:
+  ```
+  If this is a reinstallation, the controller will automatically:
+  1. Contact the Hub for your previous configuration
+  2. Mount your existing storage drives
+  3. Detect and restore your applications
+
+  Open https://felhom.<DOMAIN> to monitor the restore process.
+  ```
 
 ### 12. Update `print_help()`
 
+46
-5
@@ -4,7 +4,7 @@
 
 A single, lightweight Go container that replaces Portainer + scattered systemd scripts with a unified, Hungarian-language web dashboard for managing Docker Compose stacks, backups, storage, monitoring, and notifications on customer hardware.
 
-**Current version: v0.15.4**
+**Current version: v0.15.5**
 
 ---
 
@@ -593,14 +593,49 @@ Periodic JSON push (default every 15 min) to the central felhom-hub service:
 
 Bearer token authentication, 3-attempt retry with 5-second backoff.
 
+#### Infrastructure Backup to Hub (`internal/report/infra_backup.go`)
+
+After each backup cycle, the controller pushes a full infrastructure snapshot to the Hub for disaster recovery. This snapshot includes:
+- `controller.yaml` (base64-encoded, full config including secrets)
+- `settings.json` (base64-encoded, backup prefs, storage paths, cross-drive configs)
+- Disk layout (UUIDs, labels, mount points, fstab options, bind-mount topology)
+- Deployed stacks manifest (app names, HDD paths)
+- Restic passwords (primary + cross-drive, base64-encoded)
+
+This enables fully automated recovery when the system drive is replaced — the new controller pulls the snapshot from the Hub, auto-mounts surviving drives by UUID, and restores all applications.
+
 #### Hub Dashboard
 
 The hub service (separate Go app in the `felhom.eu` repo) provides:
 - Multi-customer overview table with status indicators
 - Customer detail page with system/storage/containers/backup/health sections
+- Infra backup status per customer (last sync, stack count, disk count)
 - Color coding: green (<30min), yellow (30-60min), red (>60min since last report)
 - 90-day report retention with daily prune
 
+### 9. Disaster Recovery
+
+When a system drive fails and is replaced, the controller can automatically restore the full deployment:
+
+```
+1. docker-setup.sh deploys fresh controller (Hub enabled, customer_id configured)
+2. Controller detects empty data dir → fresh deployment
+3. Controller pulls infra backup from Hub → gets disk layout, passwords, configs
+4. Controller scans block devices for UUIDs matching stored disk layout
+5. Controller mounts surviving drives (e.g., HDD with backups)
+6. Controller scans mounted drives for local backup data (_infra/ + rsync copies)
+7. Controller auto-restores stack configs → apps appear in dashboard
+8. User opens dashboard → "Visszaállítás" (Restore) wizard
+9. User confirms → sequential restore: rsync first, restic fallback, DB import
+10. Apps restored and running
+```
+
+**Backup sources (priority order):**
+1. **Rsync copies** (cross-drive, plain files, no password needed) — fastest, most reliable
+2. **Restic snapshots** (encrypted, needs password from Hub) — comprehensive but slower
+
+**Fallback:** If the Hub is unreachable, the controller can still detect backups on already-mounted drives (manual mount or pre-existing fstab entries).
+
 ---
 
 ## Repository Layout
@@ -631,7 +666,10 @@ controller/
 │   │   ├── restic.go # Restic operations (init, snapshot, prune, check) — repoPath as param
 │   │   ├── appdata.go # StackDataProvider interface, app data discovery
 │   │   ├── crossdrive.go # Per-app backup to secondary storage (rsync/restic)
-│   │   └── restore.go # Per-app restore from per-drive repo
+│   │   ├── restore.go # Per-app restore from per-drive repo
+│   │   ├── restore_scan.go # DR: scan drives for backup data, build restore plan
+│   │   ├── restore_app_linux.go # DR: per-app restore (rsync config/data + docker compose up)
+│   │   └── restore_drives_linux.go # DR: auto-mount drives by UUID from Hub infra backup
 │   ├── api/router.go # REST API endpoints (~30 routes)
 │   ├── scheduler/scheduler.go # Central job scheduler (Every, Daily)
 │   ├── system/
@@ -648,16 +686,18 @@ controller/
 │   ├── notify/notifier.go # Email relay to hub, preference sync, cooldowns
 │   ├── report/
 │   │   ├── builder.go # Hub report builder (all subsystems → JSON)
-│   │   └── pusher.go # HTTP POST to hub (retry, Bearer auth)
+│   │   ├── pusher.go # HTTP POST to hub (retry, Bearer auth)
+│   │   └── infra_pull.go # DR: pull infra backup from Hub for fresh deployment
 │   └── web/
 │       ├── server.go # HTTP server, routing, static files
 │       ├── auth.go # Session auth, login/logout, session cleanup
 │       ├── handlers.go # Page handlers (dashboard, stacks, deploy, backups, etc.)
+│       ├── handler_restore.go # DR: restore page handler + APIs (scan, restore all, skip)
 │       ├── storage_handlers.go # Storage API handlers (scan, format, attach, migrate, cleanup)
 │       ├── alerts.go # State-based alert generation
 │       ├── funcmap.go # Template functions (state colors, Hungarian formatting)
 │       ├── embed.go # go:embed for templates + Chart.js
-│       └── templates/ # 12 HTML files + style.css (Hungarian UI)
+│       └── templates/ # 13 HTML files + style.css (Hungarian UI)
 ├── configs/
 │   ├── controller.yaml.example # Full config reference
 │   └── example-felhom-metadata.yml # .felhom.yml format reference
@@ -869,6 +909,7 @@ See `docker-compose.yml` for the full volume configuration.
 - [x] Cross-drive restic pruning (v0.14.0)
 - [x] Auto Tier 2 for small apps (v0.14.1) — auto-enable daily rsync for non-HDD apps when ≥2 drives
 - [x] Infrastructure config in cross-drive backup (v0.14.1) — stacks dir + controller.yaml in `_infra/` + restic
+- [x] Disaster recovery (v0.15.5) — Hub-based infra backup, auto-mount by UUID, restore UI with full-page takeover
 
 ### In Progress / Planned
 
@@ -885,7 +926,7 @@ See `docker-compose.yml` for the full volume configuration.
 
 | Node | Hardware | Domain | Status |
 |------|----------|--------|--------|
-| demo-felhom | Acemagic GK3PLUS N100, 16G RAM, 512G SSD + 1TB HDD | demo-felhom.eu | Controller v0.15.0 |
+| demo-felhom | Acemagic GK3PLUS N100, 16G RAM, 512G SSD + 1TB HDD | demo-felhom.eu | Controller v0.15.5 |
 | pi-customer-1 | Raspberry Pi 3B+, 1G RAM, 32G SD | pi-customer-1.local | Not yet tested |
 
 ## Related Repositories
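Note on the README changes above: the new Disaster Recovery section ranks rsync copies ahead of restic snapshots as restore sources. One way to express that ordering is sketched below. The function `pickRestoreSource` and its boolean inputs are hypothetical and do not appear in the commit; the real restore plan is built in `restore_scan.go`, which is not shown in this part of the diff.

```go
package main

import "fmt"

// pickRestoreSource returns the preferred restore source for one app.
// Hypothetical helper illustrating the README's priority order:
// plain rsync copies first, encrypted restic snapshots as fallback.
func pickRestoreSource(hasRsyncCopy, hasResticSnapshot bool) string {
	switch {
	case hasRsyncCopy:
		return "rsync"
	case hasResticSnapshot:
		return "restic"
	default:
		return "" // nothing to restore from
	}
}

func main() {
	fmt.Println(pickRestoreSource(true, true))
	fmt.Println(pickRestoreSource(false, true))
}
```

The rsync branch needs no password, which is why it still works in the README's fallback case where the Hub is unreachable.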
@@ -2,6 +2,8 @@ package main
 
 import (
 	"context"
+	"encoding/base64"
+	"encoding/json"
 	"flag"
 	"fmt"
 	"log"
@@ -61,6 +63,76 @@ func main() {
 		logger.Fatalf("[FATAL] Failed to load settings from %s: %v", settingsPath, err)
 	}
 
+	// --- Detect fresh deployment (Phase 2+3: DR restore from Hub) ---
+	var restorePlan *backup.RestorePlan
+	isFreshDeployment := !fileExists(settingsPath)
+	if isFreshDeployment && cfg.Hub.Enabled && cfg.Hub.URL != "" {
+		logger.Println("[INFO] Fresh deployment detected — checking Hub for infra backup")
+
+		ib, pullErr := report.PullInfraBackup(cfg.Hub.URL, cfg.Hub.APIKey, cfg.Customer.ID)
+		if pullErr != nil {
+			logger.Printf("[WARN] Could not reach Hub for infra backup: %v", pullErr)
+		} else if ib != nil {
+			logger.Printf("[INFO] Found infra backup on Hub: %s (%s), %d stacks, synced %s",
+				ib.Domain, ib.CustomerID, len(ib.DeployedStacks), ib.Timestamp)
+
+			// Restore restic passwords
+			restorePasswordsFromHub(ib, cfg, sett, logger)
+
+			// Restore settings.json from Hub backup
+			restoreSettingsFromHub(ib, cfg, logger)
+
+			// Re-load settings (now from restored file)
+			if restoredSett, loadErr := settings.Load(settingsPath, logger); loadErr == nil {
+				sett = restoredSett
+				logger.Println("[INFO] Settings reloaded after Hub restore")
+			}
+
+			// Mount drives using stored disk layout
+			mountCtx, mountCancel := context.WithTimeout(context.Background(), 2*time.Minute)
+			mountedPaths, mountErr := backup.MountDrivesFromLayout(mountCtx, ib.DiskLayout, logger)
+			mountCancel()
+			if mountErr != nil {
+				logger.Printf("[WARN] Drive mounting error: %v", mountErr)
+			} else if len(mountedPaths) > 0 {
+				logger.Printf("[INFO] Mounted %d drives from Hub disk layout: %v", len(mountedPaths), mountedPaths)
+			} else {
+				logger.Println("[INFO] No matching drives found to mount from Hub disk layout")
+			}
+
+			// Phase 3: Scan mounted drives for backup data and build restore plan
+			if len(ib.DeployedStacks) > 0 {
+				// Collect mount paths from disk layout
+				var drivePaths []string
+				for _, dm := range ib.DiskLayout.Mounts {
+					if dm.MountPoint != "" {
+						drivePaths = append(drivePaths, dm.MountPoint)
+					}
+				}
+
+				// Convert report stacks to backup scan format
+				var infraStacks []backup.InfraStackInfo
+				for _, s := range ib.DeployedStacks {
+					infraStacks = append(infraStacks, backup.InfraStackInfo{
+						Name:        s.Name,
+						DisplayName: s.DisplayName,
+						HDDPath:     s.HDDPath,
+						NeedsHDD:    s.NeedsHDD,
+					})
+				}
+
+				restorePlan = backup.ScanDrivesForBackups(drivePaths, infraStacks, logger)
+				restorePlan.CustomerID = ib.CustomerID
+				restorePlan.Domain = ib.Domain
+				restorePlan.Timestamp = ib.Timestamp
+
+				logger.Printf("[INFO] DR restore plan ready: %d apps to restore", len(restorePlan.Apps))
+			}
+		} else {
+			logger.Println("[INFO] No infra backup found on Hub for this customer")
+		}
+	}
+
 	// --- Auto-discover storage paths from deployed apps ---
 	discoveredPaths := discoverHDDPaths(cfg.Paths.StacksDir, logger)
 	sett.AutoDiscoverStoragePaths(discoveredPaths, cfg.Paths.HDDPath, logger)
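Review note on the hunk above: the fresh-deployment check relies on `fileExists`, which (as defined later in this file) returns `false` for any `os.Stat` error. A permission or I/O error on `settings.json` would therefore also be treated as a fresh deployment. A stricter check could separate "does not exist" from other failures. The sketch below is illustrative only; `isMissing` is a hypothetical name and not part of the commit.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// isMissing reports whether path definitely does not exist.
// Any other stat failure (permissions, I/O) is returned as an error
// so the caller can decide not to treat it as a fresh deployment.
func isMissing(path string) (bool, error) {
	_, err := os.Stat(path)
	switch {
	case err == nil:
		return false, nil
	case errors.Is(err, fs.ErrNotExist):
		return true, nil
	default:
		return false, err
	}
}

func main() {
	missing, err := isMissing("/no/such/felhom-dir/settings.json")
	fmt.Println("missing:", missing, "err:", err)
}
```

A caller could log the error and skip the Hub restore instead of overwriting local passwords and settings when the state of the data directory is unknown.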
@@ -183,6 +255,12 @@ func main() {
 		return nil
 	})
 
+	// --- Central hub pusher (declared early so backup closure can reference it) ---
+	var hubPusher *report.Pusher
+	if cfg.Hub.URL != "" && cfg.Hub.APIKey != "" {
+		hubPusher = report.NewPusher(&cfg.Hub, logger)
+	}
+
 	// Backup daily jobs
 	if cfg.Backup.Enabled && backupMgr != nil {
 		sched.Daily("db-dump", cfg.Backup.DBDumpSchedule, func(ctx context.Context) error {
@@ -209,6 +287,10 @@ func main() {
 					}
 				}
 			}
+			// Push infra backup to Hub after all backup tiers complete
+			if hubPusher != nil && cfg.Hub.Enabled {
+				go pushInfraBackup(cfg, sett, stackProv, hubPusher, logger)
+			}
 			return err
 		})
 
@@ -245,10 +327,8 @@ func main() {
 		})
 	}
 
-	// --- Central hub reporting ---
-	var hubPusher *report.Pusher
-	if cfg.Hub.URL != "" && cfg.Hub.APIKey != "" {
-		hubPusher = report.NewPusher(&cfg.Hub, logger)
+	// --- Central hub reporting schedule ---
+	if hubPusher != nil {
 		if cfg.Hub.Enabled {
 			pushInterval, err := time.ParseDuration(cfg.Hub.PushInterval)
 			if err != nil {
@@ -305,6 +385,8 @@ func main() {
 			if pushErr != nil {
 				logger.Printf("[WARN] Startup hub report failed after 3 attempts — next scheduled push in %s", cfg.Hub.PushInterval)
 			}
+			// Also push infra backup on startup
+			go pushInfraBackup(cfg, sett, stackProv, hubPusher, logger)
 		} else {
 			// Send a minimal "disabled" notification so hub knows reporting is intentionally off
 			r := &report.Report{
@@ -356,6 +438,12 @@ func main() {
 	// --- Initialize web server ---
 	webServer := web.NewServer(cfg, stackMgr, cpuCollector, backupMgr, crossDriveRunner, sched, sett, alertMgr, notifier, logger, Version)
 
+	// Phase 3: Set DR restore mode if a restore plan was built
+	if restorePlan != nil && len(restorePlan.Apps) > 0 {
+		webServer.SetRestoreState(restorePlan)
+		logger.Println("[INFO] DR restore mode activated — all web routes redirect to /restore")
+	}
+
 	// --- Build HTTP mux ---
 	mux := http.NewServeMux()
 
@@ -491,6 +579,84 @@ func (a *stackAdapter) GetStackHDDPath(name string) string {
 	return ""
 }
 
+// pushInfraBackup builds and sends the infrastructure snapshot to the Hub.
+func pushInfraBackup(cfg *config.Config, sett *settings.Settings,
+	stackProv *stackAdapter, pusher *report.Pusher, logger *log.Logger) {
+
+	ib, err := report.BuildInfraBackup(
+		cfg.Customer.ID, cfg.Customer.Domain, Version,
+		"/opt/docker/felhom-controller/controller.yaml",
+		filepath.Join(cfg.Paths.DataDir, "settings.json"),
+		cfg.Backup.ResticPasswordFile,
+		cfg.Paths.SystemDataPath,
+		sett, stackProv,
+	)
+	if err != nil {
+		logger.Printf("[WARN] Failed to build infra backup: %v", err)
+		return
+	}
+
+	data, err := json.Marshal(ib)
+	if err != nil {
+		logger.Printf("[WARN] Failed to marshal infra backup: %v", err)
+		return
+	}
+
+	if err := pusher.PushInfraBackup(data); err != nil {
+		logger.Printf("[WARN] Failed to push infra backup to Hub: %v", err)
+	}
+}
+
+// fileExists returns true if the path exists (file or directory).
+func fileExists(path string) bool {
+	_, err := os.Stat(path)
+	return err == nil
+}
+
+// restorePasswordsFromHub restores restic passwords from a Hub infra backup.
+func restorePasswordsFromHub(ib *report.InfraBackup, cfg *config.Config,
+	sett *settings.Settings, logger *log.Logger) {
+
+	if ib.ResticPassword != "" {
+		decoded, err := base64.StdEncoding.DecodeString(ib.ResticPassword)
+		if err == nil && len(decoded) > 0 {
+			dir := filepath.Dir(cfg.Backup.ResticPasswordFile)
+			os.MkdirAll(dir, 0700)
+			if err := os.WriteFile(cfg.Backup.ResticPasswordFile, decoded, 0600); err == nil {
+				logger.Println("[INFO] Primary restic password restored from Hub")
+			} else {
+				logger.Printf("[WARN] Failed to write restic password file: %v", err)
+			}
+		}
+	}
+
+	if ib.CrossDrivePassword != "" {
+		if err := sett.SetCrossDriveResticPassword(ib.CrossDrivePassword); err == nil {
+			logger.Println("[INFO] Cross-drive restic password restored from Hub")
+		} else {
+			logger.Printf("[WARN] Failed to set cross-drive password: %v", err)
+		}
+	}
+}
+
+// restoreSettingsFromHub restores settings.json from a Hub infra backup.
+func restoreSettingsFromHub(ib *report.InfraBackup, cfg *config.Config, logger *log.Logger) {
+	if ib.SettingsJSONB64 == "" {
+		return
+	}
+	decoded, err := base64.StdEncoding.DecodeString(ib.SettingsJSONB64)
+	if err != nil {
+		logger.Printf("[WARN] Failed to decode settings from Hub: %v", err)
+		return
+	}
+	settingsPath := filepath.Join(cfg.Paths.DataDir, "settings.json")
+	if err := os.WriteFile(settingsPath, decoded, 0600); err != nil {
+		logger.Printf("[WARN] Failed to write restored settings.json: %v", err)
+	} else {
+		logger.Println("[INFO] Settings restored from Hub backup")
+	}
+}
+
 // discoverHDDPaths scans deployed apps' app.yaml for HDD_PATH env values.
 func discoverHDDPaths(stacksDir string, logger *log.Logger) []string {
 	entries, err := os.ReadDir(stacksDir)
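Review note on `restorePasswordsFromHub` above: a base64 payload that fails to decode is skipped without a log line, and the result of `os.MkdirAll` is discarded. A variant that returns every failure to the caller might look like the sketch below. `writeSecretFromBase64` is a hypothetical helper, not part of the commit; the 0700 and 0600 modes mirror the ones used in the hunk.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// writeSecretFromBase64 decodes a base64 secret and writes it to path
// with owner-only permissions, reporting each failure to the caller.
func writeSecretFromBase64(path, encoded string) error {
	decoded, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return fmt.Errorf("decoding secret: %w", err)
	}
	if len(decoded) == 0 {
		return errors.New("decoded secret is empty")
	}
	if err := os.MkdirAll(filepath.Dir(path), 0o700); err != nil {
		return fmt.Errorf("creating secret dir: %w", err)
	}
	if err := os.WriteFile(path, decoded, 0o600); err != nil {
		return fmt.Errorf("writing secret: %w", err)
	}
	return nil
}

func main() {
	dir, err := os.MkdirTemp("", "felhom-secret-")
	if err != nil {
		fmt.Println("temp dir:", err)
		return
	}
	defer os.RemoveAll(dir)

	// Example value only; a real payload would come from the Hub snapshot.
	encoded := base64.StdEncoding.EncodeToString([]byte("example-password"))
	path := filepath.Join(dir, "restic", "password")
	if err := writeSecretFromBase64(path, encoded); err != nil {
		fmt.Println("restore failed:", err)
		return
	}
	fmt.Println("secret written to", path)
}
```

Returning the error lets the caller emit a single `[WARN]` line per failure, which matches the logging style used elsewhere in this file.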
@@ -0,0 +1,19 @@
+package backup
+
+// DiskLayout holds the fstab-derived mount topology for disaster recovery.
+type DiskLayout struct {
+	Mounts []DiskMount `json:"mounts"`
+}
+
+// DiskMount represents a single mount entry from fstab.
+type DiskMount struct {
+	UUID         string `json:"uuid"`
+	Label        string `json:"label"`
+	MountPoint   string `json:"mount_point"`
+	FSType       string `json:"fs_type"`
+	SizeBytes    int64  `json:"size_bytes"`
+	FstabOptions string `json:"fstab_options"`
+	Role         string `json:"role"`        // "system_data", "hdd_storage"
+	BindSubdir   string `json:"bind_subdir"` // e.g., "felhom_data"
+	RawMount     string `json:"raw_mount"`   // e.g., "/mnt/.felhom-raw/hdd_1"
+}
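For reference, the following sketch shows roughly what a serialized `DiskLayout` could look like and how it decodes. The struct definitions are copied from the file above. The sample UUID, label, size, and fstab options are made-up illustration values, and `parseLayout` is a hypothetical helper that does not exist in the commit.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// DiskLayout and DiskMount are copied from disk_layout.go in this commit.
type DiskLayout struct {
	Mounts []DiskMount `json:"mounts"`
}

type DiskMount struct {
	UUID         string `json:"uuid"`
	Label        string `json:"label"`
	MountPoint   string `json:"mount_point"`
	FSType       string `json:"fs_type"`
	SizeBytes    int64  `json:"size_bytes"`
	FstabOptions string `json:"fstab_options"`
	Role         string `json:"role"`
	BindSubdir   string `json:"bind_subdir"`
	RawMount     string `json:"raw_mount"`
}

// parseLayout decodes a JSON disk layout (hypothetical helper).
func parseLayout(data []byte) (DiskLayout, error) {
	var layout DiskLayout
	if err := json.Unmarshal(data, &layout); err != nil {
		return DiskLayout{}, fmt.Errorf("parsing disk layout: %w", err)
	}
	return layout, nil
}

func main() {
	// All values below are invented for illustration.
	sample := []byte(`{"mounts":[{
		"uuid":"00000000-1111-2222-3333-444444444444",
		"label":"hdd_1",
		"mount_point":"/mnt/hdd_1",
		"fs_type":"ext4",
		"size_bytes":1000204886016,
		"fstab_options":"defaults,noatime,nofail",
		"role":"hdd_storage",
		"bind_subdir":"felhom_data",
		"raw_mount":"/mnt/.felhom-raw/hdd_1"}]}`)

	layout, err := parseLayout(sample)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, m := range layout.Mounts {
		fmt.Printf("%s: %s -> %s (bind from %s/%s)\n",
			m.Label, m.UUID, m.MountPoint, m.RawMount, m.BindSubdir)
	}
}
```

An entry with both `raw_mount` and `bind_subdir` set corresponds to the two-layer mount handled by `mountRawAndBind` in `restore_drives_linux.go`; an entry without them takes the direct-mount path.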
@@ -0,0 +1,194 @@
+//go:build linux
+
+package backup
+
+import (
+	"context"
+	"fmt"
+	"log"
+	"os"
+	"os/exec"
+	"path/filepath"
+	"strings"
+)
+
+// RestoreAppFromBackup restores a single app from its cross-drive backup.
+// Steps: restore config → verify/restore data → copy DB dumps → docker compose up.
+func RestoreAppFromBackup(ctx context.Context, app *RestorableApp, stacksDir string, logger *log.Logger) error {
+	stackDir := filepath.Join(stacksDir, app.Name)
+
+	// Step 1: Restore stack config from _config/ backup
+	if app.HasConfig {
+		logger.Printf("[INFO] Restoring config for %s from %s", app.Name, app.ConfigPath)
+		if err := restoreConfigDir(ctx, app.ConfigPath, stackDir); err != nil {
+			return fmt.Errorf("restoring config: %w", err)
+		}
+	} else {
+		// No config backup — check if stack dir already exists (from catalog sync)
+		if !dirExists(stackDir) {
+			return fmt.Errorf("no config backup and no stack directory for %s", app.Name)
+		}
+		logger.Printf("[INFO] No config backup for %s — using existing stack dir", app.Name)
+	}
+
+	// Step 2: Verify app data on HDD (common case: HDD survived, data is intact)
+	if app.NeedsHDD && !app.HasData && app.HasRsyncData {
+		// App data is missing but rsync backup exists — restore it
+		logger.Printf("[INFO] Restoring user data for %s from rsync backup", app.Name)
+		if err := restoreUserData(ctx, app, logger); err != nil {
+			logger.Printf("[WARN] User data restore failed for %s: %v", app.Name, err)
+			// Non-fatal: app might still start without all data
+		}
+	} else if app.HasData {
+		logger.Printf("[INFO] App data for %s found at %s — no restore needed", app.Name, app.DataPath)
+	}
+
+	// Step 3: Copy DB dumps to primary backup location
+	if app.HasDBDump {
+		logger.Printf("[INFO] Restoring DB dumps for %s", app.Name)
+		if err := restoreDBDumps(app, logger); err != nil {
+			logger.Printf("[WARN] DB dump restore failed for %s: %v", app.Name, err)
+			// Non-fatal
+		}
+	}
+
+	// Step 4: Docker compose pull + up
+	composePath := filepath.Join(stackDir, "docker-compose.yml")
+	if !fileExistsCheck(composePath) {
+		composePath = filepath.Join(stackDir, "compose.yml")
+		if !fileExistsCheck(composePath) {
+			return fmt.Errorf("no compose file found in %s", stackDir)
+		}
+	}
+
+	composeDir := filepath.Dir(composePath)
+
+	logger.Printf("[INFO] Pulling images for %s", app.Name)
+	pullCmd := exec.CommandContext(ctx, "docker", "compose", "-f", composePath, "pull")
+	pullCmd.Dir = composeDir
+	if out, err := pullCmd.CombinedOutput(); err != nil {
+		logger.Printf("[WARN] docker compose pull failed for %s: %v (%s)", app.Name, err, strings.TrimSpace(string(out)))
+		// Non-fatal: might work with cached images
+	}
+
+	logger.Printf("[INFO] Starting %s", app.Name)
+	upCmd := exec.CommandContext(ctx, "docker", "compose", "-f", composePath, "up", "-d")
+	upCmd.Dir = composeDir
+	if out, err := upCmd.CombinedOutput(); err != nil {
+		return fmt.Errorf("docker compose up: %v (%s)", err, strings.TrimSpace(string(out)))
+	}
+
+	return nil
+}
+
+// restoreConfigDir rsyncs the backed-up _config/ directory to the stack directory.
+func restoreConfigDir(ctx context.Context, configBackupDir, stackDir string) error {
+	if err := os.MkdirAll(stackDir, 0755); err != nil {
+		return fmt.Errorf("creating stack dir: %w", err)
+	}
+
+	src := strings.TrimRight(configBackupDir, "/") + "/"
+	dst := strings.TrimRight(stackDir, "/") + "/"
+
+	cmd := exec.CommandContext(ctx, "rsync", "-a", src, dst)
+	if out, err := cmd.CombinedOutput(); err != nil {
+		return fmt.Errorf("rsync config: %v (%s)", err, strings.TrimSpace(string(out)))
+	}
+	return nil
+}
+
+// restoreUserData rsyncs user data from cross-drive backup back to the app's HDD path.
+func restoreUserData(ctx context.Context, app *RestorableApp, logger *log.Logger) error {
+	if app.RsyncDataPath == "" || app.HDDPath == "" {
+		return fmt.Errorf("no rsync data path or HDD path")
+	}
+
+	// The rsync backup contains the app's data directories.
+	// Walk the backup dir and rsync each subdirectory (excluding _config/_db)
+	// back to the app's HDD data directory.
+	entries, err := os.ReadDir(app.RsyncDataPath)
+	if err != nil {
+		return err
+	}
+
+	dataDir := AppDataDir(app.HDDPath, app.Name)
+	if err := os.MkdirAll(dataDir, 0755); err != nil {
+		return fmt.Errorf("creating data dir: %w", err)
+	}
+
+	for _, e := range entries {
+		name := e.Name()
+		if name == "_config" || name == "_db" || strings.HasPrefix(name, ".") {
+			continue
+		}
+
+		src := filepath.Join(app.RsyncDataPath, name)
+		dst := filepath.Join(dataDir, name)
+
+		if e.IsDir() {
+			src = strings.TrimRight(src, "/") + "/"
+			if err := os.MkdirAll(dst, 0755); err != nil {
+				logger.Printf("[WARN] Cannot create %s: %v", dst, err)
+				continue
+			}
+			dst = strings.TrimRight(dst, "/") + "/"
+			cmd := exec.CommandContext(ctx, "rsync", "-a", src, dst)
+			if out, err := cmd.CombinedOutput(); err != nil {
+				logger.Printf("[WARN] rsync data %s: %v (%s)", name, err, strings.TrimSpace(string(out)))
+			}
+		} else {
+			// Single file — copy directly
+			data, err := os.ReadFile(src)
+			if err != nil {
+				logger.Printf("[WARN] Cannot read %s: %v", src, err)
+				continue
+			}
+			if err := os.WriteFile(dst, data, 0644); err != nil {
+				logger.Printf("[WARN] Cannot write %s: %v", dst, err)
+			}
+		}
+	}
+
+	return nil
+}
+
+// restoreDBDumps copies DB dump files from cross-drive backup to the primary dump dir.
+func restoreDBDumps(app *RestorableApp, logger *log.Logger) error {
+	if app.DBDumpPath == "" || app.HDDPath == "" {
+		return nil
+	}
+
+	destDir := AppDBDumpPath(app.HDDPath, app.Name)
+	if err := os.MkdirAll(destDir, 0755); err != nil {
+		return fmt.Errorf("creating dump dir: %w", err)
+	}
+
+	entries, err := os.ReadDir(app.DBDumpPath)
+	if err != nil {
+		return err
+	}
+
+	for _, e := range entries {
+		if e.IsDir() {
+			continue
+		}
+		src := filepath.Join(app.DBDumpPath, e.Name())
+		dst := filepath.Join(destDir, e.Name())
+		data, err := os.ReadFile(src)
+		if err != nil {
+			logger.Printf("[WARN] Cannot read dump %s: %v", e.Name(), err)
+			continue
+		}
+		if err := os.WriteFile(dst, data, 0644); err != nil {
+			logger.Printf("[WARN] Cannot write dump %s: %v", e.Name(), err)
+		}
+	}
+
+	return nil
+}
+
+// fileExistsCheck returns true if path exists and is a file.
+func fileExistsCheck(path string) bool {
+	info, err := os.Stat(path)
+	return err == nil && !info.IsDir()
+}
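Review note on the file above: `restoreUserData` and `restoreDBDumps` copy single files with `os.ReadFile` followed by `os.WriteFile`, which holds the whole file in memory. For large database dumps a streaming copy may be preferable. The sketch below uses `io.Copy`; `copyFile` is a hypothetical helper and not part of this commit.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// copyFile streams src to dst without loading the file into memory.
// A failure while closing dst is reported if nothing failed earlier,
// because buffered data may only be flushed at close time.
func copyFile(src, dst string, perm os.FileMode) (err error) {
	in, err := os.Open(src)
	if err != nil {
		return fmt.Errorf("opening source: %w", err)
	}
	defer in.Close()

	out, err := os.OpenFile(dst, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, perm)
	if err != nil {
		return fmt.Errorf("creating destination: %w", err)
	}
	defer func() {
		if cerr := out.Close(); err == nil && cerr != nil {
			err = fmt.Errorf("closing destination: %w", cerr)
		}
	}()

	if _, err = io.Copy(out, in); err != nil {
		return fmt.Errorf("copying data: %w", err)
	}
	return nil
}

func main() {
	dir, err := os.MkdirTemp("", "felhom-copy-")
	if err != nil {
		fmt.Println("temp dir:", err)
		return
	}
	defer os.RemoveAll(dir)

	src := filepath.Join(dir, "dump.sql")
	dst := filepath.Join(dir, "dump-restored.sql")
	if err := os.WriteFile(src, []byte("-- example dump\n"), 0o644); err != nil {
		fmt.Println("write:", err)
		return
	}
	if err := copyFile(src, dst, 0o644); err != nil {
		fmt.Println("copy:", err)
		return
	}
	restored, err := os.ReadFile(dst)
	if err != nil {
		fmt.Println("read back:", err)
		return
	}
	fmt.Printf("copied %d bytes\n", len(restored))
}
```

Memory use stays constant regardless of dump size, at the cost of a few more lines than the read-then-write version in the commit.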
@@ -0,0 +1,13 @@
+//go:build !linux
+
+package backup
+
+import (
+	"context"
+	"log"
+)
+
+// RestoreAppFromBackup is a no-op on non-Linux platforms.
+func RestoreAppFromBackup(ctx context.Context, app *RestorableApp, stacksDir string, logger *log.Logger) error {
+	return nil
+}
@@ -0,0 +1,290 @@
+//go:build linux
+
+package backup
+
+import (
+	"context"
+	"encoding/json"
+	"fmt"
+	"log"
+	"os"
+	"os/exec"
+	"path/filepath"
+	"strings"
+)
+
+// MountDrivesFromLayout scans block devices for disks matching the stored
+// disk layout and mounts them using the felhom two-layer mount pattern
+// (raw mount → bind mount).
+//
+// The controller container runs privileged with:
+//   - /host-dev mounted from host /dev
+//   - /host-fstab mounted from host /etc/fstab
+//   - /mnt with rshared propagation
+//
+// Returns the list of successfully mounted final mount paths.
+func MountDrivesFromLayout(ctx context.Context, layout DiskLayout, logger *log.Logger) ([]string, error) {
+	if len(layout.Mounts) == 0 {
+		return nil, nil
+	}
+
+	// Get current block devices with UUIDs
+	uuidToDevice, err := scanBlockDeviceUUIDs(ctx)
+	if err != nil {
+		return nil, fmt.Errorf("scanning block devices: %w", err)
+	}
+
+	var mounted []string
+
+	for _, dm := range layout.Mounts {
+		if dm.UUID == "" {
+			continue
+		}
+
+		// Find matching device by UUID
+		device := uuidToDevice[dm.UUID]
+		if device == "" {
+			logger.Printf("[WARN] Disk UUID %s (%s) not found — drive may be missing or disconnected",
+				dm.UUID, dm.Label)
+			continue
+		}
+
+		// Check if already mounted
+		finalMount := dm.MountPoint
+		if isMountedPath(finalMount) {
+			logger.Printf("[INFO] %s already mounted at %s", dm.Label, finalMount)
+			mounted = append(mounted, finalMount)
+			continue
+		}
+		if dm.RawMount != "" && isMountedPath(dm.RawMount) {
+			logger.Printf("[INFO] %s raw mount already at %s", dm.Label, dm.RawMount)
+			mounted = append(mounted, finalMount)
+			continue
+		}
+
+		uuidShort := dm.UUID
+		if len(uuidShort) > 12 {
+			uuidShort = uuidShort[:12]
+		}
+		logger.Printf("[INFO] Found disk %s (UUID=%s, label=%s) — mounting to %s",
+			device, uuidShort, dm.Label, finalMount)
+
+		// Mount using the appropriate pattern
+		if dm.RawMount != "" && dm.BindSubdir != "" {
+			// Two-layer HDD mount: raw → bind
+			if err := mountRawAndBind(ctx, device, dm, logger); err != nil {
+				logger.Printf("[ERROR] Failed to mount %s: %v", dm.Label, err)
+				continue
+			}
+		} else {
+			// Simple direct mount (e.g., sys_drive)
+			if err := mountDirect(ctx, device, dm, logger); err != nil {
+				logger.Printf("[ERROR] Failed to mount %s: %v", dm.Label, err)
+				continue
+			}
+		}
+
+		// Update host fstab so mount persists across reboots
+		if err := addDRFstabEntries(dm, logger); err != nil {
+			logger.Printf("[WARN] Failed to update fstab for %s: %v — mount works but won't persist", dm.Label, err)
+		}
+
+		mounted = append(mounted, finalMount)
+		logger.Printf("[INFO] Successfully mounted %s at %s", dm.Label, finalMount)
+	}
+
+	return mounted, nil
+}
+
+// scanBlockDeviceUUIDs runs lsblk + blkid to build a UUID → device path map.
+func scanBlockDeviceUUIDs(ctx context.Context) (map[string]string, error) {
+	// First try lsblk with UUID output
+	out, err := exec.CommandContext(ctx, "lsblk", "-J", "-o", "NAME,UUID,FSTYPE,MOUNTPOINT").Output()
+	if err != nil {
+		return nil, fmt.Errorf("lsblk failed: %w", err)
+	}
+
+	var parsed struct {
+		BlockDevices []struct {
+			Name     string  `json:"name"`
+			UUID     *string `json:"uuid"`
+			FSType   *string `json:"fstype"`
+			Mount    *string `json:"mountpoint"`
+			Children []struct {
+				Name   string  `json:"name"`
+				UUID   *string `json:"uuid"`
+				FSType *string `json:"fstype"`
+				Mount  *string `json:"mountpoint"`
+			} `json:"children"`
+		} `json:"blockdevices"`
+	}
+	if err := json.Unmarshal(out, &parsed); err != nil {
+		return nil, fmt.Errorf("lsblk parse failed: %w", err)
+	}
+
+	devices := make(map[string]string) // UUID → /dev/path
+	for _, dev := range parsed.BlockDevices {
+		if dev.UUID != nil && *dev.UUID != "" {
+			devices[*dev.UUID] = "/dev/" + dev.Name
+		}
+		for _, child := range dev.Children {
+			if child.UUID != nil && *child.UUID != "" {
+				devices[*child.UUID] = "/dev/" + child.Name
+			}
+		}
+	}
+
+	// If lsblk didn't return UUIDs (common inside containers), enrich via blkid
+	if len(devices) == 0 {
+		// Try blkid on /host-dev devices
+		blkOut, err := exec.CommandContext(ctx, "blkid").Output()
+		if err == nil {
+			for _, line := range strings.Split(string(blkOut), "\n") {
+				line = strings.TrimSpace(line)
+				if line == "" {
+					continue
+				}
+				// Parse: /dev/sdb1: UUID="277a2179-..." TYPE="ext4" ...
+				colonIdx := strings.Index(line, ":")
+				if colonIdx < 0 {
+					continue
+				}
+				devPath := line[:colonIdx]
+				if uuidIdx := strings.Index(line, `UUID="`); uuidIdx >= 0 {
+					rest := line[uuidIdx+6:]
+					if endIdx := strings.Index(rest, `"`); endIdx >= 0 {
+						uuid := rest[:endIdx]
+						devices[uuid] = devPath
+					}
+				}
+			}
+		}
+	}
+
+	return devices, nil
+}
+
+// mountDirect creates a simple direct mount.
+func mountDirect(ctx context.Context, device string, dm DiskMount, logger *log.Logger) error {
+	if err := os.MkdirAll(dm.MountPoint, 0755); err != nil {
+		return fmt.Errorf("creating mount point: %w", err)
+	}
+
+	// Use host device path if available
+	devPath := hostDevPath(device)
+	cmd := exec.CommandContext(ctx, "mount", "-t", dm.FSType, "-o", "noatime", devPath, dm.MountPoint)
+	if out, err := cmd.CombinedOutput(); err != nil {
+		return fmt.Errorf("mount %s: %s: %w", devPath, strings.TrimSpace(string(out)), err)
+	}
+	return nil
+}
+
+// mountRawAndBind implements the two-layer felhom mount pattern.
+func mountRawAndBind(ctx context.Context, device string, dm DiskMount, logger *log.Logger) error {
+	// Layer 1: raw mount
+	if err := os.MkdirAll(dm.RawMount, 0755); err != nil {
+		return fmt.Errorf("creating raw mount point: %w", err)
+	}
+
+	devPath := hostDevPath(device)
+	cmd := exec.CommandContext(ctx, "mount", "-t", dm.FSType, "-o", "noatime", devPath, dm.RawMount)
+	if out, err := cmd.CombinedOutput(); err != nil {
+		return fmt.Errorf("raw mount %s → %s: %s: %w", devPath, dm.RawMount, strings.TrimSpace(string(out)), err)
+	}
+
+	// Layer 2: bind mount (subdir → final mount point)
+	bindSrc := filepath.Join(dm.RawMount, dm.BindSubdir)
+	if err := os.MkdirAll(bindSrc, 0755); err != nil {
+		return fmt.Errorf("creating bind source dir: %w", err)
+	}
+	if err := os.MkdirAll(dm.MountPoint, 0755); err != nil {
+		return fmt.Errorf("creating final mount point: %w", err)
+	}
+
+	cmd = exec.CommandContext(ctx, "mount", "--bind", bindSrc, dm.MountPoint)
+	if out, err := cmd.CombinedOutput(); err != nil {
+		return fmt.Errorf("bind mount %s → %s: %s: %w", bindSrc, dm.MountPoint, strings.TrimSpace(string(out)), err)
+	}
+
+	return nil
+}
+
// addDRFstabEntries adds fstab entries so mounts persist across host reboots.
|
||||
func addDRFstabEntries(dm DiskMount, logger *log.Logger) error {
|
||||
const fstabPath = "/host-fstab"
|
||||
|
||||
data, err := os.ReadFile(fstabPath)
|
||||
if err != nil {
|
||||
return fmt.Errorf("reading fstab: %w", err)
|
||||
}
|
||||
|
||||
content := string(data)
|
||||
|
||||
// Skip if UUID already in fstab (idempotent)
|
||||
if strings.Contains(content, dm.UUID) {
|
||||
return nil
|
||||
}
|
||||
|
||||
var additions strings.Builder
|
||||
additions.WriteString("\n# Restored by felhom-controller DR\n")
|
||||
|
||||
if dm.RawMount != "" {
|
||||
// Raw mount entry
|
||||
additions.WriteString(fmt.Sprintf("UUID=%s\t%s\t%s\t%s\t0 2\n",
|
||||
dm.UUID, dm.RawMount, dm.FSType, dm.FstabOptions))
|
||||
}
|
||||
|
||||
if dm.BindSubdir != "" && dm.RawMount != "" {
|
||||
// Bind mount entry
|
||||
additions.WriteString(fmt.Sprintf("%s/%s\t%s\tnone\tbind,nofail\t0 0\n",
|
||||
dm.RawMount, dm.BindSubdir, dm.MountPoint))
|
||||
} else if dm.RawMount == "" {
|
||||
// Direct mount entry (no bind)
|
||||
additions.WriteString(fmt.Sprintf("UUID=%s\t%s\t%s\t%s\t0 2\n",
|
||||
dm.UUID, dm.MountPoint, dm.FSType, dm.FstabOptions))
|
||||
}
|
||||
|
||||
newContent := content + additions.String()
|
||||
|
||||
// Write atomically (try rename, fallback to direct write for bind-mounted fstab)
|
||||
tmpPath := fstabPath + ".tmp"
|
||||
if err := os.WriteFile(tmpPath, []byte(newContent), 0644); err != nil {
|
||||
return fmt.Errorf("writing fstab tmp: %w", err)
|
||||
}
|
||||
if err := os.Rename(tmpPath, fstabPath); err != nil {
|
||||
os.Remove(tmpPath)
|
||||
// Fallback: direct write (bind-mounted files can't be renamed)
|
||||
if err := os.WriteFile(fstabPath, []byte(newContent), 0644); err != nil {
|
||||
return fmt.Errorf("writing fstab: %w", err)
|
||||
}
|
||||
}
|
||||
|
||||
return nil
|
||||
}
|
||||
|
||||
// isMountedPath checks if a path is currently a mount point via /proc/mounts.
|
||||
func isMountedPath(path string) bool {
|
||||
if path == "" {
|
||||
return false
|
||||
}
|
||||
data, err := os.ReadFile("/proc/mounts")
|
||||
if err != nil {
|
||||
return false
|
||||
}
|
||||
cleanPath := filepath.Clean(path)
|
||||
for _, line := range strings.Split(string(data), "\n") {
|
||||
fields := strings.Fields(line)
|
||||
if len(fields) >= 2 && filepath.Clean(fields[1]) == cleanPath {
|
||||
return true
|
||||
}
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
// hostDevPath converts /dev/xxx to /host-dev/xxx for container access.
|
||||
func hostDevPath(devPath string) string {
|
||||
if strings.HasPrefix(devPath, "/dev/") {
|
||||
return "/host-dev/" + strings.TrimPrefix(devPath, "/dev/")
|
||||
}
|
||||
return devPath
|
||||
}
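The blkid fallback above boils down to pulling a device path and a filesystem UUID out of one text line. A minimal standalone sketch of that step follows; `parseBlkidLine` is a hypothetical helper name and the sample values are made up, so treat it as an illustration rather than the controller's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// parseBlkidLine (hypothetical helper) extracts the device path and the
// filesystem UUID from one line of blkid output. ok is false when the line
// has no filesystem UUID attribute.
func parseBlkidLine(line string) (dev, uuid string, ok bool) {
	line = strings.TrimSpace(line)
	colon := strings.Index(line, ":")
	if colon < 0 {
		return "", "", false
	}
	dev = line[:colon]
	attrs := line[colon+1:]

	// The leading space keeps PARTUUID= and PTUUID= from matching.
	const key = ` UUID="`
	start := strings.Index(attrs, key)
	if start < 0 {
		return "", "", false
	}
	rest := attrs[start+len(key):]
	end := strings.Index(rest, `"`)
	if end < 0 {
		return "", "", false
	}
	return dev, rest[:end], true
}

func main() {
	// Made-up sample line in the usual blkid layout.
	sample := `/dev/sdb1: UUID="1111-2222" TYPE="ext4" PARTUUID="abcd-01"`
	dev, uuid, ok := parseBlkidLine(sample)
	fmt.Println(dev, uuid, ok)
}
```

Lines that only carry a partition-table UUID (for example a whole-disk entry) return `ok == false`, so they never end up in the UUID-to-device map.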
|
||||
@@ -0,0 +1,13 @@
|
||||
//go:build !linux
|
||||
|
||||
package backup
|
||||
|
||||
import (
|
||||
"context"
|
||||
"log"
|
||||
)
|
||||
|
||||
// MountDrivesFromLayout is a no-op on non-Linux platforms.
|
||||
func MountDrivesFromLayout(ctx context.Context, layout DiskLayout, logger *log.Logger) ([]string, error) {
|
||||
return nil, nil
|
||||
}
|
||||
@@ -0,0 +1,256 @@
|
||||
package backup
|
||||
|
||||
import (
|
||||
"log"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"sync"
|
||||
"time"
|
||||
)
|
||||
|
||||
// RestorableApp describes an app that can be restored during DR.
|
||||
type RestorableApp struct {
|
||||
Name string `json:"name"`
|
||||
DisplayName string `json:"display_name"`
|
||||
NeedsHDD bool `json:"needs_hdd"`
|
||||
HDDPath string `json:"hdd_path,omitempty"`
|
||||
|
||||
// What was found on disk
|
||||
HasConfig bool `json:"has_config"` // _config/ dir with compose files
|
||||
ConfigPath string `json:"config_path"` // full path to _config/ backup
|
||||
HasData bool `json:"has_data"` // app data dir exists on HDD
|
||||
DataPath string `json:"data_path"` // e.g., /mnt/hdd_1/appdata/immich
|
||||
HasDBDump bool `json:"has_db_dump"` // _db/ dir with dump files
|
||||
DBDumpPath string `json:"db_dump_path"` // full path to _db/ backup
|
||||
HasRsyncData bool `json:"has_rsync_data"` // rsync user data (excl _config/_db)
|
||||
RsyncDataPath string `json:"rsync_data_path"` // full path to rsync backup
|
||||
DrivePath string `json:"drive_path"` // which drive has the backup
|
||||
DriveLabel string `json:"drive_label"` // label for display
|
||||
|
||||
// Restore progress (updated during restore)
|
||||
Status string `json:"status"` // "pending", "restoring", "done", "failed", "skipped"
|
||||
Error string `json:"error,omitempty"`
|
||||
StartedAt string `json:"started_at,omitempty"`
|
||||
CompletedAt string `json:"completed_at,omitempty"`
|
||||
}
|
||||
|
||||
// RestorePlan holds the complete DR restore plan.
|
||||
type RestorePlan struct {
|
||||
mu sync.RWMutex
|
||||
|
||||
CustomerID string `json:"customer_id"`
|
||||
Domain string `json:"domain"`
|
||||
Timestamp string `json:"timestamp"` // when the infra backup was taken
|
||||
Apps []RestorableApp `json:"apps"`
|
||||
|
||||
// Drive summary
|
||||
Drives []DriveInfo `json:"drives"`
|
||||
|
||||
// Overall status
|
||||
Status string `json:"status"` // "pending", "restoring", "done"
|
||||
}
|
||||
|
||||
// DriveInfo summarizes a mounted drive for display.
|
||||
type DriveInfo struct {
|
||||
Path string `json:"path"`
|
||||
Label string `json:"label"`
|
||||
Available bool `json:"available"` // mount is accessible
|
||||
HasBackup bool `json:"has_backup"` // has backups/secondary/ dir
|
||||
}
|
||||
|
||||
// GetApps returns a snapshot of the apps list.
|
||||
func (rp *RestorePlan) GetApps() []RestorableApp {
|
||||
rp.mu.RLock()
|
||||
defer rp.mu.RUnlock()
|
||||
apps := make([]RestorableApp, len(rp.Apps))
|
||||
copy(apps, rp.Apps)
|
||||
return apps
|
||||
}
|
||||
|
||||
// Snapshot returns a thread-safe snapshot of the plan for JSON serialization.
|
||||
func (rp *RestorePlan) Snapshot() map[string]interface{} {
|
||||
rp.mu.RLock()
|
||||
defer rp.mu.RUnlock()
|
||||
return map[string]interface{}{
|
||||
"ok": true,
|
||||
"status": rp.Status,
|
||||
"apps": rp.Apps,
|
||||
"drives": rp.Drives,
|
||||
}
|
||||
}
|
||||
|
||||
// UpdateApp updates a single app's status in the plan.
|
||||
func (rp *RestorePlan) UpdateApp(name, status, errMsg string) {
|
||||
rp.mu.Lock()
|
||||
defer rp.mu.Unlock()
|
||||
for i := range rp.Apps {
|
||||
if rp.Apps[i].Name == name {
|
||||
rp.Apps[i].Status = status
|
||||
rp.Apps[i].Error = errMsg
|
||||
if status == "restoring" {
|
||||
rp.Apps[i].StartedAt = time.Now().UTC().Format(time.RFC3339)
|
||||
}
|
||||
if status == "done" || status == "failed" {
|
||||
rp.Apps[i].CompletedAt = time.Now().UTC().Format(time.RFC3339)
|
||||
}
|
||||
return
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// AllDone returns true if all apps are done/failed/skipped.
|
||||
func (rp *RestorePlan) AllDone() bool {
|
||||
rp.mu.RLock()
|
||||
defer rp.mu.RUnlock()
|
||||
for _, app := range rp.Apps {
|
||||
if app.Status != "done" && app.Status != "failed" && app.Status != "skipped" {
|
||||
return false
|
||||
}
|
||||
}
|
||||
return true
|
||||
}
|
||||
|
||||
// InfraStackInfo is a minimal stack descriptor from the Hub infra backup.
|
||||
// Used to pass deployed_stacks info into the scan without importing report.
|
||||
type InfraStackInfo struct {
|
||||
Name string
|
||||
DisplayName string
|
||||
HDDPath string
|
||||
NeedsHDD bool
|
||||
}
|
||||
|
||||
// ScanDrivesForBackups scans mounted drives for cross-drive backup data
|
||||
// and correlates with the deployed stacks manifest from the Hub.
|
||||
func ScanDrivesForBackups(mountedPaths []string, stacks []InfraStackInfo, logger *log.Logger) *RestorePlan {
|
||||
plan := &RestorePlan{
|
||||
Status: "pending",
|
||||
}
|
||||
|
||||
// Build drive info and find backup directories
|
||||
type driveBackup struct {
|
||||
drivePath string
|
||||
label string
|
||||
secPath string // backups/secondary/ path
|
||||
}
|
||||
var backupDrives []driveBackup
|
||||
|
||||
for _, mp := range mountedPaths {
|
||||
label := filepath.Base(mp)
|
||||
avail := dirExists(mp)
|
||||
|
||||
di := DriveInfo{
|
||||
Path: mp,
|
||||
Label: label,
|
||||
Available: avail,
|
||||
}
|
||||
|
||||
secPath := SecondaryBackupPath(mp)
|
||||
if dirExists(secPath) {
|
||||
di.HasBackup = true
|
||||
backupDrives = append(backupDrives, driveBackup{
|
||||
drivePath: mp,
|
||||
label: label,
|
||||
secPath: secPath,
|
||||
})
|
||||
logger.Printf("[INFO] Found backup data on %s (%s)", mp, secPath)
|
||||
}
|
||||
|
||||
plan.Drives = append(plan.Drives, di)
|
||||
}
|
||||
|
||||
// For each stack from the manifest, look for backup data on drives
|
||||
for _, stack := range stacks {
|
||||
app := RestorableApp{
|
||||
Name: stack.Name,
|
||||
DisplayName: stack.DisplayName,
|
||||
NeedsHDD: stack.NeedsHDD,
|
||||
HDDPath: stack.HDDPath,
|
||||
Status: "pending",
|
||||
}
|
||||
|
||||
// Check if app data exists directly on HDD (common case: HDD survived)
|
||||
if stack.HDDPath != "" {
|
||||
dataDir := AppDataDir(stack.HDDPath, stack.Name)
|
||||
if dirExists(dataDir) {
|
||||
app.HasData = true
|
||||
app.DataPath = dataDir
|
||||
}
|
||||
}
|
||||
|
||||
// Scan each drive for cross-drive backup of this app
|
||||
for _, db := range backupDrives {
|
||||
rsyncBase := AppSecondaryRsyncPath(db.drivePath, stack.Name)
|
||||
if !dirExists(rsyncBase) {
|
||||
continue
|
||||
}
|
||||
|
||||
// Found a backup for this app
|
||||
app.DrivePath = db.drivePath
|
||||
app.DriveLabel = db.label
|
||||
|
||||
// Check for _config/ (stack compose directory backup)
|
||||
configDir := filepath.Join(rsyncBase, "_config")
|
||||
if dirExists(configDir) {
|
||||
app.HasConfig = true
|
||||
app.ConfigPath = configDir
|
||||
}
|
||||
|
||||
// Check for _db/ (database dump backup)
|
||||
dbDir := filepath.Join(rsyncBase, "_db")
|
||||
if dirExists(dbDir) && !dirIsEmpty(dbDir) {
|
||||
app.HasDBDump = true
|
||||
app.DBDumpPath = dbDir
|
||||
}
|
||||
|
||||
// Check for user data in rsync (anything besides _config and _db)
|
||||
if hasUserData(rsyncBase) {
|
||||
app.HasRsyncData = true
|
||||
app.RsyncDataPath = rsyncBase
|
||||
}
|
||||
|
||||
break // use first drive with backup for this app
|
||||
}
|
||||
|
||||
plan.Apps = append(plan.Apps, app)
|
||||
}
|
||||
|
||||
if len(plan.Apps) == 0 {
|
||||
plan.Apps = []RestorableApp{}
|
||||
}
|
||||
|
||||
logger.Printf("[INFO] Restore plan: %d apps, %d drives (%d with backups)",
|
||||
len(plan.Apps), len(plan.Drives), len(backupDrives))
|
||||
|
||||
return plan
|
||||
}
|
||||
|
||||
// dirExists checks if a directory exists and is accessible.
|
||||
func dirExists(path string) bool {
|
||||
info, err := os.Stat(path)
|
||||
return err == nil && info.IsDir()
|
||||
}
|
||||
|
||||
// dirIsEmpty returns true if a directory has no entries.
|
||||
func dirIsEmpty(path string) bool {
|
||||
entries, err := os.ReadDir(path)
|
||||
return err != nil || len(entries) == 0
|
||||
}
|
||||
|
||||
// hasUserData checks if the rsync backup dir has user data (not just _config/_db).
|
||||
func hasUserData(rsyncBase string) bool {
|
||||
entries, err := os.ReadDir(rsyncBase)
|
||||
if err != nil {
|
||||
return false
|
||||
}
|
||||
for _, e := range entries {
|
||||
name := e.Name()
|
||||
if name != "_config" && name != "_db" && !hasPrefix(name, ".") {
|
||||
return true
|
||||
}
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
func hasPrefix(s, prefix string) bool {
|
||||
return len(s) >= len(prefix) && s[:len(prefix)] == prefix
|
||||
}
|
||||
@@ -0,0 +1,92 @@
|
||||
package report
|
||||
|
||||
import (
|
||||
"encoding/base64"
|
||||
"os"
|
||||
"time"
|
||||
|
||||
"gitea.dooplex.hu/admin/felhom-controller/internal/backup"
|
||||
"gitea.dooplex.hu/admin/felhom-controller/internal/settings"
|
||||
)
|
||||
|
||||
// InfraBackup is the payload pushed to the Hub for disaster recovery.
|
||||
type InfraBackup struct {
|
||||
CustomerID string `json:"customer_id"`
|
||||
Domain string `json:"domain"`
|
||||
ControllerVersion string `json:"controller_version"`
|
||||
Timestamp string `json:"timestamp"`
|
||||
|
||||
ControllerConfigB64 string `json:"controller_config_b64"`
|
||||
SettingsJSONB64 string `json:"settings_json_b64,omitempty"`
|
||||
|
||||
DiskLayout backup.DiskLayout `json:"disk_layout"`
|
||||
DeployedStacks []InfraStack `json:"deployed_stacks"`
|
||||
|
||||
ResticPassword string `json:"restic_password,omitempty"`
|
||||
CrossDrivePassword string `json:"cross_drive_password,omitempty"`
|
||||
}
|
||||
|
||||
// InfraStack identifies a deployed app for disaster recovery.
|
||||
type InfraStack struct {
|
||||
Name string `json:"name"`
|
||||
DisplayName string `json:"display_name"`
|
||||
HDDPath string `json:"hdd_path,omitempty"`
|
||||
NeedsHDD bool `json:"needs_hdd"`
|
||||
}
|
||||
|
||||
// BuildInfraBackup collects all infrastructure state for Hub backup.
|
||||
func BuildInfraBackup(
|
||||
customerID, domain, version string,
|
||||
controllerYAMLPath string,
|
||||
settingsPath string,
|
||||
resticPasswordFile string,
|
||||
systemDataPath string,
|
||||
sett *settings.Settings,
|
||||
stackProvider backup.StackDataProvider,
|
||||
) (*InfraBackup, error) {
|
||||
ib := &InfraBackup{
|
||||
CustomerID: customerID,
|
||||
Domain: domain,
|
||||
ControllerVersion: version,
|
||||
Timestamp: time.Now().UTC().Format(time.RFC3339),
|
||||
}
|
||||
|
||||
// Read and encode controller.yaml
|
||||
if data, err := os.ReadFile(controllerYAMLPath); err == nil {
|
||||
ib.ControllerConfigB64 = base64.StdEncoding.EncodeToString(data)
|
||||
}
|
||||
|
||||
// Read and encode settings.json
|
||||
if data, err := os.ReadFile(settingsPath); err == nil {
|
||||
ib.SettingsJSONB64 = base64.StdEncoding.EncodeToString(data)
|
||||
}
|
||||
|
||||
	// Read primary restic password. The file content is base64-encoded exactly
	// as read (including any trailing newline), so a consumer has to decode it.
	if data, err := os.ReadFile(resticPasswordFile); err == nil {
		ib.ResticPassword = base64.StdEncoding.EncodeToString(data)
	}

	// Read cross-drive restic password. Unlike ResticPassword, this value is
	// stored as-is, without base64 encoding.
	if pw := sett.GetCrossDriveResticPassword(); pw != "" {
		ib.CrossDrivePassword = pw
	}
|
||||
|
||||
// Collect disk layout from fstab + blkid
|
||||
ib.DiskLayout = collectDiskLayout(systemDataPath)
|
||||
|
||||
// Collect deployed stacks
|
||||
deployed := stackProvider.ListDeployedStacks()
|
||||
for _, s := range deployed {
|
||||
ib.DeployedStacks = append(ib.DeployedStacks, InfraStack{
|
||||
Name: s.Name,
|
||||
DisplayName: s.DisplayName,
|
||||
HDDPath: stackProvider.GetStackHDDPath(s.Name),
|
||||
NeedsHDD: s.NeedsHDD,
|
||||
})
|
||||
}
|
||||
if ib.DeployedStacks == nil {
|
||||
ib.DeployedStacks = []InfraStack{}
|
||||
}
|
||||
|
||||
return ib, nil
|
||||
}
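`BuildInfraBackup` ships file contents as base64 strings inside a JSON payload. The round trip can be sketched as follows; `payload`, `encodeConfig` and `decodeConfig` are hypothetical names, and only the `controller_config_b64` field name is taken from the struct above.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// payload (hypothetical) mirrors a single field of the InfraBackup struct.
type payload struct {
	ControllerConfigB64 string `json:"controller_config_b64"`
}

// encodeConfig wraps raw file bytes into a JSON document.
func encodeConfig(raw []byte) ([]byte, error) {
	return json.Marshal(payload{
		ControllerConfigB64: base64.StdEncoding.EncodeToString(raw),
	})
}

// decodeConfig reverses encodeConfig and returns the original file bytes.
func decodeConfig(body []byte) ([]byte, error) {
	var p payload
	if err := json.Unmarshal(body, &p); err != nil {
		return nil, fmt.Errorf("parsing payload: %w", err)
	}
	return base64.StdEncoding.DecodeString(p.ControllerConfigB64)
}

func main() {
	raw := []byte("customer:\n  name: \"Example\"\n") // made-up YAML
	body, err := encodeConfig(raw)
	if err != nil {
		panic(err)
	}
	back, err := decodeConfig(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(back) == string(raw))
}
```

Base64 keeps newlines, quotes and any non-UTF-8 bytes intact inside the JSON string, which is presumably why the config files are encoded this way.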
|
||||
@@ -0,0 +1,135 @@
|
||||
//go:build linux
|
||||
|
||||
package report
|
||||
|
||||
import (
|
||||
"os"
|
||||
"os/exec"
|
||||
"path/filepath"
|
||||
"strconv"
|
||||
"strings"
|
||||
|
||||
"gitea.dooplex.hu/admin/felhom-controller/internal/backup"
|
||||
)
|
||||
|
||||
// collectDiskLayout reads /host-fstab and correlates with blkid/lsblk to build
|
||||
// the disk mount topology. Only includes data partitions (not root, boot, or swap).
|
||||
func collectDiskLayout(systemDataPath string) backup.DiskLayout {
|
||||
layout := backup.DiskLayout{}
|
||||
|
||||
fstabPath := "/host-fstab"
|
||||
if _, err := os.Stat(fstabPath); err != nil {
|
||||
fstabPath = "/etc/fstab"
|
||||
}
|
||||
|
||||
data, err := os.ReadFile(fstabPath)
|
||||
if err != nil {
|
||||
return layout
|
||||
}
|
||||
|
||||
// Parse fstab into UUID-based entries and bind mount entries
|
||||
type fstabEntry struct {
|
||||
source string
|
||||
mountPoint string
|
||||
fsType string
|
||||
options string
|
||||
}
|
||||
|
||||
var uuidEntries []fstabEntry
|
||||
var bindEntries []fstabEntry
|
||||
|
||||
systemMounts := map[string]bool{"/": true, "/boot": true, "/boot/efi": true}
|
||||
|
||||
for _, line := range strings.Split(string(data), "\n") {
|
||||
line = strings.TrimSpace(line)
|
||||
if line == "" || strings.HasPrefix(line, "#") {
|
||||
continue
|
||||
}
|
||||
fields := strings.Fields(line)
|
||||
if len(fields) < 4 {
|
||||
continue
|
||||
}
|
||||
source := fields[0]
|
||||
mountPoint := fields[1]
|
||||
fsType := fields[2]
|
||||
options := fields[3]
|
||||
|
||||
// Skip system mounts and swap
|
||||
if systemMounts[mountPoint] || fsType == "swap" {
|
||||
continue
|
||||
}
|
||||
|
||||
if strings.HasPrefix(source, "UUID=") {
|
||||
uuidEntries = append(uuidEntries, fstabEntry{
|
||||
source: strings.TrimPrefix(source, "UUID="),
|
||||
mountPoint: mountPoint,
|
||||
fsType: fsType,
|
||||
options: options,
|
||||
})
|
||||
} else if fsType == "none" && strings.Contains(options, "bind") {
|
||||
bindEntries = append(bindEntries, fstabEntry{
|
||||
source: source,
|
||||
mountPoint: mountPoint,
|
||||
options: options,
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
// Process UUID-based entries
|
||||
for _, e := range uuidEntries {
|
||||
dm := backup.DiskMount{
|
||||
UUID: e.source,
|
||||
MountPoint: e.mountPoint,
|
||||
FSType: e.fsType,
|
||||
FstabOptions: e.options,
|
||||
}
|
||||
|
||||
// Get label via blkid
|
||||
if out, err := exec.Command("blkid", "-o", "value", "-s", "LABEL", "-U", e.source).Output(); err == nil {
|
||||
dm.Label = strings.TrimSpace(string(out))
|
||||
}
|
||||
|
||||
		// Get size via lsblk (resolve UUID to device first)
		if devPath, err := exec.Command("blkid", "-U", e.source).Output(); err == nil {
			dev := strings.TrimSpace(string(devPath))
			if dev != "" {
				// -d limits the output to the device itself. Without it lsblk also
				// lists child devices, and the multi-line output fails to parse.
				if out, err := exec.Command("lsblk", "-b", "-n", "-d", "-o", "SIZE", dev).Output(); err == nil {
					if sz, err := strconv.ParseInt(strings.TrimSpace(string(out)), 10, 64); err == nil {
						dm.SizeBytes = sz
					}
				}
			}
		}
|
||||
|
||||
// Determine role
|
||||
if e.mountPoint == systemDataPath {
|
||||
dm.Role = "system_data"
|
||||
} else {
|
||||
dm.Role = "hdd_storage"
|
||||
}
|
||||
|
||||
// Check for a corresponding bind mount
|
||||
for _, bind := range bindEntries {
|
||||
if strings.HasPrefix(bind.source, e.mountPoint+"/") {
|
||||
subdir := strings.TrimPrefix(bind.source, e.mountPoint+"/")
|
||||
dm.BindSubdir = subdir
|
||||
dm.RawMount = e.mountPoint
|
||||
dm.MountPoint = bind.mountPoint // the final user-facing mount point
|
||||
break
|
||||
}
|
||||
}
|
||||
|
||||
// Get label from mount point basename as fallback
|
||||
if dm.Label == "" {
|
||||
if dm.RawMount != "" {
|
||||
dm.Label = filepath.Base(dm.RawMount)
|
||||
} else {
|
||||
dm.Label = filepath.Base(dm.MountPoint)
|
||||
}
|
||||
}
|
||||
|
||||
layout.Mounts = append(layout.Mounts, dm)
|
||||
}
|
||||
|
||||
return layout
|
||||
}
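The core of `collectDiskLayout` is correlating `UUID=` lines in fstab with bind-mount lines whose source lives under the UUID entry's mount point. A reduced sketch of just that correlation, leaving out the blkid/lsblk calls and the system-mount filter; the `mount` type, `parseFstab` and the sample paths are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// mount (hypothetical) is a trimmed-down stand-in for backup.DiskMount.
type mount struct {
	UUID       string
	RawMount   string // set only when a bind mount sits on top
	BindSubdir string
	MountPoint string // final, user-facing mount point
}

// parseFstab pairs UUID entries with bind mounts sourced from beneath them.
func parseFstab(content string) []mount {
	type entry struct{ source, target string }
	var uuids, binds []entry

	for _, line := range strings.Split(content, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		f := strings.Fields(line)
		if len(f) < 4 {
			continue
		}
		switch {
		case strings.HasPrefix(f[0], "UUID="):
			uuids = append(uuids, entry{strings.TrimPrefix(f[0], "UUID="), f[1]})
		case f[2] == "none" && strings.Contains(f[3], "bind"):
			binds = append(binds, entry{f[0], f[1]})
		}
	}

	var out []mount
	for _, u := range uuids {
		m := mount{UUID: u.source, MountPoint: u.target}
		for _, b := range binds {
			if strings.HasPrefix(b.source, u.target+"/") {
				m.RawMount = u.target
				m.BindSubdir = strings.TrimPrefix(b.source, u.target+"/")
				m.MountPoint = b.target
				break
			}
		}
		out = append(out, m)
	}
	return out
}

func main() {
	// Made-up fstab content showing one two-layer mount and one direct mount.
	fstab := `
# data drives
UUID=aaaa /mnt/raw_hdd_1 ext4 noatime,nofail 0 2
/mnt/raw_hdd_1/data /mnt/hdd_1 none bind,nofail 0 0
UUID=bbbb /mnt/hdd_2 ext4 noatime 0 2
`
	for _, m := range parseFstab(fstab) {
		fmt.Printf("%+v\n", m)
	}
}
```

The first entry comes out with `RawMount` and `BindSubdir` set (the two-layer pattern handled by `mountRawAndBind`), while the second stays a plain direct mount.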
|
||||
@@ -0,0 +1,11 @@
|
||||
//go:build !linux
|
||||
|
||||
package report
|
||||
|
||||
import "gitea.dooplex.hu/admin/felhom-controller/internal/backup"
|
||||
|
||||
// collectDiskLayout is a no-op on non-Linux platforms.
|
||||
// The controller only runs on Linux; this stub allows cross-compilation.
|
||||
func collectDiskLayout(systemDataPath string) backup.DiskLayout {
|
||||
return backup.DiskLayout{}
|
||||
}
|
||||
@@ -0,0 +1,51 @@
|
||||
package report
|
||||
|
||||
import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"strings"
	"time"
)

// PullInfraBackup fetches the infrastructure backup from the Hub.
// Returns nil, nil if no backup exists for this customer.
func PullInfraBackup(hubURL, apiKey, customerID string) (*InfraBackup, error) {
	// Escape the customer ID so it cannot alter the request path.
	endpoint := strings.TrimRight(hubURL, "/") + "/api/v1/infra-backup/" + url.PathEscape(customerID)

	client := &http.Client{Timeout: 30 * time.Second}

	req, err := http.NewRequest(http.MethodGet, endpoint, nil)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
if apiKey != "" {
|
||||
req.Header.Set("Authorization", "Bearer "+apiKey)
|
||||
}
|
||||
|
||||
resp, err := client.Do(req)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("hub request failed: %w", err)
|
||||
}
|
||||
defer resp.Body.Close()
|
||||
|
||||
if resp.StatusCode == http.StatusNotFound {
|
||||
return nil, nil // no backup for this customer
|
||||
}
|
||||
if resp.StatusCode != http.StatusOK {
|
||||
return nil, fmt.Errorf("hub returned HTTP %d", resp.StatusCode)
|
||||
}
|
||||
|
||||
body, err := io.ReadAll(io.LimitReader(resp.Body, 5<<20)) // 5MB limit
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("reading response: %w", err)
|
||||
}
|
||||
|
||||
var ib InfraBackup
|
||||
if err := json.Unmarshal(body, &ib); err != nil {
|
||||
return nil, fmt.Errorf("parsing infra backup: %w", err)
|
||||
}
|
||||
|
||||
return &ib, nil
|
||||
}
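`PullInfraBackup` treats HTTP 404 as "no backup yet" and returns `nil, nil`, which callers must distinguish from a real error. The pattern can be exercised against a local test server, roughly like this; `infraBackup`, `pull` and `fakeHub` are hypothetical names, the customer IDs are made up, and the sketch escapes the ID with `url.PathEscape`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strings"
	"time"
)

// infraBackup (hypothetical) keeps a single field of the real payload.
type infraBackup struct {
	CustomerID string `json:"customer_id"`
}

// pull returns (nil, nil) when the hub answers 404.
func pull(hubURL, apiKey, customerID string) (*infraBackup, error) {
	endpoint := strings.TrimRight(hubURL, "/") + "/api/v1/infra-backup/" + url.PathEscape(customerID)
	req, err := http.NewRequest(http.MethodGet, endpoint, nil)
	if err != nil {
		return nil, err
	}
	if apiKey != "" {
		req.Header.Set("Authorization", "Bearer "+apiKey)
	}
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Do(req)
	if err != nil {
		return nil, fmt.Errorf("hub request failed: %w", err)
	}
	defer resp.Body.Close()

	if resp.StatusCode == http.StatusNotFound {
		return nil, nil // no backup for this customer
	}
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("hub returned HTTP %d", resp.StatusCode)
	}
	body, err := io.ReadAll(io.LimitReader(resp.Body, 5<<20))
	if err != nil {
		return nil, err
	}
	var ib infraBackup
	if err := json.Unmarshal(body, &ib); err != nil {
		return nil, err
	}
	return &ib, nil
}

// fakeHub serves a backup for exactly one made-up customer.
func fakeHub() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/api/v1/infra-backup/cust-1" {
			http.NotFound(w, r)
			return
		}
		json.NewEncoder(w).Encode(infraBackup{CustomerID: "cust-1"})
	}))
}

func main() {
	srv := fakeHub()
	defer srv.Close()

	known, err := pull(srv.URL, "", "cust-1")
	fmt.Println(known != nil, err)

	unknown, err := pull(srv.URL, "", "nobody")
	fmt.Println(unknown == nil, err)
}
```

A caller on a fresh deployment would check for `nil, nil` first and continue with normal startup, reserving the error path for an unreachable or misbehaving Hub.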
|
||||
@@ -82,6 +82,49 @@ func (p *Pusher) Push(report *Report) error {
|
||||
return fmt.Errorf("hub push failed after 3 attempts: %w", lastErr)
|
||||
}
|
||||
|
||||
// PushInfraBackup sends the infrastructure backup payload to the Hub.
|
||||
// Uses the same retry logic as Push.
|
||||
func (p *Pusher) PushInfraBackup(data []byte) error {
|
||||
if !p.enabled {
|
||||
return nil
|
||||
}
|
||||
|
||||
url := p.hubURL + "/api/v1/infra-backup"
|
||||
|
||||
var lastErr error
|
||||
for attempt := 0; attempt < 3; attempt++ {
|
||||
if attempt > 0 {
|
||||
time.Sleep(5 * time.Second)
|
||||
}
|
||||
|
||||
req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(data))
|
||||
if err != nil {
|
||||
lastErr = err
|
||||
continue
|
||||
}
|
||||
req.Header.Set("Content-Type", "application/json")
|
||||
if p.apiKey != "" {
|
||||
req.Header.Set("Authorization", "Bearer "+p.apiKey)
|
||||
}
|
||||
|
||||
resp, err := p.httpClient.Do(req)
|
||||
if err != nil {
|
||||
lastErr = err
|
||||
continue
|
||||
}
|
||||
io.Copy(io.Discard, resp.Body)
|
||||
resp.Body.Close()
|
||||
|
||||
if resp.StatusCode >= 200 && resp.StatusCode < 300 {
|
||||
p.logger.Printf("[INFO] Infra backup pushed to Hub (%d bytes)", len(data))
|
||||
return nil
|
||||
}
|
||||
lastErr = fmt.Errorf("HTTP %d", resp.StatusCode)
|
||||
}
|
||||
|
||||
return fmt.Errorf("infra backup push failed after 3 attempts: %w", lastErr)
|
||||
}
|
||||
|
||||
// PushOnce sends a single report regardless of the enabled flag.
|
||||
// Used for one-time notifications (e.g., reporting-disabled on startup).
|
||||
func (p *Pusher) PushOnce(report *Report) error {
|
||||
|
||||
@@ -292,6 +292,22 @@ func (s *Settings) GetAllCrossDriveConfigs() map[string]*CrossDriveBackup {
|
||||
return result
|
||||
}
|
||||
|
||||
// GetCrossDriveResticPassword returns the cross-drive restic password (read-only).
|
||||
// Returns empty string if not yet generated.
|
||||
func (s *Settings) GetCrossDriveResticPassword() string {
|
||||
s.mu.RLock()
|
||||
defer s.mu.RUnlock()
|
||||
return s.CrossDriveResticPassword
|
||||
}
|
||||
|
||||
// SetCrossDriveResticPassword sets the cross-drive restic password (e.g., during DR restore).
|
||||
func (s *Settings) SetCrossDriveResticPassword(password string) error {
|
||||
s.mu.Lock()
|
||||
defer s.mu.Unlock()
|
||||
s.CrossDriveResticPassword = password
|
||||
return s.save()
|
||||
}
|
||||
|
||||
// GetOrCreateCrossDrivePassword returns the cross-drive restic password,
|
||||
// generating and persisting one if it doesn't exist yet.
|
||||
func (s *Settings) GetOrCreateCrossDrivePassword() (string, error) {
|
||||
|
||||
@@ -305,6 +305,23 @@ func (s *Server) templateFuncMap() template.FuncMap {
|
||||
}
|
||||
return id
|
||||
},
|
||||
// statusText maps DR restore status codes to Hungarian labels.
|
||||
"statusText": func(status string) string {
|
||||
switch status {
|
||||
case "pending":
|
||||
return "Várakozik"
|
||||
case "restoring":
|
||||
return "Visszaállítás..."
|
||||
case "done":
|
||||
return "Kész"
|
||||
case "failed":
|
||||
return "Sikertelen"
|
||||
case "skipped":
|
||||
return "Kihagyva"
|
||||
default:
|
||||
return status
|
||||
}
|
||||
},
|
||||
// pageMatch returns true if currentPage is in the pages slice.
|
||||
// Used to filter page-specific alerts in layout.html.
|
||||
"pageMatch": func(pages []string, currentPage string) bool {
|
||||
|
||||
@@ -0,0 +1,132 @@
|
||||
package web
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"net/http"
|
||||
"time"
|
||||
|
||||
"gitea.dooplex.hu/admin/felhom-controller/internal/backup"
|
||||
)
|
||||
|
||||
// restorePageHandler renders the full-page DR restore UI.
|
||||
func (s *Server) restorePageHandler(w http.ResponseWriter, r *http.Request) {
|
||||
if s.restorePlan == nil {
|
||||
http.Redirect(w, r, "/", http.StatusFound)
|
||||
return
|
||||
}
|
||||
|
||||
data := map[string]interface{}{
|
||||
"Title": "Katasztrófa utáni visszaállítás",
|
||||
"CustomerName": s.cfg.Customer.Name,
|
||||
"Domain": s.cfg.Customer.Domain,
|
||||
"Version": s.version,
|
||||
"CustomerID": s.restorePlan.CustomerID,
|
||||
"Timestamp": s.restorePlan.Timestamp,
|
||||
"Apps": s.restorePlan.GetApps(),
|
||||
"Drives": s.restorePlan.Drives,
|
||||
"PlanStatus": s.restorePlan.Status,
|
||||
}
|
||||
|
||||
s.render(w, "restore", data)
|
||||
}
|
||||
|
||||
// apiRestoreStatus returns the current restore plan status as JSON.
|
||||
func (s *Server) apiRestoreStatus(w http.ResponseWriter, r *http.Request) {
|
||||
if s.restorePlan == nil {
|
||||
jsonError(w, "not in restore mode", http.StatusBadRequest)
|
||||
return
|
||||
}
|
||||
|
||||
w.Header().Set("Content-Type", "application/json; charset=utf-8")
|
||||
json.NewEncoder(w).Encode(s.restorePlan.Snapshot())
|
||||
}
|
||||
|
||||
// apiRestoreAll starts restoring all pending apps sequentially.
|
||||
func (s *Server) apiRestoreAll(w http.ResponseWriter, r *http.Request) {
|
||||
if s.restorePlan == nil {
|
||||
jsonError(w, "not in restore mode", http.StatusBadRequest)
|
||||
return
|
||||
}
|
||||
if s.restorePlan.Status == "restoring" {
|
||||
jsonError(w, "restore already in progress", http.StatusConflict)
|
||||
return
|
||||
}
|
||||
|
||||
s.restorePlan.Status = "restoring"
|
||||
go s.executeAllRestores()
|
||||
|
||||
jsonResponse(w, map[string]interface{}{
|
||||
"ok": true,
|
||||
"message": "Visszaállítás elindítva",
|
||||
})
|
||||
}
|
||||
|
||||
// apiRestoreSkip exits restore mode without restoring.
|
||||
func (s *Server) apiRestoreSkip(w http.ResponseWriter, r *http.Request) {
|
||||
if s.restorePlan == nil {
|
||||
jsonError(w, "not in restore mode", http.StatusBadRequest)
|
||||
return
|
||||
}
|
||||
|
||||
s.logger.Println("[INFO] User skipped DR restore — entering normal mode")
|
||||
s.clearRestoreMode()
|
||||
|
||||
jsonResponse(w, map[string]interface{}{
|
||||
"ok": true,
|
||||
"message": "Visszaállítás kihagyva",
|
||||
})
|
||||
}
|
||||
|
||||
// executeAllRestores runs the restore for each pending app sequentially.
func (s *Server) executeAllRestores() {
	// Capture the plan once: apiRestoreSkip may clear s.restorePlan while
	// this goroutine is still running.
	s.restoreMu.RLock()
	plan := s.restorePlan
	s.restoreMu.RUnlock()
	if plan == nil {
		return
	}

	s.logger.Println("[INFO] Starting DR restore for all apps")

	for i := range plan.Apps {
		app := &plan.Apps[i]
		if app.Status != "pending" {
			continue
		}

		plan.UpdateApp(app.Name, "restoring", "")
		s.logger.Printf("[INFO] Restoring app %s (%s)", app.Name, app.DisplayName)

		ctx, cancel := context.WithTimeout(context.Background(), 10*time.Minute)
		err := backup.RestoreAppFromBackup(ctx, app, s.cfg.Paths.StacksDir, s.logger)
		cancel()

		if err != nil {
			plan.UpdateApp(app.Name, "failed", err.Error())
			s.logger.Printf("[ERROR] Restore failed for %s: %v", app.Name, err)
		} else {
			plan.UpdateApp(app.Name, "done", "")
			s.logger.Printf("[INFO] Restore completed for %s", app.Name)
		}
	}

	plan.Status = "done"
	s.logger.Println("[INFO] All app restores completed")

	// Re-scan stacks so dashboard picks up restored apps
	if s.stackMgr != nil {
		if err := s.stackMgr.ScanStacks(); err != nil {
			s.logger.Printf("[WARN] Post-restore stack scan failed: %v", err)
		}
	}

	// The plan stays visible so the user can see the final status; restore
	// mode ends when the user clicks "continue to dashboard".
}
|
||||
|
||||
// clearRestoreMode exits restore mode and returns to normal operation.
|
||||
func (s *Server) clearRestoreMode() {
|
||||
s.restoreMu.Lock()
|
||||
defer s.restoreMu.Unlock()
|
||||
s.restorePlan = nil
|
||||
}
|
||||
@@ -43,6 +43,10 @@ type Server struct {
|
||||
|
||||
// Active raw mount for the attach wizard (empty when not in use)
|
||||
activeRawMount string
|
||||
|
||||
// DR restore mode state
|
||||
restoreMu sync.RWMutex
|
||||
restorePlan *backup.RestorePlan
|
||||
}
|
||||
|
||||
func NewServer(cfg *config.Config, stackMgr *stacks.Manager, cpuCollector *system.CPUCollector, backupMgr *backup.Manager, crossDrive *backup.CrossDriveRunner, sched *scheduler.Scheduler, sett *settings.Settings, alertMgr *AlertManager, notif *notify.Notifier, logger *log.Logger, version string) *Server {
|
||||
@@ -85,10 +89,48 @@ func (s *Server) loadTemplates() {
|
||||
)
|
||||
}
|
||||
|
||||
// SetRestoreState puts the server into DR restore mode with the given plan.
|
||||
func (s *Server) SetRestoreState(plan *backup.RestorePlan) {
|
||||
s.restoreMu.Lock()
|
||||
defer s.restoreMu.Unlock()
|
||||
s.restorePlan = plan
|
||||
}
|
||||
|
||||
// InRestoreMode returns true if the server is in DR restore mode.
|
||||
func (s *Server) InRestoreMode() bool {
|
||||
s.restoreMu.RLock()
|
||||
defer s.restoreMu.RUnlock()
|
||||
return s.restorePlan != nil
|
||||
}
|
||||
|
||||
// ServeHTTP handles all non-API web requests.
|
||||
func (s *Server) ServeHTTP(w http.ResponseWriter, r *http.Request) {
|
||||
path := r.URL.Path
|
||||
|
||||
// DR restore mode: intercept all routes except restore page, static, and restore API
|
||||
if s.InRestoreMode() {
|
||||
switch {
|
||||
case path == "/restore":
|
||||
s.restorePageHandler(w, r)
|
||||
return
|
||||
case path == "/api/restore/status":
|
||||
s.apiRestoreStatus(w, r)
|
||||
return
|
||||
case path == "/api/restore/all" && r.Method == http.MethodPost:
|
||||
s.apiRestoreAll(w, r)
|
||||
return
|
||||
case path == "/api/restore/skip" && r.Method == http.MethodPost:
|
||||
s.apiRestoreSkip(w, r)
|
||||
return
|
||||
case strings.HasPrefix(path, "/static/"):
|
||||
// Allow static assets through
|
||||
default:
|
||||
// Redirect everything else to the restore page
|
||||
http.Redirect(w, r, "/restore", http.StatusFound)
|
||||
return
|
||||
}
|
||||
}
|
||||
|
||||
switch {
|
||||
case path == "/" || path == "/dashboard":
|
||||
s.dashboardHandler(w, r)
|
||||
|
||||
@@ -0,0 +1,323 @@
{{define "restore"}}
<!DOCTYPE html>
<html lang="hu">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Katasztrófa utáni visszaállítás — Felhom</title>
    <link rel="stylesheet" href="/static/style.css">
    <style>
        body { background: var(--bg-darker, #0d1117); margin: 0; padding: 0; }
        .dr-container { max-width: 900px; margin: 0 auto; padding: 2rem 1.5rem; }
        .dr-header { text-align: center; margin-bottom: 2rem; }
        .dr-header img { width: 48px; height: 48px; margin-bottom: 0.5rem; }
        .dr-header h1 { color: var(--warning, #f0ad4e); font-size: 1.5rem; margin: 0.5rem 0; }
        .dr-header p { color: var(--text-secondary, #8b949e); margin: 0.25rem 0; }
        .dr-card { background: var(--card-bg, #161b22); border: 1px solid var(--border, #30363d); border-radius: 8px; padding: 1.25rem; margin-bottom: 1rem; }
        .dr-card h3 { margin: 0 0 0.75rem 0; color: var(--text-primary, #e6edf3); font-size: 1rem; }
        .dr-drives { display: flex; gap: 0.75rem; flex-wrap: wrap; }
        .dr-drive { background: var(--bg-darker, #0d1117); border: 1px solid var(--border, #30363d); border-radius: 6px; padding: 0.75rem 1rem; flex: 1; min-width: 200px; }
        .dr-drive-label { font-weight: 600; color: var(--text-primary, #e6edf3); }
        .dr-drive-path { font-size: 0.85rem; color: var(--text-secondary, #8b949e); font-family: monospace; }
        .dr-drive-status { font-size: 0.85rem; margin-top: 0.25rem; }
        .dr-drive-ok { color: var(--success, #3fb950); }
        .dr-drive-warn { color: var(--warning, #f0ad4e); }
        table { width: 100%; border-collapse: collapse; }
        th { text-align: left; padding: 0.5rem 0.75rem; color: var(--text-secondary, #8b949e); font-size: 0.85rem; font-weight: 500; border-bottom: 1px solid var(--border, #30363d); }
        td { padding: 0.6rem 0.75rem; border-bottom: 1px solid var(--border, #30363d); color: var(--text-primary, #e6edf3); font-size: 0.9rem; }
        .badge { display: inline-block; padding: 2px 8px; border-radius: 12px; font-size: 0.75rem; font-weight: 500; }
        .badge-ok { background: rgba(63,185,80,0.15); color: var(--success, #3fb950); }
        .badge-warn { background: rgba(240,173,78,0.15); color: var(--warning, #f0ad4e); }
        .badge-none { background: rgba(139,148,158,0.15); color: var(--text-secondary, #8b949e); }
        .status-pending { color: var(--text-secondary, #8b949e); }
        .status-restoring { color: var(--info, #58a6ff); }
        .status-done { color: var(--success, #3fb950); }
        .status-failed { color: var(--danger, #f85149); }
        .status-skipped { color: var(--text-secondary, #8b949e); }
        .dr-actions { display: flex; gap: 0.75rem; justify-content: center; margin-top: 1.5rem; }
        .btn { display: inline-flex; align-items: center; justify-content: center; padding: 0.6rem 1.5rem; border-radius: 6px; border: 1px solid transparent; font-size: 0.9rem; font-weight: 500; cursor: pointer; text-decoration: none; transition: background 0.2s; }
        .btn-primary { background: var(--accent, #238636); color: #fff; border-color: var(--accent, #238636); }
        .btn-primary:hover { background: #2ea043; }
        .btn-primary:disabled { opacity: 0.5; cursor: not-allowed; }
        .btn-outline { background: transparent; color: var(--text-secondary, #8b949e); border-color: var(--border, #30363d); }
        .btn-outline:hover { color: var(--text-primary, #e6edf3); border-color: var(--text-secondary, #8b949e); }
        .btn-success { background: var(--accent, #238636); color: #fff; }
        .progress-bar { height: 4px; background: var(--border, #30363d); border-radius: 2px; margin-top: 1rem; overflow: hidden; display: none; }
        .progress-bar-inner { height: 100%; background: var(--accent, #238636); transition: width 0.5s; width: 0%; }
        .dr-info { display: flex; gap: 2rem; flex-wrap: wrap; margin-bottom: 0.5rem; }
        .dr-info-item { font-size: 0.9rem; }
        .dr-info-label { color: var(--text-secondary, #8b949e); }
        .dr-info-value { color: var(--text-primary, #e6edf3); font-weight: 500; }
        .spinner { display: inline-block; width: 14px; height: 14px; border: 2px solid var(--border, #30363d); border-top-color: var(--info, #58a6ff); border-radius: 50%; animation: spin 0.8s linear infinite; vertical-align: middle; margin-right: 4px; }
        @keyframes spin { to { transform: rotate(360deg); } }
    </style>
</head>
<body>
<div class="dr-container">
    <div class="dr-header">
        <img src="/static/felhom-logo.svg" alt="Felhom">
        <h1>Korábbi telepítés észlelve</h1>
        <p>A rendszer biztonsági mentést talált a központi szerveren</p>
    </div>

    <!-- Info card -->
    <div class="dr-card">
        <h3>Rendszer információ</h3>
        <div class="dr-info">
            <div class="dr-info-item">
                <span class="dr-info-label">Ügyfél: </span>
                <span class="dr-info-value">{{.CustomerName}}</span>
            </div>
            <div class="dr-info-item">
                <span class="dr-info-label">Domain: </span>
                <span class="dr-info-value">{{.Domain}}</span>
            </div>
            <div class="dr-info-item">
                <span class="dr-info-label">Mentés időpontja: </span>
                <span class="dr-info-value">{{.Timestamp}}</span>
            </div>
        </div>
    </div>

    <!-- Drives card -->
    <div class="dr-card">
        <h3>Meghajtók</h3>
        <div class="dr-drives">
            {{range .Drives}}
            <div class="dr-drive">
                <div class="dr-drive-label">{{.Label}}</div>
                <div class="dr-drive-path">{{.Path}}</div>
                <div class="dr-drive-status">
                    {{if .Available}}
                        {{if .HasBackup}}
                        <span class="dr-drive-ok">Elérhető, mentés megtalálva</span>
                        {{else}}
                        <span class="dr-drive-ok">Elérhető</span>
                        {{end}}
                    {{else}}
                    <span class="dr-drive-warn">Nem elérhető</span>
                    {{end}}
                </div>
            </div>
            {{end}}
            {{if not .Drives}}
            <p style="color:var(--text-secondary)">Nem találhatók csatlakoztatott meghajtók.</p>
            {{end}}
        </div>
    </div>

    <!-- Apps table card -->
    <div class="dr-card">
        <h3>Visszaállítható alkalmazások</h3>
        {{if .Apps}}
        <table>
            <thead>
                <tr>
                    <th>Alkalmazás</th>
                    <th>Konfiguráció</th>
                    <th>Adatok</th>
                    <th>DB mentés</th>
                    <th>Állapot</th>
                </tr>
            </thead>
            <tbody id="app-table-body">
                {{range .Apps}}
                <tr data-app="{{.Name}}">
                    <td>
                        <strong>{{.DisplayName}}</strong>
                        <div style="font-size:.8rem;color:var(--text-secondary)">{{.Name}}</div>
                    </td>
                    <td>
                        {{if .HasConfig}}
                        <span class="badge badge-ok">Megtalálva</span>
                        {{else}}
                        <span class="badge badge-none">Hiányzik</span>
                        {{end}}
                    </td>
                    <td>
                        {{if .HasData}}
                        <span class="badge badge-ok">Elérhető</span>
                        {{else if .HasRsyncData}}
                        <span class="badge badge-warn">Mentésből</span>
                        {{else if not .NeedsHDD}}
                        <span class="badge badge-none">Nem szükséges</span>
                        {{else}}
                        <span class="badge badge-warn">Hiányzik</span>
                        {{end}}
                    </td>
                    <td>
                        {{if .HasDBDump}}
                        <span class="badge badge-ok">Van</span>
                        {{else}}
                        <span class="badge badge-none">Nincs</span>
                        {{end}}
                    </td>
                    <td class="app-status" data-app="{{.Name}}">
                        <span class="status-{{.Status}}">{{statusText .Status}}</span>
                    </td>
                </tr>
                {{end}}
            </tbody>
        </table>
        <div class="progress-bar" id="progress-bar">
            <div class="progress-bar-inner" id="progress-inner"></div>
        </div>
        {{else}}
        <p style="color:var(--text-secondary)">Nem találhatók visszaállítható alkalmazások.</p>
        {{end}}
    </div>

    <!-- Action buttons -->
    <div class="dr-actions" id="dr-actions">
        {{if eq .PlanStatus "pending"}}
            {{if .Apps}}
            <button class="btn btn-primary" id="btn-restore-all" onclick="startRestoreAll()">
                Összes visszaállítása ({{len .Apps}} alkalmazás)
            </button>
            {{end}}
            <button class="btn btn-outline" id="btn-skip" onclick="skipRestore()">
                Kihagyás — tovább a vezérlőpulthoz
            </button>
        {{else if eq .PlanStatus "restoring"}}
            <button class="btn btn-primary" disabled>
                <span class="spinner"></span> Visszaállítás folyamatban...
            </button>
        {{else if eq .PlanStatus "done"}}
            <a href="/" class="btn btn-success" id="btn-continue" onclick="finishRestore(event)">
                Tovább a vezérlőpulthoz
            </a>
        {{end}}
    </div>
</div>

<script>
    var polling = null;
    var planStatus = "{{.PlanStatus}}";

    if (planStatus === "restoring") {
        startPolling();
    }

    function startRestoreAll() {
        var btn = document.getElementById('btn-restore-all');
        var skipBtn = document.getElementById('btn-skip');
        btn.disabled = true;
        btn.innerHTML = '<span class="spinner"></span> Visszaállítás indítása...';
        if (skipBtn) skipBtn.style.display = 'none';

        fetch('/api/restore/all', { method: 'POST' })
            .then(function(resp) { return resp.json(); })
            .then(function(data) {
                if (data.ok) {
                    planStatus = 'restoring';
                    document.getElementById('progress-bar').style.display = 'block';
                    startPolling();
                } else {
                    alert('Hiba: ' + (data.error || 'Ismeretlen hiba'));
                    btn.disabled = false;
                    btn.textContent = 'Összes visszaállítása';
                    if (skipBtn) skipBtn.style.display = '';
                }
            })
            .catch(function(err) {
                alert('Hálózati hiba: ' + err.message);
                btn.disabled = false;
                btn.textContent = 'Összes visszaállítása';
                if (skipBtn) skipBtn.style.display = '';
            });
    }

    function skipRestore() {
        if (!confirm('Biztosan ki szeretné hagyni a visszaállítást? A vezérlőpult üres alkalmazáslistával fog elindulni.')) return;
        fetch('/api/restore/skip', { method: 'POST' })
            .then(function(resp) { return resp.json(); })
            .then(function(data) {
                if (data.ok) {
                    window.location.href = '/';
                } else {
                    alert('Hiba: ' + (data.error || 'Ismeretlen hiba'));
                }
            })
            .catch(function(err) { alert('Hálózati hiba: ' + err.message); });
    }

    function finishRestore(e) {
        e.preventDefault();
        fetch('/api/restore/skip', { method: 'POST' })
            .then(function() { window.location.href = '/'; })
            .catch(function() { window.location.href = '/'; });
    }

    function startPolling() {
        if (polling) return;
        document.getElementById('progress-bar').style.display = 'block';
        polling = setInterval(pollStatus, 2000);
        pollStatus();
    }

    function pollStatus() {
        fetch('/api/restore/status')
            .then(function(resp) { return resp.json(); })
            .then(function(data) {
                if (!data.ok) return;
                updateTable(data.apps || []);
                updateProgress(data.apps || []);

                if (data.status === 'done') {
                    clearInterval(polling);
                    polling = null;
                    planStatus = 'done';
                    updateActions();
                }
            })
            .catch(function() {});
    }

    function updateTable(apps) {
        apps.forEach(function(app) {
            // Compare data-app directly instead of building a selector from server data.
            document.querySelectorAll('.app-status').forEach(function(cell) {
                if (cell.getAttribute('data-app') !== app.name) return;
                var span = document.createElement('span');
                span.className = 'status-' + app.status;
                if (app.status === 'restoring') {
                    var spinner = document.createElement('span');
                    spinner.className = 'spinner';
                    span.appendChild(spinner);
                    span.appendChild(document.createTextNode(' '));
                }
                span.appendChild(document.createTextNode(statusText(app.status)));
                if (app.error) {
                    var err = document.createElement('span');
                    err.style.fontSize = '.8rem';
                    err.style.color = 'var(--danger)';
                    // textContent keeps backend error text from being parsed as HTML.
                    err.textContent = ' (' + String(app.error).substring(0, 60) + ')';
                    span.appendChild(err);
                }
                cell.textContent = '';
                cell.appendChild(span);
            });
        });
    }

    function updateProgress(apps) {
        var total = apps.length;
        if (total === 0) return;
        var done = 0;
        apps.forEach(function(a) {
            if (a.status === 'done' || a.status === 'failed' || a.status === 'skipped') done++;
        });
        var pct = Math.round((done / total) * 100);
        document.getElementById('progress-inner').style.width = pct + '%';
    }

    function updateActions() {
        var actions = document.getElementById('dr-actions');
        actions.innerHTML = '<a href="/" class="btn btn-success" id="btn-continue" onclick="finishRestore(event)">Tovább a vezérlőpulthoz</a>';
    }

    function statusText(s) {
        switch (s) {
            case 'pending': return 'Várakozik';
            case 'restoring': return 'Visszaállítás...';
            case 'done': return 'Kész';
            case 'failed': return 'Sikertelen';
            case 'skipped': return 'Kihagyva';
            default: return s;
        }
    }
</script>
</body>
</html>
{{end}}
@@ -1512,6 +1512,16 @@ print_summary() {
        echo -e "${YELLOW}Env vars (passwords, secrets) must be filled in manually.${NC}"
    fi
    echo ""
    if [[ -n "$CUSTOMER_ID" ]]; then
        echo -e "${BOLD}${YELLOW}Disaster Recovery:${NC}"
        echo " If this is a reinstallation, the controller will automatically:"
        echo " 1. Contact the Hub for your previous configuration"
        echo " 2. Mount your existing storage drives"
        echo " 3. Detect and offer to restore your applications"
        echo ""
        echo " Open https://felhom.${BASE_DOMAIN} to monitor the restore process."
        echo ""
    fi
    echo -e "${BOLD}Quick Commands:${NC}"
    echo " dps → List containers"
    echo " dlogs <n> → View container logs"
