26 KiB
TASK.md — Backup Architecture Overhaul (v0.12.7)
Prompt (copy-paste this into Claude Code)
Read TASK.md for the full plan. Apply all code changes described, then build and deploy.
After all fixes are done:
1. Run `go build ./...` and `go vet ./...` from the controller/ directory — fix any errors
2. Update CHANGELOG.md with a new entry at the top (session 43, v0.12.7)
3. Update controller/README.md backup section with the new architecture
4. Commit, build, and deploy following the workflow in CLAUDE.md
Context and Goals
The backup system has accumulated an unnecessary Enabled flag that gates user data
inclusion in the nightly restic backup. This was a design mistake — user data backup
must be mandatory for all apps that have HDD data. Without it, the service isn't viable.
This task removes the Enabled gate, makes HDD data backup automatic, and fixes the
restore feature to show all apps.
Backup architecture (3-2-1 rule):
-
Nightly restic (mandatory, same drive) — DB dumps + config + ALL user data (HDD). Every app with data gets backed up automatically. This protects against accidental deletion and enables point-in-time restore. The backup lives on the same drive as the app, so it does NOT protect against drive failure.
-
Cross-drive backup (opt-in, different device) — rsync or restic to a secondary physical drive. This is the second copy that protects against drive failure. When not configured for an app with HDD data, the UI warns that data is only on one drive.
-
Remote backup (future) — offsite copy for disaster recovery. Not implemented yet.
Current problems:
AppBackupPrefs.Enabledflag gates HDD inclusion in nightly restic — should be automatic- The flag gets out of sync with cross-drive config
- Restore dropdown only shows apps with
HasHDDData && BackupEnabled— excludes most apps - "Docker kötetek: Auto ✓" in UI implies volumes are separately backed up — they're not
- Cross-drive backup doesn't trigger a DB dump first
Fix 1: Make HDD data backup mandatory
Remove the Enabled flag as a gate. All apps with HDD data are always included in the
nightly restic backup. No toggle, no opt-in.
1a: Rewrite resolveAppBackupPaths() — include ALL HDD paths
File: internal/backup/backup.go, function resolveAppBackupPaths (line 429)
Current:
func (m *Manager) resolveAppBackupPaths() []string {
if m.stackProvider == nil || m.settings == nil {
return nil
}
appBackupMap := m.settings.GetAppBackupMap()
if len(appBackupMap) == 0 {
return nil
}
var paths []string
seen := make(map[string]bool)
for stackName, enabled := range appBackupMap {
if !enabled {
continue
}
hddMounts := m.stackProvider.GetStackHDDMounts(stackName)
for _, mount := range hddMounts {
if seen[mount] {
continue
}
if _, err := os.Stat(mount); err == nil {
paths = append(paths, mount)
seen[mount] = true
m.logger.Printf("[DEBUG] Including app data: %s (from %s)", mount, stackName)
}
}
}
return paths
}
Replace with:
// resolveAppBackupPaths returns HDD paths for ALL deployed apps.
// User data backup is mandatory — every app with HDD mounts is included.
func (m *Manager) resolveAppBackupPaths() []string {
if m.stackProvider == nil {
return nil
}
var paths []string
seen := make(map[string]bool)
for _, stack := range m.stackProvider.ListDeployedStacks() {
hddMounts := m.stackProvider.GetStackHDDMounts(stack.Name)
for _, mount := range hddMounts {
if seen[mount] {
continue
}
if _, err := os.Stat(mount); err == nil {
paths = append(paths, mount)
seen[mount] = true
m.logger.Printf("[DEBUG] Including app data: %s (from %s)", mount, stack.Name)
}
}
}
return paths
}
Key change: no longer reads GetAppBackupMap(), no longer checks Enabled. Iterates
all deployed stacks via ListDeployedStacks() and includes every HDD mount.
The m.settings dependency is also removed from this method.
1b: Remove backupPrefs parameter from DiscoverAppData()
File: internal/backup/appdata.go, function DiscoverAppData (line 61)
Current:
func DiscoverAppData(provider StackDataProvider, backupPrefs map[string]bool, discoveredDBs []DiscoveredDB) []AppBackupInfo {
and at line 100:
info.BackupEnabled = backupPrefs[stack.Name]
Replace function signature with:
func DiscoverAppData(provider StackDataProvider, discoveredDBs []DiscoveredDB) []AppBackupInfo {
Replace line 100 with:
// All apps with HDD data are backed up automatically (mandatory)
info.BackupEnabled = info.HasHDDData
1c: Update RefreshCache() caller
File: internal/backup/backup.go, in RefreshCache (around line 547)
Current:
if m.stackProvider != nil {
backupPrefs := m.settings.GetAppBackupMap()
status.AppDataInfo = DiscoverAppData(m.stackProvider, backupPrefs, status.DiscoveredDBs)
Replace with:
if m.stackProvider != nil {
status.AppDataInfo = DiscoverAppData(m.stackProvider, status.DiscoveredDBs)
1d: Delete dead settings methods
File: internal/settings/settings.go
Delete these methods entirely — they are no longer called anywhere:
IsAppBackupEnabled()(lines 235–243) — was used by restore.go, replaced in Fix 3fSetAppBackup()(lines 245–257) — was never called outside settings.goGetAppBackupMap()(lines 259–271) — was used by resolveAppBackupPaths + RefreshCache, replaced in 1a + 1cSetAppBackupBulk()(lines 273–287) — was never called outside settings.goGetAppBackupPrefs()(lines 289–298) — was never called outside settings.go
Keep the AppBackupPrefs struct and AppBackup field in Settings — the JSON field
"app_backup" still holds CrossDrive configs. The Enabled field stays in the struct
for backward compat (existing settings.json won't break on load) but nothing reads it.
Fix 2: DB dump before cross-drive backup
When cross-drive backup runs (scheduled or manual), trigger a fresh DB dump for that app's databases first. This ensures DB state matches the user data being rsynced.
2a: Define DBDumper interface
File: internal/backup/crossdrive.go, add near top (after imports):
// DBDumper can run a database dump for a specific stack.
type DBDumper interface {
DumpStackDB(ctx context.Context, stackName string) error
}
2b: Add field to CrossDriveRunner
File: internal/backup/crossdrive.go
Add dbDumper field to the struct (do NOT change the constructor signature):
type CrossDriveRunner struct {
sett *settings.Settings
stackProvider StackDataProvider
dbDumper DBDumper
logger *log.Logger
mu sync.Mutex
running map[string]bool
}
Add setter method after NewCrossDriveRunner:
// SetDBDumper sets the DB dumper for pre-backup database dumps.
// Called after backup manager is initialized (avoids circular init dependency).
func (r *CrossDriveRunner) SetDBDumper(d DBDumper) {
r.dbDumper = d
}
2c: Implement DumpStackDB on backup Manager
File: internal/backup/backup.go, add new method:
// DumpStackDB runs a database dump for containers belonging to a specific stack.
// Used by cross-drive backup to ensure DB state matches user data.
func (m *Manager) DumpStackDB(ctx context.Context, stackName string) error {
dbs, err := DiscoverDatabases(ctx, m.logger)
if err != nil {
return fmt.Errorf("database discovery failed: %w", err)
}
var stackDBs []DiscoveredDB
for _, db := range dbs {
if db.StackName == stackName {
stackDBs = append(stackDBs, db)
}
}
if len(stackDBs) == 0 {
m.logger.Printf("[DEBUG] No databases found for stack %s — skipping pre-backup dump", stackName)
return nil
}
m.logger.Printf("[INFO] Running pre-backup DB dump for %s (%d database(s))", stackName, len(stackDBs))
results := DumpAll(ctx, stackDBs, m.cfg.Paths.DBDumpDir, m.logger)
for _, r := range results {
if r.Error != nil {
return fmt.Errorf("DB dump failed for %s: %w", r.DB.ContainerName, r.Error)
}
m.logger.Printf("[INFO] Pre-backup DB dump OK: %s (%s)", r.DB.ContainerName, humanizeBytes(r.Size))
// Persist validation to settings
if m.settings != nil && r.FilePath != "" {
filename := filepath.Base(r.FilePath)
cache := settings.DBValidationCache{
ValidatedAt: time.Now().Format(time.RFC3339),
TableCount: r.Validation.TableCount,
HasHeader: r.Validation.Valid,
}
if !r.Validation.Valid {
cache.Error = r.Validation.Error
}
_ = m.settings.SetDBValidation(filename, cache)
}
}
return nil
}
2d: Call DB dump in RunAppBackup
File: internal/backup/crossdrive.go, in RunAppBackup, add BEFORE the
ValidateDestination call (around line 67, after the "Mark as running" block):
// Trigger fresh DB dump for this app before cross-drive backup
if r.dbDumper != nil {
if err := r.dbDumper.DumpStackDB(ctx, stackName); err != nil {
r.logger.Printf("[WARN] Pre-backup DB dump failed for %s: %v — proceeding with user data backup", stackName, err)
// Non-fatal: user data backup is still valuable without fresh dump
}
}
2e: Wire up in main.go
File: cmd/controller/main.go, after both crossDriveRunner and backupMgr are created
(after line 135):
// Wire cross-drive → backup manager for pre-backup DB dumps
if backupMgr != nil {
crossDriveRunner.SetDBDumper(backupMgr)
}
Fix 3: Restore dropdown shows ALL deployed apps
The restore section should show every deployed app. All apps have restic snapshots (stacks dir + DB dumps are always backed up). Apps with HDD data get full restore.
3a: Update template filter
File: internal/web/templates/backups.html, lines 450–456
Current:
{{range .Backup.AppDataInfo}}
{{if and .HasHDDData .BackupEnabled}}
<option value="{{.StackName}}">{{.DisplayName}}</option>
{{end}}
{{end}}
Replace with:
{{range .Backup.AppDataInfo}}
<option value="{{.StackName}}" data-has-hdd="{{.HasHDDData}}" data-has-db="{{.HasDBDump}}">{{.DisplayName}}</option>
{{end}}
Note: no data-backup-enabled attribute — user data is always backed up when present.
3b: Show restore type info when app is selected
File: internal/web/templates/backups.html
Add a new info div after the snapshot selector (after line 464, before the warning div):
<div id="restore-type-info" class="restore-info" style="display:none;margin-bottom:0.5rem">
</div>
3c: Update onRestoreAppChange() JavaScript
File: internal/web/templates/backups.html, replace the onRestoreAppChange function
(lines 604–637) with:
function onRestoreAppChange() {
var sel = document.getElementById('restore-app');
var appName = sel.value;
var snapSel = document.getElementById('restore-snapshot');
var noSnaps = document.getElementById('restore-no-snapshots');
var typeInfo = document.getElementById('restore-type-info');
document.getElementById('restore-confirm-cb').checked = false;
document.getElementById('restore-btn').disabled = true;
noSnaps.style.display = 'none';
typeInfo.style.display = 'none';
if (!appName) {
snapSel.innerHTML = '<option value="">— Válasszon alkalmazást —</option>';
return;
}
// Determine restore type from data attributes
var opt = sel.options[sel.selectedIndex];
var hasHDD = opt.getAttribute('data-has-hdd') === 'true';
var hasDB = opt.getAttribute('data-has-db') === 'true';
if (hasHDD) {
typeInfo.innerHTML = '🔄 Teljes visszaállítás: adatbázis + konfiguráció + felhasználói adatok a kiválasztott pillanatképből.';
typeInfo.className = 'restore-info';
} else if (hasDB) {
typeInfo.innerHTML = 'ℹ Adatbázis és konfiguráció visszaállítása — az alkalmazásnak nincs külön felhasználói adata.';
typeInfo.className = 'restore-info restore-info-partial';
} else {
typeInfo.innerHTML = 'ℹ Csak konfiguráció visszaállítása (compose fájlok, beállítások).';
typeInfo.className = 'restore-info restore-info-partial';
}
typeInfo.style.display = 'block';
snapSel.innerHTML = '<option value="">— Betöltés... —</option>';
fetch('/api/backup/snapshots?stack=' + encodeURIComponent(appName))
.then(function(r) { return r.json(); })
.then(function(data) {
snapSel.innerHTML = '<option value="">— Válasszon —</option>';
if (data.ok && data.data && data.data.length > 0) {
data.data.forEach(function(s) {
var o = document.createElement('option');
o.value = s.short_id;
o.textContent = formatSnapshot(s);
snapSel.appendChild(o);
});
} else {
snapSel.innerHTML = '<option value="">— Nincs elérhető mentés —</option>';
noSnaps.style.display = 'block';
}
});
}
Note: the JS is simpler than before — no backupEnabled check. If hasHDD is true, the
data IS backed up (it's mandatory). So: hasHDD → full restore, hasDB → config+DB, else → config only.
3d: Add CSS for restore info
File: internal/web/templates/style.css, add:
.restore-info {
padding: 0.5rem 0.75rem;
border-radius: 6px;
font-size: 0.85rem;
background: rgba(59, 130, 246, 0.1);
border: 1px solid rgba(59, 130, 246, 0.3);
color: #93c5fd;
}
.restore-info-partial {
background: rgba(251, 191, 36, 0.1);
border-color: rgba(251, 191, 36, 0.3);
color: #fcd34d;
}
3e: Update snapshot filtering API for non-HDD apps
File: internal/api/router.go, function backupSnapshots (line 448)
Current code (lines 460–466):
if stackName := req.URL.Query().Get("stack"); stackName != "" {
mounts := r.backupMgr.GetStackHDDMounts(stackName)
if len(mounts) > 0 {
snapshots = filterSnapshotsByPaths(snapshots, mounts)
}
}
Replace with:
if stackName := req.URL.Query().Get("stack"); stackName != "" {
mounts := r.backupMgr.GetStackHDDMounts(stackName)
if len(mounts) > 0 {
// App has HDD data — filter to snapshots containing those paths
snapshots = filterSnapshotsByPaths(snapshots, mounts)
}
// Apps without HDD mounts: return all snapshots (they all contain
// the stacks dir + DB dumps which cover this app's config and database)
}
This is effectively a no-op change (just adding a comment), since if len(mounts) > 0
already skips filtering for non-HDD apps. The comment clarifies intent.
3f: Update RestoreApp to handle all apps
File: internal/backup/restore.go
Replace the entire file with:
package backup
import (
"fmt"
"path/filepath"
"regexp"
)
// snapshotIDRe validates restic snapshot IDs: 8-64 lowercase hex characters.
var snapshotIDRe = regexp.MustCompile(`^[0-9a-f]{8,64}$`)
// RestoreApp restores an app from a restic snapshot.
// All apps get config + DB dump restored. Apps with HDD data also get user data restored.
func (m *Manager) RestoreApp(stackName, snapshotID string) error {
if m.stackProvider == nil {
return fmt.Errorf("stack provider not configured")
}
// Validate snapshot ID format
if !snapshotIDRe.MatchString(snapshotID) {
return fmt.Errorf("invalid snapshot ID: must be 8-64 lowercase hex characters")
}
// Prevent concurrent operations
m.mu.Lock()
if m.running {
m.mu.Unlock()
return fmt.Errorf("backup or restore already in progress")
}
m.running = true
m.mu.Unlock()
defer func() {
m.mu.Lock()
m.running = false
m.mu.Unlock()
}()
// Determine what to restore
hddMounts := m.stackProvider.GetStackHDDMounts(stackName)
hasHDD := len(hddMounts) > 0
// Build list of paths to restore from the snapshot
var restorePaths []string
// Always restore the stack's config dir (compose + app.yaml + .felhom.yml)
composePath, ok := m.stackProvider.GetStackComposePath(stackName)
if ok {
stackDir := filepath.Dir(composePath)
restorePaths = append(restorePaths, stackDir)
}
// Restore DB dump files for this stack
if m.cfg.Paths.DBDumpDir != "" {
restorePaths = append(restorePaths, m.cfg.Paths.DBDumpDir)
}
// Restore HDD data (always included for apps that have it — backup is mandatory)
if hasHDD {
restorePaths = append(restorePaths, hddMounts...)
}
if len(restorePaths) == 0 {
return fmt.Errorf("no restorable paths found for %s", stackName)
}
m.logger.Printf("[WARN] RESTORE starting: stack=%s, snapshot=%s, paths=%v, hasHDD=%v",
stackName, snapshotID, restorePaths, hasHDD)
// Stop the app before restore
if err := m.stackProvider.StopStack(stackName); err != nil {
m.logger.Printf("[WARN] RESTORE could not stop %s: %v (proceeding anyway)", stackName, err)
}
// Execute restore via restic
if err := m.restic.RestoreAppData(snapshotID, restorePaths); err != nil {
m.logger.Printf("[ERROR] RESTORE failed for %s: %v", stackName, err)
if startErr := m.stackProvider.StartStack(stackName); startErr != nil {
m.logger.Printf("[WARN] RESTORE could not restart %s after failure: %v", stackName, startErr)
}
return err
}
// Restart the app
if err := m.stackProvider.StartStack(stackName); err != nil {
m.logger.Printf("[WARN] RESTORE could not restart %s after restore: %v", stackName, err)
}
restoreType := "config+DB"
if hasHDD {
restoreType = "full (config+DB+userdata)"
}
m.logger.Printf("[INFO] RESTORE completed: stack=%s, snapshot=%s, type=%s", stackName, snapshotID, restoreType)
return nil
}
Key difference from previous version: no IsAppBackupEnabled() check — HDD data is always
backed up and always restorable.
Fix 4: Honest Docker volume UI label
4a: Change label in template
File: internal/web/templates/backups.html, lines 283–292
Current:
<div class="backup-layer-row">
<span class="layer-label">Docker kötetek</span>
<span class="layer-badge">Auto</span>
Replace Docker kötetek with Konfiguráció:
<div class="backup-layer-row">
<span class="layer-label">Konfiguráció</span>
<span class="layer-badge">Auto</span>
Why: Docker named volumes (immich_postgres_data) live at /var/lib/docker/volumes/
which is NOT in the restic backup paths. DB data is protected via pg_dump (separate row).
What's actually backed up is compose files + app.yaml + .felhom.yml = configuration.
Summary of all changes
| Fix | What | File(s) |
|---|---|---|
| 1a | resolveAppBackupPaths() includes ALL deployed stacks' HDD |
backup.go |
| 1b | Remove backupPrefs parameter from DiscoverAppData() |
appdata.go |
| 1c | Update RefreshCache() caller to match new signature |
backup.go |
| 1d | Delete 5 dead settings methods | settings.go |
| 2a | DBDumper interface |
crossdrive.go |
| 2b | dbDumper field + SetDBDumper on CrossDriveRunner |
crossdrive.go |
| 2c | DumpStackDB method on backup Manager |
backup.go |
| 2d | Call DB dump before cross-drive backup | crossdrive.go |
| 2e | Wire dbDumper in main.go | main.go |
| 3a | Remove filter from restore dropdown | backups.html |
| 3b | Add restore type info div | backups.html |
| 3c | Update JS to show restore type per app | backups.html |
| 3d | CSS for restore info banners | style.css |
| 3e | Comment clarifying snapshot API for non-HDD apps | router.go |
| 3f | RestoreApp handles config+DB+HDD restore |
restore.go |
| 4a | Rename "Docker kötetek" → "Konfiguráció" | backups.html |
Files to modify (8)
internal/backup/backup.go— Fix 1a + 1c + 2cinternal/backup/appdata.go— Fix 1binternal/settings/settings.go— Fix 1d (delete 5 methods)internal/backup/crossdrive.go— Fix 2a + 2b + 2dinternal/backup/restore.go— Fix 3f (full rewrite)internal/web/templates/backups.html— Fix 3a + 3b + 3c + 4ainternal/web/templates/style.css— Fix 3dinternal/api/router.go— Fix 3ecmd/controller/main.go— Fix 2e
Architecture after fix
Backup layers per app (3-2-1 rule):
┌──────────────────────────────────────────────────────┐
│ RULE 1 — Nightly restic (MANDATORY, same drive) │
│ │
│ 1. Adatbázis mentés (Auto) │
│ - pg_dump / mysqldump → /srv/backups/db-dumps/ │
│ - Runs nightly 02:30 (all discovered DBs) │
│ - Also runs before cross-drive backup (per-app) │
│ │
│ 2. Konfiguráció (Auto) │
│ - compose.yml, app.yaml, .felhom.yml │
│ - In nightly restic snapshot (always) │
│ │
│ 3. Felhasználói adatok (Auto — if app has HDD data) │
│ - HDD bind mounts (photos, documents, etc.) │
│ - Always included in nightly restic │
│ - No toggle — backup is mandatory │
│ │
├──────────────────────────────────────────────────────┤
│ RULE 2 — Cross-drive backup (OPT-IN, second device) │
│ │
│ - rsync or restic to a secondary physical drive │
│ - Protects against primary drive failure │
│ - When NOT configured: UI warns "nincs 2. másolat" │
│ - Triggers fresh DB dump before backup │
│ │
├──────────────────────────────────────────────────────┤
│ RULE 3 — Remote backup (FUTURE) │
│ │
│ - Offsite copy for disaster recovery │
│ - Not implemented yet │
└──────────────────────────────────────────────────────┘
Restore capabilities:
┌─────────────────┬────────────┬──────────┬────────────┐
│ App type │ DB restore │ Config │ User data │
├─────────────────┼────────────┼──────────┼────────────┤
│ Has HDD data │ ✓ │ ✓ │ ✓ (always) │
│ DB only, no HDD │ ✓ │ ✓ │ n/a │
│ No DB, no HDD │ — │ ✓ │ n/a │
└─────────────────┴────────────┴──────────┴────────────┘
Note: the "DB + HDD, no backup" row from the previous version is GONE. All apps with HDD data are always backed up. The restore table is simpler.
Post-fix checklist
go build ./...passesgo vet ./...passes- Verify no remaining references to deleted methods (
IsAppBackupEnabled,GetAppBackupMap,SetAppBackup,SetAppBackupBulk,GetAppBackupPrefs) — usegrep -rto confirm - Update
CHANGELOG.md— session 43, version v0.12.7:- HDD data backup is now mandatory for all apps (no opt-in toggle)
- Removed dead
Enabledflag gating from settings - Cross-drive: triggers pre-backup DB dump for consistency
- Restore: all deployed apps now appear in dropdown
- Restore: shows restore type info (full / config+DB / config only)
- Restore: supports config+DB restore for apps without user data
- UI: "Docker kötetek" renamed to "Konfiguráció" (accuracy)
- Update
controller/README.md— backup architecture section:- 3-2-1 backup rule (nightly mandatory, cross-drive opt-in, remote future)
- Restore capabilities matrix
- Cross-drive pre-backup DB dump flow
- Commit, build on 192.168.0.180, deploy on 192.168.0.162
- Verify with
docker psanddocker logs - After deploy, verify:
- Immich shows in restore dropdown (HDD data → full restore)
- Gokapi/Mealie also show (DB+config restore)
- Selecting Immich shows "Teljes visszaállítás" banner
- Selecting Gokapi shows "Adatbázis és konfiguráció" banner
- Manual cross-drive backup triggers DB dump first (check logs)
- Backup status page shows "Konfiguráció" instead of "Docker kötetek"
- Nightly restic backup includes Immich HDD paths without any toggle