v0.53.0: Phase 2 capture side — per-app secret-free recovery unit

Each app's on-drive backup becomes a self-contained, recreatable recovery unit:
compose/ (docker-compose.yml + .felhom.yml + secret-stripped app.yaml) alongside
the existing db-dumps/ + volume-dumps/, plus a secret-free manifest.json (image
pins, secret env-var NAMES, data_key names, checksums). The unit stores no secret
value, no data-key, and no image; at restore time, secrets are recovered from the
guest's own app.yaml (live/PBS) and are never regenerated.

- appbackup: RecoveryUnit* path helpers, RecoveryInfo + GetStackRecoveryInfo,
  ParseComposeImages; AppDBDump/Volume refactored onto RecoveryUnitPath.
- backup: recovery_unit.go (manifest + CaptureRecoveryUnit), wired into RunDBDumps;
  capture test verifies that the captured unit is secret-free.
- stacks: DeployField.DataKey + Metadata.DataKeyEnvVars(); main.go stackAdapter
  implements GetStackRecoveryInfo (excludes secret-named + encrypted values).
- Next: restore-from-unit recreate, fail-closed gate, and live AdventureLog validation.
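
The exclusion of secret-named and encrypted values mentioned above could be sketched roughly as follows. This is not the stackAdapter code: the marker list, the `enc:` prefix, and the `SplitEnv` helper are all hypothetical and stand in for whatever rules the controller actually applies.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// secretMarkers and encryptedPrefix are assumptions for illustration only;
// the real controller may identify secrets differently.
var secretMarkers = []string{"PASSWORD", "SECRET", "TOKEN", "KEY"}

const encryptedPrefix = "enc:"

// isSecretName reports whether an env-var name contains a secret marker.
func isSecretName(name string) bool {
	upper := strings.ToUpper(name)
	for _, marker := range secretMarkers {
		if strings.Contains(upper, marker) {
			return true
		}
	}
	return false
}

// SplitEnv returns the values that are safe to store, plus the sorted
// names (not values) of every variable that was withheld.
func SplitEnv(env map[string]string) (safe map[string]string, withheld []string) {
	safe = make(map[string]string)
	for name, value := range env {
		if isSecretName(name) || strings.HasPrefix(value, encryptedPrefix) {
			withheld = append(withheld, name)
			continue
		}
		safe[name] = value
	}
	sort.Strings(withheld)
	return safe, withheld
}

func main() {
	// Sample data only.
	safe, withheld := SplitEnv(map[string]string{
		"TZ":          "Europe/Budapest",
		"DB_PASSWORD": "hunter2",
		"SMTP_HOST":   "enc:AAAA",
	})
	fmt.Println("stored values:", safe)
	fmt.Println("withheld names:", withheld)
}
```

Only the withheld names would go into the manifest; the values stay in the guest's own app.yaml.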

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 10:20:37 +02:00
parent 5eb25c3861
commit 70eb521cd0
9 changed files with 586 additions and 3 deletions
+9 -1
@@ -26,6 +26,7 @@ type Manager struct {
 	settings *settings.Settings
 	stackProvider StackDataProvider
 	systemDataPath string // fallback drive for SSD-only apps
+	version string // controller version, stamped into recovery-unit manifests
 	mu sync.Mutex
 	lastDBDump *DBDumpStatus
@@ -235,9 +236,16 @@ func (m *Manager) runDBDumpsInternal(ctx context.Context) error {
 		m.logger.Printf("[INFO] [backup] DB dump completed: %d databases, %s total (%s)",
 			len(results), humanizeBytes(totalSize), duration.Round(time.Millisecond))
 	} else {
-		return fmt.Errorf("some database dumps failed")
+		// Still refresh recovery units below — a partial DB failure shouldn't leave units stale.
+		m.logger.Printf("[WARN] [backup] some database dumps failed; refreshing recovery units anyway")
 	}
+	// Phase 2: refresh each deployed app's self-contained recovery unit (compose + manifest).
+	m.captureAllRecoveryUnits()
+	if !allOK {
+		return fmt.Errorf("some database dumps failed")
+	}
 	return nil
 }