feat: backup safety — stop-before-dump, streaming restore, health check, per-app restic, infra configs (v0.34.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -336,6 +336,7 @@ Path computation is centralized in `backup/paths.go` via the `FelhomDataDir = "f
 **Phase 1b — Docker Volume Dumps** (`internal/backup/backup.go`, runs after DB dumps)

 - Iterates all deployed stacks that have Docker named volumes (`GetDockerVolumes()`)
+- **v0.34.0:** Each stack is stopped before dump, restarted after (`DumpAppVolumesSafe()`) — prevents inconsistent tars of live databases. Protected stacks (traefik, etc.) that reject StopStack are skipped with a warning.
 - For each volume: `docker run --rm -v <vol>:/vol:ro -v <dumpDir>:/out alpine tar cf /out/<vol>.tar -C /vol .`
 - 10-minute timeout per volume; warnings on failure (non-fatal)
 - Stale tars cleaned up (volumes that no longer exist)
@@ -347,12 +348,12 @@ Path computation is centralized in `backup/paths.go` via the `FelhomDataDir = "f
 - Apps are **grouped by drive** via `groupStacksByDrive()` — each drive's apps are backed up to that drive's restic repo
 - App drive resolution: `GetStackHDDPath()` (from `StackDataProvider`) → falls back to `SystemDataPath`
 - Auto-generated repository password (32 random bytes, base64url), shared across all repos, synced to hub
-- **Paths included in every per-drive snapshot:**
+- **Paths included in each per-drive snapshot (v0.34.0: per-app scoped):**
   - Per-app DB dump dirs on that drive
   - Per-app Docker volume dump dirs (`volume-dumps/*.tar`)
   - Per-app HDD mount paths (user data)
-  - Stacks dir (compose.yml + app.yaml + .felhom.yml for all apps)
-  - `controller.yaml` (controller config)
+  - Per-app stack config dir (`<StacksDir>/<stackName>/` — only for stacks on this drive)
+  - `controller.yaml` — only on the system drive (not duplicated across all drives)
 - Auto-detects and unlocks stale locks (restic repo lock)
 - Weekly prune on Sundays with configurable retention (keep-daily, keep-weekly, keep-monthly)
 - Weekly integrity check (`restic check`) on Sunday 04:00 — checks **all** primary repos
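A minimal sketch of the drive grouping and password generation listed in this hunk, under stated assumptions. The method signature of `GetStackHDDPath`, the `mapProvider` helper, and the function names `groupByDrive` and `newRepoPassword` are hypothetical; the sketch also assumes the returned HDD path can be used directly as the drive key, and that the base64url password is unpadded. The real `groupStacksByDrive()` may differ on all of these points.

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
	"sort"
)

// StackDataProvider mirrors the provider named in the text; the signature
// is an assumption. An empty string means "no HDD path configured".
type StackDataProvider interface {
	GetStackHDDPath(stack string) string
}

// mapProvider is a hypothetical in-memory provider for the example.
type mapProvider map[string]string

func (m mapProvider) GetStackHDDPath(stack string) string { return m[stack] }

// groupByDrive buckets stacks by resolved drive, falling back to the
// system data path for stacks without an HDD path.
func groupByDrive(p StackDataProvider, stacks []string, systemDataPath string) map[string][]string {
	groups := make(map[string][]string)
	for _, s := range stacks {
		drive := p.GetStackHDDPath(s)
		if drive == "" {
			drive = systemDataPath
		}
		groups[drive] = append(groups[drive], s)
	}
	for _, g := range groups {
		sort.Strings(g) // deterministic order within each drive
	}
	return groups
}

// newRepoPassword returns 32 random bytes encoded as unpadded base64url.
func newRepoPassword() (string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(buf), nil
}

func main() {
	p := mapProvider{"immich": "/mnt/hdd1", "jellyfin": "/mnt/hdd1"}
	groups := groupByDrive(p, []string{"jellyfin", "vaultwarden", "immich"}, "/srv/system")
	for drive, apps := range groups {
		fmt.Println(drive, apps)
	}
	pw, err := newRepoPassword()
	if err != nil {
		panic(err)
	}
	fmt.Println("password length:", len(pw))
}
```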
@@ -377,7 +378,7 @@ data back up config + DB + user data + Docker volumes; apps without HDD back up
 - **restic** — Versioned, deduplicated, encrypted (shared repo across apps, not browsable)
 - Per-app configuration in settings.json: destination path, method, schedule (daily/weekly/manual)
 - **Pre-backup DB dump:** `DumpStackDB()` runs fresh pg_dump/mariadb-dump before each cross-drive backup; non-fatal on failure (wired via `DBDumper` interface to avoid circular imports)
-- **Pre-backup volume dump (v0.33.0):** `DumpAppVolumes()` exports Docker named volumes to tar before each cross-drive backup (wired via `VolumeDumper` interface)
+- **Pre-backup volume dump (v0.33.0, safe stop/start v0.34.0):** `DumpAppVolumesSafe()` stops the stack, exports Docker named volumes to tar, restarts — wired via `VolumeDumper` interface
 - **Empty mounts allowed:** `RunAppBackup` accepts apps with no HDD mounts — the rsync
   mount loop simply doesn't execute, but DB + config copy still runs
 - **Drive-type-aware validation** (`ValidateDestination`):
@@ -440,16 +441,17 @@ appear in the restore dropdown with per-app snapshot filtering.
 - Config only: "Csak konfiguracio visszaallitasa"

 **Tier 1 restore** (`RestoreApp`):
-- Stop app → resolve app's home drive → `restic restore <id> --target / --include <path>...` → populate Docker volumes from restored tars → restart app
+- Stop app → resolve app's home drive → `restic restore <id> --target / --include <path>...` → populate Docker volumes from restored tars → restart app → health check
 - Restore paths: config dir, DB dump dir, volume dump dir, HDD mounts
 - Docker volumes restored via `restoreDockerVolumes()`: `docker volume rm -f` → `docker volume create` → `docker run alpine tar xf`

 **Tier 2 restore** (`RestoreAppFromTier2`):
-- Stop app → rsync config from `_config/` → rsync HDD data (single/multi-mount) → copy DB dumps from `_db/` → restore Docker volumes from `_volumes/` tars → restart app
+- Stop app → rsync config from `_config/` → rsync HDD data (single/multi-mount) → copy DB dumps from `_db/` (streaming `copyFile`) → restore Docker volumes from `_volumes/` tars → restart app → health check
 - Uses rsync `--delete` for config and HDD data to ensure exact mirror state
 - Single-mount apps: data directly in rsync dir (excluding `_*`); multi-mount: per-leaf subdirectories

 **Common:**
+- **v0.34.0:** Post-restore health check (`waitForHealthy`) polls container state with `docker ps` refresh every 5s for up to 90s. Warning logged if app doesn't reach running state; restore still returns success (data is restored regardless).
 - Running flag prevents concurrent backup/restore operations
 - Snapshot ID validated (8-64 lowercase hex, or special `tier2-rsync`)
 - Import from `.fab` bundle link shown in restore section for cross-system migration
@@ -970,7 +972,7 @@ After each backup cycle (including manual Tier 2 triggers via `OnCrossDriveCompl
 - `controller.yaml` (base64-encoded, full config including secrets)
 - `settings.json` (base64-encoded, backup prefs, storage paths, cross-drive configs)
 - Disk layout (UUIDs, labels, mount points, fstab options, bind-mount topology)
-- Deployed stacks manifest (app names, HDD paths)
+- Deployed stacks manifest (app names, HDD paths) with actual config files: `docker-compose.yml`, `app.yaml`, `.felhom.yml` (base64-encoded per stack, v0.34.0)
 - Restic passwords (primary + cross-drive, base64-encoded)

 This enables fully automated recovery when the system drive is replaced — the new controller pulls the snapshot from the Hub, auto-mounts surviving drives by UUID, and restores all applications.
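To make the per-stack manifest entry concrete, here is a hedged sketch of how config files might be embedded as base64 strings and decoded again on recovery. The `StackEntry` type, its JSON field names, and both helper functions are hypothetical; the actual Hub snapshot schema is not shown in this diff.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// StackEntry is a hypothetical manifest record for one deployed stack.
// Files maps a file name to its base64-encoded content.
type StackEntry struct {
	Name    string            `json:"name"`
	HDDPath string            `json:"hdd_path,omitempty"`
	Files   map[string]string `json:"files"`
}

// encodeStack base64-encodes each config file into a manifest entry.
func encodeStack(name, hddPath string, files map[string][]byte) StackEntry {
	enc := make(map[string]string, len(files))
	for fn, data := range files {
		enc[fn] = base64.StdEncoding.EncodeToString(data)
	}
	return StackEntry{Name: name, HDDPath: hddPath, Files: enc}
}

// decodeFiles reverses encodeStack so the files can be written back to disk.
func decodeFiles(e StackEntry) (map[string][]byte, error) {
	out := make(map[string][]byte, len(e.Files))
	for fn, s := range e.Files {
		data, err := base64.StdEncoding.DecodeString(s)
		if err != nil {
			return nil, fmt.Errorf("decode %s: %w", fn, err)
		}
		out[fn] = data
	}
	return out, nil
}

func main() {
	entry := encodeStack("nextcloud", "/mnt/hdd1/nextcloud", map[string][]byte{
		"docker-compose.yml": []byte("services: {}\n"),
		"app.yaml":           []byte("name: nextcloud\n"),
	})
	out, err := json.MarshalIndent(entry, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```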