Post-deploy fixes (v0.12.7a)

2026-02-18 11:03:56 +01:00
parent 6c1762141a
commit e4433f07b4
6 changed files with 152 additions and 752 deletions
+88 -43
@@ -164,76 +164,121 @@ The `/apps/{slug}` page renders hero section, screenshots, setup guide, and opti
### 2. Backup System
The backup system implements a **3-2-1 backup architecture**:
1. **Nightly restic (mandatory, same drive)** — DB dumps + config + ALL user data (HDD). Every app with data is backed up automatically. No toggles.
2. **Cross-drive backup (opt-in, different device)** — rsync or restic to a secondary physical drive. Protects against drive failure.
3. **Remote backup (future)** — offsite copy for disaster recovery.
#### Layer 1: Database Dumps (`internal/backup/dbdump.go`)
| Rule | What | Where | Status |
|------|------|-------|--------|
| **1. Nightly backup** | DB dumps + config + ALL user data | Same drive as app | Mandatory, automatic |
| **2. Cross-drive backup** | User data copy to secondary drive | Different physical device | Opt-in per app |
| **3. Remote backup** | Offsite copy for disaster recovery | Cloud / remote server | Future |
**Key principle:** User data backup is **mandatory** — every app with HDD bind mounts
is included in the nightly restic snapshot automatically. There is no per-app toggle.
The `AppBackupPrefs.Enabled` field in settings.json is legacy and not read by any code.
#### Rule 1: Nightly Backup (mandatory, same drive)
The nightly backup has two phases that run sequentially:
**Phase 1 — Database Dumps** (`internal/backup/dbdump.go`, scheduled 02:30)
- **Auto-discovery** of PostgreSQL and MariaDB containers via `docker ps` + `docker inspect`
- Dumps via `docker exec pg_dump` / `docker exec mariadb-dump` with 5-minute timeout
- Atomic writes (`.tmp` → `.sql`) to prevent corruption
- **Validation** after each dump: checks file size, header presence, counts `CREATE TABLE` statements
- **Validation** after each dump: checks file size, header presence, counts `CREATE TABLE`
- Results cached in `settings.json`, so they survive container restarts
- Scheduled nightly at 02:30
- Also triggered per-app by cross-drive backup before each run (`DumpStackDB`)
#### Layer 2: Restic Snapshots (`internal/backup/restic.go`)
**Phase 2 — Restic Snapshot** (`internal/backup/restic.go`, scheduled 03:00)
- Auto-generated repository password (32 random bytes, base64url)
- Password synced to hub for disaster recovery
- Backs up: stacks dir + DB dump dir + **ALL deployed apps' HDD mount paths** (mandatory, no opt-in)
- `resolveAppBackupPaths()` iterates all deployed stacks via `ListDeployedStacks()` — no `Enabled` flag
- Auto-detects and unlocks stale locks
- Auto-generated repository password (32 random bytes, base64url), synced to hub
- **Paths included in every snapshot:**
- Stacks dir (all compose.yml + app.yaml + .felhom.yml)
- DB dump dir (all `.sql` dump files from Phase 1)
- `controller.yaml` (controller config)
- **ALL deployed apps' HDD mount paths** — discovered via `resolveAppBackupPaths()` which iterates `ListDeployedStacks()`, no `Enabled` flag
- Auto-detects and unlocks stale locks (restic repo lock)
- Weekly prune on Sundays with configurable retention (keep-daily, keep-weekly, keep-monthly)
- Weekly integrity check (`restic check`) on Sunday 04:00
- Scheduled nightly at 03:00 (runs after DB dumps complete)
#### Layer 3: Cross-Drive Backup (`internal/backup/crossdrive.go`)
**What this protects against:** accidental deletion, data corruption, point-in-time rollback.
Does NOT protect against drive failure (backup is on the same physical drive).
Implements the 3-2-1 backup rule by copying data to a different physical drive.
#### Rule 2: Cross-Drive Backup (opt-in, different device) (`internal/backup/crossdrive.go`)
Copies user data to a **different physical drive**, providing the second copy for 3-2-1.
- **Two methods:**
- **rsync** — Simple mirror with `--delete` (fast, no versioning)
- **restic** — Versioned, deduplicated, encrypted (shared repo across apps)
- Per-app configuration: destination path, method, schedule (daily/weekly/manual)
- **Pre-backup DB dump**: `DumpStackDB()` runs before cross-drive backup to ensure DB consistency; non-fatal on failure
- **Drive-type-aware validation** (`ValidateDestination` / `CheckBackupDestination`):
- External mount: block if <100 MB free; warn/block at 90%/95% usage
- System drive (same block device as `/`): require ≥10 GB free AND <90% usage; allowed with logged warning
- **Rsync destination layout** (`runRsyncBackup`):
- Single mount: data goes directly into `backups/rsync/<app>/` (no extra nesting)
- Multiple mounts: each gets `backups/rsync/<app>/<leaf>/` subfolder; duplicate leaf names get `_N` suffix
- DB dump files excluded: `--exclude backups/*.sql.gz/sql/dump` — avoids duplicating pg_dump data
- **restic** — Versioned, deduplicated, encrypted (shared repo across apps, auto-generated password)
- Per-app configuration in settings.json: destination path, method, schedule (daily/weekly/manual)
- **Pre-backup DB dump:** `DumpStackDB()` runs fresh pg_dump/mariadb-dump before each cross-drive backup to ensure DB state matches user data; non-fatal on failure (wired via `DBDumper` interface to avoid circular imports)
- **Drive-type-aware validation** (`ValidateDestination`):
| Destination type | Space checks |
|-----------------|--------------|
| External mount (different device than `/`) | Block if <100 MB free |
| System drive (same device as `/`) | Require ≥10 GB free AND <90% used; allowed with a logged warning |
- **Rsync destination layout:**
- Single mount: `backups/rsync/<app>/` (flat, no extra nesting)
- Multiple mounts: `backups/rsync/<app>/<leaf>/` per mount; duplicate leaf names get `_N` suffix
- DB dump files excluded (`--exclude backups/*.sql.gz/sql/dump`) — already handled by pg_dump
- Safety guards: destination ≠ source, path-overlap check, writable check
- **Chained execution**: cross-drive runs immediately after nightly restic backup (daily apps every night, weekly apps on Sundays)
- **Chained execution:** runs immediately after the nightly restic backup (daily apps every night, weekly apps on Sundays)
- Per-app concurrency lock prevents overlapping runs
- Status tracking (last_run, duration, size, error) persisted to settings.json
- Status (last_run, duration, size, error) persisted to settings.json
**What this protects against:** primary drive failure, drive theft/damage.
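The drive-type-aware space checks for cross-drive destinations can be expressed as a pure function over disk statistics. The sketch below only encodes the thresholds stated in this section; the name `checkDestinationSpace`, its signature, and the reading of MB/GB as binary units are assumptions, and it is not the `ValidateDestination` code.

```go
package main

import "errors"

const (
	minFreeExternal = 100 << 20 // 100 MB, read as MiB here (assumption)
	minFreeSystem   = 10 << 30  // 10 GB, read as GiB here (assumption)
)

// checkDestinationSpace (hypothetical name) returns an error when the destination
// must be blocked, and warn=true when it is allowed but should be logged.
func checkDestinationSpace(freeBytes, totalBytes uint64, systemDrive bool) (warn bool, err error) {
	if totalBytes == 0 || freeBytes > totalBytes {
		return false, errors.New("invalid disk statistics")
	}
	if !systemDrive {
		// External mount: only the hard minimum applies.
		if freeBytes < minFreeExternal {
			return false, errors.New("less than 100 MB free on destination")
		}
		return false, nil
	}
	// System drive: both conditions must hold, and the backup is always logged.
	usedPct := float64(totalBytes-freeBytes) / float64(totalBytes) * 100
	if freeBytes < minFreeSystem || usedPct >= 90 {
		return false, errors.New("system drive needs at least 10 GB free and under 90% usage")
	}
	return true, nil
}
```

Keeping the thresholds separate from the code that reads disk statistics makes the rule easy to unit-test without touching a real filesystem.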
#### Rule 3: Remote Backup (future)
Offsite backup for disaster recovery. Not yet implemented.
#### Restore (`internal/backup/restore.go`)
All deployed apps appear in the restore dropdown — not just those with HDD data.
All deployed apps appear in the restore dropdown — every app has restic snapshot data
(stacks dir + DB dumps are always backed up).
| App type | DB restored | Config restored | User data restored |
|----------|------------|-----------------|-------------------|
| Has HDD data | ✓ | ✓ | ✓ (always — mandatory) |
| App type | Config restored | DB restored | User data restored |
|----------|----------------|------------|-------------------|
| Has HDD data | ✓ | ✓ | ✓ (always — backup is mandatory) |
| DB only, no HDD | ✓ | ✓ | n/a |
| No DB, no HDD | n/a | ✓ | n/a |
| No DB, no HDD | ✓ | n/a | n/a |
- Restore type info shown in UI when app selected (Hungarian banner: full / config+DB / config only)
- Snapshot API: apps without HDD mounts return all snapshots (all contain stacks dir + DB dumps)
- **Auto stop/restart**: stops app before `restic restore`, restarts after (even on failure)
- **Snapshot API** returns ALL snapshots unfiltered — older snapshots (pre-mandatory HDD backup) still allow config+DB restore; `RestoreApp` extracts whatever paths are available
- **Restore type info** shown per-app when selected in dropdown (Hungarian banners):
- Has HDD: "Teljes visszaállítás: adatbázis + konfiguráció + felhasználói adatok"
- Has DB, no HDD: "Adatbázis és konfiguráció visszaállítása"
- No DB, no HDD: "Csak konfiguráció visszaállítása"
- **Execution flow:** stop app → `restic restore <id> --target / --include <path>...` → restart app
- Running flag prevents concurrent backup/restore operations
- Snapshot ID validated (8 to 64 lowercase hex characters)
#### Backup Page UI
#### Backup Page UI (`internal/web/templates/backups.html`)
The backups page shows a unified per-app status table:
- **Status dot**: green (fully covered), yellow (warning — failed run, system drive, disk full), red (HDD data without cross-drive), auto (no user data)
- Expandable row per app showing all 3 backup layers (DB, Konfiguráció, user data)
- Schedule overview with next run times
- Snapshot history table (last 20 snapshots with ID, time, data added)
Unified per-app status table with expandable rows showing 3 backup layers per app:
**Status dot per app:**
| Dot color | Meaning |
|-----------|---------|
| Green | Fully covered — cross-drive configured and last run OK |
| Yellow | Warning — no second copy, or last backup failed, or disk space issue |
| Red | Cross-drive destination blocked or inaccessible |
| Gray (auto) | No user data — only config/DB backup (automatic) |
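One way to read the status-dot table is as a precedence-ordered decision. The sketch below is an assumption-heavy illustration: the `AppBackupState` fields, the `statusDot` name, and the order in which the conditions are checked are all hypothetical and not taken from the template or its handler.

```go
package main

// DotColor is the status indicator shown per app.
type DotColor string

const (
	DotGreen  DotColor = "green"
	DotYellow DotColor = "yellow"
	DotRed    DotColor = "red"
	DotGray   DotColor = "gray"
)

// AppBackupState is a hypothetical summary of what the page knows about one app.
type AppBackupState struct {
	HasUserData          bool
	CrossDriveConfigured bool
	DestinationBlocked   bool // destination blocked or inaccessible
	LastRunFailed        bool
	DiskSpaceWarning     bool
}

// statusDot applies the table rows in an assumed order of precedence:
// gray, then red, then yellow, then green.
func statusDot(s AppBackupState) DotColor {
	switch {
	case !s.HasUserData:
		return DotGray
	case s.CrossDriveConfigured && s.DestinationBlocked:
		return DotRed
	case !s.CrossDriveConfigured || s.LastRunFailed || s.DiskSpaceWarning:
		return DotYellow
	default:
		return DotGreen
	}
}
```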
**Three backup layers per app row:**
1. **Adatbázis mentés** — Auto badge + last run timestamp + status
2. **Konfiguráció** — Auto badge + last restic snapshot timestamp + status
3. **Felhasználói adatok** — one of:
- Cross-drive configured: method + destination + schedule + last run + status + "Futtatás most" button
- HDD data, no cross-drive: "✓ Helyi mentés auto" (green) + "⚠ Nincs 2. másolat" (yellow) + settings link
- No HDD data: "— (nincs HDD adat)" (muted)
**Other sections:**
- Schedule overview with next run times for DB dump, restic, prune
- Snapshot history table (last 20 snapshots with ID, time, files new/changed, data added)
- Repository info card (path, size, snapshot count, encryption key with show/copy)
- Restore section with app/snapshot dropdowns and confirmation flow
- Restore section: app dropdown → snapshot dropdown → restore type info → confirmation checkbox → execute
---