From 0e13d42ccdc1a76ce2d94686cc026a725ad5293d Mon Sep 17 00:00:00 2001
From: kisfenyo
Date: Wed, 18 Feb 2026 18:53:15 +0100
Subject: [PATCH] Update README.md for v0.14.0 architecture

- Per-drive backup architecture (restic repos, DB dumps, path helpers)
- Updated Tier 1/2 sections with new drive layout diagrams
- Updated controller.yaml example (system_data_path, no global paths)
- Updated repo layout (add paths.go), build examples, roadmap
- Removed all v0.12.x references

Co-Authored-By: Claude Opus 4.6
---
 controller/README.md | 102 +++++++++++++++++++++++++++----------------
 1 file changed, 65 insertions(+), 37 deletions(-)

diff --git a/controller/README.md b/controller/README.md
index d2de07b..e5e10a0 100644
--- a/controller/README.md
+++ b/controller/README.md
@@ -4,7 +4,7 @@
 A single, lightweight Go container that replaces Portainer + scattered systemd scripts with a unified, Hungarian-language web dashboard for managing Docker Compose stacks, backups, storage, monitoring, and notifications on customer hardware.

-**Current version: v0.12.9**
+**Current version: v0.14.0**

 ---

@@ -69,7 +69,7 @@ A single, lightweight Go container that replaces Portainer + scattered systemd s
 - **Pure Go, no frameworks** — stdlib `net/http` + `html/template`. Only external deps: `bcrypt`, `yaml.v3`, `modernc.org/sqlite` (pure Go, no CGO).
 - **Privileged container** — Required for disk operations (format, mount, fstab), `/dev` access, and Docker socket control.
 - **`/host-dev` indirection** — Docker overrides `/dev` with a tmpfs. The host's `/dev` is mounted at `/host-dev` to access block devices.
-- **`StackDataProvider` interface** — Breaks circular import between backup and stacks packages. Implemented by `stackAdapter` in `main.go`.
+- **`StackDataProvider` interface** — Breaks circular import between backup and stacks packages. Implemented by `stackAdapter` in `main.go`. Provides `GetStackHDDPath()` for per-drive backup routing.
 - **Atomic file writes** — All persistent state (`settings.json`, `app.yaml`) written to `.tmp` then `os.Rename` for crash safety.
 - **`go:embed` templates** — All HTML/CSS/JS compiled into the binary. No runtime file dependencies.
 - **Europe/Budapest timezone** — All scheduled jobs, timestamps, and UI labels use Hungarian timezone.
@@ -82,7 +82,7 @@ A single, lightweight Go container that replaces Portainer + scattered systemd s
 | **Settings** | `internal/settings/` | Runtime-mutable `settings.json` (passwords, backup prefs, storage paths, notifications) |
 | **Stacks** | `internal/stacks/` | Compose operations, scanning, `.felhom.yml` metadata, deploy/delete flow |
 | **Sync** | `internal/sync/` | Git-based app catalog sync (clone/pull, content-hash copy) |
-| **Backup** | `internal/backup/` | 3-layer backup: DB dumps → restic snapshots → cross-drive copies, restore |
+| **Backup** | `internal/backup/` | Per-drive 3-layer backup: DB dumps → restic snapshots → cross-drive copies, restore |
 | **Storage** | `internal/storage/` | Disk scanning (`lsblk`), partitioning (`sfdisk`), formatting (`mkfs.ext4`), mounting, data migration (`rsync`) |
 | **System** | `internal/system/` | System info (`/proc`), CPU collector, mount points, disk usage, FS info |
 | **Monitor** | `internal/monitor/` | Healthchecks.io pinger, system health checks |
@@ -192,27 +192,45 @@ self-sufficient backup** — any single tier can fully restore an app.

 #### Tier 1: Nightly Backup (mandatory, same drive)

-The nightly backup has two phases that run sequentially:
+The nightly backup has two phases that run sequentially. All paths are **per-drive** — each physical drive gets its own restic repo and per-app DB dump directories.
+
+**Drive layout (v0.14.0):**
+```
+/
+├── appdata// ← app user data
+└── backups/
+    └── primary/
+        ├── restic/ ← one restic repo per drive (all apps on this drive)
+        └── /db-dumps/ ← per-app DB dump files
+```
+
+Path computation is centralized in `backup/paths.go`:
+- `PrimaryResticRepoPath(drivePath)` → `/backups/primary/restic/`
+- `AppDBDumpPath(drivePath, stackName)` → `/backups/primary//db-dumps/`
+- `AppDataDir(drivePath, stackName)` → `/appdata//`

 **Phase 1 — Database Dumps** (`internal/backup/dbdump.go`, scheduled 02:30)

 - **Auto-discovery** of PostgreSQL and MariaDB containers via `docker ps` + `docker inspect`
 - Dumps via `docker exec pg_dump` / `docker exec mariadb-dump` with 5-minute timeout
+- Dumps are written to the app's **home drive**: `AppDBDumpPath(appDrive, stackName)`
 - Atomic writes (`.tmp` → `.sql`) to prevent corruption
 - **Validation** after each dump: checks file size, header presence, counts `CREATE TABLE`
 - Results cached in `settings.json` surviving container restarts

 **Phase 2 — Restic Snapshot** (`internal/backup/restic.go`, scheduled 03:00)

-- Auto-generated repository password (32 random bytes, base64url), synced to hub
-- **Paths included in every snapshot:**
-  - Stacks dir (all compose.yml + app.yaml + .felhom.yml)
-  - DB dump dir (all `.sql` dump files from Phase 1)
+- Apps are **grouped by drive** via `groupStacksByDrive()` — each drive's apps are backed up to that drive's restic repo
+- App drive resolution: `GetStackHDDPath()` (from `StackDataProvider`) → falls back to `SystemDataPath`
+- Auto-generated repository password (32 random bytes, base64url), shared across all repos, synced to hub
+- **Paths included in every per-drive snapshot:**
+  - Per-app DB dump dirs on that drive
+  - Per-app HDD mount paths (user data)
+  - Stacks dir (compose.yml + app.yaml + .felhom.yml for all apps)
   - `controller.yaml` (controller config)
-  - **ALL deployed apps' HDD mount paths** — discovered via `resolveAppBackupPaths()` which iterates `ListDeployedStacks()`, no `Enabled` flag
 - Auto-detects and unlocks stale locks (restic repo lock)
 - Weekly prune on Sundays with configurable retention (keep-daily, keep-weekly, keep-monthly)
-- Weekly integrity check (`restic check`) on Sunday 04:00
+- Weekly integrity check (`restic check`) on Sunday 04:00 — checks **all** primary repos

 **Protects against:** accidental deletion, data corruption, point-in-time rollback.
 Does NOT protect against drive failure (backup is on the same physical drive).
@@ -236,17 +254,19 @@ data back up config + DB + user data; apps without HDD back up config + DB dumps
 | External mount (different device than `/`) | Block if <100 MB free |
 | System drive (same device as `/`) | Require ≥10 GB free AND <90% used; logged warning |

-- **Rsync destination layout** (complete — can restore app independently):
+- **Secondary drive layout (v0.14.0):**
   ```
-  backups/rsync//
-    _db/ ← DB dump files (stackName_postgres.sql, etc.)
-    _config/ ← compose.yml, app.yaml, .felhom.yml
-    ← HDD mount contents (only for apps with HDD data)
+  /backups/secondary/
+  ├── /rsync/ ← per-app rsync mirror
+  │   ├── _db/ ← DB dump files
+  │   ├── _config/ ← compose.yml, app.yaml, .felhom.yml
+  │   └── ← HDD mount contents (if app has HDD data)
+  └── restic/ ← shared restic repo (all cross-drive apps)
   ```
-  - DB dump files excluded from user data rsync (`--exclude backups/*.sql.gz/sql/dump`) to avoid duplicating app-internal dumps
+  - DB dump files read from **per-app home drive** path (`AppDBDumpPath`)
   - `_` prefix directories prevent collision with user data
   - For non-HDD apps, only `_db/` and `_config/` are present (no user data directory)
-- **Restic backup paths:** includes HDD mounts (if any) + config dir + DB dump dir (deduplication handles overlap)
+- **Restic backup paths:** includes HDD mounts (if any) + config dir + per-app DB dump dir from home drive
 - Safety guards: destination ≠ source, path-overlap check (HDD mounts only), writable check
 - **Chained execution:** runs immediately after nightly restic — daily apps every night, weekly apps on Sundays
 - Per-app concurrency lock prevents overlapping runs
@@ -275,12 +295,14 @@ All deployed apps appear in the restore dropdown — every app has restic snapsh
   - Has HDD: "Teljes visszaállitas: adatbazis + konfiguracio + felhasznaloi adatok"
   - Has DB, no HDD: "Adatbazis es konfiguracio visszaallitasa"
   - No DB, no HDD: "Csak konfiguracio visszaallitasa"
-- **Execution flow:** stop app → `restic restore --target / --include ...` → restart app
+- **Execution flow:** stop app → resolve app's home drive → `restic restore --target / --include ...` from per-drive repo → restart app
+- Restic repo resolved via `PrimaryResticRepoPath(appDrivePath)`
+- DB dumps restored from `AppDBDumpPath(appDrivePath, stackName)`
 - Running flag prevents concurrent backup/restore operations
 - Snapshot ID validated (8-64 lowercase hex)

-**Note:** Restore currently uses Tier 1 (primary restic repo) only. Restoring from Tier 2
-(cross-drive) is a future enhancement.
+**Note:** Restore currently uses Tier 1 (primary restic repo on app's home drive) only.
+Restoring from Tier 2 (cross-drive) is a future enhancement.

 #### Backup Page UI (`internal/web/templates/backups.html`)

@@ -314,8 +336,8 @@ not just those with HDD data. Non-HDD apps can configure destination, method, an
 **Other sections:**

 - Schedule overview with next run times for DB dump, restic, prune
-- Snapshot history table (last 20 snapshots with ID, time, files new/changed, data added)
-- Repository info card (path, size, snapshot count, encryption key with show/copy)
+- Snapshot history table (last 20 snapshots aggregated from all per-drive repos, sorted by time)
+- Storage overview card (total size across repos, snapshot count, DB dump count/size, encryption key with show/copy)
 - Restore section: app dropdown → snapshot dropdown → restore type info → confirmation checkbox → execute

 ---
@@ -339,7 +361,7 @@ A step-by-step UI at `/settings/storage/init`:
 1. **Scan** — Lists available disks with model, size, partition info
 2. **Select** — User picks a disk and enters a mount name (e.g., `hdd_1`)
 3. **Confirm** — User types "FORMAZAS" to confirm destructive operation
-4. **Format pipeline**: `wipefs` → `sfdisk` (GPT) → `mkfs.ext4` → `blkid` UUID → backup fstab → append UUID-based fstab entry → mount → `findmnt` verification → `chown 1000:1000` → create `storage/` and `Dokumentumok/` subdirectories
+4. **Format pipeline**: `wipefs` → `sfdisk` (GPT) → `mkfs.ext4` → `blkid` UUID → backup fstab → append UUID-based fstab entry → mount → `findmnt` verification → `chown 1000:1000` → create `appdata/`, `backups/`, and `Dokumentumok/` subdirectories
 5. Auto-registers new storage path in settings.json
 6. Smart partition detection: skips repartitioning for existing empty partitions

@@ -373,7 +395,7 @@ Progress UI at `/stacks/{name}/migrate` with byte counter and percentage.
 After migration, the deploy page detects leftover data on previous storage paths:
 - Shows path, size, and a delete button
 - Two-step confirmation required
-- Protected paths (storage root, media, Dokumentumok, appdata) cannot be deleted
+- Protected paths (appdata, backups, media, Dokumentumok) cannot be deleted

 #### FileBrowser Mount Sync

@@ -573,12 +595,13 @@ controller/
 │   │   ├── migrate.go # App data migration (rsync with progress)
 │   │   └── *_other.go # Non-Linux stubs for cross-compilation
 │   ├── backup/
-│   │   ├── backup.go # Orchestrator (dumps + restic + cross-drive chain)
+│   │   ├── backup.go # Orchestrator (per-drive dumps + restic + cross-drive chain)
+│   │   ├── paths.go # Per-drive path helpers (PrimaryResticRepoPath, AppDBDumpPath, etc.)
 │   │   ├── dbdump.go # DB auto-discovery + dump (pg_dump, mariadb-dump)
-│   │   ├── restic.go # Restic operations (init, snapshot, prune, check)
+│   │   ├── restic.go # Restic operations (init, snapshot, prune, check) — repoPath as param
 │   │   ├── appdata.go # StackDataProvider interface, app data discovery
 │   │   ├── crossdrive.go # Per-app backup to secondary storage (rsync/restic)
-│   │   └── restore.go # Per-app restore with auto stop/restart
+│   │   └── restore.go # Per-app restore from per-drive repo
 │   ├── api/router.go # REST API endpoints (~30 routes)
 │   ├── scheduler/scheduler.go # Central job scheduler (Every, Daily)
 │   ├── system/
@@ -625,22 +648,26 @@ Key sections:
 ```yaml
 customer:
   name: "Demo Felhom"
-  node_id: "demo-felhom"
+  id: "demo-felhom"

 paths:
   stacks_dir: "/opt/docker/stacks"
   data_dir: "/opt/docker/felhom-controller/data"
-  db_dump_dir: "/srv/backups/db-dumps"
-  restic_repo: "/srv/backups/restic-repo"
+  system_data_path: "/mnt/sys_drive" # NVMe/system drive — fallback for apps without HDD

 git:
   repo_url: "https://gitea.dooplex.hu/admin/app-catalog-felhom.eu.git"
   sync_interval: "15m"

+# Per-drive backup paths are computed automatically:
+# /backups/primary/restic/ — restic repo per drive
+# /backups/primary//db-dumps/ — DB dumps per app
+# /backups/secondary/ — cross-drive rsync + restic
+
 backup:
   enabled: true
-  db_dump_time: "02:30"
-  restic_time: "03:00"
+  restic_password_file: "/opt/docker/felhom-controller/data/restic-password"
+  db_dump_schedule: "02:30"
+  restic_schedule: "03:00"
   retention: { keep_daily: 7, keep_weekly: 4, keep_monthly: 6 }

 monitoring:
@@ -756,7 +783,7 @@ Response format: `{"ok": true/false, "data": ..., "error": "...", "message": "..
 # On build server (192.168.0.180)
 cd ~/build/felhom-controller
 git -C ~/git/deploy-felhom-compose pull
-./build.sh v0.12.2 --push
+./build.sh v0.14.0 --push
 ```

 ### Deploy on customer node
@@ -764,8 +791,8 @@ git -C ~/git/deploy-felhom-compose pull
 ```bash
 # On customer node (e.g., 192.168.0.162)
 cd /opt/docker/felhom-controller
-sudo docker pull gitea.dooplex.hu/admin/felhom-controller:v0.12.2
-sudo sed -i 's|image: gitea.dooplex.hu/admin/felhom-controller:.*|image: gitea.dooplex.hu/admin/felhom-controller:v0.12.2|' docker-compose.yml
+sudo docker pull gitea.dooplex.hu/admin/felhom-controller:v0.14.0
+sudo sed -i 's|image: gitea.dooplex.hu/admin/felhom-controller:.*|image: gitea.dooplex.hu/admin/felhom-controller:v0.14.0|' docker-compose.yml
 sudo docker compose up -d
 ```

@@ -801,6 +828,8 @@ See `docker-compose.yml` for the full volume configuration.
 - [x] Email notifications via hub relay
 - [x] Settings persistence and password management
 - [x] Dashboard alert system
+- [x] Per-drive backup architecture (v0.14.0) — per-drive restic repos, per-app DB dumps, path helpers
+- [x] Cross-drive restic pruning (v0.14.0)

 ### In Progress / Planned

@@ -808,7 +837,6 @@ See `docker-compose.yml` for the full volume configuration.
 - [ ] Self-update mechanism with health-based rollback
 - [ ] Docker volume backup (`/var/lib/docker/volumes:ro`)
 - [ ] Raspberry Pi testing (pi-customer-1)
-- [ ] Cross-drive restic pruning (unbounded snapshot growth)
 - [ ] CSRF protection on POST endpoints
 - [ ] Login rate limiting

@@ -818,7 +846,7 @@ See `docker-compose.yml` for the full volume configuration.

 | Node | Hardware | Domain | Status |
 |------|----------|--------|--------|
-| demo-felhom | Acemagic GK3PLUS N100, 16G RAM, 512G SSD + 1TB HDD | demo-felhom.eu | Controller v0.12.2 running |
+| demo-felhom | Acemagic GK3PLUS N100, 16G RAM, 512G SSD + 1TB HDD | demo-felhom.eu | Controller v0.14.0 (pending OS reinstall) |
 | pi-customer-1 | Raspberry Pi 3B+, 1G RAM, 32G SD | pi-customer-1.local | Not yet tested |

 ## Related Repositories
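
Reviewer note: the per-drive path helpers and drive grouping described in this patch can be sketched in Go roughly as follows. This is a minimal illustration, not the code in `backup/paths.go`. The helper names come from the README text, but their bodies, the `hddPathResolver` interface, the `mapResolver` type, and the exact signature of `groupStacksByDrive` are assumptions. It also assumes that `GetStackHDDPath` returns the drive root for a stack, or an empty string when the app has no HDD, in which case the system data path is used as the fallback.

```go
package main

import (
	"path/filepath"
	"sort"
)

// PrimaryResticRepoPath returns the per-drive restic repo directory
// (assumed layout: <drivePath>/backups/primary/restic).
func PrimaryResticRepoPath(drivePath string) string {
	return filepath.Join(drivePath, "backups", "primary", "restic")
}

// AppDBDumpPath returns the per-app DB dump directory on the app's home drive
// (assumed layout: <drivePath>/backups/primary/<stackName>/db-dumps).
func AppDBDumpPath(drivePath, stackName string) string {
	return filepath.Join(drivePath, "backups", "primary", stackName, "db-dumps")
}

// AppDataDir returns the per-app user data directory
// (assumed layout: <drivePath>/appdata/<stackName>).
func AppDataDir(drivePath, stackName string) string {
	return filepath.Join(drivePath, "appdata", stackName)
}

// hddPathResolver is a hypothetical, narrowed view of StackDataProvider.
// The real method signature is not shown in the README.
type hddPathResolver interface {
	GetStackHDDPath(stackName string) string
}

// mapResolver is an in-memory resolver used only for illustration.
type mapResolver map[string]string

func (m mapResolver) GetStackHDDPath(stackName string) string { return m[stackName] }

// groupStacksByDrive maps each drive path to the stacks that live on it.
// Stacks without an HDD path fall back to systemDataPath.
func groupStacksByDrive(stacks []string, r hddPathResolver, systemDataPath string) map[string][]string {
	groups := make(map[string][]string)
	for _, name := range stacks {
		drive := r.GetStackHDDPath(name)
		if drive == "" {
			drive = systemDataPath // fallback for apps without HDD data
		}
		groups[drive] = append(groups[drive], name)
	}
	// Sort each group so backup order is deterministic.
	for _, names := range groups {
		sort.Strings(names)
	}
	return groups
}
```

With a resolver that places two illustrative stacks on `/mnt/hdd_1` and a system path of `/mnt/sys_drive`, each resulting group would be backed up to `PrimaryResticRepoPath(drive)` for its own drive, which matches the "one restic repo per drive" layout in the Tier 1 section. The stack names used to exercise the sketch are placeholders, not apps confirmed by the patch.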