v0.22.0: First-run setup wizard, local infra backup, hub verification
New controller features:
- Web-based setup wizard replaces docker-setup.sh interactive config
- Dual listener: :8080 (Traefik) + :8081 (direct HTTP for LAN)
- Drive scanner finds .felhom-infra-backup/ on all block devices
- Hub recovery pull (GET /api/v1/recovery/{id}) with retrieval password
- Fresh install: Hub config download or manual wizard
- CSRF protection, state persistence, Hungarian UI
- Local infra backup written to all connected drives after each backup cycle
- .felhom-infra-backup/backup.json + metadata.json with SHA256 checksum
- Hub verification: parse customer_blocked from report push response
- Limited mode after 7 days without verification
- Recovery info page on Settings + recovery-info.txt file generation
- Pending events queue: DR events sent to Hub on next report push
- docker-setup.sh v6.0.0: removed interactive wizard, minimal controller.yaml only
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
+103
-21
@@ -4,7 +4,7 @@
 
 A single, lightweight Go container that replaces Portainer + scattered systemd scripts with a unified, Hungarian-language web dashboard for managing Docker Compose stacks, backups, storage, monitoring, and notifications on customer hardware.
 
-**Current version: v0.21.0**
+**Current version: v0.22.0**
 
 ---
 
@@ -20,6 +20,8 @@ A single, lightweight Go container that replaces Portainer + scattered systemd s
 - [Update Management](#6-update-management)
 - [Authentication & Settings](#7-authentication--settings)
 - [Central Hub](#8-central-hub-reporting)
+- [Setup Wizard](#9-first-run-setup-wizard)
+- [Disaster Recovery](#10-disaster-recovery)
 - [Repository Layout](#repository-layout)
 - [Configuration](#configuration)
 - [REST API](#rest-api)
@@ -812,28 +814,95 @@ The hub service (separate Go app in the `felhom.eu` repo) provides:
 - Color coding: green (<30min), yellow (30-60min), red (>60min since last report)
 - 90-day report + event retention with daily prune at 04:30 Budapest time
 
-### 9. Disaster Recovery
+### 9. First-Run Setup Wizard
 
-When a system drive fails and is replaced, the controller can automatically restore the full deployment:
+When the controller starts with no valid customer configuration (`customer.id` empty or `"demo-felhom"`), it enters **setup mode** — a web-based wizard that handles all initial configuration. This replaces the old interactive shell wizard in `docker-setup.sh`.
+
+#### Setup Mode Detection (`internal/setup/setup.go`)
+
+`NeedsSetup(cfg)` returns true when `customer.id` is empty or `"demo-felhom"`. In setup mode, the controller skips normal startup (no scheduler, no backup, no stacks) and serves only the wizard UI on two listeners:
+- `:8080` — behind Traefik (accessible via domain, e.g. `https://felhom.example.com`)
+- `:8081` — direct HTTP (accessible via LAN IP, e.g. `http://192.168.0.100:8081`)
+
+#### Wizard Flow
 
 ```
-1. docker-setup.sh deploys fresh controller (Hub enabled, customer_id configured)
-2. Controller detects empty data dir → fresh deployment
-3. Controller pulls infra backup from Hub → gets disk layout, passwords, configs
-4. Controller scans block devices for UUIDs matching stored disk layout
-5. Controller mounts surviving drives (e.g., HDD with backups)
-6. Controller scans mounted drives for local backup data (_infra/ + rsync copies)
-7. Controller auto-restores stack configs → apps appear in dashboard
-8. User opens dashboard → "Visszaállítás" (Restore) wizard
-9. User confirms → sequential restore: rsync first, restic fallback, DB import
-10. Apps restored and running
+┌──────────────────────────────────┐
+│  1. Welcome                      │
+│  Choose: Restore / Fresh install │
+└─────────┬───────────┬────────────┘
+          │           │
+    ┌─────▼─────┐  ┌──▼───────────────┐
+    │ 2a. Scan  │  │ 2b. Hub download │
+    │ drives for│  │ (customer ID +   │
+    │ local     │  │ password)        │
+    │ backups   │  │                  │
+    └─────┬─────┘  └──────┬───────────┘
+          │               │
+    ┌─────▼─────┐         │
+    │ 2a.2 Hub  │         │
+    │ recovery  │         │
+    │ (fallback)│         │
+    └─────┬─────┘         │
+          │               │
+    ┌─────▼─────┐  ┌──────▼───────────┐
+    │ Execute   │  │ Execute fresh    │
+    │ restore   │  │ install          │
+    └─────┬─────┘  └──────┬───────────┘
+          │               │
+          └───────┬───────┘
+                  ▼
+     os.Exit(0) → Docker restarts
+                → normal mode
 ```
 
+#### Key Components
+
+| File | Purpose |
+|------|---------|
+| `setup/setup.go` | `NeedsSetup()` detection, `SetupState` persistence to `setup-state.json` |
+| `setup/handlers.go` | HTTP handlers for each wizard step (welcome, scan, hub-restore, fresh, manual) |
+| `setup/scanner.go` | Scans all block devices for `.felhom-infra-backup/` directories via `lsblk` + temp mounts |
+| `setup/hub.go` | Hub recovery pull (`GET /api/v1/recovery/{id}`) and config download |
+| `setup/csrf.go` | Lightweight CSRF protection (cookie + hidden field, `SameSite=Strict`) |
+| `setup/network.go` | Detects local IPs for LAN access URL display |
+| `setup/templates/` | 7 embedded HTML templates (Hungarian, dark theme matching main UI) |
+
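The "cookie + hidden field, `SameSite=Strict`" approach listed for `setup/csrf.go` is commonly implemented as a double-submit check. The following Go sketch shows that general technique; it is not taken from the repository, and the `csrf_token` name and all function names are assumptions made for illustration.

```go
package main

import (
	"crypto/rand"
	"crypto/subtle"
	"encoding/hex"
	"net/http"
)

// csrfName is a hypothetical name shared by the cookie and the form field.
const csrfName = "csrf_token"

// newCSRFToken returns 32 random bytes, hex-encoded (64 characters).
func newCSRFToken() (string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

// setCSRFCookie stores the token in a SameSite=Strict cookie. The template is
// expected to render the same value into a hidden <input> of every form.
func setCSRFCookie(w http.ResponseWriter, token string) {
	http.SetCookie(w, &http.Cookie{
		Name:     csrfName,
		Value:    token,
		Path:     "/",
		HttpOnly: true,
		SameSite: http.SameSiteStrictMode,
	})
}

// tokensMatch compares the cookie and form values in constant time and
// rejects empty values outright.
func tokensMatch(cookieVal, formVal string) bool {
	if cookieVal == "" || formVal == "" {
		return false
	}
	return subtle.ConstantTimeCompare([]byte(cookieVal), []byte(formVal)) == 1
}

// validCSRF checks a submitted form: the hidden field must equal the cookie.
func validCSRF(r *http.Request) bool {
	c, err := r.Cookie(csrfName)
	if err != nil {
		return false
	}
	return tokensMatch(c.Value, r.FormValue(csrfName))
}
```

A wizard step handler would call `validCSRF(r)` at the top of every POST and respond with 403 when it returns false. The `Secure` cookie flag is left unset here because the `:8081` listener is plain HTTP.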
+#### Local Infra Backup (`internal/backup/local_infra.go`)
+
+The controller writes infrastructure snapshots to **every connected drive** after each backup cycle and on startup. Location: `<drive>/.felhom-infra-backup/`. Files:
+- `backup.json` — full infra backup (config, settings, disk layout, passwords, stacks)
+- `metadata.json` — schema version, timestamp, customer ID, controller version, SHA256 checksum
+
+During setup wizard drive scan, these backups are discovered, integrity-verified, and offered for one-click restore.
+
+#### Recovery Info (`internal/recovery/info.go`)
+
+Generates `recovery-info.txt` on the system data partition with customer ID, Hub URL, retrieval password, and recovery instructions in Hungarian. Updated on startup and after config changes. Also displayed on the Settings page in a "Vészhelyzeti információk" (Emergency information) section.
+
+### 10. Disaster Recovery
+
+When a system drive fails and is replaced, the recovery flow uses the setup wizard:
+
+```
+1. docker-setup.sh deploys fresh controller with minimal config (domain + paths only)
+2. Controller detects empty customer.id → enters setup mode
+3. User opens wizard at http://<LAN-IP>:8081
+4. Wizard scans all drives for .felhom-infra-backup/ directories
+5. If found: one-click restore (config, settings, passwords, disk layout)
+6. If not found: Hub recovery via customer ID + retrieval password
+7. Controller restarts into normal mode with full config
+8. Controller auto-mounts surviving drives by UUID from disk layout
+9. Dashboard shows "Visszaállítás" (Restore) page for app-level recovery
+10. User confirms → sequential restore: rsync first, restic fallback, DB import
+```
+
 **Backup sources (priority order):**
-1. **Rsync copies** (cross-drive, plain files, no password needed) — fastest, most reliable
-2. **Restic snapshots** (encrypted, needs password from Hub) — comprehensive but slower
+1. **Local infra backup** (`.felhom-infra-backup/` on surviving drives) — fastest, no network needed
+2. **Hub recovery endpoint** (`GET /api/v1/recovery/{id}`) — requires retrieval password
+3. **Manual config** (wizard form) — enter all details manually as last resort
 
-**Fallback:** If the Hub is unreachable, the controller can still detect backups on already-mounted drives (manual mount or pre-existing fstab entries).
+**Hub verification:** After setup, the controller periodically verifies customer standing via the Hub report push response (`customer_blocked` field). If blocked or Hub unreachable for >7 days, the controller enters limited mode (no new deployments).
 
 ---
 
@@ -841,7 +910,7 @@ When a system drive fails and is replaced, the controller can automatically rest
 
 ```
 controller/
-├── cmd/controller/main.go  # Entry point, wires all 14 modules
+├── cmd/controller/main.go  # Entry point, wires all 15 modules (setup mode branch + normal startup)
 ├── internal/
 │   ├── config/config.go  # YAML loader, validation, env overrides
 │   ├── settings/settings.go  # Runtime settings (JSON, atomic writes, RWMutex)
@@ -860,7 +929,8 @@ controller/
 │   │   └── *_other.go  # Non-Linux stubs for cross-compilation
 │   ├── backup/
 │   │   ├── backup.go  # Orchestrator (per-drive dumps + restic + cross-drive chain)
-│   │   ├── paths.go  # Per-drive path helpers (PrimaryResticRepoPath, AppDBDumpPath, etc.)
+│   │   ├── paths.go  # Per-drive path helpers (PrimaryResticRepoPath, InfraBackupDir, etc.)
+│   │   ├── local_infra.go  # Local infra backup to all drives (.felhom-infra-backup/)
 │   │   ├── dbdump.go  # DB auto-discovery + dump (pg_dump, mariadb-dump)
 │   │   ├── restic.go  # Restic operations (init, snapshot, prune, check) — repoPath as param
 │   │   ├── appdata.go  # StackDataProvider interface, app data discovery
@@ -890,8 +960,16 @@ controller/
 │   ├── notify/notifier.go  # Email relay to hub, preference sync, cooldowns
 │   ├── report/
 │   │   ├── builder.go  # Hub report builder (all subsystems → JSON)
-│   │   ├── pusher.go  # HTTP POST to hub (retry, Bearer auth)
-│   │   └── infra_pull.go  # DR: pull infra backup from Hub for fresh deployment
+│   │   ├── pusher.go  # HTTP POST to hub (retry, Bearer auth, parses customer_blocked)
+│   │   └── infra_pull.go  # DR: pull recovery/config from Hub (retrieval password auth)
+│   ├── setup/  # First-run setup wizard (web-based, replaces docker-setup.sh wizard)
+│   │   ├── setup.go  # NeedsSetup() detection, state persistence
+│   │   ├── handlers.go  # HTTP handlers for all wizard steps
+│   │   ├── scanner.go  # Drive scanner for local infra backups
+│   │   ├── csrf.go  # Lightweight CSRF (cookie + hidden field)
+│   │   ├── network.go  # Local IP detection for LAN access URLs
+│   │   └── templates/  # 7 wizard HTML templates (Hungarian)
+│   ├── recovery/info.go  # Recovery info file generator (recovery-info.txt)
 │   └── web/
 │       ├── server.go  # HTTP server, routing, static files
 │       ├── auth.go  # Session auth, login/logout, session cleanup
@@ -953,6 +1031,10 @@ monitoring:
 backup: "uuid-here"
 backup_integrity: "uuid-here"
 
+web:
+  listen: ":8080"
+  setup_listen: ":8081" # Plain HTTP for setup wizard LAN access
+
 hub:
   enabled: true
   url: "https://hub.felhom.eu"
@@ -966,7 +1048,7 @@ Environment variable overrides: `FELHOM_LOGGING_LEVEL=debug`, `FELHOM_HUB_ENABLE
 
 ### Runtime settings (`settings.json`)
 
-Auto-managed by the controller. Contains password hash overrides, notification preferences, per-app backup configs, storage path registry, DB validation cache. All writes are atomic.
+Auto-managed by the controller. Contains password hash overrides, notification preferences, per-app backup configs, storage path registry, DB validation cache, Hub verification state (`hub_verified`, `hub_verified_at`), retrieval password for disaster recovery, and pending event queue. All writes are atomic (write `.tmp`, rename).
 
 ### Per-app config (`app.yaml`)
 