feat: add controller self-update mechanism (v0.16.0)

New selfupdate package: version parsing, audit state file, updater with
Gitea registry V2 check, docker pull + compose rewrite + compose up flow.

- API: /api/selfupdate/{status,check,update} with session+bearer auth
- UI: Settings "Verzió és frissítés" card with check/install buttons + JS polling
- Scheduler: periodic check (6h default) + optional daily auto-update
- Notifications: success/failure on post-update startup verification
- Alert: info banner when update available
- docker-compose.yml: add directory bind mount for compose file access

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 17:33:40 +01:00
parent 1a58797dc8
commit c9a88afcef
14 changed files with 1074 additions and 22 deletions
+167 -8
@@ -4,7 +4,7 @@
A single, lightweight Go container that replaces Portainer + scattered systemd scripts with a unified, Hungarian-language web dashboard for managing Docker Compose stacks, backups, storage, monitoring, and notifications on customer hardware.
**Current version: v0.15.5**
**Current version: v0.16.0**
---
@@ -88,6 +88,7 @@ A single, lightweight Go container that replaces Portainer + scattered systemd s
| **Monitor** | `internal/monitor/` | Healthchecks.io pinger, system health checks |
| **Metrics** | `internal/metrics/` | SQLite time-series store, system + container metric collection |
| **Scheduler** | `internal/scheduler/` | Central job scheduler (periodic + daily, skip-if-running, panic recovery) |
| **SelfUpdate** | `internal/selfupdate/` | Version checking (registry), update trigger, state persistence, startup verification |
| **Notify** | `internal/notify/` | Email notifications via hub relay, preference sync, per-event cooldowns |
| **Report** | `internal/report/` | Hub report builder + HTTP pusher (system, stacks, backup, health) |
| **API** | `internal/api/` | REST JSON endpoints |
@@ -542,6 +543,134 @@ Notification preferences (email, enabled events, cooldown) are:
| `UPDATE_REQUIRED=true` | Mandatory — auto-applied during next update window |
| `UPDATE_SECURITY=true` | Critical — applied immediately |
#### Controller Self-Update (`internal/selfupdate/`)
The controller can update itself using a Watchtower-style pull-and-restart mechanism for a single container. This replaces the manual SSH-based `docker pull` + `sed` + `docker compose up -d` procedure with a one-click button on the Settings page or a scheduled auto-update.
##### How It Works
```
1. Check Gitea Docker Registry V2 API for new image tags
2. Compare highest semver tag with current Version (set at build time via ldflags)
3. If newer version exists → pull image → update compose file → docker compose up -d
4. Current container is replaced by Docker → new container starts with new version
5. On startup, new container reads update-state.json → marks update success/failure
```
##### Design Philosophy
- **No automatic rollback** — follows the Watchtower pattern (24k+ GitHub stars, no rollback). Docker's `restart: unless-stopped` policy is the crash safety net. Healthchecks.io detects when the controller goes down.
- **Audit state file** — `update-state.json` in the data volume records every update attempt (previous version, target version, initiator, result). Operators can SSH in and revert using `PreviousImage` from this file.
- **Backup-aware** — refuses to start an update while a backup is in progress (`backupRunning()` guard).
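The audit state file described above could be persisted roughly as follows. This is a minimal sketch, not the controller's actual `state.go`: the JSON keys, the `Initiator` values, and the file permissions are assumptions; only the `.tmp` + rename pattern and the field meanings come from the text.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// UpdateState is one audit record. The JSON keys are assumptions.
type UpdateState struct {
	PreviousVersion string `json:"previous_version"`
	PreviousImage   string `json:"previous_image"`
	TargetVersion   string `json:"target_version"`
	Initiator       string `json:"initiator"` // e.g. "ui", "api", "auto" (hypothetical values)
	Status          string `json:"status"`    // "pending", "success" or "failed"
}

// SaveState writes to a sibling .tmp file and renames it over the target,
// so readers never observe a half-written state file.
func SaveState(path string, st UpdateState) error {
	data, err := json.MarshalIndent(st, "", "  ")
	if err != nil {
		return err
	}
	tmp := path + ".tmp"
	if err := os.WriteFile(tmp, data, 0o600); err != nil {
		return err
	}
	return os.Rename(tmp, path)
}

// LoadState returns (nil, nil) when no state file exists.
func LoadState(path string) (*UpdateState, error) {
	data, err := os.ReadFile(path)
	if errors.Is(err, os.ErrNotExist) {
		return nil, nil
	}
	if err != nil {
		return nil, err
	}
	var st UpdateState
	if err := json.Unmarshal(data, &st); err != nil {
		return nil, err
	}
	return &st, nil
}

func main() {
	dir, err := os.MkdirTemp("", "selfupdate")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "update-state.json")

	st := UpdateState{PreviousVersion: "0.15.5", TargetVersion: "0.16.0", Initiator: "api", Status: "pending"}
	if err := SaveState(path, st); err != nil {
		panic(err)
	}
	loaded, err := LoadState(path)
	if err != nil {
		panic(err)
	}
	fmt.Println(loaded.TargetVersion, loaded.Status)
}
```

Because the rename happens within one directory, the swap is atomic on typical Linux filesystems; a reader sees either the old file or the new one.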
##### Package Structure
| File | Purpose |
|------|---------|
| `version.go` | `ParseVersion("X.Y.Z")` → `Version{Major,Minor,Patch}`, `Compare()` returns -1/0/1. Hand-rolled, no external deps. Rejects "dev" and "latest". |
| `state.go` | `UpdateState` struct persisted as JSON. `LoadState()`, `SaveState()` (atomic: `.tmp` + rename), `ClearState()`. Status values: `"pending"`, `"success"`, `"failed"`. |
| `updater.go` | Core `Updater` struct. Registry check via HTTP GET to `gitea.dooplex.hu/v2/admin/felhom-controller/tags/list` with Basic Auth (git username/token). Update trigger: `docker pull` → compose file regex replace → `docker compose up -d`. Thread-safe with `sync.Mutex`. |
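As an illustration of the `version.go` row, a hand-rolled parser and comparison might look like the sketch below. Accepting an optional leading `v` (as in the `v0.16.0` tags) is an assumption; the real `ParseVersion` may handle prefixes differently.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Version is a parsed X.Y.Z version.
type Version struct {
	Major, Minor, Patch int
}

// ParseVersion parses "X.Y.Z" (an optional "v" prefix is an assumption).
// Non-numeric strings such as "dev" or "latest" are rejected with an error.
func ParseVersion(s string) (Version, error) {
	parts := strings.Split(strings.TrimPrefix(s, "v"), ".")
	if len(parts) != 3 {
		return Version{}, fmt.Errorf("invalid version %q", s)
	}
	var nums [3]int
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return Version{}, fmt.Errorf("invalid version %q", s)
		}
		nums[i] = n
	}
	return Version{Major: nums[0], Minor: nums[1], Patch: nums[2]}, nil
}

// Compare returns -1, 0 or 1 when v is older than, equal to or newer than o.
func (v Version) Compare(o Version) int {
	a := [3]int{v.Major, v.Minor, v.Patch}
	b := [3]int{o.Major, o.Minor, o.Patch}
	for i := range a {
		if a[i] < b[i] {
			return -1
		}
		if a[i] > b[i] {
			return 1
		}
	}
	return 0
}

func main() {
	current, _ := ParseVersion("v0.15.5")
	latest, _ := ParseVersion("v0.16.0")
	fmt.Println("update available:", current.Compare(latest) < 0)

	_, err := ParseVersion("dev")
	fmt.Println("dev rejected:", err != nil)
}
```

Comparing numeric components rather than strings matters here: a plain string comparison would rank `0.9.0` above `0.16.0`.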
##### Update Trigger Flow
1. **Guard checks:** concurrent update lock, dev version check, backup running check, compose file accessible
2. Write `update-state.json` with status `"pending"` (audit trail)
3. `docker pull <image>:<targetVersion>`
4. Read compose file → replace image tag via regexp → atomic write (`.tmp` + rename)
5. `docker compose -f /opt/docker/felhom-controller/docker-compose.yml -p felhom-controller up -d`
6. Docker kills the current container, starts the new one
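Step 4 above (the regexp-based tag replacement) might be sketched like this. The function name `ReplaceImageTag` and the exact pattern are hypothetical; the controller's real expression is not shown in the source. The sketch only rewrites the `image:` line whose repository matches, so other services in the same file are left alone.

```go
package main

import (
	"fmt"
	"regexp"
)

// ReplaceImageTag rewrites the tag on the `image:` line of the given
// repository and leaves every other line untouched. It returns an error
// when no matching line exists, so the caller can abort before
// `docker compose up -d`.
func ReplaceImageTag(compose, image, tag string) (string, error) {
	re, err := regexp.Compile(`(?m)^([ \t]*image:[ \t]*["']?` + regexp.QuoteMeta(image) + `):[^\s"']+`)
	if err != nil {
		return "", err
	}
	if !re.MatchString(compose) {
		return "", fmt.Errorf("no image line for %q found", image)
	}
	out := re.ReplaceAllStringFunc(compose, func(m string) string {
		// Group 1 is everything up to, but not including, the old ":tag".
		return re.FindStringSubmatch(m)[1] + ":" + tag
	})
	return out, nil
}

func main() {
	compose := "services:\n" +
		"  controller:\n" +
		"    image: gitea.dooplex.hu/admin/felhom-controller:v0.15.5\n" +
		"    restart: unless-stopped\n"

	updated, err := ReplaceImageTag(compose, "gitea.dooplex.hu/admin/felhom-controller", "v0.16.0")
	if err != nil {
		panic(err)
	}
	fmt.Print(updated)
}
```

The result would then be written with the same `.tmp` + rename pattern the flow describes, so a failed write cannot leave a truncated compose file.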
##### Startup Verification
Called once from `main.go` before the scheduler starts:
1. Load `update-state.json` — if missing or status != `"pending"`, nothing to do
2. Compare running `Version` with `state.TargetVersion`
3. **Match** → mark `"success"`, notify via hub
4. **Mismatch** → mark `"failed"`, notify via hub
5. No rollback attempt — operator reverts manually if needed
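The verification steps above reduce to a small decision function. The following is a sketch under assumptions: `VerifyStartup` and the trimmed-down `UpdateState` are illustrative names, and normalizing a leading `v` before comparing is a guess about how versions are stored.

```go
package main

import (
	"fmt"
	"strings"
)

// UpdateState is reduced here to the two fields the check needs.
type UpdateState struct {
	TargetVersion string
	Status        string
}

// VerifyStartup resolves a pending update against the running version.
// It reports whether the state changed, i.e. whether the caller should
// persist the state and send a notification.
func VerifyStartup(st *UpdateState, running string) bool {
	if st == nil || st.Status != "pending" {
		return false // no state file, or already resolved: nothing to do
	}
	if strings.TrimPrefix(running, "v") == strings.TrimPrefix(st.TargetVersion, "v") {
		st.Status = "success"
	} else {
		st.Status = "failed"
	}
	return true
}

func main() {
	st := &UpdateState{TargetVersion: "v0.16.0", Status: "pending"}
	changed := VerifyStartup(st, "0.16.0")
	fmt.Println(changed, st.Status)
}
```

Keeping the decision free of I/O makes it easy to test; loading the file, saving it, and notifying the hub would wrap around this call in `main.go`.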
##### Auto-Update Scheduling
Checking and applying are split into two scheduler jobs, so only the apply job has to coordinate with backups:
| Job | Type | Default | Purpose |
|-----|------|---------|---------|
| `selfupdate-check` | `sched.Every` | 6h | Check registry, cache result (for UI). Never triggers update. |
| `selfupdate-auto` | `sched.Daily` | 04:30 | If auto-update enabled + update available + backup not running → trigger. |
The auto-update time (`config.SelfUpdate.AutoUpdateTime`, default `"04:30"`) is deliberately scheduled after the backup window (02:30 to roughly 04:00) to avoid collisions. The `backupRunning()` guard is the hard safety check: if a backup is still running at 04:30, the update is skipped and retried the next day.
An initial version check fires 30s after startup so the Settings page shows version info quickly.
##### Compose File Access
The controller needs write access to its own `docker-compose.yml`. This is achieved with nested Docker mounts: a read-write directory bind mount, plus two more specific mounts that shadow paths inside it:
```yaml
volumes:
# 1. Directory mount — gives access to compose file + .env
- /opt/docker/felhom-controller:/opt/docker/felhom-controller
# 2. Read-only override — prevents accidental config writes
- /opt/docker/felhom-controller/controller.yaml:/opt/docker/felhom-controller/controller.yaml:ro
# 3. Named volume override — persistent data in Docker-managed volume
- controller-data:/opt/docker/felhom-controller/data
```
##### API Endpoints
| Method | Path | Auth | Description |
|--------|------|------|-------------|
| GET | `/api/selfupdate/status` | Session or API key | Current status (cached, no network call) |
| POST | `/api/selfupdate/check` | Session or API key | Force registry check, return result |
| POST | `/api/selfupdate/update` | Session or API key | Trigger update (async, returns immediately) |
Self-update endpoints accept either session auth (for the UI) or the hub API key as a bearer token (for external triggering from build scripts or the hub). This enables the post-v0.16.0 deploy workflow:
```bash
# After building + pushing new image:
curl -s -X POST https://felhom.demo-felhom.eu/api/selfupdate/update \
-H "Authorization: Bearer <HUB_API_KEY>"
```
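On the server side, the "session or bearer" rule could be implemented as a small middleware. This is a hedged sketch: `sessionOrBearer`, `bearerMatches`, the `hasSession` callback, and the 202 response are all hypothetical stand-ins, not the controller's actual handlers.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// bearerMatches reports whether an Authorization header carries the expected
// API key. An empty key never matches, so a missing config cannot open access.
func bearerMatches(header, apiKey string) bool {
	const prefix = "Bearer "
	if apiKey == "" || !strings.HasPrefix(header, prefix) {
		return false
	}
	token := strings.TrimPrefix(header, prefix)
	// Constant-time comparison avoids leaking key contents through timing.
	return subtle.ConstantTimeCompare([]byte(token), []byte(apiKey)) == 1
}

// sessionOrBearer lets a request through when either check passes.
// hasSession stands in for the controller's real session lookup.
func sessionOrBearer(apiKey string, hasSession func(*http.Request) bool, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if hasSession(r) || bearerMatches(r.Header.Get("Authorization"), apiKey) {
			next.ServeHTTP(w, r)
			return
		}
		http.Error(w, "unauthorized", http.StatusUnauthorized)
	})
}

func main() {
	noSession := func(*http.Request) bool { return false }
	update := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusAccepted) // assumed status for an async trigger
	})
	h := sessionOrBearer("secret", noSession, update)

	req := httptest.NewRequest(http.MethodPost, "/api/selfupdate/update", nil)
	req.Header.Set("Authorization", "Bearer secret")
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	fmt.Println("status:", rec.Code)
}
```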
##### Settings Page UI
The "Verzió és frissítés" (Version and update) card on the Settings page (`/settings`) shows:
- Current version and latest available version
- "Frissítés elérhető" (update available) badge
- Last check time and any errors
- Auto-update status with configured time
- Last update result (success/failed/pending)
- **Buttons:** "Frissítés keresése" (check) + "Frissítés telepítése" (apply)
After triggering an update, the page polls `/api/health` every 3s and reloads when the new container responds.
A global info-level alert ("Új controller verzió elérhető", i.e. "New controller version available") appears on all pages when an update is available and links to the Settings page.
##### Configuration
```yaml
self_update:
enabled: true
check_interval: "6h" # How often to check registry
image: "gitea.dooplex.hu/admin/felhom-controller" # Default
auto_update: false # Set true for unattended updates
auto_update_time: "04:30" # When to auto-apply (after backups)
health_timeout_seconds: 60 # Reserved for future use
```
##### Edge Cases
| Scenario | Behavior |
|----------|----------|
| `Version == "dev"` | `ParseVersion` returns error → no updates reported, trigger refused |
| Registry unreachable | Log warning, return error in check result. No crash. |
| No registry credentials | Return error "Registry hitelesítő adatok hiányoznak" (registry credentials missing) |
| Compose file not writable | Refuse update before doing anything |
| Backup running | Refuse with "Mentés fut, próbálja később" (backup running, try again later) |
| Concurrent update | Mutex prevents duplicates: "Frissítés már folyamatban" (update already in progress) |
| Bad update (crash loop) | Docker restarts container. State file stays "pending". An operator reverts over SSH using `PreviousImage` from the state file. |
| Corrupt state file | Treated as "no pending update", logged, deleted |
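The guard-related rows of this table can be sketched as a single entry check. The type layout, the `begin`/`end` method names, and the English dev-version message are assumptions; the two Hungarian messages are taken from the table. The compose-file writability check is left out for brevity.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var (
	ErrInProgress    = errors.New("Frissítés már folyamatban")
	ErrDevVersion    = errors.New("dev version cannot be updated") // wording is an assumption
	ErrBackupRunning = errors.New("Mentés fut, próbálja később")
)

// Updater holds only the fields the guards need in this sketch.
type Updater struct {
	mu            sync.Mutex
	updating      bool
	version       string
	backupRunning func() bool
}

// begin runs the guards in the documented order (concurrent update, dev
// version, backup running) and marks the update as in progress on success.
func (u *Updater) begin() error {
	u.mu.Lock()
	defer u.mu.Unlock()
	if u.updating {
		return ErrInProgress
	}
	if u.version == "dev" {
		return ErrDevVersion
	}
	if u.backupRunning != nil && u.backupRunning() {
		return ErrBackupRunning
	}
	u.updating = true
	return nil
}

// end clears the in-progress flag, e.g. after a failed pull.
func (u *Updater) end() {
	u.mu.Lock()
	u.updating = false
	u.mu.Unlock()
}

func main() {
	u := &Updater{version: "0.15.5", backupRunning: func() bool { return false }}
	fmt.Println(u.begin()) // first trigger passes the guards
	fmt.Println(u.begin()) // second trigger is refused while the first runs
}
```

Holding the mutex only for the flag check, rather than for the whole pull, lets a second request fail fast with a clear message instead of blocking.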
---
### 7. Authentication & Settings
@@ -572,11 +701,12 @@ All public methods use `sync.RWMutex`. File writes are atomic (`.tmp` + rename).
#### Settings Page (`/settings`)
Three sections:
Five sections:
1. **System config** — read-only display of `controller.yaml` values
2. **Password change** — current + new + confirm, min 8 chars
2. **Version & update** — current/latest version, check/update buttons, auto-update status, last update result
3. **Storage paths** — add/remove, edit labels, set default, toggle schedulable, per-path app list with sizes
4. **Notifications** — email, event checkboxes, cooldown hours, test email button
4. **Password change** — current + new + confirm, min 8 chars
5. **Notifications** — email, event checkboxes, cooldown hours, test email button
---
@@ -683,6 +813,10 @@ controller/
│ │ ├── store.go # SQLite time-series (WAL mode, downsampled queries)
│ │ ├── collector.go # Background collector (60s, system + docker stats)
│ │ └── sysinfo.go # Static system info (/proc, /etc)
│ ├── selfupdate/
│ │ ├── version.go # Semver parsing + comparison (hand-rolled)
│ │ ├── state.go # Update audit state (JSON, atomic writes)
│ │ └── updater.go # Registry check, update trigger, startup verify
│ ├── notify/notifier.go # Email relay to hub, preference sync, cooldowns
│ ├── report/
│ │ ├── builder.go # Hub report builder (all subsystems → JSON)
@@ -784,6 +918,8 @@ Auto-generated during deployment. Contains env vars, locked fields list, deploy
| backup | daily | 03:00 | Restic backup → cross-drive chain |
| backup-integrity | daily | Sun 04:00 | Restic check |
| metrics-prune | daily | 04:00 | Delete metrics older than 30 days |
| selfupdate-check | periodic | 6h | Check registry for new version (cache for UI) |
| selfupdate-auto | daily | 04:30 | Auto-update if enabled + backup not running |
All daily jobs use Europe/Budapest timezone. Skip-if-running prevents concurrent execution. Panic recovery in all jobs.
@@ -838,6 +974,16 @@ All daily jobs use Europe/Budapest timezone. Skip-if-running prevents concurrent
| POST | `/api/storage/migrate` | Start app data migration |
| GET | `/api/storage/migrate/status` | Migration progress |
### Self-Update
| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/api/selfupdate/status` | Update status (cached check result + last state) |
| POST | `/api/selfupdate/check` | Force registry check |
| POST | `/api/selfupdate/update` | Trigger self-update (async) |
Self-update endpoints accept session auth OR `Authorization: Bearer <hub_api_key>` for external triggering.
### Metrics
| Method | Endpoint | Description |
@@ -864,11 +1010,24 @@ git -C ~/git/deploy-felhom-compose pull
### Deploy on customer node
**Option A: Self-Update API (v0.16.0+)**
After building and pushing the new image, trigger the controller's self-update endpoint:
```bash
curl -s -X POST https://felhom.demo-felhom.eu/api/selfupdate/update \
-H "Authorization: Bearer <HUB_API_KEY>"
```
The controller pulls the new image, updates its own compose file, and runs `docker compose up -d` to replace itself. The Settings page also has a "Frissítés telepítése" (install update) button for manual triggering.
**Option B: Manual SSH (pre-v0.16.0 or fallback)**
```bash
# On customer node (e.g., 192.168.0.162)
cd /opt/docker/felhom-controller
sudo docker pull gitea.dooplex.hu/admin/felhom-controller:v0.14.1
sudo sed -i 's|image: gitea.dooplex.hu/admin/felhom-controller:.*|image: gitea.dooplex.hu/admin/felhom-controller:v0.14.1|' docker-compose.yml
sudo docker pull gitea.dooplex.hu/admin/felhom-controller:<VERSION>
sudo sed -i 's|image: gitea.dooplex.hu/admin/felhom-controller:.*|image: gitea.dooplex.hu/admin/felhom-controller:<VERSION>|' docker-compose.yml
sudo docker compose up -d
```
@@ -910,11 +1069,11 @@ See `docker-compose.yml` for the full volume configuration.
- [x] Auto Tier 2 for small apps (v0.14.1) — auto-enable daily rsync for non-HDD apps when ≥2 drives
- [x] Infrastructure config in cross-drive backup (v0.14.1) — stacks dir + controller.yaml in `_infra/` + restic
- [x] Disaster recovery (v0.15.5) — Hub-based infra backup, auto-mount by UUID, restore UI with full-page takeover
- [x] Controller self-update (v0.16.0) — Watchtower-style pull + restart, Settings page UI, API key auth, auto-update scheduling
### In Progress / Planned
- [ ] Update classification and auto-apply (optional/required/security markers)
- [ ] Self-update mechanism with health-based rollback
- [ ] Docker volume backup (`/var/lib/docker/volumes:ro`)
- [ ] Raspberry Pi testing (pi-customer-1)
- [ ] CSRF protection on POST endpoints
@@ -926,7 +1085,7 @@ See `docker-compose.yml` for the full volume configuration.
| Node | Hardware | Domain | Status |
|------|----------|--------|--------|
| demo-felhom | Acemagic GK3PLUS N100, 16G RAM, 512G SSD + 1TB HDD | demo-felhom.eu | Controller v0.15.5 |
| demo-felhom | Acemagic GK3PLUS N100, 16G RAM, 512G SSD + 1TB HDD | demo-felhom.eu | Controller v0.16.0 |
| pi-customer-1 | Raspberry Pi 3B+, 1G RAM, 32G SD | pi-customer-1.local | Not yet tested |
## Related Repositories