docs: update CONTEXT.md and README.md for v0.4.0
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
+71
-12
@@ -7,7 +7,7 @@
 >
 > Ask Claude Code: "Please update CONTEXT.md with what we did today"

-Last updated: 2026-02-15 (session 9)
+Last updated: 2026-02-15 (session 10)

 ---
@@ -22,13 +22,65 @@ Last updated: 2026-02-15 (session 9)
 ## Current project state

 ### felhom-controller (this repo)
-- **Version:** v0.3.0
+- **Version:** v0.4.0
 - **Phase 1:** ✅ COMPLETE — Stack Manager + Deploy Flow
+- **Phase 2:** ✅ COMPLETE — Monitoring & Health (scheduler, CPU/temp, healthchecks.io pings)
+- **Phase 3:** ✅ COMPLETE — Backups (DB dumps, restic integration, manual trigger)
 - **First app deployed:** Paperless-ngx on demo-felhom.eu (2026-02-13)
 - **Running on:** demo-felhom (N100 mini PC) at 192.168.0.162:8080
-- **All Phase 1 features working:** deploy, start/stop/restart/update, logs, health-aware states, auth
+- **All Phase 1-3 features working:** deploy, start/stop/restart/update, logs, health-aware states, auth, monitoring, backups

-### What was just completed (2026-02-15 session 9)
+### What was just completed (2026-02-15 session 10)
+- **v0.4.0 — Monitoring & Health + Backups (Phase 2 & 3):**
+  - **Central job scheduler** (`internal/scheduler/scheduler.go`):
+    - Replaces ad-hoc goroutines in main.go with a unified scheduler
+    - `Every(name, interval, fn)` for periodic jobs, `Daily(name, timeStr, fn)` for scheduled tasks
+    - Panic recovery, skip-if-running, quiet mode for high-frequency jobs (≤30s)
+    - Daily jobs use `Europe/Budapest` timezone with `time.Timer` for DST correctness
+    - Graceful shutdown with 30s timeout for running jobs
+  - **CPU usage collector** (`internal/system/cpu_linux.go`):
+    - Background goroutine samples `/proc/stat` every 5s, computes delta-based CPU %
+    - Platform stubs for non-Linux in `cpu_other.go`
+  - **Temperature & load metrics** (`internal/system/info_linux.go`):
+    - Reads `/proc/loadavg` for 1/5/15 min load averages
+    - Reads thermal zones from `/host/sys/class/thermal/` (Docker mount) with `/sys/` fallback
+    - Handles millidegree values, picks highest zone, with hwmon fallback
+  - **Healthchecks.io pinger** (`internal/monitor/pinger.go`):
+    - HTTP ping client for Healthchecks.io-compatible endpoints
+    - POST to `/ping/{uuid}` (success), `/fail` (failure), `/start` (started)
+    - 10s timeout, 3 retries with 2s backoff, skips CHANGEME UUIDs
+  - **System health checks** (`internal/monitor/healthcheck.go`):
+    - Checks disk, memory, CPU, temperature, Docker reachability, protected containers
+    - Returns HealthReport with status "ok"/"warn"/"fail" + formatted message for pings
+  - **Database dump engine** (`internal/backup/dbdump.go`):
+    - Auto-discovers PostgreSQL/MariaDB containers via `docker ps` + `docker inspect`
+    - Dumps via `docker exec pg_dump`/`mariadb-dump` with 5min timeout
+    - Atomic writes (`.tmp` → `.sql`), empty file detection, stale temp cleanup
+  - **Restic integration** (`internal/backup/restic.go`):
+    - Auto-generates repository password (32 random bytes, base64url)
+    - Init, snapshot (JSON output), prune, check, stats, latest snapshot
+    - Stale lock detection with automatic unlock + retry
+  - **Backup orchestrator** (`internal/backup/backup.go`):
+    - DB dumps + restic snapshots, weekly prune on Sundays
+    - Thread-safe running flag, Healthchecks.io pings with results
+    - `RunFullBackup()` for manual trigger (sequential: dumps → snapshot)
+  - **Wiring updates:**
+    - `main.go`: scheduler-based job registration, cpuCollector lifecycle, pinger + backupMgr init
+    - `api/router.go`: `GET /api/backup/status`, `POST /api/backup/run`
+    - `web/server.go` + `handlers.go`: pass cpuCollector to GetInfo(), backup status on dashboard
+    - `funcmap.go`: `tempColor`, `fmtTemp`, `fmtLoad` template functions
+  - **Dashboard UI enhancements:**
+    - CPU usage bar with load average display below
+    - Temperature with colored indicator dot (green/yellow/red at 60°/75°C)
+    - Backup status card: last run time, DB count, repo size/snapshots
+    - "Mentés most" button triggers manual backup via API
+  - **Config updates:**
+    - `controller.yaml.example`: added `system_health_interval`, `hdd_path`, `system.reserved_memory_mb`
+    - `docker-compose.yml`: added `/sys:/host/sys:ro` mount for temperature reading
+    - `restic_password_file` default changed to `data/` subdir (auto-generated in named volume)
+  - **Controller version:** v0.4.0 — deployed and verified on demo-felhom.eu
+
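For the pinger bullets above, a minimal sketch of the URL scheme and retry loop might look like this. It assumes the `/fail` and `/start` suffixes are appended to `/ping/{uuid}` (as in Healthchecks.io's documented URL format) and that "3 retries" means three attempts in total; `pingURL`, `isPlaceholder`, `ping`, and the example host are hypothetical and not taken from `internal/monitor/pinger.go`.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
	"time"
)

// pingURL builds the endpoint for a status: "" (success), "fail" or "start".
func pingURL(base, uuid, status string) string {
	u := strings.TrimRight(base, "/") + "/ping/" + uuid
	if status != "" {
		u += "/" + status
	}
	return u
}

// isPlaceholder reports whether the UUID is still an unconfigured template value.
func isPlaceholder(uuid string) bool {
	return uuid == "" || strings.Contains(strings.ToUpper(uuid), "CHANGEME")
}

// ping POSTs body to url, trying up to attempts times with a fixed backoff.
func ping(client *http.Client, url, body string, attempts int, backoff time.Duration) error {
	var lastErr error
	for i := 1; i <= attempts; i++ {
		resp, err := client.Post(url, "text/plain", strings.NewReader(body))
		if err == nil {
			resp.Body.Close()
			if resp.StatusCode >= 200 && resp.StatusCode < 300 {
				return nil
			}
			err = fmt.Errorf("unexpected status %s", resp.Status)
		}
		lastErr = err
		if i < attempts {
			time.Sleep(backoff)
		}
	}
	return fmt.Errorf("ping failed after %d attempts: %w", attempts, lastErr)
}

func main() {
	const base = "https://hc.example.org" // hypothetical endpoint
	uuid := "CHANGEME"
	if isPlaceholder(uuid) {
		fmt.Println("skipping ping: UUID not configured")
		return
	}
	client := &http.Client{Timeout: 10 * time.Second}
	if err := ping(client, pingURL(base, uuid, ""), "ok", 3, 2*time.Second); err != nil {
		fmt.Println(err)
	}
}
```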
+### What was previously completed (2026-02-15 session 9)
 - **v0.3.0 — Structural refactoring (templates + server split + domain rename):**
   - **Templates: go:embed migration** — moved all 7 HTML templates + CSS from Go string constants to individual files in `internal/web/templates/`. Created `embed.go` with `//go:embed` directive. Template loading now uses `ParseFS()` instead of `Parse()`. CSS served from embed.FS via `ReadFile()`. Zero runtime file dependencies — still compiled into the binary.
   - **Server decomposition** — split monolithic `server.go` (540 lines) into focused files:
@@ -190,14 +242,15 @@ Last updated: 2026-02-15 (session 9)
 7. Documentation: restart vs up -d for image updates

 ### What's next (priorities)
-1. **Test orphan delete flow** — try deleting the orphaned filebrowser stack via the UI
-2. Add `app_info` + `optional_config` to more apps (start with Immich, Mealie, Vaultwarden)
-3. Deploy a second app (e.g., ActualBudget — simplest, or Immich — tests HDD + secrets) to validate all .felhom.yml files
-4. Add app screenshots to the asset pipeline (romm-screenshot-1.webp etc.)
-5. Test on Raspberry Pi (pi-customer-1)
-6. Add `paths.hdd_path` to demo-felhom controller.yaml to enable HDD bar
-7. Phase 2 continued: CPU/temperature metrics, Healthchecks.io pings
-8. Phase 3: Backup system (DB dumps + restic)
+1. **Configure Healthchecks.io UUIDs** on demo-felhom.eu (replace CHANGEME in controller.yaml)
+2. **Test backup flow** — trigger manual backup via dashboard, verify restic repo + DB dumps
+3. **Test orphan delete flow** — try deleting the orphaned filebrowser stack via the UI
+4. Add `app_info` + `optional_config` to more apps (start with Immich, Mealie, Vaultwarden)
+5. Deploy a second app (e.g., ActualBudget — simplest, or Immich — tests HDD + secrets)
+6. Add app screenshots to the asset pipeline (romm-screenshot-1.webp etc.)
+7. Test on Raspberry Pi (pi-customer-1)
+8. Add `paths.hdd_path` to demo-felhom controller.yaml to enable HDD bar
+9. Phase 4: Self-update mechanism

 ## Architecture decisions
@@ -222,6 +275,12 @@ Last updated: 2026-02-15 (session 9)
 | Orphan = deployed but not in catalog | Safe lifecycle: remove from catalog → mark orphaned → user deletes via UI |
 | FileBrowser as infra (not catalog) | Needed even after apps deleted (user browses HDD data); deployed by setup script |
 | Protected HDD paths | Safety net: never delete top-level HDD dirs (media, storage, Dokumentumok, appdata) |
+| Central scheduler (not ad-hoc goroutines) | Single place to register/monitor all periodic tasks, graceful shutdown, skip-if-running |
+| CPU sampling via background goroutine | /proc/stat delta needs two readings — collector runs every 5s, GetInfo() reads cached value |
+| Temperature from /host/sys (Docker mount) | Container can't read host /sys directly — mount /sys:/host/sys:ro, try /host/sys first |
+| Restic password auto-generated | No manual setup needed — generated on first backup run, stored in named volume |
+| DB discovery via docker inspect | No config needed — discovers postgres/mariadb containers by image name + env vars |
+| Backup orchestrator with running flag | Prevents concurrent backups, supports both scheduled and manual trigger |
 ## Key file locations on demo-felhom