docs: update CONTEXT.md and README for v0.6.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 13:25:05 +01:00
parent 97074e7a0c
commit 8a1b9e57ae
2 changed files with 80 additions and 22 deletions
+47 -10
@@ -7,7 +7,7 @@
>
> Ask Claude Code: "Please update CONTEXT.md with what we did today"
Last updated: 2026-02-16 (session 17)
Last updated: 2026-02-16 (session 18)
---
@@ -22,7 +22,7 @@ Last updated: 2026-02-16 (session 17)
## Current project state
### felhom-controller (this repo)
- **Version:** v0.5.4
- **Version:** v0.6.0
- **Phase 1:** ✅ COMPLETE — Stack Manager + Deploy Flow
- **Phase 2:** ✅ COMPLETE — Monitoring & Health (scheduler, CPU/temp, healthchecks.io pings)
- **Phase 3:** ✅ COMPLETE — Backups (DB dumps, restic integration, manual trigger, **dedicated backup page**)
@@ -31,7 +31,40 @@ Last updated: 2026-02-16 (session 17)
- **Running on:** demo-felhom (N100 mini PC) at 192.168.0.162:8080
- **All Phase 1-4 features working:** deploy, start/stop/restart/update, logs, health-aware states, auth, monitoring, backups, backup detail page, system monitoring page
### What was just completed (2026-02-16 session 17)
### What was just completed (2026-02-16 session 18)
- **v0.6.0 — Healthcheck Implementation + Central Push + Hub Dashboard:**
- **Part 1 — Healthcheck enhancements (controller-side):**
- Added `heartbeat` ping — lightweight "I'm alive" signal every 5 min (no logic, just ping)
- Added `backup_integrity` ping — weekly `restic check` on Sunday 04:00, pings healthchecks with result
- Added `Heartbeat` and `BackupIntegrity` fields to `PingUUIDsConfig`
- Added `RunIntegrityCheck()` to backup Manager (calls restic `Check()`, updates `lastCheckTime`/`lastCheckOK`, pings healthchecks with the result)
- Updated `controller.yaml.example` with new monitoring ping_uuids
- Created `monitoring/DEPRECATED.md` for legacy bash monitoring scripts
- **Part 2 — Central hub reporting (controller-side):**
- New `internal/report/` package: types.go (Report struct), builder.go (BuildReport), pusher.go (HTTP push)
- Report builder gathers data from all subsystems: system info (via metrics.GetStaticInfo + system.GetInfo), container stats (via metricsStore.QueryContainerSummary), backup status (via backupMgr.GetFullStatus), health (via monitor.RunHealthCheck), stacks (via stackMgr.GetStacks)
- Report pusher: POSTs JSON to the hub with Bearer token auth, 3 retries with 5s backoff, never fails the caller
- Added `HubConfig` to config.go (enabled, url, api_key, push_interval)
- Wired hub reporting into scheduler (configurable interval, default 15m)
- Hub reporting disabled by default (hub.enabled: false)
- **Part 3 — Hub service (felhom.eu repo, new `hub/` subfolder):**
- Full Go service: `cmd/hub/main.go`, `internal/api/handler.go`, `internal/store/store.go`, `internal/web/server.go`
- SQLite store with WAL mode, auto-migration, denormalized fields for fast queries
- REST API: POST /api/v1/report (Bearer token auth), GET /api/v1/customers, GET /api/v1/customers/{id}, GET /api/v1/customers/{id}/history
- Dark theme dashboard (English): multi-customer overview table with status indicators, customer detail page with system/storage/containers/backup/health sections
- Color coding: green (OK, <30min), yellow (warn or 30-60min), red (fail or >60min)
- K8s manifest: Deployment + Service + Ingress for hub.felhom.eu in felhom-system namespace
- Dockerfile, Makefile, hub.yaml.example config
- 90-day report retention with daily auto-prune
- **Controller version:** v0.6.0 — deployed and verified on demo-felhom.eu (9 scheduler jobs, all new jobs registered)
- **Manual steps remaining for Viktor (Part 4 of TASK.md):**
- Create 5 checks on status.felhom.eu (heartbeat, system-health, db-dump, backup, backup-integrity)
- Update controller.yaml on demo-felhom with real UUIDs
- Build and deploy felhom-hub to k3s cluster
- Configure hub.felhom.eu DNS in Cloudflare
- Enable hub reporting on demo-felhom controller.yaml
### What was previously completed (2026-02-16 session 17)
- **v0.5.4 — Monitoring Page Frontend Fixes (4 bugs, frontend-only):**
- **Bug 1: Tooltip "Invalid Date"** — `items[0].parsed.x` unreliable across Chart.js versions. Fixed tooltip callback to use `items[0].raw.x` (direct {x,y} data access) with `parsed.x` as fallback.
- **Bug 2: Charts fill full width regardless of data density** — `setChartXBounds()` setting `min/max` at runtime was ignored because the scale was created without them. Fixed by including `min: now - defaultRangeMs, max: now` in the initial `chartOpts()` options. Now "7 nap" shows full 7-day x-axis with data clustered on the right.
@@ -336,15 +369,19 @@ Last updated: 2026-02-16 (session 17)
7. Documentation: restart vs up -d for image updates
### What's next (priorities)
1. **Configure Healthchecks.io UUIDs** on demo-felhom.eu (replace CHANGEME in controller.yaml)
1. **Manual steps for v0.6.0** — Viktor needs to:
- Create 5 checks on status.felhom.eu with correct periods/grace
- Update controller.yaml on demo-felhom with real UUIDs
- Build + deploy felhom-hub to k3s (`cd hub && make docker-push`, `kubectl apply -f manifests/hub.yaml`)
- Configure hub.felhom.eu DNS in Cloudflare
- Enable hub reporting on demo-felhom (`hub.enabled: true`, `hub.api_key: <key>`)
2. **Test backup flow** — trigger manual backup via dashboard, verify restic repo + DB dumps
3. **Test orphan delete flow** — try deleting the orphaned filebrowser stack via the UI
3. **Test backup integrity check** — wait for Sunday 04:00 or manually trigger
4. Add `app_info` + `optional_config` to more apps (start with Immich, Mealie, Vaultwarden)
5. Deploy a second app (e.g., ActualBudget — simplest, or Immich — tests HDD + secrets)
6. Add app screenshots to the asset pipeline (romm-screenshot-1.webp etc.)
7. Test on Raspberry Pi (pi-customer-1)
8. Add `paths.hdd_path` to demo-felhom controller.yaml to enable HDD bar
9. Phase 4: Self-update mechanism
6. Test on Raspberry Pi (pi-customer-1)
7. Phase 4: Self-update mechanism
8. v0.6.1: Hub alerting (webhook to Healthchecks for stale customers)
## Architecture decisions
@@ -411,7 +448,7 @@ Last updated: 2026-02-16 (session 17)
|------------|--------|-------|
| deploy-felhom-compose | Active | This repo. Controller code + deploy scripts |
| app-catalog-felhom.eu | Active | 10 app templates, all with .felhom.yml metadata + memory limits |
| felhom.eu | Stable | Website live, SEO indexed, email working |
| felhom.eu | Active | Website + hub/ subfolder (felhom-hub service) + k8s manifests |
| homelab-manifests | Stable | k3s cluster running (dooplex.hu services) |
| misc-scripts | Utility | collect-repo.sh, backup helpers |