docs: update CONTEXT.md and README for v0.6.0
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
+47
-10
@@ -7,7 +7,7 @@
 >
 > Ask Claude Code: "Please update CONTEXT.md with what we did today"
 
-Last updated: 2026-02-16 (session 17)
+Last updated: 2026-02-16 (session 18)
 
 ---
 
@@ -22,7 +22,7 @@ Last updated: 2026-02-16 (session 17)
 ## Current project state
 
 ### felhom-controller (this repo)
-- **Version:** v0.5.4
+- **Version:** v0.6.0
 - **Phase 1:** ✅ COMPLETE — Stack Manager + Deploy Flow
 - **Phase 2:** ✅ COMPLETE — Monitoring & Health (scheduler, CPU/temp, healthchecks.io pings)
 - **Phase 3:** ✅ COMPLETE — Backups (DB dumps, restic integration, manual trigger, **dedicated backup page**)
@@ -31,7 +31,40 @@ Last updated: 2026-02-16 (session 17)
 - **Running on:** demo-felhom (N100 mini PC) at 192.168.0.162:8080
 - **All Phase 1-4 features working:** deploy, start/stop/restart/update, logs, health-aware states, auth, monitoring, backups, backup detail page, system monitoring page
 
-### What was just completed (2026-02-16 session 17)
+### What was just completed (2026-02-16 session 18)
+- **v0.6.0 — Healthcheck Implementation + Central Push + Hub Dashboard:**
+- **Part 1 — Healthcheck enhancements (controller-side):**
+- Added `heartbeat` ping — lightweight "I'm alive" signal every 5 min (no logic, just ping)
+- Added `backup_integrity` ping — weekly `restic check` on Sunday 04:00, pings healthchecks with result
+- Added `Heartbeat` and `BackupIntegrity` fields to `PingUUIDsConfig`
+- Added `RunIntegrityCheck()` to backup Manager (calls restic Check(), updates lastCheckTime/lastCheckOK, pings)
+- Updated `controller.yaml.example` with new monitoring ping_uuids
+- Created `monitoring/DEPRECATED.md` for legacy bash monitoring scripts
+- **Part 2 — Central hub reporting (controller-side):**
+- New `internal/report/` package: types.go (Report struct), builder.go (BuildReport), pusher.go (HTTP push)
+- Report builder gathers data from all subsystems: system info (via metrics.GetStaticInfo + system.GetInfo), container stats (via metricsStore.QueryContainerSummary), backup status (via backupMgr.GetFullStatus), health (via monitor.RunHealthCheck), stacks (via stackMgr.GetStacks)
+- Report pusher: POST JSON to hub with Bearer token auth, 3 retries with 5s backoff, never fails caller
+- Added `HubConfig` to config.go (enabled, url, api_key, push_interval)
+- Wired hub reporting into scheduler (configurable interval, default 15m)
+- Hub reporting disabled by default (hub.enabled: false)
+- **Part 3 — Hub service (felhom.eu repo, new `hub/` subfolder):**
+- Full Go service: `cmd/hub/main.go`, `internal/api/handler.go`, `internal/store/store.go`, `internal/web/server.go`
+- SQLite store with WAL mode, auto-migration, denormalized fields for fast queries
+- REST API: POST /api/v1/report (Bearer token auth), GET /api/v1/customers, GET /api/v1/customers/{id}, GET /api/v1/customers/{id}/history
+- Dark theme dashboard (English): multi-customer overview table with status indicators, customer detail page with system/storage/containers/backup/health sections
+- Color coding: green (OK, <30min), yellow (warn or 30-60min), red (fail or >60min)
+- K8s manifest: Deployment + Service + Ingress for hub.felhom.eu in felhom-system namespace
+- Dockerfile, Makefile, hub.yaml.example config
+- 90-day report retention with daily auto-prune
+- **Controller version:** v0.6.0 — deployed and verified on demo-felhom.eu (9 scheduler jobs, all new jobs registered)
+- **Manual steps remaining for Viktor (Part 4 of TASK.md):**
+- Create 5 healthcheck checks on status.felhom.eu (heartbeat, system-health, db-dump, backup, backup-integrity)
+- Update controller.yaml on demo-felhom with real UUIDs
+- Build and deploy felhom-hub to k3s cluster
+- Configure hub.felhom.eu DNS in Cloudflare
+- Enable hub reporting on demo-felhom controller.yaml
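
The `heartbeat` and `backup_integrity` pings from Part 1 come down to one HTTP request per event. Below is a minimal Go sketch, assuming the Healthchecks convention where `<base>/<uuid>` signals success and `<base>/<uuid>/fail` signals failure. The `PingUUIDs` struct, every field name other than `Heartbeat` and `BackupIntegrity`, and the helpers `pingURL` and `ping` are illustrative and not taken from the controller code. The ping base URL for status.felhom.eu is not stated in these notes, so it is passed in as a parameter.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"strings"
	"time"
)

// PingUUIDs mirrors the five checks listed in the manual steps.
// Only Heartbeat and BackupIntegrity are named in the notes; the
// remaining field names are hypothetical.
type PingUUIDs struct {
	Heartbeat       string
	SystemHealth    string
	DBDump          string
	Backup          string
	BackupIntegrity string
}

// pingURL builds a Healthchecks-style ping URL:
// <base>/<uuid> reports success, <base>/<uuid>/fail reports failure.
func pingURL(base, uuid string, ok bool) string {
	u := strings.TrimRight(base, "/") + "/" + uuid
	if !ok {
		u += "/fail"
	}
	return u
}

// ping sends one signal. An empty UUID is treated as "check not
// configured" and is skipped without error.
func ping(ctx context.Context, client *http.Client, base, uuid string, ok bool) error {
	if uuid == "" {
		return nil
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, pingURL(base, uuid, ok), nil)
	if err != nil {
		return err
	}
	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("ping %s: unexpected status %d", uuid, resp.StatusCode)
	}
	return nil
}

func main() {
	client := &http.Client{Timeout: 10 * time.Second}
	uuids := PingUUIDs{} // all UUIDs empty, so nothing is sent
	err := ping(context.Background(), client, "https://hc-ping.com", uuids.Heartbeat, true)
	fmt.Println("heartbeat skipped, err =", err)
}
```

In this sketch a scheduler job would call `ping` with `ok` set to `false` when `restic check` fails, so the check is marked down right away rather than after its grace period expires.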
+
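
The report pusher from Part 2 (POST JSON, Bearer token, 3 retries with 5s backoff, never fails the caller) might look roughly like the Go sketch below. It is not the contents of `pusher.go`. The `HubConfig` field names and yaml tags are guessed from the four config keys, `Report` is a two-field placeholder for the real struct, and "3 retries" is read here as three attempts in total. `buildRequest`, `pushOnce`, and `Push` are hypothetical names.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"time"
)

// HubConfig follows the keys named in the notes (enabled, url,
// api_key, push_interval). Field names and tags are assumptions.
type HubConfig struct {
	Enabled      bool          `yaml:"enabled"`
	URL          string        `yaml:"url"`
	APIKey       string        `yaml:"api_key"`
	PushInterval time.Duration `yaml:"push_interval"`
}

// Report is a placeholder for the struct in internal/report/types.go.
type Report struct {
	CustomerID string `json:"customer_id"`
	Version    string `json:"version"`
}

// buildRequest creates the POST with JSON body and Bearer token.
func buildRequest(cfg HubConfig, body []byte) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, cfg.URL, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+cfg.APIKey)
	return req, nil
}

// pushOnce performs a single attempt and treats any non-2xx as an error.
func pushOnce(ctx context.Context, client *http.Client, cfg HubConfig, body []byte) error {
	req, err := buildRequest(cfg, body)
	if err != nil {
		return err
	}
	resp, err := client.Do(req.WithContext(ctx))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return fmt.Errorf("hub returned status %d", resp.StatusCode)
	}
	return nil
}

// Push makes up to attempts tries with a fixed backoff between them.
// Failures are logged, never returned, so the scheduler job that
// calls it cannot fail because the hub is unreachable.
func Push(ctx context.Context, client *http.Client, cfg HubConfig, r Report, attempts int, backoff time.Duration) bool {
	if !cfg.Enabled {
		return false
	}
	body, err := json.Marshal(r)
	if err != nil {
		log.Printf("hub push: marshal failed: %v", err)
		return false
	}
	for i := 1; i <= attempts; i++ {
		if err = pushOnce(ctx, client, cfg, body); err == nil {
			return true
		}
		log.Printf("hub push attempt %d/%d failed: %v", i, attempts, err)
		if i < attempts {
			select {
			case <-ctx.Done():
				return false
			case <-time.After(backoff):
			}
		}
	}
	return false
}

func main() {
	cfg := HubConfig{Enabled: false, PushInterval: 15 * time.Minute} // disabled by default
	r := Report{CustomerID: "demo-felhom", Version: "v0.6.0"}
	ok := Push(context.Background(), http.DefaultClient, cfg, r, 3, 5*time.Second)
	fmt.Println("pushed:", ok)
}
```

Returning a bool instead of an error is one way to express "never fails caller": the scheduler can record the outcome without having to handle a failure path.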
+### What was previously completed (2026-02-16 session 17)
 - **v0.5.4 — Monitoring Page Frontend Fixes (4 bugs, frontend-only):**
 - **Bug 1: Tooltip "Invalid Date"** — `items[0].parsed.x` unreliable across Chart.js versions. Fixed tooltip callback to use `items[0].raw.x` (direct {x,y} data access) with `parsed.x` as fallback.
 - **Bug 2: Charts fill full width regardless of data density** — `setChartXBounds()` setting `min/max` at runtime was ignored because the scale was created without them. Fixed by including `min: now - defaultRangeMs, max: now` in the initial `chartOpts()` options. Now "7 nap" shows full 7-day x-axis with data clustered on the right.
@@ -336,15 +369,19 @@ Last updated: 2026-02-16 (session 17)
 7. Documentation: restart vs up -d for image updates
 
 ### What's next (priorities)
-1. **Configure Healthchecks.io UUIDs** on demo-felhom.eu (replace CHANGEME in controller.yaml)
+1. **Manual steps for v0.6.0** — Viktor needs to:
+- Create 5 healthcheck checks on status.felhom.eu with correct periods/grace
+- Update controller.yaml on demo-felhom with real UUIDs
+- Build + deploy felhom-hub to k3s (`cd hub && make docker-push`, `kubectl apply -f manifests/hub.yaml`)
+- Configure hub.felhom.eu DNS in Cloudflare
+- Enable hub reporting on demo-felhom (`hub.enabled: true`, `hub.api_key: <key>`)
 2. **Test backup flow** — trigger manual backup via dashboard, verify restic repo + DB dumps
-3. **Test orphan delete flow** — try deleting the orphaned filebrowser stack via the UI
+3. **Test backup integrity check** — wait for Sunday 04:00 or manually trigger
 4. Add `app_info` + `optional_config` to more apps (start with Immich, Mealie, Vaultwarden)
 5. Deploy a second app (e.g., ActualBudget — simplest, or Immich — tests HDD + secrets)
-6. Add app screenshots to the asset pipeline (romm-screenshot-1.webp etc.)
-7. Test on Raspberry Pi (pi-customer-1)
-8. Add `paths.hdd_path` to demo-felhom controller.yaml to enable HDD bar
-9. Phase 4: Self-update mechanism
+6. Test on Raspberry Pi (pi-customer-1)
+7. Phase 4: Self-update mechanism
+8. v0.6.1: Hub alerting (webhook to Healthchecks for stale customers)
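
For the v0.6.1 alerting item, "stale" could reuse the dashboard thresholds from Part 3 (green under 30 min, yellow at 30-60 min, red past 60 min). The Go sketch below shows that classification. The status strings `"ok"`, `"warn"`, and `"fail"` and the function names `statusColor`, `isStale`, and `minutes` are assumptions for illustration; the hub's actual values are not given in these notes.

```go
package main

import (
	"fmt"
	"time"
)

// minutes is a small helper to keep call sites readable.
func minutes(n int) time.Duration {
	return time.Duration(n) * time.Minute
}

// statusColor maps a customer's last reported status and the age of
// that report to a dashboard color:
//   red    = status "fail" or report older than 60 min
//   yellow = status "warn" or report 30-60 min old
//   green  = everything else (OK and newer than 30 min)
func statusColor(status string, age time.Duration) string {
	switch {
	case status == "fail" || age > 60*time.Minute:
		return "red"
	case status == "warn" || age >= 30*time.Minute:
		return "yellow"
	default:
		return "green"
	}
}

// isStale flags customers whose last report is past the red
// threshold; these would be the candidates for an alert webhook.
func isStale(age time.Duration) bool {
	return age > 60*time.Minute
}

func main() {
	for _, age := range []time.Duration{minutes(10), minutes(45), minutes(120)} {
		fmt.Printf("%v -> %s (stale: %t)\n", age, statusColor("ok", age), isStale(age))
	}
}
```

With the default 15m push interval, a customer turns yellow after two missed reports and red after four, which gives some slack before an alert would fire.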
 
 ## Architecture decisions
 
@@ -411,7 +448,7 @@ Last updated: 2026-02-16 (session 17)
 |------------|--------|-------|
 | deploy-felhom-compose | Active | This repo. Controller code + deploy scripts |
 | app-catalog-felhom.eu | Active | 10 app templates, all with .felhom.yml metadata + memory limits |
-| felhom.eu | Stable | Website live, SEO indexed, email working |
+| felhom.eu | Active | Website + hub/ subfolder (felhom-hub service) + k8s manifests |
 | homelab-manifests | Stable | k3s cluster running (dooplex.hu services) |
 | misc-scripts | Utility | collect-repo.sh, backup helpers |
 