# TASK.md - v0.6.0: Healthcheck Implementation + Central Push + Multi-Customer Dashboard

> **Version:** v0.6.0
> **Depends on:** v0.5.4 (current)
> **Repo:** `deploy-felhom-compose` (controller/ subfolder)
> **Build:** `~/build/felhom-controller/build.sh 0.6.0 --push`
> **Deploy target:** demo-felhom.eu (N100) + k3s cluster (dooplex.hu)

---

## Context

The controller already has health monitoring infrastructure built in v0.4.0:
- `internal/monitor/pinger.go`: Healthchecks.io-compatible HTTP ping client (success/fail/start, retries)
- `internal/monitor/healthcheck.go`: system health checks (disk, memory, CPU, temp, Docker, protected containers)
- Scheduler jobs in `main.go`: `system-health` (every 5m), `db-dump` (daily), `backup` (daily)
- The backup manager already calls `pinger.Ping()`/`pinger.Fail()` after each operation

**Problem:** The demo-felhom Healthchecks project has **zero checks created** (screenshot confirms empty project at `status.felhom.eu/projects/.../checks/`). The `controller.yaml` on demo-felhom has all `CHANGEME` placeholder UUIDs. Nothing is actually pinging.

Additionally, there are legacy bash scripts (`backup-healthcheck.sh`, `monitoring-setup.sh`) from the pre-controller era that duplicate functionality now built into the controller. These should be deprecated in favor of controller-native pings.

**This version has two major parts:**
1. **Prerequisite:** Get healthchecks actually working on demo-felhom (create checks, configure UUIDs, verify pings)
2. **New feature:** Central push from customer controllers to k3s + multi-customer overview dashboard

---

## Part 0: Healthcheck Ping Design (controller.yaml schema update)

### Current ping types (already implemented in code)

| Ping | Schedule | Source | What it proves |
|------|----------|--------|----------------|
| `system_health` | Every 5 min | `monitor.RunHealthCheck()` | Server alive, Docker running, disks OK, protected containers up, CPU/mem/temp within thresholds |
| `db_dump` | Daily 02:30 | `backup.RunDBDumps()` | Database dumps completed successfully |
| `backup` | Daily 03:00 | `backup.RunBackup()` | Restic snapshot completed successfully |

### New ping types to add

| Ping | Schedule | Source | What it proves |
|------|----------|--------|----------------|
| `backup_integrity` | Weekly (Sunday 04:00) | New: `backup.RunIntegrityCheck()` | Restic repo passes `restic check` (data is not corrupted) |
| `heartbeat` | Every 5 min | New: lightweight HTTP POST, no logic | Controller process is alive (distinct from `system_health`, which does heavy checks and could fail due to a bug while the controller itself is fine) |

### Revised `controller.yaml` monitoring section

```yaml
monitoring:
  enabled: true
  healthchecks_base: "https://status.felhom.eu"
  ping_uuids:
    heartbeat: ""          # NEW: every 5 min, controller alive
    system_health: ""      # existing: every 5 min, comprehensive check
    db_dump: ""            # existing: daily after db dumps
    backup: ""             # existing: daily after restic snapshot
    backup_integrity: ""   # NEW: weekly after restic check
  system_health_interval: "5m"
  health_check_schedule: "06:00"
  thresholds:
    disk_warn_percent: 80
    disk_crit_percent: 90
    backup_max_age_hours: 36
    cpu_warn_percent: 90
    memory_warn_percent: 85
    temperature_warn_celsius: 75
```

> **Note:** Empty string and "CHANGEME..." UUIDs are both skipped by the pinger (already implemented). This means any check can be left unconfigured; the controller just skips it silently.

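
The skip rule described in the note can be sketched in Go as follows. This is a minimal illustration, not the controller's actual code: the helper name `pingConfigured` is hypothetical, and treating any value that starts with `CHANGEME` as a placeholder is an assumption based on the note.

```go
package main

import (
	"fmt"
	"strings"
)

// pingConfigured reports whether a ping UUID looks usable.
// Assumption: empty values and values starting with "CHANGEME" are
// placeholders; the real pinger may apply a slightly different rule.
func pingConfigured(uuid string) bool {
	u := strings.TrimSpace(uuid)
	return u != "" && !strings.HasPrefix(u, "CHANGEME")
}

func main() {
	checks := []struct{ name, uuid string }{
		{"heartbeat", ""},
		{"system_health", "CHANGEME-system-health"},
		{"backup", "00000000-0000-4000-8000-000000000000"}, // made-up UUID
	}
	for _, c := range checks {
		if !pingConfigured(c.uuid) {
			fmt.Printf("%s: not configured, skipping ping\n", c.name)
			continue
		}
		fmt.Printf("%s: would ping\n", c.name)
	}
}
```

Keeping the rule in one place means every ping type (existing and new) is skipped the same way when its UUID is left unconfigured.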
### Healthchecks check configuration (to be created manually on status.felhom.eu)

For each customer project, create these checks:

| Check name | Period | Grace | Tags |
|-----------|--------|-------|------|
| `heartbeat` | 5 minutes | 10 minutes | `heartbeat` |
| `system-health` | 5 minutes | 10 minutes | `system`, `health` |
| `db-dump` | 1 day (02:30 CET) | 30 minutes | `backup`, `db` |
| `backup` | 1 day (03:00 CET) | 60 minutes | `backup`, `restic` |
| `backup-integrity` | 7 days | 24 hours | `backup`, `integrity` |

---

## Part 1: Controller-side healthcheck implementation

### Task 1.1: Add heartbeat ping

**Files:** `cmd/controller/main.go`

Add a new scheduler job. This is the simplest possible ping, with no health check logic:

```go
// Heartbeat: lightweight "I'm alive" signal
sched.Every("heartbeat", 5*time.Minute, func(ctx context.Context) error {
    pinger.Ping(cfg.Monitoring.PingUUIDs.Heartbeat, "")
    return nil
})
```

**Files:** `internal/config/config.go`

Add `Heartbeat` and `BackupIntegrity` fields to `PingUUIDsConfig`:

```go
type PingUUIDsConfig struct {
    Heartbeat       string `yaml:"heartbeat"`        // new
    DBDump          string `yaml:"db_dump"`
    Backup          string `yaml:"backup"`
    SystemHealth    string `yaml:"system_health"`
    BackupIntegrity string `yaml:"backup_integrity"` // new
}
```

### Task 1.2: Add backup integrity check

**Files:** `internal/backup/restic.go`

Add a `Check()` method (it may already exist as part of the prune logic; verify first):

```go
// Check runs `restic check` to verify repository integrity.
func (r *ResticRunner) Check() error {
    args := []string{"check", "--repo", r.repo, "--json"}
    // ... standard exec with password file, timeout 30 min
}
```

**Files:** `internal/backup/backup.go`

Add `RunIntegrityCheck()`:

```go
// RunIntegrityCheck runs restic check and pings healthchecks with the result.
func (m *Manager) RunIntegrityCheck(ctx context.Context) error {
    err := m.restic.Check()
    uuid := m.cfg.Monitoring.PingUUIDs.BackupIntegrity
    if err != nil {
        m.pinger.Fail(uuid, fmt.Sprintf("restic check failed: %v", err))
        return err
    }
    m.pinger.Ping(uuid, "restic check passed")
    return nil
}
```

**Files:** `cmd/controller/main.go`

Register the weekly job:

```go
if cfg.Backup.Enabled && backupMgr != nil {
    // ... existing daily jobs ...

    // Weekly integrity check: Sunday 04:00
    sched.Daily("backup-integrity", "04:00", func(ctx context.Context) error {
        if time.Now().Weekday() != time.Sunday {
            return nil // skip non-Sundays
        }
        return backupMgr.RunIntegrityCheck(ctx)
    })
}
```

> **Note on scheduler:** `Daily()` fires every day at the given time. To make it weekly, check the weekday inside the function. If you prefer, add a `Weekly()` method to the scheduler, but the weekday check is simpler and consistent with how prune already works.

### Task 1.3: Update example config

**Files:** `controller/configs/controller.yaml.example`

Update the `monitoring.ping_uuids` section to include the `heartbeat` and `backup_integrity` fields. Add comments explaining each.

### Task 1.4: Deprecation note for bash monitoring scripts

The following files in `deploy-felhom-compose/monitoring/` are **superseded** by the controller's built-in monitoring:
- `backup-healthcheck.sh` → replaced by `internal/monitor/healthcheck.go` (scheduler: `system-health`)
- `monitoring-setup.sh` → no longer needed (controller reads `controller.yaml` directly)
- `monitoring.conf.template` → replaced by the `controller.yaml` monitoring section
- `backup-healthcheck.service` / `.timer` → replaced by the controller's scheduler

**Action:** Add a `DEPRECATED.md` in `deploy-felhom-compose/monitoring/` explaining that these scripts are kept for reference only and should not be used on nodes running felhom-controller v0.4.0+. Do NOT delete the files yet: they may be needed if a customer is still on a pre-controller setup.

### Verification (Part 1)

After building and deploying v0.6.0 to demo-felhom:

1. Check controller logs: `docker logs felhom-controller --since 5m | grep -i "ping\|health\|heartbeat"`
2. Verify pings arrive at `status.felhom.eu`: `heartbeat` and `system-health` should show green within 10 minutes. The daily and weekly checks (`db-dump`, `backup`, `backup-integrity`) stay in the "new" state until their first scheduled run.
3. Test failure: `docker stop traefik`, wait 5 min, check that `system-health` goes red (protected container missing)
4. Restart traefik: `docker start traefik`, verify recovery

---

## Part 2: Central push to k3s (customer → operator reporting)

### Architecture

```
┌─────────────────────────┐     HTTPS POST /api/v1/report
│  Customer controller    │────────────────────────────────────────┐
│  (demo-felhom.eu)       │     every 15 min (configurable)        │
└─────────────────────────┘                                        ▼
                                                    ┌─────────────────────────────┐
┌─────────────────────────┐     HTTPS POST          │  felhom-hub                 │
│  Customer controller    │────────────────────────▶│  (k3s pod on dooplex.hu)    │
│  (customer-2)           │                         │                             │
└─────────────────────────┘                         │  - Receives reports         │
                                                    │  - Stores in SQLite         │
                                                    │  - Serves dashboard         │
                                                    │  - Alerts on stale reports  │
                                                    └─────────────────────────────┘
                                                             hub.felhom.eu
```

### Task 2.1: Define the report payload

The controller pushes a JSON summary every 15 minutes. This is **not** raw metrics; it is an aggregated health summary.
```json
{
  "version": 1,
  "customer_id": "demo-felhom",
  "customer_name": "Demo Ügyfél",
  "controller_version": "0.6.0",
  "timestamp": "2026-02-16T12:00:00Z",
  "system": {
    "hostname": "demo-felhom",
    "os": "Debian GNU/Linux 13 (trixie)",
    "kernel": "6.12.69+deb13-amd64",
    "cpu_model": "Intel N100",
    "cpu_cores": 4,
    "uptime_seconds": 345600,
    "cpu_percent": 12.5,
    "memory_total_mb": 15872,
    "memory_used_mb": 4200,
    "memory_percent": 26.5,
    "temperature_celsius": 48.0,
    "load_avg_1": 0.45,
    "load_avg_5": 0.38,
    "load_avg_15": 0.32
  },
  "storage": [
    { "mount": "/", "total_gb": 476.0, "used_gb": 28.5, "percent": 6.0 },
    { "mount": "/mnt/hdd_1", "total_gb": 931.0, "used_gb": 120.3, "percent": 12.9 }
  ],
  "containers": {
    "total": 16,
    "running": 14,
    "stopped": 2,
    "unhealthy": 0,
    "list": [
      { "name": "paperless-ngx-webserver-1", "state": "running", "cpu_percent": 2.1, "memory_mb": 350 },
      { "name": "traefik", "state": "running", "cpu_percent": 0.3, "memory_mb": 45 }
    ]
  },
  "backup": {
    "enabled": true,
    "last_db_dump": "2026-02-16T02:30:15Z",
    "last_snapshot": "2026-02-16T03:02:45Z",
    "snapshot_count": 42,
    "repo_size_mb": 2048,
    "last_integrity_check": "2026-02-08T04:00:00Z",
    "integrity_ok": true
  },
  "health": {
    "status": "ok",
    "issues": [],
    "warnings": []
  },
  "stacks": {
    "deployed": ["paperless-ngx", "immich", "jellyfin"],
    "available": ["nextcloud", "vaultwarden", "home-assistant"],
    "updates_available": 1
  }
}
```

### Task 2.2: Implement report builder in the controller

**New file:** `controller/internal/report/builder.go`

```go
package report

// Report is the JSON payload pushed to the central hub.
type Report struct {
    Version           int             `json:"version"`
    CustomerID        string          `json:"customer_id"`
    CustomerName      string          `json:"customer_name"`
    ControllerVersion string          `json:"controller_version"`
    Timestamp         time.Time       `json:"timestamp"`
    System            SystemReport    `json:"system"`
    Storage           []StorageReport `json:"storage"`
    Containers        ContainerReport `json:"containers"`
    Backup            BackupReport    `json:"backup"`
    Health            HealthReport    `json:"health"`
    Stacks            StacksReport    `json:"stacks"`
}

// BuildReport collects current state from all subsystems and returns a Report.
func BuildReport(cfg *config.Config, stackMgr *stacks.Manager, backupMgr *backup.Manager,
    cpuCollector *system.CPUCollector, pinger *monitor.Pinger, version string) *Report {
    // Gather system info from system.GetInfo()
    // Gather container info from stackMgr
    // Gather backup info from backupMgr.GetFullStatus()
    // Gather health from monitor.RunHealthCheck()
    // Gather stack list from stackMgr.GetStacks()
    // Return assembled Report
}
```

This function should call existing methods. **Do not duplicate logic.** Use the same data sources the dashboard and monitoring page already use.

### Task 2.3: Implement report pusher in the controller

**New file:** `controller/internal/report/pusher.go`

```go
package report

// Pusher sends reports to the central hub.
type Pusher struct {
    hubURL     string
    apiKey     string
    httpClient *http.Client
    logger     *log.Logger
    enabled    bool
}

// Push sends a report to the hub, retrying 3 times with 5s backoff.
// Failures are logged but never propagated: Push always returns nil,
// so push problems cannot affect controller operation.
func (p *Pusher) Push(report *Report) error {
    // JSON marshal
    // POST to hubURL + "/api/v1/report"
    // Header: Authorization: Bearer <apiKey>
    // Header: Content-Type: application/json
    // Retry on failure
    // Log but don't propagate errors
}
```

### Task 2.4: Add hub configuration to controller.yaml

**Files:** `internal/config/config.go`, `controller/configs/controller.yaml.example`

```yaml
# --- Central hub (operator dashboard) ---
hub:
  enabled: false                  # Enable central reporting
  url: "https://hub.felhom.eu"    # Hub API endpoint
  api_key: ""                     # Shared secret for authentication
  push_interval: "15m"            # How often to push reports
```

```go
type HubConfig struct {
    Enabled      bool   `yaml:"enabled"`
    URL          string `yaml:"url"`
    APIKey       string `yaml:"api_key"`
    PushInterval string `yaml:"push_interval"`
}
```

Add `` Hub HubConfig `yaml:"hub"` `` to the main `Config` struct.

### Task 2.5: Wire the pusher into main.go

```go
// --- Central hub reporting ---
if cfg.Hub.Enabled && cfg.Hub.URL != "" {
    pushInterval, err := time.ParseDuration(cfg.Hub.PushInterval)
    if err != nil {
        pushInterval = 15 * time.Minute
    }
    pusher := report.NewPusher(&cfg.Hub, logger)
    sched.Every("hub-report", pushInterval, func(ctx context.Context) error {
        r := report.BuildReport(cfg, stackMgr, backupMgr, cpuCollector, pinger, version)
        return pusher.Push(r)
    })
    logger.Printf("[INFO] Hub reporting enabled (every %s to %s)", pushInterval, cfg.Hub.URL)
}
```

### Verification (Part 2)

1. Set `hub.enabled: true` and `hub.url` to a temporary endpoint (e.g., `https://webhook.site/...`) in demo-felhom's `controller.yaml`
2. Restart the controller, check logs for "Hub reporting enabled"
3. Wait 15 min (or set `push_interval: "1m"` for testing), verify JSON arrives at the endpoint
4. Validate that the JSON structure matches the spec above
5. Reset `push_interval` to `"15m"` after testing

---

## Part 3: Hub service on k3s (operator side)

### Overview

The hub is a lightweight Go service deployed on Viktor's k3s cluster in the `felhom-system` namespace.
It receives reports from customer controllers, stores them in SQLite, and serves an English-language dashboard for Viktor.

- **Domain:** `hub.felhom.eu` (Nginx Ingress, cert-manager TLS)
- **Namespace:** `felhom-system` (alongside Healthchecks and other felhom infra)
- **Code:** `felhom.eu` repo on Gitea, `hub/` subfolder

### Task 3.1: Hub service (subfolder in felhom.eu repository)

The hub lives in the existing `felhom.eu` repository on Gitea as a `hub/` subfolder. It is deployed to the k3s cluster in the `felhom-system` namespace (alongside Healthchecks and other felhom infra). Its K8s manifests live in the same repository under `manifests/`, not in `homelab-manifests`.

**Structure (inside felhom.eu repo):**

```
hub/
├── cmd/hub/main.go              # Entry point
├── internal/
│   ├── api/
│   │   └── handler.go           # POST /api/v1/report, GET /api/v1/customers
│   ├── store/
│   │   └── store.go             # SQLite: save reports, query latest per customer
│   └── web/
│       ├── server.go            # Dashboard HTTP server
│       ├── templates/
│       │   ├── dashboard.html   # Multi-customer overview (English)
│       │   ├── customer.html    # Single customer detail (English)
│       │   └── style.css        # Dark theme matching felhom.eu
│       └── embed.go
├── configs/
│   └── hub.yaml.example
├── Dockerfile
├── Makefile
└── go.mod
```

K8s manifests in `felhom.eu/manifests/` (alongside healthchecks.yaml, webpage.yaml, etc.):

```
manifests/hub.yaml               # Deployment, Service, Ingress, PVC
```

### Task 3.2: Hub API endpoints

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| `POST` | `/api/v1/report` | Bearer token | Receive customer report (JSON body) |
| `GET` | `/api/v1/customers` | Session/Basic | List all customers with latest status |
| `GET` | `/api/v1/customers/{id}` | Session/Basic | Get latest report for a customer |
| `GET` | `/api/v1/customers/{id}/history` | Session/Basic | Get report history (last 24h/7d/30d) |
| `GET` | `/` | Session/Basic | Dashboard HTML page |
| `GET` | `/customers/{id}` | Session/Basic | Customer detail HTML page |

**Authentication:**
- Report ingest: Bearer token (shared secret per customer, or a single hub-wide key for simplicity)
- Dashboard: Basic auth or simple password (Viktor only); reuse the same bcrypt approach as the controller

### Task 3.3: Hub SQLite schema

```sql
CREATE TABLE IF NOT EXISTS reports (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    customer_id TEXT NOT NULL,
    received_at DATETIME NOT NULL DEFAULT (datetime('now')),
    report_json TEXT NOT NULL,       -- Full JSON payload
    -- Denormalized fields for fast queries:
    health_status TEXT,              -- "ok", "warn", "fail"
    cpu_percent REAL,
    memory_percent REAL,
    container_total INTEGER,
    container_running INTEGER,
    backup_last_snapshot DATETIME,
    controller_version TEXT
);

CREATE INDEX IF NOT EXISTS idx_reports_customer
    ON reports(customer_id, received_at DESC);

-- Prune old reports: keep 30 days of history
-- Run daily: DELETE FROM reports WHERE received_at < datetime('now', '-30 days');
```

### Task 3.4: Hub dashboard UI (English)

**Overview page (`/`):**

A table/grid showing all customers at a glance:

| Customer | Status | Last seen | CPU | Memory | Disk | Containers | Last backup | Version |
|----------|--------|-----------|-----|--------|------|------------|-------------|---------|
| 🟢 Demo Ügyfél | OK | 2 min ago | 12% | 26% | 6%/13% | 14/16 | 3h ago | 0.6.0 |
| 🟡 Kovács Péter | WARN | 18 min ago | 45% | 78% | 82% ⚠️ | 8/8 | 4h ago | 0.5.4 |
| 🔴 Nagy Anna | DOWN | 2h ago | – | – | – | – | 26h ago ⚠️ | 0.5.4 |

**Color coding:**
- 🟢 Green: last seen < 30 min AND health = "ok"
- 🟡 Yellow: last seen < 30 min AND health = "warn", OR last seen 30-60 min
- 🔴 Red: last seen > 60 min OR health = "fail"

**Customer detail page (`/customers/{id}`):**
- Last report timestamp
- Full system info section (same layout as the controller's monitoring page)
- Container list with CPU/memory
- Backup status details
- Health issues/warnings
- Report history (collapsible list, last 24h)

**Design:** English language. Dark theme matching felhom.eu / the controller dashboard.
Use the same CSS variables and fonts.

### Task 3.5: Hub Kubernetes manifests

**File:** `felhom.eu/manifests/hub.yaml` (alongside `healthchecks.yaml`, `webpage.yaml`, etc.)

```yaml
# Namespace: felhom-system (shared with healthchecks and other felhom infra)
# Deployment: 1 replica, 64Mi-256Mi memory
# Service: ClusterIP port 8080
# PVC: 1Gi for SQLite (Longhorn)
# Ingress: hub.felhom.eu via nginx-internal, cert-manager TLS
# Auth: same geo-restriction as other dooplex.hu services (HU only)
```

**ConfigMap** for `hub.yaml` config:

```yaml
auth:
  password_hash: ""          # bcrypt hash, same approach as controller
api:
  report_api_key: ""         # Bearer token for report ingest
retention:
  max_days: 30               # Keep 30 days of report history (matches the schema in Task 3.3)
  prune_schedule: "04:30"    # Daily prune
alerting:
  stale_threshold: "30m"     # Alert if customer not seen for 30 min
```

### Task 3.6: Alerting (optional, future enhancement)

When a customer is "stale" (no report for > 30 min), the hub could:
- Send a webhook to Healthchecks (one "customer-X-reporting" check per customer)
- Send email via Resend
- Push to Telegram

For v0.6.0 scope: just show the status on the dashboard. Alerting can be added in v0.6.1.

---

## Part 4: Manual steps for Viktor (demo-felhom setup)

These steps must be done by Viktor manually, because Claude Code cannot access status.felhom.eu or the demo-felhom server.

### 4.1: Create Healthchecks checks on status.felhom.eu

1. Log into `status.felhom.eu`
2. Open the "demo-felhom" project
3. Create 5 checks with the settings from the table in Part 0
4. Copy the ping UUIDs for each check

### 4.2: Update controller.yaml on demo-felhom

SSH into demo-felhom and update `/opt/docker/felhom-controller/controller.yaml`, filling each empty `ping_uuids` value with the UUID copied in step 4.1:

```yaml
monitoring:
  enabled: true
  healthchecks_base: "https://status.felhom.eu"
  ping_uuids:
    heartbeat: ""
    system_health: ""
    db_dump: ""
    backup: ""
    backup_integrity: ""
  system_health_interval: "5m"
  health_check_schedule: "06:00"
  thresholds:
    disk_warn_percent: 80
    disk_crit_percent: 90
    backup_max_age_hours: 36
    cpu_warn_percent: 90
    memory_warn_percent: 85
    temperature_warn_celsius: 75
```

### 4.3: Restart controller

```bash
cd /opt/docker/felhom-controller
docker compose pull
docker compose up -d
docker logs -f felhom-controller --since 1m
```

### 4.4: Verify pings

Wait 5 to 10 minutes, then check `status.felhom.eu`: `heartbeat` and `system-health` should be green. The daily and weekly checks (`db-dump`, `backup`, `backup-integrity`) stay in the "new" state until their first scheduled run.

### 4.5: Deploy hub to k3s (after Part 3 is built)

```bash
# Build and push hub image (from felhom.eu repo, hub/ subfolder)
cd hub && make docker-push

# Apply k8s manifests (from felhom.eu repo, manifests/ folder)
kubectl apply -f manifests/hub.yaml

# Configure hub.felhom.eu DNS in Cloudflare
# Update demo-felhom controller.yaml with hub config
```

---

## Implementation order

1. **Part 1** (controller-side, in `deploy-felhom-compose` repo):
   - Task 1.1: Heartbeat ping (5 min)
   - Task 1.2: Backup integrity check (20 min)
   - Task 1.3: Update example config (5 min)
   - Task 1.4: Deprecation note for bash scripts (5 min)
2. **Part 4.1–4.4** (Viktor manual: create checks, configure UUIDs, verify)
3. **Part 2** (controller-side, report push):
   - Task 2.1: Report payload types (10 min)
   - Task 2.2: Report builder (30 min)
   - Task 2.3: Report pusher (15 min)
   - Task 2.4: Hub config in controller.yaml (10 min)
   - Task 2.5: Wire into main.go (5 min)
4. **Part 3** (hub in `felhom.eu` repo, k8s manifests in `felhom.eu/manifests/`):
   - Task 3.1: Project scaffold in `hub/` subfolder (10 min)
   - Task 3.2: API handlers (30 min)
   - Task 3.3: SQLite store (20 min)
   - Task 3.4: Dashboard UI, English (60 min)
   - Task 3.5: K8s manifests in `felhom.eu/manifests/` (20 min)
5. **Part 4.5** (Viktor manual: deploy hub, wire everything)

---

## Files to modify (controller repo)

```
controller/cmd/controller/main.go            # heartbeat job, integrity job, hub pusher
controller/internal/config/config.go         # PingUUIDsConfig + HubConfig
controller/internal/backup/backup.go         # RunIntegrityCheck()
controller/internal/backup/restic.go         # Check() method (verify/add)
controller/internal/report/builder.go        # NEW: report assembly
controller/internal/report/pusher.go         # NEW: HTTP push client
controller/internal/report/types.go          # NEW: Report struct definitions
controller/configs/controller.yaml.example   # updated monitoring + new hub section
monitoring/DEPRECATED.md                     # NEW: deprecation notice for bash scripts
```

## Files to create (hub, in felhom.eu repo)

```
hub/cmd/hub/main.go
hub/internal/api/handler.go
hub/internal/store/store.go
hub/internal/web/server.go
hub/internal/web/templates/dashboard.html
hub/internal/web/templates/customer.html
hub/internal/web/templates/style.css
hub/internal/web/embed.go
hub/configs/hub.yaml.example
hub/Dockerfile
hub/Makefile
hub/go.mod
hub/README.md
```

## Files to create (k8s manifests, in felhom.eu repo)

```
manifests/hub.yaml
```

---

## Verification checklist

- [ ] Heartbeat ping arrives every 5 min at status.felhom.eu
- [ ] System health ping arrives every 5 min with diagnostic body
- [ ] DB dump ping arrives daily at ~02:30
- [ ] Backup ping arrives daily at ~03:00
- [ ] Backup integrity ping arrives weekly on Sunday ~04:00
- [ ] Stopping a protected container triggers system-health FAIL
- [ ] Controller logs show "Hub reporting enabled" when hub.enabled=true
- [ ] Hub receives JSON reports from controller
- [ ] Hub dashboard shows demo-felhom with green status
- [ ] Hub dashboard shows "last seen: X min ago" updating correctly
- [ ] Hub shows red status when controller is stopped for > 60 min
- [ ] Hub SQLite prunes old reports automatically
- [ ] All UUIDs are configurable (empty/CHANGEME = silently skipped)

---

## CONTEXT.md update (after completion)

Add to the "What was just completed" section:

```
### What was just completed (session N)
- **v0.6.0: Healthcheck Implementation + Central Push + Hub Dashboard:**
  - **Healthcheck pings fully operational:** 5 check types (heartbeat, system-health, db-dump, backup, backup-integrity) configured on demo-felhom, all pinging status.felhom.eu
  - **Backup integrity check:** Weekly `restic check` with Healthchecks ping
  - **Central hub reporting:** Controller pushes JSON health summary every 15 min to hub.felhom.eu
  - **felhom-hub service:** New Go service in felhom.eu repo (`hub/` subfolder), k8s manifests in `felhom.eu/manifests/hub.yaml`, deployed on k3s in felhom-system namespace, SQLite storage, English multi-customer dashboard
  - **Deprecated:** Legacy bash monitoring scripts (backup-healthcheck.sh, monitoring-setup.sh) superseded by controller-native monitoring
```

Also update the repository distinction in CONTEXT.md:

```
## Repository & manifest layout
- **homelab-manifests**: Viktor's personal k3s apps (*.dooplex.hu): mon-system, servarr, pihole, etc.
- **felhom.eu**: Everything felhom-related:
  - `website/`: felhom.eu public website HTML
  - `manifests/`: k8s manifests for felhom infra in felhom-system namespace (webpage, healthchecks, contact-mailer, umami, hub, felhom.secret)
  - `hub/`: felhom-hub Go service (central multi-customer dashboard)
- **deploy-felhom-compose**: Customer-side: felhom-controller code, deploy scripts, monitoring scripts
- **app-catalog-felhom.eu**: Docker Compose templates for customer apps
```