TASK.md — v0.6.0: Healthcheck Implementation + Central Push + Multi-Customer Dashboard
Version: v0.6.0
Depends on: v0.5.4 (current)
Repo: deploy-felhom-compose (controller/ subfolder)
Build: ~/build/felhom-controller/build.sh 0.6.0 --push
Deploy target: demo-felhom.eu (N100) + k3s cluster (dooplex.hu)
Context
The controller already has health monitoring infrastructure built in v0.4.0:
- internal/monitor/pinger.go: Healthchecks.io-compatible HTTP ping client (success/fail/start, retries)
- internal/monitor/healthcheck.go: system health checks (disk, memory, CPU, temp, Docker, protected containers)
- Scheduler jobs in main.go: system-health (every 5m), db-dump (daily), backup (daily)
- Backup manager already calls pinger.Ping() / pinger.Fail() after each operation
Problem: The demo-felhom Healthchecks project has zero checks created (screenshot confirms empty project at status.felhom.eu/projects/.../checks/). The controller.yaml on demo-felhom has all CHANGEME placeholder UUIDs. Nothing is actually pinging.
Additionally, there are legacy bash scripts (backup-healthcheck.sh, monitoring-setup.sh) from the pre-controller era that duplicate functionality now built into the controller. These should be deprecated in favor of controller-native pings.
This version has two major parts:
- Prerequisite: Get healthchecks actually working on demo-felhom (create checks, configure UUIDs, verify pings)
- New feature: Central push from customer controllers to k3s + multi-customer overview dashboard
Part 0: Healthcheck Ping Design (controller.yaml schema update)
Current ping types (already implemented in code)
| Ping | Schedule | Source | What it proves |
|---|---|---|---|
| system_health | Every 5 min | monitor.RunHealthCheck() | Server alive, Docker running, disks OK, protected containers up, CPU/mem/temp within thresholds |
| db_dump | Daily 02:30 | backup.RunDBDumps() | Database dumps completed successfully |
| backup | Daily 03:00 | backup.RunBackup() | Restic snapshot completed successfully |
New ping types to add
| Ping | Schedule | Source | What it proves |
|---|---|---|---|
| backup_integrity | Weekly (Sunday 04:00) | New: backup.RunIntegrityCheck() | Restic repo passes restic check, so the data is not corrupted |
| heartbeat | Every 5 min | New: lightweight HTTP POST, no logic | Controller process is alive (distinct from system_health, which does heavy checks and could fail due to a bug while the controller itself is fine) |
Revised controller.yaml monitoring section
monitoring:
  enabled: true
  healthchecks_base: "https://status.felhom.eu"
  ping_uuids:
    heartbeat: ""          # NEW: every 5 min, controller alive
    system_health: ""      # existing: every 5 min, comprehensive check
    db_dump: ""            # existing: daily after db dumps
    backup: ""             # existing: daily after restic snapshot
    backup_integrity: ""   # NEW: weekly after restic check
  system_health_interval: "5m"
  health_check_schedule: "06:00"
  thresholds:
    disk_warn_percent: 80
    disk_crit_percent: 90
    backup_max_age_hours: 36
    cpu_warn_percent: 90
    memory_warn_percent: 85
    temperature_warn_celsius: 75
Note: Empty string and "CHANGEME..." UUIDs are both skipped by the pinger (already implemented). This means any check can be left unconfigured — the controller just skips it silently.
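The skip rule above can be sketched as a small predicate plus a URL builder. This is only an illustration, not the code in internal/monitor/pinger.go: the function names are hypothetical, the UUID is a made-up example, and the `/ping/<uuid>` path is an assumption about how the self-hosted Healthchecks instance exposes its ping endpoint.

```go
package main

import (
	"fmt"
	"strings"
)

// isConfigured reports whether a ping UUID should be used.
// Empty strings and "CHANGEME..." placeholders count as unconfigured.
func isConfigured(uuid string) bool {
	return uuid != "" && !strings.HasPrefix(uuid, "CHANGEME")
}

// pingURL builds the URL for a success ("") or "fail"/"start" signal.
// Assumption: pings are served under <base>/ping/<uuid>[/<signal>].
func pingURL(base, uuid, signal string) (string, bool) {
	if !isConfigured(uuid) {
		return "", false // caller skips the ping silently
	}
	url := strings.TrimRight(base, "/") + "/ping/" + uuid
	if signal != "" {
		url += "/" + signal
	}
	return url, true
}

func main() {
	base := "https://status.felhom.eu"
	// The last UUID is a made-up example value.
	uuids := []string{"", "CHANGEME-heartbeat", "3f2b8c1e-0000-4000-8000-000000000000"}
	for _, uuid := range uuids {
		if url, ok := pingURL(base, uuid, "fail"); ok {
			fmt.Println("would ping:", url)
		} else {
			fmt.Printf("skipping unconfigured UUID %q\n", uuid)
		}
	}
}
```

Keeping the check in one place means every job (heartbeat, backup, integrity) gets the same "unconfigured means skip" behavior without repeating the condition.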
Healthchecks check configuration (to be created manually on status.felhom.eu)
For each customer project, create these checks:
| Check name | Period | Grace | Tags |
|---|---|---|---|
| heartbeat | 5 minutes | 10 minutes | heartbeat |
| system-health | 5 minutes | 10 minutes | system, health |
| db-dump | 1 day (02:30 CET) | 30 minutes | backup, db |
| backup | 1 day (03:00 CET) | 60 minutes | backup, restic |
| backup-integrity | 7 days | 24 hours | backup, integrity |
Part 1: Controller-side healthcheck implementation
Task 1.1: Add heartbeat ping
Files: cmd/controller/main.go
Add a new scheduler job — the simplest possible ping, no health check logic:
// Heartbeat: lightweight "I'm alive" signal
sched.Every("heartbeat", 5*time.Minute, func(ctx context.Context) error {
	pinger.Ping(cfg.Monitoring.PingUUIDs.Heartbeat, "")
	return nil
})
Files: internal/config/config.go
Add Heartbeat and BackupIntegrity fields to PingUUIDsConfig:
type PingUUIDsConfig struct {
	Heartbeat       string `yaml:"heartbeat"`        // new
	DBDump          string `yaml:"db_dump"`
	Backup          string `yaml:"backup"`
	SystemHealth    string `yaml:"system_health"`
	BackupIntegrity string `yaml:"backup_integrity"` // new
}
Task 1.2: Add backup integrity check
Files: internal/backup/restic.go
Add a Check() method (may already exist as part of prune logic — verify first):
// Check runs `restic check` to verify repository integrity.
func (r *ResticRunner) Check() error {
args := []string{"check", "--repo", r.repo, "--json"}
// ... standard exec with password file, timeout 30 min
}
Files: internal/backup/backup.go
Add RunIntegrityCheck():
// RunIntegrityCheck runs restic check and pings healthchecks with the result.
func (m *Manager) RunIntegrityCheck(ctx context.Context) error {
	err := m.restic.Check()
	uuid := m.cfg.Monitoring.PingUUIDs.BackupIntegrity
	if err != nil {
		m.pinger.Fail(uuid, fmt.Sprintf("restic check failed: %v", err))
		return err
	}
	m.pinger.Ping(uuid, "restic check passed")
	return nil
}
Files: cmd/controller/main.go
Register the weekly job:
if cfg.Backup.Enabled && backupMgr != nil {
	// ... existing daily jobs ...
	// Weekly integrity check: Sunday 04:00
	sched.Daily("backup-integrity", "04:00", func(ctx context.Context) error {
		if time.Now().Weekday() != time.Sunday {
			return nil // skip non-Sundays
		}
		return backupMgr.RunIntegrityCheck(ctx)
	})
}
Note on scheduler: Daily() fires every day at the given time. To make it weekly, check the weekday inside the function. If you prefer, add a Weekly() method to the scheduler, but the weekday check is simpler and consistent with how prune already works.
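If the weekday check ends up in more than one job, it could be factored into a small guard. The sketch below is one possible shape, not the scheduler's actual API: `runIfWeekday`, `integrityDay` and `date` are hypothetical names, and passing `now` explicitly is a choice made here so the behavior can be tested without waiting for a Sunday.

```go
package main

import (
	"fmt"
	"time"
)

// integrityDay is the weekday on which the integrity check should run.
const integrityDay = time.Sunday

// runIfWeekday calls job only when now falls on day; otherwise it is a no-op.
func runIfWeekday(now time.Time, day time.Weekday, job func() error) error {
	if now.Weekday() != day {
		return nil // not the scheduled weekday, skip quietly
	}
	return job()
}

// date builds a timestamp at 04:00 UTC on the given calendar day.
func date(year, month, day int) time.Time {
	return time.Date(year, time.Month(month), day, 4, 0, 0, 0, time.UTC)
}

func main() {
	job := func() error {
		fmt.Println("  running restic check")
		return nil
	}
	// 2026-02-15 is a Sunday, 2026-02-16 a Monday.
	for _, now := range []time.Time{date(2026, 2, 15), date(2026, 2, 16)} {
		fmt.Println(now.Format("Mon 2006-01-02"))
		if err := runIfWeekday(now, integrityDay, job); err != nil {
			fmt.Println("  job failed:", err)
		}
	}
}
```

Inside the Daily() callback this would be called with time.Now(), which keeps the job registration unchanged.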
Task 1.3: Update example config
Files: controller/configs/controller.yaml.example
Update the monitoring.ping_uuids section to include heartbeat and backup_integrity fields. Add comments explaining each.
Task 1.4: Deprecation note for bash monitoring scripts
The following files in deploy-felhom-compose/monitoring/ are superseded by the controller's built-in monitoring:
- backup-healthcheck.sh → replaced by internal/monitor/healthcheck.go (scheduler: system-health)
- monitoring-setup.sh → no longer needed (controller reads controller.yaml directly)
- monitoring.conf.template → replaced by controller.yaml monitoring section
- backup-healthcheck.service / .timer → replaced by controller's scheduler
Action: Add a DEPRECATED.md in deploy-felhom-compose/monitoring/ explaining that these scripts are kept for reference only and should not be used on nodes running felhom-controller v0.4.0+. Do NOT delete the files yet — they may be needed if a customer is still on a pre-controller setup.
Verification (Part 1)
After building and deploying v0.6.0 to demo-felhom:
- Check controller logs: docker logs felhom-controller --since 5m 2>&1 | grep -i "ping\|health\|heartbeat"
- Verify pings arrive at status.felhom.eu: heartbeat and system-health should show green within 10 minutes; db-dump, backup and backup-integrity turn green only after their first scheduled run
- Test failure: docker stop traefik, wait 5 min, check that system-health goes red (protected container missing)
- Restart traefik: docker start traefik, verify recovery
Part 2: Central push to k3s (customer → operator reporting)
Architecture
┌─────────────────────────┐ HTTPS POST /api/v1/report
│ Customer controller │────────────────────────────────────────┐
│ (demo-felhom.eu) │ every 15 min (configurable) │
└─────────────────────────┘ ▼
┌─────────────────────────────┐
┌─────────────────────────┐ HTTPS POST │ felhom-hub │
│ Customer controller │────────────────────────▶│ (k3s pod on dooplex.hu) │
│ (customer-2) │ │ │
└─────────────────────────┘ │ - Receives reports │
│ - Stores in SQLite │
│ - Serves dashboard │
│ - Alerts on stale reports │
└─────────────────────────────┘
hub.felhom.eu
Task 2.1: Define the report payload
The controller pushes a JSON summary every 15 minutes. This is not raw metrics — it's an aggregated health summary.
{
"version": 1,
"customer_id": "demo-felhom",
"customer_name": "Demo Ügyfél",
"controller_version": "0.6.0",
"timestamp": "2026-02-16T12:00:00Z",
"system": {
"hostname": "demo-felhom",
"os": "Debian GNU/Linux 13 (trixie)",
"kernel": "6.12.69+deb13-amd64",
"cpu_model": "Intel N100",
"cpu_cores": 4,
"uptime_seconds": 345600,
"cpu_percent": 12.5,
"memory_total_mb": 15872,
"memory_used_mb": 4200,
"memory_percent": 26.5,
"temperature_celsius": 48.0,
"load_avg_1": 0.45,
"load_avg_5": 0.38,
"load_avg_15": 0.32
},
"storage": [
{ "mount": "/", "total_gb": 476.0, "used_gb": 28.5, "percent": 6.0 },
{ "mount": "/mnt/hdd_1", "total_gb": 931.0, "used_gb": 120.3, "percent": 12.9 }
],
"containers": {
"total": 16,
"running": 14,
"stopped": 2,
"unhealthy": 0,
"list": [
{ "name": "paperless-ngx-webserver-1", "state": "running", "cpu_percent": 2.1, "memory_mb": 350 },
{ "name": "traefik", "state": "running", "cpu_percent": 0.3, "memory_mb": 45 }
]
},
"backup": {
"enabled": true,
"last_db_dump": "2026-02-16T02:30:15Z",
"last_snapshot": "2026-02-16T03:02:45Z",
"snapshot_count": 42,
"repo_size_mb": 2048,
"last_integrity_check": "2026-02-15T04:00:00Z",
"integrity_ok": true
},
"health": {
"status": "ok",
"issues": [],
"warnings": ["Disk /mnt/hdd_1 at 82%"]
},
"stacks": {
"deployed": ["paperless-ngx", "immich", "jellyfin"],
"available": ["nextcloud", "vaultwarden", "home-assistant"],
"updates_available": 1
}
}
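The nested sections of the payload map naturally onto small Go structs. The sketch below covers only the `storage` entries and the `health` object, with field names derived from the JSON keys above; the struct names and the `encode` helper are illustrative assumptions, not existing code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// StorageReport mirrors one entry of the "storage" array.
type StorageReport struct {
	Mount   string  `json:"mount"`
	TotalGB float64 `json:"total_gb"`
	UsedGB  float64 `json:"used_gb"`
	Percent float64 `json:"percent"`
}

// HealthReport mirrors the "health" object.
type HealthReport struct {
	Status   string   `json:"status"` // "ok", "warn" or "fail"
	Issues   []string `json:"issues"`
	Warnings []string `json:"warnings"`
}

// encode marshals v to a compact JSON string (illustrative helper).
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(encode(StorageReport{Mount: "/", TotalGB: 476, UsedGB: 28.5, Percent: 6}))

	// Nil slices marshal as null, so initialize them to get [] as in the payload.
	fmt.Println(encode(HealthReport{Status: "ok"}))
	fmt.Println(encode(HealthReport{Status: "ok", Issues: []string{}, Warnings: []string{}}))
}
```

The nil-versus-empty slice difference matters if the hub or dashboard treats `null` and `[]` differently, so the builder should probably always emit empty slices.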
Task 2.2: Implement report builder in the controller
New file: controller/internal/report/builder.go
package report
// Report is the JSON payload pushed to the central hub.
type Report struct {
Version int `json:"version"`
CustomerID string `json:"customer_id"`
CustomerName string `json:"customer_name"`
ControllerVersion string `json:"controller_version"`
Timestamp time.Time `json:"timestamp"`
System SystemReport `json:"system"`
Storage []StorageReport `json:"storage"`
Containers ContainerReport `json:"containers"`
Backup BackupReport `json:"backup"`
Health HealthReport `json:"health"`
Stacks StacksReport `json:"stacks"`
}
// BuildReport collects current state from all subsystems and returns a Report.
func BuildReport(cfg *config.Config, stackMgr *stacks.Manager,
backupMgr *backup.Manager, cpuCollector *system.CPUCollector,
pinger *monitor.Pinger, version string) *Report {
// Gather system info from system.GetInfo()
// Gather container info from stackMgr
// Gather backup info from backupMgr.GetFullStatus()
// Gather health from monitor.RunHealthCheck()
// Gather stack list from stackMgr.GetStacks()
// Return assembled Report
}
This function should call existing methods — do not duplicate logic. Use the same data sources the dashboard and monitoring page already use.
Task 2.3: Implement report pusher in the controller
New file: controller/internal/report/pusher.go
package report
// Pusher sends reports to the central hub.
type Pusher struct {
hubURL string
apiKey string
httpClient *http.Client
logger *log.Logger
enabled bool
}
// Push sends a report to the hub. Retries 3 times with 5s backoff.
// Failures are logged and not propagated: Push always returns nil,
// so a hub outage cannot affect controller operation.
func (p *Pusher) Push(report *Report) error {
// JSON marshal
// POST to hubURL + "/api/v1/report"
// Header: Authorization: Bearer <apiKey>
// Header: Content-Type: application/json
// Retry on failure
// Log but don't propagate errors
}
Task 2.4: Add hub configuration to controller.yaml
Files: internal/config/config.go, controller/configs/controller.yaml.example
# --- Central hub (operator dashboard) ---
hub:
  enabled: false                 # Enable central reporting
  url: "https://hub.felhom.eu"   # Hub API endpoint
  api_key: ""                    # Shared secret for authentication
  push_interval: "15m"           # How often to push reports
type HubConfig struct {
Enabled bool `yaml:"enabled"`
URL string `yaml:"url"`
APIKey string `yaml:"api_key"`
PushInterval string `yaml:"push_interval"`
}
Add Hub HubConfig `yaml:"hub"` to the main Config struct.
Task 2.5: Wire the pusher into main.go
// --- Central hub reporting ---
if cfg.Hub.Enabled && cfg.Hub.URL != "" {
	pushInterval, err := time.ParseDuration(cfg.Hub.PushInterval)
	if err != nil {
		pushInterval = 15 * time.Minute
	}
	pusher := report.NewPusher(&cfg.Hub, logger)
	sched.Every("hub-report", pushInterval, func(ctx context.Context) error {
		r := report.BuildReport(cfg, stackMgr, backupMgr, cpuCollector, pinger, version)
		return pusher.Push(r)
	})
	logger.Printf("[INFO] Hub reporting enabled (every %s to %s)", pushInterval, cfg.Hub.URL)
}
Verification (Part 2)
- Set hub.enabled: true and hub.url to a temporary endpoint (e.g., https://webhook.site/...) in demo-felhom's controller.yaml
- Restart controller, check logs for "Hub reporting enabled"
- Wait 15 min (or set push_interval: "1m" for testing), verify JSON arrives at the endpoint
- Validate JSON structure matches the spec above
- Reset push_interval to "15m" after testing
Part 3: Hub service on k3s (operator side)
Overview
The hub is a lightweight Go service deployed on Viktor's k3s cluster in the felhom-system namespace. It receives reports from customer controllers, stores them in SQLite, and serves an English-language dashboard for Viktor.
Domain: hub.felhom.eu (Nginx Ingress, cert-manager TLS)
Namespace: felhom-system (alongside Healthchecks and other felhom infra)
Code: felhom.eu repo on Gitea, hub/ subfolder
Task 3.1: Hub service (subfolder in felhom.eu repository)
The hub lives in the existing felhom.eu repository on Gitea as a hub/ subfolder. It's deployed to the k3s cluster in the felhom-system namespace (alongside Healthchecks and other felhom infra). K8s manifests go in felhom.eu/manifests/, next to the other felhom infra manifests.
Structure (inside felhom.eu repo):
hub/
├── cmd/hub/main.go # Entry point
├── internal/
│ ├── api/
│ │ └── handler.go # POST /api/v1/report, GET /api/v1/customers
│ ├── store/
│ │ └── store.go # SQLite: save reports, query latest per customer
│ └── web/
│ ├── server.go # Dashboard HTTP server
│ ├── templates/
│ │ ├── dashboard.html # Multi-customer overview (English)
│ │ ├── customer.html # Single customer detail (English)
│ │ └── style.css # Dark theme matching felhom.eu
│ └── embed.go
├── configs/
│ └── hub.yaml.example
├── Dockerfile
├── Makefile
└── go.mod
K8s manifests in felhom.eu/manifests/ (alongside healthchecks.yaml, webpage.yaml, etc.):
manifests/hub.yaml # Deployment, Service, Ingress, PVC
Task 3.2: Hub API endpoints
| Method | Path | Auth | Description |
|---|---|---|---|
| POST | /api/v1/report | Bearer token | Receive customer report (JSON body) |
| GET | /api/v1/customers | Session/Basic | List all customers with latest status |
| GET | /api/v1/customers/{id} | Session/Basic | Get latest report for a customer |
| GET | /api/v1/customers/{id}/history | Session/Basic | Get report history (last 24h/7d/30d) |
| GET | / | Session/Basic | Dashboard HTML page |
| GET | /customers/{id} | Session/Basic | Customer detail HTML page |
Authentication:
- Report ingest: Bearer token (shared secret per customer, or a single hub-wide key for simplicity)
- Dashboard: Basic auth or simple password (Viktor only) — reuse the same bcrypt approach as the controller
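For the report ingest endpoint, the bearer check could be a thin middleware in front of the handler. The following is a sketch assuming the single hub-wide key variant; `requireBearer`, `statusFor`, the token value and the 202 response code are illustrative assumptions, not decisions made in this spec.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// requireBearer wraps next and rejects requests whose Authorization header
// does not carry the expected token. An empty configured token rejects everyone.
func requireBearer(token string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		header := r.Header.Get("Authorization")
		got := strings.TrimPrefix(header, "Bearer ")
		ok := strings.HasPrefix(header, "Bearer ") && token != "" &&
			subtle.ConstantTimeCompare([]byte(got), []byte(token)) == 1
		if !ok {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor sends one POST /api/v1/report with the given Authorization
// header through the middleware and returns the response status code.
func statusFor(header string) int {
	ingest := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusAccepted) // placeholder for the real ingest handler
	})
	req := httptest.NewRequest(http.MethodPost, "/api/v1/report", strings.NewReader("{}"))
	if header != "" {
		req.Header.Set("Authorization", header)
	}
	rec := httptest.NewRecorder()
	requireBearer("s3cret", ingest).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor("Bearer s3cret")) // 202
	fmt.Println(statusFor("Bearer wrong"))  // 401
	fmt.Println(statusFor(""))              // 401
}
```

The constant-time comparison avoids leaking how many leading characters of the key matched; with per-customer keys the same wrapper would look the token up by customer instead of comparing against one value.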
Task 3.3: Hub SQLite schema
CREATE TABLE IF NOT EXISTS reports (
id INTEGER PRIMARY KEY AUTOINCREMENT,
customer_id TEXT NOT NULL,
received_at DATETIME NOT NULL DEFAULT (datetime('now')),
report_json TEXT NOT NULL, -- Full JSON payload
-- Denormalized fields for fast queries:
health_status TEXT, -- "ok", "warn", "fail"
cpu_percent REAL,
memory_percent REAL,
container_total INTEGER,
container_running INTEGER,
backup_last_snapshot DATETIME,
controller_version TEXT
);
CREATE INDEX IF NOT EXISTS idx_reports_customer ON reports(customer_id, received_at DESC);
-- Prune old reports: keep 30 days of history
-- Run daily: DELETE FROM reports WHERE received_at < datetime('now', '-30 days');
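Rather than hard-coding the retention window in the SQL text, the daily prune job could bind it as a parameter. This is a sketch under assumptions: `pruneQuery` and `pruneModifier` are hypothetical names, the window would come from the hub's retention setting, and treating a non-positive value as "pruning disabled" is a choice made here, not something the spec states.

```go
package main

import "fmt"

// pruneQuery deletes reports older than the bound SQLite date modifier,
// for example "-30 days".
const pruneQuery = `DELETE FROM reports WHERE received_at < datetime('now', ?)`

// pruneModifier turns a retention window in days into the modifier bound
// to pruneQuery. Non-positive values disable pruning (assumption).
func pruneModifier(maxDays int) (string, bool) {
	if maxDays <= 0 {
		return "", false
	}
	return fmt.Sprintf("-%d days", maxDays), true
}

func main() {
	if mod, ok := pruneModifier(30); ok {
		// With database/sql this would be roughly: db.Exec(pruneQuery, mod)
		fmt.Printf("%s  [arg: %q]\n", pruneQuery, mod)
	}
}
```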
Task 3.4: Hub dashboard UI (English)
Overview page (/):
A table/grid showing all customers at a glance:
| Customer | Status | Last seen | CPU | Memory | Disk | Containers | Last backup | Version |
|---|---|---|---|---|---|---|---|---|
| 🟢 Demo Ügyfél | OK | 2 min ago | 12% | 26% | 6%/13% | 14/16 | 3h ago | 0.6.0 |
| 🟡 Kovács Péter | WARN | 18 min ago | 45% | 78% | 82% ⚠️ | 8/8 | 4h ago | 0.5.4 |
| 🔴 Nagy Anna | DOWN | 2h ago | – | – | – | – | 26h ago ⚠️ | 0.5.4 |
Color coding:
- 🟢 Green: last seen < 30 min AND health = "ok"
- 🟡 Yellow: last seen < 30 min AND health = "warn", OR last seen 30-60 min
- 🔴 Red: last seen > 60 min OR health = "fail"
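The color rules translate directly into a small function. In this sketch the names `statusColor` and `mins` are hypothetical, the boundaries follow the list literally (exactly 30 or 60 minutes counts as yellow), and an unrecognized health value seen within 30 minutes falls through to yellow, which is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// statusColor applies the overview color rules to one customer.
// age is the time since the last report; health is "ok", "warn" or "fail".
func statusColor(age time.Duration, health string) string {
	switch {
	case age > 60*time.Minute || health == "fail":
		return "red"
	case age < 30*time.Minute && health == "ok":
		return "green"
	default:
		// Seen within 30 min with health "warn", or last seen 30-60 min ago.
		return "yellow"
	}
}

// mins converts whole minutes to a time.Duration.
func mins(n int) time.Duration { return time.Duration(n) * time.Minute }

func main() {
	fmt.Println(statusColor(mins(2), "ok"))    // green
	fmt.Println(statusColor(mins(18), "warn")) // yellow
	fmt.Println(statusColor(mins(45), "ok"))   // yellow
	fmt.Println(statusColor(mins(120), "ok"))  // red
}
```

Checking the red condition first matters: a "fail" report that arrived two minutes ago must not be shown as anything but red.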
Customer detail page (/customers/{id}):
- Last report timestamp
- Full system info section (same layout as controller's monitoring page)
- Container list with CPU/memory
- Backup status details
- Health issues/warnings
- Report history (collapsible list, last 24h)
Design: English language. Dark theme matching felhom.eu / the controller dashboard. Use the same CSS variables and fonts.
Task 3.5: Hub Kubernetes manifests
File: felhom.eu/manifests/hub.yaml (alongside healthchecks.yaml, webpage.yaml, etc.)
# Namespace: felhom-system (shared with healthchecks and other felhom infra)
# Deployment: 1 replica, 64Mi-256Mi memory
# Service: ClusterIP port 8080
# PVC: 1Gi for SQLite (Longhorn)
# Ingress: hub.felhom.eu via nginx-internal, cert-manager TLS
# Auth: same geo-restriction as other dooplex.hu services (HU only)
ConfigMap for hub.yaml config:
auth:
  password_hash: ""         # bcrypt hash, same approach as controller
api:
  report_api_key: ""        # Bearer token for report ingest
retention:
  max_days: 30              # Keep 30 days of report history (matches the schema prune window)
  prune_schedule: "04:30"   # Daily prune
alerting:
  stale_threshold: "30m"    # Alert if customer not seen for 30 min
Task 3.6: Alerting (optional, future enhancement)
When a customer is "stale" (no report for > 30 min), the hub could:
- Send a webhook to Healthchecks (one "customer-X-reporting" check per customer)
- Send email via Resend
- Push to Telegram
For v0.6.0 scope: just show the status on the dashboard. Alerting can be added in v0.6.1.
Part 4: Manual steps for Viktor (demo-felhom setup)
These steps must be done by Viktor manually — Claude Code cannot access status.felhom.eu or the demo-felhom server.
4.1: Create Healthchecks checks on status.felhom.eu
- Log into status.felhom.eu
- Open the "demo-felhom" project
- Create 5 checks with the settings from the table in Part 0
- Copy the ping UUIDs for each check
4.2: Update controller.yaml on demo-felhom
SSH into demo-felhom and update /opt/docker/felhom-controller/controller.yaml:
monitoring:
  enabled: true
  healthchecks_base: "https://status.felhom.eu"
  ping_uuids:
    heartbeat: "<UUID-from-step-4.1>"
    system_health: "<UUID-from-step-4.1>"
    db_dump: "<UUID-from-step-4.1>"
    backup: "<UUID-from-step-4.1>"
    backup_integrity: "<UUID-from-step-4.1>"
  system_health_interval: "5m"
  health_check_schedule: "06:00"
  thresholds:
    disk_warn_percent: 80
    disk_crit_percent: 90
    backup_max_age_hours: 36
    cpu_warn_percent: 90
    memory_warn_percent: 85
    temperature_warn_celsius: 75
4.3: Restart controller
cd /opt/docker/felhom-controller
docker compose pull
docker compose up -d
docker logs -f felhom-controller --since 1m
4.4: Verify pings
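Wait 5 minutes, then check status.felhom.eu: heartbeat and system-health should be green. The db-dump, backup and backup-integrity checks turn green only after their first scheduled run (daily or weekly).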
4.5: Deploy hub to k3s (after Part 3 is built)
# Build and push hub image (from felhom.eu repo, hub/ subfolder)
cd hub && make docker-push
# Apply k8s manifests (from felhom.eu repo, manifests/ folder)
kubectl apply -f manifests/hub.yaml
# Configure hub.felhom.eu DNS in Cloudflare
# Update demo-felhom controller.yaml with hub config
Implementation order
- Part 1 (controller-side, in deploy-felhom-compose repo):
  - Task 1.1: Heartbeat ping (5 min)
  - Task 1.2: Backup integrity check (20 min)
  - Task 1.3: Update example config (5 min)
  - Task 1.4: Deprecation note for bash scripts (5 min)
- Part 4.1–4.4 (Viktor manual: create checks, configure UUIDs, verify)
- Part 2 (controller-side, report push):
  - Task 2.1: Report payload types (10 min)
  - Task 2.2: Report builder (30 min)
  - Task 2.3: Report pusher (15 min)
  - Task 2.4: Hub config in controller.yaml (10 min)
  - Task 2.5: Wire into main.go (5 min)
- Part 3 (hub in felhom.eu repo, k8s manifests in felhom.eu/manifests/):
  - Task 3.1: Project scaffold in hub/ subfolder (10 min)
  - Task 3.2: API handlers (30 min)
  - Task 3.3: SQLite store (20 min)
  - Task 3.4: Dashboard UI, English (60 min)
  - Task 3.5: K8s manifests in felhom.eu/manifests/ (20 min)
- Part 4.5 (Viktor manual: deploy hub, wire everything)
Files to modify (controller repo)
controller/cmd/controller/main.go — heartbeat job, integrity job, hub pusher
controller/internal/config/config.go — PingUUIDsConfig + HubConfig
controller/internal/backup/backup.go — RunIntegrityCheck()
controller/internal/backup/restic.go — Check() method (verify/add)
controller/internal/report/builder.go — NEW: report assembly
controller/internal/report/pusher.go — NEW: HTTP push client
controller/internal/report/types.go — NEW: Report struct definitions
controller/configs/controller.yaml.example — updated monitoring + new hub section
monitoring/DEPRECATED.md — NEW: deprecation notice for bash scripts
Files to create (hub — in felhom.eu repo)
hub/cmd/hub/main.go
hub/internal/api/handler.go
hub/internal/store/store.go
hub/internal/web/server.go
hub/internal/web/templates/dashboard.html
hub/internal/web/templates/customer.html
hub/internal/web/templates/style.css
hub/internal/web/embed.go
hub/configs/hub.yaml.example
hub/Dockerfile
hub/Makefile
hub/go.mod
hub/README.md
Files to create (k8s manifests — in felhom.eu repo)
manifests/hub.yaml
Verification checklist
- Heartbeat ping arrives every 5 min at status.felhom.eu
- System health ping arrives every 5 min with diagnostic body
- DB dump ping arrives daily at ~02:30
- Backup ping arrives daily at ~03:00
- Backup integrity ping arrives weekly on Sunday ~04:00
- Stopping a protected container triggers system-health FAIL
- Controller logs show "Hub reporting enabled" when hub.enabled=true
- Hub receives JSON reports from controller
- Hub dashboard shows demo-felhom with green status
- Hub dashboard shows "last seen: X min ago" updating correctly
- Hub shows red status when controller is stopped for > 60 min
- Hub SQLite prunes old reports automatically
- All UUIDs are configurable (empty/CHANGEME = silently skipped)
CONTEXT.md update (after completion)
Add to "What was just completed" section:
### What was just completed (session N)
- **v0.6.0 — Healthcheck Implementation + Central Push + Hub Dashboard:**
  - **Healthcheck pings fully operational:** 5 check types (heartbeat, system-health, db-dump, backup, backup-integrity) configured on demo-felhom, all pinging status.felhom.eu
  - **Backup integrity check:** Weekly `restic check` with Healthchecks ping
  - **Central hub reporting:** Controller pushes JSON health summary every 15 min to hub.felhom.eu
  - **felhom-hub service:** New Go service in felhom.eu repo (`hub/` subfolder), k8s manifests in `felhom.eu/manifests/hub.yaml`, deployed on k3s in felhom-system namespace, SQLite storage, English multi-customer dashboard
  - **Deprecated:** Legacy bash monitoring scripts (backup-healthcheck.sh, monitoring-setup.sh) superseded by controller-native monitoring
Also update the repository distinction in CONTEXT.md:
## Repository & manifest layout
- **homelab-manifests** — Viktor's personal k3s apps (*.dooplex.hu): mon-system, servarr, pihole, etc.
- **felhom.eu** — Everything felhom-related:
  - `website/`: felhom.eu public website HTML
  - `manifests/`: k8s manifests for felhom infra in felhom-system namespace (webpage, healthchecks, contact-mailer, umami, hub, felhom.secret)
  - `hub/`: felhom-hub Go service (central multi-customer dashboard)
- **deploy-felhom-compose** — Customer-side: felhom-controller code, deploy scripts, monitoring scripts
- **app-catalog-felhom.eu** — Docker Compose templates for customer apps