diff --git a/TASK.md b/TASK.md
index d82d17c..32c71e4 100644
--- a/TASK.md
+++ b/TASK.md
@@ -1,351 +1,720 @@
-# TASK.md — v0.5.4: Monitoring Page Frontend Fixes
+# TASK.md — v0.6.0: Healthcheck Implementation + Central Push + Multi-Customer Dashboard
-> Version bump: **v0.5.4**
-> Scope: Frontend-only — all changes in `monitoring.html` and `style.css`
-> No Go code changes needed.
+> **Version:** v0.6.0
+> **Depends on:** v0.5.4 (current)
+> **Repo:** `deploy-felhom-compose` (controller/ subfolder)
+> **Build:** `~/build/felhom-controller/build.sh 0.6.0 --push`
+> **Deploy target:** demo-felhom.eu (N100) + k3s cluster (dooplex.hu)
---
-## IMPORTANT: Build & Validation
+## Context
-Build must happen in `~/build/felhom-controller/`, NOT in the git repo:
-```bash
-cd ~/build/felhom-controller
-git -C ~/git/deploy-felhom-compose pull
-./build.sh 0.5.2 --push
-```
+The controller already has health monitoring infrastructure built in v0.4.0:
+- `internal/monitor/pinger.go` — Healthchecks.io-compatible HTTP ping client (success/fail/start, retries)
+- `internal/monitor/healthcheck.go` — System health checks (disk, memory, CPU, temp, Docker, protected containers)
+- Scheduler jobs in `main.go`: `system-health` (every 5m), `db-dump` (daily), `backup` (daily)
+- Backup manager already calls `pinger.Ping()`/`pinger.Fail()` after each operation
-**Never run `go build` inside `~/git/deploy-felhom-compose/controller/`.**
+**Problem:** The demo-felhom Healthchecks project has **zero checks created** (screenshot confirms empty project at `status.felhom.eu/projects/.../checks/`). The `controller.yaml` on demo-felhom has all `CHANGEME` placeholder UUIDs. Nothing is actually pinging.
-After deployment, validate all 4 fixes by:
-1. Opening https://felhom.demo-felhom.eu/monitoring in browser
-2. Opening the browser Developer Tools (F12) → Console tab
-3. Checking each item below
+Additionally, there are legacy bash scripts (`backup-healthcheck.sh`, `monitoring-setup.sh`) from the pre-controller era that duplicate functionality now built into the controller. These should be deprecated in favor of controller-native pings.
-If you cannot access the browser, validate by reading the deployed HTML source:
-```bash
-ssh kisfenyo@192.168.0.162 "docker exec felhom-controller cat /app/templates/monitoring.html" | head -50
-```
+**This version has two major parts:**
+1. **Prerequisite:** Get healthchecks actually working on demo-felhom (create checks, configure UUIDs, verify pings)
+2. **New feature:** Central push from customer controllers to k3s + multi-customer overview dashboard
---
-## Bug 1: Tooltip shows "Invalid Date"
+## Part 0: Healthcheck Ping Design (controller.yaml schema update)
-### Root cause
+### Current ping types (already implemented in code)
-The tooltip callback uses `items[0].parsed.x` which should return a numeric timestamp on a Chart.js linear axis. However, depending on the Chart.js version/build, `parsed.x` may return something unexpected (undefined, wrong type) causing `new Date()` to produce "Invalid Date".
+| Ping | Schedule | Source | What it proves |
+|------|----------|--------|----------------|
+| `system_health` | Every 5 min | `monitor.RunHealthCheck()` | Server alive, Docker running, disks OK, protected containers up, CPU/mem/temp within thresholds |
+| `db_dump` | Daily 02:30 | `backup.RunDBDumps()` | Database dumps completed successfully |
+| `backup` | Daily 03:00 | `backup.RunBackup()` | Restic snapshot completed successfully |
-### Diagnosis step
+### New ping types to add
-Before fixing, add a temporary console.log to confirm what `parsed.x` actually returns. In `monitoring.html`, in the tooltip callback inside `chartOpts()`:
+| Ping | Schedule | Source | What it proves |
+|------|----------|--------|----------------|
+| `backup_integrity` | Weekly (Sunday 04:00) | New: `backup.RunIntegrityCheck()` | Restic repo passes `restic check` — data is not corrupted |
+| `heartbeat` | Every 5 min | New: lightweight HTTP POST, no logic | Controller process is alive (distinct from `system_health` which does heavy checks and could fail due to a bug while the controller itself is fine) |
-```javascript
-title: function(items) {
- if (!items.length) return '';
- console.log('[tooltip debug]', 'parsed.x:', items[0].parsed.x, typeof items[0].parsed.x, 'raw:', items[0].raw);
- return formatTimestamp(items[0].parsed.x);
+### Revised `controller.yaml` monitoring section
+
+```yaml
+monitoring:
+ enabled: true
+ healthchecks_base: "https://status.felhom.eu"
+ ping_uuids:
+ heartbeat: "" # NEW — every 5 min, controller alive
+ system_health: "" # existing — every 5 min, comprehensive check
+ db_dump: "" # existing — daily after db dumps
+ backup: "" # existing — daily after restic snapshot
+ backup_integrity: "" # NEW — weekly after restic check
+ system_health_interval: "5m"
+ health_check_schedule: "06:00"
+ thresholds:
+ disk_warn_percent: 80
+ disk_crit_percent: 90
+ backup_max_age_hours: 36
+ cpu_warn_percent: 90
+ memory_warn_percent: 85
+ temperature_warn_celsius: 75
+```
+
+> **Note:** Empty string and "CHANGEME..." UUIDs are both skipped by the pinger (already implemented). This means any check can be left unconfigured — the controller just skips it silently.
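The skip rule in the note above can be sketched as a small predicate. This is a minimal illustration, not the pinger's actual code: `isConfiguredUUID` is a hypothetical name, and matching placeholders by a case-insensitive `CHANGEME` prefix is an assumption.

```go
package main

import "strings"

// isConfiguredUUID reports whether a ping UUID from controller.yaml should
// be pinged. Hypothetical helper: the real pinger's skip logic is not shown
// in this document.
func isConfiguredUUID(uuid string) bool {
	u := strings.TrimSpace(uuid)
	if u == "" {
		return false // check left unconfigured in controller.yaml
	}
	// Assumption: placeholders all start with "CHANGEME".
	return !strings.HasPrefix(strings.ToUpper(u), "CHANGEME")
}
```

A caller would test the UUID once before building the ping URL, so an unconfigured check costs no network traffic.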
+
+### Healthchecks check configuration (to be created manually on status.felhom.eu)
+
+For each customer project, create these checks:
+
+| Check name | Period | Grace | Tags |
+|-----------|--------|-------|------|
+| `heartbeat` | 5 minutes | 10 minutes | `heartbeat` |
+| `system-health` | 5 minutes | 10 minutes | `system`, `health` |
+| `db-dump` | 1 day (02:30 CET) | 30 minutes | `backup`, `db` |
+| `backup` | 1 day (03:00 CET) | 60 minutes | `backup`, `restic` |
+| `backup-integrity` | 7 days | 24 hours | `backup`, `integrity` |
+
+---
+
+## Part 1: Controller-side healthcheck implementation
+
+### Task 1.1: Add heartbeat ping
+
+**Files:** `cmd/controller/main.go`
+
+Add a new scheduler job — the simplest possible ping, no health check logic:
+
+```go
+// Heartbeat — lightweight "I'm alive" signal
+sched.Every("heartbeat", 5*time.Minute, func(ctx context.Context) error {
+ pinger.Ping(cfg.Monitoring.PingUUIDs.Heartbeat, "")
+ return nil
+})
+```
+
+**Files:** `internal/config/config.go`
+
+Add the new `Heartbeat` field to `PingUUIDsConfig` (`BackupIntegrity` is added here as well, for Task 1.2):
+
+```go
+type PingUUIDsConfig struct {
+ Heartbeat string `yaml:"heartbeat"` // new
+ DBDump string `yaml:"db_dump"`
+ Backup string `yaml:"backup"`
+ SystemHealth string `yaml:"system_health"`
+ BackupIntegrity string `yaml:"backup_integrity"` // new
}
```
-Deploy, hover over a data point, check browser console. Possible findings:
-- `parsed.x` is `undefined` → Chart.js isn't finding the x value from `{x,y}` data
-- `parsed.x` is a very small number (like an index) → linear scale isn't applied
-- `parsed.x` is correct ms timestamp → bug is in `formatTimestamp`
+### Task 1.2: Add backup integrity check
-### Fix
+**Files:** `internal/backup/restic.go`
-Replace the tooltip callback with a more robust approach that accesses the raw data point directly:
+Add a `Check()` method (may already exist as part of prune logic — verify first):
-```javascript
-callbacks: {
- title: function(items) {
- if (!items.length) return '';
- // Access raw {x, y} data point directly — most reliable across Chart.js versions
- var raw = items[0].raw;
- if (raw && typeof raw === 'object' && raw.x) {
- return formatTimestamp(raw.x);
- }
- // Fallback: try parsed.x
- if (items[0].parsed && items[0].parsed.x) {
- return formatTimestamp(items[0].parsed.x);
- }
- return '';
+```go
+// Check runs `restic check` to verify repository integrity.
+func (r *ResticRunner) Check() error {
+ args := []string{"check", "--repo", r.repo, "--json"}
+ // ... standard exec with password file, timeout 30 min
+}
+```
+
+**Files:** `internal/backup/backup.go`
+
+Add `RunIntegrityCheck()`:
+
+```go
+// RunIntegrityCheck runs restic check and pings healthchecks with the result.
+func (m *Manager) RunIntegrityCheck(ctx context.Context) error {
+ err := m.restic.Check()
+ uuid := m.cfg.Monitoring.PingUUIDs.BackupIntegrity
+ if err != nil {
+ m.pinger.Fail(uuid, fmt.Sprintf("restic check failed: %v", err))
+ return err
}
+ m.pinger.Ping(uuid, "restic check passed")
+ return nil
}
```
-After deploying and verifying, remove the console.log line.
+**Files:** `cmd/controller/main.go`
-### Verification
-- Hover over any data point on any chart → tooltip title shows formatted date like "2026. 02. 16. 11:30"
-- Verify on CPU, Memory, Temperature, Load charts
-- Verify on container detail charts too (same `chartOpts` function is shared)
+Register the weekly job:
----
+```go
+if cfg.Backup.Enabled && backupMgr != nil {
+ // ... existing daily jobs ...
-## Bug 2: Charts fill full width regardless of data density
-
-### Root cause
-
-`setChartXBounds()` sets `chart.options.scales.x.min/max` after chart initialization. Chart.js may not pick up dynamically added `min`/`max` properties if they weren't present in the options during initialization. The scale was created without `min`/`max`, and adding them at runtime may be ignored.
-
-### Diagnosis step
-
-Add console.log in `loadSystemMetrics()` after setting bounds and updating:
-
-```javascript
-allCharts.forEach(function(c) { setChartXBounds(c, systemRange); });
-updateLineChart(chartCPU, timestamps, d.cpu);
-console.log('[bounds debug] range:', systemRange,
- 'options.min:', chartCPU.options.scales.x.min,
- 'options.max:', chartCPU.options.scales.x.max,
- 'scale.min:', chartCPU.scales.x.min,
- 'scale.max:', chartCPU.scales.x.max);
-```
-
-Select "7 nap", check console. If `options.min/max` are set correctly but `scales.x.min/max` show the data extent, then Chart.js is ignoring the runtime-added properties.
-
-### Fix
-
-Include `min` and `max` in the initial chart options so Chart.js registers them from creation. Then dynamic updates work.
-
-**Step 1**: Modify `chartOpts()` to include initial min/max:
-
-```javascript
-function chartOpts(yLabel, beginAtZero) {
- var now = Date.now();
- var defaultRangeMs = parseRangeMs('1h'); // match default systemRange
- return {
- responsive: true,
- maintainAspectRatio: false,
- animation: {duration: 300},
- plugins: {
- legend: {display: false},
- tooltip: {
- backgroundColor: '#1c2128',
- titleColor: '#e6edf3',
- bodyColor: '#8b949e',
- borderColor: '#30363d',
- borderWidth: 1,
- callbacks: {
- title: function(items) {
- if (!items.length) return '';
- var raw = items[0].raw;
- if (raw && typeof raw === 'object' && raw.x) {
- return formatTimestamp(raw.x);
- }
- if (items[0].parsed && items[0].parsed.x) {
- return formatTimestamp(items[0].parsed.x);
- }
- return '';
- }
- }
- }
- },
- scales: {
- x: {
- type: 'linear',
- min: now - defaultRangeMs,
- max: now,
- grid: {color: 'rgba(48,54,61,0.5)'},
- ticks: {
- color: '#8b949e',
- maxTicksLimit: 8,
- callback: function(v) {
- return formatTimeLabel(v);
- }
- }
- },
- y: {
- grid: {color: 'rgba(48,54,61,0.5)'},
- ticks: {color: '#8b949e'},
- beginAtZero: beginAtZero !== false,
- title: {display: !!yLabel, text: yLabel || '', color: '#6e7681', font: {size: 11}}
- }
+ // Weekly integrity check — Sunday 04:00
+ sched.Daily("backup-integrity", "04:00", func(ctx context.Context) error {
+ if time.Now().Weekday() != time.Sunday {
+ return nil // skip non-Sundays
}
- };
+ return backupMgr.RunIntegrityCheck(ctx)
+ })
}
```
-Key change: `min: now - defaultRangeMs, max: now` are present from creation.
+> **Note on scheduler:** `Daily()` fires every day at the given time. To make it weekly, check the weekday inside the function. If you prefer, add a `Weekly()` method to the scheduler — but the weekday check is simpler and consistent with how prune already works.
-**Step 2**: `setChartXBounds()` stays the same — it updates existing properties.
+### Task 1.3: Update example config
-**Step 3**: Same fix for container detail charts — `initDetailCharts()` uses the same `chartOpts()` so it gets min/max automatically.
+**Files:** `controller/configs/controller.yaml.example`
-### Verification
-- Select "7 nap" → x-axis spans 7 full days (Feb 9 to Feb 16), data appears as a small cluster on the far right
-- Select "1 óra" → data fills most of the chart width
-- Select "24 óra" → data fills proportional to collection time
-- X-axis labels for 7d show dates (02.09 .. 02.16), not times
-- X-axis labels for 1h/6h/24h show times (10:00, 11:00, etc.)
+Update the `monitoring.ping_uuids` section to include `heartbeat` and `backup_integrity` fields. Add comments explaining each.
+
+### Task 1.4: Deprecation note for bash monitoring scripts
+
+The following files in `deploy-felhom-compose/monitoring/` are **superseded** by the controller's built-in monitoring:
+
+- `backup-healthcheck.sh` → replaced by `internal/monitor/healthcheck.go` (scheduler: `system-health`)
+- `monitoring-setup.sh` → no longer needed (controller reads `controller.yaml` directly)
+- `monitoring.conf.template` → replaced by `controller.yaml` monitoring section
+- `backup-healthcheck.service` / `.timer` → replaced by controller's scheduler
+
+**Action:** Add a `DEPRECATED.md` in `deploy-felhom-compose/monitoring/` explaining that these scripts are kept for reference only and should not be used on nodes running felhom-controller v0.4.0+. Do NOT delete the files yet — they may be needed if a customer is still on a pre-controller setup.
+
+### Verification (Part 1)
+
+After building and deploying v0.6.0 to demo-felhom:
+
+1. Check controller logs: `docker logs felhom-controller --since 5m | grep -i "ping\|health\|heartbeat"`
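+2. Verify pings arrive at `status.felhom.eu`: `heartbeat` and `system-health` should show green within 10 minutes; `db-dump`, `backup`, and `backup-integrity` turn green only after their next scheduled run (daily, or Sunday for the integrity check)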
+3. Test failure: `docker stop traefik`, wait 5 min, check that `system-health` goes red (protected container missing)
+4. Restart traefik: `docker start traefik`, verify recovery
---
-## Bug 3: System overview values not consistently right-aligned
+## Part 2: Central push to k3s (customer → operator reporting)
-### Root cause
+### Architecture
-`.sysinfo-row` uses `display: flex; justify-content: space-between` which does push values to the right of each cell. But `.sysinfo-grid` uses `repeat(auto-fill, minmax(280px, 1fr))` which creates varying cell widths — values don't align to a consistent edge across columns.
+```
+┌─────────────────────────┐ HTTPS POST /api/v1/report
+│ Customer controller │────────────────────────────────────────┐
+│ (demo-felhom.eu) │ every 15 min (configurable) │
+└─────────────────────────┘ ▼
+ ┌─────────────────────────────┐
+┌─────────────────────────┐ HTTPS POST │ felhom-hub │
+│ Customer controller │────────────────────────▶│ (k3s pod on dooplex.hu) │
+│ (customer-2) │ │ │
+└─────────────────────────┘ │ - Receives reports │
+ │ - Stores in SQLite │
+ │ - Serves dashboard │
+ │ - Alerts on stale reports │
+ └─────────────────────────────┘
+ hub.felhom.eu
+```
-The `
```
-The mobile rule `@media(max-width: 768px) { .sysinfo-grid { grid-template-columns: 1fr; } }` already exists and stays — collapses to single column on mobile.
+This function should call existing methods — **do not duplicate logic**. Use the same data sources the dashboard and monitoring page already use.
-### Verification
-- Values are consistently right-aligned within each cell
-- "Debian GNU/Linux 13 (trixie)" and "6.12.69+deb13-amd64" align to the right edge
-- Both grid columns have equal width
-- Long values wrap without breaking layout
+### Task 2.3: Implement report pusher in the controller
+
+**New file:** `controller/internal/report/pusher.go`
+
+```go
+package report
+
+// Pusher sends reports to the central hub.
+type Pusher struct {
+ hubURL string
+ apiKey string
+ httpClient *http.Client
+ logger *log.Logger
+ enabled bool
+}
+
+// Push sends a report to the hub, retrying 3 times with 5s backoff.
+// It always returns nil: failures are logged, not propagated, because
+// push failures must not affect controller operation.
+func (p *Pusher) Push(report *Report) error {
+ // JSON marshal
+ // POST to hubURL + "/api/v1/report"
+ // Header: Authorization: "Bearer " + p.apiKey
+ // Header: Content-Type: application/json
+ // Retry on failure
+ // Log but don't propagate errors
+}
+```
+
+### Task 2.4: Add hub configuration to controller.yaml
+
+**Files:** `internal/config/config.go`, `controller/configs/controller.yaml.example`
+
+```yaml
+# --- Central hub (operator dashboard) ---
+hub:
+ enabled: false # Enable central reporting
+ url: "https://hub.felhom.eu" # Hub API endpoint
+ api_key: "" # Shared secret for authentication
+ push_interval: "15m" # How often to push reports
+```
+
+```go
+type HubConfig struct {
+ Enabled bool `yaml:"enabled"`
+ URL string `yaml:"url"`
+ APIKey string `yaml:"api_key"`
+ PushInterval string `yaml:"push_interval"`
+}
+```
+
+Add the field `` Hub HubConfig `yaml:"hub"` `` to the main `Config` struct.
+
+### Task 2.5: Wire the pusher into main.go
+
+```go
+// --- Central hub reporting ---
+if cfg.Hub.Enabled && cfg.Hub.URL != "" {
+ pushInterval, err := time.ParseDuration(cfg.Hub.PushInterval)
+ if err != nil {
+ pushInterval = 15 * time.Minute
+ }
+ pusher := report.NewPusher(&cfg.Hub, logger)
+ sched.Every("hub-report", pushInterval, func(ctx context.Context) error {
+ r := report.BuildReport(cfg, stackMgr, backupMgr, cpuCollector, pinger, version)
+ return pusher.Push(r)
+ })
+ logger.Printf("[INFO] Hub reporting enabled (every %s to %s)", pushInterval, cfg.Hub.URL)
+}
+```
+
+### Verification (Part 2)
+
+1. Set `hub.enabled: true` and `hub.url` to a temporary endpoint (e.g., `https://webhook.site/...`) in demo-felhom's `controller.yaml`
+2. Restart controller, check logs for "Hub reporting enabled"
+3. Wait 15 min (or set `push_interval: "1m"` for testing), verify JSON arrives at the endpoint
+4. Validate JSON structure matches the spec above
+5. Reset `push_interval` to `"15m"` after testing
---
-## Bug 4: Charts overflow their container on mobile
+## Part 3: Hub service on k3s (operator side)
-### Root cause
+### Overview
-`.chart-wrap` has `position: relative; height: 180px` but no overflow or width constraint. CSS grid children default to `min-width: auto`, preventing them from shrinking below their content width. Chart.js canvas may render wider than the parent on narrow screens.
+The hub is a lightweight Go service deployed on Viktor's k3s cluster in the `felhom-system` namespace. It receives reports from customer controllers, stores them in SQLite, and serves an English-language dashboard for Viktor.
-### Fix
+**Domain:** `hub.felhom.eu` (Nginx Ingress, cert-manager TLS)
+**Namespace:** `felhom-system` (alongside Healthchecks and other felhom infra)
+**Code:** `felhom.eu` repo on Gitea, `hub/` subfolder
-**In `style.css`**, update these rules:
+### Task 3.1: Hub service (subfolder in felhom.eu repository)
-```css
-.chart-box {
- background: var(--bg-secondary);
- border-radius: 8px;
- padding: .75rem;
- border: 1px solid rgba(48, 54, 61, 0.5);
- min-width: 0; /* Allow grid children to shrink — critical fix */
- overflow: hidden;
-}
-.chart-wrap {
- position: relative;
- height: 180px;
- overflow: hidden;
- max-width: 100%;
-}
-.chart-wrap canvas {
- max-width: 100%;
-}
-.chart-wrap-bar {
- position: relative;
- height: 250px;
- overflow: hidden;
- max-width: 100%;
-}
+The hub lives in the existing `felhom.eu` repository on Gitea as a `hub/` subfolder. K8s manifests go in the same repository under `manifests/`.
+
+**Structure (inside felhom.eu repo):**
+
+```
+hub/
+├── cmd/hub/main.go # Entry point
+├── internal/
+│ ├── api/
+│ │ └── handler.go # POST /api/v1/report, GET /api/v1/customers
+│ ├── store/
+│ │ └── store.go # SQLite: save reports, query latest per customer
+│ └── web/
+│ ├── server.go # Dashboard HTTP server
+│ ├── templates/
+│ │ ├── dashboard.html # Multi-customer overview (English)
+│ │ ├── customer.html # Single customer detail (English)
+│ │ └── style.css # Dark theme matching felhom.eu
+│ └── embed.go
+├── configs/
+│ └── hub.yaml.example
+├── Dockerfile
+├── Makefile
+└── go.mod
```
-Also add `.chart-box-half` update:
-```css
-.chart-box-half {
- flex: 1;
- min-width: 0; /* Same fix for flex containers */
-}
+K8s manifests in `felhom.eu/manifests/` (alongside healthchecks.yaml, webpage.yaml, etc.):
+```
+manifests/hub.yaml # Deployment, Service, Ingress, PVC
```
-Key additions:
-- `min-width: 0` on `.chart-box` — **the critical CSS grid fix**: prevents grid children from forcing the grid wider than the viewport
-- `overflow: hidden` on `.chart-wrap` and `.chart-wrap-bar` — clips any canvas overflow
-- `max-width: 100%` on `.chart-wrap` and canvas
-- `min-width: 0` on `.chart-box-half` — same fix for the flex-based container charts
+### Task 3.2: Hub API endpoints
-### Verification
-- Open monitoring page at 375px width (browser devtools responsive mode)
-- All four system metric charts fit within the screen
-- Container bar charts fit within the screen
-- No horizontal scrollbar appears
-- Charts remain interactive (hover/click works)
+| Method | Path | Auth | Description |
+|--------|------|------|-------------|
+| `POST` | `/api/v1/report` | Bearer token | Receive customer report (JSON body) |
+| `GET` | `/api/v1/customers` | Session/Basic | List all customers with latest status |
+| `GET` | `/api/v1/customers/{id}` | Session/Basic | Get latest report for a customer |
+| `GET` | `/api/v1/customers/{id}/history` | Session/Basic | Get report history (last 24h/7d/30d) |
+| `GET` | `/` | Session/Basic | Dashboard HTML page |
+| `GET` | `/customers/{id}` | Session/Basic | Customer detail HTML page |
+
+**Authentication:**
+- Report ingest: Bearer token (shared secret per customer, or a single hub-wide key for simplicity)
+- Dashboard: Basic auth or simple password (Viktor only) — reuse the same bcrypt approach as the controller
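For the report-ingest side, the bearer check could be sketched as a small middleware. This assumes the simpler single hub-wide key; `validBearer` and `requireBearer` are hypothetical names, and rejecting everything when no key is configured is a design assumption rather than something the spec states.

```go
package main

import (
	"crypto/subtle"
	"net/http"
	"strings"
)

// validBearer reports whether an Authorization header value carries the
// expected token. An empty expected token never matches, so an unconfigured
// hub rejects every report instead of accepting all of them.
func validBearer(header, token string) bool {
	const prefix = "Bearer "
	if token == "" || !strings.HasPrefix(header, prefix) {
		return false
	}
	got := strings.TrimPrefix(header, prefix)
	// Constant-time comparison avoids leaking the token through timing.
	return subtle.ConstantTimeCompare([]byte(got), []byte(token)) == 1
}

// requireBearer guards a handler such as POST /api/v1/report.
func requireBearer(token string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !validBearer(r.Header.Get("Authorization"), token) {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}
```

Per-customer keys would replace the single `token` argument with a lookup keyed by the customer ID in the report, at the cost of parsing the body before authenticating.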
+
+### Task 3.3: Hub SQLite schema
+
+```sql
+CREATE TABLE IF NOT EXISTS reports (
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
+ customer_id TEXT NOT NULL,
+ received_at DATETIME NOT NULL DEFAULT (datetime('now')),
+ report_json TEXT NOT NULL, -- Full JSON payload
+ -- Denormalized fields for fast queries:
+ health_status TEXT, -- "ok", "warn", "fail"
+ cpu_percent REAL,
+ memory_percent REAL,
+ container_total INTEGER,
+ container_running INTEGER,
+ backup_last_snapshot DATETIME,
+ controller_version TEXT
+);
+
+CREATE INDEX IF NOT EXISTS idx_reports_customer ON reports(customer_id, received_at DESC);
+
+-- Prune old reports daily. The retention window comes from `retention.max_days`
+-- in hub.yaml (90 in the example config):
+-- DELETE FROM reports WHERE received_at < datetime('now', '-90 days');
+```
+
+### Task 3.4: Hub dashboard UI (English)
+
+**Overview page (`/`):**
+
+A table/grid showing all customers at a glance:
+
+| Customer | Status | Last seen | CPU | Memory | Disk | Containers | Last backup | Version |
+|----------|--------|-----------|-----|--------|------|------------|-------------|---------|
+| 🟢 Demo Ügyfél | OK | 2 min ago | 12% | 26% | 6%/13% | 14/16 | 3h ago | 0.6.0 |
+| 🟡 Kovács Péter | WARN | 18 min ago | 45% | 78% | 82% ⚠️ | 8/8 | 4h ago | 0.5.4 |
+| 🔴 Nagy Anna | DOWN | 2h ago | – | – | – | – | 26h ago ⚠️ | 0.5.4 |
+
+**Color coding:**
+- 🟢 Green: last seen < 30 min AND health = "ok"
+- 🟡 Yellow: last seen < 30 min AND health = "warn", OR last seen 30-60 min
+- 🔴 Red: last seen > 60 min OR health = "fail" (red takes precedence over yellow)
+
+**Customer detail page (`/customers/{id}`):**
+
+- Last report timestamp
+- Full system info section (same layout as controller's monitoring page)
+- Container list with CPU/memory
+- Backup status details
+- Health issues/warnings
+- Report history (collapsible list, last 24h)
+
+**Design:** English language. Dark theme matching felhom.eu / the controller dashboard. Use the same CSS variables and fonts.
+
+### Task 3.5: Hub Kubernetes manifests
+
+**File:** `felhom.eu/manifests/hub.yaml` (alongside `healthchecks.yaml`, `webpage.yaml`, etc.)
+
+```yaml
+# Namespace: felhom-system (shared with healthchecks and other felhom infra)
+# Deployment: 1 replica, 64Mi-256Mi memory
+# Service: ClusterIP port 8080
+# PVC: 1Gi for SQLite (Longhorn)
+# Ingress: hub.felhom.eu via nginx-internal, cert-manager TLS
+# Auth: same geo-restriction as other dooplex.hu services (HU only)
+```
+
+**ConfigMap** for `hub.yaml` config:
+```yaml
+auth:
+ password_hash: "" # bcrypt hash, same approach as controller
+api:
+ report_api_key: "" # Bearer token for report ingest
+retention:
+ max_days: 90 # Keep 90 days of report history
+ prune_schedule: "04:30" # Daily prune
+alerting:
+ stale_threshold: "30m" # Alert if customer not seen for 30 min
+```
+
+### Task 3.6: Alerting (optional, future enhancement)
+
+When a customer is "stale" (no report for > 30 min), the hub could:
+- Send a webhook to Healthchecks (one "customer-X-reporting" check per customer)
+- Send email via Resend
+- Push to Telegram
+
+For v0.6.0 scope: just show the status on the dashboard. Alerting can be added in v0.6.1.
+
+---
+
+## Part 4: Manual steps for Viktor (demo-felhom setup)
+
+These steps must be done by Viktor manually — Claude Code cannot access status.felhom.eu or the demo-felhom server.
+
+### 4.1: Create Healthchecks checks on status.felhom.eu
+
+1. Log into `status.felhom.eu`
+2. Open the "demo-felhom" project
+3. Create 5 checks with the settings from the table in Part 0
+4. Copy the ping UUIDs for each check
+
+### 4.2: Update controller.yaml on demo-felhom
+
+SSH into demo-felhom and update `/opt/docker/felhom-controller/controller.yaml`:
+
+```yaml
+monitoring:
+ enabled: true
+ healthchecks_base: "https://status.felhom.eu"
+ ping_uuids:
+ heartbeat: ""
+ system_health: ""
+ db_dump: ""
+ backup: ""
+ backup_integrity: ""
+ system_health_interval: "5m"
+ health_check_schedule: "06:00"
+ thresholds:
+ disk_warn_percent: 80
+ disk_crit_percent: 90
+ backup_max_age_hours: 36
+ cpu_warn_percent: 90
+ memory_warn_percent: 85
+ temperature_warn_celsius: 75
+```
+
+### 4.3: Restart controller
+
+```bash
+cd /opt/docker/felhom-controller
+docker compose pull
+docker compose up -d
+docker logs -f felhom-controller --since 1m
+```
+
+### 4.4: Verify pings
+
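+Wait 5 minutes, then check `status.felhom.eu`: `heartbeat` and `system-health` should be green. The `db-dump`, `backup`, and `backup-integrity` checks turn green only after their first scheduled run.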
+
+### 4.5: Deploy hub to k3s (after Part 3 is built)
+
+```bash
+# Build and push hub image (from felhom.eu repo, hub/ subfolder)
+cd hub && make docker-push
+
+# Apply k8s manifests (from felhom.eu repo, manifests/ folder)
+kubectl apply -f manifests/hub.yaml
+
+# Configure hub.felhom.eu DNS in Cloudflare
+# Update demo-felhom controller.yaml with hub config
+```
---
## Implementation order
-1. Edit `style.css` — sysinfo alignment + chart overflow fixes
-2. Edit `monitoring.html` — remove `