From ded59b687e66e9954f9a2a67986e199a03d63730 Mon Sep 17 00:00:00 2001 From: kisfenyo Date: Mon, 16 Feb 2026 19:09:43 +0100 Subject: [PATCH] =?UTF-8?q?0.7.1=20-=20Phase=202=20=E2=80=94=20Monitoring?= =?UTF-8?q?=20Warnings,=20Dashboard=20Alerts=20&=20Notification=20System?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- TASK.md | 847 ++++++++++++++++++++++++++++++++++++-------------------- 1 file changed, 552 insertions(+), 295 deletions(-) diff --git a/TASK.md b/TASK.md index 71d1d15..a7e6d12 100644 --- a/TASK.md +++ b/TASK.md @@ -1,340 +1,597 @@ -# TASK: Phase 1 — Authentication, Persistence & Settings Page +# TASK: Phase 2 — Monitoring Warnings, Dashboard Alerts & Notification System -**Version target:** 0.7.0 -**Repo:** `deploy-felhom-compose` (controller) +**Version target:** 0.7.1 +**Repos:** `deploy-felhom-compose` (controller) + `felhom.eu` (notification-relay on k3s) ## Overview Three workstreams in this phase: -1. Implement login/logout authentication for the controller dashboard -2. Persist DB validation results across container restarts -3. Add "Beállítások" (Settings) page with config display and password change - -All user-editable state is stored in a single `settings.json` file at: -`/opt/docker/felhom-controller/data/settings.json` - -This path is already bind-mounted into the container (the `data/` dir holds `restic-password` etc.). -The controller.yaml config remains the source of truth for operator-provisioned values; -`settings.json` holds customer-modifiable overrides. +1. **Monitoring page warnings** — Show healthcheck ping configuration status, warn about missing UUIDs +2. **Dashboard alert system** — Persistent in-app banners for active issues/warnings +3. **Notification system** — Central email relay on k3s + customer-side preferences UI --- -## 1. settings.json — Shared Persistence Layer +## 1. 
Monitoring Page: Healthcheck Ping Status -### 1.1 Create `internal/settings/settings.go` -``` -File: controller/internal/settings/settings.go -``` +### 1.1 Problem + +The monitoring page shows system metrics but doesn't indicate whether healthcheck pings are actually configured. If a ping UUID is empty or `CHANGEME`, the pinger silently skips it — the customer has no visibility into whether remote monitoring is working. + +### 1.2 New section: "Távoli monitoring" (Remote Monitoring) + +Add a new section to `monitoring.html` **between** "Rendszer áttekintés" and "Rendszer metrikák" (section 1 and 2). This section is server-rendered (not JS/API — ping config is static and known at page load). + +Display a table showing each healthcheck ping's configuration status: + +| Ellenőrzés | UUID státusz | Gyakoriság | +|---|---|---| +| 💓 Életjel (Heartbeat) | ✅ Beállítva | 5 percenként | +| 🖥️ Rendszer állapot | ✅ Beállítva | 5 percenként | +| 🗄️ Adatbázis mentés | ⚠️ Nincs beállítva | Naponta 02:30 | +| 💾 Biztonsági mentés | ✅ Beállítva | Naponta 03:00 | +| 🔍 Mentés integritás | ⚠️ Nincs beállítva | Hetente (vasárnap) | + +**Logic for each row:** +- Read `monitoring.ping_uuids.*` from config +- UUID is "configured" if: non-empty AND doesn't start with `CHANGEME` +- If configured: show `✅ Beállítva` (green text) +- If not configured: show `⚠️ Nincs beállítva` (yellow/orange warning text) +- If monitoring is disabled entirely (`monitoring.enabled = false`): show a single warning banner instead of the table: "A távoli monitoring ki van kapcsolva. Az üzemeltető nem kap értesítést hibák esetén." + +**Summary banner above the table:** +- All configured: green banner — "✅ Minden távoli monitoring aktív — az üzemeltető értesítést kap hibák esetén." +- Some missing: yellow banner — "⚠️ Egyes monitoring ellenőrzések nincsenek beállítva. Kérd az üzemeltetőt a konfiguráláshoz." 
+- Monitoring disabled: red/orange banner — as above + +### 1.3 Data flow + +Add a new template data struct for the monitoring handler: -Define the settings struct and load/save logic: ```go -type Settings struct { - mu sync.RWMutex `json:"-"` +type MonitoringPageData struct { + // Existing fields... + SystemInfo *system.Info + ActivePage string - // Auth - PasswordHash string `json:"password_hash,omitempty"` // bcrypt hash, overrides controller.yaml - - // Notification preferences (Phase 2 — define struct now, leave empty) - Notifications *NotificationPrefs `json:"notifications,omitempty"` - - // Cached state - DBValidations map[string]DBValidationCache `json:"db_validations,omitempty"` + // New: healthcheck ping status + MonitoringEnabled bool + PingStatus []PingStatusItem + AllPingsConfigured bool } +type PingStatusItem struct { + Label string // Hungarian display name + Icon string // emoji + Configured bool // UUID is valid + Schedule string // "5 percenként" / "Naponta 02:30" etc. 
+} +``` + +Build `PingStatus` slice in the handler from `cfg.Monitoring.PingUUIDs`: + +```go +pings := []PingStatusItem{ + {Label: "Életjel (Heartbeat)", Icon: "💓", Configured: isConfigured(cfg.Monitoring.PingUUIDs.Heartbeat), Schedule: "5 percenként"}, + {Label: "Rendszer állapot", Icon: "🖥️", Configured: isConfigured(cfg.Monitoring.PingUUIDs.SystemHealth), Schedule: "5 percenként"}, + {Label: "Adatbázis mentés", Icon: "🗄️", Configured: isConfigured(cfg.Monitoring.PingUUIDs.DBDump), Schedule: "Naponta " + cfg.Backup.DBDumpSchedule}, + {Label: "Biztonsági mentés", Icon: "💾", Configured: isConfigured(cfg.Monitoring.PingUUIDs.Backup), Schedule: "Naponta " + cfg.Backup.ResticSchedule}, + {Label: "Mentés integritás", Icon: "🔍", Configured: isConfigured(cfg.Monitoring.PingUUIDs.BackupIntegrity), Schedule: "Hetente (vasárnap)"}, +} + +func isConfigured(uuid string) bool { + return uuid != "" && !strings.HasPrefix(uuid, "CHANGEME") +} +``` + +### 1.4 CSS + +Reuse existing `.settings-row` / `.sysinfo-row` pattern for the table. Add: +- `.ping-status-ok` — green text (same as `.state-text-green`) +- `.ping-status-warn` — orange/yellow text +- `.monitoring-banner` — full-width banner with icon, green/yellow/red variants + +--- + +## 2. Dashboard Alert System + +### 2.1 Concept + +Display persistent alert banners at the top of the main content area (below page header, above page content). Alerts are generated from the latest health check results and other events. They show on ALL pages, not just monitoring. + +### 2.2 Alert sources + +The controller already runs health checks every 5 minutes (`RunHealthCheck`). The resulting `HealthReport` contains `Issues` (critical) and `Warnings` (non-critical). Use these directly. 
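The `Refresh` sketch in section 2.3 builds alert IDs by calling `hash(...)` on each issue or warning string, but this task does not define that helper. One possible shape is sketched below, assuming a truncated SHA-256 digest is good enough. The name `alertID` and the 8-character suffix are illustrative choices, not part of the spec.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
)

// alertID derives a short, deterministic ID from an alert message.
// Hypothetical helper: the same prefix and message always yield the same ID.
func alertID(prefix, message string) string {
	sum := sha256.Sum256([]byte(message))
	// 4 bytes become 8 hex characters, which keeps the ID compact.
	return prefix + "-" + hex.EncodeToString(sum[:4])
}
```

Note that a message embedding a changing value (for example a disk percentage) would get a new ID whenever that value changes. This is harmless while alerts are regenerated from scratch on every cycle.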
+
+Additionally, generate alerts for:
+- Missing healthcheck ping UUIDs (from section 1 above)
+- Backup not configured (`backup.enabled = false`)
+- Hub reporting not configured when it should be
+- Recent backup failures (from backup manager state)
+
+### 2.3 Implementation: Alert Manager
+
+Create `internal/web/alerts.go`:
+
+```go
+type Alert struct {
+    ID       string // unique and stable per issue
+    Level    string // "error", "warning", "info"
+    Message  string // Hungarian text
+    Link     string // optional link to relevant page (e.g., "/monitoring", "/backups")
+    LinkText string // "Részletek" etc.
+}
+
+type AlertManager struct {
+    mu     sync.RWMutex
+    alerts []Alert
+    logger *log.Logger
+}
+```
+
+**Alert generation** runs after each health check cycle (every 5 min):
+
+```go
+// hash and countMissingPings are helpers that still need to be written.
+func (am *AlertManager) Refresh(healthReport *HealthReport, cfg *config.Config, backupMgr *backup.Manager) {
+    var alerts []Alert
+
+    // From health check issues
+    for _, issue := range healthReport.Issues {
+        alerts = append(alerts, Alert{
+            ID: "health-" + hash(issue), Level: "error",
+            Message: issue, Link: "/monitoring", LinkText: "Rendszermonitor",
+        })
+    }
+
+    // From health check warnings
+    for _, w := range healthReport.Warnings {
+        alerts = append(alerts, Alert{
+            ID: "health-" + hash(w), Level: "warning",
+            Message: w, Link: "/monitoring", LinkText: "Rendszermonitor",
+        })
+    }
+
+    // Missing ping UUIDs
+    missingCount := countMissingPings(cfg)
+    if missingCount > 0 {
+        alerts = append(alerts, Alert{
+            ID: "pings-missing", Level: "warning",
+            Message: fmt.Sprintf("%d monitoring ellenőrzés nincs beállítva", missingCount),
+            Link: "/monitoring", LinkText: "Rendszermonitor",
+        })
+    }
+
+    // Backup disabled
+    if !cfg.Backup.Enabled {
+        alerts = append(alerts, Alert{
+            ID: "backup-disabled", Level: "warning",
+            Message: "A biztonsági mentés nincs bekapcsolva",
+            Link: "/settings", LinkText: "Beállítások",
+        })
+    }
+
+    // TODO: hub reporting and recent backup failure alerts (from backupMgr state)
+
+    am.mu.Lock()
+    am.alerts = alerts
+    am.mu.Unlock()
+}
+```
+
+### 2.4 Template integration
+
+In `layout_start` template (or a new `alerts` partial), render alerts above the page content:
+
+```html
+{{if .Alerts}}
+<div class="alerts-container">
+    {{range .Alerts}}
+    <div class="alert-banner alert-banner-{{.Level}}">
+        <span>{{if eq .Level "error"}}🔴{{else if eq .Level "warning"}}🟡{{else}}ℹ️{{end}}</span>
+        <span>{{.Message}}</span>
+        {{if .Link}}<a href="{{.Link}}" class="alert-link">{{.LinkText}} →</a>{{end}}
+    </div>
+    {{end}}
+</div>
+{{end}} +``` + +**Key decisions:** +- Alerts are NOT dismissible (they reflect real state — they disappear when the issue is resolved) +- Maximum 5 alerts shown, with "+N more" indicator if overflow +- On the monitoring page, skip the "pings-missing" alert since the detailed table is already visible +- Error alerts (red) above warning alerts (yellow) + +### 2.5 Passing alerts to templates + +Every page handler already passes template data via a struct. Add an `Alerts []Alert` field to each page's data struct (or use a shared base struct). The alert manager is available via the web server struct. + +```go +// In each handler: +data.Alerts = s.alertManager.GetAlerts() +``` + +### 2.6 CSS + +```css +.alerts-container { margin-bottom: 1rem; } +.alert-banner { + display: flex; align-items: center; gap: 0.75rem; + padding: 0.75rem 1rem; border-radius: 8px; margin-bottom: 0.5rem; + font-size: 0.9rem; +} +.alert-banner-error { background: rgba(248, 113, 113, 0.1); border: 1px solid rgba(248, 113, 113, 0.3); color: #f87171; } +.alert-banner-warning { background: rgba(250, 204, 21, 0.1); border: 1px solid rgba(250, 204, 21, 0.3); color: #facc15; } +.alert-banner-info { background: rgba(96, 165, 250, 0.1); border: 1px solid rgba(96, 165, 250, 0.3); color: #60a5fa; } +.alert-link { margin-left: auto; white-space: nowrap; } +``` + +--- + +## 3. Notification System + +### 3.1 Architecture + +``` +Customer Node k3s Cluster +┌──────────────────────┐ ┌──────────────────────────────┐ +│ felhom-controller │ HTTP POST │ notification-relay │ +│ │ ─────────────────>│ (notify.felhom.eu) │ +│ Event detected: │ {customer_id, │ │ +│ - disk_warning │ event_type, │ 1. Validate API key │ +│ - backup_failed │ message, │ 2. Format email │ +│ - ... │ severity} │ 3. Send via Resend API │ +│ │ │ 4. 
Return 200/4xx/5xx │ +└──────────────────────┘ └──────────────────────────────┘ + │ + │ Resend API + ▼ + ┌──────────────┐ + │ Customer │ + │ email inbox │ + └──────────────┘ +``` + +**Why a relay?** +- Resend API key stays on trusted infrastructure (k3s), never on customer hardware +- Central rate limiting and logging of all notifications +- Operator visibility into what notifications were sent +- Customer controllers only need hub URL + API key (already have these for hub reporting) + +### 3.2 Notification Relay Service (k3s side) + +**Repo:** `felhom.eu` — new directory `notification-relay/` alongside `hub/` + +This is a small Go service, similar to `contact-mailer`. Deploy on k3s at `notify.felhom.eu` (or as a path under hub, e.g., `hub.felhom.eu/api/v1/notify`). + +**Option A: Standalone service at notify.felhom.eu** +- Separate deployment, its own ingress +- Clean separation of concerns +- More k3s resources + +**Option B: Add notify endpoint to the existing hub** +- Hub already runs, has API key auth, knows customer IDs +- Just add a `POST /api/v1/notify` endpoint +- Reuse hub's Resend integration +- Less infrastructure + +**Recommendation: Option B** — Add to the hub. The hub already authenticates customers by API key and has all the context needed. Adding a `/api/v1/notify` endpoint is minimal work. + +#### Hub notify endpoint + +``` +POST /api/v1/notify +Authorization: Bearer +Content-Type: application/json + +{ + "customer_id": "demo-felhom", + "event_type": "disk_warning", + "severity": "warning", // "info", "warning", "critical" + "message": "SSD disk usage: 85%", + "details": "Threshold: 80%" // optional +} +``` + +**Hub processing:** +1. Validate API key (same auth as report push) +2. Look up customer notification preferences (stored in hub's SQLite) +3. If customer has email configured AND event_type is in their enabled events: + - Format email (Hungarian template) + - Send via Resend API (direct HTTP call, same pattern as contact-mailer) +4. 
Log the notification attempt and result
+5. Return 200 (accepted), 400 (bad request), 401 (unauthorized)
+
+**Hub config additions** (hub.yaml secret):
+```yaml
+RESEND_API_KEY: "re_XZZenCJs..."  # Same key as healthchecks/contact-mailer
+FROM_EMAIL: "monitoring@felhom.eu"
+```
+
+#### Customer notification config (hub-side storage)
+
+The hub stores per-customer notification preferences in its SQLite DB:
+
+```sql
+CREATE TABLE customer_notifications (
+    customer_id TEXT PRIMARY KEY,
+    email TEXT NOT NULL DEFAULT '',            -- customer email address
+    enabled_events TEXT NOT NULL DEFAULT '[]', -- JSON array of event types
+    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
+    updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
+);
+```
+
+**How the customer's email and preferences get there:**
+- **Phase 2a (this task):** Operator sets them manually via hub dashboard or SQLite
+- **Phase 2b (future):** Controller pushes notification preferences to hub along with reports, hub saves them
+
+For now (2a), the operator configures each customer's email in the hub after setup. This avoids needing the controller to push preferences to the hub yet.
+
+### 3.3 Controller-side: Notification Trigger
+
+Add `internal/notify/notifier.go`:
+
+```go
+type Notifier struct {
+    hubURL     string
+    apiKey     string
+    customerID string // sent as customer_id in every request
+    httpClient *http.Client
+    logger     *log.Logger
+    enabled    bool
+    prefs      *settings.NotificationPrefs // local preferences
+}
+
+// Notify sends a notification event to the hub relay.
+// Non-blocking: fires and forgets (logs errors but doesn't retry aggressively).
+func (n *Notifier) Notify(eventType, severity, message, details string) {
+    if !n.enabled { return }
+    if !n.prefs.IsEventEnabled(eventType) { return }
+
+    // POST to hub
+    payload := NotifyRequest{
+        CustomerID: n.customerID,
+        EventType:  eventType,
+        Severity:   severity,
+        Message:    message,
+        Details:    details,
+    }
+    // ... HTTP POST to hubURL + "/api/v1/notify"
+}
+```
+
+**Integration points:** trigger notifications from:
+1. 
`monitor/healthcheck.go` — after RunHealthCheck, if status changed from ok to warn/fail +2. `backup/backup.go` — after backup failure +3. `backup/dbdump.go` — after DB dump failure +4. `scheduler/scheduler.go` — after integrity check failure + +**Important: Don't spam.** Track last notification time per event type. Don't re-notify for the same ongoing issue within 6 hours (configurable). This is handled locally with an in-memory map. + +### 3.4 Notification Preferences (settings.json + settings page) + +Expand the `NotificationPrefs` struct: + +```go type NotificationPrefs struct { - // placeholder for Phase 2 + // Customer email for notifications (sent to hub, hub delivers via Resend) + Email string `json:"email,omitempty"` + + // Which events to be notified about + EnabledEvents []string `json:"enabled_events,omitempty"` + + // Notification cooldown in hours (don't re-send for same issue within this period) + CooldownHours int `json:"cooldown_hours,omitempty"` // default: 6 } -type DBValidationCache struct { - ValidatedAt string `json:"validated_at"` // RFC3339 - TableCount int `json:"table_count"` - HasHeader bool `json:"has_header"` - Error string `json:"error,omitempty"` +// Default events if not configured +var DefaultCustomerEvents = []string{ + "disk_warning", + "backup_failed", + "update_available", } ``` -**Behavior:** -- `Load(path string) (*Settings, error)` — reads file, returns empty Settings{} if file doesn't exist (not an error) -- `Save() error` — atomic write (write to `.tmp`, rename). Path stored internally from Load. -- Use `sync.RWMutex` for concurrent access (backup goroutine writes validations, web reads them) -- Log on every save: `[DEBUG] Settings saved to ` +### 3.5 Settings page: Notification Preferences UI -**File location:** The controller's docker-compose.yml already mounts `./data:/opt/docker/felhom-controller/data`. -The settings file path should be passed to the Settings manager on init. 
+Add a **third section** to the existing settings page (below "Jelszó módosítás"): -**Testing:** On startup, if file doesn't exist, log `[INFO] No settings.json found, using defaults`. Do NOT create the file until something actually needs saving. +**Section C: "Értesítések" (Notifications)** + +Only shown if hub is enabled (`hub.enabled = true`). If hub is disabled, show info message: +"Az értesítések a központi rendszeren keresztül működnek, ami jelenleg nincs bekapcsolva." + +``` +┌─────────────────────────────────────────────────────────┐ +│ Értesítések │ +│ │ +│ E-mail cím: [________________________] (text input) │ +│ │ +│ Az alábbi eseményekről kapjon értesítést: │ +│ │ +│ [x] Lemez figyelmeztetés (80%+) │ +│ [x] Biztonsági mentés sikertelen │ +│ [x] Frissítés elérhető │ +│ [ ] Biztonsági frissítés │ +│ │ +│ Értesítési szünet: [6] óra │ +│ (Azonos probléma esetén ennyi ideig nem küld újat) │ +│ │ +│ [Mentés] [Teszt email küldése]│ +└─────────────────────────────────────────────────────────┘ +``` + +**Route:** + +| Method | Path | Auth? | Handler | +|--------|------|-------|---------| +| POST | `/settings/notifications` | Yes | Save notification preferences | +| POST | `/settings/notifications/test` | Yes | Send test notification via hub relay | + +**POST `/settings/notifications` handler:** +1. Parse form: email, enabled_events (checkbox list), cooldown_hours +2. Validate email format (basic regex, allow empty = disable) +3. Save to `settings.json` → `notifications` +4. Show success flash: "Értesítési beállítások mentve." + +**POST `/settings/notifications/test` handler:** +1. Read current notification preferences from settings +2. Send a test notification via the hub relay: + ```json + { + "customer_id": "demo-felhom", + "event_type": "test", + "severity": "info", + "message": "Teszt értesítés a Felhom rendszerből", + "details": "Ha ezt az emailt megkapta, az értesítések megfelelően működnek." + } + ``` +3. Show result: "Teszt email elküldve." 
or error message + +### 3.6 Event Types Reference + +| Event type | Severity | Trigger | Hungarian label | +|---|---|---|---| +| `disk_warning` | warning | Disk usage >= warn threshold | Lemez figyelmeztetés | +| `disk_critical` | critical | Disk usage >= crit threshold | Lemez kritikus | +| `backup_failed` | critical | Restic snapshot failed | Biztonsági mentés sikertelen | +| `db_dump_failed` | critical | DB dump failed | Adatbázis mentés sikertelen | +| `update_available` | info | New controller version available | Frissítés elérhető | +| `security_update` | warning | Security update available | Biztonsági frissítés | +| `container_unhealthy` | warning | Protected container not running | Alkalmazás leállt | +| `integrity_failed` | warning | Weekly restic check failed | Mentés integritás hiba | +| `test` | info | Manual test from settings page | Teszt | --- -## 2. Authentication (Login / Logout) +## 4. Implementation Order -### 2.1 Password resolution +### Step 1: Monitoring page — ping status section +- Add `PingStatusItem` struct and builder in monitoring handler +- Add "Távoli monitoring" section to `monitoring.html` +- Add CSS for ping status rows and banner +- **Test:** Check monitoring page shows ✅/⚠️ for each ping UUID -The effective password hash is determined with this priority: -1. `settings.json` → `password_hash` (customer changed it) -2. `controller.yaml` → `web.password_hash` (operator provisioned) -3. Empty string → no auth required (current testing behavior) +### Step 2: Alert manager + dashboard banners +- Create `internal/web/alerts.go` with AlertManager +- Wire AlertManager into health check cycle +- Add alert rendering to `layout_start` template +- Add CSS for alert banners +- **Test:** Set a ping UUID to empty → warning banner appears on all pages. Fix it → banner disappears. 
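Step 2 depends on the display rules from section 2.4 (errors above warnings, at most 5 alerts shown, `pings-missing` hidden on the monitoring page), which the task describes only in prose. A rough sketch of how `GetAlerts` or the handlers might apply them is shown below. `visibleAlerts`, `levelRank` and `maxVisibleAlerts` are hypothetical names, and `Alert` is a trimmed copy of the struct from section 2.3.

```go
package main

import "sort"

// Alert is a trimmed copy of the struct from section 2.3.
type Alert struct {
	ID      string
	Level   string // "error", "warning", "info"
	Message string
}

// maxVisibleAlerts is the display cap from section 2.4.
const maxVisibleAlerts = 5

// levelRank orders errors before warnings before everything else.
func levelRank(level string) int {
	switch level {
	case "error":
		return 0
	case "warning":
		return 1
	default:
		return 2
	}
}

// visibleAlerts returns the alerts to render, most severe first, plus the
// number hidden by the cap. skipID drops one alert (for example
// "pings-missing" on the monitoring page); pass "" to keep everything.
func visibleAlerts(all []Alert, skipID string) (shown []Alert, hidden int) {
	filtered := make([]Alert, 0, len(all))
	for _, a := range all {
		if skipID != "" && a.ID == skipID {
			continue
		}
		filtered = append(filtered, a)
	}
	// Stable sort keeps the original order within each level.
	sort.SliceStable(filtered, func(i, j int) bool {
		return levelRank(filtered[i].Level) < levelRank(filtered[j].Level)
	})
	if len(filtered) > maxVisibleAlerts {
		return filtered[:maxVisibleAlerts], len(filtered) - maxVisibleAlerts
	}
	return filtered, 0
}
```

The `hidden` count could feed the "+N more" indicator in the template. Filtering into a fresh slice leaves the alert manager's own list untouched, so the function is safe to call per request.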
-On startup, log which source is active: -``` -[INFO] Auth: using password from settings.json -[INFO] Auth: using password from controller.yaml -[INFO] Auth: no password configured — dashboard is open -``` +### Step 3: Hub notification endpoint (felhom.eu repo) +- Add Resend API key to hub's k8s secret +- Add `POST /api/v1/notify` endpoint to hub +- Add `customer_notifications` table to hub's SQLite +- Add email sending via Resend HTTP API (not SMTP — direct API call) +- Add hub admin page or CLI to set customer email/preferences +- **Test:** `curl -X POST hub.felhom.eu/api/v1/notify -H "Authorization: Bearer ..." -d '{"customer_id":"demo-felhom","event_type":"test","severity":"info","message":"Test"}'` → email arrives -### 2.2 Session management +### Step 4: Controller-side notifier +- Create `internal/notify/notifier.go` +- Wire into health check, backup, and DB dump flows +- Add cooldown tracking (in-memory map, not persisted) +- **Test:** Trigger a disk warning → notification sent to hub → email arrives -Use a **signed session cookie** approach (not just `"authenticated"` string): -- Generate a random 32-byte session secret on startup (store in memory only, not persisted — restarts invalidate sessions, which is fine) -- Cookie name: `felhom_session` -- Cookie value: `.` — HMAC of expiry timestamp with session secret -- `HttpOnly: true`, `SameSite: Strict`, `Secure: false` (local network, self-signed certs) -- `MaxAge: 7 days` (configurable later) -- `Path: /` +### Step 5: Notification preferences UI +- Expand `NotificationPrefs` struct in settings.go +- Add "Értesítések" section to settings.html +- Add POST handlers for save and test +- Push email preference to hub when saving (optional — can be deferred) +- **Test:** Set email → save → test → email arrives → change events → save → verify filtering works -### 2.3 Routes - -Add to `internal/web/server.go` routing: - -| Method | Path | Auth? 
| Handler | -|--------|-------------|-------|--------------------| -| GET | `/login` | No | Show login form | -| POST | `/login` | No | Validate + set cookie + redirect to `/` | -| GET/POST | `/logout` | No | Clear cookie + redirect to `/login` | - -**Auth middleware** (wrap all routes except `/login`, `/logout`, `/style.css`, `/assets/*`, `/api/*`): -- Check `felhom_session` cookie → validate HMAC + check expiry -- If invalid/missing → redirect to `/login` (for browser) or return 401 (for API) -- If no password configured → pass through (no auth) - -### 2.4 Login page template -``` -File: controller/internal/web/templates/login.html -``` - -- Standalone page (no sidebar), same dark theme as rest of dashboard -- Felhom logo centered at top -- Single password field + "Bejelentkezés" button -- On error: "Hibás jelszó" message (red, below form) -- Customer name shown below logo (from config) -- No username field — single-user system - -### 2.5 Logout - -- `GET /logout` or `POST /logout` → set cookie MaxAge=-1 → redirect to `/login` -- Add logout link to sidebar bottom (near version display): -``` - | Kijelentkezés -``` - Only show "Kijelentkezés" if auth is enabled. - -### 2.6 Edge cases - -- If password_hash is empty in both sources → no auth, no login page, no logout link -- If user is on a page and session expires → next request redirects to `/login`, after login redirect back to original page (use `?next=/backups` query param) -- Cookie cleared on logout must work even if server secret rotated (clear by MaxAge=-1) - ---- - -## 3. DB Validation Persistence - -### 3.1 Problem - -After container restart, the backup page "Érvényesítés" column shows "–" for all databases until the next backup cycle runs validation. The v0.6.2 cross-check helps once RefreshCache runs, but the initial load after restart still shows no data. 
- -### 3.2 Solution - -**On validation completion** (`internal/backup/dbdump.go` → `ValidateDump`): -- After successful validation, save result to `settings.json` via settings package: -```go - settings.SetDBValidation("immich-postgres.sql", DBValidationCache{ - ValidatedAt: time.Now().Format(time.RFC3339), - TableCount: 60, - HasHeader: true, - }) - // SetDBValidation acquires write lock, updates map, calls Save() -``` - -**On startup / RefreshCache** (`internal/backup/backup.go`): -- Load cached validations from `settings.json` -- For each dump file that exists on disk AND has a cached validation: - - Use the cached validation data - - Set `DumpValidation.Valid = true/false` based on cached result - - Set `DumpValidation.Message` to include cached info: e.g., `"60 tábla (utolsó: 08:04)"` -- The next actual validation run overwrites the cache with fresh data - -**Important:** The cache is keyed by dump filename (e.g., `immich-postgres.sql`). If a dump file no longer exists, its cached validation is ignored (stale data cleanup). - -### 3.3 Template update - -No template changes needed — the existing 4-branch guard from v0.6.2 already handles showing validation status correctly. The only difference is now the data will be populated from cache on startup instead of being zero-valued. - ---- - -## 4. "Beállítások" (Settings) Page - -### 4.1 Sidebar menu item - -Add "Beállítások" as the last menu item, visually separated from the main navigation — placed at the bottom of the sidebar, just above the version/logout section. Use a gear/cog icon (⚙ or SVG). - -Sidebar order: -``` -Vezérlőpult -Alkalmazások -Biztonsági mentés -Rendszermonitor -─── (spacer / flex-grow) ─── -⚙ Beállítások ← new, pinned to bottom -v0.7.0 | Kijelentkezés -``` - -### 4.2 Route - -| Method | Path | Auth? 
| Handler | -|--------|-------------|-------|---------------------| -| GET | `/settings` | Yes | Show settings page | -| POST | `/settings/password` | Yes | Change password | - -### 4.3 Settings page layout - -The page has two sections: - -#### Section A: "Rendszer konfiguráció" (System Configuration) — Read-only - -Display key values from `controller.yaml` in a clean info-grid. These are read-only — the customer can see what's configured but can't change it here. - -| Label (Hungarian) | Source in controller.yaml | Display format | -|--------------------------|----------------------------------|----------------| -| Ügyfél azonosító | `customer.id` | `demo-felhom` | -| Ügyfél neve | `customer.name` | `Demo Ügyfél` | -| Domain | `customer.domain` | `demo-felhom.eu` | -| Alkalmazás sablon forrás | `git.repo_url` | URL (truncated) | -| Sablon szinkronizálás | `git.sync_interval` | `15m` | -| Biztonsági mentés | `backup.enabled` | ✅ Aktív / ❌ Inaktív | -| Mentés ütemezés | `backup.db_dump_schedule` + `backup.restic_schedule` | `02:30 / 03:00` | -| Monitoring | `monitoring.enabled` | ✅ Aktív / ❌ Inaktív | -| Healthchecks URL | `monitoring.healthchecks_base` | URL or "–" | -| Hub jelentés | `hub.enabled` (if exists) | ✅ Aktív / ❌ Inaktív / "–" | -| Controller verzió | built-in Version constant | `0.7.0` | - -Use green checkmark / red X styling for boolean states, consistent with the backup page. - -#### Section B: "Jelszó módosítás" (Change Password) — Editable - -Only shown if auth is enabled (password_hash is set). If no auth, show an info message: -"A jelszavas védelem nincs beállítva. Kérd az üzemeltetőt a beállításhoz." - -Form fields: -- Jelenlegi jelszó (current password) — required -- Új jelszó (new password) — required, min 8 chars -- Új jelszó megerősítése (confirm new password) — required, must match - -**POST `/settings/password` handler:** -1. Validate current password against effective hash (bcrypt compare) -2. 
Validate new password: min 8 chars, both fields match -3. Generate bcrypt hash (cost 10) for new password -4. Save to `settings.json` → `password_hash` -5. Invalidate current session (regenerate session secret so all cookies become invalid) -6. Redirect to `/login` with success flash message: "Jelszó sikeresen módosítva. Kérjük, jelentkezzen be az új jelszóval." - -**Error handling:** -- Wrong current password → "Hibás jelenlegi jelszó" (stay on page) -- Passwords don't match → "A két jelszó nem egyezik" -- Too short → "A jelszónak legalább 8 karakter hosszúnak kell lennie" -- Show errors inline, don't clear form - -### 4.4 Template -``` -File: controller/internal/web/templates/settings.html -``` - -Follow the same card-based layout as the backup page. Two cards: -1. "Rendszer konfiguráció" — info-grid with labels + values -2. "Jelszó módosítás" — form card - ---- - -## 5. Implementation Order - -Follow this sequence to keep each step testable: - -### Step 1: settings.json package -- Create `internal/settings/settings.go` with Settings struct, Load/Save -- Add settings instance to the controller's main app struct -- Load on startup, log result -- **Test:** Start controller, check logs for settings load message - -### Step 2: Authentication -- Add session secret generation on startup -- Add auth middleware -- Add login.html template -- Add login/logout handlers -- Add logout link to sidebar -- Wire up routes in server.go -- Set `web.password_hash` in controller.yaml on demo to test -- **Test:** Navigate to dashboard → redirected to /login → enter password → dashboard loads → /logout → back to /login - -### Step 3: DB validation persistence -- After ValidateDump completes, save results to settings.json -- On RefreshCache, load cached validations for initial display -- **Test:** Deploy apps with DBs → trigger backup → check Érvényesítés column shows data → restart container → check column still shows data - -### Step 4: Settings page -- Add settings.html template -- 
Add settings route + handler -- Add sidebar menu item with bottom-pinning -- Implement password change POST handler -- **Test:** Open /settings → see config values → change password → re-login with new password - -### Step 5: Cleanup & version bump +### Step 6: Cleanup & version bump - Update CONTEXT.md -- Bump version to 0.7.0 -- Build + deploy + verify on demo-felhom.eu +- Bump controller version to 0.7.1 +- Bump hub version accordingly +- Build + deploy both → verify on demo-felhom.eu --- -## 6. Files to Create / Modify +## 5. Files to Create / Modify -### New files: -- `controller/internal/settings/settings.go` — Settings persistence -- `controller/internal/web/templates/login.html` — Login page -- `controller/internal/web/templates/settings.html` — Settings page +### Controller repo (`deploy-felhom-compose`): -### Modified files: -- `controller/internal/web/server.go` — Add auth middleware, new routes (/login, /logout, /settings, /settings/password), session management, settings page handler -- `controller/internal/web/templates/` sidebar partial or base template — Add "Beállítások" menu item at bottom, logout link -- `controller/internal/backup/backup.go` — Load cached validations in RefreshCache -- `controller/internal/backup/dbdump.go` — Save validation results to settings.json -- `controller/internal/config/config.go` — Possibly add data_dir path helper -- `controller/cmd/controller/main.go` — Initialize settings, pass to web server and backup manager +**New files:** +- `controller/internal/web/alerts.go` — AlertManager +- `controller/internal/notify/notifier.go` — Hub notification client + +**Modified files:** +- `controller/internal/web/server.go` — Add AlertManager, wire into handlers, pass alerts to all templates +- `controller/internal/web/handlers.go` — Monitoring handler: add ping status data. Settings handler: add notification preferences section + POST handlers. 
+- `controller/internal/web/templates/monitoring.html` — Add "Távoli monitoring" section +- `controller/internal/web/templates/settings.html` — Add "Értesítések" section +- `controller/internal/web/templates/layout.html` — Add alert banner rendering +- `controller/internal/web/templates/style.css` — New styles for alerts and ping status +- `controller/internal/settings/settings.go` — Expand NotificationPrefs struct +- `controller/internal/monitor/healthcheck.go` — After health check, update AlertManager + trigger notifications +- `controller/internal/backup/backup.go` — Trigger notification on backup failure +- `controller/internal/backup/dbdump.go` — Trigger notification on dump failure +- `controller/cmd/controller/main.go` — Initialize Notifier, AlertManager, wire dependencies + +### Hub repo (`felhom.eu`): + +**Modified files:** +- `hub/internal/api/server.go` (or new `notify.go`) — Add `POST /api/v1/notify` endpoint +- `hub/internal/store/store.go` — Add `customer_notifications` table + queries +- `hub/cmd/hub/main.go` — Add Resend API key config +- `manifests/hub.yaml` — Add `RESEND_API_KEY` to hub secret --- -## 7. Design Decisions & Notes +## 6. Design Decisions & Notes -### Why settings.json (not SQLite)? -- Single file, human-debuggable, easy to backup/restore -- Tiny data volume (password hash + a few validation entries) -- No query needs — just load/save whole struct -- Customers or operators can inspect/reset it easily +### Why add notify to the hub instead of a new service? +- Hub already authenticates customers, has SQLite, knows customer IDs +- Adding one endpoint is simpler than deploying+maintaining a separate service +- Shared Resend API key, shared k8s secret +- One less DNS record, ingress, deployment to manage -### Why not modify controller.yaml for password changes? 
-- controller.yaml is operator-provisioned config, risky to programmatically rewrite YAML
-- settings.json is a clean override layer: operator sets initial password in yaml, customer changes it in json
-- If settings.json is deleted, system falls back to controller.yaml password (recovery path)
+### Customer email configuration flow (Phase 2a — operator-managed)
+For now, the operator sets each customer's email via direct SQLite or a simple hub admin endpoint. The customer can see and change their email in the controller's settings page, but actually syncing this to the hub is deferred — the controller just stores it locally. The operator manually ensures the hub has the right email.

-### Session secret is ephemeral (memory only)
-- Container restart = all sessions invalidated = users must re-login
-- This is acceptable and actually desirable for security
-- No need to persist session state
+This is acceptable for the initial small customer base. Phase 2b (future) will add automatic preference sync via the report push.

-### Notifications (Phase 2 prep)
-- The Settings struct includes a `Notifications` field placeholder
-- Phase 2 will add: email relay via k3s (similar to contact-mailer pattern), notification preferences UI
-- The relay approach: customer controller sends HTTP POST to a central notification API on k3s, which handles Resend delivery. Avoids storing Resend API keys on customer hardware.
-- This keeps secrets centralized and customer nodes lightweight.
+### Notification cooldown
+The controller tracks in-memory when each event type was last notified. If the same event type fires again within the cooldown period (default 6 hours), the notification is suppressed. This prevents email spam during prolonged issues (e.g., disk stays at 85% for days).
-### Settings page scope — what goes where
-- "Beállítások" = actual settings (things the user can configure or needs to know about their setup)
-- "Rendszermonitor" = live system state (hostname, uptime, CPU, RAM, disk, Docker containers)
-- No overlap — config is static, monitoring is dynamic
\ No newline at end of file
+The cooldown resets on controller restart, which is fine — restarting the controller during an active issue should re-trigger a notification.
+
+### Dashboard alerts are state-based, not event-based
+Alerts reflect current system state. They're regenerated every 5 minutes from the latest health check. When the issue resolves, the alert disappears. No persistence needed — alerts live in memory only.
+
+### Resend API usage from hub
+Use Resend's HTTP API directly (POST to `https://api.resend.com/emails`) rather than SMTP. This avoids SMTP connection management complexity and is more idiomatic for a Go service. The contact-mailer already demonstrates this pattern.
+
+```go
+// Example Resend API call
+req, _ := http.NewRequest("POST", "https://api.resend.com/emails", bytes.NewReader(payload))
+req.Header.Set("Authorization", "Bearer " + resendAPIKey)
+req.Header.Set("Content-Type", "application/json")
+```
+
+### Email template
+Notifications should be simple, text-focused emails in Hungarian:
+
+```
+Tárgy: [Felhom] Figyelmeztetés: SSD lemez használat 85%
+
+Kedves Ügyfél!
+
+A Felhom rendszered a következő figyelmeztetést jelezte:
+
+SSD lemez használat: 85% (küszöb: 80%)
+
+Részletek:
+- Szerver: demo-felhom.eu
+- Időpont: 2026-02-16 14:30
+- Szint: Figyelmeztetés
+
+Ha kérdésed van, vedd fel a kapcsolatot az üzemeltetővel.
+
+Üdvözlettel,
+Felhom.eu monitoring
+```
+
+### Monitoring page vs Settings page — what goes where
+- **Rendszermonitor** shows: live ping status table (read-only), system metrics, alerts related to monitoring
+- **Beállítások** shows: notification email + event preferences (editable), test button
+- No overlap — monitoring shows status, settings allows configuration
\ No newline at end of file