feat: Hub monitoring takeover — event push system + config cleanup (v0.21.0)

Replace external Healthchecks.io with Hub-native event system. Controller
now pushes structured events via POST /api/v1/event with typed detail
structs. Hub handles dead man's switch, notification dispatch, and cooldowns.

Phase 5: PushEvent() core method, 21 event types, expanded notification
settings (11 toggles), Hub connection monitoring on dashboard, alerts.
Phase 6: Deprecation log for ping UUIDs, pinger kept for transition.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 18:53:21 +01:00
parent 55abe401ee
commit 8aebbb8902
13 changed files with 722 additions and 318 deletions
@@ -1,5 +1,52 @@
## Changelog
### What was just completed (2026-02-20 session 64)
- **v0.21.0 — Hub Monitoring Takeover (Controller-side, Phases 5+6):**
Replaces the external Healthchecks.io dependency with a Hub-native event system. The controller now pushes structured events directly to the Hub's `/api/v1/event` endpoint, and the Hub handles dead man's switch detection, notification dispatch, and cooldown management.
**Phase 5 — Event Push System (`internal/notify/notifier.go`):**
- New core method `PushEvent(eventType, severity, message, details)` — non-blocking goroutine, 2 retries with 3s backoff, POSTs to Hub `/api/v1/event`
- 8 typed detail structs: `BackupDetails`, `DBDumpDetails`, `DiskDetails`, `HealthDetails`, `StorageDetails`, `UpdateDetails`, `AppDetails`, `CrossDriveDetails`
- Replaced all old `Notify*` methods with event-based equivalents:
- `NotifyBackupCompleted/Failed` → `backup_completed`/`backup_failed` events
- `NotifyDBDumpCompleted/Failed` → `db_dump_completed`/`db_dump_failed` events
- `NotifyIntegrityOK/Failed` → `backup_integrity_ok`/`backup_integrity_failed` events
- `NotifyHealthChange` → detects transitions, pushes `health_degraded`/`health_critical`/`health_recovered`
- `NotifyStorageDisconnected/Reconnected` → `storage_disconnected`/`storage_reconnected` events
- `NotifyControllerStarted` → `controller_started` event on startup
- `NotifyControllerUpdated` → `controller_updated` event (replaces `NotifyUpdateSuccess/Failed`)
- `NotifyAppDeployed/Removed` → `app_deployed`/`app_removed` events
- `NotifyCrossDriveCompleted/Failed` → `crossdrive_completed`/`crossdrive_failed` events
- `NotifyDRStarted/Completed` → `disaster_recovery_started`/`disaster_recovery_completed` events
- Removed old `/api/v1/notify` relay, `classifyWarning()`, and client-side cooldown logic (Hub handles cooldowns now)
- `SendTest()` now pushes `test` event type via `PushEvent`
- `SyncPreferences` updated to include `cooldownHours` parameter
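The push behaviour described above (non-blocking goroutine, 2 retries with a fixed backoff, POST to `/api/v1/event`) could look roughly like the sketch below. It is not the code from `internal/notify/notifier.go`: the `Event` JSON field names, the `PostFunc` transport hook, the `Notifier` fields, and the returned result channel are all assumptions made for illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// Event is a hypothetical wire format; the real payload fields are not
// shown in this changelog.
type Event struct {
	Type     string `json:"type"`
	Severity string `json:"severity"`
	Message  string `json:"message"`
	Details  any    `json:"details,omitempty"`
}

// PostFunc abstracts the HTTP call so the retry loop can be exercised
// without a network (an assumption of this sketch, not the real design).
type PostFunc func(url string, body []byte) (statusCode int, err error)

// HTTPPost builds a PostFunc on top of a standard *http.Client.
func HTTPPost(client *http.Client) PostFunc {
	return func(url string, body []byte) (int, error) {
		resp, err := client.Post(url, "application/json", bytes.NewReader(body))
		if err != nil {
			return 0, err
		}
		defer resp.Body.Close()
		return resp.StatusCode, nil
	}
}

// Notifier holds the Hub base URL and the retry policy (names assumed).
type Notifier struct {
	HubURL  string
	Post    PostFunc
	Retries int           // extra attempts after the first one, e.g. 2
	Backoff time.Duration // pause between attempts, e.g. 3 * time.Second
}

// PushEvent returns immediately; delivery runs in a background goroutine.
// The buffered channel only exists so callers may observe the outcome.
func (n *Notifier) PushEvent(eventType, severity, message string, details any) <-chan error {
	done := make(chan error, 1)
	ev := Event{Type: eventType, Severity: severity, Message: message, Details: details}
	go func() { done <- n.deliver(ev) }()
	return done
}

// deliver makes 1 + Retries attempts and sleeps Backoff between them.
func (n *Notifier) deliver(ev Event) error {
	body, err := json.Marshal(ev)
	if err != nil {
		return fmt.Errorf("marshal %s event: %w", ev.Type, err)
	}
	var lastErr error
	for attempt := 0; attempt <= n.Retries; attempt++ {
		if attempt > 0 {
			time.Sleep(n.Backoff)
		}
		status, err := n.Post(n.HubURL+"/api/v1/event", body)
		switch {
		case err != nil:
			lastErr = err
		case status < 200 || status > 299:
			lastErr = fmt.Errorf("hub returned HTTP %d", status)
		default:
			return nil
		}
	}
	return fmt.Errorf("push %s: gave up after %d attempts: %w", ev.Type, n.Retries+1, lastErr)
}
```

A caller that does not care about the result can simply ignore the returned channel; because it is buffered, the goroutine never blocks on it.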
**Phase 5 — Event Wiring:**
- `main.go`: Wired success events for backup, db-dump, integrity check; startup event with 5s delay; update event after `VerifyStartup()`
- `router.go`: Added `NotifyAppDeployed`/`NotifyAppRemoved` after successful deploy/remove via API
- `handler_restore.go`: Added `NotifyDRStarted`/`NotifyDRCompleted` in DR restore flow
- `server.go`: New `HubPushStatusData` struct and `SetHubPushStatus` callback for monitoring page
**Phase 5 — Hub Connection Monitoring:**
- `pusher.go`: Added `PushStatus` tracking (LastAttempt, LastSuccess, LastError, consecutive failures) to the report Pusher
- `handlers.go`: Monitoring page now shows Hub connection status (connected/unreachable, URL, customer ID, last success, last error) instead of Healthchecks ping UUIDs
- `monitoring.html`: Replaced "Távoli monitoring" (Remote monitoring) section with "Hub kapcsolat" (Hub connection) section
- `alerts.go`: Replaced "Missing ping UUIDs" alert with Hub connection alerts (`hub-disabled` warning, `hub-unreachable` error)
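As an illustration of how the push status tracking and the two Hub alerts could fit together, here is a minimal sketch. The field types, the `Tracker` and `StatusError` types, the `HubAlerts` function, and in particular the failure threshold are assumptions; the changelog does not say when `hub-unreachable` actually fires.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// PushStatus mirrors the fields listed in the changelog; exact names and
// types are assumptions.
type PushStatus struct {
	LastAttempt         time.Time
	LastSuccess         time.Time
	LastError           string
	ConsecutiveFailures int
}

// StatusError is a hypothetical error for a non-2xx Hub response.
type StatusError struct{ Code int }

func (e *StatusError) Error() string { return fmt.Sprintf("hub returned HTTP %d", e.Code) }

// Tracker guards PushStatus so the pusher goroutine and the HTTP
// handlers can share it safely.
type Tracker struct {
	mu     sync.Mutex
	status PushStatus
}

// Record updates the status after one push attempt; a nil err means
// success. LastError is kept after a success so the page can still show it.
func (t *Tracker) Record(err error) {
	t.mu.Lock()
	defer t.mu.Unlock()
	now := time.Now()
	t.status.LastAttempt = now
	if err != nil {
		t.status.LastError = err.Error()
		t.status.ConsecutiveFailures++
		return
	}
	t.status.LastSuccess = now
	t.status.ConsecutiveFailures = 0
}

// Snapshot returns a copy of the status for rendering.
func (t *Tracker) Snapshot() PushStatus {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.status
}

// Alert is a simplified dashboard alert.
type Alert struct{ ID, Level, Message string }

// HubAlerts derives the alerts named in the changelog. The threshold
// (consecutive failures before "hub-unreachable") is an assumed knob.
func HubAlerts(hubEnabled bool, s PushStatus, threshold int) []Alert {
	if threshold < 1 {
		threshold = 1
	}
	if !hubEnabled {
		return []Alert{{ID: "hub-disabled", Level: "warning", Message: "Hub connection is not configured"}}
	}
	if s.ConsecutiveFailures >= threshold {
		return []Alert{{ID: "hub-unreachable", Level: "error", Message: "Hub unreachable: " + s.LastError}}
	}
	return nil
}
```

Passing a snapshot into `HubAlerts` keeps the alert logic a pure function of the status, which makes it easy to test separately from the pusher.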
**Phase 5 — Expanded Notification Settings:**
- `settings.html`: Expanded from 4 checkboxes to 11 grouped toggles in two categories:
- "Hibák és figyelmeztetések" (Errors and warnings): backup_failed, db_dump_failed, backup_integrity_failed, crossdrive_failed, disk alerts, storage_disconnected, node_down, health_critical, expected backup missed
- "Tájékoztató" (Informational): storage_reconnected, health_recovered
- Compound toggles: "Lemez figyelmeztetés" (Disk warning) maps to `disk_warning` + `disk_critical`; "Elvárt mentés elmaradt" (Expected backup missed) maps to `expected_backup_missed` + `expected_dbdump_missed`
- `settings.go`: Updated `DefaultEnabledEvents` to new Hub event types
- `handlers.go`: Updated settings POST handler for expanded event names and compound toggles
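The compound toggles can be pictured as a lookup from the 11 settings toggles to Hub event types. The sketch below is a guess at the shape rather than the actual `handlers.go` code: the toggle keys `disk_alerts` and `expected_missed`, the `toggleEvents` table, and the `ExpandToggles` function are hypothetical names.

```go
package main

import "sort"

// toggleEvents maps each settings toggle to the Hub event types it
// enables. The two compound keys are hypothetical form-field names.
var toggleEvents = map[string][]string{
	"backup_failed":           {"backup_failed"},
	"db_dump_failed":          {"db_dump_failed"},
	"backup_integrity_failed": {"backup_integrity_failed"},
	"crossdrive_failed":       {"crossdrive_failed"},
	"disk_alerts":             {"disk_warning", "disk_critical"},
	"storage_disconnected":    {"storage_disconnected"},
	"node_down":               {"node_down"},
	"health_critical":         {"health_critical"},
	"expected_missed":         {"expected_backup_missed", "expected_dbdump_missed"},
	"storage_reconnected":     {"storage_reconnected"},
	"health_recovered":        {"health_recovered"},
}

// ExpandToggles turns the checked toggles from the settings form into a
// sorted, de-duplicated list of enabled event types. Unknown toggle
// names are ignored.
func ExpandToggles(checked []string) []string {
	seen := make(map[string]bool)
	var events []string
	for _, toggle := range checked {
		for _, ev := range toggleEvents[toggle] {
			if !seen[ev] {
				seen[ev] = true
				events = append(events, ev)
			}
		}
	}
	sort.Strings(events)
	return events
}
```

With this table, checking only the disk toggle and `backup_failed` would enable three event types: `backup_failed`, `disk_critical`, and `disk_warning`.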
**Phase 6 — Config Cleanup:**
- `main.go`: Deprecation log on startup when ping UUIDs are configured: `[INFO] Healthchecks ping UUIDs configured but no longer used — monitoring is now handled by the Hub`
- Pinger still runs for transitional backward compatibility
### What was just completed (2026-02-20 session 63)
- **v0.20.0 — Hub Config Management (Phase B):**