feat: Hub monitoring takeover — event system, dead man's switch, notifications (v0.3.0)

Replace external Healthchecks.io with Hub-native monitoring.

- New events table and /api/v1/event endpoint for structured events from controllers.
- Staleness checker (60s) detects unresponsive nodes.
- Backup deadline checker (daily 05:00) catches missed backups.
- Notification dispatcher sends operator (English) and customer (Hungarian) emails via Resend, with per-event cooldowns.
- Event timeline on customer page, dashboard badges.
- Config form deprecates the Monitoring UUIDs section.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
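
The staleness check and per-event notification cooldown described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the Hub's actual code: the `Monitor` class, its method names, the `node_stale` event name, and the 30-minute cooldown are all hypothetical, and the 60-second figure is reused here as the staleness threshold purely for illustration (the commit message may equally mean the check interval).

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

# Assumed values for illustration only; the real Hub settings are not shown in this commit.
STALE_AFTER = timedelta(seconds=60)
COOLDOWN = timedelta(minutes=30)


@dataclass
class Monitor:
    """Hypothetical in-memory stand-in for the Hub's events table."""

    last_seen: dict = field(default_factory=dict)       # node -> time of last event
    last_notified: dict = field(default_factory=dict)   # (node, event_type) -> time of last email

    def record_event(self, node: str, now: datetime) -> None:
        # Any structured event from a controller counts as a sign of life.
        self.last_seen[node] = now

    def stale_nodes(self, now: datetime) -> list:
        # Dead man's switch: a node is stale when nothing has arrived within STALE_AFTER.
        return sorted(
            node for node, seen in self.last_seen.items() if now - seen > STALE_AFTER
        )

    def should_notify(self, node: str, event_type: str, now: datetime) -> bool:
        # Per-event cooldown: suppress repeat notifications for the same node and event type.
        key = (node, event_type)
        last = self.last_notified.get(key)
        if last is not None and now - last < COOLDOWN:
            return False
        self.last_notified[key] = now
        return True


t0 = datetime(2025, 1, 1, tzinfo=timezone.utc)
monitor = Monitor()
monitor.record_event("node-a", t0)
monitor.record_event("node-b", t0 + timedelta(seconds=50))

now = t0 + timedelta(seconds=90)
for node in monitor.stale_nodes(now):
    if monitor.should_notify(node, "node_stale", now):
        print(f"notify: {node} has been silent for more than {STALE_AFTER}")
```

Keying the cooldown on the (node, event type) pair means a long outage on one node does not silence unrelated alerts, such as a missed backup on the same node.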
@@ -80,12 +80,12 @@ backup:
 monitoring:
   enabled: true
   healthchecks_base: "https://status.felhom.eu"
-  ping_uuids:
-    heartbeat: "" # Every 5 min — controller process alive
-    system_health: "" # Every 5 min — comprehensive system check
-    db_dump: "" # Daily — after database dumps
-    backup: "" # Daily — after restic snapshot
-    backup_integrity: "" # Weekly (Sunday) — restic check
+  # ping_uuids: (deprecated — monitoring is now handled by the Hub event system)
+  # heartbeat: ""
+  # system_health: ""
+  # db_dump: ""
+  # backup: ""
+  # backup_integrity: ""
   system_health_interval: "5m"
   health_check_schedule: "06:00"
   thresholds:
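
Because the diff above only comments out `ping_uuids` in the example config, existing deployments may still carry the old key. A controller could flag it at startup along these lines. This is a sketch under stated assumptions: the function name `deprecated_monitoring_keys` and the `DEPRECATED_KEYS` tuple are hypothetical, and the config is assumed to be already parsed into a plain dict with `ping_uuids` nested under `monitoring`.

```python
# Hypothetical list of monitoring keys superseded by the Hub event system.
DEPRECATED_KEYS = ("ping_uuids",)


def deprecated_monitoring_keys(config: dict) -> list:
    """Return dotted paths of deprecated keys still present under `monitoring`."""
    monitoring = config.get("monitoring") or {}
    return [f"monitoring.{key}" for key in DEPRECATED_KEYS if key in monitoring]


# Example: an old-style config that still defines ping UUIDs.
old_config = {
    "monitoring": {
        "enabled": True,
        "ping_uuids": {"heartbeat": "", "backup": ""},
        "system_health_interval": "5m",
    }
}

for path in deprecated_monitoring_keys(old_config):
    print(f"warning: {path} is deprecated and can be removed from the config")
```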