feat(hub): app telemetry analytics dashboard (v0.4.0)

- store/telemetry.go: new app_telemetry + app_log_issues tables with
  SaveAppTelemetry, GetFleetAppSummary (with P95), GetAppTelemetryHistory,
  GetAppCustomerBreakdown, GetCustomerAppSummary, GetAppIssues, prune methods
- api/handler.go: parse and save optional app_telemetry from report body,
  backward-compatible with old controllers
- cmd/hub/main.go: prune app_telemetry (90d) and stale issues (30d)
- web/apps.go: handleApps + handleAppDetail + chart data aggregation helpers
- web/server.go: routes for /apps, /apps/{name}, /static/chart.min.js;
  added memoryColor/accuracyClass/gt template functions
- web/embed.go: embed static/chart.min.js
- web/configs.go: add app telemetry section to handleCustomerUnified
- templates/apps.html: fleet-wide app list with summary cards and sortable table
- templates/app_detail.html: per-app page with Chart.js memory trend,
  customer breakdown, and known issues table
- templates/customer_unified.html: new "Alkalmazás telemetria" (App telemetry) card
- templates/style.css: badge, summary-card, chart, period-selector,
  accuracy-dot, mem-color, data-table styles
- All templates: added "Alkalmazások" (Apps) nav link

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 10:46:50 +01:00
parent 8bed5ec339
commit a757bee07a
20 changed files with 1323 additions and 2 deletions
+6 -2
@@ -4,7 +4,7 @@
A lightweight Go service that receives periodic reports and structured events from felhom-controller instances, stores them in SQLite, and provides a web dashboard for fleet monitoring. Also serves as the infrastructure backup store for disaster recovery, event-based dead man's switch monitoring, and notification dispatch.
-**Current version: v0.3.8**
+**Current version: v0.4.0**
---
@@ -55,11 +55,13 @@ All API endpoints require `Authorization: Bearer <api_key>` (except `/healthz` a
| Method | Path | Description |
|--------|------|-------------|
-| `POST` | `/api/v1/report` | Controller pushes periodic status report |
+| `POST` | `/api/v1/report` | Controller pushes periodic status report (controllers v0.28.0+ include the `app_telemetry` field) |
| `GET` | `/api/v1/customers` | List all customers with latest report summary |
| `GET` | `/api/v1/customers/{id}` | Get latest full report for a customer |
| `GET` | `/api/v1/customers/{id}/history?period=7d` | Get report history |
+The `POST /api/v1/report` handler (v0.4.0+) automatically parses the optional `app_telemetry` JSON array from the request body and stores it in the `app_telemetry` and `app_log_issues` tables. Old controllers (no `app_telemetry` key) continue to work unchanged.
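
The backward-compatible parsing described above could look roughly like the following Go sketch. It is not the code from `api/handler.go`: the `Report` and `AppTelemetry` types, their JSON field names other than `app_telemetry`, and the `parseReport` and `hasTelemetry` helpers are all assumptions made for illustration. The only point it demonstrates is that a missing `app_telemetry` key decodes to a nil slice rather than an error.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AppTelemetry is a hypothetical shape for one app_telemetry entry.
// The real field names are not shown in this README.
type AppTelemetry struct {
	App      string  `json:"app"`
	MemoryMB float64 `json:"memory_mb"`
}

// Report models only the part of the report body relevant here.
// customer_id is an assumed field; app_telemetry is the optional key.
type Report struct {
	CustomerID   string         `json:"customer_id"`
	AppTelemetry []AppTelemetry `json:"app_telemetry,omitempty"`
}

// parseReport decodes a report body. An absent app_telemetry key is
// not an error: encoding/json leaves the slice nil in that case.
func parseReport(body []byte) (Report, error) {
	var r Report
	if err := json.Unmarshal(body, &r); err != nil {
		return Report{}, fmt.Errorf("decode report: %w", err)
	}
	return r, nil
}

// hasTelemetry reports whether there is anything to store, so that a
// caller can skip the telemetry tables for old controllers.
func hasTelemetry(r Report) bool {
	return len(r.AppTelemetry) > 0
}
```

Under these assumptions a handler would call `parseReport` on the request body and only write telemetry rows when `hasTelemetry` returns true, so a body such as `{"customer_id":"c1"}` takes the same path as before v0.4.0.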
### Infrastructure Backup (Disaster Recovery)
| Method | Path | Description |
@@ -180,6 +182,8 @@ Synchronizer-token CSRF protection on all browser POST/DELETE/PATCH operations:
- **Dashboard (`/`)** — Fleet overview table showing all customers with live status and event count badges (error+warning in last 24h). Config-only customers (no reports yet) appear as "PENDING" with gray badge. Blocked customers are hidden. Auto-refreshes every 60 seconds.
- **Customers (`/configs`)** — Customer management list. Shows all customers (both managed and manual), their status, controller version, and config type (MANAGED/MANUAL). Blocked customers shown grayed-out with BLOCKED badge.
+- **Fleet App Analytics (`/apps`)** — Fleet-wide app telemetry overview (v0.4.0+). Shows all deployed apps across all customers with deployment count, avg/P95 memory, catalog estimate/limit accuracy indicators, and 24h error/warning badge counts. Sortable columns (deployments/memory/errors), 24h/7d/30d time period selector.
+- **App Detail (`/apps/{name}`)** — Per-app drill-down page with a Chart.js memory trend (avg and peak lines, plus a dashed catalog-limit line), a per-customer breakdown table, and a known log issues table (severity, message, occurrence count, affected customers, first/last seen). Shows a suggested `mem_limit` computed as P95 × 1.2, rounded to a multiple of 32 MB.
- **Unified Customer Detail (`/customers/{id}`)** — Single page per customer combining config management and live monitoring. Adapts content based on available data:
- **Managed + reporting:** Full view — config info, system metrics, storage, containers, backup status, events timeline (last 50, severity filter), credentials, setup commands, YAML preview, controller update, notifications (with channel column), history
- **Managed + no reports yet:** Config info, credentials, setup commands, "Waiting for first report" indicator