# TASK: Add Telemetry Debug Section to Controller Debug Page

**Scope:** Controller only. Add a "Telemetria teszt" section to the existing Debug page.

## Overview

The App Telemetry feature (metrics collection, log scanning, hub reporting, hub dashboard) is fully implemented and deployed. The remaining task is to add a debug section that lets the operator run telemetry collection on demand and see the results. This is useful for verifying the container→stack mapping, checking metrics DB values, and testing log scanner pattern matching without waiting for the 15-minute report cycle.

---

## What to Implement

### 1. New debug endpoint: `GET /api/debug/telemetry`

**File:** `controller/internal/web/handler_debug.go`

Add a new route case in `handleDebugAPI()` (between the existing "Section 5: Hub" and "Section 6: Self-Update" blocks, around line 83):

```go
// Section: Telemetry testing
case subpath == "telemetry" && r.Method == http.MethodGet:
	s.debugTelemetry(w, r)
```

Add the handler function:

```go
func (s *Server) debugTelemetry(w http.ResponseWriter, r *http.Request) {
	start := time.Now()
	telemetry := report.BuildAppTelemetryForDebug(s.stackMgr, s.metricsStore, s.logger)
	latency := time.Since(start).Milliseconds()

	writeDebugJSON(w, http.StatusOK, true, fmt.Sprintf("Telemetria összegyűjtve: %d alkalmazás (%dms)", len(telemetry), latency), map[string]interface{}{
		"latency_ms":    latency,
		"app_count":     len(telemetry),
		"app_telemetry": telemetry,
	})
}
```

**Problem:** The `Server` struct doesn't have a `metricsStore` field, so this direct version does not compile as written. There are two approaches:

**Approach A (preferred): Add a `GetTelemetryPreview` callback to `DebugCallbacks`**

This follows the existing pattern for operations that need wiring in main.go: the `Server` struct lacks `metricsStore`, but main.go has it.

In `handler_debug.go`, add to the `DebugCallbacks` struct:

```go
type DebugCallbacks struct {
	TriggerHubReportPush   func() error
	TriggerHubInfraPush    func() error
	TriggerLocalInfraWrite func() error
	TriggerSetupMode       func() error
	HubConnectivityTest    func() (statusCode int, latencyMs int64, err error)
	GiteaConnectivityTest  func() (statusCode int, latencyMs int64, err error)
	GetTelemetryPreview    func() ([]report.AppTelemetry, error) // NEW
}
```

Then the handler becomes:

```go
func (s *Server) debugTelemetry(w http.ResponseWriter, r *http.Request) {
	if s.debugCallbacks == nil || s.debugCallbacks.GetTelemetryPreview == nil {
		writeDebugJSON(w, http.StatusNotImplemented, false, "Nem bekötött", nil)
		return
	}
	start := time.Now()
	telemetry, err := s.debugCallbacks.GetTelemetryPreview()
	latency := time.Since(start).Milliseconds()
	if err != nil {
		writeDebugJSON(w, http.StatusOK, false, err.Error(), map[string]interface{}{"latency_ms": latency})
		return
	}

	// Compute summary stats
	totalErrors := 0
	totalWarnings := 0
	for _, app := range telemetry {
		totalErrors += app.LogErrors
		totalWarnings += app.LogWarnings
	}

	writeDebugJSON(w, http.StatusOK, true,
		fmt.Sprintf("Telemetria összegyűjtve: %d app, %d hiba, %d figyelmeztetés (%dms)",
			len(telemetry), totalErrors, totalWarnings, latency),
		map[string]interface{}{
			"latency_ms":     latency,
			"app_count":      len(telemetry),
			"total_errors":   totalErrors,
			"total_warnings": totalWarnings,
			"app_telemetry":  telemetry,
		})
}
```

**Approach B: Export `buildAppTelemetrySection` from the report package**

In `controller/internal/report/telemetry.go`, the function `buildAppTelemetrySection` is unexported (lowercase). Create an exported wrapper:

```go
// BuildAppTelemetryForDebug runs the telemetry collection pipeline (metrics + log scan)
// and returns the result. Used by the debug endpoint.
func BuildAppTelemetryForDebug(stackMgr *stacks.Manager, metricsStore *metrics.MetricsStore, logger *log.Logger) []AppTelemetry {
	return buildAppTelemetrySection(stackMgr, metricsStore, logger)
}
```

Then wire the callback in main.go using this exported function. Step 2 does exactly this, so the wrapper is needed for Approach A as well.

### 2. Wire the callback in `cmd/controller/main.go`

In the `DebugCallbacks` setup block (around line 615), add:

```go
dc.GetTelemetryPreview = func() ([]report.AppTelemetry, error) {
	return report.BuildAppTelemetryForDebug(stackMgr, metricsStore, logger), nil
}
```

This goes inside the existing `if cfg.Logging.Level == "debug"` block, alongside the other callback assignments. It does NOT need to be inside the `if hubPusher != nil` guard, because telemetry works regardless of hub config.
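
A minimal sketch of that guard placement, assuming a simplified setup: `wireDebugCallbacks`, its parameters, and the trimmed `DebugCallbacks` and `AppTelemetry` types are hypothetical and only illustrate that the telemetry callback sits inside the debug-level check but outside the hub check. The real block in main.go may be structured differently.

```go
package main

import "fmt"

// AppTelemetry is a hypothetical stand-in for report.AppTelemetry.
type AppTelemetry struct{ AppName string }

// DebugCallbacks is trimmed to two fields for this sketch.
type DebugCallbacks struct {
	TriggerHubReportPush func() error
	GetTelemetryPreview  func() ([]AppTelemetry, error)
}

// wireDebugCallbacks is hypothetical; it only illustrates where each
// assignment sits relative to the two guards.
func wireDebugCallbacks(logLevel string, hubConfigured bool, collect func() []AppTelemetry) *DebugCallbacks {
	if logLevel != "debug" {
		return nil // assumption: no debug callbacks outside debug log level
	}
	dc := &DebugCallbacks{}
	if hubConfigured {
		// Hub-dependent callbacks stay behind the hub guard.
		dc.TriggerHubReportPush = func() error { return nil }
	}
	// The telemetry preview is assigned outside the hub guard.
	dc.GetTelemetryPreview = func() ([]AppTelemetry, error) {
		return collect(), nil
	}
	return dc
}

func main() {
	collect := func() []AppTelemetry { return []AppTelemetry{{AppName: "demo"}} }
	dc := wireDebugCallbacks("debug", false, collect) // debug level, no hub configured
	apps, _ := dc.GetTelemetryPreview()
	fmt.Println(len(apps), dc.TriggerHubReportPush == nil)
}
```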

### 3. Add the debug page section in `debug.html`

**File:** `controller/internal/web/templates/debug.html`

Add a new section between "Hub & Kapcsolatok" (section-hub) and "Önfrissítés teszt" (section-selfupdate). Insert after the closing `</div>` of section-hub (line 128) and before the `<!-- Section 6: Self-Update Testing -->` comment (line 130):

```html
<!-- Section: Telemetry Testing -->
<div class="card debug-section" id="section-telemetry">
    <div class="card-header debug-section-header" onclick="toggleSection('telemetry')">
        <h3>Telemetria teszt</h3>
        <span class="section-toggle">▶</span>
    </div>
    <div class="card-body debug-section-body" style="display:none">
        <div id="telemetry-status"><span class="text-muted">Kattintson a gombra a telemetria futtatásához.</span></div>
        <div class="debug-actions">
            <button class="btn btn-primary btn-sm" id="btn-telemetry-run" data-label="Telemetria futtatása" onclick="runTelemetryTest()">Telemetria futtatása</button>
            <span class="debug-result" id="btn-telemetry-run-result"></span>
        </div>
        <div id="telemetry-detail" style="display:none; margin-top:1rem;"></div>
    </div>
</div>
```

### 4. Add JavaScript for the telemetry section in `debug.html`

Add the following in the `<script>` block, alongside the other section functions. Also update `loadSectionData` with a telemetry case, even though this section doesn't auto-load (the user clicks the button).

In `loadSectionData`, add:

```javascript
case 'telemetry': break; // no auto-load, user triggers manually
```

Add the `runTelemetryTest()` function:

```javascript
// ── Telemetry test ──
function runTelemetryTest() {
    var btn = document.getElementById('btn-telemetry-run');
    var result = document.getElementById('btn-telemetry-run-result');
    var detail = document.getElementById('telemetry-detail');
    btn.disabled = true;
    btn.textContent = 'Folyamatban...';
    result.className = 'debug-result';
    result.textContent = '';
    detail.style.display = 'none';

    fetch('/api/debug/telemetry', {headers: csrfHeaders()}).then(function(r){return r.json()}).then(function(data) {
        if (data.ok) {
            result.className = 'debug-result debug-result-ok';
            result.textContent = data.message;
            if (data.data && data.data.app_telemetry) {
                renderTelemetryDetail(data.data);
            }
        } else {
            result.className = 'debug-result debug-result-error';
            // The handler passes its error text as the message argument, so fall back to it.
            result.textContent = data.error || data.message || 'Hiba';
        }
    }).catch(function(e) {
        result.className = 'debug-result debug-result-error';
        result.textContent = 'Hálózati hiba: ' + e.message;
    }).finally(function() {
        btn.disabled = false;
        btn.textContent = btn.dataset.label;
    });
}

// Format a possibly missing numeric field with one decimal place.
function fmt1(v) {
    return (Number(v) || 0).toFixed(1);
}

function renderTelemetryDetail(data) {
    var detail = document.getElementById('telemetry-detail');
    var apps = data.app_telemetry || [];
    if (apps.length === 0) {
        detail.innerHTML = '<span class="text-muted">Nincs telepített alkalmazás vagy nincs mérési adat.</span>';
        detail.style.display = 'block';
        return;
    }

    var html = '<table class="debug-table" style="width:100%;font-size:.85rem">';
    html += '<thead><tr><th>Alkalmazás</th><th>Konténerek</th><th>Memória (jelen.)</th><th>Memória (átlag)</th><th>Memória (csúcs)</th><th>CPU (átlag)</th><th>Katalógus limit</th><th>Hibák</th><th>Figyelmeztetések</th></tr></thead><tbody>';

    for (var i = 0; i < apps.length; i++) {
        var a = apps[i];
        var errorClass = a.log_errors > 0 ? ' style="color:var(--red);font-weight:600"' : '';
        var warnClass = a.log_warnings > 0 ? ' style="color:var(--yellow);font-weight:600"' : '';
        html += '<tr>';
        html += '<td><strong>' + escHtml(a.display_name || a.app_name) + '</strong></td>';
        // Escape every server-provided string before it goes into innerHTML.
        html += '<td class="mono" style="font-size:.8rem">' + escHtml((a.containers || []).join(', ')) + '</td>';
        html += '<td>' + fmt1(a.memory_current_mb) + ' MB</td>';
        html += '<td>' + fmt1(a.memory_avg_mb) + ' MB</td>';
        html += '<td>' + fmt1(a.memory_peak_mb) + ' MB</td>';
        html += '<td>' + fmt1(a.cpu_avg_percent) + '%</td>';
        html += '<td class="mono">' + escHtml(a.catalog_limit || '-') + '</td>';
        html += '<td' + errorClass + '>' + (a.log_errors || 0) + '</td>';
        html += '<td' + warnClass + '>' + (a.log_warnings || 0) + '</td>';
        html += '</tr>';

        // Show issues as sub-rows if any
        if (a.issues && a.issues.length > 0) {
            for (var j = 0; j < a.issues.length; j++) {
                var issue = a.issues[j];
                var sevColor = issue.severity === 'error' ? 'var(--red)' : 'var(--yellow)';
                html += '<tr style="font-size:.8rem;opacity:.85"><td></td>';
                html += '<td colspan="6" style="padding-left:1.5rem"><span style="color:' + sevColor + '">' + escHtml(String(issue.severity || '').toUpperCase()) + '</span> ' + escHtml(issue.message) + '</td>';
                html += '<td colspan="2">×' + (Number(issue.count) || 0) + '</td></tr>';
            }
        }
    }
    html += '</tbody></table>';

    // Also show raw JSON in a collapsible detail
    html += '<details style="margin-top:1rem"><summary class="text-muted" style="cursor:pointer;font-size:.85rem">Nyers JSON</summary>';
    html += '<pre class="mono" style="font-size:.75rem;max-height:400px;overflow:auto;padding:.5rem;background:rgba(0,0,0,.3);border-radius:.25rem;margin-top:.5rem">' + escHtml(JSON.stringify(apps, null, 2)) + '</pre>';
    html += '</details>';

    detail.innerHTML = html;
    detail.style.display = 'block';
}
```

**Note:** The `escHtml` function should already exist in the debug page JS. Check whether it is defined; if not, add a simple one:

```javascript
function escHtml(s) {
    var d = document.createElement('div');
    d.textContent = s;
    return d.innerHTML;
}
```

### 5. Export the telemetry builder from the report package

**File:** `controller/internal/report/telemetry.go`

Add this exported function at the end of the file (this is the wrapper described under Approach B):

```go
// BuildAppTelemetryForDebug runs the full telemetry collection pipeline
// (metrics query + log scan) and returns per-app telemetry data.
// Used by the debug endpoint to preview telemetry without pushing to hub.
func BuildAppTelemetryForDebug(stackMgr *stacks.Manager, metricsStore *metrics.MetricsStore, logger *log.Logger) []AppTelemetry {
	return buildAppTelemetrySection(stackMgr, metricsStore, logger)
}
```

---

## Files Changed Summary

| File | Change |
|------|--------|
| `controller/internal/report/telemetry.go` | **MODIFY**: Add `BuildAppTelemetryForDebug()` public wrapper |
| `controller/internal/web/handler_debug.go` | **MODIFY**: Add `GetTelemetryPreview` to `DebugCallbacks`, add `debugTelemetry()` handler, add route case |
| `controller/cmd/controller/main.go` | **MODIFY**: Wire `GetTelemetryPreview` callback in debug callbacks block |
| `controller/internal/web/templates/debug.html` | **MODIFY**: Add "Telemetria teszt" section + JS rendering |

---

## Key Implementation Details

### Existing patterns to follow exactly:

1. **DebugCallbacks pattern**: See existing callbacks like `TriggerHubReportPush` at `handler_debug.go:24-30`. The new `GetTelemetryPreview` follows the same nil-check guard pattern used by `debugHubPush()` at line 460.

2. **writeDebugJSON**: Standard response helper at `handler_debug.go:102-120`. Always use this for debug endpoint responses.

3. **Button pattern**: Button needs `id`, `data-label` (original text, restored after action), and `onclick` calling the JS function. Result span has `id="btn-{id}-result"`.

4. **JS function pattern**: Disable button → change text to "Folyamatban..." → fetch endpoint → show result → restore button. See `triggerAction()` at approximately line 220 in debug.html.

5. **Section HTML pattern**: `div.card.debug-section > div.card-header.debug-section-header + div.card-body.debug-section-body`. The body starts with `style="display:none"`.

6. **loadSectionData switch**: At approximately line 262 in debug.html. Add the `'telemetry'` case to the switch (even though it's a no-op, for consistency).

7. **All UI text in Hungarian**: "Telemetria teszt", "Telemetria futtatása", "Folyamatban...", "Nincs telepített alkalmazás", "Nyers JSON", "Hibák", "Figyelmeztetések", etc.

### Import needed in handler_debug.go:

The `debugTelemetry` handler uses the `report.AppTelemetry` type via the callback return. The import for the report package:

```go
import "gitea.dooplex.hu/admin/felhom-controller/internal/report"
```

Check if this import already exists in handler_debug.go. It likely doesn't (current imports are for `backup`, `monitor`, `stacks`, `system`). Add it.

### CSS already available:

The debug page already has styles for `debug-table`, `debug-result`, `debug-result-ok`, `debug-result-error`, `debug-actions`, `debug-kv-grid`, `mono`, `text-muted`, `btn`, `btn-primary`, `btn-sm`, etc. No new CSS needed.

### CSS variables available:

- `var(--red)`: for errors
- `var(--yellow)`: for warnings
- `var(--green)`: for success/ok

---
## Testing

After deployment:

1. Go to Debug page on the controller
2. Expand "Telemetria teszt" section
3. Click "Telemetria futtatása"
4. Verify: table shows all deployed non-protected stacks
5. Verify: memory values match what you see on the Rendszermonitor page
6. Verify: log errors/warnings appear (intentionally stop a dependent container to generate errors)
7. Verify: "Nyers JSON" collapsible shows the exact payload that would go to the hub
8. Verify: the scan completes in reasonable time (shown in the result message as "Xms")
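
Part of this checklist can be cross-checked mechanically against a saved copy of the response's `data` object. The sketch below is an optional aid under stated assumptions: `checkTelemetry`, the struct names, the latency threshold, and the sample payload are all hypothetical (the sample values are invented for illustration, not real measurements). The JSON keys are the ones used by the handler and the page JavaScript.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// appTelemetry holds only the per-app fields needed for the checks.
type appTelemetry struct {
	AppName     string `json:"app_name"`
	LogErrors   int    `json:"log_errors"`
	LogWarnings int    `json:"log_warnings"`
}

// telemetryData mirrors the keys the debug handler puts into "data".
type telemetryData struct {
	LatencyMs     int64          `json:"latency_ms"`
	AppCount      int            `json:"app_count"`
	TotalErrors   int            `json:"total_errors"`
	TotalWarnings int            `json:"total_warnings"`
	AppTelemetry  []appTelemetry `json:"app_telemetry"`
}

// checkTelemetry returns human-readable problems found in the payload.
// maxLatencyMs is a hypothetical threshold for "reasonable time".
func checkTelemetry(raw []byte, maxLatencyMs int64) ([]string, error) {
	var d telemetryData
	if err := json.Unmarshal(raw, &d); err != nil {
		return nil, err
	}
	var problems []string
	if d.AppCount != len(d.AppTelemetry) {
		problems = append(problems, fmt.Sprintf("app_count=%d but %d apps listed", d.AppCount, len(d.AppTelemetry)))
	}
	sumErr, sumWarn := 0, 0
	for _, a := range d.AppTelemetry {
		sumErr += a.LogErrors
		sumWarn += a.LogWarnings
	}
	if d.TotalErrors != sumErr {
		problems = append(problems, fmt.Sprintf("total_errors=%d but per-app sum is %d", d.TotalErrors, sumErr))
	}
	if d.TotalWarnings != sumWarn {
		problems = append(problems, fmt.Sprintf("total_warnings=%d but per-app sum is %d", d.TotalWarnings, sumWarn))
	}
	if d.LatencyMs > maxLatencyMs {
		problems = append(problems, fmt.Sprintf("latency %dms exceeds %dms", d.LatencyMs, maxLatencyMs))
	}
	return problems, nil
}

// sample is invented illustration data, not output from a real controller.
const sample = `{"latency_ms": 840, "app_count": 2, "total_errors": 3, "total_warnings": 1,
 "app_telemetry": [
  {"app_name": "app-a", "log_errors": 3, "log_warnings": 0},
  {"app_name": "app-b", "log_errors": 0, "log_warnings": 1}]}`

func main() {
	problems, err := checkTelemetry([]byte(sample), 5000)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println("problems found:", len(problems))
	for _, p := range problems {
		fmt.Println(" -", p)
	}
}
```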

---

## Build & Deploy

```bash
SSH=/c/Windows/System32/OpenSSH/ssh.exe
# After committing and pushing:
$SSH kisfenyo@192.168.0.180 "cd ~/build/felhom-controller && git -C ~/git/deploy-felhom-compose pull && ./build.sh v0.28.0 --push"
$SSH kisfenyo@192.168.0.162 "cd /opt/docker/felhom-controller && sudo docker pull gitea.dooplex.hu/admin/felhom-controller:v0.28.0 && sudo sed -i 's|image: gitea.dooplex.hu/admin/felhom-controller:.*|image: gitea.dooplex.hu/admin/felhom-controller:v0.28.0|' docker-compose.yml && sudo docker compose up -d"
```

Note: If v0.28.0 is already built/deployed with the telemetry feature, bump to v0.28.1 for this debug addition.