# TASK: Add Telemetry Debug Section to Controller Debug Page
**Scope:** Controller only — add a "Telemetria teszt" section to the existing Debug page
## Overview
The App Telemetry feature (metrics collection, log scanning, hub reporting, hub dashboard) is fully implemented and deployed. The remaining task is to add a debug section that lets the operator run telemetry collection on-demand and see the results — useful for verifying container→stack mapping, checking metrics DB values, and testing log scanner pattern matching without waiting for the 15-minute report cycle.
---
## What to Implement
### 1. New debug endpoint: `GET /api/debug/telemetry`
**File:** `controller/internal/web/handler_debug.go`
Add a new route case in `handleDebugAPI()` (between the existing "Section 5: Hub" and "Section 6: Self-Update" blocks, around line 83):
```go
// Section: Telemetry testing
case subpath == "telemetry" && r.Method == http.MethodGet:
	s.debugTelemetry(w, r)
```
A first draft of the handler function would call the report package directly:
```go
func (s *Server) debugTelemetry(w http.ResponseWriter, r *http.Request) {
	start := time.Now()
	// NOTE: s.metricsStore is not a field of Server; see the problem described below.
	telemetry := report.BuildAppTelemetryForDebug(s.stackMgr, s.metricsStore, s.logger)
	latency := time.Since(start).Milliseconds()
	writeDebugJSON(w, http.StatusOK, true, fmt.Sprintf("Telemetria összegyűjtve: %d alkalmazás (%dms)", len(telemetry), latency), map[string]interface{}{
		"latency_ms":    latency,
		"app_count":     len(telemetry),
		"app_telemetry": telemetry,
	})
}
```
**Problem:** The `Server` struct has no `metricsStore` field, so the handler above does not compile as written. There are two approaches:
**Approach A (preferred): Add a `GetTelemetryPreview` callback to `DebugCallbacks`**
This follows the existing pattern for operations that need wiring in main.go, which does have access to `metricsStore`.
In `handler_debug.go`, add to the `DebugCallbacks` struct:
```go
type DebugCallbacks struct {
	TriggerHubReportPush   func() error
	TriggerHubInfraPush    func() error
	TriggerLocalInfraWrite func() error
	TriggerSetupMode       func() error
	HubConnectivityTest    func() (statusCode int, latencyMs int64, err error)
	GiteaConnectivityTest  func() (statusCode int, latencyMs int64, err error)
	GetTelemetryPreview    func() ([]report.AppTelemetry, error) // NEW
}
```
Then the handler becomes:
```go
func (s *Server) debugTelemetry(w http.ResponseWriter, r *http.Request) {
	if s.debugCallbacks == nil || s.debugCallbacks.GetTelemetryPreview == nil {
		writeDebugJSON(w, http.StatusNotImplemented, false, "Nem bekötött", nil)
		return
	}
	start := time.Now()
	telemetry, err := s.debugCallbacks.GetTelemetryPreview()
	latency := time.Since(start).Milliseconds()
	if err != nil {
		writeDebugJSON(w, http.StatusOK, false, err.Error(), map[string]interface{}{"latency_ms": latency})
		return
	}
	// Compute summary stats
	totalErrors := 0
	totalWarnings := 0
	for _, app := range telemetry {
		totalErrors += app.LogErrors
		totalWarnings += app.LogWarnings
	}
	writeDebugJSON(w, http.StatusOK, true,
		fmt.Sprintf("Telemetria összegyűjtve: %d app, %d hiba, %d figyelmeztetés (%dms)",
			len(telemetry), totalErrors, totalWarnings, latency),
		map[string]interface{}{
			"latency_ms":     latency,
			"app_count":      len(telemetry),
			"total_errors":   totalErrors,
			"total_warnings": totalWarnings,
			"app_telemetry":  telemetry,
		})
}
```
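The nil-guard, error path, and summary counters of the callback-based handler can be exercised without a running controller. The following is a minimal sketch, not the project's code: `AppTelemetry`, `Server`, `DebugCallbacks`, and `writeDebugJSON` are simplified stand-ins, the `{ok, message, data}` envelope is an assumption inferred from how the page JavaScript reads the response, and `serverWith`, `call`, and `errScan` are hypothetical helpers introduced for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// AppTelemetry is a stand-in for report.AppTelemetry with only the counters
// that the summary needs.
type AppTelemetry struct {
	AppName     string `json:"app_name"`
	LogErrors   int    `json:"log_errors"`
	LogWarnings int    `json:"log_warnings"`
}

// DebugCallbacks and Server model only the fields this handler touches.
type DebugCallbacks struct {
	GetTelemetryPreview func() ([]AppTelemetry, error)
}

type Server struct {
	debugCallbacks *DebugCallbacks
}

// errScan is a hypothetical failure used to drive the error path.
var errScan = errors.New("scan failed")

// writeDebugJSON approximates the real helper (assumed envelope shape).
func writeDebugJSON(w http.ResponseWriter, status int, ok bool, msg string, data map[string]interface{}) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	_ = json.NewEncoder(w).Encode(map[string]interface{}{"ok": ok, "message": msg, "data": data})
}

func (s *Server) debugTelemetry(w http.ResponseWriter, r *http.Request) {
	// Guard: the callback is only wired when main.go sets it up.
	if s.debugCallbacks == nil || s.debugCallbacks.GetTelemetryPreview == nil {
		writeDebugJSON(w, http.StatusNotImplemented, false, "not wired", nil)
		return
	}
	apps, err := s.debugCallbacks.GetTelemetryPreview()
	if err != nil {
		writeDebugJSON(w, http.StatusOK, false, err.Error(), nil)
		return
	}
	totalErrors, totalWarnings := 0, 0
	for _, a := range apps {
		totalErrors += a.LogErrors
		totalWarnings += a.LogWarnings
	}
	writeDebugJSON(w, http.StatusOK, true, fmt.Sprintf("%d apps", len(apps)), map[string]interface{}{
		"app_count":      len(apps),
		"total_errors":   totalErrors,
		"total_warnings": totalWarnings,
	})
}

// serverWith builds a Server whose callback returns fixed values.
func serverWith(apps []AppTelemetry, err error) *Server {
	return &Server{debugCallbacks: &DebugCallbacks{
		GetTelemetryPreview: func() ([]AppTelemetry, error) { return apps, err },
	}}
}

// call runs the handler against an in-memory recorder and decodes the body.
func call(s *Server) (int, map[string]interface{}) {
	rec := httptest.NewRecorder()
	s.debugTelemetry(rec, httptest.NewRequest(http.MethodGet, "/api/debug/telemetry", nil))
	var body map[string]interface{}
	_ = json.Unmarshal(rec.Body.Bytes(), &body)
	return rec.Code, body
}

func main() {
	code, _ := call(&Server{}) // no callbacks wired
	fmt.Println("unwired status:", code)

	apps := []AppTelemetry{
		{AppName: "a", LogErrors: 2},
		{AppName: "b", LogErrors: 1, LogWarnings: 4},
	}
	_, body := call(serverWith(apps, nil))
	fmt.Println("wired data:", body["data"])

	_, body = call(serverWith(nil, errScan))
	fmt.Println("failing:", body["ok"], body["message"])
}
```

Keeping the collection behind a function-typed field is what makes this possible: a test can substitute canned telemetry without constructing a stack manager or metrics store.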
**Approach B: Export `buildAppTelemetrySection` from the report package**
In `controller/internal/report/telemetry.go`, the function `buildAppTelemetrySection` is unexported (lowercase), so code outside the report package cannot call it. Create an exported wrapper:
```go
// BuildAppTelemetryForDebug runs the telemetry collection pipeline (metrics + log scan)
// and returns the result. Used by the debug endpoint.
func BuildAppTelemetryForDebug(stackMgr *stacks.Manager, metricsStore *metrics.MetricsStore, logger *log.Logger) []AppTelemetry {
	return buildAppTelemetrySection(stackMgr, metricsStore, logger)
}
```
Then wire the callback in main.go using this exported function.
### 2. Wire the callback in `cmd/controller/main.go`
In the `DebugCallbacks` setup block (around line 615), add:
```go
dc.GetTelemetryPreview = func() ([]report.AppTelemetry, error) {
	return report.BuildAppTelemetryForDebug(stackMgr, metricsStore, logger), nil
}
```
This goes inside the existing `if cfg.Logging.Level == "debug"` block, alongside the other callback assignments. It does NOT need to be inside the `if hubPusher != nil` guard — telemetry works regardless of hub config.
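The placement rule above (inside the debug-level block, outside the hub guard) can be sketched as follows. This is an illustration under stated assumptions rather than the real main.go: `newDebugCallbacks` and `buildTelemetry` are hypothetical names, `buildTelemetry` stands in for `report.BuildAppTelemetryForDebug`, and the stack names in `main` are made up.

```go
package main

import "fmt"

// AppTelemetry is a reduced stand-in for report.AppTelemetry.
type AppTelemetry struct{ AppName string }

// DebugCallbacks models one hub-dependent callback and the new telemetry one.
type DebugCallbacks struct {
	TriggerHubReportPush func() error
	GetTelemetryPreview  func() ([]AppTelemetry, error)
}

// buildTelemetry is a hypothetical stand-in for report.BuildAppTelemetryForDebug.
func buildTelemetry(stackNames []string) []AppTelemetry {
	out := make([]AppTelemetry, 0, len(stackNames))
	for _, n := range stackNames {
		out = append(out, AppTelemetry{AppName: n})
	}
	return out
}

// newDebugCallbacks mirrors the wiring order described in the task.
func newDebugCallbacks(logLevel string, hubConfigured bool, stackNames []string) *DebugCallbacks {
	if logLevel != "debug" {
		return nil // debug callbacks are only wired at debug log level
	}
	dc := &DebugCallbacks{}
	// Telemetry preview: wired regardless of hub configuration. The closure
	// captures the dependencies that the web Server does not hold itself.
	dc.GetTelemetryPreview = func() ([]AppTelemetry, error) {
		return buildTelemetry(stackNames), nil
	}
	if hubConfigured {
		// Hub-dependent callbacks stay behind the hub guard.
		dc.TriggerHubReportPush = func() error { return nil }
	}
	return dc
}

func main() {
	dc := newDebugCallbacks("debug", false, []string{"nextcloud", "jellyfin"})
	apps, _ := dc.GetTelemetryPreview()
	fmt.Println(len(apps), dc.TriggerHubReportPush == nil) // prints "2 true"
}
```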
### 3. Add the debug page section in `debug.html`
**File:** `controller/internal/web/templates/debug.html`
Add a new section between "Hub & Kapcsolatok" (section-hub) and "Önfrissítés teszt" (section-selfupdate). Insert after the closing `</div>` of section-hub (line 128) and before the `<!-- Section 6: Self-Update Testing -->` comment (line 130):
```html
<!-- Section: Telemetry Testing -->
<div class="card debug-section" id="section-telemetry">
  <div class="card-header debug-section-header" onclick="toggleSection('telemetry')">
    <h3>Telemetria teszt</h3>
    <span class="section-toggle"></span>
  </div>
  <div class="card-body debug-section-body" style="display:none">
    <div id="telemetry-status"><span class="text-muted">Kattintson a gombra a telemetria futtatásához.</span></div>
    <div class="debug-actions">
      <button class="btn btn-primary btn-sm" id="btn-telemetry-run" data-label="Telemetria futtatása" onclick="runTelemetryTest()">Telemetria futtatása</button>
      <span class="debug-result" id="btn-telemetry-run-result"></span>
    </div>
    <div id="telemetry-detail" style="display:none; margin-top:1rem;"></div>
  </div>
</div>
```
### 4. Add JavaScript for the telemetry section in `debug.html`
Add the following to the `<script>` block, alongside the other section functions. Also add a telemetry case to `loadSectionData`; the case is a no-op because this section does not auto-load (the user clicks the button).
In `loadSectionData`, add:
```javascript
case 'telemetry': break; // no auto-load, user triggers manually
```
Add the `runTelemetryTest()` function:
```javascript
// ── Telemetry test ──
function runTelemetryTest() {
  var btn = document.getElementById('btn-telemetry-run');
  var result = document.getElementById('btn-telemetry-run-result');
  var detail = document.getElementById('telemetry-detail');
  btn.disabled = true;
  btn.textContent = 'Folyamatban...';
  result.className = 'debug-result';
  result.textContent = '';
  detail.style.display = 'none';
  fetch('/api/debug/telemetry', {headers: csrfHeaders()}).then(function(r){return r.json()}).then(function(data) {
    if (data.ok) {
      result.className = 'debug-result debug-result-ok';
      result.textContent = data.message;
      if (data.data && data.data.app_telemetry) {
        renderTelemetryDetail(data.data);
      }
    } else {
      result.className = 'debug-result debug-result-error';
      // The handler passes the error text as the message argument, so fall back to it.
      result.textContent = data.error || data.message || 'Hiba';
    }
  }).catch(function(e) {
    result.className = 'debug-result debug-result-error';
    result.textContent = 'Hálózati hiba: ' + e.message;
  }).finally(function() {
    btn.disabled = false;
    btn.textContent = btn.dataset.label;
  });
}
function renderTelemetryDetail(data) {
  var detail = document.getElementById('telemetry-detail');
  var apps = data.app_telemetry || [];
  if (apps.length === 0) {
    detail.innerHTML = '<span class="text-muted">Nincs telepített alkalmazás vagy nincs mérési adat.</span>';
    detail.style.display = 'block';
    return;
  }
  var html = '<table class="debug-table" style="width:100%;font-size:.85rem">';
  html += '<thead><tr><th>Alkalmazás</th><th>Konténerek</th><th>Memória (jelen.)</th><th>Memória (átlag)</th><th>Memória (csúcs)</th><th>CPU (átlag)</th><th>Katalógus limit</th><th>Hibák</th><th>Figyelmeztetések</th></tr></thead><tbody>';
  for (var i = 0; i < apps.length; i++) {
    var a = apps[i];
    var errorClass = a.log_errors > 0 ? ' style="color:var(--red);font-weight:600"' : '';
    var warnClass = a.log_warnings > 0 ? ' style="color:var(--yellow);font-weight:600"' : '';
    html += '<tr>';
    html += '<td><strong>' + escHtml(a.display_name || a.app_name) + '</strong></td>';
    // Escape every server-provided string before it goes into innerHTML.
    html += '<td class="mono" style="font-size:.8rem">' + escHtml((a.containers || []).join(', ')) + '</td>';
    // Numeric fields default to 0 so a missing value cannot break toFixed().
    html += '<td>' + (a.memory_current_mb || 0).toFixed(1) + ' MB</td>';
    html += '<td>' + (a.memory_avg_mb || 0).toFixed(1) + ' MB</td>';
    html += '<td>' + (a.memory_peak_mb || 0).toFixed(1) + ' MB</td>';
    html += '<td>' + (a.cpu_avg_percent || 0).toFixed(1) + '%</td>';
    html += '<td class="mono">' + escHtml(a.catalog_limit || '-') + '</td>';
    html += '<td' + errorClass + '>' + (a.log_errors || 0) + '</td>';
    html += '<td' + warnClass + '>' + (a.log_warnings || 0) + '</td>';
    html += '</tr>';
    // Show issues as sub-rows if any
    if (a.issues && a.issues.length > 0) {
      for (var j = 0; j < a.issues.length; j++) {
        var issue = a.issues[j];
        var sevColor = issue.severity === 'error' ? 'var(--red)' : 'var(--yellow)';
        html += '<tr style="font-size:.8rem;opacity:.85"><td></td>';
        html += '<td colspan="6" style="padding-left:1.5rem"><span style="color:' + sevColor + '">' + escHtml(String(issue.severity || '').toUpperCase()) + '</span> ' + escHtml(issue.message) + '</td>';
        html += '<td colspan="2">×' + (issue.count || 0) + '</td></tr>';
      }
    }
  }
  html += '</tbody></table>';
  // Also show raw JSON in a collapsible detail
  html += '<details style="margin-top:1rem"><summary class="text-muted" style="cursor:pointer;font-size:.85rem">Nyers JSON</summary>';
  html += '<pre class="mono" style="font-size:.75rem;max-height:400px;overflow:auto;padding:.5rem;background:rgba(0,0,0,.3);border-radius:.25rem;margin-top:.5rem">' + escHtml(JSON.stringify(apps, null, 2)) + '</pre>';
  html += '</details>';
  detail.innerHTML = html;
  detail.style.display = 'block';
}
```
**Note:** `escHtml` should already be defined in the debug page JS. If it is not, add a simple version:
```javascript
function escHtml(s) {
  var d = document.createElement('div');
  d.textContent = s;
  return d.innerHTML;
}
```
### 5. Export the telemetry builder from the report package
**File:** `controller/internal/report/telemetry.go`
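Add this exported function at the end of the file. It is the same wrapper described under Approach B in step 1, and it must exist before the callback in step 2 will compile: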
```go
// BuildAppTelemetryForDebug runs the full telemetry collection pipeline
// (metrics query + log scan) and returns per-app telemetry data.
// Used by the debug endpoint to preview telemetry without pushing to hub.
func BuildAppTelemetryForDebug(stackMgr *stacks.Manager, metricsStore *metrics.MetricsStore, logger *log.Logger) []AppTelemetry {
	return buildAppTelemetrySection(stackMgr, metricsStore, logger)
}
```
---
## Files Changed Summary
| File | Change |
|------|--------|
| `controller/internal/report/telemetry.go` | **MODIFY** — Add `BuildAppTelemetryForDebug()` public wrapper |
| `controller/internal/web/handler_debug.go` | **MODIFY** — Add `GetTelemetryPreview` to `DebugCallbacks`, add `debugTelemetry()` handler, add route case |
| `controller/cmd/controller/main.go` | **MODIFY** — Wire `GetTelemetryPreview` callback in debug callbacks block |
| `controller/internal/web/templates/debug.html` | **MODIFY** — Add "Telemetria teszt" section + JS rendering |
---
## Key Implementation Details
### Existing patterns to follow exactly:
1. **DebugCallbacks pattern** — See existing callbacks like `TriggerHubReportPush` at `handler_debug.go:24-30`. The new `GetTelemetryPreview` follows the same nil-check guard pattern used by `debugHubPush()` at line 460.
2. **writeDebugJSON** — Standard response helper at `handler_debug.go:102-120`. Always use this for debug endpoint responses.
3. **Button pattern** — Button needs `id`, `data-label` (original text, restored after action), and `onclick` calling the JS function. Result span has `id="btn-{id}-result"`.
4. **JS function pattern** — Disable button → change text to "Folyamatban..." → fetch endpoint → show result → restore button. See `triggerAction()` at approximately line 220 in debug.html.
5. **Section HTML pattern** — `div.card.debug-section > div.card-header.debug-section-header + div.card-body.debug-section-body`. The body starts with `style="display:none"`.
6. **loadSectionData switch** — At approximately line 262 in debug.html. Add the `'telemetry'` case to the switch (even though it's a no-op, for consistency).
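7. **All UI text in Hungarian** — "Telemetria teszt" (Telemetry test), "Telemetria futtatása" (Run telemetry), "Folyamatban..." (In progress...), "Nincs telepített alkalmazás" (No installed application), "Nyers JSON" (Raw JSON), "Hibák" (Errors), "Figyelmeztetések" (Warnings), etc.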
### Import needed in handler_debug.go:
The `GetTelemetryPreview` field in `DebugCallbacks` references the `report.AppTelemetry` type, so `handler_debug.go` must import the report package:
```go
import "gitea.dooplex.hu/admin/felhom-controller/internal/report"
```
Check if this import already exists in handler_debug.go — it likely doesn't (current imports are for `backup`, `monitor`, `stacks`, `system`). Add it.
### CSS already available:
The debug page already has styles for `debug-table`, `debug-result`, `debug-result-ok`, `debug-result-error`, `debug-actions`, `debug-kv-grid`, `mono`, `text-muted`, `btn`, `btn-primary`, `btn-sm`, etc. No new CSS needed.
### CSS variables available:
- `var(--red)` — for errors
- `var(--yellow)` — for warnings
- `var(--green)` — for success/ok
---
## Testing
After deployment:
1. Go to Debug page on the controller
2. Expand "Telemetria teszt" section
3. Click "Telemetria futtatása"
4. Verify: table shows all deployed non-protected stacks
5. Verify: memory values match what you see on the Rendszermonitor page
6. Verify: log errors/warnings appear (intentionally stop a dependent container to generate errors)
7. Verify: "Nyers JSON" collapsible shows the exact payload that would go to the hub
8. Verify: the scan completes in reasonable time (shown in the result message as "Xms")
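Part of the checklist above could be automated with a small smoke check against the endpoint. The sketch below is an assumption-heavy illustration: the `{ok, message, data}` envelope and the JSON field names are inferred from the page JavaScript, `checkTelemetry` and `fakeController` are hypothetical helpers, and the real endpoint very likely requires an authenticated session and CSRF headers that this sketch omits. It runs against an in-memory fake so that it is self-contained.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// appTelemetry models a subset of the per-app fields the debug page renders.
type appTelemetry struct {
	AppName    string   `json:"app_name"`
	Containers []string `json:"containers"`
	LogErrors  int      `json:"log_errors"`
}

// debugResponse models the assumed response envelope.
type debugResponse struct {
	OK      bool   `json:"ok"`
	Message string `json:"message"`
	Data    struct {
		AppCount     int            `json:"app_count"`
		AppTelemetry []appTelemetry `json:"app_telemetry"`
	} `json:"data"`
}

// checkTelemetry fetches the debug endpoint and applies two sanity checks:
// app_count matches the number of entries, and every app has at least one
// container mapped to it. It returns the app count on success.
func checkTelemetry(baseURL string) (int, error) {
	resp, err := http.Get(baseURL + "/api/debug/telemetry")
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()

	var dr debugResponse
	if err := json.NewDecoder(resp.Body).Decode(&dr); err != nil {
		return 0, fmt.Errorf("decode response: %w", err)
	}
	if !dr.OK {
		return 0, fmt.Errorf("endpoint reported failure: %s", dr.Message)
	}
	if dr.Data.AppCount != len(dr.Data.AppTelemetry) {
		return 0, fmt.Errorf("app_count %d does not match %d entries",
			dr.Data.AppCount, len(dr.Data.AppTelemetry))
	}
	for _, a := range dr.Data.AppTelemetry {
		if len(a.Containers) == 0 {
			return 0, fmt.Errorf("app %q has no containers mapped", a.AppName)
		}
	}
	return dr.Data.AppCount, nil
}

// fakeController serves a fixed JSON body, standing in for the controller.
func fakeController(body string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, body)
	}))
}

func main() {
	srv := fakeController(`{"ok":true,"message":"1 app","data":{"app_count":1,` +
		`"app_telemetry":[{"app_name":"demo","containers":["demo-web"],"log_errors":0}]}}`)
	defer srv.Close()

	n, err := checkTelemetry(srv.URL)
	fmt.Println(n, err)
}
```

Comparing memory values with the Rendszermonitor page and provoking log errors still need the manual steps listed above.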
---
## Build & Deploy
```bash
SSH=/c/Windows/System32/OpenSSH/ssh.exe
# After committing and pushing:
$SSH kisfenyo@192.168.0.180 "cd ~/build/felhom-controller && git -C ~/git/deploy-felhom-compose pull && ./build.sh v0.28.0 --push"
$SSH kisfenyo@192.168.0.162 "cd /opt/docker/felhom-controller && sudo docker pull gitea.dooplex.hu/admin/felhom-controller:v0.28.0 && sudo sed -i 's|image: gitea.dooplex.hu/admin/felhom-controller:.*|image: gitea.dooplex.hu/admin/felhom-controller:v0.28.0|' docker-compose.yml && sudo docker compose up -d"
```
Note: If v0.28.0 is already built/deployed with the telemetry feature, bump to v0.28.1 for this debug addition.