# TASK.md — v0.5.1: Monitoring Page Bugfixes

> Version bump: **v0.5.1**
> Scope: 4 bugs in the monitoring page

---

## Bug 1: Hostname shows container ID instead of host hostname

### Problem

"Gépnév" (hostname) displays `75f2f2a113f3`, which is the Docker container ID. `os.Hostname()` inside a container returns the container's hostname, not the host's.

### Root cause

`sysinfo.go` line: `info.Hostname, _ = os.Hostname()`

Inside a Docker container, `os.Hostname()` returns the container ID unless `hostname:` is set in docker-compose.yml.

### Fix: two options (Option A is the one to implement)

**Option A: Mount host's /etc/hostname** (preferred; works for all cases):

In `controller/docker-compose.yml`, add to the controller service:
```yaml
volumes:
  - /etc/hostname:/host/etc/hostname:ro
```

In `sysinfo.go`, read host hostname first:
```go
// Hostname: start with os.Hostname() (the container ID inside Docker),
// then prefer the host's name from the read-only mount when it is readable
// and non-empty. Requires the "os" and "strings" imports.
info.Hostname, _ = os.Hostname()
if data, err := os.ReadFile("/host/etc/hostname"); err == nil {
	if name := strings.TrimSpace(string(data)); name != "" {
		info.Hostname = name
	}
}
```

**Option B: Set hostname in docker-compose.yml** (simpler but requires per-customer config):

```yaml
hostname: ${HOSTNAME:-felhom}
```

This only works if `HOSTNAME` is visible to Compose (exported in the shell or set in `.env`); otherwise every install falls back to the literal `felhom`. Option A is better because it reads the actual host hostname dynamically.

**Use Option A.** It's consistent with the `/etc/os-release` mount pattern already in place.

---

## Bug 2: Tooltip timestamps show "1970. 01. 01. 01:00"

### Problem

Hovering over chart data points shows `1970. 01. 01. 01:00` instead of the actual timestamp.

### Root cause

In the tooltip callback:
```javascript
callbacks: {
  title: function(items) {
    if (!items.length) return '';
    return formatTimestamp(items[0].parsed.x || items[0].label);
  }
}
```

The chart uses a **category** x-axis (default), not a time axis. `items[0].parsed.x` returns the **category index** (0, 1, 2, 3...), not the timestamp. When the index is > 0, `parsed.x || label` evaluates to the index (truthy). `formatTimestamp(5)` then builds a `Date` only 5 ms after the Unix epoch (or 5 s, if `formatTimestamp` scales seconds to milliseconds), which renders as `1970. 01. 01. 01:00` in a UTC+1 timezone.

When the index is 0, `0 || label` falls through to `label`, which works correctly. That's why the first data point shows the right time.

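
The `||` fall-through described above can be reproduced without Chart.js. This is a minimal sketch, assuming the labels are millisecond timestamps. `makeItem`, `pickBuggy`, and `pickFixed` are hypothetical names, and `makeItem` only imitates the two tooltip item fields the callback reads.

```javascript
// Example label values (assumption): millisecond timestamps, one per minute.
const labels = [1771233660000, 1771233720000, 1771233780000];

// Stand-in for a tooltip item on a category axis:
// parsed.x is the index into labels, label is the value at that index.
function makeItem(index) {
  return { parsed: { x: index }, label: labels[index] };
}

// Original expression: returns the index whenever it is truthy (index > 0).
function pickBuggy(item) {
  return item.parsed.x || item.label;
}

// Fixed expression: always returns the label.
function pickFixed(item) {
  return item.label;
}

for (let i = 0; i < labels.length; i++) {
  const item = makeItem(i);
  console.log(i, pickBuggy(item), pickFixed(item));
}
```

Only index 0 yields a real timestamp from `pickBuggy`; every later point yields its own index, which is what ends up formatted as a 1970 date.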
### Fix

Always use `items[0].label` instead of `parsed.x`:

```javascript
callbacks: {
  title: function(items) {
    if (!items.length) return '';
    return formatTimestamp(items[0].label);
  }
}
```

`items[0].label` is the raw label value from the labels array, which is the timestamp in milliseconds.

### Files

`internal/web/templates/monitoring.html`: tooltip callback in `chartOpts` function.

---

## Bug 3: Range selector appears non-functional / 24h shows empty

### Problem

Default range is `24h` but the system has only ~20 minutes of data. On page load, charts appear empty (Y-axis 0-1.0, no visible lines). Clicking "1 óra" (1 hour) shows data. The user perceives the buttons as "not doing anything" because the initial state is already broken.

### Root cause (likely)

Two contributing factors:

1. **Default range too wide**: `systemRange = '24h'`. For a newly deployed system with only minutes of data, this shows either nothing or a barely visible sliver at the right edge.

2. **Downsampling compression**: a 24h range with resolution=200 gives `bucketSeconds = 86400 / 200 = 432`. Twenty data points spanning 20 minutes (~1200s) collapse into ~3 buckets. Three data points can render as a line chart, but if Chart.js auto-scaling or the bucket timestamps place them at the very edge, the line may not be visible.

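
The bucket arithmetic in the second factor can be checked with a short sketch. It assumes the bucket width is simply the range divided by the resolution and that samples are assigned to buckets by integer division. `bucketSeconds` here is a hypothetical function named after the value above, and `occupiedBuckets` is also illustrative; neither is the controller's actual code.

```javascript
// Bucket width for downsampling, assuming width = range / resolution.
function bucketSeconds(rangeSeconds, resolution) {
  return rangeSeconds / resolution;
}

// Count the distinct buckets that a list of sample times (in seconds) falls into.
function occupiedBuckets(sampleTimes, width) {
  const buckets = new Set(sampleTimes.map((t) => Math.floor(t / width)));
  return buckets.size;
}

const width = bucketSeconds(24 * 60 * 60, 200);
// Twenty samples, one per minute, starting exactly on a bucket boundary.
const samples = Array.from({ length: 20 }, (_, i) => i * 60);
console.log(width, occupiedBuckets(samples, width)); // 432 3
```

If the first sample does not land on a bucket boundary, the same twenty samples can straddle four buckets instead of three, so "~3" is an approximation rather than a fixed count.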
### Fix

**A. Change default range to `1h`:**

```javascript
let systemRange = '1h';
```

And move the `active` class to the `1h` button:
```html
<button class="filter-btn active" data-range="1h">1 óra</button>
<button class="filter-btn" data-range="6h">6 óra</button>
<button class="filter-btn" data-range="24h">24 óra</button>
```

Same for container detail range:
```javascript
let detailRange = '1h';
```

**B. Smart default**: After the system has been running for 24+ hours, `24h` makes more sense as a default. For v0.5.1, just use `1h`, which is always reasonable.

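
Should the smart default be revisited after v0.5.1, it could look roughly like the sketch below. This is not part of the v0.5.1 scope. `pickDefaultRange` is a hypothetical helper, and it assumes the page can obtain the timestamp of the oldest stored sample in milliseconds, which the current API may not expose.

```javascript
const HOUR_MS = 60 * 60 * 1000;

// Hypothetical helper: choose the default range from the age of the oldest sample.
// Returns '24h' once at least 24 hours of history exist, otherwise '1h'.
function pickDefaultRange(oldestSampleMs, nowMs) {
  const ageMs = nowMs - oldestSampleMs;
  return ageMs >= 24 * HOUR_MS ? '24h' : '1h';
}

// Example: a system with 20 minutes of history keeps the 1h default.
console.log(pickDefaultRange(0, 20 * 60 * 1000));
```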
**C. Add diagnostic logging**: To understand whether 24h truly returns empty, add a temporary console.log in the JS:

```javascript
async function loadSystemMetrics() {
  try {
    const resp = await fetch('/api/metrics/system?range=' + systemRange + '&resolution=200');
    const json = await resp.json();
    console.log('[metrics] system range=' + systemRange + ', data points=' + (json.data?.labels?.length || 0));
    // ... rest of handler
```

This helps distinguish "no data returned" from "data returned but not rendered".

### Troubleshooting commands (run on demo node)

Before implementing the fix, verify the data is in SQLite:

```bash
# Confirm the metrics database exists and contains readable data.
# This dumps a few strings from the file; it does not count rows.
# No -it here: a TTY would mangle the binary output going through the pipe.
docker exec felhom-controller sh -c "cat /app/data/metrics.db" | strings | head -5
# Or directly via the API from the browser:
# https://felhom.demo-felhom.eu/api/metrics/system?range=1h&resolution=200
# https://felhom.demo-felhom.eu/api/metrics/system?range=24h&resolution=200
```

Compare the JSON responses. If 24h returns labels but the cpu/memory arrays are all zeros, the query finds rows and the problem is downstream (aggregation or rendering). If labels are empty, it's a query issue.

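
The comparison above can be automated with a small classifier pasted into the browser console. This is a sketch under an assumed response shape, `{ data: { labels, cpu, memory } }`, inferred from the logging line and the array names mentioned above. `diagnoseMetrics` and its return values are hypothetical. A result of `'no-labels'` points at the query, `'all-zero'` at the values coming back, and `'has-data'` at rendering if the chart is still empty.

```javascript
// Assumed response shape: { data: { labels: [...], cpu: [...], memory: [...] } }.
function diagnoseMetrics(json) {
  const data = (json && json.data) || {};
  const labels = data.labels || [];
  if (labels.length === 0) return 'no-labels';

  // Treat zero, null, and missing values alike: none of them draws a visible line.
  const series = [data.cpu || [], data.memory || []];
  const allZero = series.every((values) => values.every((v) => !v));
  return allZero ? 'all-zero' : 'has-data';
}

console.log(diagnoseMetrics({ data: { labels: [], cpu: [], memory: [] } }));
console.log(diagnoseMetrics({ data: { labels: [1, 2], cpu: [0, 0], memory: [0, 0] } }));
console.log(diagnoseMetrics({ data: { labels: [1, 2], cpu: [3.5, 4], memory: [40, 41] } }));
```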

---

## Bug 4: Charts empty on initial page load

### Problem

When navigating to the monitoring page, all four system charts show empty (no data) until the user clicks a range button.

### Root cause

Same as Bug 3: the initial `loadSystemMetrics()` call uses the `24h` default range, which returns no visible data for a new system. Fixing Bug 3 (changing the default to `1h`) should also fix this.

### Additional check: init order and race conditions

Ensure the init sequence is robust. Currently:
```javascript
initSystemCharts();
initContainerCharts();
initDetailCharts();
loadSysInfo();
loadSystemMetrics();
loadContainerSummary();
```

This looks correct: charts are initialized before data is loaded, so there is no race condition here and no change is needed.

### Edge case: very first load (0 data points)

If the monitoring page is loaded before the collector has stored even 1 sample (within the first 60 seconds of controller start), the "Még nincsenek adatok" ("No data yet") message should appear. Verify this works correctly.

---

## Implementation order

### Step 1: Fix hostname
1. Add `/etc/hostname:/host/etc/hostname:ro` to `controller/docker-compose.yml`
2. Update `sysinfo.go` to read from `/host/etc/hostname` first

### Step 2: Fix tooltip timestamps
1. Change `items[0].parsed.x || items[0].label` to `items[0].label` in `monitoring.html`

### Step 3: Fix default range + empty charts
1. Change `systemRange = '1h'` and `detailRange = '1h'`
2. Move `active` class to "1 óra" button in both range bars
3. Add console.log diagnostic for data loading

### Step 4: Build, deploy, verify
1. Build v0.5.1
2. Deploy to demo node (sync docker-compose.yml for new volume mount)
3. Verify hostname shows "demo-felhom"
4. Verify tooltip shows correct timestamp
5. Verify charts show data on page load
6. Test all range buttons (1h → 6h → 24h → 7d → 30d)

---

## Files to modify

```
controller/docker-compose.yml           - add /etc/hostname mount
internal/metrics/sysinfo.go             - read hostname from /host/etc/hostname
internal/web/templates/monitoring.html  - fix tooltip callback + default range
```

---

## Verification checklist

- [ ] Hostname shows "demo-felhom" (not container ID)
- [ ] Tooltip shows correct timestamp (e.g., "2026. 02. 16. 10:21")
- [ ] Charts show data on initial page load (1h default)
- [ ] "1 óra" button is active/highlighted by default
- [ ] Clicking each range button updates charts
- [ ] "24 óra" shows data if there are 1+ hours of collected metrics
- [ ] Container bar charts still render correctly
- [ ] Container detail panel still works
- [ ] No console errors in browser devtools