# TASK.md — Hub Dashboard Bugs + Backup Validation Fix
## Overview
Three bugs identified from the live hub.felhom.eu and controller backup page:
1. **Hub main page shows DOWN** despite the detail page showing STATUS: OK
2. **Hub report history timestamps show 00:00:00** instead of actual times
3. **Backup page shows "Hiba" for all DB validations** with no tooltip detail
Bugs 1 and 2 share the same root cause (timestamp parsing). Bug 3 is in the controller.
---
## Bug 1 & 2: Hub timestamp parsing failure
**Repository:** `felhom.eu` → `hub/`
### Root cause
The hub's SQLite store parses `received_at` timestamps with a single format:
```go
c.ReceivedAt, _ = time.Parse("2006-01-02 15:04:05", receivedAt)
```
The parse error is silently discarded (`_`). When the format doesn't match what the
`modernc.org/sqlite` driver returns, `ReceivedAt` becomes Go's zero time (`0001-01-01 00:00:00`).
**Consequences:**
- `time.Since(zeroTime)` ≈ 740,000+ hours → `TimeSinceReport > 1 hour` → **OverallStatus = "down"**
- `zeroTime.Format("15:04:05")` → **"00:00:00"** in report history
- Detail page health status shows OK because that comes from the report JSON payload, not the timestamp
The `modernc.org/sqlite` driver may return datetime strings in various formats depending on
how the value was stored and the SQLite version:
- `2026-02-16 14:30:00` (what we expect)
- `2026-02-16T14:30:00Z` (ISO 8601 / RFC3339-ish)
- `2026-02-16 14:30:00+00:00` (with timezone offset)
- `2026-02-16 14:30:00.123456` (with fractional seconds)
### Fix: `hub/internal/store/store.go`
**Step 1:** Add a robust timestamp parser function at the bottom of store.go:
```go
// parseSQLiteTime tries multiple formats that modernc.org/sqlite may return.
func parseSQLiteTime(s string) time.Time {
formats := []string{
"2006-01-02 15:04:05", // SQLite datetime('now')
"2006-01-02T15:04:05Z", // RFC3339 without fractional
time.RFC3339, // 2006-01-02T15:04:05Z07:00
time.RFC3339Nano, // with fractional seconds
"2006-01-02 15:04:05+00:00", // with explicit UTC offset
"2006-01-02 15:04:05.999999999", // with fractional, no TZ
}
for _, f := range formats {
if t, err := time.Parse(f, s); err == nil {
return t
}
}
// Last resort: if string is non-empty, log it for debugging
if s != "" {
log.Printf("[WARN] Could not parse timestamp: %q", s)
}
return time.Time{} // zero time
}
```
Note: Add `"log"` to the import block if not already present.
**Step 2:** Replace ALL occurrences of `time.Parse("2006-01-02 15:04:05", receivedAt)` in store.go.
There are **three** locations:
1. **`GetCustomers()`** — in the `for rows.Next()` loop:
```go
// BEFORE:
c.ReceivedAt, _ = time.Parse("2006-01-02 15:04:05", receivedAt)
// AFTER:
c.ReceivedAt = parseSQLiteTime(receivedAt)
```
2. **`GetCustomer()`** — after `row.Scan`:
```go
// BEFORE:
c.ReceivedAt, _ = time.Parse("2006-01-02 15:04:05", receivedAt)
// AFTER:
c.ReceivedAt = parseSQLiteTime(receivedAt)
```
3. **`GetCustomerHistory()`** — in the `for rows.Next()` loop:
```go
// BEFORE:
c.ReceivedAt, _ = time.Parse("2006-01-02 15:04:05", receivedAt)
// AFTER:
c.ReceivedAt = parseSQLiteTime(receivedAt)
```
**Step 3 (optional diagnostic):** Temporarily add a log line in `SaveReport` to see what format
SQLite actually stores/returns. This can be removed after verifying the fix:
```go
// Add after the INSERT in SaveReport, before return:
// Debug: check what format SQLite returns
var dbTime string
s.db.QueryRow("SELECT received_at FROM reports WHERE customer_id = ? ORDER BY id DESC LIMIT 1", customerID).Scan(&dbTime)
s.logger.Printf("[DEBUG] SQLite received_at raw value: %q", dbTime)
```
### Verify
After rebuilding and deploying the hub:
1. Wait for the next controller report push (or trigger manually)
2. Check hub.felhom.eu — status should show **OK** (green), not DOWN
3. Click into customer detail — "Last report: X min ago" should show a reasonable value
4. Report History timestamps should show actual times like `14:36:32`, not `00:00:00`
5. Check hub pod logs for any `[WARN] Could not parse timestamp` messages (should be none)
### Post-fix grep
```bash
grep -rn 'time.Parse("2006-01-02 15:04:05"' hub/internal/store/store.go
# Should return 0 results — all replaced with parseSQLiteTime()
```
---
## Bug 3: Backup page shows "Hiba" for all DB validations
**Repository:** `deploy-felhom-compose` → `controller/`
### Symptoms
- All 3 databases (immich, paperless, romm) show "Hiba" in the Érvényesítés column
- The Állapot column shows "OK" (dump succeeded)
- No tooltip text on hover (meaning `Validation.Error` is empty)
- Dump files are valid — headers are correct, sizes are reasonable (43.2 MB / 319.6 KB / 38.7 KB)
### Analysis
The template condition for "Hiba" in the `LastDBDump` path is:
```html
{{if .Error}} → shows "–" (dump failed)
{{else if .Validation.Valid}} → shows "X tábla" (validation passed)
{{else}} → shows "Hiba" (THIS IS WHAT WE SEE)
```
"Hiba" with empty tooltip means `Validation.Valid == false` AND `Validation.Error == ""`.
This is the **zero-value** of `DumpValidation{}` — meaning validation was never assigned.
The code in `DumpOne()` calls `ValidateDump()` and the code in `ListDumpFiles()` also calls
`ValidateDump()`. Both paths should populate the Validation field. Yet the UI shows zero-value.
**Most likely cause:** The `lastDBDump` state was populated by an older code version (before
validation was wired), OR there's a race condition where `RefreshCache` captures `lastDBDump`
mid-construction, OR the validation ran but hit an unexpected issue (permissions, encoding).
### Diagnostic step (run on demo-felhom FIRST)
Before applying fixes, check the controller logs to understand what happened:
```bash
# Check the last DB dump run
sudo journalctl -u felhom-controller --since "2026-02-16 00:00" | grep -iE "db dump|table|valid|dump:"
# Check if there was a controller restart
sudo journalctl -u felhom-controller --since "2026-02-16 00:00" | grep -iE "starting|version|shutdown"
# Check if the old bash systemd timer is ALSO running (double-dump conflict!)
systemctl is-active backup-db-dump.timer
systemctl list-timers | grep backup
```
**IMPORTANT:** If `backup-db-dump.timer` is still active, it will race with the controller's
built-in `db-dump` scheduler job. Both write to the same directory. The bash script overwrites
files directly (no `.tmp` + rename), which could corrupt the file mid-validation. **Disable it:**
```bash
sudo systemctl stop backup-db-dump.timer
sudo systemctl disable backup-db-dump.timer
```
### Fix 1: Add debug logging to `ValidateDump`
**File:** `controller/internal/backup/dbdump.go`, function `ValidateDump`
Add a log parameter and diagnostic output so we can see what's happening:
```go
// BEFORE:
func ValidateDump(filePath string, dbType DBType) DumpValidation {
// AFTER:
func ValidateDump(filePath string, dbType DBType) DumpValidation {
log.Printf("[DEBUG] ValidateDump: %s (type=%s)", filePath, dbType)
```
And at the end, before `return v`:
```go
v.Valid = true
log.Printf("[DEBUG] ValidateDump OK: %s — %d tables, header found", filePath, tableCount)
return v
}
```
Also add logging to the error paths:
After `v.Error = "dump file too small (< 100 bytes)"`:
```go
log.Printf("[WARN] ValidateDump FAIL: %s — %s", filePath, v.Error)
```
After `v.Error = fmt.Sprintf("read failed: %v", err)`:
```go
log.Printf("[WARN] ValidateDump FAIL: %s — %s", filePath, v.Error)
```
After `v.Error = "... dump missing comment header"`:
```go
log.Printf("[WARN] ValidateDump FAIL: %s — %s", filePath, v.Error)
```
After `v.Error = "no CREATE TABLE statements found"`:
```go
log.Printf("[WARN] ValidateDump FAIL: %s — %s (header was found, scanned %d lines)", filePath, v.Error, len(strings.Split(content, "\n")))
```
Note: Import `"log"` at the top of the file if not already imported (use the standard `log`
package, not the `*log.Logger` parameter — this is a quick debug addition. Can be cleaned up later.)
### Fix 2: Template guard against zero-value Validation
Even with debug logging, we should make the template resilient to zero-value Validation.
The "Hiba" label with no explanation is a bad UX.
**File:** `controller/internal/web/templates/backups.html`
In the `LastDBDump` section, change the Érvényesítés (validation) column:
```html
{{if .Error}}
–
{{else if .Validation.Valid}}
{{.Validation.TableCount}} tábla
{{else}}
Hiba
{{end}}
{{if .Error}}
–
{{else if .Validation.Valid}}
{{.Validation.TableCount}} tábla
{{else if .Validation.Error}}
Hiba
{{else}}
–
{{end}}
```
This ensures:
- If validation passed → green badge with table count
- If validation failed with a reason → red "Hiba" with tooltip
- If validation never ran (zero-value) → gray "–" with explanatory tooltip
### Fix 3: Re-validate on cache refresh (belt-and-suspenders)
Since `RefreshCache` already calls `ListDumpFiles()` which runs `ValidateDump()` per file,
the `DumpFiles` fallback always has fresh validation. The issue is only in the `LastDBDump`
path when in-memory results have stale/missing validation.
Add a cross-check: if `LastDBDump` results have zero-value Validation but the file exists,
re-validate it. Add this in `RefreshCache`, after the existing code:
**File:** `controller/internal/backup/backup.go`, function `RefreshCache`
After the line `status.DumpFiles = files` and before the lock section, add:
```go
// Cross-check: if LastDBDump results have empty validation but files exist,
// re-validate from disk. This handles controller restarts and race conditions.
if m.lastDBDump != nil {
fileValidation := make(map[string]DumpValidation) // keyed by filename
for _, f := range files {
fileValidation[f.FileName] = f.Validation
}
for i, r := range m.lastDBDump.Results {
if !r.Validation.Valid && r.Validation.Error == "" && r.FilePath != "" {
filename := filepath.Base(r.FilePath)
if fv, ok := fileValidation[filename]; ok {
m.lastDBDump.Results[i].Validation = fv
m.logger.Printf("[INFO] Re-validated %s from disk: valid=%v tables=%d",
filename, fv.Valid, fv.TableCount)
}
}
}
}
```
Note: Add `"path/filepath"` to imports if not already present.
This runs every 5 minutes (same cadence as the cache refresh) and will automatically
heal any stale validation state in `lastDBDump` by cross-referencing the fresh
`ListDumpFiles` results.
### Fix 4: Disable conflicting systemd timer (manual step)
If the diagnostic step above reveals that `backup-db-dump.timer` is still active:
```bash
sudo systemctl stop backup-db-dump.timer
sudo systemctl disable backup-db-dump.timer
# Optionally verify:
systemctl list-timers | grep backup
# Should show nothing
```
The controller's built-in `db-dump` scheduler job at 02:30 replaces this timer entirely.
Having both run simultaneously can corrupt dump files mid-write.
### Verify
After deploying fixes:
1. Wait for cache refresh (5 minutes) or trigger a manual backup ("Mentés most")
2. Check `/backups` page — validation column should show "X tábla" for all databases
3. Check controller logs for `[DEBUG] ValidateDump` lines confirming validation ran
4. Verify no `[WARN] ValidateDump FAIL` lines in logs
---
## Post-fix checklist
### Hub (felhom.eu repo → hub/)
- [ ] `grep -rn 'time.Parse("2006-01-02 15:04:05"' hub/internal/store/` → 0 results
- [ ] `parseSQLiteTime` function exists in store.go
- [ ] `go build ./cmd/hub/` succeeds
- [ ] `go vet ./...` passes
- [ ] Build new image, deploy to k3s
- [ ] hub.felhom.eu shows OK status for demo-felhom
- [ ] Report history shows real timestamps
### Controller (deploy-felhom-compose repo → controller/)
- [ ] Template has 4-branch validation check (Valid / Error / zero-value guard)
- [ ] `RefreshCache` has cross-check re-validation logic
- [ ] `ValidateDump` has debug logging
- [ ] `backup-db-dump.timer` is disabled on demo-felhom
- [ ] `go build ./cmd/controller/` succeeds
- [ ] `go vet ./...` passes
- [ ] Build, deploy to demo-felhom
- [ ] Backup page shows table counts, not "Hiba"
- [ ] Controller logs show `[DEBUG] ValidateDump OK` entries
### Version bumps
- Hub: bump to next patch version
- Controller: include in v0.6.1 release (alongside the code review fixes from the other TASK.md)