13 KiB
TASK.md — Hub Dashboard Bugs + Backup Validation Fix
Overview
Three bugs identified from the live hub.felhom.eu and controller backup page:
- Hub main page shows DOWN despite the detail page showing STATUS: OK
- Hub report history timestamps show 00:00:00 instead of actual times
- Backup page shows "Hiba" for all DB validations with no tooltip detail
Bugs 1 and 2 share the same root cause (timestamp parsing). Bug 3 is in the controller.
Bug 1 & 2: Hub timestamp parsing failure
Repository: felhom.eu → hub/
Root cause
The hub's SQLite store parses received_at timestamps with a single format:
c.ReceivedAt, _ = time.Parse("2006-01-02 15:04:05", receivedAt)
The parse error is silently discarded (_). When the format doesn't match what the
modernc.org/sqlite driver returns, ReceivedAt becomes Go's zero time (0001-01-01 00:00:00).
Consequences:
time.Since(zeroTime)≈ 740,000+ hours →TimeSinceReport > 1 hour→ OverallStatus = "down"zeroTime.Format("15:04:05")→ "00:00:00" in report history- Detail page health status shows OK because that comes from the report JSON payload, not the timestamp
The modernc.org/sqlite driver may return datetime strings in various formats depending on
how the value was stored and the SQLite version:
2026-02-16 14:30:00(what we expect)2026-02-16T14:30:00Z(ISO 8601 / RFC3339-ish)2026-02-16 14:30:00+00:00(with timezone offset)2026-02-16 14:30:00.123456(with fractional seconds)
Fix: hub/internal/store/store.go
Step 1: Add a robust timestamp parser function at the bottom of store.go:
// parseSQLiteTime tries multiple formats that modernc.org/sqlite may return.
func parseSQLiteTime(s string) time.Time {
formats := []string{
"2006-01-02 15:04:05", // SQLite datetime('now')
"2006-01-02T15:04:05Z", // RFC3339 without fractional
time.RFC3339, // 2006-01-02T15:04:05Z07:00
time.RFC3339Nano, // with fractional seconds
"2006-01-02 15:04:05+00:00", // with explicit UTC offset
"2006-01-02 15:04:05.999999999", // with fractional, no TZ
}
for _, f := range formats {
if t, err := time.Parse(f, s); err == nil {
return t
}
}
// Last resort: if string is non-empty, log it for debugging
if s != "" {
log.Printf("[WARN] Could not parse timestamp: %q", s)
}
return time.Time{} // zero time
}
Note: Add "log" to the import block if not already present.
Step 2: Replace ALL occurrences of time.Parse("2006-01-02 15:04:05", receivedAt) in store.go.
There are three locations:
GetCustomers()— in thefor rows.Next()loop:
// BEFORE:
c.ReceivedAt, _ = time.Parse("2006-01-02 15:04:05", receivedAt)
// AFTER:
c.ReceivedAt = parseSQLiteTime(receivedAt)
GetCustomer()— afterrow.Scan:
// BEFORE:
c.ReceivedAt, _ = time.Parse("2006-01-02 15:04:05", receivedAt)
// AFTER:
c.ReceivedAt = parseSQLiteTime(receivedAt)
GetCustomerHistory()— in thefor rows.Next()loop:
// BEFORE:
c.ReceivedAt, _ = time.Parse("2006-01-02 15:04:05", receivedAt)
// AFTER:
c.ReceivedAt = parseSQLiteTime(receivedAt)
Step 3 (optional diagnostic): Temporarily add a log line in SaveReport to see what format
SQLite actually stores/returns. This can be removed after verifying the fix:
// Add after the INSERT in SaveReport, before return:
// Debug: check what format SQLite returns
var dbTime string
s.db.QueryRow("SELECT received_at FROM reports WHERE customer_id = ? ORDER BY id DESC LIMIT 1", customerID).Scan(&dbTime)
s.logger.Printf("[DEBUG] SQLite received_at raw value: %q", dbTime)
Verify
After rebuilding and deploying the hub:
- Wait for the next controller report push (or trigger manually)
- Check hub.felhom.eu — status should show OK (green), not DOWN
- Click into customer detail — "Last report: X min ago" should show a reasonable value
- Report History timestamps should show actual times like
14:36:32, not00:00:00 - Check hub pod logs for any
[WARN] Could not parse timestampmessages (should be none)
Post-fix grep
grep -rn 'time.Parse("2006-01-02 15:04:05"' hub/internal/store/store.go
# Should return 0 results — all replaced with parseSQLiteTime()
Bug 3: Backup page shows "Hiba" for all DB validations
Repository: deploy-felhom-compose → controller/
Symptoms
- All 3 databases (immich, paperless, romm) show "Hiba" in the Érvényesítés column
- The Állapot column shows "OK" (dump succeeded)
- No tooltip text on hover (meaning
Validation.Erroris empty) - Dump files are valid — headers are correct, sizes are reasonable (43.2 MB / 319.6 KB / 38.7 KB)
Analysis
The template condition for "Hiba" in the LastDBDump path is:
{{if .Error}} → shows "–" (dump failed)
{{else if .Validation.Valid}} → shows "X tábla" (validation passed)
{{else}} → shows "Hiba" (THIS IS WHAT WE SEE)
"Hiba" with empty tooltip means Validation.Valid == false AND Validation.Error == "".
This is the zero-value of DumpValidation{} — meaning validation was never assigned.
The code in DumpOne() calls ValidateDump() and the code in ListDumpFiles() also calls
ValidateDump(). Both paths should populate the Validation field. Yet the UI shows zero-value.
Most likely cause: The lastDBDump state was populated by an older code version (before
validation was wired), OR there's a race condition where RefreshCache captures lastDBDump
mid-construction, OR the validation ran but hit an unexpected issue (permissions, encoding).
Diagnostic step (run on demo-felhom FIRST)
Before applying fixes, check the controller logs to understand what happened:
# Check the last DB dump run
sudo journalctl -u felhom-controller --since "2026-02-16 00:00" | grep -iE "db dump|table|valid|dump:"
# Check if there was a controller restart
sudo journalctl -u felhom-controller --since "2026-02-16 00:00" | grep -iE "starting|version|shutdown"
# Check if the old bash systemd timer is ALSO running (double-dump conflict!)
systemctl is-active backup-db-dump.timer
systemctl list-timers | grep backup
IMPORTANT: If backup-db-dump.timer is still active, it will race with the controller's
built-in db-dump scheduler job. Both write to the same directory. The bash script overwrites
files directly (no .tmp + rename), which could corrupt the file mid-validation. Disable it:
sudo systemctl stop backup-db-dump.timer
sudo systemctl disable backup-db-dump.timer
Fix 1: Add debug logging to ValidateDump
File: controller/internal/backup/dbdump.go, function ValidateDump
Add a log parameter and diagnostic output so we can see what's happening:
// BEFORE:
func ValidateDump(filePath string, dbType DBType) DumpValidation {
// AFTER:
func ValidateDump(filePath string, dbType DBType) DumpValidation {
log.Printf("[DEBUG] ValidateDump: %s (type=%s)", filePath, dbType)
And at the end, before return v:
v.Valid = true
log.Printf("[DEBUG] ValidateDump OK: %s — %d tables, header found", filePath, tableCount)
return v
}
Also add logging to the error paths:
After v.Error = "dump file too small (< 100 bytes)":
log.Printf("[WARN] ValidateDump FAIL: %s — %s", filePath, v.Error)
After v.Error = fmt.Sprintf("read failed: %v", err):
log.Printf("[WARN] ValidateDump FAIL: %s — %s", filePath, v.Error)
After v.Error = "... dump missing comment header":
log.Printf("[WARN] ValidateDump FAIL: %s — %s", filePath, v.Error)
After v.Error = "no CREATE TABLE statements found":
log.Printf("[WARN] ValidateDump FAIL: %s — %s (header was found, scanned %d lines)", filePath, v.Error, len(strings.Split(content, "\n")))
Note: Import "log" at the top of the file if not already imported (use the standard log
package, not the *log.Logger parameter — this is a quick debug addition. Can be cleaned up later.)
Fix 2: Template guard against zero-value Validation
Even with debug logging, we should make the template resilient to zero-value Validation. The "Hiba" label with no explanation is a bad UX.
File: controller/internal/web/templates/backups.html
In the LastDBDump section, change the Érvényesítés (validation) column:
<!-- BEFORE: -->
{{if .Error}}
<span class="validation-badge validation-na">–</span>
{{else if .Validation.Valid}}
<span class="validation-badge validation-ok">{{.Validation.TableCount}} tábla</span>
{{else}}
<span class="validation-badge validation-fail" title="{{.Validation.Error}}">Hiba</span>
{{end}}
<!-- AFTER: -->
{{if .Error}}
<span class="validation-badge validation-na">–</span>
{{else if .Validation.Valid}}
<span class="validation-badge validation-ok">{{.Validation.TableCount}} tábla</span>
{{else if .Validation.Error}}
<span class="validation-badge validation-fail" title="{{.Validation.Error}}">Hiba</span>
{{else}}
<span class="validation-badge validation-na" title="Az érvényesítés nem futott le">–</span>
{{end}}
This ensures:
- If validation passed → green badge with table count
- If validation failed with a reason → red "Hiba" with tooltip
- If validation never ran (zero-value) → gray "–" with explanatory tooltip
Fix 3: Re-validate on cache refresh (belt-and-suspenders)
Since RefreshCache already calls ListDumpFiles() which runs ValidateDump() per file,
the DumpFiles fallback always has fresh validation. The issue is only in the LastDBDump
path when in-memory results have stale/missing validation.
Add a cross-check: if LastDBDump results have zero-value Validation but the file exists,
re-validate it. Add this in RefreshCache, after the existing code:
File: controller/internal/backup/backup.go, function RefreshCache
After the line status.DumpFiles = files and before the lock section, add:
// Cross-check: if LastDBDump results have empty validation but files exist,
// re-validate from disk. This handles controller restarts and race conditions.
if m.lastDBDump != nil {
fileValidation := make(map[string]DumpValidation) // keyed by filename
for _, f := range files {
fileValidation[f.FileName] = f.Validation
}
for i, r := range m.lastDBDump.Results {
if !r.Validation.Valid && r.Validation.Error == "" && r.FilePath != "" {
filename := filepath.Base(r.FilePath)
if fv, ok := fileValidation[filename]; ok {
m.lastDBDump.Results[i].Validation = fv
m.logger.Printf("[INFO] Re-validated %s from disk: valid=%v tables=%d",
filename, fv.Valid, fv.TableCount)
}
}
}
}
Note: Add "path/filepath" to imports if not already present.
This runs every 5 minutes (same cadence as the cache refresh) and will automatically
heal any stale validation state in lastDBDump by cross-referencing the fresh
ListDumpFiles results.
Fix 4: Disable conflicting systemd timer (manual step)
If the diagnostic step above reveals that backup-db-dump.timer is still active:
sudo systemctl stop backup-db-dump.timer
sudo systemctl disable backup-db-dump.timer
# Optionally verify:
systemctl list-timers | grep backup
# Should show nothing
The controller's built-in db-dump scheduler job at 02:30 replaces this timer entirely.
Having both run simultaneously can corrupt dump files mid-write.
Verify
After deploying fixes:
- Wait for cache refresh (5 minutes) or trigger a manual backup ("Mentés most")
- Check
/backupspage — validation column should show "X tábla" for all databases - Check controller logs for
[DEBUG] ValidateDumplines confirming validation ran - Verify no
[WARN] ValidateDump FAILlines in logs
Post-fix checklist
Hub (felhom.eu repo → hub/)
grep -rn 'time.Parse("2006-01-02 15:04:05"' hub/internal/store/→ 0 resultsparseSQLiteTimefunction exists in store.gogo build ./cmd/hub/succeedsgo vet ./...passes- Build new image, deploy to k3s
- hub.felhom.eu shows OK status for demo-felhom
- Report history shows real timestamps
Controller (deploy-felhom-compose repo → controller/)
- Template has 4-branch validation check (Valid / Error / zero-value guard)
RefreshCachehas cross-check re-validation logicValidateDumphas debug loggingbackup-db-dump.timeris disabled on demo-felhomgo build ./cmd/controller/succeedsgo vet ./...passes- Build, deploy to demo-felhom
- Backup page shows table counts, not "Hiba"
- Controller logs show
[DEBUG] ValidateDump OKentries
Version bumps
- Hub: bump to next patch version
- Controller: include in v0.6.1 release (alongside the code review fixes from the other TASK.md)