Hub Dashboard Bugs + Backup Validation Fix
This commit is contained in:
@@ -1,341 +1,361 @@
|
|||||||
# TASK.md — Code Review Bugfixes (v0.6.1)
|
# TASK.md — Hub Dashboard Bugs + Backup Validation Fix
|
||||||
|
|
||||||
## Overview
|
## Overview
|
||||||
|
|
||||||
Fix bugs and logic issues identified during the v0.6.0 code review. All changes are in the `controller/` subtree.
|
Three bugs identified from the live hub.felhom.eu and controller backup page:
|
||||||
No new features — only correctness, safety, and quality fixes.
|
|
||||||
|
|
||||||
After all fixes: bump version to **v0.6.1**, commit, build, push, deploy, verify.
|
1. **Hub main page shows DOWN** despite the detail page showing STATUS: OK
|
||||||
|
2. **Hub report history timestamps show 00:00:00** instead of actual times
|
||||||
|
3. **Backup page shows "Hiba" for all DB validations** with no tooltip detail
|
||||||
|
|
||||||
|
Bugs 1 and 2 share the same root cause (timestamp parsing). Bug 3 is in the controller.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Fix 1: `http.NotFound(w, nil)` — pass request, not nil
|
## Bug 1 & 2: Hub timestamp parsing failure
|
||||||
|
|
||||||
**Files:** `internal/web/handlers.go`
|
**Repository:** `felhom.eu` → `hub/`
|
||||||
|
|
||||||
**Problem:** Two handlers discard the `*http.Request` parameter as `_`, then call `http.NotFound(w, nil)`. While Go's current stdlib doesn't dereference the request in `NotFound`, this is incorrect and will break if middleware wraps it.
|
### Root cause
|
||||||
|
|
||||||
**Changes:**
|
The hub's SQLite store parses `received_at` timestamps with a single format:
|
||||||
|
|
||||||
In `deployHandler`, change signature and call:
|
```go
|
||||||
|
c.ReceivedAt, _ = time.Parse("2006-01-02 15:04:05", receivedAt)
|
||||||
|
```
|
||||||
|
|
||||||
|
The parse error is silently discarded (`_`). When the format doesn't match what the
|
||||||
|
`modernc.org/sqlite` driver returns, `ReceivedAt` becomes Go's zero time (`0001-01-01 00:00:00`).
|
||||||
|
|
||||||
|
**Consequences:**
|
||||||
|
- `time.Since(zeroTime)` ≈ 740,000+ hours → `TimeSinceReport > 1 hour` → **OverallStatus = "down"**
|
||||||
|
- `zeroTime.Format("15:04:05")` → **"00:00:00"** in report history
|
||||||
|
- Detail page health status shows OK because that comes from the report JSON payload, not the timestamp
|
||||||
|
|
||||||
|
The `modernc.org/sqlite` driver may return datetime strings in various formats depending on
|
||||||
|
how the value was stored and the SQLite version:
|
||||||
|
- `2026-02-16 14:30:00` (what we expect)
|
||||||
|
- `2026-02-16T14:30:00Z` (ISO 8601 / RFC3339-ish)
|
||||||
|
- `2026-02-16 14:30:00+00:00` (with timezone offset)
|
||||||
|
- `2026-02-16 14:30:00.123456` (with fractional seconds)
|
||||||
|
|
||||||
|
### Fix: `hub/internal/store/store.go`
|
||||||
|
|
||||||
|
**Step 1:** Add a robust timestamp parser function at the bottom of store.go:
|
||||||
|
|
||||||
|
```go
|
||||||
|
// parseSQLiteTime tries multiple formats that modernc.org/sqlite may return.
|
||||||
|
func parseSQLiteTime(s string) time.Time {
|
||||||
|
formats := []string{
|
||||||
|
"2006-01-02 15:04:05", // SQLite datetime('now')
|
||||||
|
"2006-01-02T15:04:05Z", // RFC3339 without fractional
|
||||||
|
time.RFC3339, // 2006-01-02T15:04:05Z07:00
|
||||||
|
time.RFC3339Nano, // with fractional seconds
|
||||||
|
"2006-01-02 15:04:05+00:00", // with explicit UTC offset
|
||||||
|
"2006-01-02 15:04:05.999999999", // with fractional, no TZ
|
||||||
|
}
|
||||||
|
for _, f := range formats {
|
||||||
|
if t, err := time.Parse(f, s); err == nil {
|
||||||
|
return t
|
||||||
|
}
|
||||||
|
}
|
||||||
|
// Last resort: if string is non-empty, log it for debugging
|
||||||
|
if s != "" {
|
||||||
|
log.Printf("[WARN] Could not parse timestamp: %q", s)
|
||||||
|
}
|
||||||
|
return time.Time{} // zero time
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Note: Add `"log"` to the import block if not already present.
|
||||||
|
|
||||||
|
**Step 2:** Replace ALL occurrences of `time.Parse("2006-01-02 15:04:05", receivedAt)` in store.go.
|
||||||
|
|
||||||
|
There are **three** locations:
|
||||||
|
|
||||||
|
1. **`GetCustomers()`** — in the `for rows.Next()` loop:
|
||||||
```go
|
```go
|
||||||
// BEFORE:
|
// BEFORE:
|
||||||
func (s *Server) deployHandler(w http.ResponseWriter, _ *http.Request, name string) {
|
c.ReceivedAt, _ = time.Parse("2006-01-02 15:04:05", receivedAt)
|
||||||
...
|
|
||||||
if err != nil {
|
|
||||||
http.NotFound(w, nil)
|
|
||||||
|
|
||||||
// AFTER:
|
// AFTER:
|
||||||
func (s *Server) deployHandler(w http.ResponseWriter, r *http.Request, name string) {
|
c.ReceivedAt = parseSQLiteTime(receivedAt)
|
||||||
...
|
|
||||||
if err != nil {
|
|
||||||
http.NotFound(w, r)
|
|
||||||
```
|
```
|
||||||
|
|
||||||
In `appDetailHandler`, same fix:
|
2. **`GetCustomer()`** — after `row.Scan`:
|
||||||
```go
|
```go
|
||||||
// BEFORE:
|
// BEFORE:
|
||||||
func (s *Server) appDetailHandler(w http.ResponseWriter, _ *http.Request, slug string) {
|
c.ReceivedAt, _ = time.Parse("2006-01-02 15:04:05", receivedAt)
|
||||||
...
|
|
||||||
if found == nil {
|
|
||||||
http.NotFound(w, nil)
|
|
||||||
|
|
||||||
// AFTER:
|
// AFTER:
|
||||||
func (s *Server) appDetailHandler(w http.ResponseWriter, r *http.Request, slug string) {
|
c.ReceivedAt = parseSQLiteTime(receivedAt)
|
||||||
...
|
|
||||||
if found == nil {
|
|
||||||
http.NotFound(w, r)
|
|
||||||
```
|
```
|
||||||
|
|
||||||
**Verify:** Grep for `NotFound(w, nil)` — should return 0 results after fix.
|
3. **`GetCustomerHistory()`** — in the `for rows.Next()` loop:
|
||||||
|
```go
|
||||||
|
// BEFORE:
|
||||||
|
c.ReceivedAt, _ = time.Parse("2006-01-02 15:04:05", receivedAt)
|
||||||
|
|
||||||
---
|
// AFTER:
|
||||||
|
c.ReceivedAt = parseSQLiteTime(receivedAt)
|
||||||
|
```
|
||||||
|
|
||||||
## Fix 2: Dashboard running/stopped counts don't match displayed stacks
|
**Step 3 (optional diagnostic):** Temporarily add a log line in `SaveReport` to see what format
|
||||||
|
SQLite actually stores/returns. This can be removed after verifying the fix:
|
||||||
**File:** `internal/web/handlers.go`, `dashboardHandler`
|
|
||||||
|
|
||||||
**Problem:** The `running`/`stopped` stat counters iterate over ALL stacks (including non-deployed ones), but the dashboard only displays deployed + protected stacks. The numbers don't match what the user sees.
|
|
||||||
|
|
||||||
**Fix:** Compute the counts from the same filtered set (`deployedStacks`), not from `stackList`. Move the filter loop first, then count from the filtered result.
|
|
||||||
|
|
||||||
```go
|
```go
|
||||||
func (s *Server) dashboardHandler(w http.ResponseWriter, _ *http.Request) {
|
// Add after the INSERT in SaveReport, before return:
|
||||||
stackList := s.stackMgr.GetStacks()
|
// Debug: check what format SQLite returns
|
||||||
|
var dbTime string
|
||||||
// Filter to deployed + protected stacks first
|
s.db.QueryRow("SELECT received_at FROM reports WHERE customer_id = ? ORDER BY id DESC LIMIT 1", customerID).Scan(&dbTime)
|
||||||
var deployedStacks []stacks.Stack
|
s.logger.Printf("[DEBUG] SQLite received_at raw value: %q", dbTime)
|
||||||
for _, st := range stackList {
|
|
||||||
if st.Deployed || st.Protected {
|
|
||||||
deployedStacks = append(deployedStacks, st)
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Count from the DISPLAYED set only
|
|
||||||
running, stopped := 0, 0
|
|
||||||
for _, st := range deployedStacks {
|
|
||||||
switch st.State {
|
|
||||||
case stacks.StateRunning, stacks.StateStarting, stacks.StateUnhealthy, stacks.StateRestarting:
|
|
||||||
running++
|
|
||||||
case stacks.StateStopped, stacks.StateExited:
|
|
||||||
stopped++
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// ... rest unchanged, but use deployedStacks for display ...
|
|
||||||
data["Stacks"] = deployedStacks
|
|
||||||
data["RunningCount"] = running
|
|
||||||
data["StoppedCount"] = stopped
|
|
||||||
data["TotalCount"] = len(stackList) // keep this as total catalog size
|
|
||||||
```
|
```
|
||||||
|
|
||||||
**Verify:** Deploy, open dashboard. Count of green + red badges on cards should match the stat numbers.
|
### Verify
|
||||||
|
|
||||||
|
After rebuilding and deploying the hub:
|
||||||
|
1. Wait for the next controller report push (or trigger manually)
|
||||||
|
2. Check hub.felhom.eu — status should show **OK** (green), not DOWN
|
||||||
|
3. Click into customer detail — "Last report: X min ago" should show a reasonable value
|
||||||
|
4. Report History timestamps should show actual times like `14:36:32`, not `00:00:00`
|
||||||
|
5. Check hub pod logs for any `[WARN] Could not parse timestamp` messages (should be none)
|
||||||
|
|
||||||
|
### Post-fix grep
|
||||||
|
|
||||||
|
```bash
|
||||||
|
grep -rn 'time.Parse("2006-01-02 15:04:05"' hub/internal/store/store.go
|
||||||
|
# Should return 0 results — all replaced with parseSQLiteTime()
|
||||||
|
```
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Fix 3: `Secure: true` cookie blocks HTTP login
|
## Bug 3: Backup page shows "Hiba" for all DB validations
|
||||||
|
|
||||||
**File:** `internal/web/auth.go`, `handleLogin`
|
**Repository:** `deploy-felhom-compose` → `controller/`
|
||||||
|
|
||||||
**Problem:** The session cookie has `Secure: true` hardcoded. When accessing via plain HTTP (e.g., `http://192.168.0.162:8080` during local setup), the browser silently refuses to send the cookie back, making login impossible with no visible error.
|
### Symptoms
|
||||||
|
|
||||||
**Fix:** Set `Secure` dynamically based on the incoming request:
|
- All 3 databases (immich, paperless, romm) show "Hiba" in the Érvényesítés column
|
||||||
|
- The Állapot column shows "OK" (dump succeeded)
|
||||||
|
- No tooltip text on hover (meaning `Validation.Error` is empty)
|
||||||
|
- Dump files are valid — headers are correct, sizes are reasonable (43.2 MB / 319.6 KB / 38.7 KB)
|
||||||
|
|
||||||
|
### Analysis
|
||||||
|
|
||||||
|
The template condition for "Hiba" in the `LastDBDump` path is:
|
||||||
|
```html
|
||||||
|
{{if .Error}} → shows "–" (dump failed)
|
||||||
|
{{else if .Validation.Valid}} → shows "X tábla" (validation passed)
|
||||||
|
{{else}} → shows "Hiba" (THIS IS WHAT WE SEE)
|
||||||
|
```
|
||||||
|
|
||||||
|
"Hiba" with empty tooltip means `Validation.Valid == false` AND `Validation.Error == ""`.
|
||||||
|
This is the **zero-value** of `DumpValidation{}` — meaning validation was never assigned.
|
||||||
|
|
||||||
|
The code in `DumpOne()` calls `ValidateDump()` and the code in `ListDumpFiles()` also calls
|
||||||
|
`ValidateDump()`. Both paths should populate the Validation field. Yet the UI shows zero-value.
|
||||||
|
|
||||||
|
**Most likely cause:** The `lastDBDump` state was populated by an older code version (before
|
||||||
|
validation was wired), OR there's a race condition where `RefreshCache` captures `lastDBDump`
|
||||||
|
mid-construction, OR the validation ran but hit an unexpected issue (permissions, encoding).
|
||||||
|
|
||||||
|
### Diagnostic step (run on demo-felhom FIRST)
|
||||||
|
|
||||||
|
Before applying fixes, check the controller logs to understand what happened:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Check the last DB dump run
|
||||||
|
sudo journalctl -u felhom-controller --since "2026-02-16 00:00" | grep -iE "db dump|table|valid|dump:"
|
||||||
|
|
||||||
|
# Check if there was a controller restart
|
||||||
|
sudo journalctl -u felhom-controller --since "2026-02-16 00:00" | grep -iE "starting|version|shutdown"
|
||||||
|
|
||||||
|
# Check if the old bash systemd timer is ALSO running (double-dump conflict!)
|
||||||
|
systemctl is-active backup-db-dump.timer
|
||||||
|
systemctl list-timers | grep backup
|
||||||
|
```
|
||||||
|
|
||||||
|
**IMPORTANT:** If `backup-db-dump.timer` is still active, it will race with the controller's
|
||||||
|
built-in `db-dump` scheduler job. Both write to the same directory. The bash script overwrites
|
||||||
|
files directly (no `.tmp` + rename), which could corrupt the file mid-validation. **Disable it:**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
sudo systemctl stop backup-db-dump.timer
|
||||||
|
sudo systemctl disable backup-db-dump.timer
|
||||||
|
```
|
||||||
|
|
||||||
|
### Fix 1: Add debug logging to `ValidateDump`
|
||||||
|
|
||||||
|
**File:** `controller/internal/backup/dbdump.go`, function `ValidateDump`
|
||||||
|
|
||||||
|
Add a log parameter and diagnostic output so we can see what's happening:
|
||||||
|
|
||||||
```go
|
```go
|
||||||
// BEFORE:
|
// BEFORE:
|
||||||
http.SetCookie(w, &http.Cookie{
|
func ValidateDump(filePath string, dbType DBType) DumpValidation {
|
||||||
Name: sessionCookieName,
|
|
||||||
Value: token,
|
|
||||||
Path: "/",
|
|
||||||
MaxAge: int(sessionMaxAge.Seconds()),
|
|
||||||
HttpOnly: true,
|
|
||||||
SameSite: http.SameSiteStrictMode,
|
|
||||||
Secure: true,
|
|
||||||
})
|
|
||||||
|
|
||||||
// AFTER:
|
// AFTER:
|
||||||
isSecure := r.TLS != nil || r.Header.Get("X-Forwarded-Proto") == "https"
|
func ValidateDump(filePath string, dbType DBType) DumpValidation {
|
||||||
http.SetCookie(w, &http.Cookie{
|
log.Printf("[DEBUG] ValidateDump: %s (type=%s)", filePath, dbType)
|
||||||
Name: sessionCookieName,
|
|
||||||
Value: token,
|
|
||||||
Path: "/",
|
|
||||||
MaxAge: int(sessionMaxAge.Seconds()),
|
|
||||||
HttpOnly: true,
|
|
||||||
SameSite: http.SameSiteLaxMode, // Lax needed: Strict can break redirects through CF tunnel
|
|
||||||
Secure: isSecure,
|
|
||||||
})
|
|
||||||
```
|
```
|
||||||
|
|
||||||
Note: Also change `SameSiteStrictMode` → `SameSiteLaxMode`. Strict mode can cause issues when users arrive via Cloudflare Tunnel redirects (the cookie won't be sent on the first navigation from an external link).
|
And at the end, before `return v`:
|
||||||
|
|
||||||
**Verify:** Access `http://192.168.0.162:8080` in browser, log in — should work. Also verify HTTPS login still works via `https://vezerlo.demo-felhom.eu`.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Fix 4: Remove misleading `subtle.ConstantTimeCompare` in session check
|
|
||||||
|
|
||||||
**File:** `internal/web/auth.go`, `isValidSession`
|
|
||||||
|
|
||||||
**Problem:** The map lookup `s.sessions[token]` already reveals the token via timing. The subsequent `ConstantTimeCompare` compares the token to itself (it was just fetched by that key), so it always returns 1 and adds no security. It's misleading to keep it.
|
|
||||||
|
|
||||||
**Fix:** Simplify:
|
|
||||||
|
|
||||||
```go
|
```go
|
||||||
// BEFORE:
|
v.Valid = true
|
||||||
func (s *Server) isValidSession(token string) bool {
|
log.Printf("[DEBUG] ValidateDump OK: %s — %d tables, header found", filePath, tableCount)
|
||||||
s.sessionsMu.RLock()
|
return v
|
||||||
defer s.sessionsMu.RUnlock()
|
|
||||||
sess, ok := s.sessions[token]
|
|
||||||
if !ok || time.Now().After(sess.expiresAt) {
|
|
||||||
return false
|
|
||||||
}
|
|
||||||
return subtle.ConstantTimeCompare([]byte(sess.token), []byte(token)) == 1
|
|
||||||
}
|
|
||||||
|
|
||||||
// AFTER:
|
|
||||||
func (s *Server) isValidSession(token string) bool {
|
|
||||||
s.sessionsMu.RLock()
|
|
||||||
defer s.sessionsMu.RUnlock()
|
|
||||||
sess, ok := s.sessions[token]
|
|
||||||
return ok && time.Now().Before(sess.expiresAt)
|
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
Also: the `token` field in the `session` struct is now unused (it duplicates the map key). Remove it:
|
Also add logging to the error paths:
|
||||||
|
|
||||||
|
After `v.Error = "dump file too small (< 100 bytes)"`:
|
||||||
|
```go
|
||||||
|
log.Printf("[WARN] ValidateDump FAIL: %s — %s", filePath, v.Error)
|
||||||
|
```
|
||||||
|
|
||||||
|
After `v.Error = fmt.Sprintf("read failed: %v", err)`:
|
||||||
|
```go
|
||||||
|
log.Printf("[WARN] ValidateDump FAIL: %s — %s", filePath, v.Error)
|
||||||
|
```
|
||||||
|
|
||||||
|
After `v.Error = "... dump missing comment header"`:
|
||||||
|
```go
|
||||||
|
log.Printf("[WARN] ValidateDump FAIL: %s — %s", filePath, v.Error)
|
||||||
|
```
|
||||||
|
|
||||||
|
After `v.Error = "no CREATE TABLE statements found"`:
|
||||||
|
```go
|
||||||
|
log.Printf("[WARN] ValidateDump FAIL: %s — %s (header was found, scanned %d lines)", filePath, v.Error, len(strings.Split(content, "\n")))
|
||||||
|
```
|
||||||
|
|
||||||
|
Note: Import `"log"` at the top of the file if not already imported (use the standard `log`
|
||||||
|
package, not the `*log.Logger` parameter — this is a quick debug addition. Can be cleaned up later.)
|
||||||
|
|
||||||
|
### Fix 2: Template guard against zero-value Validation
|
||||||
|
|
||||||
|
Even with debug logging, we should make the template resilient to zero-value Validation.
|
||||||
|
The "Hiba" label with no explanation is a bad UX.
|
||||||
|
|
||||||
|
**File:** `controller/internal/web/templates/backups.html`
|
||||||
|
|
||||||
|
In the `LastDBDump` section, change the Érvényesítés (validation) column:
|
||||||
|
|
||||||
|
```html
|
||||||
|
<!-- BEFORE: -->
|
||||||
|
{{if .Error}}
|
||||||
|
<span class="validation-badge validation-na">–</span>
|
||||||
|
{{else if .Validation.Valid}}
|
||||||
|
<span class="validation-badge validation-ok">{{.Validation.TableCount}} tábla</span>
|
||||||
|
{{else}}
|
||||||
|
<span class="validation-badge validation-fail" title="{{.Validation.Error}}">Hiba</span>
|
||||||
|
{{end}}
|
||||||
|
|
||||||
|
<!-- AFTER: -->
|
||||||
|
{{if .Error}}
|
||||||
|
<span class="validation-badge validation-na">–</span>
|
||||||
|
{{else if .Validation.Valid}}
|
||||||
|
<span class="validation-badge validation-ok">{{.Validation.TableCount}} tábla</span>
|
||||||
|
{{else if .Validation.Error}}
|
||||||
|
<span class="validation-badge validation-fail" title="{{.Validation.Error}}">Hiba</span>
|
||||||
|
{{else}}
|
||||||
|
<span class="validation-badge validation-na" title="Az érvényesítés nem futott le">–</span>
|
||||||
|
{{end}}
|
||||||
|
```
|
||||||
|
|
||||||
|
This ensures:
|
||||||
|
- If validation passed → green badge with table count
|
||||||
|
- If validation failed with a reason → red "Hiba" with tooltip
|
||||||
|
- If validation never ran (zero-value) → gray "–" with explanatory tooltip
|
||||||
|
|
||||||
|
### Fix 3: Re-validate on cache refresh (belt-and-suspenders)
|
||||||
|
|
||||||
|
Since `RefreshCache` already calls `ListDumpFiles()` which runs `ValidateDump()` per file,
|
||||||
|
the `DumpFiles` fallback always has fresh validation. The issue is only in the `LastDBDump`
|
||||||
|
path when in-memory results have stale/missing validation.
|
||||||
|
|
||||||
|
Add a cross-check: if `LastDBDump` results have zero-value Validation but the file exists,
|
||||||
|
re-validate it. Add this in `RefreshCache`, after the existing code:
|
||||||
|
|
||||||
|
**File:** `controller/internal/backup/backup.go`, function `RefreshCache`
|
||||||
|
|
||||||
|
After the line `status.DumpFiles = files` and before the lock section, add:
|
||||||
|
|
||||||
```go
|
```go
|
||||||
// BEFORE:
|
// Cross-check: if LastDBDump results have empty validation but files exist,
|
||||||
type session struct {
|
// re-validate from disk. This handles controller restarts and race conditions.
|
||||||
token string
|
if m.lastDBDump != nil {
|
||||||
expiresAt time.Time
|
fileValidation := make(map[string]DumpValidation) // keyed by filename
|
||||||
}
|
for _, f := range files {
|
||||||
|
fileValidation[f.FileName] = f.Validation
|
||||||
// AFTER:
|
}
|
||||||
type session struct {
|
for i, r := range m.lastDBDump.Results {
|
||||||
expiresAt time.Time
|
if !r.Validation.Valid && r.Validation.Error == "" && r.FilePath != "" {
|
||||||
}
|
filename := filepath.Base(r.FilePath)
|
||||||
|
if fv, ok := fileValidation[filename]; ok {
|
||||||
|
m.lastDBDump.Results[i].Validation = fv
|
||||||
|
m.logger.Printf("[INFO] Re-validated %s from disk: valid=%v tables=%d",
|
||||||
|
filename, fv.Valid, fv.TableCount)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
And update `createSession`:
|
Note: Add `"path/filepath"` to imports if not already present.
|
||||||
```go
|
|
||||||
// BEFORE:
|
|
||||||
s.sessions[token] = &session{token: token, expiresAt: time.Now().Add(sessionMaxAge)}
|
|
||||||
|
|
||||||
// AFTER:
|
This runs every 5 minutes (same cadence as the cache refresh) and will automatically
|
||||||
s.sessions[token] = &session{expiresAt: time.Now().Add(sessionMaxAge)}
|
heal any stale validation state in `lastDBDump` by cross-referencing the fresh
|
||||||
|
`ListDumpFiles` results.
|
||||||
|
|
||||||
|
### Fix 4: Disable conflicting systemd timer (manual step)
|
||||||
|
|
||||||
|
If the diagnostic step above reveals that `backup-db-dump.timer` is still active:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
sudo systemctl stop backup-db-dump.timer
|
||||||
|
sudo systemctl disable backup-db-dump.timer
|
||||||
|
# Optionally verify:
|
||||||
|
systemctl list-timers | grep backup
|
||||||
|
# Should show nothing
|
||||||
```
|
```
|
||||||
|
|
||||||
After these changes, remove the `"crypto/subtle"` import if no longer used.
|
The controller's built-in `db-dump` scheduler job at 02:30 replaces this timer entirely.
|
||||||
|
Having both run simultaneously can corrupt dump files mid-write.
|
||||||
|
|
||||||
**Verify:** Log in, navigate around — session should work. Log out — should redirect to login.
|
### Verify
|
||||||
|
|
||||||
---
|
After deploying fixes:
|
||||||
|
1. Wait for cache refresh (5 minutes) or trigger a manual backup ("Mentés most")
|
||||||
## Fix 5: `cleanupSessions` goroutine leak
|
2. Check `/backups` page — validation column should show "X tábla" for all databases
|
||||||
|
3. Check controller logs for `[DEBUG] ValidateDump` lines confirming validation ran
|
||||||
**File:** `internal/web/auth.go`
|
4. Verify no `[WARN] ValidateDump FAIL` lines in logs
|
||||||
|
|
||||||
**Problem:** `time.Tick()` creates a ticker that can never be GC'd. The goroutine runs forever, even during shutdown.
|
|
||||||
|
|
||||||
**Fix:** This one is lower priority since the controller runs as a long-lived process, but the fix is simple. Since we don't currently pass a context to `NewServer`, use a `done` channel on the server:
|
|
||||||
|
|
||||||
Add a `done` channel to the Server struct:
|
|
||||||
```go
|
|
||||||
type Server struct {
|
|
||||||
// ... existing fields ...
|
|
||||||
done chan struct{}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
Initialize it in `NewServer`:
|
|
||||||
```go
|
|
||||||
func NewServer(...) *Server {
|
|
||||||
s := &Server{
|
|
||||||
// ... existing ...
|
|
||||||
done: make(chan struct{}),
|
|
||||||
}
|
|
||||||
s.loadTemplates()
|
|
||||||
go s.cleanupSessions()
|
|
||||||
return s
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
Rewrite `cleanupSessions`:
|
|
||||||
```go
|
|
||||||
func (s *Server) cleanupSessions() {
|
|
||||||
ticker := time.NewTicker(15 * time.Minute)
|
|
||||||
defer ticker.Stop()
|
|
||||||
for {
|
|
||||||
select {
|
|
||||||
case <-s.done:
|
|
||||||
return
|
|
||||||
case <-ticker.C:
|
|
||||||
s.sessionsMu.Lock()
|
|
||||||
now := time.Now()
|
|
||||||
for t, sess := range s.sessions {
|
|
||||||
if now.After(sess.expiresAt) {
|
|
||||||
delete(s.sessions, t)
|
|
||||||
}
|
|
||||||
}
|
|
||||||
s.sessionsMu.Unlock()
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
Add a `Close` method (called from main during shutdown, optional for now):
|
|
||||||
```go
|
|
||||||
func (s *Server) Close() {
|
|
||||||
close(s.done)
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
**Verify:** Build succeeds, controller starts without errors.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Fix 6: Add `http.MaxBytesReader` to API POST endpoints
|
|
||||||
|
|
||||||
**File:** `internal/api/router.go`
|
|
||||||
|
|
||||||
**Problem:** `json.NewDecoder(req.Body).Decode(&body)` has no size limit. A malicious or accidental large POST could exhaust memory.
|
|
||||||
|
|
||||||
**Fix:** Add a helper and use it in all handlers that decode JSON bodies (`deployStack`, `updateOptionalConfig`, `deleteStack`):
|
|
||||||
|
|
||||||
Add helper at the bottom of `router.go`:
|
|
||||||
```go
|
|
||||||
// limitBody wraps the request body with a size limit (default 1MB).
|
|
||||||
func limitBody(w http.ResponseWriter, req *http.Request) {
|
|
||||||
req.Body = http.MaxBytesReader(w, req.Body, 1<<20) // 1MB
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
Then at the start of each handler that reads the body:
|
|
||||||
```go
|
|
||||||
func (r *Router) deployStack(w http.ResponseWriter, req *http.Request, name string) {
|
|
||||||
limitBody(w, req)
|
|
||||||
// ... existing json decode ...
|
|
||||||
}
|
|
||||||
|
|
||||||
func (r *Router) updateOptionalConfig(w http.ResponseWriter, req *http.Request, name string) {
|
|
||||||
limitBody(w, req)
|
|
||||||
// ...
|
|
||||||
}
|
|
||||||
|
|
||||||
func (r *Router) deleteStack(w http.ResponseWriter, req *http.Request, name string) {
|
|
||||||
limitBody(w, req)
|
|
||||||
// ...
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
**Verify:** Build succeeds. Normal deploy/config/delete still works (payloads are tiny).
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Fix 7: Cache `time.LoadLocation` in template funcmap
|
|
||||||
|
|
||||||
**File:** `internal/web/funcmap.go`
|
|
||||||
|
|
||||||
**Problem:** At least 5 template functions call `time.LoadLocation("Europe/Budapest")` on every render. While Go caches internally, it still acquires a mutex each time.
|
|
||||||
|
|
||||||
**Fix:** Load once at the top of `templateFuncMap` and capture in the closures:
|
|
||||||
|
|
||||||
```go
|
|
||||||
func (s *Server) templateFuncMap() template.FuncMap {
|
|
||||||
loc, err := time.LoadLocation("Europe/Budapest")
|
|
||||||
if err != nil {
|
|
||||||
loc = time.UTC
|
|
||||||
}
|
|
||||||
|
|
||||||
return template.FuncMap{
|
|
||||||
// ... in every function that currently calls time.LoadLocation,
|
|
||||||
// replace with the captured `loc` variable.
|
|
||||||
// Remove the per-function `loc, _ := time.LoadLocation(...)` lines.
|
|
||||||
// Example:
|
|
||||||
"timeAgo": func(t time.Time) string {
|
|
||||||
if t.IsZero() { return "–" }
|
|
||||||
now := time.Now().In(loc)
|
|
||||||
d := now.Sub(t.In(loc))
|
|
||||||
// ... rest unchanged ...
|
|
||||||
},
|
|
||||||
// Apply same pattern to: fmtTime, fmtTimeShort, nextRunLabel, nextPruneLabel
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
There are 5 functions that need this change: `timeAgo`, `fmtTime`, `fmtTimeShort`, `nextRunLabel`, `nextPruneLabel`.
|
|
||||||
|
|
||||||
**Verify:** Build succeeds. Dashboard/backup page timestamps still display correctly in Budapest time.
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Post-fix checklist
|
## Post-fix checklist
|
||||||
|
|
||||||
1. `grep -rn 'NotFound(w, nil)' internal/` → 0 results
|
### Hub (felhom.eu repo → hub/)
|
||||||
2. `grep -rn 'subtle.ConstantTimeCompare' internal/` → 0 results (unless used elsewhere)
|
- [ ] `grep -rn 'time.Parse("2006-01-02 15:04:05"' hub/internal/store/` → 0 results
|
||||||
3. `grep -rn 'time.Tick(' internal/` → 0 results
|
- [ ] `parseSQLiteTime` function exists in store.go
|
||||||
4. `grep -rn 'Secure:.*true' internal/web/auth.go` → 0 results (now dynamic)
|
- [ ] `go build ./cmd/hub/` succeeds
|
||||||
5. Build: `go build ./cmd/controller/` succeeds with no errors
|
- [ ] `go vet ./...` passes
|
||||||
6. `go vet ./...` passes
|
- [ ] Build new image, deploy to k3s
|
||||||
7. Version bump in build to v0.6.1
|
- [ ] hub.felhom.eu shows OK status for demo-felhom
|
||||||
8. Commit, push, build, deploy, verify on demo-felhom.eu
|
- [ ] Report history shows real timestamps
|
||||||
|
|
||||||
|
### Controller (deploy-felhom-compose repo → controller/)
|
||||||
|
- [ ] Template has 4-branch validation check (Valid / Error / zero-value guard)
|
||||||
|
- [ ] `RefreshCache` has cross-check re-validation logic
|
||||||
|
- [ ] `ValidateDump` has debug logging
|
||||||
|
- [ ] `backup-db-dump.timer` is disabled on demo-felhom
|
||||||
|
- [ ] `go build ./cmd/controller/` succeeds
|
||||||
|
- [ ] `go vet ./...` passes
|
||||||
|
- [ ] Build, deploy to demo-felhom
|
||||||
|
- [ ] Backup page shows table counts, not "Hiba"
|
||||||
|
- [ ] Controller logs show `[DEBUG] ValidateDump OK` entries
|
||||||
|
|
||||||
|
### Version bumps
|
||||||
|
- Hub: bump to next patch version
|
||||||
|
- Controller: include in v0.6.1 release (alongside the code review fixes from the other TASK.md)
|
||||||
Reference in New Issue
Block a user