Create CLAUDE.md + cleanup + statusIcon fix (felhom.eu repo)
@@ -1,361 +1,316 @@
|
|||||||
# TASK.md — Hub Dashboard Bugs + Backup Validation Fix
|
# TASK: Create CLAUDE.md + cleanup + statusIcon fix (felhom.eu repo)
|
||||||
|
|
||||||
## Overview
|
## Context
|
||||||
|
|
||||||
Three bugs identified from the live hub.felhom.eu and controller backup page:
|
The `felhom.eu` repo lacks a CLAUDE.md with build instructions for the hub, has no `.gitignore` (so `hub.exe` got committed), and has a statusIcon rendering bug on the hub dashboard.
|
||||||
|
|
||||||
1. **Hub main page shows DOWN** despite the detail page showing STATUS: OK
|
**Current state:** Hub v0.1.2 running on k3s. Controller v0.6.2 on demo node.
|
||||||
2. **Hub report history timestamps show 00:00:00** instead of actual times
|
|
||||||
3. **Backup page shows "Hiba" for all DB validations** with no tooltip detail
|
|
||||||
|
|
||||||
Bugs 1 and 2 share the same root cause (timestamp parsing). Bug 3 is in the controller.
|
All changes in this task are in the **felhom.eu repo** only.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Bug 1 & 2: Hub timestamp parsing failure
|
## Task 1: Create CLAUDE.md
|
||||||
|
|
||||||
**Repository:** `felhom.eu` → `hub/`
|
Create `CLAUDE.md` in the repo root (`E:\git\felhom.eu\CLAUDE.md`) with the following content.
|
||||||
|
Use the controller's `CLAUDE.md` (in `deploy-felhom-compose`) as a style reference.
|
||||||
|
|
||||||
### Root cause
|
The CLAUDE.md should include these sections:
|
||||||
|
|
||||||
The hub's SQLite store parses `received_at` timestamps with a single format:
|
### Project overview
|
||||||
|
|
||||||
```go
|
This repo (`felhom.eu`) contains:
|
||||||
c.ReceivedAt, _ = time.Parse("2006-01-02 15:04:05", receivedAt)
|
- **Website** (`website/`) — Static HTML pages at felhom.eu, served via k3s nginx + git-sync sidecar
|
||||||
|
- **Hub** (`hub/`) — Go application (felhom-hub) — centralized dashboard for monitoring customer controllers, runs on k3s at hub.felhom.eu
|
||||||
|
- **K8s manifests** (`manifests/`) — k3s deployment manifests for all felhom-system services
|
||||||
|
|
||||||
|
See `README.md` for full architecture, DNS, email, and SEO documentation.
|
||||||
|
See `TASK.md` for the current task to implement (if it exists).
|
||||||
|
|
||||||
|
### Code quality rules
|
||||||
|
|
||||||
|
Same as controller CLAUDE.md:
|
||||||
|
- Always double-check generated code for bugs, logic issues, syntax errors
|
||||||
|
- Handle edge cases without overcomplicating
|
||||||
|
- Add debug capabilities for troubleshooting
|
||||||
|
- Ask for more input rather than guessing
|
||||||
|
|
||||||
|
### Workspace layout
|
||||||
|
|
||||||
|
```
|
||||||
|
E:\git\felhom.eu\ (or /e/git/felhom.eu/ in Git Bash)
|
||||||
|
├── hub/ # felhom-hub Go application
|
||||||
|
│ ├── cmd/hub/ # Entry point (main.go)
|
||||||
|
│ ├── internal/
|
||||||
|
│ │ ├── api/ # Report ingestion API
|
||||||
|
│ │ ├── store/ # SQLite storage + queries
|
||||||
|
│ │ └── web/ # Dashboard UI
|
||||||
|
│ │ ├── server.go # Server, routing, template funcs
|
||||||
|
│ │ ├── embed.go # go:embed for templates
|
||||||
|
│ │ └── templates/ # HTML templates + CSS
|
||||||
|
│ ├── configs/ # Example config files
|
||||||
|
│ ├── Dockerfile
|
||||||
|
│ ├── Makefile
|
||||||
|
│ └── go.mod
|
||||||
|
├── manifests/ # k3s deployment manifests
|
||||||
|
│ ├── hub.yaml # Hub deployment (hub.felhom.eu)
|
||||||
|
│ ├── webpage.yaml # Website + FileBrowser + git-sync
|
||||||
|
│ ├── contact-mailer.yaml # Contact form email sender
|
||||||
|
│ ├── healthchecks.yaml # Healthchecks (status.felhom.eu)
|
||||||
|
│ └── umami.yaml # Analytics (stats.felhom.eu)
|
||||||
|
├── website/ # Static HTML pages (felhom.eu)
|
||||||
|
│ ├── index.html
|
||||||
|
│ ├── alkalmazasok.html
|
||||||
|
│ ├── ... (all Hungarian, UTF-8 with BOM)
|
||||||
|
│ └── assets/ # Logos, screenshots, OG images
|
||||||
|
├── CLAUDE.md # This file
|
||||||
|
├── README.md # Full project documentation
|
||||||
|
└── TASK.md # Current task (if exists)
|
||||||
```
|
```
|
||||||
|
|
||||||
The parse error is silently discarded (`_`). When the format doesn't match what the
|
Related repos (same parent directory):
|
||||||
`modernc.org/sqlite` driver returns, `ReceivedAt` becomes Go's zero time (`0001-01-01 00:00:00`).
|
```
|
||||||
|
E:\git\deploy-felhom-compose\ # felhom-controller Go app + deploy scripts
|
||||||
|
E:\git\app-catalog-felhom.eu\ # Docker Compose templates per app
|
||||||
|
E:\git\homelab-manifests\ # k3s cluster manifests (dooplex.hu services)
|
||||||
|
E:\git\misc-scripts\ # Helper scripts (build scripts, repo collector)
|
||||||
|
```
|
||||||
|
|
||||||
**Consequences:**
|
All repos hosted at `gitea.dooplex.hu/admin/`.
|
||||||
- `time.Since(zeroTime)` saturates at Go's maximum `time.Duration` (about 292 years, roughly 2.5 million hours) → `TimeSinceReport > 1 hour` → **OverallStatus = "down"**
|
|
||||||
- `zeroTime.Format("15:04:05")` → **"00:00:00"** in report history
|
|
||||||
- Detail page health status shows OK because that comes from the report JSON payload, not the timestamp
|
|
||||||
|
|
||||||
The `modernc.org/sqlite` driver may return datetime strings in various formats depending on
|
### SSH access
|
||||||
how the value was stored and the SQLite version:
|
|
||||||
- `2026-02-16 14:30:00` (what we expect)
|
|
||||||
- `2026-02-16T14:30:00Z` (ISO 8601 / RFC3339-ish)
|
|
||||||
- `2026-02-16 14:30:00+00:00` (with timezone offset)
|
|
||||||
- `2026-02-16 14:30:00.123456` (with fractional seconds)
|
|
||||||
|
|
||||||
### Fix: `hub/internal/store/store.go`
|
SSH key-based authentication configured. No password prompts.
|
||||||
|
|
||||||
**Step 1:** Add a robust timestamp parser function at the bottom of store.go:
|
| Host | IP | User | Role |
|
||||||
|
|------|----|------|------|
|
||||||
|
| Build server (k3s node) | 192.168.0.180 | kisfenyo | Build + push images, kubectl |
|
||||||
|
| Demo node | 192.168.0.162 | kisfenyo | Test deployment (demo-felhom.eu) |
|
||||||
|
|
||||||
|
**Note:** `kubectl` on the build server requires `sudo` (k3s kubeconfig permissions).
|
||||||
|
|
||||||
|
### Build & deploy workflow — Hub
|
||||||
|
|
||||||
|
After making code changes to `hub/`, you **MUST** build, push, and deploy the new image.
|
||||||
|
Do NOT leave code changes uncommitted or undeployed.
|
||||||
|
|
||||||
|
#### Step 1: Commit and push changes
|
||||||
|
|
||||||
|
```bash
|
||||||
|
cd /e/git/felhom.eu
|
||||||
|
git add -A && git commit -m "<descriptive message>" && git push
|
||||||
|
```
|
||||||
|
|
||||||
|
#### Step 2: Build + push the container image on the build server
|
||||||
|
|
||||||
|
The build server (192.168.0.180) has the build toolchain. The build script lives at
|
||||||
|
`~/build/felhom-hub/build.sh` on the build server (NOT in this repo).
|
||||||
|
|
||||||
|
First, check the current running version:
|
||||||
|
```bash
|
||||||
|
ssh kisfenyo@192.168.0.180 "sudo kubectl get deploy -n felhom-system hub -o jsonpath='{.spec.template.spec.containers[0].image}'"
|
||||||
|
```
|
||||||
|
|
||||||
|
Then build with the next version (e.g., if current is 0.1.2, use 0.1.3):
|
||||||
|
```bash
|
||||||
|
ssh kisfenyo@192.168.0.180 "cd ~/build/felhom-hub && ./build.sh <NEW_VERSION> --push"
|
||||||
|
```
|
||||||
|
|
||||||
|
The build script:
|
||||||
|
- Pulls latest code from Gitea (`git pull` on the felhom.eu repo)
|
||||||
|
- Copies `hub/` source to a clean build workspace
|
||||||
|
- Builds Docker image with version + build-time ldflags
|
||||||
|
- Pushes to `gitea.dooplex.hu/admin/felhom-hub:<VERSION>` and `:latest`
|
||||||
|
|
||||||
|
#### Step 3: Deploy to k3s
|
||||||
|
|
||||||
|
```bash
|
||||||
|
ssh kisfenyo@192.168.0.180 "sudo kubectl set image -n felhom-system deploy/hub hub=gitea.dooplex.hu/admin/felhom-hub:<NEW_VERSION>"
|
||||||
|
```
|
||||||
|
|
||||||
|
#### Step 4: Verify the deployment
|
||||||
|
|
||||||
|
```bash
|
||||||
|
ssh kisfenyo@192.168.0.180 "sudo kubectl get pods -n felhom-system -l app=hub && echo '---' && sudo kubectl logs -n felhom-system -l app=hub --tail 10"
|
||||||
|
```
|
||||||
|
|
||||||
|
Should show pod Running and `[INFO] felhom-hub <VERSION> starting` in logs.
|
||||||
|
|
||||||
|
#### Build workflow summary
|
||||||
|
|
||||||
|
| Step | Command | Where |
|
||||||
|
|------|---------|-------|
|
||||||
|
| 1. Commit + push | `git add -A && git commit && git push` | Local (this repo) |
|
||||||
|
| 2. Build + push image | `ssh 192.168.0.180 "cd ~/build/felhom-hub && ./build.sh <VER> --push"` | Build server |
|
||||||
|
| 3. Deploy | `ssh 192.168.0.180 "sudo kubectl set image -n felhom-system deploy/hub hub=...:<VER>"` | Build server (kubectl) |
|
||||||
|
| 4. Verify | `ssh 192.168.0.180 "sudo kubectl get pods -n felhom-system -l app=hub"` | Build server |
|
||||||
|
|
||||||
|
### Build & deploy workflow — Website
|
||||||
|
|
||||||
|
The website auto-deploys via git-sync sidecar. Just push to `main`:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
cd /e/git/felhom.eu
|
||||||
|
git add -A && git commit -m "<message>" && git push
|
||||||
|
```
|
||||||
|
|
||||||
|
Changes are live within 1-2 minutes. No build step needed.
|
||||||
|
|
||||||
|
For emergency edits, use FileBrowser at `https://files.felhom.eu`.
|
||||||
|
|
||||||
|
### Build & deploy workflow — K8s Manifests
|
||||||
|
|
||||||
|
Manifests are applied manually:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
ssh kisfenyo@192.168.0.180 "sudo kubectl apply -f /home/kisfenyo/git/felhom.eu/manifests/<manifest>.yaml"
|
||||||
|
```
|
||||||
|
|
||||||
|
Remember to `git pull` on the build server first if you pushed changes locally.
|
||||||
|
|
||||||
|
### Tech stack (Hub)
|
||||||
|
|
||||||
|
- **Language:** Go 1.24+
|
||||||
|
- **Web framework:** stdlib `net/http` + `html/template`
|
||||||
|
- **Database:** SQLite via `modernc.org/sqlite` (pure Go, no CGo)
|
||||||
|
- **Auth:** bcrypt password hash + basic auth
|
||||||
|
- **Deployment:** Docker container on k3s (felhom-system namespace)
|
||||||
|
- **Storage:** Longhorn PVC at `/data/` (SQLite DB)
|
||||||
|
- **Config:** YAML file mounted via k8s ConfigMap at `/etc/felhom-hub/hub.yaml`
|
||||||
|
|
||||||
|
### Key patterns
|
||||||
|
|
||||||
|
- Hub receives reports from customer controllers via `POST /api/v1/report` (Bearer token auth)
|
||||||
|
- Dashboard shows all customers in a table with status, CPU, memory, disk, containers, backup age
|
||||||
|
- Customer detail page shows system info, report history, full JSON report
|
||||||
|
- Status logic: OK (report < 30m), WARN (30m-1h or health=warn), DOWN (> 1h or health=fail)
|
||||||
|
- SQLite timestamps may vary in format — use `parseSQLiteTime()` for robust parsing
|
||||||
|
- Auto-refresh: dashboard and detail pages refresh every 60 seconds via `<meta http-equiv="refresh">`
|
||||||
|
- Geo-restricted to Hungary via nginx ingress annotation
|
||||||
|
|
||||||
|
### File encoding
|
||||||
|
|
||||||
|
All HTML files in `website/` are **UTF-8 with BOM**. Ensure your editor preserves this.
|
||||||
|
Hub Go source files are standard UTF-8 (no BOM).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Task 2: Create .gitignore
|
||||||
|
|
||||||
|
Create `.gitignore` in the repo root with appropriate entries:
|
||||||
|
|
||||||
|
```gitignore
|
||||||
|
# Go binaries
|
||||||
|
hub/hub
|
||||||
|
hub/hub.exe
|
||||||
|
hub/bin/
|
||||||
|
|
||||||
|
# Build artifacts
|
||||||
|
*.exe
|
||||||
|
*.dll
|
||||||
|
*.so
|
||||||
|
*.dylib
|
||||||
|
|
||||||
|
# Test and coverage
|
||||||
|
*.test
|
||||||
|
*.out
|
||||||
|
coverage.html
|
||||||
|
|
||||||
|
# IDE
|
||||||
|
.idea/
|
||||||
|
.vscode/
|
||||||
|
*.swp
|
||||||
|
*.swo
|
||||||
|
*~
|
||||||
|
|
||||||
|
# OS
|
||||||
|
.DS_Store
|
||||||
|
Thumbs.db
|
||||||
|
|
||||||
|
# Temporary files
|
||||||
|
*.tmp
|
||||||
|
*.bak
|
||||||
|
```
|
||||||
|
|
||||||
|
## Task 3: Remove hub.exe from git tracking
|
||||||
|
|
||||||
|
After creating `.gitignore`, remove the committed binary:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
git rm --cached hub/hub.exe
|
||||||
|
```
|
||||||
|
|
||||||
|
This removes it from tracking while leaving the file on disk, and the `.gitignore` prevents it from being re-added. No need to rewrite
|
||||||
|
history; just remove it from the current tree.
|
||||||
|
|
||||||
|
Also check for any other binaries that shouldn't be tracked:
|
||||||
|
```bash
|
||||||
|
find hub/ -type f \( -name "*.exe" -o \( -name "hub" -executable \) \)
|
||||||
|
```
|
||||||
|
|
||||||
|
## Task 4: Fix hub statusIcon rendering
|
||||||
|
|
||||||
|
**File:** `hub/internal/web/server.go`
|
||||||
|
|
||||||
|
**Problem:** `statusIcon()` returns HTML entity strings (e.g. `&#x1F7E2;`), but Go's `html/template`
|
||||||
|
auto-escapes them, so the page shows the literal text `&#x1F7E2;` instead of an emoji. Additionally, emoji don't respond to
|
||||||
|
CSS `color`, yet the templates already apply `style="color: {{statusColor .OverallStatus}}"`.
|
||||||
|
|
||||||
|
**Fix:** Change `statusIcon()` to return `●` (U+25CF, BLACK CIRCLE) — a plain Unicode character
|
||||||
|
that responds to CSS color styling. The existing `statusColor()` function handles color differentiation.
|
||||||
|
|
||||||
```go
|
```go
|
||||||
// parseSQLiteTime tries multiple formats that modernc.org/sqlite may return.
|
// BEFORE (broken):
|
||||||
func parseSQLiteTime(s string) time.Time {
|
func statusIcon(status string) string {
|
||||||
formats := []string{
|
switch status {
|
||||||
"2006-01-02 15:04:05", // SQLite datetime('now')
|
case "ok":
|
||||||
"2006-01-02T15:04:05Z", // RFC3339 without fractional
|
return "&#x1F7E2;" // green circle
|
||||||
time.RFC3339, // 2006-01-02T15:04:05Z07:00
|
case "warn":
|
||||||
time.RFC3339Nano, // with fractional seconds
|
return "&#x1F7E1;" // yellow circle
|
||||||
"2006-01-02 15:04:05-07:00", // with numeric UTC offset, e.g. +00:00
|
case "down", "fail":
|
||||||
"2006-01-02 15:04:05.999999999", // with fractional, no TZ
|
return "&#x1F534;" // red circle
|
||||||
|
default:
|
||||||
|
return "&#x26AA;" // white circle
|
||||||
}
|
}
|
||||||
for _, f := range formats {
|
}
|
||||||
if t, err := time.Parse(f, s); err == nil {
|
|
||||||
return t
|
// AFTER (works with CSS color):
|
||||||
}
|
func statusIcon(status string) string {
|
||||||
}
|
return "●"
|
||||||
// Last resort: if string is non-empty, log it for debugging
|
|
||||||
if s != "" {
|
|
||||||
log.Printf("[WARN] Could not parse timestamp: %q", s)
|
|
||||||
}
|
|
||||||
return time.Time{} // zero time
|
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
Note: Add `"log"` to the import block if not already present.
|
No template changes needed — `statusColor()` already provides the correct color per status.
|
||||||
|
|
||||||
**Step 2:** Replace ALL occurrences of `time.Parse("2006-01-02 15:04:05", receivedAt)` in store.go.
|
**Verification:**
|
||||||
|
1. Dashboard: colored dot (green/yellow/red) before customer name, no `&#x` text
|
||||||
There are **three** locations:
|
2. Customer detail: colored dot in header
|
||||||
|
3. Colors match status (green=OK, yellow=WARN, red=DOWN)
|
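The escaping problem described in Task 4 can be reproduced in isolation. The sketch below is not the hub's code: `render` and the one-line template are made up for illustration, and only the standard `html/template` behavior is relied on.

```go
package main

import (
	"bytes"
	"fmt"
	"html/template"
)

// render executes a tiny template whose statusIcon function returns icon.
// The function and template are hypothetical stand-ins for the hub's own.
func render(icon string) string {
	funcs := template.FuncMap{
		"statusIcon": func(string) string { return icon },
	}
	tmpl := template.Must(template.New("row").Funcs(funcs).Parse(
		`<span style="color: green">{{statusIcon .}}</span>`))
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, "ok"); err != nil {
		panic(err)
	}
	return buf.String()
}

func main() {
	// A plain string holding an entity gets its '&' escaped to "&amp;",
	// so the browser displays the entity text instead of a glyph.
	fmt.Println(render("&#x1F7E2;"))
	// A literal U+25CF character passes through and picks up the CSS color.
	fmt.Println(render("●"))
}
```

Returning `template.HTML` would also stop the escaping, but it would not address the second half of the problem: emoji ignore CSS `color`.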
||||||
1. **`GetCustomers()`** — in the `for rows.Next()` loop:
|
|
||||||
```go
|
|
||||||
// BEFORE:
|
|
||||||
c.ReceivedAt, _ = time.Parse("2006-01-02 15:04:05", receivedAt)
|
|
||||||
|
|
||||||
// AFTER:
|
|
||||||
c.ReceivedAt = parseSQLiteTime(receivedAt)
|
|
||||||
```
|
|
||||||
|
|
||||||
2. **`GetCustomer()`** — after `row.Scan`:
|
|
||||||
```go
|
|
||||||
// BEFORE:
|
|
||||||
c.ReceivedAt, _ = time.Parse("2006-01-02 15:04:05", receivedAt)
|
|
||||||
|
|
||||||
// AFTER:
|
|
||||||
c.ReceivedAt = parseSQLiteTime(receivedAt)
|
|
||||||
```
|
|
||||||
|
|
||||||
3. **`GetCustomerHistory()`** — in the `for rows.Next()` loop:
|
|
||||||
```go
|
|
||||||
// BEFORE:
|
|
||||||
c.ReceivedAt, _ = time.Parse("2006-01-02 15:04:05", receivedAt)
|
|
||||||
|
|
||||||
// AFTER:
|
|
||||||
c.ReceivedAt = parseSQLiteTime(receivedAt)
|
|
||||||
```
|
|
||||||
|
|
||||||
**Step 3 (optional diagnostic):** Temporarily add a log line in `SaveReport` to see what format
|
|
||||||
SQLite actually stores/returns. This can be removed after verifying the fix:
|
|
||||||
|
|
||||||
```go
|
|
||||||
// Add after the INSERT in SaveReport, before return:
|
|
||||||
// Debug: check what format SQLite returns
|
|
||||||
var dbTime string
|
|
||||||
s.db.QueryRow("SELECT received_at FROM reports WHERE customer_id = ? ORDER BY id DESC LIMIT 1", customerID).Scan(&dbTime)
|
|
||||||
s.logger.Printf("[DEBUG] SQLite received_at raw value: %q", dbTime)
|
|
||||||
```
|
|
||||||
|
|
||||||
### Verify
|
|
||||||
|
|
||||||
After rebuilding and deploying the hub:
|
|
||||||
1. Wait for the next controller report push (or trigger manually)
|
|
||||||
2. Check hub.felhom.eu — status should show **OK** (green), not DOWN
|
|
||||||
3. Click into customer detail — "Last report: X min ago" should show a reasonable value
|
|
||||||
4. Report History timestamps should show actual times like `14:36:32`, not `00:00:00`
|
|
||||||
5. Check hub pod logs for any `[WARN] Could not parse timestamp` messages (should be none)
|
|
||||||
|
|
||||||
### Post-fix grep
|
|
||||||
|
|
||||||
```bash
|
|
||||||
grep -rn 'time.Parse("2006-01-02 15:04:05"' hub/internal/store/store.go
|
|
||||||
# Should return 0 results — all replaced with parseSQLiteTime()
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Bug 3: Backup page shows "Hiba" for all DB validations
|
## Build & Deploy
|
||||||
|
|
||||||
**Repository:** `deploy-felhom-compose` → `controller/`
|
After all changes, commit and deploy hub v0.1.3:
|
||||||
|
|
||||||
### Symptoms
|
|
||||||
|
|
||||||
- All 3 databases (immich, paperless, romm) show "Hiba" in the Érvényesítés column
|
|
||||||
- The Állapot column shows "OK" (dump succeeded)
|
|
||||||
- No tooltip text on hover (meaning `Validation.Error` is empty)
|
|
||||||
- Dump files are valid — headers are correct, sizes are reasonable (43.2 MB / 319.6 KB / 38.7 KB)
|
|
||||||
|
|
||||||
### Analysis
|
|
||||||
|
|
||||||
The template condition for "Hiba" in the `LastDBDump` path is:
|
|
||||||
```html
|
|
||||||
{{if .Error}} → shows "–" (dump failed)
|
|
||||||
{{else if .Validation.Valid}} → shows "X tábla" (validation passed)
|
|
||||||
{{else}} → shows "Hiba" (THIS IS WHAT WE SEE)
|
|
||||||
```
|
|
||||||
|
|
||||||
"Hiba" with empty tooltip means `Validation.Valid == false` AND `Validation.Error == ""`.
|
|
||||||
This is the **zero-value** of `DumpValidation{}` — meaning validation was never assigned.
|
|
||||||
|
|
||||||
The code in `DumpOne()` calls `ValidateDump()` and the code in `ListDumpFiles()` also calls
|
|
||||||
`ValidateDump()`. Both paths should populate the Validation field. Yet the UI shows zero-value.
|
|
||||||
|
|
||||||
**Most likely cause:** The `lastDBDump` state was populated by an older code version (before
|
|
||||||
validation was wired), OR there's a race condition where `RefreshCache` captures `lastDBDump`
|
|
||||||
mid-construction, OR the validation ran but hit an unexpected issue (permissions, encoding).
|
|
||||||
|
|
||||||
### Diagnostic step (run on demo-felhom FIRST)
|
|
||||||
|
|
||||||
Before applying fixes, check the controller logs to understand what happened:
|
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Check the last DB dump run
|
# 1. Commit
|
||||||
sudo journalctl -u felhom-controller --since "2026-02-16 00:00" | grep -iE "db dump|table|valid|dump:"
|
cd /e/git/felhom.eu
|
||||||
|
git add -A && git commit -m "add CLAUDE.md, .gitignore, fix statusIcon rendering" && git push
|
||||||
|
|
||||||
# Check if there was a controller restart
|
# 2. Build
|
||||||
sudo journalctl -u felhom-controller --since "2026-02-16 00:00" | grep -iE "starting|version|shutdown"
|
ssh kisfenyo@192.168.0.180 "cd ~/build/felhom-hub && ./build.sh 0.1.3 --push"
|
||||||
|
|
||||||
# Check if the old bash systemd timer is ALSO running (double-dump conflict!)
|
# 3. Deploy
|
||||||
systemctl is-active backup-db-dump.timer
|
ssh kisfenyo@192.168.0.180 "sudo kubectl set image -n felhom-system deploy/hub hub=gitea.dooplex.hu/admin/felhom-hub:0.1.3"
|
||||||
systemctl list-timers | grep backup
|
|
||||||
|
# 4. Verify
|
||||||
|
ssh kisfenyo@192.168.0.180 "sudo kubectl rollout status -n felhom-system deploy/hub && sudo kubectl logs -n felhom-system -l app=hub --tail 5"
|
||||||
```
|
```
|
||||||
|
|
||||||
**IMPORTANT:** If `backup-db-dump.timer` is still active, it will race with the controller's
|
## Post-deploy checklist
|
||||||
built-in `db-dump` scheduler job. Both write to the same directory. The bash script overwrites
|
|
||||||
files directly (no `.tmp` + rename), which could corrupt the file mid-validation. **Disable it:**
|
|
||||||
|
|
||||||
```bash
|
- [ ] `hub.felhom.eu` shows colored `●` dot, not literal `&#x1F7E2;` text

|
||||||
sudo systemctl stop backup-db-dump.timer
|
- [ ] `hub.exe` no longer in repo (`git ls-files hub/hub.exe` returns empty)
|
||||||
sudo systemctl disable backup-db-dump.timer
|
- [ ] `CLAUDE.md` exists in repo root
|
||||||
```
|
- [ ] `.gitignore` exists in repo root
|
||||||
|
|
||||||
### Fix 1: Add debug logging to `ValidateDump`
|
|
||||||
|
|
||||||
**File:** `controller/internal/backup/dbdump.go`, function `ValidateDump`
|
|
||||||
|
|
||||||
Add a log parameter and diagnostic output so we can see what's happening:
|
|
||||||
|
|
||||||
```go
|
|
||||||
// BEFORE:
|
|
||||||
func ValidateDump(filePath string, dbType DBType) DumpValidation {
|
|
||||||
|
|
||||||
// AFTER:
|
|
||||||
func ValidateDump(filePath string, dbType DBType) DumpValidation {
|
|
||||||
log.Printf("[DEBUG] ValidateDump: %s (type=%s)", filePath, dbType)
|
|
||||||
```
|
|
||||||
|
|
||||||
And at the end, before `return v`:
|
|
||||||
|
|
||||||
```go
|
|
||||||
v.Valid = true
|
|
||||||
log.Printf("[DEBUG] ValidateDump OK: %s — %d tables, header found", filePath, tableCount)
|
|
||||||
return v
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
Also add logging to the error paths:
|
|
||||||
|
|
||||||
After `v.Error = "dump file too small (< 100 bytes)"`:
|
|
||||||
```go
|
|
||||||
log.Printf("[WARN] ValidateDump FAIL: %s — %s", filePath, v.Error)
|
|
||||||
```
|
|
||||||
|
|
||||||
After `v.Error = fmt.Sprintf("read failed: %v", err)`:
|
|
||||||
```go
|
|
||||||
log.Printf("[WARN] ValidateDump FAIL: %s — %s", filePath, v.Error)
|
|
||||||
```
|
|
||||||
|
|
||||||
After `v.Error = "... dump missing comment header"`:
|
|
||||||
```go
|
|
||||||
log.Printf("[WARN] ValidateDump FAIL: %s — %s", filePath, v.Error)
|
|
||||||
```
|
|
||||||
|
|
||||||
After `v.Error = "no CREATE TABLE statements found"`:
|
|
||||||
```go
|
|
||||||
log.Printf("[WARN] ValidateDump FAIL: %s — %s (header was found, scanned %d lines)", filePath, v.Error, len(strings.Split(content, "\n")))
|
|
||||||
```
|
|
||||||
|
|
||||||
Note: Import `"log"` at the top of the file if not already imported (use the standard `log`
|
|
||||||
package, not the `*log.Logger` parameter — this is a quick debug addition. Can be cleaned up later.)
|
|
||||||
|
|
||||||
### Fix 2: Template guard against zero-value Validation
|
|
||||||
|
|
||||||
Even with debug logging, we should make the template resilient to zero-value Validation.
|
|
||||||
The "Hiba" label with no explanation is a bad UX.
|
|
||||||
|
|
||||||
**File:** `controller/internal/web/templates/backups.html`
|
|
||||||
|
|
||||||
In the `LastDBDump` section, change the Érvényesítés (validation) column:
|
|
||||||
|
|
||||||
```html
|
|
||||||
<!-- BEFORE: -->
|
|
||||||
{{if .Error}}
|
|
||||||
<span class="validation-badge validation-na">–</span>
|
|
||||||
{{else if .Validation.Valid}}
|
|
||||||
<span class="validation-badge validation-ok">{{.Validation.TableCount}} tábla</span>
|
|
||||||
{{else}}
|
|
||||||
<span class="validation-badge validation-fail" title="{{.Validation.Error}}">Hiba</span>
|
|
||||||
{{end}}
|
|
||||||
|
|
||||||
<!-- AFTER: -->
|
|
||||||
{{if .Error}}
|
|
||||||
<span class="validation-badge validation-na">–</span>
|
|
||||||
{{else if .Validation.Valid}}
|
|
||||||
<span class="validation-badge validation-ok">{{.Validation.TableCount}} tábla</span>
|
|
||||||
{{else if .Validation.Error}}
|
|
||||||
<span class="validation-badge validation-fail" title="{{.Validation.Error}}">Hiba</span>
|
|
||||||
{{else}}
|
|
||||||
<span class="validation-badge validation-na" title="Az érvényesítés nem futott le">–</span>
|
|
||||||
{{end}}
|
|
||||||
```
|
|
||||||
|
|
||||||
This ensures:
|
|
||||||
- If validation passed → green badge with table count
|
|
||||||
- If validation failed with a reason → red "Hiba" with tooltip
|
|
||||||
- If validation never ran (zero-value) → gray "–" with explanatory tooltip
|
|
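The four template branches above amount to a small decision function. The Go sketch below restates them outside the template for clarity. The `DumpValidation` struct lists only the fields named in this task, and `validationBadge` is a hypothetical helper, not code from the controller.

```go
package main

import "fmt"

// DumpValidation carries only the fields referenced in this task;
// the controller's real struct may have more.
type DumpValidation struct {
	Valid      bool
	Error      string
	TableCount int
}

// validationBadge mirrors the four-branch template logic (illustrative only).
func validationBadge(dumpErr string, v DumpValidation) string {
	switch {
	case dumpErr != "":
		return "–" // dump itself failed, nothing to validate
	case v.Valid:
		return fmt.Sprintf("%d tábla", v.TableCount)
	case v.Error != "":
		return "Hiba: " + v.Error // validation ran and failed with a reason
	default:
		return "– (validation did not run)" // zero-value DumpValidation
	}
}

func main() {
	fmt.Println(validationBadge("", DumpValidation{Valid: true, TableCount: 42}))
	fmt.Println(validationBadge("", DumpValidation{Error: "no CREATE TABLE statements found"}))
	fmt.Println(validationBadge("", DumpValidation{}))
	fmt.Println(validationBadge("pg_dump exited 1", DumpValidation{}))
}
```

The key point is the default branch: a zero-value struct is distinguishable from a real failure only because a real failure always sets `Error`.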
||||||
|
|
||||||
### Fix 3: Re-validate on cache refresh (belt-and-suspenders)
|
|
||||||
|
|
||||||
Since `RefreshCache` already calls `ListDumpFiles()` which runs `ValidateDump()` per file,
|
|
||||||
the `DumpFiles` fallback always has fresh validation. The issue is only in the `LastDBDump`
|
|
||||||
path when in-memory results have stale/missing validation.
|
|
||||||
|
|
||||||
Add a cross-check: if `LastDBDump` results have zero-value Validation but the file exists,
|
|
||||||
re-validate it. Add this in `RefreshCache`, after the existing code:
|
|
||||||
|
|
||||||
**File:** `controller/internal/backup/backup.go`, function `RefreshCache`
|
|
||||||
|
|
||||||
After the line `status.DumpFiles = files` and before the lock section, add:
|
|
||||||
|
|
||||||
```go
|
|
||||||
// Cross-check: if LastDBDump results have empty validation but files exist,
|
|
||||||
// re-validate from disk. This handles controller restarts and race conditions.
|
|
||||||
if m.lastDBDump != nil {
|
|
||||||
fileValidation := make(map[string]DumpValidation) // keyed by filename
|
|
||||||
for _, f := range files {
|
|
||||||
fileValidation[f.FileName] = f.Validation
|
|
||||||
}
|
|
||||||
for i, r := range m.lastDBDump.Results {
|
|
||||||
if !r.Validation.Valid && r.Validation.Error == "" && r.FilePath != "" {
|
|
||||||
filename := filepath.Base(r.FilePath)
|
|
||||||
if fv, ok := fileValidation[filename]; ok {
|
|
||||||
m.lastDBDump.Results[i].Validation = fv
|
|
||||||
m.logger.Printf("[INFO] Re-validated %s from disk: valid=%v tables=%d",
|
|
||||||
filename, fv.Valid, fv.TableCount)
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
Note: Add `"path/filepath"` to imports if not already present.
|
|
||||||
|
|
||||||
This runs every 5 minutes (same cadence as the cache refresh) and will automatically
|
|
||||||
heal any stale validation state in `lastDBDump` by cross-referencing the fresh
|
|
||||||
`ListDumpFiles` results.
|
|
||||||
|
|
||||||
### Fix 4: Disable conflicting systemd timer (manual step)
|
|
||||||
|
|
||||||
If the diagnostic step above reveals that `backup-db-dump.timer` is still active:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
sudo systemctl stop backup-db-dump.timer
|
|
||||||
sudo systemctl disable backup-db-dump.timer
|
|
||||||
# Optionally verify:
|
|
||||||
systemctl list-timers | grep backup
|
|
||||||
# Should show nothing
|
|
||||||
```
|
|
||||||
|
|
||||||
The controller's built-in `db-dump` scheduler job at 02:30 replaces this timer entirely.
|
|
||||||
Having both run simultaneously can corrupt dump files mid-write.
|
|
||||||
|
|
||||||
### Verify
|
|
||||||
|
|
||||||
After deploying fixes:
|
|
||||||
1. Wait for cache refresh (5 minutes) or trigger a manual backup ("Mentés most")
|
|
||||||
2. Check `/backups` page — validation column should show "X tábla" for all databases
|
|
||||||
3. Check controller logs for `[DEBUG] ValidateDump` lines confirming validation ran
|
|
||||||
4. Verify no `[WARN] ValidateDump FAIL` lines in logs
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Post-fix checklist
|
|
||||||
|
|
||||||
### Hub (felhom.eu repo → hub/)
|
|
||||||
- [ ] `grep -rn 'time.Parse("2006-01-02 15:04:05"' hub/internal/store/` → 0 results
|
|
||||||
- [ ] `parseSQLiteTime` function exists in store.go
|
|
||||||
- [ ] `go build ./cmd/hub/` succeeds
|
|
||||||
- [ ] `go vet ./...` passes
|
|
||||||
- [ ] Build new image, deploy to k3s
|
|
||||||
- [ ] hub.felhom.eu shows OK status for demo-felhom
|
|
||||||
- [ ] Report history shows real timestamps
|
|
||||||
|
|
||||||
### Controller (deploy-felhom-compose repo → controller/)
|
|
||||||
- [ ] Template has 4-branch validation check (dump error / Valid / Validation.Error / zero-value guard)
|
|
||||||
- [ ] `RefreshCache` has cross-check re-validation logic
|
|
||||||
- [ ] `ValidateDump` has debug logging
|
|
||||||
- [ ] `backup-db-dump.timer` is disabled on demo-felhom
|
|
||||||
- [ ] `go build ./cmd/controller/` succeeds
|
|
||||||
- [ ] `go vet ./...` passes
|
|
||||||
- [ ] Build, deploy to demo-felhom
|
|
||||||
- [ ] Backup page shows table counts, not "Hiba"
|
|
||||||
- [ ] Controller logs show `[DEBUG] ValidateDump OK` entries
|
|
||||||
|
|
||||||
### Version bumps
|
|
||||||
- Hub: bump to next patch version
|
|
||||||
- Controller: include in v0.6.1 release (alongside the code review fixes from the other TASK.md)
|
|
||||||