# CONTEXT.md — Project Memory
> This file serves as persistent project memory across Claude Code sessions.
> It replaces the auto-generated "Memory" from the claude.ai Project.
> **Update this file at the end of each working session** with current state,
> recent decisions, and anything the next session needs to know.
>
> Ask Claude Code: "Please update CONTEXT.md with what we did today"
Last updated: 2026-02-16 (session 22)
---
## About Viktor (project owner)
- Works at Magyar Telekom (Budapest), building Felhom as a side business
- Felhom: managed home-server service for Hungarian households
- Technical but prefers pragmatic solutions over over-engineering
- Runs all infrastructure on Gitea (gitea.dooplex.hu), k3s cluster for management
- Customer deployments use Docker Compose (not Kubernetes) for simplicity
## Current project state
### felhom-controller (this repo)
- **Version:** v0.7.0
- **Phase 1:** ✅ COMPLETE — Stack Manager + Deploy Flow
- **Phase 2:** ✅ COMPLETE — Monitoring & Health (scheduler, CPU/temp, healthchecks.io pings)
- **Phase 3:** ✅ COMPLETE — Backups (DB dumps, restic integration, manual trigger, **dedicated backup page**)
- **Phase 4:** ✅ COMPLETE — Monitoring Page with Metrics Store (SQLite, Chart.js, system + container metrics)
- **Phase 5:** ✅ COMPLETE — Authentication, Persistence & Settings Page (settings.json, password change, session management)
- **First app deployed:** Paperless-ngx on demo-felhom.eu (2026-02-13)
- **Running on:** demo-felhom (N100 mini PC) at 192.168.0.162:8080
- **All Phase 1-5 features working:** deploy, start/stop/restart/update, logs, health-aware states, auth, monitoring, backups, backup detail page, system monitoring page, settings page
### What was just completed (2026-02-16 session 22)
- **v0.7.0 — Phase 1: Authentication, Persistence & Settings Page:**
- **New `internal/settings/settings.go`:** Shared persistence layer via `settings.json` in the data directory. Atomic writes (tmp + rename), thread-safe with `sync.RWMutex`. Stores password hash overrides and DB validation cache. Graceful handling if file doesn't exist.
- **Auth improvements:**
- Password resolution priority: `settings.json` → `controller.yaml` → none (open dashboard)
- Startup logs which source is active: `Auth: using password from settings.json/controller.yaml/no password configured`
- Session duration extended to 7 days (was 24h)
- `?next=` redirect after session expiry — returns user to the page they were on
- Flash messages on login page (green info box, used after password change)
- Conditional logout link — hidden when auth is disabled (no password configured)
- `invalidateAllSessions()` method for password change flow
- **New Settings page (`/settings`):**
- "Rendszer konfiguráció" section: read-only display of controller.yaml values (customer ID/name/domain, git repo/sync interval, backup enabled/schedule, monitoring, healthchecks URL, hub status, controller version)
- "Jelszó módosítás" section: form with current password, new password, confirm — validates min 8 chars, match check, bcrypt comparison
- Password saved to `settings.json`, all sessions invalidated, redirect to login with flash message
- Only shown if auth is enabled; otherwise shows info message to contact operator
- **Sidebar update:**
- "Beállítások" menu item with ⚙ icon pinned to bottom (above version/logout)
- Version and logout link separated from nav links
- Logout link conditionally shown only when auth is enabled
- **DB validation persistence:**
- After each successful dump, validation results saved to `settings.json` (`db_validations` map keyed by filename)
- Cached data survives container restarts
- `DBValidationCache` struct with `validated_at`, `table_count`, `has_header`, `error`
- **10 files changed** (3 new: settings.go, settings.html; 7 modified: main.go, backup.go, auth.go, handlers.go, server.go, layout.html, login.html, style.css)
- **Deployed:** Controller v0.7.0 to demo-felhom.eu, verified healthy
### What was previously completed (2026-02-16 session 21)
- **v0.6.3 — Bug fixes from v0.6.2 code scan (4 minor fixes):**
- **Bug 1:** `--hdd-path` in `docker-setup.sh` now uses `require_arg` validation like all other flags. Previously, `--hdd-path` as the last argument without a value would crash with a cryptic bash error under `set -u` instead of a friendly message.
- **Bug 2:** `stackAction()` in `layout.html` now receives `event` as an explicit parameter instead of relying on the deprecated implicit `window.event`. All 10 onclick call sites in `dashboard.html` and `stacks.html` updated to pass `event` as first argument.
- **Bug 3:** Page `
` now has an em dash separator: `"Vezérlőpult — Felhom.eu"` instead of `"VezérlőpultFelhom.eu"`.
- **Bug 4:** `nextPruneLabel()` in `funcmap.go` now returns `"ma"` (Hungarian for "today") on Sunday before 4am, consistent with the `nextRunLabel` function. Previously returned the date in `"2006-01-02"` format.
- **Deployed:** Controller v0.6.3 to demo-felhom.eu, verified healthy
### What was previously completed (2026-02-16 session 20)
- **Hub Dashboard Bugs + Backup Validation Fix (3 bugs):**
- **Bug 1&2 (Hub repo, felhom-hub v0.1.2):** Hub timestamp parsing failure — `time.Parse` with single hardcoded format silently failed for formats returned by `modernc.org/sqlite`. Added `parseSQLiteTime()` that tries 6 common formats. Fixed: hub main page showing DOWN despite OK status, and report history timestamps showing 00:00:00.
- **Bug 3 (Controller repo, v0.6.2):** Backup page showing "Hiba" for all DB validations — zero-value `DumpValidation{}` (never assigned) hit the `{{else}}` branch in template. Three fixes:
- Template: 4-branch guard (Valid → OK / Error → Hiba / zero-value → "–" with tooltip)
- Debug logging: Added `[DEBUG]` and `[WARN]` log lines to all `ValidateDump()` code paths
- Re-validation: `RefreshCache()` now cross-checks `lastDBDump` results against fresh `ListDumpFiles()` validation, healing stale in-memory state
- **Deployed:** Hub v0.1.2 to k3s, Controller v0.6.2 to demo-felhom
- **Verified:** Controller logs show `ValidateDump OK` for all 3 databases (immich: 60 tables, paperless: 67 tables, romm: 14 tables)
### What was previously completed (2026-02-16 session 19)
- **v0.6.1 — Code Review Bugfixes (7 fixes):**
- **Fix 1:** `http.NotFound(w, nil)` → pass actual `*http.Request` in `deployHandler` and `appDetailHandler`
- **Fix 2:** Dashboard running/stopped counts now computed from the filtered `deployedStacks` set (was counting ALL stacks including non-deployed)
- **Fix 3:** Session cookie `Secure` flag now dynamic based on `r.TLS != nil || X-Forwarded-Proto == "https"`. `SameSite` changed from `Strict` to `Lax` (Strict breaks Cloudflare Tunnel redirects)
- **Fix 4:** Removed misleading `subtle.ConstantTimeCompare` from `isValidSession()` (map lookup already leaks timing; comparing token to itself is meaningless). Removed unused `token` field from `session` struct. Removed `crypto/subtle` import.
- **Fix 5:** Replaced `time.Tick()` (goroutine leak) with proper `time.NewTicker` + `done` channel in `cleanupSessions()`. Added `Close()` method to Server. Added `done chan struct{}` to Server struct.
- **Fix 6:** Added `http.MaxBytesReader(w, req.Body, 1<<20)` (1MB limit) to `deployStack`, `updateOptionalConfig`, `deleteStack` API handlers via `limitBody()` helper.
- **Fix 7:** Cached `time.LoadLocation("Europe/Budapest")` once at top of `templateFuncMap()`, removed 5 per-function `LoadLocation` calls (timeAgo, fmtTime, fmtTimeShort, nextRunLabel, nextPruneLabel).
- **Post-fix verification:** All 4 grep checks pass (0 results for NotFound(w,nil), ConstantTimeCompare, time.Tick(, Secure:.*true). `go vet ./...` clean.
- **Controller version:** v0.6.1 — deployed and verified on demo-felhom.eu
### What was previously completed (2026-02-16 session 18)
- **v0.6.0 — Healthcheck Implementation + Central Push + Hub Dashboard:**
- **Part 1 — Healthcheck enhancements (controller-side):**
- Added `heartbeat` ping — lightweight "I'm alive" signal every 5 min (no logic, just ping)
- Added `backup_integrity` ping — weekly `restic check` on Sunday 04:00, pings healthchecks with result
- Added `Heartbeat` and `BackupIntegrity` fields to `PingUUIDsConfig`
- Added `RunIntegrityCheck()` to backup Manager (calls restic Check(), updates lastCheckTime/lastCheckOK, pings)
- Updated `controller.yaml.example` with new monitoring ping_uuids
- Created `monitoring/DEPRECATED.md` for legacy bash monitoring scripts
- **Part 2 — Central hub reporting (controller-side):**
- New `internal/report/` package: types.go (Report struct), builder.go (BuildReport), pusher.go (HTTP push)
- Report builder gathers data from all subsystems: system info (via metrics.GetStaticInfo + system.GetInfo), container stats (via metricsStore.QueryContainerSummary), backup status (via backupMgr.GetFullStatus), health (via monitor.RunHealthCheck), stacks (via stackMgr.GetStacks)
- Report pusher: POST JSON to hub with Bearer token auth, 3 retries with 5s backoff, never fails caller
- Added `HubConfig` to config.go (enabled, url, api_key, push_interval)
- Wired hub reporting into scheduler (configurable interval, default 15m)
- Hub reporting disabled by default (hub.enabled: false)
- **Part 3 — Hub service (felhom.eu repo, new `hub/` subfolder):**
- Full Go service: `cmd/hub/main.go`, `internal/api/handler.go`, `internal/store/store.go`, `internal/web/server.go`
- SQLite store with WAL mode, auto-migration, denormalized fields for fast queries
- REST API: POST /api/v1/report (Bearer token auth), GET /api/v1/customers, GET /api/v1/customers/{id}, GET /api/v1/customers/{id}/history
- Dark theme dashboard (English): multi-customer overview table with status indicators, customer detail page with system/storage/containers/backup/health sections
- Color coding: green (OK, <30min), yellow (warn or 30-60min), red (fail or >60min)
- K8s manifest: Deployment + Service + Ingress for hub.felhom.eu in felhom-system namespace
- Dockerfile, Makefile, hub.yaml.example config
- 90-day report retention with daily auto-prune
- **Controller version:** v0.6.0 — deployed and verified on demo-felhom.eu (9 scheduler jobs, all new jobs registered)
- **Manual steps remaining for Viktor (Part 4 of TASK.md):**
- Create 5 healthcheck checks on status.felhom.eu (heartbeat, system-health, db-dump, backup, backup-integrity)
- Update controller.yaml on demo-felhom with real UUIDs
- Build and deploy felhom-hub to k3s cluster
- Configure hub.felhom.eu DNS in Cloudflare
- Enable hub reporting on demo-felhom controller.yaml
### What was previously completed (2026-02-16 session 17)
- **v0.5.4 — Monitoring Page Frontend Fixes (4 bugs, frontend-only):**
- **Bug 1: Tooltip "Invalid Date"** — `items[0].parsed.x` unreliable across Chart.js versions. Fixed tooltip callback to use `items[0].raw.x` (direct {x,y} data access) with `parsed.x` as fallback.
- **Bug 2: Charts fill full width regardless of data density** — `setChartXBounds()` setting `min/max` at runtime was ignored because the scale was created without them. Fixed by including `min: now - defaultRangeMs, max: now` in the initial `chartOpts()` options. Now "7 nap" shows full 7-day x-axis with data clustered on the right.
- **Bug 3: Sysinfo values not consistently right-aligned** — `.sysinfo-grid` used `auto-fill` creating variable-width cells. Fixed to `1fr 1fr` (fixed 2-column). Added `align-items: baseline`, `gap: 1rem`, `white-space: nowrap` on labels, `font-weight: 600` + `word-break: break-word` on values. Removed redundant `