diff --git a/CLAUDE.md b/CLAUDE.md
index 2a9e1a4..f74e21b 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -1,191 +1,116 @@
-# CLAUDE.md — Project Instructions for Claude Code
+# CLAUDE.md — Project Instructions for Claude Code (`felhom.eu`)
 
-> This file is read automatically by Claude Code at the start of every session.
-> It replaces the "Instructions" panel from the claude.ai Project.
-> Keep it updated as the project evolves.
+> Read automatically by Claude Code when it works in this repo. Keep it updated as the project
+> evolves. Cross-repo orientation (the felhom system, artifact taxonomy, access) lives in the
+> workspace-root `e:\git\CLAUDE.md`; this file is `felhom.eu`-specific.
 
 ## Project overview
 
 This repo (`felhom.eu`) contains:
-- **Website** (`website/`) — Static HTML pages at felhom.eu, served via k3s nginx + git-sync sidecar
-- **Hub** (`hub/`) — Go application (felhom-hub) — centralized dashboard for monitoring customer controllers, runs on k3s at hub.felhom.eu
-- **K8s manifests** (`manifests/`) — k3s deployment manifests for all felhom-system services
+- **Website** (`website/`) — static HTML at felhom.eu, served via k3s nginx + git-sync sidecar.
+- **Hub** (`hub/`) — Go application (felhom-hub) — the **operator backend**, on k3s at `hub.felhom.eu`.
+- **K8s manifests** (`manifests/`) — k3s deployment manifests for felhom-system services.
+- **Architecture docs** (`documentation/`) — the **authoritative design home for the whole Felhom
+  system**: `architecture/01..05-*.md` (topology/trust, controller module map, host-agent, signing,
+  hub), `proxmox-platform.md`, and `tests/phase{0,1-2,3,4}-findings.md`. Read these before designing.
 
-See `README.md` for full architecture, DNS, email, and SEO documentation.
-See `TASK.md` for the current task to implement (if it exists).
+See `README.md` for full architecture/DNS/email/SEO docs. See `TASK.md` for the current task (if any).
+
+## The Felhom system (so the hub's role is in context)
+
+Felhom is **Proxmox-based**, with a locked **three-component model**:
+- **Hub** (this repo, `hub/`) — operator backend. Authors operator *intent*; mirrors box *reality*;
+  holds **no data-plane role** and never connects inbound to a box.
+- **Host agent** (repo `felhom-agent/`) — one per Proxmox host; owns all Proxmox interaction.
+- **In-guest controller** (repo `felhom-controller/`) — one per customer LXC; Docker-only.
+
+The hub is **not** just controller monitoring anymore. As of slice 3 it ingests **two report
+streams**: the agent's host-domain report (`POST /api/v1/host-report`, the heartbeat) and the
+legacy controller report (`POST /api/v1/report`). The controller path is **frozen and retires at the
+slice-10 cutover** — do not modify it until then.
+
+## Hub — current state (v0.7.x)
+
+- **Tables:** `customer_configs`, `events`, `app_telemetry`/`app_log_issues`, the legacy `reports`,
+  and the slice-3 host-domain additions `hosts` / `guests` / `host_reports` (additive; columns
+  marked inert exist for the slice-10 cutover but are unused now).
+- **Auth:** Bearer — global key, per-customer key (legacy), and per-host key (`GetHostByAPIKey`,
+  slice 3). Provisional global-key host mint at `POST /api/v1/admin/hosts`.
+- **Monitoring:** the controller `StalenessChecker` (over `reports`) AND a sibling
+  `HostStalenessChecker` (over `host_reports`, emitting `host_stale`/`host_down`/`host_recovered`).
+- Two-tier notifications (operator English / customer Hungarian, Resend, cooldowns); `events` audit.
 
 ## Code quality rules
 
-- Always double-check generated code for bugs, logic issues, syntax errors
-- Handle edge cases without overcomplicating the script/program
-- Add debug capabilities (logging, verbose output) for easier troubleshooting
-- If you need more input or troubleshooting command output, ask first — don't guess
+- Always double-check generated code for bugs, logic issues, syntax errors.
+- Handle edge cases without overcomplicating.
+- Add debug capabilities (logging, verbose output).
+- If you need more input or troubleshooting output, **ask first — don't guess**.
 
-## Workspace layout
+## Workflow & artifacts
 
-```
-E:\git\felhom.eu\              (or /e/git/felhom.eu/ in Git Bash)
-├── hub/                       # felhom-hub Go application
-│   ├── cmd/hub/               # Entry point (main.go)
-│   ├── internal/
-│   │   ├── api/               # Report ingestion API
-│   │   ├── store/             # SQLite storage + queries
-│   │   └── web/               # Dashboard UI
-│   │       ├── server.go      # Server, routing, template funcs
-│   │       ├── embed.go       # go:embed for templates
-│   │       └── templates/     # HTML templates + CSS
-│   ├── configs/               # Example config files
-│   ├── Dockerfile
-│   ├── Makefile
-│   └── go.mod
-├── manifests/                 # k3s deployment manifests
-│   ├── hub.yaml               # Hub deployment (hub.felhom.eu)
-│   ├── webpage.yaml           # Website + FileBrowser + git-sync
-│   ├── contact-mailer.yaml    # Contact form email sender
-│   ├── healthchecks.yaml      # Healthchecks (status.felhom.eu)
-│   └── umami.yaml             # Analytics (stats.felhom.eu)
-├── website/                   # Static HTML pages (felhom.eu)
-│   ├── index.html
-│   ├── alkalmazasok.html
-│   ├── ... (all Hungarian, UTF-8 with BOM)
-│   └── assets/                # Logos, screenshots, OG images
-├── CLAUDE.md                  # This file
-├── README.md                  # Full project documentation
-└── TASK.md                    # Current task (if exists)
-```
+The planning/architecture assistant ("project Claude", in claude.ai) writes specs and validates
+pushes; **you (Claude Code) implement**. A file being open in the editor is NOT an instruction.
 
-Related repos (same parent directory):
-```
-E:\git\felhom-controller\       # felhom-controller Go app + deploy scripts
-E:\git\app-catalog-felhom.eu\   # Docker Compose templates per app
-E:\git\homelab-manifests\       # k3s cluster manifests (dooplex.hu services)
-E:\git\misc-scripts\            # Helper scripts (build scripts, repo collector)
-```
-
-All repos hosted at `gitea.dooplex.hu/admin/`.
-
-## SSH access
-
-SSH key-based authentication configured. No password prompts.
-
-**IMPORTANT — SSH binary:** Claude Code runs in Git Bash, which has its own SSH at
-`/usr/bin/ssh` (= `C:\Program Files\Git\usr\bin\ssh.exe`). This binary does NOT have
-access to the Windows SSH agent and will fail silently. Always use the Windows native
-OpenSSH binary:
-
-```
-SSH=/c/Windows/System32/OpenSSH/ssh.exe
-```
-
-All SSH commands below use `$SSH` — set it at the start of your session.
-
-| Host | IP | User | Role |
-|------|----|------|------|
-| Build server (k3s node) | 192.168.0.180 | kisfenyo | Build + push images, kubectl |
-| Demo node | 192.168.0.162 | kisfenyo | Test deployment (demo-felhom.eu) |
-
-**Note:** `kubectl` on the build server requires `sudo` (k3s kubeconfig permissions).
-
-## Build & deploy workflow — Hub
-
-After making code changes to `hub/`, you **MUST** build, push, and deploy the new image.
-Do NOT leave code changes uncommitted or undeployed.
-
-### Step 1: Commit and push changes
-
-```bash
-cd /e/git/felhom.eu
-git add -A && git commit -m "" && git push
-```
-
-### Step 2: Build + push the container image on the build server
-
-The build server (192.168.0.180) has the build toolchain. The build script lives at
-`~/build/felhom-hub/build.sh` on the build server (NOT in this repo).
-
-First, check the current running version:
-```bash
-$SSH kisfenyo@192.168.0.180 "sudo kubectl get deploy -n felhom-system hub -o jsonpath='{.spec.template.spec.containers[0].image}'"
-```
-
-Then build with the next version (e.g., if current is 0.1.2, use 0.1.3):
-```bash
-$SSH kisfenyo@192.168.0.180 "cd ~/build/felhom-hub && ./build.sh --push"
-```
-
-The build script:
-- Pulls latest code from Gitea (`git pull` on the felhom.eu repo)
-- Copies `hub/` source to a clean build workspace
-- Builds Docker image with version + build-time ldflags
-- Pushes to `gitea.dooplex.hu/admin/felhom-hub:` and `:latest`
-
-### Step 3: Deploy to k3s
-
-```bash
-$SSH kisfenyo@192.168.0.180 "sudo kubectl set image -n felhom-system deploy/hub hub=gitea.dooplex.hu/admin/felhom-hub:"
-```
-
-### Step 4: Verify the deployment
-
-```bash
-$SSH kisfenyo@192.168.0.180 "sudo kubectl get pods -n felhom-system -l app=hub && echo '---' && sudo kubectl logs -n felhom-system -l app=hub --tail 10"
-```
-
-Should show pod Running and `[INFO] felhom-hub starting` in logs.
-
-### Build workflow summary
-
-| Step | Command | Where |
-|------|---------|-------|
-| 1. Commit + push | `git add -A && git commit && git push` | Local (this repo) |
-| 2. Build + push image | `$SSH kisfenyo@192.168.0.180 "cd ~/build/felhom-hub && ./build.sh --push"` | Build server |
-| 3. Deploy | `$SSH kisfenyo@192.168.0.180 "sudo kubectl set image -n felhom-system deploy/hub hub=...:"` | Build server (kubectl) |
-| 4. Verify | `$SSH kisfenyo@192.168.0.180 "sudo kubectl get pods -n felhom-system -l app=hub"` | Build server |
-
-## Build & deploy workflow — Website
-
-The website auto-deploys via git-sync sidecar. Just push to `main`:
-
-```bash
-cd /e/git/felhom.eu
-git add -A && git commit -m "" && git push
-```
-
-Changes are live within 1-2 minutes. No build step needed.
-
-For emergency edits, use FileBrowser at `https://files.felhom.eu`.
-
-## Build & deploy workflow — K8s Manifests
-
-Manifests are applied manually:
-
-```bash
-ssh kisfenyo@192.168.0.180 "sudo kubectl apply -f /home/kisfenyo/git/felhom.eu/manifests/.yaml"
-```
-
-Remember to `git pull` on the build server first if you pushed changes locally.
+- **`TASK.md` / `TASK-*.md`** — a spec for you to implement. Then push, update `hub/CHANGELOG.md`,
+  and **append** a section to this repo's root `REPORT.md` (this repo appends; newest section last).
+- **`RUNBOOK-*.md`** — a human-run operational procedure. **Do NOT execute.**
+- Validation of a push against a spec's criteria is project Claude's job, not yours, unless asked.
 
 ## Tech stack (Hub)
 
-- **Language:** Go 1.24+
-- **Web framework:** stdlib `net/http` + `html/template`
-- **Database:** SQLite via `modernc.org/sqlite` (pure Go, no CGo)
-- **Auth:** bcrypt password hash + basic auth
-- **Deployment:** Docker container on k3s (felhom-system namespace)
-- **Storage:** Longhorn PVC at `/data/` (SQLite DB)
-- **Config:** YAML file mounted via k8s ConfigMap at `/etc/felhom-hub/hub.yaml`
+- **Language:** Go 1.24+ (build server is go1.26.0).
+- **Web:** stdlib `net/http` + `html/template`. **DB:** SQLite via `modernc.org/sqlite` (pure Go).
+- **Auth:** bcrypt + Bearer tokens. **Deploy:** Docker on k3s (felhom-system ns).
+- **Storage:** Longhorn PVC at `/data/` (SQLite DB). **Config:** YAML via ConfigMap at `/etc/felhom-hub/hub.yaml`.
+
+## SSH access
+
+Use the Windows OpenSSH binary (Git Bash's `/usr/bin/ssh` can't reach the Windows agent and fails
+silently): `SSH=/c/Windows/System32/OpenSSH/ssh.exe`. All SSH commands below use `$SSH`.
+
+| Host | IP | User | Role |
+|------|----|------|------|
+| Build server (k3s node) | 192.168.0.180 | kisfenyo | Build + push images, kubectl (needs `sudo`) |
+| Demo node | 192.168.0.162 | kisfenyo | Test deployment (demo-felhom.eu) |
+
+## Build & deploy — Hub
+
+After code changes to `hub/`, you **MUST** build, push, and deploy.
+
+1. **Commit + push:** `cd /e/git/felhom.eu && git add -A && git commit -m "" && git push`
+2. **Check running version:**
+   `$SSH kisfenyo@192.168.0.180 "sudo kubectl get deploy -n felhom-system hub -o jsonpath='{.spec.template.spec.containers[0].image}'"`
+3. **Build + push image** (next version; build script lives on the build server, not in this repo):
+   `$SSH kisfenyo@192.168.0.180 "cd ~/build/felhom-hub && ./build.sh --push"`
+   (pulls latest from Gitea, builds with version+build-time ldflags into `main.Version`, pushes
+   `gitea.dooplex.hu/admin/felhom-hub:` and `:latest`.)
+4. **Deploy:**
+   `$SSH kisfenyo@192.168.0.180 "sudo kubectl set image -n felhom-system deploy/hub hub=gitea.dooplex.hu/admin/felhom-hub:"`
+5. **Verify:**
+   `$SSH kisfenyo@192.168.0.180 "sudo kubectl get pods -n felhom-system -l app=hub && sudo kubectl logs -n felhom-system -l app=hub --tail 10"`
+   (expect Running + `[INFO] felhom-hub starting`.)
+
+> If the hub deployment is ArgoCD-managed (auto-sync), a manual `kubectl set image` may be reverted
+> by ArgoCD drift-correction — confirm the deploy path before relying on step 4.
+
+## Build & deploy — Website / Manifests
+
+- **Website** auto-deploys via git-sync; just push to `main` (live in 1–2 min). Emergency edits:
+  FileBrowser at `https://files.felhom.eu`.
+- **Manifests** are applied manually (git pull on the build server first if you pushed):
+  `$SSH kisfenyo@192.168.0.180 "sudo kubectl apply -f /home/kisfenyo/git/felhom.eu/manifests/.yaml"`
 
 ## Key patterns
 
-- Hub receives reports from customer controllers via `POST /api/v1/report` (Bearer token auth)
-- Dashboard shows all customers in a table with status, CPU, memory, disk, containers, backup age
-- Customer detail page shows system info, report history, full JSON report
-- Status logic: OK (report < 30m), WARN (30m-1h or health=warn), DOWN (> 1h or health=fail)
-- SQLite timestamps may vary in format — use `parseSQLiteTime()` for robust parsing
-- Auto-refresh: dashboard and detail pages refresh every 60 seconds via ``
-- Geo-restricted to Hungary via nginx ingress annotation
+- Hub ingests **host-reports from agents** (`POST /api/v1/host-report`, Bearer per-host) and legacy
+  **controller reports** (`POST /api/v1/report`). The host-report `received_at` is the dead-man's-
+  switch liveness signal.
+- Status logic: OK (report < 30m), WARN (30m–1h or health=warn), DOWN (> 1h or health=fail).
+- SQLite timestamps vary in format — use `parseSQLiteTime()`.
+- Dashboard/detail auto-refresh every 60s via ``. Geo-restricted to
+  Hungary via nginx ingress annotation.
 
 ## File encoding
 
-All HTML files in `website/` are **UTF-8 with BOM**. Ensure your editor preserves this.
-Hub Go source files are standard UTF-8 (no BOM).
+All `website/` HTML is **UTF-8 with BOM** — preserve it. Hub Go source is standard UTF-8 (no BOM).
\ No newline at end of file