diff --git a/CLAUDE.md b/CLAUDE.md new file mode 100644 index 0000000..3e6f200 --- /dev/null +++ b/CLAUDE.md @@ -0,0 +1,102 @@ +# CLAUDE.md — Project Instructions for Claude Code + +> This file is read automatically by Claude Code at the start of every session. +> It replaces the "Instructions" panel from the claude.ai Project. +> Keep it updated as the project evolves. + +## Project overview + +Creating a business (Felhom) for home-server deployment for Hungarian customers. This repository +(`deploy-felhom-compose`) contains the felhom-controller — a Go application that manages Docker +Compose stacks on customer hardware via a Hungarian-language web dashboard. + +See `controller/README.md` for full architecture and status. +See `CONTEXT.md` for current project state, recent work, and decisions (update after each session). + +## Code quality rules + +- Always double-check generated code for bugs, logic issues, syntax errors +- Handle edge cases without overcomplicating the script/program +- Add debug capabilities (logging, verbose output) for easier troubleshooting +- If you need more input or troubleshooting command output, ask first — don't guess + +## Repository structure + +This repo has two main parts: + +``` +deploy-felhom-compose/ +├── controller/ # Go application (this is the main codebase) +│ ├── cmd/controller/ # Entry point (main.go) +│ ├── internal/ +│ │ ├── config/ # YAML config loading +│ │ ├── stacks/ # Docker Compose operations, deploy flow +│ │ ├── api/ # REST API endpoints +│ │ └── web/ # Dashboard UI (embedded HTML/CSS templates) +│ ├── Dockerfile +│ ├── Makefile +│ └── go.mod +├── scripts/ # Setup scripts for customer nodes +├── CLAUDE.md # This file +└── CONTEXT.md # Project memory / state +``` + +## Related repositories (not in this repo) + +- **app-catalog-felhom.eu** — Docker Compose templates + .felhom.yml metadata per app +- **felhom.eu** — Website (htmls) + k3s manifests for the web server +- **homelab-manifests** — Viktor's k3s cluster 
manifests (dooplex.hu services) +- **misc-scripts** — Helper scripts for daily tasks + +All hosted at `gitea.dooplex.hu/admin/` + +## Test environments + +| Node | Hardware | Domain | IP | Notes | +|------|----------|--------|----|-------| +| demo-felhom | Acemagic N100, 16G RAM, 512G SSD + 1TB HDD | demo-felhom.eu | 192.168.0.162 | Primary test node, Cloudflare Tunnel | +| pi-customer-1 | Raspberry Pi 3B+, 1G RAM, 32G SD | pi-customer-1.local | 192.168.0.161 | Secondary test, not yet active | + +- Pi-hole DNS on local network forwards `*.demo-felhom.eu` → 192.168.0.162 +- External access via Cloudflare Tunnel → Traefik reverse proxy + +## Build workflow + +Source is in `controller/` but builds happen in `~/build/felhom-controller/`: + +```bash +cd ~/build/felhom-controller +./build.sh --push # builds + pushes to gitea.dooplex.hu registry +``` + +Deploy on node: `docker compose up -d` (NOT `restart` — restart doesn't pick up new images) + +## Tech stack + +- **Language:** Go 1.22+ +- **Web framework:** stdlib `net/http` + `html/template` (no frameworks) +- **Templates:** Embedded as Go string constants in `templates.go` (Hungarian UI) +- **CSS:** Single embedded const in `templates.go` (no external CSS files) +- **Auth:** bcrypt password hash + session cookies +- **Container orchestration:** Docker Compose via CLI (`docker compose up -d`) +- **Reverse proxy:** Traefik (separate stack, managed by controller) +- **Tunnel:** Cloudflare Tunnel (cloudflared, separate stack) + +## Key patterns + +- All UI text is in Hungarian (Budapest timezone, Hungarian locale) +- Templates use Go template functions: `stateColor`, `stateLabel`, `stateIcon`, `stateStr`, `isOperational`, `logoURL`, `logoPNGURL`, `appPageURL` +- Container states: `running`, `starting`, `unhealthy`, `stopped`, `exited`, `restarting`, `paused`, `not_deployed` +- Docker `.State` field is combined with `.Status` field to detect health substatus +- Stacks are sorted alphabetically by DisplayName +- Protected 
stacks (traefik, cloudflared, felhom-controller) can't be stopped from UI +- `app.yaml` persists deploy config; `deployed: true` flag controls UI state +- Password fields require explicit user input or generation (no silent auto-fill) + +## Important lessons learned + +1. `PAPERLESS_OCR_LANGUAGES` (plural, with S) **installs** tesseract packs; `PAPERLESS_OCR_LANGUAGE` (singular) **selects** which to use +2. `docker compose restart` does NOT pick up new images — always use `docker compose up -d` +3. Go map iteration order is random — always sort before displaying in UI +4. Docker's `.State` field says "running" even for unhealthy containers — must parse `.Status` for health info +5. After `DeployStack()` succeeds, update in-memory `Deployed` flag immediately — `RefreshStatus()` only reads docker ps, not app.yaml \ No newline at end of file diff --git a/CONTEXT.md b/CONTEXT.md new file mode 100644 index 0000000..965e043 --- /dev/null +++ b/CONTEXT.md @@ -0,0 +1,105 @@ +# CONTEXT.md — Project Memory + +> This file serves as persistent project memory across Claude Code sessions. +> It replaces the auto-generated "Memory" from the claude.ai Project. +> **Update this file at the end of each working session** with current state, +> recent decisions, and anything the next session needs to know. 
+> +> Ask Claude Code: "Please update CONTEXT.md with what we did today" + +Last updated: 2026-02-14 + +--- + +## About Viktor (project owner) + +- Works at Magyar Telekom (Budapest), building Felhom as a side business +- Felhom: managed home-server service for Hungarian households +- Technical but prefers pragmatic solutions over over-engineering +- Runs all infrastructure on Gitea (gitea.dooplex.hu), k3s cluster for management +- Customer deployments use Docker Compose (not Kubernetes) for simplicity + +## Current project state + +### felhom-controller (this repo) +- **Version:** v0.2.1 +- **Phase 1:** ✅ COMPLETE — Stack Manager + Deploy Flow +- **First app deployed:** Paperless-ngx on demo-felhom.eu (2026-02-13) +- **Running on:** demo-felhom (N100 mini PC) at 192.168.0.162:8080 +- **All Phase 1 features working:** deploy, start/stop/restart/update, logs, health-aware states, auth + +### What was just completed (2026-02-13/14) +- Built the entire felhom-controller from scratch (Go, no frameworks) +- Debugged and fixed 7 issues during first real deployment: + 1. Password validation (empty passwords accepted) + 2. In-memory Deployed flag not updating after deploy + 3. Health-aware state parsing (starting/unhealthy detection) + 4. Random card ordering (Go map iteration) + 5. "Részletek" button redirect for deployed apps + 6. Paperless OCR language installation (LANGUAGES vs LANGUAGE env var) + 7. Documentation: restart vs up -d for image updates +- Controller image builds via build.sh, pushes to Gitea container registry + +### What's next (priorities) +1. Deploy a second app (e.g., Immich, Jellyfin) to validate the template system +2. Test on Raspberry Pi (pi-customer-1) +3. Phase 2: Monitoring & Healthchecks integration +4. Phase 3: Backup system (DB dumps + restic) +5. 
Dashboard dark theme (align with felhom.eu website) + +## Architecture decisions + +| Decision | Rationale | +|----------|-----------| +| Go stdlib for web (no Gin/Echo) | Minimal dependencies, single binary, easy to embed templates | +| Templates as Go string constants | Zero runtime file dependencies, everything in the binary | +| Docker Compose for customers (not k8s) | Simpler troubleshooting, customers don't need k8s knowledge | +| k3s for management infra only | Viktor's own services (gitea, monitoring, website) run on k3s | +| Cloudflare Tunnel for remote access | No port forwarding needed, works behind any NAT | +| app.yaml per stack | Separates deploy config from compose files, survives git pulls | +| Password fields require explicit input | Prevents accidental empty-password deployments | +| Health-aware state from Docker Status field | Docker's State says "running" even for unhealthy containers | + +## Key file locations on demo-felhom + +``` +/opt/docker/felhom-controller/ # Controller compose + config + ├── controller.yaml # Customer config (domain, auth, paths) + ├── docker-compose.yml # Controller's own compose + └── .env # DOMAIN=demo-felhom.eu + +/opt/docker/stacks/ # All app stacks + ├── traefik/ # Reverse proxy (protected) + ├── cloudflared/ # Tunnel (protected) + ├── paperless-ngx/ # First deployed app ✅ + │ ├── docker-compose.yml + │ ├── .felhom.yml # App metadata + │ └── app.yaml # Deploy config (env vars, locked fields) + └── whoami/ # Test stack (not deployed) + +/mnt/hdd_placeholder/storage/ # HDD storage for apps + └── paperless/ + ├── consume/ # Drop files here for OCR + ├── media/ # Processed documents + └── export/ # Backup exports +``` + +## Related repositories and their state + +| Repository | Status | Notes | +|------------|--------|-------| +| deploy-felhom-compose | Active | This repo. 
Controller code + deploy scripts | +| app-catalog-felhom.eu | Active | 49 app templates, needs PAPERLESS_OCR_LANGUAGES fix | +| felhom.eu | Stable | Website live, SEO indexed, email working | +| homelab-manifests | Stable | k3s cluster running (dooplex.hu services) | +| misc-scripts | Utility | collect-repo.sh, backup helpers | + +## Gotchas & lessons learned + +- `docker compose restart` ≠ `docker compose up -d` — restart doesn't pick up new images +- Go maps have random iteration order — always sort slices before displaying +- Docker `.State`="running" doesn't mean healthy — check `.Status` for "(health: starting)" / "(unhealthy)" +- Paperless-ngx needs `PAPERLESS_OCR_LANGUAGES` (plural) to install language packs, `PAPERLESS_OCR_LANGUAGE` (singular) to select +- After deploying a stack, update the in-memory Deployed flag immediately — RefreshStatus() only reads docker ps +- Cloudflare Tunnel handles *.demo-felhom.eu → Traefik handles Host()-based routing to containers +- BIOS "AC Power Recovery" must be enabled on N100 for auto-restart after power outage \ No newline at end of file diff --git a/controller/README.md b/controller/README.md index 0a3e209..1385a56 100644 --- a/controller/README.md +++ b/controller/README.md @@ -6,6 +6,7 @@ Replaces Portainer + scattered systemd scripts with a single, lightweight contai - Hungarian-language web dashboard for customers - Docker Compose stack management (start/stop/update) - Interactive first-deployment flow with auto-generated secrets +- Health-aware container state monitoring (starting/unhealthy/running) - Backup orchestration (DB dumps + restic snapshots) — Phase 3 - System health monitoring with Healthchecks pings — Phase 2 - Git-based stack synchronization with update management — Phase 4 @@ -13,12 +14,33 @@ Replaces Portainer + scattered systemd scripts with a single, lightweight contai ## Current Status -**Phase 1 — Stack Manager + Deploy Flow: core features operational.** +**Phase 1 — Stack Manager + Deploy 
Flow: ✅ COMPLETE** The controller is built, deployed, and running on the N100 test node (demo-felhom.eu). -The web dashboard is accessible, stack scanning works, and the deploy UI renders correctly. +First application (Paperless-ngx) successfully deployed end-to-end through the dashboard on 2026-02-13. -Next milestone: end-to-end test deploying a real app (e.g., Paperless-ngx) through the dashboard. +**Milestone achieved:** Full deploy cycle works — customer clicks "Telepítés", fills in fields, +controller generates secrets, saves app.yaml, runs `docker compose up -d`, and the app comes up +with Traefik routing and health checks. The dashboard correctly shows real-time container states +including health substatus (starting → healthy → running). + +Current version: **v0.2.1** + +### What works +- Dashboard with live container state (green/orange/yellow/red) +- Deploy form with password validation, auto-generation, and field locking +- Stack operations: start, stop, restart, update (pull + recreate) +- Log viewer for each stack +- Deploy page doubles as config viewer (read-only mode for deployed apps) +- Periodic stack rescanning (every 2 minutes) +- Manual rescan endpoint (`POST /api/stacks/rescan`) +- Alphabetically sorted stack display (consistent card ordering) +- Protected stacks (traefik, cloudflared, felhom-controller) can't be stopped + +### Known issues / next priorities +- Cloudflare Tunnel + Traefik TLS: paperless.demo-felhom.eu works locally but shows "Not secure" (certificate chain not fully validated through tunnel) +- No undo/delete for deployed apps yet +- Dashboard theme doesn't yet match felhom.eu dark theme ## Architecture @@ -87,11 +109,11 @@ controller/ |--------|------|--------|----------------| | **Config** | `internal/config/` | ✅ Done | Load & validate controller.yaml, env overrides | | **Stacks** | `internal/stacks/` | ✅ Done | Compose operations, scanning, metadata, deploy flow | -| **API** | `internal/api/` | ✅ Done | REST endpoints 
(stacks, deploy, system info, health) | +| **API** | `internal/api/` | ✅ Done | REST endpoints (stacks, deploy, rescan, system info, health) | | **Web** | `internal/web/` | ✅ Done | Hungarian dashboard, auth, deploy pages, asset serving | -| **Backup** | `internal/backup/` | 🔲 Phase 3 | DB dumps, restic snapshots, restore | -| **Monitor** | `internal/monitor/` | 🔲 Phase 2 | Health checks, Healthchecks pings, system metrics | -| **Scheduler** | `internal/scheduler/` | 🔲 Phase 2 | Cron-like job runner for all periodic tasks | +| **Backup** | `internal/backup/` | 🔲 Phase 3 | DB dumps, restic snapshots, restore | +| **Monitor** | `internal/monitor/` | 🔲 Phase 2 | Health checks, Healthchecks pings, system metrics | +| **Scheduler** | `internal/scheduler/` | 🔲 Phase 2 | Cron-like job runner for all periodic tasks | ## Stack Management @@ -110,11 +132,29 @@ controller/ - **User input**: HDD path, admin password, language, etc. - **"🎲 Generálás"** button next to password fields 3. Clicks "Telepítés" → controller: - - Generates all secrets - - Validates required fields (checks path exists, etc.) + - Validates all required fields (password fields must be explicitly filled or generated) + - Generates auto-secrets (DB passwords, hex keys) - Saves `app.yaml` (env vars + locked fields list) - Runs `docker compose up -d` with env vars injected + - Updates in-memory state immediately (no stale "Telepítés" button) 4. Post-deploy: locked fields (DB_PASSWORD, etc.) become read-only +5. "Részletek" button opens deploy page in read-only mode showing current config + +### Container state display + +The dashboard shows health-aware container states with distinct colors: + +| State | Color | Label | Meaning | +|-------|-------|-------|---------| +| Running + healthy | 🟢 Green | "Fut" | All containers running and healthy | +| Running + health: starting | 🟠 Orange | "Indulás..." 
| Container up but healthcheck not yet passed | +| Running + unhealthy | 🟡 Yellow | "Nem egészséges" | Container running but healthcheck failing | +| Stopped/exited | 🔴 Red | "Leállítva" | All containers stopped | +| Restarting | 🟡 Yellow | "Újraindítás..." | Container in restart loop | +| Not deployed | ⚪ Gray | "Nincs telepítve" | Compose file exists but not yet deployed | + +Action buttons adapt: "operational" states (running/starting/unhealthy/restarting) show restart/stop, +while stopped states show a start button. ### Update strategy (Phase 4) @@ -152,13 +192,14 @@ Each deployed app gets an `app.yaml` in its stack directory: # /opt/docker/stacks/paperless-ngx/app.yaml # Auto-generated by felhom-controller — do not edit locked fields manually deployed: true -deployed_at: "2026-02-13T14:30:00Z" +deployed_at: "2026-02-13T21:10:00Z" env: DOMAIN: "demo-felhom.eu" DB_PASSWORD: "a7f2b9c1e4d..." # locked PAPERLESS_SECRET_KEY: "8b3e..." # locked PAPERLESS_ADMIN_USER: "admin" # editable - HDD_PATH: "/mnt/hdd_1" # locked + PAPERLESS_OCR_LANGUAGE: "hun+eng" # editable + HDD_PATH: "/mnt/hdd_placeholder" # locked locked_fields: - DB_PASSWORD - PAPERLESS_SECRET_KEY @@ -166,9 +207,6 @@ locked_fields: - HDD_PATH ``` -Fields are defined in each stack's `.felhom.yml` metadata file. See -`configs/example-felhom-metadata.yml` for the full format. - ### App assets (logos, screenshots) Baked into the container image at build time — no external dependencies at runtime. @@ -176,12 +214,6 @@ Synced from the felhom.eu website repo before building. Served locally at `/static/assets/`. Logos try SVG first, fall back to PNG. 
-| Asset | File pattern | Served at | -|-------|-------------|-----------| -| Logo (SVG) | `assets/{slug}-logo.svg` | `/static/assets/{slug}-logo.svg` | -| Logo (PNG fallback) | `assets/{slug}-logo.png` | `/static/assets/{slug}-logo.png` | -| Screenshot | `assets/{slug}-screenshot-{n}.webp` | `/static/assets/{slug}-screenshot-{n}.webp` | - ## Build & Deploy Source: `https://gitea.dooplex.hu/admin/deploy-felhom-compose` → `controller/` subfolder. @@ -192,27 +224,20 @@ See `docs/BUILDING.md` for the full guide. ```bash # Quick build (current platform only) cd ~/build/felhom-controller -./build.sh 0.1.0 +./build.sh 0.2.1 # Build + push to Gitea registry -./build.sh 0.1.0 --push - -# Build for N100 (amd64) + Pi (arm64) and push -./build.sh 0.1.0 --multiarch +./build.sh 0.2.1 --push ``` ### Deploy on customer node ```bash -# Create config first -nano /opt/docker/felhom-controller/controller.yaml +# Pull new image +docker pull gitea.dooplex.hu/admin/felhom-controller:0.2.1 -# Create .env for compose labels -echo "DOMAIN=demo-felhom.eu" > /opt/docker/felhom-controller/.env - -# Pull and start +# IMPORTANT: use 'up -d', NOT 'restart' — restart doesn't pick up new images cd /opt/docker/felhom-controller -docker compose pull docker compose up -d ``` @@ -220,8 +245,22 @@ docker compose up -d | Node | Hardware | Domain | IP | Status | |------|----------|--------|----|--------| -| demo-felhom | Acemagic GK3PLUS N100, 16G RAM, 512G SSD + 1TB HDD | demo-felhom.eu | 192.168.0.162 | ✅ Controller running | -| pi-customer-1 | Raspberry Pi 3B+, 1G RAM, 32G SD | pi-customer-1.local | — | 🔲 Not yet tested | +| demo-felhom | Acemagic GK3PLUS N100, 16G RAM, 512G SSD + 1TB HDD | demo-felhom.eu | 192.168.0.162 | ✅ Controller v0.2.1 + Paperless-ngx running | +| pi-customer-1 | Raspberry Pi 3B+, 1G RAM, 32G SD | pi-customer-1.local | — | 🔲 Not yet tested | + +### First deployment log (Paperless-ngx on demo-felhom) + +- **Date:** 2026-02-13 +- **App:** Paperless-ngx (document management) +- 
**Deploy method:** Dashboard UI → "Telepítés" button +- **Issues encountered & resolved:** + 1. Password fields accepted empty values → Added server-side + client-side validation + 2. "Telepítés" button appeared for already-deployed apps → Fixed in-memory Deployed flag update + 3. Green status shown for `(health: starting)` containers → Added health-aware state parsing + 4. Stack cards switched positions on refresh → Added alphabetical sorting in GetStacks() + 5. "Részletek" button did nothing for deployed apps → Redirects to deploy page (read-only) + 6. OCR crash: `PAPERLESS_OCR_LANGUAGE=hun` not installed → Added `PAPERLESS_OCR_LANGUAGES` (plural) to docker-compose + 7. Container restart vs recreate: `docker compose restart` doesn't pick up new images → Documented: always use `docker compose up -d` ## REST API @@ -237,11 +276,12 @@ docker compose up -d | POST | `/api/stacks/{name}/restart` | Yes | Restart stack | | POST | `/api/stacks/{name}/update` | Yes | Pull images + recreate | | GET | `/api/stacks/{name}/logs` | Yes | Container logs | +| POST | `/api/stacks/rescan` | Yes | Trigger manual stack discovery | | GET | `/api/system/info` | Yes | Customer/domain info | ## Status & Roadmap -### Phase 1 — Stack Manager + Deploy Flow ✅ +### Phase 1 — Stack Manager + Deploy Flow ✅ COMPLETE - [x] Project skeleton & config format - [x] .felhom.yml app metadata format with deploy fields - [x] Per-app config persistence (app.yaml) @@ -249,23 +289,26 @@ docker compose up -d - [x] Stack catalog (read compose files + metadata from disk) - [x] Docker Compose operations (up/down/pull/ps/logs) - [x] Deploy flow with interactive field input +- [x] Password validation (server-side + client-side, no empty passwords) - [x] Basic web dashboard with start/stop/deploy buttons +- [x] Health-aware container states (starting/unhealthy/running) - [x] REST API for stack + deploy operations - [x] Simple web authentication (bcrypt sessions) - [x] App assets baked into container (SVG/PNG 
logos, webp screenshots) - [x] Container image build pipeline (Dockerfile + build.sh) - [x] Build + push to Gitea container registry - [x] Deploy on N100 test node — dashboard accessible -- [x] Stack scanning + display working (whoami test stack) -- [ ] End-to-end test: deploy an app through dashboard (whoami / paperless-ngx) -- [ ] Dashboard UI redesign (align with felhom.eu dark theme) +- [x] Stack scanning + display working +- [x] **First app deployed: Paperless-ngx via dashboard** (2026-02-13) +- [x] Periodic stack rescanning (every 2 minutes) +- [x] Alphabetically sorted stack display +- [x] Deploy page doubles as read-only config viewer for deployed apps ### Phase 2 — Monitoring & Health - [ ] System metrics collection (CPU, RAM, disk, temperature) - [ ] Healthchecks.io ping integration - [ ] Dashboard system health panel - [ ] Customer notifications (email/Telegram) -- [ ] Periodic stack status refresh (background goroutine) ### Phase 3 — Backups - [ ] DB dump engine (PostgreSQL, MariaDB/MySQL, SQLite)