CONTEXT.md — Project Memory

This file serves as persistent project memory across Claude Code sessions. It replaces the auto-generated "Memory" from the claude.ai Project. Update this file at the end of each working session with current state, recent decisions, and anything the next session needs to know.

Ask Claude Code: "Please update CONTEXT.md with what we did today"

Last updated: 2026-02-14 (session 3)


About Viktor (project owner)

  • Works at Magyar Telekom (Budapest), building Felhom as a side business
  • Felhom: managed home-server service for Hungarian households
  • Technical but prefers pragmatic solutions over over-engineering
  • Hosts code and container images on Gitea (gitea.dooplex.hu); runs a k3s cluster for management infrastructure
  • Customer deployments use Docker Compose (not Kubernetes) for simplicity

Current project state

felhom-controller (this repo)

  • Version: v0.2.1
  • Phase 1: COMPLETE — Stack Manager + Deploy Flow
  • First app deployed: Paperless-ngx on demo-felhom.eu (2026-02-13)
  • Running on: demo-felhom (N100 mini PC) at 192.168.0.162:8080
  • All Phase 1 features working: deploy, start/stop/restart/update, logs, health-aware states, auth

What was just completed (2026-02-14 session 3)

  • Enhanced debug logging across all stack operations in internal/stacks/:
    • Operation timing: All stack ops (start, stop, restart, update, deploy) now log elapsed time
    • Post-start container state check: Async goroutine after start/restart/update/deploy
    • Image pull detection: Checks local images before deploy/update (debug level)
    • GetLogs/ScanStacks improvements: Byte count logging, deployed/available counts
    • All verbose checks gated on cfg.Logging.Level == "debug"; timing always at INFO
  • UI improvements in internal/web/templates.go and server.go:
    • Memory bar fix on deploy page: Bar segments now always visible (min-width: 3px), new app segment uses translucent green with distinct border for clear visual separation from committed memory
    • Clickable app cards: Cards on the Vezérlőpult (Dashboard) and Alkalmazások (Applications) pages are now clickable and navigate to the deploy/detail page. Uses a data-href attribute + delegated click handler. Protected stacks are excluded, as is the actions area (buttons, state labels)
    • Live-scrolling logs: Logs page now auto-refreshes every 3s via AJAX polling (?raw=1 returns plain text). Fixed-height container (70vh) with auto-scroll to bottom. Pulsing green "Élő" (Live) indicator. Pause/resume toggle ("Szüneteltetés"/"Folytatás"). User scroll position is preserved when scrolled up to read history
    • Deployment progress UI: Deploy button no longer shows alert+redirect immediately. Instead shows 3-step progress panel: config saved → containers starting → app initializing. Polls GET /api/stacks/{name} every 3s to track actual container health state. Handles running (auto-redirect), starting (keep polling), unhealthy (warning), exited (error), and 120s timeout. Shows elapsed time counter
  • Mealie healthcheck fix (app-catalog-felhom.eu):
    • wget --spider replaced with Python TCP socket check — mealie image doesn't include wget
    • start_period increased to 60s (DB migrations take ~40s on first start)
  • Healthcheck audit: filebrowser (Alpine, has BusyBox wget — OK), stirling-pdf (Ubuntu, has wget — OK)
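
The logging rule above (elapsed time always at INFO, verbose checks only at debug level) can be sketched as a small wrapper. This is a minimal illustration, not the code in internal/stacks/: the helper name timedOp, its signature, and the log format are assumptions made for the example.

```go
package main

import (
	"fmt"
	"log"
	"time"
)

// timedOp runs fn and always logs the elapsed time at INFO level.
// When debug is true it also logs a start message first.
// Hypothetical helper: name, signature and log format are illustrative.
func timedOp(op, stack string, debug bool, fn func() error) error {
	if debug {
		log.Printf("[DEBUG] %s %s: starting", op, stack)
	}
	start := time.Now()
	err := fn()
	elapsed := time.Since(start).Round(time.Millisecond)
	if err != nil {
		log.Printf("[INFO] %s %s failed after %s: %v", op, stack, elapsed, err)
		return fmt.Errorf("%s %s: %w", op, stack, err)
	}
	log.Printf("[INFO] %s %s completed in %s", op, stack, elapsed)
	return nil
}

func main() {
	debug := true // stands in for cfg.Logging.Level == "debug"
	_ = timedOp("restart", "paperless-ngx", debug, func() error {
		time.Sleep(20 * time.Millisecond) // placeholder for the real compose call
		return nil
	})
}
```

Wrapping the operation in a closure keeps the timing logic in one place, so each stack operation gets the same log shape.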

Previously completed (2026-02-15 session 2)

  • Phase 4: Git Sync + App Catalog Audit — major milestone
  • Git sync module (internal/sync/sync.go):
    • Clones/pulls app-catalog-felhom.eu repo to local cache on startup
    • Periodic sync based on git.sync_interval (default 15m)
    • Copies docker-compose.yml + .felhom.yml to stacks dir (never overwrites app.yaml/.env)
    • SHA-256 content comparison — only writes changed files
    • Triggers ScanStacks() after sync so dashboard updates immediately
    • Uses os/exec git CLI — no Go git library dependency
  • Manual sync button ("Sablonok frissítése", i.e. "Update templates") on the Alkalmazások page:
    • POST /api/sync endpoint with 30s debounce
    • Toast notification shows result (success/failure/what changed)
    • Auto-reloads page if new apps or updates detected
  • Sync status added to /api/system/info (last_sync, last_status, syncing flag)
  • .felhom.yml files now exist for all 10 apps; 9 were created this session (paperless-ngx already had one):
    • actualbudget, docmost, filebrowser, homebox, immich, mealie, romm, stirling-pdf, vaultwarden
    • All follow the same format: display_name, description, category, subdomain, resources, deploy_fields
  • Docker Compose templates audited and fixed for all 10 apps:
    • Fixed {{DOMAIN}} → ${DOMAIN} syntax in homebox, mealie, romm, stirling-pdf
    • Fixed {{HDD_PATH}} → ${HDD_PATH} in romm
    • Added deploy.resources.limits.memory to all services across all templates
    • Added TZ=Europe/Budapest to all sidecar services (postgres, redis, mariadb)
    • Added healthcheck to romm main service
    • Added romm-redis condition: service_healthy (was service_started)
    • Standardized header comment blocks across all templates
  • Documentation updated: app-catalog README, CLAUDE.md, CONTEXT.md

Previously completed (2026-02-15 session 1)

  • Memory validation during deployment:
    • Pre-deploy memory check: compares mem_request sum against usable system RAM
    • Hard block if requests exceed usable memory (total - 384MB reserved)
    • Soft warning if mem_limit sum exceeds total RAM (overcommit OK for limits)
    • ParseMemoryMB() supports "500M", "1G", "1.5G", "1024" formats
    • CommittedMemory() sums requests/limits across all deployed stacks
    • Memory summary bar shown on deploy page before user clicks deploy
    • system.reserved_memory_mb configurable in controller.yaml (default: 384)
  • Display: ~ prefix on mem_request in UI badges (display-only, exact value stored)
  • Felhom.eu logo: replaced the text logos in the sidebar and login page with the actual SVG logo
    • Logo SVG embedded as Go string constant, served at /static/felhom-logo.svg
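
As an illustration of the formats ParseMemoryMB() is described as accepting, here is one possible implementation. The function name comes from the notes above; the body is a sketch, and it assumes that a bare number is already in MB and that 1G equals 1024 MB.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// ParseMemoryMB converts strings such as "500M", "1G", "1.5G" or "1024"
// to whole megabytes.
// Assumptions (not confirmed by the source): a bare number is MB, 1G = 1024 MB.
func ParseMemoryMB(s string) (int, error) {
	s = strings.ToUpper(strings.TrimSpace(s))
	if s == "" {
		return 0, fmt.Errorf("empty memory value")
	}
	mult := 1.0
	switch {
	case strings.HasSuffix(s, "G"):
		mult, s = 1024, strings.TrimSuffix(s, "G")
	case strings.HasSuffix(s, "M"):
		s = strings.TrimSuffix(s, "M")
	}
	n, err := strconv.ParseFloat(s, 64)
	// The range check also rejects NaN and Inf, which ParseFloat accepts.
	if err != nil || !(n >= 0 && n < 1e9) {
		return 0, fmt.Errorf("invalid memory value %q", s)
	}
	return int(n * mult), nil
}

func main() {
	for _, in := range []string{"500M", "1G", "1.5G", "1024"} {
		mb, err := ParseMemoryMB(in)
		fmt.Println(in, "=>", mb, "MB, err:", err)
	}
}
```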

Previously completed (2026-02-14)

  • System info bar on Vezérlőpult dashboard: RAM, SSD, and optional HDD usage
    • Progress bars with color coding (green < 70%, yellow 70-85%, red > 85%)
    • New internal/system package reads /proc/meminfo + syscall.Statfs
    • Platform-specific: Linux impl + non-Linux stub (build tags)
    • Hungarian labels: "Memória" (Memory), "SSD tárhely" (SSD storage), "Külső HDD" (External HDD)
  • Docker Compose memory limits on paperless-ngx template:
    • paperless-webserver: 768M, postgres: 256M, redis: 128M
    • Added mem_limit field to .felhom.yml ResourceHints (total: 1152M)
  • /api/system/info endpoint now returns live system metrics (was customer info)
  • Config: Added paths.hdd_path for external HDD monitoring
  • Controller image builds via build.sh, pushes to Gitea container registry

Previously completed (2026-02-13)

  • Built the entire felhom-controller from scratch (Go, no frameworks)
  • Debugged and fixed 7 issues during first real deployment:
    1. Password validation (empty passwords accepted)
    2. In-memory Deployed flag not updating after deploy
    3. Health-aware state parsing (starting/unhealthy detection)
    4. Random card ordering (Go map iteration)
    5. "Részletek" (Details) button redirect for deployed apps
    6. Paperless OCR language installation (LANGUAGES vs LANGUAGE env var)
    7. Documentation: restart vs up -d for image updates
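
Issue 4 (random card ordering) comes from ranging over a Go map, whose iteration order is unspecified. A minimal sketch of the usual fix follows; the Stack type and sortedStacks helper are hypothetical stand-ins, not the controller's types.

```go
package main

import (
	"fmt"
	"sort"
)

// Stack is a hypothetical stand-in for the controller's stack type.
type Stack struct {
	Name     string
	Deployed bool
}

// sortedStacks copies the map values into a slice ordered by name,
// so repeated renders produce the same card order.
func sortedStacks(m map[string]Stack) []Stack {
	out := make([]Stack, 0, len(m))
	for _, s := range m {
		out = append(out, s)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

func main() {
	stacks := map[string]Stack{
		"paperless-ngx": {Name: "paperless-ngx", Deployed: true},
		"whoami":        {Name: "whoami"},
		"mealie":        {Name: "mealie"},
	}
	for _, s := range sortedStacks(stacks) {
		fmt.Println(s.Name, s.Deployed)
	}
}
```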

What's next (priorities)

  1. Build + deploy the updated controller with git sync module
  2. Deploy a second app (e.g., ActualBudget — simplest, or Immich — tests HDD + secrets) to validate all .felhom.yml files
  3. Test git sync end-to-end: push a template change to app-catalog, verify controller picks it up
  4. Test on Raspberry Pi (pi-customer-1)
  5. Add paths.hdd_path to demo-felhom controller.yaml to enable HDD bar
  6. Phase 2 continued: CPU/temperature metrics, Healthchecks.io pings
  7. Phase 3: Backup system (DB dumps + restic)

Architecture decisions

| Decision | Rationale |
|---|---|
| Go stdlib for web (no Gin/Echo) | Minimal dependencies, single binary, easy to embed templates |
| Templates as Go string constants | Zero runtime file dependencies, everything in the binary |
| Docker Compose for customers (not k8s) | Simpler troubleshooting, customers don't need k8s knowledge |
| k3s for management infra only | Viktor's own services (gitea, monitoring, website) run on k3s |
| Cloudflare Tunnel for remote access | No port forwarding needed, works behind any NAT |
| app.yaml per stack | Separates deploy config from compose files, survives git pulls |
| Password fields require explicit input | Prevents accidental empty-password deployments |
| Health-aware state from Docker Status field | Docker's State says "running" even for unhealthy containers |
| Memory limits via deploy.resources.limits | Prevents runaway containers; ~50% headroom over expected usage |
| System info from /proc/meminfo + statfs | No external dependencies, cheap to read on each page load |
| mem_request vs mem_limit (K8s-inspired) | Requests = expected usage (hard block), limits = peak (overcommit OK) |
| 384MB reserved for system | Prevents deploying apps that would starve the OS/controller |
| Logo SVG embedded as Go constant | Same approach as CSS/HTML; zero external file deps |
| Git sync via os/exec git CLI | No Go git library needed, git is in the container image |
| SHA-256 for content comparison | Only copy changed files, avoid unnecessary disk writes |
| 30s debounce on manual sync | Prevents spamming the git server |
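
The request/limit split and the 384MB reserve combine into a simple pre-deploy check: requests are compared against usable RAM (total minus reserved) and block the deploy, while limits are compared against total RAM and only warn. The sketch below illustrates that rule under stated assumptions; the type, function name, and the numbers in main are hypothetical.

```go
package main

import "fmt"

// MemoryCheck is a hypothetical result type for the pre-deploy check.
type MemoryCheck struct {
	UsableMB int  // total RAM minus the system reserve
	Blocked  bool // request sum exceeds usable RAM: refuse to deploy
	Warning  bool // limit sum exceeds total RAM: allowed, but overcommitted
}

// checkMemory applies the rule described above. All values are in MB.
// committed* are sums over already deployed stacks; new* belong to the
// app about to be deployed.
func checkMemory(totalMB, reservedMB, committedReqMB, committedLimMB, newReqMB, newLimMB int) MemoryCheck {
	usable := totalMB - reservedMB
	return MemoryCheck{
		UsableMB: usable,
		Blocked:  committedReqMB+newReqMB > usable,
		Warning:  committedLimMB+newLimMB > totalMB,
	}
}

func main() {
	// Illustrative numbers only: a 4096 MB host with the default 384 MB reserve.
	res := checkMemory(4096, 384, 3500, 3800, 300, 400)
	fmt.Printf("usable=%dMB blocked=%v warning=%v\n", res.UsableMB, res.Blocked, res.Warning)
}
```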

Key file locations on demo-felhom

/opt/docker/felhom-controller/         # Controller compose + config
  ├── controller.yaml                  # Customer config (domain, auth, paths)
  ├── docker-compose.yml               # Controller's own compose
  └── .env                             # DOMAIN=demo-felhom.eu

/opt/docker/stacks/                    # All app stacks
  ├── traefik/                         # Reverse proxy (protected)
  ├── cloudflared/                     # Tunnel (protected)
  ├── paperless-ngx/                   # First deployed app ✅
  │   ├── docker-compose.yml
  │   ├── .felhom.yml                  # App metadata
  │   └── app.yaml                     # Deploy config (env vars, locked fields)
  └── whoami/                          # Test stack (not deployed)

/mnt/hdd_placeholder/storage/          # HDD storage for apps
  └── paperless/
      ├── consume/                     # Drop files here for OCR
      ├── media/                       # Processed documents
      └── export/                      # Backup exports

| Repository | Status | Notes |
|---|---|---|
| deploy-felhom-compose | Active | This repo. Controller code + deploy scripts |
| app-catalog-felhom.eu | Active | 10 app templates, all with .felhom.yml metadata + memory limits |
| felhom.eu | Stable | Website live, SEO indexed, email working |
| homelab-manifests | Stable | k3s cluster running (dooplex.hu services) |
| misc-scripts | Utility | collect-repo.sh, backup helpers |

Gotchas & lessons learned

  • docker compose restart ≠ docker compose up -d: restart doesn't pick up new images
  • Go maps have random iteration order — always sort slices before displaying
  • Docker .State="running" doesn't mean healthy — check .Status for "(health: starting)" / "(unhealthy)"
  • Paperless-ngx needs PAPERLESS_OCR_LANGUAGES (plural) to install language packs, PAPERLESS_OCR_LANGUAGE (singular) to select
  • After deploying a stack, update the in-memory Deployed flag immediately — RefreshStatus() only reads docker ps
  • Cloudflare Tunnel handles *.demo-felhom.eu → Traefik handles Host()-based routing to containers
  • BIOS "AC Power Recovery" must be enabled on N100 for auto-restart after power outage
  • docker compose up -d returns exit 0 even when containers immediately crash-loop — need post-start status check to detect this
  • When logging env vars for debugging, only log keys (not values) to avoid leaking secrets in log files
  • Mealie image (ghcr.io/mealie-recipes/mealie) doesn't include wget/curl — use Python TCP socket check for healthcheck
  • Mealie DB migrations on first start take ~40s (alembic) — use start_period: 60s to avoid premature unhealthy status
  • Alpine-based images (filebrowser, vaultwarden) have wget via BusyBox — healthchecks with wget --spider work fine
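
The State versus Status gotcha above can be handled with a small string check. This is a minimal sketch, not the controller's parser: the function name and the returned labels are illustrative, and it assumes the Status text comes from docker ps output in its usual "Up 3 seconds (health: starting)" form.

```go
package main

import (
	"fmt"
	"strings"
)

// effectiveState refines Docker's State using the human-readable Status
// string, because State stays "running" while a healthcheck is still
// starting or failing. Hypothetical helper; labels are illustrative.
func effectiveState(state, status string) string {
	if state != "running" {
		return state // e.g. "exited", "restarting"
	}
	switch {
	case strings.Contains(status, "(health: starting)"):
		return "starting"
	case strings.Contains(status, "(unhealthy)"):
		return "unhealthy"
	default:
		return "running"
	}
}

func main() {
	samples := [][2]string{
		{"running", "Up 3 seconds (health: starting)"},
		{"running", "Up 2 minutes (unhealthy)"},
		{"running", "Up 5 minutes (healthy)"},
		{"exited", "Exited (1) 5 seconds ago"},
	}
	for _, s := range samples {
		fmt.Printf("%-8s %-34s -> %s\n", s[0], s[1], effectiveState(s[0], s[1]))
	}
}
```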