admin 4c8bf63ce3 feat: customer config management — CRUD, API retrieval, per-customer auth (v0.2.0)
New "Configurations" section lets operators pre-configure customer settings
in the Hub, then docker-setup.sh can download a ready-made controller.yaml
using just a customer ID and retrieval password.

- Store: customer_configs table with CRUD + per-customer API key lookup
- API: GET /api/v1/config/{id} with X-Retrieval-Password auth
- Auth: per-customer API keys alongside existing global key (backward compatible)
- Web UI: /configs list, create, edit, delete, YAML preview, copy-to-clipboard
- YAML gen: deep-merge controller.yaml.example template with customer overrides
- Template fetcher: background goroutine refreshing template from Gitea repo
- Navigation: Dashboard / Configurations tabs on all pages

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 13:36:32 +01:00

felhom-hub

Central operator dashboard for monitoring and managing Felhom customer deployments.

A lightweight Go service that receives periodic reports from felhom-controller instances, stores them in SQLite, and provides a web dashboard for fleet monitoring. Also serves as the infrastructure backup store for disaster recovery.

Current version: v0.2.0


Architecture

   Customer nodes                             Central Hub (k3s)
┌─────────────────┐                     ┌────────────────────────┐
│ felhom-controller│──── JSON push ────▶│  felhom-hub            │
│ (every 15 min)   │    (Bearer auth)   │                        │
│                  │                     │  ┌─────────────────┐   │
│ POST /api/v1/    │                     │  │ API Handler     │   │
│   report         │                     │  │ (ingest reports, │   │
│   infra-backup   │                     │  │  infra backups)  │   │
│   notify         │                     │  └────────┬────────┘   │
│                  │                     │           │             │
└─────────────────┘                     │  ┌────────▼────────┐   │
                                        │  │ SQLite Store    │   │
   Operator browser                     │  │ (reports,       │   │
┌─────────────────┐                     │  │  infra_backups, │   │
│ Web Dashboard   │◀── HTML pages ──────│  │  notifications) │   │
│ (hub.felhom.eu) │    (bcrypt auth)    │  └─────────────────┘   │
└─────────────────┘                     │                        │
                                        │  ┌─────────────────┐   │
                                        │  │ Web Dashboard   │   │
                                        │  │ (multi-customer │   │
                                        │  │  overview)      │   │
                                        │  └─────────────────┘   │
                                        └────────────────────────┘

API Endpoints

All API endpoints require an Authorization: Bearer <api_key> header, except /healthz (no auth) and /api/v1/config/{customer_id} (which uses a retrieval password instead). The Bearer check accepts either the global report_api_key or a per-customer API key (generated when a customer config is created).
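The dual-key rule can be sketched as a small net/http middleware. This is an illustration only, not the Hub's actual code: requireBearer, keyLookup, newDemoHandler, and the in-memory key map are hypothetical names, and a real lookup would presumably query the customer_configs table.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// keyLookup reports whether key is a known per-customer API key.
// Hypothetical: stands in for a customer_configs table lookup.
type keyLookup func(key string) bool

// requireBearer rejects requests whose Bearer token is neither the global
// key nor a per-customer key, and passes everything else to next.
func requireBearer(globalKey string, isCustomerKey keyLookup, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		header := r.Header.Get("Authorization")
		if !strings.HasPrefix(header, "Bearer ") {
			http.Error(w, "missing bearer token", http.StatusUnauthorized)
			return
		}
		token := strings.TrimPrefix(header, "Bearer ")
		// Constant-time comparison for the global key; an empty global key never matches.
		globalOK := globalKey != "" &&
			subtle.ConstantTimeCompare([]byte(token), []byte(globalKey)) == 1
		if !globalOK && !isCustomerKey(token) {
			http.Error(w, "invalid API key", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// newDemoHandler wires the middleware around a trivial 204 handler,
// with one global key and one per-customer key (both made up).
func newDemoHandler() http.Handler {
	customerKeys := map[string]bool{"cust-key-1": true}
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusNoContent)
	})
	return requireBearer("global-secret", func(k string) bool { return customerKeys[k] }, ok)
}

// statusFor sends one request with the given Authorization header value
// and returns the resulting HTTP status code.
func statusFor(h http.Handler, authHeader string) int {
	req := httptest.NewRequest(http.MethodPost, "/api/v1/report", nil)
	if authHeader != "" {
		req.Header.Set("Authorization", authHeader)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := newDemoHandler()
	fmt.Println(statusFor(h, "Bearer global-secret"))
	fmt.Println(statusFor(h, "Bearer cust-key-1"))
	fmt.Println(statusFor(h, "Bearer wrong"))
}
```

Checking the global key first keeps existing controllers working without a database lookup, which is one way to get the backward compatibility the commit message mentions.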

Report Ingest

Method  Path                                      Description
POST    /api/v1/report                            Controller pushes periodic status report
GET     /api/v1/customers                         List all customers with latest report summary
GET     /api/v1/customers/{id}                    Get latest full report for a customer
GET     /api/v1/customers/{id}/history?period=7d  Get report history
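The period=7d value in the history endpoint is not something Go's time.ParseDuration accepts, since that function has no day unit. A minimal sketch of how such a parameter could be parsed follows; parsePeriod is a hypothetical helper, and accepting standard Go durations such as 24h as a fallback is an assumption, not documented Hub behaviour.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// parsePeriod converts a query value such as "7d" into a duration.
// A trailing "d" means whole days; anything else is handed to
// time.ParseDuration (assumed fallback). Non-positive periods are rejected.
func parsePeriod(s string) (time.Duration, error) {
	var d time.Duration
	if strings.HasSuffix(s, "d") {
		days, err := strconv.Atoi(strings.TrimSuffix(s, "d"))
		if err != nil {
			return 0, fmt.Errorf("invalid period %q: %w", s, err)
		}
		d = time.Duration(days) * 24 * time.Hour
	} else {
		var err error
		d, err = time.ParseDuration(s)
		if err != nil {
			return 0, fmt.Errorf("invalid period %q: %w", s, err)
		}
	}
	if d <= 0 {
		return 0, fmt.Errorf("period %q must be positive", s)
	}
	return d, nil
}

func main() {
	for _, in := range []string{"7d", "24h", "abc"} {
		d, err := parsePeriod(in)
		fmt.Println(in, d, err)
	}
}
```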

Infrastructure Backup (Disaster Recovery)

Method  Path                                Description
POST    /api/v1/infra-backup                Controller pushes infrastructure snapshot
GET     /api/v1/infra-backup/{customer_id}  Fresh controller pulls backup for restore

The infra-backup payload contains everything needed to restore a customer deployment:

  • controller.yaml (base64, full config including secrets)
  • settings.json (base64, backup preferences, storage paths)
  • Disk layout (UUIDs, labels, mount points, fstab options, bind-mount topology)
  • Deployed stacks manifest (app names, HDD paths, display names)
  • Restic passwords (primary + cross-drive, for encrypted backup access)
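As a rough picture of the payload, the items listed above could map onto a Go struct like the one below. Every field name and JSON key here is an assumption made for illustration; the real wire format is defined by the controller and Hub sources. The sketch relies on encoding/json encoding []byte fields as base64 strings, which matches the "base64" note on the two config files. Bind-mount topology is left out for brevity.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Disk describes one drive in the customer's layout (hypothetical field names).
type Disk struct {
	UUID         string `json:"uuid"`
	Label        string `json:"label"`
	MountPoint   string `json:"mount_point"`
	FstabOptions string `json:"fstab_options"`
}

// Stack is one entry of the deployed stacks manifest (hypothetical field names).
type Stack struct {
	AppName     string `json:"app_name"`
	HDDPath     string `json:"hdd_path"`
	DisplayName string `json:"display_name"`
}

// InfraBackup is an assumed shape for the infra-backup payload.
// []byte fields are emitted as base64 strings by encoding/json.
type InfraBackup struct {
	CustomerID       string  `json:"customer_id"`
	ControllerYAML   []byte  `json:"controller_yaml"`
	SettingsJSON     []byte  `json:"settings_json"`
	Disks            []Disk  `json:"disks"`
	Stacks           []Stack `json:"stacks"`
	ResticPrimary    string  `json:"restic_password_primary"`
	ResticCrossDrive string  `json:"restic_password_cross_drive"`
}

// marshalBackup encodes a snapshot as indented JSON.
func marshalBackup(b InfraBackup) (string, error) {
	out, err := json.MarshalIndent(b, "", "  ")
	return string(out), err
}

// unmarshalBackup decodes a snapshot, reversing the base64 encoding.
func unmarshalBackup(s string) (InfraBackup, error) {
	var b InfraBackup
	err := json.Unmarshal([]byte(s), &b)
	return b, err
}

func main() {
	doc, err := marshalBackup(InfraBackup{
		CustomerID:     "demo-customer",
		ControllerYAML: []byte("customer:\n  id: demo-customer\n"),
		Disks:          []Disk{{UUID: "0000-demo", Label: "data1", MountPoint: "/mnt/data1"}},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(doc)
}
```

Because the payload carries secrets (full controller.yaml and Restic passwords), anything that logs or stores it should be treated as sensitive.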

Disaster recovery flow:

  1. Customer's system drive fails → replaced with fresh Debian install
  2. docker-setup.sh deploys controller with Hub details (customer_id + API key)
  3. Controller detects fresh deployment → calls GET /api/v1/infra-backup/{customer_id}
  4. Controller uses disk UUIDs to auto-mount surviving drives
  5. Controller restores apps from local backups on those drives
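Step 3 of the flow above amounts to one authenticated GET. The sketch below shows what that request might look like from the controller side; fetchInfraBackup and newFakeHub are hypothetical names, the fake server only exists to make the example runnable, and treating any non-200 response as an error is an assumption about the Hub's behaviour.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// fetchInfraBackup pulls the stored snapshot for customerID from the Hub
// using Bearer auth and returns the raw response body.
func fetchInfraBackup(client *http.Client, hubURL, customerID, apiKey string) ([]byte, error) {
	endpoint := hubURL + "/api/v1/infra-backup/" + url.PathEscape(customerID)
	req, err := http.NewRequest(http.MethodGet, endpoint, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("hub returned %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// newFakeHub starts a local stand-in for the Hub that serves one
// customer's payload and checks the Bearer token. Demo only.
func newFakeHub(apiKey, customerID, payload string) *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/api/v1/infra-backup/"+customerID, func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") != "Bearer "+apiKey {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		io.WriteString(w, payload)
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := newFakeHub("demo-key", "demo-customer", `{"customer_id":"demo-customer"}`)
	defer srv.Close()

	body, err := fetchInfraBackup(srv.Client(), srv.URL, "demo-customer", "demo-key")
	fmt.Println(string(body), err)
}
```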

Notifications

Method  Path                 Description
POST    /api/v1/notify       Controller sends event notification (backup_failed, disk_warning, etc.)
POST    /api/v1/preferences  Controller syncs customer notification preferences

Notifications are sent via the Resend.com email API.

Customer Config Retrieval

Method  Path                          Description
GET     /api/v1/config/{customer_id}  Download generated controller.yaml (auth: X-Retrieval-Password header)

Config retrieval uses a separate per-customer retrieval password (not the API key). The Hub generates a complete controller.yaml by deep-merging controller.yaml.example (periodically fetched from the Gitea repo) with customer-specific overrides (identity, infrastructure tokens, hub API key, session secret).
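The deep-merge step can be illustrated on already-parsed YAML, represented as nested map[string]any values. This is a generic sketch of the technique rather than the Hub's implementation: deepMerge is a hypothetical name, the keys in main are made up, and how the real code treats lists or null values is not stated in this README (here any non-map override simply replaces the template value).

```go
package main

import "fmt"

// deepMerge returns a new map containing base with override applied on top.
// When both sides hold a map under the same key the merge recurses;
// otherwise the override value replaces the base value.
// Neither input is modified. Nested maps that have no override are shared
// with base rather than copied.
func deepMerge(base, override map[string]any) map[string]any {
	out := make(map[string]any, len(base))
	for k, v := range base {
		out[k] = v
	}
	for k, ov := range override {
		baseMap, baseIsMap := out[k].(map[string]any)
		overMap, overIsMap := ov.(map[string]any)
		if baseIsMap && overIsMap {
			out[k] = deepMerge(baseMap, overMap)
			continue
		}
		out[k] = ov
	}
	return out
}

func main() {
	// Stand-in for the parsed controller.yaml.example template (keys are made up).
	template := map[string]any{
		"customer": map[string]any{"id": "", "name": ""},
		"hub":      map[string]any{"url": "https://hub.felhom.eu", "api_key": ""},
	}
	// Stand-in for one customer's overrides.
	overrides := map[string]any{
		"customer": map[string]any{"id": "demo-customer"},
		"hub":      map[string]any{"api_key": "per-customer-key"},
	}
	fmt.Println(deepMerge(template, overrides))
}
```

Merging per key, rather than replacing whole sections, means a customer override only needs to list the values that differ, and new defaults added to the template flow through on the next refresh.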

Health

Method  Path      Description
GET     /healthz  Health check (no auth required)

Web Dashboard

Protected by a bcrypt-hashed password and a session cookie (7-day expiry).

  • Customer overview table: status indicators (OK/WARN/DOWN), CPU/memory %, disk usage, container counts, backup age, controller version
  • Customer detail page: system info, storage bars, container table, notification preferences, notification log, 24h history graphs
  • Configurations page: CRUD management for customer configs — pre-configure customer identity, infrastructure secrets, monitoring UUIDs; auto-generates retrieval password + per-customer API key; shows setup commands (docker-setup.sh and curl); YAML preview
  • Auto-refresh: 60-second cycle
  • Status logic:
    • Green: report < 30 min old and health = ok
    • Yellow: 30-60 min stale or health = warn
    • Red: > 60 min stale or health = fail
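The status rules above can be written as one small function. This is a sketch under a few assumptions that the list does not settle: a report exactly 30 or 60 minutes old counts as Yellow, Red takes precedence over everything else, and an unrecognised health value on a fresh report falls through to Yellow. The names classify, Green, Yellow, and Red are illustrative.

```go
package main

import (
	"fmt"
	"time"
)

// Status is the dashboard colour for one customer.
type Status string

const (
	Green  Status = "green"
	Yellow Status = "yellow"
	Red    Status = "red"
)

// classify maps report age and reported health to a dashboard colour.
// Red is checked first so a failing or very stale customer is never
// shown as merely Yellow.
func classify(age time.Duration, health string) Status {
	switch {
	case age > 60*time.Minute || health == "fail":
		return Red
	case age < 30*time.Minute && health == "ok":
		return Green
	default:
		// 30-60 min stale, health = warn, or an unknown health value (assumption).
		return Yellow
	}
}

func main() {
	fmt.Println(classify(10*time.Minute, "ok"))   // fresh and healthy
	fmt.Println(classify(45*time.Minute, "ok"))   // stale
	fmt.Println(classify(5*time.Minute, "fail"))  // failing
	fmt.Println(classify(2*time.Hour, "ok"))      // very stale
}
```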

Data Storage

SQLite with WAL mode. Tables:

Table                   Purpose
reports                 Full JSON reports with denormalized fields for dashboard queries
infra_backups           Per-customer infrastructure snapshots for disaster recovery
customer_notifications  Email + enabled event types per customer
notification_log        Send/skip/fail history for notifications
customer_configs        Pre-configured customer settings, retrieval passwords, per-customer API keys

Retention: configurable (default 90 days), daily prune at 04:30 Budapest time.
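Scheduling "04:30 Budapest time" correctly means computing the next run in the Europe/Budapest zone rather than in UTC, so the wall-clock time stays fixed across daylight-saving changes. The sketch below shows one way to compute the next run and the retention cutoff; nextPrune, pruneCutoff, mustBudapest, and mustParse are hypothetical helpers, and the Hub's real scheduler may work differently. The blank import of time/tzdata embeds the zone database so the lookup also works in minimal container images.

```go
package main

import (
	"fmt"
	"time"
	_ "time/tzdata" // embed zone data so Europe/Budapest resolves without system tzdata
)

// nextPrune returns the first instant strictly after now at which the
// wall clock in loc reads hour:minute.
func nextPrune(now time.Time, loc *time.Location, hour, minute int) time.Time {
	local := now.In(loc)
	next := time.Date(local.Year(), local.Month(), local.Day(), hour, minute, 0, 0, loc)
	if !next.After(local) {
		// Today's slot has passed; time.Date normalises day overflow.
		next = time.Date(local.Year(), local.Month(), local.Day()+1, hour, minute, 0, 0, loc)
	}
	return next
}

// pruneCutoff returns the timestamp before which reports would be deleted.
func pruneCutoff(now time.Time, maxDays int) time.Time {
	return now.AddDate(0, 0, -maxDays)
}

// mustBudapest loads the Europe/Budapest zone or panics.
func mustBudapest() *time.Location {
	loc, err := time.LoadLocation("Europe/Budapest")
	if err != nil {
		panic(err)
	}
	return loc
}

// mustParse parses an RFC 3339 timestamp or panics (demo helper).
func mustParse(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	now := time.Now()
	next := nextPrune(now, mustBudapest(), 4, 30)
	fmt.Println("next prune:", next)
	fmt.Println("delete reports older than:", pruneCutoff(next, 90))
}
```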

Configuration

# hub.yaml
auth:
  password_hash: ""           # bcrypt hash for dashboard login (empty = no auth)

api:
  report_api_key: ""          # Global Bearer token for API auth (per-customer keys are also accepted)

notifications:
  resend_api_key: ""          # Resend.com API key for email
  from_email: "monitoring@felhom.eu"

retention:
  max_days: 90
  prune_schedule: "04:30"

alerting:
  stale_threshold: "30m"      # Customer considered stale after this duration

registry:
  image: "gitea.dooplex.hu/admin/felhom-controller"
  username: ""                # Gitea registry credentials
  token: ""
  check_interval: "30m"      # How often to check for new controller versions
  template_interval: "1h"    # How often to refresh controller.yaml.example

server:
  listen: ":8080"
  data_dir: "/data"           # SQLite database location

Deployment

Runs on k3s (Kubernetes) in the felhom-system namespace:

  • PVC: 1GB Longhorn volume for SQLite database
  • Resources: 64Mi-256Mi memory, 50m-500m CPU
  • Ingress: hub.felhom.eu with TLS (cert-manager)
  • Geo-restriction: Hungary only (nginx annotation)

# Build and push
cd hub/
make VERSION=0.2.0 docker docker-push

# Deploy
kubectl set image -n felhom-system deploy/hub hub=gitea.dooplex.hu/admin/felhom-hub:v0.2.0
kubectl rollout status -n felhom-system deploy/hub

# Check
kubectl logs -n felhom-system -l app=hub --tail 20

Dependencies

  • golang.org/x/crypto — bcrypt for password hashing
  • gopkg.in/yaml.v3 — YAML config parsing
  • modernc.org/sqlite — Pure Go SQLite (no CGo)