felhom-hub
Central operator dashboard for monitoring and managing Felhom customer deployments.
A lightweight Go service that receives periodic reports from felhom-controller instances, stores them in SQLite, and provides a web dashboard for fleet monitoring. Also serves as the infrastructure backup store for disaster recovery.
Current version: v0.2.1
Architecture
Customer nodes                          Central Hub (k3s)
┌──────────────────┐                    ┌────────────────────────┐
│ felhom-controller│──── JSON push ────▶│ felhom-hub             │
│ (every 15 min)   │    (Bearer auth)   │                        │
│                  │                    │  ┌──────────────────┐  │
│ POST /api/v1/    │                    │  │ API Handler      │  │
│   report         │                    │  │ (ingest reports, │  │
│   infra-backup   │◀── config push ────│  │  infra backups,  │  │
│   notify         │    (YAML body)     │  │  config push)    │  │
│                  │                    │  └────────┬─────────┘  │
└──────────────────┘                    │           │            │
                                        │  ┌────────▼─────────┐  │
Operator browser                        │  │ SQLite Store     │  │
┌──────────────────┐                    │  │ (reports,        │  │
│ Web Dashboard    │◀── HTML pages ─────│  │  infra_backups,  │  │
│ (hub.felhom.eu)  │    (bcrypt auth)   │  │  configs,        │  │
└──────────────────┘                    │  │  notifications)  │  │
                                        │  └──────────────────┘  │
                                        │                        │
                                        │  ┌──────────────────┐  │
                                        │  │ Web Dashboard    │  │
                                        │  │ (unified customer│  │
                                        │  │  management)     │  │
                                        │  └──────────────────┘  │
                                        └────────────────────────┘
API Endpoints
All API endpoints require `Authorization: Bearer <api_key>`, except `/healthz` (no auth) and `/api/v1/config/{customer_id}` (which uses a retrieval password instead). Auth accepts both the global `report_api_key` and per-customer API keys (generated when creating customer configs).
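
The two-tier key check described above could be sketched as follows. This is a minimal illustration, not the Hub's actual code: the `keyStore` type, its fields, and the function names are hypothetical, and the constant-time comparison is a suggested precaution rather than something this README confirms.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"strings"
)

// keyStore is a hypothetical holder for the global report_api_key and the
// per-customer keys; the Hub's real storage layout is not shown in this README.
type keyStore struct {
	globalKey    string
	customerKeys map[string]string // customer_id -> API key
}

// bearerToken extracts the token from an "Authorization: Bearer <token>" value.
func bearerToken(header string) (string, bool) {
	const prefix = "Bearer "
	if !strings.HasPrefix(header, prefix) {
		return "", false
	}
	token := strings.TrimSpace(header[len(prefix):])
	return token, token != ""
}

// sameKey compares in constant time and never matches an empty configured key.
func sameKey(got, want string) bool {
	return want != "" && subtle.ConstantTimeCompare([]byte(got), []byte(want)) == 1
}

// authorize reports whether the header carries a known key. The returned
// customer ID is empty when the global key matched.
func (s keyStore) authorize(header string) (customerID string, ok bool) {
	token, found := bearerToken(header)
	if !found {
		return "", false
	}
	if sameKey(token, s.globalKey) {
		return "", true
	}
	for id, key := range s.customerKeys {
		if sameKey(token, key) {
			return id, true
		}
	}
	return "", false
}

func main() {
	store := keyStore{
		globalKey:    "global-secret",
		customerKeys: map[string]string{"acme": "acme-secret"},
	}
	for _, h := range []string{"Bearer global-secret", "Bearer acme-secret", "Bearer nope", "Basic abc"} {
		id, ok := store.authorize(h)
		fmt.Printf("%-24q customer=%q ok=%v\n", h, id, ok)
	}
}
```

Returning the matched customer ID lets a handler verify that a per-customer key is only used for that customer's data; whether the Hub enforces this is not stated here.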
Report Ingest
| Method | Path | Description |
|---|---|---|
| `POST` | `/api/v1/report` | Controller pushes periodic status report |
| `GET` | `/api/v1/customers` | List all customers with latest report summary |
| `GET` | `/api/v1/customers/{id}` | Get latest full report for a customer |
| `GET` | `/api/v1/customers/{id}/history?period=7d` | Get report history |
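
Go's `time.ParseDuration` does not accept a day unit, so a `period` value like `7d` needs a small parser. The sketch below assumes the parameter is a positive integer followed by `d` or `h`; only `7d` appears in this README, so the `h` unit and the function name `parsePeriod` are assumptions.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// parsePeriod converts values such as "7d" or "12h" into a duration.
// Only "7d" is documented; accepting "h" is an assumption.
func parsePeriod(s string) (time.Duration, error) {
	if len(s) < 2 {
		return 0, fmt.Errorf("invalid period %q", s)
	}
	n, err := strconv.Atoi(s[:len(s)-1])
	if err != nil || n <= 0 {
		return 0, fmt.Errorf("invalid period %q", s)
	}
	switch s[len(s)-1] {
	case 'd':
		return time.Duration(n) * 24 * time.Hour, nil
	case 'h':
		return time.Duration(n) * time.Hour, nil
	}
	return 0, fmt.Errorf("invalid period unit in %q", s)
}

func main() {
	for _, p := range []string{"7d", "12h", "x", "0d"} {
		d, err := parsePeriod(p)
		fmt.Println(p, d, err)
	}
}
```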
Infrastructure Backup (Disaster Recovery)
| Method | Path | Description |
|---|---|---|
| `POST` | `/api/v1/infra-backup` | Controller pushes infrastructure snapshot |
| `GET` | `/api/v1/infra-backup/{customer_id}` | Fresh controller pulls backup for restore |
The infra-backup payload contains everything needed to restore a customer deployment:
- `controller.yaml` (base64, full config including secrets)
- `settings.json` (base64, backup preferences, storage paths)
- Disk layout (UUIDs, labels, mount points, fstab options, bind-mount topology)
- Deployed stacks manifest (app names, HDD paths, display names)
- Restic passwords (primary + cross-drive, for encrypted backup access)
Disaster recovery flow:
- Customer's system drive fails → replaced with fresh Debian install
- `docker-setup.sh` deploys controller with Hub details (customer_id + API key)
- Controller detects fresh deployment → calls `GET /api/v1/infra-backup/{customer_id}`
- Controller uses disk UUIDs to auto-mount surviving drives
- Controller restores apps from local backups on those drives
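
On the controller side, the restore step that consumes the pulled backup might look roughly like this. The JSON field names and the `infraBackup`, `disk`, `decodeInfraBackup`, and `mountPlan` identifiers are all hypothetical, since the wire format is not documented here. The sketch relies on `encoding/json` decoding base64 strings into `[]byte` fields.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// infraBackup mirrors part of the payload listed above. All JSON field
// names are hypothetical; the real wire format is not shown in this README.
type infraBackup struct {
	CustomerID     string `json:"customer_id"`
	ControllerYAML []byte `json:"controller_yaml"` // base64 in JSON, decoded by encoding/json
	SettingsJSON   []byte `json:"settings_json"`   // base64 in JSON, decoded by encoding/json
	Disks          []disk `json:"disks"`
}

type disk struct {
	UUID       string `json:"uuid"`
	MountPoint string `json:"mount_point"`
}

// decodeInfraBackup parses the response body and rejects a backup that
// lacks controller.yaml, since the restore cannot proceed without it.
func decodeInfraBackup(body []byte) (infraBackup, error) {
	var b infraBackup
	if err := json.Unmarshal(body, &b); err != nil {
		return infraBackup{}, fmt.Errorf("decode infra backup: %w", err)
	}
	if len(b.ControllerYAML) == 0 {
		return infraBackup{}, fmt.Errorf("infra backup for %q has no controller.yaml", b.CustomerID)
	}
	return b, nil
}

// mountPlan maps each disk UUID to the mount point it had before the failure.
func mountPlan(b infraBackup) map[string]string {
	plan := make(map[string]string, len(b.Disks))
	for _, d := range b.Disks {
		plan[d.UUID] = d.MountPoint
	}
	return plan
}

func main() {
	enc := base64.StdEncoding.EncodeToString
	body := fmt.Sprintf(
		`{"customer_id":"acme","controller_yaml":%q,"settings_json":%q,"disks":[{"uuid":"1234-ABCD","mount_point":"/mnt/hdd1"}]}`,
		enc([]byte("customer:\n  id: acme\n")), enc([]byte(`{"backup":"daily"}`)))

	b, err := decodeInfraBackup([]byte(body))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("controller.yaml:\n%s", b.ControllerYAML)
	fmt.Println("mount plan:", mountPlan(b))
}
```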
Notifications
| Method | Path | Description |
|---|---|---|
| `POST` | `/api/v1/notify` | Controller sends event notification (`backup_failed`, `disk_warning`, etc.) |
| `POST` | `/api/v1/preferences` | Controller syncs customer notification preferences |
Notifications are sent via the Resend.com email API.
Customer Config Retrieval
| Method | Path | Description |
|---|---|---|
| `GET` | `/api/v1/config/{customer_id}` | Download generated `controller.yaml` (auth: `X-Retrieval-Password` header) |
Config retrieval uses a separate per-customer retrieval password (not the API key). The Hub generates a complete controller.yaml by deep-merging controller.yaml.example (periodically fetched from the Gitea repo) with customer-specific overrides (identity, infrastructure tokens, hub API key, session secret).
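
A deep merge of this kind can be sketched on generic maps, as a YAML document would look after parsing. This is an illustration of the technique, not the Hub's implementation: the `deepMerge` name and the sample keys are hypothetical, and it assumes that overrides replace scalars and lists wholesale while nested maps merge key by key.

```go
package main

import "fmt"

// deepMerge returns a new map in which values from override win and nested
// maps are merged recursively. Neither input map is modified, but nested
// maps that are not overridden are shared with base rather than copied.
func deepMerge(base, override map[string]any) map[string]any {
	out := make(map[string]any, len(base))
	for k, v := range base {
		out[k] = v
	}
	for k, v := range override {
		baseMap, baseIsMap := out[k].(map[string]any)
		overMap, overIsMap := v.(map[string]any)
		if baseIsMap && overIsMap {
			out[k] = deepMerge(baseMap, overMap)
			continue
		}
		out[k] = v // scalars, lists, and type mismatches: override replaces
	}
	return out
}

func main() {
	// Stand-in for the parsed controller.yaml.example (keys are made up).
	template := map[string]any{
		"hub":        map[string]any{"url": "https://hub.felhom.eu", "api_key": ""},
		"monitoring": map[string]any{"interval": "15m"},
	}
	// Stand-in for customer-specific overrides.
	overrides := map[string]any{
		"customer": map[string]any{"id": "acme"},
		"hub":      map[string]any{"api_key": "per-customer-key"},
	}
	fmt.Println(deepMerge(template, overrides))
}
```

Merging into a fresh map keeps the cached template untouched, so the same parsed template can serve every customer until the next refresh.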
Health
| Method | Path | Description |
|---|---|---|
| `GET` | `/healthz` | Health check (no auth required) |
Web Dashboard
Protected by bcrypt password + session cookie (7-day expiry).
Pages
- Dashboard (`/`): Fleet overview table showing all customers with live status. Config-only customers (no reports yet) appear as "PENDING" with a gray badge. Blocked customers are hidden. Auto-refreshes every 60 seconds.
- Customers (`/configs`): Customer management list. Shows all customers (both managed and manual), their status, controller version, and config type (MANAGED/MANUAL). Blocked customers are shown grayed-out with a BLOCKED badge.
- Unified Customer Detail (`/customers/{id}`): Single page per customer combining config management and live monitoring. Adapts content based on available data:
  - Managed + reporting: Full view with config info, system metrics, storage, containers, backup status, credentials, setup commands, YAML preview, controller update, notifications, history
  - Managed + no reports yet: Config info, credentials, setup commands, "Waiting for first report" indicator
  - Manual (report-only): System metrics, storage, containers, backup, with "Create Config" button to convert to managed
- Config Form (`/configs/new`, `/configs/{id}/edit`): Create/edit customer configurations with identity, infrastructure tokens, and monitoring overrides
Customer States
| State | Dashboard | Customers List | Detail Page |
|---|---|---|---|
| Active + reporting | Shown with live status | MANAGED + status badge | Full unified view |
| Active + no reports | Shown as PENDING (gray) | MANAGED + no status | Config + "waiting for report" |
| Manual (report-only) | Shown with live status | MANUAL + status badge | Reports + "Create Config" button |
| Blocked | Hidden | Shown grayed-out, BLOCKED badge | Blocked banner + Unblock button |
Customer Actions
| Action | Description |
|---|---|
| Block/Unblock | Toggle blocked status — blocked customers are hidden from dashboard and notifications are suppressed, but reports are still accepted and stored |
| Push Config | Generate YAML from Hub config and POST it to the controller's /api/config/apply endpoint (requires controller URL from reports) |
| Create Config | Auto-create a managed config from a manual customer's report data, then redirect to edit form |
| Trigger Update | Instruct controller to self-update to the latest version |
| Delete | Remove customer config (customer reappears as manual if reports continue) |
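
The Push Config action could be sketched as a plain HTTP POST of the generated YAML. Only the `/api/config/apply` path comes from this README. The `pushConfig` and `applyURL` names, the `application/yaml` content type, and the absence of an auth header are assumptions, and the demo talks to a local stand-in server rather than a real controller.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// applyURL joins the controller base URL (taken from reports) with the apply path.
func applyURL(controllerURL string) string {
	return strings.TrimRight(controllerURL, "/") + "/api/config/apply"
}

// pushConfig POSTs the YAML body and treats any non-2xx answer as an error.
// How the controller authenticates this call is not documented here.
func pushConfig(ctx context.Context, client *http.Client, controllerURL string, yamlBody []byte) error {
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, applyURL(controllerURL), bytes.NewReader(yamlBody))
	if err != nil {
		return fmt.Errorf("build request: %w", err)
	}
	req.Header.Set("Content-Type", "application/yaml") // assumed content type
	resp, err := client.Do(req)
	if err != nil {
		return fmt.Errorf("push config: %w", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		msg, _ := io.ReadAll(io.LimitReader(resp.Body, 512))
		return fmt.Errorf("controller answered %s: %s", resp.Status, strings.TrimSpace(string(msg)))
	}
	return nil
}

func main() {
	// Local stand-in for a controller, so the sketch runs without a real node.
	controller := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		body, _ := io.ReadAll(r.Body)
		fmt.Printf("%s %s (%d bytes)\n", r.Method, r.URL.Path, len(body))
		w.WriteHeader(http.StatusNoContent)
	}))
	defer controller.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	err := pushConfig(ctx, controller.Client(), controller.URL, []byte("customer:\n  id: acme\n"))
	fmt.Println("push error:", err)
}
```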
Status Logic
- OK (green): report < 30 min old, health = ok
- WARN (yellow): 30-60 min stale or health = warn
- DOWN (red): > 60 min stale or health = fail
- DISABLED (gray): controller monitoring paused
- PENDING (gray): config exists but no reports received yet
- BLOCKED (gray): customer blocked by operator
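
These rules can be expressed as one function. The sketch below is an interpretation, not the Hub's code: the `customer` struct and its fields are hypothetical, the precedence of BLOCKED over PENDING over DISABLED is assumed, a report exactly 30 minutes old is treated as WARN and one exactly 60 minutes old still as WARN, and an unrecognized health value falls back to WARN.

```go
package main

import (
	"fmt"
	"time"
)

// customer holds only the inputs the status rules need (hypothetical type).
type customer struct {
	Blocked   bool          // blocked by operator
	HasReport bool          // at least one report received
	Disabled  bool          // controller monitoring paused
	Health    string        // "ok", "warn", or "fail"
	ReportAge time.Duration // time since the latest report
}

// status applies the rules in an assumed order of precedence:
// operator and lifecycle states first, then staleness and health.
func status(c customer) string {
	switch {
	case c.Blocked:
		return "BLOCKED"
	case !c.HasReport:
		return "PENDING"
	case c.Disabled:
		return "DISABLED"
	case c.ReportAge > 60*time.Minute || c.Health == "fail":
		return "DOWN"
	case c.ReportAge >= 30*time.Minute || c.Health == "warn":
		return "WARN"
	case c.Health == "ok":
		return "OK"
	}
	return "WARN" // unknown health value: conservative fallback (assumption)
}

func main() {
	examples := []customer{
		{HasReport: true, Health: "ok", ReportAge: 10 * time.Minute},
		{HasReport: true, Health: "ok", ReportAge: 45 * time.Minute},
		{HasReport: true, Health: "fail", ReportAge: 5 * time.Minute},
		{HasReport: false},
		{Blocked: true, HasReport: true, Health: "ok"},
	}
	for _, c := range examples {
		fmt.Printf("%+v -> %s\n", c, status(c))
	}
}
```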
Data Storage
SQLite with WAL mode. Tables:
| Table | Purpose |
|---|---|
| `reports` | Full JSON reports with denormalized fields for dashboard queries |
| `infra_backups` | Per-customer infrastructure snapshots for disaster recovery |
| `customer_notifications` | Email + enabled event types per customer |
| `notification_log` | Send/skip/fail history for notifications |
| `customer_configs` | Pre-configured customer settings, retrieval passwords, per-customer API keys, status (active/blocked) |
Retention: configurable (default 90 days), daily prune at 04:30 Budapest time.
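
The schedule and cutoff arithmetic behind such a prune job can be sketched as below. The function names are hypothetical and the Hub's scheduler is not shown in this README. The sketch assumes the `Europe/Budapest` zone and embeds the zone database via `time/tzdata` so it resolves even in minimal container images.

```go
package main

import (
	"fmt"
	"time"
	_ "time/tzdata" // embeds the zone database so Europe/Budapest resolves anywhere
)

// hubLocation returns the time zone the prune schedule is expressed in.
func hubLocation() (*time.Location, error) {
	return time.LoadLocation("Europe/Budapest")
}

// nextPrune returns the first 04:30 in loc that is strictly after now.
func nextPrune(now time.Time, loc *time.Location) time.Time {
	local := now.In(loc)
	next := time.Date(local.Year(), local.Month(), local.Day(), 4, 30, 0, 0, loc)
	if !next.After(local) {
		// time.Date normalizes an out-of-range day into the next month.
		next = time.Date(local.Year(), local.Month(), local.Day()+1, 4, 30, 0, 0, loc)
	}
	return next
}

// pruneCutoff returns the timestamp before which reports would be deleted.
func pruneCutoff(now time.Time, maxDays int) time.Time {
	return now.AddDate(0, 0, -maxDays)
}

// mustParse is a demo helper that parses an RFC 3339 timestamp or panics.
func mustParse(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	loc, err := hubLocation()
	if err != nil {
		fmt.Println("load location:", err)
		return
	}
	now := mustParse("2024-03-10T03:00:00Z")
	fmt.Println("next prune:", nextPrune(now, loc))
	fmt.Println("delete reports older than:", pruneCutoff(now, 90))
}
```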
Configuration
# hub.yaml
auth:
  password_hash: ""            # bcrypt hash for dashboard login (empty = no auth)
api:
  report_api_key: ""           # Bearer token for API auth
notifications:
  resend_api_key: ""           # Resend.com API key for email
  from_email: "monitoring@felhom.eu"
retention:
  max_days: 90
  prune_schedule: "04:30"
alerting:
  stale_threshold: "30m"       # Customer considered stale after this duration
registry:
  image: "gitea.dooplex.hu/admin/felhom-controller"
  username: ""                 # Gitea registry credentials
  token: ""
  check_interval: "30m"        # How often to check for new controller versions
  template_interval: "1h"      # How often to refresh controller.yaml.example
server:
  listen: ":8080"
  data_dir: "/data"            # SQLite database location
Deployment
Runs on k3s (Kubernetes) in the felhom-system namespace:
- PVC: 1GB Longhorn volume for SQLite database
- Resources: 64Mi-256Mi memory, 50m-500m CPU
- Ingress: `hub.felhom.eu` with TLS (cert-manager)
- Geo-restriction: Hungary only (nginx annotation)
# Build and push
cd hub/
make VERSION=0.2.1 docker docker-push
# Deploy
kubectl set image -n felhom-system deploy/hub hub=gitea.dooplex.hu/admin/felhom-hub:v0.2.1
kubectl rollout status -n felhom-system deploy/hub
# Check
kubectl logs -n felhom-system -l app=hub --tail 20
Dependencies
- `golang.org/x/crypto`: bcrypt for password hashing
- `gopkg.in/yaml.v3`: YAML config parsing
- `modernc.org/sqlite`: Pure Go SQLite (no CGo)