4c8bf63ce3
New "Configurations" section lets operators pre-configure customer settings
in the Hub, then docker-setup.sh can download a ready-made controller.yaml
using just a customer ID and retrieval password.
- Store: customer_configs table with CRUD + per-customer API key lookup
- API: GET /api/v1/config/{id} with X-Retrieval-Password auth
- Auth: per-customer API keys alongside existing global key (backward compatible)
- Web UI: /configs list, create, edit, delete, YAML preview, copy-to-clipboard
- YAML gen: deep-merge controller.yaml.example template with customer overrides
- Template fetcher: background goroutine refreshing template from Gitea repo
- Navigation: Dashboard / Configurations tabs on all pages
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
# felhom-hub

**Central operator dashboard for monitoring and managing Felhom customer deployments.**

A lightweight Go service that receives periodic reports from felhom-controller instances, stores them in SQLite, and provides a web dashboard for fleet monitoring. Also serves as the infrastructure backup store for disaster recovery.

**Current version: v0.2.0**

---

## Architecture

```
Customer nodes                            Central Hub (k3s)
┌───────────────────┐                     ┌─────────────────────────┐
│ felhom-controller │──── JSON push ─────▶│  felhom-hub             │
│ (every 15 min)    │    (Bearer auth)    │                         │
│                   │                     │  ┌───────────────────┐  │
│ POST /api/v1/     │                     │  │ API Handler       │  │
│   report          │                     │  │ (ingest reports,  │  │
│   infra-backup    │                     │  │  infra backups)   │  │
│   notify          │                     │  └─────────┬─────────┘  │
│                   │                     │            │            │
└───────────────────┘                     │  ┌─────────▼─────────┐  │
                                          │  │ SQLite Store      │  │
Operator browser                          │  │ (reports,         │  │
┌───────────────────┐                     │  │  infra_backups,   │  │
│ Web Dashboard     │◀── HTML pages ──────│  │  notifications)   │  │
│ (hub.felhom.eu)   │    (bcrypt auth)    │  └───────────────────┘  │
└───────────────────┘                     │                         │
                                          │  ┌───────────────────┐  │
                                          │  │ Web Dashboard     │  │
                                          │  │ (multi-customer   │  │
                                          │  │  overview)        │  │
                                          │  └───────────────────┘  │
                                          └─────────────────────────┘
```

## API Endpoints

All API endpoints require `Authorization: Bearer <api_key>` (except `/healthz` and `/api/v1/config/{id}`). Auth accepts both the global `report_api_key` and per-customer API keys (generated when creating customer configs).

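
The dual-key check described above could look roughly like the following middleware. This is a minimal sketch, not the Hub's actual code: `requireBearer`, `isCustomerKey`, `okHandler`, and `statusFor` are hypothetical names, and the per-customer lookup is reduced to a callback that would in practice query the `customer_configs` table.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// requireBearer rejects requests whose Bearer token is neither the global key
// nor a per-customer key. isCustomerKey is a hypothetical stand-in for a
// lookup against the customer_configs table.
func requireBearer(globalKey string, isCustomerKey func(string) bool, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		header := r.Header.Get("Authorization")
		token := strings.TrimPrefix(header, "Bearer ")
		if token == header || token == "" { // prefix missing or token empty
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		// Constant-time comparison avoids leaking the global key via timing.
		isGlobal := globalKey != "" &&
			subtle.ConstantTimeCompare([]byte(token), []byte(globalKey)) == 1
		if !isGlobal && !isCustomerKey(token) {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// okHandler is a trivial protected endpoint used for the demo.
func okHandler() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
}

// statusFor sends one in-memory request through h and returns the status code.
func statusFor(h http.Handler, authHeader string) int {
	req := httptest.NewRequest(http.MethodPost, "/api/v1/report", nil)
	if authHeader != "" {
		req.Header.Set("Authorization", authHeader)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	customerKeys := map[string]bool{"customer-key": true} // example data only
	h := requireBearer("global-key", func(k string) bool { return customerKeys[k] }, okHandler())

	fmt.Println(statusFor(h, "Bearer global-key"))   // global key
	fmt.Println(statusFor(h, "Bearer customer-key")) // per-customer key
	fmt.Println(statusFor(h, "Bearer wrong"))        // unknown key
}
```

Checking the global key first keeps existing controllers working unchanged, which matches the backward-compatible behaviour described above.
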
### Report Ingest

| Method | Path | Description |
|--------|------|-------------|
| `POST` | `/api/v1/report` | Controller pushes periodic status report |
| `GET` | `/api/v1/customers` | List all customers with latest report summary |
| `GET` | `/api/v1/customers/{id}` | Get latest full report for a customer |
| `GET` | `/api/v1/customers/{id}/history?period=7d` | Get report history |

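
The `period=7d` query value cannot be handed directly to Go's `time.ParseDuration`, which has no day unit. The sketch below shows one way such a value might be parsed. Which formats the Hub actually accepts is not documented here, so the `d` suffix handling and the fallback to standard Go durations are assumptions, and `parsePeriod` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// parsePeriod converts values such as "7d" or "24h" into a time.Duration.
// time.ParseDuration has no day unit, so a "d" suffix is handled separately.
// Only positive periods are accepted (an assumption for this sketch).
func parsePeriod(s string) (time.Duration, error) {
	if strings.HasSuffix(s, "d") {
		n, err := strconv.Atoi(strings.TrimSuffix(s, "d"))
		if err != nil || n <= 0 {
			return 0, fmt.Errorf("invalid period %q", s)
		}
		return time.Duration(n) * 24 * time.Hour, nil
	}
	d, err := time.ParseDuration(s)
	if err != nil || d <= 0 {
		return 0, fmt.Errorf("invalid period %q", s)
	}
	return d, nil
}

func main() {
	for _, p := range []string{"7d", "24h", "soon"} {
		d, err := parsePeriod(p)
		fmt.Println(p, d, err)
	}
}
```
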
### Infrastructure Backup (Disaster Recovery)

| Method | Path | Description |
|--------|------|-------------|
| `POST` | `/api/v1/infra-backup` | Controller pushes infrastructure snapshot |
| `GET` | `/api/v1/infra-backup/{customer_id}` | Fresh controller pulls backup for restore |

The infra-backup payload contains everything needed to restore a customer deployment:

- `controller.yaml` (base64, full config including secrets)
- `settings.json` (base64, backup preferences, storage paths)
- Disk layout (UUIDs, labels, mount points, fstab options, bind-mount topology)
- Deployed stacks manifest (app names, HDD paths, display names)
- Restic passwords (primary + cross-drive, for encrypted backup access)

**Disaster recovery flow:**

1. Customer's system drive fails → replaced with fresh Debian install
2. `docker-setup.sh` deploys controller with Hub details (customer_id + API key)
3. Controller detects fresh deployment → calls `GET /api/v1/infra-backup/{customer_id}`
4. Controller uses disk UUIDs to auto-mount surviving drives
5. Controller restores apps from local backups on those drives

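
To make the payload and step 3 of the recovery flow more concrete, here is a hedged sketch of how a controller might model the backup and build the restore request. The struct layout and every JSON field name are assumptions, since the real schema is not shown in this README, and the Hub base URL and key in `main` are example values. The sketch relies on the fact that Go's `encoding/json` encodes `[]byte` fields as base64 strings, which lines up with the base64 fields listed above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// disk describes one entry of the disk layout. Field names are hypothetical.
type disk struct {
	UUID       string `json:"uuid"`
	Label      string `json:"label"`
	MountPoint string `json:"mount_point"`
}

// infraBackup mirrors part of the payload described above. All JSON field
// names are hypothetical. []byte fields travel as base64 strings in JSON.
type infraBackup struct {
	CustomerID     string `json:"customer_id"`
	ControllerYAML []byte `json:"controller_yaml"`
	SettingsJSON   []byte `json:"settings_json"`
	Disks          []disk `json:"disks"`
}

// newRestoreRequest builds the GET request from step 3 of the recovery flow.
func newRestoreRequest(hubURL, customerID, apiKey string) (*http.Request, error) {
	endpoint := strings.TrimRight(hubURL, "/") +
		"/api/v1/infra-backup/" + url.PathEscape(customerID)
	req, err := http.NewRequest(http.MethodGet, endpoint, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)
	return req, nil
}

// decodeBackup parses a response body into an infraBackup value.
func decodeBackup(body []byte) (infraBackup, error) {
	var b infraBackup
	err := json.Unmarshal(body, &b)
	return b, err
}

func main() {
	req, err := newRestoreRequest("https://hub.felhom.eu", "acme-01", "example-key")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)

	// Round-trip a payload to show the base64 handling.
	raw, err := json.Marshal(infraBackup{
		CustomerID:     "acme-01",
		ControllerYAML: []byte("customer:\n  id: acme-01\n"),
		Disks:          []disk{{UUID: "example-uuid", Label: "data1", MountPoint: "/mnt/data1"}},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(raw))

	back, err := decodeBackup(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", back.ControllerYAML)
}
```

Because the payload carries secrets such as the full `controller.yaml` and restic passwords, a real client would only send this request over TLS.
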
### Notifications

| Method | Path | Description |
|--------|------|-------------|
| `POST` | `/api/v1/notify` | Controller sends event notification (backup_failed, disk_warning, etc.) |
| `POST` | `/api/v1/preferences` | Controller syncs customer notification preferences |

Notifications are sent via Resend.com email API.

### Customer Config Retrieval

| Method | Path | Description |
|--------|------|-------------|
| `GET` | `/api/v1/config/{customer_id}` | Download generated controller.yaml (auth: `X-Retrieval-Password` header) |

Config retrieval uses a separate per-customer retrieval password (not the API key). The Hub generates a complete `controller.yaml` by deep-merging `controller.yaml.example` (periodically fetched from the Gitea repo) with customer-specific overrides (identity, infrastructure tokens, hub API key, session secret).

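
The deep-merge step can be illustrated with a small recursive function over decoded YAML maps. This is a sketch of the general technique, not the Hub's implementation: `deepMerge` is a hypothetical name, the keys in `main` are made-up examples, and the rule that non-map values (including lists) are replaced wholesale is an assumption.

```go
package main

import "fmt"

// deepMerge returns a new map containing base with override applied on top.
// Where both sides hold a nested map, the maps are merged key by key; any
// other override value replaces the base value. base itself is not modified,
// although nested maps that are not overridden are shared, not copied.
func deepMerge(base, override map[string]any) map[string]any {
	out := make(map[string]any, len(base))
	for k, v := range base {
		out[k] = v
	}
	for k, v := range override {
		baseMap, baseIsMap := out[k].(map[string]any)
		overMap, overIsMap := v.(map[string]any)
		if baseIsMap && overIsMap {
			out[k] = deepMerge(baseMap, overMap)
			continue
		}
		out[k] = v
	}
	return out
}

func main() {
	// Stand-in for the parsed controller.yaml.example template (example keys).
	template := map[string]any{
		"customer": map[string]any{"id": "", "name": ""},
		"hub":      map[string]any{"url": "https://hub.felhom.eu", "api_key": ""},
	}
	// Stand-in for one customer's overrides.
	overrides := map[string]any{
		"customer": map[string]any{"id": "acme-01"},
		"hub":      map[string]any{"api_key": "per-customer-key"},
	}
	fmt.Println(deepMerge(template, overrides))
}
```

Merging into the template rather than storing whole files means new keys added to `controller.yaml.example` reach every generated config without editing the stored overrides.
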
### Health

| Method | Path | Description |
|--------|------|-------------|
| `GET` | `/healthz` | Health check (no auth required) |

## Web Dashboard

Protected by bcrypt password + session cookie (7-day expiry).

- **Customer overview table:** status indicators (OK/WARN/DOWN), CPU/memory %, disk usage, container counts, backup age, controller version
- **Customer detail page:** system info, storage bars, container table, notification preferences, notification log, 24h history graphs
- **Configurations page:** CRUD management for customer configs: pre-configure customer identity, infrastructure secrets, monitoring UUIDs; auto-generates retrieval password + per-customer API key; shows setup commands (`docker-setup.sh` and `curl`); YAML preview
- **Auto-refresh:** 60-second cycle
- **Status logic:**
  - Green: report < 30 min old, health = ok
  - Yellow: 30-60 min stale or health = warn
  - Red: > 60 min stale or health = fail

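
The status rules above can be written as a single function. The sketch below is one reading of them, with two assumptions labeled in the code: a report that is exactly 60 minutes old still counts as yellow, and an unrecognized health value is treated as red. `statusColor` is a hypothetical name.

```go
package main

import (
	"fmt"
	"time"
)

// statusColor maps report age and reported health to a dashboard color.
// Red conditions are checked first so that a failing or very stale customer
// is never shown as yellow.
func statusColor(age time.Duration, health string) string {
	switch {
	case age > 60*time.Minute || health == "fail":
		return "red"
	case age >= 30*time.Minute || health == "warn":
		// Assumption: exactly 60 minutes still counts as yellow.
		return "yellow"
	case health == "ok":
		return "green"
	default:
		// Assumption: an unknown health value is treated as a failure.
		return "red"
	}
}

func main() {
	fmt.Println(statusColor(10*time.Minute, "ok"))
	fmt.Println(statusColor(45*time.Minute, "ok"))
	fmt.Println(statusColor(10*time.Minute, "warn"))
	fmt.Println(statusColor(90*time.Minute, "ok"))
}
```
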
## Data Storage

SQLite with WAL mode. Tables:

| Table | Purpose |
|-------|---------|
| `reports` | Full JSON reports with denormalized fields for dashboard queries |
| `infra_backups` | Per-customer infrastructure snapshots for disaster recovery |
| `customer_notifications` | Email + enabled event types per customer |
| `notification_log` | Send/skip/fail history for notifications |
| `customer_configs` | Pre-configured customer settings, retrieval passwords, per-customer API keys |

Retention: configurable (default 90 days), daily prune at 04:30 Budapest time.

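
As an illustration of the retention rule, the sketch below computes the next prune time in the Europe/Budapest time zone and the cutoff before which rows would be deleted. It is an assumption-laden example rather than the Hub's scheduler: `nextPrune`, `pruneCutoff`, `budapest`, and `mustTime` are hypothetical helpers, and which tables the prune covers is not specified here. The blank import of `time/tzdata` embeds the time zone database so the lookup also works in minimal containers.

```go
package main

import (
	"fmt"
	"time"
	_ "time/tzdata" // embed zone data so Europe/Budapest resolves everywhere
)

// nextPrune returns the first occurrence of schedule ("HH:MM", local to loc)
// that is strictly after now.
func nextPrune(now time.Time, schedule string, loc *time.Location) (time.Time, error) {
	hm, err := time.Parse("15:04", schedule)
	if err != nil {
		return time.Time{}, fmt.Errorf("invalid prune schedule %q: %w", schedule, err)
	}
	local := now.In(loc)
	next := time.Date(local.Year(), local.Month(), local.Day(),
		hm.Hour(), hm.Minute(), 0, 0, loc)
	if !next.After(local) {
		next = next.AddDate(0, 0, 1) // today's slot has passed, use tomorrow
	}
	return next, nil
}

// pruneCutoff returns the timestamp before which data counts as expired.
func pruneCutoff(now time.Time, maxDays int) time.Time {
	return now.AddDate(0, 0, -maxDays)
}

// budapest loads the zone used for the prune schedule.
func budapest() *time.Location {
	loc, err := time.LoadLocation("Europe/Budapest")
	if err != nil {
		panic(err)
	}
	return loc
}

// mustTime parses an RFC 3339 timestamp for the demo.
func mustTime(rfc3339 string) time.Time {
	t, err := time.Parse(time.RFC3339, rfc3339)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	now := mustTime("2025-03-01T12:00:00Z")
	next, err := nextPrune(now, "04:30", budapest())
	if err != nil {
		panic(err)
	}
	fmt.Println("next prune:", next)
	fmt.Println("cutoff:    ", pruneCutoff(now, 90))
}
```
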
## Configuration

```yaml
# hub.yaml
auth:
  password_hash: ""           # bcrypt hash for dashboard login (empty = no auth)

api:
  report_api_key: ""          # Bearer token for API auth

notifications:
  resend_api_key: ""          # Resend.com API key for email
  from_email: "monitoring@felhom.eu"

retention:
  max_days: 90
  prune_schedule: "04:30"

alerting:
  stale_threshold: "30m"      # Customer considered stale after this duration

registry:
  image: "gitea.dooplex.hu/admin/felhom-controller"
  username: ""                # Gitea registry credentials
  token: ""
  check_interval: "30m"       # How often to check for new controller versions
  template_interval: "1h"     # How often to refresh controller.yaml.example

server:
  listen: ":8080"
  data_dir: "/data"           # SQLite database location
```

## Deployment

Runs on k3s (Kubernetes) in the `felhom-system` namespace:

- **PVC:** 1GB Longhorn volume for SQLite database
- **Resources:** 64Mi-256Mi memory, 50m-500m CPU
- **Ingress:** `hub.felhom.eu` with TLS (cert-manager)
- **Geo-restriction:** Hungary only (nginx annotation)

```bash
# Build and push
cd hub/
make VERSION=0.2.0 docker docker-push

# Deploy
kubectl set image -n felhom-system deploy/hub hub=gitea.dooplex.hu/admin/felhom-hub:v0.2.0
kubectl rollout status -n felhom-system deploy/hub

# Check
kubectl logs -n felhom-system -l app=hub --tail 20
```

## Dependencies

- `golang.org/x/crypto`: bcrypt for password hashing
- `gopkg.in/yaml.v3`: YAML config parsing
- `modernc.org/sqlite`: Pure Go SQLite (no CGo)