added controller
@@ -0,0 +1,283 @@
# felhom-controller

**Central management container for Felhom home servers.**

Replaces Portainer + scattered systemd scripts with a single, lightweight container that provides:
- Hungarian-language web dashboard for customers
- Docker Compose stack management (start/stop/update)
- Backup orchestration (DB dumps + restic snapshots)
- System health monitoring with Healthchecks pings
- Git-based stack synchronization with update management
- Self-update with automatic rollback on failure

## Architecture

```
┌───────────────────────────────────────────────────────────────────┐
│  Customer Hardware (N100 mini PC / Raspberry Pi)                  │
│                                                                   │
│  ┌─────────────┐   ┌───────────────────────────────────────────┐  │
│  │ Traefik     │   │             felhom-controller             │  │
│  │ (reverse    │──▶│                                           │  │
│  │  proxy)     │   │  ┌──────────┐  ┌─────────────────────────┐│  │
│  └─────────────┘   │  │ Web UI   │  │ Stack Manager           ││  │
│                    │  │ (HU dash-│  │ (compose up/down/pull,  ││  │
│  ┌─────────────┐   │  │  board)  │  │  git sync, update mgmt) ││  │
│  │ cloudflared │   │  └──────────┘  └─────────────────────────┘│  │
│  │ (tunnel)    │   │  ┌──────────┐  ┌─────────────────────────┐│  │
│  └─────────────┘   │  │ Backup   │  │ Monitor & Pinger        ││  │
│                    │  │ (db dump │  │ (healthchecks pings,    ││  │
│  ┌─────────────┐   │  │  restic) │  │  system metrics)        ││  │
│  │ App         │   │  └──────────┘  └─────────────────────────┘│  │
│  │ stacks      │   │  ┌──────────┐  ┌─────────────────────────┐│  │
│  │ (docker     │   │  │Scheduler │  │ REST API                ││  │
│  │  compose)   │   │  │(cron-like│  │ (for UI + remote mgmt)  ││  │
│  └─────────────┘   │  │ jobs)    │  └─────────────────────────┘│  │
│                    │  └──────────┘                             │  │
│                    └───────────────────────────────────────────┘  │
└───────────────────────────────────────────────────────────────────┘
              │ pings                          │ git pull
              ▼                                ▼
      status.felhom.eu                 gitea.dooplex.hu
      (Healthchecks on k3s)            (stack definitions)
```

## Module Overview

| Module | Path | Responsibility |
|--------|------|----------------|
| **Config** | `internal/config/` | Load & validate controller.yaml |
| **Stacks** | `internal/stacks/` | Docker Compose operations, catalog, container status |
| **Backup** | `internal/backup/` | DB dumps, restic snapshots, restore |
| **Monitor** | `internal/monitor/` | Health checks, Healthchecks pings, system metrics |
| **Scheduler** | `internal/scheduler/` | Cron-like job runner for all periodic tasks |
| **API** | `internal/api/` | REST API endpoints (consumed by web UI + remote mgmt) |
| **Web** | `internal/web/` | Dashboard UI, static files, server-side templates |

## Stack Management

### How stacks get onto the machine

1. During initial setup, `deploy-felhom-compose.sh` clones the app catalog
2. Compose files + `.felhom.yml` metadata land in `/opt/docker/stacks/<app>/`
3. The controller periodically pulls from Git to detect changes

### First deployment flow (via dashboard)

1. Customer sees app card with "🚀 Telepítés" (Deploy) button
2. Clicks → deploy page shows:
   - **Auto-filled**: DOMAIN (from controller config), read-only
   - **Auto-generated**: DB passwords, secret keys (shown as "✓ Generated")
   - **User input**: HDD path, admin password, language, etc.
   - **"🎲 Generálás"** (Generate) button next to password fields
3. Clicks "Telepítés" → controller:
   - Generates all secrets
   - Validates required fields (e.g. checks that the HDD path exists)
   - Saves `app.yaml` (env vars + locked fields list)
   - Runs `docker compose up -d` with env vars injected
4. Post-deploy: locked fields (DB_PASSWORD, etc.) become read-only

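
The secret-generation step of the deploy flow could be sketched roughly as follows. This is a minimal illustration, not the code in `internal/stacks/deploy.go`: the helper name `generateHexSecret` and the 32-byte length are assumptions made for this example.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// generateHexSecret returns n cryptographically random bytes encoded as a
// hex string (2*n characters). Hypothetical helper, named for this sketch.
func generateHexSecret(n int) (string, error) {
	buf := make([]byte, n)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

func main() {
	secret, err := generateHexSecret(32) // 32 bytes is an assumed default
	if err != nil {
		panic(err)
	}
	fmt.Println(len(secret)) // prints 64
}
```

Using `crypto/rand` rather than `math/rand` matters here because the generated values end up as database passwords and application secret keys.
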
### Update strategy

Stack updates are classified in the Git repository via markers:

| Marker | Behavior |
|--------|----------|
| No marker | Optional update: shown on the dashboard, customer clicks "Update" |
| `UPDATE_REQUIRED=true` | Mandatory: auto-applied during the next update window |
| `UPDATE_SECURITY=true` | Critical: applied immediately (within minutes) |

The update window is configurable per customer (default: 03:00-05:00 local time).

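
The decision logic implied by the marker table might look like the sketch below. The README does not say where the markers are stored or how they are parsed, so the `map[string]string` input, the type and function names, and the rule that the security marker wins over the required marker are all assumptions.

```go
package main

import (
	"fmt"
	"time"
)

// UpdateClass is a hypothetical enum for the three update categories.
type UpdateClass int

const (
	UpdateOptional UpdateClass = iota // no marker: customer clicks "Update"
	UpdateRequired                    // UPDATE_REQUIRED=true
	UpdateSecurity                    // UPDATE_SECURITY=true
)

// classify maps marker key/value pairs to an update class.
// Assumption: the security marker takes precedence if both are set.
func classify(markers map[string]string) UpdateClass {
	switch {
	case markers["UPDATE_SECURITY"] == "true":
		return UpdateSecurity
	case markers["UPDATE_REQUIRED"] == "true":
		return UpdateRequired
	default:
		return UpdateOptional
	}
}

// hourInWindow reports whether hour lies in [start, end).
// Windows that cross midnight are not handled in this sketch.
func hourInWindow(hour, start, end int) bool {
	return hour >= start && hour < end
}

// shouldApplyNow decides whether an update is applied without user action,
// using the default 03:00-05:00 window.
func shouldApplyNow(c UpdateClass, now time.Time) bool {
	switch c {
	case UpdateSecurity:
		return true
	case UpdateRequired:
		return hourInWindow(now.Hour(), 3, 5)
	default:
		return false
	}
}

func main() {
	required := classify(map[string]string{"UPDATE_REQUIRED": "true"})
	// UTC keeps the demo deterministic; a real controller would use the
	// customer's local time zone.
	night := time.Date(2026, 2, 13, 3, 30, 0, 0, time.UTC)
	noon := time.Date(2026, 2, 13, 12, 0, 0, 0, time.UTC)
	fmt.Println(shouldApplyNow(required, night), shouldApplyNow(required, noon)) // true false
}
```
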
### Protected stacks

The following stacks cannot be stopped from the customer UI:
- `traefik` (reverse proxy)
- `cloudflared` (tunnel)
- `felhom-controller` (this container)

## Backup Strategy

The controller replaces Backrest and manages backups directly:

1. **DB dumps** (default 02:30): Discovers running database containers, dumps via `pg_dump`/`mysqldump`
2. **Restic snapshots** (default 03:00): Backs up `/opt/docker/stacks/` data + DB dumps
3. **Verification**: Periodically checks snapshot integrity
4. **Pruning**: Configurable retention (default: 7 daily, 4 weekly, 6 monthly)

Backup status is displayed on the dashboard and reported to Healthchecks.

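
As an illustration of the pruning step, the default retention policy maps naturally onto the `--keep-daily`, `--keep-weekly`, `--keep-monthly` and `--prune` flags of `restic forget`. The sketch below only builds the argument list; the `Retention` type and `forgetArgs` function are hypothetical names, and how the controller actually invokes restic is not described in this README.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Retention holds the number of snapshots to keep per period.
// Hypothetical type for this sketch.
type Retention struct {
	Daily, Weekly, Monthly int
}

// forgetArgs builds the argument list for `restic forget` with pruning.
func forgetArgs(r Retention) []string {
	return []string{
		"forget",
		"--keep-daily", strconv.Itoa(r.Daily),
		"--keep-weekly", strconv.Itoa(r.Weekly),
		"--keep-monthly", strconv.Itoa(r.Monthly),
		"--prune",
	}
}

func main() {
	args := forgetArgs(Retention{Daily: 7, Weekly: 4, Monthly: 6})
	fmt.Println("restic " + strings.Join(args, " "))
	// prints: restic forget --keep-daily 7 --keep-weekly 4 --keep-monthly 6 --prune
}
```

In a real job the slice would typically be passed to `exec.Command("restic", args...)`, with the repository location and password supplied separately.
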
## Self-Update Mechanism

1. Controller checks for new image versions periodically
2. Before updating: creates a restic snapshot of its own config
3. Pulls new image, recreates container
4. Health check: if the new container does not become healthy within 60 s → rollback
5. Rollback: restores previous image tag, restarts with old config

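
The health-gated rollback can be sketched as a polling loop with a deadline. Everything here is an assumption for illustration: `update`, `rollback` and `check` are placeholder callbacks standing in for "pull + recreate", "restore previous tag" and the container health probe, and the polling interval of one tenth of the timeout is arbitrary.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// waitHealthy polls check until it returns true or the timeout expires.
func waitHealthy(ctx context.Context, check func() bool, timeout, interval time.Duration) error {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		if check() {
			return nil
		}
		select {
		case <-ctx.Done():
			return errors.New("container did not become healthy in time")
		case <-ticker.C:
		}
	}
}

// selfUpdate applies the update and rolls back if the health check fails.
func selfUpdate(update, rollback func() error, check func() bool, timeout time.Duration) error {
	if err := update(); err != nil {
		return err
	}
	if err := waitHealthy(context.Background(), check, timeout, timeout/10); err != nil {
		if rbErr := rollback(); rbErr != nil {
			return fmt.Errorf("rollback failed: %w", rbErr)
		}
		return fmt.Errorf("update rolled back: %w", err)
	}
	return nil
}

func main() {
	rolledBack := false
	err := selfUpdate(
		func() error { return nil },                    // pretend the new image was pulled
		func() error { rolledBack = true; return nil }, // pretend the old tag was restored
		func() bool { return false },                   // new container never becomes healthy
		100*time.Millisecond,                           // 60 s in the text; shortened for the demo
	)
	fmt.Println(err != nil, rolledBack) // true true
}
```
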
## Configuration

### Controller config (infrastructure only)

Single YAML file per customer: `/opt/docker/felhom-controller/controller.yaml`

Contains customer identity, infrastructure secrets, backup/monitoring settings.
Does **not** contain app-specific config (HDD paths, DB passwords, etc.).

See `configs/controller.yaml.example` for the full reference.

### Per-app config (created during deployment)

Each deployed app gets an `app.yaml` in its stack directory:

```yaml
# /opt/docker/stacks/paperless-ngx/app.yaml
# Auto-generated by felhom-controller; do not edit locked fields manually
deployed: true
deployed_at: "2026-02-13T14:30:00Z"
env:
  DOMAIN: "demo-felhom.eu"           # locked
  DB_PASSWORD: "a7f2b9c1e4d..."      # locked
  PAPERLESS_SECRET_KEY: "8b3e..."    # locked
  PAPERLESS_ADMIN_USER: "admin"      # editable
  HDD_PATH: "/mnt/hdd_1"             # locked
locked_fields:
  - DB_PASSWORD
  - PAPERLESS_SECRET_KEY
  - DOMAIN
  - HDD_PATH
```

Fields are defined in each stack's `.felhom.yml` metadata file. See
`configs/example-felhom-metadata.yml` for the full format.

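
One way to enforce the `locked_fields` list when a customer edits settings after deployment is sketched below. The `AppConfig` type and `SetEnv` method are hypothetical and only mirror the two relevant keys of `app.yaml`; YAML loading is left out to keep the example free of third-party dependencies.

```go
package main

import "fmt"

// AppConfig mirrors the env and locked_fields keys of app.yaml.
// Hypothetical type for this sketch.
type AppConfig struct {
	Env          map[string]string
	LockedFields []string
}

// isLocked reports whether key appears in LockedFields.
func (c *AppConfig) isLocked(key string) bool {
	for _, f := range c.LockedFields {
		if f == key {
			return true
		}
	}
	return false
}

// SetEnv updates an env var unless the field is locked.
func (c *AppConfig) SetEnv(key, value string) error {
	if c.isLocked(key) {
		return fmt.Errorf("field %s is locked after deployment", key)
	}
	c.Env[key] = value
	return nil
}

func main() {
	cfg := &AppConfig{
		Env:          map[string]string{"DB_PASSWORD": "secret", "PAPERLESS_ADMIN_USER": "admin"},
		LockedFields: []string{"DB_PASSWORD"},
	}
	fmt.Println(cfg.SetEnv("DB_PASSWORD", "changed"))       // field DB_PASSWORD is locked after deployment
	fmt.Println(cfg.SetEnv("PAPERLESS_ADMIN_USER", "root")) // <nil>
}
```
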
### App assets (logos, screenshots, descriptions)

Assets are baked into the container image at build time, so there are no external dependencies at runtime.
They are synced from the felhom.eu website repo before building:

```bash
make sync-assets                            # copies from ../felhom.eu/website/assets/
make sync-assets WEBSITE_ASSETS_DIR=/path   # or specify custom path
```

Served locally at `/static/assets/`. Naming convention matches the website:

| Asset | File pattern | Served at |
|-------|-------------|-----------|
| Logo (SVG) | `assets/{slug}-logo.svg` | `/static/assets/{slug}-logo.svg` |
| Logo (PNG fallback) | `assets/{slug}-logo.png` | `/static/assets/{slug}-logo.png` |
| Screenshot | `assets/{slug}-screenshot-{n}.webp` | `/static/assets/{slug}-screenshot-{n}.webp` |

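
A template helper that follows the naming convention in the table could look like this. The function names are hypothetical, and the example assumes that the slug equals the stack directory name and that screenshots are numbered from 1; neither is stated in this README.

```go
package main

import "fmt"

// logoURL returns the local URL of an app's SVG logo.
func logoURL(slug string) string {
	return fmt.Sprintf("/static/assets/%s-logo.svg", slug)
}

// screenshotURL returns the local URL of the n-th screenshot.
func screenshotURL(slug string, n int) string {
	return fmt.Sprintf("/static/assets/%s-screenshot-%d.webp", slug, n)
}

func main() {
	fmt.Println(logoURL("paperless-ngx"))          // /static/assets/paperless-ngx-logo.svg
	fmt.Println(screenshotURL("paperless-ngx", 1)) // /static/assets/paperless-ngx-screenshot-1.webp
}
```
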
## Build & Deploy

```bash
# Build for both architectures
make build-all

# Build Docker image
make docker-build

# Push to registry
make docker-push

# Build for specific arch
make build-amd64
make build-arm64
```

## Development

```bash
# Run locally (needs Docker socket)
go run ./cmd/controller/ --config configs/controller.yaml.example

# Run tests
go test ./...

# Lint
golangci-lint run
```

## Repository Layout

```
felhom-controller/
├── cmd/controller/              # Entry point
│   └── main.go
├── internal/
│   ├── config/                  # Configuration loading
│   │   └── config.go
│   ├── stacks/                  # Docker Compose stack management
│   │   ├── manager.go           # Core: scan, start, stop, restart, update, logs
│   │   ├── metadata.go          # Parse .felhom.yml app metadata
│   │   └── deploy.go            # First-deploy flow: secret gen, app.yaml, compose up
│   ├── backup/                  # DB dumps + restic operations (Phase 3)
│   ├── monitor/                 # Health checks + metrics (Phase 2)
│   ├── scheduler/               # Periodic job runner (Phase 2)
│   ├── api/                     # REST API
│   │   └── router.go
│   └── web/                     # Dashboard UI
│       ├── server.go            # HTTP server, auth, page handlers
│       └── templates.go         # Embedded HTML templates + CSS (Hungarian)
├── configs/                     # Example config files
│   ├── controller.yaml.example
│   └── example-felhom-metadata.yml
├── docs/
│   └── BUILDING.md              # Container image build & registry guide
├── scripts/
│   └── hashpass.go              # Password hash generator
├── Dockerfile                   # Multi-stage build (Go + debian-slim)
├── docker-compose.yml           # Controller's own compose definition
├── Makefile                     # Build targets (amd64, arm64, docker)
├── go.mod
└── README.md
```

## Status & Roadmap

### Phase 1: Stack Manager + Deploy Flow (current)
- [x] Project skeleton & config format
- [x] .felhom.yml app metadata format with deploy fields
- [x] Per-app config persistence (app.yaml)
- [x] Secret generation engine (password, hex, static)
- [x] Stack catalog (read compose files + metadata from disk)
- [x] Docker Compose operations (up/down/pull/ps/logs)
- [x] Deploy flow with interactive field input
- [x] Basic web dashboard with start/stop/deploy buttons
- [x] REST API for stack + deploy operations
- [x] Simple web authentication (bcrypt sessions)
- [x] App logos + screenshots synced from the felhom.eu website repo
- [x] Container image build pipeline (Dockerfile + Makefile)
- [ ] First build & test on N100 hardware
- [ ] End-to-end test: deploy an app through dashboard

### Phase 2: Monitoring & Health
- [ ] System metrics collection (CPU, RAM, disk, temperature)
- [ ] Healthchecks.io ping integration
- [ ] Dashboard system health panel
- [ ] Customer notifications (email/Telegram)

### Phase 3: Backups
- [ ] DB dump engine (PostgreSQL, MariaDB/MySQL, SQLite)
- [ ] Restic integration (snapshot, prune, check)
- [ ] Backup status on dashboard
- [ ] Manual backup trigger from UI
- [ ] Restore workflow

### Phase 4: Git Sync & Updates
- [ ] Periodic git pull for stack definitions
- [ ] Update classification (optional/required/security)
- [ ] Update window enforcement
- [ ] Dashboard update notifications with "Update" button

### Phase 5: Self-Update & Resilience
- [ ] Self-update check & execution
- [ ] Pre-update config backup
- [ ] Health-based rollback mechanism
- [ ] Config export/import

### Phase 6: Central Management (future)
- [ ] API authentication for remote management
- [ ] Central dashboard on k3s querying all customer controllers
- [ ] Fleet-wide update management