Felhom Controller Architecture — Part 1: Topology & Trust

Status: draft (decisions from the topology/trust design sessions). Platform facts referenced here live in docs/proxmox-platform.md; this document records Felhom's decisions, not Proxmox behaviour.

1. Model at a glance

Three components. Control is always box-initiated — the hub never connects into a customer box.

        operator side                     customer box (per Proxmox host)
   ┌───────────────────┐         ┌───────────────────────────────────────────┐
   │       HUB         │         │  Proxmox host                              │
   │ (dooplex.hu, k3s) │         │   ┌──────────────┐                         │
   │  - report sink    │◀──poll──┤   │  HOST AGENT  │  operator-tier          │
   │  - signed jobs    │  signed │   │  (Proxmox    │  • all Proxmox ops      │
   │  - dashboard      │  jobs   │   │   token)     │  • provision / restore  │
   │  - customer record│         │   └──────┬───────┘  • storage mgmt         │
   │  - PBS namespace  │         │          │ local constrained API           │
   └─────────▲─────────┘         │   ┌──────▼───────────────────────────────┐ │
             │                   │   │  customer LXC (one per customer)      │ │
             │  direct, app-     │   │   ┌──────────────┐   Docker:          │ │
             └───────────────────┼───┤   │ IN-GUEST     │   [app] [app] ...  │ │
                 domain reports   │   │   │ CONTROLLER   │ (Docker containers)│ │
                                 │   │   │ (Docker-only)│                    │ │
                                 │   │   └──────────────┘                    │ │
                                 │   └───────────────────────────────────────┘ │
                                 └───────────────────────────────────────────┘
   PBS (offsite) ◀── outbound, client-side-encrypted backups ── customer box
   end-users / customer ◀── Cloudflare Tunnel ── apps + controller UI

2. The customer node

One Proxmox host per box (PVE 9.2, Debian 13, LVM-thin).
Default workload topology: one customer LXC, Docker inside it, each app a Docker container/stack. Apps are isolated at the Docker layer (separate containers, networks, volumes, cgroup limits); they share one LXC/kernel/Docker daemon.
Escape hatch: promote an individual app to its own guest (LXC or VM) only for a specific reason — a non-Linux/Windows app, a genuinely untrusted or exposed app needing hard isolation, or a resource hog needing guarantees.
Multi-tenant: one customer per host is the home default; multiple customer LXCs on one host (a company environment) is not precluded — the agent manages a set of guests. The only multi-tenant-specific work deferred to "if it becomes real" is resource fairness (per-guest disk/RAM/CPU quotas).

3. Components & responsibilities

	Hub	Host agent	In-guest controller
Runs on	dooplex.hu (k3s)	the Proxmox host	the customer LXC
Tier	operator backend	operator (high-privilege)	customer-facing (app)
Holds	customer records, signed-job source, PBS namespaces, escrowed keys	the only Proxmox API token; per-host operator identity	no Proxmox creds; its own hub API key + a local-API token to the agent
Does	reporting sink, dashboard, job queue, source of durable truth	all Proxmox ops (provision, restore, snapshot, backup, storage mgmt, LXC lifecycle); polls hub for signed jobs; exposes a constrained local API to the controller; per-guest authorization gate	Docker/app lifecycle, catalog deploy, customer UI, app-level (data-layer) backup; reports app-domain to the hub directly
Never does	initiate a connection into a box	—	touch the Proxmox API directly

Key separation: the controller manages Docker; the agent manages Proxmox. The controller's only path to guest-level operations (snapshot-before-deploy, "grow my RAM") is a constrained local API call to the agent, which the agent authorizes (scoped to that controller's own guest) and executes with its operator-tier token. This consolidates all Proxmox access and all per-guest authorization in one auditable place and leaves the guest with zero Proxmox credentials.
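
A minimal sketch of the per-guest authorization gate described above, assuming the agent keeps a map from each local-API token it injected to the guest that token belongs to. The operation names, the `Request` shape, and the `AuthorizationGate` class are hypothetical; the actual local API surface is not specified in this document.

```python
from dataclasses import dataclass
from typing import Dict

# Hypothetical operation names; the real constrained API may expose others.
ALLOWED_OPS = frozenset({"snapshot", "rollback", "resize", "backup"})


class Denied(Exception):
    """Raised when the agent refuses a controller request."""


@dataclass(frozen=True)
class Request:
    token: str  # local-API token the agent injected into the controller
    op: str     # requested guest-level operation
    vmid: int   # guest the controller wants the operation applied to


class AuthorizationGate:
    def __init__(self, token_to_vmid: Dict[str, int]) -> None:
        # The agent issued these tokens at provisioning, so it knows each owner.
        self._token_to_vmid = dict(token_to_vmid)

    def authorize(self, req: Request) -> int:
        """Return the guest id to act on, or raise Denied."""
        owner = self._token_to_vmid.get(req.token)
        if owner is None:
            raise Denied("unknown token")
        if req.op not in ALLOWED_OPS:
            raise Denied(f"operation not exposed: {req.op}")
        if req.vmid != owner:
            raise Denied("token is not scoped to that guest")
        return owner


gate = AuthorizationGate({"tok-a": 101, "tok-b": 102})
```

Only after `authorize` returns would the agent call Proxmox with its own operator-tier token, so the controller's token never reaches the Proxmox API.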

4. Control plane — box-initiated

CGNAT does not force this choice: the Cloudflare Tunnel already makes a box reachable through Cloudflare's edge. We choose box-initiated control for the smallest attack surface, since the box then exposes no control endpoint at all.
The agent and the controller poll the hub; the hub never initiates inbound.
Operator actions are delivered as signed jobs: the agent verifies an operator signature before executing, so a compromised hub database alone cannot forge commands.
All operator-initiated actions are recorded in a customer-visible audit log.
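
The verify-before-execute step can be sketched as below. This is an illustration only: it uses HMAC-SHA256 from the Python standard library as a stand-in so the example runs without extra dependencies. The document does not name a signature scheme. An asymmetric scheme, where the box holds only a public verification key, is one plausible fit for the "hub database alone cannot forge" property. The job fields and function names are hypothetical.

```python
import hashlib
import hmac
import json


def canonical(job: dict) -> bytes:
    # Deterministic encoding so signer and verifier hash identical bytes.
    return json.dumps(job, sort_keys=True, separators=(",", ":")).encode()


def sign_job(job: dict, key: bytes) -> str:
    # Operator side. HMAC is a stand-in for the real (unspecified) signature.
    return hmac.new(key, canonical(job), hashlib.sha256).hexdigest()


def verify_job(envelope: dict, key: bytes) -> dict:
    """Agent side: return the job only if its signature checks out."""
    job, sig = envelope["job"], envelope["sig"]
    expected = sign_job(job, key)
    if not hmac.compare_digest(expected.encode(), sig.encode()):
        raise ValueError("bad signature: job rejected")
    return job


operator_key = b"held-by-the-operator-not-the-hub-db"  # assumption
job = {"id": "j-1", "op": "restore", "vmid": 101}
envelope = {"job": job, "sig": sign_job(job, operator_key)}
```

An attacker who can rewrite the queued job in the hub database but does not hold the signing key produces an envelope that `verify_job` rejects, so the agent never executes it.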

5. Trust boundaries

Boundary	What crosses	Mechanism	Blast radius if breached
end-user ↔ apps	app traffic	Cloudflare Tunnel → Traefik (Host routing)	that app
customer ↔ controller UI	management UI	Cloudflare Tunnel; UI auth (bcrypt)	the customer's own box
controller ↔ agent	snapshot/resize/backup requests	local constrained RPC; agent authorizes per-guest	the controller's own guest only
agent ↔ hub	reports + signed jobs	outbound poll; signed jobs	one box; signed jobs limit forgery
controller ↔ hub	app-domain reports/jobs (incl. geo desired-state)	outbound, own API key	app-domain of one customer
box ↔ PBS	encrypted backups	outbound; per-customer namespace; client-side encryption	ciphertext only (operator can't read)
guest ↔ Proxmox host	(none direct)	the guest holds no Proxmox creds; all via the agent	—
hub ↔ Cloudflare API	geo-restriction WAF (enforcement)	the hub holds the CF API token; reconciles geo desired-state → WAF	the customer's zone/WAF

6. Enrollment & identity

Physical presence at provisioning (on-site install, or pre-imaged-and-delivered). This removes any zero-touch remote-enrollment problem.
A one-time retrieval code mints durable identity. Single-use (burned on the successful config fetch) plus a short pre-use TTL; one-click regenerate for the only real failure case (fetch fails before anything is persisted). After the fetch, the code is irrelevant — everything downstream runs on durable credentials, so retries don't need it.
Order: the agent enrolls first (and, running as root at setup, mints its own scoped operator-tier Proxmox token), then provisions the customer LXC from the golden template and deploys the controller into it — injecting the controller's hub API key and its local-API token. The controller is the agent's product, never the other way around.
The hub customer record is the durable source of truth, and it survives box loss: identity, domain, Cloudflare tunnel token, PBS namespace, storage manifest, declarative app inventory, and the escrowed (zero-knowledge) backup key. This is what makes hardware replacement possible.
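
The retrieval-code lifecycle (single-use, short pre-use TTL, regenerate on failure) can be sketched as a small hub-side state holder. The class name, the injected clock, and the `deliver` callback are assumptions made for illustration, not the actual enrollment API.

```python
import secrets
import time
from typing import Callable, Optional


class CodeError(Exception):
    pass


class EnrollmentSlot:
    """Hub-side state for one pending box enrollment (hypothetical)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._code: Optional[str] = None
        self._issued_at = 0.0

    def regenerate(self) -> str:
        # Issues a fresh code; any earlier code stops working immediately.
        self._code = secrets.token_urlsafe(16)
        self._issued_at = self._clock()
        return self._code

    def fetch_config(self, presented: str, deliver: Callable[[], dict]) -> dict:
        if self._code is None:
            raise CodeError("no valid code (already used or never issued)")
        if self._clock() - self._issued_at > self._ttl:
            raise CodeError("code expired before use")
        if not secrets.compare_digest(presented.encode(), self._code.encode()):
            raise CodeError("wrong code")
        config = deliver()  # if this raises, the code has not been burned
        self._code = None   # burned only once the fetch has succeeded
        return config
```

If the response is lost after the code is burned but before the box has persisted anything, `regenerate` issues a new code, matching the one failure case the design calls out.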

7. Networking

Cloudflare Tunnel provides inbound access to apps and the controller UI (the CGNAT solution). Tunnel token lives in the hub record → reused on new hardware during DR, so DNS/routing stay intact through an outage.
Outbound only for control/report/backup (poll to hub, push to PBS). No inbound control endpoint exists in the chosen model.
Tunnel placement: host (resolved, Part 3 §3/§5). cloudflared runs on the Proxmox host as its own agent-managed systemd service — not inside the guest — so the data path survives control-plane death by construction. Geo-restriction WAF is hub-enforced (the hub holds the CF API token; the controller only reports geo desired-state).
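
As an illustration of the hub-side geo reconcile, the sketch below computes which zones need a WAF update by diffing reported desired-state against what the hub last enforced. It assumes geo desired-state is a set of country codes per zone and deliberately leaves out the Cloudflare API call itself; the function name and data shapes are hypothetical.

```python
from typing import Dict, FrozenSet


def plan_geo_updates(
    desired: Dict[str, FrozenSet[str]],
    enforced: Dict[str, FrozenSet[str]],
) -> Dict[str, FrozenSet[str]]:
    """Return the zones whose enforced country set differs from desired-state.

    Zones with no desired entry are left untouched (an assumption).
    """
    updates = {}
    for zone, countries in desired.items():
        if enforced.get(zone) != countries:
            updates[zone] = countries
    return updates


desired = {
    "a.example": frozenset({"HU", "AT"}),
    "b.example": frozenset({"HU"}),
}
enforced = {
    "a.example": frozenset({"HU"}),
    "b.example": frozenset({"HU"}),
}
pending = plan_geo_updates(desired, enforced)
```

Because the plan is a pure diff, running it again after the updates are applied yields an empty result, which keeps the reconcile loop idempotent.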

8. Storage & backup

Tiers (escalating failure scope):

Layer	Mechanism	Survives	Note
Snapshot	LVM-thin snapshot (transient)	logical loss only	whole-LXC rollback; not a backup
Local — second storage	vzdump to `dir`/`nfs`/`cifs`	primary-disk failure (USB) / box death (NAS)	first real backup tier
Offsite — PBS	dedup'd, incremental, encrypted	site loss	the DR substrate; paid tier
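
Read as a restore-source priority (fastest surviving source first), the backup tiers in the table can be sketched as follows. The failure names mirror the "Survives" column; the speed ranking and the source names are assumptions for illustration. The snapshot tier is left out because the table marks it as not a backup.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import FrozenSet, Iterable


class Failure(Enum):
    PRIMARY_DISK = auto()
    BOX_DEATH = auto()
    SITE_LOSS = auto()


@dataclass(frozen=True)
class BackupSource:
    name: str
    speed_rank: int                # lower restores faster (assumed ordering)
    survives: FrozenSet[Failure]   # failures this source is expected to outlive


SOURCES = (
    BackupSource("local-usb", 1, frozenset({Failure.PRIMARY_DISK})),
    BackupSource("local-nas", 2, frozenset({Failure.PRIMARY_DISK, Failure.BOX_DEATH})),
    BackupSource("pbs", 3, frozenset(Failure)),
)


def pick_restore_source(failure: Failure,
                        sources: Iterable[BackupSource] = SOURCES) -> BackupSource:
    """Choose the fastest source that survives the given failure."""
    candidates = [s for s in sources if failure in s.survives]
    if not candidates:
        raise LookupError("no surviving backup source")
    return min(candidates, key=lambda s: s.speed_rank)
```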

Storage manifest (hub-held, agent-reconciled): per target → type, durable identity (UUID / server:/export / repo+fingerprint), class (fast/slow + rough IOPS, set once at attach), role, encrypted credentials, schedule/retention. The agent creates the Proxmox storages, continuously checks presence/reachability, and reports per-target status (a disconnected target → actionable notification).
App data placement is per-volume, not per-app: .felhom.yml classifies each volume hot (DB/config/cache → fast storage, enforced) vs bulk (media/files → may be slow). A photo app's DB stays on SSD while its blobs go to the USB.
Backup scoping: hot data (LXC rootfs) rides the guest vzdump → tiers + PBS. Bulk data on external mount points is excluded from the guest vzdump (per-mount backup flag) and gets its own per-volume policy (file-level to a tier, slower cadence — or explicitly not backed up for re-downloadable content, with the customer informed).
Tiers double as the DR restore-source priority: restore from the fastest surviving source (local if still attachable, PBS on true site loss).
Key custody (zero-knowledge default): three tiers the customer chooses — customer-only / zero-knowledge escrow (default) / operator-managed. Default escrows the PBS passphrase-protected keyfile in the hub, wrapped under a customer recovery code the operator can't open; DR needs the customer's code. Access-notification is an audit signal, never the primary guard. (Don't build bespoke crypto — use PBS's native keyfile passphrase.)
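
The per-volume placement and backup-scoping rules above can be sketched as a small planning function. This models the hot/bulk classification in Python rather than guessing at the `.felhom.yml` schema; every type and field name here is hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class VolumeClass(Enum):
    HOT = "hot"    # DB / config / cache
    BULK = "bulk"  # media / files


class StorageClass(Enum):
    FAST = "fast"
    SLOW = "slow"


@dataclass(frozen=True)
class Volume:
    name: str
    klass: VolumeClass
    backed_up: bool = True  # bulk only: False for re-downloadable content


@dataclass(frozen=True)
class Placement:
    storage: str
    in_guest_vzdump: bool    # rides the guest backup into the tiers + PBS
    file_level_backup: bool  # separate per-volume policy, slower cadence


def plan(volume: Volume, target: str, targets: Dict[str, StorageClass]) -> Placement:
    if volume.klass is VolumeClass.HOT:
        # Enforced: hot data may not land on slow storage.
        if targets[target] is not StorageClass.FAST:
            raise ValueError(f"{volume.name}: hot volumes must sit on fast storage")
        return Placement(target, in_guest_vzdump=True, file_level_backup=False)
    # Bulk mounts are excluded from the guest vzdump and follow their own policy.
    return Placement(target, in_guest_vzdump=False, file_level_backup=volume.backed_up)


targets = {"ssd": StorageClass.FAST, "usb": StorageClass.SLOW}
db = plan(Volume("photos-db", VolumeClass.HOT), "ssd", targets)
blobs = plan(Volume("photos-blobs", VolumeClass.BULK), "usb", targets)
```

This mirrors the photo-app example: the database stays on the SSD inside the guest backup, while the blobs go to the USB target with file-level backup.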

9. Disaster recovery

Guest-loss (host + agent alive): the agent restores the guest from the fastest surviving tier, resets identity (MAC/hostname; see docs/proxmox-platform.md), and boots it; the controller then returns. Validated mechanics: Phase 2.
Host / hardware-loss (agent gone): re-provision (§6) in restore mode. The hub, knowing the customer has PBS backups, hands the freshly-enrolled agent the existing identity + PBS namespace + a restore directive instead of a clean-provision directive. The agent restores from PBS; the controller returns on the same domain (tunnel reused from the hub record). DR = provisioning + a restore mode, not a separate mechanism.
Snapshot-before-deploy: controller asks the agent to snapshot, deploys, runs its post-deploy health check, asks the agent to roll back on failure. (Transient snapshot, §8.)
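
The snapshot-before-deploy flow can be sketched from the controller's side as below. `RecordingAgent` is a stand-in for the agent's local API that only records calls; its method names are hypothetical. Deleting the snapshot in both outcomes is an assumption about what "transient" means here.

```python
from typing import Callable, List


class RecordingAgent:
    """Stand-in for the agent's local API; it only records the calls it gets."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def snapshot(self) -> str:
        self.calls.append("snapshot")
        return "pre-deploy"

    def rollback(self, name: str) -> None:
        self.calls.append(f"rollback:{name}")

    def delete_snapshot(self, name: str) -> None:
        self.calls.append(f"delete:{name}")


def deploy_with_snapshot(agent: RecordingAgent,
                         deploy: Callable[[], None],
                         healthy: Callable[[], bool]) -> bool:
    snap = agent.snapshot()        # 1. ask the agent for a snapshot
    try:
        deploy()                   # 2. deploy the app
        ok = healthy()             # 3. post-deploy health check
    except Exception:
        ok = False                 # a crashed deploy or check counts as failure
    if not ok:
        agent.rollback(snap)       # 4. roll the whole guest back on failure
    agent.delete_snapshot(snap)    # transient: assumed to be dropped either way
    return ok
```

The controller never touches Proxmox in this flow; every snapshot and rollback goes through the agent, which applies its own per-guest authorization.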

10. How this embodies the product values

Zero-knowledge offsite — the operator holds the offsite backup but cannot read it.
Box-initiated control + signed jobs — no standing operator backdoor; a hub compromise alone can't forge commands.
Customer-visible audit log — every operator action is visible to the customer.
Never hold data hostage — subscriptions cover ongoing labour (monitoring, offsite, support, new deployments); the customer's data and deployed apps remain recoverable by the customer (recovery code), with nothing locked behind the operator.

11. Open sub-decisions (carried into later parts)

RTO/RPO targets → drive the backup + offsite-replication schedule (§8).
Offboarding / decommission (scenario 6) — not yet designed; must honour "never hold data hostage" in credential revocation + data hand-off.
Multi-tenant resource fairness — deferred until multi-tenant is real (§2).

Appendix — relationship to the spike

Phase 0 → §2: LXC-default for the workload; overhead numbers.
Phase 1 → §3/§5: validated the privilege boundary (create/allocate is operator-tier). The guest-side scoped-backup-token it proved possible is not used — we chose the agent-mediated path — but it confirmed restore = operator-tier, which shapes the agent.
Phase 2 → §8/§9: backup→restore round-trip; identity reset on restore.

Changelog — design-review + Phase-3 fold-in (2026-06-08)

§5 trust boundaries: added hub ↔ Cloudflare API row (hub holds the CF token, enforces geo→WAF); controller↔hub row notes it carries geo desired-state (S4).
§7 networking: tunnel placement resolved → host (agent-managed systemd service); geo is hub-enforced (S4/S5).
§11 open items: removed the now-resolved tunnel placement and self-update flow entries (S5; self-update designed in 03 §11).
