felhom-agent/docs/architecture/01-topology-and-trust.md
2026-06-08 12:59:37 +02:00


Felhom Controller Architecture — Part 1: Topology & Trust

Status: draft (decisions from the topology/trust design sessions). Platform facts referenced here live in docs/proxmox-platform.md; this document records Felhom's decisions, not Proxmox behaviour.


1. Model at a glance

Three components. Control is always box-initiated — the hub never connects into a customer box.

        operator side                     customer box (per Proxmox host)
   ┌───────────────────┐         ┌───────────────────────────────────────────┐
   │       HUB         │         │  Proxmox host                              │
   │ (dooplex.hu, k3s) │         │   ┌──────────────┐                         │
   │  - report sink    │◀──poll──┤   │  HOST AGENT  │  operator-tier          │
   │  - signed jobs    │  signed │   │  (Proxmox    │  • all Proxmox ops      │
   │  - dashboard      │  jobs   │   │   token)     │  • provision / restore  │
   │  - customer record│         │   └──────┬───────┘  • storage mgmt         │
   │  - PBS namespace  │         │          │ local constrained API           │
   └─────────▲─────────┘         │   ┌──────▼───────────────────────────────┐ │
             │                   │   │  customer LXC (one per customer)      │ │
             │  direct, app-     │   │   ┌──────────────┐   Docker:          │ │
             └───────────────────┼───┤   │ IN-GUEST     │   [app] [app] ...  │ │
                domain reports   │   │   │ CONTROLLER   │   (Docker containers)│
                                 │   │   │ (Docker-only)│                    │ │
                                 │   │   └──────────────┘                    │ │
                                 │   └───────────────────────────────────────┘ │
                                 └───────────────────────────────────────────┘
   PBS (offsite) ◀── outbound, client-side-encrypted backups ── customer box
   end-users / customer ◀── Cloudflare Tunnel ── apps + controller UI

2. The customer node

  • One Proxmox host per box (PVE 9.2, Debian 13, LVM-thin).
  • Default workload topology: one customer LXC, Docker inside it, each app a Docker container/stack. Apps are isolated at the Docker layer (separate containers, networks, volumes, cgroup limits); they share one LXC/kernel/Docker daemon.
  • Escape hatch: promote an individual app to its own guest (LXC or VM) only for a specific reason: a non-Linux (e.g. Windows) app, a genuinely untrusted or exposed app needing hard isolation, or a resource hog needing guarantees.
  • Multi-tenant: one customer per host is the home default; multiple customer LXCs on one host (a company environment) is not precluded — the agent manages a set of guests. The only multi-tenant-specific work deferred to "if it becomes real" is resource fairness (per-guest disk/RAM/CPU quotas).

3. Components & responsibilities

| | Hub | Host agent | In-guest controller |
|---|---|---|---|
| Runs on | dooplex.hu (k3s) | the Proxmox host | the customer LXC |
| Tier | operator backend | operator (high-privilege) | customer-facing (app) |
| Holds | customer records, signed-job source, PBS namespaces, escrowed keys | the only Proxmox API token; per-host operator identity | no Proxmox creds; its own hub API key + a local-API token to the agent |
| Does | reporting sink, dashboard, job queue, source of durable truth | all Proxmox ops (provision, restore, snapshot, backup, storage mgmt, LXC lifecycle); polls hub for signed jobs; exposes a constrained local API to the controller; per-guest authorization gate | Docker/app lifecycle, catalog deploy, customer UI, app-level (data-layer) backup; reports app-domain to the hub directly |
| Never does | initiate a connection into a box | | touch the Proxmox API directly |

Key separation: the controller manages Docker; the agent manages Proxmox. The controller's only path to guest-level operations (snapshot-before-deploy, "grow my RAM") is a constrained local API call to the agent, which the agent authorizes (scoped to that controller's own guest) and executes with its operator-tier token. This consolidates all Proxmox access and all per-guest authorization in one auditable place and leaves the guest with zero Proxmox credentials.
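
The per-guest authorization gate described above can be sketched as follows. This is a minimal illustration, not the agent's actual API: the operation names, the `LocalRequest` shape, and the token-to-guest mapping are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical operation names. The document mentions snapshot, resize and
# backup requests but does not define the local API surface.
ALLOWED_OPS = {"snapshot", "rollback", "resize", "backup"}


@dataclass(frozen=True)
class LocalRequest:
    token: str     # local-API token injected into the controller at provisioning
    op: str        # requested guest-level operation
    guest_id: int  # guest the controller wants the operation applied to


class AuthorizationError(Exception):
    """Raised when the agent refuses a local API request."""


def authorize(req: LocalRequest, token_to_guest: dict[str, int]) -> int:
    """Return the guest id the agent may act on, or raise AuthorizationError."""
    owner = token_to_guest.get(req.token)
    if owner is None:
        raise AuthorizationError("unknown local-API token")
    if req.op not in ALLOWED_OPS:
        raise AuthorizationError(f"operation not exposed: {req.op}")
    if req.guest_id != owner:
        # A controller may only ever touch its own guest.
        raise AuthorizationError("token is not scoped to that guest")
    return owner
```

Only after `authorize` returns would the agent perform the Proxmox call with its own operator-tier token, so the guest never sees a Proxmox credential.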


4. Control plane — box-initiated

  • CGNAT does not force box-initiated control: the Cloudflare Tunnel already makes a box reachable through Cloudflare's edge. We choose it for the smallest attack surface: the box exposes no control endpoint at all.
  • The agent and the controller poll the hub; the hub never initiates inbound.
  • Operator actions are delivered as signed jobs: the agent verifies an operator signature before executing, so a compromised hub database alone cannot forge commands.
  • All operator-initiated actions are recorded in a customer-visible audit log.
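
The verify-before-execute step for signed jobs might look like the sketch below. The envelope layout (`job`, `signature`, `action`) is hypothetical, and HMAC-SHA256 is used only because it ships with the standard library. HMAC is symmetric, so it is a stand-in here; an asymmetric signature, where the box holds only a public verification key, would match the stated goal more closely.

```python
import hashlib
import hmac
import json


def canonical_bytes(job: dict) -> bytes:
    """Serialize a job deterministically so signer and verifier hash the same bytes."""
    return json.dumps(job, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_job(job: dict, operator_key: bytes) -> str:
    # Operator side: runs wherever the signing key lives, which is assumed
    # NOT to be the hub database.
    return hmac.new(operator_key, canonical_bytes(job), hashlib.sha256).hexdigest()


def verify_and_run(envelope: dict, operator_key: bytes, handlers: dict) -> str:
    """Agent side: execute the job only if its signature checks out."""
    job, signature = envelope["job"], envelope["signature"]
    expected = sign_job(job, operator_key)
    if not hmac.compare_digest(expected, signature):
        return "rejected: bad signature"
    handler = handlers.get(job["action"])
    if handler is None:
        return "rejected: unknown action"
    return handler(job)
```

Because the signature covers the canonical job bytes, an attacker who can rewrite the queued job in the hub database but cannot sign still produces an envelope the agent rejects.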

5. Trust boundaries

| Boundary | What crosses | Mechanism | Blast radius if breached |
|---|---|---|---|
| end-user ↔ apps | app traffic | Cloudflare Tunnel → Traefik (Host routing) | that app |
| customer ↔ controller UI | management UI | Cloudflare Tunnel; UI auth (bcrypt) | the customer's own box |
| controller ↔ agent | snapshot/resize/backup requests | local constrained RPC; agent authorizes per-guest | the controller's own guest only |
| agent ↔ hub | reports + signed jobs | outbound poll; signed jobs | one box; signed jobs limit forgery |
| controller ↔ hub | app-domain reports/jobs (incl. geo desired-state) | outbound, own API key | app-domain of one customer |
| box ↔ PBS | encrypted backups | outbound; per-customer namespace; client-side encryption | ciphertext only (operator can't read) |
| guest ↔ Proxmox host | (none direct) | the guest holds no Proxmox creds; all via the agent | |
| hub ↔ Cloudflare API | geo-restriction WAF (enforcement) | the hub holds the CF API token; reconciles geo desired-state → WAF | the customer's zone/WAF |

6. Enrollment & identity

  • Physical presence at provisioning (on-site install, or pre-imaged-and-delivered). This removes any zero-touch remote-enrollment problem.
  • A one-time retrieval code mints durable identity. Single-use (burned on the successful config fetch) plus a short pre-use TTL; one-click regenerate for the only real failure case (fetch fails before anything is persisted). After the fetch, the code is irrelevant — everything downstream runs on durable credentials, so retries don't need it.
  • Order: the agent enrolls first (and, running as root at setup, mints its own scoped operator-tier Proxmox token), then provisions the customer LXC from the golden template and deploys the controller into it — injecting the controller's hub API key and its local-API token. The controller is the agent's product, never the other way around.
  • The hub customer record is the durable source of truth, and it survives box loss: identity, domain, Cloudflare tunnel token, PBS namespace, storage manifest, a mirrored app inventory (bottom-up reality, not operator-declared intent; the apps themselves are restored from the PBS guest snapshot, never re-deployed from this record; see 05 §1/§9), and the escrowed (zero-knowledge) backup key. This is what makes hardware replacement possible.
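
The retrieval-code lifecycle above (short pre-use TTL, burned only on a successful fetch, regenerate when the fetch never succeeded) can be sketched as hub-side state. The class name, method names, and the choice to refuse regeneration after a successful fetch are assumptions for illustration; the document does not specify them.

```python
import secrets
import time


class Enrollment:
    """Hub-side state for one box's retrieval code (hypothetical shape)."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._burned = False
        self.code = ""
        self._issued_at = 0.0
        self.regenerate()

    def regenerate(self) -> str:
        """Issue a fresh code; any previous unused code stops working."""
        if self._burned:
            # Assumption: once durable identity exists, the code is irrelevant.
            raise PermissionError("identity already minted")
        self.code = secrets.token_urlsafe(16)
        self._issued_at = self._clock()
        return self.code

    def fetch_config(self, presented: str, build_config) -> dict:
        """Return the durable config once; burn the code only on success."""
        if self._burned:
            raise PermissionError("code already used")
        if self._clock() - self._issued_at > self._ttl:
            raise PermissionError("code expired before use")
        if not secrets.compare_digest(presented, self.code):
            raise PermissionError("wrong code")
        config = build_config()  # if this raises, the code stays usable
        self._burned = True
        return config
```

The ordering in `fetch_config` is the point of the sketch: the code is marked burned only after the config has been produced, so a failure before that leaves the operator with a working code or a regenerate, never a half-enrolled box.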

7. Networking

  • Cloudflare Tunnel provides inbound access to apps and the controller UI (the CGNAT solution). Tunnel token lives in the hub record → reused on new hardware during DR, so DNS/routing stay intact through an outage.
  • Outbound only for control/report/backup (poll to hub, push to PBS). No inbound control endpoint exists in the chosen model.
  • Tunnel placement: host (resolved, Part 3 §3/§5). cloudflared runs on the Proxmox host as its own agent-managed systemd service — not inside the guest — so the data path survives control-plane death by construction. Geo-restriction WAF is hub-enforced (the hub holds the CF API token; the controller only reports geo desired-state).

8. Storage & backup

Tiers (escalating failure scope):

| Layer | Mechanism | Survives | Note |
|---|---|---|---|
| Snapshot | LVM-thin snapshot (transient) | logical loss only | whole-LXC rollback; not a backup |
| Local (second storage) | vzdump to dir/nfs/cifs | primary-disk failure (USB) / box death (NAS) | first real backup tier |
| Offsite (PBS) | dedup'd, incremental, encrypted | site loss | the DR substrate; paid tier |

  • Storage manifest (hub-held, agent-reconciled): per target → type, durable identity (UUID / server:/export / repo+fingerprint), class (fast/slow + rough IOPS, set once at attach), role, encrypted credentials, schedule/retention. The agent creates the Proxmox storages, continuously checks presence/reachability, and reports per-target status (a disconnected target → actionable notification).
  • App data placement is per-volume, not per-app: .felhom.yml classifies each volume hot (DB/config/cache → fast storage, enforced) vs bulk (media/files → may be slow). A photo app's DB stays on SSD while its blobs go to the USB.
  • Backup scoping: hot data (LXC rootfs) rides the guest vzdump → tiers + PBS. Bulk data on external mount points is excluded from the guest vzdump (per-mount backup flag) and gets its own per-volume policy (file-level to a tier, slower cadence — or explicitly not backed up for re-downloadable content, with the customer informed).
  • Tiers double as the DR restore-source priority: restore from the fastest surviving source (local if still attachable, PBS on true site loss).
  • Key custody (zero-knowledge default): three tiers the customer chooses — customer-only / zero-knowledge escrow (default) / operator-managed. Default escrows the PBS passphrase-protected keyfile in the hub, wrapped under a customer recovery code the operator can't open; DR needs the customer's code. Access-notification is an audit signal, never the primary guard. (Don't build bespoke crypto — use PBS's native keyfile passphrase.)
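
The per-volume placement and backup-scoping rules above can be illustrated with a small sketch. The `.felhom.yml` schema is not shown in this document, so the `Volume` fields and the plan labels below are hypothetical; only the hot/bulk and fast/slow vocabulary comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Volume:
    name: str
    kind: str            # "hot" or "bulk", as classified in .felhom.yml
    backup: bool = True  # bulk volumes may opt out (re-downloadable content)


def place(volume: Volume, target_class: str) -> None:
    """Reject a placement that would put hot data on slow storage."""
    if volume.kind not in ("hot", "bulk"):
        raise ValueError(f"unknown volume kind: {volume.kind}")
    if target_class not in ("fast", "slow"):
        raise ValueError(f"unknown storage class: {target_class}")
    if volume.kind == "hot" and target_class != "fast":
        raise ValueError(f"{volume.name}: hot volumes must live on fast storage")


def backup_plan(volumes: list[Volume]) -> dict[str, str]:
    """Map each volume to the way it is backed up."""
    plan = {}
    for v in volumes:
        if v.kind == "hot":
            plan[v.name] = "guest vzdump"           # rides the LXC rootfs backup
        elif v.backup:
            plan[v.name] = "per-volume file-level"  # own policy, slower cadence
        else:
            plan[v.name] = "not backed up"          # customer must be informed
    return plan
```

For the photo-app example, the database volume would be `Volume("photos-db", "hot")` and pass `place(..., "fast")` only, while `Volume("photos-blobs", "bulk")` may be placed on either class and gets its own file-level policy.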

9. Disaster recovery

  • Guest-loss (host + agent alive): the agent restores the guest from the fastest surviving tier, resets identity (MAC/hostname — see proxmox-platform.md), boots it, controller returns. Validated mechanics: Phase 2.
  • Host / hardware-loss (agent gone): re-provision (§6) in restore mode. The hub, knowing the customer has PBS backups, hands the freshly-enrolled agent the existing identity + PBS namespace + a restore directive instead of a clean-provision directive. The agent restores from PBS; the controller returns on the same domain (tunnel reused from the hub record). DR = provisioning + a restore mode, not a separate mechanism.
  • Snapshot-before-deploy: controller asks the agent to snapshot, deploys, runs its post-deploy health check, asks the agent to roll back on failure. (Transient snapshot, §8.)
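
The snapshot-before-deploy sequence can be sketched from the controller's side as below. The `agent` object stands for a client of the constrained local API; its method names (`snapshot`, `rollback`, `delete_snapshot`) are assumptions, as is deleting the snapshot after a healthy deploy, which is inferred from the snapshot being described as transient.

```python
from typing import Callable


def deploy_with_rollback(
    agent,
    deploy: Callable[[], None],
    healthy: Callable[[], bool],
) -> bool:
    """Snapshot via the agent, deploy, health-check, roll back on failure."""
    snap = agent.snapshot()  # the agent authorizes and executes the Proxmox call
    try:
        deploy()
        ok = healthy()
    except Exception:
        # A crash during deploy or health check counts as a failed deploy.
        ok = False
    if ok:
        agent.delete_snapshot(snap)  # assumed cleanup of the transient snapshot
    else:
        agent.rollback(snap)         # whole-LXC rollback to the pre-deploy state
    return ok
```

The controller never touches Proxmox here; it only asks the agent, which keeps the rollback path inside the same per-guest authorization gate as every other guest-level operation.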

10. How this embodies the product values

  • Zero-knowledge offsite — the operator holds the offsite backup but cannot read it.
  • Box-initiated control + signed jobs — no standing operator backdoor; a hub compromise alone can't forge commands.
  • Customer-visible audit log — every operator action is visible to the customer.
  • Never hold data hostage — subscriptions cover ongoing labour (monitoring, offsite, support, new deployments); the customer's data and deployed apps remain recoverable by the customer (recovery code), with nothing locked behind the operator.

11. Open sub-decisions (carried into later parts)

  • RTO/RPO targets → drive the backup + offsite-replication schedule (§8).
  • Offboarding / decommission (scenario 6) — not yet designed; must honour "never hold data hostage" in credential revocation + data hand-off.
  • Multi-tenant resource fairness — deferred until multi-tenant is real (§2).

Appendix — relationship to the spike

  • Phase 0 → §2: LXC-default for the workload; overhead numbers.
  • Phase 1 → §3/§5: validated the privilege boundary (create/allocate is operator-tier). The guest-side scoped-backup-token it proved possible is not used — we chose the agent-mediated path — but it confirmed restore = operator-tier, which shapes the agent.
  • Phase 2 → §8/§9: backup→restore round-trip; identity reset on restore.

Changelog — design-review + Phase-3 fold-in (2026-06-08)

  • §5 trust boundaries: added hub ↔ Cloudflare API row (hub holds the CF token, enforces geo→WAF); controller↔hub row notes it carries geo desired-state (S4).
  • §7 networking: tunnel placement resolved → host (agent-managed systemd service); geo is hub-enforced (S4/S5).
  • §11 open items: removed the now-resolved tunnel placement and self-update flow entries (S5; self-update designed in 03 §11).
  • §6 durable record: "declarative app inventory" → "mirrored app inventory" — aligns the wording with the locked two-driver model (05 §1: apps are bottom-up mirror, never operator-declared; 05 §9: apps restore from the PBS guest snapshot, not re-deployed from this record).