# Felhom Controller Architecture — Part 1: Topology & Trust

**Status:** draft (decisions from the topology/trust design sessions).
**Platform facts** referenced here live in `docs/proxmox-platform.md`; this document
records *Felhom's decisions*, not Proxmox behaviour.

---

## 1. Model at a glance

Three components. **Control is always box-initiated** — the hub never connects *into* a
customer box.

```
    operator side               customer box (per Proxmox host)
┌───────────────────┐         ┌───────────────────────────────────────────┐
│        HUB        │         │ Proxmox host                              │
│ (dooplex.hu, k3s) │         │ ┌──────────────┐                          │
│ - report sink     │◀──poll──┤ │  HOST AGENT  │  operator-tier           │
│ - signed jobs     │ signed  │ │  (Proxmox    │   • all Proxmox ops      │
│ - dashboard       │  jobs   │ │   token)     │   • provision / restore  │
│ - customer record │         │ └──────┬───────┘   • storage mgmt         │
│ - PBS namespace   │         │        │ local constrained API            │
└─────────▲─────────┘         │ ┌──────▼────────────────────────────────┐ │
          │                   │ │ customer LXC (one per customer)       │ │
          │  direct, app-     │ │  ┌──────────────┐   Docker:           │ │
          └───────────────────┼─┤  │  IN-GUEST    │   [app] [app] ...   │ │
             domain reports   │ │  │  CONTROLLER  │  (Docker containers)│ │
                              │ │  │ (Docker-only)│                     │ │
                              │ │  └──────────────┘                     │ │
                              │ └───────────────────────────────────────┘ │
                              └───────────────────────────────────────────┘
PBS (offsite) ◀── outbound, client-side-encrypted backups ── customer box
end-users / customer ◀── Cloudflare Tunnel ── apps + controller UI
```

---

## 2. The customer node

- One **Proxmox host** per box (PVE 9.2, Debian 13, LVM-thin).
- **Default workload topology:** one **customer LXC**, Docker inside it, each app a Docker
  container/stack. Apps are isolated at the Docker layer (separate containers, networks,
  volumes, cgroup limits); they share one LXC/kernel/Docker daemon.
- **Escape hatch:** promote an individual app to its own guest (LXC or VM) only for a
  specific reason — a non-Linux (e.g. Windows) app, a genuinely untrusted or exposed app
  needing hard isolation, or a resource hog needing guarantees.
- **Multi-tenant:** one customer per host is the home default; multiple customer LXCs on
  one host (a company environment) are **not precluded** — the agent manages a *set* of
  guests. The only multi-tenant-specific work deferred to "if it becomes real" is resource
  fairness (per-guest disk/RAM/CPU quotas).

---

## 3. Components & responsibilities

| | **Hub** | **Host agent** | **In-guest controller** |
|---|---|---|---|
| Runs on | dooplex.hu (k3s) | the Proxmox host | the customer LXC |
| Tier | operator backend | operator (high-privilege) | customer-facing (app) |
| Holds | customer records, signed-job source, PBS namespaces, escrowed keys | the **only** Proxmox API token; per-host operator identity | **no Proxmox creds**; its own hub API key + a local-API token to the agent |
| Does | reporting sink, dashboard, job queue, source of durable truth | all Proxmox ops (provision, restore, snapshot, backup, storage mgmt, LXC lifecycle); polls hub for signed jobs; exposes a constrained local API to the controller; **per-guest authorization gate** | Docker/app lifecycle, catalog deploy, customer UI, app-level (data-layer) backup; sends app-domain reports to the hub directly |
| Never does | initiate a connection *into* a box | — | touch the Proxmox API directly |

**Key separation:** the controller manages Docker; the agent manages Proxmox. The controller's
only path to guest-level operations (snapshot-before-deploy, "grow my RAM") is a constrained
**local API call to the agent**, which the agent authorizes (scoped to that controller's own
guest) and executes with its operator-tier token. This consolidates all Proxmox access and
all per-guest authorization in one auditable place and leaves the guest with zero Proxmox
credentials.

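The per-guest authorization gate described above can be pictured as a small check the agent runs before touching Proxmox. The following is a minimal sketch under stated assumptions, not the agent's actual API: `LocalRequest`, `authorize`, the operation names in `ALLOWED_OPS`, and the token-to-guest mapping are all hypothetical names introduced for illustration.

```python
import hmac
from dataclasses import dataclass
from typing import Optional

# Hypothetical operation names; the document only mentions snapshot,
# rollback, resize and backup style requests.
ALLOWED_OPS = {"snapshot", "rollback", "resize", "backup"}


@dataclass(frozen=True)
class LocalRequest:
    token: str  # local-API token presented by the controller
    op: str     # requested guest-level operation
    vmid: int   # guest the operation targets


class AuthorizationError(Exception):
    """Raised when the agent refuses a local API request."""


def _owner_of(token: str, token_to_vmid: dict[str, int]) -> Optional[int]:
    """Find the guest a token belongs to, comparing in constant time."""
    for known, vmid in token_to_vmid.items():
        if hmac.compare_digest(known.encode(), token.encode()):
            return vmid
    return None


def authorize(request: LocalRequest, token_to_vmid: dict[str, int]) -> int:
    """Return the vmid the agent may act on, or raise AuthorizationError."""
    owner = _owner_of(request.token, token_to_vmid)
    if owner is None:
        raise AuthorizationError("unknown local-API token")
    if request.op not in ALLOWED_OPS:
        raise AuthorizationError(f"operation not exposed: {request.op}")
    if request.vmid != owner:
        raise AuthorizationError("token is not scoped to this guest")
    return owner
```

Only after `authorize` returns would the agent use its own Proxmox token, so the controller never needs Proxmox credentials of its own.
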
---

## 4. Control plane — box-initiated

- CGNAT does **not** force this choice: the Cloudflare Tunnel already makes a box reachable
  through Cloudflare's edge. We *choose* box-initiated control for the smallest attack
  surface — the box exposes no control endpoint at all.
- The agent and the controller **poll** the hub; the hub never initiates an inbound connection.
- Operator actions are delivered as **signed jobs**: the agent verifies an operator signature
  before executing, so a compromised hub database alone cannot forge commands.
- All operator-initiated actions are recorded in a **customer-visible audit log**.

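The verify-before-execute step for signed jobs might look roughly like the sketch below. It is illustrative only: the envelope layout (`job`, `signature`), the canonical JSON encoding, and all function names are assumptions. The demo verifier uses HMAC purely because the Python standard library has no asymmetric signatures; HMAC relies on a shared secret, so a real box would presumably verify against an operator public key instead.

```python
import hashlib
import hmac
import json
from typing import Callable, Optional

# A verifier takes (message, signature) and says whether the signature is valid.
Verifier = Callable[[bytes, bytes], bool]


def canonical_bytes(job: dict) -> bytes:
    """Serialize deterministically so signer and verifier see identical bytes."""
    return json.dumps(job, sort_keys=True, separators=(",", ":")).encode("utf-8")


def accept_job(envelope: dict, verify: Verifier) -> Optional[dict]:
    """Return the job only if its signature verifies; otherwise drop it."""
    job = envelope.get("job")
    signature_hex = envelope.get("signature")
    if not isinstance(job, dict) or not isinstance(signature_hex, str):
        return None
    try:
        signature = bytes.fromhex(signature_hex)
    except ValueError:
        return None
    return job if verify(canonical_bytes(job), signature) else None


def make_demo_verifier(key: bytes) -> Verifier:
    """STAND-IN ONLY: HMAC uses a shared secret, unlike a real operator signature."""
    def verify(message: bytes, signature: bytes) -> bool:
        expected = hmac.new(key, message, hashlib.sha256).digest()
        return hmac.compare_digest(expected, signature)
    return verify
```

Because `accept_job` takes the verifier as a parameter, the polling loop stays the same whichever signature scheme is eventually chosen.
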
---

## 5. Trust boundaries

| Boundary | What crosses | Mechanism | Blast radius if breached |
|---|---|---|---|
| end-user ↔ apps | app traffic | Cloudflare Tunnel → Traefik (Host routing) | that app |
| customer ↔ controller UI | management UI | Cloudflare Tunnel; UI auth (bcrypt) | the customer's own box |
| controller ↔ agent | snapshot/resize/backup requests | local constrained RPC; agent authorizes per-guest | the controller's own guest only |
| agent ↔ hub | reports + signed jobs | outbound poll; signed jobs | one box; signed jobs limit forgery |
| controller ↔ hub | app-domain reports/jobs | outbound, own API key | app-domain of one customer |
| box ↔ PBS | encrypted backups | outbound; per-customer namespace; client-side encryption | ciphertext only (operator can't read) |
| guest ↔ Proxmox host | **(none direct)** | the guest holds no Proxmox creds; all via the agent | — |

---

## 6. Enrollment & identity

- **Physical presence at provisioning** (on-site install, or pre-imaged-and-delivered).
  This removes any zero-touch remote-enrollment problem.
- A **one-time retrieval code** mints durable identity. Single-use (burned on the successful
  config fetch) plus a short *pre-use* TTL; one-click regenerate for the only real failure
  case (fetch fails before anything is persisted). After the fetch, the code is irrelevant —
  everything downstream runs on durable credentials, so retries don't need it.
- **Order:** the agent enrolls first (and, running as root at setup, mints its own scoped
  operator-tier Proxmox token), then provisions the customer LXC from the golden template and
  deploys the controller into it — injecting the controller's hub API key and its local-API
  token. The controller is the agent's product, never the other way around.
- The **hub customer record is the durable source of truth**, and it survives box loss:
  identity, domain, **Cloudflare tunnel token**, **PBS namespace**, **storage manifest**,
  **declarative app inventory**, and the **escrowed (zero-knowledge) backup key**. This is
  what makes hardware replacement possible.

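The retrieval-code rules above (single use, short pre-use TTL, one-click regenerate) can be sketched as a small piece of hub-side state. This is a hedged illustration, not the hub's real model: the `Enrollment` class, its method names, the 15-minute default TTL, and the shape of the returned config are all assumptions.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class RetrievalCode:
    value: str
    expires_at: float  # pre-use deadline, in epoch seconds
    burned: bool = False


class Enrollment:
    """Hub-side state for one box's retrieval code (hypothetical shape)."""

    def __init__(self, ttl_seconds: float = 900.0,
                 clock: Callable[[], float] = time.time):
        self._ttl = ttl_seconds
        self._clock = clock
        self.code = self._mint()

    def _mint(self) -> RetrievalCode:
        return RetrievalCode(secrets.token_urlsafe(16), self._clock() + self._ttl)

    def regenerate(self) -> str:
        """One-click regenerate: the previous code is discarded."""
        self.code = self._mint()
        return self.code.value

    def fetch_config(self, presented: str, config: dict) -> dict:
        """Hand over the config once, to the holder of the current unexpired code."""
        code = self.code
        if code.burned:
            raise PermissionError("code already used")
        if self._clock() > code.expires_at:
            raise PermissionError("code expired before use")
        if not secrets.compare_digest(presented.encode(), code.value.encode()):
            raise PermissionError("wrong code")
        code.burned = True  # single use: a second fetch with this code fails
        return config
```

The clock is injectable so the TTL can be exercised in tests without waiting; after a successful fetch the box would continue on the durable credentials contained in the config.
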
---

## 7. Networking

- **Cloudflare Tunnel** provides inbound access to apps and the controller UI (the CGNAT
  solution). The tunnel token lives in the hub record and is **reused on new hardware during
  DR**, so DNS/routing stay intact through an outage.
- **Outbound only** for control/report/backup (poll to hub, push to PBS). No inbound control
  endpoint exists in the chosen model.
- **OPEN:** Cloudflare Tunnel placement — host vs guest (`cloudflared` on the Proxmox host
  routing to guest services, or inside the customer LXC). To resolve in a later part.

---

## 8. Storage & backup

**Tiers** (escalating failure scope):

| Layer | Mechanism | Survives | Note |
|---|---|---|---|
| Snapshot | LVM-thin snapshot (transient) | *logical* loss only | whole-LXC rollback; **not a backup** |
| Local — second storage | vzdump to `dir`/`nfs`/`cifs` | primary-disk failure (USB) / box death (NAS) | first *real* backup tier |
| Offsite — PBS | dedup'd, incremental, encrypted | site loss | the DR substrate; paid tier |

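One way to read the "Survives" column is as a lookup from failure scope to the fastest tier that still holds a usable copy. The sketch below encodes that reading; the `Failure` enum, the tier names, and the fastest-first ordering (in particular USB before NAS) are assumptions made for illustration, and the snapshot tier only ever answers for logical loss because the table marks it as not a backup.

```python
from enum import IntEnum
from typing import Optional


class Failure(IntEnum):
    """Failure scopes in escalating order (hypothetical naming)."""
    LOGICAL = 1       # bad deploy, accidental deletion
    PRIMARY_DISK = 2
    BOX = 3
    SITE = 4


# (tier name, worst failure the tier survives), ordered fastest first.
TIERS = [
    ("snapshot", Failure.LOGICAL),
    ("local-usb", Failure.PRIMARY_DISK),
    ("local-nas", Failure.BOX),
    ("pbs", Failure.SITE),
]


def restore_source(failure: Failure, configured: set[str]) -> Optional[str]:
    """Pick the fastest configured tier that survives the given failure."""
    for name, survives in TIERS:
        if name in configured and survives >= failure:
            return name
    return None
```

For example, a box with a USB disk and PBS would restore from the USB disk after a primary-disk failure and from PBS after a site loss.
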
- **Storage manifest** (hub-held, agent-reconciled): per target → type, durable identity
  (UUID / `server:/export` / repo+fingerprint), **class** (fast/slow + rough IOPS, set once
  at attach), role, encrypted credentials, schedule/retention. The agent creates the Proxmox
  storages, continuously checks presence/reachability, and reports per-target status (a
  disconnected target → actionable notification).
- **App data placement is per-volume, not per-app:** `.felhom.yml` classifies each volume
  **hot** (DB/config/cache → fast storage, enforced) vs **bulk** (media/files → may be slow).
  A photo app's DB stays on SSD while its blobs go to the USB disk.
- **Backup scoping:** hot data (LXC rootfs) rides the guest `vzdump` → tiers + PBS. Bulk data
  on external mount points is **excluded** from the guest vzdump (per-mount `backup` flag) and
  gets its own per-volume policy (file-level to a tier, slower cadence — or explicitly *not*
  backed up for re-downloadable content, with the customer informed).
- **Tiers double as the DR restore-source priority:** restore from the fastest *surviving*
  source (local if still attachable, PBS on true site loss).
- **Key custody (zero-knowledge default):** three custody options the customer chooses from —
  *customer-only* / *zero-knowledge escrow (default)* / *operator-managed*. The default escrows
  the **PBS passphrase-protected keyfile** in the hub, wrapped under a **customer recovery
  code** the operator can't open; DR needs the customer's code. Access-notification is an
  audit signal, never the primary guard. (Don't build bespoke crypto — use PBS's native
  keyfile passphrase.)

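The per-volume placement and backup-scoping rules above lend themselves to a simple validation pass. The sketch below assumes the volume classification has already been parsed out of `.felhom.yml` (whose schema is not specified here); `Volume`, `placement_errors`, `in_guest_vzdump`, and the `"fast"`/`"slow"` class strings are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Volume:
    name: str
    kind: str    # "hot" or "bulk", as classified in .felhom.yml
    target: str  # name of the storage target the volume is placed on


def placement_errors(volumes: list[Volume], target_class: dict[str, str]) -> list[str]:
    """Return one message per violated rule; an empty list means valid placement."""
    errors = []
    for vol in volumes:
        storage_class = target_class.get(vol.target)
        if vol.kind not in ("hot", "bulk"):
            errors.append(f"{vol.name}: unknown volume kind {vol.kind!r}")
        elif storage_class is None:
            errors.append(f"{vol.name}: unknown storage target {vol.target!r}")
        elif vol.kind == "hot" and storage_class != "fast":
            # Hot placement is enforced; bulk volumes may live on slow storage.
            errors.append(f"{vol.name}: hot volume placed on {storage_class} storage")
    return errors


def in_guest_vzdump(vol: Volume) -> bool:
    """Hot data rides the guest vzdump; bulk volumes get their own policy."""
    return vol.kind == "hot"
```

With the photo-app example, a `db` volume on the SSD and a `blobs` volume on the USB disk pass validation, while moving `db` to the USB disk is rejected.
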
---

## 9. Disaster recovery

- **Guest-loss (host + agent alive):** the agent restores the guest from the fastest
  surviving tier, **resets identity** (MAC/hostname — see `proxmox-platform.md`), boots it,
  and the controller returns. These mechanics were validated in Phase 2 (see Appendix).
- **Host / hardware-loss (agent gone):** re-provision (§6) in **restore mode** — the hub,
  knowing the customer has PBS backups, hands the freshly-enrolled agent the existing
  identity, the PBS namespace, and a restore directive instead of a clean-provision
  directive. The agent restores from PBS; the controller returns on the same domain (tunnel
  reused from the hub record). DR = provisioning + a restore mode, not a separate mechanism.
- **Snapshot-before-deploy:** the controller asks the agent to snapshot, deploys, runs its
  post-deploy health check, and asks the agent to roll back on failure. (Transient snapshot, §8.)

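The snapshot-before-deploy sequence can be sketched as a small wrapper on the controller side. This is an illustration under assumptions rather than the actual controller code: the `AgentClient` interface, its method names, and the choice to delete the snapshot after a healthy deploy (suggested by "transient", but not spelled out above) are all hypothetical.

```python
from typing import Callable, Protocol


class AgentClient(Protocol):
    """Hypothetical view of the agent's constrained local API."""
    def snapshot(self, name: str) -> None: ...
    def rollback(self, name: str) -> None: ...
    def delete_snapshot(self, name: str) -> None: ...


def deploy_with_snapshot(
    agent: AgentClient,
    deploy: Callable[[], None],
    healthy: Callable[[], bool],
    snapshot_name: str = "pre-deploy",
) -> bool:
    """Snapshot, deploy, health-check; roll back if anything goes wrong."""
    agent.snapshot(snapshot_name)
    try:
        deploy()
        ok = healthy()
    except Exception:
        # A crash during deploy or health check is treated like a failed check.
        ok = False
    if ok:
        agent.delete_snapshot(snapshot_name)  # assumption: snapshot is dropped on success
    else:
        agent.rollback(snapshot_name)
    return ok
```

The controller only ever talks to `agent`; the snapshot and rollback themselves are executed by the agent with its operator-tier token.
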
---

## 10. How this embodies the product values

- **Zero-knowledge offsite** — the operator holds the offsite backup but cannot read it.
- **Box-initiated control + signed jobs** — no standing operator backdoor; a hub compromise
  alone can't forge commands.
- **Customer-visible audit log** — every operator action is visible to the customer.
- **Never hold data hostage** — subscriptions cover ongoing labour (monitoring, offsite,
  support, new deployments); the customer's data and deployed apps remain recoverable by the
  customer (recovery code), with nothing locked behind the operator.

---

## 11. Open sub-decisions (carried into later parts)

- Cloudflare Tunnel placement: host vs guest (§7).
- **RTO/RPO targets** → drive the backup + offsite-replication schedule (§8).
- Self-update flow (scenario 5) — not yet designed.
- Offboarding / decommission (scenario 6) — not yet designed; must honour "never hold data
  hostage" in credential revocation + data hand-off.
- Multi-tenant resource fairness — deferred until multi-tenant is real (§2).

---

## Appendix — relationship to the spike

- **Phase 0** → §2: LXC-default for the workload; overhead numbers.
- **Phase 1** → §3/§5: validated the privilege boundary (create/allocate is operator-tier).
  The guest-side scoped-backup-token it proved possible is **not** used — we chose the
  agent-mediated path — but it confirmed restore = operator-tier, which shapes the agent.
- **Phase 2** → §8/§9: backup→restore round-trip; identity reset on restore.