# Felhom Controller Architecture, Part 2: Controller Module Map

**Status:** audit (keep / port / delete / modify / add), grounded in the v0.33 source.

**Subject:** the v0.33 controller in `deploy-felhom-compose/controller/` (110 `.go` files, ~40 K LOC) audited against [01-topology-and-trust.md](01-topology-and-trust.md) and [../proxmox-platform.md](../proxmox-platform.md).

> This is a **planning map, not the port.** No controller code was changed. Source
> citations use `controller/internal/...:line` (a different repo, so links are not
> clickable). Classifications reflect the **target model**: the in-guest controller is
> **Docker-only and holds no Proxmox credentials**; everything host/disk/Proxmox moves to
> a new **host agent** (out of scope here); the controller reaches the agent through a
> constrained **local API**.

## Classification scheme

**KEEP** (host-agnostic, ~unchanged) · **PORT** (survives, needs rework) · **DELETE (→agent)** (responsibility moves to the host agent) · **DELETE (obsolete)** (no longer needed) · **MODIFY** (stays, materially changes) · **NEW** (no v0.33 equivalent).

Risk tags: **clean** · **needs-rework** · **hazard** (entangles a delete-target with a keep/port target) · **blocked** (gated on an open decision in doc 01).

---

## 0. Executive summary

- The **app domain is largely intact and portable**: stack lifecycle (`stacks/`), catalog git-sync (`sync/`), app-to-app integrations (`integrations/`), `.fab` export/import (`appexport/`), the scheduler, crypto, asset sync, the hub report/notify *channels*, and most of the web UI **KEEP/PORT cleanly**.
- The **disk/storage/host half deletes wholesale to the agent**: all of `storage/`, `monitor/watchdog.go`, the restic/cross-drive/disk-layout/drive-mount parts of `backup/`, `report/infra_backup*`+`infra_pull`, and the host-physical parts of `system/`.
- The **setup wizard (`setup/`) is obsolete**: the agent provisions the controller.
- **The single biggest hazard is `backup/`**: the keep side (DB dumps, Docker-volume archive, per-app restore; all needed by `appexport/` and the backup UI) and the delete side (restic, cross-drive, drive-mount) are **interleaved inside the same files** (`backup.go`, `restore.go`, `paths.go`), not cleanly file-separated. Extracting the app-data-backup subset into a clean retained package is the critical refactor.
- **Intent-vs-reality corrections** (vs the task's provisional split): `monitor/pinger.go` is already **dead** (legacy Healthchecks.io, "deprecated… now handled by Hub" per `main.go`) → DELETE (obsolete), not keep. `backup.go`/`restore.go`/`paths.go` do **not** split on file boundaries; they split *within* the file. `settings/` is **not** pure app domain: it stores disk/disconnect/decommission state. `system/` is genuinely mixed-per-function, not per-file.

---

## 1. v0.33 module inventory (package → purpose, key deps)

| Package | Purpose | Key internal deps |
|---|---|---|
| `cmd/controller/main.go` | Entry point; wires all subsystems; 6 adapters break import cycles; branches into setup mode | imports **every** package |
| `api/` | REST API (`router.go`) + geo endpoints (`geo.go`) | stacks, backup, metrics, notify, selfupdate, sync, system, assets, integrations, cloudflare, config, settings |
| `appexport/` | `.fab` app export/import (config+DB+volumes, AES-256-CTR+scrypt) | **backup** (DB dump), (provider iface → stacks) |
| `assets/` | Download/cache app assets from Hub API | none (HTTP only) |
| `backup/` | DB dumps, Docker-volume archive, **restic**, **cross-drive rsync**, per-app restore, **drive mount**, disk-layout, infra-backup metadata | config, monitor, settings, system, util |
| `cloudflare/` | Geo-restriction via Cloudflare WAF (zone/waf/geosync/countries) | settings |
| `config/` | `controller.yaml` schema + load | none |
| `crypto/` | AES-256-GCM for app.yaml secrets | none |
| `integrations/` | App-to-app (OnlyOffice→FileBrowser/Nextcloud) via docker exec / config patch | stacks, crypto, settings |
| `metrics/` | SQLite time-series: system + container metrics, log scan | system |
| `monitor/` | App health (`healthcheck`,`pinger`) + **storage/USB watchdog** | config, notify, settings, system |
| `notify/` | Hub event push (direct, own API key) | settings |
| `recovery/` | Generate `recovery-info.txt` (DR guide) | none |
| `report/` | Build+push hub report; **infra-backup payload**; **recovery pull** | backup, config, metrics, monitor, scheduler, settings, stacks, system |
| `scheduler/` | Cron/interval jobs, Budapest TZ | none |
| `selftest/` | Startup checks (docker/dirs/catalog/hub/**restic repos**/mountpoint) | backup, config, settings, system |
| `selfupdate/` | Self-update: pull image, edit compose, `up -d` | config |
| `settings/` | `settings.json` persistent state: **storage paths/disconnect/decommission**, cross-drive cfg, notif prefs, geo, integration state, DB-validation cache | none |
| `setup/` | **First-run wizard** (scan drives, hub-restore, manual config) | backup, config, report, settings, web |
| `stacks/` | Docker Compose lifecycle, deploy + memory validation, metadata (`.felhom.yml`), HDD-data delete | config, crypto, system |
| `storage/` | **Physical disk** scan/format/attach/mount/migrate/fstab/safety | backup, settings, util |
| `sync/` | Catalog git-sync (pull templates) | config |
| `system/` | Resource info: mem/cpu/load (guest) + **temp/disk-model/USB/mount topology (host)** | none |
| `util/` | String helper | none |
| `web/` | Hungarian dashboard: pages, auth, deploy, backup UI, **storage/disk UI**, DR restore UI, export UI, debug | appexport, backup, config, crypto, integrations, monitor, notify, scheduler, selfupdate, settings, stacks, storage, system |

---

## 2. Classification table (per package/file)

### `cmd/`

| File | Class | Reason | Risk |
|---|---|---|---|
| `cmd/controller/main.go` | **MODIFY** | Wiring stays, but drop the setup-mode branch, the storage/watchdog/drive-migrator/restic/cross-drive/infra-backup wiring, and add the **agent local-API client**. 6 adapters shrink. | hazard |

### `api/`

| File | Class | Reason | Risk |
|---|---|---|---|
| `api/router.go` | **PORT/MODIFY** | Keep stacks/deploy/integrations/metrics/sync/assets/selfupdate routes; **remove `/api/storage/*` (disk)**; backup routes become **agent-coordinated guest-backup** requests; `config/apply` (hub-pushes-yaml) changes since the **agent** now injects config at provision. | needs-rework |
| `api/geo.go` | **PORT (blocked)** | Geo is app-domain, but gated on the tunnel-placement decision (doc 01 §7/§11). | blocked |

### `appexport/`: KEEP/PORT (Docker-volume + DB level, no disk ops)

| File | Class | Reason | Risk |
|---|---|---|---|
| `crypto.go` | **KEEP** | Self-contained AES-256-CTR+HMAC+scrypt for `.fab`. | clean |
| `manifest.go`, `provider.go` | **KEEP** | Bundle metadata; provider interface (impl in main). | clean |
| `export.go` | **PORT** | Docker-volume `tar`, DB dump via `backup.DumpOne`, config copy. Depends on the **retained** app-data-backup subset of `backup/`; HDD-mount enumeration reworked to **per-volume placement**. | needs-rework |
| `restore.go` | **PORT** | `docker volume create`/`tar xf`, DB import, compose up. Same per-volume rework. | needs-rework |
| `estimate.go` | **PORT** | `du`/`df` on mounts → per-volume sizing. | clean |

### `assets/`

| File | Class | Reason | Risk |
|---|---|---|---|
| `syncer.go` | **KEEP** | Hub API download + checksum cache; already a direct hub channel. | clean |

### `backup/`: THE SPLIT (delete side interleaved with keep side; see §3)

| File | Class | Reason | Risk |
|---|---|---|---|
| `dbdump.go` | **KEEP** | Pure `docker exec pg_dump`/`mariadb-dump`: app/DB data layer; the retained per-app backup. | clean |
| `appdata.go` | **PORT** | App-data discovery (stacks/volumes/DB containers, `du`). "HDD mount" concept → per-volume. | needs-rework |
| `backup.go` (1478 L) | **MODIFY (split)** | Mixes **keep** (`RunDBDumps`, `DumpAppVolumes(Safe)`, app restore) with **delete→agent** (`RunBackup`/`backupDrive`/restic snapshot/prune/check on per-drive repos). Must be torn in two. | hazard |
| `restore.go` (442 L) | **MODIFY (split)** | `RestoreApp` restic path → agent; Docker-volume + Tier-2 rsync restore (app layer) → keep. | hazard |
| `restore_app_linux.go`/`_other.go` | **PORT** | Per-app restore: compose pull/up, rsync app data, DB-dump restore. App layer; depends on backup location that changes. | needs-rework |
| `paths.go` | **MODIFY (split)** | `AppDBDumpPath`/`AppVolumeDumpPath` keep; `Primary/SecondaryResticRepoPath`, `InfraBackupDir` → agent. | needs-rework |
| `restic.go` | **DELETE (→agent)** | restic repos on drives = infra backup tier; agent does vzdump/PBS. | hazard |
| `crossdrive.go` | **DELETE (→agent)** | Tier-2 cross-drive rsync to secondary storage = storage-tier (agent + storage manifest). | hazard |
| `restore_drives_linux.go`/`_other.go` | **DELETE (→agent)** | `lsblk`/`blkid`/`mount`/fstab: pure host disk. | hazard |
| `disk_layout.go` | **DELETE (→agent)** | Disk topology for DR → agent. | clean |
| `local_infra.go` | **DELETE (→agent)** | Per-drive infra-backup metadata → agent. | clean |
| `restore_scan.go` | **DELETE (→agent)** | Scans drives to build a DR restore plan = agent-tier DR. | needs-rework |

### `cloudflare/`: BLOCKED on tunnel-placement (doc 01 §7/§11)

| File | Class | Reason | Risk |
|---|---|---|---|
| `client.go`,`zone.go`,`waf.go`,`geosync.go`,`countries.go` | **PORT (blocked)** | Geo-restriction WAF is app-domain and could stay in the controller, but it shares the Cloudflare account/zone with the **tunnel**, whose host-vs-guest placement is undecided. Classify provisionally PORT; do not force. | blocked |

### `config/`, `crypto/`, `util/`

| File | Class | Reason | Risk |
|---|---|---|---|
| `config/config.go` | **MODIFY** | Drop `BackupConfig` (restic/retention) and storage-drive keys; keep customer/paths/web/git/stacks/monitoring/hub/assets/system; **add agent local-API endpoint+token**. Self-update section gated (open). | needs-rework |
| `crypto/crypto.go` | **KEEP** | App.yaml secret encryption. | clean |
| `util/strings.go` | **KEEP** | Trivial helper. | clean |

### `integrations/`: all KEEP (pure app-domain)

| File | Class | Reason | Risk |
|---|---|---|---|
| `integrations.go`,`lifecycle.go`,`manager.go`,`onlyoffice_filebrowser.go`,`onlyoffice_nextcloud.go` | **KEEP** | App-to-app via `docker exec` / compose-config patch; no host ops. | clean |

### `metrics/`

| File | Class | Reason | Risk |
|---|---|---|---|
| `store.go`,`logscanner.go`,`telemetry.go`,`types.go` | **KEEP** | SQLite store, `docker logs` scan, container telemetry (app-domain). | clean |
| `collector.go` | **PORT** | Container metrics (`docker stats`) keep; host metrics via `system.GetInfo` (temp, physical disk) become **agent-provided or dropped**. | needs-rework |
| `sysinfo.go`/`sysinfo_other.go` | **MODIFY** | Reads `/host/etc`, `/proc/cpuinfo`, uptime (host static info); in-guest some is meaningful, hardware identity via agent. | needs-rework |

### `monitor/`

| File | Class | Reason | Risk |
|---|---|---|---|
| `healthcheck.go` | **PORT (split)** | Keep guest health (mem/cpu/docker/protected-containers); host health (temp, **physical disk**, storage-path mount status) becomes **agent-fed**. | needs-rework |
| `pinger.go` | **DELETE (obsolete)** | Legacy Healthchecks.io; `main.go` itself marks it "deprecated… now handled by Hub". *(Corrects the task's KEEP/PORT guess.)* | clean |
| `watchdog.go` (902 L) | **DELETE (→agent)** | Storage/USB disconnect monitoring: `umount -l`, `mount -T /host-fstab`, UUID probing, restic-lock cleanup; pure host storage. | hazard |

### `notify/`, `recovery/`, `scheduler/`, `selftest/`

| File | Class | Reason | Risk |
|---|---|---|---|
| `notify/notifier.go` | **KEEP/MODIFY** | Direct hub event channel (own API key): keep; prune infra event types that move to the agent (`storage_disconnected`, `crossdrive_*`, `disaster_recovery_*`). | clean |
| `recovery/info.go` | **DELETE (obsolete)** | Generates a DR text guide (OS install, docker-setup.sh, hub restore UI); DR is now agent+hub provisioning. | clean |
| `scheduler/scheduler.go` | **KEEP** | Generic cron/interval, Budapest TZ. | clean |
| `selftest/selftest.go` | **PORT** | Keep docker/dirs/catalog/hub checks; drop restic-repo + system-data **mountpoint** checks (→agent). | needs-rework |

### `report/`

| File | Class | Reason | Risk |
|---|---|---|---|
| `pusher.go` | **KEEP** | Direct hub push (`/api/v1/report`, Bearer). | clean |
| `telemetry.go` | **KEEP** | Per-app telemetry section. | clean |
| `builder.go` (326 L) | **MODIFY** | Keep containers/telemetry/stacks/geo/app-health; drop/relocate host system info, physical storage, **restic backup status incl. restic password**. | hazard |
| `types.go` | **MODIFY** | Schema: drop infra fields (`restic password`, physical storage), keep app-domain. | needs-rework |
| `infra_backup.go`/`_linux.go`/`_other.go` | **DELETE (→agent)** | Builds infra-backup payload (disk layout, restic/enc passwords) for hub. | hazard |
| `infra_pull.go` | **DELETE (→agent)** | Pulls recovery config + infra backup from hub (setup-wizard DR). | needs-rework |

### `selfupdate/`: OPEN (doc 01 §11: "self-update flow not yet designed")

| File | Class | Reason | Risk |
|---|---|---|---|
| `version.go`,`state.go` | **KEEP** | Semver parse; update audit state. | clean |
| `updater.go` | **PORT (open)** | Pulls image + edits `docker-compose.yml` + `compose up -d`. In the agent model the controller is the **agent's product** (doc 01 §3), so self-update may move under the agent. Flag as open. | blocked |

### `settings/`

| File | Class | Reason | Risk |
|---|---|---|---|
| `settings/settings.go` (1101 L) | **MODIFY (split)** | Keep notif prefs, integration state, geo, DB-validation cache, cross-drive *intent*. The **storage-path registry** (`StoragePath` with `Disconnected`/`DisconnectedAt`/`StoppedStacks`/decommission/UUID) is disk-management state → reshape to **per-volume placement** fed by the agent's storage manifest; disconnect/decommission/migrate state leaves. | hazard |

### `setup/`: all DELETE (obsolete); the agent provisions the controller

| File | Class | Reason | Risk |
|---|---|---|---|
| `handlers.go`,`setup.go`,`csrf.go`,`network.go` | **DELETE (obsolete)** | First-run wizard (hub-restore, manual config, LAN-IP detection). | needs-rework |
| `scanner.go` | **DELETE (→agent)** | Drive scan (`lsblk`+temp mounts) for backup discovery (host op); its capability informs the agent. | clean |

### `stacks/`: core app domain (KEEP/PORT)

| File | Class | Reason | Risk |
|---|---|---|---|
| `manager.go` (1074 L) | **KEEP/PORT** | Docker Compose orchestration, scan/state/start/stop/logs: the heart. Minor port. | clean |
| `deploy.go` | **PORT** | Memory validation (`system.GetMemoryMB`: **guest** mem, fine in LXC), secret gen, encrypted app.yaml. **Add snapshot-before-deploy → agent** hook. | needs-rework |
| `healthprobe.go` | **KEEP** | TCP/HTTP app probes. | clean |
| `metadata.go` | **PORT** | `.felhom.yml` parse. **Add per-volume hot/bulk classification** (doc 01 §8). | needs-rework |
| `delete.go` | **PORT** | Stack delete + HDD-data `os.RemoveAll` on bind mounts → per-volume cleanup. | needs-rework |

### `storage/`: entire package DELETE (→agent)

| File | Class | Reason | Risk |
|---|---|---|---|
| `scan*`,`format*`,`attach*`,`migrate*`,`migrate_drive*`,`safety*` | **DELETE (→agent)** | Physical disk: `lsblk`/`sfdisk`/`wipefs`/`mkfs.ext4`/`partprobe`/`mount`/`umount`/fstab/`blkid`/drive-rsync. The agent owns all of this (doc 01 §3, §8). | hazard |

### `sync/`

| File | Class | Reason | Risk |
|---|---|---|---|
| `sync/sync.go` | **KEEP** | Catalog git-sync (clone/fetch/reset, copy compose+`.felhom.yml`, never overwrite app.yaml). | clean |

### `system/`: split per-function (not per-file)

| File | Class | Reason | Risk |
|---|---|---|---|
| `cpu_linux.go`/`cpu_other.go` | **KEEP** | `/proc/stat` works inside an LXC. | clean |
| `info.go`/`info_other.go` | **KEEP** | Structs/stubs. | clean |
| `info_linux.go` | **MODIFY (split)** | Keep mem (`/proc/meminfo`)/load/statfs (guest); **temp via `/host/sys`, hwmon → agent**. | needs-rework |
| `mounts_linux.go`/`mounts_other.go` | **DELETE (→agent)** mostly | Mount-point detection, USB, disk model, fstab, probe: host/disk. Guest-meaningful `statfs` disk-usage is the only keep-candidate → fold into the kept `info`. | hazard |

### `web/`: split by UI surface

| File | Class | Reason | Risk |
|---|---|---|---|
| `auth.go`,`csrf.go`,`logbuffer.go`,`embed.go`,`templates.go` | **KEEP** | Session/CSRF, log ring buffer, embeds/logo. | clean |
| `funcmap.go` | **KEEP/PORT** | Template helpers; a few backup/state labels track the backup rework. | clean |
| `server.go` (559 L) | **MODIFY** | Routing/wiring; remove storage/DR-restore/watchdog wiring; keep app/deploy/backup/settings/export/debug. | needs-rework |
| `handlers.go` (1883 L) | **PORT/MODIFY** | Core pages keep; the embedded **storage-path management** (add/remove/label/schedulable, storage bars, FileBrowser mount sync) → per-volume / agent-fed. | hazard |
| `handler_export.go` | **KEEP/PORT** | `.fab` UI. | clean |
| `handler_debug.go` (823 L) | **PORT** | Drop storage-simulate/infra-push/DR debug; keep the rest. | needs-rework |
| `alerts.go` | **PORT/MODIFY** | Storage-disconnect alert now sourced from **agent** status; backup/update alerts keep. | needs-rework |
| `handler_restore.go` | **DELETE (→agent) / MODIFY** | DR restore-mode UI; DR is agent-tier, so replace with an agent-status view or remove. | needs-rework |
| `storage_handlers.go` (1600 L) | **DELETE (→agent)** | Format/attach/mount/disconnect/migrate-drive/decommission disk UI. Any survivor is a **thin client calling the agent API** (e.g. per-volume placement requests). | hazard |
| `templates/` (HTML, non-Go) | **PORT** | Remove disk-wizard + DR pages; keep app/deploy/backup/settings pages. | needs-rework |

### `scripts/`

| File | Class | Reason | Risk |
|---|---|---|---|
| `scripts/hashpass.go` | **KEEP** | Standalone bcrypt helper. | clean |

---

## 3. Coupling hazards (delete-targets depended on by keep/port)

1. **`backup/` is half-deleted but split *inside files*, not across them.** `backup.go` contains both `RunDBDumps`/`DumpAppVolumesSafe`/app-restore (keep) and `RunBackup`/`backupDrive` + restic (delete→agent); `restore.go` and `paths.go` are likewise mixed.
   **Keep/port consumers reach into this same package:**
   - `appexport/export.go:295` → `backup.DiscoverDatabases`/`DumpOne` (DB dump is app-layer and must survive)
   - `report/builder.go:buildBackupReport` → backup status (MODIFY)
   - `web/handlers.go` (backups page, `buildAppBackupRows`), `web/funcmap.go`, `web/alerts.go`, `web/handler_restore.go`, `web/handler_debug.go`
   - `selftest/selftest.go:217` → `checkResticRepos` (restic path; delete)
   - `main.go` scheduler chain `RunFullBackup` (DB→volume→restic→infra-push) interleaves both sides.

   **Action:** extract the app-data-backup subset (DB dump, volume archive, per-app restore) into a clean retained package *before* deleting the restic/cross-drive code, or every keep consumer breaks.

2. **`backup/crossdrive.go` (delete→agent) is wired as `crossDriveRunner` into** `main.go`, `api/router.go`, `web/server.go`, and surfaced by `report/builder.go` and the backups page. Removing it requires reworking the backup UI/report to the agent's guest-backup status.

3. **`storage/` (delete→agent) is depended on by keep/port UI:** `web/storage_handlers.go` (delete) and `web/server.go`/`web/handlers.go` (port); the latter renders storage labels/bars and runs **FileBrowser mount sync** off the storage-path registry. `storage/migrate*.go` also imports `backup` (also being split). Untangle the per-volume placement UI from the disk-management UI.

4. **`monitor/watchdog.go` (delete→agent) is depended on by** `web/alerts.go` (port), `web/server.go`, `web/handler_debug.go`, `main.go`. The disconnect **alert** must instead consume agent-reported storage status.

5. **`system/` is mixed-per-function and consumed by both sides.** Keep consumers, namely `stacks/deploy.go` (`GetMemoryMB`, guest) and `metrics/collector.go` (container), must not drag in the host-disk/temp/USB code that goes to the agent (`mounts_linux.go`, `info_linux.go` temp). Also consumed by `report/builder.go` (MODIFY), `monitor/healthcheck.go` (PORT), `selftest`, `crossdrive` (delete).
   **Split `system/` cleanly into guest-info vs host-info first.**

6. **`settings/StoragePath` carries disk state into an app-domain store.** Disk fields (`Disconnected`,`DisconnectedAt`,`StoppedStacks`, decommission, UUID) are written by `watchdog.go`/`storage_handlers.go`/`crossdrive.go` (all delete) but the same struct is read by `stacks`/`web` for labels and **placement** (keep). Reshape `StoragePath` to a placement record fed by the agent manifest.

7. **`report/builder.go` imports almost everything** (backup, monitor, scheduler, stacks, system, metrics, settings, config). Its MODIFY must land *after* the backup and system splits, or it pulls deleted code along.

8. **`backup/paths.go` is shared both ways:** `appexport` + `selftest` + the kept DB-dump flow use the app-dump path helpers; the same file holds the restic/secondary helpers that leave.

9. **DR/provisioning chain is cross-cut:** `setup/` (obsolete) → `report/infra_pull` + `recovery/info` + `backup.MountDrivesFromLayout` + `backup.ReadLocalInfraBackup`. All obsolete/→agent, but `main.go`'s setup branch and `web/handler_restore.go` reference them; remove together.

---

## 4. Moves to the host agent (consolidated; feeds the future agent design)

> Reporting only; **not** designing the agent here.

- **All physical-disk management** (`storage/` in full): scan/classify, format (`wipefs`/`sfdisk`/`mkfs.ext4`/`partprobe`), attach (raw mount + bind + fstab), per-app and full-drive migration (rsync), safety checks (system-disk detection).
- **Storage/USB watchdog** (`monitor/watchdog.go`): disconnect/reconnect detection, `umount -l`, `mount -T /host-fstab`, UUID-by-id probing, safe-disconnect, restic-lock cleanup.
- **Infra/disk backup tier:** `backup/restic.go`, `crossdrive.go`, `restore_drives_*`, `disk_layout.go`, `local_infra.go`, `restore_scan.go`, plus the restic-snapshot half of `backup.go`, the restic-restore half of `restore.go`, and the restic/secondary path helpers in `paths.go`.
  (Maps to the agent's `vzdump`→tiers→PBS in doc 01 §8.)
- **Infra-backup payload + recovery pull:** `report/infra_backup*`, `report/infra_pull`.
- **Host-physical telemetry:** `system/mounts_linux.go` (mount topology, USB, disk model), the temp/hwmon parts of `system/info_linux.go`, and the host-hardware parts of `metrics/sysinfo.go`.
- **Drive scanning for provisioning/DR:** `setup/scanner.go`.
- **Self-restore-test execution:** the agent performs the restore-to-scratch-guest; the controller only orchestrates/validates (see §5).

---

## 5. New components to build (no v0.33 equivalent)

1. **Agent local-API client:** the controller's only path to guest-level Proxmox operations (doc 01 §3, §5): `snapshot-before-deploy` + rollback, "grow my RAM", request guest backup/restore, read the storage manifest / mount placement, query per-target storage status. Replaces the deleted direct host/disk code with constrained RPC. The controller holds **no Proxmox creds**, only a local-API token.
2. **Per-volume storage placement** (doc 01 §8): `.felhom.yml` `hot`/`bulk` volume classification (extend `stacks/metadata.go`), enforcement at deploy (extend `stacks/deploy.go`), and a placement record in `settings`. Replaces the per-app HDD-path + cross-drive model.
3. **Self-restore-test orchestration:** controller asks the agent to restore the latest guest backup to a scratch guest, runs its post-restore health probes, reports the verdict to the hub. (Backed by the validated Phase 2 round-trip in [../proxmox-platform.md](../proxmox-platform.md) §4.)
4. **Snapshot-before-deploy/rollback flow** in the deploy path: wraps the existing compose deploy with agent snapshot → health check → agent rollback-on-failure (doc 01 §9). New behaviour on top of `stacks/deploy.go` + `stacks/healthprobe.go`.
5. **Agent-provisioning bootstrap receiver:** the controller accepts its injected hub API key + local-API token from the agent at provision time (doc 01 §6), replacing the deleted `setup/` wizard.

---

## 6. Open / blocked items

- **`cloudflare/` + `api/geo.go`: blocked on tunnel placement** (doc 01 §7, §11: host vs guest `cloudflared`). Geo-WAF is app-domain and likely PORT, but it shares the Cloudflare account/zone with the tunnel; do not finalize until placement is decided.
- **`selfupdate/updater.go`: open** (doc 01 §11: self-update flow undesigned). Because the controller is "the agent's product" (doc 01 §3), self-update may move under the agent (snapshot → swap → health-gate → rollback) rather than the controller editing its own compose file. Provisionally PORT.
- **`settings`/`stacks` per-volume reshape:** depends on the storage-manifest contract between hub ↔ agent ↔ controller (doc 01 §8), not yet specified.
- **Backup UI/report surface:** depends on the agent's guest-backup status API shape (what the controller can see about vzdump/PBS state); undefined.
- **Notification event taxonomy:** which infra events (`storage_disconnected`, `crossdrive_*`, `disaster_recovery_*`) the **agent** emits vs the controller, once those responsibilities move.
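To make the snapshot → health check → rollback-on-failure flow from §5 (items 1 and 4) concrete, the following is a minimal Go sketch, not the planned implementation. Every name in it is an assumption made for illustration: `AgentClient`, `Snapshot`, `Rollback`, `DeployWithSnapshot`, `fakeAgent` and `demo` do not exist in v0.33, and the real agent local-API contract is not yet specified.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// AgentClient is a hypothetical, deliberately narrow view of the agent
// local API. The real method set and transport are undecided.
type AgentClient interface {
	Snapshot(ctx context.Context, label string) (snapshotID string, err error)
	Rollback(ctx context.Context, snapshotID string) error
}

// DeployWithSnapshot wraps an existing deploy step with
// agent snapshot -> deploy -> health check -> agent rollback-on-failure.
// deploy and healthy stand in for the compose deploy and the app probes.
func DeployWithSnapshot(ctx context.Context, agent AgentClient, stack string,
	deploy, healthy func(context.Context) error) error {

	// Assumption of this sketch: without a restore point, do not deploy.
	snapID, err := agent.Snapshot(ctx, "pre-deploy-"+stack)
	if err != nil {
		return fmt.Errorf("snapshot before deploying %s: %w", stack, err)
	}

	// Run the health check only if the deploy itself succeeded.
	if err = deploy(ctx); err == nil {
		err = healthy(ctx)
	}
	if err == nil {
		return nil
	}

	// Deploy or health check failed: ask the agent to roll the guest back.
	if rbErr := agent.Rollback(ctx, snapID); rbErr != nil {
		return fmt.Errorf("deploy of %s failed (%w) and rollback to %s also failed: %v",
			stack, err, snapID, rbErr)
	}
	return fmt.Errorf("deploy of %s rolled back to %s: %w", stack, snapID, err)
}

// fakeAgent records calls so the flow can be exercised without a real agent.
type fakeAgent struct{ calls []string }

func (f *fakeAgent) Snapshot(_ context.Context, label string) (string, error) {
	f.calls = append(f.calls, "snapshot:"+label)
	return "snap-1", nil
}

func (f *fakeAgent) Rollback(_ context.Context, snapshotID string) error {
	f.calls = append(f.calls, "rollback:"+snapshotID)
	return nil
}

// demo runs one deploy whose health check fails and returns the agent
// calls that were made, together with the resulting error.
func demo() ([]string, error) {
	agent := &fakeAgent{}
	err := DeployWithSnapshot(context.Background(), agent, "nextcloud",
		func(context.Context) error { return nil },
		func(context.Context) error { return errors.New("health probe failed") },
	)
	return agent.calls, err
}
```

Calling `demo()` from a `main` function or a test yields the agent calls `snapshot:pre-deploy-nextcloud` followed by `rollback:snap-1`, plus an error that wraps the probe failure. Keeping the interface this small mirrors the "constrained local API" idea: the controller can request a snapshot or a rollback but never touches Proxmox itself.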