v0.40.0: bootstrap pull+merge onboarding (controller pulls config from hub)
Fix the onboarding 401: instead of seeding controller.yaml from the agent's HOST hub key (which the hub's customer-scoped /api/v1/report rejects), the controller now PULLS its full controller.yaml from the hub on first boot using the bootstrap's retrieval passphrase (yielding the customer-scoped key) and MERGES in the per-guest local_api block.

- internal/bootstrap: contract v1->v2 (customer.id + hub.url + hub.retrieval_password + local_api; drop host key/identity). MaybeIngest gains an injected PullFunc (keeps bootstrap free of the heavy report package), pulls with bounded transient-only retry, merges local_api at YAML-map level (preserves all hub-emitted fields), idempotent + fail-safe + never-crash.
- main.go: wire report.PullConfig as the pull adapter (maps ErrHubUnreachable -> ErrPullTransient; auth/not-found permanent).
- Lockstep with felhom-agent v0.19.0.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -1,5 +1,35 @@
 ## Changelog
 
+### v0.40.0 — bootstrap pull+merge onboarding (controller pulls its config from the hub) (2026-06-11)
+
+Lockstep with `felhom-agent` v0.19.0. Fixes the onboarding 401: a freshly provisioned guest used to
+seed a "configured" controller.yaml from the agent's **host** hub key, which the hub's `/api/v1/report`
+(customer-scoped auth) rejects → the controller could never report ONLINE. Now the controller **pulls**
+its full controller.yaml from the hub on first boot (the hub mints the **customer-scoped** key) and
+**merges in** the per-guest `local_api` block.
+
+#### Changed — bootstrap contract `v1 → v2` (`internal/bootstrap`)
+- `SchemaV1 → SchemaV2 = "felhom.bootstrap/v2"`. `BootstrapCustomer` drops `name`/`domain`/`email` (keeps
+  `id`); `BootstrapHub` drops `api_key`/`host_id`, adds **`retrieval_password`** (SECRET). `local_api`
+  unchanged. A non-v2 schema → setup mode.
+- **`MaybeIngest(configPath, cfg, logger, pull PullFunc)`** — new injected `pull` arg (decision (b): keeps
+  `bootstrap` from importing the heavy `internal/report` package; wired in `main.go` to `report.PullConfig`).
+  Flow: idempotent (configured → return, **no pull**) → parse + validate v2 → **pull** the hub config with
+  bounded retry (1 + 3 backoff attempts on transient `ErrPullTransient` only; auth/not-found fail fast) →
+  **merge** the per-guest `local_api` at the YAML-map level (preserves every hub-emitted field — assets,
+  CF, backup) → write 0600 atomic → reload. Fail-safe throughout: a hub outage at first boot leaves the
+  guest in setup mode (the manual wizard remains the fallback), never crashes.
+- New sentinel **`ErrPullTransient`**; `main.go`'s pull adapter maps `report.ErrHubUnreachable` onto it
+  (transient/retryable) and passes auth/not-found through as permanent. Removed `configFromBootstrap`
+  (the host-key-seeding path) and the struct-marshal writer.
+
+#### Tests (`internal/bootstrap`)
+- Pull+merge (asserts the merged controller.yaml carries the **customer** key + identity + a preserved
+  unmodeled `assets.source_url` **and** the bootstrap's `local_api`, with **no host key**); idempotency
+  (pull **never invoked** when configured); transient-retry (N attempts then setup); permanent-no-retry;
+  non-v2 schema reject; missing-required reject; malformed/absent. Cross-repo render→ingest round-trip
+  verified against the agent's v2 renderer. `go build ./... && go test ./...` green.
+
 ### v0.39.1 — 8C orphan-template cleanup (source hygiene) (2026-06-11)
 
 Dead-template removal — no behaviour change. Slice 8C de-privileged the controller and retired the
|||||||
@@ -3,6 +3,7 @@ package main
 import (
 	"context"
 	"encoding/json"
+	"errors"
 	"flag"
 	"fmt"
 	"io"
@@ -75,12 +76,21 @@ func main() {
 
 	logger, logBuffer := setupLogger(cfg)
 
-	// --- Bootstrap ingestion (slice 8A, doc 03 §6) ---
+	// --- Bootstrap ingestion (slice 8A → v0.40.0 onboarding, doc 03 §6) ---
 	// On first run, if this controller is not yet configured AND the host agent's provisioning
-	// back-half attached a bootstrap.json config mount, seed controller.yaml from it and come up
-	// CONFIGURED — skipping setup mode. Idempotent (never clobbers an existing controller.yaml)
-	// and fail-safe (a malformed/absent bootstrap leaves us in setup mode).
-	cfg = bootstrap.MaybeIngest(*configPath, cfg, logger)
+	// back-half attached a bootstrap.json config mount, PULL the full controller.yaml from the hub
+	// (using the bootstrap's retrieval passphrase), merge in the per-guest local_api block, and come
+	// up CONFIGURED — skipping setup mode. Idempotent (never clobbers an existing controller.yaml)
+	// and fail-safe (a malformed/absent bootstrap, or a hub outage at first boot, leaves us in setup
+	// mode). The adapter marks a transient hub-unreachable error as retryable (the rest are permanent).
+	pull := func(hubURL, customerID, retrievalPassword string) (string, error) {
+		y, perr := report.PullConfig(hubURL, customerID, retrievalPassword)
+		if perr != nil && errors.Is(perr, report.ErrHubUnreachable) {
+			return "", fmt.Errorf("%w: %w", bootstrap.ErrPullTransient, perr)
+		}
+		return y, perr
+	}
+	cfg = bootstrap.MaybeIngest(*configPath, cfg, logger, pull)
 
 	// --- Wire system package debug logging ---
 	if cfg.Logging.Level == "debug" {
|||||||
@@ -1,17 +1,25 @@
-// Package bootstrap implements first-run bootstrap.json ingestion (slice 8A, doc 03 §6,
-// config-contract decision (c)). The host agent's provisioning back-half writes a stable
-// bootstrap.json into a read-only config mount; on first run the controller seeds its own
-// controller.yaml from it and comes up CONFIGURED, skipping the setup wizard. The agent emits
-// the stable contract; the controller owns the translation — the two stay decoupled.
+// Package bootstrap implements first-run bootstrap.json ingestion (slice 8A → v0.40.0 onboarding,
+// doc 03 §6, config-contract decision (c)/(d)). The host agent's provisioning back-half writes a
+// stable bootstrap.json into a read-only config mount carrying ONLY the customer id, the hub URL, a
+// per-customer RETRIEVAL PASSPHRASE, and the per-guest local-API handle. On first run the controller
+// uses the passphrase to PULL its full controller.yaml from the hub (which mints the customer-scoped
+// hub api_key + identity + assets + backup + CF config), MERGES in the per-guest local_api block (the
+// only thing the hub yaml lacks, because the hub must not know per-guest Proxmox internals), writes
+// it, and comes up CONFIGURED — skipping the setup wizard.
+//
+// This replaces the old "seed a configured yaml from the agent's HOST key" path, which made the
+// controller's hub reports 401 (the hub's /report needs the customer-scoped key, not the host key).
 package bootstrap
 
 import (
 	"encoding/json"
+	"errors"
 	"fmt"
 	"log"
 	"os"
 	"path/filepath"
 	"strings"
+	"time"
 
 	"gitea.dooplex.hu/admin/felhom-controller/internal/config"
 	"gopkg.in/yaml.v3"
@@ -21,31 +29,47 @@ import (
 // with FELHOM_BOOTSTRAP_PATH for tests / non-standard layouts.
 const DefaultMountPath = "/etc/felhom-bootstrap/bootstrap.json"
 
-// SchemaV1 is the stable contract version the agent emits and the controller ingests.
-const SchemaV1 = "felhom.bootstrap/v1"
+// SchemaV2 is the stable contract version the agent emits and the controller ingests. v2 changed the
+// contract's MEANING (the controller pulls its config from the hub rather than seeding it from the
+// agent's host key) and its field set, so it is a clean version bump with no v1 back-compat
+// (pre-launch; zero v1 guests deployed). A non-v2 schema is rejected → setup mode.
+const SchemaV2 = "felhom.bootstrap/v2"
 
-// Bootstrap is the stable agent→controller config contract (JSON). It carries exactly what the
-// controller needs to come up configured + reach the agent's local API. It is deliberately a
-// SEPARATE shape from controller.yaml (decision (c)): the agent never needs to know the
-// controller's full config schema.
+// ErrPullTransient marks a pull failure as retryable (a boot-time network race reaching the hub).
+// The wiring (main.go) wraps report.ErrHubUnreachable with this; permanent failures (auth /
+// not-found) are NOT wrapped, so MaybeIngest fails fast on them. Keeping this sentinel here (rather
+// than importing the heavy internal/report package) keeps bootstrap decoupled — decision (b).
+var ErrPullTransient = errors.New("bootstrap: transient pull failure")
+
+// pullRetryDelays is the backoff between transient pull retries (one initial attempt + one retry per
+// entry → 4 attempts total on persistent transient failure). Overridable in tests for speed.
+var pullRetryDelays = []time.Duration{2 * time.Second, 4 * time.Second, 8 * time.Second}
+
+// PullFunc fetches a generated controller.yaml from the hub for a customer, authenticated by the
+// retrieval passphrase. Injected (decision (b)) so MaybeIngest never imports internal/report; the
+// production wiring passes report.PullConfig. A transient (retryable) failure must be
+// errors.Is(err, ErrPullTransient); any other error is treated as permanent (no retry).
+type PullFunc func(hubURL, customerID, retrievalPassword string) (string, error)
+
+// Bootstrap is the stable agent→controller config contract (JSON, schema v2). It carries ONLY what
+// the controller needs to PULL its config (customer id + hub url + retrieval passphrase) and reach
+// the agent's local API (endpoint/fingerprint/token). It is deliberately a SEPARATE shape from
+// controller.yaml: the agent never needs to know the controller's full config schema, and never
+// holds the customer-scoped hub key or CF tokens (those come from the hub pull).
 type Bootstrap struct {
 	Schema string `json:"schema"`
-	Customer BootstrapCustomer `json:"customer"`
+	Customer BootstrapCustomer `json:"customer"` // only id (the pull target); the hub provides name/domain/email
 	Hub BootstrapHub `json:"hub"`
 	LocalAPI BootstrapLocalAPI `json:"local_api"`
 }
 
 type BootstrapCustomer struct {
 	ID string `json:"id"`
-	Name string `json:"name"`
-	Domain string `json:"domain"`
-	Email string `json:"email"`
 }
 
 type BootstrapHub struct {
 	URL string `json:"url"`
-	APIKey string `json:"api_key"`
-	HostID string `json:"host_id"` // the agent's host id (reference; not load-bearing for the controller)
+	RetrievalPassword string `json:"retrieval_password"` // SECRET — pulls the full config (incl. the customer key)
 }
 
 type BootstrapLocalAPI struct {
@@ -62,18 +86,20 @@ func Path() string {
 	return DefaultMountPath
 }
 
-// MaybeIngest seeds controller.yaml from a bootstrap.json mount when the controller is NOT yet
-// configured, and returns the config the caller should use.
+// MaybeIngest, on an unconfigured controller, pulls the full controller.yaml from the hub (using the
+// bootstrap's retrieval passphrase), merges in the per-guest local_api block, writes controller.yaml,
+// and returns the reloaded config. Returns the config the caller should use.
 //
 // Contract:
 // - Idempotent: if cfg is already configured (customer.id set), the existing controller.yaml is
-// NEVER clobbered — returns cfg unchanged.
-// - Fail-safe: an absent or malformed bootstrap, or one missing the minimum identity, leaves cfg
-// unchanged (the caller proceeds to normal setup mode) — it logs and never crashes.
+// NEVER clobbered and the hub is NEVER pulled — returns cfg unchanged.
+// - Fail-safe: an absent/malformed bootstrap, a non-v2 schema, a missing required field, or a hub
+// pull that ultimately fails leaves cfg unchanged (the caller proceeds to setup mode). It logs
+// and NEVER crashes — a hub outage at first boot must not brick the guest.
 // - On success: writes controller.yaml (0600, atomic), reloads it, and returns the reloaded cfg.
-func MaybeIngest(configPath string, cfg *config.Config, logger *log.Logger) *config.Config {
+func MaybeIngest(configPath string, cfg *config.Config, logger *log.Logger, pull PullFunc) *config.Config {
 	if cfg != nil && cfg.Customer.ID != "" {
-		return cfg // already configured — do not clobber (idempotent)
+		return cfg // already configured — do not clobber, do not pull (idempotent)
 	}
 	bpath := Path()
 	data, err := os.ReadFile(bpath)
@@ -89,17 +115,39 @@ func MaybeIngest(configPath string, cfg *config.Config, logger *log.Logger) *con
 		logger.Printf("[WARN] bootstrap: %s is not valid JSON: %v — staying in setup", bpath, err)
 		return cfg
 	}
-	if b.Schema != "" && b.Schema != SchemaV1 {
-		logger.Printf("[WARN] bootstrap: unsupported schema %q (want %q) — staying in setup", b.Schema, SchemaV1)
+	if b.Schema != SchemaV2 {
+		logger.Printf("[WARN] bootstrap: unsupported schema %q (want %q) — staying in setup", b.Schema, SchemaV2)
 		return cfg
 	}
-	if b.Customer.ID == "" || b.Customer.Domain == "" {
-		logger.Printf("[WARN] bootstrap: %s missing customer.id/domain — staying in setup", bpath)
+	if b.Customer.ID == "" || b.Hub.URL == "" || b.Hub.RetrievalPassword == "" {
+		logger.Printf("[WARN] bootstrap: %s missing customer.id / hub.url / hub.retrieval_password — staying in setup", bpath)
+		return cfg
+	}
+	if b.LocalAPI.Endpoint == "" || b.LocalAPI.Fingerprint == "" || b.LocalAPI.Token == "" {
+		logger.Printf("[WARN] bootstrap: %s missing local_api.{endpoint,fingerprint,token} — staying in setup", bpath)
+		return cfg
+	}
+	if pull == nil {
+		logger.Printf("[WARN] bootstrap: no pull function wired — staying in setup")
 		return cfg
 	}
 
-	seeded := configFromBootstrap(b)
-	if err := writeYAML(configPath, seeded); err != nil {
+	// --- Pull the full controller.yaml from the hub, with bounded retry on transient errors only. ---
+	pulled, err := pullWithRetry(pull, b.Hub.URL, b.Customer.ID, b.Hub.RetrievalPassword, logger)
+	if err != nil {
+		logger.Printf("[WARN] bootstrap: hub config pull failed for customer %s from %s: %v — staying in setup (manual setup wizard remains the fallback)",
+			b.Customer.ID, b.Hub.URL, err)
+		return cfg
+	}
+
+	// --- Merge the per-guest local_api block into the hub yaml at the MAP level (decision (c)) so
+	// every field the hub emits is preserved (forward-compat with hub template changes). ---
+	merged, err := mergeLocalAPI(pulled, b.LocalAPI)
+	if err != nil {
+		logger.Printf("[WARN] bootstrap: merging local_api into pulled config failed: %v — staying in setup", err)
+		return cfg
+	}
+	if err := writeFileAtomic(configPath, merged); err != nil {
 		logger.Printf("[WARN] bootstrap: could not write %s: %v — staying in setup", configPath, err)
 		return cfg
 	}
@@ -108,43 +156,62 @@ func MaybeIngest(configPath string, cfg *config.Config, logger *log.Logger) *con
 		logger.Printf("[WARN] bootstrap: wrote %s but reload failed: %v — staying in setup", configPath, err)
 		return cfg
 	}
-	logger.Printf("[INFO] bootstrap: seeded %s from %s (customer=%s, local_api=%s) — coming up configured",
-		configPath, bpath, b.Customer.ID, b.LocalAPI.Endpoint)
+	logger.Printf("[INFO] bootstrap: pulled config from %s for %s, merged local_api (%s) — coming up configured",
+		b.Hub.URL, b.Customer.ID, b.LocalAPI.Endpoint)
 	return reloaded
 }
 
-// configFromBootstrap maps the stable contract onto a controller.yaml Config. Only the
-// identity/hub/local-api fields are seeded; all other config keeps controller defaults (the
-// customer configures the rest via the dashboard / hub manifest).
-func configFromBootstrap(b Bootstrap) *config.Config {
-	cfg := &config.Config{}
-	cfg.Customer.ID = b.Customer.ID
-	cfg.Customer.Name = b.Customer.Name
-	cfg.Customer.Domain = b.Customer.Domain
-	cfg.Customer.Email = b.Customer.Email
-	if b.Hub.URL != "" {
-		cfg.Hub.Enabled = b.Hub.APIKey != ""
-		cfg.Hub.URL = b.Hub.URL
-		cfg.Hub.APIKey = b.Hub.APIKey
+// pullWithRetry calls pull once, then retries on transient (ErrPullTransient) failures only, with
+// the pullRetryDelays backoff. Permanent failures (anything not ErrPullTransient) fail fast.
+func pullWithRetry(pull PullFunc, hubURL, customerID, password string, logger *log.Logger) (string, error) {
+	var lastErr error
+	for attempt := 0; ; attempt++ {
+		yaml, err := pull(hubURL, customerID, password)
+		if err == nil {
+			return yaml, nil
+		}
+		lastErr = err
+		if !errors.Is(err, ErrPullTransient) {
+			return "", err // permanent (auth/not-found/other) — no retry
+		}
+		if attempt >= len(pullRetryDelays) {
+			break // exhausted retries
+		}
+		delay := pullRetryDelays[attempt]
+		logger.Printf("[INFO] bootstrap: hub unreachable (attempt %d), retrying in %s …", attempt+1, delay)
+		time.Sleep(delay)
 	}
-	cfg.LocalAPI.Endpoint = b.LocalAPI.Endpoint
-	cfg.LocalAPI.Fingerprint = b.LocalAPI.Fingerprint
-	cfg.LocalAPI.Token = b.LocalAPI.Token
-	return cfg
+	return "", lastErr
 }
 
-// writeYAML marshals cfg to YAML and writes it atomically (tmp + rename), 0600 (it carries the
-// local-api token + any hub key).
-func writeYAML(path string, cfg *config.Config) error {
-	out, err := yaml.Marshal(cfg)
-	if err != nil {
-		return fmt.Errorf("marshal: %w", err)
+// mergeLocalAPI parses the pulled controller.yaml as a generic map, sets the local_api block from the
+// bootstrap (overwriting any hub-emitted placeholder), and re-marshals. local_api.enabled is NOT set
+// — it defaults on once endpoint is present (config.LocalAPIConfig).
+func mergeLocalAPI(pulledYAML string, la BootstrapLocalAPI) ([]byte, error) {
+	m := map[string]any{}
+	if err := yaml.Unmarshal([]byte(pulledYAML), &m); err != nil {
+		return nil, fmt.Errorf("parse pulled yaml: %w", err)
 	}
+	m["local_api"] = map[string]any{
+		"endpoint": la.Endpoint,
+		"fingerprint": la.Fingerprint,
+		"token": la.Token,
+	}
+	out, err := yaml.Marshal(m)
+	if err != nil {
+		return nil, fmt.Errorf("marshal merged yaml: %w", err)
+	}
+	return out, nil
+}
+
+// writeFileAtomic writes b to path atomically (tmp + rename), 0600 (it carries the local-api token +
+// the customer hub key).
+func writeFileAtomic(path string, b []byte) error {
 	if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
 		return fmt.Errorf("config dir: %w", err)
 	}
 	tmp := path + ".tmp"
-	if err := os.WriteFile(tmp, out, 0o600); err != nil {
+	if err := os.WriteFile(tmp, b, 0o600); err != nil {
 		return err
 	}
 	return os.Rename(tmp, path)
|||||||
@@ -1,133 +1,241 @@
 package bootstrap
 
 import (
+	"errors"
+	"fmt"
 	"io"
 	"log"
 	"os"
 	"path/filepath"
+	"strings"
 	"testing"
+	"time"
 
 	"gitea.dooplex.hu/admin/felhom-controller/internal/config"
 )
 
 func testLogger() *log.Logger { return log.New(io.Discard, "", 0) }
 
-const goodBootstrap = `{
-  "schema": "felhom.bootstrap/v1",
-  "customer": {"id": "cust-8200", "name": "Teszt", "domain": "cust8200.felhom.eu", "email": "a@b.hu"},
-  "hub": {"url": "https://hub.felhom.eu", "api_key": "HUBKEY", "host_id": "demo-felhom-01"},
+// A valid v2 bootstrap: only customer.id + hub.url + hub.retrieval_password + the per-guest local_api.
+const goodBootstrapV2 = `{
+  "schema": "felhom.bootstrap/v2",
+  "customer": {"id": "cust-8200"},
+  "hub": {"url": "https://hub.felhom.eu", "retrieval_password": "five-word-passphrase-here"},
   "local_api": {"endpoint": "192.168.0.162:8443", "fingerprint": "ab12", "token": "PERGUESTTOKEN"}
 }`
 
-// A present bootstrap on an unconfigured controller seeds controller.yaml and skips setup.
-func TestMaybeIngest_SeedsWhenUnconfigured(t *testing.T) {
-	dir := t.TempDir()
-	bpath := filepath.Join(dir, "bootstrap.json")
-	cfgPath := filepath.Join(dir, "controller.yaml")
-	if err := os.WriteFile(bpath, []byte(goodBootstrap), 0o600); err != nil {
+// hubYAML is what the hub's /api/v1/config/{id} returns: a full controller.yaml carrying the
+// CUSTOMER-scoped hub key + identity + assets, but NO local_api (the hub can't know per-guest
+// Proxmox internals). Includes an unmodeled field (`assets.source_url`) to prove map-level merge
+// preserves it.
+const hubYAML = `# Felhom Controller Configuration
+customer:
+  id: cust-8200
+  name: Teszt Ügyfél
+  domain: cust8200.felhom.eu
+  email: a@b.hu
+hub:
+  enabled: true
+  url: https://hub.felhom.eu
+  api_key: CUSTKEY_FROM_HUB
+assets:
+  source_url: https://hub.felhom.eu/assets
+  sync_enabled: true
+web:
+  session_secret: deadbeef
+`
+
+func writeBootstrap(t *testing.T, dir, content string) (bpath, cfgPath string) {
+	t.Helper()
+	bpath = filepath.Join(dir, "bootstrap.json")
+	cfgPath = filepath.Join(dir, "controller.yaml")
+	if err := os.WriteFile(bpath, []byte(content), 0o600); err != nil {
 		t.Fatal(err)
 	}
 	t.Setenv("FELHOM_BOOTSTRAP_PATH", bpath)
+	return bpath, cfgPath
+}
 
-	got := MaybeIngest(cfgPath, config.Default(), testLogger())
+// PULL+MERGE: an unconfigured controller pulls the hub yaml and merges in the per-guest local_api.
+// The written controller.yaml must carry BOTH the hub's customer key/identity/assets AND the
+// bootstrap's local_api — and must NOT contain a host key.
+func TestMaybeIngest_PullsAndMerges(t *testing.T) {
+	dir := t.TempDir()
+	_, cfgPath := writeBootstrap(t, dir, goodBootstrapV2)
+
+	var calls int
+	var gotURL, gotID, gotPass string
+	pull := func(hubURL, customerID, pass string) (string, error) {
+		calls++
+		gotURL, gotID, gotPass = hubURL, customerID, pass
+		return hubYAML, nil
+	}
+
+	got := MaybeIngest(cfgPath, config.Default(), testLogger(), pull)
+
+	// pull was called once with the bootstrap's values
+	if calls != 1 || gotURL != "https://hub.felhom.eu" || gotID != "cust-8200" || gotPass != "five-word-passphrase-here" {
+		t.Fatalf("pull args wrong: calls=%d url=%q id=%q pass=%q", calls, gotURL, gotID, gotPass)
+	}
+	// returned cfg carries the hub's CUSTOMER key + identity (from the pull)
+	if got.Hub.APIKey != "CUSTKEY_FROM_HUB" || !got.Hub.Enabled || got.Hub.URL != "https://hub.felhom.eu" {
+		t.Fatalf("hub not from pulled config: %+v", got.Hub)
+	}
 	if got.Customer.ID != "cust-8200" || got.Customer.Domain != "cust8200.felhom.eu" {
-		t.Fatalf("customer not seeded: %+v", got.Customer)
+		t.Fatalf("customer not from pulled config: %+v", got.Customer)
 	}
+	// AND the per-guest local_api merged in from the bootstrap
 	if got.LocalAPI.Endpoint != "192.168.0.162:8443" || got.LocalAPI.Token != "PERGUESTTOKEN" || got.LocalAPI.Fingerprint != "ab12" {
-		t.Fatalf("local_api not seeded: %+v", got.LocalAPI)
+		t.Fatalf("local_api not merged from bootstrap: %+v", got.LocalAPI)
 	}
-	if !got.Hub.Enabled || got.Hub.URL != "https://hub.felhom.eu" || got.Hub.APIKey != "HUBKEY" {
-		t.Fatalf("hub not seeded: %+v", got.Hub)
+	// unmodeled hub field preserved (forward-compat: map-level merge)
+	if got.Assets.SourceURL != "https://hub.felhom.eu/assets" {
+		t.Fatalf("assets.source_url not preserved through merge: %+v", got.Assets)
 	}
-	// controller.yaml must now exist on disk (so a restart reads it directly).
-	if _, err := os.Stat(cfgPath); err != nil {
+	// the written file must reload configured, carry the customer key, and NOT carry a host key
+	raw, err := os.ReadFile(cfgPath)
+	if err != nil {
 		t.Fatalf("controller.yaml not written: %v", err)
 	}
-	// And it must reload as configured (not setup).
+	s := string(raw)
+	if !strings.Contains(s, "CUSTKEY_FROM_HUB") {
+		t.Fatalf("written controller.yaml missing customer key:\n%s", s)
+	}
+	if !strings.Contains(s, "PERGUESTTOKEN") || !strings.Contains(s, "192.168.0.162:8443") {
+		t.Fatalf("written controller.yaml missing merged local_api:\n%s", s)
+	}
+	if strings.Contains(s, "host_id") || strings.Contains(s, "HOSTKEY") {
+		t.Fatalf("written controller.yaml leaked a host key/id:\n%s", s)
+	}
 	reloaded, err := config.LoadPermissive(cfgPath)
-	if err != nil || reloaded.Customer.ID != "cust-8200" {
-		t.Fatalf("seeded controller.yaml does not reload configured: %v / %+v", err, reloaded.Customer)
+	if err != nil || reloaded.Customer.ID != "cust-8200" || reloaded.Hub.APIKey != "CUSTKEY_FROM_HUB" {
+		t.Fatalf("written controller.yaml does not reload configured: %v / %+v", err, reloaded)
 	}
 }
 
-// An already-configured controller is NEVER clobbered (idempotent).
-func TestMaybeIngest_DoesNotClobberConfigured(t *testing.T) {
+// IDEMPOTENT: an already-configured controller is never clobbered, and pull is NEVER invoked.
+func TestMaybeIngest_DoesNotClobberConfigured_NoPull(t *testing.T) {
 	dir := t.TempDir()
-	bpath := filepath.Join(dir, "bootstrap.json")
-	cfgPath := filepath.Join(dir, "controller.yaml")
-	if err := os.WriteFile(bpath, []byte(goodBootstrap), 0o600); err != nil {
-		t.Fatal(err)
-	}
-	t.Setenv("FELHOM_BOOTSTRAP_PATH", bpath)
+	_, cfgPath := writeBootstrap(t, dir, goodBootstrapV2)
 
 	existing := config.Default()
 	existing.Customer.ID = "already-here"
 	existing.Customer.Domain = "existing.felhom.eu"
 
-	got := MaybeIngest(cfgPath, existing, testLogger())
+	pulled := false
+	pull := func(string, string, string) (string, error) { pulled = true; return hubYAML, nil }
+
+	got := MaybeIngest(cfgPath, existing, testLogger(), pull)
+	if pulled {
+		t.Fatal("pull was invoked on an already-configured controller")
+	}
 	if got.Customer.ID != "already-here" {
-		t.Fatalf("configured controller was clobbered by bootstrap: %+v", got.Customer)
+		t.Fatalf("configured controller was clobbered: %+v", got.Customer)
 	}
 	if _, err := os.Stat(cfgPath); err == nil {
-		t.Fatal("controller.yaml was written despite an already-configured controller")
+		t.Fatal("controller.yaml written despite an already-configured controller")
 	}
 }
 
-// A malformed bootstrap leaves the controller in setup mode (cfg unchanged), no crash.
-func TestMaybeIngest_MalformedStaysInSetup(t *testing.T) {
+// FAIL-SAFE (transient): a persistently-unreachable hub is retried, then leaves cfg in setup mode
+// (no controller.yaml). Asserts the retry count (1 initial + len(pullRetryDelays)).
+func TestMaybeIngest_TransientRetriesThenSetup(t *testing.T) {
 	dir := t.TempDir()
-	bpath := filepath.Join(dir, "bootstrap.json")
-	cfgPath := filepath.Join(dir, "controller.yaml")
-	if err := os.WriteFile(bpath, []byte("{not json"), 0o600); err != nil {
-		t.Fatal(err)
-	}
-	t.Setenv("FELHOM_BOOTSTRAP_PATH", bpath)
+	_, cfgPath := writeBootstrap(t, dir, goodBootstrapV2)
 
-	got := MaybeIngest(cfgPath, config.Default(), testLogger())
+	// shrink the backoff so the test is fast
+	orig := pullRetryDelays
+	pullRetryDelays = []time.Duration{time.Millisecond, time.Millisecond, time.Millisecond}
+	defer func() { pullRetryDelays = orig }()
+
+	calls := 0
+	pull := func(string, string, string) (string, error) {
+		calls++
+		return "", fmt.Errorf("%w: dial tcp: timeout", ErrPullTransient)
+	}
+
+	got := MaybeIngest(cfgPath, config.Default(), testLogger(), pull)
 	if got.Customer.ID != "" {
-		t.Fatalf("malformed bootstrap seeded a config: %+v", got.Customer)
+		t.Fatalf("seeded despite a failing hub pull: %+v", got.Customer)
 	}
 	if _, err := os.Stat(cfgPath); err == nil {
-		t.Fatal("controller.yaml written from malformed bootstrap")
+		t.Fatal("controller.yaml written despite a failing pull")
+	}
+	if want := 1 + len(pullRetryDelays); calls != want {
+		t.Fatalf("transient retry count: got %d, want %d", calls, want)
 	}
 }
 
-// A bootstrap missing the minimum identity is rejected (stays in setup).
+// FAIL-SAFE (permanent): an auth/not-found failure is NOT retried (fail fast), setup mode.
|
||||||
func TestMaybeIngest_MissingIdentityStaysInSetup(t *testing.T) {
|
func TestMaybeIngest_PermanentNoRetry(t *testing.T) {
|
||||||
dir := t.TempDir()
|
dir := t.TempDir()
|
||||||
bpath := filepath.Join(dir, "bootstrap.json")
|
_, cfgPath := writeBootstrap(t, dir, goodBootstrapV2)
|
||||||
cfgPath := filepath.Join(dir, "controller.yaml")
|
|
||||||
if err := os.WriteFile(bpath, []byte(`{"schema":"felhom.bootstrap/v1","local_api":{"endpoint":"x:1"}}`), 0o600); err != nil {
|
|
||||||
t.Fatal(err)
|
|
||||||
}
|
|
||||||
t.Setenv("FELHOM_BOOTSTRAP_PATH", bpath)
|
|
||||||
|
|
||||||
got := MaybeIngest(cfgPath, config.Default(), testLogger())
|
calls := 0
|
||||||
|
pull := func(string, string, string) (string, error) {
|
||||||
|
calls++
|
||||||
|
return "", errors.New("authentication failed") // permanent (not wrapped with ErrPullTransient)
|
||||||
|
}
|
||||||
|
|
||||||
|
got := MaybeIngest(cfgPath, config.Default(), testLogger(), pull)
|
||||||
if got.Customer.ID != "" {
|
if got.Customer.ID != "" {
|
||||||
t.Fatal("seeded despite missing customer identity")
|
t.Fatalf("seeded despite a permanent pull failure: %+v", got.Customer)
|
||||||
|
}
|
||||||
|
if calls != 1 {
|
||||||
|
t.Fatalf("permanent failure was retried: %d calls", calls)
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
// An absent bootstrap is a no-op (normal setup).
|
// SCHEMA REJECT: a v1 (or any non-v2) schema is rejected → setup mode, no pull.
|
||||||
func TestMaybeIngest_AbsentIsNoop(t *testing.T) {
|
func TestMaybeIngest_RejectsNonV2Schema(t *testing.T) {
|
||||||
dir := t.TempDir()
|
dir := t.TempDir()
|
||||||
|
_, cfgPath := writeBootstrap(t, dir, `{"schema":"felhom.bootstrap/v1","customer":{"id":"x"},"hub":{"url":"u","retrieval_password":"p"},"local_api":{"endpoint":"e","fingerprint":"f","token":"t"}}`)
|
||||||
|
|
||||||
|
pulled := false
|
||||||
|
pull := func(string, string, string) (string, error) { pulled = true; return hubYAML, nil }
|
||||||
|
|
||||||
|
got := MaybeIngest(cfgPath, config.Default(), testLogger(), pull)
|
||||||
|
if pulled {
|
||||||
|
t.Fatal("pull invoked for a non-v2 schema")
|
||||||
|
}
|
||||||
|
if got.Customer.ID != "" {
|
||||||
|
t.Fatal("seeded from a non-v2 schema")
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// MISSING REQUIRED FIELDS: a v2 bootstrap missing the retrieval passphrase (or local_api) is rejected.
|
||||||
|
func TestMaybeIngest_MissingRequiredStaysInSetup(t *testing.T) {
|
||||||
|
dir := t.TempDir()
|
||||||
|
_, cfgPath := writeBootstrap(t, dir, `{"schema":"felhom.bootstrap/v2","customer":{"id":"x"},"hub":{"url":"u"},"local_api":{"endpoint":"e","fingerprint":"f","token":"t"}}`)
|
||||||
|
|
||||||
|
pulled := false
|
||||||
|
pull := func(string, string, string) (string, error) { pulled = true; return hubYAML, nil }
|
||||||
|
got := MaybeIngest(cfgPath, config.Default(), testLogger(), pull)
|
||||||
|
if pulled {
|
||||||
|
t.Fatal("pull invoked despite a missing retrieval_password")
|
||||||
|
}
|
||||||
|
if got.Customer.ID != "" {
|
||||||
|
t.Fatal("seeded despite missing required fields")
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// MALFORMED / ABSENT: never crash, stay in setup, no pull.
|
||||||
|
func TestMaybeIngest_MalformedAndAbsent(t *testing.T) {
|
||||||
|
dir := t.TempDir()
|
||||||
|
pulled := false
|
||||||
|
pull := func(string, string, string) (string, error) { pulled = true; return hubYAML, nil }
|
||||||
|
|
||||||
|
// malformed
|
||||||
|
_, cfgPath := writeBootstrap(t, dir, "{not json")
|
||||||
|
if got := MaybeIngest(cfgPath, config.Default(), testLogger(), pull); got.Customer.ID != "" {
|
||||||
|
t.Fatal("seeded from malformed bootstrap")
|
||||||
|
}
|
||||||
|
// absent
|
||||||
t.Setenv("FELHOM_BOOTSTRAP_PATH", filepath.Join(dir, "nope.json"))
|
t.Setenv("FELHOM_BOOTSTRAP_PATH", filepath.Join(dir, "nope.json"))
|
||||||
got := MaybeIngest(filepath.Join(dir, "controller.yaml"), config.Default(), testLogger())
|
if got := MaybeIngest(filepath.Join(dir, "c2.yaml"), config.Default(), testLogger(), pull); got.Customer.ID != "" {
|
||||||
if got.Customer.ID != "" {
|
|
||||||
t.Fatal("seeded with no bootstrap present")
|
t.Fatal("seeded with no bootstrap present")
|
||||||
}
|
}
|
||||||
}
|
if pulled {
|
||||||
|
t.Fatal("pull invoked for malformed/absent bootstrap")
|
||||||
// An unsupported schema is rejected.
|
|
||||||
func TestMaybeIngest_UnsupportedSchema(t *testing.T) {
|
|
||||||
dir := t.TempDir()
|
|
||||||
bpath := filepath.Join(dir, "bootstrap.json")
|
|
||||||
if err := os.WriteFile(bpath, []byte(`{"schema":"felhom.bootstrap/v999","customer":{"id":"x","domain":"y"}}`), 0o600); err != nil {
|
|
||||||
t.Fatal(err)
|
|
||||||
}
|
|
||||||
t.Setenv("FELHOM_BOOTSTRAP_PATH", bpath)
|
|
||||||
got := MaybeIngest(filepath.Join(dir, "controller.yaml"), config.Default(), testLogger())
|
|
||||||
if got.Customer.ID != "" {
|
|
||||||
t.Fatal("seeded from an unsupported schema")
|
|
||||||
}
|
}
|
||||||
}
|
}
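
The two fail-safe tests above pin down the retry contract: a transient failure is attempted `1 + len(pullRetryDelays)` times, and a permanent failure exactly once. A minimal sketch of a loop that would satisfy both follows. It is an illustration, not the code in `internal/bootstrap`: `pullWithRetry`, `transient`, `errAuth`, `fastDelays`, and the zero-argument `PullFunc` are hypothetical names introduced here, and only `ErrPullTransient` and the role of `pullRetryDelays` come from the diff.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrPullTransient marks a failure worth retrying. The name appears in the tests;
// this declaration and everything below are assumptions made for the sketch.
var ErrPullTransient = errors.New("pull: transient failure")

// errAuth stands in for a permanent failure (hypothetical).
var errAuth = errors.New("authentication failed")

// PullFunc is a simplified, hypothetical stand-in for the injected pull function
// (the real one in the diff takes three string arguments).
type PullFunc func() (string, error)

// fastDelays plays the role of pullRetryDelays, shrunk so the demo runs quickly.
var fastDelays = []time.Duration{time.Millisecond, time.Millisecond, time.Millisecond}

// transient wraps msg so that errors.Is(err, ErrPullTransient) reports true.
func transient(msg string) error {
	return fmt.Errorf("%w: %s", ErrPullTransient, msg)
}

// pullWithRetry calls pull once, then at most once more per entry in delays,
// and only while the error is transient. Any other error returns immediately.
func pullWithRetry(pull PullFunc, delays []time.Duration) (string, error) {
	for attempt := 0; ; attempt++ {
		body, err := pull()
		if err == nil {
			return body, nil
		}
		// Permanent error, or retry budget exhausted: give up and report it.
		if !errors.Is(err, ErrPullTransient) || attempt >= len(delays) {
			return "", err
		}
		time.Sleep(delays[attempt])
	}
}

func main() {
	calls := 0
	_, err := pullWithRetry(func() (string, error) {
		calls++
		return "", transient("dial tcp: timeout")
	}, fastDelays)
	fmt.Println("transient:", calls, "calls, failed:", err != nil)

	calls = 0
	_, err = pullWithRetry(func() (string, error) {
		calls++
		return "", errAuth
	}, fastDelays)
	fmt.Println("permanent:", calls, "calls, failed:", err != nil)
}
```

Classifying with `errors.Is` on a wrapped sentinel keeps the loop independent of whichever package produces the error, which matches the commit's stated goal of keeping bootstrap free of the report package.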
|
||||||
|
|||||||