1af21a6cac
The security core of slice 4: hub-supplied intent is no longer trusted for destructive change. The gate fronts the per-guest queue's executor, so every mutation passes it. Reuses internal/authz for all crypto (surface untouched).

- Classifier (doc 03 §4): benign vs destructive by provenance + data-bearingness, NOT by verb. Destroy/overwrite of customer data is destructive unless agent-internal provenance (same-journaled-txn create, or agent-tagged scratch) makes it benign, and that provenance is journal-recorded, NEVER hub-sourced. Unknown op class fails safe to destructive.
- Reversibility gate: benign -> allowed unsigned; destructive -> requires a verified, role-scoped, action-bound operator signature, else pending_signature and never executed. Every decision audited (signal, never the guard).
- Signed-op consuming layer over authz.Verifier.Verify (locked pipeline untouched): role-scoping (doc 04 §4: recovery=rotation only, operational=ordinary destructive + planned rotation) + op-to-action binding (op+host+guest+params must match the gated action).
- Signed-job orchestration: idempotency dedupe by nonce + journal-wrapped execution via an injected DestructiveExecutor (nil this slice, so inert).
- Crash recovery (Note 1): Engine.Recover consumes the journal InFlight() set at startup (resume-or-rollback). This covers an op that crashed after the POST and before its terminal record, which idempotency dedupe alone cannot. Added TaskStatusOnce to the GuestAPI seam. Wired into daemon startup.
- Note 2: memory comparison canonicalized to MiB (desiredMemoryMiB) so a non-MiB-aligned MemoryBytes converges in one pass, not perpetual drift.
- Daemon: builds the verifier from config signers (none = nil verifier, the common slice-4 state), the gate (+SlogAudit), runs Recover before mutating.
Adversarial matrix proven against the REAL authz.Verifier with in-test-minted SSHSIGs (framing replicated in reconcile's test binary; authz untouched, no signing added to the verify-only package):

- unsigned job + unsigned desired-state delta -> pending_signature
- unknown signer / expired / replay-across-restart / wrong host -> typed authz rejections
- wrong guest/op/params -> binding_mismatch
- recovery key on ordinary destructive -> role_denied
- hub-supplied scratch tag ignored -> refused
- valid+role+target+fresh nonce -> accepted, then replay rejected

Full module race-clean + vet-clean on the Linux build server.

Inert this slice: no destructive deltas served until slice 10; the destructive path is classified, gated, and tested but not wired to live execution.

CHECKPOINT: Phase B complete (slice 4 done). Awaiting validation.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
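The reversibility gate described in the commit message can be illustrated with a minimal sketch. Everything named here (`OpClass`, `Op`, `Decision`, `Gate`, the `SignatureVerified` flag) is hypothetical and chosen only for illustration; the real classifier, gate, and authz types are not shown in this file. The sketch assumes signature verification, role-scoping, and action binding have already happened elsewhere and are summarized as a single boolean.

```go
package main

import "fmt"

// OpClass is a hypothetical classification result (not the real classifier type).
type OpClass int

const (
	ClassUnknown     OpClass = iota // zero value: anything the classifier does not recognize
	ClassBenign                     // reversible, or agent-internal provenance
	ClassDestructive                // destroys or overwrites customer data
)

// Decision is a hypothetical gate outcome.
type Decision string

const (
	Allowed          Decision = "allowed"
	PendingSignature Decision = "pending_signature"
)

// Op is a hypothetical gated action. SignatureVerified stands in for
// "a verified, role-scoped, action-bound operator signature is present".
type Op struct {
	Class             OpClass
	SignatureVerified bool
}

// effectiveClass fails safe: only an explicit benign class is treated as
// benign; unknown or out-of-range classes are treated as destructive.
func effectiveClass(c OpClass) OpClass {
	if c == ClassBenign {
		return ClassBenign
	}
	return ClassDestructive
}

// Gate allows benign ops unsigned and requires a verified signature for
// everything else; an unsigned destructive op is parked, never executed.
func Gate(op Op) Decision {
	if effectiveClass(op.Class) == ClassBenign {
		return Allowed
	}
	if op.SignatureVerified {
		return Allowed
	}
	return PendingSignature
}

func main() {
	fmt.Println(Gate(Op{Class: ClassBenign}))                                // allowed
	fmt.Println(Gate(Op{Class: ClassDestructive}))                           // pending_signature
	fmt.Println(Gate(Op{Class: ClassUnknown}))                               // pending_signature
	fmt.Println(Gate(Op{Class: ClassDestructive, SignatureVerified: true})) // allowed
}
```

Making `ClassUnknown` the zero value means a forgotten or unrecognized classification lands on the destructive side by default, which matches the "fails safe to destructive" rule stated above.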
134 lines
5.3 KiB
Go
package reconcile

import (
	"context"
	"encoding/json"

	"gitea.dooplex.hu/admin/felhom-agent/internal/hub"
	"gitea.dooplex.hu/admin/felhom-agent/internal/proxmox"
)

// RunState is a guest's desired/actual power state. The empty value means
// "unmanaged" on the desired side (the reconciler leaves run-state alone).
type RunState string

const (
	// RunUnspecified (the zero value): on a DesiredGuest it means run-state is not
	// managed; the reconciler never starts/stops the guest for run-state reasons.
	RunUnspecified RunState = ""
	// RunRunning maps to proxmox status "running".
	RunRunning RunState = "running"
	// RunStopped maps to proxmox status "stopped".
	RunStopped RunState = "stopped"
)

// normRun maps a raw proxmox status string to a RunState, collapsing anything
// unrecognized (e.g. "") to RunUnspecified so actual-state comparison is well-defined.
func normRun(status string) RunState {
	switch status {
	case "running":
		return RunRunning
	case "stopped":
		return RunStopped
	default:
		return RunUnspecified
	}
}

// DesiredGuest is the target state for one existing guest. Every field is
// individually optional ("unmanaged") so a desired-state source can pin only what it
// cares about; slice 4's planner only acts on the fields that are set.
type DesiredGuest struct {
	VMID int
	// Run is the target power state; RunUnspecified leaves it alone.
	Run RunState
	// Spec, when non-nil, manages sizing. Reuses hub.GuestSpec (cores/memory/disk).
	// Phase A reconciles Cores and Memory via SetConfig; DiskBytes is reported but
	// NOT reconciled here (a rootfs grow is `pct resize`, grow-only and separate,
	// deferred to a later slice). Nil = sizing unmanaged.
	Spec *hub.GuestSpec
	// Description, when non-nil, manages the cosmetic `description` field (the first
	// proven SetConfig round-trip, slice-4 pre-check). Nil = unmanaged.
	Description *string
}

// DesiredState is the vmid-keyed target for this host. At slice 4 the only live
// source is the empty provider, so Guests is empty in production; fixtures inject it
// in tests. Host-level desired state (storage manifest, etc.) arrives in later slices.
type DesiredState struct {
	Guests map[int]DesiredGuest
}

// ActualGuest is one guest's observed state, read from Proxmox.
type ActualGuest struct {
	VMID int
	Run  RunState
	// SpecKnown is false when GuestConfig could not be read (the run-state from the
	// list is still trusted; spec/description comparisons are skipped). Mirrors the
	// collector's "keep run-status, omit spec" degradation.
	SpecKnown   bool
	Cores       int
	MemoryMiB   int64  // proxmox LXC `memory` is MiB
	Description string // raw (may carry PVE's trailing newline; compared via normalizers)
}

// ActualState is the vmid-keyed observed state for this host.
type ActualState struct {
	Guests map[int]ActualGuest
}

// DesiredProvider is the seam the desired-state source plugs into. At slice 4 the
// only implementation is EmptyProvider (no live source); slice 10's hub-serving path
// is the real implementation. Do NOT invent a hub/local-file source here.
type DesiredProvider interface {
	Desired(ctx context.Context) (DesiredState, error)
}

// EmptyProvider is the slice-4 production provider: no desired state, so reconcile is
// a live no-op (the engine computes an empty action set).
type EmptyProvider struct{}

// Desired returns an empty desired state.
func (EmptyProvider) Desired(context.Context) (DesiredState, error) {
	return DesiredState{Guests: map[int]DesiredGuest{}}, nil
}

// StaticProvider serves a fixed DesiredState, used by fixtures (and usable as a
// local override later). It never mutates the value it was given.
type StaticProvider struct{ State DesiredState }

// Desired returns the static state.
func (p StaticProvider) Desired(context.Context) (DesiredState, error) { return p.State, nil }

// GuestAPI is the narrow Proxmox surface the engine needs: read actual state and
// dispatch the benign-on-existing-guest mutations. *proxmox.Client satisfies it; a
// fake satisfies it in tests. Every mutating call returns a UPID (or "" for the
// synchronous path) per the proxmox/mutate.go contract: the engine WaitTasks a
// non-empty UPID and treats "" as a clean synchronous success.
type GuestAPI interface {
	ListLXC(ctx context.Context) ([]proxmox.Guest, error)
	GuestConfig(ctx context.Context, vmid int) (proxmox.GuestConfig, error)
	Start(ctx context.Context, vmid int) (string, error)
	Stop(ctx context.Context, vmid int) (string, error)
	SetConfig(ctx context.Context, vmid int, params map[string]string) (string, error)
	WaitTask(ctx context.Context, upid string, opts proxmox.WaitOptions) (proxmox.TaskStatus, error)
	// TaskStatusOnce is a single non-blocking task-status read, used by crash
	// recovery to learn the outcome of an op that was in flight when the agent died.
	TaskStatusOnce(ctx context.Context, upid string) (proxmox.TaskStatus, error)
}

// guestDescription decodes the (string-valued) `description` key from a GuestConfig's
// raw Extra map, returning "" when absent. The value is returned raw: PVE appends a
// trailing newline on read, which the normalization layer strips at comparison time.
func guestDescription(cfg proxmox.GuestConfig) string {
	raw, ok := cfg.Extra["description"]
	if !ok || len(raw) == 0 {
		return ""
	}
	var s string
	if err := json.Unmarshal(raw, &s); err != nil {
		return ""
	}
	return s
}