v0.4.0: slice 4 Phase B — reversibility gate + signed-op consuming layer

The security core of slice 4: hub-supplied intent is no longer trusted for
destructive change. The gate fronts the per-guest queue's executor, so every
mutation passes it. Reuses internal/authz for all crypto (surface untouched).

- Classifier (doc 03 §4): benign vs destructive by provenance +
  data-bearing-ness, NOT by verb. Destroy/overwrite of customer data is
  destructive unless agent-internal provenance (same-journaled-txn create, or
  agent-tagged scratch) makes it benign; that provenance is journal-recorded,
  NEVER hub-sourced. Unknown op class fails safe to destructive.
- Reversibility gate: benign -> allowed unsigned; destructive -> requires a
  verified, role-scoped, action-bound operator signature, else
  pending_signature and never executed. Every decision audited (signal, never
  the guard).
- Signed-op consuming layer over authz.Verifier.Verify (locked pipeline
  untouched): role-scoping (doc 04 §4: recovery=rotation only,
  operational=ordinary destructive + planned rotation) + op-to-action binding
  (op+host+guest+params must match the gated action).
- Signed-job orchestration: idempotency dedupe by nonce + journal-wrapped
  execution via an injected DestructiveExecutor (nil this slice, so inert).
- Crash recovery (Note 1): Engine.Recover consumes the journal InFlight() set
  at startup (resume-or-rollback). This covers an op that crashed after the
  POST and before its terminal record, which idempotency dedupe alone cannot.
  Added TaskStatusOnce to the GuestAPI seam. Wired into daemon startup.
- Note 2: memory comparison canonicalized to MiB (desiredMemoryMiB) so a
  non-MiB-aligned MemoryBytes converges in one pass, not perpetual drift.
- Daemon: builds the verifier from config signers (none = nil verifier, the
  common slice-4 state), the gate (+SlogAudit), runs Recover before mutating.

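Note 2 can be illustrated with a minimal sketch. It assumes the hypervisor reports guest memory in whole MiB and that canonicalization truncates toward zero; `toMiB` and `memoryDrifted` are hypothetical names chosen for illustration and are not the commit's `desiredMemoryMiB` code.

```go
package main

import "fmt"

const mib = 1 << 20 // bytes per MiB

// toMiB is a hypothetical helper: it truncates a byte count to whole MiB.
// (Assumption: the real canonicalization may round differently.)
func toMiB(bytes int64) int64 { return bytes / mib }

// memoryDrifted compares desired and actual memory in the same unit (MiB),
// so a desired value that is not MiB-aligned does not look like drift forever.
func memoryDrifted(desiredBytes, actualMiB int64) bool {
	return toMiB(desiredBytes) != actualMiB
}

func main() {
	desired := int64(2048*mib + 512) // desired state, not MiB-aligned
	actualMiB := int64(2048)         // what the guest has after one apply

	// Byte-wise comparison never converges: the guest cannot hold 512 extra bytes.
	fmt.Println("byte-wise drift:", desired != actualMiB*mib)

	// MiB-wise comparison converges after the first apply.
	fmt.Println("MiB-wise drift:", memoryDrifted(desired, actualMiB))
}
```

With a byte-wise comparison the reconciler would re-issue the same set_config on every pass; comparing in MiB makes the second pass a no-op.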
Adversarial matrix proven against the REAL authz.Verifier with in-test-minted
SSHSIGs (framing replicated in reconcile's test binary; authz untouched, no
signing added to the verify-only package):

- unsigned job + unsigned desired-state delta -> pending_signature
- unknown signer/expired/replay-across-restart/wrong host -> typed authz
  rejections
- wrong guest/op/params -> binding_mismatch
- recovery key on ordinary destructive -> role_denied
- hub-supplied scratch tag ignored -> refused
- valid+role+target+fresh nonce -> accepted then replay rejected

Full module race-clean + vet-clean on the Linux build server.

Inert this slice: no destructive deltas served until slice 10; the destructive
path is classified, gated, and tested but not wired to live execution.

CHECKPOINT: Phase B complete (slice 4 done). Awaiting validation.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 23:56:20 +02:00
parent 05c450147c
commit 1af21a6cac
18 changed files with 1640 additions and 80 deletions
@@ -0,0 +1,106 @@
+package reconcile
+
+// The benign/destructive classifier (doc 03 §4). The gate decides whether an intended
+// action needs an operator signature by **provenance + data-bearing-ness, NOT by
+// verb**. The op CLASS encodes the semantic intent (a "detach storage" is its own
+// destructive class, not just another PUT) so classification never turns on the HTTP
+// method.
+
+// OpClass is the semantic class of an intended action. This vocabulary is the
+// agent-side contract: the signed-op `op` field (doc 04 §2.1) and slice-10's hub /
+// operator CLI match these exact strings. The committed slice-2 fixture
+// (op="guest_destroy") seeds it.
+type OpClass string
+
+const (
+	// Benign-on-existing-guest set — wired to live execution this slice.
+	ClassStart     OpClass = "start"
+	ClassStop      OpClass = "stop"
+	ClassSetConfig OpClass = "set_config" // benign sizing/description changes only
+
+	// Benign by construction — classified now, executors land in later slices.
+	ClassCreate  OpClass = "create"  // provision a NEW guest (restore-to-new, slice 7)
+	ClassRestart OpClass = "restart" // heal a crashed controller in-place (§4)
+
+	// Destructive set — destroying/overwriting the only/primary copy of customer
+	// data. Classified and gated now; NOT wired to live execution this slice (nothing
+	// serves destructive deltas until slice 10).
+	ClassGuestDestroy     OpClass = "guest_destroy"
+	ClassStorageWipe      OpClass = "storage_wipe"      // storage detach/wipe
+	ClassRestoreOverwrite OpClass = "restore_overwrite" // restore OVER an existing guest
+	ClassDecommission     OpClass = "decommission"
+
+	// Key-rotation re-pin (doc 04 §4) — destructive-class, role-scoped: the cold
+	// recovery key authorizes ONLY this; the operational key authorizes this + ordinary
+	// destructive ops.
+	ClassKeyRotation OpClass = "key_rotation"
+)
+
+// Disposition is the classifier verdict.
+type Disposition string
+
+const (
+	// Benign — the reconciler/executor MAY act without an operator signature.
+	Benign Disposition = "benign"
+	// Destructive — an operator signature bound to the action is REQUIRED.
+	Destructive Disposition = "destructive"
+)
+
+// Provenance is AGENT-INTERNAL evidence that an otherwise-destructive action is
+// actually safe (doc 03 §4). It is recorded in the operation journal by the agent's
+// own bookkeeping and is **NEVER populated from the hub or any external input** — else
+// a compromised hub could relabel a data-bearing guest as scratch to walk the gate.
+// The zero value (no internal evidence) is the only value an externally-sourced intent
+// may carry.
+type Provenance struct {
+	// SameTxnCreated: the agent created this resource earlier in the SAME journaled
+	// transaction, so destroying it is a compensating rollback (§10), not data loss.
+	SameTxnCreated bool
+	// AgentTaggedScratch: the agent tagged this resource ephemeral/scratch (e.g. a
+	// restore-test scratch guest, §8). Journal-recorded provenance only.
+	AgentTaggedScratch bool
+}
+
+// internalEvidence reports whether agent-internal provenance makes a destroy benign.
+func (p Provenance) internalEvidence() bool {
+	return p.SameTxnCreated || p.AgentTaggedScratch
+}
+
+// Classify returns the disposition for an op class given agent-internal provenance.
+//
+// Rules (doc 03 §4):
+//   - create/start/stop/restart/benign-set_config → always Benign.
+//   - destroy/overwrite of a data-bearing resource → Destructive, UNLESS agent-internal
+//     provenance (same-transaction create, or agent-tagged scratch) makes it benign.
+//   - key-rotation → always Destructive (signed); role-scoping picks the allowed key.
+//   - an UNKNOWN class fails safe → Destructive (require a signature).
+func Classify(class OpClass, prov Provenance) Disposition {
+	switch class {
+	case ClassStart, ClassStop, ClassSetConfig, ClassCreate, ClassRestart:
+		return Benign
+	case ClassGuestDestroy, ClassStorageWipe, ClassRestoreOverwrite, ClassDecommission:
+		if prov.internalEvidence() {
+			return Benign // compensating rollback / scratch teardown
+		}
+		return Destructive
+	case ClassKeyRotation:
+		return Destructive
+	default:
+		return Destructive // fail safe: an unrecognized op is treated as destructive
+	}
+}
+
+// classOfAction maps a benign reconcile ActionKind to its OpClass, so every reconcile
+// mutation is classified and passed through the gate like any other intent.
+func classOfAction(k ActionKind) OpClass {
+	switch k {
+	case ActionStart:
+		return ClassStart
+	case ActionStop:
+		return ClassStop
+	case ActionSetConfig:
+		return ClassSetConfig
+	default:
+		return OpClass(k) // unmapped kind passes through; an unknown class then fails safe in Classify
+	}
+}
@@ -0,0 +1,70 @@
+package reconcile
+
+import (
+	"testing"
+
+	"gitea.dooplex.hu/admin/felhom-agent/internal/authz"
+)
+
+func TestClassify_BenignClasses(t *testing.T) {
+	for _, c := range []OpClass{ClassStart, ClassStop, ClassSetConfig, ClassCreate, ClassRestart} {
+		if got := Classify(c, Provenance{}); got != Benign {
+			t.Errorf("Classify(%s) = %s, want benign", c, got)
+		}
+	}
+}
+
+func TestClassify_DestructiveClassesNeedSignature(t *testing.T) {
+	for _, c := range []OpClass{ClassGuestDestroy, ClassStorageWipe, ClassRestoreOverwrite, ClassDecommission, ClassKeyRotation} {
+		if got := Classify(c, Provenance{}); got != Destructive {
+			t.Errorf("Classify(%s) = %s, want destructive", c, got)
+		}
+	}
+}
+
+func TestClassify_InternalProvenanceMakesDestroyBenign(t *testing.T) {
+	// Same-transaction create → compensating rollback is benign (§10).
+	if got := Classify(ClassGuestDestroy, Provenance{SameTxnCreated: true}); got != Benign {
+		t.Errorf("same-txn destroy = %s, want benign", got)
+	}
+	// Agent-tagged scratch teardown is benign (§8).
+	if got := Classify(ClassGuestDestroy, Provenance{AgentTaggedScratch: true}); got != Benign {
+		t.Errorf("scratch destroy = %s, want benign", got)
+	}
+}
+
+func TestClassify_KeyRotationAlwaysDestructive(t *testing.T) {
+	// Even with internal provenance, key-rotation stays signed (role-scoping decides
+	// which key) — provenance flags don't apply to it.
+	if got := Classify(ClassKeyRotation, Provenance{SameTxnCreated: true, AgentTaggedScratch: true}); got != Destructive {
+		t.Errorf("key_rotation = %s, want destructive", got)
+	}
+}
+
+func TestClassify_UnknownClassFailsSafe(t *testing.T) {
+	if got := Classify(OpClass("totally_unknown_op"), Provenance{}); got != Destructive {
+		t.Errorf("unknown class = %s, want destructive (fail-safe)", got)
+	}
+}
+
+func TestRoleAuthorizes(t *testing.T) {
+	op := authz.RoleOperational
+	rec := authz.RoleRecovery
+	cases := []struct {
+		role  authz.KeyRole
+		class OpClass
+		want  bool
+	}{
+		{op, ClassGuestDestroy, true},   // operational does ordinary destructive
+		{op, ClassDecommission, true},   //
+		{op, ClassKeyRotation, true},    // operational does planned rotation
+		{rec, ClassGuestDestroy, false}, // recovery may NOT do ordinary destructive
+		{rec, ClassStorageWipe, false},  //
+		{rec, ClassKeyRotation, true},   // recovery authorizes ONLY rotation
+	}
+	for _, c := range cases {
+		if got := roleAuthorizes(c.role, c.class); got != c.want {
+			t.Errorf("roleAuthorizes(%s, %s) = %v, want %v", c.role, c.class, got, c.want)
+		}
+	}
+}
@@ -27,19 +27,23 @@ type Engine struct {
 	journal  *Journal
 	provider DesiredProvider
 	norm     FieldNormalizers
+	gate     *Gate
+	hostID   string
 	logger   *slog.Logger

 	opSeq uint64 // atomic; makes each op id unique per attempt
 }

 // EngineOptions configures a new Engine. Norm defaults to DefaultNormalizers, Logger
-// to a discard logger.
+// to a discard logger, Gate to a no-verifier gate (benign-allow, destructive-pending).
 type EngineOptions struct {
 	API      GuestAPI
 	Queue    *Queue
 	Journal  *Journal
 	Provider DesiredProvider
 	Norm     FieldNormalizers
+	Gate     *Gate
+	HostID   string
 	Logger   *slog.Logger
 }

@@ -58,12 +62,20 @@ func NewEngine(opts EngineOptions) *Engine {
 	if provider == nil {
 		provider = EmptyProvider{}
 	}
+	gate := opts.Gate
+	if gate == nil {
+		// No verifier configured: benign actions pass, destructive are pending. This is
+		// the common slice-4 daemon state (no signers pinned, no desired state).
+		gate = NewGate(nil, opts.HostID, nil, logger)
+	}
 	return &Engine{
 		api:      opts.API,
 		queue:    opts.Queue,
 		journal:  opts.Journal,
 		provider: provider,
 		norm:     norm,
+		gate:     gate,
+		hostID:   opts.HostID,
 		logger:   logger,
 	}
 }
@@ -97,23 +109,39 @@ func (e *Engine) Reconcile(ctx context.Context) (Result, error) {
 		return res, nil
 	}

-	// Dispatch all actions onto the shared per-guest queue, then await each. Same-vmid
-	// actions serialize in submit order; different vmids run concurrently.
-	chans := make([]<-chan error, len(actions))
+	// Every mutation passes the reversibility gate before the queue (doc 03 §4).
+	// Reconcile only produces benign actions, so each is allowed unsigned — but the
+	// gate is genuinely in the path: a destructive class here would be refused
+	// (pending_signature) and never dispatched. A gate refusal counts as a failed
+	// action (it should not happen for the benign reconcile set).
+	type dispatched struct {
+		act Action
+		ch  <-chan error
+	}
+	var sent []dispatched
 	for i := range actions {
 		act := actions[i]
-		chans[i] = e.queue.Submit(act.VMID, func() error { return e.execute(ctx, act) })
+		dec := e.gate.Authorize(intentForAction(e.hostID, act), nil)
+		if !dec.Allowed {
+			res.Failed++
+			res.Errors = append(res.Errors, fmt.Errorf("reconcile: gate refused %s vmid %d: %s",
+				act.Kind, act.VMID, dec.Reason))
+			e.logger.Error("reconcile: gate refused a benign action (unexpected)",
+				"vmid", act.VMID, "kind", act.Kind, "reason", dec.Reason)
+			continue
+		}
+		sent = append(sent, dispatched{act: act, ch: e.queue.Submit(act.VMID, func() error { return e.execute(ctx, act) })})
 	}
-	for i, ch := range chans {
-		if err := <-ch; err != nil {
+	for _, d := range sent {
+		if err := <-d.ch; err != nil {
 			res.Failed++
 			res.Errors = append(res.Errors, err)
 			e.logger.Error("reconcile: action failed",
-				"vmid", actions[i].VMID, "kind", actions[i].Kind, "err", err)
+				"vmid", d.act.VMID, "kind", d.act.Kind, "err", err)
 		} else {
 			res.Executed++
 			e.logger.Info("reconcile: action applied",
-				"vmid", actions[i].VMID, "kind", actions[i].Kind, "reason", actions[i].Reason)
+				"vmid", d.act.VMID, "kind", d.act.Kind, "reason", d.act.Reason)
 		}
 	}
 	return res, nil
@@ -227,8 +255,13 @@ func (e *Engine) reconcileOnce(ctx context.Context) {

 // nextOpID builds a per-attempt unique op id (kind-vmid-seq) for journal correlation.
 func (e *Engine) nextOpID(act Action) string {
-	n := atomic.AddUint64(&e.opSeq, 1)
-	return string(act.Kind) + "-" + strconv.Itoa(act.VMID) + "-" + strconv.FormatUint(n, 10)
+	return string(act.Kind) + "-" + strconv.Itoa(act.VMID) + "-" + nextSeq(&e.opSeq)
+}
+
+// nextSeq atomically increments a counter and returns it as a string — the unique
+// suffix that distinguishes journal op ids across attempts.
+func nextSeq(p *uint64) string {
+	return strconv.FormatUint(atomic.AddUint64(p, 1), 10)
 }

 // append journals a lifecycle record, logging (never failing the op on) a journal I/O
@@ -23,6 +23,8 @@ type fakeAPI struct {
 	// waitFunc maps a UPID to a (status, err); default = OK. Mirrors the real client,
 	// which errors on a non-OK exitstatus.
 	waitFunc func(upid string) (proxmox.TaskStatus, error)
+	// statusFunc backs TaskStatusOnce (crash recovery); default = stopped/OK.
+	statusFunc func(upid string) (proxmox.TaskStatus, error)

 	starts  []int
 	stops   []int
@@ -31,6 +33,13 @@ type fakeAPI struct {
 	listErr error
 }

+func (f *fakeAPI) TaskStatusOnce(_ context.Context, upid string) (proxmox.TaskStatus, error) {
+	if f.statusFunc != nil {
+		return f.statusFunc(upid)
+	}
+	return proxmox.TaskStatus{UPID: upid, Status: "stopped", ExitStatus: "OK"}, nil
+}
+
 type setCall struct {
 	vmid   int
 	params map[string]string
@@ -0,0 +1,291 @@
+package reconcile
+
+import (
+	"encoding/json"
+	"log/slog"
+	"reflect"
+	"strconv"
+	"time"
+
+	"gitea.dooplex.hu/admin/felhom-agent/internal/authz"
+)
+
+// SourceKind records where an intent came from — audit/debug ONLY. Classification
+// does NOT depend on it: a destructive desired-state delta and a destructive one-shot
+// job are gated identically (the agent distrusts hub desired state for destructive
+// change, not just jobs — doc 03 §4).
+type SourceKind string
+
+const (
+	SourceDesiredDelta SourceKind = "desired_delta"
+	SourceOneShotJob   SourceKind = "one_shot_job"
+)
+
+// Intent is an intended mutation presented to the gate. For benign reconcile actions
+// the engine builds one per planned Action; destructive intents (jobs / deltas) carry
+// their op class + canonical params for binding.
+type Intent struct {
+	Class   OpClass
+	HostID  string
+	GuestID string // blob-style guest id ("" = host-scoped); matches OpBlob.target.guest_id
+	VMID    int    // numeric, for queue routing (0 = host-scoped)
+	// ParamsJSON is the canonical params (matching the signed blob's `params`) used for
+	// op-to-action binding on destructive ops. Nil for benign actions (not bound).
+	ParamsJSON json.RawMessage
+	// Provenance is AGENT-INTERNAL only (never hub-sourced) — see classify.go.
+	Provenance Provenance
+	Source     SourceKind
+}
+
+// SignedOp is the opaque operator-signed blob+signature pair the hub queues (doc 04
+// §5). The agent never trusts it until authz.Verifier.Verify passes.
+type SignedOp struct {
+	Blob []byte // the canonical OpBlob JSON bytes (verified over RAW bytes)
+	Sig  []byte // the armored SSHSIG
+}
+
+// RefuseReason is a stable, machine-readable reason for a gate decision (despite the name, it also covers the two allow outcomes).
+type RefuseReason string
+
+const (
+	ReasonBenign           RefuseReason = "benign"            // allowed, no signature needed
+	ReasonSigned           RefuseReason = "signed"            // allowed by a verified op
+	ReasonPendingSignature RefuseReason = "pending_signature" // destructive; signature missing, or no verifier pinned
+	ReasonRejected         RefuseReason = "rejected"          // signature failed authz verification
+	ReasonRoleDenied       RefuseReason = "role_denied"       // signer role not authorized for this op class
+	ReasonBindingMismatch  RefuseReason = "binding_mismatch"  // signature is for a different action
+)
+
+// Decision is the gate verdict.
+type Decision struct {
+	Allowed     bool
+	Disposition Disposition
+	Reason      RefuseReason
+	// Verified is the authenticated op when a signature authorized the action.
+	Verified *authz.VerifiedOp
+	// Err is the underlying authz rejection (errors.Is-friendly: ErrUnknownSigner,
+	// ErrExpired, ErrReplay, …) when Reason == ReasonRejected.
+	Err error
+}
+
+// OpVerifier is the crypto verifier seam — *authz.Verifier in production; a fake in
+// gate unit tests. The gate never re-implements any crypto; it only consumes the
+// verdict and enforces the policy layer on top (role-scoping + op-to-action binding).
+type OpVerifier interface {
+	Verify(blob, sigArmored []byte) (*authz.VerifiedOp, error)
+}
+
+// AuditSink records every gate decision to the customer-visible audit log. Audit is a
+// SIGNAL, never the guard (doc 03 §4 / doc 04 §5): a compromised hub could suppress a
+// notice, which is exactly why the signature — not the audit — is the control.
+type AuditSink interface {
+	Record(rec AuditRecord)
+}
+
+// AuditRecord is one audited gate decision.
+type AuditRecord struct {
+	Time        time.Time
+	Class       OpClass
+	HostID      string
+	GuestID     string
+	Source      SourceKind
+	Disposition Disposition
+	Allowed     bool
+	Reason      RefuseReason
+	KeyID       string // matched signer's key id, when signed
+	Nonce       string // the op nonce, when signed
+}
+
+// Gate is the reversibility gate: it sits in front of the per-guest queue's executor
+// so EVERY mutation passes it. Benign intents are allowed unsigned; destructive
+// intents require a verified, role-authorized, action-bound operator signature, else
+// they are refused with pending_signature and never executed.
+type Gate struct {
+	verifier OpVerifier // may be nil (no signers pinned) → destructive is always pending_signature
+	hostID   string
+	audit    AuditSink
+	logger   *slog.Logger
+}
+
+// NewGate builds a gate. verifier may be nil when no signers are configured (the
+// common slice-4 state) — then there is nothing destructive to authorize and any
+// destructive intent is refused pending_signature. audit/logger default to no-ops.
+func NewGate(verifier OpVerifier, hostID string, audit AuditSink, logger *slog.Logger) *Gate {
+	if audit == nil {
+		audit = noopAudit{}
+	}
+	if logger == nil {
+		logger = slog.New(slog.NewTextHandler(discard{}, nil))
+	}
+	return &Gate{verifier: verifier, hostID: hostID, audit: audit, logger: logger}
+}
+
+// Authorize classifies the intent and, for destructive intents, runs the full
+// consuming-layer policy over the verifier verdict. It writes the decision to the
+// audit log and returns it. It NEVER executes anything — the caller dispatches an
+// Allowed decision onto the queue.
+func (g *Gate) Authorize(intent Intent, signed *SignedOp) Decision {
+	disp := Classify(intent.Class, intent.Provenance)
+
+	// Benign: allowed without a signature.
+	if disp == Benign {
+		d := Decision{Allowed: true, Disposition: Benign, Reason: ReasonBenign}
+		g.record(intent, d)
+		return d
+	}
+
+	// Destructive from here: a verified, role-authorized, action-bound signature is
+	// mandatory. Missing signature OR no pinned verifier → pending_signature (refuse).
+	if signed == nil || g.verifier == nil {
+		d := Decision{Allowed: false, Disposition: Destructive, Reason: ReasonPendingSignature}
+		g.record(intent, d)
+		return d
+	}
+
+	// Crypto + namespace + allow-list + target + time + nonce — the LOCKED authz
+	// pipeline. The nonce is consumed (recorded) only if this passes.
+	vop, err := g.verifier.Verify(signed.Blob, signed.Sig)
+	if err != nil {
+		d := Decision{Allowed: false, Disposition: Destructive, Reason: ReasonRejected, Err: err}
+		g.record(intent, d)
+		return d
+	}
+
+	// Role-scoping (the slice-4 job per verifier.go): the signer's pinned role must be
+	// authorized for THIS op class.
+	if !roleAuthorizes(vop.Signer.Role, intent.Class) {
+		d := Decision{Allowed: false, Disposition: Destructive, Reason: ReasonRoleDenied, Verified: vop}
+		g.record(intent, d)
+		return d
+	}
+
+	// Op-to-action binding: the verified op must name THIS exact action (op + target +
+	// params) — a signature for "restore guest X" cannot authorize destroying guest Y.
+	if !g.bindsToAction(vop, intent) {
+		d := Decision{Allowed: false, Disposition: Destructive, Reason: ReasonBindingMismatch, Verified: vop}
+		g.record(intent, d)
+		return d
+	}
+
+	d := Decision{Allowed: true, Disposition: Destructive, Reason: ReasonSigned, Verified: vop}
+	g.record(intent, d)
+	return d
+}
+
+// roleAuthorizes enforces the doc 04 §4 two-key role model: the cold recovery key
+// authorizes ONLY key-rotation re-pins; the operational key authorizes ordinary
+// destructive ops AND planned key-rotation.
+func roleAuthorizes(role authz.KeyRole, class OpClass) bool {
+	if class == ClassKeyRotation {
+		return role == authz.RoleOperational || role == authz.RoleRecovery
+	}
+	return role == authz.RoleOperational
+}
+
+// bindsToAction checks the verified op names this exact action: host (already checked
+// by the verifier, re-asserted here), guest, op class, and params. This is the binding
+// BEYOND the verifier's target check (doc 04 §2.3 binds host; this binds the full
+// action).
+func (g *Gate) bindsToAction(vop *authz.VerifiedOp, intent Intent) bool {
+	if vop.HostID != g.hostID || vop.HostID != intent.HostID {
+		return false
+	}
+	if vop.GuestID != intent.GuestID {
+		return false
+	}
+	if vop.Op != string(intent.Class) {
+		return false
+	}
+	return paramsEqual(vop.Params, intent.ParamsJSON)
+}
+
+// paramsEqual compares two JSON param objects semantically (key order / whitespace
+// independent). Absent params on both sides ({} or empty) compare equal.
+func paramsEqual(a, b json.RawMessage) bool {
+	ax, aok := decodeParams(a)
+	bx, bok := decodeParams(b)
+	if !aok || !bok {
+		return false
+	}
+	return reflect.DeepEqual(ax, bx)
+}
+
+func decodeParams(p json.RawMessage) (any, bool) {
+	if len(p) == 0 {
+		return map[string]any{}, true // absent == empty object
+	}
+	var v any
+	if err := json.Unmarshal(p, &v); err != nil {
+		return nil, false
+	}
+	if v == nil {
+		return map[string]any{}, true // explicit null == empty
+	}
+	return v, true
+}
+
+func (g *Gate) record(intent Intent, d Decision) {
+	rec := AuditRecord{
+		Time:        time.Now().UTC(),
+		Class:       intent.Class,
+		HostID:      intent.HostID,
+		GuestID:     intent.GuestID,
+		Source:      intent.Source,
+		Disposition: d.Disposition,
+		Allowed:     d.Allowed,
+		Reason:      d.Reason,
+	}
+	if d.Verified != nil {
+		rec.KeyID = d.Verified.Signer.KeyID
+		rec.Nonce = d.Verified.Nonce
+	}
+	g.audit.Record(rec)
+	g.logger.Info("gate decision",
+		"class", intent.Class, "guest", intent.GuestID, "source", intent.Source,
+		"disposition", d.Disposition, "allowed", d.Allowed, "reason", d.Reason)
+}
+
+// intentForAction builds the gate Intent for a benign reconcile action. The provenance
+// is the zero value (no agent-internal destroy evidence) and the source is the
+// desired-state delta — reconcile never fabricates scratch/same-txn provenance.
+func intentForAction(hostID string, act Action) Intent {
+	return Intent{
+		Class:      classOfAction(act.Kind),
+		HostID:     hostID,
+		GuestID:    strconv.Itoa(act.VMID),
+		VMID:       act.VMID,
+		Provenance: Provenance{}, // benign actions need none; never hub-sourced
+		Source:     SourceDesiredDelta,
+	}
+}
+
+// noopAudit drops audit records (used when no sink is configured).
+type noopAudit struct{}
+
+func (noopAudit) Record(AuditRecord) {}
+
+// SlogAudit is a minimal AuditSink that emits records to a logger. The durable,
+// customer-visible audit log + its inclusion in the host-report (HostReport.AuditTail)
+// is a later-slice concern; this keeps the signal flowing now without inventing that
+// wire schema.
+type SlogAudit struct{ Logger *slog.Logger }
+
+// Record logs the audit entry at info level.
+func (s SlogAudit) Record(rec AuditRecord) {
+	if s.Logger == nil {
+		return
+	}
+	s.Logger.Info("audit: gate decision",
+		"class", rec.Class, "host", rec.HostID, "guest", rec.GuestID, "source", rec.Source,
+		"disposition", rec.Disposition, "allowed", rec.Allowed, "reason", rec.Reason,
+		"key_id", rec.KeyID, "nonce", auditNonce(rec.Nonce))
+}
+
+// auditNonce shortens a nonce for the log (full nonce is high-cardinality; a prefix is
+// enough to correlate without bloating logs).
+func auditNonce(n string) string {
+	if len(n) <= 8 {
+		return n
+	}
+	return n[:8] + "…"
+}
@@ -0,0 +1,299 @@
+package reconcile
+
+import (
+	"encoding/json"
+	"errors"
+	"path/filepath"
+	"testing"
+	"time"
+
+	"gitea.dooplex.hu/admin/felhom-agent/internal/authz"
+)
+
+const testHost = "demo-felhom"
+
+// captureAudit records gate decisions so tests can assert audit is always written
+// (audit is a signal, never the guard).
+type captureAudit struct{ recs []AuditRecord }
+
+func (c *captureAudit) Record(r AuditRecord) { c.recs = append(c.recs, r) }
+
+// realVerifierAt builds a real authz.Verifier over a durable nonce store at path
+// (reused across "restart" by reopening the same path), pinning the given signers.
+func realVerifierAt(t *testing.T, path, hostID string, signers ...authz.AllowedSigner) (*authz.Verifier, *authz.FileNonceStore) {
+	t.Helper()
+	store, err := authz.OpenFileNonceStore(path)
+	if err != nil {
+		t.Fatalf("OpenFileNonceStore: %v", err)
+	}
+	t.Cleanup(func() { store.Close() })
+	return authz.New(signers, store, hostID), store
+}
+
+// destroyIntent is the canonical destructive fixture: destroy guest 9001, params
+// {"purge":true} (mirrors the committed slice-2 op_blob.json shape).
+func destroyIntent(source SourceKind) Intent {
+	return Intent{
+		Class:      ClassGuestDestroy,
+		HostID:     testHost,
+		GuestID:    "9001",
+		VMID:       9001,
+		ParamsJSON: json.RawMessage(`{"purge":true}`),
+		Source:     source,
+	}
+}
+
+func freshWindow() (issued, expires time.Time) {
+	now := time.Now().UTC()
+	return now.Add(-1 * time.Minute), now.Add(10 * time.Minute)
+}
+
+// --- The adversarial matrix: each case must be INDEPENDENTLY rejected (or, the one
+// positive case, accepted). ---
+
+func TestGate_DestructiveJobNoSignatureRefused(t *testing.T) {
+	op := newTestSigner(t)
+	v, _ := realVerifierAt(t, filepath.Join(t.TempDir(), "n.log"), testHost, op.allowed(t, "op1", authz.RoleOperational))
+	aud := &captureAudit{}
+	g := NewGate(v, testHost, aud, nil)
+
+	d := g.Authorize(destroyIntent(SourceOneShotJob), nil)
+	if d.Allowed || d.Reason != ReasonPendingSignature {
+		t.Fatalf("unsigned destructive job: got allowed=%v reason=%s, want pending_signature", d.Allowed, d.Reason)
+	}
+	if len(aud.recs) != 1 || aud.recs[0].Allowed {
+		t.Errorf("decision must be audited as refused: %+v", aud.recs)
+	}
+}
+
+func TestGate_DestructiveDesiredDeltaNoSignatureRefused(t *testing.T) {
+	// Proves the agent distrusts hub DESIRED STATE for destructive change, not just
+	// jobs — same refusal, different source.
+	op := newTestSigner(t)
+	v, _ := realVerifierAt(t, filepath.Join(t.TempDir(), "n.log"), testHost, op.allowed(t, "op1", authz.RoleOperational))
+	g := NewGate(v, testHost, nil, nil)
+
+	d := g.Authorize(destroyIntent(SourceDesiredDelta), nil)
+	if d.Allowed || d.Reason != ReasonPendingSignature {
+		t.Fatalf("unsigned destructive delta: got allowed=%v reason=%s, want pending_signature", d.Allowed, d.Reason)
+	}
+}
+
+func TestGate_UnknownSignerRejected(t *testing.T) {
+	pinned := newTestSigner(t)
+	attacker := newTestSigner(t) // NOT pinned
+	v, _ := realVerifierAt(t, filepath.Join(t.TempDir(), "n.log"), testHost, pinned.allowed(t, "op1", authz.RoleOperational))
+	g := NewGate(v, testHost, nil, nil)
+
+	issued, expires := freshWindow()
+	signed := attacker.mint("guest_destroy", testHost, "9001", "op1", nonce(), `{"purge":true}`, issued, expires)
+	d := g.Authorize(destroyIntent(SourceOneShotJob), signed)
+	if d.Allowed || d.Reason != ReasonRejected || !errors.Is(d.Err, authz.ErrUnknownSigner) {
+		t.Fatalf("forged signer: got allowed=%v reason=%s err=%v, want rejected/ErrUnknownSigner", d.Allowed, d.Reason, d.Err)
+	}
+}
+
+func TestGate_ExpiredSignatureRejected(t *testing.T) {
+	op := newTestSigner(t)
+	v, _ := realVerifierAt(t, filepath.Join(t.TempDir(), "n.log"), testHost, op.allowed(t, "op1", authz.RoleOperational))
+	g := NewGate(v, testHost, nil, nil)
+
+	past := time.Now().UTC().Add(-2 * time.Hour)
+	signed := op.mint("guest_destroy", testHost, "9001", "op1", nonce(), `{"purge":true}`, past, past.Add(time.Minute))
+	d := g.Authorize(destroyIntent(SourceOneShotJob), signed)
+	if d.Allowed || !errors.Is(d.Err, authz.ErrExpired) {
+		t.Fatalf("expired op: got allowed=%v err=%v, want ErrExpired", d.Allowed, d.Err)
+	}
+}
+
+func TestGate_WrongHostTargetRejected(t *testing.T) {
+	op := newTestSigner(t)
+	v, _ := realVerifierAt(t, filepath.Join(t.TempDir(), "n.log"), testHost, op.allowed(t, "op1", authz.RoleOperational))
+	g := NewGate(v, testHost, nil, nil)
+
+	issued, expires := freshWindow()
+	signed := op.mint("guest_destroy", "some-other-host", "9001", "op1", nonce(), `{"purge":true}`, issued, expires)
+	d := g.Authorize(destroyIntent(SourceOneShotJob), signed)
+	if d.Allowed || !errors.Is(d.Err, authz.ErrTarget) {
+		t.Fatalf("wrong host: got allowed=%v err=%v, want ErrTarget", d.Allowed, d.Err)
+	}
+}
+
+func TestGate_WrongGuestBindingMismatch(t *testing.T) {
+	// host matches (verifier passes) but the signature names a DIFFERENT guest than the
+	// action — the op-to-action binding rejects it.
+	op := newTestSigner(t)
+	v, _ := realVerifierAt(t, filepath.Join(t.TempDir(), "n.log"), testHost, op.allowed(t, "op1", authz.RoleOperational))
+	g := NewGate(v, testHost, nil, nil)
+
+	issued, expires := freshWindow()
+	signed := op.mint("guest_destroy", testHost, "9002", "op1", nonce(), `{"purge":true}`, issued, expires)
+	d := g.Authorize(destroyIntent(SourceOneShotJob), signed) // intent targets 9001
+	if d.Allowed || d.Reason != ReasonBindingMismatch {
+		t.Fatalf("guest mismatch: got allowed=%v reason=%s, want binding_mismatch", d.Allowed, d.Reason)
+	}
+}
+
+func TestGate_WrongParamsBindingMismatch(t *testing.T) {
+	op := newTestSigner(t)
+	v, _ := realVerifierAt(t, filepath.Join(t.TempDir(), "n.log"), testHost, op.allowed(t, "op1", authz.RoleOperational))
+	g := NewGate(v, testHost, nil, nil)
+
+	issued, expires := freshWindow()
+	// signature authorizes purge=false; the action wants purge=true.
+	signed := op.mint("guest_destroy", testHost, "9001", "op1", nonce(), `{"purge":false}`, issued, expires)
+	d := g.Authorize(destroyIntent(SourceOneShotJob), signed)
+	if d.Allowed || d.Reason != ReasonBindingMismatch {
+		t.Fatalf("params mismatch: got allowed=%v reason=%s, want binding_mismatch", d.Allowed, d.Reason)
+	}
+}
+
+func TestGate_WrongOpBindingMismatch(t *testing.T) {
+	op := newTestSigner(t)
+	v, _ := realVerifierAt(t, filepath.Join(t.TempDir(), "n.log"), testHost, op.allowed(t, "op1", authz.RoleOperational))
+	g := NewGate(v, testHost, nil, nil)
+
+	issued, expires := freshWindow()
+	// a valid signature for restore_overwrite cannot authorize a guest_destroy.
+	signed := op.mint("restore_overwrite", testHost, "9001", "op1", nonce(), `{"purge":true}`, issued, expires)
+	d := g.Authorize(destroyIntent(SourceOneShotJob), signed)
+	if d.Allowed || d.Reason != ReasonBindingMismatch {
+		t.Fatalf("op mismatch: got allowed=%v reason=%s, want binding_mismatch", d.Allowed, d.Reason)
+	}
+}
+
+func TestGate_RecoveryKeyOnOrdinaryDestructiveRoleDenied(t *testing.T) {
+	// A valid signature from the cold RECOVERY key on an ordinary destructive op is
+	// refused by role-scoping (recovery authorizes ONLY key-rotation).
+	rec := newTestSigner(t)
+	v, _ := realVerifierAt(t, filepath.Join(t.TempDir(), "n.log"), testHost, rec.allowed(t, "rec1", authz.RoleRecovery))
+	g := NewGate(v, testHost, nil, nil)
+
+	issued, expires := freshWindow()
+	signed := rec.mint("guest_destroy", testHost, "9001", "rec1", nonce(), `{"purge":true}`, issued, expires)
+	d := g.Authorize(destroyIntent(SourceOneShotJob), signed)
+	if d.Allowed || d.Reason != ReasonRoleDenied {
+		t.Fatalf("recovery on destroy: got allowed=%v reason=%s, want role_denied", d.Allowed, d.Reason)
+	}
+}
+
+func TestGate_HubSuppliedScratchTagIgnored(t *testing.T) {
+	// A compromised hub attaches a "scratch" hint to a data-bearing guest's destroy
+	// delta to try to walk the gate unsigned. The intent built from a hub delta must
+	// NOT carry that as agent-internal provenance — so it stays destructive and is
+	// refused without a signature.
+	intent := intentFromHubDelta(hubDelta{Class: ClassGuestDestroy, HostID: testHost, GuestID: "9001", VMID: 9001, HubSaysScratch: true})
+	if intent.Provenance.AgentTaggedScratch || intent.Provenance.SameTxnCreated {
+		t.Fatal("hub-supplied scratch must NOT become agent-internal provenance")
+	}
+	g := NewGate(nil, testHost, nil, nil) // no verifier even needed
+	d := g.Authorize(intent, nil)
+	if d.Allowed || d.Reason != ReasonPendingSignature {
+		t.Fatalf("hub-scratch destroy: got allowed=%v reason=%s, want pending_signature", d.Allowed, d.Reason)
+	}
+}
+
+func TestGate_ValidOpAcceptedThenReplayRejected(t *testing.T) {
+	// The ONE positive case: valid signature, correct role, correct target, fresh
+	// nonce → accepted. A SECOND presentation (same nonce) → rejected (nonce consumed).
+	op := newTestSigner(t)
+	path := filepath.Join(t.TempDir(), "n.log")
+	v, _ := realVerifierAt(t, path, testHost, op.allowed(t, "op1", authz.RoleOperational))
+	aud := &captureAudit{}
+	g := NewGate(v, testHost, aud, nil)
+
+	issued, expires := freshWindow()
+	n := nonce()
+	signed := op.mint("guest_destroy", testHost, "9001", "op1", n, `{"purge":true}`, issued, expires)
+
+	d := g.Authorize(destroyIntent(SourceOneShotJob), signed)
+	if !d.Allowed || d.Reason != ReasonSigned {
+		t.Fatalf("valid op: got allowed=%v reason=%s err=%v, want accepted/signed", d.Allowed, d.Reason, d.Err)
+	}
+	if d.Verified == nil || d.Verified.Nonce != n {
+		t.Fatalf("accepted op should surface the verified op with nonce %s", n)
+	}
+	// Replay the exact same signed op → nonce already consumed.
+	d2 := g.Authorize(destroyIntent(SourceOneShotJob), signed)
+	if d2.Allowed || !errors.Is(d2.Err, authz.ErrReplay) {
+		t.Fatalf("replay: got allowed=%v err=%v, want ErrReplay", d2.Allowed, d2.Err)
+	}
+}
+
+func TestGate_ReplayAcrossRestartRejected(t *testing.T) {
+	// Replay protection must survive an agent restart (the durable nonce store). Accept
+	// once with verifier A, then reopen the SAME nonce-store path as verifier B (a
+	// restart) and replay → still rejected.
+	op := newTestSigner(t)
+	path := filepath.Join(t.TempDir(), "n.log")
+	signer := op.allowed(t, "op1", authz.RoleOperational)
+
+	issued, expires := freshWindow()
+	n := nonce()
+	signed := op.mint("guest_destroy", testHost, "9001", "op1", n, `{"purge":true}`, issued, expires)
+
+	vA, storeA := realVerifierAt(t, path, testHost, signer)
+	if d := NewGate(vA, testHost, nil, nil).Authorize(destroyIntent(SourceOneShotJob), signed); !d.Allowed {
+		t.Fatalf("first presentation should be accepted: %+v", d)
+	}
+	storeA.Close() // simulate shutdown
+
+	vB, _ := realVerifierAt(t, path, testHost, signer) // restart: reopen same nonce log
+	d := NewGate(vB, testHost, nil, nil).Authorize(destroyIntent(SourceOneShotJob), signed)
+	if d.Allowed || !errors.Is(d.Err, authz.ErrReplay) {
+		t.Fatalf("replay across restart: got allowed=%v err=%v, want ErrReplay", d.Allowed, d.Err)
+	}
+}
+
+// --- gate unit tests (benign path, binding, params) ---
+
+func TestGate_BenignAllowedWithoutVerifier(t *testing.T) {
+	g := NewGate(nil, testHost, nil, nil) // no verifier at all
+	for _, k := range []ActionKind{ActionStart, ActionStop, ActionSetConfig} {
+		d := g.Authorize(intentForAction(testHost, Action{VMID: 100, Kind: k}), nil)
+		if !d.Allowed || d.Reason != ReasonBenign {
+			t.Errorf("benign %s: got allowed=%v reason=%s, want benign", k, d.Allowed, d.Reason)
+		}
+	}
+}
+
+func TestParamsEqual(t *testing.T) {
+	eq := func(a, b string) bool { return paramsEqual(json.RawMessage(a), json.RawMessage(b)) }
+	if !eq(`{"purge":true}`, `{"purge":true}`) {
+		t.Error("identical params should be equal")
+	}
+	if !eq(`{"a":1,"b":2}`, `{"b":2,"a":1}`) {
+		t.Error("key order must not matter")
+	}
+	if eq(`{"purge":true}`, `{"purge":false}`) {
+		t.Error("different values must differ")
+	}
+	if !eq(``, `{}`) || !eq(`{}`, `null`) {
+		t.Error("absent / empty / null params should all compare equal")
+	}
+}
+
+// --- helpers for the hub-scratch test: a stand-in for the slice-10 desired-delta →
+// intent constructor, proving it never propagates hub-supplied provenance. ---
+
+type hubDelta struct {
+	Class          OpClass
+	HostID         string
+	GuestID        string
+	VMID           int
+	HubSaysScratch bool // a hostile/erroneous hub hint — MUST be ignored
+}
+
+func intentFromHubDelta(d hubDelta) Intent {
+	// NOTE: HubSaysScratch is deliberately NOT mapped to Provenance. Agent-internal
+	// provenance (scratch/same-txn) is recorded by the agent's own journal, never taken
+	// from the hub (doc 03 §4).
+	return Intent{
+		Class:      d.Class,
+		HostID:     d.HostID,
+		GuestID:    d.GuestID,
+		VMID:       d.VMID,
+		Provenance: Provenance{}, // always zero from an external source
+		Source:     SourceDesiredDelta,
+	}
+}
@@ -0,0 +1,120 @@
+package reconcile
+
+import (
+	"context"
+	"encoding/json"
+	"fmt"
+	"time"
+
+	"gitea.dooplex.hu/admin/felhom-agent/internal/authz"
+	"gitea.dooplex.hu/admin/felhom-agent/internal/proxmox"
+)
+
+// DestructiveExecutor performs an authorized destructive op against the host. At slice
+// 4 there is NO live implementation (guest-destroy / storage-wipe / restore-overwrite
+// executors land in slices 6/7) — the consuming layer is wired and tested with fixture
+// executors but never executes a real destructive op, because nothing serves
+// destructive deltas until slice 10. It returns a Proxmox UPID (or "" for a synchronous
+// op) so the journal/Recover path is identical to benign execution.
+type DestructiveExecutor func(ctx context.Context, intent Intent, vop *authz.VerifiedOp) (upid string, err error)
+
+// JobResult is the outcome of RunSignedJob.
+type JobResult struct {
+	Decision       Decision
+	AlreadyApplied bool  // the op's idempotency key was already applied (deduped, not re-run)
+	Executed       bool  // the executor ran and succeeded
+	Err            error // execution error (after a successful authorization)
+}
+
+// RunSignedJob is the signed one-shot consuming layer (doc 03 §4(b) / doc 04). It adds
+// idempotency dedupe + journaling around the gate:
+//
+//  1. Dedupe: if the op's idempotency key (its nonce) is already applied, skip — a
+//     redelivered, already-completed op must not re-run (returns AlreadyApplied).
+//  2. Gate: classify + verify + role-scope + op-to-action bind. A refusal returns the
+//     Decision and executes nothing.
+//  3. Journal + execute: record started → run the executor → record the task id →
+//     record the terminal state under the idempotency key (so success marks the key
+//     applied; a crash mid-execute is resolved by Recover, never by idempotency alone).
+//
+// exec may be nil: an AUTHORIZED destructive op is then journaled (started, then
+// failed with no idempotency key) and not executed (the slice-4 inert state: the gate
+// works, the executor doesn't exist yet). A REFUSED op never reaches exec.
+func (e *Engine) RunSignedJob(ctx context.Context, intent Intent, signed *SignedOp, exec DestructiveExecutor) JobResult {
+	idemKey := jobIdempotencyKey(signed)
+
+	// 1. Idempotency dedupe (redelivery after a prior success).
+	if idemKey != "" && e.journal != nil && e.journal.AlreadyApplied(idemKey) {
+		e.logger.Info("job: idempotency key already applied; skipping", "key", auditNonce(idemKey))
+		return JobResult{AlreadyApplied: true, Decision: Decision{Allowed: true, Reason: ReasonSigned}}
+	}
+
+	// 2. Gate (classification + the full signed-op consuming policy).
+	dec := e.gate.Authorize(intent, signed)
+	if !dec.Allowed {
+		return JobResult{Decision: dec}
+	}
+
+	// 3. Journal + execute. Benign authorized ops (no signature path) also flow here if
+	// routed as jobs; they carry no idempotency key and are simply executed.
+	opID := e.nextJobOpID(intent)
+	e.append(JournalEntry{OpID: opID, VMID: intent.VMID, Kind: string(intent.Class),
+		State: OpStarted, IdempKey: idemKey, At: time.Now().UTC()})
+
+	if exec == nil {
+		// Slice-4 inert: authorized, but no destructive executor wired. Record the
+		// authorization terminally (do NOT mark applied — nothing actually ran).
+		e.append(JournalEntry{OpID: opID, VMID: intent.VMID, Kind: string(intent.Class),
+			State: OpFailed, IdempKey: "", At: time.Now().UTC()})
+		e.logger.Warn("job: authorized but no executor wired (slice-4 inert)", "class", intent.Class)
+		return JobResult{Decision: dec, Err: fmt.Errorf("reconcile: no executor for %s (not wired this slice)", intent.Class)}
+	}
+
+	upid, err := exec(ctx, intent, dec.Verified)
+	if err != nil {
+		e.append(JournalEntry{OpID: opID, VMID: intent.VMID, Kind: string(intent.Class),
+			State: OpFailed, IdempKey: idemKey, At: time.Now().UTC()})
+		return JobResult{Decision: dec, Err: err}
+	}
+	e.append(JournalEntry{OpID: opID, VMID: intent.VMID, Kind: string(intent.Class),
+		UPID: upid, State: OpTaskRunning, IdempKey: idemKey, At: time.Now().UTC()})
+
+	if upid != "" {
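+		st, err := e.api.WaitTask(ctx, upid, proxmox.WaitOptions{})
+		if err == nil && !st.OK() { // a task that stopped non-OK is a failure, not a success
+			err = fmt.Errorf("reconcile: task %s ended non-OK: %s", upid, st.ExitStatus)
+		}
+		if err != nil {
+			e.append(JournalEntry{OpID: opID, VMID: intent.VMID, Kind: string(intent.Class), UPID: upid, State: OpFailed, IdempKey: idemKey, At: time.Now().UTC()})
+			return JobResult{Decision: dec, Err: err}
+		}
+	}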
+	// Terminal success — marks the idempotency key applied (survives restart).
+	e.append(JournalEntry{OpID: opID, VMID: intent.VMID, Kind: string(intent.Class),
+		UPID: upid, State: OpSucceeded, IdempKey: idemKey, At: time.Now().UTC()})
+	return JobResult{Decision: dec, Executed: true}
+}
+
+// jobIdempotencyKey derives the idempotency key from the signed op's nonce — unique
+// per op (≥128-bit, doc 04 §2.1) and already the anti-replay token, so reusing it as
+// the journal dedupe key is exact. Parsed from the UNVERIFIED blob: it is only a map
+// key here (the gate's verifier is the trust boundary), and a forged blob is refused at
+// the gate regardless.
+func jobIdempotencyKey(signed *SignedOp) string {
+	if signed == nil || len(signed.Blob) == 0 {
+		return ""
+	}
+	var b struct {
+		Nonce string `json:"nonce"`
+	}
+	if json.Unmarshal(signed.Blob, &b) != nil {
+		return ""
+	}
+	return b.Nonce
+}
+
+// nextJobOpID builds a per-attempt op id for a signed job (distinct namespace from
+// reconcile op ids).
+func (e *Engine) nextJobOpID(intent Intent) string {
+	return "job-" + string(intent.Class) + "-" + intent.GuestID + "-" + nextSeq(&e.opSeq)
+}
@@ -0,0 +1,135 @@
+package reconcile
+
+import (
+	"context"
+	"errors"
+	"path/filepath"
+	"testing"
+
+	"gitea.dooplex.hu/admin/felhom-agent/internal/authz"
+)
+
+// newSignedEngine builds an engine whose gate has a real verifier pinning one
+// operational key — for exercising the signed-job consuming layer end to end.
+func newSignedEngine(t *testing.T, api GuestAPI) (*Engine, *Journal, testSigner) {
+	t.Helper()
+	j, err := OpenJournal(filepath.Join(t.TempDir(), "journal.log"))
+	if err != nil {
+		t.Fatalf("OpenJournal: %v", err)
+	}
+	t.Cleanup(func() { j.Close() })
+	q := NewQueue()
+	t.Cleanup(q.Close)
+	op := newTestSigner(t)
+	v, _ := realVerifierAt(t, filepath.Join(t.TempDir(), "n.log"), testHost, op.allowed(t, "op1", authz.RoleOperational))
+	g := NewGate(v, testHost, nil, nil)
+	e := NewEngine(EngineOptions{API: api, Queue: q, Journal: j, Gate: g, HostID: testHost})
+	return e, j, op
+}
+
+func TestRunSignedJob_ValidExecutesAndMarksApplied(t *testing.T) {
+	e, j, op := newSignedEngine(t, &fakeAPI{})
+	issued, expires := freshWindow()
+	n := nonce()
+	signed := op.mint("guest_destroy", testHost, "9001", "op1", n, `{"purge":true}`, issued, expires)
+
+	calls := 0
+	exec := func(context.Context, Intent, *authz.VerifiedOp) (string, error) { calls++; return "", nil } // synchronous
+
+	res := e.RunSignedJob(context.Background(), destroyIntent(SourceOneShotJob), signed, exec)
+	if !res.Executed || res.Err != nil {
+		t.Fatalf("valid job should execute, got %+v", res)
+	}
+	if calls != 1 {
+		t.Errorf("executor should run once, ran %d", calls)
+	}
+	if !j.AlreadyApplied(n) {
+		t.Error("successful job must mark its idempotency key (nonce) applied")
+	}
+}
+
+func TestRunSignedJob_RedeliveryDedupedByIdempotencyKey(t *testing.T) {
+	// After success, a redelivered identical job must NOT re-run — the journal's
+	// idempotency key short-circuits BEFORE the verifier (so it reports already-applied,
+	// not a confusing replay rejection).
+	e, _, op := newSignedEngine(t, &fakeAPI{})
+	issued, expires := freshWindow()
+	n := nonce()
+	signed := op.mint("guest_destroy", testHost, "9001", "op1", n, `{"purge":true}`, issued, expires)
+
+	calls := 0
+	exec := func(context.Context, Intent, *authz.VerifiedOp) (string, error) { calls++; return "", nil }
+
+	first := e.RunSignedJob(context.Background(), destroyIntent(SourceOneShotJob), signed, exec)
+	if !first.Executed {
+		t.Fatalf("first delivery should execute: %+v", first)
+	}
+	second := e.RunSignedJob(context.Background(), destroyIntent(SourceOneShotJob), signed, exec)
+	if !second.AlreadyApplied || second.Executed {
+		t.Fatalf("redelivery should be deduped (already applied), got %+v", second)
+	}
+	if calls != 1 {
+		t.Errorf("executor must run exactly once across redelivery, ran %d", calls)
+	}
+}
+
+func TestRunSignedJob_RefusedDoesNotExecute(t *testing.T) {
+	e, j, _ := newSignedEngine(t, &fakeAPI{})
+	attacker := newTestSigner(t) // not pinned
+	issued, expires := freshWindow()
+	n := nonce()
+	signed := attacker.mint("guest_destroy", testHost, "9001", "op1", n, `{"purge":true}`, issued, expires)
+
+	calls := 0
+	exec := func(context.Context, Intent, *authz.VerifiedOp) (string, error) { calls++; return "", nil }
+
+	res := e.RunSignedJob(context.Background(), destroyIntent(SourceOneShotJob), signed, exec)
+	if res.Executed || res.Decision.Allowed || !errors.Is(res.Decision.Err, authz.ErrUnknownSigner) {
+		t.Fatalf("forged job must be refused unexecuted, got %+v", res)
+	}
+	if calls != 0 {
+		t.Errorf("executor must not run for a refused job, ran %d", calls)
+	}
+	if j.AlreadyApplied(n) {
+		t.Error("a refused job must not mark its key applied")
+	}
+}
+
+func TestRunSignedJob_NoExecutorInert(t *testing.T) {
+	// Slice-4 inert state: a VALID authorization with no destructive executor wired
+	// returns an error and does NOT mark the key applied, so a later delivery is not
+	// deduped as already applied. The gate has consumed the nonce, so a retry must be re-signed.
+	e, j, op := newSignedEngine(t, &fakeAPI{})
+	issued, expires := freshWindow()
+	n := nonce()
+	signed := op.mint("guest_destroy", testHost, "9001", "op1", n, `{"purge":true}`, issued, expires)
+
+	res := e.RunSignedJob(context.Background(), destroyIntent(SourceOneShotJob), signed, nil)
+	if !res.Decision.Allowed {
+		t.Fatalf("op should authorize even with no executor: %+v", res.Decision)
+	}
+	if res.Executed || res.Err == nil {
+		t.Fatalf("no-executor job should not execute and should error, got %+v", res)
+	}
+	if j.AlreadyApplied(n) {
+		t.Error("an unexecuted (no-executor) job must not mark its key applied")
+	}
+}
+
+func TestRunSignedJob_ExecutorErrorJournaledFailed(t *testing.T) {
+	e, j, op := newSignedEngine(t, &fakeAPI{})
+	issued, expires := freshWindow()
+	n := nonce()
+	signed := op.mint("guest_destroy", testHost, "9001", "op1", n, `{"purge":true}`, issued, expires)
+
+	exec := func(context.Context, Intent, *authz.VerifiedOp) (string, error) {
+		return "", errors.New("destroy failed")
+	}
+	res := e.RunSignedJob(context.Background(), destroyIntent(SourceOneShotJob), signed, exec)
+	if res.Executed || res.Err == nil {
+		t.Fatalf("executor error should propagate, got %+v", res)
+	}
+	if j.AlreadyApplied(n) {
+		t.Error("a failed execution must not mark its key applied")
+	}
+}
@@ -0,0 +1,129 @@
+package reconcile
+
+import (
+	"crypto/ed25519"
+	"crypto/rand"
+	"crypto/sha256"
+	"encoding/pem"
+	"fmt"
+	"testing"
+	"time"
+
+	"gitea.dooplex.hu/admin/felhom-agent/internal/authz"
+	"golang.org/x/crypto/ssh"
+)
+
+// In-test SSHSIG minter for the gate's adversarial matrix. It replicates the ~40 lines
+// of SSHSIG framing (porting internal/authz/sshsig.go + mint_test.go) so reconcile's
+// tests can produce valid AND adversarial signatures with now-relative timestamps.
+// This lives only in reconcile's test binary — production authz is untouched (no
+// signing capability is added to the verify-only security package), and the verifier's
+// unexported clock (not injectable cross-package) is why we mint live rather than reuse
+// the committed fixed-window fixture.
+//
+// The minted bytes round-trip through the REAL authz.Verifier, so a correct framing is
+// proven by the positive case verifying.
+
+const sshsigMagic = "SSHSIG"
+
+type sshsigBlob struct {
+	Version   uint32
+	PublicKey string
+	Namespace string
+	Reserved  string
+	HashAlgo  string
+	Signature string
+}
+
+// signedDataForTest recomputes the SSHSIG signed bytes: "SSHSIG" || marshal(ns, reserved,
+// hash, H(message)). Mirrors authz.signedData exactly (sha256).
+func signedDataForTest(ns string, msg []byte) []byte {
+	h := sha256.Sum256(msg)
+	body := ssh.Marshal(struct {
+		Namespace string
+		Reserved  string
+		HashAlgo  string
+		Hash      []byte
+	}{ns, "", "sha256", h[:]})
+	return append([]byte(sshsigMagic), body...)
+}
+
+// mintArmor builds an armored SSHSIG over message using sign.
+func mintArmor(pubMarshaled []byte, namespace string, message []byte, sign func([]byte) ssh.Signature) []byte {
+	sb := &sshsigBlob{Version: 1, PublicKey: string(pubMarshaled), Namespace: namespace, Reserved: "", HashAlgo: "sha256"}
+	sig := sign(signedDataForTest(namespace, message))
+	sb.Signature = string(ssh.Marshal(&sig))
+	raw := append([]byte(sshsigMagic), ssh.Marshal(sb)...)
+	return pem.EncodeToMemory(&pem.Block{Type: "SSH SIGNATURE", Bytes: raw})
+}
+
+// nonce returns a fresh 128-bit hex nonce (doc 04 §2.1: ≥128-bit random).
+func nonce() string {
+	var b [16]byte
+	if _, err := rand.Read(b[:]); err != nil {
+		panic(err)
+	}
+	const hexdigits = "0123456789abcdef"
+	out := make([]byte, 32)
+	for i, x := range b {
+		out[i*2] = hexdigits[x>>4]
+		out[i*2+1] = hexdigits[x&0x0f]
+	}
+	return string(out)
+}
+
+// canonicalBlob builds an op blob in the doc 04 §2.1 canonical form (keys sorted at every
+// level, no insignificant whitespace). paramsJSON is embedded as-is and must be canonical.
+func canonicalBlob(op, hostID, guestID, keyID, nonce, paramsJSON string, issued, expires time.Time) []byte {
+	if paramsJSON == "" {
+		paramsJSON = "{}"
+	}
+	return []byte(fmt.Sprintf(
+		`{"expires_at":%q,"issued_at":%q,"key_id":%q,"nonce":%q,"op":%q,"params":%s,"target":{"guest_id":%q,"host_id":%q}}`,
+		expires.UTC().Format(time.RFC3339), issued.UTC().Format(time.RFC3339),
+		keyID, nonce, op, paramsJSON, guestID, hostID))
+}
+
+// testSigner is a fresh ed25519 operator key: its public key, an authorized_keys line
+// to pin it, and a sign closure.
+type testSigner struct {
+	pub  ssh.PublicKey
+	line string
+	sign func([]byte) ssh.Signature
+}
+
+func newTestSigner(t *testing.T) testSigner {
+	t.Helper()
+	pub, priv, err := ed25519.GenerateKey(rand.Reader)
+	if err != nil {
+		t.Fatal(err)
+	}
+	sshPub, err := ssh.NewPublicKey(pub)
+	if err != nil {
+		t.Fatal(err)
+	}
+	return testSigner{
+		pub:  sshPub,
+		line: string(ssh.MarshalAuthorizedKey(sshPub)),
+		sign: func(d []byte) ssh.Signature {
+			return ssh.Signature{Format: ssh.KeyAlgoED25519, Blob: ed25519.Sign(priv, d)}
+		},
+	}
+}
+
+// allowed builds a pinned AllowedSigner for this key with the given id+role.
+func (s testSigner) allowed(t *testing.T, keyID string, role authz.KeyRole) authz.AllowedSigner {
+	t.Helper()
+	as, err := authz.NewAllowedSigner(keyID, role, s.line)
+	if err != nil {
+		t.Fatalf("NewAllowedSigner: %v", err)
+	}
+	return as
+}
+
+// mint builds a SignedOp (canonical blob + armored sig) for this signer.
+func (s testSigner) mint(op, hostID, guestID, keyID, nonce, paramsJSON string, issued, expires time.Time) *SignedOp {
+	blob := canonicalBlob(op, hostID, guestID, keyID, nonce, paramsJSON, issued, expires)
+	sig := mintArmor(s.pub.Marshal(), authz.Namespace, blob, s.sign)
+	return &SignedOp{Blob: blob, Sig: sig}
+}
@@ -37,6 +37,13 @@ type Action struct {
 // the MiB unit Proxmox's LXC `memory` config field uses.
 const bytesPerMiB = 1024 * 1024

+// desiredMemoryMiB canonicalizes a desired byte count to the integer MiB that
+// Proxmox's `memory` field stores and reports. Floor division is deliberate and
+// convergent: the value returned here is exactly the value written via SetConfig, so a
+// subsequent read returns the same MiB and the comparison settles (see Plan's memory
+// note). The actual side (a.MemoryMiB) is already MiB from GuestConfig.
+func desiredMemoryMiB(bytes int64) int64 { return bytes / bytesPerMiB }
+
 // Plan computes the minimal benign action set converging actual → desired. It is a
 // pure function (deterministic, side-effect-free) so it is exhaustively fixture-test
 // -able. Actions are returned sorted by vmid, then config-before-runstate per guest.
@@ -81,7 +88,14 @@ func Plan(desired DesiredState, actual ActualState, norm FieldNormalizers) []Act
 					params["cores"] = strconv.Itoa(d.Spec.Cores)
 					reasons = append(reasons, fmt.Sprintf("cores %d->%d", a.Cores, d.Spec.Cores))
 				}
-				if want := d.Spec.MemoryBytes / bytesPerMiB; want != a.MemoryMiB {
+				// Memory is canonicalized to MiB on BOTH sides before comparison — the
+				// numeric cousin of the description-newline normalization (string
+				// normalizers cover string fields; this is the integer one). We compare
+				// the SAME MiB value we then write, so a non-MiB-aligned desired
+				// converges in one pass (write `want` MiB → PVE stores `want` MiB → next
+				// read a.MemoryMiB == want → no further action), never perpetual drift.
+				// Slice 10 should still serve MiB-aligned MemoryBytes at the source.
+				if want := desiredMemoryMiB(d.Spec.MemoryBytes); want != a.MemoryMiB {
 					params["memory"] = strconv.FormatInt(want, 10)
 					reasons = append(reasons, fmt.Sprintf("memory %dMiB->%dMiB", a.MemoryMiB, want))
 				}
@@ -73,6 +73,23 @@ func TestPlan_SpecDrift(t *testing.T) {
 	mustActions(t, got)
 }

+func TestPlan_MemoryNonAlignedConverges(t *testing.T) {
+	// Note-2 guard: a desired MemoryBytes that is NOT a clean MiB multiple must not
+	// cause perpetual drift. We compare in MiB and write the SAME MiB we compared, so it
+	// settles in one pass.
+	desiredBytes := int64(2049)*bytesPerMiB + 500000 // 2049 MiB + change → floors to 2049
+	d := desired(DesiredGuest{VMID: 100, Spec: &hub.GuestSpec{Cores: 2, MemoryBytes: desiredBytes}})
+
+	// First pass: actual is 2048 MiB → one SetConfig memory=2049.
+	got := Plan(d, actual(ActualGuest{VMID: 100, Run: RunStopped, SpecKnown: true, Cores: 2, MemoryMiB: 2048}), nil)
+	mustActions(t, got, Action{VMID: 100, Kind: ActionSetConfig, Params: map[string]string{"memory": "2049"}})
+
+	// Apply it: actual becomes 2049 MiB. Re-plan against the SAME desired → no action.
+	if got2 := Plan(d, actual(ActualGuest{VMID: 100, Run: RunStopped, SpecKnown: true, Cores: 2, MemoryMiB: 2049}), nil); len(got2) != 0 {
+		t.Fatalf("non-MiB-aligned memory did not converge (perpetual drift): %+v", got2)
+	}
+}
+
 func TestPlan_DiskNotReconciled(t *testing.T) {
 	// DiskBytes differs but is intentionally not reconciled (pct resize, later slice).
 	got := Plan(
@@ -0,0 +1,97 @@
+package reconcile
+
+import (
+	"context"
+	"time"
+)
+
+// Recover consumes the journal's in-flight set at startup: resume-or-rollback for any
+// op that was mid-execution when the agent crashed (doc 03 §10). This MUST run before
+// the engine begins issuing new mutations.
+//
+// Why it is load-bearing for signed destructive ops (and why it lands with the gate):
+// the idempotency-key store dedupes a COMPLETED op, but an op that crashed AFTER the
+// Proxmox POST and BEFORE its terminal record (OpTaskRunning) is not covered by that —
+// its nonce is already consumed, so a redelivery is rejected as a replay, yet it never
+// reached a terminal state. Only this startup consumer can resolve it: re-check the
+// Proxmox task and record the real outcome.
+//
+// Resolution per in-flight entry:
+//   - has a task id (OpTaskRunning): re-read the task status once. Stopped → record the
+//     real terminal state (OK → succeeded, else failed). Still running → leave it
+//     in-flight (a later Recover or the task's own completion resolves it). Unreadable →
+//     leave it (cannot safely decide).
+//   - no task id (OpStarted only): the Proxmox POST was never confirmed, so the op is
+//     treated as not having taken effect and recorded failed (fail-safe, the documented
+//     FileNonceStore direction). A convergent reconcile op is simply re-issued next pass;
+//     a one-shot op did NOT mark its idempotency key applied, so it is not falsely deduped.
+func (e *Engine) Recover(ctx context.Context) RecoverResult {
+	var res RecoverResult
+	if e.journal == nil {
+		return res
+	}
+	for _, entry := range e.journal.InFlight() {
+		res.Examined++
+		if entry.UPID == "" {
+			// POST never confirmed → abandon (fail-safe).
+			e.append(terminal(entry, OpFailed))
+			res.RolledBack++
+			e.logger.Warn("recover: in-flight op had no task id; marked failed (fail-safe)",
+				"op_id", entry.OpID, "vmid", entry.VMID, "kind", entry.Kind)
+			continue
+		}
+		st, err := e.api.TaskStatusOnce(ctx, entry.UPID)
+		if err != nil {
+			res.Unresolved++
+			e.logger.Warn("recover: cannot read in-flight task status; left in-flight",
+				"op_id", entry.OpID, "upid", entry.UPID, "err", err)
+			continue
+		}
+		if st.Running() {
+			res.StillRunning++
+			e.logger.Info("recover: in-flight task still running; left in-flight",
+				"op_id", entry.OpID, "upid", entry.UPID)
+			continue
+		}
+		// Stopped: record the real outcome.
+		if st.OK() {
+			e.append(terminal(entry, OpSucceeded))
+			res.Resumed++
+			e.logger.Info("recover: in-flight task completed OK; marked succeeded",
+				"op_id", entry.OpID, "upid", entry.UPID)
+		} else {
+			e.append(terminal(entry, OpFailed))
+			res.Failed++
+			e.logger.Warn("recover: in-flight task ended non-OK; marked failed",
+				"op_id", entry.OpID, "upid", entry.UPID, "exitstatus", st.ExitStatus)
+		}
+	}
+	if res.Examined > 0 {
+		e.logger.Info("recover: in-flight journal reconciled", "result", res)
+	}
+	return res
+}
+
+// RecoverResult summarizes a startup recovery pass.
+type RecoverResult struct {
+	Examined     int
+	Resumed      int // task found completed OK and recorded succeeded
+	Failed       int // task found ended non-OK and recorded failed
+	RolledBack   int // no task id → abandoned (fail-safe)
+	StillRunning int // task still executing → left in-flight
+	Unresolved   int // task status unreadable → left in-flight
+}
+
+// terminal builds a terminal journal record preserving the op's identity, with the
+// idempotency key carried through so a SUCCEEDED one-shot op marks its key applied.
+func terminal(e JournalEntry, state OpState) JournalEntry {
+	return JournalEntry{
+		OpID:     e.OpID,
+		VMID:     e.VMID,
+		Kind:     e.Kind,
+		UPID:     e.UPID,
+		State:    state,
+		IdempKey: e.IdempKey,
+		At:       time.Now().UTC(),
+	}
+}
@@ -0,0 +1,103 @@
+package reconcile
+
+import (
+	"context"
+	"errors"
+	"testing"
+	"time"
+
+	"gitea.dooplex.hu/admin/felhom-agent/internal/proxmox"
+)
+
+func seedInFlight(t *testing.T, j *Journal, e JournalEntry) {
+	t.Helper()
+	e.State = OpTaskRunning
+	if e.At.IsZero() {
+		e.At = time.Now().UTC()
+	}
+	if err := j.Append(e); err != nil {
+		t.Fatalf("seed: %v", err)
+	}
+}
+
+func TestRecover_TaskCompletedOKMarksSucceeded(t *testing.T) {
+	api := &fakeAPI{statusFunc: func(string) (proxmox.TaskStatus, error) {
+		return proxmox.TaskStatus{Status: "stopped", ExitStatus: "OK"}, nil
+	}}
+	e, j, _ := newEngine(t, api, EmptyProvider{})
+	seedInFlight(t, j, JournalEntry{OpID: "op1", VMID: 100, Kind: "set_config", UPID: "UPID:x:", IdempKey: "k1"})
+
+	res := e.Recover(context.Background())
+	if res.Examined != 1 || res.Resumed != 1 {
+		t.Fatalf("want 1 resumed, got %+v", res)
+	}
+	if len(j.InFlight()) != 0 {
+		t.Errorf("resolved op should not be in-flight: %+v", j.InFlight())
+	}
+	// A resumed one-shot op marks its idempotency key applied (it really completed) —
+	// this is the case idempotency-alone could not cover (Note 1).
+	if !j.AlreadyApplied("k1") {
+		t.Error("a recovered-succeeded op must mark its idempotency key applied")
+	}
+}
+
+func TestRecover_TaskEndedNonOKMarksFailed(t *testing.T) {
+	api := &fakeAPI{statusFunc: func(string) (proxmox.TaskStatus, error) {
+		return proxmox.TaskStatus{Status: "stopped", ExitStatus: "got 403"}, nil
+	}}
+	e, j, _ := newEngine(t, api, EmptyProvider{})
+	seedInFlight(t, j, JournalEntry{OpID: "op2", VMID: 100, Kind: "guest_destroy", UPID: "UPID:x:", IdempKey: "k2"})
+
+	res := e.Recover(context.Background())
+	if res.Failed != 1 {
+		t.Fatalf("want 1 failed, got %+v", res)
+	}
+	if j.AlreadyApplied("k2") {
+		t.Error("a failed op must NOT mark its key applied (it may be retried)")
+	}
+}
+
+func TestRecover_TaskStillRunningLeftInFlight(t *testing.T) {
+	api := &fakeAPI{statusFunc: func(string) (proxmox.TaskStatus, error) {
+		return proxmox.TaskStatus{Status: "running"}, nil
+	}}
+	e, j, _ := newEngine(t, api, EmptyProvider{})
+	seedInFlight(t, j, JournalEntry{OpID: "op3", VMID: 100, Kind: "set_config", UPID: "UPID:x:"})
+
+	res := e.Recover(context.Background())
+	if res.StillRunning != 1 || len(j.InFlight()) != 1 {
+		t.Fatalf("still-running task must be left in-flight, got res=%+v inflight=%d", res, len(j.InFlight()))
+	}
+}
+
+func TestRecover_NoTaskIDRolledBack(t *testing.T) {
+	// OpStarted with no UPID: the POST was never confirmed → abandon (fail-safe).
+	e, j, _ := newEngine(t, &fakeAPI{}, EmptyProvider{})
+	if err := j.Append(JournalEntry{OpID: "op4", VMID: 100, Kind: "start", State: OpStarted, At: time.Now().UTC()}); err != nil {
+		t.Fatal(err)
+	}
+	res := e.Recover(context.Background())
+	if res.RolledBack != 1 || len(j.InFlight()) != 0 {
+		t.Fatalf("no-task op must be rolled back, got res=%+v inflight=%d", res, len(j.InFlight()))
+	}
+}
+
+func TestRecover_UnreadableStatusLeftInFlight(t *testing.T) {
+	api := &fakeAPI{statusFunc: func(string) (proxmox.TaskStatus, error) {
+		return proxmox.TaskStatus{}, errors.New("api unreachable")
+	}}
+	e, j, _ := newEngine(t, api, EmptyProvider{})
+	seedInFlight(t, j, JournalEntry{OpID: "op5", VMID: 100, Kind: "set_config", UPID: "UPID:x:"})
+
+	res := e.Recover(context.Background())
+	if res.Unresolved != 1 || len(j.InFlight()) != 1 {
+		t.Fatalf("unreadable status must leave op in-flight, got res=%+v inflight=%d", res, len(j.InFlight()))
+	}
+}
+
+func TestRecover_EmptyJournalNoop(t *testing.T) {
+	e, _, _ := newEngine(t, &fakeAPI{}, EmptyProvider{})
+	if res := e.Recover(context.Background()); res.Examined != 0 {
+		t.Errorf("empty journal recover should be a no-op, got %+v", res)
+	}
+}
@@ -112,6 +112,9 @@ type GuestAPI interface {
 	Stop(ctx context.Context, vmid int) (string, error)
 	SetConfig(ctx context.Context, vmid int, params map[string]string) (string, error)
 	WaitTask(ctx context.Context, upid string, opts proxmox.WaitOptions) (proxmox.TaskStatus, error)
+	// TaskStatusOnce is a single non-blocking task-status read — used by crash
+	// recovery to learn the outcome of an op that was in flight when the agent died.
+	TaskStatusOnce(ctx context.Context, upid string) (proxmox.TaskStatus, error)
 }

 // guestDescription decodes the (string-valued) `description` key from a GuestConfig's