felhom-agent/internal/reconcile/classify.go
admin 1af21a6cac v0.4.0: slice 4 Phase B — reversibility gate + signed-op consuming layer
The security core of slice 4: hub-supplied intent is no longer trusted for
destructive change. The gate fronts the per-guest queue's executor, so every
mutation passes it. Reuses internal/authz for all crypto (surface untouched).

- Classifier (doc 03 §4): benign vs destructive by provenance + data-bearing-
  ness, NOT by verb. Destroy/overwrite of customer data is destructive unless
  agent-internal provenance (same-journaled-txn create, or agent-tagged scratch)
  makes it benign — and that provenance is journal-recorded, NEVER hub-sourced.
  Unknown op class fails safe to destructive.
- Reversibility gate: benign -> allowed unsigned; destructive -> requires a
  verified, role-scoped, action-bound operator signature, else pending_signature
  and never executed. Every decision audited (signal, never the guard).
- Signed-op consuming layer over authz.Verifier.Verify (locked pipeline
  untouched): role-scoping (doc 04 §4 — recovery=rotation only, operational=
  ordinary destructive + planned rotation) + op-to-action binding (op+host+
  guest+params must match the gated action).
- Signed-job orchestration: idempotency dedupe by nonce + journal-wrapped
  execution via an injected DestructiveExecutor (nil this slice — inert).
- Crash recovery (Note 1): Engine.Recover consumes the journal InFlight() set at
  startup (resume-or-rollback) — covers an op that crashed after the POST and
  before its terminal record, which idempotency dedupe alone cannot. Added
  TaskStatusOnce to the GuestAPI seam. Wired into daemon startup.
- Note 2: memory comparison canonicalized to MiB (desiredMemoryMiB) so a
  non-MiB-aligned MemoryBytes converges in one pass, not perpetual drift.
- Daemon: builds the verifier from config signers (none = nil verifier, the
  common slice-4 state), the gate (+SlogAudit), runs Recover before mutating.
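
Note 2 in the list above can be illustrated with a minimal sketch. The name desiredMemoryMiB comes from the commit message, but its signature, the floor rounding, and the premise that the guest reports memory in whole MiB are assumptions made for illustration; memoryDrifted is a hypothetical helper.

```go
package main

import "fmt"

const mib = 1 << 20 // bytes per MiB

// desiredMemoryMiB canonicalizes a desired byte count to whole MiB.
// Assumption: rounds down; the real function may round differently.
func desiredMemoryMiB(memoryBytes int64) int64 {
	return memoryBytes / mib
}

// memoryDrifted (hypothetical helper) compares desired and actual memory
// in the same unit, so a non-MiB-aligned desired value cannot drift forever.
func memoryDrifted(desiredBytes, actualMiB int64) bool {
	return desiredMemoryMiB(desiredBytes) != actualMiB
}

func main() {
	desired := int64(2*mib + 512) // not MiB-aligned

	// After one apply, the guest holds the canonicalized value.
	actualMiB := desiredMemoryMiB(desired)

	fmt.Println(memoryDrifted(desired, actualMiB)) // false: converged in one pass
	fmt.Println(actualMiB*mib != desired)          // true: a byte-level compare would still differ
}
```

Comparing in bytes would report a difference on every pass, because the applied value can never equal the unaligned desired value.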

Adversarial matrix proven against the REAL authz.Verifier with in-test-minted
SSHSIGs (framing replicated in reconcile's test binary; authz untouched, no
signing added to the verify-only package): unsigned job + unsigned desired-state
delta -> pending_signature; unknown signer/expired/replay-across-restart/wrong
host -> typed authz rejections; wrong guest/op/params -> binding_mismatch;
recovery key on ordinary destructive -> role_denied; hub-supplied scratch tag
ignored -> refused; valid+role+target+fresh nonce -> accepted then replay
rejected. Full module race-clean + vet-clean on the Linux build server.
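
The gate outcomes named in the matrix above can be sketched as a single decision function. This is an illustration under stated assumptions, not the agent's code: Target, SignedOp, Role, Decision, roleAllows, and decide are hypothetical names, parameters are modeled as one canonical string, the role check is assumed to run before the binding check, and the signature is assumed to have already passed authz.Verifier.Verify (which is not called here).

```go
package main

import "fmt"

type Role string

const (
	RoleOperational Role = "operational"
	RoleRecovery    Role = "recovery"
)

// Target identifies what is acted on: op + host + guest + params.
type Target struct{ Op, Host, Guest, Params string }

// SignedOp is the payload of an already-verified operator signature.
type SignedOp struct {
	Target
	Role Role
}

type Decision string

const (
	Allowed          Decision = "allowed"
	Accepted         Decision = "accepted"
	PendingSignature Decision = "pending_signature"
	RoleDenied       Decision = "role_denied"
	BindingMismatch  Decision = "binding_mismatch"
)

// roleAllows applies the role scoping: recovery signs rotation only,
// operational signs ordinary destructive ops and planned rotation.
func roleAllows(r Role, op string) bool {
	switch r {
	case RoleOperational:
		return true
	case RoleRecovery:
		return op == "key_rotation"
	default:
		return false // unknown role: deny
	}
}

// decide gates one action. sig is nil when no verified signature is present.
func decide(destructive bool, action Target, sig *SignedOp) Decision {
	if !destructive {
		return Allowed // benign actions run unsigned
	}
	if sig == nil {
		return PendingSignature // never executed until signed
	}
	if !roleAllows(sig.Role, sig.Op) {
		return RoleDenied
	}
	if sig.Target != action {
		return BindingMismatch // signature covers a different action
	}
	return Accepted
}

func main() {
	action := Target{Op: "guest_destroy", Host: "h1", Guest: "g1", Params: "{}"}
	wrongGuest := action
	wrongGuest.Guest = "g2"

	fmt.Println(decide(false, action, nil))                                               // allowed
	fmt.Println(decide(true, action, nil))                                                // pending_signature
	fmt.Println(decide(true, action, &SignedOp{Target: action, Role: RoleRecovery}))      // role_denied
	fmt.Println(decide(true, action, &SignedOp{Target: wrongGuest, Role: RoleOperational})) // binding_mismatch
	fmt.Println(decide(true, action, &SignedOp{Target: action, Role: RoleOperational}))   // accepted
}
```

Replay protection and the typed authz rejections (unknown signer, expired, wrong host) are deliberately left out, since they belong to the verifier rather than to this layer.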

Inert this slice: no destructive deltas served until slice 10; the destructive
path is classified, gated, and tested but not wired to live execution.

CHECKPOINT: Phase B complete (slice 4 done). Awaiting validation.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 23:56:20 +02:00


package reconcile

// The benign/destructive classifier (doc 03 §4). The gate decides whether an intended
// action needs an operator signature by **provenance + data-bearing-ness, NOT by
// verb**. The op CLASS encodes the semantic intent (a "detach storage" is its own
// destructive class, not just another PUT) so classification never turns on the HTTP
// method.

// OpClass is the semantic class of an intended action. This vocabulary is the
// agent-side contract: the signed-op `op` field (doc 04 §2.1) and slice-10's hub /
// operator CLI match these exact strings. The committed slice-2 fixture
// (op="guest_destroy") seeds it.
type OpClass string

const (
	// Benign-on-existing-guest set: wired to live execution this slice.
	ClassStart     OpClass = "start"
	ClassStop      OpClass = "stop"
	ClassSetConfig OpClass = "set_config" // benign sizing/description changes only

	// Benign by construction: classified now, executors land in later slices.
	ClassCreate  OpClass = "create"  // provision a NEW guest (restore-to-new, slice 7)
	ClassRestart OpClass = "restart" // heal a crashed controller in-place (§4)

	// Destructive set: destroying/overwriting the only/primary copy of customer
	// data. Classified and gated now; NOT wired to live execution this slice (nothing
	// serves destructive deltas until slice 10).
	ClassGuestDestroy     OpClass = "guest_destroy"
	ClassStorageWipe      OpClass = "storage_wipe"      // storage detach/wipe
	ClassRestoreOverwrite OpClass = "restore_overwrite" // restore OVER an existing guest
	ClassDecommission     OpClass = "decommission"

	// Key-rotation re-pin (doc 04 §4): destructive-class, role-scoped. The cold
	// recovery key authorizes ONLY this; the operational key authorizes this + ordinary
	// destructive ops.
	ClassKeyRotation OpClass = "key_rotation"
)

// Disposition is the classifier verdict.
type Disposition string

const (
	// Benign: the reconciler/executor MAY act without an operator signature.
	Benign Disposition = "benign"
	// Destructive: an operator signature bound to the action is REQUIRED.
	Destructive Disposition = "destructive"
)

// Provenance is AGENT-INTERNAL evidence that an otherwise-destructive action is
// actually safe (doc 03 §4). It is recorded in the operation journal by the agent's
// own bookkeeping and is **NEVER populated from the hub or any external input**, else
// a compromised hub could relabel a data-bearing guest as scratch to walk the gate.
// The zero value (no internal evidence) is the only value an externally-sourced intent
// may carry.
type Provenance struct {
	// SameTxnCreated: the agent created this resource earlier in the SAME journaled
	// transaction, so destroying it is a compensating rollback (§10), not data loss.
	SameTxnCreated bool
	// AgentTaggedScratch: the agent tagged this resource ephemeral/scratch (e.g. a
	// restore-test scratch guest, §8). Journal-recorded provenance only.
	AgentTaggedScratch bool
}

// internalEvidence reports whether agent-internal provenance makes a destroy benign.
func (p Provenance) internalEvidence() bool {
	return p.SameTxnCreated || p.AgentTaggedScratch
}

// Classify returns the disposition for an op class given agent-internal provenance.
//
// Rules (doc 03 §4):
//   - create/start/stop/restart/benign-set_config → always Benign.
//   - destroy/overwrite of a data-bearing resource → Destructive, UNLESS agent-internal
//     provenance (same-transaction create, or agent-tagged scratch) makes it benign.
//   - key-rotation → always Destructive (signed); role-scoping picks the allowed key.
//   - an UNKNOWN class fails safe → Destructive (require a signature).
func Classify(class OpClass, prov Provenance) Disposition {
	switch class {
	case ClassStart, ClassStop, ClassSetConfig, ClassCreate, ClassRestart:
		return Benign
	case ClassGuestDestroy, ClassStorageWipe, ClassRestoreOverwrite, ClassDecommission:
		if prov.internalEvidence() {
			return Benign // compensating rollback / scratch teardown
		}
		return Destructive
	case ClassKeyRotation:
		return Destructive
	default:
		return Destructive // fail safe: an unrecognized op is treated as destructive
	}
}

// classOfAction maps a benign reconcile ActionKind to its OpClass, so every reconcile
// mutation is classified and passed through the gate like any other intent.
func classOfAction(k ActionKind) OpClass {
	switch k {
	case ActionStart:
		return ClassStart
	case ActionStop:
		return ClassStop
	case ActionSetConfig:
		return ClassSetConfig
	default:
		return OpClass(k)
	}
}
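
The classification rules above can be exercised with a trimmed, standalone copy of the classifier. This is a sketch for illustration only: it reproduces a subset of the op classes in package main so it runs on its own, and "format_disk" is a made-up class used to show the fail-safe default.

```go
package main

import "fmt"

type OpClass string
type Disposition string

const (
	ClassStart        OpClass = "start"
	ClassGuestDestroy OpClass = "guest_destroy"
	ClassKeyRotation  OpClass = "key_rotation"

	Benign      Disposition = "benign"
	Destructive Disposition = "destructive"
)

// Provenance mirrors the agent-internal evidence struct; the zero value is
// the only value a hub-sourced intent may carry.
type Provenance struct {
	SameTxnCreated     bool
	AgentTaggedScratch bool
}

// Classify is a reduced copy of the classifier covering three classes.
func Classify(class OpClass, prov Provenance) Disposition {
	switch class {
	case ClassStart:
		return Benign
	case ClassGuestDestroy:
		if prov.SameTxnCreated || prov.AgentTaggedScratch {
			return Benign
		}
		return Destructive
	default:
		// Key rotation and unknown classes both require a signature.
		return Destructive
	}
}

func main() {
	cases := []struct {
		name  string
		class OpClass
		prov  Provenance
	}{
		{"start", ClassStart, Provenance{}},
		{"destroy, hub-sourced intent", ClassGuestDestroy, Provenance{}},
		{"destroy, same-txn rollback", ClassGuestDestroy, Provenance{SameTxnCreated: true}},
		{"destroy, agent scratch", ClassGuestDestroy, Provenance{AgentTaggedScratch: true}},
		{"key rotation, scratch tag", ClassKeyRotation, Provenance{AgentTaggedScratch: true}},
		{"unknown class", OpClass("format_disk"), Provenance{}},
	}
	for _, c := range cases {
		fmt.Printf("%-28s -> %s\n", c.name, Classify(c.class, c.prov))
	}
}
```

Only the two journal-recorded provenance cases come out benign for a destroy; key rotation stays destructive regardless of provenance, and the unknown class falls through to the default.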