v0.4.0-rc1: slice 4 Phase A — reconcile engine (structural, runs live unfed)

New internal/reconcile package: the agent-side control core's structural half.

- Per-guest serializer Queue (doc 03 §10): the single choke point all mutation
  sources funnel through; same-vmid serial in submit order, different vmids
  parallel (cond-var FIFO lanes).
- Desired-state model + DesiredProvider seam; EmptyProvider is the only live
  source at slice 4 (no hub serving until slice 10) so the live engine computes
  an empty action set and performs zero mutations.
- Normalization layer (FieldNormalizers): compares desired and actual after
  normalizing both, so Proxmox round-trip quirks (e.g. the trailing newline
  PVE appends to description) don't read as drift. normDesc promoted out of
  main.go to reconcile.NormDescription; selftest uses the shared helper.
- Plan (pure diff): minimal benign action set (Start/Stop/SetConfig) for guests
  in both desired and actual; provision/destroy out of scope here.
- Engine: dispatches onto the shared queue; honors the dual-mode SetConfig
  contract (UPID -> WaitTask; empty UPID -> synchronous success).
- Durable op journal + idempotency store (mirrors authz.FileNonceStore):
  in-flight task ids for crash detection + AlreadyApplied dedupe across restart.
- Wired into runDaemon alongside the hub loop, sharing the queue; runs cleanly
  with no desired state and no signers.
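
A minimal sketch of the per-guest serializer idea from the first bullet, for
illustration only. It assumes a ticket-per-lane design built on sync.Cond; the
Queue, lane, Submit and runLane names here are hypothetical and are not the
actual internal/reconcile API. Lanes are never reclaimed and a panicking op
would stall its lane, both of which a real implementation would have to handle.

```go
package main

import (
	"fmt"
	"sync"
)

// lane is one vmid's FIFO: tickets are handed out at submit time and served in order.
type lane struct {
	cond    *sync.Cond
	next    uint64 // next ticket to hand out
	serving uint64 // ticket currently allowed to run
}

// Queue is a hypothetical per-guest serializer (not the real internal/reconcile type).
type Queue struct {
	mu    sync.Mutex
	lanes map[int]*lane
}

func NewQueue() *Queue { return &Queue{lanes: make(map[int]*lane)} }

// Submit reserves the caller's place in the vmid's lane before returning, then
// runs fn on its own goroutine once every earlier op for that vmid has finished.
func (q *Queue) Submit(vmid int, fn func() error) <-chan error {
	q.mu.Lock()
	l, ok := q.lanes[vmid]
	if !ok {
		l = &lane{cond: sync.NewCond(&q.mu)}
		q.lanes[vmid] = l
	}
	ticket := l.next
	l.next++
	q.mu.Unlock()

	done := make(chan error, 1)
	go func() {
		q.mu.Lock()
		for l.serving != ticket {
			l.cond.Wait()
		}
		q.mu.Unlock()

		err := fn() // outside the lock, so other vmids are not blocked

		q.mu.Lock()
		l.serving++
		l.cond.Broadcast()
		q.mu.Unlock()
		done <- err
	}()
	return done
}

// runLane submits n ops for one vmid and returns the order they executed in.
func runLane(vmid, n int) []int {
	q := NewQueue()
	var order []int
	var dones []<-chan error
	for i := 0; i < n; i++ {
		i := i
		dones = append(dones, q.Submit(vmid, func() error {
			order = append(order, i) // safe: same-vmid ops never overlap
			return nil
		}))
	}
	for _, d := range dones {
		<-d
	}
	return order
}

func main() {
	fmt.Println(runLane(101, 5)) // prints [0 1 2 3 4]
}
```

Because the ticket is taken inside Submit rather than inside the goroutine,
execution order follows submit order even though each op runs on its own
goroutine. Ops for different vmids wait on different lanes and can overlap.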

Full module race-clean and vet-clean on the Linux build server.

CHECKPOINT: Phase A only. Awaiting validation before Phase B (the reversibility
gate + signed-op consuming layer, landing v0.4.0).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 23:21:55 +02:00
parent 605ce25f58
commit 05c450147c
16 changed files with 1904 additions and 78 deletions
@@ -0,0 +1,130 @@
package reconcile

import (
	"context"
	"encoding/json"

	"gitea.dooplex.hu/admin/felhom-agent/internal/hub"
	"gitea.dooplex.hu/admin/felhom-agent/internal/proxmox"
)

// RunState is a guest's desired/actual power state. The empty value means
// "unmanaged" on the desired side (the reconciler leaves run-state alone).
type RunState string

const (
	// RunUnspecified (the zero value) — on a DesiredGuest it means run-state is not
	// managed; the reconciler never starts/stops the guest for run-state reasons.
	RunUnspecified RunState = ""
	// RunRunning maps to proxmox status "running".
	RunRunning RunState = "running"
	// RunStopped maps to proxmox status "stopped".
	RunStopped RunState = "stopped"
)

// normRun maps a raw proxmox status string to a RunState, collapsing anything
// unrecognized (e.g. "") to RunUnspecified so actual-state comparison is well-defined.
func normRun(status string) RunState {
	switch status {
	case "running":
		return RunRunning
	case "stopped":
		return RunStopped
	default:
		return RunUnspecified
	}
}

// DesiredGuest is the target state for one existing guest. Every field is
// individually optional ("unmanaged") so a desired-state source can pin only what it
// cares about — slice 4's planner only acts on the fields that are set.
type DesiredGuest struct {
	VMID int
	// Run is the target power state; RunUnspecified leaves it alone.
	Run RunState
	// Spec, when non-nil, manages sizing. Reuses hub.GuestSpec (cores/memory/disk).
	// Phase A reconciles Cores and Memory via SetConfig; DiskBytes is reported but
	// NOT reconciled here (a rootfs grow is `pct resize`, grow-only and separate —
	// deferred to a later slice). Nil = sizing unmanaged.
	Spec *hub.GuestSpec
	// Description, when non-nil, manages the cosmetic `description` field (the first
	// proven SetConfig round-trip, slice-4 pre-check). Nil = unmanaged.
	Description *string
}

// DesiredState is the vmid-keyed target for this host. At slice 4 the only live
// source is the empty provider, so Guests is empty in production; fixtures inject it
// in tests. Host-level desired state (storage manifest, etc.) arrives in later slices.
type DesiredState struct {
	Guests map[int]DesiredGuest
}

// ActualGuest is one guest's observed state, read from Proxmox.
type ActualGuest struct {
	VMID int
	Run  RunState
	// SpecKnown is false when GuestConfig could not be read (the run-state from the
	// list is still trusted; spec/description comparisons are skipped). Mirrors the
	// collector's "keep run-status, omit spec" degradation.
	SpecKnown   bool
	Cores       int
	MemoryMiB   int64  // proxmox LXC `memory` is MiB
	Description string // raw (may carry PVE's trailing newline; compared via normalizers)
}

// ActualState is the vmid-keyed observed state for this host.
type ActualState struct {
	Guests map[int]ActualGuest
}

// DesiredProvider is the seam the desired-state source plugs into. At slice 4 the
// only implementation is EmptyProvider (no live source); slice 10's hub-serving path
// is the real implementation. Do NOT invent a hub/local-file source here.
type DesiredProvider interface {
	Desired(ctx context.Context) (DesiredState, error)
}

// EmptyProvider is the slice-4 production provider: no desired state, so reconcile is
// a live no-op (the engine computes an empty action set).
type EmptyProvider struct{}

// Desired returns an empty desired state.
func (EmptyProvider) Desired(context.Context) (DesiredState, error) {
	return DesiredState{Guests: map[int]DesiredGuest{}}, nil
}

// StaticProvider serves a fixed DesiredState — used by fixtures (and usable as a
// local override later). It never mutates the value it was given.
type StaticProvider struct{ State DesiredState }

// Desired returns the static state.
func (p StaticProvider) Desired(context.Context) (DesiredState, error) { return p.State, nil }

// GuestAPI is the narrow Proxmox surface the engine needs: read actual state and
// dispatch the benign-on-existing-guest mutations. *proxmox.Client satisfies it; a
// fake satisfies it in tests. Every mutating call returns a UPID (or "" for the
// synchronous path) per the proxmox/mutate.go contract — the engine WaitTasks a
// non-empty UPID and treats "" as a clean synchronous success.
type GuestAPI interface {
	ListLXC(ctx context.Context) ([]proxmox.Guest, error)
	GuestConfig(ctx context.Context, vmid int) (proxmox.GuestConfig, error)
	Start(ctx context.Context, vmid int) (string, error)
	Stop(ctx context.Context, vmid int) (string, error)
	SetConfig(ctx context.Context, vmid int, params map[string]string) (string, error)
	WaitTask(ctx context.Context, upid string, opts proxmox.WaitOptions) (proxmox.TaskStatus, error)
}

// guestDescription decodes the (string-valued) `description` key from a GuestConfig's
// raw Extra map, returning "" when absent. The value is returned raw — PVE appends a
// trailing newline on read, which the normalization layer strips at comparison time.
func guestDescription(cfg proxmox.GuestConfig) string {
	raw, ok := cfg.Extra["description"]
	if !ok || len(raw) == 0 {
		return ""
	}
	var s string
	if err := json.Unmarshal(raw, &s); err != nil {
		return ""
	}
	return s
}