hub v0.7.4: ingest agent pbs_snapshots (slice 6 Phase B)

Accept + persist the now-populated host-report pbs_snapshots. hostPBSSnapshot mirror in hostReportPayload (persisted via report_json, no schema change); a FAILED PBS verify is logged prominently (loudest offsite-DR signal). Shared golden updated byte-identical with felhom-agent; TestHostPBSSnapshot_GoldenContract added. Build/deploy deferred (backward-compatible).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -4,35 +4,33 @@
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
# REPORT — Hub: ingest agent backups + restore_tests (v0.7.3) (2026-06-09)
|
# REPORT — Hub: ingest agent pbs_snapshots (v0.7.4) (2026-06-09)
|
||||||
|
|
||||||
## Outcome
|
## Outcome
|
||||||
|
|
||||||
**Code committed + pushed (changelogged as `v0.7.3`); image build/deploy deferred to an
|
**Code committed + pushed (changelogged as `v0.7.4`); image build/deploy deferred to an
|
||||||
operator decision.** The felhom-agent slice-6 Phase A work populates the host-report's
|
operator decision.** The felhom-agent slice-6 Phase B work populates the host-report's
|
||||||
`backups` + `restore_tests`. This change is the hub half: accept + persist them. Minimal —
|
`pbs_snapshots` (PBS offsite inventory + per-snapshot verify-state). This is the hub half:
|
||||||
the authoritative backup policy is hub-owned (slice 10); this mirrors what the agent reports.
|
accept + persist them. Minimal — the authoritative offsite policy is hub-owned (slice 10).
|
||||||
|
|
||||||
## What landed (`hub/internal/api/handler.go`, `host_test.go`, golden)
|
## What landed (`hub/internal/api/handler.go`, `host_test.go`, golden)
|
||||||
|
|
||||||
- `hostReportPayload` gains `hostBackup` / `hostRestoreTest` mirror structs matching the
|
- `hostReportPayload` gains a `hostPBSSnapshot` mirror struct matching the agent's
|
||||||
agent's `hub.Backup` / `hub.RestoreTest` field-for-field.
|
`hub.PBSSnapshot` field-for-field, persisted via the existing `report_json` column.
|
||||||
- Persistence via the existing `report_json` column (no schema change). The handler logs a
|
- The handler logs a **FAILED PBS verify prominently** (`[WARN]` — the loudest offsite-DR
|
||||||
**FAILED restore-test prominently** (`[WARN]` — the loudest DR signal) and a failed backup;
|
signal); the host-report info line now counts pbs-snapshots too.
|
||||||
the host-report info line counts backups + restore-tests.
|
- The shared `testdata/host-report.golden.json` carries a populated `pbs_snapshots[0]`,
|
||||||
- The shared `testdata/host-report.golden.json` now carries a populated `backups[0]` /
|
**byte-identical** with felhom-agent's copy; `TestHostPBSSnapshot_GoldenContract` is the
|
||||||
`restore_tests[0]`, **byte-identical** with felhom-agent's copy.
|
hub half of the bidirectional key-set test. `go test ./internal/api/` is green.
|
||||||
- `TestHostBackup_GoldenContract` / `TestHostRestoreTest_GoldenContract` are the hub half of
|
|
||||||
the bidirectional key-set test. `go test ./internal/api/ ./internal/store/` is green.
|
|
||||||
|
|
||||||
## Backward compatibility
|
## Backward compatibility
|
||||||
|
|
||||||
An agent that omits/empties `backups`/`restore_tests` is accepted unchanged. The legacy
|
An agent that omits `pbs_snapshots` or sends it empty is accepted unchanged. The legacy controller
|
||||||
controller report path is untouched (frozen until the slice-10 cutover).
|
report path is untouched (frozen until the slice-10 cutover).
|
||||||
|
|
||||||
## Deploy
|
## Deploy
|
||||||
|
|
||||||
> Per the GitOps flow (`CLAUDE.md`): build+push `gitea.dooplex.hu/admin/felhom-hub:v0.7.3`,
|
> Per the GitOps flow (`CLAUDE.md`): build+push `gitea.dooplex.hu/admin/felhom-hub:v0.7.4`,
|
||||||
> bump `manifests/hub.yaml`, commit, then sync the `felhom` ArgoCD app. **Deferred** at this
|
> bump `manifests/hub.yaml`, commit, then sync the `felhom` ArgoCD app. **Deferred** at this
|
||||||
> checkpoint — the change is backward-compatible, so the live hub (v0.7.2) keeps ingesting
|
> checkpoint — the change is backward-compatible, so the live hub (v0.7.3) keeps ingesting
|
||||||
> host-reports fine until then.
|
> host-reports fine until then.
|
||||||
|
|||||||
@@ -1,5 +1,25 @@
|
|||||||
# Felhom Hub — Changelog
|
# Felhom Hub — Changelog
|
||||||
|
|
||||||
|
## v0.7.4 — ingest agent pbs_snapshots (slice 6 Phase B) (2026-06-09)
|
||||||
|
|
||||||
|
The agent's slice-6 Phase B work populates the host-report's `pbs_snapshots` (the PBS offsite
|
||||||
|
inventory + per-snapshot verify-state). This is the hub half: accept + persist them. Minimal —
|
||||||
|
the authoritative offsite policy is hub-owned (slice 10); this mirrors what the agent reports.
|
||||||
|
|
||||||
|
### Added
|
||||||
|
- **`hostPBSSnapshot`** mirror struct in `hostReportPayload` (`internal/api/handler.go`) —
|
||||||
|
field-for-field with the agent's `hub.PBSSnapshot` wire contract (namespace/backup_type/
|
||||||
|
backup_id/backup_time/size_bytes/owner/protected/encrypted/verify_state/verify_upid).
|
||||||
|
Persisted via `report_json` (no new columns — the slice-5/6A precedent).
|
||||||
|
- **A FAILED PBS verify is logged prominently** (`[WARN]` — the loudest offsite-DR signal,
|
||||||
|
same treatment as a failed restore-test). The `host-report` info line now counts pbs-snapshots.
|
||||||
|
- **`testdata/host-report.golden.json`** updated with a populated `pbs_snapshots[0]`, kept
|
||||||
|
**byte-identical** with felhom-agent's copy.
|
||||||
|
- **`TestHostPBSSnapshot_GoldenContract`** — the hub half of the bidirectional key-set test.
|
||||||
|
|
||||||
|
### Notes
|
||||||
|
- Backward-compatible: an agent that omits `pbs_snapshots` or sends it empty is accepted unchanged.
|
||||||
|
|
||||||
## v0.7.3 — ingest agent backups + restore_tests (slice 6 Phase A) (2026-06-09)
|
## v0.7.3 — ingest agent backups + restore_tests (slice 6 Phase A) (2026-06-09)
|
||||||
|
|
||||||
The agent's slice-6 work populates the host-report's `backups` + `restore_tests` (the
|
The agent's slice-6 work populates the host-report's `backups` + `restore_tests` (the
|
||||||
|
|||||||
@@ -260,11 +260,28 @@ type hostReportPayload struct {
|
|||||||
StorageTargets []hostStorageTarget `json:"storage_targets"`
|
StorageTargets []hostStorageTarget `json:"storage_targets"`
|
||||||
Backups []hostBackup `json:"backups"` // slice 6
|
Backups []hostBackup `json:"backups"` // slice 6
|
||||||
RestoreTests []hostRestoreTest `json:"restore_tests"` // slice 6
|
RestoreTests []hostRestoreTest `json:"restore_tests"` // slice 6
|
||||||
|
PBSSnapshots []hostPBSSnapshot `json:"pbs_snapshots"` // slice 6 Phase B
|
||||||
Cloudflared struct {
|
Cloudflared struct {
|
||||||
Status string `json:"status"`
|
Status string `json:"status"`
|
||||||
} `json:"cloudflared"`
|
} `json:"cloudflared"`
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// hostPBSSnapshot mirrors the agent's hub.PBSSnapshot wire contract (slice 6 Phase B). The
|
||||||
|
// hub persists it via report_json and surfaces a FAILED verify prominently (the loudest
|
||||||
|
// offsite-DR signal — same treatment as a failed restore-test).
|
||||||
|
type hostPBSSnapshot struct {
|
||||||
|
Namespace string `json:"namespace"`
|
||||||
|
BackupType string `json:"backup_type"`
|
||||||
|
BackupID string `json:"backup_id"`
|
||||||
|
BackupTime string `json:"backup_time"`
|
||||||
|
SizeBytes int64 `json:"size_bytes"`
|
||||||
|
Owner string `json:"owner"`
|
||||||
|
Protected bool `json:"protected"`
|
||||||
|
Encrypted bool `json:"encrypted"`
|
||||||
|
VerifyState string `json:"verify_state"`
|
||||||
|
VerifyUPID string `json:"verify_upid,omitempty"`
|
||||||
|
}
|
||||||
|
|
||||||
// hostBackup / hostRestoreTest mirror the agent's hub.Backup / hub.RestoreTest wire
|
// hostBackup / hostRestoreTest mirror the agent's hub.Backup / hub.RestoreTest wire
|
||||||
// contract field-for-field (slice 6, doc 03 §8). DUPLICATED contract — the golden stays
|
// contract field-for-field (slice 6, doc 03 §8). DUPLICATED contract — the golden stays
|
||||||
// byte-identical with felhom-agent's copy and the key-set tests guard drift. The hub
|
// byte-identical with felhom-agent's copy and the key-set tests guard drift. The hub
|
||||||
@@ -444,9 +461,16 @@ func (h *Handler) handleHostReport(w http.ResponseWriter, r *http.Request) {
|
|||||||
hostID, bk.TargetID, bk.VMID, bk.Error)
|
hostID, bk.TargetID, bk.VMID, bk.Error)
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
// pbs_snapshots (slice 6 Phase B): a FAILED PBS verify is the loudest offsite-DR signal.
|
||||||
|
for _, ps := range rep.PBSSnapshots {
|
||||||
|
if ps.VerifyState == "failed" {
|
||||||
|
h.logger.Printf("[WARN] host %s PBS verify FAILED: %s/%s ns=%s owner=%s",
|
||||||
|
hostID, ps.BackupType, ps.BackupID, ps.Namespace, ps.Owner)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
h.logger.Printf("[INFO] host-report from %s (%d guests, %d storage targets, %d backups, %d restore-tests, %d bytes)",
|
h.logger.Printf("[INFO] host-report from %s (%d guests, %d storage targets, %d backups, %d restore-tests, %d pbs-snapshots, %d bytes)",
|
||||||
hostID, len(rep.Guests), len(rep.StorageTargets), len(rep.Backups), len(rep.RestoreTests), len(body))
|
hostID, len(rep.Guests), len(rep.StorageTargets), len(rep.Backups), len(rep.RestoreTests), len(rep.PBSSnapshots), len(body))
|
||||||
|
|
||||||
blocked := false
|
blocked := false
|
||||||
if cc, err := h.store.GetCustomerConfig(custID); err == nil && cc != nil && cc.Status == "blocked" {
|
if cc, err := h.store.GetCustomerConfig(custID); err == nil && cc != nil && cc.Status == "blocked" {
|
||||||
|
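The backward-compatibility note (an agent that omits or empties `pbs_snapshots` is accepted unchanged) follows from how `encoding/json` treats a missing or `null` array and how `range` treats a nil slice. The sketch below illustrates that under stated assumptions: `report`, `pbsSnapshot`, and `failedVerifies` are trimmed, hypothetical stand-ins, not the hub's actual `hostReportPayload` or handler code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pbsSnapshot is a trimmed, hypothetical stand-in for the hub's hostPBSSnapshot.
type pbsSnapshot struct {
	BackupType  string `json:"backup_type"`
	BackupID    string `json:"backup_id"`
	VerifyState string `json:"verify_state"`
}

// report is a trimmed, hypothetical stand-in for hostReportPayload.
type report struct {
	PBSSnapshots []pbsSnapshot `json:"pbs_snapshots"`
}

// failedVerifies decodes a host-report body and returns the snapshots whose
// verify_state is "failed". A missing, null, or empty pbs_snapshots field
// yields no results and no error.
func failedVerifies(body []byte) ([]pbsSnapshot, error) {
	var rep report
	if err := json.Unmarshal(body, &rep); err != nil {
		return nil, err
	}
	var failed []pbsSnapshot
	for _, ps := range rep.PBSSnapshots { // ranging over a nil slice is a no-op
		if ps.VerifyState == "failed" {
			failed = append(failed, ps)
		}
	}
	return failed, nil
}

func main() {
	bodies := []string{
		`{}`,                      // older agent: field omitted
		`{"pbs_snapshots": null}`, // field present but null
		`{"pbs_snapshots": []}`,   // field present but empty
		`{"pbs_snapshots": [{"backup_type":"ct","backup_id":"9001","verify_state":"failed"}]}`,
	}
	for _, b := range bodies {
		failed, err := failedVerifies([]byte(b))
		fmt.Println(len(failed), err)
	}
}
```

Only the last body produces a failed entry; the first three decode cleanly and skip the loop, which is why no version gate is needed on the ingest path.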
|||||||
@@ -338,6 +338,32 @@ func TestHostRestoreTest_GoldenContract(t *testing.T) {
|
|||||||
assertSameStorageKeys(t, "restore_tests[0]", goldenKeys, mirrorKeys)
|
assertSameStorageKeys(t, "restore_tests[0]", goldenKeys, mirrorKeys)
|
||||||
}
|
}
|
||||||
|
|
||||||
|
func TestHostPBSSnapshot_GoldenContract(t *testing.T) {
|
||||||
|
raw, err := os.ReadFile("testdata/host-report.golden.json")
|
||||||
|
if err != nil {
|
||||||
|
t.Fatal(err)
|
||||||
|
}
|
||||||
|
var golden struct {
|
||||||
|
PBSSnapshots []json.RawMessage `json:"pbs_snapshots"`
|
||||||
|
}
|
||||||
|
if err := json.Unmarshal(raw, &golden); err != nil {
|
||||||
|
t.Fatal(err)
|
||||||
|
}
|
||||||
|
if len(golden.PBSSnapshots) == 0 {
|
||||||
|
t.Fatal("golden has no pbs_snapshots to check")
|
||||||
|
}
|
||||||
|
var goldenKeys map[string]any
|
||||||
|
if err := json.Unmarshal(golden.PBSSnapshots[0], &goldenKeys); err != nil {
t.Fatalf("golden pbs snapshot is not a JSON object: %v", err)
}
|
||||||
|
var mirror hostPBSSnapshot
|
||||||
|
if err := json.Unmarshal(golden.PBSSnapshots[0], &mirror); err != nil {
|
||||||
|
t.Fatalf("golden pbs snapshot does not parse into the mirror: %v", err)
|
||||||
|
}
|
||||||
|
b, err := json.Marshal(mirror)
if err != nil {
t.Fatalf("mirror does not marshal: %v", err)
}
|
||||||
|
var mirrorKeys map[string]any
|
||||||
|
if err := json.Unmarshal(b, &mirrorKeys); err != nil {
t.Fatalf("re-encoded mirror is not a JSON object: %v", err)
}
|
||||||
|
assertSameStorageKeys(t, "pbs_snapshots[0]", goldenKeys, mirrorKeys)
|
||||||
|
}
|
||||||
|
|
||||||
func assertSameStorageKeys(t *testing.T, where string, a, b any) {
|
func assertSameStorageKeys(t *testing.T, where string, a, b any) {
|
||||||
t.Helper()
|
t.Helper()
|
||||||
ka, kb := sortedKeys(a), sortedKeys(b)
|
ka, kb := sortedKeys(a), sortedKeys(b)
|
||||||
|
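The key-set test above works by round-tripping the golden entry through the mirror struct and comparing top-level keys. One consequence worth noting: `verify_upid` is tagged `omitempty`, so the comparison only holds when the golden entry carries a non-empty `verify_upid` (as the updated golden does). The following is a minimal sketch of that behaviour, assuming a trimmed, hypothetical `snapshot` struct and hypothetical helpers `keysOf` and `roundTripKeys`; it is not the hub's test code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// snapshot is a trimmed, hypothetical mirror struct; only verify_upid is omitempty.
type snapshot struct {
	BackupID    string `json:"backup_id"`
	VerifyState string `json:"verify_state"`
	VerifyUPID  string `json:"verify_upid,omitempty"`
}

// keysOf returns the sorted top-level keys of a JSON object.
func keysOf(raw []byte) ([]string, error) {
	var m map[string]json.RawMessage
	if err := json.Unmarshal(raw, &m); err != nil {
		return nil, err
	}
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys, nil
}

// roundTripKeys decodes raw into the mirror struct, re-encodes it, and returns
// the key set of the original object and of the re-encoded one.
func roundTripKeys(raw []byte) (golden, mirror []string, err error) {
	if golden, err = keysOf(raw); err != nil {
		return nil, nil, err
	}
	var s snapshot
	if err = json.Unmarshal(raw, &s); err != nil {
		return nil, nil, err
	}
	b, err := json.Marshal(s)
	if err != nil {
		return nil, nil, err
	}
	mirror, err = keysOf(b)
	return golden, mirror, err
}

func main() {
	populated := []byte(`{"backup_id":"9001","verify_state":"ok","verify_upid":"UPID:example"}`)
	emptyUPID := []byte(`{"backup_id":"9001","verify_state":"ok","verify_upid":""}`)
	for _, raw := range [][]byte{populated, emptyUPID} {
		g, m, err := roundTripKeys(raw)
		fmt.Println(g, m, err)
	}
}
```

With the populated entry both key sets match; with an empty `verify_upid` the re-encoded object drops that key, so a golden fixture with an empty value would register as drift even though the struct is correct.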
|||||||
+14
-1
@@ -111,7 +111,20 @@
|
|||||||
"duration_seconds": 38.2
|
"duration_seconds": 38.2
|
||||||
}
|
}
|
||||||
],
|
],
|
||||||
"pbs_snapshots": [],
|
"pbs_snapshots": [
|
||||||
|
{
|
||||||
|
"namespace": "root",
|
||||||
|
"backup_type": "ct",
|
||||||
|
"backup_id": "9001",
|
||||||
|
"backup_time": "2026-06-09T14:18:33Z",
|
||||||
|
"size_bytes": 2518889256,
|
||||||
|
"owner": "felhom@pbs!n100",
|
||||||
|
"protected": false,
|
||||||
|
"encrypted": true,
|
||||||
|
"verify_state": "ok",
|
||||||
|
"verify_upid": "UPID:dooplex:00034582:5269BDD7:00000005:6A282176:verify:felhom-spike:felhom@pbs!n100:"
|
||||||
|
}
|
||||||
|
],
|
||||||
"cloudflared": { "status": "active" },
|
"cloudflared": { "status": "active" },
|
||||||
"audit_tail": []
|
"audit_tail": []
|
||||||
}
|
}
|
||||||
|
|||||||