Add DELETE /hosts/{id}/jobs/{job_id} (per-host self-scoped, idempotent) so the
agent clears a job after executing or terminally rejecting it. The hub stores
the operator-signed blobs opaquely (no signing key — cannot forge or open);
the agent verifies + executes. Doc 03 §4/§6/§9 updated (operator-signed path
live; 8C wipe completes; 10B done).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Felhom Hub — Changelog
v0.10.0 — slice 10B: signed-op job completion (clear-job) (2026-06-10)
The hub half of slice 10B is small by design — the hub stores + serves the operator-signed blobs opaquely (it holds no signing key and can neither forge nor open them; the agent verifies + executes). 10B adds the missing completion path so a processed job leaves the queue.
Added
- DELETE /api/v1/hosts/{host_id}/jobs/{job_id} (per-host key, self-scoped; the global key may clear any): the agent calls it after executing or terminally rejecting a job. Idempotent (clearing an absent job is a clean 200). Store: DeleteSignedJob.
Unchanged (already in 10A, reused by 10B)
- POST /admin/hosts/{id}/jobs (operator enqueues the signed blob), GET /hosts/{id}/jobs (the agent fetches), and the has_signed_ops envelope flag. The signed blob stays opaque on the wire (a base64 {op_blob_b64, sig_armored} envelope the agent parses), so there is no jobs-wire golden change.
Tests
- DELETE …/jobs/{id} is self-scoped (host A cannot clear host B's job → 403) and idempotent.
v0.9.0 — slice 10A: desired-state serving + signed-jobs queue (the "Down" channel) (2026-06-10)
The hub half of slice 10A: the hub now serves operator intent down to already-authenticated
hosts. The control envelope (the host-report response) stops returning placeholder
desired_generation:0 / has_signed_ops:false and carries the host's real generation + a
signed-jobs flag — the cheap change-notification the agent (v0.15.0) acts on. The heavy
desired-state moves only on a dedicated, self-scoped fetch.
Added
- PUT /api/v1/admin/hosts/{host_id}/desired-state (global/operator key only): sets a host's desired-state and atomically bumps desired_generation. The body is JSON the hub stores + serves opaquely (it validates only that it is well-formed JSON; the agent/CLI owns the schema). Unknown host → 404; malformed JSON → 400. Minimal admin path; rich editing UX is later.
- GET /api/v1/hosts/{host_id}/desired-state (per-host key, self-scoped: a host reads only its own; the global key may read any): returns {generation, desired_state}. The agent fetches it when the envelope's generation advances past its cache.
- GET /api/v1/hosts/{host_id}/jobs (per-host key, self-scoped): serves the host's pending opaque signed-op blobs (oldest first). The hub never forges, opens, or executes them (verify + run is slice 10B; this only serves the queue).
- POST /api/v1/admin/hosts/{host_id}/jobs (global key only): enqueues a pre-signed opaque job blob. The minimal operator path to seed the queue; the hub holds no signing key.
- Store: a new signed_jobs table (per-host opaque blob queue); SetHostDesired (set + bump generation, atomic), EnqueueSignedJob / GetSignedJobs / CountSignedJobs. The hosts table's previously-inert desired_json / desired_generation columns are now live.
Changed
- The host-report control envelope now reports the host's actual desired_generation and has_signed_ops (queue non-empty), both degrading safely to their old defaults on a store error (a heartbeat never fails on the control channel). poll_interval_seconds / blocked unchanged.
Tests
- admin-set bumps the generation each write + the served state reflects the latest body; admin-set is global-key-only (per-host → 403, malformed → 400, unknown host → 404).
- GET /desired-state is self-scoped (host A's key → host B → 403; global → any; no token → 401).
- the envelope carries the current generation + has_signed_ops flips on enqueue; GET /jobs is self-scoped + serves the blobs oldest-first; admin enqueue is global-key-only.
- cross-repo golden round-trip: testdata/desired-state.golden.json set → fetched back unchanged (the opaque pass-through), byte-identical with felhom-agent's copy.
(no version bump) — slice 9 cross-repo wire-contract: host.cpu_temp_c (2026-06-10)
Slice 9 adds a nullable cpu_temp_c field to the shared HostMetrics wire struct (the agent's
new CPU/chassis-temperature collector). The agent's host-report carries it too, so the hub's
cross-repo host-report golden (internal/api/testdata/host-report.golden.json) was updated to
stay byte-identical with felhom-agent/internal/hub/testdata/host-report.golden.json (the
duplicated-contract discipline; manual diff confirmed identical). No hub code change — the full
report_json already persists the field verbatim, and the hub does not surface CPU temp on the
operator dashboard yet (an optional later freebie). The golden-contract test (host_test.go) still
passes (the host parse-struct ignores the extra key).
v0.8.0 — opaque PBS recovery-code escrow storage (slice 7, doc 03 §8a) (2026-06-10)
Hub half of slice-7 close-out: store the agent's opaque R-wrapped PBS-key escrow blob. The
default posture is zero-knowledge — the hub holds ciphertext it cannot open (it has no recovery
code; there is no decrypt path). Pairs with felhom-agent v0.9.0 (escrow creation). Consumption /
restore-mode serving is slice 10.
Added
- PUT /api/v1/hosts/{host_id}/escrow: authed with the per-host key (a host may only write its own escrow; the global operator key is also accepted). Body mirrors the agent's emit struct (blob_b64, key_fingerprint, posture, created_at). Stores the decoded opaque bytes verbatim; rotation is last-write-wins. No serving this slice.
- host_escrow table (host_id PK, blob BLOB, fingerprint/posture/created_at). Store methods SaveHostEscrow / GetHostEscrow (HostEscrow). The hub never transforms or decrypts the blob.
Tests
- Stores the opaque blob verbatim (round-trips byte-identical); rotation last-write-wins; rejects an absent/wrong key (401) and a host writing another host's escrow (403); bad/empty base64 → 400; the wire-contract key-set matches the agent's emit struct.
Security note
The hub stores ciphertext only — holding the blob does NOT let Felhom read customer data (separation principle, doc 03 §8a). The per-host-key gate scopes writes to the owning host.
v0.7.5 — restore-test "passed with warnings" visibility (2026-06-09)
Hub half of TASK — Restore-test must not false-fail on benign start warnings (Phase B). The
agent (v0.7.0) now treats a guest-start advisory like the systemd-nesting warning as a PASS
(verdict is liveness, not the start exitstatus) and carries the warning text on the wire. This
makes that visible to the operator instead of indistinguishable from a clean pass.
Added
- hostRestoreTest.warnings ([]string) + warnings_recognized (bool) mirror fields, matching the agent's hub.RestoreTest wire contract (omitempty; an absent warnings_recognized ⇒ false ⇒ treated as the louder unrecognized case, so a missing flag can only over-notice).
Changed
- Host-report ingest now surfaces a passed restore-test that carried warnings: [INFO] restore-test passed WITH WARNINGS (recognized) when every warning is the known-benign anchor, escalated to [WARN] … UNRECOGNIZED WARNINGS otherwise (as loud as a failed PBS verify, so a real restore warning can't hide behind a green pass). A FAILED restore-test still logs the existing [WARN] … FAILED.
Tests / contract
- restore_tests[0] in the host-report golden gains warnings + warnings_recognized; the golden stays byte-identical with felhom-agent's copy (sha256-verified) and the bidirectional key-set contract test now round-trips the new keys through hostRestoreTest.
Not in this slice
- No dashboard widget: the hub web layer renders only controller-report data — there is no host-domain dashboard surface yet (guests/storage/restore_tests/pbs_snapshots are log+persist only, same as the failed-PBS-verify signal). Distinct dashboard treatment lands when the host-domain dashboard does (slice 10). The operator signal this slice is the log line.
v0.7.4 — ingest agent pbs_snapshots (slice 6 Phase B) (2026-06-09)
The agent's slice-6 Phase B work populates the host-report's pbs_snapshots (the PBS offsite
inventory + per-snapshot verify-state). This is the hub half: accept + persist them. Minimal —
the rich offsite policy is hub-owned (slice 10); this mirrors what the agent reports.
Added
- hostPBSSnapshot mirror struct in hostReportPayload (internal/api/handler.go): field-for-field with the agent's hub.PBSSnapshot wire contract (namespace/backup_type/backup_id/backup_time/size_bytes/owner/protected/encrypted/verify_state/verify_upid). Persisted via report_json (no new columns, the slice-5/6A precedent).
- A FAILED PBS verify is logged prominently ([WARN], the loudest offsite-DR signal, same treatment as a failed restore-test). The host-report info line now counts pbs-snapshots.
- testdata/host-report.golden.json updated with a populated pbs_snapshots[0], kept byte-identical with felhom-agent's copy.
- TestHostPBSSnapshot_GoldenContract: the hub half of the bidirectional key-set test.
Notes
- Backward-compatible: an agent that omits/empties pbs_snapshots is accepted unchanged.
v0.7.3 — ingest agent backups + restore_tests (slice 6 Phase A) (2026-06-09)
The agent's slice-6 work populates the host-report's backups + restore_tests (the
self-restore-test result). This is the hub half: accept + persist them. Minimal — the rich
backup policy (schedule/retention/target selection) is hub-manifest-owned and lands at
slice 10; this slice only mirrors what the agent reports.
Added
- hostBackup / hostRestoreTest mirror structs in hostReportPayload (internal/api/handler.go): field-for-field with the agent's hub.Backup / hub.RestoreTest wire contract. Persisted verbatim in report_json (no new columns, the slice-5 precedent).
- A FAILED restore-test is logged prominently ([WARN], the loudest DR signal there is); a failed backup is logged too. The host-report info line now counts backups + restore-tests.
- testdata/host-report.golden.json updated with a populated backups[0] / restore_tests[0], kept byte-identical with felhom-agent's copy.
- TestHostBackup_GoldenContract / TestHostRestoreTest_GoldenContract: the hub half of the bidirectional key-set test (round-trip the golden through the mirror, assert exact keys).
Notes
- Backward-compatible: an agent that omits/empties these is accepted unchanged. The legacy controller report path is untouched (frozen until slice 10).
v0.7.2 — ingest agent storage_targets (slice 5 Phase A) (2026-06-09)
The agent's slice-5 work populates the host-report's storage_targets (previously empty).
This is the hub half: accept + persist them. Minimal by design — the rich, authoritative
storage manifest (desired class/role/policy/creds) is hub-owned and lands at slice 10; this
slice only mirrors what the agent observes.
Added
- hostReportPayload.StorageTargets (internal/api/handler.go): a full mirror of the agent's hub.StorageTarget wire contract (name/type/durable_id/state/reachable/usage/content/mount/class_hint/role/thin_pool/smart). The targets are persisted verbatim in the existing report_json row (no schema change); the handler counts them and logs a [WARN] when any are disconnected (the storage analog of host-down visibility).
- testdata/host-report.golden.json: updated to carry two populated storage_targets (an lvmthin with thin_pool, a usb), kept byte-identical with felhom-agent's copy.
- TestHostStorageTarget_GoldenContract: the hub half of the bidirectional key-set test. It round-trips the golden's storage_targets[0] through the mirror struct and asserts the key set matches exactly (no missing/extra fields vs the agent). TestHostReport_GoldenContract also now asserts the targets are persisted + parse back.
Notes
- Backward-compatible: an older agent that sends storage_targets: [] (or omits it) is accepted unchanged. The legacy controller report path is untouched (frozen until slice 10).
Repo docs — no hub version change (2026-06-08)
Changed
- Reflowed felhom.eu/CLAUDE.md: removed hard mid-paragraph line wraps (prose, list items, blockquotes now single-line); tables untouched; rendered output unchanged.
- Unified the REPORT/CHANGELOG convention: this repo's REPORT.md switches from append/cumulative to overwrite-latest (uniform with the sibling repos); CHANGELOG.md (this file) stays the cumulative log, newest on top. Updated REPORT.md's header note accordingly (existing sections retained as history). Added an explicit no-secrets rule. No hub code change → no version bump.
v0.7.1 (2026-06-08)
Changed
- /host-report rejects oversize bodies explicitly with 413 (handler.go) instead of silently truncating at the 4 MiB LimitReader cap. Reads one byte past maxHostReportBytes and returns 413 Payload too large, because a truncated-but-valid JSON could otherwise be accepted as a partial report (silently dropping guests from the mirror). The controller handleReport 1 MiB path is unchanged (frozen until slice-10 cutover).
Added
- Cross-repo contract fixture hub/internal/api/testdata/host-report.golden.json (byte-identical with felhom-agent's copy) + TestHostReport_GoldenContract: POSTs the golden through the real handleHostReport and asserts 200 + denorm (guest_total / guest_running / cloudflared_status) + both guests upserted, proving hostReportPayload still extracts the contract from the real shape. Duplicated contract (no shared types module yet); revisit at slices 5/6.
v0.7.0 (2026-06-08)
Added — host-domain ingest (slice 3, additive; controller path untouched)
- New tables hosts, guests, host_reports (store.go migrate(), idempotent). Full schema now, including columns inert until slice 10 (hosts.desired_json / desired_generation / dr_record_json, guests.api_key / desired_spec_json) so the cutover needs no ALTER. Nothing reads/writes the inert columns this slice.
- POST /api/v1/host-report: the agent's heartbeat. Per-host Bearer auth; 4 MiB body; persists the full report + denormalized fields (cpu/mem/disk %, guest counts, cloudflared status); upserts each guest's reality columns (guest_id = "<host_id>/<vmid>", hub-derived); returns the control envelope {status, poll_interval_seconds:900, blocked, desired_generation:0, has_signed_ops:false} (blocked reflects the customer's status; the latter two are reserved/placeholder for slice 4).
- Per-host key auth: checkAuthHost (Bearer → host → customer), added alongside the unchanged checkAuthCustomer. Global key remains a bootstrap fallback.
- POST /api/v1/admin/hosts: PROVISIONAL global-key-only host mint (host_id + per-host api_key); the slice-3 bootstrap until enrollment (slices 7–8) replaces it.
- Host dead-man's-switch: monitor.HostStalenessChecker over host_reports, emitting host_stale / host_down / host_recovered (30m/60m), attributed to the host's customer; registered in allowedEventTypes; wired in cmd/hub/main.go on the existing 60s ticker. A deliberate sibling of the controller StalenessChecker (both run until slice 10).
- Store methods: GetHostByAPIKey, GetHost, ListHosts, UpsertHost, SaveHostReport, UpsertGuestFromReport (preserves inert columns on conflict), GetHostStaleness (skips never-reported hosts), GuestID. Prune now also prunes host_reports (same retention).
- Tests (new, hermetic): store, auth (checkAuthHost), ingest (valid+envelope+denorm, host_id mismatch→403, unknown-host-under-global→400, blocked→true, oversize→400), admin mint (non-global→403, unknown customer→400, mint+round-trip), host staleness transitions.
Unchanged (explicit)
- The controller path (/api/v1/report, reports, customer_configs, checkAuthCustomer, the existing staleness/deadline checkers) is untouched and still green. The old controller and the new agent report in parallel during slices 3–9; the schema/auth cutover is slice 10.
v0.6.2 (2026-02-26)
Added
- Infra backup GFS retention: New infra_backup_versions table stores multiple backups per customer. GFS pruning keeps: all from last 24h, latest per day (7 days), latest per week (4 weeks), latest per month (3 months), which is ~14 versions max per customer.
- GET /api/v1/infra-backup/{id}/versions: Returns metadata list of all retained backup versions (date, stack names, disk count) for a customer. Bearer auth.
- Recovery version selection: GET /api/v1/recovery/{id}?version=ID fetches a specific backup version instead of latest. Response now includes backup_versions array with all available versions.
- Dashboard backup history: Customer detail page "Infra Backup" card shows version count and collapsible history table (date, apps, disks).
Changed
- SaveInfraBackup(): Now INSERTs a new row instead of upserting, preserving history. Automatically prunes old versions via GFS algorithm.
- One-time migration: Existing data from infra_backups table is copied to infra_backup_versions on first startup.
v0.6.1 (2026-02-25)
Added
- Delete issues from app detail page — Known Issues table now has per-row checkboxes with "Delete Selected" and "Delete All Issues" buttons; keeps telemetry data (memory trends, etc.) intact
- DELETE /apps/{appName}/delete-issues: New POST endpoint supporting action=selected (with issue_ids form values) and action=all
Fixed
- Hub-side fingerprint hardening: fingerprintIssue() now strips ANSI escape codes, ISO/syslog timestamps, and lowercases before truncating to 100 chars. Prevents duplicate issue rows when messages differ only by embedded timestamps.
v0.6.0 (2026-02-25)
Added
- Geo-restriction display (customer_unified.html): New "Geo-korlátozás" (Geo-restriction) section on customer detail pages showing: enabled/disabled status, allowed countries, per-app overrides, last sync time, and sync errors. Only visible when the controller reports geo_restriction data.
- "Összes geo-korlátozás eltávolítása" (Remove all geo-restrictions) button: One-click removal of all [felhom-geo] Cloudflare WAF rules. The Hub calls the Cloudflare API directly (bypasses potentially blocked tunnel), then retries notifying the controller in background (every 30s for up to 10 min) to disable geo in its settings.
- Cloudflare unblock client (internal/cloudflare/unblock.go): Minimal Cloudflare API client for deleting geo-restriction WAF rules. Resolves zone ID, finds the http_request_firewall_custom ruleset, and deletes rules with [felhom-geo] description prefix.
- POST /customers/{id}/geo/disable route: CSRF-protected endpoint for the geo-disable action.
Removed
- Legacy Monitoring UUIDs: Removed the "Monitoring UUIDs" section from the config form (config_form.html), UUID form-field handling from buildConfigJSON(), UUID import from handlePullConfig(), volatile key entries for monitoring.ping_uuids.*, and the commented-out ping_uuids section from controller.yaml.default. Monitoring is fully handled by the Hub event system since v0.3.0.
v0.5.0 (2026-02-25)
Added
- Configuration page (GET /configuration): New "Configuration" tab in the web UI with asset management controls. Displays asset file count, manifest generation timestamp, and a "Refresh Assets from Image" button.
- Manual asset re-seed (POST /configuration, action=refresh_assets): Re-reads the baked-in seed directory, compares SHA-256 checksums with PVC assets, and updates changed files. Rebuilds the manifest afterward. Controllers pick up changes on their next daily sync.
- ReSeed() method (internal/assets/assets.go): Public method for triggering asset re-seed + manifest rebuild from the web UI.
Changed
- Asset seeding: seedIfEmpty() → seedOrUpdate() (internal/assets/assets.go): On startup the Hub now compares SHA-256 checksums between the image seed directory and the PVC, updating any changed files instead of only seeding into an empty directory. This means redeploying the Hub image with updated assets automatically propagates them without PVC deletion.
- isAssetFile() expanded: Now also matches *-favicon.svg and *-favicon.ico patterns, allowing branding assets like felhom-favicon.svg in the manifest.
- RebuildManifest() refactored: Internal logic extracted to rebuildManifestLocked() for reuse by ReSeed().
- Web Server struct: Added assetsMgr field and SetAssetManager() method. Wired in main.go.
- All templates translated to English: The "Alkalmazások" (Applications) nav link and telemetry pages (apps.html, app_detail.html, customer_unified.html telemetry section) are now in English, consistent with the rest of the Hub UI.
- Navigation updated — All templates now show four tabs: Dashboard, Customers, Apps, Configuration.
v0.4.1 (2026-02-23)
Added
- Per-app telemetry reset (store/telemetry.go, web/apps.go): New "Telemetria törlése" (Delete telemetry) button on the app detail page that deletes all telemetry records and known issues for the selected app. Useful after major app updates when old data is no longer representative. Includes confirmation dialog and flash notification.
- DeleteAppTelemetry() and DeleteAppIssues() store methods (store/telemetry.go): Delete all telemetry/issue rows for a specific app_name.
- POST /apps/{name}/reset-telemetry route (web/server.go): CSRF-protected endpoint that triggers the reset and redirects back with flash message.
v0.4.0 (2026-02-23)
App Telemetry & Analytics Dashboard
Added
- app_telemetry and app_log_issues SQLite tables (store/store.go): store per-app resource metrics and deduplicated log issues reported by v0.28.0+ controllers.
- internal/store/telemetry.go: New store methods: SaveAppTelemetry, GetFleetAppSummary (with P95 memory calculation), GetAppTelemetryHistory, GetAppCustomerBreakdown, GetCustomerAppSummary, GetAppIssues, GetRecentIssuesAllApps, PruneAppTelemetry, PruneStaleIssues. New types: AppTelemetryRecord, FleetAppSummary, AppTelemetryPoint, AppCustomerStats, CustomerAppSummary, AppIssue.
- /api/v1/report handler update (api/handler.go): After saving the standard report, parses the optional app_telemetry JSON field and persists it. Backward-compatible: old controllers (no app_telemetry key) are unaffected.
- Fleet app list page (GET /apps): Hungarian-language dashboard showing all deployed apps fleet-wide with deployment count, avg/P95 memory, catalog estimate/limit accuracy, error/warning badges. Sortable columns, 24h/7d/30d period selector.
- Per-app detail page (GET /apps/{name}): Memory trend Chart.js chart (avg + peak, with catalog limit line), per-customer breakdown table, known log issues table (severity, message, occurrence count, affected customers). Includes suggested mem_limit from P95×1.2 rounded to 32M.
- Customer detail page telemetry section (customer_unified.html): New "Alkalmazás telemetria" (Application telemetry) card with per-app memory (current/avg/peak) and log error/warning counts linking to /apps/{name}.
- Chart.js (static/chart.min.js): Embedded from controller build, served at /static/chart.min.js.
- "Alkalmazások" (Applications) nav link: Added to header navigation across all templates.
- New CSS (style.css): .badge, .badge-error, .badge-warn, .summary-cards, .summary-card, .chart-container, .period-selector, .period-btn, .accuracy-dot, .mem-ok/warn/danger, .data-table styles.
- Telemetry pruning (cmd/hub/main.go): pruneAll() now also prunes app_telemetry rows older than 90 days and stale log issues not seen in 30 days.
Changed
- internal/web/apps.go (new file): handleApps, handleAppDetail, parsePeriod, sortFleetSummary, aggregateHistoryForChart, parseLimitMB, memoryColor, accuracyClass, getCSRFToken helper functions.
- internal/web/server.go: Added routes for /apps, /apps/{name}, /static/chart.min.js. Added memoryColor, accuracyClass, gt template functions.
- internal/web/embed.go: Added //go:embed static/chart.min.js directive.
v0.3.7 (2026-02-21)
Asset management API
- New internal/assets package: manages app assets (logos, screenshots) on Hub PVC (/data/assets/) with automatic seeding from baked-in image copy on first run.
- Two new authenticated API endpoints for controllers to sync assets:
  - GET /api/v1/assets/manifest: returns JSON manifest with filenames + SHA-256 checksums
  - GET /api/v1/assets/file/{filename}: serves individual asset files
- Dockerfile updated to COPY assets/ /usr/share/felhom/assets-seed/ for first-run seeding.
- Build script syncs website assets (*-logo.{svg,png}, *-screenshot-*.webp) into Docker build context.
v0.3.6 (2026-02-21)
Human-friendly retrieval passwords
- Retrieval passwords now use Hungarian word passphrases (e.g. áldás-plazmid-palánta-süvítve-pócgém) instead of 64-char hex strings.
- Embedded 29K+ curated Hungarian word list (hungarian.txt) via go:embed; 5-word passphrases give ~74 bits of entropy.
- New configgen.RandomPassphrase(wordCount) function; all 3 retrieval password generation sites updated.
- API keys remain as hex (machine-to-machine, never typed by humans).
v0.3.5 (2026-02-21)
Recovery Endpoint & Customer Standing
- New GET /api/v1/recovery/{customer_id} endpoint: returns both generated controller.yaml and infra backup in a single response for disaster recovery. Auth via X-Retrieval-Password header (same as config retrieval).
- Report response now includes customer_blocked: true when customer status is "blocked", which allows controllers to detect standing and enter limited mode.
v0.3.4 (2026-02-20)
- Rename version labels: "Current version" → "Controller version", "Latest version" → "Registry latest".
v0.3.3 (2026-02-20)
Bugfixes
- Fix double "v" prefix in controller version display (showed "vv0.21.1" instead of "v0.21.1").
- Skip deprecated monitoring.ping_uuids.* keys in config diff comparison (added to volatile keys).
v0.3.2 (2026-02-20)
Hub Version Display
- Show Hub version in footer of all pages via hubVersion template function.
- web.New() now accepts version parameter (4th arg), set via ldflags at build time.
v0.3.1 (2026-02-20)
Config Diff Display + Pull Config
- Value-based config comparison: Replaced broken SHA256 hash comparison with semantic YAML comparison. Both configs are parsed into maps, flattened to dot-notation keys, and compared by value. Ignores key ordering, whitespace, comments, and volatile fields (web.session_secret). Shows actual diff count on customer page ("⚠ Config mismatch — N differences").
- Config diff endpoint (GET /customers/{id}/config-diff): Fetches live YAML from controller via new GET /api/config endpoint, generates Hub YAML via configgen.Generate(), returns JSON with per-key diffs (key, hub value, controller value, status). Sensitive values (tokens, passwords, secrets) are masked.
- Pull Config (POST /customers/{id}/pull-config): Reverse of Push Config. Imports controller's current config into the Hub. Extracts identity fields (name, domain, email) and override fields (infrastructure tokens, git credentials, monitoring UUIDs). Preserves existing APIKey and RetrievalPassword.
- Diff display UI: "Show Diff" button on customer page expands a table showing all key-value differences with color-coded rows (yellow=changed, blue=hub-only, orange=controller-only).
- Pull Config button: Added next to existing "Push Config" with confirmation dialog.
v0.3.0 (2026-02-20)
Hub Monitoring Takeover — Event System, Dead Man's Switch, Notifications
Replaces external Healthchecks.io with a Hub-native event system. The Hub becomes the single source of truth for all customer monitoring, event tracking, dead man's switch alerting, and notification delivery.
Phase 1 — Event System
- events table in SQLite: stores all events with customer_id, event_type, severity, message, details_json, source, timestamp
- Indexes: idx_events_customer_created (customer + time DESC), idx_events_type (type + time DESC)
- Store methods: SaveEvent, GetRecentEvents, GetEventsByType, GetLatestEventByType, GetAllRecentEvents, CountEventsBySeverity, PruneEvents, GetActiveCustomerIDs
- POST /api/v1/event endpoint: accepts structured events from controllers, validates event_type against 27 allowed types, validates severity (info/warning/error), stores in DB
- Enhanced auth: checkAuthCustomer() validates per-customer API keys match the customer_id in payload; global key bypasses ownership check
- Prune: events pruned alongside reports at 04:30 Budapest time
Phase 2 — Dead Man's Switch
- Staleness checker (internal/monitor/staleness.go): runs every 60s, detects when controllers stop reporting
  - ok→stale (>30min): inserts node_stale warning event
  - any→down (>60min): inserts node_down error event
  - stale/down→ok: inserts node_recovered info event
  - Skips blocked customers, no false alerts on startup
- Backup deadline checker (internal/monitor/deadline.go): runs daily at 05:00 Budapest
  - Detects missing backup_completed events since midnight → inserts expected_backup_missed error
  - Detects missing db_dump_completed events → inserts expected_dbdump_missed error
  - Grace: skips customers with node_down state
- scheduleDaily() helper: goroutine that sleeps until target time (Europe/Budapest), runs function, loops
- /healthz enhanced: returns 503 if SQLite Ping fails
Phase 3 — Notification System
- Dispatcher (internal/notify/dispatcher.go): processes events and sends emails via Resend API
  - Operator channel: English emails to operator for warning/error events, 1h cooldown per customer:eventType
  - Customer channel: Hungarian emails per event_type, respects customer preferences (enabled_events, cooldown_hours), blocked customers skipped
  - Test bypass: test event type skips cooldown/preferences, sends directly to customer email
- Email templates (internal/notify/templates.go): operator (concise English), customer (Hungarian per event type with complete message table)
- Cooldown tracking: in-memory maps with per-customer:eventType granularity
- customer_notifications table: added cooldown_hours column (default 6)
- notification_log table: added channel column (operator/customer)
- Wired into /api/v1/event handler and staleness/deadline checkers
Phase 4 — Hub UI
- Events section on customer detail page: last 50 events, severity filter buttons (All/Errors/Warnings/Info), colored severity badges
- Dashboard badges: error+warning count in last 24h per customer, clickable to customer events
- Notification log: shows channel column (operator/customer) in customer detail page
- Config form: Monitoring UUIDs section marked as "Legacy" with deprecation notice, collapsed by default
Phase 6 — Config Cleanup
- controller.yaml.default: monitoring.ping_uuids section commented out (deprecated)
- buildConfigJSON: only writes ping_uuids to config JSON if user explicitly provides UUID values (new configs get none)
v0.2.2 (2026-02-20)
Config Hash Comparison
- Config sync status on unified customer page: compares SHA256 hash of controller's controller.yaml (from report payload) against Hub-generated YAML. Shows "In sync", "Config mismatch", or "Unknown" (controller needs v0.20.0+ to report hash).
- Visible in the Controller Update section next to Push Config button.
v0.2.1 (2026-02-20)
Unified Customer Management
All customer views consolidated into a single page. New management features: blocked status, dashboard merge, config push, and auto-config creation.
New features
- Unified customer page (/customers/{id}):
  - Single page showing both configuration info and live report data
  - Replaces separate /configs/{id} (config detail) and /customers/{id} (report detail) pages
  - Shows config management (credentials, setup commands, YAML preview) when config exists
  - Shows "Create Config" button for manual (report-only) customers
  - Old /configs/{id} URLs redirect to /customers/{id}
- Dashboard shows pending customers:
  - Customers with config but no reports appear on dashboard with "PENDING" status
  - All metric columns show "—" for pending customers
- Blocked/Banned status:
  - Customers can be blocked via button on detail page
  - Blocked customers hidden from Dashboard
  - Reports still accepted (prevents controller retry loops) but notifications suppressed
  - "BLOCKED" badge shown on Customers list and detail page
  - One-click unblock button
- Config push to controller:
  - "Push Config" button on unified page (visible when controller URL known)
  - Generates YAML and POSTs to {controller_url}/api/config/apply
  - Note: requires controller v0.20.0+ with config apply endpoint
- Auto-create config from report data:
  - "Create Config" button on manual customer pages
  - Pre-fills customer name from report, generates credentials
  - Redirects to edit form for additional fields
Changes
- Customers list: all rows now link to /customers/{id} (unified page)
- Config badges: new MANAGED/MANUAL/BLOCKED pill-style badges
- customer_configs table: added status column (active/blocked)
- Status functions handle "pending" and "blocked" status values
v0.2.0 (2026-02-20)
Customer Configuration Management
New "Configurations" section for pre-provisioning customer nodes. Operators can configure
customer settings in the Hub web UI, then docker-setup.sh downloads a ready-made
controller.yaml — reducing deployment to a customer ID and password.
New features
- Web UI (/configs pages):
  - List all customer configurations in a table
  - Create new configuration: customer identity, infrastructure secrets (CF tunnel/API tokens), git sync credentials, monitoring UUIDs, organized in collapsible sections
  - Detail page: shows credentials (retrieval password, per-customer API key) with copy-to-clipboard, setup commands (docker-setup.sh and curl), live YAML preview
  - Edit and delete configurations
  - Navigation tabs (Dashboard / Configurations) on all pages
- Config retrieval API (GET /api/v1/config/{customer_id}):
  - Authenticated via X-Retrieval-Password header (separate from Bearer token)
  - Generates complete controller.yaml by deep-merging template with customer overrides
  - Template sourced from controller.yaml.example (fetched from Gitea repo periodically)
  - Falls back to embedded default template if fetcher not configured
- Per-customer API keys:
  - Each customer config gets its own API key (auto-generated, 64 hex chars)
  - Controllers can authenticate with per-customer key instead of the shared global key
  - Backward compatible: global report_api_key continues to work alongside per-customer keys
- YAML generation (internal/configgen package):
  - Deep-merge of template + customer-specific overrides
  - Programmatic injection: customer identity, hub config, session secret
  - Shared by both API handler and web UI preview
- Template fetcher (background goroutine):
  - Periodically fetches controller.yaml.example from Gitea (configurable interval)
  - Requires registry.username + registry.token in hub.yaml
  - Falls back to go:embed default template when not configured
- Data layer:
  - New customer_configs SQLite table
  - 6 CRUD methods: Save, Get, List, Delete, GetByAPIKey, UpdateRetrievalPassword
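The deep-merge of template and customer overrides mentioned above can be illustrated with a small Go sketch. It is not the internal/configgen implementation: it assumes both documents have already been decoded into map[string]any, and the function name deepMerge and the sample keys are hypothetical.

```go
package main

import "fmt"

// deepMerge returns a new top-level map with override applied on top of base.
// When both sides hold a map under the same key, the maps are merged key by
// key; any other override value replaces the base value outright.
// Nested maps that override does not touch are shared with base, not copied.
func deepMerge(base, override map[string]any) map[string]any {
	out := make(map[string]any, len(base))
	for k, v := range base {
		out[k] = v
	}
	for k, v := range override {
		baseMap, baseIsMap := out[k].(map[string]any)
		overMap, overIsMap := v.(map[string]any)
		if baseIsMap && overIsMap {
			out[k] = deepMerge(baseMap, overMap)
			continue
		}
		out[k] = v
	}
	return out
}

func main() {
	// Hypothetical template and override content, for illustration only.
	template := map[string]any{
		"hub":      map[string]any{"url": "https://hub.example", "interval": "5m"},
		"customer": map[string]any{"id": ""},
	}
	overrides := map[string]any{
		"hub":      map[string]any{"interval": "1m"},
		"customer": map[string]any{"id": "acme"},
	}

	fmt.Println(deepMerge(template, overrides))
}
```

Keys present only in the template survive the merge, which is what lets a customer override stay small: it lists just the values that differ.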
Configuration
New registry section in hub.yaml:
registry:
  image: "gitea.dooplex.hu/admin/felhom-controller"
  username: ""              # Gitea credentials (for version checker + template fetcher)
  token: ""
  check_interval: "6h"
  template_interval: "1h"   # How often to refresh controller.yaml.example
Files added
- internal/configgen/configgen.go: shared YAML generation package
- internal/web/configs.go: web handlers for config CRUD
- internal/web/templatefetcher.go: background template refresh
- internal/web/controller.yaml.default: embedded fallback template
- internal/web/templates/configs.html: config list page
- internal/web/templates/config_form.html: create/edit form
- internal/web/templates/config_detail.html: detail + credentials page
Files modified
- internal/store/store.go: customer_configs table + CRUD methods
- internal/api/handler.go: config retrieval endpoint, per-customer auth, ConfigTemplateProvider interface
- internal/web/server.go: /configs/* routes, SetTemplateFetcher()
- internal/web/embed.go: embedded default template
- internal/web/templates/dashboard.html: navigation bar
- internal/web/templates/customer.html: navigation bar
- internal/web/templates/style.css: form, nav, button, credential styles
- cmd/hub/main.go: template fetcher wiring, TemplateInterval config
- configs/hub.yaml.example: registry section
v0.1.8 (2026-02-16)
- Controller update trigger: "Update" button on customer detail page calls controller's self-update endpoint
- Registry version checker: background goroutine checks Gitea registry for latest controller image tag
- Update available indicator on customer detail page
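The "update available" decision reduces to comparing the running controller version with the latest registry tag. The Go sketch below assumes tags of the form vMAJOR.MINOR.PATCH, as the version numbers in this changelog suggest; the function names are hypothetical and the real checker may use different rules.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion turns a tag such as "v0.20.0" into its three numeric parts.
// Tags that do not match MAJOR.MINOR.PATCH are rejected.
func parseVersion(tag string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(strings.TrimPrefix(tag, "v"), ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("unexpected tag %q", tag)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("unexpected tag %q: %w", tag, err)
		}
		out[i] = n
	}
	return out, nil
}

// updateAvailable reports whether latest is strictly newer than running.
// If either tag cannot be parsed, it reports false instead of guessing.
func updateAvailable(running, latest string) bool {
	r, errRunning := parseVersion(running)
	l, errLatest := parseVersion(latest)
	if errRunning != nil || errLatest != nil {
		return false
	}
	for i := range r {
		if l[i] != r[i] {
			return l[i] > r[i]
		}
	}
	return false
}

func main() {
	fmt.Println(updateAvailable("v0.19.0", "v0.20.0"))
	fmt.Println(updateAvailable("v0.20.0", "v0.20.0"))
	fmt.Println(updateAvailable("v0.20.0", "v0.9.0"))
}
```

Comparing the parts numerically matters here: a plain string comparison would rank v0.9.0 above v0.20.0.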
v0.1.7 (2026-02-15)
- Infrastructure backup endpoints for disaster recovery (POST + GET /api/v1/infra-backup)
v0.1.6 (2026-02-14)
- Handle disabled reporting status
- Storage labels display
- Date in history table
v0.1.5 (2026-02-13)
- Notification preferences sync endpoint (POST /api/v1/preferences)
- Notification display on customer detail page
v0.1.4 (2026-02-12)
- Resend API key support for email notifications
- Notification endpoint (POST /api/v1/notify)
v0.1.3 (2026-02-11)
- Customer detail page: system info, storage bars, container table
- 24h history graphs
v0.1.2 (2026-02-10)
- Dashboard auto-refresh (60s cycle)
- Status logic (green/yellow/red based on report age + health)
v0.1.1 (2026-02-09)
- Basic dashboard with customer overview table
- Report ingest API
v0.1.0 (2026-02-08)
- Initial release: SQLite store, report API, basic web dashboard