felhom.eu/hub/CHANGELOG.md
admin f1212e6ba8 feat: infra backup GFS retention + version history
New infra_backup_versions table with GFS pruning (~14 versions per
customer). Recovery endpoint supports ?version=ID. New /versions API.
Dashboard shows collapsible backup history with app names and disk count.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 14:47:48 +01:00

Felhom Hub — Changelog

v0.6.2 (2026-02-26)

Added

  • Infra backup GFS retention — New infra_backup_versions table stores multiple backups per customer. GFS (grandfather-father-son) pruning keeps: all from last 24h, latest per day (7 days), latest per week (4 weeks), latest per month (3 months) — ~14 versions max per customer
  • GET /api/v1/infra-backup/{id}/versions — Returns metadata list of all retained backup versions (date, stack names, disk count) for a customer. Bearer auth.
  • Recovery version selection — GET /api/v1/recovery/{id}?version=ID fetches a specific backup version instead of latest. Response now includes backup_versions array with all available versions.
  • Dashboard backup history — Customer detail page "Infra Backup" card shows version count and collapsible history table (date, apps, disks)
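
The optional ?version=ID parameter could be handled along the lines of the sketch below. It is an illustration only: parseVersion and recoveryHandler are hypothetical names, version IDs are assumed to be positive integers, and the customer ID "acme" is made up. Authentication and the actual database lookup are left out.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// parseVersion interprets the raw ?version= value.
// ok is false when the parameter is absent, meaning "serve the latest backup".
// Assumption: version IDs are positive integers.
func parseVersion(raw string) (id int64, ok bool, err error) {
	if raw == "" {
		return 0, false, nil
	}
	id, err = strconv.ParseInt(raw, 10, 64)
	if err != nil || id <= 0 {
		return 0, false, fmt.Errorf("invalid version %q", raw)
	}
	return id, true, nil
}

// recoveryHandler is a hypothetical stand-in for the recovery endpoint.
func recoveryHandler(w http.ResponseWriter, r *http.Request) {
	id, ok, err := parseVersion(r.URL.Query().Get("version"))
	switch {
	case err != nil:
		http.Error(w, err.Error(), http.StatusBadRequest)
	case ok:
		fmt.Fprintf(w, "would load backup version %d\n", id)
	default:
		fmt.Fprintln(w, "would load latest backup")
	}
}

func main() {
	for _, target := range []string{
		"/api/v1/recovery/acme",
		"/api/v1/recovery/acme?version=42",
		"/api/v1/recovery/acme?version=abc",
	} {
		rec := httptest.NewRecorder()
		recoveryHandler(rec, httptest.NewRequest(http.MethodGet, target, nil))
		fmt.Printf("%s -> %d %s", target, rec.Code, rec.Body.String())
	}
}
```

Treating a missing parameter as "latest" keeps older clients working unchanged, which matches the described behaviour of the endpoint.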

Changed

  • SaveInfraBackup() — Now INSERTs a new row instead of upserting, preserving history. Automatically prunes old versions via GFS algorithm.
  • One-time migration — Existing data from infra_backups table is copied to infra_backup_versions on first startup
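
A rough sketch of how GFS pruning with the tiers listed above might be decided in memory. The Version type and the keepGFS and at helpers are hypothetical; bucketing by ISO week and calendar month, and treating "3 months" as 90 days, are assumptions rather than the Hub's confirmed behaviour.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Version is a hypothetical stand-in for one row of infra_backup_versions.
type Version struct {
	ID        int64
	CreatedAt time.Time
}

// at parses an RFC 3339 timestamp and panics on malformed input (demo helper).
func at(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}

// keepGFS returns the IDs to retain: everything from the last 24h, then the
// newest version per day (7 days), per ISO week (4 weeks) and per calendar
// month (assumed to mean 90 days). Anything not in the result can be deleted.
func keepGFS(versions []Version, now time.Time) map[int64]bool {
	sorted := append([]Version(nil), versions...)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].CreatedAt.After(sorted[j].CreatedAt)
	})
	const day = 24 * time.Hour
	keep := make(map[int64]bool)
	seen := make(map[string]bool)
	for _, v := range sorted {
		age := now.Sub(v.CreatedAt)
		var bucket string
		switch {
		case age <= day:
			keep[v.ID] = true
			continue
		case age <= 7*day:
			bucket = "d:" + v.CreatedAt.Format("2006-01-02")
		case age <= 28*day:
			y, w := v.CreatedAt.ISOWeek()
			bucket = fmt.Sprintf("w:%d-%02d", y, w)
		case age <= 90*day:
			bucket = "m:" + v.CreatedAt.Format("2006-01")
		default:
			continue // older than every tier: prune
		}
		if !seen[bucket] { // newest-first order makes the first hit the latest
			seen[bucket] = true
			keep[v.ID] = true
		}
	}
	return keep
}

func main() {
	now := at("2026-02-26T12:00:00Z")
	versions := []Version{
		{1, at("2026-02-26T11:00:00Z")}, // last 24h
		{2, at("2026-02-24T10:00:00Z")}, // latest of its day
		{3, at("2026-02-24T08:00:00Z")}, // same day, older
		{4, at("2025-11-01T00:00:00Z")}, // older than every tier
	}
	fmt.Println(keepGFS(versions, now))
}
```

With roughly one backup per day, the daily, weekly and monthly tiers add up to about 7 + 4 + 3 entries, which is in line with the ~14 versions quoted above.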

v0.6.1 (2026-02-25)

Added

  • Delete issues from app detail page — Known Issues table now has per-row checkboxes with "Delete Selected" and "Delete All Issues" buttons; keeps telemetry data (memory trends, etc.) intact
  • POST /apps/{appName}/delete-issues — New endpoint supporting action=selected (with issue_ids form values) and action=all

Fixed

  • Hub-side fingerprint hardening — fingerprintIssue() now strips ANSI escape codes, ISO/syslog timestamps, and lowercases before truncating to 100 chars. Prevents duplicate issue rows when messages differ only by embedded timestamps.
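
The normalization steps can be sketched as follows. normalizeIssue is an illustrative stand-in, not the actual fingerprintIssue() code; the regular expressions and the whitespace collapsing step are assumptions about what "ISO/syslog timestamps" covers.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	// ANSI CSI sequences such as "\x1b[31m" (colour codes).
	ansiRe = regexp.MustCompile(`\x1b\[[0-9;]*[A-Za-z]`)
	// ISO 8601 style timestamps, e.g. 2026-02-25T10:00:01.123Z.
	isoRe = regexp.MustCompile(`\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:?\d{2})?`)
	// Classic syslog timestamps, e.g. "Feb 25 10:00:01".
	syslogRe = regexp.MustCompile(`[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}`)
)

// normalizeIssue strips volatile parts of a log message so that two messages
// differing only by colour codes or timestamps map to the same string.
func normalizeIssue(msg string) string {
	msg = ansiRe.ReplaceAllString(msg, "")
	msg = isoRe.ReplaceAllString(msg, "")
	msg = syslogRe.ReplaceAllString(msg, "") // must run before lowercasing
	msg = strings.ToLower(msg)
	msg = strings.Join(strings.Fields(msg), " ") // assumption: collapse whitespace
	if r := []rune(msg); len(r) > 100 {
		msg = string(r[:100]) // truncate by runes to avoid splitting UTF-8
	}
	return msg
}

func main() {
	a := normalizeIssue("\x1b[31m2026-02-25T10:00:01Z ERROR Connection lost\x1b[0m")
	b := normalizeIssue("Feb 25 11:30:45 error connection lost")
	fmt.Printf("%q\n%q\nequal: %v\n", a, b, a == b)
}
```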

v0.6.0 (2026-02-25)

Added

  • Geo-restriction display (customer_unified.html) — New "Geo-korlátozás" ("Geo-restriction") section on customer detail pages showing: enabled/disabled status, allowed countries, per-app overrides, last sync time, and sync errors. Only visible when the controller reports geo_restriction data.
  • "Összes geo-korlátozás eltávolítása" ("Remove all geo-restrictions") button — One-click removal of all [felhom-geo] Cloudflare WAF rules. The Hub calls the Cloudflare API directly (bypasses potentially blocked tunnel), then retries notifying the controller in background (every 30s for up to 10 min) to disable geo in its settings.
  • Cloudflare unblock client (internal/cloudflare/unblock.go) — Minimal Cloudflare API client for deleting geo-restriction WAF rules. Resolves zone ID, finds the http_request_firewall_custom ruleset, and deletes rules with [felhom-geo] description prefix.
  • POST /customers/{id}/geo/disable route — CSRF-protected endpoint for the geo-disable action.
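
The background retry after the Cloudflare rules are removed (every 30s for up to 10 min) could look roughly like this. notifyWithRetry and simulate are hypothetical names, and the demo uses millisecond intervals so it finishes quickly; the real values would be 30*time.Second and 10*time.Minute.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// notifyWithRetry calls notify immediately, then once per interval, until it
// succeeds or the overall budget elapses.
func notifyWithRetry(ctx context.Context, interval, budget time.Duration, notify func() error) error {
	ctx, cancel := context.WithTimeout(ctx, budget)
	defer cancel()

	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for {
		err := notify()
		if err == nil {
			return nil
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("giving up: %w (last error: %v)", ctx.Err(), err)
		case <-ticker.C:
			// next attempt
		}
	}
}

// simulate runs the loop against a fake controller that fails the given
// number of times before accepting the request.
func simulate(failures int) (int, error) {
	attempts := 0
	notify := func() error {
		attempts++
		if attempts <= failures {
			return errors.New("controller unreachable")
		}
		return nil
	}
	err := notifyWithRetry(context.Background(), 5*time.Millisecond, 100*time.Millisecond, notify)
	return attempts, err
}

func main() {
	attempts, err := simulate(2)
	fmt.Println("attempts:", attempts, "err:", err)
}
```

Running the loop in a goroutine lets the web request return as soon as the Cloudflare rules are gone, while the controller is informed once its tunnel is reachable again.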

Removed

  • Legacy Monitoring UUIDs — Removed the "Monitoring UUIDs" section from the config form (config_form.html), UUID form-field handling from buildConfigJSON(), UUID import from handlePullConfig(), volatile key entries for monitoring.ping_uuids.*, and the commented-out ping_uuids section from controller.yaml.default. Monitoring is fully handled by the Hub event system since v0.3.0.

v0.5.0 (2026-02-25)

Added

  • Configuration page (GET /configuration) — New "Configuration" tab in the web UI with asset management controls. Displays asset file count, manifest generation timestamp, and a "Refresh Assets from Image" button.
  • Manual asset re-seed (POST /configuration, action=refresh_assets) — Re-reads the baked-in seed directory, compares SHA-256 checksums with PVC assets, and updates changed files. Rebuilds the manifest afterward. Controllers pick up changes on their next daily sync.
  • ReSeed() method (internal/assets/assets.go) — Public method for triggering asset re-seed + manifest rebuild from the web UI.

Changed

  • Asset seeding: seedIfEmpty() → seedOrUpdate() (internal/assets/assets.go) — On startup the Hub now compares SHA-256 checksums between the image seed directory and the PVC, updating any changed files instead of only seeding into an empty directory. This means redeploying the Hub image with updated assets automatically propagates them without PVC deletion.
  • isAssetFile() expanded — Now also matches *-favicon.svg and *-favicon.ico patterns, allowing branding assets like felhom-favicon.svg in the manifest.
  • RebuildManifest() refactored — Internal logic extracted to rebuildManifestLocked() for reuse by ReSeed().
  • Web Server struct — Added assetsMgr field and SetAssetManager() method. Wired in main.go.
  • All templates translated to English — The "Alkalmazások" nav link and telemetry pages (apps.html, app_detail.html, customer_unified.html telemetry section) are now in English, consistent with the rest of the Hub UI.
  • Navigation updated — All templates now show four tabs: Dashboard, Customers, Apps, Configuration.
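
The checksum-based seeding described under "Asset seeding" can be approximated as below. This is a simplified sketch: the function body, the flat directory layout and the file name used in the demo are assumptions, and manifest rebuilding is not shown.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
	"path/filepath"
)

// sha256Hex returns the hex-encoded SHA-256 digest of data.
func sha256Hex(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// seedOrUpdate copies every regular file from seedDir into dstDir when the
// destination copy is missing or has a different checksum. It returns the
// names of the files it wrote.
func seedOrUpdate(seedDir, dstDir string) ([]string, error) {
	entries, err := os.ReadDir(seedDir)
	if err != nil {
		return nil, err
	}
	var updated []string
	for _, e := range entries {
		if !e.Type().IsRegular() {
			continue
		}
		data, err := os.ReadFile(filepath.Join(seedDir, e.Name()))
		if err != nil {
			return updated, err
		}
		dst := filepath.Join(dstDir, e.Name())
		if old, err := os.ReadFile(dst); err == nil && sha256Hex(old) == sha256Hex(data) {
			continue // unchanged
		}
		if err := os.WriteFile(dst, data, 0o644); err != nil {
			return updated, err
		}
		updated = append(updated, e.Name())
	}
	return updated, nil
}

// demo seeds an empty directory, re-runs with no changes, then re-runs after
// the seed file changed. It returns the list of written files for each run.
func demo() ([][]string, error) {
	seed, err := os.MkdirTemp("", "seed")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(seed)
	pvc, err := os.MkdirTemp("", "pvc")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(pvc)

	var runs [][]string
	for _, content := range []string{"<svg v1/>", "<svg v1/>", "<svg v2/>"} {
		path := filepath.Join(seed, "felhom-favicon.svg")
		if err := os.WriteFile(path, []byte(content), 0o644); err != nil {
			return runs, err
		}
		updated, err := seedOrUpdate(seed, pvc)
		if err != nil {
			return runs, err
		}
		runs = append(runs, updated)
	}
	return runs, nil
}

func main() {
	runs, err := demo()
	fmt.Println(runs, err)
}
```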

v0.4.1 (2026-02-23)

Added

  • Per-app telemetry reset (store/telemetry.go, web/apps.go) — New "Telemetria törlése" ("Delete telemetry") button on the app detail page that deletes all telemetry records and known issues for the selected app. Useful after major app updates when old data is no longer representative. Includes confirmation dialog and flash notification.
  • DeleteAppTelemetry() and DeleteAppIssues() store methods (store/telemetry.go) — Delete all telemetry/issue rows for a specific app_name.
  • POST /apps/{name}/reset-telemetry route (web/server.go) — CSRF-protected endpoint that triggers the reset and redirects back with flash message.

v0.4.0 (2026-02-23)

App Telemetry & Analytics Dashboard

Added

  • app_telemetry and app_log_issues SQLite tables (store/store.go) — store per-app resource metrics and deduplicated log issues reported by v0.28.0+ controllers.
  • internal/store/telemetry.go — New store methods: SaveAppTelemetry, GetFleetAppSummary (with P95 memory calculation), GetAppTelemetryHistory, GetAppCustomerBreakdown, GetCustomerAppSummary, GetAppIssues, GetRecentIssuesAllApps, PruneAppTelemetry, PruneStaleIssues. New types: AppTelemetryRecord, FleetAppSummary, AppTelemetryPoint, AppCustomerStats, CustomerAppSummary, AppIssue.
  • /api/v1/report handler update (api/handler.go) — After saving the standard report, parses the optional app_telemetry JSON field and persists it. Backward-compatible: old controllers (no app_telemetry key) are unaffected.
  • Fleet app list page (GET /apps) — Hungarian-language dashboard showing all deployed apps fleet-wide with deployment count, avg/P95 memory, catalog estimate/limit accuracy, error/warning badges. Sortable columns, 24h/7d/30d period selector.
  • Per-app detail page (GET /apps/{name}) — Memory trend Chart.js chart (avg + peak, with catalog limit line), per-customer breakdown table, known log issues table (severity, message, occurrence count, affected customers). Includes suggested mem_limit from P95×1.2 rounded to 32M.
  • Customer detail page telemetry section (customer_unified.html) — New "Alkalmazás telemetria" ("Application telemetry") card with per-app memory (current/avg/peak) and log error/warning counts linking to /apps/{name}.
  • Chart.js (static/chart.min.js) — Embedded from controller build, served at /static/chart.min.js.
  • "Alkalmazások" ("Applications") nav link — Added to header navigation across all templates.
  • New CSS (style.css) — .badge, .badge-error, .badge-warn, .summary-cards, .summary-card, .chart-container, .period-selector, .period-btn, .accuracy-dot, .mem-ok/warn/danger, .data-table styles.
  • Telemetry pruning (cmd/hub/main.go) — pruneAll() now also prunes app_telemetry rows older than 90 days and stale log issues not seen in 30 days.
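
As a worked example of the suggested mem_limit (P95×1.2 rounded to 32M), the sketch below uses the nearest-rank percentile and rounds up to the next multiple of 32 MB. Both choices are assumptions; the Hub may compute P95 differently (for instance in SQL) or round to the nearest step instead.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// p95 returns the 95th percentile of the samples using the nearest-rank method.
func p95(samplesMB []float64) float64 {
	if len(samplesMB) == 0 {
		return 0
	}
	sorted := append([]float64(nil), samplesMB...)
	sort.Float64s(sorted)
	rank := (95*len(sorted) + 99) / 100 // ceil(0.95 * n) in integer arithmetic
	return sorted[rank-1]
}

// suggestLimitMB adds 20% headroom, then rounds up to a multiple of 32 MB.
func suggestLimitMB(p95MB float64) int {
	const step = 32
	return int(math.Ceil(p95MB*1.2/step)) * step
}

func main() {
	samples := []float64{180, 190, 185, 200, 210, 195, 188, 192, 205, 199}
	p := p95(samples)
	fmt.Printf("P95 = %.0f MB, suggested mem_limit = %dM\n", p, suggestLimitMB(p))
}
```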

Changed

  • internal/web/apps.go (new file) — handleApps, handleAppDetail, parsePeriod, sortFleetSummary, aggregateHistoryForChart, parseLimitMB, memoryColor, accuracyClass, getCSRFToken helper functions.
  • internal/web/server.go — Added routes for /apps, /apps/{name}, /static/chart.min.js. Added memoryColor, accuracyClass, gt template functions.
  • internal/web/embed.go — Added //go:embed static/chart.min.js directive.

v0.3.7 (2026-02-21)

Asset management API

  • New internal/assets package: manages app assets (logos, screenshots) on Hub PVC (/data/assets/) with automatic seeding from baked-in image copy on first run.
  • Two new authenticated API endpoints for controllers to sync assets:
    • GET /api/v1/assets/manifest — returns JSON manifest with filenames + SHA-256 checksums
    • GET /api/v1/assets/file/{filename} — serves individual asset files
  • Dockerfile updated to COPY assets/ /usr/share/felhom/assets-seed/ for first-run seeding.
  • Build script syncs website assets (*-logo.{svg,png}, *-screenshot-*.webp) into Docker build context.
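
For illustration, the file name patterns synced by the build script could be matched in Go as shown here. matchesAssetPattern is a hypothetical helper; path.Match has no brace expansion, so *-logo.{svg,png} is written out as two patterns.

```go
package main

import (
	"fmt"
	"path"
)

// assetPatterns mirrors the glob patterns named above.
var assetPatterns = []string{
	"*-logo.svg",
	"*-logo.png",
	"*-screenshot-*.webp",
}

// matchesAssetPattern reports whether name matches any asset pattern.
func matchesAssetPattern(name string) bool {
	for _, p := range assetPatterns {
		if ok, _ := path.Match(p, name); ok {
			return true
		}
	}
	return false
}

func main() {
	for _, name := range []string{"nextcloud-logo.svg", "nextcloud-screenshot-1.webp", "readme.txt"} {
		fmt.Println(name, matchesAssetPattern(name))
	}
}
```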

v0.3.6 (2026-02-21)

Human-friendly retrieval passwords

  • Retrieval passwords now use Hungarian word passphrases (e.g. áldás-plazmid-palánta-süvítve-pócgém) instead of 64-char hex strings.
  • Embedded 29K+ curated Hungarian word list (hungarian.txt) via go:embed; 5-word passphrases give ~74 bits of entropy.
  • New configgen.RandomPassphrase(wordCount) function; all 3 retrieval password generation sites updated.
  • API keys remain as hex (machine-to-machine, never typed by humans).
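
A minimal sketch of passphrase generation and the entropy estimate. The five-word list is a placeholder built from the example above (the Hub embeds the full 29K+ list), and randomPassphrase only approximates what configgen.RandomPassphrase(wordCount) might do.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"math"
	"math/big"
	"strings"
)

// words is a tiny placeholder list for the demo.
var words = []string{"áldás", "plazmid", "palánta", "süvítve", "pócgém"}

// randomPassphrase picks wordCount words uniformly at random using
// crypto/rand and joins them with hyphens. The list must not be empty.
func randomPassphrase(list []string, wordCount int) (string, error) {
	picked := make([]string, 0, wordCount)
	limit := big.NewInt(int64(len(list)))
	for i := 0; i < wordCount; i++ {
		n, err := rand.Int(rand.Reader, limit)
		if err != nil {
			return "", err
		}
		picked = append(picked, list[n.Int64()])
	}
	return strings.Join(picked, "-"), nil
}

// entropyBits is wordCount * log2(listSize) for uniformly chosen words.
func entropyBits(listSize, wordCount int) float64 {
	return float64(wordCount) * math.Log2(float64(listSize))
}

func main() {
	p, err := randomPassphrase(words, 5)
	fmt.Println(p, err)
	fmt.Printf("29000 words, 5 picks: %.1f bits\n", entropyBits(29000, 5))
}
```

Since log2(29000) is about 14.8, five words give roughly 5 × 14.8 ≈ 74 bits, which is where the ~74 bits figure comes from.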

v0.3.5 (2026-02-21)

Recovery Endpoint & Customer Standing

  • New GET /api/v1/recovery/{customer_id} endpoint: returns both generated controller.yaml and infra backup in a single response for disaster recovery. Auth via X-Retrieval-Password header (same as config retrieval).
  • Report response now includes customer_blocked: true when customer status is "blocked" — allows controllers to detect standing and enter limited mode.

v0.3.4 (2026-02-20)

  • Rename version labels: "Current version" → "Controller version", "Latest version" → "Registry latest".

v0.3.3 (2026-02-20)

Bugfixes

  • Fix double "v" prefix in controller version display (showed "vv0.21.1" instead of "v0.21.1").
  • Skip deprecated monitoring.ping_uuids.* keys in config diff comparison (added to volatile keys).

v0.3.2 (2026-02-20)

Hub Version Display

  • Show Hub version in footer of all pages via hubVersion template function.
  • web.New() now accepts version parameter (4th arg) — set via ldflags at build time.

v0.3.1 (2026-02-20)

Config Diff Display + Pull Config

  • Value-based config comparison: Replaced broken SHA256 hash comparison with semantic YAML comparison. Both configs are parsed into maps, flattened to dot-notation keys, and compared by value. Ignores key ordering, whitespace, comments, and volatile fields (web.session_secret). Shows actual diff count on customer page ("⚠ Config mismatch — N differences").
  • Config diff endpoint (GET /customers/{id}/config-diff): Fetches live YAML from controller via new GET /api/config endpoint, generates Hub YAML via configgen.Generate(), returns JSON with per-key diffs (key, hub value, controller value, status). Sensitive values (tokens, passwords, secrets) are masked.
  • Pull Config (POST /customers/{id}/pull-config): Reverse of Push Config — imports controller's current config into the Hub. Extracts identity fields (name, domain, email) and override fields (infrastructure tokens, git credentials, monitoring UUIDs). Preserves existing APIKey and RetrievalPassword.
  • Diff display UI: "Show Diff" button on customer page expands a table showing all key-value differences with color-coded rows (yellow=changed, blue=hub-only, orange=controller-only).
  • Pull Config button: Added next to existing "Push Config" with confirmation dialog.
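
The value-based comparison can be illustrated with already-parsed maps (YAML parsing is omitted so the example stays within the standard library). flatten, diffConfigs and the Diff type are hypothetical; the status names follow the diff table colours described above.

```go
package main

import (
	"fmt"
	"sort"
)

// Diff describes one differing key between the Hub and controller configs.
type Diff struct{ Key, Hub, Controller, Status string }

// volatile keys are ignored during comparison.
var volatile = map[string]bool{"web.session_secret": true}

// flatten turns nested maps into dot-notation keys with string values.
func flatten(prefix string, in map[string]any, out map[string]string) {
	for k, v := range in {
		key := k
		if prefix != "" {
			key = prefix + "." + k
		}
		if child, ok := v.(map[string]any); ok {
			flatten(key, child, out)
			continue
		}
		out[key] = fmt.Sprint(v)
	}
}

// diffConfigs compares two configs by value, independent of key order.
func diffConfigs(hub, controller map[string]any) []Diff {
	h, c := map[string]string{}, map[string]string{}
	flatten("", hub, h)
	flatten("", controller, c)

	var diffs []Diff
	for k, hv := range h {
		if volatile[k] {
			continue
		}
		cv, ok := c[k]
		switch {
		case !ok:
			diffs = append(diffs, Diff{k, hv, "", "hub-only"})
		case cv != hv:
			diffs = append(diffs, Diff{k, hv, cv, "changed"})
		}
	}
	for k, cv := range c {
		if _, ok := h[k]; !ok && !volatile[k] {
			diffs = append(diffs, Diff{k, "", cv, "controller-only"})
		}
	}
	sort.Slice(diffs, func(i, j int) bool { return diffs[i].Key < diffs[j].Key })
	return diffs
}

func main() {
	hub := map[string]any{
		"customer": map[string]any{"name": "acme", "domain": "a.example"},
		"web":      map[string]any{"session_secret": "x"},
	}
	controller := map[string]any{
		"customer": map[string]any{"name": "acme", "domain": "b.example"},
		"web":      map[string]any{"session_secret": "y"},
	}
	for _, d := range diffConfigs(hub, controller) {
		fmt.Printf("%+v\n", d)
	}
}
```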

v0.3.0 (2026-02-20)

Hub Monitoring Takeover — Event System, Dead Man's Switch, Notifications

Replaces external Healthchecks.io with a Hub-native event system. The Hub becomes the single source of truth for all customer monitoring, event tracking, dead man's switch alerting, and notification delivery.

Phase 1 — Event System

  • events table in SQLite: stores all events with customer_id, event_type, severity, message, details_json, source, timestamp
  • Indexes: idx_events_customer_created (customer + time DESC), idx_events_type (type + time DESC)
  • Store methods: SaveEvent, GetRecentEvents, GetEventsByType, GetLatestEventByType, GetAllRecentEvents, CountEventsBySeverity, PruneEvents, GetActiveCustomerIDs
  • POST /api/v1/event endpoint: accepts structured events from controllers, validates event_type against 27 allowed types, validates severity (info/warning/error), stores in DB
  • Enhanced auth: checkAuthCustomer() validates per-customer API keys match the customer_id in payload; global key bypasses ownership check
  • Prune: events pruned alongside reports at 04:30 Budapest time
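
Validation of incoming events might look like the following sketch. The Event struct and validateEvent are hypothetical, and the allow-list contains only event types that are named in this changelog; the real list has 27 entries, and whether all of these are accepted from controllers is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// Event is a hypothetical, trimmed-down version of the event payload.
type Event struct {
	CustomerID string
	EventType  string
	Severity   string
	Message    string
}

// allowedTypes is a partial allow-list for illustration only.
var allowedTypes = map[string]bool{
	"node_stale": true, "node_down": true, "node_recovered": true,
	"backup_completed": true, "db_dump_completed": true,
	"expected_backup_missed": true, "expected_dbdump_missed": true,
	"test": true,
}

var allowedSeverities = map[string]bool{"info": true, "warning": true, "error": true}

// validateEvent rejects events with a missing customer, an unknown type or
// an invalid severity.
func validateEvent(e Event) error {
	switch {
	case e.CustomerID == "":
		return errors.New("customer_id is required")
	case !allowedTypes[e.EventType]:
		return fmt.Errorf("unknown event_type %q", e.EventType)
	case !allowedSeverities[e.Severity]:
		return fmt.Errorf("invalid severity %q", e.Severity)
	}
	return nil
}

func main() {
	fmt.Println(validateEvent(Event{"acme", "backup_completed", "info", "nightly backup done"}))
	fmt.Println(validateEvent(Event{"acme", "backup_completed", "critical", ""}))
}
```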

Phase 2 — Dead Man's Switch

  • Staleness checker (internal/monitor/staleness.go): runs every 60s, detects when controllers stop reporting
    • ok→stale (>30min): inserts node_stale warning event
    • any→down (>60min): inserts node_down error event
    • stale/down→ok: inserts node_recovered info event
    • Skips blocked customers, no false alerts on startup
  • Backup deadline checker (internal/monitor/deadline.go): runs daily at 05:00 Budapest
    • Detects missing backup_completed events since midnight → inserts expected_backup_missed error
    • Detects missing db_dump_completed events → inserts expected_dbdump_missed error
    • Grace: skips customers with node_down state
  • scheduleDaily() helper: goroutine that sleeps until target time (Europe/Budapest), runs function, loops
  • /healthz enhanced: returns 503 if SQLite Ping fails
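
The staleness transitions listed above can be modelled as a small state machine. The State type and the classify, transitionEvent and minutes helpers are hypothetical; the thresholds and event names are taken from the list, with ">30min" and ">60min" read as strict comparisons.

```go
package main

import (
	"fmt"
	"time"
)

// State is the monitoring state derived from the age of the last report.
type State string

const (
	StateOK    State = "ok"
	StateStale State = "stale"
	StateDown  State = "down"
)

// minutes is a small helper for readable durations.
func minutes(n int) time.Duration { return time.Duration(n) * time.Minute }

// classify maps the time since the last report to a state.
func classify(sinceLastReport time.Duration) State {
	switch {
	case sinceLastReport > minutes(60):
		return StateDown
	case sinceLastReport > minutes(30):
		return StateStale
	default:
		return StateOK
	}
}

// transitionEvent returns the event type to insert when a customer moves
// from prev to next, or "" when nothing should be emitted.
func transitionEvent(prev, next State) string {
	switch {
	case prev == next:
		return ""
	case next == StateDown:
		return "node_down"
	case next == StateStale && prev == StateOK:
		return "node_stale"
	case next == StateOK:
		return "node_recovered"
	}
	return ""
}

func main() {
	prev := StateOK
	for _, age := range []int{5, 45, 90, 2} {
		next := classify(minutes(age))
		fmt.Printf("age=%dmin %s -> %s event=%q\n", age, prev, next, transitionEvent(prev, next))
		prev = next
	}
}
```

One way to avoid false alerts on startup would be to seed each customer's previous state from the first classification without emitting an event; whether the Hub does exactly this is not stated.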

Phase 3 — Notification System

  • Dispatcher (internal/notify/dispatcher.go): processes events and sends emails via Resend API
    • Operator channel: English emails to operator for warning/error events, 1h cooldown per customer:eventType
    • Customer channel: Hungarian emails per event_type, respects customer preferences (enabled_events, cooldown_hours), blocked customers skipped
    • Test bypass: test event type skips cooldown/preferences, sends directly to customer email
  • Email templates (internal/notify/templates.go): operator (concise English), customer (Hungarian per event type with complete message table)
  • Cooldown tracking: in-memory maps with per-customer:eventType granularity
  • customer_notifications table: added cooldown_hours column (default 6)
  • notification_log table: added channel column (operator/customer)
  • Wired into /api/v1/event handler and staleness/deadline checkers
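
The in-memory cooldown with customer:eventType granularity could be implemented roughly as follows. The Cooldown type and its Allow method are hypothetical names; passing the current time in explicitly is a design choice made here to keep the sketch testable.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Cooldown tracks the last send time per "customer:eventType" key.
type Cooldown struct {
	mu   sync.Mutex
	last map[string]time.Time
}

// NewCooldown returns an empty tracker.
func NewCooldown() *Cooldown {
	return &Cooldown{last: make(map[string]time.Time)}
}

// Allow reports whether a notification may be sent at time now, and records
// the send time if so. Suppressed attempts do not extend the window.
func (c *Cooldown) Allow(customerID, eventType string, window time.Duration, now time.Time) bool {
	key := customerID + ":" + eventType
	c.mu.Lock()
	defer c.mu.Unlock()
	if t, ok := c.last[key]; ok && now.Sub(t) < window {
		return false
	}
	c.last[key] = now
	return true
}

// demo replays a short sequence with the operator channel's 1h cooldown.
func demo() []bool {
	c := NewCooldown()
	start := time.Date(2026, 2, 20, 8, 0, 0, 0, time.UTC)
	return []bool{
		c.Allow("acme", "node_down", time.Hour, start),                      // first alert
		c.Allow("acme", "node_down", time.Hour, start.Add(30*time.Minute)),  // within the window
		c.Allow("acme", "node_stale", time.Hour, start.Add(30*time.Minute)), // different event type
		c.Allow("acme", "node_down", time.Hour, start.Add(61*time.Minute)),  // window elapsed
	}
}

func main() {
	fmt.Println(demo())
}
```

Because the map lives in memory, cooldowns reset when the Hub restarts, which is a natural consequence of the in-memory tracking described above.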

Phase 4 — Hub UI

  • Events section on customer detail page: last 50 events, severity filter buttons (All/Errors/Warnings/Info), colored severity badges
  • Dashboard badges: error+warning count in last 24h per customer, clickable to customer events
  • Notification log: shows channel column (operator/customer) in customer detail page
  • Config form: Monitoring UUIDs section marked as "Legacy" with deprecation notice, collapsed by default

Phase 6 — Config Cleanup

  • controller.yaml.default: monitoring.ping_uuids section commented out (deprecated)
  • buildConfigJSON: only writes ping_uuids to config JSON if user explicitly provides UUID values (new configs get none)

v0.2.2 (2026-02-20)

Config Hash Comparison

  • Config sync status on unified customer page: compares SHA256 hash of controller's controller.yaml (from report payload) against Hub-generated YAML. Shows "In sync", "Config mismatch", or "Unknown" (controller needs v0.20.0+ to report hash).
  • Visible in the Controller Update section next to Push Config button.

v0.2.1 (2026-02-20)

Unified Customer Management

All customer views consolidated into a single page. New management features: blocked status, dashboard merge, config push, and auto-config creation.

New features

  • Unified customer page — /customers/{id}:

    • Single page showing both configuration info and live report data
    • Replaces separate /configs/{id} (config detail) and /customers/{id} (report detail) pages
    • Shows config management (credentials, setup commands, YAML preview) when config exists
    • Shows "Create Config" button for manual (report-only) customers
    • Old /configs/{id} URLs redirect to /customers/{id}
  • Dashboard shows pending customers:

    • Customers with config but no reports appear on dashboard with "PENDING" status
    • All metric columns show "—" for pending customers
  • Blocked/Banned status:

    • Customers can be blocked via button on detail page
    • Blocked customers hidden from Dashboard
    • Reports still accepted (prevents controller retry loops) but notifications suppressed
    • "BLOCKED" badge shown on Customers list and detail page
    • One-click unblock button
  • Config push to controller:

    • "Push Config" button on unified page (visible when controller URL known)
    • Generates YAML and POSTs to {controller_url}/api/config/apply
    • Note: requires controller v0.20.0+ with config apply endpoint
  • Auto-create config from report data:

    • "Create Config" button on manual customer pages
    • Pre-fills customer name from report, generates credentials
    • Redirects to edit form for additional fields

Changes

  • Customers list: all rows now link to /customers/{id} (unified page)
  • Config badges: new MANAGED/MANUAL/BLOCKED pill-style badges
  • customer_configs table: added status column (active/blocked)
  • Status functions handle "pending" and "blocked" status values

v0.2.0 (2026-02-20)

Customer Configuration Management

New "Configurations" section for pre-provisioning customer nodes. Operators can configure customer settings in the Hub web UI, then docker-setup.sh downloads a ready-made controller.yaml — reducing deployment to a customer ID and password.

New features

  • Web UI — /configs pages:

    • List all customer configurations in a table
    • Create new configuration: customer identity, infrastructure secrets (CF tunnel/API tokens), git sync credentials, monitoring UUIDs — organized in collapsible sections
    • Detail page: shows credentials (retrieval password, per-customer API key) with copy-to-clipboard, setup commands (docker-setup.sh and curl), live YAML preview
    • Edit and delete configurations
    • Navigation tabs (Dashboard / Configurations) on all pages
  • Config retrieval API — GET /api/v1/config/{customer_id}:

    • Authenticated via X-Retrieval-Password header (separate from Bearer token)
    • Generates complete controller.yaml by deep-merging template with customer overrides
    • Template sourced from controller.yaml.example (fetched from Gitea repo periodically)
    • Falls back to embedded default template if fetcher not configured
  • Per-customer API keys:

    • Each customer config gets its own API key (auto-generated, 64 hex chars)
    • Controllers can authenticate with per-customer key instead of the shared global key
    • Backward compatible — global report_api_key continues to work alongside per-customer keys
  • YAML generation (internal/configgen package):

    • Deep-merge of template + customer-specific overrides
    • Programmatic injection: customer identity, hub config, session secret
    • Shared by both API handler and web UI preview
  • Template fetcher (background goroutine):

    • Periodically fetches controller.yaml.example from Gitea (configurable interval)
    • Requires registry.username + registry.token in hub.yaml
    • Falls back to go:embed default template when not configured
  • Data layer:

    • New customer_configs SQLite table
    • 6 CRUD methods: Save, Get, List, Delete, GetByAPIKey, UpdateRetrievalPassword
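
The deep-merge of template and customer overrides can be sketched on plain maps. deepMerge here is an illustration, not the configgen implementation; it assumes that nested maps are merged key by key and that any other value in the overrides replaces the template value.

```go
package main

import "fmt"

// deepMerge returns a new map with override layered on top of base.
// Nested maps present on both sides are merged recursively; the inputs are
// not modified, although untouched nested maps are shared with base.
func deepMerge(base, override map[string]any) map[string]any {
	out := make(map[string]any, len(base))
	for k, v := range base {
		out[k] = v
	}
	for k, v := range override {
		baseChild, baseIsMap := out[k].(map[string]any)
		overChild, overIsMap := v.(map[string]any)
		if baseIsMap && overIsMap {
			out[k] = deepMerge(baseChild, overChild)
			continue
		}
		out[k] = v
	}
	return out
}

func main() {
	template := map[string]any{
		"hub": map[string]any{"url": "https://hub.example", "interval": "5m"},
		"web": map[string]any{"port": 8080},
	}
	overrides := map[string]any{
		"hub":      map[string]any{"url": "https://hub.customer.example"},
		"customer": map[string]any{"id": "acme"},
	}
	fmt.Println(deepMerge(template, overrides))
}
```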

Configuration

New registry section in hub.yaml:

registry:
  image: "gitea.dooplex.hu/admin/felhom-controller"
  username: ""               # Gitea credentials (for version checker + template fetcher)
  token: ""
  check_interval: "6h"
  template_interval: "1h"   # How often to refresh controller.yaml.example

Files added

  • internal/configgen/configgen.go — shared YAML generation package
  • internal/web/configs.go — web handlers for config CRUD
  • internal/web/templatefetcher.go — background template refresh
  • internal/web/controller.yaml.default — embedded fallback template
  • internal/web/templates/configs.html — config list page
  • internal/web/templates/config_form.html — create/edit form
  • internal/web/templates/config_detail.html — detail + credentials page

Files modified

  • internal/store/store.go — customer_configs table + CRUD methods
  • internal/api/handler.go — config retrieval endpoint, per-customer auth, ConfigTemplateProvider interface
  • internal/web/server.go — /configs/* routes, SetTemplateFetcher()
  • internal/web/embed.go — embedded default template
  • internal/web/templates/dashboard.html — navigation bar
  • internal/web/templates/customer.html — navigation bar
  • internal/web/templates/style.css — form, nav, button, credential styles
  • cmd/hub/main.go — template fetcher wiring, TemplateInterval config
  • configs/hub.yaml.example — registry section

v0.1.8 (2026-02-16)

  • Controller update trigger: "Update" button on customer detail page calls controller's self-update endpoint
  • Registry version checker: background goroutine checks Gitea registry for latest controller image tag
  • Update available indicator on customer detail page

v0.1.7 (2026-02-15)

  • Infrastructure backup endpoints for disaster recovery (POST + GET /api/v1/infra-backup)

v0.1.6 (2026-02-14)

  • Handle disabled reporting status
  • Storage labels display
  • Date in history table

v0.1.5 (2026-02-13)

  • Notification preferences sync endpoint (POST /api/v1/preferences)
  • Notification display on customer detail page

v0.1.4 (2026-02-12)

  • Resend API key support for email notifications
  • Notification endpoint (POST /api/v1/notify)

v0.1.3 (2026-02-11)

  • Customer detail page: system info, storage bars, container table
  • 24h history graphs

v0.1.2 (2026-02-10)

  • Dashboard auto-refresh (60s cycle)
  • Status logic (green/yellow/red based on report age + health)

v0.1.1 (2026-02-09)

  • Basic dashboard with customer overview table
  • Report ingest API

v0.1.0 (2026-02-08)

  • Initial release: SQLite store, report API, basic web dashboard