22 Commits

Author SHA1 Message Date
admin 3457415117 slice 10D (hub): DR capstone — recovery mode + re-enroll + directive serving (hub v0.11.0)
Recovery-mode toggle (global key, bounded auto-expiry) gates re-enroll +
restore-directive serving. Re-enroll rotates the agent<->hub credential to the
new box (old key revoked); returns the opaque escrow blobs + non-secret
directive. Store gains recovery_mode_until + identity_blob + directive_json.
Hub holds no usable secret + no Cloudflare write-power (operator-side rotation).
Doc 03 §9: slice 10 CLOSED.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 09:48:38 +02:00
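A minimal sketch of the bounded auto-expiry idea behind the recovery-mode toggle in commit 3457415117. Only the `recovery_mode_until` concept comes from the commit message; the `recoveryGate` type, the `maxRecoveryWindow` value of one hour, and the `activeAfter` helper are hypothetical names and numbers chosen for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// maxRecoveryWindow is a hypothetical upper bound; the real limit is not
// stated in the commit message.
const maxRecoveryWindow = time.Hour

// recoveryGate holds the expiry timestamp; the zero value means "off".
type recoveryGate struct {
	until time.Time
}

// Enable turns recovery mode on for d, clamped to [0, maxRecoveryWindow].
func (g *recoveryGate) Enable(now time.Time, d time.Duration) {
	if d > maxRecoveryWindow {
		d = maxRecoveryWindow
	}
	if d < 0 {
		d = 0
	}
	g.until = now.Add(d)
}

// Active reports whether gated endpoints should be served at time now.
// No background timer is needed: the mode expires by comparison alone.
func (g *recoveryGate) Active(now time.Time) bool {
	return now.Before(g.until)
}

// activeAfter enables the gate for windowMin minutes and checks it
// elapsedMin minutes later.
func activeAfter(windowMin, elapsedMin int) bool {
	start := time.Date(2026, 6, 11, 9, 0, 0, 0, time.UTC)
	var g recoveryGate
	g.Enable(start, time.Duration(windowMin)*time.Minute)
	return g.Active(start.Add(time.Duration(elapsedMin) * time.Minute))
}

func main() {
	// prints: true false false
	fmt.Println(activeAfter(30, 10), activeAfter(30, 31), activeAfter(600, 61))
}
```

Storing an expiry timestamp rather than a boolean means a forgotten toggle closes itself, which is one plausible reading of "bounded auto-expiry".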
admin 0c843286a2 slice 10B: signed-op job completion (DELETE clear-job) (hub v0.10.0)
Add DELETE /hosts/{id}/jobs/{job_id} (per-host self-scoped, idempotent) so the
agent clears a job after executing or terminally rejecting it. The hub stores
the operator-signed blobs opaquely (no signing key — cannot forge or open);
the agent verifies + executes. Doc 03 §4/§6/§9 updated (operator-signed path
live; 8C wipe completes; 10B done).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 20:14:32 +02:00
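The "per-host self-scoped, idempotent" behaviour of the clear-job endpoint in commit 0c843286a2 could look roughly like the sketch below. It is an in-memory stand-in, not the hub's store: `jobStore`, `clearJob`, and the 204/403 status codes are assumptions, since the commit message does not state which codes the handler returns.

```go
package main

import "fmt"

// jobStore maps host ID -> set of pending job IDs. It is a hypothetical
// in-memory stand-in for a persistent jobs table.
type jobStore map[string]map[string]bool

// clearJob removes jobID for hostID on behalf of the authenticated host.
// The returned status codes are illustrative assumptions.
func (s jobStore) clearJob(authHost, hostID, jobID string) int {
	if authHost != hostID {
		return 403 // self-scoped: a host may only clear its own jobs
	}
	// Deleting a missing key (or from a nil inner map) is a no-op, so a
	// repeated call yields the same result as the first one.
	delete(s[hostID], jobID)
	return 204
}

func main() {
	s := jobStore{"h1": {"j1": true}}
	fmt.Println(s.clearJob("h1", "h1", "j1")) // first clear
	fmt.Println(s.clearJob("h1", "h1", "j1")) // retry after a lost response
	fmt.Println(s.clearJob("h2", "h1", "j1")) // another host's job
}
```

Idempotency matters here because an agent that loses the response can safely retry the DELETE without special-casing "already gone".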
admin e54f882e70 slice 10A: hub desired-state serving + signed-jobs queue (Down channel) (hub v0.9.0)
Serve operator intent to authenticated hosts: PUT /admin/hosts/{id}/desired-state
(global key) bumps desired_generation; GET /hosts/{id}/desired-state + /jobs are
per-host self-scoped; the host-report envelope now carries the real generation +
has_signed_ops. New signed_jobs table + store methods. Desired-state stored/served
opaquely (agent owns the schema). Cross-repo golden (envelope + desired-state)
byte-identical with felhom-agent; doc 03 §4/§9 updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 19:03:14 +02:00
admin 7eb3772000 hub: opaque PBS recovery-code escrow storage (v0.8.0) + doc 03 §8a posture model
Slice-7 close-out (hub half). PUT /api/v1/hosts/{host_id}/escrow (per-host key)
stores the agent's OPAQUE R-wrapped blob verbatim against the host; the hub never
decrypts it (no recovery code, no decrypt path). host_escrow table + Save/GetHostEscrow.
Tests: verbatim store, rotation last-write-wins, 401/403/400 auth+body, wire contract.

doc 03 §8a rewritten into the key-custody posture model: separation principle,
topology matrix, default + anti-lockout ladder, SSH-vs-key, breach/legal, integrity
caveat. Corrected: hub opaque storage is slice 7 (this task); serving is slice 10.
Slice table + §13 updated.

No secrets committed (R/K never appear; spike findings + docs use placeholders).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 07:46:33 +02:00
admin 4bd0909f2b hub: restore-test "passed with warnings" visibility (v0.7.5)
Phase B (hub half) of the restore-test warning fix. The agent v0.7.0 now passes a
restore-test that emitted a benign start advisory (systemd-nesting) and carries the
warning text on the wire.

- hostRestoreTest gains warnings + warnings_recognized mirror fields (omitempty;
  absent recognized => false => louder unrecognized path)
- ingest logs [INFO] passed WITH WARNINGS (recognized), [WARN] for unrecognized;
  FAILED still [WARN]
- golden restore_tests[0] gains the keys, byte-identical with felhom-agent (sha256
  e6999d77...); bidirectional key-set contract test round-trips them
- no dashboard widget: no host-domain dashboard surface exists yet (log+persist only,
  as with pbs_snapshots) -- deferred to slice 10

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 19:41:21 +02:00
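The "absent recognized => false => louder unrecognized path" rule from commit 4bd0909f2b follows from Go's zero values when decoding JSON. The sketch below assumes a `passed` field and a string-typed `warnings` field; only the `warnings` and `warnings_recognized` key names come from the commit message, and `restoreTest` and `logLevel` are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// restoreTest mirrors only the fields needed for this illustration.
// "passed" and the string type of "warnings" are assumptions.
type restoreTest struct {
	Passed             bool   `json:"passed"`
	Warnings           string `json:"warnings,omitempty"`
	WarningsRecognized bool   `json:"warnings_recognized,omitempty"`
}

// logLevel picks the log severity for one decoded restore-test record.
func logLevel(raw string) string {
	var rt restoreTest
	if err := json.Unmarshal([]byte(raw), &rt); err != nil {
		return "ERROR"
	}
	switch {
	case !rt.Passed:
		return "WARN" // failed restore-test
	case rt.Warnings == "":
		return "INFO" // clean pass
	case rt.WarningsRecognized:
		return "INFO" // passed with a recognized warning
	default:
		// Key absent decodes to false, so an older or incomplete sender
		// lands on the louder path by default.
		return "WARN"
	}
}

func main() {
	fmt.Println(logLevel(`{"passed":true,"warnings":"nesting","warnings_recognized":true}`))
	fmt.Println(logLevel(`{"passed":true,"warnings":"something new"}`))
}
```

Defaulting to the louder path means a missing flag can never hide an unknown warning.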
admin 5bc4c3d967 hub v0.7.4: ingest agent pbs_snapshots (slice 6 Phase B)
Accept + persist the now-populated host-report pbs_snapshots. hostPBSSnapshot mirror in
hostReportPayload (persisted via report_json, no schema change); a FAILED PBS verify is
logged prominently (loudest offsite-DR signal). Shared golden updated byte-identical with
felhom-agent; TestHostPBSSnapshot_GoldenContract added. Build/deploy deferred (backward-compatible).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 17:15:58 +02:00
admin 41f2d2b5da hub v0.7.3: ingest agent backups + restore_tests (slice 6 Phase A)
Accept + persist the now-populated host-report backups/restore_tests. Mirror structs in
hostReportPayload; persisted via report_json (no schema change); a FAILED restore-test is
logged prominently (loudest DR signal). Shared golden updated byte-identical with
felhom-agent; bidirectional key-set tests added. Build/deploy deferred (backward-compatible).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 13:56:18 +02:00
admin aaff268fff hub v0.7.2: ingest agent storage_targets (slice 5 Phase A)
Accept + persist the now-populated host-report storage_targets. Minimal — the
authoritative storage manifest is hub-owned (slice 10); this mirrors what the agent
observes.

- hostReportPayload.StorageTargets: full mirror of the agent's hub.StorageTarget
  wire contract; persisted verbatim in report_json (no schema change); count +
  WARN on disconnected targets.
- shared host-report golden updated with two populated targets; byte-identical with
  felhom-agent's copy.
- TestHostStorageTarget_GoldenContract: hub half of the bidirectional key-set test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 09:59:27 +02:00
admin 4be3bdf486 fix(hub): slice-3 follow-ups — /host-report 413 oversize + contract golden (v0.7.1)
- handleHostReport: read maxHostReportBytes+1 (4 MiB const) and reject oversize with
  413 instead of silent LimitReader truncation. Controller handleReport (1 MiB) is
  unchanged. Test asserts 413.
- contract: hub/internal/api/testdata/host-report.golden.json (byte-identical with
  felhom-agent's copy) + TestHostReport_GoldenContract drives the real handler and
  asserts 200 + denorm + both guests upserted.
- CHANGELOG v0.7.1.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 18:31:44 +02:00
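The oversize fix in commit 4be3bdf486 relies on reading one byte past the limit so that truncation becomes detectable. A minimal sketch of that technique, using a tiny 16-byte cap instead of the 4 MiB constant; `handleBounded`, `maxBodyBytes`, and `statusFor` are hypothetical names, not the hub's own.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// maxBodyBytes is deliberately small for the demo; the commit uses 4 MiB.
const maxBodyBytes = 16

// handleBounded reads at most maxBodyBytes+1 bytes. If the extra byte
// arrives, the body was larger than the cap and is rejected with 413
// instead of being silently cut off at the limit.
func handleBounded(w http.ResponseWriter, r *http.Request) {
	body, err := io.ReadAll(io.LimitReader(r.Body, maxBodyBytes+1))
	if err != nil {
		http.Error(w, "read error", http.StatusBadRequest)
		return
	}
	if len(body) > maxBodyBytes {
		http.Error(w, "payload too large", http.StatusRequestEntityTooLarge)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// statusFor posts an n-byte body to the handler and returns the status.
func statusFor(n int) int {
	req := httptest.NewRequest(http.MethodPost, "/host-report",
		bytes.NewReader(make([]byte, n)))
	rec := httptest.NewRecorder()
	handleBounded(rec, req)
	return rec.Code
}

func main() {
	// prints: 200 413
	fmt.Println(statusFor(maxBodyBytes), statusFor(maxBodyBytes+1))
}
```

With a plain `io.LimitReader(r.Body, maxBodyBytes)` the handler could not tell a body of exactly the limit from a longer one, which is the silent truncation the commit removes.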
admin 7c0c75457f feat(hub): host-domain ingest — tables + /host-report + per-host auth + host dead-man's-switch (v0.7.0, slice 3)
Purely additive; the controller path (reports/customer_configs/checkAuthCustomer/
existing checkers) is untouched. Cutover remains slice 10.

- store: new hosts/guests/host_reports tables (full schema incl. columns INERT
  until slice 10, so no later ALTER); GetHostByAPIKey/GetHost/ListHosts/UpsertHost/
  SaveHostReport/UpsertGuestFromReport (preserves inert cols)/GetHostStaleness/
  GuestID; Prune also prunes host_reports.
- api: checkAuthHost (sibling of checkAuthCustomer); POST /host-report (per-host
  Bearer, 4MiB, denorm + guest upsert, control envelope); POST /admin/hosts
  (PROVISIONAL global-key host mint); host_* event types registered.
- monitor: HostStalenessChecker sibling over host_reports (host_stale/down/
  recovered), wired on the existing 60s ticker; controller checkers unchanged.
- tests (hermetic): store intent/inert-column preservation, auth, ingest
  (envelope+denorm, mismatch/unknown/blocked/oversize), admin mint round-trip,
  host staleness transitions.

CHANGELOG v0.7.0. Contract matches the agent host-report spec field-for-field.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 16:36:16 +02:00
admin f1212e6ba8 feat: infra backup GFS retention + version history
New infra_backup_versions table with GFS pruning (~14 versions per
customer). Recovery endpoint supports ?version=ID. New /versions API.
Dashboard shows collapsible backup history with app names and disk count.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 14:47:48 +01:00
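GFS (grandfather-father-son) pruning, as named in commit f1212e6ba8, keeps the newest backup per day, per week, and per month for a limited number of each. The sketch below assumes a 7 daily / 4 weekly / 3 monthly split, which adds up to at most 14 retained versions; the commit only says "~14 versions per customer", so the split, the `version` type, and the function names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// version is a hypothetical stand-in for one backup version row.
type version struct {
	ID      int
	Created time.Time
}

// keepGFS returns the set of IDs to retain: the newest version in each of
// the most recent `daily` days, `weekly` ISO weeks and `monthly` months.
// Everything not in the returned set is eligible for pruning.
func keepGFS(vs []version, daily, weekly, monthly int) map[int]bool {
	sorted := append([]version(nil), vs...)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].Created.After(sorted[j].Created) // newest first
	})
	keep := map[int]bool{}
	pick := func(limit int, bucket func(time.Time) string) {
		seen := map[string]bool{}
		for _, v := range sorted {
			if len(seen) == limit {
				break
			}
			if k := bucket(v.Created); !seen[k] {
				seen[k] = true
				keep[v.ID] = true
			}
		}
	}
	pick(daily, func(t time.Time) string { return t.Format("2006-01-02") })
	pick(weekly, func(t time.Time) string {
		y, w := t.ISOWeek()
		return fmt.Sprintf("%d-W%02d", y, w)
	})
	pick(monthly, func(t time.Time) string { return t.Format("2006-01") })
	return keep
}

// keptCount builds one version per day for `days` days and counts survivors.
func keptCount(days int) int {
	end := time.Date(2026, 2, 26, 12, 0, 0, 0, time.UTC)
	var vs []version
	for i := 0; i < days; i++ {
		vs = append(vs, version{ID: i, Created: end.AddDate(0, 0, -i)})
	}
	return len(keepGFS(vs, 7, 4, 3))
}

func main() {
	fmt.Println(keptCount(90))
}
```

The tiers overlap (the newest backup is simultaneously today's, this week's, and this month's), so the retained count is usually below the nominal 14.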
admin a757bee07a feat(hub): app telemetry analytics dashboard (v0.4.0)
- store/telemetry.go: new app_telemetry + app_log_issues tables with
  SaveAppTelemetry, GetFleetAppSummary (with P95), GetAppTelemetryHistory,
  GetAppCustomerBreakdown, GetCustomerAppSummary, GetAppIssues, prune methods
- api/handler.go: parse and save optional app_telemetry from report body,
  backward-compatible with old controllers
- cmd/hub/main.go: prune app_telemetry (90d) and stale issues (30d)
- web/apps.go: handleApps + handleAppDetail + chart data aggregation helpers
- web/server.go: routes for /apps, /apps/{name}, /static/chart.min.js;
  added memoryColor/accuracyClass/gt template functions
- web/embed.go: embed static/chart.min.js
- web/configs.go: add app telemetry section to handleCustomerUnified
- templates/apps.html: fleet-wide app list with summary cards and sortable table
- templates/app_detail.html: per-app page with Chart.js memory trend,
  customer breakdown, and known issues table
- templates/customer_unified.html: new Alkalmazás telemetria card
- templates/style.css: badge, summary-card, chart, period-selector,
  accuracy-dot, mem-color, data-table styles
- All templates: added Alkalmazások nav link

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 10:46:50 +01:00
admin 3690c5028e feat(hub): asset management API with PVC storage and image seed
Add internal/assets package that manages app assets (logos, screenshots)
on Hub PVC with automatic seeding from baked-in image copy on first run.
Two new API endpoints: GET /assets/manifest (JSON with SHA-256 checksums)
and GET /assets/file/{name} for controllers to sync assets.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 15:22:45 +01:00
admin 4ec1b7d712 hub v0.3.5: Recovery endpoint + customer_blocked in report response
- New GET /api/v1/recovery/{customer_id}: returns generated controller.yaml
  and infra backup in a single response for disaster recovery.
  Auth via X-Retrieval-Password header.
- Report response now includes customer_blocked: true when customer
  status is "blocked"; controllers use this to detect the customer's standing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 12:38:57 +01:00
admin 3217cb4751 feat: Hub monitoring takeover — event system, dead man's switch, notifications (v0.3.0)
Replace external Healthchecks.io with Hub-native monitoring. New events
table + /api/v1/event endpoint for structured events from controllers.
Staleness checker (60s) detects unresponsive nodes. Backup deadline
checker (daily 05:00) catches missed backups. Notification dispatcher
sends operator (English) + customer (Hungarian) emails via Resend with
per-event cooldowns. Event timeline on customer page, dashboard badges.
Config form deprecates Monitoring UUIDs section.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 18:53:24 +01:00
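The per-event cooldowns mentioned in commit 3217cb4751 can be pictured as a "last sent" timestamp per key. This is a sketch under assumptions: the `cooldowns` type, the key format, and the 30-minute window are hypothetical, and the commit does not say how the dispatcher stores its state.

```go
package main

import (
	"fmt"
	"time"
)

// cooldowns suppresses repeat notifications for the same key inside window.
type cooldowns struct {
	window time.Duration
	last   map[string]time.Time
}

// Allow reports whether a notification for key may be sent at now, and
// records the send time when it is allowed.
func (c *cooldowns) Allow(key string, now time.Time) bool {
	if t, ok := c.last[key]; ok && now.Sub(t) < c.window {
		return false // still cooling down; do not reset the timer
	}
	c.last[key] = now
	return true
}

// countSent replays arrivals (minutes since start) for a single key and
// returns how many would actually be sent.
func countSent(windowMin int, arrivals []int) int {
	base := time.Date(2026, 1, 1, 0, 0, 0, 0, time.UTC)
	c := &cooldowns{
		window: time.Duration(windowMin) * time.Minute,
		last:   map[string]time.Time{},
	}
	sent := 0
	for _, m := range arrivals {
		// Hypothetical key: customer ID plus event type.
		if c.Allow("cust-1/node_stale", base.Add(time.Duration(m)*time.Minute)) {
			sent++
		}
	}
	return sent
}

func main() {
	// prints: 2
	fmt.Println(countSent(30, []int{0, 10, 29, 30, 45}))
}
```

Suppressed events deliberately leave the timestamp untouched, so a steady stream of duplicates still produces one notification per window rather than none.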
admin 42e0617a6c hub: unified customer page, blocked status, dashboard merge
- Replace separate config detail and report detail pages with unified
  /customers/{id} page showing both config info and live report data
- Add "blocked" status for customers (hidden from dashboard, notifications
  suppressed, still accepts reports)
- Dashboard now shows config-only customers as "PENDING" status
- Customers list: all rows link to /customers/{id}, show BLOCKED badge
- New actions: block/unblock, push config to controller, auto-create
  config from report data
- /configs/{id} now redirects to /customers/{id}
- Add config-badge CSS classes for MANAGED/MANUAL/BLOCKED badges

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 15:57:39 +01:00
admin 4c8bf63ce3 feat: customer config management — CRUD, API retrieval, per-customer auth (v0.2.0)
New "Configurations" section lets operators pre-configure customer settings
in the Hub, then docker-setup.sh can download a ready-made controller.yaml
using just a customer ID and retrieval password.

- Store: customer_configs table with CRUD + per-customer API key lookup
- API: GET /api/v1/config/{id} with X-Retrieval-Password auth
- Auth: per-customer API keys alongside existing global key (backward compatible)
- Web UI: /configs list, create, edit, delete, YAML preview, copy-to-clipboard
- YAML gen: deep-merge controller.yaml.example template with customer overrides
- Template fetcher: background goroutine refreshing template from Gitea repo
- Navigation: Dashboard / Configurations tabs on all pages

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 13:36:32 +01:00
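The "deep-merge controller.yaml.example template with customer overrides" step in commit 4c8bf63ce3 can be sketched on already-decoded maps. YAML parsing is left out here, and `deepMerge` is a hypothetical helper rather than the hub's function; the keys in the example are made up.

```go
package main

import "fmt"

// deepMerge returns a new map containing base with override applied on top.
// When both sides hold a nested map under the same key, the maps are merged
// recursively; otherwise the override value replaces the base value.
// Nested maps that are not overridden are shared with base, not copied.
func deepMerge(base, override map[string]any) map[string]any {
	out := make(map[string]any, len(base))
	for k, v := range base {
		out[k] = v
	}
	for k, ov := range override {
		bm, baseIsMap := out[k].(map[string]any)
		om, overrideIsMap := ov.(map[string]any)
		if baseIsMap && overrideIsMap {
			out[k] = deepMerge(bm, om)
		} else {
			out[k] = ov
		}
	}
	return out
}

func main() {
	template := map[string]any{
		"hub":    map[string]any{"url": "https://hub.example", "interval": 60},
		"backup": map[string]any{"enabled": true},
	}
	customer := map[string]any{
		"hub": map[string]any{"interval": 30},
	}
	merged := deepMerge(template, customer)
	// hub.url survives from the template; hub.interval comes from the override.
	fmt.Println(merged["hub"], merged["backup"])
}
```

A shallow merge would replace the whole `hub` section and drop `url`, which is why template-plus-override schemes generally need the recursive form.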
admin 36a7d1c162 feat: add controller update trigger + version checker (v0.1.8)
Hub now tracks controller_url from reports, periodically checks the Gitea
registry for the latest controller image version, and shows a "Trigger Update"
button on the customer detail page that proxies to the controller's self-update
API endpoint using the shared API key.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 18:16:38 +01:00
admin 41e313bf36 hub v0.1.7: Infrastructure backup endpoints for disaster recovery
Add infra-backup push/pull API for controller DR:
- POST /api/v1/infra-backup — controller pushes infrastructure snapshot
- GET /api/v1/infra-backup/{customer_id} — fresh controller pulls backup
- infra_backups SQLite table with per-customer snapshots
- Customer detail page shows infra backup status card
- README.md with full API docs and DR flow

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 13:17:12 +01:00
admin bd669e7a9d Hub: add preferences sync endpoint + notification display on customer page
- POST /api/v1/preferences: accepts {customer_id, email, enabled_events} from controller
- GetRecentNotifications() store method for last N notification log entries
- Customer detail page: new Notifications section (email, events, recent log table)
- joinStrings template function for event list display

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 20:18:10 +01:00
admin e531516cfa Hub: add POST /api/v1/notify endpoint for customer notifications
- New notification relay endpoint: receives events from customer controllers,
  looks up customer email preferences, sends via Resend HTTP API
- New tables: customer_notifications (per-customer email + event prefs),
  notification_log (audit trail for all notification attempts)
- Hungarian email template with severity, event type, timestamp
- Config: notifications.resend_api_key + notifications.from_email
- Test events always pass event-type filter

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 19:29:55 +01:00
admin 77b5a4ce4e Add felhom-hub: multi-customer dashboard service
- Hub service receives reports from customer controllers
- SQLite store with 90-day retention and auto-prune
- REST API: POST /api/v1/report, GET /api/v1/customers
- Dark theme dashboard with status overview table
- Customer detail page with system, storage, containers, backup, health
- Bearer token auth for report ingest, bcrypt auth for dashboard
- K8s manifest for felhom-system namespace (Deployment, Service, Ingress, PVC)
- Dockerfile with multi-stage build

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 13:19:25 +01:00