353 Commits

Author SHA1 Message Date
admin f72f2c7ccb docs: golden rebaked to controller 0.45.0 (anon pull; archive produced, build guest purged)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 09:54:47 +02:00
admin fa60ba50c0 docs(v0.45.0): REPORT + CONTEXT + README for storage UX polish; live-validated on guest 9201
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 09:50:27 +02:00
admin 9ed844fd0b controller v0.45.0: storage UX polish — deterministic order, init filter, register shortcut, system-storage clarity
B1 sort /api/disks (user-data→system→backup, alpha within); B2 init wizard
excludes mounted drives; B3 Regisztrálás ("Register") primary action for mounted-unregistered
user-data drives (POST /api/storage/register); B4 per-card purpose descriptions +
app-backing tags + tiering note (local & local-lvm both kept); B5 eject already
names affected apps. Pairs with felhom-agent v0.24.0 eject role-gate.
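
The B1 ordering (user-data, then system, then backup, alphabetical within each group) could be sketched as below. This is a minimal illustration, not the controller's code: the `Disk` type, the role labels as strings, and the `SortDisks` name are all assumptions.

```go
package main

import (
	"fmt"
	"sort"
)

// Disk is a hypothetical, trimmed-down view of one /api/disks entry.
type Disk struct {
	Name string
	Role string // "user-data", "system" or "backup" (assumed labels)
}

// roleRank maps each assumed role to its position in the listing.
var roleRank = map[string]int{"user-data": 0, "system": 1, "backup": 2}

// rank returns the sort position of a role; unknown roles sort last.
func rank(role string) int {
	if r, ok := roleRank[role]; ok {
		return r
	}
	return len(roleRank)
}

// SortDisks orders disks by role rank, then alphabetically by name.
func SortDisks(disks []Disk) {
	sort.SliceStable(disks, func(i, j int) bool {
		ri, rj := rank(disks[i].Role), rank(disks[j].Role)
		if ri != rj {
			return ri < rj
		}
		return disks[i].Name < disks[j].Name
	})
}

func main() {
	disks := []Disk{
		{"local-lvm", "system"},
		{"usb-backup", "backup"},
		{"media", "user-data"},
		{"local", "system"},
		{"archive", "user-data"},
	}
	SortDisks(disks)
	for _, d := range disks {
		fmt.Println(d.Role, d.Name)
	}
}
```

Because the comparison depends only on role and name, the same input always yields the same order, which is the deterministic property the commit is after.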

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 09:35:31 +02:00
admin 12064dcd88 v0.44.0: role-aware drive management — protected lockout + customer type-to-confirm wipe + drive-list restyle
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 21:44:50 +02:00
admin 2c32c821fe docs(v0.43.0): REPORT (storage mgmt rebuild) + README agent-delegated storage note
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 20:01:00 +02:00
admin 29a9dcdd8c v0.43.0: rebuilt storage management (guided init/attach/eject on agent disk model)
Controller-only UI/orchestration over the agent's disk endpoints + StoragePath
registry. New: storage overview (data_bearing badges), guided init (format ->
resolve fs UUID -> assign -> register; data-bearing REFUSAL surfaces the
felhom-opsign command, no force-format), guided attach, eject (+deregister,
dependent-guest warning). agentapi: DiskInfo.DurableID/FSUUID + FormatResult.
PendingOp (parsed from the 403). Honest buttons (migrate disabled, no 404s).
Phase 3: removed dead CrossDrive blocks in deploy.html/backups.html.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 19:47:58 +02:00
admin 8fcd49304d docs(v0.42.1): REPORT (real wildcard cert) + README controller-route/wildcard-anchor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 18:30:42 +02:00
admin e61e7dd8fc v0.42.1: wildcard cert via controller route (entrypoint domains don't issue)
Empirically (staging on 9201): traefik v3 issues a cert from a router-level
tls.domains but NOT from the entrypoint http.tls.domains. So the wildcard moves
to RenderControllerRoute (the always-present anchor): when DNS-01 ACME is
configured it carries tls.certResolver+domains *.<domain>+apex, and every other
router serves that wildcard by SNI (no per-app labels). Reverts v0.42.0's dead
entrypoint-domains + TraefikData.Domain.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 18:04:39 +02:00
admin 84c3e84641 v0.42.0: real Let's Encrypt cert via wildcard proactive issuance
traefik's websecure entrypoint now declares http.tls.domains *.<domain>+apex so
it proactively obtains the wildcard via Cloudflare DNS-01 at startup (cert ready
before first client, every router serves it by SNI). Gated on CFAPIToken (DNS-01).
TraefikData gains Domain; ensureTraefik wires cfg.Customer.Domain.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 17:48:15 +02:00
admin 80216e6ce5 docs: REPORT update for v0.41.1/0.41.2 controller routing + dashboard fix
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 15:52:41 +02:00
admin 2bed7cee2a v0.41.2: fix controller-route auto-connect + dead dashboard cross-drive block
containerOnNetwork misread the absent-key '<nil>' as "already attached", so
wireController skipped docker network connect -> traefik 502'd felhom.<domain>.
Now lists network names and matches exactly. Also removed dashboard.html's dead
CrossDrive* block (slice-8C leftover) that 500'd the dashboard via gt <nil> 0,
exposed once v0.41.1 made the dashboard reachable.
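
The exact-match fix can be illustrated with a small sketch. It assumes the attached networks arrive as a whitespace-separated list of names; how the controller actually obtains that list is not shown in this log, and `onNetwork` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// onNetwork reports whether target appears in a whitespace-separated list of
// network names, comparing each name exactly.
func onNetwork(listing, target string) bool {
	for _, name := range strings.Fields(listing) {
		if name == target {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(onNetwork("bridge traefik-public", "traefik-public")) // true
	fmt.Println(onNetwork("bridge", "traefik-public"))                // false
	fmt.Println(onNetwork("<nil>", "traefik-public"))                 // false
	fmt.Println(onNetwork("traefik-public-old", "traefik-public"))    // false
}
```

A check that only tests for non-empty output would accept the `<nil>` placeholder; comparing names one by one rejects both it and near-miss names.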

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 15:48:50 +02:00
admin 91736eb015 v0.41.1: wire the controller dashboard into traefik (felhom.<domain> routing)
EnsureBaseStack now writes a traefik file-provider route
(Host(felhom.<domain>) -> http://felhom-controller:8080) and joins the
controller to traefik-public. Done post-pull (domain known) and idempotently
(write-if-changed + skip-if-connected), so felhom.<domain> reaches the
controller. Completes the v0.41.0 base-infra bring-up.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 15:40:43 +02:00
admin f1780100ee docs(v0.41.0): README base-infra bring-up section + REPORT (live-validated)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 15:17:09 +02:00
admin abbd9488c6 v0.41.0: first-boot base-infra bring-up + self-heal (+ Section-G mount fix)
New internal/infra package renders traefik/cloudflared/filebrowser from config
(pinned images, single source of truth; web filebrowser path delegates here).
stacks.EnsureBaseStack deploys the traefik-public network + the three stacks,
single-flight + idempotent + non-fatal; wired to first boot and every health
tick. monitor.EffectiveProtected drops cloudflared when no tunnel token.
Section-G fix lives in felhom-agent build-golden.sh (same-path stacks bind).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 14:56:42 +02:00
admin ba0e1eb04a document
2026-06-11 14:02:47 +02:00
admin 57b8f56c52 REPORT: v0.40.0 bootstrap pull+merge — live-validated on demo (guest 9201, ONLINE v0.40.0)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 13:37:10 +02:00
admin 6a594f9ec2 v0.40.0: bootstrap pull+merge onboarding (controller pulls config from hub)
Fix the onboarding 401: instead of seeding controller.yaml from the agent's
HOST hub key (which the hub's customer-scoped /api/v1/report rejects), the
controller now PULLS its full controller.yaml from the hub on first boot using
the bootstrap's retrieval passphrase (yielding the customer-scoped key) and
MERGES in the per-guest local_api block.

- internal/bootstrap: contract v1->v2 (customer.id + hub.url +
  hub.retrieval_password + local_api; drop host key/identity). MaybeIngest gains
  an injected PullFunc (keeps bootstrap free of the heavy report package),
  pulls with bounded transient-only retry, merges local_api at YAML-map level
  (preserves all hub-emitted fields), idempotent + fail-safe + never-crash.
- main.go: wire report.PullConfig as the pull adapter (maps ErrHubUnreachable
  -> ErrPullTransient; auth/not-found permanent).
- Lockstep with felhom-agent v0.19.0.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 13:22:37 +02:00
admin b76d8b298c REPORT: record pushed commit hash for the v0.39.1 cleanup + demo validation
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 12:24:36 +02:00
admin 6e77bea4d3 v0.39.1: 8C orphan-template cleanup (delete 5 dead templates)
Remove five orphaned HTML templates left behind when slice 8C retired the
disk/storage/restore web handlers (storage_handlers.go, handler_restore.go and
the /api/storage/* + /api/restore/* routes): storage_init, storage_attach,
migrate, migrate_drive, restore. Zero .go references, zero cross-template
references, no route, no nav entry; embed is a glob so deletion is safe (14
templates remain, build + tests green). No behaviour change; the deleted pages
were already unreachable.

Also ships the live demo validation (v0.39.0) writeup in REPORT.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 12:24:13 +02:00
admin d8d1e17758 slice 9: host-health view on the monitoring page (v0.39.0)
Add agentapi HostMetrics() + a thin /api/host-metrics proxy to the agent's
new GET /host/metrics, and a 'Szerver allapota (gazdagep)' ("Server status (host)") card on the
monitoring page rendering host CPU%/load/mem/CPU-temp(n/a)/uptime + per-
storage capacity bars (thin-pool fill, disk temp/wear). Polls every 8s.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 16:16:15 +02:00
admin 4c9065381b REPORT: slice 8B.2 controller half (resume at snapshotted, v0.38.0)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 15:02:30 +02:00
admin e4b69ac9e5 slice 8B.2 (controller): resume app at snapshotted, keep tracking to done (v0.38.0)
Quiesce loop resumes (StartStack + clear marker) at the snapshotted phase
instead of at done, so downtime shrinks from the whole backup to only the time
until the snapshot, with no consistency loss.
Keeps polling to done/failed (no overlapping backup; post-snapshot failure
observed). Stop-mode fallback to done + crash-safety preserved.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 14:54:19 +02:00
admin 6ac7167dfd REPORT: slice 8C controller half (de-privileging + disk mgmt via agent, v0.37.0)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 14:07:24 +02:00
admin 6d267b3e4d slice 8C C.3: de-privilege the controller container (legacy docker-setup template) + CHANGELOG (v0.37.0)
Dropped privileged:true + /mnt rshared + /sys + /dev + /etc/fstab + /run/udev
from the bare-metal compose template (controller no longer does disk ops). The
golden bootstrap run was already minimal (8A). Slice 8 CLOSED on the controller.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 13:59:26 +02:00
admin abe4e8e619 slice 8C Phase B.2 + C.1/C.2: retire disk subsystem + rewire disk mgmt to agent
Retired (~12.3k LOC): internal/storage/* (scan/format/attach/migrate/safety),
backup restic/crossdrive/restore_drives/disk_layout/local_infra/restore_scan/
paths + restore_app, report/infra_backup*/infra_pull, setup/scanner,
monitor/watchdog+pinger, web/storage_handlers+handler_restore. Surgically split
backup.Manager to app-data only (DB dumps + volume tars + app restore; dropped
restic + cross-drive + snapshot history). Fixed router/main/web wiring.
Added agent-backed disk API (web/agent_disk_handlers.go): /api/disks list/
assign/eject/format proxying agentapi; data-bearing format refusal -> HTTP 409
'operator authorization required'. report/config_pull.go keeps the setup
fresh-install config download. go build + go test green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 13:57:27 +02:00
admin 0294513906 slice 8C Phase B.1: agentapi disk client (Disks/AssignDisk/EjectDisk/FormatDisk)
ErrFormatRefused surfaces the agent's data-bearing refusal distinctly. Tests:
list, blank format OK, data-bearing refused, eject dependents.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 13:23:00 +02:00
admin 9788ee64fa REPORT: slice 8B controller half (app-consistent backup quiesce loop, v0.36.0)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 11:04:59 +02:00
admin 68fc153d9c slice 8B (controller half): app-consistent backup quiesce loop (v0.36.0)
internal/quiesce: poll /backup/due -> quiesce (stop app stacks) -> POST /backup
-> poll /backup/status -> unquiesce (restart exactly those). Crash-safety:
persisted marker before stopping, guaranteed unquiesce (defer), max-quiesce
guard, startup Recover, single-flight. agentapi BackupDue/StartBackup/
BackupStatus; stacks.RunningAppStacks(); config QuiesceConfig; main wiring.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 10:44:52 +02:00
admin 10685b771c REPORT: slice 8A controller half (bootstrap ingestion + pinned local-API client, v0.35.0)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 10:02:25 +02:00
admin 2a0d9a1b7a slice 8A (controller half): bootstrap.json ingestion + pinned agent local-API client (v0.35.0)
internal/bootstrap: first-run bootstrap.json ingestion (decision (c)) — seed
controller.yaml + skip setup; idempotent + fail-safe. internal/agentapi:
minimal pinned local-API client (leaf-cert SHA-256 pin, fails closed). config
LocalAPIConfig; startup /storage connectivity probe.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 09:47:54 +02:00
admin 086281b582 docs: reflow CLAUDE.md; unify REPORT/CHANGELOG convention; add no-secrets rule
Reflow removes hard mid-paragraph line wraps (code blocks and tables untouched);
rendered output unchanged. Adds the uniform CHANGELOG (cumulative) / REPORT
(overwrite-latest) convention plus a no-secrets rule. Docs/meta only, no version bump.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 20:54:45 +02:00
admin 27c945d698 update
2026-06-08 20:07:56 +02:00
admin fabb881ab8 Merge pull request 'chore: rework references for repo rename deploy-felhom-compose -> felhom-controller' (#2) from chore/rename-repo-refs into main
Reviewed-on: #2
2026-06-08 11:56:47 +00:00
admin e9ca42060c chore: rework references for repo rename deploy-felhom-compose -> felhom-controller
Repo renamed on Gitea (admin/deploy-felhom-compose -> admin/felhom-controller).
Updates clone URLs, clone dirs, the customer bootstrap URL, build.sh, BUILDING.md,
README.md, CLAUDE.md, CONTEXT.md and TASK.md to the new name. No functional change:
Go module path and Docker image path (both already 'felhom-controller') untouched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 13:38:27 +02:00
admin a54ec5a598 Merge pull request 'refactor: extract app-data-backup into internal/appbackup (no behaviour change)' (#1) from refactor/extract-appbackup into main
Reviewed-on: admin/deploy-felhom-compose#1
2026-06-08 10:56:51 +00:00
admin a4de90def3 refactor: extract app-data-backup into internal/appbackup (no behaviour change)
Extract the stateless, keep-side app-data backup primitives out of
internal/backup/ into a new self-contained internal/appbackup/ package:
- dbdump.go: DB dump discovery/execution (DiscoverDatabases, DumpOne, ...)
- appdata.go: StackDataProvider + app-data/volume discovery, HumanizeBytes
- paths.go: keep-side path helpers (AppDBDumpPath, AppVolumeDumpPath, AppDataDir)

backup/ keeps every name available via type/const aliases + one-line function
forwarders (appbackup_bridge.go), so the still-present delete-side code
(restic, cross-drive, drive-mount) and the both-side consumers (web/api/report)
compile unchanged. The keep-only consumers appexport and storage are rewired to
import appbackup directly and no longer import backup.

This is the Part-2 prerequisite for the Proxmox port: appbackup has zero
references to restic/cross-drive/drive-mount and does not import backup, so the
delete-side can later be removed without breaking app-data backup or appexport.

Behaviour-preserving: pure move + import/qualifier rewrites, no logic edits.
The four Manager methods (RunDBDumps/DumpAppVolumes/DumpAppVolumesSafe share the
delete-side mutex/status state; RestoreAppFromTier2 reads the cross-drive mirror)
intentionally stay on Manager and delegate to appbackup — for the re-platform step.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 11:01:39 +02:00
admin fb11c3b75a feat: backup safety — stop-before-dump, streaming restore, health check, per-app restic, infra configs (v0.34.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 08:56:48 +01:00
admin 783830a9d4 fix: add HasVolumeData to AppBackupRow for template rendering
The backups page template references .HasVolumeData on the status table
rows but the AppBackupRow struct was missing this field, causing a
template error.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 21:48:22 +01:00
admin c929948f27 feat: Docker volume backup, Tier 2 restore, restore dropdown fixes (v0.33.0)
- Add Docker named volume backup to Tier 1 (dump to tar, include in restic)
  and Tier 2 (copy tars to rsync mirror _volumes/ dir)
- Fix volume name resolution: use project-prefixed names (mealie_mealie_data)
- Fix double Tier 1 in restore dropdown: filter snapshots by app's home drive
- Add Tier 2 restore: RestoreAppFromTier2() restores from rsync mirror
- Show Tier 2 entry in restore dropdown when cross-drive backup succeeded
- Add .fab import link in restore section
- Volume-aware restore type banners and backup content labels

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 21:43:02 +01:00
admin 5bf13ca19d move optional config from app info page to deploy/settings page
Users couldn't find metadata provider fields (IGDB, ScreenScraper, etc.)
on the app info page. Move them to the deploy page where all other
settings (integrations, geo-restriction) already live.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 20:04:51 +01:00
admin 54390c456c move optional config from app info page to deploy/settings page
Users couldn't find metadata provider fields (IGDB, ScreenScraper, etc.)
on the app info page. Move them to the deploy page where all other
settings (integrations, geo-restriction) already live.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 20:04:28 +01:00
admin 36afd828a1 fix: FileBrowser reads stale config on fresh deployments
The gtstef/filebrowser image bakes FILEBROWSER_CONFIG=/home/filebrowser/data/config.yaml,
but controller mounts config at /home/filebrowser/config.yaml. Override the env var in both
generateFileBrowserCompose() and docker-setup.sh so FileBrowser reads the controller-managed
config with proper sources and database path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 18:51:59 +01:00
admin b4bda38fa1 feat: format empty partitions on system disk (v0.32.6)
Detect and offer to format empty (no filesystem) partitions on the system
disk. Adds IsSystemPartition() for granular per-partition safety checks
instead of blocking the entire system disk. Init wizard shows formattable
partitions with appropriate warnings. Add felhotest demo node to docs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 16:54:16 +01:00
admin 2c0064ac87 updated CF tunnel config
2026-02-27 16:29:00 +01:00
admin 9b13c0e21c feat: Tier2 backup pauses when destination drive is inactive (Inaktív)
Deactivated drives (Schedulable=false) now treated like disconnected for
Tier2 backups. New IsStoragePathSchedulable() checks active+connected+not
decommissioned. UI shows yellow "Cél meghajtó inaktív" ("Target drive inactive") badge, scheduler
skips silently with WARN log.
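
The combined check (active, connected, not decommissioned) might reduce to something like the sketch below. `Schedulable` and `Disconnected` appear in these commit messages; the `Decommissioned` field name, the `StoragePath` layout, and the free-function form are assumptions.

```go
package main

import "fmt"

// StoragePath is a hypothetical subset of a storage registry entry.
type StoragePath struct {
	Path           string
	Schedulable    bool // false when the drive is deactivated
	Disconnected   bool
	Decommissioned bool // assumed field name
}

// isSchedulable reports whether path is registered and usable as a backup target.
func isSchedulable(paths []StoragePath, path string) bool {
	for _, p := range paths {
		if p.Path == path {
			return p.Schedulable && !p.Disconnected && !p.Decommissioned
		}
	}
	return false // unknown paths are never schedulable
}

func main() {
	paths := []StoragePath{
		{Path: "/mnt/a", Schedulable: true},
		{Path: "/mnt/b", Schedulable: false},
		{Path: "/mnt/c", Schedulable: true, Disconnected: true},
	}
	for _, p := range []string{"/mnt/a", "/mnt/b", "/mnt/c", "/mnt/gone"} {
		fmt.Println(p, isSchedulable(paths, p))
	}
}
```

Returning false for an unregistered path also covers drives removed entirely from the registry, so the scheduler skips them the same way.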

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 10:59:56 +01:00
admin 4fd907a09e fix: Tier2 backup status now detects drives removed from storage (not just disconnected)
Previously, removing a storage drive from the controller only marked it as
disconnected if the StoragePath entry still existed with Disconnected:true.
Drives removed entirely from storage_paths were invisible to the check,
causing Tier2 backup UI to show green "Sikeres" ("Successful") and scheduler to attempt
backups to a no-longer-managed destination.

New IsStoragePathKnown() method covers both cases. UI shows yellow
"Cél meghajtó leválasztva" ("Target drive disconnected") and scheduler skips silently.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 10:48:00 +01:00
admin dd79918234 docs: update CHANGELOG and README for v0.32.5
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 10:02:03 +01:00
admin f19c6fb0c9 fix: USB badge detection for bind-mounted drives + graceful Tier2 backup on disconnected destinations
- IsUSBDevice/diskModel: strip findmnt bind-mount suffix [/subdir] before
  parsing device path (fixes USB badge not showing for attach-wizard drives)
- crossdrive.go: skip disconnected src/dest drives with WARN log instead of
  returning error (prevents noisy error status in settings.json)
- handlers.go: detect Tier2 destination disconnection, set yellow status dot
  instead of red, skip ValidateDestination for disconnected paths
- backups.html: new template branch showing "Cél meghajtó leválasztva" ("Target drive disconnected") badge
  with grayed-out info and hidden "Futtatás most" ("Run now") button
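
The first item's suffix stripping can be sketched as follows. For bind mounts, findmnt reports the source as the device followed by the bound subdirectory in square brackets; `stripBindSuffix` is a hypothetical name, not the helper used in `IsUSBDevice`/`diskModel`.

```go
package main

import (
	"fmt"
	"strings"
)

// stripBindSuffix removes a trailing "[/subdir]" that findmnt appends to the
// source of a bind mount, leaving only the device path.
func stripBindSuffix(source string) string {
	if i := strings.IndexByte(source, '['); i >= 0 && strings.HasSuffix(source, "]") {
		return source[:i]
	}
	return source
}

func main() {
	fmt.Println(stripBindSuffix("/dev/sdb1[/subdir]")) // /dev/sdb1
	fmt.Println(stripBindSuffix("/dev/sdb1"))          // /dev/sdb1
}
```

With the suffix gone, the remaining string is a plain device path that downstream parsing can handle the same way as a directly mounted drive.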

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 09:59:29 +01:00
admin 1155a0522b docs: update CHANGELOG and README for v0.32.4
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 09:23:16 +01:00
admin 62d26be8ae feat: include controller in app telemetry reports
Add the felhom-controller container as a special entry in the
app_telemetry array sent to the hub. This reuses all existing hub
infrastructure (storage, aggregation, UI) with zero hub-side changes.

The controller's memory/CPU metrics and log warnings/errors are now
collected alongside app telemetry, giving the hub visibility into
controller health, memory trends, and known issues.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 09:19:27 +01:00