Fix startup hub report — Push() silently swallows errors (v0.15.5)
@@ -1,422 +1,246 @@
|
||||
# TASK: Fix hub report (v0.15.4)
|
||||
# TASK: Fix startup hub report — Push() silently swallows errors (v0.15.5)
|
||||
|
||||
Three areas of work:
|
||||
## Problem
|
||||
|
||||
1. **Hub storage report** — already fixed in `builder.go`/`types.go` (applied before this task).
|
||||
2. **Hub reporting lifecycle** — when hub reporting is disabled on a controller, the hub should know about it. Currently the hub just shows "DOWN" after reports stop arriving, with no distinction between "node crashed" vs "reporting turned off". The controller should send a one-time "disabled" notification so the hub can display the correct status. Also: hub report history should show dates (not just times), and hub should use storage labels.
|
||||
The startup hub report exists but silently fails. On the latest deployment, the controller tried to push a report 5 seconds after boot, but the hub returned HTTP 503 (it was still starting up). `Push()` always returns `nil` by design, so `main.go` logged `[INFO] Startup hub report sent` even though the push actually failed. The hub shows stale data until the first scheduled report fires (15 minutes later).
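To make the failure mode concrete, here is a minimal sketch that reproduces it against a stub server answering HTTP 503. The names `pushSwallow` and `startupMessage` are hypothetical stand-ins, not the controller's code; only the pattern (log internally, always return `nil`) comes from the description above.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// pushSwallow mimics the old behaviour: it logs failures itself and
// always returns nil, so the caller cannot tell success from failure.
// (Hypothetical helper, for illustration only.)
func pushSwallow(url string) error {
	resp, err := http.Post(url, "application/json", nil)
	if err != nil {
		fmt.Println("[WARN] push failed:", err)
		return nil
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode >= 300 {
		fmt.Printf("[WARN] push failed: HTTP %d\n", resp.StatusCode)
	}
	return nil
}

// startupMessage is what a caller that trusts the returned error logs.
func startupMessage(err error) string {
	if err != nil {
		return "[WARN] Startup hub report failed: " + err.Error()
	}
	return "[INFO] Startup hub report sent"
}

func main() {
	// Stub hub that is still starting up and rejects every request.
	hub := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusServiceUnavailable)
	}))
	defer hub.Close()

	// The warning is printed by pushSwallow, yet the caller reports success.
	fmt.Println(startupMessage(pushSwallow(hub.URL)))
}
```

Running this prints the 503 warning followed by the "sent" message, the same contradictory pair seen in the controller log.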
|
||||
|
||||
---
|
||||
|
||||
## Fix 1: Hub storage report — include all storage paths (ALREADY DONE)
|
||||
|
||||
The following changes have ALREADY been applied:
|
||||
|
||||
- **`controller/internal/report/types.go`** — `Label` field added to `StorageReport`
|
||||
- **`controller/internal/report/builder.go`** — Storage section now iterates over all `storagePaths` using `system.GetDiskUsage()`
|
||||
|
||||
These were applied before this task. No further changes needed in these files.
|
||||
|
||||
---
|
||||
|
||||
## Fix 2: Controller sends "disabled" notification when hub reporting is off
|
||||
|
||||
### Problem
|
||||
|
||||
When `hub.enabled: false` but `hub.url` and `hub.api_key` ARE configured, the controller sends nothing to the hub. The hub has no way to know whether the node is dead or reporting was intentionally turned off. It just shows "DOWN" after the stale threshold (30min+).
|
||||
|
||||
### Design
|
||||
|
||||
The intent behind the config values:
|
||||
- `hub.url` + `hub.api_key` configured → a relationship with the hub exists
|
||||
- `hub.enabled: false` → periodic reporting is turned off (but the hub relationship still exists)
|
||||
|
||||
So: when `hub.enabled == false` but URL + API key are present, the controller should send **one** "reporting disabled" notification on startup.
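The three config values map onto three startup behaviours. The sketch below spells that mapping out; `hubMode` and `decideHubMode` are hypothetical names introduced for illustration and do not exist in the controller.

```go
package main

import "fmt"

// hubMode describes what the controller does with the hub at startup.
type hubMode int

const (
	hubNone     hubMode = iota // no hub relationship configured: send nothing
	hubDisabled                // relationship exists, reporting off: notify once
	hubEnabled                 // periodic reporting on
)

// decideHubMode maps hub.enabled, hub.url and hub.api_key onto a mode.
func decideHubMode(enabled bool, url, apiKey string) hubMode {
	if url == "" || apiKey == "" {
		return hubNone
	}
	if enabled {
		return hubEnabled
	}
	return hubDisabled
}

func main() {
	fmt.Println(decideHubMode(false, "", "") == hubNone)
	fmt.Println(decideHubMode(false, "https://hub.felhom.eu", "key") == hubDisabled)
	fmt.Println(decideHubMode(true, "https://hub.felhom.eu", "key") == hubEnabled)
}
```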
|
||||
|
||||
### Controller changes
|
||||
|
||||
#### Step 1: Add `ReportingDisabled` field to `Report`
|
||||
|
||||
**File:** `controller/internal/report/types.go`
|
||||
|
||||
Add a new field to the `Report` struct:
|
||||
|
||||
```go
|
||||
type Report struct {
|
||||
Version int `json:"version"`
|
||||
CustomerID string `json:"customer_id"`
|
||||
CustomerName string `json:"customer_name"`
|
||||
ControllerVersion string `json:"controller_version"`
|
||||
Timestamp time.Time `json:"timestamp"`
|
||||
ReportingDisabled bool `json:"reporting_disabled,omitempty"`
|
||||
System SystemReport `json:"system"`
|
||||
Storage []StorageReport `json:"storage"`
|
||||
Containers ContainerReport `json:"containers"`
|
||||
Backup BackupReport `json:"backup"`
|
||||
Health HealthReport `json:"health"`
|
||||
Stacks StacksReport `json:"stacks"`
|
||||
}
|
||||
Evidence from logs:
|
||||
```
|
||||
09:46:47 [INFO] Hub reporting enabled (every 15m0s to https://hub.felhom.eu)
|
||||
09:47:02 [WARN] Hub report push failed after 3 attempts: HTTP 503 ← Push() logged this internally
|
||||
09:47:02 [INFO] Startup hub report sent ← main.go logged "sent" because Push() returned nil
|
||||
```
|
||||
|
||||
#### Step 2: Add `PushOnce()` method to Pusher
|
||||
The hub pod only became ready at 09:47:02 — the same second Push() gave up.
|
||||
|
||||
## Root cause
|
||||
|
||||
`Push()` in `pusher.go` (line 39-86) has comment: "Never returns error to caller — push failures should not affect controller operation." It always returns `nil`. The startup code in `main.go` checks `err` from `Push()` but it's always nil, so it always takes the success branch.
|
||||
|
||||
The scheduler (`scheduler.go:223`) already handles errors from `JobFunc` gracefully — it logs the error and continues. So returning real errors from `Push()` is safe for scheduled calls too.
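As a rough illustration of why returning errors is harmless for scheduled calls, the sketch below shows a job runner that logs a failing job and carries on with the next one. `runJob` and `demoRun` are hypothetical; the real logic lives in `scheduler.go` and may differ in detail.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// JobFunc has the same shape as the scheduled hub-report callback.
type JobFunc func(ctx context.Context) error

// runJob is a hypothetical stand-in for the scheduler's job execution:
// an error is turned into a log line and never propagates further.
func runJob(ctx context.Context, name string, fn JobFunc) string {
	if err := fn(ctx); err != nil {
		return fmt.Sprintf("[WARN] job %s failed: %v", name, err)
	}
	return fmt.Sprintf("[INFO] job %s ok", name)
}

// demoRun executes one failing and one succeeding job in sequence.
func demoRun() []string {
	ctx := context.Background()
	jobs := []JobFunc{
		func(context.Context) error { return errors.New("HTTP 503") },
		func(context.Context) error { return nil },
	}
	var lines []string
	for _, job := range jobs {
		lines = append(lines, runJob(ctx, "hub-report", job))
	}
	return lines
}

func main() {
	for _, line := range demoRun() {
		fmt.Println(line)
	}
}
```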
|
||||
|
||||
## Fix
|
||||
|
||||
### Step 1: Make `Push()` return actual errors
|
||||
|
||||
**File:** `controller/internal/report/pusher.go`
|
||||
|
||||
The existing `Push()` method checks `p.enabled` and silently returns if false. We need a method that pushes regardless of the `enabled` flag, for the one-time disabled notification.
|
||||
|
||||
Add after `Push()`:
|
||||
Change `Push()` to return the real error instead of always `nil`:
|
||||
|
||||
**Current** (line 38-86):
|
||||
```go
|
||||
// PushOnce sends a single report regardless of the enabled flag.
|
||||
// Used for one-time notifications (e.g., reporting-disabled on startup).
|
||||
func (p *Pusher) PushOnce(report *Report) error {
|
||||
if p.hubURL == "" || p.apiKey == "" {
|
||||
return nil
|
||||
}
|
||||
// Push sends a report to the hub. Retries 3 times with 5s backoff.
|
||||
// Never returns error to caller — push failures should not affect controller operation.
|
||||
func (p *Pusher) Push(report *Report) error {
|
||||
if !p.enabled {
|
||||
return nil
|
||||
}
|
||||
|
||||
data, err := json.Marshal(report)
|
||||
if err != nil {
|
||||
p.logger.Printf("[WARN] Hub report marshal failed: %v", err)
|
||||
return nil
|
||||
}
|
||||
data, err := json.Marshal(report)
|
||||
if err != nil {
|
||||
p.logger.Printf("[WARN] Hub report marshal failed: %v", err)
|
||||
return nil
|
||||
}
|
||||
|
||||
url := p.hubURL + "/api/v1/report"
|
||||
url := p.hubURL + "/api/v1/report"
|
||||
|
||||
req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(data))
|
||||
if err != nil {
|
||||
return nil
|
||||
}
|
||||
req.Header.Set("Content-Type", "application/json")
|
||||
req.Header.Set("Authorization", "Bearer "+p.apiKey)
|
||||
var lastErr error
|
||||
for attempt := 0; attempt < 3; attempt++ {
|
||||
if attempt > 0 {
|
||||
time.Sleep(5 * time.Second)
|
||||
}
|
||||
|
||||
resp, err := p.httpClient.Do(req)
|
||||
if err != nil {
|
||||
p.logger.Printf("[WARN] Hub disabled-notification failed: %v", err)
|
||||
return nil
|
||||
}
|
||||
io.Copy(io.Discard, resp.Body)
|
||||
resp.Body.Close()
|
||||
req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(data))
|
||||
if err != nil {
|
||||
lastErr = err
|
||||
continue
|
||||
}
|
||||
req.Header.Set("Content-Type", "application/json")
|
||||
if p.apiKey != "" {
|
||||
req.Header.Set("Authorization", "Bearer "+p.apiKey)
|
||||
}
|
||||
|
||||
if resp.StatusCode >= 200 && resp.StatusCode < 300 {
|
||||
p.logger.Printf("[INFO] Hub disabled-notification sent (%d bytes)", len(data))
|
||||
}
|
||||
return nil
|
||||
resp, err := p.httpClient.Do(req)
|
||||
if err != nil {
|
||||
lastErr = err
|
||||
continue
|
||||
}
|
||||
io.Copy(io.Discard, resp.Body)
|
||||
resp.Body.Close()
|
||||
|
||||
if resp.StatusCode >= 200 && resp.StatusCode < 300 {
|
||||
p.logger.Printf("[INFO] Hub report pushed successfully (%d bytes)", len(data))
|
||||
return nil
|
||||
}
|
||||
lastErr = fmt.Errorf("HTTP %d", resp.StatusCode)
|
||||
}
|
||||
|
||||
p.logger.Printf("[WARN] Hub report push failed after 3 attempts: %v", lastErr)
|
||||
return nil
|
||||
}
|
||||
```
|
||||
|
||||
#### Step 3: Send disabled notification on startup
|
||||
|
||||
**File:** `controller/cmd/controller/main.go` (~lines 248-261, 285-293)
|
||||
|
||||
Change the hub reporting section. Currently:
|
||||
|
||||
**Replace with:**
|
||||
```go
|
||||
// --- Central hub reporting ---
|
||||
var hubPusher *report.Pusher
|
||||
if cfg.Hub.Enabled && cfg.Hub.URL != "" {
|
||||
// ... create pusher, register scheduler ...
|
||||
// Push sends a report to the hub. Retries 3 times with 5s backoff.
|
||||
func (p *Pusher) Push(report *Report) error {
|
||||
if !p.enabled {
|
||||
return nil
|
||||
}
|
||||
|
||||
data, err := json.Marshal(report)
|
||||
if err != nil {
|
||||
return fmt.Errorf("marshal report: %w", err)
|
||||
}
|
||||
|
||||
url := p.hubURL + "/api/v1/report"
|
||||
|
||||
var lastErr error
|
||||
for attempt := 0; attempt < 3; attempt++ {
|
||||
if attempt > 0 {
|
||||
time.Sleep(5 * time.Second)
|
||||
}
|
||||
|
||||
req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(data))
|
||||
if err != nil {
|
||||
lastErr = err
|
||||
continue
|
||||
}
|
||||
req.Header.Set("Content-Type", "application/json")
|
||||
if p.apiKey != "" {
|
||||
req.Header.Set("Authorization", "Bearer "+p.apiKey)
|
||||
}
|
||||
|
||||
resp, err := p.httpClient.Do(req)
|
||||
if err != nil {
|
||||
lastErr = err
|
||||
continue
|
||||
}
|
||||
io.Copy(io.Discard, resp.Body)
|
||||
resp.Body.Close()
|
||||
|
||||
if resp.StatusCode >= 200 && resp.StatusCode < 300 {
|
||||
p.logger.Printf("[INFO] Hub report pushed successfully (%d bytes)", len(data))
|
||||
return nil
|
||||
}
|
||||
lastErr = fmt.Errorf("HTTP %d", resp.StatusCode)
|
||||
}
|
||||
|
||||
return fmt.Errorf("hub push failed after 3 attempts: %w", lastErr)
|
||||
}
|
||||
```
|
||||
|
||||
Replace with:
|
||||
Changes:
|
||||
- Removed "Never returns error" comment
|
||||
- Marshal error: return wrapped error instead of logging + nil
|
||||
- After retries exhausted: return error instead of logging + nil
|
||||
- Success path: unchanged (returns nil)
|
||||
|
||||
This is safe because:
|
||||
- The scheduler (`executeJob` in `scheduler.go:223-235`) already catches and logs errors from `JobFunc`
|
||||
- The startup code in `main.go` already checks `err` — it just never saw one before
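The retry-then-wrap pattern can be sketched in isolation. This is not the `Push()` implementation: `pushWithRetry` and `errHubUnavailable` are hypothetical, the sleeps are omitted, and at least one attempt is assumed. It shows that wrapping the last error with `%w` keeps the cause inspectable via `errors.Is`.

```go
package main

import (
	"errors"
	"fmt"
)

// errHubUnavailable is a hypothetical sentinel standing in for an HTTP 503.
var errHubUnavailable = errors.New("HTTP 503")

// pushWithRetry calls attempt up to n times (n >= 1 assumed) and returns
// nil on the first success. If every attempt fails, the last error is
// wrapped with %w so callers can still match it.
func pushWithRetry(n int, attempt func(i int) error) error {
	var lastErr error
	for i := 0; i < n; i++ {
		if lastErr = attempt(i); lastErr == nil {
			return nil
		}
	}
	return fmt.Errorf("hub push failed after %d attempts: %w", n, lastErr)
}

func main() {
	// Every attempt fails: the caller now sees a real error.
	err := pushWithRetry(3, func(int) error { return errHubUnavailable })
	fmt.Println(err)
	fmt.Println(errors.Is(err, errHubUnavailable))

	// The third attempt succeeds: the caller sees nil.
	err = pushWithRetry(3, func(i int) error {
		if i < 2 {
			return errHubUnavailable
		}
		return nil
	})
	fmt.Println(err == nil)
}
```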
|
||||
|
||||
### Step 2: Add startup retry with longer delay
|
||||
|
||||
**File:** `controller/cmd/controller/main.go`
|
||||
|
||||
The startup goroutine (starting at ~line 270) sends the hub report once. If `Push()` fails because the hub is not ready, it should retry a few times with a delay between attempts. The hub typically takes 10-15 seconds to start.
|
||||
|
||||
**Current** (~line 289-297):
|
||||
```go
|
||||
// --- Central hub reporting ---
|
||||
var hubPusher *report.Pusher
|
||||
if cfg.Hub.URL != "" && cfg.Hub.APIKey != "" {
|
||||
hubPusher = report.NewPusher(&cfg.Hub, logger)
|
||||
if cfg.Hub.Enabled {
|
||||
pushInterval, err := time.ParseDuration(cfg.Hub.PushInterval)
|
||||
if err != nil {
|
||||
pushInterval = 15 * time.Minute
|
||||
}
|
||||
sched.Every("hub-report", pushInterval, func(ctx context.Context) error {
|
||||
r := report.BuildReport(cfg, stackMgr, backupMgr, cpuCollector, metricsStore, Version, sett.GetStoragePaths())
|
||||
return hubPusher.Push(r)
|
||||
})
|
||||
logger.Printf("[INFO] Hub reporting enabled (every %s to %s)", pushInterval, cfg.Hub.URL)
|
||||
} else {
|
||||
logger.Printf("[INFO] Hub reporting disabled — will send disabled notification to %s", cfg.Hub.URL)
|
||||
}
|
||||
}
|
||||
// Hub report
|
||||
if hubPusher != nil {
|
||||
if cfg.Hub.Enabled {
|
||||
r := report.BuildReport(cfg, stackMgr, backupMgr, cpuCollector, metricsStore, Version, sett.GetStoragePaths())
|
||||
if err := hubPusher.Push(r); err != nil {
|
||||
logger.Printf("[WARN] Startup hub report failed: %v", err)
|
||||
} else {
|
||||
logger.Println("[INFO] Startup hub report sent")
|
||||
}
|
||||
} else {
|
||||
```
|
||||
|
||||
Then in the startup goroutine (~line 285-293), change the hub report block:
|
||||
|
||||
**Replace the `if cfg.Hub.Enabled` block** (keep the `else` disabled-notification branch unchanged):
|
||||
```go
|
||||
// Hub report
|
||||
if hubPusher != nil {
|
||||
if cfg.Hub.Enabled {
|
||||
r := report.BuildReport(cfg, stackMgr, backupMgr, cpuCollector, metricsStore, Version, sett.GetStoragePaths())
|
||||
if err := hubPusher.Push(r); err != nil {
|
||||
logger.Printf("[WARN] Startup hub report failed: %v", err)
|
||||
} else {
|
||||
logger.Println("[INFO] Startup hub report sent")
|
||||
}
|
||||
} else {
|
||||
// Send a minimal "disabled" notification so hub knows we're alive but not reporting
|
||||
r := &report.Report{
|
||||
Version: 1,
|
||||
CustomerID: cfg.Customer.ID,
|
||||
CustomerName: cfg.Customer.Name,
|
||||
ControllerVersion: Version,
|
||||
Timestamp: time.Now().UTC(),
|
||||
ReportingDisabled: true,
|
||||
Health: report.HealthReport{Status: "disabled", Issues: []string{}, Warnings: []string{}},
|
||||
Stacks: report.StacksReport{Deployed: []string{}, Available: []string{}},
|
||||
Containers: report.ContainerReport{List: []report.ContainerDetailReport{}},
|
||||
}
|
||||
hubPusher.PushOnce(r)
|
||||
}
|
||||
}
|
||||
// Hub report
|
||||
if hubPusher != nil {
|
||||
if cfg.Hub.Enabled {
|
||||
r := report.BuildReport(cfg, stackMgr, backupMgr, cpuCollector, metricsStore, Version, sett.GetStoragePaths())
|
||||
var pushErr error
|
||||
for attempt := 1; attempt <= 3; attempt++ {
|
||||
pushErr = hubPusher.Push(r)
|
||||
if pushErr == nil {
|
||||
logger.Println("[INFO] Startup hub report sent")
|
||||
break
|
||||
}
|
||||
logger.Printf("[WARN] Startup hub report attempt %d/3 failed: %v", attempt, pushErr)
|
||||
if attempt < 3 {
|
||||
time.Sleep(15 * time.Second)
|
||||
}
|
||||
}
|
||||
if pushErr != nil {
|
||||
logger.Printf("[WARN] Startup hub report failed after 3 attempts — next scheduled push in %s", cfg.Hub.PushInterval)
|
||||
}
|
||||
} else {
|
||||
```
|
||||
|
||||
**Note:** The `Report`, `HealthReport`, `StacksReport`, `ContainerReport`, and `ContainerDetailReport` types are all in `controller/internal/report/types.go`. The `report` package is already imported (it is used on the line above), so no new import is needed. The minimal report sets every list field to an empty slice so that JSON serialization produces `[]`, not `null`.
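The difference between a nil slice and an empty slice, and the effect of `omitempty` on the boolean flag, can be checked with a cut-down struct. `miniReport` is a hypothetical stand-in for `Report`, reduced to two fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// miniReport is a hypothetical, cut-down stand-in for the Report struct.
type miniReport struct {
	ReportingDisabled bool     `json:"reporting_disabled,omitempty"`
	Issues            []string `json:"issues"`
}

// encode marshals the report and returns the JSON text.
func encode(r miniReport) string {
	b, err := json.Marshal(r)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// Nil slice and false flag: the flag is omitted, the list is null.
	fmt.Println(encode(miniReport{})) // {"issues":null}

	// Empty slice and true flag: the flag is present, the list is [].
	fmt.Println(encode(miniReport{ReportingDisabled: true, Issues: []string{}})) // {"reporting_disabled":true,"issues":[]}
}
```

Because of `omitempty`, regular reports keep the same JSON shape as before the field was added; only the disabled notification carries the flag.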
|
||||
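This gives the hub roughly a minute to come up. After the 5s initial delay, each `Push()` call spends up to 10s on its own retries (3 attempts, 5s apart), and the startup loop waits 15s between its 3 calls, so the last request goes out about 65s after boot (5 + 10 + 15 + 10 + 15 + 10, ignoring request latency). The `else` branch for disabled notifications stays unchanged.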
|
||||
|
||||
**Note:** `NewPusher` sets `enabled: cfg.Enabled`. When `cfg.Hub.Enabled == false`, `Push()` will silently return nil. The startup code uses `PushOnce()` instead for the disabled notification, and the scheduler is never registered, so the `enabled` flag doesn't matter. No change needed to `NewPusher`.
|
||||
**IMPORTANT:** The `else` branch (disabled notification via `PushOnce`) stays as-is — no changes needed there.
|
||||
|
||||
---
|
||||
|
||||
## Fix 4: Hub — handle "disabled" status + show dates in history
|
||||
|
||||
### Repo: `E:\git\felhom.eu` (hub code at `hub/`)
|
||||
|
||||
### 4a: Hub status logic — handle "disabled"
|
||||
|
||||
**File:** `hub/internal/web/server.go`
|
||||
|
||||
The `handleDashboard()` function (~line 142-151) determines `OverallStatus`:
|
||||
|
||||
```go
|
||||
if c.TimeSinceReport > time.Hour {
|
||||
dc.OverallStatus = "down"
|
||||
} else if c.TimeSinceReport > 30*time.Minute || c.HealthStatus == "warn" {
|
||||
dc.OverallStatus = "warn"
|
||||
} else if c.HealthStatus == "fail" {
|
||||
dc.OverallStatus = "down"
|
||||
} else {
|
||||
dc.OverallStatus = "ok"
|
||||
}
|
||||
```
|
||||
|
||||
**Replace with:**
|
||||
|
||||
```go
|
||||
if c.HealthStatus == "disabled" {
|
||||
dc.OverallStatus = "disabled"
|
||||
} else if c.TimeSinceReport > time.Hour {
|
||||
dc.OverallStatus = "down"
|
||||
} else if c.TimeSinceReport > 30*time.Minute || c.HealthStatus == "warn" {
|
||||
dc.OverallStatus = "warn"
|
||||
} else if c.HealthStatus == "fail" {
|
||||
dc.OverallStatus = "down"
|
||||
} else {
|
||||
dc.OverallStatus = "ok"
|
||||
}
|
||||
```
|
||||
|
||||
Same change in `handleCustomerDetail()` (~line 200-208):
|
||||
|
||||
```go
|
||||
overallStatus := "ok"
|
||||
if customer.HealthStatus == "disabled" {
|
||||
overallStatus = "disabled"
|
||||
} else if customer.TimeSinceReport > time.Hour {
|
||||
// ...existing logic...
|
||||
```
|
||||
|
||||
Add "disabled" status color and icon to the existing functions:
|
||||
|
||||
**In `statusColor()`:**
|
||||
```go
|
||||
case "disabled":
|
||||
return "#94a3b8" // gray (same as default)
|
||||
```
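Since the same status chain appears in both `handleDashboard()` and `handleCustomerDetail()`, one option is to factor it into a shared helper. The sketch below is a suggestion only; `overallStatus` is a hypothetical function name, and the hub code may prefer to keep the logic inline.

```go
package main

import (
	"fmt"
	"time"
)

// overallStatus mirrors the if/else chain above. The "disabled" check
// comes first, so a node that sent a single disabled notification does
// not drift to "warn" or "down" as that notification ages.
func overallStatus(health string, sinceReport time.Duration) string {
	switch {
	case health == "disabled":
		return "disabled"
	case sinceReport > time.Hour:
		return "down"
	case sinceReport > 30*time.Minute || health == "warn":
		return "warn"
	case health == "fail":
		return "down"
	default:
		return "ok"
	}
}

func main() {
	fmt.Println(overallStatus("disabled", 3*time.Hour)) // disabled
	fmt.Println(overallStatus("ok", 2*time.Hour))       // down
	fmt.Println(overallStatus("ok", 45*time.Minute))    // warn
	fmt.Println(overallStatus("fail", time.Minute))     // down
	fmt.Println(overallStatus("ok", time.Minute))       // ok
}
```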
|
||||
|
||||
### 4b: Dashboard template — show "PAUSED" badge
|
||||
|
||||
**File:** `hub/internal/web/templates/dashboard.html` (line 46)
|
||||
|
||||
**Current:**
|
||||
```html
|
||||
{{if eq .OverallStatus "ok"}}OK{{else if eq .OverallStatus "warn"}}WARN{{else}}DOWN{{end}}
|
||||
```
|
||||
|
||||
**Replace with:**
|
||||
```html
|
||||
{{if eq .OverallStatus "ok"}}OK{{else if eq .OverallStatus "warn"}}WARN{{else if eq .OverallStatus "disabled"}}PAUSED{{else}}DOWN{{end}}
|
||||
```
|
||||
|
||||
### 4c: Customer detail — show "Reporting disabled" message
|
||||
|
||||
**File:** `hub/internal/web/templates/customer.html`
|
||||
|
||||
In the Health section (~line 130-155), add a check for disabled status BEFORE the existing `{{with .Report.health}}` block:
|
||||
|
||||
After line 132 (`<h2>Health</h2>`), add:
|
||||
|
||||
```html
|
||||
{{if eq .OverallStatus "disabled"}}
|
||||
<p class="health-status health-status-disabled">Reporting has been disabled on this node</p>
|
||||
<p class="hint">Enable it in the controller's <code>controller.yaml</code>: <code>hub.enabled: true</code></p>
|
||||
{{else}}
|
||||
```
|
||||
|
||||
And after the existing `{{end}}` that closes `{{with .Report.health}}` (line 155), add:
|
||||
|
||||
```html
|
||||
{{end}}
|
||||
```
|
||||
|
||||
So the full Health section becomes:
|
||||
```html
|
||||
<section class="card">
|
||||
<h2>Health</h2>
|
||||
{{if eq .OverallStatus "disabled"}}
|
||||
<p class="health-status health-status-disabled">Reporting has been disabled on this node</p>
|
||||
<p class="hint">Enable it in the controller's <code>controller.yaml</code>: <code>hub.enabled: true</code></p>
|
||||
{{else}}
|
||||
{{with .Report.health}}
|
||||
... existing content ...
|
||||
{{end}}
|
||||
{{end}}
|
||||
</section>
|
||||
```
|
||||
|
||||
### 4d: Storage labels — use `label` with fallback to `mount`
|
||||
|
||||
**File:** `hub/internal/web/templates/customer.html` (line 65)
|
||||
|
||||
**Current:**
|
||||
```html
|
||||
<span class="metric-label">{{index . "mount"}}</span>
|
||||
```
|
||||
|
||||
**Replace with:**
|
||||
```html
|
||||
<span class="metric-label">{{with index . "label"}}{{.}}{{else}}{{index . "mount"}}{{end}}</span>
|
||||
```
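The fallback behaviour of the `with ... else` expression can be checked outside the hub. This sketch assumes each storage entry reaches the template as a `map[string]any` (suggested by the use of `index`); the mount path `/mnt/hdd` and the helper `renderLabel` are hypothetical.

```go
package main

import (
	"fmt"
	"html/template"
	"strings"
)

// labelTmpl is the expression from the replacement line, without the <span>.
const labelTmpl = `{{with index . "label"}}{{.}}{{else}}{{index . "mount"}}{{end}}`

// renderLabel executes the template against one storage entry.
func renderLabel(entry map[string]any) string {
	t := template.Must(template.New("label").Parse(labelTmpl))
	var sb strings.Builder
	if err := t.Execute(&sb, entry); err != nil {
		return "error: " + err.Error()
	}
	return sb.String()
}

func main() {
	// Label present: the label is shown.
	fmt.Println(renderLabel(map[string]any{"label": "USB HDD 1TB", "mount": "/mnt/hdd"}))
	// Label missing (report from an older controller): falls back to mount.
	fmt.Println(renderLabel(map[string]any{"mount": "/mnt/hdd"}))
	// Label empty: also falls back to mount.
	fmt.Println(renderLabel(map[string]any{"label": "", "mount": "/"}))
}
```

The fallback matters for reports stored before the `Label` field existed, which would otherwise render an empty label.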
|
||||
|
||||
### 4e: Report history — show date + time
|
||||
|
||||
**File:** `hub/internal/web/templates/customer.html` (line 216)
|
||||
|
||||
**Current:**
|
||||
```html
|
||||
<td>{{.ReceivedAt.Format "15:04:05"}}</td>
|
||||
```
|
||||
|
||||
**Replace with:**
|
||||
```html
|
||||
<td>{{.ReceivedAt.Format "Jan 02 15:04"}}</td>
|
||||
```
|
||||
|
||||
This shows "Feb 18 17:10" instead of just "17:10:54", making it clear when reports are from different days.
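The two layouts can be compared directly with Go's reference-time formatting. The timestamp below is the one from the example above, with the year assumed; `sampleTime` and `formatHistory` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"time"
)

// sampleTime returns the example timestamp (year assumed for illustration).
func sampleTime() time.Time {
	return time.Date(2026, time.February, 18, 17, 10, 54, 0, time.UTC)
}

// formatHistory applies the new layout used in the report history table.
func formatHistory(t time.Time) string {
	return t.Format("Jan 02 15:04")
}

func main() {
	t := sampleTime()
	fmt.Println(t.Format("15:04:05")) // 17:10:54 (old layout, time only)
	fmt.Println(formatHistory(t))     // Feb 18 17:10 (new layout, date + time)
}
```

Note that the new layout drops the seconds and carries no year, which is fine for a short history but ambiguous across year boundaries.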
|
||||
|
||||
### 4f: CSS for disabled status badge
|
||||
|
||||
**File:** `hub/internal/web/templates/style.css`
|
||||
|
||||
Search for `.status-badge-` CSS rules. Add a rule for `disabled`:
|
||||
|
||||
```css
|
||||
.status-badge-disabled {
|
||||
background: #475569;
|
||||
color: #e2e8f0;
|
||||
}
|
||||
```
|
||||
|
||||
Use a neutral gray — not red (DOWN) or yellow (WARN). This visually signals "intentionally paused", not "something is wrong".
|
||||
|
||||
---
|
||||
|
||||
## Summary of all changes
|
||||
|
||||
### Controller repo (`deploy-felhom-compose`)
|
||||
## Summary of changes
|
||||
|
||||
| File | Change |
|
||||
|------|--------|
|
||||
| `controller/internal/report/types.go` | Add `ReportingDisabled bool` field to `Report` struct (Label already applied) |
|
||||
| `controller/internal/report/pusher.go` | Add `PushOnce()` method |
|
||||
| `controller/cmd/controller/main.go` | Always create Pusher when URL+key present; send disabled notification when `hub.enabled == false` |
|
||||
| `controller/internal/report/pusher.go` | `Push()` returns actual errors instead of always nil |
|
||||
| `controller/cmd/controller/main.go` | Startup hub push retries 3 times with 15s delay between attempts |
|
||||
|
||||
### Hub repo (`felhom.eu`)
|
||||
|
||||
| File | Change |
|
||||
|------|--------|
|
||||
| `hub/internal/web/server.go` | Handle `"disabled"` in status logic (dashboard + detail handlers), add disabled color |
|
||||
| `hub/internal/web/templates/dashboard.html` | Show "PAUSED" badge for disabled status |
|
||||
| `hub/internal/web/templates/customer.html` | Disabled health message, storage labels, date+time in history |
|
||||
| `hub/internal/web/templates/style.css` | Add `.status-badge-disabled` CSS |
|
||||
|
||||
### Runtime config (demo node)
|
||||
|
||||
| File | Change |
|
||||
|------|--------|
|
||||
| `/opt/docker/felhom-controller/controller.yaml` | Set `hub.enabled: true` |
|
||||
Only **2 files** changed. No new types, no new methods, no template changes.
|
||||
|
||||
---
|
||||
|
||||
## Build & Deploy
|
||||
|
||||
### Part A: Controller (v0.15.4)
|
||||
|
||||
```bash
|
||||
SSH=/c/Windows/System32/OpenSSH/ssh.exe
|
||||
# 1. Commit & push
|
||||
cd e:/git/deploy-felhom-compose
|
||||
git add -A && git commit -m "v0.15.4: Show all storage paths on dashboard/monitoring, hub disabled notification" && git push
|
||||
git add -A && git commit -m "v0.15.5: Fix startup hub report — Push() returns real errors, startup retries" && git push
|
||||
# 2. Build
|
||||
$SSH kisfenyo@192.168.0.180 "cd ~/build/felhom-controller && git -C ~/git/deploy-felhom-compose pull && ./build.sh v0.15.4 --push"
|
||||
# 3. Enable hub reporting + deploy
|
||||
$SSH kisfenyo@192.168.0.162 "cd /opt/docker/felhom-controller && sudo sed -i 's/enabled: false/enabled: true/' controller.yaml && sudo docker pull gitea.dooplex.hu/admin/felhom-controller:v0.15.4 && sudo sed -i 's|image: gitea.dooplex.hu/admin/felhom-controller:.*|image: gitea.dooplex.hu/admin/felhom-controller:v0.15.4|' docker-compose.yml && sudo docker compose up -d"
|
||||
# 4. Verify
|
||||
$SSH kisfenyo@192.168.0.162 "docker ps --filter name=felhom-controller --format '{{.Image}} {{.Status}}'"
|
||||
# 5. Check hub reporting is active
|
||||
$SSH kisfenyo@192.168.0.162 "docker logs felhom-controller --tail 5 2>&1 | grep -i hub"
|
||||
```
|
||||
|
||||
### Part B: Hub (v0.1.6)
|
||||
|
||||
```bash
|
||||
# 1. Commit & push
|
||||
cd e:/git/felhom.eu
|
||||
git add -A && git commit -m "hub v0.1.6: Handle disabled reporting status, storage labels, date in history" && git push
|
||||
# 2. Build + push
|
||||
$SSH kisfenyo@192.168.0.180 "cd ~/build/felhom-hub && ./build.sh 0.1.6 --push"
|
||||
# 3. Deploy to k3s
|
||||
$SSH kisfenyo@192.168.0.180 "sudo kubectl set image -n felhom-system deploy/hub hub=gitea.dooplex.hu/admin/felhom-hub:0.1.6"
|
||||
# 4. Verify
|
||||
$SSH kisfenyo@192.168.0.180 "sudo kubectl get pods -n felhom-system -l app=hub && echo '---' && sudo kubectl logs -n felhom-system -l app=hub --tail 10"
|
||||
$SSH kisfenyo@192.168.0.180 "cd ~/build/felhom-controller && git -C ~/git/deploy-felhom-compose pull && ./build.sh v0.15.5 --push"
|
||||
# 3. Deploy
|
||||
$SSH kisfenyo@192.168.0.162 "cd /opt/docker/felhom-controller && sudo docker pull gitea.dooplex.hu/admin/felhom-controller:v0.15.5 && sudo sed -i 's|image: gitea.dooplex.hu/admin/felhom-controller:.*|image: gitea.dooplex.hu/admin/felhom-controller:v0.15.5|' docker-compose.yml && sudo docker compose up -d"
|
||||
# 4. Verify — look for successful startup push
|
||||
$SSH kisfenyo@192.168.0.162 "sleep 10 && docker logs felhom-controller --tail 15 2>&1 | grep -i hub"
|
||||
```
|
||||
|
||||
### Compile check
|
||||
Always run `go build ./...` in `controller/` before committing to ensure no compile errors.
|
||||
Always run `go build ./...` in `controller/` before committing.
|
||||
|
||||
## Documentation
|
||||
|
||||
Add a CHANGELOG.md entry at the top (under `## Changelog`). Read the first 30 lines to see the format, then insert a new entry. Example:
|
||||
Add a CHANGELOG.md entry. Read the first 30 lines for format, then insert a new entry:
|
||||
|
||||
```markdown
|
||||
### vX.X.X (2026-02-19 session XX)
|
||||
- **v0.15.4 — Show all storage paths on dashboard + hub reporting improvements:**
|
||||
- **v0.15.5 — Fix startup hub report silently failing:**
|
||||
|
||||
Dashboard ("Vezérlőpult") and monitoring ("Rendszermonitor") pages now show usage bars for ALL registered storage paths instead of just one hardcoded "Külső HDD" bar. Each bar displays the storage label and usage from settings.
|
||||
`Push()` now returns actual errors instead of always nil. Previously, push failures were logged internally but the caller could never detect them, which produced a misleading "Startup hub report sent" log line even when the push had failed (e.g., hub returning HTTP 503 during simultaneous deployment).
|
||||
|
||||
Hub storage report now correctly includes all registered storage paths with proper mount paths and labels. Previously it sent only root `/` and one HDD entry with an empty mount path (used deprecated `cfg.Paths.HDDPath`).
|
||||
Startup hub push now retries 3 times with 15-second delays between attempts, giving the hub time to come up when both are deployed together. Each attempt uses Push()'s own 3-retry logic internally.
|
||||
|
||||
When hub reporting is disabled (`hub.enabled: false`) but hub URL/key are configured, the controller now sends a one-time "disabled" notification so the hub shows "PAUSED" instead of "DOWN".
|
||||
|
||||
**Files modified:** `internal/web/handlers.go`, `internal/web/templates/dashboard.html`, `internal/web/templates/monitoring.html`, `internal/report/types.go`, `internal/report/pusher.go`, `cmd/controller/main.go`
|
||||
**Hub files:** `hub/internal/web/server.go`, `hub/internal/web/templates/dashboard.html`, `hub/internal/web/templates/customer.html`, `hub/internal/web/templates/style.css`
|
||||
**Files modified (2):** `internal/report/pusher.go`, `cmd/controller/main.go`
|
||||
```
|
||||
|
||||
Update version in `C:\Users\User\.claude\projects\e--git\memory\MEMORY.md` to `v0.15.4`.
|
||||
Update version in `C:\Users\User\.claude\projects\e--git\memory\MEMORY.md` to `v0.15.5`.
|
||||
|
||||
## Verification
|
||||
|
||||
After deploying v0.15.4 (controller) and v0.1.6 (hub):
|
||||
1. Wait ~30s for startup hub report, then check `hub.felhom.eu/customers/demo-felhom`:
|
||||
- Health section should show **STATUS: OK** with no warnings (since hub.enabled is now true)
|
||||
- Storage section should show three bars with labels: "SSD", "USB HDD 1TB", "SYS Storage 350G"
|
||||
- Report history should show dates (e.g., "Feb 19 09:46")
|
||||
2. To test PAUSED state: set `hub.enabled: false` on demo node, restart controller. Hub should show "PAUSED" (gray badge), not "DOWN" (red)
|
||||
After deploying v0.15.5:
|
||||
1. Check logs: `docker logs felhom-controller 2>&1 | grep -i hub`
|
||||
- Should show `[INFO] Startup hub report sent` (success)
|
||||
- OR `[WARN] Startup hub report attempt 1/3 failed: ...` followed by eventual success
|
||||
2. Check hub dashboard at `hub.felhom.eu` — should show fresh data with current timestamp
|
||||
3. If the hub is deployed at the same time, the startup retries should cover its startup delay
|
||||
|
||||