updated startup monitoring

2026-02-14 18:57:20 +01:00
parent c3b80cffdc
commit 0be798af5d
4 changed files with 187 additions and 14 deletions
+3
@@ -102,6 +102,7 @@ ssh kisfenyo@192.168.0.162 "cd /opt/docker/felhom-controller && docker pull gite
- App cards on dashboard and stacks pages are clickable via `data-href` attribute (skip protected stacks)
- Logs page uses AJAX polling (`?raw=1` query param returns plain text) with auto-scroll and pause/resume
- Memory bar on deploy page uses two-segment stacked bar (committed = solid green, new = translucent green)
- Deploy flow shows 3-step progress panel (config → containers → health), polls `GET /api/stacks/{name}` every 3s until running/unhealthy/timeout(120s)
## Git sync module (internal/sync)
@@ -139,3 +140,5 @@ Key patterns used in `internal/stacks/`:
4. Docker's `.State` field says "running" even for unhealthy containers, so health info must be parsed from `.Status`
5. After `DeployStack()` succeeds, update the in-memory `Deployed` flag immediately, because `RefreshStatus()` only reads `docker ps`, not `app.yaml`
6. `docker compose up -d` returns exit 0 even when containers crash-loop, so a post-start status check is essential for detecting failures
7. Mealie image has no wget/curl — use Python TCP socket check for healthcheck; set `start_period: 60s` for DB migration time
8. Always verify container images have the healthcheck tool (`wget`, `curl`, etc.) before using it — Alpine has BusyBox wget, Python images have `python3`
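Pattern 4 above can be sketched as a small Go helper. This is a minimal illustration, not the code in `internal/stacks/`: the function name `effectiveState` is hypothetical, and it assumes the `Status` strings that `docker ps` prints for containers with a healthcheck, such as `Up 12 seconds (health: starting)` or `Up 3 minutes (unhealthy)`.

```go
package main

import (
	"fmt"
	"strings"
)

// effectiveState derives a health-aware state from the State and Status
// columns of `docker ps`. The returned labels mirror the states named in
// these notes; the helper itself is a hypothetical sketch.
func effectiveState(state, status string) string {
	if state != "running" {
		// e.g. "exited", "restarting", "created": pass through unchanged.
		return state
	}
	switch {
	case strings.Contains(status, "(health: starting)"):
		return "starting"
	case strings.Contains(status, "(unhealthy)"):
		return "unhealthy"
	default:
		// "(healthy)", or no healthcheck defined at all.
		return "running"
	}
}

func main() {
	fmt.Println(effectiveState("running", "Up 12 seconds (health: starting)"))
	fmt.Println(effectiveState("running", "Up 3 minutes (unhealthy)"))
	fmt.Println(effectiveState("running", "Up 2 hours"))
	fmt.Println(effectiveState("exited", "Exited (1) 4 seconds ago"))
}
```

Treating a container without a healthcheck as `running` is a design choice in this sketch; a stricter variant could report a separate state for that case.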
+9 -1
@@ -37,8 +37,13 @@ Last updated: 2026-02-14 (session 3)
- All verbose checks gated on `cfg.Logging.Level == "debug"`; timing always at INFO
- **UI improvements** in `internal/web/templates.go` and `server.go`:
- **Memory bar fix on deploy page**: Bar segments now always visible (min-width: 3px), new app segment uses translucent green with distinct border for clear visual separation from committed memory
- **Clickable app cards**: Cards on Vezérlőpult (dashboard) and Alkalmazások (applications) pages are now clickable (navigates to deploy/detail page). Uses `data-href` attribute + delegated click handler. Protected stacks excluded. Actions area (buttons, state labels) excluded from click-to-navigate
- **Live-scrolling logs**: Logs page now auto-refreshes every 3s via AJAX polling (`?raw=1` returns plain text). Fixed-height container (70vh) with auto-scroll to bottom. Pulsing green "Élő" (live) indicator. Pause/resume toggle ("Szüneteltetés"/"Folytatás"). User scroll position preserved when scrolled up to read history
- **Deployment progress UI**: Deploy button no longer shows alert+redirect immediately. Instead shows 3-step progress panel: config saved → containers starting → app initializing. Polls `GET /api/stacks/{name}` every 3s to track actual container health state. Handles running (auto-redirect), starting (keep polling), unhealthy (warning), exited (error), and 120s timeout. Shows elapsed time counter
- **Mealie healthcheck fix** (app-catalog-felhom.eu):
- `wget --spider` replaced with Python TCP socket check — mealie image doesn't include wget
- `start_period` increased to 60s (DB migrations take ~40s on first start)
- **Healthcheck audit**: filebrowser (Alpine, has BusyBox wget — OK), stirling-pdf (Ubuntu, has wget — OK)
### Previously completed (2026-02-15 session 2)
- **Phase 4: Git Sync + App Catalog Audit**: major milestone
@@ -179,3 +184,6 @@ Last updated: 2026-02-14 (session 3)
- BIOS "AC Power Recovery" must be enabled on N100 for auto-restart after power outage
- `docker compose up -d` returns exit 0 even when containers immediately crash-loop, so a post-start status check is needed to detect this
- When logging env vars for debugging, only log keys (not values) to avoid leaking secrets in log files
- Mealie image (`ghcr.io/mealie-recipes/mealie`) doesn't include wget/curl — use Python TCP socket check for healthcheck
- Mealie DB migrations on first start take ~40s (alembic) — use `start_period: 60s` to avoid premature unhealthy status
- Alpine-based images (filebrowser, vaultwarden) have wget via BusyBox — healthchecks with `wget --spider` work fine
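The note above about logging only env var keys can be illustrated with a short Go sketch. It assumes the variables are available as `KEY=VALUE` strings (the shape `os.Environ()` returns); the helper name `envKeys` is hypothetical and not taken from the controller's code.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// envKeys returns only the variable names from KEY=VALUE pairs, sorted,
// so that values (which may hold secrets) never reach the log.
func envKeys(env []string) []string {
	keys := make([]string, 0, len(env))
	for _, kv := range env {
		// strings.Cut splits at the first "="; everything after it is dropped.
		key, _, _ := strings.Cut(kv, "=")
		keys = append(keys, key)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	// Example values are made up for illustration.
	env := []string{"TZ=Europe/Budapest", "DB_PASSWORD=s3cret"}
	fmt.Println("env keys:", strings.Join(envKeys(env), ", "))
}
```

Sorting the keys keeps log lines stable between runs, which makes them easier to compare when debugging.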
+10 -2
@@ -44,6 +44,7 @@ Current version: **v0.2.1**
- Verbose debug logging with operation timing, post-start container state checks, and image pull detection
- Clickable app cards on dashboard and applications pages (navigate to detail/deploy page)
- Memory bar with two-segment visualization on deploy page (committed vs new app allocation)
- Deployment progress UI: 3-step progress panel with real-time health polling (config → containers → health check)
### Known issues / next priorities
- Cloudflare Tunnel + Traefik TLS: paperless.demo-felhom.eu works locally but shows "Not secure" (certificate chain not fully validated through tunnel)
@@ -154,8 +155,15 @@ controller/
- Saves `app.yaml` (env vars + locked fields list)
- Runs `docker compose up -d` with env vars injected
- Updates in-memory state immediately (no stale "Telepítés" button)
4. **Progress UI** replaces the form with a 3-step progress panel:
- ✅ "Konfiguráció mentve": shown immediately after API success
- ⏳ "Konténer(ek) indítása..." → ✅ when containers are up
- ⏳ "Alkalmazás inicializálása..." → ✅ when state = `running` (healthy)
- Polls `GET /api/stacks/{name}` every 3 seconds for real-time health status
- Handles: `running` (auto-redirect), `starting` (keep polling), `unhealthy` (warning), `exited` (error)
- Timeout after 120 seconds with informational message
5. Post-deploy: locked fields (DB_PASSWORD, etc.) become read-only
6. "Részletek" button opens deploy page in read-only mode showing current config
### Memory validation during deploy
+163 -9
@@ -426,6 +426,27 @@ const deployTmpl = `
</div>
{{end}}
</form>
<div id="deploy-progress" class="deploy-progress" style="display:none">
<h3>Telepítés folyamatban...</h3>
<div class="deploy-steps">
<div class="deploy-step active" id="step-config">
<span class="step-icon">&#9203;</span>
<span class="step-text">Konfiguráció mentése...</span>
</div>
<div class="deploy-step" id="step-containers">
<span class="step-icon">&#9203;</span>
<span class="step-text">Konténer(ek) indítása...</span>
</div>
<div class="deploy-step" id="step-health">
<span class="step-icon">&#9203;</span>
<span class="step-text">Alkalmazás inicializálása...</span>
</div>
</div>
<div id="deploy-warning" class="alert alert-warning" style="display:none"></div>
<div id="deploy-result" style="display:none"></div>
<p class="deploy-elapsed" id="deploy-elapsed"></p>
</div>
</div>
<script>
@@ -466,7 +487,6 @@ document.getElementById('deploy-form').addEventListener('submit', async function
}
const btn = e.target.querySelector('[type=submit]');
-const origText = btn.textContent;
btn.disabled = true;
btn.textContent = 'Telepítés folyamatban...';
@@ -482,28 +502,114 @@ document.getElementById('deploy-form').addEventListener('submit', async function
}
});
var stackName = '{{.Stack.Name}}';
var progressEl = document.getElementById('deploy-progress');
var formEl = document.getElementById('deploy-form');
var stepConfig = document.getElementById('step-config');
var stepContainers = document.getElementById('step-containers');
var stepHealth = document.getElementById('step-health');
var warningEl = document.getElementById('deploy-warning');
var resultEl = document.getElementById('deploy-result');
var elapsedEl = document.getElementById('deploy-elapsed');
function setStep(el, status, text) {
el.className = 'deploy-step ' + status;
if (text) el.querySelector('.step-text').textContent = text;
var icon = el.querySelector('.step-icon');
if (status === 'done') icon.textContent = '\u2705';
else if (status === 'error') icon.textContent = '\u274C';
else if (status === 'warn') icon.textContent = '\u26A0\uFE0F';
else if (status === 'active') icon.textContent = '\u23F3';
}
// Phase 1: Deploy request
try {
-const resp = await fetch('/api/stacks/{{.Stack.Name}}/deploy', {
+var resp = await fetch('/api/stacks/' + stackName + '/deploy', {
method: 'POST',
headers: {'Content-Type': 'application/json'},
body: JSON.stringify({values: values})
});
-const data = await resp.json();
+var data = await resp.json();
if (!data.ok) {
alert('Hiba: ' + data.error);
-btn.textContent = origText;
+btn.textContent = 'Telepítés indítása';
btn.disabled = false;
return;
}
// Deploy API returned success — switch to progress view
formEl.style.display = 'none';
progressEl.style.display = 'block';
setStep(stepConfig, 'done', 'Konfiguráció mentve');
setStep(stepContainers, 'active', 'Konténer(ek) indítása...');
if (data.data && data.data.warning) {
-alert('Sikeres telepítés!\n\nFigyelmeztetés: ' + data.data.warning);
+warningEl.textContent = data.data.warning;
-} else {
+warningEl.style.display = 'block';
-alert('Sikeres telepítés!');
}
-window.location.href = '/stacks';
// Phase 2: Poll stack status
var startTime = Date.now();
var pollTimeout = 120000;
var pollTimer = setInterval(async function() {
var elapsed = Math.round((Date.now() - startTime) / 1000);
elapsedEl.textContent = elapsed + ' másodperce...';
if (Date.now() - startTime > pollTimeout) {
clearInterval(pollTimer);
setStep(stepHealth, 'warn', 'Időtúllépés — az alkalmazás még indulhat');
resultEl.innerHTML = '<div class="alert alert-warning" style="margin-top:1rem">' +
'A telepítés időtúllépésbe futott. Az alkalmazás még indulhat.' +
'</div><a href="/stacks" class="btn btn-primary" style="margin-top:.75rem">Alkalmazások megtekintése</a>';
resultEl.style.display = 'block';
return;
}
try {
var sr = await fetch('/api/stacks/' + stackName);
var sd = await sr.json();
if (!sd.ok || !sd.data) return;
var state = sd.data.state;
if (state === 'running') {
clearInterval(pollTimer);
setStep(stepContainers, 'done', 'Konténerek elindultak');
setStep(stepHealth, 'done', 'Alkalmazás kész!');
progressEl.querySelector('h3').textContent = 'Telepítés sikeres!';
resultEl.innerHTML = '<div class="alert alert-info" style="margin-top:1rem">' +
'Az alkalmazás fut. Átirányítás 3 másodperc múlva...' +
'</div>';
resultEl.style.display = 'block';
setTimeout(function() { window.location.href = '/stacks'; }, 3000);
} else if (state === 'starting') {
setStep(stepContainers, 'done', 'Konténerek elindultak');
setStep(stepHealth, 'active', 'Alkalmazás inicializálása...');
} else if (state === 'unhealthy') {
clearInterval(pollTimer);
setStep(stepContainers, 'done', 'Konténerek elindultak');
setStep(stepHealth, 'warn', 'Állapotjelző: nem egészséges');
resultEl.innerHTML = '<div class="alert alert-warning" style="margin-top:1rem">' +
'Az alkalmazás elindult, de az állapotjelző nem egészséges. ' +
'Ez normális lehet az első percekben.' +
'</div><a href="/stacks" class="btn btn-primary" style="margin-top:.75rem">Alkalmazások megtekintése</a>';
resultEl.style.display = 'block';
} else if (state === 'exited' || state === 'stopped') {
clearInterval(pollTimer);
setStep(stepContainers, 'error', 'A konténer leállt');
setStep(stepHealth, 'error');
progressEl.querySelector('h3').textContent = 'Telepítés sikertelen';
resultEl.innerHTML = '<div class="alert alert-error" style="margin-top:1rem">' +
'A konténer leállt. Ellenőrizze a naplókat.' +
'</div><a href="/stacks/' + stackName + '/logs" class="btn btn-outline" style="margin-top:.75rem">Naplók megtekintése</a>' +
' <a href="/stacks" class="btn btn-primary" style="margin-top:.75rem">Alkalmazások</a>';
resultEl.style.display = 'block';
}
} catch(pollErr) {}
}, 3000);
} catch (err) {
alert('Hálózati hiba: ' + err.message);
-btn.textContent = origText;
+btn.textContent = 'Telepítés indítása';
btn.disabled = false;
}
});
@@ -1187,6 +1293,54 @@ select.form-control option { background: var(--bg-secondary); color: var(--text-
border-top: 1px solid var(--border-color);
}
/* Deploy progress */
.deploy-progress {
background: var(--bg-card);
border: 1px solid var(--border-color);
border-radius: var(--radius);
padding: 1.5rem;
}
.deploy-progress h3 {
margin-bottom: 0.5rem;
}
.deploy-steps {
margin: 1rem 0;
}
.deploy-step {
display: flex;
align-items: center;
gap: 0.75rem;
padding: 0.5rem 0;
font-size: .95rem;
color: var(--text-muted);
}
.deploy-step.active {
color: var(--text-primary);
}
.deploy-step.done {
color: var(--text-primary);
}
.deploy-step.done .step-icon {
color: var(--green);
}
.deploy-step.error .step-icon {
color: var(--red);
}
.deploy-step.warn .step-icon {
color: var(--yellow);
}
.step-icon {
font-size: 1.1rem;
width: 1.5rem;
text-align: center;
flex-shrink: 0;
}
.deploy-elapsed {
color: var(--text-muted);
font-size: .85rem;
margin-top: 0.5rem;
}
/* Toggle switch */
.toggle { cursor: pointer; display: flex; align-items: center; gap: .5rem; }
.toggle input[type="checkbox"] { accent-color: var(--accent-blue); }