fix: deep bug hunt II — concurrency, security & optimization (25 files)

Critical: watchdog mutex panic safety, SetGeoAppOverride nil guard, SSD-only app DB restore fallback.

High: double deploy race (atomic Deploying flag), delete/remove during deploy guard, ScanStacks overwrite protection, FileBrowser mount mutex, PushEvent history, PushOnce error handling, DB dump sync+close before rename, restic retry fresh context, encrypt failure logging, cross-backup path traversal validation, deepCopyStack completeness.

Security: constant-time API key comparison, login rate limiting (5/min), git credential masking in logs, storage path prefix traversal fix.

Concurrency: MigrateEncryption lock ordering, SubdomainInUse I/O outside lock, scheduler late-registered jobs, SQLite WAL verification, metrics shutdown context, telemetry scan error logging, asset sync lock scope.

Optimization: streaming file copy for DB dumps, restic stats dedup, atomic infra config copy.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -1,5 +1,45 @@
## Changelog
### v0.30.4 — Deep Bug Hunt II: Concurrency, Security & Optimization (2026-02-25)
#### Fixed (Critical)
- **Watchdog mutex panic** — Wrapped `handleDisconnect` call in anonymous func with deferred re-lock to guarantee mutex re-acquisition even on panic (C1)
- **SetGeoAppOverride nil crash** — Added nil guard; passing nil override now correctly deletes the entry instead of panicking (C2)
- **SSD-only app DB restore** — `restoreDBDumps` now falls back to `app.DrivePath` when `HDDPath` is empty (C3)
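
The watchdog fix (C1) relies on a small locking idiom that is easy to get wrong. The sketch below is a minimal illustration, not the project's code: the `watchdog` type, its `disconnects` field, the `check` method, and the `fail` parameter are all hypothetical stand-ins. It shows how unlocking inside an anonymous function with a deferred re-lock keeps the outer deferred `Unlock` valid even when the handler panics.

```go
package main

import (
	"fmt"
	"sync"
)

// watchdog is a hypothetical stand-in for the real type; only the locking
// pattern matters here.
type watchdog struct {
	mu          sync.Mutex
	disconnects int // guarded by mu
}

// handleDisconnect stands in for the real handler. It is called with mu
// released and may panic.
func (w *watchdog) handleDisconnect(fail bool) {
	if fail {
		panic("simulated handler failure")
	}
}

// check holds mu for its whole body except while the handler runs.
// It returns the value carried by a handler panic, or nil.
func (w *watchdog) check(fail bool) (recovered any) {
	w.mu.Lock()
	defer w.mu.Unlock() // runs last, so mu must be held again by then
	defer func() { recovered = recover() }()

	w.disconnects++

	// Release the lock for the handler. The deferred Lock runs even if the
	// handler panics, so the outer deferred Unlock never hits an unlocked
	// mutex (which would be a fatal, unrecoverable runtime error).
	func() {
		w.mu.Unlock()
		defer w.mu.Lock()
		w.handleDisconnect(fail)
	}()
	return nil
}

func main() {
	w := &watchdog{}
	fmt.Println("recovered:", w.check(true))
	// The second call only works if the first left mu unlocked and intact.
	fmt.Println("recovered:", w.check(false))
	fmt.Println("disconnects:", w.disconnects)
}
```

The recovery in `check` is only there to make the demo observable; whether the real watchdog recovers from the panic or lets it propagate is not stated in the changelog.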
#### Fixed (High)
- **Double deploy race** — Added atomic check-and-set of `Deploying` flag with `clearDeploying()` helper on all error paths (H1)
- **Delete/Remove during deploy** — Both `DeleteStack` and `RemoveStack` now reject operations while stack is deploying (H2)
- **ScanStacks overwrite** — Skips updating `Deployed`/`AppConfig` for stacks with active deploy in progress (H3)
- **FileBrowser mount race** — Added `fileBrowserMu` mutex to prevent concurrent `SyncFileBrowserMounts` calls (H5)
- **PushEvent history gap** — Added `recordHistory` calls on both success and failure paths in PushEvent goroutine (H6)
- **PushOnce silent failure** — Now returns error for non-2xx HTTP responses instead of nil (H7)
- **DB dump file corruption** — Added `tmpFile.Sync()` and `tmpFile.Close()` before rename in `DumpOne` (H8)
- **Restic retry timeout** — Creates fresh 30-minute context for retry after unlock instead of reusing near-expired original (H9)
- **Encrypt failure silent** — Added warning log when encryption fails in `SaveAppConfig` (H10)
- **Cross-backup path traversal** — Validates destination path against registered storage paths in both web and API handlers (H11)
- **deepCopyStack incomplete** — Now deep-copies `Meta.OptionalConfig`, `Meta.HealthCheck`, and `DeployField.Options` (H12)
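
The double-deploy guard (H1) and the delete/remove guard (H2) both hinge on testing and setting the `Deploying` flag in one step. A minimal sketch of that idea follows, assuming a mutex-protected flag; the `stack` type, `tryStartDeploy`, `deploy`, and `errAlreadyDeploying` are hypothetical names, and the sketch clears the flag on every exit path via `defer`, which is a simplification of the per-error-path `clearDeploying()` calls described above.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// stack is a hypothetical stand-in; only the Deploying flag is modelled.
type stack struct {
	mu        sync.Mutex
	Deploying bool // guarded by mu
}

var errAlreadyDeploying = errors.New("stack is already deploying")

// tryStartDeploy checks and sets the flag under one lock acquisition, so two
// concurrent callers can never both observe Deploying == false.
func (s *stack) tryStartDeploy() bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.Deploying {
		return false
	}
	s.Deploying = true
	return true
}

// clearDeploying resets the flag once a deploy has finished or failed.
func (s *stack) clearDeploying() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.Deploying = false
}

// deploy runs the given work only if no other deploy is in flight.
func deploy(s *stack, run func() error) error {
	if !s.tryStartDeploy() {
		return errAlreadyDeploying
	}
	defer s.clearDeploying() // simplification: clears on success and failure
	return run()
}

func main() {
	s := &stack{}
	started, release := make(chan struct{}), make(chan struct{})
	first := make(chan error, 1)

	go func() {
		first <- deploy(s, func() error {
			close(started) // signal that the first deploy holds the flag
			<-release      // keep it "in progress" until told to finish
			return nil
		})
	}()

	<-started
	fmt.Println("second deploy:", deploy(s, func() error { return nil }))
	close(release)
	fmt.Println("first deploy:", <-first)
}
```

A delete or remove handler could reuse the same flag by checking it under the same mutex before touching the stack.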
#### Security
- **Constant-time API key** — Replaced `==` with `subtle.ConstantTimeCompare` for API key comparison, preventing timing attacks (M1)
- **Login rate limiting** — Added per-IP rate limiter (5 attempts/minute) to login handler (M8)
- **Git credential masking** — Applied `maskRepoURL()` in `runGitInDir` log output to prevent credential leakage (M23)
- **Path prefix traversal** — Fixed `storageAttachBrowseHandler` prefix check to require trailing `/`, preventing sibling directory matches (M24)
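
Two of the security fixes above (M1 and M24) are short enough to illustrate directly. The sketch below is an assumption-laden example, not the handlers' actual code: `apiKeyMatches` and `withinRoot` are hypothetical helper names, and slash-separated paths are assumed. It shows a constant-time key comparison with `subtle.ConstantTimeCompare` and a prefix check that requires a trailing `/` so that a sibling such as `/mnt/data-backup` does not match the root `/mnt/data`.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"path"
	"strings"
)

// apiKeyMatches compares keys without returning early at the first differing
// byte. ConstantTimeCompare still returns immediately when the lengths
// differ, so the key length itself is not hidden.
func apiKeyMatches(provided, expected string) bool {
	if expected == "" {
		return false // an unset key must never authenticate anyone
	}
	return subtle.ConstantTimeCompare([]byte(provided), []byte(expected)) == 1
}

// withinRoot reports whether target is root itself or lies beneath it.
// Both paths are cleaned first, so ".." segments cannot escape the root.
func withinRoot(root, target string) bool {
	root = path.Clean(root)
	target = path.Clean(target)
	if target == root {
		return true
	}
	// Require the separator after the root so "/mnt/data-backup" does not
	// pass as a child of "/mnt/data".
	return strings.HasPrefix(target, strings.TrimSuffix(root, "/")+"/")
}

func main() {
	fmt.Println(apiKeyMatches("s3cret", "s3cret"))
	fmt.Println(apiKeyMatches("s3cres", "s3cret"))

	fmt.Println(withinRoot("/mnt/data", "/mnt/data/photos"))
	fmt.Println(withinRoot("/mnt/data", "/mnt/data-backup/photos"))
	fmt.Println(withinRoot("/mnt/data", "/mnt/data/../etc"))
}
```

If key length should also stay hidden, one common option is to hash both values to a fixed size before comparing; the changelog does not say whether the project does this.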
#### Concurrency & Logic
- **MigrateEncryption race** — Moved `encKey == nil` check inside the mutex lock (M5)
- **SubdomainInUse I/O under lock** — Now collects stack dirs under `RLock`, releases the lock, then performs disk I/O outside it (M4)
- **Scheduler late jobs** — Jobs registered after `Start()` now immediately get their goroutine launched (M10)
- **SQLite WAL verification** — WAL pragma now verified via `QueryRow` + `Scan` instead of silent `Exec` (M13)
- **Metrics shutdown** — `sampleContainers` now uses parent context instead of `context.Background()` for clean shutdown (M14)
- **Telemetry scan logging** — Row scan errors now logged instead of silently swallowed (M15)
- **Asset sync lock** — Refactored to hold mutex only for status updates, not during entire HTTP download (M22)
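
The SubdomainInUse change (M4) and the asset sync change (M22) share one pattern: copy what is needed while holding the lock, release it, then do the slow work. Here is a minimal sketch of that pattern under stated assumptions: the `registry` type, `snapshotDirs`, and the injected `readSubdomain` callback (standing in for the disk read) are hypothetical and not taken from the codebase.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a hypothetical stand-in for the structure holding stacks.
type registry struct {
	mu        sync.RWMutex
	stackDirs map[string]string // stack name -> directory, guarded by mu
}

// snapshotDirs copies the directory list under the read lock and releases
// the lock before returning, so callers never do I/O while holding it.
func (r *registry) snapshotDirs() []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	dirs := make([]string, 0, len(r.stackDirs))
	for _, dir := range r.stackDirs {
		dirs = append(dirs, dir)
	}
	return dirs
}

// subdomainInUse runs the (potentially slow) per-directory read with no lock
// held. readSubdomain stands in for reading a stack's config from disk.
func (r *registry) subdomainInUse(sub string, readSubdomain func(dir string) (string, error)) bool {
	for _, dir := range r.snapshotDirs() {
		got, err := readSubdomain(dir)
		if err != nil {
			continue // unreadable stack: treat as not claiming the subdomain
		}
		if got == sub {
			return true
		}
	}
	return false
}

func main() {
	r := &registry{stackDirs: map[string]string{
		"blog": "/stacks/blog",
		"wiki": "/stacks/wiki",
	}}
	onDisk := map[string]string{"/stacks/blog": "blog", "/stacks/wiki": "docs"}

	read := func(dir string) (string, error) {
		// Taking the write lock here would deadlock if subdomainInUse still
		// held the read lock during I/O.
		r.mu.Lock()
		r.mu.Unlock()
		return onDisk[dir], nil
	}

	fmt.Println(r.subdomainInUse("docs", read))
	fmt.Println(r.subdomainInUse("shop", read))
}
```

The trade-off is that the snapshot can be slightly stale by the time the I/O finishes, which is usually acceptable for a lookup but worth keeping in mind.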
#### Optimization
- **DB dump copy** — Replaced `os.ReadFile`/`os.WriteFile` with streaming `io.Copy` via `copyFile` helper for large dumps (M16)
- **Restic stats dedup** — Per-drive stats now computed once and aggregated, eliminating duplicate restic subprocess calls (M17)
- **Infra config atomic** — `syncInfraConfig` controller.yaml copy now uses atomic write via `copyFile` (M20)
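
The `copyFile` helper named in M16 and M20 is not shown in this changelog. The following is one plausible shape for such a helper, offered as a sketch rather than the actual implementation: it streams with `io.Copy` instead of loading the file into memory, then applies the same sync, close, and rename sequence described for `DumpOne` (H8). The temp-file naming and the `roundTrip` demo function are assumptions.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// copyFile streams src into a temp file next to dst, flushes it to disk,
// and renames it into place so readers never see a partial dst.
func copyFile(src, dst string) (err error) {
	in, err := os.Open(src)
	if err != nil {
		return err
	}
	defer in.Close()

	// Same directory as dst, so the final rename stays on one filesystem.
	// Note: os.CreateTemp creates the file with mode 0600.
	tmp, err := os.CreateTemp(filepath.Dir(dst), ".copy-*")
	if err != nil {
		return err
	}
	tmpName := tmp.Name()
	defer func() {
		if err != nil {
			tmp.Close() // harmless if already closed
			os.Remove(tmpName)
		}
	}()

	if _, err = io.Copy(tmp, in); err != nil { // constant memory use
		return err
	}
	if err = tmp.Sync(); err != nil { // flush data before it becomes visible
		return err
	}
	if err = tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmpName, dst)
}

// roundTrip is a demo helper: it writes data to a scratch file, copies it
// with copyFile, and returns what ended up in the destination.
func roundTrip(data string) (string, error) {
	dir, err := os.MkdirTemp("", "copyfile-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	src := filepath.Join(dir, "dump.sql")
	dst := filepath.Join(dir, "dump.copy.sql")
	if err := os.WriteFile(src, []byte(data), 0o600); err != nil {
		return "", err
	}
	if err := copyFile(src, dst); err != nil {
		return "", err
	}
	got, err := os.ReadFile(dst)
	return string(got), err
}

func main() {
	got, err := roundTrip("CREATE TABLE t (id INTEGER);\n")
	fmt.Printf("copied %d bytes, err=%v\n", len(got), err)
}
```

Writing the temp file in the destination directory matters because `os.Rename` is only atomic within a single filesystem.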
### v0.30.3 — Comprehensive Bug Hunt Fixes (2026-02-25)
#### Fixed (Critical — P0)