# TASK: Fix data loss on container restart (v0.15.2)

Two pieces of data are lost when the felhom-controller container restarts:
1. Snapshot history delta stats (the HOZZÁADOTT (added), ÚJ FÁJL (new files), and VÁLTOZOTT (changed) columns show 0)
2. DB validation info (the ÉRVÉNYESÍTÉS (validation) column shows "–" instead of table counts)

## Bug 1: Snapshot history delta stats lost on restart

### Root cause

`LoadSnapshotHistory()` in `controller/internal/backup/backup.go` (~line 788) loads snapshots from the restic repos on startup but sets `HasStats: false`. restic's `snapshots` command returns only ID, time, paths, and tags, NOT the delta stats (`files_new`, `files_changed`, `data_added`); those are available only in the `restic backup` output during the actual backup run.

The template in `controller/internal/web/templates/backups.html` renders `0` when `HasStats` is false:

```html
<td class="mono">{{if .HasStats}}+{{.DataAdded}}{{else}}0{{end}}</td>
```

### Fix

Persist the `snapshotHistory` ring buffer to a JSON file on disk so delta stats survive restarts.

**Implementation steps:**

1. **Add persistence methods to Manager** in `controller/internal/backup/backup.go`:
   - Add a `snapshotHistoryFile` field to the `Manager` struct (set to e.g. `<dataPath>/snapshot-history.json`)
   - Add `saveSnapshotHistory()`: marshals `snapshotHistory` to JSON and writes it atomically (write to `.tmp`, then rename)
   - Add `loadSnapshotHistory()`: reads the JSON file and unmarshals it into `snapshotHistory`
   - Call `saveSnapshotHistory()` at the end of `appendSnapshotRecord()` (called after each backup)

2. **Update `LoadSnapshotHistory()`** in the same file:
   - First try `loadSnapshotHistory()` from the JSON file
   - If the file exists and loads successfully, merge with the restic repo snapshots:
     - Build a map of persisted records by `SnapshotID`
     - For any restic snapshot NOT in the persisted map, add it with `HasStats: false`
     - This preserves delta stats for recent snapshots while still picking up very old ones from restic
   - If the file doesn't exist (first run), fall back to the current restic-only loading

3. **Wire up the file path** in `NewManager()`:
   - Derive the data path from `cfg.Paths.SystemDataPath`, or use a fixed path like `/opt/docker/felhom-controller/data/snapshot-history.json`
   - The `data/` directory is a Docker volume, so it persists across container recreations
   - Check what `config.Config` has available; see `controller/internal/config/config.go` for the config struct. The controller's data directory is typically `/opt/docker/felhom-controller/data/`, which is a mounted volume (check `cmd/controller/main.go` for the data path).

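The persistence helpers from step 1 might look roughly like the sketch below. It is a minimal illustration, not the controller's actual code: the `SnapshotRecord` type and every field other than `SnapshotID` and `HasStats` are assumptions, and the helpers are written as free functions taking a path (rather than `Manager` methods) so the example stays self-contained. Returning `nil, nil` for a missing file is just one possible way to signal the first-run case.

```go
package main

import (
	"encoding/json"
	"errors"
	"os"
)

// SnapshotRecord is a hypothetical stand-in for the controller's record type.
// Only SnapshotID and HasStats are named in this task; the rest is assumed.
type SnapshotRecord struct {
	SnapshotID   string `json:"snapshot_id"`
	HasStats     bool   `json:"has_stats"`
	FilesNew     int    `json:"files_new"`
	FilesChanged int    `json:"files_changed"`
	DataAdded    int64  `json:"data_added"`
}

// saveSnapshotHistory marshals records to JSON and writes them atomically:
// the data goes to "<path>.tmp" first and is then renamed over path.
func saveSnapshotHistory(path string, records []SnapshotRecord) error {
	data, err := json.MarshalIndent(records, "", "  ")
	if err != nil {
		return err
	}
	tmp := path + ".tmp"
	if err := os.WriteFile(tmp, data, 0o644); err != nil {
		return err
	}
	return os.Rename(tmp, path)
}

// loadSnapshotHistory reads the JSON file back. A missing file is not treated
// as an error: it returns (nil, nil) so the caller can fall back to
// restic-only loading.
func loadSnapshotHistory(path string) ([]SnapshotRecord, error) {
	data, err := os.ReadFile(path)
	if errors.Is(err, os.ErrNotExist) {
		return nil, nil
	}
	if err != nil {
		return nil, err
	}
	var records []SnapshotRecord
	if err := json.Unmarshal(data, &records); err != nil {
		return nil, err
	}
	return records, nil
}
```

Writing to a sibling `.tmp` file and renaming it means a crash in the middle of a write leaves the previous history file intact, because the temporary file lives on the same filesystem as the target. In the real code, `saveSnapshotHistory` would presumably be called from `appendSnapshotRecord()` with `snapshotHistoryFile` as the path.
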
**Key file paths to check:**
- `controller/internal/backup/backup.go`: Manager struct, LoadSnapshotHistory, appendSnapshotRecord
- `controller/internal/config/config.go`: Config struct for data paths
- `controller/cmd/controller/main.go`: where NewManager is called, what paths are available

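The merge described in step 2 of the implementation steps could be sketched as follows. `SnapshotRecord` and `mergeSnapshotHistory` are hypothetical names introduced for illustration, and the record fields beyond `SnapshotID` and `HasStats` are assumed. Sorting by time and trimming to the ring buffer's capacity are deliberately left out, since the real types and limits live in `backup.go`.

```go
package main

// SnapshotRecord is a hypothetical stand-in for the controller's record type.
// Only SnapshotID and HasStats are named in this task; the rest is assumed.
type SnapshotRecord struct {
	SnapshotID   string
	HasStats     bool
	FilesNew     int
	FilesChanged int
	DataAdded    int64
}

// mergeSnapshotHistory keeps every persisted record (including its delta
// stats) and appends restic snapshots that were not persisted, marking them
// HasStats: false because restic cannot supply their stats.
func mergeSnapshotHistory(persisted, fromRestic []SnapshotRecord) []SnapshotRecord {
	known := make(map[string]struct{}, len(persisted))
	merged := make([]SnapshotRecord, 0, len(persisted)+len(fromRestic))

	for _, r := range persisted {
		known[r.SnapshotID] = struct{}{}
		merged = append(merged, r)
	}
	for _, s := range fromRestic {
		if _, ok := known[s.SnapshotID]; ok {
			continue // persisted copy wins: it carries the stats
		}
		s.HasStats = false
		merged = append(merged, s)
		known[s.SnapshotID] = struct{}{} // also guards against duplicate restic entries
	}
	return merged
}
```

With a persisted record for snapshot `a` and restic reporting `a` and `b`, the result holds `a` with its original stats followed by `b` with `HasStats: false`, which is what the template needs to decide whether to render real values.
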
## Bug 2: DB validation missing after restart

### Root cause

In `GetFullStatus()` in `controller/internal/backup/backup.go` (~lines 973-992), when `LastDBDump` is nil (always nil after restart), a synthetic `LastDBDump` is created from `DumpFiles`. But the synthesized `DumpResult` does NOT copy the `Validation` field from `DumpFileInfo`:

```go
// Current code (BUG: missing Validation)
results = append(results, DumpResult{
	DB:       DiscoveredDB{StackName: f.StackName, DBType: f.DBType, ContainerName: f.StackName},
	FilePath: f.FileName,
	Size:     f.Size,
})
```

The `DumpResult` struct has a `Validation DumpValidation` field (see `controller/internal/backup/dbdump.go:41`), but it's left zero-valued (`Valid: false, Error: ""`). The template then renders "–" for this case.

Meanwhile, `ListDumpFiles()` in `dbdump.go:342` DOES call `ValidateDump()` on each file and populates `DumpFileInfo.Validation` correctly. The data is there; it's just not copied into the synthesized `DumpResult`.

### Fix

This is a one-line fix. In `GetFullStatus()` (~line 978), add `Validation: f.Validation` to the synthesized `DumpResult`:

```go
results = append(results, DumpResult{
	DB:         DiscoveredDB{StackName: f.StackName, DBType: f.DBType, ContainerName: f.StackName},
	FilePath:   f.FileName,
	Size:       f.Size,
	Validation: f.Validation, // <-- ADD THIS LINE
})
```

**File:** `controller/internal/backup/backup.go`, inside `GetFullStatus()`, in the "Synthesize LastDBDump from DumpFiles" block.

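To see the effect of the fix in isolation, the synthesis loop can be modelled with simplified types. The struct shapes below are assumptions based only on the field names used in this section (`TableCount` in particular is a guess at how table counts are stored), and `synthesizeResults` is a hypothetical helper that mirrors the loop in `GetFullStatus()`, not a function from the codebase.

```go
package main

// Simplified, assumed shapes of the types named in this section.
type DumpValidation struct {
	Valid      bool
	Error      string
	TableCount int // assumed field; the real struct may differ
}

type DumpFileInfo struct {
	StackName  string
	DBType     string
	FileName   string
	Size       int64
	Validation DumpValidation
}

type DiscoveredDB struct {
	StackName     string
	DBType        string
	ContainerName string
}

type DumpResult struct {
	DB         DiscoveredDB
	FilePath   string
	Size       int64
	Validation DumpValidation
}

// synthesizeResults mirrors the "Synthesize LastDBDump from DumpFiles" loop,
// including the Validation copy that the fix adds.
func synthesizeResults(files []DumpFileInfo) []DumpResult {
	results := make([]DumpResult, 0, len(files))
	for _, f := range files {
		results = append(results, DumpResult{
			DB:         DiscoveredDB{StackName: f.StackName, DBType: f.DBType, ContainerName: f.StackName},
			FilePath:   f.FileName,
			Size:       f.Size,
			Validation: f.Validation, // without this line, Validation stays zero-valued
		})
	}
	return results
}
```

Because Go zero-initializes omitted struct fields, leaving out the `Validation` line compiles cleanly and silently yields `Valid: false`, which is why the bug only shows up in the rendered table.
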
## Build & Deploy

Version: **v0.15.2**

### Build workflow
```bash
SSH=/c/Windows/System32/OpenSSH/ssh.exe
# 1. Commit & push
cd e:/git/deploy-felhom-compose
git add -A && git commit -m "v0.15.2: Fix snapshot stats and DB validation loss on restart" && git push
# 2. Build
$SSH kisfenyo@192.168.0.180 "cd ~/build/felhom-controller && git -C ~/git/deploy-felhom-compose pull && ./build.sh v0.15.2 --push"
# 3. Deploy
$SSH kisfenyo@192.168.0.162 "cd /opt/docker/felhom-controller && sudo docker pull gitea.dooplex.hu/admin/felhom-controller:v0.15.2 && sudo sed -i 's|image: gitea.dooplex.hu/admin/felhom-controller:.*|image: gitea.dooplex.hu/admin/felhom-controller:v0.15.2|' docker-compose.yml && sudo docker compose up -d"
# 4. Verify
$SSH kisfenyo@192.168.0.162 "docker ps --filter name=felhom-controller --format '{{.Image}} {{.Status}}'"
```

### Compile check
Always run `go build ./...` in `controller/` before committing to ensure no compile errors.

## Documentation

Add a CHANGELOG.md entry at the top (under `## Changelog`). Read the first 30 lines of the file to see the format, then insert a new entry. Example:

```markdown
### What was just completed (2026-02-19 session 52)
- **v0.15.2 — Fix data loss on container restart (2 bugs):**

**Bug 1:** Snapshot history delta stats (HOZZÁADOTT, ÚJ FÁJL, VÁLTOZOTT) showed 0 after container restart because restic doesn't store these stats — they were only in memory. Fixed by persisting the snapshot history ring buffer to `data/snapshot-history.json`. On startup, persisted stats are merged with restic repo snapshots.

**Bug 2:** DB validation (ÉRVÉNYESÍTÉS column) showed "–" after restart because the synthesized `LastDBDump.Results` didn't copy `Validation` from `DumpFileInfo`. One-line fix: added `Validation: f.Validation` to the synthesized `DumpResult` in `GetFullStatus()`.

**Files modified:** `internal/backup/backup.go`
```

Update the version in `C:\Users\User\.claude\projects\e--git\memory\MEMORY.md` to `v0.15.2`.

## Verification

After deploying v0.15.2:
1. Navigate to /backups and verify that the Pillanatképek (snapshots) table shows actual values (this applies only if a backup has been run before the restart)
2. Restart the container: `sudo docker compose restart`
3. Refresh /backups; Pillanatképek should still show the stats from before the restart
4. The Adatbázisok (databases) table should show table counts (e.g., "42 tábla", meaning 42 tables) in the ÉRVÉNYESÍTÉS column, not "–"