# TASK: Bugfix - Storage Initialization (FormatAndMount)

**Version:** 0.11.4 → 0.11.5

**Priority:** Fix all 3 bugs + add safety improvements before testing format again.

## Context

The storage initialization wizard (scan → select disk → format → mount → register) works up to the partitioning step. Three bugs prevent completion. Fix all three in one pass.

**Test environment:** demo-felhom.eu, `/dev/sdb` = 931.5 GB USB HDD (HD710 PRO), has existing GPT partition table with one partition `sdb1` (no filesystem).

---

## Bug 1: sfdisk fails with "unsupported command" (CURRENT BLOCKER)

### Error output

```
Old situation:

Device         Start        End    Sectors   Size Type
/host-dev/sdb1  2048 1953523711 1953521664 931.5G Linux filesystem

>>> Script header accepted.
>>> line 2: unsupported command
Hiba: exit status 1
```

### Root cause

Three issues in `format_linux.go` line ~10225:

```go
sfdiskInput := "label: gpt\n,,,L\n"
cmd := exec.Command("sfdisk", HostDevicePath(req.DevicePath))
```

1. **`,,,L` type shorthand fails on GPT**: this sfdisk version doesn't accept `L` as type for GPT disklabel. For GPT, sfdisk needs the full GUID or no type (defaults to Linux filesystem).
2. **No `--force` flag**: sdb already has a GPT table with sdb1. sfdisk tries to apply the script as a delta to the existing layout, not as a fresh layout.
3. **No `wipefs` before sfdisk**: existing partition signatures confuse sfdisk.
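To make the script line easier to reason about, it can be broken down field by field. The following is a minimal sketch, not code from the repository: `describeScriptLine` is a hypothetical helper, it only handles comma separators, and it assumes the positional field order given in the sfdisk man page (start, size, type, bootable).

```go
package main

import (
	"fmt"
	"strings"
)

// sfdiskFieldNames is the positional order assumed from the sfdisk man page.
var sfdiskFieldNames = []string{"start", "size", "type", "bootable"}

// describeScriptLine (hypothetical helper) maps each comma-separated value
// of an sfdisk script line to the name of the positional field it occupies.
// Values beyond the fourth field are ignored.
func describeScriptLine(line string) map[string]string {
	fields := map[string]string{}
	for i, value := range strings.Split(line, ",") {
		if i >= len(sfdiskFieldNames) {
			break
		}
		fields[sfdiskFieldNames[i]] = strings.TrimSpace(value)
	}
	return fields
}

func main() {
	// Compare the original line, a three-field variant, and the type-less line.
	for _, line := range []string{",,,L", ",,L", ",,"} {
		f := describeScriptLine(line)
		fmt.Printf("%q: start=%q size=%q type=%q bootable=%q\n",
			line, f["start"], f["size"], f["type"], f["bootable"])
	}
}
```

Under that assumption, the `L` in `,,,L` sits in the fourth (bootable) position rather than the type position, which may be a further reason line 2 is rejected. A line of `,,` leaves start, size, and type empty so sfdisk falls back to its defaults.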
### Fix in `controller/internal/storage/format_linux.go`

Find this block (around line 10222–10230):

```go
if req.CreatePartition {
	send("partitioning", "Partíció létrehozása...", 15)

	sfdiskInput := "label: gpt\n,,,L\n"
	cmd := exec.Command("sfdisk", HostDevicePath(req.DevicePath))
	cmd.Stdin = strings.NewReader(sfdiskInput)
	if out, err := cmd.CombinedOutput(); err != nil {
		return "", fail("partitioning", "Partícionálás sikertelen: "+string(out), err)
	}
```

Replace with:

```go
if req.CreatePartition {
	send("partitioning", fmt.Sprintf("wipefs -a %s ...", HostDevicePath(req.DevicePath)), 12)

	// Wipe existing partition table and filesystem signatures first
	_ = exec.Command("wipefs", "-a", HostDevicePath(req.DevicePath)).Run()
	time.Sleep(500 * time.Millisecond)

	// Create GPT with single partition spanning whole disk
	// ",," = start=default, size=default(fill disk), type=default(Linux filesystem GUID)
	// --force: overwrite even if device appears busy
	// --wipe always: wipe filesystem signatures from newly created partitions
	send("partitioning", fmt.Sprintf("sfdisk --force --wipe always %s ...", HostDevicePath(req.DevicePath)), 15)

	sfdiskInput := "label: gpt\n,,\n"
	cmd := exec.Command("sfdisk", "--force", "--wipe", "always", HostDevicePath(req.DevicePath))
	cmd.Stdin = strings.NewReader(sfdiskInput)
	if out, err := cmd.CombinedOutput(); err != nil {
		return "", fail("partitioning", "Partícionálás sikertelen: "+string(out), err)
	}
```

The replacement uses `fmt.Sprintf` and `time.Sleep`, so add `fmt` and `time` to the file's imports if they are not already there.

---

## Bug 2: `mount mountPath` will fail (NEXT BLOCKER after Bug 1)

### Current code (around line 10288–10290)

```go
if out, err := exec.Command("mount", mountPath).CombinedOutput(); err != nil {
	return "", fail("mounting", "Csatlakoztatás sikertelen: "+string(out), err)
}
```

### Root cause

`mount /mnt/hdd_1` works by looking up `/mnt/hdd_1` in the process's `/etc/fstab` to find which device to mount. But inside the container, `/etc/fstab` is the container's own fstab (not the host's).
The UUID entry was written to `/host-fstab` (the host's real fstab). So `mount /mnt/hdd_1` will fail with "can't find /mnt/hdd_1 in /etc/fstab" or similar.

### Fix in `controller/internal/storage/format_linux.go`

Find this line (around line 10288):

```go
if out, err := exec.Command("mount", mountPath).CombinedOutput(); err != nil {
```

Replace with:

```go
// Mount by device path explicitly: the container's /etc/fstab != host fstab,
// so "mount /mnt/hdd_1" (fstab lookup) won't work.
send("mounting", fmt.Sprintf("mount -t ext4 %s %s ...", HostDevicePath(partDev), mountPath), 70)
if out, err := exec.Command("mount", "-t", "ext4", "-o", "defaults,noatime", HostDevicePath(partDev), mountPath).CombinedOutput(); err != nil {
```

The fstab entry in `/host-fstab` still ensures persistence across host reboots. This explicit mount handles the immediate "mount it right now" operation.

---

## Bug 3: Mount namespace isolation - mount won't be visible on host (RESTART BLOCKER)

### Root cause

Even with `privileged: true`, `mount` inside a container operates in the container's mount namespace. The host's mount namespace does NOT see the mount. Consequences:

- After controller container restart, the mount is gone
- Other containers can't access `/mnt/hdd_1`
- The bind mount `- /mnt:/mnt:rw` shares existing host mounts INTO the container, but new mounts created inside the container don't propagate BACK to the host

### Fix: Change `/mnt` volume to use `rshared` mount propagation

#### 3a. `controller/docker-compose.yml`

Find this line:

```yaml
# All external storage — /mnt/* for multi-storage + restore
- /mnt:/mnt:rw
```

Replace with:

```yaml
# All external storage: rshared propagation so mounts created inside
# the container (disk init) propagate to the host and vice versa
- type: bind
  source: /mnt
  target: /mnt
  bind:
    propagation: rshared
```

**Important:** This uses Docker Compose long-form volume syntax. The rest of the volumes can stay in short form. Only `/mnt` needs propagation.

#### 3b. `scripts/docker-setup.sh`: Add mount propagation setup

Find the section where the script does final setup steps (after Docker installation, before or after compose generation). Add:

```bash
# Enable shared mount propagation on /mnt (required for controller disk init)
# This allows mounts created inside the controller container to propagate to the host
log_info "Configuring mount propagation for /mnt..."
mount --make-rshared /mnt 2>/dev/null || mount --make-shared /mnt 2>/dev/null || true
```

**Also** add a comment near the controller compose generation (if any) explaining this requirement. If `docker-setup.sh` doesn't generate the controller compose, just add the `mount --make-rshared` to the node preparation section. It's idempotent and safe to run multiple times. Note that `mount --make-rshared` only succeeds when `/mnt` is itself a mount point. If `/mnt` is a plain directory, the command fails silently here (errors are discarded), and propagation then depends on the parent mount, which is typically already shared on systemd hosts.

---

## Safety improvement 1: Post-mount verification

### What

After mount succeeds (exit code 0), verify the mount is actually visible.

### Where

In `format_linux.go`, right after the mount command succeeds and BEFORE the `send("mounting", "Csatlakoztatva..."` line, add:

```go
// Verify mount actually worked (don't just trust exit code).
// Use --mountpoint, not --target: --target falls back to the parent
// filesystem when mountPath is not itself a mount point, so it would
// report success even if nothing was mounted there.
verifyOut, verifyErr := exec.Command("findmnt", "-n", "-o", "SOURCE", "--mountpoint", mountPath).Output()
if verifyErr != nil || strings.TrimSpace(string(verifyOut)) == "" {
	return "", fail("mounting",
		"A csatlakoztatás nem ellenőrizhető: a mount parancs sikerült, de a meghajtó nem látható a rendszerben",
		fmt.Errorf("mount point %s not found after mount", mountPath))
}
```

---

## Safety improvement 2: Use ASCII mount name for ext4 filesystem label

### What

The current code uses `req.Label` (user-provided display label like "Külső HDD 1TB") for the ext4 `-L` label. ext4 labels are limited to 16 BYTES. Hungarian UTF-8 chars (ű, ó, é) are 2 bytes each, so "Külső HDD 1TB" is already 15 bytes, and a slightly longer label would exceed the limit or get truncated mid-character.
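To illustrate why byte-slicing a UTF-8 label is risky, here is a minimal sketch. `truncateLabel` is a hypothetical helper and not part of the codebase, and the sample label "Családi fényképek" is made up for the demonstration. It shows the byte length of the example label and a truncation that backs up to a character boundary instead of cutting a 2-byte character in half.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateLabel (hypothetical helper) shortens s to at most maxBytes bytes
// without splitting a multi-byte UTF-8 sequence.
func truncateLabel(s string, maxBytes int) string {
	if len(s) <= maxBytes {
		return s
	}
	cut := maxBytes
	// Back up until s[cut] is the first byte of a character, so that
	// s[:cut] ends exactly on a character boundary.
	for cut > 0 && !utf8.RuneStart(s[cut]) {
		cut--
	}
	return s[:cut]
}

func main() {
	label := "Külső HDD 1TB"
	fmt.Println("bytes:", len(label), "characters:", utf8.RuneCountInString(label))

	long := "Családi fényképek" // made-up label, longer than 16 bytes
	naive := long[:16]          // plain byte slice, as in the current code
	fmt.Println("naive slice valid UTF-8:", utf8.ValidString(naive))
	fmt.Println("boundary-aware result:", truncateLabel(long, 16))
}
```

Using the ASCII-only mount name sidesteps the problem entirely, since every character is then exactly one byte.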
### Where

In `format_linux.go`, find the label preparation block (around line 10249–10254):

```go
label := req.Label
if label == "" {
	label = req.MountName
}
if len(label) > 16 {
	label = label[:16]
}
```

Replace with:

```go
// Use ASCII-safe mount name for ext4 filesystem label (16-byte limit).
// The display label (req.Label) stays in settings.json for the UI.
fsLabel := req.MountName
if len(fsLabel) > 16 {
	fsLabel = fsLabel[:16]
}
```

Then update the mkfs.ext4 call right below to use `fsLabel` instead of `label`:

```go
mkfsCmd := exec.Command("mkfs.ext4", "-L", fsLabel, "-F", HostDevicePath(partDev))
```

---

## Safety improvement 3: Smart partition handling (skip repartition when unnecessary)

### What

The scan shows sdb has 1 partition (sdb1) with no filesystem. The JS always sends `CreatePartition: true` (because `disk.CreatePartition` is undefined on the `BlockDevice` struct, so `undefined !== false` evaluates to `true` in JS). For a disk that already has exactly one partition with no filesystem, we should skip the destructive repartition step and just format the existing partition directly.

### Where

In `handlers.go`, in `storageInitAPIHandler`, AFTER building `fmtReq` (around line 14175–14180) and BEFORE the `go func()` goroutine, add:

```go
// Smart partition: if device is a whole disk with exactly 1 partition
// with no filesystem, skip repartitioning and just format the existing partition
if fmtReq.CreatePartition {
	result, scanErr := storage.ScanDisks()
	if scanErr == nil {
		for _, disk := range result.AvailableDisks {
			if disk.Path == req.DevicePath && len(disk.Partitions) == 1 && disk.Partitions[0].FSType == "" {
				s.logger.Printf("[INFO] Disk %s has 1 empty partition (%s) - skipping repartition",
					req.DevicePath, disk.Partitions[0].Path)
				fmtReq.DevicePath = disk.Partitions[0].Path // e.g., "/dev/sdb1"
				fmtReq.CreatePartition = false
				break
			}
		}
	}
}
```

This way, for demo sdb (which has sdb1 with no FS), it will:

1. Set DevicePath to `/dev/sdb1`
2. Set CreatePartition to `false`
3. Skip wipefs + sfdisk entirely
4. Go straight to `mkfs.ext4 /host-dev/sdb1`

**Note:** The wipefs+sfdisk fix (Bug 1) is still needed as fallback for truly unpartitioned disks or disks with multiple/incompatible partitions.

---

## Safety improvement 4: Descriptive progress messages

### What

Include executed command details in progress messages for remote debugging. The progress messages show in the UI and get logged by the handler.

### Where

Throughout `format_linux.go`, update the `send()` calls to include command info. Examples already shown in the Bug 1 and Bug 2 fixes above. Also update:

For the mkfs step:

```go
send("formatting", fmt.Sprintf("mkfs.ext4 -L %s -F %s ...", fsLabel, HostDevicePath(partDev)), 30)
```

For the blkid step (around line 10274):

```go
send("mounting", fmt.Sprintf("UUID lekérése: blkid %s ...", HostDevicePath(partDev)), 65)
```

---

## Summary: All changes by file

### `controller/internal/storage/format_linux.go` (5 changes)

1. Partition block: Add `wipefs -a`; change sfdisk input `",,,L"` → `",,"`; add `--force --wipe always` flags
2. Mount block: Change `mount mountPath` → `mount -t ext4 -o defaults,noatime HostDevicePath(partDev) mountPath`
3. After mount: Add `findmnt` verification
4. Label: Use `req.MountName` (ASCII) instead of `req.Label` (UTF-8) for `mkfs.ext4 -L`
5. Progress messages: Include command details in `send()` calls

### `controller/docker-compose.yml` (1 change)

6. Change `/mnt:/mnt:rw` to long-form syntax with `propagation: rshared`

### `controller/internal/web/handlers.go` (1 change)

7. In `storageInitAPIHandler`: Add smart partition detection before launching goroutine

### `scripts/docker-setup.sh` (1 change)

8. Add `mount --make-rshared /mnt` to node preparation section

---

## Build & deploy procedure

```bash
# 1. On the host FIRST (before restarting controller):
sudo mount --make-rshared /mnt

# 2. Build new image with fixes (normal build process)

# 3. Deploy
cd /opt/docker/felhom-controller
sudo docker compose up -d

# 4. Verify container sees /host-dev
#    (run ls through sh -c so the glob expands inside the container, not on the host)
docker exec felhom-controller sh -c 'ls /host-dev/sd*'

# 5. Verify rshared propagation is active
docker inspect felhom-controller --format '{{range .Mounts}}{{if eq .Destination "/mnt"}}Propagation={{.Propagation}}{{end}}{{end}}'
# Should show: Propagation=rshared

# 6. Test storage init wizard:
#    - Scan → sdb appears
#    - Select sdb → configure hdd_1 → type FORMÁZÁS
#    - Watch progress panel: should show command details
#    - Should complete successfully

# 7. Verify mount on HOST (proves propagation):
findmnt /mnt/hdd_1
# Should show /dev/sdb1 mounted at /mnt/hdd_1

# 8. Verify fstab entry:
grep hdd_1 /etc/fstab
# Should show UUID=... /mnt/hdd_1 ext4 defaults,nofail,noatime 0 2

# 9. Verify storage registered in settings:
#    Visit Settings page → Adattárolók → /mnt/hdd_1 should appear

# 10. Restart controller and verify mount survives:
docker restart felhom-controller
docker exec felhom-controller ls /mnt/hdd_1/
# Should show: storage/ Dokumentumok/
```

---

## What NOT to change

- **Dockerfile**: packages already correct (fdisk, e2fsprogs, util-linux, rsync, parted)
- **scan_linux.go**: scan works correctly after v0.11.1 fixes
- **safety_linux.go / safety.go**: system disk detection works
- **Template/JS**: wizard UI works fine; `CreatePartition` default-true is handled in handler