TASK: Bugfix — Storage Initialization (FormatAndMount)
Version: 0.11.4 → 0.11.5
Priority: Fix all 3 bugs + add safety improvements before testing format again.
Context
Storage initialization wizard (scan → select disk → format → mount → register) works up to the partitioning step. Three bugs prevent completion. Fix all three in one pass.
Test environment: demo-felhom.eu, /dev/sdb = 931.5 GB USB HDD (HD710 PRO),
has existing GPT partition table with one partition sdb1 (no filesystem).
Bug 1: sfdisk fails with "unsupported command" (CURRENT BLOCKER)
Error output
Old situation:
Device          Start        End    Sectors   Size Type
/host-dev/sdb1   2048 1953523711 1953521664 931.5G Linux filesystem
>>> Script header accepted.
>>> line 2: unsupported command
Hiba: exit status 1
Root cause
Three issues in format_linux.go line ~10225:
sfdiskInput := "label: gpt\n,,,L\n"
cmd := exec.Command("sfdisk", HostDevicePath(req.DevicePath))
- The ",,,L" line is rejected. sfdisk's positional fields are start, size, type, bootable, so the third comma most likely puts L in the bootable field rather than the type field. On GPT the type can simply be omitted (it defaults to Linux filesystem) or given as the full GUID.
- No --force flag. sdb already has a GPT table with sdb1. sfdisk tries to apply the script as a delta to the existing layout, not as a fresh layout.
- No wipefs before sfdisk. Existing partition signatures confuse sfdisk.
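As a reference for the script format, here is a minimal Go sketch that builds a single-partition GPT script. The helper name gptSinglePartitionScript is hypothetical and does not exist in the codebase; the sketch only builds the string and does not run sfdisk.

```go
package main

import (
	"fmt"
	"strings"
)

// gptSinglePartitionScript is a hypothetical helper (not in the codebase).
// It returns an sfdisk script that creates a GPT label with one partition.
// Each partition line uses the positional fields: start, size, type, bootable.
// An empty start means the first free aligned sector, an empty size means
// "as much as possible", and an omitted type leaves sfdisk's default.
func gptSinglePartitionScript() string {
	lines := []string{
		"label: gpt", // header line: request a GPT disklabel
		",,",         // start=default, size=default, type omitted
	}
	return strings.Join(lines, "\n") + "\n"
}

func main() {
	// %q makes the newlines visible in the output.
	fmt.Printf("%q\n", gptSinglePartitionScript())
}
```

The returned string is what would be fed to sfdisk on stdin.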
Fix in controller/internal/storage/format_linux.go
Find this block (around line 10222–10230):
if req.CreatePartition {
    send("partitioning", "Partíció létrehozása...", 15)
    sfdiskInput := "label: gpt\n,,,L\n"
    cmd := exec.Command("sfdisk", HostDevicePath(req.DevicePath))
    cmd.Stdin = strings.NewReader(sfdiskInput)
    if out, err := cmd.CombinedOutput(); err != nil {
        return "", fail("partitioning", "Partícionálás sikertelen: "+string(out), err)
    }
Replace with:
if req.CreatePartition {
    send("partitioning", fmt.Sprintf("wipefs -a %s ...", HostDevicePath(req.DevicePath)), 12)
    // Wipe existing partition table and filesystem signatures first
    _ = exec.Command("wipefs", "-a", HostDevicePath(req.DevicePath)).Run()
    time.Sleep(500 * time.Millisecond)
    // Create GPT with single partition spanning whole disk
    // ",," = start=default, size=default(fill disk), type=default(Linux filesystem GUID)
    // --force: overwrite even if device appears busy
    // --wipe always: wipe filesystem signatures from newly created partitions
    send("partitioning", fmt.Sprintf("sfdisk --force --wipe always %s ...", HostDevicePath(req.DevicePath)), 15)
    sfdiskInput := "label: gpt\n,,\n"
    cmd := exec.Command("sfdisk", "--force", "--wipe", "always", HostDevicePath(req.DevicePath))
    cmd.Stdin = strings.NewReader(sfdiskInput)
    if out, err := cmd.CombinedOutput(); err != nil {
        return "", fail("partitioning", "Partícionálás sikertelen: "+string(out), err)
    }
Bug 2: mount mountPath will fail (NEXT BLOCKER after Bug 1)
Current code (around line 10288–10290)
if out, err := exec.Command("mount", mountPath).CombinedOutput(); err != nil {
    return "", fail("mounting", "Csatlakoztatás sikertelen: "+string(out), err)
}
Root cause
mount /mnt/hdd_1 works by looking up /mnt/hdd_1 in the process's /etc/fstab to find
which device to mount. But inside the container, /etc/fstab is Docker's auto-generated fstab
(not the host's). The UUID entry was written to /host-fstab (the host's real fstab).
So mount /mnt/hdd_1 will fail with "can't find /mnt/hdd_1 in /etc/fstab" or similar.
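The lookup problem can be illustrated with a simplified Go sketch of an fstab search. The function fstabSource and both sample fstab contents are made up for illustration (including the UUID); the real mount command does more than this.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// fstabSource returns the device field (first column) of the fstab entry
// whose mount point (second column) equals mountPoint.
// This is a simplified model of the lookup that "mount <dir>" relies on.
func fstabSource(fstab, mountPoint string) (string, bool) {
	sc := bufio.NewScanner(strings.NewReader(fstab))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blank lines and comments
		}
		fields := strings.Fields(line)
		if len(fields) >= 2 && fields[1] == mountPoint {
			return fields[0], true
		}
	}
	return "", false
}

func main() {
	// Illustrative contents only; the UUID is invented.
	hostFstab := "UUID=1234-abcd /mnt/hdd_1 ext4 defaults,nofail,noatime 0 2\n"
	containerFstab := "# container fstab without the new entry\n"

	_, inContainer := fstabSource(containerFstab, "/mnt/hdd_1")
	src, onHost := fstabSource(hostFstab, "/mnt/hdd_1")
	fmt.Println(inContainer, onHost, src)
}
```

The entry exists only in the host's file, so a lookup against the container's own fstab finds nothing to mount.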
Fix in controller/internal/storage/format_linux.go
Find this line (around line 10288):
if out, err := exec.Command("mount", mountPath).CombinedOutput(); err != nil {
Replace with:
// Mount by device path explicitly: the container's /etc/fstab is not the host fstab,
// so "mount /mnt/hdd_1" (fstab lookup) won't work.
send("mounting", fmt.Sprintf("mount -t ext4 -o defaults,noatime %s %s ...", HostDevicePath(partDev), mountPath), 70)
if out, err := exec.Command("mount", "-t", "ext4", "-o", "defaults,noatime",
    HostDevicePath(partDev), mountPath).CombinedOutput(); err != nil {
The fstab entry in /host-fstab still ensures persistence across host reboots.
This explicit mount handles the immediate "mount it right now" operation.
Bug 3: Mount namespace isolation — mount won't be visible on host (RESTART BLOCKER)
Root cause
Even with privileged: true, mount inside a container operates in the container's
mount namespace. The host's mount namespace does NOT see the mount. Consequences:
- After controller container restart, the mount is gone
- Other containers can't access /mnt/hdd_1
- The bind mount /mnt:/mnt:rw shares existing host mounts INTO the container, but new mounts created inside the container don't propagate BACK to the host
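Whether a mount point is shared can be read from the optional fields of /proc/self/mountinfo, where a shared mount carries a "shared:N" tag. The Go sketch below is illustrative only: isSharedMount is a hypothetical helper, and the sample lines are made up rather than taken from the demo host.

```go
package main

import (
	"fmt"
	"strings"
)

// isSharedMount reports whether mountPoint appears in the given mountinfo
// text with a "shared:N" tag among its optional fields.
// mountinfo line layout (see proc(5)):
//   ID parentID major:minor root mountPoint options [optional...] - fstype source superOptions
func isSharedMount(mountinfo, mountPoint string) bool {
	for _, line := range strings.Split(mountinfo, "\n") {
		fields := strings.Fields(line)
		if len(fields) < 7 || fields[4] != mountPoint {
			continue
		}
		// Optional fields start at index 6 and end at the "-" separator.
		for _, f := range fields[6:] {
			if f == "-" {
				break
			}
			if strings.HasPrefix(f, "shared:") {
				return true
			}
		}
	}
	return false
}

func main() {
	// Made-up sample; real code would read /proc/self/mountinfo instead.
	sample := "36 25 8:1 / /mnt rw,relatime shared:1 - ext4 /dev/sda1 rw\n" +
		"40 25 8:17 / /data rw,relatime - ext4 /dev/sdb1 rw\n"
	fmt.Println(isSharedMount(sample, "/mnt"), isSharedMount(sample, "/data"))
}
```

A path that is not itself a mount point has no line of its own in mountinfo, so this check returns false for it.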
Fix: Change /mnt volume to use rshared mount propagation
3a. controller/docker-compose.yml
Find this line:
# All external storage — /mnt/* for multi-storage + restore
- /mnt:/mnt:rw
Replace with:
# All external storage — rshared propagation so mounts created inside
# the container (disk init) propagate to the host and vice versa
- type: bind
  source: /mnt
  target: /mnt
  bind:
    propagation: rshared
Important: This uses Docker Compose long-form volume syntax. The rest of the volumes
can stay in short form. Only /mnt needs propagation.
3b. scripts/docker-setup.sh — Add mount propagation setup
Find the section where the script does final setup steps (after Docker installation, before or after compose generation). Add:
# Enable shared mount propagation on /mnt (required for controller disk init)
# This allows mounts created inside the controller container to propagate to the host
log_info "Configuring mount propagation for /mnt..."
mount --make-rshared /mnt 2>/dev/null || mount --make-shared /mnt 2>/dev/null || true
Also add a comment near the controller compose generation (if any) explaining this requirement.
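If docker-setup.sh doesn't generate the controller compose, just add the mount --make-rshared
to the node preparation section. It's idempotent and safe to run multiple times.
Note that mount --make-rshared only succeeds when /mnt is itself a mount point. If /mnt is a plain directory on the root filesystem, the command fails (silently here, because of || true), and propagation then depends on the root mount already being shared, which is the default on systemd hosts.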
Safety improvement 1: Post-mount verification
What
After mount succeeds (exit code 0), verify the mount is actually visible.
Where
In format_linux.go, right after the mount command succeeds and BEFORE the
send("mounting", "Csatlakoztatva..." line, add:
// Verify mount actually worked (don't just trust exit code).
// --mountpoint matches only an exact mount point; --target would fall back to
// the parent filesystem and succeed even if nothing is mounted at mountPath.
verifyOut, verifyErr := exec.Command("findmnt", "-n", "-o", "SOURCE", "--mountpoint", mountPath).Output()
if verifyErr != nil || strings.TrimSpace(string(verifyOut)) == "" {
    return "", fail("mounting", "A csatlakoztatás nem ellenőrizhető: a mount parancs sikerült, de a meghajtó nem látható a rendszerben", fmt.Errorf("mount point %s not found after mount", mountPath))
}
Safety improvement 2: Use ASCII mount name for ext4 filesystem label
What
The current code uses req.Label (user-provided display label like "Külső HDD 1TB") for the
ext4 -L label. ext4 labels are limited to 16 BYTES. Hungarian accented characters (ü, ő, é) are
2 bytes each in UTF-8, so "Külső HDD 1TB" already takes 15 bytes, and a slightly longer label would be truncated, possibly mid-character.
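The byte-versus-character difference is easy to check in Go, where len counts bytes. The sketch below also shows one way to cut at a character boundary; truncateUTF8 is a hypothetical helper and the second label is an invented example. The fix in this task avoids the problem differently, by using the ASCII mount name.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateUTF8 shortens s to at most maxBytes bytes without splitting a
// multi-byte UTF-8 character. Hypothetical helper, not in the codebase.
func truncateUTF8(s string, maxBytes int) string {
	if maxBytes <= 0 {
		return ""
	}
	if len(s) <= maxBytes {
		return s
	}
	cut := maxBytes
	// Back up until the byte at the cut position starts a new character.
	for cut > 0 && !utf8.RuneStart(s[cut]) {
		cut--
	}
	return s[:cut]
}

func main() {
	label := "Külső HDD 1TB"
	fmt.Println(len(label), utf8.RuneCountInString(label)) // 15 13

	long := "Mentések külső" // invented label: 14 characters, 17 bytes
	fmt.Println(utf8.ValidString(long[:16]))              // false: cut splits "ő"
	fmt.Println(truncateUTF8(long, 16))                   // Mentések küls
}
```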
Where
In format_linux.go, find the label preparation block (around line 10249–10254):
label := req.Label
if label == "" {
    label = req.MountName
}
if len(label) > 16 {
    label = label[:16]
}
Replace with:
// Use ASCII-safe mount name for ext4 filesystem label (16-byte limit).
// The display label (req.Label) stays in settings.json for the UI.
fsLabel := req.MountName
if len(fsLabel) > 16 {
    fsLabel = fsLabel[:16]
}
Then update the mkfs.ext4 call right below to use fsLabel instead of label:
mkfsCmd := exec.Command("mkfs.ext4", "-L", fsLabel, "-F", HostDevicePath(partDev))
Safety improvement 3: Smart partition handling (skip repartition when unnecessary)
What
The scan shows sdb has 1 partition (sdb1) with no filesystem. The JS always sends
CreatePartition: true (because disk.CreatePartition is undefined on the BlockDevice
struct, so undefined !== false evaluates to true in JS).
For a disk that already has exactly one partition with no filesystem, we should skip the destructive repartition step and just format the existing partition directly.
Where
In handlers.go, in storageInitAPIHandler, AFTER building fmtReq (around line 14175–14180)
and BEFORE the go func() goroutine, add:
// Smart partition: if device is a whole disk with exactly 1 partition
// with no filesystem, skip repartitioning and just format the existing partition
if fmtReq.CreatePartition {
    result, scanErr := storage.ScanDisks()
    if scanErr == nil {
        for _, disk := range result.AvailableDisks {
            if disk.Path == req.DevicePath && len(disk.Partitions) == 1 && disk.Partitions[0].FSType == "" {
                s.logger.Printf("[INFO] Disk %s has 1 empty partition (%s), skipping repartition",
                    req.DevicePath, disk.Partitions[0].Path)
                fmtReq.DevicePath = disk.Partitions[0].Path // e.g., "/dev/sdb1"
                fmtReq.CreatePartition = false
                break
            }
        }
    }
}
This way, for demo sdb (which has sdb1 with no FS), it will:
- Set DevicePath to /dev/sdb1
- Set CreatePartition to false
- Skip wipefs + sfdisk entirely
- Go straight to mkfs.ext4 /host-dev/sdb1
Note: The wipefs+sfdisk fix (Bug 1) is still needed as fallback for truly unpartitioned disks or disks with multiple/incompatible partitions.
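The decision can be isolated into a small pure function, which makes the three cases easy to test without touching a real disk. This is a sketch under assumptions: Disk, Partition and formatTarget are simplified, hypothetical stand-ins, not the actual scan types or handler code.

```go
package main

import "fmt"

// Partition and Disk are simplified, hypothetical stand-ins for the scan types.
type Partition struct {
	Path   string
	FSType string
}

type Disk struct {
	Path       string
	Partitions []Partition
}

// formatTarget decides what to format for the requested device.
// If the disk has exactly one partition with no filesystem, that partition
// is formatted directly; in every other case the whole disk is repartitioned.
func formatTarget(disks []Disk, devicePath string) (target string, createPartition bool) {
	for _, d := range disks {
		if d.Path != devicePath {
			continue
		}
		if len(d.Partitions) == 1 && d.Partitions[0].FSType == "" {
			return d.Partitions[0].Path, false
		}
		break
	}
	return devicePath, true
}

func main() {
	disks := []Disk{
		{Path: "/dev/sdb", Partitions: []Partition{{Path: "/dev/sdb1"}}}, // one empty partition
		{Path: "/dev/sdc"}, // no partition table entries
	}
	fmt.Println(formatTarget(disks, "/dev/sdb"))
	fmt.Println(formatTarget(disks, "/dev/sdc"))
}
```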
Safety improvement 4: Descriptive progress messages
What
Include executed command details in progress messages for remote debugging. The progress messages show in the UI and get logged by the handler.
Where
Throughout format_linux.go, update the send() calls to include command info.
Examples already shown in the Bug 1 and Bug 2 fixes above. Also update:
For the mkfs step:
send("formatting", fmt.Sprintf("mkfs.ext4 -L %s -F %s ...", fsLabel, HostDevicePath(partDev)), 30)
For the blkid step (around line 10274):
send("mounting", fmt.Sprintf("UUID lekérése: blkid %s ...", HostDevicePath(partDev)), 65)
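One possible way to keep progress messages from drifting away from the commands they describe is to derive both from the same argument list. The helper below is a hypothetical sketch, not existing code; it only builds the command and its message and does not run anything.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// commandWithMessage builds an exec.Cmd and a matching progress string from
// the same arguments. Hypothetical helper; arguments containing spaces are
// not quoted in the message, so it is for display only.
func commandWithMessage(name string, args ...string) (*exec.Cmd, string) {
	cmd := exec.Command(name, args...)
	// cmd.Args holds the command name followed by its arguments.
	return cmd, strings.Join(cmd.Args, " ") + " ..."
}

func main() {
	_, msg := commandWithMessage("mkfs.ext4", "-L", "hdd_1", "-F", "/host-dev/sdb1")
	fmt.Println(msg)
}
```

The returned message could be passed to send() and the returned command executed afterwards.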
Summary: All changes by file
controller/internal/storage/format_linux.go (5 changes)
- Partition block: Add wipefs -a; change sfdisk input ",,,L" → ",,"; add --force --wipe always flags
- Mount block: Change mount mountPath → mount -t ext4 -o defaults,noatime HostDevicePath(partDev) mountPath
- After mount: Add findmnt verification
- Label: Use req.MountName (ASCII) instead of req.Label (UTF-8) for mkfs.ext4 -L
- Progress messages: Include command details in send() calls
controller/docker-compose.yml (1 change)
- Change /mnt:/mnt:rw to long-form syntax with propagation: rshared
controller/internal/web/handlers.go (1 change)
- In storageInitAPIHandler: Add smart partition detection before launching goroutine
scripts/docker-setup.sh (1 change)
- Add mount --make-rshared /mnt to node preparation section
Build & deploy procedure
# 1. On the host FIRST (before restarting controller):
sudo mount --make-rshared /mnt
# 2. Build new image with fixes (normal build process)
# 3. Deploy
cd /opt/docker/felhom-controller
sudo docker compose up -d
# 4. Verify container sees /host-dev
docker exec felhom-controller sh -c 'ls /host-dev/sd*'
# 5. Verify rshared propagation is active
docker inspect felhom-controller --format '{{range .Mounts}}{{if eq .Destination "/mnt"}}Propagation={{.Propagation}}{{end}}{{end}}'
# Should show: Propagation=rshared
# 6. Test storage init wizard:
# - Scan → sdb appears
# - Select sdb → configure hdd_1 → type FORMÁZÁS
# - Watch progress panel — should show command details
# - Should complete successfully
# 7. Verify mount on HOST (proves propagation):
findmnt /mnt/hdd_1
# Should show /dev/sdb1 mounted at /mnt/hdd_1
# 8. Verify fstab entry:
grep hdd_1 /etc/fstab
# Should show UUID=... /mnt/hdd_1 ext4 defaults,nofail,noatime 0 2
# 9. Verify storage registered in settings:
# Visit Settings page → Adattárolók → /mnt/hdd_1 should appear
# 10. Restart controller — verify mount survives:
docker restart felhom-controller
docker exec felhom-controller ls /mnt/hdd_1/
# Should show: storage/ Dokumentumok/
What NOT to change
- Dockerfile: packages already correct (fdisk, e2fsprogs, util-linux, rsync, parted)
- scan_linux.go: scan works correctly after v0.11.1 fixes
- safety_linux.go / safety.go: system disk detection works
- Template/JS: wizard UI works fine; CreatePartition default-true is handled in handler