deploy-felhom-compose/TASK.md

BUGFIX: /dev/sdb not accessible inside container

Problem

FormatAndMount fails with stat /dev/sdb: no such file or directory because block device nodes don't exist inside the container's /dev.

Even with privileged: true, Docker creates its own tmpfs at /dev with minimal device nodes. The explicit - /dev:/dev volume mount in docker-compose.yml is silently overridden by Docker's internal /dev tmpfs setup — docker inspect shows no bind mount for /dev.

Root Cause

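Docker creates a fresh tmpfs for /dev inside each container. The privileged: true flag relaxes cgroup device access (the kernel allows I/O to any device), but it does not keep the container's /dev in sync with the host: the device nodes are set up when the container is created, so a node that is missing at that point (for example a disk attached later) never appears. The bind mount - /dev:/dev conflicts with Docker's own /dev management and, as observed with docker inspect, gets silently dropped.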

Fix

Mount host /dev at a different path inside the container. The device nodes at /host-dev/sdb are real block devices that the kernel will allow I/O to (because privileged: true).

1. docker-compose.yml change

volumes:
  # ...existing...
  # Block devices — mounted at /host-dev (can't override Docker's /dev)
  - /dev:/host-dev:rw

Change - /dev:/dev to - /dev:/host-dev:rw

2. Go code: Add host device path constant

In internal/storage/ package (e.g., format_linux.go or a new paths.go):

import "strings"

const (
    // HostDevPath is where the host's /dev is mounted inside the container.
    // Docker overrides /dev with its own tmpfs, so we mount at /host-dev.
    HostDevPath = "/host-dev"
    
    // HostFstabPath is where the host's /etc/fstab is mounted.
    HostFstabPath = "/host-fstab"
)

// HostDevicePath converts a standard device path to the container-accessible path.
// "/dev/sdb" → "/host-dev/sdb"
// "/dev/sdb1" → "/host-dev/sdb1"
func HostDevicePath(devPath string) string {
    if strings.HasPrefix(devPath, "/dev/") {
        return HostDevPath + "/" + strings.TrimPrefix(devPath, "/dev/")
    }
    return devPath
}
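
To make the mapping concrete, here is a minimal, self-contained sketch that exercises HostDevicePath on a few inputs. DisplayDevicePath is a hypothetical inverse that is not part of the task; it is included only because blkid output, fstab and the UI keep using /dev/... names, so a reverse mapping may be handy.

```go
package main

import (
	"fmt"
	"strings"
)

// HostDevPath mirrors the constant from the task description.
const HostDevPath = "/host-dev"

// HostDevicePath is copied from the snippet above.
func HostDevicePath(devPath string) string {
	if strings.HasPrefix(devPath, "/dev/") {
		return HostDevPath + "/" + strings.TrimPrefix(devPath, "/dev/")
	}
	return devPath
}

// DisplayDevicePath is a hypothetical inverse (an assumption, not in the task):
// it maps a container path back to the /dev/... form used in logs and the UI.
func DisplayDevicePath(hostPath string) string {
	prefix := HostDevPath + "/"
	if strings.HasPrefix(hostPath, prefix) {
		return "/dev/" + strings.TrimPrefix(hostPath, prefix)
	}
	return hostPath
}

func main() {
	inputs := []string{"/dev/sdb", "/dev/sdb1", "/dev/mapper/data", "/host-dev/sdb", "sdb"}
	for _, p := range inputs {
		fmt.Printf("%-18s -> %s\n", p, HostDevicePath(p))
	}
	fmt.Println(DisplayDevicePath("/host-dev/sdb1"))
}
```

Paths that do not start with /dev/ (including ones that were already converted) pass through unchanged, so calling HostDevicePath twice is harmless.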

3. Update all device operations to use HostDevicePath()

In format_linux.go (or wherever FormatAndMount is):

// Validation: check that the device exists
hostDev := HostDevicePath(req.DevicePath) // "/dev/sdb" → "/host-dev/sdb"
if _, err := os.Stat(hostDev); err != nil {
    return fmt.Errorf("device not found: %s: %w", req.DevicePath, err)
}

// Partition: use host device path (sfdisk reads the partition layout from stdin)
partCmd := exec.Command("sfdisk", hostDev)

// Format: use host device path
mkfsCmd := exec.Command("mkfs.ext4", "-F", "-L", label, HostDevicePath(partPath))

// blkid: use host device path
blkidCmd := exec.Command("blkid", "-o", "value", "-s", "UUID", HostDevicePath(partPath))

// mount: use host device path for source, real path for target
mountCmd := exec.Command("mount", HostDevicePath(partPath), mountPath)

4. Update ScanDisks blkid enrichment

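In scan_linux.go, both enrichWithBlkid and getSystemDiskNames call blkid, which looks for device nodes under /dev by default. enrichWithBlkid has to work with /host-dev; getSystemDiskNames only resolves UUIDs to device names and is expected to stay unchanged: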

// enrichWithBlkid — run blkid on /host-dev to get filesystem info
func enrichWithBlkid(disks []BlockDevice) {
    // blkid normally looks for device nodes under /dev; here they live under /host-dev.
    // Option 1: Run blkid with explicit device paths from /host-dev
    // Option 2: Run blkid -o export and let it find devices via /proc/partitions

    // Option 2: blkid -o export takes the device names from /proc/partitions
    // (kernel-level), but it still has to open /dev/<name> to probe each one,
    // so it only reports devices whose nodes exist in the container's /dev.
    // The DEVNAME in the output will say /dev/sdb1, not /host-dev/sdb1.
    // That's fine: we match by device name anyway.
    out, err := exec.Command("blkid", "-o", "export").Output()
    // ... parsing as before, matching by /dev/xxx paths ...
}

// For getSystemDiskNames, blkid -U <uuid> returns /dev/xxx paths which is correct
// for fstab parsing (fstab uses /dev/xxx paths too).
// No changes needed there — it's just resolving UUIDs to device names.

Note: blkid -o export will miss any device whose node is absent from Docker's minimal /dev. In that case, probe each known partition through /host-dev explicitly:

func enrichWithBlkid(disks []BlockDevice) {
    for i := range disks {
        for j := range disks[i].Partitions {
            p := &disks[i].Partitions[j]
            hostPath := HostDevicePath(p.Path) // "/host-dev/sdb1"
            
            // Probe individually
            if fstype, err := exec.Command("blkid", "-o", "value", "-s", "TYPE", hostPath).Output(); err == nil {
                p.FSType = strings.TrimSpace(string(fstype))
            }
            if uuid, err := exec.Command("blkid", "-o", "value", "-s", "UUID", hostPath).Output(); err == nil {
                p.UUID = strings.TrimSpace(string(uuid))
            }
            if label, err := exec.Command("blkid", "-o", "value", "-s", "LABEL", hostPath).Output(); err == nil {
                p.Label = strings.TrimSpace(string(label))
            }
        }
    }
}
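
The per-partition variant above starts blkid three times per partition. A possible refinement, sketched here under the assumption that one `blkid -o export <device>` call per partition is acceptable, is to parse its KEY=VALUE output once. parseBlkidExport is a hypothetical helper, and the sample output in main is made up for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseBlkidExport (hypothetical helper) turns the KEY=VALUE lines printed by
// `blkid -o export <device>` into a map. Lines without '=' are skipped.
func parseBlkidExport(out string) map[string]string {
	tags := make(map[string]string)
	sc := bufio.NewScanner(strings.NewReader(out))
	for sc.Scan() {
		key, value, ok := strings.Cut(strings.TrimSpace(sc.Text()), "=")
		if !ok {
			continue
		}
		tags[key] = value
	}
	return tags
}

func main() {
	// Made-up sample; the real input would come from
	// exec.Command("blkid", "-o", "export", hostPath).Output().
	sample := "DEVNAME=/host-dev/sdb1\nLABEL=hdd_1\nUUID=0a1b2c3d-0000-4000-8000-000000000001\nTYPE=ext4\n"
	tags := parseBlkidExport(sample)
	fmt.Println("type: ", tags["TYPE"])
	fmt.Println("uuid: ", tags["UUID"])
	fmt.Println("label:", tags["LABEL"])
}
```

Missing tags simply read back as empty strings, which matches how the loop above leaves FSType, UUID and Label untouched for an unformatted partition.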

5. fstab writing: use UUID entries, not /dev paths

When writing to fstab, use UUID-based entries (already the plan), so no /dev path needed:

UUID=<uuid>  /mnt/hdd_1  ext4  defaults,nofail,noatime  0  2

The UUID is obtained from blkid using the /host-dev/sdb1 path, but the UUID itself is filesystem-level and doesn't depend on device path.
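
As an illustration of the entry format above, the following sketch builds the line and appends it only when no entry for that UUID exists yet. The function names are hypothetical, and the idempotency check is an assumption about desired behaviour rather than something the task specifies.

```go
package main

import (
	"fmt"
	"strings"
)

// fstabEntry formats the UUID-based line shown above.
func fstabEntry(uuid, mountPath string) string {
	return fmt.Sprintf("UUID=%s  %s  ext4  defaults,nofail,noatime  0  2", uuid, mountPath)
}

// hasFstabUUID reports whether content already holds an uncommented entry
// whose first field is UUID=<uuid>.
func hasFstabUUID(content, uuid string) bool {
	for _, line := range strings.Split(content, "\n") {
		fields := strings.Fields(line)
		if len(fields) > 0 && fields[0] == "UUID="+uuid {
			return true
		}
	}
	return false
}

// appendFstabEntry returns content with the entry appended, or content
// unchanged if the UUID is already present.
func appendFstabEntry(content, uuid, mountPath string) string {
	if hasFstabUUID(content, uuid) {
		return content
	}
	if content != "" && !strings.HasSuffix(content, "\n") {
		content += "\n"
	}
	return content + fstabEntry(uuid, mountPath) + "\n"
}

func main() {
	fstab := "# /etc/fstab\n"
	fstab = appendFstabEntry(fstab, "1234-abcd", "/mnt/hdd_1")
	fstab = appendFstabEntry(fstab, "1234-abcd", "/mnt/hdd_1") // second call is a no-op
	fmt.Print(fstab)
}
```

In the controller the content would be read from and written back to HostFstabPath; that file handling is left out here.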

6. mount command — needs special handling

mount /host-dev/sdb1 /mnt/hdd_1 should work because /host-dev/sdb1 is a real block device node (same major:minor as the host's /dev/sdb1). The kernel doesn't care about the path — it uses the device numbers.

mount also accepts a UUID directly:

exec.Command("mount", "UUID="+uuid, mountPath)

This needs no device path in the command. However, mount resolves UUID= in userspace (via libblkid) by looking for device nodes under /dev, not in the kernel, so it is likely to fail for as long as the container's /dev lacks the disk.

Recommended: Use the device path approach (mount /host-dev/sdb1 /mnt/hdd_1) as it's more explicit and debuggable.
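
The claim that /host-dev/sdb1 and the host's /dev/sdb1 are the same device can be checked by comparing device numbers. The sketch below assumes Linux (it relies on syscall.Stat_t and the glibc dev_t encoding); majorMinor and deviceNumbers are hypothetical helper names.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// majorMinor decodes a Linux dev_t value using the glibc encoding.
func majorMinor(rdev uint64) (major, minor uint64) {
	major = ((rdev >> 8) & 0xfff) | ((rdev >> 32) & 0xfffff000)
	minor = (rdev & 0xff) | ((rdev >> 12) & 0xffffff00)
	return major, minor
}

// deviceNumbers returns the major and minor numbers of the device node at path.
func deviceNumbers(path string) (major, minor uint64, err error) {
	fi, err := os.Stat(path)
	if err != nil {
		return 0, 0, err
	}
	st, ok := fi.Sys().(*syscall.Stat_t)
	if !ok {
		return 0, 0, fmt.Errorf("no stat data for %s", path)
	}
	major, minor = majorMinor(uint64(st.Rdev))
	return major, minor, nil
}

func main() {
	// Run /dev/sdb1 on the host and /host-dev/sdb1 in the container;
	// matching numbers mean both nodes refer to the same device.
	for _, p := range []string{"/dev/sdb1", "/host-dev/sdb1"} {
		ma, mi, err := deviceNumbers(p)
		if err != nil {
			fmt.Println(p, "error:", err)
			continue
		}
		fmt.Printf("%s -> %d:%d\n", p, ma, mi)
	}
}
```

Logging these numbers next to the path is one way to keep the recommended device-path approach easy to debug.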

7. Also update docker-setup.sh

If docker-setup.sh generates the controller compose file, update the /dev mount:

# Was:
echo "      - /dev:/dev" >> "$compose_file"
# Now:
echo "      - /dev:/host-dev:rw" >> "$compose_file"

Also check that docker-setup.sh installs all packages the controller needs during installation (rsync, etc.).

8. Update documentation

Update CONTEXT.md, CHANGELOG.md, README.md

Summary of changes

File                                          Change
controller/docker-compose.yml                 /dev:/dev → /dev:/host-dev:rw
controller/internal/storage/format_linux.go   Use HostDevicePath() for all device operations
controller/internal/storage/scan_linux.go     Use HostDevicePath() for blkid probing
controller/internal/storage/paths.go (NEW)    HostDevPath, HostFstabPath, HostDevicePath()
scripts/docker-setup.sh                       Update compose generation

Quick test after fix

# Rebuild + redeploy controller
# Then verify (sh -c makes the glob expand inside the container, not on the host):
docker exec felhom-controller sh -c 'ls -la /host-dev/sd*'
# Should show: /host-dev/sda, /host-dev/sda1, /host-dev/sda2, /host-dev/sda3, /host-dev/sdb, /host-dev/sdb1

# Try format (from UI or manually):
docker exec felhom-controller blkid /host-dev/sdb1
# Should work (empty output = no filesystem, which is correct for unformatted)