TASK: Major rewrite of scripts/docker-setup.sh (v5.0)
Overview
Rewrite docker-setup.sh to bring it up to date with the current Felhom architecture.
The script should now be a complete end-to-end provisioning tool: install infrastructure,
run an interactive configuration wizard, generate controller.yaml, deploy FileBrowser
as a protected stack, and deploy felhom-controller — all in one run.
Read the entire current scripts/docker-setup.sh before starting. This is a rewrite
of an existing ~1600-line script, not a new file.
Changes Required
1. Update banner and version
- Set SCRIPT_VERSION="5.0.0"
- Update print_banner(): no Portainer; the title should be: Felhom Infrastructure Setup v5.0.0
- Update the comment header block at the top of the file to match the new scope (Docker + Traefik + FileBrowser + Controller + configuration wizard).
- Update print_help() to reflect all removed/changed options.
2. Remove Portainer (confirm clean)
The current script has no Portainer code (already removed in a prior version). Just make sure there are zero references to "portainer" or "Portainer" anywhere — banner, comments, help text, variables. Search and confirm.
3. Remove --cf-tunnel-token CLI option
Remove the --cf-tunnel-token CLI flag and the CF_TUNNEL_TOKEN variable from
parse_args(). The Cloudflare tunnel token is now collected by the configuration wizard
and written into controller.yaml (see §7 below). The install_cloudflare_tunnel()
function stays but reads the token from the wizard variable instead of a CLI flag.
Also remove --hdd-path CLI option and HDD_PATH variable — deprecated.
Keep these CLI options (still useful for non-interactive/scripted runs):
- --ip, --gateway, --dns, --interface (network config)
- --domain, --email, --cf-token (TLS/domain; can pre-seed the wizard)
- --customer (customer ID; can pre-seed the wizard)
- --traefik-password, --self-signed-cert
- --skip-filebrowser
- --dry-run, --debug, --help, --bootstrap
4. Remove --hdd-path references
Remove HDD_PATH variable, --hdd-path argument parsing, and all references.
FileBrowser mounts are determined by the wizard (system_data_path and any existing
/mnt/* mounts).
5. FileBrowser deployment as protected stack
The current install_filebrowser() function needs to be rewritten:
Location: Deploy to /opt/docker/stacks/filebrowser/ (already the current
FILEBROWSER_DIR — keep this).
Compose file: Generate a compose file matching the current production layout on the demo node. Key differences from current script template:
services:
  filebrowser:
    image: gtstef/filebrowser:latest
    container_name: filebrowser
    restart: unless-stopped
    environment:
      - TZ=Europe/Budapest
    volumes:
      - filebrowser_data:/home/filebrowser/data
      # Mount discovered drives (populated by wizard)
      # e.g. /mnt/hdd_1:/srv/hdd_1, /mnt/sys_drive:/srv/sys_drive
    networks:
      - traefik-public
    deploy:
      resources:
        limits:
          memory: 256M
    healthcheck:
      test: ["CMD", "wget", "--spider", "-q", "http://localhost:80/"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 15s
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.filebrowser.rule=Host(`files.<DOMAIN>`)"
      - "traefik.http.routers.filebrowser.entrypoints=websecure"
      - "traefik.http.routers.filebrowser.tls=true"
      - "traefik.http.services.filebrowser.loadbalancer.server.port=80"
      - "traefik.docker.network=traefik-public"
Drive discovery for volumes: The wizard (§7) collects system_data_path.
Additionally, scan /mnt/ for existing mount points at install time. For each
discovered mount (e.g., /mnt/hdd_1, /mnt/sys_drive), add a volume mapping:
/mnt/<name>:/srv/<name>. If no mounts found, only mount the system_data_path.
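The drive discovery described above could be sketched roughly as follows. This is a minimal sketch, not the script's actual code: the function name `discover_volume_lines` is hypothetical, and it assumes that `mountpoint` (from util-linux) is available and that only real mount points directly under the base directory should be mapped.

```bash
# Hypothetical helper: print one compose volume line per mount point found
# directly under a base directory (default /mnt). If nothing is mounted there,
# fall back to the system data path passed as the second argument.
discover_volume_lines() {
    local base="${1:-/mnt}" fallback="${2:-}" dir name found=0
    for dir in "$base"/*/; do
        [[ -d "$dir" ]] || continue              # glob matched nothing
        dir="${dir%/}"
        # Plain directories are skipped; only real mount points are mapped.
        mountpoint -q "$dir" 2>/dev/null || continue
        name="$(basename "$dir")"
        printf '      - %s:/srv/%s\n' "$dir" "$name"
        found=1
    done
    if (( found == 0 )) && [[ -n "$fallback" ]]; then
        printf '      - %s:/srv/%s\n' "$fallback" "$(basename "$fallback")"
    fi
}
```

The six-space prefix is chosen so the output can be dropped under the service's volumes key when the compose file is generated; adjust it if the generated file uses a different indentation.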
Hardcode domain in the Traefik host rule (no ${DOMAIN} env var needed).
Use the wizard's domain value directly: Host(`files.ACTUAL-DOMAIN`).
Also generate .felhom.yml metadata file — keep the existing one from the
current script (Hungarian text, category: storage, etc.).
No .env file needed for filebrowser (domain is hardcoded in compose labels).
6. Controller deployment (NEW step)
Add a new step to deploy felhom-controller. This is currently missing from the script — the user had to deploy it manually.
Location: /opt/docker/felhom-controller/
docker-compose.yml — generate matching the current production layout:
services:
  felhom-controller:
    image: gitea.dooplex.hu/admin/felhom-controller:latest
    container_name: felhom-controller
    restart: unless-stopped
    privileged: true
    ports:
      - "8080:8080"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /opt/docker/felhom-controller/controller.yaml:/opt/docker/felhom-controller/controller.yaml:ro
      - controller-data:/opt/docker/felhom-controller/data
      - /opt/docker/stacks:/opt/docker/stacks
      - /srv/backups:/srv/backups
      - type: bind
        source: /mnt
        target: /mnt
        bind:
          propagation: rshared
      - /sys:/host/sys:ro
      - /etc/os-release:/host/etc/os-release:ro
      - /etc/hostname:/host/etc/hostname:ro
      - /dev:/host-dev:rw
      - /etc/fstab:/host-fstab
      - /run/udev:/run/udev:ro
    environment:
      - TZ=Europe/Budapest
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.controller.rule=Host(`felhom.<DOMAIN>`)"
      - "traefik.http.routers.controller.entrypoints=websecure"
      - "traefik.http.routers.controller.tls=true"
      - "traefik.http.services.controller.loadbalancer.server.port=8080"
      - "traefik.docker.network=traefik-public"
      - "felhom.managed=true"
      - "felhom.component=controller"
    networks:
      - traefik-public
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/api/health"]
      interval: 30s
      timeout: 5s
      start_period: 10s
      retries: 3
volumes:
  controller-data:
networks:
  traefik-public:
    external: true
Hardcode domain in Traefik labels (like filebrowser).
Do not generate a .env file for the controller: the domain is hardcoded in the
compose labels, so compose does not need one.
Use latest tag for the image. The controller has self-update capability
so it will manage its own version after initial deployment.
Pull and start the controller, then verify health via the healthcheck endpoint.
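A pull, start, and verify sequence might look like the sketch below. It assumes that `docker compose` (the v2 plugin) is the compose command in use and that the health endpoint is reachable on the host through the published port 8080; `wait_until_healthy` and `deploy_controller` are hypothetical names, not functions from the current script.

```bash
# Hypothetical helper: run a command repeatedly until it succeeds or the
# attempts are exhausted. Usage: wait_until_healthy ATTEMPTS DELAY CMD [ARGS...]
wait_until_healthy() {
    local attempts="$1" delay="$2" i
    shift 2
    for (( i = 1; i <= attempts; i++ )); do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        sleep "$delay"
    done
    return 1
}

# Assumed shape of the new deployment step (not called here).
deploy_controller() {
    local dir="/opt/docker/felhom-controller"
    docker compose -f "$dir/docker-compose.yml" pull &&
        docker compose -f "$dir/docker-compose.yml" up -d &&
        wait_until_healthy 12 5 curl -fsS http://localhost:8080/api/health
}
```

With 12 attempts and a 5 second delay the step gives the container about a minute to come up, which lines up with the 10 second start_period in the healthcheck plus image start time; both numbers are illustrative.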
7. Configuration wizard for controller.yaml
Add an interactive wizard function run_config_wizard() that runs AFTER
infrastructure setup but BEFORE deploying the controller. It generates
/opt/docker/felhom-controller/controller.yaml.
CLI pre-seeding: If --domain, --customer, --email, --cf-token are
provided via CLI, use them as defaults in the wizard (user can still change).
Wizard flow (each question is a read -p prompt with a default shown in brackets):
===========================================================
Felhom Controller Configuration Wizard
===========================================================
--- Customer identity ---
Customer ID [demo-felhom]: _
Customer display name [Demo Ügyfél]: _
Domain [homeserver.local]: _
Customer email (optional) []: _
--- Infrastructure secrets ---
Cloudflare Tunnel token (optional, leave empty to skip) []: _
Cloudflare API token (for DNS-01 certs, optional) []: _
--- Paths ---
System data partition mount point
(if the system drive was partitioned for user data,
provide the mount point, e.g., /mnt/sys_drive)
System data path [/mnt/sys_drive]: _
--- Dashboard password ---
Set a password for the controller dashboard?
(leave empty for first-visit setup prompt)
Dashboard password []: _
--- Git sync ---
App catalog repository URL [https://gitea.dooplex.hu/admin/app-catalog-felhom.eu.git]: _
Git username []: _
Git token []: _
--- Healthcheck monitoring ---
Healthchecks.io ping UUIDs (leave empty to skip):
Heartbeat UUID []: _
System health UUID []: _
DB dump UUID []: _
Backup UUID []: _
Backup integrity UUID []: _
--- Ready ---
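Each prompt in the flow above can share one small helper. The following is a minimal sketch; `prompt_default` is a hypothetical name, and the commented usage line assumes the CLI parser stores --domain in a variable called DOMAIN.

```bash
# Hypothetical helper: ask a question with the default shown in brackets and
# print the chosen value on stdout. bash writes the read -p prompt to stderr,
# so the function is safe to use inside command substitution.
prompt_default() {
    local label="$1" default="${2:-}" input=""
    read -r -p "$label [$default]: " input || true   # EOF keeps the default
    printf '%s\n' "${input:-$default}"
}

# Pre-seeding: a value passed on the command line becomes the default.
# DOMAIN="$(prompt_default "Domain" "${DOMAIN:-homeserver.local}")"
```

For the token and password prompts, adding -s to read would keep the input off the terminal; that is a suggestion, not something the current script is known to do.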
Password hashing: If user provides a dashboard password, hash it with bcrypt.
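Use htpasswd -bnBC 10 "" "PASSWORD" | tr -d ':\n' (htpasswd -n prints ":<hash>" followed by a blank line, so strip both the colon and the newlines) or the python3 -c fallback.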
Store the hash in web.password_hash.
Session secret: Auto-generate: openssl rand -hex 32
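The two secrets above could be produced by helpers along these lines. This is a sketch under assumptions: the function names are hypothetical, the python3 fallback is left out, and the /dev/urandom branch is an extra fallback that the task description does not ask for.

```bash
# Hypothetical helper: bcrypt-hash a password with htpasswd (apache2-utils).
# An empty password yields an empty hash (first-visit setup prompt).
hash_dashboard_password() {
    local password="$1"
    [[ -n "$password" ]] || return 0
    command -v htpasswd >/dev/null 2>&1 || return 1
    # Output is ":<hash>" plus a blank line; strip the colon and the newlines.
    htpasswd -bnBC 10 "" "$password" | tr -d ':\n'
}

# Hypothetical helper: 32 random bytes rendered as 64 hex characters.
generate_session_secret() {
    if command -v openssl >/dev/null 2>&1; then
        openssl rand -hex 32
    else
        # Assumed fallback, not part of the original spec.
        head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
    fi
}
```

Passing the password with -b exposes it briefly in the process list; on a single-user provisioning host that is usually acceptable, but it is worth knowing.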
Hub config: Always enabled, with the hardcoded API key:
hub:
  enabled: true
  url: "https://hub.felhom.eu"
  api_key: "094091de545ce28795c47ac2158fc30750db5c24a621c49329b001ee8db57fb8"
  push_interval: "15m"
Backup: Keep enabled: true — the user confirmed it should stay for
troubleshooting purposes.
hdd_path: Do NOT include in generated config. It's deprecated. Remove it from the template entirely.
Full template — write this to /opt/docker/felhom-controller/controller.yaml:
# Felhom Controller Configuration
# Generated by docker-setup.sh v5.0.0 on <DATE>
customer:
  id: "<CUSTOMER_ID>"
  name: "<CUSTOMER_NAME>"
  domain: "<DOMAIN>"
  email: "<EMAIL>"
  telegram_chat_id: ""
infrastructure:
  cf_tunnel_token: "<CF_TUNNEL_TOKEN>"
  cf_api_token: "<CF_API_TOKEN>"
paths:
  stacks_dir: "/opt/docker/stacks"
  data_dir: "/opt/docker/felhom-controller/data"
  system_data_path: "<SYSTEM_DATA_PATH>"
system:
  reserved_memory_mb: 384
web:
  listen: ":8080"
  password_hash: "<BCRYPT_HASH_OR_EMPTY>"
  session_secret: "<AUTO_GENERATED_HEX>"
git:
  repo_url: "<GIT_REPO_URL>"
  branch: "main"
  sync_interval: "15m"
  username: "<GIT_USERNAME>"
  token: "<GIT_TOKEN>"
stacks:
  protected:
    - "traefik"
    - "cloudflared"
    - "felhom-controller"
    - "filebrowser"
  update_window: "03:00-05:00"
  compose_command: ""
backup:
  enabled: true
  restic_password_file: "/opt/docker/felhom-controller/data/restic-password"
  db_dump_schedule: "02:30"
  restic_schedule: "03:00"
  retention:
    keep_daily: 7
    keep_weekly: 4
    keep_monthly: 6
  prune_schedule: "weekly"
monitoring:
  enabled: true
  healthchecks_base: "https://status.felhom.eu"
  ping_uuids:
    heartbeat: "<HEARTBEAT_UUID>"
    system_health: "<SYSTEM_HEALTH_UUID>"
    db_dump: "<DB_DUMP_UUID>"
    backup: "<BACKUP_UUID>"
    backup_integrity: "<BACKUP_INTEGRITY_UUID>"
  system_health_interval: "5m"
  health_check_schedule: "06:00"
  thresholds:
    disk_warn_percent: 80
    disk_crit_percent: 90
    backup_max_age_hours: 36
    cpu_warn_percent: 90
    memory_warn_percent: 85
    temperature_warn_celsius: 75
hub:
  enabled: true
  url: "https://hub.felhom.eu"
  api_key: "094091de545ce28795c47ac2158fc30750db5c24a621c49329b001ee8db57fb8"
  push_interval: "15m"
self_update:
  enabled: true
  check_interval: "6h"
  image: "gitea.dooplex.hu/admin/felhom-controller"
  auto_update: false
  health_timeout_seconds: 60
notifications:
  customer_events:
    - "disk_warning"
    - "backup_failed"
    - "update_available"
    - "security_update"
  operator_events:
    - "disk_critical"
    - "backup_failed"
    - "self_update_failed"
    - "container_unhealthy"
logging:
  level: "info"
  file: ""
  max_size_mb: 10
  max_files: 3
assets:
  source_url: "https://felhom.eu"
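Writing the template from bash could follow the pattern sketched here. Only a fragment of the template is rendered, and the names `render_controller_yaml`, `write_controller_yaml`, `PASSWORD_HASH`, `SESSION_SECRET`, and `DRY_RUN` are assumptions for illustration; the real script may name them differently.

```bash
# Hypothetical sketch: render a fragment of controller.yaml with a heredoc.
# Expanded values are not re-scanned by the shell, so the '$' characters in a
# bcrypt hash survive intact.
render_controller_yaml() {
    cat << EOF
# Felhom Controller Configuration
customer:
  id: "${CUSTOMER_ID:-}"
  domain: "${DOMAIN:-}"
web:
  listen: ":8080"
  password_hash: "${PASSWORD_HASH:-}"
  session_secret: "${SESSION_SECRET:-}"
EOF
}

write_controller_yaml() {
    local target="$1"
    if [[ "${DRY_RUN:-false}" == "true" ]]; then
        echo "[dry-run] would write $target:"
        render_controller_yaml
        return 0
    fi
    # Create the file under a restrictive umask so secrets are never readable
    # by other users, then set the mode explicitly.
    ( umask 077; render_controller_yaml > "$target" )
    chmod 600 "$target"
}
```

Values are inserted verbatim inside double-quoted YAML strings, so a value containing a double quote or backslash would need escaping first; the sketch does not handle that.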
8. Update controller.yaml.example
Update controller/configs/controller.yaml.example to match the wizard template:
- Remove hdd_path line entirely
- Set hub.enabled: true (was false)
- Set hub.api_key to the real key: 094091de545ce28795c47ac2158fc30750db5c24a621c49329b001ee8db57fb8
- Improve system_data_path comment to be clearer: system_data_path: "/mnt/sys_drive" # Mount point of user-data partition on system drive (e.g., /mnt/sys_drive)
9. Update install_cloudflare_tunnel()
The function currently reads from CF_TUNNEL_TOKEN (CLI arg). Change it to
read from the wizard variable (same variable name is fine, just populated by the
wizard instead of CLI). The function body stays the same — it creates the
docker-compose at /opt/docker/cloudflared/ and starts it.
Guard: If wizard left the CF tunnel token empty, skip this step (already
handled by the existing if [[ -z "$CF_TUNNEL_TOKEN" ]] check).
10. Update execution order in main()
New execution order:
1. Install base packages
2. Configure network (static IP, if requested)
3. Install Docker Engine + Compose
4. Install Traefik reverse proxy
5. Generate self-signed certificate (if requested)
6. Run configuration wizard → generates controller.yaml
7. Install Cloudflare Tunnel (if token provided in wizard)
8. Install FileBrowser (protected stack)
9. Deploy felhom-controller
10. Install helper tools
11. Print summary
Update step numbering and get_total_steps() accordingly.
11. Update print_summary()
Update the summary to reflect:
- Controller is deployed and accessible at https://felhom.<DOMAIN>
- FileBrowser at https://files.<DOMAIN>
- Remove manual "deploy felhom-controller" instructions (it's automated now)
- Show healthcheck UUID status (configured / not configured)
- Show hub status (enabled)
- Fix the CUSTOMER_ID display bug (the "Note: No --customer specified" message is inside the if [[ -n "$CUSTOMER_ID" ]] block, which is the wrong logic)
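The summary changes listed above might take a shape like the following excerpt. It is a sketch only: `status_of` and `print_summary_excerpt` are hypothetical names, `HEARTBEAT_UUID` is an assumed wizard variable, and the exact wording of the existing summary lines is not reproduced.

```bash
# Hypothetical helper: report whether a wizard value was provided.
status_of() {
    if [[ -n "${1:-}" ]]; then
        echo "configured"
    else
        echo "not configured"
    fi
}

# Hypothetical excerpt of the updated summary output.
print_summary_excerpt() {
    echo "Controller:     https://felhom.${DOMAIN}"
    echo "FileBrowser:    https://files.${DOMAIN}"
    echo "Heartbeat UUID: $(status_of "${HEARTBEAT_UUID:-}")"
    echo "Hub:            enabled"
    # The note belongs in the else branch, where no customer ID is set.
    if [[ -n "${CUSTOMER_ID:-}" ]]; then
        echo "Customer ID:    ${CUSTOMER_ID}"
    else
        echo "Note: No --customer specified"
    fi
}
```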
12. Update print_help()
Update help text to reflect:
- Removed --cf-tunnel-token (now in wizard)
- Removed --hdd-path (deprecated)
- Mention the interactive wizard
- Updated "WHAT THIS SCRIPT INSTALLS" list:
- Base packages
- Docker Engine + Compose
- Traefik reverse proxy
- TLS certificates
- Felhom Controller (with interactive configuration)
- FileBrowser Quantum (web file manager)
- Cloudflare Tunnel (if configured)
- Helper tools
Additional observations
Bugs in current script
- print_summary() CUSTOMER_ID logic is inverted (line ~1507): The "Note: No --customer specified" message is inside if [[ -n "$CUSTOMER_ID" ]], which only triggers when a customer IS specified. It should be in an else branch or removed.
- Step numbering is fragile: get_total_steps() and the hardcoded step numbers (e.g., log_step "3/$(get_total_steps)") will desync if steps are added/removed. Consider using a counter variable incremented at each step.
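The counter idea from the second bug can be sketched as follows. The `log_step` defined here is only a stand-in so the example runs on its own; the real function's signature may differ, and `register_steps` and `next_step` are hypothetical names.

```bash
STEP_CURRENT=0
STEP_TOTAL=0

# Stand-in for the script's existing log_step; its real signature may differ.
log_step() { echo "[$1] $2"; }

# Record how many steps will actually run, so the total is derived from the
# list of steps rather than hardcoded.
register_steps() { STEP_TOTAL=$#; }

# Advance the counter and announce the step.
next_step() {
    STEP_CURRENT=$(( STEP_CURRENT + 1 ))
    log_step "${STEP_CURRENT}/${STEP_TOTAL}" "$1"
}

# Example:
#   register_steps base network docker
#   next_step "Install base packages"    # prints "[1/3] Install base packages"
```

Call next_step directly rather than inside $(...) or a pipeline, since the counter update would otherwise happen in a subshell and be lost.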
Things NOT to change
- bootstrap_sudo(): works fine, keep as-is
- Network configuration (step 2): keep all network manager detection logic
- Docker installation (step 3) — keep as-is
- Traefik installation (step 4) — keep as-is
- Self-signed cert generation — keep as-is
- Helper tools installation — keep as-is
- Error trap and diagnostics — keep as-is
- Color/logging functions — keep as-is
Template completeness check
The controller.yaml template covers all sections from the current example. Sections that use sensible defaults and don't need wizard prompts:
- system.reserved_memory_mb (384)
- backup.* (all defaults are fine)
- stacks.protected (hardcoded list)
- stacks.update_window ("03:00-05:00")
- monitoring.thresholds.* (all defaults)
- self_update.* (all defaults)
- notifications.* (all defaults)
- logging.* (all defaults)
- assets.* (hardcoded)
Implementation notes
- The script is bash, so no external YAML parser is needed. Use cat > file << EOF with variable substitution for generating YAML.
- For bcrypt hashing, prefer htpasswd -bnBC 10 "" "$password" | tr -d ':\n' (apache2-utils is installed in step 1). Fallback: python3 -c "import bcrypt; ..."
- The wizard should show current/default values in brackets and accept Enter for defaults: read -p "Domain [$default]: " input; value="${input:-$default}"
- Dry-run mode should show what the wizard WOULD generate without writing files.
- All generated files should have appropriate permissions:
  - controller.yaml: chmod 600 (contains secrets)
  - docker-compose.yml files: chmod 644
Build & test
After implementing, test the script with --dry-run to verify:
sudo ./docker-setup.sh --domain test.local --customer test --dry-run
For a real deployment test on the demo node:
# Copy script to demo node
SSH=/c/Windows/System32/OpenSSH/ssh.exe
scp scripts/docker-setup.sh kisfenyo@192.168.0.162:/tmp/
# Run on demo node (it already has infrastructure, so most steps will skip)
$SSH kisfenyo@192.168.0.162 "sudo bash /tmp/docker-setup.sh --domain demo-felhom.eu --customer demo-felhom --email certs@felhom.eu --cf-token <token>"