feat: controller-side HTTP/TCP health probes

Add network-level health probing from the controller to deployed apps.
The controller probes containers over the shared Docker network and
overrides stack state to "unhealthy" if the service isn't responding.

Three probe types: http (any response = alive), api (validates status
code and body content), tcp (port reachability). Probes are configured
per app via the healthcheck: section in .felhom.yml. The probe job runs
every minute; each app's own probe interval defaults to 5 minutes.

This replaces Docker-level healthchecks for distroless images (e.g.
Vikunja) that lack shell utilities, and complements existing Docker
healthchecks for other apps.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 11:11:21 +01:00
parent 077640d9bb
commit 4c5d430b1a
6 changed files with 425 additions and 13 deletions
@@ -220,6 +220,9 @@ func main() {
	sched.Every("stack-scan", 2*time.Minute, func(ctx context.Context) error {
		return stackMgr.ScanStacks()
	})
	sched.Every("health-probes", 1*time.Minute, func(ctx context.Context) error {
		return stackMgr.RunHealthProbes()
	})
	// Heartbeat — lightweight "I'm alive" signal
	sched.Every("heartbeat", 5*time.Minute, func(ctx context.Context) error {