renovate: default-allow + codify ArgoCD auto-sync #16

Merged
admin merged 5 commits from feat/renovate-default-allow into main 2026-06-05 07:58:04 +02:00

Two coordinated changes for the default-allow rollout. Do not merge yet — dry-run preview happens next.

1) admin-system/renovate.yaml — flip to default-allow

Replaces the 4-rule Tier 1 allowlist with a 7-rule default-allow + safety-gate structure.

  • Throttle: prHourlyLimit: 8, prConcurrentLimit: 8 (was 0/0, unlimited)
  • Rule order (matters):
    1. * → minimumReleaseAge: 3 days
    2. minor/patch → automerge + platformAutomerge
    3. major → dependencyDashboardApproval
    4. k3s-bundled (rancher/local-path-provisioner, rancher/mirrored-coredns/coredns, rancher/mirrored-metrics-server) → enabled: false
    5. Critical-core (gitea/gitea, quay.io/argoproj/argocd, ghcr.io/goauthentik/{server,ldap,proxy}, ghcr.io/cloudnative-pg/cloudnative-pg) → automerge: false (Viktor merges manually)
    6. ghcr.io/lukegus/termix → versioning: loose + extractVersion: "^release-(?<version>.+)$"
    7. flomp/wanderer-db + flomp/wanderer-web → groupName: wanderer (avoids the wanderer.yaml file race)

enabledManagers unchanged ([kubernetes, helm-values]) — Helmfile-managed infra stays invisible.
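
Why the rule order matters can be illustrated with a small model. Renovate applies every matching packageRule in sequence, and a later rule overrides keys set by an earlier one. The sketch below is a simplified stand-in, not the real `renovate.yaml`: the `match`/`set` dictionary shape and the function names are hypothetical, and only rules 1 to 5 are modelled.

```python
from typing import Any, Dict, List

# Hypothetical, simplified stand-ins for rules 1-5 listed above.
RULES: List[Dict[str, Any]] = [
    {"match": {}, "set": {"minimumReleaseAge": "3 days"}},  # "*" matches everything
    {"match": {"updateTypes": ["minor", "patch"]},
     "set": {"automerge": True, "platformAutomerge": True}},
    {"match": {"updateTypes": ["major"]},
     "set": {"dependencyDashboardApproval": True}},
    {"match": {"packages": ["rancher/local-path-provisioner",
                            "rancher/mirrored-coredns/coredns",
                            "rancher/mirrored-metrics-server"]},
     "set": {"enabled": False}},
    {"match": {"packages": ["gitea/gitea", "quay.io/argoproj/argocd"]},
     "set": {"automerge": False}},
]


def rule_matches(match: Dict[str, Any], package: str, update_type: str) -> bool:
    """A rule matches when every matcher it declares accepts the update."""
    if "updateTypes" in match and update_type not in match["updateTypes"]:
        return False
    if "packages" in match and package not in match["packages"]:
        return False
    return True


def effective_config(package: str, update_type: str,
                     rules: List[Dict[str, Any]] = RULES) -> Dict[str, Any]:
    """Merge all matching rules in order; later rules win on conflicting keys."""
    config: Dict[str, Any] = {}
    for rule in rules:
        if rule_matches(rule["match"], package, update_type):
            config.update(rule["set"])
    return config


if __name__ == "__main__":
    print(effective_config("gitea/gitea", "minor"))
```

In this model a minor bump of `gitea/gitea` ends with `automerge` set to false only because the critical-core rule comes after the minor/patch rule; reversing the list would let the automerge rule win.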

Critical-core verification (some are no-ops, acknowledged in brief):

  • gitea/gitea in gitea-system/gitea.yaml
  • ghcr.io/goauthentik/server in auth-system/authentik-values.yaml
  • ghcr.io/cloudnative-pg/cloudnative-pg in database-system/cnpg/values.yaml
  • quay.io/argoproj/argocd not in repo (ArgoCD bootstrap-installed) — no-op
  • goauthentik/ldap, /proxy not pinned in values (chart defaults) — no-op

2) argocd-apps/homelab.yaml — codify per-app auto-sync

Currently auto-sync lives only on live CRs (set imperatively via UI) — DR risk and drift.

  • 35 existing bare-AUTO apps: add automated: {enabled: true} (matches live, no behavioral change)
  • jarr, version-checker: add automated: {enabled: true, prune: true, selfHeal: true} (flipping MANUAL → AUTO so Renovate merges deploy)
  • Untouched (already strict in git): admin-tools, authentik, cnpg-operator, root-apps
  • Explicitly kept MANUAL (per Viktor's call): monitoring, infrastructure, felhom, gitea, pihole, database-system

Important behavioral note: root-apps does NOT enforce syncPolicy.automated drift between git and live (consistent with the imperative auto-sync model). So jarr and version-checker will also need a one-off kubectl patch after merge to actually flip live. That's part of the go-live step.
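
A minimal sketch of what that one-off patch could look like, written as a small Python helper that prints the `kubectl patch` commands instead of running them. The `argocd` namespace is an assumption (this PR does not state where the Application CRs live), and the helper names are hypothetical; the patch body mirrors the `automated` block added in git for `jarr` and `version-checker`.

```python
import json
import shlex

APPS = ["jarr", "version-checker"]
NAMESPACE = "argocd"  # assumption: namespace holding the Application CRs


def build_patch() -> dict:
    """JSON merge patch mirroring the automated block codified in git."""
    return {
        "spec": {
            "syncPolicy": {
                "automated": {"enabled": True, "prune": True, "selfHeal": True}
            }
        }
    }


def patch_command(app: str, namespace: str = NAMESPACE) -> str:
    """Render a shell-safe kubectl command for one Application."""
    payload = json.dumps(build_patch(), separators=(",", ":"))
    parts = [
        "kubectl", "patch", "applications.argoproj.io", app,
        "-n", namespace,
        "--type", "merge",
        "-p", payload,
    ]
    return " ".join(shlex.quote(part) for part in parts)


if __name__ == "__main__":
    for app in APPS:
        print(patch_command(app))
```

Printing the commands rather than executing them keeps the go-live step reviewable; the output can be pasted into a shell once the namespace has been confirmed.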

Next step (not in this PR)

Phase 4 dry-run on the branch's config via RENOVATE_DRY_RUN=full to preview the blast radius before merge.

🤖 Generated with Claude Code

admin added 1 commit 2026-06-05 07:08:08 +02:00
Two coordinated changes — open PR only, do NOT merge until dry-run passes.

1) admin-system/renovate.yaml: flip packageRules from Tier 1 allowlist to
   default-allow with safety gates. Adds prHourlyLimit=8 + prConcurrentLimit=8
   to throttle the first wave. New rules (7 total, order-sensitive):
   - "*"                    : 3-day stability gate (minimumReleaseAge)
   - minor/patch            : automerge via platformAutomerge
   - major                  : dependencyDashboardApproval (manual gate)
   - k3s-bundled (3 images) : disabled (ride k3s upgrades)
   - critical-core (6 imgs) : automerge=false (Viktor merges manually)
     - gitea/gitea, ghcr.io/goauthentik/{server,ldap,proxy},
       ghcr.io/cloudnative-pg/cloudnative-pg, quay.io/argoproj/argocd
     - ArgoCD + authentik /ldap and /proxy are no-ops (not pinned in repo)
   - termix                 : versioning=loose, extractVersion for "release-X.Y.Z"
   - wanderer-db + -web     : groupName=wanderer (one PR, prevents file race)
   enabledManagers unchanged ([kubernetes, helm-values]) — keeps Helmfile-
   managed infra invisible.

2) argocd-apps/homelab.yaml: codify per-app auto-sync intent in git
   (currently lives only on live CRs via UI — DR risk).
   - 35 existing bare-AUTO apps: add `automated: {enabled: true}` (matches live).
   - jarr, version-checker: add `automated: {enabled: true, prune: true,
     selfHeal: true}` (flipping MANUAL -> AUTO so Renovate merges deploy).
   - Untouched: admin-tools, authentik, cnpg-operator, root-apps (already
     have strict automated in git); monitoring, infrastructure, felhom,
     gitea, pihole, database-system (explicitly kept MANUAL per Viktor).
   NOTE: root-apps does NOT enforce syncPolicy.automated drift between git
   and live, so jarr + version-checker will also need a one-off kubectl
   patch after merge to actually become AUTO live. Done in go-live step.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
admin added 1 commit 2026-06-05 07:27:17 +02:00
Debug-level dry-run showed:
  Dependency ghcr.io/lukegus/termix has unsupported/unversioned value
  release-1.11.0 (versioning=loose)
  Skipping ghcr.io/lukegus/termix because no currentDigest or pinDigests

`versioning: loose + extractVersion` doesn't work as intended here:
Renovate evaluates the currentValue (`release-1.11.0`) against the loose
parser BEFORE extractVersion is applied. loose can't parse a prefixed
value, so Renovate falls back to digest-based comparison; we don't pin
digests, so it silently skips and no PRs are ever opened. (Upstream has
published release-1.11.1, release-1.11.2, and a major bump to
release-2.3.2 since we deployed.)

Fix: use `versioning: regex:^release-(?<major>\d+)\.(?<minor>\d+)\.(?<patch>\d+)$`
which parses the whole tag including the `release-` prefix. The named
major/minor/patch groups let Renovate categorize bumps correctly so
the existing minor/patch automerge and major dashboard-approval rules
apply normally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
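
To illustrate how the named groups in the regex above let bumps be categorized, here is a small Python sketch. It is not Renovate's implementation: Python spells named groups `(?P<name>...)` where the Renovate pattern uses `(?<name>...)`, and the function names `parse_tag` and `bump_type` are hypothetical.

```python
import re
from typing import Optional, Tuple

# Same pattern as the commit message, in Python's named-group syntax.
TAG_PATTERN = re.compile(
    r"^release-(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)$"
)


def parse_tag(tag: str) -> Optional[Tuple[int, int, int]]:
    """Return (major, minor, patch) for a release-X.Y.Z tag, else None."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        return None
    return int(match["major"]), int(match["minor"]), int(match["patch"])


def bump_type(current: str, candidate: str) -> Optional[str]:
    """Classify an upgrade as major, minor or patch; None if not an upgrade."""
    cur, new = parse_tag(current), parse_tag(candidate)
    if cur is None or new is None or new <= cur:
        return None
    if new[0] != cur[0]:
        return "major"
    if new[1] != cur[1]:
        return "minor"
    return "patch"


if __name__ == "__main__":
    for tag in ("release-1.11.2", "release-2.3.2"):
        print(tag, bump_type("release-1.11.0", tag))
```

With the deployed tag `release-1.11.0`, this model classifies `release-1.11.2` as a patch bump and `release-2.3.2` as a major bump, which is the distinction the minor/patch automerge and major approval rules depend on.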
admin added 1 commit 2026-06-05 07:37:43 +02:00
Debug dry-run revealed why termix (and reloader/homepage/headlamp,
pending for 8 days) sit in "Pending Status Checks" indefinitely:

  Marking 2 release(s) as pending, as they do not have a
  releaseTimestamp and we're running with
  minimumReleaseAgeBehaviour=timestamp-required
  "depName": "ghcr.io/lukegus/termix"
  "versions": ["release-1.11.2", "release-1.11.1"]
  "check": "minimumReleaseAge"

ghcr.io OCI manifests for these images don't expose a release
timestamp Renovate can read, so the default `timestamp-required`
mode turns the 3-day stability gate into an INFINITE hold for
ghcr.io packages -- silently. PRs are never opened.

Switching to `timestamp-optional` (the other supported value, per Renovate
source: lib/config/options/index.ts) makes the gate best-effort: the
3-day window is still enforced for any package the datasource gives a
timestamp for; packages without a timestamp are allowed through.
Restores intended behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
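
The difference between the two modes, as described in the commit message above, can be sketched as a small decision function. This is a simplified model of the described behaviour only, not Renovate's code; the function name and signature are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def passes_release_age(
    release_timestamp: Optional[datetime],
    now: datetime,
    minimum_age: timedelta = timedelta(days=3),
    behaviour: str = "timestamp-required",
) -> bool:
    """Model of the stability gate for a single release.

    timestamp-required: a release without a timestamp never passes.
    timestamp-optional: a release without a timestamp is let through.
    """
    if release_timestamp is None:
        return behaviour == "timestamp-optional"
    return now - release_timestamp >= minimum_age


if __name__ == "__main__":
    now = datetime(2026, 6, 5, tzinfo=timezone.utc)
    # A release with no readable timestamp, as reported for termix on ghcr.io.
    print(passes_release_age(None, now))
    print(passes_release_age(None, now, behaviour="timestamp-optional"))
```

In the model, a missing timestamp under `timestamp-required` fails the gate on every run, which is the indefinite hold seen in the debug log, while `timestamp-optional` lets such a release through without any age check.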
admin added 1 commit 2026-06-05 07:43:38 +02:00
Last commit's global `minimumReleaseAgeBehaviour: timestamp-optional` did
two unwanted things:

  1) Dry-run showed 0 "Would commit" branches (was 33 before). The flag
     appears to alter Renovate's filtering more broadly than expected and
     is not the right knob here.
  2) Automated security review correctly flagged the global form as
     fail-open: a missing timestamp on ANY package would bypass the
     stability gate, weakening supply-chain protection across the fleet.

Narrow fix instead:
  - Revert the global setting (back to default `timestamp-required`).
  - Add `minimumReleaseAge: "0 days"` ONLY to the termix packageRule.
    ghcr.io OCI manifests for ghcr.io/lukegus/termix don't expose a
    release timestamp Renovate can read, so the global 3-day gate would
    otherwise hold updates indefinitely (this is the same class of issue
    that's been keeping reloader/homepage/headlamp on "Pending Status
    Checks" for 8+ days). Major bumps still gated by the global major
    rule (`dependencyDashboardApproval: true`).

Other ghcr.io packages with the same issue (reloader, homepage, headlamp)
remain on the dashboard's "Pending Status Checks" list and can be
force-approved per-update via the checkbox UX. That's a slower but safer
manual-approval path that preserves the supply-chain gate's intent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
admin added 1 commit 2026-06-05 07:53:51 +02:00
Replaces the security-flagged `minimumReleaseAge: 0` bypass with a
proper datasource swap.

Why: ghcr.io OCI manifests for ghcr.io/lukegus/termix don't expose a
release timestamp, so Renovate's default `timestamp-required` mode
holds updates indefinitely. The previous fix (zeroing the gate) was
flagged as a supply-chain control regression -- correctly, since it
weakens the stability protection for that package.

Cleaner fix: point Renovate's version lookup at the upstream GitHub
Releases (Termix-SSH/Termix per the OCI source label) where timestamps
ARE published. The 3-day gate then works for termix the same way it
works for other packages with intact timestamps. Renovate still
updates the same image -- the manager extracts ghcr.io/lukegus/termix
from termix.yaml and writes the new tag back; only the version-source
lookup is redirected. The ghcr.io registry hosts every release-X.Y.Z
tag (verified release-2.3.2 present), so the writeback target stays
valid.

Major bumps (1.x -> 2.x) continue to queue for dashboard approval via
the global major rule.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
admin merged commit e147d829e7 into main 2026-06-05 07:58:04 +02:00
Reference: admin/homelab-manifests#16