Decides long-lived runner Deployment with DinD sidecar; updates RAIL-HO-WP-0005 runner model decision.
ADR-004 — Forgejo In-Cluster Actions Runner on railiance01
Status: Accepted
Date: 2026-07-03
Deciders: Bernd Worsch (operator), custodian agents
Workplans: RAIL-HO-WP-0005-T02, CUST-WP-0054-T04
Context
Forgejo production runs on railiance01 k3s (railiance-apps, S5). An interim
host runner on coulombcore proved Actions scheduling (coulomb/forgejo-actions-probe)
but:
- coulombcore is a legacy machine slated for drain (CUST-WP-0054-T03).
- Host runners require Docker or Podman on the OS — not installed, not desired on coulombcore long term.
- Forgejo upstream recommends not co-locating runners on the same machine as the forge instance; in-cluster separate pods satisfy isolation while staying on the production fleet node.
RAIL-HO-WP-0005-T02 left the runner model undecided among host, in-cluster, and ephemeral options.
Goal: a coherent Kubernetes-from-the-start CI substrate — Forgejo app, database, ingress, and Actions runner all lifecycle-managed on railiance01.
Decision
Runner placement
Deploy one long-lived Forgejo Actions runner Deployment in the forgejo namespace
on railiance01:
| Component | Implementation |
|---|---|
| Runner | data.forgejo.org/forgejo/runner:6.3.1 |
| Container runtime for jobs | docker:dind sidecar (privileged) |
| State | PVC forgejo-runner-data (.runner, config.yaml, action cache) |
| Registration scope | coulomb organization |
| Runner name | railiance01-build-01 |
| Deploy surface | railiance-apps/manifests/forgejo-runner.yaml |
| Operator targets | make forgejo-runner-deploy, forgejo-runner-status |
Label contract
Preserve Gitea migration compatibility and semantic capability labels:
self-hosted:host,linux:host,linux_amd64:host,container-build:host,registry-publish:host,railiance01:host,ubuntu-latest:docker://node:20-bookworm,docker:docker://node:20-bookworm
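The label string above can be checked mechanically before it goes into the runner configuration. The following is a minimal sketch, assuming each entry has the shape name:backend[:args] with the backend being either host or docker, which is how the string above reads. The names RunnerLabel and parse_labels are hypothetical helpers for illustration and are not part of the runner or of any railiance repo.

```python
from typing import NamedTuple

# Label string copied verbatim from the label contract above.
LABELS = (
    "self-hosted:host,linux:host,linux_amd64:host,container-build:host,"
    "registry-publish:host,railiance01:host,"
    "ubuntu-latest:docker://node:20-bookworm,docker:docker://node:20-bookworm"
)


class RunnerLabel(NamedTuple):
    name: str     # value a workflow would put in `runs-on`
    backend: str  # "host" or "docker" (assumed set of backends)
    image: str    # container image for docker labels, "" for host labels


def parse_labels(raw: str) -> dict[str, RunnerLabel]:
    """Parse a comma-separated `name:backend[:args]` label string."""
    parsed: dict[str, RunnerLabel] = {}
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Split on the first two colons only, so image tags keep their colon.
        name, _, rest = entry.partition(":")
        backend, _, args = rest.partition(":")
        if not backend:
            # This sketch insists on an explicit backend rather than guessing
            # what the runner would default to.
            raise ValueError(f"label {entry!r} has no backend")
        image = args.removeprefix("//") if backend == "docker" else ""
        parsed[name] = RunnerLabel(name, backend, image)
    return parsed


labels = parse_labels(LABELS)
host_labels = sorted(n for n, lab in labels.items() if lab.backend == "host")
docker_labels = sorted(n for n, lab in labels.items() if lab.backend == "docker")

print("host:", host_labels)
print("docker:", docker_labels)
```

Read this way, the contract has six host labels (the capability and migration labels) and two docker labels that both resolve to node:20-bookworm. A check like this could run in CI against the committed runner configuration to catch accidental edits to the label string.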
Security boundaries
- Runner pod receives no cluster-admin kubeconfig and no OpenBao tokens by default.
- registry-publish jobs use repo/org-scoped Forgejo secrets only.
- DinD sidecar runs privileged; this is accepted for single-node railiance01 with a dedicated forgejo namespace. Revisit when a third node or multi-tenant runners appear.
- Registration tokens live in Kubernetes Secret forgejo-runner-registration (SOPS template committed; live value never in Git).
Retire interim host runner
Stop and disable forgejo-runner.service on coulombcore after the in-cluster runner is
healthy. Do not register new host runners without an explicit ADR amendment.
Alternatives considered
| Option | Outcome |
|---|---|
| Host runner + Docker on coulombcore | Rejected — legacy host, contradicts drain plan |
| Host runner + Podman on haskelseed | Viable fallback; not chosen as primary |
| Kaniko/Buildah without DinD | Deferred — higher workflow churn during Gitea migration |
| Multiple ephemeral runner Jobs | Deferred — start with capacity=1 long-lived pod |
Consequences
Positive
- Single-machine production loop: forge + runner on railiance01, workstation not required.
- Container image CI (docker build / docker push) works without OS-level Docker.
- Runner upgrades roll with Git-managed manifests and kubectl/Makefile.
Negative / follow-on
- Privileged DinD increases blast radius within the node — monitor and restrict namespace RBAC.
- SOPS-encrypted registration secret still requires operator age key.
- cluster-deploy/s5-release-check labels remain out of scope until credential paths are reviewed.
Ownership (OAS)
| Concern | Repo | Layer |
|---|---|---|
| ADR + umbrella sequencing | railiance-infra | S1 |
| Runner manifests + Makefile | railiance-apps | S5 |
| Label contract + runner evidence docs | railiance-forge | S5 forge substrate |
| Reusable workflow templates | railiance-enablement | S4 |
References
- railiance-apps/docs/forgejo-on-railiance01.md
- railiance-forge/docs/forgejo-actions-runner-substrate.md
- the-custodian/docs/forgejo-production-decisions.md
- Forgejo runner installation