ADR-004: Forgejo in-cluster Actions runner on railiance01

Decides long-lived runner Deployment with DinD sidecar; updates
RAIL-HO-WP-0005 runner model decision.
tegwick 2026-07-03 22:29:28 +02:00
parent b32c56db4f
commit 6b0ededee2
2 changed files with 124 additions and 9 deletions

@@ -46,14 +46,18 @@ change is made there.
 ## Key Decisions to Confirm
-1. Public/private hostname for Forgejo and whether Gitea remains reachable
-during the transition.
+1. ~~Public/private hostname for Forgejo~~ **DECIDED 2026-07-03:**
+`forgejo.coulomb.social` → railiance01 (`92.205.62.239`). DNS active;
+Traefik edge live; Forgejo workload not deployed yet (404). Gitea remains
+canonical until migration drills pass. Record:
+`the-custodian/docs/forgejo-production-decisions.md`.
 2. Mail delivery path for password reset and account recovery
 (SMTP relay, sender domain, SPF/DKIM/DMARC expectations).
 3. Package registry scope: container images only at first, or also generic,
 npm, PyPI, Go, Maven, and Helm packages.
-4. Actions runner model: in-cluster ephemeral runners, long-lived runner pod,
-or isolated host runner.
+4. ~~Actions runner model~~ **DECIDED 2026-07-03:** in-cluster long-lived runner
+Deployment with DinD sidecar on railiance01 (`ADR-004`). Interim coulombcore
+host runner retired after cutover.
 5. Backup destination and retention target for database, repositories,
 attachments, LFS, Actions artifacts/logs, and package data.
 6. Cutover mode: freeze-and-migrate all repos in one window, or staged
@@ -98,8 +102,7 @@ The probe is destroyed or explicitly archived after production Forgejo is live.
 ```
 operator / agents / developers
--> private HTTPS endpoint
--> railiance01 ingress
+-> https://forgejo.coulomb.social (railiance01 Traefik ingress)
 -> forgejo Service in forgejo namespace
 -> Forgejo Deployment/StatefulSet
 -> forgejo-db CloudNative PG Cluster in databases namespace
@@ -144,7 +147,7 @@ manual, unsupported, or explicitly out of scope.
 ```task
 id: RAIL-HO-WP-0005-T02
-status: todo
+status: progress
 priority: high
 needs_human: true
 state_hub_task_id: "f88115bf-4f99-49ef-a415-0b23750141b3"
@@ -152,10 +155,14 @@ state_hub_task_id: "f88115bf-4f99-49ef-a415-0b23750141b3"
 Decide the production choices listed in "Key Decisions to Confirm".
+**Partial (2026-07-03):** hostname and in-cluster runner model decided (`ADR-004`).
+Remaining: SMTP, package scope, backup, cutover mode. See
+`the-custodian/docs/forgejo-production-decisions.md`.
 Expected output:
 - A short decision record in this workplan or a dedicated ADR.
-- Hostname and exposure model.
+- Hostname and exposure model. ✓ hostname; exposure follows railiance01 Traefik
 - SMTP provider and sender identity.
 - Package registry scope.
 - Actions runner isolation model.
@@ -229,7 +236,7 @@ Forgejo app running.
 ```task
 id: RAIL-HO-WP-0005-T05
-status: todo
+status: progress
 priority: high
 state_hub_task_id: "11540ba4-d31c-4f64-836b-c6de69107aa4"
 ```
@@ -245,6 +252,10 @@ Minimum scope:
 - Health/status targets in the Makefile.
 - Migration-safe configuration for coexistence with Gitea during the cutover.
+**Partial (2026-07-03):** `railiance-apps` deploy live — HTTPS smoke pass, Actions
+enabled, `coulomb` org + probe workflow success. Remaining: SOPS secrets,
+SMTP, Docker on runner host for image builds, migration drills.
 **Done when:** Forgejo runs on railiance01 against production platform
 services and can serve login, git clone/push, package registry, and admin
 operations.
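
Review note: the "HTTPS smoke pass" and the earlier "not deployed yet (404)" state recorded in this change could be told apart with a small check along the following lines. This is only a sketch, not the check the workplan actually uses. The hostname comes from the decision record. The `/api/healthz` path, the function names, and the verdict labels are assumptions made for illustration and should be adjusted to the deployed Forgejo version.

```python
"""Illustrative HTTPS smoke-check sketch for the Forgejo endpoint."""
import urllib.error
import urllib.request

# Hostname from the decision record; the health path is an ASSUMPTION.
BASE_URL = "https://forgejo.coulomb.social"
HEALTH_PATH = "/api/healthz"


def classify_status(status_code: int) -> str:
    """Map an HTTP status code to a coarse smoke-check verdict (labels are hypothetical)."""
    if 200 <= status_code < 300:
        return "pass"  # the workload answered successfully
    if status_code == 404:
        return "edge-only"  # ingress reachable, but nothing serves this path yet
    return "fail"


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code of a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses; the code is still useful.
        return err.code


def smoke_check(base_url: str = BASE_URL, fetch=fetch_status) -> str:
    """Classify the health endpoint; fetch is injectable so the logic can run offline."""
    return classify_status(fetch(base_url.rstrip("/") + HEALTH_PATH))
```

Calling `smoke_check()` performs one real request. Connection-level failures (DNS, TLS, timeouts) are deliberately not caught and surface as `urllib.error.URLError`, so a Makefile health target wrapping this would fail loudly rather than report a misleading verdict.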