Add Railiance canary chart template

tegwick 2026-06-27 16:38:00 +02:00
parent a23b3327fc
commit c89d6fad07
6 changed files with 408 additions and 22 deletions

@ -76,6 +76,7 @@ From two bare Linux servers, a Git repo, and valid credentials, you can rebuild
- [Deployment lifecycle](deployment-lifecycle.md)
- [Railiance app.toml contract](app-toml-contract.md)
- [Railiance overlay repo pattern](overlay-repo-pattern.md)
- [Canary Helm template](canary-helm-template.md)
- [Railiance run command](railiance-run-command.md)
## 👥 Contributing

@ -0,0 +1,55 @@
# Canary Helm Template
Generated Railiance overlays include a stage-aware Helm chart for Stage 2
canaries and Stage 3 stable promotion.
The chart keeps stable and canary release identities explicit:
- `railiance.stableRelease` names the current stable release;
- `railiance.canaryRelease` names the Stage 2 candidate release;
- `railiance.stage` selects the rendered identity, labels, and selectors;
- `railiance.previousStable` records rollback context before promotion.
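
As a rough illustration of how `railiance.stage` picks between the two identities, the sketch below mirrors the selection logic in plain Python. The function and parameter names are hypothetical and exist only for this example; the chart itself does this in a Helm helper, and the fallback and 63-character truncation shown here are assumptions about that helper's intent.

```python
# Sketch only: illustrates stage-based identity selection.
# select_release_name and its parameters are hypothetical names.

def select_release_name(stage, stable_release, canary_release,
                        chart_name, helm_release):
    """Return the resource name that a given stage would render under."""
    if (stage or "stable") == "canary":
        # Fall back to "<chart>-canary" when no canary release is set.
        name = canary_release or f"{chart_name}-canary"
    else:
        # Fall back to the Helm release name when no stable release is set.
        name = stable_release or helm_release
    name = name[:63]  # stay within the 63-character DNS label limit
    # Drop one trailing hyphen left over from truncation, if any.
    return name[:-1] if name.endswith("-") else name


if __name__ == "__main__":
    print(select_release_name("canary", "myapp", "myapp-canary", "myapp", "myapp"))
    print(select_release_name("stable", "myapp", "myapp-canary", "myapp", "myapp"))
```

Keeping both names in the values file means a Stage 2 render and a Stage 3 render differ only in `railiance.stage`, which makes the two easy to diff during review.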
## Traffic Shape
The default Stage 2 values use an isolated canary ingress:
```yaml
railiance:
stage: canary
traffic:
mode: isolated
ingress:
enabled: true
```
This creates canary Deployment, Service, and Ingress resources without changing
the stable release. For environments that use Traefik weighted routing, set:
```yaml
railiance:
traffic:
mode: weighted
provider: traefik
stableWeight: 95
canaryWeight: 5
```
The chart then renders a `TraefikService` and `IngressRoute` that split traffic
between the stable and canary services. Other ingress controllers can use the
same stable/canary values layout with controller-specific annotations or a later
provider template.
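
When choosing weights, it can help to compute the share of requests the canary would receive. The sketch below assumes the weights are relative proportions rather than percentages that must sum to 100; `canary_share` is a hypothetical helper, not part of the generated overlay.

```python
# Sketch only: estimates the canary's traffic share from two weights.
# Assumes weights are relative (share = weight / sum of weights).

def canary_share(stable_weight: int, canary_weight: int) -> float:
    """Return the fraction of requests expected to reach the canary."""
    if stable_weight < 0 or canary_weight < 0:
        raise ValueError("weights must be non-negative")
    total = stable_weight + canary_weight
    if total == 0:
        raise ValueError("at least one weight must be positive")
    return canary_weight / total


if __name__ == "__main__":
    # The example values above: stableWeight 95, canaryWeight 5.
    print(f"{canary_share(95, 5):.0%} of traffic goes to the canary")
```

A check like this could run before a Stage 2 attempt to reject a values file that would send more traffic to the canary than intended.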
## Observability And Safety
Generated workloads include:
- Prometheus-compatible scrape annotations on pods and services;
- readiness and liveness HTTP probes;
- conservative resource requests and limits for single-node clusters (Stage 2
defaults: 50m CPU and 128Mi memory requested, 500m CPU and 512Mi memory limit);
- separate `values/stage2-canary.yaml` and `values/stage3-production.yaml` so
canary exposure and stable promotion can be reviewed independently.
Run `tests/stage2-template.sh` in the overlay repo before any Stage 2 attempt.
It verifies the scaffold and runs `helm template` when Helm is available.
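
If a stricter check on the rendered output is wanted, the resource kinds can be matched exactly rather than by substring, so that an `IngressRoute` is not mistaken for an `Ingress`. The following is a minimal sketch, assuming each rendered resource carries a top-level `kind:` line at column zero; `rendered_kinds` is a hypothetical helper and not part of the generated script.

```python
# Sketch only: collects exact top-level resource kinds from rendered manifests.
# Assumes top-level "kind:" keys start at column zero in the rendered YAML.

def rendered_kinds(manifest_text: str) -> set[str]:
    """Return the set of top-level `kind:` values found in the text."""
    kinds = set()
    for line in manifest_text.splitlines():
        if line.startswith("kind:"):
            kinds.add(line.split(":", 1)[1].strip())
    return kinds


if __name__ == "__main__":
    sample = "\n".join([
        "apiVersion: apps/v1",
        "kind: Deployment",
        "---",
        "apiVersion: v1",
        "kind: Service",
        "---",
        "apiVersion: networking.k8s.io/v1",
        "kind: Ingress",
    ])
    missing = {"Deployment", "Service", "Ingress"} - rendered_kinds(sample)
    print("missing kinds:", sorted(missing))
```

Indented `kind:` keys, such as the one inside an `IngressRoute` service reference, are ignored because they do not start at column zero.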

@ -97,8 +97,9 @@ Stage 2 candidate.
The chart is the Railiance deployment wrapper. It may start as a thin Helm
chart around an upstream image and grow only as required by the promotion gates.
It should keep defaults conservative and route production-specific choices
through `values/` files.
Generated charts include stable/canary release identities, Prometheus-compatible
annotations, HTTP probes, resource limits, isolated canary ingress, and optional
Traefik weighted routing. Production-specific choices stay in `values/` files.
### `values/`
@ -112,11 +113,14 @@ Secret values do not belong in these files. Use Kubernetes Secret,
ExternalSecret, OpenBao, KeyCape, or another approved route and record only the
reference name.
### `tests/stage1.sh`
### `tests/stage1.sh` And `tests/stage2-template.sh`
Stage 1 should be runnable without production credentials. The generated script
performs syntax and Helm rendering checks when the relevant tools are available.
Workload-specific tests can extend it.
Stage 2 template validation verifies the canary scaffold, stable/canary values,
Prometheus annotations, rollback labels, and Helm rendering when Helm is
available. Workload-specific tests can extend either script.
### `runbooks/rollback.md`
@ -147,8 +151,8 @@ fetch secrets, or push Git remotes.
1. Generate or update the overlay repo.
2. Fill in accurate image, namespace, health, dependency, and rollback fields.
3. Validate `railiance/app.toml` against the schema.
4. Run `tests/stage1.sh`.
5. Use later T04-T07 commands to run, deploy, observe, promote, and rollback.
4. Run `tests/stage1.sh` and `tests/stage2-template.sh`.
5. Use later T06-T07 commands to deploy, observe, promote, and roll back.
## Safety Rules

@ -66,6 +66,7 @@ This model emphasizes:
### `create_railiance_overlay_repo.sh`
- Scaffolds a local Railiance overlay repo for a third-party upstream app.
- Records upstream identity without vendoring upstream code.
- Generates `railiance/app.toml`, a thin chart, stage values, tests, and runbooks.
- Generates `railiance/app.toml`, a stage-aware canary chart, stage values,
tests, and runbooks.
✦ Railiance is not just code; it's a way of letting infrastructure **colonize new worlds**.

@ -290,6 +290,23 @@ appVersion: "${UPSTREAM_REVISION}"
EOF
cat > "${OUT_DIR}/charts/${APP_ID}/values.yaml" <<EOF
railiance:
stage: stable
stableRelease: ${APP_ID}
canaryRelease: ${APP_ID}-canary
previousStable:
release: ${APP_ID}
imageTag: ""
imageDigest: ""
traffic:
mode: isolated
provider: standard
stableWeight: 100
canaryWeight: 0
routeName: ${APP_ID}-traffic
entryPoints:
- web
image:
repository: ${APP_ID}
tag: ${UPSTREAM_REVISION}
@ -301,6 +318,21 @@ replicaCount: 1
service:
port: 8080
health:
path: /health
readiness:
initialDelaySeconds: 5
periodSeconds: 10
liveness:
initialDelaySeconds: 15
periodSeconds: 20
prometheus:
enabled: true
scrape: true
path: /metrics
port: http
resources:
requests:
cpu: 50m
@ -309,79 +341,358 @@ resources:
cpu: 500m
memory: 512Mi
ingress:
enabled: false
className: ""
host: ""
path: /
pathType: Prefix
annotations: {}
tls: []
deployment:
revisionHistoryLimit: 3
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
maxUnavailable: 0
env: []
secretRefs: []
podAnnotations: {}
EOF
cat > "${OUT_DIR}/charts/${APP_ID}/templates/_helpers.tpl" <<'EOF'
{{- define "railiance.stage" -}}
{{- default "stable" .Values.railiance.stage -}}
{{- end -}}
{{- define "railiance.releaseName" -}}
{{- if eq (include "railiance.stage" .) "canary" -}}
{{- default (printf "%s-canary" .Chart.Name) .Values.railiance.canaryRelease | trunc 63 | trimSuffix "-" -}}
{{- else -}}
{{- default .Release.Name .Values.railiance.stableRelease | trunc 63 | trimSuffix "-" -}}
{{- end -}}
{{- end -}}
{{- define "railiance.image" -}}
{{- if .Values.image.digest -}}
{{- printf "%s@%s" .Values.image.repository .Values.image.digest -}}
{{- else -}}
{{- printf "%s:%s" .Values.image.repository .Values.image.tag -}}
{{- end -}}
{{- end -}}
{{- define "railiance.selectorLabels" -}}
app.kubernetes.io/name: {{ .Chart.Name }}
app.kubernetes.io/instance: {{ include "railiance.releaseName" . }}
railiance.coulomb.social/stage: {{ include "railiance.stage" . }}
{{- end -}}
{{- define "railiance.labels" -}}
helm.sh/chart: {{ printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{ include "railiance.selectorLabels" . }}
{{- end -}}
{{- define "railiance.prometheusAnnotations" -}}
{{- if .Values.prometheus.enabled }}
prometheus.io/scrape: {{ .Values.prometheus.scrape | quote }}
prometheus.io/path: {{ .Values.prometheus.path | quote }}
prometheus.io/port: {{ .Values.prometheus.port | quote }}
{{- end }}
{{- end -}}
EOF
cat > "${OUT_DIR}/charts/${APP_ID}/templates/deployment.yaml" <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ .Release.Name }}
name: {{ include "railiance.releaseName" . }}
labels:
app.kubernetes.io/name: {{ .Chart.Name }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{ include "railiance.labels" . | nindent 4 }}
annotations:
railiance.coulomb.social/stable-release: {{ .Values.railiance.stableRelease | quote }}
railiance.coulomb.social/canary-release: {{ .Values.railiance.canaryRelease | quote }}
railiance.coulomb.social/previous-stable: {{ .Values.railiance.previousStable.release | quote }}
spec:
replicas: {{ .Values.replicaCount }}
revisionHistoryLimit: {{ .Values.deployment.revisionHistoryLimit }}
strategy:
{{ toYaml .Values.deployment.strategy | nindent 4 }}
selector:
matchLabels:
app.kubernetes.io/name: {{ .Chart.Name }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{ include "railiance.selectorLabels" . | nindent 6 }}
template:
metadata:
labels:
app.kubernetes.io/name: {{ .Chart.Name }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{ include "railiance.labels" . | nindent 8 }}
annotations:
{{ include "railiance.prometheusAnnotations" . | nindent 8 }}
{{- with .Values.podAnnotations }}
{{ toYaml . | nindent 8 }}
{{- end }}
spec:
containers:
- name: {{ .Chart.Name }}
image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
image: "{{ include "railiance.image" . }}"
imagePullPolicy: {{ .Values.image.pullPolicy }}
ports:
- name: http
containerPort: {{ .Values.service.port }}
readinessProbe:
httpGet:
path: {{ .Values.health.path | quote }}
port: http
initialDelaySeconds: {{ .Values.health.readiness.initialDelaySeconds }}
periodSeconds: {{ .Values.health.readiness.periodSeconds }}
livenessProbe:
httpGet:
path: {{ .Values.health.path | quote }}
port: http
initialDelaySeconds: {{ .Values.health.liveness.initialDelaySeconds }}
periodSeconds: {{ .Values.health.liveness.periodSeconds }}
{{- with .Values.env }}
env:
{{ toYaml . | nindent 12 }}
{{- end }}
{{- if .Values.secretRefs }}
envFrom:
{{- range .Values.secretRefs }}
- secretRef:
name: {{ . | quote }}
{{- end }}
{{- end }}
resources:
{{ toYaml .Values.resources | indent 12 }}
{{ toYaml .Values.resources | nindent 12 }}
EOF
cat > "${OUT_DIR}/charts/${APP_ID}/templates/service.yaml" <<'EOF'
apiVersion: v1
kind: Service
metadata:
name: {{ .Release.Name }}
name: {{ include "railiance.releaseName" . }}
labels:
app.kubernetes.io/name: {{ .Chart.Name }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{ include "railiance.labels" . | nindent 4 }}
annotations:
{{ include "railiance.prometheusAnnotations" . | nindent 4 }}
spec:
selector:
app.kubernetes.io/name: {{ .Chart.Name }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{ include "railiance.selectorLabels" . | nindent 4 }}
ports:
- name: http
port: {{ .Values.service.port }}
targetPort: http
EOF
cat > "${OUT_DIR}/charts/${APP_ID}/templates/ingress.yaml" <<'EOF'
{{- if and .Values.ingress.enabled (ne .Values.railiance.traffic.mode "weighted") }}
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: {{ include "railiance.releaseName" . }}
labels:
{{ include "railiance.labels" . | nindent 4 }}
annotations:
{{- with .Values.ingress.annotations }}
{{ toYaml . | nindent 4 }}
{{- else }}
railiance.coulomb.social/traffic-mode: {{ .Values.railiance.traffic.mode | quote }}
{{- end }}
spec:
{{- if .Values.ingress.className }}
ingressClassName: {{ .Values.ingress.className | quote }}
{{- end }}
rules:
- host: {{ .Values.ingress.host | quote }}
http:
paths:
- path: {{ .Values.ingress.path | quote }}
pathType: {{ .Values.ingress.pathType }}
backend:
service:
name: {{ include "railiance.releaseName" . }}
port:
name: http
{{- with .Values.ingress.tls }}
tls:
{{ toYaml . | nindent 4 }}
{{- end }}
{{- end }}
EOF
cat > "${OUT_DIR}/charts/${APP_ID}/templates/traefik-weighted.yaml" <<'EOF'
{{- if and .Values.ingress.enabled (eq .Values.railiance.traffic.mode "weighted") (eq .Values.railiance.traffic.provider "traefik") }}
{{- $routeName := default (printf "%s-weighted" .Chart.Name) .Values.railiance.traffic.routeName }}
apiVersion: traefik.io/v1alpha1
kind: TraefikService
metadata:
name: {{ $routeName }}
labels:
{{ include "railiance.labels" . | nindent 4 }}
spec:
weighted:
services:
- name: {{ .Values.railiance.stableRelease }}
port: {{ .Values.service.port }}
weight: {{ .Values.railiance.traffic.stableWeight }}
- name: {{ .Values.railiance.canaryRelease }}
port: {{ .Values.service.port }}
weight: {{ .Values.railiance.traffic.canaryWeight }}
---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
name: {{ $routeName }}
labels:
{{ include "railiance.labels" . | nindent 4 }}
spec:
entryPoints:
{{ toYaml .Values.railiance.traffic.entryPoints | nindent 4 }}
routes:
- kind: Rule
match: "Host(`{{ .Values.ingress.host }}`) && PathPrefix(`{{ .Values.ingress.path }}`)"
services:
- name: {{ $routeName }}
kind: TraefikService
port: {{ .Values.service.port }}
{{- end }}
EOF
cat > "${OUT_DIR}/values/stage1.yaml" <<EOF
railiance:
stage: stable
stableRelease: ${APP_ID}
canaryRelease: ${APP_ID}-canary
image:
repository: ${APP_ID}
tag: ${UPSTREAM_REVISION}
EOF
cat > "${OUT_DIR}/values/stage2-canary.yaml" <<EOF
railiance:
stage: canary
stableRelease: ${APP_ID}
canaryRelease: ${APP_ID}-canary
previousStable:
release: ${APP_ID}
imageTag: ""
imageDigest: ""
traffic:
mode: isolated
provider: standard
stableWeight: 100
canaryWeight: 0
routeName: ${APP_ID}-traffic
entryPoints:
- web
image:
repository: ${APP_ID}
tag: ${UPSTREAM_REVISION}
replicaCount: 1
ingress:
enabled: true
host: ${APP_ID}-canary.local
path: /
annotations:
railiance.coulomb.social/canary-mode: isolated
railiance.coulomb.social/stable-release: ${APP_ID}
railiance.coulomb.social/canary-release: ${APP_ID}-canary
resources:
requests:
cpu: 50m
memory: 128Mi
limits:
cpu: 500m
memory: 512Mi
EOF
cat > "${OUT_DIR}/values/stage3-production.yaml" <<EOF
railiance:
stage: stable
stableRelease: ${APP_ID}
canaryRelease: ${APP_ID}-canary
previousStable:
release: ${APP_ID}
imageTag: ""
imageDigest: ""
traffic:
mode: stable
provider: standard
stableWeight: 100
canaryWeight: 0
routeName: ${APP_ID}-traffic
entryPoints:
- web
image:
repository: ${APP_ID}
tag: ${UPSTREAM_REVISION}
replicaCount: 2
ingress:
enabled: true
host: ${APP_ID}.local
path: /
annotations:
railiance.coulomb.social/stage: stable
resources:
requests:
cpu: 100m
memory: 256Mi
limits:
cpu: "1"
memory: 1Gi
EOF
cat > "${OUT_DIR}/tests/stage2-template.sh" <<EOF
#!/usr/bin/env bash
set -euo pipefail
cd "\$(dirname "\${BASH_SOURCE[0]}")/.."
python3 - <<'PY'
import pathlib
import tomllib
contract = tomllib.loads(pathlib.Path('railiance/app.toml').read_text())
stage2 = contract['stages']['stage2']
assert stage2['release'] == '${APP_ID}-canary'
assert stage2['canary_mode'] in {'isolated', 'weighted', 'header', 'shadow'}
for rel in ('${APP_ID}', '${APP_ID}-canary'):
assert rel
required_paths = [
'charts/${APP_ID}/templates/deployment.yaml',
'charts/${APP_ID}/templates/service.yaml',
'charts/${APP_ID}/templates/ingress.yaml',
'charts/${APP_ID}/templates/traefik-weighted.yaml',
'values/stage2-canary.yaml',
'values/stage3-production.yaml',
]
for item in required_paths:
assert pathlib.Path(item).exists(), item
values = pathlib.Path('values/stage2-canary.yaml').read_text()
assert 'stage: canary' in values
assert 'stableRelease: ${APP_ID}' in values
assert 'canaryRelease: ${APP_ID}-canary' in values
chart = pathlib.Path('charts/${APP_ID}/templates/deployment.yaml').read_text()
assert 'prometheus.io/scrape' in pathlib.Path('charts/${APP_ID}/templates/_helpers.tpl').read_text()
assert 'previous-stable' in chart
print('stage2 canary scaffold ok')
PY
if command -v helm >/dev/null 2>&1; then
helm template ${APP_ID}-canary charts/${APP_ID} -f values/stage2-canary.yaml >/tmp/${APP_ID}-stage2-canary-render.yaml
grep -q 'kind: Deployment' /tmp/${APP_ID}-stage2-canary-render.yaml
grep -q 'kind: Service' /tmp/${APP_ID}-stage2-canary-render.yaml
grep -q 'kind: Ingress' /tmp/${APP_ID}-stage2-canary-render.yaml
echo 'stage2 helm template ok'
else
echo 'helm unavailable; verified stage2 canary scaffold files only'
fi
EOF
chmod +x "${OUT_DIR}/tests/stage2-template.sh"
cat > "${OUT_DIR}/tests/stage1.sh" <<EOF
#!/usr/bin/env bash
set -euo pipefail
@ -430,6 +741,11 @@ This overlay follows the Railiance three-stage lifecycle.
- Stage 2 deploys an isolated canary by default.
- Stage 3 replaces the stable release only after Stage 2 acceptance.
Run \`tests/stage2-template.sh\` before the first Stage 2 attempt. To use
weighted Traefik routing, change \`railiance.traffic.mode\` to \`weighted\`, set
\`provider: traefik\`, and choose explicit stable/canary weights in
\`values/stage2-canary.yaml\`.
Before Stage 2, fill in real image repositories, platform dependencies,
observability endpoints, and rollback target details.
EOF

@ -160,7 +160,7 @@ Expected behavior:
```task
id: RAIL-BS-WP-0006-T05
status: todo
status: done
priority: high
state_hub_task_id: "47b8cd47-99c7-4f31-a147-ea16afde7217"
```
@ -179,6 +179,15 @@ Minimum requirements:
**Done when:** a canary deployment can be created without hand-editing cluster
resources.
2026-06-27: Updated generated overlay charts for Stage 2 canaries. The
scaffold now emits stable/canary release identities, isolated canary ingress by
default, optional Traefik weighted routing, Prometheus-compatible annotations,
HTTP probes, conservative single-node resource limits, rollback labels,
separate Stage 2/Stage 3 values, and `tests/stage2-template.sh`. Verified a
fresh Forgejo overlay with schema validation, Stage 1 run, and Stage 2 scaffold
checks; Helm rendering was skipped because Helm was unavailable in that
environment.
---
### T06 - railiance deploy --stage 2 and observation tooling