| id | type | title | domain | repo | status | owner | topic_slug | repo_goal_id | state_hub_workstream_id | created | updated |
|---|---|---|---|---|---|---|---|---|---|---|---|
| RAIL-BS-WP-0006 | workplan | Staged Promotion Lifecycle | financials | railiance-cluster | finished | railiance | railiance | 6ea441f7-7fe3-4598-922b-38baf20c0580 | cb72d3ba-1863-43c2-a2a5-49ac75fc2603 | 2026-02-24 | 2026-06-27 |
Staged Promotion Lifecycle
Goal
Design and implement the three-stage deployment lifecycle as the core Railiance application promotion pattern:
- Stage 1: local development and validation.
- Stage 2: canary on production infrastructure.
- Stage 3: full production promotion with rollback.
This lifecycle should become the repeatable path for native Railiance apps and third-party upstream applications wrapped by a Railiance overlay repo.
Why This Belongs Before Forgejo
Forgejo will become critical production infrastructure. Before moving the source forge itself, Railiance needs a well-defined promotion lifecycle so the Forgejo deployment, Actions runners, package registry, and future upgrades can move through the same staged gates as every other important workload.
Boundary
This workplan lives in railiance-cluster because it defines cluster runtime
promotion mechanics and the canonical handoff between local validation,
canary deployment, and production routing.
Expected cross-repo handoffs:
- railiance-enablement: developer-facing CLI templates and CI workflow conventions.
- railiance-platform: shared platform dependencies used by canaries.
- railiance-apps: application Helm values and workload-specific promotion definitions.
Tasks
T01 - Write deployment lifecycle specification
id: RAIL-BS-WP-0006-T01
status: done
priority: high
state_hub_task_id: "fbfc341f-8ccb-4950-a85d-3e59c4f5b87f"
Write docs/deployment-lifecycle.md.
The spec should define:
- Stage 1, Stage 2, and Stage 3 semantics.
- Required checks before each stage.
- Canary acceptance gates.
- Rollback expectations.
- Human approval gates for production-critical workloads.
Done when: the lifecycle is clear enough to apply to Forgejo as a later production workload.
2026-06-16: Added docs/deployment-lifecycle.md and linked it from
docs/README.md. The specification defines Stage 1 local validation, Stage 2
production canary, Stage 3 production promotion, required checks and evidence,
canary acceptance gates, rollback expectations, human approval gates for
production-critical workloads, and the Forgejo readiness questions that must be
answered before cutover.
T02 - Define railiance directory schema and app.toml contract
id: RAIL-BS-WP-0006-T02
status: done
priority: high
state_hub_task_id: "523cf928-bb0e-4109-a172-abf029c62885"
Define the repository-local railiance/ directory schema and app.toml
contract for native and third-party applications.
Minimum contract:
- App identity and ownership.
- Stage definitions.
- Required platform dependencies.
- Health checks and observability endpoints.
- Promotion and rollback commands.
- Secret references without plaintext secret values.
Done when: a repo can declare how it moves through the Railiance promotion lifecycle without bespoke instructions.
2026-06-27: Added docs/app-toml-contract.md, schemas/railiance-app.schema.json, and examples/railiance/app.toml. The v1 contract covers app identity, ownership, source/artifact policy, platform dependencies, secret references without plaintext values, health and observability endpoints, stage commands/checks/evidence, canary and promotion modes, rollback strategy, and human approval gates.
T03 - Overlay repo pattern and creation script
id: RAIL-BS-WP-0006-T03
status: done
priority: medium
state_hub_task_id: "7cd378f2-0319-407a-9ce7-2c6d1a6d6d24"
Design the overlay repo pattern for third-party upstream applications and add
create_railiance_overlay_repo.sh or equivalent tooling.
The pattern should keep upstream code and Railiance deployment concerns cleanly separated while still allowing reproducible promotion.
Done when: a third-party app can be wrapped without forking deployment logic into the upstream repository.
2026-06-27: Added docs/overlay-repo-pattern.md and tools/create_railiance_overlay_repo.sh, plus the bin/railiance create-overlay dispatcher entry. The scaffold records upstream identity in railiance/upstream.toml, generates a schema-valid railiance/app.toml, stage values, a thin Helm chart, Stage 1 test script, rollback runbook, and promotion notes without vendoring upstream code or touching secrets.
T04 - railiance run command
id: RAIL-BS-WP-0006-T04
status: done
priority: high
state_hub_task_id: "95c3311b-04bb-4c83-bda3-47958217b665"
Implement the Stage 1 railiance run command for local development and
validation.
Expected behavior:
- Read railiance/app.toml.
- Start or validate the local development target.
- Run defined local health checks.
- Emit a machine-readable result suitable for later promotion gates.
Done when: at least one representative app can complete Stage 1 locally.
2026-06-27: Added tools/cmd/railiance-run, the bin/railiance run dispatcher entry, and docs/railiance-run-command.md. The command reads railiance/app.toml, runs Stage 1 commands and local checks, and emits railiance.run-result.v1 JSON without command logs or secret values. Updated the overlay generator so a generated Forgejo overlay completes Stage 1 locally in this environment; Helm rendering is optional when Helm is unavailable.
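The shape of a Stage 1 run that emits a machine-readable result without command logs might look like the sketch below. Only the schema name `railiance.run-result.v1` comes from the workplan; the function name `run_stage1` and the `stage`, `passed`, and `checks` fields are assumptions, and the actual tools/cmd/railiance-run implementation may differ.

```python
import json
import subprocess
import sys


def run_stage1(commands: list[list[str]]) -> dict:
    """Run each command and record only its exit status, never its output."""
    checks = []
    for argv in commands:
        # Discard stdout/stderr so logs and secret values cannot leak into the result.
        completed = subprocess.run(
            argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        checks.append({"command": argv[0], "passed": completed.returncode == 0})
    return {
        "schema": "railiance.run-result.v1",  # name from the workplan; other fields are assumed
        "stage": 1,
        # Fail closed: an empty check list does not count as a pass.
        "passed": bool(checks) and all(check["passed"] for check in checks),
        "checks": checks,
    }


if __name__ == "__main__":
    # Usage: run a trivial command with the current interpreter and print the result.
    print(json.dumps(run_stage1([[sys.executable, "-c", "pass"]]), indent=2))
```

Recording only the executable name and exit status keeps the result small enough to attach as evidence to a later promotion gate.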
T05 - Canary Helm chart template
id: RAIL-BS-WP-0006-T05
status: done
priority: high
state_hub_task_id: "47b8cd47-99c7-4f31-a147-ea16afde7217"
Create the Stage 2 canary Helm chart template.
Minimum requirements:
- Stable and canary release identities.
- Weighted routing or equivalent traffic split through the chosen ingress path.
- Prometheus-compatible annotations.
- Resource limits appropriate for single-node and future ThreePhoenix use.
- Rollback-safe values layout.
Done when: a canary deployment can be created without hand-editing cluster resources.
2026-06-27: Updated generated overlay charts for Stage 2 canaries. The
scaffold now emits stable/canary release identities, isolated canary ingress by
default, optional Traefik weighted routing, Prometheus-compatible annotations,
HTTP probes, conservative single-node resource limits, rollback labels,
separate Stage 2/Stage 3 values, and tests/stage2-template.sh. Verified a
fresh Forgejo overlay with schema validation, Stage 1 run, and Stage 2 scaffold
checks; Helm rendering was skipped because Helm is unavailable in this
environment.
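As an illustration of the optional weighted routing, the sketch below builds a Traefik `TraefikService` manifest that splits traffic between a stable and a canary Service. The service names and port are hypothetical, the `traefik.io/v1alpha1` API group assumes a recent Traefik release (older releases use `traefik.containo.us/v1alpha1`), and the generated overlay chart may express the split differently.

```python
import json


def weighted_service(
    name: str, stable: str, canary: str, canary_weight: int, port: int = 80
) -> dict:
    """Build a TraefikService manifest splitting traffic between stable and canary."""
    if not 0 <= canary_weight <= 100:
        raise ValueError("canary_weight must be between 0 and 100")
    return {
        "apiVersion": "traefik.io/v1alpha1",  # assumption: depends on the Traefik version
        "kind": "TraefikService",
        "metadata": {"name": name},
        "spec": {
            "weighted": {
                "services": [
                    # The stable release receives whatever the canary does not.
                    {"name": stable, "port": port, "weight": 100 - canary_weight},
                    {"name": canary, "port": port, "weight": canary_weight},
                ]
            }
        },
    }


if __name__ == "__main__":
    # Usage: send 10% of traffic to a hypothetical canary Service.
    manifest = weighted_service("forgejo", "forgejo-stable", "forgejo-canary", 10)
    print(json.dumps(manifest, indent=2))
```

Setting the canary weight to 0 routes all traffic to the stable release without deleting the canary, which suits a rollback-safe values layout.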
T06 - railiance deploy --stage 2 and observation tooling
id: RAIL-BS-WP-0006-T06
status: done
priority: medium
state_hub_task_id: "6a5c7422-fcb1-49d1-8153-e891bd1c27fa"
Implement Stage 2 deployment and observation commands.
Expected behavior:
- Deploy the canary from declared app metadata.
- Show rollout state, pod health, ingress/routing state, and key metrics.
- Fail closed when prerequisites or health gates are missing.
Done when: Stage 2 can be run and observed from a repeatable command path.
2026-06-27: Added tools/cmd/railiance-stage2 and dispatcher entries for
bin/railiance deploy and bin/railiance observe. Deploy emits a
railiance.stage2-deploy-result.v1 plan by default, can run Helm server dry-run
or apply when tools and cluster access are present, and fails closed when
required paths, Helm, or approval evidence are missing. Observe emits a
railiance.stage2-observe-result.v1 target plan by default and runs live
kubectl rollout, pod, ingress, and metrics checks only with --live. Updated
generated overlays to declare the repeatable Stage 2 plan commands.
T07 - railiance promote, rollback, and onboarding guide
id: RAIL-BS-WP-0006-T07
status: done
priority: medium
state_hub_task_id: "476198f6-0049-4ac4-9593-6723c86c9602"
Implement Stage 3 promotion and rollback commands, then write the reference onboarding guide.
Expected output:
- railiance promote for controlled production promotion.
- railiance rollback for reverting to the previous stable version.
- A guide showing how a representative app adopts the lifecycle.
- Explicit human approval points for critical infrastructure workloads.
Done when: a representative app can move Stage 1 -> Stage 2 -> Stage 3 and back through rollback using documented commands.
2026-06-27: Added tools/cmd/railiance-stage3 and dispatcher entries for
bin/railiance promote and bin/railiance rollback. Both commands default to
non-mutating JSON plans, apply modes require approval evidence and Helm, and
rollback apply also requires a Helm revision for helm-revision strategy.
Added docs/promote-rollback-onboarding.md with the representative Stage 1 ->
Stage 2 -> Stage 3 -> rollback path and explicit human approval points for
critical workloads. Updated generated overlays to declare promote/rollback plan
commands.
Dependencies
This workplan should be done before the Forgejo production cutover. It can run in parallel with preparatory ThreePhoenix design, but its Stage 2/3 behavior should be validated against the intended ThreePhoenix cluster model.