railiance-cluster/workplans/RAIL-BS-WP-0004-safety-net.md
tegwick a15ceee92b chore(workplan): add state_hub_task_ids to WP-0004
Written by fix-consistency: T01-T06 registered in state hub.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10 15:24:28 +01:00

id: RAIL-BS-WP-0004
type: workplan
title: Current-Environment Safety Net
domain: railiance
repo: railiance-cluster
status: active
owner: tegwick
topic_slug: railiance
state_hub_workstream_id: 7e8b0c20-51eb-40c9-9e3b-85dd380d7625
created: 2026-02-25
updated: 2026-03-10

Current-Environment Safety Net

Goal

Ensure that backup and disaster recovery for the current single-server environment are operational and tested before any ThreePhoenix infrastructure migration work begins. Aligned to OAS Stack S2 (railiance-cluster owns backup tooling).

Context

The backup toolchain lives in tools/cmd/railiance-backup and tools/cmd/railiance-preflight, dispatched via bin/railiance. It protects:

| Asset | Method | Risk without backup |
|---|---|---|
| Custodian State Hub DB | pg_dump → age → Nextcloud | Total loss of workstreams, decisions, history |
| Claude config + memory | tar → age → Nextcloud | Loss of MCP registration, project memory |
| Git repos | Gitea remotes | SPOF: Gitea runs on the same server being migrated |

Decision D2: Nextcloud upload-only file drop as backup destination.
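
The pg_dump → age → Nextcloud and tar → age → Nextcloud chains listed above share the same last two steps. The following is a minimal sketch of that shared tail, not the actual railiance-backup implementation. The function names, the artifact naming scheme, and the use of a plain HTTP PUT are all assumptions; the workplan does not say how the file drop authenticates or how files are named.

```shell
# Sketch only: names, file naming, and upload method are assumptions,
# not taken from tools/cmd/railiance-backup.

# Build a UTC-timestamped artifact name, e.g. state-hub-20260310T020000Z.dump.age
artifact_name() {
  printf '%s-%s.%s.age\n' "$1" "$(date -u +%Y%m%dT%H%M%SZ)" "$2"
}

# Encrypt stdin with age for one recipient, then upload the result via HTTP PUT.
# Runs in a subshell so that `set -eu` and the cleanup trap stay local.
encrypt_and_upload() (
  set -eu
  name="$1" recipient="$2" upload_url="$3"
  tmp="$(mktemp -d)"
  trap 'rm -rf "$tmp"' EXIT
  age -r "$recipient" -o "$tmp/$name"
  curl --fail --silent --show-error -T "$tmp/$name" "${upload_url%/}/$name"
)

# Hypothetical usage (database name, key, and directory are placeholders):
#   pg_dump --format=custom "$DB" | encrypt_and_upload "$(artifact_name state-hub dump)" "$AGE_RECIPIENT" "$UPLOAD_URL"
#   tar -czf - -C "$CONFIG_DIR" . | encrypt_and_upload "$(artifact_name config tar.gz)" "$AGE_RECIPIENT" "$UPLOAD_URL"
```

Encrypting before the file leaves the machine means the upload-only destination never sees plaintext, which fits an upload-only file drop that the sender cannot read back.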

OAS Alignment

Per ADR-003, backup tooling lives in S2 (railiance-cluster). The preflight check covers all five OAS stack repos:

| Repo | OAS Layer |
|---|---|
| railiance-infra | S1 — OS & Provisioning |
| railiance-cluster | S2 — Kubernetes Runtime |
| railiance-platform | S3 — Platform Services |
| railiance-enablement | S4 — Developer Tooling |
| railiance-apps | S5 — Workloads & Endpoints |

The check also covers the cross-domain repos: the-custodian, markitect_project, activity-core, net-kingdom, issue-facade, binect-js, kaizen-agentic.

Boundary

  • Backup execution: this repo (bin/railiance backup).
  • Backup destination: Nextcloud file drop (URL in ~/.config/railiance/nc-upload-url or hardcoded).
  • Restore procedure: docs/backup-restore.md.


Tasks

T01 — Update preflight repo list to OAS 5-repo layout

id: T01
status: done
priority: high
state_hub_task_id: "4526a842-ea31-4874-9231-92ab556cfe7b"

Update tools/cmd/railiance-preflight REPOS array: remove railiance-bootstrap, add railiance-infra, railiance-cluster, railiance-platform, railiance-enablement, railiance-apps. Add all active project repos.

Done when: bin/railiance preflight checks all current repos.
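
This workplan does not spell out what the preflight verifies for each repo. A minimal sketch, assuming the check means "the repo is present and has no uncommitted changes"; the function names and the base-directory layout are hypothetical and not taken from tools/cmd/railiance-preflight.

```shell
# Sketch only: assumes "preflight" means each repo exists and is clean.

# Succeeds only if DIR is a git work tree with no uncommitted changes.
# Returns 2 if DIR is missing or not a git repository.
repo_is_clean() {
  git -C "$1" rev-parse --is-inside-work-tree >/dev/null 2>&1 || return 2
  [ -z "$(git -C "$1" status --porcelain)" ]
}

# Check every named repo under BASE; returns non-zero if any check fails.
preflight() {
  base="$1"
  shift
  failed=0
  for repo in "$@"; do
    if repo_is_clean "$base/$repo"; then
      echo "ok      $repo"
    else
      echo "FAILED  $repo"
      failed=1
    fi
  done
  return "$failed"
}

# Hypothetical usage with the five OAS stack repos:
#   preflight "$HOME" railiance-infra railiance-cluster railiance-platform \
#     railiance-enablement railiance-apps
```

Checking every repo before reporting, rather than stopping at the first failure, gives one complete list of what still needs attention before a migration.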


T02 — Fix stale repo references in backup-restore.md

id: T02
status: done
priority: medium
state_hub_task_id: "a6313e06-1976-46a7-8e31-df4eb2eca880"

Update restore procedure: railiance-bootstrap → railiance-cluster, railiance-hosts → railiance-infra, add the three new OAS repos.

Done when: doc accurately reflects the current 5-repo OAS stack.


T03 — Add make backup and make preflight targets

id: T03
status: done
priority: medium
state_hub_task_id: "05d42a55-921f-4aa7-bb76-e8af9c7e0ac3"

Add both targets to the root Makefile so the safety net is discoverable from make help.

Done when: make backup and make preflight both work.


T04 — Run current backup and verify upload

id: T04
status: done
priority: high
state_hub_task_id: "08233868-d522-4117-bc4e-6c0f52545665"

Run bin/railiance backup and confirm both DB and config files appear in the Nextcloud file drop.

Done when: backup completes without error and .last-backup stamp is fresh.
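
"Fresh" is not defined here. One way to check it, sketched under the assumption that the stamp is an ordinary file whose modification time marks the last successful backup; the function name and the 24-hour default are illustrative.

```shell
# Sketch only: the stamp's location and the freshness window are assumptions.

# Succeeds if FILE exists and was modified less than MAX_MINUTES ago
# (default 1440 minutes, i.e. 24 hours).
stamp_is_fresh() {
  file="$1"
  max_minutes="${2:-1440}"
  [ -f "$file" ] || return 1
  [ -n "$(find "$file" -mmin "-$max_minutes")" ]
}

# Hypothetical usage (path to the stamp is a placeholder):
#   stamp_is_fresh "$STAMP_FILE" 1440 && echo "backup is fresh"
```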


T05 — Verify or install cron job

id: T05
status: todo
priority: medium
state_hub_task_id: "2d5acff7-4a4e-4ddd-ad06-08237ad3dac8"

Confirm that the daily 02:00 cron job is installed and has run at least once:

crontab -l | grep railiance
tail -n 20 ~/.cache/railiance/backup.log

If missing, install:

(crontab -l 2>/dev/null; echo "0 2 * * * /home/worsch/railiance-cluster/bin/railiance backup >> ~/.cache/railiance/backup.log 2>&1") | crontab -

Done when: cron is listed and log shows a successful run.
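
The install one-liner above appends the entry unconditionally, so running it twice schedules the backup twice. A minimal sketch of an idempotent variant follows; the helper name ensure_cron_entry is hypothetical. It filters the crontab text and does not call crontab itself, so it can be tried safely.

```shell
# Sketch only: ensure_cron_entry is an illustrative helper, not part of bin/railiance.

# Read a crontab on stdin and print it unchanged, appending ENTRY
# only if no identical line is already present.
ensure_cron_entry() {
  entry="$1"
  current="$(cat)"
  if [ -n "$current" ]; then
    printf '%s\n' "$current"
  fi
  if ! printf '%s\n' "$current" | grep -Fxq -- "$entry"; then
    printf '%s\n' "$entry"
  fi
}

# Usage with the entry from this task:
#   ENTRY='0 2 * * * /home/worsch/railiance-cluster/bin/railiance backup >> ~/.cache/railiance/backup.log 2>&1'
#   crontab -l 2>/dev/null | ensure_cron_entry "$ENTRY" | crontab -
```

The match is on the whole line as a fixed string (grep -Fx), so the asterisks in the schedule are not treated as a pattern.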


T06 — Run restore drill

id: T06
status: todo
priority: medium
state_hub_task_id: "f8e4a094-c367-40eb-b895-da17bc144b07"

Run the minimal restore drill from docs/backup-restore.md against the current backup. Record completion in ~/.cache/railiance/restore-drill.log.

Done when: drill exits 0 and log entry is written.
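
The drill steps themselves are defined in docs/backup-restore.md and are not reproduced here. As a sketch of the recording half only, the wrapper below runs any command and appends one line with the exit status to the log. The function name and the log line format are assumptions.

```shell
# Sketch only: record_drill and its log format are illustrative.

# Run a command, append a UTC timestamp and its exit status to LOG,
# and return that same status to the caller.
record_drill() {
  log="$1"
  shift
  status=0
  mkdir -p "$(dirname "$log")"
  "$@" || status=$?
  printf '%s restore-drill exit=%d\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$status" >> "$log"
  return "$status"
}

# Hypothetical usage (the drill command is a placeholder):
#   record_drill ~/.cache/railiance/restore-drill.log ./run-restore-drill.sh
```

Failed drills are logged as well, so the log shows when a drill was attempted and not only when it passed.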


References

  • Decision D2: Nextcloud as backup destination (DECISIONS.md)
  • Backup tooling: tools/cmd/railiance-backup, tools/cmd/railiance-preflight
  • Restore procedure: docs/backup-restore.md
  • Extension points: EP-RAIL-003 (git bare mirrors), EP-RAIL-004 (secondary offsite copy)