---
id: RAIL-BS-WP-0004
type: workplan
title: "Current-Environment Safety Net"
domain: railiance
repo: railiance-cluster
status: active
owner: tegwick
topic_slug: railiance
state_hub_workstream_id: "7e8b0c20-51eb-40c9-9e3b-85dd380d7625"
created: "2026-02-25"
updated: "2026-03-10"
---

# Current-Environment Safety Net

## Goal

Ensure that backup and disaster recovery for the current single-server
environment are operational and tested before any ThreePhoenix infrastructure
migration work begins. Aligned to OAS Stack S2 (railiance-cluster owns backup tooling).

## Context

The backup toolchain lives in `tools/cmd/railiance-backup` and
`tools/cmd/railiance-preflight`, dispatched via `bin/railiance`. It protects:

| Asset | Method | Risk without backup |
|---|---|---|
| Custodian State Hub DB | pg_dump → age → Nextcloud | Total loss of workstreams, decisions, history |
| Claude config + memory | tar → age → Nextcloud | Loss of MCP registration, project memory |
| Git repos | Gitea remotes | SPOF: Gitea runs on the same server being migrated |

Decision D2: Nextcloud upload-only file drop as backup destination.

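The `pg_dump → age` portion of the State Hub DB row could look roughly like the sketch below. The database name, recipient key, and output path are hypothetical parameters, and the `state-hub-<date>.sql.age` naming scheme is an assumption rather than something `railiance-backup` is known to use. The upload to the Nextcloud file drop is left out because the URL format is not specified here.

```bash
# Sketch only: names and file layout are assumptions, not the real railiance-backup.

# Build a dated artifact name, e.g. state-hub-2026-03-10.sql.age
backup_filename() {
  local date_stamp="$1"
  printf 'state-hub-%s.sql.age\n' "$date_stamp"
}

# Dump one database and encrypt the stream with age before it touches disk.
# Runs in a subshell with pipefail so a failed pg_dump is not masked by age.
dump_and_encrypt() (
  set -o pipefail
  db_name="$1"; recipient="$2"; out_file="$3"
  pg_dump "$db_name" | age -r "$recipient" -o "$out_file"
)

# Usage (all values hypothetical):
#   dump_and_encrypt custodian "$AGE_RECIPIENT" "/tmp/$(backup_filename "$(date +%F)")"
```
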
## OAS Alignment

Per ADR-003, backup tooling lives in **S2 (railiance-cluster)**. The preflight
check covers all five OAS stack repos:

| Repo | OAS Layer |
|---|---|
| railiance-infra | S1 — OS & Provisioning |
| railiance-cluster | S2 — Kubernetes Runtime |
| railiance-platform | S3 — Platform Services |
| railiance-enablement | S4 — Developer Tooling |
| railiance-apps | S5 — Workloads & Endpoints |

Plus cross-domain repos: the-custodian, markitect_project, activity-core,
net-kingdom, issue-facade, binect-js, kaizen-agentic.

## Boundary

- Backup execution: this repo (`bin/railiance backup`).
- Backup destination: Nextcloud file drop (URL in `~/.config/railiance/nc-upload-url` or hardcoded).
- Restore procedure: `docs/backup-restore.md`.

---

## Tasks

### T01 — Update preflight repo list to OAS 5-repo layout

```task
id: T01
status: done
priority: high
state_hub_task_id: "4526a842-ea31-4874-9231-92ab556cfe7b"
```

Update the `REPOS` array in `tools/cmd/railiance-preflight`: remove
`railiance-bootstrap` and add `railiance-infra`, `railiance-cluster`,
`railiance-platform`, `railiance-enablement`, `railiance-apps`. Also add all
active project repos.

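As an illustration of what such a repo list and check might look like, here is a minimal sketch. It assumes every repo is a normal (non-bare) checkout under one base directory and tests only for a `.git` directory; the actual array contents and checks in `railiance-preflight` may differ, and `check_repos` is a hypothetical name.

```bash
# Sketch only: the real REPOS array and checks in railiance-preflight may differ.
REPOS=(
  railiance-infra
  railiance-cluster
  railiance-platform
  railiance-enablement
  railiance-apps
)

# Report each repo under base_dir; return 1 if any lacks a .git directory.
check_repos() {
  local base_dir="$1"; shift
  local repo missing=0
  for repo in "$@"; do
    if [ -d "$base_dir/$repo/.git" ]; then
      printf '%-7s %s\n' ok "$repo"
    else
      printf '%-7s %s\n' MISSING "$repo"
      missing=1
    fi
  done
  return "$missing"
}

# Usage (base directory is hypothetical): check_repos "$HOME" "${REPOS[@]}"
```
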
**Done when:** `bin/railiance preflight` checks all current repos.

---

### T02 — Fix stale repo references in backup-restore.md

```task
id: T02
status: done
priority: medium
state_hub_task_id: "a6313e06-1976-46a7-8e31-df4eb2eca880"
```

Update the restore procedure: replace `railiance-bootstrap` with
`railiance-cluster` and `railiance-hosts` with `railiance-infra`, then add the
three new OAS repos.

**Done when:** the doc accurately reflects the current 5-repo OAS stack.

---

### T03 — Add `make backup` and `make preflight` targets

```task
id: T03
status: done
priority: medium
state_hub_task_id: "05d42a55-921f-4aa7-bb76-e8af9c7e0ac3"
```

Add both targets to the root Makefile so the safety net is discoverable from
`make help`.

**Done when:** `make backup` and `make preflight` both work.

---

### T04 — Run current backup and verify upload

```task
id: T04
status: done
priority: high
state_hub_task_id: "08233868-d522-4117-bc4e-6c0f52545665"
```

Run `bin/railiance backup` and confirm that both the DB and config files appear
in the Nextcloud file drop.

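The freshness of the `.last-backup` stamp named in this task's done criterion could be checked with something like the sketch below. The stamp's location is not stated here, so the path is passed in as an argument, and the 25-hour default window is an assumption chosen to tolerate a daily schedule. It relies on `find -mmin`, which GNU and BSD `find` both provide.

```bash
# Sketch only: the stamp location and the 25-hour window are assumptions.

# Succeed if the stamp file exists and was modified within max_minutes.
stamp_is_fresh() {
  local stamp="$1" max_minutes="${2:-1500}"
  [ -f "$stamp" ] || return 1
  # find prints the path only when its mtime is newer than the cutoff.
  [ -n "$(find "$stamp" -mmin "-$max_minutes")" ]
}

# Usage (path is hypothetical):
#   stamp_is_fresh "$HOME/.cache/railiance/.last-backup" && echo "backup is fresh"
```
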
**Done when:** the backup completes without error and the `.last-backup` stamp is fresh.

---

### T05 — Verify or install cron job

```task
id: T05
status: todo
priority: medium
state_hub_task_id: "2d5acff7-4a4e-4ddd-ad06-08237ad3dac8"
```

Confirm that the daily 02:00 cron job is installed and has run at least once:

```bash
# Is the job scheduled?
crontab -l | grep railiance
# Did it run? Show the most recent log lines.
tail -n 20 ~/.cache/railiance/backup.log
```

If missing, install:
```bash
(crontab -l 2>/dev/null; echo "0 2 * * * /home/worsch/railiance-cluster/bin/railiance backup >> ~/.cache/railiance/backup.log 2>&1") | crontab -
```

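Running the install command a second time would append a duplicate entry. A small filter like the one below keeps the install idempotent: it reads the existing crontab on stdin and adds the entry only when that exact line is absent. This is a sketch, and `ensure_cron_line` is a hypothetical helper, not part of `bin/railiance`.

```bash
# Sketch only: ensure_cron_line is a hypothetical helper.

# Read a crontab on stdin; print it back with `entry` appended if not present.
ensure_cron_line() {
  local entry="$1" current
  current="$(cat)"
  if printf '%s\n' "$current" | grep -qxF -- "$entry"; then
    # Exact line already scheduled: pass the crontab through unchanged.
    printf '%s\n' "$current"
  else
    if [ -n "$current" ]; then
      printf '%s\n' "$current"
    fi
    printf '%s\n' "$entry"
  fi
}

# Usage, with ENTRY holding the full cron line from the install command:
#   crontab -l 2>/dev/null | ensure_cron_line "$ENTRY" | crontab -
```
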
**Done when:** the cron entry is listed and the log shows a successful run.

---

### T06 — Run restore drill

```task
id: T06
status: todo
priority: medium
state_hub_task_id: "f8e4a094-c367-40eb-b895-da17bc144b07"
```

Run the minimal restore drill from `docs/backup-restore.md` against the
current backup. Record completion in `~/.cache/railiance/restore-drill.log`.

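Recording the drill result might be as simple as appending one timestamped line per run. The sketch below assumes a line format of UTC timestamp, PASS or FAIL, and exit code; `docs/backup-restore.md` may prescribe a different format, and `record_drill` is a hypothetical helper.

```bash
# Sketch only: the log line format is an assumption.

# Append one line describing a drill run to the given log file.
record_drill() {
  local log="$1" status="$2" result
  if [ "$status" -eq 0 ]; then result=PASS; else result=FAIL; fi
  mkdir -p "$(dirname "$log")"
  printf '%s restore-drill %s exit=%s\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$result" "$status" >> "$log"
}

# Usage, immediately after the drill command finishes:
#   record_drill "$HOME/.cache/railiance/restore-drill.log" "$?"
```
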
**Done when:** the drill exits 0 and a log entry is written.

---

## References

- Decision D2: Nextcloud as backup destination (`DECISIONS.md`)
- Backup tooling: `tools/cmd/railiance-backup`, `tools/cmd/railiance-preflight`
- Restore procedure: `docs/backup-restore.md`
- Extension points: EP-RAIL-003 (git bare mirrors), EP-RAIL-004 (secondary offsite copy)