Reviewed state and next todos
parent 83d041fe1c
commit 301a63d843
2 changed files with 234 additions and 11 deletions
214
AGENTS.md
Normal file
|
|
@@ -0,0 +1,214 @@
|
||||||
|
# railiance-infra — Codex Instructions
|
||||||
|
|
||||||
|
## Custodian State Hub Integration
|
||||||
|
|
||||||
|
This project is tracked as the **railiance** domain in the Custodian State Hub.
|
||||||
|
Hub topic ID: `ca369340-a64e-442e-98f1-a4fa7dc74a38`
|
||||||
|
|
||||||
|
The State Hub runs locally at http://127.0.0.1:8000. The MCP server (`state-hub`)
|
||||||
|
exposes tools for reading and writing state without touching the API directly.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Session Protocol
|
||||||
|
|
||||||
|
**On receiving your first message — before writing any response text — execute
|
||||||
|
this orientation sequence. Do not greet, do not ask what to do first.**
|
||||||
|
|
||||||
|
**Step 1 — Call the State Hub**
|
||||||
|
```
|
||||||
|
get_domain_summary("railiance") # workstreams, blocking decisions, recent progress, SBOM status
|
||||||
|
```
|
||||||
|
If the call fails, the API is offline: `cd ~/the-custodian/state-hub && make api`
|
||||||
|
|
||||||
|
**Step 2 — Scan local workplans**
|
||||||
|
|
||||||
|
Read every `.md` file under `workplans/`. Use `Glob(pattern="**/*.md", path="workplans/")`
|
||||||
|
or Bash `ls workplans/` to discover them. For each file with `status: active`,
|
||||||
|
extract and note:
|
||||||
|
- The workplan title and ID
|
||||||
|
- All tasks whose `status` is `todo` or `in_progress`
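
The scan in Step 2 could be sketched roughly as follows. This is a minimal illustration, not the tooling the agent actually uses: it assumes each workplan starts with a flat `key: value` frontmatter block between `---` lines and that tasks live in fenced `task` blocks with flat `key: value` fields. The helper names (`parse_fields`, `summarize_workplan`, `scan_workplans`) are hypothetical.

```python
import re
from pathlib import Path

FENCE = "`" * 3  # built at runtime so the example contains no literal fence
OPEN_STATUSES = {"todo", "in_progress"}
FRONTMATTER = re.compile(r"---\n(.*?)\n---", re.DOTALL)
TASK_BLOCK = re.compile(re.escape(FENCE) + r"task\n(.*?)" + re.escape(FENCE), re.DOTALL)


def parse_fields(block):
    """Parse flat `key: value` lines; surrounding double quotes are stripped."""
    fields = {}
    for line in block.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip().strip('"')
    return fields


def summarize_workplan(text):
    """Return (frontmatter, open_tasks), or None if the workplan is not active."""
    match = FRONTMATTER.match(text)
    if not match:
        return None
    front = parse_fields(match.group(1))
    if front.get("status") != "active":
        return None
    tasks = [parse_fields(body) for body in TASK_BLOCK.findall(text)]
    return front, [t for t in tasks if t.get("status") in OPEN_STATUSES]


def scan_workplans(root="workplans"):
    """Yield (path, frontmatter, open_tasks) for each active workplan under root."""
    for path in sorted(Path(root).glob("**/*.md")):
        summary = summarize_workplan(path.read_text(encoding="utf-8"))
        if summary is not None:
            front, open_tasks = summary
            yield path, front, open_tasks
```

The sketch returns the workplan ID from the frontmatter; where the title lives (a heading or a frontmatter key) is not specified here, so it is left to the caller.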
|
||||||
|
|
||||||
|
**Step 3 — Present orientation to the user**
|
||||||
|
|
||||||
|
Output a concise brief covering:
|
||||||
|
1. **Active workstreams** (from state hub) for the `railiance` domain — title,
|
||||||
|
task counts, any blocking decisions
|
||||||
|
2. **Pending tasks for this repo** — from local `workplans/` files (Step 2)
|
||||||
|
plus any state hub tasks with `[repo:railiance-infra]` in their title
|
||||||
|
3. **Goal guidance** — if the summary contains a `goal_guidance` key, act on it:
|
||||||
|
- **`needs_workplan`** entries: for each active repo goal with no linked workstream,
|
||||||
|
surface it as the top suggested action — *"Repo goal '{title}' has no workplan yet.
|
||||||
|
Suggest: create workplans/RAIL-HO-WP-NNNN-<slug>.md and register a workstream
|
||||||
|
with repo_goal_id='{goal_id}'"*. Treat this as higher priority than continuing
|
||||||
|
existing work unless Bernd says otherwise.
|
||||||
|
- **`alignment_warnings`** entries: if active workstreams exist but are not linked
|
||||||
|
to the current repo goal, name the most recently active one and note:
|
||||||
|
*"Current work on '{recent_workstream_title}' may not be aligned with the active
|
||||||
|
goal '{active_goal_title}'. Continue unless you hear otherwise — but flag it."*
|
||||||
|
4. **Suggested next action** — the highest-priority open item across all sources,
|
||||||
|
with goal alignment taken into account
|
||||||
|
5. **SBOM status** — is `last_sbom_at` set for this repo? If not, note it as a gap
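
As a rough illustration of item 3, the goal-guidance handling might look like the sketch below. The shape of the summary is an assumption: `goal_guidance` is taken to be a dict with `needs_workplan` and `alignment_warnings` lists whose entries carry the placeholder names quoted above (`title`, `goal_id`, `recent_workstream_title`, `active_goal_title`). The function name `guidance_messages` is hypothetical.

```python
def guidance_messages(summary):
    """Return brief lines for the goal guidance, highest priority first."""
    guidance = summary.get("goal_guidance") or {}
    messages = []
    # needs_workplan entries come first: they outrank continuing existing work.
    for goal in guidance.get("needs_workplan", []):
        messages.append(
            "Repo goal '{title}' has no workplan yet. Suggest: create "
            "workplans/RAIL-HO-WP-NNNN-<slug>.md and register a workstream "
            "with repo_goal_id='{goal_id}'".format(**goal)
        )
    # alignment_warnings are advisory: work continues, but the gap is flagged.
    for warning in guidance.get("alignment_warnings", []):
        messages.append(
            "Current work on '{recent_workstream_title}' may not be aligned with "
            "the active goal '{active_goal_title}'. Continue unless you hear "
            "otherwise, but flag it.".format(**warning)
        )
    return messages
```

A summary without a `goal_guidance` key yields an empty list, so the brief simply omits the section.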
|
||||||
|
|
||||||
|
If there are no workstreams at all: follow the First Session Protocol below.
|
||||||
|
|
||||||
|
**During work:**
|
||||||
|
- Use `record_decision()` for any decision that affects direction or dependencies.
|
||||||
|
- Use `add_progress_event()` for notable events (milestones, blockers, insights).
|
||||||
|
- Use `resolve_decision()` to close a decision once the choice is made.
|
||||||
|
|
||||||
|
> **Design boundary:** The State Hub is a *read model*. Two write operations are
|
||||||
|
> permanently sanctioned: **Resolving Decisions** and **Suggesting Next Steps**.
|
||||||
|
> The bootstrap tools (`create_workstream`, `create_task`, `update_task_status`)
|
||||||
|
> are only for First Session Protocol. Formal work structure — workplans, tasks —
|
||||||
|
> belongs in the domain repo as files (ADR-001), not managed through the hub alone.
|
||||||
|
|
||||||
|
**At the end of every session:**
|
||||||
|
- Call `add_progress_event()` with a summary of what was accomplished or decided.
|
||||||
|
Include `topic_id: ca369340-a64e-442e-98f1-a4fa7dc74a38` and the relevant `workstream_id`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Repo Boundary Rule
|
||||||
|
|
||||||
|
This agent is responsible for files **in this repo only**.
|
||||||
|
|
||||||
|
- **Do not** write files or make commits in any other repository
|
||||||
|
- **Do not** create workplan files in other repos on their behalf
|
||||||
|
- When you identify work for another registered repo (**ecosystem todo**):
|
||||||
|
create a state hub task with `[repo:<slug>]` in the title — the other repo's
|
||||||
|
agent will see it at session start and create its own workplan
|
||||||
|
- When you identify work for an upstream repo (**third-party todo**):
|
||||||
|
create a contribution artifact in `contrib/` and register it
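
The `[repo:<slug>]` title tag used for ecosystem todos can be written and read with a few lines of code. The sketch below is illustrative only: the allowed slug characters and the helper names (`tag_title`, `repo_slugs`, `tasks_for_repo`) are assumptions, and tasks are modelled as plain dicts with a `title` key.

```python
import re

# Assumed slug alphabet: letters, digits, dot, underscore, hyphen.
REPO_TAG = re.compile(r"\[repo:([A-Za-z0-9._-]+)\]")


def tag_title(slug, title):
    """Prefix a task title with the ecosystem-todo tag for the target repo."""
    return "[repo:{}] {}".format(slug, title)


def repo_slugs(title):
    """Return every repo slug tagged in a task title."""
    return REPO_TAG.findall(title)


def tasks_for_repo(tasks, slug):
    """Keep only the tasks whose title is tagged for the given repo."""
    return [task for task in tasks if slug in repo_slugs(task["title"])]
```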
|
||||||
|
|
||||||
|
Terminology and workflows: `http://localhost:3000/docs/inter-repo-communication`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### First Session Protocol
|
||||||
|
|
||||||
|
Triggered when `get_domain_summary("railiance")` shows **no workstreams** for the `railiance`
|
||||||
|
topic. The project is registered but work has not yet been structured.
|
||||||
|
|
||||||
|
**Step 1 — Understand the project (read, don't write)**
|
||||||
|
- `~/the-custodian/canon/projects/railiance/project_charter_v0.1.md` — purpose, scope
|
||||||
|
- `~/the-custodian/canon/projects/railiance/roadmap_v0.1.md` — planned phases
|
||||||
|
- Scan the repo root: README, directory structure, existing code or docs
|
||||||
|
|
||||||
|
**Step 2 — Survey in-progress work**
|
||||||
|
- Look for TODOs, open branches, half-finished files, notes
|
||||||
|
- Note what is already done vs. what is clearly started but incomplete
|
||||||
|
|
||||||
|
**Step 3 — Propose workstreams to Bernd**
|
||||||
|
Propose 1–3 workstreams — each a coherent strand of work lasting weeks to months,
|
||||||
|
named clearly, anchored to a roadmap phase. **Wait for approval before creating.**
|
||||||
|
|
||||||
|
**Step 4 — Create workplan file first, then DB record**
|
||||||
|
Per ADR-001, work items originate as files in the repo:
|
||||||
|
```
|
||||||
|
workplans/RAIL-HO-WP-NNNN-<slug>.md ← write this first
|
||||||
|
```
|
||||||
|
Then register in the hub:
|
||||||
|
```
|
||||||
|
create_workstream(topic_id="ca369340-a64e-442e-98f1-a4fa7dc74a38", title="...", owner="...", description="...")
|
||||||
|
create_task(workstream_id="<id>", title="...", priority="high|medium|low")
|
||||||
|
```
|
||||||
|
|
||||||
|
**Step 5 — Record the setup**
|
||||||
|
```
|
||||||
|
add_progress_event(
|
||||||
|
summary="First session: structured railiance work into N workstreams, M tasks",
|
||||||
|
event_type="milestone",
|
||||||
|
topic_id="ca369340-a64e-442e-98f1-a4fa7dc74a38",
|
||||||
|
detail={"workstreams": [...], "tasks_created": M}
|
||||||
|
)
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Workplan Convention (ADR-001)
|
||||||
|
|
||||||
|
Work items MUST originate as files in this repo before being registered in the hub.
|
||||||
|
|
||||||
|
**File location:** `workplans/<ID>-<slug>.md`
|
||||||
|
**Frontmatter required:** `id`, `type: workplan`, `domain`, `repo`, `status`,
|
||||||
|
`state_hub_workstream_id`, `state_hub_task_id` (per task)
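
A check for the required frontmatter could be sketched like this. It assumes the frontmatter and each task block have already been parsed into plain dicts; the function name `frontmatter_problems` is hypothetical. During the First Session Protocol the file is written before the hub record exists, so the hub IDs would legitimately be missing until registration is done.

```python
REQUIRED_KEYS = ("id", "type", "domain", "repo", "status", "state_hub_workstream_id")


def frontmatter_problems(front, tasks=()):
    """Return human-readable problems; an empty list means the workplan passes."""
    problems = ["missing frontmatter key: " + key for key in REQUIRED_KEYS if not front.get(key)]
    if front.get("type") and front["type"] != "workplan":
        problems.append("type must be 'workplan', got " + repr(front["type"]))
    for index, task in enumerate(tasks, start=1):
        if not task.get("state_hub_task_id"):
            label = task.get("id") or "task #%d" % index
            problems.append("missing state_hub_task_id in " + label)
    return problems
```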
|
||||||
|
|
||||||
|
When another domain's agent identifies work for this repo, it creates a state hub
|
||||||
|
task with `[repo:railiance-infra]` in the title (an **ecosystem todo**). You will
|
||||||
|
see it at session start via `get_domain_summary("railiance")`. When you pick it up, create
|
||||||
|
the corresponding workplan file in `workplans/` (ADR-001) and begin work.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Contribution Tracking
|
||||||
|
|
||||||
|
Track upstream contributions in `contrib/` — bug reports (BR), feature requests
|
||||||
|
(FR), extension-point proposals (EP), upstream PRs (UPR).
|
||||||
|
|
||||||
|
```
|
||||||
|
contrib/
|
||||||
|
bug-reports/ # br-YYYY-MM-DD--org--repo--slug.md
|
||||||
|
feature-requests/ # fr-YYYY-MM-DD--org--repo--slug.md
|
||||||
|
extension-points/ # EP-RAIL-NNN--org--repo--slug.md
|
||||||
|
upstream-prs/ # upr-YYYY-MM-DD--org--repo--slug.md
|
||||||
|
```
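
The naming patterns above can be generated mechanically. The following is a sketch under stated assumptions: the slug rule (lowercase, hyphen-separated) and the three-digit zero padding for `NNN` are guesses, and `contrib_path` is a hypothetical helper, not part of the State Hub tooling.

```python
import re
from datetime import date

DATED_DIRS = {"br": "bug-reports", "fr": "feature-requests", "upr": "upstream-prs"}


def slugify(text):
    """Lowercase the text and collapse anything non-alphanumeric into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def contrib_path(kind, org, repo, title, when=None, ep_number=None):
    """Build the contrib/ path for a bug report, feature request, EP, or upstream PR."""
    slug = slugify(title)
    if kind == "ep":
        if ep_number is None:
            raise ValueError("extension-point proposals need an ep_number")
        # Assumption: NNN is a zero-padded three-digit sequence number.
        return "contrib/extension-points/EP-RAIL-{:03d}--{}--{}--{}.md".format(
            ep_number, org, repo, slug
        )
    day = (when or date.today()).isoformat()
    return "contrib/{}/{}-{}--{}--{}--{}.md".format(DATED_DIRS[kind], kind, day, org, repo, slug)
```

The resulting path is the kind of value that would be passed as `body_path` when registering the contribution.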
|
||||||
|
|
||||||
|
Templates: `~/the-custodian/canon/standards/contrib-templates/`
|
||||||
|
Convention: `~/the-custodian/canon/standards/contribution-convention_v0.1.md`
|
||||||
|
|
||||||
|
```
|
||||||
|
register_contribution(type="br|fr|ep|upr", title="...", target_org="...",
|
||||||
|
target_repo="...", body_path="contrib/...", related_workstream_id="<uuid>")
|
||||||
|
update_contribution_status(contribution_id="<uuid>", status="submitted")
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### SBOM
|
||||||
|
|
||||||
|
After updating dependencies, re-ingest the SBOM:
|
||||||
|
```bash
|
||||||
|
repo_path="$(pwd)"  # run from the railiance-infra root; capture it before changing directory
cd ~/the-custodian/state-hub
make ingest-sbom REPO=railiance-infra SCAN=1 REPO_PATH="$repo_path"
|
||||||
|
```
|
||||||
|
|
||||||
|
Check compliance: `http://localhost:3000/repos`
|
||||||
|
Standard: `~/the-custodian/canon/standards/sbom-convention_v0.1.md`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Remote Execution & State Hub Tunnel
|
||||||
|
|
||||||
|
This repo is designed to be worked on **from the HostEurope server** (or any
|
||||||
|
remote Linux box with access to the managed hosts). The State Hub runs locally
|
||||||
|
on Bernd's workstation at `127.0.0.1:8000` and is not publicly reachable.
|
||||||
|
|
||||||
|
**Before SSHing to the remote server, start a reverse tunnel on your local machine:**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
ssh -R 8000:127.0.0.1:8000 <user>@<remote-host>
|
||||||
|
```
|
||||||
|
|
||||||
|
This forwards the remote's `localhost:8000` back to your local State Hub.
|
||||||
|
Codex on the remote host can then reach the MCP server, and `get_domain_summary`
works as normal.
|
||||||
|
|
||||||
|
**Verify the tunnel is live from the remote:**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
curl http://127.0.0.1:8000/state/health
|
||||||
|
# expected: {"status":"ok"}
|
||||||
|
```
|
||||||
|
|
||||||
|
**If the tunnel is not up (degraded mode):**
|
||||||
|
The State Hub call in Step 1 will fail. In that case:
|
||||||
|
- Skip Step 1 — proceed from local workplans only (Step 2)
|
||||||
|
- Note that goal guidance and progress logging will be unavailable
|
||||||
|
- Log any progress events manually from your local machine after the session
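
The choice between normal and degraded mode can be made programmatically. The sketch below uses only the Python standard library and the health endpoint shown above; the step labels returned by `orientation_plan` are hypothetical names for illustration, and the `fetch` parameter exists only so the check can be exercised without a live tunnel.

```python
import json
import urllib.request

HEALTH_URL = "http://127.0.0.1:8000/state/health"


def hub_is_healthy(fetch=None, url=HEALTH_URL, timeout=3.0):
    """Return True only if the health endpoint answers with {"status": "ok"}."""

    def default_fetch(target):
        with urllib.request.urlopen(target, timeout=timeout) as response:
            return response.read().decode("utf-8")

    try:
        body = (fetch or default_fetch)(url)
        return json.loads(body).get("status") == "ok"
    except (OSError, ValueError, AttributeError):
        # Connection refused, timeout, invalid JSON, or an unexpected payload shape.
        return False


def orientation_plan(healthy):
    """List the orientation steps to run, skipping Step 1 when the hub is unreachable."""
    if healthy:
        return ["call_state_hub", "scan_local_workplans", "present_brief"]
    return ["scan_local_workplans", "present_brief_without_goal_guidance", "log_progress_manually_later"]
```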
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Quick Reference
|
||||||
|
|
||||||
|
`~/the-custodian/state-hub/mcp_server/TOOLS.md` — compact MCP tool reference
|
||||||
|
|
@@ -8,7 +8,7 @@ status: active
 owner: worsch
 topic_slug: railiance
 created: "2026-03-26"
-updated: "2026-03-27"
+updated: "2026-05-02"
 supersedes: RAIL-PL-WP-0001
 state_hub_workstream_id: "cee078e9-b18c-4f84-8a8a-6f27c2f9f407"
 ---
@@ -432,7 +432,7 @@ context.
 
 ---
 
-### T09 — Deploy state-hub to cluster (S5)
+### T09 — Deploy state-hub to railiance01 as cluster primary (S5)
 
 ```task
 id: RAIL-HO-WP-0004-T09
@@ -440,12 +440,16 @@ status: todo
 priority: medium
 state_hub_task_id: "d2afe78a-eb51-4ce9-b332-f181323d2370"
 needs_human: true
-intervention_note: "Requires decisions: final hostname/domain for state-hub, whether to use Gitea container registry or ghcr.io, and approval before data migration from workstation postgres."
+intervention_note: "Requires decisions: final hostname/domain or tunnel-only endpoint, registry choice, private exposure model, and approval before freezing workstation writes and migrating production State Hub data."
 ```
 
 **Pre-condition:** T04 done (cnpg Gitea DB working); T08 done (deploy sequence
-documented). State-hub needs a PostgreSQL database — use a cnpg cluster in
-`databases` namespace.
+documented). Custodian-side safety gate `CUST-WP-0011-T01` must have passed:
+a fresh WSL2 State Hub backup restore drill with matching row counts.
 
+State-hub needs a PostgreSQL database — use a cnpg cluster in `databases`
+namespace. This is the pragmatic railiance01 migration path; full multi-node
+ThreePhoenix HA remains a separate Custodian follow-up (`CUST-WP-0038`).
+
 Steps:
 1. Define `state-hub-db` cnpg Cluster in `railiance-platform` (same pattern as T03).
@@ -456,13 +460,18 @@ Steps:
 - Service + Ingress (https://state-hub.<domain>)
 - ConfigMap for environment (DB URL, etc.)
 - Secret for DB credentials (SOPS-managed)
-5. Migrate data: `pg_dump` from workstation postgres → `pg_restore` into
-cnpg cluster.
-6. Update ops-bridge tunnel targets if the state-hub URL changes.
-7. Update `~/.claude/CLAUDE.md` global instructions to point to cluster URL.
+5. Deploy empty State Hub and run Alembic migrations in-cluster.
+6. Restore a copy of WSL2 data into the cnpg cluster and compare table counts
+while the workstation remains the source of truth.
+7. With explicit human approval, freeze workstation writes, take a final dump,
+restore it to the cluster, and make railiance01 the primary endpoint.
+8. Update ops-bridge tunnel targets or MCP `API_BASE` if the State Hub URL changes.
+9. Update operator instructions to describe cluster primary plus WSL2 fallback.
 
-**Done when:** `curl https://state-hub.<domain>/state/health` returns healthy;
-all MCP tools functional; workstation state-hub can be decommissioned.
+**Done when:** the private State Hub endpoint returns healthy, MCP tools work
+against the cluster-backed API, and WSL2 is retained as documented fallback.
+Permanent WSL2 retirement is out of scope here and requires a later explicit
+approval after stabilisation.
 
 ---
 