Provider Secret gate cleared; full reconcile passed with fixture smoke (health=ok, latency 2.084s). Harden the smoke against NetworkPolicy allowlist propagation by retrying up to 6x with a 5s warm-up inside the smoke pod — the netpol added 2026-06-19 rejected the pod's immediate first request before its IP propagated. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
| id | type | title | domain | repo | status | owner | topic_slug | created | updated | state_hub_workstream_id |
|---|---|---|---|---|---|---|---|---|---|---|
| RAILIANCE-WP-0014 | workplan | activity-core llm-connect live reconcile | financials | railiance-cluster | finished | codex | railiance | 2026-06-18 | 2026-07-01 | a152ddda-d60a-4a65-9b9c-59e2db9ff2b7 |
activity-core llm-connect live reconcile
Context
activity-core has updated its Railiance runtime manifest so
actcore-runtime-config points at the verified in-cluster llm-connect URL:
LLM_CONNECT_URL=http://llm-connect.activity-core.svc.cluster.local:8080
LLM_CONNECT_TIMEOUT_SECONDS=300
The remaining live gate belongs at the cluster/operator layer. Provider credentials must stay outside Git and State Hub, and the fixture smoke should record only non-secret evidence.
Add cluster-owned reconcile/check command
id: RAILIANCE-WP-0014-T01
status: done
priority: high
state_hub_task_id: "49288db7-8102-4ad5-af08-1fe6ab3f1d37"
Add a repeatable Railiance command that:
- reconciles the non-secret activity-core runtime config keys;
- checks the provider Secret by key count only;
- applies the llm-connect overlay only after the provider Secret exists;
- runs the in-namespace fixture smoke only after deployment readiness;
- posts a non-secret State Hub evidence note.
2026-06-18: Added tools/cmd/railiance-reconcile-activity-core-llm-connect
and Makefile target reconcile-activity-core-llm-connect.
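The gate ordering listed above can be sketched as follows. This is a minimal illustration, not the actual code in tools/cmd/railiance-reconcile-activity-core-llm-connect: the function name `run_gated_reconcile`, the `GateResult` type, and the injected callables are all hypothetical stand-ins for the real cluster calls.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class GateResult:
    """Which steps ran, and where the run stopped (hypothetical type)."""
    steps_run: List[str] = field(default_factory=list)
    blocked_at: str = ""  # empty string means every gate passed


def run_gated_reconcile(
    patch_config: Callable[[], None],
    secret_key_count: Callable[[], int],
    apply_overlay: Callable[[], None],
    deployment_ready: Callable[[], bool],
    run_smoke: Callable[[], bool],
) -> GateResult:
    result = GateResult()

    # Non-secret config is always reconciled, even if later gates block.
    patch_config()
    result.steps_run.append("config")

    # Gate 1: the provider Secret is checked by key count only.
    if secret_key_count() < 1:
        result.blocked_at = "provider-secret"
        return result
    result.steps_run.append("secret-check")

    # The overlay is applied only after the Secret exists.
    apply_overlay()
    result.steps_run.append("overlay")

    # Gate 2: the smoke runs only after the deployment is ready.
    if not deployment_ready():
        result.blocked_at = "deployment-readiness"
        return result
    result.steps_run.append("readiness")

    if not run_smoke():
        result.blocked_at = "smoke"
        return result
    result.steps_run.append("smoke")
    return result
```

A caller would summarize the returned `GateResult` in the non-secret evidence note whether or not a gate blocked, which keeps a blocked run as auditable as a passing one.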
Reconcile live non-secret runtime config
id: RAILIANCE-WP-0014-T02
status: done
priority: high
state_hub_task_id: "61df5bad-535f-4ad1-ac7a-f46ff278c388"
Patch the live activity-core/actcore-runtime-config ConfigMap so it consumes
the verified llm-connect service URL and timeout. Do not touch Secret values.
2026-06-18: The reconcile command patches only LLM_CONNECT_URL and
LLM_CONNECT_TIMEOUT_SECONDS, then re-reads the live ConfigMap to verify the
values. Live evidence note c72c514a-399e-4c54-8d5b-d36405932360 confirms
LLM_CONNECT_URL=http://llm-connect.activity-core.svc.cluster.local:8080 and
LLM_CONNECT_TIMEOUT_SECONDS=300.
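The patch-then-verify step described above might look roughly like this. It is a sketch under assumptions: the helper names are hypothetical, and the real tool may build its patch differently. The only external surface used is `kubectl patch` with `--type merge` and `-p`; a JSON merge patch leaves every key it does not name untouched.

```python
import json
from typing import Dict, List

# The two managed keys and their values, taken from the workplan text.
MANAGED_KEYS: Dict[str, str] = {
    "LLM_CONNECT_URL": "http://llm-connect.activity-core.svc.cluster.local:8080",
    "LLM_CONNECT_TIMEOUT_SECONDS": "300",
}


def build_merge_patch(desired: Dict[str, str]) -> str:
    """Return a JSON merge patch that names only the managed keys."""
    return json.dumps({"data": dict(desired)}, sort_keys=True)


def kubectl_patch_args(patch: str) -> List[str]:
    """Argument vector for patching the live ConfigMap (hypothetical helper)."""
    return [
        "kubectl", "-n", "activity-core",
        "patch", "configmap", "actcore-runtime-config",
        "--type", "merge", "-p", patch,
    ]


def mismatched_keys(live_data: Dict[str, str], desired: Dict[str, str]) -> List[str]:
    """After re-reading the ConfigMap, list managed keys that still differ."""
    return sorted(k for k, v in desired.items() if live_data.get(k) != v)
```

An empty list from `mismatched_keys` is the verification signal; unmanaged keys in the live ConfigMap are ignored by design.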
Complete provider Secret, deployment, and smoke gate
id: RAILIANCE-WP-0014-T03
status: done
priority: high
state_hub_task_id: "ae8af00a-c14f-4b76-933c-46d06cd360ae"
After an operator stores provider credentials in
activity-core/llm-connect-provider-secrets, rerun:
make reconcile-activity-core-llm-connect
The command applies the llm-connect overlay, waits for deployment readiness,
runs the in-namespace fixture smoke with imagePullPolicy=Never, and posts
non-secret evidence: provider Secret key count, deployment readiness, smoke
pass/fail, and either a latency/recommendation summary or a sanitized failure.
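One way to keep the evidence note non-secret is to derive it only from counts and statuses. The sketch below assumes a Secret object shaped like `kubectl get secret -o json` output; the function names and evidence field names are hypothetical and are not the State Hub schema.

```python
import json
from typing import Any, Dict, Optional


def count_secret_keys(secret_obj: Dict[str, Any]) -> int:
    """Count entries under `data`; values and key names are never copied."""
    return len(secret_obj.get("data") or {})


def build_evidence(
    secret_obj: Dict[str, Any],
    ready_replicas: int,
    desired_replicas: int,
    smoke_passed: bool,
    latency_seconds: Optional[float] = None,
    recommendations: Optional[int] = None,
    failure_kind: Optional[str] = None,
) -> Dict[str, Any]:
    evidence: Dict[str, Any] = {
        "provider_secret_key_count": count_secret_keys(secret_obj),
        "deployment_ready": f"{ready_replicas}/{desired_replicas}",
        "smoke": "pass" if smoke_passed else "fail",
    }
    if smoke_passed:
        evidence["latency_seconds"] = latency_seconds
        evidence["recommendations"] = recommendations
    else:
        # Only a short category such as "connection-refused" is recorded,
        # never a raw error body that might echo credentials.
        evidence["failure"] = failure_kind or "unknown"
    return evidence


def render_evidence(evidence: Dict[str, Any]) -> str:
    """Serialize the note body deterministically."""
    return json.dumps(evidence, sort_keys=True)
```

Because the evidence dict is built field by field rather than filtered from the Secret, nothing from the Secret can leak into the note by omission of a filter rule.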
2026-07-01: Gate closed. Provider Secret activity-core/llm-connect-provider-secrets
present (key count 1, no values inspected), overlay applied (no drift),
deployment llm-connect ready 1/1, in-namespace fixture smoke passed
(health=ok latency_seconds=2.084 recommendations=1). Evidence note
bddbf5d2-6cbe-4d97-9de6-689147d61be1. The first rerun failed with
Connection refused because the allowlist of the llm-connect-activity-core-only
NetworkPolicy (added 2026-06-19) had not yet propagated to cover the fresh
smoke-pod IP; the reconcile tool now retries the smoke up to 6× with a 5s
warm-up inside the pod.
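The retry behaviour can be illustrated with a small loop. This is a sketch, not the reconcile tool's code: it assumes one 5s warm-up before the first attempt and the same pause between attempts (the workplan states only "up to 6×" and "5s warm-up"), and the `probe` callable and `sleep` parameter are hypothetical.

```python
import time
from typing import Callable


def smoke_with_retry(
    probe: Callable[[], bool],
    attempts: int = 6,
    warmup_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Return the 1-based attempt that succeeded, or 0 if all attempts failed."""
    # Warm-up: give the NetworkPolicy allowlist time to include this pod's IP.
    sleep(warmup_seconds)
    for attempt in range(1, attempts + 1):
        try:
            if probe():
                return attempt
        except ConnectionError:
            # Connection refused is expected while the allowlist propagates.
            pass
        if attempt < attempts:
            # Assumed pause between attempts; the real interval is not stated.
            sleep(warmup_seconds)
    return 0
```

Injecting `sleep` keeps the loop testable without real delays; bounding the attempts means a genuinely unreachable service still fails the gate instead of hanging it.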
Historical live gate on 2026-06-18: provider Secret
activity-core/llm-connect-provider-secrets was missing, so deployment and
smoke were intentionally blocked until operator/OpenBao-to-Kubernetes Secret
custody was complete. Evidence note
c72c514a-399e-4c54-8d5b-d36405932360 records provider Secret status
missing, key count 0, deployment status not checked (provider Secret gate
not satisfied), and smoke status blocked.