Cascade Architecture Breakdown
Generated from the current workspace on 2026-04-09. This reflects the code and docs currently present in /home/leo/workspace and /home/leo/cascade, including gaps between spec and implementation.
Executive Summary
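There are six main runtime services plus public sites: Headwaters, Fabric, Ledger, Conduit, Cascadia, and Weir. Breakwater is the intended seventh, but it is still scaffold-level.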
The stack uses two internal communication styles on purpose:
- HTTP when a service needs an immediate authoritative answer or a synchronous mutation result.
- NATS when a service wants to publish a state change or lifecycle event for other services to react to asynchronously.
The largest current trust gap is internal machine identity. Headwaters already extracts caller identity natively from client certificates. Fabric and Ledger still depend on forwarded identity in proxy mode, and Cascadia identifies itself to Fabric with an x-client-identity header instead of presenting a true node certificate.
Top-Level System Map
Public browser ->
- useCascade.io / docs
- Headwaters public auth + OIDC
- Weir operator console
- Conduit tenant/customer panel
- Ledger browser billing routes
Internal services
- Weir -> Headwaters / Fabric / Ledger
- Conduit -> Headwaters / Fabric / Ledger / Cascadia
- Ledger -> Fabric
- Headwaters -> Fabric
- Cascadia -> Fabric
Async event bus
- Fabric -> NATS -> Conduit
- Ledger -> NATS -> Conduit
- Conduit -> NATS -> downstream consumers (currently limited)
Machine trust
- Intended: Breakwater issues certs and/or services terminate mTLS natively
- Current: hybrid, with some header-forwarding still present
How HTTP, NATS, and WebSockets Fit
HTTP
Used for authoritative request/response work:
- Fabric tenant lookup, entitlements, signed actions, provisioning
- Ledger checkout and subscription reads
- Headwaters auth, org, OIDC, JWKS
- Cascadia signed-action endpoints and control-plane sync
Use HTTP when the caller needs a success/failure answer now.
NATS
Used as an internal event bus for async propagation:
- fabric.tenant.*
- fabric.node.*
- fabric.provisioning.*
- ledger.subscription.*
- conduit.*
Use NATS when one service wants to announce state changes and multiple consumers may react later.
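To make the subject families above concrete, here is a minimal sketch of NATS-style subject matching, where * matches exactly one dot-separated token and > matches one or more trailing tokens. The function name subject_matches is illustrative only and is not taken from any Cascade repo.

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """Return True if a NATS-style subscription pattern matches a subject.

    '*' matches exactly one token; '>' matches one or more trailing tokens
    and is only valid as the last token of the pattern.
    """
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            # Must be the final pattern token and leave at least one subject token.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False
        if p != "*" and p != s_tokens[i]:
            return False
    # Without '>', pattern and subject must have the same number of tokens.
    return len(p_tokens) == len(s_tokens)
```

One consequence worth noting: under these rules a literal conduit.* subscription would not match a three-token subject such as conduit.order.created, so the family names listed above are probably best read as shorthand. A consumer that wants every Conduit event would subscribe to conduit.> instead.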
WebSockets
WebSockets are not the same thing as NATS:
- WebSockets are point-to-point long-lived connections.
- NATS is a backend messaging fabric / event bus.
- Cascadia and browser-facing real-time surfaces could use WebSockets or streams.
WebSockets are for live sessions. NATS is for backend event distribution.
Trust and Authority Boundaries
| Concern | Authority | Notes |
|---|---|---|
| Human identity, sessions, MFA, org membership | Headwaters | Downstream services validate Headwaters JWTs via JWKS. |
| Tenant existence, state, entitlements, policy, node licensing | Fabric | The main control-plane authority. |
| Billing state, subscriptions, checkout, enforcement triggers | Ledger | Ledger does not mutate tenant state directly; it calls Fabric. |
| Tenant-facing product/order/service API | Conduit | Depends on Fabric and Ledger, but owns the tenant panel backend. |
| Node-local runtime state, logs, files, realized networking, execution | Cascadia | After workload acceptance, runtime truth is node-local. |
| Operator/admin aggregation UX | Weir | Not authoritative; aggregates Headwaters + Fabric + Ledger + some Conduit. |
| Machine identity / internal PKI | Intended: Breakwater | Current implementation is incomplete and hybrid. |
Service Rundown
Headwaters
Repo: /home/leo/workspace/Headwatersv2
Role: human identity authority.
Main features: signup/login/logout/refresh, password reset, magic links, user profile, orgs, memberships, roles, invites, MFA, machine tokens, OIDC/OAuth, JWKS, internal token introspection.
Exposure:
- Public/browser-facing: auth/session/OIDC endpoints.
- Internal-only: /v1/internal/* mTLS endpoints.
Native mTLS termination: Yes. Headwaters uses a Rustls acceptor and extracts canonical caller identity from the peer certificate.
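As a rough illustration of what "extracting canonical caller identity from the peer certificate" can look like, the sketch below reads an identity out of a certificate dictionary in the shape returned by Python's ssl.SSLSocket.getpeercert(). Headwaters itself does this in Rust with Rustls. Which certificate field it treats as canonical is not stated here, so preferring the first DNS subjectAltName and falling back to the commonName is an assumption, and the function name is hypothetical.

```python
from typing import Optional


def identity_from_peer_cert(cert: dict) -> Optional[str]:
    """Derive a caller identity from a verified peer certificate.

    `cert` uses the dict layout of ssl.SSLSocket.getpeercert().
    Assumption: the first DNS SAN wins, with commonName as a fallback.
    """
    for kind, value in cert.get("subjectAltName", ()):
        if kind == "DNS":
            return value.lower()
    # 'subject' is a tuple of RDNs, each a tuple of (key, value) pairs.
    for rdn in cert.get("subject", ()):
        for key, value in rdn:
            if key == "commonName":
                return value.lower()
    return None
```

The important property is that the identity comes from a certificate the TLS layer has already verified, not from anything the caller can set in a request header.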
Important routes:
- POST /v1/auth/signup
- POST /v1/auth/login
- POST /v1/auth/refresh
- GET /.well-known/jwks.json
- GET /oauth/authorize, POST /oauth/token, GET /oauth/userinfo
- GET/POST /v1/orgs and nested role/member/invite routes
- POST /v1/internal/token/introspect
Fabric
Repo: /home/leo/workspace/fabric-v2
Role: platform control-plane authority.
Modules it effectively owns:
- Tenant lifecycle: create, lookup, suspend, unsuspend, schedule deletion, wipe flow, cluster migration.
- Profiles and entitlements: profile creation/publish/deprecate, tenant entitlement overrides, effective entitlements.
- Policy distribution: service policy bundles and tenant policy bundles.
- Signed actions: issue and introspect short-lived Fabric tokens for Cascadia operations.
- Provisioning: capacity reservations and deployment-template approval handoff.
- Node licensing and attestation: bootstrap tokens, attestation, heartbeat, revocation, transport peer rosters, certificate lifecycle hooks.
- Tenant JWT keys: per-tenant signing keys distributed to Conduit.
- Template catalog: canonical runtime templates used by Conduit.
- Admin/global config: global config, cluster registry, signing-key rotation.
- Audit and events: append-only audit and NATS subjects.
Exposure:
- Public: effectively just GET /v1/jwks.json.
- Internal-only: almost everything else.
- Admin: staff JWT routes under /v1/admin/*.
Current inbound machine identity model: Hybrid. Fabric expects a PeerIdentity in request extensions. Today that usually comes from forwarded identity middleware unless a future native TLS listener is added.
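The hybrid model can be sketched as a resolver that prefers a TLS-verified identity and accepts the forwarded x-client-identity header only when the request arrived through a trusted proxy. This is an illustrative sketch, not Fabric's middleware: the PeerIdentity fields, the resolve_peer_identity function, and the allow_forwarded switch are assumptions, and header keys are assumed to be lowercased already.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class PeerIdentity:
    name: str
    source: str  # "tls" (verified peer cert) or "forwarded" (proxy header)


def resolve_peer_identity(
    tls_identity: Optional[str],
    headers: Mapping[str, str],
    from_trusted_proxy: bool,
    allow_forwarded: bool = True,
) -> Optional[PeerIdentity]:
    """Pick the caller identity for a request, preferring native TLS."""
    if tls_identity:
        return PeerIdentity(tls_identity, "tls")
    forwarded = headers.get("x-client-identity")
    # The header is only meaningful if a trusted proxy set it.
    if allow_forwarded and from_trusted_proxy and forwarded:
        return PeerIdentity(forwarded, "forwarded")
    return None
```

Recording the source alongside the name makes it possible to log or reject forwarded identities per route, and flipping allow_forwarded to False is one way to retire the header path once native client certificates are in place.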
Important routes:
- POST /v1/tenants, GET /v1/tenants/:id, GET /v1/tenants/by-headwaters-id/:headwaters_id
- GET /v1/tenants/:id/effective-entitlements
- GET /v1/policy-bundle, GET /v1/tenants/:id/policy-bundle
- POST /v1/signed-actions/issue, POST /v1/signed-actions/introspect
- POST /v1/provisioning/resolve, POST /v1/provisioning/reservations/finalize
- POST /v1/licensing/attest, POST /v1/licensing/heartbeat, POST /v1/licensing/renew-cert, POST /v1/licensing/bootstrap-tokens, POST /v1/licensing/revoke
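The issue/introspect pair for signed actions can be illustrated with a small in-memory sketch. Fabric's real token format and signing scheme are not described here, so everything below is an assumption: the class name, the claim names, an HMAC-SHA256 signature standing in for whatever Fabric actually uses, and the choice to treat a second introspection of the same token as a replay.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time


class SignedActionIssuer:
    """Hypothetical stand-in for the issue and introspect endpoints."""

    def __init__(self, key: bytes):
        self._key = key
        self._seen_jti = set()  # token IDs that were already accepted once

    def _sign(self, body: str) -> str:
        return hmac.new(self._key, body.encode(), hashlib.sha256).hexdigest()

    def issue(self, tenant_id, node_id, action, ttl_seconds=60, now=None) -> str:
        now = time.time() if now is None else now
        claims = {
            "tenant_id": tenant_id,
            "node_id": node_id,
            "action": action,
            "exp": now + ttl_seconds,       # short-lived by construction
            "jti": secrets.token_hex(8),    # unique ID used for replay checks
        }
        body = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode()
        return f"{body}.{self._sign(body)}"

    def introspect(self, token: str, now=None) -> dict:
        now = time.time() if now is None else now
        body, _, sig = token.rpartition(".")
        if not hmac.compare_digest(sig.encode(), self._sign(body).encode()):
            return {"active": False, "reason": "bad_signature"}
        claims = json.loads(base64.urlsafe_b64decode(body))
        if now >= claims["exp"]:
            return {"active": False, "reason": "expired"}
        if claims["jti"] in self._seen_jti:
            return {"active": False, "reason": "replayed"}
        self._seen_jti.add(claims["jti"])
        return {"active": True, **claims}
```

In this sketch the caller that brokers the action asks for a token scoped to one tenant, node, and action, and the node asks the issuer whether the token is still active before executing it.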
Ledger
Repo: /home/leo/workspace/ledger
Role: billing state machine and enforcement trigger.
Features: product and plan management, checkout session creation, subscription state, webhook ingestion, scheduler-driven enforcement, internal subscription lookups.
Exposure:
- Public/browser-facing: checkout and subscription routes.
- Webhook-facing: /v1/webhooks/polar.
- Internal-only: /v1/internal/*.
- Admin: /v1/admin/* via staff JWT.
Current inbound machine identity model: Hybrid, similar to Fabric. An internal allowlist plus proxy/header mode exists, but native peer-cert extraction is not wired up the way it is in Headwaters.
Important routes:
- POST /v1/checkout/session
- GET /v1/subscriptions, GET /v1/subscriptions/:id, POST /v1/subscriptions/:id/cancel
- GET /v1/internal/tenants/:tenant_id/subscription
- POST /v1/internal/tenants/:tenant_id/record-addon
Conduit
Repo: /home/leo/workspace/Conduit
Role: tenant-facing panel backend for products, services, customers, staff, templates, nodes, networking, billing projections, and support-ish operations.
Features:
- Catalog, products, plans, quotes, orders, services
- Customer auth and customer records
- Staff roles and tenant admin config
- Nodes, node groups, private fabrics, virtual networks
- Templates, migrations, billing projections, invoice actions
- Signed-action brokering to Cascadia via Fabric
Exposure: public tenant/admin API under /api/v1/*.
Current outbound dependencies:
- Headwaters JWKS validation
- Fabric policy bundles, tenant policy, signed actions, provisioning, licensing actions
- Ledger internal subscription/addon routes
- Cascadia signed-action endpoints and control-plane projection endpoints
Current machine identity issue: Conduit still injects x-client-identity in its outbound Fabric and Ledger clients. That should go away once proper client cert identity is in place.
Large route families:
- /api/v1/products, /api/v1/plans, /api/v1/orders, /api/v1/services
- /api/v1/billing/*
- /api/v1/customers*, /api/v1/auth/customer/*
- /api/v1/virtual-networks*
- /api/v1/staff/roles*
- /api/v1/nodes*
- /api/v1/admin/* for locations, node groups, templates, migrations, private fabrics, plan/product versions
- /api/v1/signed-actions
Cascadia
Repo: /home/leo/workspace/Cascadia
Role: node-resident sovereign runtime authority.
Features:
- accept Fabric-approved deployment templates
- verify and execute Fabric-issued signed actions
- store sovereign workload/runtime state locally
- maintain usage, logs, network state, TLS/domain state
- perform Fabric heartbeat, policy sync, JWKS sync, and cert-renew sync loops
- manage encrypted tenant mesh participation
Exposure:
- Node/browser-facing: /client-action, /client-query, /client-stream
- Control-plane projection: /control-plane/audit
- Health: /livez, /readyz, /metrics
Current authentication to Fabric: Not proper node mTLS yet. The current Fabric client sets x-client-identity to cascadia.<node_id>.internal and does not present a true client cert.
Anti-copy licensing state: Partially real.
- Install derives a hardware fingerprint.
- Fabric bootstrap tokens are single-use and hashed at rest.
- Fabric node records bind tenant_id + node_id + hardware_fingerprint.
- Heartbeat and re-attest reject hardware fingerprint mismatch.
- Cascadia stores an encrypted activation binding with node_id, tenant_id, license_key, and the current hardware fingerprint.
- Runtime activation fails if the binding does not match current hardware.
But: the node certificate lifecycle is not complete. Fabric renew_certificate currently ignores the CSR and returns a generated serial string as the "certificate". This is not a real PKI implementation yet.
Weir
Repo: /home/leo/cascade/Weir
Role: operator/admin console backend-for-frontend.
Features: session context, auth/login dance with Headwaters, org views, onboarding/activation, tenant summary, billing summary, infrastructure summary, node summary, activity projection.
Exposure: console/backend API at /api/*.
Current status: useful route shape exists, but upstream integration is still partially mocked/in-memory in dev fallback mode.
Important routes:
- /api/session/me, /api/session/switch-org
- /api/auth/login, /api/auth/consent, /api/auth/logout
- /api/onboarding/status, /api/onboarding/activate
- /api/org/current, /api/org/members, /api/org/invites
- /api/tenant/current, /api/tenant/entitlements
- /api/billing/summary, /api/billing/checkout
- /api/infrastructure/summary, /api/nodes, /api/activity
Breakwater
Repo: /home/leo/workspace/Breakwater
Intended role: machine identity authority and trust layer.
What exists now:
- a strong spec defining it as internal PKI and mTLS trust boundary
- local PKI bootstrap scripts
- Caddy gateway configs for Fabric, Ledger, and Sluice
- very light Rust binaries that are still scaffold-level
Conclusion: Breakwater already exists conceptually as the issuer/trust layer, but the actual runtime implementation is still mostly the local gateway configs rather than a finished authority service.
Client-Facing Flows
Operator Signup / Console Flow
- Browser authenticates with Headwaters.
- Weir exchanges/uses Headwaters identity and org context.
- Weir calls Fabric to create or inspect the tenant.
- Weir calls Ledger for billing summary or checkout initiation.
- Weir may redirect operators into tenant-facing Conduit workflows later.
Tenant Admin / Customer Flow
- Browser uses Conduit UI/API.
- Conduit validates Headwaters staff JWTs or customer auth context.
- Conduit asks Fabric for policy, entitlements, or signed actions.
- For node/runtime operations, browser or Conduit calls Cascadia using Fabric-issued authorization.
- For billing data, Conduit queries Ledger internal routes or starts checkout via Ledger.
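The JWT validation step in this flow starts by finding the right Headwaters key. The sketch below shows only that lookup: it reads the kid from the token header and selects the matching entry from a JWKS document in the standard {"keys": [...]} layout. It deliberately does not verify the signature, which in practice would be done with a JOSE library. The helper names are illustrative and not taken from Conduit.

```python
import base64
import json
from typing import Optional


def b64url_decode(segment: str) -> bytes:
    """Decode an unpadded base64url segment as used in JWTs."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def jwt_header(token: str) -> dict:
    """Return the decoded (unverified) header of a compact JWT."""
    header_b64 = token.split(".")[0]
    return json.loads(b64url_decode(header_b64))


def select_jwk(jwks: dict, kid: str) -> Optional[dict]:
    """Find the JWKS entry whose 'kid' matches the token header."""
    for key in jwks.get("keys", []):
        if key.get("kid") == kid:
            return key
    return None
```

A missing key should be treated as a validation failure, possibly after one JWKS refresh to pick up a rotated key, rather than as a reason to skip verification.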
Cascadia Node Install / Activation Flow
- Operator obtains a Fabric bootstrap token.
- install.sh runs on the target host and calls hidden _activate-node.
- Cascadia derives a hardware fingerprint locally.
- Cascadia attests to Fabric with tenant_id, node_id, fingerprint, bootstrap token, hostname, and transport metadata.
- Fabric consumes the single-use bootstrap token, creates a node record, issues a license key, and stores the hardware fingerprint.
- Cascadia stores local node identity plus an encrypted activation binding tied to the current host.
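The node-side half of this flow can be sketched as follows. The inputs to the fingerprint, the request shape, and the function names are all hypothetical. Note also that the real binding is described as encrypted. The standard library has no authenticated encryption, so this sketch swaps in an HMAC seal keyed from the fingerprint, which only demonstrates the host-binding check and provides no confidentiality.

```python
import hashlib
import hmac
import json


def hardware_fingerprint(attrs: dict) -> str:
    """Hash a canonical encoding of host attributes (inputs are assumed)."""
    canonical = json.dumps(attrs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()


def build_attest_request(tenant_id, node_id, fingerprint, bootstrap_token,
                         hostname, transport) -> dict:
    """Assemble the attestation fields listed in the flow above."""
    return {
        "tenant_id": tenant_id,
        "node_id": node_id,
        "hardware_fingerprint": fingerprint,
        "bootstrap_token": bootstrap_token,
        "hostname": hostname,
        "transport": transport,
    }


def _seal_key(fingerprint: str) -> bytes:
    return hashlib.sha256(b"binding-seal:" + fingerprint.encode()).digest()


def seal_binding(binding: dict, fingerprint: str) -> dict:
    """Seal the activation binding to the current host (HMAC stand-in)."""
    payload = json.dumps(binding, sort_keys=True)
    tag = hmac.new(_seal_key(fingerprint), payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def binding_valid(sealed: dict, current_fingerprint: str) -> bool:
    """Refuse activation if the seal or the stored fingerprint does not match."""
    expected = hmac.new(_seal_key(current_fingerprint),
                        sealed["payload"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, sealed["tag"]):
        return False
    return json.loads(sealed["payload"])["hardware_fingerprint"] == current_fingerprint
```

Copying the sealed binding to a different host changes the derived fingerprint, so binding_valid returns False there even though the file contents are identical.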
Internal Flows
| Caller | Callee | Transport | Purpose |
|---|---|---|---|
| Headwaters | Fabric | HTTP | Fetch service policy bundle |
| Ledger | Fabric | HTTP | Tenant lookup, suspend/unsuspend, deletion, profile migration, policy bundle |
| Conduit | Fabric | HTTP | Policy, tenant bundle, signed actions, provisioning, bootstrap tokens, revocation |
| Conduit | Ledger | HTTP | Internal subscription lookup, addon record, checkout orchestration |
| Conduit | Cascadia | HTTP | Signed runtime actions, telemetry pulls, control-plane audit projection |
| Cascadia | Fabric | HTTP | Attest, heartbeat, policy sync, JWKS sync, signed-action introspection, cert renew, transport peers |
| Weir | Headwaters | HTTP | Org membership/auth/session flows |
| Weir | Fabric | HTTP | Tenant creation/state/entitlements |
| Weir | Ledger | HTTP | Billing summary and checkout |
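
The table above can also be read as an internal call allowlist. The sketch below encodes exactly the caller/callee pairs from the table as data and checks a proposed call against them. The lowercase service names and the function name are illustrative, and whether any service enforces its allowlist in this shape is an assumption.

```python
# Caller -> set of callees, transcribed from the Internal Flows table.
ALLOWED_CALLS = {
    "headwaters": {"fabric"},
    "ledger": {"fabric"},
    "conduit": {"fabric", "ledger", "cascadia"},
    "cascadia": {"fabric"},
    "weir": {"headwaters", "fabric", "ledger"},
}


def is_call_allowed(caller: str, callee: str) -> bool:
    """Return True if the table lists an HTTP flow from caller to callee."""
    return callee.lower() in ALLOWED_CALLS.get(caller.lower(), set())
```

A check like this is only as trustworthy as the caller name it is given, which is why the source of machine identity matters more than the allowlist itself.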
NATS Subjects in Use
Fabric emits
fabric.tenant.created, fabric.tenant.cluster_assigned, fabric.tenant.suspended, fabric.tenant.unsuspended, fabric.tenant.deletion_scheduled, fabric.tenant.wipe_initiated, fabric.tenant.deletion_ready, fabric.tenant.cluster_migrated, fabric.tenant.entitlements_changed, fabric.node.attested, fabric.node.revoked, fabric.config.updated, fabric.signing_key.rotated, fabric.cluster.created, fabric.signed_action.issued, fabric.signed_action.introspected, fabric.signed_action.replayed, fabric.provisioning.reserved, fabric.provisioning.reservation_updated.
Ledger emits / Conduit consumes
ledger.subscription.created and ledger.subscription.updated are consumed by Conduit.
Conduit emits
conduit.order.created, conduit.order.pending_payment, conduit.invoice.paid, conduit.invoice.refunded, conduit.provisioning.started, conduit.provisioning.succeeded, conduit.provisioning.failed, conduit.service.created, conduit.service.terminated.
Cascadia Licensing and Anti-Copy Reality Check
| Mechanism | Status | How it works now |
|---|---|---|
| Single-use bootstrap token | Implemented | Fabric creates aft_bt_... tokens, stores only a SHA-256 hash, and consumes them on first attestation. |
| Hardware fingerprint binding in Fabric | Implemented | Fabric stores hardware_fingerprint on the node record and rejects mismatches on attest/heartbeat. |
| Host-bound local activation seal | Implemented | Cascadia stores an encrypted activation binding under the node state dir and refuses runtime activation if current hardware fingerprint differs. |
| Real node client certificate authentication to Fabric | Not implemented correctly yet | Cascadia still identifies to Fabric by sending x-client-identity based on node ID. |
| Real certificate issuance / CSR signing | Stubbed | Fabric renew_certificate currently ignores the CSR and returns a generated certificate serial string, not an actual signed cert chain. |
Conclusion: copying the Cascadia binary alone is not enough to create a valid second node, because the host-bound activation binding and Fabric fingerprint checks block that path. But the machine-certificate side is not complete, so the node identity lifecycle cannot yet be called production-grade.
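The two Fabric-side mechanisms marked as implemented can be sketched with an in-memory registry: bootstrap tokens are stored only as SHA-256 hashes and consumed on first attestation, and heartbeats are rejected when the hardware fingerprint differs from the one recorded. The aft_bt_ prefix comes from the table above. The class, its method names, and the token length are assumptions, and real storage would be a database rather than in-process sets.

```python
import hashlib
import secrets


class NodeRegistry:
    """Hypothetical in-memory model of bootstrap tokens and node records."""

    def __init__(self):
        self._token_hashes = set()  # only hashes are kept, never raw tokens
        self._nodes = {}            # (tenant_id, node_id) -> hardware fingerprint

    @staticmethod
    def _digest(token: str) -> str:
        return hashlib.sha256(token.encode()).hexdigest()

    def create_bootstrap_token(self) -> str:
        token = "aft_bt_" + secrets.token_urlsafe(24)
        self._token_hashes.add(self._digest(token))
        return token  # shown to the operator once

    def attest(self, tenant_id, node_id, fingerprint, token) -> bool:
        digest = self._digest(token)
        if digest not in self._token_hashes:
            return False  # unknown or already consumed
        self._token_hashes.remove(digest)  # single use
        self._nodes[(tenant_id, node_id)] = fingerprint
        return True

    def heartbeat(self, tenant_id, node_id, fingerprint) -> bool:
        # Unknown nodes and fingerprint mismatches are both rejected.
        return self._nodes.get((tenant_id, node_id)) == fingerprint
```

In this model a cloned node fails twice: the bootstrap token is already consumed, and a heartbeat sent from different hardware carries a fingerprint that does not match the stored one.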
Should Fabric Be Split?
Today: keeping Fabric together is reasonable because many of its responsibilities are tightly coupled: tenant state, entitlements, policy bundles, node licensing, signed actions, provisioning, tenant JWT keys.
In an ideal future world: some areas could split if scale or team ownership demands it:
- Machine identity / PKI into Breakwater
- Catalog / template authority into a dedicated catalog service
- Provisioning/capacity reservation into a dedicated scheduler service
- Policy/entitlement engine into a smaller dedicated policy authority
But: splitting too early would add coordination overhead while the trust model is still settling. The highest-value boundary to split first is machine identity into Breakwater, not tenant policy out of Fabric.
Current Recommended Target Architecture
- Breakwater becomes the real machine certificate issuer and trust-bundle authority.
- Headwaters remains human identity only.
- Each internal service terminates client certs natively where feasible.
- Fabric consumes verified machine identity directly from TLS, not generic forwarded headers.
- Ledger does the same.
- Cascadia gets a real node cert lifecycle issued by Breakwater/Fabric-approved flows.
- Forwarded identity headers become a tightly scoped transition mechanism only, then disappear.
Companion file: roadmap.html