Article · 5 min read

Fail-closed multi-tenancy

Most multi-tenant systems treat missing tenant context as a soft error and recover. For platforms handling AI-generated code, prompts, and model outputs, missing context must hard-fail at the data layer. The reasons are architectural, not theoretical.

GAI

Governance

Traceability

Evidence

Workflow

Artifacts

Security

The default behaviour is wrong for governed AI delivery

Most multi-tenant SaaS platforms derive tenant identity from whatever request context the application code chooses to read. A tenant ID arrives in the request body, a query parameter, or sometimes a header. If it is missing, the application falls back: to the first tenant in the database, to a "global" scope, to the calling user's last-used tenant, or, worst of all, to an unbounded scan. Each of those fallbacks is a cross-tenant leak waiting for a routing bug to expose it.
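To make the failure mode concrete, here is a minimal sketch of the soft-fallback pattern. The `Doc` type, the in-memory `docs` array, and `listDocsSoft` are hypothetical names chosen for illustration; they are not taken from any particular platform.

```typescript
// Hypothetical in-memory table standing in for a tenant-scoped store.
type Doc = { id: number; tenantId: string; body: string };

const docs: Doc[] = [
  { id: 1, tenantId: "acme", body: "acme prompt" },
  { id: 2, tenantId: "globex", body: "globex prompt" },
];

// Soft-fail pattern: a missing tenant ID silently widens the query.
function listDocsSoft(tenantId?: string): Doc[] {
  if (!tenantId) {
    return docs; // "global" fallback: every tenant's rows come back
  }
  return docs.filter((d) => d.tenantId === tenantId);
}

// A routing bug that drops the tenant ID is enough to cross tenants.
const leakedTenantIds = listDocsSoft(undefined).map((d) => d.tenantId);
```

Nothing in this function is visibly wrong at the call site, which is the problem: the leak only appears when some caller forgets to pass the argument.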

The default behaviour is a deliberate trade. It optimises for developer ergonomics: writing a query that "just works" without wiring tenancy through every call site. The trade is that tenancy becomes an application-level invariant, enforced by convention. Convention does not survive a hundred contributors and a thousand callbacks.

Why AI delivery makes the trade unacceptable

AI-assisted delivery workflows are asynchronous by design. Stage transitions trigger background jobs. Long-running generation calls return through callbacks. Idempotent retries reissue work hours after the original request. Each indirection is a new place to lose request context. The same convention that worked in a synchronous web request stops working as soon as the work crosses a queue boundary.
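The loss of context at a queue boundary can be sketched in a few lines. This example assumes Node's `AsyncLocalStorage` as the context mechanism and uses a hypothetical in-memory `queue` with a `Job` envelope; a real broker behaves the same way in the respect that matters here, because in-process context is not serialized with the message.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Assumption: AsyncLocalStorage holds the per-request tenant scope.
type TenantScope = { tenantId: string };
const tenantStore = new AsyncLocalStorage<TenantScope>();

// Hypothetical job envelope: the tenant travels with the payload.
type Job = { tenantId: string; payload: string };
const queue: Job[] = [];

function enqueue(payload: string): void {
  const scope = tenantStore.getStore();
  if (!scope) {
    throw new Error("enqueue called without tenant context");
  }
  queue.push({ tenantId: scope.tenantId, payload });
}

// Worker side: runs later, outside the originating request's context.
function drain(handler: (payload: string) => string): string[] {
  const results: string[] = [];
  let job = queue.shift();
  while (job !== undefined) {
    const current = job;
    // Re-establish the scope from the envelope before handler code runs.
    results.push(
      tenantStore.run({ tenantId: current.tenantId }, () => handler(current.payload)),
    );
    job = queue.shift();
  }
  return results;
}

// Request side: enqueue inside the authenticated scope.
tenantStore.run({ tenantId: "acme" }, () => enqueue("generate-code"));

// When the worker drains the queue, the request scope is already gone.
const scopeAtDrainTime = tenantStore.getStore();

const handled = drain((payload) => {
  const scope = tenantStore.getStore();
  return `${payload} for ${scope ? scope.tenantId : "NO TENANT"}`;
});
```

Without the envelope, the handler would run with no scope at all. A retry that replays the same envelope hours later picks up the same tenant, which is the property the convention-based approach cannot guarantee.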

The artifacts that AI workflows produce are tenant-bound intellectual property: prompts that encode customer language, generated code that solves a customer problem, model logs that capture proprietary requirements. One slip in the tenancy invariant produces a customer-visible incident with a clear blast radius. The threat model that justified soft-failing tenancy in a CRM does not survive contact with AI delivery.

What fail-closed tenancy looks like

Fail-closed tenancy is three architectural decisions made together:

1. Tenant identity comes from authenticated context only. The token validated at the gateway carries the tenant claim. Request bodies, query parameters, and headers that arrive after authentication cannot override it. A request that asserts a tenant ID different from the authenticated one is rejected.

2. Tenant context propagates via continuation-local storage (CLS). Once extracted at the gateway, the tenant scope rides with the asynchronous execution context through callbacks and retries. Where work crosses a queue boundary, the scope has to be carried with the job and restored on the consuming side, because in-process context does not survive serialization. Code at any point in the call graph reads from CLS rather than receiving the tenant as a parameter, and that read is mandatory, not optional.

3. Enforcement happens at the data layer. Every query against tenant-scoped data goes through a persistence extension that requires a tenant in CLS. Missing context throws a hard error before the query reaches the database. There is no "default tenant" fallback. There is no silent global scope. There is a failure that surfaces immediately and is logged with the call stack that produced it.
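The three decisions can be sketched together as follows. This is an illustration under stated assumptions, not a description of any specific product's code: Node's `AsyncLocalStorage` stands in for CLS, `VerifiedClaims` is assumed to be the output of token verification done upstream, and `resolveTenant`, `handleRequest`, `requireTenant`, `findArtifacts`, and the in-memory `rows` table are hypothetical names.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Assumption: AsyncLocalStorage stands in for CLS.
type TenantScope = { tenantId: string };
const tenantStore = new AsyncLocalStorage<TenantScope>();

class TenantMismatchError extends Error {
  constructor(asserted: string) {
    super(`request asserted tenant "${asserted}" but the token says otherwise`);
    this.name = "TenantMismatchError";
  }
}

class MissingTenantError extends Error {
  constructor(operation: string) {
    // The Error's stack records the call path that reached the data layer.
    super(`tenant context missing for ${operation}`);
    this.name = "MissingTenantError";
  }
}

// Decision 1: the tenant comes from verified token claims only.
type VerifiedClaims = { sub: string; tenantId: string };
type IncomingRequest = { assertedTenantId?: string };

function resolveTenant(claims: VerifiedClaims, req: IncomingRequest): TenantScope {
  if (req.assertedTenantId !== undefined && req.assertedTenantId !== claims.tenantId) {
    throw new TenantMismatchError(req.assertedTenantId);
  }
  return { tenantId: claims.tenantId };
}

// Decision 2: the handler runs inside the scope, so code below it
// reads the tenant from context instead of taking it as a parameter.
function handleRequest<T>(claims: VerifiedClaims, req: IncomingRequest, handler: () => T): T {
  return tenantStore.run(resolveTenant(claims, req), handler);
}

// Decision 3: the data layer refuses to run without a scope.
type Row = { id: number; tenantId: string; body: string };
const rows: Row[] = [
  { id: 1, tenantId: "acme", body: "acme artifact" },
  { id: 2, tenantId: "globex", body: "globex artifact" },
];

function requireTenant(operation: string): TenantScope {
  const scope = tenantStore.getStore();
  if (!scope) {
    throw new MissingTenantError(operation); // no default, no global scope
  }
  return scope;
}

function findArtifacts(): Row[] {
  const { tenantId } = requireTenant("findArtifacts");
  return rows.filter((r) => r.tenantId === tenantId);
}

// Usage: an authenticated request reads only its own tenant's rows.
const acmeRows = handleRequest({ sub: "u1", tenantId: "acme" }, {}, () => findArtifacts());
```

Calling `findArtifacts()` outside `handleRequest` throws `MissingTenantError` before any row is touched, and a request that asserts a tenant other than the one in its claims is rejected with `TenantMismatchError`. In a real persistence layer the `requireTenant` check would sit in whatever hook wraps every query, so individual call sites cannot opt out.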

Why retrofitting is expensive

Fail-closed tenancy is cheap to design in. It is expensive to retrofit. Every existing query has to be audited. Every background job has to be re-instrumented to propagate context. Every callback path has to be hardened against the case where the original request is gone. CLS is a substrate decision, not a library swap — getting it right requires the framework to be aware of it from the entry point inward.

The cost is not measured in engineering hours. It is measured in product cycles spent stabilising an invariant that should never have been negotiable. Teams that retrofit tenancy usually find the bug class that justified the retrofit while they are still implementing it.

Operational consequences

Fail-closed tenancy turns a class of "weird production bug" into a clean exception with a stack trace. SOC 2 and ISO 27001 evidence becomes "we reject the request" rather than "we trust the call site". On-prem buyers can review the invariant in a code review rather than auditing every tenant-touching path.

The trade-off is real: developers writing new code have to think about tenancy at every persistence boundary. That is the point. Tenancy is the most expensive thing to get wrong in a multi-tenant platform; it should be expensive to skip.

GrowAppAI's choice

Fail-closed tenancy is non-negotiable in GrowAppAI's architecture principles. Tenant identity comes from the authenticated JWT at the gateway. CLS propagates the scope across every async boundary. The data-layer extension blocks every query that arrives without it. Missing context fails the request, surfaces in logs, and never resolves to a default.

That decision is one of the reasons regulated and on-prem buyers can adopt the platform without negotiating a separate isolation story. Read the deployment model for the substrate-neutrality view, or see the platform for the lifecycle context this invariant operates inside.

Next step

Review the isolation story against your threat model

Book a working session with our team. We will walk through the tenancy invariant and where it is enforced — useful input for any security-led platform review.