Your Feature Flag Vendor Should Not Sit on Your Hot Path

A practical architecture for keeping feature flags off your request hot path, with better consistency, observability, and safer worker workflows.

A very common mistake in modern systems is treating feature flags like a per-request API dependency.

It usually starts innocently. You have edge middleware, serverless handlers, SSR, a few background workers, and a flag vendor with a nice SDK. Someone wires flag evaluation straight into the request path. In the worst version, the edge function calls a remote flag API on every request. In the slightly less bad version, the server does it during render. Either way, your release control plane just became part of your serving path.

That architecture is fine right up until traffic spikes, the network gets weird, or the vendor endpoint slows down for three minutes. Then your “safe rollout mechanism” becomes a latency amplifier.

The mistake

It looks roughly like this:

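// Anti-pattern: one remote flag evaluation per request, on the serving path.
export async function middleware(req: Request) {
  const ctx = buildFlagContext(req)
  // No timeout, no fallback: the vendor's latency and availability are now yours.
  const res = await fetch("https://flags.example.com/eval", {
    method: "POST",
    body: JSON.stringify({ key: "new-checkout", ctx }),
  })

  const { enabled } = await res.json()
  return enabled ? routeNewCheckout(req) : routeOldCheckout(req)
}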

This fails in more ways than teams expect.

You add network latency to every request, exactly where tail latency already hurts most.
Edge and serverless bursts turn into vendor bursts, which is how you discover rate and concurrency limits the hard way.
A flag outage can become a partial site outage.
Different services may evaluate the same flag at slightly different times and get different answers during a rollout.
Multi-tenant systems get risky if tenant, environment, or region scoping is inconsistent across callers.

This is also just backwards from a systems-design perspective. The flag service is the control plane. Your app runtime is the data plane. Control planes can be slower and eventually consistent. Data planes need to answer fast, locally, and predictably.

LaunchDarkly’s own docs make this distinction pretty explicit: server-side and edge SDKs evaluate using cached rules locally, while client-side SDKs depend on vendor-side evaluation for security reasons (Choosing an SDK type, Flag evaluation rules). If your server or edge code is still making a remote call for every evaluation, you are bypassing the architecture that exists to keep flags off the hot path.

The better architecture

Use remote systems to distribute flag state, not to answer every request.

The shape I want is boring on purpose:

A control-plane sync process streams or polls flag changes from the vendor.
A local store or edge store holds the latest ruleset.
Edge functions, APIs, and workers evaluate flags in-process from that local state.
Request handlers pass durable commands to queues for slow side effects.
Workers execute against an explicit snapshot of the decision that started the work.
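As a minimal sketch of the first three pieces, assuming a deliberately simplified ruleset: the Rule, Ruleset, and LocalFlagStore names below are hypothetical, and a real vendor SDK ships far richer targeting rules. The point is only the shape. The sync process is the sole writer, and request code reads from memory.

```typescript
// Hypothetical ruleset shape for illustration; not any vendor's real format.
type Rule = { enabled: boolean; rolloutPercent: number; tenants?: string[] };
type Ruleset = { version: number; flags: Record<string, Rule> };
type FlagContext = { tenantId: string; userId: string };

// Deterministic 0..99 bucket (FNV-1a hash) so one user gets a stable answer
// across services that share the same ruleset.
function bucket(input: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return (h >>> 0) % 100;
}

class LocalFlagStore {
  private ruleset: Ruleset = { version: 0, flags: {} };
  private updatedAt: number | null = null;

  // Called only by the control-plane sync process (stream or poll),
  // never from a request handler.
  replace(next: Ruleset, now: number = Date.now()): void {
    if (next.version < this.ruleset.version) return; // drop out-of-order updates
    this.ruleset = next;
    this.updatedAt = now;
  }

  // Age of the local ruleset; Infinity until the first successful sync.
  ageMs(now: number = Date.now()): number {
    return this.updatedAt === null ? Infinity : now - this.updatedAt;
  }

  // In-process evaluation: no network call, no await.
  evaluate(key: string, fallback: boolean, ctx: FlagContext): boolean {
    const rule = this.ruleset.flags[key];
    if (!rule) return fallback;
    if (!rule.enabled) return false;
    if (rule.tenants && !rule.tenants.includes(ctx.tenantId)) return false;
    return bucket(`${key}:${ctx.userId}`) < rule.rolloutPercent;
  }
}
```

Hashing the flag key together with the user ID is one common way to keep percentage rollouts sticky per user. It is an assumption of this sketch, not a claim about how any particular vendor buckets traffic.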

That last part matters more than people think.

If a request decides new-checkout = true, then enqueues work for fulfillment, billing, or notifications, the worker should not casually re-evaluate the same flag 90 seconds later and maybe get false. That is how you get split-brain business behavior inside one user action.

For long-running flows, snapshot the relevant flag result into the job payload or workflow state:

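// Evaluate once, at the decision point, so the job carries the answer with it.
// (The await is harmless if your client evaluates synchronously.)
const newCheckout = await flagClient.getBooleanValue("new-checkout", false, ctx)

await queue.enqueue({
  type: "create-order",
  orderId,
  flags: { newCheckout },
})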
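On the consuming side, a rough sketch of what honoring that snapshot could look like. The CreateOrderJob type mirrors the payload above; handleCreateOrder and the two fulfill functions are hypothetical stand-ins for real business logic.

```typescript
// Job payload as enqueued by the request handler, including the flag snapshot.
type CreateOrderJob = {
  type: "create-order";
  orderId: string;
  flags: { newCheckout: boolean };
};

// Hypothetical placeholders for the two code paths.
function fulfillWithNewCheckout(orderId: string): string {
  return `new:${orderId}`;
}
function fulfillWithOldCheckout(orderId: string): string {
  return `old:${orderId}`;
}

// The worker has no flag client at all: it can only branch on the decision
// that was made when the request was handled.
function handleCreateOrder(job: CreateOrderJob): string {
  return job.flags.newCheckout
    ? fulfillWithNewCheckout(job.orderId)
    : fulfillWithOldCheckout(job.orderId);
}
```

Keeping the flag client out of the worker's signature is a design choice: a retry that runs after a flag flip cannot drift, because there is nothing live to ask.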

If the workflow is durable, the same rule applies even more strongly. AWS’s durable execution guidance is blunt about replay and determinism: non-deterministic calls must be wrapped so replay follows the same path, and side effects need idempotent handling (Determinism during replay, Idempotency and retries). Re-evaluating live flags mid-replay is just another source of non-determinism.
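To make the replay point concrete, here is a toy journal, assuming a generic record-and-replay model rather than any specific durable execution product. StepJournal and checkoutWorkflow are hypothetical names, and a real engine would persist the journal instead of holding it in memory.

```typescript
// Toy record-and-replay journal: the first execution of a named step records
// its result, and every replay returns the recorded value instead of re-running.
class StepJournal {
  private entries = new Map<string, unknown>();

  step<T>(name: string, fn: () => T): T {
    if (this.entries.has(name)) return this.entries.get(name) as T;
    const result = fn();
    this.entries.set(name, result);
    return result;
  }
}

// The flag evaluation is wrapped as a step, so the workflow follows the same
// branch on replay even if the live flag has flipped in the meantime.
function checkoutWorkflow(journal: StepJournal, evalFlag: () => boolean): string {
  const useNew = journal.step("flag:new-checkout", evalFlag);
  return useNew ? "new-checkout" : "old-checkout";
}
```

Running the workflow twice against the same journal evaluates the flag once. The second run is a replay and never touches live flag state.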

What this buys you

First, latency stops depending on somebody else’s control plane.

Second, rate limits become manageable. Stripe’s docs are a good reminder that external systems usually enforce both rate and concurrency limits, not just requests-per-second caps (Rate limits). The same logic applies to flag vendors, internal config APIs, and any other shared platform service. If every request fans out, bursts become self-inflicted throttling events.

Third, your API boundaries get cleaner. Request handlers answer user traffic. Workers handle retries, backoff, and recovery. Queues absorb spikes instead of pushing them downstream immediately. That is the same basic reason queue-based load leveling keeps showing up in architecture guides: it decouples arrival rate from processing rate (Queue-Based Load Leveling).
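A stripped-down illustration of that decoupling, assuming an in-memory queue purely for demonstration (LevelingQueue is a hypothetical name, and a production queue would be durable): the burst lands in the queue, and the worker drains it at a rate it chooses.

```typescript
// In-memory stand-in for a real queue, only to show the rate decoupling.
class LevelingQueue<T> {
  private items: T[] = [];

  // Producers (request handlers) enqueue as fast as traffic arrives.
  enqueue(item: T): void {
    this.items.push(item);
  }

  get depth(): number {
    return this.items.length;
  }

  // The worker processes at most maxPerTick items per tick, regardless of
  // how many arrived. Returns how many were actually handled.
  drain(maxPerTick: number, handler: (item: T) => void): number {
    const batch = this.items.splice(0, maxPerTick);
    batch.forEach(handler);
    return batch.length;
  }
}
```

Queue depth then becomes the signal to watch: it grows during a spike and shrinks afterward, while the downstream call rate stays capped at whatever the worker's tick allows.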

Fourth, observability gets sharper. I want every evaluation to emit:

flag_key
variant
reason
tenant_id
env
store_age_ms
used_fallback
request_id or workflow_id
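One way to enforce that shape is a thin wrapper that every evaluation passes through. This is a sketch under assumptions: recordEvaluation, EvalResult, and EvalMeta are hypothetical names, the sink could be a logger or a metrics client, and the reason strings are whatever your evaluator reports.

```typescript
// Result of a local evaluation, as produced by whatever evaluator you use.
type EvalResult = { variant: string; reason: string; usedFallback: boolean };

// Structured event carrying the fields listed above.
type EvalEvent = {
  flag_key: string;
  variant: string;
  reason: string;
  tenant_id: string;
  env: string;
  store_age_ms: number;
  used_fallback: boolean;
  request_id?: string;
  workflow_id?: string;
};

type EvalMeta = {
  tenantId: string;
  env: string;
  storeAgeMs: number;
  requestId?: string;
  workflowId?: string;
};

// Emits one event per evaluation and returns the result unchanged, so the
// call site cannot evaluate a flag without leaving a trace.
function recordEvaluation(
  flagKey: string,
  result: EvalResult,
  meta: EvalMeta,
  sink: (event: EvalEvent) => void,
): EvalResult {
  const event: EvalEvent = {
    flag_key: flagKey,
    variant: result.variant,
    reason: result.reason,
    tenant_id: meta.tenantId,
    env: meta.env,
    store_age_ms: meta.storeAgeMs,
    used_fallback: result.usedFallback,
  };
  // Attach whichever correlation ID the caller has.
  if (meta.requestId !== undefined) event.request_id = meta.requestId;
  if (meta.workflowId !== undefined) event.workflow_id = meta.workflowId;
  sink(event);
  return result;
}
```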

This is where a vendor-neutral boundary helps. OpenFeature hooks are useful because they give you one place to attach logging, telemetry, and evaluation-context validation without smearing provider-specific code across every service.

The tradeoffs

This architecture is not free.

Local evaluation means you now operate config distribution, cache freshness, and store health.
Flag flips are eventually consistent. A kill switch might take seconds to propagate unless you invest in streaming, relays, or edge-native stores.
Server-side evaluation means your rules and targeting attributes live in trusted infrastructure, so tenancy boundaries and PII handling need discipline.
Some product experiments still belong client-side, especially when the browser owns the UX state.
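The freshness tradeoff in particular benefits from being explicit in code. A small sketch, assuming two hypothetical thresholds (warn and expire) that you would tune to your own propagation budget: past the expiry threshold, the evaluator stops trusting the store and serves a safe default, visibly.

```typescript
type Freshness = "fresh" | "stale" | "expired";

// Classify the local store by age. Both thresholds are assumptions you pick;
// nothing here is dictated by a vendor.
function classifyFreshness(ageMs: number, warnAfterMs: number, expireAfterMs: number): Freshness {
  if (ageMs >= expireAfterMs) return "expired";
  if (ageMs >= warnAfterMs) return "stale";
  return "fresh";
}

// A stale store still serves its value (and should trigger an alert elsewhere).
// An expired store serves the safe default and reports that it did so.
function resolveWithFreshness(
  storeValue: boolean,
  safeDefault: boolean,
  freshness: Freshness,
): { value: boolean; usedFallback: boolean } {
  if (freshness === "expired") return { value: safeDefault, usedFallback: true };
  return { value: storeValue, usedFallback: false };
}
```

Whether an expired store should fail toward the old or the new behavior is a per-flag decision. For a kill switch, the safe default is usually the state in which the risky path is off.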

But those are good tradeoffs. You are choosing explicit consistency and operability constraints over accidental request-path fragility.

The rule of thumb is simple: evaluate flags locally wherever you serve traffic, and treat the vendor as a control-plane dependency, not a serving dependency. If a decision crosses an async boundary, snapshot it. If the resulting work is slow or failure-prone, queue it. And if you cannot tell when the system is serving fallback values, you do not have a flag platform yet. You have a remote boolean lookup with branding.