Q: When should an MCP server use an on-behalf-of flow?

Use a provider-specific delegated exchange only when a downstream API must receive a user-delegated token and the identity provider supports that exchange. It is not a generic MCP feature. If the downstream operation is service-owned, a workload credential with an explicit policy may be safer and easier to audit.

Enterprise OAuth for Remote MCP Servers

Q: Does every remote MCP deployment need OAuth?

A remote server needs an authenticated and authorized request boundary, but the protocol does not make one identity provider or grant type suitable for every topology. OAuth is the interoperable choice when independent clients and delegated user access are required. A private service-to-service deployment may also use workload identity or mTLS, provided the MCP server still performs application authorization.

Q: Does OAuth 2.1 make an MCP tool call safe?

No. OAuth conveys delegated authorization and carries claims about the client or user; it does not decide what a tool may do. The MCP server must still check the intended resource, tenant, subject, tool, purpose, and side effect for every call. A valid token is not proof that the caller owns an object or may perform an administrative action.

Q: How should JWKS rotation be handled?

Cache keys according to the provider's cache headers and bound the cache lifetime. On an unknown key ID, refresh once with a distributed cooldown, then reject if the key is still unknown. Do not disable signature, issuer, audience, or expiry checks to recover from a rotation.

Q: Is CORS required for every MCP server?

No. CORS is a browser enforcement mechanism. A native client, local process, or backend service does not need browser CORS headers. If a browser client is supported, use an explicit origin allowlist and treat CORS as an additional browser boundary, never as authorization.

2026-05-16 - QubitTool Tech Team

An enterprise MCP deployment has two different questions:

1. Who is making this request, and which authorization server issued the evidence?
2. May this principal perform this exact operation on this exact resource?

OAuth and OpenID Connect help answer the first question. They do not answer the second. This distinction is the central design constraint for a remote MCP server: a valid bearer token can identify a principal while still being insufficient to read another tenant's invoice, send an external message, or delete an object.

This article focuses on the identity boundary around a remote MCP server. It complements the MCP production guide, which covers transport, sessions, bounded results, and protocol-level reliability. The examples use TypeScript-shaped pseudocode to show boundaries rather than promise a copy-paste deployment. Pin an MCP specification revision, an SDK version, and an identity-provider profile in the implementation repository.

Key Takeaways

Treat the MCP server as a protected resource. An authorization server or enterprise identity provider issues credentials; the MCP server validates them and applies resource policy.
Use the authorization server metadata and protected-resource metadata that your deployment supports. Do not assume dynamic registration, a particular scope name, or a particular discovery URL.
PKCE with S256 protects authorization-code clients from code interception. It does not replace redirect-URI validation, state/nonce handling, token validation, or tool authorization.
Validate the exact issuer, audience or resource indicator, allowed algorithms, signature, time claims, and scopes. Keep provider-specific claim mapping outside the business policy.
JWKS caching is an availability optimization, not a reason to accept an unknown key or skip verification.
A provider-specific delegated exchange such as Microsoft Entra OBO is useful only when the downstream API and identity provider support it. It must not be confused with generic MCP behavior.
Enforce tenant, subject, object ownership, purpose, and side-effect policy after authentication and before executing a tool.
Browser CORS, mTLS, workload identity, and refresh-token rotation are deployment choices whose necessity depends on the client topology and provider configuration.
Test identity failures and authorization failures separately. Both must be observable without leaking token contents or sensitive resource existence.

Start With the Trust Topology

Before choosing a grant or SDK, document the principals and token audiences:

text

user or workload
    -> MCP client / host
    -> MCP protected resource
    -> downstream API (optional)

For each arrow, record:

| Boundary | Credential subject | Intended audience/resource | Server decision |
| --- | --- | --- | --- |
| Client -> MCP | user, workload, or delegated client | MCP resource identifier | authenticate, then authorize the tool |
| MCP -> downstream API | user delegation or service identity | downstream API | request only the narrow capability needed |
| Browser -> client | browser origin, if applicable | registered redirect URI | apply browser controls; never infer object permission |

The MCP server should not accept a token merely because it was signed by a familiar provider. A token issued for a different API can be cryptographically valid and still be invalid for this resource, because its audience names another service. Similarly, a token with mcp.tools.read should not invoke a write tool unless a separate policy explicitly permits that action.
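
The two checks in the previous paragraph can be sketched as follows. This is a minimal illustration, assuming the aud claim arrives as either a string or an array of strings and that each tool requires exactly one scope. The names audienceMatches, scopePermitsTool, toolScopes, and the tool names are hypothetical and do not come from any SDK.

```typescript
// Hypothetical mapping from tool name to the single scope that permits it.
const toolScopes: Readonly<Record<string, string>> = {
  "invoice.read": "mcp.tools.read",
  "invoice.send": "mcp.tools.execute",
};

// True only when the token names this resource as an intended audience.
function audienceMatches(aud: unknown, expected: string): boolean {
  if (typeof aud === "string") return aud === expected;
  if (Array.isArray(aud)) return aud.some((item) => item === expected);
  return false;
}

// A tool that is missing from the map is denied rather than allowed by default.
function scopePermitsTool(scopes: ReadonlySet<string>, tool: string): boolean {
  const required = Object.prototype.hasOwnProperty.call(toolScopes, tool)
    ? toolScopes[tool]
    : undefined;
  return required !== undefined && scopes.has(required);
}
```

With these definitions, a token whose audience is another API fails audienceMatches even though its signature verifies, and a principal holding only mcp.tools.read is refused for invoice.send.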

Protected Resource Metadata and Discovery

Remote clients need a way to learn which authorization servers protect an MCP resource. RFC 9728 defines Protected Resource Metadata. The metadata is a discovery document, not a grant of permission:

json

{
  "resource": "https://mcp.example.com",
  "authorization_servers": [
    "https://login.example.com"
  ],
  "scopes_supported": [
    "mcp.tools.read",
    "mcp.tools.execute"
  ],
  "bearer_methods_supported": ["header"],
  "resource_documentation": "https://docs.example.com/mcp"
}

The exact endpoint and resource value must match the deployment's MCP and OAuth profile. Do not publish a wildcard issuer list or advertise scopes that the policy engine does not enforce. If multiple authorization servers are supported, define how a tenant is selected and how an issuer is mapped to that tenant before accepting a token.

Discovery has several failure modes:

An attacker can replace an unpinned discovery URL if the initial connection is not authenticated.
A provider can publish endpoints for a tenant different from the one the request is meant to access.
Metadata can advertise a scope without granting it.
Dynamic client registration can create unmanaged clients if approval and lifecycle controls are missing.

Pin trusted issuers and resource identifiers in configuration. Treat remote metadata as input to a controlled registration process, not as runtime authority.
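
One way to treat metadata as input rather than authority is to compare a fetched document against pinned configuration during registration. The sketch below assumes a metadata shape limited to the fields shown in the JSON example; PinnedConfig, checkMetadata, and the problem codes are hypothetical names chosen for illustration.

```typescript
type ResourceMetadata = {
  resource: string;
  authorization_servers: string[];
  scopes_supported?: string[];
};

// Values an operator pins in configuration (hypothetical shape).
type PinnedConfig = {
  resource: string;
  trustedIssuers: ReadonlySet<string>;
  enforcedScopes: ReadonlySet<string>;
};

// Returns a list of problems; an empty list means the document agrees with configuration.
function checkMetadata(metadata: ResourceMetadata, pinned: PinnedConfig): string[] {
  const problems: string[] = [];
  if (metadata.resource !== pinned.resource) problems.push("resource_mismatch");
  if (metadata.authorization_servers.length === 0) problems.push("no_authorization_server");
  for (const issuer of metadata.authorization_servers) {
    if (!pinned.trustedIssuers.has(issuer)) problems.push(`untrusted_issuer:${issuer}`);
  }
  for (const scope of metadata.scopes_supported ?? []) {
    // An advertised scope that the policy engine does not enforce is a publishing error.
    if (!pinned.enforcedScopes.has(scope)) problems.push(`unenforced_scope:${scope}`);
  }
  return problems;
}
```

A non-empty result would block registration or publication; it is not meant to be evaluated per request.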

Authorization Code and PKCE: What It Solves

For a user-facing public client, the authorization-code flow with PKCE binds the code exchange to a verifier held by the client:

mermaid

sequenceDiagram
    participant C as "MCP Client"
    participant R as "MCP Resource"
    participant A as "Authorization Server"
    participant U as "User"
    C->>R: Request protected metadata
    R-->>C: issuer and supported resource
    C->>A: Discover endpoints
    C->>C: Generate verifier and S256 challenge
    C->>A: Authorization request + state + redirect URI
    A->>U: Authenticate and obtain consent
    A-->>C: Authorization code
    C->>A: Code + verifier
    A-->>C: Access token
    C->>R: MCP request with bearer token
    R->>R: Validate token and authorize operation

PKCE does not:

prove that the model selected an appropriate tool;
authorize a tenant or object;
prevent a compromised client from using a token it legitimately received;
make a bearer token sender-constrained;
replace exact redirect-URI, state, nonce, or token validation.

Use S256, protect the verifier, bind state to the initiating browser session, and use a nonce where the OIDC identity layer requires it. Confidential clients still need a correct authorization-code exchange; whether PKCE is required is determined by the applicable profile and provider policy, not by a blanket claim that every OAuth deployment has identical requirements.
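
For reference, the S256 transformation itself is small. The sketch below assumes a Node.js runtime and uses its built-in node:crypto module; the function names createVerifier and challengeFor are hypothetical. It covers only verifier and challenge generation, not redirect-URI, state, or nonce handling.

```typescript
import { createHash, randomBytes } from "node:crypto";

// 32 random bytes encode to a 43-character base64url string,
// which sits inside the 43 to 128 character range RFC 7636 allows.
function createVerifier(): string {
  return randomBytes(32).toString("base64url");
}

// S256: BASE64URL(SHA256(ASCII(code_verifier))), without padding.
function challengeFor(verifier: string): string {
  return createHash("sha256").update(verifier, "ascii").digest("base64url");
}
```

The client sends the challenge with the authorization request and keeps the verifier private until the code exchange, where the authorization server recomputes the challenge and compares.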

Token Validation Is a Narrow, Explicit Contract

The resource server should convert a verified token into a small internal principal. It should not pass raw claims into tool handlers:

typescript

type Principal = {
  subject: string;
  tenant: string;
  clientId?: string;
  scopes: ReadonlySet<string>;
  issuer: string;
  audience: string;
};

type TokenPolicy = {
  issuer: string;
  audience: string;
  algorithms: readonly string[];
  requiredScopes: readonly string[];
};

type VerifiedClaims = {
  sub?: unknown;
  iss?: unknown;
  aud?: unknown;
  exp?: unknown;
  nbf?: unknown;
  iat?: unknown;
  scope?: unknown;
  scp?: unknown;
  tid?: unknown;
  azp?: unknown;
  client_id?: unknown;
};

function toScopes(claims: VerifiedClaims): Set<string> {
  const value = claims.scope ?? claims.scp;
  if (typeof value === "string") return new Set(value.split(/\s+/).filter(Boolean));
  if (Array.isArray(value) && value.every((item) => typeof item === "string")) {
    return new Set(value);
  }
  return new Set();
}

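// Converts a raw bearer token into the internal Principal, or throws.
// verifyJwt is injected so that provider-specific verification stays outside this function.
function requirePrincipal(
  rawToken: string,
  policy: TokenPolicy,
  verifyJwt: (token: string, options: {
    issuer: string;
    audience: string;
    algorithms: readonly string[];
  }) => VerifiedClaims,
): Principal {
  // The adapter is expected to throw on any signature, issuer, audience, or time-claim failure.
  const claims = verifyJwt(rawToken, {
    issuer: policy.issuer,
    audience: policy.audience,
    algorithms: policy.algorithms,
  });

  if (typeof claims.sub !== "string" || typeof claims.iss !== "string") {
    throw new Error("invalid_principal_claims");
  }

  // Every scope the policy requires must be present in the token.
  const scopes = toScopes(claims);
  if (!policy.requiredScopes.every((scope) => scopes.has(scope))) {
    throw new Error("insufficient_scope");
  }

  // No fallback tenant: a missing tenant claim rejects the request.
  const tenant = typeof claims.tid === "string" ? claims.tid : undefined;
  if (!tenant) throw new Error("tenant_context_required");

  return {
    subject: claims.sub,
    tenant,
    // Prefer azp, then client_id; the field stays undefined when neither is a string.
    clientId: typeof claims.azp === "string"
      ? claims.azp
      : typeof claims.client_id === "string" ? claims.client_id : undefined,
    scopes,
    issuer: claims.iss,
    audience: policy.audience,
  };
}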

The verifyJwt adapter must perform signature verification using a trusted JWKS URI, reject algorithms outside the allowlist, validate iss, aud or the configured resource indicator, and enforce exp, nbf, and an acceptable clock skew. Require sub for user-delegated operations and reject the request when a trusted tenant context is unavailable. If a provider uses a claim other than tid, normalize it in the authentication adapter before this function; never use a placeholder tenant. Some providers use scp, others use scope; normalize that difference once and keep it out of business handlers.
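
The time-claim part of that adapter contract can be sketched in isolation. This is an illustration under stated assumptions: exp and nbf are NumericDate values in seconds, exp is mandatory, and the skew allowance is supplied by configuration. The function name checkTimeClaims and the failure codes are hypothetical.

```typescript
type TimeClaims = { exp?: unknown; nbf?: unknown };

// Returns a failure code, or undefined when the time claims are acceptable.
function checkTimeClaims(
  claims: TimeClaims,
  nowSeconds: number,
  skewSeconds: number,
): string | undefined {
  if (typeof claims.exp !== "number") return "exp_required";
  // The token is accepted until exp plus the allowed skew, and not after.
  if (nowSeconds >= claims.exp + skewSeconds) return "token_expired";
  if (claims.nbf !== undefined) {
    if (typeof claims.nbf !== "number") return "nbf_malformed";
    // The token is accepted once the local clock plus skew reaches nbf.
    if (nowSeconds + skewSeconds < claims.nbf) return "token_not_yet_valid";
  }
  return undefined;
}
```

Keeping the skew explicit and small makes the tolerance reviewable; an unbounded skew silently extends token lifetime.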

Do not log the raw token, authorization header, refresh token, or complete claims object. Record a request identifier, issuer, key ID, token hash or redacted subject, decision, and policy version.
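
A log record along those lines can be built field by field, so that a token or a complete claims object cannot be copied into it by accident. The sketch below is illustrative: AuditRecord, redactSubject, and toAuditRecord are hypothetical names, and the redaction rule (two leading and two trailing characters) is an assumption, not a recommendation for every deployment.

```typescript
type Decision = "allow" | "deny" | "confirm";

type AuditRecord = {
  requestId: string;
  issuer: string;
  keyId: string;
  subject: string; // redacted form only
  decision: Decision;
  policyVersion: string;
};

// Keeps two leading and two trailing characters; short values are masked entirely.
function redactSubject(subject: string): string {
  if (subject.length <= 8) return "****";
  return `${subject.slice(0, 2)}***${subject.slice(-2)}`;
}

// Copies only the named fields; anything else on the input is dropped.
function toAuditRecord(input: {
  requestId: string;
  issuer: string;
  keyId: string;
  subject: string;
  decision: Decision;
  policyVersion: string;
}): AuditRecord {
  return {
    requestId: input.requestId,
    issuer: input.issuer,
    keyId: input.keyId,
    subject: redactSubject(input.subject),
    decision: input.decision,
    policyVersion: input.policyVersion,
  };
}
```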

JWKS Rotation Without Fail-Open Behavior

JWKS caching reduces latency and dependency load, but it creates a short synchronization window during key rotation. A robust cache has four properties:

Respect the provider's Cache-Control guidance, with an application maximum age.
Refresh on expiry and coalesce concurrent refreshes.
On an unknown kid, perform one bounded refresh behind a distributed cooldown.
Reject the token if the key remains unknown or validation still fails.

Never respond to a key-fetch outage by accepting an unverified token. Monitor cache age, refresh errors, unknown-key counts, and validation failures. A provider-specific SDK may already implement safe key selection; understand its cache and refresh semantics before wrapping it.
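
The four properties can be sketched with an in-process cache. Several simplifications are assumptions of this example rather than guidance: fetchKeys is synchronous and injected (a real fetch would be asynchronous and would coalesce concurrent callers), the cooldown is local to one process rather than distributed, and key material is an opaque string. The name createKeyCache is hypothetical.

```typescript
type KeySet = ReadonlyMap<string, string>; // kid -> key material, opaque in this sketch

function createKeyCache(
  fetchKeys: () => KeySet,
  maxAgeMs: number,
  cooldownMs: number,
  now: () => number,
): (kid: string) => string | undefined {
  let keys: KeySet = new Map<string, string>();
  let fetchedAt = Number.NEGATIVE_INFINITY;
  let lastAttempt = Number.NEGATIVE_INFINITY;

  // Returns the key for kid, or undefined so that the caller rejects the token.
  return function lookup(kid: string): string | undefined {
    const t = now();
    const stale = t - fetchedAt >= maxAgeMs;
    // Refresh on expiry or unknown kid, but at most once per cooldown window.
    if ((stale || !keys.has(kid)) && t - lastAttempt >= cooldownMs) {
      lastAttempt = t;
      try {
        keys = fetchKeys();
        fetchedAt = t;
      } catch {
        // Fetch failed: state is unchanged and the staleness check below fails closed.
      }
    }
    if (t - fetchedAt >= maxAgeMs) return undefined; // cache lifetime is bounded
    return keys.get(kid);
  };
}
```

An unknown kid inside the cooldown window returns undefined without contacting the provider, which keeps a flood of forged key IDs from turning into a flood of JWKS requests.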

Authorization Is More Than Scope

Scopes describe a coarse capability. The tool policy must narrow it to the request:

typescript

type ToolCall = {
  name: string;
  resourceId?: string;
  arguments: Record<string, unknown>;
  sideEffect: "none" | "external_write" | "destructive";
};

type AuthorizationContext = {
  principal: Principal;
  tenant: string;
  requestId: string;
};

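// Decides whether one tool call may run. Every check that fails returns "deny";
// a destructive call that passes the earlier checks is returned as "confirm".
function authorizeTool(
  call: ToolCall,
  context: AuthorizationContext,
  policy: {
    requiredScope: string;
    authorizeResource: (tenant: string, subject: string, resourceId: string) => boolean;
    allowSideEffect: (subject: string, name: string) => boolean;
  },
): "allow" | "deny" | "confirm" {
  // Coarse capability first, then tenant isolation.
  if (!context.principal.scopes.has(policy.requiredScope)) return "deny";
  if (context.principal.tenant !== context.tenant) return "deny";

  // Object-level check: ownership is resolved by the policy, not by tool arguments.
  if (call.resourceId &&
      !policy.authorizeResource(context.tenant, context.principal.subject, call.resourceId)) {
    return "deny";
  }

  // Destructive calls always require confirmation; external writes need an explicit allowance.
  if (call.sideEffect === "destructive") return "confirm";
  if (call.sideEffect === "external_write" &&
      !policy.allowSideEffect(context.principal.subject, call.name)) {
    return "deny";
  }
  return "allow";
}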

The authorization function should query an authoritative policy or resource service. It must not trust tenant_id, owner_id, price, role, or resource identifiers supplied by the model. Resolve identity from the verified principal and resolve ownership from the database or policy engine. Keep read-only discovery separate from execution and use different scopes only when the server actually enforces the distinction.
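
As an illustration of resolving ownership from an authoritative source, the sketch below backs the resource check with an in-memory map standing in for a database or policy service. The store contents, the resource_id argument name, and the function authorizeFromArguments are hypothetical. Only the resource identifier is read from model-supplied arguments, and it is used as a lookup key, never as evidence of ownership.

```typescript
type ResourceRecord = { tenant: string; owner: string };

// Stand-in for a database or policy service; the entries are hypothetical.
const resourceStore: ReadonlyMap<string, ResourceRecord> = new Map<string, ResourceRecord>([
  ["inv-1", { tenant: "t-a", owner: "alice" }],
  ["inv-2", { tenant: "t-b", owner: "bob" }],
]);

function authorizeResource(tenant: string, subject: string, resourceId: string): boolean {
  const record = resourceStore.get(resourceId);
  // Unknown and foreign resources produce the same answer, which avoids existence disclosure.
  return record !== undefined && record.tenant === tenant && record.owner === subject;
}

// tenant_id or owner_id values inside args are never consulted.
function authorizeFromArguments(
  principal: { tenant: string; subject: string },
  args: Record<string, unknown>,
): boolean {
  const resourceId = args["resource_id"];
  if (typeof resourceId !== "string") return false;
  return authorizeResource(principal.tenant, principal.subject, resourceId);
}
```

A call that names another tenant's invoice and also supplies a matching tenant_id in its arguments is still denied, because the decision uses the verified principal and the stored record.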

Downstream Delegation: OBO Is Provider-Specific

An MCP server sometimes needs to call a downstream API as the user. Microsoft Entra's On-Behalf-Of flow is one implementation of delegated exchange; other providers expose different token-exchange profiles, and some downstream systems do not support delegation at all.

Choose between two patterns:

| Need | Credential | Main risk |
| --- | --- | --- |
| User-owned resource | provider-supported delegated token | confused deputy and consent drift |
| Service-owned operation | workload identity or client credential | excessive service authority |

For either pattern:

allowlist the downstream audience and scopes;
never let the model choose the target audience or scope;
check the MCP principal and object authorization before exchange;
cache only short-lived downstream tokens, keyed by a protected subject and scope digest;
redact assertions and downstream tokens from logs;
propagate the original subject and request ID for audit;
reject a downstream response that attempts to alter the current authorization decision.

OBO does not reduce an over-privileged upstream consent: the exchanged token can carry at most what the user or administrator already granted. It also does not prove that a downstream object belongs to the user. The downstream API must enforce its own authorization.
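
The allowlisting rule can be sketched as a server-side lookup from tool name to downstream target, so that neither audience nor scope is ever read from tool arguments. The policy entries, selectDownstreamTarget, and cacheKey are hypothetical. The cache key is returned as plain text here for readability; a deployment following the guidance above would store a digest of it instead.

```typescript
type DownstreamTarget = { audience: string; scopes: readonly string[] };

// Server-owned allowlist keyed by tool name; the entry is hypothetical.
const downstreamPolicy: Readonly<Record<string, DownstreamTarget>> = {
  "calendar.read": { audience: "https://calendar.example.com", scopes: ["events.read"] },
};

// Returns the only target this tool may request, or undefined when no exchange is allowed.
function selectDownstreamTarget(toolName: string): DownstreamTarget | undefined {
  return Object.prototype.hasOwnProperty.call(downstreamPolicy, toolName)
    ? downstreamPolicy[toolName]
    : undefined;
}

// Key material for a short-lived token cache. Scope order does not change the key.
function cacheKey(tenant: string, subject: string, target: DownstreamTarget): string {
  return JSON.stringify([tenant, subject, target.audience, target.scopes.slice().sort()]);
}
```

Because the key includes tenant and subject, a downstream token obtained for one user cannot be served to another from the cache.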

Browser, Native, and Workload Clients

The client topology changes the controls:

| Client | Important controls |
| --- | --- |
| Browser-based public client | PKCE, exact redirect URIs, state/nonce, explicit CORS origins, no secrets in browser code |
| Native desktop client | PKCE, claimed or loopback redirect handling, OS-protected token storage |
| Backend service | workload identity or confidential-client authentication, secret/key rotation, egress policy |
| Local `stdio` client | OS process isolation and filesystem/network restrictions; remote OAuth may not be needed |

CORS only governs browsers. mTLS authenticates a network peer and can complement OAuth for service-to-service traffic, but a certificate does not replace tool or object authorization. Do not impose TLS 1.3 as a universal application rule without checking client and proxy support; require modern TLS and an approved cipher policy appropriate to the deployment.
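
For deployments that do support a browser client, an explicit origin allowlist can be sketched as below. The allowed origin and the function name corsHeaders are hypothetical. The function only decides which response headers to emit; it plays no part in token or object authorization.

```typescript
// Explicit allowlist; no wildcard and no pattern matching.
const allowedOrigins: ReadonlySet<string> = new Set(["https://app.example.com"]);

// Returns CORS response headers for a registered origin, or none at all.
function corsHeaders(origin: string | undefined): Record<string, string> {
  if (origin === undefined || !allowedOrigins.has(origin)) return {};
  return {
    "Access-Control-Allow-Origin": origin,
    // Tells caches that the response varies with the request's Origin header.
    "Vary": "Origin",
  };
}
```

A request without an Origin header, such as one from a native or backend client, receives no CORS headers and proceeds to normal token validation.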

Operational Controls and Failure Semantics

Authentication and authorization failures should be distinguishable internally while exposing minimal information externally:

401 for a missing, malformed, expired, or invalid access token;
403 for a valid principal lacking the required scope or policy decision;
409 or a domain-specific response for a resource version or idempotency conflict;
bounded 429 responses for rate limits, with retry information that does not reveal sensitive state.
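
The mapping above can be sketched as a function from an internal failure class to an external response. The Failure type, the toResponse function, and the body strings are hypothetical. The detailed reason stays on the internal value for telemetry, while the external body carries only a generic error name.

```typescript
type Failure =
  | { kind: "unauthenticated"; reason: "missing" | "malformed" | "expired" | "invalid" }
  | { kind: "forbidden"; reason: "insufficient_scope" | "policy_denied" }
  | { kind: "conflict" }
  | { kind: "rate_limited"; retryAfterSeconds: number };

type ErrorResponse = {
  status: number;
  headers: Record<string, string>;
  body: { error: string };
};

// The reason field is deliberately not copied into the response.
function toResponse(failure: Failure): ErrorResponse {
  switch (failure.kind) {
    case "unauthenticated":
      return { status: 401, headers: { "WWW-Authenticate": "Bearer" }, body: { error: "unauthenticated" } };
    case "forbidden":
      return { status: 403, headers: {}, body: { error: "forbidden" } };
    case "conflict":
      return { status: 409, headers: {}, body: { error: "conflict" } };
    case "rate_limited":
      return {
        status: 429,
        headers: { "Retry-After": String(failure.retryAfterSeconds) },
        body: { error: "rate_limited" },
      };
  }
}
```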

Rate-limit by more than IP: principal, client, tenant, tool, and downstream dependency may need separate budgets. Protect discovery, token exchange, initialization, tool calls, and large-result downloads independently. Cancellation must stop downstream work where possible, and retries must be limited to operations whose side effects are known to be safe.
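
Separate budgets per dimension can be sketched with a fixed-window counter. This is a simplified, single-process illustration: the budget names, limits, and the createLimiter function are hypothetical, and a production limiter would normally use a shared store and a smoother algorithm. A request is admitted only when every applicable budget has room, and a rejected request consumes nothing.

```typescript
type Budget = { limit: number; windowMs: number };
type Counter = { windowStart: number; count: number };

function createLimiter(
  budgets: Readonly<Record<string, Budget>>,
  now: () => number,
): (dimensions: Readonly<Record<string, string>>) => boolean {
  const counters = new Map<string, Counter>();

  // dimensions maps a budget name (for example "tenant") to this request's value.
  return function allow(dimensions: Readonly<Record<string, string>>): boolean {
    const t = now();
    const pending: Counter[] = [];
    for (const name of Object.keys(dimensions)) {
      const budget = budgets[name];
      if (budget === undefined) continue; // no budget configured for this dimension
      const key = `${name}:${dimensions[name]}`;
      let counter = counters.get(key);
      if (counter === undefined || t - counter.windowStart >= budget.windowMs) {
        counter = { windowStart: t, count: 0 };
        counters.set(key, counter);
      }
      if (counter.count >= budget.limit) return false; // one exhausted budget rejects the request
      pending.push(counter);
    }
    for (const counter of pending) counter.count += 1;
    return true;
  };
}
```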

Useful telemetry includes:

authentication outcome by issuer and failure class;
JWKS cache age, refresh latency, and unknown-key count;
authorization decisions by policy version and tool, without raw arguments;
downstream exchange success and latency;
rate-limit and cancellation counts;
cross-tenant denial attempts and repeated invalid-token patterns.

This article does not prescribe universal alert thresholds, because the right values depend on the deployment. Establish baselines from a staging replay and tune thresholds to the traffic shape, tenant count, and provider limits.

Production Test Matrix

Test the identity boundary and the capability boundary independently:

| Scenario | Expected result |
| --- | --- |
| wrong issuer or audience | reject before tool execution |
| disallowed algorithm or unknown `kid` | reject; one bounded JWKS refresh only |
| expired or not-yet-valid token | reject with no sensitive detail |
| valid token, missing execution scope | deny without invoking the tool |
| valid scope, another tenant's object | deny and avoid existence disclosure |
| model changes the target audience or downstream scope | ignore the model value and use server policy |
| browser request from an unregistered origin | browser boundary rejects; server auth remains authoritative |
| duplicated write with the same idempotency key | return the existing result or a safe conflict |
| provider outage during JWKS refresh | fail closed, expose a dependency error metric |
| canceled request during downstream call | stop or compensate according to the operation contract |

Run these cases in a replayable environment with fixed policy versions and synthetic tokens. Include provider-specific integration tests, but keep the core authorization tests independent of any one IdP.

Migration Checklist

Before exposing a remote MCP endpoint:

Define the protected resource identifier, trusted issuers, tenant mapping, and supported client topologies.
Publish only the discovery metadata and scopes that the server actually enforces.
Implement PKCE and redirect/state handling for public clients.
Validate signature, algorithm, issuer, audience/resource, time claims, subject, and provider-specific scope claims.
Add object-level authorization, tenant isolation, side-effect confirmation, and idempotency.
Configure JWKS caching with bounded refresh and fail-closed behavior.
Decide deliberately between delegated downstream access and workload identity.
Redact tokens and sensitive claims from logs and traces.
Test the failure matrix before load testing or exposing high-impact tools.
Review the provider's current documentation and the pinned MCP specification before each upgrade.

Frequently Asked Questions

Does every remote MCP deployment need OAuth?

No. It needs a verifiable identity and authorization boundary. OAuth is the interoperable default when multiple clients, user consent, and delegated access are involved. A private service mesh may use workload identity or mTLS, but the server still needs application-level policy.

Does OAuth 2.1 make an MCP tool call safe?

No. It protects an authorization flow and conveys claims. The server must still authorize the exact tool, tenant, resource, purpose, and side effect. A model cannot grant itself access by placing a different subject or owner in tool arguments.

When should an MCP server use OBO?

Only when the identity provider and downstream API support a compatible delegated exchange and the operation must run with the user's own permissions at the downstream boundary. Otherwise use a narrow workload identity and an explicit service policy.

How should JWKS rotation be handled?

Honor cache headers, bound the cache age, refresh once on an unknown key ID with a distributed cooldown, and reject if verification still fails. Never fail open.

Is CORS required for every MCP server?

No. It matters for browser clients. Native, local-process, and backend clients need their own transport and credential controls. In every topology, CORS and mTLS are supplemental; token and object authorization remain authoritative.

Conclusion

Enterprise OAuth integration is successful when it makes authority explicit: a trusted issuer authenticates a principal, a resource server validates the evidence, and an application policy decides whether this exact MCP call is allowed. Keeping those responsibilities separate makes provider migrations, incident response, delegated access, and security testing tractable. It also prevents the most dangerous shortcut in agent systems: treating a valid token as permission to do whatever the model requested.
