subagentcowork

.com 87 pages

Using Cowork on 3P with an LLM Gateway

Configure Cowork on 3P to use Claude models on a self-hosted gateway that implements the Anthropic Messages API

To use a self-hosted LLM gateway (for example LiteLLM, Portkey, or an in-house proxy) as the inference provider, set inferenceProvider to gateway and supply the base URL and credentials described below.

The gateway must implement the Anthropic Messages API:

  • POST /v1/messages with streaming and tool use is required.
  • GET /v1/models is optional. If the gateway implements it, Cowork on 3P auto-discovers available models; if not, set inferenceModels explicitly.

[!NOTE] The data-residency and "no conversation data sent to Anthropic" statements elsewhere in these pages apply to a gateway deployment provided your gateway does not route inference to Anthropic-operated infrastructure (directly to the Anthropic API or via Microsoft Foundry). Data handling is otherwise determined by the gateway you operate and the upstream provider it routes to.

Configuration keys

Setting Required Description
Gateway base URL
inferenceGatewayBaseUrl
Yes Gateway base URL. Must be https://.
Gateway API key
inferenceGatewayApiKey
Unless using interactive sign-in or a credential helper API key sent to the gateway. The field cannot be empty, so if your gateway authenticates by network identity and does not require a key, set a placeholder value.
Gateway auth scheme
inferenceGatewayAuthScheme
No Which HTTP header carries the credential. bearer (default) sends Authorization: Bearer <key>. x-api-key sends the x-api-key header instead. This setting controls the wire format only; to select interactive sign-in instead of a static key, set inferenceCredentialKind (see below).

To send additional HTTP headers on every inference request (tenant routing, org IDs, and similar), set inferenceCustomHeaders. It applies to all providers, not just gateways.

As an alternative to a static inferenceGatewayApiKey, configure an inferenceCredentialHelper executable that prints the gateway credential to stdout, or set inferenceCredentialKind to interactive for per-user single sign-on through your identity provider.

Models

When inferenceModels is unset, Cowork on 3P populates the model picker from your gateway's GET /v1/models response. Auto-discovery shows only models whose IDs are recognizably Claude; if your gateway advertises models under opaque aliases, set inferenceModels explicitly. Set inferenceModels to override discovery with an explicit list — the picker will show exactly the entries you provide. Use the model IDs your gateway expects (for example bedrock/us.anthropic.claude-opus-4-7 for a LiteLLM-style routing prefix).

MCP tool search

MCP tool search is disabled by default when routing through a gateway, because most proxies do not forward tool_reference content blocks. If your gateway passes them through unchanged — LiteLLM in passthrough mode and Cloudflare AI Gateway both do — set the environment variable ENABLE_TOOL_SEARCH=true to re-enable it.

Configure in the app

Open the in-app configuration window (Developer → Configure third-party inference). In the Connection section, set Inference provider to Gateway, then fill in the Gateway credentials card:

Field Value
Gateway base URL https://llm-gateway.example.corp
Gateway API key your gateway key (or a placeholder if your gateway has none)
Credential kind Static API key (default), or Interactive sign-in for single sign-on
Gateway auth scheme Bearer (default) or x-api-key

Then click Export to produce a .mobileconfig (macOS) or .reg (Windows) file for your MDM. See Installation and setup for the export and deployment workflow.

mirror sha256:16 971ca5b859b51b3f · verify