Portkey provides a robust and secure gateway to integrate various Large Language Models (LLMs) into applications, including Anthropic’s Claude APIs. With Portkey, take advantage of features like fast AI gateway access, observability, prompt management, and more, while securely managing API keys through Model Catalog.Documentation Index
Fetch the complete documentation index at: https://portkey-docs-fix-cache-hit-elaborate.mintlify.app/llms.txt
Use this file to discover all available pages before exploring further.
All Models
Full support for all Claude models including Sonnet and Haiku 4-5
All Endpoints
/messages, count-tokens and more fully supportedMulti-Provider Support
Use Claude from Anthropic, Bedrock, and Vertex with native SDK support
Quick Start
Get Anthropic working in 3 steps:Tip: You can also set
provider="@anthropic" in Portkey() and use just model="claude-sonnet-4-5-20250929" in the request.max_tokensis required - Always specify this parameter- System prompts - Handled differently (see System Prompts section below)
- Model naming - Use full model names like
claude-sonnet-4-5-20250929
Add Provider in Model Catalog
- Go to Model Catalog → Add Provider
- Select Anthropic
- Choose existing credentials or create new by entering your Anthropic API key
- Name your provider (e.g.,
anthropic-prod)
Complete Setup Guide →
See all setup options, code examples, and detailed instructions
Basic Usage
Chat Completions
System Prompts
Anthropic handles system prompts differently than OpenAI. With Portkey, you can use the OpenAI-compatible format:Streaming
Streaming works the same as OpenAI:Catch Overloaded Error on Stream
Anthropic’s API can return anoverloaded_error inside a streaming response with HTTP status 200. The error appears as an SSE event:
529 responses, allowing your retry and fallback strategies to trigger automatically.
This feature is only available for the Anthropic provider. Other providers (e.g., Bedrock) handle overload errors at the HTTP level, where existing retry/fallback already applies. It also only applies to streaming requests — non-streaming Anthropic requests already return HTTP 529 directly.
How it works
When enabled on an integration, the gateway:- Reads the first chunk of the Anthropic streaming response before committing it to the client
- Skips any keepalive ping events
- If the first meaningful event is an
overloaded_error, returns an HTTP529response instead of the stream - If the first event is normal content, continues streaming as usual with no data loss
overloaded_error is found, the request fails as a normal request with error 529.
The 529 response integrates with the gateway’s existing error handling and supports all existing config combinations:
- Retry: Triggers automatically when retry is configured
- Fallback: Moves to the next target in a fallback strategy
- Circuit breaker: Counts as a failure for circuit breaker thresholds
Performance: There is zero overhead when the setting is disabled. When enabled, only the first event is inspected before the stream is committed.
How to enable
Enable the flag on your Anthropic integration
Go to Model Catalog → Integrations → Anthropic and enable the Catch Overloaded Error on Stream flag, then create or update the integration.
Add 529 to your retry status codes
In your config, add
529 to the retry on_status_codes (or fallback on_status_codes). This supports all existing config combinations.Example: Fallback on overload
With a fallback config using two Anthropic integrations (both with Catch Overloaded Error on Stream enabled), if the primary returns an overloaded error during streaming, the gateway automatically retries with the backup:Error response
When an overloaded error is detected, the client receives:Advanced Features
Vision (Multimodal)
Portkey supports Anthropic’s vision models includingclaude-sonnet-4-5-20250929, claude-3-5-sonnet, claude-3-haiku, claude-3-opus, and claude-3.7-sonnet. Use the same format as OpenAI:
Anthropic only accepts base64-encoded images and does not support image URLs. Use the same base64 format to send images to both Anthropic and OpenAI models.
To prompt with PDFs, update the
url field to: data:application/pdf;base64,BASE64_PDF_DATAPDF Support
Anthropic Claude processes PDFs to extract text, analyze charts, and understand visual content. PDF support is available on:- Claude 3.7 Sonnet (
claude-3-7-sonnet-20250219) - Claude 3.5 Sonnet (
claude-3-5-sonnet-20241022,claude-3-5-sonnet-20240620) - Claude Sonnet 4-5 (
claude-sonnet-4-5-20250929) - Claude 3.5 Haiku (
claude-3-5-haiku-20241022)
- Maximum request size: 32MB
- Maximum pages per request: 100
- Format: Standard PDF (no passwords/encryption)
Extended Thinking (Reasoning Models)
Models likeclaude-3-7-sonnet-latest support extended thinking. Get the model’s reasoning as it processes the request.
The assistant’s thinking response is returned in the
response_chunk.choices[0].delta.content_blocks array, not the response.choices[0].message.content string.strict_open_ai_compliance=False to use this feature:
Using /messages Route
Portkey supports Anthropic’s/messages endpoint, allowing you to use either Anthropic’s native SDK or Portkey’s SDK with full gateway features.
Using Anthropic’s Native SDK
Using Portkey’s SDK
cURL
You can use all Portkey features (like caching, observability, configs) with this route. Just add the
x-portkey-config, x-portkey-provider, x-portkey-... headers.Prompt Caching
Portkey works with Anthropic’s prompt caching feature to save time and money. Refer to this guide:Prompt Caching
Learn how to enable prompt caching for Anthropic requests
Structured Outputs
Ensure that the model always follows your supplied JSON schema with Portkey’s structured outputs support.Structured Outputs
Learn how to use Pydantic, Zod, or JSON schema for structured data from Anthropic
Web Search
Anthropic Claude models support web search as a tool, allowing the model to search the web for up-to-date information.Set
strict_open_ai_compliance to false (or use the header x-portkey-strict-open-ai-compliance: false) to receive citations in the response.Files API
Portkey supports Anthropic’s Files API (beta), enabling you to upload, list, retrieve, and delete files through the gateway. Uploaded files can be referenced in chat completions usingfile_id instead of re-uploading content each request.
Files API
Upload, list, retrieve, and delete files — then use them in chat completions
Service Tier
When routing Chat Completions requests to Anthropic, Portkey automatically translates OpenAI’sservice_tier parameter to Anthropic’s native speed parameter:
service_tier | Anthropic speed |
|---|---|
auto | fast |
standard_only | standard |
default | standard |
fast | fast |
standard | standard |
| unknown value | omitted |
Beta Features
Portkey supports Anthropic’s beta features through headers. Pass the beta feature name as the value:Managing Anthropic Prompts
Manage all prompt templates to Anthropic in the Prompt Library. All current Anthropic models are supported, and you can easily test different prompts. Use theportkey.prompts.completions.create interface to use the prompt in an application.
Next Steps
Add Metadata
Add metadata to your Anthropic requests
Gateway Configs
Add gateway configs to your Anthropic requests
Tracing
Trace your Anthropic requests
Fallbacks
Setup fallback from OpenAI to Anthropic
SDK Reference
Complete Portkey SDK documentation

