AWS Bedrock
Configuration
credentials:
- name: "bedrock_us_east"
type: "bedrock"
api_key: "os.environ/AWS_BEDROCK_API_KEY"
base_url: "https://bedrock-runtime.us-east-1.amazonaws.com"
rpm: 60
tpm: 100000
Required Fields
| Field | Description |
|---|---|
api_key |
AWS Bedrock API key / token (supports os.environ/VAR_NAME) |
base_url |
Bedrock Runtime endpoint URL (region-specific) |
The api_key is sent as Authorization: Bearer <api_key> to the Bedrock Runtime API.
Common Base URLs by Region
| Region | Base URL |
|---|---|
| us-east-1 | https://bedrock-runtime.us-east-1.amazonaws.com |
| us-west-2 | https://bedrock-runtime.us-west-2.amazonaws.com |
| eu-central-1 | https://bedrock-runtime.eu-central-1.amazonaws.com |
| ap-northeast-1 | https://bedrock-runtime.ap-northeast-1.amazonaws.com |
Model IDs
Use AWS Bedrock model IDs in the model field. Cross-region inference profiles are supported:
us.anthropic.claude-sonnet-4-20250514-v1:0
us.anthropic.claude-3-5-sonnet-20241022-v2:0
us.anthropic.claude-3-haiku-20240307-v1:0
anthropic.claude-3-5-haiku-20241022-v1:0
OpenAI-Compatible API
The router accepts requests in OpenAI Chat Completion format and automatically converts them to AWS Bedrock InvokeModel format. The conversion pipeline:
- OpenAI → Anthropic Messages API format (full parameter conversion)
- Bedrock-specific adjustments: remove
model(embedded in URL), removestream, addanthropic_version: "bedrock-2023-05-31" - Response: Bedrock (Anthropic) → OpenAI format
Request URL Construction
| Mode | URL |
|---|---|
| Non-streaming | {base_url}/model/{model_id}/invoke |
| Streaming | {base_url}/model/{model_id}/invoke-with-response-stream |
Supported Parameters
Same as the Anthropic provider:
| OpenAI Parameter | Notes |
|---|---|
temperature |
Set to 1.0 automatically when thinking is enabled |
top_p |
|
max_tokens |
Defaults to 4096 if not set |
max_completion_tokens |
Fallback if max_tokens is empty |
stop |
Accepts string or array |
stream |
|
tools |
Full conversion (see Tool Calling) |
tool_choice |
|
user |
Mapped to metadata.user_id |
reasoning_effort |
See Extended Thinking |
extra_body Parameters
| Parameter | Description |
|---|---|
extra_body.top_k |
Top-K sampling |
extra_body.thinking |
Direct thinking config ({"type": "enabled", "budget_tokens": 15000}) |
Unsupported Parameters
n, frequency_penalty, presence_penalty, seed, response_format, logprobs, top_logprobs, modalities, service_tier, store, parallel_tool_calls, prediction
Embeddings and image generation are not supported by this provider.
Message Conversion
Same as Anthropic provider.
Tool Calling
Same as Anthropic provider. Supported tool types:
| OpenAI Tool Type | Anthropic (Bedrock) Type |
|---|---|
function |
Standard function tool |
computer_use |
computer_20241022 |
text_editor |
text_editor_20241022 |
bash |
bash_20241022 |
web_search / web_search_preview |
web_search_20250305 |
Extended Thinking
Supported for reasoning-capable models (e.g. claude-sonnet-4). Same interface as Anthropic provider.
response = client.chat.completions.create(
model="us.anthropic.claude-sonnet-4-20250514-v1:0",
messages=[{"role": "user", "content": "Solve this step by step..."}],
reasoning_effort="high", # minimal, low, medium, high
)
| reasoning_effort | budget_tokens |
|---|---|
minimal |
1,000 |
low |
5,000 |
medium |
15,000 |
high |
30,000 |
none / disable |
Disabled |
When thinking is enabled,
temperatureis automatically set to 1.0.Thinking content is returned in the
reasoning_contentfield of the response message.
Content Types
| Content Type | Support | Notes |
|---|---|---|
| Text | Full | String or {"type": "text"} blocks |
| Image (base64) | Full | data:image/...;base64,... → Anthropic base64 source |
| Image (URL) | Full | HTTP/HTTPS URLs → Anthropic URL source |
| Document (base64) | Full | application/* and text/* MIME types only |
| Audio | Placeholder | Replaced with [Audio input: <format> format - not supported by Anthropic API] |
| Video | Placeholder | Replaced with [Video: <url>] |
Streaming
AWS Bedrock uses a binary Event Stream framing protocol instead of SSE. The router decodes it transparently:
- Binary AWS Event Stream frames decoded on the fly
- Each frame payload (base64-encoded Anthropic event JSON) converted to SSE
- Anthropic SSE events converted to OpenAI streaming format
The client receives standard OpenAI-compatible SSE:
stream = client.chat.completions.create(
model="us.anthropic.claude-3-5-sonnet-20241022-v2:0",
messages=[{"role": "user", "content": "Hello"}],
stream=True,
)
for chunk in stream:
if chunk.choices[0].delta.content:
print(chunk.choices[0].delta.content, end="")
Finish Reasons
| Bedrock/Anthropic stop_reason | OpenAI finish_reason |
|---|---|
end_turn |
stop |
max_tokens |
length |
tool_use |
tool_calls |
stop_sequence |
stop |
Token Counting
| Anthropic Field | OpenAI Mapping |
|---|---|
input_tokens |
prompt_tokens |
output_tokens |
completion_tokens |
cache_read_input_tokens |
prompt_tokens_details.cached_tokens |
cache_creation_input_tokens |
Tracked internally for billing |