Anthropic API Compatibility | Doubleword Inference API

Doubleword exposes an Anthropic-compatible Messages API. Point the Anthropic SDK (or any Anthropic client) at the gateway by changing the base URL, and your existing /v1/messages code works unchanged. Requests are translated for Doubleword's routing layer, and responses are returned in Anthropic shape.

Base URL and authentication

Setting	Value
Base URL	`https://api.doubleword.ai`
Auth	Bearer token (your Doubleword API key)

The Anthropic SDK appends /v1/messages to the base URL, so set the base URL to the gateway root (no /v1). Doubleword authenticates with a Bearer token, so pass your key as auth_token (which sends Authorization: Bearer) rather than api_key.

Note

Use auth_token=..., not api_key=.... The api_key argument makes the SDK send an x-api-key header; Doubleword authenticates on Authorization: Bearer.
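For clients that do not go through the Anthropic SDK, the same two rules (gateway root plus /v1/messages, and Bearer authentication) can be applied by hand. The following is a minimal sketch using only the Python standard library. The helper name build_messages_request is hypothetical and not part of any Doubleword or Anthropic API.

```python
import json
import urllib.request

GATEWAY_ROOT = "https://api.doubleword.ai"  # gateway root, no /v1 suffix


def build_messages_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST to the gateway's /v1/messages endpoint.

    Hypothetical helper, for illustration only.
    """
    return urllib.request.Request(
        url=f"{GATEWAY_ROOT}/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            # Doubleword authenticates on Authorization: Bearer, not x-api-key.
            "Authorization": f"Bearer {api_key}",
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
    )
```

Passing the returned object to urllib.request.urlopen sends the request; the response body should then be an Anthropic-shaped message, as described above.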

Messages

from anthropic import Anthropic

client = Anthropic(
    base_url="https://api.doubleword.ai",
    auth_token="{{apiKey}}",
)

message = client.messages.create(
    model="{{selectedModel.id}}",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello, world"}],
)
print(message.content[0].text)

curl https://api.doubleword.ai/v1/messages \
  -H "Authorization: Bearer {{apiKey}}" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "{{selectedModel.id}}",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello, world"}]
  }'

Streaming

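In Python, use the SDK's client.messages.stream helper (or pass stream=True to client.messages.create). Over raw HTTP, set "stream": true in the request body. Either way, consume the Anthropic SSE event sequence as usual: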

with client.messages.stream(
    model="{{selectedModel.id}}",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a haiku about caching"}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

curl https://api.doubleword.ai/v1/messages \
  -H "Authorization: Bearer {{apiKey}}" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "{{selectedModel.id}}",
    "max_tokens": 1024,
    "stream": true,
    "messages": [{"role": "user", "content": "Write a haiku about caching"}]
  }'
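When consuming the raw SSE response without the SDK, the text arrives in content_block_delta events whose delta has type text_delta. A minimal sketch of extracting the text, assuming the stream has already been split into lines; the function name collect_text and the sample events are illustrative and hand-written, not captured gateway output.

```python
import json
from typing import Iterable


def collect_text(sse_lines: Iterable[str]) -> str:
    """Concatenate the text deltas from an Anthropic-style SSE stream."""
    parts = []
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip "event:" lines and blank separators
        event = json.loads(line[len("data:"):])
        if event.get("type") != "content_block_delta":
            continue  # message_start, content_block_stop, message_stop, ...
        delta = event.get("delta", {})
        if delta.get("type") == "text_delta":
            parts.append(delta.get("text", ""))
    return "".join(parts)


# Hand-written sample in the Anthropic event format (illustrative only).
sample = [
    "event: message_start",
    'data: {"type": "message_start", "message": {"role": "assistant"}}',
    "",
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Cache "}}',
    "",
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "hit"}}',
    "",
    "event: message_stop",
    'data: {"type": "message_stop"}',
]

print(collect_text(sample))
```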

Flex tier

The flex tier runs your request asynchronously (queued to the inference daemon, ~1 hour SLA) at batch pricing, while keeping the request/response shape of a normal Messages call - your code still blocks on a single call and gets one response back.

Anthropic's own service_tier values (auto, standard_only) describe priority-vs-standard and have no effect on Doubleword. To opt into the flex tier, send the Doubleword-specific value service_tier: "flex". Depending on its version, the Anthropic SDK either has no typed service_tier argument or types it to accept only Anthropic's own values, so pass "flex" via extra_body:

message = client.messages.create(
    model="{{selectedModel.id}}",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Summarise this transcript..."}],
    extra_body={"service_tier": "flex"},
)
print(message.content[0].text)

curl https://api.doubleword.ai/v1/messages \
  -H "Authorization: Bearer {{apiKey}}" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "{{selectedModel.id}}",
    "max_tokens": 1024,
    "service_tier": "flex",
    "messages": [{"role": "user", "content": "Summarise this transcript..."}]
  }'

Tip

Flex trades latency for cost. Use it for work that is not latency-sensitive - bulk summarisation, evals, offline enrichment - where a longer turnaround in exchange for batch pricing is a good trade.
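Because a flex call blocks until the queued request finishes, the client-side timeout needs to cover the whole turnaround, and the SDK's default request timeout may be shorter than that. The sketch below builds keyword arguments for client.messages.create with a longer per-request timeout. The helper name flex_kwargs and the 75-minute value are assumptions chosen for illustration (the ~1 hour SLA plus a margin), not Doubleword recommendations.

```python
# Assumption: the ~1 hour flex SLA plus a safety margin, in seconds.
FLEX_TIMEOUT_SECONDS = 75 * 60


def flex_kwargs(model: str, prompt: str, max_tokens: int = 1024) -> dict:
    """Keyword arguments for client.messages.create on the flex tier.

    Hypothetical helper, for illustration only.
    """
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
        # Doubleword-specific value, passed through untyped.
        "extra_body": {"service_tier": "flex"},
        # Per-request timeout accepted by the Anthropic Python SDK.
        "timeout": FLEX_TIMEOUT_SECONDS,
    }


# Usage with the Anthropic client from the Messages section:
#   message = client.messages.create(**flex_kwargs("{{selectedModel.id}}", "Summarise ..."))
```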

Batch

The native Anthropic batch endpoint (/v1/messages/batches) is not yet available. To run Messages requests as a batch today, use Doubleword's OpenAI-format Batch API with /v1/messages request lines. Each line carries an Anthropic request body; results come back as Anthropic messages.

Use the OpenAI SDK (pointed at the gateway's /v1 base URL) for the file upload and batch lifecycle, and put an Anthropic body in each line:

1. Build the JSONL file. One request per line; url is /v1/messages and body is a normal Anthropic Messages request:

{"custom_id": "req-1", "method": "POST", "url": "/v1/messages", "body": {"model": "{{selectedModel.id}}", "max_tokens": 1024, "messages": [{"role": "user", "content": "Hello"}]}}
{"custom_id": "req-2", "method": "POST", "url": "/v1/messages", "body": {"model": "{{selectedModel.id}}", "max_tokens": 1024, "messages": [{"role": "user", "content": "Goodbye"}]}}

2. Upload, create the batch, and poll:

import time
from openai import OpenAI

client = OpenAI(
    base_url="https://api.doubleword.ai/v1",
    api_key="{{apiKey}}",
)

# Upload the JSONL file; the with-block closes the file handle afterwards
with open("requests.jsonl", "rb") as f:
    batch_file = client.files.create(file=f, purpose="batch")

batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/messages",
    completion_window="24h",
)

# Poll until the batch finishes
while batch.status not in ("completed", "failed", "cancelled", "expired"):
    time.sleep(10)
    batch = client.batches.retrieve(batch.id)
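The loop above polls at a fixed interval with no upper bound on waiting time. A bounded variant with exponential backoff might look like the sketch below. The function name wait_for_batch, the interval values, and the 24-hour default deadline are assumptions; the retrieve argument is any callable that returns an object with a status attribute, such as client.batches.retrieve.

```python
import time

TERMINAL_STATUSES = {"completed", "failed", "cancelled", "expired"}


def wait_for_batch(retrieve, batch_id, interval=10.0, max_interval=120.0,
                   timeout=24 * 3600, sleep=time.sleep):
    """Poll until the batch reaches a terminal status or the deadline passes.

    Hypothetical helper, for illustration only.
    """
    deadline = time.monotonic() + timeout
    while True:
        batch = retrieve(batch_id)
        if batch.status in TERMINAL_STATUSES:
            return batch
        if time.monotonic() >= deadline:
            raise TimeoutError(f"batch {batch_id} still {batch.status!r} after {timeout}s")
        sleep(interval)
        # Back off: double the wait each round, up to max_interval.
        interval = min(interval * 2, max_interval)


# Usage with the OpenAI client above:
#   batch = wait_for_batch(client.batches.retrieve, batch.id)
```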

3. Read the results. The output file is JSONL keyed by custom_id; each response.body is an Anthropic message:

import json

output = client.files.content(batch.output_file_id).text
for line in output.splitlines():
    row = json.loads(line)
    custom_id = row["custom_id"]
    message = row["response"]["body"]          # Anthropic message shape
    print(custom_id, message["content"][0]["text"])

Note

The endpoint argument to batches.create and the url on every request line must both be /v1/messages. Results are unordered: always match on custom_id, never by position.
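Matching on custom_id can be sketched as a dictionary lookup followed by a pass over the original request IDs. The helper names are illustrative, and treating a missing row or an empty response body as None is an assumption about how failed requests appear in the output file.

```python
import json


def index_results(output_text: str) -> dict:
    """Map custom_id -> result row from the batch output JSONL."""
    results = {}
    for line in output_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        row = json.loads(line)
        results[row["custom_id"]] = row
    return results


def texts_in_request_order(request_ids: list, results: dict) -> list:
    """Return (custom_id, text) pairs in the order the requests were written.

    Assumption: a missing row or empty body is reported as None.
    """
    ordered = []
    for custom_id in request_ids:
        row = results.get(custom_id) or {}
        body = (row.get("response") or {}).get("body") or {}
        blocks = body.get("content") or []
        text = "".join(b.get("text", "") for b in blocks if b.get("type") == "text")
        ordered.append((custom_id, text if blocks else None))
    return ordered
```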

Models

GET /v1/models returns the model list in Anthropic shape when called with an anthropic-version header (the Anthropic SDK sends it automatically):

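# client is the Anthropic client from the Messages section,
# not the OpenAI client used for batches
for model in client.models.list():
    print(model.id)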

curl https://api.doubleword.ai/v1/models \
  -H "Authorization: Bearer {{apiKey}}" \
  -H "anthropic-version: 2023-06-01"

Notes and limitations

service_tier: only flex changes behaviour (it selects Doubleword's flex tier). auto and standard_only are accepted but have no effect.
Native batch endpoint: POST /v1/messages/batches and the Anthropic SDK's client.messages.batches.* helpers are not yet supported - use the OpenAI-format Batch API above.
Extended thinking: an Anthropic thinking config is mapped to the equivalent reasoning effort on the underlying model.
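As an illustration of the extended-thinking note, the sketch below builds a Messages request body with an Anthropic thinking config and checks the constraints Anthropic documents for it (a budget of at least 1024 tokens, and less than max_tokens). The helper name thinking_request and the default budgets are assumptions; how Doubleword maps a given budget to a reasoning effort is not specified here.

```python
def thinking_request(model: str, prompt: str,
                     budget_tokens: int = 2048, max_tokens: int = 4096) -> dict:
    """Build a /v1/messages body with an Anthropic thinking config.

    Hypothetical helper, for illustration only.
    """
    if budget_tokens < 1024:
        raise ValueError("budget_tokens must be at least 1024")
    if budget_tokens >= max_tokens:
        raise ValueError("budget_tokens must be less than max_tokens")
    return {
        "model": model,
        "max_tokens": max_tokens,
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
        "messages": [{"role": "user", "content": prompt}],
    }


# Usage with the Anthropic client from the Messages section:
#   message = client.messages.create(**thinking_request("{{selectedModel.id}}", "Plan a cache"))
```

With thinking enabled, the first content block of the response is not necessarily a text block, so it may be safer to select blocks by type than to read content[0].text.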