LangChain / LangGraph
The langchain-doubleword package lets you use Doubleword as the model backend for any LangChain or LangGraph application. You get four drop-in classes — real-time and batched variants of chat and embeddings — that slot into agents, chains, graphs, retrievers, or anywhere else a LangChain BaseChatModel or Embeddings is accepted.
The batched variants are the headline feature. They let a LangGraph workflow fan out dozens or hundreds of parallel LLM calls and have them transparently collected into a single Doubleword batch submission — up to 90% lower per-token cost, no changes to your graph code.
Install
```bash
pip install langchain-doubleword
```

Authenticate
Three options; pick whichever fits. Credentials resolve in this order:
- Pass the key directly — `ChatDoubleword(model=..., api_key="{{apiKey}}")`. Use for quick scripts and tests.
- The `DOUBLEWORD_API_KEY` environment variable — the right choice for production, CI, and containers.
- The `dw` CLI — run `dw login` once and every script on the machine picks up your active account's inference key from `~/.dw/`. The smoothest option for local development.
No separate config is needed — whichever class you instantiate finds its credentials automatically.
Real-time chat
Interactive calls go through Doubleword's OpenAI-compatible chat endpoint. This is what you want for a chatbot, an interactive agent, or anything latency-sensitive.
```python
from langchain_doubleword import ChatDoubleword

llm = ChatDoubleword(
    model="{{selectedModel.id}}",
    api_key="{{apiKey}}",
)

print(llm.invoke("Explain bismuth in three sentences.").content)
```

It's a standard `BaseChatModel`, so `bind_tools`, `with_structured_output`, streaming, and LangSmith tracing all work unchanged.
Tool calling
ChatDoubleword supports tool calling via bind_tools. Define tools with @tool, bind them, and let the model decide when to use them:
```python
from langchain_core.messages import HumanMessage, ToolMessage
from langchain_core.tools import tool
from langchain_doubleword import ChatDoubleword


@tool
def calculator(expression: str) -> str:
    """Evaluate a basic arithmetic expression."""
    return str(eval(expression, {"__builtins__": {}}, {}))


llm = ChatDoubleword(
    model="{{selectedModel.id}}",
    api_key="{{apiKey}}",
)
bound = llm.bind_tools([calculator])

messages = [HumanMessage(content="What is 137 * 49?")]
response = bound.invoke(messages)

# If the model called a tool, execute it and feed the result back
if response.tool_calls:
    messages.append(response)
    for tc in response.tool_calls:
        result = calculator.invoke(tc["args"])
        messages.append(ToolMessage(content=str(result), tool_call_id=tc["id"]))
    response = bound.invoke(messages)

print(response.content)
```

This pattern — invoke, check for tool calls, execute, feed back — is what LangGraph automates with its conditional edges. See the repo's `examples/langgraph-basic/` for the full graph version.
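The branch decision LangGraph makes at a conditional edge boils down to a small router function. A minimal sketch — the state shape and node names here are illustrative, not prescribed by the package:

```python
def route_after_model(state: dict) -> str:
    """Decide the next node after the model runs: if the last message
    carries tool calls, go execute them; otherwise we're done."""
    last = state["messages"][-1]
    if getattr(last, "tool_calls", None):
        return "tools"
    return "end"
```

In a graph this would be wired with something like `graph.add_conditional_edges("model", route_after_model, {"tools": "tools", "end": END})`, so the invoke/execute/feed-back loop becomes an edge cycle instead of hand-written control flow.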
Batched chat
For bulk work or agents that fan out, swap ChatDoubleword for ChatDoublewordBatch. The interface is identical — the only user-visible difference is it's async-only.
```python
import asyncio

from langchain_doubleword import ChatDoublewordBatch

llm = ChatDoublewordBatch(
    model="{{selectedModel.id}}",
    api_key="{{apiKey}}",
)


async def main():
    results = await asyncio.gather(*[
        llm.ainvoke(f"Summarise chapter {i}") for i in range(50)
    ])
    for r in results:
        print(r.content)


asyncio.run(main())
```

Those 50 concurrent `ainvoke` calls get collected into one batch submission, priced at Doubleword's batch tier (up to 90% less than real-time). The trade-off is a short wait on the first call while the batching window fills, tunable via `batch_window_seconds`.
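To make the batching window concrete, here is a toy sketch of the idea: requests arriving within the window are held, then handled as one submission. This is an illustration only, not the autobatcher's actual implementation:

```python
import asyncio


class ToyBatcher:
    """Collect requests that arrive within a short window, then
    process them together as one batch. Illustrative only."""

    def __init__(self, window_seconds: float):
        self.window_seconds = window_seconds
        self._pending = []   # (prompt, future) pairs awaiting the flush
        self._flusher = None  # the pending flush task, if any

    async def submit(self, prompt: str) -> str:
        fut = asyncio.get_running_loop().create_future()
        self._pending.append((prompt, fut))
        # The first request in a window starts the flush timer.
        if self._flusher is None:
            self._flusher = asyncio.create_task(self._flush_after_window())
        return await fut

    async def _flush_after_window(self):
        await asyncio.sleep(self.window_seconds)
        batch, self._pending, self._flusher = self._pending, [], None
        # One "submission" covers the whole batch; here we just echo.
        for prompt, fut in batch:
            fut.set_result(f"result for {prompt!r} (batch of {len(batch)})")


async def demo():
    batcher = ToyBatcher(window_seconds=0.05)
    return await asyncio.gather(*[batcher.submit(f"p{i}") for i in range(3)])
```

All three `submit` calls land inside the same 50 ms window, so they flush as a single batch of 3 — the same shape as fifty `ainvoke` calls landing in one `batch_window_seconds` window.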
Pick ChatDoublewordBatch when:
- Your LangGraph workflow runs parallel branches or `Send` fan-out and you want batch pricing without rewriting the graph.
- The model you need is only exposed via Doubleword's batch API (some of the larger Doubleword-hosted models fall into this category).
- You're embedding or processing a large corpus offline.
For embeddings, the same pattern applies: DoublewordEmbeddings is real-time, DoublewordEmbeddingsBatch is transparently batched.
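Both embeddings classes return vectors as plain lists of floats (the standard LangChain `Embeddings` interface), so downstream similarity scoring is ordinary math. A minimal cosine-similarity sketch:

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm
```

You'd feed it vectors from, say, `DoublewordEmbeddings(...).embed_query(text)` — though in practice a LangChain vector store usually does this comparison for you.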
Inside a LangGraph node
Both chat classes slot straight into LangGraph. This is the shape you'll use most:
```python
from langgraph.graph import StateGraph, END

from langchain_doubleword import ChatDoublewordBatch

llm = ChatDoublewordBatch(
    model="{{selectedModel.id}}",
    api_key="{{apiKey}}",
    completion_window="1h",    # faster turnaround than the 24h default
    batch_window_seconds=2.5,  # don't make callers wait the full 10s
)


async def call_model(state):
    return {"messages": [await llm.ainvoke(state["messages"])]}


graph = StateGraph(dict)
graph.add_node("model", call_model)
graph.set_entry_point("model")
graph.add_edge("model", END)
app = graph.compile()
```

When several model nodes execute in parallel — via `Send`, conditional fan-out, or concurrent `ainvoke` calls from your own code — all their requests hit the same autobatcher window and get bundled together.
Try it end-to-end
Two sets of examples live in the repo:
- `examples/langgraph-basic/` — a minimal LangGraph agent with a calculator tool, in both real-time and batched variants. The cleanest place to see tool calling in action.
- `examples/async-agents-langgraph/notebook.ipynb` — a full multi-agent research workflow that fans out sub-agents with Serper + Jina search-and-scrape, all through `ChatDoublewordBatch`.
Further reading
- Package README on GitHub — full API reference, configuration table, and smoke-test scripts.
- `autobatcher` — the library powering the batched variants. Standalone, usable without LangChain.
- Doubleword batch API docs — pricing, completion windows, and the raw endpoints the batched variants target.