doubleword: Submit and manage asynchronous batch AI inference jobs via the Doubleword API, supporting OpenAI-compatible endpoints, tool calling, and structured outputs.
Install via ClawdBot CLI:
clawdbot install pjb157/doubleword

Process multiple AI inference requests asynchronously using the Doubleword batch API with high throughput and low cost.
Before submitting batches, you need:
Batches are ideal for:
Pricing is per 1 million tokens (input / output):
Qwen3-VL-30B-A3B-Instruct-FP8 (mid-size):
Qwen3-VL-235B-A22B-Instruct-FP8 (flagship):
Cost estimation: Upload files to the Doubleword Console to preview expenses before submitting.
Two ways to submit batches:
Via API:
Via Web Console:
Create a .jsonl file where each line contains a complete, valid JSON object with no line breaks within the object:
{"custom_id": "req-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "anthropic/claude-3-5-sonnet", "messages": [{"role": "user", "content": "What is 2+2?"}]}}
{"custom_id": "req-2", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "anthropic/claude-3-5-sonnet", "messages": [{"role": "user", "content": "What is the capital of France?"}]}}
Required fields per line:
- custom_id: Unique identifier (max 64 chars). Use descriptive IDs like "user-123-question-5" for easier result mapping.
- method: Always "POST"
- url: API endpoint, either "/v1/chat/completions" or "/v1/embeddings"
- body: Standard API request with model and messages

Optional body parameters:

- temperature: 0-2 (default: 1.0)
- max_tokens: Maximum response tokens
- top_p: Nucleus sampling parameter
- stop: Stop sequences
- tools: Tool definitions for tool calling (see Tool Calling section)
- response_format: JSON schema for structured outputs (see Structured Outputs section)

File requirements:
- Unique custom_id values

Common pitfalls:

- Duplicate custom_id values

Helper script:
Use scripts/create_batch_file.py to generate JSONL files programmatically:
python scripts/create_batch_file.py output.jsonl
Modify the script's requests list to generate your specific batch requests.
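The helper script's approach can be sketched as a standalone generator; the request list and model name below are illustrative, not taken from the script itself:

```python
import json

# Illustrative requests; adapt ids, prompts, and model to your batch
requests_list = [
    {"id": "req-1", "prompt": "What is 2+2?"},
    {"id": "req-2", "prompt": "What is the capital of France?"},
]

with open("output.jsonl", "w") as f:
    for item in requests_list:
        record = {
            "custom_id": item["id"],
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": "anthropic/claude-3-5-sonnet",
                "messages": [{"role": "user", "content": item["prompt"]}],
            },
        }
        # json.dumps emits one object per line, with no internal line breaks
        f.write(json.dumps(record) + "\n")
```

Each line of the resulting file is a complete JSON object, matching the format shown above.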
Via API:
curl https://api.doubleword.ai/v1/files \
-H "Authorization: Bearer $DOUBLEWORD_API_KEY" \
-F purpose="batch" \
-F file="@batch_requests.jsonl"
Via Console:
Upload through the Batches section at https://app.doubleword.ai/
The response contains an id field - save this file ID for the next step.
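The same upload can be scripted. A minimal sketch using the third-party requests library, mirroring the curl call above (the function name is illustrative; the network call only runs when executed as a script):

```python
import os
import requests

def upload_batch_file(path: str, api_key: str) -> str:
    """Upload a JSONL batch file and return its file id."""
    with open(path, "rb") as f:
        resp = requests.post(
            "https://api.doubleword.ai/v1/files",
            headers={"Authorization": f"Bearer {api_key}"},
            data={"purpose": "batch"},          # same as -F purpose="batch"
            files={"file": (os.path.basename(path), f)},
        )
    resp.raise_for_status()
    return resp.json()["id"]

if __name__ == "__main__":
    print(upload_batch_file("batch_requests.jsonl",
                            os.environ["DOUBLEWORD_API_KEY"]))
```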
Via API:
curl https://api.doubleword.ai/v1/batches \
-H "Authorization: Bearer $DOUBLEWORD_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"input_file_id": "file-abc123",
"endpoint": "/v1/chat/completions",
"completion_window": "24h"
}'
Via Console:
Configure batch settings in the web interface.
Parameters:
- input_file_id: File ID from the upload step
- endpoint: API endpoint ("/v1/chat/completions" or "/v1/embeddings")
- completion_window: Choose based on urgency and budget:
  - "24h": Best pricing, results within 24 hours (typically faster)
  - "1h": 50% price premium, results within 1 hour (typically faster)

The response contains a batch id - save this for status polling.
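The batch-creation call can be sketched in Python as well (third-party requests library; function name illustrative; parameters as documented above):

```python
import requests

def create_batch(input_file_id: str, api_key: str,
                 completion_window: str = "24h") -> str:
    """Create a batch job from an uploaded file and return its batch id."""
    resp = requests.post(
        "https://api.doubleword.ai/v1/batches",
        headers={"Authorization": f"Bearer {api_key}"},
        json={
            "input_file_id": input_file_id,
            "endpoint": "/v1/chat/completions",
            "completion_window": completion_window,  # "24h" or "1h"
        },
    )
    resp.raise_for_status()
    return resp.json()["id"]
```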
Before submitting, verify:
Via API:
curl https://api.doubleword.ai/v1/batches/batch-xyz789 \
-H "Authorization: Bearer $DOUBLEWORD_API_KEY"
Via Console:
Monitor real-time progress in the Batches dashboard.
Status progression:
- validating - Checking input file format
- in_progress - Processing requests
- completed - All requests finished

Other statuses:

- failed - Batch failed (check error_file_id)
- expired - Batch timed out
- cancelling/cancelled - Batch cancelled

Response includes:

- output_file_id - Download results here
- error_file_id - Failed requests (if any)
- request_counts - Total/completed/failed counts

Polling frequency: Check every 30-60 seconds during processing.
Early access: Results available via output_file_id before batch fully completes - check X-Incomplete header.
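The polling loop described above can be sketched as follows (third-party requests library; function name illustrative; terminal statuses taken from the status list above):

```python
import time
import requests

def wait_for_batch(batch_id: str, api_key: str, interval: int = 60) -> dict:
    """Poll a batch every `interval` seconds until it reaches a terminal status."""
    terminal = {"completed", "failed", "expired", "cancelled"}
    while True:
        resp = requests.get(
            f"https://api.doubleword.ai/v1/batches/{batch_id}",
            headers={"Authorization": f"Bearer {api_key}"},
        )
        resp.raise_for_status()
        batch = resp.json()
        if batch["status"] in terminal:
            return batch
        time.sleep(interval)
```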
Via API:
curl https://api.doubleword.ai/v1/files/file-output123/content \
-H "Authorization: Bearer $DOUBLEWORD_API_KEY" \
> results.jsonl
Via Console:
Download results directly from the Batches dashboard.
Response headers:
- X-Incomplete: true - Batch still processing, more results coming
- X-Last-Line: 45 - Resume point for partial downloads

Output format (each line):
{
  "id": "batch-req-abc",
  "custom_id": "request-1",
  "response": {
    "status_code": 200,
    "body": {
      "id": "chatcmpl-xyz",
      "choices": [{
        "message": {
          "role": "assistant",
          "content": "The answer is 4."
        }
      }]
    }
  }
}
Download errors (if any):
curl https://api.doubleword.ai/v1/files/file-error123/content \
-H "Authorization: Bearer $DOUBLEWORD_API_KEY" \
> errors.jsonl
Error format (each line):
{
  "id": "batch-req-def",
  "custom_id": "request-2",
  "error": {
    "code": "invalid_request",
    "message": "Missing required parameter"
  }
}
Tool calling (function calling) enables models to intelligently select and use external tools. Doubleword maintains full OpenAI compatibility.
Example batch request with tools:
{
  "custom_id": "tool-req-1",
  "method": "POST",
  "url": "/v1/chat/completions",
  "body": {
    "model": "anthropic/claude-3-5-sonnet",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {"type": "string"}
          },
          "required": ["location"]
        }
      }
    }]
  }
}
Use cases:
Structured outputs guarantee that model responses conform to your JSON Schema, eliminating issues with missing fields or invalid enum values.
Example batch request with structured output:
{
  "custom_id": "structured-req-1",
  "method": "POST",
  "url": "/v1/chat/completions",
  "body": {
    "model": "anthropic/claude-3-5-sonnet",
    "messages": [{"role": "user", "content": "Extract key info from: John Doe, 30 years old, lives in NYC"}],
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "person_info",
        "schema": {
          "type": "object",
          "properties": {
            "name": {"type": "string"},
            "age": {"type": "integer"},
            "city": {"type": "string"}
          },
          "required": ["name", "age", "city"]
        }
      }
    }
  }
}
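Once the batch completes, the structured response arrives as a JSON string in the message content. A minimal sketch of consuming it (the content value below is an assumed example of what the schema above would produce):

```python
import json

# Example message.content from a completed structured-output request;
# structured outputs guarantee it conforms to the person_info schema
content = '{"name": "John Doe", "age": 30, "city": "NYC"}'

person = json.loads(content)
# All required fields are present; no missing-key handling needed
print(f"{person['name']}, {person['age']}, {person['city']}")
```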
Benefits:
autobatcher is a Python client that automatically converts individual API calls into batched requests, reducing costs without code changes.
Installation:
pip install autobatcher
How it works:
Key benefit: Significant cost reduction through automatic batching while writing normal async code using the familiar OpenAI interface.
Documentation: https://github.com/doublewordai/autobatcher
Via API:
curl https://api.doubleword.ai/v1/batches?limit=10 \
-H "Authorization: Bearer $DOUBLEWORD_API_KEY"
Via Console:
View all batches in the dashboard.
Via API:
curl https://api.doubleword.ai/v1/batches/batch-xyz789/cancel \
-X POST \
-H "Authorization: Bearer $DOUBLEWORD_API_KEY"
Via Console:
Click cancel in the batch details view.
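The cancellation call can also be scripted; a sketch mirroring the curl command above (third-party requests library; function name illustrative):

```python
import requests

def cancel_batch(batch_id: str, api_key: str) -> dict:
    """Request cancellation of an in-progress batch and return its record."""
    resp = requests.post(
        f"https://api.doubleword.ai/v1/batches/{batch_id}/cancel",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    resp.raise_for_status()
    return resp.json()  # status moves to cancelling, then cancelled
```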
Notes:
Parse JSONL output line-by-line:
import json

with open('results.jsonl') as f:
    for line in f:
        result = json.loads(line)
        custom_id = result['custom_id']
        content = result['response']['body']['choices'][0]['message']['content']
        print(f"{custom_id}: {content}")
Check for incomplete batches and resume:
import requests

response = requests.get(
    'https://api.doubleword.ai/v1/files/file-output123/content',
    headers={'Authorization': f'Bearer {api_key}'}
)
if response.headers.get('X-Incomplete') == 'true':
    last_line = int(response.headers.get('X-Last-Line', 0))
    print(f"Batch incomplete. Processed {last_line} requests so far.")
    # Continue polling and download again later
Extract failed requests from error file and resubmit:
import json

failed_ids = []
with open('errors.jsonl') as f:
    for line in f:
        error = json.loads(line)
        failed_ids.append(error['custom_id'])

print(f"Failed requests: {failed_ids}")
# Create new batch with only failed requests
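That last step can be sketched as a small filter over the original request lines (helper name and file names are illustrative):

```python
import json

def build_retry_lines(original_lines, failed_ids):
    """Keep only the original request lines whose custom_id failed."""
    failed = set(failed_ids)
    return [line for line in original_lines
            if json.loads(line)["custom_id"] in failed]

# Usage sketch: write the surviving lines to a new JSONL file,
# then upload and submit it like any other batch:
# with open("batch_requests.jsonl") as f:
#     retry = build_retry_lines(f, failed_ids)
# with open("retry.jsonl", "w") as f:
#     f.writelines(retry)
```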
Handle tool call responses:
import json

with open('results.jsonl') as f:
    for line in f:
        result = json.loads(line)
        message = result['response']['body']['choices'][0]['message']
        if message.get('tool_calls'):
            for tool_call in message['tool_calls']:
                print(f"Tool: {tool_call['function']['name']}")
                print(f"Args: {tool_call['function']['arguments']}")
Best practices:

- Use descriptive custom_id values like "user-123-question-5" or "dataset-A-row-42", not opaque ones like "1" or "req1"
- custom_id must be unique within the batch
- Prefer 24h for cost savings (50-83% cheaper); use 1h only when time-sensitive
- Check error_file_id and retry failed requests
- Track progress via the completed/total ratio

For complete API details, see:
- references/api_reference.md - Full endpoint documentation and schemas
- references/getting_started.md - Detailed setup and account management
- references/pricing.md - Model costs and SLA comparison