Authentication
Windfall supports two authentication methods. Both give access to the same API surface.
Bearer Token

Include your API key in the Authorization header. Keys start with wf_.

HTTP Header
Authorization: Bearer wf_your_api_key
Creating Keys

Create API keys via the Console or programmatically:

POST /api/keys
Field   Type     Description
label   string   Human-readable label for the key
wallet  string?  Optional EVM wallet address for ERC-8004 identity tier
Identity Tiers

Your tier determines free request allowance and rate limits:

Tier       Free Requests  How to Qualify
anonymous  25             Default tier, no verification required
erc8004    100            Link an ERC-8004 registered agent wallet or basename
verified   250            KYC or Stripe-verified account

After exhausting free requests, add balance via Top Up (Stripe or USDC/ETH on Base).

SIWE Wallet Sessions

Use Sign-In with Ethereum (SIWE) for wallet-based auth. Send a signed message to /v1/auth/siwe to receive a session token. Wallets with a basename (yourname.base.eth) or ERC-8004 registration get 100 free requests.
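As a rough illustration, SIWE messages normally follow the EIP-4361 plain-text layout. The sketch below only builds that text. The domain, URI, statement, and the choice of Base's chain ID (8453) are assumptions, and build_siwe_message is a hypothetical helper. The request body that /v1/auth/siwe expects is not documented here, so signing and submission are left out.

```python
def build_siwe_message(address, nonce, issued_at,
                       domain="windfallrouter.xyz",        # assumed domain
                       uri="https://windfallrouter.xyz",   # assumed URI
                       chain_id=8453,                      # Base mainnet (assumed)
                       statement="Sign in to Windfall"):   # assumed statement
    """Build an EIP-4361 message for a wallet to sign."""
    return (
        f"{domain} wants you to sign in with your Ethereum account:\n"
        f"{address}\n"
        "\n"
        f"{statement}\n"
        "\n"
        f"URI: {uri}\n"
        "Version: 1\n"
        f"Chain ID: {chain_id}\n"
        f"Nonce: {nonce}\n"
        f"Issued At: {issued_at}"
    )


message = build_siwe_message(
    address="0x0000000000000000000000000000000000000000",  # placeholder address
    nonce="a1b2c3d4e5f6",                                  # placeholder nonce
    issued_at="2025-01-01T00:00:00Z",
)
print(message)
```

In practice the nonce usually comes from the server, and the wallet signs the exact message text, so the string must not be altered between signing and submission.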

Chat Completions
OpenAI-compatible inference endpoint that works as a drop-in replacement for chat completions with any OpenAI SDK. Add the mode field for energy-aware routing.
POST /v1/chat/completions
Request Body
Field              Type           Description
model              string         Model identifier, e.g. deepseek-v3, meta-llama/llama-3.1-70b-instruct
messages           array          Array of message objects with role and content
mode               string         Routing mode: "greenest", "balanced", or "cheapest". Default: "greenest"
temperature        number         Sampling temperature (0-2). Optional.
max_tokens         integer        Maximum tokens to generate. Optional.
stream             boolean        Enable SSE streaming. Optional.
tools              array          Tool definitions for function calling. Forwarded to upstream provider.
tool_choice        string/object  Control tool selection: "auto", "none", "required", or specific tool. Optional.
response_format    object         Force output format: {"type": "json_object"} for JSON mode. Optional.
top_p              number         Nucleus sampling parameter (0-1). Optional.
frequency_penalty  number         Penalize repeated tokens (-2 to 2). Optional.
presence_penalty   number         Penalize tokens already present (-2 to 2). Optional.
stop               string/array   Stop sequences. Optional.
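The fields above can be assembled and sanity-checked on the client before sending. This is a minimal sketch: build_chat_request is a hypothetical helper, and the client-side range checks simply mirror the ranges listed in the table. Whether the server rejects out-of-range values in the same way is an assumption.

```python
ROUTING_MODES = {"greenest", "balanced", "cheapest"}

# Documented numeric ranges, taken from the request body table.
RANGES = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "frequency_penalty": (-2, 2),
    "presence_penalty": (-2, 2),
}


def build_chat_request(model, messages, mode="greenest", **options):
    """Assemble a /v1/chat/completions body, checking documented ranges."""
    if mode not in ROUTING_MODES:
        raise ValueError(f"mode must be one of {sorted(ROUTING_MODES)}")
    if not messages:
        raise ValueError("messages must contain at least one message")
    body = {"model": model, "messages": messages, "mode": mode}
    for name, value in options.items():
        if value is None:
            continue  # treat None as "not set" and omit the field
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name} must be between {low} and {high}")
        body[name] = value
    return body


body = build_chat_request(
    "deepseek-v3",
    [{"role": "user", "content": "Hello"}],
    mode="balanced",
    temperature=0.2,
)
print(body)
```

The resulting dictionary can be passed as the JSON body of a POST to /v1/chat/completions.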
Response — Windfall Extension

Every response includes a windfall object with verified energy data:

Field                         Type     Description
windfall.verifiedProvider     string   Provider that served the request (e.g. "Crusoe", "Together")
windfall.carbonIntensityGCO2  number   Carbon intensity at the provider's grid, in gCO2/kWh
windfall.carbonDelta          number   Grams CO2 saved vs US grid average (420g)
windfall.carbonMethodology    string   "provider-verified" or "grid-average"
windfall.energyPricePerKwh    number   Wholesale energy price at the provider's zone
windfall.curtailmentActive    boolean  Whether curtailment (negative pricing) was active
windfall.provider             string   Upstream inference provider name
windfall.providerTier         string   Provider tier: "standard" or "premium"
windfall.providerVerified     boolean  Whether provider's location is independently verified
windfall.costUsd              number   Total cost charged for this request
windfall.costBreakdown        object   Detailed cost: {inputTokens, outputTokens, providerCostUsd, marginPercent, marginUsd, totalCostUsd}
windfall.carbonGrams          number   Estimated carbon emissions for this request (grams)
windfall.carbonBaselineGrams  number   Carbon if routed to US grid average baseline
windfall.cached               boolean  Whether this response was served from cache
windfall.providerZone         string   Grid zone ID of the provider (e.g. "FI", "US-MIDA-PJM")
windfall.energySource         string   Data source for energy information
windfall.routingMode          string   Routing mode used for this request
Compliance Modes

Compliance modes restrict which providers and zones are available. Set them in Console Settings or via the API. Active modes are enforced on every request — they can only be strengthened, never weakened, by per-request headers.

Mode               Effect
eu-data-residency  Only EU/EEA/UK datacenters (FI, SE, NO, IS, DE, RO, GB)
gdpr-strict        EU zones + non-US-owned providers only. Currently: Nebius (NL/FI), DataCrunch (FI)
zdr                Zero data retention: only providers with verified no-logging policy
csrd               Scope 2 + Scope 3 carbon data in every response. No routing restriction.
sci                SCI for AI score (ISO 21031) per request, in gCO₂e per 1000 tokens. No routing restriction.
Set via API
# Save compliance to your key (persists across requests)
curl -X PUT https://windfallrouter.xyz/api/keys/routing \
  -H "Authorization: Bearer $WINDFALL_KEY" \
  -d '{"compliance": ["eu-data-residency", "csrd"]}'

# Per-request header (can only add, not weaken)
curl -X POST https://windfallrouter.xyz/v1/chat/completions \
  -H "X-Windfall-Compliance: csrd,sci" \
  -d '{"model": "deepseek-v3", "messages": [...]}'

When CSRD or SCI modes are active, every response includes a windfall.csrd or windfall.sci object with Scope 2/3 emissions, PUE, energy consumed, and SCI score.
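The "can only add, never weaken" rule can be pictured as a set union of key-level and per-request modes. The sketch below models that on the client. Both function names are hypothetical, and treating the server's behaviour as a plain union is an assumption drawn from the description above.

```python
KNOWN_MODES = {"eu-data-residency", "gdpr-strict", "zdr", "csrd", "sci"}


def effective_compliance(key_modes, request_modes=()):
    """Union of key-level and per-request modes; nothing is ever removed."""
    combined = set(key_modes) | set(request_modes)
    unknown = combined - KNOWN_MODES
    if unknown:
        raise ValueError(f"unknown compliance modes: {sorted(unknown)}")
    return sorted(combined)


def compliance_header(request_modes):
    """Format modes for the X-Windfall-Compliance header (comma-separated)."""
    return {"X-Windfall-Compliance": ",".join(request_modes)}


# Key saved with eu-data-residency + csrd; the request adds csrd + sci.
print(effective_compliance(["eu-data-residency", "csrd"], ["csrd", "sci"]))
print(compliance_header(["csrd", "sci"]))
```

Because the result is a union, leaving a mode out of the per-request header does not disable a mode that is already saved on the key.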

Curtailment Lifecycle

GPU jobs can be scheduled to run only during clean energy windows. When grid carbon rises above your threshold, the job hibernates. When it drops back, the job restores. This cycle repeats until you cancel.

Job states: waiting_for_window → running → hibernated → completed or cancelled

Supported providers: RunPod (stop/resume), Yotta (stop/resume). Lambda terminates and re-launches (no state preservation).

Schedule a curtailment job
curl -X POST https://windfallrouter.xyz/v1/batch/submit \
  -H "Authorization: Bearer $WINDFALL_KEY" \
  -d '{
    "gpuModel": "RTX 4090",
    "mode": "greenest",
    "durationEstimateHrs": 720,
    "persistent": true,
    "triggerCondition": { "type": "carbon", "threshold": 50 }
  }'

# Job waits until grid carbon < 50g, then provisions.
# Hibernates when carbon rises, restores when it drops.
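The lifecycle can be read as a small state machine driven by grid carbon readings. The following is an illustrative sketch rather than Windfall's scheduler: next_state is a hypothetical function, the handling of a reading exactly equal to the threshold is an assumption, and the transition to completed (which depends on the job itself finishing) is not modelled.

```python
TERMINAL_STATES = {"completed", "cancelled"}
ACTIVE_STATES = {"waiting_for_window", "running", "hibernated"}


def next_state(state, carbon, threshold, cancelled=False):
    """Return the job state after one carbon reading (gCO2/kWh)."""
    if state in TERMINAL_STATES:
        return state  # terminal states never change
    if state not in ACTIVE_STATES:
        raise ValueError(f"unknown state: {state}")
    if cancelled:
        return "cancelled"
    clean = carbon < threshold  # assumption: equal to threshold counts as dirty
    if state == "waiting_for_window":
        return "running" if clean else "waiting_for_window"
    # running or hibernated: run while clean, hibernate otherwise
    return "running" if clean else "hibernated"


state = "waiting_for_window"
for reading in [80, 45, 60, 30]:  # made-up readings, threshold of 50
    state = next_state(state, reading, threshold=50)
    print(reading, state)
```

With these sample readings the job waits at 80, starts at 45, hibernates at 60, and restores at 30.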

Social Cost of Carbon

The console dashboard calculates the ecological service value of your carbon savings using the EPA's social cost of carbon (SCC) — currently $204 per ton of CO₂ (2026 estimate, 3% discount rate).

SCC represents the economic damage avoided per ton of CO₂ not emitted: climate damage, health impacts, agricultural losses, and ecosystem degradation. When Windfall routes your inference to a 15g grid instead of a 420g grid, the difference has a real dollar value.

ecological_value = (baseline_carbon - actual_carbon) × $204 / 1,000,000
// grams → tons, then multiply by SCC
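As a worked example of the formula, the sketch below computes the value for a workload that would have emitted 420 g of CO₂ at the baseline and emitted 15 g as routed (for instance, 1 kWh on the two grids mentioned above). The function name is hypothetical; the constants come from the text.

```python
SCC_USD_PER_TON = 204        # social cost of carbon quoted above
GRAMS_PER_TON = 1_000_000    # grams in one metric ton


def ecological_value_usd(baseline_grams, actual_grams, scc=SCC_USD_PER_TON):
    """Dollar value of avoided emissions: grams saved -> tons -> USD."""
    saved_grams = baseline_grams - actual_grams
    return saved_grams * scc / GRAMS_PER_TON


value = ecological_value_usd(baseline_grams=420, actual_grams=15)
print(f"${value:.5f}")  # 405 g saved x $204 / 1,000,000 = $0.08262
```

A single request is worth a fraction of a cent, so the figure becomes meaningful mainly when summed over many requests.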

Example Response
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help?"
    },
    "finish_reason": "stop"
  }],
  "windfall": {
    "verifiedProvider": "Crusoe",
    "carbonIntensityGCO2": 50,
    "carbonDelta": 370,
    "carbonMethodology": "provider-verified",
    "energyPricePerKwh": 0.0450,
    "curtailmentActive": false,
    "zone": "US-NW-PACE",
    "routingMode": "greenest"
  }
}
Structured Output
Windfall forwards tool definitions and response format parameters to upstream providers. Structured output support depends on the upstream model's capabilities.
What Works

tools and tool_choice are forwarded to the upstream provider. If the model supports function calling, tool calls will work through Windfall.

response_format: {"type": "json_object"} is forwarded. JSON mode works for models that support it.

Known Limitations

The /v1/responses endpoint (OpenAI's successor to Chat Completions) is not yet supported. If your SDK defaults to the Responses API, force the classic endpoint:

Python (OpenAI SDK)
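# Force chat completions instead of the Responses API
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://windfallrouter.xyz/v1",  # the SDK appends /chat/completions
    api_key=os.environ["WINDFALL_KEY"],        # your wf_ API key
)
response = client.chat.completions.create(
    model="deepseek-v3",
    messages=[{"role": "user", "content": "Hello"}],
)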

model: "auto" (DeepSeek V3) may return inconsistent results with complex tool schemas. For structured output, specify the model explicitly (e.g. openai/gpt-4o, anthropic/claude-sonnet-4-6).

When using JSON instructions in the system prompt as a workaround, note that some models (Claude) may wrap the JSON response in ```json code fences.
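One way to cope with fenced replies is to strip a surrounding code fence before parsing. This is a minimal sketch with a hypothetical helper name. It only handles a single fence wrapping the whole reply, not JSON embedded in surrounding prose.

```python
import json

FENCE = "`" * 3  # three backticks, built this way to keep the snippet fence-safe


def parse_json_reply(text):
    """Parse JSON from a model reply, tolerating a surrounding code fence."""
    body = text.strip()
    wrapped = (
        len(body) >= 2 * len(FENCE)
        and body.startswith(FENCE)
        and body.endswith(FENCE)
    )
    if wrapped:
        body = body[len(FENCE):-len(FENCE)]
        # Drop an optional language tag such as "json" on the opening line.
        first_line, _, rest = body.partition("\n")
        if first_line.strip().lower() in ("", "json"):
            body = rest
    return json.loads(body)


reply = FENCE + 'json\n{"city": "Helsinki", "zone": "FI"}\n' + FENCE
print(parse_json_reply(reply))
```

Unfenced replies pass straight through to json.loads, so the same helper can be used for every model.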

Direct Routing
Some models are routed directly to providers (bypassing OpenRouter) for lower latency and richer metadata.

Currently direct-routed models include openai/gpt-4o-mini and google/gemini-2.5-flash, which route to Helsinki (hel1) directly.

Direct-routed responses include additional fields in the windfall object: cached, engagement, providerZone, energySource, and costBreakdown. These fields may be absent for OpenRouter-passthrough responses.
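Since these fields may be missing, client code is safer reading them defensively. The sketch below shows one way to do that with a hypothetical summarize_windfall helper. The choice of which fields to treat as always present is an assumption.

```python
# Fields the text above lists as present only on direct-routed responses.
OPTIONAL_FIELDS = ("cached", "engagement", "providerZone", "energySource", "costBreakdown")


def summarize_windfall(response):
    """Collect windfall fields without raising when optional ones are absent."""
    windfall = response.get("windfall") or {}
    summary = {
        "provider": windfall.get("verifiedProvider"),
        "routingMode": windfall.get("routingMode"),
    }
    for name in OPTIONAL_FIELDS:
        if name in windfall:  # copy only what the response actually carries
            summary[name] = windfall[name]
    return summary


passthrough = {"windfall": {"verifiedProvider": "Together", "routingMode": "balanced"}}
print(summarize_windfall(passthrough))
```

Using .get and membership checks avoids a KeyError when the same code handles both direct-routed and passthrough responses.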

Submit Batch Job
Provision a GPU with energy-aware datacenter selection.
POST /v1/batch/submit
Field                Type     Description
gpuModel             string   GPU model to provision (e.g. "NVIDIA A100 80GB")
mode                 string   "greenest", "balanced", or "cheapest"
durationEstimateHrs  number   Expected duration in hours
spotOk               boolean  Allow spot/interruptible instances for lower cost
List Jobs
Retrieve all batch jobs for your account.
GET /v1/batch/jobs

Returns an array of job objects with status, provider, zone, cost, and carbon data.

GPU Availability
Real-time GPU inventory across all providers and datacenters.
GET /v1/batch/availability

Returns totalAvailable, totalDatacenters, and a gpus array with model, price, zone, provider, and availability.

Energy Provider Data
Live carbon intensity, energy prices, and renewable percentages across all routing zones.
GET /v1/energy/providers

Returns a zones object keyed by zone ID. Each zone includes carbonIntensity, pricePerKwh, renewablePercent, wholesalePricePerMwh, source, and curtailmentActive.
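A client could use this payload to rank zones before setting routing preferences. The sketch below assumes the response is shaped as {"zones": {zone_id: {...}}} with the field names listed above. The sample values are made up for illustration, and greenest_zones is a hypothetical helper.

```python
def greenest_zones(payload, max_carbon=None, limit=3):
    """Return zone IDs sorted by carbonIntensity, lowest first."""
    zones = payload.get("zones", {})
    candidates = []
    for zone_id, info in zones.items():
        intensity = info.get("carbonIntensity")
        if not isinstance(intensity, (int, float)):
            continue  # skip zones without a usable reading
        if max_carbon is not None and intensity > max_carbon:
            continue
        candidates.append((intensity, zone_id))
    candidates.sort()
    return [zone_id for _, zone_id in candidates[:limit]]


sample = {  # made-up numbers, not real grid data
    "zones": {
        "FI": {"carbonIntensity": 40, "curtailmentActive": False},
        "US-MIDA-PJM": {"carbonIntensity": 380, "curtailmentActive": False},
        "CA-QC": {"carbonIntensity": 5, "curtailmentActive": True},
    }
}
print(greenest_zones(sample, max_carbon=100))
```

The returned IDs could then feed preferZone or excludeZones in the key-level routing config.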

Wholesale LMP
Real-time locational marginal prices from CAISO OASIS. Free, no-auth upstream.
GET /v1/energy/lmp

Returns current LMP data for CAISO nodes including price per MWh, congestion, and loss components.

Rate Limits
Windfall enforces per-key rate limits to protect upstream providers and ensure fair access.
Endpoint                   Limit          Window
/v1/chat/completions       60 requests    Per minute
/api/keys (create/revoke)  10 operations  Per hour
/v1/auth/*                 10 attempts    Per 15 minutes
429 Response

When rate limited, Windfall returns a 429 Too Many Requests response with a Retry-After header:

429 Response
{
  "error": {
    "message": "Rate limit exceeded. Retry after 12 seconds.",
    "type": "rate_limit",
    "param": null,
    "code": "rate_limited"
  }
}
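A client can honour the Retry-After header with a small retry loop. The sketch below is transport-agnostic: send is a hypothetical callable returning (status, headers, body), so it can wrap whichever HTTP library is in use. Treating Retry-After as a number of seconds is an assumption based on the message above.

```python
import time


def call_with_retry(send, max_attempts=3, default_wait=1.0, sleep=time.sleep):
    """Call send() and retry on 429, waiting as Retry-After suggests."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts:
            return status, body  # success, other error, or out of attempts
        try:
            wait = float(headers.get("Retry-After", default_wait))
        except (TypeError, ValueError):
            wait = default_wait  # header was not a plain number of seconds
        sleep(max(wait, 0.0))


# Simulated transport: one 429, then a success.
queue = [
    (429, {"Retry-After": "12"}, {"error": {"code": "rate_limited"}}),
    (200, {}, {"id": "chatcmpl-abc123"}),
]
waits = []
print(call_with_retry(lambda: queue.pop(0), sleep=waits.append), waits)
```

Passing sleep as a parameter keeps the loop testable without real delays. In production the default time.sleep applies.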
Error Handling
All errors follow the OpenAI error response format. Parse the error object for structured error information.
Error Response Shape
Error Response
{
  "error": {
    "message": "Human-readable error description",
    "type": "error_category",
    "param": null,
    "code": "machine_readable_code"
  }
}
Error Codes
Code               HTTP Status  Description
missing_parameter  400          A required field is missing from the request body (e.g. messages)
payment_required   402          Free tier exhausted or insufficient balance. Top up via Console.
rate_limited       429          Too many requests. Back off and retry after the Retry-After header value.
upstream_error     502          The upstream inference provider returned an error. Retry or try a different model.
no_provider        503          No available provider for the requested model. Check /v1/models for supported models.
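The error shape and codes above map naturally onto a single exception type. The following sketch uses hypothetical names (WindfallError, raise_for_error). Marking rate_limited and upstream_error as retryable follows the descriptions in the table, and is otherwise an assumption.

```python
RETRYABLE_CODES = {"rate_limited", "upstream_error"}


class WindfallError(Exception):
    """Hypothetical exception carrying the documented error fields."""

    def __init__(self, status, code, message, error_type=None, param=None):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message
        self.error_type = error_type
        self.param = param

    @property
    def retryable(self):
        return self.code in RETRYABLE_CODES


def raise_for_error(status, payload):
    """Raise WindfallError if the response carries an error object."""
    error = payload.get("error") if isinstance(payload, dict) else None
    if status < 400 and not error:
        return
    error = error if isinstance(error, dict) else {}
    raise WindfallError(
        status,
        error.get("code", "unknown"),
        error.get("message", ""),
        error.get("type"),
        error.get("param"),
    )


try:
    raise_for_error(503, {"error": {"code": "no_provider", "message": "No provider"}})
except WindfallError as exc:
    print(exc, "retryable:", exc.retryable)
```

Callers can branch on exc.code for cases such as payment_required, and on exc.retryable to decide whether to back off and try again.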
Routing Config (for agents)

Autonomous agents can configure per-key routing preferences. Set once, applies to all requests made with that key.

PUT /api/keys/routing
Authorization: Bearer wf_YOUR_KEY

{
  "mode": "greenest",
  "maxCarbonIntensity": 100,
  "preferZone": "FI",
  "excludeZones": ["US-MIDA-PJM", "US-TEX-ERCO"]
}

Field               Type      Description
mode                string    "greenest" or "balanced". Default: greenest.
preferZone          string    Preferred energy zone (e.g., "FI", "CA-QC"). Overridden by per-request prefer_zone.
excludeZones        string[]  Zones to never route to.
maxCarbonIntensity  number    Max gCO2/kWh. Requests will only route to zones below this threshold.
preferProvider      string    Preferred provider ID (e.g., "or-nebius", "or-crusoe").
excludeProviders    string[]  Provider IDs to exclude.

Get current config: GET /api/keys/routing. Reset to defaults: DELETE /api/keys/routing.

Per-request parameters (mode, prefer_zone) override key-level config for that request.
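The precedence rule can be sketched as a merge in which per-request values win for that request only. This is a client-side model with a hypothetical effective_routing helper. The mapping from the snake_case request name prefer_zone to the camelCase key-level name preferZone follows the table above, and the merge itself is an assumption about how the server resolves the two.

```python
# Per-request parameter name -> key-level config field it overrides.
REQUEST_TO_KEY = {"mode": "mode", "prefer_zone": "preferZone"}


def effective_routing(key_config, request_params=None):
    """Return the routing settings that apply to a single request."""
    effective = dict(key_config)  # copy so the saved key config is untouched
    for request_name, key_name in REQUEST_TO_KEY.items():
        value = (request_params or {}).get(request_name)
        if value is not None:
            effective[key_name] = value
    return effective


saved = {"mode": "greenest", "preferZone": "FI", "excludeZones": ["US-TEX-ERCO"]}
print(effective_routing(saved, {"mode": "balanced"}))
print(effective_routing(saved))  # no overrides: key-level config applies
```

Fields without a per-request counterpart, such as excludeZones, always come from the key-level config in this model.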

Code Examples
Drop-in examples for the most common use case: green inference.
curl
# Green inference with Windfall
curl -X POST https://windfallrouter.xyz/v1/chat/completions \
  -H "Authorization: Bearer $WINDFALL_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v3",
    "messages": [{"role": "user", "content": "Hello"}],
    "mode": "greenest"
  }'
Python
import os
import requests

WINDFALL_KEY = os.environ["WINDFALL_KEY"]  # your wf_ API key

response = requests.post(
    "https://windfallrouter.xyz/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {WINDFALL_KEY}",
        "Content-Type": "application/json",
    },
    json={
        "model": "deepseek-v3",
        "messages": [{"role": "user", "content": "Hello"}],
        "mode": "greenest",
    },
)
data = response.json()

# Windfall extension fields
windfall = data["windfall"]
print(f"Provider: {windfall['verifiedProvider']}")
print(f"Carbon: {windfall['carbonIntensityGCO2']}g CO2/kWh")
print(f"Saved: {windfall['carbonDelta']}g vs US average")
JavaScript (fetch)
// Node.js: read the key from the environment
const WINDFALL_KEY = process.env.WINDFALL_KEY;

const response = await fetch(
  "https://windfallrouter.xyz/v1/chat/completions",
  {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${WINDFALL_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "deepseek-v3",
      messages: [{ role: "user", content: "Hello" }],
      mode: "greenest",
    }),
  }
);
const data = await response.json();

// Windfall extension fields
const { windfall } = data;
console.log(`Provider: ${windfall.verifiedProvider}`);
console.log(`Carbon: ${windfall.carbonIntensityGCO2}g CO2/kWh`);
console.log(`Saved: ${windfall.carbonDelta}g vs US average`);