API Docs — Windfall

Authentication

Windfall supports two authentication methods. Both give access to the same API surface.

Bearer Token

Include your API key in the Authorization header. Keys start with wf_.

HTTP Header

Authorization: Bearer wf_your_api_key

Creating Keys

Create API keys via the Console or programmatically:

POST /api/keys

Field	Type	Description
label	string	Human-readable label for the key
wallet	string?	Optional EVM wallet address for ERC-8004 identity tier

Identity Tiers

Your tier determines free request allowance and rate limits:

Tier	Free Requests	How to Qualify
anonymous	25	Default tier, no verification required
erc8004	100	Link an ERC-8004 registered agent wallet or basename
verified	250	KYC or Stripe-verified account

After exhausting free requests, add balance via Top Up (Stripe or USDC/ETH on Base).
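
To make the tier table concrete, here is a minimal sketch of a client-side allowance check. The `FREE_REQUESTS` mapping mirrors the table above; the function name and the idea of counting usage locally are assumptions for illustration, since the actual accounting happens on Windfall's side.

```python
# Free request allowance per identity tier, copied from the table above.
FREE_REQUESTS = {"anonymous": 25, "erc8004": 100, "verified": 250}


def free_requests_remaining(tier: str, used: int) -> int:
    """Return how many free requests are left for a tier (never negative)."""
    if tier not in FREE_REQUESTS:
        raise ValueError(f"unknown tier: {tier!r}")
    return max(FREE_REQUESTS[tier] - used, 0)


print(free_requests_remaining("erc8004", 40))  # 60
```

A result of 0 is the point at which a top-up would be needed.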

SIWE Wallet Sessions

Use Sign-In with Ethereum (SIWE) for wallet-based auth. Send a signed message to /v1/auth/siwe to receive a session token. Wallets with a basename (yourname.base.eth) or ERC-8004 registration qualify for the erc8004 tier (100 free requests).

Chat Completions

OpenAI-compatible inference endpoint that works as a drop-in replacement for chat completions with any OpenAI SDK. Add the mode field for energy-aware routing.

POST /v1/chat/completions

Request Body

Field	Type	Description
model	string	Model identifier. e.g. `deepseek-v3`, `meta-llama/llama-3.1-70b-instruct`
messages	array	Array of message objects with `role` and `content`
mode	string	Routing mode: `"greenest"`, `"balanced"`, or `"cheapest"`. Default: `"greenest"`
temperature	number	Sampling temperature (0-2). Optional.
max_tokens	integer	Maximum tokens to generate. Optional.
stream	boolean	Enable SSE streaming. Optional.
tools	array	Tool definitions for function calling. Forwarded to upstream provider.
tool_choice	string/object	Control tool selection: `"auto"`, `"none"`, `"required"`, or specific tool. Optional.
response_format	object	Force output format: `{"type": "json_object"}` for JSON mode. Optional.
top_p	number	Nucleus sampling parameter (0-1). Optional.
frequency_penalty	number	Penalize repeated tokens (-2 to 2). Optional.
presence_penalty	number	Penalize tokens already present (-2 to 2). Optional.
stop	string/array	Stop sequences. Optional.
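
The request body can be assembled and sanity-checked before it is sent. The sketch below is one possible helper, not part of any Windfall SDK: `build_chat_request` is a hypothetical name, and the range checks simply restate the limits from the table above.

```python
import json

VALID_MODES = {"greenest", "balanced", "cheapest"}


def build_chat_request(model, messages, mode="greenest", **options):
    """Build a /v1/chat/completions body with basic client-side checks."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}, got {mode!r}")
    temperature = options.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    top_p = options.get("top_p")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")
    body = {"model": model, "messages": messages, "mode": mode}
    # Only include optional fields that were actually provided.
    body.update({k: v for k, v in options.items() if v is not None})
    return body


body = build_chat_request(
    "deepseek-v3",
    [{"role": "user", "content": "Hello"}],
    mode="balanced",
    temperature=0.2,
    max_tokens=256,
)
print(json.dumps(body, indent=2))
```

Validating locally avoids spending a request on a payload the API would reject anyway.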

Response — Windfall Extension

Every response includes a windfall object with verified energy data:

Field	Type	Description
windfall.verifiedProvider	string	Provider that served the request (e.g. "Crusoe", "Together")
windfall.carbonIntensityGCO2	number	Carbon intensity at the provider's grid, in gCO2/kWh
windfall.carbonDelta	number	Reduction vs the US grid average of 420 gCO2/kWh (e.g. a 50 g grid gives 370)
windfall.carbonMethodology	string	"provider-verified" or "grid-average"
windfall.energyPricePerKwh	number	Wholesale energy price at the provider's zone
windfall.curtailmentActive	boolean	Whether curtailment (negative pricing) was active
windfall.provider	string	Upstream inference provider name
windfall.providerTier	string	Provider tier: "standard" or "premium"
windfall.providerVerified	boolean	Whether provider's location is independently verified
windfall.costUsd	number	Total cost charged for this request
windfall.costBreakdown	object	Detailed cost: `{inputTokens, outputTokens, providerCostUsd, marginPercent, marginUsd, totalCostUsd}`
windfall.carbonGrams	number	Estimated carbon emissions for this request (grams)
windfall.carbonBaselineGrams	number	Carbon if routed to US grid average baseline
windfall.cached	boolean	Whether this response was served from cache
windfall.providerZone	string	Grid zone ID of the provider (e.g. "FI", "US-MIDA-PJM")
windfall.energySource	string	Data source for energy information
windfall.routingMode	string	Routing mode used for this request

Compliance Modes

Compliance modes restrict which providers and zones are available. Set them in Console Settings or via the API. Active modes are enforced on every request — they can only be strengthened, never weakened, by per-request headers.

Mode	Effect
eu-data-residency	Only EU/EEA/UK datacenters (FI, SE, NO, IS, DE, RO, GB)
gdpr-strict	EU zones + non-US-owned providers only. Currently: Nebius (NL/FI), DataCrunch (FI)
zdr	Zero data retention — only providers with verified no-logging policy
csrd	Scope 2 + Scope 3 carbon data in every response. No routing restriction.
sci	SCI for AI score (ISO 21031) per request — gCO₂e per 1000 tokens. No routing restriction.

Set via API

# Save compliance to your key (persists across requests)
curl -X PUT https://windfallrouter.xyz/api/keys/routing \
  -H "Authorization: Bearer $WINDFALL_KEY" \
  -H "Content-Type: application/json" \
  -d '{"compliance": ["eu-data-residency", "csrd"]}'

# Per-request header (can only add, not weaken)
curl -X POST https://windfallrouter.xyz/v1/chat/completions \
  -H "Authorization: Bearer $WINDFALL_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Windfall-Compliance: csrd,sci" \
  -d '{"model": "deepseek-v3", "messages": [...]}'

When CSRD or SCI modes are active, every response includes a windfall.csrd or windfall.sci object with Scope 2/3 emissions, PUE, energy consumed, and SCI score.
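
Because per-request headers can only add modes, the effective set for a request behaves like a union of the key-level modes and the header modes. The sketch below illustrates that rule; the function name is hypothetical, and how Windfall merges modes internally is an assumption based on the description above.

```python
def effective_compliance(key_modes, header_value=None):
    """Union the key-level modes with any modes from X-Windfall-Compliance."""
    modes = set(key_modes)
    if header_value:
        # The header is a comma-separated list, e.g. "csrd,sci".
        modes |= {m.strip() for m in header_value.split(",") if m.strip()}
    return sorted(modes)


print(effective_compliance(["eu-data-residency", "csrd"], "csrd,sci"))
# ['csrd', 'eu-data-residency', 'sci']
```

Note that nothing in the header can remove `eu-data-residency` once it is saved on the key.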

Curtailment Lifecycle

GPU jobs can be scheduled to run only during clean energy windows. When grid carbon rises above your threshold, the job hibernates. When it drops back, the job restores. This cycle repeats until you cancel.

Job states: waiting_for_window → running ↔ hibernated → completed or cancelled
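
The lifecycle can be read as a small state machine driven by carbon readings. The following is a minimal sketch under stated assumptions: the function name and the sample readings are hypothetical, a reading exactly at the threshold is treated as "no change", and completion or cancellation is only modeled as a terminal state.

```python
TERMINAL = {"completed", "cancelled"}


def next_state(state: str, carbon: float, threshold: float) -> str:
    """Return the next job state for one carbon reading (gCO2/kWh)."""
    if state in TERMINAL:
        return state
    if state in {"waiting_for_window", "hibernated"} and carbon < threshold:
        return "running"
    if state == "running" and carbon > threshold:
        return "hibernated"
    return state


state = "waiting_for_window"
history = []
for reading in [80, 45, 40, 65, 30]:  # hypothetical carbon readings
    state = next_state(state, reading, threshold=50)
    history.append(state)
print(history)
# ['waiting_for_window', 'running', 'running', 'hibernated', 'running']
```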

Supported providers: RunPod (stop/resume), Yotta (stop/resume). Lambda terminates and re-launches (no state preservation).

Schedule a curtailment job

curl -X POST https://windfallrouter.xyz/v1/batch/submit \
  -H "Authorization: Bearer $WINDFALL_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "gpuModel": "RTX 4090",
    "mode": "greenest",
    "durationEstimateHrs": 720,
    "persistent": true,
    "triggerCondition": { "type": "carbon", "threshold": 50 }
  }'

# Job waits until grid carbon < 50g, then provisions.
# Hibernates when carbon rises, restores when it drops.

Social Cost of Carbon

The console dashboard calculates the ecological service value of your carbon savings using the EPA's social cost of carbon (SCC) — currently $204 per ton of CO₂ (2026 estimate, 3% discount rate).

SCC represents the economic damage avoided per ton of CO₂ not emitted: climate damage, health impacts, agricultural losses, and ecosystem degradation. When Windfall routes your inference to a 15g grid instead of a 420g grid, the difference has a real dollar value.

ecological_value = (baseline_carbon - actual_carbon) × $204 / 1,000,000
// grams → tons, then multiply by SCC
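
As a worked example of the formula, the sketch below computes the ecological value for a hypothetical workload. The 1,000 kWh energy figure is an assumption chosen to make the arithmetic easy to follow; the 420 g and 15 g intensities and the $204 SCC come from the text above.

```python
SCC_USD_PER_TON = 204        # SCC figure quoted above
GRAMS_PER_TON = 1_000_000    # grams in one metric ton


def ecological_value_usd(baseline_grams: float, actual_grams: float) -> float:
    """Dollar value of avoided emissions: grams saved, converted to tons, times SCC."""
    return (baseline_grams - actual_grams) * SCC_USD_PER_TON / GRAMS_PER_TON


energy_kwh = 1_000                 # hypothetical workload size
baseline = 420 * energy_kwh        # grams emitted on a 420 g/kWh grid
actual = 15 * energy_kwh           # grams emitted on a 15 g/kWh grid
print(f"${ecological_value_usd(baseline, actual):.2f}")  # $82.62
```

Here 405,000 g of avoided CO₂ is 0.405 tons, which at $204 per ton comes to $82.62.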

Example Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help?"
    },
    "finish_reason": "stop"
  }],
  "windfall": {
    "verifiedProvider": "Crusoe",
    "carbonIntensityGCO2": 50,
    "carbonDelta": 370,
    "carbonMethodology": "provider-verified",
    "energyPricePerKwh": 0.0450,
    "curtailmentActive": false,
    "zone": "US-NW-PACE",
    "routingMode": "greenest"
  }
}

Structured Output

Windfall forwards tool definitions and response format parameters to upstream providers. Structured output support depends on the upstream model's capabilities.

What Works

tools and tool_choice are forwarded to the upstream provider. If the model supports function calling, tool calls will work through Windfall.

response_format: {"type": "json_object"} is forwarded. JSON mode works for models that support it.

Known Limitations

The /v1/responses endpoint (OpenAI's successor to Chat Completions) is not yet supported. If your SDK defaults to the Responses API, force the classic endpoint:

Python (OpenAI SDK)

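# Force chat completions instead of the Responses API
import os
from openai import OpenAI

# The SDK appends /chat/completions to base_url, so include /v1
client = OpenAI(base_url="https://windfallrouter.xyz/v1", api_key=os.environ["WINDFALL_KEY"])
response = client.chat.completions.create(
    model="deepseek-v3",
    messages=[{"role": "user", "content": "Hello"}],
)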

model: "auto" (DeepSeek V3) may return inconsistent results with complex tool schemas. For structured output, specify the model explicitly (e.g. openai/gpt-4o, anthropic/claude-sonnet-4-6).

When using JSON instructions in the system prompt as a workaround, note that some models (Claude) may wrap the JSON response in ```json code fences.
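
One way to cope with fenced output is to strip an optional code-fence wrapper before parsing. This is a minimal sketch, not a Windfall feature: `parse_model_json` is a hypothetical helper, and it only handles a single fenced block that spans the whole message.

```python
import json
import re

FENCE = "`" * 3  # built this way to avoid a literal fence inside this example
FENCED_JSON = re.compile(
    rf"^\s*{FENCE}(?:json)?\s*\n(.*?)\n\s*{FENCE}\s*$", re.DOTALL
)


def parse_model_json(text: str):
    """Parse JSON from model output, tolerating an optional code-fence wrapper."""
    match = FENCED_JSON.match(text)
    payload = match.group(1) if match else text
    return json.loads(payload)


wrapped = f'{FENCE}json\n{{"status": "ok"}}\n{FENCE}'
print(parse_model_json(wrapped))             # {'status': 'ok'}
print(parse_model_json('{"status": "ok"}'))  # {'status': 'ok'}
```

Unfenced output passes through unchanged, so the same helper can be used for every model.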

Direct Routing

Some models are routed directly to providers (bypassing OpenRouter) for lower latency and richer metadata.

Currently direct-routed models include openai/gpt-4o-mini and google/gemini-2.5-flash, which route to Helsinki (hel1) directly.

Direct-routed responses include additional fields in the windfall object: cached, engagement, providerZone, energySource, and costBreakdown. These fields may be absent for OpenRouter-passthrough responses.

Submit Batch Job

Provision a GPU with energy-aware datacenter selection.

POST /v1/batch/submit

Field	Type	Description
gpuModel	string	GPU model to provision (e.g. "NVIDIA A100 80GB")
mode	string	`"greenest"`, `"balanced"`, or `"cheapest"`
durationEstimateHrs	number	Expected duration in hours
spotOk	boolean	Allow spot/interruptible instances for lower cost

List Jobs

Retrieve all batch jobs for your account.

GET /v1/batch/jobs

Returns an array of job objects with status, provider, zone, cost, and carbon data.

GPU Availability

Real-time GPU inventory across all providers and datacenters.

GET /v1/batch/availability

Returns totalAvailable, totalDatacenters, and a gpus array with model, price, zone, provider, and availability.
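
A typical use of this response is picking the cheapest listing for a given GPU model. The sketch below assumes the entries in `gpus` use the key names `model`, `price`, `zone`, and `provider` exactly as listed above; the sample payload and provider names are made up for illustration.

```python
def cheapest_gpu(availability: dict, model: str):
    """Return the lowest-priced entry for a GPU model, or None if absent."""
    candidates = [g for g in availability.get("gpus", []) if g.get("model") == model]
    return min(candidates, key=lambda g: g["price"], default=None)


# Hypothetical payload shaped like the fields listed above.
sample = {
    "totalAvailable": 3,
    "totalDatacenters": 2,
    "gpus": [
        {"model": "RTX 4090", "price": 0.44, "zone": "FI", "provider": "example-a"},
        {"model": "RTX 4090", "price": 0.39, "zone": "SE", "provider": "example-b"},
        {"model": "NVIDIA A100 80GB", "price": 1.60, "zone": "FI", "provider": "example-a"},
    ],
}
best = cheapest_gpu(sample, "RTX 4090")
print(best["zone"], best["price"])  # SE 0.39
```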

Energy Provider Data

Live carbon intensity, energy prices, and renewable percentages across all routing zones.

GET /v1/energy/providers

Returns a zones object keyed by zone ID. Each zone includes carbonIntensity, pricePerKwh, renewablePercent, wholesalePricePerMwh, source, and curtailmentActive.
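
Since the zones object is keyed by zone ID, selecting the cleanest zone is a simple minimum over `carbonIntensity`. The sketch below shows one way to do that with an optional price cap; the helper name and the sample values are hypothetical, while the field names come from the description above.

```python
def greenest_zone(zones: dict, max_price_per_kwh=None):
    """Return the zone ID with the lowest carbonIntensity, optionally price-capped."""
    eligible = {
        zone_id: z
        for zone_id, z in zones.items()
        if max_price_per_kwh is None or z["pricePerKwh"] <= max_price_per_kwh
    }
    if not eligible:
        return None
    return min(eligible, key=lambda zone_id: eligible[zone_id]["carbonIntensity"])


# Hypothetical readings; real values come from GET /v1/energy/providers.
sample_zones = {
    "FI": {"carbonIntensity": 35, "pricePerKwh": 0.06, "curtailmentActive": False},
    "US-MIDA-PJM": {"carbonIntensity": 390, "pricePerKwh": 0.04, "curtailmentActive": False},
    "US-NW-PACE": {"carbonIntensity": 50, "pricePerKwh": 0.045, "curtailmentActive": True},
}
print(greenest_zone(sample_zones))                          # FI
print(greenest_zone(sample_zones, max_price_per_kwh=0.05))  # US-NW-PACE
```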

Wholesale LMP

Real-time locational marginal prices from CAISO OASIS. Free, no-auth upstream.

GET /v1/energy/lmp

Returns current LMP data for CAISO nodes including price per MWh, congestion, and loss components.

Rate Limits

Windfall enforces per-key rate limits to protect upstream providers and ensure fair access.

Endpoint	Limit	Window
/v1/chat/completions	60 requests	Per minute
/api/keys (create/revoke)	10 operations	Per hour
/v1/auth/*	10 attempts	Per 15 minutes

429 Response

When rate limited, Windfall returns a 429 Too Many Requests response with a Retry-After header:

{
  "error": {
    "message": "Rate limit exceeded. Retry after 12 seconds.",
    "type": "rate_limit",
    "param": null,
    "code": "rate_limited"
  }
}
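
A client can use the Retry-After header to decide how long to wait before retrying. The sketch below assumes the header carries a number of seconds, as the example message suggests; the fallback to jittered exponential backoff and the 60-second cap are illustrative choices, not documented Windfall behavior.

```python
import random


def retry_delay_seconds(headers: dict, attempt: int, cap: float = 60.0) -> float:
    """Pick a wait time after a 429: honor Retry-After, else exponential backoff."""
    value = headers.get("Retry-After")
    if value is not None:
        try:
            return min(float(value), cap)
        except ValueError:
            pass  # not a plain number (e.g. an HTTP date); fall back to backoff
    # Full jitter: random wait of up to 2**attempt seconds, capped.
    return random.uniform(0, min(cap, 2 ** attempt))


print(retry_delay_seconds({"Retry-After": "12"}, attempt=0))  # 12.0
```

A plain dict lookup is case-sensitive; the `headers` attribute of a `requests` response is case-insensitive, so it can be passed in directly.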

Error Handling

All errors follow the OpenAI error response format. Parse the error object for structured error information.

Error Response Shape

Error Response

{
  "error": {
    "message": "Human-readable error description",
    "type": "error_category",
    "param": null,
    "code": "machine_readable_code"
  }
}

Error Codes

Code	HTTP Status	Description
missing_parameter	400	A required field is missing from the request body (e.g. `messages`)
payment_required	402	Free tier exhausted or insufficient balance. Top up via Console.
rate_limited	429	Too many requests. Back off and retry after the `Retry-After` header value.
upstream_error	502	The upstream inference provider returned an error. Retry or try a different model.
no_provider	503	No available provider for the requested model. Check `/v1/models` for supported models.
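
The error shape and codes above map naturally onto a small exception type. In the sketch below, `WindfallError` and `error_from_response` are hypothetical names; treating only `rate_limited` and `upstream_error` as retryable follows the table's wording and is otherwise an assumption.

```python
# Codes whose descriptions above suggest retrying.
RETRYABLE_CODES = {"rate_limited", "upstream_error"}


class WindfallError(Exception):
    """Structured view of an OpenAI-style error body."""

    def __init__(self, status: int, code: str, message: str):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.retryable = code in RETRYABLE_CODES


def error_from_response(status: int, body: dict) -> WindfallError:
    """Build a WindfallError from an HTTP status and a parsed JSON body."""
    err = body.get("error") or {}
    return WindfallError(status, err.get("code", "unknown"), err.get("message", ""))


exc = error_from_response(
    402,
    {"error": {"message": "Free tier exhausted", "type": "billing",
               "param": None, "code": "payment_required"}},
)
print(exc, exc.retryable)
```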

Routing Config (for agents)

Autonomous agents can configure per-key routing preferences. Set once, applies to all requests made with that key.

PUT /api/keys/routing
Authorization: Bearer wf_YOUR_KEY

{
  "mode": "greenest",
  "maxCarbonIntensity": 100,
  "preferZone": "FI",
  "excludeZones": ["US-MIDA-PJM", "US-TEX-ERCO"]
}

Field	Type	Description
mode	string	"greenest" or "balanced". Default: greenest.
preferZone	string	Preferred energy zone (e.g., "FI", "CA-QC"). Overridden by per-request prefer_zone.
excludeZones	string[]	Zones to never route to.
maxCarbonIntensity	number	Max gCO2/kWh. Requests will only route to zones below this threshold.
preferProvider	string	Preferred provider ID (e.g., "or-nebius", "or-crusoe").
excludeProviders	string[]	Provider IDs to exclude.

Get current config: GET /api/keys/routing. Reset to defaults: DELETE /api/keys/routing.

Per-request parameters (mode, prefer_zone) override key-level config for that request.
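
The precedence rule can be pictured as a merge in which request parameters win over the saved key config. This is a minimal sketch of that behavior as seen from the client, assuming `prefer_zone` on the request corresponds to `preferZone` on the key; the function name is hypothetical.

```python
def effective_routing(key_config: dict, request_params: dict) -> dict:
    """Apply per-request overrides on top of key-level routing config."""
    merged = dict(key_config)  # copy so the saved config is not mutated
    if "mode" in request_params:
        merged["mode"] = request_params["mode"]
    if "prefer_zone" in request_params:
        # Per-request prefer_zone overrides the key-level preferZone.
        merged["preferZone"] = request_params["prefer_zone"]
    return merged


key_config = {
    "mode": "greenest",
    "maxCarbonIntensity": 100,
    "preferZone": "FI",
    "excludeZones": ["US-MIDA-PJM", "US-TEX-ERCO"],
}
result = effective_routing(key_config, {"prefer_zone": "CA-QC"})
print(result["mode"], result["preferZone"])  # greenest CA-QC
```

Fields without a per-request equivalent, such as `excludeZones`, carry over unchanged.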

Code Examples

Drop-in examples for the most common use case: green inference.

curl
# Green inference with Windfall
curl -X POST https://windfallrouter.xyz/v1/chat/completions \
  -H "Authorization: Bearer $WINDFALL_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v3",
    "messages": [{"role": "user", "content": "Hello"}],
    "mode": "greenest"
  }'

Python
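import os

import requests

# Read the API key from the environment (assumes WINDFALL_KEY is set)
WINDFALL_KEY = os.environ["WINDFALL_KEY"]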

response = requests.post(
    "https://windfallrouter.xyz/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {WINDFALL_KEY}",
        "Content-Type": "application/json",
    },
    json={
        "model": "deepseek-v3",
        "messages": [{"role": "user", "content": "Hello"}],
        "mode": "greenest",
    },
)

data = response.json()
# Windfall extension fields
windfall = data["windfall"]
print(f"Provider: {windfall['verifiedProvider']}")
print(f"Carbon: {windfall['carbonIntensityGCO2']}g CO2/kWh")
print(f"Saved: {windfall['carbonDelta']}g vs US average")

JavaScript (fetch)
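// Assumes Node.js 18+ (ES module) with WINDFALL_KEY set in the environment
const WINDFALL_KEY = process.env.WINDFALL_KEY;

const response = await fetch(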
  "https://windfallrouter.xyz/v1/chat/completions",
  {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${WINDFALL_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "deepseek-v3",
      messages: [{ role: "user", content: "Hello" }],
      mode: "greenest",
    }),
  }
);

const data = await response.json();

// Windfall extension fields
const { windfall } = data;
console.log(`Provider: ${windfall.verifiedProvider}`);
console.log(`Carbon: ${windfall.carbonIntensityGCO2}g CO2/kWh`);
console.log(`Saved: ${windfall.carbonDelta}g vs US average`);