Surf Inference

OpenAI-compatible LLM inference API with x402 or MPP micropayments.

Quick Start

# Check wallet balance
npx x402-proxy wallet
# Send a chat completion request
npx x402-proxy -X POST -H "Content-Type: application/json" -d '{"model":"moonshotai/kimi-k2.5","messages":[{"role":"user","content":"Hello"}]}' https://inference.surf.cascade.fyi/v1/chat/completions
# Send the same request with streaming enabled (the response arrives as SSE)
npx x402-proxy -X POST -H "Content-Type: application/json" -d '{"model":"moonshotai/kimi-k2.5","messages":[{"role":"user","content":"Hello"}],"stream":true}' https://inference.surf.cascade.fyi/v1/chat/completions

Pricing

Method  Path                  Price    Description
POST    /v1/chat/completions  dynamic  LLM chat completion (streaming supported via SSE)

Model Pricing

Flat models charge a fixed price per request. Dynamic models charge per token, based on input, output, and cache usage. Dynamic rates are in USD per million tokens (/M); flat prices are in USD per request.

Model                                   Type     Flat    Input    Output
moonshotai/kimi-k2.5                    dynamic  -       $0.59/M  $2.86/M
minimax/minimax-m2.5                    dynamic  -       $0.26/M  $1.52/M
qwen/qwen-2.5-7b-instruct               flat     $0.001  -        -
anthropic/claude-sonnet-4.5             dynamic  -       $3.9/M   $19.5/M
anthropic/claude-sonnet-4.6             dynamic  -       $3.9/M   $19.5/M
anthropic/claude-opus-4.5               dynamic  -       $6.5/M   $32.5/M
anthropic/claude-opus-4.6               dynamic  -       $6.5/M   $32.5/M
minimax/minimax-m2.7                    dynamic  -       $0.39/M  $1.56/M
z-ai/glm-5                              dynamic  -       $1.04/M  $3.33/M
x-ai/grok-4.1-fast                      dynamic  -       $0.26/M  $0.65/M
x-ai/grok-4.20-beta                     dynamic  -       $2.6/M   $7.8/M
x-ai/grok-4.20-multi-agent-beta         dynamic  -       $2.6/M   $7.8/M
x-ai/grok-4.1-fast:online               dynamic  -       $1.05/M  $0.75/M
x-ai/grok-4.20-beta:online              dynamic  -       $3.75/M  $9/M
x-ai/grok-4.20-multi-agent-beta:online  dynamic  -       $3.75/M  $9/M
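
As a rough illustration of how dynamic pricing adds up, the sketch below estimates the cost of a single request from the input and output rates in the table. The estimate_cost function is a hypothetical helper, not part of x402-proxy or the API, and the token counts are made-up example values. It ignores cache usage because the table lists no cache rates, so the amount actually charged may differ.

```shell
# Hypothetical helper (not part of x402-proxy): estimate the USD cost of one
# dynamic-priced request. Cache usage is ignored, so treat the result as an
# approximation only.
# Args: input rate ($/M), output rate ($/M), input tokens, output tokens.
estimate_cost() {
  awk -v in_rate="$1" -v out_rate="$2" -v in_tok="$3" -v out_tok="$4" \
    'BEGIN { printf "%.6f\n", (in_tok * in_rate + out_tok * out_rate) / 1000000 }'
}

# moonshotai/kimi-k2.5 at $0.59/M input and $2.86/M output,
# with an assumed 1200 input tokens and 300 output tokens.
estimate_cost 0.59 2.86 1200 300   # prints 0.001566
```

A flat model such as qwen/qwen-2.5-7b-instruct needs no such calculation: each request costs $0.001 regardless of token counts.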

Payment

Protocol: x402 / MPP
Currency: USDC
Networks: Base, Solana, Tempo