API Overview

PolarGrid provides an OpenAI-compatible REST API, making it easy to migrate existing applications or use familiar patterns.

Base URL

https://api.{region}.edge.polargrid.ai
Available regions:
  • yto-01 — Toronto
  • yvr-02 — Vancouver
  • yul-01 — Montreal
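As a minimal sketch (Python, for illustration; the doc's own examples use curl), the base URL for a region can be assembled from the region IDs listed above:

```python
def base_url(region: str) -> str:
    """Build the edge API base URL for a given region ID (e.g. "yto-01")."""
    return f"https://api.{region}.edge.polargrid.ai"

print(base_url("yto-01"))  # https://api.yto-01.edge.polargrid.ai
```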

Authentication

All edge /v1/* requests require a JWT, obtained by exchanging your API key via POST https://app.polargrid.ai/api/auth/inference-token (see Authentication for details). The SDKs handle this exchange automatically. For raw HTTP access:
# First, exchange your API key for a JWT
TOKEN=$(curl -s -X POST https://app.polargrid.ai/api/auth/inference-token \
  -H "Authorization: Bearer pg_your_api_key" | jq -r .token)

# Then use the JWT for edge requests
Authorization: Bearer $TOKEN
Get your API key from the Console.
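The same exchange can be sketched in Python with the standard library. This is an illustration, not SDK code; it assumes, as the jq example above does, that the exchange response is JSON with a top-level "token" field. The network call itself is left commented out:

```python
import json
import urllib.request

AUTH_URL = "https://app.polargrid.ai/api/auth/inference-token"

def token_request(api_key: str) -> urllib.request.Request:
    """Build the API-key-for-JWT exchange request (no network I/O here)."""
    return urllib.request.Request(
        AUTH_URL,
        method="POST",
        headers={"Authorization": f"Bearer {api_key}"},
    )

def extract_token(response_body: bytes) -> str:
    """Pull the JWT out of the exchange response's top-level "token" field."""
    return json.loads(response_body)["token"]

# Actual exchange (requires a valid API key and network access):
# with urllib.request.urlopen(token_request("pg_your_api_key")) as resp:
#     jwt = extract_token(resp.read())
```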

Endpoints

Text Inference

Method  Endpoint               Description
POST    /v1/chat/completions   Chat completions (recommended)
POST    /v1/completions        Text completions

Audio

Method  Endpoint                   Description
POST    /v1/audio/speech           Text-to-speech
POST    /v1/audio/transcriptions   Speech-to-text

Models

Method  Endpoint                Description
GET     /v1/models              List available models
POST    /v1/models/load         Load a model into GPU memory
POST    /v1/models/unload       Unload a model
POST    /v1/models/unload-all   Unload all models
GET     /v1/models/status       Get model loading status
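Because the API is OpenAI-compatible, a reasonable assumption (not confirmed by this page) is that GET /v1/models returns the OpenAI-style list shape, {"object": "list", "data": [{"id": "...", ...}, ...]}. Under that assumption, extracting model IDs looks like:

```python
import json

def model_ids(models_response: bytes) -> list[str]:
    """Extract model IDs from an assumed OpenAI-style /v1/models response:
    {"object": "list", "data": [{"id": "...", ...}, ...]}"""
    return [m["id"] for m in json.loads(models_response)["data"]]

sample = b'{"object": "list", "data": [{"id": "Meta-Llama-3.1-8B-Instruct", "object": "model"}]}'
print(model_ids(sample))  # ['Meta-Llama-3.1-8B-Instruct']
```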

GPU

Method  Endpoint         Description
GET     /v1/gpu/status   Detailed GPU status (may not be available in all regions)
GET     /v1/gpu/memory   GPU memory usage
POST    /v1/gpu/purge    Clear GPU memory

Health

Method  Endpoint   Description
GET     /health    Service health check

Request Format

All POST endpoints accept a JSON request body:
curl -X POST https://api.yto-01.edge.polargrid.ai/v1/chat/completions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Meta-Llama-3.1-8B-Instruct",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 100
  }'
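The same request can be built in Python with the standard library; this sketch constructs the request object without sending it (the token value and the `chat_request` helper name are placeholders for illustration):

```python
import json
import urllib.request

def chat_request(base_url: str, token: str, payload: dict) -> urllib.request.Request:
    """Build a POST /v1/chat/completions request with JWT auth and JSON body."""
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )

req = chat_request(
    "https://api.yto-01.edge.polargrid.ai",
    "YOUR_JWT",
    {
        "model": "Meta-Llama-3.1-8B-Instruct",
        "messages": [{"role": "user", "content": "Hello"}],
        "max_tokens": 100,
    },
)
# urllib.request.urlopen(req) would send it (requires network access and a valid JWT)
```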

Response Format

Responses are JSON with this structure:
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "Meta-Llama-3.1-8B-Instruct",
  "choices": [...],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 20,
    "total_tokens": 30
  }
}
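Pulling the reply text and token usage out of that structure is a one-liner each. The shape of the entries inside "choices" (a "message" object and a "finish_reason", per the OpenAI chat.completion format) is assumed here, since the example above elides them:

```python
import json

response_body = """{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "Meta-Llama-3.1-8B-Instruct",
  "choices": [{"index": 0,
               "message": {"role": "assistant", "content": "Hi!"},
               "finish_reason": "stop"}],
  "usage": {"prompt_tokens": 10, "completion_tokens": 20, "total_tokens": 30}
}"""

resp = json.loads(response_body)
reply = resp["choices"][0]["message"]["content"]   # assistant text
used = resp["usage"]["total_tokens"]               # token accounting
print(reply, used)  # Hi! 30
```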

Errors

Errors return appropriate HTTP status codes with a detail string (FastAPI format):
{
  "detail": "Invalid API key"
}
Status  Description
400     Bad request (validation error)
401     Unauthorized (invalid API key)
404     Not found
429     Rate limit exceeded
500     Server error
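A common client-side pattern, sketched here as an illustration (the helper names are placeholders, not part of the API): treat 429 and 5xx as transient and retryable with backoff, treat other 4xx as caller errors, and surface the FastAPI-style "detail" string:

```python
import json

def error_detail(body: bytes) -> str:
    """Extract the message from a FastAPI-style {"detail": "..."} error body."""
    return json.loads(body).get("detail", "")

def should_retry(status: int, attempt: int, max_attempts: int = 3) -> bool:
    """429 (rate limit) and 5xx are worth retrying with backoff;
    other 4xx indicate a caller error and are not retryable."""
    return attempt < max_attempts and (status == 429 or status >= 500)
```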

Streaming

For streaming responses, set "stream": true in the request body:
curl -X POST https://api.yto-01.edge.polargrid.ai/v1/chat/completions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Meta-Llama-3.1-8B-Instruct",
    "messages": [{"role": "user", "content": "Tell me a story"}],
    "stream": true
  }'
Streaming responses use Server-Sent Events (SSE). See the Streaming Guide for details.
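A minimal sketch of consuming those SSE lines in Python, assuming OpenAI-style chunk payloads (incremental text under choices[0].delta.content, terminated by a [DONE] sentinel); see the Streaming Guide for the authoritative format:

```python
import json

def iter_sse_data(lines):
    """Yield parsed JSON payloads from SSE "data:" lines; stop at [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alives and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield json.loads(payload)

sample = [
    'data: {"choices": [{"delta": {"content": "Once"}}]}',
    'data: {"choices": [{"delta": {"content": " upon"}}]}',
    "data: [DONE]",
]
text = "".join(c["choices"][0]["delta"]["content"] for c in iter_sse_data(sample))
print(text)  # Once upon
```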

OpenAI Compatibility

PolarGrid’s REST API follows OpenAI’s endpoint structure and request/response formats, so tools that speak the OpenAI wire protocol (e.g., curl, LangChain, LiteLLM) can target PolarGrid with a base URL change. The PolarGrid Python and JavaScript SDKs use their own method signatures (e.g., client.chat_completion({...}) instead of client.chat.completions.create(...)) and are not drop-in replacements for the OpenAI SDK. See the SDK docs for details.