Skip to main content

Welcome to PolarGrid

PolarGrid is edge AI infrastructure that brings GPU-powered inference closer to your users. Run LLMs and voice AI (text-to-speech, speech-to-text, end-to-end voice agents) with ultra-low latency across our edge network.

Quickstart

Get your first API call working in 5 minutes

API Reference

OpenAI-compatible endpoints for easy migration

JavaScript SDK

npm install @polargrid/polargrid-sdk

Python SDK

pip install polargrid-sdk

Why PolarGrid?

Edge-First Architecture

Your inference requests are routed to the nearest GPU-equipped edge node, minimizing round-trip latency. Critical for real-time voice AI and interactive applications.

OpenAI-Compatible API

Drop-in replacement for OpenAI’s API. Migrate existing applications with minimal code changes.

Real-Time Voice

Sub-30ms network hop to the nearest edge node, enabling natural conversational AI experiences. Network latency is the round-trip time between your client and the edge — inference latency (model processing time) is additional and varies by model and input size. See Models for performance details.

Managed Model Infrastructure

PolarGrid handles model deployment and scaling across edge regions. Popular open-weight models are pre-loaded and ready to use — no provisioning or GPU management required.

Available Regions

RegionLocationID
TorontoCanada Centralyto-01
Toronto 02Canada Centralyto-02
MontrealCanada Eastyul-01
Montreal 02Canada Eastyul-02
VancouverCanada Westyvr-02
New YorkUS Eastnyc-01
New York 02US Eastnyc-02
DallasUS Centraldal-01
Dallas 02US Centraldal-02
San FranciscoUS Westsfo-01
See Regions for endpoint URLs, aliases, and auto-routing.

Getting Help