Model Loading

PolarGrid supports loading and unloading models on edge nodes at runtime.
Operator-only. /v1/models/load, /v1/models/unload, and /v1/models/unload-all require a superadmin-scoped credential, issued only to PolarGrid operators. Standard pg_* API keys receive 403 Forbidden. Models available for inference are pre-deployed across edge regions — you do not need to load models yourself. Use GET /v1/models to see which models are available in your region.
Edge endpoints accept your pg_* API key as a bearer token. See Authentication for details. The cURL examples below pin Toronto (yto-01) for concreteness — substitute another region or discover the fastest one via GET https://autorouter.polargrid.ai/v1/route. See API Overview for both patterns.
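
As a rough illustration of the region substitution described above, the sketch below builds the request for GET /v1/models in a chosen region. The helper names edgeBaseUrl and listModelsRequest are hypothetical, and the host pattern is an assumption generalized from the Toronto example; confirm it against API Overview before relying on it.

```typescript
// Assumption: every edge region follows the host pattern of the Toronto
// example (api.<region>.edge.polargrid.ai). This is not confirmed here.
function edgeBaseUrl(region: string): string {
  return `https://api.${region}.edge.polargrid.ai`;
}

// Hypothetical helper: builds the URL and options for GET /v1/models,
// the endpoint that lists the models available in a region.
function listModelsRequest(region: string, apiKey: string) {
  return {
    url: `${edgeBaseUrl(region)}/v1/models`,
    init: {
      method: "GET",
      headers: { Authorization: `Bearer ${apiKey}` },
    },
  };
}

const modelsReq = listModelsRequest("yto-01", "pg_your_api_key");
console.log(modelsReq.url); // prints: https://api.yto-01.edge.polargrid.ai/v1/models
```

The returned object can be passed to fetch(modelsReq.url, modelsReq.init) in Node 18+ or a browser. The response shape of GET /v1/models is not covered on this page, so the sketch stops at building the request.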

Load Model

POST /v1/models/load
Load a model into GPU memory.

Request Body

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| model_name | string | Yes | - | Model ID to load |
| force_reload | boolean | No | false | Force reload even if already loaded |

Example

# Operator-only: replace pg_your_api_key with a superadmin-scoped credential (standard pg_* keys receive 403 Forbidden)
curl -X POST https://api.yto-01.edge.polargrid.ai/v1/models/load \
  -H "Authorization: Bearer pg_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_name": "qwen-3.5-27b",
    "force_reload": false
  }'

Response

{
  "status": "success",
  "model": "qwen-3.5-27b",
  "force_reload": false,
  "message": "Model qwen-3.5-27b loaded successfully"
}
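
For callers working in TypeScript rather than cURL, the request and response above might be typed and handled roughly as follows. This is a minimal sketch, not the SDK's implementation: buildLoadRequest and interpretLoadResponse are hypothetical helpers, and treating any 2xx response with status "success" as the only success case is an assumption.

```typescript
// Shapes taken from the request table and response JSON above.
interface LoadModelRequest {
  model_name: string;
  force_reload?: boolean; // optional; the documented default is false
}

interface LoadModelResponse {
  status: string;
  model: string;
  force_reload: boolean;
  message: string;
}

// Hypothetical helper: builds the arguments for fetch() without sending anything.
function buildLoadRequest(baseUrl: string, token: string, body: LoadModelRequest) {
  return {
    url: `${baseUrl}/v1/models/load`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}

// Hypothetical helper: turns an HTTP status code and parsed JSON body into a
// typed result, or throws with a readable reason.
function interpretLoadResponse(httpStatus: number, body: unknown): LoadModelResponse {
  if (httpStatus === 403) {
    throw new Error("403 Forbidden: this endpoint requires a superadmin-scoped credential");
  }
  if (httpStatus < 200 || httpStatus >= 300) {
    throw new Error(`Load failed with HTTP ${httpStatus}`);
  }
  if (typeof body !== "object" || body === null) {
    throw new Error("Load response was not a JSON object");
  }
  const parsed = body as LoadModelResponse;
  // Assumption: "success" is the only status value that means the model loaded.
  if (parsed.status !== "success") {
    throw new Error(`Load did not succeed: ${parsed.message}`);
  }
  return parsed;
}
```

A caller would pass the built url and init to fetch, then hand res.status and await res.json() to interpretLoadResponse. Keeping the network call outside both helpers makes them easy to exercise without an operator credential.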

Unload Model

POST /v1/models/unload
Unload a model from GPU memory.

Request Body

| Parameter | Type | Required | Description |
|---|---|---|---|
| model_name | string | Yes | Model ID to unload (e.g., qwen-3.5-27b) |

Example

// Assumes `client` is an initialized PolarGrid SDK client (setup not shown)
const result = await client.unloadModel({
  modelName: 'gpt2',
});

console.log(result.message);

Response

{
  "status": "success",
  "model": "gpt2",
  "message": "Model gpt2 unloaded successfully"
}

Unload All Models

POST /v1/models/unload-all
Unload all models from GPU memory.

Example

const result = await client.unloadAllModels();

console.log(`Unloaded ${result.totalUnloaded} models`);
console.log('Models:', result.unloadedModels);

Response

{
  "status": "success",
  "unloaded_models": ["qwen-3.5-27b", "whisper-large-v3-turbo"],
  "errors": [],
  "total_unloaded": 2
}
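
The SDK example reads camelCase fields (totalUnloaded, unloadedModels), while the raw JSON response uses snake_case. For callers hitting the endpoint directly, a mapping along these lines may help. It is a sketch under assumptions: the helper names are hypothetical, the SDK's actual conversion is not documented here, and the shape of entries in errors is unknown, so they are kept opaque.

```typescript
// Raw response shape, as shown in the JSON above.
interface UnloadAllRaw {
  status: string;
  unloaded_models: string[];
  errors: unknown[]; // entry shape is not documented on this page
  total_unloaded: number;
}

// camelCase shape matching the field names used in the SDK example.
interface UnloadAllResult {
  status: string;
  unloadedModels: string[];
  errors: unknown[];
  totalUnloaded: number;
}

// Hypothetical helper: renames snake_case fields to camelCase.
function toUnloadAllResult(raw: UnloadAllRaw): UnloadAllResult {
  return {
    status: raw.status,
    unloadedModels: raw.unloaded_models,
    errors: raw.errors,
    totalUnloaded: raw.total_unloaded,
  };
}

// Hypothetical helper: true when at least one model reported an error.
// Assumption: a non-empty errors array signals a partial failure, whatever
// the top-level status says.
function hasPartialFailure(result: UnloadAllResult): boolean {
  return result.errors.length > 0;
}
```

Checking errors separately from status is a defensive choice, since this page does not say how status behaves when only some models fail to unload.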

Get Model Status

GET /v1/models/status
Get the loading status of all models.

Example

const status = await client.getModelStatus();

console.log('Loaded models:', status.loaded);
console.log('Status:', status.loadingStatus);

Response

{
  "loaded": ["qwen-3.5-27b", "whisper-large-v3-turbo"],
  "loading_status": {
    "qwen-3.5-27b": "loaded",
    "whisper-large-v3-turbo": "loaded",
    "gpt2": "unloaded"
  },
  "repository": "/models"
}

Status Values

| Status | Description |
|---|---|
| loaded | Model is in GPU memory and ready |
| loading | Model is currently being loaded |
| unloaded | Model is not in memory |
| failed | Model failed to load |
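
The four status values lend themselves to a simple polling loop: keep checking while a model is loading and stop on any other value. The sketch below works on the raw JSON shape shown above. The names nextStep and waitUntilReady, the polling interval, and the attempt limit are all hypothetical choices, and getStatus is a caller-supplied function that returns the parsed response of GET /v1/models/status.

```typescript
type ModelStatus = "loaded" | "loading" | "unloaded" | "failed";

// Raw response shape of GET /v1/models/status, as shown above.
interface ModelStatusResponse {
  loaded: string[];
  loading_status: Record<string, ModelStatus>;
  repository: string;
}

type NextStep = "ready" | "wait" | "unavailable" | "failed";

// Maps a model's reported status to what a caller should do next.
// Models missing from loading_status are treated like unloaded ones.
function nextStep(resp: ModelStatusResponse, model: string): NextStep {
  switch (resp.loading_status[model]) {
    case "loaded":
      return "ready";
    case "loading":
      return "wait";
    case "failed":
      return "failed";
    default:
      return "unavailable";
  }
}

// Polls until the model leaves the "loading" state or attempts run out.
// Returns "wait" if the model was still loading after the last attempt.
async function waitUntilReady(
  getStatus: () => Promise<ModelStatusResponse>,
  model: string,
  intervalMs = 2000,
  maxAttempts = 30,
): Promise<NextStep> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const step = nextStep(await getStatus(), model);
    if (step !== "wait") return step;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return "wait";
}

// Sample data copied from the response above.
const sampleStatus: ModelStatusResponse = {
  loaded: ["qwen-3.5-27b", "whisper-large-v3-turbo"],
  loading_status: {
    "qwen-3.5-27b": "loaded",
    "whisper-large-v3-turbo": "loaded",
    gpt2: "unloaded",
  },
  repository: "/models",
};
console.log(nextStep(sampleStatus, "qwen-3.5-27b")); // prints: ready
console.log(nextStep(sampleStatus, "gpt2")); // prints: unavailable
```

Injecting getStatus keeps the loop independent of how the request is made, so the same function works with fetch or with any client that returns the raw response shape.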