Model Loading
PolarGrid can load models into GPU memory on edge nodes and unload them again at runtime.
Operator-only. /v1/models/load, /v1/models/unload, and /v1/models/unload-all require a superadmin-scoped credential, issued only to PolarGrid operators. Standard pg_* API keys receive 403 Forbidden. Models available for inference are pre-deployed across edge regions — you do not need to load models yourself. Use GET /v1/models to see which models are available in your region.
Edge endpoints take a bearer token in the Authorization header. For the load and unload endpoints on this page, that token must be the superadmin-scoped credential, not a standard pg_* API key. See Authentication for details. The cURL example below pins Toronto (yto-01) for concreteness; substitute another region or discover the fastest one via GET https://autorouter.polargrid.ai/v1/route. See API Overview for both patterns.
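The regional host used on this page follows a visible pattern, `api.<region>.edge.polargrid.ai`. A minimal sketch of building that base URL from a region code is shown here. The helper name `edgeBaseUrl` is hypothetical, and the region format check (lowercase letters, a dash, digits) is an assumption based only on the single `yto-01` example.

```typescript
// Hypothetical helper: builds the edge base URL for a region code,
// following the host pattern seen in this page's cURL example.
function edgeBaseUrl(region: string): string {
  // Assumption: region codes look like "yto-01" (letters, dash, digits).
  if (!/^[a-z]+-\d+$/.test(region)) {
    throw new Error("Unexpected region code: " + region);
  }
  return "https://api." + region + ".edge.polargrid.ai";
}

// Example: the load endpoint in the Toronto region.
const torontoLoadUrl: string = edgeBaseUrl("yto-01") + "/v1/models/load";
console.log(torontoLoadUrl);
```

Keeping the region in one place makes it easy to swap in whatever region the autorouter reports, without editing each request URL by hand.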
Load Model
POST /v1/models/load
Load a model into GPU memory.
Request Body
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| model_name | string | Yes | - | Model ID to load |
| force_reload | boolean | No | false | Force reload even if already loaded |
Example
# Requires a superadmin-scoped credential as the bearer token.
# A standard pg_* API key returns 403 Forbidden (see Authentication).
curl -X POST https://api.yto-01.edge.polargrid.ai/v1/models/load \
  -H "Authorization: Bearer YOUR_OPERATOR_CREDENTIAL" \
  -H "Content-Type: application/json" \
  -d '{
    "model_name": "qwen-3.5-27b",
    "force_reload": false
  }'
Response
{
  "status": "success",
  "model": "qwen-3.5-27b",
  "force_reload": false,
  "message": "Model qwen-3.5-27b loaded successfully"
}
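The load request can also be prepared in TypeScript. The sketch below only builds the request (URL, headers, JSON body) from the documented parameters. The names `buildLoadRequest` and `PreparedRequest` are hypothetical and not part of any PolarGrid SDK, and the token shown is a placeholder for an operator credential.

```typescript
// Request body fields as documented in the parameter table.
interface LoadModelBody {
  model_name: string;
  force_reload: boolean;
}

// Hypothetical container for a request that has not been sent yet.
interface PreparedRequest {
  url: string;
  method: "POST";
  headers: { [name: string]: string };
  body: string;
}

// Builds the POST /v1/models/load request; force_reload defaults to false,
// matching the documented default.
function buildLoadRequest(
  baseUrl: string,
  token: string,
  modelName: string,
  forceReload: boolean = false
): PreparedRequest {
  const body: LoadModelBody = { model_name: modelName, force_reload: forceReload };
  return {
    url: baseUrl + "/v1/models/load",
    method: "POST",
    headers: {
      Authorization: "Bearer " + token,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(body),
  };
}

const loadRequest = buildLoadRequest(
  "https://api.yto-01.edge.polargrid.ai",
  "YOUR_OPERATOR_CREDENTIAL", // placeholder, not a real token format
  "qwen-3.5-27b"
);
console.log(loadRequest.url);
console.log(loadRequest.body);
```

The prepared object can be handed to any HTTP client, for example `fetch(loadRequest.url, loadRequest)` in a runtime that provides `fetch`.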
Unload Model
Unload a model from GPU memory.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| model_name | string | Yes | Model ID to unload (e.g., qwen-3.5-27b) |
Example
// `client` is assumed to be an already-initialized PolarGrid SDK client.
const result = await client.unloadModel({
  modelName: 'gpt2',
});
console.log(result.message);
Response
{
  "status": "success",
  "model": "gpt2",
  "message": "Model gpt2 unloaded successfully"
}
Unload All Models
POST /v1/models/unload-all
Unload all models from GPU memory.
Example
const result = await client.unloadAllModels();
console.log(`Unloaded ${result.totalUnloaded} models`);
console.log('Models:', result.unloadedModels);
Response
{
  "status": "success",
  "unloaded_models": ["qwen-3.5-27b", "whisper-large-v3-turbo"],
  "errors": [],
  "total_unloaded": 2
}
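The raw JSON uses snake_case fields (`unloaded_models`, `total_unloaded`), while the SDK example reads camelCase properties (`unloadedModels`, `totalUnloaded`). The sketch below shows one way that mapping could look when calling the endpoint without the SDK. The function names are hypothetical, the element shape of `errors` is not documented on this page and is typed as `unknown`, and treating a non-empty `errors` array as a failure is an assumption.

```typescript
// Raw response shape, as shown in the JSON example.
interface UnloadAllRaw {
  status: string;
  unloaded_models: string[];
  errors: unknown[]; // element shape is not documented on this page
  total_unloaded: number;
}

// camelCase shape matching the property names used in the SDK example.
interface UnloadAllResult {
  status: string;
  unloadedModels: string[];
  errors: unknown[];
  totalUnloaded: number;
}

// Hypothetical mapper from the raw JSON to the camelCase shape.
function toUnloadAllResult(raw: UnloadAllRaw): UnloadAllResult {
  return {
    status: raw.status,
    unloadedModels: raw.unloaded_models,
    errors: raw.errors,
    totalUnloaded: raw.total_unloaded,
  };
}

// Assumption: a clean run reports "success" and an empty errors array.
function unloadAllSucceeded(result: UnloadAllResult): boolean {
  return result.status === "success" && result.errors.length === 0;
}

const rawUnloadAll: UnloadAllRaw = {
  status: "success",
  unloaded_models: ["qwen-3.5-27b", "whisper-large-v3-turbo"],
  errors: [],
  total_unloaded: 2,
};
const convertedUnloadAll = toUnloadAllResult(rawUnloadAll);
console.log("Unloaded " + convertedUnloadAll.totalUnloaded + " models");
```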
Get Model Status
Get the loading status of all models.
Example
const status = await client.getModelStatus();
console.log('Loaded models:', status.loaded);
console.log('Status:', status.loadingStatus);
Response
{
  "loaded": ["qwen-3.5-27b", "whisper-large-v3-turbo"],
  "loading_status": {
    "qwen-3.5-27b": "loaded",
    "whisper-large-v3-turbo": "loaded",
    "gpt2": "unloaded"
  },
  "repository": "/models"
}
Status Values
| Status | Description |
|---|---|
| loaded | Model is in GPU memory and ready |
| loading | Model is currently being loaded |
| unloaded | Model is not in memory |
| failed | Model failed to load |
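The four status values map naturally onto a TypeScript union type. The sketch below types the raw status response shown above and filters models by status. The type and function names are hypothetical, and the sample data is copied from the response example rather than fetched from a live endpoint.

```typescript
// The four documented status values.
type ModelStatus = "loaded" | "loading" | "unloaded" | "failed";

// Raw response shape, as shown in the JSON example.
interface ModelStatusResponse {
  loaded: string[];
  loading_status: { [model: string]: ModelStatus };
  repository: string;
}

// Returns the names of all models currently in the given status, sorted.
function modelsWithStatus(resp: ModelStatusResponse, wanted: ModelStatus): string[] {
  const names: string[] = [];
  for (const model of Object.keys(resp.loading_status)) {
    if (resp.loading_status[model] === wanted) {
      names.push(model);
    }
  }
  return names.sort();
}

const sampleStatus: ModelStatusResponse = {
  loaded: ["qwen-3.5-27b", "whisper-large-v3-turbo"],
  loading_status: {
    "qwen-3.5-27b": "loaded",
    "whisper-large-v3-turbo": "loaded",
    "gpt2": "unloaded",
  },
  repository: "/models",
};

console.log("Ready:", modelsWithStatus(sampleStatus, "loaded"));
console.log("Failed:", modelsWithStatus(sampleStatus, "failed"));
```

Filtering on `failed` or `loading` in this way gives a quick check for models that need attention before routing inference traffic to a node.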