GPU
Monitor GPU utilization and manage GPU memory on edge nodes.
Edge endpoints accept your pg_* API key as a bearer token. See Authentication for details. The cURL examples below pin Toronto (yto-01) for concreteness — substitute another region or discover the fastest one via GET https://autorouter.polargrid.ai/v1/route. See API Overview for both patterns.
Operator-only. /v1/gpu/* requires a superadmin-scoped credential, issued only to PolarGrid operators. Standard pg_* API keys receive 403 Forbidden. These endpoints act on node-global, multi-tenant GPU state — they are documented for operator reference, not customer inference traffic.
Get GPU Status
Get detailed GPU status including memory, utilization, and temperature.
Example
# Edge endpoints accept your pg_* API key as a bearer token — see Authentication
curl https://api.yto-01.edge.polargrid.ai/v1/gpu/status \
-H "Authorization: Bearer pg_your_api_key"
Response
{
"status": "success",
"timestamp": "2025-01-29T12:00:00Z",
"gpus": [
{
"index": 0,
"name": "NVIDIA A100 80GB",
"memory": {
"total_mb": 81920,
"used_mb": 45056,
"free_mb": 36864,
"total_gb": 80.0,
"used_gb": 44.0,
"free_gb": 36.0,
"percent_used": 55.0
},
"utilization": {
"gpu_percent": 72,
"memory_percent": 55
},
"temperature_c": 65
}
],
"processes": [
{
"pid": 12345,
"name": "python",
"memory_mb": 40960
}
],
"total_gpus": 1
}
Get GPU Memory
Get simplified GPU memory information.
Example
const memory = await client.getGPUMemory();
memory.memory.forEach((gpu, i) => {
console.log(`GPU ${i}: ${gpu.usedGb}GB / ${gpu.totalGb}GB (${gpu.percentUsed}%)`);
});
Response
{
"status": "success",
"timestamp": "2025-01-29T12:00:00Z",
"memory": [
{
"total_gb": 80.0,
"used_gb": 44.0,
"free_gb": 36.0,
"percent_used": 55.0,
"percent_free": 45.0
}
]
}
Purge GPU Memory
Unload all models and clear GPU memory cache.
Request Body
| Parameter | Type | Required | Default | Description |
|---|
force | boolean | No | false | Force purge even if models are in use |
Example
const result = await client.purgeGPU({ force: false });
console.log(`Freed ${result.memoryFreedGb}GB`);
console.log(`Unloaded models:`, result.modelsUnloaded);
console.log(result.recommendation);
Response
{
"status": "success",
"timestamp": "2025-01-29T12:00:00Z",
"actions": ["unloaded qwen-3.5-27b", "cleared CUDA cache"],
"memory_before": {
"used_gb": 44.0,
"total_gb": 80.0,
"percent_used": 55.0
},
"memory_after": {
"used_gb": 2.1,
"total_gb": 80.0,
"percent_used": 2.6
},
"memory_freed_gb": 41.9,
"models_unloaded": ["qwen-3.5-27b"],
"errors": [],
"recommendation": "GPU memory cleared successfully"
}