GPU

Monitor GPU utilization and manage GPU memory on edge nodes.

Edge endpoints generally accept your pg_* API key as a bearer token; see Authentication for details. The /v1/gpu/* endpoints on this page are an exception and need an operator credential, as noted below. The cURL example below pins Toronto (yto-01) for concreteness. Substitute another region, or discover the fastest one via GET https://autorouter.polargrid.ai/v1/route. See API Overview for both patterns.
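To make the region substitution concrete, here is a minimal sketch that builds an edge base URL from a region code. The helper name `edgeBaseUrl` is hypothetical, and the hostname pattern is inferred from the yto-01 example on this page, so treat it as an assumption for other regions and confirm it against API Overview.

```typescript
// Hypothetical helper. The hostname pattern is inferred from the yto-01
// example in this document and is an assumption for any other region.
function edgeBaseUrl(region: string): string {
  // Reject anything that could not be a plain hostname label.
  if (!/^[a-z0-9-]+$/.test(region)) {
    throw new Error(`Unexpected region code: ${region}`);
  }
  return `https://api.${region}.edge.polargrid.ai`;
}

// Build the GPU status URL for the Toronto region used in the examples.
const gpuStatusUrl = `${edgeBaseUrl("yto-01")}/v1/gpu/status`;
console.log(gpuStatusUrl);
```

Validating the region before interpolating it keeps a malformed value from silently producing a request to an unintended host.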

Operator-only. /v1/gpu/* requires a superadmin-scoped credential, issued only to PolarGrid operators. Standard pg_* API keys receive 403 Forbidden. These endpoints act on node-global, multi-tenant GPU state — they are documented for operator reference, not customer inference traffic.

Get GPU Status

GET /v1/gpu/status

Get detailed GPU status, including per-GPU memory, utilization, and temperature, plus the processes currently holding GPU memory.

Example

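# /v1/gpu/* is operator-only: send a superadmin-scoped credential as the bearer token.
# Standard pg_* API keys receive 403 Forbidden.
# POLARGRID_OPERATOR_TOKEN is a placeholder environment variable name, not an official one.
curl https://api.yto-01.edge.polargrid.ai/v1/gpu/status \
  -H "Authorization: Bearer $POLARGRID_OPERATOR_TOKEN"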

Response

{
  "status": "success",
  "timestamp": "2025-01-29T12:00:00Z",
  "gpus": [
    {
      "index": 0,
      "name": "NVIDIA A100 80GB",
      "memory": {
        "total_mb": 81920,
        "used_mb": 45056,
        "free_mb": 36864,
        "total_gb": 80.0,
        "used_gb": 44.0,
        "free_gb": 36.0,
        "percent_used": 55.0
      },
      "utilization": {
        "gpu_percent": 72,
        "memory_percent": 55
      },
      "temperature_c": 65
    }
  ],
  "processes": [
    {
      "pid": 12345,
      "name": "python",
      "memory_mb": 40960
    }
  ],
  "total_gpus": 1
}
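
As a rough illustration of consuming this response, the sketch below scans each GPU against alert limits. The interfaces are inferred from the sample response above rather than taken from an official schema, and `findGpuWarnings`, `GpuLimits`, and the threshold values are hypothetical.

```typescript
// Shapes inferred from the sample response; not an official schema.
interface GpuInfo {
  index: number;
  name: string;
  memory: { total_gb: number; used_gb: number; free_gb: number; percent_used: number };
  utilization: { gpu_percent: number; memory_percent: number };
  temperature_c: number;
}

interface GpuStatusResponse {
  status: string;
  gpus: GpuInfo[];
  total_gpus: number;
}

// Hypothetical alert limits; choose values that suit your hardware.
interface GpuLimits {
  maxTemperatureC: number;
  maxMemoryPercent: number;
}

// Returns one human-readable warning per exceeded limit.
function findGpuWarnings(status: GpuStatusResponse, limits: GpuLimits): string[] {
  const warnings: string[] = [];
  for (const gpu of status.gpus) {
    if (gpu.temperature_c > limits.maxTemperatureC) {
      warnings.push(`GPU ${gpu.index} (${gpu.name}): temperature ${gpu.temperature_c}C`);
    }
    if (gpu.memory.percent_used > limits.maxMemoryPercent) {
      warnings.push(`GPU ${gpu.index} (${gpu.name}): memory ${gpu.memory.percent_used}% used`);
    }
  }
  return warnings;
}

// Trimmed copy of the sample response above.
const sampleStatus: GpuStatusResponse = {
  status: "success",
  gpus: [
    {
      index: 0,
      name: "NVIDIA A100 80GB",
      memory: { total_gb: 80.0, used_gb: 44.0, free_gb: 36.0, percent_used: 55.0 },
      utilization: { gpu_percent: 72, memory_percent: 55 },
      temperature_c: 65,
    },
  ],
  total_gpus: 1,
};

const gpuWarnings = findGpuWarnings(sampleStatus, { maxTemperatureC: 80, maxMemoryPercent: 50 });
console.log(gpuWarnings);
```

With these limits the sample GPU is under the temperature limit (65 vs 80) but over the memory limit (55% vs 50%), so a single memory warning is reported.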

Get GPU Memory

GET /v1/gpu/memory

Get simplified GPU memory information.

Example

// Assumes `client` is an initialized PolarGrid SDK client using a superadmin-scoped credential.
const memory = await client.getGPUMemory();

memory.memory.forEach((gpu, i) => {
  console.log(`GPU ${i}: ${gpu.usedGb}GB / ${gpu.totalGb}GB (${gpu.percentUsed}%)`);
});

Response

{
  "status": "success",
  "timestamp": "2025-01-29T12:00:00Z",
  "memory": [
    {
      "total_gb": 80.0,
      "used_gb": 44.0,
      "free_gb": 36.0,
      "percent_used": 55.0,
      "percent_free": 45.0
    }
  ]
}
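
One plausible use of this response is checking whether a GPU has room for a workload before loading it. The sketch below works on the raw JSON shown above; the interfaces are inferred from that sample, and `gpusWithFreeMemory` is a hypothetical helper, not part of any SDK.

```typescript
// Shapes inferred from the sample response; not an official schema.
interface GpuMemoryEntry {
  total_gb: number;
  used_gb: number;
  free_gb: number;
  percent_used: number;
  percent_free: number;
}

interface GpuMemoryResponse {
  status: string;
  timestamp: string;
  memory: GpuMemoryEntry[];
}

// Returns the positions in `memory` whose free memory covers `requiredGb`.
function gpusWithFreeMemory(response: GpuMemoryResponse, requiredGb: number): number[] {
  const fits: number[] = [];
  response.memory.forEach((gpu, i) => {
    if (gpu.free_gb >= requiredGb) {
      fits.push(i);
    }
  });
  return fits;
}

// Copy of the sample response above.
const sampleMemory: GpuMemoryResponse = {
  status: "success",
  timestamp: "2025-01-29T12:00:00Z",
  memory: [
    { total_gb: 80.0, used_gb: 44.0, free_gb: 36.0, percent_used: 55.0, percent_free: 45.0 },
  ],
};

console.log(gpusWithFreeMemory(sampleMemory, 30)); // [ 0 ]
console.log(gpusWithFreeMemory(sampleMemory, 40)); // []
```

The array position is used as the GPU identifier because the memory entries carry no explicit index field in the sample.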

Purge GPU Memory

POST /v1/gpu/purge

Unload all models and clear GPU memory cache.

Request Body

Parameter	Type	Required	Default	Description
`force`	boolean	No	false	Force purge even if models are in use

Example

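// Assumes `client` is an initialized PolarGrid SDK client using a superadmin-scoped credential.
// force: false (the default) does not force a purge while models are in use.
const result = await client.purgeGPU({ force: false });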

console.log(`Freed ${result.memoryFreedGb}GB`);
console.log(`Unloaded models:`, result.modelsUnloaded);
console.log(result.recommendation);

Response

{
  "status": "success",
  "timestamp": "2025-01-29T12:00:00Z",
  "actions": ["unloaded qwen-3.5-27b", "cleared CUDA cache"],
  "memory_before": {
    "used_gb": 44.0,
    "total_gb": 80.0,
    "percent_used": 55.0
  },
  "memory_after": {
    "used_gb": 2.1,
    "total_gb": 80.0,
    "percent_used": 2.6
  },
  "memory_freed_gb": 41.9,
  "models_unloaded": ["qwen-3.5-27b"],
  "errors": [],
  "recommendation": "GPU memory cleared successfully"
}
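
Since a purge affects node-global state, it can be worth sanity-checking the result before acting on it. The sketch below is one way to do that against the raw JSON above. The interfaces are inferred from the sample (including the guess that `errors` holds strings), and `checkPurge` and its tolerance are hypothetical.

```typescript
// Shapes inferred from the sample response; not an official schema.
interface MemorySnapshot {
  used_gb: number;
  total_gb: number;
  percent_used: number;
}

interface PurgeResponse {
  status: string;
  actions: string[];
  memory_before: MemorySnapshot;
  memory_after: MemorySnapshot;
  memory_freed_gb: number;
  models_unloaded: string[];
  errors: string[]; // Assumption: the sample only shows an empty array.
  recommendation: string;
}

interface PurgeCheck {
  ok: boolean;            // status is "success" and no errors were reported
  freedMatches: boolean;  // reported freed memory agrees with before/after
  computedFreedGb: number;
}

// Compares the reported freed memory with the before/after snapshots,
// allowing a small tolerance for rounding in the reported values.
function checkPurge(response: PurgeResponse, toleranceGb: number = 0.05): PurgeCheck {
  const computedFreedGb = response.memory_before.used_gb - response.memory_after.used_gb;
  return {
    ok: response.status === "success" && response.errors.length === 0,
    freedMatches: Math.abs(computedFreedGb - response.memory_freed_gb) <= toleranceGb,
    computedFreedGb,
  };
}

// Copy of the sample response above (timestamp omitted).
const samplePurge: PurgeResponse = {
  status: "success",
  actions: ["unloaded qwen-3.5-27b", "cleared CUDA cache"],
  memory_before: { used_gb: 44.0, total_gb: 80.0, percent_used: 55.0 },
  memory_after: { used_gb: 2.1, total_gb: 80.0, percent_used: 2.6 },
  memory_freed_gb: 41.9,
  models_unloaded: ["qwen-3.5-27b"],
  errors: [],
  recommendation: "GPU memory cleared successfully",
};

const purgeCheck = checkPurge(samplePurge);
console.log(purgeCheck.ok, purgeCheck.freedMatches);
```

For the sample response both checks pass: there are no errors, and 44.0 GB minus 2.1 GB agrees with the reported 41.9 GB within the tolerance.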