Completions
The completions endpoint generates text from a single prompt. For conversational use cases, prefer Chat Completions.

Create Completion
Request Body
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| prompt | string | Yes | — | The prompt to complete |
| model | string | No | gpt2 | Model ID |
| max_tokens | integer | No | 100 | Maximum tokens to generate (1-4096) |
| temperature | number | No | 0.7 | Sampling temperature (0.0-2.0) |
| top_p | number | No | 0.9 | Nucleus sampling (0.0-1.0) |
| top_k | integer | No | 50 | Top-k sampling |
| frequency_penalty | number | No | 0.0 | Frequency penalty (-2.0 to 2.0) |
| presence_penalty | number | No | 0.0 | Presence penalty (-2.0 to 2.0) |
| stop | array | No | — | Up to 4 stop sequences |
| user | string | No | — | End-user identifier |
Example Request
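A minimal sketch of building a request body from the parameters above; the endpoint path and prompt text here are illustrative assumptions, not taken from this document.

```python
import json

# Hypothetical endpoint path -- adjust to your deployment. (assumption)
ENDPOINT = "/v1/completions"

# Request body built from the parameters in the table above.
payload = {
    "prompt": "Once upon a time",  # illustrative prompt (assumption)
    "model": "gpt2",
    "max_tokens": 100,
    "temperature": 0.7,
    "top_p": 0.9,
    "stop": ["\n\n"],
}

# Serialize to JSON for the HTTP request body.
body = json.dumps(payload, indent=2)
print(body)
```

Only `prompt` is required; every other field falls back to the default listed in the table when omitted.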
Response
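The shape below is a sketch of what a successful response might contain; the field names (`id`, `choices`, `text`, `finish_reason`, `usage`) follow common completion-API conventions and are assumptions, not confirmed by this document.

```python
import json

# Illustrative response body; field names are assumptions modeled on
# common completion APIs, not confirmed by this document.
raw = """
{
  "id": "cmpl-123",
  "model": "gpt2",
  "choices": [
    {"index": 0, "text": " there was a fox.", "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 4, "completion_tokens": 5, "total_tokens": 9}
}
"""

response = json.loads(raw)
# The generated text lives on the first choice.
completion = response["choices"][0]["text"]
print(completion)
```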
Streaming
Enable streaming for real-time token generation:

Legacy Generate Method
The SDKs also provide a generate() method for backward compatibility. It wraps chatCompletion() internally:
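A minimal sketch of how such a wrapper might look: the names generate() and chatCompletion() come from the text above, but the signatures, the message shape, and the stand-in client below are assumptions, not the SDK's actual implementation.

```python
# Stand-in for the SDK's chatCompletion(); a real client would call the
# Chat Completions endpoint over HTTP. (assumption)
def chat_completion(messages, **params):
    last = messages[-1]["content"]
    return {"choices": [{"message": {"role": "assistant",
                                     "content": f"echo: {last}"}}]}

def generate(prompt, **params):
    """Legacy entry point: wraps the prompt in a single user message,
    delegates to chat_completion(), and returns plain text."""
    result = chat_completion([{"role": "user", "content": prompt}], **params)
    return result["choices"][0]["message"]["content"]

print(generate("Hello"))
```

The wrapper's job is purely adaptation: a bare prompt string in, plain completion text out, with the chat message structure hidden from legacy callers.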
