Deferred Chat Completions

Deferred Chat Completions allow you to create a chat completion, get a request_id, and retrieve the response at a later time. The result can be retrieved exactly once within 24 hours, after which it is discarded.

[Figure: deferred chat completion flow]

When you send a deferred request to the xAI API, the response body contains a request ID, for example {'request_id': 'f15c114e-f47d-40ca-8d5c-8c23d656eeb6'}. Insert the request_id value into the deferred-completion endpoint path, https://api.x.ai/v1/chat/deferred-completion/{request_id}, and send a GET request to that URL to retrieve the deferred completion result.

Example code is provided below:
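
A minimal Python sketch of the retrieval step, using only the standard library. It assumes you already hold a request_id from a deferred request. Several details are assumptions not stated in this section: that the API key is sent as a Bearer token, that the endpoint answers 202 while the result is still being processed and 200 once it is ready, and the helper names deferred_url, http_get, and fetch_deferred, which are illustrative only.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.x.ai/v1"


def deferred_url(request_id):
    """Build the retrieval URL by inserting request_id into the endpoint path."""
    return f"{BASE_URL}/chat/deferred-completion/{request_id}"


def http_get(url, api_key):
    """GET a URL; return (status_code, parsed JSON body or None).

    Assumption: the key is passed as a Bearer token in the Authorization header.
    The body is parsed only on 200, since other statuses may carry no JSON.
    """
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})
    with urllib.request.urlopen(req) as resp:
        raw = resp.read()
        body = json.loads(raw) if resp.status == 200 else None
        return resp.status, body


def fetch_deferred(request_id, get, interval=2.0, max_attempts=30, sleep=time.sleep):
    """Poll the deferred-completion endpoint until the result is ready.

    `get` is any callable mapping a URL to (status, body).
    Assumption: 202 means "still processing", 200 carries the completion.
    """
    url = deferred_url(request_id)
    for _ in range(max_attempts):
        status, body = get(url)
        if status == 200:
            # The result can be requested only once, so the caller should
            # persist the returned body rather than fetch it again.
            return body
        if status != 202:
            raise RuntimeError(f"unexpected status {status} from {url}")
        sleep(interval)
    raise TimeoutError(f"{request_id} not ready after {max_attempts} attempts")
```

A call might look like fetch_deferred(request_id, lambda url: http_get(url, api_key)), with api_key read from wherever you keep your credentials. Passing the transport in as a callable keeps the polling logic testable without network access.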

The response body has the same structure as a non-deferred chat completion response:

{
    "id": "c0161816-8b53-4c28-bd2b-3877c6edb800",
    "object": "chat.completion",
    "created": 3141592653,
    "model": "grok-3-beta",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Hey, don't ask me about math, I'm Zaphod Beeblebrox, not a calculator! But if you really need to know, it's 42, isn't it? Everything's 42!",
                "refusal": null
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 27,
        "completion_tokens": 48,
        "reasoning_tokens": 0,
        "total_tokens": 75,
        "prompt_tokens_details": {
            "text_tokens": 27,
            "audio_tokens": 0,
            "image_tokens": 0,
            "cached_tokens": 0
        }
    },
    "system_fingerprint": "fp_fe9e7ef66e"
}
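
As a small illustration of working with this body, the sketch below pulls out the assistant text, the finish reason, and the token total. The JSON is an abbreviated copy of the response above (the message content is shortened), and the summarize helper is a hypothetical name, not part of any xAI SDK.

```python
import json

# Abbreviated copy of the response body shown above; content is shortened.
body = json.loads("""
{
    "id": "c0161816-8b53-4c28-bd2b-3877c6edb800",
    "model": "grok-3-beta",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Everything's 42!", "refusal": null},
            "finish_reason": "stop"
        }
    ],
    "usage": {"prompt_tokens": 27, "completion_tokens": 48, "total_tokens": 75}
}
""")


def summarize(completion):
    """Return the first choice's text, its finish reason, and the token total."""
    choice = completion["choices"][0]
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": completion["usage"]["total_tokens"],
    }


summary = summarize(body)
print(summary["finish_reason"], summary["total_tokens"])  # prints: stop 75
```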

For more details, refer to Chat completions and Get deferred chat completions in our REST API Reference.
