Chat Completions

POST
/v1/chat/completions
logprobs? boolean | null

Whether to return log probabilities of the output tokens. If true, the log probability of each output token is returned in the 'content' of 'message'.
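
For example, here is a sketch of a request that enables log probabilities (the model value is a placeholder; substitute a model ID available on your endpoint). The per-token values come back in each choice's logprobs object:

# "model" is a placeholder; "logprobs": true adds a logprobs object to each choice
curl -X POST "https://oai.endpoints.kepler.ai.cloud.ovh.net/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"content": "Say hello", "role": "user"}],
    "model": "string",
    "logprobs": true
  }'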

max_tokens? integer | null

The maximum number of completion tokens to generate. Deprecated, but still used by vLLM.

Format: int32
Range: 0 <= value
messages

A list of messages comprising the conversation so far.

model string

ID of the model to use.

parallel_tool_calls? boolean | null

Whether to enable parallel function calling during tool use.

response_format? null | …
seed? integer | null

This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the backend.

Format: int32
Range: 0 <= value
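
As a sketch of best-effort reproducibility (the seed value 42 is arbitrary), send the same request twice and compare the system_fingerprint fields of the two responses:

# repeat this identical request and compare system_fingerprint across responses
curl -X POST "https://oai.endpoints.kepler.ai.cloud.ovh.net/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"content": "string", "role": "user"}],
    "model": "string",
    "seed": 42
  }'
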
stop? null | …
stream? boolean | null

If set, partial messages will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message.
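
A streaming request can be tested from the command line, as in this sketch (curl's -N flag disables output buffering so events print as they arrive; the stream ends with a data: [DONE] event):

# -N disables buffering; read data-only SSE lines until "data: [DONE]"
curl -N -X POST "https://oai.endpoints.kepler.ai.cloud.ovh.net/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"content": "string", "role": "user"}],
    "model": "string",
    "stream": true
  }'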

stream_options? null | …
temperature? number | null

What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

Format: float
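
For example, a sketch of a request biased toward focused output (0.2 is just one of the low values mentioned above):

# a low temperature such as 0.2 concentrates sampling on high-probability tokens
curl -X POST "https://oai.endpoints.kepler.ai.cloud.ovh.net/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"content": "string", "role": "user"}], "model": "string", "temperature": 0.2}'
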
tool_choice? null | …
tools? … | null

A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for.
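
A minimal sketch of a function tool, assuming the OpenAI-style tool schema (the get_weather function and its parameters are hypothetical):

# get_weather is a hypothetical function the model may emit JSON arguments for
curl -X POST "https://oai.endpoints.kepler.ai.cloud.ovh.net/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"content": "What is the weather in Paris?", "role": "user"}],
    "model": "string",
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }]
  }'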

top_logprobs? integer | null

An integer between 0 and 5 specifying the number of most likely tokens to return at each token position, each with an associated log probability. 'logprobs' must be set to 'true' if this parameter is used.

Format: int32
Range: 0 <= value
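
Because top_logprobs only takes effect when logprobs is true, a request must set both, as in this sketch (3 is an arbitrary value in the allowed range):

# logprobs must be set to true when top_logprobs is used
curl -X POST "https://oai.endpoints.kepler.ai.cloud.ovh.net/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"content": "string", "role": "user"}], "model": "string", "logprobs": true, "top_logprobs": 3}'
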
top_p? number | null

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

Format: float
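
For example, a sketch restricting sampling to the top 10% of the probability mass (as a rule of thumb, adjust top_p or temperature, but generally not both):

# top_p 0.1 keeps only tokens within the top 10% cumulative probability mass
curl -X POST "https://oai.endpoints.kepler.ai.cloud.ovh.net/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"content": "string", "role": "user"}], "model": "string", "top_p": 0.1}'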

Response Body

application/json

curl -X POST "https://oai.endpoints.kepler.ai.cloud.ovh.net/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "content": "string",
        "role": "system"
      }
    ],
    "model": "string"
  }'
{
  "choices": [
    {
      "finish_reason": {},
      "index": 0,
      "logprobs": {},
      "message": {
        "content": "string",
        "name": "string",
        "role": "system"
      }
    }
  ],
  "created": 0,
  "id": "string",
  "model": "string",
  "object": "string",
  "usage": {}
}