Integrations
Use OVHcloud AI Endpoints models with our partner integrations. Feel free to send us a message on our Discord if you have any ideas, suggestions or requests.
Information
To create an API key, you can navigate to the OVHcloud Control Panel, in Public Cloud > AI Endpoints > API keys.
LiteLLM
LiteLLM is a Python library that simplifies using Large Language Models (LLMs) by providing a unified interface across AI providers.
Install LiteLLM via pip:
pip install litellm

Basic Usage
The recommended method to configure your API key is using environment variables:
import os

# Set your API key via environment variable
os.environ['OVHCLOUD_API_KEY'] = "your-api-key"
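If you prefer, the key can also be passed per call; litellm.completion accepts an api_key argument, though the environment variable above remains the recommended approach:

from litellm import completion

# Per-call key (overrides the environment variable for this request)
response = completion(
    model="ovhcloud/Meta-Llama-3_3-70B-Instruct",
    messages=[{"role": "user", "content": "Hello!"}],
    api_key="your-api-key"
)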
Here's a simple usage example:

from litellm import completion

response = completion(
    model="ovhcloud/Meta-Llama-3_3-70B-Instruct",
    messages=[
        {
            "role": "user",
            "content": "What's the capital of France?"
        }
    ],
    max_tokens=100,
    temperature=0.7
)

print(response.choices[0].message.content)

Advanced Features
Response Streaming
For applications requiring real-time responses, use streaming:
from litellm import completion

response = completion(
    model="ovhcloud/Meta-Llama-3_3-70B-Instruct",
    messages=[
        {
            "role": "user",
            "content": "Write me a short story about a robot learning to cook."
        }
    ],
    max_tokens=500,
    temperature=0.8,
    stream=True  # Enable streaming
)

# Progressive display of the response
for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end='', flush=True)
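LiteLLM also exposes an async variant, acompletion, which is convenient for streaming inside async applications. A minimal sketch:

import asyncio
from litellm import acompletion

async def main():
    response = await acompletion(
        model="ovhcloud/Meta-Llama-3_3-70B-Instruct",
        messages=[{"role": "user", "content": "Write a haiku about clouds."}],
        stream=True
    )
    # Chunks arrive as an async iterator
    async for chunk in response:
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end='', flush=True)

asyncio.run(main())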
Function Calling (or Tool Calling)
LiteLLM supports function calling with AI Endpoints models that support tool calling:
from litellm import completion
import json

def get_current_weather(location, unit="celsius"):
    """Simulated function to get the weather"""
    if unit == "celsius":
        return {"location": location, "temperature": "22", "unit": "celsius"}
    else:
        return {"location": location, "temperature": "72", "unit": "fahrenheit"}

# Define available tools
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and country, e.g. Paris, France"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"]
                    }
                },
                "required": ["location"]
            }
        }
    }
]

# First call to get the tool usage decision
response = completion(
    model="ovhcloud/Meta-Llama-3_3-70B-Instruct",
    messages=[{"role": "user", "content": "What's the weather like in Paris?"}],
    tools=tools,
    tool_choice="auto"
)

# Process tool calls
if response.choices[0].message.tool_calls:
    tool_call = response.choices[0].message.tool_calls[0]
    function_args = json.loads(tool_call.function.arguments)

    # Execute the function
    result = get_current_weather(
        location=function_args.get("location"),
        unit=function_args.get("unit", "celsius")
    )
    print(f"Tool result: {result}")
Vision and Image Analysis
For models supporting vision capabilities:
from base64 import b64encode
from mimetypes import guess_type

import litellm

def encode_image(file_path):
    """Encode an image to a base64 data URL for the API"""
    mime_type, _ = guess_type(file_path)
    if mime_type is None:
        raise ValueError("Could not determine MIME type of the file")
    with open(file_path, "rb") as image_file:
        encoded_string = b64encode(image_file.read()).decode("utf-8")
    data_url = f"data:{mime_type};base64,{encoded_string}"
    return data_url
# Image analysis
response = litellm.completion(
    model="ovhcloud/Mistral-Small-3.2-24B-Instruct-2506",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What do you see in this image?"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": encode_image("my_image.jpg"),
                        "format": "image/jpeg"
                    }
                }
            ]
        }
    ],
    stream=False
)

print(response.choices[0].message.content)

Structured Output (JSON Schema)
To get responses in a structured format:
from litellm import completion

response = completion(
    model="ovhcloud/Meta-Llama-3_3-70B-Instruct",
    messages=[
        {
            "role": "system",
            "content": "You are a specialist in extracting structured data from unstructured text."
        },
        {
            "role": "user",
            "content": "Room 12 contains books, a desk, and a lamp."
        }
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "data_extraction",
            "schema": {
                "type": "object",
                "properties": {
                    "room": {"type": "string"},
                    "items": {
                        "type": "array",
                        "items": {"type": "string"}
                    }
                },
                "required": ["room", "items"],
                "additionalProperties": False
            },
            "strict": False
        }
    }
)

print(response.choices[0].message.content)
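Since the model returns the structured result as a JSON string in the message content, parse it before use:

import json

data = json.loads(response.choices[0].message.content)
print(data["room"], data["items"])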
Embeddings
To generate embeddings with compatible models:
from litellm import embedding

response = embedding(
    model="ovhcloud/BGE-M3",
    input=["sample text to embed", "another sample text to embed"]
)

print(response.data)
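A common next step is comparing the two vectors, for example with cosine similarity. A small illustrative sketch (assuming each entry of response.data carries its vector under the "embedding" key, as in the OpenAI response format):

import math

a = response.data[0]["embedding"]
b = response.data[1]["embedding"]

# Cosine similarity: dot product divided by the product of the norms
dot = sum(x * y for x, y in zip(a, b))
norm_a = math.sqrt(sum(x * x for x in a))
norm_b = math.sqrt(sum(x * x for x in b))
print(f"Cosine similarity: {dot / (norm_a * norm_b):.4f}")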
Using LiteLLM Proxy Server
Proxy Server Configuration
For production deployments, you can use the LiteLLM proxy server:
Install LiteLLM proxy:
pip install 'litellm[proxy]'

Create a config.yaml file:
model_list:
  - model_name: my-llama
    litellm_params:
      model: ovhcloud/Meta-Llama-3_3-70B-Instruct
      api_key: your-ovh-api-key
  - model_name: my-mistral
    litellm_params:
      model: ovhcloud/Mistral-Small-3.2-24B-Instruct-2506
      api_key: your-ovh-api-key
  - model_name: my-embedding
    litellm_params:
      model: ovhcloud/BGE-M3
      api_key: your-ovh-api-key

Start the proxy server:

litellm --config /path/to/config.yaml --port 4000

The proxy server is now live and serving your models.
Using the Proxy
Once the proxy is running, use it like a standard OpenAI API:
import openai

client = openai.OpenAI(
    api_key="sk-1234",  # LiteLLM proxy key
    base_url="http://localhost:4000"  # Proxy URL
)

response = client.chat.completions.create(
    model="my-llama",
    messages=[
        {
            "role": "user",
            "content": "What is OVHcloud?"
        }
    ]
)

print(response.choices[0].message.content)
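Because the proxy exposes the standard OpenAI-compatible API, the embedding model declared in config.yaml can be queried through the same client (a sketch using the my-embedding alias defined above):

embeddings_response = client.embeddings.create(
    model="my-embedding",
    input=["sample text to embed"]
)
print(len(embeddings_response.data[0].embedding))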
Pydantic AI
Pydantic AI is a Python agent framework designed to help you quickly, confidently, and painlessly build production-grade applications and workflows with Generative AI.
Pydantic AI is available on PyPI as pydantic-ai, so installation is as simple as:

pip install pydantic-ai

You can then set the OVHCLOUD_API_KEY environment variable and use OVHcloudProvider by name:
from pydantic_ai import Agent

agent = Agent('ovhcloud:gpt-oss-120b')
result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.

If you need to configure the provider, you can use the OVHcloudProvider class:
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.ovhcloud import OVHcloudProvider

model = OpenAIChatModel(
    'gpt-oss-120b',
    provider=OVHcloudProvider(api_key='your-api-key'),
)
agent = Agent(model)

result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.
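Typed, validated output is Pydantic AI's main draw. As a sketch, assuming a recent release where the parameter is named output_type (older versions used result_type), you can bind a Pydantic model to the agent:

from pydantic import BaseModel
from pydantic_ai import Agent

class CityInfo(BaseModel):
    city: str
    country: str

agent = Agent('ovhcloud:gpt-oss-120b', output_type=CityInfo)
result = agent.run_sync('Tell me about the capital of France.')
print(result.output)
#> city='Paris' country='France'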
Kilo Code
Kilo Code accelerates development with AI-driven code generation and task automation. This open-source extension plugs directly into VS Code.
First, go to your IDE's marketplace and search for "Kilo Code".
- Visual Studio Code: https://marketplace.visualstudio.com/items?itemName=kilocode.Kilo-Code
- JetBrains: https://plugins.jetbrains.com/plugin/28350-kilo-code
- Cursor: cursor:extension/kilocode.kilo-code
Once Kilo Code is installed, open the extension and click Use your own API key. Then search for OVHcloud AI Endpoints under API Provider and enter your OVHcloud AI Endpoints API key.
You can then select the model of your choice from the drop-down list, which automatically fetches the latest available models.
Apache Airflow Provider
This package provides Apache Airflow integration with OVHcloud AI products, in particular AI Endpoints.
Installation
pip install apache-airflow-provider-ovhcloud-ai

Configuration
Create an Airflow Connection
Navigate to Admin > Connections in the Airflow UI and create a new connection:
- Connection Id: ovh_ai_endpoints_default (or your custom name)
- Connection Type: generic
- Password: Your OVHcloud AI Endpoints API token
Alternatively, use the Airflow CLI:
airflow connections add ovh_ai_endpoints_default \
    --conn-type generic \
    --conn-password your-api-token-here

Or set via environment variable:

export AIRFLOW_CONN_OVH_AI_ENDPOINTS_DEFAULT='{"password":"your-api-token-here"}'

Usage
Chat Completion Example
from airflow import DAG
from apache_airflow_provider_ovhcloud_ai.operators.ai_endpoints import OVHCloudAIEndpointsChatCompletionsOperator
from datetime import datetime

with DAG(
    dag_id='ovh_ai_chat_example',
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    chat_task = OVHCloudAIEndpointsChatCompletionsOperator(
        task_id='generate_response',
        model='gpt-oss-120b',
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Explain what Apache Airflow is in one sentence."}
        ],
        temperature=0.7,
        max_tokens=100,
    )
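The response is pushed to XCom under Airflow's default return_value key, assuming this provider follows the usual operator convention, so a downstream task can consume it. A minimal sketch, continuing inside the with DAG block above:

    # Continuing inside the `with DAG(...)` block above
    from airflow.operators.python import PythonOperator

    def show_response(**context):
        # Pull the chat response the upstream task pushed to XCom
        print(context["ti"].xcom_pull(task_ids="generate_response"))

    print_task = PythonOperator(
        task_id="show_response",
        python_callable=show_response,
    )

    chat_task >> print_task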
Embedding Example

from airflow import DAG
from apache_airflow_provider_ovhcloud_ai.operators.ai_endpoints import OVHCloudAIEndpointsEmbeddingOperator
from datetime import datetime

with DAG(
    dag_id='ovh_ai_embedding_example',
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    embed_task = OVHCloudAIEndpointsEmbeddingOperator(
        task_id='create_embeddings',
        model='BGE-M3',
        input=[
            "Apache Airflow is a workflow orchestration tool",
            "OVHcloud provides cloud computing services"
        ],
    )

Using the Hook Directly
from apache_airflow_provider_ovhcloud_ai.hooks.ai_endpoints import OVHCloudAIEndpointsHook

def my_custom_function(**context):
    hook = OVHCloudAIEndpointsHook(ovh_conn_id='ovh_ai_endpoints_default')

    # Chat completion
    response = hook.chat_completion(
        model='gpt-oss-120b',
        messages=[{"role": "user", "content": "Hello!"}],
        temperature=0.8,
    )
    print(response['choices'][0]['message']['content'])

    # Create embeddings
    embeddings = hook.create_embedding(
        model='BGE-M3',
        input="Text to embed"
    )
    print(embeddings['data'][0]['embedding'])

Dynamic Templating
Operators support Jinja templating for dynamic values:
from airflow import DAG
from apache_airflow_provider_ovhcloud_ai.operators.ai_endpoints import OVHCloudAIEndpointsChatCompletionsOperator
from datetime import datetime

with DAG(
    dag_id='ovh_ai_templating_example',
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    chat_task = OVHCloudAIEndpointsChatCompletionsOperator(
        task_id='templated_chat',
        model='{{ var.value.llm_model }}',  # From Airflow Variables
        messages=[
            {
                "role": "user",
                "content": "Process data from {{ ds }}"  # Execution date
            }
        ],
    )