🚀 We are thrilled to announce the release of DeepSeek-R1-Distill-Llama-70B on AI Endpoints!
Distilled from DeepSeek-R1, an LLM with performance comparable to OpenAI-o1, this powerful model excels in math, coding, and reasoning tasks.
With AI Endpoints, you can integrate this model into your applications without needing extensive AI expertise. Our platform is designed with simplicity, security, and data privacy in mind, ensuring your projects are both innovative and safe.
As you will see in the demo below, DeepSeek-R1 lets you create AIs based on the chain-of-thought mechanism.
In short, the model builds its answer by breaking the question down into several blocks, just as a human would break a problem into several steps before answering.
You can see this reasoning at the beginning of the response, between the <think> tags.
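For instance, a raw response is shaped like this (the content below is purely illustrative, not an actual model output):

<think>
The user asks what OVHcloud is. I should briefly introduce the company and its main cloud products.
</think>
OVHcloud is a European cloud provider offering ...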
Let’s see an example of using DeepSeek-R1!
Chatbot with DeepSeek-R1 model
The first step is to get the necessary dependencies. To do this, create a requirements.txt file:
langchain-core==0.3.33
langchainhub==0.1.21
langchain-openai==0.3.3
And run the following command:
pip3 install -r requirements.txt
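Also make sure your AI Endpoints access token is exported in the OVH_AI_ENDPOINTS_TOKEN environment variable, since that is the variable the script reads (replace the placeholder with your own token):

export OVH_AI_ENDPOINTS_TOKEN=<your-token>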
At this point, you are ready to develop your chatbot:
import argparse
import time
import os
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
# Set the OVHcloud AI Endpoints access token used to call the model
_OVH_AI_ENDPOINTS_ACCESS_TOKEN = os.environ.get('OVH_AI_ENDPOINTS_TOKEN')

# Function in charge of calling the LLM.
# The new_message parameter is the user's question.
# The function prints the LLM answer as it is streamed.
def chat_completion(new_message: str):
    # Configure the OpenAI-compatible client for the AI Endpoints model
    model = ChatOpenAI(model="DeepSeek-R1-Distill-Llama-70B",
                       api_key=_OVH_AI_ENDPOINTS_ACCESS_TOKEN,
                       base_url='https://deepseek-r1-distill-llama-70b.endpoints.kepler.ai.cloud.ovh.net/api/openai_compat/v1',
                       streaming=True)

    prompt = ChatPromptTemplate.from_messages([
        ("system", "You are Nestor, a virtual assistant. Answer the question."),
        ("human", "{question}"),
    ])

    chain = prompt | model

    print("🤖: ")
    # Stream the answer chunk by chunk, pausing briefly for readability
    for r in chain.stream({"question": new_message}):
        print(r.content, end="", flush=True)
        time.sleep(0.150)
# Main entrypoint
def main():
    # User input
    parser = argparse.ArgumentParser()
    parser.add_argument('--question', type=str, default="What is the meaning of life?")
    args = parser.parse_args()

    chat_completion(args.question)

if __name__ == '__main__':
    main()
You can try your new assistant with the following command:
python3 chat-bot-streaming.py --question "What is OVHcloud?"
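Because the reasoning is streamed inside the <think> tags, you may want to separate it from the final answer before displaying it to end users. Here is a minimal sketch of how you could do that once the full response has been collected (the split_reasoning helper is our own illustration, not part of AI Endpoints or LangChain):

import re

def split_reasoning(full_response: str) -> tuple[str, str]:
    # Extract the reasoning between the <think> tags, if present
    match = re.search(r"<think>(.*?)</think>", full_response, re.DOTALL)
    if match is None:
        return "", full_response.strip()
    reasoning = match.group(1).strip()
    # Everything after the closing </think> tag is the final answer
    answer = full_response[match.end():].strip()
    return reasoning, answer

reasoning, answer = split_reasoning("<think>The user greets me.</think>Hello!")
print(answer)  # prints: Hello!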
And that's it!
Don’t hesitate to test our new product, AI Endpoints, and give us your feedback.
You have a dedicated channel (#ai-endpoints) on our Discord server (https://discord.gg/ovhcloud), see you there!
Once a developer, always a developer!
A Java developer for many years, I had the joy of working with JDK 1.1, JEE, Struts, ... and now Spring (core, boot, batch), Quarkus, Angular, Groovy, Golang, ...
For more than ten years I was a Software Architect, a job that allowed me to face many of the problems inherent to the complex information systems of large corporations.
I also had other lives, notably in automation and delivery, implementing CI/CD chains based on Jenkins pipelines.
I particularly enjoy sharing and building relationships with developers, and I became a Developer Advocate at OVHcloud.
This new adventure allows me to keep using technologies I like, such as Kubernetes and AI, while continuing to learn and discover plenty of new things.
All the while keeping in mind one of my main motivations as a Developer Advocate: making developers happy.
Always keen to share, I am the co-creator of the TADx Meetup in Tours, a place for discovery and sharing around different tech topics.