How to use AI Endpoints and LangChain to create a chatbot

Image representing a robot with a parrot and a snake

Have a look at our previous blog posts:
Enhance your applications with AI Endpoints
How to use AI Endpoints and LangChain4j
LLMs streaming with AI Endpoints and LangChain4j

In the world of generative AI with LLMs, LangChain is one of the most popular frameworks used to simplify working with LLMs through API calls.

LangChain’s tools and APIs simplify the process of building LLM-driven applications like chat bots and virtual agents.

LangChain is designed to be used with Python and JavaScript.

And, of course, we’ll use our AI Endpoints product to access various LLMs 🤩.

ℹ️ All the code source used in the blog post is available on our GitHub repository: public-cloud-examples/tree/main/ai/ai-endpoints/python-langchain-chatbot ℹ️

Blocking chatbot

Let’s start by creating a simple chatbot with LangChain and AI Endpoints.

The first step is to get the necessary dependencies. To do this, create a requirements.txt file:

fastapi==0.110.0
openai==1.13.3
langchain-mistralai==0.1.7

And run the following command:

pip3 install -r requirements.txt

At this point, you are ready to develop your first chatbot:

import argparse

from langchain_mistralai import ChatMistralAI
from langchain_core.prompts import ChatPromptTemplate

# Function in charge of calling the LLM model.
# The question parameter is the user's question.
# The function prints the LLM's answer.
def chat_completion(question: str):
  # no need to use a token
  model = ChatMistralAI(model="Mixtral-8x22B-Instruct-v0.1", 
                        api_key="None",
                        endpoint='https://mixtral-8x22b-instruct-v01.endpoints.kepler.ai.cloud.ovh.net/api/openai_compat/v1', 
                        max_tokens=1500)

  prompt = ChatPromptTemplate.from_messages([
    ("system", "You are Nestor, a virtual assistant. Answer the question."),
    ("human", "{question}"),
  ])

  chain = prompt | model

  response = chain.invoke({"question": question})

  print(f"🤖: {response.content}")

# Main entrypoint
def main():
  # User input
  parser = argparse.ArgumentParser()
  parser.add_argument('--question', type=str, default="What is the meaning of life?")
  args = parser.parse_args()
  chat_completion(args.question)

if __name__ == '__main__':
    main()

You can try your new assistant with the following command:

python3 chat-bot.py --question "What is OVHcloud?"

🤖: OVHcloud is a global cloud computing company that offers a variety of services such as virtual private servers, dedicated servers, 
storage solutions, and other web services. 
It was founded in France and has since expanded to become a leading provider of cloud infrastructure, with data centers located around the 
world. OVHcloud offers a range of options for businesses and individuals, including high-performance computing, big data, and machine learning solutions. 
It is known for its commitment to data security and privacy, 
and its infrastructure is designed to be flexible, scalable, and reliable.
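Since the AI Endpoints URL used above is OpenAI-compatible (note the `openai_compat` path segment, and the `openai` package already listed in `requirements.txt`), you could also talk to it with any plain HTTP client. As a minimal sketch, here is a hypothetical helper that builds the chat-completions payload the endpoint expects (the helper name and structure are our own; only the model name and URL come from the example above):

```python
import json

# Hypothetical helper: build the OpenAI-compatible chat payload for
# the Mixtral endpoint used in the LangChain example above.
def build_chat_payload(question: str, max_tokens: int = 1500) -> dict:
    return {
        "model": "Mixtral-8x22B-Instruct-v0.1",
        "max_tokens": max_tokens,
        "messages": [
            {"role": "system", "content": "You are Nestor, a virtual assistant. Answer the question."},
            {"role": "user", "content": question},
        ],
    }

# This payload would then be POSTed to the endpoint's
# /chat/completions route with any HTTP client (no token needed here).
payload = build_chat_payload("What is OVHcloud?")
print(payload["model"])  # → Mixtral-8x22B-Instruct-v0.1
```

This is essentially what LangChain assembles for you under the hood; the framework just saves you from wiring the messages and HTTP calls by hand.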

Streaming chatbot

Now that you have a chatbot, what do you think about adding a streaming feature? Streaming gives your chatbot the ability to display information as soon as it’s ready, instead of waiting for the whole message to be processed on the server side.

To do this, you can update the previous Python script as follows:

import argparse
import time

from langchain_mistralai import ChatMistralAI
from langchain_core.prompts import ChatPromptTemplate

# Function in charge of calling the LLM model.
# The new_message parameter is the user's question.
# The function prints the LLM's answer as it is streamed.
def chat_completion(new_message: str):
  # no need to use a token
  model = ChatMistralAI(model="Mixtral-8x22B-Instruct-v0.1", 
                        api_key="foo",
                        endpoint='https://mixtral-8x22b-instruct-v01.endpoints.kepler.ai.cloud.ovh.net/api/openai_compat/v1', 
                        max_tokens=1500, 
                        streaming=True)

  prompt = ChatPromptTemplate.from_messages([
    ("system", "You are Nestor, a virtual assistant. Answer the question."),
    ("human", "{question}"),
  ])

  chain = prompt | model

  print("🤖: ", end="", flush=True)
  for r in chain.stream({"question": new_message}):
    print(r.content, end="", flush=True)
    time.sleep(0.150)

# Main entrypoint
def main():
  # User input
  parser = argparse.ArgumentParser()
  parser.add_argument('--question', type=str, default="What is the meaning of life?")
  args = parser.parse_args()
  chat_completion(args.question)

if __name__ == '__main__':
    main()

You can try your new assistant with the following command:

python3 chat-bot-streaming.py --question "What is OVHcloud?"
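Under the hood, `chain.stream(...)` consumes server-sent events from the OpenAI-compatible endpoint and yields the content deltas one by one. LangChain handles that parsing for you, but as an illustration, here is a minimal stdlib sketch of the idea, assuming the chunk format follows the OpenAI chat-completions streaming convention (`data: {...}` lines ending with `data: [DONE]`):

```python
import json

# Parse SSE-style lines into content deltas, following the
# OpenAI chat-completions streaming format (an assumption here).
def iter_deltas(lines):
    for line in lines:
        if not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        if data.strip() == "[DONE]":
            return
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            yield delta

# Canned example of what the endpoint might send back:
sample = [
    'data: {"choices": [{"delta": {"content": "OVH"}}]}',
    'data: {"choices": [{"delta": {"content": "cloud"}}]}',
    "data: [DONE]",
]
print("".join(iter_deltas(sample)))  # → OVHcloud
```

Printing each delta as it arrives, with `flush=True` as in the script above, is what produces the typewriter effect in the terminal.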

And that’s it!

Don’t hesitate to test our new product, AI Endpoints, and give us your feedback.

You have a dedicated Discord channel (#ai-endpoints) on our Discord server (https://discord.gg/ovhcloud), see you there!


Once a developer, always a developer!
Java developer for many years, I have the joy of knowing JDK 1.1, JEE, Struts, ... and now Spring (Core, Boot, Batch), Quarkus, Angular, Groovy, Golang, ...
For more than ten years I was a Software Architect, a job that allowed me to face many problems inherent to the complex information systems in large groups.
I also had other lives, notably in automation and delivery with the implementation of CI/CD chains based on Jenkins pipelines.
I particularly enjoy sharing and building relationships with developers, which led me to become a Developer Relations advocate at OVHcloud.
This new adventure allows me to continue to use technologies that I like such as Kubernetes or AI for example but also to continue to learn and discover a lot of new things.
All the while keeping in mind one of my main motivations as a Developer Relation: making developers happy.
Always sharing, I am the co-creator of the TADx Meetup in Tours, allowing discovery and sharing around different tech topics.


Geek, musician, coder and maker that loves sharing knowledge.

Currently DevRel @OVHcloud as day job. Also founder of Mixteen for the kids and Community Hero @gitpod

Curiosity and passion leads to micro:bit Champion