LLM streaming with AI Endpoints and LangChain4j

A parrot and a stream

After explaining how to use AI Endpoints and LangChain4j in the previous post, let’s take a look at how to use streaming to create a real chat bot!

Create the project with Quarkus

As in the previous post about AI Endpoints, we’ll use LangChain4j through a Quarkus extension: quarkus-langchain4j.

ℹ️ All the source code used in this blog post is available on our GitHub repository: public-cloud-examples/ai/ai-endpoints/quarkus-langchain4j-streaming ℹ️

First of all, you need to create a Quarkus application with the Quarkus CLI.

ℹ️ For more details on the files that are created, see the blog post How to use AI Endpoints and LangChain4j ℹ️

AI Service creation

The LLM is used remotely via a LangChain4j AI Service.

Let’s code our service to create a chat bot.

package com.ovhcloud.examples.aiendpoints.services;

import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.UserMessage;
import io.quarkiverse.langchain4j.RegisterAiService;
import io.smallrye.mutiny.Multi;

@RegisterAiService
public interface ChatBotService {
  // Scope / context passed to the LLM
  @SystemMessage("You are a virtual AI assistant.")
  // Prompt (with detailed instructions and a variable section) passed to the LLM
  @UserMessage("Answer the following question as best you can: {question}. The answer must be in the style of a virtual assistant and include some emojis.")
  Multi<String> askAQuestion(String question);
}

If you want more information about the purpose of SystemMessage and UserMessage you can read the documentation of the Quarkus extension.
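
The Multi<String> return type is what makes this service a streaming one: instead of one complete answer, the caller receives the response as a sequence of text chunks. To illustrate the idea without any Quarkus or Mutiny dependency, here is a minimal JDK-only sketch based on java.util.concurrent.Flow. The TokenStreamSketch class, its stream method and the hard-coded chunks are hypothetical stand-ins for what the model would emit. They are not part of the project, and this is not how Mutiny is implemented.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

// Hypothetical illustration of chunk-by-chunk delivery, using only the JDK
class TokenStreamSketch {

    // Publishes each chunk and returns what the subscriber received, in order
    static List<String> stream(List<String> chunks) {
        List<String> received = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(1);

        try (SubmissionPublisher<String> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(new Flow.Subscriber<String>() {
                private Flow.Subscription subscription;

                @Override
                public void onSubscribe(Flow.Subscription subscription) {
                    this.subscription = subscription;
                    subscription.request(1); // ask for the first chunk
                }

                @Override
                public void onNext(String chunk) {
                    received.add(chunk);     // a real client would display the chunk here
                    subscription.request(1); // ask for the next chunk
                }

                @Override
                public void onError(Throwable failure) {
                    done.countDown();
                }

                @Override
                public void onComplete() {
                    done.countDown();
                }
            });

            // Stand-in for the chunks the model would send back one by one
            chunks.forEach(publisher::submit);
        } // closing the publisher signals onComplete after the pending chunks

        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the stream", e);
        }
        return received;
    }

    public static void main(String[] args) {
        List<String> received = stream(List.of("OVHcloud ", "is ", "a ", "cloud ", "provider."));
        System.out.println(String.join("", received));
    }
}
```

The subscriber handles each chunk as soon as it arrives, which is what lets a chat bot display an answer progressively instead of waiting for the whole text.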

Then, you have to configure the quarkus-langchain4j extension to use AI Endpoints.

### Global configurations
# Base URL for Mistral AI endpoints
quarkus.langchain4j.mistralai.base-url=https://mistral-7b-instruct-v02.endpoints.kepler.ai.cloud.ovh.net/api/openai_compat/v1
# Activate or not the log during the request
quarkus.langchain4j.mistralai.log-requests=true
# Activate or not the log during the response
quarkus.langchain4j.mistralai.log-responses=true
# Delay before raising a timeout exception
quarkus.langchain4j.mistralai.timeout=60s
# No real API key is needed here, so a placeholder value is used
quarkus.langchain4j.mistralai.api-key=foo
 
# Activate or not the Mistral AI embedding model                     
quarkus.langchain4j.mistralai.embedding-model.enabled=false
 
### Chat model configurations
# Activate or not the Mistral AI chat model
quarkus.langchain4j.mistralai.chat-model.enabled=true             
# Chat model name used
quarkus.langchain4j.mistralai.chat-model.model-name=Mistral-7B-Instruct-v0.2
# Maximum number of tokens to generate
quarkus.langchain4j.mistralai.chat-model.max-tokens=1024

ℹ️ To learn how to use the OVHcloud AI Endpoints product, please read the blog post: Enhance your applications with AI Endpoints ℹ️

AI chat bot API

Now it’s time to test our AI!

Let’s develop a small API.

package com.ovhcloud.examples.aiendpoints;

import com.ovhcloud.examples.aiendpoints.services.ChatBotService;
import io.smallrye.mutiny.Multi;
import jakarta.inject.Inject;
import jakarta.ws.rs.POST;
import jakarta.ws.rs.Path;

@Path("/ovhcloud-ai")
public class AIEndpointsResource {
    // AI Service injection to use it later
    @Inject
    ChatBotService chatBotService;

    // Expose the ask resource with the POST method
    @Path("ask")
    @POST
    public Multi<String> ask(String question) {
        // Call the Mistral AI chat model
        return chatBotService.askAQuestion(question);
    }
}

Call the API with a curl command.

curl -N http://localhost:8080/ovhcloud-ai/ask \
     -X POST \
     -d '{"question":"Can you tell me what OVHcloud is and what kind of products it offers?"}' \
     -H 'Content-Type: application/json'
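
If you would rather call the API from Java than from curl, the following is a minimal sketch using the JDK's java.net.http.HttpClient. It assumes the application is running locally on port 8080, as in the curl command. The StreamingClientSketch class and its buildRequest helper are hypothetical names introduced for this example, and the JSON escaping is deliberately naive.

```java
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical client: mirrors the curl command above with the JDK HTTP client
class StreamingClientSketch {

    // Builds the same POST request as the curl command
    static HttpRequest buildRequest(String baseUrl, String question) {
        // Naive JSON escaping for this sketch; a real client would use a JSON library
        String escaped = question.replace("\\", "\\\\").replace("\"", "\\\"");
        String body = "{\"question\":\"" + escaped + "\"}";
        return HttpRequest.newBuilder(URI.create(baseUrl + "/ovhcloud-ai/ask"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = buildRequest("http://localhost:8080",
                "Can you tell me what OVHcloud is and what kind of products it offers?");

        // ofInputStream() hands back the body as a stream, so it can be read while it arrives
        HttpResponse<InputStream> response =
                client.send(request, HttpResponse.BodyHandlers.ofInputStream());

        try (Reader in = new InputStreamReader(response.body(), StandardCharsets.UTF_8)) {
            Writer out = new OutputStreamWriter(System.out, StandardCharsets.UTF_8);
            char[] buffer = new char[256];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
                out.flush(); // print each chunk as soon as it is received
            }
        }
    }
}
```

Reading the body as a stream plays the same role as the -N option of curl: the answer is printed piece by piece rather than after the whole response has been received.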

And that’s it!

In just a few lines of code, you’ve developed your first chat bot using Quarkus, LangChain4j and AI Endpoints.

This chat bot uses streaming mode. Feel free to build a web application on top of it to make it more usable by humans 😉.

Don’t hesitate to test our new product, AI Endpoints, and give us your feedback.

There is a dedicated channel (#ai-endpoints) on our Discord server (https://discord.gg/ovhcloud), see you there!


Once a developer, always a developer!
A Java developer for many years, I have had the joy of knowing JDK 1.1, JEE, Struts, ... and now Spring (core, boot, batch), Quarkus, Angular, Groovy, Golang, ...
For more than ten years I was a Software Architect, a job that allowed me to face many problems inherent to complex information systems in large organizations.
I also had other lives, notably in automation and delivery with the implementation of CI/CD chains based on Jenkins pipelines.
I particularly enjoy sharing and building relationships with developers, which led me to become a Developer Relation at OVHcloud.
This new adventure allows me to continue to use technologies that I like such as Kubernetes or AI for example but also to continue to learn and discover a lot of new things.
All the while keeping in mind one of my main motivations as a Developer Relation: making developers happy.
Always sharing, I am the co-creator of the TADx Meetup in Tours, allowing discovery and sharing around different tech topics.