Using Function Calling with OVHcloud AI Endpoints

For more information on AI Endpoints, please read the following blog post.
You can also have a look at our previous blog posts on how to use AI Endpoints.

OVHcloud AI Endpoints allows developers to easily add AI features to their day-to-day developments.

Stable Diffusion is a powerful artificial intelligence model that generates images from text descriptions.
Thanks to AI Endpoints, you can use it simply by calling the endpoint with a prompt.

However, creating a good prompt for Stable Diffusion can be challenging.

In this blog post, we will show you how to optimize your prompts using Function Calling and AI Endpoints.

OVHcloud AI Endpoints provides many models; for this example we will use models from the Large Language Models (LLM) and Image Generation families.

The following examples use LangChain4J as the framework for the LLM calls.

ℹ️ You can find the full code on GitHub ℹ️

Introduction to Function Calling

Function calling refers to the ability of a language model or AI system to request the invocation of pre-defined functions or tasks, such as data processing, calculations, or external API calls, in response to user input or prompts.
This enables the AI system to perform more complex and dynamic tasks, and to leverage external knowledge and services to generate more accurate and informative responses.
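
To make the round trip concrete, here is a minimal plain-Java sketch of the idea, with no real LLM involved. Everything in it is hypothetical and only for illustration: `fakeModel` stands in for the language model, `ToolRequest` for the structured answer a model returns when it wants a function to run, and `runTool` for the client code that actually executes the function.

```java
import java.util.Map;
import java.util.function.Function;

/** Plain-Java sketch of the function calling round trip (no real LLM involved). */
class FunctionCallingSketch {

    /** What a model sends back when it wants a tool to run: a name and its arguments. */
    record ToolRequest(String toolName, Map<String, String> arguments) {}

    /** Hypothetical stand-in for the LLM: it only asks for a tool, it never runs it. */
    static ToolRequest fakeModel(String userMessage) {
        return new ToolRequest("generateImage", Map.of(
                "prompt", "A detailed, realistic picture of " + userMessage,
                "negativePrompt", "blurry, low quality"));
    }

    /** The client side: look up the requested tool and execute it locally. */
    static String runTool(ToolRequest request,
                          Map<String, Function<Map<String, String>, String>> tools) {
        Function<Map<String, String>, String> tool = tools.get(request.toolName());
        if (tool == null) {
            return "Unknown tool: " + request.toolName();
        }
        return tool.apply(request.arguments());
    }

    public static void main(String[] args) {
        // Registry of the functions the client is willing to execute.
        Map<String, Function<Map<String, String>, String>> tools = Map.of(
                "generateImage", a -> "image generated for prompt: " + a.get("prompt"));

        ToolRequest request = fakeModel("a cute red cat");
        // In a real exchange this result is sent back to the model, which writes the final answer.
        System.out.println(runTool(request, tools));
    }
}
```

In the rest of this post, LangChain4J plays the role of `runTool`: it reads the tool request produced by the model and invokes the matching Java method for us.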

In the context of image generation, function calling can be used to improve the quality of the prompts by optimizing them with an external tool based on an LLM.

To create our application we will use LangChain4J to simplify the integration of the AI models and the function calling mechanism.

Tool creation

To use the function calling mechanism, we need to define a tool.
In our example, the goal of the tool is to call the Stable Diffusion API to generate an image.

⚠️ It is not the model itself that calls the tool, but the client that invokes the model. ⚠️

    @Tool("""
    Tool to create an image with Stable Diffusion XL given a prompt and a negative prompt.
    """)
    void generateImage(@P("Prompt that explains the image") String prompt, @P("Negative prompt that explains what the image must not contain") String negativePrompt) throws IOException, InterruptedException {
        System.out.println("Prompt: " + prompt);
        System.out.println("Negative prompt: " + negativePrompt);

        HttpRequest httpRequest = HttpRequest.newBuilder()
                .uri(URI.create(System.getenv("OVH_AI_ENDPOINTS_SD_URL")))
                .POST(HttpRequest.BodyPublishers.ofString("""
                        {"prompt": "%s", 
                         "negative_prompt": "%s"}
                        """.formatted(prompt, negativePrompt)))
                .header("accept", "application/octet-stream")
                .header("Content-Type", "application/json")
                .header("Authorization", "Bearer " + System.getenv("OVH_AI_ENDPOINTS_SDXL_ACCESS_TOKEN"))
                .build();

        HttpResponse<byte[]> response = HttpClient.newHttpClient()
                .send(httpRequest, HttpResponse.BodyHandlers.ofByteArray());

        System.out.println("SDXL status code: " + response.statusCode());
        Files.write(Path.of("generated-image.jpeg"), response.body());
    }
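
One point worth hardening in the tool above: the prompts are written by the LLM and inserted directly into the JSON body, so a double quote or a line break in a prompt would produce invalid JSON. The sketch below shows one possible way to escape the values and to check the HTTP status before writing the file. The names `SdxlRequestBody`, `escapeJson`, `body` and `isSuccess` are hypothetical helpers, not part of the original code, and the success range (2xx) is an assumption about the endpoint; in a real project a JSON library would be the usual choice.

```java
/** Sketch: build the SDXL request body safely (hypothetical helper, not the original code). */
class SdxlRequestBody {

    /** Escapes the characters that would break a JSON string literal. */
    static String escapeJson(String value) {
        StringBuilder sb = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '"' -> sb.append("\\\"");
                case '\\' -> sb.append("\\\\");
                case '\n' -> sb.append("\\n");
                case '\r' -> sb.append("\\r");
                case '\t' -> sb.append("\\t");
                default -> {
                    if (c < 0x20) {
                        // Other control characters use the \\uXXXX form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
                }
            }
        }
        return sb.toString();
    }

    /** Builds the same JSON structure as the tool, with both values escaped. */
    static String body(String prompt, String negativePrompt) {
        return "{\"prompt\": \"%s\", \"negative_prompt\": \"%s\"}"
                .formatted(escapeJson(prompt), escapeJson(negativePrompt));
    }

    /** Assumption: any 2xx status means the response body is an image. */
    static boolean isSuccess(int statusCode) {
        return statusCode >= 200 && statusCode < 300;
    }

    public static void main(String[] args) {
        System.out.println(body("A \"cute\" cat", "blurry\nlow quality"));
    }
}
```

In the tool, `body(prompt, negativePrompt)` could replace the formatted text block, and `isSuccess(response.statusCode())` could guard the `Files.write` call so that an error payload is never saved as `generated-image.jpeg`.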

⚠️ One of the main ways to help the LLM choose the right tool is to provide a clear and comprehensive description. ⚠️

Once the tool is ready, let's tell the model that it can use it!

Optimizing the model with a tool

First we create a simple chatbot.

/// Chatbot definition.
/// The goal of the chatbot is to build a powerful prompt for Stable Diffusion XL.
interface ChatBot {
    @SystemMessage("""
            You are an expert in using the Stable Diffusion XL model.
            The user explains in natural language what kind of image they want.
            You must do the following steps:
              - Understand the user's request.
              - Generate the two kinds of prompts for Stable Diffusion: the prompt and the negative prompt.
              - The prompts must be in English, detailed, and optimized for the Stable Diffusion XL model.
              - Once, and only once, you have these two prompts, call the tool with the two prompts.
            If asked to create an image, you MUST call the `generateImage` function.
            """)
    @UserMessage("Create an image with Stable Diffusion XL following this description: {{userMessage}}")
    String chat(String userMessage);
}

It’s not mandatory to write such a detailed system message, but it helps the model choose the tool when needed.

After this we assemble all the pieces together.

void main() throws Exception {

    // Main chatbot configuration, choose one of the available models in the AI Endpoints catalog (https://endpoints.ai.cloud.ovh.net/catalog)
    ChatModel chatModel = MistralAiChatModel.builder()
            .apiKey(System.getenv("OVH_AI_ENDPOINTS_ACCESS_TOKEN"))
            .baseUrl(System.getenv("OVH_AI_ENDPOINTS_MODEL_URL"))
            .modelName(System.getenv("OVH_AI_ENDPOINTS_MODEL_NAME"))
            .logRequests(false)
            .logResponses(false)
            // To have more deterministic outputs, set temperature to 0.
            .temperature(0.0)
            .build();

    // Add memory so the SDXL prompt can be refined over several messages.
    ChatMemory chatMemory = MessageWindowChatMemory.withMaxMessages(10);

    // Build the chatbot thanks to LangChain4J AI Services mode
    ChatBot chatBot = AiServices.builder(ChatBot.class)
            .chatModel(chatModel)
            .tools(new ImageGenTools())
            .chatMemory(chatMemory)
            .build();

    // Start the conversation loop (enter "exit" to quit)
    String userInput = "";
    Scanner scanner = new Scanner(System.in);
    while (true) {
        System.out.print("Enter your message: ");
        userInput = scanner.nextLine();
        if (userInput.equalsIgnoreCase("exit")) break;
        System.out.println("Response: " + chatBot.chat(userInput));
    }
    scanner.close();
}

ℹ️ We use a loop to be able to ask the model to optimize the image generation parameters based on the previous response. ℹ️
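
To give an intuition of what the memory does inside this loop, here is a plain-Java sketch of a message window that keeps only the most recent messages. It is an illustration only, not LangChain4J's actual implementation: the class `MessageWindowSketch` is hypothetical, messages are simplified to strings, and the real `MessageWindowChatMemory` may apply extra rules that are not shown here.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** Plain-Java sketch of a message window; not LangChain4J's actual implementation. */
class MessageWindowSketch {
    private final int maxMessages;
    private final Deque<String> messages = new ArrayDeque<>();

    MessageWindowSketch(int maxMessages) {
        this.maxMessages = maxMessages;
    }

    /** Adds a message and evicts the oldest ones once the window is full. */
    void add(String message) {
        messages.addLast(message);
        while (messages.size() > maxMessages) {
            messages.removeFirst();
        }
    }

    /** Returns the messages that would be sent to the model, oldest first. */
    List<String> messages() {
        return List.copyOf(messages);
    }

    public static void main(String[] args) {
        MessageWindowSketch memory = new MessageWindowSketch(3);
        memory.add("user: a cute red cat");
        memory.add("ai: image generated");
        memory.add("user: make the background darker");
        memory.add("ai: image regenerated");
        // The first message has been evicted, the last three remain.
        System.out.println(memory.messages());
    }
}
```

Because the recent exchanges are sent again with each new request, a follow-up such as "make the background darker" can be interpreted against the prompt that was generated just before.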

And that's it!
It’s time to test our Stable Diffusion optimizer.

$ jbang ImageGeneration.java


Enter your message: Un chat roux mignon photo réaliste

Prompt: A high-quality, realistic image of a cute red cat, with expressive eyes, soft fur, and a playful pose. 
The cat should be well-lit, with a warm and inviting atmosphere.

Negative prompt: No text, no watermarks, no low-quality images, no cartoon-style, no blurry or pixelated images, 
no cats with missing body parts, no cats with unnatural colors, no cats in unrealistic settings, no cats with human features, 
no cats with inappropriate content.

Response: I have successfully generated the image for you. The image should be a high-quality, 
realistic image of a cute red cat, with expressive eyes, soft fur, and a playful pose. The cat should be well-lit, 
with a warm and inviting atmosphere. If you have any issues or need further assistance, please let me know.

Enter your message: exit

ℹ️ As you can see, the model translates the French request into an English prompt 😊 ℹ️

Here is the result of the prompt:

A cute red cat generated by Stable Diffusion

A dedicated Discord channel (#ai-endpoints) is available on our Discord server (https://discord.gg/ovhcloud), see you there!

Stéphane Philippart

Once a developer, always a developer!
Java developer for many years, I have had the joy of knowing JDK 1.1, JEE, Struts, ... and now Spring (core, boot, batch), Quarkus, Angular, Groovy, Golang, ...
For more than ten years I was a Software Architect, a job that allowed me to face many problems inherent to the complex information systems in large groups.
I also had other lives, notably in automation and delivery with the implementation of CI/CD chains based on Jenkins pipelines.
I particularly like sharing and relationships with developers and I became a Developer Relation at OVHcloud.
This new adventure allows me to continue to use technologies that I like such as Kubernetes or AI for example but also to continue to learn and discover a lot of new things.
All the while keeping in mind one of my main motivations as a Developer Relation: making developers happy.
Always sharing, I am the co-creator of the TADx Meetup in Tours, allowing discovery and sharing around different tech topics.