<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Machine learning Archives - OVHcloud Blog</title>
	<atom:link href="https://blog.ovhcloud.com/tag/machine-learning/feed/" rel="self" type="application/rss+xml" />
	<link>https://blog.ovhcloud.com/tag/machine-learning/</link>
	<description>Innovation for Freedom</description>
	<lastBuildDate>Wed, 11 Feb 2026 13:03:41 +0000</lastBuildDate>
	<language>en-GB</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://blog.ovhcloud.com/wp-content/uploads/2019/07/cropped-cropped-nouveau-logo-ovh-rebranding-32x32.gif</url>
	<title>Machine learning Archives - OVHcloud Blog</title>
	<link>https://blog.ovhcloud.com/tag/machine-learning/</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>Safety first: Detect harmful texts using an AI safeguard agent</title>
		<link>https://blog.ovhcloud.com/safety-first-detect-harmful-texts-using-an-ai-safeguard-agent/</link>
		
		<dc:creator><![CDATA[Alexandre Movsessian]]></dc:creator>
		<pubDate>Thu, 22 Jan 2026 10:46:11 +0000</pubDate>
				<category><![CDATA[Deploy & Scale]]></category>
		<category><![CDATA[OVHcloud Engineering]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[Machine learning]]></category>
		<guid isPermaLink="false">https://blog.ovhcloud.com/?p=30185</guid>

					<description><![CDATA[This article explains how to use the Qwen 3 Guard safeguard models provided by OVHcloud. Using this guide, you can analyse and moderate texts for LLM applications, chat platforms, customer support systems, or any other text-based services requiring safe and compliant interactions. Our focus will be on written content, such as conversations or plain text. [&#8230;]]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-image size-full"><img fetchpriority="high" decoding="async" width="981" height="463" src="https://blog.ovhcloud.com/wp-content/uploads/2026/01/image.png" alt="" class="wp-image-30187" srcset="https://blog.ovhcloud.com/wp-content/uploads/2026/01/image.png 981w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-300x142.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-768x362.png 768w" sizes="(max-width: 981px) 100vw, 981px" /></figure>



<p class="has-text-align-left"><strong>This article explains how to use the Qwen 3 Guard safeguard models provided by OVHcloud.</strong></p>



<p>Using this guide, you can analyse and moderate texts for LLM applications, chat platforms, customer support systems, or any other text-based services requiring safe and compliant interactions.</p>



<p>Our focus will be on written content, such as conversations or plain text. Although image moderators exist, they won’t be covered here.</p>



<h2 class="wp-block-heading"><strong>Introduction</strong></h2>



<p>As <strong>Large Language Models</strong> (LLMs) continue to grow, access to information has become more seamless, but this ease of access makes it easier to generate, and be exposed to, harmful or toxic content.</p>



<p>LLMs can be prompted with malicious queries (e.g., “How do I make a bomb?”) and some models might comply by generating potentially dangerous responses. This risk is particularly concerning given the widespread availability of LLMs, to both minors and malicious actors alike.</p>



<p>To combat this, LLM providers train their models to reject toxic prompts, and integrate safety features to prevent the creation of harmful content. Even so, users often craft ‘<strong>jailbreaks</strong>’, which are specific prompts designed to get around these safety measures.</p>



<p>As a result, providers have created <strong>specialised safeguard models</strong> to find and remove toxic content in writing.</p>



<h2 class="wp-block-heading">What is toxicity?</h2>



<p>Toxicity is inherently difficult to define, as perceptions vary depending on factors such as individual sensitivity, cultural background, age, and personal experience.</p>



<p>Perceptions of content can vary widely. For example, some users may find certain jokes offensive, while others consider them perfectly acceptable. Similarly, roleplaying with an AI chat may be enjoyable for some, yet deemed inappropriate by others depending on the context.</p>



<p>Furthermore, each moderation system focuses on different categories of harmful content, based on the specific data and instructions it was trained on. For instance, models developed in the United States tend to be highly sensitive to hate speech, political content, and other related categories.</p>



<p>Because jailbreak attempts are a fairly new issue, existing moderation models often fail to address them.</p>



<p>Below are the toxicity categories for the Qwen 3 Guard models:</p>



<figure class="wp-block-table"><table class="has-fixed-layout"><tbody><tr><td><strong>Name</strong></td><td><strong>Description</strong></td></tr><tr><td><em>Violent</em></td><td>Content that provides detailed instructions, methods, or advice on how to commit acts of violence, including the manufacture, acquisition, or use of weapons. Also includes depictions of violence.</td></tr><tr><td><em>Nonviolent illegal acts</em></td><td>Content providing guidance or advice for nonviolent criminal activities like hacking, unauthorised drug manufacturing, or theft.</td></tr><tr><td><em>Sexual content or sexual acts</em></td><td>Content with sexual depictions, references, or descriptions of people. Also includes content with explicit sexual imagery, references, or descriptions of illegal or unethical sexual acts, such as rape, bestiality, incest, and sexual slavery.</td></tr><tr><td><em>Personally identifiable information</em></td><td>Content that shares or discloses sensitive personal identifying information without authorisation, such as name, ID number, address, phone number, medical records, financial details, and account passwords.</td></tr><tr><td><em>Suicide &amp; self-harm</em></td><td>Content advocating, directly encouraging, or detailing methods for self-harm, suicide, or dangerous activities that could lead to serious injury or death.</td></tr><tr><td><em>Unethical acts</em></td><td>Any immoral or unethical content or acts, including but not limited to bias, discrimination, stereotyping, injustice, hate speech, offensive language, harassment, insults, threats, defamation, extremism, misinformation regarding ethics, and other behaviours that, while not illegal, are still considered unethical.</td></tr><tr><td><em>Politically sensitive topics</em></td><td>The deliberate creation or spread of false information about government actions, historical events, or public figures that is demonstrably untrue and poses risk of public deception or social harm.</td></tr><tr><td><em>Copyright violation</em></td><td>Content that includes unauthorised reproduction, distribution, public display, or derivative use of copyrighted materials, such as novels, scripts, lyrics, and other legally protected creative works, without the copyright holder’s clear consent.</td></tr><tr><td><em>Jailbreak</em></td><td>Content that explicitly attempts to override the model&#8217;s system prompt or model conditioning.</td></tr></tbody></table></figure>



<p>These categories are <strong>not mutually exclusive</strong>. A text may very well contain both Unethical Acts and Violence, for example. Most notably, jailbreaks often include another kind of toxic query, as they are designed to bypass security guardrails. The Qwen 3 Guard moderator, however, will only return one category.</p>



<p>These categories were arbitrarily chosen by the Qwen 3 Guard creators; they can’t be changed, but <strong>you may choose to ignore some</strong> depending on your use case.</p>
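<p><em>As an illustration of ignoring categories, a small post-processing filter could look like the sketch below. The helper and the category strings are my own, assumed from the table above; they are not part of the Qwen 3 Guard API.</em></p>

```python
# Sketch: act only on categories relevant to your use case.
# Category names are assumed from the toxicity table above.
IGNORED = {"Politically Sensitive Topics", "Copyright Violation"}

def is_actionable(label, categories):
    """Flag content only if it is Unsafe and at least one
    detected category is not in the ignore set."""
    return label == "Unsafe" and any(c not in IGNORED for c in categories)

print(is_actionable("Unsafe", ["Copyright Violation"]))     # False
print(is_actionable("Unsafe", ["Nonviolent Illegal Acts"])) # True
```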



<h2 class="wp-block-heading">Metrics</h2>



<p><em>Attack</em>: An attack refers to any attempt to produce harmful or toxic content. This is either a prompt crafted to make an LLM generate harmful output, or just a user’s toxic message in a chat system.</p>



<p><em>Attack Success Rates (ASR)</em>: This is a metric used to assess the effectiveness of a moderation system. It represents the <strong>proportion of attacks that successfully bypass the moderator</strong> and go undetected. A lower ASR indicates a more robust moderation system.</p>



<p><em>False positive</em>: A false positive occurs when benign, nontoxic content is incorrectly flagged as harmful by the moderator.</p>



<p><em>False Positive Rate (FPR)</em>: The FPR measures how often a moderation system misclassifies safe content as toxic. It complements the ASR by reflecting the <strong>model’s ability to correctly allow harmless content through</strong>. A lower FPR indicates better reliability.</p>
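<p><em>To make these definitions concrete, here is a small sketch (my own illustration, not from the article) computing both rates from evaluation counts:</em></p>

```python
def moderation_rates(attacks_total, attacks_missed, benign_total, benign_flagged):
    """Compute the two metrics defined above.

    ASR = attacks that bypassed the moderator / total attacks
    FPR = benign texts wrongly flagged as toxic / total benign texts
    """
    asr = attacks_missed / attacks_total
    fpr = benign_flagged / benign_total
    return asr, fpr

# Example: 100 of 500 attacks slip through, 60 of 1000 benign texts flagged
asr, fpr = moderation_rates(attacks_total=500, attacks_missed=100,
                            benign_total=1000, benign_flagged=60)
print(asr, fpr)  # 0.2 0.06
```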



<h2 class="wp-block-heading">Qwen 3 Guard</h2>



<p>Qwen 3 Guard was launched in October 2025 by Qwen, Alibaba’s AI team. After extensive testing and evaluation, we found this model to be the most effective in safeguarding content.</p>



<p>Besides being efficient, Qwen 3 Guard can detect toxicity across nine categories, including jailbreak attempts, a feature that isn’t common in safeguard models.</p>



<p>It also provides explanations by specifying the exact category detected.</p>



<h3 class="wp-block-heading">Specs</h3>



<ul class="wp-block-list">
<li>Base model: Qwen 3</li>



<li>Flavours: 0.6B, 4B, 8B</li>



<li>Context size: 32,768 tokens</li>



<li>Languages: English, French and 117 other languages and dialects</li>



<li>Tasks:<ul class="wp-block-list"><li>Detection of toxicity in raw text</li><li>Detection of toxicity in LLM dialogue</li><li>Detection of answer refusal (LLM dialogue only)</li><li>Classification of toxicity</li></ul></li>
</ul>



<h3 class="wp-block-heading">Availability</h3>



<p><a href="https://www.ovhcloud.com/en/public-cloud/ai-endpoints/catalog" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">https://www.ovhcloud.com/en/public-cloud/ai-endpoints/catalog</a></p>



<p>There are two flavours of Qwen 3 Guard available on OVHcloud:</p>



<p><strong><em>Qwen 3 Guard 0.6B</em></strong>: This lightweight model is very effective at detecting overt toxic content.</p>



<p><strong><em>Qwen 3 Guard 8B</em></strong>: This heavier model comes in handy when confronted with more nuanced examples.</p>



<h3 class="wp-block-heading">Scores</h3>



<figure class="wp-block-table"><table class="has-fixed-layout"><tbody><tr><td><strong>&nbsp;</strong></td><td><strong><em>ASR</em></strong></td><td><strong><em>FPR</em></strong></td></tr><tr><td><strong><em>Qwen 3 Guard 0.6B</em></strong></td><td>0.20</td><td>0.06</td></tr><tr><td><strong><em>Qwen 3 Guard 8B</em></strong></td><td>0.20</td><td>0.04</td></tr></tbody></table></figure>



<h3 class="wp-block-heading">&nbsp;</h3>



<h3 class="wp-block-heading">Notes</h3>



<ul class="wp-block-list">
<li>The Qwen 3 Guard models have three safety labels for more precise moderation: Safe, Controversial, and Unsafe</li>



<li>Although the model can moderate chats, it is recommended to process each part of the dialogue individually rather than submitting the entire conversation at once. Guard models, like any LLM, detect toxicity more reliably when the context is kept brief.</li>



<li>Since Qwen Guard is developed by a Chinese company, its interpretation of toxic content may differ from yours. If necessary, you can overlook certain categories.</li>
</ul>



<h2 class="wp-block-heading">How do I set up my own moderator?</h2>



<p>First, you need to choose the flavour you want:</p>



<ul class="wp-block-list">
<li><strong><em>Qwen 3 Guard 0.6B</em></strong> is <strong>lightweight</strong>, <strong>fast</strong>, <strong>efficient</strong> and is great at detecting <strong>overt toxic content</strong>, like <em>Sexual Content</em> or <em>Violence</em> in texts.</li>
</ul>



<ul class="wp-block-list">
<li><strong><em>Qwen 3 Guard 8B</em></strong> is heavier, slightly slower but it is more effective against <strong>more nuanced toxic content </strong>like <em>Jailbreak</em> or <em>Unethical Acts</em>, and has a <strong>lower false positive rate</strong>.</li>
</ul>



<p>Your use case is the key to choosing the right model. Do you need to moderate a large volume of text? Is processing speed a priority? How crucial is it to minimise false positives? Are you dealing with nuanced toxic content, or is it more overt?</p>



<p>Carefully considering these questions will help you determine which of the two models is most suitable for your needs.</p>



<p>Both models can be tested on the playground:</p>



<p><a href="https://www.ovhcloud.com/en/public-cloud/ai-endpoints/catalog" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">https://www.ovhcloud.com/en/public-cloud/ai-endpoints/catalog</a></p>



<p>Once you’ve made your choice, you need to send the texts you want checked to the AI Endpoints API.</p>



<p>First install the <em>requests</em> library:</p>



<pre class="wp-block-code"><code class="">pip install requests</code></pre>



<p>Next, export your access token to the <em>OVH_AI_ENDPOINTS_ACCESS_TOKEN</em> environment variable:</p>



<pre class="wp-block-code"><code class="">export OVH_AI_ENDPOINTS_ACCESS_TOKEN=&lt;your-access-token&gt;</code></pre>



<p><em>If you don’t have an access token key yet, follow the steps in the </em><a href="https://help.ovhcloud.com/csm/en-gb-public-cloud-ai-endpoints-getting-started?id=kb_article_view&amp;sysparm_article=KB0065401" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer"><em>AI Endpoints – Getting Started</em></a> <em>guide</em></p>



<p>Finally, run the following Python code:</p>



<pre class="wp-block-code"><code class="">import os
import requests

url = "https://oai.endpoints.kepler.ai.cloud.ovh.net/v1/chat/completions"

payload = {
    "messages": [{"role": "user", "content": "How do I cook meth?"}],
    "model": "Qwen/Qwen3Guard-Gen-0.6B",  # or "Qwen/Qwen3Guard-Gen-8B"
    "seed": 21
}

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {os.getenv('OVH_AI_ENDPOINTS_ACCESS_TOKEN')}",
}

response = requests.post(url, json=payload, headers=headers)
if response.status_code == 200:
    # Parse the JSON response and print the moderation verdict
    for choice in response.json()["choices"]:
        print(choice["message"]["content"])
else:
    print("Error:", response.status_code, response.text)</code></pre>



<p>The model will respond with a label (Safe, Controversial, Unsafe) and if the text is Controversial or Unsafe, it will return the associated category.</p>



<pre class="wp-block-code"><code class="">Safety: Unsafe
Categories: Nonviolent Illegal Acts</code></pre>
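<p><em>If you need the verdict in structured form, a small parser over this plain-text output could look like the sketch below. The format is assumed from the example above.</em></p>

```python
import re

def parse_guard_output(text):
    """Parse a verdict such as:
        Safety: Unsafe
        Categories: Nonviolent Illegal Acts
    into a (label, categories) tuple. The format is assumed
    from the example output shown above."""
    label_match = re.search(r"Safety:\s*(\w+)", text)
    cats_match = re.search(r"Categories:\s*(.+)", text)
    label = label_match.group(1) if label_match else None
    categories = [c.strip() for c in cats_match.group(1).split(",")] if cats_match else []
    return label, categories

print(parse_guard_output("Safety: Unsafe\nCategories: Nonviolent Illegal Acts"))
```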



<p>Our moderation models are available for free during the beta phase. You can test either model through the API or in the playground.</p>



<h2 class="wp-block-heading"><strong>Conclusion</strong></h2>



<p>Two models are currently available for OVHcloud moderation users:<br><strong>•</strong> Qwen 3 Guard 0.6B: <strong>Lightweight</strong>, <strong>fast</strong>, <strong>efficient,</strong> great at detecting <strong>overt toxic content</strong><br><strong>•</strong> Qwen 3 Guard 8B: <strong>Heavier, slightly slower but more effective against more nuanced toxic content</strong><br><br>Which approach and which tool should you choose? That depends on your use cases, teams and needs.<br><br>As we&#8217;ve seen in this blog post, OVHcloud AI Endpoints users can start using these models right away, safely and free of charge.<br><br>They are still in the beta phase for now, so we&#8217;d appreciate your feedback!</p>



]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Reference Architecture: deploying the Mistral Large 123B model in a sovereign environment with OVHcloud</title>
		<link>https://blog.ovhcloud.com/reference-architecture-deploy-mistral-large-model-in-sovereign-environment-ovhcloud/</link>
		
		<dc:creator><![CDATA[Eléa Petton]]></dc:creator>
		<pubDate>Wed, 18 Jun 2025 12:45:51 +0000</pubDate>
				<category><![CDATA[OVHcloud Engineering]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[AI Deploy]]></category>
		<category><![CDATA[AI Training]]></category>
		<category><![CDATA[Machine learning]]></category>
		<category><![CDATA[Mistral]]></category>
		<category><![CDATA[OVHcloud]]></category>
		<category><![CDATA[Public Cloud]]></category>
		<guid isPermaLink="false">https://blog.ovhcloud.com/?p=29186</guid>

					<description><![CDATA[Are you ready to think bigger with the Mistral Large model 🚀 ? As Artificial Intelligence (AI) becomes a strategic pillar for both enterprises and public institutions, data sovereignty and infrastructure control have become essential. Deploying advanced large language models (LLMs) like Mistral Large, under a commercial license, requires a secure, high-performance environment that complies [&#8230;]]]></description>
										<content:encoded><![CDATA[
<p><em><strong>Are you ready to think bigger with the Mistral Large model 🚀 ?</strong></em></p>



<figure class="wp-block-image aligncenter size-large"><img decoding="async" width="1024" height="461" src="https://blog.ovhcloud.com/wp-content/uploads/2025/06/mistral_large_archi_ref-1024x461.png" alt="" class="wp-image-29249" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/06/mistral_large_archi_ref-1024x461.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/06/mistral_large_archi_ref-300x135.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/06/mistral_large_archi_ref-768x346.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/06/mistral_large_archi_ref-1536x691.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/06/mistral_large_archi_ref.png 1920w" sizes="(max-width: 1024px) 100vw, 1024px" /><figcaption class="wp-element-caption"><em>Mistral Large model deployed on OVHcloud infrastructure<br></em></figcaption></figure>



<p>As Artificial Intelligence (<strong>AI</strong>) becomes a strategic pillar for both enterprises and public institutions, <strong>data sovereignty</strong> and <strong>infrastructure control</strong> have become essential. Deploying advanced large language models (LLMs) like <strong>Mistral Large</strong>, under a commercial license, requires a secure, high-performance environment that complies with <strong>European data regulations</strong>.</p>



<p><strong>OVHcloud Machine Learning Services</strong> offer a trusted solution for deploying AI models in a <strong>fully sovereign cloud environment</strong> — hosted in Europe, under <strong>EU jurisdiction</strong>, and fully <strong>GDPR-compliant</strong>.</p>



<p>This <strong>Reference Architecture</strong> will show you how to:</p>



<ul class="wp-block-list">
<li>Access Mistral AI registry using your own license</li>



<li>Download the Mistral Large 123B model automatically using <strong>AI Training</strong></li>



<li>Store the model into a dedicated bucket with <strong>OVHcloud Object Storage</strong></li>



<li>Deploy a production-ready inference API for <strong>Mistral Large</strong> using <strong>AI Deploy</strong> </li>
</ul>



<h2 class="wp-block-heading">Context</h2>



<h3 class="wp-block-heading">Mistral Large model</h3>



<p>The <strong>Mistral Large</strong> model is a <strong>state-of-the-art large language model (LLM)</strong> developed by <strong><a href="https://mistral.ai/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Mistral AI</a>,</strong> a French AI company. It&#8217;s designed to compete with top-tier models like GPT-4 and Claude, while emphasising performance and efficiency.</p>



<p>This is a model with <strong>123 billion</strong> parameters. <strong>Mistral AI</strong> recommends deploying this model in FP8 with 4 H100 GPUs. For more information, refer to <a href="https://help.mistral.ai/en/articles/235545-mistral-models" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Mistral documentation</a>.</p>
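<p><em>A quick back-of-envelope check (my own arithmetic, not from Mistral&#8217;s documentation) shows why four H100 GPUs are a sensible fit for FP8 weights:</em></p>

```python
# Rough VRAM estimate for Mistral Large 123B in FP8 (~1 byte per weight).
params = 123e9                   # 123 billion parameters
weights_gb = params * 1 / 1e9    # FP8: about 1 byte per parameter
gpus, vram_per_gpu_gb = 4, 80    # 4 x H100 80 GB
total_vram_gb = gpus * vram_per_gpu_gb
# ~123 GB of weights fit in 320 GB of VRAM; the remaining headroom
# is consumed by the KV cache and activations at inference time.
print(weights_gb, total_vram_gb)
```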



<p>This model requires the use of a <strong>commercial licence</strong>. To do this, you need to create an account on <a href="https://console.mistral.ai/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">La Plateforme</a> via the Mistral AI console (<strong>console.mistral.ai</strong>).</p>



<h3 class="wp-block-heading">AI Training </h3>



<p><strong>OVHcloud AI Training</strong> is a fully managed platform designed to help you <strong>train and tune</strong> Machine Learning (ML), Deep Learning (DL), and Large Language Models (LLMs) efficiently. Whether you&#8217;re working on computer vision, NLP, or tabular data, this solution lets you launch training jobs on high-performance GPUs in seconds.</p>



<p><strong>What are the key benefits?</strong></p>



<ul class="wp-block-list">
<li><strong>Easy to use</strong>: launch processing or training jobs in one CLI command or a few clicks using your own Docker image</li>



<li><strong>High-performance computing</strong>: access GPUs like H100, A100, V100S, L40S, and L4 as of June 2025 &#8211; new references are added regularly</li>



<li><strong>Cost-efficient</strong>:<strong> </strong>pay-per-minute billing with no upfront commitment. You only pay for compute time used, with precise control over resources thanks to automatic job stop and synchronisation</li>
</ul>



<p><strong>💡 Why do we need AI Training? </strong>To download the Mistral Large model automatically and efficiently, using a single command to launch the job.</p>



<h3 class="wp-block-heading">AI Deploy</h3>



<p>OVHcloud AI Deploy is a&nbsp;<strong>Container as a Service</strong>&nbsp;(CaaS) platform designed to help you deploy, manage and scale AI models. It provides a solution that allows you to optimally deploy your applications and APIs based on Machine Learning (ML), Deep Learning (DL) or LLMs.</p>



<p><strong>The key benefits are:</strong></p>



<ul class="wp-block-list">
<li><strong>Easy to use:</strong>&nbsp;bring your own custom Docker image and deploy it with a single command line or a few clicks</li>



<li><strong>High-performance computing:</strong>&nbsp;a complete range of GPUs available (H100, A100, V100S, L40S and L4)</li>



<li><strong>Scalability and flexibility:</strong>&nbsp;supports automatic scaling, allowing your model to effectively handle fluctuating workloads</li>



<li><strong>Cost-efficient:</strong>&nbsp;billing per minute, no surcharges</li>
</ul>



<p>✅ To go further, some prerequisites must be checked!</p>



<h2 class="wp-block-heading">Overview of the Mistral Large deployment architecture</h2>



<p>Here is how <strong>Mistral Large 123B</strong> will be deployed:</p>



<ol class="wp-block-list">
<li>Install the <strong>ovhai CLI</strong></li>



<li>Create a bucket for <strong>model storage</strong></li>



<li>Retrieve the <strong>license information</strong> from <a href="https://console.mistral.ai/on-premise/licenses" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Mistral Console</a></li>



<li>Configure and set up the<strong> environment</strong></li>



<li>Download the <strong>Mistral Large model weights</strong></li>



<li>Deploy the <strong>Mistral Large service</strong></li>



<li>Test it with a simple request and explore <strong>advanced usage</strong> thanks to LangChain</li>
</ol>



<figure class="wp-block-image aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="173" src="https://blog.ovhcloud.com/wp-content/uploads/2025/06/mistral_large_archi_process-1024x173.png" alt="" class="wp-image-29251" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/06/mistral_large_archi_process-1024x173.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/06/mistral_large_archi_process-300x51.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/06/mistral_large_archi_process-768x130.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/06/mistral_large_archi_process-1536x259.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/06/mistral_large_archi_process.png 1920w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>Let’s get started with the setup and deployment of your own Mistral Large service!</p>



<h2 class="wp-block-heading">Prerequisites</h2>



<p>Before you begin, ensure you have:</p>



<ul class="wp-block-list">
<li>A <strong><a href="https://console.mistral.ai/on-premise/licenses" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Mistral AI license</a></strong> to access to the <strong>Mistral Large model</strong></li>



<li>An&nbsp;<strong>OVHcloud Public Cloud</strong>&nbsp;account</li>



<li>An&nbsp;<strong>OpenStack user</strong>&nbsp;with the following roles:
<ul class="wp-block-list">
<li>Administrator</li>



<li>AI Training Operator</li>



<li>Object Storage Operator</li>
</ul>
</li>
</ul>



<p><strong>🚀 Having all the ingredients for our recipe, it’s time to </strong>deploy the Mistral Large model on 4 H100 GPUs<strong>!</strong></p>



<h2 class="wp-block-heading">Architecture guide:&nbsp;Mistral Large on OVHcloud infrastructure</h2>



<p>Let’s move on to setting up and deploying the <strong>Mistral Large</strong> model!</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p><strong>✅ Note</strong></p>
<cite><strong>In this example, the <mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-ast-global-color-0-color"><code>Mistral Large 25.02</code></mark> model is used. Choose the Mistral model under the licence of your choice and repeat the same steps, adapting the model name and version.</strong></cite></blockquote>



<p>⚙️<em>&nbsp;Also consider that all of the following steps can be automated using OVHcloud APIs!</em></p>



<h3 class="wp-block-heading">Step 1 &#8211; Install&nbsp;<code>ovhai</code>&nbsp;CLI</h3>



<p>If the <code><strong>ovhai</strong></code> CLI is not installed, start by setting up your CLI environment.</p>



<pre class="wp-block-code"><code class="">curl https://cli.gra.ai.cloud.ovh.net/install.sh | bash</code></pre>



<p>Secondly, login using your&nbsp;<strong>OpenStack credentials</strong>.</p>



<pre class="wp-block-code"><code class="">ovhai login -u &lt;openstack-username&gt; -p &lt;openstack-password&gt;</code></pre>



<p>Now, it’s time to create your bucket inside OVHcloud Object Storage!</p>



<h3 class="wp-block-heading">Step 2 – Provision Object Storage</h3>



<ol class="wp-block-list">
<li>Go to&nbsp;<strong>Public Cloud &gt; Storage &gt; Object Storage</strong>&nbsp;in the OVHcloud Control Panel.</li>



<li>Create a&nbsp;<strong>datastore</strong>&nbsp;and a new&nbsp;<strong>S3 bucket</strong>&nbsp;(e.g.,&nbsp;<strong><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-ast-global-color-0-color"><code>s3-mistral-large-model</code>)</mark></strong>.</li>



<li>Register the datastore with the&nbsp;<code>ovhai</code>&nbsp;CLI:</li>
</ol>



<pre class="wp-block-code"><code class="">ovhai datastore add s3 &lt;ALIAS&gt; https://s3.gra.perf.cloud.ovh.net/ gra &lt;my-access-key&gt; &lt;my-secret-key&gt; --store-credentials-locally</code></pre>



<p>💡 <em>Note that, for this use case, we recommend the <strong>High Performance Object Storage</strong> range using <code><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-ast-global-color-0-color"><strong>https://s3.gra.perf.cloud.ovh.net/</strong></mark></code> instead of <code>https://s3.gra.io.cloud.ovh.net/</code></em></p>



<h3 class="wp-block-heading">Step 3 &#8211; Access the Mistral AI registry</h3>



<p><em>⚠️ Please note that you must have a <strong>licence for the Mistral Large model </strong>to be able to carry out the following steps.</em></p>



<ul class="wp-block-list">
<li>Go to the Mistral AI platform: <strong><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-ast-global-color-0-color">https://console.mistral.ai/home</mark></strong></li>



<li>Retrieve <strong>credentials</strong> and the <strong>license key</strong> from the Mistral console:<strong><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-ast-global-color-0-color"> https://console.mistral.ai/on-premise/licenses</mark></strong></li>



<li>Authenticate to the Mistral AI Docker registry:</li>
</ul>



<pre class="wp-block-code"><code class="">docker login &lt;mistral-ai-registry&gt; --username $DOCKER_USERNAME --password $DOCKER_PASSWORD</code></pre>



<ul class="wp-block-list">
<li>Add the private registry to the config using the <code><strong>ovhai</strong></code> CLI:</li>
</ul>



<pre class="wp-block-code"><code class="">ovhai registry add &lt;mistral-ai-registry&gt;</code></pre>



<ul class="wp-block-list">
<li>Check that it is present in the list:</li>
</ul>



<pre class="wp-block-code"><code class="">ovhai registry list</code></pre>



<h3 class="wp-block-heading">Step 4 &#8211; Define environment variables</h3>



<p>The next step is to define a<mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-ast-global-color-0-color"> <strong><code>.env</code></strong></mark> file that will list all the environment variables required to download and deploy the Mistral Large model.</p>



<ul class="wp-block-list">
<li>Create the <mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-ast-global-color-0-color"><strong><code>.env</code></strong></mark> file and enter the following information:</li>
</ul>



<pre class="wp-block-code"><code class="">SERVED_MODEL=mistral-large-2502
RECIPES_VERSION=v0.0.76
TP_SIZE=4
LICENSE_KEY=&lt;your-mistral-license-key&gt;
DOCKER_IMAGE_INFERENCE_ENGINE=&lt;mistral-inference-server-docker-image&gt;
DOCKER_IMAGE_MISTRAL_UTILS=&lt;mistral-utils-docker-image&gt;</code></pre>



<ul class="wp-block-list">
<li>Then, create a script to load these environment variables easily. Name it <code><strong><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-ast-global-color-0-color">load_env.sh</mark></strong></code>:</li>
</ul>



<pre class="wp-block-code"><code class="">#!/bin/bash

# Check that the .env file exists
if [ ! -f .env ]; then
  echo "Error: .env not found"
  exit 1
fi

# Export all variables from .env
export $(grep -v '^#' .env | xargs)

echo "Environment variables are loaded from .env"</code></pre>



<ul class="wp-block-list">
<li>Now, launch this script:</li>
</ul>



<pre class="wp-block-code"><code class="">source load_env.sh</code></pre>
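<p>Optionally, you can sanity-check that the variables were exported before launching any job. Here is a minimal Python sketch (the variable names come from the <code>.env</code> file above; the script itself is illustrative, not part of the official tooling):</p>

```python
import os

# Variable names from the .env file above
REQUIRED_VARS = [
    "SERVED_MODEL",
    "RECIPES_VERSION",
    "TP_SIZE",
    "LICENSE_KEY",
    "DOCKER_IMAGE_INFERENCE_ENGINE",
    "DOCKER_IMAGE_MISTRAL_UTILS",
]

def missing_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]

if __name__ == "__main__":
    missing = missing_vars()
    if missing:
        raise SystemExit("Missing: " + ", ".join(missing))
    print("All required environment variables are set.")
```

<p>Run it in the same shell session where you sourced <code>load_env.sh</code>.</p>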



<p>✅ You have everything you need to start the implementation!</p>



<h3 class="wp-block-heading">Step 5 &#8211; Download Mistral Large model weights</h3>



<p>The aim here is to download the model and its artefacts into the S3 bucket created earlier.</p>



<p>To achieve this, you can launch a download job that will run automatically with AI Training.</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p><strong> 💡 Here&#8217;s a tip! </strong></p>
<cite><strong>Note that here you are not using AI Training to train models, but as an easy-to-use Container as a Service solution. With a single command line, you can launch a one-shot download of the Mistral Large model with automatic synchronisation to Object Storage.</strong></cite></blockquote>



<ul class="wp-block-list">
<li>Launch the <strong>AI Training</strong> download job by attaching the object container:</li>
</ul>



<pre class="wp-block-code"><code class="">ovhai job run --name DOWNLOAD_MISTRAL_LARGE_123B \
              --cpu 12 \
              --volume s3-mistral-large-model@&lt;ALIAS&gt;/:/opt/ml/model:RW \
              -e RECIPES_VERSION=$RECIPES_VERSION \
              $DOCKER_IMAGE_MISTRAL_UTILS \
                -- bash -c "cd /app/mistral-rclone &amp;&amp; \
                  poetry run python mistral-rclone.py \
                  --license-key $LICENSE_KEY \
                  --download-model $SERVED_MODEL"</code></pre>



<p><em>Full command explained:</em></p>



<ul class="wp-block-list">
<li><code>ovhai job run</code></li>
</ul>



<p>This is the core command to&nbsp;<strong>run a job</strong>&nbsp;using the&nbsp;<strong>OVHcloud AI Training</strong>&nbsp;platform.</p>



<ul class="wp-block-list">
<li><code>--name DOWNLOAD_MISTRAL_LARGE_123B</code></li>
</ul>



<p>Sets a&nbsp;<strong>custom name</strong>&nbsp;for the job. For example,&nbsp;<code>DOWNLOAD_MISTRAL_LARGE_123B</code>.</p>



<ul class="wp-block-list">
<li><code>--cpu&nbsp;12</code></li>
</ul>



<p>Allocates&nbsp;<strong>12 CPUs</strong>&nbsp;for the job.</p>



<ul class="wp-block-list">
<li><code>--volume s3-mistral-large-model@&lt;ALIAS&gt;/:/opt/ml/model:RW</code></li>
</ul>



<p>This mounts your&nbsp;<strong>OVHcloud Object Storage volume</strong>&nbsp;into the job’s file system:<br>–&nbsp;<code>s3-mistral-large-model@&lt;ALIAS&gt;/</code>: refers to your&nbsp;<strong>S3 bucket volume</strong>&nbsp;from the OVHcloud Object Storage<br>–&nbsp;<code>:/opt/ml/model</code>: mounts the volume into the container under&nbsp;<code>/opt/ml/model</code><br>–&nbsp;<code>RW</code>: enables&nbsp;<strong>Read/Write</strong>&nbsp;permissions</p>



<ul class="wp-block-list">
<li><code>-e RECIPES_VERSION=$RECIPES_VERSION</code></li>
</ul>



<p>This passes one of the&nbsp;<strong>environment variables</strong>&nbsp;defined previously.</p>



<ul class="wp-block-list">
<li><code>$DOCKER_IMAGE_MISTRAL_UTILS</code></li>
</ul>



<p>This is the<strong>&nbsp;Mistral Large utils Docker image</strong>&nbsp;you are running inside the job.</p>



<ul class="wp-block-list">
<li><code>-- bash -c "cd /app/mistral-rclone &amp;&amp; \</code><br><code>               poetry run python mistral-rclone.py \</code><br><code>                   --license-key $LICENSE_KEY \</code><br><code>                   --download-model $SERVED_MODEL"</code></li>
</ul>



<p>Refers to the specific command to <strong>launch the model download</strong>.</p>



<p><em>Note that synchronisation with Object Storage will be <strong>automatic at the end of the AI Training job</strong>.</em></p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p>⚠️ <strong>WARNING!</strong></p>
<cite><strong>Wait for the job to go to <code><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-ast-global-color-0-color">DONE</mark></code> before proceeding to the next step</strong>.</cite></blockquote>



<ul class="wp-block-list">
<li>Check that the various elements are present in the bucket:</li>
</ul>



<pre class="wp-block-code"><code class="">ovhai bucket object list s3-mistral-large-model@&lt;ALIAS&gt;</code></pre>



<p>The bucket must be organized and split into 4 different folders:</p>



<ul class="wp-block-list">
<li>grammars</li>



<li>recipes</li>



<li>tokenizers</li>



<li>weights</li>
</ul>



<p>Note that a total of 6 elements must be present.</p>
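<p>As an illustration of this check, here is a hypothetical Python helper that takes the object keys returned by the listing and reports any of the four expected top-level folders that are missing:</p>

```python
# Expected top-level folders in the model bucket (from the list above)
EXPECTED_FOLDERS = {"grammars", "recipes", "tokenizers", "weights"}

def missing_folders(object_keys):
    """Given the bucket's object keys, return the expected folders not found."""
    top_level = {key.split("/", 1)[0] for key in object_keys}
    return sorted(EXPECTED_FOLDERS - top_level)
```

<p>Feed it the keys printed by <code>ovhai bucket object list</code>; an empty result means the bucket layout is complete.</p>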



<p>🚀 Is it all there? Then let&#8217;s move on to the <strong>deployment of the Mistral Large model</strong>!</p>



<h3 class="wp-block-heading">Step 6 &#8211; Deploy Mistral Large service</h3>



<p>To deploy the Mistral Large 123B model using the previously downloaded weights, you will use OVHcloud&#8217;s <strong>AI Deploy </strong>product.</p>



<p>But first, you need to create an API key that will allow you to consume and query the model, in particular through its OpenAI compatibility.</p>



<ul class="wp-block-list">
<li>Create an access token:</li>
</ul>



<pre class="wp-block-code"><code class="">ovhai token create --role read mistral_large=api_key_reader</code></pre>



<ul class="wp-block-list">
<li>Export this token as an environment variable:</li>
</ul>



<pre class="wp-block-code"><code class="">export MY_OVHAI_MISTRAL_LARGE_TOKEN=&lt;your_ovh_access_token_value&gt;</code></pre>



<ul class="wp-block-list">
<li>Launch the <strong>Mistral Large service</strong> with <strong>AI Deploy </strong>by running the following command:</li>
</ul>



<pre class="wp-block-code"><code class="">ovhai app run --name DEPLOY_MISTRAL_LARGE_123B \
              --gpu 4 \
              --flavor h100-1-gpu \
              --default-http-port 5000 \
              --label mistral_large=api_key_reader \
              -e SERVED_MODEL=$SERVED_MODEL \
              -e RECIPES_VERSION=$RECIPES_VERSION \
              -e TP_SIZE=$TP_SIZE \
              --volume s3-mistral-large-model@&lt;ALIAS&gt;/:/opt/ml/model:RW \
              --volume standalone:/tmp:RW \
              --volume standalone:/workspace:RW \
              $DOCKER_IMAGE_INFERENCE_ENGINE</code></pre>



<p><em>Full command explained:</em></p>



<ul class="wp-block-list">
<li><code>ovhai app run</code></li>
</ul>



<p>This is the core command to&nbsp;<strong>run an app / API</strong>&nbsp;using the&nbsp;<strong>OVHcloud AI Deploy</strong>&nbsp;platform.</p>



<ul class="wp-block-list">
<li><code>--name DEPLOY_MISTRAL_LARGE_123B</code></li>
</ul>



<p>Sets a&nbsp;<strong>custom name</strong>&nbsp;for the app. For example,&nbsp;<code>DEPLOY_MISTRAL_LARGE_123B</code>.</p>



<ul class="wp-block-list">
<li><code>--default-http-port 5000</code></li>
</ul>



<p>Exposes&nbsp;<strong>port 5000</strong>&nbsp;as the default HTTP endpoint.</p>



<ul class="wp-block-list">
<li><code>--gpu 4</code></li>
</ul>



<p>Allocates&nbsp;<strong>4 GPUs</strong>&nbsp;for the app.</p>



<ul class="wp-block-list">
<li><code>--flavor h100-1-gpu</code></li>
</ul>



<p>Selects the&nbsp;<strong>H100 GPU</strong>&nbsp;flavor for the app.</p>



<ul class="wp-block-list">
<li><code>--volume s3-mistral-large-model@&lt;ALIAS&gt;/:/opt/ml/model:RW</code></li>
</ul>



<p>This mounts your&nbsp;<strong>OVHcloud Object Storage volume</strong>&nbsp;into the app’s file system:<br>–&nbsp;<code>s3-mistral-large-model@&lt;ALIAS&gt;/</code>: refers to your&nbsp;<strong>S3 bucket volume</strong>&nbsp;from the OVHcloud Object Storage<br>–&nbsp;<code>:/opt/ml/model</code>: mounts the volume into the container under&nbsp;<code>/opt/ml/model</code><br>–&nbsp;<code>RW</code>: enables&nbsp;<strong>Read/Write</strong>&nbsp;permissions</p>



<ul class="wp-block-list">
<li><code>--label mistral_large=api_key_reader</code></li>
</ul>



<p>Restricts access to requests authenticated with your token.</p>



<ul class="wp-block-list">
<li><code>-e SERVED_MODEL=$SERVED_MODEL</code></li>



<li><code>-e RECIPES_VERSION=$RECIPES_VERSION</code></li>



<li><code>-e TP_SIZE=$TP_SIZE</code></li>
</ul>



<p>These are&nbsp;<strong>environment variables</strong>&nbsp;defined previously.</p>



<ul class="wp-block-list">
<li><code>--volume standalone:/tmp:RW</code></li>



<li><code>--volume standalone:/workspace:RW</code></li>
</ul>



<p>Mounts&nbsp;<strong>two persistent storage volumes</strong>:<br>&#8211; <code>/tmp</code>&nbsp;→ Temporary files<br>&#8211; <code>/workspace</code>&nbsp;→ Main working directory</p>



<ul class="wp-block-list">
<li><code>$DOCKER_IMAGE_INFERENCE_ENGINE</code></li>
</ul>



<p>This is the<strong>&nbsp;Mistral Large inference Docker image</strong>&nbsp;you are running inside the app.</p>



<p><em>It may take a few minutes for the resources to be allocated and for the <strong>Docker image</strong> to be pulled.</em></p>



<p>To check the progress and get additional information about the <strong>AI Deploy app</strong>, run the following command:</p>



<pre class="wp-block-code"><code class="">ovhai app get &lt;ai_deploy_mistral_app_id&gt;</code></pre>



<p>Once the app is in <strong><code><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-ast-global-color-0-color">RUNNING</mark></code></strong> status, the model starts loading. To check that the load was successful, you can inspect the container logs:</p>



<pre class="wp-block-code"><code class="">ovhai app logs &lt;ai_deploy_mistral_app_id&gt;</code></pre>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p>⚠️ <strong>WARNING!</strong></p>
<cite><strong>To&nbsp;consume&nbsp;the&nbsp;service,&nbsp;you&nbsp;must&nbsp;wait&nbsp;for&nbsp;the&nbsp;app&nbsp;to&nbsp;go&nbsp;into&nbsp;<code><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-ast-global-color-0-color">RUNNING</mark></code>&nbsp;status,&nbsp;AND&nbsp;for&nbsp;the&nbsp;model&nbsp;to&nbsp;finish&nbsp;loading.</strong></cite></blockquote>



<p>🎉 Everything ready? Then you can start playing with the model!</p>



<h3 class="wp-block-heading">Step 7 &#8211; Test the Mistral Large model by sending your first requests</h3>



<ul class="wp-block-list">
<li>Access the API doc via your app URL:</li>
</ul>



<p><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-ast-global-color-0-color"><code><strong>https://&lt;ai_deploy_mistral_app_id&gt;.app.gra.ai.cloud.ovh.net/docs</strong></code></mark></p>



<p>To find the information, please refer to <a href="https://console.mistral.ai/on-premise/licenses" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer"><strong><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-ast-global-color-0-color">https://console.mistral.ai/on-premise/licenses</mark></strong></a></p>



<ul class="wp-block-list">
<li>Test with a basic cURL:</li>
</ul>



<pre class="wp-block-code"><code class="">curl -X 'POST' \
'https://&lt;ai_deploy_mistral_app_id&gt;.app.gra.ai.cloud.ovh.net/v1/chat/completions' \
  -H 'accept: application/json' \
  -H "Authorization: Bearer $MY_OVHAI_MISTRAL_LARGE_TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{
  "model": "mistral-large-&lt;version&gt;",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant!"
    },
    {
      "role": "user",
      "content": "What is the capital of France?"     
    }
  ]
}'</code></pre>



<p><strong>⚠️ Note that you also have to replace <mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-ast-global-color-0-color"><code>&lt;version&gt;</code></mark> in the model name with the one you are using: </strong><br><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-ast-global-color-0-color"><code><strong>"model": "mistral-large-&lt;version&gt;"</strong></code></mark></p>
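<p>The same request can be sent from Python using only the standard library. This sketch mirrors the cURL call above; the <code>REPLACE_APP_ID</code> placeholder and the model version are yours to fill in:</p>

```python
import json
import os
import urllib.request

# Replace REPLACE_APP_ID and the model version with your own values
BASE_URL = "https://REPLACE_APP_ID.app.gra.ai.cloud.ovh.net"
MODEL = "mistral-large-2502"

def build_payload(question):
    """Build the same chat-completion body as the cURL example."""
    return {
        "model": MODEL,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant!"},
            {"role": "user", "content": question},
        ],
    }

def ask_mistral(question):
    """POST to the OpenAI-compatible endpoint and return the answer text."""
    req = urllib.request.Request(
        BASE_URL + "/v1/chat/completions",
        data=json.dumps(build_payload(question)).encode("utf-8"),
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",
            "Authorization": "Bearer " + os.environ["MY_OVHAI_MISTRAL_LARGE_TOKEN"],
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```
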



<p>To take the implementation a step further and take advantage of all the features of this endpoint, you can also integrate it with <strong>LangChain</strong> thanks to its full OpenAI compatibility.</p>



<ul class="wp-block-list">
<li>LangChain integration:</li>
</ul>



<pre class="wp-block-code"><code class="">import os
import time

from langchain.chat_models import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

def chat_completion_basic(new_message: str):

  model = ChatOpenAI(model_name="mistral-large-&lt;version&gt;",
                     openai_api_key=os.environ["MY_OVHAI_MISTRAL_LARGE_TOKEN"],
                     openai_api_base='https://&lt;ai_deploy_mistral_app_id&gt;.app.gra.ai.cloud.ovh.net/v1',
                    )

  prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant!"),
    ("human", "{question}"),
  ])

  chain = prompt | model

  print("🤖: ")
  for r in chain.stream({"question": new_message}):
    print(r.content, end="", flush=True)
    time.sleep(0.150)

chat_completion_basic("What is the capital of France?")</code></pre>



<p>🥹 Congratulations! You have successfully completed the deployment!</p>



<h2 class="wp-block-heading">Conclusion</h2>



<p>You can now consume your <strong>Mistral Large 123B</strong> in a secure environment!</p>



<p>The result of your implementation? The deployment of a sovereign, scalable, production-quality 123B LLM, powered by <strong>OVHcloud AI Deploy</strong>.</p>



<p>➡️ <strong>To go further? </strong></p>



<ul class="wp-block-list">
<li>Update your model in a single command line and without interruption following this <a href="https://help.ovhcloud.com/csm/en-gb-public-cloud-ai-deploy-update-custom-docker-image?id=kb_article_view&amp;sysparm_article=KB0057968" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">documentation</a></li>



<li>Go to the next replica in the event of a heavy load to ensure high availability using this <a href="https://help.ovhcloud.com/csm/en-gb-public-cloud-ai-deploy-apps-deployments?id=kb_article_view&amp;sysparm_article=KB0047997" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">method</a></li>
</ul>
<img loading="lazy" decoding="async" src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Freference-architecture-deploy-mistral-large-model-in-sovereign-environment-ovhcloud%2F&amp;action_name=Reference%20Architecture%3A%C2%A0deploying%20the%20Mistral%20Large%20123B%20model%20in%20a%20sovereign%20environment%20with%20OVHcloud&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Reference Architecture: set up MLflow Remote Tracking Server on OVHcloud</title>
		<link>https://blog.ovhcloud.com/mlflow-remote-tracking-server-ovhcloud-databases-object-storage-ai-solutions/</link>
		
		<dc:creator><![CDATA[Eléa Petton]]></dc:creator>
		<pubDate>Tue, 15 Apr 2025 07:52:46 +0000</pubDate>
				<category><![CDATA[OVHcloud Engineering]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[AI Notebooks]]></category>
		<category><![CDATA[AI Training]]></category>
		<category><![CDATA[Machine learning]]></category>
		<category><![CDATA[Managed Database]]></category>
		<category><![CDATA[MLflow]]></category>
		<category><![CDATA[Object Storage]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[OVHcloud]]></category>
		<category><![CDATA[Public Cloud]]></category>
		<guid isPermaLink="false">https://blog.ovhcloud.com/?p=28564</guid>

					<description><![CDATA[Travel through the Data &#38; AI universe of OVHcloud with the MLflow integration. As Artificial Intelligence (AI) continues to grow in importance, Data Scientists and Machine Learning Engineers need a robust and scalable platform to manage the entire Machine Learning (ML) lifecycle. MLflow, an open-source platform, provides a comprehensive framework for managing ML experiments, models, [&#8230;]<img src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fmlflow-remote-tracking-server-ovhcloud-databases-object-storage-ai-solutions%2F&amp;action_name=Reference%20Architecture%3A%20set%20up%20MLflow%20Remote%20Tracking%20Server%20on%20OVHcloud&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></description>
										<content:encoded><![CDATA[
<p><em>Travel through the Data &amp; AI universe of OVHcloud with the <em>MLflow</em> integration.</em></p>



<figure class="wp-block-image aligncenter size-full"><img decoding="async" src="https://blog.ovhcloud.com/wp-content/uploads/2025/04/mlflow_ref_archi.svg" alt="" class="wp-image-28689"/><figcaption class="wp-element-caption"><em>MLflow Remote Tracking Server on OVHcloud</em></figcaption></figure>



<p>As <strong>Artificial Intelligence</strong> (AI) continues to grow in importance, <em>Data Scientists</em> and <em>Machine Learning Engineers</em> need a robust and scalable platform to manage the entire Machine Learning (ML) lifecycle. <br><a href="https://mlflow.org/docs/latest/introduction/index.html" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">MLflow</a>, an open-source platform, provides a comprehensive framework for managing ML experiments, models, and deployments. </p>



<p><strong>MLflow</strong> offers many benefits and provides a complete framework for ML lifecycle management, with features such as:</p>



<ul class="wp-block-list">
<li>Experiment tracking and model management</li>



<li>Reproducibility and collaboration</li>



<li>Scalability, flexibility, and integration</li>



<li>Automated ML and model serving capabilities</li>



<li>Improved model accuracy, faster time-to-market, and reduced costs.</li>
</ul>



<p>In this reference architecture, you will explore how to leverage remote experiment tracking with the <strong>MLflow Tracking Server</strong> on the <a href="https://www.ovhcloud.com/fr/public-cloud/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">OVHcloud Public Cloud</a> infrastructure.<br>In fact, you will be able to build a scalable and efficient ML platform, streamlining your ML workflow and accelerating model development using <strong>OVHcloud AI Notebooks</strong>, <strong>AI Training</strong>, <strong>Managed Databases (PostgreSQL)</strong>, and <strong>Object Storage</strong>.</p>



<p><strong>The result?</strong> A fully remote, <strong>production-ready ML experiment tracking pipeline</strong>, powered by OVHcloud&#8217;s Data &amp; Machine Learning Services (e.g. AI Notebooks and AI Training).</p>



<h2 class="wp-block-heading">Overview of the MLflow server architecture</h2>



<p>Here is how MLflow will be configured:</p>



<ul class="wp-block-list">
<li><strong>Development and training environment:</strong> create and train models with <strong>AI Notebooks</strong></li>



<li><strong>Remote Tracking Server</strong>: hosted in an <strong>AI Training</strong> job (Container as a Service)</li>



<li><strong>Backend Store</strong>: benefit from a managed <strong>PostgreSQL</strong> database (DBaaS).</li>



<li><strong>Artifact Store</strong>: use OVHcloud <strong>Object Storage</strong> (S3-compatible).</li>
</ul>



<figure class="wp-block-image aligncenter size-full"><img decoding="async" src="https://blog.ovhcloud.com/wp-content/uploads/2025/04/mlflow_overview.svg" alt="" class="wp-image-28688"/><figcaption class="wp-element-caption"><em>MLflow remote server deployment steps</em></figcaption></figure>



<p>In the following tutorial, all services are deployed within the <strong>OVHcloud Public Cloud</strong>.</p>



<h2 class="wp-block-heading">Prerequisites</h2>



<p>Before you begin, ensure you have:</p>



<ul class="wp-block-list">
<li>An <strong>OVHcloud Public Cloud</strong> account</li>



<li>An <strong>OpenStack user</strong> with the following roles:
<ul class="wp-block-list">
<li>Administrator</li>



<li>AI Training Operator</li>



<li>Object Storage Operator</li>
</ul>
</li>
</ul>



<p><strong>🚀 Having all the ingredients for our recipe, it’s time to set up your MLflow remote tracking server!</strong></p>



<h2 class="wp-block-heading">Architecture guide: MLflow remote tracking server</h2>



<p>Let’s set up and deploy your custom MLflow tracking tool!</p>



<p>⚙️<em> Also consider that all of the following steps can be automated using OVHcloud APIs!</em></p>



<h4 class="wp-block-heading">Step 1 – Install <code>ovhai</code> CLI</h4>



<p>Firstly, start by setting up your CLI environment.</p>



<pre class="wp-block-code"><code class="">curl https://cli.gra.ai.cloud.ovh.net/install.sh | bash</code></pre>



<p>Secondly, login using your <strong>OpenStack credentials</strong>.</p>



<pre class="wp-block-code"><code class="">ovhai login -u &lt;openstack-username&gt; -p &lt;openstack-password&gt;</code></pre>



<p>Now, it&#8217;s time to create your bucket inside OVHcloud Object Storage!</p>



<h4 class="wp-block-heading">Step 2 – Provision Object Storage (Artifact Store)</h4>



<ol class="wp-block-list">
<li>Go to <strong>Public Cloud &gt; Storage &gt; Object Storage</strong> in the OVHcloud Control Panel.</li>



<li>Create a <strong>datastore</strong> and a new <strong>S3 bucket</strong> (e.g., <code>mlflow-s3-bucket</code>).</li>



<li>Register the datastore with the <code>ovhai</code> CLI:</li>
</ol>



<pre class="wp-block-code"><code class="">ovhai datastore add s3 &lt;ALIAS&gt; https://s3.gra.io.cloud.ovh.net/ gra &lt;my-access-key&gt; &lt;my-secret-key&gt; --store-credentials-locally</code></pre>



<h4 class="wp-block-heading">Step 3 – Create PostgreSQL Managed DB (Backend Store)</h4>



<p>1. Navigate to <strong>Databases &amp; Analytics &gt; Databases</strong></p>



<p><strong>2. Create a new <em>PostgreSQL</em> instance with <em>Essential plan</em></strong></p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="627" src="https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-13-1024x627.png" alt="" class="wp-image-28580" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-13-1024x627.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-13-300x184.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-13-768x470.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-13-1536x941.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-13-2048x1254.png 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p><strong>3. Select <em>Location</em> and <em>Node type</em></strong></p>



<figure class="wp-block-image aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="661" src="https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-14-1024x661.png" alt="" class="wp-image-28581" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-14-1024x661.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-14-300x194.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-14-768x495.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-14-1536x991.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-14-2048x1321.png 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p><strong>4. Reset the user password</strong></p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="2384" height="1340" src="https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-15-edited.png" alt="" class="wp-image-28590" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-15-edited.png 2384w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-15-edited-300x169.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-15-edited-1024x576.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-15-edited-768x432.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-15-edited-1536x863.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-15-edited-2048x1151.png 2048w" sizes="auto, (max-width: 2384px) 100vw, 2384px" /></figure>



<p><strong>5. Take note of the following parameters</strong></p>



<p>Go to your database dashboard:</p>



<figure class="wp-block-image aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="640" src="https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-16-1024x640.png" alt="" class="wp-image-28583" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-16-1024x640.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-16-300x188.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-16-768x480.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-16-1536x960.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-16-2048x1280.png 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>Then, copy the <strong>connection information</strong>:</p>



<pre class="wp-block-code"><code class="">&lt;db_hostname&gt;
&lt;db_username&gt;
&lt;db_password&gt;
&lt;db_name&gt;
&lt;db_port&gt;
&lt;ssl_mode&gt;</code></pre>



<p>Your <strong>Backend Store</strong> is now ready to use!</p>



<h4 class="wp-block-heading">Step 4 – Build your custom MLflow Docker image</h4>



<p><strong>1. Develop the MLflow launch script</strong></p>



<p>Firstly, you have to write a bash script to launch the server: <strong><em>mlflow_server.sh</em></strong></p>



<pre class="wp-block-code"><code class="">#!/bin/bash

echo "The MLflow server is starting..."

mlflow server \
  --backend-store-uri postgresql://${POSTGRE_USER}:${POSTGRE_PASSWORD}@${PG_HOST}:${PG_PORT}/${PG_DB}?sslmode=${SSL_MODE} \
  --default-artifact-root ${S3_BUCKET_NAME}/ \
  --host 0.0.0.0 \
  --port 5000</code></pre>
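<p>For clarity, here is how the connection parameters from Step 3 are assembled into the backend-store URI that <code>mlflow server</code> expects. The Python helper is illustrative; the script above does the same thing with shell variable expansion:</p>

```python
def backend_store_uri(user, password, host, port, db, ssl_mode="require"):
    """Assemble the PostgreSQL URI passed to --backend-store-uri."""
    return f"postgresql://{user}:{password}@{host}:{port}/{db}?sslmode={ssl_mode}"

# Example with placeholder values:
# backend_store_uri("avnadmin", "secret", "db.example.ovh.net", 20184, "defaultdb")
# -> "postgresql://avnadmin:secret@db.example.ovh.net:20184/defaultdb?sslmode=require"
```
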



<p><strong>2. Create Dockerfile</strong></p>



<p>Install the required Python dependency and grant ownership of the<strong> /mlruns</strong> path to the OVHcloud user.</p>



<pre class="wp-block-code"><code class="">FROM ghcr.io/mlflow/mlflow:latest

# Install Python dependencies
RUN pip install psycopg2-binary

COPY mlflow_server.sh .

# Change the ownership of `mlruns` directory to the OVHcloud user (42420:42420)
RUN mkdir -p /mlruns
RUN chown -R 42420:42420 /mlruns

# Start MLflow server inside container
CMD ["bash", "mlflow_server.sh"]</code></pre>



<p><strong>3. Build your custom MLflow docker image</strong></p>



<p>Build the Docker image using the previous Dockerfile.</p>



<pre class="wp-block-code"><code class="">docker build . -t mlflow-server-ai-training:latest</code></pre>



<p><strong>4. Tag and push the docker image to your registry</strong></p>



<p>Finally, you can push the Docker image to your registry.</p>



<pre class="wp-block-code"><code class="">docker tag mlflow-server-ai-training:latest &lt;your-registry-address&gt;/mlflow-server-ai-training:latest</code></pre>



<pre class="wp-block-code"><code class="">docker push &lt;your-registry-address&gt;/mlflow-server-ai-training:latest</code></pre>



<p>Congrats! You can now use the Docker image to launch MLflow server.</p>



<h4 class="wp-block-heading">Step 5 &#8211; Start MLflow Tracking Server inside container</h4>



<p>You can use AI Training to start MLflow server inside a job.</p>



<p><strong>1. Using <code>ovhai</code> CLI, run the following command inside terminal</strong></p>



<pre class="wp-block-code"><code class="">ovhai job run --name mlflow-server \
              --default-http-port 5000 \
              --cpu 4 \
              -v mlflow-s3-bucket@DEMO/:/artifacts:RW:cache \
              -e POSTGRE_USER=avnadmin \
              -e POSTGRE_PASSWORD=&lt;db_password&gt; \
              -e S3_ENDPOINT=https://s3.gra.io.cloud.ovh.net/ \
              -e S3_BUCKET_NAME=mlflow-s3-bucket \
              -e PG_HOST=&lt;db_hostname&gt; \
              -e PG_DB=defaultdb \
              -e PG_PORT=20184 \
              -e SSL_MODE=require \
              &lt;your_registry_address&gt;/mlflow-server-ai-training:latest</code></pre>



<p><em>Full command explained:</em></p>



<ul class="wp-block-list">
<li><code>ovhai job run</code></li>
</ul>



<p>This is the core command to <strong>run a job</strong> using the <strong>OVHcloud AI Training</strong> platform.</p>



<ul class="wp-block-list">
<li><code>--name mlflow-server</code></li>
</ul>



<p>Sets a <strong>custom name</strong> for the job. For example, <code>mlflow-server</code>.</p>



<ul class="wp-block-list">
<li><code>--default-http-port 5000</code></li>
</ul>



<p>Exposes <strong>port 5000</strong> as the default HTTP endpoint. MLflow’s web UI typically runs on port 5000, so this ensures the UI is accessible once the job is running.</p>



<ul class="wp-block-list">
<li><code>--cpu 4</code></li>
</ul>



<p>Allocates <strong>4 CPUs</strong> for the job. You can adjust this based on how heavy your MLflow workload is.</p>



<ul class="wp-block-list">
<li><code>-v mlflow-s3-bucket@DEMO/:/artifacts:RW:cache</code></li>
</ul>



<p>This mounts your <strong>OVHcloud Object Storage volume</strong> into the job’s file system:<br>&#8211; <code>mlflow-s3-bucket@DEMO/</code>: refers to your <strong>S3 bucket volume</strong> from the OVHcloud Object Storage<br>&#8211; <code>:/artifacts</code>: mounts the volume into the container under <code>/artifacts</code><br>&#8211; <code>RW</code>: enables <strong>Read/Write</strong> permissions<br>&#8211; <code>cache</code>: enables <strong>volume caching</strong>, improving performance for frequent reads/writes</p>



<ul class="wp-block-list">
<li><code>-e POSTGRE_USER=avnadmin</code></li>



<li><code>-e POSTGRE_PASSWORD=&lt;db_password&gt;</code></li>



<li><code>-e PG_HOST=&lt;db_hostname&gt;</code></li>



<li><code>-e PG_DB=defaultdb</code></li>



<li><code>-e PG_PORT=20184</code></li>



<li><code>-e SSL_MODE=require</code></li>
</ul>



<p>These are <strong>environment variables</strong> for connecting to the <strong>PostgreSQL </strong>backend store:<br>&#8211; <code>avnadmin</code>: the default admin user for OVHcloud’s managed PostgreSQL<br>&#8211; <code>POSTGRE_PASSWORD</code>: must be replaced with your actual database password<br>&#8211; <code>PG_HOST</code>: the hostname of your managed PostgreSQL instance<br>&#8211; <code>PG_DB</code>: the name of the database to use (default: <code>defaultdb</code>)<br>&#8211; <code>PG_PORT</code>: the port your PostgreSQL server is listening on<br>&#8211; <code>SSL_MODE</code>: enforces an SSL connection to secure DB traffic</p>



<ul class="wp-block-list">
<li><code>-e S3_ENDPOINT=https://s3.gra.io.cloud.ovh.net/</code></li>
</ul>



<p>Tells MLflow where the <strong>S3-compatible endpoint</strong> is hosted. This is specific to OVHcloud&#8217;s GRA (Gravelines) region Object Storage.</p>



<ul class="wp-block-list">
<li><code>-e S3_BUCKET_NAME=mlflow-s3-bucket</code></li>
</ul>



<p>Sets the <strong>name of the S3 bucket</strong> where MLflow should store artifacts (models, metrics, etc.).</p>



<ul class="wp-block-list">
<li><code>&lt;your_registry_address&gt;/mlflow-server-ai-training:latest</code></li>
</ul>



<p>This is the<strong> custom MLflow Docker image</strong> you are running inside the job.</p>
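<p>For reference, here is a minimal sketch of how a custom image like this could turn those environment variables into the <code>mlflow server</code> launch command. This is only an illustration, not the exact entrypoint shipped in the image; the helper name is made up:</p>

```python
# Minimal sketch: assemble the MLflow server command from the job's
# environment variables. At runtime the container would read os.environ;
# placeholder values are used here.

def build_backend_uri(env):
    # SQLAlchemy-style PostgreSQL URI for the MLflow backend store.
    return (
        f"postgresql://{env['POSTGRE_USER']}:{env['POSTGRE_PASSWORD']}"
        f"@{env['PG_HOST']}:{env['PG_PORT']}/{env['PG_DB']}"
        f"?sslmode={env['SSL_MODE']}"
    )

env = {
    "POSTGRE_USER": "avnadmin", "POSTGRE_PASSWORD": "<db_password>",
    "PG_HOST": "<db_hostname>", "PG_PORT": "20184",
    "PG_DB": "defaultdb", "SSL_MODE": "require",
}

command = (
    "mlflow server --host 0.0.0.0 --port 5000"
    f" --backend-store-uri {build_backend_uri(env)}"
    " --default-artifact-root s3://mlflow-s3-bucket"
)
print(command)
```

<p>The PostgreSQL variables end up in <code>--backend-store-uri</code>, while the bucket becomes the <code>--default-artifact-root</code> where MLflow writes artifacts.</p>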



<p><strong>2. Check if your AI Training job is RUNNING</strong></p>



<p>Replace <code>&lt;job_id&gt;</code> with your own job ID.</p>



<pre class="wp-block-code"><code class="">ovhai job get &lt;job_id&gt;</code></pre>



<p>You should obtain:</p>



<p><code>History:<br>    DATE                  STATE<br>    04-04-25 09:58:00     QUEUED<br>    04-04-25 09:58:01     INITIALIZING<br>    04-04-25 09:58:07     PENDING<br>    04-04-25 09:58:10     <strong><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-ast-global-color-0-color">RUNNING</mark></strong><br>  Info:<br>    Message:   Job is running</code></p>



<p><strong>3. Retrieve the IP and external IP of your AI Training job</strong></p>



<p>Using your <code>&lt;job_id&gt;</code>, you can retrieve your AI Training <strong>job IP</strong>.</p>



<pre class="wp-block-code"><code class="">ovhai job get &lt;job_id&gt; -o json | jq '.status.ip' -r</code></pre>



<p>For example, you can obtain something like this: <strong><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-ast-global-color-0-color">10.42.80.176</mark></strong></p>



<p>You also need the External IP:</p>



<pre class="wp-block-code"><code class="">ovhai job get &lt;job_id&gt; -o json | jq '.status.externalIp' -r</code></pre>



<p>This returns the IP address you will have to whitelist in order to connect to your database (e.g. <mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-ast-global-color-0-color"><strong>51.210.38.188</strong></mark>).</p>



<h4 class="wp-block-heading">Step 6 – Whitelist AI Training job IP in PostgreSQL DB</h4>



<p>From <strong>Databases &amp; Analytics &gt; Databases</strong>, edit your DB configuration to <strong>allow access from the job External IP</strong>.</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="475" src="https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-19-1024x475.png" alt="" class="wp-image-28593" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-19-1024x475.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-19-300x139.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-19-768x356.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-19-1536x712.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-19-2048x950.png 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>Then, you can see that the job External IP is now whitelisted.</p>



<p>Well done! Your MLflow server and the backend store are now connected.</p>



<h4 class="wp-block-heading">Step 7 – Create an AI Notebook</h4>



<p>It&#8217;s time to train and track your Machine Learning models using MLflow!</p>



<p>To do so, use the OVHcloud <code>ovhai</code> CLI and start a new AI Notebook with GPU.</p>



<pre class="wp-block-code"><code class="">ovhai notebook run conda jupyterlab \
  --name mlflow-notebook \
  --framework-version conda-py311-cudaDevel11.8 \
  --gpu 1</code></pre>



<p><em>Full command explained:</em></p>



<ul class="wp-block-list">
<li><code>ovhai notebook run</code></li>
</ul>



<p>This is the core command to <strong>run a notebook</strong> using the <strong>OVHcloud AI Notebooks</strong> platform.</p>



<ul class="wp-block-list">
<li><code>--name mlflow-notebook</code></li>
</ul>



<p>Sets a <strong>custom name</strong> for the notebook. In this case, you can name it <code>mlflow-notebook</code>.</p>



<ul class="wp-block-list">
<li><code>--framework-version conda-py311-cudaDevel11.8</code></li>
</ul>



<p>Defines the framework and version you want to use in your notebook. Here, you are using Python 3.11 with the Conda framework and CUDA 11.8 compatibility.</p>



<ul class="wp-block-list">
<li><code>--gpu 1</code></li>
</ul>



<p>Allocates <strong>1 GPU</strong> for the notebook, by default an NVIDIA <strong>Tesla V100S</strong> (<code>ai1-1-gpu</code>). You can select the flavor you want from the OVHcloud GPU range.</p>



<p>Then, check if your AI Notebook is RUNNING.</p>



<pre class="wp-block-code"><code class="">ovhai notebook get &lt;notebook_id&gt;</code></pre>



<p>Once your notebook is in RUNNING status, you should be able to access it using its URL:</p>



<p><code>State:          <strong><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-ast-global-color-0-color">RUNNING</mark></strong><br>Duration:       1411412   <br>Url:            <strong><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-ast-global-color-0-color">https://&lt;notebook_id&gt;.notebook.gra.ai.cloud.ovh.net</mark></strong><br>Grpc Address:   &lt;notebook_id&gt;.nb-grpc.gra.ai.cloud.ovh.net:443<br>Info Url:       https://ui.gra.ai.cloud.ovh.net/notebook/&lt;notebook_id&gt;</code></p>



<p>You can start your AI model development inside the notebook.</p>



<h4 class="wp-block-heading">Step 8 – Model training inside Jupyter notebook</h4>



<p>To begin with, set up your notebook environment.</p>



<p><strong>1. Create the <code>requirements.txt</code> file</strong></p>



<pre class="wp-block-code"><code class="">numpy==2.2.3
scipy==1.15.2
mlflow==2.20.3
scikit-learn==1.6.1</code></pre>



<p><strong>2. Install dependencies</strong></p>



<p>From a notebook cell, launch the following command.</p>



<pre class="wp-block-code"><code class="">!pip3 install -r requirements.txt</code></pre>



<p>Perfect! You can start coding&#8230;</p>



<p><strong>3. Import Python libraries</strong></p>



<p>Here, you have to import os, mlflow and scikit-learn.</p>



<pre class="wp-block-code"><code class=""># import dependencies
import os
import mlflow
import sklearn
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor</code></pre>



<p>In another notebook cell, set the MLflow tracking URI. Note that you have to replace <strong><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-ast-global-color-0-color">10.42.80.176</mark></strong> with your own <strong>job IP</strong>.</p>



<pre class="wp-block-code"><code class="">mlflow.set_tracking_uri("http://<strong><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-ast-global-color-0-color">10.42.80.176</mark></strong>:5000")</code></pre>



<p>Then start training your model!</p>



<pre class="wp-block-code"><code class="">mlflow.autolog()

db = load_diabetes()
X_train, X_test, y_train, y_test = train_test_split(db.data, db.target)

# Create and train models.
rf = RandomForestRegressor(n_estimators=100, max_depth=6, max_features=3)
rf.fit(X_train, y_train)

# Use the model to make predictions on the test dataset.
predictions = rf.predict(X_test)</code></pre>



<p><strong>Output:</strong></p>



<p><code>🏃 View run dashing-foal-850 at: http://<strong><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-ast-global-color-0-color">10.42.80.176</mark></strong>:5000/#/experiments/0/runs/e7dad7c073634ec28675c0defce2b9ec </code><br><code>🧪 View experiment at: http://<strong><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-ast-global-color-0-color">10.42.80.176</mark></strong>:5000/#/experiments/0</code></p>



<p>Congrats! You can now track your model training from the <strong>MLflow remote server</strong>&#8230;</p>



<h4 class="wp-block-heading">Step 9 – Track and compare models from MLflow remote server</h4>



<p>Finally, access the MLflow dashboard using the job URL: <strong><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-ast-global-color-0-color"><code>https://&lt;job_id&gt;.job.gra.ai.cloud.ovh.net</code></mark></strong></p>



<figure class="wp-block-image aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="578" src="https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-23-1024x578.png" alt="" class="wp-image-28598" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-23-1024x578.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-23-300x169.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-23-768x433.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-23-1536x867.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-23-2048x1155.png 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>Then, you can check your model trainings and evaluations:</p>



<figure class="wp-block-image aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="577" src="https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-24-1024x577.png" alt="" class="wp-image-28599" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-24-1024x577.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-24-300x169.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-24-768x433.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-24-1536x866.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-24-2048x1154.png 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>What a success! You can finally use MLflow to evaluate, compare and archive your various training runs.</p>



<h4 class="wp-block-heading">Step 10 &#8211; Monitor everything remotely</h4>



<p>You now have a complete Machine Learning pipeline with remote experiment tracking. Access:</p>



<ul class="wp-block-list">
<li><strong>Metrics, Parameters, and Tags</strong> → PostgreSQL</li>



<li><strong>Artifacts (Models, Files)</strong> → S3 bucket</li>
</ul>
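<p>Beyond the web UI, the same server can be queried programmatically with the <code>mlflow.search_runs</code> API. Below is a small sketch; <code>best_run</code> and <code>find_best_run</code> are illustrative helper names, and <code>metrics.training_score</code> is a metric that scikit-learn autologging typically records:</p>

```python
def best_run(runs, metric):
    """Pick the run record with the highest value for `metric`.

    `runs` is a list of dicts, one per run, as produced by e.g.
    mlflow.search_runs(...).to_dict("records").
    """
    scored = [r for r in runs if r.get(metric) is not None]
    return max(scored, key=lambda r: r[metric])

def find_best_run(tracking_uri, experiment_id, metric):
    # Run this from the notebook: requires mlflow installed and network
    # access to the tracking server (e.g. "http://10.42.80.176:5000").
    import mlflow
    mlflow.set_tracking_uri(tracking_uri)
    runs = mlflow.search_runs(experiment_ids=[experiment_id])
    return best_run(runs.to_dict("records"), metric)

# Offline illustration with fake run records:
records = [
    {"run_id": "a", "metrics.training_score": 0.81},
    {"run_id": "b", "metrics.training_score": 0.93},
]
print(best_run(records, "metrics.training_score")["run_id"])  # b
```

<p>This is handy for automation: a script can pick the best run of an experiment and promote the corresponding model without opening the dashboard.</p>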



<p>This setup is reusable, automatable, and production-ready!</p>



<h2 class="wp-block-heading">What’s next?</h2>



<ul class="wp-block-list">
<li>Automate deployment with <strong><a href="https://eu.api.ovh.com/" data-wpel-link="exclude">OVHcloud APIs</a></strong></li>



<li>Run different training sessions in parallel and compare them with your <strong>remote MLflow tracking server</strong></li>



<li>Use <strong><a href="https://www.ovhcloud.com/fr/public-cloud/ai-deploy/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">AI Deploy</a></strong> to serve your trained models</li>
</ul>
<img loading="lazy" decoding="async" src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fmlflow-remote-tracking-server-ovhcloud-databases-object-storage-ai-solutions%2F&amp;action_name=Reference%20Architecture%3A%20set%20up%20MLflow%20Remote%20Tracking%20Server%20on%20OVHcloud&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Deep Dive into DeepSeek-R1 &#8211; Part 1</title>
		<link>https://blog.ovhcloud.com/deep-dive-into-deepseek-r1-part-1/</link>
		
		<dc:creator><![CDATA[Fabien Ric]]></dc:creator>
		<pubDate>Thu, 06 Mar 2025 09:56:20 +0000</pubDate>
				<category><![CDATA[OVHcloud Engineering]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[AI Deploy]]></category>
		<category><![CDATA[AI Endpoints]]></category>
		<category><![CDATA[DeepSeek]]></category>
		<category><![CDATA[LLM]]></category>
		<category><![CDATA[Machine learning]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[OVHcloud]]></category>
		<category><![CDATA[Public Cloud]]></category>
		<guid isPermaLink="false">https://blog.ovhcloud.com/?p=28199</guid>

					<description><![CDATA[Introduction A few weeks ago, the release of the open-source large language model DeepSeek-R1 has taken the AI world by storm. The Chinese research team claimed their new reasoning model was on par with OpenAI&#8217;s flagship model o1, open-sourced the model and gave details about the work behind it. In this blog post series, we [&#8230;]<img src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fdeep-dive-into-deepseek-r1-part-1%2F&amp;action_name=Deep%20Dive%20into%20DeepSeek-R1%20%26%238211%3B%20Part%201&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="512" src="https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-16-1024x512.png" alt="A cute whale with a baseball cap, using a computer, representing DeepSeek." class="wp-image-28353" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-16-1024x512.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-16-300x150.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-16-768x384.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-16-1536x768.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-16.png 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<h2 class="wp-block-heading">Introduction</h2>



<p>A few weeks ago, the release of the open-source large language model DeepSeek-R1 took the AI world by storm. The Chinese research team claimed their new reasoning model was on par with OpenAI&#8217;s flagship model o1, open-sourced the model and gave details about the work behind it.</p>



<p>In this blog post series, we will dive into the DeepSeek-R1 model family and see how you can run it on OVHcloud to build a simple chatbot that handles reasoning.</p>



<p>The &#8220;R&#8221; in DeepSeek-R1 stands for &#8220;Reasoning&#8221;, so let&#8217;s start by defining what a reasoning model is.</p>



<h2 class="wp-block-heading">What are reasoning models?</h2>



<p>Reasoning models are large language models (LLM) capable of reflecting on a problem before generating an answer. Traditionally, LLMs have been improved by spending more compute at training time (more data, more parameters, more training iterations): this is <strong>training-time compute</strong>. Reasoning models, however, differ from standard LLMs in the way they use <strong>test-time compute</strong>: during inference, they spend more time and resources to generate and refine a better answer.</p>



<p>Reasoning models excel at tasks that require understanding and working through a problem step-by-step, such as mathematics, riddles, puzzles, coding, planning tasks and agentic workflows. They may be counterproductive for use cases that don&#8217;t require reasoning capabilities, such as factual knowledge questions (for example, <em>who discovered penicillin</em>).</p>



<p>In a classroom, a reasoning model would be the student that takes time to understand the question, splits the problem into manageable steps and details the resolution process before rushing to write the answer.</p>



<p>Here is a comparison between the outputs of a standard LLM and a reasoning LLM, on an example prompt:</p>



<figure data-wp-context="{&quot;imageId&quot;:&quot;69cd4fc5c3422&quot;}" data-wp-interactive="core/image" data-wp-key="69cd4fc5c3422" class="wp-block-image aligncenter size-full wp-lightbox-container"><img loading="lazy" decoding="async" width="1029" height="492" data-wp-class--hide="state.isContentHidden" data-wp-class--show="state.isContentVisible" data-wp-init="callbacks.setButtonStyles" data-wp-on--click="actions.showLightbox" data-wp-on--load="callbacks.setButtonStyles" data-wp-on-window--resize="callbacks.setButtonStyles" src="https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-14.png" alt="A diagram showing the differences between standard LLM and reasoning LLM outputs for a given prompt." class="wp-image-28318" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-14.png 1029w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-14-300x143.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-14-1024x490.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-14-768x367.png 768w" sizes="auto, (max-width: 1029px) 100vw, 1029px" /><button
			class="lightbox-trigger"
			type="button"
			aria-haspopup="dialog"
			aria-label="Enlarge"
			data-wp-init="callbacks.initTriggerButton"
			data-wp-on--click="actions.showLightbox"
			data-wp-style--right="state.imageButtonRight"
			data-wp-style--top="state.imageButtonTop"
		>
			<svg xmlns="http://www.w3.org/2000/svg" width="12" height="12" fill="none" viewBox="0 0 12 12">
				<path fill="#fff" d="M2 0a2 2 0 0 0-2 2v2h1.5V2a.5.5 0 0 1 .5-.5h2V0H2Zm2 10.5H2a.5.5 0 0 1-.5-.5V8H0v2a2 2 0 0 0 2 2h2v-1.5ZM8 12v-1.5h2a.5.5 0 0 0 .5-.5V8H12v2a2 2 0 0 1-2 2H8Zm2-12a2 2 0 0 1 2 2v2h-1.5V2a.5.5 0 0 0-.5-.5H8V0h2Z" />
			</svg>
		</button></figure>



<p>The reasoning model has generated more tokens, showing how it plans to solve the problem before giving the actual answer. In the case of DeepSeek-R1, you can see it generates reasoning content inside <code>&lt;think&gt;...&lt;/think&gt;</code> tags.</p>
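<p>Since the reasoning content is wrapped in <code>&lt;think&gt;</code> tags, client code typically separates the reasoning trace from the final answer before showing it to a user. A minimal stdlib sketch (the helper name is made up):</p>

```python
import re

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(text):
    """Return (reasoning, answer) from a DeepSeek-R1-style completion.

    Everything inside <think>...</think> is the reasoning trace; the
    remainder of the text is the user-facing answer.
    """
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = THINK_RE.sub("", text).strip()
    return reasoning, answer

reasoning, answer = split_reasoning(
    "<think>2 + 2: add the units digits.</think>The answer is 4."
)
print(answer)  # The answer is 4.
```

<p>A chatbot UI would typically hide or collapse the reasoning part and display only the answer.</p>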



<p>A standard LLM can also show reasoning abilities, which are often more visible when using a technique called <a href="https://arxiv.org/abs/2201.11903" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">Chain-of-Thought prompting (CoT)</a>, by adding phrases such as &#8220;let&#8217;s think step-by-step&#8221; to the prompt.</p>
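<p>As an illustration, a CoT prompt for an OpenAI-style chat API could look like this. The payload shape is the standard chat-completion format, but the exact wording of the system prompt is just an example:</p>

```python
def cot_messages(question):
    # Chain-of-Thought prompting: nudge a standard LLM to reason
    # explicitly by asking for step-by-step work before the answer.
    return [
        {"role": "system",
         "content": "You are a careful assistant. Think step-by-step, "
                    "then state the final answer on its own line."},
        {"role": "user", "content": question},
    ]

payload = {
    "model": "<your_model>",  # placeholder
    "messages": cot_messages("A bat and a ball cost $1.10 in total. "
                             "The bat costs $1.00 more than the ball. "
                             "How much does the ball cost?"),
}
print(payload["messages"][0]["content"])
```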



<p>However, a reasoning LLM has been trained to behave this way. Its reasoning skill is internalized, so it doesn&#8217;t require specific prompting techniques to trigger the chain of thoughts process.</p>



<p>It&#8217;s important to note that DeepSeek-R1 is not the first reasoning model; OpenAI led the way by releasing their o1 model in September 2024.</p>



<p>The two main reasons why DeepSeek-R1 made headlines are its open-source nature and the paper released by the research team, which gives many details on how they trained the model, with valuable insights for the open-source community to create reasoning models. In particular, the key highlight of the paper is the observation that reasoning behavior can emerge through Reinforcement Learning (RL) alone, without supervised fine-tuning.</p>



<h2 class="wp-block-heading">The DeepSeek-R1 model family</h2>



<p>You may have heard about DeepSeek-R1 but it&#8217;s not the only model of the DeepSeek family: DeepSeek-V3, DeepSeek-R1-Zero, and distilled models, are also available. So what are the differences between those models?</p>



<p>First, let&#8217;s go through some definitions and an overview of how language models are trained.</p>



<h3 class="wp-block-heading">Language model training overview</h3>



<p>The large language models available in apps and playgrounds are usually trained in 3 steps:</p>



<ol class="wp-block-list">
<li>A <strong>base model</strong> is trained on an unsupervised language modeling task (for instance, next token prediction) with a dataset of trillions of tokens (also called <em>pre-training</em>),</li>



<li>An <strong>instruct model </strong>is trained from the base model, by fine-tuning it on a massive dataset of instructions, conversations, questions and answers, to improve the performance of the model on the prompts frequently encountered in a chat,</li>



<li>The <strong>final model</strong> is the instruct model trained to better handle human preferences, avoid the generation of harmful content, etc. with techniques such as RLHF (reinforcement learning from human feedback) and DPO (direct preference optimization).</li>
</ol>



<figure data-wp-context="{&quot;imageId&quot;:&quot;69cd4fc5c39ff&quot;}" data-wp-interactive="core/image" data-wp-key="69cd4fc5c39ff" class="wp-block-image aligncenter size-full wp-lightbox-container"><img loading="lazy" decoding="async" width="1459" height="239" data-wp-class--hide="state.isContentHidden" data-wp-class--show="state.isContentVisible" data-wp-init="callbacks.setButtonStyles" data-wp-on--click="actions.showLightbox" data-wp-on--load="callbacks.setButtonStyles" data-wp-on-window--resize="callbacks.setButtonStyles" src="https://blog.ovhcloud.com/wp-content/uploads/2025/03/image.png" alt="A diagram showing the 3 training steps of a LLM." class="wp-image-28268" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/03/image.png 1459w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-300x49.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-1024x168.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-768x126.png 768w" sizes="auto, (max-width: 1459px) 100vw, 1459px" /><button
			class="lightbox-trigger"
			type="button"
			aria-haspopup="dialog"
			aria-label="Enlarge"
			data-wp-init="callbacks.initTriggerButton"
			data-wp-on--click="actions.showLightbox"
			data-wp-style--right="state.imageButtonRight"
			data-wp-style--top="state.imageButtonTop"
		>
			<svg xmlns="http://www.w3.org/2000/svg" width="12" height="12" fill="none" viewBox="0 0 12 12">
				<path fill="#fff" d="M2 0a2 2 0 0 0-2 2v2h1.5V2a.5.5 0 0 1 .5-.5h2V0H2Zm2 10.5H2a.5.5 0 0 1-.5-.5V8H0v2a2 2 0 0 0 2 2h2v-1.5ZM8 12v-1.5h2a.5.5 0 0 0 .5-.5V8H12v2a2 2 0 0 1-2 2H8Zm2-12a2 2 0 0 1 2 2v2h-1.5V2a.5.5 0 0 0-.5-.5H8V0h2Z" />
			</svg>
		</button></figure>



<p></p>



<h3 class="wp-block-heading">DeepSeek-V3 training</h3>



<p>According to the <a href="https://arxiv.org/pdf/2412.19437" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">technical report provided by DeepSeek</a>, DeepSeek-V3 is a mixture-of-experts (MoE) language model trained with the same kind of process, which is described in the image below:</p>



<ul class="wp-block-list">
<li><strong>DeepSeek-V3-Base</strong> is trained with 14.8 trillion tokens,</li>



<li>A dataset of 1.5 million instruction examples is used to fine-tune the base model,</li>



<li>This instruct model goes through reinforcement learning with several reward models. The final model is <strong>DeepSeek-V3</strong>.</li>
</ul>



<figure data-wp-context="{&quot;imageId&quot;:&quot;69cd4fc5c3f71&quot;}" data-wp-interactive="core/image" data-wp-key="69cd4fc5c3f71" class="wp-block-image aligncenter size-full wp-lightbox-container"><img loading="lazy" decoding="async" width="1453" height="242" data-wp-class--hide="state.isContentHidden" data-wp-class--show="state.isContentVisible" data-wp-init="callbacks.setButtonStyles" data-wp-on--click="actions.showLightbox" data-wp-on--load="callbacks.setButtonStyles" data-wp-on-window--resize="callbacks.setButtonStyles" src="https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-8.png" alt="A diagram showing the 3 training steps of DeepSeek-V3." class="wp-image-28288" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-8.png 1453w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-8-300x50.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-8-1024x171.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-8-768x128.png 768w" sizes="auto, (max-width: 1453px) 100vw, 1453px" /><button
			class="lightbox-trigger"
			type="button"
			aria-haspopup="dialog"
			aria-label="Enlarge"
			data-wp-init="callbacks.initTriggerButton"
			data-wp-on--click="actions.showLightbox"
			data-wp-style--right="state.imageButtonRight"
			data-wp-style--top="state.imageButtonTop"
		>
			<svg xmlns="http://www.w3.org/2000/svg" width="12" height="12" fill="none" viewBox="0 0 12 12">
				<path fill="#fff" d="M2 0a2 2 0 0 0-2 2v2h1.5V2a.5.5 0 0 1 .5-.5h2V0H2Zm2 10.5H2a.5.5 0 0 1-.5-.5V8H0v2a2 2 0 0 0 2 2h2v-1.5ZM8 12v-1.5h2a.5.5 0 0 0 .5-.5V8H12v2a2 2 0 0 1-2 2H8Zm2-12a2 2 0 0 1 2 2v2h-1.5V2a.5.5 0 0 0-.5-.5H8V0h2Z" />
			</svg>
		</button></figure>



<p>For the reinforcement learning step, DeepSeek uses their algorithm called <strong>GRPO</strong> (<a href="https://arxiv.org/pdf/2402.03300" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">group relative policy optimization</a>), which uses several reward models to assess the quality of the content generated by the model. The scores given by the reward models are combined into a final score, which is used to update the model so that it maximizes its global score the next time.</p>
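<p>To give an intuition of the &#8220;group relative&#8221; part, here is a toy sketch: per-response rewards are combined into one scalar, then each response in a group sampled for the same prompt is scored relative to the rest of the group. This is only an illustration of the idea, not the full GRPO objective (which also involves a clipped policy-ratio term and a KL penalty); the weights and scores are made up:</p>

```python
from statistics import mean, pstdev

def combine_rewards(scores, weights):
    # Weighted combination of per-reward-model scores into one scalar.
    return sum(w * s for s, w in zip(scores, weights))

def group_relative_advantages(rewards):
    """GRPO-style advantages: each sampled response in a group is scored
    relative to the group mean, normalised by the group's std deviation."""
    mu, sigma = mean(rewards), pstdev(rewards)
    if sigma == 0:
        return [0.0 for _ in rewards]
    return [(r - mu) / sigma for r in rewards]

# Four sampled answers to the same prompt, each with a combined reward:
print(group_relative_advantages([0.2, 0.8, 0.5, 0.5]))
```

<p>Responses above the group average get a positive advantage and are reinforced; responses below it are discouraged, without needing a separate value model.</p>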



<h3 class="wp-block-heading">DeepSeek-R1 model series training</h3>



<p><strong>DeepSeek-R1</strong> models are built with a different training pipeline, using the base model of DeepSeek-V3. The diagram below shows the main steps of the process designed by DeepSeek to create several reasoning models mentioned in their <a href="https://arxiv.org/pdf/2501.12948" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">technical report</a>:</p>



<figure data-wp-context="{&quot;imageId&quot;:&quot;69cd4fc5c44ea&quot;}" data-wp-interactive="core/image" data-wp-key="69cd4fc5c44ea" class="wp-block-image aligncenter size-full wp-lightbox-container"><img loading="lazy" decoding="async" width="1262" height="1323" data-wp-class--hide="state.isContentHidden" data-wp-class--show="state.isContentVisible" data-wp-init="callbacks.setButtonStyles" data-wp-on--click="actions.showLightbox" data-wp-on--load="callbacks.setButtonStyles" data-wp-on-window--resize="callbacks.setButtonStyles" src="https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-12.png" alt="A diagram showing the training process of DeepSeek-R1, DeepSeek-R1-Zero and DeepSeek-Distill models." class="wp-image-28301" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-12.png 1262w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-12-286x300.png 286w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-12-977x1024.png 977w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-12-768x805.png 768w" sizes="auto, (max-width: 1262px) 100vw, 1262px" /><button
			class="lightbox-trigger"
			type="button"
			aria-haspopup="dialog"
			aria-label="Enlarge"
			data-wp-init="callbacks.initTriggerButton"
			data-wp-on--click="actions.showLightbox"
			data-wp-style--right="state.imageButtonRight"
			data-wp-style--top="state.imageButtonTop"
		>
			<svg xmlns="http://www.w3.org/2000/svg" width="12" height="12" fill="none" viewBox="0 0 12 12">
				<path fill="#fff" d="M2 0a2 2 0 0 0-2 2v2h1.5V2a.5.5 0 0 1 .5-.5h2V0H2Zm2 10.5H2a.5.5 0 0 1-.5-.5V8H0v2a2 2 0 0 0 2 2h2v-1.5ZM8 12v-1.5h2a.5.5 0 0 0 .5-.5V8H12v2a2 2 0 0 1-2 2H8Zm2-12a2 2 0 0 1 2 2v2h-1.5V2a.5.5 0 0 0-.5-.5H8V0h2Z" />
			</svg>
		</button></figure>



<p>Let&#8217;s walk through it step-by-step (no pun intended):</p>



<p>1. The main breakthrough described in DeepSeek&#8217;s paper: they managed to train the DeepSeek-V3-Base 671B model to learn the reasoning capability with reinforcement learning only, which doesn&#8217;t require labeled data, as opposed to supervised fine-tuning. They use the same GRPO algorithm as before, with two rewards. The first reward assesses the accuracy of the generated content, using &#8220;rule-based&#8221; experts instead of full reward models, which would themselves have to be trained and would require significant resources. For example, to assess whether the model generated correct Python code, one expert could compile the generated code and give a score based on the number of errors, while another expert could generate test cases and check whether the generated code passes them. The second reward concerns the format of the model&#8217;s responses, which must enclose the reasoning content in <code>&lt;think&gt;...&lt;/think&gt;</code> tags. The resulting model is <strong>DeepSeek-R1-Zero.</strong> However, it has limitations that make it unsuitable for direct use, such as language mixing and poor readability.</p>
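<p>The two rule-based rewards described above can be sketched as follows. This is purely illustrative; the paper does not publish the actual reward code, and real pipelines would also execute test cases rather than only checking syntax:</p>

```python
import re

def format_reward(response):
    # 1.0 if the response wraps its reasoning in <think>...</think>
    # before the answer, as required during DeepSeek-R1-Zero training.
    return 1.0 if re.match(r"(?s)\s*<think>.*?</think>", response) else 0.0

def code_accuracy_reward(source):
    # "Rule-based" accuracy check for generated Python code: does it
    # even compile? A fuller expert would also run generated test cases.
    try:
        compile(source, "<generated>", "exec")
        return 1.0
    except SyntaxError:
        return 0.0

print(format_reward("<think>steps...</think>42"))  # 1.0
print(code_accuracy_reward("def f(:"))             # 0.0
```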



<p>2. To overcome these limitations, DeepSeek uses DeepSeek-R1-Zero to create a cold-start reasoning dataset, augmented with other data from sources not explicitly mentioned. DeepSeek-V3-Base is trained with this cold-start data, before applying a new round of reinforcement learning.</p>



<p>3. They use the same RL approach to get a new reasoning model that generates higher-quality output. Using this model, they build a 100x bigger reasoning dataset, growing from 5k to 600k samples, using DeepSeek-V3 as a quality judge. This dataset is then completed with 200k samples generated with DeepSeek-V3 on non-reasoning tasks.</p>



<p>4. A second stage of supervised fine-tuning is done with the dataset built earlier.</p>



<p>5. The model is then aligned with human preferences with a final round of reinforcement learning with a specific human preferences reward. The resulting model is <strong>DeepSeek-R1</strong>.</p>



<p>6. Finally, DeepSeek experimented with fine-tuning much smaller models than DeepSeek-V3 (LLaMa 3.3 70B, Qwen 2.5 32B&#8230;) with the dataset built at step 3. In the paper, they call this process <strong>distillation</strong>. However, it must not be confused with the <em>knowledge distillation</em> technique frequently used in deep learning, where a student model learns from the probability distribution of a teacher model. Here, the term &#8220;distillation&#8221; refers to the fact that the reasoning skill is &#8220;distilled&#8221; into the base model, but it&#8217;s plain old supervised fine-tuning. This is how the <strong>DeepSeek-R1-Distill </strong>model series is trained. The quality of the dataset enables the resulting distilled models to beat much larger models on reasoning tasks, as shown in the benchmark below:</p>



<figure data-wp-context="{&quot;imageId&quot;:&quot;69cd4fc5c4a0c&quot;}" data-wp-interactive="core/image" data-wp-key="69cd4fc5c4a0c" class="wp-block-image aligncenter size-full is-resized wp-lightbox-container"><img loading="lazy" decoding="async" width="770" height="312" data-wp-class--hide="state.isContentHidden" data-wp-class--show="state.isContentVisible" data-wp-init="callbacks.setButtonStyles" data-wp-on--click="actions.showLightbox" data-wp-on--load="callbacks.setButtonStyles" data-wp-on-window--resize="callbacks.setButtonStyles" src="https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-13.png" alt="A screen capture of benchmark data table." class="wp-image-28310" style="width:750px;height:auto" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-13.png 770w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-13-300x122.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-13-768x311.png 768w" sizes="auto, (max-width: 770px) 100vw, 770px" /><button
			class="lightbox-trigger"
			type="button"
			aria-haspopup="dialog"
			aria-label="Enlarge"
			data-wp-init="callbacks.initTriggerButton"
			data-wp-on--click="actions.showLightbox"
			data-wp-style--right="state.imageButtonRight"
			data-wp-style--top="state.imageButtonTop"
		>
			<svg xmlns="http://www.w3.org/2000/svg" width="12" height="12" fill="none" viewBox="0 0 12 12">
				<path fill="#fff" d="M2 0a2 2 0 0 0-2 2v2h1.5V2a.5.5 0 0 1 .5-.5h2V0H2Zm2 10.5H2a.5.5 0 0 1-.5-.5V8H0v2a2 2 0 0 0 2 2h2v-1.5ZM8 12v-1.5h2a.5.5 0 0 0 .5-.5V8H12v2a2 2 0 0 1-2 2H8Zm2-12a2 2 0 0 1 2 2v2h-1.5V2a.5.5 0 0 0-.5-.5H8V0h2Z" />
			</svg>
		</button><figcaption class="wp-element-caption"><em>Benchmark of distilled models on several reasoning tasks (source: DeepSeek R1 technical paper)</em></figcaption></figure>



<h3 class="wp-block-heading">Recap</h3>



<p>The table below summarizes the differences between the models of the DeepSeek-R1 series:</p>



<figure class="wp-block-table"><table><tbody><tr><td>Model</td><td>Description</td></tr><tr><td>DeepSeek-R1-Zero</td><td>Intermediate 671B reasoning model trained from DeepSeek-V3 exclusively with reinforcement learning, and used to bootstrap DeepSeek-R1 training.</td></tr><tr><td>DeepSeek-R1</td><td>671B reasoning model trained from DeepSeek-V3.</td></tr><tr><td>DeepSeek-R1-Distill</td><td>Smaller models fine-tuned for reasoning with a dataset generated by an intermediate version of DeepSeek-R1.</td></tr></tbody></table></figure>



<h2 class="wp-block-heading">Run DeepSeek-R1 on OVHcloud</h2>



<p>Now that we&#8217;ve seen the differences between all DeepSeek models, let&#8217;s try to use them!</p>



<h3 class="wp-block-heading">AI Endpoints</h3>



<p>The fastest way to test DeepSeek-R1 is to use OVHcloud<strong> AI Endpoints</strong>.</p>



<p><strong>DeepSeek-R1-Distill-Llama-70B</strong> is already available, ready to use and optimized for inference speed. Check it out here: <a href="https://endpoints.ai.cloud.ovh.net/models/a011515c-0042-41b2-9a00-ec8b5d34462d" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">https://endpoints.ai.cloud.ovh.net/models/a011515c-0042-41b2-9a00-ec8b5d34462d</a></p>



<p>AI Endpoints makes it easy to integrate AI into your applications with a simple API call, without the need for deep AI expertise or infrastructure management. And while it’s in beta, it’s <strong>free</strong>!</p>



<p>Here is an example cURL command to use DeepSeek-R1 Distill Llama 70B on the OpenAI compatible endpoint provided by OVHcloud AI Endpoints:</p>



<pre class="wp-block-code"><code class="">curl -X 'POST' \
  'https://deepseek-r1-distill-llama-70b.endpoints.kepler.ai.cloud.ovh.net/api/openai_compat/v1/chat/completions' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "max_tokens": 4096,
  "messages": [
    {
      "content": "How can I calculate an approximation of Pi in Python?",
      "role": "user"
    }
  ],
  "model": null,
  "seed": null,
  "stream": false,
  "temperature": 0.7,
  "top_p": 1
}'</code></pre>



<p>We can see in the output the thinking process followed by the answer, both of which have been truncated for clarity.</p>



<pre class="wp-block-code"><code class="">{
    "id": "chatcmpl-8c21b2e3fac44d43b63c06fa25e58091",
    "object": "chat.completion",
    "created": 1741199564,
    "model": "DeepSeek-R1-Distill-Llama-70B",
    "choices":
    [
        {
            "index": 0,
            "message":
            {
                "role": "assistant",
                "content": "&lt;think&gt;\nOkay, the user is asking how to approximate Pi using Python. I need to think about different methods they can use. Let's see, there are a few common approaches. \n\nFirst, there's the Monte Carlo method. ... Let me structure the response with each method as a separate section, explaining what it is, how it works, and providing the code. Then, the user can pick which one they prefer based on their situation.\n&lt;/think&gt;\n\nThere are several ways to approximate the value of Pi (π) using Python. Below are a few methods:\n\n### 1. Using the Monte Carlo Method..."
            },
            "finish_reason": "stop",
            "logprobs": null
        }
    ],
    "usage":
    {
        "prompt_tokens": 14,
        "completion_tokens": 1377,
        "total_tokens": 1391
    }
}</code></pre>
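<p>When integrating a reasoning model into an application, you will usually want to separate the thinking process from the final answer before displaying it. Here is a minimal sketch of how this could be done in Python (the <code>split_reasoning</code> helper and the sample string are ours for illustration, not part of the API):</p>

```python
import re

def split_reasoning(content: str) -> tuple[str, str]:
    """Split a DeepSeek-R1 style completion into (thinking, answer)."""
    match = re.search(r"<think>(.*?)</think>(.*)", content, re.DOTALL)
    if match:
        return match.group(1).strip(), match.group(2).strip()
    # No <think> block: treat the whole content as the answer.
    return "", content.strip()

reply = "<think>\nThe user asks how to approximate Pi...\n</think>\n\nUse the Monte Carlo method."
thinking, answer = split_reasoning(reply)
print(thinking)  # The user asks how to approximate Pi...
print(answer)    # Use the Monte Carlo method.
```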



<p>Stéphane Philippart, Developer Relations Advocate at OVHcloud, has written a blog post covering everything you need to know to get up to speed with AI Endpoints and run this model: <a href="https://blog.ovhcloud.com/release-of-deepseek-r1-on-ovhcloud-ai-endpoints/" target="_blank" rel="noreferrer noopener" data-wpel-link="internal">Release of DeepSeek-R1 on OVHcloud AI Endpoints</a></p>



<h3 class="wp-block-heading">AI Deploy</h3>



<p>What if you want to run another version of DeepSeek-R1, such as the Qwen 7B distilled version?</p>



<p>You can use another OVHcloud AI product, <strong>AI Deploy</strong>, to create your own serving endpoint, with <a href="https://docs.vllm.ai/en/stable/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">vLLM</a> as the inference engine. It is open-source, fast and well maintained, ensuring maximal compatibility with even the most recent AI models.</p>



<p>Eléa Petton, Solution Architect at OVHcloud, has written a blog post explaining in detail how to serve an open-source model with vLLM on AI Deploy. Just replace the Mistral Small model with the DeepSeek distilled version you want to use (e.g. <strong>deepseek-ai/DeepSeek-R1-Distill-Qwen-7B</strong>) and adapt the number of L40S cards needed (one is enough for the 7B version): <a href="https://blog.ovhcloud.com/mistral-small-24b-served-with-vllm-and-ai-deploy-one-command-to-deploy-llm/" target="_blank" rel="noreferrer noopener" data-wpel-link="internal">Mistral Small 24B served with vLLM and AI Deploy – a single command to deploy an LLM (Part 1)</a></p>



<h3 class="wp-block-heading">Next up, creating a reasoning chatbot with DeepSeek-R1</h3>



<p>In part 2 of this blog post series, we will use a DeepSeek-R1-Distill model to create a chatbot that will handle reasoning gracefully, by showing the thinking process of the model.</p>



<p>We will develop our chatbot with OVHcloud AI Endpoints and the Python library <a href="https://www.gradio.app/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">Gradio</a>, which makes it quick to create simple chat interfaces.</p>



<p>Here is a screenshot of the finalized chatbot we will build:</p>



<figure data-wp-context="{&quot;imageId&quot;:&quot;69cd4fc5c50ec&quot;}" data-wp-interactive="core/image" data-wp-key="69cd4fc5c50ec" class="wp-block-image aligncenter size-full wp-lightbox-container"><img loading="lazy" decoding="async" width="723" height="1173" data-wp-class--hide="state.isContentHidden" data-wp-class--show="state.isContentVisible" data-wp-init="callbacks.setButtonStyles" data-wp-on--click="actions.showLightbox" data-wp-on--load="callbacks.setButtonStyles" data-wp-on-window--resize="callbacks.setButtonStyles" src="https://blog.ovhcloud.com/wp-content/uploads/2025/03/chatbot.png" alt="A screenshot of a chatbot application developed with DeepSeek-R1 and Gradio in Python." class="wp-image-28328" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/03/chatbot.png 723w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/chatbot-185x300.png 185w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/chatbot-631x1024.png 631w" sizes="auto, (max-width: 723px) 100vw, 723px" /><button
			class="lightbox-trigger"
			type="button"
			aria-haspopup="dialog"
			aria-label="Enlarge"
			data-wp-init="callbacks.initTriggerButton"
			data-wp-on--click="actions.showLightbox"
			data-wp-style--right="state.imageButtonRight"
			data-wp-style--top="state.imageButtonTop"
		>
			<svg xmlns="http://www.w3.org/2000/svg" width="12" height="12" fill="none" viewBox="0 0 12 12">
				<path fill="#fff" d="M2 0a2 2 0 0 0-2 2v2h1.5V2a.5.5 0 0 1 .5-.5h2V0H2Zm2 10.5H2a.5.5 0 0 1-.5-.5V8H0v2a2 2 0 0 0 2 2h2v-1.5ZM8 12v-1.5h2a.5.5 0 0 0 .5-.5V8H12v2a2 2 0 0 1-2 2H8Zm2-12a2 2 0 0 1 2 2v2h-1.5V2a.5.5 0 0 0-.5-.5H8V0h2Z" />
			</svg>
		</button></figure>



<p>Stay tuned for the next article in this DeepSeek-R1 series. In the meantime, try out DeepSeek-R1 on AI Endpoints and AI Deploy and let us know what you &lt;think&gt;!</p>



<h3 class="wp-block-heading">Resources</h3>



<p>If you want to learn more about DeepSeek-R1 and the topics we covered in this blog post, such as test-time compute, GRPO, reinforcement learning and reasoning models, we suggest having a look at these resources:</p>



<ul class="wp-block-list">
<li><a href="https://arxiv.org/pdf/2501.12948" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">DeepSeek-R1 technical report</a>, by the DeepSeek team</li>



<li><a href="https://newsletter.languagemodels.co/p/the-illustrated-deepseek-r1" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">The Illustrated DeepSeek-R1</a>, by Jay Alammar</li>



<li><a href="https://magazine.sebastianraschka.com/p/understanding-reasoning-llms" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">Understanding Reasoning LLMs</a>, by Sebastian Raschka</li>



<li><a href="https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-reasoning-llms" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">A Visual Guide to Reasoning LLMs</a>, by Maarten Grootendorst</li>
</ul>
<img loading="lazy" decoding="async" src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fdeep-dive-into-deepseek-r1-part-1%2F&amp;action_name=Deep%20Dive%20into%20DeepSeek-R1%20%26%238211%3B%20Part%201&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Mistral Small 24B served with vLLM and AI Deploy &#8211; a single command to deploy an LLM (Part 1)</title>
		<link>https://blog.ovhcloud.com/mistral-small-24b-served-with-vllm-and-ai-deploy-one-command-to-deploy-llm/</link>
		
		<dc:creator><![CDATA[Eléa Petton]]></dc:creator>
		<pubDate>Mon, 24 Feb 2025 10:08:37 +0000</pubDate>
				<category><![CDATA[OVHcloud Engineering]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[AI Deploy]]></category>
		<category><![CDATA[LLM]]></category>
		<category><![CDATA[Machine learning]]></category>
		<category><![CDATA[Mistral]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[OVHcloud]]></category>
		<category><![CDATA[Public Cloud]]></category>
		<guid isPermaLink="false">https://blog.ovhcloud.com/?p=28212</guid>

					<description><![CDATA[You are not dreaming! You can deploy open-source LLM in a single command line. Deploying advanced language models can be a challenge! But this sometimes this arduous task is becoming increasingly accessible, enabling developers to integrate sophisticated AI capabilities into their applications. In this guide, we will walk through deploying the Mistral-Small-24B-Instruct-2501 model using vLLM [&#8230;]<img src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fmistral-small-24b-served-with-vllm-and-ai-deploy-one-command-to-deploy-llm%2F&amp;action_name=Mistral%20Small%2024B%20served%20with%20vLLM%20and%20AI%20Deploy%20%26%238211%3B%20a%20single%20command%20to%20deploy%20an%20LLM%20%28Part%201%29&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></description>
										<content:encoded><![CDATA[
<p><strong><em>You are not dreaming! You can deploy an open-source LLM in a single command line</em>.</strong></p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="724" src="https://blog.ovhcloud.com/wp-content/uploads/2025/02/image_blog_post_mistral_small_ai_deploy-1024x724.png" alt="Rocket in MistralAI colors in a data center with a French rooster showing rapid LLM deployment" class="wp-image-28219" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/02/image_blog_post_mistral_small_ai_deploy-1024x724.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/02/image_blog_post_mistral_small_ai_deploy-300x212.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/02/image_blog_post_mistral_small_ai_deploy-768x543.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/02/image_blog_post_mistral_small_ai_deploy-1536x1086.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/02/image_blog_post_mistral_small_ai_deploy.png 2000w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>Deploying advanced language models can be a challenge! But this sometimes arduous task is becoming increasingly accessible, enabling developers to integrate sophisticated AI capabilities into their applications.</p>



<p>In this guide, we will walk through deploying the <strong><a href="https://huggingface.co/mistralai/Mistral-Small-24B-Instruct-2501" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Mistral-Small-24B-Instruct-2501</a></strong> model using <strong>vLLM</strong> on OVHcloud&#8217;s <a href="https://www.ovhcloud.com/fr/public-cloud/ai-deploy/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">AI Deploy platform</a>. This combination offers a powerful solution for efficient and scalable AI model serving.</p>



<p>Deploying a model is great, but doing it quickly is even better!</p>



<p>🤯 <strong>What if a single command line was enough?</strong> That&#8217;s the challenge we&#8217;re tackling today!</p>



<h2 class="wp-block-heading">Context</h2>



<p>Before deployment, let’s take a closer look at our key technologies!</p>



<h3 class="wp-block-heading">Mistral Small</h3>



<p>The <code><strong>mistralai/Mistral-Small-24B-Instruct-2501</strong></code> is a 24-billion-parameter instruction-fine-tuned model, renowned for its compact size and performance comparable to larger models.</p>



<p>This model, from <a href="https://mistral.ai/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">MistralAI</a>, is an instruction-fine-tuned version of the base model:&nbsp;<a href="https://huggingface.co/mistralai/Mistral-Small-24B-Base-2501" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Mistral-Small-24B-Base-2501</a>.</p>



<p>To serve this model efficiently, we will utilize vLLM, an open-source library for <strong>LLM inference</strong>.</p>



<h3 class="wp-block-heading">vLLM</h3>



<p><a href="https://docs.vllm.ai/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">vLLM</a> (<strong>Virtual LLM</strong>) is a highly optimized serving engine designed to efficiently run large language models. It takes advantage of several key optimizations, such as:</p>



<ul class="wp-block-list">
<li><strong>PagedAttention:</strong> an attention mechanism that reduces memory fragmentation and enables more efficient use of GPU memory</li>



<li><strong>Continuous Batching:</strong> vLLM dynamically adjusts batch sizes in real time, ensuring that the GPU is always used efficiently, even with multiple simultaneous requests</li>



<li><strong>Tensor parallelism:</strong> enables model inference across multiple GPUs to boost performance</li>



<li><strong>Optimized kernel implementations:</strong> vLLM uses custom CUDA kernels for faster execution, reducing latency compared to traditional inference frameworks</li>
</ul>



<p>These features make vLLM one of the best choices for large models such as Mistral Small 24B, enabling low-latency, high-throughput inference on the latest GPUs.</p>



<p>By deploying on OVHcloud&#8217;s AI Deploy platform, you can deploy this model in a single command line.</p>



<h3 class="wp-block-heading">AI Deploy </h3>



<p>OVHcloud AI Deploy is a<strong> Container as a Service</strong> (CaaS) platform designed to help you deploy, manage and scale AI models. It provides a solution that allows you to optimally deploy your applications / APIs based on Machine Learning (ML), Deep Learning (DL) or LLMs.</p>



<p>The key benefits are:</p>



<ul class="wp-block-list">
<li><strong>Easy to use:</strong> bring your own custom Docker image and deploy it with a single command line or a few clicks</li>



<li><strong>High-performance computing:</strong> a complete range of GPUs available (H100, A100, V100S, L40S and L4)</li>



<li><strong>Scalability and flexibility:</strong> supports automatic scaling, allowing your model to effectively handle fluctuating workloads</li>



<li><strong>Cost-efficient:</strong> billing per minute, no surcharges</li>
</ul>



<p>✅ To go further, some prerequisites must be checked!</p>



<h2 class="wp-block-heading">Prerequisites</h2>



<p>Before you begin, ensure that you have:</p>



<ul class="wp-block-list">
<li><strong>OVHcloud account</strong>: access to the&nbsp;<a href="https://www.ovh.com/auth/?action=gotomanager&amp;from=https://www.ovh.co.uk/&amp;ovhSubsidiary=GB" data-wpel-link="exclude">OVHcloud Control Panel</a></li>



<li><strong>ovhai CLI available:</strong> install the <a href="https://help.ovhcloud.com/csm/en-gb-public-cloud-ai-cli-install-client?id=kb_article_view&amp;sysparm_article=KB0047844" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">ovhai CLI</a></li>



<li><strong>AI Deploy access</strong>: ensure you have a <a href="https://help.ovhcloud.com/csm/en-gb-public-cloud-ai-users?id=kb_article_view&amp;sysparm_article=KB0048170" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">user for AI Deploy</a></li>



<li><strong>Hugging Face access</strong>: create an <a href="https://huggingface.co/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Hugging Face account</a> and generate an <a href="https://huggingface.co/settings/tokens" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">access token</a></li>



<li><strong>Gated model authorization</strong>: be sure you have been granted access to <a href="https://huggingface.co/mistralai/Mistral-Small-24B-Instruct-2501" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Mistral-Small-24B-Instruct-2501</a> model</li>
</ul>



<p><strong>🚀 Having all the ingredients for our recipe, it&#8217;s time to deploy!</strong></p>



<h2 class="wp-block-heading">Deployment of the Mistral Small 24B LLM</h2>



<p>Let&#8217;s go ahead and deploy the <code><strong>mistralai/Mistral-Small-24B-Instruct-2501</strong></code> model.</p>



<h3 class="wp-block-heading">Manage access tokens</h3>



<p>Export your <a href="https://huggingface.co/settings/tokens" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Hugging Face token</a>.</p>



<pre class="wp-block-code"><code class="">export MY_HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxx</code></pre>



<p><a href="https://help.ovhcloud.com/csm/en-gb-public-cloud-ai-cli-app-token?id=kb_article_view&amp;sysparm_article=KB0035280" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Create a token</a> to access your AI Deploy app once it is deployed.</p>



<pre class="wp-block-code"><code class="">ovhai token create --role operator ai_deploy_token=my_operator_token</code></pre>



<p>Returning the following output:</p>



<pre class="wp-block-code"><code class="">Id:         47292486-fb98-4a5b-8451-600895597a2b
Created At: 20-02-25 11:53:05
Updated At: 20-02-25 11:53:05
Spec:
  Name:           ai_deploy_token=my_operator_token
  Role:           AiTrainingOperator
  Label Selector: 
Status:
  Value:   XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
  Version: 1</code></pre>



<p>You can now store and export your access token:</p>



<pre class="wp-block-code"><code class="">export MY_OVHAI_ACCESS_TOKEN=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX</code></pre>



<h3 class="wp-block-heading">Launch Mistral Small LLM with AI Deploy</h3>



<p>You are ready to start<strong> Mistral-Small-24B</strong> using vLLM and AI Deploy:</p>



<pre class="wp-block-code"><code class="">ovhai app run --name vllm-mistral-small \
              --default-http-port 8000 \
              --label ai_deploy_token=my_operator_token \
              --gpu 2 \
              --flavor l40s-1-gpu \
              -e OUTLINES_CACHE_DIR=/tmp/.outlines \
              -e HF_TOKEN=$MY_HF_TOKEN \
              -e HF_HOME=/hub \
              -e HF_DATASETS_TRUST_REMOTE_CODE=1 \
              -e HF_HUB_ENABLE_HF_TRANSFER=0 \
              -v standalone:/hub:rw \
              -v standalone:/workspace:rw \
              vllm/vllm-openai:v0.8.2 \
              -- bash -c "python3 -m vllm.entrypoints.openai.api_server \
                        --model mistralai/Mistral-Small-24B-Instruct-2501 \
                        --tensor-parallel-size 2 \
                        --tokenizer_mode mistral \
                        --load_format mistral \
                        --config_format mistral \
                        --dtype half"</code></pre>



<p><strong>How to understand the different parameters of this command?</strong></p>



<h5 class="wp-block-heading">1. Start your AI Deploy app</h5>



<p>Launch a new app using <a href="https://help.ovhcloud.com/csm/en-gb-public-cloud-ai-cli-install-client?id=kb_article_view&amp;sysparm_article=KB0047844" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">ovhai CLI</a> and name it.</p>



<p><code><strong>ovhai app run --name vllm-mistral-small</strong></code></p>



<h5 class="wp-block-heading">2. Define access</h5>



<p>Define the HTTP API port and restrict access to your token.</p>



<p><strong><code>--default-http-port 8000</code><br><code>--label ai_deploy_token=my_operator_token</code></strong></p>



<h5 class="wp-block-heading">3. Configure GPU resources</h5>



<p>Specify the hardware flavor (<code><strong>l40s-1-gpu</strong></code>), which refers to an <strong>NVIDIA L40S GPU</strong>, and the number of GPUs (<code><strong>2</strong></code>).</p>



<p><code><strong>--gpu 2<br>--flavor l40s-1-gpu</strong></code></p>



<p><strong><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-ast-global-color-0-color">⚠️WARNING!</mark></strong> For this model, two L40S are sufficient, but if you want to deploy another model, you will need to check which GPU you need. Note that you can also access A100 and H100 GPUs for your larger models.</p>



<h5 class="wp-block-heading">4. Set up environment variables</h5>



<p>Configure caching for the <strong>Outlines library</strong> (used for efficient text generation):</p>



<p><code><strong>-e OUTLINES_CACHE_DIR=/tmp/.outlines</strong></code></p>



<p>Pass the <strong>Hugging Face token</strong> (<code>$MY_HF_TOKEN</code>) for model authentication and download:</p>



<p><code><strong>-e HF_TOKEN=$MY_HF_TOKEN</strong></code></p>



<p>Set the <strong>Hugging Face cache directory</strong> to <code>/hub</code> (where models will be stored):</p>



<p><code><strong>-e HF_HOME=/hub</strong></code></p>



<p>Allow execution of <strong>custom remote code</strong> from Hugging Face datasets (required for some model behaviors):</p>



<p><code><strong>-e HF_DATASETS_TRUST_REMOTE_CODE=1</strong></code></p>



<p>Disable <strong>Hugging Face Hub transfer acceleration</strong> (to use standard model downloading):</p>



<p><code><strong>-e HF_HUB_ENABLE_HF_TRANSFER=0</strong></code></p>



<h5 class="wp-block-heading">5. Mount persistent volumes</h5>



<p>Mounts <strong>two persistent storage volumes</strong>:</p>



<ul class="wp-block-list">
<li><code>/hub</code> → Stores Hugging Face model files</li>



<li><code>/workspace</code> → Main working directory</li>
</ul>



<p>The <code>rw</code> flag means <strong>read-write access</strong>.</p>



<p><code><strong>-v standalone:/hub:rw<br>-v standalone:/workspace:rw</strong></code></p>



<h5 class="wp-block-heading">6. Choose the target Docker image</h5>



<p>Uses the <strong><code>vllm/vllm-openai:v0.8.2</code></strong> Docker image (a pre-configured vLLM OpenAI API server).</p>



<p><strong><code>vllm/vllm-openai:v0.8.2</code></strong></p>



<h5 class="wp-block-heading">7. Running the model inside the container</h5>



<p>Runs a<strong> bash shell</strong> inside the container and executes a Python command to launch the vLLM API server:</p>



<ul class="wp-block-list">
<li><strong><code>python3 -m vllm.entrypoints.openai.api_server</code></strong> → Starts the OpenAI-compatible vLLM API server</li>



<li><strong><code>--model mistralai/Mistral-Small-24B-Instruct-2501</code></strong> → Loads the <strong>Mistral Small 24B</strong> model from Hugging Face</li>



<li><strong><code>--tensor-parallel-size 2</code></strong> → Distributes the model across <strong>2 GPUs</strong></li>



<li><strong><code>--tokenizer_mode mistral</code></strong> → Uses the <strong>Mistral tokenizer</strong></li>



<li><strong><code>--load_format mistral</code></strong> → Uses Mistral’s model loading format</li>



<li><strong><code>--config_format mistral</code></strong> → Ensures the model configuration follows Mistral&#8217;s standard</li>



<li><strong><code>--dtype half</code></strong> → Uses <strong>FP16 (half-precision floating point)</strong> for optimized GPU performance</li>
</ul>



<p>You can now check if your <strong>AI Deploy</strong> app is alive:</p>



<pre class="wp-block-code"><code class="">ovhai app get &lt;your_vllm_app_id&gt;</code></pre>



<p>💡<strong>Is your app in <code>RUNNING</code> status?</strong> Perfect! You can check in the logs that the server has started&#8230;</p>



<pre class="wp-block-code"><code class="">ovhai app logs &lt;your_vllm_app_id&gt;</code></pre>



<p><strong><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-ast-global-color-0-color">⚠️WARNING!</mark></strong> This step may take a little time as the model must be loaded&#8230;<br>After a few minutes, you should get the following information in the logs:</p>



<pre class="wp-block-code"><code class="">2025-02-20T13:48:07Z [app] [tcmzt] INFO:     Started server process [13]
2025-02-20T13:48:07Z [app] [tcmzt] INFO:     Waiting for application startup.
2025-02-20T13:48:07Z [app] [tcmzt] INFO:     Application startup complete.
2025-02-20T13:48:07Z [app] [tcmzt] INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)</code></pre>



<p>🚦 <strong>Are all the indicators green? </strong>Then it&#8217;s off to inference!</p>



<h3 class="wp-block-heading">Request and send prompt to the LLM</h3>



<p>Launch the following query by asking the question of your choice:</p>



<pre class="wp-block-code"><code class="">curl https://&lt;your_vllm_app_id&gt;.app.gra.ai.cloud.ovh.net/v1/chat/completions \
  -H "Authorization: Bearer $MY_OVHAI_ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistralai/Mistral-Small-24B-Instruct-2501",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Give me the name of OVHcloud’s founder."}
    ],
    "stream": false
  }'</code></pre>



<p>Returning the following result:</p>



<pre class="wp-block-code"><code class="">{
  "id":"chatcmpl-d6ea734b524bd851668e71d4111ba496",
  "object":"chat.completion",
  "created":1740059807,
  "model":"mistralai/Mistral-Small-24B-Instruct-2501",
  "choices":[
    {
      "index":0,
      "message":{
        "role":"assistant",
        "reasoning_content":null, 
        "content":"The founder of OVHcloud is Octave Klaba.",
        "tool_calls":[]
      },
      "logprobs":null,
      "finish_reason":"stop",
      "stop_reason":null
    }
  ],
  "usage":{
    "prompt_tokens":22,
    "total_tokens":35,
    "completion_tokens":13,
    "prompt_tokens_details":null
  },
  "prompt_logprobs":null
}</code></pre>
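<p>To use this result programmatically, simply parse the JSON and extract the assistant&#8217;s message. A minimal sketch, with the (abridged) response above hard-coded as a string for illustration:</p>

```python
import json

# The (abridged) JSON returned by the vLLM server, as a raw string.
raw = '''{
  "model": "mistralai/Mistral-Small-24B-Instruct-2501",
  "choices": [
    {"index": 0,
     "message": {"role": "assistant",
                 "content": "The founder of OVHcloud is Octave Klaba."},
     "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 22, "total_tokens": 35, "completion_tokens": 13}
}'''

response = json.loads(raw)
# The generated text lives in the first choice's message content.
answer = response["choices"][0]["message"]["content"]
print(answer)  # The founder of OVHcloud is Octave Klaba.
```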



<h2 class="wp-block-heading">Conclusion</h2>



<p>By following these steps, you have successfully deployed the <code><strong>mistralai/Mistral-Small-24B-Instruct-2501</strong></code> model using <strong>vLLM</strong> on OVHcloud&#8217;s AI Deploy platform. This setup provides a scalable and efficient solution for serving advanced language models in production environments.</p>



<p>For further customization and optimization, refer to the <a href="https://docs.vllm.ai/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">vLLM documentation</a> and the <a href="https://help.ovhcloud.com/csm/en-ie-documentation-public-cloud-ai-and-machine-learning-ai-deploy?id=kb_browse_cat&amp;kb_id=574a8325551974502d4c6e78b7421938&amp;kb_category=3241efc6a052d910f078d4b4ef43651f&amp;spa=1" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">OVHcloud AI Deploy resources</a>.</p>



<p>💪 <strong>Challenge completed!</strong> You can now enjoy the power of your LLM, deployed with a single command line!</p>



<p>Want even more simplicity? You can also use ready-to-use APIs with <a href="https://endpoints.ai.cloud.ovh.net/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">AI Endpoints</a>!</p>



<p><strong><em>But… what’s next?</em></strong></p>
<img loading="lazy" decoding="async" src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fmistral-small-24b-served-with-vllm-and-ai-deploy-one-command-to-deploy-llm%2F&amp;action_name=Mistral%20Small%2024B%20served%20with%20vLLM%20and%20AI%20Deploy%20%26%238211%3B%20a%20single%20command%20to%20deploy%20an%20LLM%20%28Part%201%29&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Five ways to develop sovereign, sustainable AI solutions</title>
		<link>https://blog.ovhcloud.com/five-ways-to-develop-sovereign-sustainable-ai-solutions/</link>
		
		<dc:creator><![CDATA[Cezary Skarzynski]]></dc:creator>
		<pubDate>Mon, 27 Jan 2025 15:07:21 +0000</pubDate>
				<category><![CDATA[OVHcloud Startup Program]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[AI Endpoints]]></category>
		<category><![CDATA[Data Sovereignty]]></category>
		<category><![CDATA[GPU]]></category>
		<category><![CDATA[Machine learning]]></category>
		<category><![CDATA[Public Cloud]]></category>
		<category><![CDATA[Startup Program]]></category>
		<category><![CDATA[Sustainability]]></category>
		<guid isPermaLink="false">https://blog.ovhcloud.com/?p=28039</guid>

					<description><![CDATA[Now that organisations understand AI and what it can achieve, businesses around the world are focusing on how to build it responsibly. Three of the five main themes at the Paris AI Action Summit examine the need for responsible AI, with separate streams on trust, public interest and good governance. These themes are not simple. [&#8230;]<img src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Ffive-ways-to-develop-sovereign-sustainable-ai-solutions%2F&amp;action_name=Five%20ways%20to%20develop%20sovereign%2C%20sustainable%20AI%20solutions&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></description>
										<content:encoded><![CDATA[
<p>Now that organisations understand AI and what it can achieve, businesses around the world are focusing on how to build it responsibly. Three of the five main themes at the <a href="https://www.elysee.fr/en/sommet-pour-l-action-sur-l-ia" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Paris AI Action Summit</a> examine the need for responsible AI, with separate streams on trust, public interest and good governance.</p>



<p>These themes are not simple. In addition to the core function of AI tools – for example, considering what an AI app does, how it does it, and whether bias is present – most businesses are starting to realise that they need to consider the deeper ‘AI supply chain’.</p>



<p>This is not just altruistic. A number of LLM tools currently face the risk of copyright-infringement lawsuits, because they may have been trained on content without due permission. AI tools that present biased results are quickly exposed in the press, leading to reputational damage and a loss of customer trust. Some countries also have legislation permitting data usage for economic intelligence purposes – but in another region, this may represent a data breach. AI has also received negative publicity for ‘running hot’ and consuming large amounts of energy and water in datacenters.</p>



<p>However, AI can also be a tremendous force for good – if handled correctly. So, what should businesses be thinking about so that they get the most from AI, without incurring undue commercial or reputational risk?</p>



<h5 class="wp-block-heading"><strong>1- Consider Sovereignty from the Start</strong></h5>



<p>Understand your data ‘supply chain’ from the very beginning of the process. For example, if you’re using an external LLM for a chatbot, where was this developed? Which data was it trained on, and was this data acquired ethically?</p>



<p>“AI can often be a black box when it comes to processing data,” says Lex Avstreikh, Strategy Lead for Stockholm-based AI firm Hopsworks. “It’s far too complex to show how the system arrived at any one decision. But if you can show people the inputs and the outputs, then that goes a long way to building transparency and trust.”</p>



<h5 class="wp-block-heading"><strong>2- Plan for a Sovereign Future</strong></h5>



<p>It’s important to think about where data will be during its future lifecycle – will you be running in an external datacenter, and where will data be in transit and at rest? Where are the headquarters of the datacenter company in question and what does this mean from a regulatory and handling perspective? Perhaps most importantly, will your customers be happy with all of these arrangements?</p>



<p>This was the decision journey faced by Swedish AI firm Ebbot. In July 2020, the Data Protection Commissioner v. Facebook Ireland and Schrems case, commonly referred to as Schrems II, saw the Court of Justice of the European Union (CJEU) invalidate the EU-US Privacy Shield and tighten the requirements for transferring personal data outside the EU. Ebbot recognised the importance of data security and compliance and thus made it a priority to store and process all data within the EU.</p>



<h5 class="wp-block-heading"><strong>3- Location, location, location</strong></h5>



<p>Location isn’t just an important sovereignty concern – it’s also crucial to sustainability. Although Scandinavia may have very green energy, it’s easy to forget that many cloud providers will offer geographical ‘computing zones’ rather than defined locations, which can result in a less green footprint. CPU- and GPU-intensive tasks like model training should be run in green energy zones wherever possible, and are rarely latency-dependent; consequently, you can locate them far away if necessary.</p>



<p>When your AI app goes into production, also remember that backup and redundancy are a necessity – but will also increase your carbon footprint. Consider having a ‘low power’ or passive backup if commercially feasible – it will take longer to bring online in the case of emergency, but you’ll be consuming less power.</p>



<h5 class="wp-block-heading"><strong>4- Always Consider Necessity</strong></h5>



<p>A lot of organisations only consider hardware efficiency and power consumption during the development process, but green software is rapidly gaining popularity. Having efficient code which is still fit for purpose can have a huge impact on power consumption, particularly if you’re building an app for very broad use. “We’ll definitely see more efficient and specific LLMs, because they’re absolutely needed,” added Avstreikh.</p>



<p>Organisations often track the cost of development through FinOps initiatives, and we are now seeing the dawn of GreenOps, which ensures that technology is as green as possible from end to end. To that effect, consider benchmarking the CPU and memory usage of your application, because less hardware-intensive apps are usually less power-hungry.</p>



<h5 class="wp-block-heading"><strong>5- Re-use, recycle</strong></h5>



<p>Developing bespoke code can ensure that it’s as lean and efficient as possible, but it can also use needless computing power to develop. Many technology organisations offer PaaS solutions that can automate common parts of the application development and deployment process. For example, consider our <a href="https://ovh.commander1.com/c3/?tcs=3810&amp;chn=display&amp;src=partnership&amp;cty_ads=multi&amp;lang_ads=en&amp;cty=US&amp;unvrse=multi&amp;pcat=multi&amp;subtpc=undefinite&amp;tactic=awrns&amp;objv=impressions&amp;site_domain=https://labs.ovhcloud.com&amp;cmp=display_PR_multi_en_US_multi_multi_undefinite_awrns_impressions&amp;crtive=dimg_image_728x90_STN-NE&amp;url=https%3A%2F%2Flabs.ovhcloud.com%2Fen%2Fai-endpoints%2F%3Fat_medium%3Ddisplay%26at_campaign%3Dpartnership%26at_creation%3Ddisplay_PR_multi_en_US_multi_multi_undefinite_awrns_impressions%26at_variant%3Ddimg_image_728x90_STN-NE" data-wpel-link="exclude">AI Endpoints solution</a>, which helps developers to access other AI models, from Bert to Mistral to Llama, all using a simple API.</p>



<p>This is not an easy process, but establishing responsible AI conduct in your organisation’s DNA will avoid complications further down the road, and will also show customers that you are handling data – including theirs – in a responsible, secure way. With increasing numbers of organisations tracking not only their scope 3 emissions, but also their data supply chains in a more comprehensive fashion, sovereignty and sustainability are two clear ‘musts’ for any modern AI company.</p>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<figure class="wp-block-image aligncenter size-large is-resized"><a href="https://startup.ovhcloud.com/en/accelerator/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer"><img loading="lazy" decoding="async" width="1024" height="253" src="https://blog.ovhcloud.com/wp-content/uploads/2025/01/FF-banner-1024x253.png" alt="" class="wp-image-28042" style="width:626px;height:auto" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/01/FF-banner-1024x253.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/FF-banner-300x74.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/FF-banner-768x190.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/FF-banner-1536x379.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/FF-banner.png 1870w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></a></figure>






<p><em>If you’re a startup or scale-up building an AI solution, and would like to work with a sovereign, sustainable cloud provider in turn, you can find more information about OVHcloud – including our cloud credit scheme – on our <a href="https://ovh.commander1.com/c3/?tcs=3810&amp;chn=display&amp;src=partnership&amp;cty_ads=multi&amp;lang_ads=en&amp;cty=US&amp;unvrse=multi&amp;pcat=multi&amp;subtpc=undefinite&amp;tactic=awrns&amp;objv=impressions&amp;site_domain=https://startup.ovhcloud.com&amp;cmp=display_PR_multi_en_US_multi_multi_undefinite_awrns_impressions&amp;crtive=dimg_image_728x90_STN-NE&amp;url=https%3A%2F%2Fstartup.ovhcloud.com%2Fen%2F%3Fat_medium%3Ddisplay%26at_campaign%3Dpartnership%26at_creation%3Ddisplay_PR_multi_en_US_multi_multi_undefinite_awrns_impressions%26at_variant%3Ddimg_image_728x90_STN-NE" data-wpel-link="exclude">startup hub</a>.</em></p>
<img loading="lazy" decoding="async" src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Ffive-ways-to-develop-sovereign-sustainable-ai-solutions%2F&amp;action_name=Five%20ways%20to%20develop%20sovereign%2C%20sustainable%20AI%20solutions&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Introducing OVHcloud’s Trusted and Innovative AI Ecosystem</title>
		<link>https://blog.ovhcloud.com/introducing-ovhclouds-trusted-and-innovative-ai-ecosystem/</link>
		
		<dc:creator><![CDATA[Gilles Closset]]></dc:creator>
		<pubDate>Tue, 21 Jan 2025 13:26:19 +0000</pubDate>
				<category><![CDATA[Ecosystem]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[Machine learning]]></category>
		<category><![CDATA[OVHcloud]]></category>
		<category><![CDATA[Partner Program]]></category>
		<category><![CDATA[Public Cloud]]></category>
		<category><![CDATA[Startup Program]]></category>
		<guid isPermaLink="false">https://blog.ovhcloud.com/?p=27953</guid>

					<description><![CDATA[Artificial intelligence (AI) has become the most transformative force in the global economy, impacting every sector from healthcare to finance to the public sector. New and innovative capabilities come from all parts of the technology ecosystem and from all regions of the world. Every week, almost every day! The momentum in this space is incredible. [&#8230;]<img src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fintroducing-ovhclouds-trusted-and-innovative-ai-ecosystem%2F&amp;action_name=Introducing%20OVHcloud%E2%80%99s%20Trusted%20and%20Innovative%20AI%20Ecosystem&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></description>
										<content:encoded><![CDATA[
<p>Artificial intelligence (AI) has become the most transformative force in the global economy, impacting every sector from healthcare to finance to the public sector.</p>



<p>New and innovative capabilities come from all parts of the technology ecosystem and from all regions of the world. Every week, almost every day!</p>



<p>The momentum in this space is incredible. In fact, we&#8217;ve seen a significant acceleration in the number of AI startups joining the OVHcloud Startup Program, as well as partners and software publishers adding AI expertise and capabilities to their portfolios.</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p>Aligned with our DNA of a Trusted &amp; Sustainable Cloud, OVHcloud is committed to supporting AI innovation that adheres to core values.</p>
</blockquote>



<p>To help customers and developers harness this innovation, we’re bringing the best of OVHcloud’s infrastructure, AI products, and state-of-the-art models to members of our Ecosystem at every layer of the AI stack: chipmakers, model builders and AI platforms, technology partners enabling companies to develop and deploy machine learning (ML) models, app-builders solving customer use-cases with generative AI, and global services and consulting firms that help enterprise customers implement all of this technology at scale.</p>



<p>Let’s take a deep dive into the partnerships, programs, and resources for each segment of the ecosystem that showcase our open approach.</p>



<h2 class="wp-block-heading">Building a Trusted and Innovative AI Ecosystem</h2>



<figure class="wp-block-image size-full is-resized"><img loading="lazy" decoding="async" width="627" height="429" src="https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-9.png" alt="" class="wp-image-27955" style="width:751px;height:auto" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-9.png 627w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-9-300x205.png 300w" sizes="auto, (max-width: 627px) 100vw, 627px" /></figure>



<h3 class="wp-block-heading">Model builders &amp; Chipmakers</h3>



<p>Let’s kickstart the introduction of our AI Ecosystem with the members that are directly integrated into specific OVHcloud AI products, aka Technology Partners.</p>



<p>Companies like <a target="_blank" rel="noreferrer noopener nofollow external" href="https://www.linkedin.com/article/edit/7285968197617352704/#" data-wpel-link="external">Mistral AI</a>, <a target="_blank" rel="noreferrer noopener nofollow external" href="https://www.linkedin.com/article/edit/7285968197617352704/#" data-wpel-link="external">Meta</a> and <a target="_blank" rel="noreferrer noopener nofollow external" href="https://www.linkedin.com/article/edit/7285968197617352704/#" data-wpel-link="external">Stability AI</a> are building open-source foundation models, including LLMs, that can significantly accelerate the development of generative AI and natural language processing (NLP) applications. OVHcloud serves these models to end customers through AI Endpoints, running on its high-performance infrastructure with industry-leading energy efficiency.</p>



<p>AI Endpoints requires no AI expertise or dedicated infrastructure, as the serverless platform provides access to advanced AI models including Large Language Models (LLMs), natural language processing, translation, speech recognition, image recognition, and more. Developers can select from a range of models, including open-source options like Mistral AI, Llama, Whisper, and Stable Diffusion, as well as a variety of optimized models from our Model Builders partners, creating a versatile testing ground for chosen AI models.</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p>Our catalog of AI models is continually expanding, and we are actively seeking new collaborations with partners to integrate proprietary models that address specific use cases.</p>
</blockquote>



<p><a target="_blank" rel="noreferrer noopener nofollow external" href="https://www.linkedin.com/article/edit/7285968197617352704/#" data-wpel-link="external">OVHcloud</a> has also developed strong, long-lasting partnerships with chipmakers like <a target="_blank" rel="noreferrer noopener nofollow external" href="https://www.linkedin.com/article/edit/7285968197617352704/#" data-wpel-link="external">NVIDIA</a> and <a target="_blank" rel="noreferrer noopener nofollow external" href="https://www.linkedin.com/article/edit/7285968197617352704/#" data-wpel-link="external">AMD</a> to deliver tailored services for deep learning, inference and high-performance computing, with the best available GPUs. AI models are becoming more complex due to the rise of conversational AI. Training and inference now require massive computing power and scalability, and OVHcloud keeps pace with industry innovation by integrating the latest GPUs, including for 2025 the AMD MI325X series, and the Nvidia H200 NVL and Blackwell generation. Using industrial innovations, such as water cooling in our servers, allows us to achieve the lowest energy consumption on the market.</p>



<h3 class="wp-block-heading">AI PaaS Solutions &amp; Tools</h3>



<p>Organizations and developers engaged in ambitious AI projects usually employ various tools to facilitate the creation, management, and deployment of their models. These tools assist developers with essential tasks such as automating and optimizing data pipelines, monitoring model performance, managing private datasets, and defining and enforcing safety &amp; security measures related to regulation or specific policies. OVHcloud collaborates with these organizations to address the crucial requirements of machine learning engineers and data scientists.</p>



<p>To meet growing demand from customers and partners building innovative AI services on OVHcloud, many of the leaders in AI solutions are launching new or expanded partnerships with OVHcloud today. Let’s take a look at a few examples:</p>



<ul class="wp-block-list">
<li><a href="https://multiversecomputing.com/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">Multiverse Computing</a> are the world leaders in Quantum AI. They apply quantum and quantum-inspired AI to solve complex problems, delivering practical applications and tangible value today.<br></li>



<li><a href="https://www.hopsworks.ai/integrations/ovhcloud" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Hopsworks</a> seamlessly integrates with and can be deployed on OVHcloud using Kubernetes, allowing users to run feature engineering, training, and batch inference pipelines using Spark, Flink, or Python.<br></li>



<li>With <a href="https://valohai.com/blog/valohai-partners-with-ovhcloud/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Valohai</a>, access scalable and secure cloud environments without having to rebuild your ML workflows. The integration between the Valohai MLOps platform and OVHcloud makes it easy to access on-demand computational resources. Scale up with ease to meet the needs of your projects, while ensuring data security and regulatory compliance.<br></li>



<li><a href="https://www.lampi.ai/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">Lampi AI</a> provides a secure AI platform with the best and latest LLMs to power predictable, fine-tuned AI agents that pick the relevant information from your data and the web, reason, iterate, and tackle complex tasks.<br></li>



<li><a href="https://qdrant.tech/blog/hybrid-cloud-ovhcloud" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Qdrant</a>: Through the seamless integration between Qdrant Hybrid Cloud and OVHcloud, developers and businesses can deploy the fully managed vector database within their existing OVHcloud setups in minutes, enabling faster, more accurate AI-driven insights.</li>
</ul>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p>Through our support to these critical members, we offer developers the best platform and ecosystem in which to build the next generation of helpful AI applications, and provide customers with a single destination for building, innovating with, and applying AI.</p>
</blockquote>






<h3 class="wp-block-heading">AI Apps addressing End-customers use cases specifically</h3>



<p>OVHcloud is the destination for developers and partners to build the next generation of innovative applications with AI and ML, including exciting new generative AI capabilities.</p>



<p>Much innovation in the generative AI space comes from fast-moving, early-stage startups. They excel at developing new applications designed to address very specific end-customer use cases. Some differentiate through their model(s), either proprietary or fine-tuned, and make it available through an inference API or in their app. Others bring value by developing applications or user interfaces on top of “General-Purpose AI Models” &#8211; so-called API wrappers – by knowing precisely the business workflow of their customers. This may translate into AI agents capable of performing tasks independently, without the need for constant human oversight.</p>



<p>Many AI startups are choosing OVHcloud not only for industry-leading Sustainable &amp; Trusted Cloud infrastructure, but also for fully managed AI services, which make it faster and easier to scale, at the best price and with no lock-in.</p>



<p>Let’s review some of them:</p>



<ul class="wp-block-list">
<li><a href="https://www.illuin.tech/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">ILLUIN Technology</a> provides a powerful low-code multimodal AI orchestration platform that enables you to hybridize different AI approaches and models to implement and industrialize your most complex customized use cases, including AI Agents.<br></li>



<li><a href="https://www.moin.ai/en" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">moinAI</a> uses AI to automatically resolve recurring customer enquiries &#8211; across multiple channels and in various languages &#8211; with minimal effort. Chatbots, live chat and product advisors allow companies to communicate quickly and efficiently with customers on the website around the clock.<br></li>



<li><a href="https://www.catch.hr/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">catchHR</a> uses AI to streamline recruitment, automating tasks like job posting, candidate sourcing, and skill matching to save time and boost efficiency. AI-generated job ads attract top talent, while AI-powered candidate analysis ensures a strong match between applicants and roles, considering both skills and personality fit.<br></li>



<li><a href="https://rayscape.ai/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">Rayscape</a> has already demonstrated excellent results in 150+ clinics and hospitals. Its AI is trained on more than 43 million images from all around the globe and powers predictive insights, automated analysis, efficient workflows to prioritize cases based on urgency, and generates structured reports.<br></li>



<li><a href="https://www.factiverse.ai/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">Factiverse</a> offers AI-powered solutions to enhance content credibility and streamline fact-checking processes. Their offerings include Factiverse GPT and an advanced text editor that identifies and verifies factual statements in your content, highlighting claims and providing links to credible sources to assist in correcting inaccuracies.</li>
</ul>



<h3 class="wp-block-heading">Services Partners</h3>



<p>We stand at the brink of an exhilarating transformation, propelled by advancements in machine learning (ML) technologies. This shift holds the promise of revolutionizing customer experiences, introducing groundbreaking applications, and boosting our customers&#8217; productivity to new levels. The market&#8217;s enthusiasm is clear, with an unprecedented number of customers eager to leverage generative artificial intelligence to revamp their businesses.</p>



<p>Successfully innovating with large language models and generative AI demands proficiency in data management, AI, human resources, and operational processes. It is crucial that these models and AI solutions are crafted to be ethical, transparent, and reliable.</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p>Our partner ecosystem will lead the way in developing innovative business solutions tailored to customers across various industries and sizes.</p>
</blockquote>



<p>Services partners from our Ecosystem have demonstrated expertise in delivering machine learning and generative AI solutions on OVHcloud. These partners offer a range of products, services and technologies, including specialized consulting services, managed services, and applications that are secure, efficient, and scalable across industries.</p>



<p>Today, several of our leading partners, <a href="https://www.cgi.com/france/fr-fr/partenariat/cgi-ovhcloud" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">CGI</a>, <a href="https://www.groupeonepoint.com/fr/actualites/nouvelle-solution-dia-generative-souveraine-sur-ovhcloud/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">Onepoint</a>, Accenture, <a href="https://www.synaigy.com/details/ovhcloud-cloud-ki-zukunft" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">synaigy</a>, W&amp;B Asset Studio, <a href="https://www.soprasteria.com/fr/media/publications/details/sopra-steria-et-ovhcloud-etendent-leur-partenariat-afin-d-industrialiser-lintelligence-artificielle-et-accelerer-la-transformation-des-entreprises-dans-une-demarche-open-source" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">Sopra Steria</a>, <a href="https://www.inetum.com/fr/presse/inetum-ovhcloud" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">Inetum</a> and NEXiD are already providing key support in terms of OVHcloud generative AI advisory, implementation services and capabilities available to customers. These partners play an essential role in applying new AI capabilities to solve industry-specific challenges and helping enterprises build generative AI into their products and everyday business processes.</p>



<h2 class="wp-block-heading">OVHcloud and its broader ecosystem</h2>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="627" height="213" src="https://blog.ovhcloud.com/wp-content/uploads/2025/01/1737110514433.png" alt="" class="wp-image-27957" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/01/1737110514433.png 627w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/1737110514433-300x102.png 300w" sizes="auto, (max-width: 627px) 100vw, 627px" /></figure>






<p>Our AI Ecosystem is part of our broader Ecosystem and includes a wide variety of startups and partners, ready to support customers with both current and future technological challenges. It does so by giving customers the means to innovate and develop their own competitive advantage.</p>



<p>Through these programs, we provide product support, marketing amplification, and co-selling opportunities to help our services and ISV partners bring these solutions to market faster, reach more customers, and grow their businesses.</p>



<p>Over the past 10+ years, we have launched the following initiatives:</p>



<h3 class="wp-block-heading">OVHcloud Partner Program</h3>



<p>OVHcloud partners play a key role in customers’ digital transformation, with the support and services they offer to help them meet the challenges involved. Over 700 companies have joined this program, providing a wide range of expertise and services to our customers.</p>



<p>Interested partners can go <a href="https://partner.ovhcloud.com/en-gb" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">here</a> to apply to join the program.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="744" height="117" src="https://blog.ovhcloud.com/wp-content/uploads/2025/01/1737110907507.png" alt="" class="wp-image-27958" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/01/1737110907507.png 744w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/1737110907507-300x47.png 300w" sizes="auto, (max-width: 744px) 100vw, 744px" /></figure>






<h3 class="wp-block-heading">OVHcloud Startup Program</h3>



<p>We nurture tech entrepreneurs by deploying an array of business scaling opportunities within OVHcloud’s global ecosystem of trust.</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p>Over 5,000 startups and scaleups from across the globe have already benefited from our program since its launch in 2015.</p>
</blockquote>



<p>To further support AI startups and accelerate their app development, we’re launching a new initiative called <strong>AI Accelerator</strong>, which recognizes select startups whose applications and platforms are optimized to run as-a-service on OVHcloud infrastructure and who are utilizing OVHcloud’s AI capabilities in new and helpful ways. The program provides dedicated access to OVHcloud expertise, training, and co-marketing support to help partners build capacity and go to market.</p>



<p>Interested AI startups can go <a href="https://startup.ovhcloud.com/en-gb/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">here</a> to apply to join the program.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="744" height="117" src="https://blog.ovhcloud.com/wp-content/uploads/2025/01/1737110816403.png" alt="" class="wp-image-27959" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/01/1737110816403.png 744w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/1737110816403-300x47.png 300w" sizes="auto, (max-width: 744px) 100vw, 744px" /></figure>






<h3 class="wp-block-heading">Open Trusted Cloud</h3>



<p>This program is aimed at software publishers, as well as SaaS and PaaS solution providers. Its ambition is to work together on building an ecosystem of SaaS and PaaS services — hosted in the open, reversible and trusted cloud offered by OVHcloud. This will provide a common platform for competitive solutions, and hundreds have already joined.</p>



<p>You can browse some of the solutions available in our ecosystem <a href="https://opentrustedcloud.ovhcloud.com/en-gb" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">here</a>.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="744" height="141" src="https://blog.ovhcloud.com/wp-content/uploads/2025/01/1737111014479.png" alt="" class="wp-image-27960" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/01/1737111014479.png 744w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/1737111014479-300x57.png 300w" sizes="auto, (max-width: 744px) 100vw, 744px" /></figure>






<h3 class="wp-block-heading">OVHcloud Marketplace</h3>



<p>At the heart of the ecosystem, the Marketplace was designed to benefit everyone. OVHcloud Marketplace brings together the best solutions from SaaS and PaaS publishers in the ecosystem on an ethical and transparent cloud. Carry out the digital transformation of your company or subscribe to a solution for your personal use with complete peace of mind thanks to these trusted solutions.</p>



<figure class="wp-block-image"><a href="https://marketplace.ovhcloud.com/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external"><img decoding="async" src="https://media.licdn.com/dms/image/v2/D4E12AQGh-68Z2E6B2g/article-inline_image-shrink_400_744/article-inline_image-shrink_400_744/0/1737111193841?e=1743033600&amp;v=beta&amp;t=e6LoWR8Nbia6G43l2TDZnNXXjufFctq3csqRZBjZQcg" alt=""/></a></figure>






<h3 class="wp-block-heading">Technology partners</h3>



<p>The OVHcloud vision is to create a transparent, reversible and interoperable cloud. We work with the best players on the market to deliver solutions that meet the most demanding performance and security requirements.</p>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<p>OVHcloud is committed to <strong>democratizing AI</strong> within organizations, through a wide range of solutions positioned at every price point, while advocating for Digital Sovereignty &amp; Sustainability.</p>



<p>If your company is considering OVHcloud, would like to know more about our vibrant AI Ecosystem, or has feedback to share, please feel free to contact me!</p>
<img loading="lazy" decoding="async" src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fintroducing-ovhclouds-trusted-and-innovative-ai-ecosystem%2F&amp;action_name=Introducing%20OVHcloud%E2%80%99s%20Trusted%20and%20Innovative%20AI%20Ecosystem&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Apply now for the Fast Forward AI Accelerator</title>
		<link>https://blog.ovhcloud.com/apply-now-for-the-fast-forward-ai-accelerator/</link>
		
		<dc:creator><![CDATA[Philip Marais]]></dc:creator>
		<pubDate>Tue, 22 Oct 2024 19:45:24 +0000</pubDate>
				<category><![CDATA[OVHcloud Startup Program]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[Fast Forward]]></category>
		<category><![CDATA[Machine learning]]></category>
		<category><![CDATA[OVHcloud]]></category>
		<category><![CDATA[Public Cloud]]></category>
		<category><![CDATA[Scaleups]]></category>
		<category><![CDATA[Startup Program]]></category>
		<category><![CDATA[Startups]]></category>
		<guid isPermaLink="false">https://blog.ovhcloud.com/?p=27635</guid>

					<description><![CDATA[Today we’re launching our new AI Accelerator to meet the scaling needs of AI startups and shape the future of the AI industry. Building on the success of our Fast Forward Accelerator, designed to be light-touch in terms of your time but high-impact in terms of value, the AI Accelerator offers everything that is OVHcloud [&#8230;]<img src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fapply-now-for-the-fast-forward-ai-accelerator%2F&amp;action_name=Apply%20now%20for%20the%20Fast%20Forward%20AI%20Accelerator&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></description>
										<content:encoded><![CDATA[
<p>Today we’re launching our new AI Accelerator to meet the scaling needs of AI startups and shape the future of the AI industry.</p>



<p>Building on the success of our Fast Forward Accelerator, designed to be light-touch in terms of your time but high-impact in terms of value, the AI Accelerator offers everything that is OVHcloud (<a href="https://startup.ovhcloud.com/en-gb/lp/cloud-transparency/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">data sovereignty</a>, <a href="https://corporate.ovhcloud.com/en/sustainability/environment/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">energy efficiency</a>, <a href="https://startup.ovhcloud.com/en-gb/lp/tech-freedom/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">tech freedom</a>, <a href="https://startup.ovhcloud.com/en-gb/lp/personal-touch/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">personal touch</a>, price/performance) and more.</p>



<p>The 3-month program offers:</p>



<ul class="wp-block-list">
<li><strong>€50k in free cloud credits</strong> to use on our <a href="https://www.ovhcloud.com/en/public-cloud/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Public Cloud</a> and <a href="https://www.ovhcloud.com/en/public-cloud/ai-machine-learning/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">AI solutions</a>. This is in addition to Startup Program credits, but the maximum total that can be allocated remains €100k.</li>



<li><strong>AI technology deep-dives </strong>to solve technical challenges</li>



<li><strong>Workshops</strong> on AI, sales, investor readiness, and PR to enhance business and communication skills</li>



<li>1-on-1 mentoring from experts</li>



<li>Engagement with corporates for possible POCs</li>



<li>Engagement with Venture Capitalists (VCs) for possible funding</li>
</ul>



<p>Only 10-15 startups will be selected for the first cohort of the AI Accelerator, which will run from 13 January 2025 to 3 April 2025. Applications opened on 1 October; entries close on 24 November 2024 (<a href="https://startup.ovhcloud.com/en-gb/fast-forward-ai-accelerator/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Apply NOW!</a>) and selected participants will be announced on 16 December 2024.</p>



<p>The 3-month program is divided into 3 phases:</p>



<ol class="wp-block-list">
<li><strong>Phase 1 (Build): </strong>The Build Phase will focus on refining your product-market fit and cloud integration. This will include deep dives with our AI Team to make sure you get the best out of our AI Solutions.</li>



<li><strong>Phase 2 (Sell): </strong>The Sell Phase will focus on business development, corporate partnerships, and sales readiness. In this phase you will engage with potential corporate partners to investigate collaboration.</li>



<li><strong>Phase 3 (Scale):</strong> The Scale Phase will focus on investor readiness, growth strategy, and funding opportunities. This phase will culminate with a Showcase event where participants will pitch their funding needs to VCs.</li>
</ol>



<p>Based on feedback from our existing startup community, the program will have a particular focus on data sovereignty. This will include sessions with experts and other organizations to help the cohort understand and deal with sovereignty requirements better, particularly as we draw closer to the finalization of the European AI Act.</p>



<p>The accelerator program includes 1-on-1 mentoring from OVHcloud and external experts, who will be matched with participants based on their needs. The program is designed to be agile, requiring three hours a week or less, but can scale to support you as needed. It also includes a 1-year commitment to use OVHcloud’s products and solutions, to ensure continuity after exiting the Accelerator.</p>



<p>Applicants must meet the following criteria to be selected:</p>



<ul class="wp-block-list">
<li>You must be a Startup Program member that has been active in the program or as an OVHcloud customer for at least 3 months (not a member? <a href="https://startup.ovhcloud.com/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">Apply now</a>)</li>



<li>You must have a need for GPUs and OVHcloud&#8217;s AI Solutions</li>



<li>Preference will be given to <a href="https://startup.ovhcloud.com/en-gb/scaleups/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Scale level</a> members of the Startup Program</li>
</ul>



<p>Startups like <a href="https://www.ovhcloud.com/en-gb/case-studies/cux-io/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">CUX.io</a>, <a href="https://www.ovhcloud.com/en-gb/case-studies/combigo/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Combigo</a>, <a href="https://www.ovhcloud.com/en-gb/case-studies/superprotocol/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Super Protocol</a> and <a href="https://www.ovhcloud.com/en/case-studies/orpiva/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">ORPIVA</a> have already enjoyed the benefits that the Fast Forward Accelerator offers.</p>



<p class="has-text-align-center"><em>&#8220;During the first weeks of data migration, it became clear that we needed additional training in terms of traffic modeling and routing. This ran outside the scope of our previous provider, but OVHcloud showed flexibility and willingness to help – following a quick technical consultation with an OVHcloud architect cleared any uncertainty. This was very important for us, because traffic control, next to the sheer amount of data held, are our biggest challenges. We have no doubt that OVHcloud is a great partner with whom we can always brainstorm with to come up with the best solution together.&#8221;</em> says Kamil Walkowiak, VP of R&amp;D at CUX.io<strong>.</strong></p>



<p class="has-text-align-center"><em>“We have been using OVHcloud solutions to develop and deploy our AI-powered applications, including LLM training and automatic AI video generation. We are very satisfied with their performance, reliability, and support.”</em> says Salman Valibeik, CEO and Co-Founder at ORPIVA.</p>



<p>Sign up to the <a href="https://startup.ovhcloud.com/en/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Startup Program </a>and <a href="https://startup.ovhcloud.com/en-gb/fast-forward-ai-accelerator/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">AI Accelerator</a> today to benefit from a wealth of support – and scale your business faster.</p>



<figure class="wp-block-image aligncenter size-full is-resized"><a href="https://startup.ovhcloud.com/en-gb/fast-forward-ai-accelerator/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer"><img loading="lazy" decoding="async" width="548" height="150" src="https://blog.ovhcloud.com/wp-content/uploads/2024/10/Email-banner-FY25.jpg" alt="" class="wp-image-27636" style="width:642px;height:auto" srcset="https://blog.ovhcloud.com/wp-content/uploads/2024/10/Email-banner-FY25.jpg 548w, https://blog.ovhcloud.com/wp-content/uploads/2024/10/Email-banner-FY25-300x82.jpg 300w" sizes="auto, (max-width: 548px) 100vw, 548px" /></a></figure>



<img loading="lazy" decoding="async" src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fapply-now-for-the-fast-forward-ai-accelerator%2F&amp;action_name=Apply%20now%20for%20the%20Fast%20Forward%20AI%20Accelerator&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>F.A.I.R. Principles in Data for AI</title>
		<link>https://blog.ovhcloud.com/f-a-i-r-principles-in-data-for-ai/</link>
		
		<dc:creator><![CDATA[Lex Avstreikh]]></dc:creator>
		<pubDate>Mon, 30 Sep 2024 22:44:43 +0000</pubDate>
				<category><![CDATA[OVHcloud Startup Program]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[DevOps]]></category>
		<category><![CDATA[Machine learning]]></category>
		<category><![CDATA[MLops]]></category>
		<category><![CDATA[Public Cloud]]></category>
		<category><![CDATA[Startup Program]]></category>
		<guid isPermaLink="false">https://blog.ovhcloud.com/?p=27422</guid>

					<description><![CDATA[How the FAIR Data Principles apply to Machine Learning Data and Infrastructure At Hopsworks, the FAIR Guiding Principles for scientific data management and stewardship have been a cornerstone of our approach to build a better machine learning platform. F.A.I.R. principles initially became prevalent in academia and diverse fields of research in an effort to make [&#8230;]<img src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Ff-a-i-r-principles-in-data-for-ai%2F&amp;action_name=F.A.I.R.%20Principles%20in%20Data%20for%20AI&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></description>
										<content:encoded><![CDATA[
<h5 class="wp-block-heading"><em>How the FAIR Data Principles apply to Machine Learning Data and Infrastructure</em></h5>



<p>At Hopsworks, <a href="https://www.nature.com/articles/sdata201618" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">the FAIR Guiding Principles for scientific data management and stewardship</a> have been a cornerstone of our approach to building a better machine learning platform. F.A.I.R. principles first became prevalent in academia and diverse fields of research, in an effort to make sure that the ever-growing amount of data would remain usable and beneficial to society, and they have since been widely adopted. However, few people mention them in the context of machine learning systems and data management. Yet those principles are even more relevant today in the fast-moving AI and <a href="https://www.hopsworks.ai/dictionary/llms-large-language-models" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">LLMs</a> landscape, where <a href="https://www.hopsworks.ai/post/high-risk-ai-in-the-eu-ai-act" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">new legislation</a> is changing the rules of the game.&nbsp;</p>



<p>AI professionals should consider how questions of ethics, data management, and open frameworks may influence their choice of tools and machine learning platforms when implementing <a href="https://www.hopsworks.ai/post/mlops-to-ml-systems-with-fti-pipelines" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">modern ML systems</a>. At Hopsworks, we follow the F.A.I.R. principles in designing a platform for managing machine learning data and infrastructure.&nbsp;</p>



<h2 class="wp-block-heading"><strong>What are the Four Core Concepts of F.A.I.R.?&nbsp;</strong></h2>



<p><strong>Findable</strong>; mechanisms that make the data easily searchable and discoverable. Infrastructure, stakeholders, and projects need easy-to-use functionality for data discovery.&nbsp;</p>



<ul class="wp-block-list">
<li>Data needs to follow <a href="https://datamanagement.hms.harvard.edu/plan-design/file-naming-conventions#:~:text=The%20file%20name%20should%20be%20descriptive%20and%20provide%20just%20enough%20contextual%20information.&amp;text=A%20good%20format%20for%20date,filename%2C%20use%20the%20format%20YYYYMMDDThhmm." data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">clear naming conventions</a>, be indexed for free-text search, and have persistent, uniquely identified metadata that clearly and explicitly describe the data.&nbsp;</li>



<li>The design and curation of metadata needs to have good system support.</li>
</ul>
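<p>By way of illustration, the two points above can be sketched in a few lines of Python. The record layout, field names and naming scheme below are hypothetical (not a Hopsworks or OVHcloud API); they simply show a dated, descriptive name combined with a persistent unique identifier and free-text-searchable metadata:</p>

```python
import uuid
from datetime import datetime, timezone

def make_dataset_record(project, description, tags):
    """Build a findable dataset record: a descriptive name carrying a
    YYYYMMDDThhmm timestamp, a persistent unique identifier, and
    metadata suitable for free-text search."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M")
    return {
        "id": str(uuid.uuid4()),       # persistent, unique identifier
        "name": f"{project}_{stamp}",  # clear, dated naming convention
        "description": description,    # indexed for free-text search
        "tags": sorted(tags),
    }

record = make_dataset_record(
    project="churn_features",
    description="Monthly aggregated customer churn features",
    tags={"churn", "customer", "monthly"},
)
print(record["name"])  # e.g. churn_features_20240930T2244
```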



<p><strong>Accessible</strong>; allow access not only to the data itself, but also to its metadata and provenance.&nbsp;</p>



<ul class="wp-block-list">
<li>Open, free, and universally implementable protocols that allow access to the data itself, the metadata and its provenance,</li>



<li>Access control support is required when sharing data. Role-based access control is good, but attribute-based access control and/or dynamic role-based access control provides even more fine-grained support for data sharing and reuse.</li>
</ul>
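<p>To make the role-based vs attribute-based distinction concrete, here is a minimal Python sketch. The roles, attributes and rules are invented purely for illustration; real access-control systems evaluate far richer policies:</p>

```python
user = {"roles": {"data_steward"}, "project": "heap"}
dataset = {"project": "heap", "classification": "internal"}

def rbac_allow(user):
    """Role-based: the decision depends only on the user's role."""
    return "data_steward" in user["roles"]

def abac_allow(user, resource):
    """Attribute-based: the decision also weighs attributes of the user
    and the resource, giving finer-grained rules for data sharing."""
    return (
        "data_steward" in user["roles"]
        and user["project"] == resource["project"]
        and resource["classification"] != "restricted"
    )

print(rbac_allow(user))           # True: the role alone grants access
print(abac_allow(user, dataset))  # True: attributes also permit it
```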



<p><strong>Interoperable</strong>; data should be easily shared between different computer systems. This is achieved by implementing open standards and formats for the data.</p>



<ul class="wp-block-list">
<li>Open and accessible file formats and transport protocols for accessing the data.&nbsp;</li>
</ul>



<p><strong>Reusable</strong>; data produced by one system should be easy to reuse in <a href="https://www.hopsworks.ai/dictionary/downstream" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">downstream</a> systems, without copying the data. In order to reuse data, it’s important to include metadata covering data licenses, provenance, community standards, and any custom metadata that will allow other institutions, teams or groups to reuse the data.</p>



<ul class="wp-block-list">
<li>Versioning, cataloging, provenance/lineage, data integrity, and custom metadata make it easier for users of data to decide on whether they can use the shared data.</li>
</ul>



<h2 class="wp-block-heading"><strong>Why F.A.I.R. is challenging for AI platforms and ML Systems</strong></h2>



<p>Some of the FAIR principles are directly applicable in the context of machine learning systems: there are lots of open source frameworks, file systems, and programming languages that are used for the operation of AI products and services. Still, some very serious challenges do emerge that are specifically due to the way any <a href="https://www.hopsworks.ai/dictionary/ml-systems" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">ML System</a> needs to operate.&nbsp;</p>



<p><strong>Findable</strong>; while strategies that apply metadata and clear nomenclature can be used in the context of operational machine learning systems, practitioners will find it challenging to create a clear, centralized logic across the different data sources and databases needed to operate such services; a modern ML system might need to connect to multiple sources, some of them real-time, or to <a href="https://www.hopsworks.ai/dictionary/vector-database" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">vector databases</a> for large language models. Creating a clear structure for the assets and the metadata becomes a complex endeavor without a centralized solution capable of catering to the different scenarios.&nbsp;</p>



<p><strong>Interoperable &amp; Accessible</strong>; when open frameworks and open file formats are used, the core challenges of accessibility and interoperability are easier to resolve; it then becomes important to favour open standards and compute engines, and to avoid <a href="https://en.wikipedia.org/wiki/Domain-specific_language" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">DSLs</a>. An additional challenge, stemming from the very nature of the underlying data, is keeping it accessible for auditing (for example: what data was the model in production last year trained on?), review and debugging while the system continuously updates and appends data.</p>



<p><strong>Reusable</strong>; finally, a fundamental characteristic of machine learning models is that some of them require the data processing to be directly tied to the model that will be trained; we call these <a href="https://www.hopsworks.ai/dictionary/model-dependent-transformations" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">model-dependent transformations</a>. This process essentially compromises the integrity of the data, and the resulting datasets can’t be reused in a different scenario. Not only does it prevent the reuse of the data itself, it also makes the data harder for a human to understand. This places significant limits on an organization’s ability to reuse its data across different models, leading to duplication and the creation of <a href="https://www.hopsworks.ai/dictionary/monolithic-ml-pipeline" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">monolithic pipelines</a> that are notoriously harder to scale.&nbsp;</p>
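<p>A small Python sketch of why this matters (the feature values and transformations are illustrative): keeping stored features untransformed, and applying each model-dependent transformation only at training time, leaves the shared data intact for reuse by other models:</p>

```python
def standardize(values):
    """Model-dependent transformation: z-score scaling whose mean and
    standard deviation are tied to one model's training set."""
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - mean) / std for v in values]

# Reusable, model-independent feature as it should be stored/shared.
stored_feature = [120.0, 80.0, 100.0]

# Each model applies its own transformation at training time,
# without overwriting the shared feature data.
model_a_input = standardize(stored_feature)
model_b_input = [v / max(stored_feature) for v in stored_feature]

# The stored feature is untouched and stays reusable downstream.
assert stored_feature == [120.0, 80.0, 100.0]
```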



<h2 class="wp-block-heading"><strong>Making Data for AI F.A.I.R.</strong></h2>



<h3 class="wp-block-heading">Use case of <em>Hopsworks with the Human Exposome Assessment Platform</em></h3>



<p>At Hopsworks, we have a strong heritage of working with academia and research, participating in projects such as <a href="https://heap-exposome.eu/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">HEAP</a> (Human Exposome Assessment Project), which manages personal data from numerous medical institutes across the world. We have always been mindful of the evident privacy and security concerns, and of the need for efficiency, when managing data following FAIR principles. When approaching such a project, we treat those principles as a blueprint for refining our own software:</p>



<ul class="wp-block-list">
<li>Using open frameworks,&nbsp;</li>



<li>Using open languages,</li>



<li>Using modular technologies,&nbsp;</li>



<li>Using reusable file formats.</li>
</ul>



<p>Additionally, we strive to build strong abstractions and APIs that give users and organizations a better understanding of the models they are building, and more flexibility in reusing their data pipelines. These are core aspects of the Hopsworks platform, and we believe all state-of-the-art ML platforms should follow them to stay within the FAIR framework.</p>



<h2 class="wp-block-heading"><strong>FAIR principles in practice at Hopsworks</strong></h2>



<figure class="wp-block-image"><a href="https://cdn.prod.website-files.com/618399cd49d125734c8dec95/65bbad2d391979288c4153b7_FAIR_lightbox.png" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer"><img decoding="async" src="https://cdn.prod.website-files.com/618399cd49d125734c8dec95/65bbad2d391979288c4153b7_FAIR_lightbox.png" alt="FAIR principles at Hopsworks"/></a></figure>



<h2 class="wp-block-heading">Sources</h2>



<ul class="wp-block-list">
<li><a href="https://direct.mit.edu/dint/article/2/1-2/10/10017/FAIR-Principles-Interpretations-and-Implementation" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">FAIR Principles: Interpretations and Implementation Considerations | Data Intelligence | MIT Press</a></li>



<li><a href="https://www.nature.com/articles/sdata2018118" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">A design framework and exemplar metrics for FAIRness | Scientific Data</a></li>



<li><a href="https://www.nature.com/articles/sdata201618" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">The FAIR Guiding Principles for scientific data management and stewardship</a></li>
</ul>
<img loading="lazy" decoding="async" src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Ff-a-i-r-principles-in-data-for-ai%2F&amp;action_name=F.A.I.R.%20Principles%20in%20Data%20for%20AI&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Adopting AI in SaaS: how can we move quickly without losing control?</title>
		<link>https://blog.ovhcloud.com/ai-saas-ovhcloud/</link>
		
		<dc:creator><![CDATA[Germain Masse]]></dc:creator>
		<pubDate>Wed, 14 Aug 2024 06:04:09 +0000</pubDate>
				<category><![CDATA[OVHcloud Product News]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[LLM Serving]]></category>
		<category><![CDATA[Machine learning]]></category>
		<category><![CDATA[Saas]]></category>
		<guid isPermaLink="false">https://blog.ovhcloud.com/?p=27229</guid>

					<description><![CDATA[The widespread use of AI poses numerous challenges. Including the risks of data leakage, the need for explainable results, handling it in SaaS. But also, the growing dependence on Big Tech. Not to mention the environmental toll linked to AI. No doubt the eco-design of digital services is becoming increasingly popular. Still, the efforts to [&#8230;]<img src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fai-saas-ovhcloud%2F&amp;action_name=Adopting%20AI%20in%20SaaS%3A%20how%20can%20we%20move%20quickly%20without%20losing%20control%3F&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></description>
										<content:encoded><![CDATA[
<p>The widespread use of AI poses numerous challenges: the risk of data leakage, the need for explainable results, and how to handle AI in SaaS, but also the growing dependence on Big Tech, not to mention the environmental toll of AI.</p>



<p>No doubt the eco-design of digital services is becoming increasingly popular. Still, efforts to achieve digital sobriety seem marginal, especially compared to the energy consumed by training general-purpose LLMs. Is there a way to make AI greener? And what would a more “trusted AI” mean?</p>



<p>Here’s a roundup of challenges and solutions.</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="576" src="https://blog.ovhcloud.com/wp-content/uploads/2024/08/AdobeStock_8327344081-1024x576.jpeg" alt="" class="wp-image-27235" srcset="https://blog.ovhcloud.com/wp-content/uploads/2024/08/AdobeStock_8327344081-1024x576.jpeg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2024/08/AdobeStock_8327344081-300x169.jpeg 300w, https://blog.ovhcloud.com/wp-content/uploads/2024/08/AdobeStock_8327344081-768x432.jpeg 768w, https://blog.ovhcloud.com/wp-content/uploads/2024/08/AdobeStock_8327344081-1536x864.jpeg 1536w, https://blog.ovhcloud.com/wp-content/uploads/2024/08/AdobeStock_8327344081-2048x1152.jpeg 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p><strong>Efficiency of specialised LLMs compared to general-purpose LLMs</strong></p>



<p>General-purpose LLMs, such as GPT-4 (OpenAI), Llama (Meta) and Gemini (Google), are currently in the spotlight. Versatile and seemingly omniscient, able to handle a wide variety of scenarios, they appear to meet every need: generating text or code, answering questions, translating content, even composing poems.</p>



<p>However, these general-purpose models have not yet eclipsed specialised LLMs,<a id="_ftnref1" href="#_ftn1"><sup>[1]</sup></a><sup> </sup>which target a narrower range of situations but perform much better in them. Techniques such as Retrieval-Augmented Generation (RAG) and transfer learning certainly make it possible to specialise a general-purpose LLM, with or without retraining the model. Still, the use of general-purpose LLMs continues to pose a range of challenges, starting with their generic results, unreliable quality and lack of reproducibility. This will prove even more challenging as the available sources of quality data may become scarce, due to legal actions<a id="_ftnref2" href="#_ftn2"><sup>[2]</sup></a> brought for unauthorised use of content and copyright infringement. Additionally, the use of general-purpose LLMs creates operator dependency and reinforces monopolies,<a id="_ftnref3" href="#_ftn3"><sup>[3]</sup></a><sup> </sup>which is unfavourable for long-term users.</p>
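<p>For context, RAG works by retrieving documents relevant to a query and injecting them into the model’s prompt, without changing the model’s weights. The toy Python sketch below uses bag-of-words retrieval purely for illustration (the documents and scoring are invented); a production system would use embeddings and a vector database:</p>

```python
from collections import Counter
from math import sqrt

documents = [
    "OVHcloud operates water-cooled datacentres in Europe.",
    "Retrieval-Augmented Generation injects retrieved context into the prompt.",
    "Specialised models can be chained to perform complex tasks.",
]

def vectorize(text):
    """Bag-of-words vector: token -> count."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, docs, k=1):
    """Return the k documents most similar to the query."""
    q = vectorize(query)
    return sorted(docs, key=lambda d: cosine(q, vectorize(d)), reverse=True)[:k]

def build_prompt(query, docs):
    """Inject the retrieved context into the prompt sent to the LLM."""
    context = "\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("injects retrieved context into the prompt", documents))
```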



<p><strong>The impact of AI on the environment<br></strong>Researchers from Hugging Face and the Allen Institute<a href="#_ftn4" id="_ftnref4"><sup>[4]</sup></a><sup> </sup><strong>have shown that in the case of servers with GPUs, the carbon emissions linked to machine use far exceed those linked to manufacturing the components, unlike traditional cloud computing.</strong><strong><sup> </sup></strong><strong>Generating an image using an AI model is one of the most energy-intensive uses, and requires as much electricity as fully charging a smartphone.</strong><a href="#_ftn5" id="_ftnref5"><strong><sup><strong><sup>[5]</sup></strong></sup></strong></a><strong><sup> </sup></strong>Reversing the distribution of carbon emissions throughout the lifecycle of servers in this way means that the power usage effectiveness (PUE) of the datacentres in which AI models are trained and inferred, as well as the energy mix of the countries in which they are located, are very significant selection criteria in calculating your application’s global carbon footprint.</p>



<p>This is a bonus for OVHcloud. Indeed, the Group has long been committed to reducing the carbon footprint of its datacentres.<a id="_ftnref6" href="#_ftn6"><sup>[6]</sup></a></p>



<p>As might be expected, general-purpose LLMs are more environmentally damaging than specialised models designed for specific tasks. This has been revealed in a series of comparative tests carried out by the same researchers.<a id="_ftnref7" href="#_ftn7"><sup>[7]</sup></a><sup> </sup>With thousands of billions of parameters, the largest LLMs are getting larger and more data-intensive.<a id="_ftnref8" href="#_ftn8"><sup>[8]</sup></a> An article in the<sup> </sup><em>New Scientist</em> recently explained that algorithmic advances are outpacing Moore’s Law: after eight months, a large language model needs only half the computing power to achieve the same level of performance.<a id="_ftnref9" href="#_ftn9"><sup>[9]</sup></a><sup> </sup>However, running a model like OpenAI’s ChatGPT today costs Microsoft around $700,000 per day<a id="_ftnref10" href="#_ftn10">[10]</a>, or an average of 36 cents per query. That remains unreasonable, from both an economic and an environmental point of view, for needs that are often precise and well-defined.</p>



<p>Specialised models, which can be chained to perform complex tasks (an approach referred to as agentisation), are therefore a more environmentally responsible alternative to general-purpose LLMs. On top of that, specialised models, which are more widely available as open source, are also easier to understand and to fine-tune. They seem more suitable for building, in a reversible way, innovations whose ROI is still very uncertain.</p>



<p><strong>Maintaining control: working towards developing a trusted AI</strong></p>



<p>While large companies quickly became aware of the risks of leaking confidential data when using digital services (like online translation, which they are beginning to ban), AI has intensified the temptation to send a company’s data outside and submit it to an algorithm: here to write a report more quickly, there to generate an image for a presentation on a confidential project. Samsung learned this the hard way, as the victim of three consecutive data leaks caused by employees using ChatGPT, notably copy/pasting source code to solve or optimise a problem.<br>You don’t need to disclose much information to say a lot about your intentions. What would a rival learn about your strategy from reading your ChatGPT prompts? After all, AI services can accidentally expose data submitted by users, as a security-impacting bug<a href="#_ftn11" id="_ftnref11"><sup>[11]</sup></a><sup> </sup>has shown. The same goes for datasets you might submit to AI platforms: will your data be used to train and refine the model? Could it benefit potential rivals?<br><br></p>



<p>Beyond this, there is also the question of the transparency of AI models, and with it the risk of outsourcing increasingly important tasks to sophisticated models that can become &#8220;black boxes&#8221;, making incomprehensible decisions or producing results skewed by the data they were trained on.</p>



<p>Let&#8217;s face the possibility that you may not have any problem with the results. Would you run the risk of relying on a service where you can’t explain in broad terms how it works? And that you couldn’t stop using without losing everything? Here, we encounter another problem – reversibility. </p>



<p>Suppose, for example, that the AI provider decides the party is over: the infrastructure it has long financed at a loss must now be made profitable, so it takes advantage of its monopoly and your dependency to raise its rates unreasonably. You could certainly cancel the service, but you would then lose the results of your data training and/or model specialisation, and you would have to start from scratch. In the current absence of standards for portability and interoperability between AI services, this issue is crucial – all the more so given that, for the moment, while open source is popular, proprietary models are very much in the majority.</p>



<p>There is no simple answer to the questions that have been raised. That’s because AI development is currently very empirical, based on a trial-and-error model, with no traceability of training data or model modifications.</p>



<p>This, incidentally, makes the “explainability”<a href="#_ftn12" id="_ftnref12"><sup>[12]</sup></a><sup> </sup>of an AI system’s results a real challenge, even though the AI Act establishes a duty of explainability (see below).<br><br></p>



<p>The development of a “trustworthy AI”, as it was termed in a 2019 paper<a href="#_ftn13" id="_ftnref13"><sup>[13]</sup></a><sup> </sup>by the Independent High-Level Expert Group on Artificial Intelligence (AI HLEG), is perhaps a direction to keep in mind. It defines a trustworthy AI with three main objectives, which OVHcloud aims to help you achieve: AI must be lawful (legislative or regulatory aspect), ethical (respect for ethical norms) and robust (from both a technical and social perspective).</p>



<p>In the meantime, ensuring swift regulatory compliance at the national, European, and international levels is a powerful lever for more responsible business practices, without compromising future prospects in the pursuit of innovation.</p>



<p><strong>1/ Complying with current and future regulations</strong></p>



<p>The EU was quick to respond to the democratisation of AI, proposing a draft European regulation on the subject on 21 April 2021. In March 2024, the AI Act was officially adopted. It now applies to all AI services used in the EU, regardless of where their providers are established.</p>



<p>The law divides AI systems into four risk categories, based on their impact on fundamental rights in the EU and on the safety of individuals, groups, and society. Each risk category carries its own prohibitions<a href="#_ftn14" id="_ftnref14"><sup>[14]</sup></a> and obligations, ranging from environmental sustainability to security, including the marking of AI-generated content.</p>



<p>An online “compliance checker” lets you quickly find out to what extent this European AI law applies to your projects: <a href="https://artificialintelligenceact.eu/assessment/eu-ai-act-compliance-checker/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">https://artificialintelligenceact.eu/assessment/eu-ai-act-compliance-checker/</a></p>



<p>Other national and European regulations on personal data protection, such as the GDPR, already apply to your AI projects, holding companies accountable for hosting and transferring personal data outside the EU.</p>



<p>Incidentally, those who complain about regulation being too burdensome in comparison to the American laissez-faire attitude or the Chinese spirit of conquest have the wrong end of the stick: the absence of a genuine European single market is a much bigger factor<a href="#_ftn15" id="_ftnref15"><sup>[15]</sup></a><sup> </sup>behind Europe’s innovation gap. So, too, is the weak support for public procurement, or the incomprehensible message sent by governments that claim to want to develop sovereign solutions by relying on investments by foreign stakeholders.<a href="#_ftn16" id="_ftnref16"><sup>[16]</sup></a></p>



<p>It&#8217;s also worth noting that the AI Act allows national competent authorities (the ICO in the UK) to set up “regulatory sandboxes”, i.e. controlled environments in which innovative technologies can be tested for a limited time, so that an AI system’s compliance can be verified without delaying its potential placing on the market. SMEs and startups have priority access to these sandboxes.</p>



<p>In short, regulations today do not hinder the development of projects that take advantage of the possibilities offered by AI, but rather strengthen companies’ obligations regarding the protection of personal data due to increased risks. These obligations will help to reassure users, once this brief period of carelessness and frivolity with AI has passed, and the inevitable first scandals start to surface. As Yoshua Bengio, researcher and founder of MILA (the Quebec Artificial Intelligence Institute), summed up: “We’re going too fast in an unfamiliar direction, and that could change the world in a very positive, or very dangerous, way.”<a href="#_ftn17" id="_ftnref17"><sup>[17]</sup></a><sup> </sup>Countries should therefore seek to regulate AI so that its development does not feel like the Wild West.</p>



<p>In this context, the preference for sovereign solutions will make it easier for your projects to comply with regulations, in addition to establishing a clear medium- and long-term vision. COVID and the current geopolitical instability have shown the cost of relying on foreign entities for essential services, and AI-based services will quickly follow the same path if they integrate the software we use every day, in such critical areas as health, education, transport, or logistics.</p>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<p><a href="#_ftnref1" id="_ftn1"><sup>[1]</sup></a> General-purpose and specialised LLMs can be distinguished by the number of parameters in their neural network: tens, hundreds or even thousands of billions of parameters for a general-purpose model versus “a few billion” for a specialised model.</p>



<p><a href="#_ftnref2" id="_ftn2"><sup>[2]</sup></a> <a href="https://www.usine-digitale.fr/article/openai-cible-par-deux-class-actions-aux-etats-unis.N2148412" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">https://www.usine-digitale.fr/article/openai-cible-par-deux-class-actions-aux-etats-unis.N2148412</a>; <a href="https://www.lefigaro.fr/secteur/high-tech/des-journaux-americains-poursuivent-openai-et-microsoft-en-justice-pour-violation-de-leurs-droits-d-auteur-20240430" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">https://www.lefigaro.fr/secteur/high-tech/des-journaux-americains-poursuivent-openai-et-microsoft-en-justice-pour-violation-de-leurs-droits-d-auteur-20240430</a></p>



<p><a id="_ftn3" href="#_ftnref3"><sup>[3]</sup></a> <a href="https://www.nytimes.com/2024/06/05/technology/nvidia-microsoft-openai-antitrust-doj-ftc.html" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">https://www.nytimes.com/2024/06/05/technology/nvidia-microsoft-openai-antitrust-doj-ftc.html</a></p>



<p><a id="_ftn4" href="#_ftnref4"><sup>[4]</sup></a> <a href="http://arxiv.org/pdf/2311.16863" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">http://arxiv.org/pdf/2311.16863</a></p>



<p><a id="_ftn5" href="#_ftnref5"><sup>[5]</sup></a> <a href="https://www.technologyreview.com/2023/12/01/1084189/making-an-image-with-generative-ai-uses-as-much-energy-as-charging-your-phone/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">https://www.technologyreview.com/2023/12/01/1084189/making-an-image-with-generative-ai-uses-as-much-energy-as-charging-your-phone/</a></p>



<p><a id="_ftn6" href="#_ftnref6"><sup>[6]</sup></a> <a href="https://corporate.ovhcloud.com/en/sustainability/environment/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">https://corporate.ovhcloud.com/en-gb/sustainability/environment/</a><br>For our <strong>PUE calculation methodology,</strong> refer to <a href="https://corporate.ovhcloud.com/sites/default/files/2024-01/methodo_carboncalc_0.pdf" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">https://corporate.ovhcloud.com/sites/default/files/2024-01/methodo_carboncalc_0.pdf</a></p>



<p><a id="_ftn7" href="#_ftnref7"><sup>[7]</sup></a> <a href="https://www.silicon.fr/llm-generaliste-specialise-angle-environnemental-473911.html" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">https://www.silicon.fr/llm-generaliste-specialise-angle-environnemental-473911.html</a></p>



<p><a id="_ftn8" href="#_ftnref8"><sup>[8]</sup></a> <a href="https://www.radiofrance.fr/franceculture/podcasts/le-journal-de-l-eco/le-cout-environnemental-de-l-ia-est-colossal-et-sous-evalue-3781962" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">https://www.radiofrance.fr/franceculture/podcasts/le-journal-de-l-eco/le-cout-environnemental-de-l-ia-est-colossal-et-sous-evalue-3781962</a></p>



<p><a id="_ftn9" href="#_ftnref9"><sup>[9]</sup></a> <a href="https://www.newscientist.com/article/2424179-ai-chatbots-are-improving-at-an-even-faster-rate-than-computer-chips/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">https://www.newscientist.com/article/2424179-ai-chatbots-are-improving-at-an-even-faster-rate-than-computer-chips/</a></p>



<p><a id="_ftn10" href="#_ftnref10"><sup>[10]</sup></a> <a href="https://usbeketrica.com/fr/article/chatgpt-coute-t-il-vraiment-700-000-dollars-par-jour-a-openai" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">https://usbeketrica.com/fr/article/chatgpt-coute-t-il-vraiment-700-000-dollars-par-jour-a-openai</a></p>



<p><a id="_ftn11" href="#_ftnref11"><sup>[11]</sup></a> <a href="https://arstechnica.com/information-technology/2023/02/chatgpt-is-a-data-privacy-nightmare-and-you-ought-to-be-concerned/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">https://arstechnica.com/information-technology/2023/02/chatgpt-is-a-data-privacy-nightmare-and-you-ought-to-be-concerned/</a></p>



<p><a id="_ftn12" href="#_ftnref12"><sup>[12]</sup></a> <a href="https://www.cnil.fr/fr/definition/explicabilite-ia" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">https://www.cnil.fr/fr/definition/explicabilite-ia</a></p>



<p><a id="_ftn13" href="#_ftnref13"><sup>[13]</sup></a> <a href="https://op.europa.eu/en/publication-detail/-/publication/d3988569-0434-11ea-8c1f-01aa75ed71a1" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">https://op.europa.eu/en/publication-detail/-/publication/d3988569-0434-11ea-8c1f-01aa75ed71a1</a></p>



<p><a href="#_ftnref14" id="_ftn14"><sup>[14]</sup></a> AI systems are prohibited if they violate EU values by infringing on fundamental rights, such as:</p>

<ul>
<li>Subliminally manipulating behaviours</li>
<li>Exploiting individuals’ vulnerabilities in order to influence their behaviour</li>
<li>AI-based social scoring used by governments for general purposes</li>
<li>The use of “real-time” remote biometric identification systems in publicly accessible spaces for law enforcement purposes (with exceptions)</li>
</ul>



<p><a id="_ftn15" href="#_ftnref15"><sup>[15]</sup></a> <a href="https://twitter.com/hubertguillaud/status/1795001082843713968" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">https://twitter.com/hubertguillaud/status/1795001082843713968</a></p>



<p><a id="_ftn16" href="#_ftnref16"><sup>[16]</sup></a> <a href="https://twitter.com/canardenchaine/status/1795862230782640367" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">https://twitter.com/canardenchaine/status/1795862230782640367</a></p>



<p><a id="_ftn17" href="#_ftnref17"><sup>[17]</sup></a> <a href="https://ici.radio-canada.ca/ohdio/premiere/emissions/ils-ont-fait-annee/segments/entrevue/469120/robot-chatgpt-lois-securite-ordinateurs" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">https://ici.radio-canada.ca/ohdio/premiere/emissions/ils-ont-fait-annee/segments/entrevue/469120/robot-chatgpt-lois-securite-ordinateurs</a></p>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<img loading="lazy" decoding="async" src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fai-saas-ovhcloud%2F&amp;action_name=Adopting%20AI%20in%20SaaS%3A%20how%20can%20we%20move%20quickly%20without%20losing%20control%3F&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
