<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Fabien Ric, Author at OVHcloud Blog</title>
	<atom:link href="https://blog.ovhcloud.com/author/fabien/feed/" rel="self" type="application/rss+xml" />
	<link>https://blog.ovhcloud.com/author/fabien/</link>
	<description>Innovation for Freedom</description>
	<lastBuildDate>Thu, 06 Mar 2025 10:22:44 +0000</lastBuildDate>
	<language>en-GB</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://blog.ovhcloud.com/wp-content/uploads/2019/07/cropped-cropped-nouveau-logo-ovh-rebranding-32x32.gif</url>
	<title>Fabien Ric, Author at OVHcloud Blog</title>
	<link>https://blog.ovhcloud.com/author/fabien/</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>Deep Dive into DeepSeek-R1 &#8211; Part 1</title>
		<link>https://blog.ovhcloud.com/deep-dive-into-deepseek-r1-part-1/</link>
		
		<dc:creator><![CDATA[Fabien Ric]]></dc:creator>
		<pubDate>Thu, 06 Mar 2025 09:56:20 +0000</pubDate>
				<category><![CDATA[OVHcloud Engineering]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[AI Deploy]]></category>
		<category><![CDATA[AI Endpoints]]></category>
		<category><![CDATA[DeepSeek]]></category>
		<category><![CDATA[LLM]]></category>
		<category><![CDATA[Machine learning]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[OVHcloud]]></category>
		<category><![CDATA[Public Cloud]]></category>
		<guid isPermaLink="false">https://blog.ovhcloud.com/?p=28199</guid>

					<description><![CDATA[Introduction A few weeks ago, the release of the open-source large language model DeepSeek-R1 took the AI world by storm. The Chinese research team claimed their new reasoning model was on par with OpenAI&#8217;s flagship model o1, open-sourced the model and gave details about the work behind it. In this blog post series, we [&#8230;]]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-image size-large"><img fetchpriority="high" decoding="async" width="1024" height="512" src="https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-16-1024x512.png" alt="A cute whale with a baseball cap, using a computer, representing DeepSeek." class="wp-image-28353" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-16-1024x512.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-16-300x150.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-16-768x384.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-16-1536x768.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-16.png 2048w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<h2 class="wp-block-heading">Introduction</h2>



<p>A few weeks ago, the release of the open-source large language model DeepSeek-R1 took the AI world by storm. The Chinese research team claimed their new reasoning model was on par with OpenAI&#8217;s flagship model o1, open-sourced the model, and gave details about the work behind it.</p>



<p>In this blog post series, we will dive into the DeepSeek-R1 model family and see how you can run it on OVHcloud to build a simple chatbot that handles reasoning.</p>



<p>The &#8220;R&#8221; in DeepSeek-R1 stands for &#8220;Reasoning&#8221;, so let&#8217;s start by defining what a reasoning model is.</p>



<h2 class="wp-block-heading">What are reasoning models?</h2>



<p>Reasoning models are large language models (LLMs) capable of reflecting on a problem before generating an answer. Traditionally, LLMs have been improved by spending more compute at training time (more data, more parameters, more training iterations): this is <strong>training-time compute</strong>. Reasoning models, however, differ from standard LLMs in their use of <strong>test-time compute</strong>: during inference, they spend more time and resources to generate and refine a better answer.</p>



<p>Reasoning models excel at tasks that require understanding and working through a problem step-by-step, such as mathematics, riddles, puzzles, coding, planning tasks and agentic workflows. They may be counterproductive for use cases that don&#8217;t require reasoning capabilities, such as factual recall (for example, <em>who discovered penicillin</em>).</p>



<p>In a classroom, a reasoning model would be the student who takes time to understand the question, splits the problem into manageable steps and details the resolution process, rather than rushing to write the answer.</p>



<p>Here is a comparison between the outputs of a standard LLM and a reasoning LLM, on an example prompt:</p>



<figure data-wp-context="{&quot;imageId&quot;:&quot;69e9aa4c46746&quot;}" data-wp-interactive="core/image" data-wp-key="69e9aa4c46746" class="wp-block-image aligncenter size-full wp-lightbox-container"><img decoding="async" width="1029" height="492" data-wp-class--hide="state.isContentHidden" data-wp-class--show="state.isContentVisible" data-wp-init="callbacks.setButtonStyles" data-wp-on--click="actions.showLightbox" data-wp-on--load="callbacks.setButtonStyles" data-wp-on-window--resize="callbacks.setButtonStyles" src="https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-14.png" alt="A diagram showing the differences between standard LLM and reasoning LLM outputs for a given prompt." class="wp-image-28318" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-14.png 1029w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-14-300x143.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-14-1024x490.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-14-768x367.png 768w" sizes="(max-width: 1029px) 100vw, 1029px" /><button
			class="lightbox-trigger"
			type="button"
			aria-haspopup="dialog"
			aria-label="Enlarge"
			data-wp-init="callbacks.initTriggerButton"
			data-wp-on--click="actions.showLightbox"
			data-wp-style--right="state.imageButtonRight"
			data-wp-style--top="state.imageButtonTop"
		>
			<svg xmlns="http://www.w3.org/2000/svg" width="12" height="12" fill="none" viewBox="0 0 12 12">
				<path fill="#fff" d="M2 0a2 2 0 0 0-2 2v2h1.5V2a.5.5 0 0 1 .5-.5h2V0H2Zm2 10.5H2a.5.5 0 0 1-.5-.5V8H0v2a2 2 0 0 0 2 2h2v-1.5ZM8 12v-1.5h2a.5.5 0 0 0 .5-.5V8H12v2a2 2 0 0 1-2 2H8Zm2-12a2 2 0 0 1 2 2v2h-1.5V2a.5.5 0 0 0-.5-.5H8V0h2Z" />
			</svg>
		</button></figure>



<p>The reasoning model has generated more tokens, showing how it plans to solve the problem, before the actual answer. In the case of DeepSeek-R1, you can see that it generates its reasoning content inside <code>&lt;think&gt;...&lt;/think&gt;</code> tags.</p>
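

<p>Since the reasoning content is plainly delimited by these tags, separating it from the final answer takes only a few lines of code. Here is a minimal Python sketch (the function name is ours, not part of any DeepSeek tooling; the tag strings are written with unicode escapes, where <code>\u003c</code> and <code>\u003e</code> stand for the angle-bracket characters):</p>



<pre class="wp-block-code"><code class=""># Minimal sketch: split a DeepSeek-R1 completion into reasoning and answer.
# "\u003c" and "\u003e" are unicode escapes for the angle-bracket characters.
THINK_OPEN = "\u003cthink\u003e"    # the opening think tag
THINK_CLOSE = "\u003c/think\u003e"  # the closing think tag

def split_reasoning(completion):
    """Return a (reasoning, answer) pair from a raw model completion."""
    before, sep, after = completion.partition(THINK_CLOSE)
    if not sep:  # no closing tag: treat everything as the answer
        return "", completion.strip()
    return before.replace(THINK_OPEN, "").strip(), after.strip()</code></pre>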



<p>A standard LLM can also show reasoning abilities, which are often more visible when using a technique called <a href="https://arxiv.org/abs/2201.11903" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">Chain-of-Thought (CoT) prompting</a>, for instance by adding phrases such as &#8220;let&#8217;s think step-by-step&#8221; to the prompt.</p>
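

<p>As a toy illustration, zero-shot CoT prompting is nothing more than appending such a trigger phrase to the user message. A minimal sketch in the OpenAI-style chat message format (the helper name is ours):</p>



<pre class="wp-block-code"><code class="">def with_cot(question):
    # Zero-shot CoT: append a trigger phrase that nudges a standard LLM
    # into writing out intermediate steps before its final answer.
    return [{"role": "user", "content": question + "\n\nLet's think step-by-step."}]

messages = with_cot("I have 3 apples and eat one. How many are left?")</code></pre>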



<p>However, a reasoning LLM has been trained to behave this way. Its reasoning skill is internalized, so it doesn&#8217;t require specific prompting techniques to trigger the chain-of-thought process.</p>



<p>It&#8217;s important to note that DeepSeek-R1 is not the first reasoning model; OpenAI led the way by releasing their o1 model in September 2024.</p>



<p>The two main reasons why DeepSeek-R1 made headlines are its open-source nature and the paper released by the research team, which gives many details on how they trained the model, with valuable insights for the open-source community to create its own reasoning models. In particular, the key highlight of the paper is the observation that reasoning behavior can emerge through Reinforcement Learning (RL) alone, without supervised fine-tuning.</p>



<h2 class="wp-block-heading">The DeepSeek-R1 model family</h2>



<p>You may have heard about DeepSeek-R1, but it&#8217;s not the only model of the DeepSeek family: DeepSeek-V3, DeepSeek-R1-Zero and distilled models are also available. So what are the differences between these models?</p>



<p>First, let&#8217;s go through some definitions and an overview of how language models are trained.</p>



<h3 class="wp-block-heading">Language model training overview</h3>



<p>The large language models available in apps and playgrounds are usually trained in 3 steps:</p>



<ol class="wp-block-list">
<li>A <strong>base model</strong> is trained on an unsupervised language modeling task (for instance, next token prediction) with a dataset of trillions of tokens (also called <em>pre-training</em>),</li>



<li>An <strong>instruct model</strong> is trained from the base model by fine-tuning it on a massive dataset of instructions, conversations, questions and answers, to improve the model&#8217;s performance on the prompts frequently encountered in a chat,</li>



<li>The <strong>final model</strong> is the instruct model trained to better handle human preferences, avoid the generation of harmful content, etc., with techniques such as RLHF (reinforcement learning from human feedback) and DPO (direct preference optimization).</li>
</ol>



<figure data-wp-context="{&quot;imageId&quot;:&quot;69e9aa4c46d8a&quot;}" data-wp-interactive="core/image" data-wp-key="69e9aa4c46d8a" class="wp-block-image aligncenter size-full wp-lightbox-container"><img decoding="async" width="1459" height="239" data-wp-class--hide="state.isContentHidden" data-wp-class--show="state.isContentVisible" data-wp-init="callbacks.setButtonStyles" data-wp-on--click="actions.showLightbox" data-wp-on--load="callbacks.setButtonStyles" data-wp-on-window--resize="callbacks.setButtonStyles" src="https://blog.ovhcloud.com/wp-content/uploads/2025/03/image.png" alt="A diagram showing the 3 training steps of a LLM." class="wp-image-28268" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/03/image.png 1459w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-300x49.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-1024x168.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-768x126.png 768w" sizes="(max-width: 1459px) 100vw, 1459px" /><button
			class="lightbox-trigger"
			type="button"
			aria-haspopup="dialog"
			aria-label="Enlarge"
			data-wp-init="callbacks.initTriggerButton"
			data-wp-on--click="actions.showLightbox"
			data-wp-style--right="state.imageButtonRight"
			data-wp-style--top="state.imageButtonTop"
		>
			<svg xmlns="http://www.w3.org/2000/svg" width="12" height="12" fill="none" viewBox="0 0 12 12">
				<path fill="#fff" d="M2 0a2 2 0 0 0-2 2v2h1.5V2a.5.5 0 0 1 .5-.5h2V0H2Zm2 10.5H2a.5.5 0 0 1-.5-.5V8H0v2a2 2 0 0 0 2 2h2v-1.5ZM8 12v-1.5h2a.5.5 0 0 0 .5-.5V8H12v2a2 2 0 0 1-2 2H8Zm2-12a2 2 0 0 1 2 2v2h-1.5V2a.5.5 0 0 0-.5-.5H8V0h2Z" />
			</svg>
		</button></figure>






<h3 class="wp-block-heading">DeepSeek-V3 training</h3>



<p>According to the <a href="https://arxiv.org/pdf/2412.19437" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">technical report provided by DeepSeek</a>, DeepSeek-V3 is a mixture-of-experts (MoE) language model trained with the same kind of process, which is described in the image below:</p>



<ul class="wp-block-list">
<li><strong>DeepSeek-V3-Base</strong> is trained with 14.8 trillion tokens,</li>



<li>A dataset of 1.5 million instruction examples is used to fine-tune the base model,</li>



<li>This instruct model goes through reinforcement learning with several reward models. The final model is <strong>DeepSeek-V3</strong>.</li>
</ul>



<figure data-wp-context="{&quot;imageId&quot;:&quot;69e9aa4c473e2&quot;}" data-wp-interactive="core/image" data-wp-key="69e9aa4c473e2" class="wp-block-image aligncenter size-full wp-lightbox-container"><img loading="lazy" decoding="async" width="1453" height="242" data-wp-class--hide="state.isContentHidden" data-wp-class--show="state.isContentVisible" data-wp-init="callbacks.setButtonStyles" data-wp-on--click="actions.showLightbox" data-wp-on--load="callbacks.setButtonStyles" data-wp-on-window--resize="callbacks.setButtonStyles" src="https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-8.png" alt="A diagram showing the 3 training steps of DeepSeek-V3." class="wp-image-28288" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-8.png 1453w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-8-300x50.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-8-1024x171.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-8-768x128.png 768w" sizes="auto, (max-width: 1453px) 100vw, 1453px" /><button
			class="lightbox-trigger"
			type="button"
			aria-haspopup="dialog"
			aria-label="Enlarge"
			data-wp-init="callbacks.initTriggerButton"
			data-wp-on--click="actions.showLightbox"
			data-wp-style--right="state.imageButtonRight"
			data-wp-style--top="state.imageButtonTop"
		>
			<svg xmlns="http://www.w3.org/2000/svg" width="12" height="12" fill="none" viewBox="0 0 12 12">
				<path fill="#fff" d="M2 0a2 2 0 0 0-2 2v2h1.5V2a.5.5 0 0 1 .5-.5h2V0H2Zm2 10.5H2a.5.5 0 0 1-.5-.5V8H0v2a2 2 0 0 0 2 2h2v-1.5ZM8 12v-1.5h2a.5.5 0 0 0 .5-.5V8H12v2a2 2 0 0 1-2 2H8Zm2-12a2 2 0 0 1 2 2v2h-1.5V2a.5.5 0 0 0-.5-.5H8V0h2Z" />
			</svg>
		</button></figure>



<p>For the reinforcement learning step, DeepSeek uses their own algorithm, <strong>GRPO</strong> (<a href="https://arxiv.org/pdf/2402.03300" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">group relative policy optimization</a>), which relies on several reward models to assess the quality of the content generated by the model. The scores given by the reward models are combined into a final score, which is used to update the model so that it maximizes its global score next time.</p>
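

<p>The &#8220;group relative&#8221; idea can be illustrated with a short sketch: for a given prompt, a group of completions is sampled and scored, and each completion is rewarded relative to its siblings, which removes the need for a separate value network. This is a simplified reading of the GRPO paper, not DeepSeek&#8217;s actual implementation:</p>



<pre class="wp-block-code"><code class=""># Simplified sketch of GRPO's group-relative scoring.
from statistics import mean, stdev

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize the combined reward scores of a group of completions
    sampled for the same prompt. Completions scoring above the group
    average get a positive advantage, with no learned baseline needed."""
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 0.0
    return [(r - mu) / (sigma + eps) for r in rewards]</code></pre>



<p>The policy is then updated to make high-advantage completions more likely, so each training step pushes the model toward the best responses within each group.</p>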



<h3 class="wp-block-heading">DeepSeek-R1 model series training</h3>



<p><strong>DeepSeek-R1</strong> models are built with a different training pipeline, starting from the base model of DeepSeek-V3. The diagram below shows the main steps of the process designed by DeepSeek to create the reasoning models mentioned in their <a href="https://arxiv.org/pdf/2501.12948" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">technical report</a>:</p>



<figure data-wp-context="{&quot;imageId&quot;:&quot;69e9aa4c478a0&quot;}" data-wp-interactive="core/image" data-wp-key="69e9aa4c478a0" class="wp-block-image aligncenter size-full wp-lightbox-container"><img loading="lazy" decoding="async" width="1262" height="1323" data-wp-class--hide="state.isContentHidden" data-wp-class--show="state.isContentVisible" data-wp-init="callbacks.setButtonStyles" data-wp-on--click="actions.showLightbox" data-wp-on--load="callbacks.setButtonStyles" data-wp-on-window--resize="callbacks.setButtonStyles" src="https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-12.png" alt="A diagram showing the training process of DeepSeek-R1, DeepSeek-R1-Zero and DeepSeek-Distill models." class="wp-image-28301" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-12.png 1262w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-12-286x300.png 286w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-12-977x1024.png 977w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-12-768x805.png 768w" sizes="auto, (max-width: 1262px) 100vw, 1262px" /><button
			class="lightbox-trigger"
			type="button"
			aria-haspopup="dialog"
			aria-label="Enlarge"
			data-wp-init="callbacks.initTriggerButton"
			data-wp-on--click="actions.showLightbox"
			data-wp-style--right="state.imageButtonRight"
			data-wp-style--top="state.imageButtonTop"
		>
			<svg xmlns="http://www.w3.org/2000/svg" width="12" height="12" fill="none" viewBox="0 0 12 12">
				<path fill="#fff" d="M2 0a2 2 0 0 0-2 2v2h1.5V2a.5.5 0 0 1 .5-.5h2V0H2Zm2 10.5H2a.5.5 0 0 1-.5-.5V8H0v2a2 2 0 0 0 2 2h2v-1.5ZM8 12v-1.5h2a.5.5 0 0 0 .5-.5V8H12v2a2 2 0 0 1-2 2H8Zm2-12a2 2 0 0 1 2 2v2h-1.5V2a.5.5 0 0 0-.5-.5H8V0h2Z" />
			</svg>
		</button></figure>



<p>Let&#8217;s walk through it step-by-step (no pun intended):</p>



<p>1. The main breakthrough described in DeepSeek&#8217;s paper: they managed to train the DeepSeek-V3-Base 671B model to learn the reasoning capability with reinforcement learning only, which, unlike supervised fine-tuning, doesn&#8217;t require labeled data. They use the same GRPO algorithm as before, with two rewards. The first reward assesses the accuracy of the generated content, using &#8220;rule-based&#8221; experts instead of full reward models, which would themselves need to be trained and require significant resources. For example, to assess whether the model generated correct Python code, one expert could compile the generated code and give a score based on the number of errors, while another could generate test cases and check that the generated code passes them. The second reward concerns the format of the model&#8217;s responses, which must enclose the reasoning content in <code>&lt;think&gt;...&lt;/think&gt;</code> tags. The resulting model is <strong>DeepSeek-R1-Zero</strong>. However, it has limitations that make it unsuitable for direct use, such as language mixing and poor readability.</p>
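

<p>The compile-and-test idea can be sketched in a few lines of Python. This is only an illustration of a rule-based reward, not DeepSeek&#8217;s actual reward function; the expected <code>solve</code> entry point and the scoring scale are our own assumptions, and a real reward would sandbox the execution:</p>



<pre class="wp-block-code"><code class="">def code_reward(source, test_cases):
    """Score a generated Python snippet: 0 if it doesn't compile, partial
    credit for compiling, full credit scaled by the test cases it passes."""
    try:
        compiled = compile(source, "candidate.py", "exec")
    except SyntaxError:
        return 0.0  # the candidate doesn't even compile
    namespace = {}
    exec(compiled, namespace)  # assumption: the snippet defines solve()
    solve = namespace.get("solve")
    if solve is None:
        return 0.1  # compiles, but lacks the expected entry point
    passed = sum(1 for args, expected in test_cases if solve(*args) == expected)
    return 0.1 + 0.9 * passed / len(test_cases)</code></pre>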



<p>2. To overcome these limitations, DeepSeek uses DeepSeek-R1-Zero to create a cold-start reasoning dataset, augmented with other data from sources not explicitly mentioned. DeepSeek-V3-Base is trained with this cold-start data, before applying a new round of reinforcement learning.</p>



<p>3. They use the same RL approach to get a new reasoning model that generates better-quality output. Using this model, they build a reasoning dataset more than 100 times bigger, growing from 5k to 600k samples, with DeepSeek-V3 acting as a quality judge. This dataset is then completed with 200k samples generated with DeepSeek-V3 on non-reasoning tasks.</p>



<p>4. A second stage of supervised fine-tuning is done with the dataset built earlier.</p>



<p>5. The model is then aligned with human preferences in a final round of reinforcement learning with a dedicated human-preference reward. The resulting model is <strong>DeepSeek-R1</strong>.</p>



<p>6. Finally, DeepSeek experimented with fine-tuning much smaller models than DeepSeek-V3 (LLaMa 3.3 70B, Qwen 2.5 32B&#8230;) with the dataset built at step 3. In the paper, they call this process <strong>distillation</strong>. However, it must not be confused with the <em>knowledge distillation</em> technique frequently used in deep learning, where a student model learns from the probability distribution of a teacher model. Here, the term &#8220;distillation&#8221; refers to the fact that the reasoning skill is &#8220;distilled&#8221; into the base model, but it&#8217;s plain old supervised fine-tuning. This is how the <strong>DeepSeek-R1-Distill</strong> model series is trained. The quality of the dataset enables the resulting distilled models to beat much larger models on reasoning tasks, as shown in the benchmark below:</p>



<figure data-wp-context="{&quot;imageId&quot;:&quot;69e9aa4c47dc6&quot;}" data-wp-interactive="core/image" data-wp-key="69e9aa4c47dc6" class="wp-block-image aligncenter size-full is-resized wp-lightbox-container"><img loading="lazy" decoding="async" width="770" height="312" data-wp-class--hide="state.isContentHidden" data-wp-class--show="state.isContentVisible" data-wp-init="callbacks.setButtonStyles" data-wp-on--click="actions.showLightbox" data-wp-on--load="callbacks.setButtonStyles" data-wp-on-window--resize="callbacks.setButtonStyles" src="https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-13.png" alt="A screen capture of benchmark data table." class="wp-image-28310" style="width:750px;height:auto" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-13.png 770w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-13-300x122.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/image-13-768x311.png 768w" sizes="auto, (max-width: 770px) 100vw, 770px" /><button
			class="lightbox-trigger"
			type="button"
			aria-haspopup="dialog"
			aria-label="Enlarge"
			data-wp-init="callbacks.initTriggerButton"
			data-wp-on--click="actions.showLightbox"
			data-wp-style--right="state.imageButtonRight"
			data-wp-style--top="state.imageButtonTop"
		>
			<svg xmlns="http://www.w3.org/2000/svg" width="12" height="12" fill="none" viewBox="0 0 12 12">
				<path fill="#fff" d="M2 0a2 2 0 0 0-2 2v2h1.5V2a.5.5 0 0 1 .5-.5h2V0H2Zm2 10.5H2a.5.5 0 0 1-.5-.5V8H0v2a2 2 0 0 0 2 2h2v-1.5ZM8 12v-1.5h2a.5.5 0 0 0 .5-.5V8H12v2a2 2 0 0 1-2 2H8Zm2-12a2 2 0 0 1 2 2v2h-1.5V2a.5.5 0 0 0-.5-.5H8V0h2Z" />
			</svg>
		</button><figcaption class="wp-element-caption"><em>Benchmark of distilled models on several reasoning tasks (source: DeepSeek R1 technical paper)</em></figcaption></figure>



<h3 class="wp-block-heading">Recap</h3>



<p>The table below summarizes the differences between the models of the DeepSeek-R1 series:</p>



<figure class="wp-block-table"><table><tbody><tr><td>Model</td><td>Description</td></tr><tr><td>DeepSeek-R1-Zero</td><td>Intermediate 671B reasoning model trained from DeepSeek-V3 exclusively with reinforcement learning, and used to bootstrap DeepSeek-R1 training.</td></tr><tr><td>DeepSeek-R1</td><td>671B reasoning model trained from DeepSeek-V3.</td></tr><tr><td>DeepSeek-R1-Distill</td><td>Smaller models fine-tuned for reasoning with a dataset generated by an intermediate version of DeepSeek-R1.</td></tr></tbody></table></figure>



<h2 class="wp-block-heading">Run DeepSeek-R1 on OVHcloud</h2>



<p>Now that we&#8217;ve seen the differences between all DeepSeek models, let&#8217;s try to use them!</p>



<h3 class="wp-block-heading">AI Endpoints</h3>



<p>The fastest way to test DeepSeek-R1 is to use OVHcloud<strong> AI Endpoints</strong>.</p>



<p><strong>DeepSeek-R1-Distill-Llama-70B</strong> is already available, ready to use and optimized for inference speed. Check it out here: <a href="https://endpoints.ai.cloud.ovh.net/models/a011515c-0042-41b2-9a00-ec8b5d34462d" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">https://endpoints.ai.cloud.ovh.net/models/a011515c-0042-41b2-9a00-ec8b5d34462d</a></p>



<p>AI Endpoints makes it easy to integrate AI into your applications with a simple API call, without the need for deep AI expertise or infrastructure management. And while it’s in beta, it’s <strong>free</strong>!</p>



<p>Here is an example cURL command to use DeepSeek-R1 Distill Llama 70B on the OpenAI compatible endpoint provided by OVHcloud AI Endpoints:</p>



<pre class="wp-block-code"><code class="">curl -X 'POST' \
  'https://deepseek-r1-distill-llama-70b.endpoints.kepler.ai.cloud.ovh.net/api/openai_compat/v1/chat/completions' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "max_tokens": 4096,
  "messages": [
    {
      "content": "How can I calculate an approximation of Pi in Python?",
      "role": "user"
    }
  ],
  "model": null,
  "seed": null,
  "stream": false,
  "temperature": 0.7,
  "top_p": 1
}'</code></pre>



<p>In the output, we can see the thinking process followed by the answer, both of which have been truncated here for clarity.</p>



<pre class="wp-block-code"><code class="">{
    "id": "chatcmpl-8c21b2e3fac44d43b63c06fa25e58091",
    "object": "chat.completion",
    "created": 1741199564,
    "model": "DeepSeek-R1-Distill-Llama-70B",
    "choices":
    [
        {
            "index": 0,
            "message":
            {
                "role": "assistant",
                "content": "&lt;think&gt;\nOkay, the user is asking how to approximate Pi using Python. I need to think about different methods they can use. Let's see, there are a few common approaches. \n\nFirst, there's the Monte Carlo method. ... Let me structure the response with each method as a separate section, explaining what it is, how it works, and providing the code. Then, the user can pick which one they prefer based on their situation.\n&lt;/think&gt;\n\nThere are several ways to approximate the value of Pi (π) using Python. Below are a few methods:\n\n### 1. Using the Monte Carlo Method..."
            },
            "finish_reason": "stop",
            "logprobs": null
        }
    ],
    "usage":
    {
        "prompt_tokens": 14,
        "completion_tokens": 1377,
        "total_tokens": 1391
    }
}</code></pre>
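

<p>The same request can be sent from Python with only the standard library. Here is a minimal sketch, reusing the endpoint URL and JSON body from the cURL example above; the commented-out lines show where the actual network call would happen:</p>



<pre class="wp-block-code"><code class="">import json
from urllib import request

ENDPOINT = "https://deepseek-r1-distill-llama-70b.endpoints.kepler.ai.cloud.ovh.net/api/openai_compat/v1/chat/completions"

def build_request(prompt, max_tokens=4096, temperature=0.7):
    """Build the same POST request as the cURL command above."""
    body = {
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
        "model": None,
        "seed": None,
        "stream": False,
        "temperature": temperature,
        "top_p": 1,
    }
    return request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={"accept": "application/json", "Content-Type": "application/json"},
    )

# req = build_request("How can I calculate an approximation of Pi in Python?")
# with request.urlopen(req) as resp:  # performs the actual HTTP call
#     print(json.load(resp)["choices"][0]["message"]["content"])</code></pre>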



<p>Stéphane Philippart, Developer Relation Advocate at OVHcloud, has written a blog post covering everything you need to know to get up to speed with AI Endpoints and run this model: <a href="https://blog.ovhcloud.com/release-of-deepseek-r1-on-ovhcloud-ai-endpoints/" target="_blank" rel="noreferrer noopener" data-wpel-link="internal">Release of DeepSeek-R1 on OVHcloud AI Endpoints</a></p>



<h3 class="wp-block-heading">AI Deploy</h3>



<p>What if you want to run another version of DeepSeek-R1, such as the Qwen 7B distilled version?</p>



<p>You can use another OVHcloud AI product, <strong>AI Deploy</strong>, to create your own serving endpoint, with <a href="https://docs.vllm.ai/en/stable/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">vLLM</a> as the inference engine. vLLM is open-source, fast and well maintained, ensuring maximum compatibility even with the most recent AI models.</p>



<p>Eléa Petton, Solution Architect at OVHcloud, has written a blog post explaining in detail how to serve an open-source model with vLLM on AI Deploy. Just replace the Mistral Small model with the DeepSeek distilled version you want to use (e.g. <strong>deepseek-ai/DeepSeek-R1-Distill-Qwen-7B</strong>) and adapt the number of L40S cards needed (1 is enough for the 7B version): <a href="https://blog.ovhcloud.com/mistral-small-24b-served-with-vllm-and-ai-deploy-one-command-to-deploy-llm/" target="_blank" rel="noreferrer noopener" data-wpel-link="internal">Mistral Small 24B served with vLLM and AI Deploy – a single command to deploy an LLM (Part 1)</a></p>



<h3 class="wp-block-heading">Next up, creating a reasoning chatbot with DeepSeek-R1</h3>



<p>In part 2 of this blog post series, we will use a DeepSeek-R1-Distill model to create a chatbot that will handle reasoning gracefully, by showing the thinking process of the model.</p>



<p>We will develop our chatbot with OVHcloud AI Endpoints and the Python library <a href="https://www.gradio.app/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">Gradio</a>, which makes it possible to quickly create simple chat interfaces.</p>
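

<p>As a preview, the skeleton of such a chatbot is short. The sketch below is not the final part-2 code: <code>call_model</code> is a hypothetical client function, and the Gradio wiring is commented out so that the display helper stands on its own:</p>



<pre class="wp-block-code"><code class="">def format_reply(reasoning, answer):
    """Prepend the model's reasoning to its answer, clearly separated."""
    if not reasoning:
        return answer
    return "Reasoning:\n" + reasoning + "\n\nAnswer:\n" + answer

# import gradio as gr
#
# def chat_fn(message, history):
#     reasoning, answer = call_model(message)  # call_model: hypothetical API client
#     return format_reply(reasoning, answer)
#
# gr.ChatInterface(chat_fn).launch()</code></pre>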



<p>Here is a screenshot of the finalized chatbot we will build:</p>



<figure data-wp-context="{&quot;imageId&quot;:&quot;69e9aa4c4843d&quot;}" data-wp-interactive="core/image" data-wp-key="69e9aa4c4843d" class="wp-block-image aligncenter size-full wp-lightbox-container"><img loading="lazy" decoding="async" width="723" height="1173" data-wp-class--hide="state.isContentHidden" data-wp-class--show="state.isContentVisible" data-wp-init="callbacks.setButtonStyles" data-wp-on--click="actions.showLightbox" data-wp-on--load="callbacks.setButtonStyles" data-wp-on-window--resize="callbacks.setButtonStyles" src="https://blog.ovhcloud.com/wp-content/uploads/2025/03/chatbot.png" alt="A screenshot of a chatbot application developed with DeepSeek-R1 and Gradio in Python." class="wp-image-28328" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/03/chatbot.png 723w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/chatbot-185x300.png 185w, https://blog.ovhcloud.com/wp-content/uploads/2025/03/chatbot-631x1024.png 631w" sizes="auto, (max-width: 723px) 100vw, 723px" /><button
			class="lightbox-trigger"
			type="button"
			aria-haspopup="dialog"
			aria-label="Enlarge"
			data-wp-init="callbacks.initTriggerButton"
			data-wp-on--click="actions.showLightbox"
			data-wp-style--right="state.imageButtonRight"
			data-wp-style--top="state.imageButtonTop"
		>
			<svg xmlns="http://www.w3.org/2000/svg" width="12" height="12" fill="none" viewBox="0 0 12 12">
				<path fill="#fff" d="M2 0a2 2 0 0 0-2 2v2h1.5V2a.5.5 0 0 1 .5-.5h2V0H2Zm2 10.5H2a.5.5 0 0 1-.5-.5V8H0v2a2 2 0 0 0 2 2h2v-1.5ZM8 12v-1.5h2a.5.5 0 0 0 .5-.5V8H12v2a2 2 0 0 1-2 2H8Zm2-12a2 2 0 0 1 2 2v2h-1.5V2a.5.5 0 0 0-.5-.5H8V0h2Z" />
			</svg>
		</button></figure>



<p>Stay tuned for the next article in this DeepSeek-R1 series. In the meantime, try out DeepSeek-R1 on AI Endpoints and AI Deploy and let us know what you &lt;think&gt;!</p>



<h3 class="wp-block-heading">Resources</h3>



<p>If you want to learn more about DeepSeek-R1 and the topics we covered in this blog post, such as test-time compute, GRPO, reinforcement learning and reasoning models, we suggest having a look at these resources:</p>



<ul class="wp-block-list">
<li><a href="https://arxiv.org/pdf/2501.12948" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">DeepSeek-R1 technical report</a>, by the DeepSeek team</li>



<li><a href="https://newsletter.languagemodels.co/p/the-illustrated-deepseek-r1" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">The Illustrated DeepSeek-R1</a>, by Jay Alamar</li>



<li><a href="https://magazine.sebastianraschka.com/p/understanding-reasoning-llms" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">Understanding Reasoning LLMs</a>, by Sebastian Raschka</li>



<li><a href="https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-reasoning-llms" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">A Visual Guide to Reasoning LLMs</a>, by Maarten Grootendorst</li>
</ul>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
