<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Jean-Louis Queguiner, Author at OVHcloud Blog</title>
	<atom:link href="https://blog.ovhcloud.com/author/jean-louis-queguiner/feed/" rel="self" type="application/rss+xml" />
	<link>https://blog.ovhcloud.com/author/jean-louis-queguiner/</link>
	<description>Innovation for Freedom</description>
	<lastBuildDate>Mon, 03 Jul 2023 08:17:01 +0000</lastBuildDate>
	<language>en-GB</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://blog.ovhcloud.com/wp-content/uploads/2019/07/cropped-cropped-nouveau-logo-ovh-rebranding-32x32.gif</url>
	<title>Jean-Louis Queguiner, Author at OVHcloud Blog</title>
	<link>https://blog.ovhcloud.com/author/jean-louis-queguiner/</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>How to virtualize PCI-e GPU in the cloud ?</title>
		<link>https://blog.ovhcloud.com/how-to-virtualize-pci-e-gpu-in-the-cloud/</link>
		
		<dc:creator><![CDATA[Jean-Louis Queguiner]]></dc:creator>
		<pubDate>Fri, 14 Jan 2022 14:00:00 +0000</pubDate>
				<category><![CDATA[OVHcloud Engineering]]></category>
		<guid isPermaLink="false">https://blog.ovh.com/fr/blog/?p=14487</guid>

					<description><![CDATA[A quick introduction to virtualization The purpose of virtualization is to isolate the user software environment from the hardware environment. The orchestration between these virtual environments is handled by the hypervisor. The hypervisor also provides the ability for a Virtual Machine (VM) to execute instructions that are not directly compatible with the underlying architecture/hardware. VT-x and [&#8230;]<img src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fhow-to-virtualize-pci-e-gpu-in-the-cloud%2F&amp;action_name=How%20to%20virtualize%20PCI-e%20GPU%20in%20the%20cloud%20%3F&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></description>
										<content:encoded><![CDATA[
<h3 class="wp-block-heading">A quick introduction to virtualization</h3>



<p>The purpose of virtualization is to <strong><em>isolate</em></strong> the user software environment from the hardware environment. The orchestration between these <strong><em>virtual environments</em></strong> is handled by the <strong><em>hypervisor</em></strong>. The <strong><em>hypervisor</em></strong> also gives a <em><strong>Virtual Machine (VM)</strong></em> the ability to execute <strong>instructions</strong> that are not directly <strong>compatible</strong> with the <strong><em>underlying architecture/hardware</em></strong>. <strong><em>VT-x</em></strong> and <strong><em>AMD-V</em></strong> are the Intel and AMD hardware virtualization technologies.</p>
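<p>As a minimal sketch of how to check whether a Linux host advertises these extensions: the kernel lists CPU flags in /proc/cpuinfo, where vmx corresponds to Intel VT-x and svm to AMD-V. The helper name detect_virtualization is an illustrative assumption, not part of any tool mentioned in this article.</p>

```python
from pathlib import Path

# CPU flags that advertise hardware virtualization support on x86.
VIRT_FLAGS = {"vmx": "Intel VT-x", "svm": "AMD-V"}


def detect_virtualization(cpuinfo_text: str) -> list[str]:
    """Return the hardware virtualization technologies found in cpuinfo text."""
    flags: set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return sorted(name for flag, name in VIRT_FLAGS.items() if flag in flags)


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    # /proc/cpuinfo only exists on Linux; elsewhere we simply report nothing.
    text = cpuinfo.read_text() if cpuinfo.exists() else ""
    print(detect_virtualization(text) or "no VT-x/AMD-V flag found")
```

<p>An empty result usually means the extension is absent or disabled in the firmware, in which case KVM cannot be used and QEMU falls back to pure emulation.</p>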



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img fetchpriority="high" decoding="async" src="https://blog.ovhcloud.com/wp-content/uploads/2019/07/IMG_0739-1024x537.jpeg" alt="How to virtualize PCI-e GPU in the cloud ?" class="wp-image-21746" width="512" height="269" srcset="https://blog.ovhcloud.com/wp-content/uploads/2019/07/IMG_0739-1024x537.jpeg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2019/07/IMG_0739-300x157.jpeg 300w, https://blog.ovhcloud.com/wp-content/uploads/2019/07/IMG_0739-768x403.jpeg 768w, https://blog.ovhcloud.com/wp-content/uploads/2019/07/IMG_0739.jpeg 1200w" sizes="(max-width: 512px) 100vw, 512px" /></figure></div>



<p>Some of you might know <strong>QEMU/NEMU/KVM/libvirt</strong>. Each of these tools has a specific role, as explained below:</p>



<figure class="wp-block-table"><table><thead><tr><th>Tool</th><th>Tool Family</th><th class="has-text-align-left" data-align="left">Purpose</th></tr></thead><tbody><tr><td><strong>KVM</strong><br>(Kernel-based Virtual Machine)</td><td>Virtualization</td><td class="has-text-align-left" data-align="left">&#8211; Linux kernel module that runs <em>VMs</em>. In contrast to <strong><em>QEMU</em></strong>, <strong><em>KVM</em></strong> leverages the virtualization extensions provided by the CPU itself (<strong><em>VT-x</em></strong> or <strong><em>AMD-V</em></strong>) without any emulation</td></tr><tr><td><strong>libvirt</strong></td><td>Virtualization</td><td class="has-text-align-left" data-align="left">Manages/manipulates virtualization (API, CLI and a daemon &#8211; libvirtd)</td></tr><tr><td><strong>QEMU</strong> <br>(Quick EMUlator)</td><td>Emulation</td><td class="has-text-align-left" data-align="left">&#8211; <strong><em>Emulates</em></strong> the processor and peripherals. It basically converts input guest instructions (for instance <strong><em>ARM</em></strong>) into instructions compatible with the actual hardware (for instance <strong><em>x86</em></strong>).<br>&#8211; Supports all virtualizations/emulations.<br>&#8211; It has a reputation for being slow</td></tr><tr><td><strong>NEMU</strong></td><td>Emulation</td><td class="has-text-align-left" data-align="left">&#8211; NEMU is a fork of QEMU that focuses on modern CPUs <strong><em>with advanced virtualization features to increase the speed </em></strong>of the existing QEMU implementation.<br>&#8211; NEMU focuses on KVM support to better leverage the CPU&#8217;s virtualization extensions and therefore reduce the need to translate instructions for the CPU.<br>&#8211; It doesn&#8217;t support all hardware<br>&#8211; It&#8217;s faster than QEMU</td></tr></tbody></table></figure>



<p>The different components can also be represented as a stack.</p>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/07/Sans-titre.png" alt="" class="wp-image-18882" width="296" height="344" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/07/Sans-titre.png 394w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/Sans-titre-258x300.png 258w" sizes="(max-width: 296px) 100vw, 296px" /></figure></div>



<h3 class="wp-block-heading">Now&#8230; Let&#8217;s talk about AI and GPUs</h3>



<h4 class="wp-block-heading">Why AI practitioners love virtualization and containerization</h4>



<p>When running AI workloads, it&#8217;s often recommended to use Docker. Indeed, a big part of the pain in AI workload management is making sure every library is compiled against the right CUDA-accelerated libraries (<a aria-label="undefined (opens in a new tab)" href="https://docs.nvidia.com/cuda/cublas/index.html" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">cuBLAS</a>, <a aria-label="undefined (opens in a new tab)" href="https://developer.nvidia.com/cudnn" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">cuDNN</a>), matched with the right CUDA version and the right driver.</p>



<p>Juggling deep learning frameworks and CUDA versions all but guarantees a massive headache by the end of the day.</p>



<p>This is why virtualization is an awesome tool that allows you to break and rebuild your environment. Adding a container technology on top will further enhance reproducibility and flexibility while you play with different frameworks.</p>



<p>Nvidia identified this issue a while ago, which is why they decided to push their <a aria-label="undefined (opens in a new tab)" href="http://NGC.nvidia.com" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">Nvidia GPU Cloud (NGC) platform</a>.</p>



<h4 class="wp-block-heading">Handle GPUs in a virtualized environment</h4>



<p>When we presented the virtualization layers earlier, we didn&#8217;t mention GPUs, as they are not available on every hardware platform. There are multiple ways to handle GPUs in a virtualized environment:</p>



<ul class="wp-block-list"><li><strong>emulation</strong> through <a aria-label="undefined (opens in a new tab)" href="https://developer.nvidia.com/nvemulate" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">nvemulate</a> (Nvidia) or <a aria-label="undefined (opens in a new tab)" href="https://github.com/jrprice/Oclgrind" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">OpenCL device emulator</a> (AMD), where you emulate instructions (CUDA or OpenCL).</li><li><strong>virtualization</strong> of the GPU (<strong>vGPU</strong>) or of a slice of it, where you give access to a slice of the GPU&#8217;s instructions through an intermediate interpreter. <strong>In this case the GPU is accessed through a virtualized hardware address and not directly.</strong> Several commercial solutions exist (Citrix, VMware, <a href="https://www.nvidia.com/en-us/data-center/virtual-pc-apps/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">NVIDIA GRID</a> for VDI, <a href="https://www.nvidia.com/en-us/design-visualization/quadro-vdws/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">NVIDIA Quadro Virtual Data Center Workstation</a>, <a href="https://www.nvidia.com/en-us/data-center/virtual-compute-server/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">NVIDIA Virtual Compute Server (vCS)</a>, &#8230;)</li><li><strong>passthrough-ization</strong> (I know&#8230; this word doesn&#8217;t exist&#8230;), where you have direct access to the physical hardware. In this case we will use <strong><em>VFIO</em></strong>, <strong>which assigns the physical address of the GPU to the guest VM.</strong> This mode provides <a aria-label="undefined (opens in a new tab)" href="http://dsc.soic.indiana.edu/publications/Cloud_2014%20(1).pdf" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">slightly better performance (between 1 and 2 percent)</a>.</li></ul>
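<p>To make the passthrough option more concrete, here is a hedged sketch that builds the hostdev element libvirt uses to hand a host PCI device to a guest. The PCI address 0000:3b:00.0 and the helper name hostdev_xml are illustrative assumptions; the real address of a GPU comes from the host (for instance from lspci).</p>

```python
import xml.etree.ElementTree as ET


def hostdev_xml(pci_address: str) -> str:
    """Build a libvirt hostdev element for a PCI address like '0000:3b:00.0'."""
    domain, bus, slot_function = pci_address.split(":")
    slot, function = slot_function.split(".")
    # managed="yes" asks libvirt to detach the device from its host driver
    # before the guest starts and to reattach it afterwards.
    hostdev = ET.Element("hostdev", mode="subsystem", type="pci", managed="yes")
    source = ET.SubElement(hostdev, "source")
    ET.SubElement(
        source,
        "address",
        domain=f"0x{domain}",
        bus=f"0x{bus}",
        slot=f"0x{slot}",
        function=f"0x{function}",
    )
    return ET.tostring(hostdev, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical address, used only for illustration.
    print(hostdev_xml("0000:3b:00.0"))
```

<p>The resulting element would sit in the devices section of a domain definition. This sketch only generates the XML and does not talk to libvirt itself.</p>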



<p>Here are two representations of a virtualized and dockerized AI environment.</p>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/07/photo_2020-07-20-17.40.33-1-724x1024.jpeg" alt="" class="wp-image-18884" width="543" height="768" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-17.40.33-1-724x1024.jpeg 724w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-17.40.33-1-212x300.jpeg 212w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-17.40.33-1-768x1086.jpeg 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-17.40.33-1.jpeg 905w" sizes="(max-width: 543px) 100vw, 543px" /></figure></div>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/07/photo_2020-07-20-16.09.34-1024x724.jpeg" alt="" class="wp-image-18866" width="768" height="543" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.09.34-1024x724.jpeg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.09.34-300x212.jpeg 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.09.34-768x543.jpeg 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.09.34.jpeg 1280w" sizes="auto, (max-width: 768px) 100vw, 768px" /></figure></div>



<h3 class="wp-block-heading">Now let&#8217;s talk about the CPU/GPU virtualization trap</h3>



<h4 class="wp-block-heading">Why PCI Express pass-through, and what are its limitations?</h4>



<p>It was important to define how virtualization works in order to understand its limitations.<strong> Virtualization at the KVM level allocates CPU time to each VM in CPU cycles.</strong> At each <strong>CPU cycle the kernel may move the Virtual Machine from its current CPU to another CPU</strong>, and this is when things can get dirty.</p>



<p>The concept of <strong><em>PCI Express pass-through</em></strong> is that <strong><em>KVM</em></strong> will not emulate the <strong><em>GPU address</em></strong> but will <strong><em>pass instructions directly to the Graphical Processing Unit</em></strong>, at the physical address of the device. The hypervisor therefore gives the virtual machine plain access to the <em><strong>GPU through the PCI-e bus</strong></em>. As a result, the <strong><em>hypervisor</em></strong> does not control how the GPU is used, meaning that you are, as a customer, accessing the raw hardware at full performance.</p>



<h4 class="wp-block-heading">CPU is a sports car, GPU is a massive truck</h4>



<p>Because of the <strong><em>CPU&#8217;s limited throughput</em></strong> (a CPU has roughly 10x lower throughput than a GPU), performing an operation on the CPU will sometimes be slower than transferring it to the GPU and computing it there.</p>



<p>Think about it: why would you move your suitcase from the sports car to the truck if it&#8217;s the only thing you have to carry from Berlin to Paris? So it’s a constant trade-off between executing an operation on the CPU or on the GPU. Thankfully, once again, those crazy AI framework core developers made those trade-offs for us.</p>
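<p>The trade-off can be sketched with a toy cost model: running on the GPU costs a PCIe transfer plus the compute time, while running on the CPU costs the compute time only. All three throughput constants are illustrative assumptions (only the 10x ratio comes from the text above), and the model ignores the cost of copying results back.</p>

```python
# Assumed, illustrative figures (not measurements from this article).
CPU_OPS_PER_S = 1e11     # hypothetical CPU throughput
GPU_OPS_PER_S = 1e12     # 10x the CPU, following the ratio quoted above
PCIE_BYTES_PER_S = 16e9  # hypothetical PCIe bandwidth


def cpu_seconds(ops: float) -> float:
    """Time to run the operation where the data already lives: on the CPU."""
    return ops / CPU_OPS_PER_S


def gpu_seconds(ops: float, payload_bytes: float) -> float:
    """Time to ship the payload over PCIe, then compute on the GPU."""
    return payload_bytes / PCIE_BYTES_PER_S + ops / GPU_OPS_PER_S


def best_device(ops: float, payload_bytes: float) -> str:
    """Pick the device with the lower estimated time (ties go to the CPU)."""
    return "gpu" if cpu_seconds(ops) > gpu_seconds(ops, payload_bytes) else "cpu"


if __name__ == "__main__":
    print(best_device(ops=1e6, payload_bytes=1e8))   # small job: the suitcase
    print(best_device(ops=1e12, payload_bytes=1e8))  # heavy job: the truck load
```

<p>With these assumed numbers the small job stays on the CPU because the transfer dominates, while the heavy job is worth moving to the GPU.</p>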



<h4 class="wp-block-heading">Does CPU performance matter even when you are using a GPU?</h4>



<p>Yes! As explained before (in the blog post <a href="https://www.ovh.com/blog/how-pci-express-works-and-why-you-should-care-gpu/" target="_blank" rel="noreferrer noopener" data-wpel-link="exclude">How PCI express works and why you should care</a>), <em><strong>CPUs are major bottlenecks when pushing data from CPU to GPU</strong> over <strong>PCIe</strong>.</em></p>



<h3 class="wp-block-heading">Practical Case</h3>



<p>Let&#8217;s imagine a host with 2 CPU sockets, each linked to its own PCIe slots. The schema below shows a simplified hardware architecture for GPUs attached through PCIe.</p>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/07/photo_2020-07-20-16.09.43-1-1024x724.jpeg" alt="" class="wp-image-18902" width="768" height="543" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.09.43-1-1024x724.jpeg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.09.43-1-300x212.jpeg 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.09.43-1-768x543.jpeg 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.09.43-1.jpeg 1280w" sizes="auto, (max-width: 768px) 100vw, 768px" /></figure></div>



<p>Each <em><strong>CPU socket</strong></em> is attached to its PCIe slots through <strong><em>PCIe lanes</em></strong> (if you don&#8217;t know what PCIe lanes are, read this: <a href="https://www.ovh.com/blog/how-pci-express-works-and-why-you-should-care-gpu/" target="_blank" rel="noreferrer noopener" data-wpel-link="exclude">https://www.ovh.com/blog/how-pci-express-works-and-why-you-should-care-gpu/</a>). Looking back at the schema, <em><strong>accessing PCIe #1 using CPU#2 is sequenced as follows</strong></em>:</p>



<ul class="wp-block-list"><li>The <strong><em>VM</em></strong> requests <em><strong>GPU#1</strong></em> from the <em><strong>CPU</strong></em> it is attached to for the current clock cycle, which is <strong><em>CPU#2</em></strong> in our scenario.</li><li><strong><em>CPU#2</em></strong> then requests access to <strong><em>GPU#1</em></strong> by calling <em><strong>CPU#1</strong></em>, as <em><strong>GPU#1</strong></em> is physically attached to <em><strong>CPU#1</strong></em> (I mean the <strong><em>socket of CPU#1</em></strong>) through the <em><strong>PCIe link</strong></em>.</li><li>This back and forth overhead is estimated at around 3%* according to our benchmarks.</li></ul>



<p>* The estimated 3% was measured prior to the <a aria-label="undefined (opens in a new tab)" href="https://www.intel.fr/content/www/fr/fr/architecture-and-technology/l1tf.html" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">L1TF Intel vulnerability</a>, which has since been patched with a countermeasure that invalidates/flushes the <em><strong>L1 CPU cache</strong></em> at the kernel level. The overhead might therefore be a little lower now, since a <em><strong>VM</strong></em> has to rebuild the L1 cache even if it stays on the same CPU.</p>
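<p>On a Linux host, the socket a GPU is wired to can be looked up in sysfs, which exposes a numa_node file for each PCI device. The sketch below assumes one NUMA node per CPU socket; the function names and the sysfs_root parameter are illustrative choices.</p>

```python
from pathlib import Path


def gpu_numa_node(pci_address: str, sysfs_root: Path = Path("/sys")) -> int:
    """NUMA node a PCI device is attached to (-1 when the kernel reports none)."""
    node_file = sysfs_root / "bus" / "pci" / "devices" / pci_address / "numa_node"
    return int(node_file.read_text().strip())


def is_remote_access(vm_numa_node: int, gpu_node: int) -> bool:
    """True when the VM runs on a different socket than the one wired to the GPU."""
    return gpu_node != -1 and vm_numa_node != gpu_node


if __name__ == "__main__":
    # The scenario above: the VM runs on CPU#2 (node 1 here) while GPU#1
    # is attached to CPU#1 (node 0 here), so every access crosses sockets.
    print(is_remote_access(vm_numa_node=1, gpu_node=0))
```

<p>A remote result is the situation in which the back and forth described above occurs.</p>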



<h3 class="wp-block-heading">Example of a virtualization full life cycle with multiple GPUs and VMs</h3>



<p>In the following section we will look at <em><strong>4 sequences of clock cycles </strong></em>and detail the <strong><em>communication flow between the GPU(s), the VMs they are attached to, and the CPU and RAM managed by the hypervisor to make these VMs run</em></strong>.</p>



<h4 class="wp-block-heading">The initial setup</h4>



<h5 class="wp-block-heading">In the initial setup we will consider <strong>3 VMs running on the same physical host</strong>. The physical host is composed of:</h5>



<ul class="wp-block-list"><li>2 Physical CPUs (2 CPU sockets)</li><li>2 RAM components</li><li>4 GPUs (2 attached to each CPU socket)</li></ul>



<h5 class="wp-block-heading">The scenario is sequenced with <em><strong>t<sub>0</sub></strong></em> as the initial clock cycle. Each cycle is considered to last <em><strong>ẟt</strong></em> (delta t). Therefore:</h5>



<ul class="wp-block-list"><li><strong><em>t<sub>0</sub></em></strong> is the initial setup</li><li><strong>t<sub>0</sub> + <em><strong>ẟt</strong></em></strong> is the system state after 1 CPU cycle</li><li><strong>t<sub>0</sub> + <em><strong>2ẟt</strong></em></strong> is the system state after 2 CPU cycles</li><li><strong>t<sub>0</sub> + <em><strong>3ẟt</strong></em></strong> is the system state after 3 CPU cycles</li></ul>



<h5 class="wp-block-heading">How to read this sequence? It&#8217;s pretty simple; let&#8217;s take the example of VM3:</h5>



<ul class="wp-block-list"><li><strong><em>VM3</em></strong> is assigned to <strong><em>CPU#1 </em></strong>at the<em><strong> initial setup (yellow), </strong></em></li><li>then <strong><em>VM3</em></strong> is moved to <strong><em>CPU#2</em></strong> for the <strong><em>2nd cycle (green), </em></strong></li><li>then <strong><em>VM3</em></strong> stays on <strong><em>CPU#2 </em></strong>for the <strong><em>3rd cycle (red),</em></strong></li><li>finally <strong><em>VM3</em></strong> is moved back to <strong><em>CPU#1</em></strong> for the <strong><em>4th and final cycle (purple)</em></strong>.</li></ul>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/07/photo_2020-07-20-16.09.40-1024x724.jpeg" alt="" class="wp-image-18857" width="768" height="543" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.09.40-1024x724.jpeg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.09.40-300x212.jpeg 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.09.40-768x543.jpeg 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.09.40.jpeg 1280w" sizes="auto, (max-width: 768px) 100vw, 768px" /></figure></div>



<h4 class="wp-block-heading">Let&#8217;s look at the full sequence with the associated communication flows</h4>



<h5 class="wp-block-heading">First cycle communication</h5>



<p>At the first cycle, <strong><em>VM#1</em></strong> will have access to <strong><em>GPU#4 </em></strong>and <strong><em>RAM#2</em></strong>.</p>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/07/photo_2020-07-20-16.09.44-1024x724.jpeg" alt="" class="wp-image-18861" width="768" height="543" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.09.44-1024x724.jpeg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.09.44-300x212.jpeg 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.09.44-768x543.jpeg 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.09.44.jpeg 1280w" sizes="auto, (max-width: 768px) 100vw, 768px" /></figure></div>



<h5 class="wp-block-heading">Second cycle communication</h5>



<p>At the second cycle, <strong><em>VM#1</em></strong> will have access to <strong><em>GPU#4 </em></strong>and <strong><em>RAM#2</em></strong> through <strong><em>CPU#1</em></strong> and will therefore incur an overhead.</p>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/07/photo_2020-07-20-16.09.45-1024x724.jpeg" alt="" class="wp-image-18862" width="768" height="543" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.09.45-1024x724.jpeg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.09.45-300x212.jpeg 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.09.45-768x543.jpeg 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.09.45.jpeg 1280w" sizes="auto, (max-width: 768px) 100vw, 768px" /></figure></div>



<h5 class="wp-block-heading">Third cycle communication : the speculative execution special case.</h5>



<p>During the third cycle, <strong><em>VM#1</em></strong> will still have <strong><em>RAM#2</em></strong> and <strong><em>GPU#4</em> </strong>attributed, and this access will again be requested through <strong><em>CPU#1</em></strong>.</p>



<h5 class="wp-block-heading">Speculative execution explained</h5>



<p>During every cycle something special may happen depending on your execution plan (although no longer on Intel CPUs since the L1TF vulnerability): it&#8217;s called <strong><em>speculative execution</em></strong>.</p>



<p>Basically, the <em><strong>CPU</strong></em> will <strong><em>anticipate</em></strong> the<strong><em> execution plan (called the execution pipeline) </em></strong>and store potential future results in its <strong><em>L1 cache</em></strong> (the cache of the CPU itself).</p>



<p>This anticipation of computation <strong><em>(compute ahead)</em></strong> is performed while the CPU computing power is at rest, meaning <strong><em>while the resulting data of the performed execution is being transmitted to the next computational stage.</em></strong> This means that the <strong><em>computational branch</em></strong> <strong><em>calculated ahead might not be relevant</em></strong> if, given the results of the previous stage, the algorithm never needs to evaluate it (basically if the condition of the branch &#8211; like an <strong>if</strong> in your code &#8211; is not met).</p>



<p><span style="text-decoration: underline"><strong>This speculative execution will be relevant if and only if the 3 following conditions are met:</strong></span></p>



<ul class="wp-block-list"><li>The<strong><em> speculative execution</em></strong> mode is activated (which is no longer the case since the L1TF Intel vulnerability; see <a href="https://en.wikipedia.org/wiki/Spectre_(security_vulnerability)" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">Spectre</a>).</li><li>The <strong><em>CPU instruction of the following cycle </em></strong>needs to be performed (scheduled) on the <strong><em>same CPU as the previous stage</em></strong>.</li><li>The <strong><em>condition of the branch needs to be met</em></strong>.</li></ul>






<p>Therefore, if <em><strong>speculative execution</strong></em> had been enabled, this <strong><em>3rd cycle</em></strong> could have been <em><strong>optimized</strong></em>, because <strong><em>VM#1</em></strong> is running on the <strong><em>same CPU (CPU#1)</em></strong> as in the <strong><em>previous CPU cycle (2nd cycle)</em></strong>.</p>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/07/photo_2020-07-20-16.09.47-1024x724.jpeg" alt="" class="wp-image-18863" width="768" height="543" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.09.47-1024x724.jpeg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.09.47-300x212.jpeg 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.09.47-768x543.jpeg 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.09.47.jpeg 1280w" sizes="auto, (max-width: 768px) 100vw, 768px" /></figure></div>



<h5 class="wp-block-heading">Fourth cycle</h5>



<p>In this cycle <strong><em>VM#1 </em></strong>is scheduled back to<em><strong> CPU#2</strong></em> and <strong><em>won&#8217;t have an overhead</em></strong> as attached hardware is directly linked to the related <strong><em>CPU socket</em></strong>.</p>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/07/photo_2020-07-20-16.09.49-1024x724.jpeg" alt="" class="wp-image-18864" width="768" height="543" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.09.49-1024x724.jpeg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.09.49-300x212.jpeg 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.09.49-768x543.jpeg 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.09.49.jpeg 1280w" sizes="auto, (max-width: 768px) 100vw, 768px" /></figure></div>
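<p>The four cycles can be replayed with a small simulation. It assumes the placement of VM#1 described above (CPU#2, CPU#1, CPU#1, CPU#2), with its GPU and RAM attached to socket 2, and it applies the roughly 3% benchmark figure uniformly to every cross-socket cycle, which is a simplification.</p>

```python
REMOTE_OVERHEAD = 0.03  # ~3% per cross-socket cycle, the benchmark figure quoted earlier


def remote_cycles(schedule: list[int], device_socket: int) -> list[int]:
    """1-based indexes of the cycles where the VM runs away from its devices."""
    return [
        cycle
        for cycle, socket in enumerate(schedule, start=1)
        if socket != device_socket
    ]


def relative_cost(schedule: list[int], device_socket: int) -> float:
    """Total cost of the schedule, in units of one local (overhead-free) cycle."""
    penalty = REMOTE_OVERHEAD * len(remote_cycles(schedule, device_socket))
    return len(schedule) + penalty


if __name__ == "__main__":
    vm1_schedule = [2, 1, 1, 2]  # socket hosting VM#1 at each of the 4 cycles
    print(remote_cycles(vm1_schedule, device_socket=2))
    print(relative_cost(vm1_schedule, device_socket=2))
    # A VM that never leaves socket 2 pays no penalty at all.
    print(relative_cost([2, 2, 2, 2], device_socket=2))
```

<p>Only the second and third cycles carry the cross-socket penalty, whereas a schedule that never leaves socket 2 carries none.</p>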



<h3 class="wp-block-heading">CPU pinning</h3>



<p>To avoid the back and forth described above, a concept called <strong><em>CPU pinning</em></strong> was introduced.</p>



<h4 class="wp-block-heading">The OpenStack case</h4>



<p><strong><em>OpenStack</em></strong> is able to handle the <em><strong>NUMA node</strong></em> placement for you: it can instruct <strong><em>libvirt</em></strong> to statically <strong><em>pin each vCPU to a physical CPU</em></strong> so that the vCPUs will no longer &#8220;move around&#8221; as described above.</p>
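<p>At the libvirt level, static pinning is expressed with a cputune element containing one vcpupin entry per vCPU. The sketch below generates such an element; the vCPU to physical CPU mapping is a hypothetical example, and in OpenStack you would normally not write this XML yourself but request it through the flavor (the hw:cpu_policy=dedicated extra spec).</p>

```python
import xml.etree.ElementTree as ET


def cputune_xml(pinning: dict[int, int]) -> str:
    """Build a libvirt cputune element pinning each vCPU to one physical CPU."""
    cputune = ET.Element("cputune")
    for vcpu, physical_cpu in sorted(pinning.items()):
        ET.SubElement(cputune, "vcpupin", vcpu=str(vcpu), cpuset=str(physical_cpu))
    return ET.tostring(cputune, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical mapping: both vCPUs land on cores of the socket that
    # is wired to the GPU, so the VM never has to cross sockets.
    print(cputune_xml({0: 4, 1: 5}))
```

<p>Choosing physical CPUs that belong to the same socket as the GPU is what removes the cross-socket overhead.</p>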



<p>You can find more information on this topic <a href="https://docs.openstack.org/nova/pike/admin/cpu-topologies.html" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">here</a>. This process is implemented on <strong><em><a href="https://www.ovhcloud.com/en/public-cloud/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">OVHcloud public cloud offers based on OpenStack</a></em></strong>.</p>



<h4 class="wp-block-heading">The VMware case</h4>



<p>Here is an <a href="https://docs.vmware.com/en/VMware-Integrated-OpenStack/7.0/com.vmware.openstack.admin.doc/GUID-81C482AF-C824-431E-B5B2-3E8E0403EC86.html" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">example of a <strong><em>VMware CPU pinning</em></strong> configuration</a>. This feature is available on <a href="https://www.ovhcloud.com/en/enterprise/products/hosted-private-cloud/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer"><strong><em><strong><em>OVHcloud</em></strong> Private Cloud </em></strong>offers</a> as well as on <em><strong><a href="https://www.ovhcloud.com/fr/managed-bare-metal/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer"><strong><em>OVHcloud</em></strong> new managed VMware on Bare Metal offers</a></strong></em>.</p>



<h3 class="wp-block-heading">Splitting a GPU thanks to GPU virtualization : understanding the performance impacts</h3>



<p>One might want to <strong><em>split a GPU across multiple VMs using GPU virtualization</em></strong>. It&#8217;s especially interesting in terms of cost when you have a GPU with a lot of memory, like <strong><em>Nvidia V100s </em></strong>that have <strong><em>32GB</em></strong> of <strong><em>VRAM</em></strong> vs the <strong>V100</strong> that only has <strong><em>16GB</em></strong> (check <a href="https://www.ovh.com/blog/understanding-the-anatomy-of-gpus-using-pokemon/" target="_blank" rel="noreferrer noopener" data-wpel-link="exclude">this blog post</a> for further details).</p>



<div class="wp-block-image"><figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="749" height="505" src="https://www.ovh.com/blog/wp-content/uploads/2021/03/image.png" alt="" class="wp-image-20688" srcset="https://blog.ovhcloud.com/wp-content/uploads/2021/03/image.png 749w, https://blog.ovhcloud.com/wp-content/uploads/2021/03/image-300x202.png 300w" sizes="auto, (max-width: 749px) 100vw, 749px" /><figcaption>Translation: illustration of a GPU split in 2 that was supposed to have 32GB and is assigned only 16GB of VRAM.</figcaption></figure></div>



<p><strong><span style="text-decoration: underline">We decided not to go this way for the following reasons:</span></strong></p>



<ul class="wp-block-list"><li>The first reason is that, as described in the figure below, splitting <strong><em>GPU#4</em></strong>, for instance, means that <strong><em>CPU#2</em></strong> might be overloaded.</li><li>The second reason is that the <strong><em>computing power will be shared</em></strong> (meaning <strong><em>CUDA cores</em></strong> when it comes to <strong><em>Nvidia</em></strong>), so that <strong><em>each VM potentially cannot use the GPU at its full advertised capacity</em></strong>.</li></ul>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/07/photo_2020-07-20-16.17.30-1024x724.jpeg" alt="" class="wp-image-18865" width="768" height="543" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.17.30-1024x724.jpeg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.17.30-300x212.jpeg 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.17.30-768x543.jpeg 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-20-16.17.30.jpeg 1280w" sizes="auto, (max-width: 768px) 100vw, 768px" /></figure></div>



<h3 class="wp-block-heading">In conclusion</h3>



<p>In this article we presented <em><strong>how virtualization works</strong></em>, <strong><em>how the orchestration of computing cycles </em></strong><em><strong>matters</strong></em> and why we decided to go with a <strong><em>PCI Express passthrough mode </em></strong>for our <strong><em><a href="https://www.ovhcloud.com/en/public-cloud/gpu" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">VM GPU offers</a></em></strong>.</p>



<p>You can also check our blog post explaining <a href="https://www.ovh.com/blog/managing-gpu-pools-efficiently-in-ai-pipelines/" target="_blank" rel="noreferrer noopener" data-wpel-link="exclude">how to manage GPU pools efficiently in the cloud</a> using our <a href="https://www.ovhcloud.com/en/public-cloud/ai-training/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">new AI training offer</a>.</p>



<figure class="wp-block-embed alignleft is-type-wp-embed is-provider-ovhcloud-blog wp-block-embed-ovhcloud-blog"><div class="wp-block-embed__wrapper">
<blockquote class="wp-embedded-content" data-secret="dPAN7MJXC3"><a href="https://www.ovh.com/blog/managing-gpu-pools-efficiently-in-ai-pipelines/" data-wpel-link="exclude">Managing GPU pools efficiently in AI pipelines</a></blockquote><iframe loading="lazy" class="wp-embedded-content" sandbox="allow-scripts" security="restricted"  title="&#8220;Managing GPU pools efficiently in AI pipelines&#8221; &#8212; OVHcloud Blog" src="https://www.ovh.com/blog/managing-gpu-pools-efficiently-in-ai-pipelines/embed/#?secret=dPAN7MJXC3" data-secret="dPAN7MJXC3" width="600" height="338" frameborder="0" marginwidth="0" marginheight="0" scrolling="no"></iframe>
</div></figure>



<h2 class="wp-block-heading">Going Further</h2>



<p>If you want to go further, I suggest reading these two excellent articles.</p>



<ul class="wp-block-list"><li><a href="https://www.thegeekyway.com/kvm-vs-qemu-vs-libvirt/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">https://www.thegeekyway.com/kvm-vs-qemu-vs-libvirt/</a></li><li><a href="http://dsc.soic.indiana.edu/publications/Cloud_2014%20(1).pdf" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">http://dsc.soic.indiana.edu/publications/Cloud_2014%20(1).pdf</a></li></ul>



<h2 class="wp-block-heading">Read my other related blogs</h2>






<figure class="wp-block-embed alignleft is-type-wp-embed is-provider-ovhcloud-blog wp-block-embed-ovhcloud-blog"><div class="wp-block-embed__wrapper">
<blockquote class="wp-embedded-content" data-secret="VZthqZwFCk"><a href="https://www.ovh.com/blog/how-pci-express-works-and-why-you-should-care-gpu/" data-wpel-link="exclude">How PCI-Express works and why you should care? #GPU</a></blockquote><iframe loading="lazy" class="wp-embedded-content" sandbox="allow-scripts" security="restricted"  title="&#8220;How PCI-Express works and why you should care? #GPU&#8221; &#8212; OVHcloud Blog" src="https://www.ovh.com/blog/how-pci-express-works-and-why-you-should-care-gpu/embed/#?secret=VZthqZwFCk" data-secret="VZthqZwFCk" width="600" height="338" frameborder="0" marginwidth="0" marginheight="0" scrolling="no"></iframe>
</div></figure>



<figure class="wp-block-embed alignleft is-type-wp-embed is-provider-ovhcloud-blog wp-block-embed-ovhcloud-blog"><div class="wp-block-embed__wrapper">
<blockquote class="wp-embedded-content" data-secret="VwyUy18M40"><a href="https://www.ovh.com/blog/understanding-the-anatomy-of-gpus-using-pokemon/" data-wpel-link="exclude">Understanding the anatomy of GPUs using Pokémon</a></blockquote><iframe loading="lazy" class="wp-embedded-content" sandbox="allow-scripts" security="restricted"  title="&#8220;Understanding the anatomy of GPUs using Pokémon&#8221; &#8212; OVHcloud Blog" src="https://www.ovh.com/blog/understanding-the-anatomy-of-gpus-using-pokemon/embed/#?secret=VwyUy18M40" data-secret="VwyUy18M40" width="600" height="338" frameborder="0" marginwidth="0" marginheight="0" scrolling="no"></iframe>
</div></figure>



<figure class="wp-block-embed alignleft is-type-wp-embed is-provider-ovhcloud-blog wp-block-embed-ovhcloud-blog"><div class="wp-block-embed__wrapper">
<blockquote class="wp-embedded-content" data-secret="qHIfp8KTaF"><a href="https://www.ovh.com/blog/deep-learning-explained-to-my-8-year-old-daughter/" data-wpel-link="exclude">Deep Learning explained to my 8-year-old daughter</a></blockquote><iframe loading="lazy" class="wp-embedded-content" sandbox="allow-scripts" security="restricted"  title="&#8220;Deep Learning explained to my 8-year-old daughter&#8221; &#8212; OVHcloud Blog" src="https://www.ovh.com/blog/deep-learning-explained-to-my-8-year-old-daughter/embed/#?secret=qHIfp8KTaF" data-secret="qHIfp8KTaF" width="600" height="338" frameborder="0" marginwidth="0" marginheight="0" scrolling="no"></iframe>
</div></figure>



<figure class="wp-block-embed alignleft is-type-wp-embed is-provider-ovhcloud-blog wp-block-embed-ovhcloud-blog"><div class="wp-block-embed__wrapper">
<blockquote class="wp-embedded-content" data-secret="zF1GmGaJTT"><a href="https://www.ovh.com/blog/distributed-training-in-a-deep-learning-context/" data-wpel-link="exclude">Distributed Training in a Deep Learning Context</a></blockquote><iframe loading="lazy" class="wp-embedded-content" sandbox="allow-scripts" security="restricted"  title="&#8220;Distributed Training in a Deep Learning Context&#8221; &#8212; OVHcloud Blog" src="https://www.ovh.com/blog/distributed-training-in-a-deep-learning-context/embed/#?secret=zF1GmGaJTT" data-secret="zF1GmGaJTT" width="600" height="338" frameborder="0" marginwidth="0" marginheight="0" scrolling="no"></iframe>
</div></figure>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>2021: major technological advances to accelerate, democratize and certify AI uses</title>
		<link>https://blog.ovhcloud.com/2021-major-technological-advances-to-accelerate-democratize-and-certify-ai-uses/</link>
		
		<dc:creator><![CDATA[Jean-Louis Queguiner]]></dc:creator>
		<pubDate>Tue, 26 Jan 2021 14:09:53 +0000</pubDate>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[Machine learning]]></category>
		<category><![CDATA[OVHcloud Thinking]]></category>
		<guid isPermaLink="false">https://www.ovh.com/blog/?p=20547</guid>

					<description><![CDATA[In 2020, OVHcloud launched a portfolio of PaaS solutions dedicated to Data and AI. As an innovative cloud player for the past 20 years, our up-to-date technology coupled with our close collaboration with data science communities as well as the most cutting-edge players in the sector has enabled us to identify 10 strong AI trends [&#8230;]]]></description>
										<content:encoded><![CDATA[
<p>In 2020, OVHcloud launched a portfolio of PaaS solutions dedicated to <a href="https://www.ovhcloud.com/en-ie/public-cloud/data-analytics/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">Data</a> and <a href="https://www.ovhcloud.com/en-ie/public-cloud/ai-machine-learning/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">AI</a>. We have been an innovative cloud player for the past 20 years; our up-to-date technology, coupled with our close collaboration with data science communities and the most cutting-edge players in the sector, has enabled us to identify 10 strong AI trends for 2021.</p>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2021/01/IMG_0440-1024x538.png" alt="2021: major technological advances to accelerate, democratize and certify AI uses" class="wp-image-20574" width="512" height="269" srcset="https://blog.ovhcloud.com/wp-content/uploads/2021/01/IMG_0440-1024x538.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2021/01/IMG_0440-300x158.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2021/01/IMG_0440-768x404.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2021/01/IMG_0440.png 1200w" sizes="auto, (max-width: 512px) 100vw, 512px" /></figure></div>



<h3 class="wp-block-heading">1. Moving towards a limited number of standard AI libraries</h3>



<p>With double-digit growth, the French-American startup <a href="https://huggingface.co/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">HuggingFace</a> is paving the way towards the convergence of different Artificial Intelligence techniques within a single library.</p>



<div class="wp-block-image"><figure class="alignright size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2021/01/IMG_0444.png" alt="HuggingFace Transformers" class="wp-image-20586" width="492" height="119" srcset="https://blog.ovhcloud.com/wp-content/uploads/2021/01/IMG_0444.png 656w, https://blog.ovhcloud.com/wp-content/uploads/2021/01/IMG_0444-300x73.png 300w" sizes="auto, (max-width: 492px) 100vw, 492px" /></figure></div>



<p>Neural network techniques have evolved considerably in recent years. Just as machine learning libraries such as <a href="https://scikit-learn.org/stable/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Scikit-Learn</a> laid the foundations of standardization, the rise of HuggingFace&#8217;s <a href="https://huggingface.co/transformers/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Transformers</a> library seems to usher in a new technical era: the convergence of tools that simplify and democratize the use of AI.</p>



<h3 class="wp-block-heading">2. Advances in NLP spread to other areas of AI application</h3>



<p>Natural Language Processing (NLP) grew exponentially in 2019 and 2020 due to the emergence of new technologies, including Transformers. This approach was particularly noteworthy because it brought high-performance and highly agnostic NLP models. It mainly addresses the difficulties inherent to the temporal and spatial aspects of language: it makes it easier to establish the link between the beginning and the end of a sentence and to identify key elements.</p>
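
<p>The idea of linking the beginning and the end of a sentence can be illustrated with scaled dot-product self-attention, the mechanism at the heart of Transformer models. The sketch below is a simplified illustration in plain Python: it omits the learned query, key and value projections of a real model, and the function names and toy vectors are hypothetical.</p>

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention without learned projections (simplification)."""
    dim = len(vectors[0])
    all_weights, outputs = [], []
    for query in vectors:
        # Compare this token with every token in the sentence, whatever the distance.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        weights = softmax(scores)
        all_weights.append(weights)
        # Each output is a weighted mix of all token vectors.
        outputs.append([sum(w * vec[i] for w, vec in zip(weights, vectors))
                        for i in range(dim)])
    return outputs, all_weights


# Toy "sentence": the first and last tokens are similar, the middle one is not.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = self_attention(tokens)
print([round(w, 3) for w in weights[0]])
```

<p>In this toy example, the first token gives as much weight to the last token as to itself, and less to the middle one: distance in the sentence plays no role, only similarity does.</p>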



<p>Given the performance of those new techniques in solving this type of problem, it becomes clear that the same techniques can also be used in areas not directly related to language, such as video, voice, or even image processing. Even though only a few research papers addressed this topic in 2020, they suggest that 2021 will be a year of improvements in the state of the art in these&nbsp;areas.</p>



<h3 class="wp-block-heading">3. The New Era of Speech Recognition</h3>



<div class="wp-block-image"><figure class="alignright size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2021/01/IMG_0442.png" alt="Speech recognition" class="wp-image-20578" width="314" height="249" srcset="https://blog.ovhcloud.com/wp-content/uploads/2021/01/IMG_0442.png 627w, https://blog.ovhcloud.com/wp-content/uploads/2021/01/IMG_0442-300x238.png 300w" sizes="auto, (max-width: 314px) 100vw, 314px" /></figure></div>



<p>As we have seen, NLP-related techniques will benefit numerous learning fields in which data temporality plays a very important role, including Speech Recognition. <a href="https://mila.quebec/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">MILA</a> (known for its eminent professor <a href="https://yoshuabengio.org/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">Yoshua Bengio</a>), in collaboration with Nvidia, Samsung and Nuance, has announced the <a href="https://speechbrain.github.io/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">SpeechBrain</a> project, which has not yet revealed all its secrets but could be a &#8220;game changer&#8221; in 2021.</p>



<h3 class="wp-block-heading">4. I annotate, you annotate&#8230;</h3>



<div class="wp-block-image"><figure class="alignright size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2021/01/IMG_0446.png" alt="Weights and Biases" class="wp-image-20591" width="332" height="119" srcset="https://blog.ovhcloud.com/wp-content/uploads/2021/01/IMG_0446.png 442w, https://blog.ovhcloud.com/wp-content/uploads/2021/01/IMG_0446-300x108.png 300w" sizes="auto, (max-width: 332px) 100vw, 332px" /></figure></div>



<p>We will all annotate this year! The widespread use of AI by companies will lead to an explosion in data labeling solutions, which should be accompanied by an expansion of open-source tools. A few big startups should stand out this year, such as <a href="https://wandb.ai/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Weights and Biases</a> for experiment management, which was democratized last year.</p>



<h3 class="wp-block-heading">5. AI to be taught earlier in the IT curriculum</h3>



<div class="wp-block-image"><figure class="alignright size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2021/01/IMG_0441.png" alt="Teaching AI" class="wp-image-20576" width="163" height="160" srcset="https://blog.ovhcloud.com/wp-content/uploads/2021/01/IMG_0441.png 326w, https://blog.ovhcloud.com/wp-content/uploads/2021/01/IMG_0441-300x294.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2021/01/IMG_0441-70x70.png 70w" sizes="auto, (max-width: 163px) 100vw, 163px" /></figure></div>



<p>Although bachelor&#8217;s and master&#8217;s degree programs in computer science already cover the notions of artificial intelligence, September 2021 may see the first artificial intelligence teaching programs arrive in scientific fields upstream of those degrees.</p>



<h3 class="wp-block-heading">6. Fake or not fake?</h3>



<p>The rollout of Generative Adversarial Networks (GANs), over about three years, will spawn a real revolution in multimedia, especially in video games and video creation. Just as great power brings great responsibility, these technologies bring the risk of malicious use of generated images, such as deepfakes. Stay vigilant, GAFAM: you are under the magnifying glass!</p>



<h3 class="wp-block-heading">7. A major open source player?</h3>



<p>All indications are that an open-source player centralizing several areas of artificial intelligence application (image, sound, text and video) might emerge in the course of 2021 or 2022.</p>



<h3 class="wp-block-heading">8. Indicators to help reduce energy consumption</h3>



<div class="wp-block-image"><figure class="alignright size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2021/01/IMG_0430.png" alt="Climate Neutral Datacenter Pact" class="wp-image-20529" width="168" height="176" srcset="https://blog.ovhcloud.com/wp-content/uploads/2021/01/IMG_0430.png 336w, https://blog.ovhcloud.com/wp-content/uploads/2021/01/IMG_0430-286x300.png 286w" sizes="auto, (max-width: 168px) 100vw, 168px" /></figure></div>



<p>The power consumption associated with AI remains significant, especially for the operation and cooling of GPUs. The ecological impact of Artificial Intelligence is a growing topic. We can expect new indicators of ecological impact to appear in research papers&#8230; and why not in some cloud providers&#8217; communications 😉.</p>



<p>OVHcloud has already started working on this topic through the <a href="https://www.ovh.com/blog/5-keys-to-understand-the-climate-neutral-datacenter-pact/" target="_blank" rel="noreferrer noopener" data-wpel-link="exclude">Green Cloud Task Force</a>.</p>



<h3 class="wp-block-heading">9. Responsible and ethical AI certifications</h3>



<p>Everyone is talking about ethics and responsibility; the subject is certain to be a priority for the major certification bodies.<br>New ISO certifications dedicated to AI are expected to be launched this year to address critical topics such as reversibility, transparency of algorithms, and application across multiple local contexts while avoiding biases (skin color, age, gender, language, culture, accent, &#8230;).</p>



<h3 class="wp-block-heading">10. Collaborative solutions and containers to secure reproducibility and move to production</h3>



<div class="wp-block-image"><figure class="alignright size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2021/01/IMG_0443.png" alt="" class="wp-image-20581" width="177" height="186" srcset="https://blog.ovhcloud.com/wp-content/uploads/2021/01/IMG_0443.png 353w, https://blog.ovhcloud.com/wp-content/uploads/2021/01/IMG_0443-285x300.png 285w" sizes="auto, (max-width: 177px) 100vw, 177px" /></figure></div>



<p>As the processes for implementing AI projects within companies become more widely accessible and structured, we are seeing the entire ecosystem look to adopt collaborative data science tools based on <a href="https://jupyter.org/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Project Jupyter</a>’s logic. Real-time collaborative code editing seems like a promising path! Reproducible, production-proof AI implementations seem to be converging toward container technology, which should arrive in force in the data science community.</p>



<h3 class="wp-block-heading">And a last one, my personal conclusion</h3>



<div class="wp-block-image"><figure class="alignright size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2021/01/IMG_0438.png" alt="JL Queguiner's Predictions" class="wp-image-20573" width="230" height="270" srcset="https://blog.ovhcloud.com/wp-content/uploads/2021/01/IMG_0438.png 459w, https://blog.ovhcloud.com/wp-content/uploads/2021/01/IMG_0438-255x300.png 255w" sizes="auto, (max-width: 230px) 100vw, 230px" /></figure></div>



<p>And here is an 11th prediction in the form of a more personal conclusion: the trend towards simplifying usage for developers and data scientists will grow&#8230; It is for this reason that we have worked to make the user experience of our AI services, such as <a href="https://www.ovhcloud.com/en/public-cloud/ai-training/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">AI Training</a> and <a href="https://www.ovhcloud.com/en/public-cloud/machine-learning-serving/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">ML Serving</a>, as simple as possible 😉</p>



<p>And what a bonus if these tools are on a trusted cloud 💖</p>



<p>Happy cloud year 2021!</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Our partnership with Project Jupyter: the value of an open-source data science community</title>
		<link>https://blog.ovhcloud.com/ovhcloud-partnership-with-project-jupyter/</link>
		
		<dc:creator><![CDATA[Jean-Louis Queguiner]]></dc:creator>
		<pubDate>Tue, 10 Nov 2020 21:44:24 +0000</pubDate>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Ecosystem]]></category>
		<category><![CDATA[Jupyter]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[OVHcloud]]></category>
		<guid isPermaLink="false">https://www.ovh.com/blog/?p=19862</guid>

					<description><![CDATA[Between two major online events that OVHcloud sponsors, JupyterCon and PyData, I wanted to share my team’s feedback on our collaboration with Project Jupyter, which started a few years back.]]></description>
										<content:encoded><![CDATA[
<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/11/IMG_0366-1024x537.png" alt="Our partnership with Project Jupyter" class="wp-image-19886" width="768" height="403" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/11/IMG_0366-1024x537.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/11/IMG_0366-300x157.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/11/IMG_0366-768x403.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/11/IMG_0366.png 1200w" sizes="auto, (max-width: 768px) 100vw, 768px" /></figure></div>



<p>Between two&nbsp;major online&nbsp;events,&nbsp;JupyterCon&nbsp;and&nbsp;PyData &#8211; both sponsored by OVHcloud &#8211; I wanted to share my team’s feedback on our collaboration with&nbsp;<a rel="noreferrer noopener nofollow external" href="https://jupyter.org/index.html" target="_blank" data-wpel-link="external">Project&nbsp;Jupyter</a>. The partnership, which began a few years back, is still growing&nbsp;and we are continuing to learn a&nbsp;lot from the open-source experience.&nbsp;</p>



<p>I’d like to introduce&nbsp;Maël&nbsp;Le Gal,&nbsp;who has been one of our&nbsp;Data &amp; AI&nbsp;DevOps engineers for&nbsp;almost three&nbsp;years.&nbsp;In 2019,&nbsp;Maël&nbsp;was heavily involved in&nbsp;making&nbsp;OVHcloud&nbsp;one of the&nbsp;<a rel="noreferrer noopener" href="https://www.ovh.com/blog/mybinder-and-ovh-partnership/" target="_blank" data-wpel-link="exclude">hosts&nbsp;of Binder Hubs</a>.</p>



<p><em>&#8220;Joining the multi-cloud Binder&nbsp;Federation and becoming an infrastructure provider of MyBinder.org was a thrilling project! I experienced open collaboration on a new scale. Typically, all the discussions regarding architecture evolution are held publicly within the Binder community on GitHub and additional day-to-day communications &#8211; such as operational tasks &#8211; are held collectively on&nbsp;<a rel="noreferrer noopener nofollow external" href="https://gitter.im/jupyterhub/mybinder.org-deploy" target="_blank" data-wpel-link="external">Gitter</a>.&nbsp;It’s&nbsp;a very different approach to building trusting relationships than what we are accustomed to in the IT industry, and I really appreciate it!&#8221;</em>, explained Maël.</p>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/11/IMG_0365.png" alt="Binder" class="wp-image-19883" width="281" height="86" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/11/IMG_0365.png 561w, https://blog.ovhcloud.com/wp-content/uploads/2020/11/IMG_0365-300x92.png 300w" sizes="auto, (max-width: 281px) 100vw, 281px" /></figure></div>



<div class="wp-block-image is-style-rounded"><figure class="alignright size-large is-resized"><img loading="lazy" decoding="async" src="https://secure.gravatar.com/avatar/1679520721bd8949f3afed1315daaa6e?s=300&amp;d=mm&amp;r=g" alt="" width="150" height="150"/></figure></div>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p><em>In hindsight, it&nbsp;was a unique&nbsp;co-innovation&nbsp;opportunity for&nbsp;OVHcloud&nbsp;to become&nbsp;the&nbsp;first partner of the Binder Federation, and&nbsp;it became even more interesting when&nbsp;the&nbsp;two&nbsp;other partners &#8211;&nbsp;Gesis&nbsp;and&nbsp;Turing &#8211; joined as well. Thanks to our collaborative methods, we can easily&nbsp;and collectively&nbsp;submit a new feature; once&nbsp;most of&nbsp;the community agrees on the feature, we deploy&nbsp;it&nbsp;on each of the three clouds.</em>&nbsp;</p><cite>Maël Le Gal, Data &amp; AI DevOps with OVHcloud, explains the daily collaboration of the Binder Federation.</cite></blockquote>



<p>This is what I wrote about my experience with&nbsp;MyBinder&nbsp;last year:<em>&nbsp;&#8220;Working with open source requires a very human-centric&nbsp;mindset to build consensus and deliver progress when&nbsp;everyone has different&nbsp;objectives, priorities, timelines&nbsp;and points of view.”</em>&nbsp;</p>



<figure class="wp-block-image size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/11/10_36_16-1024x388.jpg" alt="Binder Federation" class="wp-image-19891" width="768" height="291" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/11/10_36_16-1024x388.jpg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/11/10_36_16-300x114.jpg 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/11/10_36_16-768x291.jpg 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/11/10_36_16-1536x582.jpg 1536w, https://blog.ovhcloud.com/wp-content/uploads/2020/11/10_36_16.jpg 1854w" sizes="auto, (max-width: 768px) 100vw, 768px" /><figcaption>Here is the Binder Federation as of the beginning of November 2020.</figcaption></figure>



<h2 class="wp-block-heading"><strong>2020:&nbsp;scaling projects and accelerating</strong>&nbsp;</h2>



<p>Building on this experience,&nbsp;Maël&nbsp;contributed to&nbsp;another hosting initiative&nbsp;for Project&nbsp;Jupyter&nbsp;earlier this year&nbsp;that quickly scaled:&nbsp;<a rel="noreferrer noopener nofollow external" href="https://github.com/jupyter/nbviewer" target="_blank" data-wpel-link="external">NBViewer</a> &#8211;&nbsp;the web application behind The&nbsp;Jupyter&nbsp;Notebook Viewer, hosted by&nbsp;OVHcloud.&nbsp;&nbsp;</p>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/11/IMG_0363.png" alt="Jupyter Nbviewer" class="wp-image-19881" width="360" height="133" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/11/IMG_0363.png 720w, https://blog.ovhcloud.com/wp-content/uploads/2020/11/IMG_0363-300x110.png 300w" sizes="auto, (max-width: 360px) 100vw, 360px" /></figure></div>



<p><em>“Jupyter&nbsp;was quite happy about our first collaboration and, in March, asked us to replace a previous&nbsp;hosting provider&nbsp;that had disengaged from the open-source community. Because&nbsp;we&nbsp;already&nbsp;had the deployment experience,&nbsp;and methods&nbsp;from Binder, it was a very smooth deployment. We were happy to see that the traffic running on the&nbsp;OVHcloud&nbsp;infrastructure grew from 25% to 100%”,&nbsp;</em>explained&nbsp;Maël.&nbsp;</p>



<p>Now I’d like to introduce another member of my team:&nbsp;Guillaume Salou, our AI Technical Lead, who’s been working closely with Project&nbsp;Jupyter&nbsp;from the beginning and&nbsp;driving the&nbsp;OVHcloud&nbsp;<a rel="noreferrer noopener" href="https://www.ovh.com/blog/sponsorship-of-the-jupytercon-2020-sharing-values-and-supporting-with-infrastructure/" target="_blank" data-wpel-link="exclude">sponsorship&nbsp;of&nbsp;the&nbsp;JupyterCon&nbsp;2020 digital event</a>&nbsp;as an infrastructure&nbsp;donor.&nbsp;&nbsp;</p>



<figure class="wp-block-image size-large is-resized"><a href="https://www.ovh.com/blog/sponsorship-of-the-jupytercon-2020-sharing-values-and-supporting-with-infrastructure/" data-wpel-link="exclude"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/07/D28D9C80-FC63-4332-8E47-27092C1797A8-1024x537.jpeg" alt="Sponsorship of the JupyterCon 2020: sharing values and supporting with infrastructure" class="wp-image-18692" width="512" height="269" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/07/D28D9C80-FC63-4332-8E47-27092C1797A8-1024x537.jpeg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/D28D9C80-FC63-4332-8E47-27092C1797A8-300x157.jpeg 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/D28D9C80-FC63-4332-8E47-27092C1797A8-768x403.jpeg 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/D28D9C80-FC63-4332-8E47-27092C1797A8.jpeg 1200w" sizes="auto, (max-width: 512px) 100vw, 512px" /></a></figure>



<p><em>“We kicked off this project in June with&nbsp;Project&nbsp;Jupyter&nbsp;and&nbsp;the NumFOCUS&nbsp;Foundation as well as&nbsp;IBL Education,&nbsp;responsible for the Open edX deployment at the conference.&nbsp;OVHcloud&nbsp;had&nbsp;offered&nbsp;the underlying infrastructure to host the global event (fully online for the first time) and its educational platform, and now we had to deploy it!</em></p>



<p><em>The type of infrastructure we’re talking about is, of course, GPUs,&nbsp;coupled with Kubernetes to support all the talks, demos and online experiments that were planned.&nbsp;</em>&nbsp;</p>



<p><em>I want to highlight three aspects of this infrastructure that&nbsp;meet Project&nbsp;Jupyter’s&nbsp;requirements:</em>&nbsp;</p>



<ul class="wp-block-list"><li><em>Ability to scale up and down as necessary;</em>&nbsp;</li></ul>



<ul class="wp-block-list"><li><em>Ability to load balance on other cloud providers and the reversibility of our services;</em>&nbsp;</li><li><em>Simplified rights management and user account creation&nbsp;based&nbsp;on&nbsp;OpenStack.</em>&#8221;</li></ul>



<div class="wp-block-image is-style-rounded"><figure class="alignright size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/11/guillaume_salou.jpg" alt="" class="wp-image-19875" width="150" height="150" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/11/guillaume_salou.jpg 400w, https://blog.ovhcloud.com/wp-content/uploads/2020/11/guillaume_salou-300x300.jpg 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/11/guillaume_salou-150x150.jpg 150w, https://blog.ovhcloud.com/wp-content/uploads/2020/11/guillaume_salou-70x70.jpg 70w" sizes="auto, (max-width: 150px) 100vw, 150px" /></figure></div>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p><em>This&nbsp;collaboration&nbsp;has steered our efforts towards simplifying&nbsp;features and providing&nbsp;ready-to-use services&nbsp;for the event’s organizers and users; it’s a new milestone in&nbsp;OVHcloud’s&nbsp;AI&nbsp;approach to eliminate all infrastructure set-up and management complexity&nbsp;to facilitate and spread usage.</em></p><cite>Guillaume Salou, AI Technical Lead with OVHcloud, explains how the collaboration on JupyterCon 2020 has supported internal efforts to simplify the user experience.</cite></blockquote>



<p>To conclude: this partnership with Project&nbsp;Jupyter&nbsp;is here to stay. Thanks a&nbsp;lot&nbsp;to our partner as well as internal teams for making this open and fruitful collaboration a reality! </p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>How PCI-Express works and why you should care? #GPU</title>
		<link>https://blog.ovhcloud.com/how-pci-express-works-and-why-you-should-care-gpu/</link>
		
		<dc:creator><![CDATA[Jean-Louis Queguiner]]></dc:creator>
		<pubDate>Thu, 09 Jul 2020 10:16:00 +0000</pubDate>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Deep learning]]></category>
		<category><![CDATA[GPU]]></category>
		<category><![CDATA[Hardware]]></category>
		<category><![CDATA[Infrastructure]]></category>
		<category><![CDATA[Machine learning]]></category>
		<category><![CDATA[PCIe]]></category>
		<guid isPermaLink="false">https://blog.ovh.com/fr/blog/?p=14485</guid>

					<description><![CDATA[What is PCI-Express ? Everyone, and I mean everyone, should pay attention when they do intensive Machine Learning / Deep Learning Training. As I explained in a previous blog post, GPUs have accelerated Artificial Intelligence evolution massively. However, building a GPUs server is not that easy. And failing to create an appropriate infrastructure can have [&#8230;]<img src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fhow-pci-express-works-and-why-you-should-care-gpu%2F&amp;action_name=How%20PCI-Express%20works%20and%20why%20you%20should%20care%3F%20%23GPU&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></description>
										<content:encoded><![CDATA[
<div class="wp-block-image"><figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="538" src="https://www.ovh.com/blog/wp-content/uploads/2020/07/69659375-3553-40C9-A201-73C4CDED2461-1024x538.jpeg" alt="How PCI-Express works and why you should care? #GPU" class="wp-image-18783" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/07/69659375-3553-40C9-A201-73C4CDED2461-1024x538.jpeg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/69659375-3553-40C9-A201-73C4CDED2461-300x158.jpeg 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/69659375-3553-40C9-A201-73C4CDED2461-768x403.jpeg 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/69659375-3553-40C9-A201-73C4CDED2461.jpeg 1200w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure></div>



<h2 class="wp-block-heading">What is PCI-Express ?</h2>



<p>Everyone, and I mean everyone, should pay attention to PCI-Express when they do intensive Machine Learning / Deep Learning training. </p>



<p>As I explained in a previous blog post, GPUs have accelerated Artificial Intelligence evolution massively.  </p>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><a href="https://www.ovh.com/blog/understanding-the-anatomy-of-gpus-using-pokemon/" data-wpel-link="exclude"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/05/EEAD0A02-DFCA-4745-802B-E36BC517EFED.png" alt="" class="wp-image-18103" width="334" height="254" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/05/EEAD0A02-DFCA-4745-802B-E36BC517EFED.png 668w, https://blog.ovhcloud.com/wp-content/uploads/2020/05/EEAD0A02-DFCA-4745-802B-E36BC517EFED-300x228.png 300w" sizes="auto, (max-width: 334px) 100vw, 334px" /></a></figure></div>



<p>However, building a GPU server is not that easy, and failing to create an appropriate infrastructure can hurt training time.</p>



<p>If you use GPUs, you should know that there are two ways to connect them to the motherboard so that they can communicate with the other components (network, CPU, storage device). Solution 1 is through <strong>PCI Express</strong> and solution 2 is through <strong>SXM2</strong>. We will talk about <strong>SXM2</strong> in the future. Today, we will focus on <strong>PCI Express</strong>, because it depends strongly on the choice of adjacent hardware such as the PCI bus or the CPU.</p>



<figure class="wp-block-table"><table class="has-fixed-layout"><thead><tr><th>                     NVIDIA V100 with SXM2 design</th><th class="has-text-align-center" data-align="center">                          NVIDIA V100 with PCI express design</th></tr></thead><tbody><tr><td><img loading="lazy" decoding="async" width="609" height="644" class="wp-image-18763" style="width: 500px" src="https://www.ovh.com/blog/wp-content/uploads/2020/07/PCIe-01.jpg" alt="NVIDIA V100 with SXM2 design" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/07/PCIe-01.jpg 609w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/PCIe-01-284x300.jpg 284w" sizes="auto, (max-width: 609px) 100vw, 609px" /><br>Source : <a aria-label="undefined (opens in a new tab)" href="https://www.ebizpc.com/NVIDIA-Tesla-V100-900-2G502-0300-000-16GB-GPU-p/900-2g503-0310-000.htm" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">https://www.ebizpc.com/NVIDIA-Tesla-V100-900-2G502-0300-000-16GB-GPU-p/900-2g503-0310-000.htm</a></td><td class="has-text-align-center" data-align="center"><img loading="lazy" decoding="async" width="450" height="450" class="wp-image-18764" style="width: 500px" src="https://www.ovh.com/blog/wp-content/uploads/2020/07/PCIe-02.jpg" alt="NVIDIA V100 with PCI express design" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/07/PCIe-02.jpg 450w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/PCIe-02-300x300.jpg 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/PCIe-02-150x150.jpg 150w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/PCIe-02-70x70.jpg 70w" sizes="auto, (max-width: 450px) 100vw, 450px" /><br>Source : <a aria-label="undefined (opens in a new tab)" href="https://nvidiastore.com.br/nvidia-tesla-v100-16gb" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">https://nvidiastore.com.br/nvidia-tesla-v100-16gb</a></td></tr></tbody></table><figcaption>SXM2 design VS PCI Express 
Design</figcaption></figure>



<p>This is a major element to consider in deep learning: the time spent loading data is wasted compute time, so the bandwidth between the GPUs and the other components is a key bottleneck in most deep learning training contexts.</p>



<h2 class="wp-block-heading">How does PCI-Express work and why you should care about the number of PCIe lanes?</h2>



<h3 class="wp-block-heading">What is a PCI-Express lane and are there any associated CPU limitations?</h3>



<p>Each V100 GPU uses 16 PCIe lanes. What does that mean exactly?</p>



<div class="wp-block-image"><figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="618" height="442" src="https://www.ovh.com/blog/wp-content/uploads/2020/07/PCIe-03.png" alt="Extract from NVidia V100 product specification sheet" class="wp-image-18767" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/07/PCIe-03.png 618w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/PCIe-03-300x215.png 300w" sizes="auto, (max-width: 618px) 100vw, 618px" /><figcaption>Extract from NVidia V100 product specification <a href="https://images.nvidia.com/content/technologies/volta/pdf/tesla-volta-v100-datasheet-letter-fnl-web.pdf" target="_blank" aria-label="undefined (opens in a new tab)" rel="noreferrer noopener nofollow external" data-wpel-link="external">sheet</a></figcaption></figure></div>



<p>The <strong><em>&#8220;x16&#8221;</em></strong> means that the PCIe link has 16 dedicated lanes. So&#8230; next question: what is a PCI Express lane?</p>



<h4 class="wp-block-heading">What&#8217;s a PCI Express lane?</h4>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/07/72DFDF80-DC39-4253-BAB3-CEB351B627D3.jpeg" alt="2 PCI Express devices with their interconnection" class="wp-image-18779" width="424" height="299" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/07/72DFDF80-DC39-4253-BAB3-CEB351B627D3.jpeg 848w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/72DFDF80-DC39-4253-BAB3-CEB351B627D3-300x211.jpeg 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/72DFDF80-DC39-4253-BAB3-CEB351B627D3-768x541.jpeg 768w" sizes="auto, (max-width: 424px) 100vw, 424px" /><figcaption>2 PCI Express devices with their interconnection: figure inspired by the awesome <a aria-label="undefined (opens in a new tab)" href="https://www.phhsnews.com/what-is-chipset-and-why-should-i-care3538" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">article</a> &#8211; what is chipset and why should I care</figcaption></figure></div>



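<p>PCIe lanes are used to communicate between PCIe devices or between a PCIe device and the CPU. A lane is composed of two differential signalling pairs (4 wires in total): one pair for inbound communications and one pair for outbound communications. Both directions offer the same bandwidth and can be used at the same time. </p>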



<p>Lane communications are similar to network Layer 1 communications &#8211; it’s all about transferring bits as fast as possible through electrical wires! However, the technique used for a PCIe link is a bit different, as the link is composed of N lanes (written xN). In our previous example N=16, but it could be any power of 2 from 1 to 16 (1/2/4/8/16).</p>



<h3 class="wp-block-heading">So… if PCIe is similar to network architecture it means that PCIe layers exist, doesn&#8217;t it?</h3>



<p>Yes, you are right! PCIe has 4 layers:</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="724" src="https://www.ovh.com/blog/wp-content/uploads/2020/07/photo_2020-07-02-15.08.02-1024x724.jpeg" alt="" class="wp-image-18723" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-02-15.08.02-1024x724.jpeg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-02-15.08.02-300x212.jpeg 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-02-15.08.02-768x543.jpeg 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/photo_2020-07-02-15.08.02.jpeg 1280w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p></p>



<h4 class="wp-block-heading"><strong>The Physical Layer (aka <em>the Big Negotiation Layer</em>)</strong></h4>



<p>The<strong><em> Physical Layer (PL)</em></strong> is responsible for negotiating with the other device the terms and conditions for receiving the raw packets (PLP, for Physical Layer Packets), i.e. the lane width and the frequency.</p>



<p>You should be aware that the link only uses as many lanes as the device with the fewest lanes supports. This is why choosing the appropriate CPU is so important. CPUs can only manage a limited number of lanes, so <strong>pairing a nice GPU that has 16 PCIe lanes with a CPU that has only 8 PCIe bus lanes is as efficient as throwing away half your money because it doesn’t fit in your wallet.</strong></p>
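
<p>The negotiation rule above can be sketched in a few lines of Python. This is only an illustration: <code>negotiated_link</code> and its parameters are hypothetical names, and real link training involves far more than taking a minimum.</p>

```python
# Hypothetical helper for illustration only; it is not part of any PCIe tooling.
def negotiated_link(device_lanes, host_lanes, device_gen, host_gen):
    """Return the (lanes, generation) that both ends of the link can support."""
    return min(device_lanes, host_lanes), min(device_gen, host_gen)


# A x16 GPU plugged into a host that only offers 8 lanes ends up running at x8.
gpu_on_small_cpu = negotiated_link(device_lanes=16, host_lanes=8, device_gen=3, host_gen=3)
print(gpu_on_small_cpu)  # (8, 3)
```

<p>In this sketch, half of the GPU lanes simply stay unused, which is the situation described above.</p>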



<p>Packets received at the <strong><em>Physical Layer (aka PHY) </em></strong>come from other PCIe devices or from the system (via <strong><em>Direct Memory Access (DMA)</em></strong> or from the CPU, for instance) and are encapsulated in a frame. </p>



<p>The purpose of a Start-of-Frame is to say: “I am sending you data, this is the beginning,” and it takes just 1 byte to say that!</p>



<p>The <strong><em>End-of-Frame</em> </strong>word is also 1 byte to say “goodbye I’m done with it”.</p>



<p>This layer implements an <strong><em>8b/10b or 128b/130b encoding</em></strong>, which we will explain later and which is mainly used for <strong><em>clock recovery.</em></strong></p>



<h4 class="wp-block-heading"><strong>The Data Link Layer (aka <em>Let’s put this mess in the right&nbsp;order</em>)</strong></h4>



<p>The <strong><em>Data Link Layer Packet (DLLP)</em></strong> starts with a <strong><em>Packet Sequence Number.</em></strong> This is really important, as a packet might get corrupted at some point and must therefore be uniquely identified so that it can be resent. The <strong><em>Sequence Number </em></strong>is coded on 2 bytes.</p>



<p>The <strong><em>Sequence Number</em></strong> is then followed by the <strong><em>Transaction Layer Packet</em></strong>, and the packet is closed with the <strong><em>LCRC (Link Cyclic Redundancy Check)</em></strong>, which is used to check the integrity of the <strong><em>Transaction Layer Packet (meaning the actual Payload)</em></strong>.</p>
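
<p>To make the framing more concrete, here is a minimal Python sketch of the layout described so far: a 1-byte start marker, a 2-byte sequence number, the Transaction Layer Packet, a check value and a 1-byte end marker. The marker values, the function names and the use of <code>zlib.crc32</code> as a stand-in for the LCRC are assumptions made for this example; the check is computed here over the sequence number and the TLP, and the real hardware computation differs in its details.</p>

```python
import struct
import zlib

START_OF_FRAME = b"\xfb"  # placeholder marker value chosen for this sketch
END_OF_FRAME = b"\xfd"    # placeholder marker value chosen for this sketch


def build_frame(sequence_number, tlp):
    """Wrap a Transaction Layer Packet: start | sequence | TLP | check | end."""
    header = struct.pack("!H", sequence_number)         # 2-byte sequence number
    check = struct.pack("!I", zlib.crc32(header + tlp))  # 4-byte CRC-32 stand-in for the LCRC
    return START_OF_FRAME + header + tlp + check + END_OF_FRAME


def parse_frame(frame):
    """Return (sequence_number, tlp, check_ok) for a frame built by build_frame."""
    body = frame[1:-1]  # strip the 1-byte start and end markers
    header, tlp, check = body[:2], body[2:-4], body[-4:]
    (sequence_number,) = struct.unpack("!H", header)
    check_ok = struct.pack("!I", zlib.crc32(header + tlp)) == check
    return sequence_number, tlp, check_ok


frame = build_frame(7, b"payload")
print(parse_frame(frame))  # (7, b'payload', True)
```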



<p>If the <strong><em>LCRC</em></strong> is validated, then the <em><strong>Data Link Layer</strong></em> sends an <strong><em>ACK (ACKnowledge)</em></strong> signal to the <em><strong>emitter</strong></em> through the <strong><em>Physical Layer</em>.</strong> Otherwise it sends a <strong><em>NAK (Not AcKnowledge) </em></strong>signal to the emitter, which resends the frame associated with the <strong><em>sequence number </em></strong>to retry; this is why the <em><strong>emitter</strong></em> keeps every unacknowledged packet in a replay buffer.</p>
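
<p>The ACK/NAK exchange can be simulated with a few lines of Python. This is a deliberately simplified sketch: the <code>Emitter</code> class and the <code>receive</code> function are hypothetical names, the emitter keeps unacknowledged packets so that it can resend them, and <code>zlib.crc32</code> again stands in for the LCRC.</p>

```python
import zlib


class Emitter:
    """Keeps every unacknowledged packet so that it can be replayed on a NAK."""

    def __init__(self):
        self.replay_buffer = {}

    def send(self, sequence_number, tlp):
        self.replay_buffer[sequence_number] = tlp
        return sequence_number, tlp, zlib.crc32(tlp)

    def on_reply(self, reply, sequence_number):
        if reply == "ACK":
            del self.replay_buffer[sequence_number]  # delivered, forget it
            return None
        # NAK: resend the packet stored under this sequence number
        return self.send(sequence_number, self.replay_buffer[sequence_number])


def receive(packet):
    """Receiver side: validate the check value and answer ACK or NAK."""
    sequence_number, tlp, check = packet
    return "ACK" if zlib.crc32(tlp) == check else "NAK"


emitter = Emitter()
packet = emitter.send(1, b"weights")
corrupted = (packet[0], b"weightz", packet[2])  # simulate corruption on the wire
reply = receive(corrupted)                      # the check fails, so this is a NAK
retry = emitter.on_reply(reply, 1)              # the emitter replays packet 1
final_reply = receive(retry)                    # the clean copy is acknowledged
emitter.on_reply(final_reply, 1)
print(reply, final_reply)  # NAK ACK
```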



<h4 class="wp-block-heading"><strong>The Transaction Layer</strong></h4>



<p>The<strong><em> Transaction Layer</em></strong> is responsible for <strong>managing the actual payload (Header + Data)</strong> as well as the (optional) message digest <strong><em>ECRC (End to End Cyclic Redundancy Check)</em></strong>. This <strong><em>Transaction Layer Packet </em></strong>comes from the <strong><em>Data Link Layer</em></strong>, where it has been <strong>decapsulated</strong>.</p>



<p>An <strong>integrity check</strong> is performed if needed/requested. This step checks the integrity of the business logic and ensures that no packet was corrupted when passing data from the<strong><em> Data Link Layer</em></strong> to the <em><strong>Transaction Layer.</strong></em></p>



<p>The header describes the type of transaction, such as:</p>



<ul class="wp-block-list"><li>Memory Transaction</li><li>I/O Transaction</li><li>Configuration Transaction</li><li>or Message Transaction</li></ul>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/07/5E282911-B63F-410D-A2CD-AD52B928C62E-1024x600.jpeg" alt="PCIe Layers" class="wp-image-18781" width="512" height="300" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/07/5E282911-B63F-410D-A2CD-AD52B928C62E-1024x600.jpeg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/5E282911-B63F-410D-A2CD-AD52B928C62E-300x176.jpeg 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/5E282911-B63F-410D-A2CD-AD52B928C62E-768x450.jpeg 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/5E282911-B63F-410D-A2CD-AD52B928C62E.jpeg 1368w" sizes="auto, (max-width: 512px) 100vw, 512px" /></figure></div>



<h4 class="wp-block-heading"><strong>The Application Layer</strong></h4>



<p>The role of the <em><strong>application layer</strong></em> is to handle the<strong><em> User Logic</em></strong>. This layer sends the <strong><em>Header</em></strong> <strong><em>and the data payload </em></strong>to the <strong><em>Transaction Layer</em></strong>. The magic happens in this layer, where data is routed to different hardware components.</p>



<h3 class="wp-block-heading">How does PCIe communicate with the rest of the&nbsp;world?</h3>



<p>A PCIe link uses the <strong>packet switching concept used in networks, in full duplex mode.</strong></p>



<p>PCIe devices have an <strong>internal clock to orchestrate PCIe </strong><em><strong>Data Transfer Cycles</strong>.</em> This <strong><em>Data Transfer Cycle</em></strong> is also orchestrated thanks to the <strong><em>Reference Clock.</em></strong> The latter sends its signal through <strong><em>dedicated wires</em> (which are not part of the x1/2/4/8/16/32 lanes mentioned above)</strong>. This clock helps both the receiving and the emitting devices to synchronize for packet communications.</p>



<p><strong>Each PCIe lane is used to send bytes in parallel with the other lanes</strong>. The<strong><em> Clock Synchronization </em></strong>mentioned above helps the receiver to put those bytes back in the right order.</p>
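
<p>Sending bytes in parallel over several lanes and putting them back in order can be sketched as a simple round-robin distribution. The function names <code>stripe</code> and <code>unstripe</code> are hypothetical, and the sketch ignores framing and encoding: it only shows the ordering idea.</p>

```python
def stripe(data, lanes):
    """Distribute bytes round-robin: byte i travels on lane i % lanes."""
    return [data[lane::lanes] for lane in range(lanes)]


def unstripe(per_lane, total_length):
    """Rebuild the original byte order from the per-lane streams."""
    lanes = len(per_lane)
    return bytes(per_lane[i % lanes][i // lanes] for i in range(total_length))


payload = bytes(range(32))
per_lane = stripe(payload, 16)          # 16 lanes, 2 bytes on each lane
print(list(per_lane[0]))                # bytes 0 and 16 travel on lane 0
print(unstripe(per_lane, len(payload)) == payload)
```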



<div class="wp-block-image"><figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="618" height="442" src="https://www.ovh.com/blog/wp-content/uploads/2020/07/PCIe-03.png" alt="" class="wp-image-18767" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/07/PCIe-03.png 618w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/PCIe-03-300x215.png 300w" sizes="auto, (max-width: 618px) 100vw, 618px" /><figcaption>x16 means 16 lanes of parallel communication on generation 3 of PCIe&nbsp;protocol</figcaption></figure></div>



<h3 class="wp-block-heading">You may have the bytes in order but do you have the data integrity at the physical layer&nbsp;?</h3>



<p>To ensure <strong>integrity</strong>, PCIe devices use <strong>8b/10b encoding for PCIe generations 1 and 2</strong> or a <strong>128b/130b encoding scheme for generations 3</strong> <strong>and 4.</strong></p>



<p>These encodings are used to prevent the loss of temporal landmarks, especially when transmitting long runs of identical bits. Recovering the clock from the transitions in the data is called “<strong><em>Clock Recovery</em></strong>”.</p>



<p>With 128b/130b encoding, 128 bits of payload data are sent together with 2 control bits.</p>



<h4 class="wp-block-heading">Quick examples</h4>



<p><em>Let’s simplify it with an 8b/10b example:</em> here is the 8b/10b encoding table from the Ethernet specifications (IEEE 802.3 clause 36, table 36–1a):</p>



<div class="wp-block-image"><figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="600" height="546" src="https://www.ovh.com/blog/wp-content/uploads/2020/07/PCIe-04.png" alt="IEEE 802.3 clause 36, table 36–1a - 8b/10b encoding table" class="wp-image-18770" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/07/PCIe-04.png 600w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/PCIe-04-300x273.png 300w" sizes="auto, (max-width: 600px) 100vw, 600px" /><figcaption>IEEE 802.3 clause 36, table 36–1a &#8211; 8b/10b encoding table</figcaption></figure></div>



<p>So how can the receiver distinguish between all those repeating 0s (Code Group Name D0.0)?</p>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/07/2B41AC73-59D2-4230-B8F4-73327F3991E4-1024x819.png" alt="Repeating bits everywhere" class="wp-image-18777" width="512" height="410" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/07/2B41AC73-59D2-4230-B8F4-73327F3991E4-1024x819.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/2B41AC73-59D2-4230-B8F4-73327F3991E4-300x240.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/2B41AC73-59D2-4230-B8F4-73327F3991E4-768x615.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/2B41AC73-59D2-4230-B8F4-73327F3991E4.png 1381w" sizes="auto, (max-width: 512px) 100vw, 512px" /></figure></div>



<p>8b/10b encoding is composed of 5b/6b + 3b/4b encodings.</p>



<p>Therefore, starting from a negative running disparity (<strong>rd-</strong>), <strong>00000 000</strong> is encoded into <strong>100111 0100</strong>: the first 5 bits of the original data, <strong>00000</strong>, are encoded to <strong>100111</strong> using 5b/6b encoding, and the remaining group of 3 bits, <strong>000</strong>, is encoded to <strong>0100</strong> using 3b/4b encoding.</p>



<p>Starting from a positive running disparity (<strong>rd+</strong>), the same <strong>00000 000</strong> turns into <strong>011000 1011</strong> instead.</p>



<p><strong>Therefore the original data, which was 8 bits, is now 10 bits because of the control bits (1 control bit for 5b/6b and 1 for 3b/4b). </strong></p>
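
<p>The D0.0 example can be reproduced with a tiny Python sketch. It only contains the table rows needed for <strong>00000 000</strong> (a full encoder needs the complete 5b/6b and 3b/4b tables), and the names <code>FIVE_SIX</code>, <code>THREE_FOUR</code>, <code>disparity_after</code> and <code>encode_d0_0</code> are made up for this illustration.</p>

```python
# Only the D0.0 rows are listed; keys are (input bits, current running disparity).
FIVE_SIX = {("00000", "-"): "100111", ("00000", "+"): "011000"}
THREE_FOUR = {("000", "-"): "1011", ("000", "+"): "0100"}


def disparity_after(code, running_disparity):
    """Flip the running disparity when a sub-block has unequal ones and zeros."""
    ones = code.count("1")
    zeros = len(code) - ones
    if ones == zeros:
        return running_disparity
    return "+" if ones > zeros else "-"


def encode_d0_0(running_disparity):
    """Encode 00000 000 for the given starting running disparity ('+' or '-')."""
    first = FIVE_SIX[("00000", running_disparity)]
    running_disparity = disparity_after(first, running_disparity)
    second = THREE_FOUR[("000", running_disparity)]
    return first + " " + second


print(encode_d0_0("-"))  # 100111 0100
print(encode_d0_0("+"))  # 011000 1011
```

<p>Both results contain as many ones as zeros and never more than three identical bits in a row, which is what gives the receiver enough transitions to recover the clock.</p>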



<p>But don&#8217;t worry I will draft a blog post later dedicated to encoding.</p>



<p><strong>PCIe Generations 1 and 2 were designed with 8b/10b encoding, </strong>meaning that the <strong>actual data transmitted was only 80% of the total load </strong>(as 20%, i.e. 2 bits out of 10, are used for clock synchronization).</p>



<p><strong>PCIe Gen3&amp;4 were designed with 128b/130b, </strong>meaning that the <strong>control bits now represent only 1.56% of the payload. </strong>Quite good, isn’t it?</p>
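
<p>These two percentages are easy to check. The short Python sketch below computes them with exact fractions; <code>encoding_stats</code> is a hypothetical helper name.</p>

```python
from fractions import Fraction


def encoding_stats(payload_bits, total_bits):
    """Return (efficiency, control bits relative to the payload) as exact fractions."""
    control_bits = total_bits - payload_bits
    return Fraction(payload_bits, total_bits), Fraction(control_bits, payload_bits)


efficiency_8b10b, _ = encoding_stats(8, 10)
_, overhead_128b130b = encoding_stats(128, 130)
print(float(efficiency_8b10b))          # 0.8
print(float(overhead_128b130b) * 100)   # 1.5625
```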



<h3 class="wp-block-heading">Let’s calculate the PCIe bandwidth together</h3>



<p>Here is the table of PCIe version specifications:</p>



<figure class="wp-block-table"><table><thead><tr><th>Number of Lanes</th><th>PCIe 1.0 (2003)</th><th>PCIe 2.0 (2007)</th><th><strong>PCIe 3.0 (2010)</strong></th><th><strong>PCIe 4.0 (2017)</strong></th><th>PCIe 5.0 (2019)</th><th>PCIe 6.0 (2021)</th></tr></thead><tbody><tr><td>x1</td><td>250 MB/s</td><td>500 MB/s</td><td>1 GB/s</td><td>2 GB/s</td><td>4 GB/s</td><td>8 GB/s</td></tr><tr><td>x2</td><td>500 MB/s</td><td>1 GB/s</td><td>2 GB/s</td><td>4 GB/s</td><td>8 GB/s</td><td>16 GB/s</td></tr><tr><td>x4</td><td>1 GB/s</td><td>2 GB/s</td><td>4 GB/s</td><td>8 GB/s</td><td>16 GB/s</td><td>32 GB/s</td></tr><tr><td>x8</td><td>2 GB/s</td><td>4 GB/s</td><td>8 GB/s</td><td>16 GB/s</td><td>32 GB/s</td><td>64 GB/s</td></tr><tr><td><strong>x16</strong></td><td>4 GB/s</td><td>8 GB/s</td><td><strong>16 GB/s</strong></td><td>32 GB/s</td><td>64 GB/s</td><td>128 GB/s</td></tr></tbody></table><figcaption>consortium PCI-SIG PCIe theoretical bandwidth/Lane/Way specification sheet</figcaption></figure>



<figure class="wp-block-table"><table><thead><tr><th>                                </th><th>PCIe 1.0 (2003)</th><th>PCIe 2.0 (2007)</th><th>PCIe 3.0 (2010)</th><th>PCIe 4.0 (2017)</th><th>PCIe 5.0 (2019)</th><th>PCIe 6.0 (2021)</th></tr></thead><tbody><tr><td><strong>Frequency</strong></td><td>2.5 GT/s</td><td>5.0 GT/s</td><td>8.0 GT/s</td><td>16 GT/s</td><td>32 GT/s</td><td>64 GT/s</td></tr></tbody></table><figcaption>consortium PCI-SIG PCIe theoretical raw bit rate specification sheet</figcaption></figure>



<p>To obtain such numbers let&#8217;s look at the general Bandwidth formula:</p>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/07/B529B3E3-419B-49DE-9544-8B7BF190D3BB-1024x155.jpeg" alt="" class="wp-image-18793" width="512" height="78" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/07/B529B3E3-419B-49DE-9544-8B7BF190D3BB-1024x155.jpeg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/B529B3E3-419B-49DE-9544-8B7BF190D3BB-300x46.jpeg 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/B529B3E3-419B-49DE-9544-8B7BF190D3BB-768x117.jpeg 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/B529B3E3-419B-49DE-9544-8B7BF190D3BB.jpeg 1298w" sizes="auto, (max-width: 512px) 100vw, 512px" /></figure></div>



<ul class="wp-block-list"><li>BW stands for Bandwidth</li><li>MT/s: Mega Transfers per second</li><li>Encoding could be 4b/5b, 8b/10b, 128b/130b,&nbsp;…</li></ul>



<h4 class="wp-block-heading">For PCIe v1.0:</h4>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/07/A99597E2-4117-43B1-9048-1CE24EFAE227-1024x170.jpeg" alt="BW/lane\ (MB/s) = \ 2\ 500\ (MT/s)\ *\ \frac{8\ bits}{10\ bits} * \frac{1\ Byte}{8\ bits}" class="wp-image-18785" width="512" height="85" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/07/A99597E2-4117-43B1-9048-1CE24EFAE227-1024x170.jpeg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/A99597E2-4117-43B1-9048-1CE24EFAE227-300x50.jpeg 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/A99597E2-4117-43B1-9048-1CE24EFAE227-768x127.jpeg 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/A99597E2-4117-43B1-9048-1CE24EFAE227.jpeg 1231w" sizes="auto, (max-width: 512px) 100vw, 512px" /></figure></div>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/07/BC5F6C70-2FCF-4CD4-9040-848C8EB654CB.jpeg" alt="BW/lane\ (MB/s) = \ 250\ (MB/s)" class="wp-image-18788" width="347" height="79" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/07/BC5F6C70-2FCF-4CD4-9040-848C8EB654CB.jpeg 806w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/BC5F6C70-2FCF-4CD4-9040-848C8EB654CB-300x67.jpeg 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/BC5F6C70-2FCF-4CD4-9040-848C8EB654CB-768x172.jpeg 768w" sizes="auto, (max-width: 347px) 100vw, 347px" /></figure></div>



<h4 class="wp-block-heading">For PCIe v3.0 (the one that interests us for the NVIDIA V100):</h4>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/07/6EFDAF22-C7FC-44FC-B5BE-D8C4D291B71A-1024x154.jpeg" alt="BW/lane\ (MB/s) = \ 8\ 000\ (MT/s)\ *\ \frac{128\ bits}{130\ bits} * \frac{1\ Byte}{8\ bits}" class="wp-image-18795" width="512" height="77" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/07/6EFDAF22-C7FC-44FC-B5BE-D8C4D291B71A-1024x154.jpeg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/6EFDAF22-C7FC-44FC-B5BE-D8C4D291B71A-300x45.jpeg 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/6EFDAF22-C7FC-44FC-B5BE-D8C4D291B71A-768x115.jpeg 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/6EFDAF22-C7FC-44FC-B5BE-D8C4D291B71A.jpeg 1292w" sizes="auto, (max-width: 512px) 100vw, 512px" /></figure></div>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/07/3B7E1754-67C8-4EF1-88BE-3A5D8985803F.jpeg" alt="BW/lane\ (MB/s) = \ 984.6\ (MB/s)" class="wp-image-18796" width="355" height="63" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/07/3B7E1754-67C8-4EF1-88BE-3A5D8985803F.jpeg 802w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/3B7E1754-67C8-4EF1-88BE-3A5D8985803F-300x53.jpeg 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/3B7E1754-67C8-4EF1-88BE-3A5D8985803F-768x136.jpeg 768w" sizes="auto, (max-width: 355px) 100vw, 355px" /></figure></div>



<p>Therefore, with <strong>16 lanes for an NVIDIA V100 connected in PCIe v3.0</strong>, we have an effective data transfer rate (data bandwidth)<strong> of nearly 16 GB/s per way </strong>(<strong>the actual bandwidth is 15.75 GB/s per way</strong>).</p>



<p>You need to be careful not to get confused, as total bandwidth can also be interpreted as two-way bandwidth; in this case we consider the total x16 bandwidth to be around 32 GB/s.</p>
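
<p>The calculations above can be replayed with a small Python sketch. The function name <code>lane_bandwidth_mb_s</code> and its parameters are hypothetical; the numbers come from the tables and formulas of this post.</p>

```python
def lane_bandwidth_mb_s(transfer_rate_mt_s, payload_bits, total_bits):
    """BW per lane (MB/s) = MT/s * encoding efficiency * (1 byte / 8 bits)."""
    return transfer_rate_mt_s * payload_bits / total_bits / 8


gen1_lane = lane_bandwidth_mb_s(2500, 8, 10)      # PCIe v1.0 with 8b/10b
gen3_lane = lane_bandwidth_mb_s(8000, 128, 130)   # PCIe v3.0 with 128b/130b
gen3_x16_one_way = 16 * gen3_lane / 1000          # GB/s in one direction
gen3_x16_two_ways = 2 * gen3_x16_one_way          # GB/s counting both directions

print(gen1_lane)                      # 250.0
print(round(gen3_lane, 1))            # 984.6
print(round(gen3_x16_one_way, 2))     # 15.75
print(round(gen3_x16_two_ways, 1))    # 31.5
```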



<p><em><strong>Note:</strong></em> Another element that we haven&#8217;t considered is that the maximum theoretical bandwidth needs to be reduced by around 1 Gb/s to account for the error detection codes (<strong><em>ECRC</em></strong> and <strong><em>LCRC</em></strong>) as well as the <strong><em>Headers</em></strong> (<strong><em>Start tag, Sequence tag, Header</em></strong>) and <strong><em>Footer</em></strong> (<em><strong>End</strong></em> tag) overheads explained earlier in this blog post.</p>



<h3 class="wp-block-heading">In conclusion</h3>



<p>We have seen that PCI Express has evolved a lot and that it is based on the same concepts as networking. To get the best from PCIe devices, it is necessary to understand the fundamentals of the underlying infrastructure. </p>



<p>Failing to choose the right underlying motherboard, CPU or bus can lead to major performance bottlenecks and GPU underperformance.</p>



<p>To sum up :</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>Friends don&#8217;t let friends build their own GPUs hosts 😉</p><cite>Jean-Louis Quéguiner July 1<sup>st</sup>, 2020</cite></blockquote>



<p>If you liked this post and want to drill down a bit into the Deep Learning and AI aspect of things, don&#8217;t hesitate to check out my other blog posts:</p>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><a href="https://www.ovh.com/blog/deep-learning-explained-to-my-8-year-old-daughter/" data-wpel-link="exclude"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/05/BC0E1AC1-6593-4395-9844-A7D2CB457028.png" alt="" class="wp-image-18099" width="515" height="376" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/05/BC0E1AC1-6593-4395-9844-A7D2CB457028.png 748w, https://blog.ovhcloud.com/wp-content/uploads/2020/05/BC0E1AC1-6593-4395-9844-A7D2CB457028-300x219.png 300w" sizes="auto, (max-width: 515px) 100vw, 515px" /></a></figure></div>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><a href="https://www.ovh.com/blog/what-does-training-neural-networks-mean/" data-wpel-link="exclude"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/04/81921ABA-7642-4CA2-87BF-9B2D92278BF1-1024x538.png" alt="What does training neural networks mean?" class="wp-image-17932" width="512" height="269" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/04/81921ABA-7642-4CA2-87BF-9B2D92278BF1-1024x538.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/04/81921ABA-7642-4CA2-87BF-9B2D92278BF1-300x158.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/04/81921ABA-7642-4CA2-87BF-9B2D92278BF1-768x404.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/04/81921ABA-7642-4CA2-87BF-9B2D92278BF1.png 1200w" sizes="auto, (max-width: 512px) 100vw, 512px" /></a></figure></div>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><a href="https://www.ovh.com/blog/distributed-training-in-a-deep-learning-context/" data-wpel-link="exclude"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/05/20C35ECE-4738-4967-951E-6BC863342D5D-1024x537.png" alt="Distributed Learning in a Deep Learning context" class="wp-image-18106" width="512" height="269" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/05/20C35ECE-4738-4967-951E-6BC863342D5D-1024x537.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/05/20C35ECE-4738-4967-951E-6BC863342D5D-300x157.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/05/20C35ECE-4738-4967-951E-6BC863342D5D-768x403.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/05/20C35ECE-4738-4967-951E-6BC863342D5D.png 1200w" sizes="auto, (max-width: 512px) 100vw, 512px" /></a></figure></div>
<img loading="lazy" decoding="async" src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fhow-pci-express-works-and-why-you-should-care-gpu%2F&amp;action_name=How%20PCI-Express%20works%20and%20why%20you%20should%20care%3F%20%23GPU&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Sponsorship of the JupyterCon 2020: sharing values and supporting with infrastructure</title>
		<link>https://blog.ovhcloud.com/sponsorship-of-the-jupytercon-2020-sharing-values-and-supporting-with-infrastructure/</link>
		
		<dc:creator><![CDATA[Jean-Louis Queguiner]]></dc:creator>
		<pubDate>Thu, 02 Jul 2020 08:06:51 +0000</pubDate>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Ecosystem]]></category>
		<category><![CDATA[Jupyter]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[OVHcloud]]></category>
		<guid isPermaLink="false">https://www.ovh.com/blog/?p=18688</guid>

					<description><![CDATA[Echoing yesterday’s announcement on the Jupyter blog, OVHcloud is proud to support JupyterCon as platinum sponsor and infrastructure donor. As you probably know, Jupyter has been a huge enabler of the programming community, with over 140 Kernels supported such as Python, R, Julia, Spark, Sas, Haskell, Ruby, C++, Go, etc&#8230; The&#160;Jupyter&#160;Project embodies open and collaborative [&#8230;]<img src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fsponsorship-of-the-jupytercon-2020-sharing-values-and-supporting-with-infrastructure%2F&amp;action_name=Sponsorship%20of%20the%20JupyterCon%202020%3A%20sharing%20values%20and%20supporting%20with%20infrastructure&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></description>
										<content:encoded><![CDATA[
<p>Echoing yesterday’s <a href="https://blog.jupyter.org/jupytercon-online-more-than-a-conference-4677cf25a915" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">announcement</a> on the Jupyter blog, OVHcloud is proud to support JupyterCon as a platinum sponsor and infrastructure donor.</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="537" src="https://www.ovh.com/blog/wp-content/uploads/2020/07/D28D9C80-FC63-4332-8E47-27092C1797A8-1024x537.jpeg" alt="Sponsorship of the JupyterCon 2020: sharing values and supporting with infrastructure" class="wp-image-18692" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/07/D28D9C80-FC63-4332-8E47-27092C1797A8-1024x537.jpeg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/D28D9C80-FC63-4332-8E47-27092C1797A8-300x157.jpeg 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/D28D9C80-FC63-4332-8E47-27092C1797A8-768x403.jpeg 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/07/D28D9C80-FC63-4332-8E47-27092C1797A8.jpeg 1200w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>As you probably know, Jupyter has been a huge enabler for the programming community, with over 140 <a href="https://github.com/jupyter/jupyter/wiki/Jupyter-kernels" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">supported kernels</a>, including Python, R, Julia, Spark, SAS, Haskell, Ruby, C++ and Go.</p>



<p>The&nbsp;Jupyter&nbsp;Project embodies open and collaborative dev communities, both key values&nbsp;in the makeup of our own ecosystem.</p>



<p>In recent years, Jupyter and OVHcloud have worked hand in hand on <a href="https://www.ovh.com/blog/mybinder-and-ovh-partnership/" data-wpel-link="exclude">projects such as Mybinder.org</a> and NbViewer. I would like to thank the Jupyter board and NumFOCUS for their openness and trust, which have enabled such a successful partnership.</p>



<p>In 2020,&nbsp;we went further, offering a full year of support for&nbsp;JupyterCon and helping to contribute to the&nbsp;event&#8217;s&nbsp;success.</p>



<p>With the transition to a fully digital experience, COVID-19 travel restrictions are an opportunity not only to reduce&nbsp;travel&nbsp;costs&nbsp;and improve accessibility, but also to&nbsp;limit&nbsp;the carbon footprint tied to knowledge sharing within the ecosystem.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Understanding the anatomy of GPUs using Pokémon</title>
		<link>https://blog.ovhcloud.com/understanding-the-anatomy-of-gpus-using-pokemon/</link>
		
		<dc:creator><![CDATA[Jean-Louis Queguiner]]></dc:creator>
		<pubDate>Wed, 13 Mar 2019 16:25:32 +0000</pubDate>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Deep learning]]></category>
		<category><![CDATA[GPU]]></category>
		<guid isPermaLink="false">https://blog.ovh.com/fr/blog/?p=14482</guid>

					<description><![CDATA[Blog update from May 14, 2020: please welcome Ampere, the beautiful newborn of the NVIDIA GPGPU family. In the previous episode&#8230; In our previous blog post about&#160;Deep Learning, we explained that this technology is all about massively parallel matrix computation, and that these computations boil down to simple operations: + and x. Fact 1:&#160; GPUs are good [&#8230;]]]></description>
										<content:encoded><![CDATA[
<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>Please welcome the beautiful newborn of the NVIDIA GPGPU family: <strong>Ampere</strong></p><cite>BLOG UPDATE FROM MAY 14, 2020</cite></blockquote>



<div class="wp-block-image"><figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="581" height="854" src="https://www.ovh.com/blog/wp-content/uploads/2020/05/Capture-d’écran-2020-05-15-à-17.25.01.png" alt="" class="wp-image-18271" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/05/Capture-d’écran-2020-05-15-à-17.25.01.png 581w, https://blog.ovhcloud.com/wp-content/uploads/2020/05/Capture-d’écran-2020-05-15-à-17.25.01-204x300.png 204w" sizes="auto, (max-width: 581px) 100vw, 581px" /></figure></div>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/05/Capture-d’écran-2020-05-15-à-17.36.57-1024x750.png" alt="" class="wp-image-18277" width="768" height="563" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/05/Capture-d’écran-2020-05-15-à-17.36.57-1024x750.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/05/Capture-d’écran-2020-05-15-à-17.36.57-300x220.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/05/Capture-d’écran-2020-05-15-à-17.36.57-768x562.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/05/Capture-d’écran-2020-05-15-à-17.36.57.png 1146w" sizes="auto, (max-width: 768px) 100vw, 768px" /><figcaption>Congratulations</figcaption></figure></div>



<h3 class="wp-block-heading">In the previous <a href="https://www.ovh.com/fr/blog/deep-learning-explained-to-my-8-year-old-daughter/" rel="nofollow" data-wpel-link="exclude">episode</a>&#8230;</h3>



<p>In our previous blog post about&nbsp;<a href="https://www.ovh.com/fr/blog/deep-learning-explained-to-my-8-year-old-daughter/" rel="nofollow" data-wpel-link="exclude">Deep Learning</a>, we explained that this technology is all about massively parallel matrix computation, and that these computations boil down to simple operations: + and x.</p>
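


<p>To make the &#8220;+ and x&#8221; point concrete, here is a minimal sketch in plain Python. The function name <code>matmul</code> and the sample matrices are illustrative assumptions, not code from the previous post. Each output cell is built from nothing but multiplications and additions, and no cell depends on any other, which is why the work spreads so easily across many cores.</p>

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows, using only + and x."""
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    result = [[0] * cols for _ in range(rows)]
    for i in range(rows):          # every (i, j) cell is independent,
        for j in range(cols):      # so a GPU could compute them all at once
            total = 0
            for k in range(inner):
                total += a[i][k] * b[k][j]   # one multiply, one add
            result[i][j] = total
    return result


# Hypothetical 2x2 example values, chosen only for illustration.
weights = [[1, 2], [3, 4]]
inputs = [[5, 6], [7, 8]]
product = matmul(weights, inputs)
print(product)
```

<p>Here the loops run one after the other; on a GPU, the two outer loops are the part that would be handed out to thousands of threads.</p>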



<div class="wp-block-image"><figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="537" src="https://www.ovh.com/blog/wp-content/uploads/2020/05/46A55CAC-42D2-4782-B6D8-03F9A8C49C40-1024x537.png" alt="" class="wp-image-18283" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/05/46A55CAC-42D2-4782-B6D8-03F9A8C49C40-1024x537.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/05/46A55CAC-42D2-4782-B6D8-03F9A8C49C40-300x157.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/05/46A55CAC-42D2-4782-B6D8-03F9A8C49C40-768x403.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/05/46A55CAC-42D2-4782-B6D8-03F9A8C49C40.png 1200w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure></div>



<h3 class="wp-block-heading">Fact 1:&nbsp; GPUs are good for (drum roll)&#8230;</h3>



<p>Once you get that Deep Learning is just massively parallel matrix multiplications and additions, the magic happens. General-Purpose Graphics Processing Units (GPGPUs), i.e. GPUs or variants of GPUs designed for something other than graphics processing, are perfect for&#8230;</p>



<div class="wp-block-image"><figure class="aligncenter"><img loading="lazy" decoding="async" width="475" height="263" src="/blog/wp-content/uploads/2019/02/ComplicatedOldIcterinewarbler-size_restricted.gif" alt="" class="wp-image-14672"/></figure></div>



<p>matrix multiplications and additions!</p>



<p>Perfect, isn&#8217;t it? But why? Let me tell you a little story.</p>



<h3 class="wp-block-heading">Fact 2: There was a time when GPUs were just GPUs</h3>



<p>Yes, you read that correctly&#8230;</p>



<p>The first GPUs in the 90s were designed in a very linear way. Engineers took the process used for graphical rendering and implemented it directly in hardware.</p>



<p>To keep it simple, this is what a graphical rendering process looks like:</p>



<div class="wp-block-image"><figure class="aligncenter is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2019/03/IMG_0148-1024x841.png" alt="Graphical rendering process" class="wp-image-15125" width="768" height="631" srcset="https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_0148-1024x841.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_0148-300x246.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_0148-768x630.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_0148-1200x985.png 1200w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_0148.png 1871w" sizes="auto, (max-width: 768px) 100vw, 768px" /></figure></div>



<p>Uses for GPUs included transformation, building lighting effects, building triangle setups and clipping, and integrating rendering engines at a scale that was not achievable at the time (tens of millions of polygons per second).</p>



<p>The first GPUs integrated the various steps of image processing and rendering in a linear way. Each part of the process had predefined hardware components associated with vertex shaders, tessellation modules, geometry shaders, etc.</p>



<p>In short, graphics cards were initially designed to perform graphical processing. What a surprise!</p>



<h3 class="wp-block-heading">Fact 3: CPUs are sports cars, GPUs are massive trucks</h3>



<p>As explained earlier, for image processing and rendering, you don&#8217;t want your image to be generated pixel by pixel – you want it in a single shot. That means that every pixel of the image – representing every object the camera is pointing at, at a given time, in a given position – needs to be calculated at once.</p>



<p>It&#8217;s a complete contrast with <strong>CPU</strong> logic, where operations are meant to be executed sequentially. As a result, <strong>GPGPUs</strong> needed a massively parallel general-purpose architecture to be able to process all the points (vertices), build all the meshes (tessellation), build the lighting, perform the object transformation from the absolute frame of reference, apply textures, and perform shading (I&#8217;m probably still missing some parts!). However, the purpose of this blog post is not to look in depth at image processing and rendering; we will do that in another blog post in the future.</p>



<p>As explained in our previous post, CPUs are like sports cars, able to calculate a chunk of data really fast with minimal latency, while GPUs are trucks, moving lots of data at once, but suffering from latency as a result.</p>



<p>Here is a nice video from MythBusters, where the two concepts of CPU and GPU are explained:</p>


<div class="lazyblock-youtube-gdpr-compliant-ZqsNoD wp-block-lazyblock-youtube-gdpr-compliant"><script type="module">
  import 'https://blog.ovhcloud.com/wp-content/assets/ovhcloud-gdrp-compliant-embedding-widgets/src/ovhcloud-gdrp-compliant-spreaker.js';
</script>
      
      <ovhcloud-gdrp-compliant-spreaker
          spreaker=""
          debug></ovhcloud-gdrp-compliant-spreaker> 

</div>


<h3 class="wp-block-heading">Fact 4: 2006&nbsp;–&nbsp;NVIDIA killed the image processing Taylorism</h3>



<div class="wp-block-image"><figure class="aligncenter"><img decoding="async" src="https://thumbs.gfycat.com/DirtyHastyAustrianpinscher-size_restricted.gif" alt="Image search result for &quot;Modern Times gif&quot;"/></figure></div>



<p>Previously, image processing relied on specialised manpower (hardware) at every stage of the production line in the image factory.</p>



<p>This all changed in 2006, when NVIDIA introduced General-Purpose Graphics Processing Units built on Arithmetic Logic Units (ALUs), aka CUDA cores, which were able to run multi-purpose computations (a bit like a Jean-Claude Van Damme of GPU computation units!).</p>



<div class="wp-block-image"><figure class="aligncenter"><img decoding="async" src="https://media.lelombrik.net/t/64037621a78f86abb4c7c4e53a6c2b89/p/01.gif" alt=""/><figcaption>GoDaddy Commercial (2013) featuring Jean-Claude Van Damme. Source: https://imgur.com/r/gifs/PvuZxBZ</figcaption></figure></div>



<p>Even today,&nbsp;<a href="https://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf" rel="nofollow external noopener noreferrer" data-wpel-link="external" target="_blank">modern GPU architectures</a> (such as Fermi, Kepler or Volta) still include non-general cores: Special Function Units (SFUs), which run high-performance mathematical graphical operations such as sine, cosine, reciprocal and square root, and Texture Mapping Units (TMUs), which handle the high-dimension matrix operations involved in image texture mapping.</p>



<h3 class="wp-block-heading">Fact 5: GPGPUs can be explained simply with Pokémon!</h3>



<p>GPU architectures can seem difficult to understand at first, but trust me&#8230; they are not!</p>



<p>Here is my gift to you: a <a href="https://bulbapedia.bulbagarden.net/wiki/Pok%C3%A9dex" rel="nofollow external noopener noreferrer" data-wpel-link="external" target="_blank">Pokédex</a> to help you understand GPUs in simple terms.</p>



<h3 class="wp-block-heading">The&nbsp;<em>Micro-Architecture </em>Family</h3>



<h4 class="wp-block-heading">Here&#8217;s how you use it&#8230;</h4>



<p>You basically have four families of cards:</p>



<p>This family will already be known to many of you. We are, of course, talking about Fermi, Maxwell, Kepler, Volta, Ampere etc.</p>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/05/gpu-families-387x1024.jpg" alt="" class="wp-image-18293" width="387" height="1024" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/05/gpu-families-387x1024.jpg 387w, https://blog.ovhcloud.com/wp-content/uploads/2020/05/gpu-families-113x300.jpg 113w, https://blog.ovhcloud.com/wp-content/uploads/2020/05/gpu-families-768x2032.jpg 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/05/gpu-families-581x1536.jpg 581w, https://blog.ovhcloud.com/wp-content/uploads/2020/05/gpu-families-774x2048.jpg 774w, https://blog.ovhcloud.com/wp-content/uploads/2020/05/gpu-families-scaled.jpg 968w" sizes="auto, (max-width: 387px) 100vw, 387px" /><figcaption>A beautiful picture of the newborn with the rest of the family</figcaption></figure></div>



<h4 class="wp-block-heading">The <em>Architecture</em> Family</h4>



<p>This is the center, where the magic happens: orchestration, cache, workload scheduling&#8230; It&#8217;s the brain of the GPU.</p>



<figure class="wp-block-table"><table><tbody><tr><td><figure><img loading="lazy" decoding="async" class="aligncenter wp-image-15084" src="/blog/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.29.56.png" alt="" width="200" height="263" srcset="https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.29.56.png 764w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.29.56-228x300.png 228w" sizes="auto, (max-width: 200px) 100vw, 200px" /></figure></td><td><figure><img loading="lazy" decoding="async" class="aligncenter wp-image-15083" src="/blog/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.29.58.png" alt="" width="200" height="263" srcset="https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.29.58.png 764w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.29.58-228x300.png 228w" sizes="auto, (max-width: 200px) 100vw, 200px" /></figure></td><td><figure><img loading="lazy" decoding="async" class="aligncenter wp-image-15082" src="/blog/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.30.00.png" alt="" width="200" height="263" srcset="https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.30.00.png 764w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.30.00-228x300.png 228w" sizes="auto, (max-width: 200px) 100vw, 200px" /></figure></td></tr></tbody></table></figure>



<h4 class="wp-block-heading">The <em>Multi-Core Units</em>&nbsp;<i>(aka CUDA Cores)&nbsp;</i>Family</h4>



<p>This represents the physical core, where the maths computations actually happen.</p>



<figure class="wp-block-table"><table><tbody><tr><td><figure><img loading="lazy" decoding="async" class="aligncenter wp-image-15081" src="/blog/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.30.02.png" alt="" width="200" height="263" srcset="https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.30.02.png 764w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.30.02-228x300.png 228w" sizes="auto, (max-width: 200px) 100vw, 200px" /></figure></td><td><figure><img loading="lazy" decoding="async" class="aligncenter wp-image-15080" src="/blog/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.30.04.png" alt="" width="200" height="263" srcset="https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.30.04.png 764w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.30.04-228x300.png 228w" sizes="auto, (max-width: 200px) 100vw, 200px" /></figure></td><td><figure><img loading="lazy" decoding="async" class="aligncenter wp-image-15079" src="/blog/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.30.06.png" alt="" width="200" height="263" srcset="https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.30.06.png 764w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.30.06-228x300.png 228w" sizes="auto, (max-width: 200px) 100vw, 200px" /></figure></td></tr><tr><td><figure><img loading="lazy" decoding="async" class="aligncenter wp-image-15078" src="/blog/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.30.09.png" alt="" width="200" height="263" srcset="https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.30.09.png 764w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.30.09-228x300.png 228w" sizes="auto, (max-width: 200px) 100vw, 200px" /></figure></td><td><figure><img loading="lazy" decoding="async" class="aligncenter 
wp-image-15077" src="/blog/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.30.12.png" alt="" width="200" height="263" srcset="https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.30.12.png 764w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.30.12-228x300.png 228w" sizes="auto, (max-width: 200px) 100vw, 200px" /></figure></td><td><figure><img loading="lazy" decoding="async" class="aligncenter wp-image-15076" src="/blog/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.30.14.png" alt="" width="200" height="263" srcset="https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.30.14.png 764w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.30.14-228x300.png 228w" sizes="auto, (max-width: 200px) 100vw, 200px" /></figure></td></tr><tr><td><figure><img loading="lazy" decoding="async" class="aligncenter wp-image-15075" src="/blog/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.30.16.png" alt="" width="200" height="263" srcset="https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.30.16.png 764w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.30.16-228x300.png 228w" sizes="auto, (max-width: 200px) 100vw, 200px" /></figure></td><td><figure><img loading="lazy" decoding="async" class="aligncenter wp-image-15074" src="/blog/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.30.26.png" alt="" width="200" height="263" srcset="https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.30.26.png 764w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.30.26-228x300.png 228w" sizes="auto, (max-width: 200px) 100vw, 200px" /></figure></td><td>&nbsp;</td></tr></tbody></table></figure>



<h4 class="wp-block-heading">The<em>&nbsp;Programming Model</em> Family</h4>



<p>The different layers of the programming model are used to abstract the GPU&#8217;s parallel computation for the programmer. They also make the code portable across GPU architectures.</p>



<figure class="wp-block-table"><table><tbody><tr><td><figure><img loading="lazy" decoding="async" class="aligncenter wp-image-15089" src="/blog/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.29.46.png" alt="" width="342" height="450" srcset="https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.29.46.png 764w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.29.46-228x300.png 228w" sizes="auto, (max-width: 342px) 100vw, 342px" /></figure></td><td><figure><img loading="lazy" decoding="async" class="aligncenter wp-image-15088" src="/blog/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.29.48.png" alt="" width="352" height="464" srcset="https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.29.48.png 764w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.29.48-228x300.png 228w" sizes="auto, (max-width: 352px) 100vw, 352px" /></figure></td><td><figure><img loading="lazy" decoding="async" class="aligncenter wp-image-15087" src="/blog/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.29.49.png" alt="" width="346" height="456" srcset="https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.29.49.png 764w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.29.49-228x300.png 228w" sizes="auto, (max-width: 346px) 100vw, 346px" /></figure></td></tr><tr><td><figure><img loading="lazy" decoding="async" class="aligncenter wp-image-15086" src="/blog/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.29.51.png" alt="" width="352" height="464" srcset="https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.29.51.png 764w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.29.51-228x300.png 228w" sizes="auto, (max-width: 352px) 100vw, 352px" /></figure></td><td><figure><img loading="lazy" decoding="async" class="aligncenter 
wp-image-15085" src="/blog/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.29.53.png" alt="" width="344" height="453" srcset="https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.29.53.png 764w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/Screen-Shot-2019-03-08-at-14.29.53-228x300.png 228w" sizes="auto, (max-width: 344px) 100vw, 344px" /></figure></td></tr></tbody></table></figure>
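


<p>As a rough illustration of what this programming model abstracts, here is a minimal sketch in plain Python of how CUDA-style code maps a (block, thread) pair onto one element of the data. The index formula, block index times block size plus thread index, is the usual CUDA global index computation. The function names <code>global_index</code> and <code>launch_add</code>, and the tiny sizes, are hypothetical and only serve the example. The loops below run sequentially, whereas on a real GPU each pair would be a separate thread.</p>

```python
def global_index(block_idx, block_dim, thread_idx):
    """Element handled by one thread: blockIdx.x * blockDim.x + threadIdx.x."""
    return block_idx * block_dim + thread_idx


def launch_add(x, y, block_dim):
    """Emulate a one-dimensional kernel launch computing out[i] = x[i] + y[i]."""
    n = len(x)
    grid_dim = (n + block_dim - 1) // block_dim   # enough blocks to cover n
    out = [None] * n
    for block_idx in range(grid_dim):             # blocks run in parallel on a GPU
        for thread_idx in range(block_dim):       # so do the threads of a block
            i = global_index(block_idx, block_dim, thread_idx)
            if i < n:                             # guard: the last block may overshoot
                out[i] = x[i] + y[i]
    return out


# Hypothetical data: 5 elements split into blocks of 2 threads (3 blocks).
summed = launch_add([1, 2, 3, 4, 5], [10, 20, 30, 40, 50], block_dim=2)
print(summed)
```

<p>Changing <code>block_dim</code> changes how the work is split but not the result, which is a small-scale version of the portability point made above.</p>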



<h3 class="wp-block-heading">How to play</h3>



<ol class="wp-block-list"><li>Start by choosing a card from the <em>Micro-Architecture</em> family</li><li>Look at the components, and choose the appropriate card from the <em>Architecture</em>&nbsp;family</li><li>Look at the components within the<em> Micro-Architecture</em> family and pick them from the <i>Multi-Core Units </i>family, then place them under the <em>Architecture</em>&nbsp;card</li><li>Now, if you want to know how to program a GPU, place the <i>Programming Model &#8211; Multi-Core Units</i>&nbsp;special card on top of the&nbsp;<em>Multi-Core Units&nbsp;</em>cards</li><li>Finally, on top of the <i>Programming Model &#8211; Multi-Core Units </i>special card, place all the <i>Programming Model&nbsp;</i>cards near the <em>SM</em></li><li>You should then have something that looks like this:</li></ol>



<div class="wp-block-image"><figure class="aligncenter"><img loading="lazy" decoding="async" width="4032" height="3024" src="/blog/wp-content/uploads/2019/03/IMG_2139.jpg" alt="" class="wp-image-15108" srcset="https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2139.jpg 4032w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2139-300x225.jpg 300w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2139-768x576.jpg 768w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2139-1024x768.jpg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2139-1200x900.jpg 1200w" sizes="auto, (max-width: 4032px) 100vw, 4032px" /></figure></div>



<h3 class="wp-block-heading">Examples of card configurations:</h3>



<h4 class="wp-block-heading">Fermi</h4>



<div class="wp-block-image"><figure class="aligncenter"><img loading="lazy" decoding="async" width="4032" height="3024" src="/blog/wp-content/uploads/2019/03/IMG_2148.jpg" alt="" class="wp-image-15098" srcset="https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2148.jpg 4032w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2148-300x225.jpg 300w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2148-768x576.jpg 768w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2148-1024x768.jpg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2148-1200x900.jpg 1200w" sizes="auto, (max-width: 4032px) 100vw, 4032px" /></figure></div>



<h4 class="wp-block-heading">Kepler</h4>



<div class="wp-block-image"><figure class="aligncenter"><img loading="lazy" decoding="async" width="4032" height="3024" src="/blog/wp-content/uploads/2019/03/IMG_2147.jpg" alt="" class="wp-image-15100" srcset="https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2147.jpg 4032w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2147-300x225.jpg 300w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2147-768x576.jpg 768w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2147-1024x768.jpg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2147-1200x900.jpg 1200w" sizes="auto, (max-width: 4032px) 100vw, 4032px" /></figure></div>



<h4 class="wp-block-heading">Maxwell</h4>



<div class="wp-block-image"><figure class="aligncenter"><img loading="lazy" decoding="async" width="4032" height="3024" src="/blog/wp-content/uploads/2019/03/IMG_2146.jpg" alt="" class="wp-image-15101" srcset="https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2146.jpg 4032w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2146-300x225.jpg 300w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2146-768x576.jpg 768w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2146-1024x768.jpg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2146-1200x900.jpg 1200w" sizes="auto, (max-width: 4032px) 100vw, 4032px" /></figure></div>



<h4 class="wp-block-heading">Pascal</h4>



<div class="wp-block-image"><figure class="aligncenter"><img loading="lazy" decoding="async" width="4032" height="3024" src="/blog/wp-content/uploads/2019/03/IMG_2143.jpg" alt="" class="wp-image-15104" srcset="https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2143.jpg 4032w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2143-300x225.jpg 300w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2143-768x576.jpg 768w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2143-1024x768.jpg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2143-1200x900.jpg 1200w" sizes="auto, (max-width: 4032px) 100vw, 4032px" /></figure></div>



<h4 class="wp-block-heading">Volta</h4>



<div class="wp-block-image"><figure class="aligncenter"><img loading="lazy" decoding="async" width="4032" height="3024" src="/blog/wp-content/uploads/2019/03/IMG_2140.jpg" alt="" class="wp-image-15107" srcset="https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2140.jpg 4032w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2140-300x225.jpg 300w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2140-768x576.jpg 768w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2140-1024x768.jpg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2140-1200x900.jpg 1200w" sizes="auto, (max-width: 4032px) 100vw, 4032px" /></figure></div>



<h4 class="wp-block-heading">Turing</h4>



<div class="wp-block-image"><figure class="aligncenter"><img loading="lazy" decoding="async" width="4032" height="3024" src="/blog/wp-content/uploads/2019/03/IMG_2145.jpg" alt="" class="wp-image-15102" srcset="https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2145.jpg 4032w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2145-300x225.jpg 300w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2145-768x576.jpg 768w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2145-1024x768.jpg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2019/03/IMG_2145-1200x900.jpg 1200w" sizes="auto, (max-width: 4032px) 100vw, 4032px" /></figure></div>



<p>After playing around with different&nbsp;<em>Micro-Architectures</em>, <em>Architectures</em> and <em>Multi-Core Units</em> for a bit, you should see that GPUs are just as simple as Pokémon!</p>



<p>Enjoy the attached PDF, which will allow you to print your own GPU Pokédex.&nbsp;You can download it here: <a href="https://www.ovh.com/blog/wp-content/uploads/2020/05/GPU-Cards-1.pdf" target="_blank" rel="noreferrer noopener" data-wpel-link="exclude">GPU Cards Game</a></p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
