How to serve LLMs with vLLM and OVHcloud AI Deploy
In this tutorial, we will learn how to serve Large Language Models (LLMs) using vLLM and the OVHcloud AI Products.
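To give a flavour of what the tutorial covers, here is a minimal sketch of text generation with vLLM's offline Python API; the model name and prompt are placeholders, and the tutorial's actual AI Deploy setup (serving an HTTP endpoint) may differ.

```python
# Minimal sketch of generating text with vLLM's offline Python API.
# The model name and prompt below are placeholders, not taken from the tutorial.
from vllm import LLM, SamplingParams

llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2")   # any vLLM-supported Hugging Face model
params = SamplingParams(temperature=0.8, max_tokens=128)

outputs = llm.generate(["What is OVHcloud AI Deploy?"], params)
for out in outputs:
    print(out.outputs[0].text)
```

For serving over HTTP, recent vLLM releases also ship an OpenAI-compatible server (for example `vllm serve <model>`), which is the kind of endpoint typically packaged into an AI Deploy app.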
Fine-Tuning LLaMA 2 Models using a single GPU, QLoRA and AI Notebooks
In this tutorial, we will walk you through the process of fine-tuning LLaMA 2 models, providing step-by-step instructions. All the code related to this article is available in our dedicated GitHub repository. You can reproduce all the experiments with OVHcloud AI Notebooks. On July 18, 2023, Meta released LLaMA 2, the latest version of…
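As a hedged sketch of the approach described in the article, the snippet below loads a LLaMA 2 checkpoint in 4-bit and attaches LoRA adapters with Hugging Face transformers, peft and bitsandbytes; the model id, target modules and hyperparameters are illustrative assumptions, not the exact values from the repository.

```python
# Minimal sketch of loading LLaMA 2 in 4-bit and attaching LoRA adapters (QLoRA).
# Model id, target modules and hyperparameters are illustrative placeholders;
# see the article's GitHub repository for the exact training script.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "meta-llama/Llama-2-7b-hf"   # gated model: requires access approval on Hugging Face

bnb_config = BitsAndBytesConfig(        # 4-bit NF4 quantization, the "Q" in QLoRA
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(               # small trainable adapters on the attention projections
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()      # only a small fraction of weights are trained
```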
Using GPU on Managed Kubernetes Service with NVIDIA GPU operator
Two years after launching our Managed Kubernetes service, we’re seeing a lot of diversity in the workloads that run in production. We have been challenged by some customers looking for GPU acceleration, and have teamed up with our partner NVIDIA to deliver high performance GPUs on Kubernetes. We’ve done it in a way that combines…
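As an illustration of what the GPU operator enables, here is a minimal sketch that schedules a pod requesting one GPU through the official Kubernetes Python client; the pod name, image and namespace are placeholders, and it assumes the NVIDIA GPU operator (or device plugin) is already installed on the cluster.

```python
# Minimal sketch: schedule a pod that requests one GPU on a Managed Kubernetes cluster.
# Pod name, image and namespace are placeholders; the cluster must already expose
# the nvidia.com/gpu resource via the NVIDIA GPU operator or device plugin.
from kubernetes import client, config

config.load_kube_config()                    # uses your local kubeconfig

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="cuda-smoke-test"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="cuda",
                image="nvidia/cuda:12.2.0-base-ubuntu22.04",
                command=["nvidia-smi"],      # prints the GPU visible inside the container
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "1"}   # one GPU allocated by the device plugin
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```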
Managing GPU pools efficiently in AI pipelines
A growing number of companies are using artificial intelligence on a daily basis — and dealing with the back-end architecture can reveal some unexpected challenges. Whether the machine learning workload involves fraud detection, forecasts, chatbots, computer vision or NLP, it will need frequent access to computing power for training and fine-tuning. GPUs have proven to…
How PCI-Express works and why you should care? #GPU
What is PCI-Express? Everyone, and I mean everyone, should pay attention to it when doing intensive Machine Learning / Deep Learning training. As I explained in a previous blog post, GPUs have massively accelerated the evolution of Artificial Intelligence. However, building a GPU server is not that easy, and failing to create an appropriate infrastructure can have…
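To see why the PCI-Express link deserves attention, here is a back-of-the-envelope sketch; the bandwidth figures are the theoretical maxima of x16 links and the batch shape is an arbitrary ImageNet-style example, not numbers from the article.

```python
# Back-of-the-envelope sketch of why the PCI-Express link matters for training.
# Bandwidth figures are theoretical maxima of x16 links; the batch shape is
# an arbitrary ImageNet-style example, not taken from the article.
PCIE3_X16_GBPS = 15.75   # PCIe 3.0 x16, ~15.75 GB/s
PCIE4_X16_GBPS = 31.5    # PCIe 4.0 x16, ~31.5 GB/s

batch_bytes = 256 * 3 * 224 * 224 * 4   # 256 float32 images of shape 3x224x224
batch_gb = batch_bytes / 1e9            # ~0.15 GB per batch

for name, bw in [("PCIe 3.0 x16", PCIE3_X16_GBPS), ("PCIe 4.0 x16", PCIE4_X16_GBPS)]:
    print(f"{name}: {batch_gb / bw * 1e3:.1f} ms to copy one batch to the GPU")

# If the GPU finishes its forward/backward pass faster than this copy,
# the PCIe link, not the GPU, becomes the bottleneck.
```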
Distributed Training in a Deep Learning Context
Previously on the OVHcloud Blog… In previous blog posts we discussed a high-level approach to deep learning, as well as what is meant by ‘training’ in relation to Deep Learning. Following those articles, I received lots of questions in my Twitter inbox, especially regarding how GPUs actually work. I decided, therefore, to write…
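As a concrete, hedged illustration of data-parallel training (one common flavour of distributed training, not necessarily the exact setup discussed in the article), here is a minimal PyTorch DistributedDataParallel sketch with a toy model, meant to be launched with torchrun.

```python
# Minimal sketch of data-parallel training with PyTorch DistributedDataParallel.
# Launch with `torchrun --nproc_per_node=<num_gpus> train.py`; the model and
# data are toy placeholders, not taken from the article.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")        # one process per GPU
    local_rank = int(os.environ["LOCAL_RANK"])     # set by torchrun
    torch.cuda.set_device(local_rank)
    device = f"cuda:{local_rank}"

    model = torch.nn.Linear(1024, 10).to(device)   # toy model
    model = DDP(model, device_ids=[local_rank])    # gradients are synchronized across GPUs
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    for step in range(100):
        x = torch.randn(32, 1024, device=device)             # fake batch
        y = torch.randint(0, 10, (32,), device=device)
        loss = torch.nn.functional.cross_entropy(model(x), y)
        optimizer.zero_grad()
        loss.backward()                            # all-reduce of gradients happens here
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```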
What does Training Neural Networks mean?
In a previous blog post we discussed general concepts surrounding Deep Learning. In this blog post, we will go deeper into the basic concepts of training a (deep) Neural Network. Where does “Neural” come from? As you may know, a biological neuron is composed of multiple dendrites, a nucleus and an axon (if only…
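As a tiny, self-contained illustration of what “training” boils down to (the data and learning rate are made up, not from the article), the sketch below adjusts the weights of a single artificial neuron by gradient descent until its output matches the targets.

```python
# Tiny sketch of what "training" means: nudge a neuron's weights so its output
# gets closer to the targets. Data and learning rate are arbitrary placeholders.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(100, 3))               # 100 samples, 3 inputs ("dendrites")
true_w = np.array([1.5, -2.0, 0.5])
y = (x @ true_w > 0).astype(float)          # targets the neuron should learn

w = np.zeros(3)                             # start knowing nothing
for epoch in range(200):
    z = x @ w                               # weighted sum of the inputs
    pred = 1 / (1 + np.exp(-z))             # sigmoid activation (the "axon" output)
    grad = x.T @ (pred - y) / len(y)        # gradient of the cross-entropy loss
    w -= 0.5 * grad                         # gradient descent step

print(w)                                    # weights move toward the true direction
```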
Understanding the anatomy of GPUs using Pokémon
Please welcome the beautiful newborn in the Nvidia GPGPU family: Ampere (blog updated on May 14, 2020). In the previous episode… In our previous blog post about Deep Learning, we explained that this technology is all about massive parallel matrix computation, and that these computations are simplistic operations: + and ×. Fact 1: GPUs are good…
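A small sketch of that claim: a matrix product is nothing more than many independent additions and multiplications, which is exactly the kind of work a GPU parallelizes across thousands of cores (the matrix sizes below are arbitrary).

```python
# Small sketch: a matrix product is just many independent "+" and "x" operations.
import numpy as np

A = np.random.rand(4, 3)
B = np.random.rand(3, 5)

# Naive version: every output cell is a sum of element-wise products.
C = np.zeros((4, 5))
for i in range(4):
    for j in range(5):
        for k in range(3):
            C[i, j] += A[i, k] * B[k, j]    # one "x" and one "+" per step

assert np.allclose(C, A @ B)                # same result as the optimized matmul
# On a GPU, thousands of these (i, j) cells are computed at the same time.
```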
Deep Learning explained to my 8-year-old daughter
Machine Learning and especially Deep Learning are hot topics and you are sure to have come across the buzzword “Artificial Intelligence” in the media. Yet these are not new concepts. The first Artificial Neural Network (ANN) was introduced in the 40s. So why all the recent interest around neural networks and Deep Learning? We will explore this…