OVHcloud Engineering Archives - OVHcloud Blog

Devoxx France 2026: feedback and highlights

Aurélie Vache — Tue, 19 May 2026 05:45:45 +0000

From April 22 to 24, 2026, the Devoxx France conference took place at the Palais des Congrès in Paris.

Aurélie Vache and Stéphane Philippart attended as dit 19 other OVHcloud employees. In this blog post, they share their thoughts and feedback from this 14th edition of Devoxx France.

Devoxx France 2026: The AI Edition

Devoxx France 2026 is one of Europe’s biggest independent developer conferences. Formerly focused centrally on Java, over the past few years, the conference has also focused on Architecture, Data & Analytics, Development practices, Front-end & UX, Java/JVM, Security & Privacy, Cloud and non-technical talks about people and culture.

Key figures from the 2026 edition:

4,980 attendees (The largest attendance on record)
307 speakers
259 talks
70+ sponsors

As might be expected, AI was the central theme of this edition, with a large number of the talks focused on AI topics. Indeed, there were 65 sessions out of 259 about AI and Agentic Systems, the most discussed topic!

Notably this year, and perhaps even more than in previous years, we could clearly see attendees arriving early to secure seats for their favorite talks. Even so, there ended up being a lot of disappointment – especially on the first day – as several sessions were already at full capacity minutes before they even started.

This was particularly true for sessions featuring multiple OVHcloud speakers 💪.

Keynotes

The keynote sessions (“plenary sessions”) were also heavily centered on Artificial Intelligence, but with a notably broader lens beyond pure technology. Rather than focusing only on tools or LLM implementation, the talks explored AI through the intersecting dimensions of power, governance, cybersecurity, human transformation, and geopolitics.

Some highlights from the keynotes:

“In 50 years, AI has multiplied its power, along with the challenges of governance and cybersecurity” – Laurence Devillers (@lau_devil)

Jean-Gabriel Ganascia (@Quecalcoatle) questioned the promise of AI as a force that could free humans from effort, raising deeper reflections on what this means for our relationship with work and meaning.

Loup Cellard (@CellardLoup) examined the implications of foreign investments in AI infrastructure, shedding light on the geopolitical and strategic stakes behind these technologies.

Meet & Greet

Devoxx France consists of three days of conferences, sponsor booths to discover, and Thursday evening’s unmissable annual tradition: the Meet & Greet.

Thursday night’s Meet & Greet is a major community event built around networking and social sessions like BOFs (Birds of a Feather) and seed networking. It’s one of the signature traditions of the conference, beyond talks and sponsor booths.

This evening event is free, open to the public with pre-registration, and offers a genuine moment for connection, sharing, and conversation over a drink and a plate of charcuterie and cheese 😇.

It’s also the opportunity to discover the fun of “Voxx Jam”, the community-party, music-oriented side of Devoxx/Voxxed culture 🎸.

OVHcloud Presence

At the OVHcloud booth, we were a team of 8 speakers and 11 colleagues from Tech, HR, and Sales, and their dynamic presence really made a difference. Engaging in topics like AI, Public Cloud, Domain Names, Observability, Quantum technologies, and more, we had many insightful conversations throughout the event.

We also discussed AI topics at the booth, which was of course the main theme of the conference, but not the only one.

A lot of conversations also focused on sovereignty. Three years ago, people were saying: “I don’t care about sovereignty, I’ll just choose the cheapest option.” This year, the tone has clearly changed, “How can we use your sovereign products?”

There is a real shift happening, and once again, being present at events like this is essential to witness and take part in these evolving discussions.

It was truly a top-tier booth experience for all of us💪.

Of course, the goal of our booth was so attendees could discuss with our teams, but also so we could engage them through our very own video game, “Gaming Camp: Beat Cloud Villains!”. The specially designed video game’s description: “Join the fight against the villains of the cloud. Take on Hidden Cost, Jailor Stack, and Autonomous Zero, and prove yourself as a true Guardian of the Cloud.”

Players were welcomed to step into a two-player fighting game inspired by the style of Street Fighter, where strategy and skill are your best weapons. Game on!

We also wanted to say a word about the success of our Schrödinger cat (Quantum) swag – socks, keychains, badges – they were a huge hit, and often sparked great conversations throughout the event.

OVHcloud Speakers & Talks

Getting accepted to Devoxx France is not easy, so we were proud to be included with 8 speakers and 11 talks! We were the most represented company in terms of talks at Devoxx France 2026, and ranked in the top 3 by number of speakers 💪.

Congratulations to Benoit Masson, Fanny Bouton, Mathieu Busquet, Sébastien Ferrer, Théo Bougé, and Héla Ben Khalfallah, Stéphane Philippart & Aurélie Vache for their talks 👏. A large number of attendees joined, and the sessions were all very high quality.

Find here the topics of their talks:

“Question pour un cluster Kubernetes : Quiz sur Kubernetes & ses concepts”

🏴󠁧󠁢󠁥󠁮󠁧󠁿 “The Ultimate Kubernetes Challenge: An Interactive Trivia Game on concepts, components, usage…”

🎤 Speaker: Aurélie Vache

“Kubernetes est devenu le standard de facto pour déployer et exploiter des applications conteneurisées. Nous l’utilisons, ainsi que son ecosystème, au quotidien, mais le connaît-on si bien ?

Tout au long de ce talk, avec un mix de quiz et de démos en live, vous découvrirez (ou redécouvrirez) les concepts clés de Kubernetes (pods, secrets, services, namespaces…), les composants interne mais aussi les bonnes pratiques d’utilisation.

Un format original avec un quiz, du fun et des démos, qui conviendra aussi bien aux débutants qu’aux confirmés, afin d’apprendre, réviser et challenger vos connaissances du merveilleux monde de Kubernetes et de son écosystème, tout en s’amusant.

Soyez là ou le plus rapide pour tenter de gagner des cadeaux !”

🏴󠁧󠁢󠁥󠁮󠁧󠁿 Kubernetes has become the de facto standard for deploying and operating containerized applications. We use it, as well as its ecosystem, on a daily basis, but do we know them as well as we think we do?

With a mix of quiz and live demos, come learn and/or improve your knowledge. You will discover (or rediscover) the key concepts of Kubernetes (pods, secrets, services…), internal components but also best practices.

In this fun and dynamic talk, come compete throughout the quiz and explore the wonderful world of Kubernetes.
Icing on the cake: the first will win some swags.

🎥 Replay.

“QR Codes : suivez les points sans vous perdre !”

🏴󠁧󠁢󠁥󠁮󠁧󠁿 “QR codes: follow the dots without getting lost!“

🎤 Speaker: Benoît Masson & Sébastien Chedor (OnePoint)

“Les QR Codes, tout le monde connaît et les utilise régulièrement. Mais savez-vous vraiment comment ils fonctionnent, pourquoi c’est aussi rapide et fiable, même avec une caméra de faible qualité ou un code en partie caché ou détérioré ?

Nous vous proposons de coder ensemble un lecteur de QR Codes, avec un minimum d’outils :
* capture et analyse de la vidéo issue de la webcam pour détecter la position du code, à l’aide d’OpenCV
* extraction et décodage du contenu, avec correction d’erreur grâce à l’algorithme de Reed-Solomon.

À la fin de cette session, vous devriez être capables de décoder un QR Code à l’oeil nu 🕵️ (et un brouillon…)”.

🏴󠁧󠁢󠁥󠁮󠁧󠁿 “Everyone knows QR codes and uses them regularly. But do you really know how they work, and why they are so fast and reliable, even with a low-quality camera or a partially hidden or damaged code?

We propose coding a QR code reader together, using a minimum number of tools:
* capturing and analysing webcam video to detect the position of the code, using OpenCV
* extracting and decoding the content, with error correction using the Reed-Solomon algorithm

By the end of this session, you should be able to decode a QR code with the naked eye 🕵️— and a rough sheet of paper…”

🎥 Replay.

“Noms de domaines : la grande histoire des petites extensions”

🏴󠁧󠁢󠁥󠁮󠁧󠁿 “Domain names: the big story behind small extensions”

🎤 Speakers: Benoît Masson & Theo Bougé

“Derrière les quelques lettres qui suivent un point (.com, .fr, .ai…) se cache un univers riche de stratégies techniques, d’enjeux géopolitiques et de batailles commerciales.

À l’approche du nouveau round de l’ICANN prévu en 2026 qui va autoriser de nouvelles extensions, il est temps de revenir sur les fondations techniques du DNS, ainsi que sur les grands épisodes de cette aventure méconnue. Des TLD historiques aux extensions détournées, des dramas autour du .web aux ambitions du Web3, nous explorerons l’évolution d’un système devenu central dans les logiques de souveraineté numérique et d’innovation commerciale.

Une plongée dans les coulisses d’un Internet en perpétuelle transformation.”

🏴󠁧󠁢󠁥󠁮󠁧󠁿 “Behind the few letters that follow a dot — .com, .fr, .ai and others — lies a rich world of technical strategies, geopolitical issues and commercial battles.

As the new ICANN round planned for 2026 approaches, which will authorise new extensions, it is time to revisit the technical foundations of DNS, as well as the major episodes in this little-known story. From historic TLDs to repurposed extensions, from the drama around .web to the ambitions of Web3, we will explore the evolution of a system that has become central to digital sovereignty and commercial innovation.

A deep dive behind the scenes of an Internet in constant transformation.”

🎥 Replay.

“Informatique quantique, ce coup-ci on vous dit tout !”

🏴󠁧󠁢󠁥󠁮󠁧󠁿 “Quantum computing: this time, we tell you everything!”

🎤 Speaker: Fanny Bouton, Olivier Ezrati (Quantum Energy Initiative) & Guillaume Schurck (Alice & Bob)

“Informatique quantique pour développeurs : comprendre, coder, passer à l’échelle

L’informatique quantique sort du laboratoire et devient progressivement accessible aux développeurs via des SDK open source, des notebooks, des simulateurs et des QPU disponibles dans le cloud. En 2026, la question n’est plus « qu’est-ce que le quantique ? » mais « comment un développeur peut-il s’en emparer concrètement ? »

Nous commencerons par poser les bases essentielles pour comprendre le modèle de calcul quantique : qubit, superposition, intrication, et ce que ces concepts impliquent pour un développeur.
Nous passerons ensuite au code : écrire et exécuter des circuits quantiques, utiliser des SDK modernes, travailler dans des notebooks, tester sur simulateur puis sur de vrais QPU. Vous verrez à quoi ressemble un workflow quantique aujourd’hui.

Enfin, nous aborderons les cas d’usage concrets, illustrés par le retour d’expérience d’un grand compte : ce qui fonctionne déjà, les limites actuelles, et comment les équipes tech expérimentent le quantique de manière réaliste et industrielle.

Une session technique pensée pour les développeurs qui veulent anticiper la prochaine évolution majeure du calcul.”

🏴󠁧󠁢󠁥󠁮󠁧󠁿 “Quantum computing for developers: understand, code and scale up.
Quantum computing is moving out of the laboratory and becoming progressively accessible to developers through open source SDKs, notebooks, simulators and QPUs available in the cloud. In 2026, the question is no longer ‘What is quantum?’ but ‘How can developers make practical use of it?’

We will begin by laying out the essential foundations needed to understand the quantum computing model: qubits, superposition, entanglement, and what these concepts mean for developers.

We will then move on to code: writing and running quantum circuits, using modern SDKs, working in notebooks, testing on simulators and then on real QPUs. You will see what a quantum workflow looks like today.

Finally, we will address concrete use cases, illustrated by the experience of a large account: what already works, the current limitations, and how tech teams are experimenting with quantum computing in a realistic and industrial way.

A technical session designed for developers who want to anticipate the next major evolution in computing.”

🎥 Replay.

“Développer avec l’IA : et si c’était aussi simple qu’ajouter une librairie ?”

🏴󠁧󠁢󠁥󠁮󠁧󠁿 “Developing with AI: what if it were as simple as adding a library?”

🎤 Speakers: Mathieu Busquet & Stéphane Philippart

“Intégrer de l’intelligence artificielle (IA) dans nos développements peut nous paraître plus complexe que de les utiliser dans notre quotidien.

Dois-je apprendre un nouveau langage ou une nouvelle stack ?
Durant ce workshop nous vous proposons de vous donner tous les éléments pour intégrer l’IA sans quitter votre langage de prédilection : Java 😍. Ce sera l’occasion de découvrir les Frameworks du moments : LangChain4j, Quarkus, …

Nous vous invitons à découvrir toutes les facettes d’un chatbot avec l’IA générative (customiser un prompt, rajouter vos données (RAG), appeler des outils locaux ou distants (MCP) et créer des agents) mais aussi parce que l’IA ne se limite pas aux chatbots : faire de la transcription, créer de l’audio ou même faire un traducteur.

Et, toujours pour vous simplifier la vie, venez juste avec votre ordinateur et un navigateur Internet, on se charge du reste pour vous construire un environnement de développement aux petits oignons grâce aux CDE.

À la suite de ce talk vous repartirez avec une boîte à outils vous permettant d’intégrer simplement la puissance des modèles d’IA au sein de vos développements de tous les jours.”

🏴󠁧󠁢󠁥󠁮󠁧󠁿 “Integrating artificial intelligence into our developments can seem more complex than using it in our daily lives.

Do I need to learn a new language or a new stack?

During this workshop, we will give you all the tools you need to integrate AI without leaving your favourite language: Java 😍. It will be an opportunity to discover some of today’s key frameworks, including LangChain4j and Quarkus.

We invite you to explore all the facets of a chatbot with generative AI — customising a prompt, adding your own data with RAG, calling local or remote tools with MCP, and creating agents — but also to see that AI is not limited to chatbots: it can also be used for transcription, audio creation and even translation.

And to make your life even easier, just bring your computer and an internet browser. We will take care of the rest, building a polished development environment for you thanks to CDEs.

After this talk, you will leave with a toolkit that will allow you to integrate the power of AI models into your everyday development work.”

“Détectives de la prod : résoudre l’enquête avant le crash”

🏴󠁧󠁢󠁥󠁮󠁧󠁿 “Production detectives: solve the case before the crash”

🎤 Speaker: Sébastien Ferrer

“Saviez-vous que, derrière les coulisses de vos outils de travail, se cachent des équipes prêtes à intervenir à tout moment ?

Ces équipes, souvent discrètes mais essentielles, gèrent des dizaines de projets avec des effectifs réduits. Mais quand une alerte survient, elles doivent réagir vite. Très vite. Comment réussir à diagnostiquer et résoudre un incident en pleine production, sans perdre une précieuse seconde ?

Dans ce talk je vous emmène au cœur de l’action, où je partage notre méthodologie pour transformer chaque crise en une enquête méthodique et efficace. Nous explorerons comment des outils bien pensés, une organisation affûtée, et un soupçon d’intuition transforment la gestion d’incidents en une véritable enquête… parfois aussi palpitante qu’une partie de Cluedo.

Au programme : bonnes pratiques de troubleshooting, logging et monitoring, pour que vous repartiez avec des clés concrètes pour dompter les incidents dans vos propres projets.

Vous verrez qu’en production, chaque problème cache une histoire… à résoudre en équipe !”

🏴󠁧󠁢󠁥󠁮󠁧󠁿 “Did you know that behind the scenes of your work tools, there are teams ready to intervene at any moment?

These teams, often discreet but essential, manage dozens of projects with limited staff. But when an alert occurs, they need to react quickly. Very quickly. How can they diagnose and resolve a production incident without losing precious seconds?

In this talk, I will take you into the heart of the action, where I share our methodology for turning every crisis into a structured and efficient investigation. We will explore how well-designed tools, a well-honed organisation and a touch of intuition can transform incident management into a real investigation — sometimes as thrilling as a game of Cluedo.

On the agenda: troubleshooting best practices, logging and monitoring, so you leave with concrete keys to taming incidents in your own projects.

You will see that in production, every problem hides a story… one to solve as a team!”

🎥 Replay.

“Et si écrire du SQL redevenait cool ?”

🏴󠁧󠁢󠁥󠁮󠁧󠁿 “What if writing SQL became cool again?”

🎤 Speaker: Sébastien Ferrer

“On nous l’a répété maintes fois : “écrire du SQL dans du code source, c’est dépassé”.

Les ORMs sont partout. Ils ont facilité notre quotidien en nous permettant de manipuler nos bases de données sans nous soucier du SQL. Mais parfois, on aimerait un peu plus de contrôle, un peu plus de performance… sans pour autant revenir aux longues heures de mapping manuel et de requêtes préparées à la main.

SQLC offre une autre approche. Initialement conçu pour du Go, langage dans lequel cette technologie sera présentée dans ce talk, il permet d’écrire des requêtes SQL tout en générant du code type-safe et performant, sans ajouter de lourdeur ni de dépendances. Pas question ici de rejeter les ORMs, mais plutôt d’explorer un nouvel outil qui vient enrichir notre palette de solutions.

Dans ce talk, nous verrons comment SQLC fonctionne, dans quels cas il brille, et comment il s’intègre parfaitement dans un stack moderne. Vous aimez le SQL ? Vous voulez juste un peu plus de maîtrise sur vos requêtes ? Venez, vous risquez d’être agréablement surpris.”

🏴󠁧󠁢󠁥󠁮󠁧󠁿 “We have been told many times: ‘Writing SQL in source code is outdated.’

ORMs are everywhere. They have made our daily lives easier by allowing us to manipulate databases without worrying about SQL. But sometimes, we would like a little more control, a little more performance — without going back to long hours of manual mapping and hand-written prepared queries.

SQLC offers another approach. Initially designed for Go, the language in which this technology will be presented during the talk, it allows you to write SQL queries while generating type-safe and high-performance code, without adding heaviness or dependencies. The goal here is not to reject ORMs, but rather to explore a new tool that enriches our range of solutions.

In this talk, we will see how SQLC works, where it shines, and how it integrates perfectly into a modern stack. Do you like SQL? Do you simply want more control over your queries? Come along — you may be pleasantly surprised.”

🎥 Replay.

“🤖 Apprendre à notre IA à … apprendre 🧠”

🏴󠁧󠁢󠁥󠁮󠁧󠁿 “🤖 Teaching our AI to… learn 🧠”

🎤 Speaker: Stéphane Philippart

“RAG, MCP, tooling, function calling, agents, fine tuning, training, …
Que de termes barbares mais qui ont tous le même objectif : faire en sorte que le modèle d’intelligence artificielle que vous utilisez réponde correctement à vos questions et attentes 😅.
Et pour ça il va falloir ajouter de la connaissance, des données (privée ou publiques, …).

Durant ce talk je vous propose d’y voir un peu plus clair dans cette jungle des acronymes puis, fort de connaître les différences, vous proposer comment l’implémenter en tant que développeuses et développeurs.

Chaque approche a ses spécificités, ses avantages et ses inconvénients.
A la fin de ce talk, non seulement vous saurez choisir la bonne approche, mais aussi ajouter dans vos développements quotidiens la dose d’IA utile.”

🏴󠁧󠁢󠁥󠁮󠁧󠁿 “RAG, MCP, tooling, function calling, agents, fine tuning, training…

So many intimidating terms, but they all have the same goal: ensuring that the artificial intelligence model you use responds correctly to your questions and expectations 😅.

And to do that, you need to add knowledge and data — private, public or otherwise.

During this talk, I will help you see more clearly through this jungle of acronyms, and once you understand the differences, I will show you how to implement them as developers.

Each approach has its own specificities, advantages and disadvantages.

By the end of this talk, you will not only know how to choose the right approach, but also how to add the right dose of useful AI into your daily development work.”

🎥 Replay.

“Refactorer sans tout casser: anatomie des patterns de modernisation incrémentale”

🏴󠁧󠁢󠁥󠁮󠁧󠁿 “Refactoring without breaking everything: anatomy of incremental modernisation patterns”

🎤 Speaker: Héla Ben Khalfallah

“Cette session répond à un problème extrêmement courant mais rarement traité de façon structurée : comment moderniser un système legacy sans big bang, sans freeze de la prod, et sans multiplier les régressions. Plutôt que de parler “microservices” ou “rewrite from scratch” de manière abstraite, la session propose un playbook de modernisation incrémentale, articulé autour de patterns éprouvés : Strangler Fig, Parallel Change (Expand/Contract), Branch by Abstraction, décomposition par capacités métier / sous-domaines / transactions, et les patterns de conception (Facade, Adapter, Proxy, Mediator) utilisés comme briques concrètes de migration.

Le contenu est ancré dans la pratique : il synthétise à la fois des retours d’expérience industriels (Netflix, Khan Academy, etc.) et des travaux de recherche / rédaction. L’objectif n’est pas de présenter un catalogue de patterns, mais de montrer comment les combiner pour construire une trajectoire de migration observable, réversible et livrable en continu.

Vous repartirez avec une grille de lecture concrète pour garder des migrations observables, réversibles et compatibles avec le rythme produit.”

🏴󠁧󠁢󠁥󠁮󠁧󠁿 “This session addresses an extremely common problem that is rarely handled in a structured way: how to modernise a legacy system without a big bang, without a production freeze, and without multiplying regressions. Rather than talking abstractly about microservices or rewriting from scratch, the session offers an incremental modernisation playbook built around proven patterns: Strangler Fig, Parallel Change — Expand/Contract — Branch by Abstraction, decomposition by business capabilities, subdomains and transactions, as well as design patterns such as Facade, Adapter, Proxy and Mediator used as concrete building blocks for migration.

The content is rooted in practice: it brings together both industrial feedback from companies such as Netflix and Khan Academy, and research and written work. The goal is not to present a catalogue of patterns, but to show how they can be combined to build a migration path that is observable, reversible and continuously deliverable.

You will leave with a concrete framework for keeping migrations observable, reversible and compatible with the pace of product development.”

🎥 Replay.

📺 Devoxx France published the 232 videos (keynotes, conferences, tools in action, lunch talks & deep dives) on the Devoxx France YouTube channel.

Podcast

Devoxx France was also an opportunity for OVHcloud’s Aurélie Vache, Stéphane Philippart, and Magali De Labareyre to be interviewed in the Press space for the “Tech en Pratique” podcast.

The episodes will be available on YouTube starting in September! 🙂

Key Trends

AI moved from hype to production
The focus shifted toward agentic systems, RAG, observability, governance, and enterprise integration, with more emphasis on shipping useful AI than experimenting.

Java evolved for modern AI and cloud workloads
LangChain4j, GraalVM, native image, and JDK modernization reinforced Java’s role as a serious platform for AI-enabled enterprise systems.

Platform engineering became a core priority
CI/CD maturity, OpenRewrite, modernization, and developer productivity all reflected one goal: faster delivery without losing control.

Security moved deeper into developer workflows
Shift-left security, AppSec, authorization, Software Supply Chain Security and secure-by-design approaches gained importance, especially with AI-generated code increasing governance needs.

Cloud & architecture focused on operational resilience
Kubernetes, containers, observability, and scalable systems remained central, with a stronger focus on practical engineering over hype.

Front-end discussions matured
Accessibility, performance, reactivity, and maintainability took priority over framework wars.

Open source and European digital sovereignty gained traction
Open models, self-hosted tooling, privacy, and vendor independence became increasingly important themes.

Developer experience (DX) became strategic
Tooling, automation, terminal workflows, and reducing cognitive load were seen as key drivers of productivity and competitiveness.

Conclusion

This Devoxx France edition was a raging success for the speakers, sponsors, and attendees alike ♥️.

💬 Stay in Touch

Want to chat with us, share your thoughts, or just say hi? Here’s how to get in touch with us:

🟣 Discord: OVHcloud Discord server
🐦 X / Twitter: @OVHcloud
💼 LinkedIn: OVHcloud LinkedIn
🐙 GitHub: github.com/ovh

Remote development #3 – Industrialisation and Automation

Rémy Vandepoel — Wed, 13 May 2026 08:05:05 +0000

After manually configuring your server step by step, it’s time to automate the entire process.

The idea is simple: describe your infrastructure in configuration files and let Terraform take care of managing the resources at OVHcloud.

Here is an introductory guide to Terraform, with plenty of useful information: https://support.us.ovhcloud.com/hc/en-us/articles/22648864003219-Using-Terraform-with-OVHcloud.
As well as the link to OVHcloud’s official Terraform provider: https://registry.terraform.io/providers/ovh/ovh/latest

There are two steps to automating the deployment:

Deployment of the Public Cloud instance
Deployment of the application part (vscode-server) and its configuration

1. The heart of the automation: the Cloud-init script

Before we move onto Terraform, we need to understand how the server self-configures during its initialisation.
To do this, use cloud-init, a standard that allows scripts to be executed from the first boot of the instance.

What you will automate in this script:

The system update (apt update/upgrade)
The installation of code-server via the official script
The installation and configuration of Caddy (for automatic SSL)
The configuration of the Uncomplicated Firewall (UFW)

This type of file has a very particular syntax; the cloud-config.yaml will be available further down.

However, the important point to remember is: why use this format?

Idempotence: cloud-init ensures that everything is ready from the first boot.
Security from the outset: the UFW is activated immediately, reducing the exposure window.
Terraform Integration: a single line is required to include this: user_data = file("cloud-config.yaml")

2. Using Terraform for deployment

Terraform allows for a much easier and quicker instance startup.
Its configuration also has several advantages:

Persistent data: a terraform destroy of the instance can retain the data volume (goal set in chapter 2)
Scalability: if the project grows, the size of the volume and/or the flavour can be adjusted
Portability: the data volume can be unmounted and remounted on another machine.

To keep this post brief we won’t copy-paste the code here, but this link to a GitHub repository contains everything needed to deploy this in a few minutes:
https://github.com/RemyAtOVH/blogpost-dev-server

Its usage:

ubuntu@vscode-server:~$ source openrc.production.sh
ubuntu@vscode-server:~$ terraform init
ubuntu@vscode-server:~$ terraform plan
ubuntu@vscode-server:~$ terraform apply
[...]
Apply complete! Resources: 4 added, 0 changed, 0 destroyed.

Before applying cloud-init (or without it), there is a secondary volume /dev/sdb, sized according to Terraform specifications:

ubuntu@vscode-server-automated:~$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
[…]
sda 8:0 0 25G 0 disk
[…]
sdb 8:16 0 10G 0 disk

This is what will ensure data persistence.

You could manually delete the instance and other components, without deleting it.
To prevent any deletion in the event of “terraform destroy”, a parameter has been added:

lifecycle { prevent_destroy = true }

During the first startup, the various installation scripts may take time. You can check their steps with a simple tail:

ubuntu@vscode-server-automated:~$ tail -f /var/log/cloud-init-output.log

Once cloud-init has been executed automatically, everything that could have been set up manually in the previous chapters has been done automatically, in a way that can be reproduced!

It will therefore be possible to deploy this customised remote development environment if needed (with a few minutes of execution) and potentially delete it after a few hours or days of use.

In this series of chapters, we have transformed a simple idea – having access to VS Code wherever you are – into a professional-grade, automated and resilient infrastructure.
Below are the steps involved and the progress so far.

Chapter 1: first steps in manual installation to understand the mechanics of code-server.
Chapter 2: making it secure, using a Reverse Proxy (Caddy) and a firewall (UFW) to navigate smoothly in HTTPS.
Chapter 3: this article, in which we’ll use Terraform and OpenStack for better reproducibility.

The automation we have implemented with an OVHcloud deployment using an OpenStack-based Public Cloud provides a solid foundation.

From here, you can go even further: add automatic backups of your volumes (snapshotting), couple this with a CI/CD pipeline, or even explore deploying this environment via docker-compose or even Kubernetes.

A step-by-step video version of these blog posts will soon be available on our YouTube channel. Stay tuned!

Remote development #2 – Security and Performance

Rémy Vandepoel — Mon, 11 May 2026 16:00:02 +0000

In the previous chapter, we started the VSCode Server on a remote instance.

That’s a win. However, as it stands, your installation is vulnerable, or at least not optimally secured. Traffic is being sent in clear (HTTP) and port 8080 is exposed to anyone scanning our IP address.

To transform this prototype into a daily working tool, we need to set up a Reverse Proxy.
Its role is simple: to intercept secure connections (HTTPS) on the standard port 443 and redirect them locally to our service.

1. Prerequisites: securing the network part

First and foremost, we need to instruct code-server to no longer listen for connections from outside, but only to those coming from the machine itself (the proxy).

Modify your configuration file: nano ~/.config/code-server/config.yaml

Change the line “bind-addr” as follows:

bind-addr: 127.0.0.1:8080

Then restart the service.

ubuntu@vscode-server:~$ sudo systemctl restart code-server@$USER

This will ensure that vscode-server will indeed only “listen” locally and cannot be contacted directly from outside.

2. Implement the reverse proxy

Here, you have two choices:

NGINX, which has been the standard choice for many years
Caddy, which has a more simplistic (but comprehensive) and newer approach.

For this blog post, we have selected Caddy for the example and to familiarise ourselves if we have not already!

Caddy natively manages SSL certificate renewal – which can be done through OVHcloud!

Installation (Debian/Ubuntu)

You will find more comprehensive documentation for other systems or installation methods in the official documentation: https://caddyserver.com/docs/install.

ubuntu@vscode-server:~$ sudo apt install -y debian-keyring debian-archive-keyring apt-transport-https
ubuntu@vscode-server:~$ curl -1sLf ‘https://dl.cloudsmith.io/public/caddy/stable/gpg.key’| sudo gpg --dearmor -o /usr/share/keyrings/caddy-stable-archive-keyring.gpg
ubuntu@vscode-server:~$ curl -1sLf ‘https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt’| sudo tee /etc/apt/sources.list.d/caddy-stable.list
ubuntu@vscode-server:~$ sudo apt update && sudo apt install caddy -y

Configuration: modify the file /etc/caddy/Caddyfile (clear it and replace it with this):

Replace “dev.your-domain.uk” with your own domain name, with the subdomain of your choice pointing to the IP of the instance.

Simple configuration only on HTTP port (80)

dev.your-domain.uk {
reverse_proxy 127.0.0.1:8080
}

Recommended configuration on HTTPS port (443), using a domain hosted with OVHcloud.

For creating OVHcloud API tokens, you can refer to this page: https://eu.api.ovh.com/createToken/.

dev.your-domain.uk {
tls {
dns ovh {
endpoint “ovh-eu”
application_key {$OVH_APPLICATION_KEY}
application_secret {$OVH_APPLICATION_SECRET}
consumer_key {$OVH_CONSUMER_KEY}
}
}
reverse_proxy 127.0.0.1:8080
}

For further details regarding SSL certificate management, consult the official Caddy documentation.
Application:

ubuntu@vscode-server:~$ sudo systemctl reload caddy

If you have opted for the recommended configuration in HTTPS, your environment is now protected by robust SSL encryption.

You are no longer at risk of having your password intercepted on public Wi-Fi, which is a considerable step towards our goal.

3. Network and firewall

Now that the access point is unique via the HTTPS URL configured just above, the rest of the ports, except for SSH, can be closed.

Now, implement the basic rules in the firewall. On Ubuntu, the standard tool is UFW (Uncomplicated Firewall).

Start by opening the ports related to the functional services.

ubuntu@vscode-server:~$ sudo ufw allow ssh
ubuntu@vscode-server:~$ sudo ufw allow http
ubuntu@vscode-server:~$ sudo ufw allow https

Activate the firewall:

ubuntu@vscode-server:~$ sudo ufw enable

Check the implementation of the rules.

ubuntu@vscode-server:~$ sudo ufw status
Status: active

To                         Action      From
--                         ------      ----
22/tcp                     ALLOW       Anywhere
80/tcp                     ALLOW       Anywhere
443                        ALLOW       Anywhere
45876                      ALLOW       Anywhere
22/tcp (v6)                ALLOW       Anywhere (v6)
80/tcp (v6)                ALLOW       Anywhere (v6)
443 (v6)                   ALLOW       Anywhere (v6)
45876 (v6)                 ALLOW       Anywhere (v6)

You can also add stricter rules to explicitly reject anything unauthorised in incoming traffic while generally authorising outgoing traffic.

ubuntu@vscode-server:~$ sudo ufw default deny incoming
ubuntu@vscode-server:~$ sudo ufw default allow outgoing

From now on, if someone attempts to access the IP on port 8080, the connection will be outright rejected.

Only the domain name in HTTPS is the legitimate entry point.
This handy little development server now feels more like a fortress.

But what happens if you decide to delete this instance to move to a more powerful one and/or stop it for an indefinite period, as your project is on hold?

This is what you will find out in the next part: how to isolate your data and configurations on a persistent storage volume to make your environment completely interchangeable, but also how to automate the deploymen of this development environment!

The ultimate goal is for a simple terraform apply command to to be enough to generate a development environment that’s ready to use in under two minutes.

Navigating OVHcloud Enterprise File Storage (EFS) with Trident CSI On Kubernetes clusters (MKS)

Aurélie Vache — Mon, 11 May 2026 12:18:46 +0000

If you find yourself in need of shared persistent storage for applications running on OVHcloud Managed Kubernetes Service (MKS), then OVHcloud Enterprise File Storage (EFS) with Trident CSI offers you a practical way to provision and manage it.

This blog post explains how to create and connect OVHcloud EFS to your MKS cluster using Trident CSI, so you can dynamically provision persistent storage for Kubernetes workloads.

OVHcloud Enterprise File System (EFS)

EFS is a high-performance, fully managed file storage solution powered by NetApp ONTAP in an active-active architecture. It is designed for enterprise workloads requiring high availability, predictable performance, and seamless integration with cloud-native environments.

The service is available in multiple regions, including Roubaix, Gravelines, Strasbourg, Limbourg, and Beauharnois, with a strong SLA of 99.99% uptime. Storage capacity ranges from 50 GB up to 29 TB.

EFS delivers guaranteed performance with 4,000 IOPS and 64 MB/s throughput per TiB, scaling linearly with volume size thanks to NVMe SSD infrastructure.

Built for modern infrastructures, EFS integrates natively with Kubernetes via Trident CSI (compatible with MKS) and supports ReadWriteMany (RWX) access. It operates within a single availability zone (1AZ) and provides low-latency NFS storage over OVHcloud’s secure vRack network, ensuring strong security and compliance.

NetApp Trident CSI

Trident is an open-source, fully supported storage orchestration project maintained by NetApp. It is designed to help Kubernetes applications consume persistent storage using standard interfaces such as the Container Storage Interface (CSI).

Trident runs directly inside Kubernetes clusters as a set of Pods and enables dynamic provisioning and management of storage for containerized workloads. It allows applications to easily access persistent storage from NetApp’s ecosystem, including ONTAP systems (like the OVHcloud EFS).

Let’s do it!

EFS creation

We already have a MKS cluster, in GRA11 region, running inside a private network and a subnet, with a gateway.
We also already have a vRack and our Public Cloud Project attached to this vRack.
So in this blog post we will only create a new EFS in eu-west-rbx region, attached to a vRackServices, inside the same subnet that our existing MKS cluster.

Here you can see the architecture of all the services:

⚠️ EFS and MKS regions may differ; be aware that latency between different regions may impact your storage workloads performance. It’s highly recommended to keep your storage and compute as close as possible.

We will deploy the EFS in eu-west-rbx instead of in eu-west-gra region to show you that it is possible.

To deploy the EFS, we will use the Terraform OVHcloud EFS module.

The module we will use can deploy all the components necessary to use EFS with a MKS cluster (like you can see in the schema).

But in this blog post we will assume that we already deployed:

a vRack
a Private Network
a Private Subnet
a Gateway
a MKS cluster

So using the Terraform module we will fill the existing resources information and ask Terraform to create:

an OAuth2 credential
an IAM policy
an EFS
a vRack Services

Let’s deploy our components with Terraform!

Create a provider.tf file and fill it with the information:

terraform {
  required_providers {
    ovh = {
      source  = "ovh/ovh"
      version = ">= 2.12.0"
    }
    null = {
      source  = "hashicorp/null"
      version = ">= 3.0.0"
    }
  }

  required_version = ">= 1.7.0"
}

provider "ovh" {
}

If you don’t define the provider information inside this file, as was shown in this example, you can instead set the environment variables with your credentials:

# OVHcloud provider needed keys
export OVH_ENDPOINT="ovh-eu"
export OVH_APPLICATION_KEY="xxx"
export OVH_APPLICATION_SECRET="xxx"
export OVH_CONSUMER_KEY="xxx"
export OVH_CLOUD_PROJECT_SERVICE="xxx"

Create a variable.tf.template file and fill it with these information:

# Existing services
variable "service_name" {
  default = "$OVH_CLOUD_PROJECT_SERVICE"
}

variable "vrack_id" {
  default = "pn-1234567" #ID of your existing vRack
}

variable "vlan_id" {
  default = "666" #ID of your VLAN
}

variable "private_network_id" {
  default = "d111cb65-1234-5678-9012-dac2e93b8944" #ID of your private network
}

variable "private_subnet_id" {
  default = "d8dc2469-1234-5678-9012-1f86551d3466" #ID of your subnet
}

variable "vrackservices_subnet_service_range_cidr" {
  default = "192.168.168.248/29" #CIDR of your private network
}

variable "private_subnet_cidr" {
  default = "192.168.168.0/24" #CIDR of your subnet
} 

variable "mks_region" {
  default = "GRA11" #Region of your existing MKS cluster
}

variable "mks_cluster_id" {
  default = "7c3e1e6e-1234-5678-9012-4fb5a5b145e7" #ID of your existing MKS cluster
}

# Services to create

variable "oauth2_client_name" {
  default = "efs-trident-client-example"
}

variable "oauth2_client_description" {
  default = "OAuth2 client for EFS Trident integration"
}

variable "iam_policy_name" {
  default = "efs-trident-policy-example"
}

variable "iam_policy_description" {
  default = "IAM policy for EFS Trident access"
}

variable "vrackservices_attach_to_efs" {
  description = "Whether to attach the EFS service endpoint to vRack Services. Set to false before destroying."
  type        = bool
  default     = true
}

variable "efs_region" {
  default = "eu-west-rbx"
}

variable "efs_name" {
  default = "my-efs-storage"
}

variable "efs_plan" {
  default = "enterprise-file-storage-premium-1tb"
}

⚠️ In the file, replace the IDs, CIDR & MKS region with your existing resources information.

Replace the value of the OVH_CLOUD_PROJECT_SERVICE environment variable in the variables.tf file:

envsubst < variables.tf.template > variables.tf

Create a efs.tf file and fill it with the information:

module "ovh_efs_trident" {
  source = "ovh/efs/ovh//modules/efs-trident"

  # OVH region for EFS and vRack Services
  region = var.efs_region

  # Public Cloud region for MKS and private network
  public_cloud_region = var.mks_region

  # VLAN ID must be the same for vRack Services and Public Cloud private network
  vlan_id = var.vlan_id

  # Set to false before destroying to detach endpoint first
  vrackservices_attach_to_efs = var.vrackservices_attach_to_efs

  # EFS creation
  storage_efs_name      = var.efs_name
  storage_efs_plan_code = var.efs_plan

  # --- vRack ---
  create_vrack       = false
  vrack_service_name = var.vrack_id

  # --- Cloud Project ---
  create_cloud_project        = false
  cloud_project_id            = var.service_name
  bind_vrack_to_cloud_project = false # Set to false if already bound

  # --- Private Network ---
  create_private_network      = false
  private_network_id = var.private_network_id

  # --- Private Subnet ---
  create_private_subnet      = false
  private_subnet_id = var.private_subnet_id

  # --- Gateway ---
  create_gateway = false  # Set to false only if existing network has gateway

  # --- MKS Cluster ---
  create_mks_cluster = false
  mks_cluster_id     = var.mks_cluster_id # mks-priv-gra11
  create_node_pool   = false # Set to false if using existing node pool

  # OAuth2 and IAM
  oauth2_client_name        = var.oauth2_client_name
  oauth2_client_description = var.oauth2_client_description
  iam_policy_name           = var.iam_policy_name
  iam_policy_description    = var.iam_policy_description

  # Network (shared between vRack Services and Public Cloud)
  private_network_subnet_cidr             = var.private_subnet_cidr
  vrackservices_subnet_service_range_cidr = var.vrackservices_subnet_service_range_cidr # EFS gets IPs here
}

Create an output.tf file with the following content:

output "client_id" {
    value = module.ovh_efs_trident.client_id
}

output "client_secret" {
    value = module.ovh_efs_trident.client_secret
    sensitive = true
}

output "efs_id" {
  value       = module.ovh_efs_trident.efs_id
}

The Terraform configuration is ready. Let’s init it:

terraform init

The output should be like this:

$ terraform init

Initializing the backend...
Initializing modules...
Initializing provider plugins...
- Reusing previous version of hashicorp/null from the dependency lock file
- Reusing previous version of ovh/ovh from the dependency lock file
- Using previously-installed hashicorp/null v3.2.4
- Using previously-installed ovh/ovh v2.13.1

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.

Apply it:

terraform apply

The output should be like this:

$ terraform apply

module.ovh_efs_trident.data.ovh_me.my_account: Reading...
module.ovh_efs_trident.data.ovh_cloud_project_kube.existing[0]: Reading...
module.ovh_efs_trident.data.ovh_cloud_project.existing[0]: Reading...
module.ovh_efs_trident.data.ovh_me.my_account: Read complete after 1s [id=xx12345-ovh]
module.ovh_efs_trident.data.ovh_cloud_project.existing[0]: Read complete after 0s
module.ovh_efs_trident.data.ovh_order_cart.cart: Reading...
module.ovh_efs_trident.data.ovh_order_cart.cart: Read complete after 0s [id=d582ab7c-1234-5678-9012-4a6e702ea4c5]
module.ovh_efs_trident.data.ovh_cloud_project_kube.existing[0]: Read complete after 5s [id=7c3e1e6e-1234-5678-9012-4fb5a5b145e7]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # module.ovh_efs_trident.null_resource.config_validation will be created
  + resource "null_resource" "config_validation" {
      + id = (known after apply)
    }

  # module.ovh_efs_trident.ovh_iam_policy.iam_policy will be created
  + resource "ovh_iam_policy" "iam_policy" {
      + allow       = [
          + "storageNetApp:apiovh:get",
          + "storageNetApp:apiovh:serviceInfos/get",
          + "storageNetApp:apiovh:share/accessPath/get",
          + "storageNetApp:apiovh:share/acl/create",
          + "storageNetApp:apiovh:share/acl/delete",
          + "storageNetApp:apiovh:share/acl/get",
          + "storageNetApp:apiovh:share/create",
          + "storageNetApp:apiovh:share/delete",
          + "storageNetApp:apiovh:share/edit",
          + "storageNetApp:apiovh:share/extend",
          + "storageNetApp:apiovh:share/get",
          + "storageNetApp:apiovh:share/revertToSnapshot",
          + "storageNetApp:apiovh:share/snapshot/create",
          + "storageNetApp:apiovh:share/snapshot/delete",
          + "storageNetApp:apiovh:share/snapshot/edit",
          + "storageNetApp:apiovh:share/snapshot/get",
        ]
      + created_at  = (known after apply)
      + description = "IAM policy for EFS Trident access"
      + id          = (known after apply)
      + identities  = (known after apply)
      + name        = "efs-trident-policy-example"
      + owner       = (known after apply)
      + read_only   = (known after apply)
      + resources   = (known after apply)
      + updated_at  = (known after apply)
    }

  # module.ovh_efs_trident.ovh_me_api_oauth2_client.api_oauth2_client will be created
  + resource "ovh_me_api_oauth2_client" "api_oauth2_client" {
      + client_id     = (known after apply)
      + client_secret = (sensitive value)
      + description   = "OAuth2 client for EFS Trident integration"
      + flow          = "CLIENT_CREDENTIALS"
      + id            = (known after apply)
      + identity      = (known after apply)
      + name          = "efs-trident-client-example"
    }

  # module.ovh_efs_trident.ovh_storage_efs.efs[0] will be created
  + resource "ovh_storage_efs" "efs" {
      + created_at        = (known after apply)
      + iam               = (known after apply)
      + id                = (known after apply)
      + name              = "my-efs-storage"
      + order             = (known after apply)
      + ovh_subsidiary    = "FR"
      + performance_level = (known after apply)
      + plan              = [
          + {
              + configuration = [
                  + {
                      + label = "region"
                      + value = "eu-west-rbx"
                    },
                  + {
                      + label = "network"
                      + value = "vrack"
                    },
                ]
              + duration      = "P1M"
              + plan_code     = "enterprise-file-storage-premium-1tb"
              + pricing_mode  = "default"
            },
        ]
      + product           = (known after apply)
      + quota             = (known after apply)
      + region            = (known after apply)
      + service_name      = (known after apply)
      + status            = (known after apply)
    }

  # module.ovh_efs_trident.ovh_vrack_vrackservices.vrack-vrackservices-binding[0] will be created
  + resource "ovh_vrack_vrackservices" "vrack-vrackservices-binding" {
      + id             = (known after apply)
      + service_name   = "pn-1234567"
      + vrack_services = (known after apply)
    }

  # module.ovh_efs_trident.ovh_vrackservices.vrackservices[0] will be created
  + resource "ovh_vrackservices" "vrackservices" {
      + checksum        = (known after apply)
      + created_at      = (known after apply)
      + current_state   = (known after apply)
      + current_tasks   = (known after apply)
      + iam             = (known after apply)
      + id              = (known after apply)
      + order           = (known after apply)
      + ovh_subsidiary  = "FR"
      + plan            = [
          + {
              + configuration = [
                  + {
                      + label = "region_name"
                      + value = "eu-west-rbx"
                    },
                ]
              + duration      = "P1M"
              + plan_code     = "vrack-services"
              + pricing_mode  = "default"
            },
        ]
      + resource_status = (known after apply)
      + target_spec     = {
          + subnets = [
              + {
                  + cidr              = "192.168.168.0/24"
                  + service_endpoints = [
                      + {
                          + managed_service_urn = (known after apply)
                        },
                    ]
                  + service_range     = {
                      + cidr = "192.168.168.248/29"
                    }
                  + vlan              = 666
                    # (1 unchanged attribute hidden)
                },
            ]
        }
      + updated_at      = (known after apply)
    }

Plan: 6 to add, 0 to change, 0 to destroy.

Changes to Outputs:
  + client_id     = (known after apply)
  + client_secret = (sensitive value)
  + efs_id        = (known after apply)

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

module.ovh_efs_trident.null_resource.config_validation: Creating...
module.ovh_efs_trident.null_resource.config_validation: Creation complete after 0s [id=8553589333890826101]
module.ovh_efs_trident.ovh_me_api_oauth2_client.api_oauth2_client: Creating...
module.ovh_efs_trident.ovh_storage_efs.efs[0]: Creating...
module.ovh_efs_trident.ovh_me_api_oauth2_client.api_oauth2_client: Creation complete after 0s [id=EU.xxxxxxxxxxxxx]
module.ovh_efs_trident.ovh_storage_efs.efs[0]: Still creating... [00m10s elapsed]
module.ovh_efs_trident.ovh_storage_efs.efs[0]: Still creating... [00m20s elapsed]
module.ovh_efs_trident.ovh_storage_efs.efs[0]: Still creating... [00m30s elapsed]
...
module.ovh_efs_trident.ovh_storage_efs.efs[0]: Still creating... [03m40s elapsed]
module.ovh_efs_trident.ovh_storage_efs.efs[0]: Still creating... [03m50s elapsed]
module.ovh_efs_trident.ovh_storage_efs.efs[0]: Creation complete after 3m52s [id=c2d759de-cd63-4e28-aaab-a7599aad2ca8]
module.ovh_efs_trident.ovh_vrackservices.vrackservices[0]: Creating...
module.ovh_efs_trident.ovh_iam_policy.iam_policy: Creating...
module.ovh_efs_trident.ovh_iam_policy.iam_policy: Creation complete after 0s [id=a434d1a4-1234-5678-9012-cf54251eee52]
module.ovh_efs_trident.ovh_vrackservices.vrackservices[0]: Still creating... [00m10s elapsed]
module.ovh_efs_trident.ovh_vrackservices.vrackservices[0]: Still creating... [00m20s elapsed]
...
module.ovh_efs_trident.ovh_vrackservices.vrackservices[0]: Still creating... [01m20s elapsed]
module.ovh_efs_trident.ovh_vrackservices.vrackservices[0]: Creation complete after 1m30s [id=vrs-a00-b11-c22-d33]
module.ovh_efs_trident.ovh_vrack_vrackservices.vrack-vrackservices-binding[0]: Creating...
module.ovh_efs_trident.ovh_vrack_vrackservices.vrack-vrackservices-binding[0]: Still creating... [00m10s elapsed]
module.ovh_efs_trident.ovh_vrack_vrackservices.vrack-vrackservices-binding[0]: Still creating... [00m20s elapsed]
...
module.ovh_efs_trident.ovh_vrack_vrackservices.vrack-vrackservices-binding[0]: Still creating... [01m40s elapsed]
module.ovh_efs_trident.ovh_vrack_vrackservices.vrack-vrackservices-binding[0]: Creation complete after 1m43s [id=vrack_pn-1234567-vrackServices_vrs-a00-b11-c22-d33]

Apply complete! Resources: 6 added, 0 changed, 0 destroyed.

Outputs:

client_id = "EU.xxxxxxxxxxxxx"
client_secret = 
efs_id = "c2d759de-cd63-4e28-aaab-a7599aad2ca8"

Save the OAuth2 credentials in environment variables:

export EFS_CLIENT_ID=$(terraform output -raw client_id)
export EFS_CLIENT_SECRET=$(terraform output -raw client_secret)

Trident CSI Installation

Install the Trident operator in your MKS cluster:

helm repo add netapp-trident https://netapp.github.io/trident-helm-chart

helm install trident-operator netapp-trident/trident-operator \
  --version 100.2502.1 \
  --create-namespace \
  --namespace trident \
  --set tridentSilenceAutosupport=true \
  --set operatorImage="ovhcom/trident-operator:25.02.1-linux-amd64" \
  --set tridentImage="ovhcom/trident:25.02.1-linux-amd64"

You should have a result like this:

$ helm install trident-operator netapp-trident/trident-operator \
  --version 100.2502.1 \
  --create-namespace \
  --namespace trident \
  --set tridentSilenceAutosupport=true \
  --set operatorImage="ovhcom/trident-operator:25.02.1-linux-amd64" \
  --set tridentImage="ovhcom/trident:25.02.1-linux-amd64"

NAME: trident-operator
LAST DEPLOYED: Tue Apr 28 14:01:19 2026
NAMESPACE: trident
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Thank you for installing trident-operator, which will deploy and manage NetApp's Trident CSI
storage provisioner for Kubernetes.

Your release is named 'trident-operator' and is installed into the 'trident' namespace.
Please note that there must be only one instance of Trident (and trident-operator) in a Kubernetes cluster.

To configure Trident to manage storage resources, you will need a copy of tridentctl, which is
available in pre-packaged Trident releases.  You may find all Trident releases and source code
online at https://github.com/NetApp/trident.

To learn more about the release, try:

  $ helm status trident-operator
  $ helm get all trident-operator

Once the installation is complete, verify that all Trident pods are in Running state in the trident namespace before proceeding:

$ kubectl get pods -n trident

NAME                                  READY   STATUS    RESTARTS      AGE
trident-controller-5bf6c8d6f6-g95jq   6/6     Running   0             119s
trident-node-linux-4xtjr              2/2     Running   1 (82s ago)   119s
trident-node-linux-6w5ff              2/2     Running   1 (82s ago)   119s
trident-node-linux-r7hxp              2/2     Running   0             119s
trident-operator-859f59c58b-2z2ts     1/1     Running   0             2m31s

Trident Backend Creation

The Trident backend connects NetApp Trident to the OVHcloud EFS service using the IAM credentials previously created.

1. Secret Creation

Create a Kubernetes Secret containing the connection information that allows Trident to access the OVHcloud API. Create a trident-secret.yaml.template file with the following content:

apiVersion: v1
kind: Secret
metadata:
  name: ovh-efs-secret
type: Opaque
stringData:
  clientID: "$EFS_CLIENT_ID"         # your clientId
  clientSecret: "$EFS_CLIENT_SECRET" # your clientSecret

Replace the clientID and clientSecret values by the OAuth2 client we created with Terraform:

envsubst < trident-secret.yaml.template > trident-secret.yaml

Apply the secret in your cluster:

kubectl apply -f trident-secret.yaml -n trident

Check that the secret has been correctly created:

$ kubectl get secret ovh-efs-secret -n trident

NAME             TYPE     DATA   AGE
ovh-efs-secret   Opaque   2      3s

2. Trident Backend Creation

Create your backend with the command below:

cat <apiVersion: trident.netapp.io/v1
kind: TridentBackendConfig
metadata:
  name: ovh-efs-rbx
spec:
  version: 1
  backendName: backend-ovh-efs
  defaults:
    exportRule: "192.168.168.0/24"    # CIDR of your network for NFS ACLs
  storageDriverName: ovh-efs
  clientLocation: ovh-eu
  location: eu-west-rbx         # Location of your EFS service
  serviceLevel: premium
  nfsMountOptions: rw,hard,rsize=65536,wsize=65536,nfsvers=3,tcp
  credentials:
    name: ovh-efs-secret
  volumeCreateTimeout: "60" 
EOF

⚠️ The ovh-efs storage driver must be used. Replace exportRule, location, and other parameters with values matching your environment.

Verify that the backend has been created correctly with the command below:

$ kubectl get TridentBackendConfig -n trident

NAME          BACKEND NAME      BACKEND UUID                           PHASE   STATUS
ovh-efs-rbx   backend-ovh-efs   ace12d67-70ea-44e1-abd8-20d016f7f030   Bound   Success

Use EFS in your MKS cluster

This section describes how to expose Enterprise File Storage to Kubernetes workloads using Trident.

1. StorageClass

In a sc_efs.yaml file, define a StorageClass to enable dynamic provisioning via the Trident CSI driver:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ovh-efs-premium
provisioner: csi.trident.netapp.io
parameters:
  backendType: "ovh-efs"
  fsType: "nfs"
allowVolumeExpansion: true

Apply the StorageClass:

kubectl apply -f sc_efs.yaml

Check that the StorageClass has been created:

$ kubectl get sc ovh-efs-premium

NAME              PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
ovh-efs-premium   csi.trident.netapp.io   Delete          Immediate           true                   3h13m

This StorageClass allows volumes to be provisioned on demand and expanded dynamically.

2. Volume Creation (PVC)

Create a PersistentVolumeClaim with ReadWriteMany (RWX) access mode. Create a pvc_efs.yaml file with this content:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: premium-pvc-efs
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Gi
  storageClassName: ovh-efs-premium

Apply it:

kubectl apply -f pvc_efs.yaml

Verify that the PVC has been created with the command below:

kubectl get pvc premium-pvc-efs

At this point, the EFS is creating a volume, attach the correct ACL to it and mount it in the PVC

After a little time, the output should show the PVC in Bound state:

$ kubectl get pvc

NAME              STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      VOLUMEATTRIBUTESCLASS   AGE
premium-pvc-efs   Bound    pvc-faca364d-ad76-44ec-9bc9-959c0d33c515   100Gi      RWX            ovh-efs-premium                    3m43s

The volume has been created through the PVC and you can now mount it in a Pod 🎉.

Conclusion

In this blog, we’ve explained how to create an EFS and use it in a MKS cluster through Trident CSI. This will give you a flexible, production-ready approach to persistent shared storage in Kubernetes.

We recommend you also take a look at our Cloud Roadmap & Changelog for an overview of all the coming features for OVHcloud Public Cloud products.

Remote development #1 – First Deployment

Rémy Vandepoel — Thu, 07 May 2026 16:00:42 +0000

A development environment is an essential day-to-day system, but it can quickly become complex to manage. In this three-part blog post, we will explore how to become more comfortable and productive with it!

Endless meetings, slightly differing Docker environments on each machine, and untimely system updates: maintaining a reliable and consistent development workstation can quickly become a daily struggle.

With each new project, you have to reinstall the same tools, the same CLIs, and reconfigure the same SDKs or frameworks. And above all, hope that the local machine can handle the load when tests, the linter, and the database are all running simultaneously. Meanwhile, with remote work or working while travelling, individuals find themselves developing with a temperamental VPN, from a laptop that is sometimes close to obsolescence.

In this series of articles, we aim to transform this reality by building on a complete development environment hosted in the cloud and accessible from any browser via VS Code Server.

The idea is to have a remote, powerful, and, if necessary, reproducible and independent “workstation”.

This first chapter demonstrates how to easily deploy a Public Cloud instance manually and install VS Code Server on it. The following chapters will improve its security and automation.

1. Deploying the instance

For the initial tests it may be wise to opt for a smaller, Discovery-type instance so that you can familiarise yourself with the environment and test it. A d2-2 instance will be used here. 1 vCPU and 2 GB of RAM should be enough.

2. Installing the application element

The fountain of knowledge for the following steps is the GitHub for the vscode-server project: https://github.com/coder/code-server

There are several options for the installation. In this chapter, to simplify the deployment and for those who are not very familiar with Docker, the installation will be done via the “native” installation script, without using containers.

ubuntu@vscode-server:~$ sudo apt update && sudo apt upgrade
ubuntu@vscode-server:~$ curl -fsSL

https://code-server.dev/install.sh | sh

This step is enough to install the essentials. Activate the service now and check that it is running correctly.

ubuntu@vscode-server:~$ sudo systemctl enable --now code-server@$USER
ubuntu@vscode-server:~$ sudo systemctl status code-server@$USER
●

code-server@ubuntu.servic

e - code-server
     Loaded: loaded (/usr/lib/systemd/system/code-server@.service; enabled; preset: enabled)
     Active: active (running) since Wed 2025-12-03 14:55:37 UTC; 15min ago
 Invocation: 1b393d84bebe415cbb770a17a0c8d399
   Main PID: 893 (node)
      Tasks: 22 (limit: 4532)
     Memory: 95.1M (peak: 112.1M)
        CPU: 1.868s
     CGroup: /system.slice/system-code\x2dserver.slice/code-server@ubuntu.service
             ├─ 893 /usr/lib/code-server/lib/node /usr/lib/code-server
             └─1130 /usr/lib/code-server/lib/node /usr/lib/code-server/out/node/entry

3. Validate the configuration

At this stage, the service is operational; the configuration still needs to be finalised, particularly creating the folder that will contain the code as well as the authentication.

ubuntu@vscode-server:~$ mkdir workspace

ubuntu@vscode-server:~$ cat ~/.config/code-server/config.yaml
bind-addr: 127.0.0.1:8080
auth: password
password:
cert: false

You need to set a secure password here and verify that the bind-addr corresponds to your desired configuration.

If you wish to directly test the service in its current state, use 0.0.0.0:8080. Then restart the service and access the interface via http://:8080.

After providing the password found in the config.yaml in the authentication window, you will gain direct access to VS Code in the browser.

From this deployment, you can then partially address the issue of getting a stable development environment.

At this stage, it is possible to directly clone your GitHub repositories or to use the workspace folder to clone them.
This is recommended for greater longevity, as you will see in the second chapter.

To perform a test commit via the vscode-server interface, you must configure git locally (just once) so that the authentication of the remote repository runs correctly.

ubuntu@vscode-server:~$ git config user.email “mail@foo.bar”
ubuntu@vscode-server:~$ git config --global user.name"John Doe"

From this step onwards, you can use the remote development environment with vscode-server, while enjoying nearly all the features you might have locally, but with the advantages of having an environment dedicated to this use.

⚠️ Reminder: in its current state, the deployment made here is not “production ready”!

The aim of this first chapter is to introduce the service, with the instructions here to help you familiarize yourself with the environment. Therefore, please ensure that you do not operate the service as deployed here for more than a few hours!

The environment will need to be secured, as it is directly exposed on the Internet. We’ll talk about this in the following chapters.

By now, you have an operational development environment that is already capable of supporting a real application project!

The instance is online, VS Code Server is responding in the browser, the workspace is ready, and the first repository has been cloned and opened as if on a local machine. This foundation demonstrates that it is possible to abstract from the hardware to gain portability and more easily share a common configuration within a team or a remote development workstation.

In the upcoming chapters, this minimum viable environment will be gradually enhanced with persistent storage, backup mechanisms, and secure access via HTTPS. It will then be fully automated through Infrastructure as Code, in order to transition from a simple technical test to a genuine development platform ready for internal production.

Copy.Fail (CVE-2026-31431): How to Rapidly Protect OVHcloud MKS Clusters from the Linux Kernel Zero-Day

Aurélie Vache — Thu, 30 Apr 2026 13:42:17 +0000

A newly disclosed Linux kernel zero-day, CVE-2026-31431, “Copy.Fail”, is one of the most serious privilege-escalation vulnerabilities in recent years.

Discovered by Theori and publicly disclosed on April 29, 2026, Copy.Fail is a Linux kernel zero-day that roots every distribution since 2017. Unlike many local privilege-escalation flaws that depend on race conditions, kernel address leaks, or distribution-specific behavior, Copy.Fail is alarmingly reliable: it works consistently across mainstream Linux distributions with only a standard user account.

Why the CVE-2026-31431 is dangerous?

Copy.Fail abuses a logic flaw in the Linux kernel’s algif_aead crypto module, introduced through a 2017 optimization. By manipulating the kernel’s AF_ALG crypto interface, an attacker can write controlled data into the Linux page cache (the in-memory representation of trusted system binaries).

This allows attackers to temporarily hijack binaries like /usr/bin/su without modifying the file on disk.

In practical terms:

A normal user can become root
A compromised container can escape to the host
A malicious CI job can root its runner
Shared infrastructure becomes vulnerable across tenants
Disk forensics may show no file tampering because only RAM is altered

This makes Copy.Fail especially dangerous for:

Kubernetes clusters
CI/CD systems
Shared development environments
Cloud notebook platforms
Multi-tenant container infrastructure

How to patch it easily in your MKS clusters?

OVHcloud is preparing patched MKS versions including the upstream kernel fix. Patched versions are expected to be available 30 April 2026, at 16:00 UTC+2.

While waiting for the next MKS release, here is a DaemonSet manifest that you can apply in your MKS clusters in order to mitigate the vulnerability.

Create a patch-copy-fail-cve file with the following content:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: patch-copy-fail-cve
  labels:
    app: patch-copy-fail-cve
  namespace: default
spec:
  selector:
    matchLabels:
      app: patch-copy-fail-cve
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 100%
  template:
    metadata:
      labels:
        app: patch-copy-fail-cve
    spec:
      hostPID: true
      priorityClassName: system-node-critical
      volumes:
        - name: root-mount
          hostPath:
            path: /
            type: Directory
      initContainers:
        - image: mks.kubernatine.ovh/docker.io/library/busybox:1.36.1
          name: patch-copy-fail-cve
          command: ["/bin/bash", "-c"]
          args:
            - |
              tee /etc/modprobe.d/disable-algif-aead.conf <<<'install algif_aead /bin/false'
              rmmod algif_aead 2>/dev/null
              update-initramfs -u
          securityContext:
            privileged: true
            runAsUser: 0
          volumeMounts:
            - name: root-mount
              mountPath: /
      containers:
        - image: "mks.kubernatine.ovh/registry.k8s.io/pause:3.10.1"
          name: pause

Apply it:

kubectl apply -f patch-copy-fail-cve.yaml

⚠️ This mitigation has been tested on OVHcloud internal test clusters. Applying it to your own service remains under your responsibility.

If the vulnerability has already been exploited on your cluster, this mitigation will not remediate any pre-existing compromise.
The recommended remediation remains the official security release, which will be made available as soon as possible.

Read more about the mitigation: https://github.com/rootsecdev/cve_2026_31431#mitigation

KubeCon + CloudNativeCon Europe 2026 in Amsterdam: feedback and highlights

Aurélie Vache and Rémy Vandepoel — Wed, 29 Apr 2026 07:00:31 +0000

From March 23 to 26, 2026, the KubeCon + CloudNativeCon Europe took place in Amsterdam.

Aurélie Vache and Rémy Vandepoel attended alongside 26 other OVHcloud employees. In this blog, they share their thoughts about this second KubeCon set in the land of tulips.

KubeCon Europe 2026: the maturity milestone

Back from Amsterdam, the buzz of the RAI halls still echoes in our ears. This 2026 edition of KubeCon + CloudNativeCon Europe wasn’t just another Kubernetes conference. It marked a turning point for this event: the point of maturity. And this is evident just by looking at the numbers: 13,500 attendees for this edition! The largest attendance ever recorded!

While previous years were about exploration and expansion, 2026 was the year of massive industrialization, with one non-negotiable pre-requirement: digital sovereignty.

Key figures from the 2026 edition:

13,500+ attendees (46% first-time attendees)
100 countries represented
3,474 unique organizations/companies
891 sessions
230 projects in the CNCF landscape with 19.9 million contributors

CNCF Contributors by Geography (Last 12 Months)

Europe: 38.8% of contributions (ahead of the United States)
United States: 36.29%
Germany: 9.82% (leading in Europe)
France: 4.68%
Switzerland: 2.49%
Strong signals for digital sovereignty, a key theme of this year’s keynotes 💪

Colocated events

KubeCon + CloudNativeCon Europe 2026 traditionally kicks off with a full day dedicated to co-located events. This year was no exception, with an impressive lineup of 16 events, including well-known favorites such as ArgoCon, BackstageCon, CiliumCon, Platform Engineering Day, Kubernetes on Edge Day, and Observability Day.

Among the newcomer events, Open Sovereign Cloud Day was a stand out, as it highlighted the growing importance of cloud sovereignty in Europe.

During CiliumCon, we were proud to see the spotlight on our MKS Standard offer 🚀.

OVHcloud Presence

OVHcloud had a strong presence at the event, with two different booths serving two different purposes.

One was located in the Activation Zone, designed as an interactive space to engage with attendees through a video game “Gaming Camp: Beat Cloud Villains!”, described as “Join the fight against the villains of the cloud. Take on Hidden Cost, Jailor Stack, and Autonomous Zero, and prove yourself as a true Guardian of the Cloud.”

Players were welcomed to step into a two-player fighting game inspired by the style of Street Fighter, where strategy and skill are your best weapons. Winners won exclusive t-shirts.

The second booth had a more corporate focus, highlighting OVHcloud’s broader portfolio, strategic positioning, and enterprise offerings. It provided a space for deeper conversations around demos, use cases, and cloud strategies.

The opportunity was too good to pass up, so we took the chance to interview key players in the ecosystem, as well as customers of our solutions.

We conducted five interviews and had many discussions, and we can’t wait to share them with you soon!

Here’s a sneak peek featuring Sudeep Goswami, CEO of Traefik Labs:

These interviews will soon be available on YouTube, so stay tuned!

Aurélie Vache’s talk

Getting accepted to KubeCon is not easy, and Aurélie, our Developer Advocate and CNCF Ambassador, rose to the challenge by once again presenting a new talk.

“The Ultimate Kubernetes Challenge: An Interactive Trivia Game”:

“Kubernetes has become the de facto standard for deploying and operating containerized applications. We use it, as well as its ecosystem, on a daily basis, but do we know them as well as we think we do?

With a mix of quiz and live demos, come learn and/or improve your knowledge. You will discover (or rediscover) the key concepts of Kubernetes (pods, secrets, services…), internal components but also best practices.

In this fun and dynamic talk, come compete throughout the quiz and explore the wonderful world of Kubernetes.

Icing on the cake: the first will win some swags.“

During this talk, attendees tested their Kubernetes knowledge through an interactive quiz, with results presented via illustrated slides and live, hands-on demos.

Giving a talk at 5 p.m., during the final session of the second day, was an ambitious way to finish up. But thanks to the interactive format of her talk, attendees were able to enjoy testing their knowledge while discovering tips about Kubernetes and its concepts and features.

Three OVHcloud MKS clusters were created especially for the occasion, one with 3 nodes, one with zero nodes, and one with 3 nodes across 3 Availability Zones:

Watch the talk here:

Keynotes: Toward “Agent-Based” and Autonomous AI

Plenary sessions at the event were dominated by a convergence of Kubernetes and Artificial Intelligence. This term, already ubiquitous in tech news, was bound to be a major focus here. Jonathan Bryce, the Executive Director of Cloud & Infrastructure at the Linux Foundation and an iconic figure in the ecosystem, made a strong point by reminding the audience that while Kubernetes is everywhere (82% adoption rate), AI in production remains a major challenge.

In November, during the latest KubeCon + CoudNativeCon NA at Atlanta, the CNCF launched the “Certified Kubernetes AI Conformance Program to Standardize AI Workloads on Kubernetes“. 5 months later, several companies including the OVHcloud Managed Kubernetes Services (MKS) platform, succeeded this new program with their own certified Kubernetes AI platform.

During the keynotes we even saw a real plane!

And to top it off, seeing Michelin present the Top End User Award to SNCF was a real highlight for us. Cocoricoooo! 🇫🇷

Key Trends

Find below the most frequently discussed technical pillars that will remain prominent in the coming months and years:

* Agent-based AI: The focus is shifting from training to inference. The announcement of Dapr Agents 1.0 shows that Kubernetes will now orchestrate agents capable of making real-time decisions on the infrastructure.

* GPU Standardization (DRA): Thanks to NVIDIA’s widespread adoption of Dynamic Resource Allocation (DRA) drivers, GPU scheduling is becoming as simple and granular as CPU scheduling. A boon for cost optimization.

* Sovereignty: Sovereignty is no longer a legal concept; it is an architecture. We have seen a rise in encryption tools for data in transit and at rest (Confidential Computing) natively integrated into CNIs such as Cilium.

* FinOps 2.0: With 67% of AI compute dedicated to inference by the end of 2026, precise monitoring of GPU consumption via projects like Kepler has become essential for the economic viability of projects.

The Gateway API is becoming the standard

As we announced in our blog post “Moving Beyond Ingress: Why should OVHcloud Managed Kubernetes Service (MKS) users start looking at the Gateway API?”, the ingress-nginx controller, the most widely used ingress controller, has now been archived.

Now, after 8 years of development, 275 released versions, and nearly 20k GitHub stars, the maintainers of the Kubernetes Gateway API introduced ingress2gateway v1.0, a tool designed to simplify migration. It automatically converts Ingress resources including annotations into Gateway API resources. The recommended approach remains pragmatic: first migrate the controller while keeping existing Ingress objects, then gradually transition to the Gateway API. Attempting a full migration in a single step is considered risky and unnecessary.

Additionally, Gateway API version 1.5 represents a major milestone: five features have moved from experimental status to the Standard channel in a single release.

Amongst them:

ListenerSet: delegates TLS listener management outside of the Gateway
TLSRoute: SNI-based routing in either termination or passthrough mode
Client certificate validation for mTLS at the ingress layer
Native CORS filter for HTTPRoute

The Kubernetes Gateway API is now establishing itself as much more than just a successor to Ingress: it is evolving into Kubernetes’ unified network control plane.

Favorite talk

As usual, Aurélie wasn’t able to attend many talks, but among the 2-3 she did see, there was one that really had a “wow” effect on her:

« An immersive and visual journey into kubernetes networking ».

Benoit, a DevSecOps engineer at Feesh in Switzerland with extensive expertise in Kubernetes networking, created a video game using Godot with four levels: “pod-to-pod basics”, “pod-to-pod advanced”, “service mesh sidecar”, and “service mesh with ambient mode”.

Across these four levels, he explains Kubernetes networking in a vanilla setup, then with Cilium and Istio, all from the perspective of a TCP packet, represented as a fish.

Networking and I don’t exactly get along, and I’ll admit I’ve always struggled with it. Even now, although I’ve had no choice but to work with Kubernetes and service mesh, I still find it challenging. But seeing the fish swim from frontend to backend, enter a building underwater (the node), interact with an eBPF program… it really makes things more visual and intuitive.

On Thursday morning, after the keynote, the room with 2000 seats was packed!

Explaining networking by building a 3D game from scratch specifically for the occasion: hats off to you!

Benoit had an issue on stage, because he had built the game in 4K and it didn’t display properly on the projection screen. Luckily, about 30 seconds before showtime, the production team and he managed to fix it. He went on stage without showing any of that stress 💪.

Replay:

KubeCon in 45 seconds

To keep memories of these 3-4 amazing days, we created a “KubeCon Europe 2026 in 45 seconds movie:

#KubeCon 2026 in 45 seconds 🎥⏱️

The energy. Conversations. The community.#Sovereignty, #Kubernetes at scale, #reversibility — same themes in every conversation. That's why we show up.

Thanks for the moments you can't script 👋#CloudNativeCon pic.twitter.com/dBinAqM04u
— OVHcloud (@OVHcloud) April 14, 2026

Conclusion

KubeCon Amsterdam proved once again that the strength of open source lies in its community.

From the halls of the RAI to the technical sessions, the excitement was palpable. We’re leaving with our heads full of ideas, but above all with the certainty that collaboration remains the key to solving the complex challenges of modern IT. This was particularly evident in the packed conference rooms and the crowded aisles of the exhibition hall.

One thing is certain: the future of Cloud Native is being written together, and we at OVHcloud look forward to contributing to it with you by helping you get the most out of Kubernetes through our managed platform. Because we’re convinced that for businesses in 2026, the challenge will no longer be how to run Kubernetes, but how to use it to innovate faster and better than the competition.

Discover the External Secret Operator (ESO) OVHcloud Provider to manage your Kubernetes secrets 🎉

Aurélie Vache — Tue, 14 Apr 2026 07:02:22 +0000

Several months ago, we released the Beta version of the OVHcloud Secret Manager and we guided you how to manage your secrets thanks to the existing External Secret Operator (ESO) Hashicorp Vault provider.

As our Secret Manager is now in General Availability, our teams worked on the development of an OVHcloud ESO Provider now available in the ESO v2.3.0 new release 🎉.

In this blog post, you will learn how to create a new secret in the OVHcloud Secret Manager and how to manage it within your Kubernetes clusters through the OVHcloud ESO provider.

External Secrets Operator (ESO)

The External Secrets Operator (ESO), a CNCF sanbox project since 2022, is a Kubernetes operator that integrates external secret management systems.

The operator reads the information from an external APIs and automatically injects the values into a Kubernetes Secret. If the secret changes in the external API, the operator updates the secret in the Kubernetes cluster.

The ESO connects to an external Secret Manager, such as OVHcloud, Vault, AWS, or GCP, via a provider configured in a (Cluster)SecretStore. An ExternalSecret resource then specifies which secrets to retrieve. ESO fetches those values and creates a corresponding Kubernetes Secret within the cluster.

For more details, read the ESO official documentation.

Prerequisites

To be able to use the ESO OVHcloud provider, you need to follow some prerequisites:

Have an OVHcloud account
Created an OKMS domain (“305db938-331f-454d-83a7-3a0a29291661” for example in this blog post)
Created an IAM local user (“secretmanager-305db938-331f-454d-83a7-3a0a29291661” for example in this blog post)
Installed the OVHcloud CLI
Have a Kubernetes cluster

The ESO OVH provider supports both token and mTLS authentication. In this blog post, we will use the token authentication mode. Please follow the OVHcloud ESO provider guide if you wish to use mTLS authentication mode.

Generate a PAT token (For token authentication only)

The ESO (Cluster)SecretStore needs the permission to fetch secrets from Secret Manager.

If you want to use token autentication, you’ll need a token (PAT). You can use the ovhcloud CLI to do that:

PAT_TOKEN=$(ovhcloud iam user token create  --name pat- --description "PAT secret manager for domain " -o json  | jq .details.token |  tr -d '"')

echo $PAT_TOKEN

You should have a result like this:

$ PAT_TOKEN=$(ovhcloud iam user token create secretmanager-305db938-331f-454d-83a7-3a0a29291661 --name pat-secretmanager-305db938-331f-454d-83a7-3a0a29291661 --description "PAT secret manager for domain 305db938-331f-454d-83a7-3a0a29291661" -o json  | jq .details.token |  tr -d '"')
2026/04/07 14:07:45 Final parameters:
{
 "description": "PAT secret manager for domain 305db938-331f-454d-83a7-3a0a29291661",
 "name": "pat-secretmanager-305db938-331f-454d-83a7-3a0a29291661"
}

$ echo $PAT_TOKEN
eyJhbGciOiJFZERTQSIsImtpZCI6IjgzMkFGNUE5ODg3MzFCMDNGM0EzMTRFMDJFRUJFRjBGNDE5MUY0Q0YiLCJraW5kIjoicGF0IiwidHlwIjoiSldUIn0.eyJ0b2tlbiI6InBBSFh1WE5JdVNHYVpmV3F2OUFzVmJrU3UwR2UySTJrdFU0OGdTZkwyZ1k9In0.-VDbiUf4vNm1KB9qSv7i4sGMCvxs_EuZFAETB-eaOFf3IX8-9m7akN800--ASgXy55_DDFHdy4Z5uSq8lww-Bw

Encode the PAT token in base 64 and save it in an environment variable:

export PAT_TOKEN_B64=$(echo -n $PAT_TOKEN | base64)
echo $PAT_TOKEN_B64

Retrieve and save the KMS information

List the OKMS domains:

$ ovhcloud okms list
┌──────────────────────────────────────┬─────────────┐
│                  id                  │   region    │
├──────────────────────────────────────┼─────────────┤
│ 305db938-331f-454d-83a7-3a0a29291661 │ eu-west-par │
│ xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx │ eu-west-par │
└──────────────────────────────────────┴─────────────┘

Save the KMS endpoint and the OKMS ID in two environment variables. For example:

export OKMS_ID="305db938-331f-454d-83a7-3a0a29291661"
export KMS_ENDPOINT=$(ovhcloud okms get 305db938-331f-454d-83a7-3a0a29291661 -o json | jq .restEndpoint | xargs)

Create a secret in the Secret Manager

In the OVHcloud Control Panel (UI), go to ‘Secret Manager’ section and click on the Create a secret button.

Then in order to create a secret ‘prod/eu-west-par/dockerconfigjson’, in the Europe region (France – Paris) eu-west-par, choose this region:

Then, choose the OKMS domain and create”prod/eu-west-par/dockerconfigjson” in the path and fill the content:

Finally, click on the Create button to finalise the creation of the new secret.

Install or update the ESO

If you’d never installed ESO in your Kubernetes cluster, you can install it via Helm:

helm repo add external-secrets https://charts.external-secrets.io
helm repo update

helm install external-secrets \
   external-secrets/external-secrets \
    -n external-secrets \
    --create-namespace \
    --set installCRDs=true

If you already installed it, now you should update it in order to use this new provider:

helm upgrade external-secrets external-secrets/external-secrets -n external-secrets

⚠️ In order to use the OVHcloud provider, you need to have a running instance of ESO equals to version 2.3.0 or more.

$ helm list -n external-secrets

NAME            	NAMESPACE       	REVISION	UPDATED                              	STATUS  	CHART                 	APP VERSION
external-secrets	external-secrets	1       	2026-04-13 13:56:29.071329 +0200 CEST	deployed	external-secrets-2.3.0	v2.3.0

Let’s deploy a Secret in Kubernetes using the ESO provider!

Deploy a ClusterSecretStore to connect ESO to Secret Manager

Set up a ClusterSecretStore to manage synchronization with Secret Manager.
It will use the OVHcloud provider with token authorization mode, and the OKMS endpoint as the backend.

Create a clustersecretstore.yaml.template file with the content below:

apiVersion: external-secrets.io/v1
kind: ClusterSecretStore
metadata:
  name: secret-store-ovh
spec:
  provider:
    ovh:
      server: "$KMS_ENDPOINT" # for example: "https://eu-west-rbx.okms.ovh.net"
      okmsid: "$OKMS_ID" # for example: "734b9b45-8b1a-469c-b140-b10bd6540017"
      auth:
        token:
          tokenSecretRef:
            name: ovh-token
            namespace: external-secrets
            key: token
---
apiVersion: v1
kind: Secret
metadata:
  name: ovh-token
  namespace: external-secrets
data:
  token: $PAT_TOKEN_B64

Generate the clustersecretstore.yaml file from the environment variables you defined:

envsubst < clustersecretstore.yaml.template > clustersecretstore.yaml

You should obtain a file filled with the OVHcloud KMS information:

apiVersion: external-secrets.io/v1
kind: ClusterSecretStore
metadata:
  name: secret-store-ovh
spec:
  provider:
    ovh:
      server: "https://eu-west-par.okms.ovh.net" # for example: "https://eu-west-rbx.okms.ovh.net"
      okmsid: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" # for example: "734b9b45-8b1a-469c-b140-b10bd6540017"
      auth:
        token:
          tokenSecretRef:
            name: ovh-token
            namespace: external-secrets
            key: token
---
apiVersion: v1
kind: Secret
metadata:
  name: ovh-token
  namespace: external-secrets
data:
  token: ZXlK...UJ3

Apply it in your Kubernetes cluster:

kubectl apply -f clustersecretstore.yaml

Check:

$ kubectl get clustersecretstore.external-secrets.io/secret-store-ovh

NAME               AGE   STATUS   CAPABILITIES   READY
secret-store-ovh   7s    Valid    ReadWrite      True

Create an ExternalSecret

Create an externalsecret.yaml file with the content below:

apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: docker-config-secret
  namespace: external-secrets
spec:
  refreshInterval: 30m
  secretStoreRef:
    name: secret-store-ovh
    kind: ClusterSecretStore
  target:
    template:
      type: kubernetes.io/dockerconfigjson
      data:
        .dockerconfigjson: "{{ .mysecret | toString }}"
    name: ovhregistrycred
    creationPolicy: Owner
  data:
  - secretKey: ovhregistrycred
    remoteRef:
      key: prod/eu-west-par/dockerconfigjson

Apply it:

$ kubectl apply -f externalsecret.yaml

externalsecret.external-secrets.io/docker-config-secret created

Check:

$ kubectl get externalsecret.external-secrets.io/docker-config-secret -n external-secrets 

NAME                   STORETYPE            STORE              REFRESH INTERVAL   STATUS         READY   LAST SYNC
docker-config-secret   ClusterSecretStore   secret-store-ovh   30m                SecretSynced   True    4s

After applying this command, it will create a Kubernetes Secret object.

$ kubectl get secret ovhregistrycred -n external-secrets

NAME              TYPE                             DATA   AGE
ovhregistrycred   kubernetes.io/dockerconfigjson   1      49s

The Kubernetes Secret have been created 🎉

We created a Secret directly from the key, but the OVHcloud ESO provider allows you to fetch the original secret from different parameters (fetch the whole secret, fetch nested values, fetch multiple secrets…), according to your needs.

Conclusion

In this blog, we’ve explained how to create secrets in the OVHcloud Secret Manager and then integrate them directly in your Kubernetes clusters using the new ESO OVHcloud provider.

With this brand new OVHcloud provider, you will have a smoother integration between the Secret Manager and your Kubernetes clusters with ESO.

Our team are working on several other integrations, so stay tuned, and please share your thoughts with us!

Wrappers on Linux Workstations

Isabelle Bauer — Mon, 13 Apr 2026 14:26:16 +0000

As Linux Sys Admins, we are sometimes faced with dilemmas regarding what we can or cannot allow on machines.

Some functionalities are very important to users for their daily tasks and overall better use of their devices, but they sometimes also come with security concerns.
We came up with a way to still allow most of these functionalities, while having more control over them, but also their outcome.

While the Linux user community is technical, their missions are still quite heterogeneous. They can range from developers, sysadmins, network engineers and more…
And they all work with very different workflows (from front-end web to the low-level driver). Sometimes on the laptop, on a docker, on a local VM or remotely on a development VM. Some may even need to hook up via a specific hardware.

Which leaves us users who are very much used to having access to every aspect of their personal computers rather frustrated when they are too limited.

Combining Usability and Security

Wrappers are usually used for abstraction and convenience; they often are relied on to simplify command-line workflows, enforce consistent parameters, or adapt legacy tools to modern environments.
In the case at hand, they are used a bit more like “guardrails.”

Take, as example, package management. Tools like apt are powerful but inherently risky when misused (or maliciously used), and capable of altering a system’s status by removing critical dependencies, etc…
Instead of exposing these tools directly (or completely removing access to them), our team provides a wrapped version that preserves essential functionality, while explicitly blocking operations that could compromise the system’s integrity.

Why would our user base need access to apt? Why not just completely remove that option?

Since our user base is rather technical, and knows their operating system rather well in general, they should be able to install authorized packages, or to update or remove them whenever they like (even though we also have a daily, automatic updates running too).
Plus, if they encounter any basic dpkg/apt issue, it makes sense for them to be able to resolve them autonomously.

Here is the list of options we made available:

Usage:
ovh-apt  [OPTIONS]  [package...]
ovh-apt 
ovh-apt  (this executes apt-get install -f)
ovh-apt  (this executes dpkg --configure -a)
Examples:
ovh-apt update
ovh-apt install vim
ovh-apt install vim=2:8.1.2269-1ubuntu5.17
ovh-apt install --only-upgrade bash
ovh-apt fix

We also have a list of protected packages, to avoid having very useful ones deleted; firewall configuration, systemd, sudo, etc…
Basically, this includes everything that could have an impact on security or system integrity. As well, this wrapper in specific is non-interactive – in order to make sure a root shell is never offered – as can be the case natively with dpkg.

How to prevent specific packages from being uninstalled?

We have a .txt file containing a bunch of package names (one per line).
In our ovh-apt script, we look into that file, and if we find a corresponding package, we exit the script.

re="^(Purg|Remv) ([^ ]+) "
IFS="
"
protected="$(cat /etc/ovh/ovh-apt/protected.txt)"
apt_output=$(cat nohup.out)
for line in $apt_output ; do
if [[ "$line" =~ $re ]]; then
package="${BASH_REMATCH[2]}"
if [[ " ${protected[*]} " =~ [[:space:]]${package}[[:space:]] ]]; then
echo "Error: Package $package is protected, won't do."
cancel=1
fi
fi
done
unset IFS

There is more

By default, you would need root access on unix systems to be able to change the keyboard layout. We decided to make a wrapper to allow for some users to set the layout of their choice.
This implementation was also heavily requested by Linux users, and very understandably so.

Here’s how it works:

Usage: ovh-keyboard  [options]
Commands:
show -> Show current keyboard configuration.
set  -> Update keyboard configuration.
Valid options: fr, us, gb, ca, es, it, de, pt

Just for funsies, here is the list of other wrappers we use:

ovh_backlightctl Allows user to control the backlight options of their monitors.
ovh_snap Allows users to manage a list of snap packages on their device protected
ovh_swapclean Obviously.
ovh-systemctl Allows specific and unharmful systemctl commands.
nmcli_wrapper Blocks the –show-secrets options with nmcli. ‘Cause we don’t want secrets to be seen (that’s why they’re secret).

We definitely will keep on using wrappers, whether it is for user or security needs, when the use case allows it. We find this way of handling the accessibility / security compromise fits quite well with how we manage the Linux parc so far.

Reference Architecture: Deploying a vision-language model with vLLM on OVHcloud MKS for high performance inference and full observability

Eléa Petton — Fri, 10 Apr 2026 07:48:53 +0000

Ensure complete digital sovereignty of your AI models with end-to-end control through open-source solutions on OVHcloud’s Managed Kubernetes Service.

vLLM on OVHcloud MKS for high availability and full observability

This reference architecture demonstrates how to deploy a Large Language Model (LLM) inference system using vLLM on OVHcloud Managed Kubernetes Service (MKS). The solution leverages NVIDIA L40S GPUs to serve the Qwen3-VL-8B-Instruct multimodal model (vision + text) with OpenAI-compatible API endpoints.

This comprehensive guide shows you how to deploy, to scale automatically, and how to monitor vLLM-based LLM workloads on the OVHcloud infrastructure.

What are the key benefits?

Cost-effectiveness: Leverage managed services to minimise operational overhead
Real-time observability: Track Time-to-First-Token (TTFT), throughput, and resource utilisation
Sovereign infrastructure: Keep all metrics and data within European datacentres
Scalable by design: Automatically scale GPU inference replicas based on real workload demand

Context

Managed Kubernetes Service

OVHcloud MKS is a fully managed Kubernetes platform designed to help you deploy, operate, and scale containerised applications in production. It provides a secure and reliable Kubernetes environment without the operational overhead of managing the control plane.

How does this benefit you?

Cost-efficient: Pay only for worker nodes and consumed resources, with no additional charge for the Kubernetes control plane
Fully managed Kubernetes: Certified upstream Kubernetes with automated control plane management, provided upgrades and high availability
Production-ready by design: Built-in integrations with OVHcloud Load Balancers, networking, and persistent storage
Scalable and flexible: Scale workloads easily, node pools to match application demand
Open and portable: Based on standard Kubernetes APIs, enable seamless integration with open-source ecosystems and avoid vendor lock-in

In the following guide, all services are deployed within the OVHcloud Public Cloud.

Architecture overview

This reference architecture demonstrates a basic deployment of vLLM for vision-language model inference on OVHcloud Managed Kubernetes Service, featuring:

High-availability deployment with 2 GPU nodes (NVIDIA L40S)
Optimised GPU utilisation with proper driver configuration
Scalable infrastructure supporting vision-language models
Comprehensive monitoring using Prometheus, Grafana, and DCGM
Full observability for both application and hardware metrics

Data flow:

Data Flow

Inference request:
- User → LoadBalancer → Gateway → NGINX Ingress → “Qwen3 VL” Service → vLLM Pod → GPU
- Response follows reverse path with streaming support
Metrics collection:
- vLLM Pods expose /metrics endpoint (port 8000)
- DCGM Exporters expose GPU metrics (port 9400)
- Prometheus scrapes both endpoints every 30 seconds
- Grafana queries Prometheus for visualization
Load distribution
- NGINX Ingress uses cookie-based session affinity
- vLLM Service uses ClientIP session affinity
- Anti-affinity ensures 1 pod per GPU node

Prerequisites

Before you begin, ensure you have:

An OVHcloud Public Cloud account
An OpenStack user with the Administrator role
Hugging Face access – create a Hugging Face account and generate an access token
kubectl already installed and helm installed (at least version 3.x)

🚀 Now you have all the ingredients, it’s time to deploy the recipe for Qwen/Qwen3-VL-8B-Instruct using vLLM and MKS!

Architecture guide: Native GPU deployment of vLLM on MKS with full stack observability

This reference architecture describes a Large Language Model deployment using vLLM inference server and Kubernetes, to enjoy the benefits of a service that’s both highly available and monitorable in real time.

Step 1 – Create MKS cluster and Node pools

From OVHcloud Control Panel, create a Kubernetes cluster using the MKS.

Navigate to: Public Cloud → Managed Kubernetes Service → Create a cluster

1. Configure cluster

Consider using the following configuration for the current use case:

Name: vllm-deployment-l40s-qwen3-8b
Location: 1-AZ Region – Gravelines (GRA11)
Plan: Free (or Standard)
Network: attach a Private network (e.g. 0000 - AI Private Network)
Version: Latest stable (e.g. 1.34)

2. Create GPU Node pool

During the cluster creation, configure the vLLM Node pool for GPUs:

Node pool name: vllm
Flavor: L40S-90
Number of nodes: 2
Autoscaling: Disabled (OFF)

Why L40S-90?

Cost-effective for single-model deployment (1 GPU per node)
Sufficient RAM (90GB) for Qwen3-VL-8B model

You should see your cluster (e.g. vllm-deployment-l40s-qwen3-8b) in the list, along with the following information:

You can now set up the node pool dedicated to monitoring.

3. Create CPU Node pool

From your cluster, click on Add a node pool and configure it as follow:

Node pool name: monitoring
Flavor: B2-15
Number of nodes: 1
Autoscaling: Disabled (OFF)

✅ Note

Monitoring stack can run on GPU nodes if cost is a concern. Dedicated CPU node provides better isolation and resource management.

If the status is green with the OK label, you can proceed to the next step.

4. Configure Kubernetes access

Once your nodes have been provisioned, you can download the Kubeconfig file and configure kubectl with your MKS cluster.

# configure kubectl with your MKS cluster
export KUBECONFIG=/path/to/your/kubeconfig-xxxxxx.yml

# verify cluster connectivity
kubectl cluster-info
kubectl get nodes

Returning:

NAME STATUS ROLES AGE VERSION monitoring-node-xxxxxx Ready 1d v1.34.2 vllm-node-yyyyyy Ready 1d v1.34.2 vllm-node-zzzzzz Ready 1d v1.34.2

Before going further, add a label to the CPU node for monitoring workloads.

CPU_NODE=$(kubectl get nodes -o json | \
  jq -r '.items[] | select(.status.allocatable."nvidia.com/gpu" == null) | .metadata.name')
kubectl label node $CPU_NODE node-role=monitoring

Finally, check with the following command:

NAME                     GPU      ROLE
monitoring-node-xxxxxx      monitoring
vllm-node-yyyyyy         1        
vllm-node-zzzzzz         1

Once both nodes are in Ready status, you can proceed to the next step.

Step 2 – Install GPU operator

To start, consider setting up the GPU operator.

✅ Note

This step is based on this OVHcloud documentation: Deploying a GPU application on OVHcloud Managed Kubernetes Service

1. Add NVIDIA helm repository and create namespace

Add NVIDIA helm repo:

helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update

And create Namespace as follow.

kubectl create namespace gpu-operator

2. Install GPU operator with correct configuration

The GPU Operator must be configured with specific driver versions to ensure compatibility with vLLM containers.

However, the default installation uses recent drivers (580.x with CUDA 13.x) which are incompatible with vLLM containers (CUDA 12.x).

Solution: Force driver version 535.183.01 (CUDA 12.2).

helm install gpu-operator nvidia/gpu-operator \
  -n gpu-operator \
  --set driver.enabled=true \
  --set driver.version="535.183.01" \
  --set toolkit.enabled=true \
  --set operator.defaultRuntime=containerd \
  --set devicePlugin.enabled=true \
  --set dcgmExporter.enabled=true \
  --set dcgmExporter.image="dcgm-exporter" \
  --set dcgmExporter.version="3.1.7-3.1.4-ubuntu20.04" \
  --set gfd.enabled=true \
  --set migManager.enabled=false \
  --set nodeStatusExporter.enabled=true \
  --set validator.driver.enable=false \
  --set validator.toolkit.enable=false \
  --set validator.plugin.enable=false \
  --timeout 20m

✅ Note

Specifying the DCGM version may only be necessary if you encounter problems with the default image (e.g. ‘ImagePullBackOff’). If this is the case, add the following parameters:
--set dcgmExporter.repository="nvcr.io/nvidia/k8s" --set dcgmExporter.image="dcgm-exporter" --set dcgmExporter.version="3.1.7-3.1.4-ubuntu20.04"

kubectl get pods -n gpu-operator

Note that all pods should reach Running state in 5-10 minutes.

You can also check the GPU availability:

kubectl get nodes -o json | jq -r '.items[] | select(.status.allocatable."nvidia.com/gpu" != null) | "\(.metadata.name): \(.status.allocatable."nvidia.com/gpu") GPU(s)"'

Returning:

vllm-node-yyyyyy: 1 GPU(s) vllm-node-zzzzzz: 1 GPU(s)

And you can test to run nvidia-smi:

DRIVER_POD=$(kubectl get pods -n gpu-operator -l app=nvidia-driver-daemonset -o name | head -1)
kubectl exec -n gpu-operator $DRIVER_POD -- nvidia-smi

If GPU tests are working properly, you can move on DCGM service configuration.

3. Configure DCGM service

Why is DCGM Exporter required?

DCGM (Data Centre GPU Manager) is NVIDIA’s official tool for monitoring GPUs in production. The goal is to be able to collect and display metrics from both GPU nodes.

GPU monitoring with DCGM

The metrics provided are:

DCGM_FI_DEV_GPU_UTIL – GPU utilisation (%)
DCGM_FI_DEV_GPU_TEMP – GPU temperature (°C)
DCGM_FI_DEV_FB_USED – VRAM used (MB)
DCGM_FI_DEV_FB_FREE – Free VRAM (MB)
DCGM_FI_DEV_POWER_USAGE – Power consumption (W)
And 50+ other GPU metrics

Next, ensure DCGM service has the correct labels and port configuration:

kubectl patch svc nvidia-dcgm-exporter -n gpu-operator --type merge -p '{
  "metadata": {
    "labels": {
      "app": "nvidia-dcgm-exporter"
    }
  },
  "spec": {
    "ports": [
      {
        "name": "metrics",
        "port": 9400,
        "targetPort": 9400,
        "protocol": "TCP"
      }
    ]
  }
}'

Verify the endpoints (should show 2 IPs, one per GPU node).

kubectl get endpoints nvidia-dcgm-exporter -n gpu-operator

NAME ENDPOINTS AGE nvidia-dcgm-exporter x.x.x.x:9400,x.x.x.x:9400 17d

Step 3 – Deploy Qwen3 VL 8B with vLLM inference server

The deployment of the Qwen 3 VL 8B model on two L40S GPU nodes is carried out in several stages.

1. Create namespace and Hugging Face secret

Start by creating Namespace:

kubectl create namespace vllm

Next, you must retrieve your Hugging Face token and replace the HF_TOKEN value by your own:

export HF_TOKEN="hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

Create your secret as follow:

kubectl create secret generic huggingface-secret \
  --from-literal=token=$HF_TOKEN \
  --namespace=vllm

Verify you obtain the following output by launching:

kubectl get secret huggingface-secret -n vllm

NAME TYPE DATA AGE huggingface-secret Opaque 1 14d

2. Create vLLM deployment configuration

First, you can create vllm-deployment-2nodes.yaml file.

Deploy vLLM:

kubectl apply -f vllm-deployment-2nodes.yaml

You can monitor the deployment (it should take 8-10 minutes for model download and loading).

kubectl get pods -n vllm -o wide -w

Expected output after 10 minutes:

NAME               READY  STATUS   RESTARTS  AGE  IP       NODE  
qwen3-vl-xxxx-yyy  1/1    Running  0         1d   X.X.X.X  vllm-node-yyyyyy
qwen3-vl-xxxx-zzz  1/1    Running  0         1d   X.X.X.X  vllm-node-zzzzzz

You can also check the container logs:

kubectl logs -f -n vllm

You should find in the logs: “Uvicorn running on http://0.0.0.0:8000“

Is everything installed correctly? Then let’s move on to the next step.

3. Add service label

Ensure service has the correct label for ServiceMonitor discovery.

kubectl label svc qwen3-vl-service -n vllm app=qwen3-vl --overwrite

You can now verify by launching the following command.

kubectl get svc qwen3-vl-service -n vllm --show-labels | grep "app=qwen3-vl"

Returning:

qwen3-vl-service ClusterIP X.X.X.X 8000/TCP 1d app=qwen3-vl

Step 4 – Install NGINX ingress controller

⚠️ Moving beyond Ingress

Follow this tutorial if you want to use Gateway instead of Ingress.

1. Add helm repository and configure Ingress

First of all, add helm repository:

helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update

Create configuration file with ingress-nginx-values.yaml.

Then, install NGINX Ingress:

helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx \
  --create-namespace \
  -f ingress-nginx-values.yaml \
  --wait

Wait for LoadBalancer IP. The external IP assignment should take 1-2 minutes.

kubectl get svc -n ingress-nginx ingress-nginx-controller -w

Once is no longer , Ctrl+C and export it:

export EXTERNAL_IP=
echo "API URL: http://$EXTERNAL_IP"

2. Create vLLM Ingress resource

Next, create vLLM Ingress using vllm-ingress.yaml.

Apply it as follow:

kubectl apply -f vllm-ingress.yaml

You can now test different API calls to verify that your deployment is functional.

3. Test API

Firstly, check if the model is available:

curl http://$EXTERNAL_IP/v1/models | jq

{
  "object": "list",
  "data": [
    {
      "id": "qwen3-vl-8b",
      "object": "model",
      "created": 1772472143,
      "owned_by": "vllm",
      "root": "Qwen/Qwen3-VL-8B-Instruct",
      "parent": null,
      "max_model_len": 8192,
      "permission": [
        {
          "id": "modelperm-8fb35cdd3208b068",
          "object": "model_permission",
          "created": 1772472143,
          "allow_create_engine": false,
          "allow_sampling": true,
          "allow_logprobs": true,
          "allow_search_indices": false,
          "allow_view": true,
          "allow_fine_tuning": false,
          "organization": "*",
          "group": null,
          "is_blocking": false
        }
      ]
    }
  ]
}

Next, test inference using the following request:

curl http://$EXTERNAL_IP/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-vl-8b",
    "messages": [{"role": "user", "content": "Count from 1 to 10."}],
    "max_tokens": 100
  }' | jq '.choices[0].message.content'

"1, 2, 3, 4, 5, 6, 7, 8, 9, 10"

Great! You’re almost there…

Step 5 – Install Prometheus stack

Now, set up the monitoring stack that provides complete observability for application-level (vLLM) and hardware-level (GPU) metrics:

Monitoring architecture

1. Add helm repository and create namespace

Add Prometheus helm repo:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

Then, create the monitoring Namespace.

kubectl create namespace monitoring

2. Create Prometheus deployment configuration and installation

First, create prometheus.yaml file.

Install Prometheus stack:

helm install prometheus prometheus-community/kube-prometheus-stack \
  -n monitoring \
  -f prometheus.yaml \
  --timeout 10m \
  --wait

Now, monitor its installation and wait until the pods are ready:

kubectl get pods -n monitoring -w

If all pods are running successfully, you can proceed to the next step.

3. Check that the installation is operational

First access Grafana in background:

kubectl port-forward -n monitoring svc/prometheus-grafana 3000:80 &

Test Grafana health:

curl -s http://localhost:3000/api/health | jq

{
  "database": "ok",
  "version": "12.3.3",
  "commit": "2a14494b2d6ab60f860d8b27603d0ccb264336f6"
}

You can now access to Grafana locally via http://localhost:3000. You will have to use:

Login: admin
Password: Admin123!vLLM

Well done! You can now proceed to the configuration step.

Step 6 – Configure ServiceMonitors

The ServiceMonitors is used to tell Prometheus which endpoints to scrape for metrics.

1. Create vLLM ServiceMonitor

Retrieve the file from the GitHub repository: vllm-servicemonitor.yaml.

Next, apply and check that the ServiceMonitor vllm-metrics exists:

kubectl apply -f vllm-servicemonitor.yaml
kubectl get servicemonitor -n vllm

2. Create DCGM ServiceMonitor

First, create the dcgm-servicemonitor.yaml file.

Once again, apply and verify:

kubectl apply -f dcgm-servicemonitor.yaml
kubectl get servicemonitor -n gpu-operator

gpu-operator                  1d
nvidia-dcgm-exporter          1d
nvidia-node-status-exporter   1d

3. Configure Prometheus for Cross-Namespace discovery

Apply a patch to allow Prometheus to discover ServiceMonitors in all namespaces.

kubectl patch prometheus prometheus-kube-prometheus-prometheus -n monitoring --type merge -p '{
  "spec": {
    "serviceMonitorNamespaceSelector": {},
    "podMonitorNamespaceSelector": {}
  }
}'

Now you have to restart Prometheus.

Delete Prometheus pod to force configuration reload
Wait for Prometheus to restart

kubectl delete pod prometheus-prometheus-kube-prometheus-prometheus-0 -n monitoring

kubectl wait --for=condition=Ready \
  pod/prometheus-prometheus-kube-prometheus-prometheus-0 \
  -n monitoring \
  --timeout=180s

Wait about 2 minutes for discovery and finally, verify targets:

kubectl port-forward -n monitoring \
  prometheus-prometheus-kube-prometheus-prometheus-0 9090:9090 &

You can open in browser: http://localhost:9090/targets and search for:

vllm
dcgm

Note that the expected targets are:

serviceMonitor/vllm/vllm-metrics/0 (2/2 UP)
serviceMonitor/gpu-operator/nvidia-dcgm-exporter/0 (2/2 UP)

Step 7 – Create Grafana dashboards

In this final step, the goal is to create two Grafana dashboards to track both the software side with vLLM metrics and the hardware metrics that will monitor the GPU consumption and system.

1. vLLM application metrics

The dashboard provides insights into vLLM application performance, request handling, and resource utilization based on the following metrics:

Metric	Type	Description	Unit	Dashboard Usage
`vllm:request_success_total`	Counter	Total successful requests	count	Request Rate, Total Requests
`vllm:num_requests_running`	Gauge	Requests currently being processed	count	Queue Depth, Active Requests
`vllm:num_requests_waiting`	Gauge	Requests waiting in queue	count	Queue Depth, Queued Requests
`vllm:time_to_first_token_seconds`	Histogram	Latency until first token generated	seconds	TTFT P50/P95/P99
`vllm:e2e_request_latency_seconds`	Histogram	Total end-to-end latency	seconds	E2E Latency P50/P95/P99
`vllm:generation_tokens_total`	Counter	Total tokens generated (output)	count	Token Generation Rate, Throughput
`vllm:prompt_tokens_total`	Counter	Total prompt tokens (input)	count	Token Generation Rate, Avg Tokens
`vllm:kv_cache_usage_perc`	Gauge	GPU KV cache utilization	0-1 (0-100%)	KV Cache Usage
`vllm:prefix_cache_hits_total`	Counter	Number of prefix cache hits	count	Cache Hit Rate
`vllm:prefix_cache_queries_total`	Counter	Number of prefix cache queries	count	Cache Hit Rate
`vllm:request_queue_time_seconds`	Histogram	Time spent waiting in queue	seconds	Request Queue Time
`vllm:request_prefill_time_seconds`	Histogram	Prefill phase time	seconds	Prefill Time
`vllm:request_decode_time_seconds`	Histogram	Decode phase time	seconds	Decode Time
`vllm:inter_token_latency_seconds`	Histogram	Latency between each token	seconds	Inter-Token Latency
`vllm:num_preemptions_total`	Counter	Number of preemptions (OOM)	count	Preemptions
`vllm:prompt_tokens_cached_total`	Counter	Prompt tokens cached	count	Cached Tokens
`vllm:request_prompt_tokens`	Histogram	Prompt size distribution	count	(Table)
`vllm:request_generation_tokens`	Histogram	Generated tokens distribution	count	(Table)
`vllm:iteration_tokens_total`	Histogram	Tokens per iteration	count	(Advanced analysis)

This vLLM Grafana dashboard is composed of 23 panels:

The dashboard provides insights into LLM application performance, request handling, and resource utilisation based on the previous metrics.

Type	Nombre	Panels
Timeseries	12	Request Rate, Queue Depth, TTFT, E2E Latency, Token Gen, Cache Usage, Cache Hit, Queue Time, Prefill/Decode, Inter-Token, Preemptions, Avg Tokens
Stat	10	Throughput, TTFT P95, Active Req, Queued Req, Cache Hit Rate, Cache Usage, Total Req, Total Tokens, Cached Tokens, Preemptions
Table	1	Pod Performance

Now create the dashboard using vllm-app-dashboard.json. Then, launch:

echo "Importing vLLM application dashboard..."
curl -X POST \
  'http://localhost:3000/api/dashboards/db' \
  -H 'Content-Type: application/json' \
  -u 'admin:Admin123!vLLM' \
  -d @vllm-app-dashboard.json | jq '.status, .url'

Next, you an access the vLLM dashboard and follow metrics in real time:

This dashboard is also essential to track hardware consumption for comprehensive monitoring.

2. GPU hardware metrics

Take advantage of the most useful DCGM metrics to check both the functioning and consumption of your hardware resources:

Metric	Type	Description	Unit	Normal Thresholds	Dashboard Usage
`DCGM_FI_DEV_GPU_UTIL`	Gauge	GPU utilization (compute)	% (0-100)	70-95% optimal	GPU Utilization
`DCGM_FI_DEV_GPU_TEMP`	Gauge	GPU temperature	°C	< 85°C normal	GPU Temperature
`DCGM_FI_DEV_FB_USED`	Gauge	VRAM used	MB	Variable by model	GPU Memory Used
`DCGM_FI_DEV_FB_FREE`	Gauge	VRAM free	MB	> 2GB recommended	GPU Memory Free
`DCGM_FI_DEV_POWER_USAGE`	Gauge	Power consumption	Watts	< 300W (L40S)	GPU Power Usage
`DCGM_FI_DEV_SM_CLOCK`	Gauge	GPU clock speed (compute)	MHz	Variable	GPU Clock Speed
`DCGM_FI_DEV_MEM_CLOCK`	Gauge	Memory clock speed	MHz	Variable	Memory Clock Speed
`DCGM_FI_DEV_NVLINK_BANDWIDTH_TOTAL`	Counter	Total NVLink bandwidth	bytes/s	(If multi-GPU)	NVLink Bandwidth
`DCGM_FI_DEV_PCIE_TX_BYTES`	Counter	PCIe data transmitted	bytes	(I/O monitoring)	PCIe TX
`DCGM_FI_DEV_PCIE_RX_BYTES`	Counter	PCIe data received	bytes	(I/O monitoring)	PCIe RX
`DCGM_FI_DEV_ECC_DBE_VOL_TOTAL`	Counter	ECC double-bit errors	count	0 ideal	(Health check)
`DCGM_FI_DEV_ECC_SBE_VOL_TOTAL`	Counter	ECC single-bit errors	count	< 10/day acceptable	(Health check)

This hardware Grafana dashboard is composed of 13 panels with GPU hardware and system metrics. A detailed view is also available GPU util (%), temperature (°C), vRAM (GB) and power (Watt).

Type	Count	Panels
Timeseries	8	GPU Util, GPU Mem, GPU Temp, GPU Power, CPU Usage, RAM Usage, Network I/O, Disk I/O
Stat	4	Avg GPU Util, Avg GPU Temp, Total GPU Mem, Total GPU Power
Table	1	Hardware Status

Please refer to hardware-dashboard.json by loading it as follows:

echo "Importing hardware dashboard..."
curl -X POST \
  'http://localhost:3000/api/dashboards/db' \
  -H 'Content-Type: application/json' \
  -u 'admin:Admin123!vLLM' \
  -d @hardware-dashboard.json | jq '.status, .url'

Finally, track resource consumption using this hardware dashboard:

Congratulations! Everything is working. You can now test your model and track the various metrics in real time.

Step 8 – LLM testing and performance tracking

Start by installing Python dependencies:

pip3 install openai tqdm

Replace the by the vLLM service external IP and launch the performance test thanks to the following Python code:

import time
import threading
import random
from statistics import mean
from openai import OpenAI
from tqdm import tqdm

APP_URL = "http://94.23.185.22/v1"
MODEL = "qwen3-vl-8b"

CONCURRENT_WORKERS = 500          # concurrency
REQUESTS_PER_WORKER = 10
MAX_TOKENS = 200                  # generation pressure

# some random prompts
SHORT_PROMPTS = [
    "Summarize the theory of relativity.",
    "Explain what a transformer model is.",
    "What is Kubernetes autoscaling?"
]

MEDIUM_PROMPTS = [
    "Explain how attention mechanisms work in transformer-based models, including self-attention and multi-head attention.",
    "Describe how vLLM manages KV cache and why it impacts inference performance."
]

LONG_PROMPTS = [
    "Write a very detailed technical explanation of how large language models perform inference, "
    "including tokenization, embedding lookup, transformer layers, attention computation, KV cache usage, "
    "GPU memory management, and how batching affects latency and throughput. Use examples.",
]

PROMPT_POOL = (
    SHORT_PROMPTS * 2 +
    MEDIUM_PROMPTS * 4 +
    LONG_PROMPTS * 6    # bias toward long prompts
)

# openai compliance
client = OpenAI(
    base_url=APP_URL,
    api_key="foo"
)

# basic metrics
latencies = []
errors = 0
lock = threading.Lock()

# worker
def worker(worker_id):
    global errors
    for _ in range(REQUESTS_PER_WORKER):
        prompt = random.choice(PROMPT_POOL)

        start = time.time()
        try:
            client.chat.completions.create(
                model=MODEL,
                messages=[{"role": "user", "content": prompt}],
                max_tokens=MAX_TOKENS,
                temperature=0.7,
            )
            elapsed = time.time() - start

            with lock:
                latencies.append(elapsed)

        except Exception as e:
            with lock:
                errors += 1

# run
threads = []
start_time = time.time()

print("\n-> STARTING PERFORMANCE TEST:")
print(f"Concurrency: {CONCURRENT_WORKERS}")
print(f"Total requests: {CONCURRENT_WORKERS * REQUESTS_PER_WORKER}")

for i in range(CONCURRENT_WORKERS):
    t = threading.Thread(target=worker, args=(i,))
    t.start()
    threads.append(t)

for t in threads:
    t.join()

total_time = time.time() - start_time

# results
print("\n-> BENCH RESULTS:")
print(f"Total requests sent: {len(latencies) + errors}")
print(f"Successful requests: {len(latencies)}")
print(f"Errors: {errors}")
print(f"Total wall time: {total_time:.2f}s")

if latencies:
    print(f"Avg latency: {mean(latencies):.2f}s")
    print(f"Min latency: {min(latencies):.2f}s")
    print(f"Max latency: {max(latencies):.2f}s")
    print(f"Throughput: {len(latencies)/total_time:.2f} req/s")

Returning:

-> STARTING PERFORMANCE TEST:
Concurrency: 500
Total requests: 5000

-> BENCH RESULTS:
Total requests sent: 5000
Successful requests: 5000
Errors: 0
Total wall time: 225.54s
Avg latency: 21.45s
Min latency: 6.06s
Max latency: 25.19s
Throughput: 22.17 req/s

Don’t forget to track GPU and vLLM metrics in your Grafana dashboards!

Conslusion

This reference architecture demonstrates a vLLM deployment on OVHcloud Managed Kubernetes Service (MKS) with comprehensive GPU monitoring. Benefits include:

High Performance: GPU-accelerated inference with L40S
Scalability: Kubernetes-native, horizontal scaling-ready
Reliability: Health checks, auto-restart, monitoring
API Compatibility: OpenAI-compatible endpoints
Multimodality: Vision & text capabilities
Full stack monitoring: Complete vLLM application and hardware dashboards

Going Further

Your current architecture is functional. However, if desired, it could be improved into a full production-ready solution.

Wish to take production hardening a step further?

Go further with the following enhancements:

Authentication & authorization
- vLLM API authentication
- Grafana authentication
- Prometheus security
High availability & load balancing
- Grafana high availability with multiple replicas and shared storage
- Prometheus high availability
- vLLM Horizontal Pod Autoscaling (HPA) based on custom metrics
Data persistence & backup
- Prometheus long-term storage with persistent storage
- Grafana Dashboard Backup
Observability enhancements
- Distributed tracing by adding OpenTelemetry for request tracing
- Alerting rules with production-ready alert rules