<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Kubernetes Archives - OVHcloud Blog</title>
	<atom:link href="https://blog.ovhcloud.com/tag/kubernetes/feed/" rel="self" type="application/rss+xml" />
	<link>https://blog.ovhcloud.com/tag/kubernetes/</link>
	<description>Innovation for Freedom</description>
	<lastBuildDate>Tue, 31 Mar 2026 12:40:56 +0000</lastBuildDate>
	<language>en-GB</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://blog.ovhcloud.com/wp-content/uploads/2019/07/cropped-cropped-nouveau-logo-ovh-rebranding-32x32.gif</url>
	<title>Kubernetes Archives - OVHcloud Blog</title>
	<link>https://blog.ovhcloud.com/tag/kubernetes/</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>Reference Architecture: Custom metric autoscaling for LLM inference with vLLM on OVHcloud AI Deploy and observability using MKS</title>
		<link>https://blog.ovhcloud.com/reference-architecture-custom-metric-autoscaling-for-llm-inference-with-vllm-on-ovhcloud-ai-deploy-and-observability-using-mks/</link>
		
		<dc:creator><![CDATA[Eléa Petton]]></dc:creator>
		<pubDate>Tue, 10 Feb 2026 08:51:11 +0000</pubDate>
				<category><![CDATA[OVHcloud Engineering]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[AI Deploy]]></category>
		<category><![CDATA[Kubernetes]]></category>
		<category><![CDATA[LLM]]></category>
		<category><![CDATA[MKS]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[OVHcloud]]></category>
		<category><![CDATA[prometheus]]></category>
		<category><![CDATA[Public Cloud]]></category>
		<guid isPermaLink="false">https://blog.ovhcloud.com/?p=30203</guid>

					<description><![CDATA[Take your LLM (Large Language Model) deployment to production level with comprehensive custom autoscaling configuration and advanced vLLM metrics observability. This reference architecture describes a comprehensive solution for deploying, autoscaling and monitoring vLLM-based LLM workloads on OVHcloud infrastructure. It combines AI Deploy, used for model serving with custom metric autoscaling, and Managed Kubernetes Service (MKS), which [&#8230;]]]></description>
										<content:encoded><![CDATA[
<p><em><strong>Take your LLM (Large Language Model) deployment to production level with comprehensive custom autoscaling configuration and advanced vLLM metrics observability.</strong></em></p>



<figure class="wp-block-image aligncenter size-large"><img fetchpriority="high" decoding="async" width="1024" height="538" src="https://blog.ovhcloud.com/wp-content/uploads/2026/02/3-1024x538.jpg" alt="" class="wp-image-30579" srcset="https://blog.ovhcloud.com/wp-content/uploads/2026/02/3-1024x538.jpg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2026/02/3-300x158.jpg 300w, https://blog.ovhcloud.com/wp-content/uploads/2026/02/3-768x403.jpg 768w, https://blog.ovhcloud.com/wp-content/uploads/2026/02/3.jpg 1200w" sizes="(max-width: 1024px) 100vw, 1024px" /><figcaption class="wp-element-caption"><em>vLLM metrics monitoring and observability based on OVHcloud infrastructure</em></figcaption></figure>



<p>This reference architecture describes a comprehensive solution for <strong>deploying, autoscaling and monitoring vLLM-based LLM workloads</strong> on OVHcloud infrastructure. It combines <strong>AI Deploy</strong>, used for <strong>model serving with custom metric autoscaling</strong>, and <strong>Managed Kubernetes Service (MKS)</strong>, which hosts the monitoring and observability stack.</p>



<p>By leveraging <strong>application-level Prometheus metrics exposed by vLLM</strong>, AI Deploy can automatically scale inference replicas based on real workload demand, ensuring <strong>high availability, consistent performance under load and efficient GPU utilisation</strong>. This autoscaling mechanism allows the platform to react dynamically to traffic spikes while maintaining predictable latency for end users.</p>



<p>On top of this scalable inference layer, the monitoring architecture provides <strong>observability</strong> through <strong>Prometheus</strong>, <strong>Grafana</strong> and Alertmanager. It enables real-time performance monitoring, capacity planning, and operational insights, while ensuring <strong>full data sovereignty</strong> for organisations running Large Language Models (LLMs) in production environments.</p>



<p><strong>What are the key benefits</strong>?</p>



<ul class="wp-block-list">
<li><strong>Cost-effective</strong>: Leverage managed services to minimise operational overhead</li>



<li><strong>Real-time observability</strong>: Track Time-to-First-Token (TTFT), throughput, and resource utilisation</li>



<li><strong>Sovereign infrastructure</strong>: All metrics and data remain within European datacentres</li>



<li><strong>Production-ready</strong>: Persistent storage, high availability, and automated monitoring</li>
</ul>



<h2 class="wp-block-heading">Context</h2>



<h3 class="wp-block-heading">AI Deploy</h3>



<p>OVHcloud AI Deploy is a<strong>&nbsp;Container as a Service</strong>&nbsp;(CaaS) platform designed to help you deploy, manage and scale AI models. It allows you to optimally deploy your applications and APIs based on Machine Learning (ML), Deep Learning (DL) or Large Language Models (LLMs).</p>



<p><strong>Key points to keep in mind</strong>:</p>



<ul class="wp-block-list">
<li><strong>Easy to use:</strong>&nbsp;Bring your own custom Docker image and deploy it with a single command line or a few clicks</li>



<li><strong>High-performance computing:</strong>&nbsp;A complete range of GPUs available (H100, A100, V100S, L40S and L4)</li>



<li><strong>Scalability and flexibility:</strong>&nbsp;Supports automatic scaling, allowing your model to effectively handle fluctuating workloads</li>



<li><strong>Cost-efficient:</strong>&nbsp;Billing per minute, no surcharges</li>
</ul>



<h3 class="wp-block-heading">Managed Kubernetes Service</h3>



<p><strong>OVHcloud MKS</strong> is a fully managed Kubernetes platform designed to help you deploy, operate, and scale containerised applications in production. It provides a secure and reliable Kubernetes environment without the operational overhead of managing the control plane.</p>



<p><strong>What should you keep in mind?</strong></p>



<ul class="wp-block-list">
<li><strong>Cost-efficient</strong>: Only pay for worker nodes and consumed resources, with no additional charge for the Kubernetes control plane</li>



<li><strong>Fully managed Kubernetes</strong>: Certified upstream Kubernetes with automated control plane management, upgrades and high availability</li>



<li><strong>Production-ready by design</strong>: Built-in integrations with OVHcloud Load Balancers, networking and persistent storage</li>



<li><strong>Scalability and flexibility</strong>: Easily scale workloads and node pools to match application demand</li>



<li><strong>Open and portable</strong>: Based on standard Kubernetes APIs, enabling seamless integration with open-source ecosystems and avoiding vendor lock-in</li>
</ul>



<p>In the following guide, all services are deployed within the&nbsp;<strong>OVHcloud Public Cloud</strong>.</p>



<h2 class="wp-block-heading">Overview of the architecture</h2>



<p>This reference architecture describes a <strong>complete, secure and scalable solution</strong> to:</p>



<ul class="wp-block-list">
<li>Deploy an LLM with vLLM and <strong>AI Deploy</strong>, benefiting from automatic scaling based on custom metrics to ensure high service availability &#8211; vLLM exposes <code><mark class="has-inline-color has-ast-global-color-0-color"><strong>/metrics</strong></mark></code> via its public HTTPS endpoint on AI Deploy</li>



<li>Collect, store and visualise these vLLM metrics using Prometheus and Grafana on <strong>MKS</strong></li>
</ul>



<figure class="wp-block-image aligncenter size-full"><img decoding="async" width="1200" height="630" src="https://blog.ovhcloud.com/wp-content/uploads/2026/02/1.jpg" alt="" class="wp-image-30578" srcset="https://blog.ovhcloud.com/wp-content/uploads/2026/02/1.jpg 1200w, https://blog.ovhcloud.com/wp-content/uploads/2026/02/1-300x158.jpg 300w, https://blog.ovhcloud.com/wp-content/uploads/2026/02/1-1024x538.jpg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2026/02/1-768x403.jpg 768w" sizes="(max-width: 1200px) 100vw, 1200px" /><figcaption class="wp-element-caption"><em>vLLM metrics monitoring and observability architecture overview</em></figcaption></figure>



<p>The solution comprises three main layers:</p>



<ol class="wp-block-list">
<li><strong>Model serving layer</strong> with AI Deploy
<ul class="wp-block-list">
<li>vLLM containers running on top of GPUs for LLM inference</li>



<li>vLLM inference server exposing Prometheus metrics</li>



<li>Automatic scaling based on custom metrics to ensure high availability</li>



<li>HTTPS endpoints with Bearer token authentication</li>
</ul>
</li>



<li><strong>Monitoring and observability infrastructure</strong> using Kubernetes
<ul class="wp-block-list">
<li>Prometheus for metrics collection and storage</li>



<li>Grafana for visualisation and dashboards</li>



<li>Persistent volume storage for long-term retention</li>
</ul>
</li>



<li><strong>Network layer</strong>
<ul class="wp-block-list">
<li>Secure HTTPS communication between components</li>



<li>OVHcloud LoadBalancer for external access</li>
</ul>
</li>
</ol>



<p>Before going further, a few prerequisites must be checked!</p>



<h2 class="wp-block-heading">Prerequisites</h2>



<p>Before you begin, ensure you have:</p>



<ul class="wp-block-list">
<li>An&nbsp;<strong>OVHcloud Public Cloud</strong>&nbsp;account</li>



<li>An&nbsp;<strong>OpenStack user</strong>&nbsp;with the <a href="https://help.ovhcloud.com/csm/en-gb-public-cloud-ai-users?id=kb_article_view&amp;sysparm_article=KB0048170" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer"><strong><code><mark class="has-inline-color has-ast-global-color-0-color">Administrator</mark></code></strong></a> role</li>



<li><strong>ovhai CLI available</strong> &#8211;&nbsp;<em>install the&nbsp;<a href="https://help.ovhcloud.com/csm/en-gb-public-cloud-ai-cli-install-client?id=kb_article_view&amp;sysparm_article=KB0047844" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">ovhai CLI</a></em></li>



<li><strong>Hugging Face access</strong> &#8211; <em>create a&nbsp;<a href="https://huggingface.co/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">Hugging Face account</a>&nbsp;and generate an&nbsp;<a href="https://huggingface.co/settings/tokens" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">access token</a></em></li>



<li><code><strong><mark class="has-inline-color has-ast-global-color-0-color">kubectl</mark></strong></code> and <code><strong><mark class="has-inline-color has-ast-global-color-0-color">helm</mark></strong></code> (version 3.x or later) installed</li>
</ul>



<p><strong>🚀 Now that you have all the ingredients for our recipe, it’s time to deploy Ministral 3 14B using AI Deploy and the vLLM Docker container!</strong></p>



<h2 class="wp-block-heading">Architecture guide: From autoscaling to observability for LLMs served by vLLM</h2>



<p>Let’s set up and deploy this architecture!</p>



<figure class="wp-block-image aligncenter size-large"><img decoding="async" width="1024" height="538" src="https://blog.ovhcloud.com/wp-content/uploads/2026/02/2-1024x538.jpg" alt="" class="wp-image-30580" srcset="https://blog.ovhcloud.com/wp-content/uploads/2026/02/2-1024x538.jpg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2026/02/2-300x158.jpg 300w, https://blog.ovhcloud.com/wp-content/uploads/2026/02/2-768x403.jpg 768w, https://blog.ovhcloud.com/wp-content/uploads/2026/02/2.jpg 1200w" sizes="(max-width: 1024px) 100vw, 1024px" /><figcaption class="wp-element-caption"><em>Overview of the deployment workflow</em></figcaption></figure>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p><strong>✅ <em>Note</em></strong></p>



<p><strong><em>In this example, <a href="https://huggingface.co/mistralai/Ministral-3-14B-Instruct-2512" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">mistralai/Ministral-3-14B-Instruct-2512</a> is used. Choose the open-source model of your choice and follow the same steps, adapting the model slug (from Hugging Face), the versions and the GPU(s) flavour.</em></strong></p>
</blockquote>



<p><em>Remember that all of the following steps can be automated using OVHcloud APIs!</em></p>



<h3 class="wp-block-heading">Step 1 &#8211; Manage access tokens</h3>



<p>Before deploying the model, you first need to create and export the access tokens used throughout this guide: a <strong>Hugging Face token</strong> to download the model, and an <strong>AI Deploy Bearer token</strong> to secure access to the deployed app.</p>



<p>Export your&nbsp;<a href="https://huggingface.co/settings/tokens" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">Hugging Face token</a>.</p>



<pre class="wp-block-code"><code class="">export MY_HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxx</code></pre>



<p><a href="https://help.ovhcloud.com/csm/en-gb-public-cloud-ai-cli-app-token?id=kb_article_view&amp;sysparm_article=KB0035280" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">Create a Bearer token</a>&nbsp;to access your AI Deploy app once it&#8217;s been deployed.</p>



<pre class="wp-block-code"><code class="">ovhai token create --role operator ai_deploy_token=my_operator_token</code></pre>



<p>This returns the following output:</p>



<pre class="wp-block-code"><code class="">Id: 47292486-fb98-4a5b-8451-600895597a2b
Created At: 20-01-26 11:53:05
Updated At: 20-01-26 11:53:05
Spec:
Name: ai_deploy_token=my_operator_token
Role: AiTrainingOperator
Label Selector:
Status:
Value: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Version: 1</code></pre>



<p>You can now store and export your access token:</p>



<pre class="wp-block-code"><code class="">export MY_OVHAI_ACCESS_TOKEN=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX</code></pre>



<h3 class="wp-block-heading">Step 2 &#8211; LLM deployment using AI Deploy</h3>



<p>Before introducing the monitoring stack, this architecture starts with the <strong>deployment of Ministral 3 14B on OVHcloud AI Deploy</strong>, configured to <strong>autoscale based on custom Prometheus metrics exposed by vLLM itself</strong>.</p>



<h4 class="wp-block-heading">1. Define the targeted vLLM metric for autoscaling</h4>



<p>Before proceeding with the deployment of the <strong>Ministral 3 14B</strong> endpoint, you have to choose the metric you want to use as the trigger for scaling.</p>



<p>Instead of relying solely on CPU/RAM utilisation, AI Deploy allows autoscaling decisions to be driven by <strong>application-level signals</strong>.</p>



<p>To do this, you can consult the <a href="https://docs.vllm.ai/en/latest/design/metrics/#v1-metrics" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">metrics exposed by vLLM</a>.</p>



<p>In this example, you can use a basic metric such as <code><mark class="has-inline-color has-ast-global-color-0-color"><strong>vllm:num_requests_running</strong></mark></code> to scale the number of replicas based on <strong>real inference load</strong>.</p>



<p>This enables:</p>



<ul class="wp-block-list">
<li>Faster reaction to traffic spikes</li>



<li>Better GPU utilisation</li>



<li>Reduced inference latency under load</li>



<li>Cost-efficient scaling</li>
</ul>
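<p><em>For illustration, the relevant fragment of the <code>/metrics</code> output exposed by vLLM looks roughly like the following; the exact help text and labels vary across vLLM versions:</em></p>

```text
# HELP vllm:num_requests_running Number of requests currently running on GPU.
# TYPE vllm:num_requests_running gauge
vllm:num_requests_running{model_name="mistralai/Ministral-3-14B-Instruct-2512"} 42.0
```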



<p>Finally, the configuration chosen for scaling this application is as follows:</p>



<figure class="wp-block-table"><table class="has-fixed-layout"><thead><tr><th>Parameter</th><th>Value</th><th>Description</th></tr></thead><tbody><tr><td>Metric source</td><td><code>/metrics</code></td><td>vLLM Prometheus endpoint</td></tr><tr><td>Metric name</td><td><code>vllm:num_requests_running</code></td><td>Number of in-flight requests</td></tr><tr><td>Aggregation</td><td><code>AVERAGE</code></td><td>Mean across replicas</td></tr><tr><td>Target value</td><td><code>50</code></td><td>Desired load per replica</td></tr><tr><td>Min replicas</td><td><code>1</code></td><td>Baseline capacity</td></tr><tr><td>Max replicas</td><td><code>3</code></td><td>Burst capacity</td></tr></tbody></table></figure>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p><strong>✅ <em>Note</em></strong></p>



<p><em><strong>You can choose the metric that best suits your use case. You can also apply a patch to your AI Deploy deployment at any time to change the target metric for scaling</strong></em>.</p>
</blockquote>



<p>When the <strong>average number of running requests exceeds 50</strong>, AI Deploy automatically provisions <strong>additional GPU-backed replicas</strong>.</p>
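<p><em>To make the mechanics concrete, here is a minimal Python sketch of an HPA-style decision rule. AI Deploy's actual controller logic is not shown in this article; both the parsing and the scaling formula below are illustrative assumptions only.</em></p>

```python
import math

def parse_gauge(exposition: str, name: str) -> float:
    """Extract a gauge value from a Prometheus text-format scrape.
    Note: a simple prefix match, good enough for this sketch."""
    for line in exposition.splitlines():
        if line.startswith(name):
            return float(line.rsplit(" ", 1)[1])
    raise KeyError(name)

def desired_replicas(per_replica_loads, target, current, min_r=1, max_r=3):
    """HPA-style ratio rule: scale the current replica count by
    (average load / target value), then clamp to the configured bounds."""
    avg = sum(per_replica_loads) / len(per_replica_loads)
    return max(min_r, min(max_r, math.ceil(current * avg / target)))

# Two replicas each expose /metrics; hypothetical scrape payloads:
scrape_a = 'vllm:num_requests_running{model_name="ministral"} 120.0'
scrape_b = 'vllm:num_requests_running{model_name="ministral"} 80.0'
loads = [parse_gauge(s, "vllm:num_requests_running") for s in (scrape_a, scrape_b)]
print(desired_replicas(loads, target=50, current=2))  # average 100 exceeds 50 -> prints 3
```

<p><em>With an average of 100 in-flight requests against a target of 50, the rule asks for more replicas and is clamped to the configured maximum of 3.</em></p>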



<h4 class="wp-block-heading">2. Deploy Ministral 3 14B using AI Deploy</h4>



<p>Now you can deploy the LLM using the <strong><code>ovhai</code> CLI</strong>.</p>



<p>Key elements necessary for proper functioning:</p>



<ul class="wp-block-list">
<li>GPU-based inference: <strong><code><mark class="has-inline-color has-ast-global-color-0-color">1 x H100</mark></code></strong></li>



<li>vLLM OpenAI-compatible Docker image: <a href="https://hub.docker.com/r/vllm/vllm-openai/tags" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer"><strong><code><mark class="has-inline-color has-ast-global-color-0-color">vllm/vllm-openai:v0.13.0</mark></code></strong></a></li>



<li>Custom autoscaling rules based on Prometheus metrics: <code><strong><mark class="has-inline-color has-ast-global-color-0-color">vllm:num_requests_running</mark></strong></code></li>
</ul>



<p>Below is the reference command used to deploy the <strong><a href="https://huggingface.co/mistralai/Ministral-3-14B-Instruct-2512" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">mistralai/Ministral-3-14B-Instruct-2512</a></strong>:</p>



<pre class="wp-block-code"><code class="">ovhai app run \<br>  --name vllm-ministral-14B-autoscaling-custom-metric \<br>  --default-http-port 8000 \<br>  --label ai_deploy_token=my_operator_token \<br>  --gpu 1 \<br>  --flavor h100-1-gpu \<br>  -e OUTLINES_CACHE_DIR=/tmp/.outlines \<br>  -e HF_TOKEN=$MY_HF_TOKEN \<br>  -e HF_HOME=/hub \<br>  -e HF_DATASETS_TRUST_REMOTE_CODE=1 \<br>  -e HF_HUB_ENABLE_HF_TRANSFER=0 \<br>  -v standalone:/hub:rw \<br>  -v standalone:/workspace:rw \<br>  --liveness-probe-path /health \<br>  --liveness-probe-port 8000 \<br>  --liveness-initial-delay-seconds 300 \<br>  --probe-path /v1/models \<br>  --probe-port 8000 \<br>  --initial-delay-seconds 300 \<br>  --auto-min-replicas 1 \<br>  --auto-max-replicas 3 \<br>  --auto-custom-api-url "http://&lt;SELF&gt;:8000/metrics" \<br>  --auto-custom-metric-format PROMETHEUS \<br>  --auto-custom-value-location vllm:num_requests_running \<br>  --auto-custom-target-value 50 \<br>  --auto-custom-metric-aggregation-type AVERAGE \<br>  vllm/vllm-openai:v0.13.0 \<br>  -- bash -c "python3 -m vllm.entrypoints.openai.api_server \<br>    --model mistralai/Ministral-3-14B-Instruct-2512 \<br>    --tokenizer_mode mistral \<br>    --load_format mistral \<br>    --config_format mistral \<br>    --enable-auto-tool-choice \<br>    --tool-call-parser mistral \<br>    --enable-prefix-caching"</code></pre>



<p>Let’s break down the parameters of this command.</p>



<h5 class="wp-block-heading"><strong>a. Start your AI Deploy app</strong></h5>



<p>Launch a new app using&nbsp;<a href="https://help.ovhcloud.com/csm/en-gb-public-cloud-ai-cli-install-client?id=kb_article_view&amp;sysparm_article=KB0047844" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">ovhai CLI</a>&nbsp;and name it.</p>



<p><code><strong>ovhai app run --name vllm-ministral-14B-autoscaling-custom-metric</strong></code></p>



<h5 class="wp-block-heading"><strong>b. Define access</strong></h5>



<p>Define the HTTP API port and restrict access to your token.</p>



<p><strong><code>--default-http-port 8000</code><br><code>--label ai_deploy_token=my_operator_token</code></strong></p>



<h5 class="wp-block-heading"><strong>c. Configure GPU resources</strong></h5>



<p>Specify the hardware type (<code><strong>h100-1-gpu</strong></code>), which refers to an&nbsp;<strong>NVIDIA H100 GPU</strong>&nbsp;and the number (<strong>1</strong>).</p>



<p><code><strong>--gpu 1<br>--flavor h100-1-gpu</strong></code></p>



<p><strong><mark>⚠️WARNING!</mark></strong>&nbsp;For this model, one H100 is sufficient, but if you want to deploy another model, you will need to check which GPU you need. Note that you can also access L40S and A100 GPUs for your LLM deployment.</p>



<h5 class="wp-block-heading"><strong>d. Set up environment variables</strong></h5>



<p>Configure caching for the&nbsp;<strong>Outlines library</strong>&nbsp;(used for efficient text generation):</p>



<p><code><strong>-e OUTLINES_CACHE_DIR=/tmp/.outlines</strong></code></p>



<p>Pass the&nbsp;<strong>Hugging Face token</strong>&nbsp;(<code>$MY_HF_TOKEN</code>) for model authentication and download:</p>



<p><code><strong>-e HF_TOKEN=$MY_HF_TOKEN</strong></code></p>



<p>Set the&nbsp;<strong>Hugging Face cache directory</strong>&nbsp;to&nbsp;<code>/hub</code>&nbsp;(where models will be stored):</p>



<p><code><strong>-e HF_HOME=/hub</strong></code></p>



<p>Allow execution of&nbsp;<strong>custom remote code</strong>&nbsp;from Hugging Face datasets (required for some model behaviours):</p>



<p><code><strong>-e HF_DATASETS_TRUST_REMOTE_CODE=1</strong></code></p>



<p>Disable&nbsp;<strong>Hugging Face Hub transfer acceleration</strong>&nbsp;(to use standard model downloading):</p>



<p><code><strong>-e HF_HUB_ENABLE_HF_TRANSFER=0</strong></code></p>



<h5 class="wp-block-heading"><strong>e. Mount persistent volumes</strong></h5>



<p>Mount&nbsp;<strong>two persistent storage volumes</strong>:</p>



<ol class="wp-block-list">
<li><code>/hub</code>&nbsp;→ Stores Hugging Face model files</li>



<li><code>/workspace</code>&nbsp;→ Main working directory</li>
</ol>



<p>The&nbsp;<code>rw</code>&nbsp;flag means&nbsp;<strong>read-write access</strong>.</p>



<p><code><strong>-v standalone:/hub:rw<br>-v standalone:/workspace:rw</strong></code></p>



<h5 class="wp-block-heading"><strong>f. Health checks and readiness</strong></h5>



<p>Configure <strong>liveness and readiness probes</strong>:</p>



<ol class="wp-block-list">
<li><code>/health</code> verifies the container is alive</li>



<li><code>/v1/models</code> confirms the model is loaded and ready to serve requests</li>
</ol>



<p>The long initial delays (300 seconds) can be reduced; they correspond to the startup time of vLLM and the loading of the model on the GPU.</p>



<p><code><strong>--liveness-probe-path /health<br>--liveness-probe-port 8000<br>--liveness-initial-delay-seconds 300<br><br>--probe-path /v1/models<br>--probe-port 8000<br>--initial-delay-seconds 300</strong></code></p>



<h5 class="wp-block-heading"><strong>g. Autoscaling configuration (custom metrics)</strong></h5>



<p>First set the minimum and maximum number of replicas.</p>



<p><strong><code>--auto-min-replicas 1<br>--auto-max-replicas 3</code></strong></p>



<p>This guarantees basic availability (one replica always up) while allowing for peak capacity.</p>



<p>Then enable autoscaling based on application-level metrics exposed by vLLM.</p>



<p><strong><code>--auto-custom-api-url "http://&lt;SELF&gt;:8000/metrics"<br>--auto-custom-metric-format PROMETHEUS<br>--auto-custom-value-location vllm:num_requests_running<br>--auto-custom-target-value 50<br>--auto-custom-metric-aggregation-type AVERAGE</code></strong></p>



<p>AI Deploy:</p>



<ul class="wp-block-list">
<li>Scrapes the local <mark class="has-inline-color has-ast-global-color-0-color"><strong><code>/metrics</code></strong></mark> endpoint</li>



<li>Parses Prometheus-formatted metrics</li>



<li>Extracts the <strong><mark class="has-inline-color has-ast-global-color-0-color"><code>vllm:num_requests_running</code></mark></strong> gauge</li>



<li>Computes the average value across replicas</li>
</ul>



<p>Scaling behaviour:</p>



<ul class="wp-block-list">
<li>When the average number of in-flight requests exceeds <strong><code><mark class="has-inline-color has-ast-global-color-0-color">50</mark></code></strong>, AI Deploy adds replicas</li>



<li>When load decreases, replicas are scaled down</li>
</ul>



<p>This approach ensures high availability and predictable latency under fluctuating traffic.</p>



<h5 class="wp-block-heading"><strong>h. Choose the target Docker image and the startup command</strong></h5>



<p>Use the official <strong><a href="https://hub.docker.com/r/vllm/vllm-openai/tags" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">vLLM OpenAI-compatible Docker image</a></strong>.</p>



<p><strong><code>vllm/vllm-openai:v0.13.0</code></strong></p>



<p>Finally, run the model inside the container using a Python command to launch the vLLM API server:</p>



<ul class="wp-block-list">
<li><strong><code>python3 -m vllm.entrypoints.openai.api_server</code></strong>&nbsp;→ Starts the OpenAI-compatible vLLM API server</li>



<li><strong><code>--model mistralai/Ministral-3-14B-Instruct-2512</code></strong>&nbsp;→ Loads the&nbsp;<strong>Ministral 3 14B</strong>&nbsp;model from Hugging Face</li>



<li><strong><code>--tokenizer_mode mistral</code></strong>&nbsp;→ Uses the&nbsp;<strong>Mistral tokenizer</strong></li>



<li><strong><code>--load_format mistral</code></strong>&nbsp;→ Uses Mistral’s model loading format</li>



<li><strong><code>--config_format mistral</code></strong>&nbsp;→ Ensures the model configuration follows Mistral’s standard</li>



<li><code><strong>--enable-auto-tool-choice </strong></code>→ Automatic call of tools if necessary (function/tool call)</li>



<li><strong><code>--tool-call-parser mistral </code></strong>→ Tool calling support</li>



<li><strong><code>--enable-prefix-caching</code></strong> → Prefix caching for improved throughput and reduced latency</li>
</ul>



<p>You can now launch this command using <strong>ovhai CLI</strong>.</p>



<h4 class="wp-block-heading">3. Check AI Deploy app status</h4>



<p>You can now check if your&nbsp;<strong>AI Deploy</strong>&nbsp;app is alive:</p>



<pre class="wp-block-code"><code class="">ovhai app get &lt;your_vllm_app_id&gt;</code></pre>



<p><strong>Is your app in&nbsp;<code>RUNNING</code>&nbsp;status?</strong>&nbsp;Perfect! You can check in the logs that the server is started:</p>



<pre class="wp-block-code"><code class="">ovhai app logs &lt;your_vllm_app_id&gt;</code></pre>



<p><strong><mark>⚠️WARNING!</mark></strong>&nbsp;This step may take a little time as the LLM must be loaded.</p>



<h4 class="wp-block-heading">4. Test that the deployment is functional</h4>



<p>First, send a prompt to the LLM. Launch the following query with the question of your choice:</p>



<pre class="wp-block-code"><code class="">curl https://&lt;your_vllm_app_id&gt;.app.gra.ai.cloud.ovh.net/v1/chat/completions \<br>  -H "Authorization: Bearer $MY_OVHAI_ACCESS_TOKEN" \<br>  -H "Content-Type: application/json" \<br>  -d '{<br>    "model": "mistralai/Ministral-3-14B-Instruct-2512",<br>    "messages": [<br>      {"role": "system", "content": "You are a helpful assistant."},<br>      {"role": "user", "content": "Give me the name of OVHcloud’s founder."}<br>    ],<br>    "stream": false<br>  }'</code></pre>



<p>You can also verify access to vLLM metrics.</p>



<pre class="wp-block-code"><code class="">curl -H "Authorization: Bearer $MY_OVHAI_ACCESS_TOKEN" \<br>  https://&lt;your_vllm_app_id&gt;.app.gra.ai.cloud.ovh.net/metrics</code></pre>



<p>If both tests show that the model deployment is functional and you receive HTTP 200 responses, you are ready to move on to the next step!</p>



<p>The next step is to set up the observability and monitoring stack. Note that the autoscaling mechanism is <strong>fully independent</strong> of the Prometheus instance used for observability:</p>



<ul class="wp-block-list">
<li>AI Deploy queries the local <strong><mark class="has-inline-color has-ast-global-color-0-color"><code>/metrics</code></mark></strong> endpoint internally</li>



<li>Prometheus scrapes the <strong>same metrics endpoint</strong> externally for monitoring, dashboards and potentially alerting</li>
</ul>



<p>This ensures:</p>



<ul class="wp-block-list">
<li>A single source of truth for metrics</li>



<li>No duplication of exporters</li>



<li>Consistent signals for scaling and observability</li>
</ul>



<h3 class="wp-block-heading">Step 3 &#8211; Create an MKS cluster</h3>



<p>From <a href="https://manager.eu.ovhcloud.com/#/hub/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">OVHcloud Control Panel</a>, create a Kubernetes cluster using the <strong>MKS</strong>.</p>



<p>Consider using the following configuration for the current use case:</p>



<ul class="wp-block-list">
<li><strong>Location</strong>: GRA (Gravelines) &#8211; <em>you can select the same region as for AI Deploy</em></li>



<li><strong>Network</strong>: Public</li>



<li><strong>Node pool</strong>:
<ul class="wp-block-list">
<li>Flavour: <code><strong><mark class="has-inline-color has-ast-global-color-0-color">b2-15</mark></strong></code> (or something similar)</li>



<li>Number of nodes: <strong><code><mark class="has-inline-color has-ast-global-color-0-color">3</mark></code></strong></li>



<li>Autoscaling: <strong><code><mark class="has-inline-color has-ast-global-color-0-color">OFF</mark></code></strong></li>
</ul>
</li>



<li><strong>Name your node pool:</strong> <strong><mark class="has-inline-color has-ast-global-color-0-color"><code>monitoring</code></mark></strong></li>
</ul>



<p>You should see your cluster (e.g. <code><mark class="has-inline-color has-ast-global-color-0-color"><strong>prometheus-vllm-metrics-ai-deploy</strong></mark></code>) in the list, along with the following information:</p>



<figure class="wp-block-image aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="632" src="https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-3-1024x632.png" alt="" class="wp-image-30242" srcset="https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-3-1024x632.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-3-300x185.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-3-768x474.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-3-1536x948.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-3-2048x1264.png 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>If the status is green with the <strong><mark style="color:#00d084" class="has-inline-color"><code>OK</code></mark></strong> label, you can proceed to the next step.</p>



<h3 class="wp-block-heading">Step 4 &#8211; Configure Kubernetes access</h3>



<p>Download your <strong>kubeconfig file</strong> from the OVHcloud Control Panel and configure <strong><code><mark class="has-inline-color has-ast-global-color-0-color">kubectl</mark></code></strong>:</p>



<pre class="wp-block-code"><code class=""># configure kubectl with your MKS cluster<br>export KUBECONFIG=/path/to/your/kubeconfig-xxxxxx.yml<br><br># verify cluster connectivity<br>kubectl cluster-info<br>kubectl get nodes</code></pre>



<p>Now you can create the <strong><mark class="has-inline-color has-ast-global-color-0-color"><code>values-prometheus.yaml</code></mark></strong> file:</p>



<pre class="wp-block-code"><code class=""># general configuration<br>nameOverride: "monitoring"<br>fullnameOverride: "monitoring"<br><br># Prometheus configuration<br>prometheus:<br>  prometheusSpec:<br>    # data retention (15d)<br>    retention: 15d<br>    <br>    # scrape interval (15s)<br>    scrapeInterval: 15s<br>    <br>    # persistent storage (required for production deployment)<br>    storageSpec:<br>      volumeClaimTemplate:<br>        spec:<br>          storageClassName: csi-cinder-high-speed  # OVHcloud storage<br>          accessModes: ["ReadWriteOnce"]<br>          resources:<br>            requests:<br>              storage: 50Gi  # (can be modified according to your needs)<br>    <br>    # scrape vLLM metrics from your AI Deploy instance (Ministral 3 14B)<br>    additionalScrapeConfigs:<br>      - job_name: 'vllm-ministral'<br>        scheme: https<br>        metrics_path: '/metrics'<br>        scrape_interval: 15s<br>        scrape_timeout: 10s<br>        <br>        # authentication using AI Deploy Bearer token stored Kubernetes Secret<br>        bearer_token_file: /etc/prometheus/secrets/vllm-auth-token/token<br>        static_configs:<br>          - targets:<br>              - '&lt;APP_ID&gt;.app.gra.ai.cloud.ovh.net'  # /!\ REPLACE THE &lt;APP_ID&gt; by yours /!\<br>            labels:<br>              service: 'vllm'<br>              model: 'ministral'<br>              environment: 'production'<br>        <br>        # TLS configuration<br>        tls_config:<br>          insecure_skip_verify: false<br>    <br>    # kube-prometheus-stack mounts the secret under /etc/prometheus/secrets/ and makes it accessible to Prometheus<br>    secrets:<br>      - vllm-auth-token<br><br># Grafana configuration (visualization layer)<br>grafana:<br>  enabled: true<br>  <br>  # disable automatic datasource provisioning<br>  sidecar:<br>    datasources:<br>      enabled: false<br>  <br>  # persistent dashboards<br>  persistence:<br>    enabled: true<br>    
storageClassName: csi-cinder-high-speed<br>    size: 10Gi<br>  <br>  # /!\ DEFINE ADMIN PASSWORD - REPLACE "test" BY YOURS /!\<br>  adminPassword: "test"<br>  <br>  # access via OVHcloud LoadBalancer (public IP and managed LB)<br>  service:<br>    type: LoadBalancer<br>    port: 80<br>    annotations:<br>      # optional: restrict access to specific IPs<br>      # service.beta.kubernetes.io/ovh-loadbalancer-allowed-sources: "1.2.3.4/32"<br>  <br># alertmanager (optional but recommended for production)<br>alertmanager:<br>  enabled: true<br>  <br>  alertmanagerSpec:<br>    storage:<br>      volumeClaimTemplate:<br>        spec:<br>          storageClassName: csi-cinder-high-speed<br>          accessModes: ["ReadWriteOnce"]<br>          resources:<br>            requests:<br>              storage: 10Gi<br><br># cluster observability components<br>nodeExporter:<br>  enabled: true<br>  <br>kubeStateMetrics:<br>  enabled: true</code></pre>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p><strong>✅ <em>Note</em></strong></p>



<p><strong><em>On OVHcloud MKS, persistent storage is handled automatically through the Cinder CSI driver. When a PersistentVolumeClaim (PVC) references a supported <code>storageClassName</code> such as <code>csi-cinder-high-speed</code>, OVHcloud dynamically provisions the underlying Block Storage volume and attaches it to the node running the pod. This enables stateful components like Prometheus, Alertmanager and Grafana to persist data reliably without any manual volume management, making the architecture fully cloud-native and operationally simple.</em></strong></p>
</blockquote>
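<p>As an illustration of this dynamic provisioning, a minimal standalone PersistentVolumeClaim using the same storage class might look like the sketch below (the name and size are arbitrary examples, not part of the monitoring stack itself):</p>

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-data                           # arbitrary example name
  namespace: monitoring
spec:
  storageClassName: csi-cinder-high-speed   # OVHcloud Block Storage class
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
```

Applying such a claim causes the Cinder CSI driver to create and attach the underlying Block Storage volume automatically, exactly as happens for the Prometheus, Alertmanager and Grafana volumes defined in <code>values-prometheus.yaml</code>.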



<p>Then create the <strong><code><mark class="has-inline-color has-ast-global-color-0-color">monitoring</mark></code></strong> namespace:</p>



<pre class="wp-block-code"><code class=""># create namespace<br>kubectl create namespace monitoring<br><br># verify creation<br>kubectl get namespaces | grep monitoring</code></pre>



<p>Finally, configure the Bearer token secret to access vLLM metrics:</p>



<pre class="wp-block-code"><code class=""># create bearer token secret<br>kubectl create secret generic vllm-auth-token \<br>  --from-literal=token="$MY_OVHAI_ACCESS_TOKEN" \<br>  -n monitoring<br><br># verify secret creation<br>kubectl get secret vllm-auth-token -n monitoring<br><br># test token (optional)<br>kubectl get secret vllm-auth-token -n monitoring \<br>  -o jsonpath='{.data.token}' | base64 -d </code></pre>
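<p>To get a feel for what Prometheus will actually collect from the vLLM <code>/metrics</code> endpoint, the short sketch below parses a few lines in the Prometheus text exposition format. The sample payload is illustrative (the values are made up), but the metric names match those exposed by vLLM:</p>

```python
# Minimal sketch: parse Prometheus text exposition lines such as those
# returned by a vLLM /metrics endpoint. Sample values are illustrative.

def parse_metrics(text):
    """Parse 'name{labels} value' lines into a dict (labels are ignored)."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip HELP/TYPE comments and blank lines
        name_part, _, value = line.rpartition(" ")
        name = name_part.split("{", 1)[0]
        metrics[name] = float(value)
    return metrics

SAMPLE = """\
# HELP vllm:num_requests_running Number of requests currently running.
# TYPE vllm:num_requests_running gauge
vllm:num_requests_running{model_name="mistralai/Ministral-3-14B-Instruct-2512"} 4.0
vllm:num_requests_waiting{model_name="mistralai/Ministral-3-14B-Instruct-2512"} 12.0
vllm:kv_cache_usage_perc{model_name="mistralai/Ministral-3-14B-Instruct-2512"} 0.83
"""

metrics = parse_metrics(SAMPLE)
print(metrics["vllm:num_requests_running"])  # 4.0
```

This is exactly the kind of payload the <code>additionalScrapeConfigs</code> job defined earlier pulls every 15 seconds, using the Bearer token stored in the <code>vllm-auth-token</code> secret.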



<p>Right, if everything is working, let&#8217;s move on to deployment.</p>



<h3 class="wp-block-heading">Step 5 &#8211; Deploy Prometheus stack</h3>



<p>Add the Prometheus Helm repository and install the monitoring stack. The deployment creates:</p>



<ul class="wp-block-list">
<li>Prometheus StatefulSet with persistent storage</li>



<li>Grafana deployment with LoadBalancer access</li>



<li>Alertmanager for future alert configuration (optional)</li>



<li>Supporting components (node exporters, kube-state-metrics)</li>
</ul>



<pre class="wp-block-code"><code class=""># add Helm repository<br>helm repo add prometheus-community \<br>  https://prometheus-community.github.io/helm-charts<br>helm repo update<br><br># install monitoring stack<br>helm install monitoring prometheus-community/kube-prometheus-stack \<br>  --namespace monitoring \<br>  --values values-prometheus.yaml \<br>  --wait</code></pre>



<p>Then you can retrieve the LoadBalancer IP address to access Grafana:</p>



<pre class="wp-block-code"><code class="">kubectl get svc -n monitoring monitoring-grafana</code></pre>



<p>Finally, open your browser to <code><strong><mark class="has-inline-color has-ast-global-color-0-color">http://&lt;EXTERNAL-IP&gt;</mark></strong></code> and log in with:</p>



<ul class="wp-block-list">
<li><strong>Username</strong>: <code><mark class="has-inline-color has-ast-global-color-0-color"><strong>admin</strong></mark></code></li>



<li><strong>Password</strong>: as configured in your <code><strong><mark class="has-inline-color has-ast-global-color-0-color">values-prometheus.yaml</mark></strong></code> file</li>
</ul>



<h3 class="wp-block-heading">Step 6 &#8211; Create Grafana dashboards</h3>



<p>In this step, you will access the Grafana interface, add your Prometheus as a new data source, then create a complete dashboard with the different vLLM metrics.</p>



<h4 class="wp-block-heading">1. Add a new data source in Grafana</h4>



<p>First of all, create a new Prometheus connection inside Grafana:</p>



<ul class="wp-block-list">
<li>Navigate to <strong><mark class="has-inline-color has-ast-global-color-0-color"><code>Connections</code></mark></strong> → <strong><mark class="has-inline-color has-ast-global-color-0-color"><code>Data sources</code></mark></strong> → <strong><code><mark class="has-inline-color has-ast-global-color-0-color">Add data source</mark></code></strong></li>



<li>Select <strong>Prometheus</strong></li>



<li>Configure URL: <code><strong><mark class="has-inline-color has-ast-global-color-0-color">http://monitoring-prometheus:9090</mark></strong></code></li>



<li>Click <strong>Save &amp; test</strong></li>
</ul>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="609" src="https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-4-1024x609.png" alt="" class="wp-image-30247" srcset="https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-4-1024x609.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-4-300x178.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-4-768x457.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-4-1536x913.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-4-2048x1218.png 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>Now that your Prometheus has been configured as a new data source, you can create your Grafana dashboard.</p>



<h4 class="wp-block-heading">2. Create your monitoring dashboard</h4>



<p>To begin with, you can use the following pre-configured Grafana dashboard by downloading this JSON file locally:</p>





<p>In the left-hand menu, select <strong><code><mark class="has-inline-color has-ast-global-color-0-color">Dashboards</mark></code></strong>:</p>



<ol class="wp-block-list">
<li>Navigate to <strong><code><mark class="has-inline-color has-ast-global-color-0-color">Dashboards</mark></code></strong> → <strong><code><mark class="has-inline-color has-ast-global-color-0-color">Import</mark></code></strong></li>



<li>Upload the provided dashboard JSON</li>



<li>Select <strong>Prometheus</strong> as datasource</li>



<li>Click <strong>Import</strong> and select the <strong><code><mark class="has-inline-color has-ast-global-color-0-color">vLLM-metrics-grafana-monitoring.json</mark></code></strong> file</li>
</ol>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="449" src="https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-6-1024x449.png" alt="" class="wp-image-30250" srcset="https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-6-1024x449.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-6-300x131.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-6-768x337.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-6-1536x673.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-6-2048x897.png 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>The dashboard provides real-time visibility for <strong>Ministral 3 14B</strong> deployed with the vLLM container on OVHcloud AI Deploy.</p>



<p>You can now track:</p>



<ul class="wp-block-list">
<li><strong>Performance metrics</strong>: TTFT, inter-token latency, end-to-end latency</li>



<li><strong>Throughput indicators</strong>: Requests per second, token generation rates</li>



<li><strong>Resource utilisation</strong>: KV cache usage, active/waiting requests</li>



<li><strong>Capacity indicators</strong>: Queue depth, preemption rates</li>
</ul>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="540" src="https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-7-1024x540.png" alt="" class="wp-image-30253" srcset="https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-7-1024x540.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-7-300x158.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-7-768x405.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-7-1536x811.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-7-2048x1081.png 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>Here are the key metrics tracked and displayed in the Grafana dashboard:</p>



<figure class="wp-block-table"><table class="has-fixed-layout"><thead><tr><th>Metric Category</th><th>Prometheus Metric</th><th>Description</th><th>Use case</th></tr></thead><tbody><tr><td><strong>Latency</strong></td><td><code>vllm:time_to_first_token_seconds</code></td><td>Time until first token generation</td><td>User experience monitoring</td></tr><tr><td><strong>Latency</strong></td><td><code>vllm:inter_token_latency_seconds</code></td><td>Time between tokens</td><td>Throughput optimisation</td></tr><tr><td><strong>Latency</strong></td><td><code>vllm:e2e_request_latency_seconds</code></td><td>End-to-end request time</td><td>SLA monitoring</td></tr><tr><td><strong>Throughput</strong></td><td><code>vllm:request_success_total</code></td><td>Successful requests counter</td><td>Capacity planning</td></tr><tr><td><strong>Resource</strong></td><td><code>vllm:kv_cache_usage_perc</code></td><td>KV cache memory usage</td><td>Memory management</td></tr><tr><td><strong>Queue</strong></td><td><code>vllm:num_requests_running</code></td><td>Active requests</td><td>Load monitoring</td></tr><tr><td><strong>Queue</strong></td><td><code>vllm:num_requests_waiting</code></td><td>Queued requests</td><td>Overload detection</td></tr><tr><td><strong>Capacity</strong></td><td><code>vllm:num_preemptions_total</code></td><td>Request preemptions</td><td>Peak load indicator</td></tr><tr><td><strong>Tokens</strong></td><td><code>vllm:prompt_tokens_total</code></td><td>Input tokens processed</td><td>Usage analytics</td></tr><tr><td><strong>Tokens</strong></td><td><code>vllm:generation_tokens_total</code></td><td>Output tokens generated</td><td>Cost tracking</td></tr></tbody></table></figure>
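<p>The dashboard panels are built from PromQL expressions over these metrics. A few representative queries are shown below; they follow standard Prometheus histogram conventions (<code>_sum</code>/<code>_count</code> series) and are illustrative sketches, so the exact panel definitions in the provided JSON may differ:</p>

```promql
# Average time to first token over the last 5 minutes
rate(vllm:time_to_first_token_seconds_sum[5m])
  / rate(vllm:time_to_first_token_seconds_count[5m])

# Request throughput (successful requests per second)
rate(vllm:request_success_total[5m])

# Saturation: active vs queued requests
vllm:num_requests_running
vllm:num_requests_waiting
```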



<p>Well done, you now have at your disposal:</p>



<ul class="wp-block-list">
<li>An endpoint of the Ministral 3 14B model deployed with vLLM thanks to <strong>OVHcloud AI Deploy</strong> and its autoscaling strategies based on custom metrics</li>



<li>Prometheus for metrics collection and Grafana for visualisation/dashboards thanks to <strong>OVHcloud MKS</strong></li>
</ul>



<p><strong>But how can you check that everything will work when the load increases?</strong></p>



<h3 class="wp-block-heading">Step 7 &#8211; Test autoscaling and real-time visualisation</h3>



<p>The first objective here is to force AI Deploy to:</p>



<ul class="wp-block-list">
<li>Increase <code>vllm:num_requests_running</code></li>



<li>&#8216;Saturate&#8217; a single replica</li>



<li>Trigger the <strong>scale up</strong></li>



<li>Observe replica increase + latency drop</li>
</ul>



<h4 class="wp-block-heading">1. Autoscaling testing strategy</h4>



<p>The goal is to combine:</p>



<ul class="wp-block-list">
<li><strong>High concurrency</strong></li>



<li><strong>Long prompts</strong> (KV cache heavy)</li>



<li><strong>Long generations</strong></li>



<li><strong>Bursty load</strong></li>
</ul>



<p>This is what vLLM autoscaling actually reacts to.</p>



<p>To do so, a Python code can simulate the expected behaviour:</p>



<pre class="wp-block-code"><code class="">import os<br>import time<br>import threading<br>import random<br>from statistics import mean<br>from openai import OpenAI<br><br>APP_URL = "https://&lt;APP_ID&gt;.app.gra.ai.cloud.ovh.net/v1" # /!\ REPLACE THE &lt;APP_ID&gt; by yours /!\<br>MODEL = "mistralai/Ministral-3-14B-Instruct-2512"<br>API_KEY = os.environ["MY_OVHAI_ACCESS_TOKEN"]  # export your AI Deploy token in this environment variable<br><br>CONCURRENT_WORKERS = 500          # concurrency (main scaling trigger)<br>REQUESTS_PER_WORKER = 25<br>MAX_TOKENS = 768                  # generation pressure<br><br># some random prompts<br>SHORT_PROMPTS = [<br>    "Summarize the theory of relativity.",<br>    "Explain what a transformer model is.",<br>    "What is Kubernetes autoscaling?"<br>]<br><br>MEDIUM_PROMPTS = [<br>    "Explain how attention mechanisms work in transformer-based models, including self-attention and multi-head attention.",<br>    "Describe how vLLM manages KV cache and why it impacts inference performance."<br>]<br><br>LONG_PROMPTS = [<br>    "Write a very detailed technical explanation of how large language models perform inference, "<br>    "including tokenization, embedding lookup, transformer layers, attention computation, KV cache usage, "<br>    "GPU memory management, and how batching affects latency and throughput. Use examples.",<br>]<br><br>PROMPT_POOL = (<br>    SHORT_PROMPTS * 2 +<br>    MEDIUM_PROMPTS * 4 +<br>    LONG_PROMPTS * 6    # bias toward long prompts<br>)<br><br># OpenAI-compatible client<br>client = OpenAI(<br>    base_url=APP_URL,<br>    api_key=API_KEY,<br>)<br><br># basic metrics<br>latencies = []<br>errors = 0<br>lock = threading.Lock()<br><br># worker<br>def worker(worker_id):<br>    global errors<br>    for _ in range(REQUESTS_PER_WORKER):<br>        prompt = random.choice(PROMPT_POOL)<br><br>        start = time.time()<br>        try:<br>            client.chat.completions.create(<br>                model=MODEL,<br>                messages=[{"role": "user", "content": prompt}],<br>                max_tokens=MAX_TOKENS,<br>                temperature=0.7,<br>            )<br>            elapsed = time.time() - start<br><br>            with lock:<br>                latencies.append(elapsed)<br><br>        except Exception:<br>            with lock:<br>                errors += 1<br><br># run<br>threads = []<br>start_time = time.time()<br><br>print("Starting autoscaling stress test...")<br>print(f"Concurrency: {CONCURRENT_WORKERS}")<br>print(f"Total requests: {CONCURRENT_WORKERS * REQUESTS_PER_WORKER}")<br><br>for i in range(CONCURRENT_WORKERS):<br>    t = threading.Thread(target=worker, args=(i,))<br>    t.start()<br>    threads.append(t)<br><br>for t in threads:<br>    t.join()<br><br>total_time = time.time() - start_time<br><br># results<br>print("\n=== AUTOSCALING BENCH RESULTS ===")<br>print(f"Total requests sent: {len(latencies) + errors}")<br>print(f"Successful requests: {len(latencies)}")<br>print(f"Errors: {errors}")<br>print(f"Total wall time: {total_time:.2f}s")<br><br>if latencies:<br>    print(f"Avg latency: {mean(latencies):.2f}s")<br>    print(f"Min latency: {min(latencies):.2f}s")<br>    print(f"Max latency: {max(latencies):.2f}s")<br>    print(f"Throughput: {len(latencies)/total_time:.2f} req/s")</code></pre>
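<p>Average latency can mask tail behaviour during a scale-up, so percentile latencies are usually more informative for autoscaling analysis. A small helper you could apply to the <code>latencies</code> list collected above might look like this (nearest-rank method, example data shown):</p>

```python
# Percentile helper for the latency samples collected by the stress test.
# Uses nearest-rank on a sorted copy; assumes a non-empty list of floats.

def percentile(samples, p):
    """Return the p-th percentile (0 < p <= 100) using nearest-rank."""
    ordered = sorted(samples)
    k = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[k]

latencies = [0.8, 1.1, 1.3, 1.9, 2.4, 3.0, 3.2, 4.5, 6.1, 9.7]  # example data
print(f"p50: {percentile(latencies, 50):.2f}s")
print(f"p95: {percentile(latencies, 95):.2f}s")
```

If the scale-up works as intended, p95 should stop climbing once new replicas absorb the queued requests.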



<p><strong>How can you verify that autoscaling is working and that the load is being handled correctly without latency skyrocketing?</strong></p>



<h4 class="wp-block-heading">2. Hardware and platform-level monitoring</h4>



<p>First, <strong>AI Deploy Grafana</strong> answers <strong>&#8216;What resources are being used and how many replicas exist?</strong>&#8216;.</p>



<p>GPU utilisation, GPU memory, CPU, RAM and replica count are monitored through <strong>OVHcloud AI Deploy Grafana</strong> (monitoring URL), which exposes infrastructure and runtime metrics for the AI Deploy application. This layer provides visibility into <strong>resource saturation and scaling events</strong> managed by the AI Deploy platform itself.</p>



<p>Access it using the following URL (do not forget to replace <code><mark class="has-inline-color has-ast-global-color-0-color"><strong>&lt;APP_ID&gt;</strong></mark></code> by yours): <strong><code>https://monitoring.gra.ai.cloud.ovh.net/d/app/app-monitoring?var-app=</code><mark class="has-inline-color has-ast-global-color-0-color"><code>&lt;APP_ID&gt;</code></mark><code>&amp;orgId=1</code></strong></p>



<p>For example, check GPU/RAM metrics:</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="540" src="https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-8-1024x540.png" alt="" class="wp-image-30260" srcset="https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-8-1024x540.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-8-300x158.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-8-768x405.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-8-1536x811.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-8-2048x1081.png 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>You can also monitor scale ups and downs in real time, as well as information on HTTP calls and much more!</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="540" src="https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-9-1024x540.png" alt="" class="wp-image-30261" srcset="https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-9-1024x540.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-9-300x158.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-9-768x405.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-9-1536x811.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-9-2048x1081.png 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<h4 class="wp-block-heading">3. Software and application-level monitoring</h4>



<p>Next, the combination of MKS + Prometheus + Grafana answers <strong>&#8216;How does the inference engine behave internally?&#8217;</strong></p>



<p>In fact, vLLM internal metrics (request concurrency, token throughput, latency indicators, KV cache pressure, etc.) are collected via the <strong>vLLM <code>/metrics</code> endpoint</strong> and scraped by <strong>Prometheus running on OVHcloud MKS</strong>, then visualised in a <strong>dedicated Grafana instance</strong>. This layer focuses on <strong>model behaviour and inference performance</strong>.</p>



<p>Find all these metrics via (just replace <strong><code><mark class="has-inline-color has-ast-global-color-0-color">&lt;EXTERNAL-IP&gt;</mark></code></strong>): <strong><code>http://<mark class="has-inline-color has-ast-global-color-0-color">&lt;EXTERNAL-IP&gt;</mark>/d/vllm-ministral-monitoring/ministral-14b-vllm-metrics-monitoring?orgId=1</code></strong></p>



<p>Find key metrics such as TTFT (time to first token):</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="540" src="https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-10-1024x540.png" alt="" class="wp-image-30263" srcset="https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-10-1024x540.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-10-300x158.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-10-768x405.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-10-1536x811.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-10-2048x1081.png 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>You can also find some information about <strong>&#8216;Model load and throughput&#8217;</strong>:</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="540" src="https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-11-1024x540.png" alt="" class="wp-image-30264" srcset="https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-11-1024x540.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-11-300x158.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-11-768x405.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-11-1536x811.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2026/01/image-11-2048x1081.png 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>To go further and add even more metrics, you can refer to the vLLM documentation on &#8216;<a href="https://docs.vllm.ai/en/v0.7.2/getting_started/examples/prometheus_grafana.html" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Prometheus and Grafana</a>&#8216;.</p>



<h2 class="wp-block-heading">Conclusion</h2>



<p>This reference architecture provides a scalable and production-ready approach for deploying LLM inference on OVHcloud using <strong>AI Deploy</strong> and the <a href="https://help.ovhcloud.com/csm/en-gb-public-cloud-ai-deploy-apps-deployments?id=kb_article_view&amp;sysparm_article=KB0047997#advanced-custom-metrics-for-autoscaling" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">autoscaling on custom metric feature</a>.</p>



<p>OVHcloud <strong>MKS</strong> is dedicated to running Prometheus and Grafana, enabling secure scraping and visualisation of <strong>vLLM internal metrics</strong> exposed via the <strong><mark class="has-inline-color has-ast-global-color-0-color"><code>/metrics</code> </mark></strong>endpoint.</p>



<p>By scraping vLLM metrics securely from AI Deploy into Prometheus and exposing them through Grafana, the architecture provides full visibility into model behaviour, performance and load, enabling informed scaling analysis, troubleshooting and capacity planning in production environments.</p>
<img loading="lazy" decoding="async" src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Freference-architecture-custom-metric-autoscaling-for-llm-inference-with-vllm-on-ovhcloud-ai-deploy-and-observability-using-mks%2F&amp;action_name=Reference%20Architecture%3A%20Custom%20metric%20autoscaling%20for%20LLM%20inference%20with%20vLLM%20on%20OVHcloud%20AI%20Deploy%20and%20observability%20using%20MKS&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Moving Beyond Ingress: Why should OVHcloud Managed Kubernetes Service (MKS) users start looking at the Gateway API?</title>
		<link>https://blog.ovhcloud.com/moving-beyond-ingress-why-should-ovhcloud-managed-kubernetes-service-mks-users-start-looking-at-the-gateway-api/</link>
		
		<dc:creator><![CDATA[Aurélie Vache&nbsp;and&nbsp;Antonin Anchisi]]></dc:creator>
		<pubDate>Mon, 15 Dec 2025 09:26:36 +0000</pubDate>
				<category><![CDATA[OVHcloud Engineering]]></category>
		<category><![CDATA[Tranches de Tech & co]]></category>
		<category><![CDATA[Kubernetes]]></category>
		<category><![CDATA[OVHcloud Managed Kubernetes]]></category>
		<category><![CDATA[Public Cloud]]></category>
		<guid isPermaLink="false">https://blog.ovhcloud.com/?p=30016</guid>

					<description><![CDATA[For years, the Kubernetes Ingress API, and the popular Ingress NGINX controller (ingress-nginx), have been the default way to expose applications running inside a Kubernetes cluster. But the ecosystem is changing: the Kubernetes SIG network has announced the retirement of Ingress NGINX in March 2026. After March 2026 the Ingress NGINX will no longer get [&#8230;]<img src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fmoving-beyond-ingress-why-should-ovhcloud-managed-kubernetes-service-mks-users-start-looking-at-the-gateway-api%2F&amp;action_name=Moving%20Beyond%20Ingress%3A%20Why%20should%20OVHcloud%20Managed%20Kubernetes%20Service%20%28MKS%29%20users%20start%20looking%20at%20the%20Gateway%20API%3F&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-image aligncenter size-large is-resized"><img loading="lazy" decoding="async" width="1024" height="680" src="https://blog.ovhcloud.com/wp-content/uploads/2025/12/Gribouillis-2025-12-02-13.47.59.631-1024x680.png" alt="" class="wp-image-30084" style="width:669px;height:auto" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/12/Gribouillis-2025-12-02-13.47.59.631-1024x680.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/12/Gribouillis-2025-12-02-13.47.59.631-300x199.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/12/Gribouillis-2025-12-02-13.47.59.631.png 1505w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>For years, the Kubernetes <strong>Ingress</strong> API, and the popular Ingress NGINX controller (ingress-nginx), have been the default way to expose applications running inside a Kubernetes cluster.</p>



<p>But the ecosystem is changing: the Kubernetes SIG network has announced the <a href="https://kubernetes.io/blog/2025/11/11/ingress-nginx-retirement/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">retirement of Ingress NGINX</a> in March 2026.</p>



<p>After <strong>March 2026</strong>, Ingress NGINX will no longer receive new features, releases, security patches or bug fixes.</p>



<p>Furthermore, the <a href="https://kubernetes.io/docs/concepts/services-networking/ingress/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Kubernetes project <strong>recommends using Gateway instead of Ingress</strong></a>.</p>



<p>The Ingress API has already been frozen: it is no longer being developed and will receive no further changes or updates. However, the Kubernetes project has no plans to remove Ingress from Kubernetes.</p>



<p>While OVHcloud Managed Kubernetes Service (MKS) does not yet provide a native <strong>GatewayClass</strong>, you can already benefit from Gateway API capabilities today by deploying your own controller 💪 .</p>



<p>Also, until Gateway API becomes fully integrated with OpenStack providers, there is an <strong>intermediate option</strong>: using a <strong>modern, actively maintained Ingress controller</strong> other than ingress-nginx.</p>



<h3 class="wp-block-heading">The limitations of the current Ingress controller model</h3>



<p>The traditional Kubernetes Ingress model was intentionally simple: define an <code>Ingress</code>, install an <code>Ingress Controller</code>, and let it configure a single proxy (usually Nginx) to route traffic.</p>



<p>This design works, but it comes with limitations:</p>



<p>&#8211; Single monolithic entry point: all HTTP routing for the entire cluster goes through <strong>one shared proxy</strong>, which adds complexity, configuration conflicts and scaling challenges.<br>&#8211; Protocol limitations: only <strong>HTTP and HTTPS</strong>. Support for gRPC, HTTP/2, TCP, UDP or TLS passthrough is inconsistent and controller-specific.<br>&#8211; Heavy reliance on annotations: advanced features (timeouts, rewrites, header handling&#8230;) rely on custom annotations.<br>&#8211; Fragmented third-party and cloud load balancer support: every <a href="https://kubernetes.io/docs/concepts/services-networking/ingress-controllers/#additional-controllers" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Ingress controller</a> comes with its own specialised annotations.</p>



<p>Finally, as mentioned, the most widely used Ingress controller, Ingress NGINX, will be retired in March 2026.</p>



<h3 class="wp-block-heading">A Transitional Solution: Using a Modern Ingress Controller (Traefik, Contour, HAProxy…)</h3>



<p>Before moving to the Gateway API, as a transitional solution, OVHcloud MKS users can simply replace Ingress NGINX with a <strong>modern, actively maintained Ingress controller</strong>.</p>



<p>This allows you to:</p>



<p>&#8211; keep using your existing <code>Ingress</code> manifests<br>&#8211; keep the same architecture: Service type LoadBalancer → OVHcloud Public Cloud Load Balancer → Ingress Controller<br>&#8211; avoid relying on unsupported or deprecated components<br>&#8211; gain features (better gRPC support, built‑in dashboards, improved L7 behaviour&#8230;)</p>
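<p>For example, an existing <code>Ingress</code> manifest typically needs at most an <code>ingressClassName</code> change to be served by Traefik. The sketch below is a hypothetical minimal example (the hostname and service name are placeholders):</p>

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo-app              # placeholder name
spec:
  ingressClassName: traefik   # handled by the Traefik controller
  rules:
    - host: demo.example.com  # placeholder hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: demo-app   # placeholder backend Service
                port:
                  number: 80
```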



<h4 class="wp-block-heading">Popular alternatives:</h4>



<p><a href="https://doc.traefik.io/traefik/providers/kubernetes-ingress/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer"><strong>Traefik</strong></a>:<br>&#8211; Very easy to deploy<br>&#8211; Excellent support for HTTP/2, gRPC, WebSockets<br>&#8211; Built‑in dashboard<br>&#8211; Supports both Ingress and Gateway API<br>&#8211; Actively maintained<br>&#8211; Seamless migration from NGINX Ingress Controller to Traefik with <a href="https://doc.traefik.io/traefik/reference/routing-configuration/kubernetes/ingress-nginx/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">NGINX annotation compatibility</a></p>



<p><strong><a href="https://projectcontour.io/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Contour</a> (Envoy)</strong>:<br>&#8211; Envoy-based Ingress Controller<br>&#8211; Excellent performance<br>&#8211; Good stepping‑stone toward Gateway API</p>



<p><a href="https://www.haproxy.com/documentation/kubernetes-ingress/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer"><strong>HAProxy Ingress</strong></a>:<br>&#8211; Extremely performant<br>&#8211; Enterprise-grade L7 routing<br>&#8211; Optional Gateway API support</p>



<p><strong><a href="https://docs.nginx.com/nginx-gateway-fabric/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">NGINX Gateway Fabric</a> (NGF)</strong>:<br>&#8211; The successor to Ingress NGINX<br>&#8211; Built directly around Gateway API<br>&#8211; Still maturing but a strong long‑term candidate</p>



<p>If you are interested, you can read a more <a href="https://kubernetes.io/docs/concepts/services-networking/ingress-controllers/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">exhaustive list of Ingress controllers</a>.</p>



<h3 class="wp-block-heading">Installing an Alternative Ingress Controller on OVHcloud MKS</h3>



<p>We will show you how to install <strong>Traefik</strong> as an alternative Ingress controller and use it to provision a single OVHcloud Public Cloud Load Balancer (based on OpenStack Octavia).</p>



<p>Install Traefik:</p>



<pre class="wp-block-code"><code class="">helm repo add traefik https://traefik.github.io/charts<br>helm repo update<br><br>helm install traefik traefik/traefik --namespace traefik --create-namespace --set service.type=LoadBalancer</code></pre>



<p>This automatically triggers:<br>&#8211; the OpenStack Cloud Controller Manager (CCM) used by OVHcloud<br>&#8211; the creation of an OVHcloud Public Cloud Load Balancer<br>&#8211; the exposure of Traefik through a public IP</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="179" src="https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-11-1024x179.png" alt="" class="wp-image-30035" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-11-1024x179.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-11-300x52.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-11-768x134.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-11-1536x268.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-11-2048x358.png 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>After several seconds, the Load Balancer will be active.</p>



<p>Check that Traefik is running:</p>



<pre class="wp-block-code"><code class="">$ kubectl get all -n traefik<br>NAME                           READY   STATUS    RESTARTS   AGE<br>pod/traefik-6777c5db85-pddd6   1/1     Running   0          31s<br><br>NAME              TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)                      AGE<br>service/traefik   LoadBalancer   10.3.129.188   &lt;pending&gt;     80:30267/TCP,443:30417/TCP   31s<br><br>NAME                      READY   UP-TO-DATE   AVAILABLE   AGE<br>deployment.apps/traefik   1/1     1            1           31s<br><br>NAME                                 DESIRED   CURRENT   READY   AGE<br>replicaset.apps/traefik-6777c5db85   1         1         1       31s</code></pre>



<p>Then in order to use it, create an <code>ingress.yaml</code> file with the following content:</p>



<pre class="wp-block-code"><code class="">apiVersion: networking.k8s.io/v1<br>kind: Ingress<br>metadata:<br>  name: my-app-ingress<br>  namespace: default<br>spec:<br>  ingressClassName: traefik  # Selects Traefik as the ingress controller (the kubernetes.io/ingress.class annotation is deprecated)<br>  rules:<br>    - host: my-app.local<br>      http:<br>        paths:<br>          - path: /<br>            pathType: Prefix<br>            backend:<br>              service:<br>                name: my-app-service<br>                port:<br>                  number: 80</code></pre>



<p>And apply it in your cluster:</p>



<pre class="wp-block-code"><code class="">kubectl apply -f ingress.yaml</code></pre>



<p>Using this type of alternative provides a <strong>fully supported, modern Ingress Controller</strong> while you prepare a long‑term transition to the Gateway API.</p>



<h3 class="wp-block-heading">Gateway API: A modern, flexible networking model</h3>



<p>The <strong>Gateway API</strong> is the next-generation Kubernetes networking specification. It introduces clearer roles and more flexible architectures.</p>



<p>Gateway API splits responsibilities across:<br>&#8211; <strong>GatewayClass</strong>: defines the type of gateway and which controller manages it<br>&#8211; <strong>Gateway</strong>: the actual entry point (e.g., a Load Balancer)<br>&#8211; <strong>Routes</strong>: routing rules, protocol-specific (HTTPRoute, TLSRoute, GRPCRoute, TCPRoute…)</p>



<figure class="wp-block-image size-full is-resized"><img loading="lazy" decoding="async" width="800" height="700" src="https://blog.ovhcloud.com/wp-content/uploads/2025/12/image-1.png" alt="" class="wp-image-30065" style="width:558px;height:auto" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/12/image-1.png 800w, https://blog.ovhcloud.com/wp-content/uploads/2025/12/image-1-300x263.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/12/image-1-768x672.png 768w" sizes="auto, (max-width: 800px) 100vw, 800px" /></figure>



<p>Gateway API supports:<br>&#8211; HTTP(S)<br>&#8211; HTTP/2<br>&#8211; gRPC<br>&#8211; TCP<br>&#8211; TLS passthrough<br>…in a consistent and portable way.</p>
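<p>For example, exposing a plain TCP service&#8211; something Ingress cannot express&#8211; becomes a first-class route. A sketch only: <code>TCPRoute</code> currently lives in the Gateway API experimental channel (<code>v1alpha2</code>), and the Gateway, listener and Service names below are hypothetical:</p>



<pre class="wp-block-code"><code class="">apiVersion: gateway.networking.k8s.io/v1alpha2
kind: TCPRoute
metadata:
  name: postgres-route        # hypothetical name
  namespace: default
spec:
  parentRefs:
  - name: my-gateway          # a Gateway with a TCP listener named "postgres"
    sectionName: postgres
  rules:
  - backendRefs:
    - name: postgres          # hypothetical ClusterIP Service
      port: 5432</code></pre>



<p>Note that your Gateway controller must support TCPRoute, and the experimental channel CRDs must be installed, for this to work.</p>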



<p>Unlike Ingress, Gateway API is explicitly designed to allow providers like OVHcloud, AWS, GCP and Azure to:<br>&#8211; provision Load Balancers (LB)<br>&#8211; manage listeners<br>&#8211; expose multiple ports<br>&#8211; integrate with their LB features</p>



<p>This paves the way for native OVHcloud <strong>GatewayClass</strong> support.</p>



<h3 class="wp-block-heading">How does it work today on OVHcloud MKS?</h3>



<p>OVHcloud MKS relies on the OpenStack Cloud Controller Manager (CCM) to provision OVHcloud <strong>Public Cloud</strong> Load Balancers in response to a Service of type <code>LoadBalancer</code>.</p>



<p>Since MKS does not yet include a native <code>GatewayClass</code>, you can use Gateway API today as follows:</p>



<p>1. You deploy an existing Gateway Controller (Envoy Gateway, Traefik, Contour/Envoy…) and its GatewayClass.<br>2. The controller deploys a Data Plane proxy inside the cluster.<br>3. To expose that proxy, you still have to create a <code>Service</code> of type <strong>LoadBalancer</strong> (and your app of course).<br>4. The CCM provisions an OVHcloud Public Cloud Load Balancer and forwards traffic to your proxy.</p>
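<p>Step 3 boils down to a classic <code>Service</code> of type LoadBalancer sitting in front of the controller&#8217;s data-plane pods. A minimal sketch (most controllers, including Envoy Gateway, create this Service for you; the names, namespace and labels here are hypothetical):</p>



<pre class="wp-block-code"><code class="">apiVersion: v1
kind: Service
metadata:
  name: gateway-proxy          # hypothetical
  namespace: gateway-system    # hypothetical
spec:
  type: LoadBalancer           # the OpenStack CCM provisions an OVHcloud Public Cloud LB
  selector:
    app: gateway-proxy         # must match the controller's data-plane proxy pods
  ports:
  - name: http
    port: 80
    targetPort: 8080</code></pre>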



<p>Thanks to that, you will have a fully functional Gateway API setup. The workflow is very similar to the one required for the NGINX Ingress Controller.</p>



<h3 class="wp-block-heading">Using the Gateway API on OVHcloud MKS today</h3>



<p>You can already use the Gateway API by deploying your preferred controller.</p>



<p>Here’s an example using<a href="https://gateway.envoyproxy.io/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer"> Envoy Gateway</a>, one of the most future-proof options.</p>



<p>Install Gateway API CRDs:</p>



<pre class="wp-block-code"><code class="">kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/latest/download/standard-install.yaml</code></pre>



<p>Deploy Envoy Gateway:</p>



<pre class="wp-block-code"><code class="">helm install eg oci://docker.io/envoyproxy/gateway-helm -n envoy-gateway-system --create-namespace</code></pre>



<p>You should have a result like this:</p>



<pre class="wp-block-code"><code class="">$ helm install eg oci://docker.io/envoyproxy/gateway-helm -n envoy-gateway-system --create-namespace<br><br>Pulled: docker.io/envoyproxy/gateway-helm:1.6.0<br>Digest: sha256:5c55e7844ae8cff3152ca00330234ef61b1f9fa3d466f50db2c63a279f1cd1df<br>NAME: eg<br>LAST DEPLOYED: Mon Dec  1 16:27:07 2025<br>NAMESPACE: envoy-gateway-system<br>STATUS: deployed<br>REVISION: 1<br>TEST SUITE: None<br>NOTES:<br>**************************************************************************<br>*** PLEASE BE PATIENT: Envoy Gateway may take a few minutes to install ***<br>**************************************************************************<br><br>Envoy Gateway is an open source project for managing Envoy Proxy as a standalone or Kubernetes-based application gateway.<br><br>Thank you for installing Envoy Gateway! 🎉<br><br>Your release is named: eg. 🎉<br><br>Your release is in namespace: envoy-gateway-system. 🎉<br><br>To learn more about the release, try:<br><br>  $ helm status eg -n envoy-gateway-system<br>  $ helm get all eg -n envoy-gateway-system<br><br>To have a quickstart of Envoy Gateway, please refer to https://gateway.envoyproxy.io/latest/tasks/quickstart.<br><br>To get more details, please visit https://gateway.envoyproxy.io and https://github.com/envoyproxy/gateway.</code></pre>



<p>Check the Envoy gateway is running:</p>



<pre class="wp-block-code"><code class="">$ kubectl get po -n envoy-gateway-system<br>NAME                            READY   STATUS    RESTARTS   AGE<br>envoy-gateway-9cbbc577c-5h5qw   1/1     Running   0          16m</code></pre>



<p>As a quickstart, you can directly install the <a href="https://gateway-api.sigs.k8s.io/api-types/gatewayclass/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">GatewayClass</a>, <a href="https://gateway-api.sigs.k8s.io/api-types/gateway/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Gateway</a>, <a href="https://gateway-api.sigs.k8s.io/api-types/httproute/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">HTTPRoute</a> and an example app:</p>



<pre class="wp-block-code"><code class="">kubectl apply -f https://github.com/envoyproxy/gateway/releases/download/latest/quickstart.yaml -n default</code></pre>



<p>This command deploys a <code>GatewayClass</code>, a <code>Gateway</code>, an <code>HTTPRoute</code> and an example app (a Deployment exposed through a Service):</p>



<pre class="wp-block-code"><code class="">gatewayclass.gateway.networking.k8s.io/eg created<br>gateway.gateway.networking.k8s.io/eg created<br>serviceaccount/backend created<br>service/backend created<br>deployment.apps/backend created<br>httproute.gateway.networking.k8s.io/backend created</code></pre>



<p>As you can see, a GatewayClass has been deployed:</p>



<pre class="wp-block-code"><code class="">$ kubectl get gatewayclass -o yaml | kubectl neat<br>apiVersion: v1<br>items:<br>- apiVersion: gateway.networking.k8s.io/v1<br>  kind: GatewayClass<br>  metadata:<br>    name: eg<br>  spec:<br>    controllerName: gateway.envoyproxy.io/gatewayclass-controller<br>kind: List<br>metadata:<br>  resourceVersion: ""</code></pre>



<p>Note that a GatewayClass is a cluster-wide resource so you don&#8217;t have to specify any namespace.</p>



<p>A Gateway has also been deployed:</p>



<pre class="wp-block-code"><code class="">$ kubectl get gateway -o yaml -n default | kubectl neat<br>apiVersion: v1<br>items:<br>- apiVersion: gateway.networking.k8s.io/v1<br>  kind: Gateway<br>  metadata:<br>    name: eg<br>    namespace: default<br>  spec:<br>    gatewayClassName: eg<br>    listeners:<br>    - allowedRoutes:<br>        namespaces:<br>          from: Same<br>      name: http<br>      port: 80<br>      protocol: HTTP<br>kind: List<br>metadata:<br>  resourceVersion: ""</code></pre>



<p>An HTTPRoute as well:</p>



<pre class="wp-block-code"><code class="">$ kubectl get httproute -o yaml -n default | kubectl neat<br>apiVersion: v1<br>items:<br>- apiVersion: gateway.networking.k8s.io/v1<br>  kind: HTTPRoute<br>  metadata:<br>    name: backend<br>    namespace: default<br>  spec:<br>    hostnames:<br>    - www.example.com<br>    parentRefs:<br>    - group: gateway.networking.k8s.io<br>      kind: Gateway<br>      name: eg<br>    rules:<br>    - backendRefs:<br>      - group: ""<br>        kind: Service<br>        name: backend<br>        port: 3000<br>        weight: 1<br>      matches:<br>      - path:<br>          type: PathPrefix<br>          value: /<br>kind: List<br>metadata:<br>  resourceVersion: ""</code></pre>



<p>To retrieve the external IP of the external Load Balancer, you just have to get the Gateway information and export the address to an environment variable:</p>



<pre class="wp-block-code"><code class="">$ kubectl get gateway eg<br>NAME   CLASS   ADDRESS        PROGRAMMED   AGE<br>eg     eg      xx.xxx.xx.xxx   True        18m<br><br>$ export GATEWAY_HOST=$(kubectl get gateway/eg -o jsonpath='{.status.addresses[0].value}')<br><br>$ echo $GATEWAY_HOST<br>xx.xxx.xx.xxx</code></pre>



<p>And finally, a <code>backend</code> Service has been deployed along with its Deployment:</p>



<pre class="wp-block-code"><code class="">$ kubectl get pod,svc -l app=backend -n default<br>NAME                           READY   STATUS    RESTARTS   AGE<br>pod/backend-765694d47f-zr6hh   1/1     Running   0          21m<br><br>NAME              TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE<br>service/backend   ClusterIP   10.3.114.179   &lt;none&gt;        3000/TCP   21m</code></pre>



<p>In order to create your own <code>Gateway</code> and <code>*Route</code> resources, don&#8217;t hesitate to take a look at the <a href="https://gateway-api.sigs.k8s.io/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Gateway API website</a>.</p>
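<p>As a starting point, here is a sketch of a custom <code>HTTPRoute</code> attached to the quickstart <code>eg</code> Gateway, layering a header-based canary match on top of a path match (the hostname and backend Services are hypothetical):</p>



<pre class="wp-block-code"><code class="">apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: api-route
  namespace: default
spec:
  parentRefs:
  - name: eg                    # the Gateway deployed by the quickstart
  hostnames:
  - api.example.com             # hypothetical hostname
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /v1
      headers:
      - name: x-canary
        value: "true"
    backendRefs:
    - name: api-canary          # hypothetical Service
      port: 8080
  - matches:
    - path:
        type: PathPrefix
        value: /v1
    backendRefs:
    - name: api-stable          # hypothetical Service
      port: 8080</code></pre>



<p>Requests to <code>/v1</code> carrying the <code>x-canary: true</code> header hit the canary backend (the more specific match wins), while all other <code>/v1</code> traffic goes to the stable one.</p>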



<h3 class="wp-block-heading">Conclusion</h3>



<p>Two migration paths are currently available for OVHcloud MKS users:</p>



<ul class="wp-block-list">
<li>Short-term: switch to a modern Ingress Controller (Traefik, Contour, HAProxy, NGF&#8230;). It provides full support for current Ingress usage, without requiring API changes.</li>



<li>Long-term: adopt the Gateway API. Gateway API brings multi‑protocol support, clearer separation of roles, and is the strategic direction of Kubernetes networking.</li>
</ul>



<p>Which approach and which tool should you choose? Well, it’s up to you, depending on your use cases, your teams, your needs… 🙂</p>



<p>As we have seen in this blog post, OVHcloud MKS users can begin adopting these technologies today, safely and incrementally.</p>



<p>This ecosystem is evolving quickly, so stay tuned to find out about the coming release of a pre-installed official GatewayClass (based on OpenStack Octavia) 💪.</p>
<img loading="lazy" decoding="async" src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fmoving-beyond-ingress-why-should-ovhcloud-managed-kubernetes-service-mks-users-start-looking-at-the-gateway-api%2F&amp;action_name=Moving%20Beyond%20Ingress%3A%20Why%20should%20OVHcloud%20Managed%20Kubernetes%20Service%20%28MKS%29%20users%20start%20looking%20at%20the%20Gateway%20API%3F&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Manage your secrets using OVHcloud Secret Manager with External Secrets Operator (ESO) on OVHcloud Managed Kubernetes Service (MKS)</title>
		<link>https://blog.ovhcloud.com/manage-your-secrets-through-ovhcloud-secret-manager-thanks-to-external-secrets-operator-eso-on-ovhcloud-managed-kubernetes-service-mks/</link>
		
		<dc:creator><![CDATA[Aurélie Vache]]></dc:creator>
		<pubDate>Tue, 25 Nov 2025 14:44:52 +0000</pubDate>
				<category><![CDATA[OVHcloud Engineering]]></category>
		<category><![CDATA[Tranches de Tech & co]]></category>
		<category><![CDATA[IAM]]></category>
		<category><![CDATA[Kubernetes]]></category>
		<category><![CDATA[MKS]]></category>
		<category><![CDATA[Public Cloud]]></category>
		<category><![CDATA[Secret Manager]]></category>
		<guid isPermaLink="false">https://blog.ovhcloud.com/?p=29374</guid>

					<description><![CDATA[Secrets resources in Kubernetes help us keep sensitive information like logins, passwords, tokens, credentials and certificates secure. But just a heads up: Secrets in Kubernetes are base64 encoded, not encrypted so anyone can read and decode them if they know how. The good news is that OVHcloud has just launched the Secret Manager Beta, which [&#8230;]<img src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fmanage-your-secrets-through-ovhcloud-secret-manager-thanks-to-external-secrets-operator-eso-on-ovhcloud-managed-kubernetes-service-mks%2F&amp;action_name=Manage%20your%20secrets%20using%20OVHcloud%20Secret%20Manager%20with%20External%20Secrets%20Operator%20%28ESO%29%20on%20OVHcloud%20Managed%20Kubernetes%20Service%20%28MKS%29&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-image aligncenter size-large is-resized"><img loading="lazy" decoding="async" width="1024" height="675" src="https://blog.ovhcloud.com/wp-content/uploads/2025/11/IMG_1547-1-1024x675.jpg" alt="" class="wp-image-30006" style="width:638px;height:auto" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/11/IMG_1547-1-1024x675.jpg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/IMG_1547-1-300x198.jpg 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/IMG_1547-1-768x507.jpg 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/IMG_1547-1.jpg 1536w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>Secrets resources in Kubernetes help us keep sensitive information like logins, passwords, tokens, credentials and certificates secure. But just a heads up: Secrets in Kubernetes are base64 encoded, not encrypted, so anyone who knows how can read and decode them.</p>
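<p>To illustrate, decoding a base64 value needs no key at all (the encoded string below is a made-up example password):</p>

```shell
# A base64-encoded value, as it would appear inside a Kubernetes Secret
encoded="c3VwZXItc2VjcmV0LXBhc3N3b3Jk"

# Anyone able to read the Secret can decode it without any key
echo "$encoded" | base64 -d   # prints: super-secret-password
```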



<p>The good news is that OVHcloud has just launched the<a href="https://www.ovhcloud.com/fr/identity-security-operations/secret-manager/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer"> Secret Manager</a> Beta, which you can use within your Kubernetes clusters via the External Secrets Operator (ESO) 🎉.</p>



<h2 class="wp-block-heading">External Secrets Operator</h2>



<p>The External Secrets Operator (ESO) extends Kubernetes with Custom Resource Definitions (CRDs) that define <strong>where</strong> secrets are and <strong>how</strong> to sync them.</p>



<p>The controller <strong>retrieves secrets from an external API</strong> and <strong>creates Kubernetes Secrets</strong>. If the secret changes in the external API, the controller updates the secret in the Kubernetes cluster.</p>



<p>Basically, the ESO can connect to an external Secret Manager like OVHcloud, Vault, AWS, or GCP using a (Cluster)SecretStore, and an ExternalSecret to figure out which Secret it needs to fetch. It then creates a Secret in the Kubernetes cluster with the fetched secret’s value.</p>
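<p>Declaratively, that fetch is described by an <code>ExternalSecret</code>. A minimal sketch, assuming a <code>ClusterSecretStore</code> named <code>vault-secret-store</code> and an illustrative secret path:</p>



<pre class="wp-block-code"><code class="">apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: my-app-credentials       # hypothetical name
  namespace: default
spec:
  refreshInterval: 1h            # re-sync from the external store every hour
  secretStoreRef:
    kind: ClusterSecretStore
    name: vault-secret-store     # the store that knows how to reach the external API
  target:
    name: my-app-credentials     # name of the Kubernetes Secret to create
  data:
  - secretKey: password          # key inside the created Kubernetes Secret
    remoteRef:
      key: prod/va1/my-app       # illustrative path of the secret in the external store</code></pre>



<p>ESO then creates, and keeps refreshing, a regular Kubernetes Secret named <code>my-app-credentials</code> in the <code>default</code> namespace.</p>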



<figure class="wp-block-image aligncenter size-full is-resized"><img loading="lazy" decoding="async" width="1020" height="942" src="https://blog.ovhcloud.com/wp-content/uploads/2025/07/image-10.png" alt="" class="wp-image-29378" style="width:435px;height:auto" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/07/image-10.png 1020w, https://blog.ovhcloud.com/wp-content/uploads/2025/07/image-10-300x277.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/07/image-10-768x709.png 768w" sizes="auto, (max-width: 1020px) 100vw, 1020px" /></figure>



<p>Plus, it can sync secrets across all the namespaces in your Kubernetes cluster (I love this feature ❤️):</p>



<figure class="wp-block-image aligncenter size-large is-resized"><img loading="lazy" decoding="async" width="1024" height="577" src="https://blog.ovhcloud.com/wp-content/uploads/2025/07/image-11-1024x577.png" alt="" class="wp-image-29380" style="width:502px;height:auto" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/07/image-11-1024x577.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/07/image-11-300x169.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/07/image-11-768x433.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/07/image-11.png 1282w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>You can use External Secrets with different<a href="https://external-secrets.io/latest/provider/aws-secrets-manager/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer"> Providers</a>, including AWS Secrets Manager, HashiCorp Vault and Google Secret Manager. In this blog I’ll show you how to consume secrets from the new OVHcloud Secret Manager using the<a href="https://external-secrets.io/latest/provider/hashicorp-vault/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer"> HashiCorp Vault</a> provider.</p>



<p>For more details, read the<a href="https://external-secrets.io/v0.8.5/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer"> ESO official documentation</a>.</p>



<h2 class="wp-block-heading">Let&#8217;s jump in!</h2>



<h3 class="wp-block-heading">Create an IAM local user</h3>



<p>To fetch secrets from Secret Manager, you’ll need an IAM user with the right permissions. You can either create a new one or use an existing one.</p>



<p>In the<a href="https://www.ovh.com/manager" data-wpel-link="exclude"> OVHcloud Control Panel</a> (UI), go to ‘Identity and Access Management’, then ‘Identities’.</p>



<figure class="wp-block-image size-full is-resized"><img loading="lazy" decoding="async" width="760" height="636" src="https://blog.ovhcloud.com/wp-content/uploads/2025/11/identity.png" alt="" class="wp-image-29967" style="width:232px;height:auto" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/11/identity.png 760w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/identity-300x251.png 300w" sizes="auto, (max-width: 760px) 100vw, 760px" /></figure>



<p>Click the ‘Add user’ button to create an IAM local user and complete the fields as shown below:</p>



<figure class="wp-block-image size-large is-resized"><img loading="lazy" decoding="async" width="1024" height="907" src="https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-9-2-1024x907.png" alt="" class="wp-image-29994" style="width:561px;height:auto" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-9-2-1024x907.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-9-2-300x266.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-9-2-768x681.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-9-2.png 1194w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<figure class="wp-block-image size-large is-resized"><img loading="lazy" decoding="async" width="1024" height="473" src="https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-10-1-1024x473.png" alt="" class="wp-image-29995" style="width:560px;height:auto" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-10-1-1024x473.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-10-1-300x139.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-10-1-768x355.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-10-1.png 1194w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>Quick note, I’ve named the user ‘secretmanager-’ followed by the ID of the OKMS domain I want to use.</p>



<p>The user needs to be an ADMIN or, ideally, have just the following policies:</p>



<pre class="wp-block-code"><code class="">okms:apikms:secret/create<br>okms:apikms:secret/version/getData<br>okms:apiovh:secret/get</code></pre>



<h3 class="wp-block-heading">Get the Personal Access Token (PAT)</h3>



<p>The ESO ClusterSecretStore needs permission to fetch secrets from Secret Manager, so you’ll need a token (PAT).</p>



<p>You can access it via our API, which you’ll find here: <a href="https://eu.api.ovh.com/console/?section=%2Fme&amp;branch=v1#post-/me/identity/user/-user-/token" data-wpel-link="exclude">https://eu.api.ovh.com/console/?section=%2Fme&amp;branch=v1#post-/me/identity/user/-user-/token</a></p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="542" src="https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-1-3-1024x542.png" alt="" class="wp-image-29997" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-1-3-1024x542.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-1-3-300x159.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-1-3-768x406.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-1-3-1536x813.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-1-3.png 1546w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p><strong>Path parameters</strong></p>



<p>user: secretmanager-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx</p>



<p><strong>Request body:</strong></p>



<pre class="wp-block-code"><code class="">{<br>  "description": "PAT secretmanager-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx",<br>  "name": "pat-secretmanager-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx"<br>}</code></pre>



<p>You should obtain a response like this:</p>



<pre class="wp-block-code"><code class="">{<br>  "creation": "2025-11-07T14:02:56.679157188Z",<br>  "description": "PAT secretmanager-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx",<br>  "expiresAt": null,<br>  "lastUsed": null,<br>  "name": "pat-secretmanager-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx",<br>  "token": "eyJhbGciOiJ...punpVAg"<br>}</code></pre>



<p>Save the token value, because you’ll need it in a bit.</p>



<h3 class="wp-block-heading">Create a secret in the Secret Manager</h3>



<p>Here’s how to create a secret with OVHcloud MPR credentials for use in Kubernetes cluster(s).</p>



<p>In the<a href="https://www.ovh.com/manager" data-wpel-link="exclude"> OVHcloud Control Panel</a> (UI), go to ‘Secret Manager’, then create a secret ‘prod/va1/dockerconfigjson’ in the Europe region (France – Paris) eu-west-par:</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="309" src="https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-5-1-1024x309.png" alt="" class="wp-image-29973" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-5-1-1024x309.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-5-1-300x91.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-5-1-768x232.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-5-1-1536x464.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-5-1-2048x618.png 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>You’ll need to activate the region if you’re selecting it for the first time:</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="569" src="https://blog.ovhcloud.com/wp-content/uploads/2025/11/Capture-decran-2025-11-07-a-14.03.20-1024x569.png" alt="" class="wp-image-29911" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/11/Capture-decran-2025-11-07-a-14.03.20-1024x569.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/Capture-decran-2025-11-07-a-14.03.20-300x167.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/Capture-decran-2025-11-07-a-14.03.20-768x426.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/Capture-decran-2025-11-07-a-14.03.20-1536x853.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/Capture-decran-2025-11-07-a-14.03.20-2048x1137.png 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>Select an OKMS domain:</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="260" src="https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-6-3-1024x260.png" alt="" class="wp-image-29996" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-6-3-1024x260.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-6-3-300x76.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-6-3-768x195.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-6-3.png 1384w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>Enter the path and value of your secret. For example:</p>



<figure class="wp-block-image size-large is-resized"><img loading="lazy" decoding="async" width="1024" height="708" src="https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-7-1-1024x708.png" alt="" class="wp-image-29975" style="width:558px;height:auto" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-7-1-1024x708.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-7-1-300x208.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-7-1-768x531.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-7-1.png 1402w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
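<p>For a <code>dockerconfigjson</code>-style secret, the stored value typically follows Docker&#8217;s config file format. A sketch with placeholder registry and credentials (<code>auth</code> is simply the base64 encoding of <code>username:password</code>):</p>



<pre class="wp-block-code"><code class="">{
  "auths": {
    "registry.example.com": {
      "username": "my-user",
      "password": "my-password",
      "auth": "bXktdXNlcjpteS1wYXNzd29yZA=="
    }
  }
}</code></pre>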



<p>Your secret is all set!</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="417" src="https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-4-2-1024x417.png" alt="" class="wp-image-29990" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-4-2-1024x417.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-4-2-300x122.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-4-2-768x313.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-4-2-1536x625.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/11/image-4-2-2048x834.png 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<h3 class="wp-block-heading">Install External Secrets Operators on your cluster</h3>



<p>Deploy external secret through Helm:</p>



<pre class="wp-block-code"><code class="">helm repo add external-secrets https://charts.external-secrets.io
helm repo update</code></pre>



<p>Install from the chart repository:</p>



<pre class="wp-block-code"><code class="">helm install external-secrets \<br>   external-secrets/external-secrets \<br>    -n external-secrets \<br>    --create-namespace \<br>    --set installCRDs=true</code></pre>



<p>Your result should look something like this:</p>



<pre class="wp-block-code"><code class="">$ helm install external-secrets \<br>   external-secrets/external-secrets \<br>    -n external-secrets \<br>    --create-namespace \<br>    --set installCRDs=true<br><br>NAME: external-secrets<br>LAST DEPLOYED: Mon Nov 24 17:08:58 2025<br>NAMESPACE: external-secrets<br>STATUS: deployed<br>REVISION: 1<br>TEST SUITE: None<br>NOTES:<br>external-secrets has been deployed successfully in namespace external-secrets!<br><br>In order to begin using ExternalSecrets, you will need to set up a SecretStore<br>or ClusterSecretStore resource (for example, by creating a 'vault' SecretStore).<br><br>More information on the different types of SecretStores and how to configure them<br>can be found in our Github: https://github.com/external-secrets/external-secrets</code></pre>



<p>This command will install the External Secrets Operator in your cluster.</p>



<p>Check ESO is running:</p>



<pre class="wp-block-code"><code class="">$ kubectl get all -n external-secrets<br>NAME                                                    READY   STATUS    RESTARTS   AGE<br>pod/external-secrets-6b9f8ff5d4-jwd6g                   1/1     Running   0          25m<br>pod/external-secrets-cert-controller-7bf8fd894c-d24xb   1/1     Running   0          25m<br>pod/external-secrets-webhook-df488ddff-2xv4t            1/1     Running   0          25m<br><br>NAME                               TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)   AGE<br>service/external-secrets-webhook   ClusterIP   10.3.106.32   &lt;none&gt;        443/TCP   25m<br><br>NAME                                               READY   UP-TO-DATE   AVAILABLE   AGE<br>deployment.apps/external-secrets                   1/1     1            1           25m<br>deployment.apps/external-secrets-cert-controller   1/1     1            1           25m<br>deployment.apps/external-secrets-webhook           1/1     1            1           25m<br><br>NAME                                                          DESIRED   CURRENT   READY   AGE<br>replicaset.apps/external-secrets-6b9f8ff5d4                   1         1         1       25m<br>replicaset.apps/external-secrets-cert-controller-7bf8fd894c   1         1         1       25m<br>replicaset.apps/external-secrets-webhook-df488ddff            1         1         1       25m</code></pre>



<h3 class="wp-block-heading">Create a Secret containing the PAT</h3>



<p>Encode the PAT in base64:</p>



<pre class="wp-block-code"><code class="">$ echo -n "&lt;token&gt;" | base64<br><br>ZXlKaG...wVkFn</code></pre>
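

<p>💡 Note: with GNU coreutils, <code>base64</code> wraps its output at 76 characters by default, which would corrupt the Secret value for long tokens. A small sketch of two safer options (the token value is a placeholder, and the <code>kubectl</code> alternative is shown commented out):</p>



<pre class="wp-block-code"><code class=""># Encode without line wrapping (GNU coreutils; macOS base64 does not wrap)
echo -n "&lt;token&gt;" | base64 -w0

# Alternatively, let kubectl handle the encoding for you and print the
# resulting manifest without touching the cluster:
#   kubectl create secret generic ovhcloud-vault-token \
#     --from-literal=token="&lt;token&gt;" \
#     -n external-secrets \
#     --dry-run=client -o yaml</code></pre>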



<p>Create a secret with it inside a <strong>secret.yaml</strong> file:</p>



<pre class="wp-block-code"><code class="">apiVersion: v1<br>kind: Secret<br>metadata:<br>  name: ovhcloud-vault-token<br>  namespace: external-secrets<br>data:<br>  token: ZXlKaG...wVkFn</code></pre>



<p>Apply the resource in your cluster:</p>



<pre class="wp-block-code"><code class="">kubectl apply -f secret.yaml</code></pre>



<p>Check that the secret has been created:</p>



<pre class="wp-block-code"><code class="">$ kubectl get secret ovhcloud-vault-token -n external-secrets<br>NAME                   TYPE     DATA   AGE<br>ovhcloud-vault-token   Opaque   1      5m</code></pre>



<h3 class="wp-block-heading">Deploy a ClusterSecretStore to connect ESO to Secret Manager</h3>



<p>Set up a ClusterSecretStore to manage synchronisation with Secret Manager.<br>It will use the HashiCorp Vault provider with token auth, and the OKMS endpoint as the backend.</p>



<p>Create a <strong>clustersecretstore.yaml</strong> file with the content below:</p>



<pre class="wp-block-code"><code class="">apiVersion: external-secrets.io/v1<br>kind: ClusterSecretStore<br>metadata:<br>  name: vault-secret-store<br>spec:<br>  provider:<br>      vault:<br>        server: "https://eu-west-par.okms.ovh.net/api/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" # OKMS endpoint, fill with the correct region and your okms_id<br>        path: "secret"<br>        version: "v2"<br>        auth:<br>            tokenSecretRef:<br>              name: ovhcloud-vault-token # The k8s secret that contain your PAT<br>              key: token</code></pre>



<p>Keep in mind that, in our example, we’ve selected the “eu-west-par” region. Adjust the server URL to match your desired region.</p>



<p>Apply it:</p>



<pre class="wp-block-code"><code class="">kubectl apply -f clustersecretstore.yaml</code></pre>



<p>Check:</p>



<pre class="wp-block-code"><code class="">$ kubectl get clustersecretstore.external-secrets.io/vault-secret-store<br>NAME                 AGE   STATUS   CAPABILITIES   READY<br>vault-secret-store   2m   Valid    ReadWrite      True</code></pre>



<h3 class="wp-block-heading">Create an ExternalSecret</h3>



<p>Create an <strong>externalsecret.yaml</strong> file with the content below:</p>



<pre class="wp-block-code"><code class="">apiVersion: external-secrets.io/v1<br>kind: ExternalSecret<br>metadata:<br>  name: docker-config-secret<br>  namespace: external-secrets<br>spec:<br>  refreshInterval: 30m<br>  secretStoreRef:<br>    name: vault-secret-store<br>    kind: ClusterSecretStore<br>  target:<br>    template:<br>      type: kubernetes.io/dockerconfigjson<br>      data:<br>        .dockerconfigjson: "{{ .mysecret | toString }}"<br>    name: ovhregistrycred<br>    creationPolicy: Owner<br>  data:<br>  - secretKey: mysecret<br>    remoteRef:<br>      key: prod/va1/dockerconfigjson</code></pre>



<p>Apply it:</p>



<pre class="wp-block-code"><code class="">$ kubectl apply -f externalsecret.yaml<br>externalsecret.external-secrets.io/docker-config-secret created</code></pre>



<p>Check:</p>



<pre class="wp-block-code"><code class="">$ kubectl get externalsecret.external-secrets.io/docker-config-secret -n external-secrets<br>NAME                   STORETYPE            STORE                REFRESH INTERVAL   STATUS         READY<br>docker-config-secret   ClusterSecretStore   vault-secret-store   30m0s              SecretSynced   True</code></pre>



<p>Once the ExternalSecret is synced, ESO creates the target Kubernetes Secret object:</p>



<pre class="wp-block-code"><code class="">$ kubectl get secret -n external-secrets<br>NAME                                     TYPE                             DATA   AGE<br>...<br>ovhregistrycred                          kubernetes.io/dockerconfigjson   1      17d<br>...</code></pre>



<p>As you can see, the Secret is ready, and you can now use it as an imagePullSecret in your Pods!</p>
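

<p>As a sketch (the Pod and image names below are hypothetical placeholders), referencing the synced Secret from a Pod looks like this. Note that an <code>imagePullSecret</code> must live in the same namespace as the Pod that uses it, so you may need one ExternalSecret per namespace that pulls from your private registry:</p>



<pre class="wp-block-code"><code class="">apiVersion: v1
kind: Pod
metadata:
  name: my-app                  # hypothetical Pod name
  namespace: external-secrets   # must match the Secret's namespace
spec:
  containers:
  - name: my-app
    image: registry.example.com/my-team/my-app:1.0.0  # placeholder image
  imagePullSecrets:
  - name: ovhregistrycred       # the Secret synced by ESO</code></pre>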



<h3 class="wp-block-heading">Conclusion</h3>



<p>In this blog, we’ve explained how to create secrets in the new OVHcloud Secret Manager and integrate them directly in your Kubernetes clusters using the ESO Vault provider.</p>



<p>And here’s some great news: our teams are working on an OVHcloud External Secret Operator, set to go live in the coming months 🎉.</p>



<p>Stay tuned and share your thoughts!</p>
<img loading="lazy" decoding="async" src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fmanage-your-secrets-through-ovhcloud-secret-manager-thanks-to-external-secrets-operator-eso-on-ovhcloud-managed-kubernetes-service-mks%2F&amp;action_name=Manage%20your%20secrets%20using%20OVHcloud%20Secret%20Manager%20with%20External%20Secrets%20Operator%20%28ESO%29%20on%20OVHcloud%20Managed%20Kubernetes%20Service%20%28MKS%29&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>10 Reasons Scaling Startups Are Migrating to OVHcloud</title>
		<link>https://blog.ovhcloud.com/10-reasons-scaling-startups-are-migrating-to-ovhcloud/</link>
		
		<dc:creator><![CDATA[Alexander Grau]]></dc:creator>
		<pubDate>Tue, 21 Oct 2025 21:26:37 +0000</pubDate>
				<category><![CDATA[OVHcloud Startup Program]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[Data Sovereignty]]></category>
		<category><![CDATA[Infrastructure]]></category>
		<category><![CDATA[Kubernetes]]></category>
		<category><![CDATA[Migration]]></category>
		<category><![CDATA[OVHcloud]]></category>
		<category><![CDATA[Public Cloud]]></category>
		<category><![CDATA[Startup Program]]></category>
		<guid isPermaLink="false">https://blog.ovhcloud.com/?p=28941</guid>

					<description><![CDATA[Cloud infrastructure plays a critical role in how startups scale—affecting everything from product delivery and user experience to budget and compliance. While many startups begin their journey with public cloud giants, the challenges of unpredictable costs, data control, and technical constraints become more apparent as they grow. For startups ready to scale smarter, OVHcloud offers [&#8230;]<img src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2F10-reasons-scaling-startups-are-migrating-to-ovhcloud%2F&amp;action_name=10%20Reasons%20Scaling%20Startups%20Are%20Migrating%20to%20OVHcloud&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></description>
										<content:encoded><![CDATA[
<p>Cloud infrastructure plays a critical role in how startups scale—affecting everything from product delivery and user experience to budget and compliance. While many startups begin their journey with public cloud giants, the challenges of unpredictable costs, data control, and technical constraints become more apparent as they grow.</p>



<p>For startups ready to scale smarter, <strong>OVHcloud</strong> offers a compelling alternative: high-performance, cost-effective, and sovereignty-first infrastructure. Here’s why more and more growing startups are making the switch.</p>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h5 class="wp-block-heading"><strong>1. Predictable, Transparent Pricing</strong></h5>



<p>OVHcloud’s <a href="https://www.ovhcloud.com/en-gb/lp/prices-give-you-edge/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">flat-rate pricing model</a> eliminates hidden fees and unpredictable billing. Bandwidth is included. Egress costs? Zero. This gives startups the ability to budget confidently—even as infrastructure scales rapidly.</p>



<h5 class="wp-block-heading"><strong>2. Cost-Efficient Scaling</strong></h5>



<p>Startups that migrate to OVHcloud often report <strong>up to 60% cost savings</strong> compared to hyperscalers. Whether you&#8217;re scaling your backend, AI workloads, or customer-facing applications, OVHcloud lets you do more with less.</p>



<h5 class="wp-block-heading"><strong>3. Performance Without Compromise</strong></h5>



<p>From <a href="https://www.ovhcloud.com/en/bare-metal/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Bare Metal</a> servers and <a href="https://www.ovhcloud.com/en/public-cloud/virtual-instances/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">high-memory VMs</a> to <a href="https://www.ovhcloud.com/en/public-cloud/gpu/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">GPU</a> and <a href="https://www.ovhcloud.com/en/public-cloud/storage/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">storage-optimized instances</a>, OVHcloud infrastructure is engineered for performance. Ideal for AI, SaaS, analytics, and other compute-intensive use cases.</p>



<h5 class="wp-block-heading"><strong>4. Full Data Sovereignty in the EU</strong></h5>



<p>OVHcloud is headquartered in Europe and operates under <strong>the strictest data protection laws (like GDPR in the EU or Law 25 in Quebec, Canada)</strong>. Unlike other providers, <a href="https://www.ovhcloud.com/en-gb/lp/data-stays-your-data/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">your data stays</a> within jurisdictions that respect privacy and sovereignty—no exposure to foreign surveillance laws.</p>



<h5 class="wp-block-heading"><strong>5. Open Standards and No Vendor Lock-In</strong></h5>



<p>Freedom matters—especially when you’re building for scale. OVHcloud supports open technologies like <a href="https://www.ovhcloud.com/en/public-cloud/kubernetes/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Kubernetes</a>, Terraform, and OpenStack, giving your team full flexibility and avoiding lock-in to proprietary tools or services.</p>



<h5 class="wp-block-heading"><strong>6. Infrastructure That Grows With You</strong></h5>



<p>Whether you&#8217;re launching in new markets or onboarding thousands of new users, OVHcloud enables seamless horizontal and vertical scaling. With availability across multiple regions, your growth won’t hit a wall.</p>



<h5 class="wp-block-heading"><strong>7. Faster Time-to-Market Through Cloud Migration Support</strong></h5>



<p>OVHcloud offers <strong>cloud migration guidance and tools</strong>, including compatibility with major platforms and migration kits. This helps your team move faster, avoid downtime, and focus on innovation—not infrastructure headaches.</p>



<h5 class="wp-block-heading"><strong>8. Dev-Friendly Ecosystem</strong></h5>



<p>With support for containerization, automation, and CI/CD pipelines, OVHcloud makes life easier for DevOps teams. You can provision infrastructure programmatically and scale efficiently—just like you would with AWS or GCP.</p>



<h5 class="wp-block-heading"><strong>9. Sustainability Built In</strong></h5>



<p>Efficiency is built into OVHcloud’s DNA. By designing and operating its own energy-efficient data centers, OVHcloud helps startups meet their <a href="https://www.ovhcloud.com/en-gb/lp/sustainable-ground/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">sustainability goals</a> without compromising on performance or cost.</p>



<h5 class="wp-block-heading"><strong>10. A Cloud Partner—Not Just a Provider</strong></h5>



<p>Startups need more than compute power—they need guidance, flexibility, and a partner that understands their journey. OVHcloud offers <a href="https://www.ovhcloud.com/en/professional-services/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">tailored support</a>, <a href="https://help.ovhcloud.com/csm/world-documentation?id=kb_home" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">technical documentation</a>, and real human engagement to help you succeed at every stage of growth.</p>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h5 class="wp-block-heading"><strong><em>OVHcloud: Built to Scale With Startups</em></strong></h5>



<p>If your startup is growing fast and needs infrastructure that can keep up—without breaking the budget or sacrificing control—OVHcloud offers a cloud built around your values: <strong>scalability, transparency, freedom, and performance</strong>.</p>



<p><strong>Migrate with confidence. Scale with control. Grow with OVHcloud.</strong></p>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<figure class="wp-block-image size-full"><a href="https://startup.ovhcloud.com/en/globalreport2025/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer"><img loading="lazy" decoding="async" width="970" height="250" src="https://blog.ovhcloud.com/wp-content/uploads/2025/08/Email-Signature-–-1.jpg" alt="" class="wp-image-29527" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/08/Email-Signature-–-1.jpg 970w, https://blog.ovhcloud.com/wp-content/uploads/2025/08/Email-Signature-–-1-300x77.jpg 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/08/Email-Signature-–-1-768x198.jpg 768w" sizes="auto, (max-width: 970px) 100vw, 970px" /></a></figure>



<p>If you’re a startup looking to transform your business, we encourage you to join the <strong><a href="https://startup.ovhcloud.com/en/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">OVHcloud Startup Program</a></strong> or contact OVHcloud to discover how our solutions can support your journey!</p>
<img loading="lazy" decoding="async" src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2F10-reasons-scaling-startups-are-migrating-to-ovhcloud%2F&amp;action_name=10%20Reasons%20Scaling%20Startups%20Are%20Migrating%20to%20OVHcloud&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Create encrypted Persistent Volumes on OVHcloud Managed Kubernetes clusters with LUKS</title>
		<link>https://blog.ovhcloud.com/create-encrypted-persistent-volumes-on-ovhcloud-managed-kubernetes-clusters-with-luks/</link>
		
		<dc:creator><![CDATA[Aurélie Vache]]></dc:creator>
		<pubDate>Tue, 19 Aug 2025 11:35:41 +0000</pubDate>
				<category><![CDATA[OVHcloud Engineering]]></category>
		<category><![CDATA[Tranches de Tech & co]]></category>
		<category><![CDATA[Block Storage]]></category>
		<category><![CDATA[Kubernetes]]></category>
		<category><![CDATA[MKS]]></category>
		<category><![CDATA[Public Cloud]]></category>
		<guid isPermaLink="false">https://blog.ovhcloud.com/?p=29532</guid>

					<description><![CDATA[Since this summer, it&#8217;s possible to create encrypted OVHcloud Block Storage with OMK (OVHcloud managed key) in RBX, SBG, Paris &#38; BHS regions. More regions will come in the coming months 💪. And the good news is that you can use encrypted Block Storage using Persistent Volumes in your OVHcloud Managed Kubernetes Service (MKS) clusters [&#8230;]<img src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fcreate-encrypted-persistent-volumes-on-ovhcloud-managed-kubernetes-clusters-with-luks%2F&amp;action_name=Create%20encrypted%20Persistent%20Volumes%20on%20OVHcloud%20Managed%20Kubernetes%20clusters%20with%20LUKS&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-image aligncenter size-large is-resized"><img loading="lazy" decoding="async" width="1024" height="681" src="https://blog.ovhcloud.com/wp-content/uploads/2025/08/Gribouillis-2025-08-19-11.53.11.513-1-1024x681.png" alt="" class="wp-image-29585" style="width:495px;height:auto" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/08/Gribouillis-2025-08-19-11.53.11.513-1-1024x681.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/08/Gribouillis-2025-08-19-11.53.11.513-1-300x200.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/08/Gribouillis-2025-08-19-11.53.11.513-1-768x511.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/08/Gribouillis-2025-08-19-11.53.11.513-1.png 1533w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>Since this summer, it&#8217;s possible to create <a href="https://github.com/ovh/public-cloud-roadmap/issues/307" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">encrypted OVHcloud Block Storage with OMK (OVHcloud managed key)</a> in RBX, SBG, Paris &amp; BHS regions. More regions will come in the coming months 💪.</p>



<p>And the good news is that you can use encrypted <strong>Block Storage</strong> using <code>Persistent Volumes</code> in your OVHcloud <strong>Managed Kubernetes Service (MKS)</strong> clusters 🎉.</p>



<p>In this post, we’ll show you how to encrypt persistent volumes on an OVHcloud Managed Kubernetes (MKS) cluster using a&nbsp;<code>csi-cinder-high-speed-gen2-luks</code>&nbsp;<code>Storage Class</code>. By leveraging LUKS-based encryption at the storage layer, you’ll protect your data at rest without sacrificing the performance of NVMe-backed volumes.</p>



<p>We’ll guide you step by step: defining the <code>Storage Class</code>, creating a <code>Persistent Volume Claim</code> (PVC), and deploying a <code>Pod</code> that mounts the encrypted volume.  </p>



<p>This practical walkthrough is designed for developers and platform engineers looking to secure their Kubernetes workloads on OVHcloud in a straightforward way.</p>



<h2 class="wp-block-heading">How to</h2>



<p>You will create a <code>Persistent Volume Claim</code> (PVC), linked to a <code>Storage Class</code>, which will automatically provision a <code>Persistent Volume</code> (PV) backed by an encrypted Public Cloud <strong>Block Storage</strong> volume.<br>Then you will create a <code>Pod</code> attached to the <code>PVC</code>.</p>



<figure class="wp-block-image aligncenter size-large is-resized"><img loading="lazy" decoding="async" width="1024" height="970" src="https://blog.ovhcloud.com/wp-content/uploads/2025/08/image-1024x970.png" alt="" class="wp-image-29539" style="width:560px;height:auto" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/08/image-1024x970.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/08/image-300x284.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/08/image-768x728.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/08/image.png 1144w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<h3 class="wp-block-heading">Let’s create an encrypted Persistent Volume in our OVHcloud MKS cluster</h3>



<p>Prerequisite: Have an OVHcloud MKS cluster.</p>



<p>First, create a <code>csi-cinder-high-speed-gen2-luks.yaml</code> file with the following content:</p>



<p>💡 Note that if you deploy it on an MKS 1AZ cluster (instead of the 3AZ MKS cluster used here), you should set <code>volumeBindingMode</code> to <code>Immediate</code> instead.</p>



<pre class="wp-block-code"><code class="">apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-cinder-high-speed-gen2-luks
allowVolumeExpansion: true
parameters:
  fsType: ext4
  type: high-speed-gen2-luks
provisioner: cinder.csi.openstack.org
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer </code></pre>



<p>This StorageClass uses the same configuration as the existing <code>csi-cinder-high-speed-gen2</code> one, but with the <code>high-speed-gen2-luks</code> type.</p>



<p>As a result, volumes will use SSD disks with NVMe interfaces, encrypted with LUKS (Linux Unified Key Setup), a standard on-disk format for disk encryption.</p>



<p>Apply the manifest file:</p>



<pre class="wp-block-code"><code class="">kubectl apply -f csi-cinder-high-speed-gen2-luks.yaml</code></pre>



<p>⚠️ You can&#8217;t modify the <code>volumeBindingMode</code> value for an existing <code>Storage Class</code>, you have to delete it and create a new one.</p>
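

<p>For reference, here is what the same <code>Storage Class</code> would look like on a 1AZ cluster, where only the <code>volumeBindingMode</code> changes to <code>Immediate</code> (the rest of the manifest is identical to the one above):</p>



<pre class="wp-block-code"><code class="">apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-cinder-high-speed-gen2-luks
allowVolumeExpansion: true
parameters:
  fsType: ext4
  type: high-speed-gen2-luks
provisioner: cinder.csi.openstack.org
reclaimPolicy: Delete
volumeBindingMode: Immediate</code></pre>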



<p>List the <code>Storage Class</code>es in the cluster:</p>



<pre class="wp-block-code"><code class="">$ kubectl get sc
NAME                              PROVISIONER                RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
csi-cinder-high-speed (default)   cinder.csi.openstack.org   Delete          WaitForFirstConsumer   true                   33d
csi-cinder-high-speed-gen-2       cinder.csi.openstack.org   Delete          WaitForFirstConsumer   true                   33d
csi-cinder-high-speed-gen2-luks   cinder.csi.openstack.org   Delete          WaitForFirstConsumer   true                   4s</code></pre>



<p>Create a <code>pvc-luks.yaml</code> file with the following content:</p>



<pre class="wp-block-code"><code class="">apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-luks
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: csi-cinder-high-speed-gen2-luks</code></pre>



<p>Create a new namespace and apply the manifest file into it:</p>



<pre class="wp-block-code"><code class="">kubectl create ns test-pvc-luks
kubectl apply -f pvc-luks.yaml -n test-pvc-luks</code></pre>



<p>Check the status of our newly created <code>PVC</code>:</p>



<pre class="wp-block-code"><code class="">$ kubectl get pvc -n test-pvc-luks<br>NAME       STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS                      VOLUMEATTRIBUTESCLASS   AGE<br>pvc-luks   Pending                                      csi-cinder-high-speed-gen2-luks   &lt;unset>                 3s<br><br><br>$ kubectl describe pvc pvc-luks -n test-pvc-luks<br>Name:          pvc-luks<br>Namespace:     test-pvc-luks<br>StorageClass:  csi-cinder-high-speed-gen2-luks<br>Status:        Pending<br>Volume:<br>Labels:        &lt;none><br>Annotations:   &lt;none><br>Finalizers:    [kubernetes.io/pvc-protection]<br>Capacity:<br>Access Modes:<br>VolumeMode:    Filesystem<br>Used By:       &lt;none><br>Events:<br>  Type    Reason                Age                From                         Message<br>  ----    ------                ----               ----                         -------<br>  Normal  WaitForFirstConsumer  10s (x2 over 10s)  persistentvolume-controller  waiting for first consumer to be created before binding</code></pre>



<p>As you can see, your <code>PVC</code> has been created with the LUKS <code>Storage Class</code> and is <em><strong>Pending</strong></em>; it will be <strong><em>Bound</em></strong> once a first consumer <code>Pod</code> with a volume is created (because of the <code>WaitForFirstConsumer</code> binding mode).</p>



<p>Create a <code>pod.yaml</code> file with the following content:</p>



<pre class="wp-block-code"><code class="">apiVersion: v1
kind: Pod
metadata:
  name: pod-with-encrypted-volume
spec:
  containers:
  - name: nginx
    image: nginx
    volumeMounts:
    - mountPath: "/usr/share/nginx/html"
      name: encrypted-volume
  volumes:
  - name: encrypted-volume
    persistentVolumeClaim:
      claimName: pvc-luks</code></pre>



<p>Apply the manifest file in the same <code>namespace</code>:</p>



<pre class="wp-block-code"><code class="">kubectl apply -f pod.yaml -n test-pvc-luks</code></pre>



<p>The <code>PVC</code> should now be <strong><em>Bound</em></strong> and a new <code>PV</code> should be created:</p>



<pre class="wp-block-code"><code class="">$ kubectl get pvc -n test-pvc-luks
NAME       STATUS   VOLUME                                                                     CAPACITY   ACCESS MODES   STORAGECLASS                      VOLUMEATTRIBUTESCLASS   AGE
pvc-luks   Bound    ovh-managed-kubernetes-siti343p-pvc-3a3b1d2e-ebdf-41a2-8f8f-4ee6984b6149   10Gi       RWO            csi-cinder-high-speed-gen2-luks   &lt;unset&gt;                 3m27s

$ kubectl get pv -n test-pvc-luks
NAME                                                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                    STORAGECLASS                      VOLUMEATTRIBUTESCLASS   REASON   AGE
ovh-managed-kubernetes-siti343p-pvc-3a3b1d2e-ebdf-41a2-8f8f-4ee6984b6149   10Gi       RWO            Delete           Bound    test-pvc-luks/pvc-luks   csi-cinder-high-speed-gen2-luks   &lt;unset&gt;                          32s</code></pre>



<p>At first, the <code>Pod</code> should be in the <code><strong><em>ContainerCreating</em></strong></code> state (waiting for the volume to be created and attached), and after a few seconds it will be <em><strong>Running</strong></em>:</p>



<pre class="wp-block-code"><code class="">$ kubectl get pod pod-with-encrypted-volume -n test-pvc-luks
NAME                        READY   STATUS              RESTARTS   AGE
pod-with-encrypted-volume   0/1     ContainerCreating   0          44s

# Wait a little...

$ kubectl get pod pod-with-encrypted-volume -n test-pvc-luks
NAME                        READY   STATUS    RESTARTS   AGE
pod-with-encrypted-volume   1/1     Running   0          2m10s</code></pre>



<p>The <code>Pod</code> is now created with an attached volume:</p>



<pre class="wp-block-code"><code class="">$ kubectl describe pod pod-with-encrypted-volume -n test-pvc-luks
Name:             pod-with-encrypted-volume
Namespace:        test-pvc-luks
Priority:         0
Service Account:  default
Node:             my-pool-zone-c-h5xjf-7n7kt/192.168.142.174
Start Time:       Tue, 19 Aug 2025 10:10:41 +0200
Labels:           &lt;none&gt;
Annotations:      &lt;none&gt;
Status:           Running
IP:               10.240.0.203
IPs:
  IP:  10.240.0.203
Containers:
  nginx:
    Container ID:   containerd://c38c0a0e19970503ad1bfaa0c74b5cc320cb9df08456c7613b9a9a8c908b9190
    Image:          nginx
    Image ID:       docker.io/library/nginx@sha256:33e0bbc7ca9ecf108140af6288c7c9d1ecc77548cbfd3952fd8466a75edefe57
    Port:           &lt;none&gt;
    Host Port:      &lt;none&gt;
    State:          Running
      Started:      Tue, 19 Aug 2025 10:11:42 +0200
    Ready:          True
    Restart Count:  0
    Environment:    &lt;none&gt;
    Mounts:
      /usr/share/nginx/html from encrypted-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-vbcnk (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   True
  Initialized                 True
  Ready                       True
  ContainersReady             True
  PodScheduled                True
Volumes:
  encrypted-volume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  pvc-luks
    ReadOnly:   false
  kube-api-access-vbcnk:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       &lt;nil&gt;
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              &lt;none&gt;
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age                    From                     Message
  ----     ------                  ----                   ----                     -------
  Normal   Scheduled               3m48s                  default-scheduler        Successfully assigned test-pvc-luks/pod-with-encrypted-volume to my-pool-zone-c-h5xjf-7n7kt
  Warning  FailedAttachVolume      3m25s (x6 over 3m43s)  attachdetach-controller  AttachVolume.Attach failed for volume "ovh-managed-kubernetes-siti343p-pvc-3a3b1d2e-ebdf-41a2-8f8f-4ee6984b6149" : rpc error: code = Internal desc = [ControllerPublishVolume] Attach Volume failed with error failed to attach b76d1025-9473-4050-86be-4880f0f625cb volume to 516c41cf-9637-4b08-a75e-1d265d1773f4 compute: Bad request with: [POST https://compute.eu-west-par.cloud.ovh.net/v2.1/a212a1e43b614c4ba27a247b890fcf59/servers/516c41cf-9637-4b08-a75e-1d265d1773f4/os-volume_attachments], error message: {"badRequest": {"code": 400, "message": "Invalid input received: Invalid volume: Volume b76d1025-9473-4050-86be-4880f0f625cb status must be available or downloading to reserve, but the current status is creating. (HTTP 400) (Request-ID: req-e94505fd-39d6-496c-bc6d-275cd2604dda)"}}
  Normal   SuccessfulAttachVolume  3m8s                   attachdetach-controller  AttachVolume.Attach succeeded for volume "ovh-managed-kubernetes-siti343p-pvc-3a3b1d2e-ebdf-41a2-8f8f-4ee6984b6149"
  Normal   Pulling                 2m53s                  kubelet                  Pulling image "nginx"
  Normal   Pulled                  2m48s                  kubelet                  Successfully pulled image "nginx" in 5.072s (5.072s including waiting). Image size: 72324501 bytes.
  Normal   Created                 2m48s                  kubelet                  Created container: nginx
  Normal   Started                 2m48s                  kubelet                  Started container nginx</code></pre>



<p>Logging in to the OVHcloud Control Panel, you can see that the encrypted volume has been successfully created:</p>



<figure class="wp-block-image aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="310" src="https://blog.ovhcloud.com/wp-content/uploads/2025/08/image-1-1024x310.png" alt="" class="wp-image-29581" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/08/image-1-1024x310.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/08/image-1-300x91.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/08/image-1-768x233.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/08/image-1-1536x465.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/08/image-1.png 2020w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>Finally, you can use your volume.</p>



<p>Execute a shell in the Nginx <code>Pod</code> and create an <code>index.html</code> file into it:</p>



<pre class="wp-block-code"><code class="">$ kubectl exec -it pod-with-encrypted-volume -n test-pvc-luks -- /bin/bash

root@pod-with-encrypted-volume:/# echo "Hello from OVHcloud encrypted Block Storage!" &gt; /usr/share/nginx/html/index.html</code></pre>



<p>And curl the webserver: </p>



<pre class="wp-block-code"><code class="">root@pod-with-encrypted-volume:/# apt update
root@pod-with-encrypted-volume:/# apt install curl
root@pod-with-encrypted-volume:/# curl http://localhost/
Hello from OVHcloud encrypted Block Storage!</code></pre>



<p>🎉</p>



<h2 class="wp-block-heading">What&#8217;s next?</h2>



<p>In this blog post we saw a basic (but concrete) usage of the encrypted <code>Persistent Volume</code> feature that has just been released on OVHcloud Kubernetes clusters; don&#8217;t hesitate to consider it for your sensitive data.<br><br>In the coming months, encrypted <strong>Block Storage</strong> will be available worldwide. Follow the <a href="https://github.com/ovh/public-cloud-roadmap/issues/307" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Encrypted Block Volumes</a> issue on GitHub to stay informed.<br><br>And don&#8217;t hesitate to take a look at our <a href="https://github.com/orgs/ovh/projects/16" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Cloud Roadmap &amp; Changelog</a> to see the status of all upcoming features in OVHcloud Public Cloud products.</p>
<img loading="lazy" decoding="async" src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fcreate-encrypted-persistent-volumes-on-ovhcloud-managed-kubernetes-clusters-with-luks%2F&amp;action_name=Create%20encrypted%20Persistent%20Volumes%20on%20OVHcloud%20Managed%20Kubernetes%20clusters%20with%20LUKS&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Discover Kubernetes 1.33 features &#8211; Topology aware routing in multi-zones Kubernetes clusters</title>
		<link>https://blog.ovhcloud.com/discover-kubernetes-1-33-features-topology-aware-routing-in-multi-zones-kubernetes-clusters/</link>
		
		<dc:creator><![CDATA[Aurélie Vache]]></dc:creator>
		<pubDate>Tue, 17 Jun 2025 07:05:40 +0000</pubDate>
				<category><![CDATA[OVHcloud Engineering]]></category>
		<category><![CDATA[Tranches de Tech & co]]></category>
		<category><![CDATA[3AZ]]></category>
		<category><![CDATA[Kubernetes]]></category>
		<category><![CDATA[Kubernetes 1.33]]></category>
		<category><![CDATA[MKS]]></category>
		<category><![CDATA[multi-zone cluster]]></category>
		<category><![CDATA[OVHcloud]]></category>
		<category><![CDATA[Public Cloud]]></category>
		<guid isPermaLink="false">https://blog.ovhcloud.com/?p=29191</guid>

					<description><![CDATA[Kubernetes 1.33 version has just been released few days/weeks ago.As this new release contains 64 enhancements (!), it can not be easy to know what are the interesting and useful features and how to use them. In this blog post, let&#8217;s discover one of interesting and useful new feature: &#8220;Topology aware routing in multi-zones Kubernetes [&#8230;]<img src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fdiscover-kubernetes-1-33-features-topology-aware-routing-in-multi-zones-kubernetes-clusters%2F&amp;action_name=Discover%20Kubernetes%201.33%20features%20%26%238211%3B%20Topology%20aware%20routing%20in%20multi-zones%20Kubernetes%20clusters&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-image aligncenter size-full is-resized"><img loading="lazy" decoding="async" width="1014" height="1022" src="https://blog.ovhcloud.com/wp-content/uploads/2025/06/mks3az-kubernetes-1.33-small.png" alt="" class="wp-image-29240" style="width:436px;height:auto" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/06/mks3az-kubernetes-1.33-small.png 1014w, https://blog.ovhcloud.com/wp-content/uploads/2025/06/mks3az-kubernetes-1.33-small-298x300.png 298w, https://blog.ovhcloud.com/wp-content/uploads/2025/06/mks3az-kubernetes-1.33-small-150x150.png 150w, https://blog.ovhcloud.com/wp-content/uploads/2025/06/mks3az-kubernetes-1.33-small-768x774.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/06/mks3az-kubernetes-1.33-small-70x70.png 70w" sizes="auto, (max-width: 1014px) 100vw, 1014px" /></figure>



<p><a href="https://kubernetes.io/blog/2025/04/23/kubernetes-v1-33-release/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Kubernetes 1.33</a> was released just a few weeks ago.<br>As this new release contains 64 enhancements (!), it can be hard to know which features are interesting and useful, and how to use them.</p>



<p>In this blog post, let&#8217;s discover one of its interesting and useful new features: &#8220;Topology aware routing in multi-zones Kubernetes clusters&#8221;.</p>



<p>⚠️ Kubernetes 1.33 should be available on OVHcloud MKS clusters at the end of June/beginning of July, but the demo also works on MKS with the Kubernetes 1.32 release 😉.</p>



<h2 class="wp-block-heading">Topology aware routing</h2>



<p>Since Kubernetes 1.33, the <a href="https://kubernetes.io/docs/concepts/services-networking/topology-aware-routing/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">topology aware routing and traffic distribution</a> feature is in General Availability (GA).</p>



<p>This feature allows you to optimize service traffic in multi-zone clusters, reducing latency and cross-zone data transfer costs.</p>



<p>Topology Aware Routing provides a mechanism to help <strong>keep traffic within the zone</strong> it originated from.</p>



<p>In the context of multi-zone clusters, it improves reliability, <strong>reduces costs</strong> and <strong>improves network performance</strong>.</p>



<p>As OVHcloud has just launched, in Beta, their <a href="https://labs.ovhcloud.com/en/managed-kubernetes-service-mks-premium-plan/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">Managed Kubernetes clusters (MKS) on 3 AZ (Availability Zones)</a>, it&#8217;s the perfect occasion for me to test this brand new Kubernetes feature 🙂.</p>



<h2 class="wp-block-heading">Demo</h2>



<p>Prerequisite: Have a Kubernetes cluster with at least 2 nodes running in 2 different zones.</p>



<p>If you don&#8217;t already have one, you can follow <a href="https://blog.ovhcloud.com/deploy-your-workloads-on-3-availability-zones-with-our-new-managed-kubernetes-services-mks-premium-plan/" data-wpel-link="internal">this blog post</a> in order to <a href="https://blog.ovhcloud.com/deploy-your-workloads-on-3-availability-zones-with-our-new-managed-kubernetes-services-mks-premium-plan/" data-wpel-link="internal">create an OVHcloud MKS cluster with 3 node pools</a>, one per AZ.</p>



<p>On my side, I set up an MKS cluster in a 3AZ region, with one node pool per AZ and 3 nodes per node pool:</p>



<pre class="wp-block-code"><code class="">$ kubectx kubernetes-admin@multi-zone-mks
Switched to context "kubernetes-admin@multi-zone-mks".

$ kubectl get np
NAME             FLAVOR   AUTOSCALED   MONTHLYBILLED   ANTIAFFINITY   DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   MIN   MAX   AGE
my-pool-zone-a   b3-8     false        false           false          3         3         3            3           0     100   20d
my-pool-zone-b   b3-8     false        false           false          3         3         3            3           0     100   20d
my-pool-zone-c   b3-8     false        false           false          3         3         3            3           0     100   20d

$ kubectl get no
NAME                         STATUS   ROLES    AGE   VERSION
my-pool-zone-a-b9ztj-brgpq   Ready    &lt;none&gt;   20d   v1.32.3
my-pool-zone-a-b9ztj-gt5vd   Ready    &lt;none&gt;   20d   v1.32.3
my-pool-zone-a-b9ztj-mss8j   Ready    &lt;none&gt;   20d   v1.32.3
my-pool-zone-b-tr6wf-5wfgz   Ready    &lt;none&gt;   20d   v1.32.3
my-pool-zone-b-tr6wf-ct7fs   Ready    &lt;none&gt;   20d   v1.32.3
my-pool-zone-b-tr6wf-vlkwg   Ready    &lt;none&gt;   20d   v1.32.3
my-pool-zone-c-wgrl6-b2f9s   Ready    &lt;none&gt;   20d   v1.32.3
my-pool-zone-c-wgrl6-lp22l   Ready    &lt;none&gt;   20d   v1.32.3
my-pool-zone-c-wgrl6-slkq5   Ready    &lt;none&gt;   20d   v1.32.3</code></pre>



<p>⚠️ As you can see, the Kubernetes version installed on my cluster is not 1.33, but the <code>ServiceTrafficDistribution</code> feature gate is in Beta and is enabled:</p>



<pre class="wp-block-code"><code class="">$ kubectl get --raw /metrics | grep kubernetes_feature_enabled | grep Traffic

kubernetes_feature_enabled{name="ServiceTrafficDistribution",stage="BETA"} 1</code></pre>



<p class="has-text-align-center">A visual architecture of my MKS cluster:</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="800" height="556" src="https://blog.ovhcloud.com/wp-content/uploads/2025/06/image-11.png" alt="" class="wp-image-29192" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/06/image-11.png 800w, https://blog.ovhcloud.com/wp-content/uploads/2025/06/image-11-300x209.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/06/image-11-768x534.png 768w" sizes="auto, (max-width: 800px) 100vw, 800px" /></figure>



<p>⚠️ In MKS Standard clusters, don&#8217;t forget to <a href="https://help.ovhcloud.com/csm/en-gb-public-cloud-kubernetes-customizing-cilium?id=kb_article_view&amp;sysparm_article=KB0074067" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">enable the topology aware routing for 3AZ region</a>. </p>



<p>In order to test this feature, in a new namespace, we will deploy:</p>



<ul class="wp-block-list">
<li>a deployment with two pods named <code>receiver-xxx</code></li>



<li>a ClusterIP service named <code>svc-prefer-close</code> with the feature enabled</li>



<li>a Pod named <code>sender</code></li>
</ul>



<p>Let&#8217;s do that!</p>



<p>Create a <code>deploy.yaml</code> file with the following content:</p>



<pre class="wp-block-code"><code class="">apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/name: service-traffic-example
  name: receiver
  namespace: prefer-close
spec:
  replicas: 2
  selector:
    matchLabels:
      app: service-traffic-example
  template:
    metadata:
      labels:
        app: service-traffic-example
    spec:
      containers:
      - image: scraly/hello-pod:1.0.1
        name: receiver
        ports:
        - containerPort: 8080
        env:
          - name: NODE_NAME
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName</code></pre>



<p>Create a <code>svc.yaml</code> file with the following content:</p>



<pre class="wp-block-code"><code class="">apiVersion: v1
kind: Service
metadata:
  name: svc-prefer-close
  namespace: prefer-close
  annotations:
    service.kubernetes.io/topology-mode: auto
spec:
  ports:
    - name: http
      protocol: TCP
      port: 8080
      targetPort: 8080
  selector:
    app: service-traffic-example
  type: ClusterIP
  trafficDistribution: PreferClose</code></pre>



<p>As you can see, this Service has two specific configurations.<br>First, we added the <code>service.kubernetes.io/topology-mode: auto</code> annotation to enable Topology Aware Routing for the Service.<br>Then, we set <code>trafficDistribution</code> to <code>PreferClose</code> to ask Kubernetes to preferably send the traffic to a pod that is &#8220;close&#8221; to the sender.</p>



<p>Create a new namespace and apply the manifest files:</p>



<pre class="wp-block-code"><code class="">$ kubectl create ns prefer-close
$ kubectl apply -f deploy.yaml
$ kubectl apply -f svc.yaml</code></pre>



<p>Result:<br>You should have two Pods running on 2 different Nodes.</p>



<pre class="wp-block-code"><code class="">$ kubectl get po -o wide -n prefer-close

NAME                        READY   STATUS              RESTARTS   AGE   IP            NODE                         NOMINATED NODE   READINESS GATES
receiver-7cfd89d78d-dhv6z   1/1     Running             0          94s   10.240.4.91   my-pool-zone-c-wgrl6-slkq5   &lt;none&gt;           &lt;none&gt;
receiver-7cfd89d78d-hrxrt   1/1     Running             0          94s   10.240.5.63   my-pool-zone-a-b9ztj-mss8j   &lt;none&gt;           &lt;none&gt;</code></pre>



<p>OK, <code>receiver-xxxxxxxx-dhv6z</code> is running on <code>my-pool-zone-c-xxxx</code> and the other pod is running on <code>my-pool-zone-a-xxxx</code>. They are running in different Availability Zones.</p>



<p>Now, we can create a <code>sender</code> Pod. It will be scheduled on a Node:</p>



<figure class="wp-block-image aligncenter size-full"><img loading="lazy" decoding="async" width="800" height="556" src="https://blog.ovhcloud.com/wp-content/uploads/2025/06/image-12.png" alt="" class="wp-image-29193" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/06/image-12.png 800w, https://blog.ovhcloud.com/wp-content/uploads/2025/06/image-12-300x209.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/06/image-12-768x534.png 768w" sizes="auto, (max-width: 800px) 100vw, 800px" /></figure>



<p>Run it and execute a <code>curl</code> command to test the traffic redirection to the &#8220;svc-prefer-close&#8221; Service:</p>



<pre class="wp-block-code"><code class="">$ kubectl run sender -n prefer-close --image=curlimages/curl -it -- sh
If you don't see a command prompt, try pressing enter.
~ $ curl http://svc-prefer-close.prefer-close:8080
Version: 1.0.1
Hostname: receiver-7cfd89d78d-dhv6z
Node: my-pool-zone-c-wgrl6-slkq5</code></pre>



<p>Let&#8217;s verify where our Pods are:</p>



<pre class="wp-block-code"><code class="">$ kubectl get po -n prefer-close -o wide
NAME                        READY   STATUS    RESTARTS     AGE   IP             NODE                         NOMINATED NODE   READINESS GATES
receiver-7cfd89d78d-dhv6z   1/1     Running   0            9d    10.240.4.91    my-pool-zone-c-wgrl6-slkq5   &lt;none&gt;           &lt;none&gt;
receiver-7cfd89d78d-hrxrt   1/1     Running   0            9d    10.240.5.63    my-pool-zone-a-b9ztj-mss8j   &lt;none&gt;           &lt;none&gt;
sender                      1/1     Running   1 (5s ago)   21s   10.240.3.134   my-pool-zone-c-wgrl6-b2f9s   &lt;none&gt;           &lt;none&gt;</code></pre>



<p>Kube-proxy sent the traffic from <code>sender</code> to a <code>receiver-xx</code> Pod in the same Availability Zone 🎉</p>
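<p>To check that this is not just luck, you can send several requests in a row from the <code>sender</code> Pod. With <code>PreferClose</code> and a healthy endpoint in your zone, the responses should consistently come from the zone-local Pod (a quick sketch; the exact output depends on your cluster):</p>



<pre class="wp-block-code"><code class="">~ $ for i in 1 2 3 4 5; do curl -s http://svc-prefer-close.prefer-close:8080 | grep Node; done</code></pre>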



<p>⚠️ Note that because <code>PreferClose</code> means &#8220;topologically proximate&#8221;, it may vary across implementations and could encompass endpoints within the same node, rack, zone, or even region.</p>



<h2 class="wp-block-heading">How does it work?</h2>



<p>When calculating the endpoints for a Service, the EndpointSlice controller considers the topology (region and zone) of each endpoint and populates the hints field to allocate it to a zone.</p>
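<p>For illustration, an <code>EndpointSlice</code> with populated hints looks roughly like this (the names, IPs and zones below are hypothetical; you can inspect the real ones with <code>kubectl get endpointslices -n prefer-close -o yaml</code>):</p>



<pre class="wp-block-code"><code class="">apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: svc-prefer-close-abc12
  namespace: prefer-close
addressType: IPv4
endpoints:
- addresses:
  - 10.240.4.91
  zone: eu-west-par-c
  hints:
    forZones:
    - name: eu-west-par-c
- addresses:
  - 10.240.5.63
  zone: eu-west-par-a
  hints:
    forZones:
    - name: eu-west-par-a
ports:
- name: http
  port: 8080
  protocol: TCP</code></pre>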



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="800" height="598" src="https://blog.ovhcloud.com/wp-content/uploads/2025/06/image-13.png" alt="" class="wp-image-29194" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/06/image-13.png 800w, https://blog.ovhcloud.com/wp-content/uploads/2025/06/image-13-300x224.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/06/image-13-768x574.png 768w" sizes="auto, (max-width: 800px) 100vw, 800px" /></figure>



<p>Cluster components such as <em>kube-proxy</em> can then consume those hints, and use them to influence how the traffic is routed (favoring topologically closer endpoints).</p>



<p>So, with <code>PreferClose</code> value for <code>trafficDistribution</code>, we ask kube-proxy to redirect traffic to the nearest available endpoints based on the network topology.</p>



<p>That&#8217;s why the option is called <code>PreferClose</code>.</p>



<h2 class="wp-block-heading">What&#8217;s next?</h2>



<p>In the future you will be able to configure the <code>trafficDistribution</code> field with other values.</p>



<p>Indeed, two new, more explicit values have been in Alpha since the Kubernetes 1.33 release: <code>PreferSameZone</code> and <code>PreferSameNode</code>.</p>
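<p>Once the corresponding Alpha feature gate is enabled on your cluster, using them should only require changing the <code>trafficDistribution</code> value in the Service spec (a sketch based on the example Service above, not something you can use on a stock cluster today):</p>



<pre class="wp-block-code"><code class="">spec:
  # Prefer an endpoint on the same node as the client,
  # falling back to other endpoints if none is available:
  trafficDistribution: PreferSameNode</code></pre>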



<figure class="wp-block-image aligncenter size-full is-resized"><img loading="lazy" decoding="async" width="800" height="917" src="https://blog.ovhcloud.com/wp-content/uploads/2025/06/image-14.png" alt="" class="wp-image-29195" style="width:527px;height:auto" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/06/image-14.png 800w, https://blog.ovhcloud.com/wp-content/uploads/2025/06/image-14-262x300.png 262w, https://blog.ovhcloud.com/wp-content/uploads/2025/06/image-14-768x880.png 768w" sizes="auto, (max-width: 800px) 100vw, 800px" /></figure>



<p>Personally I can&#8217;t wait to test them 😇.</p>



<h2 class="wp-block-heading">Want to go further?</h2>



<p>Want to learn more about this topic? In the coming days, we will publish a blog post about the MKS Premium plan.</p>



<p>Visit the <a href="https://labs.ovhcloud.com/en/managed-kubernetes-service-mks-premium-plan/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">Managed Kubernetes Service (MKS) Premium plan</a> page on the OVHcloud Labs website to learn more about Premium MKS.</p>



<p>Join the <strong>free</strong> Beta: <a href="https://labs.ovhcloud.com/en/managed-kubernetes-service-mks-premium-plan/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">https://labs.ovhcloud.com/en/managed-kubernetes-service-mks-premium-plan/</a></p>



<p>Read the documentation about the new <a href="https://help.ovhcloud.com/csm/fr-public-cloud-kubernetes-premium?id=kb_article_view&amp;sysparm_article=KB0067581" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">Managed Kubernetes Service (MKS) Premium plan</a>.</p>



<p>Join us on <a href="https://discord.com/channels/850031577277792286/1366761790150541402" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">Discord</a> and give us your feedback.</p>
<img loading="lazy" decoding="async" src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fdiscover-kubernetes-1-33-features-topology-aware-routing-in-multi-zones-kubernetes-clusters%2F&amp;action_name=Discover%20Kubernetes%201.33%20features%20%26%238211%3B%20Topology%20aware%20routing%20in%20multi-zones%20Kubernetes%20clusters&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Deploy your workloads on 3 availability zones with our new Managed Kubernetes Services (MKS) &#8216;Premium&#8217; plan</title>
		<link>https://blog.ovhcloud.com/deploy-your-workloads-on-3-availability-zones-with-our-new-managed-kubernetes-services-mks-premium-plan/</link>
		
		<dc:creator><![CDATA[Aurélie Vache]]></dc:creator>
		<pubDate>Mon, 19 May 2025 05:20:42 +0000</pubDate>
				<category><![CDATA[OVHcloud Engineering]]></category>
		<category><![CDATA[Tranches de Tech & co]]></category>
		<category><![CDATA[3AZ]]></category>
		<category><![CDATA[Beta]]></category>
		<category><![CDATA[Kubernetes]]></category>
		<category><![CDATA[milti-AZ]]></category>
		<category><![CDATA[MKS]]></category>
		<guid isPermaLink="false">https://blog.ovhcloud.com/?p=28796</guid>

					<description><![CDATA[This blog post will first explain briefly what is the new MKS Premium plan, for who and which use case, then you will see how to deploy a new MKS cluster in 3 availability zones and how to deploy your workloads with this new architecture of Kubernetes cluster. What&#8217;s inside the Premium MKS? The 30th [&#8230;]<img src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fdeploy-your-workloads-on-3-availability-zones-with-our-new-managed-kubernetes-services-mks-premium-plan%2F&amp;action_name=Deploy%20your%20workloads%20on%203%20availability%20zones%20with%20our%20new%20Managed%20Kubernetes%20Services%20%28MKS%29%20%26%238216%3BPremium%26%238217%3B%20plan&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-image aligncenter size-large is-resized"><img loading="lazy" decoding="async" width="890" height="1024" src="https://blog.ovhcloud.com/wp-content/uploads/2025/05/mks-3Apremium-ovh-890x1024.png" alt="" class="wp-image-28908" style="width:336px;height:auto" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/05/mks-3Apremium-ovh-890x1024.png 890w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/mks-3Apremium-ovh-261x300.png 261w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/mks-3Apremium-ovh-768x884.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/mks-3Apremium-ovh-1335x1536.png 1335w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/mks-3Apremium-ovh-1780x2048.png 1780w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/mks-3Apremium-ovh.png 2048w" sizes="auto, (max-width: 890px) 100vw, 890px" /></figure>



<p>This blog post will first briefly explain what the new MKS Premium plan is, who it is for and which use cases it addresses; then you will see how to deploy a new MKS cluster across 3 availability zones and how to deploy your workloads on this new Kubernetes cluster architecture.</p>



<h2 class="wp-block-heading">What&#8217;s inside the Premium MKS?</h2>



<figure class="wp-block-image aligncenter size-full"><img loading="lazy" decoding="async" width="120" height="120" src="https://blog.ovhcloud.com/wp-content/uploads/2025/05/pci_product-managed-kubernetes-service.png" alt="" class="wp-image-28902" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/05/pci_product-managed-kubernetes-service.png 120w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/pci_product-managed-kubernetes-service-70x70.png 70w" sizes="auto, (max-width: 120px) 100vw, 120px" /></figure>



<p>On the 30th of April, we launched, in Beta, the brand new &#8220;Premium plan&#8221; of our Managed Kubernetes Service (MKS) 🎉</p>



<p>Concretely, with MKS Premium you will have:</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="455" src="https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-19-1024x455.png" alt="" class="wp-image-28924" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-19-1024x455.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-19-300x133.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-19-768x341.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-19-1536x683.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-19.png 1570w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>💡 For the moment, only Paris is available for the 3AZ region but several new regions will be available in the coming months including Milan.</p>



<p>Behind this new plan lies a complete overhaul of our platform, based on several <a href="https://www.cncf.io/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Cloud Native</a> Open Source projects like <a href="https://cluster-api.sigs.k8s.io/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Cluster API</a>, <a href="https://kamaji.clastix.io/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Kamaji</a>, <a href="https://argo-cd.readthedocs.io/en/stable/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">ArgoCD</a> and several homemade Kubernetes operators.</p>



<h2 class="wp-block-heading">For who? For what?</h2>



<p>The new MKS Premium plan has been designed for those who want high availability and scalability for their critical applications.</p>



<p>It offers a dedicated and fully managed control plane, resilience across multiple availability zones, dedicated resources for the Kubernetes control plane, and the ability to deploy the data plane across multiple availability zones.</p>



<p>You will be able to design cloud-native applications that are resilient to failures and deploy them across our multi-zone region.</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="485" src="https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-1024x485.png" alt="" class="wp-image-28799" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-1024x485.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-300x142.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-768x364.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image.png 1120w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>You will have full control over how you deploy your worker nodes in our <strong>new 3AZ region</strong> (EU-WEST-PAR).</p>



<p>Deploying your cloud-native applications in our new Paris 3-AZ region also means enjoying the full range of services available:</p>



<ul class="wp-block-list">
<li>Well architected application relying on resilient managed services (MKS + Load Balancer + Gateway + DBaaS + Object Storage &#8230;),</li>



<li>Advanced internal cluster networking with the new <a href="https://cilium.io/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Cilium</a> CNI</li>



<li>Better API server performance and scaling capacity</li>



<li>And much more to come!</li>
</ul>



<h2 class="wp-block-heading">Let&#8217;s deploy a MKS Premium cluster in 3 AZ at Paris!</h2>



<figure class="wp-block-image aligncenter size-full is-resized"><img loading="lazy" decoding="async" width="960" height="797" src="https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-14.png" alt="" class="wp-image-28906" style="width:300px;height:auto" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-14.png 960w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-14-300x249.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-14-768x638.png 768w" sizes="auto, (max-width: 960px) 100vw, 960px" /></figure>



<p>Like the current Standard MKS, you can deploy MKS on 3AZ via the Control Panel (OVHcloud UI), the API, and our Infrastructure as Code (IaC) providers (Terraform/OpenTofu, Pulumi&#8230;).</p>



<p>In this blog post, we will deploy a new MKS cluster, in a 3AZ region (Paris) with 3 node pools (one per availability zone).</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="547" src="https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-21-1024x547.png" alt="" class="wp-image-28933" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-21-1024x547.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-21-300x160.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-21-768x410.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-21-1536x820.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-21.png 1854w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<h3 class="wp-block-heading">With OVHcloud Control Panel</h3>



<p>Log in to the&nbsp;<a href="https://www.ovh.com/auth/?action=gotomanager&amp;from=https://www.ovh.co.uk/&amp;ovhSubsidiary=GB" data-wpel-link="exclude">OVHcloud Control Panel</a>, go to the&nbsp;<code><strong>Public Cloud</strong></code>&nbsp;section and select the <strong>Public Cloud </strong>project concerned.</p>



<p>In the left panel, go to the <strong>Containers &amp; Orchestration</strong> section, click on the <strong>Managed Kubernetes Service</strong> link, then click on the <strong>Create a Kubernetes cluster</strong> button.</p>



<p>Fill in the name of the cluster, choose a 3AZ region by clicking on Paris (EU-WEST-PAR), and select the Premium plan:</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="695" src="https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-3-1024x695.png" alt="" class="wp-image-28816" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-3-1024x695.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-3-300x204.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-3-768x521.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-3-1536x1043.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-3.png 1750w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>Then, select the Kubernetes version and the security policy.</p>



<p>⚠️  Contrary to the Standard MKS, which is public by default, the Premium MKS is private by default, so it is mandatory to create a private network, a subnet and a gateway.</p>



<p>Then, create one node pool per Availability Zone, with 3 nodes per node pool, for example:</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="487" src="https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-6-1024x487.png" alt="" class="wp-image-28871" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-6-1024x487.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-6-300x143.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-6-768x365.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-6-1536x730.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-6.png 1884w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>Confirm the creation of your cluster and wait for it to complete.</p>



<p>Finally, click on the newly created cluster and retrieve the kubeconfig file.</p>



<h3 class="wp-block-heading">With Terraform</h3>



<p>In a previous blog post, we showed you <a href="https://blog.ovhcloud.com/infrastructure-as-code-iac-on-ovhcloud-part-1-terraform-opentofu/" data-wpel-link="internal">how to deploy an MKS cluster with Terraform/OpenTofu</a>. Please read that post if you are not familiar with Terraform or OpenTofu.</p>



<p>Create an <strong>ovh_kube.tf</strong> file with the following content:</p>



<pre class="wp-block-code"><code class="">resource "ovh_cloud_project_network_private" "network" {<br>  service_name = var.service_name<br>  vlan_id      = 84<br>  name         = "terraform_mks_multiaz_private_net"<br>  regions      = ["EU-WEST-PAR"]<br>}<br><br>resource "ovh_cloud_project_network_private_subnet" "subnet" {<br>  service_name = ovh_cloud_project_network_private.network.service_name<br>  network_id   = ovh_cloud_project_network_private.network.id<br><br>  # whatever region, for test purpose<br>  region     = "EU-WEST-PAR"<br>  start      = "192.168.142.100"<br>  end        = "192.168.142.200"<br>  network    = "192.168.142.0/24"<br>  dhcp       = true<br>  no_gateway = false<br>}<br><br>resource "ovh_cloud_project_gateway" "gateway" {<br>  service_name = ovh_cloud_project_network_private.network.service_name<br>  name       = "gateway"<br>  model      = "s"<br>  region     = "EU-WEST-PAR"<br>  network_id = tolist(ovh_cloud_project_network_private.network.regions_attributes[*].openstackid)[0]<br>  subnet_id  = ovh_cloud_project_network_private_subnet.subnet.id<br>}<br><br>resource "ovh_cloud_project_kube" "my_multizone_cluster" {<br>  service_name  = ovh_cloud_project_network_private.network.service_name<br>  name          = "multi-zone-mks"<br>  region        = "EU-WEST-PAR"<br>  plan          = "standard"<br><br>  private_network_id = tolist(ovh_cloud_project_network_private.network.regions_attributes[*].openstackid)[0]<br>  nodes_subnet_id    = ovh_cloud_project_network_private_subnet.subnet.id<br><br>  depends_on    = [ ovh_cloud_project_gateway.gateway ] //Gateway is mandatory for multizones cluster<br>}<br><br>resource "ovh_cloud_project_kube_nodepool" "node_pool_multi_zones_a" {<br>  service_name       = ovh_cloud_project_network_private.network.service_name<br>  kube_id            = ovh_cloud_project_kube.my_multizone_cluster.id<br>  name               = "my-pool-zone-a" //Warning: "_" char is not allowed!<br>  flavor_name        = "b3-8"<br>  
desired_nodes      = 3<br>  availability_zones = ["eu-west-par-a"] //Currently, only one zone is supported<br>}<br><br>resource "ovh_cloud_project_kube_nodepool" "node_pool_multi_zones_b" {<br>  service_name       = ovh_cloud_project_network_private.network.service_name<br>  kube_id            = ovh_cloud_project_kube.my_multizone_cluster.id<br>  name               = "my-pool-zone-b"<br>  flavor_name        = "b3-8"<br>  desired_nodes      = 3<br>  availability_zones = ["eu-west-par-b"]<br>}<br><br>resource "ovh_cloud_project_kube_nodepool" "node_pool_multi_zones_c" {<br>  service_name       = ovh_cloud_project_network_private.network.service_name<br>  kube_id            = ovh_cloud_project_kube.my_multizone_cluster.id<br>  name               = "my-pool-zone-c"<br>  flavor_name        = "b3-8"<br>  desired_nodes      = 3<br>  availability_zones = ["eu-west-par-c"]<br>}<br><br>output "kubeconfig_file_eu_west_par" {<br>  value     = ovh_cloud_project_kube.my_multizone_cluster.kubeconfig<br>  sensitive = true<br>}</code></pre>



<p>This HCL configuration will create several OVHcloud services:</p>



<ul class="wp-block-list">
<li>a private network</li>



<li>a subnet</li>



<li>a gateway (S size)</li>



<li>an MKS cluster in the EU-WEST-PAR region</li>



<li>one node pool in <strong>eu-west-par-a</strong> availability zone with 3 nodes</li>



<li>one node pool in <strong>eu-west-par-b</strong> availability zone with 3 nodes</li>



<li>one node pool in <strong>eu-west-par-c</strong> availability zone with 3 nodes</li>
</ul>



<p>Apply the configuration:</p>



<pre class="wp-block-code"><code class="">$ terraform apply

...

ovh_cloud_project_network_private.network: Creating...
ovh_cloud_project_network_private.network: Still creating... [10s elapsed]
ovh_cloud_project_network_private.network: Creation complete after 14s [id=pn-xxxxxxxx_xx]
ovh_cloud_project_network_private_subnet.subnet: Creating...
ovh_cloud_project_network_private_subnet.subnet: Creation complete after 3s [id=c14cbb87-xxxx-xxxx-xxxx-7b9d4940d857]
ovh_cloud_project_gateway.gateway: Creating...
ovh_cloud_project_gateway.gateway: Still creating... [10s elapsed]
ovh_cloud_project_gateway.gateway: Creation complete after 13s [id=7dafdcfe-xxxx-xxxx-xxxx-240df8f93af1]
ovh_cloud_project_kube.my_multizone_cluster: Creating...
ovh_cloud_project_kube.my_multizone_cluster: Still creating... [10s elapsed]
ovh_cloud_project_kube.my_multizone_cluster: Still creating... [20s elapsed]
ovh_cloud_project_kube.my_multizone_cluster: Still creating... [30s elapsed]
...
ovh_cloud_project_kube.my_multizone_cluster: Still creating... [1m40s elapsed]
ovh_cloud_project_kube.my_multizone_cluster: Still creating... [1m50s elapsed]
ovh_cloud_project_kube.my_multizone_cluster: Still creating... [2m0s elapsed]
ovh_cloud_project_kube.my_multizone_cluster: Creation complete after 2m2s [id=0196cd9a-xxxx-xxxx-xxxx-3acbb48d6dda]
ovh_cloud_project_kube_nodepool.node_pool_multi_zones_c: Creating...
ovh_cloud_project_kube_nodepool.node_pool_multi_zones_a: Creating...
ovh_cloud_project_kube_nodepool.node_pool_multi_zones_b: Creating...
ovh_cloud_project_kube_nodepool.node_pool_multi_zones_c: Still creating... [10s elapsed]
ovh_cloud_project_kube_nodepool.node_pool_multi_zones_a: Still creating... [10s elapsed]
ovh_cloud_project_kube_nodepool.node_pool_multi_zones_b: Still creating... [10s elapsed]
ovh_cloud_project_kube_nodepool.node_pool_multi_zones_c: Still creating... [20s elapsed]
ovh_cloud_project_kube_nodepool.node_pool_multi_zones_a: Still creating... [20s elapsed]
ovh_cloud_project_kube_nodepool.node_pool_multi_zones_b: Still creating... [20s elapsed]
ovh_cloud_project_kube_nodepool.node_pool_multi_zones_a: Still creating... [30s elapsed]
ovh_cloud_project_kube_nodepool.node_pool_multi_zones_c: Still creating... [30s elapsed]
ovh_cloud_project_kube_nodepool.node_pool_multi_zones_b: Still creating... [30s elapsed]
ovh_cloud_project_kube_nodepool.node_pool_multi_zones_a: Still creating... [40s elapsed]
ovh_cloud_project_kube_nodepool.node_pool_multi_zones_c: Still creating... [40s elapsed]
ovh_cloud_project_kube_nodepool.node_pool_multi_zones_b: Still creating... [40s elapsed]
...
ovh_cloud_project_kube_nodepool.node_pool_multi_zones_c: Still creating... [4m0s elapsed]
ovh_cloud_project_kube_nodepool.node_pool_multi_zones_b: Still creating... [4m0s elapsed]
ovh_cloud_project_kube_nodepool.node_pool_multi_zones_c: Still creating... [4m10s elapsed]
ovh_cloud_project_kube_nodepool.node_pool_multi_zones_a: Still creating... [4m10s elapsed]
ovh_cloud_project_kube_nodepool.node_pool_multi_zones_b: Still creating... [4m10s elapsed]
ovh_cloud_project_kube_nodepool.node_pool_multi_zones_a: Still creating... [4m20s elapsed]
ovh_cloud_project_kube_nodepool.node_pool_multi_zones_c: Still creating... [4m20s elapsed]
ovh_cloud_project_kube_nodepool.node_pool_multi_zones_b: Still creating... [4m20s elapsed]
ovh_cloud_project_kube_nodepool.node_pool_multi_zones_c: Creation complete after 4m24s [id=0196cd9c-xxxx-xxxx-xxxx-8e1925c4c18e]
ovh_cloud_project_kube_nodepool.node_pool_multi_zones_b: Creation complete after 4m24s [id=0196cd9c-xxxx-xxxx-xxxx-96a18b9202ff]
ovh_cloud_project_kube_nodepool.node_pool_multi_zones_a: Still creating... [4m30s elapsed]
ovh_cloud_project_kube_nodepool.node_pool_multi_zones_a: Creation complete after 4m35s [id=0196cd9c-xxxx-xxxx-xxxx-8a08cdc2e68d]

Apply complete! Resources: 7 added, 0 changed, 0 destroyed.

Outputs:

kubeconfig_file_eu_west_par = &lt;sensitive&gt;</code></pre>



<p>Our MKS cluster has been deployed across the 3 AZs 🎉</p>



<p>To connect to it, retrieve the kubeconfig file locally:</p>



<pre class="wp-block-code"><code class="">$ terraform output -raw kubeconfig_file_eu_west_par &gt; ~/.kube/multi-zone-mks.yml</code></pre>



<h3 class="wp-block-heading">Connect and discover your MKS cluster</h3>



<p>Initialize the KUBECONFIG environment variable, or append the new kubeconfig file to it:</p>



<pre class="wp-block-code"><code class="">export KUBECONFIG=/Users/my-user/.kube/mks.yml:/Users/my-user/.kube/multi-zone-mks.yml</code></pre>



<p>Display the node pools. Our cluster has 3 node pools, one per AZ:</p>



<pre class="wp-block-code"><code class="">$ kubectl get np
NAME             FLAVOR   AUTOSCALED   MONTHLYBILLED   ANTIAFFINITY   DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   MIN   MAX   AGE
my-pool-zone-a   b3-8     false        false           false          3         3         3            3           0     100   7h8m
my-pool-zone-b   b3-8     false        false           false          3         3         3            3           0     100   7h8m
my-pool-zone-c   b3-8     false        false           false          3         3         3            3           0     100   7h8m</code></pre>



<p>You can also display the control plane&#8217;s pods in order to discover the new components of the MKS Premium plan:</p>



<pre class="wp-block-code"><code class="">$ kubectl get po -n kube-system</code></pre>



<h2 class="wp-block-heading">How To</h2>



<h3 class="wp-block-heading">Deploy pods across several availability zones</h3>



<p>Now, let&#8217;s create a Deployment with 6 pods and ask Kubernetes to deploy them across our 3 AZs (in the three node pools).</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="713" src="https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-12-1024x713.png" alt="" class="wp-image-28897" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-12-1024x713.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-12-300x209.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-12-768x535.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-12-1536x1070.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-12.png 1588w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>To do that, create a <strong>nginx-cross-az.yaml</strong> file with the following content:</p>



<pre class="wp-block-code"><code class="">apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-cross-az
  labels:
    app: nginx-cross-az
spec:
  replicas: 6
  selector:
    matchLabels:
      app: nginx-cross-az
  template:
    metadata:
      labels:
        app: nginx-cross-az
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: "topology.kubernetes.io/zone"
                operator: In
                values:
                - eu-west-par-a
                - eu-west-par-b
                - eu-west-par-c
      containers:
      - name: nginx
        image: nginx:1.28.0
        ports:
        - containerPort: 80</code></pre>



<p>Thanks to the nodeAffinity feature of Kubernetes, we declare that the 6 replicas (pods) may only be scheduled in 3 zones: <code>eu-west-par-a</code>, <code>eu-west-par-b</code>, <code>eu-west-par-c</code>.</p>
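

<p>Note that nodeAffinity only restricts the zones in which the pods may be scheduled; it does not guarantee an even spread. If you want the scheduler to balance the replicas across the zones (2 pods per AZ here), a <code>topologySpreadConstraints</code> stanza can be used instead of the affinity block. A sketch reusing the same labels:</p>



<pre class="wp-block-code"><code class=""># Sketch: replace the affinity block in the pod template with a spread constraint
    spec:
      topologySpreadConstraints:
      - maxSkew: 1                                # at most 1 pod of difference between zones
        topologyKey: topology.kubernetes.io/zone  # spread across availability zones
        whenUnsatisfiable: DoNotSchedule          # hard requirement for the scheduler
        labelSelector:
          matchLabels:
            app: nginx-cross-az</code></pre>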



<p>Create a new namespace and apply the deployment:</p>



<pre class="wp-block-code"><code class="">$ kubectl create ns hello-app
$ kubectl apply -f nginx-cross-az.yaml -n hello-app</code></pre>



<p>As you can see, 6 pods have been created, and they are running on nodes located in the 3 AZs.</p>



<pre class="wp-block-code"><code class="">$ kubectl get po -o wide -l app=nginx-cross-az -n hello-app
NAME                             READY   STATUS    RESTARTS   AGE    IP             NODE                         NOMINATED NODE   READINESS GATES
nginx-cross-az-6ffd957c4-7528p   1/1     Running   0          6s     10.240.2.140   my-pool-zone-b-tr6wf-5wfgz   &lt;none&gt;           &lt;none&gt;
nginx-cross-az-6ffd957c4-96mnh   1/1     Running   0          6s     10.240.3.91    my-pool-zone-c-wgrl6-b2f9s   &lt;none&gt;           &lt;none&gt;
nginx-cross-az-6ffd957c4-b48cv   1/1     Running   0          115m   10.240.6.182   my-pool-zone-c-wgrl6-lp22l   &lt;none&gt;           &lt;none&gt;
nginx-cross-az-6ffd957c4-k7rwf   1/1     Running   0          115m   10.240.1.237   my-pool-zone-b-tr6wf-ct7fs   &lt;none&gt;           &lt;none&gt;
nginx-cross-az-6ffd957c4-pb7zp   1/1     Running   0          115m   10.240.8.195   my-pool-zone-a-b9ztj-gt5vd   &lt;none&gt;           &lt;none&gt;
nginx-cross-az-6ffd957c4-vhhcw   1/1     Running   0          6s     10.240.7.40    my-pool-zone-a-b9ztj-brgpq   &lt;none&gt;           &lt;none&gt;</code></pre>



<h3 class="wp-block-heading">Deploy pods only in a desired availability zone</h3>



<p>You can also choose to deploy a Deployment with 3 replicas in a single AZ of your choice, for example only in <strong>eu-west-par-a</strong>.</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="713" src="https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-13-1024x713.png" alt="" class="wp-image-28898" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-13-1024x713.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-13-300x209.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-13-768x535.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-13-1536x1070.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/05/image-13.png 1588w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>Create a <strong>nginx-one-az.yaml</strong> file with the following content:</p>



<pre class="wp-block-code"><code class="">apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-one-az
  labels:
    app: nginx-one-az
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx-one-az
  template:
    metadata:
      labels:
        app: nginx-one-az
    spec:
      nodeSelector:
        topology.kubernetes.io/zone: eu-west-par-a
      containers:
      - name: nginx
        image: nginx:1.28.0
        ports:
        - containerPort: 80</code></pre>



<p>Deploy the manifest file in your cluster:</p>



<pre class="wp-block-code"><code class="">$ kubectl apply -f nginx-one-az.yaml -n hello-app
deployment.apps/nginx-one-az created</code></pre>



<p>As you can see, our three pods are running in the PAR region, only on the <code><strong>zone-a</strong></code> nodes:</p>



<pre class="wp-block-code"><code class="">$ kubectl get po -o wide -l app=nginx-one-az -n hello-app
NAME                            READY   STATUS    RESTARTS   AGE    IP             NODE                         NOMINATED NODE   READINESS GATES
nginx-one-az-6b5f9bdccc-8vv9l   1/1     Running   0          98s    10.240.7.13    my-pool-zone-a-b9ztj-brgpq   &lt;none&gt;           &lt;none&gt;
nginx-one-az-6b5f9bdccc-ck99s   1/1     Running   0          100s   10.240.5.216   my-pool-zone-a-b9ztj-mss8j   &lt;none&gt;           &lt;none&gt;
nginx-one-az-6b5f9bdccc-tlg4d   1/1     Running   0          96s    10.240.8.221   my-pool-zone-a-b9ztj-gt5vd   &lt;none&gt;           &lt;none&gt;</code></pre>



<h2 class="wp-block-heading">Want to go further?</h2>



<p>Want to learn more on this topic? In the coming days, we will publish a blog post about the MKS Premium plan.</p>



<p>Visit the <a href="https://labs.ovhcloud.com/en/managed-kubernetes-service-mks-premium-plan/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Managed Kubernetes Service (MKS) Premium plan</a> page on the OVHcloud Labs website to learn more about MKS Premium.</p>



<p>Join the <strong>free</strong> Beta: <a href="https://labs.ovhcloud.com/en/managed-kubernetes-service-mks-premium-plan/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">https://labs.ovhcloud.com/en/managed-kubernetes-service-mks-premium-plan/</a></p>



<p>Read the documentation about the new <a href="https://help.ovhcloud.com/csm/fr-public-cloud-kubernetes-premium?id=kb_article_view&amp;sysparm_article=KB0067581" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Managed Kubernetes Service (MKS) Premium plan</a>.</p>



<p>Join us on <a href="https://discord.com/channels/850031577277792286/1366761790150541402" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Discord</a> and give us your feedback.</p>



<p></p>
<img loading="lazy" decoding="async" src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fdeploy-your-workloads-on-3-availability-zones-with-our-new-managed-kubernetes-services-mks-premium-plan%2F&amp;action_name=Deploy%20your%20workloads%20on%203%20availability%20zones%20with%20our%20new%20Managed%20Kubernetes%20Services%20%28MKS%29%20%26%238216%3BPremium%26%238217%3B%20plan&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Solutions at OVHcloud to overcome the Docker Hub pull rate limits</title>
		<link>https://blog.ovhcloud.com/solutions-at-ovhcloud-to-overcome-the-docker-hub-pull-rate-limits/</link>
		
		<dc:creator><![CDATA[Aurélie Vache]]></dc:creator>
		<pubDate>Fri, 11 Apr 2025 06:53:38 +0000</pubDate>
				<category><![CDATA[OVHcloud Engineering]]></category>
		<category><![CDATA[Tranches de Tech & co]]></category>
		<category><![CDATA[Docker Hub]]></category>
		<category><![CDATA[Kubernetes]]></category>
		<category><![CDATA[OVHcloud]]></category>
		<category><![CDATA[OVHcloud Managed Kubernetes]]></category>
		<category><![CDATA[OVHcloud Managed Private Registry]]></category>
		<category><![CDATA[Public Cloud]]></category>
		<category><![CDATA[registry]]></category>
		<guid isPermaLink="false">https://blog.ovhcloud.com/?p=28623</guid>

					<description><![CDATA[For the past few months, Docker has been announcing the implementation of new pull rate limits for the Docker Hub. The most significant change is the 10 pulls-per-hour limit, per IP address, for unauthenticated users that can quickly lead to a &#8220;You have reached your pull rate limit&#8221; error message. Even if these changes have [&#8230;]<img src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fsolutions-at-ovhcloud-to-overcome-the-docker-hub-pull-rate-limits%2F&amp;action_name=Solutions%20at%20OVHcloud%20to%20overcome%20the%20Docker%20Hub%20pull%20rate%20limits&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="960" height="540" src="https://blog.ovhcloud.com/wp-content/uploads/2025/04/ovh_solutions_overcome_docker_hub_pull_rate_limits-1.png" alt="" class="wp-image-28707" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/04/ovh_solutions_overcome_docker_hub_pull_rate_limits-1.png 960w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/ovh_solutions_overcome_docker_hub_pull_rate_limits-1-300x169.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/ovh_solutions_overcome_docker_hub_pull_rate_limits-1-768x432.png 768w" sizes="auto, (max-width: 960px) 100vw, 960px" /></figure>



<p>For the past few months, <a href="https://www.docker.com/blog/revisiting-docker-hub-policies-prioritizing-developer-experience/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Docker has been announcing the implementation of new pull rate limits for the Docker Hub</a>. The most significant change is the 10 pulls-per-hour limit, per IP address, for unauthenticated users that can quickly lead to a &#8220;You have reached your pull rate limit&#8221; error message.</p>



<p>Even if these changes have been implemented and rolled back as of April 1, 2025, at OVHcloud we are aware that such changes could impact your deployments and daily work.</p>



<p>In this blog post, you will find several solutions and best practices that can help you reduce Docker pull commands and avoid hitting Docker Hub&#8217;s pull rate limit.</p>



<h3 class="wp-block-heading">Use OVHcloud Managed Private Registry and activate the proxy cache</h3>



<figure class="wp-block-image aligncenter size-full is-resized"><img loading="lazy" decoding="async" width="800" height="800" src="https://blog.ovhcloud.com/wp-content/uploads/2025/04/managed_private_registry.png" alt="" class="wp-image-28658" style="width:181px;height:auto" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/04/managed_private_registry.png 800w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/managed_private_registry-300x300.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/managed_private_registry-150x150.png 150w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/managed_private_registry-768x768.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/managed_private_registry-70x70.png 70w" sizes="auto, (max-width: 800px) 100vw, 800px" /></figure>



<p><a href="https://www.ovhcloud.com/en/public-cloud/managed-rancher-service/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">OVHcloud Managed Private Registry</a> (MPR) is a container image registry based on the CNCF project Harbor. It allows you to store and manage Docker (or OCI-compliant) container images and artifacts in a private, secure, and scalable environment hosted on OVHcloud&#8217;s infrastructure.</p>



<p>MPR provides a <strong>proxy cache</strong> feature that helps you mirror and cache images from external registries, like <strong>Docker Hub</strong>, <strong>GitHub Container Registry</strong>, <strong>Quay</strong>, <strong>JFrog Artifactory Registry</strong>, etc. External registries can be private or public. This improves performance and helps you stay under the rate limits imposed by external registries 💪.</p>



<h4 class="wp-block-heading">Configure proxy cache in OVHcloud Managed Private Registry</h4>



<p>If you haven&#8217;t deployed an MPR yet, you can deploy it through the <a href="https://help.ovhcloud.com/csm/en-gb-public-cloud-private-registry-creation?id=kb_article_view&amp;sysparm_article=KB0050325" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">OVHcloud Control Panel</a>, the <a href="https://help.ovhcloud.com/csm/en-public-cloud-private-registry-creation-via-terraform?id=kb_article_view&amp;sysparm_article=KB0050330" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">OVHcloud Terraform provider</a>, the <a href="https://help.ovhcloud.com/csm/en-public-cloud-private-registry-creation-with-pulumi?id=kb_article_view&amp;sysparm_article=KB0061073" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">OVHcloud Pulumi provider</a> or even the API. Follow the guide according to your needs.</p>



<p>First, log in to the <a href="https://help.ovhcloud.com/csm/en-gb-public-cloud-private-registry-connect-to-ui?id=kb_article_view&amp;sysparm_article=KB0050321" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Harbor user interface</a> of your private registry; follow the guide if needed.</p>



<p>⚠️ In order to activate the proxy cache, you need to log in to the Harbor UI with an administrator account.</p>



<h5 class="wp-block-heading">Registry endpoint creation</h5>



<p>In the left sidebar, click on <strong>Registries</strong> (inside the Administration section).</p>



<p>Then click on the <strong>New endpoint</strong> button.</p>



<p>Select Docker Hub in the provider list, enter a name (&#8220;Docker Hub&#8221; for example), fill in your Docker Hub login in the Access ID field and your Docker Hub password in the Access Secret field.</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="674" src="https://blog.ovhcloud.com/wp-content/uploads/2025/04/Capture-decran-2025-04-10-a-11.16.21-1024x674.png" alt="" class="wp-image-28663" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/04/Capture-decran-2025-04-10-a-11.16.21-1024x674.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/Capture-decran-2025-04-10-a-11.16.21-300x197.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/Capture-decran-2025-04-10-a-11.16.21-768x505.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/Capture-decran-2025-04-10-a-11.16.21-1536x1010.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/Capture-decran-2025-04-10-a-11.16.21.png 1818w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>⚠️ Note that we <strong>strongly recommend</strong> using a <strong>Docker account</strong> (even a free one) to <strong>avoid rate limits</strong> for unauthenticated users when pulling images. Without authentication, Docker Hub enforces strict pull limits, which may cause failures when pulling frequently used images.</p>
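

<p>The same recommendation applies to workloads running directly in a Kubernetes cluster: store your registry credentials in a secret and reference it through <code>imagePullSecrets</code>, so that pulls count against your authenticated quota. A minimal sketch (the secret name <code>dockerhub-creds</code> is illustrative and is assumed to have been created beforehand with <code>kubectl create secret docker-registry</code>):</p>



<pre class="wp-block-code"><code class=""># Sketch: a pod pulling from the Docker Hub with authenticated credentials
apiVersion: v1
kind: Pod
metadata:
  name: hello-authenticated
spec:
  imagePullSecrets:
  - name: dockerhub-creds   # secret created with 'kubectl create secret docker-registry'
  containers:
  - name: nginx
    image: nginx:1.28.0</code></pre>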



<p>Click on the <strong>Test connection</strong> button to test if your login and password are correct.</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="620" src="https://blog.ovhcloud.com/wp-content/uploads/2025/04/Capture-decran-2025-04-10-a-11.16.39-1024x620.png" alt="" class="wp-image-28664" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/04/Capture-decran-2025-04-10-a-11.16.39-1024x620.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/Capture-decran-2025-04-10-a-11.16.39-300x182.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/Capture-decran-2025-04-10-a-11.16.39-768x465.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/Capture-decran-2025-04-10-a-11.16.39.png 1228w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>Now click on the <strong>OK</strong> button in order to create the new endpoint.</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="330" src="https://blog.ovhcloud.com/wp-content/uploads/2025/04/Capture-decran-2025-04-10-a-11.16.56-1024x330.png" alt="" class="wp-image-28665" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/04/Capture-decran-2025-04-10-a-11.16.56-1024x330.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/Capture-decran-2025-04-10-a-11.16.56-300x97.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/Capture-decran-2025-04-10-a-11.16.56-768x247.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/Capture-decran-2025-04-10-a-11.16.56-1536x494.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/Capture-decran-2025-04-10-a-11.16.56-2048x659.png 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>The Docker Hub endpoint is created 🎉</p>



<h5 class="wp-block-heading">Proxy cache project creation</h5>



<p>In the left sidebar, click on <strong>Projects</strong>, then click on the <strong>New project</strong> button.</p>



<p>Enter a project name (&#8220;docker-hub&#8221; for example), enable the Proxy Cache, click on the Docker Hub endpoint in the list and click on the <strong>OK</strong> button.</p>



<p>ℹ️ Note that a project is private by default, so you have to click on the Public checkbox if you want to change the visibility of a project.</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="735" src="https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-33-1024x735.png" alt="" class="wp-image-28669" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-33-1024x735.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-33-300x215.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-33-768x551.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-33.png 1182w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>⚠️ The name of a proxy cache project should not contain dots, as they can cause issues with external tools like Kaniko.</p>



<p>Your proxy cache project has been created 🎉</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="373" src="https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-34-1024x373.png" alt="" class="wp-image-28670" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-34-1024x373.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-34-300x109.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-34-768x280.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-34-1536x560.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-34-2048x746.png 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>⚠️ A proxy cache project works similarly to a normal Harbor project, except that you are not able to push images to a proxy cache project.</p>



<p>Now, when you want to pull a Docker image hosted on the Docker Hub you proxy cached, instead of pulling directly from the Docker Hub, configure your docker/podman pull commands and Kubernetes pod manifests to pull images from the OVHcloud Managed Private Registry:</p>



<pre class="wp-block-code"><code class="">$ docker pull xxxxxxxx.c1.de1.container-registry.ovh.net/docker-hub/ovhcom/ovh-platform-hello:latest
latest: Pulling from docker-hub/ovhcom/ovh-platform-hello
1f3e46996e29: Pull complete 
6aa905c35cc0: Pull complete 
Digest: sha256:fddb76f0eb92d95b3721bfa0ea87350c5d39ea262e90cd30d66f429bb40c8b07
Status: Downloaded newer image for xxxxxxxx.c1.de1.container-registry.ovh.net/docker-hub/ovhcom/ovh-platform-hello:latest
xxxxxxxx.c1.de1.container-registry.ovh.net/docker-hub/ovhcom/ovh-platform-hello:latest</code></pre>
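

<p>The same path convention applies inside a Kubernetes pod manifest: prefix the image name with your registry hostname and the proxy cache project name. A minimal sketch (registry hostname masked like in the pull command above):</p>



<pre class="wp-block-code"><code class=""># Sketch: a pod pulling through the proxy cache project instead of the Docker Hub
apiVersion: v1
kind: Pod
metadata:
  name: hello-proxy-cached
spec:
  containers:
  - name: hello
    image: xxxxxxxx.c1.de1.container-registry.ovh.net/docker-hub/ovhcom/ovh-platform-hello:latest</code></pre>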



<h3 class="wp-block-heading">Disable the AlwaysPullImages admission plugin on your MKS cluster</h3>



<figure class="wp-block-image aligncenter size-full is-resized"><img loading="lazy" decoding="async" width="200" height="200" src="https://blog.ovhcloud.com/wp-content/uploads/2025/04/Managed-Kubernetes-Service.png" alt="" class="wp-image-28702" style="width:186px;height:auto" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/04/Managed-Kubernetes-Service.png 200w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/Managed-Kubernetes-Service-150x150.png 150w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/Managed-Kubernetes-Service-70x70.png 70w" sizes="auto, (max-width: 200px) 100vw, 200px" /></figure>



<p>By default, the <strong>AlwaysPullImages</strong> Kubernetes admission plugin is enabled in your OVHcloud Managed Kubernetes (MKS) cluster.</p>



<p>⚠️ When it is enabled, this plugin forces the imagePullPolicy of every container to <strong>Always</strong>, no matter how it is specified when creating the resource.</p>



<p>This is useful in a multitenant cluster so that users can be assured that their private images can only be used by those who have the credentials to pull them. Without this admission controller, once an image has been pulled to a node, any pod from any user can use it by knowing the image&#8217;s name (assuming the Pod is scheduled onto the right node), without any authorization check against the image.</p>



<p>However, it can cause a lot of pulls from the Docker Hub, so you can quickly reach the rate limits.</p>



<p>A solution can therefore be to deactivate the AlwaysPullImages admission plugin in your MKS cluster.</p>



<p>In this blog post, we will deactivate it in the OVHcloud Control Panel.</p>



<h5 class="wp-block-heading">Enable/Disable MKS admission plugins</h5>



<p>Log in to the OVHcloud Control Panel. In the left sidebar, click on <strong>Managed Kubernetes Service</strong> and then click on the desired MKS cluster.</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="777" src="https://blog.ovhcloud.com/wp-content/uploads/2025/04/Capture-decran-2025-04-10-a-15.35.01-1024x777.png" alt="" class="wp-image-28687" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/04/Capture-decran-2025-04-10-a-15.35.01-1024x777.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/Capture-decran-2025-04-10-a-15.35.01-300x227.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/Capture-decran-2025-04-10-a-15.35.01-768x582.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/Capture-decran-2025-04-10-a-15.35.01-1536x1165.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/Capture-decran-2025-04-10-a-15.35.01.png 2044w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>In the <strong>Cluster information</strong> section, scroll down and click on <strong>Enable/disable plugin</strong>. A popup will appear.</p>



<p>Then click on <strong>Disable</strong> for the Always Pull Images plugin and click on the <strong>Save</strong> button.</p>



<figure class="wp-block-image size-large is-resized"><img loading="lazy" decoding="async" width="896" height="1024" src="https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-36-896x1024.png" alt="" class="wp-image-28691" style="width:387px;height:auto" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-36-896x1024.png 896w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-36-262x300.png 262w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-36-768x878.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-36.png 936w" sizes="auto, (max-width: 896px) 100vw, 896px" /></figure>



<p>⚠️ Any change to the admission plugins requires a redeployment of the MKS cluster API server (without data loss), so the API server may be temporarily unavailable during the redeployment.</p>



<figure class="wp-block-image size-large is-resized"><img loading="lazy" decoding="async" width="541" height="1024" src="https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-37-541x1024.png" alt="" class="wp-image-28695" style="width:228px;height:auto" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-37-541x1024.png 541w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-37-159x300.png 159w, https://blog.ovhcloud.com/wp-content/uploads/2025/04/image-37.png 572w" sizes="auto, (max-width: 541px) 100vw, 541px" /></figure>
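

<p>Once the plugin is disabled, the <code>imagePullPolicy</code> declared in your manifests is honoured again. Setting it to <code>IfNotPresent</code>, as in this sketch, lets a node reuse an image already present locally instead of contacting the Docker Hub at every pod start:</p>



<pre class="wp-block-code"><code class=""># Sketch: this policy is only honoured once the AlwaysPullImages plugin is disabled
apiVersion: v1
kind: Pod
metadata:
  name: nginx-cached
spec:
  containers:
  - name: nginx
    image: nginx:1.28.0
    imagePullPolicy: IfNotPresent   # reuse the image cached on the node if available</code></pre>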



<h3 class="wp-block-heading">Conclusion</h3>



<p>To learn more about how to use and configure <a href="https://help.ovhcloud.com/csm/fr-documentation-public-cloud-containers-orchestration-managed-private-registry?id=kb_browse_cat&amp;kb_id=574a8325551974502d4c6e78b7421938&amp;kb_category=7939e6a464282d10476b3689cb0d0ed7&amp;spa=1" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">OVHcloud private registries</a> and <a href="https://help.ovhcloud.com/csm/world-documentation-public-cloud-containers-orchestration-managed-kubernetes-k8s?id=kb_browse_cat&amp;kb_id=574a8325551974502d4c6e78b7421938&amp;kb_category=f334d555f49801102d4ca4d466a7fdd2&amp;spa=1" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">OVHcloud MKS clusters</a>, don&#8217;t hesitate to follow our guides.</p>
<img loading="lazy" decoding="async" src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fsolutions-at-ovhcloud-to-overcome-the-docker-hub-pull-rate-limits%2F&amp;action_name=Solutions%20at%20OVHcloud%20to%20overcome%20the%20Docker%20Hub%20pull%20rate%20limits&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Enhancing Kubernetes Security: Detecting Threats in OVHcloud Managed Kubernetes cluster (MKS) Audit Logs with Falco</title>
		<link>https://blog.ovhcloud.com/enhancing-kubernetes-security-detecting-threats-in-ovhcloud-managed-kubernetes-cluster-mks-audit-logs-with-falco/</link>
		
		<dc:creator><![CDATA[Aurélie Vache]]></dc:creator>
		<pubDate>Tue, 11 Feb 2025 08:58:40 +0000</pubDate>
				<category><![CDATA[OVHcloud Engineering]]></category>
		<category><![CDATA[Tranches de Tech & co]]></category>
		<category><![CDATA[Kubernetes]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[OVHcloud]]></category>
		<category><![CDATA[Security]]></category>
		<guid isPermaLink="false">https://blog.ovhcloud.com/?p=27886</guid>

					<description><![CDATA[Several month ago we discovered Falco, a Cloud Native near real-time threats detection tool, and we saw how to install it on an OVHcloud MKS cluster. Today we will connect our Falco instance to a MKS cluster in order to retrieve Kubernetes Audit Logs events and watch if everything is OK in our cluster. Concretely, [&#8230;]<img src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fenhancing-kubernetes-security-detecting-threats-in-ovhcloud-managed-kubernetes-cluster-mks-audit-logs-with-falco%2F&amp;action_name=Enhancing%20Kubernetes%20Security%3A%20Detecting%20Threats%20in%20OVHcloud%20Managed%20Kubernetes%20cluster%20%28MKS%29%20Audit%20Logs%20with%20Falco&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="484" src="https://blog.ovhcloud.com/wp-content/uploads/2025/02/falco-blogpost-plugin-mks-1-1024x484.jpg" alt="" class="wp-image-28194" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/02/falco-blogpost-plugin-mks-1-1024x484.jpg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/02/falco-blogpost-plugin-mks-1-300x142.jpg 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/02/falco-blogpost-plugin-mks-1-768x363.jpg 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/02/falco-blogpost-plugin-mks-1-1536x725.jpg 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/02/falco-blogpost-plugin-mks-1.jpg 1749w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>Several months ago we discovered <a href="https://falco.org/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Falco</a>, a Cloud Native near real-time threat detection tool, and we saw <a href="https://blog.ovhcloud.com/near-real-time-threats-detection-with-falco-on-ovhcloud-managed-kubernetes/" data-wpel-link="internal">how to install it on an OVHcloud MKS cluster</a>.</p>



<p>Today we will connect our Falco instance to an MKS cluster in order to retrieve <strong>Kubernetes Audit Logs</strong> events and check that everything is OK in our cluster.</p>



<p>Concretely, in this blog post we will:</p>



<ul class="wp-block-list">
<li>deploy an OVHcloud LDP (Logs Data Platform)</li>



<li>create a data stream into this LDP</li>



<li>connect an OVHcloud MKS cluster to the data stream (to send Audit Logs into it)</li>



<li>use the <strong>k8saudit-ovh</strong> Falco plugin to retrieve the Audit Logs of an MKS cluster in real time</li>



<li>test a rule and detect security events based on MKS audit logs activity</li>
</ul>



<h2 class="wp-block-heading">Prerequisites</h2>



<p>This blog post presupposes that you already have a working&nbsp;<a href="https://www.ovhcloud.com/fr/public-cloud/kubernetes/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">OVHcloud Managed Kubernetes</a>&nbsp;(MKS) cluster, and a running instance of Falco.</p>



<p>If it is not the case, follow the <a href="https://blog.ovhcloud.com/near-real-time-threats-detection-with-falco-on-ovhcloud-managed-kubernetes/" data-wpel-link="internal">Near real-time threats detection with Falco on OVHcloud Managed Kubernetes</a> blog post.</p>



<h2 class="wp-block-heading">Deploying a Logs Data Platform (LDP)</h2>



<p>LDP is OVHcloud&#8217;s managed platform for collecting, processing, analyzing and storing the logs of your OVHcloud products. To access our Kubernetes cluster&#8217;s Audit Logs, we need to deploy an LDP.</p>



<p>Find more information on our dedicated&nbsp;<a href="https://www.ovhcloud.com/en/identity-security-operations/logs-data-platform/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">LDP page</a>.</p>



<p>We can deploy an LDP through the OVHcloud Control Panel or the API. In this blog post, we will deploy it through the Control Panel.</p>



<p>First, you have to log in to the&nbsp;<a href="https://www.ovh.com/manager/#/dedicated/dbaas/logs/order" target="_blank" rel="noreferrer noopener" data-wpel-link="exclude">OVHcloud Control Panel</a>, click on the <strong>Bare Metal Cloud</strong> section located at the top in the header and then click on the <strong>Logs Data Platform</strong> in the sidebar.</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="529" src="https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-1-1024x529.png" alt="" class="wp-image-27901" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-1-1024x529.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-1-300x155.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-1-768x396.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-1-1536x793.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-1-2048x1057.png 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>Choose the LDP plan you want: <em>Standard</em> (free) or <em>Enterprise</em>, depending on your needs.</p>



<p>Select a <strong>region</strong> (<em>North America</em> or <em>Europe</em>). We will choose &#8220;<strong>GRA</strong>&#8221; for this blog post. Click on the <strong>Order</strong> button and follow the instructions.</p>



<p>After several minutes your LDP will be created. </p>



<p>Refresh the page, click on the newly deployed LDP, then enter a password and click on the <strong>Save</strong> button.</p>



<h2 class="wp-block-heading">Creating a Data stream and retrieving the Websocket URL</h2>



<p>Our Kubernetes Audit Logs will be stored in a data stream, so click on the <strong>Data stream</strong> tab and then on the <strong>Add data stream</strong> button.</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="466" src="https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-3-1024x466.png" alt="" class="wp-image-27905" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-3-1024x466.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-3-300x137.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-3-768x350.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-3-1536x700.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-3-2048x933.png 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>Choose a name for the data stream. I like to use the name of my MKS cluster followed by &#8220;-audit-logs&#8221;, to easily know what this data stream is for. My MKS cluster&#8217;s name is &#8220;my-rancher-mks-cluster&#8221;, so let&#8217;s name it &#8220;my-rancher-mks-cluster-audit-logs&#8221;. Fill in the description (mandatory).</p>



<p>The OVHcloud Audit Logs Falco plugin you will use receives the audit logs through a Websocket, so you need to enable <strong>Websocket broadcasting</strong>, then click on the <strong>Save</strong> button.</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="730" src="https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-5-1024x730.png" alt="" class="wp-image-27909" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-5-1024x730.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-5-300x214.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-5-768x548.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-5-1536x1095.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-5-2048x1460.png 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>Now, to retrieve the Websocket URL of your data stream, click on the <strong>Data stream</strong> tab, then click on the <strong>&#8230;</strong> button (located at the right of your data stream&#8217;s line), and click on the <strong>Monitor in real time</strong> action.</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="674" src="https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-6-1024x674.png" alt="" class="wp-image-27913" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-6-1024x674.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-6-300x197.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-6-768x505.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-6-1536x1011.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-6-2048x1347.png 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>Finally, click on the <strong>Action</strong> button, then on <strong>Copy Websocket address</strong>, and save the LDP Websocket URL somewhere ;-).</p>



<p>Note that the Websocket address has this kind of format: <code>wss://&lt;region&gt;.logs.ovh.com/tail/?tk=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx</code></p>
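

<p>Before configuring Falco, you can sanity-check the copied address and derive the value the plugin will need: the <strong>k8saudit-ovh</strong> plugin takes this address without the <code>wss://</code> scheme in its <code>open_params</code> setting. A minimal shell sketch (region and token below are placeholders):</p>



<pre class="wp-block-code"><code class=""># Websocket address copied from the LDP Control Panel (region and token are placeholders)
LDP_WS_URL="wss://gra1.logs.ovh.com/tail/?tk=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx"

# Quick sanity check of the expected format
if echo "$LDP_WS_URL" | grep -Eq '^wss://.+\.logs\.ovh\.com/tail/\?tk='; then
  echo "format ok"
fi

# The k8saudit-ovh plugin's open_params expects this address without the wss:// scheme
OPEN_PARAMS="${LDP_WS_URL#wss://}"
echo "$OPEN_PARAMS"</code></pre>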



<h2 class="wp-block-heading">Connect a MKS cluster to a LDP data stream</h2>



<p>Now we need to send the Kubernetes Audit Logs of our MKS cluster into the data stream.</p>



<p>For that, in the OVHcloud Control Panel, click on the <strong>Public Cloud</strong> section in the header and then on <strong>Managed Kubernetes Service</strong> in the sidebar.</p>



<p>Click on your Kubernetes cluster (my-rancher-mks-cluster for example), then on the <strong>Logs</strong> tab, and click on the <strong>Subscribe</strong> button.</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="500" src="https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-7-1024x500.png" alt="" class="wp-image-27917" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-7-1024x500.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-7-300x146.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-7-768x375.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-7-1536x750.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-7.png 2040w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>Click on the <strong>Add data stream</strong> button to visualize the Audit Logs of your cluster in real time. Then select the LDP instance and click on the <strong>Subscribe</strong> button for the data stream you created:</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="544" src="https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-8-1024x544.png" alt="" class="wp-image-27918" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-8-1024x544.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-8-300x159.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-8-768x408.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-8-1536x815.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/image-8.png 2046w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<h2 class="wp-block-heading">Retrieve the MKS Audit Logs with Falco</h2>



<p>Falco can receive <strong>Events</strong>, compare them to a set of <strong>Rules</strong> to determine the actions to perform and generate <strong>Alerts</strong> to different endpoints. </p>



<p>Thanks to the <strong>k8saudit-ovh</strong> plugin, Falco can receive a new kind of <strong>Events</strong>: the Audit Logs of your MKS cluster. These events also come with a set of <a href="https://github.com/falcosecurity/plugins/blob/main/plugins/k8saudit/rules/k8s_audit_rules.yaml" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">rules to follow</a>.</p>



<p>Concretely, when a user executes <strong>kubectl</strong> commands in an OVHcloud MKS cluster, Audit Logs are generated. Falco listens to them and, depending on the configured rules, generates alerts.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="961" height="327" src="https://blog.ovhcloud.com/wp-content/uploads/2025/02/image.png" alt="" class="wp-image-28190" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/02/image.png 961w, https://blog.ovhcloud.com/wp-content/uploads/2025/02/image-300x102.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/02/image-768x261.png 768w" sizes="auto, (max-width: 961px) 100vw, 961px" /></figure>



<p>Let&#8217;s install or update a Falco configuration running in an MKS cluster and use this plugin.</p>



<p>Create a <strong>values.yaml</strong> file with the following content:</p>



<pre class="wp-block-code"><code class="">tty: true<br>kubernetes: false<br><br># Just a Deployment with 1 replica (instead of a Daemonset) to have only one Pod that pulls the MKS Audit Logs from a OVHcloud LDP<br>controller:<br>  kind: deployment<br>  deployment:<br>    replicas: 1<br><br>falco:<br>  rule_matching: all<br>  rules_files:<br>    - /etc/falco/k8s_audit_rules.yaml<br>    - /etc/falco/rules.d<br>  plugins:<br>    - name: k8saudit-ovh<br>      library_path: libk8saudit-ovh.so<br>      open_params: "&lt;region&gt;.logs.ovh.com/tail/?tk=&lt;ID&gt;" # Replace with your LDP Websocket URL<br>    - name: json<br>      library_path: libjson.so<br>      init_config: ""<br>  # Plugins that Falco will load. Note: the same plugins are installed by the falcoctl-artifact-install init container.<br>  load_plugins: [k8saudit-ovh, json]<br><br>driver:<br>  enabled: false<br>collectors:<br>  enabled: false<br><br># use falcoctl to install automatically the plugin and the rules<br>falcoctl:<br>  artifact:<br>    install:<br>      enabled: true<br>    follow:<br>      enabled: true<br>  config:<br>    indexes:<br>    - name: falcosecurity<br>      url: https://falcosecurity.github.io/falcoctl/index.yaml<br>    artifact:<br>      allowedTypes:<br>        - plugin<br>        - rulesfile<br>      install:<br>        resolveDeps: false<br>        refs: [k8saudit-rules:0, k8saudit-ovh:0.1, json:0]<br>      follow:<br>        refs: [k8saudit-rules:0]</code></pre>



<p>This <strong>values.yaml </strong>file will install Falco with the <strong>k8saudit-ovh</strong> and the <strong>json</strong> plugins. </p>



<p>Install the latest version of Falco with&nbsp;<code>helm install</code>&nbsp;command:</p>



<pre class="wp-block-code"><code class="">$ helm install falco --create-namespace --namespace falco --values=values.yaml falcosecurity/falco</code></pre>



<p>This command will install the latest version of Falco, with the k8saudit-ovh and json plugins, and create a new&nbsp;<code>falco</code>&nbsp;namespace:</p>



<pre class="wp-block-code"><code class="">$ helm install falco --create-namespace --namespace falco --values=values.yaml falcosecurity/falco

NAME: falco
LAST DEPLOYED: Mon Feb 10 10:15:20 2025
NAMESPACE: falco
STATUS: deployed
REVISION: 1
NOTES:
No further action should be required.</code></pre>



<p>Or if you already have Falco deployed in a Kubernetes cluster, you can use the <code>helm upgrade</code> command instead:</p>



<pre class="wp-block-code"><code class="">$ helm upgrade falco --create-namespace --namespace falco --values=values.yaml falcosecurity/falco</code></pre>



<p>You can check if the Falco pods are correctly running:</p>



<pre class="wp-block-code"><code class="">$ kubectl get pods -n falco

NAME                                      READY   STATUS    RESTARTS   AGE
falco-6b8bc77d8b-v24jr                    2/2     Running   0          96s
falco-falcosidekick-67877d6946-4hmbn      1/1     Running   0          96s
falco-falcosidekick-67877d6946-tpjk6      1/1     Running   0          96s
falco-falcosidekick-ui-78b96fd57d-4wb6q   1/1     Running   0          96s
falco-falcosidekick-ui-78b96fd57d-v7rnm   1/1     Running   0          96s
falco-falcosidekick-ui-redis-0            1/1     Running   0          96s</code></pre>



<p>Wait and execute the command again if the pods are in “Init” or “ContainerCreating” state.</p>



<p>Once the Falco pod is ready, run the following command to see the logs:</p>



<pre class="wp-block-code"><code class="">kubectl logs -l app.kubernetes.io/name=falco -n falco -c falco</code></pre>



<p>You should see logs like this:</p>



<pre class="wp-block-code"><code class="">$ kubectl logs -l app.kubernetes.io/name=falco -n falco -c falco

Mon Feb 10 09:15:35 2025:    /etc/falco/k8s_audit_rules.yaml | schema validation: ok
Mon Feb 10 09:15:35 2025: Hostname value has been overridden via environment variable to: my-pool-1-node-921b61
Mon Feb 10 09:15:35 2025: The chosen syscall buffer dimension is: 8388608 bytes (8 MBs)
Mon Feb 10 09:15:35 2025: Starting health webserver with threadiness 2, listening on 0.0.0.0:8765
Mon Feb 10 09:15:35 2025: Loaded event sources: syscall, k8s_audit
Mon Feb 10 09:15:35 2025: Enabled event sources: k8s_audit
Mon Feb 10 09:15:35 2025: Opening 'k8s_audit' source with plugin 'k8saudit-ovh'
{"hostname":"my-pool-1-node-921b61","output":"09:15:40.698757000: Warning K8s Operation performed by user not in allowed list of users (user=csi-cinder-controller target=csi-6afb06dce281b86b7bab718b5d966dc261b2b1554941ae449519a128cb2e3fb3/volumeattachments verb=patch uri=/apis/storage.k8s.io/v1/volumeattachments/csi-6afb06dce281b86b7bab718b5d966dc261b2b1554941ae449519a128cb2e3fb3/status resp=200)","output_fields":{"evt.time":1739178940698757000,"ka.response.code":"200","ka.target.name":"csi-6afb06dce281b86b7bab718b5d966dc261b2b1554941ae449519a128cb2e3fb3","ka.target.resource":"volumeattachments","ka.uri":"/apis/storage.k8s.io/v1/volumeattachments/csi-6afb06dce281b86b7bab718b5d966dc261b2b1554941ae449519a128cb2e3fb3/status","ka.user.name":"csi-cinder-controller","ka.verb":"patch"},"priority":"Warning","rule":"Disallowed K8s User","source":"k8s_audit","tags":["k8s"],"time":"2025-02-10T09:15:40.698757000Z"}
{"hostname":"my-pool-1-node-921b61","output":"09:15:57.508657000: Warning K8s Operation performed by user not in allowed list of users (user=yacht target=my-pool-1.18051c0a88716868/events verb=patch uri=/api/v1/namespaces/default/events/my-pool-1.18051c0a88716868 resp=403)","output_fields":{"evt.time":1739178957508657000,"ka.response.code":"403","ka.target.name":"my-pool-1.18051c0a88716868","ka.target.resource":"events","ka.uri":"/api/v1/namespaces/default/events/my-pool-1.18051c0a88716868","ka.user.name":"yacht","ka.verb":"patch"},"priority":"Warning","rule":"Disallowed K8s User","source":"k8s_audit","tags":["k8s"],"time":"2025-02-10T09:15:57.508657000Z"}
{"hostname":"my-pool-1-node-921b61","output":"09:15:57.807013000: Warning K8s Operation performed by user not in allowed list of users (user=yacht target=my-pool-1/nodepools verb=update uri=/apis/kube.cloud.ovh.com/v1alpha1/nodepools/my-pool-1/status resp=200)","output_fields":{"evt.time":1739178957807013000,"ka.response.code":"200","ka.target.name":"my-pool-1","ka.target.resource":"nodepools","ka.uri":"/apis/kube.cloud.ovh.com/v1alpha1/nodepools/my-pool-1/status","ka.user.name":"yacht","ka.verb":"update"},"priority":"Warning","rule":"Disallowed K8s User","source":"k8s_audit","tags":["k8s"],"time":"2025-02-10T09:15:57.807013000Z"}</code></pre>



<p>The logs confirm that Falco <strong>k8saudit-ovh</strong> plugin and the <strong>k8saudit</strong> rules have been loaded correctly 💪.</p>
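

<p>Since each alert is a single JSON object, it is easy to post-process the Falco output with a few lines of shell. A minimal sketch (the sample event is inlined for illustration; in practice you would pipe the <code>kubectl logs</code> output shown above):</p>



<pre class="wp-block-code"><code class=""># Falco emits one JSON object per alert; a sample event is inlined here for illustration.
# In practice, pipe the output of "kubectl logs -l app.kubernetes.io/name=falco -n falco -c falco" instead.
ALERT='{"priority":"Warning","rule":"Disallowed K8s User","output_fields":{"ka.user.name":"yacht","ka.verb":"patch"}}'
echo "$ALERT" | python3 -c 'import json,sys; a=json.load(sys.stdin); print(a["rule"], "by", a["output_fields"]["ka.user.name"])'</code></pre>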



<h2 class="wp-block-heading">Testing Falco</h2>



<p>In order to test Falco we need to know which rules are installed by default. In our case, as we defined it in the values.yaml file, the <strong>k8saudit-ovh</strong> plugin follows the <a href="https://github.com/falcosecurity/plugins/blob/main/plugins/k8saudit/rules/k8s_audit_rules.yaml" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">k8s_audit_rules.yaml</a> file. You can take a look at it to get familiar with the default rules.</p>



<p>In this blog post we will test one of the well-known default k8s audit rules:</p>



<pre class="wp-block-code"><code class="">- rule: Attach/Exec Pod
  desc: &gt;
    Detect any attempt to attach/exec to a pod
  condition: kevt_started and pod_subresource and (kcreate or kget) and ka.target.subresource in (exec,attach) and not user_known_exec_pod_activities
  output: Attach/Exec to pod (user=%ka.user.name pod=%ka.target.name resource=%ka.target.resource ns=%ka.target.namespace action=%ka.target.subresource command=%ka.uri.param[command])
  priority: NOTICE
  source: k8s_audit
  tags: [k8s]</code></pre>



<p>This rule is interesting because an event will be generated when a user executes commands in a pod.</p>



<p>Let&#8217;s test the rule!</p>



<p>In a tab of your terminal, watch the coming logs:</p>



<pre class="wp-block-code"><code class="">$ kubectl logs -l app.kubernetes.io/name=falco -n falco -c falco -f</code></pre>



<p>In another tab of your terminal, create a Nginx pod and execute a command in it:</p>



<pre class="wp-block-code"><code class="">$ kubectl run nginx --image=nginx<br><br>$ kubectl exec -it nginx -- cat /etc/shadow</code></pre>



<p>Several seconds later, you should see this <strong>Attach/Exec to pod</strong> alert in the logs:</p>



<pre class="wp-block-code"><code class="">...
{"hostname":"my-pool-1-node-921b61","output":"09:29:46.302906000: Notice Attach/Exec to pod (user=kubernetes-admin pod=nginx-676b6c5bbc-4xc6t resource=pods ns=hello-app action=exec command=cat)","output_fields":{"evt.time":1739179786302906000,"ka.target.name":"nginx-676b6c5bbc-4xc6t","ka.target.namespace":"hello-app","ka.target.resource":"pods","ka.target.subresource":"exec","ka.uri.param[command]":"cat","ka.user.name":"kubernetes-admin"},"priority":"Notice","rule":"Attach/Exec Pod","source":"k8s_audit","tags":["k8s"],"time":"2025-02-10T09:29:46.302906000Z"}
...</code></pre>



<p>🎉</p>



<h2 class="wp-block-heading">Conclusion</h2>



<p>Ensuring the security of Kubernetes clusters is important. The Audit Logs contain a lot of information that often goes unused, so don&#8217;t hesitate to use this new plugin.</p>



<p>We installed the new k8saudit-ovh plugin in an OVHcloud MKS cluster, but note that you can deploy it in a Kubernetes cluster hosted by another Cloud provider, and even in a Falco instance running locally 💪.</p>



<p>We visualized the logs/events in the terminal, but you can also visualize them in the <a href="https://github.com/falcosecurity/falcosidekick" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Falcosidekick</a> UI, create a custom rule, and even use <a href="https://github.com/falcosecurity/falco-talon" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Talon</a> to execute some actions.</p>
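

<p>As an illustration of a custom rule, here is a minimal sketch that alerts whenever a Secret is read in the <code>default</code> namespace. The <code>kevt</code>, <code>secret</code> and <code>kget</code> macros are assumed to come from the default <code>k8s_audit_rules.yaml</code> loaded above, so double-check them against your rules files before relying on this:</p>



<pre class="wp-block-code"><code class="">- rule: Secret Read In Default Namespace
  desc: Detect any read of a Secret in the default namespace
  condition: kevt and secret and kget and ka.target.namespace=default
  output: Secret read in default namespace (user=%ka.user.name secret=%ka.target.name)
  priority: WARNING
  source: k8s_audit
  tags: [k8s]</code></pre>



<p>You can load such a rule through a file in <code>/etc/falco/rules.d</code> (already listed in the <code>rules_files</code> section of our values.yaml), for example via the Falco Helm chart&#8217;s <code>customRules</code> value.</p>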
<img loading="lazy" decoding="async" src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fenhancing-kubernetes-security-detecting-threats-in-ovhcloud-managed-kubernetes-cluster-mks-audit-logs-with-falco%2F&amp;action_name=Enhancing%20Kubernetes%20Security%3A%20Detecting%20Threats%20in%20OVHcloud%20Managed%20Kubernetes%20cluster%20%28MKS%29%20Audit%20Logs%20with%20Falco&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Empowering Healthcare Efficiency</title>
		<link>https://blog.ovhcloud.com/empowering-healthcare-efficiency/</link>
		
		<dc:creator><![CDATA[Leonard Pommereau]]></dc:creator>
		<pubDate>Thu, 09 Jan 2025 16:40:33 +0000</pubDate>
				<category><![CDATA[OVHcloud Startup Program]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[HDS]]></category>
		<category><![CDATA[Infrastructure]]></category>
		<category><![CDATA[Kubernetes]]></category>
		<category><![CDATA[Managed Rancher Services]]></category>
		<category><![CDATA[MongoDB]]></category>
		<category><![CDATA[OVHcloud]]></category>
		<category><![CDATA[Public Cloud]]></category>
		<category><![CDATA[SecNumCloud]]></category>
		<category><![CDATA[Startup Program]]></category>
		<guid isPermaLink="false">https://blog.ovhcloud.com/?p=27935</guid>

					<description><![CDATA[Startup highlight: Interview with Thomas Foricher, CTO at Silbo At OVHcloud’s Startup Program, we are proud to support innovative startups like Silbo that are reshaping industries.Today, we speak with Thomas Foricher, CTO of Silbo, a groundbreaking company transforming patient flow management in healthcare. Can you introduce Silbo and its mission? Silbo was founded in 2018 [&#8230;]<img src="//blog.ovhcloud.com/wp-content/plugins/matomo/app/matomo.php?idsite=1&amp;rec=1&amp;url=https%3A%2F%2Fblog.ovhcloud.com%2Fempowering-healthcare-efficiency%2F&amp;action_name=Empowering%20Healthcare%20Efficiency&amp;urlref=https%3A%2F%2Fblog.ovhcloud.com%2Ffeed%2F" style="border:0;width:0;height:0" width="0" height="0" alt="" />]]></description>
										<content:encoded><![CDATA[
<h4 class="wp-block-heading"><strong><em>Startup highlight:</em></strong> <strong>Interview with Thomas Foricher, CTO at Silbo</strong></h4>



<p>At OVHcloud’s Startup Program, we are proud to support innovative startups like Silbo that are reshaping industries.<br>Today, we speak with <strong>Thomas Foricher</strong>, CTO of <a href="https://silbo.com/en/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer"><strong>Silbo</strong></a>, a groundbreaking company transforming patient flow management in healthcare.</p>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<p><strong>Can you introduce Silbo and its mission?</strong></p>



<p>Silbo was founded in 2018 by Antoine Bohuon with a clear mission: to serve those who care for others by enabling healthcare professionals to focus on their core mission—delivering quality care.</p>



<p>Our platform addresses the complexities of hospital bed management, facilitating the allocation of beds to patients, improving information sharing among healthcare providers, and optimizing patient trajectories. By offering an intuitive, all-in-one solution, Silbo empowers hospitals to efficiently manage patient flows while enhancing the quality of care provided to patients.</p>



<figure class="wp-block-image aligncenter size-full is-resized"><img loading="lazy" decoding="async" width="466" height="436" src="https://blog.ovhcloud.com/wp-content/uploads/2025/01/Picture1.jpg" alt="" class="wp-image-27936" style="width:316px;height:auto" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/01/Picture1.jpg 466w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/Picture1-300x281.jpg 300w" sizes="auto, (max-width: 466px) 100vw, 466px" /></figure>



<p></p>



<p><strong>What challenges did Silbo face before partnering with OVHcloud?</strong></p>



<p>Our primary need was to find a sovereign hosting provider with strong expertise in modern technologies such as Kubernetes, MongoDB, and Redis. Additionally, compliance with <a href="https://www.ovhcloud.com/en-ie/compliance/hds/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">HDS certification</a> and a guaranteed SLA were critical due to the nature of our work in healthcare.</p>



<p>Managing and securing servers at scale requires highly specialized expertise. With increasing technical interdependencies, ensuring constant security and availability is a dedicated profession. For us, relying on experts like OVHcloud allows us to focus on what we do best—improving healthcare efficiency.</p>



<p><strong>How did OVHcloud help Silbo address these challenges?</strong></p>



<p>OVHcloud was the clear choice. It is the only hosting provider in France that offers a public cloud with high-quality services, combined with <a href="https://www.ovhcloud.com/en/compliance/hds/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">HDS</a> and <a href="https://www.ovhcloud.com/en-ie/compliance/secnumcloud/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">SecNumCloud</a> compliance. Additionally, its strategic vision of deploying local datacenters across Europe aligns with the needs of countries seeking to keep sensitive data, such as health information, close to healthcare institutions.</p>



<p>We use several managed services, including <a href="https://www.ovhcloud.com/en/public-cloud/kubernetes/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Kubernetes</a>, <a href="https://www.ovhcloud.com/en/public-cloud/object-storage/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Object Storage</a>, <a href="https://www.ovhcloud.com/en/public-cloud/mongodb/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">MongoDB</a>, and Redis. These services are comparable to those offered by other providers, but OVHcloud’s standout feature was <a href="https://www.ovhcloud.com/en/public-cloud/managed-rancher-service/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Rancher</a>, which we beta-tested. Rancher allowed us to monitor on-premise Kubernetes deployments effectively, with support from OVHcloud’s engineering team.</p>



<p>With the help of OVHcloud experts, we improved our knowledge of Kubernetes. By adopting their best practices, we successfully deployed Kubernetes on-premise and leveraged <a href="https://www.ovhcloud.com/en-ie/public-cloud/managed-rancher-service/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Rancher</a> for effective monitoring. This expertise has significantly enhanced our ability to scale infrastructure while maintaining high security and reliability.</p>



<p><strong>What tangible results has Silbo achieved through this partnership?</strong></p>



<p>Since partnering with OVHcloud, we’ve seen our infrastructure costs cut in half compared to other providers without public cloud capabilities. Our platform is now faster, providing a noticeably better experience for end-users.</p>



<p>From a development perspective, on-demand Kubernetes allows us to execute large computational jobs and parallelize analyses. This accelerates our deployment workflows and optimizes our processes.</p>



<p>By outsourcing infrastructure management to OVHcloud, we’ve freed up resources to focus on developing new features and enhancing our platform. This enables us to deliver even greater value to our users and maintain our commitment to innovation.</p>



<p><strong>What’s next for Silbo?</strong></p>



<p>Our ambitions are closely aligned with OVHcloud’s vision: expanding across Europe while providing secure, localized services compliant with both international and local regulations.<br>One of our main challenges will be managing multiple instances that comply with the specific regulations of different countries. Cloud solutions will be key in helping us navigate this complexity while maintaining high standards of security and efficiency.</p>



<figure class="wp-block-image aligncenter size-large is-resized"><img loading="lazy" decoding="async" width="1024" height="682" src="https://blog.ovhcloud.com/wp-content/uploads/2025/01/Picture3-1024x682.jpg" alt="" class="wp-image-27945" style="width:458px;height:auto" srcset="https://blog.ovhcloud.com/wp-content/uploads/2025/01/Picture3-1024x682.jpg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/Picture3-300x200.jpg 300w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/Picture3-768x512.jpg 768w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/Picture3-1536x1023.jpg 1536w, https://blog.ovhcloud.com/wp-content/uploads/2025/01/Picture3.jpg 1600w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>






<p><strong>What advice would you give to startups exploring cloud solutions?</strong></p>



<p>Focus on two key aspects: identifying your user’s problem and solving it as simply and efficiently as possible.<br>Leverage existing, standardized solutions wherever possible. Avoid reinventing the wheel—your priorities should always align with your clients’ needs, not internal assumptions or preferences.</p>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<figure class="wp-block-image size-full"><a href="https://startup.ovhcloud.com/en" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer"><img loading="lazy" decoding="async" width="1024" height="275" src="https://blog.ovhcloud.com/wp-content/uploads/2024/12/Email-signature-StartupProgram_2-1024x275-1.png" alt="" class="wp-image-27809" srcset="https://blog.ovhcloud.com/wp-content/uploads/2024/12/Email-signature-StartupProgram_2-1024x275-1.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2024/12/Email-signature-StartupProgram_2-1024x275-1-300x81.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2024/12/Email-signature-StartupProgram_2-1024x275-1-768x206.png 768w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></a></figure>






<p><strong>Join the OVHcloud Startup Program</strong></p>



<p>Silbo’s success highlights the transformative power of leveraging OVHcloud’s Startup Program.<br>Are you ready to take your startup to the next level? Join a growing community of innovators and benefit from tailored cloud solutions, expert guidance, and a global ecosystem.<br>Learn more about <a href="https://startup.ovhcloud.com/en-ie/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer"><strong>OVHcloud’s Startup Program</strong></a> and get started on your journey today!</p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
