prometheus

vLLM on OVHcloud MKS for high availability and full observability

Reference Architecture: Deploying a vision-language model with vLLM on OVHcloud MKS for high performance inference and full observability

Ensure complete digital sovereignty of your AI models with end-to-end control through open-source solutions on OVHcloud’s Managed Kubernetes Service. This reference architecture demonstrates […]

reference architecture vLLM deployment and metrics observability stack

Reference Architecture: Custom metric autoscaling for LLM inference with vLLM on OVHcloud AI Deploy and observability using MKS

Take your LLM (Large Language Model) deployment to production level with comprehensive custom autoscaling configuration and advanced vLLM metrics observability.
