Today in the monitoring world, we see the rise of the Prometheus tool. It’s a great tool to deploy in your infrastructure, as it allows you to scrape all of your servers or applications to retrieve, store and analyze the metrics. And all you have to do is to extract and run it, it does […]
At OVHcloud Metrics, we love open source! Our goal is to provide all of our users with a full experience. We rely on the Warp 10 time series database, which enables us to build open source tools for our users’ benefit. Let’s take a look at some in this blogpost. Storage tool Our Infrastructure is based
At OVHcloud, we are open sourcing our “Agility Telemetry” project. Jerem, as our data collector, is the main component of this project. Jerem scrapes our JIRA at regular intervals, and extracts specific metrics for each project. It then forwards them to our long-term storage application, the OVHcloud Metrics Data Platform. Agility concepts from a developer’s
In today’s blogpost, we’re going to take a look at our upstream contribution to Apache HBase’s stochastic load balancer, based on our experience of running HBase clusters to support OVHcloud’s monitoring. The context Have you ever wondered how: we generate the graphs for your OVHcloud server or web hosting package? our internal teams monitor their
Last year, we released TSL as an open source tool to query a Warp 10 platform, and by extension, the OVHcloud Metrics Data Platform. But how has it evolved since then? Is TSL ready to query other time series databases? Where does TSL stand in the Warp 10 ecosystem? TSL to query many time series databases
Last spring, I built a wood oven in my garden. I’ve wanted to have one for years, and I finally decided to make it. To use it, I make a big fire inside for two hours, remove all the embers, and then it’s ready for cooking. The oven accumulates the heat during the fire and
Our colleagues in the K8S team launched the OVH Managed Kubernetes solution last week, in which they manage the Kubernetes master components and spawn your nodes on top of our Public Cloud solution. I will not describe the details of how it works here, as there are already many blog posts about it (here and here, to get you
At the OVH Observability (formerly Metrics) team, we collect, process and analyse most of OVH’s monitoring data. It represents about 500M unique metrics, pushing data points at a steady rate of 5M per second. This data can be classified in two ways: host or application monitoring. Host monitoring is mostly based on hardware counters (CPU,
In the Metrics team, we have been working on time series for several years. In our experience, the data analytics capabilities of a Time Series Database (TSDB) platform are a key factor in creating value from your metrics. And these analytics capabilities are mostly defined by the query languages the platform supports.
TSL stands for Time Series Language. In a few words, TSL is an abstraction, in the form of an HTTP proxy, that generates queries for different TSDB backends. It currently supports Warp 10’s WarpScript and Prometheus’ PromQL query languages, and we aim to extend support to other major TSDBs.
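To make the idea of one abstract query feeding several backends more concrete, here is a minimal Python sketch. It does not use TSL's actual syntax or implementation: the `Select` class, the `render` function, the backend names and the `$token` variable are all hypothetical, chosen only for illustration. It simply shows how a single backend-agnostic description (a metric name, label filters and a time window) could be rendered either as a PromQL range selector or as a WarpScript `FETCH` call.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Select:
    """Hypothetical backend-agnostic query: a metric, label filters, a time window."""
    metric: str
    labels: dict = field(default_factory=dict)
    last_minutes: int = 5


def to_promql(q: Select) -> str:
    # PromQL range selector: metric{label="value",...}[5m]
    matchers = ",".join(f'{k}="{v}"' for k, v in sorted(q.labels.items()))
    return f"{q.metric}{{{matchers}}}[{q.last_minutes}m]"


def to_warpscript(q: Select) -> str:
    # WarpScript FETCH: [ token 'class' { labels } end timespan ] FETCH
    # $token is assumed to hold a read token defined elsewhere in the script.
    labels = " ".join(f"'{k}' '{v}'" for k, v in sorted(q.labels.items()))
    return f"[ $token '{q.metric}' {{ {labels} }} NOW {q.last_minutes} m ] FETCH"


# Hypothetical backend registry: one abstract query, several generated languages.
RENDERERS = {"prometheus": to_promql, "warp10": to_warpscript}


def render(q: Select, backend: str) -> str:
    try:
        return RENDERERS[backend](q)
    except KeyError:
        raise ValueError(f"unsupported backend: {backend}") from None


if __name__ == "__main__":
    query = Select("cpu_usage", {"host": "web-1"}, last_minutes=5)
    for name in RENDERERS:
        print(name, "->", render(query, name))
```

In this sketch, supporting another TSDB would only mean registering one more renderer, which is the kind of extension the proxy approach is meant to allow.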
To better understand why we created TSL, we will review some of the TSDB query languages supported on the OVH Metrics Data Platform. When implementing them, we learnt the good, the bad and the ugly of each one. In the end, we decided to build TSL to simplify querying on our platform, before open-sourcing it so that it can be used with any TSDB solution.
Why did we decide to invest some of our time in such a proxy? Let me tell you the story of the OVH metrics protocol!
OVH relies extensively on metrics to effectively monitor its entire stack. Whether they are low-level or business-centric, they allow teams to gain insight into how our services are operating on a daily basis. The need to store millions of datapoints per second led us to create a dedicated team to build and operate a product that handles that load: Metrics Data Platform. By relying on Apache HBase, Apache Kafka and Warp 10, we succeeded in creating a fully distributed platform that is handling all our metrics… and yours!
After building the platform to deal with all those metrics, our next challenge was to build one of the most needed features for Metrics: Alerting.