February 2019

The Unexpected Quest for Business Intelligence

Business Intelligence (BI) relies on the ability to collect substantial data from an information system to feed a Data Warehouse (DWH) or data lake. A DWH or data lake usually provides a copy of the data that will be used for BI applications. Different strategies can be applied to feed a DWH. One such strategy is Change Data Capture (CDC), which is the ability to capture state changes in a database and convert them into events that can be used for other purposes. Most databases are intended for OLTP purposes, and are well designed for this. Nonetheless, other use cases require the same data with different access patterns. These use cases (big data, ETL, and stream processing, to name a few) mostly fall under the OLAP banner.
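To make the CDC idea concrete, here is a minimal Python sketch that turns the difference between two snapshots of a table into insert, update, and delete events. It is an illustration only, not the in-house CDC described in this post: the function name `capture_changes`, the event layout, and the sample rows are all assumptions, and production CDC tools generally read the database's transaction log rather than comparing snapshots.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def capture_changes(before: Dict[int, Row], after: Dict[int, Row]) -> List[Dict[str, Any]]:
    """Compare two snapshots keyed by primary key and emit change events.

    Hypothetical helper for illustration: snapshot diffing stands in for
    the log-based capture a real CDC pipeline would typically use.
    """
    events: List[Dict[str, Any]] = []

    # Keys present only in the new snapshot are inserts.
    for key in sorted(after.keys() - before.keys()):
        events.append({"op": "insert", "key": key, "before": None, "after": after[key]})

    # Keys present in both snapshots are updates if the row content differs.
    for key in sorted(before.keys() & after.keys()):
        if before[key] != after[key]:
            events.append({"op": "update", "key": key, "before": before[key], "after": after[key]})

    # Keys present only in the old snapshot are deletes.
    for key in sorted(before.keys() - after.keys()):
        events.append({"op": "delete", "key": key, "before": before[key], "after": None})

    return events


if __name__ == "__main__":
    # Made-up sample data: two states of the same table.
    old_state = {1: {"name": "alice", "plan": "basic"}, 2: {"name": "bob", "plan": "pro"}}
    new_state = {1: {"name": "alice", "plan": "pro"}, 3: {"name": "carol", "plan": "basic"}}
    for event in capture_changes(old_state, new_state):
        print(event["op"], event["key"])
```

Each event carries both the previous and the new row, so a downstream consumer (a DWH loader, for example) can replay the changes without querying the source database again.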

OVH, as a cloud provider, manages numerous databases, both for its customers and its own needs. Managing a database lifecycle always involves both keeping the infrastructure up to date, and remaining in sync with the development release cycle, to align the software with its database dependency. For example, an app might require MySQL 5.0, which could then be announced as EOL (End Of Life). In this case the app needs to be modified to support (let’s say) MySQL 5.5. We’re not reinventing the wheel here – this process has been managed by operations and dev teams for decades now.

This becomes trickier if you don’t have control over the application. For example, imagine a third party provides you with an application to ensure encrypted transactions. You have absolutely no control over this application, nor the associated database. Nonetheless, you still need the data from the database.

This blog post relates a similar example we encountered while building the OVH data lake, with the help of an in-house CDC development. This story takes place in early 2015, although I still think it’s worth sharing. 🙂


Getting external traffic into Kubernetes – ClusterIp, NodePort, LoadBalancer, and Ingress

For the last few months, I have been acting as Developer Advocate for the OVH Managed Kubernetes beta, following our beta testers, getting feedback, writing docs and tutorials, and generally helping to make sure the product matches our users’ needs as closely as possible.

In the next few posts, I am going to tell you some stories about this beta phase. We’ll be taking a look at feedback from some of our beta testers, technical insights, and some fun anecdotes about the development of this new service.

Today, we’ll start with one of the most frequent questions I got during the early days of the beta: How do I route external traffic into my Kubernetes service? The question came up a lot as our customers began to explore Kubernetes, and when I tried to answer it, I realised that part of the problem was the sheer number of possible answers, and the concepts needed to understand them.
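As a rough illustration of why there are several possible answers, the Python sketch below builds a Kubernetes Service manifest for each of the three Service types named in the title (written `ClusterIP`, `NodePort`, and `LoadBalancer` in the Kubernetes API). The helper `service_manifest`, the app name `web`, and the port numbers are assumptions made up for this example; Ingress is a separate resource kind and is not covered here.

```python
import json
from typing import Any, Dict, Optional

SERVICE_TYPES = {"ClusterIP", "NodePort", "LoadBalancer"}


def service_manifest(
    name: str,
    service_type: str,
    port: int,
    target_port: int,
    node_port: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a Service manifest as a plain dict (hypothetical helper).

    The dict mirrors the fields of a core/v1 Service; it can be dumped
    to JSON and passed to `kubectl apply -f -`.
    """
    if service_type not in SERVICE_TYPES:
        raise ValueError(f"unknown Service type: {service_type}")

    port_spec: Dict[str, Any] = {"port": port, "targetPort": target_port}
    if node_port is not None:
        # A ClusterIP Service is only reachable inside the cluster,
        # so it has no node port to expose.
        if service_type == "ClusterIP":
            raise ValueError("nodePort is not valid for a ClusterIP Service")
        port_spec["nodePort"] = node_port

    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": service_type,
            # Assumes the target pods carry the label app=<name>.
            "selector": {"app": name},
            "ports": [port_spec],
        },
    }


if __name__ == "__main__":
    for kind in ("ClusterIP", "NodePort", "LoadBalancer"):
        print(json.dumps(service_manifest("web", kind, 80, 8080), indent=2))
```

The three manifests differ only in `spec.type`, yet that single field decides whether the Service is reachable only from inside the cluster, through a port on every node, or through an external load balancer provisioned by the cloud provider.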


Deep Learning: A new hype

Deep Learning explained to my 8-year-old daughter

Machine Learning and especially Deep Learning are hot topics, and you are sure to have come across the buzzword “Artificial Intelligence” in the media. Yet these are not new concepts. The first Artificial Neural Network (ANN) was introduced in the 1940s. So why all the recent interest around neural networks and Deep Learning? We will explore this


CDS

How does OVH manage the CI/CD at scale?

From git commit to production, the delivery process is the set of steps that take place to deliver your service to your customers. Continuous Integration and Continuous Delivery – CI/CD – are practices based on the Agile Values which aim to automate this process as much as possible.

The Continuous Delivery Team @OVH has a mission: to help OVH developers industrialize and automate their delivery process. The CD team is here to advocate CI/CD best practices and maintain the ecosystem tools, with a strong focus on as-a-service solutions.

The central point of this ecosystem is a tool built in-house at OVH, named CDS.
CDS is open-source software from OVH; you will find it at https://github.com/ovh/cds, with documentation at https://ovh.github.io/cds.


CDS

Understanding CI/CD for Big Data and Machine Learning

This week, the OVH Integration and Continuous Deployment team was invited to the DataBuzzWord podcast. Together, we explored the topic of continuous deployment in the context of machine learning and big data. We also discussed continuous deployment for environments like Kubernetes, Docker, OpenStack and VMware vSphere. If you missed it, or would like to review everything that was discussed, you


TSL - Time Series Language

TSL: a developer-friendly Time Series query language for all our metrics

In the Metrics team, we have been working on time series for several years. In our experience, the data analytics capabilities of a Time Series Database (TSDB) platform are a key factor in creating value from your metrics. And these analytics capabilities are mostly defined by the query languages the platform supports.

TSL stands for Time Series Language. In a few words, TSL is an abstraction, in the form of an HTTP proxy, that generates queries for different TSDB backends. Currently it supports Warp 10’s WarpScript and Prometheus’ PromQL query languages, but we aim to extend support to other major TSDBs.

To better understand why we created TSL, we will review some of the TSDB query languages supported on the OVH Metrics Data Platform. When implementing them, we learnt the good, the bad and the ugly of each one. In the end, we decided to build TSL to simplify querying on our platform, before open-sourcing it so it can be used with any TSDB solution.

Why did we decide to invest some of our time in such a proxy? Let me tell you the story of the OVH metrics protocol!


Upgrading vSphere

How we’ve updated 850 vCenter in 4 weeks

Handling release management for enterprise software isn’t an easy job: updating infrastructures, coping with the fear of no longer being supported by the software vendor, upgrading licenses to be compatible with new versions, and taking every precaution to be able to roll back if something isn’t working as expected…

With OVH Private Cloud, we take this responsibility off your hands: we manage this time-consuming and stressful aspect, so you can concentrate on your business and your production.

But this doesn’t mean it’s not a challenge for us either.


The role of humans in digital companies

In 2017, OVH grew considerably through the hiring we did in Europe, the acquisition of a company in the USA, the construction of 14 new datacentres, and the launch of operations in APAC. In less than 18 months, we doubled our headcount, going from 1,200 to 2,500 people. BTW, I don’t recommend growing that fast. OVH is a company
