Apache Spark

Why are you still managing your data processing clusters?

Cluster computing shares a computation load among a group of computers, which improves performance and scalability. Apache Spark is an open-source, distributed cluster-computing framework that is much faster than its predecessor, Hadoop MapReduce, thanks to features like in-memory processing and lazy evaluation. Apache Spark […]

Improving the quality of data with Apache Spark

Today we bring you a guest post by Hubert Stefani, Chief Innovation Officer and Cofounder of Novagen Conseil. As expert data consultants and heavy Apache Spark users, we felt honoured to become early adopters of OVHcloud Data Processing. As a first use case to test this offering, we chose our quality assessment process. As a […]

Do you need to process your data? Try the new OVHcloud Data Processing cloud service!

One of OVHcloud's data services is called OVHcloud Data Processing (ODP). It lets you submit a processing job without caring about the cluster behind it. You just specify the resources you want to use for your job, and the service handles the cluster creation and destroys the cluster for you as soon as your job is finished. In other words, you don’t have to think about clusters any more. Decide how many resources you need to process your data in the most efficient way for you, and let OVHcloud Data Processing do the rest.

Apache Spark & OVH Analytics Data Compute

How to run massive data operations faster than ever, powered by Apache Spark and OVH Analytics Data Compute

If you’re reading this blog for the first time, welcome to the ongoing data revolution! Just after the industrial revolution came what we call the digital revolution, with millions of people and objects accessing a worldwide network, the internet, and all of them creating new content and new data. Let’s think about ourselves… We […]
