Managing Harbor at cloud scale: The story behind the Harbor Kubernetes Operator

Recently, our container platforms team made our “Private Managed Registry” service generally available. In this blog post, we will explain why OVHcloud chose to base this service on the Harbor project, built a Kubernetes operator for it, and open sourced it under the CNCF goharbor project.

The need for a S.M.A.R.T private registry

After our Managed Kubernetes Service release, we received many requests for a fully managed private container registry.

Though a container registry for hosting images may sound quite trivial to deploy, our users mentioned a production-grade registry solution was a critical part of the software delivery supply chain and was actually quite difficult to maintain.

Our customers were asking for an enterprise-grade solution offering advanced role-based access control and security by design, as concerns about vulnerabilities in publicly available images grew and content trust became a necessity.

Users regularly praised the user interface of services such as Docker Hub, but at the same time requested a highly available service backed by an SLA.

The perfect mix of open source and enterprise-grade feature set

After surveying prospects to fine-tune our feature set and pricing model, we searched for the best existing technologies to back it and landed on the CNCF incubating project Harbor (donated to the CNCF by VMware). Harbor is one of the few projects to have reached the CNCF incubation stage, which confirms the strong commitment of its community, and it has also become a key part of several commercial enterprise containerization solutions. We also appreciated Harbor’s approach of not reinventing the wheel but gluing together best-of-breed technologies for components such as vulnerability scanning, content trust and many others. It leverages the CNCF’s strong network of open source projects and ensures a high level of UX quality.

It was now time to take this 10k-GitHub-stars technology and adapt it to our specific case: managing tens of thousands of registries for our users, each with its own volume of container images and usage patterns.

High availability (our customers’ software integration and deployment rely on this service) and data durability were, of course, non-negotiable for us.

Kubernetes, to keep the stateless services highly available, and object storage (based on OpenStack Swift and compatible with the S3 API) were obvious choices to meet those requirements.

Addressing operational challenges at the cloud-provider scale

Within a few weeks, we opened the service in public beta, quickly attracting hundreds of active users. But with this surge in traffic, we naturally hit our first bottlenecks and performance challenges.

We approached the Harbor user group and team, who kindly pointed us to potential solutions, and after some small but key changes to how Harbor handles database connections, our issues were resolved. This reinforced our belief that the Harbor community is strong and committed to the health of the project and the requirements of its users.

As our service flourished, we found no real tooling available to manage the life-cycle of Harbor instances easily. Our commitment to the Kubernetes ecosystem made the concept of a Harbor operator for Kubernetes an interesting approach.

We discussed the idea with the Harbor maintainers, who warmly welcomed our proposal to develop it and open source it as the official Harbor Kubernetes Operator. OVHcloud is very proud to have the project now available in the goharbor GitHub project under the Apache 2 license. This project is another example of our strong commitment to open source and our willingness to contribute our efforts back to the projects that we love.

A versatile operator designed to accommodate any Harbor deployment

Readers familiar with the Harbor project may wonder what value this operator brings to the current catalogue of deployments including the Helm Chart version maintained by the project.

The operator design pattern is quickly catching on: an operator is an application-specific controller that extends Kubernetes to manage more complex, stateful apps. Simply put, it addresses different use cases than Helm does. Whereas the Helm chart offers an all-in-one installer that also deploys the different dependencies of Harbor (database, cache, etc.) from open source Docker images, other enterprises, service operators and cloud providers like us will want to pick and choose the service or technology behind those components.
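To make the controller idea above concrete, here is a minimal sketch of a reconciliation loop: compare the declared state with the observed state and act on the difference. It is an illustration only and is not taken from the Harbor operator; `DesiredState`, `Cluster` and `reconcile` are hypothetical names, and a real operator would talk to the Kubernetes API rather than mutate an in-memory dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DesiredState:
    """What the user declares (hypothetical, simplified spec)."""
    components: Dict[str, int]  # component name -> desired replica count


@dataclass
class Cluster:
    """In-memory stand-in for the observed cluster state (hypothetical)."""
    running: Dict[str, int] = field(default_factory=dict)


def reconcile(desired: DesiredState, cluster: Cluster) -> List[str]:
    """Move the observed state toward the desired one; return the actions taken."""
    actions: List[str] = []
    # Create missing components and rescale those with the wrong replica count.
    for name, replicas in sorted(desired.components.items()):
        current = cluster.running.get(name)
        if current is None:
            actions.append(f"create {name} x{replicas}")
        elif current != replicas:
            actions.append(f"scale {name} {current}->{replicas}")
        cluster.running[name] = replicas
    # Remove components that are running but no longer declared.
    for name in sorted(set(cluster.running) - set(desired.components)):
        actions.append(f"delete {name}")
        del cluster.running[name]
    return actions
```

Running `reconcile` a second time with the same inputs yields no actions, which is the property that lets a controller be re-run safely whenever it is notified of a change.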

We also aim to extend the current v0.5 operator to manage the full life-cycle of Harbor, from deployment to deletion, including scaling, updates, upgrades, and backup management.

This will help production users reach their target SLO and benefit from managed solutions or, for example, from existing database clusters they already maintain.

We designed the operator (leveraging the Operator SDK framework) so that both Harbor optional modules (Helm Chart store, vulnerability scanner, etc.) and dependencies (registry storage backend, relational and non-relational databases, etc.) can easily match your specific use case.
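The pick-and-choose idea described above can be sketched as a small planning function: optional modules are switched on or off, and each dependency is either deployed alongside Harbor or simply connected to when the user already runs one. Everything here is an assumption made for illustration: `HarborSpec`, `Dependency`, `plan`, the module names and the example URL are hypothetical and do not reflect the operator's actual resource schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Dependency:
    """A backing service Harbor needs (hypothetical model)."""
    kind: str                           # e.g. "database", "cache", "storage"
    external_url: Optional[str] = None  # set when the user brings their own


@dataclass
class HarborSpec:
    """Hypothetical, simplified spec; not the operator's real schema."""
    dependencies: List[Dependency]
    optional_modules: Dict[str, bool] = field(default_factory=dict)


def plan(spec: HarborSpec) -> Dict[str, List[str]]:
    """Split the spec into what must be deployed and what is only wired in."""
    deploy = ["core"]  # assumption: one mandatory component, for the sketch
    deploy += sorted(name for name, enabled in spec.optional_modules.items() if enabled)
    connect: List[str] = []
    for dep in spec.dependencies:
        if dep.external_url:
            # The user already operates this service: reference it, do not deploy it.
            connect.append(f"{dep.kind}={dep.external_url}")
        else:
            deploy.append(dep.kind)
    return {"deploy": deploy, "connect": connect}
```

For instance, a spec that enables only the vulnerability scanner, points at an existing database and asks for a cache would result in the scanner and cache being deployed and the database merely being referenced.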

Simplified architecture behind OVHcloud’s Managed Private Registry service

Contributing to Harbor and the operator project

We already have a roadmap planned with the Harbor maintainers to further enrich the operator to accommodate more than the deployment and destruction phases (for example making Harbor version upgrades more elegant). We look forward to being an integral part of the project and will continue investing in Harbor.

To that end, Jérémie Monsinjon and Pierre Peronnet have also been invited to be maintainers of the Harbor project, focusing on goharbor/operator.

In addition to regular contributions to multiple projects we use within OVHcloud, the container-platform team is also working on other major open source releases, such as an official OVHcloud cloud controller for self-managed Kubernetes, which we plan to deliver in late 2020.

Download Harbor or the Harbor Operator: Official Harbor GitHub repo

Learn more about Harbor: Official Harbor website

Cloud Native Services Product Manager