Building A Distributed Docker Registry At Scale


Docker Registry is a core component for many organizations that deploy containerized applications or use containers throughout development and deployment. Registry complements Docker Hub in many ways: it can cache images for local access while still providing a location to store images that is wholly controlled by the organization. Registry itself is very scalable, capable of handling massive numbers of concurrent requests and transactions. However, this isn’t necessarily true of the storage used to hold those images.

Deploying for a single site is quite easy: simply stand up a Registry instance and provide appropriate persistent storage. The only slightly difficult part is to ensure that the storage being used can provide enough performance for the amount of traffic you expect. The architecture for this is simple: a single host with all of the clients connecting to it.

A simple single site Docker Registry deployment

Registry starts to become complicated as the organization adds multiple sites. At this point an architectural decision must be made: do we have a single master instance where all users store and retrieve their images, or do we have a number of independent instances, where each site is an island and must manually pull images from other sites? Both of these solutions require access across the WAN for the non-local sites to push and pull images.

A single site Docker Registry deployment with no distribution for remote users

Multiple instances of Docker Registry, each an island of images with no sharing between them

But what if there were a better way? NetApp’s clustered Data ONTAP has built-in replication technology, known as SnapMirror, which efficiently moves data between storage systems. SnapMirror, by itself, reduces the time and bandwidth needed to replicate container images between sites by sending only the new and changed data in the volume. This granular replication takes place at the block level, unlike many other replication technologies which work at the file level. So, if an image file which is hundreds of megabytes in size has only a single update applied to it, then only that change is replicated, not the entire file.

NetApp's SnapMirror enables replication of data across the enterprise
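
To make the block-level versus file-level difference concrete, here is a minimal sketch of the arithmetic. It is not a description of how SnapMirror detects changes internally. The 4096-byte block size, the 8 MiB stand-in "image", and the function names are all assumptions chosen for illustration, and the change is an in-place overwrite rather than an insertion.

```python
BLOCK_SIZE = 4096  # assumed block size, for illustration only


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return the indexes of blocks that differ between two versions of a file."""
    total = max(len(old), len(new))
    count = (total + block_size - 1) // block_size
    return [
        i
        for i in range(count)
        if old[i * block_size:(i + 1) * block_size]
        != new[i * block_size:(i + 1) * block_size]
    ]


def bytes_to_send(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> tuple[int, int]:
    """Return (file_level_bytes, block_level_bytes) needed to replicate the change."""
    # File-level replication resends the whole file if anything changed.
    file_level = len(new) if old != new else 0
    # Block-level replication resends only the blocks that changed.
    block_level = sum(
        len(new[i * block_size:(i + 1) * block_size])
        for i in changed_blocks(old, new, block_size)
    )
    return file_level, block_level


# A hypothetical 8 MiB image file with a single byte overwritten in place.
old_image = bytes(8 * 1024 * 1024)
updated = bytearray(old_image)
updated[5_000_000] = 1
new_image = bytes(updated)

file_level, block_level = bytes_to_send(old_image, new_image)
print(file_level, block_level)  # prints "8388608 4096"
```

In this toy case a file-level copy would resend all 8 MiB, while a block-level copy would resend a single 4 KiB block.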

There are a number of additional efficiencies which can be gained using NetApp storage technology as well. For example, with clustered Data ONTAP the data can be compressed as it leaves the source system and traverses the WAN, then decompressed as it lands on the destination. This saves significantly on the amount of bandwidth needed, and the amount of time needed to replicate the changes across the enterprise.

SnapMirror enables bandwidth efficient replication through the use of deduplication and compression
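
The compress-on-send, decompress-on-receive idea can be sketched in a few lines. This uses Python's standard `zlib` module purely as a stand-in; the algorithm Data ONTAP uses on the wire is not stated here, and the function names and sample payload are hypothetical.

```python
import zlib


def send_over_wan(payload: bytes) -> bytes:
    """Compress changed data at the source before it crosses the WAN."""
    return zlib.compress(payload, 6)


def receive_from_wan(wire: bytes) -> bytes:
    """Decompress the data as it lands on the destination."""
    return zlib.decompress(wire)


# Hypothetical, highly repetitive payload standing in for changed data.
changed_data = b"layer metadata: digest, size, media type\n" * 2000

wire = send_over_wan(changed_data)
restored = receive_from_wan(wire)

# The destination ends up with identical data, but fewer bytes crossed the WAN.
print(len(changed_data), len(wire), restored == changed_data)
```

How much is saved depends heavily on the data: content that is already compressed, such as gzip-compressed image layers, will shrink far less than the repetitive sample used here.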

This doesn’t even factor in NetApp’s on-disk storage efficiency. Standard deduplication and compression can still be applied to the data on disk, reducing the capacity needed to store the images beyond what Docker’s shared image layers and gzip compression already save. For many data sets, we see significant savings with simple deduplication alone.
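
As a rough illustration of why block deduplication saves capacity, the sketch below counts how many fixed-size blocks in a volume are unique. It is a simplified model, not NetApp's implementation: the 4096-byte block size, the SHA-256 fingerprint, and the two synthetic "images" are assumptions made for the example.

```python
import hashlib


def dedup_stats(data: bytes, block_size: int = 4096) -> tuple[int, int]:
    """Return (total_blocks, unique_blocks) for fixed-size blocks of data."""
    fingerprints = set()
    total = 0
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        fingerprints.add(hashlib.sha256(block).digest())
        total += 1
    return total, len(fingerprints)


# Ten distinct blocks that two hypothetical images have in common.
shared = b"".join(bytes([i]) * 4096 for i in range(10))

# Each image adds one block of its own on top of the shared content.
image_a = shared + bytes([200]) * 4096
image_b = shared + bytes([201]) * 4096

total, unique = dedup_stats(image_a + image_b)
print(total, unique)  # prints "22 12"
```

Here 22 blocks are written but only 12 need to be stored, because the 10 shared blocks are kept once.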

By using SnapMirror technology we can quickly and easily set up a distributed, replicated instance of Docker Registry. Combining that with some easy-to-apply proxy rules, we create a Registry deployment which can span the entire globe transparently to the users.

Docker Registry with NetApp SnapMirror facilitates a distributed registry with access to all images from all sites
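
The proxy rules themselves are not spelled out here, so the following is only one plausible sketch. It assumes that pulls (HTTP `GET` and `HEAD` requests) are served by the Registry instance at the user's own site, backed by the replicated copy, and that everything else (pushes and deletes) is forwarded to the site holding the writable source volume. The `Site` class, the `choose_upstream` function, and the hostnames are hypothetical.

```python
from dataclasses import dataclass

# Registry pulls use GET and HEAD; pushes use POST, PATCH, and PUT.
READ_METHODS = {"GET", "HEAD"}


@dataclass(frozen=True)
class Site:
    name: str
    local_registry: str    # Registry backed by the local replicated copy (assumed read-only)
    primary_registry: str  # Registry backed by the writable source volume (assumed)


def choose_upstream(method: str, site: Site) -> str:
    """Pick the backend for a request: reads stay local, writes go to the primary."""
    if method.upper() in READ_METHODS:
        return site.local_registry
    return site.primary_registry


# Hypothetical hostnames, for illustration only.
remote = Site(
    name="remote-site",
    local_registry="http://registry.remote.example.com:5000",
    primary_registry="http://registry.primary.example.com:5000",
)

print(choose_upstream("GET", remote))  # pull: served by the local instance
print(choose_upstream("PUT", remote))  # push: forwarded to the primary site
```

Under these assumptions, an image pushed from a remote site may not be pullable locally until the next replication update has completed.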

Be sure to check out Jared Hocutt’s Community Theater talk about how NetApp has deployed an internal Registry which enables access to container images across our global enterprise. The DockerCon Europe 2015 slides can be found here.

If you’re curious about NetApp enabling the Docker Registry, be sure to send us an email to learn how Cloud ONTAP and NetApp Private Storage enable any organization to quickly and easily host its Registry in the cloud, while taking advantage of the incredible storage efficiency of NetApp’s Data ONTAP. If you no longer want to have your Registry in the cloud, that’s OK! The NetApp Data Fabric makes it easy to move that data between the cloud, adjacent to the cloud, and on premises, all with almost no effort.

The NetApp Data Fabric means that you can host your Registry on premises, adjacent to the cloud, or in the cloud, depending on your preferred operating model. You can even move between those models, or any combination of them, seamlessly and on demand. For example, a hybrid cloud model using Cloud ONTAP and Amazon Web Services (AWS) is great for distributed organizations which don’t have a datacenter footprint at remote sites.

If you have any questions about Docker and NetApp’s integration, please reach out to us at opensource@netapp.com! We would love to hear your thoughts!
