Setting up Harbor as a Local Docker Registry

By Jay

Jan 7, 2024 | 7 minutes read

Series: Docker

Tags: blog, tech, docker

A caching registry (also called a proxy cache or pull-through cache) is an application that runs inside your local network and exposes an OCI-compliant endpoint that developers can pull images from. The caching registry checks whether the requested image is stored locally and, if so, serves it. If the image is not in the local store, the caching registry pulls it from the backend registry, stores it, and then provides it to the client.
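To make the hit/miss flow concrete, here is a toy sketch in shell. It is not how Harbor is implemented: two directories stand in for the caching registry and the backend registry, and the names CACHE_DIR, UPSTREAM_DIR, and pull_through are made up for this illustration.

```shell
#!/bin/sh
# Toy pull-through cache: directories stand in for the two registries.
# CACHE_DIR and UPSTREAM_DIR are hypothetical names used only in this sketch.
CACHE_DIR="${CACHE_DIR:-/tmp/cache-registry}"
UPSTREAM_DIR="${UPSTREAM_DIR:-/tmp/upstream-registry}"

# pull_through IMAGE
# Prints "hit" when IMAGE is already in the cache, "miss" when it had to be
# copied from upstream first. Returns 1 if upstream does not have it either.
pull_through() {
  image="$1"
  if [ -f "$CACHE_DIR/$image" ]; then
    echo "hit"
    return 0
  fi
  if [ ! -f "$UPSTREAM_DIR/$image" ]; then
    echo "not found: $image" >&2
    return 1
  fi
  # Populate the cache so the next request is served locally.
  mkdir -p "$(dirname "$CACHE_DIR/$image")"
  cp "$UPSTREAM_DIR/$image" "$CACHE_DIR/$image"
  echo "miss"
}
```

The first call for a given image reports a miss and fills the cache; the second call for the same image reports a hit without touching the upstream directory.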

A caching registry is a valuable tool in software development, particularly in the context of managing Docker container images or similar artifacts. Here’s why it is beneficial and why you might consider using it:

  • Local Caching: When you use a pull-through registry, it caches images locally after the first pull from the upstream registry (like Docker Hub). Subsequent pulls of the same image are much faster since they are served from the local cache.
  • Efficient Builds: This can significantly reduce the time taken for building and deploying applications, especially in a continuous integration/continuous deployment (CI/CD) pipeline.
  • Reduced Data Transfer: By caching images locally, you reduce the amount of data transferred over the internet from the upstream registry.
  • Cost-Effective: This can be particularly important for cloud-based environments where data transfer costs can be significant.
  • Mitigates Risks of Downtime: If the upstream registry is down or experiencing issues, your operations can continue uninterrupted using the cached images.
  • Stable Builds: It ensures that your build environments are more stable and less susceptible to external outages.
  • Version Control: It helps ensure that all users or CI/CD processes are using the same version of an image, reducing the “it works on my machine” problem.
  • Access Control: You can implement access policies and control who can pull which images, enhancing security.
  • Geographical Distribution: For teams distributed across different geographical locations, a caching registry can provide local copies of images, reducing the latency that comes with pulling images from a distant registry.
  • Audit Trails: You can maintain logs of image pulls and pushes, which is useful for compliance and monitoring.
  • Security Scanning: Some caching registries allow for scanning of images for vulnerabilities before they are distributed.

The following diagram illustrates the flow of image requests in a caching registry setup:

[Diagram: Flow]

For this example, we will use Harbor, a popular cloud-native registry that is distributed under an OSS license. Full documentation can be found by navigating to the Harbor GitHub project page at https://github.com/goharbor/harbor.

For our purposes, we’ll be using the online installer, which keeps the download small by pulling the necessary components from the internet during the installation. The installers can be found on the releases page at https://github.com/goharbor/harbor/releases; pick the latest release there to get the most recent features and security updates.

$ tar xzvf ~/Downloads/Harbor\ Online\ Installer\ v2.10.0.tgz
x harbor/prepare
x harbor/LICENSE
x harbor/install.sh
x harbor/common.sh
x harbor/harbor.yml.tmpl
  • The harbor.yml.tmpl file is a template you can use to build your own harbor.yml.
    • At the very least, you will need to change either hostname or external_url.
    • If you are using HTTPS, you will need to define your certs.
      • If not, comment out the HTTPS lines.
      • Note that at some point Harbor will deprecate HTTP; when that happens, you will need to create certs.
  • If you are going to run this in production, you will need to set the security-related parameters in this file.
    • Do not run this as-is in production.
  • The installer will validate the configuration and return an error if it is invalid.
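The hostname and HTTPS edits described above can be scripted rather than done by hand. The following is a minimal sketch, assuming the template has a top-level hostname: line and an https: block that ends with an indented private_key: line (which is how the template I used was laid out; check yours before relying on this). The function names set_hostname and comment_out_https are made up for this post.

```shell
#!/bin/sh
# Both helpers are filters: they read the template on stdin and write the
# edited config to stdout. They are hypothetical helpers, not part of Harbor.

# set_hostname HOSTNAME
# Replaces the value of the top-level "hostname:" key.
set_hostname() {
  sed "s|^hostname:.*|hostname: $1|"
}

# comment_out_https
# Comments out every line from "https:" through the indented "private_key:"
# line. ASSUMPTION: the https block in your template ends with private_key.
comment_out_https() {
  sed '/^https:/,/^  private_key:/ s/^/# /'
}
```

For an HTTP-only test setup, this could be used as set_hostname dock11.virington.com < harbor.yml.tmpl | comment_out_https > harbor.yml, after which the installer is run as shown next.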
$ sudo ./install.sh

[Step 0]: checking if docker is installed ...

Note: docker version: 24.0.2

[Step 1]: checking docker-compose is installed ...

Note: Docker Compose version v2.18.1


[Step 2]: preparing environment ...

[Step 3]: preparing harbor configs ...
prepare base dir is set to /home/ubuntu/harbor
WARNING:root:WARNING: HTTP protocol is insecure. Harbor will deprecate http protocol in the future. Please make sure to upgrade to https
Clearing the configuration file: /config/jobservice/config.yml
Clearing the configuration file: /config/jobservice/env
Clearing the configuration file: /config/registry/config.yml
Clearing the configuration file: /config/registry/passwd
Clearing the configuration file: /config/db/env
Clearing the configuration file: /config/log/rsyslog_docker.conf
Clearing the configuration file: /config/log/logrotate.conf
Clearing the configuration file: /config/nginx/nginx.conf
Clearing the configuration file: /config/portal/nginx.conf
Clearing the configuration file: /config/registryctl/config.yml
Clearing the configuration file: /config/registryctl/env
Clearing the configuration file: /config/core/env
Clearing the configuration file: /config/core/app.conf
Generated configuration file: /config/portal/nginx.conf
Generated configuration file: /config/log/logrotate.conf
Generated configuration file: /config/log/rsyslog_docker.conf
Generated configuration file: /config/nginx/nginx.conf
Generated configuration file: /config/core/env
Generated configuration file: /config/core/app.conf
Generated configuration file: /config/registry/config.yml
Generated configuration file: /config/registryctl/env
Generated configuration file: /config/registryctl/config.yml
Generated configuration file: /config/db/env
Generated configuration file: /config/jobservice/env
Generated configuration file: /config/jobservice/config.yml
loaded secret from file: /data/secret/keys/secretkey
Generated configuration file: /compose_location/docker-compose.yml
Clean up the input dir


Note: stopping existing Harbor instance ...


[Step 4]: starting Harbor ...
[+] Running 62/16
 ✔ log 7 layers [⣿⣿⣿⣿⣿⣿⣿]      0B/0B      Pulled                                                                                      22.9s
 ✔ postgresql 10 layers [⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿]      0B/0B      Pulled                                                                           27.7s
 ✔ core 9 layers [⣿⣿⣿⣿⣿⣿⣿⣿⣿]      0B/0B      Pulled                                                                                   26.2s
 ✔ registry 5 layers [⣿⣿⣿⣿⣿]      0B/0B      Pulled                                                                                    7.9s
 ✔ registryctl 7 layers [⣿⣿⣿⣿⣿⣿⣿]      0B/0B      Pulled                                                                               5.4s
 ✔ redis 5 layers [⣿⣿⣿⣿⣿]      0B/0B      Pulled                                                                                      22.8s
 ✔ jobservice 5 layers [⣿⣿⣿⣿⣿]      0B/0B      Pulled                                                                                 16.5s
 ✔ portal 4 layers [⣿⣿⣿⣿]      0B/0B      Pulled                                                                                      25.2s
 ✔ proxy 1 layers []      0B/0B      Pulled                                                                                          22.4s

[+] Building 0.0s (0/0)
[+] Running 10/10
 ✔ Network harbor_harbor        Created                                                                                                0.1s
 ✔ Container harbor-log         Started                                                                                                4.6s
 ✔ Container registry           Started                                                                                                1.6s
 ✔ Container harbor-db          Started                                                                                                1.8s
 ✔ Container redis              Started                                                                                                1.7s
 ✔ Container harbor-portal      Started                                                                                                1.8s
 ✔ Container registryctl        Started                                                                                                1.7s
 ✔ Container harbor-core        Started                                                                                                2.8s
 ✔ Container harbor-jobservice  Started                                                                                                3.5s
 ✔ Container nginx              Started                                                                                                3.5s
✔ ----Harbor has been installed and started successfully.----

You will connect on port 80, unless you have enabled and configured SSL.
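Before opening the browser, it can be handy to confirm from the command line that Harbor is answering. The sketch below assumes that Harbor's v2.0 API exposes a ping endpoint at /api/v2.0/ping that replies with Pong, and that curl is installed. The function name harbor_ping is made up for this post.

```shell
#!/bin/sh
# harbor_ping BASE_URL
# Succeeds (exit 0) when BASE_URL/api/v2.0/ping answers with "Pong".
# ASSUMPTION: the ping endpoint path and reply; verify against your version.
harbor_ping() {
  base_url="${1%/}"    # drop a trailing slash, if any
  reply="$(curl -fsS --max-time 10 "$base_url/api/v2.0/ping")" || return 1
  case "$reply" in
    *Pong*) return 0 ;;
    *)      return 1 ;;
  esac
}
```

For the HTTP setup used here, that would be harbor_ping http://dock11.virington.com && echo "Harbor is up".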

First, go to the “Registries” page.

Then create a new endpoint; for our example we will use Docker Hub.

Go to the projects page.

Now create a new project; the name we define here will be used as part of the Docker pull/push commands to and from this registry. Since we are creating a proxy cache registry, we check the proxy cache box in the create dialog and select the target, which is the Docker Hub endpoint we defined above.

If you are creating a registry hosted by Harbor (not a pull-through), you would uncheck the proxy cache box.
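The same project can probably be created without the UI by calling Harbor's REST API. Treat the following as a sketch under assumptions rather than a reference: the endpoint path (/api/v2.0/projects) and the payload fields (project_name, registry_id, metadata.public) reflect my understanding of the v2.0 API and should be checked against the API documentation for your Harbor version. The function names and the HARBOR_USER and HARBOR_PASSWORD variables are made up for this post.

```shell
#!/bin/sh
# proxy_project_payload NAME REGISTRY_ID
# Prints the JSON body for a public proxy cache project.
# ASSUMPTION: field names follow the Harbor v2.0 API; verify before use.
proxy_project_payload() {
  printf '{"project_name":"%s","registry_id":%d,"metadata":{"public":"true"}}\n' "$1" "$2"
}

# create_proxy_project BASE_URL NAME REGISTRY_ID
# Posts the payload using credentials from HARBOR_USER and HARBOR_PASSWORD
# (hypothetical variable names used only in this sketch).
create_proxy_project() {
  curl -fsS -u "${HARBOR_USER}:${HARBOR_PASSWORD}" \
    -H 'Content-Type: application/json' \
    -X POST "${1%/}/api/v2.0/projects" \
    -d "$(proxy_project_payload "$2" "$3")"
}
```

REGISTRY_ID is the numeric ID of the endpoint created on the “Registries” page. A call might look like create_proxy_project http://dock11.virington.com dockerhub 1, where the ID 1 is only an example value.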

You should now be able to test pulling an image through the proxy cache. The format to use the proxy cache is hostname/projectname/image:tag. For example, dock11.virington.com/dockerhub/nginx:latest.

$ docker pull dock11.virington.com/dockerhub/nginx:latest
latest: Pulling from dockerhub/nginx
58cc89079bd7: Download complete
6f26751fc54b: Download complete
c98494bb3682: Download complete
3799b53049f3: Download complete
2a580edba2f4: Download complete
24e221e92a36: Download complete
cfe7877ea167: Download complete
Digest: sha256:2bdc49f2f8ae8d8dc50ed00f2ee56d00385c6f8bc8a8b320d0a294d9e3b49026
Status: Downloaded newer image for dock11.virington.com/dockerhub/nginx:latest
dock11.virington.com/dockerhub/nginx:latest

What's Next?
  View a summary of image vulnerabilities and recommendations → docker scout quickview dock11.virington.com/dockerhub/nginx:latest
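If you have scripts or CI jobs that pull many images, a small helper can build the hostname/projectname/image:tag reference for you. This is only a convenience sketch; proxy_ref is a name made up for this post, and the only rewriting it does is dropping an explicit docker.io/ prefix.

```shell
#!/bin/sh
# proxy_ref HARBOR_HOST PROJECT IMAGE
# Prints the reference to use when pulling IMAGE through the proxy cache.
# (proxy_ref is a hypothetical helper, not a Docker or Harbor command.)
proxy_ref() {
  image="${3#docker.io/}"    # drop an explicit Docker Hub prefix, if present
  printf '%s/%s/%s\n' "$1" "$2" "$image"
}
```

With it, the pull above could be written as docker pull "$(proxy_ref dock11.virington.com dockerhub nginx:latest)".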

The first pull will require Harbor to pull the image from Docker Hub, while every subsequent pull will use the cached version. If there is a change to the upstream version, the newer version for the tag will be pulled to Harbor and then to the client. This is accomplished by checking the image manifest to determine if there was a change.
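The idea behind the manifest check can be sketched with plain files. An image manifest is identified by the SHA-256 digest of its bytes, so two manifests with different digests describe different images. The snippet below is not Harbor's code; it uses local files in place of manifests and assumes sha256sum from GNU coreutils is available. The function names are made up for this post.

```shell
#!/bin/sh
# manifest_digest FILE
# Prints "sha256:<hex>" for the bytes in FILE, the same form Docker shows
# on the "Digest:" line of a pull.
manifest_digest() {
  printf 'sha256:%s\n' "$(sha256sum "$1" | cut -d' ' -f1)"
}

# cache_is_stale CACHED_MANIFEST UPSTREAM_MANIFEST
# Succeeds (exit 0) when the digests differ, meaning the tag now points
# at a different image upstream and the cached copy should be refreshed.
cache_is_stale() {
  [ "$(manifest_digest "$1")" != "$(manifest_digest "$2")" ]
}
```

When the two digests match, the cached copy can be served as-is; when they differ, the newer image is fetched first.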

Harbor provides data on what images/tags are stored. To see it, navigate to your project (in this case, dockerhub); you will see a screen with information on that registry, as shown below.

[Screenshot: Check]

In this case, we see our one image, dockerhub/library/nginx:latest, along with the number of pulls and the last modified time. We can also see that we are using 64.1MiB of our unlimited quota, and that the repository is public.

Note that any images provided by Docker directly (the Docker Official Images, such as nginx) are shown under library, which is the namespace Docker Hub uses for them.

  • You can create as many registries as you want within Harbor, subject to the available resources.
  • Login to Harbor can be enforced, including for pull-through registries.
  • Harbor can authenticate against its built-in database, or via LDAP/Active Directory or OIDC.
  • Harbor can replicate between registries.
  • Harbor allows for scanners to be added in order to scan data at rest.
  • Harbor supports most registries, including Artifactory, GitHub, ECR, ACR, and GCR.
  • Other options are to use the Docker Registry image as a pull-through registry, or to mirror the Docker registry. Instructions for both use cases can be found in the Docker documentation.