Docker Part 9 — Registry: Where Do Images Live?

What Is a Registry?
Major Registry Comparison
Docker Hub — The Most Common Starting Point
Cloud Provider Registries
Self-Hosted Registry — Setting Up with Harbor
Tagging Strategy — The Most Common Mistake and Alternatives
Pinning Images by Digest
Multi-Architecture Images
The Complete Deployment Flow
Where We Stand

What Is a Registry?

A registry is a storage service for images. More precisely, it is an image server that implements the OCI (Open Container Initiative) Distribution Specification. Docker open-sourced a reference implementation called registry (now maintained as the CNCF Distribution project). Harbor builds on it to add RBAC, scanning, replication, and other features, while products like Nexus implement the same API themselves.

flowchart LR
    DEV["Developer PC"] -->|docker push| REG["Registry"]
    CI["CI/CD"] -->|docker push| REG
    REG -->|docker pull| K8S["Kubernetes"]
    REG -->|docker pull| SERVER["Server host"]
    REG -->|docker pull| DEV2["Other developer"]

A single image consists of multiple layers, and a registry manages these layers by their hash (digest). When different images share the same layer, storage space is saved, and during pulls, layers already present are skipped.
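To make the content-addressing idea concrete, here is a minimal sketch that computes a registry-style digest for a file. The function name blob_digest and the file names are illustrative, and the sketch assumes GNU coreutils' sha256sum is installed. Registries identify stored blobs in the same sha256:<hex> form, which is why identical layers are stored and transferred only once.

```shell
# Compute a registry-style digest ("sha256:<hex>") over a file's bytes.
# Assumes sha256sum is available (on macOS, `shasum -a 256` is the usual substitute).
blob_digest() {
  printf 'sha256:%s\n' "$(sha256sum "$1" | cut -d' ' -f1)"
}

# Two files with identical bytes produce the same digest,
# so a registry would keep only one copy of that content.
tmpdir=$(mktemp -d)
printf 'same layer content' > "$tmpdir/layer-a.tar"
printf 'same layer content' > "$tmpdir/layer-b.tar"
blob_digest "$tmpdir/layer-a.tar"
blob_digest "$tmpdir/layer-b.tar"
```

Both calls print the same value, which is the property a registry relies on to deduplicate layers and to skip layers a client already has.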

Major Registry Comparison

The registries most commonly used in practice compare as follows:

Registry	Operated By	Private Support	Cost	Strengths
Docker Hub	Docker Inc.	Paid plans	Moderate	Default store for official images
AWS ECR	AWS	Private by default	Storage + data transfer	IAM integration, VPC endpoint
GCP Artifact Registry (successor to GCR)	Google	Private by default	Storage + network	GKE integration
Azure ACR	Microsoft	Private by default	Tiered	Microsoft Entra ID (formerly AAD) integration
GitHub Container Registry	GitHub	Public/private	Free for public	GitHub Actions integration
Harbor (self-hosted)	Self-operated (CNCF project)	Private by default	Infrastructure cost	RBAC, scanning, replication, signing
Nexus / JFrog	Commercial	Private by default	License	Multi-format (Maven, npm, etc.)

The selection criteria roughly break down as follows. For workloads that run inside a single cloud, that cloud’s registry offers benefits in IAM integration and VPC-internal pulls. For on-premises or multi-cloud, Harbor is the most versatile. For public open-source distribution, Docker Hub or GHCR is the way to go.

Docker Hub — The Most Common Starting Point

With either an individual or an organization account, you can create public repositories for free after signing up.

# Login
docker login

# Tag and push
docker tag myapp:latest mydockerid/myapp:1.0.0
docker push mydockerid/myapp:1.0.0

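After a successful login, the credentials are stored in ~/.docker/config.json (or in the OS credential store if a credential helper is configured). In CI environments, it is safer to issue a Personal Access Token with the minimum required scope instead of using the account password. Never hardcode account passwords in CI.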

# Non-interactive login in CI environments
echo "$DOCKERHUB_TOKEN" | docker login -u "$DOCKERHUB_USER" --password-stdin

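Docker Hub enforces pull rate limits. For a long time these were 100 pulls per 6 hours for anonymous users (counted per IP address) and 200 pulls per 6 hours for authenticated free accounts. Docker has revised the numbers since, so check the current documentation. Large clusters commonly hit the limit when many nodes pull from behind a shared IP without a pull-through cache or mirror in front. Using Docker Hub alone as a production registry is not recommended.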
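Docker Hub reports the limit that applies to a client in response headers such as ratelimit-limit: 100;w=21600, where w is the window in seconds. The sketch below parses that format and, as a second function, queries the headers anonymously. The function names are illustrative, the endpoints are the ones Docker's documentation describes as best I recall them, and show_ratelimit assumes curl, jq, and network access, so it is defined but not called here.

```shell
# Split a header value such as "100;w=21600" into "<pulls> <window-seconds>".
parse_ratelimit() {
  value=$1
  pulls=${value%%;*}      # everything before the first ';'
  window=${value##*w=}    # everything after 'w='
  printf '%s %s\n' "$pulls" "$window"
}

# Fetch the current rate-limit headers for an anonymous client.
# Needs curl, jq, and network access; not invoked in this sketch.
show_ratelimit() {
  token=$(curl -fsS "https://auth.docker.io/token?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull" | jq -r .token)
  curl -fsS --head -H "Authorization: Bearer $token" \
    "https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest" \
    | grep -i '^ratelimit-'
}

parse_ratelimit "100;w=21600"   # prints: 100 21600
```

Running show_ratelimit from a cluster node is a quick way to see how much headroom the node's outbound IP has left before pulls start failing.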

Cloud Provider Registries

AWS ECR

Authentication is via IAM. The token from aws ecr get-login-password is passed to docker login.

aws ecr get-login-password --region ap-northeast-2 \
  | docker login --username AWS \
      --password-stdin 123456789012.dkr.ecr.ap-northeast-2.amazonaws.com

docker tag myapp:latest 123456789012.dkr.ecr.ap-northeast-2.amazonaws.com/myapp:1.0.0
docker push 123456789012.dkr.ecr.ap-northeast-2.amazonaws.com/myapp:1.0.0

The token is valid for 12 hours. On EC2, an instance profile supplies the IAM credentials, and on EKS the kubelet obtains ECR credentials through the node's IAM role, so no manual login is needed. In GitHub Actions, aws-actions/configure-aws-credentials with OIDC federation enables login without long-lived keys.
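Because the token expires, CI jobs usually log in at the start of every run rather than caching credentials. A small wrapper like the one below keeps the long registry hostname in one place. The function names ecr_host and ecr_login are hypothetical, and the hostname pattern assumes a standard commercial region (other partitions use different domain suffixes).

```shell
# Build the ECR registry hostname from an account ID and a region.
ecr_host() {
  printf '%s.dkr.ecr.%s.amazonaws.com\n' "$1" "$2"
}

# Log Docker in to that registry with a fresh token.
# Requires the aws and docker CLIs plus valid IAM credentials; not invoked here.
ecr_login() {
  host=$(ecr_host "$1" "$2")
  aws ecr get-login-password --region "$2" \
    | docker login --username AWS --password-stdin "$host"
}

ecr_host 123456789012 ap-northeast-2
# prints: 123456789012.dkr.ecr.ap-northeast-2.amazonaws.com
```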

GCP Artifact Registry

The older Container Registry (GCR) has been deprecated in favor of Artifact Registry, which is now the official option.

gcloud auth configure-docker asia-northeast3-docker.pkg.dev

docker tag myapp:latest \
  asia-northeast3-docker.pkg.dev/my-project/my-repo/myapp:1.0.0
docker push asia-northeast3-docker.pkg.dev/my-project/my-repo/myapp:1.0.0

gcloud auth configure-docker sets up the Docker credential helper to use the current gcloud authentication.

Azure ACR

az acr login --name myregistry
docker tag myapp:latest myregistry.azurecr.io/myapp:1.0.0
docker push myregistry.azurecr.io/myapp:1.0.0

Authentication uses Microsoft Entra ID (formerly Azure AD) tokens. On AKS, az aks update --attach-acr grants the cluster's kubelet identity pull access to the ACR, enabling pulls without imagePullSecrets.

Self-Hosted Registry — Setting Up with Harbor

For on-premises environments or when full control is needed, Harbor is the de facto standard. The most common installation is via Helm on Kubernetes:

helm repo add harbor https://helm.goharbor.io
helm repo update

helm install harbor harbor/harbor \
  --namespace harbor --create-namespace \
  --set expose.type=ingress \
  --set expose.ingress.hosts.core=harbor.example.com \
  --set externalURL=https://harbor.example.com

Harbor bundles image storage, RBAC, vulnerability scanning (Trivy by default; older releases also shipped Clair), project-level isolation, and replication in one package.

flowchart TB
    subgraph HARBOR["Harbor"]
      UI["Web UI / API"]
      REG2["Registry Backend"]
      SCAN["Vulnerability Scanner (Trivy)"]
      REPL["Replication Controller"]
      DB["Postgres / Redis"]
    end
    DEV["Developer / CI"] -->|push / pull| UI
    UI --> REG2
    REG2 --> SCAN
    REG2 -.->|replication| REMOTE["Remote registry"]
    UI --> DB

You create projects and assign each user a role per project (for example Guest, Developer, or Maintainer). This makes it easy to partition access by team. You can also configure policies to scan images automatically on push and to block pulls of images whose vulnerabilities exceed a severity threshold.

Tagging Strategy — The Most Common Mistake and Alternatives

The most frequent beginner mistake is overwriting all images with :latest. latest is not really a “version” — it is a kind of “pointer,” so today’s latest and yesterday’s latest can be entirely different images. During production incident recovery, you have no way of knowing which image to roll back to.

There are three main tagging approaches used in practice:

1. semver — Release Versioning

Used for images where users (other teams or external parties) need to explicitly choose a version.

docker tag myapp:build harbor.example.com/team/myapp:1.4.2
docker tag myapp:build harbor.example.com/team/myapp:1.4
docker tag myapp:build harbor.example.com/team/myapp:1
docker tag myapp:build harbor.example.com/team/myapp:latest

Tagging 1.4.2, 1.4, and 1 together lets users pin at their desired level of granularity. latest is used only as a pointer to the latest stable release.
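The floating tags can be derived from the full version instead of being typed by hand. The following is a minimal sketch with a hypothetical helper named semver_tags. It assumes a plain MAJOR.MINOR.PATCH version, and pre-release versions such as 1.4.2-rc.1 would need extra handling so they do not move the 1.4 and 1 tags.

```shell
# Print the full version followed by its MAJOR.MINOR and MAJOR floating tags.
semver_tags() {
  version=$1
  major=${version%%.*}        # "1"   from "1.4.2"
  minor_patch=${version#*.}   # "4.2" from "1.4.2"
  minor=${minor_patch%%.*}    # "4"   from "4.2"
  printf '%s\n%s.%s\n%s\n' "$version" "$major" "$minor" "$major"
}

semver_tags 1.4.2
# prints:
# 1.4.2
# 1.4
# 1
```

In a release script, the output can be fed into a loop that runs docker tag and docker push once per line.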

2. git sha — Clear Traceability

For internal service images, commit hashes are the most practical. The exact code that produced the image is embedded in the tag.

TAG=$(git rev-parse --short HEAD)   # e.g., a1b2c3d
docker build -t harbor.example.com/team/myapp:$TAG .
docker push harbor.example.com/team/myapp:$TAG

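This pairs well with GitOps tools like Argo CD and Flux. A commit hash never refers to different code, so as long as nobody pushes over the tag it always identifies the same image. (The registry does not enforce this by itself: tags stay mutable unless immutability is configured.) That makes rollbacks simple and incident root-cause analysis straightforward, since the commit can be found immediately from the tag.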
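When tags are generated from Git metadata in CI, a quick format check can catch values Docker would reject, such as branch names containing a slash. The sketch below uses a hypothetical helper named valid_tag and encodes the documented tag rules: letters, digits, underscores, periods, and dashes, at most 128 characters, and not starting with a period or a dash.

```shell
# Succeed if the argument is a syntactically valid image tag.
valid_tag() {
  printf '%s\n' "$1" | grep -Eq '^[A-Za-z0-9_][A-Za-z0-9_.-]{0,127}$'
}

for t in a1b2c3d 1.4.2-a1b2c3d feature/login; do
  if valid_tag "$t"; then
    echo "ok:       $t"
  else
    echo "rejected: $t"
  fi
done
```

A short commit hash always passes this check, which is one more reason it works well as a tag.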

3. Combined — semver + git sha + environment

The most robust approach is combining multiple tags for different purposes:

VERSION=1.4.2
SHA=$(git rev-parse --short HEAD)
REG=harbor.example.com/team/myapp

docker build -t $REG:$VERSION -t $REG:$VERSION-$SHA -t $REG:$SHA .
docker push $REG:$VERSION
docker push $REG:$VERSION-$SHA
docker push $REG:$SHA

:1.4.2 — Release tag, referenced by docs/deployment pipelines
:1.4.2-a1b2c3d — Release + commit, for audit trails
:a1b2c3d — Commit standalone, for internal PR/staging environments

Patterns to Avoid

Using :latest only — no reproducibility
Environment name tags like :dev, :stable alone — no way to tell what is inside
Overwriting (pushing over an existing tag) — audit nightmare
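Overwrites can be guarded against on the client side with a check before the push. The sketch below uses a hypothetical wrapper named push_once around docker manifest inspect, which exits non-zero when the manifest cannot be fetched. That failure can also mean an authentication or network problem, and there is a race between the check and the push, so treat this as a convenience rather than a guarantee. Registry-side settings such as Harbor's tag immutability rules or ECR's tag immutability option are the more robust choice.

```shell
# Push an image only if its tag does not already exist in the registry.
push_once() {
  image=$1
  if docker manifest inspect "$image" >/dev/null 2>&1; then
    echo "refusing to overwrite existing tag: $image" >&2
    return 1
  fi
  docker push "$image"
}

# Usage (requires docker and registry access):
#   push_once harbor.example.com/team/myapp:1.4.2
```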

Pinning Images by Digest

Tags are pointers, but digests (sha256:...) are immutable. For truly critical images (security bases, shared libraries), pinning by digest is the safe bet.

docker pull alpine:3.20
docker inspect alpine:3.20 --format '{{index .RepoDigests 0}}'
# alpine@sha256:aabbcc...

You can pin in a Dockerfile like this:

FROM alpine@sha256:aabbccddeeff0011...

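It looks inconvenient, but it is a reliable defense against supply chain attacks. Tags are mutable: official images are routinely rebuilt and republished under the same tag, and a compromised publisher account could repoint a tag to malicious content. A digest always resolves to exactly the bytes you reviewed.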
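A pinning policy is easier to keep if CI checks it. The following is a rough sketch with hypothetical helpers unpinned_bases and check_pinned that flag FROM lines lacking a full sha256 digest. It is deliberately simple: it would also flag FROM scratch, references to earlier build stages, and images supplied through ARG, so a real check would need an allowlist.

```shell
# Print FROM lines that do not reference their image by a full sha256 digest.
unpinned_bases() {
  grep -iE '^[[:space:]]*FROM[[:space:]]' "$1" \
    | grep -vE '@sha256:[0-9a-f]{64}([[:space:]]|$)'
}

# Fail (non-zero) when any unpinned base image is found.
check_pinned() {
  offenders=$(unpinned_bases "$1" || true)
  if [ -n "$offenders" ]; then
    printf 'unpinned base images:\n%s\n' "$offenders" >&2
    return 1
  fi
}

# Usage:
#   check_pinned Dockerfile || exit 1
```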

Multi-Architecture Images

Sometimes you want to serve both AMD64 and ARM64 under the same tag. With the shift to Apple Silicon on MacBooks, this has become a common requirement.

docker buildx create --name multiarch --use
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  -t harbor.example.com/team/myapp:1.4.2 \
  --push .

buildx is a Docker CLI plugin that exposes BuildKit's extended build features. Given a list of architectures via --platform, it builds the image for each one and pushes them together under a single manifest list (an OCI image index). The pulling side automatically receives the image matching its architecture. Details are covered in Part 11.
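The platform strings passed to --platform differ from the architecture names the kernel reports, which is a common source of confusion. The sketch below, with a hypothetical helper named platform_for_arch, maps uname -m output to the two platforms used in this section. Other architectures are simply reported as unsupported.

```shell
# Map a kernel architecture name to the matching Docker platform string.
platform_for_arch() {
  case "$1" in
    x86_64|amd64)  echo linux/amd64 ;;
    aarch64|arm64) echo linux/arm64 ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Show which variant of a multi-arch image this machine would receive.
platform_for_arch "$(uname -m)" || true

# After a push, the per-platform entries can be listed with:
#   docker buildx imagetools inspect harbor.example.com/team/myapp:1.4.2
```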

The Complete Deployment Flow

The entire flow from CI building and pushing an image to a cluster pulling and deploying looks like this:

sequenceDiagram
    participant Dev as Developer
    participant Git as Git
    participant CI as CI/CD
    participant Reg as Registry
    participant K8s as Kubernetes

    Dev->>Git: git push
    Git->>CI: webhook
    CI->>CI: Build + test
    CI->>Reg: docker push myapp:1.4.2-a1b2c3d
    CI->>Git: Update manifest image tag
    Git->>K8s: Argo CD sync
    K8s->>Reg: pull myapp:1.4.2-a1b2c3d
    K8s->>K8s: Rolling update
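The "Update manifest image tag" step in the flow above is often a one-line text substitution committed back to the GitOps repository. Here is a minimal sketch using sed. The helper name set_image_tag, the manifest content, and the old tag are made up for illustration, and many teams use kustomize or yq for this instead. Dots in the repository name are treated as regex wildcards, which is harmless here but worth knowing.

```shell
# Rewrite the tag of one image in a Kubernetes manifest, keeping the repository.
set_image_tag() {
  file=$1; repo=$2; tag=$3
  tmp=$(mktemp) || return 1
  if sed -E "s|(image:[[:space:]]*$repo):[^[:space:]\"']+|\1:$tag|" "$file" > "$tmp"; then
    cat "$tmp" > "$file"   # overwrite in place, preserving file permissions
  fi
  rm -f "$tmp"
}

# Demo on a throwaway manifest fragment.
manifest=$(mktemp)
printf 'containers:\n  - name: myapp\n    image: harbor.example.com/team/myapp:1.4.1-0000000\n' > "$manifest"
set_image_tag "$manifest" harbor.example.com/team/myapp 1.4.2-a1b2c3d
cat "$manifest"
```

After the substitution, CI commits the changed file, and the GitOps controller picks up the new tag on its next sync.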

An image is not complete just because it was built — it becomes a “deployable artifact” only when uploaded to a registry. Deciding where to store it and which tag to use is not a trivial task but a major pillar of operational stability.

Where We Stand

The structure for sharing images externally is now in place. The next step is how to build those images securely.

In the next part, we cover container security. Non-root execution, image scanning, secret management, read-only filesystems — the things that need to be blocked before they blow up in production.

→ Part 10: Security