Constructing and Pushing Docker Images without Docker

Once upon a time, I ran into a particular use case: I wanted to build and push a Docker image without using the docker binary or any similar tool.

The approach documented below relies only on the Go standard library.

I made some assumptions, so this workflow doesn't cover arbitrary scenarios. It relies on the following assumptions and simplifications:

the program deals with an image for a single statically built linux/amd64 Go binary that has no external dependencies, the kind of image you can get with the following Dockerfile:
```
FROM scratch
COPY main /main
CMD ["/main"]
```
the program should work with an AWS ECR repository;
for the sake of simplicity, the program reads authentication credentials from the local docker configuration.

Steps

The Docker registry provides an HTTP API; its documentation conveniently has a Pushing An Image section.

In short, the entire sequence to build and publish an image has these steps:

Construct a Docker image layer. My docker image will have a single layer holding a single file.
Construct an image config: a JSON file referencing the image layer by the digest of its uncompressed content (sha256 sum of layer.tar), plus some metadata, such as OS and architecture, what command to run, and what environment variables to use.
Upload a docker layer object.
Upload an image config object.
Construct an image manifest — a JSON file referencing layer and config by their digests.
Publish an image manifest.

Authorization

All API endpoints used by this workflow require authorization.

AWS ECR documentation clarifies how to do this: each request must have an Authorization: Basic $TOKEN header.

If you inspect a ~/.docker/config.json file after running the docker login command, you'll see that this file holds a mapping between registry domain and authorization token.

For simplicity, my code will get an authorization token for the matching domain from this file.
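As a minimal sketch of that lookup, assuming the file has the plain layout that docker login writes when no credential helper is configured (a top-level "auths" object keyed by registry domain, each entry holding an "auth" string), the code could look like this. The function name authFromConfig and the sample domain are made up for illustration:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// authFromConfig returns the stored "auth" value for domain from the contents
// of a Docker config.json file. The value is used as-is after "Basic " in the
// Authorization header. (Hypothetical helper, not taken from the prototype.)
func authFromConfig(data []byte, domain string) (string, error) {
	var cfg struct {
		Auths map[string]struct {
			Auth string `json:"auth"`
		} `json:"auths"`
	}
	if err := json.Unmarshal(data, &cfg); err != nil {
		return "", err
	}
	entry, ok := cfg.Auths[domain]
	if !ok || entry.Auth == "" {
		// An empty value usually means credentials live in a credential helper.
		return "", fmt.Errorf("no stored credentials for %q", domain)
	}
	return entry.Auth, nil
}

func main() {
	// Sample data; a real program would read ~/.docker/config.json instead.
	sample := []byte(`{"auths":{"registry.example.com":{"auth":"dXNlcjpwYXNz"}}}`)
	auth, err := authFromConfig(sample, "registry.example.com")
	if err != nil {
		panic(err)
	}
	fmt.Println("Authorization: Basic " + auth)
}
```

If a credential helper or credentials store is configured, the "auth" field is typically empty, and this sketch reports an error rather than trying to call the helper.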

Parsing Full Image Name

The code in my prototype has to deal with a full Docker image name, which consists of 3 parts: the Docker registry domain, the image name, and a tag. For example, for an image identified by the full name public.ecr.aws/amazonlinux/amazonlinux:latest, the code needs to distinguish between the domain (public.ecr.aws), the short name (amazonlinux/amazonlinux), and the tag (latest).

Parsing this is trivial. The code has a dedicated type to hold the separate parts:

type imageSpec struct {
        Domain string
        Name   string
        Tag    string
}
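One possible way to fill this type is sketched below. The parseImageSpec name, the fallback to the latest tag when none is given, and the lack of support for digest references (name@sha256:...) are my assumptions for this example:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

type imageSpec struct {
	Domain string
	Name   string
	Tag    string
}

// parseImageSpec splits "domain/name:tag" into its parts. Everything before
// the first slash is treated as the registry domain; the tag is whatever
// follows the last colon of the remainder.
func parseImageSpec(s string) (imageSpec, error) {
	domain, rest, ok := strings.Cut(s, "/")
	if !ok || domain == "" || rest == "" {
		return imageSpec{}, errors.New("image name must look like domain/name[:tag]")
	}
	name, tag := rest, "latest" // assumption: default to "latest"
	if i := strings.LastIndexByte(rest, ':'); i >= 0 {
		name, tag = rest[:i], rest[i+1:]
	}
	if name == "" || tag == "" {
		return imageSpec{}, errors.New("image name or tag is empty")
	}
	return imageSpec{Domain: domain, Name: name, Tag: tag}, nil
}

func main() {
	spec, err := parseImageSpec("public.ecr.aws/amazonlinux/amazonlinux:latest")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", spec)
}
```

Cutting at the first slash before looking for the tag colon keeps a domain with a port (such as localhost:5000) from being mistaken for a tag separator.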

Constructing Docker Image Layer

A Docker image layer is just a gzip-compressed tar archive. The Go standard library packages archive/tar and compress/gzip cover this case.

One notable caveat is that I need to calculate sha256 checksums of both the uncompressed and the compressed content (layer.tar and layer.tar.gz, respectively) during layer construction. There's a convenient way to do this while the program constructs the layer: hash.Hash implements the io.Writer interface, and io.MultiWriter allows writing data to multiple writers at once.

The relevant piece of code looks like this:

outerHash, innerHash := sha256.New(), sha256.New()
buf := new(bytes.Buffer)
gw := gzip.NewWriter(io.MultiWriter(buf, outerHash)) // compressed stream (layer.tar.gz)
tw := tar.NewWriter(io.MultiWriter(gw, innerHash))   // uncompressed stream (layer.tar)
if err := tw.WriteHeader(&tar.Header{
        Name:    "main",
        Mode:    0755,
        ModTime: fi.ModTime(),
        Size:    fi.Size(),
}); err != nil {
        return nil, nil, err
}

...

outerDigest := fmt.Sprintf("sha256:%x", outerHash.Sum(nil))
innerDigest := fmt.Sprintf("sha256:%x", innerHash.Sum(nil))
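To show the part that the snippet elides, here is a self-contained sketch of the same technique that takes file content from memory instead of a file on disk. The buildLayer and verifyLayer names and the fixed epoch timestamp are assumptions for this example. The detail worth noting is the order: the tar writer is closed first, then the gzip writer, and only then are the digests read.

```go
package main

import (
	"archive/tar"
	"bytes"
	"compress/gzip"
	"crypto/sha256"
	"fmt"
	"io"
	"time"
)

// epoch is a fixed modification time, used here so that digests are reproducible.
var epoch = time.Unix(0, 0)

// buildLayer packs content as a single executable file into a tar.gz layer and
// returns the layer bytes with the compressed and uncompressed digests.
func buildLayer(name string, content []byte, modTime time.Time) ([]byte, string, string, error) {
	outerHash, innerHash := sha256.New(), sha256.New()
	buf := new(bytes.Buffer)
	gw := gzip.NewWriter(io.MultiWriter(buf, outerHash)) // compressed stream
	tw := tar.NewWriter(io.MultiWriter(gw, innerHash))   // uncompressed stream
	hdr := &tar.Header{Name: name, Mode: 0755, ModTime: modTime, Size: int64(len(content))}
	if err := tw.WriteHeader(hdr); err != nil {
		return nil, "", "", err
	}
	if _, err := tw.Write(content); err != nil {
		return nil, "", "", err
	}
	// Close tar first (writes the archive trailer), then gzip (flushes the rest).
	if err := tw.Close(); err != nil {
		return nil, "", "", err
	}
	if err := gw.Close(); err != nil {
		return nil, "", "", err
	}
	outer := fmt.Sprintf("sha256:%x", outerHash.Sum(nil))
	inner := fmt.Sprintf("sha256:%x", innerHash.Sum(nil))
	return buf.Bytes(), outer, inner, nil
}

// verifyLayer recomputes both digests from the finished layer bytes.
func verifyLayer(layer []byte, outerDigest, innerDigest string) error {
	if got := fmt.Sprintf("sha256:%x", sha256.Sum256(layer)); got != outerDigest {
		return fmt.Errorf("compressed digest mismatch: %s", got)
	}
	gr, err := gzip.NewReader(bytes.NewReader(layer))
	if err != nil {
		return err
	}
	h := sha256.New()
	if _, err := io.Copy(h, gr); err != nil {
		return err
	}
	if got := fmt.Sprintf("sha256:%x", h.Sum(nil)); got != innerDigest {
		return fmt.Errorf("uncompressed digest mismatch: %s", got)
	}
	return nil
}

func main() {
	layer, outer, inner, err := buildLayer("main", []byte("not a real binary"), epoch)
	if err != nil {
		panic(err)
	}
	if err := verifyLayer(layer, outer, inner); err != nil {
		panic(err)
	}
	fmt.Println("layer size:", len(layer))
	fmt.Println("layer.tar.gz digest:", outer)
	fmt.Println("layer.tar digest:", inner)
}
```

Reading the hashes before both writers are closed would give digests of truncated streams, which the registry would reject on upload.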

Image Config

A minimal image config may look like this:

{
    "os": "linux",
    "architecture": "amd64",
    "created": "2021-03-13T16:39:51.535472845Z",
    "config": {
        "Env": ["PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"],
        "Cmd": ["/main"],
        "WorkingDir": "/",
        "ArgsEscaped": true
    },
    "rootfs": {
        "type": "layers",
        "diff_ids": [
            "sha256:0a7631da79a7cf9bfbe5c09457481b869b45095dfd309681f7ed465e711815ed"
        ]
    }
}

The array under rootfs → diff_ids references layers by the digests of their uncompressed content. That corresponds to the innerDigest variable in the code snippet from the previous section.
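A config like the one above can be produced with encoding/json from a few small structs. This is only a sketch: the type and function names are mine, and it covers just the fields needed for the single-binary case (it leaves out ArgsEscaped, for instance).

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

type containerConfig struct {
	Env        []string `json:"Env"`
	Cmd        []string `json:"Cmd"`
	WorkingDir string   `json:"WorkingDir"`
}

type rootFS struct {
	Type    string   `json:"type"`
	DiffIDs []string `json:"diff_ids"`
}

type imageConfig struct {
	OS           string          `json:"os"`
	Architecture string          `json:"architecture"`
	Created      time.Time       `json:"created"`
	Config       containerConfig `json:"config"`
	RootFS       rootFS          `json:"rootfs"`
}

// newImageConfig describes a linux/amd64 image with a single layer, identified
// by innerDigest (the sha256 digest of the uncompressed layer.tar).
func newImageConfig(innerDigest string, cmd ...string) imageConfig {
	return imageConfig{
		OS:           "linux",
		Architecture: "amd64",
		Created:      time.Now().UTC(),
		Config: containerConfig{
			Env:        []string{"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"},
			Cmd:        cmd,
			WorkingDir: "/",
		},
		RootFS: rootFS{Type: "layers", DiffIDs: []string{innerDigest}},
	}
}

func main() {
	cfg := newImageConfig("sha256:0a7631da79a7cf9bfbe5c09457481b869b45095dfd309681f7ed465e711815ed", "/main")
	out, err := json.MarshalIndent(cfg, "", "    ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Whatever bytes the marshaling step produces should be kept and reused as-is: the same bytes are uploaded and later hashed for the manifest, so re-encoding the struct a second time could yield a different digest because of the timestamp.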

Object Upload

Image layers and the config are uploaded to the registry the same way: send a POST /v2/<name>/blobs/uploads/ request to the registry, get a new unique URL from the Location header of the response, and then upload your payload to this URL with a PUT request. (The tricky part for me was figuring out that the config must be uploaded the same way as a layer. Initially, I thought it had to be part of the manifest.)

The code to get an upload location from a reply:

resp, err := http.DefaultClient.Do(req)
if err != nil {
        return "", err
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusAccepted {
        return "", fmt.Errorf("unexpected status on %v: %v", req.URL.Path, resp.Status)
}
uploadLocation := resp.Header.Get("Location")
if uploadLocation == "" {
        return "", errors.New("response has no valid location")
}
return uploadLocation, nil
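The request that precedes this reply handling is not shown above, so here is a rough sketch of how it could be built. The newUploadRequest and resolveLocation names are hypothetical. The second helper covers a case I am assuming may come up with registries other than ECR: a Location header that holds a relative path instead of an absolute URL.

```go
package main

import (
	"fmt"
	"net/http"
)

// newUploadRequest builds the POST request that starts a blob upload.
// auth is the token that goes after "Basic " in the Authorization header.
func newUploadRequest(domain, name, auth string) (*http.Request, error) {
	u := "https://" + domain + "/v2/" + name + "/blobs/uploads/"
	req, err := http.NewRequest(http.MethodPost, u, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Basic "+auth)
	return req, nil
}

// resolveLocation turns a Location header value into an absolute URL,
// resolving relative values against the URL of the original request.
func resolveLocation(req *http.Request, location string) (string, error) {
	u, err := req.URL.Parse(location)
	if err != nil {
		return "", err
	}
	return u.String(), nil
}

func main() {
	req, err := newUploadRequest("registry.example.com", "team/app", "dXNlcjpwYXNz")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)
	loc, err := resolveLocation(req, "/v2/team/app/blobs/uploads/some-id")
	if err != nil {
		panic(err)
	}
	fmt.Println(loc)
}
```

To use a context as in the other snippets, the request can be wrapped with req.WithContext(ctx) before it is sent.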

To then upload an object (layer or config) I also need its size in bytes and digest — a hex-encoded sha256 checksum of object content with a sha256: prefix:

// payload has a []byte type, it's either a layer in tar.gz format, or a json-encoded image config
digest := fmt.Sprintf("sha256:%x", sha256.Sum256(payload))
uploadLocation = uploadLocation + "?digest=" + digest
req, err := http.NewRequestWithContext(ctx, http.MethodPut, uploadLocation, bytes.NewReader(payload))
if err != nil {
        return err
}
req.Header.Set("Authorization", "Basic "+auth)
req.Header.Set("Content-Type", "application/octet-stream")
req.ContentLength = int64(len(payload))
resp, err := http.DefaultClient.Do(req)
if err != nil {
        return err
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusCreated {
        return fmt.Errorf("unexpected status on object upload %q: %v", req.URL, resp.Status)
}
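The snippet above appends ?digest=... directly to the upload location, which works as long as the location has no query string of its own. If a registry returns a location that already carries query parameters (an assumption on my part; I have only tested against ECR), a slightly more defensive variant could go through net/url. The withDigest name is made up for this sketch:

```go
package main

import (
	"fmt"
	"net/url"
)

// withDigest adds the digest query parameter to an upload location,
// keeping any query parameters the registry already put there.
func withDigest(uploadLocation, digest string) (string, error) {
	u, err := url.Parse(uploadLocation)
	if err != nil {
		return "", err
	}
	q := u.Query()
	q.Set("digest", digest)
	u.RawQuery = q.Encode() // re-encodes the whole query, escaping ":" as %3A
	return u.String(), nil
}

func main() {
	loc, err := withDigest("https://registry.example.com/v2/app/blobs/uploads/1?_state=abc", "sha256:00ff")
	if err != nil {
		panic(err)
	}
	fmt.Println(loc)
}
```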

Image Manifest

Once both layer and config are uploaded, I need to construct an image manifest. At this point, I have all details available.

A minimal manifest looks like this:

{
    "schemaVersion": 2,
    "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
    "config": {
        "mediaType": "application/vnd.docker.container.image.v1+json",
        "size": 616,
        "digest": "sha256:82a2678c5bdcdf82a0fe0d54b0a58f5604182d4ffb7b9e5ca6835e5c207c720c"
    },
    "layers": [
        {
            "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
            "size": 360666,
            "digest": "sha256:81c1eb1aeb9f8cabc9eca332be5c331a1bd687c453eec180c1f447ee720827eb"
        }
    ]
}

Under the config key, this object references an image config from the previous section. Size describes config size in bytes, and digest is a payload digest — the same that was used on the config upload step.

The layers array describes all image layers. In my case, the image has only a single layer. Size is the size of the layer object in bytes (layer.tar.gz file size). Digest is the sha256 digest of the layer object, the one corresponding to the outerDigest variable.
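Since both size and digest are derived from the exact bytes that were uploaded, the manifest can be assembled directly from the two payloads. A minimal sketch, with type and function names of my own choosing:

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
)

// descriptor references an uploaded object by media type, size, and digest.
type descriptor struct {
	MediaType string `json:"mediaType"`
	Size      int64  `json:"size"`
	Digest    string `json:"digest"`
}

type manifest struct {
	SchemaVersion int          `json:"schemaVersion"`
	MediaType     string       `json:"mediaType"`
	Config        descriptor   `json:"config"`
	Layers        []descriptor `json:"layers"`
}

// describe computes the size and sha256 digest of an uploaded payload.
func describe(mediaType string, payload []byte) descriptor {
	return descriptor{
		MediaType: mediaType,
		Size:      int64(len(payload)),
		Digest:    fmt.Sprintf("sha256:%x", sha256.Sum256(payload)),
	}
}

// newManifest builds a single-layer manifest from the config JSON bytes and
// the compressed layer bytes (layer.tar.gz), exactly as they were uploaded.
func newManifest(config, layer []byte) manifest {
	return manifest{
		SchemaVersion: 2,
		MediaType:     "application/vnd.docker.distribution.manifest.v2+json",
		Config:        describe("application/vnd.docker.container.image.v1+json", config),
		Layers: []descriptor{
			describe("application/vnd.docker.image.rootfs.diff.tar.gzip", layer),
		},
	}
}

func main() {
	// Placeholder payloads; real ones come from the earlier steps.
	m := newManifest([]byte(`{}`), []byte("layer bytes"))
	out, err := json.MarshalIndent(m, "", "    ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```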

The manifest is published over a dedicated API endpoint, with a PUT /v2/<name>/manifests/<tag> request. The API supports different manifest versions; the one above requires a Content-Type: application/vnd.docker.distribution.manifest.v2+json header.

On success, the API responds with a 201 Created status code.

At this point, a new image should appear in the registry.
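For completeness, the manifest publishing request described above might be put together like this. It is a sketch under the same assumptions as the rest of the workflow (HTTPS, Basic authorization), and newManifestRequest is a hypothetical name:

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

const manifestMediaType = "application/vnd.docker.distribution.manifest.v2+json"

// newManifestRequest builds the PUT request that publishes a manifest under
// the given tag. body is the JSON-encoded manifest.
func newManifestRequest(domain, name, tag, auth string, body []byte) (*http.Request, error) {
	u := "https://" + domain + "/v2/" + name + "/manifests/" + tag
	// http.NewRequest sets ContentLength itself for a *bytes.Reader body.
	req, err := http.NewRequest(http.MethodPut, u, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Basic "+auth)
	req.Header.Set("Content-Type", manifestMediaType)
	return req, nil
}

func main() {
	req, err := newManifestRequest("registry.example.com", "team/app", "latest", "dXNlcjpwYXNz", []byte(`{"schemaVersion":2}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL, req.ContentLength)
	// Sending it with http.DefaultClient.Do(req) should yield http.StatusCreated on success.
}
```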

Building a Prototype

You can find the complete prototype code at https://github.com/artyom/push-to-docker-repo.

It is a self-contained Go program that uses only the Go standard library. This program can take a statically built linux/amd64 binary, pack it into a Docker image, and publish it to a Docker registry.

Note that it's only been tested with AWS ECR repositories.