Constructing and Pushing Docker Images without Docker
Once upon a time, I ran into a particular use case: I wanted to build and push a Docker image without using the docker binary or any similar tool.
The approach documented below relies only on the Go standard library.
I made some assumptions, so this workflow doesn't cover every scenario.
It relies on these assumptions and simplifications:
- The program deals with an image for a single statically built linux/amd64 Go binary that has no external dependencies, the kind of image you would get from the following Dockerfile:
FROM scratch
COPY main /main
CMD ["/main"]
- The program only needs to work with an AWS ECR repository.
- For the sake of simplicity, the program reads authentication credentials from the local docker configuration.
Steps
The Docker registry provides an HTTP API; its documentation conveniently has a "Pushing An Image" section.
In short, the entire sequence to build and publish an image has these steps:
- Construct a Docker image layer.
My docker image will have a single layer holding a single file.
- Construct an image config: a JSON file referencing the image layer by the digest of its uncompressed content (sha256 sum of layer.tar), plus some metadata, like ARCH/OS, what command to run, what environment variables to use.
- Upload a docker layer object.
- Upload an image config object.
- Construct an image manifest — a JSON file referencing layer and config by their digests.
- Publish an image manifest.
Authorization
All API endpoints used by this workflow require authorization.
AWS ECR documentation clarifies how to do this: each request must have an Authorization: Basic $TOKEN header.
If you inspect the ~/.docker/config.json file after running the docker login command, you'll see that this file holds a mapping between registry domain and authorization token.
For simplicity, my code will get an authorization token for the matching domain from this file.
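A minimal sketch of that lookup could look like the following. It assumes the common config.json layout, where an "auths" object maps a registry domain to an entry with a base64 "auth" field. The names dockerConfig and authToken are made up for this illustration, and the sample domain and token are placeholders. Setups that delegate to a credential helper leave the "auth" field empty, and this sketch does not handle them.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// dockerConfig mirrors only the part of ~/.docker/config.json this sketch needs.
type dockerConfig struct {
	Auths map[string]struct {
		Auth string `json:"auth"`
	} `json:"auths"`
}

// authToken returns the token stored for domain in the given config.json
// contents. It does not consult credential helpers (an assumption of this sketch).
func authToken(configJSON []byte, domain string) (string, error) {
	var cfg dockerConfig
	if err := json.Unmarshal(configJSON, &cfg); err != nil {
		return "", err
	}
	entry, ok := cfg.Auths[domain]
	if !ok || entry.Auth == "" {
		return "", errors.New("no stored credentials for " + domain)
	}
	return entry.Auth, nil
}

func main() {
	// Placeholder data; a real program would read the file with os.ReadFile.
	sample := []byte(`{"auths":{"123456789012.dkr.ecr.us-east-1.amazonaws.com":{"auth":"QVdTOnNlY3JldA=="}}}`)
	token, err := authToken(sample, "123456789012.dkr.ecr.us-east-1.amazonaws.com")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("Authorization: Basic " + token)
}
```

The token is passed through unchanged because it is already in the form the Authorization: Basic header expects.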
Parsing Full Image Name
The code in my prototype has to deal with the full name of a Docker image, which consists of three parts: the registry domain, the image name, and a tag.
For example, for an image identified by the full name public.ecr.aws/amazonlinux/amazonlinux:latest, the code needs to distinguish between the domain (public.ecr.aws), the short name (amazonlinux/amazonlinux), and the tag (latest).
Parsing this is trivial. The code has a dedicated type to hold the separate parts:
type imageSpec struct {
	Domain string
	Name   string
	Tag    string
}
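One way the parsing itself might be written is sketched below. The function name parseImageSpec is hypothetical, and two behaviors are assumptions of this sketch rather than something the text above specifies: the domain is required, and a missing tag falls back to latest. Digest references (name@sha256:...) are not handled. The imageSpec type is repeated so the example compiles on its own; strings.Cut needs Go 1.18 or newer.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

type imageSpec struct {
	Domain string
	Name   string
	Tag    string
}

// parseImageSpec splits "domain/name[:tag]" into its parts.
// Assumption: the first path element is always the registry domain.
func parseImageSpec(s string) (imageSpec, error) {
	domain, rest, ok := strings.Cut(s, "/")
	if !ok || domain == "" || rest == "" {
		return imageSpec{}, errors.New("image name must look like domain/name[:tag]")
	}
	// The tag separator is searched for only after the domain is removed,
	// so a domain with a port (localhost:5000) is not mistaken for a tag.
	name, tag, ok := strings.Cut(rest, ":")
	if !ok {
		tag = "latest" // assumed default for this sketch
	}
	if name == "" || tag == "" {
		return imageSpec{}, errors.New("empty image name or tag")
	}
	return imageSpec{Domain: domain, Name: name, Tag: tag}, nil
}

func main() {
	spec, err := parseImageSpec("public.ecr.aws/amazonlinux/amazonlinux:latest")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", spec)
}
```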
Constructing Docker Image Layer
A Docker image layer is just a gzip-compressed tar archive.
Go standard library packages archive/tar and compress/gzip cover this case.
One notable caveat is that I need to calculate sha256 checksums of both the uncompressed and the compressed content (layer.tar and layer.tar.gz, respectively) during layer construction.
There's a convenient way to do this while the layer is being written: hash.Hash implements the io.Writer interface, and io.MultiWriter duplicates each write to several writers at once.
The relevant piece of code looks like this:
outerHash, innerHash := sha256.New(), sha256.New()
buf := new(bytes.Buffer)
gw := gzip.NewWriter(io.MultiWriter(buf, outerHash)) // compressed stream (layer.tar.gz)
tw := tar.NewWriter(io.MultiWriter(gw, innerHash))   // uncompressed stream (layer.tar)
if err := tw.WriteHeader(&tar.Header{
	Name:    "main",
	Mode:    0755,
	ModTime: fi.ModTime(),
	Size:    fi.Size(),
}); err != nil {
	return nil, nil, err
}
...
// The digests are final only after the file content is written
// and both tw and gw are closed (in that order).
outerDigest := fmt.Sprintf("sha256:%x", outerHash.Sum(nil))
innerDigest := fmt.Sprintf("sha256:%x", innerHash.Sum(nil))
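For reference, here is a self-contained sketch of the same technique that works on an in-memory byte slice instead of a file. The function names buildLayer and checkLayer are made up for this illustration, and the fixed modification time is a choice of this sketch (it keeps the digests reproducible), not something the snippet above does. checkLayer recomputes both digests from the finished archive to confirm that the two hashes really cover layer.tar.gz and layer.tar.

```go
package main

import (
	"archive/tar"
	"bytes"
	"compress/gzip"
	"crypto/sha256"
	"fmt"
	"io"
	"time"
)

// buildLayer packs content as a single file into a gzip-compressed tar
// archive. It returns the archive bytes, the digest of the compressed
// stream (outer), and the digest of the uncompressed stream (inner).
func buildLayer(name string, content []byte) ([]byte, string, string, error) {
	outerHash, innerHash := sha256.New(), sha256.New()
	buf := new(bytes.Buffer)
	gw := gzip.NewWriter(io.MultiWriter(buf, outerHash))
	tw := tar.NewWriter(io.MultiWriter(gw, innerHash))
	hdr := &tar.Header{
		Name:    name,
		Mode:    0755,
		ModTime: time.Unix(0, 0), // fixed time: an assumption of this sketch
		Size:    int64(len(content)),
	}
	if err := tw.WriteHeader(hdr); err != nil {
		return nil, "", "", err
	}
	if _, err := tw.Write(content); err != nil {
		return nil, "", "", err
	}
	// Close order matters: tar writes its footer into gzip, then gzip flushes.
	if err := tw.Close(); err != nil {
		return nil, "", "", err
	}
	if err := gw.Close(); err != nil {
		return nil, "", "", err
	}
	outer := fmt.Sprintf("sha256:%x", outerHash.Sum(nil))
	inner := fmt.Sprintf("sha256:%x", innerHash.Sum(nil))
	return buf.Bytes(), outer, inner, nil
}

// checkLayer recomputes both digests from the finished layer bytes.
func checkLayer(layer []byte, outer, inner string) error {
	if got := fmt.Sprintf("sha256:%x", sha256.Sum256(layer)); got != outer {
		return fmt.Errorf("compressed digest mismatch: %s", got)
	}
	gr, err := gzip.NewReader(bytes.NewReader(layer))
	if err != nil {
		return err
	}
	h := sha256.New()
	if _, err := io.Copy(h, gr); err != nil {
		return err
	}
	if got := fmt.Sprintf("sha256:%x", h.Sum(nil)); got != inner {
		return fmt.Errorf("uncompressed digest mismatch: %s", got)
	}
	return nil
}

func main() {
	layer, outer, inner, err := buildLayer("main", []byte("stand-in for a binary"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(layer), outer, inner, checkLayer(layer, outer, inner))
}
```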
Image Config
A minimal image config may look like this:
{
  "os": "linux",
  "architecture": "amd64",
  "created": "2021-03-13T16:39:51.535472845Z",
  "config": {
    "Env": ["PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"],
    "Cmd": ["/main"],
    "WorkingDir": "/",
    "ArgsEscaped": true
  },
  "rootfs": {
    "type": "layers",
    "diff_ids": [
      "sha256:0a7631da79a7cf9bfbe5c09457481b869b45095dfd309681f7ed465e711815ed"
    ]
  }
}
The array under rootfs → diff_ids references layers by the digests of their uncompressed content.
That corresponds to the innerDigest variable in the code snippet from the previous section.
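A config like the one above could be produced from Go structs, roughly as sketched here. The type and function names (imageConfig, runConfig, rootFS, newImageConfig) are made up for this illustration; the field values copy the example above, and the creation time is simply the current time.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

type imageConfig struct {
	OS           string    `json:"os"`
	Architecture string    `json:"architecture"`
	Created      time.Time `json:"created"`
	Config       runConfig `json:"config"`
	RootFS       rootFS    `json:"rootfs"`
}

type runConfig struct {
	Env         []string `json:"Env"`
	Cmd         []string `json:"Cmd"`
	WorkingDir  string   `json:"WorkingDir"`
	ArgsEscaped bool     `json:"ArgsEscaped"`
}

type rootFS struct {
	Type    string   `json:"type"`
	DiffIDs []string `json:"diff_ids"`
}

// newImageConfig describes a single-layer linux/amd64 image.
// diffID must be the digest of the uncompressed layer (layer.tar).
func newImageConfig(diffID string) imageConfig {
	return imageConfig{
		OS:           "linux",
		Architecture: "amd64",
		Created:      time.Now().UTC(),
		Config: runConfig{
			Env:         []string{"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"},
			Cmd:         []string{"/main"},
			WorkingDir:  "/",
			ArgsEscaped: true,
		},
		RootFS: rootFS{Type: "layers", DiffIDs: []string{diffID}},
	}
}

func main() {
	cfg := newImageConfig("sha256:0a7631da79a7cf9bfbe5c09457481b869b45095dfd309681f7ed465e711815ed")
	out, err := json.MarshalIndent(cfg, "", "  ")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```

Whatever bytes end up being uploaded as the config are the bytes whose size and digest go into the manifest, so the config should be marshaled once and that exact payload reused.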
Object Upload
Image layers and config are uploaded to the registry the same way: send a POST /v2/<name>/blobs/uploads/ request to the registry, take the new unique URL from the Location header of the response, and then upload your payload to this URL with a PUT request.
(The tricky part for me was figuring out that the config is uploaded the same way as a layer. Initially, I thought it had to be part of the manifest.)
The code to get an upload location from a reply:
resp, err := http.DefaultClient.Do(req)
if err != nil {
	return "", err
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusAccepted {
	return "", fmt.Errorf("unexpected status on %v: %v", req.URL.Path, resp.Status)
}
uploadLocation := resp.Header.Get("Location")
if uploadLocation == "" {
	return "", errors.New("response has no valid location")
}
return uploadLocation, nil
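The snippet above starts from an already prepared req. A sketch of how such a request might be built is shown below; blobUploadURL and newStartUploadRequest are hypothetical helper names, the https scheme is an assumption, and the domain, image name, and token in main are placeholders. The request is only constructed here, not sent.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
)

// blobUploadURL returns the endpoint that starts a blob upload.
// Assumption: the registry is reachable over https on the default port.
func blobUploadURL(domain, name string) string {
	return "https://" + domain + "/v2/" + name + "/blobs/uploads/"
}

// newStartUploadRequest prepares the POST request that asks the registry
// for an upload location. It does not send the request.
func newStartUploadRequest(ctx context.Context, domain, name, auth string) (*http.Request, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, blobUploadURL(domain, name), nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Basic "+auth)
	return req, nil
}

func main() {
	req, err := newStartUploadRequest(context.Background(), "registry.example.com", "team/app", "dG9rZW4=")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL)
}
```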
To then upload an object (layer or config), I also need its size in bytes and its digest, a hex-encoded sha256 checksum of the object content with a sha256: prefix. The upload code:
// payload has a []byte type, it's either a layer in tar.gz format, or a json-encoded image config
digest := fmt.Sprintf("sha256:%x", sha256.Sum256(payload))
// NOTE: plain concatenation assumes uploadLocation has no query string of its own
uploadLocation = uploadLocation + "?digest=" + digest
req, err := http.NewRequestWithContext(ctx, http.MethodPut, uploadLocation, bytes.NewReader(payload))
if err != nil {
	return err
}
req.Header.Set("Authorization", "Basic "+auth)
req.Header.Set("Content-Type", "application/octet-stream")
req.ContentLength = int64(len(payload))
resp, err := http.DefaultClient.Do(req)
if err != nil {
	return err
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusCreated {
	return fmt.Errorf("unexpected status on object upload %q: %v", req.URL, resp.Status)
}
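One detail worth a sketch: registries are allowed to return a Location that already carries query parameters, in which case appending "?digest=" by string concatenation would produce a malformed URL. A more defensive variant, using a hypothetical helper named withDigest, parses the location with net/url and adds the digest to whatever query is already there. The _state parameter in main is only an example of such a pre-existing parameter.

```go
package main

import (
	"fmt"
	"net/url"
)

// withDigest adds the digest query parameter to an upload location,
// preserving any query parameters the registry already put there.
func withDigest(location, digest string) (string, error) {
	u, err := url.Parse(location)
	if err != nil {
		return "", err
	}
	q := u.Query()
	q.Set("digest", digest)
	u.RawQuery = q.Encode() // re-encodes the query; the colon becomes %3A
	return u.String(), nil
}

func main() {
	for _, loc := range []string{
		"https://registry.example.com/v2/team/app/blobs/uploads/abc",
		"https://registry.example.com/v2/team/app/blobs/uploads/abc?_state=xyz",
	} {
		out, err := withDigest(loc, "sha256:00")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(out)
	}
}
```

The percent-encoded colon is an equivalent spelling of the same query value, so the server sees the same digest after decoding.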
Image Manifest
Once both layer and config are uploaded, I need to construct an image manifest.
At this point, I have all details available.
A minimal manifest looks like this:
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "config": {
    "mediaType": "application/vnd.docker.container.image.v1+json",
    "size": 616,
    "digest": "sha256:82a2678c5bdcdf82a0fe0d54b0a58f5604182d4ffb7b9e5ca6835e5c207c720c"
  },
  "layers": [
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 360666,
      "digest": "sha256:81c1eb1aeb9f8cabc9eca332be5c331a1bd687c453eec180c1f447ee720827eb"
    }
  ]
}
Under the config key, this object references the image config from the previous section.
Size is the config size in bytes, and digest is the payload digest, the same one that was used in the config upload step.
The layers array describes all image layers.
In my case, the image has only a single layer.
Size is the size of the layer object in bytes (layer.tar.gz file size).
Digest is the sha256 digest of the layer object, the one corresponding to the outerDigest variable.
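Since both descriptors are just media type, size, and digest of an uploaded payload, the manifest can be derived directly from the two byte slices. The sketch below shows one way to do that; the names descriptor, manifest, describe, and newManifest are made up for this illustration, and the payloads in main are stand-ins rather than real image data.

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
)

type descriptor struct {
	MediaType string `json:"mediaType"`
	Size      int64  `json:"size"`
	Digest    string `json:"digest"`
}

type manifest struct {
	SchemaVersion int          `json:"schemaVersion"`
	MediaType     string       `json:"mediaType"`
	Config        descriptor   `json:"config"`
	Layers        []descriptor `json:"layers"`
}

// describe builds a descriptor for an uploaded payload: its size in bytes
// and the sha256 digest of exactly those bytes.
func describe(mediaType string, payload []byte) descriptor {
	return descriptor{
		MediaType: mediaType,
		Size:      int64(len(payload)),
		Digest:    fmt.Sprintf("sha256:%x", sha256.Sum256(payload)),
	}
}

// newManifest references the config and a single compressed layer
// (the layer.tar.gz bytes, not the uncompressed tar).
func newManifest(config, layer []byte) manifest {
	return manifest{
		SchemaVersion: 2,
		MediaType:     "application/vnd.docker.distribution.manifest.v2+json",
		Config:        describe("application/vnd.docker.container.image.v1+json", config),
		Layers: []descriptor{
			describe("application/vnd.docker.image.rootfs.diff.tar.gzip", layer),
		},
	}
}

func main() {
	m := newManifest([]byte(`{"os":"linux"}`), []byte("stand-in for layer.tar.gz bytes"))
	out, err := json.MarshalIndent(m, "", "  ")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```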
The manifest is published over a dedicated API endpoint, with a PUT /v2/<name>/manifests/<tag> request.
The API supports different manifest versions; the one above requires a Content-Type: application/vnd.docker.distribution.manifest.v2+json header.
On success, the API responds with a 201 Created code.
At this point, a new image should appear in the registry.
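The publishing request might be prepared along these lines. The helper names manifestURL and newManifestRequest are hypothetical, the https scheme is an assumption, and the values in main are placeholders; the request is built but not sent.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"net/http"
)

const manifestMediaType = "application/vnd.docker.distribution.manifest.v2+json"

// manifestURL returns the endpoint a manifest for the given tag is PUT to.
// Assumption: the registry is reachable over https on the default port.
func manifestURL(domain, name, tag string) string {
	return "https://" + domain + "/v2/" + name + "/manifests/" + tag
}

// newManifestRequest prepares the PUT request that publishes a manifest.
// It does not send the request.
func newManifestRequest(ctx context.Context, domain, name, tag, auth string, manifest []byte) (*http.Request, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodPut, manifestURL(domain, name, tag), bytes.NewReader(manifest))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Basic "+auth)
	req.Header.Set("Content-Type", manifestMediaType)
	return req, nil
}

func main() {
	body := []byte(`{"schemaVersion":2}`) // placeholder, not a complete manifest
	req, err := newManifestRequest(context.Background(), "registry.example.com", "team/app", "latest", "dG9rZW4=", body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL, req.Header.Get("Content-Type"))
}
```

After sending it, the response status can be checked against http.StatusCreated in the same way as in the object upload step.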
Building a Prototype
You can find the complete prototype code at https://github.com/artyom/push-to-docker-repo.
It is a self-contained Go program that uses only the Go standard library.
This program can take a statically built linux/amd64 binary, pack it into a Docker image, and publish it to a Docker registry.
Note that it's only been tested with AWS ECR repositories.