Finding leaked secrets in your Docker image with a scanner

If you’re not careful, you can end up with a private SSH key, AWS access token, or password embedded in your Docker image. That means anyone who can access the image will be able to get that secret, and potentially use it to gain access to additional systems.

While you can and should take steps to prevent leaking secrets in the first place, it’s still useful to catch leaks if they do happen. If you can catch the leak before you push the image to a remote registry, no harm done.

That’s where a secrets scanner comes in handy: it can automatically catch secrets, up to a point anyway.

Recap: how secrets get leaked in Docker images

Here’s an example of a Dockerfile that leaks secrets in two ways: through build args, and by copying secrets in. You can view the former with docker image history. The latter stays in the image’s earlier layers, much as old commits stay accessible in a Git repository, and can be extracted with, for example, docker image save. (For more details, and the secure alternative, see my article on Docker build secrets.)

FROM busybox

# Copy in SSH private key, then delete it; this is INSECURE,
# the secret will still be in the image.
COPY id_dsa .
RUN rm id_dsa

# Accept a secret as build arg. This is INSECURE.
ARG mypassword
RUN echo $mypassword

I build the image:

$ docker build --build-arg mypassword=XW835S3d20-3432S%K@345 -t bad-secrets .

And now this image has leaked two secrets, a build argument (“XW8…”) and an SSH private key.
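To see why deleting the file doesn’t help, it can be useful to look inside the image yourself. Here’s a minimal Python sketch, not a definitive tool, that lists the files in each layer of a tarball produced by docker image save. The function name files_per_layer and the file name bad-secrets.tar are my own choices for illustration, and the sketch assumes each layer is stored as a nested tar archive that is small enough to read into memory.

```python
import io
import tarfile


def files_per_layer(image_tar_path):
    """Map each nested layer archive in a saved image tarball to its file names."""
    layers = {}
    with tarfile.open(image_tar_path) as image:
        for member in image.getmembers():
            if not member.isfile():
                continue
            blob = image.extractfile(member).read()
            try:
                # Layers are themselves tar archives, possibly compressed.
                with tarfile.open(fileobj=io.BytesIO(blob)) as layer:
                    layers[member.name] = layer.getnames()
            except tarfile.TarError:
                # Not a layer archive (e.g. a JSON manifest or config); skip it.
                continue
    return layers


# Hypothetical usage, after: docker image save bad-secrets -o bad-secrets.tar
# for layer_name, names in files_per_layer("bad-secrets.tar").items():
#     print(layer_name, names)
```

Run against the image built above, I’d expect one layer to still list id_dsa. The later RUN rm only records the deletion in a subsequent layer, so the original file remains in the earlier one.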

Note: Outside any specific best practice being demonstrated, the Dockerfiles in this article are not examples of best practices, since the added complexity would obscure the main point of the article.


Using a secrets scanner

A good secrets scanner with Docker support will be able to find both. There aren’t that many secrets scanners that support Docker, and the ones I’ve tried haven’t been able to find both. The only one I’ve found that does work is GitGuardian.

You can get a free account from the service that gives a pretty decent number of free scans. You’ll need to generate an API token through their dashboard. Then:

$ export GITGUARDIAN_API_KEY=the-token-you-got-from-dashboard
$ python3 -m venv /tmp/venv
$ /tmp/venv/bin/pip install ggshield
$ ggshield scan docker bad-secrets
...
>>> Incident 1(Secrets detection): Generic High Entropy Secret (Validity: Cannot Check)  (Ignore with SHA: 73c43dc3b30b828a082b5ea3401c69fa07145aec16202fb1babec325db2dad6c) (2 occurrences)
1 | …usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"],"Cmd":["|1","mypassword=XW8*******-**32S%K@345","/bin/sh","-c","echo $mypassword"],"Image":"sha256:504c3ee7a91b445126658150ed2…
...
>>> Incident 1(Secrets detection): OpenSSH Private Key (Validity: Cannot Check)  (Ignore with SHA: a0e1f407b0cdbb2d0484a6b5bb6e931135d3972b89bf79dce3ec1b50caed53bf) (1 occurrence)

The scanner spotted both secrets. Success!

ggshield also does the right thing and exits with a non-zero exit code if it finds a potential secret, so it’s ready to use in CI setups.
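That exit code is all a CI job needs in order to block a push. As a rough sketch, assuming a Python-based CI step, the helper below runs a scan command and only runs the push command if the scan exits with 0. The name scan_then_push and its parameters are hypothetical, and the ggshield invocation in the usage comment simply mirrors the command shown above.

```python
import subprocess


def scan_then_push(image, scan_cmd, push_cmd):
    """Run scan_cmd on image; run push_cmd only if the scan exits with code 0.

    Returns the exit code of whichever command ran last.
    """
    scan = subprocess.run([*scan_cmd, image])
    if scan.returncode != 0:
        # Any non-zero exit (a finding, or a scanner error) blocks the push.
        return scan.returncode
    return subprocess.run([*push_cmd, image]).returncode


# Hypothetical usage in a CI script, assuming ggshield and docker are on PATH
# and GITGUARDIAN_API_KEY is set in the environment:
#
#   import sys
#   sys.exit(scan_then_push("bad-secrets",
#                           ["ggshield", "scan", "docker"],
#                           ["docker", "image", "push"]))
```

Treating every non-zero exit as a failure is the conservative choice: a misconfigured scanner blocks the push rather than silently letting an unscanned image through.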

The limitations of secret scanners

It’s worth keeping in mind that there are two basic techniques for spotting secrets:

  1. Standard file formats, file names, and the like. An SSH private key, for example, has a very specific format, and was correctly identified based on that.
  2. Things that look like secrets, and specifically that look like good secrets, i.e. with lots of randomness or entropy. That’s how the “XW835S3d20 etc.” string was caught.

The second method is a heuristic. That means it might occasionally have false positives, flagging something that looks like a secret but isn’t. It also might miss secrets that are insufficiently random. For example, if I choose the rather worse password “12345”, the ggshield scanner won’t catch it.
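To make the two techniques concrete, here’s a toy Python sketch of each. It is not how ggshield is implemented, which I haven’t inspected. The regular expression, the function names, and the thresholds (a minimum length of 8 and 3.0 bits per character) are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# Technique 1: recognize a well-known format, here a PEM-style private key header.
PRIVATE_KEY_HEADER = re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----")


def has_private_key_header(text):
    """True if text contains something shaped like a private key header."""
    return PRIVATE_KEY_HEADER.search(text) is not None


# Technique 2: a heuristic based on how random the string looks.
def shannon_entropy(text):
    """Shannon entropy of text, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_random(text, min_length=8, min_bits_per_char=3.0):
    """Flag strings that are both long enough and high-entropy (illustrative thresholds)."""
    return len(text) >= min_length and shannon_entropy(text) >= min_bits_per_char
```

With these thresholds, looks_random flags the build argument from the example but not “12345”, which is too short and too regular. The same check may also flag harmless strings such as long hashes, which is the false-positive side of the trade-off.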

So as useful as secret scanners are, you should also take steps to prevent leaking secrets in the first place.