Finding leaked secrets in your Docker image with a scanner
If you’re not careful, you can end up with a private SSH key, AWS access token, or password embedded in your Docker image. That means anyone who can access the image will be able to get that secret, and potentially use it to gain access to additional systems.
While you can and should take steps to prevent leaking secrets in the first place, it’s still useful to catch leaks if they do happen. If you can catch the leak before you push the image to a remote registry, no harm done.
That’s where a secrets scanner comes in handy: it can automatically catch secrets, up to a point anyway.
Recap: how secrets get leaked in Docker images
Here’s an example of a Dockerfile that leaks secrets in two ways: with build args, and by copying secrets in. You can view the former with docker image history. The latter remains available in the image, much as old commits remain accessible in a Git repository; you can get at it, for example, via docker image save.
(For more details, and the secure alternative, see my article on Docker build secrets.)
FROM busybox
# Copy in SSH private key, then delete it; this is INSECURE,
# the secret will still be in the image.
COPY id_dsa .
RUN rm id_dsa
# Accept a secret as build arg. This is INSECURE.
ARG mypassword
RUN echo $mypassword
I build the image:
$ docker build --build-arg mypassword=XW835S3d20-3432S%K@345 -t bad-secrets .
And now this image has leaked two secrets: a build argument (“XW8…”) and an SSH private key.
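To see why deleting the key in a later step doesn't help, here is a rough Python sketch that looks through a tarball written by docker image save and reports which layer archives still contain a given file. The function name layers_containing is made up for this illustration. Because the exact layout of the saved tarball differs between Docker versions, the sketch assumes only that each layer is stored as a tar archive somewhere inside the outer tarball, and simply tries every member.

```python
import io
import tarfile


def layers_containing(image_tar, wanted_name):
    """Return the names of layer archives inside a saved image tarball
    that contain a file whose basename is `wanted_name`.

    Hypothetical helper for illustration; not part of Docker or ggshield.
    """
    hits = []
    with tarfile.open(image_tar) as outer:
        for member in outer.getmembers():
            if not member.isfile():
                continue
            data = outer.extractfile(member).read()
            try:
                layer = tarfile.open(fileobj=io.BytesIO(data))
            except tarfile.TarError:
                # Not a tar archive, e.g. a JSON manifest or config file.
                continue
            with layer:
                names = [m.name for m in layer.getmembers()]
            if any(name.split("/")[-1] == wanted_name for name in names):
                hits.append(member.name)
    return hits
```

Assuming you first ran docker image save bad-secrets -o bad-secrets.tar, calling layers_containing("bad-secrets.tar", "id_dsa") should list the layer created by the COPY step, even though a later layer deleted the file.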
Note: Outside any specific best practice being demonstrated, the Dockerfiles in this article are not examples of best practices, since the added complexity would obscure the main point of the article.
Need to ship quickly, and don’t have time to figure out every detail on your own? Read the concise, action-oriented Python on Docker Production Handbook.
Using a secrets scanner
A good secrets scanner with Docker support will be able to find both. There aren’t that many secrets scanners that support Docker, and the ones I’ve tried haven’t been able to find both. The only one I’ve found that does work is GitGuardian.
You can get a free account from the service that gives a pretty decent number of free scans. You’ll need to generate an API token through their dashboard. Then:
$ export GITGUARDIAN_API_KEY=the-token-you-got-from-dashboard
$ python3 -m venv /tmp/venv
$ /tmp/venv/bin/pip install ggshield
$ ggshield scan docker bad-secrets
...
>>> Incident 1(Secrets detection): Generic High Entropy Secret (Validity: Cannot Check) (Ignore with SHA: 73c43dc3b30b828a082b5ea3401c69fa07145aec16202fb1babec325db2dad6c) (2 occurrences)
1 | …usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"],"Cmd":["|1","mypassword=XW8*******-**32S%K@345","/bin/sh","-c","echo $mypassword"],"Image":"sha256:504c3ee7a91b445126658150ed2…
...
>>> Incident 1(Secrets detection): OpenSSH Private Key (Validity: Cannot Check) (Ignore with SHA: a0e1f407b0cdbb2d0484a6b5bb6e931135d3972b89bf79dce3ec1b50caed53bf) (1 occurrence)
The scanner spotted both secrets. Success!
ggshield also does the right thing by exiting with a non-zero exit code if it finds a potential vulnerability, so it’s ready to go for CI setups.
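As a minimal sketch of how a CI step might rely on that exit code, the Python function below runs a scanner command against an image and treats any non-zero exit code as a failure. The function name image_is_clean is hypothetical, and the default command is simply the one used in the session above; check the syntax your installed ggshield version expects before relying on it.

```python
import subprocess


def image_is_clean(image, scanner=("ggshield", "scan", "docker")):
    """Run `scanner` with `image` appended as the last argument.

    Returns True only when the command exits with code 0. The default
    command is copied from the article's session and is an assumption
    about your installed ggshield version.
    """
    result = subprocess.run([*scanner, image])
    return result.returncode == 0
```

A build script could then refuse to push when image_is_clean("bad-secrets") returns False, so the leaked secret never reaches a remote registry.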
The limitations of secret scanners
It’s worth keeping in mind that there are two basic techniques for spotting secrets:
- Standard file formats, file names, and the like. An SSH private key, for example, has a very specific format, and was correctly identified based on that.
- Things that look like secrets, and specifically that look like good secrets, i.e. with lots of randomness or entropy. That’s how the “XW835S3d20 etc.” string was caught.
The second method is a heuristic. That means it might occasionally have false positives, spotting something that looks like a secret but isn’t. It might also miss secrets that aren’t sufficiently random. For example, if I had chosen the rather worse password “12345”, the ggshield scanner wouldn’t catch it.
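To make the two techniques concrete, here is a toy Python sketch of each. It is not how ggshield works internally, which I haven't inspected. The regular expression, the function names, and the 3.0 cutoff are all assumptions picked for this illustration.

```python
import math
import re
from collections import Counter

# Technique 1: a well-known format. PEM-style private keys start with a
# header line such as "-----BEGIN OPENSSH PRIVATE KEY-----".
PRIVATE_KEY_HEADER = re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----")


def looks_like_private_key(text):
    """True if the text contains a PEM-style private key header."""
    return PRIVATE_KEY_HEADER.search(text) is not None


# Technique 2: randomness, measured here as Shannon entropy in bits per
# character, based on how often each character occurs in the string.
def shannon_entropy(text):
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical cutoff, chosen only so the two passwords in this article
# land on opposite sides of it.
ENTROPY_THRESHOLD = 3.0


def looks_random(text):
    """True if the per-character entropy is above the toy threshold."""
    return shannon_entropy(text) > ENTROPY_THRESHOLD
```

With this cutoff, looks_random("XW835S3d20-3432S%K@345") is true and looks_random("12345") is false, which mirrors the miss described above. An ordinary sentence with a varied mix of letters can also score above the cutoff, which is where false positives come from.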
So as useful as secret scanners are, you should also take steps to prevent leaking secrets in the first place.