Fast tests for slow services: why you should use verified fakes
Let’s say your code talks to some slow or expensive external service—the Twitter API, say. When it comes time to write tests you face a dilemma:
- On the one hand, talking to the real API would make your tests slow, hard to run, and flaky.
- On the other hand, if you use a fake or mock test double, how do you know that your code will actually work in the real world? After all, the fake is—fake. Your code isn’t talking to the real thing.
You want correctness and speed: the confidence that if your tests pass, your code will run in production, as well as the ability to have a fast and robust test suite.
The solution is a special kind of test double: a verified fake. Unlike regular test doubles, where you swap out the real thing for some random object, with verified fakes you actually prove the test double has the same behavior as the real thing.
In this article I’ll cover:
- A quick intro to test doubles in general.
- How to write a verified fake, and why it’s different.
- The limitations of verified fakes, and when you should use them.
Testing with test doubles
Before looking at verified fakes, let’s take a quick look at why test doubles are useful.
Let’s say you have a MessageService class you want to test, and it uses TwitterClient.
You could write tests that use the real TwitterClient, but then your tests will end up talking to the real Twitter API.
So instead, you create a FakeTwitterClient and use it in your tests.
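Here’s a minimal sketch of what that could look like. The MessageService shown here, including its announce method, is a hypothetical stand-in, since the real class isn’t shown in this article; the fake simply keeps tweets in a list.

```python
class FakeTwitterClient:
    """In-memory stand-in for TwitterClient (not yet verified)."""

    def __init__(self):
        self.messages = []

    def tweet(self, message):
        self.messages.append(message)

    def list_tweets(self):
        return list(self.messages)


class MessageService:
    """Hypothetical service under test; announce() is an assumed method."""

    def __init__(self, client):
        self.client = client

    def announce(self, name):
        # The logic we want to test without touching the network.
        self.client.tweet(f"Welcome, {name}!")


def test_announce_tweets_welcome():
    client = FakeTwitterClient()
    MessageService(client).announce("Ada")
    assert client.list_tweets() == ["Welcome, Ada!"]
```

Because MessageService receives its client as a constructor argument, swapping the real client for the fake needs no patching.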
Now your tests can prove MessageService works without having to talk to the Twitter API.
There’s only one problem: you’re assuming that TwitterClient and FakeTwitterClient behave the same, without any evidence to support that.
Yes, there are tools like the mock library in Python that make this a little easier, but at best those make sure you’re matching the function signature.
What they don’t do is validate anything about behavior.
From fakes to verified fakes
In order to make FakeTwitterClient into a verified fake, a fake you can trust, you need to write an additional set of tests that run against both TwitterClient and FakeTwitterClient.
These tests validate some sort of contract or interface that you expect both implementations to adhere to.
Running the same tests against both implementations ensures both versions behave the same way.
A worked-out example
TwitterClient looks like this:
    class TwitterClient(object):
        """A client for the Twitter API."""

        def tweet(self, message):
            """Tweet a message for the user."""
            # ... implementation ...

        def list_tweets(self):
            """Return a list of the user's tweets."""
            # ... implementation ...
This client provides a behavioral guarantee, a contract of sorts: if a message is tweeted it will show up in the list of tweets. You can encode this contract into a test:
    def test_tweet_listed(client):
        """A tweeted message shows up in the list of messages."""
        message = generate_random_message()
        client.tweet(message)
        assert message in client.list_tweets()
You want FakeTwitterClient to provide the same guarantee, so that you can confidently use it as a drop-in replacement for TwitterClient.
So you write a FakeTwitterClient that implements this contract:
    class FakeTwitterClient(object):
        """A fake client."""

        def __init__(self):
            # Tweets are kept in memory, in the order they were sent.
            self.messages = []

        def tweet(self, message):
            """Tweet a message for the user."""
            self.messages.append(message)

        def list_tweets(self):
            """Return a list of the user's tweets."""
            return self.messages
And here’s the important bit: you want to run test_tweet_listed twice, once against the real client and once against the fake client, to ensure they both provide the same behavior.
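One way to run the same contract test against both clients, sketched here with the standard library’s unittest, is to put the test in a mixin class and let each subclass supply its own client. The class names and the make_client hook are illustrative choices, and generate_random_message is assumed to return a unique string.

```python
import unittest
import uuid


def generate_random_message():
    """Assumed helper: a unique message, so old tweets cannot cause a false pass."""
    return f"contract test {uuid.uuid4().hex}"


class FakeTwitterClient:
    def __init__(self):
        self.messages = []

    def tweet(self, message):
        self.messages.append(message)

    def list_tweets(self):
        return self.messages


class TwitterClientContract:
    """Contract tests shared by every implementation; subclasses supply the client."""

    def make_client(self):
        raise NotImplementedError

    def test_tweet_listed(self):
        client = self.make_client()
        message = generate_random_message()
        client.tweet(message)
        self.assertIn(message, client.list_tweets())


class FakeTwitterClientContractTest(TwitterClientContract, unittest.TestCase):
    def make_client(self):
        return FakeTwitterClient()


# The real client gets exactly the same tests; only make_client differs, e.g.:
# class RealTwitterClientContractTest(TwitterClientContract, unittest.TestCase):
#     def make_client(self):
#         return TwitterClient(...)  # construction is not shown in this article
```

The mixin deliberately does not inherit from unittest.TestCase, so the test runner only collects the concrete subclasses. With pytest, a parametrized fixture that yields each client in turn would achieve the same thing while keeping the test_tweet_listed(client) signature.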
The version of test_tweet_listed that runs against FakeTwitterClient is just another fast in-memory test, so it can be run anywhere by anyone.
The version that will run against the real client will need to use a real Twitter login (presumably a test account of some sort).
This means it will be slow and not something you want developers running regularly.
The contract verification test for TwitterClient could therefore be configured to only run on the CI server once a night, or when the relevant code changes.
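As a sketch of one way to keep the slow version out of everyday test runs, you can skip the real-client test class unless an environment variable is set. The variable name RUN_TWITTER_CONTRACT_TESTS and the make_real_client helper are assumptions for illustration, not part of any real library.

```python
import os
import unittest
import uuid


def real_api_enabled(environ=os.environ):
    """Opt in to slow tests via the (hypothetical) RUN_TWITTER_CONTRACT_TESTS variable."""
    return environ.get("RUN_TWITTER_CONTRACT_TESTS") == "1"


def make_real_client():
    """Placeholder: build TwitterClient with test-account credentials here."""
    raise NotImplementedError("depends on how your TwitterClient is configured")


@unittest.skipUnless(
    real_api_enabled(),
    "set RUN_TWITTER_CONTRACT_TESTS=1 to run against the real API",
)
class RealTwitterClientContractTest(unittest.TestCase):
    def test_tweet_listed(self):
        client = make_real_client()
        message = f"contract test {uuid.uuid4().hex}"
        client.tweet(message)
        self.assertIn(message, client.list_tweets())
```

The nightly CI job would set the variable; everywhere else the class is reported as skipped rather than silently missing, which makes it obvious that the verification exists but did not run.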
Once you have test_tweet_listed running against both classes you have some guarantee that TwitterClient and FakeTwitterClient behave the same way.
And that means you can trust that the tests for MessageService are valid tests even though they rely on FakeTwitterClient and not the real thing.
The limits of verified fakes
TwitterClient may produce errors in some cases, e.g. if a message is too long it might raise an exception.
This sort of error can easily be implemented in FakeTwitterClient and verified by the contract tests.
Some errors cannot be verified by a contract test, however.
For example, if the Twitter API server has a bug it might return an error that results in an exception being raised by TwitterClient.
Or a network error may result in a different exception.
The problem with these errors is that they are difficult or impossible to trigger reliably in your contract verification tests. If you can’t trigger an edge case in your contract verification tests, then you don’t have a verified fake for that edge case.
The best you can do, if you want to trigger these errors in MessageService tests, is to just go the regular test double route and have an unverified fake or mock.
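A minimal sketch of that fallback, using unittest.mock from the standard library: the mock is told to raise when tweet is called, and the test checks how the calling code copes. The MessageService.send method and the choice of ConnectionError are assumptions; nothing here verifies that the real client raises that particular exception.

```python
from unittest import mock


class MessageService:
    """Hypothetical service; send() and its error handling are assumptions."""

    def __init__(self, client):
        self.client = client

    def send(self, message):
        """Return True if the tweet went out, False on a network failure."""
        try:
            self.client.tweet(message)
        except ConnectionError:
            return False
        return True


def test_send_reports_network_failure():
    # Unverified double: nothing proves the real client fails exactly this way.
    client = mock.Mock()
    client.tweet.side_effect = ConnectionError("simulated network failure")
    assert MessageService(client).send("hello") is False
    client.tweet.assert_called_once_with("hello")
```

It helps to keep tests like this clearly separate from the ones built on the verified fake, so readers know which assumptions have been checked against the real service and which have not.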
When should you use a verified fake?
A verified fake gives you more assurance that your tests are testing what you think you’re testing, since the test double has been verified to act like the real thing. On the other hand, this requires more work, since you have to create an additional set of contract verification tests.
That means verified fakes make sense when the following conditions apply:
- The API you want to fake is slow and/or expensive to set up. Otherwise, why not use the real thing?
- The API you want to fake is frequently used by test code. If it’s only used once, maybe it’s better to just use the real thing.
- The cost of uncaught bugs is high. If bugs aren’t expensive, it may not be worth the extra effort to write the extra contract tests.
Next time you’re about to write a fake, consider whether this is a good place for verified fakes. With a little bit of work you’ll end up with tests that are fast and correct.