When you start learning about software testing, you quickly hear the famous advice “Test behavior, not implementation details”. More often than not, it doesn’t help you much. I know that – it was the same for me. So what does it mean? How can you apply it to your project? Let’s look into that.
This tip is part of the “Test behavior, not implementation details” series.
What is software behavior?
When building software, we want it to do something useful. For example, in a TODO app we want to create tasks, see the list of tasks, and mark them as done. We could describe a couple of desired behaviors:
- When I create a task, I want to see it in the list of tasks.
- When I create a task, I want to be able to mark it as done.
- When I try to create a task with an empty title, I want to be rejected with an error message.
- …
In a similar fashion, we can describe all the desired behaviors of our software. To mention a few other examples:
- When the OpenAI API is down, we want to fall back to a local model.
- When the user is not logged in, we want to redirect them to the login page.
- When users sign up, we want to send them a welcome email.
- When a user hasn’t confirmed their email address after a week, we want to delete them.
- When two consecutive emails bounce, we want to stop sending emails to that address.
To put it simply, software behavior is what the software does and how it reacts to different inputs and conditions. That’s what we’re interested in. And that’s where we want to invest our testing efforts.
What are implementation details?
There are always many ways to implement a desired behavior. For simple cases, there might be only one way (e.g., to sum two numbers). On the other hand, there might be multiple valid ways with their pros and cons that all satisfy the desired behavior (e.g., different sorting algorithms). In any case, the implementation details are how we’ve chosen to implement the desired behavior. Changing them must keep the behavior the same but can make our software faster, more reliable, easier to maintain, etc.
Testing behavior vs. testing implementation details
As mentioned, when writing automated tests, we want to focus on behaviors. That’s what brings value. For example, it doesn’t help much to have the fastest “banking” system that doesn’t transfer any money when it should. Simply put, with tests, we ensure that our software behaves as expected. Our engineering expertise ensures that our code is performant, reliable, maintainable, etc. Let’s take a look at the examples.
Going back to the TODO behaviors, we can focus on “When I create a task, I want to see it in the list of tasks.” To do that, we need some store to persist our tasks. Nothing in the behavior specifies where and how we should store the tasks. We just need to be able to list them after they are added. When testing implementation details, the tests and implementation could look like this:
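The original listing is not available here, so the following is only a minimal sketch of what such tests might look like. Only add_task, list_tasks, and the use of SQLite come from the article; the Task and TaskStoreSQLite names, the tasks table, and its title column are assumptions made for illustration.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    # Hypothetical task model: only a title, to keep the sketch small.
    title: str


class TaskStoreSQLite:
    """Hypothetical SQLite-backed store; table and column names are assumptions."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._connection = connection
        self._connection.execute(
            "CREATE TABLE IF NOT EXISTS tasks (title TEXT NOT NULL)"
        )

    def add_task(self, task: Task) -> None:
        self._connection.execute(
            "INSERT INTO tasks (title) VALUES (?)", (task.title,)
        )
        self._connection.commit()

    def list_tasks(self) -> list[Task]:
        rows = self._connection.execute("SELECT title FROM tasks").fetchall()
        return [Task(title=row[0]) for row in rows]


def test_add_task_inserts_row():
    connection = sqlite3.connect(":memory:")
    store = TaskStoreSQLite(connection)

    store.add_task(Task(title="Buy milk"))

    # The test queries the table directly, so it knows the table and column names.
    rows = connection.execute("SELECT title FROM tasks").fetchall()
    assert rows == [("Buy milk",)]


def test_list_tasks_reads_rows():
    connection = sqlite3.connect(":memory:")
    store = TaskStoreSQLite(connection)

    # The test writes to the table directly instead of going through add_task.
    connection.execute("INSERT INTO tasks (title) VALUES (?)", ("Buy milk",))

    assert store.list_tasks() == [Task(title="Buy milk")]
```

Note that there is one test per method, and each test bypasses the store to talk to the database itself.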
Things look slightly different when we focus on behavior. The tests and implementation could look like this:
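Again, this is a sketch rather than the original listing. It assumes the same hypothetical Task and TaskStoreSQLite names and the same assumed tasks table; the store is repeated so that the example runs on its own. The difference is in the test, which uses only the public add_task and list_tasks methods.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    # Hypothetical task model: only a title, to keep the sketch small.
    title: str


class TaskStoreSQLite:
    """Hypothetical SQLite-backed store; table and column names are assumptions."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._connection = connection
        self._connection.execute(
            "CREATE TABLE IF NOT EXISTS tasks (title TEXT NOT NULL)"
        )

    def add_task(self, task: Task) -> None:
        self._connection.execute(
            "INSERT INTO tasks (title) VALUES (?)", (task.title,)
        )
        self._connection.commit()

    def list_tasks(self) -> list[Task]:
        rows = self._connection.execute("SELECT title FROM tasks").fetchall()
        return [Task(title=row[0]) for row in rows]


def test_added_task_is_listed():
    # Behavior under test: "When I create a task, I want to see it in the list of tasks."
    store = TaskStoreSQLite(sqlite3.connect(":memory:"))
    task = Task(title="Buy milk")

    store.add_task(task)

    # No SQL here: the behavior is verified through the public methods only.
    assert store.list_tasks() == [task]
```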
Let’s study the differences.
Readability and maintainability are improved when testing behavior
First, tests testing implementation details are entirely tied to the fact that we’re using SQLite as a store. There’s actually SQL inside the tests. This hurts readability, but that’s not the worst part. The worst part is the fact that we’ll need to touch these tests when:
- new columns are added to the table
- the table is renamed
- the table is moved to a different schema
- columns are renamed
- …
All of these are changes in implementation. While switching to a different database may be rare, adding new columns is quite common. If the tests must be edited every time one of these changes is introduced, they protect us less against regressions: a test that changes together with the implementation can no longer tell us whether the behavior stayed the same.
On the other hand, tests that test behavior are not tied to the fact that we’re using SQLite as a store. There’s no SQL inside them. We use only the store’s public methods to execute and verify the behavior. This gives us much better protection against regressions, because we can introduce any of the previously mentioned changes without touching the tests. Have a hard time believing that? Take a look at the in-memory store implementation and tests:
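As a rough sketch (the original listing is not reproduced here), an in-memory store might keep the tasks in a plain list. The Task and TaskStoreInMemory names are assumptions; add_task and list_tasks are the methods named in the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    # Hypothetical task model: only a title, to keep the sketch small.
    title: str


class TaskStoreInMemory:
    """Hypothetical in-memory store that keeps tasks in a private list."""

    def __init__(self) -> None:
        self._tasks: list[Task] = []

    def add_task(self, task: Task) -> None:
        self._tasks.append(task)

    def list_tasks(self) -> list[Task]:
        # Return a copy so callers cannot modify the internal list.
        return list(self._tasks)


def test_added_task_is_listed():
    # Same behavior, same assertions as for a SQLite-backed store.
    store = TaskStoreInMemory()
    task = Task(title="Buy milk")

    store.add_task(task)

    assert store.list_tasks() == [task]
```

Compared with a behavior test written against a SQLite-backed store, only the line that constructs the store would differ.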
The observed behavior is still there, but the implementation is completely different. In other words, we can use exactly the same tests to verify the behavior of the in-memory and SQLite stores. There’s a change in how we initialize the store, but that’s not part of the tests.
Another thing to notice is that there’s a single test when testing behavior as opposed to two tests when testing implementation details. Does that mean fewer tests when testing behavior? Absolutely not! This only shows that we are not focused on methods (one method, one test) but on behaviors (one behavior, one test). In our case, our behavior is “When I add a task, I want to see it in the list of tasks.”. Therefore, we call two methods – add_task and list_tasks – to verify that behavior.
Testing behavior improves resistance to refactoring
There’s more: resistance to refactoring. Tests that test implementation details of the in-memory store could look like this:
We’ve mentioned the SQL when looking at the tests for the SQLite store. Here, we have the same problem. It’s just not SQL. It’s the fact that the in-memory store keeps its tasks in a list. Good luck refactoring that to keep the tasks in a dictionary, a set, or a custom data structure. You’ll need to touch the tests even though the behavior is unchanged. This means that such tests aren’t resistant to refactoring. They’re tied to the implementation details. With tests that test behavior, we can refactor the implementation details without touching the tests. They have no awareness that the tasks are kept in a list, so we can easily swap it for something else.
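A minimal sketch of such tests, assuming a hypothetical TaskStoreInMemory that keeps its tasks in a private _tasks list (the class name, the attribute name, and the Task model are assumptions, not the original listing):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    # Hypothetical task model: only a title, to keep the sketch small.
    title: str


class TaskStoreInMemory:
    """Hypothetical in-memory store that keeps tasks in a private list."""

    def __init__(self) -> None:
        self._tasks: list[Task] = []

    def add_task(self, task: Task) -> None:
        self._tasks.append(task)

    def list_tasks(self) -> list[Task]:
        return list(self._tasks)


def test_add_task_appends_to_internal_list():
    store = TaskStoreInMemory()
    task = Task(title="Buy milk")

    store.add_task(task)

    # Reaches into a private attribute and assumes it is a list.
    assert store._tasks == [task]


def test_list_tasks_returns_internal_list_content():
    store = TaskStoreInMemory()
    task = Task(title="Buy milk")

    # Bypasses add_task and relies on the list's append method.
    store._tasks.append(task)

    assert store.list_tasks() == [task]
```

Both tests pass today, but they would fail as soon as _tasks became, say, a dictionary keyed by title, even though add_task and list_tasks would still behave the same from the outside.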
Conclusion
While things might seem obvious now, most of the tests I’ve seen in practice (if there were any) tested implementation details. That’s one of the major reasons for frustration with automated tests. Therefore, once again, I encourage you to focus on behaviors when writing automated tests. Hide the implementation details inside your objects and keep the tests clean. Tests that test behavior are more readable, more maintainable, and resistant to refactoring. They also lead to a more modular code design. A good rule of thumb to check whether you’re testing behavior or implementation details is to ask yourself: “Can I replace my implementation with a different one without touching the tests?” If the answer is yes, you’re most likely testing behavior. If the answer is no, you’re most likely testing implementation details.
You can find full code examples in this GitHub repository.