The paradox of high test coverage
When I learn that code owned by a team has low test coverage, I expect "here be dragons." But I never know what to expect if the code coverage is high. I call this a paradox of high test coverage.
High test coverage does not tell much about the quality of unit tests. Low coverage does.
The low coverage argument is self-explanatory. If tests cover only a small portion of the product code, they cannot prevent bugs in the code that is not covered. The converse, however, does not hold: high test coverage does not guarantee a quality product. How is this possible?
Test issues
While unit tests help ensure the quality of the product code, nothing, except the developer, ensures the quality of the unit tests. As a result, tests sometimes have issues that allow bugs to sneak in. Finding unit test issues is more luck than science. It usually happens by accident, typically when tests continue to pass despite code changes that should trigger test failures.
One of the simplest examples of a unit test issue is missing asserts. Tests without asserts are unlikely to flag issues. Other common problems include incorrect setup and bugs caused by copying existing tests and incorrectly adapting them to test a new scenario.
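As a minimal sketch of the missing-assert problem, consider the block below. The function `applyDiscount` and both test functions are hypothetical names invented for this illustration, and the tests return `bool` instead of using a real framework's assertion macros. The first test executes every line of the function and therefore counts fully toward coverage, yet it cannot fail. Only the second one notices the bug.

```cpp
#include <cassert>

// Hypothetical function under test. It should return the price after the
// discount, but it has a bug: it returns the discount amount instead.
int applyDiscount(int priceCents, int percent) {
    return priceCents * percent / 100;  // bug: should be priceCents - priceCents * percent / 100
}

// A "test" with no check. It runs the code, so coverage tools count the
// function as covered, but the result is thrown away and the test always passes.
bool testWithoutAssert() {
    applyDiscount(1000, 10);
    return true;
}

// The same scenario with an actual check on the result.
// With the bug above, this returns false, i.e. the test fails.
bool testWithAssert() {
    return applyDiscount(1000, 10) == 900;
}
```

Both tests produce the same coverage number. Only one of them tells you anything about the code.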
Mocking issues
Mocking isolates the code under test from its dependencies by simulating their behavior. However, when the simulation is incorrect or the behavior of the dependency changes, tests may happily pass, hiding serious issues.
I've been working with C++ code bases, and I often see developers assume, without confirming, that a dependency they use won't throw an exception. So, when they mock this dependency, they forget about the exception case. Even though their tests cover all the code, an exception in production takes the entire service down.
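The exception case can be sketched roughly as follows. Everything here is an assumption made for illustration: the `ConfigStore` interface, the `readTimeoutMs` function, and both hand-written mocks are hypothetical and do not come from any real library. The happy-path mock is enough to cover every line of `readTimeoutMs`, but only the throwing mock reveals that an exception escapes the function unhandled.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical dependency interface.
struct ConfigStore {
    virtual ~ConfigStore() = default;
    virtual std::string get(const std::string& key) = 0;
};

// Code under test. It silently assumes that get() never throws.
int readTimeoutMs(ConfigStore& store) {
    return std::stoi(store.get("timeout_ms"));
}

// Mock that models only the happy path. A test using it covers
// every line of readTimeoutMs.
struct HappyPathMock : ConfigStore {
    std::string get(const std::string&) override { return "250"; }
};

// Mock that models the failure the developer forgot about.
struct ThrowingMock : ConfigStore {
    std::string get(const std::string&) override {
        throw std::runtime_error("store unavailable");
    }
};

// Returns true if the dependency's exception escapes the code under test.
bool exceptionEscapes(ConfigStore& store) {
    try {
        readTimeoutMs(store);
        return false;
    } catch (const std::runtime_error&) {
        return true;
    }
}
```

A suite that uses only `HappyPathMock` reports full coverage of `readTimeoutMs` and passes. Whether the escaping exception is acceptable is a design decision, but the test with `ThrowingMock` at least forces someone to make it.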
Uncovered code
Getting to 100% code coverage is usually impractical, if not impossible. As a result, a small amount of code remains uncovered. Just as in the low-coverage scenario, any change to that uncovered code can introduce a bug that won't be detected.
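A small sketch of how a bug hides in the uncovered remainder, assuming a hypothetical `clampPercent` function and a hypothetical two-case suite `runSuite` (both invented for this example). The suite exercises most of the function and passes, while the one statement it never executes is exactly where the bug lives.

```cpp
#include <cassert>

// Hypothetical function: clamps a value to the range [0, 100].
int clampPercent(int value) {
    if (value < 0) {
        return 0;
    }
    if (value > 100) {
        return 0;  // bug: should return 100; the suite below never executes this line
    }
    return value;
}

// Hypothetical suite covering the negative and in-range cases only.
// It returns true when all of its checks pass.
bool runSuite() {
    return clampPercent(-5) == 0 && clampPercent(42) == 42;
}
```

The suite passes and covers all but one statement of the function, yet `clampPercent(150)` returns 0 instead of 100.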
Chasing the coverage number
Test coverage is only a metric. I've seen teams do whatever it takes to achieve the metric goal, especially if it was mandated externally, e.g., at the organization or company level. Occasionally, I encountered teams that wrote "test" code whose primary purpose was increasing coverage. Detecting or preventing bugs was a non-goal.
Low test coverage is only the tip of the iceberg
At first sight, low test coverage seems a benign issue. But it often signals bigger problems the team is facing, like:
the team spends a significant amount of time fixing regressions
shipping high-quality new features is slow because of excessive manual validation
many bugs reach production and are only caught and reported by users
the on-call, if the team has one, is challenging
the engineering culture of the team is poor, or the team is under pressure to ship new features at an unsustainable pace
the code is not very well organized and might be hard to work with, which slows development down even further
test coverage is likely lower than admitted to and will continue to deteriorate
I've worked on a few teams where developers understood the value of unit testing. They treated test code like product code and never sent a PR without unit tests. Because of this, even if they experienced the problems listed above, it was at a much smaller scale. They also never needed to worry about meeting the test coverage goals: they achieved them as a side effect.
Thanks for reading!
-Pawel