Some Thoughts on Testing

The main problem with testing is that nobody can think of a procedure to decide how to test. And maybe that’s not even possible. What’s worse is that the feedback cycle from bad tests lasts a long time. It might take a year or more before you discover that an important code path has bad tests, when a use case changes and subtle bugs appear or large swaths of the test suite need to be rewritten. Every developer will have a different set of experiences with tests, and those experiences will shape their attitudes on the matter simply by random chance. It’s easy to see how opposite extremist viewpoints arise under these conditions. A developer who is bitten early in their career by a bad test suite will naturally be opposed to testing efforts. They will test less, which will reinforce the belief since they will have less opportunity to be exposed to well-written tests. And if you think about it, that’s a perfectly rational way to react to those experiences, since maintaining a bad test suite can definitely be worse than having no tests at all.

I was very fortunate to have been exposed to an excellent test suite on the first major project I worked on as a new developer in open source. The maintainers always insisted on a test for bug fixes and features and I saw for myself how this practice benefited the project.

What is a Test?

A test is a model that is used to approximate user behavior. When the model fits well, we can conclude from a passing test that the user will experience the result defined by the test assertions under the conditions of the test setup. In this way, tests become a precise definition of the intended application behavior bridging the gap between how the application actually runs and how the application ought to run. The setup shows an ideal for usage and the assertions show value judgments the user can use to set their expectations of defined application behavior.
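To make the setup and assertion roles concrete, here is a minimal sketch in Python. The ShoppingCart class and its methods are invented for this illustration and do not come from any real project. The test is written as a plain function of the kind a runner such as pytest would collect.

```python
class ShoppingCart:
    """Hypothetical application code, invented for this illustration."""

    def __init__(self):
        self._items = []

    def add(self, name, price, quantity=1):
        self._items.append((name, price, quantity))

    def total(self):
        return sum(price * quantity for _, price, quantity in self._items)


def test_total_sums_price_times_quantity():
    # Setup: the conditions under which a user is expected to use the cart.
    cart = ShoppingCart()
    cart.add("pen", 2, quantity=3)
    cart.add("notebook", 5)

    # Assertion: the result the user can rely on under those conditions.
    assert cart.total() == 11
```

If this test passes and the model fits, a user who adds items the same way can expect the same total.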

In the broader picture of the engineering process, you can imagine a downward flow of work from requirements, to design, to documentation, to testing, to implementation. Each lower level serves the higher level by adding precision at the expense of the ability to judge value. Tests use the documentation as their source of truth and serve its purpose of defining application behavior, but at a level of specificity that cannot be attained by plain English. However, this specificity comes at the cost of the ability to make more general value judgments, since only specific inputs can be tested.

Tests sit just above the implementation, which is by definition completely specific and objective. Implementation code simply runs how it runs. Correct behavior can only be determined in the context of the levels above. It is easier to determine whether application code is correct in the context of tests than in the context of documentation, since tests are written in the same language domain as the implementation.

The Benefits of Testing

Since there is no procedure to decide how to write tests, it’s important for anyone who writes or reviews tests to understand their purpose. If the answer is “because my boss said every pull request must have tests” then, in my experience, this nearly always leads to a low-quality test suite. Testing may be a legitimate business requirement, and it’s a reasonable ask, but to meaningfully deliver this as a feature, your technical lead must be conscious of the testing strategy and must be able to articulate good practices to less experienced developers. Only include tests that you reasonably understand to bring value to the project, even if this understanding is only intuitive.

There are two different groups of people who benefit from tests, and tests must benefit both of these groups simultaneously.

Benefits to Developers

Most often it is your developers who will be writing their own tests, so it’s important to get them invested in the process. Testing is unusual in that it is seen as a low-status technical activity, but at the same time it requires an enormous amount of skill to do correctly. A developer who is resentful about needing to write tests will tend to write bad tests. To make things easier for them, it’s a good idea to come up with a testing strategy while you determine the approach to implementation. All else being equal, you should always prefer an approach that is more testable. Lack of testability is a valid reason to reject an approach.
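As one illustration of what “more testable” can mean, consider a function that depends on the current time. This is only a sketch, and the greeting function is hypothetical. Accepting the time as an optional parameter, instead of reading the clock inside the function body, lets a test pin down the input.

```python
from datetime import datetime, timezone


def greeting(now=None):
    """Hypothetical function: the time is a parameter so a test can fix it."""
    if now is None:
        now = datetime.now(timezone.utc)
    return "Good morning" if now.hour < 12 else "Good afternoon"


def test_greeting_before_noon():
    nine_am = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
    assert greeting(nine_am) == "Good morning"


def test_greeting_after_noon():
    three_pm = datetime(2024, 1, 1, 15, 0, tzinfo=timezone.utc)
    assert greeting(three_pm) == "Good afternoon"
```

A version that always called datetime.now() internally would behave the same in production, but its tests would pass or fail depending on when they were run.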

The main benefit to the writer of the test is that it codifies their intent into the repository. Writing a test sends a clear message to other developers who may modify this code about what it’s supposed to do, so the writer can be confident that others won’t break a feature they are relying on for future work. This is especially important in open source projects, where lots of people are making one-off changes and might not recognize an edge case you are relying on for your feature. This sort of communication with others is done much more efficiently with tests than with comments.

If your team has a culture where testing is a responsibility, the benefits are much clearer. When a developer can expect others to write tests for the features they need, they gain the ability to freely modify the code without being as nervous about breaking another developer’s feature. This frees up mental energy for code quality improvements like large-scale refactoring that simply wouldn’t be possible without feedback to guard against unintended side effects. Ideally, any sensible implementation at all that passes the test suite should be acceptable, which has the effect of reducing the actual code base to just details. It’s much more fun to commit to a code base where there are fewer consequences for mistakes, and much easier to review as well.

Without this expectation, when a regression occurs, the only possible solution is to “be more careful” which is not nearly as actionable as “write a regression test”.
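A regression test can be very small. The sketch below assumes a hypothetical bug report in which version strings were compared as text, so that "1.10.0" ranked below "1.9.0". The parse_version helper is invented for this example.

```python
def parse_version(text):
    """Hypothetical helper: turn "1.10.0" into the tuple (1, 10, 0)."""
    return tuple(int(part) for part in text.split("."))


def test_version_comparison_is_numeric_not_lexical():
    # Regression test for the hypothetical bug: as plain strings,
    # "1.10.0" sorts before "1.9.0" because "1" < "9".
    assert "1.10.0" < "1.9.0"

    # The behavior we want to keep: versions compare numerically.
    assert parse_version("1.10.0") > parse_version("1.9.0")
```

Once this test is in the suite, the intent survives even if nobody remembers the original bug.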

Benefits to Users

For a user, the test suite is a good way to evaluate your use of the application. The test suite contains examples of usage you can compare to your own usage to understand whether you are on the common path. If your usage is different from the tests, you’ll know you are doing something novel and need to exercise some caution with your implementation. Whenever I see some odd behavior with a library I’m using, the first place I look for an explanation is the test suite. If my path is tested and my results are different, then it narrows down the possible reasons for the discrepancy to the environment. Knowing this is helpful when reporting bugs on the project. If the path is not tested, I know I’m doing something with the library the authors may not have intended and I’m on my own to make sure it works. In that case, I know I need to do some work in the library and then add a test for my use case to make sure it remains supported in the future.

You can use the test suite as documentation for the project. In some ways, it’s better than the actual documentation because you know if the tests pass you are looking at working code, while the documentation may be out of date. Not nearly enough people know to use the test suites in the projects they use this way.

The Costs of Testing

Writing tests is a lot of work, but when done correctly, it’s a force multiplier for developer and user productivity by clearly showing design intent and increasing the stability of the code base. The problem is that the stability you gain is forced, and it takes additional effort to relax assumptions when a use case changes.

Just like with any other code, most of your testing effort will go into maintaining an existing test suite, and this should be considered the primary driver of cost. Maintenance costs vary inversely with the stability of the interface. The more stable an interface is, the less it costs to test it, which makes it a better target for tests. It doesn’t make any sense to test scaffold or proof-of-concept code because you’ll end up paying the cost of removing the tests later.
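Here is one way to picture the difference in maintenance cost, using a hypothetical Inventory class invented for this sketch. The first test depends only on the public methods, which are assumed to be the stable interface. The second depends on how the data happens to be stored.

```python
class Inventory:
    """Hypothetical class: add() and count() are the public interface."""

    def __init__(self):
        self._stock = {}  # internal detail that may change at any time

    def add(self, sku, quantity):
        self._stock[sku] = self._stock.get(sku, 0) + quantity

    def count(self, sku):
        return self._stock.get(sku, 0)


def test_count_reflects_added_stock():
    # Cheaper to maintain: survives any rewrite that keeps add() and count().
    inventory = Inventory()
    inventory.add("A1", 2)
    inventory.add("A1", 3)
    assert inventory.count("A1") == 5


def test_stock_is_stored_in_a_dict():
    # More expensive to maintain: fails as soon as the storage changes,
    # even if every user-visible behavior stays the same.
    inventory = Inventory()
    inventory.add("A1", 2)
    assert inventory._stock == {"A1": 2}
```

Both tests pass today, but only the first one is likely to still be worth keeping after a refactoring of the internals.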

General Testing Principles

So in conclusion, here are some basic principles to decide how to design your test suite.

  • Only include a test if you can justify its value.
  • Limit your tests to code under your control.
  • Write application code with testability in mind.
  • Write tests to augment the documentation.
  • Do not test undefined application behavior.
  • Write the minimum number of tests you need.
  • Write tests whose failure has a meaningful business reason.