Test automation is an integral part of the developer doctrine. It hardly needs an introduction. We use different techniques on different levels: we write unit tests, integration tests, end-to-end tests, UI tests, performance tests, and so on. But it is still not enough. We do our best to follow clean code practices and SOLID principles, and we work towards testable software in order to ensure software quality.
Even then, there are certain pitfalls in writing automated tests and designing test systems. They persist despite all our efforts towards testability, and they can even have a negative impact on software quality. This article is about one of these pitfalls, which I have experienced in almost every project I've worked on: the return on investment.
OK, let's start.
What do you mean you weren't testing?
Years ago, when I joined a startup as a software developer, they already had two flagship applications, with which they were providing software as a service to more than 15 happy customers. And they did not have a single automated test.
Shocking enough? I worked at this successful company for over thirteen years; the company grew, I had multiple promotions, and the customer base grew thirty-fold. By the time I left, we still had no more than a handful of automated tests. Please believe me, it was (and still is) a successful organization. It's just that we had a rich testing and quality management strategy, composed of different tools.
This experience gave me a useful perspective for assessing the testing strategies wherever I went, and for discovering opportunities to improve them.
Thank you very much my fellow developer
And then, at a different time, on a different project, I was asked to fix a bug in a fairly complex feature, where I was definitely not the owner of the code. I worked hard to understand all aspects of this feature and the overall architecture. Finally, I thought I understood everything, made the fix and ran the tests. That was when two nicely designed tests failed and showed me the two scenarios I had completely missed. The names of the test methods, the preparation of the test data, the assertion messages were so nicely written, it was really "in your face". There I said to myself: thank you very much, my fellow developer.
Why do we write automated tests?
The story above is exactly why:
Automated tests (should) warn us when we are about to create impact.
Without those tests, I would have remained totally unaware that I was breaking the functionality.
It is a common story; it happens to all of us, and there is no shame in it. We change something, and something else breaks. That's why the best tests are those that fail at the right moment and warn us.
The story of the test maintainers guild
In another project, I was part of a team that released a major version approximately once a year. As part of these releases, we were officially required to provide a certain level of code coverage. We achieved this by maintaining a considerable number of end-to-end (E2E) tests. Not surprisingly, only one week into the development of the new version, these tests would start to break. Keeping them green was an impossible task, because, yes, they were E2E tests, and almost any kind of code change broke a test. Hence, they were simply allowed to fail until the last sprints. Until then, we relied on manual tests. Once all the features were implemented and (manually) tested, the members of the test maintainers guild (the developers) spent weeks "fixing" these E2E tests.
A failing test was no longer an indicator of a bug, because all code changes had already been manually tested, remember? All that was left was to make the tests green and to generate the desired code coverage report.
Does this story sound familiar?
Why do the tests fail?
This is not a philosophical question about clean code principles or software architecture; it's more of a technical question. Test failures can be categorized into the following three groups:
Impact
The "thank you my fellow developer" example: there is a real impact happening from a code change.
Infrastructure
Cases such as a test agent malfunctioning. In these cases the test itself is not flaky and has no stability issue; the failure comes from the environment.
Maintenance
These are the cases where a test failure ends up requiring a fix in the test code itself: for instance, adding a new parameter to a constructor, renaming a method, adding a new button to the UI, or changing the underlying test data source.
A test can also be flaky because it's not well written, not well designed, or simply overcomplicated.
Needless to say, we want tests to fail because of impact and, ideally, nothing else. How often is this the case in your projects? How often do you find yourself maintaining your tests?
Return on investment
In another team I worked with, there were two test engineers who wrote and maintained the UI tests. They were given the goal of adding more and more UI tests to the repository. However, the maintenance load for the UI tests was so huge that most of their effort was spent keeping the existing tests green. They were barely able to write new tests. On top of this, these tests revealed only one or two bugs per year. It was certainly not their fault, but in the end, all the resources put into the UI tests made little difference.
My argument back then was about the return on investment. What I proposed was an alternative strategy: drop all UI tests at once, and instead ask these two engineers to do manual testing.
How many bugs would they find if they did manual testing all year long, eight hours per day? More than two, I'm pretty sure.
The most important thing about automated tests
Now please ask yourself this question: What is the benefit of automated tests over manual tests? They run much faster, and they are easily repeatable without considerable cost. On the other hand, a manual test requires resources every time it's executed.
In reality, automated tests also have a cost: the cost of maintenance. And if the automated tests demand a lot of maintenance, then the primary promise of test automation, cheap repeatability, is open to discussion.
Here is the most important thing about automated tests:
The lower the maintenance effort, the better the automated tests.
This was exactly the case in the previous examples. In the UI test example, two dedicated engineers were assigned to a task that barely produced results. In the E2E test example, an entire team of developers spent weeks "fixing" tests that, in the end, had nothing to do with the quality of the software.
At any stage of the project you are developing / maintaining, please ask yourself this question:
How much maintenance effort am I putting into automated tests, and how many issues are they discovering? What is the return on investment?
If the numbers don't add up, it might be a good time to think of alternative strategies.
Tips & Tricks
Writing resilient tests
Pay attention to the tests that require constant maintenance. By following best practices, try to minimize the maintenance effort.
Here is a simple and concrete example:
public class MyWhateverHandler
{
    public MyWhateverHandler(
        IHttpContextHelper contextHelper,
        ISessionHelper sessionHelper,
        IFileHelper fileHelper,
        IAuthenticationManager authManager,
        IDataAccessLayer dataAccessLayer,
        IThisHelper thisHelper,
        IThatHelper thatHelper,
        [...])
    {
    }
}
One thing is certain: this class has many responsibilities, and since it is the class that does everything, it will only grow. While it grows, you'll constantly find yourself refactoring your tests to update the constructor signature or to mock new dependencies. Violating the single responsibility principle also has an impact on test maintainability.
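To make the maintenance cost tangible, here is a minimal sketch of what every test for this handler has to start with. It assumes xUnit and Moq as the test and mocking frameworks and reuses the interfaces from the example above; any equivalent tools would illustrate the same point.

using Moq;
using Xunit;

public class MyWhateverHandlerTests
{
    [Fact]
    public void Some_Behavior_Works_As_Expected()
    {
        // Every test repeats this setup, and every new constructor
        // dependency forces an update to all of these tests.
        var handler = new MyWhateverHandler(
            new Mock<IHttpContextHelper>().Object,
            new Mock<ISessionHelper>().Object,
            new Mock<IFileHelper>().Object,
            new Mock<IAuthenticationManager>().Object,
            new Mock<IDataAccessLayer>().Object,
            new Mock<IThisHelper>().Object,
            new Mock<IThatHelper>().Object
            /* ... one mock per additional dependency ... */);

        // ... act on the one behavior under test and assert
    }
}

Splitting the handler into smaller, single-purpose classes shrinks this setup and, with it, the number of tests you have to touch on every refactoring.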
Another example:
public int MyMethodWithTooManyParameters(
    int x, int y, int z,
    double m, double n, double k,
    string successMessage, string errorMessage,
    IThisInterface t1, IThatInterface t2)
{
    [...]
}
A similar violation is happening here, so breaking this method down will also help test resilience. If that is not possible, an alternative that can help testing is to aggregate the parameters into a single entity, as sketched below.
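A minimal sketch of that aggregation, assuming C# records are available; the record name and property names are illustrative, not taken from any real code base:

// Illustrative parameter object; the names are assumptions.
public record CalculationInput(
    int X, int Y, int Z,
    double M, double N, double K,
    string SuccessMessage, string ErrorMessage,
    IThisInterface This, IThatInterface That);

public int MyMethodWithTooManyParameters(CalculationInput input)
{
    // Same body as before, now reading the values from 'input'.
    // The signature stays stable when the input gains a new field,
    // so existing tests keep compiling.
    throw new System.NotImplementedException(); // body elided, as in the original example
}

Callers and tests now construct one CalculationInput instead of passing ten positional arguments, so adding a field no longer breaks every call site.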
A final example could be a UI test for a web application. Which would be the more maintainable option: locating an element via its visible text, its id or class name, or a test-only attribute such as data-testid? There can be multiple approaches to this design choice, but in the end, I'd always prefer the most maintainable one.
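As a hedged illustration, assuming Selenium WebDriver for C# (the element, class and attribute values are made up), the three options look like this:

using OpenQA.Selenium;

public static class SaveButtonLocators
{
    // Breaks as soon as the wording changes ("Save" -> "Save changes"):
    public static readonly By ByVisibleText =
        By.XPath("//button[text()='Save']");

    // Breaks when a designer renames or restyles the CSS class:
    public static readonly By ByCssClass =
        By.CssSelector("button.btn-primary-save");

    // Survives both, because the attribute exists only for the tests:
    public static readonly By ByTestId =
        By.CssSelector("[data-testid='save-button']");
}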
Tests that find issues
Are your tests designed to increase code coverage, or to find issues? Some tests are designed to always pass; I try to avoid them. Here is a question you can ask yourself before writing a test:
In what kind of situation could a future developer overlook something and break the feature I just created? What kind of test should I write so that this doesn't happen?
This is a difficult task, but if you practice it constantly, both your production code and your tests will be more maintainable.
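Here is a small, entirely hypothetical example of what practicing this can look like, assuming xUnit; the domain types are invented only to illustrate a test that guards an easily overlooked scenario:

using Xunit;

// Hypothetical domain types, invented for illustration:
public record Item(decimal Price, bool AlreadyDiscounted);

public static class DiscountCalculator
{
    public static Item Apply(Item item, decimal discountRate) =>
        item.AlreadyDiscounted
            ? item
            : item with { Price = item.Price * (1 - discountRate) };
}

public class DiscountCalculatorTests
{
    [Fact]
    public void Discount_Is_Not_Applied_To_Already_Discounted_Items()
    {
        // The scenario a future developer is most likely to overlook:
        var item = new Item(Price: 100m, AlreadyDiscounted: true);

        var result = DiscountCalculator.Apply(item, discountRate: 0.2m);

        Assert.True(result.Price == 100m,
            "Items that already carry a discount must not be discounted a second time.");
    }
}

The test name, the test data and the assertion message all spell out the overlooked scenario, so that when it fails, the warning is "in your face", just like in the story above.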
Hidden danger: Tests that always pass
int multiplyByTwo(int x) {
return x * 2;
}
Would you write unit tests around this method? If you do, I wouldn't mind. If you don't, I wouldn't judge. Would I? Probably not. And here is why: not once in the future would those tests find any issue or bug. If someone changes this method (e.g. renames it or adds a parameter), they would have to find the tests and change them to match. Such tests bring only one thing: more maintenance.
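For completeness, such a test, assuming xUnit and a hypothetical Calculator class hosting the method above, would look roughly like this:

using Xunit;

public class MultiplyByTwoTests
{
    [Fact]
    public void MultiplyByTwo_Doubles_The_Input()
    {
        // Mirrors the implementation one-to-one: it can only fail when
        // somebody deliberately edits either the method or this test.
        Assert.Equal(4, Calculator.multiplyByTwo(2));
    }
}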
And here is the key takeaway: if a developer doesn't change the signature of this method but changes only the implementation to return, let's say, x * 3, then I have news for you: you have a bigger concern than your missing tests!
Tests as part of your continuous integration
Multiple times I have seen so-called "automated" tests that were not part of the continuous integration. You had to run them manually, from time to time. You had to create the correct setup, talk to the right people, get some scripts fixed, and so on. And maybe, after two days of work, you could execute the tests. Even more work for the maintenance guild.
Achieving the flow state
The automated tests should be there, close to you. There are tools that run the tests while you are typing code; use them if possible. The key point is this: your attention is on what you are doing, and it is limited. If a test is readily available and quick to respond, you will get your feedback without losing your focus. This is one of the essential differences between fast-running unit tests and the E2E tests you run once a day.
E2E tests run at night; you come back the next morning and see that five tests are failing, but who knows whose changes broke which. The attention is already lost.
This is also why the CI must be swift. A pull request pipeline that runs in 35 minutes instead of 5 is the difference between getting an immediate response and going for lunch. Try to improve the repository design and leverage impact analysis tools and libraries, if possible.
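One common lever is to tag tests by speed, so that the pull request pipeline runs only the fast subset while the slow suites run nightly. A minimal sketch using xUnit trait attributes (the class and test names are made up for illustration):

using Xunit;

public class InvoiceCalculatorTests
{
    [Fact]
    [Trait("Category", "Unit")]   // fast: runs on every pull request
    public void Total_Includes_Tax()
    {
        // ... arrange, act and assert against in-memory objects only
    }

    [Fact]
    [Trait("Category", "E2E")]    // slow: runs in the nightly pipeline
    public void Invoice_Can_Be_Submitted_End_To_End()
    {
        // ... drives the deployed application through its public interface
    }
}

// The pull request pipeline then selects only the fast subset, e.g.:
//   dotnet test --filter "Category=Unit"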
In the end, it's all about the value
Let me say it once again: working towards a cleaner architecture is the most effective strategy for maintainable tests. That is a given. We should expect, though, that in a complex legacy system there are also other important factors at play. This article attempted to focus on these other, blurrier aspects.
Automated tests are not sacred. In the end, it's all about the value. Tests that always pass have minimal value. Tests that require constant maintenance can cost you a lot. Tests that are a burden on your CI, or that prevent you from improving your tech stack, come with a price tag.
You can remove tests, you can remove entire layers of tests, and you can decide not to write tests at all. Do the value assessment, and decide for yourself.