A project might find itself in different stages of unit test line and branch coverage.
But what does that really say about a project? If it’s 100% covered, is it really tested?
Or, as Wikipedia puts it:
"Tests can be created to verify the correctness of the implementation of a given software system, but the creation of tests still poses the question whether the tests are correct and sufficiently cover the requirements that have originated the implementation."
Whether the tests were written poorly on purpose or just sloppily, coverage percentages can serve as a nice tool to lie, or rather to bring reassurance, to oneself or to management about quality.
The real measure of quality is the quality of the tests AND the amount of code they cover.
So this raises the question: besides reviewing the tests alongside the code and giving them some subjective grade, is there a tool that can speak volumes about the quality of the tests?
I won’t bore you with the details, which you can find on the provided links, but the basic idea of mutation testing is:
If you modify the code in some non-equivalent way and the tests still pass, you have bad tests, and some programming error down the line won’t be caught by them.
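To make the idea concrete, here is a minimal hand-rolled sketch. The `isAdult` method, its mutant, and the two "tests" are hypothetical names invented for illustration, and the mutation is applied by hand rather than by any tool. The weak test only checks values far from the boundary, so it cannot tell the original from the mutant; the stronger test also checks the boundary and fails on the mutant.

```java
import java.util.function.IntPredicate;

class MutationIdea {
    // Original code under test (hypothetical example).
    static boolean isAdult(int age) {
        return age >= 18;
    }

    // Hand-written mutant: the boundary is changed from >= to >.
    static boolean isAdultMutant(int age) {
        return age > 18;
    }

    // Weak test: only checks values far away from the boundary.
    static boolean weakTest(IntPredicate impl) {
        return impl.test(30) && !impl.test(5);
    }

    // Stronger test: also checks both sides of the boundary.
    static boolean strongTest(IntPredicate impl) {
        return weakTest(impl) && impl.test(18) && !impl.test(17);
    }

    public static void main(String[] args) {
        System.out.println("weak test, original:   " + weakTest(MutationIdea::isAdult));
        System.out.println("weak test, mutant:     " + weakTest(MutationIdea::isAdultMutant));
        System.out.println("strong test, original: " + strongTest(MutationIdea::isAdult));
        System.out.println("strong test, mutant:   " + strongTest(MutationIdea::isAdultMutant));
    }
}
```

The weak test passes for both versions, so the mutant "survives". Only the strong test returns false for the mutant, which is what a mutation tester would report as a killed mutant.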
Fortunately, some nice folks are working on a really cool project called PIT, which is a mutation tester for Java.
I encourage you to read everything on the site, including the media links, and try it out.
For the purpose of this article I will provide an example project for showcasing mutation testing with PIT, including a Gradle build script. This is made possible by the PIT Gradle plugin.
Coverage tools report 100% line and branch coverage (ignoring the equals method).
Eclipse with EclEmma and IntelliJ coverage reports:
At this point you might call it a day and a job well done. The tests look pretty thorough, there are a lot of them, and the code is tested. But are the tests solid?
Turns out they are not. If we run the PIT tester
reports will be generated on the path
PIT has detected that if we negate the condition, the tests pass all the same.
It would be easy to make this mistake yourself, or for some other developer to alter the code in such a subtle way, and for no one to notice until the day the code fails to work as expected in a production environment.
To solve this, uncomment the glider test (a more complex test for the Game of Life, GoL).
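Why a richer pattern test helps can be sketched as follows. This is not the example project's code: the rule, the grid helpers, and the test names are assumptions made for illustration, and a small blinker stands in for the glider to keep the sketch short. The mutant negates the `alive` condition by hand, mimicking what a mutation tester does when it negates a conditional.

```java
import java.util.Arrays;

class LifeMutantSketch {
    interface Rule { boolean next(boolean alive, int neighbours); }

    // Assumed stand-in for the Game of Life rule in the example project.
    static boolean original(boolean alive, int n) {
        if (alive) { return n == 2 || n == 3; }
        return n == 3;
    }

    // Same rule with the condition negated by hand.
    static boolean mutant(boolean alive, int n) {
        if (!alive) { return n == 2 || n == 3; }
        return n == 3;
    }

    // Counts live cells around (r, c); cells outside the grid count as dead.
    static int countNeighbours(boolean[][] g, int r, int c) {
        int count = 0;
        for (int dr = -1; dr <= 1; dr++) {
            for (int dc = -1; dc <= 1; dc++) {
                int rr = r + dr, cc = c + dc;
                boolean self = dr == 0 && dc == 0;
                boolean inside = rr >= 0 && rr < g.length && cc >= 0 && cc < g[rr].length;
                if (!self && inside && g[rr][cc]) { count++; }
            }
        }
        return count;
    }

    // Computes the next generation using the given rule.
    static boolean[][] step(boolean[][] g, Rule rule) {
        boolean[][] out = new boolean[g.length][g[0].length];
        for (int r = 0; r < g.length; r++) {
            for (int c = 0; c < g[r].length; c++) {
                out[r][c] = rule.next(g[r][c], countNeighbours(g, r, c));
            }
        }
        return out;
    }

    // Builds a grid from rows of text, where '#' marks a live cell.
    static boolean[][] parse(String... rows) {
        boolean[][] g = new boolean[rows.length][rows[0].length()];
        for (int r = 0; r < rows.length; r++) {
            for (int c = 0; c < rows[r].length(); c++) {
                g[r][c] = rows[r].charAt(c) == '#';
            }
        }
        return g;
    }

    // Simple tests: an empty board stays empty and a lone cell dies.
    static boolean simpleTestsPass(Rule rule) {
        boolean[][] empty = parse("...", "...", "...");
        boolean[][] lone = parse("...", ".#.", "...");
        return Arrays.deepEquals(step(empty, rule), empty)
            && Arrays.deepEquals(step(lone, rule), empty);
    }

    // Pattern test: a vertical blinker must turn into a horizontal one.
    static boolean blinkerTestPasses(Rule rule) {
        boolean[][] vertical = parse(".#.", ".#.", ".#.");
        boolean[][] horizontal = parse("...", "###", "...");
        return Arrays.deepEquals(step(vertical, rule), horizontal);
    }

    public static void main(String[] args) {
        System.out.println("simple tests, original: " + simpleTestsPass(LifeMutantSketch::original));
        System.out.println("simple tests, mutant:   " + simpleTestsPass(LifeMutantSketch::mutant));
        System.out.println("blinker test, original: " + blinkerTestPasses(LifeMutantSketch::original));
        System.out.println("blinker test, mutant:   " + blinkerTestPasses(LifeMutantSketch::mutant));
    }
}
```

In this sketch the simple tests execute both sides of the `if` yet pass for the original and the mutant alike, because no cell ever has two or three neighbours. The blinker test does put cells in that situation, so it passes for the original and fails for the mutant.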
If you truly believe you have quality tests, consider running mutation testing for an extra verification.
Alexander Turner December 14, 2014
Really interesting idea. It reminds me of Popper and the notion that one should seek to disprove one's theory, and only when it cannot be disproved is it reliable. Normal testing seeks to prove your program is correct. What one should do is seek to prove it is incorrect, and then when you cannot, you have a good program.