Posts Tagged ‘testing’

Starting with the version 2.17.0 Mockito provides the official (built-in) support for managing a mocking life cycle if JUnit 5 is used.


Getting started

To take advantage of the integration, the Mockito’s mockito-junit-jupiter dependency is required to be added next to the JUnit 5’s junit-platform-engine one (see below for details).

After that, the new Mockito extension MockitoExtension for JUnit 5 has to be enabled. And that’s enough. All the Mockito annotation should automatically start to work.

import org.junit.jupiter.api.Test;  //do not confuse with 'org.junit.Test'!
//other imports
import org.mockito.junit.jupiter.MockitoExtension;

class SpaceShipJUnit5Test {

    private SpaceShip spaceShip;

    private TacticalStation tacticalStation;

    private OperationsStation operationsStation;

    void shouldInjectMocks() {

It’s nice that both a test class and test methods don’t need to be public anymore.

Please note. Having also JUnit 4 on a classpath (e.g. via junit-vintage-engine) for the “legacy” part of tests it is important to do not confuse org.junit.jupiter.api.Test with the old one org.junit.Test. It will not work.

Stubbing and verification

If for some reasons you are not a fan of AssertJ (although, I encourage you to at least give it a try) JUnit 5 provides a native assertion assertThrows (which is very similar to assertThatThrownBy() from AssertJ). It provides a meaningful error message in a case of an assertion failure.

void shouldMockSomething() {
    willThrow(SelfCheckException.class).given(tacticalStation).doSelfCheck();   //void method "given..will" not "when..then" cannot be used
    Executable e = () -> spaceShip.doSelfCheck();
    assertThrows(SelfCheckException.class, e);

I were not myself if I would not mention here that leveraging support for default methods in interfaces available in AssertJ and mockito-java8 a lot of static imports can be made redundant.

class SpaceShipJUnit5Test implements WithAssertions, WithBDDMockito {

Tweaking default behavior

It also worth to point our that using the JUnit 5 extension Mockito by default works in the “strict mode”. It means that – for example – unneeded stubbing will fail the test. While very often it is a code smell, there are some cases where that test construction is desired. To change the default behavior an @MockitoSettings annotation can be used.

@MockitoSettings(strictness = Strictness.WARN)
class SpaceShipJUnitAdvTest implements WithAssertions, WithBDDMockito {


As I already mentioned, to start using it is required to add a Mockito’s mockito-junit-jupiter dependency next to a JUnit 5’s junit-platform-engine one. In a Gradle build it could look like:

dependencies {
    testCompile 'org.junit.vintage:junit-platform-engine:5.1.0'
    testCompile 'org.mockito:mockito-junit-jupiter:2.17.2'  //mockito-core is implicitly added

    testCompile 'org.junit.vintage:junit-vintage-engine:5.1.0'  //for JUnit 4.12 test execution, if needed
    testCompile 'org.assertj:assertj-core:3.9.1'    //if you like it (you should ;) )

Please note. Due to a bug with injecting mocks via constructor into final fields that I have found writing this blog post, it is recommended to use at least version 2.17.2 instead of 2.17.0. That “development” version is not available in the Maven Central and the extra Bintray repository has to be added.

repositories {
    maven { url "" }    //for development versions of Mockito

In addition, it would be a waste to do not use a brand new native support for JUnit 5 test execution in Gradle 4.6+.

test {

IntelliJ IDEA has provided JUnit support since 2016.2 (JUnit 5 Milestone 2 at that time). Eclipse Oxygen also seems to add support for JUnit 5 recently.



It is really nice to have a native support for JUnit 5 in Mockito. Not getting ahead there is ongoing work to make it even better.
The feature has been implemented by Christian Schwarz and polished by Tim van der Lippe with great assist from a few other people.

The source code is available from GitHub.

Btw, are you wondering how JUnit 5 compares with Spock? I will be talking about that at GeeCON 2018.

Geecon logo


Get know how fast (or slow) mutation testing in fact is (based on tangible results from the real FOSS projects) and how can it be optimized (with real implementation in PIT, a tool for Java projects).

One of the questions I’ve got after my first presentation about mutation testing with PIT at in Brno years ago was “how long would it take in my project?”. I wasn’t able to answer precisely. The execution time depends on multiple factors. Size of the project, number of tests, kind of tests (unit/integration), just to mention a few. There are also some optimization techniques which can speed up the analysis. To give you a roughly idea about that, I collected the results from a mutation testing session in two popular FOSS projects performed with PIT.

I assume that you already know something about mutation testing. If not, please read that article before going further.

The fast and the furious 1910


As my guinea pigs I chose two popular open source projects – Assertj – which provides “fluent assertions for Java” and Joda-Time providing “a quality replacement for the Java date and time classes” which was a must for Java projects before Java 8 where similar solution became available out-of-box.

AssertJ can be named a medium size project according to FOSS standards. It has ~85K lines of production code and 150K lines of tests (as measured by Idea statistics plugin – including empty lines and comments). The code compilation takes ~2 seconds. Test execution ~12 seconds (almost 10_000 (mostly) unit tests). Joda-Time is somehow smaller with ~70K LOC in production and ~72K LOC in tests. Test execution takes ~3 seconds (~4200 tests).

I chose AssertJ and Joda-Time as they have a very solid test harness which consists (mostly) of unit tests – the best possible kind of tests for most of the cases. While integration tests can also be used with mutation testing (with some limitations and side effects) unit tests are the perfect candidate.

Brute force

Based on the further PIT analysis I estimated a number of possible to generate mutants at 8000 for AssertJ and 10000 for Joda-time. Of course it depends on the number of mutations available in your tool belt, however, to compare a brute force method with PIT the same number of mutants is accurate.

You may ask why there is greater number of mutants generated for Joda-time which has smaller codebase. It would be a fair question. PIT (or any other mutation testing tool) generates mutants in the lines where something can be “broken”. As Joda-time contains more “real logic” inside it was possible to generate more mutants than in AssertJ code where is a lot of delegation. In the other words, it’s usually possible to generate more mutants generated in a line with an if statement than in a line which just calls some void method in an another class. Therefore, number of mutations does not depend only on the number of lines.

Let’s get back to our calculations. AssertJ, brute force algorithm. For every mutation code has to be compiled – 8000 * 2 seconds and all tests have to be executed 8000 * 12 seconds. It gives 112000 seconds in total. It is ~1867 minutes, over 31 hours! Even for 5 times smaller Joda-time it is still: 10000 * 1 seconds + 10000 * 3 = 40000 seconds -> 667 minutes -> over 11 hours!

Wow, over 11 and 31 hours. For not so large projects. It is one of the reasons why mutation testing – first proposed by Richard Lipton in 1971 (over 40 years ago!) – had been an academic technique only. Let’s take a look how it could be optimized to make it conform to the reality in enterprise projects.


Test selection

Running all tests for every mutation it greatly ineffective. One of the naive approaches is an usage of a name convention – tests usually are named SomeProductionClass*Test. However, it tends to underestimate test suite effectiveness as no all tests are written in that way (especially integration/acceptance ones) and there could be also some typos. Another idea is a static call analysis. It’s more reliable, however, it completely skips reflection calls and in addition it can have a problem with polymorphism.

PIT in turns, leverages some other techniques. First of all, tests are executed in a “normal” mode and a standard code coverage metric is gathered. Thanks to that procedure, mutants with no “standard” coverage can be skipped. There is no chance to have them killed – no test even executes that given line. That technique can be especially beneficial if PIT is used in a project having small number of automated tests (low standard code coverage). What is more, PIT knows exactly which tests execute the particular line. Therefore, no test will ever be run against a mutant that it will not execute. This is essential to limit the number of tests executed per mutant. As a bonus, PIT (based on the first execution) is aware of tests execution time. Tests covering a given mutant can be reordered to have fast ones executed first. If a mutant has been killed, the subsequent (slower) tests can be skipped.

Those optimizations (among other technical solutions – such as creating mutations at the bytecode level instead of the source code one – which is must faster, albeit harder in implementation and also has some limitations) contributed to much faster mutation testing analysis in comparison to theoretical brute force mode.

Let’s compare the theoretical brute force analysis time with the one achieved with vanilla PIT configuration.

Execution time in minutes Brute force PIT (1 thread)
AssertJ 1866.67 14.15
Joda-time 666.67 11.65

Brute force mutation analysis vs PIT

Looking at the chart please notice that Y-axis has a logarithmic scale (!). It’s ~14 minutes for AssertJ (as opposed to ~1867 minutes – over 31 hours) and ~11 minutes for Joda-time (compared to ~667 minutes – over 11 hours). It’s ~130 and ~60 times faster respectively. Looks good, but it can be done even better.

Parallel execution

One of the consequences of breaking the Moore’s law years ago was an increment of number of cores in modern CPUs. Nowadays, 4 or 8 virtual cores is a standard of a developer workstation. Servers can have much more of them. PIT advertised as “a state of the art in mutation testing” in addition to aforementioned optimizations supports parallel execution. Let’s take a look how much it can be gained in our tests case.

Test environment

All tests were performed using a dedicated m4.4xlarge AWS instance (32 virtual cores – 2x Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz with 8 physical cores each – and 64GB RAM). Most of the executions were repeated to verify the achieved result, however, the methodology was far from one required in scientific researches (which was definitely not an aim of my work).

Raw results

Number of threads 1 2 4 6 8 10 12 16 20 24 32
Execution time in minutes 14.15 7.48 4.27 3.27 2.92 2.77 2.85 2.88 3.02 3.12 3.85
Number of timeout errors 17 17 17 17 17 17 17 18 31 31 260

AssertJ - PIT analysis time

Number of threads 1 2 4 8 12 16 20 24 32
Execution time in minutes 11.65 6.27 3.83 2.62 2.35 2.42 2.47 2.55 2.60
Number of timeout errors 49 49 50 49 49 49 49 48 51

Joda-time - PIT analysis time

Commented results

The aforementioned results clearly shows that having quite powerful machine it was possible to the reduce mutation testing analysis execution time over 5 times (in comparison to sequential analysis) thanks to doing multiple things at the same time. In the test subjects the saturation point was placed around 10-12 threads (out of 32 virtual cores). At that state the machine seemed to be (almost) fully loaded (see the screenshot below) by most of the time (excluding the initial and the last stage of the analysis). After certain point the number of timeout errors was increased significantly (in a case of AssertJ) which could confirming that the machine was overloaded. Unfortunately, I don’t have a good explanation why the system was saturated much below 32 virtual cores (16 physical cores) as PIT should not use more threads as defined and there was no extra parallelism in the tests itself. As reasonably suggested by Henry Coles, the author of PIT, next time I will attach a profiler to the analysis to try to find potential places to speed the things up even further.

CPU utilisation - PIT, 12 threads

CPU utilisation – PIT, 12 threads

Further optimizations

Having mutation testing performed regularly in large codebase, the amount of changed code is usually meaningless. To not to have to re-execute mutation on all the code, PIT incorporates the incremental mode. When enabled, PIT determines which mutant could have got a chance to get killed by the recent changes in the production code and tests. As a result, the analysis scope can be limited to those classes. This approach is perfect for running a local analysis by a developer on his/her workstation or in a change/pull request to determine the “quality” of the recently introduced changes. For large codebase savings on time execution can be tremendously. It is only worth to remember that the heuristic selecting which part of the analysis should be executed has some limitation and from time to time it is good to run the full analysis anyway.


With a few smart optimizations (here implemented in PIT) it is possible to reduce (a theoretical) brute force mutation testing execution time by even ~130 times (from 31 hours to less than 15 minutes in a case of AssertJ). Having a hi-end quad core laptop it is possible to cut another chunk thanks to parallel execution. However, only the power of very strong server machine (here a server with 32 virtual cores which is not uncommon among CI server executors used in large projects) allows to unleash the power of PIT to speed up the mutation testing analysis. The full analysis time was possible to reduce by 3 orders of magnitude (!) (from 31 hours to less than 3 minutes in a case of AssertJ). It, together with incremental analysis, can make mutation testing feasible* to use on a daily basis in the real life, large enterprise projects (especially with unit tests posed the majority of the tests used in a project – but it is a topic for an another blog post :) ).

Btw, what is your experience in using mutation testing in non-academic projects? Leave your comment below.

Self promotion. Would you like to improve your and your team testing skills and knowledge of Spock/JUnit/Mockito/AssertJ/PIT quickly and efficiently? I conduct a condensed (unit) testing training which you may find useful.

Mutation testing allows to check the quality (effectiveness) of automatic tests. PIT (aka pitest) is a leading mutation testing tool for Java environment. In my last blog post about PIT in January 2013 I have covered version 0.29. Since then the PIT development team has been busy and the 4 releases introduced various new features (besides fixed bugs). In this post I will cover the most important (in my opinion) changes in the project (up to recently released version 0.33).

PIT mutation testing tool logo

New features

– preliminary support for Java 8 bytecode – PIT can be used with code which contains Java 8 syntax and constructions (including lamdas)
– internal refactoring resulted in much faster “standard” line coverage calculation
– support for parametrized JUnit tests written with Spock (in Groovy) and JUnitParams
– ability to define a coverage threshold (both line and mutation) below which the build will fail
– ability to use PIT with Robolectric
– new Remove Conditionals Mutator (a conditional statement will always be true – not enabled by default as of 0.33)
– new Remove Increments Mutator (an increment operation will be removed – not enabled by default as of 0.33)
– ability to choose JVM to be used for mutation testing
– ability to run PIT only for locally changed files for Maven build with configured SCM plugin
– demanding users can define their own strategies for: test selection, output format and test prioritization – PIT provides extension points which allow to write custom implementations
– partial support for JUnit categories
– support for mutating static initializers in TestNG

In the meantime there were also releases of plugins/tools based on PIT. My plugin for Gradle was enhanced with the dynamic task dependencies resolution (just “gradle pitest” takes care about all the requisites in the Gradle build lifecycle) and support for the additional main and test source sets. Plugin for Eclipse has got (inter alia) a new mutation view and an ability to run PIT against all the projects in a workspace.

Not only releases

Besides new releases PIT has got brand new Bootstrap based webpage, the logo (see above) and the source code was migrated from Mercurial on Google Code to GitHub. The nice thing is that the move resulted in a few contributions withing the first weeks.

Henry Coles the author of PIT also started the new commercial project FaultSeed – “better mutation testing tools for the JVM” which will be based on PIT and has a goal to be 50% faster than PIT and support also Groovy and Scala. Very promising.

PIT (and mutation testing in general) becomes more and more popular and recently there were given a number of talks about it (including my talk at Developer Conference 2014slides). The number of questions on the project’s mailing list also significantly increased. And you, have you tried PIT in your project yet? 2014 logo