Mutation Testing with Stryker

This is a first look at Stryker Mutator. As for what it does, you might have already done this yourself, albeit manually, when reviewing someone’s PR.

You’re considering how effective a given test is, so you pull the code and run the tests, and sure enough, you see the test pass and 100% code coverage.

You’re still not convinced, so you tweak the code in some minor way, expecting the test to fail. Surprise! It still passes and you still have 100% code coverage! So this is a bad set of tests!

So what is Stryker Mutation testing?

Consider the following example code (taken from the Stryker docs).

function isUserOldEnough(user) {
    return user.age >= 18;
}

When Stryker is run against this code base, it will automatically mutate the return statement, recompile your application, and re-run the tests. For every mutation that Stryker tries, at least one test MUST fail, which "kills" the mutant. If all the tests still pass, the mutant has survived, and that points to a gap in the tests.

Here are the mutations Stryker will make to this code:

/* 1 */ return user.age > 18;
/* 2 */ return user.age < 18;
/* 3 */ return false;
/* 4 */ return true;
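
To make the kill-or-survive idea concrete, here is a minimal hand-rolled sketch. It is not how Stryker works internally, and the names mutants, weakSuite, strongSuite, passes and survivors are invented for illustration. It runs a small test suite against each of the four mutants above and lists the ones that no test caught.

```javascript
// Original implementation from the Stryker docs example.
function isUserOldEnough(user) {
  return user.age >= 18;
}

// The four mutants listed above, written out by hand (hypothetical harness).
const mutants = {
  "age > 18": (user) => user.age > 18,
  "age < 18": (user) => user.age < 18,
  "return false": () => false,
  "return true": () => true,
};

// A test suite is a list of [input, expected] pairs.
const weakSuite = [
  [{ age: 30 }, true],
  [{ age: 5 }, false],
];
// The strong suite adds the boundary value.
const strongSuite = [...weakSuite, [{ age: 18 }, true]];

// A suite passes when every case returns the expected value.
function passes(fn, suite) {
  return suite.every(([input, expected]) => fn(input) === expected);
}

// A mutant survives when the suite still passes against it.
function survivors(suite) {
  return Object.keys(mutants).filter((name) => passes(mutants[name], suite));
}

console.log(survivors(weakSuite));   // [ 'age > 18' ]
console.log(survivors(strongSuite)); // []
```

The weak suite never checks the boundary value, so it cannot tell >= from >. Adding a single case for age 18 kills that last mutant.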

Stryker will give you a mutation score: the percentage of mutants that your tests managed to detect. It looks to be a more meaningful measure of test quality than code coverage alone.
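
As a rough illustration of the metric, here is a sketch of how such a score could be computed. It assumes the definition I understand the Stryker docs to use: detected mutants (killed plus timed out) divided by valid mutants (detected plus survived plus those with no coverage). The function name, the shape of the counts object and the numbers are all made up for the example.

```javascript
// Hypothetical helper: computes a mutation score from result counts.
// Assumes score = detected / valid * 100, where
//   detected = killed + timeout
//   valid    = detected + survived + noCoverage
function mutationScore({ killed = 0, timeout = 0, survived = 0, noCoverage = 0 }) {
  const detected = killed + timeout;
  const valid = detected + survived + noCoverage;
  // With no valid mutants there is nothing to score.
  if (valid === 0) return NaN;
  return (detected / valid) * 100;
}

// Illustrative counts only: 3 mutants killed, 1 survived.
console.log(mutationScore({ killed: 3, survived: 1 })); // 75
```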

What kind of mutations can Stryker make?

There’s a bunch of mutations that Stryker can make!

It can delete the body of a function, mess with your arithmetic or logical comparisons, change string literals, empty arrays, and so on.
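
Here is a small sketch of what a few of those mutation types look like. Each pair is an original function next to a mutant written by hand for illustration, so treat the exact replacements as my approximation rather than Stryker’s precise output. All function names are hypothetical.

```javascript
// Arithmetic operator: + becomes -
const total = (price, shipping) => price + shipping;
const totalMutant = (price, shipping) => price - shipping;

// Logical operator: && becomes ||
const canEdit = (user) => user.isAdmin && user.isActive;
const canEditMutant = (user) => user.isAdmin || user.isActive;

// String literal: a non-empty string becomes ""
const greeting = () => "hello";
const greetingMutant = () => "";

// Array declaration: the contents are emptied
const defaults = () => ["a", "b"];
const defaultsMutant = () => [];

// Block statement: the function body is removed
function increment(state) { state.count += 1; }
function incrementMutant(state) {}

// A test only kills a mutant if its input makes the two versions disagree.
console.log(total(10, 0) === totalMutant(10, 0)); // true: a shipping cost of 0 cannot tell them apart
console.log(total(10, 5) === totalMutant(10, 5)); // false: this input would kill the mutant
```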

It’ll work with both JavaScript and .NET! And as always, it’s worth reading the docs yourself: https://stryker-mutator.io/docs/stryker-net/Getting-started.

A bit of code

Given this terrible example code (don’t judge me):

public class ScoreCategoriser
{
    public string Categorise(int ratingScore)
    {
        if(ratingScore < 0) return Rating.OutOfBounds;
        if(ratingScore < 3) return Rating.ReallyBad;
        if(ratingScore < 5) return Rating.FairlyBad;
        if(ratingScore < 7) return Rating.FairlyGood;
        if(ratingScore <= 9) return Rating.ReallyGood;
        if(ratingScore == 10) return Rating.Perfect;

        return Rating.OutOfBounds;
    }
}

And these not very well written xUnit tests (you can probably see where I’m going with this…):

public class ScoreCategoriserTests
{
    [Theory]
    [InlineData(-2, Rating.OutOfBounds)]
    [InlineData(2, Rating.ReallyBad)]
    [InlineData(4, Rating.FairlyBad)]
    [InlineData(6, Rating.FairlyGood)]
    [InlineData(8, Rating.ReallyGood)]
    [InlineData(20, Rating.OutOfBounds)]
    public void When_Categorise_ThenMapRatingToResult(int rating, string expectedCategorisation)
    {
        var scoreCategoriser = new ScoreCategoriser();

        string result = scoreCategoriser.Categorise(rating);

        result.Should().Be(expectedCategorisation);
    }
}

You can see we’re not testing our edge cases here. Any traditional code coverage tool we might use will report that we have 100% code coverage here because, to be fair, our tests do indeed execute each line of code! Here’s an example coverage report, captured with coverlet and rendered with reportgenerator:

[Image: coverlet code coverage 100 percent]

Now if we run Stryker with dotnet stryker, we can see we’re only getting a mutation score of 68.75%, exposing our bad tests for what they are!


If we click on the red button next to each if, we can see which mutation Stryker tried that failed to break a test. For the first check, it changed < 0 to <= 0.

We can prove this by manually changing that check to <= 0 and re-running the tests with dotnet test. We’ll still get a 100% pass rate.

Now if we add a test case for 0:

[InlineData(0, Rating.ReallyBad)]

…and re-run Stryker, we can now see it’s happy that we’ve covered the edge case for our < 0 check.
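
The manual check can also be reproduced in a few lines. This is a rough JavaScript port of Categorise, not code from the demo repo: the Rating values are assumed to be plain strings, and makeCategorise, passes and the case lists are names I’ve made up so that the first comparison can be swapped between < 0 and the mutated <= 0.

```javascript
// Assumed string constants standing in for the C# Rating class.
const Rating = {
  OutOfBounds: "OutOfBounds",
  ReallyBad: "ReallyBad",
  FairlyBad: "FairlyBad",
  FairlyGood: "FairlyGood",
  ReallyGood: "ReallyGood",
  Perfect: "Perfect",
};

// Port of Categorise; isBelowRange is injected so the first check can be mutated.
function makeCategorise(isBelowRange) {
  return function categorise(ratingScore) {
    if (isBelowRange(ratingScore)) return Rating.OutOfBounds;
    if (ratingScore < 3) return Rating.ReallyBad;
    if (ratingScore < 5) return Rating.FairlyBad;
    if (ratingScore < 7) return Rating.FairlyGood;
    if (ratingScore <= 9) return Rating.ReallyGood;
    if (ratingScore === 10) return Rating.Perfect;
    return Rating.OutOfBounds;
  };
}

const original = makeCategorise((score) => score < 0);
const mutant = makeCategorise((score) => score <= 0); // the mutation that survived

// The same inputs as the xUnit theory above.
const originalCases = [
  [-2, Rating.OutOfBounds],
  [2, Rating.ReallyBad],
  [4, Rating.FairlyBad],
  [6, Rating.FairlyGood],
  [8, Rating.ReallyGood],
  [20, Rating.OutOfBounds],
];
// The same inputs plus the new boundary case for 0.
const withBoundaryCase = [...originalCases, [0, Rating.ReallyBad]];

const passes = (fn, cases) =>
  cases.every(([input, expected]) => fn(input) === expected);

console.log(passes(mutant, originalCases));      // true: the mutant survives
console.log(passes(mutant, withBoundaryCase));   // false: the mutant is killed
console.log(passes(original, withBoundaryCase)); // true: the real code still passes
```

The other comparisons (3, 5, 7, 9 and 10) most likely need the same treatment before the remaining mutants go away.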

So…

We can see the value of Stryker here, as it automates this style of testing that would otherwise be fairly labour intensive to do manually.

It tells you everything a traditional code coverage tool would, and then some. The only issue I’ve had with it is that it can be fairly slow to run in larger projects. There is some advice on how to resolve some performance issues in the FAQ.

This can also be fairly easily included in your CI pipeline. According to the docs there are some Azure DevOps extensions on the marketplace.

Run the demo yourself (dotnet)

Pull this repo and open it in VS Code.

git clone https://github.com/michaelpmcmillan/stryker-mutator-hello-world.git
cd ./stryker-mutator-hello-world/StrykerDemo.Tests
dotnet tool restore

There are some more detailed steps in the repo’s readme on how this repo was created and which NuGet packages were used, just in case you fancy starting from scratch yourself.

Generate the traditional code coverage report:

dotnet test --collect:"XPlat Code Coverage"
dotnet reportgenerator -reports:./TestResults/**/*.xml -targetdir:./TestResults/CoverageReport/ -reporttypes:Html

The report will be generated at /dotnet/StrykerDemo.Tests/TestResults/CoverageReport/index.html. You can just open it in your browser.

Generate the Stryker report with:

dotnet stryker

And this mutation report can be found under /dotnet/StrykerDemo.Tests/StrykerOutput/*/reports/mutation-report.html
