Mutahunter AI – Open Source Mutation Testing Tool
Software development is a continuously evolving field. As development techniques and tools advance, ensuring code quality and robustness matters more than ever. Mutahunter AI is an open-source tool designed to raise the standard of software testing and quality assurance by combining mutation testing with Large Language Models (LLMs). Let’s look at what makes Mutahunter a strong choice for developers and testers who want to level up their QA game.
What is Mutahunter AI?
Mutahunter is an AI-driven mutation testing tool that uses Large Language Models (LLMs) to strengthen the QA process. Typical testing methods can easily miss subtle bugs in the code. Mutahunter injects “context-aware” faults, called mutants, into the codebase to simulate plausible real-world bugs, and then checks whether the existing tests catch them. A mutant that makes at least one test fail is “killed”; a mutant that passes every test “survives” and points to a gap in the test suite. This approach makes testing more thorough and helps raise code quality.
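To make the killed/survived idea concrete, here is a minimal sketch of mutation testing in plain Python. The function names, the mutant and the two small test suites are hypothetical and written for illustration only; they are not taken from Mutahunter, which generates its mutants with an LLM.

```python
# Hypothetical example: every name and value here is illustrative.

def apply_discount(price, percent):
    """Original code: subtract a percentage discount from the price."""
    return price - price * percent / 100


def apply_discount_mutant(price, percent):
    """Mutant: '-' has been flipped to '+', simulating a plausible bug."""
    return price + price * percent / 100


def weak_suite(fn):
    """Only checks a 0% discount, where '+' and '-' give the same result."""
    return fn(200, 0) == 200


def strong_suite(fn):
    """Also checks a non-zero discount, which separates the two versions."""
    return fn(200, 0) == 200 and fn(200, 10) == 180


def mutant_status(suite):
    """A mutant is 'killed' when a suite that passes on the original fails on it."""
    assert suite(apply_discount), "suite must pass on the original code"
    return "survived" if suite(apply_discount_mutant) else "killed"


print(mutant_status(weak_suite))    # survived
print(mutant_status(strong_suite))  # killed
```

The weak suite passes on both versions, so the mutant survives and exposes the missing test; adding one non-zero case is enough to kill it.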
Website: https://mutahunter.ai/en
GitHub Repo: https://github.com/codeintegrity-ai/mutahunter
Key Features of Mutahunter AI
- Automatic Unit Test Generation: Mutahunter can automatically generate unit tests to increase both line and mutation coverage. By identifying and filling gaps in test coverage, it helps ensure that your code is rigorously tested and that fewer mutants slip through.
- Language Agnostic: Mutahunter is compatible with any programming language that provides coverage reports in formats like Cobertura XML, JaCoCo XML, and lcov. This flexibility makes it a versatile tool for various development environments.
- LLM Context-Aware Mutations: Mutahunter uses LLMs to generate contextually relevant mutants. These mutants have higher fault detection capability and are more closely aligned with real-world faults, which leads to more comprehensive testing and fewer bugs.
- Diff-Based Mutations: Focuses on modified files and lines based on the latest commit or pull request changes. This ensures that only the relevant parts of the code are tested, making the process efficient and targeted.
- Surviving Mutants Analysis: Mutahunter automatically analyzes surviving mutants to identify potential weaknesses in the test suite, vulnerabilities, and areas for improvement. This continuous feedback loop helps developers and testers maintain high code quality.
Why Choose Mutahunter AI for Your Testing?
- Enhanced Fault Detection: By injecting context-aware faults, Mutahunter helps surface even subtle bugs so that they can be detected and addressed.
- Efficiency: The tool’s ability to focus on relevant code changes makes the testing process more efficient, saving time and resources.
- Flexibility: Its language-agnostic nature allows it to be integrated into various development environments seamlessly.
- Continuous Improvement: The analysis of surviving mutants provides valuable insights into the test suite’s effectiveness, promoting continuous improvement and ensuring the development of robust test suites.
Steps to Run Mutahunter AI on Replit
The following steps let you run and try out Mutahunter AI yourself, using the project’s public Replit. Whether you are a developer, tester or DevOps engineer, these steps are basic and easy to follow.
Step 1: Fork the Replit
The first step is to open the Mutahunter Replit and fork it, which creates your own copy of the code developed by the Mutahunter team. Click the “Fork” button to create your personal copy. Once that is done, click “Edit in Workspace” so that you can run the code.
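Step 2: Generate an OpenAI API Key / Anthropic Key
To try out Mutahunter, you need an OpenAI API key or an Anthropic API key. OpenAI API keys are created on the OpenAI developer platform rather than in the ChatGPT app, and API usage is billed separately from a ChatGPT subscription; make sure your account can use the gpt-4o model. Anthropic keys are created in the Anthropic Console.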
Step 3: Set Up API Keys and Run Mutahunter
First, open the shell on Replit and run the command below to set up your API key. If you have an OpenAI key (for GPT-4o), use:
$ export OPENAI_API_KEY=your-key-goes-here
Alternatively, if you have an Anthropic key (for Claude 3.5), run:
$ export ANTHROPIC_API_KEY=your-key-goes-here
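Before going further, it can help to confirm that one of the two keys is actually visible to the shell session. The snippet below is a small sanity check, not part of Mutahunter; the helper name `pick_provider` is hypothetical, and only the two environment variable names come from the commands above.

```python
import os


def pick_provider(env=os.environ):
    """Return which provider key is set, preferring OpenAI if both are present."""
    if env.get("OPENAI_API_KEY"):
        return "openai"
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic"
    return None  # neither variable is set (or both are empty)


print(pick_provider() or "no API key found in this shell session")
```

Because `export` only affects the current shell, running this check in the same shell where you exported the key is what matters.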
Next, run the command below to generate the coverage report. It prints a summary to the terminal and writes coverage.xml to the current directory.
$ pytest --cov=. --cov-report=xml --cov-report=term
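The XML report written by this command uses the Cobertura layout, where each file appears as a `class` element whose `line` children carry a `hits` count. The sketch below shows one way to read covered lines from such a report. The sample XML is hand-written and heavily trimmed, and the helper `covered_lines` is hypothetical; this is not how Mutahunter itself parses the report.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the Cobertura layout; a real coverage.xml
# carries more attributes and usually more files.
SAMPLE_XML = """<coverage>
  <packages>
    <package name=".">
      <classes>
        <class name="app.py" filename="app.py">
          <lines>
            <line number="1" hits="1"/>
            <line number="2" hits="1"/>
            <line number="3" hits="0"/>
          </lines>
        </class>
      </classes>
    </package>
  </packages>
</coverage>"""


def covered_lines(xml_text):
    """Map each file name to the sorted line numbers with at least one hit."""
    root = ET.fromstring(xml_text)
    result = {}
    for cls in root.iter("class"):
        # A set avoids double counting if a line is listed more than once.
        hits = {
            int(line.get("number"))
            for line in cls.iter("line")
            if int(line.get("hits", "0")) > 0
        }
        result[cls.get("filename")] = sorted(hits)
    return result


print(covered_lines(SAMPLE_XML))  # {'app.py': [1, 2]}
```

For a real report, `ET.parse("coverage.xml").getroot()` can replace the `ET.fromstring` call. Lines with zero hits are never executed by the tests, so a mutant placed there could not be killed by them.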
Now that the setup is complete, we can run Mutahunter on our codebase. Run the command below in the shell. Pass “gpt-4o” as the model if you are using an OpenAI key.
$ mutahunter run --test-command "pytest" --code-coverage-report-path "coverage.xml" --only-mutate-file-paths "app.py" --model "gpt-4o"
When you run the above command, Mutahunter analyzes your codebase and your test script and injects “mutations” into your code.
Step 4: Analyze Mutation Details
After the script has been executed, go to logs->latest->html->2.html. Download this HTML file and open it. It provides a detailed, line-by-line analysis showing which mutation was applied to each line of code. For example, Line 31 is a return statement that returns the sum of num1 and num2.
Killed mutations are highlighted in green, whereas survived mutations are highlighted in red.
If we click on “M11”, we are taken to the section that explains which mutation was applied to the return statement. In this case, Mutahunter changed the return expression from “num1+num2” to “num1+num2+1”, which produces a wrong result. Our tests for this code were robust enough to detect the mutation and kill it.
The report shows the type of mutation, the exact modification that was made, the status of the mutant and the impact of the mutation on the logic of the code.
However, for one of the mutations of type “incorrect condition”, the logic was changed from “number<0” to “number<=0”. The two conditions differ only when number is 0, and our test cases never check that boundary value, so no test failed and the mutant survived.
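The two outcomes described above can be reproduced in a few lines. In this sketch only the mutated expressions (“num1+num2+1” and “number<=0”) come from the report; the function names, the check functions and the input values are hypothetical stand-ins for the real app.py and its tests.

```python
# Hypothetical stand-ins for the code under test and its mutants.

def add(num1, num2):
    return num1 + num2


def add_mutant(num1, num2):
    return num1 + num2 + 1  # mutated return expression


def is_negative(number):
    return number < 0


def is_negative_mutant(number):
    return number <= 0  # mutated condition


def check_add(fn):
    """Any concrete sum exposes the '+ 1' mutant."""
    return fn(2, 3) == 5


def check_negative_no_boundary(fn):
    """Misses the value 0, the only input where '<' and '<=' differ."""
    return fn(-5) and not fn(7)


def check_negative_with_boundary(fn):
    """Adds the boundary value 0 to the previous check."""
    return check_negative_no_boundary(fn) and not fn(0)


def status(check, original, mutant):
    """'killed' if a check that passes on the original fails on the mutant."""
    assert check(original), "check must pass on the original code"
    return "survived" if check(mutant) else "killed"


print(status(check_add, add, add_mutant))                                      # killed
print(status(check_negative_no_boundary, is_negative, is_negative_mutant))     # survived
print(status(check_negative_with_boundary, is_negative, is_negative_mutant))   # killed
```

Adding a single assertion for the boundary value is enough to turn the surviving mutant into a killed one.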
Step 5: Analyze Mutation Reports
Mutation reports give a high-level overview of all the mutations that were injected, killed and survived along with coverage details for the code.
Go to logs->latest->html->mutation_report.html. Download this HTML file and open it. The mutation report shows an overall summary of “Line Coverage” and “Mutation Coverage”. As you can see in the screenshot below, Mutahunter injected 13 mutants into the code. Our test script detected and killed 8 of them, while 4 mutants went undetected and slipped through.
Line coverage shows how many lines of code were executed by our tests. In our case, the tests in our test script executed 99% of the lines.
Mutation coverage, however, is only 61.54%, because just 8 of the 13 mutants were killed (8 / 13 ≈ 61.54%) and 4 survived without being caught by our test suite. High line coverage therefore does not guarantee that the tests actually detect faults. The ideal target is to improve the test suite until mutation coverage reaches 100%.
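The reported percentage can be checked by hand. The sketch below assumes that mutation coverage is computed as killed mutants divided by total mutants; that assumption matches the figures in this run (8 of 13), but the helper `mutation_coverage` is hypothetical and is not Mutahunter’s own code.

```python
def mutation_coverage(killed, total):
    """Percentage of injected mutants that the test suite killed."""
    if total == 0:
        raise ValueError("no mutants were generated")
    return round(100 * killed / total, 2)


print(mutation_coverage(8, 13))  # 61.54
```

Each additional killed mutant in this run raises the score by roughly 7.7 percentage points, so closing even a couple of test gaps has a visible effect.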
Step 6: Analyze Individual Mutants
Mutahunter injects mutations into the codebase, but what exact code was changed? To check, go to the logs->latest->mutants folder. Inside this folder, you will see multiple “.py” files. Each file is a copy of our existing codebase that contains exactly one mutation.
For example, the screenshot below shows a mutant file. If we search for “#” in this file, we can find the exact mutation. In this particular mutant file, the mutation is of type “Incorrect multiplication”.
Similarly, a new Python file is created for every single mutation and stored in the mutants folder.
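Instead of searching each mutant file by eye, you can diff it against the original source. The sketch below uses Python’s standard difflib on two in-memory strings. The `area` function, the mutated line and the wording of the “#” comment are assumptions made for illustration; real mutant files may annotate the change differently.

```python
import difflib

# Hypothetical original and mutant sources, kept as strings so the
# example runs without any files on disk.
ORIGINAL = """def area(width, height):
    return width * height
"""

MUTANT = """def area(width, height):
    return width + height  # Mutation: incorrect multiplication
"""


def changed_lines(original, mutant):
    """Return only the removed ('-') and added ('+') lines between two sources."""
    diff = list(
        difflib.unified_diff(
            original.splitlines(),
            mutant.splitlines(),
            fromfile="app.py",
            tofile="mutant.py",
            lineterm="",
            n=0,  # no surrounding context lines
        )
    )
    # The first two entries are the '---' / '+++' file headers.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


for line in changed_lines(ORIGINAL, MUTANT):
    print(line)
```

To compare real files, read the original file and a file from the mutants folder into strings and pass them to `changed_lines`; since each mutant file holds a single mutation, the output should stay very short.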
Step 7: Analyze Audit Reports
The audit report is the best way to get a high-level understanding of which mutations were injected and tested, along with improvement recommendations and code refactoring suggestions. Developers are free to pursue either test improvements or codebase improvements. The report uses generative AI (in this case, GPT-4o) to deliver an easy-to-read, comprehensive document that developers can act on.
To access the audit report, go to logs->latest->llm->audit_b9c1.md. Open this report and you will see four sections:
- Vulnerable code areas: This section explains the severity, the impact and the exact line of code. Consider it a detailed bug report for our code.
- Test case gaps: This section covers which test cases are “missing” from our test suite, based on the mutants that survived. Developers can read this section and update the test scripts accordingly.
- Improvement recommendations: This section describes how developers can add new test cases to reduce the number of surviving mutants. It also provides the exact code/script, which reduces development effort.
- Code refactoring suggestions: This section covers suggestions that a developer can follow to improve the actual codebase. It includes steps for refactoring the code, the exact code/script and the benefits of introducing these changes. The refined code suggested by the AI aims for better testability and readability.
Conclusion
Mutahunter AI’s versatility and advanced testing capabilities make it a valuable tool across various domains. Whether you’re working on a large-scale enterprise application, an open-source project, or teaching the next generation of developers, Mutahunter AI can help ensure your code is robust, secure, and of high quality.