
Research Article
Robustness-Enhanced Assertion Generation Method Based on Code Mutation and Attack Defense
@INPROCEEDINGS{10.1007/978-3-031-54528-3_16,
    author={Min Li and Shizhan Chen and Guodong Fan and Lu Zhang and Hongyue Wu and Xiao Xue and Zhiyong Feng},
    title={Robustness-Enhanced Assertion Generation Method Based on Code Mutation and Attack Defense},
    proceedings={Collaborative Computing: Networking, Applications and Worksharing. 19th EAI International Conference, CollaborateCom 2023, Corfu Island, Greece, October 4-6, 2023, Proceedings, Part II},
    proceedings_a={COLLABORATECOM PART 2},
    year={2024},
    month={2},
    keywords={Unit Tests, Model Robustness, Code Mutation, Attack Defense},
    doi={10.1007/978-3-031-54528-3_16}
}
Min Li
Shizhan Chen
Guodong Fan
Lu Zhang
Hongyue Wu
Xiao Xue
Zhiyong Feng
Year: 2024
COLLABORATECOM PART 2
Springer
DOI: 10.1007/978-3-031-54528-3_16
Abstract
Writing high-quality unit tests plays a crucial role in discovering and diagnosing errors early and preventing them from propagating through the development cycle. However, the test cases produced by existing automated tools have low readability, which hinders developers from using them directly. In addition, current approaches are sensitive to individual words in the input code and often produce completely different results after minor changes to it. To tackle these problems, we propose AssertGen, a Java assertion generation model that maintains consistent output under minor variations in code snippets. Inspired by software mutation testing, we propose 11 heuristic code mutation strategies that make minor changes to code text or structural information, aiming to generate variant code that is human-readable but misleading to the model. We then use the variant code to attack the model and test its robustness. We observe that the mutation based on variable names (VM), the mutation based on method names (FM), and the FalseControlFlow mutation, which adds additional control flow, have the greatest impact on the quality of the assertions the model generates. To enhance the robustness of AssertGen, we use multiple mutations to expand the original dataset, allowing the model to learn during training how to counter the instability caused by mutations. Experimental results show that our assertion generation model achieves a BLEU score of 60.08 and a perfect prediction rate of 47.91%, significantly surpassing previous work.
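The abstract does not spell out the 11 mutation strategies, so the following is only a minimal sketch of what two of the named ones (variable-name mutation and FalseControlFlow) and the dataset expansion step might look like. It treats Java source as plain text; the function names `rename_variable`, `add_false_control_flow`, and `augment`, the dead-branch snippet, and the sample focal method are all hypothetical and are not taken from the paper.

```python
import re


def rename_variable(java_src: str, old: str, new: str) -> str:
    """Variable-name mutation (assumed form): whole-word textual rename."""
    return re.sub(rf"\b{re.escape(old)}\b", new, java_src)


def add_false_control_flow(java_src: str) -> str:
    """FalseControlFlow-style mutation (assumed form): insert a branch that
    never executes at the start of the first block, so behavior is unchanged."""
    head, brace, body = java_src.partition("{")
    if not brace:
        return java_src  # no block found; leave the snippet untouched
    return head + "{ if (false) { int unused = 0; }" + body


def augment(pairs, mutations):
    """Expand a dataset of (code, assertion) pairs with mutated variants.
    Each variant keeps the original assertion as its training target."""
    expanded = list(pairs)
    for code, assertion in pairs:
        for mutate in mutations:
            variant = mutate(code)
            if variant != code:  # skip mutations that did not apply
                expanded.append((variant, assertion))
    return expanded


# Hypothetical focal method and its expected assertion.
focal = "public int add(int a, int b) { int sum = a + b; return sum; }"
target = "assertEquals(3, calc.add(1, 2))"

dataset = augment(
    [(focal, target)],
    [lambda src: rename_variable(src, "sum", "v0"), add_false_control_flow],
)
for code, assertion in dataset:
    print(code, "->", assertion)
```

A robust model would be expected to produce the same assertion for all three inputs. A real implementation would more likely operate on a parsed syntax tree than on regular expressions, since a textual rename can also touch string literals and comments.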