SCOPE: Hybrid optimization strategy for higher-order mutation-based fault localization

  • Hengyuan Liu
  • Zheng Li
  • Xiaolan Kang
  • Shumei Wu
  • Doyle Paul
  • Xiang Chen
  • Yong Liu

Research output: Contribution to journal › Article › peer-review

Abstract

Context: Mutation-Based Fault Localization (MBFL) using Higher-Order Mutants (HOMs) has achieved promising performance on multiple-fault programs by simulating more realistic faults. Despite its effectiveness, it can be extremely costly because numerous HOMs must be executed. Existing cost-optimization strategies mainly focus on first-order mutants (FOMs) and do not consider the dependency relationships between HOMs and multiple program entities.

Objective: In this article, we propose a novel strategy called Smart Cost-Optimization through dynamic Prediction and sampling Execution (SCOPE). It aims to reduce costs while providing rich mutation analysis information.

Methods: SCOPE contains two key components: a Smart HOM Sampler and a Mutant-Testing Predictor. The former pre-selects the most promising HOMs to execute for each program entity, based on their association with suspicious program entities. The latter employs machine learning to infer the impact of the remaining HOMs on tests from the test execution data of the selected HOMs, without actually executing them.

Results: (1) SCOPE outperforms state-of-the-art optimization strategies, including SELECTIVE, SAMPLING, and PMT, regardless of the sampling rate or MBFL formula adopted. (2) SCOPE can reduce the number of involved HOMs by up to 90% without any loss in MBFL performance. (3) SCOPE outperforms baseline methods including SBFL, three optimized MBFL techniques (WSOME, SGS, HMBFL), and two deep learning-based fault localization techniques (CNNFL and RNNFL). (4) An ablation experiment confirms that both the Smart HOM Sampler and the Mutant-Testing Predictor contribute positively to the effectiveness of SCOPE, with average improvements of 23.60% and 15.14% in TOP-1 and A-EXAM, respectively. Additionally, a comparison of machine learning models for the Mutant-Testing Predictor shows that Random Forest predicts better than Logistic Regression and Naive Bayes.

Conclusions: Evaluation on 135 real-world multiple-fault programs from the widely used benchmark Defects4J has shown the effectiveness of our proposed hybrid optimization strategy SCOPE for higher-order mutation-based fault localization.
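The per-entity pre-selection described under Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the association between a HOM and suspicious entities is scored, so the mean prior suspiciousness used here, the function name `sample_homs`, and the input layout are all assumptions made for illustration. The Mutant-Testing Predictor is not sketched.

```python
import math
from collections import defaultdict


def sample_homs(homs, suspiciousness, rate):
    """Pick the top `rate` fraction of HOMs for each program entity.

    homs:           dict mapping a HOM id to the tuple of entities it mutates
    suspiciousness: dict mapping an entity to a prior suspiciousness score
                    (assumed to come from a cheap technique such as SBFL)
    rate:           sampling rate in (0, 1]

    Assumption: a HOM's association score is the mean prior suspiciousness
    of the entities it mutates. The paper may define this differently.
    """
    by_entity = defaultdict(list)
    for hom_id, entities in homs.items():
        score = sum(suspiciousness.get(e, 0.0) for e in entities) / len(entities)
        for entity in entities:
            by_entity[entity].append((score, hom_id))

    selected = set()
    for scored in by_entity.values():
        # Keep at least one HOM per entity so every entity retains some signal.
        keep = max(1, math.ceil(rate * len(scored)))
        # Highest score first; ties broken by HOM id for determinism.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        selected.update(hom_id for _, hom_id in scored[:keep])
    return selected


# Hypothetical toy data: four second-order mutants over four statements.
homs = {
    "h1": ("s1", "s2"),
    "h2": ("s1", "s3"),
    "h3": ("s2", "s3"),
    "h4": ("s1", "s4"),
}
prior = {"s1": 0.9, "s2": 0.5, "s3": 0.1, "s4": 0.0}

print(sorted(sample_homs(homs, prior, rate=0.3)))
```

In this sketch a HOM that mutates several entities is a candidate for each of them, which is one simple way to respect the dependency between HOMs and multiple program entities that the abstract highlights.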

Original language: English
Article number: 107873
Journal: Information and Software Technology
Volume: 188
Publication status: Published - Dec 2025

Keywords

  • Higher-order-mutants
  • Machine learning
  • Multiple faults
  • Mutation-based fault localization
