TY - JOUR
T1 - SCOPE: Hybrid optimization strategy for higher-order mutation-based fault localization
AU - Liu, Hengyuan
AU - Li, Zheng
AU - Kang, Xiaolan
AU - Wu, Shumei
AU - Doyle, Paul
AU - Chen, Xiang
AU - Liu, Yong
N1 - Publisher Copyright: © 2025 Elsevier B.V.
PY - 2025/12
Y1 - 2025/12
N2 - Context: Mutation-Based Fault Localization (MBFL) using Higher-Order Mutants (HOMs) has achieved promising performance on multiple-fault programs by simulating more realistic faults. Despite its effectiveness, it can be extremely costly because numerous HOMs must be executed. Moreover, existing cost-optimization strategies mainly focus on first-order mutants (FOMs) and do not consider the dependency relationships between HOMs and multiple program entities. Objective: In this article, we propose a novel strategy called Smart Cost-Optimization through dynamic Prediction and sampling Execution (SCOPE). It aims to reduce costs while providing rich mutation analysis information. Methods: SCOPE contains two key components: a Smart HOM Sampler and a Mutant-Testing Predictor. The former pre-selects the most promising HOMs for each program entity to execute, based on their association with suspicious program entities. The latter employs machine learning to infer the impact of the remaining HOMs on tests from the test execution data of the selected HOMs, without actually executing them. Results: (1) SCOPE outperforms state-of-the-art optimization strategies, including SELECTIVE, SAMPLING, and PMT, regardless of the sampling rate or MBFL formula adopted. (2) SCOPE can reduce the number of involved HOMs by up to 90% without any loss in MBFL performance. (3) SCOPE outperforms baseline methods including SBFL, three optimized MBFL techniques (WSOME, SGS, HMBFL), and two deep learning-based fault localization techniques (CNNFL and RNNFL). (4) An ablation experiment validates that the Smart HOM Sampler and the Mutant-Testing Predictor both contribute positively to the effectiveness of SCOPE, with average improvements of 23.60% and 15.14% in TOP-1 and A-EXAM. Additionally, a comparison of machine learning models for the Mutant-Testing Predictor shows that Random Forest achieves better prediction performance than Logistic Regression and Naive Bayes. Conclusions: Evaluation on 135 real-world multiple-fault programs from the widely used Defects4J benchmark has shown the effectiveness of our proposed hybrid optimization strategy SCOPE for higher-order mutation-based fault localization.
KW - Higher-order-mutants
KW - Machine learning
KW - Multiple faults
KW - Mutation-based fault localization
UR - https://www.scopus.com/pages/publications/105014020051
U2 - 10.1016/j.infsof.2025.107873
DO - 10.1016/j.infsof.2025.107873
M3 - Article
AN - SCOPUS:105014020051
SN - 0950-5849
VL - 188
JO - Information and Software Technology
JF - Information and Software Technology
M1 - 107873
ER -