Abstract: Automatic program repair techniques fix software defects automatically and use test suites to evaluate candidate patches. However, because test suites are often inadequate, a patch that passes them may fail to repair the defect correctly, or may even introduce new defects through ripple effects; as a result, automatic program repair generates a large number of overfitting patches. To address this problem, an overfitting patch identification method based on data flow analysis is proposed. The method first decomposes the modifications a patch makes to the program into operations on variables, then uses data flow analysis to identify the patch's influence domain, and selects targeted coverage criteria to determine the coverage elements to be tested within that domain. Finally, test paths are selected and test cases are generated to test the repaired program thoroughly, so that side effects introduced by the repair do not go undetected. Evaluations on two datasets show that the proposed method can improve the correctness of automatic program repair.
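The idea of a patch influence domain can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes straight-line code without branches or loops, models each statement only by the variables it defines and uses, and all names (`Stmt`, `def_use_pairs`, `influence_domain`, `target_pairs`, `PROGRAM`) are hypothetical. Under these assumptions, the influence domain is the set of lines reachable from the patched lines along def-use chains, and the target coverage elements are the def-use pairs whose definition lies in that domain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stmt:
    """A statement reduced to the variables it defines and uses (assumed model)."""
    line: int
    defs: frozenset
    uses: frozenset


def def_use_pairs(stmts):
    """Return (def_line, use_line, var) triples for straight-line code."""
    last_def = {}  # variable -> line of its most recent definition
    pairs = []
    for s in stmts:
        # Uses are resolved before this statement's own definitions take effect.
        for v in sorted(s.uses):
            if v in last_def:
                pairs.append((last_def[v], s.line, v))
        for v in s.defs:
            last_def[v] = s.line
    return pairs


def influence_domain(stmts, patched_lines):
    """Lines reachable from the patched lines along def-use chains."""
    pairs = def_use_pairs(stmts)
    domain = set(patched_lines)
    worklist = list(patched_lines)
    while worklist:
        cur = worklist.pop()
        for d, u, _ in pairs:
            if d == cur and u not in domain:
                domain.add(u)
                worklist.append(u)
    return domain


def target_pairs(stmts, patched_lines):
    """Def-use pairs whose definition lies inside the influence domain."""
    domain = influence_domain(stmts, patched_lines)
    return [p for p in def_use_pairs(stmts) if p[0] in domain]


# Hypothetical five-line program; the patch is assumed to modify line 2.
PROGRAM = [
    Stmt(1, frozenset({"x"}), frozenset()),          # x = read()
    Stmt(2, frozenset({"y"}), frozenset({"x"})),     # y = x + 1   (patched)
    Stmt(3, frozenset({"z"}), frozenset({"x"})),     # z = x * 2
    Stmt(4, frozenset({"w"}), frozenset({"y", "z"})),  # w = y + z
    Stmt(5, frozenset(), frozenset({"z"})),          # print(z)
]
```

In this example, `influence_domain(PROGRAM, {2})` contains only lines 2 and 4, so test generation would focus on the pair in which `y` is defined at line 2 and used at line 4, rather than on the whole program. Handling branches and loops would require reaching-definitions analysis over a control flow graph, which this sketch omits.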