Abstract:Performance bugs are defects in codes that slow down program execution. Existing detection tools can only find certain types of performance bugs and require complex program analysis processes. Therefore, they lack generality and need high costs in space and time. Meanwhile, many classical clone detection techniques have been used for general similar code detection, but they can only detect highly similar codes or rely on training datasets, which makes them inapplicable for detecting performance bugs in real-world datasets. To this end, this study proposes a method of using clone detection techniques to find multiple types of performance bugs by constructing code templates with labeled tokens. By labeling tokens with different weights according to their types and frequencies, this method can distinguish tokens’ importance and thus extract key information from codes. Experimental results on real-world projects show that this method can find more types of performance bugs and consume less time than existing tools. Another experiment also proves that this method significantly improves the detection capability of token-based clone detection techniques and is more suitable for finding performance bugs than existing clone detection techniques.