Abstract: The widespread use of internet technology has greatly changed people's daily lives and routine work. However, it has also exposed us to the threat of malware, and malware detection has therefore received increasing attention in recent years. Malware samples are hard to obtain, and collecting them is resource-intensive. As a result, malware samples are scarce, which makes malware detection an imbalanced problem, that is, one in which the training samples are unevenly distributed across classes. To address this problem, a suitable over-sampling method is employed that reasonably increases the number of minority-class samples to correct the imbalance.
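To make the idea of over-sampling concrete, the following is a minimal sketch of plain random over-sampling, which duplicates randomly chosen minority-class samples until all classes are the same size. The abstract does not say which over-sampling method is used, so this is an illustrative assumption rather than the method of this work; the function name `random_oversample` and the toy data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of each smaller class until every
    class has as many samples as the largest class.

    Assumption: plain random over-sampling, used here only as an example.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class

    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        # All original samples that belong to this class.
        pool = [s for s, y in zip(samples, labels) if y == cls]
        # Draw (with replacement) as many extra samples as the class lacks.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_samples.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_samples, out_labels


# Hypothetical toy data: 6 benign feature vectors and 2 malware ones.
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.9], [1.0]]
y = ["benign"] * 6 + ["malware"] * 2

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

After the call, both classes contain the same number of samples. Because the added samples are exact copies, this simple variant may encourage overfitting; methods that synthesize new minority samples are a common alternative.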