Abstract: To improve the performance of Support Vector Machine (SVM) classifiers on imbalanced data, an imbalanced-data classification algorithm based on data splitting and classifier ensembles is introduced. The majority-class samples are divided into several subsets by clustering, and each subset is combined with the minority-class samples to form a training subset. A classifier is then trained on each training subset, yielding multiple classifiers. Finally, these classifiers are integrated into a single ensemble classifier. Experimental results show that the algorithm is effective on imbalanced datasets, especially for the minority-class samples.
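
The pipeline described in the abstract (cluster the majority class, pair each cluster with the minority class, train one classifier per training subset, combine the classifiers) can be illustrated with a minimal, dependency-free sketch. The abstract does not name the clustering method, the SVM variant, or the integration rule, so the choices below are assumptions made only for illustration: plain k-means for the split, a linear SVM trained by Pegasos-style subgradient descent with an unregularized bias as a stand-in for a library SVM, and an unweighted majority vote with ties resolved in favour of the minority class. All function names and parameter values are hypothetical.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_labels(points, k, iters=20, seed=0):
    """Plain k-means (assumed clustering method); returns one label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Update step: move each non-empty cluster's center to its mean.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def build_training_subsets(majority, minority, k, seed=0):
    """Pair every non-empty majority cluster with all minority samples."""
    labels = kmeans_labels(majority, k, seed=seed)
    subsets = []
    for c in range(k):
        cluster = [p for p, lab in zip(majority, labels) if lab == c]
        if cluster:
            X = cluster + minority
            y = [-1] * len(cluster) + [1] * len(minority)  # minority is +1
            subsets.append((X, y))
    return subsets

def train_linear_svm(X, y, lam=0.01, epochs=50, seed=0):
    """Linear SVM via Pegasos-style subgradient descent (illustrative stand-in)."""
    rng = random.Random(seed)
    w, b, t = [0.0] * len(X[0]), 0.0, 0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * (dot(w, X[i]) + b)
            w = [(1.0 - eta * lam) * wj for wj in w]  # shrink (regularizer)
            if margin < 1:  # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]  # bias left unregularized (simplification)
    return w, b

def train_ensemble(majority, minority, k=3):
    """Train one classifier per training subset."""
    return [train_linear_svm(X, y)
            for X, y in build_training_subsets(majority, minority, k)]

def predict(models, x):
    """Unweighted majority vote; ties go to the minority class (+1)."""
    votes = sum(1 if dot(w, x) + b >= 0 else -1 for w, b in models)
    return 1 if votes >= 0 else -1
```

Each training subset in this sketch is less skewed than the full dataset, because a single cluster holds only part of the majority class while every subset keeps all minority samples. A call such as `predict(train_ensemble(majority, minority, k=3), x)` returns `+1` for the minority class and `-1` for the majority class.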