Abstract: Research on identifying the risk of power outage complaints and on analyzing customer sensitivity in power grid companies is still at an early stage. To analyze the sensitivity of customers to power outages effectively, a sensitive-customer classification algorithm based on an improved random forest is proposed. First, the data are preprocessed through data cleaning, feature selection, and related steps. Second, the SMOTE algorithm is used to synthesize additional samples of sensitive customers, which addresses the class imbalance in the data. Third, a representative feature subspace is selected by proportional random sampling, with the Fisher ratio serving as the feature importance measure. Then, the random forest algorithm is used to identify the customers who are sensitive to power outages. Finally, experiments on real power outage data show that the proposed method not only achieves better accuracy and running time but also handles high-dimensional data with redundant features effectively.
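The feature-selection step can be illustrated with a minimal sketch. It assumes the common two-class form of the Fisher ratio, (mu1 - mu0)^2 / (var1 + var0), and reads "proportional random sampling" as drawing features with probability proportional to their Fisher ratio. Both choices are assumptions, since the abstract does not give the exact formulas, and the function names `fisher_ratio` and `sample_feature_subspace` as well as the toy data are hypothetical.

```python
import random
from statistics import fmean, pvariance

EPS = 1e-12  # assumption: small constant to avoid division by zero


def fisher_ratio(values, labels):
    """Two-class Fisher ratio of one feature: (mu1 - mu0)^2 / (var1 + var0)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    spread = pvariance(pos) + pvariance(neg)
    return (fmean(pos) - fmean(neg)) ** 2 / (spread + EPS)


def sample_feature_subspace(scores, k, rng):
    """Draw k distinct feature indices, each pick weighted by its score."""
    remaining = list(range(len(scores)))
    chosen = []
    for _ in range(min(k, len(remaining))):
        weights = [scores[j] + EPS for j in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)  # sample without replacement
    return chosen


# Toy data (hypothetical): rows are customers, columns are features.
# Label 1 marks a sensitive customer, label 0 a non-sensitive one.
X = [
    [1.0, 5.0, 2.0],
    [1.2, 3.0, 2.0],
    [0.8, 4.0, 2.0],
    [3.0, 4.0, 2.0],
    [3.2, 5.0, 2.0],
    [2.8, 3.0, 2.0],
]
y = [0, 0, 0, 1, 1, 1]

columns = list(zip(*X))  # one tuple of values per feature
scores = [fisher_ratio(col, y) for col in columns]
subspace = sample_feature_subspace(scores, k=2, rng=random.Random(0))
```

In this toy data only the first feature separates the two classes, so it receives by far the largest score and is almost always included in the sampled subspace, while the redundant features are picked rarely. Each tree of the forest would then be trained on one such subspace.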