Abstract: The traditional KNN algorithm suffers from shortcomings such as low classification efficiency. This study proposes an efficient weighted KNN algorithm based on multiple representative points. The algorithm uses the upper and lower approximation regions of the variable precision rough set, together with a clustering algorithm, to generate a set of representative points and construct a classification model. Structural risk minimization theory is then applied to optimize the model and to analyze the factors that affect it. During classification, the relative position of a test sample is determined from its similarity to each representative point. If the test sample falls in a lower approximation region, its category is determined directly. Otherwise, the samples within the coverage of each representative point are weighted according to the position of the test sample relative to that representative point, and the weighted samples determine the category of the test sample. Experiments on text classification data show that the algorithm improves the performance of the classification model.
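
The two-stage decision described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation, and everything specific in it is an assumption made for illustration: Euclidean distance stands in for the similarity measure, each representative point is a hypothetical `Representative` record with a fixed `lower_radius`, and the weighting rule (inverse distance to the representative point multiplied by inverse distance to the sample) is one plausible choice among many. Generating the representative points with rough sets and clustering, and optimizing the model by structural risk minimization, are omitted.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Representative:
    """Hypothetical record for one representative point (not from the paper)."""
    center: tuple        # position of the representative point
    label: str           # category assigned to its lower approximation region
    lower_radius: float  # assumed radius of the lower approximation region
    members: list        # training samples it covers, as (vector, label) pairs


def classify(x, reps, k=3, eps=1e-9):
    """Classify x directly if it lies in a lower region, else by weighted vote."""
    dists = [(math.dist(x, r.center), r) for r in reps]

    # Stage 1: a sample inside a lower approximation region takes that label.
    inside = [(d, r) for d, r in dists if d <= r.lower_radius]
    if inside:
        return min(inside, key=lambda pair: pair[0])[1].label

    # Stage 2: collect the covered samples, each carrying a weight derived
    # from how close x is to the representative point that covers it.
    candidates = []
    for d_rep, r in dists:
        rep_weight = 1.0 / (d_rep + eps)
        for vec, label in r.members:
            candidates.append((math.dist(x, vec), rep_weight, label))

    # Vote among the k nearest covered samples (assumed weighting rule).
    candidates.sort(key=lambda c: c[0])
    votes = defaultdict(float)
    for d_sample, rep_weight, label in candidates[:k]:
        votes[label] += rep_weight / (d_sample + eps)
    return max(votes, key=votes.get)


# Toy data, invented purely for illustration.
reps = [
    Representative((0.0, 0.0), "a", 1.0,
                   [((0.0, 0.0), "a"), ((0.5, 0.0), "a"), ((1.5, 0.0), "a")]),
    Representative((4.0, 0.0), "b", 1.0,
                   [((4.0, 0.0), "b"), ((3.5, 0.0), "b"), ((2.6, 0.0), "b")]),
]

print(classify((0.2, 0.1), reps))  # inside the lower region of the first point
print(classify((2.4, 0.0), reps))  # outside both regions, decided by the vote
```

In this sketch the efficiency gain comes from the first stage: a test sample that lands in a lower approximation region needs only one distance computation per representative point, and the per-sample vote runs only for samples in the remaining areas.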