Abstract: The seeded-K-means and constrained-K-means algorithms require that the labeled data contain complete category information. To address this limitation, this paper puts forward a semi-supervised K-means clustering algorithm based on incompletely labeled data, focusing on the selection of the initial cluster centers of the unlabeled categories. We define the Best Candidate Set of cluster centers for the unlabeled categories and propose a new method that uses K-means to select the initial cluster centers of those categories from the Best Candidate Set. Finally, we give a complete description of the semi-supervised clustering algorithm based on the new method and verify its validity by experiment. The experimental results show that the proposed algorithm is superior to existing algorithms in both clustering quality and execution speed.