Abstract:In order to better evaluate the clustering quality of unsupervised clustering algorithm and solve the problem of invalidation of clustering evaluation results caused by overlapping cluster centers, the commonly used cluster evaluation index is analyzed and a new internal evaluation index is proposed, the product of the minimum square of the distance between the adjacent boundary points and the number of samples in the cluster is taken as the separation degree of the whole sample set, the relation between the degree of separation between clusters and the degree of compactness within clusters is balanced; a new density calculation method is proposed, which takes the object with a larger average distance ratio between the sample set and each sample as a high-density point, and uses the maximum product method to select the relatively dispersed data object with a higher density as the initial cluster center, thus enhancing the representativeness of the initial center of K-medoids algorithm and the stability of the algorithm. On this basis, the cluster quality evaluation model is designed with the newly proposed internal evaluation index. The experimental results on UCI and KDD CUP 99 data sets show that the new model can effectively cluster and reasonably evaluate non-prior knowledge samples, and can give the optimal number or range of clustering.