Received: February 19, 2016; Revised: April 11, 2016
Abstract: In the traditional K-means algorithm, the clustering result depends heavily on the randomly selected initial cluster centers and on the manually specified value of k. To improve clustering accuracy, this paper proposes selecting the initial cluster centers by using the minimum distance and the average aggregation degree, and taking the number of clusters produced by the hierarchical clustering algorithm CURE as the value of k. Finally, the improved K-means algorithm is applied to microblog topic discovery, and analysis of the experimental results shows that it improves the accuracy of the clustering results.
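The abstract does not give the exact definitions of the minimum distance or the average aggregation degree, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a point's aggregation degree can be approximated by the number of neighbours within the mean pairwise distance, and that candidate centers must lie farther apart than that same distance. The names `select_initial_centers` and `kmeans` are hypothetical, and k is passed in directly rather than being taken from a CURE run.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_initial_centers(points, k):
    """Pick k dense, mutually distant points as initial centers.

    Assumption (not from the paper): density is the neighbour count within
    the mean pairwise distance, and that distance is also the minimum
    separation required between two chosen centers.
    """
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of points")
    dist = [[euclidean(p, q) for q in points] for p in points]
    pairs = n * (n - 1) / 2
    avg_dist = (
        sum(dist[i][j] for i in range(n) for j in range(i + 1, n)) / pairs
        if n > 1 else 0.0
    )
    density = [
        sum(1 for j in range(n) if j != i and dist[i][j] <= avg_dist)
        for i in range(n)
    ]
    # Visit points from densest to sparsest; keep those far from all chosen.
    order = sorted(range(n), key=lambda i: -density[i])
    chosen = []
    for i in order:
        if len(chosen) == k:
            break
        if all(dist[i][c] > avg_dist for c in chosen):
            chosen.append(i)
    # Fallback: if too few points pass the separation test, add the point
    # whose nearest chosen center is farthest away.
    while len(chosen) < k:
        rest = [i for i in range(n) if i not in chosen]
        chosen.append(max(rest, key=lambda i: min(dist[i][c] for c in chosen)))
    return [points[i] for i in chosen]


def kmeans(points, centers, max_iter=100):
    """Standard Lloyd iterations starting from the given centers."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda c: euclidean(p, centers[c]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(v) / len(members) for v in zip(*members))
                )
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers
```

Calling `kmeans(points, select_initial_centers(points, k))` runs the ordinary K-means iterations from the seeded centers. For microblog text, the points would first have to be turned into numeric vectors (for example, term-weight vectors), which this sketch leaves out.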
Keywords: K-means; microblog; topic; clustering
Funding: National Natural Science Foundation of China (61502298)
Cite this article:
ZHANG Yun-Wei, SONG An-Jun. Application of Improved Algorithm Based on K-Means in Microblog Topic Discovery. COMPUTER SYSTEMS APPLICATIONS, 2016, 25(10): 308-311