Received: April 24, 2020; Revised: May 21, 2020
Abstract: The clustering characteristics of high-dimensional data are usually difficult to observe directly. When the data are modeled as a complex network, the topological structure among nodes reflects the relationships between samples, and detecting communities among those nodes yields a more intuitive clustering of the data. This paper proposes a low-randomness label propagation clustering algorithm based on network community detection. First, the data set is constructed as a sparse but connected network using the radius and nearest-neighbor methods. Next, node labels are preprocessed according to node similarity, so that similar nodes share the same label. The influence value of each node is then used to improve the label propagation process and reduce the randomness of label selection. Finally, communities are optimized and merged based on cohesion to improve community quality. Experimental results on real and artificial data sets show that the algorithm adapts well to various types of data.
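The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It is a generic, simplified stand-in built on loudly stated assumptions: the hypothetical function names `build_graph` and `propagate_labels`, Euclidean distance, node degree as a proxy for the "influence value", and a fixed tie-breaking rule in place of random label choice. The similarity-based label preprocessing and the cohesion-based community merging are omitted, and unlike the network described above, this graph is not guaranteed to be connected.

```python
import math
from collections import defaultdict


def build_graph(points, radius):
    """Link points within `radius`, and link every point to its nearest neighbor.

    Assumption: Euclidean distance. The result is sparse for a small radius,
    but it may still consist of several disconnected components.
    """
    n = len(points)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        nearest, best = None, math.inf
        for j in range(n):
            if i == j:
                continue
            d = math.dist(points[i], points[j])
            if d <= radius:
                adj[i].add(j)
                adj[j].add(i)
            if d < best:
                nearest, best = j, d
        if nearest is not None:
            adj[i].add(nearest)
            adj[nearest].add(i)
    return adj


def propagate_labels(adj, max_iter=100):
    """Label propagation with degree-weighted votes and deterministic ties.

    Assumption: a neighbor's degree stands in for its influence value.
    """
    labels = {v: v for v in adj}  # every node starts in its own community
    # Fixed visiting order (high degree first, then node id) removes the
    # random node ordering used by classic label propagation.
    order = sorted(adj, key=lambda v: (-len(adj[v]), v))
    for _ in range(max_iter):
        changed = False
        for v in order:
            if not adj[v]:
                continue
            score = defaultdict(float)
            for u in adj[v]:
                score[labels[u]] += len(adj[u])
            # Highest score wins; ties go to the smallest label, not a coin flip.
            best = min(score, key=lambda lab: (-score[lab], lab))
            if best != labels[v]:
                labels[v] = best
                changed = True
        if not changed:
            break
    return labels


# Two well-separated groups of four points each (illustrative data only).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
labels = propagate_labels(build_graph(points, radius=1.5))
print(labels)
```

Because both the visiting order and the tie-breaking are fixed, repeated runs on the same input give the same labels, which is the sense in which this sketch is "low randomness".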
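Funding: Natural Science Foundation of Fujian Province (2019J01835); Open Project Fund of the Fujian Provincial University Key Laboratory of Cognitive Computing and Intelligent Information Processing (KLCCIIP2018107); Open Project Fund of the Fujian Provincial University Key Laboratory of Smart Agriculture and Forestry (2019LSAF03); Education and Scientific Research Project for Young and Middle-aged Teachers of Fujian Province (JAT170608); Special Project of the Central Government Guiding Local Science and Technology Development (2018L3013); Wuyi University Research Fund (XL1201)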
Citation:
WU Qing-Shou, GUO Lei, YU Wen-Sen. Label Propagation Clustering Algorithm Based on Network Community Detection. COMPUTER SYSTEMS APPLICATIONS, 2020, 29(12): 135-143