Clustering Algorithm Based on Self-Optimizing Center and Boundary of Classes

doi:10.15888/j.cnki.csa.006077


Computer Systems & Applications (计算机系统应用), 2017, Vol. 26, No. 11, pp. 118-123.


Funding: National Key Technology R&D Program of China (2012BAB13B00); Key Research Fund Project of China Women's University (KG2014-02002)

Clustering Algorithm Based on Self-Optimizing Center and Boundary of Classes



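Abstract (translated from the Chinese):

As Internet applications have become more widespread and more deeply embedded, many new application scenarios and data types have emerged, and many classic clustering algorithms no longer adapt effectively to them; this has become both a thorny problem and an active research topic in data mining. This paper therefore proposes a novel data clustering algorithm based on self-optimization of class centers and boundaries. The algorithm introduces the "distance-radius" distribution matrix R of the data points and its "cumulative distance-radius" distribution matrix ΣR to characterize the degree of data aggregation and, following a breadth-first principle, self-optimizes to select as class centers the data points whose values are minimal in both R and ΣR. It also introduces the "distance-radius partial derivative" distribution matrix R' to describe the looseness between clusters and, again following a breadth-first principle, self-optimizes to find the abrupt jump points in R', which are taken as the boundaries between clusters. Simulation experiments on the classic Aggregation clustering data set show that the algorithm can effectively cluster data sets of varied shape, size, and density, identifies outliers and noise well, and achieves relatively high robustness and accuracy.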

Abstract:

With the spread and deepening of Internet applications, new data types have emerged in new application fields, and many classic clustering algorithms no longer adapt effectively to them; this has become a thorny issue and a research focus in data mining. This article therefore proposes a novel clustering algorithm based on self-optimizing the centers and boundaries of classes. The algorithm uses the points' distance-radius distribution matrix R and the cumulative radius distribution matrix ΣR to characterize the degree of data aggregation. The data points with the minimum values in both R and ΣR are searched for breadth-first and taken as the class centers. The algorithm also uses the partial-derivative matrix R' of the distance-radius distribution to describe how the looseness changes between different points. Through self-optimization and breadth-first search, the transition point of R', whose partial derivative is the largest among adjacent points, is taken as the class boundary; all points inside it belong to the class. Simulation tests on the classic Aggregation clustering data set show that the algorithm can effectively cluster data sets with different shapes, sizes, and densities, identify isolated points and noise, and achieve relatively high robustness and accuracy.
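The abstract names the matrices R, ΣR, and R' but does not define them, so the following Python sketch is only one possible reading, not the authors' method. It assumes that row i of R holds the sorted distances from point i to every other point, that ΣR is the running sum along each row, and that R' can be approximated by the difference between consecutive entries of a row. The function names `radius_matrices`, `pick_center`, and `boundary_rank` and the parameter `k` are hypothetical, and the breadth-first search mentioned in the abstract is replaced here by a plain scan over all points.

```python
import math
from itertools import accumulate


def radius_matrices(points):
    """Build R, cumulative R, and a discrete R' under this sketch's assumptions."""
    n = len(points)
    R, cum_R, dR = [], [], []
    for i in range(n):
        # Assumed R: distances from point i to all other points, ascending,
        # so entry j is the distance to the (j+1)-th nearest neighbour.
        row = sorted(math.dist(points[i], points[j]) for j in range(n) if j != i)
        R.append(row)
        # Assumed ΣR: running sum of the sorted radii.
        cum_R.append(list(accumulate(row)))
        # Assumed R': jump between consecutive radii.
        dR.append([b - a for a, b in zip(row, row[1:])])
    return R, cum_R, dR


def pick_center(cum_R, k):
    """Index of the point whose k nearest neighbours have the smallest total distance."""
    return min(range(len(cum_R)), key=lambda i: cum_R[i][k - 1])


def boundary_rank(dR, center):
    """Number of neighbours lying inside the largest jump in the centre's radii."""
    row = dR[center]
    return row.index(max(row)) + 1


# Toy data (hypothetical): a tight group of five points and a distant group of three.
pts = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5), (10, 10), (11, 10), (10, 11)]
R, cum_R, dR = radius_matrices(pts)
center = pick_center(cum_R, k=4)
inside = boundary_rank(dR, center)
print(center, inside)
```

On this toy data the selected centre is the middle point of the tight group, and the largest jump in its sorted radii separates its four nearest neighbours from the distant group. A full implementation would repeat the centre and boundary search for each remaining cluster and would need a rule for the choice of `k`, neither of which the abstract specifies.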


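Cite this article

Zhang Wenjun, Wang Jianping, Fan Shiping, Zhang Liuxia. Clustering Algorithm Based on Self-Optimizing Center and Boundary of Classes. Computer Systems & Applications (计算机系统应用), 2017, 26(11): 118-123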


History

Received: 2017-02-23
Last revised: 2017-03-23
Published online: 2017-10-30
