A Clustering Initialization Algorithm Based on KD-Tree Subsamples

Initialization Algorithm of Clustering Using Subsample for KD-Tree

Author: PAN Zhang-Ming

Affiliation: Department of Computer Science and Technology, Guangdong University of Finance, Guangzhou 510521, China


Abstract:

Random subsampling is an important data-reduction step when initializing clustering on large data sets. This paper analyzes the process, properties, and shortcomings of random sampling and proposes a clustering initialization method based on KD-tree subsamples. The method uses a KD-tree to recursively partition the sample space into multiple subspaces and draws a random sample within each subspace; together these samples form the KD-tree subsample. This avoids the biased distribution that a plain random subsample can exhibit, so that good initial cluster centers found in the subsample also represent the cluster structure of the whole data set well. Simulation results show that the initial centers selected by this method are closer to the desired cluster centers and yield higher clustering accuracy.

Key words: clustering initialization; KD-tree; subsample; K-means algorithm
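
The abstract describes the sampling step only in outline. As a rough illustration, the sketch below splits the data recursively at the median of one coordinate per level and then draws a fixed fraction of points from every leaf cell. The function names `kd_leaves` and `kd_tree_subsample`, the `leaf_size` and `fraction` parameters, the median split rule, and the cycling choice of split axis are all assumptions made for this example. The paper's own parameters, and the way initial centers are then chosen from the subsample, are not stated on this page.

```python
import math
import random


def kd_leaves(points, leaf_size, depth=0):
    """Recursively split points at the median of one coordinate per level.

    Returns a list of leaf cells, each a list of points (tuples).
    The split rule is an assumption made for this sketch.
    """
    if leaf_size < 1:
        raise ValueError("leaf_size must be at least 1")
    if len(points) <= leaf_size:
        return [points]
    axis = depth % len(points[0])           # cycle through the coordinates
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2                 # len >= 2 here, so both halves are non-empty
    return (kd_leaves(ordered[:mid], leaf_size, depth + 1)
            + kd_leaves(ordered[mid:], leaf_size, depth + 1))


def kd_tree_subsample(points, leaf_size=32, fraction=0.1, seed=None):
    """Draw a random sample inside every KD-tree leaf and pool the results."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    points = list(points)
    if not points:
        return []
    rng = random.Random(seed)
    subsample = []
    for cell in kd_leaves(points, leaf_size):
        # Take at least one point per cell so that no region is left out.
        k = max(1, math.ceil(fraction * len(cell)))
        subsample.extend(rng.sample(cell, k))
    return subsample


# Usage: subsample 1000 two-dimensional points.
data_rng = random.Random(0)
data = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(1000)]
sample = kd_tree_subsample(data, leaf_size=50, fraction=0.1, seed=1)
print(len(sample), "points drawn from", len(data))
```

Because every leaf cell contributes points, sparse regions cannot be missed entirely, which a single global random draw does not guarantee. The subsample could then be passed to any K-means initialization routine.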

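Cite this article

PAN Zhang-Ming. Initialization Algorithm of Clustering Using Subsample for KD-Tree. Computer Systems & Applications (计算机系统应用), 2011, 20(1): 80-83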


History

Received: 2010-04-27
Last revised: 2010-05-29
