K-Means Clustering Algorithm Based on Hadoop

doi:10.15888/j.cnki.csa.005779

Volume 26, Issue 6, 2017, pp. 182-186

Author: LIU Bao-Long, SU Jin
Affiliation: School of Computer Science and Engineering, Xi'an Technological University, Xi'an 710021, China

Abstract:

The traditional K-means algorithm has many advantages, but its clustering criterion function performs poorly when classifying data sets with uneven cluster density. Building on a weighted standard deviation criterion function, this paper proposes a parallel K-means algorithm designed and optimized for the MapReduce programming model, and adds a convergence check. Compared with the traditional K-means algorithm, the parallel algorithm shows significant improvement in accuracy, speedup ratio, scalability, and convergence of the clustering results. It also reduces the probability of misclassification caused by uneven cluster density, which improves clustering accuracy. The optimization effect becomes more pronounced as the data size and the number of nodes grow.
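
As a rough illustration of how a K-means iteration splits into map and reduce steps with a convergence check, here is a minimal single-process sketch in Python. It is not the paper's implementation: it uses plain Euclidean distance rather than the weighted standard deviation criterion function (whose form is not given in this abstract), and the names `map_phase`, `reduce_phase`, `kmeans`, `tol`, and `max_iter` are assumptions made for the example.

```python
from collections import defaultdict
from math import dist


def map_phase(points, centroids):
    """Map step: emit (index of nearest centroid, point) for every point."""
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        yield nearest, p


def reduce_phase(pairs, centroids):
    """Reduce step: average the points grouped under each centroid index."""
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    # A centroid with no assigned points is left where it was.
    new_centroids = list(centroids)
    for idx, members in groups.items():
        new_centroids[idx] = tuple(sum(c) / len(members) for c in zip(*members))
    return new_centroids


def kmeans(points, centroids, tol=1e-6, max_iter=100):
    """Repeat map and reduce until no centroid moves more than tol."""
    for _ in range(max_iter):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        shift = max(dist(a, b) for a, b in zip(centroids, updated))
        centroids = updated
        if shift < tol:  # convergence check (assumed form)
            break
    return centroids


points = [(1.0, 1.0), (1.0, 2.0), (9.0, 9.0), (10.0, 9.0)]
result = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(result)
```

On a Hadoop cluster the map step would run in parallel over input splits and the reduce step would receive points grouped by centroid index; here both run in one process so that only the data flow is shown.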

Key words: K-means; cluster density; clustering accuracy; MapReduce; Hadoop

Get Citation

LIU Bao-Long, SU Jin. K-Means Clustering Algorithm Based on Hadoop. Computer Systems & Applications, 2017, 26(6): 182-186


History

Received: September 6, 2016
Revised: October 19, 2016
Online: June 8, 2017
