K-Hub Clustering Algorithm Based on Active Learning
    Abstract:

    K-Hub is an efficient clustering algorithm for high-dimensional data, but it is sensitive to the choice of initial cluster centers, and instances near class borders may not be clustered correctly. To address these problems, an improved method is proposed that incorporates active learning and semi-supervised clustering into the K-Hub clustering algorithm. It uses an active learning strategy to learn pairwise constraints and then uses these constraints to guide the K-Hub clustering process. The experimental results demonstrate that the improved method enhances the performance of the K-Hub clustering algorithm.
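
    The abstract does not give the algorithm's details, so the following is only a minimal sketch of the two ingredients it names, under stated assumptions. `hubness_scores` uses the standard k-occurrence definition of hubness (how often a point appears in other points' k-nearest-neighbor lists). `constrained_assign` is a greedy, COP-KMeans-style assignment that respects must-link and cannot-link pairs; it is swapped in for illustration and is not necessarily the authors' procedure. All function and parameter names are hypothetical, and the active-learning step that selects which pairs to query is not shown.

```python
import numpy as np

def hubness_scores(X, k):
    """Return N_k for each point: its count of appearances in others' k-NN lists."""
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)  # squared Euclidean distances
    np.fill_diagonal(dist, np.inf)               # a point is not its own neighbor
    knn = np.argsort(dist, axis=1)[:, :k]        # indices of the k nearest neighbors
    return np.bincount(knn.ravel(), minlength=n)

def violates(i, cluster, labels, must_link, cannot_link):
    """True if putting point i into `cluster` breaks a pairwise constraint."""
    for a, b in must_link:
        if i in (a, b):
            other = b if i == a else a
            # Only already-assigned partners (label != -1) can be violated.
            if labels[other] != -1 and labels[other] != cluster:
                return True
    for a, b in cannot_link:
        if i in (a, b):
            other = b if i == a else a
            if labels[other] == cluster:
                return True
    return False

def constrained_assign(X, centers, must_link, cannot_link):
    """Assign each point to the nearest center (a data-point index, e.g. a hub)
    that violates no constraint. Points with no feasible cluster keep label -1
    (an assumption made for this sketch)."""
    labels = np.full(X.shape[0], -1)
    for i in range(X.shape[0]):
        d = np.linalg.norm(X[centers] - X[i], axis=1)
        for c in np.argsort(d):                  # try clusters from nearest to farthest
            if not violates(i, c, labels, must_link, cannot_link):
                labels[i] = c
                break
    return labels
```

    In a hub-based setting, `centers` could be chosen among points with high `hubness_scores`, and the constraint lists would come from answers to the actively selected pairwise queries. The greedy assignment depends on the order in which points are visited, which is one reason a real implementation may differ.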

Get Citation

Feng Jianbang, He Zhenfeng. K-Hub Clustering Algorithm Based on Active Learning. Computer Systems & Applications, 2016, 25(3):187-193

History
  • Received: July 05, 2015
  • Revised: September 08, 2015
  • Online: March 17, 2016