Improved Prototype Selection Algorithm Based on CURE Algorithm
CSTR:
Author:
  • Article
  • | |
  • Metrics
  • |
  • Reference [19]
  • |
  • Related [20]
  • |
  • Cited by
  • | |
  • Comments
    Abstract:

    Since the traditional K-nearest neighbor classifier possesses large time and space complexity for larger-scale data sets, prototype selection is an effective processed method which selects representative prototypes (instances) from the original data set for K-nearest neighbor classifier without reducing the classification accuracy. At present, there exist many prototype selection methods. In this paper, based on the existing CURE algorithm, which is difficult to determine the noise points and has bad dispersed of representative points, the shared neighbor density metric is presented to delete noise points and the maximum and minimum distances are employed to obtain scattered representative points, which generates a novel prototype selection methods PSCURE (improved Prototype Selection algorithm based on CURE algorithm). Some numerical experiments are further conducted to show the performance of the proposed prototype selection algorithm compared with other related prototype selection algorithms. The experimental results show that the proposed algorithm not only can select fewer prototypes but also can achieve higher classifier accuracy for almost all the data sets.

    Reference
    [1] Cover T, Hart P. Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 1967, 13(1):21-27.[doi:10.1109/TIT.1967.1053964
    [2] Triguero I, Derrac J, Garcia S, et al. A taxonomy and experimental study on prototype generation for nearest neighbor classification. IEEE Transactions on Systems, Man, and Cybernetics, 2012, 42(1):86-100.[doi:10.1109/TSMCC.2010.2103939
    [3] García S, Derrac J, Cano J, et al. Prototype selection for nearest neighbor classification:Taxonomy and empirical study. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(3):417-435.[doi:10.1109/TPAMI.2011.142
    [4] Hart P. The condensed nearest neighbor rule (Corresp.). IEEE Transactions on Information Theory, 1968, 14(3):515-516.[doi:10.1109/TIT.1968.1054155
    [5] Tome K I. An experiment with the edited-nearest neighbor rule. IEEE Transactions on Systems, Man, and Cybernetics, 1976, SMC-6(6):448-452.[doi:10.1109/TSMC.1976.4309523
    [6] Fayed HA, Atiya AF. A novel template reduction approach for the K-nearest neighbor method. IEEE Transactions on Neural Networks, 2009, 20(5):890-896.[doi:10.1109/TNN.2009.2018547
    [7] Ougiaroglou S, Evangelidis G. Fast and accurate k-nearest neighbor classification using prototype selection by clustering. Proceedings of 2012 Panhellenic Conference on Informatics. Piraeus, Greece. 2012. 168-173.
    [8] Li J, Wang YP. A new fast reduction technique based on binary nearest neighbor tree. Neurocomputing, 2015, 149:1647-1657.[doi:10.1016/j.neucom.2014.08.028
    [9] Yang LJ, Zhu QS, Huang JL. An efficient reduction algorithm based on natural neighbor and nearest enemy. Proceedings of 2016 IEEE International Conference on Cognitive Informatics & Cognitive Computing. Palo Alto, CA, USA. 2016. 212-218.
    [10] 朱庆生, 段浪军, 杨力军. 基于自然邻居和最小生成树的原型选择算法. 计算机科学, 2017, 44(4):241-245, 268.[doi:10.11896/j.issn.1002-137X.2017.04.051
    [11] 黄宇扬, 董明刚, 敬超. 面向K最近邻分类的遗传实例选择算法. 计算机应用, 2018, 38(11):3112-3118.[doi:10.11772/j.issn.1001-9081.2018041337
    [12] 王熙照, 邢胜, 赵士欣. 基于非平稳割点的大数据分类样例选择. 模式识别与人工智能, 2016, 29(9):780-789
    [13] Acampora G, Tortora G, Vitiello A. Applying SPEA2 to prototype selection for nearest neighbor classification. Proceedings of 2016 IEEE International Conference on Systems, Man, and Cybernetics. Budapest, Hungary, 2016. 3924-3929.
    [14] 邱荣财. 基于spark平台的CURE算法并行化设计与应用[硕士学位论文]. 广州:华南理工大学, 2014. 6.
    [15] Liu R, Wang H, Yu XM. Shared-nearest-neighbor-based clustering by fast search and find of density peaks. Information Sciences, 2018, 450:200-226.[doi:10.1016/j.ins.2018.03.031
    [16] 李金宗. 模式识别导论. 北京:高等教育出版社, 1994.
    [17] Guha S, Rastogi R, Shim K. Cure:An efficient clustering algorithm for large databases. Information Systems, 2001, 26(1):35-58.[doi:10.1016/S0306-4379(01)00008-4
    [18] Chang H, Yeung DY. Robust path-based spectral clustering. Pattern Recognition, 2008, 41(1):191-203.[doi:10.1016/j.patcog.2007.04.010
    [19] Fu LM, Medico E. FLAME, a novel fuzzy clustering method for the analysis of DNA microarray data. BMC Bioinformatics, 2007, 8:3.[doi:10.1186/1471-2105-8-3
    Cited by
Get Citation

孙元元,张德生,张晓.基于CURE聚类算法改进的原型选择算法.计算机系统应用,2019,28(8):162-169

Copy
Share
Article Metrics
  • Abstract:
  • PDF:
  • HTML:
  • Cited by:
History
  • Received:January 23,2019
  • Revised:February 26,2019
  • Online: August 14,2019
  • Published: August 15,2019
Article QR Code
You are the first990402Visitors
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-3
Address:4# South Fourth Street, Zhongguancun,Haidian, Beijing,Postal Code:100190
Phone:010-62661041 Fax: Email:csa (a) iscas.ac.cn
Technical Support:Beijing Qinyun Technology Development Co., Ltd.

Beijing Public Network Security No. 11040202500063