Computer Systems & Applications, 2024, 33(1): 11-21
Deep Embedded K-means Clustering with Skip Connections
(School of Mathematical Sciences, Shanxi University, Taiyuan 030006, China)
Received: June 29, 2023    Revised: July 27, 2023
Abstract: Most existing deep clustering algorithms use symmetric autoencoders to extract low-dimensional features from high-dimensional data. As autoencoder training proceeds, however, the low-dimensional feature space becomes distorted to some extent and no longer reflects the latent cluster structure of the original data space. To address this problem, this paper proposes a new deep embedded K-means algorithm (SDEKC). First, in the feature-extraction stage, two weighted skip connections are added between corresponding encoder and decoder layers of a symmetric convolutional autoencoder. This relaxes the demands that the decoder places on the encoder and emphasizes the encoding capacity of the convolutional autoencoder, so that the cluster structure of the original data space is better preserved. Second, in the clustering stage, an orthonormal transformation matrix maps the low-dimensional space into a new space that reveals the cluster structure. Finally, a greedy algorithm iteratively optimizes the low-dimensional representation of the data and its clustering in an end-to-end manner. Experiments on six real-world datasets verify the effectiveness of the proposed algorithm.
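The clustering-stage step described in the abstract can be illustrated with a small NumPy sketch. The abstract does not say how the orthonormal transformation matrix is built; the sketch below assumes, purely for illustration, that it consists of the eigenvectors of the K-means within-cluster scatter matrix. The names `kmeans`, `within_scatter`, `orthonormal_transform`, `H`, `V`, and `Y` are hypothetical, and the synthetic embedding `H` stands in for the encoder output.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared distance of every point to every centroid, shape (n, k).
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return labels, centroids

def within_scatter(H, labels, centroids):
    """Within-cluster scatter matrix: sum of (h - mu)(h - mu)^T over all points."""
    diffs = H - centroids[labels]
    return diffs.T @ diffs

def orthonormal_transform(H, labels, centroids):
    """Assumed construction: eigenvectors of the within-cluster scatter matrix."""
    S_w = within_scatter(H, labels, centroids)
    # eigh handles symmetric matrices; eigenvalues come back in ascending
    # order and the eigenvector columns of V are orthonormal.
    eigvals, V = np.linalg.eigh(S_w)
    return V, eigvals

# Synthetic 3-D "embedding" with three well-separated clusters
# (a stand-in for the low-dimensional features, not real data).
rng = np.random.default_rng(0)
H = np.vstack([
    rng.normal(loc=c, scale=0.3, size=(50, 3))
    for c in ([0, 0, 0], [3, 3, 0], [0, 3, 3])
])

labels, centroids = kmeans(H, k=3)
V, eigvals = orthonormal_transform(H, labels, centroids)
Y = H @ V  # embedding expressed in the new orthonormal basis
```

Because `V` is orthonormal, the map from `H` to `Y` preserves pairwise Euclidean distances, so K-means assignments are unchanged; under the stated assumption, the transformation only re-expresses the embedding along axes ordered by within-cluster scatter.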
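Funding: National Natural Science Foundation of China (82274360, 61976128); 2022 Shanxi Province Graduate Education and Teaching Reform Project (2022YJJG010); Shanxi Province Horizontal Research Project (109023901054)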
Citation:
LI Shun-Yong, XU Rui, LI Shi-Yi. Deep Embedded K-means Clustering with Skip Connections. Computer Systems & Applications, 2024, 33(1): 11-21