Abstract:Most of the existing deep clustering algorithms adopt symmetric autoencoders to extract low-dimensional features of high-dimensional data. However, with the increasing training times of autoencoders, the low-dimensional feature space of the data is distorted to a certain extent, and then the obtained data low-dimensional feature space cannot reflect the potential clustering structure information in the original data space. To this end, this study proposes a new deep embedded K-means algorithm (SDEKC). First, during low-dimensional feature extraction, two skip connections are added with a certain weight between the corresponding encoder and decoder in the symmetric convolutional autoencoder. As a result, the encoding requirements of the decoder for the encoder are reduced, and the coding ability of the convolutional autoencoder is highlighted, which can better retain the clustering structure information in the original data space. Second, the low-dimensional data space is converted into a new space revealing clustering structure information by an orthogonal transformation matrix in the clustering stage. Finally, this study utilizes the greedy algorithm to iteratively optimize the low-dimensional representation of the data and its clustering in an end-to-end way and verifies the effectiveness of the proposed new algorithm on six real datasets.