Abstract: Single-cell RNA sequencing (scRNA-seq) enables high-throughput sequencing of transcriptomes at the level of individual cells. A primary application is identifying cell subpopulations with distinct functions, usually through cell clustering. However, the high dimensionality, noise, and sparsity of scRNA-seq data make clustering challenging. Traditional clustering methods handle such data poorly, and most existing single-cell clustering approaches consider only gene expression patterns while ignoring relationships between cells. To address these issues, a self-optimizing single-cell clustering method with contrastive learning and a graph neural network (scCLG) is proposed. The method employs an autoencoder to learn the distribution of cellular features. It first constructs a cell-gene graph, which is encoded with a graph neural network to exploit information on intercellular relationships. Subgraph sampling and feature masking then create augmented views for contrastive learning, further optimizing the feature representation. Finally, a self-optimizing strategy jointly trains the clustering and feature modules, continually refining the feature representation and the cluster centers for more accurate clustering. Experiments on 10 real scRNA-seq datasets demonstrate that scCLG learns robust representations of cell features and significantly surpasses other methods in clustering accuracy.
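As a rough illustration of the feature-masking augmentation mentioned above, the following minimal sketch creates two augmented views of a toy expression matrix by zeroing a random subset of gene columns. The function name `mask_features`, the `mask_rate` parameter, the column-wise masking scheme, and the toy data are all assumptions made for illustration; the abstract does not specify how scCLG implements masking.

```python
import random


def mask_features(matrix, mask_rate, seed):
    """Return a masked copy of `matrix` and the set of masked gene columns.

    Hypothetical helper: zeroes whole gene columns, which is only one
    possible masking scheme and is assumed here for illustration.
    """
    rng = random.Random(seed)
    n_genes = len(matrix[0])
    n_masked = int(round(mask_rate * n_genes))
    # Choose which gene columns to hide in this view.
    masked_cols = set(rng.sample(range(n_genes), n_masked))
    # Build a new matrix so the original expression data stays untouched.
    view = [
        [0.0 if j in masked_cols else value for j, value in enumerate(row)]
        for row in matrix
    ]
    return view, masked_cols


# Toy cells-by-genes expression matrix (3 cells, 4 genes); values are made up.
expr = [
    [1.0, 0.0, 3.0, 2.0],
    [0.0, 2.0, 1.0, 0.0],
    [4.0, 1.0, 0.0, 5.0],
]

# Two differently seeded views of the same cells, as a contrastive pair.
view_a, cols_a = mask_features(expr, mask_rate=0.5, seed=0)
view_b, cols_b = mask_features(expr, mask_rate=0.5, seed=1)
```

In a contrastive setup, the embeddings of the same cell in `view_a` and `view_b` would be treated as a positive pair, while embeddings of different cells would serve as negatives.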