Self-optimizing Single-cell Clustering with Contrastive Learning and Graph Neural Network

Fund Project: National Natural Science Foundation of China (61972100)

    Abstract:

    Single-cell RNA sequencing (scRNA-seq) performs high-throughput sequencing of the transcriptome at the level of individual cells. Its core application is identifying cell subpopulations with distinct functions, which is usually done by clustering cells. However, the high dimensionality, high noise, and high sparsity of scRNA-seq data make clustering challenging. Conventional clustering methods perform poorly, and most existing single-cell clustering methods consider only gene expression patterns while ignoring the relationships between cells. To address these problems, this paper proposes scCLG, a self-optimizing single-cell clustering method with contrastive learning and graph neural network. The method uses an autoencoder to learn the feature distribution of cells. It first constructs a cell-gene graph and encodes it with a graph neural network, so that relationship information between cells is used effectively. Augmented views obtained through subgraph sampling and feature masking are then used for contrastive learning to further optimize the feature representation. Finally, a self-optimizing strategy trains the clustering module and the feature module jointly, continually refining the feature representation and the cluster centers to achieve more accurate clustering. Experiments on 10 real scRNA-seq datasets show that scCLG learns good representations of cell features and outperforms the other methods in clustering accuracy.
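The abstract does not spell out the objective behind the self-optimizing strategy. One formulation that is common in deep clustering, and assumed here purely for illustration, is the DEC-style scheme: each cell embedding is softly assigned to cluster centers with a Student's t kernel, a sharpened target distribution is derived from those assignments, and the KL divergence between the two is minimized together with the feature-learning losses. The sketch below shows only that clustering term on toy data. The function names, the 2-D embeddings, and the two centers are hypothetical and are not taken from scCLG.

```python
import math


def soft_assign(embeddings, centers):
    """Row-normalized Student's t kernel (1 degree of freedom) between cells and centers."""
    q = []
    for z in embeddings:
        sims = [
            1.0 / (1.0 + sum((a - b) ** 2 for a, b in zip(z, c)))
            for c in centers
        ]
        total = sum(sims)
        q.append([s / total for s in sims])
    return q


def target_distribution(q):
    """Sharpen q: square each assignment, divide by the soft cluster size, renormalize per cell."""
    n_clusters = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p


def kl_divergence(p, q):
    """KL(P || Q) summed over all cells and clusters; this is the clustering loss."""
    return sum(
        pij * math.log(pij / qij)
        for p_row, q_row in zip(p, q)
        for pij, qij in zip(p_row, q_row)
        if pij > 0.0
    )


# Toy data (assumed): four cell embeddings forming two well-separated groups.
embeddings = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
centers = [[0.0, 0.0], [5.0, 5.0]]

q = soft_assign(embeddings, centers)
p = target_distribution(q)
loss = kl_divergence(p, q)
labels = [row.index(max(row)) for row in q]
print(labels, round(loss, 4))
```

In a full training loop under this assumption, the target distribution would be recomputed periodically and held fixed, while gradients of the loss update both the encoder and the cluster centers. That is one way the feature representation and the centers can be refined together.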

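Cite this article

Jiang Weikang (蒋维康), Wang Jinxian (王劲贤). Self-optimizing Single-cell Clustering with Contrastive Learning and Graph Neural Network. Computer Systems & Applications, 2024, 33(9): 1-13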

History
  • Received: 2024-03-28
  • Last revised: 2024-04-23
  • Published online: 2024-07-26