Cross-modal Person Re-identification Based on Improved CLIP-ReID

Fund Project: Liaoning Province Applied Basic Research Project (2022JH2/101300243); 2022 Shenyang Science and Technology Plan "Jiebang Guashuai" (open competition) Industrial Generic Technology Project (22-316-1-07)

    Abstract:

    Narrowing the gap between modalities remains a major challenge in image-to-text cross-modal person re-identification. To address it, this study proposes an improved method based on contrastive language-image pretraining-person re-identification (CLIP-ReID) that adds a context adjustment network module and a cross-modal attention mechanism module. The context adjustment network module applies a deep nonlinear transformation to image features and combines the result with learnable context vectors, strengthening the semantic association between images and text. The cross-modal attention mechanism module dynamically weights and fuses image and text features so that the model takes the other modality into account while processing one modality, improving interaction across modalities. The method is evaluated on three public datasets: MSMT17, Market1501, and DukeMTMC. On these datasets, mAP increases by 2.2%, 0.5%, and 0.4%, and R1 increases by 1.1%, 0.1%, and 1.2%, respectively. The results indicate that the proposed method improves the accuracy of person re-identification.
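
The abstract names two modules without giving their equations, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that the context adjustment step is a two-layer MLP whose output is added to each learnable context vector (a conditional-prompt style design), and that the cross-modal step is plain scaled dot-product attention with a residual sum and no learned projections. All function names, sizes, and random weights are hypothetical stand-ins.

```python
import math
import random

random.seed(0)  # reproducible toy weights

def rand_matrix(rows, cols, scale=0.1):
    """Random values standing in for trained parameters or features."""
    return [[random.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def adjust_context(image_feat, context, w1, w2):
    """Assumed context adjustment: a two-layer MLP maps the image feature
    to a shift vector that is added to every learnable context vector."""
    hidden = [max(0.0, h) for h in matmul([image_feat], w1)[0]]  # ReLU nonlinearity
    shift = matmul([hidden], w2)[0]
    return [[c + s for c, s in zip(vec, shift)] for vec in context]

def cross_attention(queries, keys_values):
    """Scaled dot-product attention from one modality (queries) to the other,
    followed by a residual sum. Learned projections are omitted (assumption)."""
    dim = len(queries[0])
    transposed = [list(col) for col in zip(*keys_values)]
    scores = matmul(queries, transposed)
    weights = [softmax([s / math.sqrt(dim) for s in row]) for row in scores]
    attended = matmul(weights, keys_values)
    fused = [[q + a for q, a in zip(q_row, a_row)]
             for q_row, a_row in zip(queries, attended)]
    return fused, weights

# Toy sizes chosen for illustration only, not the paper's settings.
DIM, HIDDEN, N_CTX, N_PATCH = 8, 4, 4, 6
image_feat = [random.gauss(0.0, 1.0) for _ in range(DIM)]  # global image feature
patches = rand_matrix(N_PATCH, DIM, scale=1.0)             # stand-in image tokens
context = rand_matrix(N_CTX, DIM, scale=1.0)               # learnable context vectors
w1, w2 = rand_matrix(DIM, HIDDEN), rand_matrix(HIDDEN, DIM)

adjusted = adjust_context(image_feat, context, w1, w2)     # image-conditioned context
fused_text, attn = cross_attention(adjusted, patches)      # text side attends to image
```

Here each row of `attn` is a probability distribution over image tokens, which is one way to read "dynamic weighting": the text-side vectors draw more from the image tokens they match best. Swapping the argument order in `cross_attention` would give the image-to-text direction.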

Cite this article

贾军营, 杨芯茹, 杨海波, 徐展. Cross-modal Person Re-identification Based on Improved CLIP-ReID. 计算机系统应用 (Computer Systems & Applications), 2025, 34(1): 153-160

History
  • Received: 2024-06-24
  • Last revised: 2024-07-18
  • Published online: 2024-11-28