Unsupervised Domain Adaptation Image Classification Based on CLIP

doi:10.15888/j.cnki.csa.010087


2026, Vol. 35, No. 1, pp. 141-151

CSTR: 32024.14.csa.010087




Abstract:

Unsupervised domain adaptation (UDA) aims to apply a model trained on a source domain to a target domain that has only unlabeled data. Current UDA methods mainly align the source and target feature spaces through statistical discrepancy minimization or adversarial learning in order to learn domain-invariant features. However, these constraints can distort the semantic feature structure and reduce class discriminability. To address this, this study proposes a method named DAMPL. It uses the CLIP model to inject textual description information and mine the semantic content of images, and it adopts a domain-specific prompt learning paradigm that preserves the information particular to each domain and avoids information loss. In addition, the pseudo-labels of the target domain are corrected through a semantic guidance mechanism to narrow the inter-domain gap and improve the generalization ability of the model. Finally, a mutual information maximization loss (IML) is introduced to preserve the discriminability of target-domain features. DAMPL achieves classification accuracies of 83.8%, 79.7%, and 89.8% on the Office-Home, miniDomainNet, and VisDA-2017 datasets, respectively, demonstrating the best performance.
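The abstract does not give formulas, so the following is only a minimal sketch of two ideas it mentions, under stated assumptions. Class probabilities are obtained CLIP-style from cosine similarities between an image embedding and one text-prompt embedding per class, and a mutual information objective is written in its common form: mean per-sample entropy minus the entropy of the batch-averaged prediction. The function names, the toy embeddings, and the temperature value are hypothetical, and this should not be read as the DAMPL implementation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def prompt_probs(image_emb, text_embs, temperature=0.01):
    """Class probabilities from image-to-prompt similarities.

    temperature=0.01 is an assumed value for illustration.
    """
    return softmax([cosine(image_emb, t) / temperature for t in text_embs])

def entropy(p, eps=1e-12):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q + eps) for q in p)

def mutual_information_loss(batch_probs):
    """Mean conditional entropy minus marginal entropy.

    Minimizing this value favors predictions that are individually
    confident yet spread across classes over the batch.
    """
    n = len(batch_probs)
    k = len(batch_probs[0])
    conditional = sum(entropy(p) for p in batch_probs) / n
    marginal = [sum(p[c] for p in batch_probs) / n for c in range(k)]
    return conditional - entropy(marginal)

# Toy usage with hypothetical 2-D embeddings for two class prompts.
text_embs = [[1.0, 0.0], [0.0, 1.0]]
batch = [prompt_probs([0.9, 0.1], text_embs),
         prompt_probs([0.2, 0.8], text_embs)]
pseudo_labels = [p.index(max(p)) for p in batch]
loss = mutual_information_loss(batch)
```

In this toy batch each image is assigned confidently to a different class, so the loss is close to its minimum for two classes (about minus ln 2). A batch in which every sample collapses onto the same class would give a value near zero instead, which is the degenerate solution this kind of objective is meant to discourage.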


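Cite this article

Ding Hualing, Yang Huan. Unsupervised Domain Adaptation Image Classification Based on CLIP. Computer Systems & Applications (计算机系统应用), 2026, 35(1): 141-151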


History

Received: 2025-05-16
Last revised: 2025-07-10
Published online: 2025-12-01
