Multi-scale Perceptual Learning for Image Manipulation Detection and Localization

Fund Project: National Natural Science Foundation of China (62102189, 62122032); National Social Science Fund of China (2022-SKJJ-C-082)

    Abstract:

    To address the limited detection and localization performance and the weak robustness of existing image tampering detection methods, this paper proposes a multi-scale perceptual learning network (MsPL-Net). First, to enlarge the receptive field and counter the weak robustness caused by diverse image post-processing and manipulation types, a hierarchical, densely connected multi-scale dilated convolution module (MSDCM) is introduced. The module enlarges the receptive field to capture multi-scale feature information while preserving a high-resolution representation of the input image, so that fine image details and edge information are retained. Second, to reduce the blurring of tampered-edge locations caused by sensitivity to the size of the tampered region, an information complementary perception attention module (ICPAM) is proposed, consisting of global attention, local attention, and a gated feature modulator. Global attention captures the overall shape, structure, or background information of the image, while local attention learns local regions and specific details; the two branches operate in parallel, and their features interact and are fused to improve localization accuracy. The gated feature modulator uses fine-grained embeddings to filter irrelevant features and noise responses out of the global and local feature maps, guiding downstream layers to recognize and learn the abnormal textures, edge changes, and other traces left by different tampering techniques. Finally, a new joint loss function is designed to further improve the detection performance and localization accuracy of the network. Compared with the most recent work, the proposed method improves detection accuracy by 2.3%. It also shows good robustness and generalization, and localizes tampered regions more precisely and clearly.
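
The claim that dilated convolutions enlarge the receptive field while keeping the input resolution can be checked with simple arithmetic. The sketch below is only an illustration and not the MSDCM itself: the 3x3 kernel, the dilation rates (1, 2, 4, 8), the 256-pixel input, and all function names are assumptions, since the abstract does not give the module's configuration.

```python
from typing import List


def receptive_field(kernel_size: int, dilations: List[int]) -> int:
    """Receptive field (pixels along one axis) of stride-1 convolutions stacked in series."""
    rf = 1
    for d in dilations:
        # Each stride-1 layer widens the field by (kernel_size - 1) * dilation.
        rf += (kernel_size - 1) * d
    return rf


def per_layer_fields(kernel_size: int, dilations: List[int]) -> List[int]:
    """Largest receptive field available at the output of each successive layer."""
    return [receptive_field(kernel_size, dilations[: i + 1]) for i in range(len(dilations))]


def same_padding(kernel_size: int, dilation: int) -> int:
    """Zero padding per side that keeps the spatial size at stride 1 (odd kernel sizes)."""
    return dilation * (kernel_size - 1) // 2


def output_size(size: int, kernel_size: int, dilation: int, padding: int, stride: int = 1) -> int:
    """Standard convolution output-size formula along one axis."""
    effective_kernel = dilation * (kernel_size - 1) + 1
    return (size + 2 * padding - effective_kernel) // stride + 1


# Hypothetical configuration, chosen only for illustration.
ASSUMED_KERNEL = 3
ASSUMED_DILATIONS = [1, 2, 4, 8]

if __name__ == "__main__":
    print(receptive_field(ASSUMED_KERNEL, ASSUMED_DILATIONS))   # 31
    print(receptive_field(ASSUMED_KERNEL, [1, 1, 1, 1]))        # 9 (no dilation)
    print(per_layer_fields(ASSUMED_KERNEL, ASSUMED_DILATIONS))  # [3, 7, 15, 31]
    # With matching padding, a 256-pixel axis stays at 256 pixels.
    print(output_size(256, ASSUMED_KERNEL, 4, same_padding(ASSUMED_KERNEL, 4)))  # 256
```

Under these assumed settings, four dilated layers cover 31 pixels per axis, against 9 for four undilated 3x3 layers, and no downsampling is needed. If every layer's output is reused by later layers, as in a dense connection, features with receptive fields of 3, 7, 15, and 31 pixels are all available together, which is one way to read "multi-scale" here.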

Cite this article

徐悦, 袁程胜, 刘庆程, 夏志华. Multi-scale Perceptual Learning for Image Manipulation Detection and Localization. 计算机系统应用,,():1-11

History
  • Received: 2024-12-04
  • Last revised: 2024-12-25
  • Published online: 2025-04-25