Image Data Completion Method Integrating Convolutional Neural Network and Neural Process

Authors: 余晓晗, 毛绍臣, 王磊, 崔静, 于坤
Abstract:

Neural processes (NPs) combine the advantages of neural networks and Gaussian processes: from a small number of context points they estimate a distribution over functions, with uncertainty, and thereby perform function regression. They have been applied to a variety of machine learning tasks such as data completion and classification. For two-dimensional regression problems such as image data completion, however, the prediction accuracy of the NP is limited and its fit to the context points is poor. To address this, a convolutional neural network (CNN) is integrated into the neural process and, from a derivation of the evidence lower bound and the loss function, an image-oriented neural process (IFNP) model is constructed. On top of the IFNP, a local pooling aggregation (LPA) module and a global cross-attention (GCA) module are designed, yielding an image-oriented attentive neural process (IFANP) model whose performance is markedly better than that of both the NP and the IFNP. Finally, these models are applied to the MNIST and CelebA datasets; combined qualitative and quantitative analysis demonstrates the scalability of the IFNP and confirms the superior data completion and detail-fitting ability of the IFANP.
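For readers unfamiliar with neural processes, the mechanism the abstract refers to can be sketched as follows: each observed (pixel coordinate, pixel value) context pair is encoded, the encodings are aggregated into a permutation-invariant representation, and a decoder maps that representation plus a target coordinate to a predictive distribution. This is a minimal illustration with randomly initialized weights; the layer sizes and the names `mlp` and `np_predict` are assumptions for exposition, not the authors' IFNP implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(sizes):
    """Build a random-weight MLP; returns a forward function (illustration only)."""
    params = [(rng.normal(0, 0.1, (m, n)), np.zeros(n)) for m, n in zip(sizes, sizes[1:])]
    def forward(x):
        for i, (W, b) in enumerate(params):
            x = x @ W + b
            if i < len(params) - 1:
                x = np.tanh(x)
        return x
    return forward

# Encoder maps each (row, col, gray value) context pair to a latent code.
encoder = mlp([3, 32, 32])
# Decoder maps (aggregated representation, target coordinate) to (mu, log sigma).
decoder = mlp([32 + 2, 32, 2])

def np_predict(ctx_xy, ctx_v, tgt_xy):
    """Neural-process forward pass: encode context, mean-aggregate, decode targets."""
    r_i = encoder(np.concatenate([ctx_xy, ctx_v], axis=1))  # per-pair codes
    r = r_i.mean(axis=0, keepdims=True)                     # permutation-invariant summary
    h = decoder(np.concatenate([np.repeat(r, len(tgt_xy), 0), tgt_xy], axis=1))
    mu, log_sigma = h[:, :1], h[:, 1:]
    return mu, np.exp(log_sigma)                            # predictive mean and std

# Example: 10 observed pixels of an image, predict 5 unobserved ones.
ctx_xy = rng.uniform(-1, 1, (10, 2))   # observed pixel coordinates
ctx_v = rng.uniform(0, 1, (10, 1))     # observed gray values
tgt_xy = rng.uniform(-1, 1, (5, 2))    # coordinates to complete
mu, sigma = np_predict(ctx_xy, ctx_v, tgt_xy)
print(mu.shape, sigma.shape)  # (5, 1) (5, 1)
```

Training such a model would maximize the log-likelihood of target values under these predictive distributions, regularized via the evidence lower bound mentioned in the abstract.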

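The global cross-attention (GCA) aggregation named in the abstract can likewise be illustrated generically: rather than weighting every context point equally, each target coordinate queries the context coordinates, so observed pixels most similar to the query dominate that target's summary. The snippet below is a sketch of standard scaled dot-product cross-attention, not the published GCA module; all shapes and names are assumptions.

```python
import numpy as np

def cross_attention(q_xy, k_xy, v_feat):
    """Scaled dot-product cross-attention: target coordinates query context coordinates.

    q_xy:   (T, d) target pixel coordinates (queries)
    k_xy:   (C, d) context pixel coordinates (keys)
    v_feat: (C, r) per-context encodings (values)
    Returns (T, r): one context-weighted representation per target pixel.
    """
    d = q_xy.shape[1]
    scores = q_xy @ k_xy.T / np.sqrt(d)              # (T, C) query-key similarities
    scores -= scores.max(axis=1, keepdims=True)      # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)    # softmax over context points
    return weights @ v_feat                          # attention-weighted summaries

rng = np.random.default_rng(1)
ctx_xy = rng.uniform(-1, 1, (10, 2))     # 10 observed pixel coordinates
ctx_feat = rng.normal(size=(10, 32))     # their encodings
tgt_xy = rng.uniform(-1, 1, (5, 2))      # 5 pixels to complete
out = cross_attention(tgt_xy, ctx_xy, ctx_feat)
print(out.shape)  # (5, 32)
```

Feeding each target's attention-weighted summary, instead of a single shared mean, to the decoder is what lets attentive variants fit context points more tightly.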
Cite this article:

余晓晗, 毛绍臣, 王磊, 崔静, 于坤. Image data completion method integrating convolutional neural network and neural process. 计算机系统应用 (Computer Systems & Applications), 2023, 32(1): 135-145.

History
  • Received: 2022-04-25
  • Last revised: 2022-05-22
  • Published online: 2022-08-12