End-to-End Cross-Domain Object Detection Based on Image Style Transfer
Authors: Wu Zeyuan, Zhu Ming
Funding: Anhui Provincial Key Research and Development Program (201904a05020035)
Abstract:

Cross-domain object detection is a newly emerging research direction that aims to solve the problem of generalizing from the training set to the test set. Among existing methods, applying image style transfer and then training the model on the transformed dataset is effective. However, such pipelines cannot be trained end-to-end, which makes them inefficient and cumbersome. We therefore propose a new cross-domain object detection algorithm based on image style transfer, which combines image style transfer and object detection for end-to-end training and greatly simplifies the training process. Results on several common datasets demonstrate the effectiveness of the model.
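The end-to-end idea described above — coupling the style-transfer objective with the detection objective so that a single optimizer step updates both components, instead of training them in two separate stages — can be sketched with a toy scalar model. All names, loss forms, and the weight `lam` below are illustrative assumptions, not the paper's actual formulation:

```python
# Toy sketch of end-to-end joint training: one objective
# L_total = L_style + lam * L_det, minimized over all parameters at once.

def style_loss(params):
    # Toy stand-in for a style-transfer reconstruction loss.
    return (params["gen"] - 1.0) ** 2

def detection_loss(params):
    # Toy stand-in for a detector loss that also depends on the generator's
    # parameters -- this coupling is what makes the pipeline end-to-end.
    return (params["det"] - params["gen"]) ** 2

def total_loss(params, lam=0.5):
    # Single joint objective instead of a two-stage pipeline.
    return style_loss(params) + lam * detection_loss(params)

def grad(f, params, eps=1e-6):
    # Numerical gradient via forward finite differences.
    g = {}
    for k in params:
        p = dict(params)
        p[k] += eps
        g[k] = (f(p) - f(params)) / eps
    return g

def train(steps=200, lr=0.1, lam=0.5):
    # One optimizer updates both the "generator" and "detector" parameters.
    params = {"gen": 0.0, "det": 0.0}
    for _ in range(steps):
        g = grad(lambda p: total_loss(p, lam), params)
        for k in params:
            params[k] -= lr * g[k]
    return params

params = train()
```

In a real implementation the two loss terms would come from a style-transfer network and a detector sharing one computation graph, so gradients from the detection loss flow back into the style-transfer component.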

Cite this article:

Wu ZY, Zhu M. End-to-end cross-domain object detection based on image style transfer. 计算机系统应用 (Computer Systems & Applications), 2021, 30(1): 194–199.
History
  • Received: 2020-06-06
  • Revised: 2020-07-07
  • Published online: 2020-12-31