SIFT Image Retrieval Algorithm Based on Deep Learning
Computer Systems & Applications, 2020, Vol. 29, Issue 9: 164-170

SU Yong-Gang¹, GAO Mao-Ting²
1. Changzhou Institute of Industry Technology, Changzhou 213164, China;
2. Shanghai Maritime University, Shanghai 201306, China
Foundation item: National Natural Science Foundation of China (61202022)
Abstract: Deep learning is a new field in machine learning research, and applying it to computer vision has produced effective results. To address the low efficiency and coarse feature extraction of the traditional Scale-Invariant Feature Transform (SIFT) algorithm, a SIFT image retrieval algorithm based on deep learning is proposed. On the Spark platform, a deep Convolutional Neural Network (CNN) model is used for SIFT feature extraction, a Support Vector Machine (SVM) is used for unsupervised clustering of the image library, and adaptive image feature measures are then used to re-sort the search results to improve the user experience. Experimental results on the Corel image set show that, compared with the traditional SIFT algorithm, the proposed algorithm increases precision and recall by about 30 percentage points, improves retrieval efficiency, and returns the result images in a better order.
Key words: Convolutional Neural Network (CNN); deep learning; image retrieval; re-sorting

1 Basic Concepts

1.1 CNN

 $z(u,v) = \sum\limits_{i = -\infty}^{\infty} \sum\limits_{j = -\infty}^{\infty} X_{i,j} \cdot K_{u - i,v - j}$ (1)

 $z(u,v) = \sum\limits_{i = -\infty}^{\infty} \sum\limits_{j = -\infty}^{\infty} X_{i + u,j + v} \cdot K^{\rm rot}_{i,j} \cdot \chi(i,j)$ (2)
 $\chi(i,j) = \left\{\begin{array}{*{20}{l}} {1,\;0 \le i,j \le n} \\ {0,\;{\rm otherwise}} \end{array}\right.$ (3)
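The relation between Eq. (1) and Eq. (2) can be checked numerically. The following is a minimal NumPy sketch, not the paper's implementation: `conv2d_full` and `corr2d_valid` are illustrative names, the 4x4 input and 2x2 kernel are made-up values, and the rotated kernel in Eq. (2) is assumed to mean a 180-degree rotation of K.

```python
import numpy as np

def conv2d_full(x, k):
    """Eq. (1): z(u, v) = sum over i, j of x[i, j] * k[u - i, v - j]."""
    h, w = x.shape
    kh, kw = k.shape
    z = np.zeros((h + kh - 1, w + kw - 1))
    for i in range(h):
        for j in range(w):
            # Each input sample adds a scaled copy of the kernel at offset (i, j).
            z[i:i + kh, j:j + kw] += x[i, j] * k
    return z

def corr2d_valid(x, k):
    """Eq. (2): z(u, v) = sum over i, j of x[i + u, j + v] * k[i, j],
    with i, j restricted to the kernel support (the indicator of Eq. (3))."""
    h, w = x.shape
    kh, kw = k.shape
    z = np.zeros((h - kh + 1, w - kw + 1))
    for u in range(z.shape[0]):
        for v in range(z.shape[1]):
            z[u, v] = np.sum(x[u:u + kh, v:v + kw] * k)
    return z

x = np.arange(16, dtype=float).reshape(4, 4)   # toy 4x4 input (assumed values)
k = np.array([[1.0, 2.0], [3.0, 4.0]])         # toy 2x2 kernel (assumed values)
k_rot = np.rot90(k, 2)                         # kernel rotated by 180 degrees

# Keep only the outputs where the kernel lies fully inside the input.
conv_valid = conv2d_full(x, k)[k.shape[0] - 1:x.shape[0], k.shape[1] - 1:x.shape[1]]
corr_valid = corr2d_valid(x, k_rot)
print(np.allclose(conv_valid, corr_valid))
```

Under these assumptions the two results coincide, which suggests why a convolution layer can be computed as a sliding-window product with a pre-rotated kernel.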

 $\left\{ \begin{split} &new\_height = (input\_height - filter\_height)/S + 1\\ &new\_width = (input\_width - filter\_width)/S + 1 \end{split}\right.$ (4)
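As a quick worked instance of Eq. (4), the sketch below computes the output size along one spatial dimension. The function name `conv_output_size` and the sample sizes are hypothetical, and two things are assumed that Eq. (4) does not state: no padding is applied, and a division that does not come out even is rounded down.

```python
def conv_output_size(input_size, filter_size, stride):
    """Eq. (4) for one spatial dimension, assuming no padding."""
    span = input_size - filter_size
    if span < 0 or stride <= 0:
        raise ValueError("filter must fit inside the input and stride must be positive")
    # Assumption: positions where the filter would overhang the input are dropped.
    return span // stride + 1

new_height = conv_output_size(32, 5, 1)    # (32 - 5) / 1 + 1
new_width = conv_output_size(227, 11, 4)   # (227 - 11) / 4 + 1
print(new_height, new_width)
```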

 Fig. 1 Demonstration of the pooling process

 Fig. 2 Schematic of a fully connected layer

 $\left( {{X_1},{X_2},{X_3}} \right)\left( {\begin{array}{*{20}{c}} {W_{11}}&{W_{12}}\\ {W_{21}}&{W_{22}}\\ {W_{31}}&{W_{32}} \end{array}} \right) = \left( {{Y_1},{Y_2}} \right)$
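The fully connected mapping above is a vector-matrix product. A small NumPy sketch with made-up numbers (the values of `x` and `W` are assumptions chosen only for illustration) shows how three inputs become two outputs:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])   # (X1, X2, X3)
W = np.array([[0.1, 0.2],
              [0.3, 0.4],
              [0.5, 0.6]])      # W[i, j] connects input i to output j
y = x @ W                       # (Y1, Y2)

# The same result written out as explicit weighted sums.
y_explicit = np.array([sum(x[i] * W[i, j] for i in range(3)) for j in range(2)])
print(y, y_explicit)
```

Each output is a weighted sum of all inputs, which is the "fully connected" property; a bias term and an activation function are left out here because the formula does not show them.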

1.2 Similarity Measures

1.2.1 Image Feature Measures

 ${c_1} = s\cos (h),\;\;{c_2} = s\sin (h),\;\;{c_3} = v$ (5)

 $(\bar x,\bar y) = \left( {\frac{x}{W},\frac{y}{H}} \right)$ (6)

 ${D_1} = \exp \left( { - \frac{{{{({c_{i1}} - {c_{j1}})}^2} + {{({c_{i2}} - {c_{j2}})}^2} + {{({c_{i3}} - {c_{j3}})}^2}}}{{3\sigma _1^2}}} \right)$ (7)

 ${D_2} = \exp \left( { - \frac{{{{({{\bar x}_i} - {{\bar x}_j})}^2} + {{({{\bar y}_i} - {{\bar y}_j})}^2}}}{{2\sigma _2^2}}} \right)$ (8)

 ${D_3} = \exp \left( { - \frac{{{w_\rho }{{({\rho _i} - {\rho _j})}^2} + {w_e}{{({e_i} - {e_j})}^2}}}{{\sigma _3^2}}} \right)$ (9)

 ${D_{{\rm{object}}}}(i,j) = {w_1}{D_1} + {w_2}{D_2} + {w_3}{D_3}$ (10)
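Eqs. (5) to (10) can be chained into a single function. The sketch below is a plain-Python illustration under several assumptions: the hue h is given in radians, the exponent of D1 carries the same negative sign as those of D2 and D3, and ρ and e are treated as scalar shape descriptors supplied by the caller, since their definitions do not appear in this text. All names, dictionary keys, and sample values are hypothetical.

```python
import math

def hsv_to_cone(h, s, v):
    """Eq. (5): map HSV (h in radians) to cone coordinates (c1, c2, c3)."""
    return (s * math.cos(h), s * math.sin(h), v)

def normalized_position(x, y, width, height):
    """Eq. (6): scale a pixel position by the image width W and height H."""
    return (x / width, y / height)

def object_measure(obj_i, obj_j, weights, sigmas, w_rho=0.5, w_e=0.5):
    """Eq. (10): weighted sum of the color, position, and shape terms."""
    c_i, c_j = hsv_to_cone(*obj_i["hsv"]), hsv_to_cone(*obj_j["hsv"])
    color_sq = sum((p - q) ** 2 for p, q in zip(c_i, c_j))
    d1 = math.exp(-color_sq / (3 * sigmas[0] ** 2))      # Eq. (7), negative sign assumed

    pos_sq = sum((p - q) ** 2 for p, q in zip(obj_i["pos"], obj_j["pos"]))
    d2 = math.exp(-pos_sq / (2 * sigmas[1] ** 2))        # Eq. (8)

    shape_sq = (w_rho * (obj_i["rho"] - obj_j["rho"]) ** 2
                + w_e * (obj_i["e"] - obj_j["e"]) ** 2)
    d3 = math.exp(-shape_sq / sigmas[2] ** 2)            # Eq. (9)

    w1, w2, w3 = weights
    return w1 * d1 + w2 * d2 + w3 * d3

a = {"hsv": (0.0, 0.5, 0.8), "pos": normalized_position(40, 30, 200, 100), "rho": 0.2, "e": 0.6}
b = {"hsv": (math.pi, 0.5, 0.8), "pos": normalized_position(160, 30, 200, 100), "rho": 0.3, "e": 0.4}
weights, sigmas = (0.5, 0.3, 0.2), (1.0, 1.0, 1.0)

same = object_measure(a, a, weights, sigmas)
diff = object_measure(a, b, weights, sigmas)
print(same, diff)
```

With every exponent negative, each term equals 1 for identical objects and decays toward 0 as they differ, so the measure reaches its maximum (the sum of the weights) when an object is compared with itself.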

 $w'_i = \dfrac{{{w_i} + \dfrac{1}{N}}}{{1 + \dfrac{1}{N}}},\;\;w'_j = \dfrac{{{w_j}}}{{1 + \dfrac{1}{N}}},\;\;j \ne i$ (11)
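The update in Eq. (11) raises one weight and rescales all of them, so weights that sum to 1 before the update still sum to 1 afterwards: the new total is (1 + 1/N) / (1 + 1/N). A minimal sketch follows; `boost_weight` is an illustrative name, and N is passed in explicitly because its meaning is not defined in this text.

```python
def boost_weight(weights, i, n):
    """Eq. (11): add 1/N to weight i, then divide every weight by 1 + 1/N."""
    scale = 1 + 1 / n
    return [(w + 1 / n) / scale if k == i else w / scale
            for k, w in enumerate(weights)]

weights = [0.5, 0.3, 0.2]                  # assumed starting weights w1, w2, w3
updated = boost_weight(weights, i=2, n=3)  # favor the third measure; N = 3 is an assumption
print(updated, sum(updated))
```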

 ${S_w}({p_i},{p_j}) = p_i^{\rm T}W{p_j},\;\;W \in {\mathbb{R}^{d \times d}}$ (12)
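Eq. (12) is a bilinear form, which can be evaluated in one line. In the hedged sketch below, the vectors and the matrix are made-up values and `bilinear_similarity` is an illustrative name; how W is learned is outside the scope of the example.

```python
import numpy as np

def bilinear_similarity(p_i, p_j, W):
    """Eq. (12): S_w(p_i, p_j) = p_i^T W p_j."""
    return float(p_i @ W @ p_j)

p_i = np.array([1.0, 0.0])
p_j = np.array([0.0, 1.0])
W = np.array([[1.0, 2.0],
              [0.0, 1.0]])   # assumed parameter matrix, d = 2

forward = bilinear_similarity(p_i, p_j, W)
backward = bilinear_similarity(p_j, p_i, W)
identity_case = bilinear_similarity(p_i, p_j, np.eye(2))
print(forward, backward, identity_case)
```

Two properties are visible in this toy case: with W equal to the identity the score reduces to the ordinary dot product, and with a non-symmetric W the score depends on the order of the arguments.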

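2 SIFT Image Retrieval Algorithm Based on Deep Learning

The SIFT algorithm is mainly used for image retrieval. Its general procedure is as follows: first, a scale space is generated according to a given rule; candidate image locations are then detected in the scale space, and interest points that vary strongly under scale and rotation changes are discarded; next, the stable interest points are selected as keypoints, and each keypoint is assigned one or more orientations; finally, the neighborhood vectors of the keypoints are used to measure the similarity between images. The strength of SIFT is its invariance to image scaling, rotation, and brightness changes.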
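The first stage of this procedure, building the scale space, can be sketched in a few lines of NumPy. The sketch assumes the standard SIFT construction (a stack of Gaussian-blurred images whose neighboring levels are subtracted to form a difference-of-Gaussians stack), which the text above does not spell out. The function names, the base scale `sigma0=1.6`, the number of levels, and the scale step are illustrative assumptions, and keypoint detection, orientation assignment, and descriptor matching are not covered.

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    """Sampled, normalized 1-D Gaussian truncated at three standard deviations."""
    radius = int(np.ceil(3 * sigma))
    t = np.arange(-radius, radius + 1)
    kernel = np.exp(-t ** 2 / (2 * sigma ** 2))
    return kernel / kernel.sum()

def gaussian_blur(image, sigma):
    """Separable blur: filter every row, then every column (zero padding at the borders)."""
    kernel = gaussian_kernel_1d(sigma)
    if kernel.size > min(image.shape):
        raise ValueError("image is too small for this sigma")
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="same"), 1, image)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="same"), 0, rows)

def dog_stack(image, sigma0=1.6, levels=4, step=2 ** 0.5):
    """Blur at sigma0 * step**n, then subtract each level from the next coarser one."""
    blurred = [gaussian_blur(image, sigma0 * step ** n) for n in range(levels)]
    return np.stack([coarse - fine for fine, coarse in zip(blurred, blurred[1:])])

image = np.zeros((32, 32))
image[16, 16] = 1.0        # a single bright pixel as a toy input
dog = dog_stack(image)     # shape: (levels - 1, height, width)
print(dog.shape)
```

In the standard algorithm, candidate keypoints are the local extrema of such a stack across both position and scale; that search is omitted here to keep the sketch short.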

2.1 SIFT Algorithm

The SIFT pipeline is generally divided into the following steps:

 Fig. 3 Illustration of the SIFT algorithm

2.2 Proposed Algorithm

 Fig. 4 Flowchart of the proposed algorithm

 ${\rm ave}(C,N) = \frac{1}{n}\sum\limits_{i = 1}^n {(C,N)}$ (13)

 ${F_{i,j}} = {F_{i,j}} - \alpha \frac{\partial }{{\partial {F_{i,j}}}}P(a,b)$ (14)
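Eq. (14) is a gradient-descent step on the filter weights with learning rate α. The sketch below illustrates one such step on a toy problem. Because the loss P(a, b) is not defined in this text, a squared error between the filter response and a target value stands in for it; the patch, the target, the learning rate, and all function names are assumptions.

```python
import numpy as np

def update_filter(F, grad, alpha):
    """Eq. (14): F_ij <- F_ij - alpha * dP/dF_ij, applied to all entries at once."""
    return F - alpha * grad

patch = np.array([[1.0, 2.0],
                  [0.0, 1.0]])   # toy input patch (assumed)
target = 3.0                     # toy desired response (assumed)

def loss(F):
    """Stand-in for P: half the squared error of the filter response."""
    return 0.5 * (np.sum(F * patch) - target) ** 2

def loss_grad(F):
    """Gradient of the stand-in loss with respect to every filter entry."""
    return (np.sum(F * patch) - target) * patch

F = np.zeros((2, 2))
before = loss(F)
F = update_filter(F, loss_grad(F), alpha=0.1)
after = loss(F)
print(before, after)
```

On this toy problem a single step lowers the loss, which is the behavior the update rule is meant to produce when α is small enough.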

 Fig. 5 Convolutional neural network architecture used in this paper

 Fig. 6 Original image blocks and their CNN features

 ${R_i} = \left\{ \begin{array}{*{20}{l}} {1,\;{S_{i,j}} \le t} \\ {0,\;{\rm otherwise}} \end{array}\right.$ (15)

 Fig. 7 Training the image library on the Spark platform

 Fig. 8 Flowchart of the proposed algorithm

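3 Experiments and Analysis

3.1 Experimental Design

Three controlled experiments are designed, mainly to verify that the proposed algorithm outperforms the traditional SIFT algorithm and is more user-friendly. Experiment 1 compares the precision of the two algorithms; Experiment 2 verifies that, when retrieving from a massive dataset, the proposed algorithm has lower time complexity than the traditional SIFT algorithm; Experiment 3 verifies that the result set retrieved by the proposed algorithm is ranked more reasonably.

3.2 Image Retrieval Performance Evaluation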

 $\left\{ \begin{split} &{\rm recall} = \dfrac{\text{relevant correctly retrieved}}{\text{all relevant}} = \dfrac{A}{A + B}\\ &{\rm mAP} = \dfrac{1}{m}\displaystyle\sum\limits_{i = 1}^m \dfrac{A}{R} \end{split}\right.$ (16)
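Both quantities in Eq. (16) are simple ratios, as the following sketch shows. Several readings are assumed rather than taken from the text: B is the number of relevant images that were not retrieved (so that A + B is the total number of relevant images), A and R are counted separately for each of the m queries, and R is the number of images returned for a query. The function names and the sample counts are hypothetical, and the second function implements the formula as printed, not a rank-based average precision.

```python
def recall(relevant_retrieved, relevant_missed):
    """First line of Eq. (16): A / (A + B)."""
    total_relevant = relevant_retrieved + relevant_missed
    return relevant_retrieved / total_relevant if total_relevant else 0.0

def mean_precision_ratio(per_query):
    """Second line of Eq. (16): the mean of A / R over m queries.

    per_query is a list of (A, R) pairs, one pair per query.
    """
    if not per_query:
        raise ValueError("at least one query is required")
    return sum(a / r for a, r in per_query) / len(per_query)

r = recall(relevant_retrieved=30, relevant_missed=70)
m = mean_precision_ratio([(8, 10), (6, 10)])   # two toy queries
print(r, m)
```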

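3.3 Results and Analysis

 Fig. 9 Running time of the proposed algorithm versus the traditional SIFT algorithm

 Fig. 10 Ranking of retrieval results for the two algorithms

4 Conclusion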
