Computer Systems & Applications, 2019, Vol. 28, Issue 1: 279-283


Applying PCA to Dimensionality Reduction of Image Features Extracted by Deep Learning
YANG Bo-Xiong1,2, YANG Yu-Qi2
1. School of Information & Intelligence Engineering, Sanya University, Sanya 572022, China;
2. Zhuhai Branch, Graduate School of Beijing Normal University, Zhuhai 519085, China
Foundation item: Scientific Research Start-up Program for High Level Talents by Sanya University
Abstract: Deep learning is a machine learning method widely used in artificial intelligence. Its heavy dependence on data means that the dimensionality of the data to be processed strongly affects both computing efficiency and classification performance. Taking data dimensionality as the research goal, this paper analyzes dimensionality reduction methods for deep learning. Then, using the Caltech 101 image dataset as the experimental object, the VGG-16 deep convolutional neural network is used to extract image features, and PCA is applied as an example statistical method to reduce the dimensionality of the high-dimensional image feature data. In the testing stage, Euclidean distance is used as the similarity measure to evaluate accuracy after dimensionality reduction. The experiments show that, after extracting the 4096-dimensional features of the fc3 layer of the VGG-16 network, the images still retain most of their feature information when PCA reduces the data to 64 dimensions.
Key words: deep learning; CNN; PCA; feature dimensionality reduction

1 Introduction

2 PCA Dimensionality Reduction

2.1 Principle of PCA

Let the n-dimensional vector w be a projection vector of the low-dimensional mapping space. Maximizing the variance of the projected data gives:

 $\mathop {\max }\limits_w \frac{1}{{m - 1}}\sum\limits_{i = 1}^m {{{\left( {{w^{\rm T}}({x_i} - \overline x )} \right)}^2}} $ (1)

 $\mathop {\max }\limits_W tr({W^{\rm T}}AW),\;{\rm {s.t.}}\;{W^{\rm T}}W = I$ (2)

where A is the sample covariance matrix:

 $A = \frac{1}{{m - 1}}\sum\limits_{i = 1}^m {({x_i} - \overline x ){{({x_i} - \overline x )}^{\rm T}}} $ (3)

The PCA output is $Y = {W^{\rm T}}X$; the optimal W is formed by taking, as its columns, the eigenvectors corresponding to the k largest eigenvalues of the data covariance matrix, which reduces the original dimensionality of X to k.
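As a concrete illustration, a minimal NumPy sketch of this projection (the function name `pca_reduce` and the toy data are illustrative, not from the paper); it forms the covariance matrix, takes the top-k eigenvectors as the columns of W, and projects:

```python
import numpy as np

def pca_reduce(X, k):
    """Project rows of X (m samples x d features) onto the top-k principal
    components: Y = (X - mean) @ W, where the columns of W are the
    eigenvectors of the sample covariance with the largest eigenvalues."""
    Xc = X - X.mean(axis=0)                  # center the data
    A = Xc.T @ Xc / (X.shape[0] - 1)         # d x d sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(A)     # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]    # indices of the k largest
    W = eigvecs[:, order]                    # d x k projection matrix
    return Xc @ W, W

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
Y, W = pca_reduce(X, 3)
print(Y.shape)  # (200, 3)
```

Note that W has orthonormal columns, so the constraint ${W^{\rm T}}W = I$ from Eq. (2) holds by construction.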

2.2 SVD Decomposition

PCA requires the eigenvalues and orthonormal eigenvectors of the covariance matrix; in practice these matrices can be very large and computing them directly is difficult, so SVD decomposition is usually used instead [8].

SVD (Singular Value Decomposition) is a method frequently used for matrices of very high dimension. It effectively maps a high-dimensional matrix into a low-dimensional space for solution, from which the eigenvalues and corresponding eigenvectors of the high-dimensional matrix are easily obtained. The basic principle of SVD is as follows:

Let A be an $n \times r$ matrix of rank r. Then there exist two column-orthogonal matrices (4) and (5) and a diagonal matrix (6) such that (7) holds:

 $U = ({u_1},{u_2}, \cdots ,{u_r}) \in {R^{n \times r}}$ (4)
 $V = ({v_1},{v_2}, \cdots ,{v_r}) \in {R^{r \times r}}$ (5)
 $\Lambda = {\rm {diag}}({\lambda _1},{\lambda _2}, \cdots ,{\lambda _r}) \in {R^{r \times r}},\;{\lambda _1} \ge {\lambda _2} \ge \cdots \ge {\lambda _r} > 0$ (6)

 $A = U{\Lambda ^{\frac{1}{2}}}{V^{\rm T}}$ (7)

where the ${\lambda _i}$ are the eigenvalues of ${A^{\rm T}}A$.
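A quick NumPy check of this decomposition: `numpy.linalg.svd` returns singular values s, which relate to the eigenvalues of ${A^{\rm T}}A$ by ${\lambda _i} = s_i^2$, so $A = U\,{\rm{diag}}(\sqrt \lambda )\,{V^{\rm T}}$. A sketch with random stand-in data, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(6, 4))                 # a 6x4 matrix, rank 4 a.s.

U, s, Vt = np.linalg.svd(A, full_matrices=False)
lam = s ** 2                                # eigenvalues of A^T A

# singular values are the square roots of the eigenvalues of A^T A,
# so A = U * Lambda^{1/2} * V^T with Lambda = diag(lam)
A_rec = U @ np.diag(np.sqrt(lam)) @ Vt
print(np.allclose(A, A_rec))  # True
```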

In PCA, let X be the matrix whose columns are the centered samples ${x_i} - u$. The covariance matrix is then

 $\Sigma = \frac{1}{M}\sum\limits_{i = 1}^M {({x_i} - u){{({x_i} - u)}^{\rm T}}} = \frac{1}{M}X{X^{\rm T}}$ (8)

When the feature dimension is much larger than the number of samples M, the eigenvectors of $X{X^{\rm T}}$ can be obtained through the much smaller matrix

 $R = {X^{\rm T}}X \in {R^{M \times M}}$ (9)

If ${v_i}$ is an eigenvector of R with eigenvalue ${\lambda _i}$, the corresponding eigenvector of $X{X^{\rm T}}$ is

 ${u_i} = \frac{1}{{\sqrt {{\lambda _i}} }}X{v_i},\;i = 1,2, \cdots ,M$ (10)
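Equations (8)–(10) encode a standard trick: when the feature dimension far exceeds the sample count M, one computes the eigenvectors ${v_i}$ of the small M×M matrix ${X^{\rm T}}X$ and maps them to eigenvectors of $X{X^{\rm T}}$ via ${u_i} = X{v_i}/\sqrt {{\lambda _i}} $. A NumPy sketch (random stand-in data, assumed already centered):

```python
import numpy as np

rng = np.random.default_rng(2)
d, M = 500, 20                         # feature dimension >> sample count
X = rng.normal(size=(d, M))            # columns are samples, assumed centered

R = X.T @ X                            # small M x M matrix, Eq. (9)
lam, V = np.linalg.eigh(R)             # eigenpairs of R, ascending order
lam, V = lam[::-1], V[:, ::-1]         # sort descending

U = X @ V / np.sqrt(lam)               # Eq. (10): u_i = X v_i / sqrt(lam_i)

# each u_i is a unit eigenvector of X X^T with the same eigenvalue lam_i
print(np.allclose((X @ X.T) @ U[:, 0], lam[0] * U[:, 0]))  # True
```

This is why the division by $\sqrt {{\lambda _i}} $ appears in Eq. (10): $\|X{v_i}\| = \sqrt {{\lambda _i}} $, so it normalizes each ${u_i}$ to unit length.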

2.3 PCA Feature Dimensionality Reduction Procedure

1) Compute the feature means and build the covariance matrix of the feature data;

2) Solve for the eigenvalues and eigenvectors of this covariance matrix via SVD decomposition;

3) Sort the resulting eigenvalues in descending order so that the principal-component eigenvalues can be selected;

4) Once the principal-component eigenvalues are selected, their corresponding eigenvectors form the reduced-dimensional subspace.
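The four steps above can be sketched in NumPy as follows (the function name `pca_subspace` and the toy data are illustrative, not from the paper):

```python
import numpy as np

def pca_subspace(X, k):
    """Follow the four steps above; X holds one sample per row."""
    mean = X.mean(axis=0)                          # step 1: feature means
    Xc = X - mean
    C = Xc.T @ Xc / (len(X) - 1)                   # step 1: covariance matrix
    U, s, Vt = np.linalg.svd(C)                    # step 2: SVD (C symmetric, so
                                                   # s = eigenvalues, U = eigenvectors)
    # step 3: np.linalg.svd already returns s sorted in descending order
    W = U[:, :k]                                   # step 4: top-k eigenvector subspace
    return Xc @ W

rng = np.random.default_rng(3)
feats = rng.normal(size=(100, 32))                 # stand-in feature vectors
reduced = pca_subspace(feats, 8)
print(reduced.shape)  # (100, 8)
```

Because the covariance matrix is symmetric positive semidefinite, its SVD coincides with its eigendecomposition, which is why step 2 can use SVD directly.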

3 CNN-Based Image Feature Extraction

3.1 Convolutional Neural Networks

 Figure 1 VGG-16 architecture

3.2 Dataset Selection

The Caltech 101 dataset is an image dataset compiled by the California Institute of Technology. It contains 101 foreground categories plus 1 background category, 9146 images in total, covering animals, plants, cartoon characters, vehicles, everyday objects, and more. Each category contains roughly 40-800 images, with most categories containing about 50. Image sizes vary, but are around 300×200 pixels [14].

4 Experiments

4.1 Experimental Environment

4.2 Experimental Results

 Figure 2 Experiment flowchart

5 Conclusions

1) PCA dimensionality reduction caused no loss of accuracy; on the contrary, accuracy peaked when the dimensionality was reduced to 64, 2.7% higher than without reduction. The line chart shows that going from 4096 down to 8 dimensions passes through two stages: a slow rise followed by a rapid fall. The first stage, from 4096 to 64 dimensions, rises slowly because redundant information is being removed. This demonstrates that CNN features also contain some information redundancy, and the gain from removing this redundancy outweighs the loss from reducing dimensionality, so removing it improves accuracy. The second stage, from 64 to 8 dimensions, falls sharply: once the feature dimension drops below 64, further reduction removes useful information, and this loss causes the rapid drop in accuracy.

 Figure 3 Line chart of matching accuracy after PCA dimensionality reduction

2) After PCA dimensionality reduction, all similarity measures other than Euclidean distance gave very low accuracy. This happens because the PCA computation only ensures that the variance of the data in the low-dimensional space is as large as possible; under a reduction criterion that considers variance alone, it is not surprising that other similarity measures fail.
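The Euclidean-distance matching used in the testing stage can be sketched as a simple nearest-neighbor lookup (the function name and synthetic data below are illustrative, not the paper's):

```python
import numpy as np

def euclidean_match(query, gallery):
    """Return the index of the gallery feature closest to the query
    under Euclidean distance, the similarity measure used in the paper."""
    d = np.linalg.norm(gallery - query, axis=1)
    return int(np.argmin(d))

rng = np.random.default_rng(4)
gallery = rng.normal(size=(50, 64))               # 50 reduced 64-d features
query = gallery[17] + 0.01 * rng.normal(size=64)  # slightly perturbed copy of item 17
print(euclidean_match(query, gallery))  # 17
```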

References

[1] Jose C. A fast on-line algorithm for PCA and its convergence characteristics. IEEE Transactions on Neural Networks, 2000, 4(2): 299-305.
[2] Majumdar A. Image compression by sparse PCA coding in curvelet domain. Signal, Image and Video Processing, 2009, 3(1): 27-34. DOI:10.1007/s11760-008-0056-5
[3] Gottumukkal R, Asari VK. An improved face recognition technique based on modular PCA approach. Pattern Recognition Letters, 2004, 25(4): 429-436. DOI:10.1016/j.patrec.2003.11.005
[4] Mohammed AA, Minhas R, Wu QMJ, et al. Human face recognition based on multidimensional PCA and extreme learning machine. Pattern Recognition, 2011, 44(10-11): 2588-2597. DOI:10.1016/j.patcog.2011.03.013
[5] Kuo CCJ. Understanding convolutional neural networks with a mathematical model. Journal of Visual Communication and Image Representation, 2016, 41: 406-413. DOI:10.1016/j.jvcir.2016.11.003
[6] Schmidhuber J. Deep learning in neural networks: An overview. Neural Networks, 2015, 61: 85-117. DOI:10.1016/j.neunet.2014.09.003
[7] Girshick R. Fast R-CNN. 2015 IEEE International Conference on Computer Vision. Santiago, Chile. 2015. 1440–1448.
[8] Szegedy C, Liu W, Jia YQ, et al. Going deeper with convolutions. 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, MA, USA. 2015. 1–9.
[9] Rampasek L, Goldenberg A. TensorFlow: Biology's gateway to deep learning? Cell Systems, 2016, 2(1): 12-14. DOI:10.1016/j.cels.2016.01.009
[10] Sebe N, Tian Q, Lew MS, et al. Similarity matching in computer vision and multimedia. Computer Vision and Image Understanding, 2008, 110(3): 309-311. DOI:10.1016/j.cviu.2008.04.001
[11] Hinton GE, Salakhutdinov RR. Reducing the dimensionality of data with neural networks. Science, 2006, 313(5786): 504-507. DOI:10.1126/science.1127647
[12] Zhuang FZ, Luo P, He Q, et al. Survey on transfer learning research. Journal of Software, 2015, 26(1): 26-39.
[13] Han S, Pool J, Tran J, et al. Learning both weights and connections for efficient neural networks. Proceedings of the 28th International Conference on Neural Information Processing Systems. Montreal, Canada. 2015. 1135–1143.
[14] Zeiler MD, Fergus R. Visualizing and understanding convolutional networks. 13th European Conference on Computer Vision. Zurich, Switzerland. 2014. 818–833.