Computer Systems & Applications (计算机系统应用), 2019, Vol. 28, Issue (3): 223–228

Clothing Classification Method Based on Lightweight Convolutional Neural Network
LUO Meng-Yan, LIU Yan-Fei
School of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China
Abstract: With the rapid growth of e-commerce platforms, manual classification of clothing can no longer meet current needs. Starting from practical application scenarios, this study improves clothing image classification in three respects: the interference of background factors, the key position information in the garment image, and the hardware requirements for running the algorithm model. Accordingly, we propose removing background interference, exploiting local image information, and applying lightweight processing to the model. The resulting model satisfies the accuracy requirement while running on an ordinary low-end PC, which improves work efficiency and reduces cost.
Key words: convolutional neural network; clothing classification; lightweight

1 Design of the Model

1.1 Removing the Interference of Background Information

1.2 Extraction of Local Features

Fig. 1 Schematic of the "skip-layer connection" structure
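The body text of this subsection is not reproduced here, but the skip-layer connection that Fig. 1 depicts is, in its usual form, an element-wise addition of a block's input to its transformed output, so that later layers can reuse earlier, more local features. A minimal sketch under that assumption (the function names are illustrative, not from the paper):

```python
def skip_connect(x, transform):
    """Skip-layer connection: add the block's input to its transformed
    output element-wise, letting later layers reuse earlier features."""
    return [a + b for a, b in zip(x, transform(x))]

# Example: a toy "layer" that doubles its input.
print(skip_connect([1.0, 2.0, 3.0], lambda v: [2 * i for i in v]))  # [3.0, 6.0, 9.0]
```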

1.3 Lightweight Processing of the Convolutional Neural Network

Fig. 2 Schematic of the block convolution operation

$C_1 = D_k \cdot D_k \cdot M \cdot N \cdot D_f \cdot D_f$

$C_2 = 2 \cdot D_k \cdot D_k \cdot M \cdot D_f \cdot D_f + M \cdot N \cdot D_f \cdot D_f$

$\dfrac{C_2}{C_1} = \dfrac{2 \cdot D_k \cdot D_k \cdot M \cdot D_f \cdot D_f + M \cdot N \cdot D_f \cdot D_f}{D_k \cdot D_k \cdot M \cdot N \cdot D_f \cdot D_f} = \dfrac{2}{N} + \dfrac{1}{D_k^2}$
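The cost comparison above can be checked numerically. The sketch below plugs in example dimensions (illustrative values, not taken from the paper: $D_k$ kernel size, $M$/$N$ input/output channels, $D_f$ output feature-map size) and verifies that the ratio $C_2/C_1$ matches the closed form implied by the $C_2$ expression:

```python
# Computational cost: standard convolution (C1) vs. block convolution (C2).
# Dimensions are illustrative examples, not values from the paper.
Dk = 3        # kernel size
M, N = 64, 128  # input / output channels
Df = 56       # output feature-map side length

C1 = Dk * Dk * M * N * Df * Df                     # standard convolution
C2 = 2 * Dk * Dk * M * Df * Df + M * N * Df * Df   # block convolution
ratio = C2 / C1
print(f"C1 = {C1}, C2 = {C2}, ratio = {ratio:.4f}")

# The ratio simplifies term by term to 2/N + 1/Dk^2.
assert abs(ratio - (2 / N + 1 / Dk**2)) < 1e-12
```

With these values the block scheme needs roughly 13% of the multiply–accumulates of a standard convolution, which is the source of the lightweight model's speedup.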

2 Algorithm Implementation and Comparison of Results

2.1 Training and Test Data

2.2 Comparison of Models with Different Structures

Fig. 3 Model structure

2.3 Comparison of Keypoint Localization Results

$NE = \dfrac{\sum\nolimits_k \frac{d_k}{s_k}\,\delta\left(v_k = 1\right)}{\sum\nolimits_k \delta\left(v_k = 1\right)} \times 100\%$

where $d_k$ is the distance between the predicted and ground-truth positions of keypoint $k$, $s_k$ is the normalization factor, and $\delta(v_k = 1)$ selects only the visible keypoints.
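The NE metric above averages the normalized distance $d_k/s_k$ over visible keypoints only. A minimal sketch of the computation (function and parameter names are illustrative):

```python
import math

def normalized_error(preds, gts, scales, visible):
    """NE metric: mean of d_k / s_k over keypoints with v_k = 1, as a percentage.
    preds, gts: lists of (x, y) coordinates; scales: per-keypoint s_k;
    visible: per-keypoint visibility flags v_k."""
    num, den = 0.0, 0
    for (px, py), (gx, gy), s, v in zip(preds, gts, scales, visible):
        if v == 1:
            num += math.hypot(px - gx, py - gy) / s  # d_k / s_k
            den += 1
    return num / den * 100 if den else 0.0

# Example: one invisible keypoint (ignored), one visible at distance 5, s_k = 10.
print(normalized_error([(0, 0), (3, 4)], [(0, 0), (0, 0)], [10, 10], [0, 1]))  # 50.0
```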

1) Keypoint localization with a Convolutional Pose Machine: after training a 6-stage CPM, the NE was 13.37%.

2) Hourglass was then tried for localization, with only flipping used for data augmentation; the NE was 10.78%.

3) Adding Color Jittering, Random Crop Resize, Rotate, and other data augmentation methods reduced the NE to 8.8%.

4) Initializing the parameters from a Hourglass model trained on the MPII benchmark reduced the NE to 7.86%.

5) Only a single model was trained for prediction. The loss was modified so that nonexistent keypoints contribute zero loss, and the Online Hard Keypoints Mining (OHKM) strategy proposed in CPN was adopted, backpropagating only the keypoints with the largest losses. The weights were initialized from the previously trained model, and images were cropped with Yolo at prediction time. The NE dropped to 6.76%.
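The loss modification in step 5 can be sketched as follows: mask out absent keypoints, then keep only the hardest (largest-loss) keypoints for backpropagation, as in CPN's OHKM. A minimal sketch (names and the `top_k` parameter are illustrative; the paper does not give its exact value):

```python
def ohkm_loss(keypoint_losses, visible, top_k):
    """Online Hard Keypoints Mining: zero the loss of absent keypoints
    (v_k = 0), then average only the top_k largest remaining losses,
    so gradients flow back only through the hardest keypoints."""
    masked = [l if v == 1 else 0.0 for l, v in zip(keypoint_losses, visible)]
    hardest = sorted(masked, reverse=True)[:top_k]
    return sum(hardest) / top_k

# Example: keypoint 3 is absent, so its loss is masked before mining.
print(ohkm_loss([4.0, 1.0, 3.0, 2.0], [1, 1, 0, 1], top_k=2))  # 3.0
```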

Fig. 4 Comparison of test accuracy for different model structures

Fig. 5 Comparison of NE values for the five keypoint localization approaches

Fig. 6 Accuracy improvement brought by keypoint localization

2.4 Implementation of Model Lightweighting and Comparison of Results

Fig. 7 Running time and accuracy of block convolution vs. traditional convolution under different network structures (neither data augmentation nor background removal was used)

3 Conclusion

[1] Iivarinen J, Peura M, Särelä J, et al. Comparison of combined shape descriptors for irregular objects. Proceedings of the 8th British Machine Vision Conference. Essex, Great Britain. 1997. 430–439.
[2] Rangayyan RM, El-Faramawy NM, Desautels JEL, et al. Measures of acutance and shape for classification of breast tumors. IEEE Transactions on Medical Imaging, 1997, 16(6): 799–810. DOI:10.1109/42.650876
[3] Teh CH, Chin RT. On image analysis by the methods of moments. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1988, 10(4): 496–513. DOI:10.1109/34.3913
[4] Ojala T, Pietikäinen M, Harwood D. A comparative study of texture measures with classification based on featured distributions. Pattern Recognition, 1996, 29(1): 51–59. DOI:10.1016/0031-3203(95)00067-4
[5] Daugman JG. Complete discrete 2-D Gabor transforms by neural networks for image analysis and compression. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1988, 36(7): 1169–1179. DOI:10.1109/29.1644
[6] Girshick R. Fast R-CNN. arXiv:1504.08083, 2015.
[7] Lowe DG. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 2004, 60(2): 91–110. DOI:10.1023/B:VISI.0000029664.99615.94
[8] Dalal N, Triggs B. Histograms of oriented gradients for human detection. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Diego, CA, USA. 2005. 886–893.
[9] Bossard L, Dantone M, Leistner C, et al. Apparel classification with style. Asian Conference on Computer Vision. Daejeon, South Korea. 2012. 321–335.
[10] He KM, Zhang XY, Ren SQ, et al. Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA. 2016. 770–778.
[11] Tompson JJ, Jain A, LeCun Y, et al. Joint training of a convolutional network and a graphical model for human pose estimation. Proceedings of the 27th International Conference on Neural Information Processing Systems. Montreal, Canada. 2014. 1799–1807.
[12] Wei SE, Ramakrishna V, Kanade T, et al. Convolutional pose machines. 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA. 2016. 4724–4732.
[13] Huang G, Liu Z, van der Maaten L, et al. Densely connected convolutional networks. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, HI, USA. 2017. 2261–2269.