To address the occlusion problem in person re-identification, this study presents a method based on pose-driven local feature alignment. The network consists mainly of a pose encoder (PE) and a human part alignment module (HPAM). The PE reconstructs the pose estimation heatmap to suppress the confidence of skeleton keypoints in occluded regions, which guides the network to extract features from the visible parts of the person. The HPAM then uses the keypoint confidence maps output by the PE to extract the person's local features for feature alignment, which further reduces interference from non-person features. Experiments on occluded and partial (half-body) datasets show that the proposed method achieves better results than the other networks compared.
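The idea of pooling local features under keypoint confidence maps can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the nested-list tensor layout, and the threshold value of 0.2 are all hypothetical, and a simple confidence threshold stands in for the heatmap reconstruction performed by the PE. Heatmap values are assumed to be non-negative.

```python
from math import sqrt


def part_features(feature_map, heatmaps, conf_threshold=0.2):
    """Pool one feature vector per keypoint, weighted by its heatmap.

    feature_map: C x H x W nested lists; heatmaps: K x H x W nested lists.
    Returns (features, visible): K vectors of length C and K booleans.
    conf_threshold is an assumed value, not taken from the paper.
    """
    features, visible = [], []
    for heatmap in heatmaps:
        flat_weights = [w for row in heatmap for w in row]
        confidence = max(flat_weights)  # peak response of this keypoint
        total = sum(flat_weights)
        is_visible = confidence >= conf_threshold and total > 0
        visible.append(is_visible)
        if not is_visible:
            # Treat low-confidence keypoints as occluded: no feature is pooled.
            features.append([0.0] * len(feature_map))
            continue
        vec = []
        for channel in feature_map:
            flat_values = [v for row in channel for v in row]
            weighted = sum(w * v for w, v in zip(flat_weights, flat_values))
            vec.append(weighted / total)  # heatmap-weighted average
        features.append(vec)
    return features, visible


def shared_part_distance(feats_a, vis_a, feats_b, vis_b):
    """Mean Euclidean distance over keypoints visible in both images."""
    dists = [
        sqrt(sum((x - y) ** 2 for x, y in zip(fa, fb)))
        for fa, va, fb, vb in zip(feats_a, vis_a, feats_b, vis_b)
        if va and vb
    ]
    return sum(dists) / len(dists) if dists else float("inf")
```

Comparing two images only over keypoints that are visible in both keeps occluded parts from contributing to the distance, which is the intuition behind aligning local features by pose.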