Abstract: This study proposes a semi-automatic method for quickly constructing a large-scale, high-quality Chinese face recognition dataset. Compared with existing dataset construction strategies, the method can build a large-scale Chinese celebrity face dataset quickly; the resulting dataset is named CCFace (Chinese Celebrities Face). It contains 506,874 face images of 431 persons, an average of about 1,176 images per person, covering different ages and poses. The dataset helps address the shortage of available Chinese face image datasets in the face recognition community. In the experimental section, the dataset is evaluated with various models, and the results show that it can serve as a training set for state-of-the-art (SOTA) models. The method and the dataset are expected to attract more researchers to face recognition and to promote face recognition applications in China.