Abstract: Power production generates large volumes of power-related text data, most of which is unstructured. Organizing and managing these data effectively can help power companies build digital asset products and discover new sources of profit growth. To structure the text of regulations in the electric power industry, this study proposes a named entity recognition model based on character features and two-character phrase (bigram) features. In this model, the word embedding representation is obtained from the BERT pre-trained language model fused with multiple features; a Transformer with relative position encoding serves as the encoding layer, and a conditional random field serves as the decoding layer. Applied to entity type recognition, the proposed model recognizes entities effectively, with an accuracy of up to 92.64%.
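As an illustration of the character and bigram features mentioned above, the following minimal sketch pairs each character of a sentence with the bigram formed by that character and its successor. The function name `char_bigram_features`, the padding token `</s>`, and the choice of a forward-looking bigram are assumptions made for this example; the abstract does not specify how the bigram features are constructed.

```python
from typing import List, Tuple


def char_bigram_features(sentence: str, pad: str = "</s>") -> List[Tuple[str, str]]:
    """Return one (character, bigram) pair per character of the sentence.

    The bigram is the current character followed by the next one; the last
    character is paired with a padding token (an assumption of this sketch).
    """
    chars = list(sentence)
    features = []
    for i, ch in enumerate(chars):
        # Use the padding token when there is no following character.
        nxt = chars[i + 1] if i + 1 < len(chars) else pad
        features.append((ch, ch + nxt))
    return features


if __name__ == "__main__":
    # "变压器" (transformer, the electrical device) as a sample power-domain term.
    for char, bigram in char_bigram_features("变压器"):
        print(char, bigram)
```

In a full model, each character and bigram would presumably be mapped to an embedding and combined with the BERT representation before the encoding layer; that step is omitted here.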