To manage digital assets, improve operational efficiency, and advance the integration of informatization in the power sector, power companies need effective methods for organizing and managing data. This study proposes an efficient text type recognition model for power-industry data based on character-level features. In this model, the characters of power customer service text are passed through the pre-trained BERT model to generate dynamic character vectors. The character vector sequence is then fed into a bidirectional long short-term memory (BiLSTM) network combined with an attention mechanism, which captures the latent features in the text that help with type recognition. Finally, a Softmax layer outputs the type of the power text. The proposed model achieves an accuracy of 98.81% on a power customer service text data set, outperforming traditional neural network methods such as CNN and BiLSTM. It extends the application of the BERT model and effectively addresses the problem of long-distance semantic dependence in power text type recognition.
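The attention and Softmax stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the form of the attention mechanism, so a simple dot-product scoring against a hypothetical learned vector `query` is assumed here, and the toy `hidden_states` stand in for the BiLSTM outputs that would be computed from BERT character vectors. All function names, weights, and values are illustrative assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Pool a sequence of hidden states into one context vector.

    hidden_states: list of T vectors (stand-ins for BiLSTM outputs).
    query: hypothetical learned scoring vector (an assumption; the
           abstract does not give the attention formulation).
    Returns (context_vector, attention_weights).
    """
    # Score each time step by its dot product with the query vector.
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query))
              for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # Context vector is the attention-weighted sum of the hidden states.
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(dim)]
    return context, weights


def classify(context, class_weights, class_bias):
    """Linear layer followed by softmax, giving class probabilities."""
    logits = [sum(w_i * c_i for w_i, c_i in zip(w, context)) + b
              for w, b in zip(class_weights, class_bias)]
    return softmax(logits)


# Toy example: three time steps, two-dimensional hidden states.
hidden_states = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2]]
query = [1.0, 1.0]
context, weights = attention_pool(hidden_states, query)

# Two hypothetical text types with an identity-like weight matrix.
probs = classify(context, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

In this toy input the second time step receives the largest attention weight, so it dominates the context vector passed to the classifier. In a full model, `query` and the classifier weights would be learned jointly with the BiLSTM.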