Information about railway accidents is recorded in the form of accident summary texts and is of great significance to railway safety work. However, for lack of effective information extraction methods, the railway accident knowledge scattered across these texts has not been fully utilized. Named entity recognition (NER) is an important subtask of information extraction, yet few studies have addressed NER in the accident domain. To address NER for railway accidents, an NER model that fuses character position features is proposed. The model obtains the position feature of each character through a fully connected neural network and merges it with the semantic-level character vector to form the final character representation, which is fed into a BiLSTM-CRF model to obtain the optimal label sequence. Experimental results show that, on NER for railway accident texts, the model achieves a precision, recall, and F1 value of 93.29%, 94.77%, and 94.02%, respectively. It outperforms traditional models and lays a foundation for constructing a railway accident knowledge graph.
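The fusion step described above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the position tags (B/M/E/S, the position of a character within a segmented word), the vector sizes, the tanh activation, the random toy weights, the helper names, and the reading of "merge" as concatenation. Plain Python is used only to keep the example self-contained; the BiLSTM-CRF stage that would consume the fused vectors is not shown.

```python
import math
import random

random.seed(0)  # reproducible toy weights

EMBED_DIM = 4   # assumed size of the semantic character vector
POS_DIM = 3     # assumed size of the learned position feature
# Assumed position tags: Begin / Middle / End of a word, or Single-character word.
POS_TAGS = ["B", "M", "E", "S"]


def random_matrix(rows, cols):
    """Toy stand-in for trained weights (hypothetical helper)."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def one_hot(index, size):
    """One-hot encode a position tag index."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def dense_tanh(x, weights, bias):
    """Fully connected layer: tanh(W x + b)."""
    return [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def fuse(chars, pos_tags, char_table, weights, bias):
    """Concatenate each character's semantic vector with its position feature."""
    fused = []
    for ch, tag in zip(chars, pos_tags):
        semantic = char_table[ch]
        tag_vector = one_hot(POS_TAGS.index(tag), len(POS_TAGS))
        position = dense_tanh(tag_vector, weights, bias)
        fused.append(semantic + position)  # list concatenation
    return fused


# Hypothetical input: "列车脱轨" (train derailment), segmented as 列车 / 脱轨.
chars = list("列车脱轨")
pos_tags = ["B", "E", "B", "E"]
# Toy character embeddings; dict.fromkeys keeps a deterministic order.
char_table = {ch: [random.uniform(-1, 1) for _ in range(EMBED_DIM)]
              for ch in dict.fromkeys(chars)}
W = random_matrix(POS_DIM, len(POS_TAGS))
b = [0.0] * POS_DIM

vectors = fuse(chars, pos_tags, char_table, W, b)
```

Under these assumptions, each entry of `vectors` has `EMBED_DIM + POS_DIM` components, and characters that share a position tag share the same position part. In a full model, the sequence of fused vectors would be the input to the BiLSTM-CRF tagger, and the dense layer's weights would be trained jointly with it rather than drawn at random.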