Review on Information Extraction Techniques for Knowledge Graph

Authors: Jiang Lei, Liu Qi, Zhao Yijiang, Yuan Peng, Li Yuan, Zou Ziwei

Funding: General Program of the National Natural Science Foundation of China (41871320); Humanities and Social Sciences Planning Project of the Ministry of Education (17YJAZH032); Open Fund of the Innovation Platform of the Hunan Provincial Education Department (20K050)

Abstract:

How to extract useful information from explosively growing data has become a core problem in artificial intelligence research in the Internet age. The knowledge graph, an important approach to this problem, has become a central driving force in the development of artificial intelligence technology. Information extraction, the first step in constructing a knowledge graph, extracts structured entities and the relations between them from massive data. This survey discusses the development trends of information extraction for knowledge graphs, reviews entity extraction, relation extraction, and event extraction together with their key techniques, and analyzes and discusses current problems, challenges, and directions for future development.
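
As a concrete illustration of the entity-extraction step described in the abstract, here is a minimal sketch (not from the paper; the Hugging Face transformers library and the pretrained checkpoint "dslim/bert-base-NER" are illustrative assumptions) that tags named entities in raw text with a BERT-based model, in the spirit of the pre-trained language model approaches the survey reviews:

    # Minimal sketch of entity extraction with a pretrained BERT-based NER model.
    # The checkpoint "dslim/bert-base-NER" is an illustrative choice, not the
    # paper's own method.
    from transformers import pipeline

    ner = pipeline("ner", model="dslim/bert-base-NER",
                   aggregation_strategy="simple")  # merge word pieces into entity spans

    text = "Google introduced the Knowledge Graph in May 2012."
    for entity in ner(text):
        # Each result carries the entity type, the surface span, and a confidence score.
        print(entity["entity_group"], entity["word"], round(float(entity["score"]), 3))

In a full knowledge-graph pipeline, a relation-extraction step would then pair the recognized entities into (head entity, relation, tail entity) triples for insertion into the graph.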

Cite this article:

Jiang L, Liu Q, Zhao YJ, Yuan P, Li Y, Zou ZW. Review on information extraction techniques for knowledge graph. Computer Systems & Applications, 2022, 31(7): 46–54.
History:
  • Received: 2021-10-18
  • Revised: 2021-11-17
  • Published online: 2022-05-31