Review on Information Extraction Techniques for Knowledge Graph

Authors: Jiang Lei, Liu Qi, Zhao Yijiang, Yuan Peng, Li Yuan, Zou Ziwei

Funding: General Program of the National Natural Science Foundation of China (41871320); Humanities and Social Sciences Planning Project of the Ministry of Education (17YJAZH032); Open Fund of the Innovation Platform of the Hunan Provincial Education Department (20K050)

Abstract:

How to extract useful information from explosively growing data has become a core problem in artificial intelligence research in the Internet age. The knowledge graph, an important approach to this problem, has become a central driving force in the development of artificial intelligence technology. Information extraction, the first step in constructing a knowledge graph, extracts structured entities and the relations between them from massive data. This survey discusses the development trends of information extraction for knowledge graphs, reviews entity extraction, relation extraction, and event extraction together with their key techniques, and analyzes and discusses current problems, challenges, and directions for future development.
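
As a concrete illustration of the entity-extraction step described in the abstract, here is a minimal sketch (not from the paper; the Hugging Face transformers library and the pretrained checkpoint "dslim/bert-base-NER" are illustrative assumptions) that tags named entities in raw text with a BERT-based model, in the spirit of the pre-trained language model approaches the survey reviews:

    # Minimal sketch of entity extraction with a pretrained BERT-based NER model.
    # The checkpoint "dslim/bert-base-NER" is an illustrative choice, not the
    # paper's own method.
    from transformers import pipeline

    ner = pipeline("ner", model="dslim/bert-base-NER",
                   aggregation_strategy="simple")  # merge word pieces into entity spans

    text = "Google introduced the Knowledge Graph in May 2012."
    for entity in ner(text):
        # Each result carries the entity type, the surface span, and a confidence score.
        print(entity["entity_group"], entity["word"], round(float(entity["score"]), 3))

In a full knowledge-graph pipeline, a relation-extraction step would then pair the recognized entities into (head entity, relation, tail entity) triples for insertion into the graph.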

Cite this article:

Jiang L, Liu Q, Zhao YJ, Yuan P, Li Y, Zou ZW. Review on information extraction techniques for knowledge graph. Computer Systems & Applications, 2022, 31(7): 46–54.
History:
  • Received: 2021-10-18
  • Revised: 2021-11-17
  • Published online: 2022-05-31