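Overview on Knowledge Graph Question Answering
Authors: 郑泳智, 朱定局, 吴惠粦, 彭小荣
Funding:

Special Project of the China Association of Higher Education (2020JXD01); Special Project in the Key Field of Artificial Intelligence for Guangdong Regular Universities (2019KZDZX1027); Provincial Key Platform and Major Scientific Research Project of Guangdong Universities (2017KTSCX048); Guangdong Public Welfare Research and Capacity Building Project (2018B070714018); Research Project of the Traditional Chinese Medicine Bureau of Guangdong Province (20191411); Guangdong Higher Education Industry College Construction Project (Industry College of Artificial Intelligence and Robotics Education)
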
    Abstract:

    With the development of knowledge graphs in recent years, automatically answering natural language questions from given knowledge graph data has become an active research topic, and question answering (QA) systems such as Siri and Xiao Ai (小爱同学) are already in wide use. The introduction of deep learning has brought breakthroughs in the field's subtasks, but difficulties such as multi-hop reasoning and strategy combination remain to be overcome. Taking the mainstream construction methods as its starting point, this paper summarizes the current state of research and the challenges the field faces. The survey is intended to help researchers work in this field efficiently and to help researchers in different industries develop domain-specific QA systems that improve productivity.

Cite this article

郑泳智, 朱定局, 吴惠粦, 彭小荣. Overview on Knowledge Graph Question Answering. Computer Systems & Applications (计算机系统应用), 2022, 31(4): 1-13

History
  • Received: 2021-06-10
  • Last revised: 2021-07-14
  • Published online: 2022-03-22