Multi-task Hierarchical Fine-tuning Model Toward Machine Reading Comprehension
Authors: Ding Meirong, Liu Hongye, Xu Mayi, Gong Siyu, Chen Xiaomin, Zeng Biqing
Funding: National Natural Science Foundation of China (61876067); Special Project in Key Fields of Artificial Intelligence for Ordinary Universities of Guangdong Province (2019KZDZX1033); Special Fund for the Construction of the Guangdong Provincial Key Laboratory of Cyber-Physical Systems (2020B1212060069)

    Abstract:

    Machine reading comprehension and question answering have long been regarded as core problems in natural language understanding: given a passage and a question, a model must select the best answer. With the rise of pre-trained language models such as BERT, great breakthroughs have been achieved on many natural language processing (NLP) tasks, yet shortcomings remain on complex reading comprehension tasks. To address this, this paper proposes a machine reading comprehension model based on the retrospective reader. The model uses the pre-trained RoBERTa model to encode the question and passage, and divides the reading comprehension component into two modules: a word-level intensive reading module and a sentence-level comprehensive reading module. The two modules capture the semantic information of the passage and question at two different granularities, and their predictions are finally combined to output the answer with the highest probability. On the CAIL2020 dataset, the model reaches a joint F1 score of 66.15%, which is 5.38% higher than that of the RoBERTa baseline, and ablation experiments confirm the model's effectiveness.
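    The abstract's combination step — weighting a word-level answer span by the score of the sentence that contains it — can be sketched as follows. This is a minimal illustration under assumed inputs (token-level start/end logits from the intensive reading module, per-sentence support scores from the comprehensive reading module, and a token-to-sentence index map); the function name and the additive scoring rule are illustrative assumptions, not the paper's exact formulation:

```python
def combine_predictions(start_logits, end_logits, sent_scores,
                        token_to_sent, max_span_len=20):
    """Select the answer span (i, j) maximizing the word-level span score
    (start + end logits) plus the sentence-level support score of the
    sentence containing the span's start token."""
    best, best_score = (0, 0), float("-inf")
    n = len(start_logits)
    for i in range(n):
        for j in range(i, min(i + max_span_len, n)):
            # word-level (intensive reading) evidence for the span
            span_score = start_logits[i] + end_logits[j]
            # sentence-level (comprehensive reading) evidence
            sent_score = sent_scores[token_to_sent[i]]
            score = span_score + sent_score
            if score > best_score:
                best_score, best = score, (i, j)
    return best

# Toy example: token 1 has the strongest span logits and also lies in
# the higher-scored sentence, so the span (1, 1) wins.
print(combine_predictions([0.1, 2.0, 0.3], [0.0, 1.5, 0.2],
                          [0.0, 5.0], [0, 1, 1]))
```

In practice the two granularities can disagree (a strong span in a weakly supported sentence), and how the two scores are traded off is a design choice; the sketch above simply adds them.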

Cite this article:

Ding MR, Liu HY, Xu MY, Gong SY, Chen XM, Zeng BQ. Multi-task hierarchical fine-tuning model toward machine reading comprehension. Computer Systems & Applications, 2022, 31(3): 212–219.
分享
文章指标
  • 点击次数:727
  • 下载次数: 1754
  • HTML阅读次数: 1773
  • 引用次数: 0
History
  • Received: 2021-05-31
  • Revised: 2021-07-07
  • Published online: 2022-01-24