Multi-task Hierarchical Fine-tuning Model Toward Machine Reading Comprehension
Abstract:

Machine reading comprehension and question answering have long been regarded as core problems in natural language understanding, requiring a model to select the best answer given a text and a question. The rise of pre-trained language models such as BERT has brought great breakthroughs in natural language processing (NLP) tasks, but shortcomings remain on complex reading comprehension tasks. To address this problem, this paper proposes a machine reading comprehension model based on the retrospective reader. The proposed model uses the pre-trained model RoBERTa to encode questions and passages and divides the reading comprehension stage into two modules: a word-level intensive reading module and a sentence-level comprehensive reading module. These two modules capture the semantic information in passages and questions at two different granularities, and their predictions are finally combined to produce the answer with the highest probability. The model improves accuracy on the CAIL2020 dataset, reaching a joint F1 of 66.15%, which is 5.38% higher than that of the RoBERTa model. The effectiveness of the model is confirmed by ablation experiments.
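
To make the two-granularity design above concrete, the following PyTorch sketch shows one way such a reader could be wired together. This is only an illustration under stated assumptions, not the paper's actual implementation: the class and attribute names (TwoGranularityReader, span_head, sent_head), the checkpoint name, and the fusion weight alpha are all hypothetical.

# Minimal sketch of a two-granularity reader, assuming a RoBERTa encoder
# feeding a word-level span head and a sentence-level evidence head.
# All names and the fusion scheme are illustrative assumptions; the
# paper's exact architecture may differ.
import torch
import torch.nn as nn
from transformers import AutoModel

class TwoGranularityReader(nn.Module):
    def __init__(self, model_name="hfl/chinese-roberta-wwm-ext", alpha=0.5):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        # Word-level "intensive reading" head: start/end logits per token.
        self.span_head = nn.Linear(hidden, 2)
        # Sentence-level "comprehensive reading" head: evidence score per sentence.
        self.sent_head = nn.Linear(hidden, 1)
        self.alpha = alpha  # fusion weight between the two granularities

    def forward(self, input_ids, attention_mask, sent_spans):
        # sent_spans: list of (start, end) token indices, one pair per sentence.
        h = self.encoder(input_ids=input_ids,
                         attention_mask=attention_mask).last_hidden_state
        start_logits, end_logits = self.span_head(h).split(1, dim=-1)
        start_logits = start_logits.squeeze(-1)  # (batch, seq_len)
        end_logits = end_logits.squeeze(-1)      # (batch, seq_len)

        # Mean-pool the tokens of each sentence, then score it as evidence.
        sent_reprs = torch.stack(
            [h[:, s:e].mean(dim=1) for (s, e) in sent_spans], dim=1)
        sent_logits = self.sent_head(sent_reprs).squeeze(-1)  # (batch, n_sent)
        return start_logits, end_logits, sent_logits

    def fuse(self, start_logits, end_logits, sent_logits, token_to_sent):
        # token_to_sent: (batch, seq_len) long tensor mapping each token to
        # its sentence index. The sentence evidence score acts as a bonus on
        # the token-level span logits, so the final answer prefers spans
        # inside well-supported sentences.
        sent_bonus = sent_logits.gather(1, token_to_sent)  # (batch, seq_len)
        return (start_logits + self.alpha * sent_bonus,
                end_logits + self.alpha * sent_bonus)

In this sketch, fusing the sentence-level evidence score into the token-level span logits mirrors the abstract's idea of combining the two modules' predictions to produce the single highest-probability answer; the additive bonus with a fixed alpha is just one plausible fusion choice.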

Get Citation

Ding MR, Liu HY, Xu MY, Gong SY, Chen XM, Zeng BQ. Multi-task hierarchical fine-tuning model toward machine reading comprehension. 计算机系统应用 (Computer Systems & Applications), 2022, 31(3): 212-219.
History
  • Received: May 31, 2021
  • Revised: July 07, 2021
  • Online: January 24, 2022