Discourse Elements Identification in Argumentative Essays Based on BiLSTM-Attention

doi:10.15888/j.cnki.csa.009842


Authors:

LIU Jia-Xu, Software College, Liaoning Technical University, Huludao 125105, China
BAI Zai-Ran, Software College, Liaoning Technical University, Huludao 125105, China
ZHANG Yan-Ju, College of Business Management, Liaoning Technical University, Huludao 125105, China

Fund project: Liaoning Provincial Social Science Planning Fund (L22BJY034)



Abstract:

Discourse element identification aims to identify discourse element units and classify them. To address the insufficient modeling of contextual dependencies in existing approaches, this study proposes a BiLSTM-Attention model that improves the accuracy of discourse element identification in argumentative essays. The model uses sentence structure and positional encoding to identify the relationships among sentence components, and a bidirectional long short-term memory (BiLSTM) network then captures deeper contextual information. An attention mechanism is introduced to refine the model's feature vectors and improve classification accuracy. Finally, inter-sentence multi-head self-attention captures the relationships among sentences in both content and structure, compensating for dependencies between distant sentences. Compared with baseline models such as HBiLSTM and BERT, under the same parameters and experimental conditions, accuracy improves by 1.3% on the Chinese dataset and 3.6% on the English dataset, which verifies the effectiveness of the model for discourse element identification.

Keywords: bidirectional long short-term memory (BiLSTM); attention mechanism; positional encoding; discourse element identification; multi-head attention
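The final stage described in the abstract, inter-sentence multi-head self-attention over position-aware sentence vectors, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a sinusoidal positional encoding (the abstract does not say which scheme is used), uses identity projections in place of learned query, key, and value matrices, and feeds in toy vectors where the real model would use BiLSTM outputs pooled by attention. All function names here are hypothetical.

```python
import math


def positional_encoding(position, dim):
    """Sinusoidal encoding of a sentence index (an assumed scheme, not from the paper)."""
    return [
        math.sin(position / 10000 ** (i / dim)) if i % 2 == 0
        else math.cos(position / 10000 ** ((i - 1) / dim))
        for i in range(dim)
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(vectors):
    """Scaled dot-product weights; row i shows how sentence i attends to every sentence."""
    scale = math.sqrt(len(vectors[0]))
    return [
        softmax([sum(q * k for q, k in zip(query, key)) / scale for key in vectors])
        for query in vectors
    ]


def self_attention(vectors):
    """Replace each sentence vector with the attention-weighted sum of all vectors."""
    dim = len(vectors[0])
    return [
        [sum(w * v[d] for w, v in zip(row, vectors)) for d in range(dim)]
        for row in attention_weights(vectors)
    ]


def multi_head_self_attention(vectors, num_heads):
    """Split features into equal slices, attend within each slice, then concatenate.

    Identity projections are an assumption made to keep the sketch short;
    a trained model would learn separate projections per head.
    """
    dim = len(vectors[0])
    assert dim % num_heads == 0, "feature size must divide evenly across heads"
    head_dim = dim // num_heads
    outputs = [[] for _ in vectors]
    for h in range(num_heads):
        start, end = h * head_dim, (h + 1) * head_dim
        head_out = self_attention([v[start:end] for v in vectors])
        for out, piece in zip(outputs, head_out):
            out.extend(piece)
    return outputs


# Toy sentence vectors standing in for BiLSTM + attention-pooled representations.
sentences = [[0.2, 0.1, 0.0, 0.5], [0.9, 0.3, 0.4, 0.1], [0.1, 0.8, 0.7, 0.2]]
with_position = [
    [x + p for x, p in zip(vec, positional_encoding(i, len(vec)))]
    for i, vec in enumerate(sentences)
]
contextual = multi_head_self_attention(with_position, num_heads=2)
```

Because every sentence attends to every other sentence in a single step, the first and last sentences of an essay interact as directly as adjacent ones, which is the property the abstract relies on to handle distant sentence dependencies.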


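Cite this article

LIU Jia-Xu, BAI Zai-Ran, ZHANG Yan-Ju. Discourse Elements Identification in Argumentative Essays Based on BiLSTM-Attention. 计算机系统应用 (Computer Systems & Applications),,():1-10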



History

Received: 2024-10-17
Last revised: 2024-11-19
Published online: 2025-02-28
