Text Classification with Tensor Graph Convolutional Network Fusing BERT and Self-attention Mechanism

Authors: 史文艺, 朱欣娟

Fund Project: Key Research and Development Program of Shaanxi Province (2024GX-YBXM-548)
Abstract:

The TensorGCN model is one of the state-of-the-art (SOTA) graph neural network models for text classification. However, when processing textual semantic information, the long short-term memory (LSTM) network it relies on struggles to fully extract the semantic features of short texts and handles complex semantics poorly. In addition, because long texts contain many semantic and syntactic features, feature sharing remains incomplete when heterogeneous information is exchanged among graphs, which degrades classification accuracy. To address these two problems, this work improves the TensorGCN model and proposes BTSGCN, a text classification method based on a tensor graph convolutional network that fuses BERT and a self-attention mechanism. First, BERT replaces the LSTM module in the TensorGCN architecture for semantic feature extraction; by attending to the surrounding words on both sides of a given word, it captures inter-word dependencies and extracts the semantic features of short texts more accurately. Second, a self-attention mechanism is added to inter-graph propagation to help the model better capture features across the different graphs and complete feature fusion. Experimental results on the MR, R8, R52, and 20NG datasets show that BTSGCN achieves higher classification accuracy than the compared methods.
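Only the abstract is available on this page, so the two Python sketches below are illustrative rather than the authors' implementation. The first shows one plausible way to obtain BERT-based contextual word features with the HuggingFace transformers library as a stand-in for the LSTM-derived semantic features in TensorGCN; the helper name word_embeddings and the cosine-similarity edge weighting are assumptions for illustration, not necessarily the paper's exact semantic-graph construction.

import torch
from transformers import BertModel, BertTokenizer

# Downloads pre-trained BERT-base weights on first use.
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
bert = BertModel.from_pretrained('bert-base-uncased')
bert.eval()


@torch.no_grad()
def word_embeddings(sentence: str) -> torch.Tensor:
    """Return one contextual vector per word-piece token (768-dim for BERT-base).
    Because BERT attends to both sides of every token, each vector already
    reflects the surrounding words; cosine similarity between two vectors can
    then serve as a candidate semantic edge weight between word nodes
    (hypothetical choice, not taken from the paper)."""
    inputs = tokenizer(sentence, return_tensors='pt')
    outputs = bert(**inputs)
    return outputs.last_hidden_state.squeeze(0)  # (num_tokens, 768)


emb = word_embeddings('graph convolution captures word dependencies')
# Pairwise cosine similarities as candidate semantic edge weights.
sim = torch.nn.functional.cosine_similarity(emb.unsqueeze(1), emb.unsqueeze(0), dim=-1)
print(emb.shape, sim.shape)

The second sketch illustrates, under the same caveat, how self-attention could be inserted into inter-graph propagation in a tensor-GCN-style layer: each of the three graphs (semantic, syntactic, sequential) is first propagated with a standard GCN step, and every node then attends over its three graph-specific views to fuse them. The class name BTSGCNLayer and all argument names are hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F


class BTSGCNLayer(nn.Module):
    """One illustrative layer: intra-graph GCN propagation on each of the
    three graphs, followed by self-attention across the three graph views
    of every node (a sketch of inter-graph fusion, not the authors' code)."""

    def __init__(self, in_dim: int, out_dim: int, num_graphs: int = 3, heads: int = 1):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(num_graphs, in_dim, out_dim))
        nn.init.xavier_uniform_(self.weight)
        # Self-attention over the "graph view" axis of each node.
        self.attn = nn.MultiheadAttention(out_dim, num_heads=heads, batch_first=True)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x:   (num_graphs, num_nodes, in_dim) node features per graph;
        #      for the semantic graph these could be BERT embeddings.
        # adj: (num_graphs, num_nodes, num_nodes) normalized adjacency matrices.
        # Intra-graph propagation: A_g @ X_g @ W_g for each graph g.
        h = F.relu(torch.einsum('gnm,gmi,gio->gno', adj, x, self.weight))

        # Inter-graph propagation: each node attends over its three views.
        views = h.permute(1, 0, 2)            # (num_nodes, num_graphs, out_dim)
        fused, _ = self.attn(views, views, views)
        return fused.permute(1, 0, 2)         # back to (num_graphs, num_nodes, out_dim)


if __name__ == '__main__':
    g, n, d = 3, 10, 16
    x = torch.randn(g, n, d)                       # e.g. BERT-derived node features
    adj = torch.softmax(torch.randn(g, n, n), -1)  # stand-in normalized adjacency
    layer = BTSGCNLayer(in_dim=d, out_dim=32)
    print(layer(x, adj).shape)                     # torch.Size([3, 10, 32])

If, as in the original TensorGCN, inter-graph propagation links the same node across the three graphs with fixed weights, the attention here simply makes those cross-graph weights data-dependent; stacking several such layers before a classifier over document nodes would be a natural arrangement.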

Cite this article:

史文艺, 朱欣娟. Text Classification with Tensor Graph Convolutional Network Fusing BERT and Self-attention Mechanism. 计算机系统应用 (Computer Systems & Applications): 1-9.
History
  • Received: 2024-09-06
  • Revised: 2024-10-10
  • Published online: 2025-01-21