Received: March 04, 2022; Revised: April 02, 2022
Abstract: Generating text adversarial samples is important for studying the vulnerability of deep learning-based natural language processing (NLP) systems and for improving the robustness of such systems. This paper studies a key step in word-level adversarial sample generation, namely the search for replacement words. To address the premature convergence and poor effectiveness of existing search algorithms, it proposes a text adversarial sample generation method based on an improved artificial bee colony (ABC) search algorithm. First, the search space for the words to be replaced is obtained by filtering on the sememe annotations of words in the HowNet database. Then, the improved ABC algorithm searches for and locates replacement words to generate high-quality text adversarial samples. Attack tests are conducted on two text classification datasets against current mainstream text classification models based on deep neural networks (DNNs). The results show that, compared with existing text adversarial sample generation methods, the proposed method misleads text classification systems with a higher attack success rate and better preserves semantic and grammatical correctness.
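The two-stage idea in the abstract (restrict each word to a set of candidate replacements, then search over combinations with a bee colony) can be illustrated with a minimal sketch. This is a textbook-style discrete ABC loop, not the improved variant proposed in the paper, whose details the abstract does not give. The function name `abc_substitution_search`, the candidate dictionary, and the scoring function `toy_score` are all hypothetical stand-ins: in practice the candidates would come from HowNet sememe filtering and the score from a victim classifier.

```python
import random

def abc_substitution_search(words, candidates, score,
                            n_bees=8, n_iters=30, limit=5, seed=0):
    """Generic discrete ABC search (illustrative, not the paper's variant).

    words:      original tokens
    candidates: dict {position: [replacement words]} (assumed precomputed)
    score:      callable(tokens) -> number to maximise (assumed attack score)
    """
    rng = random.Random(seed)
    positions = [i for i in candidates if candidates[i]]
    if not positions:                      # nothing can be replaced
        return list(words), score(list(words))

    def mutate(sentence):
        # Neighbour solution: re-pick the word at one replaceable position.
        new = list(sentence)
        pos = rng.choice(positions)
        new[pos] = rng.choice(candidates[pos] + [words[pos]])
        return new

    sources = [mutate(words) for _ in range(n_bees)]   # food sources
    fits = [score(s) for s in sources]
    trials = [0] * n_bees                              # stagnation counters
    top = max(range(n_bees), key=fits.__getitem__)
    best, best_fit = list(sources[top]), fits[top]

    def try_improve(i):
        nonlocal best, best_fit
        cand = mutate(sources[i])
        f = score(cand)
        if f > fits[i]:                                # greedy replacement
            sources[i], fits[i], trials[i] = cand, f, 0
            if f > best_fit:
                best, best_fit = list(cand), f
        else:
            trials[i] += 1

    for _ in range(n_iters):
        for i in range(n_bees):                        # employed bees
            try_improve(i)
        low = min(fits)                                # onlooker bees pick
        weights = [f - low + 1e-9 for f in fits]       # sources by fitness
        for _ in range(n_bees):
            try_improve(rng.choices(range(n_bees), weights=weights)[0])
        for i in range(n_bees):                        # scouts abandon
            if trials[i] > limit:                      # stagnant sources
                sources[i] = mutate(words)
                fits[i], trials[i] = score(sources[i]), 0
                if fits[i] > best_fit:
                    best, best_fit = list(sources[i]), fits[i]
    return best, best_fit

# Made-up toy data: the "victim" is assumed sensitive to two words,
# and each changed word costs a small penalty.
ORIGINAL = "the film was good and fun".split()
CANDIDATES = {3: ["fine", "decent"], 5: ["amusing", "enjoyable"]}
TRIGGERS = {"decent", "amusing"}

def toy_score(tokens):
    hits = sum(t in TRIGGERS for t in tokens)
    changed = sum(a != b for a, b in zip(tokens, ORIGINAL))
    return hits - 0.1 * changed

if __name__ == "__main__":
    print(abc_substitution_search(ORIGINAL, CANDIDATES, toy_score))
```

The scout phase, which restarts food sources that have stopped improving, is the standard ABC mechanism against stagnation; the premature convergence mentioned in the abstract is the kind of problem such restarts are meant to reduce, though how the paper modifies the algorithm is not stated here.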
Funding: Science Fund for Creative Research Groups of the National Natural Science Foundation of China (61521003)
Citation:
YANG Fan, LI Shao-Mei, JIN Ke-Jun. Text Adversarial Samples Generation Based on Improved Artificial Bee Colony Algorithm. Computer Systems Applications, 2022, 31(11): 238-245