Back-Translation Plagiarism Detection Based on Large Language Models

    Abstract:

    As information technology advances, plagiarism has become more complex and harder to spot. Back-translation plagiarism, in which text is passed through translation tools, is one example, and it places higher demands on plagiarism detection methods. To address this, a plagiarism detection method based on prompt engineering is proposed. By designing prompts, the method guides a large language model (LLM) to attend to potential similarity between sentences at the semantic level, so that it can effectively identify content that is semantically highly similar. First, existing plagiarism detection techniques and applications of prompt engineering are reviewed, and on that basis a prompt-engineering-based detection process for back-translation plagiarism is designed. Second, a prompt template is designed that merges and condenses each pair of sentences under test, and the sentence compression ratio is proposed as a plagiarism detection metric. Finally, experiments show that, compared with traditional methods, the prompt-engineering-based method has a significant advantage in detecting back-translation plagiarism.
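
    The abstract does not give the prompt wording or the exact formula for the sentence compression ratio, so the following is only a minimal sketch of how the merge-and-condense step might be wired up. The template text, the ratio definition (merged length divided by the combined length of the pair, in whitespace tokens), the 0.6 threshold, and the names `MERGE_PROMPT`, `compression_ratio`, and `is_suspicious` are all assumptions for illustration, not the authors' implementation. The LLM is passed in as a plain callable so that no particular model API is implied.

```python
from typing import Callable

# Hypothetical prompt template; the paper's actual wording is not in the abstract.
MERGE_PROMPT = (
    "Merge the two sentences below into one sentence that keeps all of their "
    "information and states shared content only once.\n"
    "Sentence 1: {a}\nSentence 2: {b}\nMerged sentence:"
)


def compression_ratio(a: str, b: str, merged: str) -> float:
    """Assumed definition: merged length over the pair's combined length,
    counted in whitespace-separated tokens."""
    total = len(a.split()) + len(b.split())
    if total == 0:
        raise ValueError("both sentences are empty")
    return len(merged.split()) / total


def is_suspicious(
    a: str,
    b: str,
    llm: Callable[[str], str],
    threshold: float = 0.6,  # assumed cut-off, not taken from the paper
) -> bool:
    """Flag the pair when the LLM can merge it into a much shorter sentence."""
    merged = llm(MERGE_PROMPT.format(a=a, b=b))
    return compression_ratio(a, b, merged) <= threshold


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call: always returns a fixed merged sentence."""
    return "The cat sat on the mat."


if __name__ == "__main__":
    s1 = "The cat sat on the mat."
    s2 = "A cat was sitting on the mat."
    print(is_suspicious(s1, s2, fake_llm))
```

    The intuition behind this sketch is that two sentences with the same meaning collapse to roughly the length of one, pushing the ratio toward 0.5, while unrelated sentences cannot be condensed much and stay near 1.0. For Chinese text, whitespace tokens would need to be replaced by character or word-segment counts.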

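Cite this article

解勉, 陈刚, 余晓晗. Back-translation plagiarism detection based on large language models. 计算机系统应用 (Computer Systems & Applications), ,():1-10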

History
  • Received: 2024-08-06
  • Last revised: 2024-08-27
  • Published online: 2025-01-16