Construction and Application of Carbon Knowledge Base Driven by Large Language Model

Abstract:

Large language models (LLMs) demonstrate excellent capabilities in natural language understanding and generation, but in domain-specific, knowledge-intensive tasks they still face insufficient factual accuracy, difficulty in updating knowledge, and a shortage of high-quality domain datasets. Retrieval-augmented generation (RAG) has emerged as an effective way to address these challenges. However, when applied to knowledge-intensive tasks in the carbon domain, RAG still has shortcomings: query understanding is prone to bias, external knowledge retrieval strategies are rigid and one-dimensional, retrieved results are poorly relevant to actual needs, and no dedicated dataset exists for evaluating question-answering performance. To tackle these issues, this study proposes a Multi-pipeline-based RAG method that uses the graph-enhanced recursive intelligent merge retrieval proposed in this paper to effectively improve retrieval precision. To remedy the lack of domain-specific Q&A datasets, a method is proposed in which a large model automatically generates Q&A datasets from parent-node text. In addition to traditional metrics such as precision and recall, the text understanding capability of LLMs is used for four evaluations: (1) response-context-query relevance; (2) response-query relevance; (3) context-query relevance; and (4) faithfulness. In comparative experiments against BM25-based RAG, Vector-based RAG, and Recursive-based RAG, the Multi-pipeline-based RAG built on the GLM-4-Plus model achieves a precision of 85%, higher than the other methods.
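The abstract does not spell out the paper's graph-enhanced recursive intelligent merge retrieval, but the general shape of the comparison it reports can be sketched: multiple retrieval pipelines (e.g. a BM25-based and a vector-based one) each return a ranked list, the lists are merged, and the merged result is scored with retrieval precision and recall against a gold set. The sketch below is illustrative only; the merge strategy shown (reciprocal rank fusion) and the document ids are assumptions, not the paper's method.

```python
# Illustrative sketch: merging several retrieval pipelines and scoring the
# merged result. Reciprocal rank fusion (RRF) stands in for the paper's
# (unspecified here) merge strategy; doc ids and rankings are made up.

def rrf_merge(rankings, k=60, top_n=5):
    """Merge ranked lists of document ids with reciprocal rank fusion."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            # Each pipeline contributes 1/(k + rank) credit; documents that
            # rank well in several pipelines accumulate the highest score.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

def precision_recall(retrieved, relevant):
    """Retrieval precision/recall of retrieved ids against a gold set."""
    hit = len(set(retrieved) & set(relevant))
    precision = hit / len(retrieved) if retrieved else 0.0
    recall = hit / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical outputs of two pipelines (e.g. BM25-based and vector-based):
bm25_ranking = ["d3", "d1", "d7", "d2"]
vector_ranking = ["d1", "d4", "d3", "d9"]

merged = rrf_merge([bm25_ranking, vector_ranking], top_n=3)
p, r = precision_recall(merged, relevant={"d1", "d3", "d4"})
```

Here `d1` and `d3` appear near the top of both lists, so fusion promotes them above documents that only one pipeline found; the paper's LLM-based relevance and faithfulness evaluations would then judge the generated answer against this merged context.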

Cite this article

芦成飞 (Lu Chengfei). Construction and Application of Carbon Knowledge Base Driven by Large Language Model. 计算机系统应用 (Computer Systems & Applications), 2025, 34(12): 75-88


History
  • Received: 2025-05-12
  • Last revised: 2025-06-05
  • Published online: 2025-10-21
Copyright: Institute of Software, Chinese Academy of Sciences