Counterfactual Enhanced Adversarial Learning for Sequential Recommendation

Fund Project: National Natural Science Foundation of China (61672490, 61602436); Key Project of International Cooperation of the Chinese Academy of Sciences (241711KYSB20180002); Sub-project of the National Major Research and Development Program of China (2022YFC3320900)

    Abstract:

    Recently, reinforcement learning has been applied successfully to sequential recommender systems, because it can learn effective recommendation policies from users' long-term feedback signals. However, the reward functions of these models suffer from low discriminability, which limits the model's ability to learn the value differences between different user feedback signals and leaves the recommendation policy suboptimal. Existing work mainly maintains reward discriminability by tuning the discount factor, but this relies on expert prior knowledge and lacks a theoretical foundation. To design the reward function in a more principled way and improve its discriminability, this study analyzes the recommender system from a causal perspective and proposes CAL4Rec, a sequential recommendation algorithm based on counterfactual discriminability enhancement. First, the method describes the sequential recommendation process with a structural causal graph and uses that graph to define a causally identifiable discriminability of value rewards. Second, it optimizes the recommendation policy network through a counterfactual generative adversarial self-supervised learning process, so as to learn users' true preferences. Extensive comparison and ablation experiments on a series of sequential recommendation benchmark datasets show that the improvement brought by CAL4Rec holds across multiple network architectures (2.34% on average).
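The motivation stated in the abstract, that reward discriminability hinges on a hand-tuned discount factor (also called a decay factor), can be illustrated with a small sketch. This is not CAL4Rec and not the paper's definition of discriminability: the reward values, the two feedback sequences, and the helper names `discounted_return` and `return_gap` are all assumptions made for illustration. The "gap" here is simply the absolute difference between the discounted returns of two feedback sequences.

```python
def discounted_return(rewards, gamma):
    """Standard discounted return: sum of gamma**t * r_t over the sequence."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def return_gap(seq_a, seq_b, gamma):
    """Illustrative stand-in for discriminability: |G(seq_a) - G(seq_b)|."""
    return abs(discounted_return(seq_a, gamma) - discounted_return(seq_b, gamma))


# Hypothetical per-step rewards: 1.0 = purchase, 0.2 = click.
early_purchase = [1.0, 0.2, 0.2, 0.2]
late_purchase = [0.2, 0.2, 0.2, 1.0]

# The two sequences differ only in when the purchase happens,
# so the gap equals 0.8 * (1 - gamma**3) and vanishes as gamma approaches 1.
for gamma in (0.5, 0.9, 1.0):
    print(gamma, return_gap(early_purchase, late_purchase, gamma))
```

Under these assumptions the gap is controlled entirely by the choice of `gamma`, which is the kind of dependence on manual tuning that the abstract describes as lacking a theoretical foundation.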

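Cite this article

Liu Jialin (刘珈麟), He Zeyu (贺泽宇), Li Jun (李俊). Counterfactual Enhanced Adversarial Learning for Sequential Recommendation. Computer Systems & Applications (计算机系统应用), 2024, 33(4): 235-245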

History
  • Received: 2023-10-27
  • Last revised: 2023-11-27
  • Published online: 2024-03-07