Multi-Document Automatic Summarization Based on DSC

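Author: LI Cheng-Guo (李成果)
Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100190, China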

Query-Focused Multi-Document Summarization Based on Dominant Sets Cluster


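Abstract (translated from Chinese):

The goal of multi-document summarization is to create, for a given query and a set of documents, a concise summary that conveys the key content of those documents and is relevant to the query. A document set usually covers several themes, each represented by a group of sentences, and a good summary should cover the most important of them. Most current methods build a model to score sentences and then select the highest-scoring sentences to form the summary. Unlike these methods, we focus on the themes of the text rather than on individual sentences, and treat summary generation as a problem of theme detection, ranking, and representation. We introduce dominant sets cluster (DSC) for the first time to detect themes, then build a model to assess the importance of each theme, and finally select sentences from the themes to form the summary, balancing representativeness against redundancy. Experiments on the standard DUC 2005, 2006, and 2007 data sets demonstrate the effectiveness of the method.

Keywords: multi-document automatic summarization; dominant sets cluster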
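
The theme-detection step can be illustrated with a small sketch. The abstract does not give the paper's similarity measure or its exact clustering procedure, so the code below is only an assumption-laden illustration: it uses the standard replicator-dynamics formulation of dominant sets with a peel-off loop, and the function names, the `cutoff` threshold, and the toy similarity matrix are all hypothetical.

```python
def dominant_set(A, tol=1e-9, max_iter=1000, cutoff=1e-4):
    """Find one dominant set of a symmetric, non-negative affinity matrix A
    (zero diagonal) using discrete replicator dynamics."""
    n = len(A)
    x = [1.0 / n] * n  # start from the barycenter of the simplex
    for _ in range(max_iter):
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        cohesion = sum(x[i] * Ax[i] for i in range(n))  # x^T A x
        if cohesion <= 0.0:  # no similarity at all, nothing to update
            break
        x_new = [x[i] * Ax[i] / cohesion for i in range(n)]
        shift = sum(abs(a - b) for a, b in zip(x_new, x))
        x = x_new
        if shift < tol:
            break
    top = max(x)
    # Members are the nodes that keep non-negligible weight (assumed cutoff).
    return [i for i in range(n) if x[i] > cutoff * top]


def dominant_sets_clustering(A):
    """Extract dominant sets one at a time, removing each from the graph."""
    remaining = list(range(len(A)))
    clusters = []
    while remaining:
        sub = [[A[i][j] for j in remaining] for i in remaining]
        if all(v == 0 for row in sub for v in row):
            # Leftover nodes share no similarity: keep them as singletons.
            clusters.extend([i] for i in remaining)
            break
        members = dominant_set(sub)
        clusters.append([remaining[k] for k in members])
        chosen = set(members)
        remaining = [node for k, node in enumerate(remaining) if k not in chosen]
    return clusters


# Toy sentence-similarity matrix (made up): sentences 0-2 share one theme,
# sentences 3-4 share another, and the two themes are unrelated.
S = [
    [0.0, 0.9, 0.9, 0.0, 0.0],
    [0.9, 0.0, 0.9, 0.0, 0.0],
    [0.9, 0.9, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.8],
    [0.0, 0.0, 0.0, 0.8, 0.0],
]
print(dominant_sets_clustering(S))  # prints [[0, 1, 2], [3, 4]]
```

One property worth noting in this formulation is that the number of clusters is not fixed in advance; it falls out of the similarity structure, which suits theme detection where the number of themes is unknown.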

Abstract:

Query-focused multi-document summarization aims to automatically create a brief statement that presents the main points of a given document set and is relevant to the query. A given document set usually contains several themes, each represented by a cluster of sentences, and a good summary should cover the most important themes. Most existing multi-document summarization methods use a sentence-ranking model to select the sentences that make up the summary. These methods either treat clustering as just one factor that influences sentence ranking or ignore it altogether. Because other factors also affect the ranking, the summaries they generate may miss some important themes. Unlike these methods, we work at the theme level rather than the sentence level and treat the task as a theme detection, ranking and representation (TDRR) problem. We introduce dominant sets cluster (DSC) to produce theme clusters, construct a model to rank the theme clusters, and select the most representative sentences with maximum information gain to form the summary. Experimental results on the open benchmark data sets from DUC05 to DUC07 show that the proposed approach is effective.

Keywords: multi-document summarization; dominant sets cluster; query-focused summarization
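
The ranking and representation steps described in the abstract might look roughly like the sketch below. The abstract does not specify the paper's ranking model or its information-gain criterion, so everything here is an assumption made for illustration: themes are scored by the summed bag-of-words cosine similarity of their sentences to the query, and one sentence per theme is chosen by mean similarity to its own cluster minus a penalty for overlap with sentences already selected. The names `summarize`, `max_themes`, and `redundancy_weight` are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def summarize(sentences, clusters, query, max_themes=2, redundancy_weight=0.5):
    """Rank theme clusters against the query, then pick one sentence per theme."""
    vecs = [Counter(s.lower().split()) for s in sentences]
    q = Counter(query.lower().split())

    # Assumed theme score: summed query similarity of the member sentences.
    def theme_score(cluster):
        return sum(cosine(vecs[i], q) for i in cluster)

    ranked = sorted(clusters, key=theme_score, reverse=True)
    chosen = []
    for cluster in ranked[:max_themes]:
        def gain(i):
            # Representativeness: mean similarity to the sentences of the theme.
            rep = sum(cosine(vecs[i], vecs[j]) for j in cluster) / len(cluster)
            # Redundancy: highest similarity to any sentence already selected.
            red = max((cosine(vecs[i], vecs[j]) for j in chosen), default=0.0)
            return rep - redundancy_weight * red

        chosen.append(max(cluster, key=gain))
    return [sentences[i] for i in chosen]


# Made-up sentences and theme clusters, purely for illustration.
sentences = [
    "solar power cuts energy costs",
    "solar power reduces energy costs for homes",
    "solar panels cut energy bills",
    "wind farms need large land areas",
    "wind farms need land",
    "large wind turbines are noisy",
]
clusters = [[3, 4, 5], [0, 1, 2]]
print(summarize(sentences, clusters, "solar energy costs"))
# prints ['solar power cuts energy costs', 'wind farms need large land areas']
```

Picking from ranked themes rather than from one global sentence ranking means each important theme contributes a sentence, which is the coverage property the abstract argues sentence-level ranking can lose.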

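Cite this article

LI Cheng-Guo. Query-Focused Multi-Document Summarization Based on Dominant Sets Cluster. Computer Systems & Applications (计算机系统应用), 2014, 23(7): 7-11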

History

Received: 2014-03-18
Last revised: 2014-04-14
Published online: 2014-08-15