Abstract: Query-focused multi-document summarization aims to automatically create a brief statement that presents the main points of a given document set and is relevant to the query. A document set usually contains several themes, each represented by a cluster of sentences, and a good summary should cover the most important ones. Most existing multi-document summarization methods use a sentence-ranking model to select the sentences that form the summary. These methods either treat the cluster as just one factor influencing sentence rank or ignore it altogether. Because other factors also affect the ranking, the summaries they generate may omit some important themes. In contrast, we focus on the theme level rather than the sentence level and treat the task as a theme detection, ranking, and representation (TDRR) problem. We introduce dominant sets clustering (DSC) to produce theme clusters, construct a model to rank the theme clusters, and select the most representative sentences with maximum information gain to form the summary. Experimental results on the open benchmark data sets DUC05 to DUC07 show that the proposed approach is effective.
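
To make the theme detection step more concrete, the following is a minimal sketch of how dominant sets could be peeled off a sentence similarity graph. The abstract does not specify the procedure, so everything here is an assumption: the sketch uses discrete replicator dynamics, a common way to extract dominant sets, and the function names (`dominant_set`, `dominant_set_clusters`), the parameters (`iters`, `tol`, `cutoff`), and the toy similarity matrix are hypothetical, not taken from the paper.

```python
def dominant_set(sim, active, iters=1000, tol=1e-8, cutoff=1e-4):
    """Extract one dominant set from the nodes listed in `active`.

    `sim` is a symmetric, non-negative similarity matrix with a zero
    diagonal (list of lists). Assumed procedure: replicator dynamics
    started from the uniform distribution over the active nodes.
    """
    n = len(active)
    x = [1.0 / n] * n
    for _ in range(iters):
        # ax[i] is the weighted similarity of node i to the current mixture.
        ax = [sum(sim[active[i]][active[j]] * x[j] for j in range(n))
              for i in range(n)]
        denom = sum(x[i] * ax[i] for i in range(n))
        if denom <= 0.0:
            # No similarity left among active nodes: emit a singleton.
            return [active[0]]
        new_x = [x[i] * ax[i] / denom for i in range(n)]
        delta = sum(abs(new_x[i] - x[i]) for i in range(n))
        x = new_x
        if delta < tol:
            break
    # Nodes that keep non-negligible weight form the dominant set.
    return [active[i] for i in range(n) if x[i] > cutoff]


def dominant_set_clusters(sim):
    """Repeatedly extract a dominant set and remove it from the graph."""
    remaining = list(range(len(sim)))
    clusters = []
    while remaining:
        members = dominant_set(sim, remaining)
        if not members:
            members = list(remaining)  # safety fallback, keeps the loop finite
        clusters.append(members)
        remaining = [i for i in remaining if i not in members]
    return clusters


# Toy example (made-up similarities): sentences 0-2 share one theme,
# sentences 3-4 share another.
sim = [
    [0.0, 0.9, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.9, 0.1, 0.1],
    [0.9, 0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.0, 0.8],
    [0.1, 0.1, 0.1, 0.8, 0.0],
]
clusters = dominant_set_clusters(sim)
print(clusters)
```

In this sketch the number of clusters is not fixed in advance; it falls out of the peeling loop. How the resulting theme clusters are ranked and how representative sentences are chosen from them is left out, since the abstract gives no detail on either step.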