
Computer Systems Applications, 2014, 23(8): 163-167


Microblog Hot Topics Discovery Method Based on Probabilistic Topic Model

MI Wen-Li¹, SUN Yue-Xin²

(1. College of Information Engineering, Longdong University, Qingyang 745000, China; 2. College of Computer Science & Engineering, Northwest Normal University, Lanzhou 730070, China)


Received: December 18, 2013; Revised: January 14, 2014

Abstract: Microblog posts are short, spread in real time, have complex structure, and contain many deformed (non-standard) words. As a result, the traditional vector space model (VSM) text representation and latent semantic analysis (LSA) do not model them well. This paper proposes a two-stage clustering algorithm based on probabilistic latent semantic analysis (pLSA) and K-means clustering (Kmeans). It also defines a popularity measure for microblog posts and uses it to analyze and rank topics, which supports hot topic discovery. Experiments show that the method clusters topics effectively and detects hot topics.

Keywords: probabilistic latent semantic analysis; topic detection; microblog; Kmeans
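
The pipeline summarized in the abstract (pLSA, then Kmeans, then popularity ranking) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say what Kmeans is run on or how popularity is defined, so clustering the per-post topic distributions P(z|d) and scoring each cluster by summed reposts plus comments are both assumptions made here. The function names, the toy word-count matrix, and the `engagement` values are hypothetical, and the K-means seeding is a deterministic farthest-point simplification.

```python
import random

def normalize(row):
    """Scale a list of positive numbers so that it sums to 1."""
    total = sum(row)
    return [x / total for x in row]

def plsa(counts, n_topics, n_iter=100, seed=0):
    """Fit pLSA with EM; counts[d][w] is the count of word w in post d.
    Returns P(z|d): one topic distribution per post."""
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    p_z_d = [normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_topics)]
    for _ in range(n_iter):
        # Tiny floor keeps every probability strictly positive.
        acc_zd = [[1e-12] * n_topics for _ in range(n_docs)]
        acc_wz = [[1e-12] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                if counts[d][w] == 0:
                    continue
                # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
                post = normalize([p_z_d[d][z] * p_w_z[z][w]
                                  for z in range(n_topics)])
                for z in range(n_topics):
                    acc_zd[d][z] += counts[d][w] * post[z]
                    acc_wz[z][w] += counts[d][w] * post[z]
        # M-step: renormalize the expected counts.
        p_z_d = [normalize(row) for row in acc_zd]
        p_w_z = [normalize(row) for row in acc_wz]
    return p_z_d

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, n_iter=20):
    """Plain K-means with deterministic farthest-point seeding (assumption)."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def rank_topics(labels, engagement):
    """Order cluster ids by total engagement, hottest first (assumed score)."""
    heat = {}
    for lab, e in zip(labels, engagement):
        heat[lab] = heat.get(lab, 0) + e
    return sorted(heat, key=heat.get, reverse=True)

# Toy data: posts 0-2 use words 0-2, posts 3-5 use words 3-5.
counts = [
    [3, 2, 1, 0, 0, 0],
    [2, 3, 2, 0, 0, 0],
    [1, 2, 3, 0, 0, 0],
    [0, 0, 0, 3, 1, 2],
    [0, 0, 0, 2, 3, 1],
    [0, 0, 0, 1, 2, 3],
]
engagement = [5, 3, 4, 40, 25, 30]  # hypothetical reposts + comments per post

topic_mix = plsa(counts, n_topics=2)       # stage 1: pLSA topic distributions
labels = kmeans(topic_mix, k=2)            # stage 2: cluster posts by topic mix
ranking = rank_topics(labels, engagement)  # rank clusters by assumed popularity
print(labels, ranking)
```

Clustering in the low-dimensional topic space rather than on raw term vectors is one plausible reason to pair pLSA with Kmeans for short, noisy posts; the paper itself should be consulted for the actual features and popularity formula.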


Author Name	Affiliation
MI Wen-Li	College of Information Engineering, Longdong University, Qingyang 745000, China
SUN Yue-Xin	College of Computer Science & Engineering, Northwest Normal University, Lanzhou 730070, China

Citation:
MI Wen-Li, SUN Yue-Xin. Microblog Hot Topics Discovery Method Based on Probabilistic Topic Model. Computer Systems Applications, 2014, 23(8): 163-167