Abstract: Traditional word co-occurrence detection methods for microblog news suffer from high computational complexity, long running time, low recall, and low precision. To address these problems, this paper proposes an improved word co-occurrence detection algorithm based on rough set theory. The algorithm builds a word co-occurrence matrix from word co-occurrence relations, uses this matrix to find the maximum complete subgraph (maximum clique), which serves as the topic cluster center, and finally identifies the keyword set of each topic using rough set theory. Experiments on the NLPIR microblog content corpus and on a microblog data set collected in real time show that the method can effectively detect news topics in massive microblog information and supports news topic tracking.
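The three stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the co-occurrence threshold, the clique search, or the rough set relation, so the sketch assumes a hypothetical `min_count` threshold, a plain Bron-Kerbosch enumeration of maximal cliques, and a tolerance relation in which each word's tolerance class is the word itself plus its neighbours in the co-occurrence graph. The toy posts and all function names are invented for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(posts):
    """Count, for each unordered word pair, how many posts contain both words."""
    counts = Counter()
    for words in posts:
        for a, b in combinations(sorted(set(words)), 2):
            counts[(a, b)] += 1
    return counts


def build_graph(counts, min_count):
    """Link two words when they co-occur in at least min_count posts (assumed threshold)."""
    graph = {}
    for (a, b), n in counts.items():
        if n >= min_count:
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    return graph


def maximal_cliques(graph, r=frozenset(), p=None, x=None):
    """Enumerate maximal complete subgraphs (Bron-Kerbosch, no pivoting)."""
    if p is None:
        p, x = set(graph), set()
    if not p and not x:
        yield r
    for v in list(p):
        yield from maximal_cliques(graph, r | {v}, p & graph[v], x & graph[v])
        p.remove(v)
        x.add(v)


def approximations(graph, center):
    """Lower and upper approximations of `center` under the assumed tolerance relation."""
    lower, upper = set(), set()
    for word, neighbours in graph.items():
        tolerance_class = neighbours | {word}
        if tolerance_class <= center:
            lower.add(word)   # word certainly belongs to the topic
        if tolerance_class & center:
            upper.add(word)   # word possibly belongs to the topic
    return lower, upper


# Toy, already-segmented posts (illustrative data only).
posts = [
    ["earthquake", "rescue", "sichuan"],
    ["earthquake", "rescue", "sichuan", "donation"],
    ["earthquake", "sichuan", "rescue"],
    ["donation", "charity"],
    ["donation", "charity", "rescue"],
]

graph = build_graph(cooccurrence_counts(posts), min_count=2)
center = max(maximal_cliques(graph), key=len)  # largest clique as topic center
lower, upper = approximations(graph, center)
print(sorted(center), sorted(lower), sorted(upper))
```

In this sketch the upper approximation plays the role of a candidate topic keyword set, while the lower approximation keeps only words whose strong co-occurrences all stay inside the cluster center. Exhaustive clique enumeration is exponential in the worst case, so a real microblog vocabulary would need pruning or a heuristic search.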