Abstract: In topic models for large-scale short-text corpora, the sampling cost grows with the number of topics K. To address this, the FBTM model is proposed, which reduces the sampling complexity from O(K) to O(1). Because short texts are brief and carry little descriptive information, this paper also proposes a short-text classification algorithm that combines same-topic biterms with FBTM. First, FBTM is used to model the text, and biterms that share a topic within a sliding window are added to the original text as extra features. Then, the FBTM topic distribution is used as another part of the text feature. The results show that this method significantly improves classification performance on a Weibo corpus.
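The feature-extension step can be illustrated with a minimal Python sketch. It is not the paper's implementation: FBTM itself is not shown, and the names `same_topic_biterms`, `extend_features`, `word_topic`, and `doc_topic_dist` are hypothetical. The sketch assumes that a trained model already supplies one topic per word and a topic distribution per document, which simplifies how a biterm topic model assigns topics.

```python
def same_topic_biterms(tokens, word_topic, window=3):
    """Collect word pairs that lie within `window` positions of each
    other and are assigned the same topic (assumed input mapping)."""
    biterms = []
    for i in range(len(tokens)):
        # Pair token i with the following tokens inside the window.
        for j in range(i + 1, min(i + window, len(tokens))):
            a, b = tokens[i], tokens[j]
            if a == b or a not in word_topic or b not in word_topic:
                continue
            if word_topic[a] == word_topic[b]:
                # Order-independent biterm token, e.g. "apple_phone".
                biterms.append("_".join(sorted((a, b))))
    return biterms


def extend_features(tokens, word_topic, doc_topic_dist, window=3):
    """Return (extended token list, topic-distribution features).

    The two parts would be vectorized and concatenated before being
    passed to a classifier; that step is omitted here."""
    extended = list(tokens) + same_topic_biterms(tokens, word_topic, window)
    return extended, list(doc_topic_dist)


# Toy example with made-up topic assignments.
tokens = ["apple", "phone", "price", "rain"]
word_topic = {"apple": 0, "phone": 0, "price": 1, "rain": 2}
extended, topic_features = extend_features(tokens, word_topic, [0.7, 0.2, 0.1])
```

In this toy example only "apple" and "phone" share a topic inside the window, so a single biterm token is appended to the original tokens.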