Abstract:To address the problems such as the inherent sparsity in the short text and the "lexical gap" of traditional classification model, using Word2Vec model to map words to a spatial vector of low-dimensional real number according to context semantic relations can effectively ease the sparse feature issue of short text. However, further study found that only using Word2Vec will ignore the influence of different parts of speech on the short text. Therefore, we introduce part of speech to improve the feature weighting approach, in which the contribution of speech is embedded into the traditional TF-IDF algorithm to calculate the weight of the words in the short text, and the vector of short text is generated by combining the word vector of Word2Vec. Finally, we use the SVM to achieve short text classification. Experimental results on Fudan University Chinese text classification corpus validate the effectiveness of the proposed method.