Abstract: Real-time interception of user chat content in live-broadcast systems is of great practical importance. To improve the accuracy and efficiency of classification, we propose a text classification model that combines Doc2Vec and SVM to classify chat content and decide whether it should be intercepted. First, the Doc2Vec model represents each chat message as a dense numeric vector; then an SVM classifier assigns the message to a class. The experimental results show that the model greatly reduces the dimensionality of the text representation while remaining efficient, and that it achieves high accuracy (97%) and recall (89.82%), which are superior to those of Naive Bayes and logistic regression based on Doc2Vec.