Abstract: Mobile phone text messaging has become an increasingly important means of daily communication, so identifying spam messages is of considerable practical significance. A self-attention-based Bi-LSTM neural network model combined with TF-IDF is proposed for this purpose. The model first feeds the short message into the Bi-LSTM layer in vector form; after feature extraction, the information from the TF-IDF and self-attention layers is combined to obtain the final feature vector. Finally, a Softmax classifier assigns the feature vector to a class to produce the classification result. The experimental results show that, compared with traditional classification models, the self-attention-based Bi-LSTM model combined with TF-IDF improves the accuracy of text recognition by 2.1%–4.6% and reduces the running time by 0.6 s–10.2 s.
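As a rough illustration of two of the building blocks named above, the following minimal sketch computes term weights with one common TF-IDF variant and turns class scores into probabilities with Softmax. The function names `tfidf` and `softmax`, the toy messages, and the example scores are assumptions made for illustration only; the sketch does not reproduce the Bi-LSTM or self-attention layers, nor the way the proposed model combines these components.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized, non-empty document.

    Uses tf = count / document length and idf = log(N / df), which is
    only one of several common TF-IDF variants (an assumption here).
    """
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n / df[term])
            for term, count in tf.items()
        })
    return weights


def softmax(scores):
    """Numerically stable Softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy, made-up messages (hypothetical; not taken from any dataset).
docs = [
    "win a free prize now".split(),
    "see you at lunch now".split(),
]
weights = tfidf(docs)

# Hypothetical scores for the classes (spam, not spam).
probs = softmax([2.0, 0.5])
```

In this toy corpus a term that occurs in every message, such as "now", receives zero weight, while a term confined to one message, such as "free", receives a positive weight; this is the kind of term-importance signal that TF-IDF can contribute alongside learned features.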