Abstract: Speech recognition is advancing rapidly, and existing research shows that acoustic features carry complementary information that has not yet been fully exploited. This study proposes a trajectory-based spatio-temporal spectral method for speech emotion recognition. Its core idea is to extract spatial and temporal descriptors from the speech spectrum and use them to classify and recognize dimensional emotion. Experiments using exhaustive feature extraction show that the proposed method is more robust under noisy conditions than methods based on MFCCs and fundamental frequency. In four-class emotion recognition experiments, a comparison of unweighted average recall shows that the proposed method yields more accurate results. Voice activation detection also improves significantly.