Abstract: A keyword recognition algorithm based on automatic audio segmentation and a deep neural network is proposed to meet the requirements of keyword recognition under low-resource or zero-resource conditions. First, an improved speech segmentation algorithm based on metric distance divides the continuous speech stream into isolated syllables, and each syllable is then subdivided into short audio segments that correspond to phoneme states. The resulting segments differ strongly from one another, while the feature variance within each segment is small. Next, an improved vector quantization method encodes the state features of the audio segments, yielding both a high-precision and a low-precision quantization coding of the words. Finally, the syllable is taken as the recognition unit, and its compressed state transition matrix serves as the overall feature of the syllable, which is fed into the deep neural network for speech recognition. Simulation results show that the algorithm can identify multiple specific keywords in a natural speech stream, and that it is easy to understand, simple to train, and offers improved robustness.
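
To make the last two steps concrete, the following is a minimal sketch of how quantized segment features could be turned into a state transition matrix for one syllable. It is an illustration under stated assumptions, not the method of this paper: the abstract does not specify the improved vector quantization or the compression of the matrix, so plain nearest-codeword quantization with Euclidean distance and row-normalized transition counts are used instead. The function names `quantize` and `transition_matrix`, the toy codebook, and the toy feature vectors are all hypothetical.

```python
import numpy as np


def quantize(frames, codebook):
    """Map each feature vector to the index of its nearest codeword.

    Assumption: plain Euclidean nearest-neighbor quantization, standing in
    for the improved vector quantization mentioned in the abstract.
    """
    # Pairwise distances, shape (n_frames, n_codewords).
    dists = np.linalg.norm(frames[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)


def transition_matrix(codes, n_states):
    """Row-normalized counts of transitions between consecutive codes."""
    counts = np.zeros((n_states, n_states))
    for a, b in zip(codes[:-1], codes[1:]):
        counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows with no outgoing transition stay all-zero.
    return np.divide(counts, row_sums, out=np.zeros_like(counts),
                     where=row_sums > 0)


# Hypothetical toy data: three codewords and four segment feature vectors.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
frames = np.array([[0.1, 0.0], [0.9, 1.1], [1.0, 0.8], [4.9, 5.2]])

codes = quantize(frames, codebook)
matrix = transition_matrix(codes, n_states=len(codebook))

# Flattened, the matrix is a fixed-length vector regardless of syllable
# duration, which is what makes it usable as a network input.
feature = matrix.flatten()
```

In this sketch the feature length depends only on the codebook size, so syllables of different durations map to inputs of the same dimension.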