Abstract: In traffic prediction, deep learning-based spatio-temporal separation modeling methods have difficulty effectively expressing the coupled spatio-temporal correlations in the data. Although spatio-temporal joint modeling methods can compensate for this shortcoming to some extent, they suffer from deficiencies such as insufficient expressive ability and high computational complexity in constructing spatio-temporal hypergraphs. To address these issues, this study proposes an improved spatio-temporal joint modeling method, the window spatial-temporal attention network (W-STANet). W-STANet comprises three parts: a data embedding layer, a spatio-temporal correlation modeling layer, and a prediction head. The spatio-temporal correlation modeling layer learns spatio-temporal correlation features of traffic data by stacking multiple spatio-temporal attention blocks. By introducing local window computation together with data shifting and permutation operations, the method greatly reduces the computational complexity of the modeling process while modeling the spatio-temporal graph from both local and global perspectives. Experimental results on five public real-world traffic datasets show that W-STANet achieves better prediction performance than other spatio-temporal joint modeling methods. Compared with spatio-temporal separation modeling methods, it achieves better prediction performance on large-scale road network datasets.
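To make the complexity argument concrete, the following is a minimal sketch of local window computation and data shifting, not the W-STANet implementation. It assumes the traffic data can be laid out as a grid of tokens indexed by (node, time step), that the node axis has some fixed ordering, and that windows are non-overlapping rectangles. All function names and the window shape are hypothetical, and the permutation operation mentioned in the abstract is not modeled here.

```python
def full_attention_pairs(num_nodes, num_steps):
    """Query-key pairs when every token attends to every token."""
    tokens = num_nodes * num_steps
    return tokens * tokens


def windowed_attention_pairs(num_nodes, num_steps, win_nodes, win_steps):
    """Query-key pairs when attention is restricted to non-overlapping windows."""
    # Assumption: both axes divide evenly into windows.
    assert num_nodes % win_nodes == 0 and num_steps % win_steps == 0
    num_windows = (num_nodes // win_nodes) * (num_steps // win_steps)
    tokens_per_window = win_nodes * win_steps
    return num_windows * tokens_per_window ** 2


def cyclic_shift(grid, shift_nodes, shift_steps):
    """Roll a [node][step] grid so the next window layout straddles old borders."""
    n, t = len(grid), len(grid[0])
    return [[grid[(i + shift_nodes) % n][(j + shift_steps) % t] for j in range(t)]
            for i in range(n)]


def partition_windows(grid, win_nodes, win_steps):
    """Split a [node][step] grid into flat token lists, one list per window."""
    n, t = len(grid), len(grid[0])
    windows = []
    for i0 in range(0, n, win_nodes):
        for j0 in range(0, t, win_steps):
            windows.append([grid[i][j]
                            for i in range(i0, i0 + win_nodes)
                            for j in range(j0, j0 + win_steps)])
    return windows


# Toy grid: each token is labeled by its (node, step) index.
grid = [[(i, j) for j in range(4)] for i in range(4)]
first_plain = partition_windows(grid, 2, 2)[0]
first_shifted = partition_windows(cyclic_shift(grid, 1, 1), 2, 2)[0]
```

Under these assumptions, a grid of 8 nodes and 12 time steps needs 9216 query-key pairs with full attention but 1536 with 4 x 4 windows. In the toy grid, the first window after shifting groups tokens that previously belonged to four different windows, which is how alternating plain and shifted layouts lets information spread beyond a single local window.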