Abstract:Traffic flow prediction is an important method for achieving urban traffic optimization in intelligent transportation systems. Accurate traffic flow prediction holds significant importance for traffic management and guidance. However, due to the high spatiotemporal dependence, the traffic flow exhibits complex nonlinear characteristics. Existing methods mainly consider the local spatiotemporal features of nodes in the road network, overlooking the long-term spatiotemporal characteristics of all nodes in the network. To fully explore the complex spatiotemporal dependencies in traffic flow data, this study proposes a Transformer-based traffic flow prediction model called multi-spatiotemporal self-attention Transformer (MSTTF). This model embeds temporal and spatial information through position encoding in the embedding layer and integrates various self-attention mechanisms, including adjacent spatial self-attention, similar spatial self-attention, temporal self-attention, and spatiotemporal self-attention, to uncover potential spatiotemporal dependencies in the data. The predictions are made in the output layer. The results demonstrate that the MSTTF model achieves an average reduction of 10.36% in MAE compared to the traditional spatiotemporal Transformer model. Particularly, when compared to the state-of-the-art PDFormer model, the MSTTF model achieves an average MAE reduction of 1.24%, indicating superior predictive performance.