Abstract: The length of strip steel in an annealing furnace varies with temperature, tension, and other factors, which changes roller speed, makes the weld position uncertain, and threatens production safety. To predict roller speed accurately, this study proposes the banded sparse Cauchy weight enhanced Transformer (BSCWEformer) model. The model adopts a banded sparse self-attention structure enhanced by Cauchy distribution weights calculated from relative positions, which increases the importance of adjacent positions in the input sequence and reduces the complexity of self-attention from quadratic to linear in the sequence length. In experiments on actual production data, the BSCWEformer model predicts grouped roller speed series more accurately than LogSparse Transformer, Transformer, RNMT+, and other models.
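The attention mechanism summarized above can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the function names `cauchy_weight` and `banded_cauchy_attention`, the parameters `bandwidth` and `gamma`, the symmetric band, and the choice to fold the Cauchy weight into the softmax by adding its logarithm to the scaled dot-product score. The model described in the paper may combine these pieces differently.

```python
import math


def cauchy_weight(offset, gamma=1.0):
    """Unnormalized Cauchy kernel of a relative position (assumed form)."""
    return 1.0 / (1.0 + (offset / gamma) ** 2)


def banded_cauchy_attention(q, k, v, bandwidth=2, gamma=1.0):
    """Banded self-attention with Cauchy re-weighting (illustrative only).

    q, k, v are lists of equal length n holding per-position vectors.
    Position i attends only to positions j with |i - j| <= bandwidth,
    so the cost is O(n * bandwidth) rather than O(n^2).
    """
    n = len(q)
    d = len(q[0])
    out = []
    for i in range(n):
        lo = max(0, i - bandwidth)
        hi = min(n, i + bandwidth + 1)
        band = range(lo, hi)

        # Scaled dot-product score plus the log of the Cauchy weight,
        # which multiplies the softmax numerator by that weight.
        logits = []
        for j in band:
            score = sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(d)
            logits.append(score + math.log(cauchy_weight(i - j, gamma)))

        # Numerically stable softmax over the band only.
        m = max(logits)
        exps = [math.exp(x - m) for x in logits]
        z = sum(exps)
        probs = [e / z for e in exps]

        # Weighted sum of the value vectors inside the band.
        out.append([
            sum(p * v[j][c] for p, j in zip(probs, band))
            for c in range(len(v[0]))
        ])
    return out
```

In this sketch, a smaller `gamma` concentrates attention more strongly on the nearest positions, while `bandwidth` caps how many keys each query inspects, which is what keeps the cost linear in the sequence length for a fixed band.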