Existing machine learning methods for remote sensing image scene classification often fail to extract image features quickly and effectively, which leads to inaccurate classification. To address this problem, we propose a remote sensing image scene classification method based on a residual attention network. With the residual network as the baseline model, attention modules are built along the channel and spatial dimensions. To classify the UC Merced Land-Use dataset effectively, the hyperparameters are tuned, the number of network layers is optimized, and the resulting model is fine-tuned. The results show that our method reaches an accuracy of 98.1% on this dataset, which compares favorably with classification based on a conventional convolutional neural network.
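To make the idea of attention along the channel and spatial dimensions concrete, the following is a minimal sketch in plain Python, not the network used in this work. It assumes a feature map stored as nested lists of shape (channels, height, width) and replaces the learned layers of a real attention module with simple sigmoid gates over average-pooled values. The function names `channel_attention`, `spatial_attention`, and `residual_attention_block` are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(v):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def channel_attention(x):
    """Scale each channel by a gate computed from its global average.

    Assumption: a trained module would pass the pooled values through
    learned layers; here the gate is just sigmoid(mean of the channel).
    """
    out = []
    for channel in x:
        values = [v for row in channel for v in row]
        gate = sigmoid(sum(values) / len(values))
        out.append([[v * gate for v in row] for row in channel])
    return out


def spatial_attention(x):
    """Scale each position by a gate computed from the mean over channels.

    Assumption: a trained module would apply a learned convolution to the
    pooled map; here the gate is just sigmoid(mean across channels).
    """
    n_channels = len(x)
    height, width = len(x[0]), len(x[0][0])
    gates = [
        [sigmoid(sum(x[c][i][j] for c in range(n_channels)) / n_channels)
         for j in range(width)]
        for i in range(height)
    ]
    return [
        [[x[c][i][j] * gates[i][j] for j in range(width)]
         for i in range(height)]
        for c in range(n_channels)
    ]


def residual_attention_block(x):
    """Add the attended features back onto the input (identity shortcut)."""
    attended = spatial_attention(channel_attention(x))
    return [
        [[a + b for a, b in zip(row_x, row_a)]
         for row_x, row_a in zip(ch_x, ch_a)]
        for ch_x, ch_a in zip(x, attended)
    ]
```

Because the identity shortcut adds the input back, the block preserves the shape of the feature map, and the attention gates can only reweight features rather than remove the original signal, which is the property that makes such modules easy to insert into a residual network.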