Abstract: High-resolution remote sensing images contain rich geographic information. At present, semantic segmentation models based on traditional neural networks cannot effectively extract the features of small and medium-sized objects in remote sensing images, which results in a high segmentation error rate. This study proposes a method that improves the DeconvNet network model by connecting the features of the encoder and decoder structures. During encoding, the model records the locations of the max-pooling indices and applies them in the corresponding unpooling layers, which preserves spatial structure information. During decoding, the model extracts features effectively by connecting the corresponding feature layers of the encoder and decoder. During model training, the designed pre-training model effectively expands the data and thereby alleviates over-fitting. The experimental results show that, with proper adjustment of the optimizer, learning rate, and loss function, and with the extended dataset used for training, the semantic segmentation accuracy on the remote sensing validation dataset is about 95%, a significant improvement over the DeconvNet and UNet network models.
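To illustrate the two mechanisms named above (reusing pooling indices for unpooling, and connecting encoder and decoder features), the following is a minimal pure-Python sketch on a single 4x4 feature map. It is not the implementation used in this study: the function names `max_pool_with_indices`, `max_unpool`, and `concat_channels`, the 2x2 window, and the toy input are all assumptions made for illustration, and a real model would operate on multi-channel tensors in a deep learning framework.

```python
# Illustrative sketch only: all names and sizes here are hypothetical.

def max_pool_with_indices(x, k=2):
    """k x k max pooling with stride k on a 2-D list.

    Returns the pooled map and, for each output cell, the flat index
    (row * width + col) of the maximum in the input map.
    """
    h, w = len(x), len(x[0])
    pooled, indices = [], []
    for i in range(0, h - k + 1, k):
        pooled_row, index_row = [], []
        for j in range(0, w - k + 1, k):
            best_val, best_idx = None, None
            for di in range(k):
                for dj in range(k):
                    v = x[i + di][j + dj]
                    if best_val is None or v > best_val:
                        best_val = v
                        best_idx = (i + di) * w + (j + dj)
            pooled_row.append(best_val)
            index_row.append(best_idx)
        pooled.append(pooled_row)
        indices.append(index_row)
    return pooled, indices


def max_unpool(pooled, indices, out_h, out_w):
    """Place each pooled value back at its recorded location; zeros elsewhere."""
    out = [[0] * out_w for _ in range(out_h)]
    for pooled_row, index_row in zip(pooled, indices):
        for v, idx in zip(pooled_row, index_row):
            out[idx // out_w][idx % out_w] = v
    return out


def concat_channels(encoder_maps, decoder_maps):
    """Connect encoder and decoder features by stacking them as channels."""
    return list(encoder_maps) + list(decoder_maps)


# Toy encoder feature map (assumed values).
x = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [6, 2, 3, 1],
]

pooled, indices = max_pool_with_indices(x)      # encoder: pool and record indices
restored = max_unpool(pooled, indices, 4, 4)    # decoder: unpool with the same indices
fused = concat_channels([x], [restored])        # connect encoder and decoder features
```

In this sketch the unpooled map is sparse, with each maximum returned to its original position, which is how the spatial structure survives the downsampling step. Stacking the encoder map alongside it gives the following decoder layer access to the fine detail that pooling discarded.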