Abstract:To solve the problem that it is difficult for neural networks to obtain enough information to correctly classify images by using a small amount of labeled data, this study proposes a new relational network, SDM-RNET, which combines random deep network and multi-scale convolution. First, a stochastic deep network is introduced into the model embedding module to deepen the model depth. Then, in the feature extraction stage, multi-scale depth-separable convolution is adopted to replace ordinary convolution for feature fusion. After the backbone network, deep and shallow layer feature fusion is applied to obtain richer image features and finally learn to predict the categories of images. Compared with other small sample image classification methods on mini-ImageNet, RP2K, and Omniglot datasets, the results show that the proposed method has the highest accuracy on 5-way 1-shot and 5-way 5-shot classification tasks.