Abstract: Few-shot image classification aims to learn a classifier from only a few labeled examples. Although existing methods have made significant progress, extracting useful features and classifying images accurately remains difficult: training samples are scarce, intra-class variance is large, and inter-class variance is small, which together cause confusion between support and query samples. To address these issues, this study proposes a novel multi-embedding enhanced network. This lightweight and efficient network represents each image as a set of feature embeddings rather than relying on a single image-level feature. It can generate various hierarchical structures to learn richer feature representations, thereby reducing intra-class variance and increasing inter-class variance. In addition, the study proposes a set-based metric combined with a dynamic self-adaptive weighting mechanism to measure the similarity between query and support sets. Experimental results demonstrate the strong performance of the proposed model on the miniImageNet, tieredImageNet, and CUB datasets. In the 1-shot setting with a ResNet-12 backbone, the model achieves accuracies of 72.22%, 75.43%, and 85.02%, respectively, outperforming the baseline models by 1.09%, 2.93%, and 1.47%.
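
To make the idea of a set-based metric with adaptive weighting concrete, the following is a minimal illustrative sketch and not the formulation used in this study. It assumes each image is already represented by a small set of embedding vectors, scores every query-support embedding pair with cosine similarity, and weights the pairs with a softmax so that closer matches contribute more. The function names (`cosine`, `softmax`, `set_similarity`, `classify`), the `temperature` parameter, and the toy two-dimensional embeddings are all hypothetical choices made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)  # small constant avoids division by zero


def softmax(values, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp((v - peak) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def set_similarity(query_set, support_set, temperature=0.1):
    """Weighted similarity between two sets of embeddings.

    Assumption: the weights are a softmax over the pairwise similarities
    themselves, so they adapt to each query-support pair of sets.
    """
    sims = [cosine(q, s) for q in query_set for s in support_set]
    weights = softmax(sims, temperature)
    return sum(w * s for w, s in zip(weights, sims))


def classify(query_set, support_sets):
    """Assign the query to the class whose support set is most similar."""
    scores = {
        label: set_similarity(query_set, embeddings)
        for label, embeddings in support_sets.items()
    }
    return max(scores, key=scores.get), scores


# Toy 2-way 1-shot episode with two embeddings per image (made-up values).
query = [[1.0, 0.0], [0.9, 0.1]]
support = {
    "cat": [[1.0, 0.1], [0.8, 0.0]],
    "dog": [[0.0, 1.0], [0.1, 0.9]],
}
label, scores = classify(query, support)
print(label, scores)
```

In this sketch a lower `temperature` concentrates the weight on the best-matching embedding pairs, while a higher value approaches a plain average over all pairs. A learned weighting module could replace the softmax without changing the surrounding structure.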