Published In

IEEE Access

Document Type

Article

Publication Date

2020

Subjects

Image recognition, Computational modeling

Abstract

Learning from limited labelled examples is a key research hotspot with many promising application scenarios. Currently, most metric learning-based few-shot models still suffer from low recognition accuracy. This is mainly because they use only the top-layer abstract features, which carry semantic information, and ignore the low-layer features that are also critical for few-shot recognition. As a result, the extracted features lack representational power, and easily confused objects are difficult to recognize. Moreover, these models usually adopt a fixed distance function or train a comparison network to measure features. Such methods lack adaptability and cannot sufficiently fuse features, which weakens the fitting ability of the metric function. In addition, images of the same class and of different classes are treated equally, so the metric function has no point of emphasis during training. To address these issues, we propose an end-to-end, metric learning-based model called the multi-scale decision network with feature fusion and weighting for few-shot learning (MSDN). Considering the importance of the low-layer features, we use a convolutional network to extract the features of each layer. We then use a relation network to learn a non-linear metric between the support set and query set features of each layer, and classify the test images via a voting decision. During feature concatenation, we design a non-linear feature fusion term to improve the concatenation, giving the relation network a stronger function-fitting ability for learning the relation score. Meanwhile, we introduce an attention mechanism that uses the cosine similarity between the support set and query set features as their weight, which makes the relation network pay more attention to images of the same class.
Our model achieves state-of-the-art accuracy on the Omniglot and miniImageNet datasets compared with popular few-shot...
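
The per-layer comparison and voting decision described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes one feature vector per class and per scale, and it substitutes plain cosine similarity for the learned relation network (in the paper, cosine similarity serves only as an attention weight on the relation score). The function names `cosine_similarity`, `predict_at_scale`, `vote`, and `classify` are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for zero vectors (an assumption of this sketch)
    return dot / (norm_u * norm_v)


def predict_at_scale(support, query):
    """Pick the class whose support feature is most similar to the query.

    `support` maps a class label to one feature vector at this scale.
    Stand-in only: the paper scores pairs with a learned relation network.
    """
    return max(support, key=lambda label: cosine_similarity(support[label], query))


def vote(predictions):
    """Majority vote over per-scale predictions (ties go to the first seen)."""
    return Counter(predictions).most_common(1)[0][0]


def classify(support_by_scale, query_by_scale):
    """Classify one query from its features at every scale, then vote."""
    predictions = [
        predict_at_scale(support, query)
        for support, query in zip(support_by_scale, query_by_scale)
    ]
    return vote(predictions), predictions


# Toy data: two classes, three scales with different feature sizes.
support_by_scale = [
    {"a": [1.0, 0.0], "b": [0.0, 1.0]},
    {"a": [1.0, 1.0, 0.0], "b": [0.0, 1.0, 1.0]},
    {"a": [2.0, 0.0, 0.0, 1.0], "b": [0.0, 3.0, 1.0, 0.0]},
]
query_by_scale = [
    [0.9, 0.1],
    [0.1, 1.0, 0.9],
    [1.0, 0.0, 0.0, 1.0],
]

label, per_scale = classify(support_by_scale, query_by_scale)
print(label, per_scale)
```

In this toy case the middle scale disagrees with the other two, and the vote settles the final label. That is the intuition behind combining decisions from several layers rather than relying on the top layer alone.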

Description

© 2020 by the authors. Licensee: IEEE. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

DOI

10.1109/ACCESS.2020.2994805

Persistent Identifier

https://archives.pdx.edu/ds/psu/33294
