A study of meta-learning methods for few-shot learning

Period

2020.11-2021.8 (Completed)

Cooperative unit

CCF, Baidu

Introduction

Few-shot learning aims to let a model adapt quickly to a target task, that is, perform well on it, when only a small amount of labeled data directly related to that task is available. Few-shot learning methods based on meta-learning fall into three main categories: optimization-based, model-based, and metric-based. This project aims to evaluate in depth the robustness of mainstream metric-based few-shot learning methods, propose an improved scheme that addresses their shortcomings, and deploy that scheme in practical applications.

Methodology

This project adopts the following research scheme: (1) meta-learning based on matching distribution; (2) few-shot learning for wireless sensing.

(1) Meta-learning based on matching distribution.

Metric-based meta-learning methods have proliferated, from MatchingNet to the recent DPGN, and from simple encoding followed by cosine similarity to graph neural networks that characterize the relationships between targets. Although state-of-the-art metric-based methods excel on public few-shot benchmarks, their robustness in practical applications is doubtful. Most of them follow the same idea: use the correspondence between the support and query sets, together with some metric, to learn an embedding space in which similar targets lie close together and dissimilar targets lie far apart. Learning an embedding space through a metric is widely used in many subfields (e.g., object tracking and person re-identification). However, this approach operates at the instance level, and it yields a well-structured embedding space only when a large amount of training data is available, so its applicability to few-shot scenarios is questionable. This project takes the view that the embedding space should be learned at the distribution level rather than the instance level. Instances are only a few samples drawn from a distribution; if they are not representative of it, the learned embedding space will not generalize well to new instances. If, instead, the true distribution of each target can be recovered to some extent and the metric is computed from the similarity between target distributions, generalization should improve.
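To make the contrast concrete, the following is a minimal sketch and not the project's actual method. It assumes embeddings are already available as plain lists of floats, summarizes each class either by a prototype (instance level, compared by cosine similarity) or by a diagonal Gaussian (distribution level), and compares distributions with the squared 2-Wasserstein distance. The diagonal-Gaussian model, the choice of distance, and all function names are illustrative assumptions.

```python
import math
from statistics import fmean, pvariance


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def prototype(embeddings):
    """Instance-level summary: per-dimension mean of the support embeddings."""
    return [fmean(dim) for dim in zip(*embeddings)]


def fit_diag_gaussian(embeddings, eps=1e-6):
    """Distribution-level summary (assumed model): per-dimension mean and variance."""
    dims = list(zip(*embeddings))
    mean = [fmean(d) for d in dims]
    var = [pvariance(d) + eps for d in dims]  # eps keeps variances strictly positive
    return mean, var


def w2_squared(p, q):
    """Squared 2-Wasserstein distance between two diagonal Gaussians (mean, var)."""
    (m1, v1), (m2, v2) = p, q
    mean_term = sum((a - b) ** 2 for a, b in zip(m1, m2))
    spread_term = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(v1, v2))
    return mean_term + spread_term


def classify_instance(query, support):
    """Assign one query embedding to the class with the most similar prototype."""
    protos = {label: prototype(embs) for label, embs in support.items()}
    return max(protos, key=lambda label: cosine(query, protos[label]))


def classify_distribution(query_set, support):
    """Assign a group of query embeddings to the class with the closest distribution."""
    q = fit_diag_gaussian(query_set)
    dists = {label: w2_squared(q, fit_diag_gaussian(embs))
             for label, embs in support.items()}
    return min(dists, key=dists.get)


# Toy data (made up): both classes share the same mean but differ in spread,
# so their prototypes coincide while their fitted distributions do not.
support = {
    "tight": [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1]],
    "wide": [[0.0, 2.0], [2.0, 0.0], [1.0, 1.0]],
}
query_set = [[0.1, 1.9], [1.9, 0.1], [1.0, 1.0]]
print(classify_distribution(query_set, support))
```

In this toy case the two prototypes are identical, so a prototype-based comparison cannot separate the classes, whereas the distribution-level comparison can use the difference in spread. With only a few support samples the variance estimates are themselves noisy, which is exactly the difficulty a real method would have to address.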
A similar idea appears in DPGN, which constructs two complete graphs, a point graph and a distribution graph, to model the instance-level and distribution-level representations of each sample, respectively, and which has shown very high accuracy on public datasets at every sample size. However, DPGN considers the distribution of each sample's similarities within the support set in order to improve the propagation of node information in the graph; it does not model the distribution of each instance itself within the support and query sets. This project therefore aims to model the actual distributions of the various types of targets within the support and query sets, and to measure the similarity of these distributions to accomplish few-shot learning. A schematic diagram of the scheme is shown below:

(2) Few-shot learning for wireless sensing.

Current public datasets for evaluating few-shot learning are mainly image-based (miniImageNet, tieredImageNet, Omniglot, CIFAR-FS, CUB 200, FC100, etc.) and text-based (ODIC). Many few-shot learning methods already exceed 90% accuracy on some of these datasets, so those datasets can no longer evaluate current methods objectively. Moreover, only image- and text-based public datasets are currently available in this field, so the range of application scenarios is limited. For this reason, this project proposes a few-shot learning dataset based on wireless signals, to enrich the application scenarios of few-shot learning and to add a further dimension along which models can be evaluated.