[Hung-yi Lee 2020 ML/DL] P97-98 More about Meta Learning

静悠, 2022-02-11

Tags: machine learning, artificial intelligence, deep learning, meta learning, few-shot

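I already have two years of ML experience. I am using this course series mainly to fill gaps, and I note down details and things I did not know.

Section overview

This lecture is given by teaching assistant 陈建成.
The outline of this lecture is under Details.
It starts with What is meta learning?
Next comes Why meta learning?
How and what to do with meta learning? is the main content of this lecture. It opens with an amusing observation: authors of meta-learning methods tend to abbreviate their methods into the names of animals.
Then it discusses What can we "meta learn"?
Next is What can we meta learn on?, which covers the common datasets for meta learning.
After that, the categories of meta learning and how each of them works are discussed.
Finally, techniques related to meta-learning are introduced.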

Contents

Section overview
Details

Outline
What is meta learning?
Why meta learning?

Motivations for meta learning?

(Interesting Names of) Models / Techniques
What can we "meta learn"?
What can we meta learn on? Datasets
Categories

Black-box
Optimization / Gradient based

MAML
Meta-SGD
How to train your MAML?
Different meta-parameters

iMAML
R2-D2: closed form solvers

Black-box vs. Gradient based
Metric-based / non-parametric

Siamese network
Prototypical network
Matching network
Relation network
IMP (Infinite Mixture Prototypes)

Problem of metric-based

Hybrid

LEO (Latent Embedding Optimization)

Bayesian meta-learning
Related machine learning topics

Details

Outline

What is meta learning?
Why meta learning?
How and what to do with meta learning?

Categories
Datasets
Models

Related machine learning topics

What is meta learning?

"Meta" means "something about something".

Meta learning is learning about learning, that is, learning to learn.

[Figure: lecture slide]

The slide above illustrates this.

Why meta learning?

Motivations for meta learning?

too many tasks to learn, to learn more efficiently: learning to learn.

Faster learning methods (adaptation)
Better hyper-parameters / learning algorithms
Related to:

transfer learning
domain adaptation
multi-task learning
life-long learning

Too little data, to fit more accurately (few-shot learning, better learner, fit more quickly)

Traditional supervised learning may not work

[Figure: lecture slide]

In addition, the instructor's channel is shown above. (Looks like I still have some way to go...)

(Interesting Names of) Models / Techniques

[Figure: lecture slide]

As shown above, meta-learning models are commonly named after animals. SNAIL is an attention-based model.

[Figure: lecture slide]

The details are not covered here; read the papers if you need them.

[Figure: lecture slide]

What can we “meta learn”?

Model Parameters (suitable for Few-shot framework)

Initializations
Embeddings / Representations / Metrics
Optimizers
Reinforcement learning (Policies / other settings)

Hyperparameters (e.g. AutoML): beyond the scope of today, but can be viewed as a kind of meta learning

Hyperparameters search ((training) settings)
Network architectures → Network architecture search (NAS), related to: evolution strategies, genetic algorithms…

Others

Algorithm itself (literally, not a network)
…(More in DLHLP)

What can we meta learn on? Datasets

[Figure: lecture slide]

As shown above, meta learning can be applied to recognizing written characters from different languages.

[Figure: lecture slide]

It can also be applied to images.

Categories

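[Figure: lecture slide]

As shown above, the methods fall into:

Black-box methods (RNN-based);
Methods that learn the optimizer or the initial parameters;
Methods that learn a metric for comparison;
Hybrid methods.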

Black-box

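[Figure: lecture slide]

The principle is shown above. The $\phi_i$ we obtain can be understood as a kind of distribution. The whole setup can also be viewed as a kind of auto-encoder.

[Figure: lecture slide]

As shown above, the basic idea is simply to train an LSTM on the tasks by brute force.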

Optimization / Gradient based

[Figure: lecture slide]

This is the most classic family of methods, including MAML and Reptile.

MAML

[Figure: lecture slide]

The slide above gives an overview of MAML.
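Since the overview itself lives on the slide, here is a minimal sketch of one MAML meta-update on toy tasks. Everything in it is an assumption made for illustration: the quadratic `task_loss`, the names `maml_step`, `inner_lr`, `outer_lr`, and the simplification of using the same loss for the inner and the outer step (no separate query set). For this loss the second-order term can be written by hand, so no autodiff library is needed.

```python
import numpy as np

def task_loss(phi, center):
    """Toy task loss: half the squared distance to a task-specific optimum."""
    return 0.5 * np.sum((phi - center) ** 2)

def task_grad(phi, center):
    """Gradient of task_loss with respect to phi."""
    return phi - center

def maml_step(theta, centers, inner_lr=0.1, outer_lr=0.5):
    """One MAML meta-update of the shared initialization theta."""
    meta_grad = np.zeros_like(theta)
    for center in centers:
        # Inner loop: one gradient step from the shared initialization.
        phi = theta - inner_lr * task_grad(theta, center)
        # Outer gradient: chain rule through the inner step.
        # For this quadratic loss, d(phi)/d(theta) = (1 - inner_lr) * I.
        meta_grad += (1.0 - inner_lr) * task_grad(phi, center)
    return theta - outer_lr * meta_grad / len(centers)

# Three hypothetical tasks, each with its own optimum.
centers = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
theta = np.array([3.0, -2.0])
for _ in range(200):
    theta = maml_step(theta, centers)
```

On these toy tasks the learned initialization settles at the mean of the task optima, the point from which one inner step helps every task equally. Dropping the `(1.0 - inner_lr)` factor would give the first-order approximation.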

Meta-SGD

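[Figure: lecture slide]

However, MAML has some problems. For example, its learning rate is the same for every task, yet the learning rate often matters a lot.

[Figure: lecture slide]

Meta-SGD therefore adds one more meta parameter to MAML: a learning rate shaped like the parameters, that is, one learning rate per parameter.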
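The difference from MAML shows up in the inner update alone. The sketch below is illustrative only: `inner_update` and the numbers are made up, and in Meta-SGD the vector `meta_sgd_alpha` would itself be learned in the outer loop together with the initialization, which is not shown here.

```python
import numpy as np

def inner_update(theta, grad, alpha):
    """Inner step theta - alpha * grad; alpha may be a scalar or a vector."""
    return theta - alpha * grad  # element-wise product when alpha is a vector

theta = np.array([1.0, -2.0, 0.5])
grad = np.array([0.5, 0.5, 0.5])

maml_alpha = 0.1                            # MAML: one scalar for all parameters
meta_sgd_alpha = np.array([0.1, 0.0, 1.0])  # Meta-SGD: one rate per parameter

phi_maml = inner_update(theta, grad, maml_alpha)
phi_meta_sgd = inner_update(theta, grad, meta_sgd_alpha)
```

With a per-parameter rate, the update can freeze one coordinate (rate 0.0) while taking a large step on another, which a single scalar cannot express.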

However, there are then as many learning rates as parameters, which is inefficient. This motivates How to train your MAML?

How to train your MAML?

[Figure: lecture slide]

This paper points out five problems of MAML, listed above.

[Figure: lecture slide]

The authors propose a solution for each of these five problems.

Different meta-parameters

[Figure: lecture slide]

In addition, two papers change the MAML architecture more substantially.

iMAML

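[Figure: lecture slide]

As shown above, the middle panel is first-order MAML, which uses only first-order directions. Summed up, these direction vectors deviate considerably from the true MAML update.

The authors therefore propose an idea (the parabola in the figure above, i.e. a quadratic term): Can we do better? Consider the following…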

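[Figure: lecture slide]

As shown above, suppose the meta parameter $\theta$ has learned something useful for a group of tasks (the several curves in the figure). Then $\theta$ should capture what the $\phi_i$ have in common, so $\theta$ and the $\phi_i$ should not drift far apart.

We therefore add a regularization term. Working through the derivation shows that the gradient with respect to $\theta$ can be approximated by an expression that depends only on the final $\phi_i$.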
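To make "depends only on the final $\phi_i$" concrete, here is a small numeric check on a toy problem. The assumptions are mine: a quadratic task loss with Hessian `A`, a proximal regularizer `(lam / 2) * ||phi - theta||^2`, and the helper names below. Under those assumptions the Jacobian of the inner solution with respect to $\theta$ is `(I + H / lam)^-1`, where `H` is the task-loss Hessian at the solution, so no inner optimization path has to be stored.

```python
import numpy as np

def proximal_solution(theta, A, b, lam):
    """Minimizer of 0.5*phi^T A phi - b^T phi + (lam/2)*||phi - theta||^2."""
    dim = len(theta)
    # Stationarity: (A + lam*I) phi = b + lam*theta.
    return np.linalg.solve(A + lam * np.eye(dim), b + lam * theta)

def implicit_jacobian(A, lam):
    """d(phi*)/d(theta) = (I + H/lam)^-1, with H the task-loss Hessian (A here)."""
    dim = A.shape[0]
    return np.linalg.inv(np.eye(dim) + A / lam)

A = np.array([[2.0, 0.5], [0.5, 1.0]])  # Hessian of the toy task loss
b = np.array([1.0, -1.0])
theta = np.array([0.3, 0.7])
lam = 2.0

# Central finite differences: column j is d(phi*)/d(theta_j).
eps = 1e-6
numeric = np.column_stack([
    (proximal_solution(theta + eps * e, A, b, lam)
     - proximal_solution(theta - eps * e, A, b, lam)) / (2 * eps)
    for e in np.eye(2)
])
analytic = implicit_jacobian(A, lam)
```

The two Jacobians agree, and `analytic` was computed from the Hessian alone, without unrolling any inner steps. In a real network the Hessian is evaluated at the final $\phi_i$ and the inverse is typically approximated rather than formed explicitly.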

[Figure: lecture slide]

It performs quite well.

R2-D2: closed form solvers

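[Figure: lecture slide]

As shown above, $\rho$ denotes the parameters of the R.R. module (ridge regression in R2-D2). That module is solved in closed form through a mathematical derivation, which amounts to a few linear-algebra operations.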
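As a rough illustration of such a closed-form solver, the sketch below fits a ridge-regression classifier on support embeddings in two equivalent ways. The function names, the random stand-in embeddings, and the choice `lam=1.0` are assumptions for the example; the second form uses the Woodbury identity so that the matrix being inverted is only as large as the number of support samples, which is small in few-shot settings.

```python
import numpy as np

def ridge_primal(X, Y, lam):
    """W = (X^T X + lam*I)^-1 X^T Y; solves a (d x d) system."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

def ridge_dual(X, Y, lam):
    """W = X^T (X X^T + lam*I)^-1 Y; solves an (n x n) system instead."""
    n = X.shape[0]
    return X.T @ np.linalg.solve(X @ X.T + lam * np.eye(n), Y)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 64))  # 5 support embeddings of dimension 64 (stand-ins)
Y = np.eye(5)                 # one-hot labels for a 5-way 1-shot episode

W_primal = ridge_primal(X, Y, lam=1.0)
W_dual = ridge_dual(X, Y, lam=1.0)
```

Both routes give the same weights. Because every step is ordinary linear algebra, the solution is differentiable with respect to the embeddings, which is what lets the feature extractor be meta-trained through it.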

Black-box vs. Gradient based

[Figure: lecture slide]

As shown above, the two are actually not that different: both store information about the training tasks.

Metric-based / non-parametric

[Figure: lecture slide]

[Figure: lecture slide]

As shown above, this family of methods gives up constructing F explicitly. Instead, the training data and the test data are fed together into one large black box, which compares them directly and outputs the labels.

[Figure: lecture slide]

As shown above, the training data are used to build an embedding for each class, and the test data are then compared against these class embeddings by proximity (nearest neighbor).
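One common way to realize this, sketched under assumptions: take the mean embedding of each class as its representative (the recipe used by the prototypical network covered below) and assign each test sample to the nearest one. The embeddings here are hand-written 2-D vectors standing in for the output of a trained encoder, and the function names are hypothetical.

```python
import numpy as np

def class_prototypes(support_emb, support_labels, n_classes):
    """Mean embedding of the support samples of each class."""
    return np.stack([
        support_emb[support_labels == c].mean(axis=0) for c in range(n_classes)
    ])

def classify(query_emb, prototypes):
    """Label each query with the class of its nearest prototype."""
    # Squared Euclidean distance between every query and every prototype.
    diffs = query_emb[:, None, :] - prototypes[None, :, :]
    dists = (diffs ** 2).sum(axis=-1)
    return dists.argmin(axis=1)

# Stand-in embeddings for a 2-way 2-shot episode.
support_emb = np.array([[0.0, 0.1], [0.2, -0.1], [3.0, 3.1], [2.8, 2.9]])
support_labels = np.array([0, 0, 1, 1])
query_emb = np.array([[0.1, 0.0], [3.1, 3.0]])

prototypes = class_prototypes(support_emb, support_labels, n_classes=2)
predictions = classify(query_emb, prototypes)
```

No classifier weights are fitted at test time; only the embedding function is learned, which is why these methods are called non-parametric.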

Siamese network

[Figure: lecture slide]

As shown above, the training objective is simply that samples with the same label end up close enough and samples with different labels end up far enough apart.
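A contrastive loss is one common way to write down this objective. It is offered here only as an example and is not necessarily the loss on the slide; the name `contrastive_loss` and the `margin` value are assumptions.

```python
import numpy as np

def contrastive_loss(emb_a, emb_b, same_label, margin=1.0):
    """Pull same-label pairs together; push different-label pairs past the margin."""
    dist = np.linalg.norm(emb_a - emb_b)
    if same_label:
        return dist ** 2                 # any remaining distance is penalized
    return max(0.0, margin - dist) ** 2  # penalized only when closer than the margin

anchor = np.array([0.0, 0.0])
near = np.array([0.3, 0.4])  # distance 0.5 from the anchor
far = np.array([3.0, 4.0])   # distance 5.0 from the anchor

loss_same_near = contrastive_loss(anchor, near, same_label=True)
loss_diff_near = contrastive_loss(anchor, near, same_label=False)
loss_diff_far = contrastive_loss(anchor, far, same_label=False)
```

A different-label pair that is already farther apart than the margin contributes zero loss, which matches the "far enough is good enough" wording above.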

Prototypical network

[Figure: lecture slide]

As shown above, separate CNN branches extract the features of each input.

Matching network

[Figure: lecture slide]

The Matching network, in order to capture the relationships among the samples, uses a bidirectional LSTM to store those relationships.

Relation network

[Figure: lecture slide]

The Relation network follows a similar idea, except that the similarity score is computed by a network.

These are the four methods that Prof. Lee has covered.

IMP (Infinite Mixture Prototypes)

[Figure: lecture slide]

As shown above, it uses a Bayesian approach and adds extra clusters in order to classify more accurately.

Problem of metric-based

When the K in N-way K-shot is large → difficult to scale
Limited to classification (only learning to compare)
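To see where the scaling problem comes from, the sketch below samples one N-way K-shot episode and counts the comparisons needed by a method that compares every query against every support sample. The helper `sample_episode`, the `n_query` parameter, and the toy dataset of string identifiers are all made up for illustration.

```python
import random

def sample_episode(data_by_class, n_way, k_shot, n_query, rng):
    """Sample an N-way K-shot episode as (support, query) lists of (item, label)."""
    classes = rng.sample(sorted(data_by_class), n_way)
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        items = rng.sample(data_by_class[cls], k_shot + n_query)
        support += [(x, episode_label) for x in items[:k_shot]]
        query += [(x, episode_label) for x in items[k_shot:]]
    return support, query

rng = random.Random(0)
# Hypothetical dataset: 10 classes with 20 item identifiers each.
data_by_class = {
    f"class_{c}": [f"img_{c}_{i}" for i in range(20)] for c in range(10)
}

support, query = sample_episode(data_by_class, n_way=5, k_shot=1, n_query=3, rng=rng)
pairwise_1_shot = len(support) * len(query)

support_10, query_10 = sample_episode(data_by_class, n_way=5, k_shot=10, n_query=3, rng=rng)
pairwise_10_shot = len(support_10) * len(query_10)
```

With the query set held fixed, the number of pairwise comparisons grows linearly in K. Methods that first pool each class into a single representative avoid part of this cost.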

Hybrid

Finally, a word on hybrid models.

LEO (Latent Embedding Optimization)

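[Figure: lecture slide]

As shown above, during inner training (when the ability to meta-learn is being trained), the dataset is projected into a lower-dimensional space, and this space corresponds to the space of network parameters.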

Bayesian meta-learning

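[Figure: lecture slide]

As shown above, the goal was to tell "smiling" from "not smiling", but the machine instead picked up on the feature "wearing a hat or not".

A series of methods, listed above, has been proposed to address this.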

Related machine learning topics

[Figure: lecture slide]

In RL, we are stuck on "interaction" and on the small amount of data, so combining RL with meta learning is well worth doing.
