Hung-yi Lee's "Deep Learning": Study Notes (1)

The first two lectures of Professor Hung-yi Lee's "Deep Learning" course introduce some basic concepts of machine learning.

Contents

Machine learning
Training steps
Structured Learning
overfitting

Machine learning

$Machine\ learning \approx Looking\ for\ Function$
Machine learning amounts to finding a function (the model) that maps inputs to outputs.

Training steps

step1: the function with unknown parameters

step2: define Loss from training data
Loss: how good a set of values is.
Definition of the loss function: $L=\frac{1}{N}\sum_{n}e_n$
$e=|y-\hat{y}|$: $MAE$, the mean absolute error
$e=(y-\hat{y})^2$: $MSE$, the mean square error
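
As a rough illustration of the two definitions above, here is a minimal Python sketch. The function names and the numbers are made up for this example, and it assumes $y$ is the model output and $\hat{y}$ the target (both errors are symmetric in the two, so the roles do not change the result).

```python
def mean_loss(predictions, targets, error):
    """Average the per-example error e_n over all N examples: L = (1/N) * sum_n e_n."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    return sum(error(y, y_hat) for y, y_hat in zip(predictions, targets)) / len(targets)


def absolute_error(y, y_hat):
    # e = |y - y_hat|, averaged this gives MAE
    return abs(y - y_hat)


def squared_error(y, y_hat):
    # e = (y - y_hat)^2, averaged this gives MSE
    return (y - y_hat) ** 2


# Made-up numbers, purely for illustration.
predictions = [2.0, 4.0, 7.0]
targets = [1.0, 4.0, 5.0]

mae = mean_loss(predictions, targets, absolute_error)  # (1 + 0 + 2) / 3
mse = mean_loss(predictions, targets, squared_error)   # (1 + 0 + 4) / 3
print(mae, mse)
```

MSE punishes the single large miss (an error of 2) more heavily than MAE does, which is the practical difference between the two.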

When $y$ and $\hat{y}$ are both probability distributions, $Cross\ entropy$ may be chosen as the loss function.
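
The notes do not write the formula out, so the sketch below uses the usual definition $e=-\sum_i \hat{y}_i \ln y_i$, assuming $\hat{y}$ is the target distribution and $y$ the predicted one. The `eps` clamp and the example distributions are my own additions for illustration.

```python
import math


def cross_entropy(y, y_hat, eps=1e-12):
    """Cross entropy -sum_i y_hat_i * ln(y_i).

    y     : predicted distribution (assumed role)
    y_hat : target distribution (assumed role)
    eps   : hypothetical clamp to avoid log(0)
    """
    return -sum(t * math.log(max(p, eps)) for p, t in zip(y, y_hat))


target = [0.0, 1.0, 0.0]          # one-hot target: class 1 is correct
confident = [0.05, 0.90, 0.05]    # prediction close to the target
unsure = [0.30, 0.40, 0.30]       # prediction spread across classes

loss_confident = cross_entropy(confident, target)  # equals -ln(0.9)
loss_unsure = cross_entropy(unsure, target)        # equals -ln(0.4)
print(loss_confident, loss_unsure)
```

The prediction that puts more probability on the correct class gets the smaller loss.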

step3: optimization
$w^*,b^*=\arg\min_{w,b} L$
One optimization method is gradient descent.
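
The three steps can be sketched end to end in a few lines of Python. This is only a sketch under assumptions the notes do not state: the model is taken to be $y = b + wx$, the loss is MSE, and the learning rate, step count, and data are made up for the example.

```python
def gradient_descent(xs, ys, lr=0.05, steps=2000):
    """Minimise the MSE of the assumed model y = b + w * x by plain gradient descent."""
    w, b = 0.0, 0.0  # step 1: a function with unknown parameters w and b
    n = len(xs)
    for _ in range(steps):
        # Residuals of the current model on every training example.
        residuals = [(b + w * x) - y for x, y in zip(xs, ys)]
        # Step 2: L = (1/N) * sum (b + w*x - y)^2; these are its partial derivatives.
        grad_w = 2.0 / n * sum(r * x for r, x in zip(residuals, xs))
        grad_b = 2.0 / n * sum(residuals)
        # Step 3: move the parameters against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 1 + 2x, so the minimiser is known in advance.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w_star, b_star = gradient_descent(xs, ys)
print(w_star, b_star)
```

On this noise-free data the loop recovers values very close to $w^*=2$ and $b^*=1$. A learning rate that is too large would make the updates diverge instead.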

Structured Learning

$Structured\ Learning \approx create\ something\ with\ structure\ (image,\ document)$

overfitting

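A model is overfitting only when its loss is small on the training set but large on the test set; a large test loss alone is not enough to call it overfitting.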
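
To make the pattern concrete, here is a small sketch with a deliberately extreme, hypothetical model that simply memorises the training set (it answers with the label of the nearest training point). The data values are invented for this example.

```python
def mse(model, data):
    """Mean squared error of a model over a list of (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


def make_memorizer(train):
    """Return a model that answers with the label of the closest training x."""
    def model(x):
        nearest_x, nearest_y = min(train, key=lambda pair: abs(pair[0] - x))
        return nearest_y
    return model


# Invented data: roughly y = x, with some noise in the training labels.
train_set = [(0.0, 0.3), (1.0, 0.8), (2.0, 2.4), (3.0, 2.7)]
test_set = [(0.4, 0.4), (1.6, 1.6), (2.6, 2.6)]

memorizer = make_memorizer(train_set)
train_loss = mse(memorizer, train_set)  # every training point is reproduced exactly
test_loss = mse(memorizer, test_set)    # the memorised noise hurts on unseen inputs
print(train_loss, test_loss)
```

The training loss is exactly zero while the test loss is not, which is the small-train, large-test signature described above.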

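Ways to address overfitting:

more training data (data augmentation)
Create additional data based on your own understanding of the problem.
constrained model
Place more constraints on the model, based on the problem.
For example: fewer parameters, fewer features, early stopping, regularization, dropout.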
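
As one example of constraining a model, the sketch below adds an L2 penalty $\lambda w^2$ to the MSE loss. The notes only say "regularization" without naming a form, so the L2 choice, the assumed model $y = b + wx$, and every number here are assumptions for illustration.

```python
def fit_linear(xs, ys, l2=0.0, lr=0.05, steps=2000):
    """Gradient descent on MSE + l2 * w^2 for the assumed model y = b + w * x."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        residuals = [(b + w * x) - y for x, y in zip(xs, ys)]
        # The penalty l2 * w^2 contributes 2 * l2 * w to the gradient in w only.
        grad_w = 2.0 / n * sum(r * x for r, x in zip(residuals, xs)) + 2.0 * l2 * w
        grad_b = 2.0 / n * sum(residuals)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w_plain, b_plain = fit_linear(xs, ys, l2=0.0)  # unconstrained fit
w_reg, b_reg = fit_linear(xs, ys, l2=1.0)      # penalised fit, slope is pulled toward 0
print(w_plain, w_reg)
```

The penalised slope comes out smaller in magnitude than the unconstrained one. On clean data like this the constraint only costs accuracy; it is meant to help when the unconstrained model would otherwise fit noise.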

When choosing a model, weigh the errors on both the training set and the test set.

Professor Hung-yi Lee's Taiwanese accent really is lovely to listen to~