Machine Learning and Deep Learning -- Hung-yi Lee (Notes and Personal Understanding) Day 11-12

菜头粿子园 2024-04-15

Day 11: When the gradient is small……


How do we tell whether it is a local minimum or a saddle point?

Using math

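The math in question is the second-order Taylor approximation of the loss around a point θ′ (a sketch in the usual notation, consistent with the lecture's setup):

```latex
L(\theta) \approx L(\theta')
  + (\theta-\theta')^{\mathsf{T}} g
  + \tfrac{1}{2}(\theta-\theta')^{\mathsf{T}} H (\theta-\theta'),
\qquad
g = \nabla L(\theta'), \quad
H_{ij} = \left.\frac{\partial^2 L}{\partial \theta_i\, \partial \theta_j}\right|_{\theta'}
```

At a critical point g = 0, so the local shape is decided by vᵀHv with v = θ − θ′: if H is positive definite (all eigenvalues positive), θ′ is a local minimum; if negative definite, a local maximum; if the eigenvalues have mixed signs, a saddle point.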

Example: using the Hessian

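The eigenvalue test above can be reproduced numerically: estimate the Hessian at a critical point and inspect its eigenvalues. The toy loss L(w1, w2) = (1 − w1·w2)² below is chosen for illustration; it has a critical point at the origin:

```python
import numpy as np

def loss(w):
    # Toy loss with a critical point at the origin (illustrative example)
    w1, w2 = w
    return (1 - w1 * w2) ** 2

def hessian(f, w, eps=1e-4):
    """Numerical Hessian via central differences."""
    n = len(w)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            w_pp = w.copy(); w_pp[i] += eps; w_pp[j] += eps
            w_pm = w.copy(); w_pm[i] += eps; w_pm[j] -= eps
            w_mp = w.copy(); w_mp[i] -= eps; w_mp[j] += eps
            w_mm = w.copy(); w_mm[i] -= eps; w_mm[j] -= eps
            H[i, j] = (f(w_pp) - f(w_pm) - f(w_mp) + f(w_mm)) / (4 * eps ** 2)
    return H

w = np.zeros(2)                      # critical point: gradient is 0 here
eigvals = np.linalg.eigvalsh(hessian(loss, w))
if np.all(eigvals > 0):
    kind = "local minimum"           # H positive definite
elif np.all(eigvals < 0):
    kind = "local maximum"           # H negative definite
else:
    kind = "saddle point"            # eigenvalues of mixed sign
print(eigvals, kind)                 # eigenvalues are -2 and 2 -> saddle point
```

For this loss the Hessian at the origin is [[0, −2], [−2, 0]], whose eigenvalues ±2 have mixed signs, so the test reports a saddle point.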

Don't be afraid of saddle points (鞍点)

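The reason saddle points are not scary: at a saddle, an eigenvector u of a negative eigenvalue of H is a direction along which the loss decreases, so we can escape by updating θ along u even though the gradient is zero. A minimal sketch using the toy loss L(w1, w2) = (1 − w1·w2)², whose Hessian at the origin is [[0, −2], [−2, 0]] (loss and step size are assumptions for illustration):

```python
import numpy as np

def loss(w):
    w1, w2 = w
    return (1 - w1 * w2) ** 2

# Hessian of this toy loss at the saddle point (0, 0)
H = np.array([[0.0, -2.0], [-2.0, 0.0]])
eigvals, eigvecs = np.linalg.eigh(H)
u = eigvecs[:, 0]            # eigenvector of the negative eigenvalue (-2)
assert eigvals[0] < 0

theta = np.zeros(2)          # stuck at the saddle; gradient is zero here
step = 0.1
before = loss(theta)
after = loss(theta + step * u)
print(before, after)         # moving along u strictly decreases the loss
```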

Local minima vs. saddle points


Day 12: Tips for training: Batch and Momentum

Why do we use batches?

This was covered earlier; a quick recap.


Shuffle: after each epoch, the data may be re-split into new batches.
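A minimal sketch of this shuffle behavior, assuming the common practice of reshuffling once per epoch (the sample count and batch size are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, batch_size = 20, 8      # illustrative sizes

for epoch in range(2):
    # Reshuffle at the start of every epoch, so each epoch
    # sees a different split of the data into batches.
    order = rng.permutation(n_samples)
    batches = [order[i:i + batch_size]
               for i in range(0, n_samples, batch_size)]
    print(f"epoch {epoch}: batch sizes {[len(b) for b in batches]}")
```

Each epoch produces batches of sizes [8, 8, 4], but which samples land in which batch changes between epochs.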

Small batch vs. large batch


(The above does not account for parallel computation, i.e. parallelism on a GPU.)

| Aspect | Small batch size (100 samples) | Large batch size (10000 samples) |
| --- | --- | --- |
| Speed for one update (no parallel) | Faster | Slower |
| Speed for one update (with parallel) | Same | Same (if not too large) |
| Time for one epoch | Slower | Faster |
| Gradient | Noisy | Stable |
| Optimization | Better | Worse |
| Generalization | Better | Worse |

Batch size is a hyperparameter……


Momentum

Inertia: each update keeps part of the previous movement.

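The momentum update combines the previous movement with the current gradient step: mᵗ = λ·mᵗ⁻¹ − η·gᵗ⁻¹, then θᵗ = θᵗ⁻¹ + mᵗ. A minimal sketch on a toy quadratic loss (the loss, learning rate, and momentum factor are illustrative choices):

```python
import numpy as np

def grad(theta):
    # Gradient of a simple quadratic bowl L(theta) = theta**2 (illustrative)
    return 2 * theta

theta = np.array([5.0])
m = np.zeros_like(theta)       # movement of the previous step
eta, lam = 0.1, 0.9            # learning rate and momentum factor

for _ in range(100):
    m = lam * m - eta * grad(theta)   # keep part of the old movement,
    theta = theta + m                 # then step along it

print(theta)   # oscillates past the minimum at 0, then settles near it
```

Because the update carries inertia from earlier steps, it can overshoot and keep moving even where the current gradient is small, which is exactly why momentum helps push past plateaus and saddle points.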

Conclusion:

