[Hung-yi Lee 2020 ML/DL] P16 PyTorch Tutorial | apex.amp is mentioned at the end

I already have two years of ML experience, so I use this course series mainly to fill gaps, and I note down the details I did not know before.

Someone has already written notes for the course (very thorough, highly recommended):

https://github.com/Sakura-gh/ML-notes

Notes corresponding to this lecture: none

Overview of this lecture

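This lecture is given by the teaching assistant Chi-Liang Liu. I keep my notes brief, since I already have some experience with PyTorch.
The lecture covers four parts: How is automatic differentiation implemented? Which functions are commonly used in DL? How is data processing done in PyTorch? How do you use the newer Mixed Precision approach?
Tensors and relation to NumPy;
Tensor.view() reshapes a tensor, much like reshape();
Broadcasting semantics: as in NumPy, two tensors of different shapes can be added directly through broadcasting, provided their shapes are compatible;
Computation graphs: whenever an operation involves a tensor that requires gradients, torch records the preceding computation steps as a graph;
CUDA semantics: gpu = torch.device("cuda") makes it easy to run on the GPU;
PyTorch as an autograd framework;
Using the gradient: call y.backward() to compute gradients; the result is stored in x.grad.
Linear Regression;
torch.nn.Module is the base class for models, and torch.nn provides many ready-made layers built on it.
Activation functions;
Sequential: PyTorch provides torch.nn.Sequential(), which makes it easy to declare a model;
Loss functions;
torch.optim provides optimizers. A complete example was shown here; see Details.
Neural Network Basics in PyTorch;
Learning rate schedulers: PyTorch provides tools for adjusting the learning rate;
Convolution;
Dataset class: provides a convenient data pipeline; you override three methods, __init__(), __len__(), and __getitem__(). Its counterpart, the DataLoader class, handles batching and loading;
How to use Mixed Precision Training?
NVIDIA provides the Apex library, which performs the floating-point casts automatically; see Details.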
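
The tensor and autograd items in the overview can be sketched in a few lines. This is a minimal illustration, not code from the lecture; the variable names (arr, row, x_dev, and so on) are made up for the example, and it assumes NumPy is installed alongside PyTorch.

```python
import numpy as np
import torch

# Tensors and NumPy: torch.from_numpy shares memory with the source array.
arr = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], dtype=np.float32)
t = torch.from_numpy(arr)            # shape (2, 3)

# Tensor.view() reshapes without copying; the element count must match.
flat = t.view(-1)                    # shape (6,)
cols = t.view(3, 2)                  # shape (3, 2)

# Broadcasting: a (2, 3) tensor plus a (3,) tensor. The smaller tensor is
# treated as if it were repeated along the missing leading dimension.
row = torch.tensor([10.0, 20.0, 30.0])
summed = t + row                     # shape (2, 3)

# Autograd: y = sum(x^2), so dy/dx = 2x. Because x requires gradients,
# torch records the graph and backward() fills in x.grad.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()
y.backward()
grad = x.grad                        # holds 2 * x

# CUDA semantics: fall back to the CPU when no GPU is available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x_dev = x.detach().to(device)
```

Note that view() only works on tensors whose memory layout allows it; reshape() is the more forgiving alternative when that is not guaranteed.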

Details

torch.optim

import torch
from torch import nn

model = nn.Linear(1, 1)

X_simple = torch.tensor([1.])
Y_simple = torch.tensor([2.])

optim = torch.optim.SGD(model.parameters(), lr=1e-2)
mse_loss_fn = nn.MSELoss()

y_hat = model(X_simple)
print('model params before:', model.weight)
loss = mse_loss_fn(y_hat, Y_simple)
optim.zero_grad()   # clear gradients left over from earlier steps
loss.backward()     # compute gradients of the loss w.r.t. the parameters
optim.step()        # apply one SGD update
print('model params after:', model.weight)
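
The single update step above extends naturally to a full training loop that also uses the Dataset, DataLoader, Sequential, and learning rate scheduler items from the overview. The following is a sketch under assumptions of my own, not the lecture's example: LineDataset and mean_loss are hypothetical names, the data is a toy line y = 2x + 1, and StepLR is just one of the schedulers PyTorch offers.

```python
import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader


class LineDataset(Dataset):  # hypothetical toy dataset, for illustration only
    """Points on the line y = 2x + 1."""

    def __init__(self, n=64):
        self.x = torch.linspace(-1.0, 1.0, n).unsqueeze(1)  # shape (n, 1)
        self.y = 2.0 * self.x + 1.0

    def __len__(self):
        return self.x.shape[0]

    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]


torch.manual_seed(0)
dataset = LineDataset()
loader = DataLoader(dataset, batch_size=16, shuffle=True)

model = nn.Sequential(nn.Linear(1, 8), nn.ReLU(), nn.Linear(8, 1))
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
# Halve the learning rate every 10 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)


def mean_loss():
    """Loss over the whole dataset, computed without tracking gradients."""
    with torch.no_grad():
        return loss_fn(model(dataset.x), dataset.y).item()


loss_before = mean_loss()
for epoch in range(30):
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()
    scheduler.step()  # one scheduler step per epoch, after the optimizer steps
loss_after = mean_loss()
final_lr = optimizer.param_groups[0]["lr"]
```

Calling scheduler.step() once per epoch, after the optimizer updates, is the usual ordering for epoch-based schedulers such as StepLR.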

apex.amp

import torch
from apex import amp

# Declare model and optimizer as usual, with default (FP32) precision
model = torch.nn.Linear(10, 100).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

# Allow Amp to perform casts as required by the opt_level
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")
...
# loss.backward() becomes:
with amp.scale_loss(loss, optimizer) as scaled_loss:
    scaled_loss.backward()
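
For comparison, recent PyTorch releases ship a built-in mixed precision context, torch.autocast, which is not what the lecture showed. The sketch below swaps it in for Apex and runs on the CPU with bfloat16 so that it works without a GPU; the tensor shapes and variable names are assumptions made for the example. On a GPU one would typically use float16 together with a gradient scaler such as torch.cuda.amp.GradScaler, which plays the role of amp.scale_loss above.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Model and optimizer are declared as usual, in default (FP32) precision.
model = nn.Linear(10, 100)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

x = torch.randn(4, 10)
target = torch.randn(4, 100)

# Inside autocast, eligible ops such as the linear layer run in bfloat16.
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    out = model(x)
    # Cast back to FP32 before the loss so the reduction keeps full precision.
    loss = loss_fn(out.float(), target)

optimizer.zero_grad()
loss.backward()   # bfloat16 has FP32's exponent range, so no loss scaling here
optimizer.step()

out_dtype = out.dtype                   # dtype of the autocast forward output
weight_dtype = model.weight.dtype       # parameters stay in FP32
grad_dtype = model.weight.grad.dtype    # gradients are accumulated in FP32
```

The parameters themselves are never converted; only the computation inside the autocast block is run in lower precision.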