PyTorch Tutorial
Table of Contents
- PyTorch Tutorial
- [PyTorch Tutorial](https://www.youtube.com/watch?v=8DaeP2vSu90)
- Hands-on: [PyTorch - CNN for MNIST Handwritten Digit Recognition](https://hackmd.io/@lido2370/SJMPbNnKN?type=view)
- **Prerequisites**
- **What is PyTorch?**
- PyTorch v.s. TensorFlow
- Overview of the DNN Training Procedure
- Introduction to Tensors
- PyTorch v.s. NumPy
- How to Calculate Gradient?
- Dataset & Dataloader
- torch.nn -- Neural Network Layers
- torch.nn -- Activation Functions
- torch.nn -- Loss Functions
- torch.nn -- Build your own neural network
- torch.optim
- Neural Network Training
- Neural Network Evaluation (Validation Set)
- Neural Network Evaluation (Testing Set)
- Save/Load a Neural Network
- More About PyTorch
PyTorch Tutorial
Slides: PPT1, PPT2
Hands-on: PyTorch - CNN for MNIST Handwritten Digit Recognition
Outline
● Prerequisites
● What is PyTorch?
● PyTorch v.s. TensorFlow
● Overview of the DNN Training Procedure
● Tensor
● How to Calculate Gradient?
● Dataset & Dataloader
● torch.nn
● torch.optim
● Neural Network Training/Evaluation
● Saving/Loading a Neural Network
● More About PyTorch
Prerequisites
- Python3
- if-else, loop, function, file IO, class, …
- refs: link1, link2, link3
- NumPy
- array & array operations
- ref: link
What is PyTorch?
- An open source machine learning framework.
- A Python package that provides two high-level features:
  - Tensor computation (like NumPy) with strong GPU acceleration: NumPy-like arrays that can run on the GPU
  - Deep neural networks built on a tape-based autograd system: it computes gradients for you
PyTorch v.s. TensorFlow
| | PyTorch | TensorFlow |
|---|---|---|
| Developer | Facebook AI | Google Brain |
| Interface | Python & C++ | C++, JavaScript, Swift |
| Debug | Easier | Difficult (easier in 2.0) |
| Application | Research | Production |

PyTorch is favored in research because experiments can be written and run quickly.
Overview of the DNN Training Procedure
- Load data
  - use torch.utils.data.Dataset / torch.utils.data.DataLoader
- Define the neural network, loss function, and optimizer
  - torch.nn / torch.optim already provide these as ready-made modules
- Training (may be repeated)
- Validation (may be repeated)
- Testing
Introduction to Tensors
- A high-dimensional matrix (array)
- Tensors can hold different data types:

| Data type | dtype | tensor |
|---|---|---|
| 32-bit floating point | torch.float | torch.FloatTensor |
| 64-bit integer (signed) | torch.long | torch.LongTensor |
Tensor Shape
- (5,) = 1-D tensor (dim 0)
- (3, 5) = 2-D tensor, a matrix (dims 0-1)
- (4, 5, 3) = 3-D tensor, a cube (dims 0-2)
:::info
Note: dim in PyTorch == axis in NumPy
:::
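A quick way to check these shapes in code, for example:

```python
import torch

x = torch.zeros([4, 5, 3])
print(x.shape)  # torch.Size([4, 5, 3])
print(x.dim())  # 3 (number of dimensions)
```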
Constructing Tensors
- From a list / NumPy array
  - pass in a Python list, or data already packed in a NumPy array
  - x = torch.tensor([[1, -1], [-1, 1]])
  - x = torch.from_numpy(np.array([[1, -1], [-1, 1]]))
- Zero tensor
  - creates a tensor filled with 0s; you provide the shape, not the data
  - x = torch.zeros([2, 2])
- Unit tensor
  - creates a tensor filled with 1s; you provide the shape, not the data
  - x = torch.ones([1, 2, 5])
torch.nn.functional.softmax(input, dim=None)
https://blog.csdn.net/Will_Ye/article/details/104994504
https://www.cnblogs.com/wanghui-garcia/p/10675588.html
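A minimal sketch of calling the functional softmax (the input values here are arbitrary):

```python
import torch
import torch.nn.functional as F

x = torch.tensor([[1.0, 2.0, 3.0],
                  [1.0, 1.0, 1.0]])
probs = F.softmax(x, dim=1)  # normalize along dim 1, so each row sums to 1
print(probs.sum(dim=1))      # tensor([1., 1.])
```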
Tensor Operators
Squeeze: remove the specified dimension with length = 1 (drops one size-1 dim from the shape)
>>> x = torch.zeros([1, 2, 3])
>>> x.shape
torch.Size([1, 2, 3])
>>> x = x.squeeze(0) # remove dim 0 (its length is 1)
>>> x.shape
torch.Size([2, 3])
Unsqueeze: expand a new dimension (inserts a size-1 dim at the given position)
>>> x = torch.zeros([2, 3])
>>> x.shape
torch.Size([2, 3])
>>> x = x.unsqueeze(1)
>>> x.shape
torch.Size([2, 1, 3])
Transpose: transpose two specified dimensions (swaps the two dims of the tensor)
>>> x = torch.zeros([2, 3])
>>> x.shape
torch.Size([2, 3])
>>> x = x.transpose(0, 1)
>>> x.shape
torch.Size([3, 2])
Cat: concatenate multiple tensors along a specified dim
>>> x = torch.zeros([2, 1, 3]) # middle dim = 1
>>> y = torch.zeros([2, 3, 3]) # middle dim = 3
>>> z = torch.zeros([2, 2, 3]) # middle dim = 2
>>> w = torch.cat([x, y, z], dim=1)
>>> w.shape
torch.Size([2, 6, 3]) # middle dim = 1 + 3 + 2 = 6
● Addition
z = x + y
● Subtraction
z = x - y
● Power
y = x.pow(2)
● Summation
y = x.sum()
● Mean
y = x.mean()
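For instance, with concrete values:

```python
import torch

x = torch.tensor([[1., 2.], [3., 4.]])
y = x.pow(2)   # tensor([[ 1.,  4.], [ 9., 16.]])
s = x.sum()    # tensor(10.)
m = x.mean()   # tensor(2.5000)
```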
PyTorch v.s. NumPy
Attributes (inspect dtype and shape):

| PyTorch | NumPy |
|---|---|
| x.dtype | x.dtype |
| x.shape | x.shape |

Shape manipulation:

| PyTorch | NumPy |
|---|---|
| x.reshape / x.view | x.reshape |
| x.squeeze() | x.squeeze() |
| x.unsqueeze(1) | np.expand_dims(x, 1) |
Tensor -- Device
A special feature of tensors: they can be computed on the GPU.
Default: tensors & modules will be computed with the CPU.
● CPU (default)
x = x.to('cpu')
● GPU (you must run this extra call to compute on the GPU)
x = x.to('cuda')
:::info
What is CUDA? NVIDIA's parallel computing platform for its GPUs.
● Check if your computer has an NVIDIA GPU:
torch.cuda.is_available()
● Multiple GPUs: specify 'cuda:0', 'cuda:1', 'cuda:2', …
:::
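A common pattern (a minimal sketch) is to pick the device once and reuse it everywhere:

```python
import torch

# fall back to the CPU when no NVIDIA GPU is available
device = 'cuda' if torch.cuda.is_available() else 'cpu'
x = torch.zeros([2, 2]).to(device)
```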
How to Calculate Gradient?
>>> x = torch.tensor([[1., 0.], [-1., 1.]], requires_grad=True)
>>> z = x.pow(2).sum()
>>> z.backward() # compute the gradients (differentiation)
>>> x.grad # the computed gradient
tensor([[ 2., 0.], [-2., 2.]])

Since z = sum(x**2), the gradient is dz/dx = 2x, which matches the output above.
Dataset & Dataloader
from torch.utils.data import Dataset, DataLoader # import the dataset utilities

# define the dataset
class MyDataset(Dataset):
    # Read data & preprocess
    def __init__(self, file):
        self.data = ...

    # Return one sample at a time
    def __getitem__(self, index):
        return self.data[index]

    # Return the size of the dataset
    def __len__(self):
        return len(self.data)

# create the dataloader
dataset = MyDataset(file)
dataloader = DataLoader(dataset, batch_size, shuffle=True)
# shuffle: True for training, False for testing
:::info
The dataset is wrapped inside the dataloader.
e.g. with batch_size = 5, each iteration the dataloader takes 5 samples from the dataset and returns them as one mini-batch.
:::
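For instance, a minimal runnable sketch, assuming the data is a plain CSV of numbers (the name CSVDataset and the file 'train.csv' are illustrative, not from the original notes):

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class CSVDataset(Dataset):
    def __init__(self, file):
        # load the whole file into one float tensor
        self.data = torch.from_numpy(np.loadtxt(file, delimiter=',')).float()

    def __getitem__(self, index):
        return self.data[index]

    def __len__(self):
        return len(self.data)

# each iteration yields a mini-batch of 5 rows in random order
loader = DataLoader(CSVDataset('train.csv'), batch_size=5, shuffle=True)
```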
torch.nn -- Neural Network Layers
Linear Layer (Fully-connected Layer)
nn.Linear(in_features, out_features)
Sets the input and output dimensions of the layer.

The input can be any shape, as long as its last dimension equals in_features (32 here),
e.g. (10, 32), (10, 5, 32), (1, 1, 3, 32), …
:::info
The leading dimensions can be any size.
:::
With in_features = 32 and out_features = 64, the layer computes a matrix multiplication plus a bias addition, y = Wx + b (the formulation used in the lecture slides).
To inspect the generated parameters, access .weight:
>>> layer = torch.nn.Linear(32, 64)
>>> layer.weight.shape # inspect the weight matrix
torch.Size([64, 32])
>>> layer.bias.shape # inspect the bias vector
torch.Size([64])
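A quick check of that shape rule (a small sketch):

```python
import torch

layer = torch.nn.Linear(32, 64)
x = torch.zeros(10, 5, 32)  # any leading shape; the last dim must be 32
y = layer(x)
print(y.shape)              # torch.Size([10, 5, 64])
```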
torch.nn -- Activation Functions
● Sigmoid Activation
nn.Sigmoid()
● ReLU Activation
nn.ReLU()
torch.nn -- Loss Functions
● Mean Squared Error (for regression tasks, e.g. linear regression)
nn.MSELoss()
● Cross Entropy (for classification tasks)
nn.CrossEntropyLoss()
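A small sketch of how each loss is called (the values are arbitrary):

```python
import torch
import torch.nn as nn

# MSE for regression: prediction and target share the same shape
mse = nn.MSELoss()
pred = torch.tensor([2.5, 0.0])
target = torch.tensor([3.0, -0.5])
print(mse(pred, target))         # tensor(0.2500), mean of squared differences

# Cross entropy for classification: raw logits + integer class labels
ce = nn.CrossEntropyLoss()
logits = torch.randn(4, 10)      # batch of 4 samples, 10 classes
labels = torch.tensor([1, 0, 3, 9])
print(ce(logits, labels))
```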
torch.nn -- Build your own neural network
import torch.nn as nn

# Your model must subclass nn.Module and typically looks like this.
class MyModel(nn.Module):
    # Initialize your model & define layers
    def __init__(self):
        super(MyModel, self).__init__()
        self.net = nn.Sequential(  # nn.Sequential applies the layers in order
            nn.Linear(10, 32),     # 10-dim input -> 32-dim
            nn.Sigmoid(),          # sigmoid activation
            nn.Linear(32, 1)       # 32-dim -> 1-dim output
        )

    # Compute the output of your NN
    # input of size (batch_size x input_dim)
    def forward(self, x):
        return self.net(x)
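A quick usage sketch of the model above (the batch size here is arbitrary):

```python
import torch

model = MyModel()
x = torch.randn(4, 10)  # batch of 4 samples, each 10-dim
out = model(x)
print(out.shape)        # torch.Size([4, 1])
```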
torch.optim
Handles updating the model's parameters.
● Optimization algorithms for neural networks (gradient descent)
● Stochastic Gradient Descent (SGD) (commonly used)
torch.optim.SGD(params, lr, momentum = 0)
:::warning
params = model.parameters()
lr = learning rate
:::
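For example (a hedged sketch; the learning rate and momentum values are arbitrary):

```python
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
```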
Neural Network Training
Step 1: setup (read data, build the model, loss function, and optimizer)
dataset = MyDataset(file) # read data via MyDataset
tr_set = DataLoader(dataset, 16, shuffle=True) # put dataset into DataLoader
model = MyModel().to(device) # construct model and move to device (cpu/cuda)
criterion = nn.MSELoss() # set loss function
optimizer = torch.optim.SGD(model.parameters(), 0.1) # set optimizer
Step 2: training
The training loop can be written as two nested for loops:
for epoch in range(n_epochs): # iterate n_epochs
    model.train() # set model to train mode before updating the parameters
    for x, y in tr_set: # iterate through the dataloader
        optimizer.zero_grad() # reset gradients to zero
        # (gradients left over from the previous step would otherwise affect this update)
        x, y = x.to(device), y.to(device) # move data to device (cpu/cuda)
        pred = model(x) # forward pass (compute the prediction)
        loss = criterion(pred, y) # compute loss (mean squared error here)
        loss.backward() # compute gradients (backpropagation)
        # backward() only computes the gradients; the model is not updated yet,
        # so call optimizer.step() to apply those gradients to the parameters
        optimizer.step() # update model with optimizer
Neural Network Evaluation (Validation Set)
Typically, after the model finishes a training epoch, it is evaluated on the validation set:
model.eval() # set model to evaluation mode
total_loss = 0
for x, y in dv_set: # iterate through the dataloader
    x, y = x.to(device), y.to(device) # move data to device (cpu/cuda)
    with torch.no_grad(): # disable gradient calculation
        # (we don't want gradients computed or parameters updated during validation)
        pred = model(x) # forward pass (compute output)
        loss = criterion(pred, y) # compute loss
    total_loss += loss.cpu().item() * len(x) # accumulate the loss
avg_loss = total_loss / len(dv_set.dataset) # compute the average loss

This validation loss shows whether the model is improving, and is used to decide whether to save the current model.
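A minimal sketch of that decision (the variable best_loss and the checkpoint path 'model.ckpt' are illustrative, not from the original notes):

```python
# keep the checkpoint with the lowest validation loss seen so far
if avg_loss < best_loss:
    best_loss = avg_loss
    torch.save(model.state_dict(), 'model.ckpt')
```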
Neural Network Evaluation (Testing Set)
The testing set has no ground-truth answers; we only collect the model's predictions.
model.eval() # set model to evaluation mode
preds = []
for x in tt_set: # iterate through the dataloader
    x = x.to(device) # move data to device (cpu/cuda)
    with torch.no_grad(): # disable gradient calculation
        pred = model(x) # forward pass (compute the prediction)
        preds.append(pred.cpu()) # collect the predictions
:::success
.cpu() moves the tensor back to the CPU.
The collected predictions are then typically written out to a .csv file.
:::
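A hedged sketch of writing those predictions to a CSV file (the file name and column layout are assumptions):

```python
import csv

import torch

preds = torch.cat(preds, dim=0).numpy()  # merge per-batch predictions into one array
with open('pred.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['id', 'pred'])      # header row (layout assumed)
    for i, p in enumerate(preds):
        writer.writerow([i, p.item()])   # one row per test sample
```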
Save/Load a Neural Network
● Save (you need to save the model yourself)
torch.save(model.state_dict(), path)
● Load (and load it back yourself)
ckpt = torch.load(path)
model.load_state_dict(ckpt)
More About PyTorch
● torchaudio
○ speech/audio processing
● torchtext
○ natural language processing (NLP)
● torchvision
○ computer vision
● skorch
○ scikit-learn + PyTorch
● Useful GitHub repositories using PyTorch
○ Huggingface Transformers (transformer models: BERT, GPT, …)
○ Fairseq (sequence modeling for NLP & speech)
○ ESPnet (speech recognition, translation, synthesis, …)
○ Many implementations of papers
○ …

Many commonly used packages are built on PyTorch.