PyTorch Tutorial
Table of Contents
- PyTorch Tutorial
- [PyTorch Tutorial](https://www.youtube.com/watch?v=8DaeP2vSu90)
- Hands-on: [PyTorch - CNN for MNIST Handwritten Digit Recognition](https://hackmd.io/@lido2370/SJMPbNnKN?type=view)
- **Prerequisites**
- **What is PyTorch?**
- PyTorch v.s. TensorFlow
- Overview of the DNN Training Procedure
- Introduction to Tensors
- PyTorch v.s. NumPy
- How to Calculate Gradient?
- Dataset & Dataloader
- torch.nn -- Neural Network Layers
- torch.nn -- Activation Functions
- torch.nn -- Loss Functions
- torch.nn -- Build your own neural network
- torch.optim
- Neural Network Training
- Neural Network Evaluation (Validation Set)
- Neural Network Evaluation (Testing Set)
- Save/Load a Neural Network
- More About PyTorch
PyTorch Tutorial
Slides: PPT1, PPT2
Hands-on: PyTorch - CNN for MNIST Handwritten Digit Recognition
Outline
● Prerequisites
● What is PyTorch?
● PyTorch v.s. TensorFlow
● Overview of the DNN Training Procedure
● Tensor
● How to Calculate Gradient?
● Dataset & Dataloader
● torch.nn
● torch.optim
● Neural Network Training/Evaluation
● Saving/Loading a Neural Network
● More About PyTorch
Prerequisites
- Python3
- if-else, loop, function, file IO, class, …
- refs: link1, link2, link3
- NumPy
- array & array operations
- ref: link
What is PyTorch?
- An open source machine learning framework.
- A Python package that provides two high-level features:
  - Tensor computation (like NumPy) with strong GPU acceleration: NumPy-like arrays that can run on the GPU
  - Deep neural networks built on a tape-based autograd system: it computes gradients for you
PyTorch v.s. TensorFlow
| | PyTorch | TensorFlow |
|---|---|---|
| Developer | Facebook AI | Google Brain |
| Interface | Python & C++ | C++, JavaScript, Swift |
| Debug | Easier | Difficult (easier in 2.0) |
| Application | Research | Production |

PyTorch is favored in research because experiments can be written and run quickly.
Overview of the DNN Training Procedure
- Load data
  - use torch.utils.data.Dataset / torch.utils.data.DataLoader
- Define the neural network, loss function, and optimizer
  - torch.nn / torch.optim already provide these as ready-made modules
- Training (may be repeated)
- Validation (may be repeated)
- Testing
Introduction to Tensors
- A high-dimensional matrix (array)
- Tensors can hold different data types:

| Data type | dtype | tensor |
|---|---|---|
| 32-bit floating point | torch.float | torch.FloatTensor |
| 64-bit integer (signed) | torch.long | torch.LongTensor |
Tensor Shape
- (5,) = 1-D tensor (dim 0)
- (3, 5) = 2-D tensor, a matrix (dims 0-1)
- (4, 5, 3) = 3-D tensor, a cube (dims 0-2)
:::info
Note: dim in PyTorch == axis in NumPy
:::
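A quick way to check these shapes in code, for example:

```python
import torch

x = torch.zeros([4, 5, 3])
print(x.shape)  # torch.Size([4, 5, 3])
print(x.dim())  # 3 (number of dimensions)
```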
Constructing Tensors
- From a list / NumPy array
  - pass in a Python list, or data already packed in a NumPy array
  - x = torch.tensor([[1, -1], [-1, 1]])
  - x = torch.from_numpy(np.array([[1, -1], [-1, 1]]))
- Zero tensor
  - creates a tensor filled with 0s; you provide the shape, not the data
  - x = torch.zeros([2, 2])
- Unit tensor
  - creates a tensor filled with 1s; you provide the shape, not the data
  - x = torch.ones([1, 2, 5])
torch.nn.functional.softmax(input, dim=None)
https://blog.csdn.net/Will_Ye/article/details/104994504
https://www.cnblogs.com/wanghui-garcia/p/10675588.html
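A minimal sketch of calling the functional softmax (the input values here are arbitrary):

```python
import torch
import torch.nn.functional as F

x = torch.tensor([[1.0, 2.0, 3.0],
                  [1.0, 1.0, 1.0]])
probs = F.softmax(x, dim=1)  # normalize along dim 1, so each row sums to 1
print(probs.sum(dim=1))      # tensor([1., 1.])
```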
Tensor Operators
Squeeze: remove the specified dimension with length = 1 (drops one size-1 dim from the shape)
>>> x = torch.zeros([1, 2, 3])
>>> x.shape
torch.Size([1, 2, 3])
>>> x = x.squeeze(0) # remove dim 0 (its length is 1)
>>> x.shape
torch.Size([2, 3])
Unsqueeze: expand a new dimension (inserts a size-1 dim at the given position)
>>> x = torch.zeros([2, 3])
>>> x.shape
torch.Size([2, 3])
>>> x = x.unsqueeze(1)
>>> x.shape
torch.Size([2, 1, 3])
Transpose: transpose two specified dimensions (swaps the two dims of the tensor)
>>> x = torch.zeros([2, 3])
>>> x.shape
torch.Size([2, 3])
>>> x = x.transpose(0, 1)
>>> x.shape
torch.Size([3, 2])
Cat: concatenate multiple tensors along a specified dim
>>> x = torch.zeros([2, 1, 3]) # middle dim = 1
>>> y = torch.zeros([2, 3, 3]) # middle dim = 3
>>> z = torch.zeros([2, 2, 3]) # middle dim = 2
>>> w = torch.cat([x, y, z], dim=1)
>>> w.shape
torch.Size([2, 6, 3]) # middle dim = 1 + 3 + 2 = 6
● Addition
z = x + y
● Subtraction
z = x - y
● Power
y = x.pow(2)
● Summation
y = x.sum()
● Mean
y = x.mean()
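For instance, with concrete values:

```python
import torch

x = torch.tensor([[1., 2.], [3., 4.]])
y = x.pow(2)   # tensor([[ 1.,  4.], [ 9., 16.]])
s = x.sum()    # tensor(10.)
m = x.mean()   # tensor(2.5000)
```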
PyTorch v.s. NumPy
Attributes (inspect dtype and shape):

| PyTorch | NumPy |
|---|---|
| x.dtype | x.dtype |
| x.shape | x.shape |

Shape manipulation:

| PyTorch | NumPy |
|---|---|
| x.reshape / x.view | x.reshape |
| x.squeeze() | x.squeeze() |
| x.unsqueeze(1) | np.expand_dims(x, 1) |
Tensor -- Device
A special feature of tensors: they can be computed on the GPU.
Default: tensors & modules will be computed with the CPU.
● CPU (default)
x = x.to('cpu')
● GPU (you must run this extra call to compute on the GPU)
x = x.to('cuda')
:::info
What is CUDA? NVIDIA's parallel computing platform for its GPUs.
● Check if your computer has an NVIDIA GPU:
torch.cuda.is_available()
● Multiple GPUs: specify 'cuda:0', 'cuda:1', 'cuda:2', …
:::
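A common pattern (a minimal sketch) is to pick the device once and reuse it everywhere:

```python
import torch

# fall back to the CPU when no NVIDIA GPU is available
device = 'cuda' if torch.cuda.is_available() else 'cpu'
x = torch.zeros([2, 2]).to(device)
```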
How to Calculate Gradient?
>>> x = torch.tensor([[1., 0.], [-1., 1.]], requires_grad=True)
>>> z = x.pow(2).sum()
>>> z.backward() # compute the gradients (differentiation)
>>> x.grad # the computed gradient
tensor([[ 2., 0.], [-2., 2.]])

Since z = sum(x**2), the gradient is dz/dx = 2x, which matches the output above.
Dataset & Dataloader
from torch.utils.data import Dataset, DataLoader # import the dataset utilities

# define the dataset
class MyDataset(Dataset):
    # Read data & preprocess
    def __init__(self, file):
        self.data = ...

    # Return one sample at a time
    def __getitem__(self, index):
        return self.data[index]

    # Return the size of the dataset
    def __len__(self):
        return len(self.data)

# create the dataloader
dataset = MyDataset(file)
dataloader = DataLoader(dataset, batch_size, shuffle=True)
# shuffle: True for training, False for testing
:::info
The dataset is wrapped inside the dataloader.
e.g. with batch_size = 5, each iteration the dataloader takes 5 samples from the dataset and returns them as one mini-batch.
:::
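For instance, a minimal runnable sketch, assuming the data is a plain CSV of numbers (the name CSVDataset and the file 'train.csv' are illustrative, not from the original notes):

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class CSVDataset(Dataset):
    def __init__(self, file):
        # load the whole file into one float tensor
        self.data = torch.from_numpy(np.loadtxt(file, delimiter=',')).float()

    def __getitem__(self, index):
        return self.data[index]

    def __len__(self):
        return len(self.data)

# each iteration yields a mini-batch of 5 rows in random order
loader = DataLoader(CSVDataset('train.csv'), batch_size=5, shuffle=True)
```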
torch.nn -- Neural Network Layers
Linear Layer (Fully-connected Layer)
nn.Linear(in_features, out_features)
Sets the input and output dimensions of the layer.

The input can be any shape, as long as its last dimension equals in_features (32 here),
e.g. (10, 32), (10, 5, 32), (1, 1, 3, 32), …
:::info
The leading dimensions can be any size.
:::
With in_features = 32 and out_features = 64, the layer computes a matrix multiplication plus a bias addition, y = Wx + b (the formulation used in the lecture slides).
To inspect the generated parameters, access .weight:
>>> layer = torch.nn.Linear(32, 64)
>>> layer.weight.shape # inspect the weight matrix
torch.Size([64, 32])
>>> layer.bias.shape # inspect the bias vector
torch.Size([64])
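A quick check of that shape rule (a small sketch):

```python
import torch

layer = torch.nn.Linear(32, 64)
x = torch.zeros(10, 5, 32)  # any leading shape; the last dim must be 32
y = layer(x)
print(y.shape)              # torch.Size([10, 5, 64])
```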
torch.nn -- Activation Functions
● Sigmoid Activation
nn.Sigmoid()
● ReLU Activation
nn.ReLU()
torch.nn -- Loss Functions
● Mean Squared Error (for regression tasks, e.g. linear regression)
nn.MSELoss()
● Cross Entropy (for classification tasks)
nn.CrossEntropyLoss()
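A small sketch of how each loss is called (the values are arbitrary):

```python
import torch
import torch.nn as nn

# MSE for regression: prediction and target share the same shape
mse = nn.MSELoss()
pred = torch.tensor([2.5, 0.0])
target = torch.tensor([3.0, -0.5])
print(mse(pred, target))         # tensor(0.2500), mean of squared differences

# Cross entropy for classification: raw logits + integer class labels
ce = nn.CrossEntropyLoss()
logits = torch.randn(4, 10)      # batch of 4 samples, 10 classes
labels = torch.tensor([1, 0, 3, 9])
print(ce(logits, labels))
```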
torch.nn -- Build your own neural network
import torch.nn as nn

# Your model must subclass nn.Module and typically looks like this.
class MyModel(nn.Module):
    # Initialize your model & define layers
    def __init__(self):
        super(MyModel, self).__init__()
        self.net = nn.Sequential(  # nn.Sequential applies the layers in order
            nn.Linear(10, 32),     # 10-dim input -> 32-dim
            nn.Sigmoid(),          # sigmoid activation
            nn.Linear(32, 1)       # 32-dim -> 1-dim output
        )

    # Compute the output of your NN
    # input of size (batch_size x input_dim)
    def forward(self, x):
        return self.net(x)
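A quick usage sketch of the model above (the batch size here is arbitrary):

```python
import torch

model = MyModel()
x = torch.randn(4, 10)  # batch of 4 samples, each 10-dim
out = model(x)
print(out.shape)        # torch.Size([4, 1])
```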
torch.optim
Handles updating the model's parameters.
● Optimization algorithms for neural networks (gradient descent)
● Stochastic Gradient Descent (SGD) (commonly used)
torch.optim.SGD(params, lr, momentum = 0)
:::warning
params = model.parameters()
lr = learning rate
:::
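For example (a hedged sketch; the learning rate and momentum values are arbitrary):

```python
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
```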
Neural Network Training
Step 1: setup (read data, build the model, loss function, and optimizer)
dataset = MyDataset(file) # read data via MyDataset
tr_set = DataLoader(dataset, 16, shuffle=True) # put dataset into DataLoader
model = MyModel().to(device) # construct model and move to device (cpu/cuda)
criterion = nn.MSELoss() # set loss function
optimizer = torch.optim.SGD(model.parameters(), 0.1) # set optimizer
Step 2: training
The training loop can be written as two nested for loops:
for epoch in range(n_epochs): # iterate n_epochs
    model.train() # set model to train mode before updating the parameters
    for x, y in tr_set: # iterate through the dataloader
        optimizer.zero_grad() # reset gradients to zero
        # (gradients left over from the previous step would otherwise affect this update)
        x, y = x.to(device), y.to(device) # move data to device (cpu/cuda)
        pred = model(x) # forward pass (compute the prediction)
        loss = criterion(pred, y) # compute loss (mean squared error here)
        loss.backward() # compute gradients (backpropagation)
        # backward() only computes the gradients; the model is not updated yet,
        # so call optimizer.step() to apply those gradients to the parameters
        optimizer.step() # update model with optimizer
Neural Network Evaluation (Validation Set)
Typically, after the model finishes a training epoch, it is evaluated on the validation set:
model.eval() # set model to evaluation mode
total_loss = 0
for x, y in dv_set: # iterate through the dataloader
    x, y = x.to(device), y.to(device) # move data to device (cpu/cuda)
    with torch.no_grad(): # disable gradient calculation
        # (we don't want gradients computed or parameters updated during validation)
        pred = model(x) # forward pass (compute output)
        loss = criterion(pred, y) # compute loss
    total_loss += loss.cpu().item() * len(x) # accumulate the loss
avg_loss = total_loss / len(dv_set.dataset) # compute the average loss

This validation loss shows whether the model is improving, and is used to decide whether to save the current model.
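A minimal sketch of that decision (the variable best_loss and the checkpoint path 'model.ckpt' are illustrative, not from the original notes):

```python
# keep the checkpoint with the lowest validation loss seen so far
if avg_loss < best_loss:
    best_loss = avg_loss
    torch.save(model.state_dict(), 'model.ckpt')
```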
Neural Network Evaluation (Testing Set)
The testing set has no ground-truth answers; we only collect the model's predictions.
model.eval() # set model to evaluation mode
preds = []
for x in tt_set: # iterate through the dataloader
    x = x.to(device) # move data to device (cpu/cuda)
    with torch.no_grad(): # disable gradient calculation
        pred = model(x) # forward pass (compute the prediction)
        preds.append(pred.cpu()) # collect the predictions
:::success
.cpu() moves the tensor back to the CPU.
The collected predictions are then typically written out to a .csv file.
:::
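A hedged sketch of writing those predictions to a CSV file (the file name and column layout are assumptions):

```python
import csv

import torch

preds = torch.cat(preds, dim=0).numpy()  # merge per-batch predictions into one array
with open('pred.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['id', 'pred'])      # header row (layout assumed)
    for i, p in enumerate(preds):
        writer.writerow([i, p.item()])   # one row per test sample
```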
Save/Load a Neural Network
● Save (you need to save the model yourself)
torch.save(model.state_dict(), path)
● Load (and load it back yourself)
ckpt = torch.load(path)
model.load_state_dict(ckpt)
More About PyTorch
● torchaudio
○ speech/audio processing
● torchtext
○ natural language processing (NLP)
● torchvision
○ computer vision
● skorch
○ scikit-learn + PyTorch
● Useful GitHub repositories using PyTorch
○ Huggingface Transformers (transformer models: BERT, GPT, …)
○ Fairseq (sequence modeling for NLP & speech)
○ ESPnet (speech recognition, translation, synthesis, …)
○ Many implementations of papers
○ …

Many commonly used packages are built on PyTorch.