Getting Started with PyTorch: Using CUDA to Train and Test the Model, or Training on CUDA and Then Testing on the CPU

If you train with CUDA, you need to make changes in the following three places to tell the machine to use CUDA, and there are two ways to do this (covered below):

1. The network structure

2. The loss function

3. The data, immediately before it is used

There are two ways to move something onto CUDA (a short sketch follows the list):
1. xx.cuda()
2. xx.to(device=torch.device("cuda"))
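
As a minimal sketch of the two styles (the net and x names below are placeholders, not part of the original scripts), both move a module or tensor onto the GPU; the practical difference is that .to(device) also runs unchanged on a CPU-only machine:

import torch
from torch import nn

net = nn.Linear(4, 2)      # placeholder module
x = torch.randn(8, 4)      # placeholder input batch

# way 1: .cuda(), only valid when a GPU is actually available
if torch.cuda.is_available():
    net = net.cuda()
    x = x.cuda()

# way 2: .to(device), which falls back to the CPU cleanly
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net = net.to(device)
x = x.to(device)

out = net(x)               # module and data are now on the same device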

Way 1:

1. network structure
model = model.cuda()

2. loss function
cross_entropy_loss = cross_entropy_loss.cuda()

3. data, immediately before use
imgs, targets = data
imgs = imgs.cuda()
targets = targets.cuda()

Note: for tensors, .cuda() is not an in-place operation, so the result must be assigned back; calling imgs.cuda() on its own line would leave imgs on the CPU.

Note: in practice it would be cleaner to handle this with argparse.ArgumentParser() at the very top of the training script (for example, a command-line switch that selects the device), but to keep the code easy to read, that is not done here; a minimal sketch of the idea follows.
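
A minimal sketch of that idea, assuming a hypothetical --device flag (the flag name and its default are illustrative and not part of the original scripts):

import argparse
import torch

parser = argparse.ArgumentParser()
# hypothetical --device flag; name and default are illustrative
parser.add_argument("--device", type=str,
                    default="cuda" if torch.cuda.is_available() else "cpu")
args = parser.parse_args()

device = torch.device(args.device)
# the rest of the script would then use: model = model.to(device), imgs = imgs.to(device), ...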

The code:

from torch.utils.data import DataLoader
from LeNet_5 import *
import torchvision
import torch
from torch import nn
from torch.utils.tensorboard import SummaryWriter


# 1.Create SummaryWriter
writer = SummaryWriter("log_loss")

# 2.Ready dataset
train_dataset = torchvision.datasets.CIFAR10(root="data", train=True, transform=torchvision.transforms.ToTensor(),
                                             download=True)

# 3.Length
train_dataset_size = len(train_dataset)
print("the train dataset size is {}".format(train_dataset_size))

# 4.DataLoader
train_dataloader = DataLoader(dataset=train_dataset, batch_size=64)

# 5.Create model
model = LeNet_5()
# a.add cuda
if torch.cuda.is_available():
    model = model.cuda()

# 6.Create loss
cross_entropy_loss = nn.CrossEntropyLoss()
# b.add cuda
if torch.cuda.is_available():
    cross_entropy_loss = cross_entropy_loss.cuda()

# 7.Optimizer
learning_rate = 1e-2
optim = torch.optim.SGD(model.parameters(), lr=learning_rate)

# 8. Set some parameters to control loop
# epoch
epoch = 80

total_train_step = 0

for i in range(epoch):
    print(" -----------------the {} number of training epoch --------------".format(i + 1))
    model.train()
    for data in train_dataloader:
        imgs, targets = data
        # c.add cuda
        if torch.cuda.is_available():
            imgs = imgs.cuda()
            targets = targets.cuda()
        outputs = model(imgs)
        loss_train = cross_entropy_loss(outputs, targets)

        optim.zero_grad()
        loss_train.backward()
        optim.step()
        total_train_step = total_train_step + 1
        if total_train_step % 100 == 0:
            print("the training step is {} and its loss of model is {}".format(total_train_step, loss_train.item()))
            writer.add_scalar("train_loss", loss_train.item(), total_train_step)
            if total_train_step % 10000 == 0:
                torch.save(model.state_dict(), "model_save/model_{}_GPU.pth".format(total_train_step))
                print("the model of {} training step was saved! ".format(total_train_step))
            if i == (epoch - 1):
                torch.save(model.state_dict(), "model_save/model_{}_GPU.pth".format(total_train_step))
                print("the model of {} training step was saved! ".format(total_train_step))
writer.close()
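
A quick sanity check (not part of the original script) is to print where the model's parameters actually live before the loop starts; a single parameter's .device attribute is enough:

# illustrative check that the model really moved to the GPU
print(next(model.parameters()).device)   # prints cuda:0 when a GPU is available, otherwise cpu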

Way 2:

1. network structure
model = model.to(device=torch.device("cuda"))

2. loss function
cross_entropy_loss = cross_entropy_loss.to(device=torch.device("cuda"))

3. data, immediately before use
imgs, targets = data
imgs = imgs.to(device=torch.device("cuda"))
targets = targets.to(device=torch.device("cuda"))

As with way 1, .to() is not in-place for tensors, so the result must be assigned back.

The code:

from torch.utils.data import DataLoader
from LeNet_5 import *
import torchvision
import torch
from torch import nn
from torch.utils.tensorboard import SummaryWriter

# 1. Choose cuda or cpu
if torch.cuda.is_available():
    device = torch.device("cuda")
else:
    device = torch.device("cpu")

# 2.Create SummaryWriter
writer = SummaryWriter("log_loss")

# 3.Ready dataset
train_dataset = torchvision.datasets.CIFAR10(root="data", train=True, transform=torchvision.transforms.ToTensor(),
                                             download=True)

# 4.Length
train_dataset_size = len(train_dataset)
print("the train dataset size is {}".format(train_dataset_size))

# 5.DataLoader
train_dataloader = DataLoader(dataset=train_dataset, batch_size=64)

# 6.Create model
model = LeNet_5()
# a.add cuda
model = model.to(device=device)

# 7.Create loss
cross_entropy_loss = nn.CrossEntropyLoss()
# b.add cuda
cross_entropy_loss = cross_entropy_loss.to(device=device)

# 8.Optimizer
learning_rate = 1e-2
optim = torch.optim.SGD(model.parameters(), lr=learning_rate)

# 9. Set some parameters to control loop
# epoch
epoch = 80

total_train_step = 0

for i in range(epoch):
    print(" -----------------the {} number of training epoch --------------".format(i + 1))
    model.train()
    for data in train_dataloader:
        imgs, targets = data
        imgs = imgs.to(device)
        targets = targets.to(device)
        outputs = model(imgs)
        loss_train = cross_entropy_loss(outputs, targets)

        optim.zero_grad()
        loss_train.backward()
        optim.step()
        total_train_step = total_train_step + 1
        if total_train_step % 100 == 0:
            print("the training step is {} and its loss of model is {}".format(total_train_step, loss_train.item()))
            writer.add_scalar("train_loss", loss_train.item(), total_train_step)
            if total_train_step % 10000 == 0:
                torch.save(model.state_dict(), "model_save/model_{}_GPU.pth".format(total_train_step))
                print("the model of {} training step was saved! ".format(total_train_step))
            if i == (epoch - 1):
                torch.save(model.state_dict(), "model_save/model_{}_GPU.pth".format(total_train_step))
                print("the model of {} training step was saved! ".format(total_train_step))
writer.close()
For the training results, please refer to the previous chapters; they will not be elaborated on here.
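
Since the title also covers training on CUDA and then testing on the CPU, here is a minimal sketch of loading one of the checkpoints saved above for CPU-side evaluation (the checkpoint filename is illustrative; map_location remaps the CUDA-saved tensors onto the CPU):

import torch
from LeNet_5 import *

device = torch.device("cpu")

model = LeNet_5()
# map_location moves the weights saved from the GPU onto the CPU
state_dict = torch.load("model_save/model_10000_GPU.pth", map_location=device)
model.load_state_dict(state_dict)
model.eval()

# inference then runs entirely on CPU tensors, e.g. outputs = model(imgs)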

Previous chapter: Getting Started with PyTorch - Complete Model Training Routine (Compiled Code)
The end! This series is now complete!
