Understanding the Chorus Code

卿卿如梦 2022-03-27

1.nn.Linear

torch.nn.Linear(in_features, out_features, bias=True, device=None, dtype=None)
Applies a linear transformation to the incoming data: $y = xA^T + b$
in_features – size of each input sample
out_features – size of each output sample
bias – If set to False, the layer will not learn an additive bias. Default: True
in_features is the size of each input sample, i.e. the size in an input of shape [batch_size, size].
out_features is the size of each output sample: the output is a 2-D tensor of shape [batch_size, out_features], and it also equals the number of neurons in this fully connected layer.
In terms of tensor shapes, the layer turns an input tensor of shape [batch_size, in_features] into an output tensor of shape [batch_size, out_features].
Input

import torch
from torch import nn

# linear layer mapping 20 input features to 30 output features
m = nn.Linear(20, 30)
input = torch.randn(128, 20)  # batch of 128 samples with 20 features each
output = m(input)
print(output.size())

Output

torch.Size([128, 30])
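
As a sanity check, here is a minimal sketch that reproduces the formula $y = xA^T + b$ by hand, using the layer's weight (A, of shape [out_features, in_features]) and bias (b, of shape [out_features]):

import torch
from torch import nn

m = nn.Linear(20, 30)
x = torch.randn(128, 20)

# manual version of y = x @ A^T + b; m.weight is A ([30, 20]), m.bias is b ([30])
y_manual = x @ m.weight.T + m.bias
print(torch.allclose(y_manual, m(x)))  # True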


2.nn.MarginRankingLoss

The RankingLoss family measures the distance between input samples, rather than doing direct regression the way MSELoss does. Its core idea splits into two parts: Margin and Ranking.
Margin literally means the blank border of a page: when we print, the white space around the text is called the margin.
In the loss it plays a similar role: it is a fixed band, and once the distance between samples (i.e. the loss term) exceeds that band, the samples are considered different enough and no further loss is accumulated.
Ranking refers to ordering: when target = 1, $x_1$ should be ranked higher than $x_2$; when target = -1, $x_2$ should be ranked higher than $x_1$.
$loss(x_1, x_2, y) = \max(0, -y \cdot (x_1 - x_2) + \mathrm{margin})$
Input

import torch
from torch import nn

loss = nn.MarginRankingLoss()  # defaults: margin=0.0, reduction='mean'
input1 = torch.randn(3, requires_grad=True)
input2 = torch.randn(3, requires_grad=True)
target = torch.randn(3).sign()  # random targets of +1 / -1
output = loss(input1, input2, target)
output.backward()
print(output)

Output

tensor(0.8041, grad_fn=<MeanBackward0>)
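
To connect the result back to the formula, here is a minimal sketch that computes $\max(0, -y \cdot (x_1 - x_2) + \mathrm{margin})$ by hand (with the default margin = 0 and reduction = 'mean') and checks it against the module:

import torch
from torch import nn

loss = nn.MarginRankingLoss()  # margin=0.0, reduction='mean'
x1 = torch.randn(3)
x2 = torch.randn(3)
y = torch.randn(3).sign()

# element-wise max(0, -y*(x1-x2) + margin), then averaged over the batch
manual = torch.clamp(-y * (x1 - x2) + 0.0, min=0).mean()
print(torch.allclose(manual, loss(x1, x2, y)))  # True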


3.nn.Embedding

A plain-language explanation of nn.Embedding: it is essentially a lookup table of shape [num_embeddings, embedding_dim] that maps integer indices to dense vectors.
Input

import torch
from torch import nn
# an Embedding module containing 10 tensors of size 3
embedding = nn.Embedding(10, 3)
# a batch of 2 samples of 4 indices each
input = torch.LongTensor([[1,2,4,5],[4,3,2,9]])
print(embedding(input))

# example with padding_idx
embedding = nn.Embedding(10, 3, padding_idx=0)
input = torch.LongTensor([[0,2,0,5]])
print(embedding(input))

# example of changing the `pad` vector
padding_idx = 0
embedding = nn.Embedding(3, 3, padding_idx=padding_idx)
with torch.no_grad():
    embedding.weight[padding_idx] = torch.ones(3)  # overwrite the padding row in place
print(embedding.weight)

Output

tensor([[[-0.6279, -0.5354,  0.6028],
         [ 0.7638,  2.2731, -1.4006],
         [-0.4522, -0.8978, -1.7012],
         [-1.0510,  0.4634,  0.8874]],

        [[-0.4522, -0.8978, -1.7012],
         [ 0.4998, -1.6045,  1.7733],
         [ 0.7638,  2.2731, -1.4006],
         [-0.9028,  0.1946, -2.2259]]], grad_fn=<EmbeddingBackward0>)
tensor([[[ 0.0000,  0.0000,  0.0000],
         [ 1.5823,  0.0788, -1.4316],
         [ 0.0000,  0.0000,  0.0000],
         [-2.2077, -0.1790, -1.0192]]], grad_fn=<EmbeddingBackward0>)
Parameter containing:
tensor([[ 1.0000,  1.0000,  1.0000],
        [ 0.9638, -1.2502,  0.3289],
        [-0.8553, -0.7634, -1.1572]], requires_grad=True)
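
One way to see what nn.Embedding actually does, as a minimal sketch: the forward pass is simply row indexing into the weight matrix, so embedding(input) equals embedding.weight[input]:

import torch
from torch import nn

embedding = nn.Embedding(10, 3)
input = torch.LongTensor([[1, 2, 4, 5], [4, 3, 2, 9]])

# the lookup is equivalent to indexing rows of the [10, 3] weight matrix
print(torch.equal(embedding(input), embedding.weight[input]))  # True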

