1.nn.Linear
torch.nn.Linear(in_features, out_features, bias=True, device=None, dtype=None)
Applies a linear transformation to the incoming data:
$y = xA^T + b$
in_features – size of each input sample
out_features – size of each output sample
bias – If set to False, the layer will not learn an additive bias. Default: True
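in_features is the size of each input sample, i.e. the `size` in an input of shape [batch_size, size].
out_features is the size of each output sample, i.e. the output is a 2-D tensor of shape [batch_size, out_features]. It also equals the number of neurons in this fully connected layer.
In terms of tensor shapes, the layer transforms an input tensor of shape [batch_size, in_features] into an output tensor of shape [batch_size, out_features].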
Input
import torch
from torch import nn
m = nn.Linear(20, 30)
input = torch.randn(128, 20)
output = m(input)
print(output.size())
Output
torch.Size([128, 30])
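To make the formula concrete, the sketch below recomputes the layer's output by hand from its stored parameters. It assumes only the standard `weight` and `bias` attributes of `nn.Linear`; the variable names (`layer`, `manual`, `matches`) are chosen here for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)  # fixed seed so the run is repeatable

layer = nn.Linear(20, 30)   # in_features=20, out_features=30
x = torch.randn(128, 20)    # [batch_size, in_features]

# The weight A is stored as [out_features, in_features], the bias b as [out_features].
weight_shape = tuple(layer.weight.shape)
bias_shape = tuple(layer.bias.shape)

# Manual version of y = x A^T + b, compared against the layer's own forward pass.
manual = x @ layer.weight.T + layer.bias
matches = torch.allclose(layer(x), manual, atol=1e-5)

print(weight_shape, bias_shape, matches)
```

Because the weight is stored as [out_features, in_features], it has to be transposed before the matrix product, which is where the $A^T$ in the formula comes from.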
2.nn.MarginRankingLoss
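The RankingLoss family measures the distance between input samples, rather than regressing directly on a target value as MSELoss does. The idea splits into two parts: Margin and Ranking.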
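Margin literally means the blank border of a page: when printing, the blank space around the text is the margin.
In a loss function it plays a similar role as a fixed range: once the signed score difference y*(x1 - x2) reaches the margin, the two samples are considered sufficiently separated and contribute zero loss.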
Ranking means ordering: when target=1, $x_1$ should be ranked higher than $x_2$; when target=-1, $x_2$ should be ranked higher than $x_1$.
$\text{loss}(x_1, x_2, y) = \max(0, -y \cdot (x_1 - x_2) + \text{margin})$
Input
import torch
from torch import nn
loss = nn.MarginRankingLoss()
input1 = torch.randn(3, requires_grad=True)
input2 = torch.randn(3, requires_grad=True)
target = torch.randn(3).sign()
output = loss(input1, input2, target)
output.backward()
print(output)
Output
tensor(0.8041, grad_fn=<MeanBackward0>)
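The random example above does not show how the margin and the target interact, so here is a minimal sketch with fixed values that evaluates the formula by hand and compares it with the built-in loss. The inputs and `margin = 0.5` are chosen here for illustration. The comparison assumes the default `reduction='mean'`.

```python
import torch
from torch import nn

x1 = torch.tensor([0.5, 2.0, -1.0])
x2 = torch.tensor([1.0, 0.0, -1.0])
# 1: x1 should rank higher than x2; -1: x2 should rank higher than x1
y = torch.tensor([1.0, 1.0, -1.0])
margin = 0.5

# max(0, -y * (x1 - x2) + margin), evaluated per sample
per_sample = torch.clamp(-y * (x1 - x2) + margin, min=0)
manual = per_sample.mean()

builtin = nn.MarginRankingLoss(margin=margin)(x1, x2, y)

print(per_sample)       # per-sample losses: 1.0, 0.0, 0.5
print(manual, builtin)  # both equal the mean, 0.5
```

The second pair is already ordered correctly by more than the margin, so its loss is zero. The first pair is ordered the wrong way, and the third is tied, so both are penalized.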
3.nn.Embedding
A plain-language explanation of nn.Embedding [1]
Input
import torch
from torch import nn
# an Embedding module containing 10 tensors of size 3
embedding = nn.Embedding(10, 3)
# a batch of 2 samples of 4 indices each
input = torch.LongTensor([[1,2,4,5],[4,3,2,9]])
print(embedding(input))
# example with padding_idx
embedding = nn.Embedding(10, 3, padding_idx=0)
input = torch.LongTensor([[0,2,0,5]])
print(embedding(input))
# example of changing `pad` vector
padding_idx = 0
embedding = nn.Embedding(3, 3, padding_idx=padding_idx)
embedding.weight
with torch.no_grad():
    embedding.weight[padding_idx] = torch.ones(3)
print(embedding.weight)
Output
tensor([[[-0.6279, -0.5354, 0.6028],
[ 0.7638, 2.2731, -1.4006],
[-0.4522, -0.8978, -1.7012],
[-1.0510, 0.4634, 0.8874]],
[[-0.4522, -0.8978, -1.7012],
[ 0.4998, -1.6045, 1.7733],
[ 0.7638, 2.2731, -1.4006],
[-0.9028, 0.1946, -2.2259]]], grad_fn=<EmbeddingBackward0>)
tensor([[[ 0.0000, 0.0000, 0.0000],
[ 1.5823, 0.0788, -1.4316],
[ 0.0000, 0.0000, 0.0000],
[-2.2077, -0.1790, -1.0192]]], grad_fn=<EmbeddingBackward0>)
Parameter containing:
tensor([[ 1.0000, 1.0000, 1.0000],
[ 0.9638, -1.2502, 0.3289],
[-0.8553, -0.7634, -1.1572]], requires_grad=True)
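One way to read these outputs is that `nn.Embedding` behaves like a row lookup into its weight matrix, and that the `padding_idx` row receives no gradient. The sketch below checks both under that reading. The variable names (`same_as_indexing`, `grad`) are chosen here for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)  # fixed seed so the run is repeatable

embedding = nn.Embedding(10, 3, padding_idx=0)  # 10 rows, each a vector of size 3
idx = torch.tensor([[0, 2, 0, 5]])              # 1 sample of 4 indices

out = embedding(idx)                            # shape [1, 4, 3]

# The forward pass returns the same values as indexing the weight matrix directly.
same_as_indexing = torch.equal(out, embedding.weight[idx])

# Backpropagate a simple scalar so every looked-up element gets gradient 1.
out.sum().backward()
grad = embedding.weight.grad

print(tuple(out.shape), same_as_indexing)
print(grad[0])  # padding row: stays all zeros
print(grad[2])  # a looked-up row: all ones
```

The padding row is initialized to zeros and is excluded from gradient updates, so it only changes if it is overwritten manually, as the `torch.no_grad()` example above does.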