This article only introduces how to use the PyTorch deep learning framework. It does not cover the complex mathematical principles behind the algorithms; readers interested in those are encouraged to consult the relevant papers and literature.
1. Basic tensor operations
1.1 Tensor dtypes
| Code | Meaning |
| --- | --- |
| float32 | 32-bit float |
| float | alias for float32 |
| float64 | 64-bit float |
| double | alias for float64 |
| float16 | 16-bit float |
| bfloat16 | wider range than float16 but lower precision |
| int8 | 8-bit int |
| int16 | 16-bit int |
| short | alias for int16 |
| int32 | 32-bit int |
| int | alias for int32 |
| int64 | 64-bit int |
| long | alias for int64 |
| complex32 | 32-bit complex |
| complex64 | 64-bit complex |
| cfloat | alias for complex64 |
| complex128 | 128-bit complex |
| cdouble | alias for complex128 |
1.2 Creating tensors (passing arguments by keyword is recommended)
Tensor-creation functions accept many parameters. To save space, each API is listed only once; overloaded variants are omitted.
1.2.1 Empty tensor (filled with uninitialized data)
API
@overload
def empty(size: Sequence[Union[_int, SymInt]], *, memory_format: Optional[memory_format]=None, out: Optional[Tensor]=None, dtype: Optional[_dtype]=None, layout: Optional[_layout]=None, device: Optional[Union[_device, str, None]]=None, pin_memory: Optional[_bool]=False, requires_grad: Optional[_bool]=False) -> Tensor: ...
size: the shape, e.g. [rows, columns] for a 2-D tensor
dtype (data type): the element type
device: the device (CPU/GPU) on which the tensor is placed
requires_grad: whether to enable automatic differentiation for this tensor; defaults to False
Example
gpu=torch.device("cuda")
empty_tensor=torch.empty(size=[3,4],device=gpu,requires_grad=True)
print(empty_tensor)
Output
tensor([[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]], device='cuda:0', requires_grad=True)
1.2.2 All-ones tensor
@overload
def ones(size: _size, *, names: Optional[Sequence[Union[str, ellipsis, None]]], dtype: Optional[_dtype]=None, layout: Optional[_layout]=None, device: Optional[Union[_device, str, None]]=None, pin_memory: Optional[_bool]=False, requires_grad: Optional[_bool]=False) -> Tensor: ...
size: the shape, e.g. [rows, columns] for a 2-D tensor
dtype (data type): the element type
device: the device (CPU/GPU) on which the tensor is placed
requires_grad: whether to enable automatic differentiation for this tensor; defaults to False
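Example (a minimal sketch; it reuses the gpu device defined above and assumes a CUDA device is available):
ones_tensor = torch.ones(size=[2, 3], device=gpu)
print(ones_tensor)
# tensor([[1., 1., 1.],
#         [1., 1., 1.]], device='cuda:0')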
1.2.3 All-zeros tensor
@overload
def zeros(size: _size, *, names: Optional[Sequence[Union[str, ellipsis, None]]], dtype: Optional[_dtype]=None, layout: Optional[_layout]=None, device: Optional[Union[_device, str, None]]=None, pin_memory: Optional[_bool]=False, requires_grad: Optional[_bool]=False) -> Tensor: ...
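Example (a minimal sketch on the CPU):
zeros_tensor = torch.zeros(size=[2, 3], dtype=torch.int64)
print(zeros_tensor)
# tensor([[0, 0, 0],
#         [0, 0, 0]])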
1.2.4 Tensor with random values in [0, 1)
@overload
def rand(size: _size, *, generator: Optional[Generator], names: Optional[Sequence[Union[str, ellipsis, None]]], dtype: Optional[_dtype]=None, layout: Optional[_layout]=None, device: Optional[Union[_device, str, None]]=None, pin_memory: Optional[_bool]=False, requires_grad: Optional[_bool]=False) -> Tensor: ...
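Example (a minimal sketch; the values are random, so the output below is only one possibility):
rand_tensor = torch.rand(size=[2, 3])
print(rand_tensor)
# e.g. tensor([[0.4963, 0.7682, 0.0885],
#              [0.1320, 0.3074, 0.6341]])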
1.2.5 Tensor of random integers with given lower and upper bounds
API
@overload
def randint(low: _int, high: _int, size: _size, *, generator: Optional[Generator]=None, dtype: Optional[_dtype]=None, device: Device=None, requires_grad: _bool=False) -> Tensor: ...
Example
int_tensor=torch.randint(low=0,high=20,size=[5,6],device=gpu)
print(int_tensor)
Output
tensor([[18, 0, 14, 7, 18, 14],
[17, 0, 2, 0, 0, 3],
[16, 17, 5, 15, 1, 14],
[ 7, 12, 8, 6, 4, 11],
[12, 4, 7, 5, 3, 3]], device='cuda:0')
1.2.6 Tensor of random values with mean 0 and variance 1
@overload
def randn(size: _size, *, generator: Optional[Generator], names: Optional[Sequence[Union[str, ellipsis, None]]], dtype: Optional[_dtype]=None, layout: Optional[_layout]=None, device: Optional[Union[_device, str, None]]=None, pin_memory: Optional[_bool]=False, requires_grad: Optional[_bool]=False) -> Tensor: ...
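Example (a minimal sketch; samples are drawn from the standard normal distribution, so your values will differ):
randn_tensor = torch.randn(size=[1000])
print(randn_tensor.mean(), randn_tensor.std())  # close to 0 and 1 respectively for a large sample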
1.2.7 Creating a tensor from a list or NumPy array
def tensor(data: Any, dtype: Optional[_dtype]=None, device: Device=None, requires_grad: _bool=False) -> Tensor: ...
- If you use `torch.from_numpy()`, the returned tensor shares memory with the ndarray.
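A minimal sketch contrasting the two: torch.tensor() copies the data, while torch.from_numpy() shares it:
import numpy as np

arr = np.array([1.0, 2.0, 3.0])
copied = torch.tensor(arr)      # independent copy of the data
shared = torch.from_numpy(arr)  # shares memory with arr
arr[0] = 100.0
print(copied)  # tensor([1., 2., 3.], dtype=torch.float64) -- unchanged
print(shared)  # tensor([100., 2., 3.], dtype=torch.float64) -- changed along with arr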
1.3 Common tensor member functions and member variables
1.3.1 Converting to a NumPy array
def numpy(self,*args, **kwargs): # real signature unknown; NOTE: unreliably restored from __doc__
pass
- Only a tensor on the CPU can be converted to a NumPy array
- A tensor whose requires_grad attribute is True cannot be converted to a NumPy array (see the sketch below)
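A minimal sketch that satisfies both restrictions by detaching the tensor from the graph and moving it to the CPU first (it reuses the gpu device defined above):
t = torch.ones(2, 2, device=gpu, requires_grad=True)
# t.numpy() would raise an error here: t is on CUDA and requires grad
arr = t.detach().cpu().numpy()
print(arr)
# [[1. 1.]
#  [1. 1.]]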
1.3.2 Getting the value of a single-element tensor: item
def item(self): # real signature unknown; restored from __doc__
...
- If the tensor holds exactly one element, returns its value as a Python number
- If the tensor holds more than one element, raises a ValueError (see the sketch below)
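A minimal sketch:
scalar = torch.tensor([42])
print(scalar.item())  # 42
multi = torch.tensor([1, 2])
# multi.item() raises a ValueError: only one-element tensors can be converted to Python scalars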
1.3.3 Getting the number of dimensions
def dim(self): #real signature unknown; restored from __doc__
return 0
- Returns an int giving the number of dimensions, as the sketch below shows
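A minimal sketch:
x = torch.zeros(2, 3, 4)
print(x.dim())  # 3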
1.3.4 Getting the data type
dtype = property(lambda self: object(), lambda self, v: None, lambda self: None) # default
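A minimal sketch (float32 is the default floating-point dtype):
x = torch.zeros(2, 3)
print(x.dtype)  # torch.float32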
1.3.5 Getting the shape
def size(self,dim=None): # real signature unknown; restored from __doc__
pass
- Using `.shape` gives the same result (see the sketch below)
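A minimal sketch:
x = torch.zeros(2, 3)
print(x.size())   # torch.Size([2, 3])
print(x.shape)    # torch.Size([2, 3]), same result
print(x.size(0))  # 2, the size of dimension 0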
1.3.6 Shallow copy and deep copy
Shallow copy with the detach function
Suppose we have model A and model B, and the output of A must be fed to B as input, but during training we only want to train model B. We can do this:
input_B = output_A.detach()
This severs the gradient flow between the two computation graphs, which achieves exactly what we need.
`detach` returns a new tensor that shares the same data memory as the original but takes no part in gradient computation, i.e. requires_grad=False. Modifying the values of one tensor changes the other as well, because they share the same block of memory.
import numpy as np

sequence_tensor = torch.tensor(np.array([[[1, 2, 3],
                                          [4, 5, 6]],
                                         [[9, 8, 7],
                                          [6, 5, 4]]]),
                               dtype=torch.float, device=gpu)
sequence_tensor_shallowCp=sequence_tensor.detach()
sequence_tensor_shallowCp+=1
print(sequence_tensor)
print(sequence_tensor_shallowCp.requires_grad)
Output
tensor([[[ 2., 3., 4.],
[ 5., 6., 7.]],
[[10., 9., 8.],
[ 7., 6., 5.]]], device='cuda:0')
False
Deep copy
- Method 1: `.clone().detach()` (demonstrated below)
- Method 2: `.new_tensor()`
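A minimal sketch continuing the sequence_tensor example above: the deep copy no longer shares memory, so modifying it leaves the original untouched:
sequence_tensor_deepCp = sequence_tensor.clone().detach()
sequence_tensor_deepCp += 1
print(sequence_tensor[0, 0, 0])         # tensor(2., device='cuda:0'), unchanged
print(sequence_tensor_deepCp[0, 0, 0])  # tensor(3., device='cuda:0')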
1.3.7 Shape transformations
Transpose
Transposing a vector or matrix
def t(self): # real signature unknown; restored from __doc__
"""
t() -> Tensor
See :func:`torch.t`
"""
return _te.Tensor(*(), **{})
- The return value shares memory with the original tensor!
Transposing two specified dimensions (note that this is `transpose`, not `permute`):
def transpose(self, dim0: _int, dim1: _int) -> Tensor:
    r"""
    transpose(dim0, dim1) -> Tensor
    See :func:`torch.transpose`
    """
    ...
- The return value shares memory with the original tensor!
- For a matrix, `.t()` is equivalent to `.transpose(0, 1)`, as the sketch below shows
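A minimal sketch:
m = torch.tensor([[1, 2, 3],
                  [4, 5, 6]])
print(m.t().shape)              # torch.Size([3, 2])
print(m.transpose(0, 1).shape)  # torch.Size([3, 2]), same result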
Permuting multiple dimensions at once
def permute(self, *dims): # real signature unknown; restored from __doc__
"""
permute(*dims) -> Tensor
See :func:`torch.permute`
"""
return _te.Tensor(*(), **{})
- Put each dimension index in its target position. For a 3-D tensor, the x, y, z axes correspond to 0, 1, 2; to swap the x and z axes, pass 2, 1, 0.
- The return value shares memory with the original tensor! (See the sketch below.)
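A minimal sketch that swaps the first and last axes of a 3-D tensor and demonstrates the shared memory:
x = torch.zeros(2, 3, 4)
y = x.permute(2, 1, 0)  # shape becomes (4, 3, 2)
print(y.shape)          # torch.Size([4, 3, 2])
y[0, 0, 0] = 1.0
print(x[0, 0, 0])       # tensor(1.), the memory is shared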
cat (concatenation)
`cat` joins two or more tensors along a specified existing dimension. The number of dimensions stays the same: the size of the specified dimension grows, while the sizes of the other dimensions are unchanged. For example, two row vectors of shape=(3,) concatenated along dim=0 become one row vector of shape=(6,); two 3x3 square matrices concatenated along dim=0 become one (6, 3) matrix.
`cat` places a requirement on its inputs: apart from the specified dimension, all other dimensions must have the same size. For example, a shape=(1, 6) matrix can be concatenated with a shape=(2, 6) matrix along dim=0.
Examples can be found in the definition and docstring below.
def cat(tensors: Union[Tuple[Tensor, ...], List[Tensor]], dim: _int = 0, *, out: Optional[Tensor] = None) -> Tensor:
r"""
cat(tensors, dim=0, *, out=None) -> Tensor
Concatenates the given sequence of :attr:`seq` tensors in the given dimension.
All tensors must either have the same shape (except in the concatenating
dimension) or be a 1-D empty tensor with size ``(0,)``.
:func:`torch.cat` can be seen as an inverse operation for :func:`torch.split`
and :func:`torch.chunk`.
:func:`torch.cat` can be best understood via examples.
.. seealso::
:func:`torch.stack` concatenates the given sequence along a new dimension.
Args:
tensors (sequence of Tensors): any python sequence of tensors of the same type.
Non-empty tensors provided must have the same shape, except in the
cat dimension.
dim (int, optional): the dimension over which the tensors are concatenated
Keyword args:
out (Tensor, optional): the output tensor.
Example::
>>> x = torch.randn(2, 3)
>>> x
tensor([[ 0.6580, -1.0969, -0.4614],
[-0.1034, -0.5790, 0.1497]])
>>> torch.cat((x, x, x), 0)
tensor([[ 0.6580, -1.0969, -0.4614],
[-0.1034, -0.5790, 0.1497],
[ 0.6580, -1.0969, -0.4614],
[-0.1034, -0.5790, 0.1497],
[ 0.6580, -1.0969, -0.4614],
[-0.1034, -0.5790, 0.1497]])
>>> torch.cat((x, x, x), 1)
tensor([[ 0.6580, -1.0969, -0.4614, 0.6580, -1.0969, -0.4614, 0.6580,
-1.0969, -0.4614],
[-0.1034, -0.5790, 0.1497, -0.1034, -0.5790, 0.1497, -0.1034,
-0.5790, 0.1497]])
"""
...
- The return value does not share memory with the original tensors!
stack (stacking)
`stack` is very different from `cat`: it joins two or more tensors along a brand-new dimension created at `dim`. The sizes of the existing dimensions are unchanged, and the size of the new dimension equals the number of tensors used in the stack. For example, 3 row vectors of shape=(3,) stacked along dim=0 become a shape=(3, 3) matrix; two 3x3 square matrices stacked along dim=-1 become a (3, 3, 2) tensor.
def stack(tensors: Union[Tuple[Tensor, ...], List[Tensor]], dim: _int = 0, *, out: Optional[Tensor] = None) -> Tensor:
r"""
stack(tensors, dim=0, *, out=None) -> Tensor
Concatenates a sequence of tensors along a new dimension.
All tensors need to be of the same size.
.. seealso::
:func:`torch.cat` concatenates the given sequence along an existing dimension.
Arguments:
tensors (sequence of Tensors): sequence of tensors to concatenate
dim (int, optional): dimension to insert. Has to be between 0 and the number
of dimensions of concatenated tensors (inclusive). Default: 0
Keyword args:
out (Tensor, optional): the output tensor.
Example::
>>> x = torch.randn(2, 3)
>>> x
tensor([[ 0.3367, 0.1288, 0.2345],
[ 0.2303, -1.1229, -0.1863]])
>>> x = torch.stack((x, x)) # same as torch.stack((x, x), dim=0)
>>> x
tensor([[[ 0.3367, 0.1288, 0.2345],
[ 0.2303, -1.1229, -0.1863]],
[[ 0.3367, 0.1288, 0.2345],
[ 0.2303, -1.1229, -0.1863]]])
>>> x.size()
torch.Size([2, 2, 3])
>>> x = torch.stack((x, x), dim=1)
tensor([[[ 0.3367, 0.1288, 0.2345],
[ 0.3367, 0.1288, 0.2345]],
[[ 0.2303, -1.1229, -0.1863],
[ 0.2303, -1.1229, -0.1863]]])
>>> x = torch.stack((x, x), dim=2)
tensor([[[ 0.3367, 0.3367],
[ 0.1288, 0.1288],
[ 0.2345, 0.2345]],
[[ 0.2303, 0.2303],
[-1.1229, -1.1229],
[-0.1863, -0.1863]]])
>>> x = torch.stack((x, x), dim=-1)
tensor([[[ 0.3367, 0.3367],
[ 0.1288, 0.1288],
[ 0.2345, 0.2345]],
[[ 0.2303, 0.2303],
[-1.1229, -1.1229],
[-0.1863, -0.1863]]])
"""
...
- The return value does not share memory with the original tensors!
view (changing the shape)
`view` conceptually flattens the data into a one-dimensional array and then reinterprets it with the specified shape. The number of elements does not change, so the product of the sizes in the new shape must equal that of the old shape. Detailed examples follow:
def view(self, *shape): # real signature unknown; restored from __doc__
"""
Example::
>>> x = torch.randn(4, 4)
>>> x.size()
torch.Size([4, 4])
>>> y = x.view(16)
>>> y.size()
torch.Size([16])
>>> z = x.view(-1, 8) # the size -1 is inferred from other dimensions
>>> z.size()
torch.Size([2, 8])
>>> a = torch.randn(1, 2, 3, 4)
>>> a.size()
torch.Size([1, 2, 3, 4])
>>> b = a.transpose(1, 2) # Swaps 2nd and 3rd dimension
>>> b.size()
torch.Size([1, 3, 2, 4])
>>> c = a.view(1, 3, 2, 4) # Does not change tensor layout in memory
>>> c.size()
torch.Size([1, 3, 2, 4])
>>> torch.equal(b, c)
False
"""
return _te.Tensor(*(), **{})
- The return value shares memory with the original tensor
reshape (changing the shape)
The differences between `reshape` and `view` are as follows:
- `view` can only reshape contiguous tensors (see `.contiguous()`). If the tensor has gone through operations such as `permute` or `transpose`, it may no longer be contiguous in memory, and calling `view` on it then raises an error. The result of `view` shares memory with the original tensor.
- `reshape` automatically checks whether the original tensor is contiguous. If it is, `reshape` is equivalent to `view`; if not, it first calls `.contiguous()` and then `view`, and in that case the return value does not share memory with the original tensor. A sketch of this difference follows the signature below.
def reshape(self, shape: Sequence[Union[_int, SymInt]]) -> Tensor:
...
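A minimal sketch of the contiguity difference described above:
x = torch.randn(2, 3)
t = x.permute(1, 0)       # a non-contiguous view
print(t.is_contiguous())  # False
# t.view(6) would raise a RuntimeError here
y = t.reshape(6)          # works: reshape copies via .contiguous() when needed
print(y.shape)            # torch.Size([6])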
1.3.8 Mathematical operations
def mean(self, dim=None, keepdim=False, *args, **kwargs): # real signature unknown; NOTE: unreliably restored from __doc__
...
def sum(self, dim=None, keepdim=False, dtype=None): # real signature unknown; restored from __doc__
...
def median(self, dim=None, keepdim=False): # real signature unknown; restored from __doc__
...
def mode(self, dim=None, keepdim=False): # real signature unknown; restored from __doc__
...
def dist(self, other, p=2): # real signature unknown; restored from __doc__
...
def std(self, dim, unbiased=True, keepdim=False): # real signature unknown; restored from __doc__
...
def var(self, dim, unbiased=True, keepdim=False): # real signature unknown; restored from __doc__
...
def cumsum(self, dim, dtype=None): # real signature unknown; restored from __doc__
...
def cumprod(self, dim, dtype=None): # real signature unknown; restored from __doc__
...
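A minimal sketch of a few of these reductions, showing the effect of dim and keepdim:
x = torch.tensor([[1.0, 2.0],
                  [3.0, 4.0]])
print(x.sum())                      # tensor(10.)
print(x.mean(dim=0))                # tensor([2., 3.]), column means
print(x.mean(dim=1, keepdim=True))  # tensor([[1.5000], [3.5000]])
print(x.cumsum(dim=1))              # tensor([[1., 3.], [3., 7.]])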
1.3.9 Computing a tensor on a specified device
`to` can move a tensor onto a specified device.
def to(self, *args, **kwargs): # real signature unknown; restored from __doc__
"""
Example::
>>> tensor = torch.randn(2, 2) # Initially dtype=float32, device=cpu
>>> tensor.to(torch.float64)
tensor([[-0.5044, 0.0005],
[ 0.3310, -0.0584]], dtype=torch.float64)
>>> cuda0 = torch.device('cuda:0')
>>> tensor.to(cuda0)
tensor([[-0.5044, 0.0005],
[ 0.3310, -0.0584]], device='cuda:0')
>>> tensor.to(cuda0, dtype=torch.float64)
tensor([[-0.5044, 0.0005],
[ 0.3310, -0.0584]], dtype=torch.float64, device='cuda:0')
>>> other = torch.randn((), dtype=torch.float64, device=cuda0)
>>> tensor.to(other, non_blocking=True)
tensor([[-0.5044, 0.0005],
[ 0.3310, -0.0584]], dtype=torch.float64, device='cuda:0')
"""
return _te.Tensor(*(), **{})