Implementing a Deep Learning Framework from Scratch: More Computational Graph Operations


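In the previous article we implemented computational graphs for the common operations. This article implements the remaining ones: Max, Slice, Reshape, and Transpose.

Max

As before, we start with the test cases: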

from core.tensor import Tensor
import numpy as np


def test_simple_max():
    x = Tensor([1, 2, 3, 6, 7, 9, 2], requires_grad=True)
    z = x.max()

    assert z.data == [9]
    z.backward()

    assert x.grad.data.tolist() == [0, 0, 0, 0, 0, 1, 0]


def test_simple_max2():
    x = Tensor([1, 2, 3, 9, 7, 9, 2], requires_grad=True)
    z = x.max()

    assert z.data == [9]  # the maximum is still 9
    z.backward()

    # there are two maxima, so the gradient is split equally between them
    assert x.grad.data.tolist() == [0, 0, 0, 0.5, 0, 0.5, 0]


def test_matrix_max():
    a = np.array([[1., 1., 8., 9., 1.],
                  [4., 5., 9., 9., 8.],
                  [8., 6., 9., 7., 9.],
                  [8., 6., 1., 9., 8.]])

    x = Tensor(a, requires_grad=True)
    z = x.max()

    assert z.data == [9]  # the maximum is 9
    z.backward()

    # there are six 9s in total
    np.testing.assert_array_almost_equal(x.grad.data, [[0, 0, 0, 1 / 6, 0],
                                                       [0, 0, 1 / 6, 1 / 6, 0],
                                                       [0, 0, 1 / 6, 0, 1 / 6],
                                                       [0, 0, 0, 1 / 6, 0]])


def test_matrix_max2():
    a = np.array([[1., 1., 8., 9., 1.],
                  [4., 5., 9., 9., 8.],
                  [8., 6., 9., 7., 9.],
                  [8., 6., 1., 9., 8.]])

    x = Tensor(a, requires_grad=True)
    z = x.max(axis=0)  # column-wise maxima: [8, 6, 9, 9, 9]

    assert z.data.tolist() == [8, 6, 9, 9, 9]
    z.backward([1, 1, 1, 1, 1])

    # within each column, the gradient is split among the tied maxima
    grad = [[0., 0., 0., 1 / 3, 0.],
            [0., 0., 0.5, 1 / 3, 0.],
            [0.5, 0.5, 0.5, 0, 1],
            [0.5, 0.5, 0., 1 / 3, 0.]]

    np.testing.assert_array_almost_equal(x.grad.data, np.array(grad))

The derivation is covered in the article "计算图运算补充" (More Computational Graph Operations), so it is not repeated here.


class Max(_Function):
    def forward(ctx, x: ndarray, axis=None, keepdims=False) -> ndarray:
        ret = np.amax(x, axis=axis, keepdims=keepdims)
        ctx.save_for_backward(x, axis, ret, keepdims)
        return ret

    def backward(ctx, grad: ndarray) -> ndarray:
        x, axis, ret, keepdims = ctx.saved_tensors
        # mark every position that holds the maximum
        mask = (x == ret)
        # count the ties so that the gradient is split equally among them
        div = mask.sum(axis=axis, keepdims=keepdims)
        return mask * grad / div
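
The rule these tests encode, an upstream gradient shared equally among all entries tied for the maximum, can also be checked outside the framework. The following is a minimal NumPy-only sketch under stated assumptions: `max_backward` is a hypothetical helper, not part of metagrad, and it always reduces with `keepdims=True` internally so that the mask lines up with `x` for any `axis`.

```python
import numpy as np


def max_backward(x, grad, axis=None):
    """Gradient of x.max(axis=axis) w.r.t. x (hypothetical helper, not metagrad API).

    Entries tied for the maximum share the upstream gradient equally.
    """
    x = np.asarray(x, dtype=float)
    # Keep the reduced axes so the result broadcasts against x for any axis.
    ret = np.amax(x, axis=axis, keepdims=True)
    mask = (x == ret)
    div = mask.sum(axis=axis, keepdims=True)
    # grad has the shape of the reduced output; put the reduced axes back.
    grad = np.asarray(grad, dtype=float).reshape(ret.shape)
    return mask * grad / div


a = np.array([[1., 1., 8., 9., 1.],
              [4., 5., 9., 9., 8.],
              [8., 6., 9., 7., 9.],
              [8., 6., 1., 9., 8.]])

g_all = max_backward(a, 1.0)                   # global max: six 9s share the gradient
g_cols = max_backward(a, np.ones(5), axis=0)   # one maximum per column
g_rows = max_backward(a, np.ones(4), axis=1)   # one maximum per row
```

Keeping the reduced axes is a design choice of this sketch: with `keepdims=False`, a reduction over a trailing axis yields a result whose shape no longer lines up with `x` under NumPy broadcasting.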

Slice

Slicing is just an indexing operation. The test code is as follows:

from core.tensor import Tensor
import numpy as np


def test_get_by_index():
    x = Tensor([1, 2, 3, 4, 5, 6, 7], requires_grad=True)
    z = x[2]

    assert z.data == 3
    z.backward()

    assert x.grad.data.tolist() == [0, 0, 1, 0, 0, 0, 0]


def test_slice():
    x = Tensor([1, 2, 3, 4, 5, 6, 7], requires_grad=True)
    z = x[2:4]

    assert z.data.tolist() == [3, 4]
    z.backward([1, 1])

    assert x.grad.data.tolist() == [0, 0, 1, 1, 0, 0, 0]


def test_matrix_slice():
    a = np.array([[1., 1., 8., 9., 1.],
                  [4., 5., 9., 9., 8.],
                  [8., 6., 9., 7., 9.],
                  [8., 6., 1., 9., 8.]])

    x = Tensor(a, requires_grad=True)
    z = x[1:3, 2:4]

    assert z.data.tolist() == [[9, 9], [9, 7]]
    z.backward([[1, 1], [1, 1]])

    # only the sliced 2x2 block receives gradient
    np.testing.assert_array_almost_equal(x.grad.data, [[0, 0, 0, 0, 0],
                                                       [0, 0, 1, 1, 0],
                                                       [0, 0, 1, 1, 0],
                                                       [0, 0, 0, 0, 0]])


class Slice(_Function):
    def forward(ctx, x: ndarray, idxs: slice) -> ndarray:
        '''
        z = x[idxs]
        '''
        # an expression such as [1:3] arrives here as a slice object
        # a single index arrives wrapped in an ndarray, so convert it back to an int
        if isinstance(idxs, ndarray):
            idxs = int(idxs.item())
        ctx.save_for_backward(x.shape, idxs)
        return x[idxs]

    def backward(ctx, grad) -> Tuple[ndarray, None]:
        x_shape, idxs = ctx.saved_tensors
        # zeros everywhere, with the upstream gradient written into the sliced region
        bigger_grad = np.zeros(x_shape, dtype=grad.dtype)
        bigger_grad[idxs] = grad

        return bigger_grad, None
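
The backward pass of slicing amounts to scattering the upstream gradient into a zero array shaped like `x`. The following NumPy-only sketch illustrates this outside the framework; `slice_backward` is a hypothetical helper name, and the index for the 2-D case is written out as the tuple of `slice` objects that Python builds for `x[1:3, 2:4]`.

```python
import numpy as np


def slice_backward(x_shape, idxs, grad):
    """Scatter grad into a zero array shaped like x (hypothetical helper)."""
    grad = np.asarray(grad, dtype=float)
    full = np.zeros(x_shape, dtype=grad.dtype)
    full[idxs] = grad  # positions outside the slice keep a zero gradient
    return full


# x[2] on a length-7 vector: only position 2 receives gradient.
g_index = slice_backward((7,), 2, 1.0)

# x[1:3, 2:4] on a 4x5 matrix: the index is a tuple of two slices.
g_block = slice_backward((4, 5), (slice(1, 3), slice(2, 4)), np.ones((2, 2)))
```

Plain assignment is sufficient for integers and slices. If the index were an integer array with repeated entries, assignment would keep only one write per position; `np.add.at` accumulates instead.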

Reshape

Backpropagation through Reshape is actually the simplest of all. Given y = x.reshape(..), the backward pass only has to reshape the gradient back to the shape of x.

Test cases:

import numpy as np

from core.tensor import Tensor


def test_reshape():
    x = Tensor(np.arange(9), requires_grad=True)
    z = x.reshape((3, 3))
    z.backward(np.ones((3, 3)))

    assert x.grad.data.tolist() == np.ones_like(x.data).tolist()


def test_matrix_reshape():
    x = Tensor(np.arange(12).reshape(2, 6), requires_grad=True)
    z = x.reshape((4, 3))

    z.backward(np.ones((4, 3)))

    assert x.grad.data.tolist() == np.ones_like(x.data).tolist()

Implementation:

class Reshape(_Function):
    def forward(ctx, x: ndarray, shape: Tuple) -> ndarray:
        ctx.save_for_backward(x.shape)
        return x.reshape(shape)

    def backward(ctx, grad: ndarray) -> Tuple[ndarray, None]:
        x_shape, = ctx.saved_tensors
        return grad.reshape(x_shape), None
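
Because reshape only changes how the same elements are indexed, each element's gradient travels back to the same flat position. All-ones upstream gradients cannot show this, so the sketch below marks a single output element instead. It uses plain NumPy, and `reshape_backward` is a hypothetical name chosen for illustration.

```python
import numpy as np


def reshape_backward(x_shape, grad):
    """Gradient of x.reshape(shape) w.r.t. x: reshape grad back to x's shape."""
    return np.asarray(grad).reshape(x_shape)


x = np.arange(12).reshape(2, 6)
y = x.reshape(4, 3)

# Upstream gradient that marks exactly one output element.
grad_y = np.zeros((4, 3))
grad_y[2, 1] = 1.0

grad_x = reshape_backward(x.shape, grad_y)

# Locate where the marked gradient landed in x.
i, j = np.argwhere(grad_x == 1.0)[0]
```

The marked gradient lands on the element of `x` that became `y[2, 1]`, which is the correspondence the backward pass has to preserve.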

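Transpose

Reshape was covered above. The article "计算图运算补充" (More Computational Graph Operations) analyzes in detail how reshape differs from transpose.

For example, given a matrix:

[figure: the original matrix]

After reshaping:

[figure: the reshaped matrix]

After transposing:

[figure: the transposed matrix]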
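
The difference is easy to reproduce in NumPy. The 2×3 matrix below is an illustrative choice and not necessarily the one used in the original figures.

```python
import numpy as np

m = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

# Reshape keeps the row-major element order and only regroups it.
reshaped = m.reshape(3, 2)       # [[0, 1], [2, 3], [4, 5]]

# Transpose swaps rows and columns, so the element order changes.
transposed = m.T                 # [[0, 3], [1, 4], [2, 5]]
```

Both results have shape (3, 2), yet their contents differ, which is why the two operations need different backward passes.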

The test cases:

import numpy as np

from core.tensor import Tensor


def test_transpose():
    x = Tensor(np.arange(6).reshape((2, 3)), requires_grad=True)
    z = x.T

    assert z.data.shape == (3, 2)
    z.backward(np.ones((3, 2)))

    assert x.grad.data.tolist() == np.ones_like(x.data).tolist()


def test_matrix_transpose():
    x = Tensor(np.arange(12).reshape((2, 6, 1)), requires_grad=True)
    z = x.transpose((0, 2, 1))

    assert z.data.shape == (2, 1, 6)

    z.backward(np.ones((2, 1, 6)))

    assert x.grad.data.tolist() == np.ones_like(x.data).tolist()

Implementation:

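class Transpose(_Function):
    def forward(ctx, x: ndarray, axes) -> ndarray:
        ctx.save_for_backward(axes)
        return x.transpose(axes)

    def backward(ctx, grad: ndarray) -> Any:
        axes, = ctx.saved_tensors
        if axes is None:
            # a full transpose reverses the axes, so it is its own inverse
            return grad.transpose(), None
        # argsort(axes) is the inverse permutation, which restores the original axis order
        return grad.transpose(tuple(np.argsort(axes))), None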
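
The backward pass relies on `np.argsort(axes)` being the inverse of the permutation `axes`. The short NumPy check below illustrates this; the permutation `(2, 0, 1)` and the array shape are arbitrary illustrative choices.

```python
import numpy as np

axes = (2, 0, 1)                     # illustrative permutation of three axes
inverse = tuple(np.argsort(axes))    # position of each original axis after the transpose

x = np.arange(24).reshape(2, 3, 4)
y = x.transpose(axes)                # shape (4, 2, 3)
restored = y.transpose(inverse)      # undo the permutation
```

Applying `inverse` to a gradient shaped like `y` therefore yields a gradient shaped like `x`, with every element back at its original position.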

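Full Code

I have uploaded the full code to the world's largest social network for programmers: 👉 https://github.com/nlp-greyfoss/metagrad

Summary

With this, we have implemented computational graphs for essentially all the basic operations we will need. Starting with the next article, we will use our automatic differentiation tool to build deep learning models.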

