Difference between placeholder.shape and tf.shape(placeholder) in TensorFlow 1.14

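I have recently been reading TensorFlow code, and TensorFlow 1.14 code at that. It is many times harder to follow than PyTorch code: one pass through PyTorch code is usually enough to understand most of it, whereas I can read TensorFlow code once, twice, three times and still not understand it, by which point I am close to giving up.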

Here is one TensorFlow detail I just learned: what to watch out for when taking the shape of a tensor. Suppose we have a tensor created by a placeholder, used as follows:

import tensorflow as tf
import numpy as np


num_actions = 5

p=tf.placeholder(shape=(None, 4), dtype=np.int8)
batch_size = tf.shape(p)[0]

random_actions = tf.random_uniform(tf.stack([batch_size]), minval=0, maxval=num_actions, dtype=tf.int64)

with tf.Session() as sess:
    print( sess.run(random_actions, {p:np.array([[1,1,1,1], [2,2,2,2], [3,3,3,3], [4,4,4,4]])}) )

Output: the run prints four random integers in the range [0, 5), one per row of the fed batch. [Screenshot in the original post]

Now obtain batch_size from p.shape instead of tf.shape(p):

import tensorflow as tf
import numpy as np


num_actions = 5

p=tf.placeholder(shape=(None, 4), dtype=np.int8)
# batch_size = tf.shape(p)[0]
batch_size = p.shape[0]

random_actions = tf.random_uniform(tf.stack([batch_size]), minval=0, maxval=num_actions, dtype=tf.int64)

with tf.Session() as sess:
    print( sess.run(random_actions, {p:np.array([[1,1,1,1], [2,2,2,2], [3,3,3,3], [4,4,4,4]])}) )

Result: an error is raised. [Error screenshot in the original post]

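The only difference between the two scripts is how a dimension of the placeholder tensor is obtained. If a dimension was left unspecified when the placeholder was defined, that is, None was used to hold its place, and that dimension is then used while building the network, then taking it with the TensorFlow function tf.shape and taking it directly from the tensor's shape attribute behave differently.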

With the TensorFlow function tf.shape, the dimension exists as a TensorFlow tensor even though it has not been assigned a value yet:

[Screenshot in the original post]

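So the dimension obtained this way is itself a tensor. Even though its value is not yet known (it comes from a dimension declared as None), the tensor can still be used in further graph operations.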

With p.shape[0], however, what comes back is a Python object rather than a tensor (in TensorFlow 1.x, a Dimension object), and the value it holds is None. That value cannot take part in building the TensorFlow graph:

[Screenshot in the original post]

This is why the second script above raises an error.
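The difference described above can be checked directly by inspecting both values. The following is only a minimal sketch: it assumes a TensorFlow install that ships `tensorflow.compat.v1` (used here, instead of the plain `import tensorflow as tf` of the scripts above, so that it also runs on newer versions), and the variable names are made up for illustration.

```python
import tensorflow.compat.v1 as tf

# Assumption: use TF 1.x graph-mode semantics, as in the scripts above.
tf.disable_v2_behavior()

p = tf.placeholder(shape=(None, 4), dtype=tf.int8)

dynamic_dim = tf.shape(p)[0]   # graph tensor; its value is known only at run time
static_unknown = p.shape[0]    # static shape entry for the None dimension
static_known = p.shape[1]      # static shape entry for the dimension defined as 4

print(type(dynamic_dim).__name__, dynamic_dim.dtype.name)
print(static_unknown.value, static_known.value)

# An unknown static dimension cannot be converted into a graph tensor,
# so building the op fails before any session is run.
try:
    tf.stack([static_unknown])
    stack_failed = False
except (TypeError, ValueError) as err:
    stack_failed = True
    print("tf.stack failed:", type(err).__name__)
```

Note that the failure happens while the graph is being built, not inside sess.run, which is why feeding a concrete batch later does not help.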

Now modify both scripts to take dimension 1, which is defined as 4, instead of dimension 0:

import tensorflow as tf
import numpy as np


num_actions = 5

p=tf.placeholder(shape=(None, 4), dtype=np.int8)
batch_size = tf.shape(p)[1]

random_actions = tf.random_uniform(tf.stack([batch_size]), minval=0, maxval=num_actions, dtype=tf.int64)

with tf.Session() as sess:
    print( sess.run(random_actions, {p:np.array([[1,1,1,1], [2,2,2,2], [3,3,3,3], [4,4,4,4]])}) )

import tensorflow as tf
import numpy as np


num_actions = 5

p=tf.placeholder(shape=(None, 4), dtype=np.int8)
# batch_size = tf.shape(p)[0]
# batch_size = p.shape[0]
batch_size = p.shape[1]

random_actions = tf.random_uniform(tf.stack([batch_size]), minval=0, maxval=num_actions, dtype=tf.int64)

with tf.Session() as sess:
    print( sess.run(random_actions, {p:np.array([[1,1,1,1], [2,2,2,2], [3,3,3,3], [4,4,4,4]])}) )

Both of these scripts run without errors.

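This shows that Python values can take part in building a TensorFlow network; only None cannot. If a tensor dimension comes from a None in the placeholder definition, it has to be obtained with tf.shape, which returns a tensor whose value is not yet assigned rather than a Python None.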
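One way to apply this rule without thinking about it each time is to wrap both lookups in a small helper. The sketch below is an illustration, not part of the original scripts: `dim_or_dynamic` is a hypothetical name, and the code again assumes `tensorflow.compat.v1` is available. It returns the static size as a plain int when it is known and falls back to `tf.shape` when the dimension was declared as None.

```python
import numpy as np
import tensorflow.compat.v1 as tf

# Assumption: use TF 1.x graph-mode semantics, as in the scripts above.
tf.disable_v2_behavior()


def dim_or_dynamic(tensor, axis):
    """Hypothetical helper: static size if known, else a run-time tensor."""
    static = tensor.shape.as_list()[axis]  # int, or None if unknown
    if static is not None:
        return static
    return tf.shape(tensor)[axis]


num_actions = 5
p = tf.placeholder(shape=(None, 4), dtype=tf.int8)

batch_size = dim_or_dynamic(p, 0)    # unknown statically, so a tensor
feature_size = dim_or_dynamic(p, 1)  # known statically, so a Python int

random_actions = tf.random_uniform(
    tf.stack([batch_size]), minval=0, maxval=num_actions, dtype=tf.int64)

with tf.Session() as sess:
    # Feed a batch of 3 rows; one random action is drawn per row.
    actions = sess.run(random_actions, {p: np.zeros((3, 4), dtype=np.int8)})

print(actions.shape)
```

Preferring the static value when it exists keeps known sizes as ordinary Python ints, while the fallback keeps the graph valid for any batch size.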
