Learning TensorFlow

Learning TensorFlow Notes: Basics

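Represent data with tensors.
Build the neural network as a computation graph.
Run the graph in a session, optimize the weights on the graph's edges, and obtain the model.

Computation graph: describes the computation of the neural network. Building the graph only defines the computation; nothing is evaluated.
Parameters: the weights w, held in tf.Variable objects and usually given random initial values.
Ways to define variables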

w = tf.Variable(tf.random_normal([2, 3], stddev=2, mean=0, seed=1))
w1 = tf.Variable(tf.random_normal([2, 1], stddev=1, seed=1))  # without a seed, the values differ on every run
w1 = tf.Variable(0, dtype=tf.float32)
w1 = tf.Variable(tf.constant(4, tf.float32))
global_step = tf.Variable(0, trainable=False)

tf.truncated_normal()       # normal distribution; values more than 2 standard deviations from the mean are dropped and re-drawn
tf.random_uniform()         # uniform distribution
tf.zeros([3, 2], tf.int32)
tf.ones([3, 2], tf.int32)
tf.fill([3, 2], 6)          # array filled with a single value
tf.constant([3, 2, 1])      # values given directly

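Variable management
When used to create a variable, tf.get_variable is essentially equivalent to tf.Variable. The difference is that for tf.Variable the variable name is an optional argument, given as name="v", whereas for tf.get_variable the name is required.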

# The following two definitions are equivalent
v = tf.get_variable("v", shape=[1], initializer=tf.constant_initializer(1.0))
v = tf.Variable(tf.constant(1.0, shape=[1]), name="v")

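To fetch an already created variable with tf.get_variable, use tf.variable_scope to create a context manager and state explicitly (reuse=True) that, inside this context, tf.get_variable should return the existing variable.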

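# Create a variable named v inside the variable scope foo
with tf.variable_scope("foo"):
    v = tf.get_variable("v", [1], initializer=tf.constant_initializer(1.0))

# A variable named v already exists in scope foo, so with reuse=True the call returns the existing variable;
# without reuse=True it raises an error. Conversely, with reuse=True, asking for a variable that does not exist raises an error.
with tf.variable_scope("foo", reuse=True):
    v = tf.get_variable("v", [1])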

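Workflow for building a BP (backpropagation) neural network

Preparation
- import modules
- define constants
- generate the data set
Forward propagation: define the input, the parameters and the output
- x = tf.placeholder
- y_ = tf.placeholder  # ground-truth labels
- w = tf.Variable
- a = tf.matmul  # hidden layer
- y = tf.matmul  # network output
Backpropagation: define the loss function and the optimization method
- loss = tf.reduce_mean
- train_step = tf.train.GradientDescentOptimizer(lr).minimize(loss)
Create a Session and train for STEPS rounds
- init_op = tf.global_variables_initializer()
- sess.run(train_step, feed_dict)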

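Placeholders
A placeholder is given no value when it is defined; the value is fed in when the Session runs, which makes batch processing convenient.
None in a shape means a variable size; for example, shape = [None, INPUT_NODE] means the number of rows can vary.
tf.placeholder()

x = tf.placeholder(tf.float32, shape=(None, 2))  # None: the number of input samples is unknown
x = tf.placeholder(tf.float32, shape=(1, 2))     # one input sample with 2 features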

Loss functions
tf.reduce_mean(input_tensor, axis=None, keepdims=False, name=None)

loss_mse = tf.reduce_mean(tf.square(y - y_))
loss_self = tf.reduce_mean(tf.where(tf.greater(y, y_), (y - y_) * COST, (y_ - y) * PROFIT))
loss_cross_entropy = -tf.reduce_mean(y_ * tf.log(tf.clip_by_value(y, 1e-12, 1.0)))  # cross entropy carries a leading minus sign

NumPy random number generator
Create a random number generator

rng = np.random.RandomState(23455)
X = rng.rand(32, 2)

Type conversion
Convert a tensor to another type

tf.cast(x, tf.float32)

Grouping operations
Merge several operations into one
group

train_op = tf.group(train_step, variables_averages_op)

control_dependencies

with tf.control_dependencies([train_step, variables_averages_op]):
    train_op = tf.no_op(name='train')  # does nothing itself; running train_op runs the ops listed in the dependencies

Defining the training op

train_step = tf.train.GradientDescentOptimizer(0.001).minimize(loss)
# train_step = tf.train.MomentumOptimizer(0.001, 0.9).minimize(loss)
# train_step = tf.train.AdamOptimizer(0.001, 0.9).minimize(loss)

This minimizes loss by gradient descent. When global_step is passed to minimize(), it is incremented automatically after every update.
Exponential decay

LEARNING_RATE_BASE = 1.0    # initial learning rate
LEARNING_RATE_DECAY = 0.99  # decay rate of the learning rate
LEARNING_RATE_STEP = 1      # number of batches fed between learning-rate updates; usually total samples / BATCH_SIZE, i.e. one update per pass over the data
learning_rate = tf.train.exponential_decay(LEARNING_RATE_BASE, global_step, LEARNING_RATE_STEP, LEARNING_RATE_DECAY, staircase=True)
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)
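The schedule computed by tf.train.exponential_decay can be checked by hand. The following is a minimal pure-Python sketch of the formula base_lr * decay_rate ^ (global_step / decay_steps); the function name and arguments are illustrative and not part of TensorFlow.

```python
import math


def exponential_decay(base_lr, global_step, decay_steps, decay_rate, staircase=False):
    """Return base_lr * decay_rate ** (global_step / decay_steps)."""
    exponent = global_step / decay_steps
    if staircase:
        # staircase=True truncates the exponent to an integer,
        # so the learning rate drops in discrete steps
        exponent = math.floor(exponent)
    return base_lr * decay_rate ** exponent


# Same constants as in the notes: base 1.0, decay 0.99, one step per update
for step in (0, 1, 2, 10):
    print(step, exponential_decay(1.0, step, 1, 0.99, staircase=True))
```

With staircase=False the exponent stays fractional and the rate decays smoothly instead of in steps.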

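Moving average

Create the moving-average object

ema = tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY, global_step)

MOVING_AVERAGE_DECAY is the decay rate.
global_step is the current training step.
ExponentialMovingAverage maintains a shadow variable for every variable. The shadow variable starts at the variable's initial value, and each time the update runs it is set to

shadow_variable = decay * shadow_variable + (1 - decay) * variable

To let the model update faster early in training, ExponentialMovingAverage also accepts a num_updates argument that sets decay dynamically. When num_updates is given at construction, the decay rate used at each step is

min(decay, (1 + num_updates) / (10 + num_updates))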
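The shadow-variable update is plain arithmetic, so it can be traced without TensorFlow. The sketch below assumes the standard rules shadow = decay * shadow + (1 - decay) * variable and decay = min(decay, (1 + num_updates) / (10 + num_updates)); the helper names are illustrative.

```python
def effective_decay(decay, num_updates=None):
    """Decay actually used; num_updates lowers it early in training."""
    if num_updates is None:
        return decay
    return min(decay, (1.0 + num_updates) / (10.0 + num_updates))


def update_shadow(shadow, variable, decay, num_updates=None):
    """One moving-average step for a single scalar variable."""
    d = effective_decay(decay, num_updates)
    return d * shadow + (1.0 - d) * variable


shadow = 0.0  # the shadow variable starts at the variable's initial value (0 here)

# Step 0: variable becomes 5; decay = min(0.99, 1/10) = 0.1
shadow = update_shadow(shadow, 5.0, decay=0.99, num_updates=0)
print(shadow)  # about 4.5

# Step 10000: variable becomes 10; decay = min(0.99, 10001/10010) = 0.99
shadow = update_shadow(shadow, 10.0, decay=0.99, num_updates=10000)
print(shadow)  # about 4.555
```

Early on the small effective decay lets the shadow variable follow the variable closely; later the configured decay takes over and the average changes slowly.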

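Create the moving-average op

ema_op = ema.apply(tf.trainable_variables())  # once the moving-average model is declared, TF creates a shadow variable v/ExponentialMovingAverage for each variable v

tf.trainable_variables collects all trainable parameters into a list.
Run the moving-average op
In practice the moving-average update is usually bound to the training step so that both run as a single training node.

with tf.control_dependencies([train_step, ema_op]):
    train_op = tf.no_op(name='train')
# equivalent to: train_op = tf.group(train_step, ema_op)
The moving average does not change the value of the variable itself; it maintains a shadow variable that records the averaged value. To use the averaged value you therefore have to call average() explicitly.
ema.average(var) returns the moving average of the given variable.

y1 = tf.nn.relu(tf.matmul(x, ema.average(w1)) + ema.average(b1))
y = tf.matmul(y1, ema.average(w2)) + ema.average(b2)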

Regularization

loss_w = tf.contrib.layers.l1_regularizer(LAMBDA)(w)
tf.add_to_collection("losses", tf.contrib.layers.l2_regularizer(REGULARIZER)(w))
# add_to_collection adds the regularization loss to the "losses" collection
loss = cem + tf.add_n(tf.get_collection("losses"))
# get_collection returns everything in the collection as a list
# add_n sums the elements of the list element-wise
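To make the two penalties concrete, here is a small pure-Python sketch of what the L1 and L2 regularizers compute for a weight matrix. It assumes the usual definitions: L1 is scale * sum(|w|), and L2 is scale * sum(w^2) / 2, since the TF 1.x l2_regularizer is built on tf.nn.l2_loss, which includes the factor 1/2. The function names are illustrative.

```python
def l1_penalty(scale, weights):
    """scale * sum of absolute values of all weights."""
    return scale * sum(abs(w) for row in weights for w in row)


def l2_penalty(scale, weights):
    """scale * (sum of squared weights) / 2."""
    return scale * sum(w * w for row in weights for w in row) / 2.0


weights = [[1.0, -2.0], [-3.0, 4.0]]
print(l1_penalty(0.5, weights))  # 0.5 * (1 + 2 + 3 + 4) = 5.0
print(l2_penalty(0.5, weights))  # 0.5 * (1 + 4 + 9 + 16) / 2 = 7.5
```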

Modular neural network
forward.py

def forward(x, regularizer):
    w = ...
    b = ...
    y = ...
    return y

def get_weight(shape, regularizer):
    w = tf.Variable(...)
    tf.add_to_collection("losses", tf.contrib.layers.l2_regularizer(regularizer)(w))
    return w

def get_bias(shape):
    b = tf.Variable(...)
    return b

backward.py

def backward():
    x = tf.placeholder(...)
    y_ = tf.placeholder(...)
    y = forward.forward(x, REGULARIZER)
    global_step = tf.Variable(0, trainable=False)

    loss_mse = tf.reduce_mean(tf.square(y - y_))
    # loss_cem = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y, labels=tf.argmax(y_, 1)))
    # regularization
    loss_total = loss_mse + tf.add_n(tf.get_collection("losses"))

    # exponentially decaying learning rate
    learning_rate = tf.train.exponential_decay(LR_BASE, global_step, NUM_EXAMPLES / BATCH_SIZE, LR_DECAY, staircase=True)
    train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss_total, global_step=global_step)

    # moving average
    ema = tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY, global_step)
    ema_op = ema.apply(tf.trainable_variables())
    with tf.control_dependencies([train_step, ema_op]):
        train_op = tf.no_op(name="train")

    with tf.Session() as sess:
        init_op = tf.global_variables_initializer()  # replaces initialize_all_variables()
        sess.run(init_op)
        # the line above can be replaced by: tf.global_variables_initializer().run()
        for i in range(STEPS):
            sess.run(train_op, feed_dict={x: ..., y_: ...})
            if i % N == 0:  # N: number of steps between progress reports
                print(...)

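Reproducing the network in a computation graph

with tf.Graph().as_default() as g:  # nodes defined inside this block belong to graph g

Saving the model

saver = tf.train.Saver()  # create the saver object
# max_to_keep: maximum number of checkpoint files to keep
# keep_checkpoint_every_n_hours: keep one checkpoint file for every n hours of training
# var_list: the variables to save or restore
#   dict {"key": value}: key is the variable name in the saved model, value is the variable in the running program; the two names may differ
#   list [value]: the variable names in the saved model must match the names in the program
#   In short, value is the program variable to restore and "key" is its name in the saved model;
#   when the two names are identical, "key" can be dropped and a list used instead of a dict
# reshape: allow restoring a variable with a different shape
with tf.Session() as sess:
    for i in range(STEPS):
        if i % N == 0:  # N: number of steps between checkpoints
            saver.save(sess, os.path.join('./model/model.ckpt'), global_step=global_step)  # the current training step is appended to the file name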

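Although the program above specifies a single file path, several files appear in that directory. TensorFlow stores the structure of the computation graph separately from the values of its parameters.

model.ckpt.meta
Stores the structure of the computation graph
model.ckpt
Stores the value of every variable in the graph (written as model.ckpt.data-* in newer versions)
model.ckpt.index
Stores the names of the saved parameters
checkpoint
Stores the list of all model files in the directory
- model_checkpoint_path
  The file name of the most recent TF model file
- all_model_checkpoint_paths
  The file names of all TF model files that have not yet been deleted

When creating tf.train.Saver() you can pass a list to specify which variables to save or load.
Contents of model.ckpt.meta

meta_info_def
Records the metadata of the computation graph and the operations used by the program
- stripped_op_list
  Records every operation that is used
  - op
    Information about one such operation
graph_def
Records the nodes of the computation graph
- node
  Gives the name of the operation used by that node
saver_def
Records parameters used when persisting the model
collection_def
Records the collections stored in the model (trainable_variables, variables)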

Loading a model
Loading only the variable values

with tf.Session() as sess:
    saver.restore(sess, './model/model.ckpt-1')  # load the model parameters into the current session

Alternatively, obtain a ckpt object with tf.train.get_checkpoint_state. ckpt.model_checkpoint_path is the path of the most recent save, and ckpt.all_model_checkpoint_paths lists all saved paths.

ckpt = tf.train.get_checkpoint_state('./model/')
if ckpt and ckpt.model_checkpoint_path:
    saver.restore(sess, ckpt.model_checkpoint_path)

Loading a persisted graph
If you do not want to define the graph operations again, you can load the persisted graph directly.

import tensorflow as tf

saver = tf.train.import_meta_graph('./model/model.ckpt-1.meta')
with tf.Session() as sess:
    restore_path = tf.train.latest_checkpoint('./model')  # returns a string
    saver.restore(sess, restore_path)
    print(sess.run(tf.get_default_graph().get_tensor_by_name("add:0")))

Loading moving averages

v = tf.Variable(0, dtype=tf.float32, name='v')
ema = tf.train.ExponentialMovingAverage(0.99)  # decay has no effect when restoring, but the argument is required
saver = tf.train.Saver(ema.variables_to_restore())
with tf.Session() as sess:
    saver.restore(sess, './model/model.ckpt')  # assumed checkpoint path; the original snippet omitted the restore call
    print(sess.run(v))
    print(sess.run(ema.variables_to_restore()['v/ExponentialMovingAverage']))

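Restoring for transfer learning
Transfer learning only needs to know how to get from the input layer to the output layer by forward propagation; it does not need auxiliary nodes such as variable initialization or model saving. The convert_variables_to_constants function stores the variables of a graph, together with their values, as constants.
Computing accuracy

correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))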
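The accuracy computation is an argmax comparison followed by a mean. A pure-Python sketch of the same steps on a made-up batch (the data and helper names are illustrative):

```python
def argmax(row):
    """Index of the largest entry, like tf.argmax along axis 1 for one row."""
    return max(range(len(row)), key=lambda i: row[i])


def accuracy(predictions, labels):
    """Fraction of rows where the predicted class equals the labelled class."""
    correct = [1.0 if argmax(p) == argmax(t) else 0.0
               for p, t in zip(predictions, labels)]
    return sum(correct) / len(correct)


y = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]  # network outputs
y_ = [[0, 1], [1, 0], [1, 0], [1, 0]]                  # one-hot labels
print(accuracy(y, y_))  # 3 of 4 rows match -> 0.75
```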

Building a data set (tfrecords)
tfrecords is a binary file format; images and labels can be converted into it ahead of time. Reading data from tfrecords uses memory more efficiently.
Training data is stored using the tf.train.Example protocol, with the features of each sample expressed as key-value pairs.
SerializeToString() serializes the data into a string for storage.
Writing

Create the writer

writer = tf.python_io.TFRecordWriter('./train.tfrecords')
Define the features to save

feature_label = tf.train.Feature(int64_list=tf.train.Int64List(value=[idx]))
feature_img = tf.train.Feature(bytes_list=tf.train.BytesList(value=[img]))
Pack the individual features into a Features object

features = tf.train.Features(feature={'label': feature_label, 'img': feature_img})
Convert the features into an Example

example = tf.train.Example(features=features)
Call writer.write

writer.write(example.SerializeToString())
Close the writer

writer.close()

Reading

Create a reader

reader = tf.TFRecordReader()

Create a queue that maintains the list of files

filename_queue = tf.train.string_input_producer(['./train.tfrecords'])
Read one example

_, serialized_example = reader.read(filename_queue)  # returns the key (file name) and the serialized record

read_up_to can be used instead to read several examples at once.
Define how the features are parsed

feature_label = tf.FixedLenFeature([], tf.int64)
feature_img = tf.FixedLenFeature([], tf.string)

TF provides two ways of parsing features.
- tf.FixedLenFeature
  The parsed result is a Tensor
- tf.VarLenFeature
  The parsed result is a SparseTensor, used for sparse data
Parse one example

features = tf.parse_single_example(serialized_example, features={
    'img': feature_img,
    'label': feature_label,
})

parse_example can be used instead to parse several examples at once.
Read the corresponding data
features is a dict; use each key to read its value and convert it to the required type.

img = tf.decode_raw(features['img'], tf.uint8)  # decode the string into an array of the given type
label = tf.cast(features['label'], tf.int32)
Start the threads

sess = tf.Session()
coord = tf.train.Coordinator()  # create a coordinator
threads = tf.train.start_queue_runners(sess=sess, coord=coord)  # sess can be omitted inside a session context
Read the data

for i in range(num_examples):
    image, l = sess.run([img, label])
    # TODO
Stop the threads

coord.request_stop()
coord.join(threads)

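Forward propagation through a convolutional layer
Convolutional layer: conv = tf.nn.conv2d(input, filter, strides=[1,2,2,1], padding="SAME") is a convenient function that implements the forward pass of a convolutional layer.

The first argument is the node matrix of the current layer. Note that it is a four-dimensional tensor: the first dimension indexes the batch and the last three dimensions hold one node matrix. At the input layer, for example, input[0,:,:,:] is the first image.
The second argument supplies the weights of the convolutional layer, with shape [filter_height, filter_width, depth, number_of_filters].
The third argument gives the stride along each dimension. Although it is an array of length 4, the first and last entries must be 1, because the stride only applies to the height and width of the matrix.
The last argument is the padding method. TF offers two options, SAME and VALID: SAME pads with zeros, VALID adds no padding.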
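The effect of the stride and padding arguments on the output size can be worked out by hand. A minimal sketch, assuming the usual rules: with SAME padding the output size is ceil(input / stride), and with VALID it is ceil((input - filter + 1) / stride). The function name is illustrative.

```python
import math


def conv_output_size(input_size, filter_size, stride, padding):
    """Output length along one spatial dimension (height or width)."""
    if padding == "SAME":
        # zero padding keeps every input position reachable
        return math.ceil(input_size / stride)
    if padding == "VALID":
        # no padding: the filter must fit entirely inside the input
        return math.ceil((input_size - filter_size + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")


print(conv_output_size(32, 5, 1, "VALID"))  # 28
print(conv_output_size(32, 5, 2, "SAME"))   # 16
print(conv_output_size(28, 3, 2, "VALID"))  # 13
```

The same rule applies to pooling layers, with ksize in place of the filter size.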

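bias = tf.nn.bias_add(conv, biases) is a convenient function that adds the bias to every node. The size of biases equals the number of filters.
actived_conv = tf.nn.relu(bias) applies the ReLU activation to remove linearity.
Pooling layer: pool = tf.nn.max_pool(actived_conv, ksize=[1,3,3,1], strides=[1,2,2,1], padding="SAME") implements max pooling; its arguments are similar to those of tf.nn.conv2d.
TF also provides tf.nn.avg_pool for average pooling, called in the same way.
LeNet5: input layer -> convolution -> pooling -> convolution -> pooling -> fully connected -> fully connected -> output layer
tf.split: the argument order differs between TF versions!
Older versions use tf.split(dimension, num_split, input); newer versions use tf.split(value, num_or_size_splits, axis). It splits the input into pieces of the specified sizes.

# value is a tensor with shape [5, 30]
# split value into 3 tensors along dimension 1, with sizes 4, 15 and 11
s1, s2, s3 = tf.split(value, [4, 15, 11], 1)
# split value into 3 equally sized tensors along dimension 1
s1, s2, s3 = tf.split(value, num_or_size_splits=3, axis=1)

tf.concat: concatenate tensors

t1 = [[1, 2, 3], [4, 5, 6]]
t2 = [[7, 8, 9], [10, 11, 12]]
tf.concat([t1, t2], 0)  # => [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
tf.concat([t1, t2], 1)  # => [[1, 2, 3, 7, 8, 9], [4, 5, 6, 10, 11, 12]]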

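File I/O (tf.gfile)
File I/O wrappers without thread locking that provide an API close to Python's file objects.

tf.gfile.GFile(filename, mode)
Returns a file handle, similar to Python's open(); same as tf.gfile.Open()
tf.gfile.FastGFile(filename, mode)
According to these notes, the only difference from tf.gfile.GFile is that it is non-blocking, so the handle is obtained faster. Later 1.x releases deprecate FastGFile in favor of tf.gfile.GFile.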

Image processing (tf.image)
Encoding and decoding

Decoding
Decode a jpg/jpeg/png image into a Tensor("DecodeJpeg")

# read the raw image with FastGFile
image_raw_data = tf.gfile.FastGFile('./pic/cat.jpg', 'rb').read()
img_data = tf.image.decode_jpeg(image_raw_data)

Use eval() to convert a tf.Tensor into an np.ndarray

img = img_data.eval()
Encoding
Encode with tf.image.encode_jpeg()

# tf.image.convert_image_dtype() can be used to convert the data type
# img_data = tf.image.convert_image_dtype(img_data, dtype=tf.float32)
encoded_image = tf.image.encode_jpeg(img_data)

Resizing

tf.image.resize_images

resized = tf.image.resize_images(img_data, [300, 300], method=0)
print(img_data.get_shape())  # get_shape().as_list() returns a plain list

method	Algorithm
0	Bilinear interpolation
1	Nearest neighbor interpolation
2	Bicubic interpolation
3	Area interpolation

tf.image.resize_image_with_crop_or_pad
tf.image.resize_image_with_crop_or_pad changes the image size. If the original image is larger than the target size, the function crops the central part of the image. If the target size is larger than the original image, the function pads the image with zeros all around.

croped = tf.image.resize_image_with_crop_or_pad(img_data, 10, 10)
padded = tf.image.resize_image_with_crop_or_pad(img_data, 900, 900)
tf.image.central_crop

central_cropped = tf.image.central_crop(img_data, 0.5)  # fraction of size
tf.image.crop_to_bounding_box

box = tf.image.crop_to_bounding_box(img_data, 20, 20, 300, 400)
tf.image.crop_and_resize

img_batch = tf.image.crop_and_resize(img_batch, boxes=[[0.1, 0.1, 0.9, 0.9], [0.3, 0.3, 0.9, 0.9]], box_ind=[0, 1], crop_size=[128, 128])

The input of tf.image.crop_and_resize must have shape (bs, h, w, d). boxes holds the relative position of each region to crop (as fractions of the image size). box_ind gives, for every box, the index in [0, bs) of the image it applies to. crop_size is the shape each crop is resized to.
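Since the boxes are given as fractions, it can help to see which pixels they select. The sketch below assumes the convention documented for tf.image.crop_and_resize: a box is [y1, x1, y2, x2], and a normalized coordinate y maps to the image coordinate y * (image_height - 1), likewise for x with the width. The helper name is illustrative.

```python
def box_to_pixels(box, image_height, image_width):
    """Map a normalized [y1, x1, y2, x2] box to pixel coordinates."""
    y1, x1, y2, x2 = box
    return (y1 * (image_height - 1),
            x1 * (image_width - 1),
            y2 * (image_height - 1),
            x2 * (image_width - 1))


# A 101 x 201 image: the full box covers pixels 0..100 and 0..200
print(box_to_pixels([0.0, 0.0, 1.0, 1.0], 101, 201))
# Lower half in height, middle half in width
print(box_to_pixels([0.5, 0.25, 1.0, 0.75], 101, 201))
```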

tf.image.flip_up_down

# flip vertically
flipped = tf.image.flip_up_down(img_data)
# flip horizontally
flipped = tf.image.flip_left_right(img_data)
# flip along the diagonal
transposed = tf.image.transpose_image(img_data)

TF can also flip training images at random

# flip vertically with some probability
flipped = tf.image.random_flip_up_down(img_data, seed)
# flip horizontally with some probability
flipped = tf.image.random_flip_left_right(img_data, seed)

Color adjustment

Brightness

# decrease the brightness by 0.5
adjusted = tf.image.adjust_brightness(img_data, -0.5)
# increase the brightness by 0.5
adjusted = tf.image.adjust_brightness(img_data, 0.5)
# adjust the brightness randomly within [-max_delta, max_delta)
adjusted = tf.image.random_brightness(image, max_delta)
Contrast

# adjust the contrast by a factor of -5
adjusted = tf.image.adjust_contrast(img_data, -5)
# adjust the contrast by a factor of 5
adjusted = tf.image.adjust_contrast(img_data, 5)
# adjust the contrast randomly within [lower, upper]
adjusted = tf.image.random_contrast(image, lower, upper)

Hue

# increase the hue by 0.1
adjusted = tf.image.adjust_hue(img_data, 0.1)
# adjust the hue randomly within [-max_delta, max_delta]
adjusted = tf.image.random_hue(image, max_delta)  # max_delta must lie in [0, 0.5]
Saturation

# adjust the saturation by a factor of -5
adjusted = tf.image.adjust_saturation(img_data, -5)
# adjust the saturation randomly within [lower, upper]
adjusted = tf.image.random_saturation(image, lower, upper)
Whitening

# rescale the image so that its values have mean 0 and variance 1
adjusted = tf.image.per_image_whitening(img_data)  # renamed tf.image.per_image_standardization in later versions

Bounding boxes
tf.image.draw_bounding_boxes draws bounding boxes on an image. It requires the image values to be tf.float32, and its input is a batch, i.e. a four-dimensional tensor made of several images.

Build the batch

batched = tf.expand_dims(tf.image.convert_image_dtype(img_data, tf.float32), 0)

Build the boxes

# define two boxes with relative positions [ymin, xmin, ymax, xmax];
# the outer dimension is the batch, so the shape is [1, 2, 4]
boxes = tf.constant([[[0.05, 0.05, 0.9, 0.7], [0.35, 0.47, 0.5, 0.56]]])

Draw the boxes

result = tf.image.draw_bounding_boxes(batched, boxes)

tf.image.sample_distorted_bounding_box crops a random region of an image.

Provide the informative regions

boxes = tf.constant([[[0.05, 0.05, 0.9, 0.7], [0.35, 0.47, 0.5, 0.56]]])
Get the randomly cropped image and draw it

# sample a random crop
begin, size, bbox_for_draw = tf.image.sample_distorted_bounding_box(
    tf.shape(img_data), bounding_boxes=boxes, min_object_covered=0.1)  # some TF versions require min_object_covered
# build the batch
batched = tf.expand_dims(tf.image.convert_image_dtype(img_data, tf.float32), 0)
# draw the sampled box on the original image
image_with_box = tf.image.draw_bounding_boxes(batched, bbox_for_draw)
# crop the sampled region
distorted_image = tf.slice(img_data, begin, size)

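Multithreaded data input
In TF a queue is more than a data structure: it is also a multithreading mechanism and an important way of computing tensor values asynchronously. Queues are the foundation of TF's multithreaded input pipeline.
Queues and multithreading
Queues
Like variables, queues are stateful nodes in the computation graph. A variable is modified by assignment; a queue is modified mainly through Enqueue, EnqueueMany and Dequeue.
tf.FIFOQueue creates a first-in, first-out queue

q = tf.FIFOQueue(2, "int32")  # the queue holds at most two elements, of integer type

enqueue_many initializes the queue

init = q.enqueue_many(([0, 10],))

dequeue removes the first element of the queue

x = q.dequeue()

enqueue adds an element to the queue

q_inc = q.enqueue([x])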
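The dequeue-then-enqueue cycle behaves like an ordinary FIFO queue, which can be traced with collections.deque. This is only an analogy to the queue ops, not TensorFlow code, and the "+ 1" applied before re-enqueueing is an assumption added to make the order visible.

```python
from collections import deque

queue = deque([0, 10], maxlen=2)  # like q.enqueue_many(([0, 10],)) on a queue of capacity 2

results = []
for _ in range(5):
    x = queue.popleft()   # like q.dequeue(): take the oldest element
    queue.append(x + 1)   # like q.enqueue([...]); the +1 is illustrative
    results.append(x)

print(results)  # [0, 10, 1, 11, 2]
```

Each element leaves the queue in the order it entered, so the two initial values alternate as they are incremented and put back.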

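Besides FIFOQueue, TF provides a random queue, RandomShuffleQueue, which shuffles its elements: every dequeue returns an element chosen at random from those currently in the queue.
Multithreading
TF provides two classes, tf.train.Coordinator and tf.train.QueueRunner, for coordinating multiple threads.

tf.train.Coordinator is mainly used to stop several threads together. It provides three functions: should_stop, request_stop and join.
Before starting the threads, create a tf.train.Coordinator and pass it to every thread you create.

coord = tf.train.Coordinator()
- should_stop
  Every started thread has to keep polling should_stop; when it returns True, the thread must exit.
  
  while not coord.should_stop(): # TODO
- request_stop
  Any thread can call request_stop to tell the other threads to exit. After one thread has called request_stop, should_stop returns True and the other threads terminate as well.
  
  coord.request_stop()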
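The should_stop / request_stop / join protocol can be imitated with the standard threading module. The class below is a hypothetical stand-in written for illustration; it is not the TensorFlow implementation, only the same cooperative-stop pattern.

```python
import threading
import time


class SimpleCoordinator:
    """Illustrative stand-in for a coordinator built on threading.Event."""

    def __init__(self):
        self._stop_event = threading.Event()

    def should_stop(self):
        return self._stop_event.is_set()

    def request_stop(self):
        self._stop_event.set()

    def join(self, threads):
        for t in threads:
            t.join()


def worker(coord, worker_id, finished):
    steps = 0
    while not coord.should_stop():   # every thread polls should_stop
        steps += 1
        if worker_id == 0 and steps == 3:
            coord.request_stop()     # one thread asks all threads to stop
        time.sleep(0.001)
    finished.append(worker_id)


coord = SimpleCoordinator()
finished = []
threads = [threading.Thread(target=worker, args=(coord, i, finished)) for i in range(3)]
for t in threads:
    t.start()
coord.join(threads)                  # wait until every thread has exited
print(sorted(finished))              # [0, 1, 2]
```

The stop is cooperative: a thread only exits when it next checks should_stop, which is why each worker loop has to poll it.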
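tf.train.QueueRunner is mainly used to start several threads that operate on the same queue; these threads can be managed together through a tf.train.Coordinator.
Create a QueueRunner and specify the queue and the operations.

# define the queue
queue = tf.FIFOQueue(100, 'float')
# define the enqueue operation
enqueue_op = queue.enqueue([tf.random_normal([1])])
# create the QueueRunner: 5 threads, each running enqueue_op
qr = tf.train.QueueRunner(queue, [enqueue_op] * 5)
- tf.train.add_queue_runner
  Adds the QueueRunner to a collection of the TensorFlow graph; by default this is tf.GraphKeys.QUEUE_RUNNERS.
  
  tf.train.add_queue_runner(qr)
- tf.train.start_queue_runners
  When using QueueRunner you must call tf.train.start_queue_runners explicitly to start all threads. By default it starts every QueueRunner in the tf.GraphKeys.QUEUE_RUNNERS collection. tf.train.add_queue_runner and tf.train.start_queue_runners are normally pointed at the same collection.
  
  threads = tf.train.start_queue_runners(sess=sess, coord=coord)

Input file queue
When the training data is large, it can be split into several TFRecord files to make processing more efficient. TF provides tf.train.match_filenames_once to get all files matching a pattern; the resulting files can be managed with a queue created by tf.train.string_input_producer. Each time the file-reading function is called, it first checks whether a file is already open for reading; if not, it dequeues a file from the input queue and reads from it. Once all files have been processed, it puts every file of the initial list back into the queue. The num_epochs argument limits how many times the initial file list is loaded. When testing a neural network model, all test data is used only once, so num_epochs is set to 1.
Batch
TF provides tf.train.batch and tf.train.shuffle_batch to combine single examples into batches. Both functions create a queue: the enqueue operation produces single examples, and each dequeue returns one batch of examples.
Setting the random seed

tf.set_random_seed(1)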

layers
layers.dense

tf.layers.dense(tensor_in, num_out, activation=tf.nn.relu,
                kernel_initializer=tf.contrib.layers.xavier_initializer_conv2d(),
                kernel_regularizer=tf.contrib.layers.l2_regularizer(l2_reg))
# the regularization losses can be fetched with get_collection
reg_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
loss = tf.add_n(reg_losses + [reconstruction_loss])

layers.batch_normalization
!! Points to note !!
When building the training op, make the final train_op depend on update_ops. Only then are the moving averages of the mean and the variance updated (they are needed at test time).

update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = optimizer.minimize(loss)

Checkpoints usually do not store every model parameter, because unrelated data makes the model larger. A common approach is to save only the parameters that are updated during training.

var_list = tf.trainable_variables()
saver = tf.train.Saver(var_list=var_list, max_to_keep=5)
# if Saver() is given no arguments, all parameters are saved and restored

With batch_normalization, however, gamma and beta are trainable parameters while the moving mean and moving variance are not: they are computed as moving averages. If the model is saved as above, loading it for prediction fails because the moving mean and moving variance cannot be found. The following code avoids this.

var_list = tf.trainable_variables()
g_list = tf.global_variables()
bn_moving_vars = [g for g in g_list if 'moving_mean' in g.name]
bn_moving_vars += [g for g in g_list if 'moving_variance' in g.name]
var_list += bn_moving_vars
saver = tf.train.Saver(var_list=var_list, max_to_keep=5)

nn
nn.in_top_k: top-5 accuracy

correct = tf.nn.in_top_k(logits, y, 5)
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

Difference between nn.sparse_softmax_cross_entropy_with_logits and nn.softmax_cross_entropy_with_logits(logits, labels)

nn.softmax_cross_entropy_with_logits(logits, labels)
The non-sparse version takes one-hot labels. A one-hot label is a vector, so labels has the same shape as logits.
nn.sparse_softmax_cross_entropy_with_logits(logits, labels)
The sparse version takes integer class indices as labels, so labels has one dimension fewer than logits.
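The two label formats give the same loss for the same example. A pure-Python sketch for a single row of logits (the function names are illustrative; the math is the standard softmax cross entropy):

```python
import math


def softmax(logits):
    """Numerically stable softmax of one row of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dense_cross_entropy(logits, one_hot_label):
    """Label is a one-hot vector with the same length as logits."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probs))


def sparse_cross_entropy(logits, class_index):
    """Label is a single integer class index."""
    return -math.log(softmax(logits)[class_index])


logits = [2.0, 1.0, 0.1]
dense = dense_cross_entropy(logits, [1, 0, 0])
sparse = sparse_cross_entropy(logits, 0)
print(dense, sparse)  # the two values agree
```

The sparse form simply skips the multiplication by zeros, which is why it is preferred when each example has exactly one class.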

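nn.embedding_lookup

input_ids = tf.placeholder(dtype=tf.int32, shape=[None])
embedding = tf.Variable(np.identity(3, dtype=np.int32))
input_embedding = tf.nn.embedding_lookup(embedding, input_ids)

sess = tf.InteractiveSession()  # with sess = tf.Session(), embedding.eval() below raises an error (no default session)
sess.run(tf.global_variables_initializer())
print(embedding.eval())
print(sess.run(input_embedding, feed_dict={input_ids: [2, 0, 1]}))

nn.moments: compute the mean and variance of the input

# compute the mean and variance of a layer of the network
img = tf.Variable(tf.random_normal([128, 32, 32, 64]))
axis = list(range(len(img.get_shape()) - 1))  # get_shape() does not return a built-in tuple, although it prints like one
mean, variance = tf.nn.moments(img, axis)
# the resulting mean and variance can be used as inputs to batch normalization

tf.InteractiveSession() vs. tf.Session()
According to these notes, with tf.InteractiveSession() you can create the session first and define ops afterwards, whereas with tf.Session() all ops should be defined before the session is created. In practice the main difference is that tf.InteractiveSession() installs itself as the default session, so Tensor.eval() and Operation.run() work without a with block.
tf.identity
An op that returns a tensor identical to its input

x = tf.Variable(1.0)
y_val = tf.Variable(0.0)
y_op = tf.Variable(0.0)
# returns an op that adds 1 to the variable
x_plus_1 = tf.assign_add(x, 1)
# control_dependencies means: before running what is inside the with block, run the ops passed as arguments, here x_plus_1
with tf.control_dependencies([x_plus_1]):
    y_val = x               # plain assignment: prints 1, 1, 1, 1, 1
    y_op = tf.identity(x)   # new op: prints 2, 3, 4, 5, 6
with tf.Session() as sess:
    tf.global_variables_initializer().run()
    for i in range(5):
        print(y_val.eval())  # same as sess.run(y_val); 1 every time when run on its own
        # print(y_op.eval())  # run this line instead to get 2, 3, 4, 5, 6

The control_dependencies manager only takes effect when what is defined inside it is an op: the ops passed as arguments run first, then the op inside the manager. y = x is only a Python assignment of a tensor, not an op, so it creates no node in the graph and the manager has no effect. tf.identity is an op that returns a new, identical tensor; it adds a node to the graph, and then control_dependencies does take effect.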
API tf.contrib
tf.contrib.layers

init = tf.contrib.layers.xavier_initializer()
init = tf.contrib.layers.variance_scaling_initializer()  # He (Kaiming) initialization
l2_reg = 0.0001
l2_regularizer = tf.contrib.layers.l2_regularizer(l2_reg)
hidden = tf.layers.dense(X, n_hidden, activation=tf.nn.relu, kernel_initializer=init, kernel_regularizer=l2_regularizer)

tf.contrib.layers.variance_scaling_initializer keeps the scale of the variance of the inputs constant, so that it neither explodes nor vanishes by the time it reaches the last layer.

To get Delving Deep into Rectifiers (also known as the "MSRA initialization"), use (Default):
factor=2.0 mode='FAN_IN' uniform=False
To get Convolutional Architecture for Fast Feature Embedding, use:
factor=1.0 mode='FAN_IN' uniform=True
To get Understanding the difficulty of training deep feedforward neural networks, use:
factor=1.0 mode='FAN_AVG' uniform=True.
To get xavier_initializer use either:
factor=1.0 mode='FAN_AVG' uniform=True, or
factor=1.0 mode='FAN_AVG' uniform=False.

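tf.clip_by_value
Restrict the values of a tensor to a given range

tf.clip_by_value(y, 1e-10, 1.0)

tf.greater
tf.greater_equal
Return a bool tensor with the result of x > y (or x >= y); when x and y have different shapes, broadcasting is applied.
tf.where
Select x (where condition is True) or y (where it is False)

# A custom loss function. Example: predicting product sales. If the prediction is above the actual value, the loss is cost;
# if it is below, the loss is profit. In most cases the two kinds of loss are not equal.
loss_more = 10
loss_less = 1
loss = tf.reduce_mean(tf.where(tf.greater(y, y_), (y - y_) * loss_more, (y_ - y) * loss_less))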
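The asymmetric loss is easy to verify on two samples. A pure-Python sketch of the same selection logic, with made-up numbers (the function name is illustrative):

```python
def asymmetric_loss(predictions, targets, loss_more, loss_less):
    """Mean loss where over- and under-prediction are penalized differently."""
    per_sample = [
        (y - y_) * loss_more if y > y_ else (y_ - y) * loss_less
        for y, y_ in zip(predictions, targets)
    ]
    return sum(per_sample) / len(per_sample)


# Over-predicting by 2 costs 2 * 10 = 20; under-predicting by 2 costs 2 * 1 = 2
print(asymmetric_loss([12.0, 8.0], [10.0, 10.0], loss_more=10, loss_less=1))  # 11.0
```

Because over-prediction is ten times as expensive here, a model trained with this loss is pushed toward predicting slightly below the true value.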

tf.one_hot
Returns a one-hot tensor

input = mnist.train.labels[:8]
output = tf.one_hot(input, 10)

tf.reset_default_graph()
Clears the default computation graph; by default, nodes are created on the default graph.

# when several computation graphs exist
graph = tf.Graph()
with graph.as_default():
    # build graph
    ...
with tf.Session(graph=graph) as sess:
    # operate
    ...

tf.data
tf.data.Dataset
Dataset is a class that stores tensor structures. It can hold a batch of tensor structures for a model to train and test on.
Creating a Dataset
A Dataset can get its data in many ways: from tensors, or by transforming another Dataset. To create one from tensors, use

tf.data.Dataset.from_tensor_slices()

This interface accepts one or more tensor structures. The first dimension of each tensor is taken as the number of samples, so the first dimension must be the same across the whole structure.

dataset = tf.data.Dataset.from_tensor_slices(
    {"a": tf.random_uniform([4]),
     "b": tf.random_uniform([4, 100], maxval=100, dtype=tf.int32)})
print(dataset.output_types)   # >> {'a': tf.float32, 'b': tf.int32}
print(dataset.output_shapes)  # >> {'a': (), 'b': (100,)}

tf.data.Dataset.from_tensor_slices accepts nested structures of tensors; a dict is used here.
The first dimension is treated as the number of samples, so the shapes only show the dimensions after the first.

Reading from a Dataset

make_one_shot_iterator

dataset = tf.data.Dataset.from_tensor_slices(np.random.randn(10, 3))
iterator = dataset.make_one_shot_iterator()  # the data can be iterated only once
next_element = iterator.get_next()
with tf.Session() as sess:
    for i in range(10):
        value = sess.run(next_element)
        print(i, value)

make_initializable_iterator
An initializable iterator allows the Dataset to contain placeholders

max_value = tf.placeholder(tf.int64, shape=[])
dataset = tf.data.Dataset.range(max_value)
iterator = dataset.make_initializable_iterator()
next_element = iterator.get_next()
with tf.Session() as sess:
    sess.run(iterator.initializer, feed_dict={max_value: 10})
    for i in range(10):
        value = sess.run(next_element)

tf.train
tf.train.slice_input_producer
To make full use of the GPU and reduce the time it spends waiting for data, TensorFlow uses two threads, one for reading data and one for computation.
Specifically, one thread continuously reads image data from disk into an in-memory queue, and another thread does the computation, taking the data it needs straight from that queue.
In front of the memory queue TF also builds a file name queue, which holds the names of the files used for training. To train for N epochs, the file name queue contains N rounds of all file names.
The file name queue is created with tf.train.slice_input_producer.
tf.train.slice_input_producer is a tensor generator: each time, it takes one slice from a list of tensors, in order or at random as configured, and puts it into the file name queue.
You also need to call tf.train.start_queue_runners to start the threads that fill the file name queue. Only then can the computation read data; otherwise the queue stays empty and the computation waits forever, blocking the program.

import tensorflow as tf

images = ['img1', 'img2', 'img3', 'img4', 'img5']
labels = [1, 2, 3, 4, 5]
epoch_num = 8
f = tf.train.slice_input_producer([images, labels], num_epochs=None, shuffle=False)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    for i in range(epoch_num):
        k = sess.run(f)
        print('************************')
        print(i, k)
    coord.request_stop()
    coord.join(threads)

Returns a queue
Arguments

tensor_list: a list of tensors whose first dimensions are all equal
capacity: the capacity of the queue
shuffle: whether to shuffle

Return value
A list of tensors. If an input tensor in tensor_list has shape [N, a, b, ..., z], the corresponding output has shape [a, b, ..., z].
tf.train.batch & tf.train.shuffle_batch

# Ensure that random shuffling has good mixing properties
batch_size = 10
min_fraction_of_examples_in_queue = 0.4
min_queue_examples = int(dataset_size * min_fraction_of_examples_in_queue)
input_queue = tf.train.slice_input_producer([images, label], shuffle=False)
image_batch, label_batch = tf.train.batch(input_queue, batch_size=batch_size, num_threads=1,
                                          capacity=min_queue_examples + 3 * batch_size)
image_batch, label_batch = tf.train.shuffle_batch(input_queue, batch_size=batch_size, num_threads=1,
                                                  capacity=min_queue_examples + 3 * batch_size,
                                                  min_after_dequeue=min_queue_examples)

tf.train.shuffle_batch takes an additional min_after_dequeue argument, which must always be smaller than capacity.
tf.train.Saver
var_list specifies the variables to save and restore

Saving parameters
If the Saver's var_list is empty, all parameters are saved by default

weight = [weights['wc'], weights['wc2'], weights['wc3']]
saver = tf.train.Saver(weight)
saver.save(sess, './model/model.ckpt')
Restoring parameters
If the Saver's var_list is empty, all parameters are restored by default

weight = [weights['wc'], weights['wc2'], weights['wc3']]
saver = tf.train.Saver(weight)
saver.restore(sess, './model/model.ckpt')

Inspecting the saved variables and their names

from tensorflow.python import pywrap_tensorflow

ckpt = tf.train.get_checkpoint_state('./model/')
reader = pywrap_tensorflow.NewCheckpointReader(ckpt.model_checkpoint_path)
var_to_shape_map = reader.get_variable_to_shape_map()
for key in var_to_shape_map:
    print("tensor_name: ", key, reader.get_tensor(key).shape)

tf.train.init_from_checkpoint

exclude = [base_architecture + '/logits', 'global_step']
variables_to_restore = tf.contrib.slim.get_variables_to_restore(exclude=exclude)
tf.train.init_from_checkpoint('model.ckpt', {v.name.split(':')[0]: v for v in variables_to_restore})

tf.losses
tf.losses.absolute_difference

tf.losses.absolute_difference(labels, predictions)

tf.app
TensorFlow's command-line parsing tool
tf.app.flags

flags = tf.app.flags
flags.DEFINE_integer("epoch", 25, "Epoch to train")
flags.DEFINE_float("learning_rate", 0.0002, "Learning rate for adam")
flags.DEFINE_string("data_dir", "./data", "Directory name to save the image samples")
flags.DEFINE_boolean("visualize", False, "True for visualizing, False for nothing")
FLAGS = flags.FLAGS

print(FLAGS.epoch)          # >>> 25
print(FLAGS.learning_rate)  # >>> 0.0002
print(FLAGS.data_dir)       # >>> ./data

if __name__ == '__main__':
    tf.app.run()  # runs the program's main function and parses the command-line arguments

tf.read_file
Reads a file. An image read this way is not decoded yet; decode it with tf.image.decode_image, which handles the common image formats and is a convenient wrapper around tf.image.decode_png, tf.image.decode_jpeg and similar functions.

img_value = tf.read_file('test.jpg')
img = tf.image.decode_image(img_value)  # the decoded output is a Tensor; after eval() it is an ndarray

tf.gfile.FastGFile
Essentially the same as tf.read_file

img_raw = tf.gfile.FastGFile('test.jpg', 'rb').read()
img = tf.image.decode_jpeg(img_raw)  # the decoded output is a Tensor; after eval() it is an np.uint8 ndarray with values 0-255

tf.WholeFileReader
Reads through a file queue

file_queue = tf.train.string_input_producer(['img.jpg'])
img_reader = tf.WholeFileReader()  # reader
_, img = img_reader.read(file_queue)
img = tf.image.decode_jpeg(img)

tf.cast
Type conversion

x = tf.cast(x, tf.float32)

tf.stack
tf.stack joins tensors along a new axis, so the rank of the result is one higher than that of the inputs.

a = tf.constant([[1, 2, 3], [3, 4, 5]])      # shape (2, 3)
b = tf.constant([[7, 8, 9], [10, 11, 12]])   # shape (2, 3)
ab = tf.stack([a, b], axis=0)                # shape (2, 2, 3)

Changing axis to 2 gives:

import tensorflow as tf

a = tf.constant([[1, 2, 3], [3, 4, 5]])      # shape (2, 3)
b = tf.constant([[7, 8, 9], [10, 11, 12]])   # shape (2, 3)
ab = tf.stack([a, b], axis=2)                # shape (2, 3, 2)

tf.unstack
tf.unstack is the inverse of tf.stack: it splits a higher-rank tensor along an axis into lower-rank tensors. For example:

a = tf.constant([[1, 2, 3], [3, 4, 5]])      # shape (2, 3)
b = tf.constant([[7, 8, 9], [10, 11, 12]])   # shape (2, 3)
ab = tf.stack([a, b], axis=0)                # shape (2, 2, 3)
a1 = tf.unstack(ab, axis=0)                  # a list of 2 tensors, each of shape (2, 3)

tf.reshape & set_shape
The tf.Tensor.set_shape() method updates the static shape of a Tensor object. It is typically used to supply extra shape information when the static shape cannot be inferred directly. It does not change the dynamic shape of the tensor.
The tf.reshape() operation creates a new tensor with a different dynamic shape.
tf.tile
tf.tile is used when a tensor needs to be expanded by repetition.
For example, given a tensor of shape [width, height], you may need a tensor of shape [batch_size, width, height] in which every batch entry has the same content as the original tensor. The snippet below shows the simpler 2-D case, repeating a matrix along both of its axes.

raw = tf.Variable([[1, 2], [3, 4], [5, 6]])
multi = tf.tile(raw, multiples=[2, 3])
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(raw.eval())
    print('---------------')
    print(multi.eval())

>>> output:
[[1 2]
 [3 4]
 [5 6]]
---------------
[[1 2 1 2 1 2]
 [3 4 3 4 3 4]
 [5 6 5 6 5 6]
 [1 2 1 2 1 2]
 [3 4 3 4 3 4]
 [5 6 5 6 5 6]]
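What multiples=[2, 3] does to a 2-D tensor can be reproduced with nested lists: repeat every row 3 times along the columns, then repeat the whole block of rows 2 times. A pure-Python sketch (the helper name is illustrative):

```python
def tile_2d(matrix, multiples):
    """Repeat a 2-D list multiples[0] times by rows and multiples[1] times by columns."""
    row_reps, col_reps = multiples
    widened = [row * col_reps for row in matrix]          # repeat along the columns
    return [list(row) for _ in range(row_reps) for row in widened]  # repeat along the rows


raw = [[1, 2], [3, 4], [5, 6]]
multi = tile_2d(raw, [2, 3])
for row in multi:
    print(row)
# the result has 3 * 2 = 6 rows and 2 * 3 = 6 columns
```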

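tf.random_shuffle
Randomly shuffles a tensor along its first dimension
tf.ConfigProto
tf.ConfigProto is used when creating a Session to configure its parameters

config = tf.ConfigProto(allow_soft_placement=True)
# control how much GPU memory TF uses
config.gpu_options.per_process_gpu_memory_fraction = 0.4  # use 40% of GPU memory; this caps the memory per process on each GPU, but applies equally to all GPUs and cannot be set per GPU
config.gpu_options.allow_growth = True  # the allocator does not reserve all GPU memory up front but grows as needed
# OR
gpu_options = tf.GPUOptions(allow_growth=True)
config = tf.ConfigProto(..., gpu_options=gpu_options)
sess = tf.Session(config=config)

allow_soft_placement=True
In TF, "with tf.device('/cpu:0'):" lets you choose the device an operation runs on. If that device does not exist or is unavailable, the program fails. Setting allow_soft_placement=True prevents this by letting TF pick an existing, available device automatically.
log_device_placement=True
Prints to the terminal which device each operation runs on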

tf.graph_util
convert_variables_to_constants
Freezes a trained model by baking the weights into the graph.

output_graph_def = tf.graph_util.convert_variables_to_constants(sess, graph.as_graph_def(), ['outputdata'])
with tf.gfile.GFile('log/mnist.pb', 'wb') as f:
    f.write(output_graph_def.SerializeToString())

Loading a trained .pb model

with tf.gfile.FastGFile('log/mnist.pb', 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
    _ = tf.import_graph_def(graph_def, name='')

tf.py_func
Wraps a Python function as a TensorFlow operation

def my_func(x):
    return np.sinh(x)

inp = tf.placeholder(tf.float32)
y = tf.py_func(my_func, [inp], tf.float32)

inp: list of Tensor
Tout: list or tuple of TensorFlow data types

TensorBoard
Save the current computation graph

writer = tf.summary.FileWriter('./log/', tf.get_default_graph())
writer.close()

Display the current computation graph

tensorboard --logdir=./log

Automatic version upgrade

~/code/python/tensorflow/tf_upgrade.py --infile=origin.py --outfile=upgrade.py

slim

import tensorflow.contrib.slim as slim

slim.model_analyzer
A module for analyzing the operations and variables of a TensorFlow graph
analyze_vars

model_vars = tf.trainable_variables()
slim.model_analyzer.analyze_vars(model_vars, print_info=True)

analyze_ops

images, labels = LoadData(...)
predictions = MyModel(images)
slim.model_analyzer.analyze_ops(tf.get_default_graph(), print_info=True)

slim.flatten
Flattens the input while keeping batch_size, assuming the first dimension is the batch
slim.arg_scope()
Sets default arguments

@slim.add_arg_scope  # a function must be decorated with add_arg_scope before arg_scope can set its defaults
def func1(a=0, b=0):
    return a + b

with slim.arg_scope([func1], a=10):
    x = func1(b=30)
    print(x)
>>> 40

Commonly used functions such as slim.conv2d(), slim.fully_connected() and slim.max_pool2d() are already decorated with @add_arg_scope where they are defined, so default arguments can be set for them directly.
In the code below, unless stated otherwise, slim.conv2d, slim.max_pool2d and slim.avg_pool2d all default to a stride of 1 and 'VALID' padding, but either can still be overridden in an individual call. Setting arguments this way saves time when building network models, especially deep ones.

with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d], stride=1, padding='VALID'):
    net = slim.conv2d(inputs, 32, [3, 3], stride=2, scope='Conv2d_1a_3x3')
    net = slim.conv2d(net, 32, [3, 3], scope='Conv2d_2a_3x3')
    net = slim.conv2d(net, 64, [3, 3], padding='SAME', scope='Conv2d_2b_3x3')

The original, verbose version:

net = slim.conv2d(inputs, 64, [11, 11], 4, padding='SAME',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv1')
net = slim.conv2d(net, 128, [11, 11], padding='VALID',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv2')
net = slim.conv2d(net, 256, [11, 11], padding='SAME',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv3')

The concise version:

with slim.arg_scope([slim.conv2d], padding='SAME',
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                    weights_regularizer=slim.l2_regularizer(0.0005)):
    net = slim.conv2d(inputs, 64, [11, 11], scope='conv1')
    net = slim.conv2d(net, 128, [11, 11], padding='VALID', scope='conv2')  # padding='VALID' here overrides the default padding='SAME'
    net = slim.conv2d(net, 256, [11, 11], scope='conv3')

Nested scopes:
The outer scope holds the arguments shared by slim.conv2d and slim.fully_connected; the inner scope holds arguments for slim.conv2d only.

with slim.arg_scope([slim.conv2d, slim.fully_connected],
                    activation_fn=tf.nn.relu,
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                    weights_regularizer=slim.l2_regularizer(0.0005)):
    with slim.arg_scope([slim.conv2d], stride=1, padding='SAME'):
        net = slim.conv2d(inputs, 64, [11, 11], 4, padding='VALID', scope='conv1')
        net = slim.conv2d(net, 256, [5, 5],
                          weights_initializer=tf.truncated_normal_initializer(stddev=0.03),
                          scope='conv2')
        net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc')

slim.repeat

net = ...
net = slim.conv2d(net, 16, [3, 3], scope='conv1/conv1_1')
net = slim.conv2d(net, 16, [3, 3], scope='conv1/conv1_2')
net = slim.conv2d(net, 16, [3, 3], scope='conv1/conv1_3')

The above can be replaced by:

net = ...
net = slim.repeat(net, 3, slim.conv2d, 16, [3, 3], scope='conv1')

slim.stack
Layers with different numbers of output units:

# Verbose way:
x = slim.fully_connected(x, 32, scope='fc/fc_1')
x = slim.fully_connected(x, 64, scope='fc/fc_2')
x = slim.fully_connected(x, 128, scope='fc/fc_3')
# Equivalent, TF-Slim way using slim.stack:
slim.stack(x, slim.fully_connected, [32, 64, 128], scope='fc')

Layers that differ in both the number of output units and the kernel size:

# Verbose way:
x = slim.conv2d(x, 32, [3, 3], scope='core/core_1')
x = slim.conv2d(x, 32, [1, 1], scope='core/core_2')
x = slim.conv2d(x, 64, [3, 3], scope='core/core_3')
x = slim.conv2d(x, 64, [1, 1], scope='core/core_4')
# Using stack:
slim.stack(x, slim.conv2d, [(32, [3, 3]), (32, [1, 1]), (64, [3, 3]), (64, [1, 1])], scope='core')

Using a pretrained ResNet

net, endpoints = nets.resnet_v1.resnet_v1_50(inputs, num_classes=None, is_training=is_training)
net = tf.squeeze(net, axis=[1, 2])
net = slim.fully_connected(net, num_outputs=num_classes, activation_fn=None, scope='predict')

ResNet-50 was built to classify 1000 classes, so num_classes=None is passed to disable its final output layer.
For an input batch of shape [None, 224, 224, 3], resnet_v1_50 returns a tensor of shape [None, 1, 1, 2048]. Before feeding it to the fully connected layer, use tf.squeeze to remove axes 1 and 2, which have size 1.