TensorFlow Study Notes 8: Image Data Preprocessing (a CIFAR-10 Example)

1. Reading data from files into a dataset

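2. Split the data into two parts: train and valid. The train part takes the first 45000 records, which are then grouped into batches of 100 with batch. Training runs for 300 epochs in total (repeat(300)).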


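Let's work through the numbers:
Images trained per epoch: 45000 (every image in the training set)
Number of batches in the training set: 45000/100 = 450
Batches to complete per epoch: 450
Iterations per epoch: 450 (training on one batch counts as one parameter iteration)
Weight updates per epoch: 450
Weight updates after 300 epochs: 450*300 = 135000
Every epoch trains on the same training set. The 1st and the 10th epoch both use the same 45000 training images, but the weight updates they produce differ, because the model sits at a different position on the cost surface in each epoch. The later the epoch, the closer the model is to the bottom of the valley and the lower its cost.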
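The arithmetic above can be checked with a few lines of plain Python. This is only a sketch of the calculation; the variable names are made up for illustration and do not come from the original code.

```python
# Numbers taken from the notes above; the names are illustrative.
train_size = 45000   # images in the training split
batch_size = 100     # images per batch
epochs = 300         # value passed to repeat()

# One weight update happens per batch, so updates per epoch == batches per epoch.
batches_per_epoch = train_size // batch_size
total_updates = batches_per_epoch * epochs

print(batches_per_epoch)  # 450
print(total_updates)      # 135000
```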
3. The make_initializable_iterator() iterator

iterator = train_dataset.make_initializable_iterator()
# Decode the train records from the iterator
record = iterator.get_next()  # fetch one batch of data

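4. tf.decode_raw()
This turns a value that was encoded as a string back into numbers. It is used a lot with datasets, because image source data is usually written into a TFRecord as raw bytes (for example with NumPy's tobytes()), that is, as a string. When reading the raw data back you must specify its original format: the dtype used for decoding has to match the dtype the data was written with.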

record_decoded_bytes = tf.decode_raw(record, tf.uint8)

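Difference from tf.cast:
tf.cast is mainly used to convert the data type. It converts element by element and does not change the shape of the data, and values are preserved as far as the target type can represent them. tf.decode_raw, by contrast, reinterprets a string of bytes as numbers of the given type. tf.cast can also be used to turn a NumPy array into a tensor.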
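To make the difference concrete, here is a small sketch that uses NumPy as a stand-in: np.frombuffer plays the role of tf.decode_raw and astype plays the role of tf.cast. These are analogues chosen for illustration, not the TensorFlow calls themselves, and the sample bytes are made up.

```python
import numpy as np

# Four raw bytes, as they might come out of a binary record (made-up values).
raw = bytes([0, 1, 2, 255])

# Analogue of tf.decode_raw(record, tf.uint8): reinterpret the bytes.
as_uint8 = np.frombuffer(raw, dtype=np.uint8)    # 4 elements, 1 byte each
# Reading the same bytes with a wider dtype changes the element count,
# which is why the decode dtype must match the dtype used when writing.
as_uint16 = np.frombuffer(raw, dtype=np.uint16)  # 2 elements, 2 bytes each

# Analogue of tf.cast: convert each element, keep the shape.
as_float = as_uint8.astype(np.float32)

print(as_uint8.shape, as_uint16.shape, as_float.shape)  # (4,) (2,) (4,)
```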
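5. tf.slice() slicing
Reference: https://www.jianshu.com/p/71e6ef6c121b
def slice(input_, begin, size, name=None):
Here "input_" is the tensor you pass in, the one being sliced. "begin" is the start position in each dimension. "size" says how many elements to take from each dimension.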

# Get the labels from the records
record_labels = tf.slice(record_decoded_bytes, [0, 0], [batch_size, 1])
record_labels = tf.cast(record_labels, tf.int32)
# Get the images from the records
record_images = tf.slice(record_decoded_bytes, [0, 1], [batch_size, image_bytes])
record_images = tf.reshape(record_images, [batch_size, imageDepth, imageHeight, imageWidth])  # NCHW
# When running on the CPU only the NHWC format is accepted, i.e. the channel is the last dimension
record_images = tf.transpose(record_images, [0, 2, 3, 1])
record_images = tf.cast(record_images, tf.float32)

The shape of record_labels is (batch_size, 1).
The shape of record_images is first (batch_size, image_bytes). reshape turns it into (batch_size, imageDepth, imageHeight, imageWidth), transpose reorders it into NHWC format, and cast makes every value a float32.
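The shape changes above can be followed step by step with NumPy indexing in place of tf.slice. This is a sketch under assumptions: it uses random bytes instead of real records, a batch of 4, and the CIFAR-10 binary layout of one label byte followed by 3 x 32 x 32 image bytes.

```python
import numpy as np

batch_size = 4                                        # small batch for illustration
imageDepth, imageHeight, imageWidth = 3, 32, 32       # assumed CIFAR-10 sizes
image_bytes = imageDepth * imageHeight * imageWidth   # 3072
record_bytes = 1 + image_bytes                        # label byte + image bytes

# Fake decoded batch: one row of uint8 values per record (random, not real data).
rng = np.random.default_rng(0)
records = rng.integers(0, 256, size=(batch_size, record_bytes), dtype=np.uint8)

# begin=[0, 0], size=[batch_size, 1] -> the first column holds the labels.
labels = records[0:batch_size, 0:1].astype(np.int32)
# begin=[0, 1], size=[batch_size, image_bytes] -> the remaining columns.
images = records[0:batch_size, 1:1 + image_bytes]

images = images.reshape(batch_size, imageDepth, imageHeight, imageWidth)  # NCHW
images = images.transpose(0, 2, 3, 1).astype(np.float32)                  # NHWC

print(labels.shape, images.shape)  # (4, 1) (4, 32, 32, 3)
```

Note that reshape alone would not be enough: the bytes are stored channel by channel, so the transpose is what moves the channel axis to the end.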
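6. tf.image.resize_image_with_crop_or_pad() image resizing
tf.random_crop() random cropping
I had written a lot here, but the web page broke and nothing was saved, which is a real pity.
So this part is kept brief.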

# Pad each side with 4 pixels, then randomly crop back to the original size
distorted_images = tf.image.resize_image_with_crop_or_pad(record_images, imageHeight + 8, imageWidth + 8)
distorted_images = tf.random_crop(distorted_images, size=[batch_size, imageHeight, imageWidth, 3])

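Resizing is one of the basic geometric image transforms. TensorFlow provides several resizing functions:
tf.image.resize_images: scales the original image to the specified size. Its method parameter (default ResizeMethod.BILINEAR) offers four interpolation algorithms; for details see the article on common interpolation algorithms in geometric image transforms (scaling, rotation).
tf.image.resize_image_with_crop_or_pad: crops or pads. It chooses between cropping and padding by comparing the size of the original image with the requested target size: if the original is larger than the target it crops around the center, otherwise it pads with black pixels.
tf.image.central_crop: crops by proportion. central_fraction sets the proportion and takes values in (0, 1]. The function uses the center point as reference and keeps the specified fraction of the whole image as the new image.
Original link: https://blog.csdn.net/chaipp0607/article/details/73029923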
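tf.random_crop() randomly crops a tensor to a given size.
It cuts a portion of shape size out of value at a uniformly chosen offset. This requires value.shape >= size.
If a dimension should not be cropped, pass the full size of that dimension. For example, an RGB image can be cropped with size = [crop_height, crop_width, 3].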
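The pad-then-crop augmentation used in this section can be sketched with NumPy. This is an illustration under assumptions (random pixel data, a batch of 4, 32 x 32 images), with np.pad and array slicing standing in for the two TensorFlow calls.

```python
import numpy as np

batch_size, imageHeight, imageWidth = 4, 32, 32   # assumed sizes
rng = np.random.default_rng(0)
images = rng.random((batch_size, imageHeight, imageWidth, 3), dtype=np.float32)

# Zero-pad 4 pixels on every side of height and width: 32x32 -> 40x40.
padded = np.pad(images, ((0, 0), (4, 4), (4, 4), (0, 0)), mode="constant")

# Pick one random offset and cut a 32x32 window back out.
top = int(rng.integers(0, padded.shape[1] - imageHeight + 1))
left = int(rng.integers(0, padded.shape[2] - imageWidth + 1))
cropped = padded[:, top:top + imageHeight, left:left + imageWidth, :]

print(padded.shape, cropped.shape)  # (4, 40, 40, 3) (4, 32, 32, 3)
```

The sketch uses a single offset for the whole batch. A single tf.random_crop call on a 4-D tensor behaves the same way, so giving every image its own offset would require cropping the images one at a time.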
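7. tf.unstack() and tf.stack()
tf.unstack() splits a tensor into pieces along one dimension. Its key parameters are:
value: the tensor to split (really just a multi-dimensional array, often two-dimensional);
axis: the dimension along which to split.
tf.stack() is the inverse: it joins a list of tensors back together along a new dimension.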

# Unstack the images, as the follow-up operations work on a single train image
distorted_images = tf.unstack(distorted_images)
for i in range(len(distorted_images)):
    pass  # per-image processing omitted here (see section 8)
# Stack the images
distorted_images = tf.stack(distorted_images)
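The unstack, process, stack round trip can be tried out on a tiny array. The sketch below uses NumPy analogues (a list comprehension over axis 0 and np.stack), and the horizontal flip is only a placeholder for the per-image processing.

```python
import numpy as np

# A tiny fake batch: 2 images of 3x3 pixels with 1 channel.
batch = np.arange(2 * 3 * 3 * 1, dtype=np.float32).reshape(2, 3, 3, 1)

# Analogue of tf.unstack(batch): split along axis 0 into a list of images.
images = [batch[i] for i in range(batch.shape[0])]

# Process each image on its own (here: a horizontal flip as a placeholder).
for i in range(len(images)):
    images[i] = images[i][:, ::-1, :]

# Analogue of tf.stack(images): join the list back along a new axis 0.
restacked = np.stack(images, axis=0)

print(len(images), images[0].shape, restacked.shape)  # 2 (3, 3, 1) (2, 3, 3, 1)
```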

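8. tf.image.random_flip_left_right() randomly flips an image horizontally (left to right). With probability 1/2 it outputs the image flipped along its second dimension, i.e. width; otherwise it outputs the image unchanged.
tf.image.random_brightness() randomly adjusts the image brightness within a given range.
tf.image.random_contrast() randomly adjusts the image contrast within a given range.
These first three functions are used to preprocess images during training in order to augment the training data.
tf.image.per_image_standardization() standardizes an image: the values in the three-dimensional array end up with mean 0 and variance 1.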

distorted_images[i] = tf.image.random_flip_left_right(distorted_images[i])
distorted_images[i] = tf.image.random_brightness(distorted_images[i], max_delta=63)
distorted_images[i] = tf.image.random_contrast(distorted_images[i], lower=0.2, upper=1.8)
distorted_images[i] = tf.image.per_image_standardization(distorted_images[i])
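What the flip and the standardization do to the numbers can be sketched in NumPy. This is an illustration on random pixel values, not the TensorFlow implementation. The lower bound of 1/sqrt(N) on the standard deviation follows my reading of the TensorFlow documentation for per_image_standardization and should be treated as an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.uniform(0.0, 255.0, size=(32, 32, 3)).astype(np.float32)  # fake HWC image

# Flip left-right with probability 1/2 (axis 1 is width in HWC layout).
if rng.random() < 0.5:
    image = image[:, ::-1, :]

# Standardize: subtract the mean, divide by the standard deviation.
# The lower bound 1/sqrt(N) guards against division by zero on flat images.
num_elements = image.size
adjusted_std = max(float(image.std()), 1.0 / np.sqrt(num_elements))
standardized = (image - image.mean()) / adjusted_std

# Mean should be close to 0 and standard deviation close to 1.
print(standardized.mean(), standardized.std())
```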