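What is the difference between "SAME" and "VALID" padding in tf.nn.max_pool of TensorFlow?

What is the difference between "SAME" and "VALID" padding in tf.nn.max_pool of TensorFlow?

In my opinion, "VALID" means there will be no zero padding outside the edges when we do max pooling.

According to "A guide to convolution arithmetic for deep learning", there is no padding in the pooling operator, i.e. it just uses TensorFlow's "VALID". But what is "SAME" padding for max pooling in TensorFlow?

I'll give an example to make it clearer:

x: input image of shape [2, 3], 1 channel
valid_pad: max pool with a 2x2 kernel, stride 2 and VALID padding.
same_pad: max pool with a 2x2 kernel, stride 2 and SAME padding (this is the classic way to go)

The output shapes are:

valid_pad: there is no padding here, so the output shape is [1, 1]
same_pad: here, we pad the image to the shape [2, 4] (with -inf, and then apply max pool), so the output shape is [1, 2]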

    import tensorflow as tf

    x = tf.constant([[1., 2., 3.],
                     [4., 5., 6.]])
    x = tf.reshape(x, [1, 2, 3, 1])  # give a shape accepted by tf.nn.max_pool

    valid_pad = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='VALID')
    same_pad = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')

    valid_pad.get_shape() == [1, 1, 1, 1]  # valid_pad is [5.]
    same_pad.get_shape() == [1, 1, 2, 1]   # same_pad is [5., 6.]
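To see where the values [5.] and [5., 6.] come from without running TensorFlow, here is a minimal pure-Python sketch of the same pooling. It is not how TensorFlow implements max_pool; the function name max_pool_2d is made up for illustration, and it assumes a single-channel image, a square k x k window, and an input at least as large as the window.

```python
import math

NEG_INF = float("-inf")

def max_pool_2d(image, k, stride, padding):
    """Max-pool a 2-D list of floats with a k x k window (illustrative only)."""
    h, w = len(image), len(image[0])
    if padding == "SAME":
        out_h, out_w = math.ceil(h / stride), math.ceil(w / stride)
        pad_h = max((out_h - 1) * stride + k - h, 0)
        pad_w = max((out_w - 1) * stride + k - w, 0)
    else:  # "VALID": no padding, incomplete windows are dropped
        out_h, out_w = (h - k) // stride + 1, (w - k) // stride + 1
        pad_h = pad_w = 0
    top, left = pad_h // 2, pad_w // 2  # any odd extra goes to the bottom/right

    # Build the padded image; padded cells hold -inf so they never win the max.
    padded = [[NEG_INF] * (w + pad_w) for _ in range(h + pad_h)]
    for r in range(h):
        for c in range(w):
            padded[r + top][c + left] = image[r][c]

    return [
        [
            max(padded[r * stride + i][c * stride + j]
                for i in range(k) for j in range(k))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

x = [[1., 2., 3.], [4., 5., 6.]]
valid_pad = max_pool_2d(x, k=2, stride=2, padding="VALID")
same_pad = max_pool_2d(x, k=2, stride=2, padding="SAME")
print(valid_pad)  # [[5.0]]
print(same_pad)   # [[5.0, 6.0]]
```

With SAME, the third column (3 and 6) gets its own window next to a padded column of -inf, which is why 6 survives; with VALID that column is simply dropped.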

If you like ASCII art:

"VALID" = without padding:

    inputs:         1  2  3  4  5  6  7  8  9  10 11 (12 13)
                   |________________|                dropped
                                  |_________________|

"SAME" = with zero padding:

                pad|                                      |pad
    inputs:      0 |1  2  3  4  5  6  7  8  9  10 11 12 13|0  0
                |________________|
                               |_________________|
                                              |________________|

In this example:

Input width = 13
Filter width = 6
Stride = 5
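The window positions for these three parameters can be reproduced with a short simulation. This is only a sketch of the 1-D case: pooling_windows is a hypothetical helper, the inputs are labelled 1..13 as in the drawing, and 0 stands for a padded position.

```python
import math

def pooling_windows(width, k, stride, padding):
    """Return the contents of each window over the 1-D input 1..width."""
    inputs = list(range(1, width + 1))
    if padding == "SAME":
        out = math.ceil(width / stride)
        total_pad = max((out - 1) * stride + k - width, 0)
        left = total_pad // 2        # an odd extra column goes to the right
        right = total_pad - left
        inputs = [0] * left + inputs + [0] * right
    else:  # "VALID": only windows that fit entirely inside the input
        out = (width - k) // stride + 1
    return [inputs[i * stride : i * stride + k] for i in range(out)]

valid_windows = pooling_windows(13, 6, 5, "VALID")
same_windows = pooling_windows(13, 6, 5, "SAME")
for window in valid_windows:
    print("VALID", window)
for window in same_windows:
    print("SAME ", window)
```

VALID yields two windows (1..6 and 6..11) and never touches 12 and 13; SAME yields three windows, with one padded position on the left and two on the right.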

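Notes:

"VALID" only ever drops the right-most columns (or bottom-most rows).
"SAME" tries to pad evenly left and right, but if the number of columns to be added is odd, it adds the extra column on the right, as is the case in this example (the same logic applies vertically: there may be an extra row of zeros at the bottom).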

The TensorFlow Convolution example gives an overview of the difference between SAME and VALID:

For SAME padding, the output height and width are computed as:

    out_height = ceil(float(in_height) / float(strides[1]))
    out_width = ceil(float(in_width) / float(strides[2]))

And

For VALID padding, the output height and width are computed as:

    out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
    out_width = ceil(float(in_width - filter_width + 1) / float(strides[2]))
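As a quick check of the two formulas, here is a small sketch for a single dimension. The helper name out_size is an assumption for illustration, not a TensorFlow function. It also checks that the VALID formula agrees with the form often seen elsewhere, floor((in - filter) / stride) + 1.

```python
import math

def out_size(in_size, filter_size, stride, padding):
    """Output length along one dimension, following the formulas above."""
    if padding == "SAME":
        return math.ceil(in_size / stride)
    if padding == "VALID":
        return math.ceil((in_size - filter_size + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")

# Width 13, filter 6, stride 5 (the ASCII-art example).
print(out_size(13, 6, 5, "VALID"))  # 2
print(out_size(13, 6, 5, "SAME"))   # 3

# Width 3, filter 2, stride 2 (the width of the [2, 3] image).
print(out_size(3, 2, 2, "VALID"))   # 1
print(out_size(3, 2, 2, "SAME"))    # 2

# The VALID formula matches floor((in - filter) / stride) + 1.
agrees = all(
    out_size(n, f, s, "VALID") == (n - f) // s + 1
    for n in range(1, 30) for f in range(1, n + 1) for s in range(1, 7)
)
print(agrees)  # True
```

Note that the SAME output size does not depend on the filter size at all; the filter size only determines how much padding is needed to reach it.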

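When the stride is 1 (more typical with convolution than with pooling), we can think of the following distinction:

"SAME": the output size is the same as the input size. This requires the filter window to slip outside the input map, hence the need to pad.
"VALID": the filter window stays at valid positions inside the input map, so the output size shrinks by filter_size - 1. No padding occurs.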

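Padding is an operation that increases the size of the input data. For 1-dimensional data you just append or prepend a constant to the array; in 2 dimensions you surround the matrix with these constants. In n dimensions you surround your n-dim hypercube with the constant. In most cases this constant is zero, and the operation is called zero-padding.

For example, zero-padding with p=1 applied to a 2-d tensor adds one row of zeros above and below it and one column of zeros on each side. [image: zero-padding with p=1 applied to a 2-d tensor]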
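Zero-padding a small 2-d tensor can be sketched in plain Python. The helper zero_pad_2d is made up for this illustration and works on a list of lists rather than a real tensor.

```python
def zero_pad_2d(matrix, p):
    """Surround a 2-D list with p rows/columns of zeros on every side."""
    width = len(matrix[0]) + 2 * p
    padded = [[0] * width for _ in range(p)]                      # top rows
    padded += [[0] * p + list(row) + [0] * p for row in matrix]   # left/right
    padded += [[0] * width for _ in range(p)]                     # bottom rows
    return padded

padded = zero_pad_2d([[1, 2], [3, 4]], p=1)
for row in padded:
    print(row)
# [0, 0, 0, 0]
# [0, 1, 2, 0]
# [0, 3, 4, 0]
# [0, 0, 0, 0]
```

A 2x2 input becomes 4x4: each dimension grows by 2 * p.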

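You can use arbitrary padding for your kernel, but some padding values are used more frequently than others:

VALID padding. The easiest case: no padding at all. Just leave your data as it was.
SAME padding, sometimes called HALF padding. It is called SAME because for a convolution with stride = 1 (or for pooling) it should produce an output of the same size as the input. It is called HALF because for a kernel of size k the padding on each side is floor(k / 2).
FULL padding is the maximum padding that does not result in a convolution over padded elements only. For a kernel of size k, this padding is equal to k - 1.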

To use arbitrary padding in TF, you can use tf.pad().

There are three choices of padding: valid (no padding), same (or half), and full. You can find explanations (in Theano) here: http://deeplearning.net/software/theano/tutorial/conv_arithmetic.html

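Valid or no padding:

Valid padding involves no zero padding, so it covers only the valid input and excludes artificially generated zeros. The output length is ((the length of the input) - (k-1)) for kernel size k if the stride s = 1.

Same or half padding:

Same padding makes the output size the same as the input size when s = 1. If s = 1, the number of zeros padded is (k-1).

Full padding:

Full padding means that the kernel runs over the whole input, so at the ends the kernel may meet only one input element, with zeros everywhere else. The number of zeros padded is 2(k-1) if s = 1. The output length is ((the length of the input) + (k-1)) if s = 1.

Therefore, the number of paddings: (valid) <= (same) <= (full)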
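The three cases for stride 1 can be put side by side in a few lines. This is a sketch with hypothetical helper names; it only uses the padding totals stated above (0, k-1 and 2(k-1)) and the fact that a window of size k fits (padded length - k + 1) times at stride 1.

```python
def total_padding(k, mode):
    """Total number of zeros added along one dimension, for stride 1."""
    return {"valid": 0, "same": k - 1, "full": 2 * (k - 1)}[mode]

def output_length(n, k, mode):
    """Output length for input length n and kernel size k, at stride 1."""
    return n + total_padding(k, mode) - k + 1

n, k = 7, 3
lengths = {mode: output_length(n, k, mode) for mode in ("valid", "same", "full")}
print(lengths)  # {'valid': 5, 'same': 7, 'full': 9}
```

For n = 7 and k = 3 this gives n - (k-1) = 5, n = 7 and n + (k-1) = 9, matching the three descriptions.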

Based on the explanation here and following up on Tristan's answer, I usually use these quick functions for sanity checks.

    import numpy as np

    # a function to help us stay clean: split the total padding per axis;
    # if the total is odd, the extra pixel goes to the bottom / right
    def getPaddings(pad_along_height, pad_along_width):
        pad_top = pad_along_height // 2
        pad_bottom = pad_along_height - pad_top
        pad_left = pad_along_width // 2
        pad_right = pad_along_width - pad_left
        return pad_top, pad_bottom, pad_left, pad_right

    # strides [image index, y, x, depth]
    # padding 'SAME' or 'VALID'
    # bottom and right sides always get the one additional padded pixel (if padding is odd)
    def getOutputDim(inputWidth, inputHeight, filterWidth, filterHeight, strides, padding):
        if padding == 'SAME':
            out_height = int(np.ceil(float(inputHeight) / float(strides[1])))
            out_width = int(np.ceil(float(inputWidth) / float(strides[2])))
            # clamp at zero: the raw value can be negative when the stride exceeds the filter size
            pad_along_height = max((out_height - 1) * strides[1] + filterHeight - inputHeight, 0)
            pad_along_width = max((out_width - 1) * strides[2] + filterWidth - inputWidth, 0)
            # now get padding
            pad_top, pad_bottom, pad_left, pad_right = getPaddings(pad_along_height, pad_along_width)
            print('output height', out_height)
            print('output width', out_width)
            print('total pad along height', pad_along_height)
            print('total pad along width', pad_along_width)
            print('pad at top', pad_top)
            print('pad at bottom', pad_bottom)
            print('pad at left', pad_left)
            print('pad at right', pad_right)
        elif padding == 'VALID':
            out_height = int(np.ceil(float(inputHeight - filterHeight + 1) / float(strides[1])))
            out_width = int(np.ceil(float(inputWidth - filterWidth + 1) / float(strides[2])))
            print('output height', out_height)
            print('output width', out_width)
            print('no padding')

    # use like so
    getOutputDim(80, 80, 4, 4, [1, 1, 1, 1], 'SAME')

I am quoting this answer from the official TensorFlow docs: https://www.tensorflow.org/api_guides/python/nn#Convolution

For "SAME" padding, the output height and width are computed as:

    out_height = ceil(float(in_height) / float(strides[1]))
    out_width = ceil(float(in_width) / float(strides[2]))

The total padding along the height and width, and its split between top/bottom and left/right, is computed as:

    pad_along_height = max((out_height - 1) * strides[1] + filter_height - in_height, 0)
    pad_along_width = max((out_width - 1) * strides[2] + filter_width - in_width, 0)
    pad_top = pad_along_height // 2
    pad_bottom = pad_along_height - pad_top
    pad_left = pad_along_width // 2
    pad_right = pad_along_width - pad_left

For "VALID" padding, the output height and width are computed as:

    out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
    out_width = ceil(float(in_width - filter_width + 1) / float(strides[2]))

and the padding values are always zero.
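To show why the quoted formula has the max(..., 0), here is a small sketch for one dimension. The function same_padding is a hypothetical helper, not part of TensorFlow; it returns the padding before (top or left) and after (bottom or right).

```python
import math

def same_padding(in_size, filter_size, stride):
    """(before, after) padding along one dimension under "SAME"."""
    out = math.ceil(in_size / stride)
    total = max((out - 1) * stride + filter_size - in_size, 0)
    before = total // 2
    return before, total - before

print(same_padding(2, 2, 2))   # (0, 0)  height of a [2, 3] image, 2x2 pool
print(same_padding(3, 2, 2))   # (0, 1)  width of the same image
print(same_padding(13, 6, 5))  # (1, 2)  width 13, filter 6, stride 5
print(same_padding(5, 1, 3))   # (0, 0)  raw total would be -1 without max()
```

In the last call the stride is larger than the filter, so the two windows already fit inside the input and the unclamped expression comes out negative; the max keeps the padding at zero.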
