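Computing the receptive field
1 The concept of the receptive field
In a convolutional neural network, the receptive field is the size of the region in the original input image that a pixel on the feature map output by a given layer maps back to.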
The R-CNN paper notes that pixels on the feature map output by AlexNet's pool5 layer have very large receptive fields ($195 \times 195$ pixels) and strides ($32 \times 32$ pixels) in the input image. How are these two values obtained?
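2 Computing the receptive field size
A few points need to be stated before computing the receptive field:
(1) For the first convolutional layer, the receptive field of a pixel on the output feature map equals the filter size.
(2) The receptive field of a deeper layer depends on the filter sizes and strides of all the layers before it.
(3) The calculation ignores image-border effects, i.e. $padding$ is not taken into account. A layer's $stride$ only affects the receptive field of the feature maps in the layers after it, whereas $fsize$ affects the receptive field of the layer itself.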
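Each layer also has a cumulative stride, written $strides$: the product of the $stride$ values of all layers up to and including that layer. This is the value reported in the Strides column of the results, where the first layer's cumulative stride already equals its own stride.
That is, $$strides(i) = stride(1) \times stride(2) \times \dots \times stride(i)$$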
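The cumulative stride can be sketched as a running product. The per-layer stride list below is an assumed AlexNet configuration (conv1 through pool5), chosen because it reproduces the Strides column of the printed results; the function name `cumulative_strides` is hypothetical. Each entry counts the layer's own stride, which is how the results report it.

```python
from itertools import accumulate
from operator import mul

# Assumed per-layer strides for AlexNet, conv1 .. pool5 (not taken from the
# original script; picked to match the Strides column in the results).
ALEXNET_STRIDES = [4, 2, 1, 2, 1, 1, 1, 2]


def cumulative_strides(strides):
    """Return, for each layer, the product of the strides of layers 1..i."""
    return list(accumulate(strides, mul))


print(cumulative_strides(ALEXNET_STRIDES))
```

Under these assumed strides the last entry is 32, the pool5 stride quoted from the R-CNN paper.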
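The receptive field size is computed top-down: first compute the receptive field of the deepest layer on the layer before it, then propagate the result layer by layer down to the first layer. The procedure can be written as:
$$RF = 1$$ # receptive field size on the feature map being computed
for layer in (top layer to bottom layer):
$$RF = ((RF - 1) \times stride) + fsize$$
Here $stride$ is the stride of the layer and $fsize$ is the filter size of the layer.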
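The recurrence can be sketched in a few lines of Python. This is a minimal illustration, not the original script: the function name `receptive_field` and the `(fsize, stride)` tuple layout are assumptions, and the two AlexNet layers used in the example are assumed values consistent with the hand calculation for pool1.

```python
def receptive_field(layers):
    """Receptive field of the last layer in `layers` on the input image.

    `layers` is a list of (fsize, stride) pairs ordered from the first layer
    to the layer of interest. Padding is ignored, as in the text.
    """
    rf = 1
    # Walk from the top (deepest) layer down to the first layer.
    for fsize, stride in reversed(layers):
        rf = (rf - 1) * stride + fsize
    return rf


# Assumed AlexNet head: conv1 (11x11, stride 4), pool1 (3x3, stride 2).
alexnet_up_to_pool1 = [(11, 4), (3, 2)]
print(receptive_field(alexnet_up_to_pool1))  # 19
```

With a single layer the loop runs once and returns `fsize`, which agrees with point (1) above.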
The receptive field size of the feature map output by each layer of AlexNet, ZF-5 and VGG16 was computed with a Python 3 script:
# -*- coding:utf-8 -*-
The output is as follows:
layer output sizes given image = 224x224
**net structrue name is vgg16**
Layer Name = conv1_1, Output size = 224, Strides = 1, RF size = 3
Layer Name = conv1_2, Output size = 224, Strides = 1, RF size = 5
Layer Name = pool1, Output size = 112, Strides = 2, RF size = 6
Layer Name = conv2_1, Output size = 112, Strides = 2, RF size = 10
Layer Name = conv2_2, Output size = 112, Strides = 2, RF size = 14
Layer Name = pool2, Output size = 56, Strides = 4, RF size = 16
Layer Name = conv3_1, Output size = 56, Strides = 4, RF size = 24
Layer Name = conv3_2, Output size = 56, Strides = 4, RF size = 32
Layer Name = conv3_3, Output size = 56, Strides = 4, RF size = 40
Layer Name = pool3, Output size = 28, Strides = 8, RF size = 44
Layer Name = conv4_1, Output size = 28, Strides = 8, RF size = 60
Layer Name = conv4_2, Output size = 28, Strides = 8, RF size = 76
Layer Name = conv4_3, Output size = 28, Strides = 8, RF size = 92
Layer Name = pool4, Output size = 14, Strides = 16, RF size = 100
Layer Name = conv5_1, Output size = 14, Strides = 16, RF size = 132
Layer Name = conv5_2, Output size = 14, Strides = 16, RF size = 164
Layer Name = conv5_3, Output size = 14, Strides = 16, RF size = 196
Layer Name = pool5, Output size = 7, Strides = 32, RF size = 212
**net structrue name is zf-5**
Layer Name = conv1, Output size = 112, Strides = 2, RF size = 7
Layer Name = pool1, Output size = 56, Strides = 4, RF size = 11
Layer Name = conv2, Output size = 28, Strides = 8, RF size = 27
Layer Name = pool2, Output size = 14, Strides = 16, RF size = 43
Layer Name = conv3, Output size = 14, Strides = 16, RF size = 75
Layer Name = conv4, Output size = 14, Strides = 16, RF size = 107
Layer Name = conv5, Output size = 14, Strides = 16, RF size = 139
**net structrue name is alexnet**
Layer Name = conv1, Output size = 54, Strides = 4, RF size = 11
Layer Name = pool1, Output size = 26, Strides = 8, RF size = 19
Layer Name = conv2, Output size = 26, Strides = 8, RF size = 51
Layer Name = pool2, Output size = 12, Strides = 16, RF size = 67
Layer Name = conv3, Output size = 12, Strides = 16, RF size = 99
Layer Name = conv4, Output size = 12, Strides = 16, RF size = 131
Layer Name = conv5, Output size = 12, Strides = 16, RF size = 163
Layer Name = pool5, Output size = 5, Strides = 32, RF size = 195
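As an example, for AlexNet's pool1 layer, working back through pool1 ($fsize = 3$, $stride = 2$) and then conv1 ($fsize = 11$, $stride = 4$), the calculation is $(1-1) \times 2+3=3$, $(3-1) \times 4+11=19$.
For AlexNet's pool2 layer, working back through pool2, conv2, pool1 and conv1, the calculation is $(1-1) \times 2+3=3$, $(3-1) \times 1+5=7$, $(7-1) \times 2+3=15$, $(15-1) \times 4+11=67$.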
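The hand calculations can be checked with a short sketch that recomputes the whole AlexNet table. The `(name, fsize, stride, pad)` values in `ALEXNET` are assumptions rather than the original script's configuration; they were chosen because they reproduce the AlexNet rows of the output shown earlier. The output size uses the usual floor formula $(in - fsize + 2 \times pad) / stride + 1$, and `describe` is a hypothetical helper name.

```python
# Assumed AlexNet layer configuration: (name, fsize, stride, pad).
ALEXNET = [
    ("conv1", 11, 4, 0),
    ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2),
    ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1),
    ("conv4", 3, 1, 1),
    ("conv5", 3, 1, 1),
    ("pool5", 3, 2, 0),
]


def describe(net, imsize=224):
    """Return (name, output size, cumulative stride, RF size) for every layer."""
    rows = []
    size, total_stride = imsize, 1
    for i, (name, fsize, stride, pad) in enumerate(net):
        # Spatial size of this layer's output (floor division, as in most frameworks).
        size = (size - fsize + 2 * pad) // stride + 1
        # Cumulative stride includes the current layer's own stride.
        total_stride *= stride
        # Receptive field: walk from this layer back down to the first layer.
        rf = 1
        for _, f, s, _ in reversed(net[: i + 1]):
            rf = (rf - 1) * s + f
        rows.append((name, size, total_stride, rf))
    return rows


for name, size, total_stride, rf in describe(ALEXNET):
    print(f"Layer Name = {name}, Output size = {size}, "
          f"Strides = {total_stride}, RF size = {rf}")
```

With these assumed layers, the pool1 and pool2 rows give receptive fields of 19 and 67, and pool5 gives a stride of 32 and a receptive field of 195, the figures quoted from the R-CNN paper.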