mirror of
https://github.com/apachecn/ailearning.git
synced 2026-02-10 05:45:40 +08:00
93 lines
2.5 KiB
Markdown
# Theano tensor module: the conv submodule

`conv` is the submodule of `tensor` that handles convolutional neural networks.

## Convolution

Only 2D convolution is covered here:

`T.nnet.conv2d(input, filters, input_shape=None, filter_shape=None, border_mode='valid', subsample=(1, 1), filter_flip=True, image_shape=None, **kwargs)`

`conv2d` takes two inputs:

* a `4D` tensor `input` with shape:

    `[b, ic, i0, i1]`

* a `4D` tensor `filters` with shape:

    `[oc, ic, f0, f1]`

Here `b` is the batch size, `ic` and `oc` are the numbers of input and output channels, `(i0, i1)` is the image height and width, and `(f0, f1)` is the filter height and width.

`border_mode` controls the output size (with the default `subsample=(1, 1)`):

* `'valid'`: the output shape is

    `[b, oc, i0 - f0 + 1, i1 - f1 + 1]`

* `'full'`: the output shape is

    `[b, oc, i0 + f0 - 1, i1 + f1 - 1]`

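The two shape rules above can be checked with a few lines of plain Python. This is only a sketch: `conv2d_output_shape` is a hypothetical helper written for illustration, not part of Theano, and it assumes the default stride `subsample=(1, 1)`.

```python
def conv2d_output_shape(input_shape, filter_shape, border_mode="valid"):
    """Output shape of a stride-1 2D convolution (illustrative helper, not a Theano API)."""
    b, ic, i0, i1 = input_shape
    oc, filter_ic, f0, f1 = filter_shape
    if ic != filter_ic:
        raise ValueError("input and filters must agree on the channel count ic")
    if border_mode == "valid":
        # The filter must fit entirely inside the image.
        return (b, oc, i0 - f0 + 1, i1 - f1 + 1)
    if border_mode == "full":
        # Every position with at least one overlapping pixel is kept.
        return (b, oc, i0 + f0 - 1, i1 + f1 - 1)
    raise ValueError("this sketch only handles 'valid' and 'full'")


print(conv2d_output_shape((128, 1, 28, 28), (32, 1, 3, 3), "full"))     # (128, 32, 30, 30)
print(conv2d_output_shape((128, 32, 15, 15), (64, 32, 3, 3), "valid"))  # (128, 64, 13, 13)
```

Both calls use shapes that appear in the MNIST model later in this page.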
## Pooling

The pooling operation:

`T.signal.downsample.max_pool_2d(input, ds, ignore_border=None, st=None, padding=(0, 0), mode='max')`

Pooling is applied over the last two dimensions of `input`.

`ds` is the size of the pooling region, given as a tuple of length 2.

When `ignore_border` is set to `True`, a `(5, 5)` input pooled with `(2, 2)` becomes `(2, 2)` (5 % 2 == 1, so the one leftover row and column are dropped); otherwise the result is `(3, 3)`.

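To make the effect of `ignore_border` concrete, here is a minimal pure-Python sketch of non-overlapping max pooling on a nested list. `max_pool_2d_list` is a hypothetical stand-in written for this page, not Theano's implementation; it assumes the stride equals `ds` and no padding.

```python
def max_pool_2d_list(x, ds, ignore_border=False):
    """Max-pool a 2D nested list with non-overlapping windows of size ds (illustrative only)."""
    rows, cols = len(x), len(x[0])
    dr, dc = ds
    if ignore_border:
        # Incomplete windows at the bottom/right edge are dropped.
        out_rows, out_cols = rows // dr, cols // dc
    else:
        # Incomplete windows are kept: ceiling division.
        out_rows, out_cols = -(-rows // dr), -(-cols // dc)
    out = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            window = [x[i][j]
                      for i in range(r * dr, min((r + 1) * dr, rows))
                      for j in range(c * dc, min((c + 1) * dc, cols))]
            row.append(max(window))
        out.append(row)
    return out


x = [[r * 5 + c for c in range(5)] for r in range(5)]  # 5 x 5 grid holding 0..24
print(max_pool_2d_list(x, (2, 2), ignore_border=True))   # [[6, 8], [16, 18]]
print(max_pool_2d_list(x, (2, 2), ignore_border=False))  # [[6, 8, 9], [16, 18, 19], [21, 23, 24]]
```

With `ignore_border=True` the fifth row and column never contribute to the output, which is why the result is `(2, 2)` rather than `(3, 3)`.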
## MNIST convolutional network: shapes explained

```py
import theano.tensor as T
from theano.tensor.nnet import conv2d
from theano.tensor.signal.downsample import max_pool_2d

# rectify, dropout and softmax are helper functions assumed to be defined elsewhere.
# w_o (625 * 10) is not a parameter of model; it must exist in the enclosing scope.

def model(X, w, w2, w3, w4, p_drop_conv, p_drop_hidden):

    # X: 128 * 1 * 28 * 28
    # w: 32 * 1 * 3 * 3
    # full mode
    # l1a: 128 * 32 * (28 + 3 - 1) * (28 + 3 - 1)
    l1a = rectify(conv2d(X, w, border_mode='full'))
    # l1a: 128 * 32 * 30 * 30
    # ignore_border False
    # l1: 128 * 32 * (30 / 2) * (30 / 2)
    l1 = max_pool_2d(l1a, (2, 2), ignore_border=False)
    l1 = dropout(l1, p_drop_conv)

    # l1: 128 * 32 * 15 * 15
    # w2: 64 * 32 * 3 * 3
    # valid mode
    # l2a: 128 * 64 * (15 - 3 + 1) * (15 - 3 + 1)
    l2a = rectify(conv2d(l1, w2))
    # l2a: 128 * 64 * 13 * 13
    # l2: 128 * 64 * (13 // 2 + 1) * (13 // 2 + 1)
    l2 = max_pool_2d(l2a, (2, 2), ignore_border=False)
    l2 = dropout(l2, p_drop_conv)

    # l2: 128 * 64 * 7 * 7
    # w3: 128 * 64 * 3 * 3
    # l3a: 128 * 128 * (7 - 3 + 1) * (7 - 3 + 1)
    l3a = rectify(conv2d(l2, w3))
    # l3a: 128 * 128 * 5 * 5
    # l3b: 128 * 128 * (5 // 2 + 1) * (5 // 2 + 1)
    l3b = max_pool_2d(l3a, (2, 2), ignore_border=False)
    # l3b: 128 * 128 * 3 * 3
    # l3: 128 * (128 * 3 * 3)
    l3 = T.flatten(l3b, outdim=2)
    l3 = dropout(l3, p_drop_conv)

    # l3: 128 * (128 * 3 * 3)
    # w4: (128 * 3 * 3) * 625
    # l4: 128 * 625
    l4 = rectify(T.dot(l3, w4))
    l4 = dropout(l4, p_drop_hidden)

    # l4: 128 * 625
    # w_o: 625 * 10
    # pyx: 128 * 10
    pyx = softmax(T.dot(l4, w_o))
    return l1, l2, l3, l4, pyx
```
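The shape comments in the model can be reproduced without Theano. The sketch below traces only shapes, using two hypothetical helpers (`conv_shape`, `pool_shape`) that encode the rules from the Convolution and Pooling sections; it assumes stride-1 convolutions and `(2, 2)` pooling with a stride equal to the pool size.

```python
def conv_shape(shape, filter_shape, border_mode="valid"):
    """Shape after a stride-1 conv2d (illustrative helper, not a Theano API)."""
    b, _, i0, i1 = shape
    oc, _, f0, f1 = filter_shape
    if border_mode == "full":
        return (b, oc, i0 + f0 - 1, i1 + f1 - 1)
    return (b, oc, i0 - f0 + 1, i1 - f1 + 1)


def pool_shape(shape, ds=(2, 2), ignore_border=False):
    """Shape after non-overlapping max pooling over the last two dimensions."""
    b, c, i0, i1 = shape
    if ignore_border:
        return (b, c, i0 // ds[0], i1 // ds[1])
    return (b, c, -(-i0 // ds[0]), -(-i1 // ds[1]))  # ceiling division


X = (128, 1, 28, 28)
l1a = conv_shape(X, (32, 1, 3, 3), border_mode="full")  # (128, 32, 30, 30)
l1 = pool_shape(l1a)                                    # (128, 32, 15, 15)
l2a = conv_shape(l1, (64, 32, 3, 3))                    # (128, 64, 13, 13)
l2 = pool_shape(l2a)                                    # (128, 64, 7, 7)
l3a = conv_shape(l2, (128, 64, 3, 3))                   # (128, 128, 5, 5)
l3b = pool_shape(l3a)                                   # (128, 128, 3, 3)
l3 = (l3b[0], l3b[1] * l3b[2] * l3b[3])                 # (128, 1152)

for name, s in [("l1", l1), ("l2", l2), ("l3b", l3b), ("l3", l3)]:
    print(name, s)
```

The flattened width `128 * 3 * 3 = 1152` is the number of rows `w4` needs, which matches the `w4: (128 * 3 * 3) * 625` comment in the model.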