mirror of
https://github.com/apachecn/ailearning.git
synced 2026-02-10 22:05:11 +08:00
283 lines
6.0 KiB
Markdown
283 lines
6.0 KiB
Markdown
# Theano 在 Windows 上的配置
|
||
|
||
<font color="red">注意:不建议在 `windows` 进行 `theano` 的配置。</font>
|
||
|
||
<font color="red">务必确认你的显卡支持 `CUDA`。</font>
|
||
|
||
我个人的电脑搭载的是 `Windows 10 x64` 系统,显卡是 `Nvidia GeForce GTX 850M`。
|
||
|
||
## 安装 theano
|
||
|
||
首先是用 `anaconda` 安装 `theano`:
|
||
|
||
```py
|
||
conda install mingw libpython
|
||
pip install theano
|
||
```
|
||
|
||
## 安装 VS 和 CUDA
|
||
|
||
按顺序安装这两个软件:
|
||
|
||
* 安装 Visual Studio 2010/2012/2013
|
||
* 安装 对应的 x64 或 x86 CUDA
|
||
|
||
Cuda 的版本与电脑的显卡兼容。
|
||
|
||
我安装的是 Visual Studio 2012 和 CUDA v7.0v。
|
||
|
||
## 配置环境变量
|
||
|
||
`CUDA` 会自动帮你添加一个 `CUDA_PATH` 环境变量(环境变量在 控制面板->系统与安全->系统->高级系统设置 中),表示你的 `CUDA` 安装位置,我的电脑上为:
|
||
|
||
* `CUDA_PATH`
|
||
* `C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0`
|
||
|
||
我们配置两个相关变量:
|
||
|
||
* `CUDA_BIN_PATH`
|
||
* `%CUDA_PATH%\bin`
|
||
* `CUDA_LIB_PATH`
|
||
* `%CUDA_PATH%\lib\Win32`
|
||
|
||
接下来在 `Path` 环境变量的后面加上:
|
||
|
||
* `Minicoda` 中关于 `mingw` 的项:
|
||
|
||
* `C:\Miniconda\MinGW\bin;`
|
||
* `C:\Miniconda\MinGW\x86_64-w64-mingw32\lib;`
|
||
* `VS` 中的 `cl` 编译命令:
|
||
|
||
* `C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\bin;`
|
||
* `C:\Program Files (x86)\Microsoft Visual Studio 11.0\Common7\IDE;`
|
||
|
||
生成测试文件:
|
||
|
||
In [1]:
|
||
|
||
```py
|
||
%%file test_theano.py
|
||
from theano import config
|
||
print 'using device:', config.device
|
||
|
||
```
|
||
|
||
```py
|
||
Writing test_theano.py
|
||
|
||
```
|
||
|
||
我们可以通过临时设置环境变量 `THEANO_FLAGS` 来改变 `theano` 的运行模式,在 linux 下,临时环境变量直接用:
|
||
|
||
```py
|
||
THEANO_FLAGS=xxx
|
||
```
|
||
|
||
就可以完成,设置完成之后,该环境变量只在当前的命令窗口有效,你可以这样运行你的代码:
|
||
|
||
```py
|
||
THEANO_FLAGS=xxx python <your script>.py
|
||
```
|
||
|
||
在 `Windows` 下,需要使用 `set` 命令来临时设置环境变量,所以运行方式为:
|
||
|
||
```py
|
||
set THEANO_FLAGS=xxx && python <your script>.py
|
||
```
|
||
|
||
In [2]:
|
||
|
||
```py
|
||
import sys
|
||
|
||
if sys.platform == 'win32':
|
||
!set THEANO_FLAGS=mode=FAST_RUN,device=cpu,floatX=float32 && python test_theano.py
|
||
else:
|
||
!THEANO_FLAGS=mode=FAST_RUN,device=cpu,floatX=float32 python test_theano.py
|
||
|
||
```
|
||
|
||
```py
|
||
using device: cpu
|
||
|
||
```
|
||
|
||
In [3]:
|
||
|
||
```py
|
||
if sys.platform == 'win32':
|
||
!set THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 && python test_theano.py
|
||
else:
|
||
!THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python test_theano.py
|
||
|
||
```
|
||
|
||
```py
|
||
Using gpu device 0: Tesla C2075 (CNMeM is disabled)
|
||
using device: gpu
|
||
|
||
```
|
||
|
||
测试 `CPU` 和 `GPU` 的差异:
|
||
|
||
In [4]:
|
||
|
||
```py
|
||
%%file test_theano.py
|
||
|
||
from theano import function, config, shared, sandbox
|
||
import theano.tensor as T
|
||
import numpy
|
||
import time
|
||
|
||
vlen = 10 * 30 * 768 # 10 x #cores x # threads per core
|
||
iters = 1000
|
||
|
||
rng = numpy.random.RandomState(22)
|
||
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
|
||
f = function([], T.exp(x))
|
||
|
||
t0 = time.time()
|
||
for i in xrange(iters):
|
||
r = f()
|
||
t1 = time.time()
|
||
print("Looping %d times took %f seconds" % (iters, t1 - t0))
|
||
print("Result is %s" % (r,))
|
||
if numpy.any([isinstance(x.op, T.Elemwise) for x in f.maker.fgraph.toposort()]):
|
||
print('Used the cpu')
|
||
else:
|
||
print('Used the gpu')
|
||
|
||
```
|
||
|
||
```py
|
||
Overwriting test_theano.py
|
||
|
||
```
|
||
|
||
In [5]:
|
||
|
||
```py
|
||
if sys.platform == 'win32':
|
||
!set THEANO_FLAGS=mode=FAST_RUN,device=cpu,floatX=float32 && python test_theano.py
|
||
else:
|
||
!THEANO_FLAGS=mode=FAST_RUN,device=cpu,floatX=float32 python test_theano.py
|
||
|
||
```
|
||
|
||
```py
|
||
Looping 1000 times took 3.498123 seconds
|
||
Result is [ 1.23178029 1.61879337 1.52278066 ..., 2.20771813 2.29967761
|
||
1.62323284]
|
||
Used the cpu
|
||
|
||
```
|
||
|
||
In [6]:
|
||
|
||
```py
|
||
if sys.platform == 'win32':
|
||
!set THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 && python test_theano.py
|
||
else:
|
||
!THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python test_theano.py
|
||
|
||
```
|
||
|
||
```py
|
||
Using gpu device 0: Tesla C2075 (CNMeM is disabled)
|
||
Looping 1000 times took 0.847006 seconds
|
||
Result is [ 1.23178029 1.61879349 1.52278066 ..., 2.20771813 2.29967761
|
||
1.62323296]
|
||
Used the gpu
|
||
|
||
```
|
||
|
||
可以看到 `GPU` 明显要比 `CPU` 快。
|
||
|
||
使用 `GPU` 模式的 `T.exp(x)` 可以获得更快的加速效果:
|
||
|
||
In [7]:
|
||
|
||
```py
|
||
%%file test_theano.py
|
||
|
||
from theano import function, config, shared, sandbox
|
||
import theano.sandbox.cuda.basic_ops
|
||
import theano.tensor as T
|
||
import numpy
|
||
import time
|
||
|
||
vlen = 10 * 30 * 768 # 10 x #cores x # threads per core
|
||
iters = 1000
|
||
|
||
rng = numpy.random.RandomState(22)
|
||
x = shared(numpy.asarray(rng.rand(vlen), 'float32'))
|
||
f = function([], sandbox.cuda.basic_ops.gpu_from_host(T.exp(x)))
|
||
|
||
t0 = time.time()
|
||
for i in xrange(iters):
|
||
r = f()
|
||
t1 = time.time()
|
||
print("Looping %d times took %f seconds" % (iters, t1 - t0))
|
||
print("Result is %s" % (r,))
|
||
print("Numpy result is %s" % (numpy.asarray(r),))
|
||
if numpy.any([isinstance(x.op, T.Elemwise) for x in f.maker.fgraph.toposort()]):
|
||
print('Used the cpu')
|
||
else:
|
||
print('Used the gpu')
|
||
|
||
```
|
||
|
||
```py
|
||
Overwriting test_theano.py
|
||
|
||
```
|
||
|
||
In [8]:
|
||
|
||
```py
|
||
if sys.platform == 'win32':
|
||
!set THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 && python test_theano.py
|
||
else:
|
||
!THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python test_theano.py
|
||
|
||
```
|
||
|
||
```py
|
||
Using gpu device 0: Tesla C2075 (CNMeM is disabled)
|
||
Looping 1000 times took 0.318359 seconds
|
||
Result is <CudaNdarray object at 0x7f7bb701fb70>
|
||
Numpy result is [ 1.23178029 1.61879349 1.52278066 ..., 2.20771813 2.29967761
|
||
1.62323296]
|
||
Used the gpu
|
||
|
||
```
|
||
|
||
In [9]:
|
||
|
||
```py
|
||
!rm test_theano.py
|
||
|
||
```
|
||
|
||
## 配置 .theanorc.txt
|
||
|
||
我们可以在个人文件夹下配置 .theanorc.txt 文件来省去每次都使用环境变量设置的麻烦:
|
||
|
||
例如我现在的 .theanorc.txt 配置为:
|
||
|
||
```py
|
||
[global]
|
||
device = gpu
|
||
floatX = float32
|
||
|
||
[nvcc]
|
||
fastmath = True
|
||
flags = -LC:\Miniconda\libs
|
||
compiler_bindir=C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\bin
|
||
|
||
[gcc]
|
||
cxxflags = -LC:\Miniconda\MinGW
|
||
```
|
||
|
||
具体这些配置有什么作用之后可以查看官网上的教程。 |