mirror of https://github.com/Estom/notes.git
synced 2026-04-05 11:57:37 +08:00

Commit: PyTorch basics

@@ -1 +0,0 @@

# Learning PyTorch

88 pytorch/官方教程/01概述.md (new file)
@@ -0,0 +1,88 @@
# Learning PyTorch

## Workflow
1. Acquire the dataset
2. Preprocess the data
3. Train the model (the sketch after this list ties the sub-steps together):
    1. Network: `torch.nn.Module.__init__` defines a neural network with some learnable parameters (weights)
    2. Forward pass: `torch.nn.Module.forward` runs the input through the network
    3. Compute the loss: a `torch.nn` loss function measures how far the output is from the correct answer
    4. Backward pass: `torch.Tensor.backward` propagates gradients back into the network parameters
    5. Update the weights: `torch.optim` typically applies a simple update rule such as `weight = weight - learning_rate * gradient`
4. Validate the model
5. Use the model
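A minimal, runnable sketch of these steps, using random stand-in data and an illustrative `Net` whose layer sizes are assumptions for the example, not taken from the notes:

```py
import torch
import torch.nn as nn
import torch.optim as optim

# 1-2. dataset + preprocessing stand-in: random inputs and integer labels
x = torch.randn(64, 10)
target = torch.randint(0, 3, (64,))

# 3.1 define a network with learnable parameters in __init__
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 3)

    # 3.2 forward pass
    def forward(self, x):
        return self.fc(x)

net = Net()
criterion = nn.CrossEntropyLoss()                 # 3.3 loss function
optimizer = optim.SGD(net.parameters(), lr=0.01)  # 3.5 weight updates

for epoch in range(5):
    optimizer.zero_grad()
    loss = criterion(net(x), target)  # forward pass + loss
    loss.backward()                   # 3.4 backpropagate the gradients
    optimizer.step()                  # weight = weight - lr * gradient (for SGD)
```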
## Terminology

* Feature: e.g. a `3*32*32` image (channels × height × width).

* Layers: convolutional layers, pooling layers, fully connected layers.

* Operator: the kernel a layer applies; the convolution kernels of a convolutional layer, or the pooling operator of a pooling layer (e.g. `3*3`).

* Activation function: each convolution kernel in a convolutional layer produces a new feature map, and the activation function then applies a second, element-wise (`1*1`) transformation to that result. It does not change the feature dimensions; it only adjusts the feature values.

* Weights: in a fully connected deep network, the weights are what the weighted sum producing `z` uses. In a convolutional network, the kernel values play this role.
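A quick check of the claim that an activation only adjusts values and never changes the feature dimensions (a minimal sketch):

```py
import torch
import torch.nn.functional as F

feat = torch.randn(6, 28, 28)    # a feature map produced by one conv layer
out = F.relu(feat)               # element-wise, like a 1*1 operation
print(feat.shape == out.shape)   # True: the dimensions are unchanged
```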
## Standard neural networks

* In practice, `channel` counts the parallel paths through one layer of the computation graph. Viewed front to back, the graph has many such paths; `channel` is simply how many a given layer has. A layer's input and output channel counts are written `in_channels` and `out_channels`.

* The channel count belongs to a layer, not to an individual neuron. All neurons in the same layer follow the same computation (they apply the same operator).

* In a plain neural network, one channel is one connection and carries a single value. That value is multiplied by a weight and summed into the next layer's pre-activation; the activation function is then applied to produce the next layer's output.

* In a convolutional network a channel means the same thing: one connection. But what travels along it is a high-dimensional feature, not a single number. With `Conv2d`, for example, each channel may carry a `32*32` feature map. A layer with `in_channels=3`, `out_channels=6`, and kernel size 5 takes 3 channels of `32*32` features as input and produces 6 channels of `28*28` features as output; that is, the layer applies 6 kernels, each spanning all 3 input channels.

* When building a standard network you control the forward pass yourself, so you can construct a computation graph of any shape; each layer may have a different number of neurons.
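This channel bookkeeping can be checked directly; a small sketch using the `in_channels=3`, `out_channels=6`, kernel-size-5 layer from the example above:

```py
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=5)
x = torch.randn(1, 3, 32, 32)   # one sample: 3 channels of 32*32 features
y = conv(x)
print(y.shape)                  # torch.Size([1, 6, 28, 28]): 6 channels of 28*28
```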
## Convolutional neural networks

* Compared with the bare convolution operation seen earlier, a single CNN layer adds an activation function and a bias term. Compared with a standard neural network layer,

$$Z^{[l]} = W^{[l]}A^{[l-1]}+b$$

$$A^{[l]} = g^{[l]}(Z^{[l]})$$

* the filter values play the role of the weights $W^{[l]}$, the convolution operation plays the role of the product of $W^{[l]}$ with $A^{[l-1]}$, and the chosen activation function becomes ReLU.

* A `3x3x3` filter has `3*3*3 = 27` weights, hence 28 parameters including the bias $b$. No matter how large the input image is, extracting features with this one filter always takes exactly these 28 parameters: **once the filter bank is chosen, the number of parameters is independent of the input image size**. Convolutional networks therefore have far fewer parameters than standard networks, which is one of the advantages of CNNs.
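The 28-parameter count is easy to verify in code (a sketch: a single 3×3×3 filter, i.e. `out_channels=1`):

```py
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=1, kernel_size=3)
n_params = sum(p.numel() for p in conv.parameters())
print(n_params)  # 28 = 3*3*3 weights + 1 bias, regardless of image size
```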
## Data dimensions: plain neural networks

## Data dimensions: convolutional neural networks

### Understanding channels

> The quantities below describe the size of the data and together make up the dimensions of a tensor. Every operation in `torch.nn` is computed layer by layer: a layer has a single kind of neuron, multiple neurons mean multiple channels, and a layer's entire input or output is represented by one tensor.

### Tensor dimensions

For `torch.nn`, a tensor should have at least four dimensions: `batch_size`, `channels`, `height`, `width`. Each such tensor represents the complete input or output of one layer.

* batch_size: the first dimension of the tensor, the number of input samples. When reasoning about the propagation you can pretend this dimension is absent, but in the actual computation the first dimension is always the batch size.

* channels: the number of operators, i.e. neurons, in the layer. `in_channels` is the number of input neurons, `out_channels` the number of output neurons.

* height

* width (plus depth, for volumetric data): together with height, the size of one image or one data item.
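For example, the four dimensions show up directly when a batch is pushed through a layer; note that the batch dimension passes through unchanged (a sketch):

```py
import torch
import torch.nn as nn

batch = torch.randn(8, 3, 32, 32)   # (batch_size, channels, height, width)
layer = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=5)
print(layer(batch).shape)           # torch.Size([8, 6, 28, 28]): batch_size kept
```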
### Number of parameters

The parameters of each layer are counted as follows. Every convolutional layer has two parameter tensors, a weight and a bias; concretely, `conv2.weight` and `conv2.bias`.
```py
>>> import torch.nn as nn
>>> # A 3-tuple kernel size needs Conv3d; nn.Conv2d rejects kernel_size=(4, 3, 2)
>>> conv2 = nn.Conv3d(in_channels=3, out_channels=32, kernel_size=(4, 3, 2))
>>> for name, param in conv2.named_parameters():
...     print(name)
weight
bias
>>> print(conv2.weight.shape)  # [out_channels, in_channels, *kernel_size]
torch.Size([32, 3, 4, 3, 2])
```
## Components

> When building a plain neural network you can work at several levels of control.

* A neural network is built from data + operators + procedures, which together implement model training.

* Data: the samples, i.e. the activation values.

* Operators:

    * custom operators, for which you declare and initialize the parameters yourself;

    * built-in operators, for which the framework registers the parameters automatically (see the sketch below).

* Procedures: the forward-pass function, the backward-pass function, the loss computation, and the gradient-descent update.
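A minimal sketch contrasting the two kinds of operators; `TinyNet` and its layer sizes are hypothetical, chosen only for illustration:

```py
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        # custom operator: declare and initialize the parameter yourself
        self.scale = nn.Parameter(torch.ones(1))
        # built-in operator: nn.Linear registers weight and bias automatically
        self.lin = nn.Linear(4, 2)

    def forward(self, x):
        return self.lin(x) * self.scale

model = TinyNet()
for name, p in model.named_parameters():
    print(name, tuple(p.shape))  # scale, lin.weight, lin.bias
```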
@@ -1,39 +0,0 @@

# PyTorch Deep Learning: A 60 Minute Blitz

> Source: <https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html>

**Author**: [Soumith Chintala](http://soumith.ch)

<https://www.youtube.com/embed/u7x8RXwLKcA>

## What is PyTorch?

PyTorch is a Python-based scientific computing framework built for two purposes:

* To be a drop-in replacement for NumPy that uses the power of GPUs to accelerate neural networks.
* To make implementing neural networks easier through automatic differentiation.

## Goals of this tutorial:

* Understand PyTorch's tensor type in depth and learn to build neural networks with PyTorch.
* Train a small neural network yourself to classify images.

Note

Make sure the [`torch`](https://github.com/pytorch/pytorch) and [`torchvision`](https://github.com/pytorch/vision) packages are installed.

![](img/0cce2d2b172b2fbfeb24dcef95dfaf1e.png)

[Tensors](blitz/tensor_tutorial.html#sphx-glr-beginner-blitz-tensor-tutorial-py)

![](img/a36a4f0f2bd7bf92f4e6e03df53001f9.png)

[A gentle introduction to `torch.autograd`](blitz/autograd_tutorial.html#sphx-glr-beginner-blitz-autograd-tutorial-py)

![](img/be0e8b2e34e9c7039a2adea48bd58529.png)

[An introduction to neural networks](blitz/neural_networks_tutorial.html#sphx-glr-beginner-blitz-neural-networks-tutorial-py)

![](img/12fd1b0bad8485538e4bcba6ed29f1ad.png)

[Train your own image classifier](blitz/cifar10_tutorial.html#sphx-glr-beginner-blitz-cifar10-tutorial-py)
28 pytorch/官方教程/02Pytorch.md (new file)
@@ -0,0 +1,28 @@
# PyTorch Deep Learning: A 60 Minute Blitz

> Source: <https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html>

**Author**: [Soumith Chintala](http://soumith.ch)

<https://www.youtube.com/embed/u7x8RXwLKcA>

## What is PyTorch?

PyTorch is a Python-based scientific computing framework built for two purposes:

* To be a drop-in replacement for NumPy that uses the power of GPUs to accelerate neural networks.
* To make implementing neural networks easier through automatic differentiation.

## Goals of this tutorial:

* Understand PyTorch's tensor type in depth and learn to build neural networks with PyTorch.
* Train a small neural network yourself to classify images.

## Contents

* [02 PyTorch](02Pytorch.md)
* [03 Tensors](03Tensor.md)
* [04 Autograd](04Autograd.md)
* [05 Neural networks](05NN.md)
* [06 Image classifier](06Classification.md)
@@ -17,8 +17,8 @@
The typical training procedure for a neural network is as follows:

* Define a neural network with some learnable parameters (weights)
* Iterate over the input dataset, preprocessing the data
* Process the input through the network: the forward pass
* Compute the loss (how far the output is from being correct)
* Propagate the gradients back into the network's parameters
* Update the network's weights, typically with a simple rule: `weight = weight - learning_rate * gradient` (worked out by hand in the sketch below)
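The update rule in the last step looks like this when written out manually (a sketch; `w` stands in for any parameter tensor, and the loss is a toy example):

```py
import torch

w = torch.randn(3, requires_grad=True)
loss = (w ** 2).sum()
loss.backward()            # w.grad now holds d(loss)/dw
with torch.no_grad():
    w -= 0.1 * w.grad      # weight = weight - learning_rate * gradient
    w.grad.zero_()         # reset so the next backward pass starts fresh
```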
@@ -17,9 +17,7 @@ At its core, PyTorch provides two main features:

You can browse the individual examples on [this page](#examples-download).

## 1 Warm-up: NumPy

Before introducing PyTorch, we will first implement the network using numpy.

@@ -68,7 +66,7 @@ print(f'Result: y = {a} + {b} x + {c} x^2 + {d} x^3')

## 2 PyTorch: Tensors

Numpy is a great framework, but it cannot use GPUs to accelerate its numerical computations. For modern deep neural networks, GPUs often provide speedups of [50x or more](https://github.com/jcjohnson/cnn-benchmarks), so unfortunately numpy is not enough for modern deep learning.

@@ -125,9 +123,9 @@ print(f'Result: y = {a.item()} + {b.item()} x + {c.item()} x^2 + {d.item()} x^3'

## 3 Autograd

## 3.1 PyTorch: Tensors and Autograd

In the examples above, we had to implement both the forward and backward passes of the network by hand. For a small two-layer network, implementing the backward pass manually is no big deal, but for large complex networks it can quickly become very hairy.

@@ -198,7 +196,7 @@ print(f'Result: y = {a.item()} + {b.item()} x + {c.item()} x^2 + {d.item()} x^3'

## 3.2 PyTorch: Defining new Autograd functions

Under the hood, each primitive autograd operator is really two functions that operate on Tensors. The **forward** function computes output Tensors from input Tensors. The **backward** function receives the gradient of the output Tensors with respect to some scalar value, and computes the gradient of the input Tensors with respect to that same scalar value.
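A minimal sketch of such a forward/backward pair, using a hypothetical `Square` op for illustration (the full `LegendrePolynomial3` example appears later in these notes):

```py
import torch

class Square(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)       # stash the input for the backward pass
        return x * x

    @staticmethod
    def backward(ctx, grad_output):
        x, = ctx.saved_tensors
        return grad_output * 2 * x     # chain rule: d(x^2)/dx = 2x

x = torch.tensor([3.0], requires_grad=True)
y = Square.apply(x)
y.backward()
print(x.grad)  # tensor([6.])
```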
@@ -293,9 +291,9 @@ print(f'Result: y = {a.item()} + {b.item()} * P3({c.item()} + {d.item()} x)')

## 4 The `nn` module

## 4.1 PyTorch: `nn`

Computational graphs and autograd are a very powerful paradigm for defining complex operators and automatically taking derivatives. For large neural networks, however, raw autograd can be a bit too low-level.

@@ -380,7 +378,7 @@ print(f'Result: y = {linear_layer.bias.item()} + {linear_layer.weight[:, 0].item

## 4.2 PyTorch: `optim`

Up to this point, we have updated the weights of our models by manually mutating the Tensors holding learnable parameters with `torch.no_grad()`. This is not a huge burden for simple optimization algorithms like stochastic gradient descent, but in practice we often train neural networks using more sophisticated optimizers such as AdaGrad, RMSProp, Adam, and so on.

@@ -443,7 +441,7 @@ print(f'Result: y = {linear_layer.bias.item()} + {linear_layer.weight[:, 0].item

## 4.3 PyTorch: Custom `nn` modules

Sometimes you will want to specify models that are more complex than a sequence of existing Modules. For these cases you can define your own Module by subclassing `nn.Module` and defining a `forward` that receives input Tensors and produces output Tensors, using other modules or other autograd operations on Tensors.

@@ -510,7 +508,7 @@ print(f'Result: {model.string()}')

## 4.4 PyTorch: Control flow + weight sharing

As an example of dynamic graphs and weight sharing, we implement a very strange model: a third-order polynomial that, on each forward pass, chooses a random number between 3 and 5 and uses that order, reusing the same weight multiple times to compute the fourth and fifth orders.

@@ -586,46 +584,4 @@ for t in range(30000):

print(f'Result: {model.string()}')

## Examples

You can browse the above examples here.

### Tensors

![](img/ca5fdf20f0bbca32c5d62abef2d172b5.png)

[Warm-up: NumPy](examples_tensor/polynomial_numpy.html#sphx-glr-beginner-examples-tensor-polynomial-numpy-py)

![](img/4b0281be5c01b37985e5dbce3b41cbd7.png)

[PyTorch: Tensors](examples_tensor/polynomial_tensor.html#sphx-glr-beginner-examples-tensor-polynomial-tensor-py)

### Autograd

![](img/2e857a80089a7ca6a27a6f1c3dde6d29.png)

[PyTorch: Tensors and Autograd](examples_autograd/polynomial_autograd.html#sphx-glr-beginner-examples-autograd-polynomial-autograd-py)

![](img/a90d2dd09e8a13e55ef977837da1cc54.png)

[PyTorch: Defining new Autograd functions](examples_autograd/polynomial_custom_function.html#sphx-glr-beginner-examples-autograd-polynomial-custom-function-py)

### The `nn` module

![](img/b2239b10e0e0674ffd2e3e0e3746d0e7.png)

[PyTorch: `nn`](examples_nn/polynomial_nn.html#sphx-glr-beginner-examples-nn-polynomial-nn-py)

![](img/12f9b1a3e6160daa6ef0b4602b4dd8f1.png)

[PyTorch: `optim`](examples_nn/polynomial_optim.html#sphx-glr-beginner-examples-nn-polynomial-optim-py)

![](img/9677e8bbb6fb5a2bbce44078ecf9bcb1.png)

[PyTorch: Custom `nn` modules](examples_nn/polynomial_module.html#sphx-glr-beginner-examples-nn-polynomial-module-py)

![](img/89bf39dc2be8dce26f62b20bd7d05d97.png)

[PyTorch: Control flow + weight sharing](examples_nn/dynamic_net.html#sphx-glr-beginner-examples-nn-dynamic-net-py)
File diff suppressed because it is too large
@@ -1,64 +1,348 @@
# Visualizing models, data, and training with TensorBoard

> Source: <https://pytorch.org/tutorials/intermediate/tensorboard_tutorial.html>

In the [60 Minute Blitz](https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html), we showed you how to load data, feed it through a model defined as a subclass of `nn.Module`, train the model on training data, and test it on test data. To see what was happening, we printed some statistics while the model was training to get a sense of whether training was progressing. But we can do much better than that: PyTorch integrates with TensorBoard, a tool for visualizing the results of neural network training runs. This tutorial illustrates some of its functionality using the [Fashion-MNIST dataset](https://github.com/zalandoresearch/fashion-mnist), which can be read into PyTorch with `torchvision.datasets`.

In this tutorial, we will learn how to:

> 1. Read in data and transform it appropriately (almost identical to the previous tutorial).
> 2. Set up TensorBoard.
> 3. Write to TensorBoard.
> 4. Inspect the model architecture with TensorBoard.
> 5. Use TensorBoard to build an interactive version of the visualizations we created in the previous tutorial, with less code.

Specifically, on point 5, we will see:

> * A couple of ways to inspect our training data
> * How to track our model's performance as it trains
> * How to assess our model's performance once it is trained.

We will start with boilerplate code similar to the [CIFAR-10 tutorial](https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html):
```py
# imports
import matplotlib.pyplot as plt
import numpy as np

import torch
import torchvision
import torchvision.transforms as transforms

import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

# transforms
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5,), (0.5,))])

# datasets
trainset = torchvision.datasets.FashionMNIST('./data',
    download=True,
    train=True,
    transform=transform)
testset = torchvision.datasets.FashionMNIST('./data',
    download=True,
    train=False,
    transform=transform)

# dataloaders
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)

# constant for classes
classes = ('T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
           'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle Boot')

# helper function to show an image
# (used in the `plot_classes_preds` function below)
def matplotlib_imshow(img, one_channel=False):
    if one_channel:
        img = img.mean(dim=0)
    img = img / 2 + 0.5     # unnormalize
    npimg = img.numpy()
    if one_channel:
        plt.imshow(npimg, cmap="Greys")
    else:
        plt.imshow(np.transpose(npimg, (1, 2, 0)))
```
We will define a similar model architecture in this tutorial, with only minor modifications to account for the fact that the images are now one channel instead of three and `28x28` instead of `32x32`:

```py
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 4 * 4, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 4 * 4)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
```
We will define the same `optimizer` and `criterion` as before:

```py
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
```

## 1\. TensorBoard setup

Now we set up TensorBoard, importing `tensorboard` from `torch.utils` and defining a `SummaryWriter`, the key object for writing information to TensorBoard.

```py
from torch.utils.tensorboard import SummaryWriter

# default `log_dir` is "runs" - we'll be more specific here
writer = SummaryWriter('runs/fashion_mnist_experiment_1')
```

Note that this line alone creates a `runs/fashion_mnist_experiment_1` folder.
## 2\. Writing to TensorBoard

Now let's write an image to TensorBoard, specifically a grid, using [`make_grid`](https://pytorch.org/docs/stable/torchvision/utils.html#torchvision.utils.make_grid).

```py
# get some random training images
dataiter = iter(trainloader)
images, labels = next(dataiter)

# create grid of images
img_grid = torchvision.utils.make_grid(images)

# show images
matplotlib_imshow(img_grid, one_channel=True)

# write to tensorboard
writer.add_image('four_fashion_mnist_images', img_grid)
```

Now running

```
tensorboard --logdir=runs
```

from the command line and then navigating to `http://localhost:6006` should show the following.

![](img/1aa668650bd0eee66b6d9d01e4a36f90.png)

Now you know how to use TensorBoard! This example, however, could have been done in a Jupyter notebook; where TensorBoard really excels is in creating interactive visualizations. We'll cover one of those next, and several more by the end of the tutorial.
## 3\. Inspecting the model with TensorBoard

One of TensorBoard's strengths is its ability to visualize complex model structures. Let's visualize the model we built.

```py
writer.add_graph(net, images)
writer.close()
```

Now upon refreshing TensorBoard you should see a `Graphs` tab that looks like this:

![](img/c5c44c0e0166d65cb816ee8b62d56d6c.png)

Go ahead and double-click on `Net` to expand it, revealing a detailed view of the individual operations that make up the model.

TensorBoard also has a very handy feature for visualizing high-dimensional data, such as image data, in a lower-dimensional space; we'll cover this next.
## 4\. Adding a "Projector" to TensorBoard

We can visualize a lower-dimensional representation of higher-dimensional data via the [`add_embedding`](https://pytorch.org/docs/stable/tensorboard.html#torch.utils.tensorboard.writer.SummaryWriter.add_embedding) method.

```py
# helper function
def select_n_random(data, labels, n=100):
    '''
    Selects n random datapoints and their corresponding labels from a dataset
    '''
    assert len(data) == len(labels)

    perm = torch.randperm(len(data))
    return data[perm][:n], labels[perm][:n]

# select random images and their target indices
images, labels = select_n_random(trainset.data, trainset.targets)

# get the class labels for each image
class_labels = [classes[lab] for lab in labels]

# log embeddings
features = images.view(-1, 28 * 28)
writer.add_embedding(features,
                     metadata=class_labels,
                     label_img=images.unsqueeze(1))
writer.close()
```

Now in the "Projector" tab of TensorBoard you can see these 100 images, each of which is 784-dimensional, projected down into three-dimensional space. Furthermore, this is interactive: you can click and drag to rotate the three-dimensional projection. Finally, a couple of tips to make the visualization easier to see: select "Color: label" in the top left, and enable "night mode", which will make the images easier to see since their background is white:

![](img/ab29e37a0a1756da43ce526a44ee8c9a.png)

Now that we've thoroughly inspected our data, let's show how TensorBoard can make tracking model training and evaluation clearer, starting with training.
## 5\. Tracking model training with TensorBoard

In the previous example, we simply printed the model's running loss *every 2000 iterations*. Now we'll instead log the running loss to TensorBoard, along with a view into the predictions the model is making via the `plot_classes_preds` function.

```py
# helper functions

def images_to_probs(net, images):
    '''
    Generates predictions and corresponding probabilities from a trained
    network and a list of images
    '''
    output = net(images)
    # convert output probabilities to predicted class
    _, preds_tensor = torch.max(output, 1)
    preds = np.squeeze(preds_tensor.numpy())
    return preds, [F.softmax(el, dim=0)[i].item() for i, el in zip(preds, output)]

def plot_classes_preds(net, images, labels):
    '''
    Generates matplotlib Figure using a trained network, along with images
    and labels from a batch, that shows the network's top prediction along
    with its probability, alongside the actual label, coloring this
    information based on whether the prediction was correct or not.
    Uses the "images_to_probs" function.
    '''
    preds, probs = images_to_probs(net, images)
    # plot the images in the batch, along with predicted and true labels
    fig = plt.figure(figsize=(12, 48))
    for idx in np.arange(4):
        ax = fig.add_subplot(1, 4, idx+1, xticks=[], yticks=[])
        matplotlib_imshow(images[idx], one_channel=True)
        ax.set_title("{0}, {1:.1f}%\n(label: {2})".format(
            classes[preds[idx]],
            probs[idx] * 100.0,
            classes[labels[idx]]),
            color=("green" if preds[idx]==labels[idx].item() else "red"))
    return fig
```
Finally, let's train the model using the same model-training code as the prior tutorial, but writing results to TensorBoard every 1000 batches instead of printing to the console; this is done with the [`add_scalar`](https://pytorch.org/docs/stable/tensorboard.html#torch.utils.tensorboard.writer.SummaryWriter.add_scalar) function.

In addition, as we train, we'll generate an image showing the model's predictions vs. the actual results on the four images included in that batch.

```py
running_loss = 0.0
for epoch in range(1):  # loop over the dataset multiple times

    for i, data in enumerate(trainloader, 0):

        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        if i % 1000 == 999:    # every 1000 mini-batches...

            # ...log the running loss
            writer.add_scalar('training loss',
                              running_loss / 1000,
                              epoch * len(trainloader) + i)

            # ...log a Matplotlib Figure showing the model's predictions on a
            # random mini-batch
            writer.add_figure('predictions vs. actuals',
                              plot_classes_preds(net, inputs, labels),
                              global_step=epoch * len(trainloader) + i)
            running_loss = 0.0
print('Finished Training')
```

You can now look at the "Scalars" tab to see the running loss plotted over the 15,000 iterations of training:

![](img/afda8238ecd1f547d61be4d155844f68.png)

In addition, we can look at the predictions the model made on arbitrary batches throughout learning. See the "Images" tab and scroll down under the "predictions vs. actuals" visualization to see this; it shows us that, for example, after just 3000 training iterations the model was already able to distinguish between visually distinct classes such as shirts, sneakers, and coats, though it wasn't as confident as it became later in training:

![](img/6662a4dc24150e9a981832831ed6ceaa.png)

In the prior tutorial, we looked at per-class accuracy once the model had been trained; here we'll use TensorBoard to plot precision-recall curves ([explained here](https://www.scikit-yb.org/en/latest/api/classifier/prcurve.html)) for each class.
## 6\. Assessing trained models with TensorBoard

```py
# 1. gets the probability predictions in a test_size x num_classes Tensor
# 2. gets the preds in a test_size Tensor
# takes ~10 seconds to run
class_probs = []
class_preds = []
with torch.no_grad():
    for data in testloader:
        images, labels = data
        output = net(images)
        class_probs_batch = [F.softmax(el, dim=0) for el in output]
        _, class_preds_batch = torch.max(output, 1)

        class_probs.append(class_probs_batch)
        class_preds.append(class_preds_batch)

test_probs = torch.cat([torch.stack(batch) for batch in class_probs])
test_preds = torch.cat(class_preds)

# helper function
def add_pr_curve_tensorboard(class_index, test_probs, test_preds, global_step=0):
    '''
    Takes in a "class_index" from 0 to 9 and plots the corresponding
    precision-recall curve
    '''
    tensorboard_preds = test_preds == class_index
    tensorboard_probs = test_probs[:, class_index]

    writer.add_pr_curve(classes[class_index],
                        tensorboard_preds,
                        tensorboard_probs,
                        global_step=global_step)
    writer.close()

# plot all the pr curves
for i in range(len(classes)):
    add_pr_curve_tensorboard(i, test_probs, test_preds)
```

You will now see a "PR Curves" tab that contains the precision-recall curves for each class. Go ahead and poke around; you'll see that on some classes the model has nearly 100% "area under the curve", whereas on others this area is lower:

![](img/6e4f03e2b36ffc9bdcf8b0f2699f9a36.png)

And that's an intro to TensorBoard and PyTorch's integration with it. Of course, you could do everything TensorBoard does in your Jupyter Notebook, but with TensorBoard you get visuals that are interactive by default.
@@ -1,77 +0,0 @@

# PyTorch: Tensors and Autograd

> Source: <https://pytorch.org/tutorials/beginner/examples_autograd/polynomial_autograd.html#sphx-glr-beginner-examples-autograd-polynomial-autograd-py>

A third-order polynomial, trained to predict `y = sin(x)` from `-pi` to `pi` by minimizing squared Euclidean distance.

This implementation computes the forward pass using operations on PyTorch Tensors, and uses PyTorch Autograd to compute gradients.

A PyTorch Tensor represents a node in a computational graph. If `x` is a Tensor that has `x.requires_grad=True`, then `x.grad` is another Tensor holding the gradient of `x` with respect to some scalar value.

```py
import torch
import math

dtype = torch.float
device = torch.device("cpu")
# device = torch.device("cuda:0")  # Uncomment this to run on GPU

# Create Tensors to hold input and outputs.
# By default, requires_grad=False, which indicates that we do not need to
# compute gradients with respect to these Tensors during the backward pass.
x = torch.linspace(-math.pi, math.pi, 2000, device=device, dtype=dtype)
y = torch.sin(x)

# Create random Tensors for weights. For a third order polynomial, we need
# 4 weights: y = a + b x + c x^2 + d x^3
# Setting requires_grad=True indicates that we want to compute gradients with
# respect to these Tensors during the backward pass.
a = torch.randn((), device=device, dtype=dtype, requires_grad=True)
b = torch.randn((), device=device, dtype=dtype, requires_grad=True)
c = torch.randn((), device=device, dtype=dtype, requires_grad=True)
d = torch.randn((), device=device, dtype=dtype, requires_grad=True)

learning_rate = 1e-6
for t in range(2000):
    # Forward pass: compute predicted y using operations on Tensors.
    y_pred = a + b * x + c * x ** 2 + d * x ** 3

    # Compute and print loss using operations on Tensors.
    # Now loss is a Tensor of shape (1,)
    # loss.item() gets the scalar value held in the loss.
    loss = (y_pred - y).pow(2).sum()
    if t % 100 == 99:
        print(t, loss.item())

    # Use autograd to compute the backward pass. This call will compute the
    # gradient of loss with respect to all Tensors with requires_grad=True.
    # After this call a.grad, b.grad, c.grad and d.grad will be Tensors holding
    # the gradient of the loss with respect to a, b, c, d respectively.
    loss.backward()

    # Manually update weights using gradient descent. Wrap in torch.no_grad()
    # because weights have requires_grad=True, but we don't need to track this
    # in autograd.
    with torch.no_grad():
        a -= learning_rate * a.grad
        b -= learning_rate * b.grad
        c -= learning_rate * c.grad
        d -= learning_rate * d.grad

        # Manually zero the gradients after updating weights
        a.grad = None
        b.grad = None
        c.grad = None
        d.grad = None

print(f'Result: y = {a.item()} + {b.item()} x + {c.item()} x^2 + {d.item()} x^3')
```

**Total running time of the script**: (0 minutes 0.000 seconds)

[Download Python source code: `polynomial_autograd.py`](https://pytorch.org/tutorials/_downloads/2956e289de4f5fdd59114171805b23d2/polynomial_autograd.py)

[Download Jupyter notebook: `polynomial_autograd.ipynb`](https://pytorch.org/tutorials/_downloads/e1d4d0ca7bd75ea2fff8032fcb79076e/polynomial_autograd.ipynb)

[Gallery generated by Sphinx-Gallery](https://sphinx-gallery.readthedocs.io)
@@ -1,103 +0,0 @@

# PyTorch: Defining new Autograd functions

> Source: <https://pytorch.org/tutorials/beginner/examples_autograd/polynomial_custom_function.html#sphx-glr-beginner-examples-autograd-polynomial-custom-function-py>

A third-order polynomial, trained to predict `y = sin(x)` from `-pi` to `pi` by minimizing squared Euclidean distance. Instead of writing the polynomial as `y = a + bx + cx^2 + dx^3`, we write the polynomial as `y = a + b P3(c + dx)` where `P3(x) = 1/2 (5x^3 - 3x)` is the [Legendre polynomial](https://en.wikipedia.org/wiki/Legendre_polynomials) of degree three.

This implementation computes the forward pass using operations on PyTorch Tensors, and uses PyTorch Autograd to compute gradients.

In this implementation we implement our own custom autograd Function to perform `P3`. By the math, its derivative is `P3'(x) = 3/2 (5x^2 - 1)`:

```py
import torch
import math

class LegendrePolynomial3(torch.autograd.Function):
    """
    We can implement our own custom autograd Functions by subclassing
    torch.autograd.Function and implementing the forward and backward passes
    which operate on Tensors.
    """

    @staticmethod
    def forward(ctx, input):
        """
        In the forward pass we receive a Tensor containing the input and return
        a Tensor containing the output. ctx is a context object that can be used
        to stash information for backward computation. You can cache arbitrary
        objects for use in the backward pass using the ctx.save_for_backward method.
        """
        ctx.save_for_backward(input)
        return 0.5 * (5 * input ** 3 - 3 * input)

    @staticmethod
    def backward(ctx, grad_output):
        """
        In the backward pass we receive a Tensor containing the gradient of the loss
        with respect to the output, and we need to compute the gradient of the loss
        with respect to the input.
        """
        input, = ctx.saved_tensors
        return grad_output * 1.5 * (5 * input ** 2 - 1)

dtype = torch.float
device = torch.device("cpu")
# device = torch.device("cuda:0")  # Uncomment this to run on GPU

# Create Tensors to hold input and outputs.
# By default, requires_grad=False, which indicates that we do not need to
# compute gradients with respect to these Tensors during the backward pass.
x = torch.linspace(-math.pi, math.pi, 2000, device=device, dtype=dtype)
y = torch.sin(x)

# Create random Tensors for weights. For this example, we need
# 4 weights: y = a + b * P3(c + d * x), these weights need to be initialized
# not too far from the correct result to ensure convergence.
# Setting requires_grad=True indicates that we want to compute gradients with
# respect to these Tensors during the backward pass.
a = torch.full((), 0.0, device=device, dtype=dtype, requires_grad=True)
b = torch.full((), -1.0, device=device, dtype=dtype, requires_grad=True)
c = torch.full((), 0.0, device=device, dtype=dtype, requires_grad=True)
d = torch.full((), 0.3, device=device, dtype=dtype, requires_grad=True)

learning_rate = 5e-6
for t in range(2000):
    # To apply our Function, we use Function.apply method. We alias this as 'P3'.
    P3 = LegendrePolynomial3.apply

    # Forward pass: compute predicted y using operations; we compute
    # P3 using our custom autograd operation.
    y_pred = a + b * P3(c + d * x)

    # Compute and print loss
    loss = (y_pred - y).pow(2).sum()
    if t % 100 == 99:
        print(t, loss.item())

    # Use autograd to compute the backward pass.
    loss.backward()

    # Update weights using gradient descent
    with torch.no_grad():
        a -= learning_rate * a.grad
        b -= learning_rate * b.grad
        c -= learning_rate * c.grad
        d -= learning_rate * d.grad

        # Manually zero the gradients after updating weights
        a.grad = None
        b.grad = None
        c.grad = None
        d.grad = None

print(f'Result: y = {a.item()} + {b.item()} * P3({c.item()} + {d.item()} x)')
```

**Total running time of the script**: (0 minutes 0.000 seconds)

[Download Python source code: `polynomial_custom_function.py`](https://pytorch.org/tutorials/_downloads/b7ec15fd7bec1ca3f921104cfb6a54ed/polynomial_custom_function.py)

[Download Jupyter notebook: `polynomial_custom_function.ipynb`](https://pytorch.org/tutorials/_downloads/0a64809624bf2f3eb497d30d5303a9a0/polynomial_custom_function.ipynb)

[Gallery generated by Sphinx-Gallery](https://sphinx-gallery.readthedocs.io)
@@ -1,87 +0,0 @@

# PyTorch: `nn`

> Source: <https://pytorch.org/tutorials/beginner/examples_nn/polynomial_nn.html#sphx-glr-beginner-examples-nn-polynomial-nn-py>

A third-order polynomial, trained to predict `y = sin(x)` from `-pi` to `pi` by minimizing squared Euclidean distance.

This implementation uses the `nn` package from PyTorch to build the network. PyTorch Autograd makes it easy to define computational graphs and take gradients, but raw Autograd can be a bit too low-level for defining complex neural networks; this is where the `nn` package can help. The `nn` package defines a set of Modules, which you can think of as a neural network layer that produces output from input and may have some trainable weights.

```py
import torch
import math

# Create Tensors to hold input and outputs.
x = torch.linspace(-math.pi, math.pi, 2000)
y = torch.sin(x)

# For this example, the output y is a linear function of (x, x^2, x^3), so
# we can consider it as a linear layer neural network. Let's prepare the
# tensor (x, x^2, x^3).
p = torch.tensor([1, 2, 3])
xx = x.unsqueeze(-1).pow(p)

# In the above code, x.unsqueeze(-1) has shape (2000, 1), and p has shape
# (3,), for this case, broadcasting semantics will apply to obtain a tensor
# of shape (2000, 3)

# Use the nn package to define our model as a sequence of layers. nn.Sequential
# is a Module which contains other Modules, and applies them in sequence to
# produce its output. The Linear Module computes output from input using a
# linear function, and holds internal Tensors for its weight and bias.
# The Flatten layer flattens the output of the linear layer to a 1D tensor,
# to match the shape of `y`.
model = torch.nn.Sequential(
    torch.nn.Linear(3, 1),
    torch.nn.Flatten(0, 1)
)

# The nn package also contains definitions of popular loss functions; in this
# case we will use Mean Squared Error (MSE) as our loss function.
loss_fn = torch.nn.MSELoss(reduction='sum')

learning_rate = 1e-6
for t in range(2000):

    # Forward pass: compute predicted y by passing x to the model. Module objects
    # override the __call__ operator so you can call them like functions. When
    # doing so you pass a Tensor of input data to the Module and it produces
    # a Tensor of output data.
    y_pred = model(xx)

    # Compute and print loss. We pass Tensors containing the predicted and true
    # values of y, and the loss function returns a Tensor containing the
    # loss.
    loss = loss_fn(y_pred, y)
    if t % 100 == 99:
        print(t, loss.item())

    # Zero the gradients before running the backward pass.
    model.zero_grad()

    # Backward pass: compute gradient of the loss with respect to all the learnable
    # parameters of the model. Internally, the parameters of each Module are stored
    # in Tensors with requires_grad=True, so this call will compute gradients for
    # all learnable parameters in the model.
    loss.backward()

    # Update the weights using gradient descent. Each parameter is a Tensor, so
    # we can access its gradients like we did before.
    with torch.no_grad():
        for param in model.parameters():
            param -= learning_rate * param.grad

# You can access the first layer of `model` like accessing the first item of a list
linear_layer = model[0]

# For linear layer, its parameters are stored as `weight` and `bias`.
print(f'Result: y = {linear_layer.bias.item()} + {linear_layer.weight[:, 0].item()} x + {linear_layer.weight[:, 1].item()} x^2 + {linear_layer.weight[:, 2].item()} x^3')
```

**Total running time of the script**: (0 minutes 0.000 seconds)

[Download Python source code: `polynomial_nn.py`](https://pytorch.org/tutorials/_downloads/b4767df4367deade63dc8a0d3712c1d4/polynomial_nn.py)

[Download Jupyter notebook: `polynomial_nn.ipynb`](https://pytorch.org/tutorials/_downloads/7bc167d8b8308ae65a717d7461d838fa/polynomial_nn.ipynb)

[Gallery generated by Sphinx-Gallery](https://sphinx-gallery.readthedocs.io)
@@ -1,71 +0,0 @@

# PyTorch: `optim`

> Source: <https://pytorch.org/tutorials/beginner/examples_nn/polynomial_optim.html#sphx-glr-beginner-examples-nn-polynomial-optim-py>

A third-order polynomial, trained to predict `y = sin(x)` from `-pi` to `pi` by minimizing squared Euclidean distance.

This implementation uses the `nn` package from PyTorch to build the network.

Rather than manually updating the weights of the model as we have been doing, we use the `optim` package to define an Optimizer that will update the weights for us. The `optim` package defines many optimization algorithms that are commonly used for deep learning, including SGD + momentum, RMSProp, Adam, etc.

```py
import torch
import math

# Create Tensors to hold input and outputs.
x = torch.linspace(-math.pi, math.pi, 2000)
y = torch.sin(x)

# Prepare the input tensor (x, x^2, x^3).
p = torch.tensor([1, 2, 3])
xx = x.unsqueeze(-1).pow(p)

# Use the nn package to define our model and loss function.
model = torch.nn.Sequential(
    torch.nn.Linear(3, 1),
    torch.nn.Flatten(0, 1)
)
loss_fn = torch.nn.MSELoss(reduction='sum')

# Use the optim package to define an Optimizer that will update the weights of
# the model for us. Here we will use RMSprop; the optim package contains many other
# optimization algorithms. The first argument to the RMSprop constructor tells the
# optimizer which Tensors it should update.
learning_rate = 1e-3
optimizer = torch.optim.RMSprop(model.parameters(), lr=learning_rate)
for t in range(2000):
    # Forward pass: compute predicted y by passing x to the model.
    y_pred = model(xx)

    # Compute and print loss.
    loss = loss_fn(y_pred, y)
    if t % 100 == 99:
        print(t, loss.item())

    # Before the backward pass, use the optimizer object to zero all of the
    # gradients for the variables it will update (which are the learnable
    # weights of the model). This is because by default, gradients are
    # accumulated in buffers (i.e., not overwritten) whenever .backward()
    # is called. Checkout docs of torch.autograd.backward for more details.
    optimizer.zero_grad()

    # Backward pass: compute gradient of the loss with respect to model
    # parameters
    loss.backward()

    # Calling the step function on an Optimizer makes an update to its
    # parameters
    optimizer.step()

linear_layer = model[0]
print(f'Result: y = {linear_layer.bias.item()} + {linear_layer.weight[:, 0].item()} x + {linear_layer.weight[:, 1].item()} x^2 + {linear_layer.weight[:, 2].item()} x^3')
```

**Total running time of the script**: (0 minutes 0.000 seconds)

[Download Python source code: `polynomial_optim.py`](https://pytorch.org/tutorials/_downloads/bcfec6f02e0fe747a42dbd1579267469/polynomial_optim.py)

[Download Jupyter notebook: `polynomial_optim.ipynb`](https://pytorch.org/tutorials/_downloads/8ef669b2c61c6c5aa47c54dceac4979e/polynomial_optim.ipynb)

[Gallery generated by Sphinx-Gallery](https://sphinx-gallery.readthedocs.io)
@@ -1,75 +0,0 @@

# PyTorch: Custom `nn` modules

> Source: <https://pytorch.org/tutorials/beginner/examples_nn/polynomial_module.html#sphx-glr-beginner-examples-nn-polynomial-module-py>

A third-order polynomial, trained to predict `y = sin(x)` from `-pi` to `pi` by minimizing squared Euclidean distance.

This implementation defines the model as a custom `Module` subclass. Whenever you want a model more complex than a simple sequence of existing Modules, you will need to define your model this way.

```py
import torch
import math

class Polynomial3(torch.nn.Module):
    def __init__(self):
        """
        In the constructor we instantiate four parameters and assign them as
        member parameters.
        """
        super().__init__()
        self.a = torch.nn.Parameter(torch.randn(()))
        self.b = torch.nn.Parameter(torch.randn(()))
        self.c = torch.nn.Parameter(torch.randn(()))
        self.d = torch.nn.Parameter(torch.randn(()))

    def forward(self, x):
        """
        In the forward function we accept a Tensor of input data and we must return
        a Tensor of output data. We can use Modules defined in the constructor as
        well as arbitrary operators on Tensors.
        """
        return self.a + self.b * x + self.c * x ** 2 + self.d * x ** 3

    def string(self):
        """
        Just like any class in Python, you can also define custom method on PyTorch modules
        """
        return f'y = {self.a.item()} + {self.b.item()} x + {self.c.item()} x^2 + {self.d.item()} x^3'

# Create Tensors to hold input and outputs.
x = torch.linspace(-math.pi, math.pi, 2000)
y = torch.sin(x)

# Construct our model by instantiating the class defined above
model = Polynomial3()

# Construct our loss function and an Optimizer. The call to model.parameters()
# in the SGD constructor will contain the learnable parameters (defined with
# torch.nn.Parameter) which are members of the model.
criterion = torch.nn.MSELoss(reduction='sum')
optimizer = torch.optim.SGD(model.parameters(), lr=1e-6)
for t in range(2000):
    # Forward pass: Compute predicted y by passing x to the model
    y_pred = model(x)

    # Compute and print loss
    loss = criterion(y_pred, y)
    if t % 100 == 99:
        print(t, loss.item())

    # Zero gradients, perform a backward pass, and update the weights.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(f'Result: {model.string()}')
```

**Total running time of the script**: (0 minutes 0.000 seconds)

[Download Python source code: `polynomial_module.py`](https://pytorch.org/tutorials/_downloads/916a9c460c899330dbc53216cc775358/polynomial_module.py)

[Download Jupyter notebook: `polynomial_module.ipynb`](https://pytorch.org/tutorials/_downloads/19f4ecdd2763dd4b90693df4d6e10ebe/polynomial_module.ipynb)

[Gallery generated by Sphinx-Gallery](https://sphinx-gallery.readthedocs.io)
@@ -1,82 +0,0 @@

# PyTorch: Control flow + weight sharing

> Source: <https://pytorch.org/tutorials/beginner/examples_nn/dynamic_net.html#sphx-glr-beginner-examples-nn-dynamic-net-py>

To showcase the power of PyTorch dynamic graphs, we implement a very strange model: a third-order polynomial that, on each forward pass, chooses a random number between 3 and 5 and uses that order, reusing the same weights multiple times to compute the fourth and fifth orders.

```py
import random
import torch
import math

class DynamicNet(torch.nn.Module):
    def __init__(self):
        """
        In the constructor we instantiate five parameters and assign them as members.
        """
        super().__init__()
        self.a = torch.nn.Parameter(torch.randn(()))
        self.b = torch.nn.Parameter(torch.randn(()))
        self.c = torch.nn.Parameter(torch.randn(()))
        self.d = torch.nn.Parameter(torch.randn(()))
        self.e = torch.nn.Parameter(torch.randn(()))

    def forward(self, x):
        """
        For the forward pass of the model, we randomly choose either 4, 5
        and reuse the e parameter to compute the contribution of these orders.

        Since each forward pass builds a dynamic computation graph, we can use normal
        Python control-flow operators like loops or conditional statements when
        defining the forward pass of the model.

        Here we also see that it is perfectly safe to reuse the same parameter many
        times when defining a computational graph.
        """
        y = self.a + self.b * x + self.c * x ** 2 + self.d * x ** 3
        for exp in range(4, random.randint(4, 6)):
            y = y + self.e * x ** exp
        return y

    def string(self):
        """
        Just like any class in Python, you can also define custom method on PyTorch modules
        """
        return f'y = {self.a.item()} + {self.b.item()} x + {self.c.item()} x^2 + {self.d.item()} x^3 + {self.e.item()} x^4 ? + {self.e.item()} x^5 ?'

# Create Tensors to hold input and outputs.
x = torch.linspace(-math.pi, math.pi, 2000)
y = torch.sin(x)

# Construct our model by instantiating the class defined above
model = DynamicNet()

# Construct our loss function and an Optimizer. Training this strange model with
# vanilla stochastic gradient descent is tough, so we use momentum
criterion = torch.nn.MSELoss(reduction='sum')
optimizer = torch.optim.SGD(model.parameters(), lr=1e-8, momentum=0.9)
for t in range(30000):
    # Forward pass: Compute predicted y by passing x to the model
    y_pred = model(x)

    # Compute and print loss
    loss = criterion(y_pred, y)
    if t % 2000 == 1999:
        print(t, loss.item())

    # Zero gradients, perform a backward pass, and update the weights.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(f'Result: {model.string()}')
```

**Total running time of the script**: (0 minutes 0.000 seconds)

[Download Python source code: `dynamic_net.py`](https://pytorch.org/tutorials/_downloads/3900c903cde097dc0088c3b06d588c0b/dynamic_net.py)

[Download Jupyter notebook: `dynamic_net.ipynb`](https://pytorch.org/tutorials/_downloads/ad230923bd9eb0d42576725b63ad8d91/dynamic_net.ipynb)

[Gallery generated by Sphinx-Gallery](https://sphinx-gallery.readthedocs.io)
@@ -1,971 +0,0 @@

# What is `torch.nn` really?

> Source: <https://pytorch.org/tutorials/beginner/nn_tutorial.html>

Author: Jeremy Howard, [fast.ai](https://www.fast.ai). Thanks to Rachel Thomas and Francisco Ingham.

We recommend running this tutorial as a notebook, not a script. To download the notebook (`.ipynb`) file, click the link at the top of the page.

PyTorch provides the elegantly designed modules and classes [`torch.nn`](https://pytorch.org/docs/stable/nn.html), [`torch.optim`](https://pytorch.org/docs/stable/optim.html), [`Dataset`](https://pytorch.org/docs/stable/data.html?highlight=dataset#torch.utils.data.Dataset), and [`DataLoader`](https://pytorch.org/docs/stable/data.html?highlight=dataloader#torch.utils.data.DataLoader) to help you create and train neural networks. In order to fully utilize their power and customize them for your problem, you need to really understand exactly what they're doing. To develop that understanding, we will first train a basic neural net on the MNIST dataset without using any features from these models; we will initially only use the most basic PyTorch tensor functionality. Then, we will incrementally add one feature at a time from `torch.nn`, `torch.optim`, `Dataset`, or `DataLoader`, showing exactly what each piece does and how it makes the code either more concise or more flexible.

**This tutorial assumes you already have PyTorch installed, and are familiar with the basics of tensor operations.** (If you're familiar with Numpy array operations, you'll find the PyTorch tensor operations used here almost identical.)

## MNIST data setup

We will use the classic [MNIST](http://deeplearning.net/data/mnist/) dataset, which consists of black-and-white images of hand-drawn digits (between 0 and 9).

We will use [`pathlib`](https://docs.python.org/3/library/pathlib.html) for dealing with paths (part of the Python 3 standard library), and will download the dataset using [`requests`](http://docs.python-requests.org/en/master/). We will only import modules when we use them, so you can see exactly what's being used at each point.

```py
from pathlib import Path
import requests

DATA_PATH = Path("data")
PATH = DATA_PATH / "mnist"

PATH.mkdir(parents=True, exist_ok=True)

URL = "https://github.com/pytorch/tutorials/raw/master/_static/"
FILENAME = "mnist.pkl.gz"

if not (PATH / FILENAME).exists():
    content = requests.get(URL + FILENAME).content
    (PATH / FILENAME).open("wb").write(content)
```

This dataset is in numpy array format, and has been stored using `pickle`, a python-specific format for serializing data.

```py
import pickle
import gzip

with gzip.open((PATH / FILENAME).as_posix(), "rb") as f:
    ((x_train, y_train), (x_valid, y_valid), _) = pickle.load(f, encoding="latin-1")
```

Each image is `28 x 28`, and is being stored as a flattened row of length `784 = 28x28`. Let's take a look at one; we need to reshape it to 2d first.

```py
from matplotlib import pyplot
import numpy as np

pyplot.imshow(x_train[0].reshape((28, 28)), cmap="gray")
print(x_train.shape)
```

|
||||
|
||||
出:
|
||||
|
||||
```py
|
||||
(50000, 784)
|
||||
|
||||
```
|
||||
|
||||
PyTorch 使用`torch.tensor`而不是 numpy 数组,因此我们需要转换数据。
|
||||
|
||||
```py
|
||||
import torch
|
||||
|
||||
x_train, y_train, x_valid, y_valid = map(
|
||||
torch.tensor, (x_train, y_train, x_valid, y_valid)
|
||||
)
|
||||
n, c = x_train.shape
|
||||
x_train, x_train.shape, y_train.min(), y_train.max()
|
||||
print(x_train, y_train)
|
||||
print(x_train.shape)
|
||||
print(y_train.min(), y_train.max())
|
||||
|
||||
```
|
||||
|
||||
出:
|
||||
|
||||
```py
|
||||
tensor([[0., 0., 0., ..., 0., 0., 0.],
|
||||
[0., 0., 0., ..., 0., 0., 0.],
|
||||
[0., 0., 0., ..., 0., 0., 0.],
|
||||
...,
|
||||
[0., 0., 0., ..., 0., 0., 0.],
|
||||
[0., 0., 0., ..., 0., 0., 0.],
|
||||
[0., 0., 0., ..., 0., 0., 0.]]) tensor([5, 0, 4, ..., 8, 4, 8])
|
||||
torch.Size([50000, 784])
|
||||
tensor(0) tensor(9)
|
||||
|
||||
```
|
||||
|
||||
## Neural net from scratch (no `torch.nn`)

Let's first create a model using nothing but PyTorch tensor operations. We're assuming you're already familiar with the basics of neural networks. (If you're not, you can learn them at [course.fast.ai](https://course.fast.ai).)

PyTorch provides methods to create random or zero-filled tensors, which we will use to create our weights and bias for a simple linear model. These are just regular tensors, with one very special addition: we tell PyTorch that they require a gradient. This causes PyTorch to record all of the operations done on the tensor, so that it can calculate the gradient during back-propagation *automatically*!

**For the weights, we set `requires_grad` after the initialization, since we don't want that step included in the gradient. (Note that a trailing `_` in PyTorch signifies that the operation is performed in-place.)**

Note

We are initializing the weights here with [Xavier initialisation](http://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf) (by multiplying with `1 / sqrt(n)`).

```py
import math

weights = torch.randn(784, 10) / math.sqrt(784)
weights.requires_grad_()
bias = torch.zeros(10, requires_grad=True)
```

Thanks to PyTorch's ability to calculate gradients automatically, we can use any standard Python function (or callable object) as a model! So let's just write a plain matrix multiplication and broadcasted addition to create a simple linear model. We also need an activation function, so we'll write and use `log_softmax`. Remember: although PyTorch provides lots of pre-written loss functions, activation functions, and so forth, you can easily write your own using plain Python. PyTorch will even create fast GPU or vectorized CPU code for your function automatically.

```py
def log_softmax(x):
    return x - x.exp().sum(-1).log().unsqueeze(-1)

def model(xb):
    return log_softmax(xb @ weights + bias)
```

In the above, the `@` stands for the matrix multiplication operation. We will call our function on one batch of data (in this case, 64 images). This is one *forward pass*. Note that our predictions won't be any better than random at this stage, since we start with random weights.

```py
bs = 64  # batch size

xb = x_train[0:bs]  # a mini-batch from x
preds = model(xb)  # predictions
preds[0], preds.shape
print(preds[0], preds.shape)
```

Out:

```py
tensor([-2.5964, -2.3153, -2.1321, -2.4480, -2.2930, -1.9507, -2.1289, -2.4175,
        -2.5332, -2.3967], grad_fn=<SelectBackward>) torch.Size([64, 10])
```
As you see, the `preds` tensor contains not only the tensor values, but also a gradient function. We'll use this later to do back-propagation.

Let's implement negative log-likelihood to use as the loss function (again, we can just use standard Python):

```py
def nll(input, target):
    return -input[range(target.shape[0]), target].mean()

loss_func = nll
```

Let's check our loss with our random model, so we can see if we improve after a back-propagation pass later.

```py
yb = y_train[0:bs]
print(loss_func(preds, yb))
```

Out:

```py
tensor(2.3735, grad_fn=<NegBackward>)
```

Let's also implement a function to calculate the accuracy of our model. For each prediction, if the index with the largest value matches the target value, then the prediction is correct.

```py
def accuracy(out, yb):
    preds = torch.argmax(out, dim=1)
    return (preds == yb).float().mean()
```

Let's check the accuracy of our random model, so we can see if our accuracy improves as our loss improves.

```py
print(accuracy(preds, yb))
```

Out:

```py
tensor(0.0938)
```
We can now run a training loop. For each iteration, we will:

* select a mini-batch of data (of size `bs`)
* use the model to make predictions
* calculate the loss
* `loss.backward()` updates the gradients of the model, in this case `weights` and `bias`.

We now use these gradients to update the weights and bias. We do this within the `torch.no_grad()` context manager, because we do not want these actions to be recorded for our next calculation of the gradient. You can read more about [how PyTorch's Autograd records operations here](https://pytorch.org/docs/stable/notes/autograd.html).

We then set the gradients to zero, so that we are ready for the next loop. Otherwise, our gradients would record a running tally of all the operations that had happened (i.e. `loss.backward()` *adds* the gradients to whatever is already stored, rather than replacing them).

Tip

You can use the standard python debugger to step through PyTorch code, allowing you to check the various variable values at each step. Uncomment `set_trace()` below to try it out.

```py
from IPython.core.debugger import set_trace

lr = 0.5  # learning rate
epochs = 2  # how many epochs to train for

for epoch in range(epochs):
    for i in range((n - 1) // bs + 1):
        # set_trace()
        start_i = i * bs
        end_i = start_i + bs
        xb = x_train[start_i:end_i]
        yb = y_train[start_i:end_i]
        pred = model(xb)
        loss = loss_func(pred, yb)

        loss.backward()
        with torch.no_grad():
            weights -= weights.grad * lr
            bias -= bias.grad * lr
            weights.grad.zero_()
            bias.grad.zero_()
```

That's it: we've created and trained a minimal neural network entirely from scratch (in this case, a logistic regression, since we have no hidden layers)!

Let's check the loss and accuracy and compare those to what we got earlier. We expect that the loss will have decreased and accuracy to have increased, and they have.

```py
print(loss_func(model(xb), yb), accuracy(model(xb), yb))
```

Out:

```py
tensor(0.0811, grad_fn=<NegBackward>) tensor(1.)
```
## Using `torch.nn.functional`

We will now refactor our code so that it does the same thing as before, only we'll start taking advantage of PyTorch's `nn` classes to make it more concise and flexible. At each step from here, we should be making our code one or more of: shorter, more understandable, and/or more flexible.

The first and easiest step is to make our code shorter by replacing our hand-written activation and loss functions with those from `torch.nn.functional` (which is generally imported into the namespace `F` by convention). This module contains all the functions in the `torch.nn` library (whereas other parts of the library contain classes). As well as a wide range of loss and activation functions, you'll also find here some convenient functions for creating neural nets, such as pooling functions. (There are also functions for doing convolutions, linear layers, etc., but as we'll see, these are usually better handled using other parts of the library.)

If you're using negative log likelihood loss and log softmax activation, then Pytorch provides a single function `F.cross_entropy` that combines the two. So we can even remove the activation function from our model.

```py
import torch.nn.functional as F

loss_func = F.cross_entropy

def model(xb):
    return xb @ weights + bias
```

Note that we no longer call `log_softmax` in the `model` function. Let's confirm that our loss and accuracy are the same as before:

```py
print(loss_func(model(xb), yb), accuracy(model(xb), yb))
```

Out:

```py
tensor(0.0811, grad_fn=<NllLossBackward>) tensor(1.)
```
## Refactor using `nn.Module`

Next up, we'll use `nn.Module` and `nn.Parameter` for a clearer and more concise training loop. We subclass `nn.Module` (which itself is a class and able to keep track of state). In this case, we want to create a class that holds our weights, bias, and method for the forward step. `nn.Module` has a number of attributes and methods (such as `.parameters()` and `.zero_grad()`) that we will be using.

Note

`nn.Module` (uppercase `M`) is a PyTorch-specific concept, and is a class we'll be using a lot. Don't confuse `nn.Module` with the Python concept of a [module](https://docs.python.org/3/tutorial/modules.html) (lowercase `m`), which is a file of Python code that can be imported.

```py
from torch import nn

class Mnist_Logistic(nn.Module):
    def __init__(self):
        super().__init__()
        self.weights = nn.Parameter(torch.randn(784, 10) / math.sqrt(784))
        self.bias = nn.Parameter(torch.zeros(10))

    def forward(self, xb):
        return xb @ self.weights + self.bias
```

Since we're now using an object instead of just using a function, we first have to instantiate our model:

```py
model = Mnist_Logistic()
```

Now we can calculate the loss in the same way as before. Note that `nn.Module` objects are used as if they are functions (i.e. they are *callable*), but behind the scenes Pytorch will call our `forward` method automatically.

```py
print(loss_func(model(xb), yb))
```

Out:

```py
tensor(2.3903, grad_fn=<NllLossBackward>)
```
Previously for our training loop we had to update the values for each parameter by name, and manually zero out the grads for each parameter separately, like this:

```py
with torch.no_grad():
    weights -= weights.grad * lr
    bias -= bias.grad * lr
    weights.grad.zero_()
    bias.grad.zero_()
```

Now we can take advantage of `model.parameters()` and `model.zero_grad()` (which are both defined by PyTorch for `nn.Module`) to make those steps more concise and less prone to the error of forgetting some of our parameters, particularly if we had a more complicated model:

```py
with torch.no_grad():
    for p in model.parameters():
        p -= p.grad * lr
    model.zero_grad()
```

We'll wrap our little training loop in a `fit` function so we can run it again later.

```py
def fit():
    for epoch in range(epochs):
        for i in range((n - 1) // bs + 1):
            start_i = i * bs
            end_i = start_i + bs
            xb = x_train[start_i:end_i]
            yb = y_train[start_i:end_i]
            pred = model(xb)
            loss = loss_func(pred, yb)

            loss.backward()
            with torch.no_grad():
                for p in model.parameters():
                    p -= p.grad * lr
                model.zero_grad()

fit()
```

Let's double-check that our loss has gone down:

```py
print(loss_func(model(xb), yb))
```

Out:

```py
tensor(0.0808, grad_fn=<NllLossBackward>)
```
## Refactor using `nn.Linear`

We continue to refactor our code. Instead of manually defining and initializing `self.weights` and `self.bias` and computing `xb @ self.weights + self.bias`, we will instead use the Pytorch class [`nn.Linear`](https://pytorch.org/docs/stable/nn.html#linear-layers) for a linear layer, which does all of that for us. Pytorch has many types of predefined layers that can greatly simplify our code, and often make it faster too.

```py
class Mnist_Logistic(nn.Module):
    def __init__(self):
        super().__init__()
        self.lin = nn.Linear(784, 10)

    def forward(self, xb):
        return self.lin(xb)
```

We instantiate our model and calculate the loss in the same way as before:

```py
model = Mnist_Logistic()
print(loss_func(model(xb), yb))
```

Out:

```py
tensor(2.4215, grad_fn=<NllLossBackward>)
```

We are still able to use our same `fit` method as before.

```py
fit()

print(loss_func(model(xb), yb))
```

Out:

```py
tensor(0.0824, grad_fn=<NllLossBackward>)
```
## 使用`optim`重构
|
||||
|
||||
Pytorch 还提供了一个包含各种优化算法的包`torch.optim`。 我们可以使用优化器中的`step`方法采取向前的步骤,而不是手动更新每个参数。
|
||||
|
||||
这将使我们替换之前的手动编码优化步骤:
|
||||
|
||||
```py
|
||||
with torch.no_grad():
|
||||
for p in model.parameters(): p -= p.grad * lr
|
||||
model.zero_grad()
|
||||
|
||||
```
|
||||
|
||||
而是只使用:
|
||||
|
||||
```py
|
||||
opt.step()
|
||||
opt.zero_grad()
|
||||
|
||||
```
|
||||
|
||||
(`optim.zero_grad()`将梯度重置为 0,我们需要在计算下一个小批量的梯度之前调用它。)
|
||||
|
||||
```py
|
||||
from torch import optim
|
||||
|
||||
```
We'll define a little function to create our model and optimizer so we can reuse it in the future.

```py
def get_model():
    model = Mnist_Logistic()
    return model, optim.SGD(model.parameters(), lr=lr)

model, opt = get_model()
print(loss_func(model(xb), yb))

for epoch in range(epochs):
    for i in range((n - 1) // bs + 1):
        start_i = i * bs
        end_i = start_i + bs
        xb = x_train[start_i:end_i]
        yb = y_train[start_i:end_i]
        pred = model(xb)
        loss = loss_func(pred, yb)

        loss.backward()
        opt.step()
        opt.zero_grad()

print(loss_func(model(xb), yb))
```

Out:

```py
tensor(2.2999, grad_fn=<NllLossBackward>)
tensor(0.0823, grad_fn=<NllLossBackward>)
```
## Refactor using `Dataset`

PyTorch has an abstract `Dataset` class. A `Dataset` can be anything that has a `__len__` function (called by Python's standard `len` function) and a `__getitem__` function as a way of indexing into it. [This tutorial](https://pytorch.org/tutorials/beginner/data_loading_tutorial.html) walks through a nice example of creating a custom `FacialLandmarkDataset` class as a subclass of `Dataset`.
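As a minimal sketch of that contract (the class below is a hypothetical example, not part of the tutorial), implementing just those two methods is enough:

```py
from torch.utils.data import Dataset

class PairDataset(Dataset):
    """Wraps two tensors whose first dimensions match."""
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __len__(self):
        return len(self.x)           # used by len(ds)

    def __getitem__(self, i):
        return self.x[i], self.y[i]  # used by ds[i]
```

This is essentially what `TensorDataset`, introduced next, does for any number of tensors.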
PyTorch's [`TensorDataset`](https://pytorch.org/docs/stable/_modules/torch/utils/data/dataset.html#TensorDataset) is a `Dataset` wrapping tensors. By defining a length and a way of indexing, it also gives us a way to iterate, index, and slice along the first dimension of a tensor. This will make it easier to access both our independent and dependent variables in the same line as we train.

```py
from torch.utils.data import TensorDataset
```

Both `x_train` and `y_train` can be combined in a single `TensorDataset`, which will be easier to iterate over and slice.

```py
train_ds = TensorDataset(x_train, y_train)
```
Previously, we had to iterate through minibatches of `x` and `y` values separately:

```py
xb = x_train[start_i:end_i]
yb = y_train[start_i:end_i]
```

Now, we can do these two steps together:

```py
xb,yb = train_ds[i*bs : i*bs+bs]
```

```py
model, opt = get_model()

for epoch in range(epochs):
    for i in range((n - 1) // bs + 1):
        xb, yb = train_ds[i * bs: i * bs + bs]
        pred = model(xb)
        loss = loss_func(pred, yb)

        loss.backward()
        opt.step()
        opt.zero_grad()

print(loss_func(model(xb), yb))
```

Out:

```py
tensor(0.0819, grad_fn=<NllLossBackward>)
```
## Refactor using `DataLoader`

PyTorch's `DataLoader` is responsible for managing batches. You can create a `DataLoader` from any `Dataset`. `DataLoader` makes it easier to iterate over batches. Rather than having to use `train_ds[i*bs : i*bs+bs]`, the `DataLoader` gives us each minibatch automatically.

```py
from torch.utils.data import DataLoader

train_ds = TensorDataset(x_train, y_train)
train_dl = DataLoader(train_ds, batch_size=bs)
```

Previously, our loop iterated over batches `(xb, yb)` like this:

```py
for i in range((n-1)//bs + 1):
    xb,yb = train_ds[i*bs : i*bs+bs]
    pred = model(xb)
```

Now, our loop is much cleaner, as `(xb, yb)` are loaded automatically from the data loader:

```py
for xb,yb in train_dl:
    pred = model(xb)
```

```py
model, opt = get_model()

for epoch in range(epochs):
    for xb, yb in train_dl:
        pred = model(xb)
        loss = loss_func(pred, yb)

        loss.backward()
        opt.step()
        opt.zero_grad()

print(loss_func(model(xb), yb))
```

Out:

```py
tensor(0.0821, grad_fn=<NllLossBackward>)
```
Thanks to PyTorch's `nn.Module`, `nn.Parameter`, `Dataset`, and `DataLoader`, our training loop is now dramatically smaller and easier to understand. Let's now try to add the basic features necessary to create effective models in practice.

## Add validation

In section 1, we were just trying to get a reasonable training loop set up for use on our training data. In reality, you **always** should also have a [validation set](https://www.fast.ai/2017/11/13/validation-sets/), in order to identify if you are overfitting.

[Shuffling the training data](https://www.quora.com/Does-the-order-of-training-data-matter-when-training-neural-networks) is important to prevent correlation between batches and overfitting. On the other hand, the validation loss will be identical whether we shuffle the validation set or not. Since shuffling takes extra time, it makes no sense to shuffle the validation data.

We'll use a batch size for the validation set that is twice as large as that for the training set. This is because the validation set does not need backpropagation and thus takes less memory (it doesn't need to store the gradients). We take advantage of this to use a larger batch size and compute the loss more quickly.

```py
train_ds = TensorDataset(x_train, y_train)
train_dl = DataLoader(train_ds, batch_size=bs, shuffle=True)

valid_ds = TensorDataset(x_valid, y_valid)
valid_dl = DataLoader(valid_ds, batch_size=bs * 2)
```

We will calculate and print the validation loss at the end of each epoch.

(Note that we always call `model.train()` before training, and `model.eval()` before inference, because these are used by layers such as `nn.BatchNorm2d` and `nn.Dropout` to ensure appropriate behavior for these different phases.)

```py
model, opt = get_model()

for epoch in range(epochs):
    model.train()
    for xb, yb in train_dl:
        pred = model(xb)
        loss = loss_func(pred, yb)

        loss.backward()
        opt.step()
        opt.zero_grad()

    model.eval()
    with torch.no_grad():
        valid_loss = sum(loss_func(model(xb), yb) for xb, yb in valid_dl)

    print(epoch, valid_loss / len(valid_dl))
```

Out:

```py
0 tensor(0.3743)
1 tensor(0.3316)
```
## Create `fit()` and `get_data()`

We'll now do a little refactoring of our own. Since we go through a similar process twice of calculating the loss for both the training set and the validation set, let's make that into its own function, `loss_batch`, which computes the loss for one batch.

We pass an optimizer in for the training set, and use it to perform backprop. For the validation set, we don't pass an optimizer, so the method doesn't perform backprop.

```py
def loss_batch(model, loss_func, xb, yb, opt=None):
    loss = loss_func(model(xb), yb)

    if opt is not None:
        loss.backward()
        opt.step()
        opt.zero_grad()

    return loss.item(), len(xb)
```
`fit` runs the necessary operations to train our model and compute the training and validation losses for each epoch.

```py
import numpy as np

def fit(epochs, model, loss_func, opt, train_dl, valid_dl):
    for epoch in range(epochs):
        model.train()
        for xb, yb in train_dl:
            loss_batch(model, loss_func, xb, yb, opt)

        model.eval()
        with torch.no_grad():
            losses, nums = zip(
                *[loss_batch(model, loss_func, xb, yb) for xb, yb in valid_dl]
            )
        val_loss = np.sum(np.multiply(losses, nums)) / np.sum(nums)

        print(epoch, val_loss)
```

`get_data` returns dataloaders for the training and validation sets.

```py
def get_data(train_ds, valid_ds, bs):
    return (
        DataLoader(train_ds, batch_size=bs, shuffle=True),
        DataLoader(valid_ds, batch_size=bs * 2),
    )
```
Now, our whole process of obtaining the data loaders and fitting the model can be run in 3 lines of code:

```py
train_dl, valid_dl = get_data(train_ds, valid_ds, bs)
model, opt = get_model()
fit(epochs, model, loss_func, opt, train_dl, valid_dl)
```

Out:

```py
0 0.3120644524335861
1 0.28915613491535186
```

You can use these basic 3 lines of code to train a wide variety of models. Let's see if we can use them to train a convolutional neural network (CNN)!

## Switch to CNN

We are now going to build our neural network with three convolutional layers. Because none of the functions in the previous section assumed anything about the model form, we'll be able to use them to train a CNN without any modification.

We will use PyTorch's predefined [`Conv2d`](https://pytorch.org/docs/stable/nn.html#torch.nn.Conv2d) class as our convolutional layer. We define a CNN with 3 convolutional layers. Each convolution is followed by a ReLU. At the end, we perform average pooling. (Note that `view` is PyTorch's version of numpy's `reshape`.)
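As a quick illustration (the tensor values here are hypothetical, not from the tutorial), `view` returns a tensor over the same data with a new shape, and a `-1` dimension is inferred from the others:

```py
t = torch.arange(12)   # shape [12]
m = t.view(3, 4)       # shape [3, 4], same underlying storage
n = t.view(-1, 6)      # shape [2, 6]; -1 is inferred
```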
```py
class Mnist_CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1)
        self.conv2 = nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1)
        self.conv3 = nn.Conv2d(16, 10, kernel_size=3, stride=2, padding=1)

    def forward(self, xb):
        xb = xb.view(-1, 1, 28, 28)
        xb = F.relu(self.conv1(xb))
        xb = F.relu(self.conv2(xb))
        xb = F.relu(self.conv3(xb))
        xb = F.avg_pool2d(xb, 4)
        return xb.view(-1, xb.size(1))

lr = 0.1
```
[Momentum](https://cs231n.github.io/neural-networks-3/#sgd) is a variation on stochastic gradient descent that takes previous updates into account as well, and generally leads to faster training.
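As a sketch of the idea (this is the common formulation; PyTorch's `SGD` with the `momentum` argument implements essentially this update, see its docs for the exact bookkeeping), each step keeps a running velocity of past gradients:

$$v_t = \mu v_{t-1} + g_t$$
$$w_t = w_{t-1} - \text{lr} \cdot v_t$$

where $g_t$ is the current gradient and $\mu$ is the momentum coefficient (`0.9` below), so past gradients decay geometrically instead of being forgotten.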
```py
model = Mnist_CNN()
opt = optim.SGD(model.parameters(), lr=lr, momentum=0.9)

fit(epochs, model, loss_func, opt, train_dl, valid_dl)
```

Out:

```py
0 0.32337012240886687
1 0.25021172934770586
```
## `nn.Sequential`

`torch.nn` has another handy class we can use to simplify our code: [`Sequential`](https://pytorch.org/docs/stable/nn.html#torch.nn.Sequential). A `Sequential` object runs each of the modules contained within it, in a sequential manner. This is a simpler way of writing our neural network.

To take advantage of this, we need to be able to easily define a **custom layer** from a given function. For instance, PyTorch doesn't have a view layer, and we need to create one for our network. `Lambda` will create a layer that we can then use when defining a network with `Sequential`.

```py
class Lambda(nn.Module):
    def __init__(self, func):
        super().__init__()
        self.func = func

    def forward(self, x):
        return self.func(x)

def preprocess(x):
    return x.view(-1, 1, 28, 28)
```
The model created with `Sequential` is simple:

```py
model = nn.Sequential(
    Lambda(preprocess),
    nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 10, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.AvgPool2d(4),
    Lambda(lambda x: x.view(x.size(0), -1)),
)

opt = optim.SGD(model.parameters(), lr=lr, momentum=0.9)

fit(epochs, model, loss_func, opt, train_dl, valid_dl)
```

Out:

```py
0 0.30119081069231035
1 0.25335356528759
```
## Wrapping `DataLoader`

Our CNN is fairly concise, but it only works with MNIST, because:

* It assumes the input is a `28 * 28` long vector
* It assumes the final CNN grid size is `4 * 4` (since that's the average pooling kernel size we used)

Let's get rid of these two assumptions, so our model works with any 2d single-channel image. First, we can remove the initial `Lambda` layer by moving the data preprocessing into a generator:

```py
def preprocess(x, y):
    return x.view(-1, 1, 28, 28), y

class WrappedDataLoader:
    def __init__(self, dl, func):
        self.dl = dl
        self.func = func

    def __len__(self):
        return len(self.dl)

    def __iter__(self):
        batches = iter(self.dl)
        for b in batches:
            yield (self.func(*b))

train_dl, valid_dl = get_data(train_ds, valid_ds, bs)
train_dl = WrappedDataLoader(train_dl, preprocess)
valid_dl = WrappedDataLoader(valid_dl, preprocess)
```
Next, we can replace `nn.AvgPool2d` with `nn.AdaptiveAvgPool2d`, which allows us to define the size of the *output* tensor we want, rather than the *input* tensor we have. As a result, our model will work with any size input.

```py
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 10, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    Lambda(lambda x: x.view(x.size(0), -1)),
)

opt = optim.SGD(model.parameters(), lr=lr, momentum=0.9)
```

Let's try it out:

```py
fit(epochs, model, loss_func, opt, train_dl, valid_dl)
```

Out:

```py
0 0.327303307390213
1 0.2181092014491558
```
## Using your GPU

If you're lucky enough to have access to a CUDA-capable GPU (you can rent one for about $0.50/hour from most cloud providers), you can use it to speed up your code. First check that your GPU is working in PyTorch:

```py
print(torch.cuda.is_available())
```

Out:

```py
True
```

And then create a device object for it:

```py
dev = torch.device(
    "cuda") if torch.cuda.is_available() else torch.device("cpu")
```

Let's update `preprocess` to move batches to the GPU:

```py
def preprocess(x, y):
    return x.view(-1, 1, 28, 28).to(dev), y.to(dev)

train_dl, valid_dl = get_data(train_ds, valid_ds, bs)
train_dl = WrappedDataLoader(train_dl, preprocess)
valid_dl = WrappedDataLoader(valid_dl, preprocess)
```

Finally, we can move our model to the GPU.

```py
model.to(dev)
opt = optim.SGD(model.parameters(), lr=lr, momentum=0.9)
```

You should find it runs faster now:

```py
fit(epochs, model, loss_func, opt, train_dl, valid_dl)
```

Out:

```py
0 0.1833980613708496
1 0.17365939717292786
```
## Summary

We now have a general data pipeline and training loop which you can use for training many types of models using PyTorch. To see how simple training a model can now be, take a look at the `mnist_sample` sample notebook.

Of course, there are many things you'll want to add, such as data augmentation, hyperparameter tuning, monitoring training, transfer learning, and so forth. These features are available in the fastai library, which has been developed using the same design approach shown in this tutorial, providing a natural next step for practitioners looking to take their models further.

We promised at the start of this tutorial that we'd explain through example each of `torch.nn`, `torch.optim`, `Dataset`, and `DataLoader`. So let's summarize what we've seen:

> * `torch.nn`:
>   * `Module`: creates a callable which behaves like a function, but can also contain state (such as neural net layer weights). It knows what `Parameter`s it contains and can zero all their gradients, loop through them for weight updates, etc.
>   * `Parameter`: a wrapper for a tensor that tells a `Module` that it has weights that need updating during backprop. Only tensors with the `requires_grad` attribute set are updated.
>   * `functional`: a module (usually imported into the `F` namespace by convention) which contains activation functions, loss functions, etc., as well as non-stateful versions of layers such as convolutional and linear layers.
> * `torch.optim`: contains optimizers such as `SGD`, which update the weights of `Parameter`s during the backward step.
> * `Dataset`: an abstract interface of objects with a `__len__` and a `__getitem__`, including classes provided with PyTorch such as `TensorDataset`.
> * `DataLoader`: takes any `Dataset` and creates an iterator which returns batches of data.

**Total running time of the script**: (0 minutes 57.062 seconds)

[Download Python source code: `nn_tutorial.py`](../_downloads/a6246751179fbfb7cad9222ef1c16617/nn_tutorial.py)

[Download Jupyter notebook: `nn_tutorial.ipynb`](../_downloads/5ddab57bb7482fbcc76722617dd47324/nn_tutorial.ipynb)

[Gallery generated by Sphinx-Gallery](https://sphinx-gallery.readthedocs.io)
@@ -1,348 +0,0 @@
# Visualizing models, data, and training with TensorBoard

> Original: <https://pytorch.org/tutorials/intermediate/tensorboard_tutorial.html>

In the [60 Minute Blitz](https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html), we showed you how to load in data, feed it through a model we define as a subclass of `nn.Module`, train this model on training data, and test it on test data. To see what's happening while the model trains, we printed out some statistics to get a sense of whether training was progressing. However, we can do much better than that: PyTorch integrates with TensorBoard, a tool designed for visualizing the results of neural network training runs. This tutorial illustrates some of its functionality, using the [Fashion-MNIST dataset](https://github.com/zalandoresearch/fashion-mnist), which can be read into PyTorch using `torchvision.datasets`.

In this tutorial, we'll learn how to:

> 1. Read in data and apply appropriate transforms (nearly identical to the prior tutorial).
> 2. Set up TensorBoard.
> 3. Write to TensorBoard.
> 4. Inspect a model architecture using TensorBoard.
> 5. Use TensorBoard to create interactive versions of the visualizations we created in the last tutorial, with less code.

Specifically, on point 5, we'll see:

> * A couple of ways to inspect our training data
> * How to track our model's performance as it trains
> * How to assess our model's performance once it is trained

We'll begin with similar boilerplate code as in the [CIFAR-10 tutorial](https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html):
```py
# imports
import matplotlib.pyplot as plt
import numpy as np

import torch
import torchvision
import torchvision.transforms as transforms

import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

# transforms
transform = transforms.Compose(
    [transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))])

# datasets
trainset = torchvision.datasets.FashionMNIST('./data',
    download=True,
    train=True,
    transform=transform)
testset = torchvision.datasets.FashionMNIST('./data',
    download=True,
    train=False,
    transform=transform)

# dataloaders
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)

# constant for classes
classes = ('T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
           'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle Boot')

# helper function to show an image
# (used in the `plot_classes_preds` function below)
def matplotlib_imshow(img, one_channel=False):
    if one_channel:
        img = img.mean(dim=0)
    img = img / 2 + 0.5     # unnormalize
    npimg = img.numpy()
    if one_channel:
        plt.imshow(npimg, cmap="Greys")
    else:
        plt.imshow(np.transpose(npimg, (1, 2, 0)))
```
We'll define a similar model architecture from that tutorial, making only minor modifications to account for the fact that the images are now one channel instead of three, and `28x28` instead of `32x32`:
```py
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 4 * 4, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 4 * 4)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
```
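The `16 * 4 * 4` input size of `fc1` follows from the usual output-size formula for a convolution or pooling layer with kernel size $k$, padding $p$, and stride $s$:

$$d_{out} = \lfloor (d_{in} - k + 2p) / s \rfloor + 1$$

Applied here: the `28x28` input shrinks as `28 → 24` (conv1, `k=5`) `→ 12` (pool) `→ 8` (conv2, `k=5`) `→ 4` (pool), leaving 16 channels of `4x4` feature maps.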
We'll define the same `optimizer` and `criterion` as before:

```py
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
```
## 1\. TensorBoard setup

Now we'll set up TensorBoard, importing `tensorboard` from `torch.utils` and defining a `SummaryWriter`, our key object for writing information to TensorBoard.

```py
from torch.utils.tensorboard import SummaryWriter

# default `log_dir` is "runs" - we'll be more specific here
writer = SummaryWriter('runs/fashion_mnist_experiment_1')
```

Note that this line alone creates a `runs/fashion_mnist_experiment_1` folder.

## 2\. Writing to TensorBoard

Now let's write an image to TensorBoard - specifically, a grid - using [`make_grid`](https://pytorch.org/docs/stable/torchvision/utils.html#torchvision.utils.make_grid).

```py
# get some random training images
dataiter = iter(trainloader)
images, labels = dataiter.next()

# create grid of images
img_grid = torchvision.utils.make_grid(images)

# show images
matplotlib_imshow(img_grid, one_channel=True)

# write to tensorboard
writer.add_image('four_fashion_mnist_images', img_grid)
```

Now running

```py
tensorboard --logdir=runs
```

from the command line and then navigating to `https://localhost:6006` should show the following.
![]()

Now you know how to use TensorBoard! This example, however, could be done in a Jupyter Notebook; where TensorBoard really excels is in creating interactive visualizations. We'll cover one of those next, and several more by the end of the tutorial.

## 3\. Inspect the model using TensorBoard

One of TensorBoard's strengths is its ability to visualize complex model structures. Let's visualize the model we built.

```py
writer.add_graph(net, images)
writer.close()
```

Now upon refreshing TensorBoard you should see a `Graphs` tab that looks like this:

![]()

Go ahead and double-click on `Net` to see it expand, showing a detailed view of the individual operations that make up the model.

TensorBoard has a very handy feature for visualizing high-dimensional data, such as image data, in a lower dimensional space; we'll cover this next.

## 4\. Adding a "Projector" to TensorBoard

We can visualize the lower dimensional representation of higher dimensional data via the [`add_embedding`](https://pytorch.org/docs/stable/tensorboard.html#torch.utils.tensorboard.writer.SummaryWriter.add_embedding) method
```py
# helper function
def select_n_random(data, labels, n=100):
    '''
    Selects n random datapoints and their corresponding labels from a dataset
    '''
    assert len(data) == len(labels)

    perm = torch.randperm(len(data))
    return data[perm][:n], labels[perm][:n]

# select random images and their target indices
images, labels = select_n_random(trainset.data, trainset.targets)

# get the class labels for each image
class_labels = [classes[lab] for lab in labels]

# log embeddings
features = images.view(-1, 28 * 28)
writer.add_embedding(features,
                     metadata=class_labels,
                     label_img=images.unsqueeze(1))
writer.close()
```

Now in the "Projector" tab of TensorBoard, you can see these 100 images - each of which is 784 dimensional - projected down into three dimensional space. Furthermore, this is interactive: you can click and drag to rotate the three dimensional projection. Finally, a couple of tips to make the visualization easier to see: select "Color: label" on the top left, and enable "night mode", which will make the images easier to see since their background is white:

![]()

Now we've thoroughly inspected our data; next, let's show how TensorBoard can make tracking model training and evaluation clearer, starting with training.
## 5\. Tracking model training with TensorBoard

In the previous example, we simply *printed* the model's running loss every 2000 iterations. Now we'll instead log the running loss to TensorBoard, along with a view into the predictions the model is making via the `plot_classes_preds` function.

```py
# helper functions

def images_to_probs(net, images):
    '''
    Generates predictions and corresponding probabilities from a trained
    network and a list of images
    '''
    output = net(images)
    # convert output probabilities to predicted class
    _, preds_tensor = torch.max(output, 1)
    preds = np.squeeze(preds_tensor.numpy())
    return preds, [F.softmax(el, dim=0)[i].item() for i, el in zip(preds, output)]

def plot_classes_preds(net, images, labels):
    '''
    Generates matplotlib Figure using a trained network, along with images
    and labels from a batch, that shows the network's top prediction along
    with its probability, alongside the actual label, coloring this
    information based on whether the prediction was correct or not.
    Uses the "images_to_probs" function.
    '''
    preds, probs = images_to_probs(net, images)
    # plot the images in the batch, along with predicted and true labels
    fig = plt.figure(figsize=(12, 48))
    for idx in np.arange(4):
        ax = fig.add_subplot(1, 4, idx+1, xticks=[], yticks=[])
        matplotlib_imshow(images[idx], one_channel=True)
        ax.set_title("{0}, {1:.1f}%\n(label: {2})".format(
            classes[preds[idx]],
            probs[idx] * 100.0,
            classes[labels[idx]]),
                    color=("green" if preds[idx]==labels[idx].item() else "red"))
    return fig
```
Finally, let's train the model using the same model training code from the prior tutorial, but writing results to TensorBoard every 1000 batches instead of printing to console; this is done using the [`add_scalar`](https://pytorch.org/docs/stable/tensorboard.html#torch.utils.tensorboard.writer.SummaryWriter.add_scalar) function.

In addition, as we train, we'll generate an image showing the model's predictions vs. the actual results on the four images included in that batch.

```py
running_loss = 0.0
for epoch in range(1):  # loop over the dataset multiple times

    for i, data in enumerate(trainloader, 0):

        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        if i % 1000 == 999:    # every 1000 mini-batches...

            # ...log the running loss
            writer.add_scalar('training loss',
                              running_loss / 1000,
                              epoch * len(trainloader) + i)

            # ...log a Matplotlib Figure showing the model's predictions on a
            # random mini-batch
            writer.add_figure('predictions vs. actuals',
                              plot_classes_preds(net, inputs, labels),
                              global_step=epoch * len(trainloader) + i)
            running_loss = 0.0
print('Finished Training')
```

Now you can look at the "Scalars" tab to see the running loss plotted over the 15,000 iterations of training:

![]()

In addition, we can look at the predictions the model made on arbitrary batches throughout learning. See the "Images" tab and scroll down under the "predictions vs. actuals" visualization to see this; it shows us that, for example, after just 3000 training iterations, the model was already able to distinguish between visually distinct classes such as shirts, sneakers, and coats, though it isn't as confident as it becomes later on in training:

![]()

In the prior tutorial, we looked at per-class accuracy once the model had been trained; here, we'll use TensorBoard to plot precision-recall curves ([explained here](https://www.scikit-yb.org/en/latest/api/classifier/prcurve.html)) for each class.

## 6\. Assessing trained models with TensorBoard

```py
# 1\. gets the probability predictions in a test_size x num_classes Tensor
# 2\. gets the preds in a test_size Tensor
# takes ~10 seconds to run
class_probs = []
class_preds = []
with torch.no_grad():
    for data in testloader:
        images, labels = data
        output = net(images)
        class_probs_batch = [F.softmax(el, dim=0) for el in output]
        _, class_preds_batch = torch.max(output, 1)

        class_probs.append(class_probs_batch)
        class_preds.append(class_preds_batch)

test_probs = torch.cat([torch.stack(batch) for batch in class_probs])
test_preds = torch.cat(class_preds)

# helper function
def add_pr_curve_tensorboard(class_index, test_probs, test_preds, global_step=0):
    '''
    Takes in a "class_index" from 0 to 9 and plots the corresponding
    precision-recall curve
    '''
    tensorboard_preds = test_preds == class_index
    tensorboard_probs = test_probs[:, class_index]

    writer.add_pr_curve(classes[class_index],
                        tensorboard_preds,
                        tensorboard_probs,
                        global_step=global_step)
    writer.close()

# plot all the pr curves
for i in range(len(classes)):
    add_pr_curve_tensorboard(i, test_probs, test_preds)
```

You will now see a "PR Curves" tab that contains the precision-recall curves for each class. Go ahead and poke around; you'll see that on some classes the model has nearly 100% "area under the curve", whereas on others this area is lower:

![]()

And that's an intro to TensorBoard and PyTorch's integration with it. Of course, you could do everything TensorBoard does in your Jupyter Notebook, but with TensorBoard, you get visuals that are interactive by default.
@@ -1,35 +0,0 @@
# PyTorch Chinese official tutorials 1.7

> Original: [WELCOME TO PYTORCH TUTORIALS](https://pytorch.org/tutorials/)
>
> License: [CC BY-NC-SA 4.0](http://creativecommons.org/licenses/by-nc-sa/4.0/)
>
> Proudly produced with [Google Translate](https://translate.google.cn/)
>
> Don't worry about your image; care only about achieving your goals. (*Principles*, Life Principle 2.3.c)

* [Read online](https://dl.apachecn.org)
* [ApacheCN interview and job-hunting QQ group: 724187166](https://jq.qq.com/?_wv=1027&k=54ujcL3)
* [ApacheCN learning resources](http://www.apachecn.org/)

## Contribution guide

This project needs proofreading; Pull Requests are welcome.

> Please be bold in translating and improving the translation. Although we strive for excellence, we don't require perfection, so don't worry about making mistakes in translation: in most cases our server keeps a record of all translations, so you needn't fear that a slip will cause irreparable damage. (Adapted from Wikipedia)

## Contact

### Maintainer

* [飞龙](https://github.com/wizardforcel): 562826179

### Other channels

* File an issue at our [apachecn/apachecn-tf-zh](https://github.com/apachecn/apachecn-tf-zh) GitHub repo.
* Send email to `apachecn@163.com`.
* Contact the group owner/admins in our [organization study group](http://www.apachecn.org/organization/348.html).

## Sponsor us

![]()
@@ -1,176 +0,0 @@
{
    "title" : "Pytorch 中文文档",
    "author" : "ApacheCN",
    "description" : "Pytorch 中文文档: 教程和文档",
    "language" : "zh-hans",
    "plugins": [
        "github",
        "github-buttons",
        "-sharing",
        "insert-logo",
        "sharing-plus",
        "back-to-top-button",
        "code",
        "copy-code-button",
        "katex",
        "pageview-count",
        "edit-link",
        "emphasize",
        "alerts",
        "auto-scroll-table",
        "popup",
        "hide-element",
        "page-toc-button",
        "tbfed-pagefooter",
        "sitemap",
        "advanced-emoji",
        "expandable-chapters",
        "splitter",
        "search-pro"
    ],
    "pluginsConfig": {
        "github": {
            "url": "https://github.com/apachecn/pytorch-doc-zh"
        },
        "github-buttons": {
            "buttons": [
                {
                    "user": "apachecn",
                    "repo": "pytorch-doc-zh",
                    "type": "star",
                    "count": true,
                    "size": "small"
                }
            ]
        },
        "insert-logo": {
            "url": "http://data.apachecn.org/img/logo.jpg",
            "style": "background: none; max-height: 150px; min-height: 150px"
        },
        "hide-element": {
            "elements": [".gitbook-link"]
        },
        "edit-link": {
            "base": "https://github.com/apachecn/pytorch-doc-zh/blob/master/docs/1.7",
            "label": "编辑本页"
        },
        "sharing": {
            "qzone": true,
            "weibo": true,
            "twitter": false,
            "facebook": false,
            "google": false,
            "qq": false,
            "line": false,
            "whatsapp": false,
            "douban": false,
            "all": [
                "qq", "douban", "facebook", "google", "linkedin", "twitter", "weibo", "whatsapp"
            ]
        },
        "page-toc-button": {
            "maxTocDepth": 4,
            "minTocSize": 4
        },
        "tbfed-pagefooter": {
            "copyright":"Copyright © ibooker.org.cn 2019",
            "modify_label": "该文件修订时间: ",
            "modify_format": "YYYY-MM-DD HH:mm:ss"
        },
        "sitemap": {
            "hostname": "http://pytorch.apachecn.org"
        }
    },
    "my_links" : {
        "sidebar" : {
            "Home" : "https://www.baidu.com"
        }
    },
    "my_plugins": [
        "donate",
        "todo",
        "-lunr",
        "-search",
        "expandable-chapters-small",
        "chapter-fold",
        "expandable-chapters",
        "expandable-chapters-small",
        "back-to-top-button",
        "ga",
        "baidu",
        "sitemap",
        "tbfed-pagefooter",
        "advanced-emoji",
        "sectionx",
        "page-treeview",
        "simple-page-toc",
        "ancre-navigation",
        "theme-apachecn@git+https://github.com/apachecn/theme-apachecn#HEAD",
        "pagefooter-apachecn@git+https://github.com/apachecn/gitbook-plugin-pagefooter-apachecn#HEAD"
    ],
    "my_pluginsConfig": {
        "github-buttons": {
            "buttons": [
                {
                    "user": "apachecn",
                    "repo": "pytorch-doc-zh",
                    "type": "star",
                    "count": true,
                    "size": "small"
                },
                {
                    "user": "apachecn",
                    "width": "160",
                    "type": "follow",
                    "count": true,
                    "size": "small"
                }
            ]
        },
        "ignores": ["node_modules"],
        "simple-page-toc": {
            "maxDepth": 3,
            "skipFirstH1": true
        },
        "page-toc-button": {
            "maxTocDepth": 2,
            "minTocSize": 2
        },
        "page-treeview": {
            "copyright": "Copyright © aleen42",
            "minHeaderCount": "2",
            "minHeaderDeep": "2"
        },
        "donate": {
            "wechat": "微信收款的二维码URL",
            "alipay": "支付宝收款的二维码URL",
            "title": "",
            "button": "赏",
            "alipayText": "支付宝打赏",
            "wechatText": "微信打赏"
        },
        "page-copyright": {
            "description": "modified at",
            "signature": "你的签名",
            "wisdom": "Designer, Frontend Developer & overall web enthusiast",
            "format": "YYYY-MM-dd hh:mm:ss",
            "copyright": "Copyright © 你的名字",
            "timeColor": "#666",
            "copyrightColor": "#666",
            "utcOffset": "8",
            "style": "normal",
            "noPowered": false
        },
        "ga": {
            "token": "UA-102475051-10"
        },
        "baidu": {
            "token": "75439e2cbd22bdd813226000e9dcc12f"
        },
        "pagefooter-apachecn": {
            "copyright":"Copyright © ibooker.org.cn 2019",
            "modify_label": "该文件修订时间: ",
            "modify_format": "YYYY-MM-DD HH:mm:ss"
        }
    }
}
1
pytorch/实战/.gitignore
vendored
Normal file
@@ -0,0 +1 @@
data
@@ -193,21 +193,197 @@
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "execution_count": 103,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "Net(\n (conv1): Conv2d(1, 6, kernel_size=(3, 3), stride=(1, 1))\n (conv2): Conv2d(6, 16, kernel_size=(3, 3), stride=(1, 1))\n (fc1): Linear(in_features=576, out_features=120, bias=True)\n (fc2): Linear(in_features=120, out_features=84, bias=True)\n (fc3): Linear(in_features=84, out_features=10, bias=True)\n)\n10\ntorch.Size([6, 1, 3, 3])\n"
     ]
    }
   ],
   "source": [
    "import torch\n",
    "import torch.nn as nn\n",
    "import torch.nn.functional as f\n",
    "import torch.nn.functional as F\n",
    "\n",
    "class Net(nn.Module):\n",
    "\n",
    "    def __init__(self):\n",
    "        super(Net,self).__init__()\n",
    "        self.conv1 = nn.Conv2d(1,6,3)\n",
    "        self.conv2 = nn.Conv2d(6,16,3)\n",
    "        self.fc1 = nn.Linear(16*6*6,120)\n",
    "        self.fc2 = nn.Linear(120,84)\n",
    "        self.fc3 = nn.Linear(84,10)\n",
    "\n",
    "    def forward(self,x):\n",
    "        x = F.max_pool2d(F.relu(self.conv1(x)),(2,2))\n",
    "        x = F.max_pool2d(F.relu(self.conv2(x)),2)\n",
    "        x = x.view(-1,self.num_flat_features(x))\n",
    "        x = F.relu(self.fc1(x))\n",
    "        x = F.relu(self.fc2(x))\n",
    "        x = self.fc3(x)\n",
    "        return x\n",
    "    \n",
    "    def num_flat_features(self,x):\n",
    "        size = x.size()[1:]\n",
    "        num_features = 1\n",
    "        for s in size:\n",
    "            num_features *= s\n",
    "        return num_features\n",
    "\n",
    "net = Net()\n",
    "print(net)\n",
    "params = list(net.parameters())\n",
    "print(len(params))\n",
    "print(params[0].size())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 104,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "tensor([[ 0.0492, -0.0594, -0.1026, -0.0511, 0.0224, -0.0672, 0.1048, 0.0772,\n -0.1358, -0.0327]], grad_fn=<AddmmBackward>)\n"
     ]
    }
   ],
   "source": [
    "input = torch.randn(1,1,32,32)\n",
    "out = net(input)\n",
    "print(out)\n",
    "\n",
    "net.zero_grad()\n",
    "out.backward(torch.randn(1,10))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 83,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "tensor([[0.9332, 0.9385],\n [0.5645, 0.9817],\n [0.0998, 0.0800],\n [0.3189, 0.7160],\n [0.4157, 0.3705]])\n"
     ]
    }
   ],
   "source": [
    "# The first dimension of a tensor is the number of data items; the dimensions from the second onward describe each item's format. A first dimension > 1 means there is more than one data item\n",
    "a = torch.rand(2,5)\n",
    "# b = np.rand(2,3)\n",
    "print(a.T)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 96,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "None\nTrue\ntensor([[-0.8533, -1.8255, 0.1003, -0.2550, 0.0429, -2.4029, -2.1907, -0.3119,\n -0.5956, 0.5517]])\ntensor(1.5207, grad_fn=<MseLossBackward>)\n<MseLossBackward object at 0x000001D03A547370>\n<AddmmBackward object at 0x000001D03A44CAC0>\n<AccumulateGrad object at 0x000001D03A23D400>\n"
     ]
    }
   ],
   "source": [
    "# define the loss\n",
    "print(input.grad)\n",
    "print(input.is_leaf)\n",
    "output = net(input)\n",
    "target = torch.randn(10)\n",
    "target = target.view(1,-1)\n",
    "print(target)\n",
    "criterion = nn.MSELoss()\n",
    "loss = criterion(output,target)\n",
    "# print the functions recorded for the backward pass\n",
    "print(loss)\n",
    "print(loss.grad_fn)\n",
    "print(loss.grad_fn.next_functions[0][0])\n",
    "print(loss.grad_fn.next_functions[0][0].next_functions[0][0])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 100,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "tensor([0., 0., 0., 0., 0., 0.])\n"
     ]
    },
    {
     "output_type": "error",
     "ename": "RuntimeError",
     "evalue": "Trying to backward through the graph a second time, but the saved intermediate results have already been freed. Specify retain_graph=True when calling .backward() or autograd.grad() the first time.",
     "traceback": [
      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[1;31mRuntimeError\u001b[0m                              Traceback (most recent call last)",
      "\u001b[1;32m<ipython-input-100-139d55f804ab>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m\u001b[0m\n\u001b[0;32m 4\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mnet\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mconv1\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mbias\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mgrad\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 5\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 6\u001b[1;33m \u001b[0mloss\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mbackward\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 7\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 8\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mnet\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mconv1\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mbias\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mgrad\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mC:\\Python\\lib\\site-packages\\torch\\tensor.py\u001b[0m in \u001b[0;36mbackward\u001b[1;34m(self, gradient, retain_graph, create_graph, inputs)\u001b[0m\n\u001b[0;32m 243\u001b[0m \u001b[0mcreate_graph\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mcreate_graph\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 244\u001b[0m                 inputs=inputs)\n\u001b[1;32m--> 245\u001b[1;33m         \u001b[0mtorch\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mautograd\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mbackward\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mgradient\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mretain_graph\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mcreate_graph\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0minputs\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0minputs\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 246\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 247\u001b[0m     \u001b[1;32mdef\u001b[0m \u001b[0mregister_hook\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mhook\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mC:\\Python\\lib\\site-packages\\torch\\autograd\\__init__.py\u001b[0m in \u001b[0;36mbackward\u001b[1;34m(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)\u001b[0m\n\u001b[0;32m 143\u001b[0m         \u001b[0mretain_graph\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mcreate_graph\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 144\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 145\u001b[1;33m     Variable._execution_engine.run_backward(\n\u001b[0m\u001b[0;32m 146\u001b[0m         \u001b[0mtensors\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mgrad_tensors_\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mretain_graph\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mcreate_graph\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0minputs\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 147\u001b[0m         allow_unreachable=True, accumulate_grad=True)  # allow_unreachable flag\n",
      "\u001b[1;31mRuntimeError\u001b[0m: Trying to backward through the graph a second time, but the saved intermediate results have already been freed. Specify retain_graph=True when calling .backward() or autograd.grad() the first time."
     ]
    }
   ],
   "source": [
    "# run backpropagation\n",
    "net.zero_grad()\n",
    "\n",
    "print(net.conv1.bias.grad)\n",
    "loss.backward()\n",
    "\n",
    "print(net.conv1.bias.grad)\n",
    "# update the network weights: weight = weight - learning_rate * gradient\n",
    "learning_rate = 0.01\n",
    "for f in net.parameters():\n",
    "    f.data.sub_(f.grad.data*learning_rate)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 106,
   "metadata": {},
   "outputs": [],
   "source": [
    "# a commonly used optimizer\n",
    "import torch.optim as optim\n",
    "\n",
    "optimizer = optim.SGD(net.parameters(),lr=0.01)\n",
    "\n",
    "optimizer.zero_grad()\n",
    "\n",
    "output = net(input)\n",
    "loss = criterion(output,target)\n",
    "loss.backward()\n",
    "optimizer.step()"
   ]
  },
  {
   "source": [
    "### Convolution output-size formula\n",
    "$$\n",
    "d_{out} = (d_{in} - k + 2p) / s + 1\n",
    "$$\n",
   ],
   "cell_type": "markdown",
   "metadata": {}
  }
 ]
}
392
pytorch/实战/6.ipynb
Normal file
File diff suppressed because one or more lines are too long
897
pytorch/实战/7.ipynb
Normal file
File diff suppressed because one or more lines are too long
BIN
pytorch/实战/cifar_net.pth
Normal file
Binary file not shown.