This article covers parameter access in PyTorch and will hopefully be of help to you.
Why pull parameters out into their own discussion?
Because during training, our goal is to find the parameter values that minimize the loss function. After training, we need to take those parameters out to make predictions, or to use them somewhere else.
So, to make things easier later on, let's go over them now (a short preview sketch follows the list):
- Accessing parameters, for debugging, diagnostics, and visualization.
- Parameter initialization.
- Sharing parameters across different model components.
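This post only walks through the first item in detail, but as a preview, here is a minimal sketch of what all three look like in PyTorch. The layer sizes, the init values, and the names demo, shared, and tied are my own illustrative choices, not from the book:

import torch
from torch import nn

# Access: every nn.Module exposes its parameters via state_dict() / named_parameters()
demo = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
print(demo[0].weight.shape)               # torch.Size([8, 4])

# Initialization: the nn.init functions rewrite a parameter's values in place
nn.init.normal_(demo[0].weight, mean=0.0, std=0.01)
nn.init.zeros_(demo[0].bias)

# Sharing: reusing the same layer object means both positions share one set of parameters
shared = nn.Linear(8, 8)
tied = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), shared, nn.ReLU(), shared, nn.ReLU(), nn.Linear(8, 1))
print(tied[2].weight is tied[4].weight)   # True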
import torch
from torch import nn

net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
X = torch.rand(size=(2, 4))
net(X)
This is just a simple multilayer perceptron again, with a random X as its input.
When a model is defined through the Sequential class, we can access any of its layers by index.
print(net)
The output is:
Sequential(
(0): Linear(in_features=4, out_features=8, bias=True)
(1): ReLU()
(2): Linear(in_features=8, out_features=1, bias=True)
)
We can get the layer we want through the index shown in front of it.
print(net[0])
print(net[1])
print(net[2].state_dict())
Linear(in_features=4, out_features=8, bias=True)
ReLU()
OrderedDict([('weight', tensor([[-0.0264, -0.0906,  0.3497,  0.3284, -0.0173,  0.0124,  0.0136,  0.0782]])), ('bias', tensor([0.2243]))])
As expected, the first two outputs show what the first two layers are.
As for the third output, we can see that this layer contains two parameters:
[
  ('weight',
   tensor([[-0.0264, -0.0906,  0.3497,  0.3284, -0.0173,  0.0124,  0.0136,  0.0782]])),
  ('bias',
   tensor([0.2243]))
]
print(type(net[2].bias))
print(type(net[0].weight))
<class 'torch.nn.parameter.Parameter'>
<class 'torch.nn.parameter.Parameter'>
We can see that each parameter is represented as an instance of the Parameter class.
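A Parameter is essentially a tensor that the module registers automatically: it has requires_grad=True by default, it shows up in net.parameters(), and it gets a gradient once a backward pass has run. A small sketch of my own, reusing the net and X defined above:

p = net[2].bias                     # an nn.Parameter registered on the last Linear layer
print(p.requires_grad)              # True by default, so autograd tracks it
print(p.grad)                       # None until a backward pass has been run
net(X).sum().backward()             # one forward/backward pass
print(p.grad)                       # now holds the gradient with respect to the bias
print(len(list(net.parameters())))  # 4: weight and bias for each of the two Linear layers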
print(net[2].bias)
print(net[0].weight)
Parameter containing:
tensor([-0.1431,  0.1381, -0.2775,  0.0038, -0.0269,  0.0631, -0.1791,  0.1291],
       requires_grad=True)
Parameter containing:
tensor([[-0.4736,  0.2223, -0.0059,  0.4146],
        [-0.1052,  0.2813, -0.2315,  0.2931],
        [-0.4990, -0.1991, -0.1453,  0.0369],
        [-0.4676,  0.0669, -0.0069, -0.4932],
        [-0.4223,  0.0659, -0.3783, -0.1145],
        [-0.0460,  0.2386, -0.1586,  0.2148],
        [-0.0085, -0.3642,  0.0265,  0.0487],
        [ 0.2703, -0.2903,  0.1822, -0.3782]], requires_grad=True)
Combining the layer index with the corresponding attribute or method call, we can extract the network's biases or weights.
print(*[(name, param.shape) for name, param in net[0].named_parameters()])
print(*[(name, param.shape) for name, param in net.named_parameters()])
print(*net.named_parameters(), end="\n", sep="\n")  # * here is an unpacking operator, used to print each element of the sequence
('weight', torch.Size([8, 4])) ('bias', torch.Size([8]))
('0.weight', torch.Size([8, 4])) ('0.bias', torch.Size([8])) ('2.weight', torch.Size([1, 8])) ('2.bias', torch.Size([1]))
('0.weight', Parameter containing:
tensor([[ 0.3700,  0.3270, -0.3741, -0.1365],
        [ 0.2200,  0.0786,  0.1241, -0.2834],
        [ 0.3143,  0.3718,  0.3278,  0.0949],
        [ 0.1565,  0.4639, -0.1515, -0.4962],
        [ 0.3102, -0.0025, -0.0099, -0.4132],
        [ 0.1754, -0.1320, -0.3762, -0.1371],
        [-0.3860, -0.0369,  0.3743, -0.0892],
        [ 0.0280, -0.2877, -0.1884,  0.2915]], requires_grad=True))
('0.bias', Parameter containing:
tensor([ 0.4722, -0.4143,  0.0858, -0.2280,  0.4349,  0.3954,  0.0971, -0.1192],
       requires_grad=True))
('2.weight', Parameter containing:
tensor([[ 0.0984,  0.0207, -0.1292,  0.0530, -0.0693,  0.0413, -0.2231, -0.3125]],
       requires_grad=True))
('2.bias', Parameter containing:
tensor([0.1844], requires_grad=True))
For more on the unpacking operator, see: Python * ** 打包解包详解 - 掘金 (juejin.cn)
I have separated the three outputs.
- The first unpacks the parameter names and shapes of layer 0 of net.
- The second unpacks the parameter names and shapes of every layer of net.
- The third unpacks net's full list of parameters (a small usage sketch follows this list).
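Beyond printing, named_parameters() is also the usual way to work over all parameters at once. For instance, a sketch of my own using the net defined above, counting the scalar parameters and filtering by name:

# total number of scalar parameters: 4*8 + 8 + 8*1 + 1 = 49
total = sum(param.numel() for name, param in net.named_parameters())
print(total)

# keep only the weight matrices by filtering on the parameter name
weights = {name: param for name, param in net.named_parameters() if name.endswith("weight")}
print(list(weights.keys()))   # ['0.weight', '2.weight']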
print(net.state_dict()['2.bias'].data)
print(net.state_dict()['0.weight'])
tensor([0.1844])
tensor([[ 0.3700,  0.3270, -0.3741, -0.1365],
        [ 0.2200,  0.0786,  0.1241, -0.2834],
        [ 0.3143,  0.3718,  0.3278,  0.0949],
        [ 0.1565,  0.4639, -0.1515, -0.4962],
        [ 0.3102, -0.0025, -0.0099, -0.4132],
        [ 0.1754, -0.1320, -0.3762, -0.1371],
        [-0.3860, -0.0369,  0.3743, -0.0892],
        [ 0.0280, -0.2877, -0.1884,  0.2915]])
Whether or not you append .data at the end, the parameter values are printed directly.
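The difference is that .data returns the underlying values as a plain tensor that autograd does not track, while the Parameter itself still carries requires_grad=True. A quick sketch of my own, again using the net from above:

b = net[2].bias
print(b)                      # Parameter containing: tensor([...], requires_grad=True)
print(b.data)                 # the same values, but as a plain tensor
print(b.data.requires_grad)   # False: .data is not tracked by autograd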
def block1():
    return nn.Sequential(nn.Linear(4, 8), nn.ReLU(),
                         nn.Linear(8, 4), nn.ReLU())

def block2():
    net = nn.Sequential()
    for i in range(4):
        # nest block1 in here
        net.add_module(f'block {i}', block1())
    net[2] = nn.Linear(4, 4)
    return net

X = torch.rand(size=(2, 4))
rgnet = nn.Sequential(block2(), nn.Linear(4, 1))
rgnet(X)
This defines a nested network. I sketched a quick diagram of it, which looks like this:
[figure: hand-drawn diagram of the nested network]
print(rgnet)
Printing the network, we can see its structure:
Sequential(
(0): Sequential(
(block 0): Sequential(
(0): Linear(in_features=4, out_features=8, bias=True)
(1): ReLU()
(2): Linear(in_features=8, out_features=4, bias=True)
(3): ReLU()
)
(block 1): Linear(in_features=4, out_features=4, bias=True)
(block 2): Sequential(
(0): Linear(in_features=4, out_features=8, bias=True)
(1): ReLU()
(2): Linear(in_features=8, out_features=4, bias=True)
(3): ReLU()
)
)
(1): Linear(in_features=4, out_features=1, bias=True)
)
Since the layers are nested within each other, we can index into them the same way we index nested lists. For example:
print(rgnet[0][2][0].bias.data)
print(rgnet.state_dict()['0.block 2.0.bias'])
tensor([-0.1555,  0.4410, -0.4920,  0.1434,  0.1243,  0.4114, -0.0883,  0.1387])
tensor([-0.1555,  0.4410, -0.4920,  0.1434,  0.1243,  0.4114, -0.0883,  0.1387])
In this case, the first way of reaching a parameter simply adds one more index level.
The second way first names the block, and then the layer within that block.
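Because the blocks are nested, the state_dict keys are built by joining the module names with dots. A quick way to see every available key (a sketch of my own, using the rgnet defined above):

for name, param in rgnet.named_parameters():
    print(name, param.shape)
# prints keys such as '0.block 0.0.weight', '0.block 0.0.bias', ..., '1.weight', '1.bias'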
This article is part of my notes on Dive into Deep Learning (《动手学深度学习》). GitHub repo: DeepLearningNotes/d2l (github.com)
Still being updated…