PyTorch Study Notes (5): Model Definition, Modification, and Saving

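Recommended earlier material:
1. PyTorch in Practice notes (GoAI's blog, CSDN)
2. PyTorch beginner tutorial (GoAI's blog, CSDN)
Series contents:
PyTorch Study Notes (1): Installing the PyTorch environment
PyTorch Study Notes (2): Introduction and basics
PyTorch Study Notes (3): Main building blocks of PyTorch
PyTorch Study Notes (4): Basic hands-on practice
PyTorch Study Notes (5): Model definition, modification, and saving
More to follow.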
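I. Ways to define a PyTorch model

Module is the model-construction class provided by torch.nn (nn.Module). It is the base class of all neural network modules, and we subclass it to define our own models.

A PyTorch model definition has two main parts: initialization of the components (__init__) and the definition of the data flow (forward).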
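The two parts can be seen in a minimal sketch. The class name TwoLayerNet and the layer sizes are illustrative choices, not something these notes prescribe.

```python
import torch
import torch.nn as nn


class TwoLayerNet(nn.Module):
    """Hypothetical example model: 784 -> 256 -> 10."""

    def __init__(self, in_features=784, hidden=256, num_classes=10):
        super().__init__()  # must run before any submodule is assigned
        # Part 1: initialise the components
        self.fc1 = nn.Linear(in_features, hidden)
        self.act = nn.ReLU()
        self.fc2 = nn.Linear(hidden, num_classes)

    def forward(self, x):
        # Part 2: define how data flows through the components
        return self.fc2(self.act(self.fc1(x)))


model = TwoLayerNet()
x = torch.randn(4, 784)  # a batch of 4 flattened inputs
y = model(x)             # calling the module runs forward()
print(y.shape)           # torch.Size([4, 10])
```

Calling model(x) rather than model.forward(x) is the usual convention, because the call also runs any registered hooks.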

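On top of nn.Module, a PyTorch model can be defined in three ways: Sequential, ModuleList, and ModuleDict.
1. Sequential (the corresponding module is nn.Sequential())
When the forward computation simply chains layers one after another, the Sequential class offers a simpler way to define the model. It accepts either an ordered dictionary (OrderedDict) of submodules or a series of submodules as arguments and adds the Module instances one by one; the forward computation then applies these instances in the order they were added. A simplified re-implementation of Sequential helps make this concrete: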

from collections import OrderedDict

import torch.nn as nn


class MySequential(nn.Module):
    def __init__(self, *args):
        super(MySequential, self).__init__()
        if len(args) == 1 and isinstance(args[0], OrderedDict):
            # A single OrderedDict was passed in
            for key, module in args[0].items():
                # add_module stores the module in self._modules (an OrderedDict)
                self.add_module(key, module)
        else:
            # A series of Modules was passed in
            for idx, module in enumerate(args):
                self.add_module(str(idx), module)

    def forward(self, input):
        # self._modules is an OrderedDict, so members are visited
        # in the order in which they were added
        for module in self._modules.values():
            input = module(input)
        return input

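Now let's see how to define a model with Sequential. Simply list the layers in order. Depending on how the layers are named, there are two ways to list them:

Direct listing (layers are named by their index):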

import torch.nn as nn

net = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)
print(net)

Sequential(
  (0): Linear(in_features=784, out_features=256, bias=True)
  (1): ReLU()
  (2): Linear(in_features=256, out_features=10, bias=True)
)

With an OrderedDict (layers are named explicitly):

import collections
import torch.nn as nn

net2 = nn.Sequential(collections.OrderedDict([
    ('fc1', nn.Linear(784, 256)),
    ('relu1', nn.ReLU()),
    ('fc2', nn.Linear(256, 10)),
]))
print(net2)

Sequential(
  (fc1): Linear(in_features=784, out_features=256, bias=True)
  (relu1): ReLU()
  (fc2): Linear(in_features=256, out_features=10, bias=True)
)

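As shown, the advantage of Sequential is that it is simple and readable, and a model defined this way needs no hand-written forward because the order is already fixed. The cost is flexibility: for example, Sequential is a poor fit when an external input has to be injected in the middle of the model. Choose according to the actual requirements.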
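As a small check of the "order is already fixed" point, the sketch below runs a batch through a named Sequential and compares the result with calling the layers by hand. The input size and batch size are arbitrary choices for illustration.

```python
import collections

import torch
import torch.nn as nn

net = nn.Sequential(collections.OrderedDict([
    ("fc1", nn.Linear(784, 256)),
    ("relu1", nn.ReLU()),
    ("fc2", nn.Linear(256, 10)),
]))

x = torch.randn(2, 784)
out = net(x)  # layers run in the order they were added

# The same computation written out layer by layer
manual = net.fc2(net.relu1(net.fc1(x)))
print(out.shape, torch.allclose(out, manual))

# A layer can be reached by name or by position
print(net.fc1 is net[0])
```

Because the whole forward pass is this fixed chain, there is no place to feed in a second tensor halfway through, which is the flexibility limit mentioned above.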
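2. ModuleList (the corresponding module is nn.ModuleList())
ModuleList takes a list of submodules (or layers; each must be an nn.Module) as input and supports append and extend like a Python list. The weights of the submodules or layers are registered with the network automatically.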

import torch.nn as nn

net = nn.ModuleList([nn.Linear(784, 256), nn.ReLU()])
net.append(nn.Linear(256, 10))  # append, as with a list
print(net[-1])  # index access, as with a list
print(net)

Linear(in_features=256, out_features=10, bias=True)
ModuleList(
  (0): Linear(in_features=784, out_features=256, bias=True)
  (1): ReLU()
  (2): Linear(in_features=256, out_features=10, bias=True)
)

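Note in particular that nn.ModuleList does not define a network; it only stores modules together. The order of elements in a ModuleList does not by itself determine their order in the network: the model definition is complete only once forward specifies the order in which the layers are applied. In practice a for loop is enough: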

# Template only: the "..." parts are left to be filled in
class model(nn.Module):
    def __init__(self, ...):
        super().__init__()
        self.modulelist = ...
        ...

    def forward(self, x):
        for layer in self.modulelist:
            x = layer(x)
        return x
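A runnable version of this pattern might look like the sketch below. The names MLP and PlainListMLP and the layer sizes are assumptions made for illustration; the second class is there only to show what is lost when a plain Python list is used instead of nn.ModuleList.

```python
import torch
import torch.nn as nn


class MLP(nn.Module):
    def __init__(self, sizes=(784, 256, 10)):
        super().__init__()
        layers = []
        for i in range(len(sizes) - 1):
            layers.append(nn.Linear(sizes[i], sizes[i + 1]))
            if i < len(sizes) - 2:  # no activation after the last layer
                layers.append(nn.ReLU())
        self.modulelist = nn.ModuleList(layers)

    def forward(self, x):
        # forward decides the order in which the stored layers are applied
        for layer in self.modulelist:
            x = layer(x)
        return x


class PlainListMLP(nn.Module):
    """Same layers in a plain list: their parameters are not registered."""

    def __init__(self):
        super().__init__()
        self.layers = [nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10)]


model = MLP()
out = model(torch.randn(3, 784))
n_registered = len(list(model.parameters()))         # two weights + two biases
n_plain = len(list(PlainListMLP().parameters()))     # nothing is visible
print(out.shape, n_registered, n_plain)
```

An optimizer built from PlainListMLP().parameters() would have nothing to update, which is why the automatic registration done by ModuleList matters.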

3. ModuleDict (the corresponding module is nn.ModuleDict())
ModuleDict works like ModuleList, except that it makes it easier to give the layers of a network names.

import torch.nn as nn

net = nn.ModuleDict({
    'linear': nn.Linear(784, 256),
    'act': nn.ReLU(),
})
net['output'] = nn.Linear(256, 10)  # add a layer
print(net['linear'])  # access by key
print(net.output)     # access by attribute
print(net)

Linear(in_features=784, out_features=256, bias=True)
Linear(in_features=256, out_features=10, bias=True)
ModuleDict(
  (act): ReLU()
  (linear): Linear(in_features=784, out_features=256, bias=True)
  (output): Linear(in_features=256, out_features=10, bias=True)
)
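One place where named layers are handy is choosing a branch by key inside forward. The sketch below assumes a hypothetical model, Switchable, whose activation is selected by name; it is an illustration of ModuleDict lookup rather than a pattern taken from these notes.

```python
import torch
import torch.nn as nn


class Switchable(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(784, 256)
        # Named alternatives; both are registered as submodules
        self.acts = nn.ModuleDict({
            "relu": nn.ReLU(),
            "tanh": nn.Tanh(),
        })
        self.out = nn.Linear(256, 10)

    def forward(self, x, act="relu"):
        x = self.fc(x)
        x = self.acts[act](x)  # look the layer up by its name
        return self.out(x)


model = Switchable()
x = torch.randn(2, 784)
y_relu = model(x, act="relu")
y_tanh = model(x, act="tanh")
print(y_relu.shape, y_tanh.shape)
```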

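Comparison of the three approaches

Sequential suits quick experiments, since __init__ and forward do not both have to be written;
ModuleList and ModuleDict are very convenient when an identical layer has to be repeated many times: one line can replace many;
When a later layer needs information from earlier layers, as in the residual computation of ResNets where the current result is merged with an earlier one, ModuleList/ModuleDict is usually more convenient.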
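The last two points can be combined in one short sketch: a list comprehension builds several identical blocks in a single line, and forward adds each block's input back to its output. ResidualMLP, dim, and depth are hypothetical names and values chosen for the example; this is not a real ResNet.

```python
import torch
import torch.nn as nn


class ResidualMLP(nn.Module):
    def __init__(self, dim=64, depth=4):
        super().__init__()
        # "One line instead of many": depth identical blocks
        self.blocks = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(depth)]
        )

    def forward(self, x):
        for block in self.blocks:
            x = x + block(x)  # merge the block output with its input
        return x


model = ResidualMLP()
out = model(torch.randn(5, 64))
print(out.shape, len(model.blocks))
```

A plain Sequential could not express the x + block(x) step, because it only passes each output straight to the next layer.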

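II. Building complex networks quickly from model blocks
Basic approach:

Analyse the model blocks
Implement the model blocks
Assemble the model from the blocks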

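Take U-Net as an example. It is a segmentation model with an encoder-decoder (contracting and expanding) structure; skip connections concatenate the encoder feature maps with the upsampled decoder features, so fine spatial detail lost during downsampling is available again when the segmentation map is reconstructed.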

[Figure: U-Net architecture]
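Model block analysis

DoubleConv: the two convolutions inside each sub-block
Down: the downsampling connection between blocks on the left side, implemented with max pooling
Up: the upsampling connection between blocks on the right side
OutConv: the output layer
The lateral connections between blocks, the connection between the input and the bottom of the U-Net, and similar computations are standalone operations that can be implemented in the forward function

Model block implementation
Taking U-Net as the example: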

import torch
import torch.nn as nn
import torch.nn.functional as F


# Two convolutions: conv 3x3, ReLU
class DoubleConv(nn.Module):
    """(convolution => [BN] => ReLU) * 2"""

    def __init__(self, in_channels, out_channels, mid_channels=None):
        super().__init__()
        if not mid_channels:
            mid_channels = out_channels
        self.double_conv = nn.Sequential(
            nn.Conv2d(in_channels, mid_channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(mid_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_channels, out_channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.double_conv(x)


# Downsampling: max pool 2x2
class Down(nn.Module):
    """Downscaling with maxpool then double conv"""

    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.maxpool_conv = nn.Sequential(
            nn.MaxPool2d(2),
            DoubleConv(in_channels, out_channels),
        )

    def forward(self, x):
        return self.maxpool_conv(x)


# Upsampling: up-conv 2x2
class Up(nn.Module):
    """Upscaling then double conv"""

    def __init__(self, in_channels, out_channels, bilinear=True):
        super().__init__()
        # if bilinear, use the normal convolutions to reduce the number of channels
        if bilinear:
            self.up = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
            self.conv = DoubleConv(in_channels, out_channels, in_channels // 2)
        else:
            self.up = nn.ConvTranspose2d(in_channels, in_channels // 2, kernel_size=2, stride=2)
            self.conv = DoubleConv(in_channels, out_channels)

    def forward(self, x1, x2):
        x1 = self.up(x1)
        # tensors are NCHW: dim 2 is height, dim 3 is width
        diffY = x2.size()[2] - x1.size()[2]
        diffX = x2.size()[3] - x1.size()[3]
        # pad x1 so that it matches the spatial size of x2
        x1 = F.pad(x1, [diffX // 2, diffX - diffX // 2,
                        diffY // 2, diffY - diffY // 2])
        # skip connection: concatenate along the channel dimension
        x = torch.cat([x2, x1], dim=1)
        return self.conv(x)


# Output: conv 1x1
class OutConv(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(OutConv, self).__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, x):
        return self.conv(x)

Assembling the U-Net model from the blocks

class UNet(nn.Module):
    def __init__(self, n_channels, n_classes, bilinear=True):
        super(UNet, self).__init__()
        self.n_channels = n_channels
        self.n_classes = n_classes
        self.bilinear = bilinear

        self.inc = DoubleConv(n_channels, 64)
        self.down1 = Down(64, 128)
        self.down2 = Down(128, 256)
        self.down3 = Down(256, 512)
        factor = 2 if bilinear else 1
        self.down4 = Down(512, 1024 // factor)
        self.up1 = Up(1024, 512 // factor, bilinear)
        self.up2 = Up(512, 256 // factor, bilinear)
        self.up3 = Up(256, 128 // factor, bilinear)
        self.up4 = Up(128, 64, bilinear)
        self.outc = OutConv(64, n_classes)

    def forward(self, x):
        x1 = self.inc(x)
        x2 = self.down1(x1)
        x3 = self.down2(x2)
        x4 = self.down3(x3)
        x5 = self.down4(x4)
        x = self.up1(x5, x4)
        x = self.up2(x, x3)
        x = self.up3(x, x2)
        x = self.up4(x, x1)
        logits = self.outc(x)
        return logits
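To see the block-assembly idea and the pad-then-concatenate step in isolation, here is a deliberately reduced, self-contained sketch with a single down/up level. TinyUNet, conv_block, the channel width, and the odd input size are all assumptions made for illustration; it is not the full U-Net above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_block(in_ch, out_ch):
    # A single conv => BN => ReLU unit (half of a DoubleConv)
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class TinyUNet(nn.Module):
    def __init__(self, n_channels=3, n_classes=2, width=8):
        super().__init__()
        self.inc = conv_block(n_channels, width)
        self.down = nn.Sequential(nn.MaxPool2d(2), conv_block(width, width * 2))
        self.up = nn.ConvTranspose2d(width * 2, width, kernel_size=2, stride=2)
        self.fuse = conv_block(width * 2, width)  # input: skip + upsampled
        self.outc = nn.Conv2d(width, n_classes, kernel_size=1)

    def forward(self, x):
        x1 = self.inc(x)     # encoder features kept for the skip connection
        x2 = self.down(x1)   # half resolution (rounded down)
        u = self.up(x2)      # back to roughly the input resolution
        # Odd input sizes lose a pixel in pooling, so pad before concatenating
        diff_y = x1.size(2) - u.size(2)
        diff_x = x1.size(3) - u.size(3)
        u = F.pad(u, [diff_x // 2, diff_x - diff_x // 2,
                      diff_y // 2, diff_y - diff_y // 2])
        x = torch.cat([x1, u], dim=1)  # skip connection along channels
        return self.outc(self.fuse(x))


model = TinyUNet().eval()
with torch.no_grad():
    logits = model(torch.randn(1, 3, 33, 47))  # odd height and width on purpose
print(logits.shape)  # torch.Size([1, 2, 33, 47])
```

The output keeps the spatial size of the input and has one channel per class, which is the shape a per-pixel segmentation loss expects.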

III. Modifying PyTorch models
1. Modifying model layers
Take ResNet50, a model predefined in the torchvision library, as the example. Its structure is as follows:

import torchvision.models as models

net = models.resnet50()
print(net)

ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
  ..............
  (avgpool): AdaptiveAvgPool2d(output_size=(1, 1))
  (fc): Linear(in_features=2048, out_features=1000, bias=True)
)

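The final fully connected layer (fc) has 1000 output nodes to match the ImageNet pre-trained weights. Suppose we want to use this ResNet for a 10-class problem: we should modify the fc layer so that it outputs 10 nodes. Suppose also that a single fully connected layer seems too few and we want to add another.
This can be done as follows: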

from collections import OrderedDict

import torch.nn as nn

classifier = nn.Sequential(OrderedDict([
    ('fc1', nn.Linear(2048, 128)),
    ('relu1', nn.ReLU()),
    ('dropout1', nn.Dropout(0.5)),
    ('fc2', nn.Linear(128, 10)),
    ('output', nn.Softmax(dim=1)),
]))

# Replace the original fc layer of net with the new classifier
net.fc = classifier

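This replaces the last layer of the model (net), named "fc", with the structure we defined ourselves and stored in the variable classifier; inside net the submodule is still called fc. The structure is defined with Sequential + OrderedDict, and the model can now be used for the 10-class task.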
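The replacement can be tried without downloading anything by using a stand-in network. In the sketch below, TinyBackbone is a hypothetical substitute for resnet50 that only mimics its tail (pooled 2048-dimensional features followed by fc); reading net.fc.in_features instead of hard-coding 2048 is a small convenience added here.

```python
from collections import OrderedDict

import torch
import torch.nn as nn


class TinyBackbone(nn.Module):
    """Hypothetical stand-in for resnet50: conv features -> pooling -> fc."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 2048, kernel_size=3, padding=1), nn.ReLU()
        )
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(2048, 1000)

    def forward(self, x):
        x = self.avgpool(self.features(x))
        return self.fc(torch.flatten(x, 1))


net = TinyBackbone()
in_features = net.fc.in_features  # 2048, read from the layer being replaced

net.fc = nn.Sequential(OrderedDict([
    ("fc1", nn.Linear(in_features, 128)),
    ("relu1", nn.ReLU()),
    ("dropout1", nn.Dropout(0.5)),
    ("fc2", nn.Linear(128, 10)),
    ("output", nn.Softmax(dim=1)),
]))

net.eval()  # disable dropout for a deterministic check
with torch.no_grad():
    probs = net(torch.randn(2, 3, 8, 8))
print(probs.shape, probs.sum(dim=1))
```

Because forward refers to self.fc, assigning a new module to that attribute is enough; no change to forward is needed.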
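2. Adding an external input
Sometimes during training the model needs extra information besides its existing input. For example, in a CNN we may want to feed in other information associated with the image alongside the image itself, which means adding an extra input variable to the existing network. The basic idea is to treat the part of the original model before the insertion point as a single unit, and to define in forward how the unchanged part of the original model, the added input, and the subsequent layers are connected. That completes the modification.
We take the torchvision resnet50 model as the base, and the task is still 10-class classification. The difference is that we want to reuse the existing model structure and add an extra input variable, add_variable, at the second-to-last layer to help the prediction. The implementation is as follows: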

import torch
import torch.nn as nn


class Model(nn.Module):
    def __init__(self, net):
        super(Model, self).__init__()
        self.net = net
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(0.5)
        # 1000 features from net plus 1 extra input value
        self.fc_add = nn.Linear(1001, 10, bias=True)
        self.output = nn.Softmax(dim=1)

    def forward(self, x, add_variable):
        x = self.net(x)
        # append add_variable as one extra column
        x = torch.cat((self.dropout(self.relu(x)), add_variable.unsqueeze(1)), 1)
        x = self.fc_add(x)
        x = self.output(x)
        return x

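The key point of this implementation is that torch.cat concatenates the tensors. The torchvision resnet50 outputs a 1000-dimensional tensor. By modifying forward (and defining the layers it needs), we first pass this 1000-dimensional tensor through the activation and dropout layers, then concatenate it with the external input variable "add_variable", and finally map the result to the required output dimension of 10 with a fully connected layer.
The unsqueeze applied to "add_variable" makes its number of dimensions match that of the tensor output by net. This is the usual case when add_variable is a single value (scalar) per sample: its shape is then (batch_size, ), and a dimension of size 1 has to be added in the second position so that it can be passed to torch.cat together with the other tensor. For the unsqueeze operation, see section 2.1 and its accompanying code :)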
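The shapes involved can be checked on their own. In this sketch the random tensors stand in for the network output and the extra variable; the batch size of 4 is arbitrary.

```python
import torch

batch_size = 4
features = torch.randn(batch_size, 1000)  # stand-in for the output of net
add_variable = torch.randn(batch_size)    # one scalar per sample: shape (4,)

col = add_variable.unsqueeze(1)           # shape (4, 1)
combined = torch.cat((features, col), dim=1)
print(col.shape, combined.shape)          # torch.Size([4, 1]) torch.Size([4, 1001])
```

The 1001 columns of the concatenated tensor are why the following fully connected layer is declared with 1001 input features.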
After that, instantiate the modified model structure and it is ready to use:

import torchvision.models as models

net = models.resnet50()
model = Model(net).cuda()  # requires a CUDA-capable GPU

Also remember that two inputs must be passed in during training:

outputs = model(inputs, add_var)

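3. Adding an extra output
Sometimes during training, in addition to the final output, we need the result of some intermediate layer as an output so that extra supervision can be applied and better intermediate results obtained. The basic idea is to change what forward returns in the model definition.
We again use resnet50 on the 10-class task. On top of the model structure already defined, we output both the 1000-dimensional result of the second-to-last layer and the 10-dimensional result of the last layer. The implementation is as follows: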

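import torch.nn as nn


class Model(nn.Module):
    def __init__(self, net):
        super(Model, self).__init__()
        self.net = net
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(0.5)
        self.fc1 = nn.Linear(1000, 10, bias=True)
        self.output = nn.Softmax(dim=1)

    # add_variable is not used in this variant; it is kept (now optional)
    # so that calls of the form model(inputs, add_var) still work
    def forward(self, x, add_variable=None):
        x1000 = self.net(x)
        x10 = self.dropout(self.relu(x1000))
        x10 = self.fc1(x10)
        x10 = self.output(x10)
        # return both the final and the intermediate result
        return x10, x1000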

After that, instantiate the modified model structure and it is ready to use:

import torchvision.models as models

net = models.resnet50()
model = Model(net).cuda()  # requires a CUDA-capable GPU

# Remember that the model now returns two outputs during training:
out10, out1000 = model(inputs, add_var)
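How the two outputs might feed into "extra supervision" is sketched below. Several things here are assumptions rather than part of these notes: the backbone is a hypothetical nn.Linear stand-in for resnet50, the intermediate output is supervised with its own 1000-class labels, the auxiliary weight 0.3 is arbitrary, and the Softmax layer is dropped because nn.CrossEntropyLoss expects raw logits.

```python
import torch
import torch.nn as nn


class TwoOutputModel(nn.Module):
    def __init__(self, net):
        super().__init__()
        self.net = net
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(0.5)
        self.fc1 = nn.Linear(1000, 10)

    def forward(self, x):
        x1000 = self.net(x)  # intermediate result
        x10 = self.fc1(self.dropout(self.relu(x1000)))
        return x10, x1000    # both are raw logits


backbone = nn.Linear(32, 1000)  # hypothetical stand-in for resnet50
model = TwoOutputModel(backbone)

inputs = torch.randn(8, 32)
labels10 = torch.randint(0, 10, (8,))      # labels for the final output
labels1000 = torch.randint(0, 1000, (8,))  # assumed labels for the intermediate output

criterion = nn.CrossEntropyLoss()
aux_weight = 0.3  # assumed weighting of the auxiliary term

out10, out1000 = model(inputs)
loss = criterion(out10, labels10) + aux_weight * criterion(out1000, labels1000)
loss.backward()  # gradients flow through both outputs
print(out10.shape, out1000.shape, loss.item())
```

What the auxiliary term should be depends on the task; the point is only that a single combined loss lets both returned tensors contribute gradients.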

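IV. Saving and loading PyTorch models
1. Model storage formats
PyTorch models are mainly stored with the pkl, pt, and pth file extensions. In practice there is no difference between them, so they are not discussed in detail here. The references at the end of this section list some material found on the topic; interested readers can look further, and comments are welcome.
2. What is stored
A PyTorch model consists of two main parts: the model structure and the weights. The model is a class that inherits from nn.Module, and the weights are held in a dictionary (the key is the layer name, the value is the weight tensor). Accordingly there are two forms of storage: saving the whole model (structure and weights) and saving only the model weights.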
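The weight dictionary can be inspected directly through state_dict(). The small Sequential below is just an illustrative model; any nn.Module would do.

```python
import torch.nn as nn

net = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
state = net.state_dict()

# Keys are "<layer name>.<parameter name>", values are tensors
for name, tensor in state.items():
    print(name, tuple(tensor.shape))
# 0.weight (256, 784)
# 0.bias (256,)
# 2.weight (10, 256)
# 2.bias (10,)
```

The ReLU layer at index 1 has no parameters, so it contributes no entries.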

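import torch
from torchvision import models

model = models.resnet152(pretrained=True)

# save_dir is the path of the output file, defined elsewhere

# Save the whole model
torch.save(model, save_dir)
# Save only the model weights (note the call: state_dict())
torch.save(model.state_dict(), save_dir)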

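For PyTorch, the pt, pth, and pkl formats all support storing either the model weights or the whole model, with no difference in use.
Saving and loading the whole model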

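torch.save(model, save_dir)
# Note: recent PyTorch releases default torch.load to weights_only=True,
# in which case loading a whole pickled model needs weights_only=False
loaded_model = torch.load(save_dir)
loaded_model.cuda()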

Saving and loading the model weights

torch.save(model.state_dict(), save_dir)
loaded_dict = torch.load(save_dir)
loaded_model = models.resnet152()  # the model structure must be defined here
loaded_model.load_state_dict(loaded_dict)  # copy the saved weights into the model
loaded_model.cuda()
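A complete round trip for the weights-only route can be verified on a small model. In this sketch make_model, the layer sizes, and the temporary file name are hypothetical; map_location="cpu" is added so the file loads regardless of the device it was saved from.

```python
import os
import tempfile

import torch
import torch.nn as nn


def make_model():
    # The structure has to be rebuilt in code before weights can be loaded
    return nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))


model = make_model()
x = torch.randn(3, 8)

with tempfile.TemporaryDirectory() as tmp:
    save_path = os.path.join(tmp, "weights.pth")  # hypothetical file name
    torch.save(model.state_dict(), save_path)

    loaded_model = make_model()  # fresh, randomly initialised copy
    loaded_dict = torch.load(save_path, map_location="cpu")
    result = loaded_model.load_state_dict(loaded_dict)

# After loading, both models give the same output for the same input
same = torch.allclose(model(x), loaded_model(x))
print(result.missing_keys, result.unexpected_keys, same)
```

load_state_dict reports any keys that did not match, which makes it a useful check when the saved weights and the rebuilt structure may have drifted apart.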