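Goal: convert the text recognition model from PaddlePaddle's PaddleOCR into a PyTorch model.
Process: for the small mobilenetv3_small model, the backbone parameters were converted successfully, but the head (two bidirectional LSTMs) was not. The number of classes differs between the two models, and the head is made up of two fc layers and two LSTM layers. Because the fc-layer parameters depend on the number of classes, the parameters of the bidirectional LSTM head could not be converted.
Result: only the backbone parameters were converted and copied.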
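The failure mode described above can be illustrated with a small sketch that separates parameters whose shapes match the target model from those that do not. It uses plain NumPy arrays; the parameter names and sizes are hypothetical and are not taken from the actual project.

```python
import numpy as np


def split_by_shape(src_params, dst_shapes):
    """Split source parameters into those that fit the target model and those that do not.

    src_params: dict mapping parameter name -> numpy array (source weights)
    dst_shapes: dict mapping parameter name -> shape expected by the target model
    """
    copyable, skipped = {}, {}
    for name, array in src_params.items():
        if name in dst_shapes and tuple(array.shape) == tuple(dst_shapes[name]):
            copyable[name] = array
        else:
            skipped[name] = array
    return copyable, skipped


# Hypothetical names and sizes, for illustration only.
src = {
    "backbone.conv1.weight": np.zeros((16, 3, 3, 3), dtype=np.float32),
    "head.fc.weight": np.zeros((6625, 96), dtype=np.float32),  # sized for the source class count
}
dst = {
    "backbone.conv1.weight": (16, 3, 3, 3),
    "head.fc.weight": (37, 96),  # the target model has a different number of classes
}

copyable, skipped = split_by_shape(src, dst)
print("copy:", sorted(copyable))
print("skip:", sorted(skipped))
```

A check like this, run before copying, makes it obvious which layers are tied to the class count and need retraining rather than conversion.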
Original PaddlePaddle project: https://github.com/PaddlePaddle/PaddleOCR
Brief outline of the process:
1) Load the PaddlePaddle model parameters;
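import os
import shutil
import tempfile

from paddle import fluid


def _load_state(path):
    """
    Load the PaddlePaddle parameters.
    :param path: checkpoint path prefix, without the file extension
    :return: the loaded program state
    """
    if os.path.exists(path + '.pdopt'):
        # XXX another hack to ignore the optimizer state:
        # copy only the .pdparams file to a temporary directory and load from there
        tmp = tempfile.mkdtemp()
        dst = os.path.join(tmp, os.path.basename(os.path.normpath(path)))
        shutil.copy(path + '.pdparams', dst + '.pdparams')
        state = fluid.io.load_program_state(dst)
        shutil.rmtree(tmp)
    else:
        state = fluid.io.load_program_state(path)
    return state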
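Before copying anything, it helps to list the loaded parameter names and shapes, since the copy routines match layers by substrings of these names. The following is a minimal sketch, assuming the loaded state behaves like a dict of NumPy arrays; the stand-in dict and its names are hypothetical so that the snippet runs without PaddlePaddle installed.

```python
import numpy as np


def summarize_state(state):
    """Return sorted (name, shape) pairs for every array-valued entry in `state`."""
    return sorted(
        (name, tuple(value.shape))
        for name, value in state.items()
        if isinstance(value, np.ndarray)
    )


# Stand-in for the dict returned by _load_state; names and shapes are hypothetical.
state = {
    "conv1_weights": np.zeros((16, 3, 3, 3), dtype=np.float32),
    "conv1_bn_scale": np.ones(16, dtype=np.float32),
    "conv1_bn_offset": np.zeros(16, dtype=np.float32),
}

for name, shape in summarize_state(state):
    print(name, shape)
```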
2) Build the PyTorch model;
3) Copy the parameters and save the model;
import torch
from torch.nn import BatchNorm1d, BatchNorm2d, Conv2d


# The functions below are methods of a class (its definition is not shown here).
def init_model(self):
    for n, m in self.net_pytorch.named_modules():
        if isinstance(m, BatchNorm2d):
            self.bn_init(n, m)
        elif isinstance(m, Conv2d):
            self.conv_init(n, m)
        # elif isinstance(m, Linear):
        #     self.fc_init(n, m)
        # elif isinstance(m, PReLU):
        #     self.prelu_init(n, m)
        elif isinstance(m, BatchNorm1d):
            self.bn_init(n, m)


def bn_init(self, layer, m):
    # Iterate over a copy, because keys are removed from self.list_layers inside the loop.
    for key in list(self.list_layers):
        if (layer in key) and ('bn' in key):
            print(key)  # , ' -- shape: ', self.state_pp[key].shape)
            if 'scale' in key:
                m.weight.data.copy_(torch.FloatTensor(self.state_pp[key]))
                self.list_layers.remove(key)
            elif 'offset' in key:
                m.bias.data.copy_(torch.FloatTensor(self.state_pp[key]))
                self.list_layers.remove(key)
            elif 'mean' in key:
                m.running_mean.copy_(torch.FloatTensor(self.state_pp[key]))
                self.list_layers.remove(key)
            elif 'variance' in key:
                m.running_var.copy_(torch.FloatTensor(self.state_pp[key]))
                self.list_layers.remove(key)


def conv_init(self, layer, m):
    # for pr in net.params:
    layer_ = layer + '_'
    # Iterate over a copy, because keys are removed from self.list_layers inside the loop.
    for key in list(self.list_layers):
        if (layer_ in key) and ('bn' not in key):
            print(key)  # , ' -- shape: ', self.state_pp[key].shape)
            if 'weights' in key:
                m.weight.data.copy_(torch.FloatTensor(self.state_pp[key]))
            elif 'offset' in key:
                # Convert the numpy array to a tensor before copying
                m.bias.data.copy_(torch.FloatTensor(self.state_pp[key]))
            self.list_layers.remove(key)
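The saving half of step 3 is not shown above. The following is a minimal sketch of one way to do it, assuming the converted network is an ordinary torch.nn.Module: save only the state_dict, then reload it into a fresh model to confirm the round trip. The tiny network, the random arrays, and the file name converted.pth are stand-ins for illustration.

```python
import os
import tempfile

import numpy as np
import torch
from torch import nn

# Tiny stand-in network; the real project builds the full recognition model.
net = nn.Sequential(nn.Conv2d(3, 4, kernel_size=3, bias=False), nn.BatchNorm2d(4))

# Hypothetical source arrays, already in the layout the PyTorch layers expect.
conv_w = np.random.rand(4, 3, 3, 3).astype(np.float32)
bn_scale = np.random.rand(4).astype(np.float32)

with torch.no_grad():
    net[0].weight.copy_(torch.from_numpy(conv_w))
    net[1].weight.copy_(torch.from_numpy(bn_scale))

# Save only the state_dict, then reload it into a freshly built model.
save_dir = tempfile.mkdtemp()
ckpt_path = os.path.join(save_dir, "converted.pth")
torch.save(net.state_dict(), ckpt_path)

restored = nn.Sequential(nn.Conv2d(3, 4, kernel_size=3, bias=False), nn.BatchNorm2d(4))
restored.load_state_dict(torch.load(ckpt_path))

roundtrip_ok = np.allclose(restored[0].weight.detach().numpy(), conv_w)
print("round trip ok:", roundtrip_ok)
```

Saving the state_dict rather than the whole module keeps the checkpoint independent of the class that performed the conversion.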
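Technical difficulties:
1) How to read each layer's outputs in PaddlePaddle;
2) Copying the parameters of the bidirectional LSTM;
3) When verifying that the conversion succeeded: reading an image, feeding it through the network, and obtaining the image features after each layer;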
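For the PyTorch side of the verification in point 3, forward hooks are one way to collect per-layer features so they can be compared against the activations dumped from PaddlePaddle. The sketch below uses a tiny stand-in network and assumes every hooked module returns a single tensor (an LSTM returns a tuple and would need separate handling). The helper names capture_outputs and max_abs_diff are hypothetical.

```python
import numpy as np
import torch
from torch import nn


def capture_outputs(model, x):
    """Run `model` on `x` and return {module_name: output as numpy array} for leaf modules."""
    features, handles = {}, []

    def make_hook(name):
        def hook(module, inputs, output):
            features[name] = output.detach().cpu().numpy()
        return hook

    for name, module in model.named_modules():
        if len(list(module.children())) == 0:  # leaf modules only
            handles.append(module.register_forward_hook(make_hook(name)))

    model.eval()
    with torch.no_grad():
        model(x)
    for handle in handles:
        handle.remove()
    return features


def max_abs_diff(ref, got):
    """Largest absolute difference per layer name present in both dicts."""
    return {k: float(np.abs(ref[k] - got[k]).max()) for k in ref if k in got}


net = nn.Sequential(nn.Conv2d(3, 4, 3), nn.BatchNorm2d(4), nn.ReLU())
x = torch.rand(1, 3, 8, 8)
feats = capture_outputs(net, x)

# In practice `ref` would hold the corresponding PaddlePaddle activations; here the
# PyTorch outputs are compared with themselves just to exercise the code.
print(max_abs_diff(feats, feats))
```

Comparing layer by layer, rather than only the final output, shows the first layer at which the two frameworks diverge.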
Project link: https://github.com/maomaoyuchengzi/paddlepaddle_param_to_pyotrch