Detailed Engineering Notes: Highly Accurate Face Detection (with Landmarks), YOLO5Face in C++


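Author: DefTruth

Editor: 极市平台

Overview
This article records the engineering issues I ran into with the C++ version of YOLO5Face, and briefly shows how to run YOLO5Face face detection (with landmarks) directly through the Lite.AI.ToolKit C++ toolbox. The examples cover ONNXRuntime C++, MNN, TNN and NCNN.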
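1. Introduction to YOLO5Face
GitHub: https://github.com/deepcam-cn/yolov5-face
ArXiv 2021: https://arxiv.org/abs/2105.1293
C++ implementation: https://github.com/DefTruth/YOLO5Face.lite.ai.toolkit
YOLO5Face is a new SOTA face detector (with landmarks) open-sourced by 深圳神目科技 (Shenzhen DeepCam) and LinkSprite Technologies. It is based on YOLOv5, with the backbone reworked so that the model is better suited to face detection. It also adds a regression head to the YOLOv5 network that predicts 5 landmarks, trained with Wing loss. Judging from the experiments in the paper, YOLO5Face does very well in both mean average precision (mAP) and speed. The paper gives a detailed accuracy and speed comparison against current SOTA methods, including the fairly recent SCRFD (arXiv 2021) and RetinaFace (CVPR 2020).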

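YOLO5Face also replaces the Focus layer of YOLOv5 with a Stem block. The authors argue that this improves the generalization ability of the network and lowers the computational complexity. The paper includes ablation experiments on replacing the Focus layer, and it does gain a few points of accuracy. Another benefit is that, once the Focus trick is gone, the C++ engineering work gets a little easier: with NCNN, at least, you no longer need to hand-craft an extra YoloV5FocusLayer custom layer.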

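If you want the algorithmic details of YOLO5Face, read the original paper, or this article:
深圳神目科技《YOLO5Face》：人脸检测在 WiderFace 实现 SOTA (YOLO5Face: SOTA face detection on WiderFace)
https://zhuanlan.zhihu.com/p/375966269
The rest of this article focuses on the C++ engineering side of YOLO5Face and on how to run YOLO5Face face detection (with landmarks) directly through the Lite.AI.ToolKit C++ toolbox (https://github.com/DefTruth/lite.ai.toolkit), with ONNXRuntime C++, MNN, TNN and NCNN examples.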

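2. C++ source code
The C++ source of YOLO5Face comes in four versions, ONNXRuntime, MNN, TNN and NCNN, all of which can be found in the lite.ai.toolkit toolbox (https://github.com/DefTruth/lite.ai.toolkit). This article shows how to run face detection with YOLO5Face directly on top of lite.ai.toolkit. Note that everything here is based on liblite.ai.toolkit.v0.1.0.dylib (https://github.com/DefTruth/yolox.lite.ai.toolkit/blob/main/lite.ai.toolkit/lib) built on macOS. macOS users can simply use the liblite.ai.toolkit.v0.1.0 dynamic library and the other dependency libraries shipped with this project. Users on other systems need to download the lite.ai.toolkit source and build it themselves. The lite.ai.toolkit C++ toolbox currently includes 80+ popular open-source models. I will not go into it further here; it is something I put together on the side, collecting models I came across while learning. Have a look if you are interested.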

yolo5face.cpp (https://github.com/DefTruth/lite.ai.toolkit/blob/main/lite/ort/cv/yolo5face.cpp)
yolo5face.h (https://github.com/DefTruth/lite.ai.toolkit/blob/main/lite/ort/cv/yolo5face.h)
mnn_yolo5face.cpp (https://github.com/DefTruth/lite.ai.toolkit/blob/main/lite/mnn/cv/mnn_yolo5face.cpp)
mnn_yolo5face.h (https://github.com/DefTruth/lite.ai.toolkit/blob/main/lite/mnn/cv/mnn_yolo5face.h)
tnn_yolo5face.cpp (https://github.com/DefTruth/lite.ai.toolkit/blob/main/lite/tnn/cv/tnn_yolo5face.cpp)
tnn_yolo5face.h (https://github.com/DefTruth/lite.ai.toolkit/blob/main/lite/tnn/cv/tnn_yolo5face.h)
ncnn_yolo5face.cpp (https://github.com/DefTruth/lite.ai.toolkit/blob/main/lite/ncnn/cv/ncnn_yolo5face.cpp)
ncnn_yolo5face.h (https://github.com/DefTruth/lite.ai.toolkit/blob/main/lite/ncnn/cv/ncnn_yolo5face.h)

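The ONNXRuntime C++, MNN, TNN and NCNN inference implementations have all been tested, so feel free to grab them. The example code for this article and the toolbox live in these repositories:

Code	Description	GitHub
YOLO5Face.lite.ai.toolkit	YOLO5Face C++ test cases, covering ONNXRuntime, NCNN, MNN and TNN	https://github.com/DefTruth/YOLO5Face.lite.ai.toolkit
Lite.AI.ToolKit	A lite C++ toolkit of awesome AI models. (An out-of-the-box C++ toolbox of AI models, put together on the side while learning new algorithms. It currently includes 80+ popular open-source models and has quietly reached nearly 800 stars. Stars and issues are welcome.)	https://github.com/DefTruth/lite.ai.toolkit

If you find it useful, consider giving it a star.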
3. Model files
3.1 ONNX model files
You can download them from Baidu Drive (https://pan.baidu.com/s/1elUGcx7CZkkjEoYhTMwTRQ), code: 8gin, or from this repository.

Class	Pretrained ONNX Files	Rename or Converted From (Repo)	Size
lite::cv::face::detect::YOLO5Face	yolov5face-blazeface-640x640.onnx	YOLO5Face (https://github.com/deepcam-cn/yolov5-face)	3.4 MB
lite::cv::face::detect::YOLO5Face	yolov5face-l-640x640.onnx	YOLO5Face	181 MB
lite::cv::face::detect::YOLO5Face	yolov5face-m-640x640.onnx	YOLO5Face	83 MB
lite::cv::face::detect::YOLO5Face	yolov5face-n-0.5-320x320.onnx	YOLO5Face	2.5 MB
lite::cv::face::detect::YOLO5Face	yolov5face-n-0.5-640x640.onnx	YOLO5Face	4.6 MB
lite::cv::face::detect::YOLO5Face	yolov5face-n-640x640.onnx	YOLO5Face	9.5 MB
lite::cv::face::detect::YOLO5Face	yolov5face-s-640x640.onnx	YOLO5Face	30 MB

3.2 MNN model files
The MNN model files can be downloaded from Baidu Drive (https://pan.baidu.com/s/1KyO-bCYUv6qPq2M8BH_Okg), code: 9v63, or from this repository.

Class	Pretrained MNN Files	Rename or Converted From (Repo)	Size
lite::mnn::cv::face::detect::YOLO5Face	yolov5face-blazeface-640x640.mnn	YOLO5Face	3.4 MB
lite::mnn::cv::face::detect::YOLO5Face	yolov5face-l-640x640.mnn	YOLO5Face	181 MB
lite::mnn::cv::face::detect::YOLO5Face	yolov5face-m-640x640.mnn	YOLO5Face	83 MB
lite::mnn::cv::face::detect::YOLO5Face	yolov5face-n-0.5-320x320.mnn	YOLO5Face	2.5 MB
lite::mnn::cv::face::detect::YOLO5Face	yolov5face-n-0.5-640x640.mnn	YOLO5Face	4.6 MB
lite::mnn::cv::face::detect::YOLO5Face	yolov5face-n-640x640.mnn	YOLO5Face	9.5 MB
lite::mnn::cv::face::detect::YOLO5Face	yolov5face-s-640x640.mnn	YOLO5Face	30 MB

3.3 TNN model files
The TNN model files can be downloaded from Baidu Drive (https://pan.baidu.com/s/1lvM2YKyUbEc5HKVtqITpcw), code: 6o6k, or from this repository.

Class	Pretrained TNN Files	Rename or Converted From (Repo)	Size
lite::tnn::cv::face::detect::YOLO5Face	yolov5face-blazeface-640x640.opt.tnnproto&tnnmodel	YOLO5Face	3.4 MB
lite::tnn::cv::face::detect::YOLO5Face	yolov5face-l-640x640.opt.tnnproto&tnnmodel	YOLO5Face	181 MB
lite::tnn::cv::face::detect::YOLO5Face	yolov5face-m-640x640.opt.tnnproto&tnnmodel	YOLO5Face	83 MB
lite::tnn::cv::face::detect::YOLO5Face	yolov5face-n-0.5-320x320.opt.tnnproto&tnnmodel	YOLO5Face	2.5 MB
lite::tnn::cv::face::detect::YOLO5Face	yolov5face-n-0.5-640x640.opt.tnnproto&tnnmodel	YOLO5Face	4.6 MB
lite::tnn::cv::face::detect::YOLO5Face	yolov5face-n-640x640.opt.tnnproto&tnnmodel	YOLO5Face	9.5 MB
lite::tnn::cv::face::detect::YOLO5Face	yolov5face-s-640x640.opt.tnnproto&tnnmodel	YOLO5Face	30 MB

3.4 NCNN model files
The NCNN model files can be downloaded from Baidu Drive (https://pan.baidu.com/s/1hlnqyNsFbMseGFWscgVhgQ), code: sc7f, or from this repository.

Class	Pretrained NCNN Files	Rename or Converted From (Repo)	Size
lite::ncnn::cv::face::detect::YOLO5Face	yolov5face-m-640x640.opt.param&bin	YOLO5Face	80 MB
lite::ncnn::cv::face::detect::YOLO5Face	yolov5face-n-0.5-320x320.opt.param&bin	YOLO5Face	1.7 MB
lite::ncnn::cv::face::detect::YOLO5Face	yolov5face-n-0.5-640x640.opt.param&bin	YOLO5Face	1.7 MB
lite::ncnn::cv::face::detect::YOLO5Face	yolov5face-n-640x640.opt.param&bin	YOLO5Face	6.5 MB
lite::ncnn::cv::face::detect::YOLO5Face	yolov5face-s-640x640.opt.param&bin	YOLO5Face	27 MB

4. API documentation
In lite.ai.toolkit, YOLO5Face is implemented by the following classes:

class LITE_EXPORTS lite::cv::face::detect::YOLO5Face;
class LITE_EXPORTS lite::mnn::cv::face::detect::YOLO5Face;
class LITE_EXPORTS lite::tnn::cv::face::detect::YOLO5Face;
class LITE_EXPORTS lite::ncnn::cv::face::detect::YOLO5Face;

Each class currently exposes one public interface, detect, which runs face detection.

public:
  /**
   * @param mat cv::Mat BGR format
   * @param detected_boxes_kps vector of BoxfWithLandmarks to catch detected boxes and landmarks.
   * @param score_threshold default 0.25f, only keep the result which >= score_threshold.
   * @param iou_threshold default 0.45f, iou threshold for NMS.
   * @param topk default 400, maximum output boxes after NMS.
   */
  void detect(const cv::Mat &mat, std::vector<types::BoxfWithLandmarks> &detected_boxes_kps,
              float score_threshold = 0.25f, float iou_threshold = 0.45f,
              unsigned int topk = 400);

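Parameters of the detect interface:

mat: a cv::Mat in BGR format.
detected_boxes_kps: a vector of BoxfWithLandmarks. Each element holds the detected box (a Boxf with members such as x1, y1, x2, y2, label and score) and the face landmarks (landmarks, 5 points), whose points member is a vector of cv::Point2f.
score_threshold: threshold on the classification (quality) score, default 0.25. Boxes scoring below it are discarded.
iou_threshold: IoU threshold used in NMS, default 0.45.
topk: default 400. Only the top k detections are kept.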
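To make the roles of score_threshold, iou_threshold and topk concrete, here is a minimal sketch in plain Python. It is not the toolbox's actual implementation: the function names and the tuple layout (x1, y1, x2, y2, score) are assumptions made for illustration, and the real C++ code may order or cap the steps differently.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float, float]  # x1, y1, x2, y2, score


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_and_nms(boxes: List[Box], score_threshold: float = 0.25,
                   iou_threshold: float = 0.45, topk: int = 400) -> List[Box]:
    # 1) Drop candidates that score below the threshold.
    candidates = [b for b in boxes if b[4] >= score_threshold]
    # 2) Visit the remaining candidates from highest to lowest score.
    candidates.sort(key=lambda b: b[4], reverse=True)
    kept: List[Box] = []
    for cand in candidates:
        if len(kept) >= topk:  # 4) cap the number of output boxes
            break
        # 3) Keep a candidate only if it does not overlap a kept box too much.
        if all(iou(cand, k) <= iou_threshold for k in kept):
            kept.append(cand)
    return kept


boxes = [
    (10.0, 10.0, 110.0, 110.0, 0.90),
    (12.0, 12.0, 112.0, 112.0, 0.80),    # near duplicate of the first box
    (200.0, 200.0, 260.0, 260.0, 0.70),
    (300.0, 300.0, 340.0, 340.0, 0.10),  # below score_threshold
]
result = filter_and_nms(boxes)
```

In this toy input the second box overlaps the first almost completely and is suppressed, and the last box is dropped by the score threshold, so two boxes survive.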

5. Usage examples
The tests here use yolov5face-n-640x640.onnx (yolov5n-face), the nano version of the model. You can try the other versions as well.
5.1 ONNXRuntime version

#include "lite/lite.h"

static void test_default()
{
  std::string onnx_path = "../hub/onnx/cv/yolov5face-n-640x640.onnx"; // yolov5n-face
  std::string test_img_path = "../resources/4.jpg";
  std::string save_img_path = "../logs/4.jpg";

  auto *yolov5face = new lite::cv::face::detect::YOLO5Face(onnx_path);

  std::vector<lite::types::BoxfWithLandmarks> detected_boxes;
  cv::Mat img_bgr = cv::imread(test_img_path);
  yolov5face->detect(img_bgr, detected_boxes);

  lite::utils::draw_boxes_with_landmarks_inplace(img_bgr, detected_boxes);
  cv::imwrite(save_img_path, img_bgr);

  std::cout << "Default Version Done! Detected Face Num: " << detected_boxes.size() << std::endl;

  delete yolov5face;
}

5.2 MNN version

#include "lite/lite.h"

static void test_mnn()
{
#ifdef ENABLE_MNN
  std::string mnn_path = "../hub/mnn/cv/yolov5face-n-640x640.mnn"; // yolov5n-face
  std::string test_img_path = "../resources/12.jpg";
  std::string save_img_path = "../logs/12.jpg";

  auto *yolov5face = new lite::mnn::cv::face::detect::YOLO5Face(mnn_path);

  std::vector<lite::types::BoxfWithLandmarks> detected_boxes;
  cv::Mat img_bgr = cv::imread(test_img_path);
  yolov5face->detect(img_bgr, detected_boxes);

  lite::utils::draw_boxes_with_landmarks_inplace(img_bgr, detected_boxes);
  cv::imwrite(save_img_path, img_bgr);

  std::cout << "MNN Version Done! Detected Face Num: " << detected_boxes.size() << std::endl;

  delete yolov5face;
#endif
}

5.3 TNN version

#include "lite/lite.h"

static void test_tnn()
{
#ifdef ENABLE_TNN
  std::string proto_path = "../hub/tnn/cv/yolov5face-n-640x640.opt.tnnproto"; // yolov5n-face
  std::string model_path = "../hub/tnn/cv/yolov5face-n-640x640.opt.tnnmodel";
  std::string test_img_path = "../resources/9.jpg";
  std::string save_img_path = "../logs/9.jpg";

  auto *yolov5face = new lite::tnn::cv::face::detect::YOLO5Face(proto_path, model_path);

  std::vector<lite::types::BoxfWithLandmarks> detected_boxes;
  cv::Mat img_bgr = cv::imread(test_img_path);
  yolov5face->detect(img_bgr, detected_boxes);

  lite::utils::draw_boxes_with_landmarks_inplace(img_bgr, detected_boxes);
  cv::imwrite(save_img_path, img_bgr);

  std::cout << "TNN Version Done! Detected Face Num: " << detected_boxes.size() << std::endl;

  delete yolov5face;
#endif
}

5.4 NCNN version

#include "lite/lite.h"

static void test_ncnn()
{
#ifdef ENABLE_NCNN
  std::string param_path = "../hub/ncnn/cv/yolov5face-n-640x640.opt.param"; // yolov5n-face
  std::string bin_path = "../hub/ncnn/cv/yolov5face-n-640x640.opt.bin";
  std::string test_img_path = "../resources/1.jpg";
  std::string save_img_path = "../logs/1.jpg";

  auto *yolov5face = new lite::ncnn::cv::face::detect::YOLO5Face(param_path, bin_path, 1, 640, 640);

  std::vector<lite::types::BoxfWithLandmarks> detected_boxes;
  cv::Mat img_bgr = cv::imread(test_img_path);
  yolov5face->detect(img_bgr, detected_boxes);

  lite::utils::draw_boxes_with_landmarks_inplace(img_bgr, detected_boxes);
  cv::imwrite(save_img_path, img_bgr);

  std::cout << "NCNN Version Done! Detected Face Num: " << detected_boxes.size() << std::endl;

  delete yolov5face;
#endif
}

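The output looks like this:

Even though this is the nano version of the model, the results look very accurate. The model also comes with 5 face landmarks, which are handy for face alignment.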
6. Build and run
On macOS you can build and run this project directly, without downloading any other dependency libraries. On other systems you first need to download the lite.ai.toolkit source and build the lite.ai.toolkit.v0.1.0 dynamic library.

git clone --depth=1 https://github.com/DefTruth/YOLO5Face.lite.ai.toolkit.git
cd YOLO5Face.lite.ai.toolkit
sh ./build.sh

CMakeLists.txt settings

cmake_minimum_required(VERSION 3.17)
project(YOLO5Face.lite.ai.toolkit)

set(CMAKE_CXX_STANDARD 11)

# setting up lite.ai.toolkit
set(LITE_AI_DIR ${CMAKE_SOURCE_DIR}/lite.ai.toolkit)
set(LITE_AI_INCLUDE_DIR ${LITE_AI_DIR}/include)
set(LITE_AI_LIBRARY_DIR ${LITE_AI_DIR}/lib)
include_directories(${LITE_AI_INCLUDE_DIR})
link_directories(${LITE_AI_LIBRARY_DIR})

set(OpenCV_LIBS
        opencv_highgui
        opencv_core
        opencv_imgcodecs
        opencv_imgproc
        opencv_video
        opencv_videoio
        )
# add your executable
set(EXECUTABLE_OUTPUT_PATH ${CMAKE_SOURCE_DIR}/examples/build)

add_executable(lite_yolo5face examples/test_lite_yolo5face.cpp)
target_link_libraries(lite_yolo5face
        lite.ai.toolkit
        onnxruntime
        MNN  # need, if built lite.ai.toolkit with ENABLE_MNN=ON, default OFF
        ncnn # need, if built lite.ai.toolkit with ENABLE_NCNN=ON, default OFF
        TNN  # need, if built lite.ai.toolkit with ENABLE_TNN=ON, default OFF
        ${OpenCV_LIBS})  # link lite.ai.toolkit & other libs.

Building and testing information:

[ 50%] Building CXX object CMakeFiles/lite_yolo5face.dir/examples/test_lite_yolo5face.cpp.o
[100%] Linking CXX executable lite_yolo5face
[100%] Built target lite_yolo5face
Testing Start ...
LITEORT_DEBUG LogId: ../hub/onnx/cv/yolov5face-n-640x640.onnx
=============== Input-Dims ==============
input_node_dims: 1
input_node_dims: 3
input_node_dims: 640
input_node_dims: 640
=============== Output-Dims ==============
Output: 0 Name: output Dim: 0 :1
Output: 0 Name: output Dim: 1 :25200
Output: 0 Name: output Dim: 2 :16
========================================
generate_bboxes_kps num: 2824
Default Version Done! Detected Face Num: 326
LITEMNN_DEBUG LogId: ../hub/mnn/cv/yolov5face-n-640x640.mnn
=============== Input-Dims ==============
**Tensor shape**: 1, 3, 640, 640,
Dimension Type: (CAFFE/PyTorch/ONNX)NCHW
=============== Output-Dims ==============
getSessionOutputAll done!
Output: output:**Tensor shape**: 1, 25200, 16,
========================================
generate_bboxes_kps num: 71
MNN Version Done! Detected Face Num: 5
LITENCNN_DEBUG LogId: ../hub/ncnn/cv/yolov5face-n-640x640.opt.param
generate_bboxes_kps num: 34
NCNN Version Done! Detected Face Num: 2
LITETNN_DEBUG LogId: ../hub/tnn/cv/yolov5face-n-640x640.opt.tnnproto
=============== Input-Dims ==============
input: [1 3 640 640 ]
Input Data Format: NCHW
=============== Output-Dims ==============
output: [1 25200 16 ]
========================================
generate_bboxes_kps num: 98
TNN Version Done! Detected Face Num: 7
Testing Successful !

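One of the test results:

7. Notes on model conversion
OK, by now you have seen what the nano model can do. It is quite good: with a 640x640 input size it picks up a lot of small faces, and the C++ inference results line up with the reference with basically no issues. This section records the problems I hit when converting the model files between formats (ONNX/MNN/TNN/NCNN). It is arguably one of the more important steps, so I want to share it briefly. My knowledge is limited, so if anything below is off, corrections are welcome.
7.1 Analysis of the Detect module inference code (PyTorch)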

def forward(self, x):
    # x = x.copy()  # for profiling
    z = []  # inference output
    if self.export_cat:
        for i in range(self.nl):
            x[i] = self.m[i](x[i])  # conv
            bs, _, ny, nx = x[i].shape
            # YOLOv5: x(bs,255,20,20) to x(bs,3,20,20,85), YOLO5Face: x(bs,3,20,20,4+1+10+1=16)
            x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()
            # x[i] = x[i].view(bs, 3, 16, -1).permute(0, 1, 3, 2).contiguous()  # e.g (b,3,20x20,16) for NCNN

            # if self.grid[i].shape[2:4] != x[i].shape[2:4]:
            #     # self.grid[i] = self._make_grid(nx, ny).to(x[i].device)
            #     self.grid[i], self.anchor_grid[i] = self._make_grid_new(nx, ny, i)  # original YOLO5Face code
            # my modified line: always rebuild the grid, which removes the JIT TracerWarning
            self.grid[i], self.anchor_grid[i] = self._make_grid_new(nx, ny, i)

            y = torch.full_like(x[i], 0)
            y = y + torch.cat((x[i][:, :, :, :, 0:5].sigmoid(),
                               torch.cat((x[i][:, :, :, :, 5:15],
                                          x[i][:, :, :, :, 15:15 + self.nc].sigmoid()), 4)), 4)

            box_xy = (y[:, :, :, :, 0:2] * 2. - 0.5 + self.grid[i].to(x[i].device)) * self.stride[i]  # xy
            box_wh = (y[:, :, :, :, 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh
            # box_conf = torch.cat((box_xy, torch.cat((box_wh, y[:, :, :, :, 4:5]), 4)), 4)

            landm1 = y[:, :, :, :, 5:7] * self.anchor_grid[i] + self.grid[i].to(x[i].device) * self.stride[i]  # x1 y1
            landm2 = y[:, :, :, :, 7:9] * self.anchor_grid[i] + self.grid[i].to(x[i].device) * self.stride[i]  # x2 y2
            landm3 = y[:, :, :, :, 9:11] * self.anchor_grid[i] + self.grid[i].to(x[i].device) * self.stride[i]  # x3 y3
            landm4 = y[:, :, :, :, 11:13] * self.anchor_grid[i] + self.grid[i].to(x[i].device) * self.stride[i]  # x4 y4
            landm5 = y[:, :, :, :, 13:15] * self.anchor_grid[i] + self.grid[i].to(x[i].device) * self.stride[i]  # x5 y5
            # landm = torch.cat((landm1, torch.cat((landm2, torch.cat((landm3, torch.cat((landm4, landm5), 4)), 4)), 4)), 4)
            # y = torch.cat((box_conf, torch.cat((landm, y[:, :, :, :, 15:15+self.nc]), 4)), 4)
            y = torch.cat([box_xy, box_wh, y[:, :, :, :, 4:5], landm1, landm2, landm3, landm4, landm5,
                           y[:, :, :, :, 15:15 + self.nc]], -1)

            z.append(y.view(bs, -1, self.no))  # (bs,-1,16)
        return torch.cat(z, 1)  # (bs,?,16)
        # return x  # for NCNN

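Let's focus on the forward function of the Detect module. The 5 new landmarks are appended on top of the original YOLOv5 output, and everything else matches YOLOv5. The difference is that the original YOLOv5 is a multi-class object detector with nc=80 (COCO) and no=nc+5=85: the first 4 values are the predicted bbox offsets, the 5th is the foreground/background (objectness) probability, and the last 80 are the class probabilities of the 80 categories.
YOLO5Face adds 5 landmarks and has only one real class (face or not), so nc=1 (face) and no=nc+5+10=16. The first 4 values (indices 0-3) are the predicted face bbox offsets, the 5th (index 4) is the foreground/background probability, the middle 10 (indices 5-14) are the (x, y) offsets of the 5 landmarks, and the last value (index 15) is the face class probability.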
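The layout above can be written down as a small sketch. The constant and function names are made up for illustration. The three strides 8, 16 and 32 and the 3 anchors per cell are taken from the export code later in this article, and with them a 640x640 input gives exactly the 25200 rows that show up in the test log.

```python
from typing import Dict, List

NUM_ANCHORS = 3        # na: anchors per grid cell
STRIDES = (8, 16, 32)  # downsampling factor of each detection head
NUM_OUTPUTS = 16       # no = 4 (box) + 1 (objectness) + 10 (landmarks) + 1 (class)


def num_predictions(input_size: int) -> int:
    """Total number of rows for a square input, summed over the three heads."""
    return sum(NUM_ANCHORS * (input_size // s) ** 2 for s in STRIDES)


def split_row(row: List[float]) -> Dict[str, object]:
    """Split one 16-value prediction row into named fields by index."""
    assert len(row) == NUM_OUTPUTS
    return {
        "box": row[0:4],                                              # indices 0-3
        "objectness": row[4],                                         # index 4
        "landmarks": [tuple(row[i:i + 2]) for i in range(5, 15, 2)],  # indices 5-14
        "face_score": row[15],                                        # index 15
    }


rows_640 = num_predictions(640)  # 3 * (80*80 + 40*40 + 20*20)
fields = split_row([float(v) for v in range(16)])
```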
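As for how the offsets are turned into coordinates: the bbox is computed exactly as in YOLOv5, but the landmarks are handled differently. A landmark is a single point (x, y) with no width or height, so the YOLOv5 formula cannot be reused. In YOLO5Face the landmark offset is expressed relative to the stride and to the anchor width and height. It is a relative value, not an absolute one, computed as follows: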

landmark_x_offset = (landmark_x - x_anchor * stride) / anchor_w
landmark_y_offset = (landmark_y - y_anchor * stride) / anchor_h
The inverse is:
landmark_x = landmark_x_offset * anchor_w + x_anchor * stride
landmark_y = landmark_y_offset * anchor_h + y_anchor * stride
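A quick round trip makes the two formulas easier to check. This is only a sketch: the function names are hypothetical, grid_x and grid_y play the role of x_anchor and y_anchor (the index of the grid cell), and the numbers are arbitrary.

```python
from typing import Tuple


def encode_landmark(lx: float, ly: float, grid_x: int, grid_y: int,
                    stride: int, anchor_w: float, anchor_h: float) -> Tuple[float, float]:
    """Absolute landmark position (pixels) -> offset relative to cell and anchor."""
    return ((lx - grid_x * stride) / anchor_w,
            (ly - grid_y * stride) / anchor_h)


def decode_landmark(off_x: float, off_y: float, grid_x: int, grid_y: int,
                    stride: int, anchor_w: float, anchor_h: float) -> Tuple[float, float]:
    """Inverse of encode_landmark: relative offset -> absolute position (pixels)."""
    return (off_x * anchor_w + grid_x * stride,
            off_y * anchor_h + grid_y * stride)


# A landmark at (100, 60) seen from grid cell (10, 6) at stride 8 with an 8x10 anchor.
cell = dict(grid_x=10, grid_y=6, stride=8, anchor_w=8.0, anchor_h=10.0)
offset = encode_landmark(100.0, 60.0, **cell)  # (100 - 80) / 8 and (60 - 48) / 10
point = decode_landmark(*offset, **cell)
```

Decoding the encoded offset gives back the original point, which is all the inverse formula claims.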

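You can also see that YOLO5Face uses a new function here, _make_grid_new, where YOLOv5 uses _make_grid. This function is actually quite important, so here is my understanding of it. _make_grid_new differs in two ways:

It regenerates the matching anchor_grid from the current anchors;
It sets na (num anchors) explicitly instead of using 1;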

@staticmethod
def _make_grid(nx=20, ny=20):  # original function
    yv, xv = torch.meshgrid([torch.arange(ny), torch.arange(nx)])
    return torch.stack((xv, yv), 2).view((1, 1, ny, nx, 2)).float()

def _make_grid_new(self, nx=20, ny=20, i=0):  # new function
    d = self.anchors[i].device
    if '1.10.0' in torch.__version__:  # torch>=1.10.0 meshgrid workaround for torch>=0.7 compatibility
        yv, xv = torch.meshgrid([torch.arange(ny).to(d), torch.arange(nx).to(d)], indexing='ij')
    else:
        yv, xv = torch.meshgrid([torch.arange(ny).to(d), torch.arange(nx).to(d)])
    grid = torch.stack((xv, yv), 2).expand((1, self.na, ny, nx, 2)).float()
    anchor_grid = (self.anchors[i].clone() * self.stride[i]).view((1, self.na, 1, 1, 2)).expand(
        (1, self.na, ny, nx, 2)).float()
    return grid, anchor_grid
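To see what the two returned tensors contain without running PyTorch, here is a plain-Python sketch that builds the same values as nested lists, with the batch dimension of size 1 dropped. The function name and the anchor values are illustrative only. As in the code above, the anchors are assumed to be stored in units of the stride, so multiplying by the stride gives pixel sizes.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[float, float]


def make_grid_new(nx: int, ny: int, anchors: Sequence[Pair],
                  stride: int) -> Tuple[List[List[List[Pair]]], List[List[List[Pair]]]]:
    """Return (grid, anchor_grid), both indexed as [anchor][y][x] -> (a, b)."""
    # grid[a][y][x] is the cell coordinate (x, y), repeated for every anchor.
    grid = [[[(float(x), float(y)) for x in range(nx)] for y in range(ny)]
            for _ in anchors]
    # anchor_grid[a][y][x] is the anchor size in pixels, repeated for every cell.
    anchor_grid = [[[(aw * stride, ah * stride) for _ in range(nx)] for _ in range(ny)]
                   for (aw, ah) in anchors]
    return grid, anchor_grid


# Illustrative anchors in stride units for a stride-8 head (not read from a real model).
anchors_in_stride_units = [(0.5, 0.625), (1.0, 1.25), (1.625, 2.0)]
grid, anchor_grid = make_grid_new(nx=4, ny=3, anchors=anchors_in_stride_units, stride=8)
```

Both structures have the same [na][ny][nx] shape, so the decode step can combine them element by element without relying on broadcasting.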

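Why do it this way? Let's first look at how anchor_grid and anchors are initialized.

self.grid = [torch.zeros(1)] * self.nl  # init grid
a = torch.tensor(anchors).float().view(self.nl, -1, 2)
self.register_buffer('anchors', a)  # shape(nl,na,2)
self.register_buffer('anchor_grid', a.clone().view(self.nl, 1, -1, 1, 1, 2))  # shape(nl,1,na,1,1,2)

In the init of the Detect module, register_buffer is used to register anchors and anchor_grid. That makes both of them part of the module state that torch tracks, so when torch.save is called their values are stored along with the model. (As an aside, I have seen people ask why, when using YOLOv5, loading the pretrained pth weights is enough, with no code that reads the yolov5xxx.yaml config and nothing that sets the anchors. This is the reason: everything was already saved at save time, so inference no longer needs the yolov5xxx.yaml config file.) In practice, however, you may need to set new anchors. For example, the anchors saved with YOLOv5 are not well suited to face detection (if YOLOv5 weights are used as pretrained weights), or you may simply want to experiment with different anchors. In that case the old anchors stored in the weight file must be replaced with new ones suited to face detection, and since anchor_grid depends on anchors, it has to be regenerated too. As for fixing na to an explicit value, my guess is that it just avoids leaning too heavily on torch's broadcast behavior, which might (only might) cause trouble when deploying.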

# if self.grid[i].shape[2:4] != x[i].shape[2:4]:
#     # self.grid[i] = self._make_grid(nx, ny).to(x[i].device)
#     self.grid[i], self.anchor_grid[i] = self._make_grid_new(nx, ny, i)  # original YOLO5Face code
# my modified line: always rebuild the grid, which removes the JIT TracerWarning
self.grid[i], self.anchor_grid[i] = self._make_grid_new(nx, ny, i)

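I made one minor change to the forward code of the YOLO5Face Detect module. The original code does not prevent the ONNX export, but it triggers a TracerWarning during tracing: the result of self.grid[i].shape[2:4] != x[i].shape[2:4] can be either True or False, so it is not a fixed value at trace time. The fix is to drop the check and always build a new grid from the current input dimensions. Logically, this does not change the final inference result of forward.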
7.2 Converting to ONNX/MNN/TNN model files
Once the new logic in the Detect module is clear, converting to ONNX is fairly simple: just call export.py. For example:

PYTHONPATH=. python3 export.py --weights weights/yolov5n-0.5.pt --img_size 640 640 --batch_size 1 --simplify
PYTHONPATH=. python3 export.py --weights weights/yolov5n-face.pt --img_size 640 640 --batch_size 1 --simplify

If you removed the self.grid[i].shape[2:4] != x[i].shape[2:4] check, the TracerWarning no longer appears. The commands for converting to MNN and TNN model files are:

MNNConvert -f ONNX --modelFile yolov5n-0.5-640x640.onnx --MNNModel yolov5n-0.5-640x640.mnn --bizCode MNN  # MNN conversion
python3 ./converter.py onnx2tnn yolov5n-0.5-640x640.onnx -o ./YOLO5Face/ -optimize -v v1.0 -align  # TNN conversion

The MNNConvert I used corresponds to MNN 1.2.0, and the tnn-convert image was the latest one available.
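7.3 Custom handling for NCNN model conversion (5-D tensors are not supported)
An NCNN Mat is a 3-D tensor (h, w, c) with batch=1 assumed, so at the moment NCNN seems to support tensors of up to 4 dimensions well, while tensors with 5 or more dimensions cannot be converted to ncnn (this is my own understanding, so please correct me if it is wrong). Converting the exported ONNX file straight to ncnn runs into "Unsupported slice axes". For example:

~ onnx2ncnn YOLO5Face/yolov5n-face-640x640.onnx yolov5n-face-640x640.param yolov5n-face-640x640.bin
Unsupported slice axes !
Unsupported slice axes !
Unsupported slice axes !
Unsupported slice axes !
...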

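I then tried the workaround from the article "野路子：记录一个解决onnx转ncnn时op不支持的trick" (an unorthodox trick for ops that are unsupported when converting ONNX to ncnn), but that did not solve it either. The output was:

~ onnx2ncnn YOLO5Face/yolov5n-face-640x640.opt.onnx yolov5n-face-640x640.param yolov5n-face-640x640.bin
Unsupported slice axes !
Unsupported slice axes !
Unsupported slice axes !
Unsupported slice axes !
...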

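My guess is that ncnn squeezes a 5-D tensor into 4 dimensions (assuming batch=1), while the coordinate decoding logic in YOLO5Face mostly slices along 5 dimensions, which is why NCNN hits slice errors when converting that part. So how can this be solved? By not using 5-D tensors at all, and moving the coordinate decoding part of Detect into C++. If you understand the details of Detect and how the tensor is laid out in memory, this is not hard to implement. First, here is how the YOLO5Face code changes.

# x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()  # original handling
x[i] = x[i].view(bs, 3, 16, -1).permute(0, 1, 3, 2).contiguous()  # e.g (b,3,20x20,16) for NCNN
# ... comment out the coordinate decoding logic
# return torch.cat(z, 1)  # (bs,?,16) original return value
return x  # modified return value for NCNN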

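In other words, the last two spatial dimensions (ny, nx) are not kept separate but flattened into one. Since the later processing is all written for 5-D tensors, the coordinate decoding logic has to be commented out as well: the modified 4-D tensor is returned directly and the decoding is implemented in C++. To export the ONNX file successfully, export.py also needs a matching change, because the output is now a list of 3 tensors with different shapes, whereas before they were joined by torch.cat into a single tensor.

# torch.onnx.export(model, img, f, verbose=False, opset_version=12,
#                   input_names=input_names,
#                   output_names=output_names,
#                   dynamic_axes={'input': {0: 'batch'},
#                                 'output': {0: 'batch'}
#                                 } if opt.dynamic else None)
torch.onnx.export(model, img, f, verbose=False, opset_version=12,
                  input_names=['input'],
                  output_names=["det_stride_8", "det_stride_16", "det_stride_32"],
                  )  # for ncnn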

Export as usual, convert to NCNN files, and run them through ncnnoptimize. It all went smoothly, with no more unsupported-operator problems.

~ PYTHONPATH=. python3 export.py --weights weights/yolov5n-face.pt --img_size 640 640 --batch_size 1 --simplify
~ ncnn_models onnx2ncnn yolov5n-face-640x640-for-ncnn.onnx yolov5n-face-640x640.param yolov5n-face-640x640.bin
~ ncnnoptimize yolov5n-face-640x640.param yolov5n-face-640x640.bin yolov5n-face-640x640.opt.param yolov5n-face-640x640.opt.bin 0
Input layer input without shape info, shape_inference skipped
Input layer input without shape info, estimate_memory_footprint skipped

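This approach has an upside too. Since anchors and anchor_grid no longer need to be exported, the model file gets smaller: the yolov5face-n-640x640.onnx file exported the original way takes 9.5 MB, while the modified model without anchors and anchor_grid is only 6.5 MB. Finally, for the C++ pre- and post-processing of YOLO5Face and the NMS implementation, I suggest reading the source in my repository, so I will not go over it here.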
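For readers who want the idea before opening the repository, here is a rough sketch of decoding one of the three 4-D outputs. It is written in Python for brevity and is not the author's C++ code. It assumes the layout raw[anchor][gy * nx + gx][16] produced by the view/permute above with the batch dimension dropped, anchors given in pixels, and a final score equal to objectness times face probability. All names and the anchor values are illustrative.

```python
import math
from typing import Dict, List, Sequence, Tuple


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def decode_head(raw: Sequence[Sequence[Sequence[float]]], nx: int, ny: int,
                stride: int, anchors: Sequence[Tuple[float, float]],
                score_threshold: float = 0.25) -> List[Dict[str, object]]:
    """Decode one head laid out as raw[anchor][gy * nx + gx][16] (raw, pre-sigmoid)."""
    detections: List[Dict[str, object]] = []
    for a, (anchor_w, anchor_h) in enumerate(anchors):
        for p in range(ny * nx):
            row = raw[a][p]
            gy, gx = divmod(p, nx)  # recover the grid cell from the flat index
            # Assumption: final score = objectness * face-class probability.
            score = sigmoid(row[4]) * sigmoid(row[15])
            if score < score_threshold:
                continue
            # Box decoding, same formulas as box_xy / box_wh in the Detect module.
            cx = (sigmoid(row[0]) * 2.0 - 0.5 + gx) * stride
            cy = (sigmoid(row[1]) * 2.0 - 0.5 + gy) * stride
            w = (sigmoid(row[2]) * 2.0) ** 2 * anchor_w
            h = (sigmoid(row[3]) * 2.0) ** 2 * anchor_h
            # Landmarks are not passed through sigmoid.
            landmarks = [(row[k] * anchor_w + gx * stride,
                          row[k + 1] * anchor_h + gy * stride)
                         for k in range(5, 15, 2)]
            detections.append({
                "box": (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2),  # x1, y1, x2, y2
                "score": score,
                "landmarks": landmarks,
            })
    return detections


# Tiny synthetic example: a 2x2 grid at stride 8 with a single confident cell.
ANCHORS_STRIDE_8 = [(4.0, 5.0), (8.0, 10.0), (13.0, 16.0)]  # illustrative pixel sizes
background = [0.0] * 4 + [-10.0] + [0.0] * 10 + [-10.0]
raw = [[list(background) for _ in range(4)] for _ in ANCHORS_STRIDE_8]
raw[1][3] = [0.0] * 4 + [10.0] + [0.5] * 10 + [10.0]  # anchor 1, cell (gx=1, gy=1)
dets = decode_head(raw, nx=2, ny=2, stride=8, anchors=ANCHORS_STRIDE_8)
```

In a real pipeline the same function would be applied to each of det_stride_8, det_stride_16 and det_stride_32 with its own stride and anchors, and the merged candidates would then go through NMS.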