YOLOv5n v6 Channel Pruning with PaddlePaddle and Manual TensorRT Conversion

Reposted from AI Studio. Original article: YOLOv5n v6 Channel Pruning with PaddlePaddle and Manual TensorRT Conversion (PaddlePaddle AI Studio)

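This project draws on the following projects:

[PaddleDetection 2.0 series] Quickly implementing pedestrian detection
[Object detection] Reproducing YOLOv5 with PaddlePaddle
[yolov5_prune]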

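The baseline reference for this project is yolov3 with mAP 51.8. After pruning, the yolov5n v6 model reaches mAP 58.9 with a model size of only 2.5 MB, and with TensorRT acceleration the model's enqueue inference time is about 1.7 ms (GPU: RTX 2070s).

The pruning experiments here are based on the v6 release of yolov5n; other YOLOv5 variants can be handled similarly.

Note that this project is only a personal experiment log and is meant as a rough reference.

Project GitHub repository: https://github.com/thunder95/yolov5_paddle_prune

TensorRT C++ implementation: https://github.com/thunder95/tensorrtx/tree/master/yolov5n-v6-prune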

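Dataset used in the experiments

The dataset's full name is MOT20: A benchmark for multi object tracking in crowded scenes. Over several years of development, the MOT datasets have increasingly focused on complex, multi-task, real-world scenes.

This dataset focuses on multi-person tracking. Because pedestrians are the main research target in tracking, accurate tracking and detection are of considerable practical value.

A major application of pedestrian detection is intelligent surveillance. In surveillance scenarios, pedestrians are mostly captured from the viewpoint of cameras in public areas, and detection is then run on the captured images.

Dataset download: https://aistudio.baidu.com/aistudio/datasetdetail/87561. It needs a format conversion:

python YOLOv5-Paddle/convert.py

The converted dataset has already been uploaded with this project, so you do not need to convert it again.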
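
The conversion script itself is not shown in this post. As a rough sketch of what such a format conversion involves, assuming the usual MOT ground-truth layout (frame, id, bb_left, bb_top, bb_width, bb_height, ...) and YOLO's normalized `class cx cy w h` label rows, a single row could be converted as below. The function name `mot_to_yolo` is hypothetical, and the real `convert.py` may filter or clip boxes differently.

```python
def mot_to_yolo(line, img_w, img_h, class_id=0):
    """Convert one MOT ground-truth row into a YOLO label row.

    Assumed MOT columns: frame, id, bb_left, bb_top, bb_width, bb_height, ...
    (pixel units, top-left origin). The image size must be supplied by the caller.
    """
    fields = line.strip().split(",")
    left, top, width, height = (float(v) for v in fields[2:6])

    # Clip the box to the image so normalized values stay within [0, 1].
    x1 = max(0.0, left)
    y1 = max(0.0, top)
    x2 = min(float(img_w), left + width)
    y2 = min(float(img_h), top + height)

    # YOLO labels use the box center and size, normalized by the image size.
    cx = (x1 + x2) / 2.0 / img_w
    cy = (y1 + y2) / 2.0 / img_h
    bw = (x2 - x1) / img_w
    bh = (y2 - y1) / img_h
    return f"{class_id} {cx:.6f} {cy:.6f} {bw:.6f} {bh:.6f}"


if __name__ == "__main__":
    # Illustrative row, not taken from the dataset.
    print(mot_to_yolo("1,1,100,200,50,100,1,1,1.0", img_w=1000, img_h=1000))
```

Since this project detects a single class (person), every row maps to class 0 in this sketch.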

In [ ]

# Unzip the dataset
%cd /home/aistudio/data/data127016
!unzip yolov5_mot20.zip

In [ ]

# Generate the txt file lists
%cd /home/aistudio/data/data127016
!python create_txt.py

/home/aistudio/data/data127016

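Baseline training with YOLOv5-Paddle

You can train directly with GuoQuanhao's code from GitHub: https://github.com/GuoQuanhao/YOLOv5-Paddle

This project runs on the unpacked code. Because training uses a lot of GPU memory and AI Studio offers no multi-GPU resources, multi-GPU distributed training was run locally:

python -m paddle.distributed.launch train.py --img 640 --batch 64 --epochs 100 --data ./data/coco_person.yaml --cfg models/yolov5n.yaml --weights weights/yolov5n6.pdparams

If downloading the Arial.ttf font is slow, copy it manually:

cp /home/aistudio/Arial.ttf /home/aistudio/.config/thunder95/

Validation mAP: 0.612 (mAP@.5: 0.961)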

In [1]

# Re-download the latest code if needed
%cd /home/aistudio/
!rm -rf yolov5_paddle_prune/

# If GitHub is too slow, use the Gitee mirror instead of: git clone https://github.com/thunder95/yolov5_paddle_prune.git
!git clone https://gitee.com/thunder95/yolov5_paddle_prune.git

In [ ]

# Install dependencies
!pip install gputil==1.4.0 pycocotools terminaltables
!mkdir -p /home/aistudio/.config/thunder95/
!cp /home/aistudio/Arial.ttf  /home/aistudio/.config/thunder95/

In [47]

# Train the base model; this is very slow on a single GPU, so running locally on multiple GPUs is recommended
%cd /home/aistudio/yolov5_paddle_prune/
!python train.py --img 640 --batch 64 --epochs 1 --data ./data/coco_person.yaml --cfg models/yolov5n.yaml --weights weights/yolov5n6.pdparams

In [ ]

# Validate the base model
%cd /home/aistudio/yolov5_paddle_prune/
!python prune_val.py --img 640  --data ./data/coco_person.yaml --cfg models/yolov5n.yaml --weights weights/train.pdparams

/home/aistudio/yolov5_paddle_prune
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/__init__.py:107: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  from collections import MutableMapping
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/rcsetup.py:20: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  from collections import Iterable, Mapping
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/colors.py:53: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  from collections import Sized
prune_val: data=./data/coco_person.yaml, cfg=models/yolov5n.yaml, weights=weights/train.pdparams, wts=, hyp=data/hyps/hyp.scratch.yaml, batch_size=32, imgsz=640, conf_thres=0.001, iou_thres=0.6, task=val, device=, single_cls=False, augment=False, verbose=False, save_txt=False, save_hybrid=False, save_conf=False, save_json=False, project=runs/val, name=exp, exist_ok=False
YOLOv5 🚀 2973ab2 paddle 2.2.2 
Overriding model.yaml nc=80 with nc=1

W0407 00:24:59.826810 11755 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1
W0407 00:24:59.832765 11755 device_context.cc:465] device: 0, cuDNN Version: 7.6.
                 from  n    params  layer                                   arguments                     
  0                -1  1      1792  models.common.Conv                      [3, 16, 6, 2, 2]              
  1                -1  1      4736  models.common.Conv                      [16, 32, 3, 2]                
  2                -1  1      4992  models.common.C3                        [32, 32, 1]                   
  3                -1  1     18688  models.common.Conv                      [32, 64, 3, 2]                
  4                -1  2     29696  models.common.C3                        [64, 64, 2]                   
  5                -1  1     74240  models.common.Conv                      [64, 128, 3, 2]               
  6                -1  3    158208  models.common.C3                        [128, 128, 3]                 
  7                -1  1    295936  models.common.Conv                      [128, 256, 3, 2]              
  8                -1  1    297984  models.common.C3                        [256, 256, 1]                 
  9                -1  1    165376  models.common.SPPF                      [256, 256, 5]                 
 10                -1  1     33280  models.common.Conv                      [256, 128, 1, 1]              
 11                -1  1         0  models.common.Upsample                  [None, 2, 'nearest']          
 12           [-1, 6]  1         0  models.common.Concat                    [1]                           
 13                -1  1     91648  models.common.C3                        [256, 128, 1, False]          
 14                -1  1      8448  models.common.Conv                      [128, 64, 1, 1]               
 15                -1  1         0  models.common.Upsample                  [None, 2, 'nearest']          
 16           [-1, 4]  1         0  models.common.Concat                    [1]                           
 17                -1  1     23296  models.common.C3                        [128, 64, 1, False]           
 18                -1  1     37120  models.common.Conv                      [64, 64, 3, 2]                
 19          [-1, 14]  1         0  models.common.Concat                    [1]                           
 20                -1  1     75264  models.common.C3                        [128, 128, 1, False]          
 21                -1  1    147968  models.common.Conv                      [128, 128, 3, 2]              
 22          [-1, 10]  1         0  models.common.Concat                    [1]                           
 23                -1  1    297984  models.common.C3                        [256, 256, 1, False]          
 24      [17, 20, 23]  1      8118  models.yolo.Detect                      [1, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [64, 128, 256]]
Model Summary: 269 layers, 1774774 parameters, 1765270 gradients, 4.2 GFLOPs

val: Scanning '/home/aistudio/data/data127016/valid.cache' images and labels... 
               Class     Images     Labels          P          R     mAP@.5 mAP@
                 all        881      90363      0.942      0.889      0.961      0.612
Speed: 0.2ms pre-process, 1.1ms inference, 40.6ms NMS per image at shape (32, 3, 640, 640)
Results saved to runs/val/exp

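Sparsity training

Channel pruning must be preceded by sparsity training, which pushes the BN-layer weights toward a distribution concentrated as close to 0 as possible.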
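
As a minimal sketch of how BN weights can be pushed toward 0, assuming a network-slimming style L1 penalty on the BN scale factors (gamma): after the normal backward pass, `sr * sign(gamma)` is added to each BN weight gradient before the optimizer step. Whether `train_sparsity.py` does exactly this when `-sr` is passed has not been checked here; the helper names and the values of `sr` and `lr` are illustrative only.

```python
def sign(x):
    """Return -1, 0 or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def add_l1_subgradient(bn_grads, bn_gammas, sr):
    """Add the subgradient of sr * |gamma| to each BN weight gradient."""
    return [g + sr * sign(w) for g, w in zip(bn_grads, bn_gammas)]


def sgd_step(weights, grads, lr):
    """Plain SGD update, standing in for the real optimizer."""
    return [w - lr * g for w, g in zip(weights, grads)]


if __name__ == "__main__":
    gammas = [0.8, -0.3, 0.0]      # illustrative BN scale factors
    task_grads = [0.0, 0.0, 0.0]   # pretend the detection loss is flat here
    grads = add_l1_subgradient(task_grads, gammas, sr=0.01)
    # Every non-zero gamma moves one small step toward 0.
    print(sgd_step(gammas, grads, lr=0.1))
```

A larger `sr` sparsifies faster but competes harder with the detection loss, which is one reason the sparsity strength has to be tuned.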

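For multi-GPU training: python -m paddle.distributed.launch train_sparsity.py --img 640 --batch 64 --epochs 100 --data ./data/coco_person.yaml --cfg models/yolov5n.yaml --weights weights/train.pdparams -sr

After sparsity training, mAP is 0.51 (down from 0.612), because the sparsity is still insufficient and training was not run long enough.

[Figure: before sparsity training]

[Figure: after sparsity training]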

In [ ]

# Run sparsity training
%cd /home/aistudio/yolov5_paddle_prune/
!python train_sparsity.py --img 640 --batch 64 --epochs 1 --data ./data/coco_person.yaml --cfg models/yolov5n.yaml --weights weights/train.pdparams -sr

In [ ]

# Validate the sparsity-trained model
%cd /home/aistudio/yolov5_paddle_prune/
!python val.py --img 640  --data ./data/coco_person.yaml --cfg models/yolov5n.yaml --weights weights/sparse2.pdparams

/home/aistudio/yolov5_paddle_prune
val: data=./data/coco_person.yaml, cfg=models/yolov5n.yaml, weights=weights/sparse2.pdparams, hyp=data/hyps/hyp.scratch.yaml, batch_size=32, imgsz=640, conf_thres=0.001, iou_thres=0.6, task=val, device=, single_cls=False, augment=False, verbose=False, save_txt=False, save_hybrid=False, save_conf=False, save_json=False, project=runs/val, name=exp, exist_ok=False
YOLOv5 🚀 2973ab2 paddle 2.2.2 
--->cfg:  models/yolov5n.yaml
Overriding model.yaml nc=80 with nc=1

W0407 00:34:44.609393 12967 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1
W0407 00:34:44.614804 12967 device_context.cc:465] device: 0, cuDNN Version: 7.6.
                 from  n    params  layer                                   arguments                     
  0                -1  1      1792  models.common.Conv                      [3, 16, 6, 2, 2]              
  1                -1  1      4736  models.common.Conv                      [16, 32, 3, 2]                
  2                -1  1      4992  models.common.C3                        [32, 32, 1]                   
  3                -1  1     18688  models.common.Conv                      [32, 64, 3, 2]                
  4                -1  2     29696  models.common.C3                        [64, 64, 2]                   
  5                -1  1     74240  models.common.Conv                      [64, 128, 3, 2]               
  6                -1  3    158208  models.common.C3                        [128, 128, 3]                 
  7                -1  1    295936  models.common.Conv                      [128, 256, 3, 2]              
  8                -1  1    297984  models.common.C3                        [256, 256, 1]                 
  9                -1  1    165376  models.common.SPPF                      [256, 256, 5]                 
 10                -1  1     33280  models.common.Conv                      [256, 128, 1, 1]              
 11                -1  1         0  models.common.Upsample                  [None, 2, 'nearest']          
 12           [-1, 6]  1         0  models.common.Concat                    [1]                           
 13                -1  1     91648  models.common.C3                        [256, 128, 1, False]          
 14                -1  1      8448  models.common.Conv                      [128, 64, 1, 1]               
 15                -1  1         0  models.common.Upsample                  [None, 2, 'nearest']          
 16           [-1, 4]  1         0  models.common.Concat                    [1]                           
 17                -1  1     23296  models.common.C3                        [128, 64, 1, False]           
 18                -1  1     37120  models.common.Conv                      [64, 64, 3, 2]                
 19          [-1, 14]  1         0  models.common.Concat                    [1]                           
 20                -1  1     75264  models.common.C3                        [128, 128, 1, False]          
 21                -1  1    147968  models.common.Conv                      [128, 128, 3, 2]              
 22          [-1, 10]  1         0  models.common.Concat                    [1]                           
 23                -1  1    297984  models.common.C3                        [256, 256, 1, False]          
 24      [17, 20, 23]  1      8118  models.yolo.Detect                      [1, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [64, 128, 256]]
Model Summary: 269 layers, 1774774 parameters, 1765270 gradients, 4.2 GFLOPs

Fusing layers... 
Model Summary: 212 layers, 1760518 parameters, 8118 gradients, 4.2 GFLOPs
val: Scanning '/home/aistudio/data/data127016/valid.cache' images and labels... 
               Class     Images     Labels          P          R     mAP@.5 mAP@
                 all        881      90363      0.868      0.789      0.884       0.51
Speed: 0.4ms pre-process, 21.4ms inference, 85.9ms NMS per image at shape (1, 3, 640, 640)
Results saved to runs/val/exp2

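The network is still very sensitive to pruning, possibly because the sparsity training was not strong enough, or because this network has far fewer channels than yolov5s.
Sparsity training can sometimes improve mAP noticeably; deeper comparative experiments and analysis have not been done here.
Sparsity training is also time-consuming. A model like yolov5n v6 is already very small, so the sparsity parameter needs repeated tuning, and because the network is so sensitive, pruning has a large effect on its overall output.
If sparsity training is insufficient, these steps can be repeated: train -> sparsify -> prune -> retrain. For example, after one more round of sparsity training on the model above:

[Figure: before sparsity training]

[Figure: after sparsity training]

[Figure: change in mAP]

Accuracy essentially recovers to that of the initial training stage.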

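Pruning to multiples of 8 channels

In principle, channels can be pruned to any ratio. The original author emphasizes that channel counts in multiples of 8 are friendlier to hardware, but this project has not yet run specific comparison experiments.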
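
One way to keep every layer at a multiple of 8 channels is to round the number of kept channels after thresholding. The sketch below rounds up and caps the result at the layer's original width. Both choices are assumptions, as is the helper name `keep_count_8x`; the pruning script may round differently.

```python
def keep_count_8x(n_keep, n_total, multiple=8):
    """Round the number of kept channels up to a multiple of `multiple`.

    The result is at least one full group of `multiple` channels and never
    exceeds the layer's original width `n_total`.
    """
    rounded = -(-n_keep // multiple) * multiple  # ceiling to the next multiple
    return min(max(rounded, multiple), n_total)


if __name__ == "__main__":
    # (channels surviving the threshold, original layer width)
    for n_keep, n_total in [(13, 64), (61, 64), (0, 64), (16, 64)]:
        print(n_keep, "->", keep_count_8x(n_keep, n_total))
```

Rounding up keeps a few low-importance channels that thresholding would have dropped, trading a little model size for aligned tensor shapes.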

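Because the model is small and the sparsity training is insufficient, it is very sensitive to pruning, so pruning costs a substantial amount of accuracy.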
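
The pruning commands in this section take `--global_percent` and `--layer_keep`. A plausible reading, sketched below under that assumption, is that `--global_percent` selects a single threshold from the sorted |gamma| values of all prunable BN layers, while `--layer_keep` guarantees that each layer retains at least that fraction of its channels. The function names are hypothetical and the script's exact logic is not reproduced here.

```python
def global_threshold(layer_gammas, global_percent):
    """|gamma| value at the `global_percent` quantile across all layers."""
    all_abs = sorted(abs(g) for layer in layer_gammas for g in layer)
    idx = int(len(all_abs) * global_percent)
    return all_abs[min(idx, len(all_abs) - 1)]


def layer_mask(gammas, threshold, layer_keep):
    """Keep channels above the threshold, but never fewer than the
    `layer_keep` fraction of the layer (and at least one channel)."""
    min_keep = max(1, int(len(gammas) * layer_keep))
    order = sorted(range(len(gammas)), key=lambda i: abs(gammas[i]), reverse=True)
    keep = {i for i in order if abs(gammas[i]) > threshold}
    keep.update(order[:min_keep])  # protect the strongest channels
    return [i in keep for i in range(len(gammas))]


if __name__ == "__main__":
    # Two toy layers; the second one has uniformly small gammas.
    layers = [[0.9, 0.1, 0.05, 0.6], [0.02, 0.03, 0.01, 0.04]]
    thr = global_threshold(layers, global_percent=0.5)
    print("threshold:", thr)
    for gammas in layers:
        print(layer_mask(gammas, thr, layer_keep=0.25))
```

In the toy data the second layer falls entirely below the global threshold, so only the `layer_keep` floor prevents it from being emptied. With a floor as low as 0.01, a layer can be cut down to very few channels, which may contribute to the accuracy collapse at higher ratios.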

In [ ]

# Prune ratio 0.5
%cd /home/aistudio/yolov5_paddle_prune/
!python slim_prune_yolov5n_8x.py --cfg cfg/yolov5n_v6_person.cfg --data data/coco_person.yaml --weights weights/sparse2.pdparams --global_percent 0.5 --layer_keep 0.01 --img_size 640

/home/aistudio/yolov5_paddle_prune
Namespace(cfg='cfg/yolov5n_v6_person.cfg', data='data/coco_person.yaml', global_percent=0.5, img_size=640, layer_keep=0.01, weights='weights/sparse2.pdparams')
W0407 00:42:36.749428 13987 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1
W0407 00:42:36.754426 13987 device_context.cc:465] device: 0, cuDNN Version: 7.6.
Overriding model.yaml nc=80 with nc=1

                 from  n    params  layer                                   arguments                     
  0                -1  1      1792  models.common.Conv                      [3, 16, 6, 2, 2]              
  1                -1  1      4736  models.common.Conv                      [16, 32, 3, 2]                
  2                -1  1      4992  models.common.C3                        [32, 32, 1]                   
  3                -1  1     18688  models.common.Conv                      [32, 64, 3, 2]                
  4                -1  2     29696  models.common.C3                        [64, 64, 2]                   
  5                -1  1     74240  models.common.Conv                      [64, 128, 3, 2]               
  6                -1  3    158208  models.common.C3                        [128, 128, 3]                 
  7                -1  1    295936  models.common.Conv                      [128, 256, 3, 2]              
  8                -1  1    297984  models.common.C3                        [256, 256, 1]                 
  9                -1  1    165376  models.common.SPPF                      [256, 256, 5]                 
 10                -1  1     33280  models.common.Conv                      [256, 128, 1, 1]              
 11                -1  1         0  models.common.Upsample                  [None, 2, 'nearest']          
 12           [-1, 6]  1         0  models.common.Concat                    [1]                           
 13                -1  1     91648  models.common.C3                        [256, 128, 1, False]          
 14                -1  1      8448  models.common.Conv                      [128, 64, 1, 1]               
 15                -1  1         0  models.common.Upsample                  [None, 2, 'nearest']          
 16           [-1, 4]  1         0  models.common.Concat                    [1]                           
 17                -1  1     23296  models.common.C3                        [128, 64, 1, False]           
 18                -1  1     37120  models.common.Conv                      [64, 64, 3, 2]                
 19          [-1, 14]  1         0  models.common.Concat                    [1]                           
 20                -1  1     75264  models.common.C3                        [128, 128, 1, False]          
 21                -1  1    147968  models.common.Conv                      [128, 128, 3, 2]              
 22          [-1, 10]  1         0  models.common.Concat                    [1]                           
 23                -1  1    297984  models.common.C3                        [256, 256, 1, False]          
 24      [17, 20, 23]  1      8118  models.yolo.Detect                      [1, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [64, 128, 256]]
Model Summary: 269 layers, 1774774 parameters, 1765270 gradients, 4.2 GFLOPs


let's test the original model first:
val: Scanning '/home/aistudio/data/data127016/valid.cache' images and labels... 
               Class     Images     Labels          P          R     mAP@.5 mAP@
                 all        881      90363      0.868      0.789      0.884       0.51
Global Threshold should be less than 0.5709.
Prune channels: 2040	Prune ratio: 0.429
merge the mask of layers connected to shortcut!
val: Scanning '/home/aistudio/data/data127016/valid.cache' images and labels... 
               Class     Images     Labels          P          R     mAP@.5 mAP@
                 all        881          0          0          0          0          0
mask the gamma as zero, mAP of the model is 0.0000

now prune the model but keep size,(actually add offset of BN beta to following layers), let's see how the mAP goes
val: Scanning '/home/aistudio/data/data127016/valid.cache' images and labels... 
               Class     Images     Labels          P          R     mAP@.5 mAP@
                 all        881      90363     0.0233     0.0156      0.012    0.00278
testing inference time...
testing the final model...
val: Scanning '/home/aistudio/data/data127016/valid.cache' images and labels... 
               Class     Images     Labels          P          R     mAP@.5 mAP@
                 all        881      90363     0.0233     0.0156      0.012    0.00278
+------------+-------------------------------------------------------------------------+-------------------------------------------------------------------------+
| Metric     | Before                                                                  | After                                                                   |
+------------+-------------------------------------------------------------------------+-------------------------------------------------------------------------+
| mAP        | 0.884103                                                                | 0.011989                                                                |
| Parameters | Tensor(shape=[1], dtype=int64, place=CUDAPlace(0), stop_gradient=False, | Tensor(shape=[1], dtype=int64, place=CUDAPlace(0), stop_gradient=False, |
|            |        [1774774])                                                       |        [636534])                                                        |
| Inference  | 0.0235                                                                  | 0.0216                                                                  |
+------------+-------------------------------------------------------------------------+-------------------------------------------------------------------------+
Config file has been saved: cfg/prune_0.5_keep_0.01_8x_yolov5n_v6_person.cfg
Compact model has been saved: weights/prune_0.5_keep_0.01_8x_sparse2.weights
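
The log line about adding the "offset of BN beta to following layers" refers to the fact that a channel whose gamma is 0 still outputs a constant: the BN output is just beta, and after the activation it is act(beta). The sketch below shows the idea for a 1x1 convolution at a single pixel, assuming a SiLU activation: the pruned channels are dropped and their constant contribution is folded into the next layer's bias. All names are illustrative, and the script may handle this differently, for example by adjusting the following BN layer's running mean instead of a bias.

```python
import math


def silu(x):
    """SiLU activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def conv1x1(weights, bias, x):
    """Pointwise convolution at one pixel; weights[o][i] maps input i to output o."""
    return [b + sum(w * v for w, v in zip(row, x)) for row, b in zip(weights, bias)]


def absorb_pruned_channels(weights, bias, betas, keep):
    """Drop pruned input channels and fold their constant output into the bias."""
    new_w = [[w for w, k in zip(row, keep) if k] for row in weights]
    new_b = [
        b + sum(w * silu(beta) for w, beta, k in zip(row, betas, keep) if not k)
        for row, b in zip(weights, bias)
    ]
    return new_w, new_b


if __name__ == "__main__":
    weights = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.5]]
    bias = [0.1, -0.2]
    betas = [0.3, -0.7, 1.2]     # BN beta of the preceding layer
    keep = [True, False, True]   # channel 1 has gamma masked to 0

    # Masked model: the pruned channel outputs the constant silu(beta).
    full = conv1x1(weights, bias, [0.4, silu(betas[1]), 0.9])

    # Compact model: the channel is removed and its effect lives in the bias.
    new_w, new_b = absorb_pruned_channels(weights, bias, betas, keep)
    compact = conv1x1(new_w, new_b, [0.4, 0.9])
    print(full, compact)
```

For kernels larger than 1x1 with zero padding the folded offset is only exact away from the image border, so small differences between the masked model and the compact model are possible.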

In [ ]

# Prune ratio 0.2
%cd /home/aistudio/yolov5_paddle_prune/
!python slim_prune_yolov5n_8x.py --cfg cfg/yolov5n_v6_person.cfg --data data/coco_person.yaml --weights weights/sparse2.pdparams --global_percent 0.2 --layer_keep 0.01 --img_size 640

/home/aistudio/yolov5_paddle_prune
Namespace(cfg='cfg/yolov5n_v6_person.cfg', data='data/coco_person.yaml', global_percent=0.2, img_size=640, layer_keep=0.01, weights='weights/sparse2.pdparams')
W0407 00:36:32.239961 13226 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1
W0407 00:36:32.244963 13226 device_context.cc:465] device: 0, cuDNN Version: 7.6.
Overriding model.yaml nc=80 with nc=1

                 from  n    params  layer                                   arguments                     
  0                -1  1      1792  models.common.Conv                      [3, 16, 6, 2, 2]              
  1                -1  1      4736  models.common.Conv                      [16, 32, 3, 2]                
  2                -1  1      4992  models.common.C3                        [32, 32, 1]                   
  3                -1  1     18688  models.common.Conv                      [32, 64, 3, 2]                
  4                -1  2     29696  models.common.C3                        [64, 64, 2]                   
  5                -1  1     74240  models.common.Conv                      [64, 128, 3, 2]               
  6                -1  3    158208  models.common.C3                        [128, 128, 3]                 
  7                -1  1    295936  models.common.Conv                      [128, 256, 3, 2]              
  8                -1  1    297984  models.common.C3                        [256, 256, 1]                 
  9                -1  1    165376  models.common.SPPF                      [256, 256, 5]                 
 10                -1  1     33280  models.common.Conv                      [256, 128, 1, 1]              
 11                -1  1         0  models.common.Upsample                  [None, 2, 'nearest']          
 12           [-1, 6]  1         0  models.common.Concat                    [1]                           
 13                -1  1     91648  models.common.C3                        [256, 128, 1, False]          
 14                -1  1      8448  models.common.Conv                      [128, 64, 1, 1]               
 15                -1  1         0  models.common.Upsample                  [None, 2, 'nearest']          
 16           [-1, 4]  1         0  models.common.Concat                    [1]                           
 17                -1  1     23296  models.common.C3                        [128, 64, 1, False]           
 18                -1  1     37120  models.common.Conv                      [64, 64, 3, 2]                
 19          [-1, 14]  1         0  models.common.Concat                    [1]                           
 20                -1  1     75264  models.common.C3                        [128, 128, 1, False]          
 21                -1  1    147968  models.common.Conv                      [128, 128, 3, 2]              
 22          [-1, 10]  1         0  models.common.Concat                    [1]                           
 23                -1  1    297984  models.common.C3                        [256, 256, 1, False]          
 24      [17, 20, 23]  1      8118  models.yolo.Detect                      [1, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [64, 128, 256]]
Model Summary: 269 layers, 1774774 parameters, 1765270 gradients, 4.2 GFLOPs


let's test the original model first:
val: Scanning '/home/aistudio/data/data127016/valid.cache' images and labels... 
               Class     Images     Labels          P          R     mAP@.5 mAP@
                 all        881      90363      0.868      0.789      0.884       0.51
Global Threshold should be less than 0.4824.
Prune channels: 752	Prune ratio: 0.158
merge the mask of layers connected to shortcut!
val: Scanning '/home/aistudio/data/data127016/valid.cache' images and labels... 
               Class     Images     Labels          P          R     mAP@.5 mAP@
                 all        881          0          0          0          0          0
mask the gamma as zero, mAP of the model is 0.0000

now prune the model but keep size,(actually add offset of BN beta to following layers), let's see how the mAP goes
val: Scanning '/home/aistudio/data/data127016/valid.cache' images and labels... 
               Class     Images     Labels          P          R     mAP@.5 mAP@
                 all        881      90363    0.00983   0.000996    0.00492    0.00113
testing inference time...
testing the final model...
val: Scanning '/home/aistudio/data/data127016/valid.cache' images and labels... 
               Class     Images     Labels          P          R     mAP@.5 mAP@
                 all        881      90363    0.00983   0.000996    0.00492    0.00113
+------------+-------------------------------------------------------------------------+-------------------------------------------------------------------------+
| Metric     | Before                                                                  | After                                                                   |
+------------+-------------------------------------------------------------------------+-------------------------------------------------------------------------+
| mAP        | 0.884103                                                                | 0.004918                                                                |
| Parameters | Tensor(shape=[1], dtype=int64, place=CUDAPlace(0), stop_gradient=False, | Tensor(shape=[1], dtype=int64, place=CUDAPlace(0), stop_gradient=False, |
|            |        [1774774])                                                       |        [1287670])                                                       |
| Inference  | 0.0238                                                                  | 0.0214                                                                  |
+------------+-------------------------------------------------------------------------+-------------------------------------------------------------------------+
Config file has been saved: cfg/prune_0.2_keep_0.01_8x_yolov5n_v6_person.cfg
Compact model has been saved: weights/prune_0.2_keep_0.01_8x_sparse2.weights
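
The two result tables above report the channel prune ratio and the parameter count separately. The short calculation below, using only numbers copied from those logs, puts them side by side.

```python
def reduction(before, after):
    """Fraction of parameters removed."""
    return 1.0 - after / before


BASE_PARAMS = 1774774  # "Before" parameter count in both tables

# global_percent -> (reported channel prune ratio, "After" parameter count)
RUNS = {
    0.5: (0.429, 636534),
    0.2: (0.158, 1287670),
}

if __name__ == "__main__":
    for percent, (channel_ratio, params_after) in RUNS.items():
        print(
            f"global_percent={percent}: channels pruned {channel_ratio:.1%}, "
            f"parameters removed {reduction(BASE_PARAMS, params_after):.1%}"
        )
```

The parameter reduction (about 64% and 27%) is larger than the channel reduction (42.9% and 15.8%) because a convolution's weight count scales with the product of its input and output channels, so pruning both sides of a layer compounds.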

In [ ]

# Prune ratio 0.1
%cd /home/aistudio/yolov5_paddle_prune/
!python slim_prune_yolov5n_8x.py --cfg cfg/yolov5n_v6_person.cfg --data data/coco_person.yaml --weights weights/sparse2.pdparams --global_percent 0.1 --layer_keep 0.01 --img_size 640

/home/aistudio/yolov5_paddle_prune
Namespace(cfg='cfg/yolov5n_v6_person.cfg', data='data/coco_person.yaml', global_percent=0.1, img_size=640, layer_keep=0.01, weights='weights/sparse2.pdparams')
W0407 00:47:40.726068 14664 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1
W0407 00:47:40.731101 14664 device_context.cc:465] device: 0, cuDNN Version: 7.6.
Overriding model.yaml nc=80 with nc=1

                 from  n    params  layer                                   arguments                     
  0                -1  1      1792  models.common.Conv                      [3, 16, 6, 2, 2]              
  1                -1  1      4736  models.common.Conv                      [16, 32, 3, 2]                
  2                -1  1      4992  models.common.C3                        [32, 32, 1]                   
  3                -1  1     18688  models.common.Conv                      [32, 64, 3, 2]                
  4                -1  2     29696  models.common.C3                        [64, 64, 2]                   
  5                -1  1     74240  models.common.Conv                      [64, 128, 3, 2]               
  6                -1  3    158208  models.common.C3                        [128, 128, 3]                 
  7                -1  1    295936  models.common.Conv                      [128, 256, 3, 2]              
  8                -1  1    297984  models.common.C3                        [256, 256, 1]                 
  9                -1  1    165376  models.common.SPPF                      [256, 256, 5]                 
 10                -1  1     33280  models.common.Conv                      [256, 128, 1, 1]              
 11                -1  1         0  models.common.Upsample                  [None, 2, 'nearest']          
 12           [-1, 6]  1         0  models.common.Concat                    [1]                           
 13                -1  1     91648  models.common.C3                        [256, 128, 1, False]          
 14                -1  1      8448  models.common.Conv                      [128, 64, 1, 1]               
 15                -1  1         0  models.common.Upsample                  [None, 2, 'nearest']          
 16           [-1, 4]  1         0  models.common.Concat                    [1]                           
 17                -1  1     23296  models.common.C3                        [128, 64, 1, False]           
 18                -1  1     37120  models.common.Conv                      [64, 64, 3, 2]                
 19          [-1, 14]  1         0  models.common.Concat                    [1]                           
 20                -1  1     75264  models.common.C3                        [128, 128, 1, False]          
 21                -1  1    147968  models.common.Conv                      [128, 128, 3, 2]              
 22          [-1, 10]  1         0  models.common.Concat                    [1]                           
 23                -1  1    297984  models.common.C3                        [256, 256, 1, False]          
 24      [17, 20, 23]  1      8118  models.yolo.Detect                      [1, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [64, 128, 256]]
Model Summary: 269 layers, 1774774 parameters, 1765270 gradients, 4.2 GFLOPs


let's test the original model first:
val: Scanning '/home/aistudio/data/data127016/valid.cache' images and labels... 
               Class     Images     Labels          P          R     mAP@.5 mAP@
                 all        881      90363      0.868      0.789      0.884       0.51
Global Threshold should be less than 0.2867.
Prune channels: 368	Prune ratio: 0.077
merge the mask of layers connected to shortcut!
val: Scanning '/home/aistudio/data/data127016/valid.cache' images and labels... 
               Class     Images     Labels          P          R     mAP@.5 mAP@
                 all        881      90363      0.687      0.671      0.731      0.298
mask the gamma as zero, mAP of the model is 0.7312

now prune the model but keep size,(actually add offset of BN beta to following layers), let's see how the mAP goes
val: Scanning '/home/aistudio/data/data127016/valid.cache' images and labels... 
               Class     Images     Labels          P          R     mAP@.5 mAP@
                 all        881      90363      0.669      0.645        0.7      0.268
testing inference time...
testing the final model...
val: Scanning '/home/aistudio/data/data127016/valid.cache' images and labels... 
               Class     Images     Labels          P          R     mAP@.5 mAP@
                 all        881      90363      0.669      0.645        0.7      0.268
+------------+-------------------------------------------------------------------------+-------------------------------------------------------------------------+
| Metric     | Before                                                                  | After                                                                   |
+------------+-------------------------------------------------------------------------+-------------------------------------------------------------------------+
| mAP        | 0.884103                                                                | 0.700221                                                                |
| Parameters | Tensor(shape=[1], dtype=int64, place=CUDAPlace(0), stop_gradient=False, | Tensor(shape=[1], dtype=int64, place=CUDAPlace(0), stop_gradient=False, |
|            |        [1774774])                                                       |        [1588150])                                                       |
| Inference  | 0.0231                                                                  | 0.0215                                                                  |
+------------+-------------------------------------------------------------------------+-------------------------------------------------------------------------+
Config file has been saved: cfg/prune_0.1_keep_0.01_8x_yolov5n_v6_person.cfg
Compact model has been saved: weights/prune_0.1_keep_0.01_8x_sparse2.weights
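
The log lines above ("Global Threshold should be less than 0.2867", "Prune channels: 368", and the `8x` in the saved file names) suggest a network-slimming style procedure: BN scale factors (gamma) are ranked globally, channels below a threshold are masked, and the kept count per layer is aligned to a multiple of 8. The following is a minimal NumPy sketch of that idea, not the project's actual code. The function names, the `>=` comparison, and the round-up rule are all assumptions made for illustration.

```python
import numpy as np


def threshold_upper_bound(gammas):
    """Smallest per-layer maximum of |gamma|.

    A global threshold above this value would remove every channel
    of at least one layer, so it acts as an upper bound.
    """
    return min(float(np.abs(g).max()) for g in gammas)


def global_threshold(gammas, prune_ratio):
    """|gamma| value below which roughly `prune_ratio` of all channels fall."""
    all_g = np.sort(np.concatenate([np.abs(g) for g in gammas]))
    idx = min(int(len(all_g) * prune_ratio), len(all_g) - 1)
    return float(all_g[idx])


def channel_mask(gamma, thresh, multiple=8):
    """Boolean keep-mask for one BN layer.

    Channels with |gamma| >= thresh are kept; the kept count is then
    rounded up to a multiple of `multiple` (assumed alignment rule),
    filling up with the next-largest gammas.
    """
    g = np.abs(gamma)
    keep = int((g >= thresh).sum())
    keep = max(multiple, -(-keep // multiple) * multiple)  # ceil to multiple
    keep = min(keep, len(g))
    order = np.argsort(-g)  # indices sorted by descending |gamma|
    mask = np.zeros(len(g), dtype=bool)
    mask[order[:keep]] = True
    return mask


if __name__ == "__main__":
    layers = [np.linspace(0.1, 1.6, 16), np.linspace(0.05, 0.8, 16)]
    t = min(global_threshold(layers, 0.5), threshold_upper_bound(layers))
    for g in layers:
        print(int(channel_mask(g, t).sum()), "of", len(g), "channels kept")
```

Rounding up means the effective prune ratio can end up lower than the requested one, which may be why the log reports 0.077 for a run named `prune_0.1`; that reading is a guess.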

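Fine-tuning

After 8x channel pruning the accuracy loss is severe (in the 0.1-ratio run above, mAP@.5 falls from 0.884 to 0.700), but in my view the accuracy of a freshly pruned model is only a rough reference. We can estimate a pruning ratio from experience and then fine-tune the pruned model. This article uses a pruning ratio of 0.5 as the example; after fine-tuning, the accuracy drops only slightly.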

python -m paddle.distributed.launch finetune_prune.py --img 640 --batch 128 --epochs 100 --data data/coco_person.yaml --cfg ./cfg/prune_0.5_keep_0.01_8x_yolov5n_v6_person.cfg --weights ./runs/train/s_person_finetune6/weights/last.pdparams --name s_person_finetune

In [ ]

# Fine-tune the pruned model
%cd /home/aistudio/yolov5_paddle_prune/
!python finetune_prune.py --img 640 --batch 16 --epochs 1 --data data/coco_person.yaml --cfg ./cfg/prune_0.5_keep_0.01_8x_yolov5n_v6_person.cfg --weights ./weights/prune_0.5_keep_0.01_8x_sparse2.pdparams --name s_person_finetune

In [ ]

# Validate the fine-tuned model and generate the .wts file required by TensorRT
%cd /home/aistudio/yolov5_paddle_prune/
!python prune_val.py --img 640  --data ./data/coco_person.yaml --cfg ./cfg/prune_0.5_keep_0.01_8x_yolov5n_v6_person.cfg --weights weights/finetune.pdparams --wts out.wts

/home/aistudio/yolov5_paddle_prune
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/__init__.py:107: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  from collections import MutableMapping
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/rcsetup.py:20: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  from collections import Iterable, Mapping
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/colors.py:53: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  from collections import Sized
prune_val: data=./data/coco_person.yaml, cfg=./cfg/prune_0.5_keep_0.01_8x_yolov5n_v6_person.cfg, weights=weights/finetune.pdparams, wts=out.wts, hyp=data/hyps/hyp.scratch.yaml, batch_size=32, imgsz=640, conf_thres=0.001, iou_thres=0.6, task=val, device=, single_cls=False, augment=False, verbose=False, save_txt=False, save_hybrid=False, save_conf=False, save_json=False, project=runs/val, name=exp, exist_ok=False
YOLOv5 🚀 6068140 paddle 2.2.2 
Overriding model.yaml nc=80 with nc=1

                 from  n    params  layer                                   arguments                     
W0407 00:17:06.039932 10758 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1
W0407 00:17:06.045732 10758 device_context.cc:465] device: 0, cuDNN Version: 7.6.
  0                -1  1      1792  models.common.Conv                      [3, 16, 6, 2, 2]              
  1                -1  1      4736  models.common.Conv                      [16, 32, 3, 2]                
  2                -1  1      4992  models.common.C3                        [32, 32, 1]                   
  3                -1  1     18688  models.common.Conv                      [32, 64, 3, 2]                
  4                -1  2     29696  models.common.C3                        [64, 64, 2]                   
  5                -1  1     74240  models.common.Conv                      [64, 128, 3, 2]               
  6                -1  3    158208  models.common.C3                        [128, 128, 3]                 
  7                -1  1    295936  models.common.Conv                      [128, 256, 3, 2]              
  8                -1  1    297984  models.common.C3                        [256, 256, 1]                 
  9                -1  1    165376  models.common.SPPF                      [256, 256, 5]                 
 10                -1  1     33280  models.common.Conv                      [256, 128, 1, 1]              
 11                -1  1         0  models.common.Upsample                  [None, 2, 'nearest']          
 12           [-1, 6]  1         0  models.common.Concat                    [1]                           
 13                -1  1     91648  models.common.C3                        [256, 128, 1, False]          
 14                -1  1      8448  models.common.Conv                      [128, 64, 1, 1]               
 15                -1  1         0  models.common.Upsample                  [None, 2, 'nearest']          
 16           [-1, 4]  1         0  models.common.Concat                    [1]                           
 17                -1  1     23296  models.common.C3                        [128, 64, 1, False]           
 18                -1  1     37120  models.common.Conv                      [64, 64, 3, 2]                
 19          [-1, 14]  1         0  models.common.Concat                    [1]                           
 20                -1  1     75264  models.common.C3                        [128, 128, 1, False]          
 21                -1  1    147968  models.common.Conv                      [128, 128, 3, 2]              
 22          [-1, 10]  1         0  models.common.Concat                    [1]                           
 23                -1  1    297984  models.common.C3                        [256, 256, 1, False]          
 24      [17, 20, 23]  1      8118  models.yolo.Detect                      [1, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [64, 128, 256]]
Model Summary: 269 layers, 1774774 parameters, 1765270 gradients, 4.2 GFLOPs

val: Scanning '/home/aistudio/data/data127016/valid.cache' images and labels... 
               Class     Images     Labels          P          R     mAP@.5 mAP@
                 all        881      90363      0.923      0.871      0.951      0.589
Speed: 0.2ms pre-process, 1.4ms inference, 11.9ms NMS per image at shape (32, 3, 640, 640)
Results saved to runs/val/exp6
wts file saved successfully!!!

In [ ]

# Run detection with the fine-tuned model to inspect the results
%cd /home/aistudio/yolov5_paddle_prune/
!python detect.py --weights ./weights/finetune.pdparams --cfg cfg/prune_0.5_keep_0.01_8x_yolov5n_v6_person.cfg --data ./data/coco_person.yaml --source 05000319.jpg

/home/aistudio/yolov5_paddle_prune
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/__init__.py:107: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  from collections import MutableMapping
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/rcsetup.py:20: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  from collections import Iterable, Mapping
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/colors.py:53: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  from collections import Sized
detect: weights=['./weights/finetune.pdparams'], cfg=cfg/prune_0.5_keep_0.01_8x_yolov5n_v6_person.cfg, source=05000319.jpg, imgsz=[640, 640], conf_thres=0.01, iou_thres=0.6, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, dnn=False, single_cls=False, data=./data/coco_person.yaml, hyp=data/hyps/hyp.scratch.yaml
YOLOv5 🚀 6068140 paddle 2.2.2 
Overriding model.yaml nc=80 with nc=1

                 from  n    params  layer                                   arguments                     
W0407 00:18:01.161935 10908 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1
W0407 00:18:01.167872 10908 device_context.cc:465] device: 0, cuDNN Version: 7.6.
  0                -1  1      1792  models.common.Conv                      [3, 16, 6, 2, 2]              
  1                -1  1      4736  models.common.Conv                      [16, 32, 3, 2]                
  2                -1  1      4992  models.common.C3                        [32, 32, 1]                   
  3                -1  1     18688  models.common.Conv                      [32, 64, 3, 2]                
  4                -1  2     29696  models.common.C3                        [64, 64, 2]                   
  5                -1  1     74240  models.common.Conv                      [64, 128, 3, 2]               
  6                -1  3    158208  models.common.C3                        [128, 128, 3]                 
  7                -1  1    295936  models.common.Conv                      [128, 256, 3, 2]              
  8                -1  1    297984  models.common.C3                        [256, 256, 1]                 
  9                -1  1    165376  models.common.SPPF                      [256, 256, 5]                 
 10                -1  1     33280  models.common.Conv                      [256, 128, 1, 1]              
 11                -1  1         0  models.common.Upsample                  [None, 2, 'nearest']          
 12           [-1, 6]  1         0  models.common.Concat                    [1]                           
 13                -1  1     91648  models.common.C3                        [256, 128, 1, False]          
 14                -1  1      8448  models.common.Conv                      [128, 64, 1, 1]               
 15                -1  1         0  models.common.Upsample                  [None, 2, 'nearest']          
 16           [-1, 4]  1         0  models.common.Concat                    [1]                           
 17                -1  1     23296  models.common.C3                        [128, 64, 1, False]           
 18                -1  1     37120  models.common.Conv                      [64, 64, 3, 2]                
 19          [-1, 14]  1         0  models.common.Concat                    [1]                           
 20                -1  1     75264  models.common.C3                        [128, 128, 1, False]          
 21                -1  1    147968  models.common.Conv                      [128, 128, 3, 2]              
 22          [-1, 10]  1         0  models.common.Concat                    [1]                           
 23                -1  1    297984  models.common.C3                        [256, 256, 1, False]          
 24      [17, 20, 23]  1      8118  models.yolo.Detect                      [1, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [64, 128, 256]]
Model Summary: 269 layers, 1774774 parameters, 1765270 gradients, 4.2 GFLOPs

detect.py:302: DeprecationWarning: In future, it will be an error for 'np.bool_' scalars to be interpreted as an index
  s += f"{n} {names[int(c)]}{'s' * (n > 1)}, "  # add to string
image 1/1 /home/aistudio/yolov5_paddle_prune/05000319.jpg: 448x640 183 persons, Done. (0.029s)
Speed: 1.0ms pre-process, 29.0ms inference, 8.3ms NMS per image at shape (1, 3, 640, 640)
Results saved to runs/detect/exp

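The final mAP reaches 58.9 (the 0.589 in the last column of the validation log above). That is a clear improvement over the YOLOv3 reference of 51.8 and only 2.3 points below the initial training result of 61.2, which matches what we expect from pruning followed by fine-tuning.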
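
As a quick check of these comparisons, the small sketch below recomputes the differences in mAP points from the three figures quoted in this article (51.8 for the YOLOv3 reference, 61.2 for the initial training, 58.9 after pruning and fine-tuning). The variable and function names are illustrative only.

```python
def points(a, b):
    """Difference a - b in mAP points, rounded to one decimal place."""
    return round(a - b, 1)


yolov3_reference = 51.8   # reference figure quoted at the top of the article
baseline_training = 61.2  # validation mAP of the unpruned baseline (0.612)
pruned_finetuned = 58.9   # validation mAP after pruning and fine-tuning (0.589)

print(points(baseline_training, pruned_finetuned))  # 2.3
print(points(pruned_finetuned, yolov3_reference))   # 7.1
```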

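Manual conversion to TensorRT

With the out.wts file generated in the previous step, the network can be built layer by layer with the TensorRT API, which gives the speed of layers that the API supports natively.

The project also adapts to the pruned channel counts automatically, so the channel numbers of the TensorRT layers need no extra manual configuration, which speeds up development.

For details, see my GitHub project: https://github.com/thunder95/tensorrtx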
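
For readers unfamiliar with `.wts` files, the sketch below writes and reads the plain-text layout commonly used by tensorrtx-style weight exporters: a first line with the number of tensors, then one line per tensor containing its name, its element count, and each float32 value as big-endian hex. Whether `prune_val.py` emits exactly this layout is an assumption. The helper names `write_wts` and `read_wts` and the tensor names in the demo are hypothetical.

```python
import os
import struct
import tempfile

import numpy as np


def write_wts(weights, path):
    """Write a dict of name -> array as a text .wts file (assumed layout)."""
    with open(path, "w") as f:
        f.write(f"{len(weights)}\n")
        for name, arr in weights.items():
            flat = np.asarray(arr, dtype=np.float32).reshape(-1)
            f.write(f"{name} {len(flat)}")
            for v in flat:
                # each value is stored as 8 hex digits of a big-endian float32
                f.write(" " + struct.pack(">f", float(v)).hex())
            f.write("\n")


def read_wts(path):
    """Read a .wts file back into a dict of name -> flat float32 array."""
    out = {}
    with open(path) as f:
        count = int(f.readline())
        for _ in range(count):
            parts = f.readline().split()
            name, n = parts[0], int(parts[1])
            vals = [struct.unpack(">f", bytes.fromhex(h))[0]
                    for h in parts[2:2 + n]]
            out[name] = np.array(vals, dtype=np.float32)
    return out


if __name__ == "__main__":
    demo = {"model.0.bn.weight": np.ones(8, dtype=np.float32)}  # hypothetical name
    path = os.path.join(tempfile.mkdtemp(), "demo.wts")
    write_wts(demo, path)
    loaded = read_wts(path)
    # The element count of a BN weight equals the layer's output channels,
    # so a builder can infer pruned channel counts from the file itself.
    print(len(loaded["model.0.bn.weight"]))
```

Because the element counts travel with the weights, a network builder can derive each layer's channel count from the file rather than from hard-coded constants, which is one plausible way to get the adaptive behaviour described above.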

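Final notes

Due to time constraints, the experiments in this project are still fairly rough. Overall, it builds a simple lightweight deployment pipeline on PaddlePaddle and the YOLOv5n v6 model, which I hope offers some reference value. Later I plan to deploy the model to Android phones with Paddle-Lite.

If this helped you, please support the project with a fork. Thanks!

About the author

Leader of the Chengdu PaddlePaddle Pilot Group
PPDE
AICA cohort 3 trainee

I have reached Diamond level on AI Studio and earned 10 badges. Feel free to follow me and I will follow back: https://aistudio.baidu.com/aistudio/personalcenter/thirdview/89442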
