Replace the document detection model
60
paddle_detection/configs/picodet/legacy_model/README.md
Normal file
@@ -0,0 +1,60 @@
# PP-PicoDet Legacy Model-ZOO (2021.10)

| Model | Input size | mAP<sup>val<br>0.5:0.95 | mAP<sup>val<br>0.5 | Params<br><sup>(M) | FLOPS<br><sup>(G) | Latency<sup><small>[NCNN](#latency)</small><sup><br><sup>(ms) | Latency<sup><small>[Lite](#latency)</small><sup><br><sup>(ms) | Download | Config |
| :-------- | :--------: | :---------------------: | :----------------: | :----------------: | :---------------: | :-----------------------------: | :-----------------------------: | :----------------------------------------: | :--------------------------------------- |
| PicoDet-S | 320*320 | 27.1 | 41.4 | 0.99 | 0.73 | 8.13 | **6.65** | [model](https://paddledet.bj.bcebos.com/models/picodet_s_320_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_s_320_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_s_320_coco.yml) |
| PicoDet-S | 416*416 | 30.7 | 45.8 | 0.99 | 1.24 | 12.37 | **9.82** | [model](https://paddledet.bj.bcebos.com/models/picodet_s_416_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_s_416_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_s_416_coco.yml) |
| PicoDet-M | 320*320 | 30.9 | 45.7 | 2.15 | 1.48 | 11.27 | **9.61** | [model](https://paddledet.bj.bcebos.com/models/picodet_m_320_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_m_320_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_m_320_coco.yml) |
| PicoDet-M | 416*416 | 34.8 | 50.5 | 2.15 | 2.50 | 17.39 | **15.88** | [model](https://paddledet.bj.bcebos.com/models/picodet_m_416_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_m_416_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_m_416_coco.yml) |
| PicoDet-L | 320*320 | 32.9 | 48.2 | 3.30 | 2.23 | 15.26 | **13.42** | [model](https://paddledet.bj.bcebos.com/models/picodet_l_320_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_320_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_l_320_coco.yml) |
| PicoDet-L | 416*416 | 36.6 | 52.5 | 3.30 | 3.76 | 23.36 | **21.85** | [model](https://paddledet.bj.bcebos.com/models/picodet_l_416_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_416_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_l_416_coco.yml) |
| PicoDet-L | 640*640 | 40.9 | 57.6 | 3.30 | 8.91 | 54.11 | **50.55** | [model](https://paddledet.bj.bcebos.com/models/picodet_l_640_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_640_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_l_640_coco.yml) |

#### More Configs

| Model | Input size | mAP<sup>val<br>0.5:0.95 | mAP<sup>val<br>0.5 | Params<br><sup>(M) | FLOPS<br><sup>(G) | Latency<sup><small>[NCNN](#latency)</small><sup><br><sup>(ms) | Latency<sup><small>[Lite](#latency)</small><sup><br><sup>(ms) | Download | Config |
| :--------------------------- | :--------: | :---------------------: | :----------------: | :----------------: | :---------------: | :-----------------------------: | :-----------------------------: | :----------------------------------------: | :--------------------------------------- |
| PicoDet-Shufflenetv2 1x | 416*416 | 30.0 | 44.6 | 1.17 | 1.53 | 15.06 | **10.63** | [model](https://paddledet.bj.bcebos.com/models/picodet_shufflenetv2_1x_416_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_shufflenetv2_1x_416_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/more_config/picodet_shufflenetv2_1x_416_coco.yml) |
| PicoDet-MobileNetv3-large 1x | 416*416 | 35.6 | 52.0 | 3.55 | 2.80 | 20.71 | **17.88** | [model](https://paddledet.bj.bcebos.com/models/picodet_mobilenetv3_large_1x_416_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_mobilenetv3_large_1x_416_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/more_config/picodet_mobilenetv3_large_1x_416_coco.yml) |
| PicoDet-LCNet 1.5x | 416*416 | 36.3 | 52.2 | 3.10 | 3.85 | 21.29 | **20.8** | [model](https://paddledet.bj.bcebos.com/models/picodet_lcnet_1_5x_416_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_lcnet_1_5x_416_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/more_config/picodet_lcnet_1_5x_416_coco.yml) |
| PicoDet-LCNet 1.5x | 640*640 | 40.6 | 57.4 | 3.10 | - | - | - | [model](https://paddledet.bj.bcebos.com/models/picodet_lcnet_1_5x_640_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_lcnet_1_5x_640_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/more_config/picodet_lcnet_1_5x_640_coco.yml) |
| PicoDet-R18 | 640*640 | 40.7 | 57.2 | 11.10 | - | - | - | [model](https://paddledet.bj.bcebos.com/models/picodet_r18_640_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_r18_640_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/more_config/picodet_r18_640_coco.yml) |

<details open>
<summary><b>Table Notes:</b></summary>

- <a name="latency">Latency:</a> All models are tested on a `Qualcomm Snapdragon 865(4xA77+4xA55)` with 4 threads, on ARMv8, with FP16. In the tables above, latency is measured with [NCNN](https://github.com/Tencent/ncnn) and with [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite) (the `Lite` column). The benchmark code is [MobileDetBenchmark](https://github.com/JiweiMaster/MobileDetBenchmark).
- PicoDet is trained on the COCO train2017 dataset and evaluated on COCO val2017.
- PicoDet is trained on 4 or 8 GPUs, and all checkpoints are trained with the default settings and hyperparameters.

</details>

- Deploy models

| Model | Input size | ONNX | Paddle Lite(fp32) | Paddle Lite(fp16) |
| :-------- | :--------: | :---------------------: | :----------------: | :----------------: |
| PicoDet-S | 320*320 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_320_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_320.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_320_fp16.tar) |
| PicoDet-S | 416*416 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_416_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_416.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_416_fp16.tar) |
| PicoDet-M | 320*320 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_320_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_320.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_320_fp16.tar) |
| PicoDet-M | 416*416 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_416_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_416.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_416_fp16.tar) |
| PicoDet-L | 320*320 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_320_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_320.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_320_fp16.tar) |
| PicoDet-L | 416*416 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_416_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_416.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_416_fp16.tar) |
| PicoDet-L | 640*640 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_640_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_640.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_640_fp16.tar) |
| PicoDet-Shufflenetv2 1x | 416*416 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_shufflenetv2_1x_416_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_shufflenetv2_1x.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_shufflenetv2_1x_fp16.tar) |
| PicoDet-MobileNetv3-large 1x | 416*416 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_mobilenetv3_large_1x_416_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_mobilenetv3_large_1x.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_mobilenetv3_large_1x_fp16.tar) |
| PicoDet-LCNet 1.5x | 416*416 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_lcnet_1_5x_416_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_lcnet_1_5x.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_lcnet_1_5x_fp16.tar) |

## Cite PP-PicoDet
```
@misc{yu2021pppicodet,
      title={PP-PicoDet: A Better Real-Time Object Detector on Mobile Devices},
      author={Guanghua Yu and Qinyao Chang and Wenyu Lv and Chang Xu and Cheng Cui and Wei Ji and Qingqing Dang and Kaipeng Deng and Guanzhong Wang and Yuning Du and Baohua Lai and Qiwen Liu and Xiaoguang Hu and Dianhai Yu and Yanjun Ma},
      year={2021},
      eprint={2111.00902},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

```
@@ -0,0 +1,18 @@
epoch: 100

LearningRate:
  base_lr: 0.4
  schedulers:
  - name: CosineDecay
    max_epochs: 100
  - name: LinearWarmup
    start_factor: 0.1
    steps: 300

OptimizerBuilder:
  optimizer:
    momentum: 0.9
    type: Momentum
  regularizer:
    factor: 0.00004
    type: L2
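
As a rough illustration of the schedule configured above (`CosineDecay` combined with `LinearWarmup`), the following Python sketch computes the learning rate per iteration. It assumes the common formulation: a linear ramp from `start_factor * base_lr` to `base_lr` over the warmup steps, followed by a cosine curve over all training iterations. The function name and the `steps_per_epoch` parameter are hypothetical and not taken from the config; the framework's own implementation may differ in detail.

```python
import math


def learning_rate(step, steps_per_epoch, base_lr=0.4, max_epochs=100,
                  warmup_steps=300, start_factor=0.1):
    """Return the learning rate at a 0-based global iteration (illustrative only)."""
    if step < warmup_steps:
        # Linear warmup: the factor ramps from start_factor up to 1.0.
        alpha = step / warmup_steps
        return base_lr * (start_factor * (1.0 - alpha) + alpha)
    # Cosine decay measured in iterations over the whole run.
    max_iters = max_epochs * steps_per_epoch
    return base_lr * 0.5 * (math.cos(math.pi * step / max_iters) + 1.0)


if __name__ == "__main__":
    # steps_per_epoch=1000 is an arbitrary value chosen for the demonstration.
    for step in (0, 150, 300, 50_000, 100_000):
        print(step, round(learning_rate(step, steps_per_epoch=1000), 6))
```

With these defaults the rate starts at 10% of `base_lr`, reaches roughly `base_lr` when warmup ends, and falls to zero at the last iteration.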
@@ -0,0 +1,18 @@
epoch: 300

LearningRate:
  base_lr: 0.4
  schedulers:
  - name: CosineDecay
    max_epochs: 300
  - name: LinearWarmup
    start_factor: 0.1
    steps: 300

OptimizerBuilder:
  optimizer:
    momentum: 0.9
    type: Momentum
  regularizer:
    factor: 0.00004
    type: L2
@@ -0,0 +1,42 @@
worker_num: 6
eval_height: &eval_height 320
eval_width: &eval_width 320
eval_size: &eval_size [*eval_height, *eval_width]

TrainReader:
  sample_transforms:
  - Decode: {}
  - RandomCrop: {}
  - RandomFlip: {prob: 0.5}
  - RandomDistort: {}
  batch_transforms:
  - BatchRandomResize: {target_size: [256, 288, 320, 352, 384], random_size: True, random_interp: True, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_size: 128
  shuffle: true
  drop_last: true
  collate_batch: false


EvalReader:
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 8
  shuffle: false


TestReader:
  inputs_def:
    image_shape: [1, 3, *eval_height, *eval_width]
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_size: 1
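
To make the `NormalizeImage` and `Permute` steps in the readers above more concrete, here is a minimal pure-Python sketch. It assumes the usual meaning of these settings: with `is_scale: true` pixel values are first divided by 255, then the per-channel `mean` is subtracted and the result is divided by `std`, and `Permute` reorders the image from height x width x channel to channel x height x width. The function names are hypothetical, and nested lists stand in for the arrays a real pipeline would use.

```python
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb, mean=MEAN, std=STD, is_scale=True):
    """Normalize one (R, G, B) pixel given as 0-255 values."""
    out = []
    for value, m, s in zip(rgb, mean, std):
        if is_scale:
            value = value / 255.0  # is_scale: map 0-255 to 0-1 first
        out.append((value - m) / s)
    return out


def hwc_to_chw(image):
    """Permute an H x W x C nested list into C x H x W."""
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    return [[[image[y][x][c] for x in range(width)] for y in range(height)]
            for c in range(channels)]


if __name__ == "__main__":
    image = [[[0, 0, 0], [255, 255, 255]]]  # a 1 x 2 RGB image
    normalized = [[normalize_pixel(px) for px in row] for row in image]
    print(hwc_to_chw(normalized))
```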
@@ -0,0 +1,42 @@
worker_num: 6
eval_height: &eval_height 416
eval_width: &eval_width 416
eval_size: &eval_size [*eval_height, *eval_width]

TrainReader:
  sample_transforms:
  - Decode: {}
  - RandomCrop: {}
  - RandomFlip: {prob: 0.5}
  - RandomDistort: {}
  batch_transforms:
  - BatchRandomResize: {target_size: [352, 384, 416, 448, 480], random_size: True, random_interp: True, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_size: 80
  shuffle: true
  drop_last: true
  collate_batch: false


EvalReader:
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 8
  shuffle: false


TestReader:
  inputs_def:
    image_shape: [1, 3, *eval_height, *eval_width]
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_size: 1
@@ -0,0 +1,42 @@
worker_num: 6
eval_height: &eval_height 640
eval_width: &eval_width 640
eval_size: &eval_size [*eval_height, *eval_width]

TrainReader:
  sample_transforms:
  - Decode: {}
  - RandomCrop: {}
  - RandomFlip: {prob: 0.5}
  - RandomDistort: {}
  batch_transforms:
  - BatchRandomResize: {target_size: [576, 608, 640, 672, 704], random_size: True, random_interp: True, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_size: 56
  shuffle: true
  drop_last: true
  collate_batch: false


EvalReader:
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 8
  shuffle: false


TestReader:
  inputs_def:
    image_shape: [1, 3, *eval_height, *eval_width]
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_size: 1
@@ -0,0 +1,55 @@
architecture: PicoDet
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x1_0_pretrained.pdparams

PicoDet:
  backbone: ESNet
  neck: CSPPAN
  head: PicoHead

ESNet:
  scale: 1.0
  feature_maps: [4, 11, 14]
  act: hard_swish
  channel_ratio: [0.875, 0.5, 1.0, 0.625, 0.5, 0.75, 0.625, 0.625, 0.5, 0.625, 1.0, 0.625, 0.75]

CSPPAN:
  out_channels: 128
  use_depthwise: True
  num_csp_blocks: 1
  num_features: 4

PicoHead:
  conv_feat:
    name: PicoFeat
    feat_in: 128
    feat_out: 128
    num_convs: 4
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
  fpn_stride: [8, 16, 32, 64]
  feat_in_chan: 128
  prior_prob: 0.01
  reg_max: 7
  cell_offset: 0.5
  loss_class:
    name: VarifocalLoss
    use_sigmoid: True
    iou_weighted: True
    loss_weight: 1.0
  loss_dfl:
    name: DistributionFocalLoss
    loss_weight: 0.25
  loss_bbox:
    name: GIoULoss
    loss_weight: 2.0
  assigner:
    name: SimOTAAssigner
    candidate_topk: 10
    iou_weight: 6
  nms:
    name: MultiClassNMS
    nms_top_k: 1000
    keep_top_k: 100
    score_threshold: 0.025
    nms_threshold: 0.6
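
The `reg_max: 7` setting together with `DistributionFocalLoss` suggests that each box side is predicted as a discrete distribution over `reg_max + 1 = 8` bins, as in Generalized Focal Loss. The sketch below shows the generic decoding step under that assumption: the distance for one side is the expected bin index under a softmax, scaled by the feature-map stride. All function names are hypothetical, and the exact decoding in the head (for example how `cell_offset` shifts the cell center) is not reproduced here.

```python
import math


def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_distance(logits, stride, reg_max=7):
    """Expected bin index under the softmax distribution, scaled by stride."""
    if len(logits) != reg_max + 1:
        raise ValueError("expected reg_max + 1 logits per box side")
    probs = softmax(logits)
    return stride * sum(i * p for i, p in enumerate(probs))


def decode_box(center_x, center_y, side_logits, stride, reg_max=7):
    """side_logits holds four logit lists: left, top, right, bottom."""
    left, top, right, bottom = (
        decode_distance(side, stride, reg_max) for side in side_logits)
    return [center_x - left, center_y - top, center_x + right, center_y + bottom]


if __name__ == "__main__":
    uniform = [0.0] * 8  # a flat distribution gives the mid-range distance
    print(decode_box(100.0, 100.0, [uniform] * 4, stride=8))
```

Under this reading, the largest distance a side can express is `reg_max * stride`, which is one reason the head predicts on several strides (`fpn_stride: [8, 16, 32, 64]`).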
@@ -0,0 +1,56 @@
# More Applications

## 1. Layout Analysis

Layout analysis divides a document image into regions and locates key areas such as text, titles, tables, and figures. The figure below shows an example.

<div align="center">
    <img src="images/layout_demo.png" width="800">
</div>

### 1.1 Dataset

The English document layout analysis model is trained on [PubLayNet](https://github.com/ibm-aur-nlp/PubLayNet), a dataset of English scientific literature (papers). It is split into a training set (333,703 annotated images), a validation set (11,245 annotated images), and a test set (11,405 images), and covers 5 classes: Table, Figure, Title, Text, and List. For more datasets, see [layout analysis datasets](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/ppstructure/layout/README.md#32).

### 1.2 Model Zoo

The PicoDet models are trained on the PubLayNet dataset, and FGD distillation is also applied. The pretrained models are as follows:

| Model | Input size | mAP<sup>val<br/>0.5 | Download | Config |
| :-------- | :--------: | :----------------: | :---------------: | ----------------- |
| PicoDet-LCNet_x1_0 | 800*608 | 93.5% | [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_layout.pdparams) &#124; [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_layout_infer.tar) | [config](./picodet_lcnet_x1_0_layout.yml) |
| PicoDet-LCNet_x1_0 + FGD | 800*608 | 94.0% | [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout.pdparams) &#124; [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_infer.tar) | [teacher config](./picodet_lcnet_x2_5_layout.yml) &#124; [student config](./picodet_lcnet_x1_0_layout.yml) |

[Introduction to FGD distillation](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/configs/slim/distill/README.md)

### 1.3 Model Inference

For the full layout analysis workflow (data preparation, model training, evaluation, and so on), see [Layout Analysis](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/ppstructure/layout/README.md). Only model inference is shown here. First, download the inference model from the model zoo.

```
mkdir inference_model
cd inference_model
# Download and extract the PubLayNet inference model
wget https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_infer.tar && tar xf picodet_lcnet_x1_0_fgd_layout_infer.tar
cd ..
```

To run inference with the downloaded model, execute the following command:

```bash
python3 deploy/python/infer.py \
    --model_dir=inference_model/picodet_lcnet_x1_0_fgd_layout_infer/ \
    --image_file=docs/images/layout.jpg \
    --device=CPU
```

The visualized layout result is shown below:

<div align="center">
    <img src="images/layout_res.jpg" width="800">
</div>
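
If the raw detections are needed rather than the visualization, they can be filtered with a few lines of Python. The sketch below is only an illustration: it assumes each detection is a row of the form `[class_id, score, x_min, y_min, x_max, y_max]`, and the label order in `LAYOUT_LABELS` is a guess that should be checked against the label list shipped with the inference model. The function name and the threshold value are hypothetical.

```python
# Assumed label order (hypothetical); verify against the model's label list.
LAYOUT_LABELS = ["text", "title", "list", "table", "figure"]


def filter_detections(rows, threshold=0.5, labels=LAYOUT_LABELS):
    """Keep rows scoring at least threshold, sorted by descending score.

    Each row is assumed to be [class_id, score, x_min, y_min, x_max, y_max].
    """
    kept = []
    for class_id, score, x1, y1, x2, y2 in rows:
        if score < threshold:
            continue
        kept.append({
            "label": labels[int(class_id)],
            "score": score,
            "bbox": [x1, y1, x2, y2],
        })
    kept.sort(key=lambda item: item["score"], reverse=True)
    return kept


if __name__ == "__main__":
    sample = [
        [0, 0.90, 10, 20, 300, 80],
        [3, 0.20, 0, 0, 50, 50],
        [4, 0.95, 40, 100, 560, 400],
    ]
    for item in filter_detections(sample):
        print(item)
```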

## 2. Reference

[1] Zhong X, Tang J, Yepes A J. PubLayNet: largest dataset ever for document layout analysis[C]//2019 International Conference on Document Analysis and Recognition (ICDAR). IEEE, 2019: 1015-1022.
Binary file not shown. Size: 179 KiB
Binary file not shown. Size: 451 KiB
@@ -0,0 +1,90 @@
_BASE_: [
  '../../../../runtime.yml',
  '../../_base_/picodet_esnet.yml',
  '../../_base_/optimizer_100e.yml',
  '../../_base_/picodet_640_reader.yml',
]

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/LCNet_x1_0_pretrained.pdparams
weights: output/picodet_lcnet_x1_0_layout/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 10
snapshot_epoch: 1
epoch: 100

PicoDet:
  backbone: LCNet
  neck: CSPPAN
  head: PicoHead
  nms_cpu: True

LCNet:
  scale: 1.0
  feature_maps: [3, 4, 5]

metric: COCO
num_classes: 5

TrainDataset:
  name: COCODataSet
  image_dir: train
  anno_path: train.json
  dataset_dir: ./dataset/publaynet/
  data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']

EvalDataset:
  name: COCODataSet
  image_dir: val
  anno_path: val.json
  dataset_dir: ./dataset/publaynet/

TestDataset:
  !ImageFolder
    anno_path: ./dataset/publaynet/val.json


worker_num: 8
eval_height: &eval_height 800
eval_width: &eval_width 608
eval_size: &eval_size [*eval_height, *eval_width]

TrainReader:
  sample_transforms:
  - Decode: {}
  - RandomCrop: {}
  - RandomFlip: {prob: 0.5}
  - RandomDistort: {}
  batch_transforms:
  - BatchRandomResize: {target_size: [[768, 576], [800, 608], [832, 640]], random_size: True, random_interp: True, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_size: 24
  shuffle: true
  drop_last: true
  collate_batch: false

EvalReader:
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: [800, 608], keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 8
  shuffle: false


TestReader:
  inputs_def:
    image_shape: [1, 3, 800, 608]
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: [800, 608], keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 1
  shuffle: false
@@ -0,0 +1,34 @@
_BASE_: [
  '../../_base_/picodet_esnet.yml',
]

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/LCNet_x2_5_ssld_pretrained.pdparams
weights: output/picodet_lcnet_x2_5_layout/model_final
find_unused_parameters: True

PicoDet:
  backbone: LCNet
  neck: CSPPAN
  head: PicoHead
  nms_cpu: True

LCNet:
  scale: 2.5
  feature_maps: [3, 4, 5]

CSPPAN:
  spatial_scales: [0.125, 0.0625, 0.03125]

slim: Distill
slim_method: FGD
distill_loss: FGDFeatureLoss
distill_loss_name: ['neck_f_3', 'neck_f_2', 'neck_f_1', 'neck_f_0']

FGDFeatureLoss:
  student_channels: 128
  teacher_channels: 128
  temp: 0.5
  alpha_fgd: 0.001
  beta_fgd: 0.0005
  gamma_fgd: 0.0005
  lambda_fgd: 0.000005
@@ -0,0 +1,30 @@
# More Applications

## 1. Main-Body Detection

Main-body detection is a widely used detection technique. It locates the coordinates of one or more main objects in an image, then crops the corresponding regions for recognition, which completes the recognition pipeline. As a preliminary step for recognition, it can effectively improve recognition accuracy.

Main-body detection is used in the PP-ShiTu image recognition system of PaddleClas, and the main-body detection model in PP-ShiTu is based on PP-PicoDet. For more on PP-ShiTu and how to use it, see [PP-ShiTu](https://github.com/PaddlePaddle/PaddleClas).

### 1.1 Dataset

The main-body detection model for the PP-ShiTu image recognition task is trained mainly on the following datasets.

| Dataset | Size | Images used for main-body detection | Scenario | Link |
| :------------: | :-------------: | :-------: | :-------: | :--------: |
| Objects365 | 1700k | 173k | General | [link](https://www.objects365.org/overview.html) |
| COCO2017 | 118k | 118k | General | [link](https://cocodataset.org/) |
| iCartoonFace | 48k | 48k | Anime face detection | [link](https://github.com/luxiangju-PersonAI/iCartoonFace) |
| LogoDet-3k | 155k | 155k | Logo detection | [link](https://github.com/Wangjing1551/LogoDet-3K-Dataset) |
| RPC | 54k | 54k | Product detection | [link](https://rpc-dataset.github.io/) |

During training, all datasets are mixed together. Because the task is main-body detection, the class of every annotated box is changed to `foreground`, so the merged dataset contains a single class, foreground. For the dataset configuration, see [picodet_lcnet_x2_5_640_mainbody.yml](./picodet_lcnet_x2_5_640_mainbody.yml).
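
The merging step described above can be sketched in Python. This is a minimal illustration, assuming every source dataset has already been converted to COCO-style dictionaries with `images` and `annotations` lists; the function name `merge_as_foreground` is hypothetical and is not the script used to build the actual training set.

```python
def merge_as_foreground(datasets, category_name="foreground"):
    """Merge COCO-style dicts, remapping every box to one category."""
    merged = {
        "images": [],
        "annotations": [],
        "categories": [{"id": 1, "name": category_name}],
    }
    next_image_id = 1
    next_annotation_id = 1
    for dataset in datasets:
        # Image ids may collide across datasets, so assign fresh ones.
        id_map = {}
        for image in dataset["images"]:
            id_map[image["id"]] = next_image_id
            merged["images"].append({**image, "id": next_image_id})
            next_image_id += 1
        for annotation in dataset["annotations"]:
            merged["annotations"].append({
                **annotation,
                "id": next_annotation_id,
                "image_id": id_map[annotation["image_id"]],
                "category_id": 1,  # every labeled box becomes foreground
            })
            next_annotation_id += 1
    return merged


if __name__ == "__main__":
    a = {"images": [{"id": 1, "file_name": "a.jpg"}],
         "annotations": [{"id": 1, "image_id": 1, "category_id": 17, "bbox": [0, 0, 5, 5]}]}
    b = {"images": [{"id": 1, "file_name": "b.jpg"}],
         "annotations": [{"id": 1, "image_id": 1, "category_id": 3, "bbox": [1, 1, 2, 2]}]}
    print(merge_as_foreground([a, b]))
```

Re-indexing image and annotation ids keeps the merged file consistent even when the source datasets reuse the same ids.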

### 1.2 Model Zoo

| Model | Input size | mAP<sup>val<br>0.5:0.95 | mAP<sup>val<br>0.5 | Download | Config |
| :-------- | :--------: | :---------------------: | :----------------: | :----------------: | :---------------: |
| PicoDet-LCNet_x2_5 | 640*640 | 41.5 | 62.0 | [trained model](https://paddledet.bj.bcebos.com/models/picodet_lcnet_x2_5_640_mainbody.pdparams) &#124; [inference model](https://paddledet.bj.bcebos.com/models/picodet_lcnet_x2_5_640_mainbody_infer.tar) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_lcnet_x2_5_640_mainbody.log) | [config](./picodet_lcnet_x2_5_640_mainbody.yml) |
@@ -0,0 +1,23 @@
_BASE_: [
  '../../../../datasets/coco_detection.yml',
  '../../../../runtime.yml',
  '../../_base_/picodet_esnet.yml',
  '../../_base_/optimizer_100e.yml',
  '../../_base_/picodet_640_reader.yml',
]

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/LCNet_x2_5_ssld_pretrained.pdparams
weights: output/picodet_lcnet_x2_5_640_mainbody/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 20
snapshot_epoch: 2

PicoDet:
  backbone: LCNet
  neck: CSPPAN
  head: PicoHead

LCNet:
  scale: 2.5
  feature_maps: [3, 4, 5]
@@ -0,0 +1,149 @@
use_gpu: true
log_iter: 20
save_dir: output
snapshot_epoch: 1
print_flops: false
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x0_75_pretrained.pdparams
weights: output/picodet_s_192_pedestrian/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10
epoch: 300
metric: COCO
num_classes: 1
# Exporting the model
export:
  post_process: False  # Whether post-processing is included in the network when exporting the model.
  nms: False           # Whether NMS is included in the network when exporting the model.
  benchmark: False     # Used to test model performance. If set to `True`, post-processing and NMS are not exported.

architecture: PicoDet

PicoDet:
  backbone: ESNet
  neck: CSPPAN
  head: PicoHead

ESNet:
  scale: 0.75
  feature_maps: [4, 11, 14]
  act: hard_swish
  channel_ratio: [0.875, 0.5, 0.5, 0.5, 0.625, 0.5, 0.625, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]

CSPPAN:
  out_channels: 96
  use_depthwise: True
  num_csp_blocks: 1
  num_features: 4

PicoHead:
  conv_feat:
    name: PicoFeat
    feat_in: 96
    feat_out: 96
    num_convs: 2
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
  fpn_stride: [8, 16, 32, 64]
  feat_in_chan: 96
  prior_prob: 0.01
  reg_max: 7
  cell_offset: 0.5
  loss_class:
    name: VarifocalLoss
    use_sigmoid: True
    iou_weighted: True
    loss_weight: 1.0
  loss_dfl:
    name: DistributionFocalLoss
    loss_weight: 0.25
  loss_bbox:
    name: GIoULoss
    loss_weight: 2.0
  assigner:
    name: SimOTAAssigner
    candidate_topk: 10
    iou_weight: 6
  nms:
    name: MultiClassNMS
    nms_top_k: 1000
    keep_top_k: 100
    score_threshold: 0.025
    nms_threshold: 0.6

LearningRate:
  base_lr: 0.4
  schedulers:
  - !CosineDecay
    max_epochs: 300
  - !LinearWarmup
    start_factor: 0.1
    steps: 300

OptimizerBuilder:
  optimizer:
    momentum: 0.9
    type: Momentum
  regularizer:
    factor: 0.00004
    type: L2

TrainDataset:
  !COCODataSet
    image_dir: ""
    anno_path: aic_coco_train_cocoformat.json
    dataset_dir: dataset
    data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']

EvalDataset:
  !COCODataSet
    image_dir: val2017
    anno_path: annotations/instances_val2017.json
    dataset_dir: dataset/coco

TestDataset:
  !ImageFolder
    anno_path: annotations/instances_val2017.json

worker_num: 8
TrainReader:
  sample_transforms:
  - Decode: {}
  - RandomCrop: {}
  - RandomFlip: {prob: 0.5}
  - RandomDistort: {}
  batch_transforms:
  - BatchRandomResize: {target_size: [128, 160, 192, 224, 256], random_size: True, random_interp: True, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_size: 128
  shuffle: true
  drop_last: true
  collate_batch: false

EvalReader:
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: [192, 192], keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 8
  shuffle: false

TestReader:
  inputs_def:
    image_shape: [1, 3, 192, 192]
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: [192, 192], keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 1
  shuffle: false
  fuse_normalize: true
@@ -0,0 +1,148 @@
use_gpu: true
log_iter: 20
save_dir: output
snapshot_epoch: 1
print_flops: false
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x0_75_pretrained.pdparams
weights: output/picodet_s_320_pedestrian/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10
epoch: 300
metric: COCO
num_classes: 1
# Exporting the model
export:
  post_process: False  # Whether post-processing is included in the network when exporting the model.
  nms: False           # Whether NMS is included in the network when exporting the model.
  benchmark: False     # Used to test model performance. If set to `True`, post-processing and NMS are not exported.

architecture: PicoDet

PicoDet:
  backbone: ESNet
  neck: CSPPAN
  head: PicoHead

ESNet:
  scale: 0.75
  feature_maps: [4, 11, 14]
  act: hard_swish
  channel_ratio: [0.875, 0.5, 0.5, 0.5, 0.625, 0.5, 0.625, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]

CSPPAN:
  out_channels: 96
  use_depthwise: True
  num_csp_blocks: 1
  num_features: 4

PicoHead:
  conv_feat:
    name: PicoFeat
    feat_in: 96
    feat_out: 96
    num_convs: 2
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
  fpn_stride: [8, 16, 32, 64]
  feat_in_chan: 96
  prior_prob: 0.01
  reg_max: 7
  cell_offset: 0.5
  loss_class:
    name: VarifocalLoss
    use_sigmoid: True
    iou_weighted: True
    loss_weight: 1.0
  loss_dfl:
    name: DistributionFocalLoss
    loss_weight: 0.25
  loss_bbox:
    name: GIoULoss
    loss_weight: 2.0
  assigner:
    name: SimOTAAssigner
    candidate_topk: 10
    iou_weight: 6
  nms:
    name: MultiClassNMS
    nms_top_k: 1000
    keep_top_k: 100
    score_threshold: 0.025
    nms_threshold: 0.6

LearningRate:
  base_lr: 0.4
  schedulers:
  - !CosineDecay
    max_epochs: 300
  - !LinearWarmup
    start_factor: 0.1
    steps: 300

OptimizerBuilder:
  optimizer:
    momentum: 0.9
    type: Momentum
  regularizer:
    factor: 0.00004
    type: L2

TrainDataset:
  !COCODataSet
    image_dir: ""
    anno_path: aic_coco_train_cocoformat.json
    dataset_dir: dataset
    data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']

EvalDataset:
  !COCODataSet
    image_dir: val2017
    anno_path: annotations/instances_val2017.json
    dataset_dir: dataset/coco

TestDataset:
  !ImageFolder
    anno_path: annotations/instances_val2017.json

worker_num: 8
TrainReader:
  sample_transforms:
  - Decode: {}
  - RandomCrop: {}
  - RandomFlip: {prob: 0.5}
  - RandomDistort: {}
  batch_transforms:
  - BatchRandomResize: {target_size: [256, 288, 320, 352, 384], random_size: True, random_interp: True, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_size: 128
  shuffle: true
  drop_last: true
  collate_batch: false

EvalReader:
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: [320, 320], keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 8
  shuffle: false

TestReader:
  inputs_def:
    image_shape: [1, 3, 320, 320]
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: [320, 320], keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- Permute: {}
|
||||
batch_transforms:
|
||||
- PadBatch: {pad_to_stride: 32}
|
||||
batch_size: 1
|
||||
shuffle: false
|
||||
@@ -0,0 +1,26 @@
_BASE_: [
  '../../../datasets/coco_detection.yml',
  '../../../runtime.yml',
  '../_base_/picodet_esnet.yml',
  '../_base_/optimizer_300e.yml',
  '../_base_/picodet_416_reader.yml',
]

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/LCNet_x1_0_pretrained.pdparams
weights: output/picodet_lcnet_1_5x_416_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10

PicoDet:
  backbone: LCNet
  neck: CSPPAN
  head: PicoHead

LCNet:
  scale: 1.0
  feature_maps: [3, 4, 5]

TrainReader:
  batch_size: 90
@@ -0,0 +1,23 @@
_BASE_: [
  '../../../datasets/coco_detection.yml',
  '../../../runtime.yml',
  '../_base_/picodet_esnet.yml',
  '../_base_/optimizer_300e.yml',
  '../_base_/picodet_416_reader.yml',
]

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/LCNet_x1_5_pretrained.pdparams
weights: output/picodet_lcnet_1_5x_416_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10

PicoDet:
  backbone: LCNet
  neck: CSPPAN
  head: PicoHead

LCNet:
  scale: 1.5
  feature_maps: [3, 4, 5]
@@ -0,0 +1,49 @@
_BASE_: [
  '../../../datasets/coco_detection.yml',
  '../../../runtime.yml',
  '../_base_/picodet_esnet.yml',
  '../_base_/optimizer_300e.yml',
  '../_base_/picodet_640_reader.yml',
]

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/LCNet_x1_5_pretrained.pdparams
weights: output/picodet_lcnet_1_5x_640_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10

PicoDet:
  backbone: LCNet
  neck: CSPPAN
  head: PicoHead

LCNet:
  scale: 1.5
  feature_maps: [3, 4, 5]

CSPPAN:
  out_channels: 160

PicoHead:
  conv_feat:
    name: PicoFeat
    feat_in: 160
    feat_out: 160
    num_convs: 4
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
  feat_in_chan: 160

TrainReader:
  batch_size: 24

LearningRate:
  base_lr: 0.2
  schedulers:
  - !CosineDecay
    max_epochs: 300
  - !LinearWarmup
    start_factor: 0.1
    steps: 300
@@ -0,0 +1,26 @@
_BASE_: [
  '../../../datasets/coco_detection.yml',
  '../../../runtime.yml',
  '../_base_/picodet_esnet.yml',
  '../_base_/optimizer_300e.yml',
  '../_base_/picodet_416_reader.yml',
]

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/LCNet_x2_5_ssld_pretrained.pdparams
weights: output/picodet_lcnet_1_5x_416_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10

PicoDet:
  backbone: LCNet
  neck: CSPPAN
  head: PicoHead

LCNet:
  scale: 2.5
  feature_maps: [3, 4, 5]

TrainReader:
  batch_size: 48
@@ -0,0 +1,39 @@
_BASE_: [
  '../../../datasets/coco_detection.yml',
  '../../../runtime.yml',
  '../_base_/picodet_esnet.yml',
  '../_base_/optimizer_300e.yml',
  '../_base_/picodet_416_reader.yml',
]

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/MobileNetV3_large_x1_0_ssld_pretrained.pdparams
weights: output/picodet_mobilenetv3_large_1x_416_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10
epoch: 180

PicoDet:
  backbone: MobileNetV3
  neck: CSPPAN
  head: PicoHead

MobileNetV3:
  model_name: large
  scale: 1.0
  with_extra_blocks: false
  extra_block_filters: []
  feature_maps: [7, 13, 16]

TrainReader:
  batch_size: 56

LearningRate:
  base_lr: 0.3
  schedulers:
  - !CosineDecay
    max_epochs: 300
  - !LinearWarmup
    start_factor: 0.1
    steps: 300
@@ -0,0 +1,39 @@
_BASE_: [
  '../../../datasets/coco_detection.yml',
  '../../../runtime.yml',
  '../_base_/picodet_esnet.yml',
  '../_base_/optimizer_300e.yml',
  '../_base_/picodet_640_reader.yml',
]

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet18_vd_pretrained.pdparams
weights: output/picodet_r18_640_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10

PicoDet:
  backbone: ResNet
  neck: CSPPAN
  head: PicoHead

ResNet:
  depth: 18
  variant: d
  return_idx: [1, 2, 3]
  freeze_at: -1
  freeze_norm: false
  norm_decay: 0.

TrainReader:
  batch_size: 56

LearningRate:
  base_lr: 0.3
  schedulers:
  - !CosineDecay
    max_epochs: 300
  - !LinearWarmup
    start_factor: 0.1
    steps: 300
@@ -0,0 +1,38 @@
_BASE_: [
  '../../../datasets/coco_detection.yml',
  '../../../runtime.yml',
  '../_base_/picodet_esnet.yml',
  '../_base_/optimizer_300e.yml',
  '../_base_/picodet_416_reader.yml',
]

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ShuffleNetV2_x1_0_pretrained.pdparams
weights: output/picodet_shufflenetv2_1x_416_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10

PicoDet:
  backbone: ShuffleNetV2
  neck: CSPPAN
  head: PicoHead

ShuffleNetV2:
  scale: 1.0
  feature_maps: [5, 13, 17]
  act: leaky_relu

CSPPAN:
  out_channels: 96

PicoHead:
  conv_feat:
    name: PicoFeat
    feat_in: 96
    feat_out: 96
    num_convs: 2
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
  feat_in_chan: 96
@@ -0,0 +1,47 @@
_BASE_: [
  '../../datasets/coco_detection.yml',
  '../../runtime.yml',
  '_base_/picodet_esnet.yml',
  '_base_/optimizer_300e.yml',
  '_base_/picodet_320_reader.yml',
]

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x1_25_pretrained.pdparams
weights: output/picodet_l_320_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10
epoch: 250

ESNet:
  scale: 1.25
  feature_maps: [4, 11, 14]
  act: hard_swish
  channel_ratio: [0.875, 0.5, 1.0, 0.625, 0.5, 0.75, 0.625, 0.625, 0.5, 0.625, 1.0, 0.625, 0.75]

CSPPAN:
  out_channels: 160

PicoHead:
  conv_feat:
    name: PicoFeat
    feat_in: 160
    feat_out: 160
    num_convs: 4
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
  feat_in_chan: 160

TrainReader:
  batch_size: 56

LearningRate:
  base_lr: 0.3
  schedulers:
  - !CosineDecay
    max_epochs: 300
  - !LinearWarmup
    start_factor: 0.1
    steps: 300
@@ -0,0 +1,47 @@
_BASE_: [
  '../../datasets/coco_detection.yml',
  '../../runtime.yml',
  '_base_/picodet_esnet.yml',
  '_base_/optimizer_300e.yml',
  '_base_/picodet_416_reader.yml',
]

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x1_25_pretrained.pdparams
weights: output/picodet_l_416_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10
epoch: 250

ESNet:
  scale: 1.25
  feature_maps: [4, 11, 14]
  act: hard_swish
  channel_ratio: [0.875, 0.5, 1.0, 0.625, 0.5, 0.75, 0.625, 0.625, 0.5, 0.625, 1.0, 0.625, 0.75]

CSPPAN:
  out_channels: 160

PicoHead:
  conv_feat:
    name: PicoFeat
    feat_in: 160
    feat_out: 160
    num_convs: 4
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
  feat_in_chan: 160

TrainReader:
  batch_size: 48

LearningRate:
  base_lr: 0.3
  schedulers:
  - !CosineDecay
    max_epochs: 300
  - !LinearWarmup
    start_factor: 0.1
    steps: 300
@@ -0,0 +1,47 @@
_BASE_: [
  '../../datasets/coco_detection.yml',
  '../../runtime.yml',
  '_base_/picodet_esnet.yml',
  '_base_/optimizer_300e.yml',
  '_base_/picodet_640_reader.yml',
]

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x1_25_pretrained.pdparams
weights: output/picodet_l_640_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10
epoch: 250

ESNet:
  scale: 1.25
  feature_maps: [4, 11, 14]
  act: hard_swish
  channel_ratio: [0.875, 0.5, 1.0, 0.625, 0.5, 0.75, 0.625, 0.625, 0.5, 0.625, 1.0, 0.625, 0.75]

CSPPAN:
  out_channels: 160

PicoHead:
  conv_feat:
    name: PicoFeat
    feat_in: 160
    feat_out: 160
    num_convs: 4
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
  feat_in_chan: 160

TrainReader:
  batch_size: 32

LearningRate:
  base_lr: 0.3
  schedulers:
  - !CosineDecay
    max_epochs: 300
  - !LinearWarmup
    start_factor: 0.1
    steps: 300
@@ -0,0 +1,13 @@
_BASE_: [
  '../../datasets/coco_detection.yml',
  '../../runtime.yml',
  '_base_/picodet_esnet.yml',
  '_base_/optimizer_300e.yml',
  '_base_/picodet_320_reader.yml',
]

weights: output/picodet_m_320_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10
@@ -0,0 +1,13 @@
_BASE_: [
  '../../datasets/coco_detection.yml',
  '../../runtime.yml',
  '_base_/picodet_esnet.yml',
  '_base_/optimizer_300e.yml',
  '_base_/picodet_416_reader.yml',
]

weights: output/picodet_m_416_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10
@@ -0,0 +1,34 @@
_BASE_: [
  '../../datasets/coco_detection.yml',
  '../../runtime.yml',
  '_base_/picodet_esnet.yml',
  '_base_/optimizer_300e.yml',
  '_base_/picodet_320_reader.yml',
]

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x0_75_pretrained.pdparams
weights: output/picodet_s_320_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10

ESNet:
  scale: 0.75
  feature_maps: [4, 11, 14]
  act: hard_swish
  channel_ratio: [0.875, 0.5, 0.5, 0.5, 0.625, 0.5, 0.625, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]

CSPPAN:
  out_channels: 96

PicoHead:
  conv_feat:
    name: PicoFeat
    feat_in: 96
    feat_out: 96
    num_convs: 2
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
  feat_in_chan: 96
@@ -0,0 +1,37 @@
_BASE_: [
  '../../datasets/voc.yml',
  '../../runtime.yml',
  '_base_/picodet_esnet.yml',
  '_base_/optimizer_300e.yml',
  '_base_/picodet_320_reader.yml',
]

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x0_75_pretrained.pdparams
weights: output/picodet_s_320_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10

ESNet:
  scale: 0.75
  feature_maps: [4, 11, 14]
  act: hard_swish
  channel_ratio: [0.875, 0.5, 0.5, 0.5, 0.625, 0.5, 0.625, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]

CSPPAN:
  out_channels: 96

PicoHead:
  conv_feat:
    name: PicoFeat
    feat_in: 96
    feat_out: 96
    num_convs: 2
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
  feat_in_chan: 96

EvalReader:
  collate_batch: false
@@ -0,0 +1,34 @@
_BASE_: [
  '../../datasets/coco_detection.yml',
  '../../runtime.yml',
  '_base_/picodet_esnet.yml',
  '_base_/optimizer_300e.yml',
  '_base_/picodet_416_reader.yml',
]

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x0_75_pretrained.pdparams
weights: output/picodet_s_416_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10

ESNet:
  scale: 0.75
  feature_maps: [4, 11, 14]
  act: hard_swish
  channel_ratio: [0.875, 0.5, 0.5, 0.5, 0.625, 0.5, 0.625, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]

CSPPAN:
  out_channels: 96

PicoHead:
  conv_feat:
    name: PicoFeat
    feat_in: 96
    feat_out: 96
    num_convs: 2
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
  feat_in_chan: 96
135
paddle_detection/configs/picodet/legacy_model/pruner/README.md
Normal file
@@ -0,0 +1,135 @@
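# Tutorial: Applying Unstructured Sparsity to PicoDet

## 1. Introduction
In model compression, the two common forms of sparsity are structured and unstructured. Structured sparsity prunes convolutions and matrix multiplications along a specific dimension (feature channels, convolution kernels, and so on) and produces a smaller model structure, so the existing convolution and matrix-multiplication kernels can be reused and no special inference operators are needed. Unstructured sparsity prunes individual parameters and leaves the shape of each parameter matrix unchanged, so it depends more heavily on how well the inference library and hardware accelerate sparse matrix operations. We applied unstructured sparsity to the PP-PicoDet model (PicoDet below) and obtained a significant inference speed-up on ARM CPUs at a small cost in accuracy. This document describes how to train PicoDet with unstructured sparsity; for more background on unstructured sparsity, see [here](https://github.com/PaddlePaddle/PaddleSlim/tree/develop/demo/dygraph/unstructured_pruning).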
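
The contrast between the two forms of sparsity can be illustrated with a toy example. The sketch below is plain Python, not PaddleSlim code: the function names `prune_unstructured` and `prune_structured` and the matrix `W` are made up for illustration, and magnitude-based selection is assumed as the pruning criterion.

```python
# Toy illustration only (hypothetical helpers, not the PaddleSlim API).

def prune_unstructured(weights, ratio):
    """Zero the smallest-magnitude entries; the matrix shape is unchanged."""
    flat = sorted(abs(w) for row in weights for w in row)
    k = int(len(flat) * ratio)  # number of entries to zero
    if k == 0:
        return [row[:] for row in weights]
    threshold = flat[k - 1]  # k-th smallest magnitude (ties may zero a few more)
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]


def prune_structured(weights, ratio):
    """Drop whole rows (think: output channels) with the smallest L1 norm."""
    n_drop = int(len(weights) * ratio)
    order = sorted(range(len(weights)),
                   key=lambda i: sum(abs(w) for w in weights[i]))
    keep = sorted(order[n_drop:])  # preserve the original row order
    return [weights[i][:] for i in keep]


W = [[0.9, -0.1, 0.4, 0.05],
     [0.02, 0.03, -0.01, 0.04],
     [-0.7, 0.6, 0.2, -0.3],
     [0.15, -0.8, 0.25, 0.5]]

sparse = prune_unstructured(W, 0.5)
small = prune_structured(W, 0.5)
print("unstructured shape:", len(sparse), "x", len(sparse[0]))
print("structured shape:  ", len(small), "x", len(small[0]))
```

The unstructured result keeps the original 4x4 shape with half of its entries set to zero, which is why a sparse-aware inference kernel is needed to turn it into a speed-up; the structured result is simply a smaller dense matrix.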
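
## 2. Version requirements
```bash
PaddlePaddle >= 2.1.2
PaddleSlim develop branch (pip install paddleslim -i https://pypi.tuna.tsinghua.edu.cn/simple)
```

## 3. Data preparation
Same as for PicoDet.

## 4. Pretrained model
Unstructured sparse training assumes that the pretrained model is an already converged set of model parameters, so it has to be declared explicitly in the relevant config file.

Config file that declares the pretrained model path: ./configs/picodet/pruner/picodet_m_320_coco_pruner.yml
For the pretrained model URLs, see the PicoDet documentation: ./configs/picodet/README.md

## 5. Customizing the scope of sparsification
For the best inference speed-up, we recommend sparsifying only the 1x1 convolution layers and keeping the parameters of all other layers dense. Some layers also have a large effect on accuracy (for example, the last few layers of the head and several layers in the SE blocks), and we do not recommend sparsifying those either. Developers can pass in a custom function to specify which layers are excluded from sparsification. For example, for the picodet_m_320 model we skip the last 4 convolution layers and the convolutions in 6 SE-block layers; the custom function is as follows: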

```python
NORMS_ALL = [ 'BatchNorm', 'GroupNorm', 'LayerNorm', 'SpectralNorm', 'BatchNorm1D',
    'BatchNorm2D', 'BatchNorm3D', 'InstanceNorm1D', 'InstanceNorm2D',
    'InstanceNorm3D', 'SyncBatchNorm', 'LocalResponseNorm' ]

def skip_params_self(model):
    skip_params = set()
    for _, sub_layer in model.named_sublayers():
        # Normalization layers are never sparsified.
        if type(sub_layer).__name__.split('.')[-1] in NORMS_ALL:
            skip_params.add(sub_layer.full_name())
        for param in sub_layer.parameters(include_sublayers=False):
            # Only 1x1 convolution kernels (4-D weights with 1x1 spatial size) are candidates.
            cond_is_conv1x1 = len(param.shape) == 4 and param.shape[2] == 1 and param.shape[3] == 1
            # Head convolutions of picodet_m, identified by their channel counts.
            cond_is_head_m = cond_is_conv1x1 and param.shape[0] == 112 and param.shape[1] == 128
            # SE-block convolutions of picodet_m, identified by parameter-name prefix.
            cond_is_se_block_m = param.name.split('.')[0] in ['conv2d_17', 'conv2d_18', 'conv2d_56', 'conv2d_57', 'conv2d_75', 'conv2d_76']
            if not cond_is_conv1x1 or cond_is_head_m or cond_is_se_block_m:
                skip_params.add(param.name)
    return skip_params
```
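
To sanity-check which parameter shapes such a rule leaves prunable without loading a model, the shape and name tests can be mirrored on plain tuples. This is a sketch only: `would_be_pruned` and its arguments are hypothetical helpers, it deliberately omits the normalization-layer check, and the shapes listed are made-up examples rather than actual PicoDet layers.

```python
# Hypothetical helper mirroring the shape/name tests of skip_params_self.

def would_be_pruned(shape, name="", skipped_prefixes=()):
    """Return True if a parameter with this shape/name stays eligible for sparsification."""
    is_conv1x1 = len(shape) == 4 and shape[2] == 1 and shape[3] == 1
    is_head_m = is_conv1x1 and shape[0] == 112 and shape[1] == 128
    is_skipped_by_name = name.split('.')[0] in skipped_prefixes
    return is_conv1x1 and not is_head_m and not is_skipped_by_name


examples = [
    ((96, 96, 1, 1), "pointwise conv"),
    ((96, 1, 3, 3), "3x3 depthwise conv"),
    ((112, 128, 1, 1), "head conv matched by channel counts"),
    ((96,), "bias or norm scale"),
]
for shape, label in examples:
    print(label, shape, would_be_pruned(shape))
```

Matching layers by shape or by auto-generated names such as `conv2d_17` is specific to one model definition, so a rule like this needs to be re-derived whenever the architecture changes.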

## 6. Training
The core unstructured-sparsity functionality is already embedded in training through API calls, so if you have no further requirements you can start training directly with the command in 6.1. To help you modify and adapt the code to your own needs, a more detailed walkthrough is given in 6.2.

### 6.1 Direct use
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3.7 -m paddle.distributed.launch --log_dir=log_test --gpus 0,1,2,3 tools/train.py -c configs/picodet/pruner/picodet_m_320_coco_pruner.yml --slim_config configs/slim/prune/picodet_m_unstructured_prune_75.yml --eval
```

### 6.2 Detailed walkthrough
- Customizing the scope of sparsification: see Section 5 of this tutorial.
- How to add the 4 lines of code required for sparse training:

```python
# Excerpt from a trainer class: `self`, `model` and `steps_per_epoch` come from the surrounding code.
from paddleslim import GMPUnstructuredPruner

# after constructing model and before training

# Pruner Step1: configs
configs = {
    'pruning_strategy': 'gmp',
    'stable_iterations': self.stable_epochs * steps_per_epoch,
    'pruning_iterations': self.pruning_epochs * steps_per_epoch,
    'tunning_iterations': self.tunning_epochs * steps_per_epoch,
    'resume_iteration': 0,
    'pruning_steps': self.pruning_steps,
    'initial_ratio': self.initial_ratio,
}

# Pruner Step2: construct a pruner object
self.pruner = GMPUnstructuredPruner(
    model,
    ratio=self.cfg.ratio,
    skip_params_func=skip_params_self, # Only pass in this value when you design your own skip_params function. And the following argument (skip_params_type) will be ignored.
    skip_params_type=self.cfg.skip_params_type,
    local_sparsity=True,
    configs=configs)

# training
for epoch_id in range(self.start_epoch, self.cfg.epoch):
    model.train()
    for step_id, data in enumerate(self.loader):
        # model forward
        outputs = model(data)
        loss = outputs['loss']
        # model backward
        loss.backward()
        self.optimizer.step()

        # Pruner Step3: step during training
        self.pruner.step()

    # Pruner Step4: save the sparse model
    self.pruner.update_params()
    # model-saving API
```
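
The `configs` dictionary above describes a gradual schedule: a stable phase, a pruning phase split into `pruning_steps` increments, and a final tuning phase. The sketch below shows one way such a schedule could map an iteration to a target sparsity, assuming a cubic ramp from `initial_ratio` to the final ratio and a dense stable phase. The function `gmp_ratio` and the example numbers are hypothetical, and the exact schedule inside PaddleSlim's `GMPUnstructuredPruner` may differ.

```python
# Illustrative sketch of a gradual magnitude pruning (GMP) style schedule.
# Not PaddleSlim code: the cubic ramp and the dense stable phase are assumptions.

def gmp_ratio(iteration, final_ratio, initial_ratio,
              stable_iterations, pruning_iterations, pruning_steps):
    """Return the target sparsity ratio at a given global iteration."""
    if iteration < stable_iterations:
        return 0.0  # assumed: no pruning during the stable phase
    # Iterations between two consecutive ratio increases.
    period = max(1, pruning_iterations // pruning_steps)
    # Index of the current increment, capped once the pruning phase is over.
    step = min((iteration - stable_iterations) // period + 1, pruning_steps)
    progress = step / pruning_steps
    # Cubic ramp: fast increases early, small increases near the final ratio.
    return initial_ratio + (final_ratio - initial_ratio) * (1.0 - (1.0 - progress) ** 3)


# Made-up example values: 100 stable iterations, 1000 pruning iterations, 100 increments.
for it in (0, 100, 350, 600, 1099, 5000):
    print(it, round(gmp_ratio(it, 0.75, 0.15, 100, 1000, 100), 4))
```

Under these assumptions the ratio stays at the final value throughout the tuning phase, which gives the remaining weights time to recover accuracy at a fixed sparsity.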

## 7. Model evaluation and inference deployment
This part is essentially the same as in the PicoDet documentation, except that one extra argument (sparse_model) is required when converting to a PaddleLite model:

```bash
paddle_lite_opt --model_dir=inference_model/picodet_m_320_coco --valid_targets=arm --optimize_out=picodet_m_320_coco_fp32_sparse --sparse_model=True
```

**Note:** Sparse inference currently supports PaddleLite FP32 and INT8 models, so do not enable the FP16 switch when running the command above.

## 8. Sparsification results
We trained FP32 PicoDet-m models at 75% and 85% sparsity and measured inference speed on a SnapDragon-835 device; the results are shown in the table below. In summary:
- At 75% sparsity, the m model loses 1.5 mAP and gains a 34%-58% speed-up.
- At 85% sparsity, the m model is roughly on par with the s model in 4-thread inference speed and better than it in single-thread inference speed, mAP, and model size.


| Model | Input size | Sparsity | mAP<sup>val</sup><br>0.5:0.95 | Size<br>(MB) | Latency, single-thread<br>[Lite](#latency) (ms) | Speed-up, single-thread | Latency, 4-thread<br>[Lite](#latency) (ms) | Speed-up, 4-thread | Download | SlimConfig |
| :-------- | :--------: | :--------: | :---------------------: | :----------------: | :----------------: | :----------------: | :---------------: | :-----------------------------: | :-----------------------------: | :----------------------------------------: |
| PicoDet-m-1.0 | 320*320 | 0 | 30.9 | 8.9 | 127 | 0 | 43 | 0 | [model](https://paddledet.bj.bcebos.com/models/picodet_m_320_coco.pdparams) \| [log](https://paddledet.bj.bcebos.com/logs/train_picodet_m_320_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.3/configs/picodet/picodet_m_320_coco.yml) |
| PicoDet-m-1.0 | 320*320 | 75% | 29.4 | 5.6 | **80** | 58% | **32** | 34% | [model](https://paddledet.bj.bcebos.com/models/slim/picodet_m_320__coco_sparse_75.pdparams) \| [log](https://paddledet.bj.bcebos.com/logs/train_picodet_m_320__coco_sparse_75.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/configs/slim/prune/picodet_m_unstructured_prune_75.yml) |
| PicoDet-s-1.0 | 320*320 | 0 | 27.1 | 4.6 | 68 | 0 | 26 | 0 | [model](https://paddledet.bj.bcebos.com/models/picodet_s_320_coco.pdparams) \| [log](https://paddledet.bj.bcebos.com/logs/train_picodet_s_320_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.3/configs/picodet/picodet_s_320_coco.yml) |
| PicoDet-m-1.0 | 320*320 | 85% | 27.6 | 4.1 | **65** | 96% | **27** | 59% | [model](https://paddledet.bj.bcebos.com/models/slim/picodet_m_320__coco_sparse_85.pdparams) \| [log](https://paddledet.bj.bcebos.com/logs/train_picodet_m_320__coco_sparse_85.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/configs/slim/prune/picodet_m_unstructured_prune_85.yml) |

**Notes:**
- The model sizes above are **deployed model sizes**, i.e. the size of the `*.nb` file produced by PaddleLite conversion.
- The speed-up columns are computed as the percentage increase in FPS, i.e. $(dense\_latency - sparse\_latency) / sparse\_latency$
- For the sparse training runs above, we added one extra data augmentation to _base_/picodet_320_reader.yml, shown below. Without it, mAP is not expected to drop noticeably (<0.1), and speed and model size are unaffected.
```yaml
worker_num: 6
TrainReader:
  sample_transforms:
  - Decode: {}
  - RandomCrop: {}
  - RandomFlip: {prob: 0.5}
  - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
  - RandomDistort: {}
  batch_transforms:
  # etc.
```
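
The speed-up formula in the notes can be checked against the latencies in the table. The snippet below is a small worked example; the function name `speed_up` is made up here, and the inputs are the rounded millisecond values from the table, so the results can differ from the reported percentages by about a point.

```python
# Worked example of the speed-up formula, using the rounded latencies from the table.

def speed_up(dense_latency_ms, sparse_latency_ms):
    """Relative FPS increase: (dense_latency - sparse_latency) / sparse_latency."""
    return (dense_latency_ms - sparse_latency_ms) / sparse_latency_ms


# (dense PicoDet-m latency, sparse PicoDet-m latency) in milliseconds
measurements = {
    "75% sparsity, single-thread": (127, 80),
    "75% sparsity, 4-thread": (43, 32),
    "85% sparsity, single-thread": (127, 65),
    "85% sparsity, 4-thread": (43, 27),
}
for label, (dense, sparse) in measurements.items():
    print(f"{label}: {speed_up(dense, sparse) * 100:.2f}% faster")
```

Because the metric is relative to the sparse latency, it reads as "how many more frames per second", not "how much less time per frame"; the latter would divide by the dense latency instead.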
@@ -0,0 +1,18 @@
epoch: 300

LearningRate:
  base_lr: 0.15
  schedulers:
  - !CosineDecay
    max_epochs: 300
  - !LinearWarmup
    start_factor: 1.0
    steps: 34350

OptimizerBuilder:
  optimizer:
    momentum: 0.9
    type: Momentum
  regularizer:
    factor: 0.00004
    type: L2
@@ -0,0 +1,13 @@
_BASE_: [
  '../../../datasets/coco_detection.yml',
  '../../../runtime.yml',
  '../_base_/picodet_esnet.yml',
  './optimizer_300e_pruner.yml',
  '../_base_/picodet_320_reader.yml',
]

weights: output/picodet_m_320_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10