Replace the document detection model

2024-08-27 14:42:45 +08:00
parent aea6f19951
commit 1514e09c40
2072 changed files with 254336 additions and 4967 deletions

@@ -0,0 +1,163 @@
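# PP-PicoDet Full Quantization Example
Contents:
- [1. Introduction](#1-introduction)
- [2. Benchmark](#2-benchmark)
- [3. Full quantization workflow](#3-full-quantization-workflow)
  - [3.1 Prepare the environment](#31-prepare-the-environment)
  - [3.2 Prepare the dataset](#32-prepare-the-dataset)
  - [3.3 Train the full-precision model](#33-train-the-full-precision-model)
  - [3.4 Export the inference model](#34-export-the-inference-model)
  - [3.5 Run full-quantization training and produce the model](#35-run-full-quantization-training-and-produce-the-model)
- [4. Deployment](#4-deployment)
- [5. FAQ](#5-faq)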
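## 1. Introduction
This example uses PicoDet to walk through the whole flow: model training, full quantization, and deployment on NPU hardware.
* The [Benchmark](#2-benchmark) table provides a fully quantized model based on the COCO-pretrained model.
* Verified NPU hardware:
  - Rockchip development boards: Rockchip RV1109, Rockchip RV1126, Rockchip RK1808
  - Amlogic development boards: Amlogic A311D, Amlogic S905D3, Amlogic C308X
  - NXP development boards: NXP i.MX 8M Plus
* Deployment approach for unverified hardware:
  - "Unverified" means that Paddle Lite does not yet support inference deployment on that hardware. You can export the model with Paddle2ONNX and deploy it with the hardware's own inference engine, provided that the hardware supports fully quantized ONNX models.
## 2. Benchmark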
### PicoDet-S-NPU
| Model | Strategy | mAP | FP32 | INT8 | Config | Model |
|:------------- |:-------- |:----:|:----:|:----:|:---------------------------------------------------------------------------------------------------------------------------------:|:-----------------------------------------------------------------------------------:|
| PicoDet-S-NPU | Baseline | 30.1 | - | - | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_s_416_coco_npu.yml) | [Model](https://bj.bcebos.com/v1/paddle-slim-models/act/picodet_s_416_coco_npu.tar) |
| PicoDet-S-NPU | Quantization-aware training | 29.7 | - | - | [config](https://github.com/PaddlePaddle/PaddleSlim/tree/develop/demo/full_quantization/detection/configs/picodet_s_qat_dis.yaml) | [Model](https://bj.bcebos.com/v1/paddle-slim-models/act/picodet_s_npu_quant.tar) |
- All mAP values are evaluated on the COCO val2017 dataset with IoU=0.5:0.95.
## 3. Full quantization workflow
To quantize a model trained on your own data, follow the steps below.
### 3.1 Prepare the environment
- PaddlePaddle >= 2.3 (install it from the [Paddle website](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/pip/linux-pip.html))
- PaddleSlim >= 2.3
- PaddleDet >= 2.4
Install paddlepaddle:
```shell
# CPU
pip install paddlepaddle
# GPU
pip install paddlepaddle-gpu
```
Install paddleslim:
```shell
pip install paddleslim
```
Install paddledet:
```shell
pip install paddledet
```
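### 3.2 Prepare the dataset
By default, this example runs the full quantization experiment on COCO data. For a custom dataset, prepare it in the COCO format; for other custom data formats, see the [PaddleDetection data preparation guide](../../docs/tutorials/data/PrepareDataSet.md).
Taking the PicoDet-S-NPU model as an example: if the dataset is ready, simply set the `dataset_dir` field of `EvalDataset` in [picodet_reader.yml](./configs/picodet_reader.yml) to your dataset path.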
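
Before pointing `dataset_dir` at a custom dataset, it can help to sanity-check the annotation file. The sketch below is only an illustration and is not part of PaddleDetection: `check_coco_annotations` is a hypothetical helper that assumes the usual COCO detection layout (top-level `images`, `annotations`, and `categories` lists, with boxes stored as `[x, y, width, height]`).

```python
# Top-level keys that a COCO-style detection annotation file is expected to contain.
REQUIRED_KEYS = ("images", "annotations", "categories")


def check_coco_annotations(coco):
    """Return a list of problems found; an empty list means the basic checks passed."""
    problems = [f"missing top-level key: {key}" for key in REQUIRED_KEYS if key not in coco]
    if problems:
        return problems

    image_ids = {img["id"] for img in coco["images"]}
    category_ids = {cat["id"] for cat in coco["categories"]}
    for ann in coco["annotations"]:
        if ann["image_id"] not in image_ids:
            problems.append(f"annotation {ann['id']} refers to unknown image {ann['image_id']}")
        if ann["category_id"] not in category_ids:
            problems.append(f"annotation {ann['id']} refers to unknown category {ann['category_id']}")
        # COCO boxes are [x, y, width, height] in pixels, so width and height must be positive.
        box = ann["bbox"]
        if len(box) != 4 or box[2] <= 0 or box[3] <= 0:
            problems.append(f"annotation {ann['id']} has an invalid bbox {box}")
    return problems


# Tiny hand-written sample (hypothetical data) that passes every check.
sample = {
    "images": [{"id": 1, "file_name": "0001.jpg", "width": 640, "height": 480}],
    "categories": [{"id": 3, "name": "document"}],
    "annotations": [{"id": 10, "image_id": 1, "category_id": 3, "bbox": [20, 30, 200, 100]}],
}
print(check_coco_annotations(sample))
```

For a real dataset, load the annotation file with `json.load` and pass the resulting dictionary to the helper.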
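### 3.3 Train the full-precision model
Full quantization requires a trained full-precision model. Skip this step if you already have one.
- Training on a single GPU: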
```shell
# training on single-GPU
export CUDA_VISIBLE_DEVICES=0
python tools/train.py -c configs/picodet/picodet_s_416_coco_npu.yml --eval
```
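**Note:** If training runs out of GPU memory, reduce `batch_size` in `TrainReader` and reduce `base_lr` in `LearningRate` by the same ratio. The released configs were all trained on 4 GPUs, so if you train on 1 GPU, divide `base_lr` by 4.
- Training on multiple GPUs: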
```shell
# training on multi-GPU
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/picodet/picodet_s_416_coco_npu.yml --eval
```
**Note:** All PicoDet models were trained on 4 GPUs. If you change the number of training GPUs, scale the learning rate `base_lr` linearly.
- Evaluation:
```shell
python tools/eval.py -c configs/picodet/picodet_s_416_coco_npu.yml \
-o weights=https://paddledet.bj.bcebos.com/models/picodet_s_416_coco_npu.pdparams
```
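
The two notes above describe linear scaling of `base_lr` with the GPU count and with `batch_size`. A minimal sketch of that arithmetic follows; `scale_base_lr` is a hypothetical helper, and the reference values in the example (`base_lr` of 0.32 with a per-GPU batch size of 64) are made-up numbers for illustration, so read the real ones from your config file.

```python
def scale_base_lr(base_lr, ref_gpus, ref_batch_size, gpus, batch_size):
    """Scale base_lr linearly with the total batch size (GPU count * per-GPU batch size)."""
    if min(ref_gpus, ref_batch_size, gpus, batch_size) <= 0:
        raise ValueError("GPU counts and batch sizes must be positive")
    ref_total = ref_gpus * ref_batch_size
    new_total = gpus * batch_size
    return base_lr * new_total / ref_total


# Hypothetical reference setup: 4 GPUs, per-GPU batch size 64, base_lr 0.32.
# Moving to 1 GPU with the same per-GPU batch size divides the learning rate by 4.
print(scale_base_lr(0.32, ref_gpus=4, ref_batch_size=64, gpus=1, batch_size=64))
# Halving the per-GPU batch size as well halves it again.
print(scale_base_lr(0.32, ref_gpus=4, ref_batch_size=64, gpus=1, batch_size=32))
```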
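### 3.4 Export the inference model
Use the following command to export the inference model for full-quantization training. The exported model is saved to the `output_inference` folder by default and consists of the `*.pdmodel` and `*.pdiparams` files used for full quantization.
* Command options:
  - `-c`: the yaml config file used for training in [3.3 Train the full-precision model](#33-train-the-full-precision-model).
  - `-o weights`: the weights file of the model to export. This document directly uses the model trained on COCO.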
```shell
python tools/export_model.py \
-c configs/picodet/picodet_s_416_coco_npu.yml \
        -o weights=https://paddledet.bj.bcebos.com/models/picodet_s_416_coco_npu.pdparams
```
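
Since the next step needs both a `*.pdmodel` and a `*.pdiparams` file, a quick check of the exported folder can catch an incomplete export early. This is a sketch with hypothetical helper names (`find_inference_files`, `is_ready_for_quantization`); it only looks at file extensions and does not validate the file contents.

```python
import tempfile
from pathlib import Path


def find_inference_files(model_dir):
    """Collect the *.pdmodel and *.pdiparams files found directly inside model_dir."""
    root = Path(model_dir)
    return {
        "pdmodel": sorted(p.name for p in root.glob("*.pdmodel")),
        "pdiparams": sorted(p.name for p in root.glob("*.pdiparams")),
    }


def is_ready_for_quantization(model_dir):
    """True when at least one file of each required kind is present."""
    files = find_inference_files(model_dir)
    return bool(files["pdmodel"]) and bool(files["pdiparams"])


# Self-contained demo: a throwaway folder that mimics an exported model directory.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "model.pdmodel").touch()
    demo_before = is_ready_for_quantization(tmp)  # the weights file is still missing
    Path(tmp, "model.pdiparams").touch()
    demo_after = is_ready_for_quantization(tmp)

print(demo_before, demo_after)
```

In practice, pass the folder that contains the exported files, which is normally a subfolder of `output_inference`.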
### 3.5 Run full-quantization training and produce the model
- Enter the PaddleSlim auto-compression demo folder:
```shell
cd deploy/auto_compression/
```
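The full quantization example is launched through the run.py script, which uses the `paddleslim.auto_compression.AutoCompression` interface to fully quantize the model. Once the model path, distillation, quantization, and training parameters are set in the config file, you can quantize and distill the model. The commands are:
- Single-GPU quantization training:
```shell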
export CUDA_VISIBLE_DEVICES=0
python run.py --config_path=./configs/picodet_s_qat_dis.yaml --save_dir='./output/'
```
- Multi-GPU quantization training:
```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch --log_dir=log --gpus 0,1,2,3 run.py \
--config_path=./configs/picodet_s_qat_dis.yaml --save_dir='./output/'
```
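- The final model is written to the `output` folder by default. After training, evaluate the accuracy of the fully quantized model:
The path of the model to evaluate can be changed in the `model_dir` field of the config file. Use the eval.py script to compute the model's mAP:
```shell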
export CUDA_VISIBLE_DEVICES=0
python eval.py --config_path=./configs/picodet_s_qat_dis.yaml
```
## 4. Deployment
For deployment, use PicoDet's [Paddle Lite full quantization demo](https://github.com/PaddlePaddle/Paddle-Lite-Demo/tree/develop/object_detection/linux/picodet_detection) directly.
## 5. FAQ

@@ -0,0 +1,355 @@
简体中文 | [English](README_en.md)
# PP-PicoDet
![](../../docs/images/picedet_demo.jpeg)
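## News
- Released the PicoDet-NPU model, which supports fully quantized deployment. See the [PicoDet full quantization example](./FULL_QUANTIZATION.md) for details. **(2022.08.10)**
- Released a new series of PP-PicoDet models: **(2022.03.20)**
  - (1) Introduced TAL and the ETA Head and optimized structures such as PAN, improving accuracy by more than 2 points;
  - (2) Optimized CPU inference speed and doubled the training speed;
  - (3) The exported model includes post-processing in the network, so prediction outputs box results directly with no secondary development. This lowers the migration cost and improves end-to-end prediction speed by 10%-20%.
## Legacy Models
- For details, see [PicoDet 2021.10](./legacy_model/)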
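## Introduction
PaddleDetection introduces `PP-PicoDet`, a new series of lightweight models with excellent performance on mobile devices, making it a new SOTA lightweight model. For technical details, see our [arXiv technical report](https://arxiv.org/abs/2111.00902).
PP-PicoDet has the following features:
- 🌟 Higher mAP: the first object detector to exceed **30+** `mAP(0.5:0.95)` within 1M parameters (at an input size of 416).
- 🚀 Faster prediction: network inference reaches 150 FPS on an ARM CPU.
- 😊 Deployment friendly: supports inference libraries such as PaddleLite/MNN/NCNN/OpenVINO, supports export to ONNX, and provides C++/Python/Android demos.
- 😍 Advanced algorithms: we innovate on existing SOTA algorithms, including ESNet, CSP-PAN, SimOTA, and more.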
<div align="center">
<img src="../../docs/images/picodet_map.png" width='600'/>
</div>
## Benchmark
| Model | Input size | mAP<sup>val<br>0.5:0.95 | mAP<sup>val<br>0.5 | Params<br><sup>(M) | FLOPS<br><sup>(G) | Latency<sup><small>[CPU](#latency)</small><sup><br><sup>(ms) | Latency<sup><small>[Lite](#latency)</small><sup><br><sup>(ms) | Weight | Config | Inference Model |
| :-------- | :--------: | :---------------------: | :----------------: | :----------------: | :---------------: | :-----------------------------: | :-----------------------------: | :----------------------------------------: | :--------------------------------------- | :--------------------------------------- |
| PicoDet-XS | 320*320 | 23.5 | 36.1 | 0.70 | 0.67 | 3.9ms | 7.81ms | [model](https://paddledet.bj.bcebos.com/models/picodet_xs_320_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_xs_320_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_xs_320_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_xs_320_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_xs_320_coco_lcnet_non_postprocess.tar) |
| PicoDet-XS | 416*416 | 26.2 | 39.3 | 0.70 | 1.13 | 6.1ms | 12.38ms | [model](https://paddledet.bj.bcebos.com/models/picodet_xs_416_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_xs_416_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_xs_416_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_xs_416_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_xs_416_coco_lcnet_non_postprocess.tar) |
| PicoDet-S | 320*320 | 29.1 | 43.4 | 1.18 | 0.97 | 4.8ms | 9.56ms | [model](https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_s_320_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_s_320_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_320_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_320_coco_lcnet_non_postprocess.tar) |
| PicoDet-S | 416*416 | 32.5 | 47.6 | 1.18 | 1.65 | 6.6ms | 15.20ms | [model](https://paddledet.bj.bcebos.com/models/picodet_s_416_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_s_416_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_s_416_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_416_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_416_coco_lcnet_non_postprocess.tar) |
| PicoDet-M | 320*320 | 34.4 | 50.0 | 3.46 | 2.57 | 8.2ms | 17.68ms | [model](https://paddledet.bj.bcebos.com/models/picodet_m_320_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_m_320_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_m_320_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_m_320_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_m_320_coco_lcnet_non_postprocess.tar) |
| PicoDet-M | 416*416 | 37.5 | 53.4 | 3.46 | 4.34 | 12.7ms | 28.39ms | [model](https://paddledet.bj.bcebos.com/models/picodet_m_416_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_m_416_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_m_416_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_m_416_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_m_416_coco_lcnet_non_postprocess.tar) |
| PicoDet-L | 320*320 | 36.1 | 52.0 | 5.80 | 4.20 | 11.5ms | 25.21ms | [model](https://paddledet.bj.bcebos.com/models/picodet_l_320_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_320_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_l_320_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_320_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_320_coco_lcnet_non_postprocess.tar) |
| PicoDet-L | 416*416 | 39.4 | 55.7 | 5.80 | 7.10 | 20.7ms | 42.23ms | [model](https://paddledet.bj.bcebos.com/models/picodet_l_416_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_416_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_l_416_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_416_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_416_coco_lcnet_non_postprocess.tar) |
| PicoDet-L | 640*640 | 42.6 | 59.2 | 5.80 | 16.81 | 62.5ms | 108.1ms | [model](https://paddledet.bj.bcebos.com/models/picodet_l_640_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_640_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_l_640_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_640_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_640_coco_lcnet_non_postprocess.tar) |
- Featured models
| Model | Input size | mAP<sup>val<br>0.5:0.95 | mAP<sup>val<br>0.5 | Params<br><sup>(M) | FLOPS<br><sup>(G) | Latency<sup><small>[CPU](#latency)</small><sup><br><sup>(ms) | Latency<sup><small>[Lite](#latency)</small><sup><br><sup>(ms) | Weight | Config |
| :-------- | :--------: | :---------------------: | :----------------: | :----------------: | :---------------: | :-----------------------------: | :-----------------------------: | :----------------------------------------: | :--------------------------------------- |
| PicoDet-S-NPU | 416*416 | 30.1 | 44.2 | - | - | - | - | [model](https://paddledet.bj.bcebos.com/models/picodet_s_416_coco_npu.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_s_416_coco_npu.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_s_416_coco_npu.yml) |
<details open>
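<summary><b>Notes:</b></summary>
- <a name="latency">Latency test:</a> All our models are tested on an `Intel Core i7 10750H` CPU and on a `Snapdragon 865 (4xA77+4xA55)` ARM CPU (4 threads, FP16 inference). In the tables above, columns marked `CPU` are measured with OpenVINO, and columns marked `Lite` are measured with [Paddle Lite](https://github.com/PaddlePaddle/Paddle-Lite).
- PicoDet is trained on COCO train2017 and validated on COCO val2017. Training uses 4 GPUs, and all pretrained models in the tables above were trained with the released default configs.
- Benchmark test: when measuring speed benchmarks, post-processing is not included in the exported model. Set `-o export.benchmark=True` or manually modify [runtime.yml](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/configs/runtime.yml#L12).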
</details>
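
The latency columns are reported in milliseconds per image. When a frames-per-second figure is easier to compare, the conversion is a single division. The sketch below assumes images are processed one at a time, so pipelining and batching are ignored; `latency_ms_to_fps` is a hypothetical helper, and the sample values are the PicoDet-S 320*320 latencies from the table above.

```python
def latency_ms_to_fps(latency_ms):
    """Convert a per-image latency in milliseconds to frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


# PicoDet-S at 320*320: 4.8 ms in the CPU column and 9.56 ms in the Lite column.
for label, latency in (("CPU", 4.8), ("Lite", 9.56)):
    print(f"{label}: {latency_ms_to_fps(latency):.1f} FPS")
```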
#### Benchmark of Other Models
| Model | Input size | mAP<sup>val<br>0.5:0.95 | mAP<sup>val<br>0.5 | Params<br><sup>(M) | FLOPS<br><sup>(G) | Latency<sup><small>[NCNN](#latency)</small><sup><br><sup>(ms) |
| :-------- | :--------: | :---------------------: | :----------------: | :----------------: | :---------------: | :-----------------------------: |
| YOLOv3-Tiny | 416*416 | 16.6 | 33.1 | 8.86 | 5.62 | 25.42 |
| YOLOv4-Tiny | 416*416 | 21.7 | 40.2 | 6.06 | 6.96 | 23.69 |
| PP-YOLO-Tiny | 320*320 | 20.6 | - | 1.08 | 0.58 | 6.75 |
| PP-YOLO-Tiny | 416*416 | 22.7 | - | 1.08 | 1.02 | 10.48 |
| Nanodet-M | 320*320 | 20.6 | - | 0.95 | 0.72 | 8.71 |
| Nanodet-M | 416*416 | 23.5 | - | 0.95 | 1.2 | 13.35 |
| Nanodet-M 1.5x | 416*416 | 26.8 | - | 2.08 | 2.42 | 15.83 |
| YOLOX-Nano | 416*416 | 25.8 | - | 0.91 | 1.08 | 19.23 |
| YOLOX-Tiny | 416*416 | 32.8 | - | 5.06 | 6.45 | 32.77 |
| YOLOv5n | 640*640 | 28.4 | 46.0 | 1.9 | 4.5 | 40.35 |
| YOLOv5s | 640*640 | 37.2 | 56.0 | 7.2 | 16.5 | 78.05 |
- The ARM benchmark script comes from [MobileDetBenchmark](https://github.com/JiweiMaster/MobileDetBenchmark).
## Quick Start
<details open>
<summary>Requirements:</summary>
- PaddlePaddle == 2.2.2
</details>
<details>
<summary>Installation</summary>
- [Installation guide](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/docs/tutorials/INSTALL.md)
- [Prepare dataset](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/docs/tutorials/data/PrepareDataSet_en.md)
</details>
<details>
<summary>Training and Evaluation</summary>
- Training on a single GPU:
```shell
# training on single-GPU
export CUDA_VISIBLE_DEVICES=0
python tools/train.py -c configs/picodet/picodet_s_320_coco_lcnet.yml --eval
```
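**Note:** If training runs out of GPU memory, reduce `batch_size` in `TrainReader` and reduce `base_lr` in `LearningRate` by the same ratio. The released configs were all trained on 4 GPUs, so if you train on 1 GPU, divide `base_lr` by 4.
- Training on multiple GPUs: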
```shell
# training on multi-GPU
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/picodet/picodet_s_320_coco_lcnet.yml --eval
```
**Note:** All PicoDet models were trained on 4 GPUs. If you change the number of training GPUs, scale the learning rate `base_lr` linearly.
- Evaluation:
```shell
python tools/eval.py -c configs/picodet/picodet_s_320_coco_lcnet.yml \
-o weights=https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams
```
- Inference:
```shell
python tools/infer.py -c configs/picodet/picodet_s_320_coco_lcnet.yml \
-o weights=https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams
```
For details, see the [Getting Started guide](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/docs/tutorials/GETTING_STARTED.md).
</details>
## Deployment
### Export and Convert the Model
<details open>
<summary>1. Export the model</summary>
```shell
cd PaddleDetection
python tools/export_model.py -c configs/picodet/picodet_s_320_coco_lcnet.yml \
-o weights=https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams \
--output_dir=output_inference
```
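- If you do not need to export post-processing, specify `-o export.post_process=False` (if `-o` already appears in the command, drop the second `-o`), or manually modify the corresponding field in [runtime.yml](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/configs/runtime.yml).
- If you do not need to export NMS, specify `-o export.nms=False`, or manually modify the corresponding field in [runtime.yml](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/configs/runtime.yml). Many ONNX export scenarios support only a single input and fixed-shape outputs, so when exporting to ONNX we recommend not exporting NMS.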
</details>
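
The options above all go under one `-o` flag. As an illustration, the sketch below assembles the export command as an argument list so that `-o` appears exactly once, however many overrides are requested. `build_export_command` is a hypothetical helper rather than part of PaddleDetection, and it only builds the command; running it is left to the caller.

```python
import shlex


def build_export_command(config, weights, output_dir="output_inference",
                         post_process=True, nms=True):
    """Build the tools/export_model.py argument list with a single -o flag."""
    overrides = [f"weights={weights}"]
    if not post_process:
        overrides.append("export.post_process=False")
    if not nms:
        overrides.append("export.nms=False")
    return ["python", "tools/export_model.py", "-c", config,
            "-o", *overrides, f"--output_dir={output_dir}"]


command = build_export_command(
    "configs/picodet/picodet_s_320_coco_lcnet.yml",
    "https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams",
    nms=False,  # typical choice when the model will be converted to ONNX
)
print(shlex.join(command))
```

The printed line can be pasted into a shell, or the list can be handed to `subprocess.run` from the PaddleDetection root directory.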
<details>
<summary>2. Convert the model to Paddle Lite (click to expand)</summary>
- Install Paddlelite>=2.10:
```shell
pip install paddlelite
```
- Convert the model to the Paddle Lite format:
```shell
# FP32
paddle_lite_opt --model_dir=output_inference/picodet_s_320_coco_lcnet --valid_targets=arm --optimize_out=picodet_s_320_coco_fp32
# FP16
paddle_lite_opt --model_dir=output_inference/picodet_s_320_coco_lcnet --valid_targets=arm --optimize_out=picodet_s_320_coco_fp16 --enable_fp16=true
```
</details>
<details>
<summary>3. Convert the model to ONNX (click to expand)</summary>
- Install [Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX) >= 0.7 and ONNX > 1.10.1. For details, see the [ONNX export tutorial](../../deploy/EXPORT_ONNX_MODEL.md).
```shell
pip install onnx
pip install paddle2onnx==0.9.2
```
- Convert the model:
```shell
paddle2onnx --model_dir output_inference/picodet_s_320_coco_lcnet/ \
--model_filename model.pdmodel \
--params_filename model.pdiparams \
--opset_version 11 \
--save_file picodet_s_320_coco.onnx
```
- Simplify the ONNX model: use the `onnx-simplifier` library to simplify the ONNX model.
- Install onnxsim >= 0.4.1:
```shell
pip install onnxsim
```
- Simplify the ONNX model:
```shell
onnxsim picodet_s_320_coco.onnx picodet_s_processed.onnx
```
</details>
- Models for deployment
| Model | Input size | ONNX | Paddle Lite(fp32) | Paddle Lite(fp16) |
| :-------- | :--------: | :---------------------: | :----------------: | :----------------: |
| PicoDet-XS | 320*320 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_xs_320_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_xs_320_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_xs_320_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_xs_320_coco_lcnet_fp16.tar) |
| PicoDet-XS | 416*416 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_xs_416_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_xs_416_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_xs_416_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_xs_416_coco_lcnet_fp16.tar) |
| PicoDet-S | 320*320 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_320_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_320_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_320_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_320_coco_lcnet_fp16.tar) |
| PicoDet-S | 416*416 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_416_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_416_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_416_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_416_coco_lcnet_fp16.tar) |
| PicoDet-M | 320*320 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_320_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_320_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_320_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_320_coco_lcnet_fp16.tar) |
| PicoDet-M | 416*416 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_416_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_416_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_416_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_416_coco_lcnet_fp16.tar) |
| PicoDet-L | 320*320 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_320_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_320_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_320_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_320_coco_lcnet_fp16.tar) |
| PicoDet-L | 416*416 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_416_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_416_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_416_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_416_coco_lcnet_fp16.tar) |
| PicoDet-L | 640*640 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_640_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_640_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_640_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_640_coco_lcnet_fp16.tar) |
### Deploy
| Inference library | Python | C++ | Prediction with post-processing |
| :-------- | :--------: | :---------------------: | :----------------: |
| OpenVINO | [Python](../../deploy/third_engine/demo_openvino/python) | [C++](../../deploy/third_engine/demo_openvino) (post-processing support in development) | ✔︎ |
| Paddle Lite | - | [C++](../../deploy/lite) | ✔︎ |
| Android Demo | - | [Paddle Lite](https://github.com/PaddlePaddle/Paddle-Lite-Demo/tree/develop/object_detection/android/app/cxx/picodet_detection_demo) | ✔︎ |
| PaddleInference | [Python](../../deploy/python) | [C++](../../deploy/cpp) | ✔︎ |
| ONNXRuntime | [Python](../../deploy/third_engine/demo_onnxruntime) | Coming soon | ✔︎ |
| NCNN | Coming soon | [C++](../../deploy/third_engine/demo_ncnn) | ✘ |
| MNN | Coming soon | [C++](../../deploy/third_engine/demo_mnn) | ✘ |
Android demo visualization:
<div align="center">
<img src="../../docs/images/picodet_android_demo1.jpg" height="500px" ><img src="../../docs/images/picodet_android_demo2.jpg" height="500px" ><img src="../../docs/images/picodet_android_demo3.jpg" height="500px" ><img src="../../docs/images/picodet_android_demo4.jpg" height="500px" >
</div>
## Quantization
<details open>
<summary>Requirements:</summary>
- PaddlePaddle >= 2.2.2
- PaddleSlim >= 2.2.2
**Install:**
```shell
pip install paddleslim==2.2.2
```
</details>
<details open>
<summary>Quantization-aware training</summary>
Start quantization-aware training:
```shell
python tools/train.py -c configs/picodet/picodet_s_416_coco_lcnet.yml \
--slim_config configs/slim/quant/picodet_s_416_lcnet_quant.yml --eval
```
- For more details, see the [slim documentation](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/slim)
</details>
- Quantization-aware training Model Zoo:
| Quantized model | Input size | mAP<sup>val<br>0.5:0.95 | Configs | Weight | Inference Model | Paddle Lite(INT8) |
| :-------- | :--------: | :--------------------: | :-------: | :----------------: | :----------------: | :----------------: |
| PicoDet-S | 416*416 | 31.5 | [config](./picodet_s_416_coco_lcnet.yml) &#124; [slim config](../slim/quant/picodet_s_416_lcnet_quant.yml) | [model](https://paddledet.bj.bcebos.com/models/picodet_s_416_coco_lcnet_quant.pdparams) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_416_coco_lcnet_quant.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_416_coco_lcnet_quant_non_postprocess.tar) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_416_coco_lcnet_quant.nb) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_416_coco_lcnet_quant_non_postprocess.nb) |
## Unstructured Pruning
<details open>
<summary>Tutorial:</summary>
For training and deployment details, see the [unstructured pruning documentation](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/legacy_model/pruner/README.md).
</details>
## Applications
- **Pedestrian detection:** for the `PicoDet-S-Pedestrian` pedestrian detection model, see [PP-TinyPose](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/keypoint/tiny_pose#%E8%A1%8C%E4%BA%BA%E6%A3%80%E6%B5%8B%E6%A8%A1%E5%9E%8B)
- **Mainbody detection:** for the `PicoDet-L-Mainbody` mainbody detection model, see the [mainbody detection documentation](./legacy_model/application/mainbody_detection/README.md)
## FAQ
<details>
<summary>Out of memory error</summary>
Reduce the `batch_size` of `TrainReader` in the config file.
</details>
<details>
<summary>How to do transfer learning</summary>
Reset the `pretrain_weights` field in the config file. For example, to continue training on your own data from a model trained on COCO:
```yaml
pretrain_weights: https://paddledet.bj.bcebos.com/models/picodet_l_640_coco_lcnet.pdparams
```
</details>
<details>
<summary>The `transpose` operator is time-consuming on some hardware</summary>
Use the `PicoDet-LCNet` models, which contain fewer `transpose` operators.
</details>
<details>
<summary>How to count model parameters</summary>
You can insert the following code at [trainer.py](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/ppdet/engine/trainer.py#L141) to count the parameters.
```python
# Sum the element counts of all named parameters of the model.
params = sum([
    p.numel() for n, p in self.model.named_parameters()
    if all([x not in n for x in ['_mean', '_variance']])
])  # exclude BatchNorm running status
print('params: ', params)
```
</details>
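
The filtering rule used in that snippet can be tried without a model. The sketch below applies the same name-based exclusion to plain `(name, element_count)` pairs. `count_parameters` and the layer names and sizes in the example are hypothetical and exist only to show which entries are skipped.

```python
def count_parameters(named_sizes, excluded_markers=("_mean", "_variance")):
    """Sum element counts, skipping names that contain any excluded marker."""
    return sum(
        numel for name, numel in named_sizes
        if not any(marker in name for marker in excluded_markers)
    )


# Hypothetical layer names and sizes: one conv layer followed by one BatchNorm layer.
example = [
    ("conv.weight", 864),
    ("bn.weight", 32),
    ("bn.bias", 32),
    ("bn._mean", 32),      # BatchNorm running mean, not counted
    ("bn._variance", 32),  # BatchNorm running variance, not counted
]
total = count_parameters(example)
print("params:", total)
```

With a real model, the pairs would come from iterating `named_parameters()` and taking `numel()` of each parameter, as in the snippet above.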
## Citing PP-PicoDet
If you use PP-PicoDet in your research, please cite our technical report as follows:
```
@misc{yu2021pppicodet,
title={PP-PicoDet: A Better Real-Time Object Detector on Mobile Devices},
author={Guanghua Yu and Qinyao Chang and Wenyu Lv and Chang Xu and Cheng Cui and Wei Ji and Qingqing Dang and Kaipeng Deng and Guanzhong Wang and Yuning Du and Baohua Lai and Qiwen Liu and Xiaoguang Hu and Dianhai Yu and Yanjun Ma},
year={2021},
eprint={2111.00902},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```

@@ -0,0 +1,342 @@
English | [简体中文](README.md)
# PP-PicoDet
![](../../docs/images/picedet_demo.jpeg)
## News
- Released a new series of PP-PicoDet models: **(2022.03.20)**
  - (1) Adopted TAL and ETA Head and optimized PAN, which greatly improved accuracy;
  - (2) Optimized CPU prediction speed and greatly improved training speed;
  - (3) The exported model includes post-processing, so prediction outputs the result directly without secondary development, which lowers the migration cost.
### Legacy Model
- Please refer to: [PicoDet 2021.10](./legacy_model/)
## Introduction
We developed a series of lightweight models named `PP-PicoDet`. With their excellent performance, these models are well suited for deployment on mobile devices or CPUs. For more details, please refer to our [report on arXiv](https://arxiv.org/abs/2111.00902).
- 🌟 Higher mAP: the **first** object detector to surpass **30+** mAP(0.5:0.95) within 1M parameters at an input size of 416.
- 🚀 Lower latency: 150 FPS on a mobile ARM CPU.
- 😊 Deployment friendly: supports PaddleLite/MNN/NCNN/OpenVINO and provides C++/Python/Android implementations.
- 😍 Advanced algorithms: builds on state-of-the-art algorithms with innovations such as ESNet, CSP-PAN, and SimOTA with VFL.
<div align="center">
<img src="../../docs/images/picodet_map.png" width='600'/>
</div>
## Benchmark
| Model | Input size | mAP<sup>val<br>0.5:0.95 | mAP<sup>val<br>0.5 | Params<br><sup>(M) | FLOPS<br><sup>(G) | Latency<sup><small>[CPU](#latency)</small><sup><br><sup>(ms) | Latency<sup><small>[Lite](#latency)</small><sup><br><sup>(ms) | Weight | Config | Inference Model |
| :-------- | :--------: | :---------------------: | :----------------: | :----------------: | :---------------: | :-----------------------------: | :-----------------------------: | :----------------------------------------: | :--------------------------------------- | :--------------------------------------- |
| PicoDet-XS | 320*320 | 23.5 | 36.1 | 0.70 | 0.67 | 3.9ms | 7.81ms | [model](https://paddledet.bj.bcebos.com/models/picodet_xs_320_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_xs_320_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_xs_320_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_xs_320_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_xs_320_coco_lcnet_non_postprocess.tar) |
| PicoDet-XS | 416*416 | 26.2 | 39.3 | 0.70 | 1.13 | 6.1ms | 12.38ms | [model](https://paddledet.bj.bcebos.com/models/picodet_xs_416_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_xs_416_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_xs_416_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_xs_416_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_xs_416_coco_lcnet_non_postprocess.tar) |
| PicoDet-S | 320*320 | 29.1 | 43.4 | 1.18 | 0.97 | 4.8ms | 9.56ms | [model](https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_s_320_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_s_320_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_320_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_320_coco_lcnet_non_postprocess.tar) |
| PicoDet-S | 416*416 | 32.5 | 47.6 | 1.18 | 1.65 | 6.6ms | 15.20ms | [model](https://paddledet.bj.bcebos.com/models/picodet_s_416_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_s_416_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_s_416_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_416_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_416_coco_lcnet_non_postprocess.tar) |
| PicoDet-M | 320*320 | 34.4 | 50.0 | 3.46 | 2.57 | 8.2ms | 17.68ms | [model](https://paddledet.bj.bcebos.com/models/picodet_m_320_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_m_320_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_m_320_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_m_320_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_m_320_coco_lcnet_non_postprocess.tar) |
| PicoDet-M | 416*416 | 37.5 | 53.4 | 3.46 | 4.34 | 12.7ms | 28.39ms | [model](https://paddledet.bj.bcebos.com/models/picodet_m_416_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_m_416_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_m_416_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_m_416_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_m_416_coco_lcnet_non_postprocess.tar) |
| PicoDet-L | 320*320 | 36.1 | 52.0 | 5.80 | 4.20 | 11.5ms | 25.21ms | [model](https://paddledet.bj.bcebos.com/models/picodet_l_320_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_320_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_l_320_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_320_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_320_coco_lcnet_non_postprocess.tar) |
| PicoDet-L | 416*416 | 39.4 | 55.7 | 5.80 | 7.10 | 20.7ms | 42.23ms | [model](https://paddledet.bj.bcebos.com/models/picodet_l_416_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_416_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_l_416_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_416_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_416_coco_lcnet_non_postprocess.tar) |
| PicoDet-L | 640*640 | 42.6 | 59.2 | 5.80 | 16.81 | 62.5ms | 108.1ms | [model](https://paddledet.bj.bcebos.com/models/picodet_l_640_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_640_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_l_640_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_640_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_640_coco_lcnet_non_postprocess.tar) |
<details open>
<summary><b>Table Notes:</b></summary>
- <a name="latency">Latency:</a> All our models are tested on an `Intel Core i7 10750H` CPU with MKLDNN using 12 threads, and on a `Qualcomm Snapdragon 865(4xA77+4xA55)` using 4 threads with ARMv8 and FP16. In the table above, `CPU` latency is measured with Paddle-Inference, and `Lite` (mobile) latency is measured with [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite).
- PicoDet is trained on the COCO train2017 dataset and evaluated on COCO val2017. Training uses 4 GPUs, and all checkpoints are trained with the default settings and hyperparameters.
- Benchmark test: when testing the speed benchmark, post-processing is not included in the exported model. Set `-o export.benchmark=True` or manually modify [runtime.yml](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/configs/runtime.yml#L12).
</details>
#### Benchmark of Other Models
| Model | Input size | mAP<sup>val<br>0.5:0.95 | mAP<sup>val<br>0.5 | Params<br><sup>(M) | FLOPS<br><sup>(G) | Latency<sup><small>[NCNN](#latency)</small><sup><br><sup>(ms) |
| :-------- | :--------: | :---------------------: | :----------------: | :----------------: | :---------------: | :-----------------------------: |
| YOLOv3-Tiny | 416*416 | 16.6 | 33.1 | 8.86 | 5.62 | 25.42 |
| YOLOv4-Tiny | 416*416 | 21.7 | 40.2 | 6.06 | 6.96 | 23.69 |
| PP-YOLO-Tiny | 320*320 | 20.6 | - | 1.08 | 0.58 | 6.75 |
| PP-YOLO-Tiny | 416*416 | 22.7 | - | 1.08 | 1.02 | 10.48 |
| Nanodet-M | 320*320 | 20.6 | - | 0.95 | 0.72 | 8.71 |
| Nanodet-M | 416*416 | 23.5 | - | 0.95 | 1.2 | 13.35 |
| Nanodet-M 1.5x | 416*416 | 26.8 | - | 2.08 | 2.42 | 15.83 |
| YOLOX-Nano | 416*416 | 25.8 | - | 0.91 | 1.08 | 19.23 |
| YOLOX-Tiny | 416*416 | 32.8 | - | 5.06 | 6.45 | 32.77 |
| YOLOv5n | 640*640 | 28.4 | 46.0 | 1.9 | 4.5 | 40.35 |
| YOLOv5s | 640*640 | 37.2 | 56.0 | 7.2 | 16.5 | 78.05 |
- Mobile latency is tested with the code in [MobileDetBenchmark](https://github.com/JiweiMaster/MobileDetBenchmark).
## Quick Start
<details open>
<summary>Requirements:</summary>
- PaddlePaddle >= 2.2.2
</details>
<details>
<summary>Installation</summary>
- [Installation guide](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/docs/tutorials/INSTALL.md)
- [Prepare dataset](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/docs/tutorials/data/PrepareDataSet_en.md)
</details>
<details>
<summary>Training and Evaluation</summary>
- Training model on single-GPU:
```shell
# training on single-GPU
export CUDA_VISIBLE_DEVICES=0
python tools/train.py -c configs/picodet/picodet_s_320_coco_lcnet.yml --eval
```
If the GPU runs out of memory during training, reduce the `batch_size` in `TrainReader` and reduce the `base_lr` in `LearningRate` proportionally. Note also that the published configs are all trained with 4 GPUs. If you train on 1 GPU instead, `base_lr` needs to be reduced by a factor of 4.
- Training model on multi-GPU:
```shell
# training on multi-GPU
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
python -m paddle.distributed.launch --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/picodet/picodet_s_320_coco_lcnet.yml --eval
```
- Evaluation:
```shell
python tools/eval.py -c configs/picodet/picodet_s_320_coco_lcnet.yml \
-o weights=https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams
```
- Infer:
```shell
python tools/infer.py -c configs/picodet/picodet_s_320_coco_lcnet.yml \
-o weights=https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams
```
For more details, refer to the [Quick start guide](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/docs/tutorials/GETTING_STARTED.md).
</details>
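The learning-rate adjustment described in the training notes can be sketched in Python. This is a minimal illustration that assumes the linear scaling rule (learning rate proportional to the total batch size). The reference values (`base_lr` of 0.32, 4 GPUs, 64 images per GPU) are taken from the published S-320 settings, and the helper name `scale_base_lr` is hypothetical, not part of PaddleDetection.

```python
def scale_base_lr(ref_lr, ref_gpus, ref_batch_size, gpus, batch_size):
    """Scale the learning rate linearly with the total batch size.

    ref_* describe the published configuration; gpus and batch_size
    describe the setup you actually train with (batch_size is per GPU).
    """
    ref_total = ref_gpus * ref_batch_size
    total = gpus * batch_size
    return ref_lr * total / ref_total


# Published setting: 4 GPUs x 64 images per GPU, base_lr = 0.32
print(scale_base_lr(0.32, 4, 64, gpus=1, batch_size=64))  # 0.08
print(scale_base_lr(0.32, 4, 64, gpus=1, batch_size=32))  # 0.04
```

The resulting value would then be written to `base_lr` in the config before training.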
## Deployment
### Export and Convert Model
<details open>
<summary>1. Export model</summary>
```shell
cd PaddleDetection
python tools/export_model.py -c configs/picodet/picodet_s_320_coco_lcnet.yml \
-o weights=https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams \
--output_dir=output_inference
```
- If no post-processing is required, specify `-o export.post_process=False` (if `-o` already appears in the command, do not repeat it; append the option after the existing `-o`) or manually modify the corresponding fields in [runtime.yml](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/configs/runtime.yml).
- If no NMS is required, specify `-o export.nms=False` or manually modify the corresponding fields in [runtime.yml](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/configs/runtime.yml). Many ONNX deployment scenarios only support a single input and fixed-shape outputs, so when exporting to ONNX, it is recommended not to export NMS.
</details>
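The export options above can be combined in a single command. As a rough sketch, the following Python assembles the argument list. It assumes that one `-o` accepts several `key=value` overrides (as the note about not repeating `-o` suggests) and that `export.nms=False` leaves NMS out of the exported model, following the description of the `nms` field in runtime.yml. The helper name `build_export_command` is hypothetical, not part of PaddleDetection.

```python
import shlex


def build_export_command(config, weights, output_dir,
                         post_process=True, nms=True):
    """Assemble the tools/export_model.py command line as a list of arguments."""
    overrides = [f"weights={weights}"]
    if not post_process:
        overrides.append("export.post_process=False")
    if not nms:
        overrides.append("export.nms=False")
    # A single -o flag is followed by all key=value overrides.
    return ["python", "tools/export_model.py", "-c", config,
            "-o", *overrides, f"--output_dir={output_dir}"]


cmd = build_export_command(
    "configs/picodet/picodet_s_320_coco_lcnet.yml",
    "https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams",
    "output_inference",
    nms=False,  # keep post-processing, drop NMS (e.g. for ONNX export)
)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell from the PaddleDetection root directory.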
<details>
<summary>2. Convert to PaddleLite (click to expand)</summary>
- Install Paddlelite>=2.10:
```shell
pip install paddlelite
```
- Convert model:
```shell
# FP32
paddle_lite_opt --model_dir=output_inference/picodet_s_320_coco_lcnet --valid_targets=arm --optimize_out=picodet_s_320_coco_fp32
# FP16
paddle_lite_opt --model_dir=output_inference/picodet_s_320_coco_lcnet --valid_targets=arm --optimize_out=picodet_s_320_coco_fp16 --enable_fp16=true
```
</details>
<details>
<summary>3. Convert to ONNX (click to expand)</summary>
- Install [Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX) >= 0.7 and ONNX > 1.10.1. For details, please refer to [Tutorials of Export ONNX Model](../../deploy/EXPORT_ONNX_MODEL.md).
```shell
pip install onnx
pip install paddle2onnx==0.9.2
```
- Convert model:
```shell
paddle2onnx --model_dir output_inference/picodet_s_320_coco_lcnet/ \
--model_filename model.pdmodel \
--params_filename model.pdiparams \
--opset_version 11 \
--save_file picodet_s_320_coco.onnx
```
- Simplify ONNX model: use onnx-simplifier to simplify onnx model.
- Install onnxsim >= 0.4.1:
```shell
pip install onnxsim
```
- simplify onnx model:
```shell
onnxsim picodet_s_320_coco.onnx picodet_s_processed.onnx
```
</details>
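When a model is exported to ONNX without NMS, suppression has to run outside the model. The following is a minimal pure-Python sketch of greedy NMS for a single class, not PaddleDetection's `MultiClassNMS` implementation. The default thresholds mirror the `score_threshold`, `nms_threshold` and `keep_top_k` values used in the PicoDet configs, while the box format `(x1, y1, x2, y2)` and the function names are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, score_threshold=0.025, nms_threshold=0.6,
        keep_top_k=100):
    """Greedy NMS for one class; returns indices of kept boxes, best first."""
    # Drop low-confidence boxes, then visit the rest from highest score down.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_threshold),
        key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap a kept box too strongly.
        if all(iou(boxes[i], boxes[j]) <= nms_threshold for j in keep):
            keep.append(i)
            if len(keep) == keep_top_k:
                break
    return keep


boxes = [(10, 10, 110, 110), (12, 12, 112, 112), (200, 200, 260, 260)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]
```

For multi-class detection, the same routine would be applied to each class separately before merging the results.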
- Deploy models
| Model | Input size | ONNX | Paddle Lite(fp32) | Paddle Lite(fp16) |
| :-------- | :--------: | :---------------------: | :----------------: | :----------------: |
| PicoDet-XS | 320*320 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_xs_320_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_xs_320_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_xs_320_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_xs_320_coco_lcnet_fp16.tar) |
| PicoDet-XS | 416*416 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_xs_416_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_xs_416_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_xs_416_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_xs_416_coco_lcnet_fp16.tar) |
| PicoDet-S | 320*320 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_320_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_320_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_320_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_320_coco_lcnet_fp16.tar) |
| PicoDet-S | 416*416 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_416_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_416_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_416_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_416_coco_lcnet_fp16.tar) |
| PicoDet-M | 320*320 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_320_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_320_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_320_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_320_coco_lcnet_fp16.tar) |
| PicoDet-M | 416*416 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_416_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_416_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_416_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_416_coco_lcnet_fp16.tar) |
| PicoDet-L | 320*320 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_320_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_320_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_320_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_320_coco_lcnet_fp16.tar) |
| PicoDet-L | 416*416 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_416_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_416_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_416_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_416_coco_lcnet_fp16.tar) |
| PicoDet-L | 640*640 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_640_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_640_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_640_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_640_coco_lcnet_fp16.tar) |
### Deploy
| Infer Engine | Python | C++ | Predict With Postprocess |
| :-------- | :--------: | :---------------------: | :----------------: |
| OpenVINO | [Python](../../deploy/third_engine/demo_openvino/python) | [C++](../../deploy/third_engine/demo_openvino) (postprocess coming soon) | ✔︎ |
| Paddle Lite | - | [C++](../../deploy/lite) | ✔︎ |
| Android Demo | - | [Paddle Lite](https://github.com/PaddlePaddle/Paddle-Lite-Demo/tree/develop/object_detection/android/app/cxx/picodet_detection_demo) | ✔︎ |
| PaddleInference | [Python](../../deploy/python) | [C++](../../deploy/cpp) | ✔︎ |
| ONNXRuntime | [Python](../../deploy/third_engine/demo_onnxruntime) | Coming soon | ✔︎ |
| NCNN | Coming soon | [C++](../../deploy/third_engine/demo_ncnn) | ✘ |
| MNN | Coming soon | [C++](../../deploy/third_engine/demo_mnn) | ✘ |
Android demo visualization:
<div align="center">
<img src="../../docs/images/picodet_android_demo1.jpg" height="500px" ><img src="../../docs/images/picodet_android_demo2.jpg" height="500px" ><img src="../../docs/images/picodet_android_demo3.jpg" height="500px" ><img src="../../docs/images/picodet_android_demo4.jpg" height="500px" >
</div>
## Quantization
<details open>
<summary>Requirements:</summary>
- PaddlePaddle >= 2.2.2
- PaddleSlim >= 2.2.2
**Install:**
```shell
pip install paddleslim==2.2.2
```
</details>
<details open>
<summary>Quant aware</summary>
Configure the quant config and start training:
```shell
python tools/train.py -c configs/picodet/picodet_s_416_coco_lcnet.yml \
--slim_config configs/slim/quant/picodet_s_416_lcnet_quant.yml --eval
```
- For more details, refer to the [slim document](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/slim).
</details>
- Quant Aware Model ZOO
| Quant Model | Input size | mAP<sup>val<br>0.5:0.95 | Configs | Weight | Inference Model | Paddle Lite(INT8) |
| :-------- | :--------: | :--------------------: | :-------: | :----------------: | :----------------: | :----------------: |
| PicoDet-S | 416*416 | 31.5 | [config](./picodet_s_416_coco_lcnet.yml) &#124; [slim config](../slim/quant/picodet_s_416_lcnet_quant.yml) | [model](https://paddledet.bj.bcebos.com/models/picodet_s_416_coco_lcnet_quant.pdparams) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_416_coco_lcnet_quant.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_416_coco_lcnet_quant_non_postprocess.tar) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_416_coco_lcnet_quant.nb) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_416_coco_lcnet_quant_non_postprocess.nb) |
## Unstructured Pruning
<details open>
<summary>Tutorial:</summary>
Please refer to this [documentation](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/legacy_model/pruner/README.md) for details such as requirements, training and deployment.
</details>
## Application
- **Pedestrian detection:** for the `PicoDet-S-Pedestrian` model zoo, please refer to [PP-TinyPose](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/keypoint/tiny_pose#%E8%A1%8C%E4%BA%BA%E6%A3%80%E6%B5%8B%E6%A8%A1%E5%9E%8B)
- **Mainbody detection:** for the `PicoDet-L-Mainbody` model zoo, please refer to [mainbody detection](./legacy_model/application/mainbody_detection/README.md)
## FAQ
<details>
<summary>Out of memory error.</summary>
Please reduce the `batch_size` of `TrainReader` in config.
</details>
<details>
<summary>How to do transfer learning.</summary>
Set `pretrain_weights` in the config to weights trained on COCO, for example:
```yaml
pretrain_weights: https://paddledet.bj.bcebos.com/models/picodet_l_640_coco_lcnet.pdparams
```
</details>
<details>
<summary>The transpose operator is time-consuming on some hardware.</summary>
Please use `PicoDet-LCNet` model, which has fewer `transpose` operators.
</details>
<details>
<summary>How to count model parameters.</summary>
You can insert the code below [here](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/ppdet/engine/trainer.py#L141) to count learnable parameters.
```python
params = sum([
    p.numel() for n, p in self.model.named_parameters()
    if all([x not in n for x in ['_mean', '_variance']])
])  # exclude BatchNorm running statistics
print('params: ', params)
```
</details>
## Cite PP-PicoDet
If you use PicoDet in your research, please cite our work by using the following BibTeX entry:
```
@misc{yu2021pppicodet,
title={PP-PicoDet: A Better Real-Time Object Detector on Mobile Devices},
author={Guanghua Yu and Qinyao Chang and Wenyu Lv and Chang Xu and Cheng Cui and Wei Ji and Qingqing Dang and Kaipeng Deng and Guanzhong Wang and Yuning Du and Baohua Lai and Qiwen Liu and Xiaoguang Hu and Dianhai Yu and Yanjun Ma},
year={2021},
eprint={2111.00902},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```


@@ -0,0 +1,18 @@
epoch: 300
LearningRate:
  base_lr: 0.32
  schedulers:
  - name: CosineDecay
    max_epochs: 300
  - name: LinearWarmup
    start_factor: 0.1
    steps: 300
OptimizerBuilder:
  optimizer:
    momentum: 0.9
    type: Momentum
  regularizer:
    factor: 0.00004
    type: L2


@@ -0,0 +1,42 @@
worker_num: 6
eval_height: &eval_height 320
eval_width: &eval_width 320
eval_size: &eval_size [*eval_height, *eval_width]
TrainReader:
  sample_transforms:
  - Decode: {}
  - RandomCrop: {}
  - RandomFlip: {prob: 0.5}
  - RandomDistort: {}
  batch_transforms:
  - BatchRandomResize: {target_size: [256, 288, 320, 352, 384], random_size: True, random_interp: True, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  - PadGT: {}
  batch_size: 64
  shuffle: true
  drop_last: true
EvalReader:
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 8
  shuffle: false
TestReader:
  inputs_def:
    image_shape: [1, 3, *eval_height, *eval_width]
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_size: 1


@@ -0,0 +1,42 @@
worker_num: 6
eval_height: &eval_height 416
eval_width: &eval_width 416
eval_size: &eval_size [*eval_height, *eval_width]
TrainReader:
  sample_transforms:
  - Decode: {}
  - RandomCrop: {}
  - RandomFlip: {prob: 0.5}
  - RandomDistort: {}
  batch_transforms:
  - BatchRandomResize: {target_size: [352, 384, 416, 448, 480], random_size: True, random_interp: True, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  - PadGT: {}
  batch_size: 64
  shuffle: true
  drop_last: true
EvalReader:
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 8
  shuffle: false
TestReader:
  inputs_def:
    image_shape: [1, 3, *eval_height, *eval_width]
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_size: 1


@@ -0,0 +1,42 @@
worker_num: 6
eval_height: &eval_height 640
eval_width: &eval_width 640
eval_size: &eval_size [*eval_height, *eval_width]
TrainReader:
  sample_transforms:
  - Decode: {}
  - RandomCrop: {}
  - RandomFlip: {prob: 0.5}
  - RandomDistort: {}
  batch_transforms:
  - BatchRandomResize: {target_size: [576, 608, 640, 672, 704], random_size: True, random_interp: True, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  - PadGT: {}
  batch_size: 32
  shuffle: true
  drop_last: true
EvalReader:
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 8
  shuffle: false
TestReader:
  inputs_def:
    image_shape: [1, 3, *eval_height, *eval_width]
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_size: 1


@@ -0,0 +1,61 @@
architecture: PicoDet
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/LCNet_x1_5_pretrained.pdparams
PicoDet:
  backbone: LCNet
  neck: LCPAN
  head: PicoHeadV2
LCNet:
  scale: 1.5
  feature_maps: [3, 4, 5]
LCPAN:
  out_channels: 128
  use_depthwise: True
  num_features: 4
PicoHeadV2:
  conv_feat:
    name: PicoFeat
    feat_in: 128
    feat_out: 128
    num_convs: 4
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
    use_se: True
  fpn_stride: [8, 16, 32, 64]
  feat_in_chan: 128
  prior_prob: 0.01
  reg_max: 7
  cell_offset: 0.5
  grid_cell_scale: 5.0
  static_assigner_epoch: 100
  use_align_head: True
  static_assigner:
    name: ATSSAssigner
    topk: 9
    force_gt_matching: False
  assigner:
    name: TaskAlignedAssigner
    topk: 13
    alpha: 1.0
    beta: 6.0
  loss_class:
    name: VarifocalLoss
    use_sigmoid: False
    iou_weighted: True
    loss_weight: 1.0
  loss_dfl:
    name: DistributionFocalLoss
    loss_weight: 0.5
  loss_bbox:
    name: GIoULoss
    loss_weight: 2.5
  nms:
    name: MultiClassNMS
    nms_top_k: 1000
    keep_top_k: 100
    score_threshold: 0.025
    nms_threshold: 0.6


@@ -0,0 +1,161 @@
use_gpu: true
use_xpu: false
log_iter: 20
save_dir: output
snapshot_epoch: 1
print_flops: false
# Exporting the model
export:
  post_process: True # Whether post-processing is included in the network when export model.
  nms: True # Whether NMS is included in the network when export model.
  benchmark: False # It is used to testing model performance, if set `True`, post-process and NMS will not be exported.
metric: COCO
num_classes: 1
architecture: PicoDet
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_75_pretrained.pdparams
weights: output/picodet_s_192_lcnet_pedestrian/best_model
find_unused_parameters: True
use_ema: true
epoch: 300
snapshot_epoch: 10
PicoDet:
  backbone: LCNet
  neck: LCPAN
  head: PicoHeadV2
LCNet:
  scale: 0.75
  feature_maps: [3, 4, 5]
LCPAN:
  out_channels: 96
  use_depthwise: True
  num_features: 4
PicoHeadV2:
  conv_feat:
    name: PicoFeat
    feat_in: 96
    feat_out: 96
    num_convs: 2
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
    use_se: True
  feat_in_chan: 96
  fpn_stride: [8, 16, 32, 64]
  prior_prob: 0.01
  reg_max: 7
  cell_offset: 0.5
  grid_cell_scale: 5.0
  static_assigner_epoch: 100
  use_align_head: True
  static_assigner:
    name: ATSSAssigner
    topk: 4
    force_gt_matching: False
  assigner:
    name: TaskAlignedAssigner
    topk: 13
    alpha: 1.0
    beta: 6.0
  loss_class:
    name: VarifocalLoss
    use_sigmoid: False
    iou_weighted: True
    loss_weight: 1.0
  loss_dfl:
    name: DistributionFocalLoss
    loss_weight: 0.5
  loss_bbox:
    name: GIoULoss
    loss_weight: 2.5
  nms:
    name: MultiClassNMS
    nms_top_k: 1000
    keep_top_k: 100
    score_threshold: 0.025
    nms_threshold: 0.6
LearningRate:
  base_lr: 0.32
  schedulers:
  - !CosineDecay
    max_epochs: 300
  - !LinearWarmup
    start_factor: 0.1
    steps: 300
OptimizerBuilder:
  optimizer:
    momentum: 0.9
    type: Momentum
  regularizer:
    factor: 0.00004
    type: L2
worker_num: 6
eval_height: &eval_height 192
eval_width: &eval_width 192
eval_size: &eval_size [*eval_height, *eval_width]
TrainReader:
  sample_transforms:
  - Decode: {}
  - RandomCrop: {}
  - RandomFlip: {prob: 0.5}
  - RandomDistort: {}
  batch_transforms:
  - BatchRandomResize: {target_size: [128, 160, 192, 224, 256], random_size: True, random_interp: True, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  - PadGT: {}
  batch_size: 64
  shuffle: true
  drop_last: true
EvalReader:
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 8
  shuffle: false
TestReader:
  inputs_def:
    image_shape: [1, 3, *eval_height, *eval_width]
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_size: 1
TrainDataset:
  !COCODataSet
    image_dir: ""
    anno_path: aic_coco_train_cocoformat.json
    dataset_dir: dataset
    data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
EvalDataset:
  !COCODataSet
    image_dir: val2017
    anno_path: annotations/instances_val2017.json
    dataset_dir: dataset/coco
TestDataset:
  !ImageFolder
    anno_path: annotations/instances_val2017.json # also support txt (like VOC's label_list.txt)
    dataset_dir: dataset/coco # if set, anno_path will be 'dataset_dir/anno_path'


@@ -0,0 +1,160 @@
use_gpu: true
use_xpu: false
log_iter: 20
save_dir: output
snapshot_epoch: 1
print_flops: false
# Exporting the model
export:
  post_process: True # Whether post-processing is included in the network when export model.
  nms: True # Whether NMS is included in the network when export model.
  benchmark: False # It is used to testing model performance, if set `True`, post-process and NMS will not be exported.
metric: COCO
num_classes: 1
architecture: PicoDet
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_75_pretrained.pdparams
weights: output/picodet_s_320_lcnet_pedestrian/best_model
find_unused_parameters: True
use_ema: true
epoch: 300
snapshot_epoch: 10
PicoDet:
  backbone: LCNet
  neck: LCPAN
  head: PicoHeadV2
LCNet:
  scale: 0.75
  feature_maps: [3, 4, 5]
LCPAN:
  out_channels: 96
  use_depthwise: True
  num_features: 4
PicoHeadV2:
  conv_feat:
    name: PicoFeat
    feat_in: 96
    feat_out: 96
    num_convs: 2
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
    use_se: True
  feat_in_chan: 96
  fpn_stride: [8, 16, 32, 64]
  prior_prob: 0.01
  reg_max: 7
  cell_offset: 0.5
  grid_cell_scale: 5.0
  static_assigner_epoch: 100
  use_align_head: True
  static_assigner:
    name: ATSSAssigner
    topk: 9
    force_gt_matching: False
  assigner:
    name: TaskAlignedAssigner
    topk: 13
    alpha: 1.0
    beta: 6.0
  loss_class:
    name: VarifocalLoss
    use_sigmoid: False
    iou_weighted: True
    loss_weight: 1.0
  loss_dfl:
    name: DistributionFocalLoss
    loss_weight: 0.5
  loss_bbox:
    name: GIoULoss
    loss_weight: 2.5
  nms:
    name: MultiClassNMS
    nms_top_k: 1000
    keep_top_k: 100
    score_threshold: 0.025
    nms_threshold: 0.6
LearningRate:
  base_lr: 0.32
  schedulers:
  - !CosineDecay
    max_epochs: 300
  - !LinearWarmup
    start_factor: 0.1
    steps: 300
OptimizerBuilder:
  optimizer:
    momentum: 0.9
    type: Momentum
  regularizer:
    factor: 0.00004
    type: L2
worker_num: 6
eval_height: &eval_height 320
eval_width: &eval_width 320
eval_size: &eval_size [*eval_height, *eval_width]
TrainReader:
  sample_transforms:
  - Decode: {}
  - RandomCrop: {}
  - RandomFlip: {prob: 0.5}
  - RandomDistort: {}
  batch_transforms:
  - BatchRandomResize: {target_size: [256, 288, 320, 352, 384], random_size: True, random_interp: True, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  - PadGT: {}
  batch_size: 64
  shuffle: true
  drop_last: true
EvalReader:
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 8
  shuffle: false
TestReader:
  inputs_def:
    image_shape: [1, 3, *eval_height, *eval_width]
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_size: 1
TrainDataset:
  !COCODataSet
    image_dir: ""
    anno_path: aic_coco_train_cocoformat.json
    dataset_dir: dataset
    data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
EvalDataset:
  !COCODataSet
    image_dir: val2017
    anno_path: annotations/instances_val2017.json
    dataset_dir: dataset/coco
TestDataset:
  !ImageFolder
    anno_path: annotations/instances_val2017.json # also support txt (like VOC's label_list.txt)
    dataset_dir: dataset/coco # if set, anno_path will be 'dataset_dir/anno_path'


@@ -0,0 +1,60 @@
# PP-PicoDet Legacy Model-ZOO (2021.10)
| Model | Input size | mAP<sup>val<br>0.5:0.95 | mAP<sup>val<br>0.5 | Params<br><sup>(M) | FLOPS<br><sup>(G) | Latency<sup><small>[NCNN](#latency)</small><sup><br><sup>(ms) | Latency<sup><small>[Lite](#latency)</small><sup><br><sup>(ms) | Download | Config |
| :-------- | :--------: | :---------------------: | :----------------: | :----------------: | :---------------: | :-----------------------------: | :-----------------------------: | :----------------------------------------: | :--------------------------------------- |
| PicoDet-S | 320*320 | 27.1 | 41.4 | 0.99 | 0.73 | 8.13 | **6.65** | [model](https://paddledet.bj.bcebos.com/models/picodet_s_320_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_s_320_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_s_320_coco.yml) |
| PicoDet-S | 416*416 | 30.7 | 45.8 | 0.99 | 1.24 | 12.37 | **9.82** | [model](https://paddledet.bj.bcebos.com/models/picodet_s_416_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_s_416_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_s_416_coco.yml) |
| PicoDet-M | 320*320 | 30.9 | 45.7 | 2.15 | 1.48 | 11.27 | **9.61** | [model](https://paddledet.bj.bcebos.com/models/picodet_m_320_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_m_320_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_m_320_coco.yml) |
| PicoDet-M | 416*416 | 34.8 | 50.5 | 2.15 | 2.50 | 17.39 | **15.88** | [model](https://paddledet.bj.bcebos.com/models/picodet_m_416_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_m_416_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_m_416_coco.yml) |
| PicoDet-L | 320*320 | 32.9 | 48.2 | 3.30 | 2.23 | 15.26 | **13.42** | [model](https://paddledet.bj.bcebos.com/models/picodet_l_320_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_320_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_l_320_coco.yml) |
| PicoDet-L | 416*416 | 36.6 | 52.5 | 3.30 | 3.76 | 23.36 | **21.85** | [model](https://paddledet.bj.bcebos.com/models/picodet_l_416_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_416_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_l_416_coco.yml) |
| PicoDet-L | 640*640 | 40.9 | 57.6 | 3.30 | 8.91 | 54.11 | **50.55** | [model](https://paddledet.bj.bcebos.com/models/picodet_l_640_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_640_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_l_640_coco.yml) |
#### More Configs
| Model | Input size | mAP<sup>val<br>0.5:0.95 | mAP<sup>val<br>0.5 | Params<br><sup>(M) | FLOPS<br><sup>(G) | Latency<sup><small>[NCNN](#latency)</small><sup><br><sup>(ms) | Latency<sup><small>[Lite](#latency)</small><sup><br><sup>(ms) | Download | Config |
| :--------------------------- | :--------: | :---------------------: | :----------------: | :----------------: | :---------------: | :-----------------------------: | :-----------------------------: | :----------------------------------------: | :--------------------------------------- |
| PicoDet-Shufflenetv2 1x | 416*416 | 30.0 | 44.6 | 1.17 | 1.53 | 15.06 | **10.63** | [model](https://paddledet.bj.bcebos.com/models/picodet_shufflenetv2_1x_416_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_shufflenetv2_1x_416_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/more_config/picodet_shufflenetv2_1x_416_coco.yml) |
| PicoDet-MobileNetv3-large 1x | 416*416 | 35.6 | 52.0 | 3.55 | 2.80 | 20.71 | **17.88** | [model](https://paddledet.bj.bcebos.com/models/picodet_mobilenetv3_large_1x_416_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_mobilenetv3_large_1x_416_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/more_config/picodet_mobilenetv3_large_1x_416_coco.yml) |
| PicoDet-LCNet 1.5x | 416*416 | 36.3 | 52.2 | 3.10 | 3.85 | 21.29 | **20.8** | [model](https://paddledet.bj.bcebos.com/models/picodet_lcnet_1_5x_416_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_lcnet_1_5x_416_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/more_config/picodet_lcnet_1_5x_416_coco.yml) |
| PicoDet-LCNet 1.5x | 640*640 | 40.6 | 57.4 | 3.10 | - | - | - | [model](https://paddledet.bj.bcebos.com/models/picodet_lcnet_1_5x_640_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_lcnet_1_5x_640_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/more_config/picodet_lcnet_1_5x_640_coco.yml) |
| PicoDet-R18 | 640*640 | 40.7 | 57.2 | 11.10 | - | - | - | [model](https://paddledet.bj.bcebos.com/models/picodet_r18_640_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_r18_640_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/more_config/picodet_r18_640_coco.yml) |
<details open>
<summary><b>Table Notes:</b></summary>
- <a name="latency">Latency:</a> All models are tested on a `Qualcomm Snapdragon 865(4xA77+4xA55)` using 4 threads on armv8 with FP16. In the table above, latency is measured with [NCNN](https://github.com/Tencent/ncnn) and `Lite`->[Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite), using the benchmark code in [MobileDetBenchmark](https://github.com/JiweiMaster/MobileDetBenchmark).
- PicoDet is trained on the COCO train2017 dataset and evaluated on COCO val2017.
- PicoDet models are trained on 4 or 8 GPUs, and all checkpoints use the default settings and hyperparameters.
</details>
- Deploy models
| Model | Input size | ONNX | Paddle Lite(fp32) | Paddle Lite(fp16) |
| :-------- | :--------: | :---------------------: | :----------------: | :----------------: |
| PicoDet-S | 320*320 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_320_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_320.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_320_fp16.tar) |
| PicoDet-S | 416*416 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_416_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_416.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_416_fp16.tar) |
| PicoDet-M | 320*320 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_320_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_320.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_320_fp16.tar) |
| PicoDet-M | 416*416 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_416_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_416.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_416_fp16.tar) |
| PicoDet-L | 320*320 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_320_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_320.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_320_fp16.tar) |
| PicoDet-L | 416*416 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_416_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_416.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_416_fp16.tar) |
| PicoDet-L | 640*640 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_640_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_640.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_640_fp16.tar) |
| PicoDet-Shufflenetv2 1x | 416*416 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_shufflenetv2_1x_416_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_shufflenetv2_1x.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_shufflenetv2_1x_fp16.tar) |
| PicoDet-MobileNetv3-large 1x | 416*416 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_mobilenetv3_large_1x_416_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_mobilenetv3_large_1x.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_mobilenetv3_large_1x_fp16.tar) |
| PicoDet-LCNet 1.5x | 416*416 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_lcnet_1_5x_416_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_lcnet_1_5x.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_lcnet_1_5x_fp16.tar) |
## Cite PP-PicoDet
```
@misc{yu2021pppicodet,
title={PP-PicoDet: A Better Real-Time Object Detector on Mobile Devices},
author={Guanghua Yu and Qinyao Chang and Wenyu Lv and Chang Xu and Cheng Cui and Wei Ji and Qingqing Dang and Kaipeng Deng and Guanzhong Wang and Yuning Du and Baohua Lai and Qiwen Liu and Xiaoguang Hu and Dianhai Yu and Yanjun Ma},
year={2021},
eprint={2111.00902},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```

View File

@@ -0,0 +1,18 @@
epoch: 100
LearningRate:
  base_lr: 0.4
  schedulers:
  - name: CosineDecay
    max_epochs: 100
  - name: LinearWarmup
    start_factor: 0.1
    steps: 300
OptimizerBuilder:
  optimizer:
    momentum: 0.9
    type: Momentum
  regularizer:
    factor: 0.00004
    type: L2
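
The schedule configured above combines a linear warmup with cosine decay. The sketch below shows how such a schedule could be evaluated per training step. It is a minimal illustration, not PaddleDetection's implementation: the function `learning_rate` and the `steps_per_epoch` argument are hypothetical, and the decay is assumed to run down to zero.

```python
import math


def learning_rate(step, steps_per_epoch, base_lr=0.4, max_epochs=100,
                  warmup_steps=300, start_factor=0.1):
    """Linear warmup for the first `warmup_steps`, then cosine decay.

    `steps_per_epoch` is an assumption: it depends on dataset size,
    batch size, and the number of GPUs.
    """
    if step < warmup_steps:
        # Ramp linearly from start_factor * base_lr up to base_lr.
        alpha = step / warmup_steps
        factor = start_factor * (1 - alpha) + alpha
        return base_lr * factor
    total_steps = max_epochs * steps_per_epoch
    # Half cosine wave: base_lr at step 0, zero at the last step.
    return base_lr * 0.5 * (math.cos(math.pi * step / total_steps) + 1)


if __name__ == "__main__":
    for s in (0, 150, 300, 50_000, 100_000):
        print(s, round(learning_rate(s, steps_per_epoch=1000), 5))
```

Since `base_lr` is usually tuned for a specific total batch size, it is commonly rescaled when the number of GPUs or the per-GPU batch size changes.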

View File

@@ -0,0 +1,18 @@
epoch: 300
LearningRate:
  base_lr: 0.4
  schedulers:
  - name: CosineDecay
    max_epochs: 300
  - name: LinearWarmup
    start_factor: 0.1
    steps: 300
OptimizerBuilder:
  optimizer:
    momentum: 0.9
    type: Momentum
  regularizer:
    factor: 0.00004
    type: L2

View File

@@ -0,0 +1,42 @@
worker_num: 6
eval_height: &eval_height 320
eval_width: &eval_width 320
eval_size: &eval_size [*eval_height, *eval_width]
TrainReader:
  sample_transforms:
  - Decode: {}
  - RandomCrop: {}
  - RandomFlip: {prob: 0.5}
  - RandomDistort: {}
  batch_transforms:
  - BatchRandomResize: {target_size: [256, 288, 320, 352, 384], random_size: True, random_interp: True, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_size: 128
  shuffle: true
  drop_last: true
  collate_batch: false
EvalReader:
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 8
  shuffle: false
TestReader:
  inputs_def:
    image_shape: [1, 3, *eval_height, *eval_width]
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_size: 1
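
As a rough illustration of what the `NormalizeImage` and `Permute` steps in the reader above do to a single image, the sketch below works on plain nested lists. It assumes that `is_scale: true` means dividing pixel values by 255 before applying mean and std, and that channels are in RGB order. `normalize_and_permute` is a hypothetical helper, not a PaddleDetection function.

```python
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalize_and_permute(image_hwc):
    """Convert an [H][W][3] image of 0-255 values to [3][H][W] floats.

    Each channel is scaled to [0, 1], then shifted by MEAN and divided by STD.
    """
    height, width = len(image_hwc), len(image_hwc[0])
    chw = []
    for c in range(3):
        plane = [
            [(image_hwc[y][x][c] / 255.0 - MEAN[c]) / STD[c] for x in range(width)]
            for y in range(height)
        ]
        chw.append(plane)
    return chw


if __name__ == "__main__":
    tiny = [[[255, 0, 128], [0, 0, 0]]]  # 1 x 2 image, 3 channels
    out = normalize_and_permute(tiny)
    print(len(out), len(out[0]), len(out[0][0]))  # channels, height, width
```

A deployment pipeline that feeds the exported model from another runtime would need to reproduce the same normalization, otherwise predictions may degrade.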

View File

@@ -0,0 +1,42 @@
worker_num: 6
eval_height: &eval_height 416
eval_width: &eval_width 416
eval_size: &eval_size [*eval_height, *eval_width]
TrainReader:
  sample_transforms:
  - Decode: {}
  - RandomCrop: {}
  - RandomFlip: {prob: 0.5}
  - RandomDistort: {}
  batch_transforms:
  - BatchRandomResize: {target_size: [352, 384, 416, 448, 480], random_size: True, random_interp: True, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_size: 80
  shuffle: true
  drop_last: true
  collate_batch: false
EvalReader:
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 8
  shuffle: false
TestReader:
  inputs_def:
    image_shape: [1, 3, *eval_height, *eval_width]
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_size: 1

View File

@@ -0,0 +1,42 @@
worker_num: 6
eval_height: &eval_height 640
eval_width: &eval_width 640
eval_size: &eval_size [*eval_height, *eval_width]
TrainReader:
  sample_transforms:
  - Decode: {}
  - RandomCrop: {}
  - RandomFlip: {prob: 0.5}
  - RandomDistort: {}
  batch_transforms:
  - BatchRandomResize: {target_size: [576, 608, 640, 672, 704], random_size: True, random_interp: True, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_size: 56
  shuffle: true
  drop_last: true
  collate_batch: false
EvalReader:
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 8
  shuffle: false
TestReader:
  inputs_def:
    image_shape: [1, 3, *eval_height, *eval_width]
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_size: 1

View File

@@ -0,0 +1,55 @@
architecture: PicoDet
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x1_0_pretrained.pdparams
PicoDet:
  backbone: ESNet
  neck: CSPPAN
  head: PicoHead
ESNet:
  scale: 1.0
  feature_maps: [4, 11, 14]
  act: hard_swish
  channel_ratio: [0.875, 0.5, 1.0, 0.625, 0.5, 0.75, 0.625, 0.625, 0.5, 0.625, 1.0, 0.625, 0.75]
CSPPAN:
  out_channels: 128
  use_depthwise: True
  num_csp_blocks: 1
  num_features: 4
PicoHead:
  conv_feat:
    name: PicoFeat
    feat_in: 128
    feat_out: 128
    num_convs: 4
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
  fpn_stride: [8, 16, 32, 64]
  feat_in_chan: 128
  prior_prob: 0.01
  reg_max: 7
  cell_offset: 0.5
  loss_class:
    name: VarifocalLoss
    use_sigmoid: True
    iou_weighted: True
    loss_weight: 1.0
  loss_dfl:
    name: DistributionFocalLoss
    loss_weight: 0.25
  loss_bbox:
    name: GIoULoss
    loss_weight: 2.0
  assigner:
    name: SimOTAAssigner
    candidate_topk: 10
    iou_weight: 6
  nms:
    name: MultiClassNMS
    nms_top_k: 1000
    keep_top_k: 100
    score_threshold: 0.025
    nms_threshold: 0.6
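
With `reg_max: 7` and `DistributionFocalLoss`, each box side is predicted as a discrete distribution over 8 bins rather than as a single number. The sketch below shows how such a distribution could be decoded into a box, in the style of Generalized Focal Loss. It is an assumption-laden illustration: `decode_distance` and `decode_box` are hypothetical helpers, and the actual `PicoHead` decoding may differ in detail (for example in how `cell_offset` is applied).

```python
import math


def decode_distance(logits, stride):
    """Expected bin index under softmax(logits), scaled by the FPN stride."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    expected_bin = sum(i * e / total for i, e in enumerate(exps))
    return expected_bin * stride


def decode_box(center_x, center_y, side_logits, stride):
    """side_logits holds four lists (left, top, right, bottom) of reg_max + 1 logits."""
    left, top, right, bottom = (decode_distance(l, stride) for l in side_logits)
    return (center_x - left, center_y - top, center_x + right, center_y + bottom)


if __name__ == "__main__":
    uniform = [0.0] * 8  # reg_max = 7 gives 8 bins
    print(decode_box(100.0, 100.0, [uniform] * 4, stride=8))
```

Under this scheme the largest distance a side can express is `reg_max * stride`, which is one reason several FPN strides are used for objects of different sizes.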

View File

@@ -0,0 +1,56 @@
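# More Applications
## 1. Layout analysis task
Layout analysis divides a document image into regions and locates its key areas, such as text, titles, tables, and figures. An example is shown below.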
<div align="center">
<img src="images/layout_demo.png" width="800">
</div>
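### 1.1 Dataset
The English document layout analysis model is trained on [PubLayNet](https://github.com/ibm-aur-nlp/PubLayNet). This dataset targets English literature (research paper) scenarios and consists of a training set (333,703 annotated images), a validation set (11,245 annotated images), and a test set (11,405 images), covering 5 categories: Table, Figure, Title, Text, and List. For more, see [layout analysis datasets](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/ppstructure/layout/README.md#32).
### 1.2 Model zoo
PicoDet models are trained on the PubLayNet dataset, with FGD distillation also applied. The pre-trained models are listed below:
| Model | Input size | mAP<sup>val<br/>0.5</sup> | Download | Config |
| :-------- | :--------: | :----------------: | :---------------: | ----------------- |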
| PicoDet-LCNet_x1_0 | 800*608 | 93.5% | [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_layout.pdparams) &#124; [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_layout_infer.tar) | [config](./picodet_lcnet_x1_0_layout.yml) |
| PicoDet-LCNet_x1_0 + FGD | 800*608 | 94.0% | [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout.pdparams) &#124; [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_infer.tar) | [teacher config](./picodet_lcnet_x2_5_layout.yml)&#124;[student config](./picodet_lcnet_x1_0_layout.yml) |
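[Introduction to FGD distillation](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/configs/slim/distill/README.md)
### 1.3 Model inference
For the complete layout analysis workflow (data preparation, model training, evaluation, and so on), see [Layout Analysis](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/ppstructure/layout/README.md). Only model inference is shown here. First, download the inference model from the model zoo.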
```
mkdir inference_model
cd inference_model
# Download and extract the PubLayNet inference model
wget https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_infer.tar && tar xf picodet_lcnet_x1_0_fgd_layout_infer.tar
cd ..
```
To run inference for the layout analysis task, execute the following command:
```bash
python3 deploy/python/infer.py \
--model_dir=inference_model/picodet_lcnet_x1_0_fgd_layout_infer/ \
--image_file=docs/images/layout.jpg \
--device=CPU
```
The visualized layout result is shown below:
<div align="center">
<img src="images/layout_res.jpg" width="800">
</div>
## 2. Reference
[1] Zhong X, Tang J, Yepes A J. Publaynet: largest dataset ever for document layout analysis[C]//2019 International Conference on Document Analysis and Recognition (ICDAR). IEEE, 2019: 1015-1022.


View File

@@ -0,0 +1,90 @@
_BASE_: [
'../../../../runtime.yml',
'../../_base_/picodet_esnet.yml',
'../../_base_/optimizer_100e.yml',
'../../_base_/picodet_640_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/LCNet_x1_0_pretrained.pdparams
weights: output/picodet_lcnet_x1_0_layout/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 10
snapshot_epoch: 1
epoch: 100
PicoDet:
  backbone: LCNet
  neck: CSPPAN
  head: PicoHead
  nms_cpu: True
LCNet:
  scale: 1.0
  feature_maps: [3, 4, 5]
metric: COCO
num_classes: 5
TrainDataset:
  name: COCODataSet
  image_dir: train
  anno_path: train.json
  dataset_dir: ./dataset/publaynet/
  data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
EvalDataset:
  name: COCODataSet
  image_dir: val
  anno_path: val.json
  dataset_dir: ./dataset/publaynet/
TestDataset:
  !ImageFolder
    anno_path: ./dataset/publaynet/val.json
worker_num: 8
eval_height: &eval_height 800
eval_width: &eval_width 608
eval_size: &eval_size [*eval_height, *eval_width]
TrainReader:
  sample_transforms:
  - Decode: {}
  - RandomCrop: {}
  - RandomFlip: {prob: 0.5}
  - RandomDistort: {}
  batch_transforms:
  - BatchRandomResize: {target_size: [[768, 576], [800, 608], [832, 640]], random_size: True, random_interp: True, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_size: 24
  shuffle: true
  drop_last: true
  collate_batch: false
EvalReader:
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: [800, 608], keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 8
  shuffle: false
TestReader:
  inputs_def:
    image_shape: [1, 3, 800, 608]
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: [800, 608], keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 1
  shuffle: false

View File

@@ -0,0 +1,34 @@
_BASE_: [
'../../_base_/picodet_esnet.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/LCNet_x2_5_ssld_pretrained.pdparams
weights: output/picodet_lcnet_x2_5_layout/model_final
find_unused_parameters: True
PicoDet:
  backbone: LCNet
  neck: CSPPAN
  head: PicoHead
  nms_cpu: True
LCNet:
  scale: 2.5
  feature_maps: [3, 4, 5]
CSPPAN:
  spatial_scales: [0.125, 0.0625, 0.03125]
slim: Distill
slim_method: FGD
distill_loss: FGDFeatureLoss
distill_loss_name: ['neck_f_3', 'neck_f_2', 'neck_f_1', 'neck_f_0']
FGDFeatureLoss:
  student_channels: 128
  teacher_channels: 128
  temp: 0.5
  alpha_fgd: 0.001
  beta_fgd: 0.0005
  gamma_fgd: 0.0005
  lambda_fgd: 0.000005

View File

@@ -0,0 +1,30 @@
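# More Applications
## 1. Main-body detection task
Main-body detection is a widely used detection technique. It locates one or more main objects in an image; the corresponding regions are then cropped out and passed to a recognition model, which completes the recognition pipeline. As a preliminary step to recognition, main-body detection can effectively improve recognition accuracy.
Main-body detection is used in the PP-ShiTu image recognition system of PaddleClas. The main-body detection model in PP-ShiTu is based on PP-PicoDet. For more about PP-ShiTu and how to use it, see [PP-ShiTu](https://github.com/PaddlePaddle/PaddleClas).
### 1.1 Dataset
The following datasets were mainly used to train the main-body detection model for the PP-ShiTu image recognition task.
| Dataset | Size | Images used for main-body detection | Scenario | URL |
| :------------: | :-------------: | :-------: | :-------: | :--------: |
| Objects365 | 1700k | 173k | General | [link](https://www.objects365.org/overview.html) |
| COCO2017 | 118k | 118k | General | [link](https://cocodataset.org/) |
| iCartoonFace | 48k | 48k | Cartoon face detection | [link](https://github.com/luxiangju-PersonAI/iCartoonFace) |
| LogoDet-3k | 155k | 155k | Logo detection | [link](https://github.com/Wangjing1551/LogoDet-3K-Dataset) |
| RPC | 54k | 54k | Product detection | [link](https://rpc-dataset.github.io/) |
During training, all of these datasets are mixed together. Because the task is main-body detection, the category of every annotated box is changed to `foreground`, so the merged dataset contains only one category, foreground. For the dataset configuration, see [picodet_lcnet_x2_5_640_mainbody.yml](./picodet_lcnet_x2_5_640_mainbody.yml).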
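
The merging step described above can be sketched as follows, assuming every source dataset has already been converted to COCO-format dictionaries with `images` and `annotations` lists. `merge_as_foreground` is a hypothetical helper written for illustration; it is not part of PaddleDetection or PaddleClas, and the actual preparation scripts may work differently.

```python
def merge_as_foreground(coco_dicts):
    """Merge several COCO-format datasets into one single-class dataset.

    Image and annotation ids are renumbered so they stay unique after
    merging, and every box is assigned category id 1 ("foreground").
    """
    merged = {
        "images": [],
        "annotations": [],
        "categories": [{"id": 1, "name": "foreground"}],
    }
    next_image_id, next_ann_id = 1, 1
    for coco in coco_dicts:
        id_map = {}  # old image id -> new image id, per source dataset
        for img in coco["images"]:
            id_map[img["id"]] = next_image_id
            merged["images"].append({**img, "id": next_image_id})
            next_image_id += 1
        for ann in coco["annotations"]:
            merged["annotations"].append({
                **ann,
                "id": next_ann_id,
                "image_id": id_map[ann["image_id"]],
                "category_id": 1,
            })
            next_ann_id += 1
    return merged


if __name__ == "__main__":
    a = {"images": [{"id": 7, "file_name": "a.jpg"}],
         "annotations": [{"id": 3, "image_id": 7, "category_id": 42, "bbox": [0, 0, 5, 5]}]}
    b = {"images": [{"id": 7, "file_name": "b.jpg"}],
         "annotations": [{"id": 3, "image_id": 7, "category_id": 9, "bbox": [1, 1, 2, 2]}]}
    print(merge_as_foreground([a, b])["annotations"])
```

With a single category, the training config would use `num_classes: 1`.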
### 1.2 Model zoo
| Model | Input size | mAP<sup>val<br>0.5:0.95</sup> | mAP<sup>val<br>0.5</sup> | Download | Config |
| :-------- | :--------: | :---------------------: | :----------------: | :----------------: | :---------------: |
| PicoDet-LCNet_x2_5 | 640*640 | 41.5 | 62.0 | [trained model](https://paddledet.bj.bcebos.com/models/picodet_lcnet_x2_5_640_mainbody.pdparams) &#124; [inference model](https://paddledet.bj.bcebos.com/models/picodet_lcnet_x2_5_640_mainbody_infer.tar) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_lcnet_x2_5_640_mainbody.log) | [config](./picodet_lcnet_x2_5_640_mainbody.yml) |

View File

@@ -0,0 +1,23 @@
_BASE_: [
'../../../../datasets/coco_detection.yml',
'../../../../runtime.yml',
'../../_base_/picodet_esnet.yml',
'../../_base_/optimizer_100e.yml',
'../../_base_/picodet_640_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/LCNet_x2_5_ssld_pretrained.pdparams
weights: output/picodet_lcnet_x2_5_640_mainbody/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 20
snapshot_epoch: 2
PicoDet:
  backbone: LCNet
  neck: CSPPAN
  head: PicoHead
LCNet:
  scale: 2.5
  feature_maps: [3, 4, 5]

View File

@@ -0,0 +1,149 @@
use_gpu: true
log_iter: 20
save_dir: output
snapshot_epoch: 1
print_flops: false
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x0_75_pretrained.pdparams
weights: output/picodet_s_192_pedestrian/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10
epoch: 300
metric: COCO
num_classes: 1
# Exporting the model
export:
  post_process: False  # Whether post-processing is included in the network when exporting the model.
  nms: False  # Whether NMS is included in the network when exporting the model.
  benchmark: False  # Used for testing model performance; if set to `True`, post-processing and NMS will not be exported.
architecture: PicoDet
PicoDet:
  backbone: ESNet
  neck: CSPPAN
  head: PicoHead
ESNet:
  scale: 0.75
  feature_maps: [4, 11, 14]
  act: hard_swish
  channel_ratio: [0.875, 0.5, 0.5, 0.5, 0.625, 0.5, 0.625, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
CSPPAN:
  out_channels: 96
  use_depthwise: True
  num_csp_blocks: 1
  num_features: 4
PicoHead:
  conv_feat:
    name: PicoFeat
    feat_in: 96
    feat_out: 96
    num_convs: 2
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
  fpn_stride: [8, 16, 32, 64]
  feat_in_chan: 96
  prior_prob: 0.01
  reg_max: 7
  cell_offset: 0.5
  loss_class:
    name: VarifocalLoss
    use_sigmoid: True
    iou_weighted: True
    loss_weight: 1.0
  loss_dfl:
    name: DistributionFocalLoss
    loss_weight: 0.25
  loss_bbox:
    name: GIoULoss
    loss_weight: 2.0
  assigner:
    name: SimOTAAssigner
    candidate_topk: 10
    iou_weight: 6
  nms:
    name: MultiClassNMS
    nms_top_k: 1000
    keep_top_k: 100
    score_threshold: 0.025
    nms_threshold: 0.6
LearningRate:
  base_lr: 0.4
  schedulers:
  - !CosineDecay
    max_epochs: 300
  - !LinearWarmup
    start_factor: 0.1
    steps: 300
OptimizerBuilder:
  optimizer:
    momentum: 0.9
    type: Momentum
  regularizer:
    factor: 0.00004
    type: L2
TrainDataset:
  !COCODataSet
    image_dir: ""
    anno_path: aic_coco_train_cocoformat.json
    dataset_dir: dataset
    data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
EvalDataset:
  !COCODataSet
    image_dir: val2017
    anno_path: annotations/instances_val2017.json
    dataset_dir: dataset/coco
TestDataset:
  !ImageFolder
    anno_path: annotations/instances_val2017.json
worker_num: 8
TrainReader:
  sample_transforms:
  - Decode: {}
  - RandomCrop: {}
  - RandomFlip: {prob: 0.5}
  - RandomDistort: {}
  batch_transforms:
  - BatchRandomResize: {target_size: [128, 160, 192, 224, 256], random_size: True, random_interp: True, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_size: 128
  shuffle: true
  drop_last: true
  collate_batch: false
EvalReader:
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: [192, 192], keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 8
  shuffle: false
TestReader:
  inputs_def:
    image_shape: [1, 3, 192, 192]
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: [192, 192], keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 1
  shuffle: false
  fuse_normalize: true

View File

@@ -0,0 +1,148 @@
use_gpu: true
log_iter: 20
save_dir: output
snapshot_epoch: 1
print_flops: false
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x0_75_pretrained.pdparams
weights: output/picodet_s_320_pedestrian/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10
epoch: 300
metric: COCO
num_classes: 1
# Exporting the model
export:
  post_process: False  # Whether post-processing is included in the network when exporting the model.
  nms: False  # Whether NMS is included in the network when exporting the model.
  benchmark: False  # Used for testing model performance; if set to `True`, post-processing and NMS will not be exported.
architecture: PicoDet
PicoDet:
  backbone: ESNet
  neck: CSPPAN
  head: PicoHead
ESNet:
  scale: 0.75
  feature_maps: [4, 11, 14]
  act: hard_swish
  channel_ratio: [0.875, 0.5, 0.5, 0.5, 0.625, 0.5, 0.625, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
CSPPAN:
  out_channels: 96
  use_depthwise: True
  num_csp_blocks: 1
  num_features: 4
PicoHead:
  conv_feat:
    name: PicoFeat
    feat_in: 96
    feat_out: 96
    num_convs: 2
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
  fpn_stride: [8, 16, 32, 64]
  feat_in_chan: 96
  prior_prob: 0.01
  reg_max: 7
  cell_offset: 0.5
  loss_class:
    name: VarifocalLoss
    use_sigmoid: True
    iou_weighted: True
    loss_weight: 1.0
  loss_dfl:
    name: DistributionFocalLoss
    loss_weight: 0.25
  loss_bbox:
    name: GIoULoss
    loss_weight: 2.0
  assigner:
    name: SimOTAAssigner
    candidate_topk: 10
    iou_weight: 6
  nms:
    name: MultiClassNMS
    nms_top_k: 1000
    keep_top_k: 100
    score_threshold: 0.025
    nms_threshold: 0.6
LearningRate:
  base_lr: 0.4
  schedulers:
  - !CosineDecay
    max_epochs: 300
  - !LinearWarmup
    start_factor: 0.1
    steps: 300
OptimizerBuilder:
  optimizer:
    momentum: 0.9
    type: Momentum
  regularizer:
    factor: 0.00004
    type: L2
TrainDataset:
  !COCODataSet
    image_dir: ""
    anno_path: aic_coco_train_cocoformat.json
    dataset_dir: dataset
    data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
EvalDataset:
  !COCODataSet
    image_dir: val2017
    anno_path: annotations/instances_val2017.json
    dataset_dir: dataset/coco
TestDataset:
  !ImageFolder
    anno_path: annotations/instances_val2017.json
worker_num: 8
TrainReader:
  sample_transforms:
  - Decode: {}
  - RandomCrop: {}
  - RandomFlip: {prob: 0.5}
  - RandomDistort: {}
  batch_transforms:
  - BatchRandomResize: {target_size: [256, 288, 320, 352, 384], random_size: True, random_interp: True, keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_size: 128
  shuffle: true
  drop_last: true
  collate_batch: false
EvalReader:
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: [320, 320], keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 8
  shuffle: false
TestReader:
  inputs_def:
    image_shape: [1, 3, 320, 320]
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: [320, 320], keep_ratio: False}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 1
  shuffle: false

View File

@@ -0,0 +1,26 @@
_BASE_: [
'../../../datasets/coco_detection.yml',
'../../../runtime.yml',
'../_base_/picodet_esnet.yml',
'../_base_/optimizer_300e.yml',
'../_base_/picodet_416_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/LCNet_x1_0_pretrained.pdparams
weights: output/picodet_lcnet_1_5x_416_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10
PicoDet:
  backbone: LCNet
  neck: CSPPAN
  head: PicoHead
LCNet:
  scale: 1.0
  feature_maps: [3, 4, 5]
TrainReader:
  batch_size: 90

View File

@@ -0,0 +1,23 @@
_BASE_: [
'../../../datasets/coco_detection.yml',
'../../../runtime.yml',
'../_base_/picodet_esnet.yml',
'../_base_/optimizer_300e.yml',
'../_base_/picodet_416_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/LCNet_x1_5_pretrained.pdparams
weights: output/picodet_lcnet_1_5x_416_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10
PicoDet:
  backbone: LCNet
  neck: CSPPAN
  head: PicoHead
LCNet:
  scale: 1.5
  feature_maps: [3, 4, 5]

View File

@@ -0,0 +1,49 @@
_BASE_: [
'../../../datasets/coco_detection.yml',
'../../../runtime.yml',
'../_base_/picodet_esnet.yml',
'../_base_/optimizer_300e.yml',
'../_base_/picodet_640_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/LCNet_x1_5_pretrained.pdparams
weights: output/picodet_lcnet_1_5x_640_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10
PicoDet:
  backbone: LCNet
  neck: CSPPAN
  head: PicoHead
LCNet:
  scale: 1.5
  feature_maps: [3, 4, 5]
CSPPAN:
  out_channels: 160
PicoHead:
  conv_feat:
    name: PicoFeat
    feat_in: 160
    feat_out: 160
    num_convs: 4
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
  feat_in_chan: 160
TrainReader:
  batch_size: 24
LearningRate:
  base_lr: 0.2
  schedulers:
  - !CosineDecay
    max_epochs: 300
  - !LinearWarmup
    start_factor: 0.1
    steps: 300

View File

@@ -0,0 +1,26 @@
_BASE_: [
'../../../datasets/coco_detection.yml',
'../../../runtime.yml',
'../_base_/picodet_esnet.yml',
'../_base_/optimizer_300e.yml',
'../_base_/picodet_416_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/LCNet_x2_5_ssld_pretrained.pdparams
weights: output/picodet_lcnet_1_5x_416_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10
PicoDet:
  backbone: LCNet
  neck: CSPPAN
  head: PicoHead
LCNet:
  scale: 2.5
  feature_maps: [3, 4, 5]
TrainReader:
  batch_size: 48

View File

@@ -0,0 +1,39 @@
_BASE_: [
'../../../datasets/coco_detection.yml',
'../../../runtime.yml',
'../_base_/picodet_esnet.yml',
'../_base_/optimizer_300e.yml',
'../_base_/picodet_416_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/MobileNetV3_large_x1_0_ssld_pretrained.pdparams
weights: output/picodet_mobilenetv3_large_1x_416_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10
epoch: 180
PicoDet:
  backbone: MobileNetV3
  neck: CSPPAN
  head: PicoHead
MobileNetV3:
  model_name: large
  scale: 1.0
  with_extra_blocks: false
  extra_block_filters: []
  feature_maps: [7, 13, 16]
TrainReader:
  batch_size: 56
LearningRate:
  base_lr: 0.3
  schedulers:
  - !CosineDecay
    max_epochs: 300
  - !LinearWarmup
    start_factor: 0.1
    steps: 300

View File

@@ -0,0 +1,39 @@
_BASE_: [
'../../../datasets/coco_detection.yml',
'../../../runtime.yml',
'../_base_/picodet_esnet.yml',
'../_base_/optimizer_300e.yml',
'../_base_/picodet_640_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet18_vd_pretrained.pdparams
weights: output/picodet_r18_640_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10
PicoDet:
  backbone: ResNet
  neck: CSPPAN
  head: PicoHead
ResNet:
  depth: 18
  variant: d
  return_idx: [1, 2, 3]
  freeze_at: -1
  freeze_norm: false
  norm_decay: 0.
TrainReader:
  batch_size: 56
LearningRate:
  base_lr: 0.3
  schedulers:
  - !CosineDecay
    max_epochs: 300
  - !LinearWarmup
    start_factor: 0.1
    steps: 300

View File

@@ -0,0 +1,38 @@
_BASE_: [
'../../../datasets/coco_detection.yml',
'../../../runtime.yml',
'../_base_/picodet_esnet.yml',
'../_base_/optimizer_300e.yml',
'../_base_/picodet_416_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ShuffleNetV2_x1_0_pretrained.pdparams
weights: output/picodet_shufflenetv2_1x_416_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10
PicoDet:
  backbone: ShuffleNetV2
  neck: CSPPAN
  head: PicoHead
ShuffleNetV2:
  scale: 1.0
  feature_maps: [5, 13, 17]
  act: leaky_relu
CSPPAN:
  out_channels: 96
PicoHead:
  conv_feat:
    name: PicoFeat
    feat_in: 96
    feat_out: 96
    num_convs: 2
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
  feat_in_chan: 96

View File

@@ -0,0 +1,47 @@
_BASE_: [
'../../datasets/coco_detection.yml',
'../../runtime.yml',
'_base_/picodet_esnet.yml',
'_base_/optimizer_300e.yml',
'_base_/picodet_320_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x1_25_pretrained.pdparams
weights: output/picodet_l_320_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10
epoch: 250
ESNet:
  scale: 1.25
  feature_maps: [4, 11, 14]
  act: hard_swish
  channel_ratio: [0.875, 0.5, 1.0, 0.625, 0.5, 0.75, 0.625, 0.625, 0.5, 0.625, 1.0, 0.625, 0.75]
CSPPAN:
  out_channels: 160
PicoHead:
  conv_feat:
    name: PicoFeat
    feat_in: 160
    feat_out: 160
    num_convs: 4
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
  feat_in_chan: 160
TrainReader:
  batch_size: 56
LearningRate:
  base_lr: 0.3
  schedulers:
  - !CosineDecay
    max_epochs: 300
  - !LinearWarmup
    start_factor: 0.1
    steps: 300

View File

@@ -0,0 +1,47 @@
_BASE_: [
'../../datasets/coco_detection.yml',
'../../runtime.yml',
'_base_/picodet_esnet.yml',
'_base_/optimizer_300e.yml',
'_base_/picodet_416_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x1_25_pretrained.pdparams
weights: output/picodet_l_416_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10
epoch: 250
ESNet:
  scale: 1.25
  feature_maps: [4, 11, 14]
  act: hard_swish
  channel_ratio: [0.875, 0.5, 1.0, 0.625, 0.5, 0.75, 0.625, 0.625, 0.5, 0.625, 1.0, 0.625, 0.75]
CSPPAN:
  out_channels: 160
PicoHead:
  conv_feat:
    name: PicoFeat
    feat_in: 160
    feat_out: 160
    num_convs: 4
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
  feat_in_chan: 160
TrainReader:
  batch_size: 48
LearningRate:
  base_lr: 0.3
  schedulers:
  - !CosineDecay
    max_epochs: 300
  - !LinearWarmup
    start_factor: 0.1
    steps: 300

View File

@@ -0,0 +1,47 @@
_BASE_: [
'../../datasets/coco_detection.yml',
'../../runtime.yml',
'_base_/picodet_esnet.yml',
'_base_/optimizer_300e.yml',
'_base_/picodet_640_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x1_25_pretrained.pdparams
weights: output/picodet_l_640_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10
epoch: 250
ESNet:
  scale: 1.25
  feature_maps: [4, 11, 14]
  act: hard_swish
  channel_ratio: [0.875, 0.5, 1.0, 0.625, 0.5, 0.75, 0.625, 0.625, 0.5, 0.625, 1.0, 0.625, 0.75]
CSPPAN:
  out_channels: 160
PicoHead:
  conv_feat:
    name: PicoFeat
    feat_in: 160
    feat_out: 160
    num_convs: 4
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
  feat_in_chan: 160
TrainReader:
  batch_size: 32
LearningRate:
  base_lr: 0.3
  schedulers:
  - !CosineDecay
    max_epochs: 300
  - !LinearWarmup
    start_factor: 0.1
    steps: 300

View File

@@ -0,0 +1,13 @@
_BASE_: [
'../../datasets/coco_detection.yml',
'../../runtime.yml',
'_base_/picodet_esnet.yml',
'_base_/optimizer_300e.yml',
'_base_/picodet_320_reader.yml',
]
weights: output/picodet_m_320_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10

View File

@@ -0,0 +1,13 @@
_BASE_: [
'../../datasets/coco_detection.yml',
'../../runtime.yml',
'_base_/picodet_esnet.yml',
'_base_/optimizer_300e.yml',
'_base_/picodet_416_reader.yml',
]
weights: output/picodet_m_416_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10

View File

@@ -0,0 +1,34 @@
_BASE_: [
'../../datasets/coco_detection.yml',
'../../runtime.yml',
'_base_/picodet_esnet.yml',
'_base_/optimizer_300e.yml',
'_base_/picodet_320_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x0_75_pretrained.pdparams
weights: output/picodet_s_320_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10
ESNet:
  scale: 0.75
  feature_maps: [4, 11, 14]
  act: hard_swish
  channel_ratio: [0.875, 0.5, 0.5, 0.5, 0.625, 0.5, 0.625, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
CSPPAN:
  out_channels: 96
PicoHead:
  conv_feat:
    name: PicoFeat
    feat_in: 96
    feat_out: 96
    num_convs: 2
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
  feat_in_chan: 96

View File

@@ -0,0 +1,37 @@
_BASE_: [
'../../datasets/voc.yml',
'../../runtime.yml',
'_base_/picodet_esnet.yml',
'_base_/optimizer_300e.yml',
'_base_/picodet_320_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x0_75_pretrained.pdparams
weights: output/picodet_s_320_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10
ESNet:
  scale: 0.75
  feature_maps: [4, 11, 14]
  act: hard_swish
  channel_ratio: [0.875, 0.5, 0.5, 0.5, 0.625, 0.5, 0.625, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
CSPPAN:
  out_channels: 96
PicoHead:
  conv_feat:
    name: PicoFeat
    feat_in: 96
    feat_out: 96
    num_convs: 2
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
  feat_in_chan: 96
EvalReader:
  collate_batch: false

View File

@@ -0,0 +1,34 @@
_BASE_: [
'../../datasets/coco_detection.yml',
'../../runtime.yml',
'_base_/picodet_esnet.yml',
'_base_/optimizer_300e.yml',
'_base_/picodet_416_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x0_75_pretrained.pdparams
weights: output/picodet_s_416_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10
ESNet:
  scale: 0.75
  feature_maps: [4, 11, 14]
  act: hard_swish
  channel_ratio: [0.875, 0.5, 0.5, 0.5, 0.625, 0.5, 0.625, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
CSPPAN:
  out_channels: 96
PicoHead:
  conv_feat:
    name: PicoFeat
    feat_in: 96
    feat_out: 96
    num_convs: 2
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
  feat_in_chan: 96

View File

@@ -0,0 +1,135 @@
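# Tutorial: Applying Unstructured Sparsity to PicoDet
## 1. Introduction
In model compression, the two common forms of sparsity are structured and unstructured. Structured sparsity prunes convolutions and matrix multiplications along a specific dimension (feature channels, convolution kernels, and so on) and produces a smaller model structure, so existing convolution and matrix-multiplication kernels can be reused and no special inference operators are needed. Unstructured sparsity prunes at the granularity of individual parameters and does not change the shape of the parameter matrices, so it depends more heavily on how well the inference library and the hardware accelerate sparse matrix operations. We applied unstructured sparsity to PP-PicoDet (PicoDet below) and obtained a significant inference speed-up on ARM CPUs with only a small accuracy loss. This document describes how to train PicoDet with unstructured sparsity. For more background on unstructured sparsity, see [here](https://github.com/PaddlePaddle/PaddleSlim/tree/develop/demo/dygraph/unstructured_pruning).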
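To make the difference between the two kinds of sparsity concrete, here is a minimal pure-Python sketch on a toy 4x4 weight matrix. It is an illustration only and not PaddleSlim code: the helper names `structured_prune` and `unstructured_prune`, the L1-norm row ranking, and the magnitude threshold are all assumptions chosen for the example.

```python
# Toy comparison of structured vs. unstructured pruning.
# Illustration only; the helpers below are hypothetical, not PaddleSlim APIs.

def l1_norm(row):
    return sum(abs(w) for w in row)

def structured_prune(weights, keep_rows):
    """Drop whole rows (think: output channels) with the smallest L1 norm."""
    ranked = sorted(range(len(weights)),
                    key=lambda i: l1_norm(weights[i]), reverse=True)
    kept = sorted(ranked[:keep_rows])  # preserve the original row order
    return [list(weights[i]) for i in kept]

def unstructured_prune(weights, ratio):
    """Zero the `ratio` fraction of entries with the smallest magnitude."""
    flat = sorted(abs(w) for row in weights for w in row)
    num_pruned = int(len(flat) * ratio)
    if num_pruned == 0:
        return [list(row) for row in weights]
    threshold = flat[num_pruned - 1]
    return [[0.0 if abs(w) <= threshold else w for w in row]
            for row in weights]

weights = [
    [0.9, -0.1, 0.05, 0.7],
    [0.02, 0.03, -0.01, 0.04],
    [-0.8, 0.6, 0.2, -0.3],
    [0.15, -0.5, 0.4, 0.01],
]

smaller = structured_prune(weights, keep_rows=2)   # shape shrinks to 2x4
sparse = unstructured_prune(weights, ratio=0.5)    # shape stays 4x4
num_zeros = sum(w == 0.0 for row in sparse for w in row)
print(len(smaller), len(smaller[0]))
print(len(sparse), len(sparse[0]), num_zeros)
```

The structured result is a smaller dense matrix that any ordinary kernel can consume, while the unstructured result keeps its shape and only pays off when the runtime has sparse kernels, which is why the deployment step later in this tutorial needs a sparse-aware conversion.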
## 2. Version requirements
```bash
# PaddlePaddle >= 2.1.2
# PaddleSlim (develop branch)
pip install paddleslim -i https://pypi.tuna.tsinghua.edu.cn/simple
```
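## 3. Data preparation
Same as for PicoDet.
## 4. Pretrained model
Unstructured sparse training assumes that the pretrained model is a set of parameters that has already converged, so it must be declared explicitly in the relevant config file.

Config file that declares the pretrained model path: ./configs/picodet/pruner/picodet_m_320_coco_pruner.yml

For the pretrained model URLs, see the PicoDet documentation: ./configs/picodet/README.md
## 5. Customizing the scope of sparsification
For the best inference speed-up, we recommend sparsifying only the 1x1 convolution layers and keeping the parameters of all other layers dense. Some layers also have a large effect on accuracy, for example the last few layers of the head and several layers in the SE blocks, and we recommend not sparsifying them either. Developers can pass in a custom function to specify which layers are excluded from sparsification. For example, for the picodet_m_320 model we skip the last 4 convolution layers and the convolutions in 6 SE-block layers. The custom function is as follows: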
```python
# Names of normalization layers whose parameters are never pruned.
NORMS_ALL = ['BatchNorm', 'GroupNorm', 'LayerNorm', 'SpectralNorm', 'BatchNorm1D',
             'BatchNorm2D', 'BatchNorm3D', 'InstanceNorm1D', 'InstanceNorm2D',
             'InstanceNorm3D', 'SyncBatchNorm', 'LocalResponseNorm']


def skip_params_self(model):
    """Return the set of parameter names that must stay dense."""
    skip_params = set()
    for _, sub_layer in model.named_sublayers():
        # Skip every normalization layer.
        if type(sub_layer).__name__.split('.')[-1] in NORMS_ALL:
            skip_params.add(sub_layer.full_name())
        for param in sub_layer.parameters(include_sublayers=False):
            # Only 1x1 convolutions (4-D weights with a 1x1 kernel) are pruned.
            cond_is_conv1x1 = len(param.shape) == 4 and param.shape[2] == 1 and param.shape[3] == 1
            # Head convolutions of the m model, identified by their weight shape.
            cond_is_head_m = cond_is_conv1x1 and param.shape[0] == 112 and param.shape[1] == 128
            # SE-block convolutions of the m model, identified by name.
            cond_is_se_block_m = param.name.split('.')[0] in ['conv2d_17', 'conv2d_18', 'conv2d_56', 'conv2d_57', 'conv2d_75', 'conv2d_76']
            if not cond_is_conv1x1 or cond_is_head_m or cond_is_se_block_m:
                skip_params.add(param.name)
    return skip_params
```
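The per-parameter rule in `skip_params_self` can be tried out without Paddle. The sketch below restates the same three conditions as a stand-alone function, assuming weights use the `[out_channels, in_channels, kernel_h, kernel_w]` layout. The function name `should_skip` and the example parameter names and shapes are made up for illustration and are not taken from the real picodet_m_320 model.

```python
# Stand-alone restatement of the per-parameter skip rule.
# `should_skip` and the example names/shapes are hypothetical.

SE_BLOCK_PREFIXES = {'conv2d_17', 'conv2d_18', 'conv2d_56',
                     'conv2d_57', 'conv2d_75', 'conv2d_76'}

def should_skip(name, shape):
    """Return True if the parameter should stay dense."""
    is_conv1x1 = len(shape) == 4 and shape[2] == 1 and shape[3] == 1
    is_head_m = is_conv1x1 and shape[0] == 112 and shape[1] == 128
    is_se_block_m = name.split('.')[0] in SE_BLOCK_PREFIXES
    return (not is_conv1x1) or is_head_m or is_se_block_m

examples = [
    ('conv2d_3.w_0', [96, 48, 1, 1]),     # ordinary 1x1 conv: pruned
    ('conv2d_4.w_0', [96, 1, 3, 3]),      # 3x3 conv: stays dense
    ('conv2d_17.w_0', [24, 96, 1, 1]),    # listed SE-block conv: stays dense
    ('conv2d_90.w_0', [112, 128, 1, 1]),  # head-shaped 1x1 conv: stays dense
    ('conv2d_3.b_0', [96]),               # bias: stays dense
]
decisions = {name: should_skip(name, shape) for name, shape in examples}
for name, skip in decisions.items():
    print(name, 'skip' if skip else 'prune')
```

Note that the head and SE-block conditions are tied to the layer names and weight shapes of one specific model, so a different architecture would need its own conditions.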
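## 6. Training
The core unstructured-sparsity functionality is already wired into training through API calls, so if you have no further requirements you can start training directly with the command in 6.1. To help you modify and adapt the code to your own needs, 6.2 gives a more detailed walkthrough.
### 6.1 Direct use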
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3.7 -m paddle.distributed.launch --log_dir=log_test --gpus 0,1,2,3 tools/train.py -c configs/picodet/pruner/picodet_m_320_coco_pruner.yml --slim_config configs/slim/prune/picodet_m_unstructured_prune_75.yml --eval
```
### 6.2 Details
- Customizing the scope of sparsification: see section 5 of this tutorial.
- How to add the 4 pieces of code that sparse training requires:
```python
# GMPUnstructuredPruner comes from PaddleSlim:
# from paddleslim import GMPUnstructuredPruner

# after constructing model and before training

# Pruner Step1: configs
configs = {
    'pruning_strategy': 'gmp',
    'stable_iterations': self.stable_epochs * steps_per_epoch,
    'pruning_iterations': self.pruning_epochs * steps_per_epoch,
    'tunning_iterations': self.tunning_epochs * steps_per_epoch,
    'resume_iteration': 0,
    'pruning_steps': self.pruning_steps,
    'initial_ratio': self.initial_ratio,
}

# Pruner Step2: construct a pruner object
self.pruner = GMPUnstructuredPruner(
    model,
    ratio=self.cfg.ratio,
    skip_params_func=skip_params_self,  # Only pass in this value when you design your own skip_params function. And the following argument (skip_params_type) will be ignored.
    skip_params_type=self.cfg.skip_params_type,
    local_sparsity=True,
    configs=configs)

# training
for epoch_id in range(self.start_epoch, self.cfg.epoch):
    model.train()
    for step_id, data in enumerate(self.loader):
        # model forward
        outputs = model(data)
        loss = outputs['loss']
        # model backward
        loss.backward()
        self.optimizer.step()

        # Pruner Step3: step during training
        self.pruner.step()

    # Pruner Step4: save the sparse model
    self.pruner.update_params()
    # model-saving API
```
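The `configs` dictionary above describes a gradual magnitude pruning (GMP) schedule: a dense "stable" phase, a pruning phase in which the sparsity ratio rises in `pruning_steps` increments, and a final tuning phase at the target ratio. The sketch below shows one common way such a schedule can be computed, using a cubic ramp. The function `gmp_target_ratio` is hypothetical, and the cubic shape is an assumption based on the usual GMP formulation; PaddleSlim's actual schedule may differ in its details.

```python
# Sketch of a gradual magnitude pruning (GMP) sparsity schedule.
# `gmp_target_ratio` is a hypothetical helper; the cubic ramp is an assumption.

def gmp_target_ratio(iteration, stable_iterations, pruning_iterations,
                     pruning_steps, initial_ratio, final_ratio):
    """Return the target sparsity ratio at a given training iteration."""
    if iteration < stable_iterations:
        return 0.0  # stable phase: the model stays dense
    # Number of iterations between two consecutive ratio increases.
    period = max(pruning_iterations // pruning_steps, 1)
    step = min((iteration - stable_iterations) // period, pruning_steps)
    progress = step / pruning_steps               # 0.0 .. 1.0
    ramp = (progress - 1.0) ** 3 + 1.0            # fast at first, then flat
    return initial_ratio + ramp * (final_ratio - initial_ratio)

schedule = [
    gmp_target_ratio(it, stable_iterations=100, pruning_iterations=1000,
                     pruning_steps=10, initial_ratio=0.15, final_ratio=0.75)
    for it in (50, 100, 600, 1100, 5000)
]
print(schedule)
```

With this shape most of the sparsity is introduced early, while the network still has many iterations left to recover, and the ratio then stays at its final value throughout the tuning phase.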
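## 7. Model evaluation and inference deployment
This part is essentially the same as in the PicoDet documentation. The only difference is that one extra argument, sparse_model, must be added when converting to a PaddleLite model: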
```bash
paddle_lite_opt --model_dir=inference_model/picodet_m_320_coco --valid_targets=arm --optimize_out=picodet_m_320_coco_fp32_sparse --sparse_model=True
```
**Note:** Sparse inference currently supports PaddleLite FP32 and INT8 models, so do not enable the FP16 switch when running the command above.
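## 8. Sparsification results
We trained FP32 PicoDet-m models at 75% and 85% sparsity and measured their inference speed on a SnapDragon-835 device. The results are shown in the table below. In summary:
- For the m model at 75% sparsity, mAP drops by 1.5 in exchange for a 34\%-58\% speed-up.
- For the m model at 85% sparsity, 4-thread inference speed is roughly on par with the s model, while single-thread inference speed, mAP, and model size are all better than the s model.
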
| Model | Input size | Sparsity | mAP<sup>val</sup><br>0.5:0.95 | Size<br>(MB) | Latency single-thread<sup><small>[Lite](#latency)</small></sup><br>(ms) | Speed-up single-thread | Latency 4-thread<sup><small>[Lite](#latency)</small></sup><br>(ms) | Speed-up 4-thread | Download | SlimConfig |
| :-------- | :--------: |:--------: | :---------------------: | :----------------: | :----------------: |:----------------: | :---------------: | :-----------------------------: | :-----------------------------: | :----------------------------------------: |
| PicoDet-m-1.0 | 320*320 | 0 | 30.9 | 8.9 | 127 | 0 | 43 | 0 | [model](https://paddledet.bj.bcebos.com/models/picodet_m_320_coco.pdparams)&#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_m_320_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.3/configs/picodet/picodet_m_320_coco.yml)|
| PicoDet-m-1.0 | 320*320 | 75% | 29.4 | 5.6 | **80** | 58% | **32** | 34% | [model](https://paddledet.bj.bcebos.com/models/slim/picodet_m_320__coco_sparse_75.pdparams)&#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_m_320__coco_sparse_75.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/configs/slim/prune/picodet_m_unstructured_prune_75.yml)|
| PicoDet-s-1.0 | 320*320 | 0 | 27.1 | 4.6 | 68 | 0 | 26 | 0 | [model](https://paddledet.bj.bcebos.com/models/picodet_s_320_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_s_320_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.3/configs/picodet/picodet_s_320_coco.yml)|
| PicoDet-m-1.0 | 320*320 | 85% | 27.6 | 4.1 | **65** | 96% | **27** | 59% | [model](https://paddledet.bj.bcebos.com/models/slim/picodet_m_320__coco_sparse_85.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_m_320__coco_sparse_85.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/configs/slim/prune/picodet_m_unstructured_prune_85.yml)|
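**Notes:**
- The model sizes above are **deployment model sizes**, that is, the size of the *.nb file produced by the PaddleLite conversion.
- The speed-up columns are computed as the percentage increase in FPS: $(dense\_latency - sparse\_latency) / sparse\_latency$
- For the sparse training above, we added one extra data augmentation to _base_/picodet_320_reader.yml, shown below. Without it, mAP is not expected to drop noticeably (<0.1), and speed and model size are unaffected.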
```yaml
worker_num: 6
TrainReader:
  sample_transforms:
  - Decode: {}
  - RandomCrop: {}
  - RandomFlip: {prob: 0.5}
  - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
  - RandomDistort: {}
  batch_transforms:
# etc.
```
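The speed-up formula in the notes can be checked against the table. The short sketch below applies it to the dense and 75%-sparse PicoDet-m latencies listed there (127 ms to 80 ms single-thread, 43 ms to 32 ms with 4 threads). The function name `speed_up` is only illustrative.

```python
# Worked check of the speed-up formula using latencies from the table.
# `speed_up` is an illustrative helper, not part of any library.

def speed_up(dense_latency_ms, sparse_latency_ms):
    """Relative FPS increase: (dense - sparse) / sparse."""
    return (dense_latency_ms - sparse_latency_ms) / sparse_latency_ms

single_thread = speed_up(127, 80)  # 47 / 80
four_thread = speed_up(43, 32)     # 11 / 32
print(single_thread, four_thread)
```

This gives roughly 0.59 and 0.34, in line with the 58% and 34% reported for the 75% sparsity row. Because latency is the inverse of FPS, dividing by the sparse latency rather than the dense latency is what makes this an FPS increase rather than a latency reduction.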

View File

@@ -0,0 +1,18 @@
epoch: 300
LearningRate:
  base_lr: 0.15
  schedulers:
  - !CosineDecay
    max_epochs: 300
  - !LinearWarmup
    start_factor: 1.0
    steps: 34350
OptimizerBuilder:
  optimizer:
    momentum: 0.9
    type: Momentum
  regularizer:
    factor: 0.00004
    type: L2

View File

@@ -0,0 +1,13 @@
_BASE_: [
'../../../datasets/coco_detection.yml',
'../../../runtime.yml',
'../_base_/picodet_esnet.yml',
'./optimizer_300e_pruner.yml',
'../_base_/picodet_320_reader.yml',
]
weights: output/picodet_m_320_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10

View File

@@ -0,0 +1,45 @@
_BASE_: [
'../datasets/coco_detection.yml',
'../runtime.yml',
'_base_/picodet_v2.yml',
'_base_/optimizer_300e.yml',
'_base_/picodet_320_reader.yml',
]
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x2_0_pretrained.pdparams
weights: output/picodet_l_320_coco/best_model
find_unused_parameters: True
use_ema: true
epoch: 250
snapshot_epoch: 10
LCNet:
  scale: 2.0
  feature_maps: [3, 4, 5]
LCPAN:
  out_channels: 160
PicoHeadV2:
  conv_feat:
    name: PicoFeat
    feat_in: 160
    feat_out: 160
    num_convs: 4
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
    use_se: True
  feat_in_chan: 160
LearningRate:
  base_lr: 0.12
  schedulers:
  - !CosineDecay
    max_epochs: 300
  - !LinearWarmup
    start_factor: 0.1
    steps: 300
TrainReader:
  batch_size: 24

View File

@@ -0,0 +1,45 @@
_BASE_: [
'../datasets/coco_detection.yml',
'../runtime.yml',
'_base_/picodet_v2.yml',
'_base_/optimizer_300e.yml',
'_base_/picodet_416_reader.yml',
]
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x2_0_pretrained.pdparams
weights: output/picodet_l_320_coco/best_model
find_unused_parameters: True
use_ema: true
epoch: 250
snapshot_epoch: 10
LCNet:
  scale: 2.0
  feature_maps: [3, 4, 5]
LCPAN:
  out_channels: 160
PicoHeadV2:
  conv_feat:
    name: PicoFeat
    feat_in: 160
    feat_out: 160
    num_convs: 4
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
    use_se: True
  feat_in_chan: 160
LearningRate:
  base_lr: 0.12
  schedulers:
  - name: CosineDecay
    max_epochs: 300
  - name: LinearWarmup
    start_factor: 0.1
    steps: 300
TrainReader:
  batch_size: 24

View File

@@ -0,0 +1,45 @@
_BASE_: [
'../datasets/coco_detection.yml',
'../runtime.yml',
'_base_/picodet_v2.yml',
'_base_/optimizer_300e.yml',
'_base_/picodet_640_reader.yml',
]
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x2_0_pretrained.pdparams
weights: output/picodet_l_320_coco/best_model
find_unused_parameters: True
use_ema: true
epoch: 200
snapshot_epoch: 10
LCNet:
  scale: 2.0
  feature_maps: [3, 4, 5]
LCPAN:
  out_channels: 160
PicoHeadV2:
  conv_feat:
    name: PicoFeat
    feat_in: 160
    feat_out: 160
    num_convs: 4
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
    use_se: True
  feat_in_chan: 160
LearningRate:
  base_lr: 0.06
  schedulers:
  - !CosineDecay
    max_epochs: 300
  - !LinearWarmup
    start_factor: 0.1
    steps: 300
TrainReader:
  batch_size: 12

View File

@@ -0,0 +1,25 @@
_BASE_: [
'../datasets/coco_detection.yml',
'../runtime.yml',
'_base_/picodet_v2.yml',
'_base_/optimizer_300e.yml',
'_base_/picodet_320_reader.yml',
]
weights: output/picodet_m_320_coco/best_model
find_unused_parameters: True
use_ema: true
epoch: 300
snapshot_epoch: 10
TrainReader:
  batch_size: 48
LearningRate:
  base_lr: 0.24
  schedulers:
  - name: CosineDecay
    max_epochs: 300
  - name: LinearWarmup
    start_factor: 0.1
    steps: 300

View File

@@ -0,0 +1,25 @@
_BASE_: [
'../datasets/coco_detection.yml',
'../runtime.yml',
'_base_/picodet_v2.yml',
'_base_/optimizer_300e.yml',
'_base_/picodet_416_reader.yml',
]
weights: output/picodet_m_416_coco/best_model
find_unused_parameters: True
use_ema: true
epoch: 250
snapshot_epoch: 10
TrainReader:
  batch_size: 48
LearningRate:
  base_lr: 0.24
  schedulers:
  - name: CosineDecay
    max_epochs: 300
  - name: LinearWarmup
    start_factor: 0.1
    steps: 300

View File

@@ -0,0 +1,45 @@
_BASE_: [
'../datasets/coco_detection.yml',
'../runtime.yml',
'_base_/picodet_v2.yml',
'_base_/optimizer_300e.yml',
'_base_/picodet_320_reader.yml',
]
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_75_pretrained.pdparams
weights: output/picodet_s_320_coco/best_model
find_unused_parameters: True
use_ema: true
epoch: 300
snapshot_epoch: 10
LCNet:
  scale: 0.75
  feature_maps: [3, 4, 5]
LCPAN:
  out_channels: 96
PicoHeadV2:
  conv_feat:
    name: PicoFeat
    feat_in: 96
    feat_out: 96
    num_convs: 2
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
    use_se: True
  feat_in_chan: 96
TrainReader:
  batch_size: 64
LearningRate:
  base_lr: 0.32
  schedulers:
  - !CosineDecay
    max_epochs: 300
  - !LinearWarmup
    start_factor: 0.1
    steps: 300

View File

@@ -0,0 +1,45 @@
_BASE_: [
'../datasets/coco_detection.yml',
'../runtime.yml',
'_base_/picodet_v2.yml',
'_base_/optimizer_300e.yml',
'_base_/picodet_416_reader.yml',
]
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_75_pretrained.pdparams
weights: output/picodet_s_416_coco/best_model
find_unused_parameters: True
use_ema: true
epoch: 300
snapshot_epoch: 10
LCNet:
  scale: 0.75
  feature_maps: [3, 4, 5]
LCPAN:
  out_channels: 96
PicoHeadV2:
  conv_feat:
    name: PicoFeat
    feat_in: 96
    feat_out: 96
    num_convs: 2
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
    use_se: True
  feat_in_chan: 96
TrainReader:
  batch_size: 48
LearningRate:
  base_lr: 0.24
  schedulers:
  - !CosineDecay
    max_epochs: 300
  - !LinearWarmup
    start_factor: 0.1
    steps: 300

View File

@@ -0,0 +1,106 @@
_BASE_: [
'../datasets/coco_detection.yml',
'../runtime.yml',
'_base_/picodet_v2.yml',
'_base_/optimizer_300e.yml',
]
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_75_pretrained.pdparams
weights: output/picodet_s_416_coco/best_model
find_unused_parameters: True
keep_best_weight: True
use_ema: True
epoch: 300
snapshot_epoch: 10
PicoDet:
  backbone: LCNet
  neck: CSPPAN
  head: PicoHeadV2
LCNet:
  scale: 0.75
  feature_maps: [3, 4, 5]
  act: relu6
CSPPAN:
  out_channels: 96
  use_depthwise: True
  num_csp_blocks: 1
  num_features: 4
  act: relu6
PicoHeadV2:
  conv_feat:
    name: PicoFeat
    feat_in: 96
    feat_out: 96
    num_convs: 4
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
    use_se: True
    act: relu6
  feat_in_chan: 96
  act: relu6
LearningRate:
  base_lr: 0.2
  schedulers:
  - !CosineDecay
    max_epochs: 300
    min_lr_ratio: 0.08
    last_plateau_epochs: 30
  - !ExpWarmup
    epochs: 2
worker_num: 6
eval_height: &eval_height 416
eval_width: &eval_width 416
eval_size: &eval_size [*eval_height, *eval_width]
TrainReader:
  sample_transforms:
  - Decode: {}
  - Mosaic:
      prob: 0.6
      input_dim: [640, 640]
      degrees: [-10, 10]
      scale: [0.1, 2.0]
      shear: [-2, 2]
      translate: [-0.1, 0.1]
      enable_mixup: True
  - AugmentHSV: {is_bgr: False, hgain: 5, sgain: 30, vgain: 30}
  - RandomFlip: {prob: 0.5}
  batch_transforms:
  - BatchRandomResize: {target_size: [320, 352, 384, 416, 448, 480, 512], random_size: True, random_interp: True, keep_ratio: False}
  - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
  - Permute: {}
  - PadGT: {}
  batch_size: 40
  shuffle: true
  drop_last: true
  mosaic_epoch: 180
EvalReader:
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
  - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 8
  shuffle: false
TestReader:
  inputs_def:
    image_shape: [1, 3, *eval_height, *eval_width]
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
  - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
  - Permute: {}
  batch_size: 1

View File

@@ -0,0 +1,45 @@
_BASE_: [
'../datasets/coco_detection.yml',
'../runtime.yml',
'_base_/picodet_v2.yml',
'_base_/optimizer_300e.yml',
'_base_/picodet_320_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/LCNet_x0_35_pretrained.pdparams
weights: output/picodet_xs_320_coco/best_model
find_unused_parameters: True
use_ema: true
epoch: 300
snapshot_epoch: 10
LCNet:
  scale: 0.35
  feature_maps: [3, 4, 5]
LCPAN:
  out_channels: 96
PicoHeadV2:
  conv_feat:
    name: PicoFeat
    feat_in: 96
    feat_out: 96
    num_convs: 2
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
    use_se: True
  feat_in_chan: 96
TrainReader:
  batch_size: 64
LearningRate:
  base_lr: 0.32
  schedulers:
  - !CosineDecay
    max_epochs: 300
  - !LinearWarmup
    start_factor: 0.1
    steps: 300

View File

@@ -0,0 +1,45 @@
_BASE_: [
'../datasets/coco_detection.yml',
'../runtime.yml',
'_base_/picodet_v2.yml',
'_base_/optimizer_300e.yml',
'_base_/picodet_416_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/LCNet_x0_35_pretrained.pdparams
weights: output/picodet_xs_416_coco/best_model
find_unused_parameters: True
use_ema: true
epoch: 300
snapshot_epoch: 10
LCNet:
  scale: 0.35
  feature_maps: [3, 4, 5]
LCPAN:
  out_channels: 96
PicoHeadV2:
  conv_feat:
    name: PicoFeat
    feat_in: 96
    feat_out: 96
    num_convs: 2
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
    use_se: True
  feat_in_chan: 96
TrainReader:
  batch_size: 56
LearningRate:
  base_lr: 0.28
  schedulers:
  - name: CosineDecay
    max_epochs: 300
  - name: LinearWarmup
    start_factor: 0.1
    steps: 300