Replace the document detection model
178
paddle_detection/configs/rotate/ppyoloe_r/README.md
Normal file
@@ -0,0 +1,178 @@
简体中文 | [English](README_en.md)

# PP-YOLOE-R

## Contents
- [Introduction](#Introduction)
- [Model Zoo](#Model-Zoo)
- [Getting Started](#Getting-Started)
- [Deployment](#Deployment)
- [Appendix](#Appendix)
- [Citations](#Citations)

## Introduction
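PP-YOLOE-R is an efficient single-stage, anchor-free rotated object detector. Built on PP-YOLOE, it introduces a set of useful designs that improve detection accuracy at the cost of very few additional parameters and computations. On the DOTA 1.0 dataset, PP-YOLOE-R-l and PP-YOLOE-R-x reach 78.14 and 78.28 mAP respectively with single-scale training and testing, outperforming almost all other rotated object detectors. With multi-scale training and testing, their accuracy rises further to 80.02 and 80.73 mAP. In this setting, PP-YOLOE-R-x surpasses all anchor-free methods and is nearly on par with state-of-the-art anchor-based two-stage models. In addition, PP-YOLOE-R-s and PP-YOLOE-R-m reach 79.42 and 79.71 mAP with multi-scale training and testing, a strong result given the parameter counts and FLOPs of these two models. While maintaining high accuracy, PP-YOLOE-R avoids special operators such as Deformable Convolution or Rotated RoI Align, so it can be deployed easily on a wide range of hardware. At an input resolution of 1024x1024, PP-YOLOE-R-s/m/l/x reach 69.8/55.1/48.3/37.1 FPS on an RTX 2080 Ti and 114.5/86.8/69.7/50.7 FPS on a Tesla V100 with TensorRT FP16. For more details, see our [**technical report**](https://arxiv.org/abs/2211.02386).

<div align="center">
    <img src="../../../docs/images/ppyoloe_r_map_fps.png" width=500 />
</div>

Compared with PP-YOLOE, PP-YOLOE-R makes the following changes:
- Rotated Task Alignment Learning
- Decoupled angle prediction head
- Angle prediction with DFL
- Learnable gating unit
- [ProbIoU loss](https://arxiv.org/abs/2106.06072)
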
## Model Zoo

| Model | Backbone | mAP | V100 TRT FP16 (FPS) | RTX 2080 Ti TRT FP16 (FPS) | Params (M) | FLOPs (G) | LR schedule | Angle representation | Augmentation | GPUs | Images per GPU | Download | Config |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| PP-YOLOE-R-s | CRN-s | 73.82 | 114.5 | 69.8 | 8.09 | 43.46 | 3x | oc | RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_s_3x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/rotate/ppyoloe_r/ppyoloe_r_crn_s_3x_dota.yml) |
| PP-YOLOE-R-s | CRN-s | 79.42 | 114.5 | 69.8 | 8.09 | 43.46 | 3x | oc | MS+RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_s_3x_dota_ms.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/rotate/ppyoloe_r/ppyoloe_r_crn_s_3x_dota_ms.yml) |
| PP-YOLOE-R-m | CRN-m | 77.64 | 86.8 | 55.1 | 23.96 | 127.00 | 3x | oc | RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_m_3x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/rotate/ppyoloe_r/ppyoloe_r_crn_m_3x_dota.yml) |
| PP-YOLOE-R-m | CRN-m | 79.71 | 86.8 | 55.1 | 23.96 | 127.00 | 3x | oc | MS+RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_m_3x_dota_ms.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/rotate/ppyoloe_r/ppyoloe_r_crn_m_3x_dota_ms.yml) |
| PP-YOLOE-R-l | CRN-l | 78.14 | 69.7 | 48.3 | 53.29 | 281.65 | 3x | oc | RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml) |
| PP-YOLOE-R-l | CRN-l | 80.02 | 69.7 | 48.3 | 53.29 | 281.65 | 3x | oc | MS+RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota_ms.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota_ms.yml) |
| PP-YOLOE-R-x | CRN-x | 78.28 | 50.7 | 37.1 | 100.27 | 529.82 | 3x | oc | RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_x_3x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/rotate/ppyoloe_r/ppyoloe_r_crn_x_3x_dota.yml) |
| PP-YOLOE-R-x | CRN-x | 80.73 | 50.7 | 37.1 | 100.27 | 529.82 | 3x | oc | MS+RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_x_3x_dota_ms.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/rotate/ppyoloe_r/ppyoloe_r_crn_x_3x_dota_ms.yml) |

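**Notes:**

- If the **number of GPUs** or the **batch size** changes, adjust the learning rate according to the formula **lr<sub>new</sub> = lr<sub>default</sub> * (batch_size<sub>new</sub> * GPU_number<sub>new</sub>) / (batch_size<sub>default</sub> * GPU_number<sub>default</sub>)**.
- Models in the model zoo are trained and tested at a single scale by default. `MS` in the augmentation column means multi-scale training and multi-scale testing are used. `RR` in the augmentation column means RandomRotate augmentation is used during training.
- CRN denotes the CSPRepResNet proposed in PP-YOLOE.
- The parameter counts and FLOPs of PP-YOLOE-R are computed after re-parameterization, with an input resolution of 1024x1024.
- Speed is measured with TensorRT 8.2.3 as the average over 2000 images from the DOTA test set. See [Speed testing](#Speed-testing) to reproduce the results.
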
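The learning-rate rule in the notes is plain linear scaling with the total batch size. A minimal sketch, assuming the model zoo defaults listed in the table (4 GPUs, 2 images per GPU) and the `base_lr: 0.008` from the optimizer config in this commit; the function name is illustrative and not part of PaddleDetection:

```python
def scale_learning_rate(lr_default, batch_size_default, gpus_default,
                        batch_size_new, gpus_new):
    """Apply the linear scaling rule: lr grows with the total batch size."""
    total_default = batch_size_default * gpus_default
    total_new = batch_size_new * gpus_new
    return lr_default * total_new / total_default


# Default setup: base_lr 0.008 with 4 GPUs x 2 images per GPU.
# Training on a single GPU with the same per-GPU batch size divides
# the total batch size by 4, so the learning rate is divided by 4 too.
lr_single_gpu = scale_learning_rate(0.008, 2, 4, batch_size_new=2, gpus_new=1)
print(lr_single_gpu)
```

The resulting value can then be passed on the command line with `-o LearningRate.base_lr=<value>` or written into the optimizer config.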
## Getting Started

Refer to [Data Preparation](../README.md#数据准备) to prepare the data.

### Training

Single-GPU training
``` bash
CUDA_VISIBLE_DEVICES=0 python tools/train.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml
```

Multi-GPU training
``` bash
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml
```

### Inference

Run the following command to run inference on a single image. The result is saved in the `output` directory by default.
``` bash
python tools/infer.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams --infer_img=demo/P0861__1.0__1154___824.png --draw_threshold=0.5
```

### Evaluation on the DOTA Dataset

As described in [DOTA Task](https://captain-whu.github.io/DOTA/tasks.html), evaluating on DOTA requires a zip file containing all detection results. The detections of each category are stored in one txt file, and each line has the format `image_name score x1 y1 x2 y2 x3 y3 x4 y4`. Submit the generated zip file to Task1 of [DOTA Evaluation](https://captain-whu.github.io/DOTA/evaluation.html) for evaluation. You can run the following command to get predictions on the test set:
``` bash
python tools/infer.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams --infer_dir=/path/to/test/images --output_dir=output_ppyoloe_r --visualize=False --save_results=True
```
Convert the predictions into the format required by the official evaluation server:
``` bash
python configs/rotate/tools/generate_result.py --pred_txt_dir=output_ppyoloe_r/ --output_dir=submit/ --data_type=dota10

zip -r submit.zip submit
```

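To make the submission line format concrete, the sketch below converts one rotated box into the four corner points and prints a line in the `image_name score x1 y1 x2 y2 x3 y3 x4 y4` layout. It is only an illustration: the `(cx, cy, w, h, angle)` parameterization with the angle in radians, the number formatting, and both function names are assumptions, and `generate_result.py` remains the tool to use for real submissions.

```python
import math


def rbox_to_corners(cx, cy, w, h, angle):
    """Return the four corners of a rotated box (angle in radians)."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    half_w, half_h = w / 2.0, h / 2.0
    corners = []
    # Walk around the box in its own frame, then rotate and translate.
    for dx, dy in ((-half_w, -half_h), (half_w, -half_h),
                   (half_w, half_h), (-half_w, half_h)):
        x = cx + dx * cos_a - dy * sin_a
        y = cy + dx * sin_a + dy * cos_a
        corners.append((x, y))
    return corners


def format_dota_line(image_name, score, corners):
    """Format one detection as a DOTA-style result line."""
    coords = " ".join(f"{value:.2f}" for point in corners for value in point)
    return f"{image_name} {score:.4f} {coords}"


# Hypothetical detection: an axis-aligned 4x2 box centred at (10, 10).
line = format_dota_line("P0001", 0.9, rbox_to_corners(10, 10, 4, 2, 0.0))
print(line)
```

Each category gets its own txt file, so a real exporter would group such lines by class before zipping.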
### Speed testing

You can measure speed in Paddle mode or Paddle-TRT mode. When using Paddle-TRT mode, make sure that **the TensorRT version is higher than 8.2 and PaddlePaddle is the develop version**. To measure speed with Paddle-TRT, run the following commands:

``` bash
# export the inference model
python tools/export_model.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams trt=True

# run the speed test
CUDA_VISIBLE_DEVICES=0 python configs/rotate/tools/inference_benchmark.py --model_dir output_inference/ppyoloe_r_crn_l_3x_dota/ --image_dir /path/to/dota/test/dir --run_mode trt_fp16
```
To measure speed with Paddle only, run the following commands:
``` bash
# export the inference model
python tools/export_model.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams

# run the speed test
CUDA_VISIBLE_DEVICES=0 python configs/rotate/tools/inference_benchmark.py --model_dir output_inference/ppyoloe_r_crn_l_3x_dota/ --image_dir /path/to/dota/test/dir --run_mode paddle
```

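The FPS figures are averages over many images. The sketch below shows the general shape of such a measurement (warm-up runs, then total time divided into the image count). It is not the implementation of `inference_benchmark.py`; `run_once` is a hypothetical stand-in for a single forward pass, and the warm-up count is an assumption.

```python
import time


def measure_fps(run_once, num_images, warmup=10):
    """Average frames per second of `run_once` over `num_images` calls."""
    # Warm-up calls are excluded so one-time costs (engine building,
    # memory allocation) do not distort the average.
    for _ in range(warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(num_images):
        run_once()
    elapsed = time.perf_counter() - start
    return num_images / elapsed


# Stand-in workload; a real benchmark would run the predictor here.
fps = measure_fps(lambda: sum(range(1000)), num_images=200)
print(f"{fps:.1f} FPS")
```

On a GPU the device has to be synchronized before each timestamp is taken, otherwise asynchronous kernel launches make the numbers look faster than they are.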
## Deployment

**Using Paddle** for deployment, run the following commands:
``` bash
# export the inference model
python tools/export_model.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams

# run inference on a single image
python deploy/python/infer.py --image_file demo/P0072__1.0__0___0.png --model_dir=output_inference/ppyoloe_r_crn_l_3x_dota --run_mode=paddle --device=gpu
```

**Using Paddle-TRT** for deployment, run the following commands:
``` bash
# export the inference model
python tools/export_model.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams trt=True

# run inference on a single image
python deploy/python/infer.py --image_file demo/P0072__1.0__0___0.png --model_dir=output_inference/ppyoloe_r_crn_l_3x_dota --run_mode=trt_fp16 --device=gpu
```

**Notes:**
- When using Paddle-TRT, make sure that **PaddlePaddle is the develop version and the TensorRT version is higher than 8.2**.

**Using ONNX Runtime** for deployment, run the following commands:
``` bash
# export the inference model
python tools/export_model.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams export_onnx=True

# install paddle2onnx
pip install paddle2onnx

# convert to an ONNX model
paddle2onnx --model_dir output_inference/ppyoloe_r_crn_l_3x_dota --model_filename model.pdmodel --params_filename model.pdiparams --opset_version 11 --save_file ppyoloe_r_crn_l_3x_dota.onnx

# run inference on a single image
python configs/rotate/tools/onnx_infer.py --infer_cfg output_inference/ppyoloe_r_crn_l_3x_dota/infer_cfg.yml --onnx_file ppyoloe_r_crn_l_3x_dota.onnx --image_file demo/P0072__1.0__0___0.png
```

## Appendix

Ablation experiments of PP-YOLOE-R

| Model | mAP | Params (M) | FLOPs (G) |
| :-: | :-: | :------: | :------: |
| Baseline | 75.61 | 50.65 | 269.09 |
| +Rotated Task Alignment Learning | 77.24 | 50.65 | 269.09 |
| +Decoupled Angle Prediction Head | 77.78 | 52.20 | 272.72 |
| +Angle Prediction with DFL | 78.01 | 53.29 | 281.65 |
| +Learnable Gating Unit for RepVGG | 78.14 | 53.29 | 281.65 |

## Citations

```
@article{wang2022pp,
  title={PP-YOLOE-R: An Efficient Anchor-Free Rotated Object Detector},
  author={Wang, Xinxin and Wang, Guanzhong and Dang, Qingqing and Liu, Yi and Hu, Xiaoguang and Yu, Dianhai},
  journal={arXiv preprint arXiv:2211.02386},
  year={2022}
}

@article{xu2022pp,
  title={PP-YOLOE: An evolved version of YOLO},
  author={Xu, Shangliang and Wang, Xinxin and Lv, Wenyu and Chang, Qinyao and Cui, Cheng and Deng, Kaipeng and Wang, Guanzhong and Dang, Qingqing and Wei, Shengyu and Du, Yuning and others},
  journal={arXiv preprint arXiv:2203.16250},
  year={2022}
}

@article{llerena2021gaussian,
  title={Gaussian Bounding Boxes and Probabilistic Intersection-over-Union for Object Detection},
  author={Llerena, Jeffri M and Zeni, Luis Felipe and Kristen, Lucas N and Jung, Claudio},
  journal={arXiv preprint arXiv:2106.06072},
  year={2021}
}
```
180
paddle_detection/configs/rotate/ppyoloe_r/README_en.md
Normal file
@@ -0,0 +1,180 @@
English | [简体中文](README.md)

# PP-YOLOE-R

## Contents
- [Introduction](#Introduction)
- [Model Zoo](#Model-Zoo)
- [Getting Started](#Getting-Started)
- [Deployment](#Deployment)
- [Appendix](#Appendix)
- [Citations](#Citations)

## Introduction
PP-YOLOE-R is an efficient anchor-free rotated object detector. Based on PP-YOLOE, PP-YOLOE-R introduces a bag of useful tricks that improve detection precision at the expense of marginal extra parameters and computations. PP-YOLOE-R-l and PP-YOLOE-R-x achieve 78.14 and 78.28 mAP respectively on the DOTA 1.0 dataset with single-scale training and testing, which outperforms almost all other rotated object detectors. With multi-scale training and testing, the detection precision of PP-YOLOE-R-l and PP-YOLOE-R-x is further improved to 80.02 and 80.73 mAP. In this case, PP-YOLOE-R-x surpasses all anchor-free methods and is competitive with state-of-the-art anchor-based two-stage models. Moreover, PP-YOLOE-R-s and PP-YOLOE-R-m achieve 79.42 and 79.71 mAP with multi-scale training and testing, which is an excellent result considering the parameters and FLOPs of these two models. While maintaining high precision, PP-YOLOE-R avoids special operators, such as Deformable Convolution or Rotated RoI Align, so that it can be deployed easily on various hardware. At an input resolution of 1024x1024, PP-YOLOE-R-s/m/l/x reach 69.8/55.1/48.3/37.1 FPS on an RTX 2080 Ti and 114.5/86.8/69.7/50.7 FPS on a Tesla V100 GPU with TensorRT and FP16 precision. For more details, please refer to our [**technical report**](https://arxiv.org/abs/2211.02386).

<div align="center">
    <img src="../../../docs/images/ppyoloe_r_map_fps.png" width=500 />
</div>

Compared with PP-YOLOE, PP-YOLOE-R makes the following changes:
- Rotated Task Alignment Learning
- Decoupled Angle Prediction Head
- Angle Prediction with DFL
- Learnable Gating Unit for RepVGG
- [ProbIoU Loss](https://arxiv.org/abs/2106.06072)

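The ProbIoU loss in the list above treats each rotated box as a 2D Gaussian and compares two boxes through the Hellinger distance. The following is a minimal pure-Python sketch of that idea as described in the linked ProbIoU paper, not the PaddleDetection implementation. It assumes boxes are given as `(cx, cy, w, h, angle)` with the angle in radians and uses the covariance of a uniform distribution over the rectangle (`w^2/12`, `h^2/12`); all function names are illustrative.

```python
import math


def rbox_to_gaussian(cx, cy, w, h, angle):
    """Return the mean and the covariance entries (a, b, c) of [[a, c], [c, b]]."""
    var_w, var_h = w * w / 12.0, h * h / 12.0
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    # Sigma = R * diag(var_w, var_h) * R^T for a rotation by `angle`.
    a = var_w * cos_a ** 2 + var_h * sin_a ** 2
    b = var_w * sin_a ** 2 + var_h * cos_a ** 2
    c = (var_w - var_h) * cos_a * sin_a
    return (cx, cy), (a, b, c)


def probiou(box1, box2):
    """ProbIoU = 1 - Hellinger distance between the two box Gaussians."""
    (x1, y1), (a1, b1, c1) = rbox_to_gaussian(*box1)
    (x2, y2), (a2, b2, c2) = rbox_to_gaussian(*box2)
    # Averaged covariance used by the Bhattacharyya distance.
    a, b, c = (a1 + a2) / 2.0, (b1 + b2) / 2.0, (c1 + c2) / 2.0
    det = a * b - c * c
    det1 = a1 * b1 - c1 * c1
    det2 = a2 * b2 - c2 * c2
    dx, dy = x1 - x2, y1 - y2
    # (mu1 - mu2)^T Sigma^{-1} (mu1 - mu2) for a 2x2 symmetric matrix.
    mahalanobis = (b * dx * dx - 2.0 * c * dx * dy + a * dy * dy) / det
    distance = mahalanobis / 8.0 + 0.5 * math.log(det / math.sqrt(det1 * det2))
    distance = max(distance, 0.0)  # guard against tiny negative round-off
    coefficient = math.exp(-distance)
    hellinger = math.sqrt(max(1.0 - coefficient, 0.0))
    return 1.0 - hellinger


box = (50.0, 50.0, 40.0, 20.0, math.pi / 6)
print(probiou(box, box))                                  # close to 1
print(probiou(box, (500.0, 50.0, 40.0, 20.0, 0.0)))       # close to 0
```

The training loss is then `1 - ProbIoU`. Because every step is differentiable, no polygon clipping is needed, which is one reason this formulation suits rotated boxes.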
## Model Zoo

| Model | Backbone | mAP | V100 TRT FP16 (FPS) | RTX 2080 Ti TRT FP16 (FPS) | Params (M) | FLOPs (G) | Lr Scheduler | Angle | Aug | GPU Number | images/GPU | download | config |
|:-----:|:--------:|:----:|:-------------------:|:--------------------------:|:-----------:|:---------:|:--------:|:-----:|:---:|:----------:|:----------:|:--------:|:------:|
| PP-YOLOE-R-s | CRN-s | 73.82 | 114.5 | 69.8 | 8.09 | 43.46 | 3x | oc | RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_s_3x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/rotate/ppyoloe_r/ppyoloe_r_crn_s_3x_dota.yml) |
| PP-YOLOE-R-s | CRN-s | 79.42 | 114.5 | 69.8 | 8.09 | 43.46 | 3x | oc | MS+RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_s_3x_dota_ms.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/rotate/ppyoloe_r/ppyoloe_r_crn_s_3x_dota_ms.yml) |
| PP-YOLOE-R-m | CRN-m | 77.64 | 86.8 | 55.1 | 23.96 | 127.00 | 3x | oc | RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_m_3x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/rotate/ppyoloe_r/ppyoloe_r_crn_m_3x_dota.yml) |
| PP-YOLOE-R-m | CRN-m | 79.71 | 86.8 | 55.1 | 23.96 | 127.00 | 3x | oc | MS+RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_m_3x_dota_ms.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/rotate/ppyoloe_r/ppyoloe_r_crn_m_3x_dota_ms.yml) |
| PP-YOLOE-R-l | CRN-l | 78.14 | 69.7 | 48.3 | 53.29 | 281.65 | 3x | oc | RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml) |
| PP-YOLOE-R-l | CRN-l | 80.02 | 69.7 | 48.3 | 53.29 | 281.65 | 3x | oc | MS+RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota_ms.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota_ms.yml) |
| PP-YOLOE-R-x | CRN-x | 78.28 | 50.7 | 37.1 | 100.27 | 529.82 | 3x | oc | RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_x_3x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/rotate/ppyoloe_r/ppyoloe_r_crn_x_3x_dota.yml) |
| PP-YOLOE-R-x | CRN-x | 80.73 | 50.7 | 37.1 | 100.27 | 529.82 | 3x | oc | MS+RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_x_3x_dota_ms.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/rotate/ppyoloe_r/ppyoloe_r_crn_x_3x_dota_ms.yml) |

**Notes:**

- If the **GPU number** or **mini-batch size** is changed, the **learning rate** should be adjusted according to the formula **lr<sub>new</sub> = lr<sub>default</sub> * (batch_size<sub>new</sub> * GPU_number<sub>new</sub>) / (batch_size<sub>default</sub> * GPU_number<sub>default</sub>)**.
- Models in the model zoo are trained and tested at a single scale by default. `MS` in the data augmentation column means that multi-scale training and multi-scale testing are used. `RR` in the data augmentation column means that RandomRotate data augmentation is used for training.
- CRN denotes the CSPRepResNet proposed in PP-YOLOE.
- The parameters and FLOPs of PP-YOLOE-R are calculated after re-parameterization, with an input resolution of 1024x1024.
- Speed is measured with TensorRT 8.2.3 and averaged over 2000 images from the DOTA test dataset. Refer to [Speed testing](#Speed-testing) to reproduce the results.

## Getting Started

Refer to [Data-Preparation](../README_en.md#Data-Preparation) to prepare data.

### Training

Single-GPU training
``` bash
CUDA_VISIBLE_DEVICES=0 python tools/train.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml
```

Multi-GPU training
``` bash
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml
```

### Inference

Run the following command to run inference on a single image. The result is saved in the `output` directory by default.

``` bash
python tools/infer.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams --infer_img=demo/P0861__1.0__1154___824.png --draw_threshold=0.5
```

### Evaluation on DOTA Dataset

Referring to [DOTA Task](https://captain-whu.github.io/DOTA/tasks.html), you need to submit a zip file containing the results for all test images for evaluation. The detection results of each category are stored in a txt file, each line of which has the following format:
`image_id score x1 y1 x2 y2 x3 y3 x4 y4`. To evaluate, submit the generated zip file to Task1 of [DOTA Evaluation](https://captain-whu.github.io/DOTA/evaluation.html). You can run the following command to get the inference results on the test dataset:
``` bash
python tools/infer.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams --infer_dir=/path/to/test/images --output_dir=output_ppyoloe_r --visualize=False --save_results=True
```
Process the prediction results into the format required by the official evaluation website:
``` bash
python configs/rotate/tools/generate_result.py --pred_txt_dir=output_ppyoloe_r/ --output_dir=submit/ --data_type=dota10

zip -r submit.zip submit
```

### Speed testing

You can use Paddle mode or Paddle-TRT mode for speed testing. When using Paddle-TRT for speed testing, make sure that **the version of TensorRT is higher than 8.2 and the version of PaddlePaddle is the develop version**. To test speed with Paddle-TRT, run the following commands:

``` bash
# export inference model
python tools/export_model.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams trt=True

# speed testing
CUDA_VISIBLE_DEVICES=0 python configs/rotate/tools/inference_benchmark.py --model_dir output_inference/ppyoloe_r_crn_l_3x_dota/ --image_dir /path/to/dota/test/dir --run_mode trt_fp16
```
To test speed with Paddle, run the following commands:
``` bash
# export inference model
python tools/export_model.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams

# speed testing
CUDA_VISIBLE_DEVICES=0 python configs/rotate/tools/inference_benchmark.py --model_dir output_inference/ppyoloe_r_crn_l_3x_dota/ --image_dir /path/to/dota/test/dir --run_mode paddle
```

## Deployment

**Using Paddle** for deployment, run the following commands:

``` bash
# export inference model
python tools/export_model.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams

# inference single image
python deploy/python/infer.py --image_file demo/P0072__1.0__0___0.png --model_dir=output_inference/ppyoloe_r_crn_l_3x_dota --run_mode=paddle --device=gpu
```

**Using Paddle-TRT** for deployment, run the following commands:

``` bash
# export inference model
python tools/export_model.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams trt=True

# inference single image
python deploy/python/infer.py --image_file demo/P0072__1.0__0___0.png --model_dir=output_inference/ppyoloe_r_crn_l_3x_dota --run_mode=trt_fp16 --device=gpu
```
**Notes:**
- When using Paddle-TRT for deployment, make sure that **the version of TensorRT is higher than 8.2 and the version of PaddlePaddle is the develop version**.

**Using ONNX Runtime** for deployment, run the following commands:

``` bash
# export inference model
python tools/export_model.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams export_onnx=True

# install paddle2onnx
pip install paddle2onnx

# convert to onnx model
paddle2onnx --model_dir output_inference/ppyoloe_r_crn_l_3x_dota --model_filename model.pdmodel --params_filename model.pdiparams --opset_version 11 --save_file ppyoloe_r_crn_l_3x_dota.onnx

# inference single image
python configs/rotate/tools/onnx_infer.py --infer_cfg output_inference/ppyoloe_r_crn_l_3x_dota/infer_cfg.yml --onnx_file ppyoloe_r_crn_l_3x_dota.onnx --image_file demo/P0072__1.0__0___0.png
```

## Appendix

Ablation experiments of PP-YOLOE-R

| Model | mAP | Params(M) | FLOPs(G) |
| :-: | :-: | :------: | :------: |
| Baseline | 75.61 | 50.65 | 269.09 |
| +Rotated Task Alignment Learning | 77.24 | 50.65 | 269.09 |
| +Decoupled Angle Prediction Head | 77.78 | 52.20 | 272.72 |
| +Angle Prediction with DFL | 78.01 | 53.29 | 281.65 |
| +Learnable Gating Unit for RepVGG | 78.14 | 53.29 | 281.65 |

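The "Angle Prediction with DFL" row refers to predicting the angle as a distribution over discrete bins and taking its expectation, in the spirit of Distribution Focal Loss. The sketch below illustrates only that decoding step. The bin layout (91 bins evenly spaced over `[0, pi/2]`, matching the `oc` angle range) and the function name are assumptions for illustration and are not stated in this README.

```python
import math


def decode_angle(logits, angle_max=90):
    """Expected angle (radians) from `angle_max + 1` per-bin logits."""
    if len(logits) != angle_max + 1:
        raise ValueError("expected angle_max + 1 logits")
    # Numerically stable softmax over the bins.
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    # Bin i represents the angle i * (pi / 2) / angle_max.
    bin_width = (math.pi / 2) / angle_max
    return sum(e / total * i * bin_width for i, e in enumerate(exps))


# A distribution peaked on bin 45 decodes to roughly pi / 4.
logits = [0.0] * 91
logits[45] = 20.0
print(decode_angle(logits))
```

Taking an expectation rather than an argmax keeps the prediction continuous, so the decoded angle is not limited to the bin centres.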
## Citations

```
@article{wang2022pp,
  title={PP-YOLOE-R: An Efficient Anchor-Free Rotated Object Detector},
  author={Wang, Xinxin and Wang, Guanzhong and Dang, Qingqing and Liu, Yi and Hu, Xiaoguang and Yu, Dianhai},
  journal={arXiv preprint arXiv:2211.02386},
  year={2022}
}

@article{xu2022pp,
  title={PP-YOLOE: An evolved version of YOLO},
  author={Xu, Shangliang and Wang, Xinxin and Lv, Wenyu and Chang, Qinyao and Cui, Cheng and Deng, Kaipeng and Wang, Guanzhong and Dang, Qingqing and Wei, Shengyu and Du, Yuning and others},
  journal={arXiv preprint arXiv:2203.16250},
  year={2022}
}

@article{llerena2021gaussian,
  title={Gaussian Bounding Boxes and Probabilistic Intersection-over-Union for Object Detection},
  author={Llerena, Jeffri M and Zeni, Luis Felipe and Kristen, Lucas N and Jung, Claudio},
  journal={arXiv preprint arXiv:2106.06072},
  year={2021}
}
```
@@ -0,0 +1,19 @@
epoch: 36

LearningRate:
  base_lr: 0.008
  schedulers:
    - !CosineDecay
      max_epochs: 44
    - !LinearWarmup
      start_factor: 0.
      steps: 1000

OptimizerBuilder:
  clip_grad_by_norm: 35.
  optimizer:
    momentum: 0.9
    type: Momentum
  regularizer:
    factor: 0.0005
    type: L2
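To see what this schedule does over training, here is a small sketch of a linear warm-up followed by cosine decay using the values from the config above. How the two schedulers are combined (warm-up values replace the cosine values for the first `steps` iterations) is an assumption about the framework, and `steps_per_epoch` is a hypothetical value that depends on the dataset size and total batch size.

```python
import math


def learning_rate(step, steps_per_epoch, base_lr=0.008, max_epochs=44,
                  warmup_steps=1000, start_factor=0.0):
    """Learning rate at a given iteration: linear warm-up, then cosine decay."""
    if step < warmup_steps:
        # Interpolate from start_factor * base_lr up to base_lr.
        alpha = step / warmup_steps
        return base_lr * (start_factor * (1.0 - alpha) + alpha)
    max_iters = max_epochs * steps_per_epoch
    return base_lr * 0.5 * (math.cos(step * math.pi / max_iters) + 1.0)


steps_per_epoch = 2000  # hypothetical
for epoch in (0, 1, 18, 36):
    step = epoch * steps_per_epoch
    print(epoch, round(learning_rate(step, steps_per_epoch), 6))
```

Since `epoch` is 36 while the cosine period is 44 epochs, training stops before the curve reaches zero. Under the assumptions above, the final learning rate is roughly 8% of `base_lr`.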
@@ -0,0 +1,49 @@
architecture: YOLOv3
norm_type: sync_bn
use_ema: true
ema_decay: 0.9998

YOLOv3:
  backbone: CSPResNet
  neck: CustomCSPPAN
  yolo_head: PPYOLOERHead
  post_process: ~

CSPResNet:
  layers: [3, 6, 6, 3]
  channels: [64, 128, 256, 512, 1024]
  return_idx: [1, 2, 3]
  use_large_stem: True
  use_alpha: True

CustomCSPPAN:
  out_channels: [768, 384, 192]
  stage_num: 1
  block_num: 3
  act: 'swish'
  spp: true
  use_alpha: True

PPYOLOERHead:
  fpn_strides: [32, 16, 8]
  grid_cell_offset: 0.5
  use_varifocal_loss: true
  static_assigner_epoch: -1
  loss_weight: {class: 1.0, iou: 2.5, dfl: 0.05}
  static_assigner:
    name: FCOSRAssigner
    factor: 12
    threshold: 0.23
    boundary: [[512, 10000], [256, 512], [-1, 256]]
  assigner:
    name: RotatedTaskAlignedAssigner
    topk: 13
    alpha: 1.0
    beta: 6.0
  nms:
    name: MultiClassNMS
    nms_top_k: 2000
    keep_top_k: -1
    score_threshold: 0.1
    nms_threshold: 0.1
    normalized: False
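The `assigner` block above sets `topk`, `alpha`, and `beta`. In task-aligned assignment these typically parameterize an alignment metric `score^alpha * iou^beta`, and the top-k anchors by that metric become positives for each ground-truth box. The sketch below shows just that ranking step with the configured values. It is a simplified illustration and not the `RotatedTaskAlignedAssigner` code: the IoU values are taken as given, and any further filtering done by the real assigner is omitted.

```python
def alignment_metric(cls_score, iou, alpha=1.0, beta=6.0):
    """Combine classification score and IoU into one alignment value."""
    return (cls_score ** alpha) * (iou ** beta)


def select_topk(candidates, topk=13):
    """Pick the anchors with the highest alignment metric.

    `candidates` is a list of (anchor_index, cls_score, iou) tuples
    for one ground-truth box.
    """
    scored = [(alignment_metric(score, iou), index)
              for index, score, iou in candidates]
    scored.sort(reverse=True)
    return [index for _, index in scored[:topk]]


# Hypothetical candidates: a high IoU outweighs a high score because beta=6.
candidates = [(0, 0.9, 0.5), (1, 0.6, 0.9), (2, 0.8, 0.7)]
print(select_topk(candidates, topk=2))
```

With `beta` much larger than `alpha`, localization quality dominates the ranking, so a confident but poorly localized anchor is unlikely to be chosen as a positive.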
@@ -0,0 +1,46 @@
worker_num: 4
image_height: &image_height 1024
image_width: &image_width 1024
image_size: &image_size [*image_height, *image_width]

TrainReader:
  sample_transforms:
    - Decode: {}
    - Poly2Array: {}
    - RandomRFlip: {}
    - RandomRRotate: {angle_mode: 'value', angle: [0, 90, 180, -90]}
    - RandomRRotate: {angle_mode: 'value', angle: [30, 60], rotate_prob: 0.5}
    - RResize: {target_size: *image_size, keep_ratio: True, interp: 2}
    - Poly2RBox: {filter_threshold: 2, filter_mode: 'edge', rbox_type: 'oc'}
  batch_transforms:
    - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
    - Permute: {}
    - PadRGT: {}
    - PadBatch: {pad_to_stride: 32}
  batch_size: 2
  shuffle: true
  drop_last: true
  use_shared_memory: true
  collate_batch: true

EvalReader:
  sample_transforms:
    - Decode: {}
    - Poly2Array: {}
    - RResize: {target_size: *image_size, keep_ratio: True, interp: 2}
    - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
    - Permute: {}
  batch_transforms:
    - PadBatch: {pad_to_stride: 32}
  batch_size: 2
  collate_batch: false

TestReader:
  sample_transforms:
    - Decode: {}
    - Resize: {target_size: *image_size, keep_ratio: True, interp: 2}
    - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
    - Permute: {}
  batch_transforms:
    - PadBatch: {pad_to_stride: 32}
  batch_size: 2
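Two of the reader steps above are simple arithmetic: `NormalizeImage` with `is_scale: True` and `PadBatch` with `pad_to_stride: 32`. The sketch below spells out that arithmetic on plain Python values. It assumes `is_scale` means dividing by 255 before subtracting the mean, and that padding rounds each side up to the next multiple of the stride; the function names are illustrative.

```python
import math

MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb, is_scale=True):
    """Normalize one RGB pixel with the mean/std from the reader config."""
    scale = 255.0 if is_scale else 1.0
    return tuple((value / scale - mean) / std
                 for value, mean, std in zip(rgb, MEAN, STD))


def padded_size(height, width, stride=32):
    """Round height and width up to the next multiple of `stride`."""
    return (math.ceil(height / stride) * stride,
            math.ceil(width / stride) * stride)


print(normalize_pixel((255, 128, 0)))
print(padded_size(1024, 1000))
```

Padding to a multiple of 32 matches the largest entry in `fpn_strides`, so every feature map has an integer size.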
@@ -0,0 +1,15 @@
_BASE_: [
  '../../datasets/dota.yml',
  '../../runtime.yml',
  '_base_/optimizer_3x.yml',
  '_base_/ppyoloe_r_reader.yml',
  '_base_/ppyoloe_r_crn.yml'
]

log_iter: 50
snapshot_epoch: 1
weights: output/ppyoloe_r_crn_l_3x_dota/model_final

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_l_pretrained.pdparams
depth_mult: 1.0
width_mult: 1.0
@@ -0,0 +1,15 @@
_BASE_: [
  '../../datasets/dota_ms.yml',
  '../../runtime.yml',
  '_base_/optimizer_3x.yml',
  '_base_/ppyoloe_r_reader.yml',
  '_base_/ppyoloe_r_crn.yml'
]

log_iter: 50
snapshot_epoch: 1
weights: output/ppyoloe_r_crn_l_3x_dota/model_final

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_l_pretrained.pdparams
depth_mult: 1.0
width_mult: 1.0
@@ -0,0 +1,15 @@
_BASE_: [
  '../../datasets/dota.yml',
  '../../runtime.yml',
  '_base_/optimizer_3x.yml',
  '_base_/ppyoloe_r_reader.yml',
  '_base_/ppyoloe_r_crn.yml'
]

log_iter: 50
snapshot_epoch: 1
weights: output/ppyoloe_r_crn_m_3x_dota/model_final

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_m_pretrained.pdparams
depth_mult: 0.67
width_mult: 0.75
@@ -0,0 +1,15 @@
_BASE_: [
  '../../datasets/dota_ms.yml',
  '../../runtime.yml',
  '_base_/optimizer_3x.yml',
  '_base_/ppyoloe_r_reader.yml',
  '_base_/ppyoloe_r_crn.yml'
]

log_iter: 50
snapshot_epoch: 1
weights: output/ppyoloe_r_crn_m_3x_dota/model_final

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_m_pretrained.pdparams
depth_mult: 0.67
width_mult: 0.75
@@ -0,0 +1,15 @@
_BASE_: [
  '../../datasets/dota.yml',
  '../../runtime.yml',
  '_base_/optimizer_3x.yml',
  '_base_/ppyoloe_r_reader.yml',
  '_base_/ppyoloe_r_crn.yml'
]

log_iter: 50
snapshot_epoch: 1
weights: output/ppyoloe_r_crn_s_3x_dota/model_final

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_s_pretrained.pdparams
depth_mult: 0.33
width_mult: 0.50
@@ -0,0 +1,15 @@
_BASE_: [
  '../../datasets/dota_ms.yml',
  '../../runtime.yml',
  '_base_/optimizer_3x.yml',
  '_base_/ppyoloe_r_reader.yml',
  '_base_/ppyoloe_r_crn.yml'
]

log_iter: 50
snapshot_epoch: 1
weights: output/ppyoloe_r_crn_s_3x_dota/model_final

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_s_pretrained.pdparams
depth_mult: 0.33
width_mult: 0.50
@@ -0,0 +1,15 @@
_BASE_: [
  '../../datasets/dota.yml',
  '../../runtime.yml',
  '_base_/optimizer_3x.yml',
  '_base_/ppyoloe_r_reader.yml',
  '_base_/ppyoloe_r_crn.yml'
]

log_iter: 50
snapshot_epoch: 1
weights: output/ppyoloe_r_crn_x_3x_dota/model_final

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_x_pretrained.pdparams
depth_mult: 1.33
width_mult: 1.25
@@ -0,0 +1,15 @@
_BASE_: [
  '../../datasets/dota_ms.yml',
  '../../runtime.yml',
  '_base_/optimizer_3x.yml',
  '_base_/ppyoloe_r_reader.yml',
  '_base_/ppyoloe_r_crn.yml'
]

log_iter: 50
snapshot_epoch: 1
weights: output/ppyoloe_r_crn_x_3x_dota/model_final

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_x_pretrained.pdparams
depth_mult: 1.33
width_mult: 1.25
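The s/m/l/x configs differ only in `depth_mult` and `width_mult`, which scale the base `layers` and `channels` of `CSPResNet`. The sketch below shows one plausible way these multipliers translate into concrete sizes. The rounding rule (round to nearest, minimum 1) is an assumption about the backbone implementation, and the function name is illustrative.

```python
BASE_LAYERS = [3, 6, 6, 3]
BASE_CHANNELS = [64, 128, 256, 512, 1024]

# (depth_mult, width_mult) pairs taken from the configs in this commit.
VARIANTS = {
    "s": (0.33, 0.50),
    "m": (0.67, 0.75),
    "l": (1.0, 1.0),
    "x": (1.33, 1.25),
}


def scale_backbone(depth_mult, width_mult):
    """Scale block counts by depth_mult and channel counts by width_mult."""
    layers = [max(round(n * depth_mult), 1) for n in BASE_LAYERS]
    channels = [max(round(c * width_mult), 1) for c in BASE_CHANNELS]
    return layers, channels


for name, (depth, width) in VARIANTS.items():
    layers, channels = scale_backbone(depth, width)
    print(name, layers, channels)
```

Under this assumption the s variant keeps one block in the first and last stages and halves every channel count, which is consistent with its much smaller parameter count in the model zoo table.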