更换文档检测模型

This commit is contained in:
2024-08-27 14:42:45 +08:00
parent aea6f19951
commit 1514e09c40
2072 changed files with 254336 additions and 4967 deletions

View File

@@ -0,0 +1,104 @@
简体中文 | [English](README_en.md)
# S2ANet
## 内容
- [简介](#简介)
- [模型库](#模型库)
- [使用说明](#使用说明)
- [预测部署](#预测部署)
- [引用](#引用)
## 简介
[S2ANet](https://arxiv.org/pdf/2008.09397.pdf)是用于检测旋转框的模型.
## 模型库
| 模型 | Conv类型 | mAP | 学习率策略 | 角度表示 | 数据增广 | GPU数目 | 每GPU图片数目 | 模型下载 | 配置文件 |
|:---:|:------:|:----:|:---------:|:-----:|:--------:|:-----:|:------------:|:-------:|:------:|
| S2ANet | Conv | 71.45 | 2x | le135 | - | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/s2anet_conv_2x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/rotate/s2anet/s2anet_conv_2x_dota.yml) |
| S2ANet | AlignConv | 73.84 | 2x | le135 | - | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/s2anet_alignconv_2x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/rotate/s2anet/s2anet_alignconv_2x_dota.yml) |
**注意:**
- 如果**GPU卡数**或者**batch size**发生了改变,你需要按照公式 **lr<sub>new</sub> = lr<sub>default</sub> * (batch_size<sub>new</sub> * GPU_number<sub>new</sub>) / (batch_size<sub>default</sub> * GPU_number<sub>default</sub>)** 调整学习率。
- 模型库中的模型默认使用单尺度训练单尺度测试。如果数据增广一栏标明MS意味着使用多尺度训练和多尺度测试。如果数据增广一栏标明RR意味着使用RandomRotate数据增广进行训练。
- 这里使用`multiclass_nms`与原作者使用nms略有不同。
## 使用说明
参考[数据准备](../README.md#数据准备)准备数据。
### 1. 训练
GPU单卡训练
```bash
export CUDA_VISIBLE_DEVICES=0
python tools/train.py -c configs/rotate/s2anet/s2anet_1x_spine.yml
```
GPU多卡训练
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/rotate/s2anet/s2anet_1x_spine.yml
```
可以通过`--eval`开启边训练边测试。
### 2. 评估
```bash
python tools/eval.py -c configs/rotate/s2anet/s2anet_1x_spine.yml -o weights=output/s2anet_1x_spine/model_final.pdparams
# 使用提供训练好的模型评估
python tools/eval.py -c configs/rotate/s2anet/s2anet_1x_spine.yml -o weights=https://paddledet.bj.bcebos.com/models/s2anet_1x_spine.pdparams
```
### 3. 预测
执行如下命令,会将图像预测结果保存到`output`文件夹下。
```bash
python tools/infer.py -c configs/rotate/s2anet/s2anet_1x_spine.yml -o weights=output/s2anet_1x_spine/model_final.pdparams --infer_img=demo/39006.jpg --draw_threshold=0.3
```
使用提供训练好的模型预测:
```bash
python tools/infer.py -c configs/rotate/s2anet/s2anet_1x_spine.yml -o weights=https://paddledet.bj.bcebos.com/models/s2anet_1x_spine.pdparams --infer_img=demo/39006.jpg --draw_threshold=0.3
```
### 4. DOTA数据评估
执行如下命令,会在`output`文件夹下将每个图像预测结果保存到同文件夹名的txt文本中。
```
python tools/infer.py -c configs/rotate/s2anet/s2anet_alignconv_2x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/s2anet_alignconv_2x_dota.pdparams --infer_dir=/path/to/test/images --output_dir=output --visualize=False --save_results=True
```
参考[DOTA Task](https://captain-whu.github.io/DOTA/tasks.html), 评估DOTA数据集需要生成一个包含所有检测结果的zip文件每一类的检测结果储存在一个txt文件中txt文件中每行格式为`image_name score x1 y1 x2 y2 x3 y3 x4 y4`。将生成的zip文件提交到[DOTA Evaluation](https://captain-whu.github.io/DOTA/evaluation.html)的Task1进行评估。你可以执行以下命令生成评估文件
```
python configs/rotate/tools/generate_result.py --pred_txt_dir=output/ --output_dir=submit/ --data_type=dota10
zip -r submit.zip submit
```
## 预测部署
Paddle中`multiclass_nms`算子的输入支持四边形输入因此部署时可以不需要依赖旋转框IOU计算算子。
部署教程请参考[预测部署](../../../deploy/README.md)
## 引用
```
@article{han2021align,
author={J. {Han} and J. {Ding} and J. {Li} and G. -S. {Xia}},
journal={IEEE Transactions on Geoscience and Remote Sensing},
title={Align Deep Features for Oriented Object Detection},
year={2021},
pages={1-11},
doi={10.1109/TGRS.2021.3062048}}
@inproceedings{xia2018dota,
title={DOTA: A large-scale dataset for object detection in aerial images},
author={Xia, Gui-Song and Bai, Xiang and Ding, Jian and Zhu, Zhen and Belongie, Serge and Luo, Jiebo and Datcu, Mihai and Pelillo, Marcello and Zhang, Liangpei},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
pages={3974--3983},
year={2018}
}
```

View File

@@ -0,0 +1,102 @@
English | [简体中文](README.md)
# S2ANet
## Content
- [Introduction](#Introduction)
- [Model Zoo](#Model-Zoo)
- [Getting Start](#Getting-Start)
- [Deployment](#Deployment)
- [Citations](#Citations)
## Introduction
[S2ANet](https://arxiv.org/pdf/2008.09397.pdf) is used to detect rotated objects.
## Model Zoo
| Model | Conv Type | mAP | Lr Scheduler | Angle | Aug | GPU Number | images/GPU | download | config |
|:---:|:------:|:----:|:---------:|:-----:|:--------:|:-----:|:------------:|:-------:|:------:|
| S2ANet | Conv | 71.45 | 2x | le135 | - | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/s2anet_conv_2x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/rotate/s2anet/s2anet_conv_2x_dota.yml) |
| S2ANet | AlignConv | 73.84 | 2x | le135 | - | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/s2anet_alignconv_2x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/rotate/s2anet/s2anet_alignconv_2x_dota.yml) |
**Notes:**
- if **GPU number** or **mini-batch size** is changed, **learning rate** should be adjusted according to the formula **lr<sub>new</sub> = lr<sub>default</sub> * (batch_size<sub>new</sub> * GPU_number<sub>new</sub>) / (batch_size<sub>default</sub> * GPU_number<sub>default</sub>)**.
- Models in model zoo is trained and tested with single scale by default. If `MS` is indicated in the data augmentation column, it means that multi-scale training and multi-scale testing are used. If `RR` is indicated in the data augmentation column, it means that RandomRotate data augmentation is used for training.
- `multiclass_nms` is used here, which is slightly different from the original author's use of NMS.
## Getting Start
Refer to [Data-Preparation](../README_en.md#Data-Preparation) to prepare data.
### 1. Train
Single GPU Training
```bash
export CUDA_VISIBLE_DEVICES=0
python tools/train.py -c configs/rotate/s2anet/s2anet_1x_spine.yml
```
Multiple GPUs Training
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/rotate/s2anet/s2anet_1x_spine.yml
```
You can use `--eval`to enable train-by-test.
### 2. Evaluation
```bash
python tools/eval.py -c configs/rotate/s2anet/s2anet_1x_spine.yml -o weights=output/s2anet_1x_spine/model_final.pdparams
# Use a trained model to evaluate
python tools/eval.py -c configs/rotate/s2anet/s2anet_1x_spine.yml -o weights=https://paddledet.bj.bcebos.com/models/s2anet_1x_spine.pdparams
```
### 3. Prediction
Executing the following command will save the image prediction results to the `output` folder.
```bash
python tools/infer.py -c configs/rotate/s2anet/s2anet_1x_spine.yml -o weights=output/s2anet_1x_spine/model_final.pdparams --infer_img=demo/39006.jpg --draw_threshold=0.3
```
Prediction using models that provide training:
```bash
python tools/infer.py -c configs/rotate/s2anet/s2anet_1x_spine.yml -o weights=https://paddledet.bj.bcebos.com/models/s2anet_1x_spine.pdparams --infer_img=demo/39006.jpg --draw_threshold=0.3
```
### 4. DOTA Data evaluation
Execute the following command, will save each image prediction result in `output` folder txt text with the same folder name.
```
python tools/infer.py -c configs/rotate/s2anet/s2anet_alignconv_2x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/s2anet_alignconv_2x_dota.pdparams --infer_dir=/path/to/test/images --output_dir=output --visualize=False --save_results=True
```
Refering to [DOTA Task](https://captain-whu.github.io/DOTA/tasks.html), You need to submit a zip file containing results for all test images for evaluation. The detection results of each category are stored in a txt file, each line of which is in the following format
`image_id score x1 y1 x2 y2 x3 y3 x4 y4`. To evaluate, you should submit the generated zip file to the Task1 of [DOTA Evaluation](https://captain-whu.github.io/DOTA/evaluation.html). You can execute the following command to generate the file
```
python configs/rotate/tools/generate_result.py --pred_txt_dir=output/ --output_dir=submit/ --data_type=dota10
zip -r submit.zip submit
```
## Deployment
The inputs of the `multiclass_nms` operator in Paddle support quadrilateral inputs, so deployment can be done without relying on the rotating frame IOU operator.
Please refer to the deployment tutorial[Predict deployment](../../../deploy/README_en.md)
## Citations
```
@article{han2021align,
author={J. {Han} and J. {Ding} and J. {Li} and G. -S. {Xia}},
journal={IEEE Transactions on Geoscience and Remote Sensing},
title={Align Deep Features for Oriented Object Detection},
year={2021},
pages={1-11},
doi={10.1109/TGRS.2021.3062048}}
@inproceedings{xia2018dota,
title={DOTA: A large-scale dataset for object detection in aerial images},
author={Xia, Gui-Song and Bai, Xiang and Ding, Jian and Zhu, Zhen and Belongie, Serge and Luo, Jiebo and Datcu, Mihai and Pelillo, Marcello and Zhang, Liangpei},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
pages={3974--3983},
year={2018}
}
```

View File

@@ -0,0 +1,52 @@
architecture: S2ANet
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
weights: output/s2anet_r50_fpn_1x_dota/model_final.pdparams
# Model Achitecture
S2ANet:
backbone: ResNet
neck: FPN
head: S2ANetHead
ResNet:
depth: 50
variant: d
norm_type: bn
return_idx: [1,2,3]
num_stages: 4
FPN:
in_channels: [256, 512, 1024]
out_channel: 256
spatial_scales: [0.25, 0.125, 0.0625]
has_extra_convs: True
extra_stage: 2
relu_before_extra_convs: False
S2ANetHead:
anchor_strides: [8, 16, 32, 64, 128]
anchor_scales: [4]
anchor_ratios: [1.0]
anchor_assign: RBoxAssigner
stacked_convs: 2
feat_in: 256
feat_out: 256
align_conv_type: 'AlignConv' # AlignConv Conv
align_conv_size: 3
use_sigmoid_cls: True
reg_loss_weight: [1.0, 1.0, 1.0, 1.0, 1.1]
cls_loss_weight: [1.1, 1.05]
nms_pre: 2000
nms:
name: MultiClassNMS
keep_top_k: -1
score_threshold: 0.05
nms_threshold: 0.1
normalized: False
RBoxAssigner:
pos_iou_thr: 0.5
neg_iou_thr: 0.4
min_iou_thr: 0.0
ignore_iof_thr: -2

View File

@@ -0,0 +1,20 @@
epoch: 12
LearningRate:
base_lr: 0.01
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [7, 10]
- !LinearWarmup
start_factor: 0.3333333333333333
steps: 500
OptimizerBuilder:
optimizer:
momentum: 0.9
type: Momentum
regularizer:
factor: 0.0001
type: L2
clip_grad_by_norm: 35

View File

@@ -0,0 +1,20 @@
epoch: 24
LearningRate:
base_lr: 0.01
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [14, 20]
- !LinearWarmup
start_factor: 0.3333333333333333
steps: 1000
OptimizerBuilder:
optimizer:
momentum: 0.9
type: Momentum
regularizer:
factor: 0.0001
type: L2
clip_grad_by_norm: 35

View File

@@ -0,0 +1,44 @@
worker_num: 4
TrainReader:
sample_transforms:
- Decode: {}
- Poly2Array: {}
- RandomRFlip: {}
- RResize: {target_size: [1024, 1024], keep_ratio: True, interp: 2}
- Poly2RBox: {rbox_type: 'le135'}
batch_transforms:
- NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
- Permute: {}
- PadRGT: {}
- PadBatch: {pad_to_stride: 32}
batch_size: 2
shuffle: true
drop_last: true
EvalReader:
sample_transforms:
- Decode: {}
- Poly2Array: {}
- RResize: {target_size: [1024, 1024], keep_ratio: True, interp: 2}
- NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 2
shuffle: false
drop_last: false
collate_batch: false
TestReader:
sample_transforms:
- Decode: {}
- Resize: {interp: 2, target_size: [1024, 1024], keep_ratio: True}
- NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 1
shuffle: false
drop_last: false

View File

@@ -0,0 +1,25 @@
_BASE_: [
'../../datasets/spine_coco.yml',
'../../runtime.yml',
'_base_/s2anet_optimizer_1x.yml',
'_base_/s2anet.yml',
'_base_/s2anet_reader.yml',
]
weights: output/s2anet_1x_spine/model_final
pretrain_weights: https://paddledet.bj.bcebos.com/models/s2anet_alignconv_2x_dota.pdparams
# for 4 card
LearningRate:
base_lr: 0.01
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [7, 10]
- !LinearWarmup
start_factor: 0.3333333333333333
epochs: 5
S2ANetHead:
reg_loss_weight: [1.0, 1.0, 1.0, 1.0, 1.05]
cls_loss_weight: [1.05, 1.0]

View File

@@ -0,0 +1,10 @@
_BASE_: [
'../../datasets/dota.yml',
'../../runtime.yml',
'_base_/s2anet_optimizer_2x.yml',
'_base_/s2anet.yml',
'_base_/s2anet_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
weights: output/s2anet_alignconv_2x_dota/model_final

View File

@@ -0,0 +1,19 @@
_BASE_: [
'../../datasets/dota.yml',
'../../runtime.yml',
'_base_/s2anet_optimizer_2x.yml',
'_base_/s2anet.yml',
'_base_/s2anet_reader.yml',
]
weights: output/s2anet_conv_1x_dota/model_final
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
ResNet:
depth: 50
variant: b
norm_type: bn
return_idx: [1,2,3]
num_stages: 4
S2ANetHead:
align_conv_type: 'Conv'