Move paddle_detection

2024-09-24 17:02:56 +08:00
parent 90a6d5ec75
commit 3438cf6e0e
2025 changed files with 11 additions and 11 deletions

View File

@@ -0,0 +1,40 @@
# Generalized Focal Loss Model (GFL)
## Introduction
We reproduce the object detection results of the papers [Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection](https://arxiv.org/abs/2006.04388) and [Generalized Focal Loss V2](https://arxiv.org/pdf/2011.12885.pdf). We also use a better-performing pre-trained model and the ResNet-vd structure to improve mAP.
## Model Zoo
| Backbone | Model | Batch size/GPU | LR schedule | FPS | Box AP | Download | Log | Config |
| :-------------- | :------------- | :-----: | :-----: | :------------: | :-----: | :-----: | :-----: | :-----: |
| ResNet50 | GFL | 2 | 1x | ---- | 41.0 | [model](https://paddledet.bj.bcebos.com/models/gfl_r50_fpn_1x_coco.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_gfl_r50_fpn_1x_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/gfl/gfl_r50_fpn_1x_coco.yml) |
| ResNet50 | GFL + [CWD](../slim/README.md) | 2 | 2x | ---- | 44.0 | [model](https://paddledet.bj.bcebos.com/models/gfl_r50_fpn_2x_coco_cwd.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_gfl_r50_fpn_2x_coco_cwd.log) | [config1](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/gfl/gfl_r50_fpn_1x_coco.yml), [config2](../slim/distill/gfl_r101vd_fpn_coco_distill_cwd.yml) |
| ResNet101-vd | GFL | 2 | 2x | ---- | 46.8 | [model](https://paddledet.bj.bcebos.com/models/gfl_r101vd_fpn_mstrain_2x_coco.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_gfl_r101vd_fpn_mstrain_2x_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/gfl/gfl_r101vd_fpn_mstrain_2x_coco.yml) |
| ResNet34-vd | GFL | 2 | 1x | ---- | 40.8 | [model](https://paddledet.bj.bcebos.com/models/gfl_r34vd_1x_coco.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_gfl_r34vd_1x_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/gfl/gfl_r34vd_1x_coco.yml) |
| ResNet18-vd | GFL | 2 | 1x | ---- | 36.6 | [model](https://paddledet.bj.bcebos.com/models/gfl_r18vd_1x_coco.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_gfl_r18vd_1x_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/gfl/gfl_r18vd_1x_coco.yml) |
| ResNet18-vd | GFL + [LD](../slim/README.md) | 2 | 1x | ---- | 38.2 | [model](https://bj.bcebos.com/v1/paddledet/models/gfl_slim_ld_r18vd_1x_coco.pdparams) | [log](https://bj.bcebos.com/v1/paddledet/logs/train_gfl_slim_ld_r18vd_1x_coco.log) | [config1](./gfl_slim_ld_r18vd_1x_coco.yml), [config2](../slim/distill/gfl_ld_distill.yml) |
| ResNet50 | GFLv2 | 2 | 1x | ---- | 41.2 | [model](https://paddledet.bj.bcebos.com/models/gflv2_r50_fpn_1x_coco.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_gflv2_r50_fpn_1x_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/gfl/gflv2_r50_fpn_1x_coco.yml) |
**Notes:**
- GFL is trained on the COCO train2017 dataset with 8 GPUs and evaluated on val2017; results are reported as `mAP(IoU=0.5:0.95)`.
## Citations
```
@article{li2020generalized,
title={Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection},
author={Li, Xiang and Wang, Wenhai and Wu, Lijun and Chen, Shuo and Hu, Xiaolin and Li, Jun and Tang, Jinhui and Yang, Jian},
journal={arXiv preprint arXiv:2006.04388},
year={2020}
}
@article{li2020gflv2,
title={Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection},
author={Li, Xiang and Wang, Wenhai and Hu, Xiaolin and Li, Jun and Tang, Jinhui and Yang, Jian},
journal={arXiv preprint arXiv:2011.12885},
year={2020}
}
```

View File

@@ -0,0 +1,51 @@
architecture: GFL
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
GFL:
  backbone: ResNet
  neck: FPN
  head: GFLHead
ResNet:
  depth: 50
  variant: b
  norm_type: bn
  freeze_at: 0
  return_idx: [1,2,3]
  num_stages: 4
FPN:
  out_channel: 256
  spatial_scales: [0.125, 0.0625, 0.03125]
  extra_stage: 2
  has_extra_convs: true
  use_c5: false
GFLHead:
  conv_feat:
    name: FCOSFeat
    feat_in: 256
    feat_out: 256
    num_convs: 4
    norm_type: "gn"
    use_dcn: false
  fpn_stride: [8, 16, 32, 64, 128]
  prior_prob: 0.01
  reg_max: 16
  loss_class:
    name: QualityFocalLoss
    use_sigmoid: True
    beta: 2.0
    loss_weight: 1.0
  loss_dfl:
    name: DistributionFocalLoss
    loss_weight: 0.25
  loss_bbox:
    name: GIoULoss
    loss_weight: 2.0
  nms:
    name: MultiClassNMS
    nms_top_k: 1000
    keep_top_k: 100
    score_threshold: 0.025
    nms_threshold: 0.6
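
To make `reg_max: 16` and `beta: 2.0` more concrete, here is a minimal sketch of the two ideas as they are described in the GFL paper: each box side is predicted as a discrete distribution over `reg_max + 1` bins and decoded as its expectation, and Quality Focal Loss scales cross-entropy by `|y - sigma| ** beta` for a soft IoU target `y`. The function names are hypothetical and this is plain Python, not the PaddleDetection implementation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def expected_distance(logits, stride):
    """Decode one box side: expectation over bins 0..reg_max, scaled by the FPN stride."""
    probs = softmax(logits)
    return stride * sum(i * p for i, p in enumerate(probs))

def quality_focal_loss(sigma, y, beta=2.0):
    """QFL for one predicted score sigma in (0, 1) and a soft target y in [0, 1]."""
    eps = 1e-12  # guards log(0); an illustrative choice, not taken from the config
    ce = -((1.0 - y) * math.log(1.0 - sigma + eps) + y * math.log(sigma + eps))
    return abs(y - sigma) ** beta * ce

REG_MAX = 16
peaked = [0.0] * (REG_MAX + 1)
peaked[3] = 50.0                      # almost all probability mass on bin 3
uniform = [0.0] * (REG_MAX + 1)       # flat distribution, mean bin is 8

print(expected_distance(peaked, stride=8))
print(expected_distance(uniform, stride=8))
print(quality_focal_loss(0.9, 0.9), quality_focal_loss(0.1, 0.9))
```

The loss vanishes when the predicted score equals the IoU target and grows as the two diverge, which is what lets the classification branch carry localization quality.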

View File

@@ -0,0 +1,41 @@
worker_num: 2
TrainReader:
  sample_transforms:
  - Decode: {}
  - RandomFlip: {prob: 0.5}
  - Resize: {target_size: [800, 1333], keep_ratio: true, interp: 1}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  - Gt2GFLTarget:
      downsample_ratios: [8, 16, 32, 64, 128]
      grid_cell_scale: 8
  batch_size: 2
  shuffle: true
  drop_last: true
  use_shared_memory: True
EvalReader:
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 1, target_size: [800, 1333], keep_ratio: True}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 2
  shuffle: false
TestReader:
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 1, target_size: [800, 1333], keep_ratio: True}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 1
  shuffle: false
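
As a rough illustration of what `Resize` with `target_size: [800, 1333]`, `keep_ratio: true` followed by `PadBatch: {pad_to_stride: 32}` does to an image shape, the sketch below assumes the usual detection convention: scale so the short side reaches 800 unless that would push the long side past 1333, then pad each side up to a multiple of 32. The helper names and the rounding rule are assumptions, not the PaddleDetection operators themselves.

```python
import math

def resize_keep_ratio(height, width, target_short=800, target_long=1333):
    """Return the resized (height, width), preserving the aspect ratio."""
    short, long_ = min(height, width), max(height, width)
    # Use the smaller of the two scales so neither limit is exceeded.
    scale = min(target_short / short, target_long / long_)
    return int(height * scale + 0.5), int(width * scale + 0.5)

def pad_to_stride(height, width, stride=32):
    """Round each side up to the next multiple of the stride."""
    return (math.ceil(height / stride) * stride,
            math.ceil(width / stride) * stride)

h, w = resize_keep_ratio(480, 640)
print((h, w), pad_to_stride(h, w))
```

Padding to a multiple of 32 keeps the feature maps aligned down to the stride-32 backbone output.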

View File

@@ -0,0 +1,56 @@
architecture: GFL
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
GFL:
  backbone: ResNet
  neck: FPN
  head: GFLHead
ResNet:
  depth: 50
  variant: b
  norm_type: bn
  freeze_at: 0
  return_idx: [1,2,3]
  num_stages: 4
FPN:
  out_channel: 256
  spatial_scales: [0.125, 0.0625, 0.03125]
  extra_stage: 2
  has_extra_convs: true
  use_c5: false
GFLHead:
  conv_feat:
    name: FCOSFeat
    feat_in: 256
    feat_out: 256
    num_convs: 4
    norm_type: "gn"
    use_dcn: false
  fpn_stride: [8, 16, 32, 64, 128]
  prior_prob: 0.01
  reg_max: 16
  dgqp_module:
    name: DGQP
    reg_topk: 4
    reg_channels: 64
    add_mean: True
  loss_class:
    name: QualityFocalLoss
    use_sigmoid: False
    beta: 2.0
    loss_weight: 1.0
  loss_dfl:
    name: DistributionFocalLoss
    loss_weight: 0.25
  loss_bbox:
    name: GIoULoss
    loss_weight: 2.0
  nms:
    name: MultiClassNMS
    nms_top_k: 1000
    keep_top_k: 100
    score_threshold: 0.025
    nms_threshold: 0.6
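
The `dgqp_module` settings (`reg_topk: 4`, `add_mean: True`) correspond to the statistic the GFLv2 paper feeds to its quality predictor: for each of the four box sides, the top-k probabilities of the side's discrete distribution plus their mean. The sketch below builds only that input feature in plain Python; the function name is hypothetical, and the small fully connected network that maps the feature to a quality score (where `reg_channels: 64` would apply) is left out.

```python
def topk_statistics(side_probs, reg_topk=4, add_mean=True):
    """Build the quality-predictor input from four per-side distributions."""
    feature = []
    for probs in side_probs:
        top = sorted(probs, reverse=True)[:reg_topk]   # k largest probabilities
        feature.extend(top)
        if add_mean:
            feature.append(sum(top) / len(top))        # mean of the top-k values
    return feature

sharp = [0.9, 0.05, 0.03, 0.02] + [0.0] * 13   # confident side, 17 bins
flat = [1.0 / 17] * 17                         # ambiguous side, 17 bins
feature = topk_statistics([sharp, flat, sharp, flat])
print(len(feature), feature[:5])
```

A sharp distribution produces a large leading value, while a flat one produces uniformly small values, so the feature gives the head a cheap signal of how reliable the localization is.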

View File

@@ -0,0 +1,19 @@
epoch: 12
LearningRate:
  base_lr: 0.01
  schedulers:
  - !PiecewiseDecay
    gamma: 0.1
    milestones: [8, 11]
  - !LinearWarmup
    start_factor: 0.001
    steps: 500
OptimizerBuilder:
  optimizer:
    momentum: 0.9
    type: Momentum
  regularizer:
    factor: 0.0001
    type: L2
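
The schedule above can be read as follows, under two assumptions that are not stated in the config itself: `milestones` are counted in epochs, and warmup interpolates linearly from `base_lr * start_factor` to `base_lr` over the first `steps` iterations. The function and the `steps_per_epoch` value are hypothetical and only illustrate the arithmetic.

```python
def learning_rate(step, steps_per_epoch, base_lr=0.01, gamma=0.1,
                  milestones=(8, 11), warmup_steps=500, start_factor=0.001):
    """Learning rate at a global iteration index under the assumed semantics."""
    epoch = step // steps_per_epoch
    # Multiply by gamma once for every milestone epoch already reached.
    lr = base_lr * gamma ** sum(1 for m in milestones if epoch >= m)
    if step < warmup_steps:
        alpha = step / warmup_steps
        lr *= start_factor * (1.0 - alpha) + alpha   # linear ramp up to 1.0
    return lr

STEPS_PER_EPOCH = 1000   # hypothetical; depends on dataset size and total batch size
for step in (0, 250, 500, 8000, 11000):
    print(step, learning_rate(step, STEPS_PER_EPOCH))
```

With these defaults the rate starts near 1e-5, reaches 0.01 after 500 iterations, and drops by a factor of 10 at epochs 8 and 11.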

View File

@@ -0,0 +1,46 @@
_BASE_: [
  '../datasets/coco_detection.yml',
  '../runtime.yml',
  '_base_/gfl_r50_fpn.yml',
  '_base_/optimizer_1x.yml',
  '_base_/gfl_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet101_vd_pretrained.pdparams
weights: output/gfl_r101vd_fpn_mstrain_2x_coco/model_final
find_unused_parameters: True
use_ema: true
ema_decay: 0.9998
ResNet:
  depth: 101
  variant: d
  norm_type: bn
  freeze_at: 0
  return_idx: [1,2,3]
  num_stages: 4
epoch: 24
LearningRate:
  base_lr: 0.01
  schedulers:
  - !PiecewiseDecay
    gamma: 0.1
    milestones: [16, 22]
  - !LinearWarmup
    start_factor: 0.001
    steps: 500
TrainReader:
  sample_transforms:
  - Decode: {}
  - RandomResize: {target_size: [[480, 1333], [512, 1333], [544, 1333], [576, 1333], [608, 1333], [640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
  - RandomFlip: {prob: 0.5}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  - Gt2GFLTarget:
      downsample_ratios: [8, 16, 32, 64, 128]
      grid_cell_scale: 8
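
For `use_ema: true` with `ema_decay: 0.9998`, a minimal sketch of a plain exponential moving average over weights is shown below. It assumes the textbook update `ema = decay * ema + (1 - decay) * w`; whether the framework also applies bias correction or a decay warmup is not visible in this config, so treat the function as illustrative only.

```python
def ema_update(ema_weights, weights, decay=0.9998):
    """One EMA step over flat lists of weights (textbook form, assumed)."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, weights)]

ema = [1.0, -2.0]
current = [0.0, 0.0]
for _ in range(5000):          # roughly 1 / (1 - decay) steps
    ema = ema_update(ema, current)
print(ema)
```

After about `1 / (1 - decay)` steps, here 5000, the contribution of the initial value has decayed to roughly `1/e`, which gives a feel for how long the average remembers older weights.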

View File

@@ -0,0 +1,19 @@
_BASE_: [
  '../datasets/coco_detection.yml',
  '../runtime.yml',
  '_base_/gfl_r50_fpn.yml',
  '_base_/optimizer_1x.yml',
  '_base_/gfl_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet18_vd_pretrained.pdparams
weights: output/gfl_r18vd_1x_coco/model_final
find_unused_parameters: True
ResNet:
  depth: 18
  variant: d
  norm_type: bn
  freeze_at: 0
  return_idx: [1,2,3]
  num_stages: 4
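
A config like this one relies on `_BASE_` inheritance: the listed base files are loaded first, and keys in the current file override them. The sketch below shows one plausible reading of that behavior as a recursive dictionary merge. The `merge_config` helper is hypothetical and the exact merge rules of the real loader are an assumption here.

```python
def merge_config(base, override):
    """Recursively merge two dicts; values in `override` win on conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged

# Fragment of the base model config and of the ResNet18-vd variant above.
base = {"architecture": "GFL", "ResNet": {"depth": 50, "variant": "b", "num_stages": 4}}
child = {"weights": "output/gfl_r18vd_1x_coco/model_final",
         "ResNet": {"depth": 18, "variant": "d"}}

config = merge_config(base, child)
print(config["ResNet"])
```

Under this reading, the variant file only needs to state what differs from the ResNet50 base (backbone depth, variant, and weight paths), while the head, losses, and reader settings are inherited unchanged.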

View File

@@ -0,0 +1,19 @@
_BASE_: [
  '../datasets/coco_detection.yml',
  '../runtime.yml',
  '_base_/gfl_r50_fpn.yml',
  '_base_/optimizer_1x.yml',
  '_base_/gfl_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet34_vd_pretrained.pdparams
weights: output/gfl_r34vd_1x_coco/model_final
find_unused_parameters: True
ResNet:
  depth: 34
  variant: d
  norm_type: bn
  freeze_at: 0
  return_idx: [1,2,3]
  num_stages: 4

View File

@@ -0,0 +1,10 @@
_BASE_: [
  '../datasets/coco_detection.yml',
  '../runtime.yml',
  '_base_/gfl_r50_fpn.yml',
  '_base_/optimizer_1x.yml',
  '_base_/gfl_reader.yml',
]
weights: output/gfl_r50_fpn_1x_coco/model_final
find_unused_parameters: True

View File

@@ -0,0 +1,73 @@
_BASE_: [
  '../datasets/coco_detection.yml',
  '../runtime.yml',
  '_base_/optimizer_1x.yml',
  '_base_/gfl_reader.yml',
]
weights: output/gfl_r18vd_1x_coco/model_final
find_unused_parameters: True
architecture: GFL
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet18_vd_pretrained.pdparams
GFL:
  backbone: ResNet
  neck: FPN
  head: LDGFLHead
ResNet:
  depth: 18
  variant: d
  norm_type: bn
  freeze_at: 0
  return_idx: [1,2,3]
  num_stages: 4
FPN:
  out_channel: 256
  spatial_scales: [0.125, 0.0625, 0.03125]
  extra_stage: 2
  has_extra_convs: true
  use_c5: false
LDGFLHead: # new head
  conv_feat:
    name: FCOSFeat
    feat_in: 256
    feat_out: 256
    num_convs: 4
    norm_type: "gn"
    use_dcn: false
  fpn_stride: [8, 16, 32, 64, 128]
  prior_prob: 0.01
  reg_max: 16
  loss_class:
    name: QualityFocalLoss
    use_sigmoid: True
    beta: 2.0
    loss_weight: 1.0
  loss_dfl:
    name: DistributionFocalLoss
    loss_weight: 0.25
  loss_bbox:
    name: GIoULoss
    loss_weight: 2.0
  loss_ld:
    name: KnowledgeDistillationKLDivLoss
    loss_weight: 0.25
    T: 10
  loss_ld_vlr:
    name: KnowledgeDistillationKLDivLoss
    loss_weight: 0.25
    T: 10
  loss_kd:
    name: KnowledgeDistillationKLDivLoss
    loss_weight: 10
    T: 2
  nms:
    name: MultiClassNMS
    nms_top_k: 1000
    keep_top_k: 100
    score_threshold: 0.025
    nms_threshold: 0.6
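
The three `KnowledgeDistillationKLDivLoss` entries differ only in `loss_weight` and temperature `T`. As a hedged sketch of what a temperature-scaled KL distillation term typically computes, the code below softens teacher and student logits with `T`, takes KL(teacher || student), and rescales by `T * T` as in standard knowledge distillation. The function names, the sum reduction, and the `T * T` factor are assumptions, not a transcription of the PaddleDetection class.

```python
import math

def softmax_t(logits, temperature):
    """Softmax of logits divided by a temperature (higher T gives a softer output)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_kl_div(student_logits, teacher_logits, T=10.0, loss_weight=0.25):
    """Temperature-scaled KL(teacher || student) for one distribution."""
    p_teacher = softmax_t(teacher_logits, T)
    p_student = softmax_t(student_logits, T)
    kl = sum(pt * (math.log(pt) - math.log(ps))
             for pt, ps in zip(p_teacher, p_student))
    return loss_weight * kl * T * T   # T*T keeps gradient scale comparable across T

teacher = [0.5, 4.0, 1.0, -1.0]
student = [1.5, 2.0, 1.0, 0.0]
print(kd_kl_div(student, teacher), kd_kl_div(teacher, teacher))
```

Applied to the per-side bin logits of the GFL head, a term of this shape lets the ResNet18-vd student imitate the teacher's box distributions rather than only its final boxes.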

View File

@@ -0,0 +1,10 @@
_BASE_: [
  '../datasets/coco_detection.yml',
  '../runtime.yml',
  '_base_/gflv2_r50_fpn.yml',
  '_base_/optimizer_1x.yml',
  '_base_/gfl_reader.yml',
]
weights: output/gflv2_r50_fpn_1x_coco/model_final
find_unused_parameters: True