Move paddle_detection

2024-09-24 17:02:56 +08:00
parent 90a6d5ec75
commit 3438cf6e0e
2025 changed files with 11 additions and 11 deletions
--- a/services/paddle_services/paddle_detection/configs/gfl/README.md
+++ b/services/paddle_services/paddle_detection/configs/gfl/README.md
@@ -0,0 +1,40 @@
+# Generalized Focal Loss Model (GFL)
+
+## Introduction
+
+We reproduce the object detection results from the papers [Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection](https://arxiv.org/abs/2006.04388) and [Generalized Focal Loss V2](https://arxiv.org/pdf/2011.12885.pdf). We also use a better-performing pre-trained model and the ResNet-vd structure to improve mAP.
+
+## Model Zoo
+
+| Backbone        | Model      | Batch size/GPU | LR schedule | FPS | Box AP |                           Download                          | Config |
+| :-------------- | :------------- | :-----: | :-----: | :------------: | :-----: | :-----------------------------------------------------: | :-----: |
+| ResNet50    | GFL           |    2    |   1x      |     ----     |  41.0  | [model](https://paddledet.bj.bcebos.com/models/gfl_r50_fpn_1x_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_gfl_r50_fpn_1x_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/gfl/gfl_r50_fpn_1x_coco.yml) |
+| ResNet50    | GFL + [CWD](../slim/README.md) |    2    |   2x      |     ----     |  44.0  | [model](https://paddledet.bj.bcebos.com/models/gfl_r50_fpn_2x_coco_cwd.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_gfl_r50_fpn_2x_coco_cwd.log) | [config1](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/gfl/gfl_r50_fpn_1x_coco.yml), [config2](../slim/distill/gfl_r101vd_fpn_coco_distill_cwd.yml) |
+| ResNet101-vd   | GFL           |    2    |   2x      |     ----     |  46.8  | [model](https://paddledet.bj.bcebos.com/models/gfl_r101vd_fpn_mstrain_2x_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_gfl_r101vd_fpn_mstrain_2x_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/gfl/gfl_r101vd_fpn_mstrain_2x_coco.yml) |
+| ResNet34-vd    | GFL           |    2    |   1x      |     ----     |  40.8  | [model](https://paddledet.bj.bcebos.com/models/gfl_r34vd_1x_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_gfl_r34vd_1x_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/gfl/gfl_r34vd_1x_coco.yml) |
+| ResNet18-vd   | GFL           |    2    |   1x      |     ----     |  36.6  | [model](https://paddledet.bj.bcebos.com/models/gfl_r18vd_1x_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_gfl_r18vd_1x_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/gfl/gfl_r18vd_1x_coco.yml) |
+| ResNet18-vd   | GFL + [LD](../slim/README.md)    |    2    |   1x      |     ----     |  38.2  | [model](https://bj.bcebos.com/v1/paddledet/models/gfl_slim_ld_r18vd_1x_coco.pdparams) &#124; [log](https://bj.bcebos.com/v1/paddledet/logs/train_gfl_slim_ld_r18vd_1x_coco.log) | [config1](./gfl_slim_ld_r18vd_1x_coco.yml), [config2](../slim/distill/gfl_ld_distill.yml) |
+| ResNet50    | GFLv2       |    2    |   1x      |     ----     |  41.2  | [model](https://paddledet.bj.bcebos.com/models/gflv2_r50_fpn_1x_coco.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_gflv2_r50_fpn_1x_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/gfl/gflv2_r50_fpn_1x_coco.yml) |
+
+
+**Notes:**
+
+- GFL is trained on the COCO train2017 dataset with 8 GPUs and evaluated on val2017. Results are reported as `mAP(IoU=0.5:0.95)`.
+
+## Citations
+```
+@article{li2020generalized,
+  title={Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection},
+  author={Li, Xiang and Wang, Wenhai and Wu, Lijun and Chen, Shuo and Hu, Xiaolin and Li, Jun and Tang, Jinhui and Yang, Jian},
+  journal={arXiv preprint arXiv:2006.04388},
+  year={2020}
+}
+
+@article{li2020gflv2,
+  title={Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection},
+  author={Li, Xiang and Wang, Wenhai and Hu, Xiaolin and Li, Jun and Tang, Jinhui and Yang, Jian},
+  journal={arXiv preprint arXiv:2011.12885},
+  year={2020}
+}
+
+```
--- a/services/paddle_services/paddle_detection/configs/gfl/_base_/gfl_r50_fpn.yml
+++ b/services/paddle_services/paddle_detection/configs/gfl/_base_/gfl_r50_fpn.yml
@@ -0,0 +1,51 @@
+architecture: GFL
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
+
+GFL:
+  backbone: ResNet
+  neck: FPN
+  head: GFLHead
+
+ResNet:
+  depth: 50
+  variant: b
+  norm_type: bn
+  freeze_at: 0
+  return_idx: [1,2,3]
+  num_stages: 4
+
+FPN:
+  out_channel: 256
+  spatial_scales: [0.125, 0.0625, 0.03125]
+  extra_stage: 2
+  has_extra_convs: true
+  use_c5: false
+
+GFLHead:
+  conv_feat:
+    name: FCOSFeat
+    feat_in: 256
+    feat_out: 256
+    num_convs: 4
+    norm_type: "gn"
+    use_dcn: false
+  fpn_stride: [8, 16, 32, 64, 128]
+  prior_prob: 0.01
+  reg_max: 16
+  loss_class:
+    name: QualityFocalLoss
+    use_sigmoid: True
+    beta: 2.0
+    loss_weight: 1.0
+  loss_dfl:
+    name: DistributionFocalLoss
+    loss_weight: 0.25
+  loss_bbox:
+    name: GIoULoss
+    loss_weight: 2.0
+  nms:
+    name: MultiClassNMS
+    nms_top_k: 1000
+    keep_top_k: 100
+    score_threshold: 0.025
+    nms_threshold: 0.6
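Review note on `reg_max: 16` and `fpn_stride` in the head config above. In the GFL paper, each box side is predicted as a discrete distribution over the bins `0..reg_max`, and the side's distance is the expectation of that distribution. The sketch below illustrates that idea in plain Python. The function name `expected_distance` and the multiplication by the stride are assumptions made for illustration; the actual `GFLHead` code in PaddleDetection may organize this differently.

```python
import math


def expected_distance(logits, stride):
    """Decode one box side from its bin logits (hypothetical helper).

    `logits` has reg_max + 1 entries (17 for reg_max: 16). The result is the
    softmax-weighted mean bin index, scaled by the FPN level's stride
    (scaling by stride is an assumption of this sketch).
    """
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    mean_bin = sum(i * p for i, p in enumerate(probs))
    return stride * mean_bin


# A flat distribution over 17 bins has mean bin 8, so at stride 8 the
# decoded distance is 64 pixels.
flat = [0.0] * 17
print(expected_distance(flat, stride=8))
```

A sharply peaked distribution decodes to roughly `peak_bin * stride`, which is why `reg_max` bounds the largest distance a given FPN level can represent.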
--- a/services/paddle_services/paddle_detection/configs/gfl/_base_/gfl_reader.yml
+++ b/services/paddle_services/paddle_detection/configs/gfl/_base_/gfl_reader.yml
@@ -0,0 +1,41 @@
+worker_num: 2
+TrainReader:
+  sample_transforms:
+  - Decode: {}
+  - RandomFlip: {prob: 0.5}
+  - Resize: {target_size: [800, 1333], keep_ratio: true, interp: 1}
+  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+  - Permute: {}
+  batch_transforms:
+  - PadBatch: {pad_to_stride: 32}
+  - Gt2GFLTarget:
+      downsample_ratios: [8, 16, 32, 64, 128]
+      grid_cell_scale: 8
+  batch_size: 2
+  shuffle: true
+  drop_last: true
+  use_shared_memory: True
+
+
+EvalReader:
+  sample_transforms:
+  - Decode: {}
+  - Resize: {interp: 1, target_size: [800, 1333], keep_ratio: True}
+  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+  - Permute: {}
+  batch_transforms:
+  - PadBatch: {pad_to_stride: 32}
+  batch_size: 2
+  shuffle: false
+
+
+TestReader:
+  sample_transforms:
+  - Decode: {}
+  - Resize: {interp: 1, target_size: [800, 1333], keep_ratio: True}
+  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+  - Permute: {}
+  batch_transforms:
+  - PadBatch: {pad_to_stride: 32}
+  batch_size: 1
+  shuffle: false
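Review note on `Resize: {target_size: [800, 1333], keep_ratio: true}` followed by `PadBatch: {pad_to_stride: 32}` in the reader config above. A minimal sketch of the resulting image shapes, assuming the usual keep-ratio rule (scale so the short side reaches 800 unless the long side would exceed 1333) and padding up to the next multiple of the stride. The helper names are hypothetical, and the exact rounding in PaddleDetection's transforms may differ by a pixel.

```python
import math


def resized_shape(height, width, target_size=(800, 1333)):
    """Shape after a keep-ratio resize (hypothetical helper)."""
    short_target, long_target = min(target_size), max(target_size)
    scale = min(short_target / min(height, width),
                long_target / max(height, width))
    return round(height * scale), round(width * scale)


def padded_shape(height, width, stride=32):
    """Shape after padding each side up to a multiple of `stride`."""
    return (math.ceil(height / stride) * stride,
            math.ceil(width / stride) * stride)


h, w = resized_shape(480, 640)
print((h, w))              # (800, 1067)
print(padded_shape(h, w))  # (800, 1088)
```

Padding to a multiple of 32 keeps the backbone feature maps aligned at strides 8, 16, and 32; the two extra FPN levels (strides 64 and 128) are produced by further downsampling.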
--- a/services/paddle_services/paddle_detection/configs/gfl/_base_/gflv2_r50_fpn.yml
+++ b/services/paddle_services/paddle_detection/configs/gfl/_base_/gflv2_r50_fpn.yml
@@ -0,0 +1,56 @@
+architecture: GFL
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
+
+GFL:
+  backbone: ResNet
+  neck: FPN
+  head: GFLHead
+
+ResNet:
+  depth: 50
+  variant: b
+  norm_type: bn
+  freeze_at: 0
+  return_idx: [1,2,3]
+  num_stages: 4
+
+FPN:
+  out_channel: 256
+  spatial_scales: [0.125, 0.0625, 0.03125]
+  extra_stage: 2
+  has_extra_convs: true
+  use_c5: false
+
+GFLHead:
+  conv_feat:
+    name: FCOSFeat
+    feat_in: 256
+    feat_out: 256
+    num_convs: 4
+    norm_type: "gn"
+    use_dcn: false
+  fpn_stride: [8, 16, 32, 64, 128]
+  prior_prob: 0.01
+  reg_max: 16
+  dgqp_module:
+    name: DGQP
+    reg_topk: 4
+    reg_channels: 64
+    add_mean: True
+  loss_class:
+    name: QualityFocalLoss
+    use_sigmoid: False
+    beta: 2.0
+    loss_weight: 1.0
+  loss_dfl:
+    name: DistributionFocalLoss
+    loss_weight: 0.25
+  loss_bbox:
+    name: GIoULoss
+    loss_weight: 2.0
+  nms:
+    name: MultiClassNMS
+    nms_top_k: 1000
+    keep_top_k: 100
+    score_threshold: 0.025
+    nms_threshold: 0.6
--- a/services/paddle_services/paddle_detection/configs/gfl/_base_/optimizer_1x.yml
+++ b/services/paddle_services/paddle_detection/configs/gfl/_base_/optimizer_1x.yml
@@ -0,0 +1,19 @@
+epoch: 12
+
+LearningRate:
+  base_lr: 0.01
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [8, 11]
+  - !LinearWarmup
+    start_factor: 0.001
+    steps: 500
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.9
+    type: Momentum
+  regularizer:
+    factor: 0.0001
+    type: L2
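Review note on the schedule in `optimizer_1x.yml` above. The sketch below shows how a linear warmup over the first 500 steps can combine with a piecewise decay at epochs 8 and 11. It assumes warmup interpolates the factor linearly from `start_factor` to 1 and that milestones are counted in epochs. `steps_per_epoch` is a hypothetical parameter, since it depends on the dataset size and total batch size, and the actual PaddleDetection scheduler may differ in details.

```python
def learning_rate(step, steps_per_epoch, base_lr=0.01, gamma=0.1,
                  milestones=(8, 11), start_factor=0.001, warmup_steps=500):
    """Learning rate at a global training step (illustrative sketch)."""
    if step < warmup_steps:
        # Linear warmup: the factor goes from start_factor to 1.0.
        alpha = step / warmup_steps
        return base_lr * (start_factor * (1 - alpha) + alpha)
    # Piecewise decay: multiply by gamma once per milestone already reached.
    epoch = step // steps_per_epoch
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** passed


# Hypothetical run with 1000 steps per epoch.
for step in (0, 250, 500, 8000, 11000):
    print(step, learning_rate(step, steps_per_epoch=1000))
```

With these settings the rate starts near `1e-5`, reaches `0.01` after warmup, and drops to about `0.001` and `0.0001` at epochs 8 and 11.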
--- a/services/paddle_services/paddle_detection/configs/gfl/gfl_r101vd_fpn_mstrain_2x_coco.yml
+++ b/services/paddle_services/paddle_detection/configs/gfl/gfl_r101vd_fpn_mstrain_2x_coco.yml
@@ -0,0 +1,46 @@
+_BASE_: [
+  '../datasets/coco_detection.yml',
+  '../runtime.yml',
+  '_base_/gfl_r50_fpn.yml',
+  '_base_/optimizer_1x.yml',
+  '_base_/gfl_reader.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet101_vd_pretrained.pdparams
+weights: output/gfl_r101vd_fpn_mstrain_2x_coco/model_final
+find_unused_parameters: True
+use_ema: true
+ema_decay: 0.9998
+
+ResNet:
+  depth: 101
+  variant: d
+  norm_type: bn
+  freeze_at: 0
+  return_idx: [1,2,3]
+  num_stages: 4
+
+epoch: 24
+
+LearningRate:
+  base_lr: 0.01
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [16, 22]
+  - !LinearWarmup
+    start_factor: 0.001
+    steps: 500
+
+TrainReader:
+  sample_transforms:
+  - Decode: {}
+  - RandomResize: {target_size: [[480, 1333], [512, 1333], [544, 1333], [576, 1333], [608, 1333], [640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
+  - RandomFlip: {prob: 0.5}
+  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+  - Permute: {}
+  batch_transforms:
+  - PadBatch: {pad_to_stride: 32}
+  - Gt2GFLTarget:
+      downsample_ratios: [8, 16, 32, 64, 128]
+      grid_cell_scale: 8
--- a/services/paddle_services/paddle_detection/configs/gfl/gfl_r18vd_1x_coco.yml
+++ b/services/paddle_services/paddle_detection/configs/gfl/gfl_r18vd_1x_coco.yml
@@ -0,0 +1,19 @@
+_BASE_: [
+  '../datasets/coco_detection.yml',
+  '../runtime.yml',
+  '_base_/gfl_r50_fpn.yml',
+  '_base_/optimizer_1x.yml',
+  '_base_/gfl_reader.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet18_vd_pretrained.pdparams
+weights: output/gfl_r18vd_1x_coco/model_final
+find_unused_parameters: True
+
+ResNet:
+  depth: 18
+  variant: d
+  norm_type: bn
+  freeze_at: 0
+  return_idx: [1,2,3]
+  num_stages: 4
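Review note on how this file relates to its `_BASE_` entries. The `ResNet` block above restates the base block with `depth: 18` and `variant: d`, while `FPN` and `GFLHead` come from `_base_/gfl_r50_fpn.yml` unchanged. The sketch below illustrates that override behavior under the assumption of a recursive dictionary merge in which the child file wins. `merge_config` here is a hypothetical stand-in, not PaddleDetection's actual loader.

```python
def merge_config(base, override):
    """Recursively merge `override` into a copy of `base` (sketch)."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


# Values taken from the configs in this commit.
base = {
    "architecture": "GFL",
    "ResNet": {"depth": 50, "variant": "b", "norm_type": "bn"},
    "FPN": {"out_channel": 256, "extra_stage": 2},
}
child = {"ResNet": {"depth": 18, "variant": "d"}}

merged = merge_config(base, child)
print(merged["ResNet"])  # {'depth': 18, 'variant': 'd', 'norm_type': 'bn'}
print(merged["FPN"])     # {'out_channel': 256, 'extra_stage': 2}
```

Under this assumption, the child config only needs to list the keys it changes; repeating unchanged keys such as `norm_type` is harmless.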
--- a/services/paddle_services/paddle_detection/configs/gfl/gfl_r34vd_1x_coco.yml
+++ b/services/paddle_services/paddle_detection/configs/gfl/gfl_r34vd_1x_coco.yml
@@ -0,0 +1,19 @@
+_BASE_: [
+  '../datasets/coco_detection.yml',
+  '../runtime.yml',
+  '_base_/gfl_r50_fpn.yml',
+  '_base_/optimizer_1x.yml',
+  '_base_/gfl_reader.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet34_vd_pretrained.pdparams
+weights: output/gfl_r34vd_1x_coco/model_final
+find_unused_parameters: True
+
+ResNet:
+  depth: 34
+  variant: d
+  norm_type: bn
+  freeze_at: 0
+  return_idx: [1,2,3]
+  num_stages: 4
--- a/services/paddle_services/paddle_detection/configs/gfl/gfl_r50_fpn_1x_coco.yml
+++ b/services/paddle_services/paddle_detection/configs/gfl/gfl_r50_fpn_1x_coco.yml
@@ -0,0 +1,10 @@
+_BASE_: [
+  '../datasets/coco_detection.yml',
+  '../runtime.yml',
+  '_base_/gfl_r50_fpn.yml',
+  '_base_/optimizer_1x.yml',
+  '_base_/gfl_reader.yml',
+]
+
+weights: output/gfl_r50_fpn_1x_coco/model_final
+find_unused_parameters: True
--- a/services/paddle_services/paddle_detection/configs/gfl/gfl_slim_ld_r18vd_1x_coco.yml
+++ b/services/paddle_services/paddle_detection/configs/gfl/gfl_slim_ld_r18vd_1x_coco.yml
@@ -0,0 +1,73 @@
+_BASE_: [
+  '../datasets/coco_detection.yml',
+  '../runtime.yml',
+  '_base_/optimizer_1x.yml',
+  '_base_/gfl_reader.yml',
+]
+
+weights: output/gfl_r18vd_1x_coco/model_final
+find_unused_parameters: True
+
+architecture: GFL
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet18_vd_pretrained.pdparams
+
+GFL:
+  backbone: ResNet
+  neck: FPN
+  head: LDGFLHead
+
+ResNet:
+  depth: 18
+  variant: d
+  norm_type: bn
+  freeze_at: 0
+  return_idx: [1,2,3]
+  num_stages: 4
+
+FPN:
+  out_channel: 256
+  spatial_scales: [0.125, 0.0625, 0.03125]
+  extra_stage: 2
+  has_extra_convs: true
+  use_c5: false
+
+LDGFLHead:   # new head
+  conv_feat:
+    name: FCOSFeat
+    feat_in: 256
+    feat_out: 256
+    num_convs: 4
+    norm_type: "gn"
+    use_dcn: false
+  fpn_stride: [8, 16, 32, 64, 128]
+  prior_prob: 0.01
+  reg_max: 16
+  loss_class:
+    name: QualityFocalLoss
+    use_sigmoid: True
+    beta: 2.0
+    loss_weight: 1.0
+  loss_dfl:
+    name: DistributionFocalLoss
+    loss_weight: 0.25
+  loss_bbox:
+    name: GIoULoss
+    loss_weight: 2.0
+  loss_ld:
+    name: KnowledgeDistillationKLDivLoss
+    loss_weight: 0.25
+    T: 10
+  loss_ld_vlr:
+    name: KnowledgeDistillationKLDivLoss
+    loss_weight: 0.25
+    T: 10
+  loss_kd:
+    name: KnowledgeDistillationKLDivLoss
+    loss_weight: 10
+    T: 2
+  nms:
+    name: MultiClassNMS
+    nms_top_k: 1000
+    keep_top_k: 100
+    score_threshold: 0.025
+    nms_threshold: 0.6
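Review note on the three `KnowledgeDistillationKLDivLoss` entries above, which differ only in `loss_weight` and temperature `T`. The sketch below shows the common temperature-scaled KL divergence used in knowledge distillation: both logit vectors are softened by `T`, and the result is multiplied by `T^2`. This is the textbook formulation and an assumption about what the loss computes; the reduction and weighting in the actual PaddleDetection class may differ.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_kl_div(student_logits, teacher_logits, temperature=10.0):
    """KL(teacher || student) on softened distributions, times T^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(p_teacher, p_student) if t > 0)
    return kl * temperature ** 2


teacher = [0.5, 2.0, 4.0, 1.0]
student = [1.0, 1.5, 2.5, 1.0]
print(kd_kl_div(student, teacher, temperature=10.0))
print(kd_kl_div(teacher, teacher, temperature=10.0))  # 0.0
```

A higher temperature such as `T: 10` flattens both distributions so that the relative shape of the teacher's output matters more than its peak.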
--- a/services/paddle_services/paddle_detection/configs/gfl/gflv2_r50_fpn_1x_coco.yml
+++ b/services/paddle_services/paddle_detection/configs/gfl/gflv2_r50_fpn_1x_coco.yml
@@ -0,0 +1,10 @@
+_BASE_: [
+  '../datasets/coco_detection.yml',
+  '../runtime.yml',
+  '_base_/gflv2_r50_fpn.yml',
+  '_base_/optimizer_1x.yml',
+  '_base_/gfl_reader.yml',
+]
+
+weights: output/gflv2_r50_fpn_1x_coco/model_final
+find_unused_parameters: True