Replace the document detection model
67
paddle_detection/configs/sniper/README.md
Normal file
@@ -0,0 +1,67 @@
English | [简体中文](README_cn.md)

# SNIPER: Efficient Multi-Scale Training

## Model Zoo

| SNIPER | GPUs | Images/GPU | Model | Dataset | Schedule | Box AP | Download | Config |
| :---------------- | :-------------------: | :------------------: | :-----: | :-----: | :------------: | :-----: | :-----------------------------------------------------: | :-----: |
| w/o | 4 | 1 | ResNet-r50-FPN | [VisDrone](https://github.com/VisDrone/VisDrone-Dataset) | 1x | 23.3 | [Download Link](https://bj.bcebos.com/v1/paddledet/models/faster_rcnn_r50_fpn_1x_visdrone.pdparams) | [config](./faster_rcnn_r50_fpn_1x_visdrone.yml) |
| w/ | 4 | 1 | ResNet-r50-FPN | [VisDrone](https://github.com/VisDrone/VisDrone-Dataset) | 1x | 29.7 | [Download Link](https://bj.bcebos.com/v1/paddledet/models/faster_rcnn_r50_fpn_1x_sniper_visdrone.pdparams) | [config](./faster_rcnn_r50_fpn_1x_sniper_visdrone.yml) |

### Note
- These models use the VisDrone dataset and detect 9 object classes: `person, bicycles, car, van, truck, tricycle, awning-tricycle, bus, motor`.
- Deployment is not supported yet because of the crop behavior of the SNIPER dataset.

## Getting Started
### 1. Training
a. Optional: run `tools/sniper_params_stats.py` to compute `image_target_sizes`, `valid_box_ratio_ranges`, `chip_target_size`, and `chip_target_stride`, then update these parameters in `configs/datasets/sniper_coco_detection.yml`.
```bash
python tools/sniper_params_stats.py FasterRCNN annotations/instances_train2017.json
```
b. Optional: train a detector to generate negative proposals.
```bash
python -m paddle.distributed.launch --log_dir=./sniper/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/sniper/faster_rcnn_r50_fpn_1x_sniper_visdrone.yml --save_proposals --proposals_path=./proposals.json > sniper.log 2>&1 &
```
c. Train the model.
```bash
python -m paddle.distributed.launch --log_dir=./sniper/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/sniper/faster_rcnn_r50_fpn_1x_sniper_visdrone.yml --eval > sniper.log 2>&1 &
```

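The parameters `chip_target_size` and `chip_target_stride` from step a describe how an image is cut into chips. The sketch below shows one plausible way a sliding window could tile an image with such values. The function name `chip_windows` and the numbers used are hypothetical, and the actual cropping logic in PaddleDetection may differ.

```python
def chip_windows(img_w, img_h, chip_size, stride):
    """Return (x1, y1, x2, y2) chip windows that tile an image.

    Hypothetical illustration of sliding-window chipping; not the
    PaddleDetection implementation.
    """
    def starts(length):
        # A single window is enough when the image side fits in one chip.
        if length <= chip_size:
            return [0]
        positions = list(range(0, length - chip_size, stride))
        # Add a final window flush with the image border.
        positions.append(length - chip_size)
        return positions

    return [
        (x, y, min(x + chip_size, img_w), min(y + chip_size, img_h))
        for y in starts(img_h)
        for x in starts(img_w)
    ]


# Example values chosen for illustration only.
windows = chip_windows(img_w=1000, img_h=600, chip_size=512, stride=200)
```

A smaller stride gives more overlap between chips, which keeps objects near chip borders fully visible in at least one chip at the cost of more training crops.
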
### 2. Evaluation
Evaluate SNIPER on a custom dataset on a single GPU with the following command:
```bash
# use the checkpoint saved during training
CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/sniper/faster_rcnn_r50_fpn_1x_sniper_visdrone.yml -o weights=output/faster_rcnn_r50_fpn_1x_sniper_visdrone/model_final
```

### 3. Inference
Run inference on a single GPU with the following commands. Use `--infer_img` to run inference on a single image and `--infer_dir` to run inference on all images in a directory.

```bash
# run inference on a single image
CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/sniper/faster_rcnn_r50_fpn_1x_sniper_visdrone.yml -o weights=output/faster_rcnn_r50_fpn_1x_sniper_visdrone/model_final --infer_img=demo/P0861__1.0__1154___824.png

# run inference on all images in the directory
CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/sniper/faster_rcnn_r50_fpn_1x_sniper_visdrone.yml -o weights=output/faster_rcnn_r50_fpn_1x_sniper_visdrone/model_final --infer_dir=demo
```

## Citations
```
@misc{1805.09300,
  Author = {Bharat Singh and Mahyar Najibi and Larry S. Davis},
  Title = {SNIPER: Efficient Multi-Scale Training},
  Year = {2018},
  Eprint = {arXiv:1805.09300},
}

@ARTICLE{9573394,
  author={Zhu, Pengfei and Wen, Longyin and Du, Dawei and Bian, Xiao and Fan, Heng and Hu, Qinghua and Ling, Haibin},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  title={Detection and Tracking Meet Drones Challenge},
  year={2021},
  volume={},
  number={},
  pages={1-1},
  doi={10.1109/TPAMI.2021.3119563}}
```
68
paddle_detection/configs/sniper/README_cn.md
Normal file
@@ -0,0 +1,68 @@
简体中文 | [English](README.md)

# SNIPER: Efficient Multi-Scale Training

## Model Zoo
| SNIPER | GPUs | Images per GPU | Backbone | Dataset | LR schedule | Box AP | Download | Config |
| :---------------- | :-------------------: | :------------------: | :-----: | :-----: | :------------: | :-----: | :-----------------------------------------------------: | :-----: |
| w/o sniper | 4 | 1 | ResNet-r50-FPN | [VisDrone](https://github.com/VisDrone/VisDrone-Dataset) | 1x | 23.3 | [Download link](https://bj.bcebos.com/v1/paddledet/models/faster_rcnn_r50_fpn_1x_visdrone.pdparams) | [Config](./faster_rcnn_r50_fpn_1x_visdrone.yml) |
| w sniper | 4 | 1 | ResNet-r50-FPN | [VisDrone](https://github.com/VisDrone/VisDrone-Dataset) | 1x | 29.7 | [Download link](https://bj.bcebos.com/v1/paddledet/models/faster_rcnn_r50_fpn_1x_sniper_visdrone.pdparams) | [Config](./faster_rcnn_r50_fpn_1x_sniper_visdrone.yml) |


### Note
- We use the `VisDrone` dataset and detect 9 of its classes: `person, bicycles, car, van, truck, tricycle, awning-tricycle, bus, motor`.
- Export and inference deployment (deploy) are not supported for now.


## Usage
### 1. Training
a. Optional: collect dataset statistics to obtain the image scaling sizes, valid box ranges, chip size, and chip stride, then update the corresponding parameters in `configs/datasets/sniper_coco_detection.yml`.
```bash
python tools/sniper_params_stats.py FasterRCNN annotations/instances_train2017.json
```
b. Optional: train a detector to generate negative samples.
```bash
python -m paddle.distributed.launch --log_dir=./sniper/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/sniper/faster_rcnn_r50_fpn_1x_sniper_visdrone.yml --save_proposals --proposals_path=./proposals.json > sniper.log 2>&1 &
```
c. Train the model.
```bash
python -m paddle.distributed.launch --log_dir=./sniper/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/sniper/faster_rcnn_r50_fpn_1x_sniper_visdrone.yml --eval > sniper.log 2>&1 &
```

### 2. Evaluation
Evaluate the model on the COCO val2017 dataset on a single GPU with the following command:
```bash
# use the checkpoint saved during training
CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/sniper/faster_rcnn_r50_fpn_1x_sniper_visdrone.yml -o weights=output/faster_rcnn_r50_fpn_1x_sniper_visdrone/model_final
```

### 3. Inference
Run inference on a single GPU with the following commands. Use `--infer_img` to specify an image path, or `--infer_dir` to specify a directory and run inference on all images in it.

```bash
# run inference on a single image
CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/sniper/faster_rcnn_r50_fpn_1x_sniper_visdrone.yml -o weights=output/faster_rcnn_r50_fpn_1x_sniper_visdrone/model_final --infer_img=demo/P0861__1.0__1154___824.png

# run inference on all images in the directory
CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/sniper/faster_rcnn_r50_fpn_1x_sniper_visdrone.yml -o weights=output/faster_rcnn_r50_fpn_1x_sniper_visdrone/model_final --infer_dir=demo
```

## Citations
```
@misc{1805.09300,
  Author = {Bharat Singh and Mahyar Najibi and Larry S. Davis},
  Title = {SNIPER: Efficient Multi-Scale Training},
  Year = {2018},
  Eprint = {arXiv:1805.09300},
}

@ARTICLE{9573394,
  author={Zhu, Pengfei and Wen, Longyin and Du, Dawei and Bian, Xiao and Fan, Heng and Hu, Qinghua and Ling, Haibin},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  title={Detection and Tracking Meet Drones Challenge},
  year={2021},
  volume={},
  number={},
  pages={1-1},
  doi={10.1109/TPAMI.2021.3119563}}
```
40
paddle_detection/configs/sniper/_base_/faster_fpn_reader.yml
Normal file
@@ -0,0 +1,40 @@
worker_num: 2
TrainReader:
  sample_transforms:
  - SniperDecodeCrop: {}
  - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
  - RandomFlip: {prob: 0.5}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 1
  shuffle: true
  drop_last: true
  collate_batch: false


EvalReader:
  sample_transforms:
  - SniperDecodeCrop: {}
  - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 1
  shuffle: false
  drop_last: false


TestReader:
  sample_transforms:
  - SniperDecodeCrop: {}
  - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 1
  shuffle: false
  drop_last: false
41
paddle_detection/configs/sniper/_base_/faster_reader.yml
Normal file
@@ -0,0 +1,41 @@
worker_num: 2
TrainReader:
  sample_transforms:
  - SniperDecodeCrop: {}
  - RandomResize: {target_size: [[800, 1333]], interp: 2, keep_ratio: True}
  - RandomFlip: {prob: 0.5}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: -1}
  batch_size: 1
  shuffle: true
  drop_last: true
  collate_batch: false
  use_shared_memory: true


EvalReader:
  sample_transforms:
  - SniperDecodeCrop: {}
  - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: -1}
  batch_size: 1
  shuffle: false
  drop_last: false


TestReader:
  sample_transforms:
  - SniperDecodeCrop: {}
  - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
  - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: -1}
  batch_size: 1
  shuffle: false
  drop_last: false
40
paddle_detection/configs/sniper/_base_/ppyolo_reader.yml
Normal file
@@ -0,0 +1,40 @@
worker_num: 2
TrainReader:
  inputs_def:
    num_max_boxes: 50
  sample_transforms:
  - SniperDecodeCrop: {}
  - RandomDistort: {}
  - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
  - RandomCrop: {}
  - RandomFlip: {}
  batch_transforms:
  - BatchRandomResize: {target_size: [320, 352, 384, 416, 448, 480, 512, 544, 576, 608], random_size: True, random_interp: True, keep_ratio: False}
  - NormalizeBox: {}
  - PadBox: {num_max_boxes: 50}
  - BboxXYXY2XYWH: {}
  - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
  - Permute: {}
  - Gt2YoloTarget: {anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]], anchors: [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]], downsample_ratios: [32, 16, 8]}
  batch_size: 8
  shuffle: true
  drop_last: true
  use_shared_memory: true

EvalReader:
  sample_transforms:
  - SniperDecodeCrop: {}
  - Resize: {target_size: [608, 608], keep_ratio: False, interp: 2}
  - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
  - Permute: {}
  batch_size: 8

TestReader:
  inputs_def:
    image_shape: [3, 608, 608]
  sample_transforms:
  - SniperDecodeCrop: {}
  - Resize: {target_size: [608, 608], keep_ratio: False, interp: 2}
  - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
  - Permute: {}
  batch_size: 1
@@ -0,0 +1,9 @@
_BASE_: [
  '../datasets/sniper_visdrone_detection.yml',
  '../runtime.yml',
  '../faster_rcnn/_base_/faster_rcnn_r50_fpn.yml',
  '../faster_rcnn/_base_/optimizer_1x.yml',
  '_base_/faster_fpn_reader.yml',
]
weights: output/faster_rcnn_r50_fpn_1x_sniper_visdrone/model_final
find_unused_parameters: true
@@ -0,0 +1,29 @@
_BASE_: [
  '../datasets/coco_detection.yml',
  '../runtime.yml',
  '../faster_rcnn/_base_/optimizer_1x.yml',
  '../faster_rcnn/_base_/faster_rcnn_r50_fpn.yml',
  '../faster_rcnn/_base_/faster_fpn_reader.yml',
]
weights: output/faster_rcnn_r50_fpn_1x_visdrone/model_final


metric: COCO
num_classes: 9

TrainDataset:
  !COCODataSet
    image_dir: train
    anno_path: annotations/train.json
    dataset_dir: dataset/VisDrone2019_coco
    data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']

EvalDataset:
  !COCODataSet
    image_dir: val
    anno_path: annotations/val.json
    dataset_dir: dataset/VisDrone2019_coco

TestDataset:
  !ImageFolder
    anno_path: annotations/val.json
@@ -0,0 +1,33 @@
_BASE_: [
  '../datasets/sniper_visdrone_detection.yml',
  '../runtime.yml',
  '../ppyolo/_base_/ppyolo_r50vd_dcn.yml',
  '../ppyolo/_base_/optimizer_1x.yml',
  './_base_/ppyolo_reader.yml',
]

snapshot_epoch: 8
use_ema: true
weights: output/ppyolo_r50vd_dcn_1x_sniper_visdrone/model_final



LearningRate:
  base_lr: 0.005
  schedulers:
  - !PiecewiseDecay
    gamma: 0.1
    milestones:
    - 153
    - 173
  - !LinearWarmup
    start_factor: 0.1
    steps: 4000

OptimizerBuilder:
  optimizer:
    momentum: 0.9
    type: Momentum
  regularizer:
    factor: 0.0005
    type: L2
@@ -0,0 +1,54 @@
_BASE_: [
  '../datasets/coco_detection.yml',
  '../runtime.yml',
  '../ppyolo/_base_/ppyolo_r50vd_dcn.yml',
  '../ppyolo/_base_/optimizer_1x.yml',
  '../ppyolo/_base_/ppyolo_reader.yml',
]

snapshot_epoch: 8
use_ema: true
weights: output/ppyolo_r50vd_dcn_1x_visdrone/model_final

epoch: 192

LearningRate:
  base_lr: 0.005
  schedulers:
  - !PiecewiseDecay
    gamma: 0.1
    milestones:
    - 153
    - 173
  - !LinearWarmup
    start_factor: 0.
    steps: 4000

OptimizerBuilder:
  optimizer:
    momentum: 0.9
    type: Momentum
  regularizer:
    factor: 0.0005
    type: L2


metric: COCO
num_classes: 9

TrainDataset:
  !COCODataSet
    image_dir: train
    anno_path: annotations/train.json
    dataset_dir: dataset/VisDrone2019_coco
    data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']

EvalDataset:
  !COCODataSet
    image_dir: val
    anno_path: annotations/val.json
    dataset_dir: dataset/VisDrone2019_coco

TestDataset:
  !ImageFolder
    anno_path: annotations/val.json