Replace the document detection model

This commit is contained in:
2024-08-27 14:42:45 +08:00
parent aea6f19951
commit 1514e09c40
2072 changed files with 254336 additions and 4967 deletions

View File

@@ -0,0 +1,437 @@
Simplified Chinese | [English](README_en.md)
# Semi-Supervised Detection (Semi DET)
## Contents
- [Introduction](#introduction)
- [Model Zoo](#model-zoo)
  - [Baseline](#baseline)
  - [DenseTeacher](#denseteacher)
  - [ARSL](#arsl)
- [Semi-Supervised Dataset Preparation](#semi-supervised-dataset-preparation)
- [Semi-Supervised Detection Configuration](#semi-supervised-detection-configuration)
  - [Training Set Configuration](#training-set-configuration)
  - [Pretraining Configuration](#pretraining-configuration)
  - [Global Configuration](#global-configuration)
  - [Model Configuration](#model-configuration)
  - [Data Augmentation Configuration](#data-augmentation-configuration)
  - [Other Configuration](#other-configuration)
- [Usage Instructions](#usage-instructions)
  - [Training](#training)
  - [Evaluation](#evaluation)
  - [Inference](#inference)
  - [Deployment](#deployment)
- [Citation](#citation)
## Introduction
Semi-supervised object detection (Semi DET) trains on **both labeled and unlabeled data**. It can greatly reduce annotation cost while making full use of unlabeled data to further improve detection accuracy. The PaddleDetection team provides state-of-the-art semi-supervised detection algorithms such as [DenseTeacher](denseteacher/) and [ARSL](arsl/), which users can download and use.
## Model Zoo
### [Baseline](baseline)
For training and the model zoo of **fully supervised** baseline models, please refer to [Baseline](baseline).
### [DenseTeacher](denseteacher)
| Model | Supervised Data Ratio | Sup Baseline | Sup Epochs (Iters) | Sup mAP<sup>val<br>0.5:0.95 | Semi mAP<sup>val<br>0.5:0.95 | Semi Epochs (Iters) | Download | Config |
| :------------: | :---------: | :---------------------: | :---------------------: |:---------------------------: |:----------------------------: | :------------------: |:--------: |:----------: |
| DenseTeacher-FCOS | 5% | [sup_config](./baseline/fcos_r50_fpn_2x_coco_sup005.yml) | 24 (8712) | 21.3 | **30.6** | 240 (87120) | [download](https://paddledet.bj.bcebos.com/models/denseteacher_fcos_r50_fpn_coco_semi005.pdparams) | [config](denseteacher/denseteacher_fcos_r50_fpn_coco_semi005.yml) |
| DenseTeacher-FCOS | 10% | [sup_config](./baseline/fcos_r50_fpn_2x_coco_sup010.yml) | 24 (17424) | 26.3 | **35.1** | 240 (174240) | [download](https://paddledet.bj.bcebos.com/models/denseteacher_fcos_r50_fpn_coco_semi010.pdparams) | [config](denseteacher/denseteacher_fcos_r50_fpn_coco_semi010.yml) |
| DenseTeacher-FCOS(LSJ)| 10% | [sup_config](./baseline/fcos_r50_fpn_2x_coco_sup010.yml) | 24 (17424) | 26.3 | **37.1(LSJ)** | 240 (174240) | [download](https://paddledet.bj.bcebos.com/models/denseteacher_fcos_r50_fpn_coco_semi010_lsj.pdparams) | [config](denseteacher/denseteacher_fcos_r50_fpn_coco_semi010_lsj.yml) |
| DenseTeacher-FCOS |100%(full)| [sup_config](./../fcos/fcos_r50_fpn_iou_multiscale_2x_coco.yml) | 24 (175896) | 42.6 | **44.2** | 24 (175896)| [download](https://paddledet.bj.bcebos.com/models/denseteacher_fcos_r50_fpn_coco_full.pdparams) | [config](denseteacher/denseteacher_fcos_r50_fpn_coco_full.yml) |

| Model | Supervised Data Ratio | Sup Baseline | Sup Epochs (Iters) | Sup mAP<sup>val<br>0.5:0.95 | Semi mAP<sup>val<br>0.5:0.95 | Semi Epochs (Iters) | Download | Config |
| :------------: | :---------: | :---------------------: | :---------------------: |:---------------------------: |:----------------------------: | :------------------: |:--------: |:----------: |
| DenseTeacher-PPYOLOE+_s | 5% | [sup_config](./baseline/ppyoloe_plus_crn_s_80e_coco_sup005.yml) | 80 (14480) | 32.8 | **34.0** | 200 (36200) | [download](https://paddledet.bj.bcebos.com/models/denseteacher_ppyoloe_plus_crn_s_coco_semi005.pdparams) | [config](denseteacher/denseteacher_ppyoloe_plus_crn_s_coco_semi005.yml) |
| DenseTeacher-PPYOLOE+_s | 10% | [sup_config](./baseline/ppyoloe_plus_crn_s_80e_coco_sup010.yml) | 80 (14480) | 35.3 | **37.5** | 200 (36200) | [download](https://paddledet.bj.bcebos.com/models/denseteacher_ppyoloe_plus_crn_s_coco_semi010.pdparams) | [config](denseteacher/denseteacher_ppyoloe_plus_crn_s_coco_semi010.yml) |
| DenseTeacher-PPYOLOE+_l | 5% | [sup_config](./baseline/ppyoloe_plus_crn_l_80e_coco_sup005.yml) | 80 (14480) | 42.9 | **45.4** | 200 (36200) | [download](https://paddledet.bj.bcebos.com/models/denseteacher_ppyoloe_plus_crn_l_coco_semi005.pdparams) | [config](denseteacher/denseteacher_ppyoloe_plus_crn_l_coco_semi005.yml) |
| DenseTeacher-PPYOLOE+_l | 10% | [sup_config](./baseline/ppyoloe_plus_crn_l_80e_coco_sup010.yml) | 80 (14480) | 45.7 | **47.4** | 200 (36200) | [download](https://paddledet.bj.bcebos.com/models/denseteacher_ppyoloe_plus_crn_l_coco_semi010.pdparams) | [config](denseteacher/denseteacher_ppyoloe_plus_crn_l_coco_semi010.yml) |
### [ARSL](arsl)
| Model | COCO Supervised Data Ratio | Semi mAP<sup>val<br>0.5:0.95 | Semi Epochs (Iters) | Download | Config |
| :------------: | :---------:|:----------------------------: | :------------------: |:--------: |:----------: |
| ARSL-FCOS | 1% | **22.8** | 240 (87120) | [download](https://paddledet.bj.bcebos.com/models/arsl_fcos_r50_fpn_coco_semi001.pdparams) | [config](arsl/arsl_fcos_r50_fpn_coco_semi001.yml) |
| ARSL-FCOS | 5% | **33.1** | 240 (174240) | [download](https://paddledet.bj.bcebos.com/models/arsl_fcos_r50_fpn_coco_semi005.pdparams) | [config](arsl/arsl_fcos_r50_fpn_coco_semi005.yml ) |
| ARSL-FCOS | 10% | **36.9** | 240 (174240) | [download](https://paddledet.bj.bcebos.com/models/arsl_fcos_r50_fpn_coco_semi010.pdparams) | [config](arsl/arsl_fcos_r50_fpn_coco_semi010.yml ) |
| ARSL-FCOS | 10% | **38.5(LSJ)** | 240 (174240) | [download](https://paddledet.bj.bcebos.com/models/arsl_fcos_r50_fpn_coco_semi010_lsj.pdparams) | [config](arsl/arsl_fcos_r50_fpn_coco_semi010_lsj.yml ) |
| ARSL-FCOS | full(100%) | **45.1** | 240 (174240) | [download](https://paddledet.bj.bcebos.com/models/arsl_fcos_r50_fpn_coco_full.pdparams) | [config](arsl/arsl_fcos_r50_fpn_coco_full.yml ) |
## Semi-Supervised Dataset Preparation
Semi-supervised object detection **requires both labeled and unlabeled data**, and the amount of unlabeled data is usually **much larger** than the amount of labeled data.
For the COCO dataset, there are two common settings:
(1) Sample a fixed percentage of the original training set `train2017` as labeled data, with the remainder used as unlabeled data.
Samples are drawn from `train2017` at fixed percentages (1%, 2%, 5%, 10%, etc.). Because the sampling can strongly affect semi-supervised training results, five-fold cross-validation is used for evaluation. Run the dataset split script as follows:
```bash
python tools/gen_semi_coco.py
```
The script splits the full `train2017` set at supervised-data ratios of 1%, 2%, 5%, and 10%; for cross-validation, each split is randomly repeated 5 times. The generated semi-supervised annotation files are:
- labeled set annotations: `instances_train2017.{fold}@{percent}.json`
- unlabeled set annotations: `instances_train2017.{fold}@{percent}-unlabeled.json`
Here `fold` denotes the cross-validation fold and `percent` denotes the percentage of labeled data.
Note: if the splits are generated from a `txt_file`, you need to download `COCO_supervision.txt` first:
```shell
wget https://bj.bcebos.com/v1/paddledet/data/coco/COCO_supervision.txt
```
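After generating (or downloading) a split, a quick sanity check can confirm that the labeled and unlabeled halves of a fold are disjoint and together cover `train2017`. The snippet below is an illustrative sketch that only assumes the file naming shown above:
```python
# Sanity-check one semi-supervised split (illustrative; paths follow the naming above).
import json

fold, percent = 1, 10
labeled = json.load(open(f'semi_annotations/instances_train2017.{fold}@{percent}.json'))
unlabeled = json.load(open(f'semi_annotations/instances_train2017.{fold}@{percent}-unlabeled.json'))

labeled_ids = {img['id'] for img in labeled['images']}
unlabeled_ids = {img['id'] for img in unlabeled['images']}

assert not labeled_ids & unlabeled_ids, 'labeled and unlabeled images must not overlap'
total = len(labeled_ids) + len(unlabeled_ids)
print(f'labeled: {len(labeled_ids)} images (~{100 * len(labeled_ids) / total:.1f}%), '
      f'unlabeled: {len(unlabeled_ids)} images')
```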
(2) Use the full original training set `train2017` as labeled data and the full original unlabeled image set `unlabeled2017` as unlabeled data.
### Download Links
The PaddleDetection team provides all annotation files for the COCO dataset. Please download, extract, and place them in the corresponding directories:
```shell
# download the full COCO dataset images and annotations
# including train2017, val2017, annotations
wget https://bj.bcebos.com/v1/paddledet/data/coco.tar
# download the partial-ratio COCO annotation files prepared by the PaddleDetection team
wget https://bj.bcebos.com/v1/paddledet/data/coco/semi_annotations.zip
# unlabeled2017 is optional; skip it if you do not train the full setting
# download the full COCO unlabeled dataset
wget https://bj.bcebos.com/v1/paddledet/data/coco/unlabeled2017.zip
wget https://bj.bcebos.com/v1/paddledet/data/coco/image_info_unlabeled2017.zip
# download the converted unlabeled2017 annotation json file
wget https://bj.bcebos.com/v1/paddledet/data/coco/instances_unlabeled2017.zip
```
If you need the full COCO unlabeled dataset, the original `image_info_unlabeled2017.json` must be converted to the annotation format by running the following code:
<details>
<summary> COCO unlabeled annotation conversion code:</summary>
```python
import json
anns_train = json.load(open('annotations/instances_train2017.json', 'r'))
anns_unlabeled = json.load(open('annotations/image_info_unlabeled2017.json', 'r'))
unlabeled_json = {
'images': anns_unlabeled['images'],
'annotations': [],
'categories': anns_train['categories'],
}
path = 'annotations/instances_unlabeled2017.json'
with open(path, 'w') as f:
json.dump(unlabeled_json, f)
```
</details>
<details open>
<summary> The extracted dataset directory looks like this:</summary>
```
PaddleDetection
├── dataset
│ ├── coco
│ │ ├── annotations
│ │ │ ├── instances_train2017.json
│ │ │ ├── instances_unlabeled2017.json
│ │ │ ├── instances_val2017.json
│ │ ├── semi_annotations
│ │ │ ├── instances_train2017.1@1.json
│ │ │ ├── instances_train2017.1@1-unlabeled.json
│ │ │ ├── instances_train2017.1@2.json
│ │ │ ├── instances_train2017.1@2-unlabeled.json
│ │ │ ├── instances_train2017.1@5.json
│ │ │ ├── instances_train2017.1@5-unlabeled.json
│ │ │ ├── instances_train2017.1@10.json
│ │ │ ├── instances_train2017.1@10-unlabeled.json
│ │ ├── train2017
│ │ ├── unlabeled2017
│ │ ├── val2017
```
</details>
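Before moving on to configuration, it may help to verify that the expected annotation files are actually in place. A minimal check, assuming the directory layout above, could look like:
```python
# Minimal existence check for the annotation files listed above (illustrative sketch).
import os

required = [
    'dataset/coco/annotations/instances_train2017.json',
    'dataset/coco/annotations/instances_val2017.json',
    'dataset/coco/semi_annotations/instances_train2017.1@10.json',
    'dataset/coco/semi_annotations/instances_train2017.1@10-unlabeled.json',
]
missing = [p for p in required if not os.path.exists(p)]
print('All annotation files found.' if not missing else f'Missing: {missing}')
```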
## Semi-Supervised Detection Configuration
To configure semi-supervised detection, start from the config file of the chosen **base detector**, e.g.:
```python
_BASE_: [
'../../fcos/fcos_r50_fpn_iou_multiscale_2x_coco.yml',
'../_base_/coco_detection_percent_10.yml',
]
log_iter: 50
snapshot_epoch: 5
epochs: &epochs 240
weights: output/denseteacher_fcos_r50_fpn_coco_semi010/model_final
```
Then make the following changes in turn:
### Training Set Configuration
First, you can directly reference a pre-configured semi-supervised training set, e.g.:
```python
_BASE_: [
'../_base_/coco_detection_percent_10.yml',
]
```
Specifically, building a semi-supervised dataset requires configuring the paths of both the supervised dataset `TrainDataset` and the unsupervised dataset `UnsupTrainDataset`. **Note that the `SemiCOCODataSet` class must be used instead of `COCODataSet`**, as shown below:
**Partial-ratio COCO-train2017 dataset:**
```python
# partial labeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
TrainDataset:
!SemiCOCODataSet
image_dir: train2017
anno_path: semi_annotations/instances_train2017.1@10.json
dataset_dir: dataset/coco
data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
# partial unlabeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
UnsupTrainDataset:
!SemiCOCODataSet
image_dir: train2017
anno_path: semi_annotations/instances_train2017.1@10-unlabeled.json
dataset_dir: dataset/coco
data_fields: ['image']
supervised: False
```
Or the **full COCO-train2017 dataset:**
```python
# full labeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
TrainDataset:
!SemiCOCODataSet
image_dir: train2017
anno_path: annotations/instances_train2017.json
dataset_dir: dataset/coco
data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
# full unlabeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
UnsupTrainDataset:
!SemiCOCODataSet
image_dir: unlabeled2017
anno_path: annotations/instances_unlabeled2017.json
dataset_dir: dataset/coco
data_fields: ['image']
supervised: False
```
The configurations of the validation set `EvalDataset` and the test set `TestDataset` **do not need to change** and still use the `COCODataSet` class.
### Pretraining Configuration
```python
### pretrain and warmup config, choose one and comment another
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
semi_start_iters: 5000
ema_start_iters: 3000
use_warmup: &use_warmup True
```
**Note:**
- The original `Dense Teacher` paper uses `R50-va-caffe` pretraining, while PaddleDetection uses `R50-vb` pretraining by default. Using `R50-vd` pretrained weights combined with [SSLD](../../../docs/feature_models/SSLD_PRETRAINED_MODEL.md) can further improve detection accuracy significantly; the backbone configuration must be changed accordingly:
```python
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
ResNet:
depth: 50
variant: d
norm_type: bn
freeze_at: 0
return_idx: [1, 2, 3]
num_stages: 4
lr_mult_list: [0.05, 0.05, 0.1, 0.15]
```
### Global Configuration
Add the following global configuration to the config file. Note that the DenseTeacher model requires `use_simple_ema: True` rather than `use_ema: True`:
```python
### global config
use_simple_ema: True
ema_decay: 0.9996
ssod_method: DenseTeacher
DenseTeacher:
train_cfg:
sup_weight: 1.0
unsup_weight: 1.0
loss_weight: {distill_loss_cls: 4.0, distill_loss_box: 1.0, distill_loss_quality: 1.0}
concat_sup_data: True
suppress: linear
ratio: 0.01
gamma: 2.0
test_cfg:
inference_on: teacher
```
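To make the `ratio` field above more concrete: roughly speaking, Dense Teacher keeps only the highest-scoring fraction of the teacher's dense predictions as the region where the box/quality distillation losses are applied. The sketch below illustrates that top-ratio selection idea in plain NumPy; it is not PaddleDetection's implementation, and the array names and shapes are assumptions:
```python
# Illustrative sketch of ratio-based selection of dense pseudo-labels (not the PaddleDetection code).
import numpy as np

def select_distill_region(teacher_cls_scores, ratio=0.01):
    """teacher_cls_scores: (num_points, num_classes) sigmoid scores over all FPN points."""
    point_conf = teacher_cls_scores.max(axis=1)   # per-point teacher confidence
    k = max(1, int(ratio * point_conf.size))      # keep e.g. the top 1% of all points
    fg_mask = np.zeros(point_conf.shape[0], dtype=bool)
    fg_mask[np.argsort(-point_conf)[:k]] = True
    return fg_mask                                # box/quality distillation applied only where True

# e.g. fg = select_distill_region(scores, ratio=0.01)
```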
### Model Configuration
If there are no special changes, the model configuration is inherited directly from the base detector.
Taking `DenseTeacher` as an example, `fcos_r50_fpn_iou_multiscale_2x_coco.yml` is chosen as the **base detector** for semi-supervised training; **the teacher network and the student network both use the base detector's architecture, and their structures are identical**.
```python
_BASE_: [
'../../fcos/fcos_r50_fpn_iou_multiscale_2x_coco.yml',
]
```
### Data Augmentation Configuration
To build the Reader for the semi-supervised training set, extend the original `TrainReader` by adding `weak_aug`, `strong_aug`, `sup_batch_transforms`, and `unsup_batch_transforms`. Note that:
- if `NormalizeImage` is used, it must be moved out of `sample_transforms` and placed separately in both `weak_aug` and `strong_aug`;
- `sample_transforms` is the **shared base data augmentation**;
- the complete weak augmentation is `sample_transforms + weak_aug`, and the complete strong augmentation is `sample_transforms + strong_aug`.
For example:
The original fully supervised model's `TrainReader`:
```python
TrainReader:
sample_transforms:
- Decode: {}
- RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], keep_ratio: True, interp: 1}
- RandomFlip: {}
- NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
batch_transforms:
- Permute: {}
- PadBatch: {pad_to_stride: 32}
- Gt2FCOSTarget:
object_sizes_boundary: [64, 128, 256, 512]
center_sampling_radius: 1.5
downsample_ratios: [8, 16, 32, 64, 128]
norm_reg_targets: True
batch_size: 2
shuffle: True
drop_last: True
```
The modified semi-supervised `SemiTrainReader`:
```python
### reader config
SemiTrainReader:
sample_transforms:
- Decode: {}
- RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], keep_ratio: True, interp: 1}
- RandomFlip: {}
weak_aug:
- NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
strong_aug:
- StrongAugImage: {transforms: [
RandomColorJitter: {prob: 0.8, brightness: 0.4, contrast: 0.4, saturation: 0.4, hue: 0.1},
RandomErasingCrop: {},
RandomGaussianBlur: {prob: 0.5, sigma: [0.1, 2.0]},
RandomGrayscale: {prob: 0.2},
]}
- NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
sup_batch_transforms:
- Permute: {}
- PadBatch: {pad_to_stride: 32}
- Gt2FCOSTarget:
object_sizes_boundary: [64, 128, 256, 512]
center_sampling_radius: 1.5
downsample_ratios: [8, 16, 32, 64, 128]
norm_reg_targets: True
unsup_batch_transforms:
- Permute: {}
- PadBatch: {pad_to_stride: 32}
sup_batch_size: 2
unsup_batch_size: 2
shuffle: True
drop_last: True
```
### Other Configuration
The number of training epochs should be set so that the total number of iterations matches full-data training. For example, if full-data training uses 24 epochs (roughly 180k iterations), then semi-supervised training with 10% supervised data needs about 240 epochs (also roughly 180k iterations). Example:
```python
### other config
epoch: 240
LearningRate:
base_lr: 0.01
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: 240
use_warmup: True
- !LinearWarmup
start_factor: 0.001
steps: 1000
OptimizerBuilder:
optimizer:
momentum: 0.9
type: Momentum
regularizer:
factor: 0.0001
type: L2
clip_grad_by_value: 1.0
```
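As a rough worked example of the epoch/iteration bookkeeping above (numbers are approximate; exact values depend on dataset size and batch size): full COCO train2017 has about 118k images, so with a supervised batch size of 16 one epoch is roughly 7.3k iterations and 24 epochs is roughly 176k iterations; the 10% split has about 11.8k labeled images, so about 240 epochs gives a similar total.
```python
# Approximate epoch <-> iteration conversion used to match full-data training (illustrative).
full_images, partial_images = 118_000, 11_800   # ~COCO train2017 and its 10% labeled subset
sup_batch_size = 2 * 8                          # per-GPU supervised batch size x 8 GPUs

full_iters = full_images // sup_batch_size * 24                      # ~24 epochs of full-data training
semi_epochs = round(full_iters / (partial_images // sup_batch_size)) # epochs needed on the 10% split
print(full_iters, semi_epochs)                                       # ~177k iters -> ~240 epochs
```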
## Usage Instructions
Only training must use the semi-supervised detection config file; evaluation, inference, and deployment can also be run with the base detector's config file.
### Training
```bash
# single-GPU training (not recommended; scale the learning rate linearly if you do)
CUDA_VISIBLE_DEVICES=0 python tools/train.py -c configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi010.yml --eval
# multi-GPU training
python -m paddle.distributed.launch --log_dir=denseteacher_fcos_semi010/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi010.yml --eval
```
### Evaluation
```bash
CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi010.yml -o weights=output/denseteacher_fcos_r50_fpn_coco_semi010/model_final.pdparams
```
### Inference
```bash
CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi010.yml -o weights=output/denseteacher_fcos_r50_fpn_coco_semi010/model_final.pdparams --infer_img=demo/000000014439.jpg
```
### Deployment
Deployment can use either the semi-supervised detection config file or the base detector's config file.
```bash
# export the model
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi010.yml -o weights=https://paddledet.bj.bcebos.com/models/denseteacher_fcos_r50_fpn_coco_semi010.pdparams
# inference with the exported model
CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/denseteacher_fcos_r50_fpn_coco_semi010 --image_file=demo/000000014439_640x640.jpg --device=GPU
# benchmark the deployed model
CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/denseteacher_fcos_r50_fpn_coco_semi010 --image_file=demo/000000014439_640x640.jpg --device=GPU --run_benchmark=True # --run_mode=trt_fp16
# export to ONNX
paddle2onnx --model_dir output_inference/denseteacher_fcos_r50_fpn_coco_semi010/ --model_filename model.pdmodel --params_filename model.pdiparams --opset_version 12 --save_file denseteacher_fcos_r50_fpn_coco_semi010.onnx
```
## Citation
```
@article{denseteacher2022,
title={Dense Teacher: Dense Pseudo-Labels for Semi-supervised Object Detection},
author={Hongyu Zhou, Zheng Ge, Songtao Liu, Weixin Mao, Zeming Li, Haiyan Yu, Jian Sun},
journal={arXiv preprint arXiv:2207.02541},
year={2022}
}
```

View File

@@ -0,0 +1,31 @@
metric: COCO
num_classes: 80
# full labeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
TrainDataset:
!SemiCOCODataSet
image_dir: train2017
anno_path: annotations/instances_train2017.json
dataset_dir: dataset/coco
data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
# full unlabeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
UnsupTrainDataset:
!SemiCOCODataSet
image_dir: unlabeled2017
anno_path: annotations/instances_unlabeled2017.json
dataset_dir: dataset/coco
data_fields: ['image']
supervised: False
EvalDataset:
!COCODataSet
image_dir: val2017
anno_path: annotations/instances_val2017.json
dataset_dir: dataset/coco
allow_empty: true
TestDataset:
!ImageFolder
anno_path: annotations/instances_val2017.json # also support txt (like VOC's label_list.txt)
dataset_dir: dataset/coco # if set, anno_path will be 'dataset_dir/anno_path'

View File

@@ -0,0 +1,31 @@
metric: COCO
num_classes: 80
# partial labeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
TrainDataset:
!SemiCOCODataSet
image_dir: train2017
anno_path: semi_annotations/instances_train2017.1@1.json
dataset_dir: dataset/coco
data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
# partial unlabeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
UnsupTrainDataset:
!SemiCOCODataSet
image_dir: train2017
anno_path: semi_annotations/instances_train2017.1@1-unlabeled.json
dataset_dir: dataset/coco
data_fields: ['image']
supervised: False
EvalDataset:
!COCODataSet
image_dir: val2017
anno_path: annotations/instances_val2017.json
dataset_dir: dataset/coco
allow_empty: true
TestDataset:
!ImageFolder
anno_path: annotations/instances_val2017.json # also support txt (like VOC's label_list.txt)
dataset_dir: dataset/coco # if set, anno_path will be 'dataset_dir/anno_path'

View File

@@ -0,0 +1,31 @@
metric: COCO
num_classes: 80
# partial labeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
TrainDataset:
!SemiCOCODataSet
image_dir: train2017
anno_path: semi_annotations/instances_train2017.1@10.json
dataset_dir: dataset/coco
data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
# partial unlabeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
UnsupTrainDataset:
!SemiCOCODataSet
image_dir: train2017
anno_path: semi_annotations/instances_train2017.1@10-unlabeled.json
dataset_dir: dataset/coco
data_fields: ['image']
supervised: False
EvalDataset:
!COCODataSet
image_dir: val2017
anno_path: annotations/instances_val2017.json
dataset_dir: dataset/coco
allow_empty: true
TestDataset:
!ImageFolder
anno_path: annotations/instances_val2017.json # also support txt (like VOC's label_list.txt)
dataset_dir: dataset/coco # if set, anno_path will be 'dataset_dir/anno_path'

View File

@@ -0,0 +1,31 @@
metric: COCO
num_classes: 80
# partial labeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
TrainDataset:
!SemiCOCODataSet
image_dir: train2017
anno_path: semi_annotations/instances_train2017.1@5.json
dataset_dir: dataset/coco
data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
# partial unlabeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
UnsupTrainDataset:
!SemiCOCODataSet
image_dir: train2017
anno_path: semi_annotations/instances_train2017.1@5-unlabeled.json
dataset_dir: dataset/coco
data_fields: ['image']
supervised: False
EvalDataset:
!COCODataSet
image_dir: val2017
anno_path: annotations/instances_val2017.json
dataset_dir: dataset/coco
allow_empty: true
TestDataset:
!ImageFolder
anno_path: annotations/instances_val2017.json # also support txt (like VOC's label_list.txt)
dataset_dir: dataset/coco # if set, anno_path will be 'dataset_dir/anno_path'

View File

@@ -0,0 +1,31 @@
metric: COCO
num_classes: 20
# before training, change VOC to COCO format by 'python voc2coco.py'
# partial labeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
TrainDataset:
!SemiCOCODataSet
image_dir: VOC2007/JPEGImages
anno_path: PseudoAnnotations/VOC2007_trainval.json
dataset_dir: dataset/voc/VOCdevkit
data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
# partial unlabeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
UnsupTrainDataset:
!SemiCOCODataSet
image_dir: VOC2012/JPEGImages
anno_path: PseudoAnnotations/VOC2012_trainval.json
dataset_dir: dataset/voc/VOCdevkit
data_fields: ['image']
supervised: False
EvalDataset:
!COCODataSet
image_dir: VOC2007/JPEGImages
anno_path: PseudoAnnotations/VOC2007_test.json
dataset_dir: dataset/voc/VOCdevkit/
allow_empty: true
TestDataset:
!ImageFolder
anno_path: PseudoAnnotations/VOC2007_test.json # also support txt (like VOC's label_list.txt)
dataset_dir: dataset/voc/VOCdevkit/ # if set, anno_path will be 'dataset_dir/anno_path'

View File

@@ -0,0 +1,213 @@
# convert VOC xml to COCO format json
import xml.etree.ElementTree as ET
import os
import json
import argparse
# create and init coco json, img set, and class set
def init_json():
# create coco json
coco = dict()
coco['images'] = []
coco['type'] = 'instances'
coco['annotations'] = []
coco['categories'] = []
# voc classes
voc_class = [
'aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat',
'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', 'person',
'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor'
]
# init json categories
image_set = set()
class_set = dict()
for cat_id, cat_name in enumerate(voc_class):
cat_item = dict()
cat_item['supercategory'] = 'none'
cat_item['id'] = cat_id
cat_item['name'] = cat_name
coco['categories'].append(cat_item)
class_set[cat_name] = cat_id
return coco, class_set, image_set
def getImgItem(file_name, size, img_id):
if file_name is None:
raise Exception('Could not find filename tag in xml file.')
if size['width'] is None:
raise Exception('Could not find width tag in xml file.')
if size['height'] is None:
raise Exception('Could not find height tag in xml file.')
image_item = dict()
image_item['id'] = img_id
image_item['file_name'] = file_name
image_item['width'] = size['width']
image_item['height'] = size['height']
return image_item
def getAnnoItem(object_name, image_id, ann_id, category_id, bbox):
annotation_item = dict()
annotation_item['segmentation'] = []
seg = []
# bbox[] is x,y,w,h
# left_top
seg.append(bbox[0])
seg.append(bbox[1])
# left_bottom
seg.append(bbox[0])
seg.append(bbox[1] + bbox[3])
# right_bottom
seg.append(bbox[0] + bbox[2])
seg.append(bbox[1] + bbox[3])
# right_top
seg.append(bbox[0] + bbox[2])
seg.append(bbox[1])
annotation_item['segmentation'].append(seg)
annotation_item['area'] = bbox[2] * bbox[3]
annotation_item['iscrowd'] = 0
annotation_item['ignore'] = 0
annotation_item['image_id'] = image_id
annotation_item['bbox'] = bbox
annotation_item['category_id'] = category_id
annotation_item['id'] = ann_id
return annotation_item
def convert_voc_to_coco(txt_path, json_path, xml_path):
# create and init coco json, img set, and class set
coco_json, class_set, image_set = init_json()
### collect img and ann info into coco json
# read img_name in txt, e.g., 000005 for voc2007, 2008_000002 for voc2012
img_txt = open(txt_path, 'r')
img_line = img_txt.readline().strip()
# loop xml
img_id = 0
ann_id = 0
while img_line:
print('img_id:', img_id)
# find corresponding xml
xml_name = img_line.split('Annotations/', 1)[1]
xml_file = os.path.join(xml_path, xml_name)
if not os.path.exists(xml_file):
            print('{} does not exist.'.format(xml_name))
img_line = img_txt.readline().strip()
continue
# decode xml
tree = ET.parse(xml_file)
root = tree.getroot()
if root.tag != 'annotation':
raise Exception(
'xml {} root element should be annotation, rather than {}'.
format(xml_name, root.tag))
# init img and ann info
bndbox = dict()
size = dict()
size['width'] = None
size['height'] = None
size['depth'] = None
# filename
fileNameNode = root.find('filename')
file_name = fileNameNode.text
# img size
sizeNode = root.find('size')
if not sizeNode:
raise Exception('xml {} structure broken at size tag.'.format(
xml_name))
for subNode in sizeNode:
size[subNode.tag] = int(subNode.text)
# add img into json
if file_name not in image_set:
img_id += 1
format_img_id = int("%04d" % img_id)
# print('line 120. format_img_id:', format_img_id)
image_item = getImgItem(file_name, size, img_id)
image_set.add(file_name)
coco_json['images'].append(image_item)
else:
raise Exception(' xml {} duplicated image: {}'.format(xml_name,
file_name))
### add objAnn into json
objectAnns = root.findall('object')
for objectAnn in objectAnns:
bndbox['xmin'] = None
bndbox['xmax'] = None
bndbox['ymin'] = None
bndbox['ymax'] = None
#add obj category
object_name = objectAnn.find('name').text
if object_name not in class_set:
raise Exception('xml {} Unrecognized category: {}'.format(
xml_name, object_name))
else:
current_category_id = class_set[object_name]
#add obj bbox ann
objectBboxNode = objectAnn.find('bndbox')
for coordinate in objectBboxNode:
if bndbox[coordinate.tag] is not None:
raise Exception('xml {} structure corrupted at bndbox tag.'.
format(xml_name))
bndbox[coordinate.tag] = int(float(coordinate.text))
bbox = []
# x
bbox.append(bndbox['xmin'])
# y
bbox.append(bndbox['ymin'])
# w
bbox.append(bndbox['xmax'] - bndbox['xmin'])
# h
bbox.append(bndbox['ymax'] - bndbox['ymin'])
ann_id += 1
ann_item = getAnnoItem(object_name, img_id, ann_id,
current_category_id, bbox)
coco_json['annotations'].append(ann_item)
img_line = img_txt.readline().strip()
print('Saving json.')
json.dump(coco_json, open(json_path, 'w'))
if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument(
'--type', type=str, default='VOC2007_test', help="data type")
parser.add_argument(
'--base_path',
type=str,
default='dataset/voc/VOCdevkit',
help="base VOC path.")
args = parser.parse_args()
# image info path
txt_name = args.type + '.txt'
json_name = args.type + '.json'
txt_path = os.path.join(args.base_path, 'PseudoAnnotations', txt_name)
json_path = os.path.join(args.base_path, 'PseudoAnnotations', json_name)
# xml path
xml_path = os.path.join(args.base_path,
args.type.split('_')[0], 'Annotations')
print('txt_path:', txt_path)
print('json_path:', json_path)
print('xml_path:', xml_path)
print('Converting {} to COCO json.'.format(args.type))
convert_voc_to_coco(txt_path, json_path, xml_path)
print('Finished.')
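# Example usage (based on the argparse options above; paths assume the default VOCdevkit layout,
# and each run expects a matching <type>.txt image list under PseudoAnnotations/):
#   python voc2coco.py --type VOC2007_trainval --base_path dataset/voc/VOCdevkit
#   python voc2coco.py --type VOC2012_trainval --base_path dataset/voc/VOCdevkit
#   python voc2coco.py --type VOC2007_test     --base_path dataset/voc/VOCdevkit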

View File

@@ -0,0 +1,48 @@
Simplified Chinese | [English](README_en.md)
# Ambiguity-Resistant Semi-Supervised Learning for Dense Object Detection (ARSL)
## ARSL-FCOS Model Zoo
| Model | COCO Supervised Data Ratio | Semi mAP<sup>val<br>0.5:0.95 | Semi Epochs (Iters) | Download | Config |
| :------------: | :---------:|:----------------------------: | :------------------: |:--------: |:----------: |
| ARSL-FCOS | 1% | **22.8** | 240 (87120) | [download](https://paddledet.bj.bcebos.com/models/arsl_fcos_r50_fpn_coco_semi001.pdparams) | [config](./arsl_fcos_r50_fpn_coco_semi001.yml) |
| ARSL-FCOS | 5% | **33.1** | 240 (174240) | [download](https://paddledet.bj.bcebos.com/models/arsl_fcos_r50_fpn_coco_semi005.pdparams) | [config](./arsl_fcos_r50_fpn_coco_semi005.yml ) |
| ARSL-FCOS | 10% | **36.9** | 240 (174240) | [download](https://paddledet.bj.bcebos.com/models/arsl_fcos_r50_fpn_coco_semi010.pdparams) | [config](./arsl_fcos_r50_fpn_coco_semi010.yml ) |
| ARSL-FCOS | 10% | **38.5(LSJ)** | 240 (174240) | [download](https://paddledet.bj.bcebos.com/models/arsl_fcos_r50_fpn_coco_semi010_lsj.pdparams) | [config](./arsl_fcos_r50_fpn_coco_semi010_lsj.yml ) |
| ARSL-FCOS | full(100%) | **45.1** | 240 (174240) | [download](https://paddledet.bj.bcebos.com/models/arsl_fcos_r50_fpn_coco_full.pdparams) | [config](./arsl_fcos_r50_fpn_coco_full.yml ) |
## Usage Instructions
Only training must use the semi-supervised detection config file; evaluation, inference, and deployment can also be run with the base detector's config file.
### Training
```bash
# single-GPU training (not recommended; scale the learning rate linearly if you do)
CUDA_VISIBLE_DEVICES=0 python tools/train.py -c configs/semi_det/arsl/arsl_fcos_r50_fpn_coco_semi010.yml --eval
# multi-GPU training
python -m paddle.distributed.launch --log_dir=arsl_fcos_r50_fpn_coco_semi010/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/semi_det/arsl/arsl_fcos_r50_fpn_coco_semi010.yml --eval
```
### Evaluation
```bash
CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/semi_det/arsl/arsl_fcos_r50_fpn_coco_semi010.yml -o weights=output/arsl_fcos_r50_fpn_coco_semi010/model_final.pdparams
```
### Inference
```bash
CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/semi_det/arsl/arsl_fcos_r50_fpn_coco_semi010.yml -o weights=output/arsl_fcos_r50_fpn_coco_semi010/model_final.pdparams --infer_img=demo/000000014439.jpg
```
## Citation
```
```

View File

@@ -0,0 +1,56 @@
architecture: ARSL_FCOS
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
ARSL_FCOS:
backbone: ResNet
neck: FPN
fcos_head: FCOSHead_ARSL
fcos_cr_loss: FCOSLossCR
ResNet:
depth: 50
norm_type: bn
freeze_at: 0
return_idx: [1,2,3]
num_stages: 4
FPN:
out_channel: 256
spatial_scales: [0.125, 0.0625, 0.03125]
extra_stage: 2
has_extra_convs: true
use_c5: false
FCOSHead_ARSL:
fcos_feat:
name: FCOSFeat
feat_in: 256
feat_out: 256
num_convs: 4
norm_type: "gn"
use_dcn: false
fpn_stride: [8, 16, 32, 64, 128]
prior_prob: 0.01
norm_reg_targets: True
centerness_on_reg: True
fcos_loss:
name: FCOSLossMILC
loss_alpha: 0.25
loss_gamma: 2.0
iou_loss_type: "giou"
reg_weights: 1.0
nms:
name: MultiClassNMS
nms_top_k: 1000
keep_top_k: 100
score_threshold: 0.025
nms_threshold: 0.6
FCOSLossCR:
iou_loss_type: "giou"
cls_weight: 2.0
reg_weight: 2.0
iou_weight: 0.5
hard_neg_mining_flag: true

View File

@@ -0,0 +1,55 @@
worker_num: 2
SemiTrainReader:
sample_transforms:
- Decode: {}
- RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], keep_ratio: True, interp: 1}
- RandomFlip: {}
weak_aug:
- NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
strong_aug:
- StrongAugImage: {transforms: [
RandomColorJitter: {prob: 0.8, brightness: 0.4, contrast: 0.4, saturation: 0.4, hue: 0.1},
RandomErasingCrop: {},
RandomGaussianBlur: {prob: 0.5, sigma: [0.1, 2.0]},
RandomGrayscale: {prob: 0.2},
]}
- NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
sup_batch_transforms:
- Permute: {}
- PadBatch: {pad_to_stride: 32}
- Gt2FCOSTarget:
object_sizes_boundary: [64, 128, 256, 512]
center_sampling_radius: 1.5
downsample_ratios: [8, 16, 32, 64, 128]
num_shift: 0. # default 0.5
multiply_strides_reg_targets: False
norm_reg_targets: True
unsup_batch_transforms:
- Permute: {}
- PadBatch: {pad_to_stride: 32}
sup_batch_size: 2
unsup_batch_size: 2
shuffle: True
drop_last: True
EvalReader:
sample_transforms:
- Decode: {}
- Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
- NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 1
TestReader:
sample_transforms:
- Decode: {}
- Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
- NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 1

View File

@@ -0,0 +1,29 @@
epoch: 120 # employ iter to control schedule
LearningRate:
base_lr: 0.02 # 0.02 for 8*(4+4) batch
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [3000] # do not decay lr
- !LinearWarmup
start_factor: 0.3333333333333333
steps: 1000
max_iter: 360000 # 360k for 32 batch, 720k for 16 batch
epoch_iter: 1000 # set epoch_iter for saving checkpoint and eval
optimize_rate: 1
SEMISUPNET:
  BBOX_THRESHOLD: 0.5 # not used
TEACHER_UPDATE_ITER: 1
BURN_UP_STEP: 30000
EMA_KEEP_RATE: 0.9996
UNSUP_LOSS_WEIGHT: 1.0 # detailed weights for cls and loc task can be seen in cr_loss
PSEUDO_WARM_UP_STEPS: 2000
OptimizerBuilder:
optimizer:
momentum: 0.9
type: Momentum
regularizer:
factor: 0.0001
type: L2

View File

@@ -0,0 +1,30 @@
epoch: 30 # employ iter to control schedule
LearningRate:
base_lr: 0.02 # 0.02 for 8*(4+4) batch
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [300] # do not decay lr
- !LinearWarmup
start_factor: 0.3333333333333333
steps: 1000
max_iter: 90000 # 90k for 32 batch, 180k for 16 batch
epoch_iter: 1000 # set epoch_iter for saving checkpoint and eval
# update student params according to loss_grad every X iter.
optimize_rate: 1
SEMISUPNET:
BBOX_THRESHOLD: 0.5 # not used
TEACHER_UPDATE_ITER: 1
BURN_UP_STEP: 9000
EMA_KEEP_RATE: 0.9996
UNSUP_LOSS_WEIGHT: 1.0 # detailed weights for cls and loc task can be seen in cr_loss
PSEUDO_WARM_UP_STEPS: 2000
OptimizerBuilder:
optimizer:
momentum: 0.9
type: Momentum
regularizer:
factor: 0.0001
type: L2

View File

@@ -0,0 +1,12 @@
_BASE_: [
'../_base_/coco_detection_full.yml',
'../../runtime.yml',
'_base_/arsl_fcos_r50_fpn.yml',
'_base_/optimizer_360k.yml',
'_base_/arsl_fcos_reader.yml',
]
weights: output/fcos_r50_fpn_arsl_360k_coco_full/model_final
#semi detector type
ssod_method: ARSL

View File

@@ -0,0 +1,12 @@
_BASE_: [
'../_base_/coco_detection_percent_1.yml',
'../../runtime.yml',
'_base_/arsl_fcos_r50_fpn.yml',
'_base_/optimizer_90k.yml',
'_base_/arsl_fcos_reader.yml',
]
weights: output/arsl_fcos_r50_fpn_coco_semi001/model_final
#semi detector type
ssod_method: ARSL

View File

@@ -0,0 +1,12 @@
_BASE_: [
'../_base_/coco_detection_percent_5.yml',
'../../runtime.yml',
'_base_/arsl_fcos_r50_fpn.yml',
'_base_/optimizer_90k.yml',
'_base_/arsl_fcos_reader.yml',
]
weights: output/arsl_fcos_r50_fpn_coco_semi005/model_final
#semi detector type
ssod_method: ARSL

View File

@@ -0,0 +1,12 @@
_BASE_: [
'../_base_/coco_detection_percent_10.yml',
'../../runtime.yml',
'_base_/arsl_fcos_r50_fpn.yml',
'_base_/optimizer_360k.yml',
'_base_/arsl_fcos_reader.yml',
]
weights: output/arsl_fcos_r50_fpn_coco_semi010/model_final
#semi detector type
ssod_method: ARSL

View File

@@ -0,0 +1,47 @@
_BASE_: [
'../_base_/coco_detection_percent_10.yml',
'../../runtime.yml',
'_base_/arsl_fcos_r50_fpn.yml',
'_base_/optimizer_360k.yml',
'_base_/arsl_fcos_reader.yml',
]
weights: output/arsl_fcos_r50_fpn_coco_semi010/model_final
#semi detector type
ssod_method: ARSL
worker_num: 2
SemiTrainReader:
sample_transforms:
- Decode: {}
# large-scale jittering
- RandomResize: {target_size: [[400, 1333], [1200, 1333]], keep_ratio: True, interp: 1, random_range: True}
- RandomFlip: {}
weak_aug:
- NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
strong_aug:
- StrongAugImage: {transforms: [
RandomColorJitter: {prob: 0.8, brightness: 0.4, contrast: 0.4, saturation: 0.4, hue: 0.1},
RandomErasingCrop: {},
RandomGaussianBlur: {prob: 0.5, sigma: [0.1, 2.0]},
RandomGrayscale: {prob: 0.2},
]}
- NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
sup_batch_transforms:
- Permute: {}
- PadBatch: {pad_to_stride: 32}
- Gt2FCOSTarget:
object_sizes_boundary: [64, 128, 256, 512]
center_sampling_radius: 1.5
downsample_ratios: [8, 16, 32, 64, 128]
num_shift: 0. # default 0.5
multiply_strides_reg_targets: False
norm_reg_targets: True
unsup_batch_transforms:
- Permute: {}
- PadBatch: {pad_to_stride: 32}
sup_batch_size: 2
unsup_batch_size: 2
shuffle: True
drop_last: True

View File

@@ -0,0 +1,93 @@
# Supervised Baseline
## COCO Model Zoo
### [FCOS](../../fcos)
| Base Model | Supervised Data Ratio | Epochs (Iters) | mAP<sup>val<br>0.5:0.95 | Download | Config |
| :---------------: | :-------------: | :---------------: |:---------------------: |:--------: | :---------: |
| FCOS ResNet50-FPN | 5% | 24 (8712) | 21.3 | [download](https://paddledet.bj.bcebos.com/models/fcos_r50_fpn_2x_coco_sup005.pdparams) | [config](fcos_r50_fpn_2x_coco_sup005.yml) |
| FCOS ResNet50-FPN | 10% | 24 (17424) | 26.3 | [download](https://paddledet.bj.bcebos.com/models/fcos_r50_fpn_2x_coco_sup010.pdparams) | [config](fcos_r50_fpn_2x_coco_sup010.yml) |
| FCOS ResNet50-FPN | full | 24 (175896) | 42.6 | [download](https://paddledet.bj.bcebos.com/models/fcos_r50_fpn_iou_multiscale_2x_coco.pdparams) | [config](../../fcos/fcos_r50_fpn_iou_multiscale_2x_coco.yml) |
**Notes:**
- The above models are trained with 8 GPUs by default, with a total batch_size of 16 and a default initial learning rate of 0.01. If you change the total batch_size, scale the learning rate linearly.
### [PP-YOLOE+](../../ppyoloe)
| Base Model | Supervised Data Ratio | Epochs (Iters) | mAP<sup>val<br>0.5:0.95 | Download | Config |
| :---------------: | :-------------: | :---------------: | :---------------------: |:--------: | :---------: |
| PP-YOLOE+_s | 5% | 80 (7200) | 32.8 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_s_80e_coco_sup005.pdparams) | [config](ppyoloe_plus_crn_s_80e_coco_sup005.yml) |
| PP-YOLOE+_s | 10% | 80 (14480) | 35.3 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_s_80e_coco_sup010.pdparams) | [config](ppyoloe_plus_crn_s_80e_coco_sup010.yml) |
| PP-YOLOE+_s | full | 80 (146560) | 43.7 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_s_80e_coco.pdparams) | [config](../../ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml) |
| PP-YOLOE+_l | 5% | 80 (7200) | 42.9 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco_sup005.pdparams) | [config](ppyoloe_plus_crn_l_80e_coco_sup005.yml) |
| PP-YOLOE+_l | 10% | 80 (14480) | 45.7 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco_sup010.pdparams) | [config](ppyoloe_plus_crn_l_80e_coco_sup010.yml) |
| PP-YOLOE+_l | full | 80 (146560) | 49.8 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams) | [config](../../ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml) |
**Notes:**
- The above models are trained with 8 GPUs by default, with a total batch_size of 64 and a default initial learning rate of 0.001. If you change the total batch_size, scale the learning rate linearly.
### [Faster R-CNN](../../faster_rcnn)
| Base Model | Supervised Data Ratio | Epochs (Iters) | mAP<sup>val<br>0.5:0.95 | Download | Config |
| :---------------: | :-------------: | :---------------: | :---------------------: |:--------: | :---------: |
| Faster R-CNN ResNet50-FPN | 5% | 24 (8712) | 20.7 | [download](https://paddledet.bj.bcebos.com/models/faster_rcnn_r50_fpn_2x_coco_sup005.pdparams) | [config](faster_rcnn_r50_fpn_2x_coco_sup005.yml) |
| Faster R-CNN ResNet50-FPN | 10% | 24 (17424) | 25.6 | [download](https://paddledet.bj.bcebos.com/models/faster_rcnn_r50_fpn_2x_coco_sup010.pdparams) | [config](faster_rcnn_r50_fpn_2x_coco_sup010.yml) |
| Faster R-CNN ResNet50-FPN | full | 24 (175896) | 40.0 | [download](https://paddledet.bj.bcebos.com/models/faster_rcnn_r50_fpn_2x_coco.pdparams) | [config](../../configs/faster_rcnn/faster_rcnn_r50_fpn_2x_coco.yml) |
**Notes:**
- The above models are trained with 8 GPUs by default, with a total batch_size of 16 and a default initial learning rate of 0.02. If you change the total batch_size, scale the learning rate linearly.
### [RetinaNet](../../retinanet)
| Base Model | Supervised Data Ratio | Epochs (Iters) | mAP<sup>val<br>0.5:0.95 | Download | Config |
| :---------------: | :-------------: | :---------------: | :---------------------: |:--------: | :---------: |
| RetinaNet ResNet50-FPN | 5% | 24 (8712) | 13.9 | [download](https://paddledet.bj.bcebos.com/models/retinanet_r50_fpn_2x_coco_sup005.pdparams) | [config](retinanet_r50_fpn_2x_coco_sup005.yml) |
| RetinaNet ResNet50-FPN | 10% | 24 (17424) | 23.6 | [download](https://paddledet.bj.bcebos.com/models/retinanet_r50_fpn_2x_coco_sup010.pdparams) | [config](retinanet_r50_fpn_2x_coco_sup010.yml) |
| RetinaNet ResNet50-FPN | full | 24 (175896) | 39.1 | [download](https://paddledet.bj.bcebos.com/models/retinanet_r50_fpn_2x_coco.pdparams) | [config](../../configs/retinanet/retinanet_r50_fpn_2x_coco.yml) |
**Notes:**
- The above models are trained with 8 GPUs by default, with a total batch_size of 16 and a default initial learning rate of 0.01. If you change the total batch_size, scale the learning rate linearly.
### [RT-DETR](../../rtdetr)
| Base Model | Supervised Data Ratio | mAP<sup>val<br>0.5:0.95 | Download | Config |
| :---------------: | :-------------: | :---------------------: |:--------: | :---------: |
| RT-DETR ResNet50-vd | 5% | 39.1 | [download](https://bj.bcebos.com/v1/paddledet/data/semidet/rtdetr_ssod/baseline/rtdetr_r50vd_6x_coco_sup005.pdparams) | [config](rtdetr_r50vd_6x_coco_sup005.yml) |
| RT-DETR ResNet50-vd | 10% | 42.3 | [download](https://bj.bcebos.com/v1/paddledet/data/semidet/rtdetr_ssod/baseline/rtdetr_r50vd_6x_coco_sup010.pdparams) | [config](rtdetr_r50vd_6x_coco_sup010.yml) |
| RT-DETR ResNet50-vd | VOC2007 | 62.7 | [download](https://bj.bcebos.com/v1/paddledet/data/semidet/rtdetr_ssod/baseline/rtdetr_r50vd_6x_voc2007.pdparams) | [config](rtdetr_r50vd_6x_voc2007.yml) |
**Notes:**
- RT-DETR models are trained with 4 GPUs by default, with a total batch_size of 16 and a default initial learning rate of 0.0001. If you change the total batch_size, scale the learning rate linearly.
### Remarks
- For the partially supervised COCO datasets, refer to [Dataset Preparation](../README.md) for download and preparation. The training set at each ratio is a **subset sampled by percentage from train2017**, using the split with `fold` number 1 by default. `sup010` means training with 10% supervised data, `sup005` means 5%, and `full` means all of train2017; the validation set is always the full val2017;
- different ways of sampling the supervised subset, or different `fold` numbers, can lead to accuracy differences of up to about 0.5 mAP;
- PP-YOLOE+ uses Objects365 pretraining; all other models use ImageNet pretraining;
- scale the learning rate linearly according to the formula **lr<sub>new</sub> = lr<sub>default</sub> * (batch_size<sub>new</sub> * GPU_number<sub>new</sub>) / (batch_size<sub>default</sub> * GPU_number<sub>default</sub>)** (see the worked example below).
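A short worked example of the scaling rule (illustrative; the defaults are the ones listed in the notes above):
```python
# Linear learning-rate scaling: lr_new = lr_default * total_batch_new / total_batch_default (illustrative).
def scale_lr(lr_default, bs_default, gpus_default, bs_new, gpus_new):
    return lr_default * (bs_new * gpus_new) / (bs_default * gpus_default)

# e.g. FCOS baseline: default lr 0.01 with 8 GPUs x batch 2; training on 4 GPUs x batch 2 instead
print(scale_lr(0.01, bs_default=2, gpus_default=8, bs_new=2, gpus_new=4))  # -> 0.005
```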
## Usage Tutorial
Write the following commands into a script file such as `run.sh` and run everything at once with `sh run.sh`, or run them one by one on the command line:
```bash
model_type=semi_det/baseline
job_name=ppyoloe_plus_crn_s_80e_coco_sup010 # can be changed, e.g. fcos_r50_fpn_2x_coco_sup010
config=configs/${model_type}/${job_name}.yml
log_dir=log_dir/${job_name}
weights=output/${job_name}/model_final.pdparams
# 1.training
# CUDA_VISIBLE_DEVICES=0 python tools/train.py -c ${config}
python -m paddle.distributed.launch --log_dir=${log_dir} --gpus 0,1,2,3,4,5,6,7 tools/train.py -c ${config} --eval --amp
# 2.eval
CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c ${config} -o weights=${weights}
```

View File

@@ -0,0 +1,42 @@
_BASE_: [
'../../faster_rcnn/faster_rcnn_r50_fpn_2x_coco.yml',
]
log_iter: 50
snapshot_epoch: 2
weights: output/faster_rcnn_r50_fpn_2x_coco_sup005/model_final
TrainDataset:
!COCODataSet
image_dir: train2017
anno_path: semi_annotations/instances_train2017.1@5.json
dataset_dir: dataset/coco
data_fields: ['image', 'gt_bbox', 'gt_class']
worker_num: 2
TrainReader:
sample_transforms:
- Decode: {}
- RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
- RandomFlip: {}
- NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 2
shuffle: true
drop_last: true
collate_batch: false
epoch: 24
LearningRate:
base_lr: 0.01
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [16, 22]
- !LinearWarmup
start_factor: 0.1
epochs: 1

View File

@@ -0,0 +1,42 @@
_BASE_: [
'../../faster_rcnn/faster_rcnn_r50_fpn_2x_coco.yml',
]
log_iter: 50
snapshot_epoch: 2
weights: output/faster_rcnn_r50_fpn_2x_coco_sup010/model_final
TrainDataset:
!COCODataSet
image_dir: train2017
anno_path: semi_annotations/instances_train2017.1@10.json
dataset_dir: dataset/coco
data_fields: ['image', 'gt_bbox', 'gt_class']
worker_num: 2
TrainReader:
sample_transforms:
- Decode: {}
- RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
- RandomFlip: {}
- NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 2
shuffle: true
drop_last: true
collate_batch: false
epoch: 24
LearningRate:
base_lr: 0.02
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [16, 22]
- !LinearWarmup
start_factor: 0.1
epochs: 1

View File

@@ -0,0 +1,26 @@
_BASE_: [
'../../fcos/fcos_r50_fpn_iou_multiscale_2x_coco.yml',
]
log_iter: 50
snapshot_epoch: 2
weights: output/fcos_r50_fpn_2x_coco_sup005/model_final
TrainDataset:
!COCODataSet
image_dir: train2017
anno_path: semi_annotations/instances_train2017.1@5.json
dataset_dir: dataset/coco
data_fields: ['image', 'gt_bbox', 'gt_class']
epoch: 24
LearningRate:
base_lr: 0.01
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [16, 22]
- !LinearWarmup
start_factor: 0.001
epochs: 1

View File

@@ -0,0 +1,26 @@
_BASE_: [
'../../fcos/fcos_r50_fpn_iou_multiscale_2x_coco.yml',
]
log_iter: 50
snapshot_epoch: 2
weights: output/fcos_r50_fpn_2x_coco_sup010/model_final
TrainDataset:
!COCODataSet
image_dir: train2017
anno_path: semi_annotations/instances_train2017.1@10.json
dataset_dir: dataset/coco
data_fields: ['image', 'gt_bbox', 'gt_class']
epoch: 24
LearningRate:
base_lr: 0.01
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [16, 22]
- !LinearWarmup
start_factor: 0.001
epochs: 1

View File

@@ -0,0 +1,29 @@
_BASE_: [
'../../ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml',
]
log_iter: 50
snapshot_epoch: 5
weights: output/ppyoloe_plus_crn_l_80e_coco_sup005/model_final
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_l_obj365_pretrained.pdparams
depth_mult: 1.0
width_mult: 1.0
TrainDataset:
!COCODataSet
image_dir: train2017
anno_path: semi_annotations/instances_train2017.1@5.json
dataset_dir: dataset/coco
data_fields: ['image', 'gt_bbox', 'gt_class']
epoch: 80
LearningRate:
base_lr: 0.001
schedulers:
- !CosineDecay
max_epochs: 96
- !LinearWarmup
start_factor: 0.
epochs: 5

View File

@@ -0,0 +1,29 @@
_BASE_: [
'../../ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml',
]
log_iter: 50
snapshot_epoch: 5
weights: output/ppyoloe_plus_crn_l_80e_coco_sup010/model_final
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_l_obj365_pretrained.pdparams
depth_mult: 1.0
width_mult: 1.0
TrainDataset:
!COCODataSet
image_dir: train2017
anno_path: semi_annotations/instances_train2017.1@10.json
dataset_dir: dataset/coco
data_fields: ['image', 'gt_bbox', 'gt_class']
epoch: 80
LearningRate:
base_lr: 0.001
schedulers:
- !CosineDecay
max_epochs: 96
- !LinearWarmup
start_factor: 0.
epochs: 5

View File

@@ -0,0 +1,29 @@
_BASE_: [
'../../ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml',
]
log_iter: 50
snapshot_epoch: 5
weights: output/ppyoloe_plus_crn_s_80e_coco_sup005/model_final
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_s_obj365_pretrained.pdparams
depth_mult: 0.33
width_mult: 0.50
TrainDataset:
!COCODataSet
image_dir: train2017
anno_path: semi_annotations/instances_train2017.1@5.json
dataset_dir: dataset/coco
data_fields: ['image', 'gt_bbox', 'gt_class']
epoch: 80
LearningRate:
base_lr: 0.001
schedulers:
- !CosineDecay
max_epochs: 96
- !LinearWarmup
start_factor: 0.
epochs: 5

View File

@@ -0,0 +1,29 @@
_BASE_: [
'../../ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml',
]
log_iter: 50
snapshot_epoch: 5
weights: output/ppyoloe_plus_crn_s_80e_coco_sup010/model_final
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_s_obj365_pretrained.pdparams
depth_mult: 0.33
width_mult: 0.50
TrainDataset:
!COCODataSet
image_dir: train2017
anno_path: semi_annotations/instances_train2017.1@10.json
dataset_dir: dataset/coco
data_fields: ['image', 'gt_bbox', 'gt_class']
epoch: 80
LearningRate:
base_lr: 0.001
schedulers:
- !CosineDecay
max_epochs: 96
- !LinearWarmup
start_factor: 0.
epochs: 5

View File

@@ -0,0 +1,26 @@
_BASE_: [
'../../retinanet/retinanet_r50_fpn_2x_coco.yml',
]
log_iter: 50
snapshot_epoch: 2
weights: output/retinanet_r50_fpn_2x_coco_sup005/model_final
TrainDataset:
!COCODataSet
image_dir: train2017
anno_path: semi_annotations/instances_train2017.1@5.json
dataset_dir: dataset/coco
data_fields: ['image', 'gt_bbox', 'gt_class']
epoch: 24
LearningRate:
base_lr: 0.01
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [16, 22]
- !LinearWarmup
start_factor: 0.001
epochs: 1

View File

@@ -0,0 +1,26 @@
_BASE_: [
'../../retinanet/retinanet_r50_fpn_2x_coco.yml',
]
log_iter: 50
snapshot_epoch: 2
weights: output/retinanet_r50_fpn_2x_coco_sup010/model_final
TrainDataset:
!COCODataSet
image_dir: train2017
anno_path: semi_annotations/instances_train2017.1@10.json
dataset_dir: dataset/coco
data_fields: ['image', 'gt_bbox', 'gt_class']
epoch: 24
LearningRate:
base_lr: 0.01
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [16, 22]
- !LinearWarmup
start_factor: 0.001
epochs: 1

View File

@@ -0,0 +1,35 @@
_BASE_: [
'../../rtdetr/rtdetr_r50vd_6x_coco.yml',
]
log_iter: 50
snapshot_epoch: 2
weights: output/rtdetr_r50vd_6x_coco/model_final
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
TrainDataset:
!COCODataSet
image_dir: train2017
anno_path: semi_annotations/instances_train2017.1@5.json
dataset_dir: dataset/coco
data_fields: ['image', 'gt_bbox', 'gt_class']
worker_num: 4
TrainReader:
sample_transforms:
- Decode: {}
- RandomDistort: {prob: 0.8}
- RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
- RandomCrop: {prob: 0.8}
- RandomFlip: {}
batch_transforms:
- BatchRandomResize: {target_size: [480, 512, 544, 576, 608, 640, 640, 640, 672, 704, 736, 768, 800], random_size: True, random_interp: True, keep_ratio: False}
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
- NormalizeBox: {}
- BboxXYXY2XYWH: {}
- Permute: {}
batch_size: 4
shuffle: true
drop_last: true
collate_batch: false
use_shared_memory: false

View File

@@ -0,0 +1,35 @@
_BASE_: [
'../../rtdetr/rtdetr_r50vd_6x_coco.yml',
]
log_iter: 50
snapshot_epoch: 2
weights: output/rtdetr_r50vd_6x_coco/model_final
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
TrainDataset:
!COCODataSet
image_dir: train2017
anno_path: semi_annotations/instances_train2017.1@5.json
dataset_dir: dataset/coco
data_fields: ['image', 'gt_bbox', 'gt_class']
worker_num: 4
TrainReader:
sample_transforms:
- Decode: {}
- RandomDistort: {prob: 0.8}
- RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
- RandomCrop: {prob: 0.8}
- RandomFlip: {}
batch_transforms:
- BatchRandomResize: {target_size: [480, 512, 544, 576, 608, 640, 640, 640, 672, 704, 736, 768, 800], random_size: True, random_interp: True, keep_ratio: False}
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
- NormalizeBox: {}
- BboxXYXY2XYWH: {}
- Permute: {}
batch_size: 4
shuffle: true
drop_last: true
collate_batch: false
use_shared_memory: false

View File

@@ -0,0 +1,101 @@
Simplified Chinese | [English](README_en.md)
# Dense Teacher: Dense Pseudo-Labels for Semi-supervised Object Detection
## FCOS Model Zoo
| Model | Supervised Data Ratio | Sup Baseline | Sup Epochs (Iters) | Sup mAP<sup>val<br>0.5:0.95 | Semi mAP<sup>val<br>0.5:0.95 | Semi Epochs (Iters) | Download | Config |
| :------------: | :---------: | :---------------------: | :---------------------: |:---------------------------: |:----------------------------: | :------------------: |:--------: |:----------: |
| DenseTeacher-FCOS | 5% | [sup_config](../baseline/fcos_r50_fpn_2x_coco_sup005.yml) | 24 (8712) | 21.3 | **30.6** | 240 (87120) | [download](https://paddledet.bj.bcebos.com/models/denseteacher_fcos_r50_fpn_coco_semi005.pdparams) | [config](./denseteacher_fcos_r50_fpn_coco_semi005.yml) |
| DenseTeacher-FCOS | 10% | [sup_config](../baseline/fcos_r50_fpn_2x_coco_sup010.yml) | 24 (17424) | 26.3 | **35.1** | 240 (174240) | [download](https://paddledet.bj.bcebos.com/models/denseteacher_fcos_r50_fpn_coco_semi010.pdparams) | [config](./denseteacher_fcos_r50_fpn_coco_semi010.yml) |
| DenseTeacher-FCOS(LSJ)| 10% | [sup_config](../baseline/fcos_r50_fpn_2x_coco_sup010.yml) | 24 (17424) | 26.3 | **37.1(LSJ)** | 240 (174240) | [download](https://paddledet.bj.bcebos.com/models/denseteacher_fcos_r50_fpn_coco_semi010_lsj.pdparams) | [config](./denseteacher_fcos_r50_fpn_coco_semi010_lsj.yml) |
| DenseTeacher-FCOS |100%(full)| [sup_config](../../fcos/fcos_r50_fpn_iou_multiscale_2x_coco.yml) | 24 (175896) | 42.6 | **44.2** | 24 (175896)| [download](https://paddledet.bj.bcebos.com/models/denseteacher_fcos_r50_fpn_coco_full.pdparams) | [config](./denseteacher_fcos_r50_fpn_coco_full.yml) |
**Notes:**
- The above models are trained with 8 GPUs by default; the total batch_size for supervised data is 16 and the total batch_size for unsupervised data is also 16, with a default initial learning rate of 0.01. If you change the total batch_size, scale the learning rate linearly;
- **Supervised data ratio** is the percentage of labeled COCO data relative to the full COCO train2017 training set; the unlabeled COCO data generally uses the same ratio, but its images do not overlap with the labeled images;
- `Semi Epochs (Iters)` is the number of epochs (iterations) of the **semi-supervised** model; if you use a **custom dataset**, convert iterations to the corresponding epochs yourself, preferably keeping the total iterations close to the COCO setting;
- `Sup mAP` is the accuracy of the model trained **only with supervised data**; refer to the **base detector's config file** and [baseline](../baseline);
- `Semi mAP` is the accuracy of the **semi-supervised** model; the download and config links point to the **semi-supervised model**;
- `LSJ` means **large-scale jittering**, i.e. multi-scale training over a larger range, which can further improve accuracy but slows down training;
- for an explanation of the semi-supervised detection configuration, refer to the [documentation](../README.md#semi-supervised-detection-configuration);
- The original `Dense Teacher` paper uses `R50-va-caffe` pretraining, while PaddleDetection uses `R50-vb` pretraining by default. Using `R50-vd` pretrained weights combined with [SSLD](../../../docs/feature_models/SSLD_PRETRAINED_MODEL.md) can further improve detection accuracy significantly; the backbone configuration must be changed accordingly:
```python
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
ResNet:
depth: 50
variant: d
norm_type: bn
freeze_at: 0
return_idx: [1, 2, 3]
num_stages: 4
lr_mult_list: [0.05, 0.05, 0.1, 0.15]
```
## PP-YOLOE+ Model Zoo
| Model | Supervised Data Ratio | Sup Baseline | Sup Epochs (Iters) | Sup mAP<sup>val<br>0.5:0.95 | Semi mAP<sup>val<br>0.5:0.95 | Semi Epochs (Iters) | Download | Config |
| :------------: | :---------: | :---------------------: | :---------------------: |:---------------------------: |:----------------------------: | :------------------: |:--------: |:----------: |
| DenseTeacher-PPYOLOE+_s | 5% | [sup_config](../baseline/ppyoloe_plus_crn_s_80e_coco_sup005.yml) | 80 (14480) | 32.8 | **34.0** | 200 (36200) | [download](https://paddledet.bj.bcebos.com/models/denseteacher_ppyoloe_plus_crn_s_coco_semi005.pdparams) | [config](./denseteacher_ppyoloe_plus_crn_s_coco_semi005.yml) |
| DenseTeacher-PPYOLOE+_s | 10% | [sup_config](../baseline/ppyoloe_plus_crn_s_80e_coco_sup010.yml) | 80 (14480) | 35.3 | **37.5** | 200 (36200) | [download](https://paddledet.bj.bcebos.com/models/denseteacher_ppyoloe_plus_crn_s_coco_semi010.pdparams) | [config](./denseteacher_ppyoloe_plus_crn_s_coco_semi010.yml) |
| DenseTeacher-PPYOLOE+_l | 5% | [sup_config](../baseline/ppyoloe_plus_crn_l_80e_coco_sup005.yml) | 80 (14480) | 42.9 | **45.4** | 200 (36200) | [download](https://paddledet.bj.bcebos.com/models/denseteacher_ppyoloe_plus_crn_l_coco_semi005.pdparams) | [config](./denseteacher_ppyoloe_plus_crn_l_coco_semi005.yml) |
| DenseTeacher-PPYOLOE+_l | 10% | [sup_config](../baseline/ppyoloe_plus_crn_l_80e_coco_sup010.yml) | 80 (14480) | 45.7 | **47.4** | 200 (36200) | [download](https://paddledet.bj.bcebos.com/models/denseteacher_ppyoloe_plus_crn_l_coco_semi010.pdparams) | [config](./denseteacher_ppyoloe_plus_crn_l_coco_semi010.yml) |
## Usage Instructions
Only training must use the semi-supervised detection config file; evaluation, inference, and deployment can also be run with the base detector's config file.
### Training
```bash
# single-GPU training (not recommended; scale the learning rate linearly if you do)
CUDA_VISIBLE_DEVICES=0 python tools/train.py -c configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi010.yml --eval
# multi-GPU training
python -m paddle.distributed.launch --log_dir=denseteacher_fcos_semi010/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi010.yml --eval
```
### Evaluation
```bash
CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi010.yml -o weights=output/denseteacher_fcos_r50_fpn_coco_semi010/model_final.pdparams
```
### Inference
```bash
CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi010.yml -o weights=output/denseteacher_fcos_r50_fpn_coco_semi010/model_final.pdparams --infer_img=demo/000000014439.jpg
```
### Deployment
Deployment can use either the semi-supervised detection config file or the base detector's config file.
```bash
# export the model
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi010.yml -o weights=https://paddledet.bj.bcebos.com/models/denseteacher_fcos_r50_fpn_coco_semi010.pdparams
# inference with the exported model
CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/denseteacher_fcos_r50_fpn_coco_semi010 --image_file=demo/000000014439_640x640.jpg --device=GPU
# benchmark the deployed model
CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/denseteacher_fcos_r50_fpn_coco_semi010 --image_file=demo/000000014439_640x640.jpg --device=GPU --run_benchmark=True # --run_mode=trt_fp16
# export to ONNX
paddle2onnx --model_dir output_inference/denseteacher_fcos_r50_fpn_coco_semi010/ --model_filename model.pdmodel --params_filename model.pdiparams --opset_version 12 --save_file denseteacher_fcos_r50_fpn_coco_semi010.onnx
```
## Citation
```
@article{denseteacher2022,
title={Dense Teacher: Dense Pseudo-Labels for Semi-supervised Object Detection},
author={Hongyu Zhou, Zheng Ge, Songtao Liu, Weixin Mao, Zeming Li, Haiyan Yu, Jian Sun},
journal={arXiv preprint arXiv:2207.02541},
year={2022}
}
```

View File

@@ -0,0 +1,166 @@
_BASE_: [
'denseteacher_fcos_r50_fpn_coco_semi010.yml',
'../_base_/coco_detection_full.yml',
]
log_iter: 100
snapshot_epoch: 2
epochs: &epochs 24
weights: output/denseteacher_fcos_r50_fpn_coco_full/model_final
### pretrain and warmup config, choose one and comment another
# pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/fcos_r50_fpn_iou_multiscale_2x_coco.pdparams # mAP=42.6
# semi_start_iters: 0
# ema_start_iters: 0
# use_warmup: &use_warmup False
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
semi_start_iters: 5000
ema_start_iters: 3000
use_warmup: &use_warmup True
### global config
use_simple_ema: True
ema_decay: 0.9996
ssod_method: DenseTeacher
DenseTeacher:
train_cfg:
sup_weight: 1.0
unsup_weight: 1.0
loss_weight: {distill_loss_cls: 2.0, distill_loss_box: 1.0, distill_loss_quality: 1.0}
concat_sup_data: True
suppress: linear
ratio: 0.01
gamma: 2.0
test_cfg:
inference_on: teacher
### reader config
worker_num: 2
SemiTrainReader:
sample_transforms:
- Decode: {}
- RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], keep_ratio: True, interp: 1}
- RandomFlip: {}
weak_aug:
- NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
strong_aug:
- StrongAugImage: {transforms: [
RandomColorJitter: {prob: 0.8, brightness: 0.4, contrast: 0.4, saturation: 0.4, hue: 0.1},
RandomErasingCrop: {},
RandomGaussianBlur: {prob: 0.5, sigma: [0.1, 2.0]},
RandomGrayscale: {prob: 0.2},
]}
- NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
sup_batch_transforms:
- Permute: {}
- PadBatch: {pad_to_stride: 32}
- Gt2FCOSTarget:
object_sizes_boundary: [64, 128, 256, 512]
center_sampling_radius: 1.5
downsample_ratios: [8, 16, 32, 64, 128]
num_shift: 0.5
norm_reg_targets: True
unsup_batch_transforms:
- Permute: {}
- PadBatch: {pad_to_stride: 32}
sup_batch_size: 2
unsup_batch_size: 2
shuffle: True
drop_last: True
EvalReader:
sample_transforms:
- Decode: {}
- Resize: {target_size: [800, 1333], keep_ratio: True, interp: 1}
- NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 1
TestReader:
sample_transforms:
- Decode: {}
- Resize: {target_size: [800, 1333], keep_ratio: True, interp: 1}
- NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 1
fuse_normalize: True
### model config
architecture: FCOS
FCOS:
backbone: ResNet
neck: FPN
fcos_head: FCOSHead
ResNet:
depth: 50
variant: 'b'
norm_type: bn
freeze_at: 0 # res2
return_idx: [1, 2, 3]
num_stages: 4
FPN:
out_channel: 256
spatial_scales: [0.125, 0.0625, 0.03125]
extra_stage: 2
has_extra_convs: True
use_c5: False
FCOSHead:
fcos_feat:
name: FCOSFeat
feat_in: 256
feat_out: 256
num_convs: 4
norm_type: "gn"
use_dcn: False
fpn_stride: [8, 16, 32, 64, 128]
prior_prob: 0.01
norm_reg_targets: True
centerness_on_reg: True
num_shift: 0.5
fcos_loss:
name: FCOSLoss
loss_alpha: 0.25
loss_gamma: 2.0
iou_loss_type: "giou"
reg_weights: 1.0
quality: "iou"
nms:
name: MultiClassNMS
nms_top_k: 1000
keep_top_k: 100
score_threshold: 0.025
nms_threshold: 0.6
### other config
epoch: *epochs
LearningRate:
base_lr: 0.01
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [*epochs]
use_warmup: *use_warmup
- !LinearWarmup
start_factor: 0.001
steps: 1000
OptimizerBuilder:
optimizer:
momentum: 0.9
type: Momentum
regularizer:
factor: 0.0001
type: L2
clip_grad_by_value: 1.0

View File

@@ -0,0 +1,164 @@
_BASE_: [
'../../fcos/fcos_r50_fpn_iou_multiscale_2x_coco.yml',
'../_base_/coco_detection_percent_5.yml',
]
log_iter: 20
snapshot_epoch: 5
epochs: &epochs 240 # 480 will be better
weights: output/denseteacher_fcos_r50_fpn_coco_semi005/model_final
### pretrain and warmup config, choose one and comment out the other
# pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/fcos_r50_fpn_2x_coco_sup005.pdparams # mAP=21.3
# semi_start_iters: 0
# ema_start_iters: 0
# use_warmup: &use_warmup False
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
semi_start_iters: 5000
ema_start_iters: 3000
use_warmup: &use_warmup True
### global config
use_simple_ema: True
ema_decay: 0.9996
ssod_method: DenseTeacher
DenseTeacher:
train_cfg:
sup_weight: 1.0
unsup_weight: 1.0
loss_weight: {distill_loss_cls: 4.0, distill_loss_box: 1.0, distill_loss_quality: 1.0}
concat_sup_data: True
suppress: linear
ratio: 0.01
gamma: 2.0
test_cfg:
inference_on: teacher
### reader config
worker_num: 2
SemiTrainReader:
sample_transforms:
- Decode: {}
- RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], keep_ratio: True, interp: 1}
- RandomFlip: {}
weak_aug:
- NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
strong_aug:
- StrongAugImage: {transforms: [
RandomColorJitter: {prob: 0.8, brightness: 0.4, contrast: 0.4, saturation: 0.4, hue: 0.1},
RandomErasingCrop: {},
RandomGaussianBlur: {prob: 0.5, sigma: [0.1, 2.0]},
RandomGrayscale: {prob: 0.2},
]}
- NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
sup_batch_transforms:
- Permute: {}
- PadBatch: {pad_to_stride: 32}
- Gt2FCOSTarget:
object_sizes_boundary: [64, 128, 256, 512]
center_sampling_radius: 1.5
downsample_ratios: [8, 16, 32, 64, 128]
norm_reg_targets: True
unsup_batch_transforms:
- Permute: {}
- PadBatch: {pad_to_stride: 32}
sup_batch_size: 2
unsup_batch_size: 2
shuffle: True
drop_last: True
EvalReader:
sample_transforms:
- Decode: {}
- Resize: {target_size: [800, 1333], keep_ratio: True, interp: 1}
- NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 1
TestReader:
sample_transforms:
- Decode: {}
- Resize: {target_size: [800, 1333], keep_ratio: True, interp: 1}
- NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 1
fuse_normalize: True
### model config
architecture: FCOS
FCOS:
backbone: ResNet
neck: FPN
fcos_head: FCOSHead
ResNet:
depth: 50
variant: 'b'
norm_type: bn
freeze_at: 0 # res2
return_idx: [1, 2, 3]
num_stages: 4
FPN:
out_channel: 256
spatial_scales: [0.125, 0.0625, 0.03125]
extra_stage: 2
has_extra_convs: True
use_c5: False
FCOSHead:
fcos_feat:
name: FCOSFeat
feat_in: 256
feat_out: 256
num_convs: 4
norm_type: "gn"
use_dcn: False
fpn_stride: [8, 16, 32, 64, 128]
prior_prob: 0.01
norm_reg_targets: True
centerness_on_reg: True
fcos_loss:
name: FCOSLoss
loss_alpha: 0.25
loss_gamma: 2.0
iou_loss_type: "giou"
reg_weights: 1.0
quality: "iou"
nms:
name: MultiClassNMS
nms_top_k: 1000
keep_top_k: 100
score_threshold: 0.025
nms_threshold: 0.6
### other config
epoch: *epochs
LearningRate:
base_lr: 0.01
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [*epochs]
use_warmup: *use_warmup
- !LinearWarmup
start_factor: 0.001
steps: 1000
OptimizerBuilder:
optimizer:
momentum: 0.9
type: Momentum
regularizer:
factor: 0.0001
type: L2
clip_grad_by_value: 1.0

View File

@@ -0,0 +1,169 @@
_BASE_: [
'../../fcos/fcos_r50_fpn_iou_multiscale_2x_coco.yml',
'../_base_/coco_detection_percent_10.yml',
]
log_iter: 50
snapshot_epoch: 5
epochs: &epochs 240
weights: output/denseteacher_fcos_r50_fpn_coco_semi010/model_final
### pretrain and warmup config, choose one and comment out the other
# pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/fcos_r50_fpn_2x_coco_sup010.pdparams # mAP=26.3
# semi_start_iters: 0
# ema_start_iters: 0
# use_warmup: &use_warmup False
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
semi_start_iters: 5000
ema_start_iters: 3000
use_warmup: &use_warmup True
### global config
use_simple_ema: True
ema_decay: 0.9996
ssod_method: DenseTeacher
DenseTeacher:
train_cfg:
sup_weight: 1.0
unsup_weight: 1.0
loss_weight: {distill_loss_cls: 4.0, distill_loss_box: 1.0, distill_loss_quality: 1.0}
concat_sup_data: True
suppress: linear
ratio: 0.01
gamma: 2.0
test_cfg:
inference_on: teacher
### reader config
worker_num: 2
SemiTrainReader:
sample_transforms:
- Decode: {}
- RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], keep_ratio: True, interp: 1}
- RandomFlip: {}
weak_aug:
- NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
strong_aug:
- StrongAugImage: {transforms: [
RandomColorJitter: {prob: 0.8, brightness: 0.4, contrast: 0.4, saturation: 0.4, hue: 0.1},
RandomErasingCrop: {},
RandomGaussianBlur: {prob: 0.5, sigma: [0.1, 2.0]},
RandomGrayscale: {prob: 0.2},
]}
- NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
sup_batch_transforms:
- Permute: {}
- PadBatch: {pad_to_stride: 32}
- Gt2FCOSTarget:
object_sizes_boundary: [64, 128, 256, 512]
center_sampling_radius: 1.5
downsample_ratios: [8, 16, 32, 64, 128]
num_shift: 0. # default 0.5
multiply_strides_reg_targets: False
norm_reg_targets: True
unsup_batch_transforms:
- Permute: {}
- PadBatch: {pad_to_stride: 32}
sup_batch_size: 2
unsup_batch_size: 2
shuffle: True
drop_last: True
EvalReader:
sample_transforms:
- Decode: {}
- Resize: {target_size: [800, 1333], keep_ratio: True, interp: 1}
- NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 1
TestReader:
sample_transforms:
- Decode: {}
- Resize: {target_size: [800, 1333], keep_ratio: True, interp: 1}
- NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 1
fuse_normalize: True
### model config
architecture: FCOS
FCOS:
backbone: ResNet
neck: FPN
fcos_head: FCOSHead
ResNet:
depth: 50
variant: 'b'
norm_type: bn
freeze_at: 0 # res2
return_idx: [1, 2, 3]
num_stages: 4
FPN:
out_channel: 256
spatial_scales: [0.125, 0.0625, 0.03125]
extra_stage: 2
has_extra_convs: True
use_c5: False
FCOSHead:
fcos_feat:
name: FCOSFeat
feat_in: 256
feat_out: 256
num_convs: 4
norm_type: "gn"
use_dcn: False
fpn_stride: [8, 16, 32, 64, 128]
prior_prob: 0.01
norm_reg_targets: True
centerness_on_reg: True
num_shift: 0. # default 0.5
multiply_strides_reg_targets: False
sqrt_score: False
fcos_loss:
name: FCOSLoss
loss_alpha: 0.25
loss_gamma: 2.0
iou_loss_type: "giou"
reg_weights: 1.0
quality: "iou"
nms:
name: MultiClassNMS
nms_top_k: 1000
keep_top_k: 100
score_threshold: 0.025
nms_threshold: 0.6
### other config
epoch: *epochs
LearningRate:
base_lr: 0.01
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [*epochs]
use_warmup: *use_warmup
- !LinearWarmup
start_factor: 0.001
steps: 1000
OptimizerBuilder:
optimizer:
momentum: 0.9
type: Momentum
regularizer:
factor: 0.0001
type: L2
clip_grad_by_value: 1.0

View File

@@ -0,0 +1,44 @@
_BASE_: [
'denseteacher_fcos_r50_fpn_coco_semi010.yml',
]
log_iter: 50
snapshot_epoch: 5
epochs: &epochs 240
weights: output/denseteacher_fcos_r50_fpn_coco_semi010_lsj/model_final
### reader config
worker_num: 2
SemiTrainReader:
sample_transforms:
- Decode: {}
# large-scale jittering
- RandomResize: {target_size: [[400, 1333], [1200, 1333]], keep_ratio: True, interp: 1, random_range: True}
- RandomFlip: {}
weak_aug:
- NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
strong_aug:
- StrongAugImage: {transforms: [
RandomColorJitter: {prob: 0.8, brightness: 0.4, contrast: 0.4, saturation: 0.4, hue: 0.1},
RandomErasingCrop: {},
RandomGaussianBlur: {prob: 0.5, sigma: [0.1, 2.0]},
RandomGrayscale: {prob: 0.2},
]}
- NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
sup_batch_transforms:
- Permute: {}
- PadBatch: {pad_to_stride: 32}
- Gt2FCOSTarget:
object_sizes_boundary: [64, 128, 256, 512]
center_sampling_radius: 1.5
downsample_ratios: [8, 16, 32, 64, 128]
num_shift: 0. # default 0.5
multiply_strides_reg_targets: False
norm_reg_targets: True
unsup_batch_transforms:
- Permute: {}
- PadBatch: {pad_to_stride: 32}
sup_batch_size: 2
unsup_batch_size: 2
shuffle: True
drop_last: True

View File

@@ -0,0 +1,151 @@
_BASE_: [
'../../ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml',
'../_base_/coco_detection_percent_5.yml',
]
log_iter: 50
snapshot_epoch: 5
weights: output/denseteacher_ppyoloe_plus_crn_l_coco_semi005/model_final
epochs: &epochs 200
cosine_epochs: &cosine_epochs 240
### pretrain and warmup config, choose one and comment out the other
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/ppyoloe_plus_crn_l_80e_coco_sup005.pdparams # mAP=42.9
semi_start_iters: 0
ema_start_iters: 0
use_warmup: &use_warmup False
# pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_l_obj365_pretrained.pdparams
# semi_start_iters: 5000
# ema_start_iters: 3000
# use_warmup: &use_warmup True
### global config
use_simple_ema: True
ema_decay: 0.9996
ssod_method: DenseTeacher
DenseTeacher:
train_cfg:
sup_weight: 1.0
unsup_weight: 1.0
loss_weight: {distill_loss_cls: 1.0, distill_loss_iou: 2.5, distill_loss_dfl: 0., distill_loss_contrast: 0.1}
contrast_loss:
temperature: 0.2
alpha: 0.9
smooth_iter: 100
concat_sup_data: True
suppress: linear
ratio: 0.01
test_cfg:
inference_on: teacher
### reader config
batch_size: &batch_size 8
worker_num: 2
SemiTrainReader:
sample_transforms:
- Decode: {}
- RandomDistort: {}
- RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
- RandomFlip: {}
- RandomCrop: {} # unsup will be fake gt_boxes
weak_aug:
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], is_scale: true, norm_type: none}
strong_aug:
- StrongAugImage: {transforms: [
RandomColorJitter: {prob: 0.8, brightness: 0.4, contrast: 0.4, saturation: 0.4, hue: 0.1},
RandomErasingCrop: {},
RandomGaussianBlur: {prob: 0.5, sigma: [0.1, 2.0]},
RandomGrayscale: {prob: 0.2},
]}
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], is_scale: true, norm_type: none}
sup_batch_transforms:
- BatchRandomResize: {target_size: [640], random_size: True, random_interp: True, keep_ratio: False}
- Permute: {}
- PadGT: {}
unsup_batch_transforms:
- BatchRandomResize: {target_size: [640], random_size: True, random_interp: True, keep_ratio: False}
- Permute: {}
sup_batch_size: *batch_size
unsup_batch_size: *batch_size
shuffle: True
drop_last: True
collate_batch: True
EvalReader:
sample_transforms:
- Decode: {}
- Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
- Permute: {}
batch_size: 2
TestReader:
inputs_def:
image_shape: [3, 640, 640]
sample_transforms:
- Decode: {}
- Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
- Permute: {}
batch_size: 1
### model config
architecture: PPYOLOE
norm_type: sync_bn
ema_black_list: ['proj_conv.weight']
custom_black_list: ['reduce_mean']
PPYOLOE:
backbone: CSPResNet
neck: CustomCSPPAN
yolo_head: PPYOLOEHead
post_process: ~
eval_size: ~ # means None, but not str 'None'
PPYOLOEHead:
fpn_strides: [32, 16, 8]
grid_cell_scale: 5.0
grid_cell_offset: 0.5
static_assigner_epoch: -1 #
use_varifocal_loss: True
loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
static_assigner:
name: ATSSAssigner
topk: 9
assigner:
name: TaskAlignedAssigner
topk: 13
alpha: 1.0
beta: 6.0
nms:
name: MultiClassNMS
nms_top_k: 1000
keep_top_k: 300
score_threshold: 0.01
nms_threshold: 0.7
### other config
epoch: *epochs
LearningRate:
base_lr: 0.01
schedulers:
- !CosineDecay
max_epochs: *cosine_epochs
use_warmup: *use_warmup
- !LinearWarmup
start_factor: 0.001
epochs: 3
OptimizerBuilder:
optimizer:
momentum: 0.9
type: Momentum
regularizer:
factor: 0.0005 # dt-fcos 0.0001
type: L2
clip_grad_by_norm: 1.0 # dt-fcos clip_grad_by_value

View File

@@ -0,0 +1,151 @@
_BASE_: [
'../../ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml',
'../_base_/coco_detection_percent_10.yml',
]
log_iter: 50
snapshot_epoch: 5
weights: output/denseteacher_ppyoloe_plus_crn_l_coco_semi010/model_final
epochs: &epochs 200
cosine_epochs: &cosine_epochs 240
### pretrain and warmup config, choose one and comment out the other
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/ppyoloe_plus_crn_l_80e_coco_sup010.pdparams # mAP=45.7
semi_start_iters: 0
ema_start_iters: 0
use_warmup: &use_warmup False
# pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_l_obj365_pretrained.pdparams
# semi_start_iters: 5000
# ema_start_iters: 3000
# use_warmup: &use_warmup True
### global config
use_simple_ema: True
ema_decay: 0.9996
ssod_method: DenseTeacher
DenseTeacher:
train_cfg:
sup_weight: 1.0
unsup_weight: 1.0
loss_weight: {distill_loss_cls: 1.0, distill_loss_iou: 2.5, distill_loss_dfl: 0., distill_loss_contrast: 0.1}
contrast_loss:
temperature: 0.2
alpha: 0.9
smooth_iter: 100
concat_sup_data: True
suppress: linear
ratio: 0.01
test_cfg:
inference_on: teacher
### reader config
batch_size: &batch_size 8
worker_num: 2
SemiTrainReader:
sample_transforms:
- Decode: {}
- RandomDistort: {}
- RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
- RandomFlip: {}
- RandomCrop: {} # unsup will be fake gt_boxes
weak_aug:
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], is_scale: true, norm_type: none}
strong_aug:
- StrongAugImage: {transforms: [
RandomColorJitter: {prob: 0.8, brightness: 0.4, contrast: 0.4, saturation: 0.4, hue: 0.1},
RandomErasingCrop: {},
RandomGaussianBlur: {prob: 0.5, sigma: [0.1, 2.0]},
RandomGrayscale: {prob: 0.2},
]}
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], is_scale: true, norm_type: none}
sup_batch_transforms:
- BatchRandomResize: {target_size: [640], random_size: True, random_interp: True, keep_ratio: False}
- Permute: {}
- PadGT: {}
unsup_batch_transforms:
- BatchRandomResize: {target_size: [640], random_size: True, random_interp: True, keep_ratio: False}
- Permute: {}
sup_batch_size: *batch_size
unsup_batch_size: *batch_size
shuffle: True
drop_last: True
collate_batch: True
EvalReader:
sample_transforms:
- Decode: {}
- Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
- Permute: {}
batch_size: 2
TestReader:
inputs_def:
image_shape: [3, 640, 640]
sample_transforms:
- Decode: {}
- Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
- Permute: {}
batch_size: 1
### model config
architecture: PPYOLOE
norm_type: sync_bn
ema_black_list: ['proj_conv.weight']
custom_black_list: ['reduce_mean']
PPYOLOE:
backbone: CSPResNet
neck: CustomCSPPAN
yolo_head: PPYOLOEHead
post_process: ~
eval_size: ~ # means None, but not str 'None'
PPYOLOEHead:
fpn_strides: [32, 16, 8]
grid_cell_scale: 5.0
grid_cell_offset: 0.5
static_assigner_epoch: -1 #
use_varifocal_loss: True
loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
static_assigner:
name: ATSSAssigner
topk: 9
assigner:
name: TaskAlignedAssigner
topk: 13
alpha: 1.0
beta: 6.0
nms:
name: MultiClassNMS
nms_top_k: 1000
keep_top_k: 300
score_threshold: 0.01
nms_threshold: 0.7
### other config
epoch: *epochs
LearningRate:
base_lr: 0.01
schedulers:
- !CosineDecay
max_epochs: *cosine_epochs
use_warmup: *use_warmup
- !LinearWarmup
start_factor: 0.001
epochs: 3
OptimizerBuilder:
optimizer:
momentum: 0.9
type: Momentum
regularizer:
factor: 0.0005 # dt-fcos 0.0001
type: L2
clip_grad_by_norm: 1.0 # dt-fcos clip_grad_by_value

View File

@@ -0,0 +1,151 @@
_BASE_: [
'../../ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml',
'../_base_/coco_detection_percent_5.yml',
]
log_iter: 50
snapshot_epoch: 5
weights: output/denseteacher_ppyoloe_plus_crn_s_coco_semi005/model_final
epochs: &epochs 200
cosine_epochs: &cosine_epochs 240
### pretrain and warmup config, choose one and comment out the other
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/ppyoloe_plus_crn_s_80e_coco_sup005.pdparams # mAP=32.8
semi_start_iters: 0
ema_start_iters: 0
use_warmup: &use_warmup False
# pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_s_obj365_pretrained.pdparams
# semi_start_iters: 5000
# ema_start_iters: 3000
# use_warmup: &use_warmup True
### global config
use_simple_ema: True
ema_decay: 0.9996
ssod_method: DenseTeacher
DenseTeacher:
train_cfg:
sup_weight: 1.0
unsup_weight: 1.0
loss_weight: {distill_loss_cls: 1.0, distill_loss_iou: 2.5, distill_loss_dfl: 0., distill_loss_contrast: 0.1}
contrast_loss:
temperature: 0.2
alpha: 0.9
smooth_iter: 100
concat_sup_data: True
suppress: linear
ratio: 0.01
test_cfg:
inference_on: teacher
### reader config
batch_size: &batch_size 8
worker_num: 2
SemiTrainReader:
sample_transforms:
- Decode: {}
- RandomDistort: {}
- RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
- RandomFlip: {}
- RandomCrop: {} # unsup will be fake gt_boxes
weak_aug:
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], is_scale: true, norm_type: none}
strong_aug:
- StrongAugImage: {transforms: [
RandomColorJitter: {prob: 0.8, brightness: 0.4, contrast: 0.4, saturation: 0.4, hue: 0.1},
RandomErasingCrop: {},
RandomGaussianBlur: {prob: 0.5, sigma: [0.1, 2.0]},
RandomGrayscale: {prob: 0.2},
]}
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], is_scale: true, norm_type: none}
sup_batch_transforms:
- BatchRandomResize: {target_size: [640], random_size: True, random_interp: True, keep_ratio: False}
- Permute: {}
- PadGT: {}
unsup_batch_transforms:
- BatchRandomResize: {target_size: [640], random_size: True, random_interp: True, keep_ratio: False}
- Permute: {}
sup_batch_size: *batch_size
unsup_batch_size: *batch_size
shuffle: True
drop_last: True
collate_batch: True
EvalReader:
sample_transforms:
- Decode: {}
- Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
- Permute: {}
batch_size: 2
TestReader:
inputs_def:
image_shape: [3, 640, 640]
sample_transforms:
- Decode: {}
- Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
- Permute: {}
batch_size: 1
### model config
architecture: PPYOLOE
norm_type: sync_bn
ema_black_list: ['proj_conv.weight']
custom_black_list: ['reduce_mean']
PPYOLOE:
backbone: CSPResNet
neck: CustomCSPPAN
yolo_head: PPYOLOEHead
post_process: ~
eval_size: ~ # means None, but not str 'None'
PPYOLOEHead:
fpn_strides: [32, 16, 8]
grid_cell_scale: 5.0
grid_cell_offset: 0.5
static_assigner_epoch: -1 #
use_varifocal_loss: True
loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
static_assigner:
name: ATSSAssigner
topk: 9
assigner:
name: TaskAlignedAssigner
topk: 13
alpha: 1.0
beta: 6.0
nms:
name: MultiClassNMS
nms_top_k: 1000
keep_top_k: 300
score_threshold: 0.01
nms_threshold: 0.7
### other config
epoch: *epochs
LearningRate:
base_lr: 0.01
schedulers:
- !CosineDecay
max_epochs: *cosine_epochs
use_warmup: *use_warmup
- !LinearWarmup
start_factor: 0.001
epochs: 3
OptimizerBuilder:
optimizer:
momentum: 0.9
type: Momentum
regularizer:
factor: 0.0005 # dt-fcos 0.0001
type: L2
clip_grad_by_norm: 1.0 # dt-fcos clip_grad_by_value

View File

@@ -0,0 +1,151 @@
_BASE_: [
'../../ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml',
'../_base_/coco_detection_percent_10.yml',
]
log_iter: 50
snapshot_epoch: 5
weights: output/denseteacher_ppyoloe_plus_crn_s_coco_semi010/model_final
epochs: &epochs 200
cosine_epochs: &cosine_epochs 240
### pretrain and warmup config, choose one and comment out the other
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/ppyoloe_plus_crn_s_80e_coco_sup010.pdparams # mAP=35.3
semi_start_iters: 0
ema_start_iters: 0
use_warmup: &use_warmup False
# pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_s_obj365_pretrained.pdparams
# semi_start_iters: 5000
# ema_start_iters: 3000
# use_warmup: &use_warmup True
### global config
use_simple_ema: True
ema_decay: 0.9996
ssod_method: DenseTeacher
DenseTeacher:
train_cfg:
sup_weight: 1.0
unsup_weight: 1.0
loss_weight: {distill_loss_cls: 1.0, distill_loss_iou: 2.5, distill_loss_dfl: 0., distill_loss_contrast: 0.1}
contrast_loss:
temperature: 0.2
alpha: 0.9
smooth_iter: 100
concat_sup_data: True
suppress: linear
ratio: 0.01
test_cfg:
inference_on: teacher
### reader config
batch_size: &batch_size 8
worker_num: 2
SemiTrainReader:
sample_transforms:
- Decode: {}
- RandomDistort: {}
- RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
- RandomFlip: {}
- RandomCrop: {} # unsup will be fake gt_boxes
weak_aug:
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], is_scale: true, norm_type: none}
strong_aug:
- StrongAugImage: {transforms: [
RandomColorJitter: {prob: 0.8, brightness: 0.4, contrast: 0.4, saturation: 0.4, hue: 0.1},
RandomErasingCrop: {},
RandomGaussianBlur: {prob: 0.5, sigma: [0.1, 2.0]},
RandomGrayscale: {prob: 0.2},
]}
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], is_scale: true, norm_type: none}
sup_batch_transforms:
- BatchRandomResize: {target_size: [640], random_size: True, random_interp: True, keep_ratio: False}
- Permute: {}
- PadGT: {}
unsup_batch_transforms:
- BatchRandomResize: {target_size: [640], random_size: True, random_interp: True, keep_ratio: False}
- Permute: {}
sup_batch_size: *batch_size
unsup_batch_size: *batch_size
shuffle: True
drop_last: True
collate_batch: True
EvalReader:
sample_transforms:
- Decode: {}
- Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
- Permute: {}
batch_size: 2
TestReader:
inputs_def:
image_shape: [3, 640, 640]
sample_transforms:
- Decode: {}
- Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
- Permute: {}
batch_size: 1
### model config
architecture: PPYOLOE
norm_type: sync_bn
ema_black_list: ['proj_conv.weight']
custom_black_list: ['reduce_mean']
PPYOLOE:
backbone: CSPResNet
neck: CustomCSPPAN
yolo_head: PPYOLOEHead
post_process: ~
eval_size: ~ # means None, but not str 'None'
PPYOLOEHead:
fpn_strides: [32, 16, 8]
grid_cell_scale: 5.0
grid_cell_offset: 0.5
static_assigner_epoch: -1 #
use_varifocal_loss: True
loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
static_assigner:
name: ATSSAssigner
topk: 9
assigner:
name: TaskAlignedAssigner
topk: 13
alpha: 1.0
beta: 6.0
nms:
name: MultiClassNMS
nms_top_k: 1000
keep_top_k: 300
score_threshold: 0.01
nms_threshold: 0.7
### other config
epoch: *epochs
LearningRate:
base_lr: 0.01
schedulers:
- !CosineDecay
max_epochs: *cosine_epochs
use_warmup: *use_warmup
- !LinearWarmup
start_factor: 0.001
epochs: 3
OptimizerBuilder:
optimizer:
momentum: 0.9
type: Momentum
regularizer:
factor: 0.0005 # dt-fcos 0.0001
type: L2
clip_grad_by_norm: 1.0 # dt-fcos clip_grad_by_value

View File

@@ -0,0 +1,109 @@
简体中文 | [English](README_en.md)
# RTDETR-SSOD (Semi-Supervised Object Detection Based on RT-DETR)
# Notes on reproducing the reported metrics: the numbers are obtained by first training a fully supervised model on the labeled data until it saturates, then loading that model to start semi-supervised training.
- For example: use baseline/rtdetr_r50vd_6x_coco_sup005.yml to train a fully supervised model on 5% of the COCO data, obtaining rtdetr_r50vd_6x_coco_sup005.pdparams, then set the following in rt_detr_ssod005_coco_no_warmup.yml (a command-line alternative is sketched after this list):
- pretrain_student_weights: rtdetr_r50vd_6x_coco_sup005.pdparams
- pretrain_teacher_weights: rtdetr_r50vd_6x_coco_sup005.pdparams
- 1. Weights trained on 5% and 10% labeled COCO data and on VOC (VOC2007 trainval) are provided; see semi_det/baseline/README.md.
- 2. rt_detr_ssod_voc_no_warmup.yml, rt_detr_ssod005_coco_no_warmup.yml and rt_detr_ssod010_coco_no_warmup.yml start semi-supervised training directly from the trained fully supervised weights (recommended).
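A minimal sketch of pointing both weights at your own supervised checkpoint from the command line instead of editing the YAML, assuming `-o` accepts the top-level `pretrain_student_weights` / `pretrain_teacher_weights` keys and that the checkpoint path shown is where your supervised run saved its weights:
```bash
# the checkpoint path is illustrative; replace it with your own supervised RT-DETR weights
python -m paddle.distributed.launch --gpus 0,1,2,3,4,5,6,7 tools/train.py \
    -c configs/semi_det/rtdetr_ssod/rt_detr_ssod005_coco_no_warmup.yml \
    -o pretrain_student_weights=output/rtdetr_r50vd_6x_coco_sup005/model_final.pdparams \
       pretrain_teacher_weights=output/rtdetr_r50vd_6x_coco_sup005/model_final.pdparams \
    --eval
```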
## RTDETR-SSOD Model Zoo
| Model | Supervised Data Ratio | Sup Baseline | Sup Epochs (Iters) | Sup mAP<sup>val<br>0.5:0.95 | Semi mAP<sup>val<br>0.5:0.95 | Semi Epochs (Iters) | Download | Config |
| :------------: | :---------: | :---------------------: | :---------------------: |:---------------------------: |:----------------------------: | :------------------: |:--------: |:----------: |
| RTDETR-SSOD | 5% | [sup_config](../baseline/rtdetr_r50vd_6x_coco_sup005.yml) | - | 39.0 | **42.3** | - | [download](https://bj.bcebos.com/v1/paddledet/rt_detr_ssod005_coco_no_warmup.pdparams) | [config](./rt_detr_ssod005_coco_no_warmup.yml) |
| RTDETR-SSOD | 10% | [sup_config](../baseline/rtdetr_r50vd_6x_coco_sup010.yml) | -| 42.3 | **44.8** | - | [download](https://bj.bcebos.com/v1/paddledet/data/semidet/rtdetr_ssod/rt_detr_ssod010_coco/rt_detr_ssod010_coco_no_warmup.pdparams) | [config](./rt_detr_ssod010_coco_with_warmup.yml) |
| RTDETR-SSOD(VOC)| VOC | [sup_config](../baseline/rtdetr_r50vd_6x_coco_voc2007.yml) | - | 62.7 | **65.8(LSJ)** | - | [download](https://bj.bcebos.com/v1/paddledet/data/semidet/rtdetr_ssod/rt_detr_ssod_voc/rt_detr_ssod_voc_no_warmup.pdparams) | [config](./rt_detr_ssod_voc_with_warmup.yml) |
**Notes:**
- The models above are trained with 8 GPUs by default; the total supervised batch_size defaults to 16, the total unsupervised batch_size also defaults to 16, and the default initial learning rate is 0.01. If you change the total batch_size, adjust the learning rate by the same linear ratio.
- **Supervised data ratio** is the percentage of the full COCO train2017 training set used as labeled data; the unlabeled COCO data generally uses the same ratio, but its images do not overlap with the labeled images.
- `Semi Epochs (Iters)` is the number of epochs (iterations) used for **semi-supervised training**. For a **custom dataset**, convert iterations to epochs yourself; it is best to keep the total number of iterations close to the COCO settings.
- `Sup mAP` is the accuracy of the model trained with **supervised data only**; refer to the **base detector config files** and [baseline](../baseline).
- `Semi mAP` is the accuracy of the **semi-supervised** model; both the download and config links point to the **semi-supervised model**.
- `LSJ` stands for **large-scale jittering**, i.e. multi-scale training over a wider range of scales; it can further improve accuracy, but training becomes slower.
- For an explanation of the semi-supervised detection configs, see the [documentation](../README.md/#半监督检测配置).
- The original `Dense Teacher` paper uses `R50-va-caffe` pretraining, while PaddleDetection defaults to `R50-vb` pretraining. Using an `R50-vd` model pretrained with [SSLD](../../../docs/feature_models/SSLD_PRETRAINED_MODEL.md) can further improve detection accuracy significantly; the backbone part of the config must be changed accordingly, e.g.:
```yaml
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
ResNet:
depth: 50
variant: d
norm_type: bn
freeze_at: 0
return_idx: [1, 2, 3]
num_stages: 4
lr_mult_list: [0.05, 0.05, 0.1, 0.15]
```
## Usage
Only training requires the semi-supervised detection config; evaluation, prediction, and deployment can also be run with the base detector's config.
### Training
```bash
# single-GPU training (not recommended; scale the learning rate linearly to match the smaller total batch size)
CUDA_VISIBLE_DEVICES=0 python tools/train.py -c configs/semi_det/rtdetr_ssod/rt_detr_ssod010_coco_no_warmup.yml --eval
# multi-GPU training
python -m paddle.distributed.launch --log_dir=denseteacher_fcos_semi010/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/semi_det/rtdetr_ssod/rt_detr_ssod010_coco_no_warmup.yml --eval
```
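Long semi-supervised runs can be resumed from the last saved checkpoint; a sketch, assuming the standard `-r/--resume` option of `tools/train.py` and an illustrative checkpoint path:
```bash
# resume an interrupted run; <last_checkpoint> is the prefix of the most recently saved weights (illustrative)
python -m paddle.distributed.launch --gpus 0,1,2,3,4,5,6,7 tools/train.py \
    -c configs/semi_det/rtdetr_ssod/rt_detr_ssod010_coco_no_warmup.yml \
    -r output/rt_detr_ssod010_coco_no_warmup/<last_checkpoint> \
    --eval
```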
### Evaluation
```bash
CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/semi_det/rtdetr_ssod/rt_detr_ssod010_coco_no_warmup.yml -o weights=output/rt_detr_ssod/model_final.pdparams
```
### Prediction
```bash
CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/semi_det/rtdetr_ssod/rt_detr_ssod010_coco_no_warmup.yml -o weights=output/rt_detr_ssod/model_final.pdparams --infer_img=demo/000000014439.jpg
```
### Deployment
Deployment can use either the semi-supervised detection config or the base detector's config.
```bash
# export the model
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/semi_det/rtdetr_ssod/rt_detr_ssod010_coco_no_warmup.yml -o weights=https://paddledet.bj.bcebos.com/models/rt_detr_ssod010_coco_no_warmup.pdparams
# run inference with the exported model
CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/rt_detr_ssod010_coco_no_warmup --image_file=demo/000000014439_640x640.jpg --device=GPU
# benchmark deployment speed
CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/rt_detr_ssod010_coco_no_warmup --image_file=demo/000000014439_640x640.jpg --device=GPU --run_benchmark=True # --run_mode=trt_fp16
# export to ONNX
paddle2onnx --model_dir output_inference/rt_detr_ssod010_coco_no_warmup/ --model_filename model.pdmodel --params_filename model.pdiparams --opset_version 12 --save_file rt_detr_ssod010_coco_no_warmup.onnx
```
# RTDETR-SSOD Downstream Tasks
We verified the strong generalization ability of RTDETR-SSOD: detection accuracy improves consistently on downstream tasks in low-light, industrial, traffic and other scenarios.
The VOC dataset ([voc](https://github.com/thsant/wgisd)) is a widely used computer vision dataset for tasks such as object detection, image segmentation and scene understanding. It contains images of 20 categories; after conversion to COCO format there are 5,011 labeled training images, 11,540 unlabeled training images and 2,510 test images over 20 categories;
The low-light dataset is [ExDark](https://github.com/cs-chan/Exclusively-Dark-Image-Dataset/tree/master/Dataset), captured specifically in low-light environments for low-light object detection and covering 10 lighting conditions ranging from extremely low light to twilight; after conversion to COCO format there are 5,891 training images and 1,472 test images over 12 categories;
The industrial dataset is [PKU-Market-PCB](https://robotics.pkusz.edu.cn/resources/dataset/), which targets defect detection on printed circuit boards (PCB) and provides 6 common PCB defect types;
The retail dataset [SKU110k](https://github.com/eg4000/SKU110K_CVPR19) is a dense object detection dataset of supermarket scenes with 11,762 images and over 1.7 million instances, including 8,233 training images, 588 validation images and 2,941 test images;
The autonomous driving dataset is [sslad](https://soda-2d.github.io/index.html);
The traffic dataset is [visdrone](http://aiskyeye.com/home/).
## Downstream Dataset Results:
| Dataset | Domain | Split | Labeled Images | Supervised mAP | Semi-Supervised mAP |
|----------|-----------|---------------------|-----------------|------------------|--------------|
| voc | general | voc07, 121:2 | 5000 | 63.1 | 65.8 (+2.7) |
| visdrone | drone / traffic | 1:9 | 647 | 19.4 | 20.6 (+1.2) |
| pcb | industrial defects | 1:9 | 55 | 22.9 | 26.8 (+3.9) |
| sku110k | retail products | 1:9 | 821 | 38.9 | 52.4 (+13.5) |
| sslad | autonomous driving | 1:32 | 4967 | 42.1 | 43.3 (+1.2) |
| exdark | low light | 1:9 | 589 | 39.6 | 44.1 (+4.5) |

View File

@@ -0,0 +1,212 @@
_BASE_: [
'../../runtime.yml',
'../../rtdetr/_base_/rtdetr_r50vd.yml',
'../../rtdetr/_base_/rtdetr_reader.yml',
]
eval_interval: 4000
save_interval: 4000
weights: output/rt_detr_ssod/model_final
find_unused_parameters: True
save_dir: output
log_iter: 1
ssod_method: Semi_RTDETR
### global config
use_simple_ema: True
ema_decay: 0.9996
use_gpu: true
### reader config
worker_num: 4
SemiTrainReader:
sample_transforms:
- Decode: {}
- RandomDistort: {prob: 0.8}
- RandomExpand: {fill_value: [0., 0., 0.]}
- RandomCrop: {prob: 0.8}
- RandomFlip: {}
weak_aug:
- RandomFlip: {prob: 0.0}
strong_aug:
- StrongAugImage: {transforms: [
RandomColorJitter: {prob: 0.8, brightness: 0.4, contrast: 0.4, saturation: 0.4, hue: 0.1},
RandomErasingCrop: {},
RandomGaussianBlur: {prob: 0.5, sigma: [0.1, 2.0]},
RandomGrayscale: {prob: 0.2},
]}
sup_batch_transforms:
- BatchRandomResizeForSSOD: {target_size: [480, 512, 544, 576, 608, 640, 640, 640, 672, 704, 736, 768, 800], random_size: True, random_interp: True, keep_ratio: False}
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
- NormalizeBox: {}
- BboxXYXY2XYWH: {}
- Permute: {}
unsup_batch_transforms:
- BatchRandomResizeForSSOD: {target_size: [480, 512, 544, 576, 608, 640, 640, 640, 672, 704, 736, 768, 800], random_size: True, random_interp: True, keep_ratio: False}
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
- NormalizeBox: {}
- BboxXYXY2XYWH: {}
- Permute: {}
sup_batch_size: 2
unsup_batch_size: 2
shuffle: true
drop_last: true
collate_batch: false
use_shared_memory: false
EvalReader:
sample_transforms:
- Decode: {}
- Resize: {target_size: [640, 640], keep_ratio: False}
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
- Permute: {}
batch_size: 2
shuffle: false
drop_last: false
TestReader:
sample_transforms:
- Decode: {}
- Resize: { target_size: [640, 640], keep_ratio: False }
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
- Permute: {}
batch_size: 1
shuffle: false
drop_last: false
pretrain_student_weights: https://bj.bcebos.com/v1/paddledet/data/semidet/rtdetr_ssod/baseline/rtdetr_r50vd_6x_coco_sup005.pdparams
pretrain_teacher_weights: https://bj.bcebos.com/v1/paddledet/data/semidet/rtdetr_ssod/baseline/rtdetr_r50vd_6x_coco_sup005.pdparams
hidden_dim: 256
use_focal_loss: True
eval_size: [640, 640]
architecture: DETR
DETR:
backbone: ResNet
neck: HybridEncoder
transformer: RTDETRTransformer
detr_head: DINOHead
post_process: DETRPostProcess
post_process_semi: DETRBBoxSemiPostProcess
ResNet:
depth: 50
variant: d
norm_type: bn
freeze_at: 0
return_idx: [1, 2, 3]
lr_mult_list: [0.1, 0.1, 0.1, 0.1]
num_stages: 4
freeze_stem_only: True
HybridEncoder:
hidden_dim: 256
use_encoder_idx: [2]
num_encoder_layers: 1
encoder_layer:
name: TransformerLayer
d_model: 256
nhead: 8
dim_feedforward: 1024
dropout: 0.
activation: 'gelu'
expansion: 1.0
RTDETRTransformer:
num_queries: 300
position_embed_type: sine
feat_strides: [8, 16, 32]
num_levels: 3
nhead: 8
num_decoder_layers: 6
dim_feedforward: 1024
dropout: 0.0
activation: relu
num_denoising: 100
label_noise_ratio: 0.5
box_noise_scale: 1.0
learnt_init_query: False
DINOHead:
loss:
name: DINOLoss
loss_coeff: {class: 1, bbox: 5, giou: 2}
aux_loss: True
use_vfl: True
matcher:
name: HungarianMatcher
matcher_coeff: {class: 2, bbox: 5, giou: 2}
DETRPostProcess:
num_top_queries: 300
SSOD: DETR_SSOD
DETR_SSOD:
teacher: DETR
student: DETR
train_cfg:
sup_weight: 1.0
unsup_weight: 1.0
ema_start_iters: -1
pseudo_label_initial_score_thr: 0.7
min_pseduo_box_size: 0
concat_sup_data: True
test_cfg:
inference_on: teacher
metric: COCO
num_classes: 80
# partial labeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
TrainDataset:
!SemiCOCODataSet
image_dir: train2017
anno_path: semi_annotations/instances_train2017.1@5.json
dataset_dir: dataset/coco
data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
# partial unlabeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
UnsupTrainDataset:
!SemiCOCODataSet
image_dir: train2017
anno_path: semi_annotations/instances_train2017.1@5-unlabeled.json
dataset_dir: dataset/coco
data_fields: ['image']
supervised: False
EvalDataset:
!COCODataSet
image_dir: val2017
anno_path: annotations/instances_val2017.json
dataset_dir: dataset/coco
allow_empty: true
TestDataset:
!ImageFolder
anno_path: annotations/instances_val2017.json # also support txt (like VOC's label_list.txt)
dataset_dir: dataset/coco # if set, anno_path will be 'dataset_dir/anno_path'
epoch: 400 #epoch: 60
LearningRate:
base_lr: 0.0002
schedulers:
- !PiecewiseDecay
gamma: 1.0
milestones: [400]
use_warmup: false
- !LinearWarmup
start_factor: 0.001
steps: 2000
OptimizerBuilder:
clip_grad_by_norm: 0.1
regularizer: false
optimizer:
type: AdamW
weight_decay: 0.0001

View File

@@ -0,0 +1,215 @@
_BASE_: [
'../../runtime.yml',
'../../rtdetr/_base_/rtdetr_r50vd.yml',
'../../rtdetr/_base_/rtdetr_reader.yml',
]
#for debug
eval_interval: 4000
save_interval: 4000
weights: output/rt_detr_ssod/model_final
find_unused_parameters: True
save_dir: output
log_iter: 50
ssod_method: Semi_RTDETR
### global config
use_simple_ema: True
ema_decay: 0.9996
use_gpu: true
### reader config
worker_num: 4
SemiTrainReader:
sample_transforms:
- Decode: {}
- RandomDistort: {prob: 0.8}
- RandomExpand: {fill_value: [0., 0., 0.]}
- RandomCrop: {prob: 0.8}
- RandomFlip: {}
weak_aug:
- RandomFlip: {prob: 0.0}
strong_aug:
- StrongAugImage: {transforms: [
RandomColorJitter: {prob: 0.8, brightness: 0.4, contrast: 0.4, saturation: 0.4, hue: 0.1},
RandomErasingCrop: {},
RandomGaussianBlur: {prob: 0.5, sigma: [0.1, 2.0]},
RandomGrayscale: {prob: 0.2},
]}
sup_batch_transforms:
- BatchRandomResizeForSSOD: {target_size: [480, 512, 544, 576, 608, 640, 640, 640, 672, 704, 736, 768, 800], random_size: True, random_interp: True, keep_ratio: False}
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
- NormalizeBox: {}
- BboxXYXY2XYWH: {}
- Permute: {}
unsup_batch_transforms:
- BatchRandomResizeForSSOD: {target_size: [480, 512, 544, 576, 608, 640, 640, 640, 672, 704, 736, 768, 800], random_size: True, random_interp: True, keep_ratio: False}
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
- NormalizeBox: {}
- BboxXYXY2XYWH: {}
- Permute: {}
sup_batch_size: 2
unsup_batch_size: 2
shuffle: true
drop_last: true
collate_batch: false
use_shared_memory: false
EvalReader:
sample_transforms:
- Decode: {}
- Resize: {target_size: [640, 640], keep_ratio: False}
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
- Permute: {}
batch_size: 2
shuffle: false
drop_last: false
TestReader:
sample_transforms:
- Decode: {}
- Resize: { target_size: [640, 640], keep_ratio: False }
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
- Permute: {}
batch_size: 1
shuffle: false
drop_last: false
pretrain_student_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
pretrain_teacher_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
hidden_dim: 256
use_focal_loss: True
eval_size: [640, 640]
architecture: DETR
DETR:
backbone: ResNet
neck: HybridEncoder
transformer: RTDETRTransformer
detr_head: DINOHead
post_process: DETRPostProcess
post_process_semi: DETRBBoxSemiPostProcess
ResNet:
# index 0 stands for res2
depth: 50
variant: d
norm_type: bn
freeze_at: 0
return_idx: [1, 2, 3]
lr_mult_list: [0.1, 0.1, 0.1, 0.1]
num_stages: 4
freeze_stem_only: True
HybridEncoder:
hidden_dim: 256
use_encoder_idx: [2]
num_encoder_layers: 1
encoder_layer:
name: TransformerLayer
d_model: 256
nhead: 8
dim_feedforward: 1024
dropout: 0.
activation: 'gelu'
expansion: 1.0
RTDETRTransformer:
num_queries: 300
position_embed_type: sine
feat_strides: [8, 16, 32]
num_levels: 3
nhead: 8
num_decoder_layers: 6
dim_feedforward: 1024
dropout: 0.0
activation: relu
num_denoising: 100
label_noise_ratio: 0.5
box_noise_scale: 1.0
learnt_init_query: False
DINOHead:
loss:
name: DINOLoss
loss_coeff: {class: 1, bbox: 5, giou: 2}
aux_loss: True
use_vfl: True
matcher:
name: HungarianMatcher
matcher_coeff: {class: 2, bbox: 5, giou: 2}
use_uni_match: True
DETRPostProcess:
num_top_queries: 300
SSOD: DETR_SSOD
DETR_SSOD:
teacher: DETR
student: DETR
train_cfg:
sup_weight: 1.0
unsup_weight: 1.0
ema_start_iters: 10000
pseudo_label_initial_score_thr: 0.7
min_pseduo_box_size: 0
concat_sup_data: True
test_cfg:
inference_on: teacher
metric: COCO
num_classes: 80
# partial labeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
TrainDataset:
!SemiCOCODataSet
image_dir: train2017
anno_path: semi_annotations/instances_train2017.1@5.json
dataset_dir: dataset/coco
data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
# partial unlabeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
UnsupTrainDataset:
!SemiCOCODataSet
image_dir: train2017
anno_path: semi_annotations/instances_train2017.1@5-unlabeled.json
dataset_dir: dataset/coco
data_fields: ['image']
supervised: False
EvalDataset:
!COCODataSet
image_dir: val2017
anno_path: annotations/instances_val2017.json
dataset_dir: dataset/coco
allow_empty: true
TestDataset:
!ImageFolder
anno_path: annotations/instances_val2017.json # also support txt (like VOC's label_list.txt)
dataset_dir: dataset/coco # if set, anno_path will be 'dataset_dir/anno_path'
epoch: 500 #epoch: 60
LearningRate:
base_lr: 0.0002
schedulers:
- !PiecewiseDecay
gamma: 1.0
milestones: [500]
use_warmup: false
- !LinearWarmup
start_factor: 0.001
steps: 2000
OptimizerBuilder:
clip_grad_by_norm: 0.1
regularizer: false
optimizer:
type: AdamW
weight_decay: 0.0001

View File

@@ -0,0 +1,212 @@
_BASE_: [
'../../runtime.yml',
'../../rtdetr/_base_/rtdetr_r50vd.yml',
'../../rtdetr/_base_/rtdetr_reader.yml',
]
eval_interval: 4000
save_interval: 4000
weights: output/rt_detr_ssod/model_final
find_unused_parameters: True
save_dir: output
log_iter: 1
ssod_method: Semi_RTDETR
### global config
use_simple_ema: True
ema_decay: 0.9996
use_gpu: true
### reader config
worker_num: 4
SemiTrainReader:
sample_transforms:
- Decode: {}
- RandomDistort: {prob: 0.8}
- RandomExpand: {fill_value: [0., 0., 0.]}
- RandomCrop: {prob: 0.8}
- RandomFlip: {}
weak_aug:
- RandomFlip: {prob: 0.0}
strong_aug:
- StrongAugImage: {transforms: [
RandomColorJitter: {prob: 0.8, brightness: 0.4, contrast: 0.4, saturation: 0.4, hue: 0.1},
RandomErasingCrop: {},
RandomGaussianBlur: {prob: 0.5, sigma: [0.1, 2.0]},
RandomGrayscale: {prob: 0.2},
]}
sup_batch_transforms:
- BatchRandomResizeForSSOD: {target_size: [480, 512, 544, 576, 608, 640, 640, 640, 672, 704, 736, 768, 800], random_size: True, random_interp: True, keep_ratio: False}
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
- NormalizeBox: {}
- BboxXYXY2XYWH: {}
- Permute: {}
unsup_batch_transforms:
- BatchRandomResizeForSSOD: {target_size: [480, 512, 544, 576, 608, 640, 640, 640, 672, 704, 736, 768, 800], random_size: True, random_interp: True, keep_ratio: False}
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
- NormalizeBox: {}
- BboxXYXY2XYWH: {}
- Permute: {}
sup_batch_size: 2
unsup_batch_size: 2
shuffle: true
drop_last: true
collate_batch: false
use_shared_memory: false
EvalReader:
sample_transforms:
- Decode: {}
- Resize: {target_size: [640, 640], keep_ratio: False}
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
- Permute: {}
batch_size: 2
shuffle: false
drop_last: false
TestReader:
sample_transforms:
- Decode: {}
- Resize: { target_size: [640, 640], keep_ratio: False }
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
- Permute: {}
batch_size: 1
shuffle: false
drop_last: false
pretrain_student_weights: https://bj.bcebos.com/v1/paddledet/data/semidet/rtdetr_ssod/baseline/rtdetr_r50vd_6x_coco_sup010.pdparams
pretrain_teacher_weights: https://bj.bcebos.com/v1/paddledet/data/semidet/rtdetr_ssod/baseline/rtdetr_r50vd_6x_coco_sup010.pdparams
hidden_dim: 256
use_focal_loss: True
eval_size: [640, 640]
architecture: DETR
DETR:
backbone: ResNet
neck: HybridEncoder
transformer: RTDETRTransformer
detr_head: DINOHead
post_process: DETRPostProcess
post_process_semi: DETRBBoxSemiPostProcess
ResNet:
depth: 50
variant: d
norm_type: bn
freeze_at: 0
return_idx: [1, 2, 3]
lr_mult_list: [0.1, 0.1, 0.1, 0.1]
num_stages: 4
freeze_stem_only: True
HybridEncoder:
hidden_dim: 256
use_encoder_idx: [2]
num_encoder_layers: 1
encoder_layer:
name: TransformerLayer
d_model: 256
nhead: 8
dim_feedforward: 1024
dropout: 0.
activation: 'gelu'
expansion: 1.0
RTDETRTransformer:
num_queries: 300
position_embed_type: sine
feat_strides: [8, 16, 32]
num_levels: 3
nhead: 8
num_decoder_layers: 6
dim_feedforward: 1024
dropout: 0.0
activation: relu
num_denoising: 100
label_noise_ratio: 0.5
box_noise_scale: 1.0
learnt_init_query: False
DINOHead:
loss:
name: DINOLoss
loss_coeff: {class: 1, bbox: 5, giou: 2}
aux_loss: True
use_vfl: True
matcher:
name: HungarianMatcher
matcher_coeff: {class: 2, bbox: 5, giou: 2}
DETRPostProcess:
num_top_queries: 300
SSOD: DETR_SSOD
DETR_SSOD:
teacher: DETR
student: DETR
train_cfg:
sup_weight: 1.0
unsup_weight: 1.0
ema_start_iters: -1
pseudo_label_initial_score_thr: 0.7
min_pseduo_box_size: 0
concat_sup_data: True
test_cfg:
inference_on: teacher
metric: COCO
num_classes: 80
# partial labeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
TrainDataset:
!SemiCOCODataSet
image_dir: train2017
anno_path: semi_annotations/instances_train2017.1@10.json
dataset_dir: dataset/coco
data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
# partial unlabeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
UnsupTrainDataset:
!SemiCOCODataSet
image_dir: train2017
anno_path: semi_annotations/instances_train2017.1@10-unlabeled.json
dataset_dir: dataset/coco
data_fields: ['image']
supervised: False
EvalDataset:
!COCODataSet
image_dir: val2017
anno_path: annotations/instances_val2017.json
dataset_dir: dataset/coco
allow_empty: true
TestDataset:
!ImageFolder
anno_path: annotations/instances_val2017.json # also support txt (like VOC's label_list.txt)
dataset_dir: dataset/coco # if set, anno_path will be 'dataset_dir/anno_path'
epoch: 400 #epoch: 60
LearningRate:
base_lr: 0.0002
schedulers:
- !PiecewiseDecay
gamma: 1.0
milestones: [400]
use_warmup: false
- !LinearWarmup
start_factor: 0.001
steps: 2000
OptimizerBuilder:
clip_grad_by_norm: 0.1
regularizer: false
optimizer:
type: AdamW
weight_decay: 0.0001

View File

@@ -0,0 +1,215 @@
_BASE_: [
'../../runtime.yml',
'../../rtdetr/_base_/rtdetr_r50vd.yml',
'../../rtdetr/_base_/rtdetr_reader.yml',
]
#for debug
eval_interval: 4000
save_interval: 4000
weights: output/rt_detr_ssod/model_final
find_unused_parameters: True
save_dir: output
log_iter: 50
ssod_method: Semi_RTDETR
### global config
use_simple_ema: True
ema_decay: 0.9996
use_gpu: true
### reader config
worker_num: 4
SemiTrainReader:
sample_transforms:
- Decode: {}
- RandomDistort: {prob: 0.8}
- RandomExpand: {fill_value: [0., 0., 0.]}
- RandomCrop: {prob: 0.8}
- RandomFlip: {}
weak_aug:
- RandomFlip: {prob: 0.0}
strong_aug:
- StrongAugImage: {transforms: [
RandomColorJitter: {prob: 0.8, brightness: 0.4, contrast: 0.4, saturation: 0.4, hue: 0.1},
RandomErasingCrop: {},
RandomGaussianBlur: {prob: 0.5, sigma: [0.1, 2.0]},
RandomGrayscale: {prob: 0.2},
]}
sup_batch_transforms:
- BatchRandomResizeForSSOD: {target_size: [480, 512, 544, 576, 608, 640, 640, 640, 672, 704, 736, 768, 800], random_size: True, random_interp: True, keep_ratio: False}
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
- NormalizeBox: {}
- BboxXYXY2XYWH: {}
- Permute: {}
unsup_batch_transforms:
- BatchRandomResizeForSSOD: {target_size: [480, 512, 544, 576, 608, 640, 640, 640, 672, 704, 736, 768, 800], random_size: True, random_interp: True, keep_ratio: False}
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
- NormalizeBox: {}
- BboxXYXY2XYWH: {}
- Permute: {}
sup_batch_size: 2
unsup_batch_size: 2
shuffle: true
drop_last: true
collate_batch: false
use_shared_memory: false
EvalReader:
sample_transforms:
- Decode: {}
- Resize: {target_size: [640, 640], keep_ratio: False}
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
- Permute: {}
batch_size: 2
shuffle: false
drop_last: false
TestReader:
sample_transforms:
- Decode: {}
- Resize: { target_size: [640, 640], keep_ratio: False }
- NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
- Permute: {}
batch_size: 1
shuffle: false
drop_last: false
pretrain_student_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
pretrain_teacher_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
hidden_dim: 256
use_focal_loss: True
eval_size: [640, 640]
architecture: DETR
DETR:
backbone: ResNet
neck: HybridEncoder
transformer: RTDETRTransformer
detr_head: DINOHead
post_process: DETRPostProcess
post_process_semi: DETRBBoxSemiPostProcess
ResNet:
# index 0 stands for res2
depth: 50
variant: d
norm_type: bn
freeze_at: 0
return_idx: [1, 2, 3]
lr_mult_list: [0.1, 0.1, 0.1, 0.1]
num_stages: 4
freeze_stem_only: True
HybridEncoder:
hidden_dim: 256
use_encoder_idx: [2]
num_encoder_layers: 1
encoder_layer:
name: TransformerLayer
d_model: 256
nhead: 8
dim_feedforward: 1024
dropout: 0.
activation: 'gelu'
expansion: 1.0
RTDETRTransformer:
num_queries: 300
position_embed_type: sine
feat_strides: [8, 16, 32]
num_levels: 3
nhead: 8
num_decoder_layers: 6
dim_feedforward: 1024
dropout: 0.0
activation: relu
num_denoising: 100
label_noise_ratio: 0.5
box_noise_scale: 1.0
learnt_init_query: False
DINOHead:
loss:
name: DINOLoss
loss_coeff: {class: 1, bbox: 5, giou: 2}
aux_loss: True
use_vfl: True
matcher:
name: HungarianMatcher
matcher_coeff: {class: 2, bbox: 5, giou: 2}
use_uni_match: True
DETRPostProcess:
num_top_queries: 300
SSOD: DETR_SSOD
DETR_SSOD:
teacher: DETR
student: DETR
train_cfg:
sup_weight: 1.0
unsup_weight: 1.0
ema_start_iters: 10000
pseudo_label_initial_score_thr: 0.7
min_pseduo_box_size: 0
concat_sup_data: True
test_cfg:
inference_on: teacher
metric: COCO
num_classes: 80
# partial labeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
TrainDataset:
!SemiCOCODataSet
image_dir: train2017
anno_path: semi_annotations/instances_train2017.1@10.json
dataset_dir: dataset/coco
data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
# partial unlabeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
UnsupTrainDataset:
!SemiCOCODataSet
image_dir: train2017
anno_path: semi_annotations/instances_train2017.1@10-unlabeled.json
dataset_dir: dataset/coco
data_fields: ['image']
supervised: False
EvalDataset:
!COCODataSet
image_dir: val2017
anno_path: annotations/instances_val2017.json
dataset_dir: dataset/coco
allow_empty: true
TestDataset:
!ImageFolder
anno_path: annotations/instances_val2017.json # also support txt (like VOC's label_list.txt)
dataset_dir: dataset/coco # if set, anno_path will be 'dataset_dir/anno_path'
epoch: 500 #epoch: 60
LearningRate:
base_lr: 0.0002
schedulers:
- !PiecewiseDecay
gamma: 1.0
milestones: [500]
use_warmup: false
- !LinearWarmup
start_factor: 0.001
steps: 2000
OptimizerBuilder:
clip_grad_by_norm: 0.1
regularizer: false
optimizer:
type: AdamW
weight_decay: 0.0001

View File

@@ -0,0 +1 @@
## To be released