DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection
Introduction
DINO is an end-to-end object detection model based on DETR. This directory contains a reproduction of the model described in the paper.
Model Zoo
| Backbone | Model | Epochs | Box AP | Config | Log | Download |
|---|---|---|---|---|---|---|
| R-50 | dino_r50_4scale | 12 | 49.5 | config | log | model |
| R-50 | dino_r50_4scale | 24 | 50.8 | config | log | model |
Notes:
- DINO is trained on the COCO train2017 dataset and evaluated on val2017. Box AP is reported as mAP(IoU=0.5:0.95).
- DINO is trained on 4 GPUs.
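As a rough illustration of what the `IoU=0.5:0.95` range in the Box AP column means, the sketch below computes the IoU of two boxes and lists the ten thresholds (0.50 to 0.95 in steps of 0.05) at which a prediction would count as a match. The helper names `box_iou` and `matched_thresholds` are hypothetical and are not part of this repository. The sketch shows only the thresholding step, not the full COCO protocol, which also averages precision over recall levels and categories.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp to zero when the boxes do not overlap
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten IoU thresholds in the 0.5:0.95 range: 0.50, 0.55, ..., 0.95
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred, gt):
    """Thresholds at which `pred` would count as a match for `gt`."""
    iou = box_iou(pred, gt)
    return [t for t in IOU_THRESHOLDS if iou >= t]


if __name__ == "__main__":
    pred = (0, 0, 100, 100)  # hypothetical predicted box
    gt = (0, 0, 100, 82)     # hypothetical ground-truth box
    print(box_iou(pred, gt))
    print(matched_thresholds(pred, gt))
```

A box with a high IoU matches at every threshold, while a loosely localized box matches only at the lower ones, which is why averaging over this range rewards tight localization.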
Multi-GPU training
python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/dino/dino_r50_4scale_1x_coco.yml --fleet --eval
Custom Operator
- For the multi-scale deformable attention custom operator, see here.
Citations
@misc{zhang2022dino,
title={DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection},
author={Hao Zhang and Feng Li and Shilong Liu and Lei Zhang and Hang Su and Jun Zhu and Lionel M. Ni and Heung-Yeung Shum},
year={2022},
eprint={2203.03605},
archivePrefix={arXiv},
primaryClass={cs.CV}
}