Replace the document detection model

2024-08-27 14:42:45 +08:00
parent aea6f19951
commit 1514e09c40
2072 changed files with 254336 additions and 4967 deletions

View File

@@ -1,18 +0,0 @@
loss/
data/
cache/
tf_cache/
debug/
results/
misc/outputs
evaluation/evaluate_object
evaluation/analyze_object
nnet/__pycache__/
*.swp
*.pyc
*.o*

View File

@@ -1,29 +0,0 @@
BSD 3-Clause License
Copyright (c) 2019, Princeton University
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

View File

@@ -1,152 +0,0 @@
# CornerNet-Lite: Training, Evaluation and Testing Code
Code for reproducing results in the following paper:
[**CornerNet-Lite: Efficient Keypoint Based Object Detection**](https://arxiv.org/abs/1904.08900)
Hei Law, Yun Teng, Olga Russakovsky, Jia Deng
*arXiv:1904.08900*
## Getting Started
### Software Requirement
- Python 3.7
- PyTorch 1.0.0
- CUDA 10
- GCC 4.9.2 or above
### Installing Dependencies
Please first install [Anaconda](https://anaconda.org) and create an Anaconda environment using the provided package list `conda_packagelist.txt`.
```
conda create --name CornerNet_Lite --file conda_packagelist.txt --channel pytorch
```
After you create the environment, please activate it.
```
source activate CornerNet_Lite
```
### Compiling Corner Pooling Layers
Compile the C++ implementation of the corner pooling layers. (GCC 4.9.2 or above is required.)
```
cd <CornerNet-Lite dir>/core/models/py_utils/_cpools/
python setup.py install --user
```
### Compiling NMS
Compile the NMS code, which is originally from [Faster R-CNN](https://github.com/rbgirshick/py-faster-rcnn/blob/master/lib/nms/cpu_nms.pyx) and [Soft-NMS](https://github.com/bharatsingh430/soft-nms/blob/master/lib/nms/cpu_nms.pyx).
```
cd <CornerNet-Lite dir>/core/external
make
```
### Downloading Models
In this repo, we provide models for the following detectors:
- [CornerNet-Saccade](https://drive.google.com/file/d/1MQDyPRI0HgDHxHToudHqQ-2m8TVBciaa/view?usp=sharing)
- [CornerNet-Squeeze](https://drive.google.com/file/d/1qM8BBYCLUBcZx_UmLT0qMXNTh-Yshp4X/view?usp=sharing)
- [CornerNet](https://drive.google.com/file/d/1e8At_iZWyXQgLlMwHkB83kN-AN85Uff1/view?usp=sharing)
Put the CornerNet-Saccade model under `<CornerNet-Lite dir>/cache/nnet/CornerNet_Saccade/`, the CornerNet-Squeeze model under `<CornerNet-Lite dir>/cache/nnet/CornerNet_Squeeze/`, and the CornerNet model under `<CornerNet-Lite dir>/cache/nnet/CornerNet/`. (Note that we use underscores instead of dashes in the directory names for CornerNet-Saccade and CornerNet-Squeeze.)
Note: The CornerNet model is the same as the one in the original [CornerNet repo](https://github.com/princeton-vl/CornerNet). We just ported it to this new repo.
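As a convenience, here is a small, illustrative Python sketch (not part of the repo) that creates the expected cache layout and reports whether the downloaded snapshots are in place; the `<model>_500000.pkl` file names follow the paths used in `core/detectors.py`.
```python
# Illustrative only: prepare the cache layout expected by core/detectors.py
# and check that the downloaded snapshot files are where the detectors look for them.
import os

for name in ["CornerNet_Saccade", "CornerNet_Squeeze", "CornerNet"]:
    snapshot_dir = os.path.join("cache", "nnet", name)
    os.makedirs(snapshot_dir, exist_ok=True)
    snapshot = os.path.join(snapshot_dir, "{}_500000.pkl".format(name))
    print(snapshot, "found" if os.path.exists(snapshot) else "missing")
```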
### Running the Demo Script
After downloading the models, you should be able to use the detectors on your own images. We provide a demo script `demo.py` to test if the repo is installed correctly.
```
python demo.py
```
This script applies CornerNet-Saccade to `demo.jpg` and writes the results to `demo_out.jpg`.
In the demo script, the default detector is CornerNet-Saccade. You can modify the demo script to test different detectors. For example, if you want to test CornerNet-Squeeze:
```python
#!/usr/bin/env python
import cv2
from core.detectors import CornerNet_Squeeze
from core.vis_utils import draw_bboxes
detector = CornerNet_Squeeze()
image = cv2.imread("demo.jpg")
bboxes = detector(image)
image = draw_bboxes(image, bboxes)
cv2.imwrite("demo_out.jpg", image)
```
### Using CornerNet-Lite in Your Project
It is also easy to use CornerNet-Lite in your own project. You will need to rename the directory from `CornerNet-Lite` to `CornerNet_Lite`; otherwise, you won't be able to import it, because Python package names cannot contain dashes.
```
Your project
│ README.md
│ ...
│ foo.py
└───CornerNet_Lite
└───directory1
└───...
```
In `foo.py`, you can easily import CornerNet-Saccade by adding:
```python
import cv2

from CornerNet_Lite import CornerNet_Saccade

def foo():
    cornernet = CornerNet_Saccade()
    # CornerNet_Saccade is ready to use
    image  = cv2.imread('/path/to/your/image')
    bboxes = cornernet(image)
```
If you want to train or evaluate the detectors on COCO, please move on to the following steps.
## Training and Evaluation
### Installing MS COCO APIs
```
mkdir -p <CornerNet-Lite dir>/data
cd <CornerNet-Lite dir>/data
git clone git@github.com:cocodataset/cocoapi.git coco
cd <CornerNet-Lite dir>/data/coco/PythonAPI
make install
```
### Downloading MS COCO Data
- Download the training/validation split we use in our paper from [here](https://drive.google.com/file/d/1dop4188xo5lXDkGtOZUzy2SHOD_COXz4/view?usp=sharing) (originally from [Faster R-CNN](https://github.com/rbgirshick/py-faster-rcnn/tree/master/data))
- Unzip the file and place `annotations` under `<CornerNet-Lite dir>/data/coco`
- Download the images (2014 Train, 2014 Val, 2017 Test) from [here](http://cocodataset.org/#download)
- Create 3 directories, `trainval2014`, `minival2014` and `testdev2017`, under `<CornerNet-Lite dir>/data/coco/images/`
- Copy the training/validation/testing images to the corresponding directories according to the annotation files
To train and evaluate a network, you will need to create a configuration file, which defines the hyperparameters, and a model file, which defines the network architecture. The configuration file should be in JSON format and placed in `<CornerNet-Lite dir>/configs/`. Each configuration file should have a corresponding model file in `<CornerNet-Lite dir>/core/models/`; i.e., if there is a `<model>.json` in `<CornerNet-Lite dir>/configs/`, there should be a `<model>.py` in `<CornerNet-Lite dir>/core/models/`. There is only one exception, which we will mention later.
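As a rough illustration (this is not the repo's `train.py`), the naming convention means that both files can be resolved from the model name alone:
```python
# Hedged sketch of resolving the <model>.json / <model>.py pairing from a model name.
# Assumes the JSON layout shown in configs/ ({"system": {...}, "db": {...}}) and that
# each core/models/<model>.py defines a class named `model`.
import importlib
import json
import os

def load_model_and_config(model_name):
    cfg_path = os.path.join("configs", model_name + ".json")
    with open(cfg_path) as f:
        cfg = json.load(f)
    module = importlib.import_module("core.models." + model_name)
    return module.model(), cfg["system"], cfg["db"]
```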
### Training and Evaluating a Model
To train a model:
```
python train.py <model>
```
We provide the configuration files and the model files for CornerNet-Saccade, CornerNet-Squeeze and CornerNet in this repo. Please check the configuration files in `<CornerNet-Lite dir>/configs/`.
To train CornerNet-Saccade:
```
python train.py CornerNet_Saccade
```
Please adjust `batch_size` and `chunk_sizes` in `CornerNet_Saccade.json` to accommodate the number of GPUs that are available to you.
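In the provided configuration files, `batch_size` equals the sum of `chunk_sizes` (one chunk per GPU), e.g. 48 = 12 + 12 + 12 + 12 for CornerNet-Saccade, so keep the two consistent when you edit them. A quick, illustrative sanity check:
```python
# Illustrative check (not part of the repo): batch_size should match sum(chunk_sizes)
# after you edit CornerNet_Saccade.json for your number of GPUs.
import json

with open("configs/CornerNet_Saccade.json") as f:
    system_cfg = json.load(f)["system"]
assert system_cfg["batch_size"] == sum(system_cfg["chunk_sizes"]), \
    "batch_size must equal the sum of chunk_sizes"
```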
To evaluate the trained model:
```
python evaluate.py CornerNet_Saccade --testiter 500000 --split <split>
```
If you want to test different hyperparameters during evaluation and do not want to overwrite the original configuration file, you can do so by creating a configuration file with a suffix (`<model>-<suffix>.json`). There is no need to create `<model>-<suffix>.py` in `<CornerNet-Lite dir>/core/models/`.
To use the new configuration file:
```
python evaluate.py <model> --testiter <iter> --split <split> --suffix <suffix>
```
We also include a configuration file for CornerNet under the multi-scale setting, `CornerNet-multi_scale.json`, in this repo.
To use the multi-scale configuration file:
```
python evaluate.py CornerNet --testiter <iter> --split <split> --suffix multi_scale
```

View File

@@ -1,2 +0,0 @@
from .core.detectors import CornerNet, CornerNet_Squeeze, CornerNet_Saccade
from .core.vis_utils import draw_bboxes

View File

@@ -1,81 +0,0 @@
# This file may be used to create an environment using:
# $ conda create --name <env> --file <this file>
# platform: linux-64
blas=1.0=mkl
bzip2=1.0.6=h14c3975_5
ca-certificates=2018.12.5=0
cairo=1.14.12=h8948797_3
certifi=2018.11.29=py37_0
cffi=1.11.5=py37he75722e_1
cuda100=1.0=0
cycler=0.10.0=py37_0
cython=0.28.5=py37hf484d3e_0
dbus=1.13.2=h714fa37_1
expat=2.2.6=he6710b0_0
ffmpeg=4.0=hcdf2ecd_0
fontconfig=2.13.0=h9420a91_0
freeglut=3.0.0=hf484d3e_5
freetype=2.9.1=h8a8886c_1
glib=2.56.2=hd408876_0
graphite2=1.3.12=h23475e2_2
gst-plugins-base=1.14.0=hbbd80ab_1
gstreamer=1.14.0=hb453b48_1
harfbuzz=1.8.8=hffaf4a1_0
hdf5=1.10.2=hba1933b_1
icu=58.2=h9c2bf20_1
intel-openmp=2019.0=118
jasper=2.0.14=h07fcdf6_1
jpeg=9b=h024ee3a_2
kiwisolver=1.0.1=py37hf484d3e_0
libedit=3.1.20170329=h6b74fdf_2
libffi=3.2.1=hd88cf55_4
libgcc-ng=8.2.0=hdf63c60_1
libgfortran-ng=7.3.0=hdf63c60_0
libglu=9.0.0=hf484d3e_1
libopencv=3.4.2=hb342d67_1
libopus=1.2.1=hb9ed12e_0
libpng=1.6.35=hbc83047_0
libstdcxx-ng=8.2.0=hdf63c60_1
libtiff=4.0.9=he85c1e1_2
libuuid=1.0.3=h1bed415_2
libvpx=1.7.0=h439df22_0
libxcb=1.13=h1bed415_1
libxml2=2.9.8=h26e45fe_1
matplotlib=3.0.2=py37h5429711_0
mkl=2018.0.3=1
mkl_fft=1.0.6=py37h7dd41cf_0
mkl_random=1.0.1=py37h4414c95_1
ncurses=6.1=hf484d3e_0
ninja=1.8.2=py37h6bb024c_1
numpy=1.15.4=py37h1d66e8a_0
numpy-base=1.15.4=py37h81de0dd_0
olefile=0.46=py37_0
opencv=3.4.2=py37h6fd60c2_1
openssl=1.1.1a=h7b6447c_0
pcre=8.42=h439df22_0
pillow=5.2.0=py37heded4f4_0
pip=10.0.1=py37_0
pixman=0.34.0=hceecf20_3
py-opencv=3.4.2=py37hb342d67_1
pycparser=2.18=py37_1
pyparsing=2.2.0=py37_1
pyqt=5.9.2=py37h05f1152_2
python=3.7.1=h0371630_3
python-dateutil=2.7.3=py37_0
pytorch=1.0.0=py3.7_cuda10.0.130_cudnn7.4.1_1
pytz=2018.5=py37_0
qt=5.9.7=h5867ecd_1
readline=7.0=h7b6447c_5
scikit-learn=0.19.1=py37hedc7406_0
scipy=1.1.0=py37hfa4b5c9_1
setuptools=40.2.0=py37_0
sip=4.19.8=py37hf484d3e_0
six=1.11.0=py37_1
sqlite=3.25.3=h7b6447c_0
tk=8.6.8=hbc83047_0
torchvision=0.2.1=py37_1
tornado=5.1=py37h14c3975_0
tqdm=4.25.0=py37h28b3542_0
wheel=0.31.1=py37_0
xz=5.2.4=h14c3975_4
zlib=1.2.11=ha838bed_2

View File

@@ -1,54 +0,0 @@
{
"system": {
"dataset": "COCO",
"batch_size": 49,
"sampling_function": "cornernet",
"train_split": "trainval",
"val_split": "minival",
"learning_rate": 0.00025,
"decay_rate": 10,
"val_iter": 100,
"opt_algo": "adam",
"prefetch_size": 5,
"max_iter": 500000,
"stepsize": 450000,
"snapshot": 5000,
"chunk_sizes": [4, 5, 5, 5, 5, 5, 5, 5, 5, 5],
"data_dir": "./data"
},
"db": {
"rand_scale_min": 0.6,
"rand_scale_max": 1.4,
"rand_scale_step": 0.1,
"rand_scales": null,
"rand_crop": true,
"rand_color": true,
"border": 128,
"gaussian_bump": true,
"input_size": [511, 511],
"output_sizes": [[128, 128]],
"test_scales": [0.5, 0.75, 1, 1.25, 1.5],
"top_k": 100,
"categories": 80,
"ae_threshold": 0.5,
"nms_threshold": 0.5,
"merge_bbox": true,
"weight_exp": 10,
"max_per_image": 100
}
}

View File

@@ -1,52 +0,0 @@
{
"system": {
"dataset": "COCO",
"batch_size": 49,
"sampling_function": "cornernet",
"train_split": "trainval",
"val_split": "minival",
"learning_rate": 0.00025,
"decay_rate": 10,
"val_iter": 100,
"opt_algo": "adam",
"prefetch_size": 5,
"max_iter": 500000,
"stepsize": 450000,
"snapshot": 5000,
"chunk_sizes": [4, 5, 5, 5, 5, 5, 5, 5, 5, 5],
"data_dir": "./data"
},
"db": {
"rand_scale_min": 0.6,
"rand_scale_max": 1.4,
"rand_scale_step": 0.1,
"rand_scales": null,
"rand_crop": true,
"rand_color": true,
"border": 128,
"gaussian_bump": true,
"gaussian_iou": 0.3,
"input_size": [511, 511],
"output_sizes": [[128, 128]],
"test_scales": [1],
"top_k": 100,
"categories": 80,
"ae_threshold": 0.5,
"nms_threshold": 0.5,
"max_per_image": 100
}
}

View File

@@ -1,56 +0,0 @@
{
"system": {
"dataset": "COCO",
"batch_size": 48,
"sampling_function": "cornernet_saccade",
"train_split": "trainval",
"val_split": "minival",
"learning_rate": 0.00025,
"decay_rate": 10,
"val_iter": 100,
"opt_algo": "adam",
"prefetch_size": 5,
"max_iter": 500000,
"stepsize": 450000,
"snapshot": 5000,
"chunk_sizes": [12, 12, 12, 12]
},
"db": {
"rand_scale_min": 0.5,
"rand_scale_max": 1.1,
"rand_scale_step": 0.1,
"rand_scales": null,
"rand_full_crop": true,
"gaussian_bump": true,
"gaussian_iou": 0.5,
"min_scale": 16,
"view_sizes": [],
"height_mult": 31,
"width_mult": 31,
"input_size": [255, 255],
"output_sizes": [[64, 64]],
"att_max_crops": 30,
"att_scales": [[1, 2, 4]],
"att_thresholds": [0.3],
"top_k": 12,
"num_dets": 12,
"categories": 80,
"ae_threshold": 0.3,
"nms_threshold": 0.5,
"max_per_image": 100
}
}

View File

@@ -1,54 +0,0 @@
{
"system": {
"dataset": "COCO",
"batch_size": 55,
"sampling_function": "cornernet",
"train_split": "trainval",
"val_split": "minival",
"learning_rate": 0.00025,
"decay_rate": 10,
"val_iter": 100,
"opt_algo": "adam",
"prefetch_size": 5,
"max_iter": 500000,
"stepsize": 450000,
"snapshot": 5000,
"chunk_sizes": [13, 14, 14, 14],
"data_dir": "./data"
},
"db": {
"rand_scale_min": 0.6,
"rand_scale_max": 1.4,
"rand_scale_step": 0.1,
"rand_scales": null,
"rand_crop": true,
"rand_color": true,
"border": 128,
"gaussian_bump": true,
"gaussian_iou": 0.3,
"input_size": [511, 511],
"output_sizes": [[64, 64]],
"test_scales": [1],
"test_flipped": false,
"top_k": 20,
"num_dets": 100,
"categories": 80,
"ae_threshold": 0.5,
"nms_threshold": 0.5,
"max_per_image": 100
}
}

View File

@@ -1,39 +0,0 @@
import json
from .nnet.py_factory import NetworkFactory
class Base(object):
def __init__(self, db, nnet, func, model=None):
super(Base, self).__init__()
self._db = db
self._nnet = nnet
self._func = func
if model is not None:
self._nnet.load_pretrained_params(model)
self._nnet.cuda()
self._nnet.eval_mode()
def _inference(self, image, *args, **kwargs):
return self._func(self._db, self._nnet, image.copy(), *args, **kwargs)
def __call__(self, image, *args, **kwargs):
categories = self._db.configs["categories"]
bboxes = self._inference(image, *args, **kwargs)
return {self._db.cls2name(j): bboxes[j] for j in range(1, categories + 1)}
def load_cfg(cfg_file):
with open(cfg_file, "r") as f:
cfg = json.load(f)
cfg_sys = cfg["system"]
cfg_db = cfg["db"]
return cfg_sys, cfg_db
def load_nnet(cfg_sys, model):
return NetworkFactory(cfg_sys, model)

View File

@@ -1,164 +0,0 @@
import os
import numpy as np
class SystemConfig(object):
def __init__(self):
self._configs = {}
self._configs["dataset"] = None
self._configs["sampling_function"] = "coco_detection"
# Training Config
self._configs["display"] = 5
self._configs["snapshot"] = 400
self._configs["stepsize"] = 5000
self._configs["learning_rate"] = 0.001
self._configs["decay_rate"] = 10
self._configs["max_iter"] = 100000
self._configs["val_iter"] = 20
self._configs["batch_size"] = 1
self._configs["snapshot_name"] = None
self._configs["prefetch_size"] = 100
self._configs["pretrain"] = None
self._configs["opt_algo"] = "adam"
self._configs["chunk_sizes"] = None
# Directories
self._configs["data_dir"] = "./data"
self._configs["cache_dir"] = "./cache"
self._configs["config_dir"] = "./config"
self._configs["result_dir"] = "./results"
# Split
self._configs["train_split"] = "training"
self._configs["val_split"] = "validation"
self._configs["test_split"] = "testdev"
# Rng
self._configs["data_rng"] = np.random.RandomState(123)
self._configs["nnet_rng"] = np.random.RandomState(317)
@property
def chunk_sizes(self):
return self._configs["chunk_sizes"]
@property
def train_split(self):
return self._configs["train_split"]
@property
def val_split(self):
return self._configs["val_split"]
@property
def test_split(self):
return self._configs["test_split"]
@property
def full(self):
return self._configs
@property
def sampling_function(self):
return self._configs["sampling_function"]
@property
def data_rng(self):
return self._configs["data_rng"]
@property
def nnet_rng(self):
return self._configs["nnet_rng"]
@property
def opt_algo(self):
return self._configs["opt_algo"]
@property
def prefetch_size(self):
return self._configs["prefetch_size"]
@property
def pretrain(self):
return self._configs["pretrain"]
@property
def result_dir(self):
result_dir = os.path.join(self._configs["result_dir"], self.snapshot_name)
if not os.path.exists(result_dir):
os.makedirs(result_dir)
return result_dir
@property
def dataset(self):
return self._configs["dataset"]
@property
def snapshot_name(self):
return self._configs["snapshot_name"]
@property
def snapshot_dir(self):
snapshot_dir = os.path.join(self.cache_dir, "nnet", self.snapshot_name)
if not os.path.exists(snapshot_dir):
os.makedirs(snapshot_dir)
return snapshot_dir
@property
def snapshot_file(self):
snapshot_file = os.path.join(self.snapshot_dir, self.snapshot_name + "_{}.pkl")
return snapshot_file
@property
def config_dir(self):
return self._configs["config_dir"]
@property
def batch_size(self):
return self._configs["batch_size"]
@property
def max_iter(self):
return self._configs["max_iter"]
@property
def learning_rate(self):
return self._configs["learning_rate"]
@property
def decay_rate(self):
return self._configs["decay_rate"]
@property
def stepsize(self):
return self._configs["stepsize"]
@property
def snapshot(self):
return self._configs["snapshot"]
@property
def display(self):
return self._configs["display"]
@property
def val_iter(self):
return self._configs["val_iter"]
@property
def data_dir(self):
return self._configs["data_dir"]
@property
def cache_dir(self):
if not os.path.exists(self._configs["cache_dir"]):
os.makedirs(self._configs["cache_dir"])
return self._configs["cache_dir"]
def update_config(self, new):
for key in new:
if key in self._configs:
self._configs[key] = new[key]
return self

View File

@@ -1,5 +0,0 @@
from .coco import COCO
datasets = {
"COCO": COCO
}

View File

@@ -1,74 +0,0 @@
import os
import numpy as np
class BASE(object):
def __init__(self):
self._split = None
self._db_inds = []
self._image_ids = []
self._mean = np.zeros((3,), dtype=np.float32)
self._std = np.ones((3,), dtype=np.float32)
self._eig_val = np.ones((3,), dtype=np.float32)
self._eig_vec = np.zeros((3, 3), dtype=np.float32)
self._configs = {}
self._configs["data_aug"] = True
self._data_rng = None
@property
def configs(self):
return self._configs
@property
def mean(self):
return self._mean
@property
def std(self):
return self._std
@property
def eig_val(self):
return self._eig_val
@property
def eig_vec(self):
return self._eig_vec
@property
def db_inds(self):
return self._db_inds
@property
def split(self):
return self._split
def update_config(self, new):
for key in new:
if key in self._configs:
self._configs[key] = new[key]
def image_ids(self, ind):
return self._image_ids[ind]
def image_path(self, ind):
pass
def write_result(self, ind, all_bboxes, all_scores):
pass
def evaluate(self, name):
pass
def shuffle_inds(self, quiet=False):
if self._data_rng is None:
self._data_rng = np.random.RandomState(os.getpid())
if not quiet:
print("shuffling indices...")
rand_perm = self._data_rng.permutation(len(self._db_inds))
self._db_inds = self._db_inds[rand_perm]

View File

@@ -1,169 +0,0 @@
import os
import numpy as np
from .detection import DETECTION
# COCO bounding boxes are 0-indexed
class COCO(DETECTION):
def __init__(self, db_config, split=None, sys_config=None):
assert split is None or sys_config is not None
super(COCO, self).__init__(db_config)
self._mean = np.array([0.40789654, 0.44719302, 0.47026115], dtype=np.float32)
self._std = np.array([0.28863828, 0.27408164, 0.27809835], dtype=np.float32)
self._eig_val = np.array([0.2141788, 0.01817699, 0.00341571], dtype=np.float32)
self._eig_vec = np.array([
[-0.58752847, -0.69563484, 0.41340352],
[-0.5832747, 0.00994535, -0.81221408],
[-0.56089297, 0.71832671, 0.41158938]
], dtype=np.float32)
self._coco_cls_ids = [
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13,
14, 15, 16, 17, 18, 19, 20, 21, 22, 23,
24, 25, 27, 28, 31, 32, 33, 34, 35, 36,
37, 38, 39, 40, 41, 42, 43, 44, 46, 47,
48, 49, 50, 51, 52, 53, 54, 55, 56, 57,
58, 59, 60, 61, 62, 63, 64, 65, 67, 70,
72, 73, 74, 75, 76, 77, 78, 79, 80, 81,
82, 84, 85, 86, 87, 88, 89, 90
]
self._coco_cls_names = [
'person', 'bicycle', 'car', 'motorcycle', 'airplane',
'bus', 'train', 'truck', 'boat', 'traffic light',
'fire hydrant', 'stop sign', 'parking meter', 'bench',
'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant',
'bear', 'zebra', 'giraffe', 'backpack', 'umbrella',
'handbag', 'tie', 'suitcase', 'frisbee', 'skis',
'snowboard', 'sports ball', 'kite', 'baseball bat',
'baseball glove', 'skateboard', 'surfboard',
'tennis racket', 'bottle', 'wine glass', 'cup', 'fork',
'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich',
'orange', 'broccoli', 'carrot', 'hot dog', 'pizza',
'donut', 'cake', 'chair', 'couch', 'potted plant',
'bed', 'dining table', 'toilet', 'tv', 'laptop',
'mouse', 'remote', 'keyboard', 'cell phone', 'microwave',
'oven', 'toaster', 'sink', 'refrigerator', 'book',
'clock', 'vase', 'scissors', 'teddy bear', 'hair drier',
'toothbrush'
]
self._cls2coco = {ind + 1: coco_id for ind, coco_id in enumerate(self._coco_cls_ids)}
self._coco2cls = {coco_id: cls_id for cls_id, coco_id in self._cls2coco.items()}
self._coco2name = {cls_id: cls_name for cls_id, cls_name in zip(self._coco_cls_ids, self._coco_cls_names)}
self._name2coco = {cls_name: cls_id for cls_name, cls_id in self._coco2name.items()}
if split is not None:
coco_dir = os.path.join(sys_config.data_dir, "coco")
self._split = {
"trainval": "trainval2014",
"minival": "minival2014",
"testdev": "testdev2017"
}[split]
self._data_dir = os.path.join(coco_dir, "images", self._split)
self._anno_file = os.path.join(coco_dir, "annotations", "instances_{}.json".format(self._split))
self._detections, self._eval_ids = self._load_coco_annos()
self._image_ids = list(self._detections.keys())
self._db_inds = np.arange(len(self._image_ids))
def _load_coco_annos(self):
from pycocotools.coco import COCO
coco = COCO(self._anno_file)
self._coco = coco
class_ids = coco.getCatIds()
image_ids = coco.getImgIds()
eval_ids = {}
detections = {}
for image_id in image_ids:
image = coco.loadImgs(image_id)[0]
dets = []
eval_ids[image["file_name"]] = image_id
for class_id in class_ids:
annotation_ids = coco.getAnnIds(imgIds=image["id"], catIds=class_id)
annotations = coco.loadAnns(annotation_ids)
category = self._coco2cls[class_id]
for annotation in annotations:
det = annotation["bbox"] + [category]
det[2] += det[0]
det[3] += det[1]
dets.append(det)
file_name = image["file_name"]
if len(dets) == 0:
detections[file_name] = np.zeros((0, 5), dtype=np.float32)
else:
detections[file_name] = np.array(dets, dtype=np.float32)
return detections, eval_ids
def image_path(self, ind):
if self._data_dir is None:
raise ValueError("Data directory is not set")
db_ind = self._db_inds[ind]
file_name = self._image_ids[db_ind]
return os.path.join(self._data_dir, file_name)
def detections(self, ind):
db_ind = self._db_inds[ind]
file_name = self._image_ids[db_ind]
return self._detections[file_name].copy()
def cls2name(self, cls):
coco = self._cls2coco[cls]
return self._coco2name[coco]
def _to_float(self, x):
return float("{:.2f}".format(x))
def convert_to_coco(self, all_bboxes):
detections = []
for image_id in all_bboxes:
coco_id = self._eval_ids[image_id]
for cls_ind in all_bboxes[image_id]:
category_id = self._cls2coco[cls_ind]
for bbox in all_bboxes[image_id][cls_ind]:
bbox[2] -= bbox[0]
bbox[3] -= bbox[1]
score = bbox[4]
bbox = list(map(self._to_float, bbox[0:4]))
detection = {
"image_id": coco_id,
"category_id": category_id,
"bbox": bbox,
"score": float("{:.2f}".format(score))
}
detections.append(detection)
return detections
def evaluate(self, result_json, cls_ids, image_ids):
from pycocotools.cocoeval import COCOeval
if self._split == "testdev":
return None
coco = self._coco
eval_ids = [self._eval_ids[image_id] for image_id in image_ids]
cat_ids = [self._cls2coco[cls_id] for cls_id in cls_ids]
coco_dets = coco.loadRes(result_json)
coco_eval = COCOeval(coco, coco_dets, "bbox")
coco_eval.params.imgIds = eval_ids
coco_eval.params.catIds = cat_ids
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()
return coco_eval.stats[0], coco_eval.stats[12:]

View File

@@ -1,71 +0,0 @@
import numpy as np
from .base import BASE
class DETECTION(BASE):
def __init__(self, db_config):
super(DETECTION, self).__init__()
# Configs for training
self._configs["categories"] = 80
self._configs["rand_scales"] = [1]
self._configs["rand_scale_min"] = 0.8
self._configs["rand_scale_max"] = 1.4
self._configs["rand_scale_step"] = 0.2
# Configs for both training and testing
self._configs["input_size"] = [383, 383]
self._configs["output_sizes"] = [[96, 96], [48, 48], [24, 24], [12, 12]]
self._configs["score_threshold"] = 0.05
self._configs["nms_threshold"] = 0.7
self._configs["max_per_set"] = 40
self._configs["max_per_image"] = 100
self._configs["top_k"] = 20
self._configs["ae_threshold"] = 1
self._configs["nms_kernel"] = 3
self._configs["num_dets"] = 1000
self._configs["nms_algorithm"] = "exp_soft_nms"
self._configs["weight_exp"] = 8
self._configs["merge_bbox"] = False
self._configs["data_aug"] = True
self._configs["lighting"] = True
self._configs["border"] = 64
self._configs["gaussian_bump"] = False
self._configs["gaussian_iou"] = 0.7
self._configs["gaussian_radius"] = -1
self._configs["rand_crop"] = False
self._configs["rand_color"] = False
self._configs["rand_center"] = True
self._configs["init_sizes"] = [192, 255]
self._configs["view_sizes"] = []
self._configs["min_scale"] = 16
self._configs["max_scale"] = 32
self._configs["att_sizes"] = [[16, 16], [32, 32], [64, 64]]
self._configs["att_ranges"] = [[96, 256], [32, 96], [0, 32]]
self._configs["att_ratios"] = [16, 8, 4]
self._configs["att_scales"] = [1, 1.5, 2]
self._configs["att_thresholds"] = [0.3, 0.3, 0.3, 0.3]
self._configs["att_nms_ks"] = [3, 3, 3]
self._configs["att_max_crops"] = 8
self._configs["ref_dets"] = True
# Configs for testing
self._configs["test_scales"] = [1]
self._configs["test_flipped"] = True
self.update_config(db_config)
if self._configs["rand_scales"] is None:
self._configs["rand_scales"] = np.arange(
self._configs["rand_scale_min"],
self._configs["rand_scale_max"],
self._configs["rand_scale_step"]
)

View File

@@ -1,52 +0,0 @@
from .base import Base, load_cfg, load_nnet
from .config import SystemConfig
from .dbs.coco import COCO
from .paths import get_file_path
class CornerNet(Base):
def __init__(self):
from .test.cornernet import cornernet_inference
from .models.CornerNet import model
cfg_path = get_file_path("..", "configs", "CornerNet.json")
model_path = get_file_path("..", "cache", "nnet", "CornerNet", "CornerNet_500000.pkl")
cfg_sys, cfg_db = load_cfg(cfg_path)
sys_cfg = SystemConfig().update_config(cfg_sys)
coco = COCO(cfg_db)
cornernet = load_nnet(sys_cfg, model())
super(CornerNet, self).__init__(coco, cornernet, cornernet_inference, model=model_path)
class CornerNet_Squeeze(Base):
def __init__(self):
from .test.cornernet import cornernet_inference
from .models.CornerNet_Squeeze import model
cfg_path = get_file_path("..", "configs", "CornerNet_Squeeze.json")
model_path = get_file_path("..", "cache", "nnet", "CornerNet_Squeeze", "CornerNet_Squeeze_500000.pkl")
cfg_sys, cfg_db = load_cfg(cfg_path)
sys_cfg = SystemConfig().update_config(cfg_sys)
coco = COCO(cfg_db)
cornernet = load_nnet(sys_cfg, model())
super(CornerNet_Squeeze, self).__init__(coco, cornernet, cornernet_inference, model=model_path)
class CornerNet_Saccade(Base):
def __init__(self):
from .test.cornernet_saccade import cornernet_saccade_inference
from .models.CornerNet_Saccade import model
cfg_path = get_file_path("..", "configs", "CornerNet_Saccade.json")
model_path = get_file_path("..", "cache", "nnet", "CornerNet_Saccade", "CornerNet_Saccade_500000.pkl")
cfg_sys, cfg_db = load_cfg(cfg_path)
sys_cfg = SystemConfig().update_config(cfg_sys)
coco = COCO(cfg_db)
cornernet = load_nnet(sys_cfg, model())
super(CornerNet_Saccade, self).__init__(coco, cornernet, cornernet_saccade_inference, model=model_path)

View File

@@ -1,7 +0,0 @@
bbox.c
bbox.cpython-35m-x86_64-linux-gnu.so
bbox.cpython-36m-x86_64-linux-gnu.so
nms.c
nms.cpython-35m-x86_64-linux-gnu.so
nms.cpython-36m-x86_64-linux-gnu.so

View File

@@ -1,3 +0,0 @@
all:
python setup.py build_ext --inplace
rm -rf build

View File

@@ -1,55 +0,0 @@
# --------------------------------------------------------
# Fast R-CNN
# Copyright (c) 2015 Microsoft
# Licensed under The MIT License [see LICENSE for details]
# Written by Sergey Karayev
# --------------------------------------------------------
cimport cython
import numpy as np
cimport numpy as np
DTYPE = np.float
ctypedef np.float_t DTYPE_t
def bbox_overlaps(
np.ndarray[DTYPE_t, ndim=2] boxes,
np.ndarray[DTYPE_t, ndim=2] query_boxes):
"""
Parameters
----------
boxes: (N, 4) ndarray of float
query_boxes: (K, 4) ndarray of float
Returns
-------
overlaps: (N, K) ndarray of overlap between boxes and query_boxes
"""
cdef unsigned int N = boxes.shape[0]
cdef unsigned int K = query_boxes.shape[0]
cdef np.ndarray[DTYPE_t, ndim=2] overlaps = np.zeros((N, K), dtype=DTYPE)
cdef DTYPE_t iw, ih, box_area
cdef DTYPE_t ua
cdef unsigned int k, n
for k in range(K):
box_area = (
(query_boxes[k, 2] - query_boxes[k, 0] + 1) *
(query_boxes[k, 3] - query_boxes[k, 1] + 1)
)
for n in range(N):
iw = (
min(boxes[n, 2], query_boxes[k, 2]) -
max(boxes[n, 0], query_boxes[k, 0]) + 1
)
if iw > 0:
ih = (
min(boxes[n, 3], query_boxes[k, 3]) -
max(boxes[n, 1], query_boxes[k, 1]) + 1
)
if ih > 0:
ua = float(
(boxes[n, 2] - boxes[n, 0] + 1) *
(boxes[n, 3] - boxes[n, 1] + 1) +
box_area - iw * ih
)
overlaps[n, k] = iw * ih / ua
return overlaps
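For readers who prefer a vectorized reference, the following NumPy sketch (not part of this file) computes the same (N, K) overlap matrix as `bbox_overlaps` above, using the same `+ 1` box-size convention:
```python
# NumPy reference sketch of the pairwise overlap (IoU) computed by bbox_overlaps.
# Boxes are [x1, y1, x2, y2]; the "+ 1" mirrors the convention in the Cython code.
import numpy as np

def bbox_overlaps_np(boxes, query_boxes):
    boxes = np.asarray(boxes, dtype=np.float64)              # (N, 4)
    query_boxes = np.asarray(query_boxes, dtype=np.float64)  # (K, 4)
    iw = (np.minimum(boxes[:, None, 2], query_boxes[None, :, 2]) -
          np.maximum(boxes[:, None, 0], query_boxes[None, :, 0]) + 1).clip(min=0)
    ih = (np.minimum(boxes[:, None, 3], query_boxes[None, :, 3]) -
          np.maximum(boxes[:, None, 1], query_boxes[None, :, 1]) + 1).clip(min=0)
    inter = iw * ih
    areas = (boxes[:, 2] - boxes[:, 0] + 1) * (boxes[:, 3] - boxes[:, 1] + 1)
    query_areas = (query_boxes[:, 2] - query_boxes[:, 0] + 1) * \
                  (query_boxes[:, 3] - query_boxes[:, 1] + 1)
    union = areas[:, None] + query_areas[None, :] - inter
    return inter / union                                     # (N, K)
```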

View File

@@ -1,281 +0,0 @@
# --------------------------------------------------------
# Fast R-CNN
# Copyright (c) 2015 Microsoft
# Licensed under The MIT License [see LICENSE for details]
# Written by Ross Girshick
# --------------------------------------------------------
import numpy as np
cimport numpy as np
cdef inline np.float32_t max(np.float32_t a, np.float32_t b):
return a if a >= b else b
cdef inline np.float32_t min(np.float32_t a, np.float32_t b):
return a if a <= b else b
def nms(np.ndarray[np.float32_t, ndim=2] dets, np.float thresh):
cdef np.ndarray[np.float32_t, ndim=1] x1 = dets[:, 0]
cdef np.ndarray[np.float32_t, ndim=1] y1 = dets[:, 1]
cdef np.ndarray[np.float32_t, ndim=1] x2 = dets[:, 2]
cdef np.ndarray[np.float32_t, ndim=1] y2 = dets[:, 3]
cdef np.ndarray[np.float32_t, ndim=1] scores = dets[:, 4]
cdef np.ndarray[np.float32_t, ndim=1] areas = (x2 - x1 + 1) * (y2 - y1 + 1)
cdef np.ndarray[np.int_t, ndim=1] order = scores.argsort()[::-1]
cdef int ndets = dets.shape[0]
cdef np.ndarray[np.int_t, ndim=1] suppressed = \
np.zeros((ndets), dtype=np.int)
# nominal indices
cdef int _i, _j
# sorted indices
cdef int i, j
# temp variables for box i's (the box currently under consideration)
cdef np.float32_t ix1, iy1, ix2, iy2, iarea
# variables for computing overlap with box j (lower scoring box)
cdef np.float32_t xx1, yy1, xx2, yy2
cdef np.float32_t w, h
cdef np.float32_t inter, ovr
keep = []
for _i in range(ndets):
i = order[_i]
if suppressed[i] == 1:
continue
keep.append(i)
ix1 = x1[i]
iy1 = y1[i]
ix2 = x2[i]
iy2 = y2[i]
iarea = areas[i]
for _j in range(_i + 1, ndets):
j = order[_j]
if suppressed[j] == 1:
continue
xx1 = max(ix1, x1[j])
yy1 = max(iy1, y1[j])
xx2 = min(ix2, x2[j])
yy2 = min(iy2, y2[j])
w = max(0.0, xx2 - xx1 + 1)
h = max(0.0, yy2 - yy1 + 1)
inter = w * h
ovr = inter / (iarea + areas[j] - inter)
if ovr >= thresh:
suppressed[j] = 1
return keep
def soft_nms(np.ndarray[float, ndim=2] boxes, float sigma=0.5, float Nt=0.3, float threshold=0.001,
unsigned int method=0):
cdef unsigned int N = boxes.shape[0]
cdef float iw, ih, box_area
cdef float ua
cdef int pos = 0
cdef float maxscore = 0
cdef int maxpos = 0
cdef float x1, x2, y1, y2, tx1, tx2, ty1, ty2, ts, area, weight, ov
for i in range(N):
maxscore = boxes[i, 4]
maxpos = i
tx1 = boxes[i, 0]
ty1 = boxes[i, 1]
tx2 = boxes[i, 2]
ty2 = boxes[i, 3]
ts = boxes[i, 4]
pos = i + 1
# get max box
while pos < N:
if maxscore < boxes[pos, 4]:
maxscore = boxes[pos, 4]
maxpos = pos
pos = pos + 1
# add max box as a detection
boxes[i, 0] = boxes[maxpos, 0]
boxes[i, 1] = boxes[maxpos, 1]
boxes[i, 2] = boxes[maxpos, 2]
boxes[i, 3] = boxes[maxpos, 3]
boxes[i, 4] = boxes[maxpos, 4]
# swap ith box with position of max box
boxes[maxpos, 0] = tx1
boxes[maxpos, 1] = ty1
boxes[maxpos, 2] = tx2
boxes[maxpos, 3] = ty2
boxes[maxpos, 4] = ts
tx1 = boxes[i, 0]
ty1 = boxes[i, 1]
tx2 = boxes[i, 2]
ty2 = boxes[i, 3]
ts = boxes[i, 4]
pos = i + 1
# NMS iterations, note that N changes if detection boxes fall below threshold
while pos < N:
x1 = boxes[pos, 0]
y1 = boxes[pos, 1]
x2 = boxes[pos, 2]
y2 = boxes[pos, 3]
s = boxes[pos, 4]
area = (x2 - x1 + 1) * (y2 - y1 + 1)
iw = (min(tx2, x2) - max(tx1, x1) + 1)
if iw > 0:
ih = (min(ty2, y2) - max(ty1, y1) + 1)
if ih > 0:
ua = float((tx2 - tx1 + 1) * (ty2 - ty1 + 1) + area - iw * ih)
ov = iw * ih / ua #iou between max box and detection box
if method == 1: # linear
if ov > Nt:
weight = 1 - ov
else:
weight = 1
elif method == 2: # gaussian
weight = np.exp(-(ov * ov) / sigma)
else: # original NMS
if ov > Nt:
weight = 0
else:
weight = 1
boxes[pos, 4] = weight * boxes[pos, 4]
# if box score falls below threshold, discard the box by swapping with last box
# update N
if boxes[pos, 4] < threshold:
boxes[pos, 0] = boxes[N - 1, 0]
boxes[pos, 1] = boxes[N - 1, 1]
boxes[pos, 2] = boxes[N - 1, 2]
boxes[pos, 3] = boxes[N - 1, 3]
boxes[pos, 4] = boxes[N - 1, 4]
N = N - 1
pos = pos - 1
pos = pos + 1
keep = [i for i in range(N)]
return keep
def soft_nms_merge(np.ndarray[float, ndim=2] boxes, float sigma=0.5, float Nt=0.3, float threshold=0.001,
unsigned int method=0, float weight_exp=6):
cdef unsigned int N = boxes.shape[0]
cdef float iw, ih, box_area
cdef float ua
cdef int pos = 0
cdef float maxscore = 0
cdef int maxpos = 0
cdef float x1, x2, y1, y2, tx1, tx2, ty1, ty2, ts, area, weight, ov
cdef float mx1, mx2, my1, my2, mts, mbs, mw
for i in range(N):
maxscore = boxes[i, 4]
maxpos = i
tx1 = boxes[i, 0]
ty1 = boxes[i, 1]
tx2 = boxes[i, 2]
ty2 = boxes[i, 3]
ts = boxes[i, 4]
pos = i + 1
# get max box
while pos < N:
if maxscore < boxes[pos, 4]:
maxscore = boxes[pos, 4]
maxpos = pos
pos = pos + 1
# add max box as a detection
boxes[i, 0] = boxes[maxpos, 0]
boxes[i, 1] = boxes[maxpos, 1]
boxes[i, 2] = boxes[maxpos, 2]
boxes[i, 3] = boxes[maxpos, 3]
boxes[i, 4] = boxes[maxpos, 4]
mx1 = boxes[i, 0] * boxes[i, 5]
my1 = boxes[i, 1] * boxes[i, 5]
mx2 = boxes[i, 2] * boxes[i, 6]
my2 = boxes[i, 3] * boxes[i, 6]
mts = boxes[i, 5]
mbs = boxes[i, 6]
# swap ith box with position of max box
boxes[maxpos, 0] = tx1
boxes[maxpos, 1] = ty1
boxes[maxpos, 2] = tx2
boxes[maxpos, 3] = ty2
boxes[maxpos, 4] = ts
tx1 = boxes[i, 0]
ty1 = boxes[i, 1]
tx2 = boxes[i, 2]
ty2 = boxes[i, 3]
ts = boxes[i, 4]
pos = i + 1
# NMS iterations, note that N changes if detection boxes fall below threshold
while pos < N:
x1 = boxes[pos, 0]
y1 = boxes[pos, 1]
x2 = boxes[pos, 2]
y2 = boxes[pos, 3]
s = boxes[pos, 4]
area = (x2 - x1 + 1) * (y2 - y1 + 1)
iw = (min(tx2, x2) - max(tx1, x1) + 1)
if iw > 0:
ih = (min(ty2, y2) - max(ty1, y1) + 1)
if ih > 0:
ua = float((tx2 - tx1 + 1) * (ty2 - ty1 + 1) + area - iw * ih)
ov = iw * ih / ua #iou between max box and detection box
if method == 1: # linear
if ov > Nt:
weight = 1 - ov
else:
weight = 1
elif method == 2: # gaussian
weight = np.exp(-(ov * ov) / sigma)
else: # original NMS
if ov > Nt:
weight = 0
else:
weight = 1
mw = (1 - weight) ** weight_exp
mx1 = mx1 + boxes[pos, 0] * boxes[pos, 5] * mw
my1 = my1 + boxes[pos, 1] * boxes[pos, 5] * mw
mx2 = mx2 + boxes[pos, 2] * boxes[pos, 6] * mw
my2 = my2 + boxes[pos, 3] * boxes[pos, 6] * mw
mts = mts + boxes[pos, 5] * mw
mbs = mbs + boxes[pos, 6] * mw
boxes[pos, 4] = weight * boxes[pos, 4]
# if box score falls below threshold, discard the box by swapping with last box
# update N
if boxes[pos, 4] < threshold:
boxes[pos, 0] = boxes[N - 1, 0]
boxes[pos, 1] = boxes[N - 1, 1]
boxes[pos, 2] = boxes[N - 1, 2]
boxes[pos, 3] = boxes[N - 1, 3]
boxes[pos, 4] = boxes[N - 1, 4]
N = N - 1
pos = pos - 1
pos = pos + 1
boxes[i, 0] = mx1 / mts
boxes[i, 1] = my1 / mts
boxes[i, 2] = mx2 / mbs
boxes[i, 3] = my2 / mbs
keep = [i for i in range(N)]
return keep

View File

@@ -1,24 +0,0 @@
from distutils.core import setup
from distutils.extension import Extension
import numpy
from Cython.Build import cythonize
extensions = [
Extension(
"bbox",
["bbox.pyx"],
extra_compile_args=[]
),
Extension(
"nms",
["nms.pyx"],
extra_compile_args=[]
)
]
setup(
name="coco",
ext_modules=cythonize(extensions),
include_dirs=[numpy.get_include()]
)

View File

@@ -1,73 +0,0 @@
import torch
import torch.nn as nn
from .py_utils import TopPool, BottomPool, LeftPool, RightPool
from .py_utils.losses import CornerNet_Loss
from .py_utils.modules import hg_module, hg, hg_net
from .py_utils.utils import convolution, residual, corner_pool
def make_pool_layer(dim):
return nn.Sequential()
def make_hg_layer(inp_dim, out_dim, modules):
layers = [residual(inp_dim, out_dim, stride=2)]
layers += [residual(out_dim, out_dim) for _ in range(1, modules)]
return nn.Sequential(*layers)
class model(hg_net):
def _pred_mod(self, dim):
return nn.Sequential(
convolution(3, 256, 256, with_bn=False),
nn.Conv2d(256, dim, (1, 1))
)
def _merge_mod(self):
return nn.Sequential(
nn.Conv2d(256, 256, (1, 1), bias=False),
nn.BatchNorm2d(256)
)
def __init__(self):
stacks = 2
pre = nn.Sequential(
convolution(7, 3, 128, stride=2),
residual(128, 256, stride=2)
)
hg_mods = nn.ModuleList([
hg_module(
5, [256, 256, 384, 384, 384, 512], [2, 2, 2, 2, 2, 4],
make_pool_layer=make_pool_layer,
make_hg_layer=make_hg_layer
) for _ in range(stacks)
])
cnvs = nn.ModuleList([convolution(3, 256, 256) for _ in range(stacks)])
inters = nn.ModuleList([residual(256, 256) for _ in range(stacks - 1)])
cnvs_ = nn.ModuleList([self._merge_mod() for _ in range(stacks - 1)])
inters_ = nn.ModuleList([self._merge_mod() for _ in range(stacks - 1)])
hgs = hg(pre, hg_mods, cnvs, inters, cnvs_, inters_)
tl_modules = nn.ModuleList([corner_pool(256, TopPool, LeftPool) for _ in range(stacks)])
br_modules = nn.ModuleList([corner_pool(256, BottomPool, RightPool) for _ in range(stacks)])
tl_heats = nn.ModuleList([self._pred_mod(80) for _ in range(stacks)])
br_heats = nn.ModuleList([self._pred_mod(80) for _ in range(stacks)])
for tl_heat, br_heat in zip(tl_heats, br_heats):
torch.nn.init.constant_(tl_heat[-1].bias, -2.19)
torch.nn.init.constant_(br_heat[-1].bias, -2.19)
tl_tags = nn.ModuleList([self._pred_mod(1) for _ in range(stacks)])
br_tags = nn.ModuleList([self._pred_mod(1) for _ in range(stacks)])
tl_offs = nn.ModuleList([self._pred_mod(2) for _ in range(stacks)])
br_offs = nn.ModuleList([self._pred_mod(2) for _ in range(stacks)])
super(model, self).__init__(
hgs, tl_modules, br_modules, tl_heats, br_heats,
tl_tags, br_tags, tl_offs, br_offs
)
self.loss = CornerNet_Loss(pull_weight=1e-1, push_weight=1e-1)

View File

@@ -1,93 +0,0 @@
import torch
import torch.nn as nn
from .py_utils import TopPool, BottomPool, LeftPool, RightPool
from .py_utils.losses import CornerNet_Saccade_Loss
from .py_utils.modules import saccade_net, saccade_module, saccade
from .py_utils.utils import convolution, residual, corner_pool
def make_pool_layer(dim):
return nn.Sequential()
def make_hg_layer(inp_dim, out_dim, modules):
layers = [residual(inp_dim, out_dim, stride=2)]
layers += [residual(out_dim, out_dim) for _ in range(1, modules)]
return nn.Sequential(*layers)
class model(saccade_net):
def _pred_mod(self, dim):
return nn.Sequential(
convolution(3, 256, 256, with_bn=False),
nn.Conv2d(256, dim, (1, 1))
)
def _merge_mod(self):
return nn.Sequential(
nn.Conv2d(256, 256, (1, 1), bias=False),
nn.BatchNorm2d(256)
)
def __init__(self):
stacks = 3
pre = nn.Sequential(
convolution(7, 3, 128, stride=2),
residual(128, 256, stride=2)
)
hg_mods = nn.ModuleList([
saccade_module(
3, [256, 384, 384, 512], [1, 1, 1, 1],
make_pool_layer=make_pool_layer,
make_hg_layer=make_hg_layer
) for _ in range(stacks)
])
cnvs = nn.ModuleList([convolution(3, 256, 256) for _ in range(stacks)])
inters = nn.ModuleList([residual(256, 256) for _ in range(stacks - 1)])
cnvs_ = nn.ModuleList([self._merge_mod() for _ in range(stacks - 1)])
inters_ = nn.ModuleList([self._merge_mod() for _ in range(stacks - 1)])
att_mods = nn.ModuleList([
nn.ModuleList([
nn.Sequential(
convolution(3, 384, 256, with_bn=False),
nn.Conv2d(256, 1, (1, 1))
),
nn.Sequential(
convolution(3, 384, 256, with_bn=False),
nn.Conv2d(256, 1, (1, 1))
),
nn.Sequential(
convolution(3, 256, 256, with_bn=False),
nn.Conv2d(256, 1, (1, 1))
)
]) for _ in range(stacks)
])
for att_mod in att_mods:
for att in att_mod:
torch.nn.init.constant_(att[-1].bias, -2.19)
hgs = saccade(pre, hg_mods, cnvs, inters, cnvs_, inters_)
tl_modules = nn.ModuleList([corner_pool(256, TopPool, LeftPool) for _ in range(stacks)])
br_modules = nn.ModuleList([corner_pool(256, BottomPool, RightPool) for _ in range(stacks)])
tl_heats = nn.ModuleList([self._pred_mod(80) for _ in range(stacks)])
br_heats = nn.ModuleList([self._pred_mod(80) for _ in range(stacks)])
for tl_heat, br_heat in zip(tl_heats, br_heats):
torch.nn.init.constant_(tl_heat[-1].bias, -2.19)
torch.nn.init.constant_(br_heat[-1].bias, -2.19)
tl_tags = nn.ModuleList([self._pred_mod(1) for _ in range(stacks)])
br_tags = nn.ModuleList([self._pred_mod(1) for _ in range(stacks)])
tl_offs = nn.ModuleList([self._pred_mod(2) for _ in range(stacks)])
br_offs = nn.ModuleList([self._pred_mod(2) for _ in range(stacks)])
super(model, self).__init__(
hgs, tl_modules, br_modules, tl_heats, br_heats,
tl_tags, br_tags, tl_offs, br_offs, att_mods
)
self.loss = CornerNet_Saccade_Loss(pull_weight=1e-1, push_weight=1e-1)

View File

@@ -1,117 +0,0 @@
import torch
import torch.nn as nn
from .py_utils import TopPool, BottomPool, LeftPool, RightPool
from .py_utils.losses import CornerNet_Loss
from .py_utils.modules import hg_module, hg, hg_net
from .py_utils.utils import convolution, corner_pool, residual
class fire_module(nn.Module):
def __init__(self, inp_dim, out_dim, sr=2, stride=1):
super(fire_module, self).__init__()
self.conv1 = nn.Conv2d(inp_dim, out_dim // sr, kernel_size=1, stride=1, bias=False)
self.bn1 = nn.BatchNorm2d(out_dim // sr)
self.conv_1x1 = nn.Conv2d(out_dim // sr, out_dim // 2, kernel_size=1, stride=stride, bias=False)
self.conv_3x3 = nn.Conv2d(out_dim // sr, out_dim // 2, kernel_size=3, padding=1,
stride=stride, groups=out_dim // sr, bias=False)
self.bn2 = nn.BatchNorm2d(out_dim)
self.skip = (stride == 1 and inp_dim == out_dim)
self.relu = nn.ReLU(inplace=True)
def forward(self, x):
conv1 = self.conv1(x)
bn1 = self.bn1(conv1)
conv2 = torch.cat((self.conv_1x1(bn1), self.conv_3x3(bn1)), 1)
bn2 = self.bn2(conv2)
if self.skip:
return self.relu(bn2 + x)
else:
return self.relu(bn2)
def make_pool_layer(dim):
return nn.Sequential()
def make_unpool_layer(dim):
return nn.ConvTranspose2d(dim, dim, kernel_size=4, stride=2, padding=1)
def make_layer(inp_dim, out_dim, modules):
layers = [fire_module(inp_dim, out_dim)]
layers += [fire_module(out_dim, out_dim) for _ in range(1, modules)]
return nn.Sequential(*layers)
def make_layer_revr(inp_dim, out_dim, modules):
layers = [fire_module(inp_dim, inp_dim) for _ in range(modules - 1)]
layers += [fire_module(inp_dim, out_dim)]
return nn.Sequential(*layers)
def make_hg_layer(inp_dim, out_dim, modules):
layers = [fire_module(inp_dim, out_dim, stride=2)]
layers += [fire_module(out_dim, out_dim) for _ in range(1, modules)]
return nn.Sequential(*layers)
class model(hg_net):
def _pred_mod(self, dim):
return nn.Sequential(
convolution(1, 256, 256, with_bn=False),
nn.Conv2d(256, dim, (1, 1))
)
def _merge_mod(self):
return nn.Sequential(
nn.Conv2d(256, 256, (1, 1), bias=False),
nn.BatchNorm2d(256)
)
def __init__(self):
stacks = 2
pre = nn.Sequential(
convolution(7, 3, 128, stride=2),
residual(128, 256, stride=2),
residual(256, 256, stride=2)
)
hg_mods = nn.ModuleList([
hg_module(
4, [256, 256, 384, 384, 512], [2, 2, 2, 2, 4],
make_pool_layer=make_pool_layer,
make_unpool_layer=make_unpool_layer,
make_up_layer=make_layer,
make_low_layer=make_layer,
make_hg_layer_revr=make_layer_revr,
make_hg_layer=make_hg_layer
) for _ in range(stacks)
])
cnvs = nn.ModuleList([convolution(3, 256, 256) for _ in range(stacks)])
inters = nn.ModuleList([residual(256, 256) for _ in range(stacks - 1)])
cnvs_ = nn.ModuleList([self._merge_mod() for _ in range(stacks - 1)])
inters_ = nn.ModuleList([self._merge_mod() for _ in range(stacks - 1)])
hgs = hg(pre, hg_mods, cnvs, inters, cnvs_, inters_)
tl_modules = nn.ModuleList([corner_pool(256, TopPool, LeftPool) for _ in range(stacks)])
br_modules = nn.ModuleList([corner_pool(256, BottomPool, RightPool) for _ in range(stacks)])
tl_heats = nn.ModuleList([self._pred_mod(80) for _ in range(stacks)])
br_heats = nn.ModuleList([self._pred_mod(80) for _ in range(stacks)])
for tl_heat, br_heat in zip(tl_heats, br_heats):
torch.nn.init.constant_(tl_heat[-1].bias, -2.19)
torch.nn.init.constant_(br_heat[-1].bias, -2.19)
tl_tags = nn.ModuleList([self._pred_mod(1) for _ in range(stacks)])
br_tags = nn.ModuleList([self._pred_mod(1) for _ in range(stacks)])
tl_offs = nn.ModuleList([self._pred_mod(2) for _ in range(stacks)])
br_offs = nn.ModuleList([self._pred_mod(2) for _ in range(stacks)])
super(model, self).__init__(
hgs, tl_modules, br_modules, tl_heats, br_heats,
tl_tags, br_tags, tl_offs, br_offs
)
self.loss = CornerNet_Loss(pull_weight=1e-1, push_weight=1e-1)

View File

@@ -1 +0,0 @@
from ._cpools import TopPool, BottomPool, LeftPool, RightPool

View File

@@ -1,3 +0,0 @@
build/
cpools.egg-info/
dist/

View File

@@ -1,82 +0,0 @@
import bottom_pool
import left_pool
import right_pool
import top_pool
from torch import nn
from torch.autograd import Function
class TopPoolFunction(Function):
@staticmethod
def forward(ctx, input):
output = top_pool.forward(input)[0]
ctx.save_for_backward(input)
return output
@staticmethod
def backward(ctx, grad_output):
input = ctx.saved_variables[0]
output = top_pool.backward(input, grad_output)[0]
return output
class BottomPoolFunction(Function):
@staticmethod
def forward(ctx, input):
output = bottom_pool.forward(input)[0]
ctx.save_for_backward(input)
return output
@staticmethod
def backward(ctx, grad_output):
input = ctx.saved_variables[0]
output = bottom_pool.backward(input, grad_output)[0]
return output
class LeftPoolFunction(Function):
@staticmethod
def forward(ctx, input):
output = left_pool.forward(input)[0]
ctx.save_for_backward(input)
return output
@staticmethod
def backward(ctx, grad_output):
input = ctx.saved_variables[0]
output = left_pool.backward(input, grad_output)[0]
return output
class RightPoolFunction(Function):
@staticmethod
def forward(ctx, input):
output = right_pool.forward(input)[0]
ctx.save_for_backward(input)
return output
@staticmethod
def backward(ctx, grad_output):
input = ctx.saved_variables[0]
output = right_pool.backward(input, grad_output)[0]
return output
class TopPool(nn.Module):
def forward(self, x):
return TopPoolFunction.apply(x)
class BottomPool(nn.Module):
def forward(self, x):
return BottomPoolFunction.apply(x)
class LeftPool(nn.Module):
def forward(self, x):
return LeftPoolFunction.apply(x)
class RightPool(nn.Module):
def forward(self, x):
return RightPoolFunction.apply(x)

View File

@@ -1,15 +0,0 @@
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CppExtension
setup(
name="cpools",
ext_modules=[
CppExtension("top_pool", ["src/top_pool.cpp"]),
CppExtension("bottom_pool", ["src/bottom_pool.cpp"]),
CppExtension("left_pool", ["src/left_pool.cpp"]),
CppExtension("right_pool", ["src/right_pool.cpp"])
],
cmdclass={
"build_ext": BuildExtension
}
)

View File

@@ -1,80 +0,0 @@
#include <torch/torch.h>
#include <vector>
std::vector<at::Tensor> pool_forward(
at::Tensor input
) {
// Initialize output
at::Tensor output = at::zeros_like(input);
// Get height
int64_t height = input.size(2);
output.copy_(input);
for (int64_t ind = 1; ind < height; ind <<= 1) {
at::Tensor max_temp = at::slice(output, 2, ind, height);
at::Tensor cur_temp = at::slice(output, 2, ind, height);
at::Tensor next_temp = at::slice(output, 2, 0, height-ind);
at::max_out(max_temp, cur_temp, next_temp);
}
return {
output
};
}
std::vector<at::Tensor> pool_backward(
at::Tensor input,
at::Tensor grad_output
) {
auto output = at::zeros_like(input);
int32_t batch = input.size(0);
int32_t channel = input.size(1);
int32_t height = input.size(2);
int32_t width = input.size(3);
auto max_val = torch::zeros({batch, channel, width}, at::device(at::kCUDA).dtype(at::kFloat));
auto max_ind = torch::zeros({batch, channel, width}, at::device(at::kCUDA).dtype(at::kLong));
auto input_temp = input.select(2, 0);
max_val.copy_(input_temp);
max_ind.fill_(0);
auto output_temp = output.select(2, 0);
auto grad_output_temp = grad_output.select(2, 0);
output_temp.copy_(grad_output_temp);
auto un_max_ind = max_ind.unsqueeze(2);
auto gt_mask = torch::zeros({batch, channel, width}, at::device(at::kCUDA).dtype(at::kByte));
auto max_temp = torch::zeros({batch, channel, width}, at::device(at::kCUDA).dtype(at::kFloat));
for (int32_t ind = 0; ind < height - 1; ++ind) {
input_temp = input.select(2, ind + 1);
at::gt_out(gt_mask, input_temp, max_val);
at::masked_select_out(max_temp, input_temp, gt_mask);
max_val.masked_scatter_(gt_mask, max_temp);
max_ind.masked_fill_(gt_mask, ind + 1);
grad_output_temp = grad_output.select(2, ind + 1).unsqueeze(2);
output.scatter_add_(2, un_max_ind, grad_output_temp);
}
return {
output
};
}
PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
m.def(
"forward", &pool_forward, "Bottom Pool Forward",
py::call_guard<py::gil_scoped_release>()
);
m.def(
"backward", &pool_backward, "Bottom Pool Backward",
py::call_guard<py::gil_scoped_release>()
);
}
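The doubling loop in `pool_forward` above computes, for each row, the running maximum over all rows at or above it (a prefix maximum along the height dimension); `pool_backward` routes each incoming gradient to the row that supplied that maximum. As a hedged reference (not part of this commit, and it assumes a newer PyTorch than the 1.0.0 listed above, since `torch.cummax` was added later), the forward passes of the bottom and top pools can be written as:
```python
# Reference sketch only: the corner pooling forward passes as cumulative maxima.
# bottom pool = running max from the top row downwards (prefix max along height);
# top pool    = running max from the bottom row upwards (suffix max along height).
# The left/right pools do the same along the width dimension (dim=3).
import torch

def bottom_pool_reference(x):
    # x: (batch, channel, height, width)
    return torch.cummax(x, dim=2).values

def top_pool_reference(x):
    return torch.flip(torch.cummax(torch.flip(x, dims=[2]), dim=2).values, dims=[2])
```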

View File

@@ -1,80 +0,0 @@
#include <torch/torch.h>
#include <vector>
std::vector<at::Tensor> pool_forward(
at::Tensor input
) {
// Initialize output
at::Tensor output = at::zeros_like(input);
// Get width
int64_t width = input.size(3);
output.copy_(input);
for (int64_t ind = 1; ind < width; ind <<= 1) {
at::Tensor max_temp = at::slice(output, 3, 0, width-ind);
at::Tensor cur_temp = at::slice(output, 3, 0, width-ind);
at::Tensor next_temp = at::slice(output, 3, ind, width);
at::max_out(max_temp, cur_temp, next_temp);
}
return {
output
};
}
std::vector<at::Tensor> pool_backward(
at::Tensor input,
at::Tensor grad_output
) {
auto output = at::zeros_like(input);
int32_t batch = input.size(0);
int32_t channel = input.size(1);
int32_t height = input.size(2);
int32_t width = input.size(3);
auto max_val = torch::zeros({batch, channel, height}, at::device(at::kCUDA).dtype(at::kFloat));
auto max_ind = torch::zeros({batch, channel, height}, at::device(at::kCUDA).dtype(at::kLong));
auto input_temp = input.select(3, width - 1);
max_val.copy_(input_temp);
max_ind.fill_(width - 1);
auto output_temp = output.select(3, width - 1);
auto grad_output_temp = grad_output.select(3, width - 1);
output_temp.copy_(grad_output_temp);
auto un_max_ind = max_ind.unsqueeze(3);
auto gt_mask = torch::zeros({batch, channel, height}, at::device(at::kCUDA).dtype(at::kByte));
auto max_temp = torch::zeros({batch, channel, height}, at::device(at::kCUDA).dtype(at::kFloat));
for (int32_t ind = 1; ind < width; ++ind) {
input_temp = input.select(3, width - ind - 1);
at::gt_out(gt_mask, input_temp, max_val);
at::masked_select_out(max_temp, input_temp, gt_mask);
max_val.masked_scatter_(gt_mask, max_temp);
max_ind.masked_fill_(gt_mask, width - ind - 1);
grad_output_temp = grad_output.select(3, width - ind - 1).unsqueeze(3);
output.scatter_add_(3, un_max_ind, grad_output_temp);
}
return {
output
};
}
PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
m.def(
"forward", &pool_forward, "Left Pool Forward",
py::call_guard<py::gil_scoped_release>()
);
m.def(
"backward", &pool_backward, "Left Pool Backward",
py::call_guard<py::gil_scoped_release>()
);
}

View File

@@ -1,80 +0,0 @@
#include <torch/torch.h>
#include <vector>
std::vector<at::Tensor> pool_forward(
at::Tensor input
) {
// Initialize output
at::Tensor output = at::zeros_like(input);
// Get width
int64_t width = input.size(3);
output.copy_(input);
for (int64_t ind = 1; ind < width; ind <<= 1) {
at::Tensor max_temp = at::slice(output, 3, ind, width);
at::Tensor cur_temp = at::slice(output, 3, ind, width);
at::Tensor next_temp = at::slice(output, 3, 0, width-ind);
at::max_out(max_temp, cur_temp, next_temp);
}
return {
output
};
}
std::vector<at::Tensor> pool_backward(
at::Tensor input,
at::Tensor grad_output
) {
at::Tensor output = at::zeros_like(input);
int32_t batch = input.size(0);
int32_t channel = input.size(1);
int32_t height = input.size(2);
int32_t width = input.size(3);
auto max_val = torch::zeros({batch, channel, height}, at::device(at::kCUDA).dtype(at::kFloat));
auto max_ind = torch::zeros({batch, channel, height}, at::device(at::kCUDA).dtype(at::kLong));
auto input_temp = input.select(3, 0);
max_val.copy_(input_temp);
max_ind.fill_(0);
auto output_temp = output.select(3, 0);
auto grad_output_temp = grad_output.select(3, 0);
output_temp.copy_(grad_output_temp);
auto un_max_ind = max_ind.unsqueeze(3);
auto gt_mask = torch::zeros({batch, channel, height}, at::device(at::kCUDA).dtype(at::kByte));
auto max_temp = torch::zeros({batch, channel, height}, at::device(at::kCUDA).dtype(at::kFloat));
for (int32_t ind = 0; ind < width - 1; ++ind) {
input_temp = input.select(3, ind + 1);
at::gt_out(gt_mask, input_temp, max_val);
at::masked_select_out(max_temp, input_temp, gt_mask);
max_val.masked_scatter_(gt_mask, max_temp);
max_ind.masked_fill_(gt_mask, ind + 1);
grad_output_temp = grad_output.select(3, ind + 1).unsqueeze(3);
output.scatter_add_(3, un_max_ind, grad_output_temp);
}
return {
output
};
}
PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
m.def(
"forward", &pool_forward, "Right Pool Forward",
py::call_guard<py::gil_scoped_release>()
);
m.def(
"backward", &pool_backward, "Right Pool Backward",
py::call_guard<py::gil_scoped_release>()
);
}

View File

@@ -1,80 +0,0 @@
#include <torch/torch.h>
#include <vector>
std::vector<at::Tensor> top_pool_forward(
at::Tensor input
) {
// Initialize output
at::Tensor output = at::zeros_like(input);
// Get height
int64_t height = input.size(2);
output.copy_(input);
for (int64_t ind = 1; ind < height; ind <<= 1) {
at::Tensor max_temp = at::slice(output, 2, 0, height-ind);
at::Tensor cur_temp = at::slice(output, 2, 0, height-ind);
at::Tensor next_temp = at::slice(output, 2, ind, height);
at::max_out(max_temp, cur_temp, next_temp);
}
return {
output
};
}
std::vector<at::Tensor> top_pool_backward(
at::Tensor input,
at::Tensor grad_output
) {
auto output = at::zeros_like(input);
int32_t batch = input.size(0);
int32_t channel = input.size(1);
int32_t height = input.size(2);
int32_t width = input.size(3);
auto max_val = torch::zeros({batch, channel, width}, at::device(at::kCUDA).dtype(at::kFloat));
auto max_ind = torch::zeros({batch, channel, width}, at::device(at::kCUDA).dtype(at::kLong));
auto input_temp = input.select(2, height - 1);
max_val.copy_(input_temp);
max_ind.fill_(height - 1);
auto output_temp = output.select(2, height - 1);
auto grad_output_temp = grad_output.select(2, height - 1);
output_temp.copy_(grad_output_temp);
auto un_max_ind = max_ind.unsqueeze(2);
auto gt_mask = torch::zeros({batch, channel, width}, at::device(at::kCUDA).dtype(at::kByte));
auto max_temp = torch::zeros({batch, channel, width}, at::device(at::kCUDA).dtype(at::kFloat));
for (int32_t ind = 1; ind < height; ++ind) {
input_temp = input.select(2, height - ind - 1);
at::gt_out(gt_mask, input_temp, max_val);
at::masked_select_out(max_temp, input_temp, gt_mask);
max_val.masked_scatter_(gt_mask, max_temp);
max_ind.masked_fill_(gt_mask, height - ind - 1);
grad_output_temp = grad_output.select(2, height - ind - 1).unsqueeze(2);
output.scatter_add_(2, un_max_ind, grad_output_temp);
}
return {
output
};
}
PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
m.def(
"forward", &top_pool_forward, "Top Pool Forward",
py::call_guard<py::gil_scoped_release>()
);
m.def(
"backward", &top_pool_backward, "Top Pool Backward",
py::call_guard<py::gil_scoped_release>()
);
}

View File

@@ -1,117 +0,0 @@
import torch
from torch.nn.modules import Module
from torch.nn.parallel.parallel_apply import parallel_apply
from torch.nn.parallel.replicate import replicate
from torch.nn.parallel.scatter_gather import gather
from .scatter_gather import scatter_kwargs
class DataParallel(Module):
r"""Implements data parallelism at the module level.
This container parallelizes the application of the given module by
splitting the input across the specified devices by chunking in the batch
dimension. In the forward pass, the module is replicated on each device,
and each replica handles a portion of the input. During the backwards
pass, gradients from each replica are summed into the original module.
The batch size should be larger than the number of GPUs used, and should be an
integer multiple of the number of GPUs so that each chunk is the same size and
each GPU processes the same number of samples.
See also: :ref:`cuda-nn-dataparallel-instead`
Arbitrary positional and keyword inputs are allowed to be passed into
DataParallel EXCEPT Tensors. All variables will be scattered on the specified
dim (default 0). Primitive types will be broadcast, but all other types will be
shallow copies and can be corrupted if written to in the model's forward pass.
Args:
module: module to be parallelized
device_ids: CUDA devices (default: all devices)
output_device: device location of output (default: device_ids[0])
Example::
>>> net = torch.nn.DataParallel(model, device_ids=[0, 1, 2])
>>> output = net(input_var)
"""
# TODO: update notes/cuda.rst when this class handles 8+ GPUs well
def __init__(self, module, device_ids=None, output_device=None, dim=0, chunk_sizes=None):
super(DataParallel, self).__init__()
if not torch.cuda.is_available():
self.module = module
self.device_ids = []
return
if device_ids is None:
device_ids = list(range(torch.cuda.device_count()))
if output_device is None:
output_device = device_ids[0]
self.dim = dim
self.module = module
self.device_ids = device_ids
self.chunk_sizes = chunk_sizes
self.output_device = output_device
if len(self.device_ids) == 1:
self.module.cuda(device_ids[0])
def forward(self, *inputs, **kwargs):
if not self.device_ids:
return self.module(*inputs, **kwargs)
inputs, kwargs = self.scatter(inputs, kwargs, self.device_ids, self.chunk_sizes)
if len(self.device_ids) == 1:
return self.module(*inputs[0], **kwargs[0])
replicas = self.replicate(self.module, self.device_ids[:len(inputs)])
outputs = self.parallel_apply(replicas, inputs, kwargs)
return self.gather(outputs, self.output_device)
def replicate(self, module, device_ids):
return replicate(module, device_ids)
def scatter(self, inputs, kwargs, device_ids, chunk_sizes):
return scatter_kwargs(inputs, kwargs, device_ids, dim=self.dim, chunk_sizes=self.chunk_sizes)
def parallel_apply(self, replicas, inputs, kwargs):
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
def gather(self, outputs, output_device):
return gather(outputs, output_device, dim=self.dim)
def data_parallel(module, inputs, device_ids=None, output_device=None, dim=0, module_kwargs=None):
r"""Evaluates module(input) in parallel across the GPUs given in device_ids.
This is the functional version of the DataParallel module.
Args:
module: the module to evaluate in parallel
inputs: inputs to the module
device_ids: GPU ids on which to replicate module
output_device: GPU location of the output. Use -1 to indicate the CPU.
(default: device_ids[0])
Returns:
a Variable containing the result of module(input) located on
output_device
"""
if not isinstance(inputs, tuple):
inputs = (inputs,)
if device_ids is None:
device_ids = list(range(torch.cuda.device_count()))
if output_device is None:
output_device = device_ids[0]
inputs, module_kwargs = scatter_kwargs(inputs, module_kwargs, device_ids, dim)
if len(device_ids) == 1:
return module(*inputs[0], **module_kwargs[0])
used_device_ids = device_ids[:len(inputs)]
replicas = replicate(module, used_device_ids)
outputs = parallel_apply(replicas, inputs, module_kwargs, used_device_ids)
return gather(outputs, output_device, dim)
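# A hypothetical usage sketch of the chunk_sizes extension (the names below are
# illustrative, not taken from the training scripts):
#   net = DataParallel(model, device_ids=[0, 1], chunk_sizes=[6, 2])
#   out = net(images)   # a batch of 8 is split into chunks of 6 and 2 along dim 0
# Plain torch.nn.DataParallel always splits the batch evenly; chunk_sizes lets one
# GPU take a larger share, which can be used to balance memory across GPUs.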

View File

@@ -1,231 +0,0 @@
import torch
import torch.nn as nn
from .utils import _tranpose_and_gather_feat
def _sigmoid(x):
return torch.clamp(x.sigmoid_(), min=1e-4, max=1 - 1e-4)
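# Associative embedding loss: the "pull" term draws each object's top-left and
# bottom-right tags toward their mean, and the "push" term penalizes pairs of
# different objects whose tag means are less than 1 apart; `mask` selects the
# valid object slots in each batch entry.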
def _ae_loss(tag0, tag1, mask):
num = mask.sum(dim=1, keepdim=True).float()
tag0 = tag0.squeeze()
tag1 = tag1.squeeze()
tag_mean = (tag0 + tag1) / 2
tag0 = torch.pow(tag0 - tag_mean, 2) / (num + 1e-4)
tag0 = tag0[mask].sum()
tag1 = torch.pow(tag1 - tag_mean, 2) / (num + 1e-4)
tag1 = tag1[mask].sum()
pull = tag0 + tag1
mask = mask.unsqueeze(1) + mask.unsqueeze(2)
mask = mask.eq(2)
num = num.unsqueeze(2)
num2 = (num - 1) * num
dist = tag_mean.unsqueeze(1) - tag_mean.unsqueeze(2)
dist = 1 - torch.abs(dist)
dist = nn.functional.relu(dist, inplace=True)
dist = dist - 1 / (num + 1e-4)
dist = dist / (num2 + 1e-4)
dist = dist[mask]
push = dist.sum()
return pull, push
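# Smooth L1 loss on the sub-pixel corner offsets, summed over the valid corners
# selected by `mask` and normalized by their count.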
def _off_loss(off, gt_off, mask):
num = mask.float().sum()
mask = mask.unsqueeze(2).expand_as(gt_off)
off = off[mask]
gt_off = gt_off[mask]
off_loss = nn.functional.smooth_l1_loss(off, gt_off, reduction="sum")
off_loss = off_loss / (num + 1e-4)
return off_loss
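# Penalty-reduced focal loss from CornerNet with an extra validity mask:
# negatives are down-weighted by (1 - gt)^4 around each ground-truth corner, and
# `mask` zeroes out locations belonging to objects that are too small or only
# partially inside the crop. `_focal_loss` below is the unmasked variant.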
def _focal_loss_mask(preds, gt, mask):
pos_inds = gt.eq(1)
neg_inds = gt.lt(1)
neg_weights = torch.pow(1 - gt[neg_inds], 4)
pos_mask = mask[pos_inds]
neg_mask = mask[neg_inds]
loss = 0
for pred in preds:
pos_pred = pred[pos_inds]
neg_pred = pred[neg_inds]
pos_loss = torch.log(pos_pred) * torch.pow(1 - pos_pred, 2) * pos_mask
neg_loss = torch.log(1 - neg_pred) * torch.pow(neg_pred, 2) * neg_weights * neg_mask
num_pos = pos_inds.float().sum()
pos_loss = pos_loss.sum()
neg_loss = neg_loss.sum()
if pos_pred.nelement() == 0:
loss = loss - neg_loss
else:
loss = loss - (pos_loss + neg_loss) / num_pos
return loss
def _focal_loss(preds, gt):
pos_inds = gt.eq(1)
neg_inds = gt.lt(1)
neg_weights = torch.pow(1 - gt[neg_inds], 4)
loss = 0
for pred in preds:
pos_pred = pred[pos_inds]
neg_pred = pred[neg_inds]
pos_loss = torch.log(pos_pred) * torch.pow(1 - pos_pred, 2)
neg_loss = torch.log(1 - neg_pred) * torch.pow(neg_pred, 2) * neg_weights
num_pos = pos_inds.float().sum()
pos_loss = pos_loss.sum()
neg_loss = neg_loss.sum()
if pos_pred.nelement() == 0:
loss = loss - neg_loss
else:
loss = loss - (pos_loss + neg_loss) / num_pos
return loss
class CornerNet_Saccade_Loss(nn.Module):
def __init__(self, pull_weight=1, push_weight=1, off_weight=1, focal_loss=_focal_loss_mask):
super(CornerNet_Saccade_Loss, self).__init__()
self.pull_weight = pull_weight
self.push_weight = push_weight
self.off_weight = off_weight
self.focal_loss = focal_loss
self.ae_loss = _ae_loss
self.off_loss = _off_loss
def forward(self, outs, targets):
tl_heats = outs[0]
br_heats = outs[1]
tl_tags = outs[2]
br_tags = outs[3]
tl_offs = outs[4]
br_offs = outs[5]
atts = outs[6]
gt_tl_heat = targets[0]
gt_br_heat = targets[1]
gt_mask = targets[2]
gt_tl_off = targets[3]
gt_br_off = targets[4]
gt_tl_ind = targets[5]
gt_br_ind = targets[6]
gt_tl_valid = targets[7]
gt_br_valid = targets[8]
gt_atts = targets[9]
# focal loss
focal_loss = 0
tl_heats = [_sigmoid(t) for t in tl_heats]
br_heats = [_sigmoid(b) for b in br_heats]
focal_loss += self.focal_loss(tl_heats, gt_tl_heat, gt_tl_valid)
focal_loss += self.focal_loss(br_heats, gt_br_heat, gt_br_valid)
atts = [[_sigmoid(a) for a in att] for att in atts]
atts = [[att[ind] for att in atts] for ind in range(len(gt_atts))]
att_loss = 0
for att, gt_att in zip(atts, gt_atts):
att_loss += _focal_loss(att, gt_att) / max(len(att), 1)
# tag loss
pull_loss = 0
push_loss = 0
tl_tags = [_tranpose_and_gather_feat(tl_tag, gt_tl_ind) for tl_tag in tl_tags]
br_tags = [_tranpose_and_gather_feat(br_tag, gt_br_ind) for br_tag in br_tags]
for tl_tag, br_tag in zip(tl_tags, br_tags):
pull, push = self.ae_loss(tl_tag, br_tag, gt_mask)
pull_loss += pull
push_loss += push
pull_loss = self.pull_weight * pull_loss
push_loss = self.push_weight * push_loss
off_loss = 0
tl_offs = [_tranpose_and_gather_feat(tl_off, gt_tl_ind) for tl_off in tl_offs]
br_offs = [_tranpose_and_gather_feat(br_off, gt_br_ind) for br_off in br_offs]
for tl_off, br_off in zip(tl_offs, br_offs):
off_loss += self.off_loss(tl_off, gt_tl_off, gt_mask)
off_loss += self.off_loss(br_off, gt_br_off, gt_mask)
off_loss = self.off_weight * off_loss
loss = (focal_loss + att_loss + pull_loss + push_loss + off_loss) / max(len(tl_heats), 1)
return loss.unsqueeze(0)
class CornerNet_Loss(nn.Module):
def __init__(self, pull_weight=1, push_weight=1, off_weight=1, focal_loss=_focal_loss):
super(CornerNet_Loss, self).__init__()
self.pull_weight = pull_weight
self.push_weight = push_weight
self.off_weight = off_weight
self.focal_loss = focal_loss
self.ae_loss = _ae_loss
self.off_loss = _off_loss
def forward(self, outs, targets):
tl_heats = outs[0]
br_heats = outs[1]
tl_tags = outs[2]
br_tags = outs[3]
tl_offs = outs[4]
br_offs = outs[5]
gt_tl_heat = targets[0]
gt_br_heat = targets[1]
gt_mask = targets[2]
gt_tl_off = targets[3]
gt_br_off = targets[4]
gt_tl_ind = targets[5]
gt_br_ind = targets[6]
# focal loss
focal_loss = 0
tl_heats = [_sigmoid(t) for t in tl_heats]
br_heats = [_sigmoid(b) for b in br_heats]
focal_loss += self.focal_loss(tl_heats, gt_tl_heat)
focal_loss += self.focal_loss(br_heats, gt_br_heat)
# tag loss
pull_loss = 0
push_loss = 0
tl_tags = [_tranpose_and_gather_feat(tl_tag, gt_tl_ind) for tl_tag in tl_tags]
br_tags = [_tranpose_and_gather_feat(br_tag, gt_br_ind) for br_tag in br_tags]
for tl_tag, br_tag in zip(tl_tags, br_tags):
pull, push = self.ae_loss(tl_tag, br_tag, gt_mask)
pull_loss += pull
push_loss += push
pull_loss = self.pull_weight * pull_loss
push_loss = self.push_weight * push_loss
off_loss = 0
tl_offs = [_tranpose_and_gather_feat(tl_off, gt_tl_ind) for tl_off in tl_offs]
br_offs = [_tranpose_and_gather_feat(br_off, gt_br_ind) for br_off in br_offs]
for tl_off, br_off in zip(tl_offs, br_offs):
off_loss += self.off_loss(tl_off, gt_tl_off, gt_mask)
off_loss += self.off_loss(br_off, gt_br_off, gt_mask)
off_loss = self.off_weight * off_loss
loss = (focal_loss + pull_loss + push_loss + off_loss) / max(len(tl_heats), 1)
return loss.unsqueeze(0)

View File

@@ -1,303 +0,0 @@
import torch
import torch.nn as nn
from .utils import residual, upsample, merge, _decode
def _make_layer(inp_dim, out_dim, modules):
layers = [residual(inp_dim, out_dim)]
layers += [residual(out_dim, out_dim) for _ in range(1, modules)]
return nn.Sequential(*layers)
def _make_layer_revr(inp_dim, out_dim, modules):
layers = [residual(inp_dim, inp_dim) for _ in range(modules - 1)]
layers += [residual(inp_dim, out_dim)]
return nn.Sequential(*layers)
def _make_pool_layer(dim):
return nn.MaxPool2d(kernel_size=2, stride=2)
def _make_unpool_layer(dim):
return upsample(scale_factor=2)
def _make_merge_layer(dim):
return merge()
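# One recursive hourglass stage: `up1` is the skip branch kept at the current
# resolution, `max1`/`low1` downsample, `low2` recurses to the next scale (or is a
# plain residual stack at the innermost level), `low3`/`up2` bring the features
# back up, and `merg` adds the two branches.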
class hg_module(nn.Module):
def __init__(
self, n, dims, modules, make_up_layer=_make_layer,
make_pool_layer=_make_pool_layer, make_hg_layer=_make_layer,
make_low_layer=_make_layer, make_hg_layer_revr=_make_layer_revr,
make_unpool_layer=_make_unpool_layer, make_merge_layer=_make_merge_layer
):
super(hg_module, self).__init__()
curr_mod = modules[0]
next_mod = modules[1]
curr_dim = dims[0]
next_dim = dims[1]
self.n = n
self.up1 = make_up_layer(curr_dim, curr_dim, curr_mod)
self.max1 = make_pool_layer(curr_dim)
self.low1 = make_hg_layer(curr_dim, next_dim, curr_mod)
self.low2 = hg_module(
n - 1, dims[1:], modules[1:],
make_up_layer=make_up_layer,
make_pool_layer=make_pool_layer,
make_hg_layer=make_hg_layer,
make_low_layer=make_low_layer,
make_hg_layer_revr=make_hg_layer_revr,
make_unpool_layer=make_unpool_layer,
make_merge_layer=make_merge_layer
) if n > 1 else make_low_layer(next_dim, next_dim, next_mod)
self.low3 = make_hg_layer_revr(next_dim, curr_dim, curr_mod)
self.up2 = make_unpool_layer(curr_dim)
self.merg = make_merge_layer(curr_dim)
def forward(self, x):
up1 = self.up1(x)
max1 = self.max1(x)
low1 = self.low1(max1)
low2 = self.low2(low1)
low3 = self.low3(low2)
up2 = self.up2(low3)
merg = self.merg(up1, up2)
return merg
class hg(nn.Module):
def __init__(self, pre, hg_modules, cnvs, inters, cnvs_, inters_):
super(hg, self).__init__()
self.pre = pre
self.hgs = hg_modules
self.cnvs = cnvs
self.inters = inters
self.inters_ = inters_
self.cnvs_ = cnvs_
def forward(self, x):
inter = self.pre(x)
cnvs = []
for ind, (hg_, cnv_) in enumerate(zip(self.hgs, self.cnvs)):
hg = hg_(inter)
cnv = cnv_(hg)
cnvs.append(cnv)
if ind < len(self.hgs) - 1:
inter = self.inters_[ind](inter) + self.cnvs_[ind](cnv)
inter = nn.functional.relu_(inter)
inter = self.inters[ind](inter)
return cnvs
class hg_net(nn.Module):
def __init__(
self, hg, tl_modules, br_modules, tl_heats, br_heats,
tl_tags, br_tags, tl_offs, br_offs
):
super(hg_net, self).__init__()
self._decode = _decode
self.hg = hg
self.tl_modules = tl_modules
self.br_modules = br_modules
self.tl_heats = tl_heats
self.br_heats = br_heats
self.tl_tags = tl_tags
self.br_tags = br_tags
self.tl_offs = tl_offs
self.br_offs = br_offs
def _train(self, *xs):
image = xs[0]
cnvs = self.hg(image)
tl_modules = [tl_mod_(cnv) for tl_mod_, cnv in zip(self.tl_modules, cnvs)]
br_modules = [br_mod_(cnv) for br_mod_, cnv in zip(self.br_modules, cnvs)]
tl_heats = [tl_heat_(tl_mod) for tl_heat_, tl_mod in zip(self.tl_heats, tl_modules)]
br_heats = [br_heat_(br_mod) for br_heat_, br_mod in zip(self.br_heats, br_modules)]
tl_tags = [tl_tag_(tl_mod) for tl_tag_, tl_mod in zip(self.tl_tags, tl_modules)]
br_tags = [br_tag_(br_mod) for br_tag_, br_mod in zip(self.br_tags, br_modules)]
tl_offs = [tl_off_(tl_mod) for tl_off_, tl_mod in zip(self.tl_offs, tl_modules)]
br_offs = [br_off_(br_mod) for br_off_, br_mod in zip(self.br_offs, br_modules)]
return [tl_heats, br_heats, tl_tags, br_tags, tl_offs, br_offs]
def _test(self, *xs, **kwargs):
image = xs[0]
cnvs = self.hg(image)
tl_mod = self.tl_modules[-1](cnvs[-1])
br_mod = self.br_modules[-1](cnvs[-1])
tl_heat, br_heat = self.tl_heats[-1](tl_mod), self.br_heats[-1](br_mod)
tl_tag, br_tag = self.tl_tags[-1](tl_mod), self.br_tags[-1](br_mod)
tl_off, br_off = self.tl_offs[-1](tl_mod), self.br_offs[-1](br_mod)
outs = [tl_heat, br_heat, tl_tag, br_tag, tl_off, br_off]
return self._decode(*outs, **kwargs), tl_heat, br_heat, tl_tag, br_tag
def forward(self, *xs, test=False, **kwargs):
if not test:
return self._train(*xs, **kwargs)
return self._test(*xs, **kwargs)
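# Same structure as hg_module, except that forward() also collects the merged
# feature map produced at every scale (`mergs`); saccade_net later feeds these to
# its attention modules.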
class saccade_module(nn.Module):
def __init__(
self, n, dims, modules, make_up_layer=_make_layer,
make_pool_layer=_make_pool_layer, make_hg_layer=_make_layer,
make_low_layer=_make_layer, make_hg_layer_revr=_make_layer_revr,
make_unpool_layer=_make_unpool_layer, make_merge_layer=_make_merge_layer
):
super(saccade_module, self).__init__()
curr_mod = modules[0]
next_mod = modules[1]
curr_dim = dims[0]
next_dim = dims[1]
self.n = n
self.up1 = make_up_layer(curr_dim, curr_dim, curr_mod)
self.max1 = make_pool_layer(curr_dim)
self.low1 = make_hg_layer(curr_dim, next_dim, curr_mod)
self.low2 = saccade_module(
n - 1, dims[1:], modules[1:],
make_up_layer=make_up_layer,
make_pool_layer=make_pool_layer,
make_hg_layer=make_hg_layer,
make_low_layer=make_low_layer,
make_hg_layer_revr=make_hg_layer_revr,
make_unpool_layer=make_unpool_layer,
make_merge_layer=make_merge_layer
) if n > 1 else make_low_layer(next_dim, next_dim, next_mod)
self.low3 = make_hg_layer_revr(next_dim, curr_dim, curr_mod)
self.up2 = make_unpool_layer(curr_dim)
self.merg = make_merge_layer(curr_dim)
def forward(self, x):
up1 = self.up1(x)
max1 = self.max1(x)
low1 = self.low1(max1)
if self.n > 1:
low2, mergs = self.low2(low1)
else:
low2, mergs = self.low2(low1), []
low3 = self.low3(low2)
up2 = self.up2(low3)
merg = self.merg(up1, up2)
mergs.append(merg)
return merg, mergs
class saccade(nn.Module):
def __init__(self, pre, hg_modules, cnvs, inters, cnvs_, inters_):
super(saccade, self).__init__()
self.pre = pre
self.hgs = hg_modules
self.cnvs = cnvs
self.inters = inters
self.inters_ = inters_
self.cnvs_ = cnvs_
def forward(self, x):
inter = self.pre(x)
cnvs = []
atts = []
for ind, (hg_, cnv_) in enumerate(zip(self.hgs, self.cnvs)):
hg, ups = hg_(inter)
cnv = cnv_(hg)
cnvs.append(cnv)
atts.append(ups)
if ind < len(self.hgs) - 1:
inter = self.inters_[ind](inter) + self.cnvs_[ind](cnv)
inter = nn.functional.relu_(inter)
inter = self.inters[ind](inter)
return cnvs, atts
class saccade_net(nn.Module):
def __init__(
self, hg, tl_modules, br_modules, tl_heats, br_heats,
tl_tags, br_tags, tl_offs, br_offs, att_modules, up_start=0
):
super(saccade_net, self).__init__()
self._decode = _decode
self.hg = hg
self.tl_modules = tl_modules
self.br_modules = br_modules
self.tl_heats = tl_heats
self.br_heats = br_heats
self.tl_tags = tl_tags
self.br_tags = br_tags
self.tl_offs = tl_offs
self.br_offs = br_offs
self.att_modules = att_modules
self.up_start = up_start
def _train(self, *xs):
image = xs[0]
cnvs, ups = self.hg(image)
ups = [up[self.up_start:] for up in ups]
tl_modules = [tl_mod_(cnv) for tl_mod_, cnv in zip(self.tl_modules, cnvs)]
br_modules = [br_mod_(cnv) for br_mod_, cnv in zip(self.br_modules, cnvs)]
tl_heats = [tl_heat_(tl_mod) for tl_heat_, tl_mod in zip(self.tl_heats, tl_modules)]
br_heats = [br_heat_(br_mod) for br_heat_, br_mod in zip(self.br_heats, br_modules)]
tl_tags = [tl_tag_(tl_mod) for tl_tag_, tl_mod in zip(self.tl_tags, tl_modules)]
br_tags = [br_tag_(br_mod) for br_tag_, br_mod in zip(self.br_tags, br_modules)]
tl_offs = [tl_off_(tl_mod) for tl_off_, tl_mod in zip(self.tl_offs, tl_modules)]
br_offs = [br_off_(br_mod) for br_off_, br_mod in zip(self.br_offs, br_modules)]
atts = [[att_mod_(u) for att_mod_, u in zip(att_mods, up)] for att_mods, up in zip(self.att_modules, ups)]
return [tl_heats, br_heats, tl_tags, br_tags, tl_offs, br_offs, atts]
def _test(self, *xs, no_att=False, **kwargs):
image = xs[0]
cnvs, ups = self.hg(image)
ups = [up[self.up_start:] for up in ups]
if not no_att:
atts = [att_mod_(up) for att_mod_, up in zip(self.att_modules[-1], ups[-1])]
atts = [torch.sigmoid(att) for att in atts]
tl_mod = self.tl_modules[-1](cnvs[-1])
br_mod = self.br_modules[-1](cnvs[-1])
tl_heat, br_heat = self.tl_heats[-1](tl_mod), self.br_heats[-1](br_mod)
tl_tag, br_tag = self.tl_tags[-1](tl_mod), self.br_tags[-1](br_mod)
tl_off, br_off = self.tl_offs[-1](tl_mod), self.br_offs[-1](br_mod)
outs = [tl_heat, br_heat, tl_tag, br_tag, tl_off, br_off]
if not no_att:
return self._decode(*outs, **kwargs), atts
else:
return self._decode(*outs, **kwargs)
def forward(self, *xs, test=False, **kwargs):
if not test:
return self._train(*xs, **kwargs)
return self._test(*xs, **kwargs)

View File

@@ -1,39 +0,0 @@
import torch
from torch.autograd import Variable
from torch.nn.parallel._functions import Scatter
def scatter(inputs, target_gpus, dim=0, chunk_sizes=None):
r"""
Slices variables into approximately equal chunks (or into the sizes given by
``chunk_sizes``) and distributes them across the given GPUs. Duplicates
references to objects that are not variables. Does not support Tensors.
"""
def scatter_map(obj):
if isinstance(obj, Variable):
return Scatter.apply(target_gpus, chunk_sizes, dim, obj)
assert not torch.is_tensor(obj), "Tensors not supported in scatter."
if isinstance(obj, tuple):
return list(zip(*map(scatter_map, obj)))
if isinstance(obj, list):
return list(map(list, zip(*map(scatter_map, obj))))
if isinstance(obj, dict):
return list(map(type(obj), zip(*map(scatter_map, obj.items()))))
return [obj for targets in target_gpus]
return scatter_map(inputs)
def scatter_kwargs(inputs, kwargs, target_gpus, dim=0, chunk_sizes=None):
r"""Scatter with support for kwargs dictionary"""
inputs = scatter(inputs, target_gpus, dim, chunk_sizes) if inputs else []
kwargs = scatter(kwargs, target_gpus, dim, chunk_sizes) if kwargs else []
if len(inputs) < len(kwargs):
inputs.extend([() for _ in range(len(kwargs) - len(inputs))])
elif len(kwargs) < len(inputs):
kwargs.extend([{} for _ in range(len(inputs) - len(kwargs))])
inputs = tuple(inputs)
kwargs = tuple(kwargs)
return inputs, kwargs

View File

@@ -1,236 +0,0 @@
import torch
import torch.nn as nn
def _gather_feat(feat, ind, mask=None):
dim = feat.size(2)
ind = ind.unsqueeze(2).expand(ind.size(0), ind.size(1), dim)
feat = feat.gather(1, ind)
if mask is not None:
mask = mask.unsqueeze(2).expand_as(feat)
feat = feat[mask]
feat = feat.view(-1, dim)
return feat
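# Max-pool based NMS on a heatmap: keep a score only where it is the local
# maximum inside the kernel window.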
def _nms(heat, kernel=1):
pad = (kernel - 1) // 2
hmax = nn.functional.max_pool2d(heat, (kernel, kernel), stride=1, padding=pad)
keep = (hmax == heat).float()
return heat * keep
def _tranpose_and_gather_feat(feat, ind):
feat = feat.permute(0, 2, 3, 1).contiguous()
feat = feat.view(feat.size(0), -1, feat.size(3))
feat = _gather_feat(feat, ind)
return feat
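# Pick the top-K scoring locations over the flattened (class, y, x) heatmap and
# decode the flat indices back into class ids and coordinates.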
def _topk(scores, K=20):
batch, cat, height, width = scores.size()
topk_scores, topk_inds = torch.topk(scores.view(batch, -1), K)
topk_clses = (topk_inds / (height * width)).int()
topk_inds = topk_inds % (height * width)
topk_ys = (topk_inds / width).int().float()
topk_xs = (topk_inds % width).int().float()
return topk_scores, topk_inds, topk_clses, topk_ys, topk_xs
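# Form all K x K top-left / bottom-right corner pairs, reject pairs with
# mismatched classes, embedding distance above ae_threshold, or inverted
# geometry, and keep the num_dets highest-scoring ones. Each detection is laid
# out as [x1, y1, x2, y2, score, tl_score, br_score, class].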
def _decode(
tl_heat, br_heat, tl_tag, br_tag, tl_regr, br_regr,
K=100, kernel=1, ae_threshold=1, num_dets=1000, no_border=False
):
batch, cat, height, width = tl_heat.size()
tl_heat = torch.sigmoid(tl_heat)
br_heat = torch.sigmoid(br_heat)
# perform nms on heatmaps
tl_heat = _nms(tl_heat, kernel=kernel)
br_heat = _nms(br_heat, kernel=kernel)
tl_scores, tl_inds, tl_clses, tl_ys, tl_xs = _topk(tl_heat, K=K)
br_scores, br_inds, br_clses, br_ys, br_xs = _topk(br_heat, K=K)
tl_ys = tl_ys.view(batch, K, 1).expand(batch, K, K)
tl_xs = tl_xs.view(batch, K, 1).expand(batch, K, K)
br_ys = br_ys.view(batch, 1, K).expand(batch, K, K)
br_xs = br_xs.view(batch, 1, K).expand(batch, K, K)
if no_border:
tl_ys_binds = (tl_ys == 0)
tl_xs_binds = (tl_xs == 0)
br_ys_binds = (br_ys == height - 1)
br_xs_binds = (br_xs == width - 1)
if tl_regr is not None and br_regr is not None:
tl_regr = _tranpose_and_gather_feat(tl_regr, tl_inds)
tl_regr = tl_regr.view(batch, K, 1, 2)
br_regr = _tranpose_and_gather_feat(br_regr, br_inds)
br_regr = br_regr.view(batch, 1, K, 2)
tl_xs = tl_xs + tl_regr[..., 0]
tl_ys = tl_ys + tl_regr[..., 1]
br_xs = br_xs + br_regr[..., 0]
br_ys = br_ys + br_regr[..., 1]
# all possible boxes based on top k corners (ignoring class)
bboxes = torch.stack((tl_xs, tl_ys, br_xs, br_ys), dim=3)
tl_tag = _tranpose_and_gather_feat(tl_tag, tl_inds)
tl_tag = tl_tag.view(batch, K, 1)
br_tag = _tranpose_and_gather_feat(br_tag, br_inds)
br_tag = br_tag.view(batch, 1, K)
dists = torch.abs(tl_tag - br_tag)
tl_scores = tl_scores.view(batch, K, 1).expand(batch, K, K)
br_scores = br_scores.view(batch, 1, K).expand(batch, K, K)
scores = (tl_scores + br_scores) / 2
# reject boxes based on classes
tl_clses = tl_clses.view(batch, K, 1).expand(batch, K, K)
br_clses = br_clses.view(batch, 1, K).expand(batch, K, K)
cls_inds = (tl_clses != br_clses)
# reject boxes based on distances
dist_inds = (dists > ae_threshold)
# reject boxes based on widths and heights
width_inds = (br_xs < tl_xs)
height_inds = (br_ys < tl_ys)
if no_border:
scores[tl_ys_binds] = -1
scores[tl_xs_binds] = -1
scores[br_ys_binds] = -1
scores[br_xs_binds] = -1
scores[cls_inds] = -1
scores[dist_inds] = -1
scores[width_inds] = -1
scores[height_inds] = -1
scores = scores.view(batch, -1)
scores, inds = torch.topk(scores, num_dets)
scores = scores.unsqueeze(2)
bboxes = bboxes.view(batch, -1, 4)
bboxes = _gather_feat(bboxes, inds)
clses = tl_clses.contiguous().view(batch, -1, 1)
clses = _gather_feat(clses, inds).float()
tl_scores = tl_scores.contiguous().view(batch, -1, 1)
tl_scores = _gather_feat(tl_scores, inds).float()
br_scores = br_scores.contiguous().view(batch, -1, 1)
br_scores = _gather_feat(br_scores, inds).float()
detections = torch.cat([bboxes, scores, tl_scores, br_scores, clses], dim=2)
return detections
class upsample(nn.Module):
def __init__(self, scale_factor):
super(upsample, self).__init__()
self.scale_factor = scale_factor
def forward(self, x):
return nn.functional.interpolate(x, scale_factor=self.scale_factor)
class merge(nn.Module):
def forward(self, x, y):
return x + y
class convolution(nn.Module):
def __init__(self, k, inp_dim, out_dim, stride=1, with_bn=True):
super(convolution, self).__init__()
pad = (k - 1) // 2
self.conv = nn.Conv2d(inp_dim, out_dim, (k, k), padding=(pad, pad), stride=(stride, stride), bias=not with_bn)
self.bn = nn.BatchNorm2d(out_dim) if with_bn else nn.Sequential()
self.relu = nn.ReLU(inplace=True)
def forward(self, x):
conv = self.conv(x)
bn = self.bn(conv)
relu = self.relu(bn)
return relu
class residual(nn.Module):
def __init__(self, inp_dim, out_dim, k=3, stride=1):
super(residual, self).__init__()
p = (k - 1) // 2
self.conv1 = nn.Conv2d(inp_dim, out_dim, (k, k), padding=(p, p), stride=(stride, stride), bias=False)
self.bn1 = nn.BatchNorm2d(out_dim)
self.relu1 = nn.ReLU(inplace=True)
self.conv2 = nn.Conv2d(out_dim, out_dim, (k, k), padding=(p, p), bias=False)
self.bn2 = nn.BatchNorm2d(out_dim)
self.skip = nn.Sequential(
nn.Conv2d(inp_dim, out_dim, (1, 1), stride=(stride, stride), bias=False),
nn.BatchNorm2d(out_dim)
) if stride != 1 or inp_dim != out_dim else nn.Sequential()
self.relu = nn.ReLU(inplace=True)
def forward(self, x):
conv1 = self.conv1(x)
bn1 = self.bn1(conv1)
relu1 = self.relu1(bn1)
conv2 = self.conv2(relu1)
bn2 = self.bn2(conv2)
skip = self.skip(x)
return self.relu(bn2 + skip)
class corner_pool(nn.Module):
def __init__(self, dim, pool1, pool2):
super(corner_pool, self).__init__()
self._init_layers(dim, pool1, pool2)
def _init_layers(self, dim, pool1, pool2):
self.p1_conv1 = convolution(3, dim, 128)
self.p2_conv1 = convolution(3, dim, 128)
self.p_conv1 = nn.Conv2d(128, dim, (3, 3), padding=(1, 1), bias=False)
self.p_bn1 = nn.BatchNorm2d(dim)
self.conv1 = nn.Conv2d(dim, dim, (1, 1), bias=False)
self.bn1 = nn.BatchNorm2d(dim)
self.relu1 = nn.ReLU(inplace=True)
self.conv2 = convolution(3, dim, dim)
self.pool1 = pool1()
self.pool2 = pool2()
def forward(self, x):
# pool 1
p1_conv1 = self.p1_conv1(x)
pool1 = self.pool1(p1_conv1)
# pool 2
p2_conv1 = self.p2_conv1(x)
pool2 = self.pool2(p2_conv1)
# pool 1 + pool 2
p_conv1 = self.p_conv1(pool1 + pool2)
p_bn1 = self.p_bn1(p_conv1)
conv1 = self.conv1(x)
bn1 = self.bn1(conv1)
relu1 = self.relu1(p_bn1 + bn1)
conv2 = self.conv2(relu1)
return conv2

View File

@@ -1,137 +0,0 @@
import torch
import torch.nn as nn
from ..models.py_utils.data_parallel import DataParallel
torch.manual_seed(317)
class Network(nn.Module):
def __init__(self, model, loss):
super(Network, self).__init__()
self.model = model
self.loss = loss
def forward(self, xs, ys, **kwargs):
preds = self.model(*xs, **kwargs)
loss = self.loss(preds, ys, **kwargs)
return loss
# For backward compatibility: the model was previously wrapped in a DataParallel
# module, so saved checkpoints carry a `module.` prefix that DummyModule reproduces.
class DummyModule(nn.Module):
def __init__(self, model):
super(DummyModule, self).__init__()
self.module = model
def forward(self, *xs, **kwargs):
return self.module(*xs, **kwargs)
class NetworkFactory(object):
def __init__(self, system_config, model, distributed=False, gpu=None):
super(NetworkFactory, self).__init__()
self.system_config = system_config
self.gpu = gpu
self.model = DummyModule(model)
self.loss = model.loss
self.network = Network(self.model, self.loss)
if distributed:
from apex.parallel import DistributedDataParallel, convert_syncbn_model
torch.cuda.set_device(gpu)
self.network = self.network.cuda(gpu)
self.network = convert_syncbn_model(self.network)
self.network = DistributedDataParallel(self.network)
else:
self.network = DataParallel(self.network, chunk_sizes=system_config.chunk_sizes)
total_params = 0
for params in self.model.parameters():
num_params = 1
for x in params.size():
num_params *= x
total_params += num_params
print("total parameters: {}".format(total_params))
if system_config.opt_algo == "adam":
self.optimizer = torch.optim.Adam(
filter(lambda p: p.requires_grad, self.model.parameters())
)
elif system_config.opt_algo == "sgd":
self.optimizer = torch.optim.SGD(
filter(lambda p: p.requires_grad, self.model.parameters()),
lr=system_config.learning_rate,
momentum=0.9, weight_decay=0.0001
)
else:
raise ValueError("unknown optimizer")
def cuda(self):
self.model.cuda()
def train_mode(self):
self.network.train()
def eval_mode(self):
self.network.eval()
def _t_cuda(self, xs):
if type(xs) is list:
return [x.cuda(self.gpu, non_blocking=True) for x in xs]
return xs.cuda(self.gpu, non_blocking=True)
def train(self, xs, ys, **kwargs):
xs = [self._t_cuda(x) for x in xs]
ys = [self._t_cuda(y) for y in ys]
self.optimizer.zero_grad()
loss = self.network(xs, ys)
loss = loss.mean()
loss.backward()
self.optimizer.step()
return loss
def validate(self, xs, ys, **kwargs):
with torch.no_grad():
xs = [self._t_cuda(x) for x in xs]
ys = [self._t_cuda(y) for y in ys]
loss = self.network(xs, ys)
loss = loss.mean()
return loss
def test(self, xs, **kwargs):
with torch.no_grad():
xs = [self._t_cuda(x) for x in xs]
return self.model(*xs, **kwargs)
def set_lr(self, lr):
print("setting learning rate to: {}".format(lr))
for param_group in self.optimizer.param_groups:
param_group["lr"] = lr
def load_pretrained_params(self, pretrained_model):
print("loading from {}".format(pretrained_model))
with open(pretrained_model, "rb") as f:
params = torch.load(f, weights_only=False)
self.model.load_state_dict(params)
def load_params(self, iteration):
cache_file = self.system_config.snapshot_file.format(iteration)
print("loading model from {}".format(cache_file))
with open(cache_file, "rb") as f:
params = torch.load(f)
self.model.load_state_dict(params)
def save_params(self, iteration):
cache_file = self.system_config.snapshot_file.format(iteration)
print("saving model to {}".format(cache_file))
with open(cache_file, "wb") as f:
params = self.model.state_dict()
torch.save(params, f)

View File

@@ -1,8 +0,0 @@
import pkg_resources
_package_name = __name__
def get_file_path(*paths):
path = "/".join(paths)
return pkg_resources.resource_filename(_package_name, path)

View File

@@ -1,5 +0,0 @@
from .cornernet import cornernet
from .cornernet_saccade import cornernet_saccade
def data_sampling_func(sys_configs, db, k_ind, data_aug=True, debug=False):
return globals()[sys_configs.sampling_function](sys_configs, db, k_ind, data_aug, debug)

View File

@@ -1,164 +0,0 @@
import math
import cv2
import numpy as np
import torch
from .utils import random_crop, draw_gaussian, gaussian_radius, normalize_, color_jittering_, lighting_
def _resize_image(image, detections, size):
detections = detections.copy()
height, width = image.shape[0:2]
new_height, new_width = size
image = cv2.resize(image, (new_width, new_height))
height_ratio = new_height / height
width_ratio = new_width / width
detections[:, 0:4:2] *= width_ratio
detections[:, 1:4:2] *= height_ratio
return image, detections
def _clip_detections(image, detections):
detections = detections.copy()
height, width = image.shape[0:2]
detections[:, 0:4:2] = np.clip(detections[:, 0:4:2], 0, width - 1)
detections[:, 1:4:2] = np.clip(detections[:, 1:4:2], 0, height - 1)
keep_inds = ((detections[:, 2] - detections[:, 0]) > 0) & \
((detections[:, 3] - detections[:, 1]) > 0)
detections = detections[keep_inds]
return detections
def cornernet(system_configs, db, k_ind, data_aug, debug):
data_rng = system_configs.data_rng
batch_size = system_configs.batch_size
categories = db.configs["categories"]
input_size = db.configs["input_size"]
output_size = db.configs["output_sizes"][0]
border = db.configs["border"]
lighting = db.configs["lighting"]
rand_crop = db.configs["rand_crop"]
rand_color = db.configs["rand_color"]
rand_scales = db.configs["rand_scales"]
gaussian_bump = db.configs["gaussian_bump"]
gaussian_iou = db.configs["gaussian_iou"]
gaussian_rad = db.configs["gaussian_radius"]
max_tag_len = 128
# allocating memory
images = np.zeros((batch_size, 3, input_size[0], input_size[1]), dtype=np.float32)
tl_heatmaps = np.zeros((batch_size, categories, output_size[0], output_size[1]), dtype=np.float32)
br_heatmaps = np.zeros((batch_size, categories, output_size[0], output_size[1]), dtype=np.float32)
tl_regrs = np.zeros((batch_size, max_tag_len, 2), dtype=np.float32)
br_regrs = np.zeros((batch_size, max_tag_len, 2), dtype=np.float32)
tl_tags = np.zeros((batch_size, max_tag_len), dtype=np.int64)
br_tags = np.zeros((batch_size, max_tag_len), dtype=np.int64)
tag_masks = np.zeros((batch_size, max_tag_len), dtype=np.uint8)
tag_lens = np.zeros((batch_size,), dtype=np.int32)
db_size = db.db_inds.size
for b_ind in range(batch_size):
if not debug and k_ind == 0:
db.shuffle_inds()
db_ind = db.db_inds[k_ind]
k_ind = (k_ind + 1) % db_size
# reading image
image_path = db.image_path(db_ind)
image = cv2.imread(image_path)
# reading detections
detections = db.detections(db_ind)
# cropping an image randomly
if not debug and rand_crop:
image, detections = random_crop(image, detections, rand_scales, input_size, border=border)
image, detections = _resize_image(image, detections, input_size)
detections = _clip_detections(image, detections)
width_ratio = output_size[1] / input_size[1]
height_ratio = output_size[0] / input_size[0]
# flipping an image randomly
if not debug and np.random.uniform() > 0.5:
image[:] = image[:, ::-1, :]
width = image.shape[1]
detections[:, [0, 2]] = width - detections[:, [2, 0]] - 1
if not debug:
image = image.astype(np.float32) / 255.
if rand_color:
color_jittering_(data_rng, image)
if lighting:
lighting_(data_rng, image, 0.1, db.eig_val, db.eig_vec)
normalize_(image, db.mean, db.std)
images[b_ind] = image.transpose((2, 0, 1))
for ind, detection in enumerate(detections):
category = int(detection[-1]) - 1
xtl, ytl = detection[0], detection[1]
xbr, ybr = detection[2], detection[3]
fxtl = (xtl * width_ratio)
fytl = (ytl * height_ratio)
fxbr = (xbr * width_ratio)
fybr = (ybr * height_ratio)
xtl = int(fxtl)
ytl = int(fytl)
xbr = int(fxbr)
ybr = int(fybr)
if gaussian_bump:
width = detection[2] - detection[0]
height = detection[3] - detection[1]
width = math.ceil(width * width_ratio)
height = math.ceil(height * height_ratio)
if gaussian_rad == -1:
radius = gaussian_radius((height, width), gaussian_iou)
radius = max(0, int(radius))
else:
radius = gaussian_rad
draw_gaussian(tl_heatmaps[b_ind, category], [xtl, ytl], radius)
draw_gaussian(br_heatmaps[b_ind, category], [xbr, ybr], radius)
else:
tl_heatmaps[b_ind, category, ytl, xtl] = 1
br_heatmaps[b_ind, category, ybr, xbr] = 1
tag_ind = tag_lens[b_ind]
tl_regrs[b_ind, tag_ind, :] = [fxtl - xtl, fytl - ytl]
br_regrs[b_ind, tag_ind, :] = [fxbr - xbr, fybr - ybr]
tl_tags[b_ind, tag_ind] = ytl * output_size[1] + xtl
br_tags[b_ind, tag_ind] = ybr * output_size[1] + xbr
tag_lens[b_ind] += 1
for b_ind in range(batch_size):
tag_len = tag_lens[b_ind]
tag_masks[b_ind, :tag_len] = 1
images = torch.from_numpy(images)
tl_heatmaps = torch.from_numpy(tl_heatmaps)
br_heatmaps = torch.from_numpy(br_heatmaps)
tl_regrs = torch.from_numpy(tl_regrs)
br_regrs = torch.from_numpy(br_regrs)
tl_tags = torch.from_numpy(tl_tags)
br_tags = torch.from_numpy(br_tags)
tag_masks = torch.from_numpy(tag_masks)
return {
"xs": [images],
"ys": [tl_heatmaps, br_heatmaps, tag_masks, tl_regrs, br_regrs, tl_tags, br_tags]
}, k_ind

View File

@@ -1,293 +0,0 @@
import math
import cv2
import numpy as np
import torch
from .utils import draw_gaussian, gaussian_radius, normalize_, color_jittering_, lighting_, crop_image
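# Despite the name, this returns the area ratio between the (cropped and clipped)
# boxes in `a_dets` and the corresponding original boxes in `b_dets`; it is used
# to check how much of each object remains visible after cropping.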
def bbox_overlaps(a_dets, b_dets):
a_widths = a_dets[:, 2] - a_dets[:, 0]
a_heights = a_dets[:, 3] - a_dets[:, 1]
a_areas = a_widths * a_heights
b_widths = b_dets[:, 2] - b_dets[:, 0]
b_heights = b_dets[:, 3] - b_dets[:, 1]
b_areas = b_widths * b_heights
return a_areas / b_areas
def clip_detections(border, detections):
detections = detections.copy()
y0, y1, x0, x1 = border
det_xs = detections[:, 0:4:2]
det_ys = detections[:, 1:4:2]
np.clip(det_xs, x0, x1 - 1, out=det_xs)
np.clip(det_ys, y0, y1 - 1, out=det_ys)
keep_inds = ((det_xs[:, 1] - det_xs[:, 0]) > 0) & \
((det_ys[:, 1] - det_ys[:, 0]) > 0)
keep_inds = np.where(keep_inds)[0]
return detections[keep_inds], keep_inds
def crop_image_dets(image, dets, ind, input_size, output_size=None, random_crop=True, rand_center=True):
if ind is not None:
det_x0, det_y0, det_x1, det_y1 = dets[ind, 0:4]
else:
det_x0, det_y0, det_x1, det_y1 = None, None, None, None
input_height, input_width = input_size
image_height, image_width = image.shape[0:2]
centered = rand_center and np.random.uniform() > 0.5
if not random_crop or image_width <= input_width:
xc = image_width // 2
elif ind is None or not centered:
xmin = max(det_x1 - input_width, 0) if ind is not None else 0
xmax = min(image_width - input_width, det_x0) if ind is not None else image_width - input_width
xrand = np.random.randint(int(xmin), int(xmax) + 1)
xc = xrand + input_width // 2
else:
xmin = max((det_x0 + det_x1) // 2 - np.random.randint(0, 15), 0)
xmax = min((det_x0 + det_x1) // 2 + np.random.randint(0, 15), image_width - 1)
xc = np.random.randint(int(xmin), int(xmax) + 1)
if not random_crop or image_height <= input_height:
yc = image_height // 2
elif ind is None or not centered:
ymin = max(det_y1 - input_height, 0) if ind is not None else 0
ymax = min(image_height - input_height, det_y0) if ind is not None else image_height - input_height
yrand = np.random.randint(int(ymin), int(ymax) + 1)
yc = yrand + input_height // 2
else:
ymin = max((det_y0 + det_y1) // 2 - np.random.randint(0, 15), 0)
ymax = min((det_y0 + det_y1) // 2 + np.random.randint(0, 15), image_height - 1)
yc = np.random.randint(int(ymin), int(ymax) + 1)
image, border, offset = crop_image(image, [yc, xc], input_size, output_size=output_size)
dets[:, 0:4:2] -= offset[1]
dets[:, 1:4:2] -= offset[0]
return image, dets, border
def scale_image_detections(image, dets, scale):
height, width = image.shape[0:2]
new_height = int(height * scale)
new_width = int(width * scale)
image = cv2.resize(image, (new_width, new_height))
dets = dets.copy()
dets[:, 0:4] *= scale
return image, dets
def ref_scale(detections, random_crop=False):
if detections.shape[0] == 0:
return None, None
if random_crop and np.random.uniform() > 0.7:
return None, None
ref_ind = np.random.randint(detections.shape[0])
ref_det = detections[ref_ind].copy()
ref_h = ref_det[3] - ref_det[1]
ref_w = ref_det[2] - ref_det[0]
ref_hw = max(ref_h, ref_w)
if ref_hw > 96:
return np.random.randint(low=96, high=255) / ref_hw, ref_ind
elif ref_hw > 32:
return np.random.randint(low=32, high=97) / ref_hw, ref_ind
return np.random.randint(low=16, high=33) / ref_hw, ref_ind
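# Mark each object's centre on the attention map whose size range covers the
# object's longer side; `ratio` maps image coordinates down to that attention
# map's resolution.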
def create_attention_mask(atts, ratios, sizes, detections):
for det in detections:
width = det[2] - det[0]
height = det[3] - det[1]
max_hw = max(width, height)
for att, ratio, size in zip(atts, ratios, sizes):
if max_hw >= size[0] and max_hw <= size[1]:
x = (det[0] + det[2]) / 2
y = (det[1] + det[3]) / 2
x = (x / ratio).astype(np.int32)
y = (y / ratio).astype(np.int32)
att[y, x] = 1
def cornernet_saccade(system_configs, db, k_ind, data_aug, debug):
data_rng = system_configs.data_rng
batch_size = system_configs.batch_size
categories = db.configs["categories"]
input_size = db.configs["input_size"]
output_size = db.configs["output_sizes"][0]
rand_scales = db.configs["rand_scales"]
rand_crop = db.configs["rand_crop"]
rand_center = db.configs["rand_center"]
view_sizes = db.configs["view_sizes"]
gaussian_iou = db.configs["gaussian_iou"]
gaussian_rad = db.configs["gaussian_radius"]
att_ratios = db.configs["att_ratios"]
att_ranges = db.configs["att_ranges"]
att_sizes = db.configs["att_sizes"]
min_scale = db.configs["min_scale"]
max_scale = db.configs["max_scale"]
max_objects = 128
images = np.zeros((batch_size, 3, input_size[0], input_size[1]), dtype=np.float32)
tl_heats = np.zeros((batch_size, categories, output_size[0], output_size[1]), dtype=np.float32)
br_heats = np.zeros((batch_size, categories, output_size[0], output_size[1]), dtype=np.float32)
tl_valids = np.zeros((batch_size, categories, output_size[0], output_size[1]), dtype=np.float32)
br_valids = np.zeros((batch_size, categories, output_size[0], output_size[1]), dtype=np.float32)
tl_regrs = np.zeros((batch_size, max_objects, 2), dtype=np.float32)
br_regrs = np.zeros((batch_size, max_objects, 2), dtype=np.float32)
tl_tags = np.zeros((batch_size, max_objects), dtype=np.int64)
br_tags = np.zeros((batch_size, max_objects), dtype=np.int64)
tag_masks = np.zeros((batch_size, max_objects), dtype=np.uint8)
tag_lens = np.zeros((batch_size,), dtype=np.int32)
attentions = [np.zeros((batch_size, 1, att_size[0], att_size[1]), dtype=np.float32) for att_size in att_sizes]
db_size = db.db_inds.size
for b_ind in range(batch_size):
if not debug and k_ind == 0:
# if k_ind == 0:
db.shuffle_inds()
db_ind = db.db_inds[k_ind]
k_ind = (k_ind + 1) % db_size
image_path = db.image_path(db_ind)
image = cv2.imread(image_path)
orig_detections = db.detections(db_ind)
keep_inds = np.arange(orig_detections.shape[0])
# clip the detections
detections = orig_detections.copy()
border = [0, image.shape[0], 0, image.shape[1]]
detections, clip_inds = clip_detections(border, detections)
keep_inds = keep_inds[clip_inds]
scale, ref_ind = ref_scale(detections, random_crop=rand_crop)
scale = np.random.choice(rand_scales) if scale is None else scale
orig_detections[:, 0:4:2] *= scale
orig_detections[:, 1:4:2] *= scale
image, detections = scale_image_detections(image, detections, scale)
ref_detection = detections[ref_ind].copy()
image, detections, border = crop_image_dets(image, detections, ref_ind, input_size, rand_center=rand_center)
detections, clip_inds = clip_detections(border, detections)
keep_inds = keep_inds[clip_inds]
width_ratio = output_size[1] / input_size[1]
height_ratio = output_size[0] / input_size[0]
# flipping an image randomly
if not debug and np.random.uniform() > 0.5:
image[:] = image[:, ::-1, :]
width = image.shape[1]
detections[:, [0, 2]] = width - detections[:, [2, 0]] - 1
create_attention_mask([att[b_ind, 0] for att in attentions], att_ratios, att_ranges, detections)
if debug:
dimage = image.copy()
for det in detections.astype(np.int32):
cv2.rectangle(dimage,
(det[0], det[1]),
(det[2], det[3]),
(0, 255, 0), 2
)
cv2.imwrite('debug/{:03d}.jpg'.format(b_ind), dimage)
overlaps = bbox_overlaps(detections, orig_detections[keep_inds]) > 0.5
if not debug:
image = image.astype(np.float32) / 255.
color_jittering_(data_rng, image)
lighting_(data_rng, image, 0.1, db.eig_val, db.eig_vec)
normalize_(image, db.mean, db.std)
images[b_ind] = image.transpose((2, 0, 1))
for ind, (detection, overlap) in enumerate(zip(detections, overlaps)):
category = int(detection[-1]) - 1
xtl, ytl = detection[0], detection[1]
xbr, ybr = detection[2], detection[3]
det_height = int(ybr) - int(ytl)
det_width = int(xbr) - int(xtl)
det_max = max(det_height, det_width)
valid = det_max >= min_scale
fxtl = (xtl * width_ratio)
fytl = (ytl * height_ratio)
fxbr = (xbr * width_ratio)
fybr = (ybr * height_ratio)
xtl = int(fxtl)
ytl = int(fytl)
xbr = int(fxbr)
ybr = int(fybr)
width = detection[2] - detection[0]
height = detection[3] - detection[1]
width = math.ceil(width * width_ratio)
height = math.ceil(height * height_ratio)
if gaussian_rad == -1:
radius = gaussian_radius((height, width), gaussian_iou)
radius = max(0, int(radius))
else:
radius = gaussian_rad
if overlap and valid:
draw_gaussian(tl_heats[b_ind, category], [xtl, ytl], radius)
draw_gaussian(br_heats[b_ind, category], [xbr, ybr], radius)
tag_ind = tag_lens[b_ind]
tl_regrs[b_ind, tag_ind, :] = [fxtl - xtl, fytl - ytl]
br_regrs[b_ind, tag_ind, :] = [fxbr - xbr, fybr - ybr]
tl_tags[b_ind, tag_ind] = ytl * output_size[1] + xtl
br_tags[b_ind, tag_ind] = ybr * output_size[1] + xbr
tag_lens[b_ind] += 1
else:
draw_gaussian(tl_valids[b_ind, category], [xtl, ytl], radius)
draw_gaussian(br_valids[b_ind, category], [xbr, ybr], radius)
tl_valids = (tl_valids == 0).astype(np.float32)
br_valids = (br_valids == 0).astype(np.float32)
for b_ind in range(batch_size):
tag_len = tag_lens[b_ind]
tag_masks[b_ind, :tag_len] = 1
images = torch.from_numpy(images)
tl_heats = torch.from_numpy(tl_heats)
br_heats = torch.from_numpy(br_heats)
tl_regrs = torch.from_numpy(tl_regrs)
br_regrs = torch.from_numpy(br_regrs)
tl_tags = torch.from_numpy(tl_tags)
br_tags = torch.from_numpy(br_tags)
tag_masks = torch.from_numpy(tag_masks)
tl_valids = torch.from_numpy(tl_valids)
br_valids = torch.from_numpy(br_valids)
attentions = [torch.from_numpy(att) for att in attentions]
return {
"xs": [images],
"ys": [tl_heats, br_heats, tag_masks, tl_regrs, br_regrs, tl_tags, br_tags, tl_valids, br_valids, attentions]
}, k_ind

View File

@@ -1,178 +0,0 @@
import random
import cv2
import numpy as np
def grayscale(image):
return cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
def normalize_(image, mean, std):
image -= mean
image /= std
def lighting_(data_rng, image, alphastd, eigval, eigvec):
alpha = data_rng.normal(scale=alphastd, size=(3,))
image += np.dot(eigvec, eigval * alpha)
def blend_(alpha, image1, image2):
image1 *= alpha
image2 *= (1 - alpha)
image1 += image2
def saturation_(data_rng, image, gs, gs_mean, var):
alpha = 1. + data_rng.uniform(low=-var, high=var)
blend_(alpha, image, gs[:, :, None])
def brightness_(data_rng, image, gs, gs_mean, var):
alpha = 1. + data_rng.uniform(low=-var, high=var)
image *= alpha
def contrast_(data_rng, image, gs, gs_mean, var):
alpha = 1. + data_rng.uniform(low=-var, high=var)
blend_(alpha, image, gs_mean)
def color_jittering_(data_rng, image):
functions = [brightness_, contrast_, saturation_]
random.shuffle(functions)
gs = grayscale(image)
gs_mean = gs.mean()
for f in functions:
f(data_rng, image, gs, gs_mean, 0.4)
def gaussian2D(shape, sigma=1):
m, n = [(ss - 1.) / 2. for ss in shape]
y, x = np.ogrid[-m:m + 1, -n:n + 1]
h = np.exp(-(x * x + y * y) / (2 * sigma * sigma))
h[h < np.finfo(h.dtype).eps * h.max()] = 0
return h
def draw_gaussian(heatmap, center, radius, k=1):
diameter = 2 * radius + 1
gaussian = gaussian2D((diameter, diameter), sigma=diameter / 6)
x, y = center
height, width = heatmap.shape[0:2]
left, right = min(x, radius), min(width - x, radius + 1)
top, bottom = min(y, radius), min(height - y, radius + 1)
masked_heatmap = heatmap[y - top:y + bottom, x - left:x + right]
masked_gaussian = gaussian[radius - top:radius + bottom, radius - left:radius + right]
np.maximum(masked_heatmap, masked_gaussian * k, out=masked_heatmap)
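# Largest Gaussian radius such that a corner placed anywhere within it still
# produces a box with IoU >= min_overlap against the ground-truth box; the three
# quadratic cases cover both corners shrinking the box, both expanding it, and
# one of each.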
def gaussian_radius(det_size, min_overlap):
height, width = det_size
a1 = 1
b1 = (height + width)
c1 = width * height * (1 - min_overlap) / (1 + min_overlap)
sq1 = np.sqrt(b1 ** 2 - 4 * a1 * c1)
r1 = (b1 - sq1) / (2 * a1)
a2 = 4
b2 = 2 * (height + width)
c2 = (1 - min_overlap) * width * height
sq2 = np.sqrt(b2 ** 2 - 4 * a2 * c2)
r2 = (b2 - sq2) / (2 * a2)
a3 = 4 * min_overlap
b3 = -2 * min_overlap * (height + width)
c3 = (min_overlap - 1) * width * height
sq3 = np.sqrt(b3 ** 2 - 4 * a3 * c3)
r3 = (b3 + sq3) / (2 * a3)
return min(r1, r2, r3)
def _get_border(border, size):
i = 1
while size - border // i <= border // i:
i *= 2
return border // i
def random_crop(image, detections, random_scales, view_size, border=64):
view_height, view_width = view_size
image_height, image_width = image.shape[0:2]
scale = np.random.choice(random_scales)
height = int(view_height * scale)
width = int(view_width * scale)
cropped_image = np.zeros((height, width, 3), dtype=image.dtype)
w_border = _get_border(border, image_width)
h_border = _get_border(border, image_height)
ctx = np.random.randint(low=w_border, high=image_width - w_border)
cty = np.random.randint(low=h_border, high=image_height - h_border)
x0, x1 = max(ctx - width // 2, 0), min(ctx + width // 2, image_width)
y0, y1 = max(cty - height // 2, 0), min(cty + height // 2, image_height)
left_w, right_w = ctx - x0, x1 - ctx
top_h, bottom_h = cty - y0, y1 - cty
# crop image
cropped_ctx, cropped_cty = width // 2, height // 2
x_slice = slice(cropped_ctx - left_w, cropped_ctx + right_w)
y_slice = slice(cropped_cty - top_h, cropped_cty + bottom_h)
cropped_image[y_slice, x_slice, :] = image[y0:y1, x0:x1, :]
# crop detections
cropped_detections = detections.copy()
cropped_detections[:, 0:4:2] -= x0
cropped_detections[:, 1:4:2] -= y0
cropped_detections[:, 0:4:2] += cropped_ctx - left_w
cropped_detections[:, 1:4:2] += cropped_cty - top_h
return cropped_image, cropped_detections
def crop_image(image, center, size, output_size=None):
if output_size is None:
output_size = size
cty, ctx = center
height, width = size
o_height, o_width = output_size
im_height, im_width = image.shape[0:2]
cropped_image = np.zeros((o_height, o_width, 3), dtype=image.dtype)
x0, x1 = max(0, ctx - width // 2), min(ctx + width // 2, im_width)
y0, y1 = max(0, cty - height // 2), min(cty + height // 2, im_height)
left, right = ctx - x0, x1 - ctx
top, bottom = cty - y0, y1 - cty
cropped_cty, cropped_ctx = o_height // 2, o_width // 2
y_slice = slice(cropped_cty - top, cropped_cty + bottom)
x_slice = slice(cropped_ctx - left, cropped_ctx + right)
cropped_image[y_slice, x_slice, :] = image[y0:y1, x0:x1, :]
border = np.array([
cropped_cty - top,
cropped_cty + bottom,
cropped_ctx - left,
cropped_ctx + right
], dtype=np.float32)
offset = np.array([
cty - o_height // 2,
ctx - o_width // 2
])
return cropped_image, border, offset

View File

@@ -1,5 +0,0 @@
from .cornernet import cornernet
from .cornernet_saccade import cornernet_saccade
def test_func(sys_config, db, nnet, result_dir, debug=False):
return globals()[sys_config.sampling_function](db, nnet, result_dir, debug=debug)

View File

@@ -1,180 +0,0 @@
import json
import os
import cv2
import numpy as np
import torch
from tqdm import tqdm
from ..external.nms import soft_nms, soft_nms_merge
from ..sample.utils import crop_image
from ..utils import Timer
from ..vis_utils import draw_bboxes
def rescale_dets_(detections, ratios, borders, sizes):
xs, ys = detections[..., 0:4:2], detections[..., 1:4:2]
xs /= ratios[:, 1][:, None, None]
ys /= ratios[:, 0][:, None, None]
xs -= borders[:, 2][:, None, None]
ys -= borders[:, 0][:, None, None]
np.clip(xs, 0, sizes[:, 1][:, None, None], out=xs)
np.clip(ys, 0, sizes[:, 0][:, None, None], out=ys)
def decode(nnet, images, K, ae_threshold=0.5, kernel=3, num_dets=1000):
detections = nnet.test([images], ae_threshold=ae_threshold, test=True, K=K, kernel=kernel, num_dets=num_dets)[0]
return detections.data.cpu().numpy()
def cornernet(db, nnet, result_dir, debug=False, decode_func=decode):
debug_dir = os.path.join(result_dir, "debug")
if not os.path.exists(debug_dir):
os.makedirs(debug_dir)
if db.split != "trainval2014":
db_inds = db.db_inds[:100] if debug else db.db_inds
else:
db_inds = db.db_inds[:100] if debug else db.db_inds[:5000]
num_images = db_inds.size
categories = db.configs["categories"]
timer = Timer()
top_bboxes = {}
for ind in tqdm(range(0, num_images), ncols=80, desc="locating kps"):
db_ind = db_inds[ind]
image_id = db.image_ids(db_ind)
image_path = db.image_path(db_ind)
image = cv2.imread(image_path)
timer.tic()
top_bboxes[image_id] = cornernet_inference(db, nnet, image)
timer.toc()
if debug:
image_path = db.image_path(db_ind)
image = cv2.imread(image_path)
bboxes = {
db.cls2name(j): top_bboxes[image_id][j]
for j in range(1, categories + 1)
}
image = draw_bboxes(image, bboxes)
debug_file = os.path.join(debug_dir, "{}.jpg".format(db_ind))
cv2.imwrite(debug_file, image)
print('average time: {}'.format(timer.average_time))
result_json = os.path.join(result_dir, "results.json")
detections = db.convert_to_coco(top_bboxes)
with open(result_json, "w") as f:
json.dump(detections, f)
cls_ids = list(range(1, categories + 1))
image_ids = [db.image_ids(ind) for ind in db_inds]
db.evaluate(result_json, cls_ids, image_ids)
return 0
def cornernet_inference(db, nnet, image, decode_func=decode):
K = db.configs["top_k"]
ae_threshold = db.configs["ae_threshold"]
nms_kernel = db.configs["nms_kernel"]
num_dets = db.configs["num_dets"]
test_flipped = db.configs["test_flipped"]
input_size = db.configs["input_size"]
output_size = db.configs["output_sizes"][0]
scales = db.configs["test_scales"]
weight_exp = db.configs["weight_exp"]
merge_bbox = db.configs["merge_bbox"]
categories = db.configs["categories"]
nms_threshold = db.configs["nms_threshold"]
max_per_image = db.configs["max_per_image"]
nms_algorithm = {
"nms": 0,
"linear_soft_nms": 1,
"exp_soft_nms": 2
}[db.configs["nms_algorithm"]]
height, width = image.shape[0:2]
height_scale = (input_size[0] + 1) // output_size[0]
width_scale = (input_size[1] + 1) // output_size[1]
im_mean = torch.cuda.FloatTensor(db.mean).reshape(1, 3, 1, 1)
im_std = torch.cuda.FloatTensor(db.std).reshape(1, 3, 1, 1)
detections = []
for scale in scales:
new_height = int(height * scale)
new_width = int(width * scale)
new_center = np.array([new_height // 2, new_width // 2])
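# `x | 127` rounds x up so that x + 1 is a multiple of 128 (it sets the seven
# lowest bits), keeping the padded input a size the hourglass backbone can
# downsample cleanly.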
inp_height = new_height | 127
inp_width = new_width | 127
images = np.zeros((1, 3, inp_height, inp_width), dtype=np.float32)
ratios = np.zeros((1, 2), dtype=np.float32)
borders = np.zeros((1, 4), dtype=np.float32)
sizes = np.zeros((1, 2), dtype=np.float32)
out_height, out_width = (inp_height + 1) // height_scale, (inp_width + 1) // width_scale
height_ratio = out_height / inp_height
width_ratio = out_width / inp_width
resized_image = cv2.resize(image, (new_width, new_height))
resized_image, border, offset = crop_image(resized_image, new_center, [inp_height, inp_width])
resized_image = resized_image / 255.
images[0] = resized_image.transpose((2, 0, 1))
borders[0] = border
sizes[0] = [int(height * scale), int(width * scale)]
ratios[0] = [height_ratio, width_ratio]
if test_flipped:
images = np.concatenate((images, images[:, :, :, ::-1]), axis=0)
images = torch.from_numpy(images).cuda()
images -= im_mean
images /= im_std
dets = decode_func(nnet, images, K, ae_threshold=ae_threshold, kernel=nms_kernel, num_dets=num_dets)
if test_flipped:
dets[1, :, [0, 2]] = out_width - dets[1, :, [2, 0]]
dets = dets.reshape(1, -1, 8)
rescale_dets_(dets, ratios, borders, sizes)
dets[:, :, 0:4] /= scale
detections.append(dets)
detections = np.concatenate(detections, axis=1)
classes = detections[..., -1]
classes = classes[0]
detections = detections[0]
# reject detections with negative scores
keep_inds = (detections[:, 4] > -1)
detections = detections[keep_inds]
classes = classes[keep_inds]
top_bboxes = {}
for j in range(categories):
keep_inds = (classes == j)
top_bboxes[j + 1] = detections[keep_inds][:, 0:7].astype(np.float32)
if merge_bbox:
soft_nms_merge(top_bboxes[j + 1], Nt=nms_threshold, method=nms_algorithm, weight_exp=weight_exp)
else:
soft_nms(top_bboxes[j + 1], Nt=nms_threshold, method=nms_algorithm)
top_bboxes[j + 1] = top_bboxes[j + 1][:, 0:5]
scores = np.hstack([top_bboxes[j][:, -1] for j in range(1, categories + 1)])
if len(scores) > max_per_image:
kth = len(scores) - max_per_image
thresh = np.partition(scores, kth)[kth]
for j in range(1, categories + 1):
keep_inds = (top_bboxes[j][:, -1] >= thresh)
top_bboxes[j] = top_bboxes[j][keep_inds]
return top_bboxes

View File

@@ -1,405 +0,0 @@
import json
import math
import os
import cv2
import numpy as np
import torch
import torch.nn as nn
from tqdm import tqdm
from ..external.nms import soft_nms
from ..utils import Timer
from ..vis_utils import draw_bboxes
def crop_image_gpu(image, center, size, out_image):
cty, ctx = center
height, width = size
o_height, o_width = out_image.shape[1:3]
im_height, im_width = image.shape[1:3]
scale = o_height / max(height, width)
x0, x1 = max(0, ctx - width // 2), min(ctx + width // 2, im_width)
y0, y1 = max(0, cty - height // 2), min(cty + height // 2, im_height)
left, right = ctx - x0, x1 - ctx
top, bottom = cty - y0, y1 - cty
cropped_cty, cropped_ctx = o_height // 2, o_width // 2
out_y0, out_y1 = cropped_cty - int(top * scale), cropped_cty + int(bottom * scale)
out_x0, out_x1 = cropped_ctx - int(left * scale), cropped_ctx + int(right * scale)
new_height = out_y1 - out_y0
new_width = out_x1 - out_x0
image = image[:, y0:y1, x0:x1].unsqueeze(0)
out_image[:, out_y0:out_y1, out_x0:out_x1] = nn.functional.interpolate(
image, size=[new_height, new_width], mode='bilinear'
)[0]
return np.array([cty - height // 2, ctx - width // 2])
def remap_dets_(detections, scales, offsets):
xs, ys = detections[..., 0:4:2], detections[..., 1:4:2]
xs /= scales.reshape(-1, 1, 1)
ys /= scales.reshape(-1, 1, 1)
xs += offsets[:, 1][:, None, None]
ys += offsets[:, 0][:, None, None]
def att_nms(atts, ks):
pads = [(k - 1) // 2 for k in ks]
pools = [nn.functional.max_pool2d(att, (k, k), stride=1, padding=pad) for k, att, pad in zip(ks, atts, pads)]
keeps = [(att == pool).float() for att, pool in zip(atts, pools)]
atts = [att * keep for att, keep in zip(atts, keeps)]
return atts
def batch_decode(db, nnet, images, no_att=False):
K = db.configs["top_k"]
ae_threshold = db.configs["ae_threshold"]
kernel = db.configs["nms_kernel"]
num_dets = db.configs["num_dets"]
att_nms_ks = db.configs["att_nms_ks"]
att_ranges = db.configs["att_ranges"]
num_images = images.shape[0]
detections = []
attentions = [[] for _ in range(len(att_ranges))]
batch_size = 32
for b_ind in range(math.ceil(num_images / batch_size)):
b_start = b_ind * batch_size
b_end = min(num_images, (b_ind + 1) * batch_size)
b_images = images[b_start:b_end]
b_outputs = nnet.test(
[b_images], ae_threshold=ae_threshold, K=K, kernel=kernel,
test=True, num_dets=num_dets, no_border=True, no_att=no_att
)
if no_att:
b_detections = b_outputs
else:
b_detections = b_outputs[0]
b_attentions = b_outputs[1]
b_attentions = att_nms(b_attentions, att_nms_ks)
b_attentions = [b_attention.data.cpu().numpy() for b_attention in b_attentions]
b_detections = b_detections.data.cpu().numpy()
detections.append(b_detections)
if not no_att:
for attention, b_attention in zip(attentions, b_attentions):
attention.append(b_attention)
if not no_att:
attentions = [np.concatenate(atts, axis=0) for atts in attentions] if detections else None
detections = np.concatenate(detections, axis=0) if detections else np.zeros((0, num_dets, 8))
return detections, attentions
def decode_atts(db, atts, att_scales, scales, offsets, height, width, thresh, ignore_same=False):
att_ranges = db.configs["att_ranges"]
att_ratios = db.configs["att_ratios"]
input_size = db.configs["input_size"]
next_ys, next_xs, next_scales, next_scores = [], [], [], []
num_atts = atts[0].shape[0]
for aind in range(num_atts):
for att, att_range, att_ratio, att_scale in zip(atts, att_ranges, att_ratios, att_scales):
ys, xs = np.where(att[aind, 0] > thresh)
scores = att[aind, 0, ys, xs]
ys = ys * att_ratio / scales[aind] + offsets[aind, 0]
xs = xs * att_ratio / scales[aind] + offsets[aind, 1]
keep = (ys >= 0) & (ys < height) & (xs >= 0) & (xs < width)
ys, xs, scores = ys[keep], xs[keep], scores[keep]
next_scale = att_scale * scales[aind]
if (ignore_same and att_scale <= 1) or scales[aind] > 2 or next_scale > 4:
continue
next_scales += [next_scale] * len(xs)
next_scores += scores.tolist()
next_ys += ys.tolist()
next_xs += xs.tolist()
next_ys = np.array(next_ys)
next_xs = np.array(next_xs)
next_scales = np.array(next_scales)
next_scores = np.array(next_scores)
return np.stack((next_ys, next_xs, next_scales, next_scores), axis=1)
def get_ref_locs(dets):
keep = dets[:, 4] > 0.5
dets = dets[keep]
ref_xs = (dets[:, 0] + dets[:, 2]) / 2
ref_ys = (dets[:, 1] + dets[:, 3]) / 2
ref_maxhws = np.maximum(dets[:, 2] - dets[:, 0], dets[:, 3] - dets[:, 1])
ref_scales = np.zeros_like(ref_maxhws)
ref_scores = dets[:, 4]
large_inds = ref_maxhws > 96
medium_inds = (ref_maxhws > 32) & (ref_maxhws <= 96)
small_inds = ref_maxhws <= 32
ref_scales[large_inds] = 192 / ref_maxhws[large_inds]
ref_scales[medium_inds] = 64 / ref_maxhws[medium_inds]
ref_scales[small_inds] = 24 / ref_maxhws[small_inds]
new_locations = np.stack((ref_ys, ref_xs, ref_scales, ref_scores), axis=1)
new_locations[:, 3] = 1
return new_locations
def get_locs(db, nnet, image, im_mean, im_std, att_scales, thresh, sizes, ref_dets=True):
att_ranges = db.configs["att_ranges"]
att_ratios = db.configs["att_ratios"]
input_size = db.configs["input_size"]
height, width = image.shape[1:3]
locations = []
for size in sizes:
scale = size / max(height, width)
location = [height // 2, width // 2, scale]
locations.append(location)
locations = np.array(locations, dtype=np.float32)
images, offsets = prepare_images(db, image, locations, flipped=False)
images -= im_mean
images /= im_std
dets, atts = batch_decode(db, nnet, images)
scales = locations[:, 2]
next_locations = decode_atts(db, atts, att_scales, scales, offsets, height, width, thresh)
rescale_dets_(db, dets)
remap_dets_(dets, scales, offsets)
dets = dets.reshape(-1, 8)
keep = dets[:, 4] > 0.3
dets = dets[keep]
if ref_dets:
ref_locations = get_ref_locs(dets)
next_locations = np.concatenate((next_locations, ref_locations), axis=0)
next_locations = location_nms(next_locations, thresh=16)
return dets, next_locations, atts
def location_nms(locations, thresh=15):
next_locations = []
sorted_inds = np.argsort(locations[:, -1])[::-1]
locations = locations[sorted_inds]
ys = locations[:, 0]
xs = locations[:, 1]
scales = locations[:, 2]
dist_ys = np.absolute(ys.reshape(-1, 1) - ys.reshape(1, -1))
dist_xs = np.absolute(xs.reshape(-1, 1) - xs.reshape(1, -1))
dists = np.minimum(dist_ys, dist_xs)
ratios = scales.reshape(-1, 1) / scales.reshape(1, -1)
while dists.shape[0] > 0:
next_locations.append(locations[0])
scale = scales[0]
dist = dists[0]
ratio = ratios[0]
keep = (dist > (thresh / scale)) | (ratio > 1.2) | (ratio < 0.8)
locations = locations[keep]
scales = scales[keep]
dists = dists[keep, :]
dists = dists[:, keep]
ratios = ratios[keep, :]
ratios = ratios[:, keep]
return np.stack(next_locations) if next_locations else np.zeros((0, 4))
def prepare_images(db, image, locs, flipped=True):
input_size = db.configs["input_size"]
num_patches = locs.shape[0]
images = torch.zeros((num_patches, 3, input_size[0], input_size[1]), dtype=torch.float32, device='cuda')
offsets = np.zeros((num_patches, 2), dtype=np.float32)
for ind, (y, x, scale) in enumerate(locs[:, :3]):
crop_height = int(input_size[0] / scale)
crop_width = int(input_size[1] / scale)
offsets[ind] = crop_image_gpu(image, [int(y), int(x)], [crop_height, crop_width], images[ind])
return images, offsets
def rescale_dets_(db, dets):
input_size = db.configs["input_size"]
output_size = db.configs["output_sizes"][0]
ratios = [o / i for o, i in zip(output_size, input_size)]
dets[..., 0:4:2] /= ratios[1]
dets[..., 1:4:2] /= ratios[0]
def cornernet_saccade(db, nnet, result_dir, debug=False, decode_func=batch_decode):
debug_dir = os.path.join(result_dir, "debug")
if not os.path.exists(debug_dir):
os.makedirs(debug_dir)
if db.split != "trainval2014":
db_inds = db.db_inds[:500] if debug else db.db_inds
else:
db_inds = db.db_inds[:100] if debug else db.db_inds[:5000]
num_images = db_inds.size
categories = db.configs["categories"]
timer = Timer()
top_bboxes = {}
for k_ind in tqdm(range(0, num_images), ncols=80, desc="locating kps"):
db_ind = db_inds[k_ind]
image_id = db.image_ids(db_ind)
image_path = db.image_path(db_ind)
image = cv2.imread(image_path)
timer.tic()
top_bboxes[image_id] = cornernet_saccade_inference(db, nnet, image)
timer.toc()
if debug:
image_path = db.image_path(db_ind)
image = cv2.imread(image_path)
bboxes = {
db.cls2name(j): top_bboxes[image_id][j]
for j in range(1, categories + 1)
}
image = draw_bboxes(image, bboxes)
debug_file = os.path.join(debug_dir, "{}.jpg".format(db_ind))
cv2.imwrite(debug_file, image)
print('average time: {}'.format(timer.average_time))
result_json = os.path.join(result_dir, "results.json")
detections = db.convert_to_coco(top_bboxes)
with open(result_json, "w") as f:
json.dump(detections, f)
cls_ids = list(range(1, categories + 1))
image_ids = [db.image_ids(ind) for ind in db_inds]
db.evaluate(result_json, cls_ids, image_ids)
return 0
def cornernet_saccade_inference(db, nnet, image, decode_func=batch_decode):
init_sizes = db.configs["init_sizes"]
ref_dets = db.configs["ref_dets"]
att_thresholds = db.configs["att_thresholds"]
att_scales = db.configs["att_scales"]
att_max_crops = db.configs["att_max_crops"]
categories = db.configs["categories"]
nms_threshold = db.configs["nms_threshold"]
max_per_image = db.configs["max_per_image"]
nms_algorithm = {
"nms": 0,
"linear_soft_nms": 1,
"exp_soft_nms": 2
}[db.configs["nms_algorithm"]]
num_iterations = len(att_thresholds)
im_mean = torch.tensor(db.mean, dtype=torch.float32, device='cuda').reshape(1, 3, 1, 1)
im_std = torch.tensor(db.std, dtype=torch.float32, device='cuda').reshape(1, 3, 1, 1)
height, width = image.shape[0:2]
image = image / 255.
image = image.transpose((2, 0, 1)).copy()
image = torch.from_numpy(image).cuda(non_blocking=True)
dets, locations, atts = get_locs(
db, nnet, image, im_mean, im_std,
att_scales[0], att_thresholds[0],
init_sizes, ref_dets=ref_dets
)
detections = [dets]
num_patches = locations.shape[0]
num_crops = 0
for ind in range(1, num_iterations + 1):
if num_patches == 0:
break
if num_crops + num_patches > att_max_crops:
max_crops = min(att_max_crops - num_crops, num_patches)
locations = locations[:max_crops]
num_patches = locations.shape[0]
num_crops += locations.shape[0]
no_att = (ind == num_iterations)
images, offsets = prepare_images(db, image, locations, flipped=False)
images -= im_mean
images /= im_std
dets, atts = decode_func(db, nnet, images, no_att=no_att)
dets = dets.reshape(num_patches, -1, 8)
rescale_dets_(db, dets)
remap_dets_(dets, locations[:, 2], offsets)
dets = dets.reshape(-1, 8)
keeps = (dets[:, 4] > -1)
dets = dets[keeps]
detections.append(dets)
if num_crops == att_max_crops:
break
if ind < num_iterations:
att_threshold = att_thresholds[ind]
att_scale = att_scales[ind]
next_locations = decode_atts(
db, atts, att_scale, locations[:, 2], offsets, height, width, att_threshold, ignore_same=True
)
if ref_dets:
ref_locations = get_ref_locs(dets)
next_locations = np.concatenate((next_locations, ref_locations), axis=0)
next_locations = location_nms(next_locations, thresh=16)
locations = next_locations
num_patches = locations.shape[0]
detections = np.concatenate(detections, axis=0)
classes = detections[..., -1]
top_bboxes = {}
for j in range(categories):
keep_inds = (classes == j)
top_bboxes[j + 1] = detections[keep_inds][:, 0:7].astype(np.float32)
keep_inds = soft_nms(top_bboxes[j + 1], Nt=nms_threshold, method=nms_algorithm, sigma=0.7)
top_bboxes[j + 1] = top_bboxes[j + 1][keep_inds, 0:5]
scores = np.hstack([top_bboxes[j][:, -1] for j in range(1, categories + 1)])
if len(scores) > max_per_image:
kth = len(scores) - max_per_image
thresh = np.partition(scores, kth)[kth]
for j in range(1, categories + 1):
keep_inds = (top_bboxes[j][:, -1] >= thresh)
top_bboxes[j] = top_bboxes[j][keep_inds]
return top_bboxes
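`att_nms` above keeps only the local maxima of each attention map by comparing it against a max-pooled copy, the same peak-NMS idea used on corner heatmaps. A standalone sketch of that operation (the 3x3 window and random input are illustrative):

```python
import torch
import torch.nn as nn

def peak_nms(heatmap, kernel=3):
    """Zero out every value that is not the maximum of its kernel x kernel
    neighbourhood; heatmap has shape (N, C, H, W)."""
    pad = (kernel - 1) // 2
    pooled = nn.functional.max_pool2d(heatmap, (kernel, kernel), stride=1, padding=pad)
    keep = (pooled == heatmap).float()
    return heatmap * keep

atts = torch.rand(1, 1, 8, 8)      # stand-in attention map
peaks = peak_nms(atts, kernel=3)   # only local maxima remain non-zero
```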

View File

@@ -1,2 +0,0 @@
from .tqdm import stdout_to_tqdm
from .timer import Timer

View File

@@ -1,27 +0,0 @@
import time
class Timer(object):
"""A simple timer."""
def __init__(self):
self.total_time = 0.
self.calls = 0
self.start_time = 0.
self.diff = 0.
self.average_time = 0.
def tic(self):
# using time.time instead of time.clock because time.clock
# does not normalize for multithreading
self.start_time = time.time()
def toc(self, average=True):
self.diff = time.time() - self.start_time
self.total_time += self.diff
self.calls += 1
self.average_time = self.total_time / self.calls
if average:
return self.average_time
else:
return self.diff
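A short usage sketch of the `Timer` class above (the `time.sleep` call stands in for model inference):

```python
import time

timer = Timer()  # assumes the Timer class defined above is in scope
for _ in range(5):
    timer.tic()
    time.sleep(0.01)
    timer.toc()
print("average time: {:.4f}s over {} calls".format(timer.average_time, timer.calls))
```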

View File

@@ -1,27 +0,0 @@
import contextlib
import sys
from tqdm import tqdm
class TqdmFile(object):
dummy_file = None
def __init__(self, dummy_file):
self.dummy_file = dummy_file
def write(self, x):
if len(x.rstrip()) > 0:
tqdm.write(x, file=self.dummy_file)
@contextlib.contextmanager
def stdout_to_tqdm():
save_stdout = sys.stdout
try:
sys.stdout = TqdmFile(sys.stdout)
yield save_stdout
except Exception as exc:
raise exc
finally:
sys.stdout = save_stdout
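A minimal usage sketch of `stdout_to_tqdm`: inside the context, `print` is routed through `tqdm.write`, so log lines no longer break the progress bar (the loop body is illustrative):

```python
from tqdm import tqdm

# assumes stdout_to_tqdm defined above is in scope
with stdout_to_tqdm() as save_stdout:
    for i in tqdm(range(100), file=save_stdout, ncols=80):
        if i % 25 == 0:
            print("checkpoint at iteration {}".format(i))
```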

View File

@@ -1,63 +0,0 @@
import cv2
import numpy as np
def draw_bboxes(image, bboxes, font_size=0.5, thresh=0.5, colors=None):
"""Draws bounding boxes on an image.
Args:
image: An image in OpenCV format
bboxes: A dictionary representing bounding boxes of different object
categories, where the keys are the names of the categories and the
values are the bounding boxes. The bounding boxes of each category should
be stored in a 2D NumPy array, where each row is a bounding box (x1, y1,
x2, y2, score).
font_size: (Optional) Font size of the category names.
thresh: (Optional) Only bounding boxes with scores above the threshold
will be drawn.
colors: (Optional) Colors of the bounding boxes for each category. If not
provided, this function uses a random color for each category.
Returns:
An image with bounding boxes.
"""
image = image.copy()
for cat_name in bboxes:
keep_inds = bboxes[cat_name][:, -1] > thresh
cat_size = cv2.getTextSize(cat_name, cv2.FONT_HERSHEY_SIMPLEX, font_size, 2)[0]
if colors is None:
color = np.random.random((3,)) * 0.6 + 0.4
color = (color * 255).astype(np.int32).tolist()
else:
color = colors[cat_name]
for bbox in bboxes[cat_name][keep_inds]:
bbox = bbox[0:4].astype(np.int32)
if bbox[1] - cat_size[1] - 2 < 0:
cv2.rectangle(image,
(bbox[0], bbox[1] + 2),
(bbox[0] + cat_size[0], bbox[1] + cat_size[1] + 2),
color, -1
)
cv2.putText(image, cat_name,
(bbox[0], bbox[1] + cat_size[1] + 2),
cv2.FONT_HERSHEY_SIMPLEX, font_size, (0, 0, 0), thickness=1
)
else:
cv2.rectangle(image,
(bbox[0], bbox[1] - cat_size[1] - 2),
(bbox[0] + cat_size[0], bbox[1] - 2),
color, -1
)
cv2.putText(image, cat_name,
(bbox[0], bbox[1] - 2),
cv2.FONT_HERSHEY_SIMPLEX, font_size, (0, 0, 0), thickness=1
)
cv2.rectangle(image,
(bbox[0], bbox[1]),
(bbox[2], bbox[3]),
color, 2
)
return image
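A usage sketch of `draw_bboxes` matching the dictionary format described in its docstring (the file names and the single box are illustrative):

```python
import cv2
import numpy as np

image = cv2.imread("demo.jpg")
bboxes = {
    # category name -> (N, 5) array of [x1, y1, x2, y2, score]
    "person": np.array([[120.0, 80.0, 260.0, 400.0, 0.92]], dtype=np.float32),
}
vis = draw_bboxes(image, bboxes, thresh=0.5)  # assumes draw_bboxes above
cv2.imwrite("demo_out.jpg", vis)
```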

Binary file not shown.

(image removed; previous size: 316 KiB)

View File

@@ -1,13 +0,0 @@
#!/usr/bin/env python
import cv2
from core.detectors import CornerNet_Saccade
from core.vis_utils import draw_bboxes
detector = CornerNet_Saccade()
image = cv2.imread("demo.jpg")
bboxes = detector(image)
image = draw_bboxes(image, bboxes)
cv2.imwrite("demo_out.jpg", image)

View File

@@ -1,16 +0,0 @@
import numpy as np
from object_detection import CornerNet_Saccade
from util import image_util
def capture_target_area(image, target="book"):
detector = CornerNet_Saccade()
bboxes = detector(image)
target_images = []
keep_inds = bboxes[target][:, -1] > 0.5
for bbox in bboxes[target][keep_inds]:
bbox = bbox[0:4].astype(np.int32)
bbox = np.clip(bbox, 0, None)
target_images.append(image_util.capture(image, bbox))
return target_images
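This snippet depends on a project-specific helper, `util.image_util.capture`, that is not shown in the diff. A plausible minimal stand-in would simply crop the box region from the image, roughly as sketched below; this is an assumption, not the project's actual implementation:

```python
import numpy as np

def capture(image, bbox):
    """Hypothetical stand-in for util.image_util.capture: crop the
    [x1, y1, x2, y2] region from an OpenCV image of shape (H, W, C)."""
    x1, y1, x2, y2 = bbox.astype(np.int32)
    h, w = image.shape[:2]
    x1, x2 = np.clip([x1, x2], 0, w)
    y1, y2 = np.clip([y1, y2], 0, h)
    return image[y1:y2, x1:x2].copy()
```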

View File

@@ -1,110 +0,0 @@
#!/usr/bin/env python
import argparse
import importlib
import json
import os
import pprint
import torch
from core.config import SystemConfig
from core.dbs import datasets
from core.nnet.py_factory import NetworkFactory
from core.test import test_func
torch.backends.cudnn.benchmark = False
def parse_args():
parser = argparse.ArgumentParser(description="Evaluation Script")
parser.add_argument("cfg_file", help="config file", type=str)
parser.add_argument("--testiter", dest="testiter",
help="test at iteration i",
default=None, type=int)
parser.add_argument("--split", dest="split",
help="which split to use",
default="validation", type=str)
parser.add_argument("--suffix", dest="suffix", default=None, type=str)
parser.add_argument("--debug", action="store_true")
args = parser.parse_args()
return args
def make_dirs(directories):
for directory in directories:
if not os.path.exists(directory):
os.makedirs(directory)
def test(db, system_config, model, args):
split = args.split
testiter = args.testiter
debug = args.debug
suffix = args.suffix
result_dir = system_config.result_dir
result_dir = os.path.join(result_dir, str(testiter), split)
if suffix is not None:
result_dir = os.path.join(result_dir, suffix)
make_dirs([result_dir])
test_iter = system_config.max_iter if testiter is None else testiter
print("loading parameters at iteration: {}".format(test_iter))
print("building neural network...")
nnet = NetworkFactory(system_config, model)
print("loading parameters...")
nnet.load_params(test_iter)
nnet.cuda()
nnet.eval_mode()
test_func(system_config, db, nnet, result_dir, debug=debug)
def main(args):
if args.suffix is None:
cfg_file = os.path.join("./configs", args.cfg_file + ".json")
else:
cfg_file = os.path.join("./configs", args.cfg_file + "-{}.json".format(args.suffix))
print("cfg_file: {}".format(cfg_file))
with open(cfg_file, "r") as f:
config = json.load(f)
config["system"]["snapshot_name"] = args.cfg_file
system_config = SystemConfig().update_config(config["system"])
model_file = "core.models.{}".format(args.cfg_file)
model_file = importlib.import_module(model_file)
model = model_file.model()
train_split = system_config.train_split
val_split = system_config.val_split
test_split = system_config.test_split
split = {
"training": train_split,
"validation": val_split,
"testing": test_split
}[args.split]
print("loading all datasets...")
dataset = system_config.dataset
print("split: {}".format(split))
testing_db = datasets[dataset](config["db"], split=split, sys_config=system_config)
print("system config...")
pprint.pprint(system_config.full)
print("db config...")
pprint.pprint(testing_db.configs)
test(testing_db, system_config, model, args)
if __name__ == "__main__":
args = parse_args()
main(args)

View File

@@ -1,260 +0,0 @@
#!/usr/bin/env python
import argparse
import importlib
import json
import os
import pprint
import queue
import threading
import traceback
import numpy as np
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.multiprocessing import Process, Queue
from tqdm import tqdm
from core.config import SystemConfig
from core.dbs import datasets
from core.nnet.py_factory import NetworkFactory
from core.sample import data_sampling_func
from core.utils import stdout_to_tqdm
torch.backends.cudnn.enabled = True
torch.backends.cudnn.benchmark = True
def parse_args():
parser = argparse.ArgumentParser(description="Training Script")
parser.add_argument("cfg_file", help="config file", type=str)
parser.add_argument("--iter", dest="start_iter",
help="train at iteration i",
default=0, type=int)
parser.add_argument("--workers", default=4, type=int)
parser.add_argument("--initialize", action="store_true")
parser.add_argument("--distributed", action="store_true")
parser.add_argument("--world-size", default=-1, type=int,
help="number of nodes of distributed training")
parser.add_argument("--rank", default=0, type=int,
help="node rank for distributed training")
parser.add_argument("--dist-url", default=None, type=str,
help="url used to set up distributed training")
parser.add_argument("--dist-backend", default="nccl", type=str)
args = parser.parse_args()
return args
def prefetch_data(system_config, db, queue, sample_data, data_aug):
ind = 0
print("start prefetching data...")
np.random.seed(os.getpid())
while True:
try:
data, ind = sample_data(system_config, db, ind, data_aug=data_aug)
queue.put(data)
except Exception as e:
traceback.print_exc()
raise e
def _pin_memory(ts):
if type(ts) is list:
return [t.pin_memory() for t in ts]
return ts.pin_memory()
def pin_memory(data_queue, pinned_data_queue, sema):
while True:
data = data_queue.get()
data["xs"] = [_pin_memory(x) for x in data["xs"]]
data["ys"] = [_pin_memory(y) for y in data["ys"]]
pinned_data_queue.put(data)
if sema.acquire(blocking=False):
return
def init_parallel_jobs(system_config, dbs, queue, fn, data_aug):
tasks = [Process(target=prefetch_data, args=(system_config, db, queue, fn, data_aug)) for db in dbs]
for task in tasks:
task.daemon = True
task.start()
return tasks
def terminate_tasks(tasks):
for task in tasks:
task.terminate()
def train(training_dbs, validation_db, system_config, model, args):
# reading arguments from command
start_iter = args.start_iter
distributed = args.distributed
world_size = args.world_size
initialize = args.initialize
gpu = args.gpu
rank = args.rank
# reading arguments from json file
batch_size = system_config.batch_size
learning_rate = system_config.learning_rate
max_iteration = system_config.max_iter
pretrained_model = system_config.pretrain
stepsize = system_config.stepsize
snapshot = system_config.snapshot
val_iter = system_config.val_iter
display = system_config.display
decay_rate = system_config.decay_rate
stepsize = system_config.stepsize
print("Process {}: building model...".format(rank))
nnet = NetworkFactory(system_config, model, distributed=distributed, gpu=gpu)
if initialize:
nnet.save_params(0)
exit(0)
# queues storing data for training
training_queue = Queue(system_config.prefetch_size)
validation_queue = Queue(5)
# queues storing pinned data for training
pinned_training_queue = queue.Queue(system_config.prefetch_size)
pinned_validation_queue = queue.Queue(5)
# allocating resources for parallel reading
training_tasks = init_parallel_jobs(system_config, training_dbs, training_queue, data_sampling_func, True)
if val_iter:
validation_tasks = init_parallel_jobs(system_config, [validation_db], validation_queue, data_sampling_func,
False)
training_pin_semaphore = threading.Semaphore()
validation_pin_semaphore = threading.Semaphore()
training_pin_semaphore.acquire()
validation_pin_semaphore.acquire()
training_pin_args = (training_queue, pinned_training_queue, training_pin_semaphore)
training_pin_thread = threading.Thread(target=pin_memory, args=training_pin_args)
training_pin_thread.daemon = True
training_pin_thread.start()
validation_pin_args = (validation_queue, pinned_validation_queue, validation_pin_semaphore)
validation_pin_thread = threading.Thread(target=pin_memory, args=validation_pin_args)
validation_pin_thread.daemon = True
validation_pin_thread.start()
if pretrained_model is not None:
if not os.path.exists(pretrained_model):
raise ValueError("pretrained model does not exist")
print("Process {}: loading from pretrained model".format(rank))
nnet.load_pretrained_params(pretrained_model)
if start_iter:
nnet.load_params(start_iter)
learning_rate /= (decay_rate ** (start_iter // stepsize))
nnet.set_lr(learning_rate)
print("Process {}: training starts from iteration {} with learning_rate {}".format(rank, start_iter + 1,
learning_rate))
else:
nnet.set_lr(learning_rate)
if rank == 0:
print("training start...")
nnet.cuda()
nnet.train_mode()
with stdout_to_tqdm() as save_stdout:
for iteration in tqdm(range(start_iter + 1, max_iteration + 1), file=save_stdout, ncols=80):
training = pinned_training_queue.get(block=True)
training_loss = nnet.train(**training)
if display and iteration % display == 0:
print("Process {}: training loss at iteration {}: {}".format(rank, iteration, training_loss.item()))
del training_loss
if val_iter and validation_db.db_inds.size and iteration % val_iter == 0:
nnet.eval_mode()
validation = pinned_validation_queue.get(block=True)
validation_loss = nnet.validate(**validation)
print("Process {}: validation loss at iteration {}: {}".format(rank, iteration, validation_loss.item()))
nnet.train_mode()
if iteration % snapshot == 0 and rank == 0:
nnet.save_params(iteration)
if iteration % stepsize == 0:
learning_rate /= decay_rate
nnet.set_lr(learning_rate)
# sending signal to kill the thread
training_pin_semaphore.release()
validation_pin_semaphore.release()
# terminating data fetching processes
terminate_tasks(training_tasks)
if val_iter:
    terminate_tasks(validation_tasks)  # only created when val_iter is set
def main(gpu, ngpus_per_node, args):
args.gpu = gpu
if args.distributed:
args.rank = args.rank * ngpus_per_node + gpu
dist.init_process_group(backend=args.dist_backend, init_method=args.dist_url,
world_size=args.world_size, rank=args.rank)
rank = args.rank
cfg_file = os.path.join("./configs", args.cfg_file + ".json")
with open(cfg_file, "r") as f:
config = json.load(f)
config["system"]["snapshot_name"] = args.cfg_file
system_config = SystemConfig().update_config(config["system"])
model_file = "core.models.{}".format(args.cfg_file)
model_file = importlib.import_module(model_file)
model = model_file.model()
train_split = system_config.train_split
val_split = system_config.val_split
print("Process {}: loading all datasets...".format(rank))
dataset = system_config.dataset
workers = args.workers
print("Process {}: using {} workers".format(rank, workers))
training_dbs = [datasets[dataset](config["db"], split=train_split, sys_config=system_config) for _ in
range(workers)]
validation_db = datasets[dataset](config["db"], split=val_split, sys_config=system_config)
if rank == 0:
print("system config...")
pprint.pprint(system_config.full)
print("db config...")
pprint.pprint(training_dbs[0].configs)
print("len of db: {}".format(len(training_dbs[0].db_inds)))
print("distributed: {}".format(args.distributed))
train(training_dbs, validation_db, system_config, model, args)
if __name__ == "__main__":
args = parse_args()
distributed = args.distributed
world_size = args.world_size
if distributed and world_size <= 0:
raise ValueError("world size must be greater than 0 in distributed training")
ngpus_per_node = torch.cuda.device_count()
if distributed:
args.world_size = ngpus_per_node * args.world_size
mp.spawn(main, nprocs=ngpus_per_node, args=(ngpus_per_node, args))
else:
main(None, ngpus_per_node, args)
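When training resumes from `start_iter`, the script rebuilds the step learning-rate schedule by dividing the base rate by `decay_rate` once per completed `stepsize` interval. A tiny sketch of that computation (the numbers are illustrative, not values from a shipped config):

```python
def resumed_lr(base_lr, start_iter, stepsize, decay_rate):
    """Learning rate after resuming a step-decay schedule at start_iter."""
    return base_lr / (decay_rate ** (start_iter // stepsize))

# e.g. base LR 2.5e-4, decayed by 10 every 450k iterations, resuming at 500k
print(resumed_lr(2.5e-4, 500000, 450000, 10))  # -> 2.5e-05
```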

View File

@@ -0,0 +1,106 @@
name: 🐛 报BUG Bug Report
description: 报告一个可复现的Bug以帮助我们修复PaddleDetection。 Report a bug to help us reproduce and fix it.
labels: [type/bug-report, status/new-issue]
body:
- type: markdown
attributes:
value: |
Thank you for submitting a PaddleDetection Bug Report!
- type: checkboxes
attributes:
label: 问题确认 Search before asking
description: >
(必选项) 在向PaddleDetection报bug之前请先查询[历史issue](https://github.com/PaddlePaddle/PaddleDetection/issues)是否报过同样的bug。
(Required) Before submitting a bug, please make sure the issue hasn't been already addressed by searching through [the existing and past issues](https://github.com/PaddlePaddle/PaddleDetection/issues).
options:
- label: >
我已经查询[历史issue](https://github.com/PaddlePaddle/PaddleDetection/issues)没有发现相似的bug。I have searched the [issues](https://github.com/PaddlePaddle/PaddleDetection/issues) and found no similar bug report.
required: true
- type: dropdown
attributes:
label: Bug组件 Bug Component
description: |
(可选项) 请选择在哪部分代码发现这个bug。(Optional) Please select the part of PaddleDetection where you found the bug.
multiple: true
options:
- "Training"
- "Validation"
- "Inference"
- "Export"
- "Deploy"
- "Installation"
- "DataProcess"
- "Other"
validations:
required: false
- type: textarea
id: code
attributes:
label: Bug描述 Describe the Bug
description: |
请清晰而简洁地描述这个bug并附上bug复现步骤、报错信息或截图、代码改动说明或最小可复现代码。如果代码太长请将可执行代码放到[AIStudio](https://aistudio.baidu.com/aistudio/index)中并将项目设置为公开或者放到github gist上并在项目中描述清楚bug复现步骤在issue中描述期望结果与实际结果。
如果你报告的是一个报错信息,请将完整回溯的报错贴在这里,并使用 ` ```三引号块``` `展示错误信息。
placeholder: |
请清晰简洁的描述这个bug。 A clear and concise description of what the bug is.
```python
代码改动说明,或最小可复现代码。 Code change description, or sample code to reproduce the problem.
```
```shell
带有完整回溯信息的报错日志或截图。 The error log or screenshot you got, with the full traceback.
```
validations:
required: true
- type: textarea
attributes:
label: 复现环境 Environment
description: 请具体说明复现bug的环境信息。Please specify the environment information for reproducing the bug.
placeholder: |
- OS: Linux/Windows
- PaddlePaddle: 2.2.2
- PaddleDetection: release/2.4
- Python: 3.8.0
- CUDA: 10.2
- CUDNN: 7.6
- GCC: 8.2.0
validations:
required: true
- type: checkboxes
attributes:
label: Bug描述确认 Bug description confirmation
description: >
(必选项) 请确认是否提供了详细的Bug描述和环境信息确认问题是否可以复现。
(Required) Please confirm whether the bug description and environment information are provided, and whether the problem can be reproduced.
options:
- label: >
我确认已经提供了Bug复现步骤、代码改动说明、以及环境信息确认问题是可以复现的。I confirm that the bug replication steps, code change instructions, and environment information have been provided, and the problem can be reproduced.
required: true
- type: checkboxes
attributes:
label: 是否愿意提交PR Are you willing to submit a PR?
description: >
(可选项) 如果你对修复bug有自己的想法十分鼓励提交[Pull Request](https://github.com/PaddlePaddle/PaddleDetection/pulls)共同提升PaddleDetection。
(Optional) We encourage you to submit a [Pull Request](https://github.com/PaddlePaddle/PaddleDetection/pulls) (PR) to help improve PaddleDetection for everyone, especially if you have a good understanding of how to implement a fix or feature.
options:
- label: 我愿意提交PRI'd like to help by submitting a PR!
- type: markdown
attributes:
value: >
感谢你的贡献 🎉! Thanks for your contribution 🎉!

View File

@@ -0,0 +1,50 @@
name: 🚀 新需求 Feature Request
description: 提交一个你对PaddleDetection的新需求。 Submit a request for a new Paddle feature.
labels: [type/feature-request, status/new-issue]
body:
- type: markdown
attributes:
value: >
#### 你可以在这里提出你对PaddleDetection的新需求包括但不限于功能或模型缺失、功能不全或无法使用、精度/性能不符合预期等。
#### You could submit a request for a new feature here, including but not limited to: new features or models, incomplete or unusable features, accuracy/performance not as expected, etc.
- type: checkboxes
attributes:
label: 问题确认 Search before asking
description: >
在向PaddleDetection提新需求之前请先查询[历史issue](https://github.com/PaddlePaddle/PaddleDetection/issues)是否报过同样的需求。
Before submitting a feature request, please make sure the issue hasn't been already addressed by searching through [the existing and past issues](https://github.com/PaddlePaddle/PaddleDetection/issues).
options:
- label: >
我已经查询[历史issue](https://github.com/PaddlePaddle/PaddleDetection/issues)没有类似需求。I have searched the [issues](https://github.com/PaddlePaddle/PaddleDetection/issues) and found no similar feature requests.
required: true
- type: textarea
id: description
attributes:
label: 需求描述 Feature Description
description: |
请尽可能包含任务目标、需求场景、功能描述等信息,全面的信息有利于我们准确评估你的需求。
Please include as much information as possible, such as mission objectives, requirement scenarios, functional descriptions, etc. Comprehensive information will help us accurately assess your feature request.
value: "1. 任务目标(请描述你正在做的项目是什么,如模型、论文、项目是什么?); 2. 需求场景(请描述你的项目中为什么需要用此功能); 3. 功能描述(请简单描述或设计这个功能)"
validations:
required: true
- type: checkboxes
attributes:
label: 是否愿意提交PR Are you willing to submit a PR?
description: >
(可选)如果你对新feature有自己的想法十分鼓励提交[Pull Request](https://github.com/PaddlePaddle/PaddleDetection/pulls)共同提升PaddleDetection
(Optional) We encourage you to submit a [Pull Request](https://github.com/PaddlePaddle/PaddleDetection/pulls) (PR) to help improve PaddleDetection for everyone, especially if you have a good understanding of how to implement a fix or feature.
options:
- label: Yes I'd like to help by submitting a PR!
- type: markdown
attributes:
value: >
感谢你的贡献 🎉! Thanks for your contribution 🎉!

View File

@@ -0,0 +1,38 @@
name: 📚 文档 Documentation Issue
description: 反馈一个官网文档错误。 Report an issue related to https://github.com/PaddlePaddle/PaddleDetection.
labels: [type/docs, status/new-issue]
body:
- type: markdown
attributes:
value: >
#### 请确认反馈的问题来自PaddlePaddle官网文档https://github.com/PaddlePaddle/PaddleDetection 。
#### Before submitting a Documentation Issue, Please make sure that issue is related to https://github.com/PaddlePaddle/PaddleDetection.
- type: textarea
id: link
attributes:
label: 文档链接&描述 Document Links & Description
description: |
请说明有问题的文档链接以及该文档存在的问题。
Please fill in the link to the document and describe the question.
validations:
required: true
- type: textarea
id: error
attributes:
label: 请提出你的建议 Please give your suggestion
description: |
请告诉我们你希望如何改进这个文档。或者你可以提个PR修复这个问题。
Please tell us how you would like to improve this document. Or you can submit a PR to fix this problem.
validations:
required: false
- type: markdown
attributes:
value: >
感谢你的贡献 🎉! Thanks for your contribution 🎉!

View File

@@ -0,0 +1,37 @@
name: 🙋🏼‍♀️🙋🏻‍♂️提问 Ask a Question
description: 提出一个使用/咨询问题。 Ask a usage or consultation question.
labels: [type/question, status/new-issue]
body:
- type: checkboxes
attributes:
label: 问题确认 Search before asking
description: >
#### 你可以在这里提出一个使用/咨询问题,提问之前请确保:
- 1已经百度/谷歌搜索过你的问题,但是没有找到解答;
- 2已经在官网查询过[教程文档](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/docs/tutorials/GETTING_STARTED_cn.md)与[FAQ](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4/docs/tutorials/FAQ),但是没有找到解答;
- 3已经在[历史issue](https://github.com/PaddlePaddle/PaddleDetection/issues)中搜索过没有找到同类issue或issue未被解答。
#### You could ask a usage or consultation question here, before your start, please make sure:
- 1) You have searched your question on Baidu/Google, but found no answer;
- 2) You have checked the [tutorials](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/docs/tutorials/GETTING_STARTED.md), but found no answer;
- 3) You have searched [the existing and past issues](https://github.com/PaddlePaddle/PaddleDetection/issues), but found no similar issue or the issue has not been answered.
options:
- label: >
我已经搜索过问题但是没有找到解答。I have searched the question and found no related answer.
required: true
- type: textarea
id: question
attributes:
label: 请提出你的问题 Please ask your question
validations:
required: true

View File

@@ -0,0 +1,23 @@
name: 🧩 其他 Others
description: 提出其他问题。 Report any other non-support related issues.
labels: [type/others, status/new-issue]
body:
- type: markdown
attributes:
value: >
#### 你可以在这里提出任何前面几类模板不适用的问题,包括但不限于:优化性建议、框架使用体验反馈、版本兼容性问题、报错信息不清楚等。
#### You can report any issues that are not applicable to the previous types of templates, including but not limited to: enhancement suggestions, feedback on the use of the framework, version compatibility issues, unclear error information, etc.
- type: textarea
id: others
attributes:
label: 问题描述 Please describe your issue
validations:
required: true
- type: markdown
attributes:
value: >
感谢你的贡献 🎉! Thanks for your contribution 🎉!

paddle_detection/.gitignore vendored Normal file
View File

@@ -0,0 +1,88 @@
# Virtualenv
/.venv/
/venv/
# Byte-compiled / optimized / DLL files
__pycache__/
.ipynb_checkpoints/
*.py[cod]
# C extensions
*.so
# json file
*.json
# log file
*.log
# Distribution / packaging
/bin/
*build/
/develop-eggs/
*dist/
/eggs/
/lib/
/lib64/
/output/
/inference_model/
/output_inference/
/parts/
/sdist/
/var/
*.egg-info/
/.installed.cfg
/*.egg
/.eggs
# AUTHORS and ChangeLog will be generated while packaging
/AUTHORS
/ChangeLog
# BCloud / BuildSubmitter
/build_submitter.*
/logger_client_log
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
.tox/
.coverage
.cache
.pytest_cache
nosetests.xml
coverage.xml
# Translations
*.mo
# Sphinx documentation
/docs/_build/
*.tar
*.pyc
.idea/
dataset/coco/annotations
dataset/coco/train2017
dataset/coco/val2017
dataset/voc/VOCdevkit
dataset/fruit/fruit-detection/
dataset/voc/test.txt
dataset/voc/trainval.txt
dataset/wider_face/WIDER_test
dataset/wider_face/WIDER_train
dataset/wider_face/WIDER_val
dataset/wider_face/wider_face_split
ppdet/version.py
# NPU meta folder
kernel_meta/
# MAC
*.DS_Store

View File

@@ -0,0 +1,44 @@
- repo: https://github.com/PaddlePaddle/mirrors-yapf.git
sha: 0d79c0c469bab64f7229c9aca2b1186ef47f0e37
hooks:
- id: yapf
files: \.py$
- repo: https://github.com/pre-commit/pre-commit-hooks
sha: a11d9314b22d8f8c7556443875b731ef05965464
hooks:
- id: check-merge-conflict
- id: check-symlinks
- id: detect-private-key
files: (?!.*paddle)^.*$
- id: end-of-file-fixer
files: \.(md|yml)$
- id: trailing-whitespace
files: \.(md|yml)$
- repo: https://github.com/Lucas-C/pre-commit-hooks
sha: v1.0.1
hooks:
- id: forbid-crlf
files: \.(md|yml)$
- id: remove-crlf
files: \.(md|yml)$
- id: forbid-tabs
files: \.(md|yml)$
- id: remove-tabs
files: \.(md|yml)$
- repo: local
hooks:
- id: clang-format-with-version-check
name: clang-format
description: Format files with ClangFormat.
entry: bash ./.travis/codestyle/clang_format.hook -i
language: system
files: \.(c|cc|cxx|cpp|cu|h|hpp|hxx|proto)$
- repo: local
hooks:
- id: cpplint-cpp-source
name: cpplint
description: Check C++ code style using cpplint.py.
entry: bash ./.travis/codestyle/cpplint_pre_commit.hook
language: system
files: \.(c|cc|cxx|cpp|cu|h|hpp|hxx)$

View File

@@ -0,0 +1,3 @@
[style]
based_on_style = pep8
column_limit = 80

View File

@@ -0,0 +1,35 @@
language: cpp
cache: ccache
sudo: required
dist: trusty
services:
- docker
os:
- linux
env:
- JOB=PRE_COMMIT
addons:
apt:
packages:
- git
- python
- python-pip
- python2.7-dev
ssh_known_hosts: 13.229.163.131
before_install:
- sudo pip install -U virtualenv pre-commit pip -i https://pypi.tuna.tsinghua.edu.cn/simple
- docker pull paddlepaddle/paddle:latest
- git pull https://github.com/PaddlePaddle/PaddleDetection develop
script:
- exit_code=0
- .travis/precommit.sh || exit_code=$(( exit_code | $? ))
# - docker run -i --rm -v "$PWD:/py_unittest" paddlepaddle/paddle:latest /bin/bash -c
# 'cd /py_unittest; sh .travis/unittest.sh' || exit_code=$(( exit_code | $? ))
- if [ $exit_code -eq 0 ]; then true; else exit 1; fi;
notifications:
email:
on_success: change
on_failure: always

View File

@@ -0,0 +1,4 @@
#!/bin/bash
set -e
clang-format $@

View File

@@ -0,0 +1,27 @@
#!/bin/bash
TOTAL_ERRORS=0
if [[ ! $TRAVIS_BRANCH ]]; then
# install cpplint on local machine.
if [[ ! $(which cpplint) ]]; then
pip install cpplint
fi
# diff files on local machine.
files=$(git diff --cached --name-status | awk '$1 != "D" {print $2}')
else
# diff files between PR and latest commit on Travis CI.
branch_ref=$(git rev-parse "$TRAVIS_BRANCH")
head_ref=$(git rev-parse HEAD)
files=$(git diff --name-status $branch_ref $head_ref | awk '$1 != "D" {print $2}')
fi
# The trick to remove deleted files: https://stackoverflow.com/a/2413151
for file in $files; do
if [[ $file =~ ^(patches/.*) ]]; then
continue;
else
cpplint --filter=-readability/fn_size,-build/include_what_you_use,-build/c++11 $file;
TOTAL_ERRORS=$(expr $TOTAL_ERRORS + $?);
fi
done
exit $TOTAL_ERRORS

View File

@@ -0,0 +1,21 @@
#!/bin/bash
function abort(){
echo "Your commit not fit PaddlePaddle code style" 1>&2
echo "Please use pre-commit scripts to auto-format your code" 1>&2
exit 1
}
trap 'abort' 0
set -e
cd `dirname $0`
cd ..
export PATH=/usr/bin:$PATH
pre-commit install
if ! pre-commit run -a ; then
ls -lh
git diff --exit-code
exit 1
fi
trap : 0

View File

@@ -0,0 +1,8 @@
# add python requirements for unit tests here; note that installing
# pycocotools directly is not supported in travis ci, so it is installed
# by compiling from source in unittest.sh
tqdm
cython
shapely
llvmlite==0.33
numba==0.50

View File

@@ -0,0 +1,47 @@
#!/bin/bash
abort(){
echo "Run unittest failed" 1>&2
echo "Please check your code" 1>&2
echo " 1. you can run unit tests by 'bash .travis/unittest.sh' locally" 1>&2
echo " 2. you can add python requirements in .travis/requirements.txt if you use new requirements in unit tests" 1>&2
exit 1
}
unittest(){
if [ $? != 0 ]; then
exit 1
fi
find "./ppdet" -name 'tests' -type d -print0 | \
xargs -0 -I{} -n1 bash -c \
'python -m unittest discover -v -s {}'
}
trap 'abort' 0
set -e
# install travis python dependencies exclude pycocotools
if [ -f ".travis/requirements.txt" ]; then
pip install -r .travis/requirements.txt
fi
# install pycocotools
if [ `pip list | grep pycocotools | wc -l` -eq 0 ]; then
# install git if needed
if [ -z "`which git`" ]; then
apt-get update
apt-get install -y git
fi;
git clone https://github.com/cocodataset/cocoapi.git
cd cocoapi/PythonAPI
make install
python setup.py install --user
cd ../..
rm -rf cocoapi
fi
export PYTHONPATH=`pwd`:$PYTHONPATH
unittest .
trap : 0

paddle_detection/LICENSE Normal file
View File

@@ -0,0 +1,201 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

paddle_detection/README.md Symbolic link
View File

@@ -0,0 +1 @@
README_cn.md

View File

@@ -0,0 +1,878 @@
简体中文 | [English](README_en.md)
<div align="center">
<p align="center">
<img src="https://user-images.githubusercontent.com/48054808/160532560-34cf7a1f-d950-435e-90d2-4b0a679e5119.png" align="middle" width = "800" />
</p>
<p align="center">
<a href="./LICENSE"><img src="https://img.shields.io/badge/license-Apache%202-dfd.svg"></a>
<a href="https://github.com/PaddlePaddle/PaddleDetection/releases"><img src="https://img.shields.io/github/v/release/PaddlePaddle/PaddleDetection?color=ffa"></a>
<a href=""><img src="https://img.shields.io/badge/python-3.7+-aff.svg"></a>
<a href=""><img src="https://img.shields.io/badge/os-linux%2C%20win%2C%20mac-pink.svg"></a>
<a href="https://github.com/PaddlePaddle/PaddleDetection/stargazers"><img src="https://img.shields.io/github/stars/PaddlePaddle/PaddleDetection?color=ccf"></a>
</p>
</div>
## 💌目录
- [💌目录](#目录)
- [🌈简介](#简介)
- [📣最新进展](#最新进展)
- [👫开源社区](#开源社区)
- [✨主要特性](#主要特性)
- [🧩模块化设计](#模块化设计)
- [📱丰富的模型库](#丰富的模型库)
- [🎗️产业特色模型|产业工具](#️产业特色模型产业工具)
- [💡🏆产业级部署实践](#产业级部署实践)
- [🍱安装](#安装)
- [🔥教程](#教程)
- [🔑FAQ](#faq)
- [🧩模块组件](#模块组件)
- [📱模型库](#模型库)
- [⚖️模型性能对比](#️模型性能对比)
- [🖥️服务器端模型性能对比](#️服务器端模型性能对比)
- [⌚️移动端模型性能对比](#️移动端模型性能对比)
- [🎗️产业特色模型|产业工具](#️产业特色模型产业工具-1)
- [💎PP-YOLOE 高精度目标检测模型](#pp-yoloe-高精度目标检测模型)
- [💎PP-YOLOE-R 高性能旋转框检测模型](#pp-yoloe-r-高性能旋转框检测模型)
- [💎PP-YOLOE-SOD 高精度小目标检测模型](#pp-yoloe-sod-高精度小目标检测模型)
- [💫PP-PicoDet 超轻量实时目标检测模型](#pp-picodet-超轻量实时目标检测模型)
- [📡PP-Tracking 实时多目标跟踪系统](#pp-tracking-实时多目标跟踪系统)
- [PP-TinyPose 人体骨骼关键点识别](#pp-tinypose-人体骨骼关键点识别)
- [🏃🏻PP-Human 实时行人分析工具](#pp-human-实时行人分析工具)
- [🏎PP-Vehicle 实时车辆分析工具](#pp-vehicle-实时车辆分析工具)
- [💡产业实践范例](#产业实践范例)
- [🏆企业应用案例](#企业应用案例)
- [📝许可证书](#许可证书)
- [📌引用](#引用)
## 🌈简介
PaddleDetection是一个基于PaddlePaddle的目标检测端到端开发套件,在提供丰富的模型组件和测试基准的同时,注重端到端的产业落地应用,通过打造产业级特色模型|工具、建设产业应用范例等手段,帮助开发者实现数据准备、模型选型、模型训练、模型部署的全流程打通,快速进行落地应用。
主要模型效果示例如下(点击标题可快速跳转):
| [**通用目标检测**](#pp-yoloe-高精度目标检测模型) | [**小目标检测**](#pp-yoloe-sod-高精度小目标检测模型) | [**旋转框检测**](#pp-yoloe-r-高性能旋转框检测模型) | [**3D目标物检测**](https://github.com/PaddlePaddle/Paddle3D) |
| :--------------------------------------------------------------------------------------------------------------------------------------------: | :--------------------------------------------------------------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------------------------: | :--------------------------------------------------------------------------------------------------------------------------------------------: |
| <img src='https://user-images.githubusercontent.com/61035602/206095864-f174835d-4e9a-42f7-96b8-d684fc3a3687.png' height="126px" width="180px"> | <img src='https://user-images.githubusercontent.com/61035602/206095892-934be83a-f869-4a31-8e52-1074184149d1.jpg' height="126px" width="180px"> | <img src='https://user-images.githubusercontent.com/61035602/206111796-d9a9702a-c1a0-4647-b8e9-3e1307e9d34c.png' height="126px" width="180px"> | <img src='https://user-images.githubusercontent.com/61035602/206095622-cf6dbd26-5515-472f-9451-b39bbef5b1bf.gif' height="126px" width="180px"> |
| [**人脸检测**](#模型库) | [**2D关键点检测**](#pp-tinypose-人体骨骼关键点识别) | [**多目标追踪**](#pp-tracking-实时多目标跟踪系统) | [**实例分割**](#模型库) |
| <img src='https://user-images.githubusercontent.com/61035602/206095684-72f42233-c9c7-4bd8-9195-e34859bd08bf.jpg' height="126px" width="180px"> | <img src='https://user-images.githubusercontent.com/61035602/206100220-ab01d347-9ff9-4f17-9718-290ec14d4205.gif' height="126px" width="180px"> | <img src='https://user-images.githubusercontent.com/61035602/206111753-836e7827-968e-4c80-92ef-7a78766892fc.gif' height="126px" width="180px" > | <img src='https://user-images.githubusercontent.com/61035602/206095831-cc439557-1a23-4a99-b6b0-b6f2e97e8c57.jpg' height="126px" width="180px"> |
| [**车辆分析——车牌识别**](#pp-vehicle-实时车辆分析工具) | [**车辆分析——车流统计**](#pp-vehicle-实时车辆分析工具) | [**车辆分析——违章检测**](#pp-vehicle-实时车辆分析工具) | [**车辆分析——属性分析**](#pp-vehicle-实时车辆分析工具) |
| <img src='https://user-images.githubusercontent.com/61035602/206099328-2a1559e0-3b48-4424-9bad-d68f9ba5ba65.gif' height="126px" width="180px"> | <img src='https://user-images.githubusercontent.com/61035602/206095918-d0e7ad87-7bbb-40f1-bcc1-37844e2271ff.gif' height="126px" width="180px"> | <img src='https://user-images.githubusercontent.com/61035602/206100295-7762e1ab-ffce-44fb-b69d-45fb93657fa0.gif' height="126px" width="180px" > | <img src='https://user-images.githubusercontent.com/61035602/206095905-8255776a-d8e6-4af1-b6e9-8d9f97e5059d.gif' height="126px" width="180px"> |
| [**行人分析——闯入分析**](#pp-human-实时行人分析工具) | [**行人分析——行为分析**](#pp-human-实时行人分析工具) | [**行人分析——属性分析**](#pp-human-实时行人分析工具) | [**行人分析——人流统计**](#pp-human-实时行人分析工具) |
| <img src='https://user-images.githubusercontent.com/61035602/206095792-ae0ac107-cd8e-492a-8baa-32118fc82b04.gif' height="126px" width="180px"> | <img src='https://user-images.githubusercontent.com/61035602/206095778-fdd73e5d-9f91-48c7-9d3d-6f2e02ec3f79.gif' height="126px" width="180px"> | <img src='https://user-images.githubusercontent.com/61035602/206095709-2c3a209e-6626-45dd-be16-7f0bf4d48a14.gif' height="126px" width="180px"> | <img src="https://user-images.githubusercontent.com/61035602/206113351-cc59df79-8672-4d76-b521-a15acf69ae78.gif" height="126px" width="180px"> |
同时PaddleDetection提供了模型的在线体验功能用户可以选择自己的数据进行在线推理。
`说明`:考虑到服务器负载压力,在线推理均为CPU推理。完整的模型开发实例以及产业部署实践代码示例,请前往[🎗️产业特色模型|产业工具](#️产业特色模型产业工具-1)。
`传送门`[模型在线体验](https://www.paddlepaddle.org.cn/models)
<div align="center">
<p align="center">
<img src="https://user-images.githubusercontent.com/61035602/206896755-bd0cd498-1149-4e94-ae30-da590ea78a7a.gif" align="middle"/>
</p>
</div>
## 📣最新进展
💥 2024.6.27 **飞桨低代码开发工具 [PaddleX 3.0](https://github.com/paddlepaddle/paddlex) 重磅更新!**
- 低代码开发范式:支持目标检测模型全流程低代码开发,提供 Python API支持用户自定义串联模型
- 多硬件训推支持:支持英伟达 GPU、昆仑芯、昇腾和寒武纪等多种硬件进行模型训练与推理。
**🔥超越YOLOv8飞桨推出精度最高的实时检测器RT-DETR**
<div align="center">
<img src="https://github.com/PaddlePaddle/PaddleDetection/assets/17582080/196b0a10-d2e8-401c-9132-54b9126e0a33" height = "500" caption='' />
<p></p>
</div>
- `RT-DETR解读文章传送门`
- [《超越YOLOv8飞桨推出精度最高的实时检测器RT-DETR](https://mp.weixin.qq.com/s/o03QM2rZNjHVto36gcV0Yw)
- `代码传送门`[RT-DETR](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/rtdetr)
## 👫开源社区
- **📑项目合作:** 如果您是企业开发者且有明确的目标检测垂类应用需求,请扫描如下二维码入群,并联系`群管理员AI`后可免费与官方团队展开不同层次的合作。
- **🏅️社区贡献:** PaddleDetection非常欢迎你加入到飞桨社区的开源建设中参与贡献方式可以参考[开源项目开发指南](docs/contribution/README.md)。
- **💻直播教程:** PaddleDetection会定期在飞桨直播间([B站:飞桨PaddlePaddle](https://space.bilibili.com/476867757)、[微信: 飞桨PaddlePaddle](https://mp.weixin.qq.com/s/6ji89VKqoXDY6SSGkxS8NQ)),针对发新内容、以及产业范例、使用教程等进行直播分享。
- **🎁加入社区:** **微信扫描二维码并填写问卷之后,可以及时获取如下信息,包括:**
- 社区最新文章、直播课等活动预告
- 往期直播录播&PPT
- 30+行人车辆等垂类高性能预训练模型
- 七大任务开源数据集下载链接汇总
- 40+前沿检测领域顶会算法
- 15+从零上手目标检测理论与实践视频课程
- 10+工业安防交通全流程项目实操(含源码)
<div align="center">
<img src="https://github.com/PaddlePaddle/PaddleDetection/assets/22989727/0466954b-ab4d-4984-bd36-796c37f0ee9c" width = "150" height = "150",caption='' />
<p>PaddleDetection官方交流群二维码</p>
</div>
## 📖 技术交流合作
- 飞桨低代码开发工具PaddleX—— 面向国内外主流AI硬件的飞桨精选模型一站式开发工具。包含如下核心优势
- 【产业高精度模型库】覆盖10个主流AI任务 40+精选模型,丰富齐全。
- 【特色模型产线】:提供融合大小模型的特色模型产线,精度更高,效果更好。
- 【低代码开发模式】:图形化界面支持统一开发范式,便捷高效。
- 【私有化部署多硬件支持】适配国内外主流AI硬件支持本地纯离线使用满足企业安全保密需要。
- PaddleX官网地址https://aistudio.baidu.com/intro/paddlex
- PaddleX官方交流频道https://aistudio.baidu.com/community/channel/610
- **🎈社区近期活动**
- **🔥PaddleDetection v2.6版本更新解读**
<div align="center">
<img src="https://user-images.githubusercontent.com/61035602/224244188-da8495fc-eea9-432f-bc2d-6f0144c2dde9.png" height = "250" caption='' />
<p></p>
</div>
- `v2.6版本版本更新解读文章传送门`[《PaddleDetection v2.6发布目标小数据缺标注累泛化差PP新员逐一应对](https://mp.weixin.qq.com/s/SLITj5k120d_fQc7jEO8Vw)
- **🏆半监督检测**
- `文章传送门`[CVPR 2023 | 单阶段半监督目标检测SOTAARSL](https://mp.weixin.qq.com/s/UZLIGL6va2KBfofC-nKG4g)
- `代码传送门`[ARSL](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/semi_det)
<div align="center">
<img src="https://user-images.githubusercontent.com/61035602/230522850-21873665-ba79-4f8d-8dce-43d736111df8.png" height = "250" caption='' />
<p></p>
</div>
- **👀YOLO系列专题**
- `文章传送门`[YOLOv8来啦YOLO内卷期模型怎么选9+款AI硬件如何快速部署深度解析](https://mp.weixin.qq.com/s/rPwprZeHEpmGOe5wxrmO5g)
- `代码传送门`[PaddleYOLO全系列](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/docs/feature_models/PaddleYOLO_MODEL.md)
<div align="center">
<img src="https://user-images.githubusercontent.com/61035602/213202797-3a1b24f3-53c0-4094-bb31-db2f84438fbc.jpeg" height = "250" caption='' />
<p></p>
</div>
- **🎯少目标迁移学习专题**
- `文章传送门`[囿于数据少泛化性差PaddleDetection少样本迁移学习助你一键突围](https://mp.weixin.qq.com/s/dFEQoxSzVCOaWVZPb3N7WA)
- **⚽2022卡塔尔世界杯专题**
- `文章传送门`[世界杯决赛号角吹响趁周末来搭一套足球3D+AI量化分析系统吧](https://mp.weixin.qq.com/s/koJxjWDPBOlqgI-98UsfKQ)
<div align="center">
<img src="https://user-images.githubusercontent.com/61035602/208036574-f151a7ff-a5f1-4495-9316-a47218a6576b.gif" height = "250" caption='' />
<p></p>
</div>
- **🔍旋转框小目标检测专题**
- `文章传送门`[Yes, PP-YOLOE80.73mAP、38.5mAP旋转框、小目标检测能力双SOTA](https://mp.weixin.qq.com/s/6ji89VKqoXDY6SSGkxS8NQ)
<div align="center">
<img src="https://user-images.githubusercontent.com/61035602/208037368-5b9f01f7-afd9-46d8-bc80-271ccb5db7bb.png" height = "220" caption='' />
<p></p>
</div>
- **🎊YOLO Vision世界学术交流大会**
- **PaddleDetection**受邀参与首个以**YOLO为主题**的**YOLO-VISION**世界大会与全球AI领先开发者学习交流。
- `活动链接传送门`[YOLO-VISION](https://ultralytics.com/yolo-vision)
<div align="center">
<img src="https://user-images.githubusercontent.com/48054808/192301374-940cf2fa-9661-419b-9c46-18a4570df381.jpeg" width="400"/>
</div>
- **🏅️社区贡献**
- `活动链接传送门`[Yes, PP-YOLOE! 基于PP-YOLOE的算法开发](https://github.com/PaddlePaddle/PaddleDetection/issues/7345)
## ✨主要特性
#### 🧩模块化设计
PaddleDetection将检测模型解耦成不同的模块组件通过自定义模块组件组合用户可以便捷高效地完成检测模型的搭建。`传送门`[🧩模块组件](#模块组件)。
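下面给出一个极简的组合示意(仅为基于 `ppdet.core.workspace` 常见用法的草图,具体接口与配置字段以实际安装版本为准),展示配置文件如何把各模块组件拼装成完整检测模型:
```python
# 示意:通过配置文件组合 backbone/neck/head 等模块组件(接口以实际版本为准)
from ppdet.core.workspace import load_config, create

cfg = load_config('configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml')  # 读取模型配置
model = create(cfg.architecture)  # 按配置中的 architecture 字段实例化组装好的检测模型
print(type(model))
```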
#### 📱丰富的模型库
PaddleDetection支持大量的最新主流的算法基准以及预训练模型涵盖2D/3D目标检测、实例分割、人脸检测、关键点检测、多目标跟踪、半监督学习等方向。`传送门`[📱模型库](#模型库)、[⚖️模型性能对比](#️模型性能对比)。
#### 🎗️产业特色模型|产业工具
PaddleDetection打造产业级特色模型以及分析工具PP-YOLOE+、PP-PicoDet、PP-TinyPose、PP-HumanV2、PP-Vehicle等针对通用、高频垂类应用场景提供深度优化解决方案以及高度集成的分析工具降低开发者的试错、选择成本针对业务场景快速应用落地。`传送门`[🎗️产业特色模型|产业工具](#️产业特色模型产业工具-1)。
#### 💡🏆产业级部署实践
PaddleDetection整理工业、农业、林业、交通、医疗、金融、能源电力等AI应用范例打通数据标注-模型训练-模型调优-预测部署全流程,持续降低目标检测技术产业落地门槛。`传送门`[💡产业实践范例](#产业实践范例)、[🏆企业应用案例](#企业应用案例)。
<div align="center">
<p align="center">
<img src="https://user-images.githubusercontent.com/61035602/206431371-912a14c8-ce1e-48ec-ae6f-7267016b308e.png" align="middle" width="1280"/>
</p>
</div>
## 🍱安装
参考[安装说明](docs/tutorials/INSTALL_cn.md)进行安装。
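安装完成后,可用如下片段做一次简单自检(示意性代码,假设已按上述文档安装好 PaddlePaddle 与 PaddleDetection
```python
# 环境自检示意:确认 PaddlePaddle 可用、ppdet 可正常导入
import paddle
import ppdet  # noqa: F401

paddle.utils.run_check()                     # 检查 PaddlePaddle 安装与 GPU 是否可用
print('paddle version:', paddle.__version__)
```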
## 🔥教程
**深度学习入门教程**
- [零基础入门深度学习](https://www.paddlepaddle.org.cn/tutorials/projectdetail/4676538)
- [零基础入门目标检测](https://aistudio.baidu.com/aistudio/education/group/info/1617)
**快速开始**
- [快速体验](docs/tutorials/QUICK_STARTED_cn.md)
- [示例30分钟快速开发交通标志检测模型](docs/tutorials/GETTING_STARTED_cn.md)
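上述文档以命令行脚本tools/train.py为主这里补充一个与其核心逻辑等价的 Python 级训练草图(仅为示意,接口以实际安装版本为准,且假设已按文档准备好数据):
```python
# 训练流程的极简示意tools/train.py 的核心步骤大致如此)
from ppdet.core.workspace import load_config
from ppdet.engine import Trainer

cfg = load_config('configs/ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml')
trainer = Trainer(cfg, mode='train')
if cfg.get('pretrain_weights', None):
    trainer.load_weights(cfg.pretrain_weights)  # 加载配置中指定的预训练权重
trainer.train(validate=True)                    # 边训练边在验证集上评估
```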
**数据准备**
- [数据准备](docs/tutorials/data/README.md)
- [数据处理模块](docs/advanced_tutorials/READER.md)
**配置文件说明**
- [RCNN参数说明](docs/tutorials/config_annotation/faster_rcnn_r50_fpn_1x_coco_annotation.md)
- [PP-YOLO参数说明](docs/tutorials/config_annotation/ppyolo_r50vd_dcn_1x_coco_annotation.md)
**模型开发**
- [新增检测模型](docs/advanced_tutorials/MODEL_TECHNICAL.md)
- 二次开发
- [目标检测](docs/advanced_tutorials/customization/detection.md)
- [关键点检测](docs/advanced_tutorials/customization/keypoint_detection.md)
- [多目标跟踪](docs/advanced_tutorials/customization/pphuman_mot.md)
- [行为识别](docs/advanced_tutorials/customization/action_recognotion/)
- [属性识别](docs/advanced_tutorials/customization/pphuman_attribute.md)
**部署推理**
- [模型导出教程](deploy/EXPORT_MODEL.md)
- [模型压缩](https://github.com/PaddlePaddle/PaddleSlim)
- [剪裁/量化/蒸馏教程](configs/slim)
- [Paddle Inference部署](deploy/README.md)
- [Python端推理部署](deploy/python)
- [C++端推理部署](deploy/cpp)
- [Paddle Lite部署](deploy/lite)
- [Paddle Serving部署](deploy/serving)
- [ONNX模型导出](deploy/EXPORT_ONNX_MODEL.md)
- [推理benchmark](deploy/BENCHMARK_INFER.md)
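按上述文档用 tools/export_model.py 导出推理模型后,可以用 Paddle Inference 的 Python API 加载部署。下面是一个最小示意(模型路径为假设值,输入名的个数与顺序以实际导出的模型为准):
```python
# Paddle Inference 加载 PaddleDetection 导出模型的最小示意
import numpy as np
from paddle.inference import Config, create_predictor

model_dir = 'output_inference/ppyoloe_plus_crn_s_80e_coco'   # 假设的导出目录
config = Config(model_dir + '/model.pdmodel', model_dir + '/model.pdiparams')
config.enable_use_gpu(200, 0)        # 使用 0 号 GPU无 GPU 时可去掉此行
predictor = create_predictor(config)

# 这里用随机张量代替真实预处理结果PP-YOLOE 系列导出模型通常需要 image 与 scale_factor 两个输入
inputs = {'image': np.random.rand(1, 3, 640, 640).astype('float32'),
          'scale_factor': np.ones((1, 2), dtype='float32')}
for name in predictor.get_input_names():
    handle = predictor.get_input_handle(name)
    handle.reshape(inputs[name].shape)
    handle.copy_from_cpu(inputs[name])
predictor.run()
boxes = predictor.get_output_handle(predictor.get_output_names()[0]).copy_to_cpu()
print(boxes.shape)  # 通常为 [N, 6]:类别 id、置信度、x1、y1、x2、y2
```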
## 🔑FAQ
- [FAQ/常见问题汇总](docs/tutorials/FAQ)
## 🧩模块组件
<table align="center">
<tbody>
<tr align="center" valign="center">
<td>
<b>Backbones</b>
</td>
<td>
<b>Necks</b>
</td>
<td>
<b>Loss</b>
</td>
<td>
<b>Common</b>
</td>
<td>
<b>Data Augmentation</b>
</td>
</tr>
<tr valign="top">
<td>
<ul>
<li><a href="ppdet/modeling/backbones/resnet.py">ResNet</a></li>
<li><a href="ppdet/modeling/backbones/res2net.py">CSPResNet</a></li>
<li><a href="ppdet/modeling/backbones/senet.py">SENet</a></li>
<li><a href="ppdet/modeling/backbones/res2net.py">Res2Net</a></li>
<li><a href="ppdet/modeling/backbones/hrnet.py">HRNet</a></li>
<li><a href="ppdet/modeling/backbones/lite_hrnet.py">Lite-HRNet</a></li>
<li><a href="ppdet/modeling/backbones/darknet.py">DarkNet</a></li>
<li><a href="ppdet/modeling/backbones/csp_darknet.py">CSPDarkNet</a></li>
<li><a href="ppdet/modeling/backbones/mobilenet_v1.py">MobileNetV1</a></li>
<li><a href="ppdet/modeling/backbones/mobilenet_v3.py">MobileNetV1</a></li>
<li><a href="ppdet/modeling/backbones/shufflenet_v2.py">ShuffleNetV2</a></li>
<li><a href="ppdet/modeling/backbones/ghostnet.py">GhostNet</a></li>
<li><a href="ppdet/modeling/backbones/blazenet.py">BlazeNet</a></li>
<li><a href="ppdet/modeling/backbones/dla.py">DLA</a></li>
<li><a href="ppdet/modeling/backbones/hardnet.py">HardNet</a></li>
<li><a href="ppdet/modeling/backbones/lcnet.py">LCNet</a></li>
<li><a href="ppdet/modeling/backbones/esnet.py">ESNet</a></li>
<li><a href="ppdet/modeling/backbones/swin_transformer.py">Swin-Transformer</a></li>
<li><a href="ppdet/modeling/backbones/convnext.py">ConvNeXt</a></li>
<li><a href="ppdet/modeling/backbones/vgg.py">VGG</a></li>
<li><a href="ppdet/modeling/backbones/vision_transformer.py">Vision Transformer</a></li>
<li><a href="configs/convnext">ConvNext</a></li>
</ul>
</td>
<td>
<ul>
<li><a href="ppdet/modeling/necks/bifpn.py">BiFPN</a></li>
<li><a href="ppdet/modeling/necks/blazeface_fpn.py">BlazeFace-FPN</a></li>
<li><a href="ppdet/modeling/necks/centernet_fpn.py">CenterNet-FPN</a></li>
<li><a href="ppdet/modeling/necks/csp_pan.py">CSP-PAN</a></li>
<li><a href="ppdet/modeling/necks/custom_pan.py">Custom-PAN</a></li>
<li><a href="ppdet/modeling/necks/fpn.py">FPN</a></li>
<li><a href="ppdet/modeling/necks/es_pan.py">ES-PAN</a></li>
<li><a href="ppdet/modeling/necks/hrfpn.py">HRFPN</a></li>
<li><a href="ppdet/modeling/necks/lc_pan.py">LC-PAN</a></li>
<li><a href="ppdet/modeling/necks/ttf_fpn.py">TTF-FPN</a></li>
<li><a href="ppdet/modeling/necks/yolo_fpn.py">YOLO-FPN</a></li>
</ul>
</td>
<td>
<ul>
<li><a href="ppdet/modeling/losses/smooth_l1_loss.py">Smooth-L1</a></li>
<li><a href="ppdet/modeling/losses/detr_loss.py">Detr Loss</a></li>
<li><a href="ppdet/modeling/losses/fairmot_loss.py">Fairmot Loss</a></li>
<li><a href="ppdet/modeling/losses/fcos_loss.py">Fcos Loss</a></li>
<li><a href="ppdet/modeling/losses/gfocal_loss.py">GFocal Loss</a></li>
<li><a href="ppdet/modeling/losses/jde_loss.py">JDE Loss</a></li>
<li><a href="ppdet/modeling/losses/keypoint_loss.py">KeyPoint Loss</a></li>
<li><a href="ppdet/modeling/losses/solov2_loss.py">SoloV2 Loss</a></li>
<li><a href="ppdet/modeling/losses/focal_loss.py">Focal Loss</a></li>
<li><a href="ppdet/modeling/losses/iou_loss.py">GIoU/DIoU/CIoU</a></li>
<li><a href="ppdet/modeling/losses/iou_aware_loss.py">IoUAware</a></li>
<li><a href="ppdet/modeling/losses/sparsercnn_loss.py">SparseRCNN Loss</a></li>
<li><a href="ppdet/modeling/losses/ssd_loss.py">SSD Loss</a></li>
<li><a href="ppdet/modeling/losses/focal_loss.py">YOLO Loss</a></li>
<li><a href="ppdet/modeling/losses/yolo_loss.py">CT Focal Loss</a></li>
<li><a href="ppdet/modeling/losses/varifocal_loss.py">VariFocal Loss</a></li>
</ul>
</td>
<td>
</ul>
<li><b>Post-processing</b></li>
<ul>
<ul>
<li><a href="ppdet/modeling/post_process.py">SoftNMS</a></li>
<li><a href="ppdet/modeling/post_process.py">MatrixNMS</a></li>
</ul>
</ul>
<li><b>Training</b></li>
<ul>
<ul>
<li><a href="tools/train.py#L62">FP16 training</a></li>
<li><a href="docs/tutorials/DistributedTraining_cn.md">Multi-machine training </a></li>
</ul>
</ul>
<li><b>Common</b></li>
<ul>
<ul>
<li><a href="ppdet/modeling/backbones/resnet.py#L41">Sync-BN</a></li>
<li><a href="configs/gn/README.md">Group Norm</a></li>
<li><a href="configs/dcn/README.md">DCNv2</a></li>
<li><a href="ppdet/optimizer/ema.py">EMA</a></li>
</ul>
</td>
<td>
<ul>
<li><a href="ppdet/data/transform/operators.py">Resize</a></li>
<li><a href="ppdet/data/transform/operators.py">Lighting</a></li>
<li><a href="ppdet/data/transform/operators.py">Flipping</a></li>
<li><a href="ppdet/data/transform/operators.py">Expand</a></li>
<li><a href="ppdet/data/transform/operators.py">Crop</a></li>
<li><a href="ppdet/data/transform/operators.py">Color Distort</a></li>
<li><a href="ppdet/data/transform/operators.py">Random Erasing</a></li>
<li><a href="ppdet/data/transform/operators.py">Mixup </a></li>
<li><a href="ppdet/data/transform/operators.py">AugmentHSV</a></li>
<li><a href="ppdet/data/transform/operators.py">Mosaic</a></li>
<li><a href="ppdet/data/transform/operators.py">Cutmix </a></li>
<li><a href="ppdet/data/transform/operators.py">Grid Mask</a></li>
<li><a href="ppdet/data/transform/operators.py">Auto Augment</a></li>
<li><a href="ppdet/data/transform/operators.py">Random Perspective</a></li>
</ul>
</td>
</tr>
</td>
</tr>
</tbody>
</table>
## 📱模型库
<table align="center">
<tbody>
<tr align="center" valign="center">
<td>
<b>2D Detection</b>
</td>
<td>
<b>Multi Object Tracking</b>
</td>
<td>
<b>KeyPoint Detection</b>
</td>
<td>
<b>Others</b>
</td>
</tr>
<tr valign="top">
<td>
<ul>
<li><a href="configs/faster_rcnn/README.md">Faster RCNN</a></li>
<li><a href="ppdet/modeling/necks/fpn.py">FPN</a></li>
<li><a href="configs/cascade_rcnn/README.md">Cascade-RCNN</a></li>
<li><a href="configs/rcnn_enhance">PSS-Det</a></li>
<li><a href="configs/retinanet/README.md">RetinaNet</a></li>
<li><a href="configs/yolov3/README.md">YOLOv3</a></li>
<li><a href="configs/yolof/README.md">YOLOF</a></li>
<li><a href="configs/yolox/README.md">YOLOX</a></li>
<li><a href="https://github.com/PaddlePaddle/PaddleYOLO/tree/develop/configs/yolov5">YOLOv5</a></li>
<li><a href="https://github.com/PaddlePaddle/PaddleYOLO/tree/develop/configs/yolov6">YOLOv6</a></li>
<li><a href="https://github.com/PaddlePaddle/PaddleYOLO/tree/develop/configs/yolov7">YOLOv7</a></li>
<li><a href="https://github.com/PaddlePaddle/PaddleYOLO/tree/develop/configs/yolov8">YOLOv8</a></li>
<li><a href="https://github.com/PaddlePaddle/PaddleYOLO/tree/develop/configs/rtmdet">RTMDet</a></li>
<li><a href="configs/ppyolo/README_cn.md">PP-YOLO</a></li>
<li><a href="configs/ppyolo#pp-yolo-tiny">PP-YOLO-Tiny</a></li>
<li><a href="configs/picodet">PP-PicoDet</a></li>
<li><a href="configs/ppyolo/README_cn.md">PP-YOLOv2</a></li>
<li><a href="configs/ppyoloe/README_legacy.md">PP-YOLOE</a></li>
<li><a href="configs/ppyoloe/README_cn.md">PP-YOLOE+</a></li>
<li><a href="configs/smalldet">PP-YOLOE-SOD</a></li>
<li><a href="configs/rotate/README.md">PP-YOLOE-R</a></li>
<li><a href="configs/ssd/README.md">SSD</a></li>
<li><a href="configs/centernet">CenterNet</a></li>
<li><a href="configs/fcos">FCOS</a></li>
<li><a href="configs/rotate/fcosr">FCOSR</a></li>
<li><a href="configs/ttfnet">TTFNet</a></li>
<li><a href="configs/tood">TOOD</a></li>
<li><a href="configs/gfl">GFL</a></li>
<li><a href="configs/gfl/gflv2_r50_fpn_1x_coco.yml">GFLv2</a></li>
<li><a href="configs/detr">DETR</a></li>
<li><a href="configs/deformable_detr">Deformable DETR</a></li>
<li><a href="configs/sparse_rcnn">Sparse RCNN</a></li>
</ul>
</td>
<td>
<ul>
<li><a href="configs/mot/jde">JDE</a></li>
<li><a href="configs/mot/fairmot">FairMOT</a></li>
<li><a href="configs/mot/deepsort">DeepSORT</a></li>
<li><a href="configs/mot/bytetrack">ByteTrack</a></li>
<li><a href="configs/mot/ocsort">OC-SORT</a></li>
<li><a href="configs/mot/botsort">BoT-SORT</a></li>
<li><a href="configs/mot/centertrack">CenterTrack</a></li>
</ul>
</td>
<td>
<ul>
<li><a href="configs/keypoint/hrnet">HRNet</a></li>
<li><a href="configs/keypoint/higherhrnet">HigherHRNet</a></li>
<li><a href="configs/keypoint/lite_hrnet">Lite-HRNet</a></li>
<li><a href="configs/keypoint/tiny_pose">PP-TinyPose</a></li>
</ul>
</td>
<td>
</ul>
<li><b>Instance Segmentation</b></li>
<ul>
<ul>
<li><a href="configs/mask_rcnn">Mask RCNN</a></li>
<li><a href="configs/cascade_rcnn">Cascade Mask RCNN</a></li>
<li><a href="configs/solov2">SOLOv2</a></li>
</ul>
</ul>
<li><b>Face Detection</b></li>
<ul>
<ul>
<li><a href="configs/face_detection">BlazeFace</a></li>
</ul>
</ul>
<li><b>Semi-Supervised Detection</b></li>
<ul>
<ul>
<li><a href="configs/semi_det">DenseTeacher</a></li>
</ul>
</ul>
<li><b>3D Detection</b></li>
<ul>
<ul>
<li><a href="https://github.com/PaddlePaddle/Paddle3D">Smoke</a></li>
<li><a href="https://github.com/PaddlePaddle/Paddle3D">CaDDN</a></li>
<li><a href="https://github.com/PaddlePaddle/Paddle3D">PointPillars</a></li>
<li><a href="https://github.com/PaddlePaddle/Paddle3D">CenterPoint</a></li>
<li><a href="https://github.com/PaddlePaddle/Paddle3D">SequeezeSegV3</a></li>
<li><a href="https://github.com/PaddlePaddle/Paddle3D">IA-SSD</a></li>
<li><a href="https://github.com/PaddlePaddle/Paddle3D">PETR</a></li>
</ul>
</ul>
<li><b>Vehicle Analysis Toolbox</b></li>
<ul>
<ul>
<li><a href="deploy/pipeline/README.md">PP-Vehicle</a></li>
</ul>
</ul>
<li><b>Human Analysis Toolbox</b></li>
<ul>
<ul>
<li><a href="deploy/pipeline/README.md">PP-Human</a></li>
<li><a href="deploy/pipeline/README.md">PP-HumanV2</a></li>
</ul>
</ul>
<li><b>Sport Analysis Toolbox</b></li>
<ul>
<ul>
<li><a href="https://github.com/PaddlePaddle/PaddleSports">PP-Sports</a></li>
</ul>
</td>
</tr>
</tbody>
</table>
## ⚖️模型性能对比
#### 🖥️服务器端模型性能对比
各模型结构和骨干网络的代表模型在COCO数据集上精度mAP和单卡Tesla V100上预测速度(FPS)对比图。
<div align="center">
<img src="https://user-images.githubusercontent.com/61035602/206434766-caaa781b-b922-481f-af09-15faac9ed33b.png" width="800"/>
</div>
<details>
<summary><b> 测试说明(点击展开)</b></summary>
- ViT为ViT-Cascade-Faster-RCNN模型COCO数据集mAP高达55.7%
- Cascade-Faster-RCNN为Cascade-Faster-RCNN-ResNet50vd-DCNPaddleDetection将其优化到COCO数据mAP为47.8%时推理速度为20FPS
- PP-YOLOE是对PP-YOLO v2模型的进一步优化L版本在COCO数据集mAP为51.6%Tesla V100预测速度78.1FPS
- PP-YOLOE+是对PP-YOLOE模型的进一步优化L版本在COCO数据集mAP为53.3%Tesla V100预测速度78.1FPS
- YOLOX和YOLOv5均为基于PaddleDetection复现算法YOLOv5代码在[PaddleYOLO](https://github.com/PaddlePaddle/PaddleYOLO)中,参照[PaddleYOLO_MODEL](docs/feature_models/PaddleYOLO_MODEL.md)
- 图中模型均可在[📱模型库](#模型库)中获取
</details>
#### ⌚️移动端模型性能对比
各移动端模型在COCO数据集上精度mAP和高通骁龙865处理器上预测速度(FPS)对比图。
<div align="center">
<img src="https://user-images.githubusercontent.com/61035602/206434741-10460690-8fc3-4084-a11a-16fe4ce2fc85.png" width="550"/>
</div>
<details>
<summary><b> 测试说明(点击展开)</b></summary>
- 测试数据均使用高通骁龙865(4xA77+4xA55)处理器batch size为1, 开启4线程测试测试使用NCNN预测库测试脚本见[MobileDetBenchmark](https://github.com/JiweiMaster/MobileDetBenchmark)
- PP-PicoDet及PP-YOLO-Tiny为PaddleDetection自研模型可在[📱模型库](#模型库)中获取其余模型PaddleDetection暂未提供
</details>
## 🎗️产业特色模型|产业工具
产业特色模型产业工具是PaddleDetection针对产业高频应用场景打造的兼顾精度和速度的模型以及工具箱注重从数据处理-模型训练-模型调优-模型部署的端到端打通,且提供了实际生产环境中的实践范例代码,帮助拥有类似需求的开发者高效地完成产品开发落地应用。
该系列模型工具均以PP前缀命名具体介绍、预训练模型以及产业实践范例代码如下。
### 💎PP-YOLOE 高精度目标检测模型
<details>
<summary><b> 简介(点击展开)</b></summary>
PP-YOLOE是基于PP-YOLOv2的卓越的单阶段Anchor-free模型超越了多种流行的YOLO模型。PP-YOLOE避免了使用诸如Deformable Convolution或者Matrix NMS之类的特殊算子以使其能轻松地部署在多种多样的硬件上。其使用大规模数据集obj365预训练模型进行预训练可以在不同场景数据集上快速调优收敛。
`传送门`[PP-YOLOE说明](configs/ppyoloe/README_cn.md)。
`传送门`[arXiv论文](https://arxiv.org/abs/2203.16250)。
</details>
<details>
<summary><b> 预训练模型(点击展开)</b></summary>
| 模型名称 | COCO精度mAP | V100 TensorRT FP16速度(FPS) | 推荐部署硬件 | 配置文件 | 模型下载 |
| :---------- | :-------------: | :-------------------------: | :----------: | :-----------------------------------------------------: | :-------------------------------------------------------------------------------------: |
| PP-YOLOE+_l | 53.3 | 149.2 | 服务器 | [链接](configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml) | [下载地址](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams) |
`传送门`[全部预训练模型](configs/ppyoloe/README_cn.md)。
</details>
<details>
<summary><b> 产业应用代码示例(点击展开)</b></summary>
| 行业 | 类别 | 亮点 | 文档说明 | 模型下载 |
| ---- | ----------------- | --------------------------------------------------------------------------------------------- | ------------------------------------------------------------- | --------------------------------------------------- |
| 农业 | 农作物检测 | 用于葡萄栽培中基于图像的监测和现场机器人技术提供了来自5种不同葡萄品种的实地实例 | [PP-YOLOE+ 下游任务](./configs/ppyoloe/application/README.md) | [下载链接](./configs/ppyoloe/application/README.md) |
| 通用 | 低光场景检测 | 低光数据集使用ExDark包括从极低光环境到暮光环境等10种不同光照条件下的图片。 | [PP-YOLOE+ 下游任务](./configs/ppyoloe/application/README.md) | [下载链接](./configs/ppyoloe/application/README.md) |
| 工业 | PCB电路板瑕疵检测 | 工业数据集使用PKU-Market-PCB该数据集用于印刷电路板PCB的瑕疵检测提供了6种常见的PCB缺陷 | [PP-YOLOE+ 下游任务](./configs/ppyoloe/application/README.md) | [下载链接](./configs/ppyoloe/application/README.md) |
</details>
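结合上表中的配置与权重,下面给出一个用 PP-YOLOE+_l 做单张图片预测的简单草图(与 tools/infer.py 的核心逻辑一致,仅为示意,接口以实际安装版本为准):
```python
# 用 PP-YOLOE+_l 预训练权重对 demo 图片做预测的示意
from ppdet.core.workspace import load_config
from ppdet.engine import Trainer

cfg = load_config('configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml')
trainer = Trainer(cfg, mode='test')
trainer.load_weights('https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams')
trainer.predict(['demo/000000014439.jpg'],   # 仓库自带的示例图片
                draw_threshold=0.5, output_dir='output')
```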
### 💎PP-YOLOE-R 高性能旋转框检测模型
<details>
<summary><b> 简介(点击展开)</b></summary>
PP-YOLOE-R是一个高效的单阶段Anchor-free旋转框检测模型基于PP-YOLOE+引入了一系列改进策略来提升检测精度。根据不同的硬件对精度和速度的要求PP-YOLOE-R包含s/m/l/x四个尺寸的模型。在DOTA 1.0数据集上PP-YOLOE-R-l和PP-YOLOE-R-x在单尺度训练和测试的情况下分别达到了78.14mAP和78.28 mAP这在单尺度评估下超越了几乎所有的旋转框检测模型。通过多尺度训练和测试PP-YOLOE-R-l和PP-YOLOE-R-x的检测精度进一步提升至80.02mAP和80.73 mAP超越了所有的Anchor-free方法并且和最先进的Anchor-based的两阶段模型精度几乎相当。在保持高精度的同时PP-YOLOE-R避免使用特殊的算子例如Deformable Convolution或Rotated RoI Align使其能轻松地部署在多种多样的硬件上。
`传送门`[PP-YOLOE-R说明](configs/rotate/ppyoloe_r)。
`传送门`[arXiv论文](https://arxiv.org/abs/2211.02386)。
</details>
<details>
<summary><b> 预训练模型(点击展开)</b></summary>
| 模型 | Backbone | mAP | V100 TRT FP16 (FPS) | RTX 2080 Ti TRT FP16 (FPS) | Params (M) | FLOPs (G) | 学习率策略 | 角度表示 | 数据增广 | GPU数目 | 每GPU图片数目 | 模型下载 | 配置文件 |
| :----------: | :------: | :---: | :-----------------: | :------------------------: | :--------: | :-------: | :--------: | :------: | :------: | :-----: | :-----------: | :---------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------: |
| PP-YOLOE-R-l | CRN-l | 80.02 | 69.7 | 48.3 | 53.29 | 281.65 | 3x | oc | MS+RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota_ms.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota_ms.yml) |
`传送门`[全部预训练模型](configs/rotate/ppyoloe_r)。
</details>
<details>
<summary><b> 产业应用代码示例(点击展开)</b></summary>
| 行业 | 类别 | 亮点 | 文档说明 | 模型下载 |
| ---- | ---------- | --------------------------------------------------------------------- | --------------------------------------------------------------------------------------- | --------------------------------------------------------------------- |
| 通用 | 旋转框检测 | 手把手教你上手PP-YOLOE-R旋转框检测10分钟将脊柱数据集精度训练至95mAP | [基于PP-YOLOE-R的旋转框检测](https://aistudio.baidu.com/aistudio/projectdetail/5058293) | [下载链接](https://aistudio.baidu.com/aistudio/projectdetail/5058293) |
</details>
### 💎PP-YOLOE-SOD 高精度小目标检测模型
<details>
<summary><b> 简介(点击展开)</b></summary>
PP-YOLOE-SOD(Small Object Detection)是PaddleDetection团队针对小目标检测提出的检测方案在VisDrone-DET数据集上单模型精度达到38.5mAP达到了SOTA性能。其分别提供了基于切图拼图流程优化的小目标检测方案以及基于原图模型算法优化的小目标检测方案。同时提供了数据集自动分析脚本只需输入数据集标注文件便可得到数据集统计结果辅助判断数据集是否是小目标数据集以及是否需要采用切图策略同时给出网络超参数参考值。
`传送门`[PP-YOLOE-SOD 小目标检测模型](configs/smalldet)。
</details>
<details>
<summary><b> 预训练模型(点击展开)</b></summary>
- VisDrone数据集预训练模型
| 模型 | COCOAPI mAP<sup>val<br>0.5:0.95 | COCOAPI mAP<sup>val<br>0.5 | COCOAPI mAP<sup>test_dev<br>0.5:0.95 | COCOAPI mAP<sup>test_dev<br>0.5 | MatlabAPI mAP<sup>test_dev<br>0.5:0.95 | MatlabAPI mAP<sup>test_dev<br>0.5 | 下载 | 配置文件 |
| :------------------ | :-----------------------------: | :------------------------: | :----------------------------------: | :-----------------------------: | :------------------------------------: | :-------------------------------: | :---------------------------------------------------------------------------------------------: | :----------------------------------------------------------: |
| **PP-YOLOE+_SOD-l** | **31.9** | **52.1** | **25.6** | **43.5** | **30.25** | **51.18** | [下载链接](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_sod_crn_l_80e_visdrone.pdparams) | [配置文件](configs/smalldet/visdrone/ppyoloe_plus_sod_crn_l_80e_visdrone.yml) |
`传送门`[全部预训练模型](configs/smalldet)。
</details>
<details>
<summary><b> 产业应用代码示例(点击展开)</b></summary>
| 行业 | 类别 | 亮点 | 文档说明 | 模型下载 |
| ---- | ---------- | ---------------------------------------------------- | ------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------- |
| 通用 | 小目标检测 | 基于PP-YOLOE-SOD的无人机航拍图像检测案例全流程实操。 | [基于PP-YOLOE-SOD的无人机航拍图像检测](https://aistudio.baidu.com/aistudio/projectdetail/5036782) | [下载链接](https://aistudio.baidu.com/aistudio/projectdetail/5036782) |
</details>
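上文提到的“数据集自动分析”思路,可以用下面这个简化草图来理解(非官方脚本,仅演示如何从 COCO 格式标注文件统计小目标占比,标注路径为假设值):
```python
# 统计小目标COCO 定义面积 < 32*32占比的简化示意
from pycocotools.coco import COCO

coco = COCO('dataset/visdrone/annotations/train.json')   # 假设的 COCO 格式标注文件
areas = [ann['bbox'][2] * ann['bbox'][3]
         for ann in coco.loadAnns(coco.getAnnIds())]
small = sum(a < 32 * 32 for a in areas)
print('small-object ratio: {:.2%}'.format(small / len(areas)))
```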
### 💫PP-PicoDet 超轻量实时目标检测模型
<details>
<summary><b> 简介(点击展开)</b></summary>
全新的轻量级系列模型PP-PicoDet在移动端具有卓越的性能成为全新SOTA轻量级模型。
`传送门`[PP-PicoDet说明](configs/picodet/README.md)。
`传送门`[arXiv论文](https://arxiv.org/abs/2111.00902)。
</details>
<details>
<summary><b> 预训练模型(点击展开)</b></summary>
| 模型名称 | COCO精度mAP | 骁龙865 四线程速度(FPS) | 推荐部署硬件 | 配置文件 | 模型下载 |
| :-------- | :-------------: | :---------------------: | :------------: | :--------------------------------------------------: | :----------------------------------------------------------------------------------: |
| PicoDet-L | 36.1 | 39.7 | 移动端、嵌入式 | [链接](configs/picodet/picodet_l_320_coco_lcnet.yml) | [下载地址](https://paddledet.bj.bcebos.com/models/picodet_l_320_coco_lcnet.pdparams) |
`传送门`[全部预训练模型](configs/picodet/README.md)。
</details>
<details>
<summary><b> 产业应用代码示例(点击展开)</b></summary>
| 行业 | 类别 | 亮点 | 文档说明 | 模型下载 |
| -------- | ------------ | ------------------------------------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------- |
| 智慧城市 | 道路垃圾检测 | 通过在市政环卫车辆上安装摄像头对路面垃圾检测并分析,实现对路面遗撒的垃圾进行监控,记录并通知环卫人员清理,大大提升了环卫人效。 | [基于PP-PicoDet的路面垃圾检测](https://aistudio.baidu.com/aistudio/projectdetail/3846170?channelType=0&channel=0) | [下载链接](https://aistudio.baidu.com/aistudio/projectdetail/3846170?channelType=0&channel=0) |
</details>
### 📡PP-Tracking 实时多目标跟踪系统
<details>
<summary><b> 简介(点击展开)</b></summary>
PaddleDetection团队提供了实时多目标跟踪系统PP-Tracking是基于PaddlePaddle深度学习框架的业界首个开源的实时多目标跟踪系统具有模型丰富、应用广泛和部署高效三大优势。 PP-Tracking支持单镜头跟踪(MOT)和跨镜头跟踪(MTMCT)两种模式针对实际业务的难点和痛点提供了行人跟踪、车辆跟踪、多类别跟踪、小目标跟踪、流量统计以及跨镜头跟踪等各种多目标跟踪功能和应用部署方式支持API调用和GUI可视化界面部署语言支持Python和C++部署平台环境支持Linux、NVIDIA Jetson等。
`传送门`[PP-Tracking说明](configs/mot/README.md)。
</details>
<details>
<summary><b> 预训练模型(点击展开)</b></summary>
| 模型名称 | 模型简介 | 精度 | 速度(FPS) | 推荐部署硬件 | 配置文件 | 模型下载 |
| :-------- | :----------------------------------: | :--------------------: | :-------: | :--------------------: | :--------------------------------------------------------: | :------------------------------------------------------------------------------------------------: |
| ByteTrack | SDE多目标跟踪算法 仅包含检测模型 | MOT-17 test: 78.4 | - | 服务器、移动端、嵌入式 | [链接](configs/mot/bytetrack/bytetrack_yolox.yml) | [下载地址](https://bj.bcebos.com/v1/paddledet/models/mot/yolox_x_24e_800x1440_mix_det.pdparams) |
| FairMOT | JDE多目标跟踪算法 多任务联合学习方法 | MOT-16 test: 75.0 | - | 服务器、移动端、嵌入式 | [链接](configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml) | [下载地址](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams) |
| OC-SORT | SDE多目标跟踪算法 仅包含检测模型 | MOT-17 half val: 75.5 | - | 服务器、移动端、嵌入式 | [链接](configs/mot/ocsort/ocsort_yolox.yml) | [下载地址](https://bj.bcebos.com/v1/paddledet/models/mot/yolox_x_24e_800x1440_mix_mot_ch.pdparams) |
</details>
<details>
<summary><b> 产业应用代码示例(点击展开)</b></summary>
| 行业 | 类别 | 亮点 | 文档说明 | 模型下载 |
| ---- | ---------- | -------------------------- | ---------------------------------------------------------------------------------------------- | --------------------------------------------------------------------- |
| 通用 | 多目标跟踪 | 快速上手单镜头、多镜头跟踪 | [PP-Tracking之手把手玩转多目标跟踪](https://aistudio.baidu.com/aistudio/projectdetail/3022582) | [下载链接](https://aistudio.baidu.com/aistudio/projectdetail/3022582) |
</details>
### ⛷PP-TinyPose 人体骨骼关键点识别
<details>
<summary><b> 简介(点击展开)</b></summary>
PaddleDetection 中的关键点检测部分紧跟最先进的算法,包括 Top-Down 和 Bottom-Up 两种方法可以满足用户的不同需求。同时PaddleDetection 提供针对移动端设备优化的自研实时关键点检测模型 PP-TinyPose。
`传送门`[PP-TinyPose说明](configs/keypoint/tiny_pose)。
</details>
<details>
<summary><b> 预训练模型(点击展开)</b></summary>
| 模型名称 | 模型简介 | COCO精度AP | 速度(FPS) | 推荐部署硬件 | 配置文件 | 模型下载 |
| :---------: | :----------------------------------: | :------------: | :-----------------------: | :------------: | :-----------------------------------------------------: | :--------------------------------------------------------------------------------------: |
| PP-TinyPose | 轻量级关键点算法<br/>输入尺寸256x192 | 68.8 | 骁龙865 四线程: 158.7 FPS | 移动端、嵌入式 | [链接](configs/keypoint/tiny_pose/tinypose_256x192.yml) | [下载地址](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_256x192.pdparams) |
`传送门`[全部预训练模型](configs/keypoint/README.md)。
</details>
<details>
<summary><b> 产业应用代码示例(点击展开)</b></summary>
| 行业 | 类别 | 亮点 | 文档说明 | 模型下载 |
| ---- | ---- | ---------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------- |
| 运动 | 健身 | 提供从模型选型、数据准备、模型训练优化到后处理逻辑和模型部署的全流程可复用方案有效解决了复杂健身动作的高效识别打造AI虚拟健身教练 | [基于PP-TinyPose增强版的智能健身动作识别](https://aistudio.baidu.com/aistudio/projectdetail/4385813) | [下载链接](https://aistudio.baidu.com/aistudio/projectdetail/4385813) |
</details>
### 🏃🏻PP-Human 实时行人分析工具
<details>
<summary><b> 简介(点击展开)</b></summary>
PaddleDetection深入探索核心行业的高频场景提供了行人开箱即用分析工具支持图片/单镜头视频/多镜头视频/在线视频流多种输入方式广泛应用于智慧交通、智慧城市、工业巡检等领域。支持服务器端部署及TensorRT加速T4服务器上可达到实时。
PP-Human支持四大产业级功能五大异常行为识别、26种人体属性分析、实时人流计数、跨镜头ReID跟踪。
`传送门`[PP-Human行人分析工具使用指南](deploy/pipeline/README.md)。
</details>
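PP-Human 以 deploy/pipeline 下的脚本方式运行,下面是一个从 Python 中调用该流水线的示意(参数与配置文件以 deploy/pipeline 文档为准,视频路径为假设值):
```python
# 调用 PP-Human 行人分析流水线的示意
import subprocess

subprocess.run([
    'python', 'deploy/pipeline/pipeline.py',
    '--config', 'deploy/pipeline/config/infer_cfg_pphuman.yml',  # 行人分析默认配置
    '--video_file', 'test_video.mp4',                            # 假设的输入视频
    '--device', 'gpu',
], check=True)
```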
<details>
<summary><b> 预训练模型(点击展开)</b></summary>
| 任务 | T4 TensorRT FP16: 速度FPS | 推荐部署硬件 | 模型下载 | 模型体积 |
| :----------------: | :---------------------------: | :----------: | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: | :---------------------------------------------------------------: |
| 行人检测(高精度) | 39.8 | 服务器 | [目标检测](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) | 182M |
| 行人跟踪(高精度) | 31.4 | 服务器 | [多目标跟踪](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) | 182M |
| 属性识别(高精度) | 单人 117.6 | 服务器 | [目标检测](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)<br> [属性识别](https://bj.bcebos.com/v1/paddledet/models/pipeline/PPHGNet_small_person_attribute_954_infer.zip) | 目标检测182M<br>属性识别86M |
| 摔倒识别 | 单人 100 | 服务器 | [多目标跟踪](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) <br> [关键点检测](https://bj.bcebos.com/v1/paddledet/models/pipeline/dark_hrnet_w32_256x192.zip) <br> [基于关键点行为识别](https://bj.bcebos.com/v1/paddledet/models/pipeline/STGCN.zip) | 多目标跟踪182M<br>关键点检测101M<br>基于关键点行为识别21.8M |
| 闯入识别 | 31.4 | 服务器 | [多目标跟踪](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) | 182M |
| 打架识别 | 50.8 | 服务器 | [视频分类](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) | 90M |
| 抽烟识别 | 340.1 | 服务器 | [目标检测](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)<br>[基于人体id的目标检测](https://bj.bcebos.com/v1/paddledet/models/pipeline/ppyoloe_crn_s_80e_smoking_visdrone.zip) | 目标检测182M<br>基于人体id的目标检测27M |
| 打电话识别 | 166.7 | 服务器 | [目标检测](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)<br>[基于人体id的图像分类](https://bj.bcebos.com/v1/paddledet/models/pipeline/PPHGNet_tiny_calling_halfbody.zip) | 目标检测182M<br>基于人体id的图像分类45M |
`传送门`[完整预训练模型](deploy/pipeline/README.md)。
</details>
<details>
<summary><b> 产业应用代码示例(点击展开)</b></summary>
| 行业 | 类别 | 亮点 | 文档说明 | 模型下载 |
| -------- | -------- | ---------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------------- |
| 智能安防 | 摔倒检测 | 飞桨行人分析PP-Human中提供的摔倒识别算法采用了关键点+时空图卷积网络的技术,对摔倒姿势无限制、背景环境无要求。 | [基于PP-Human v2的摔倒检测](https://aistudio.baidu.com/aistudio/projectdetail/4606001) | [下载链接](https://aistudio.baidu.com/aistudio/projectdetail/4606001) |
| 智能安防 | 打架识别 | 本项目基于PaddleVideo视频开发套件训练打架识别模型然后将训练好的模型集成到PaddleDetection的PP-Human中助力行人行为分析。 | [基于PP-Human的打架识别](https://aistudio.baidu.com/aistudio/projectdetail/4086987?contributionType=1) | [下载链接](https://aistudio.baidu.com/aistudio/projectdetail/4086987?contributionType=1) |
| 智能安防 | 摔倒检测 | 基于PP-Human完成来客分析整体流程。使用PP-Human完成来客分析中非常常见的场景 1. 来客属性识别(单镜和跨境可视化2. 来客行为识别(摔倒识别)。 | [基于PP-Human的来客分析案例教程](https://aistudio.baidu.com/aistudio/projectdetail/4537344) | [下载链接](https://aistudio.baidu.com/aistudio/projectdetail/4537344) |
</details>
### 🏎PP-Vehicle 实时车辆分析工具
<details>
<summary><b> 简介(点击展开)</b></summary>
PaddleDetection深入探索核心行业的高频场景提供了车辆开箱即用分析工具支持图片/单镜头视频/多镜头视频/在线视频流多种输入方式广泛应用于智慧交通、智慧城市、工业巡检等领域。支持服务器端部署及TensorRT加速T4服务器上可达到实时。
PP-Vehicle囊括四大交通场景核心功能车牌识别、属性识别、车流量统计、违章检测。
`传送门`[PP-Vehicle车辆分析工具指南](deploy/pipeline/README.md)。
</details>
<details>
<summary><b> 预训练模型(点击展开)</b></summary>
| 任务 | T4 TensorRT FP16: 速度(FPS) | 推荐部署硬件 | 模型方案 | 模型体积 |
| :----------------: | :-------------------------: | :----------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: | :-------------------------------------: |
| 车辆检测(高精度) | 38.9 | 服务器 | [目标检测](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_ppvehicle.zip) | 182M |
| 车辆跟踪(高精度) | 25 | 服务器 | [多目标跟踪](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_ppvehicle.zip) | 182M |
| 车牌识别 | 213.7 | 服务器 | [车牌检测](https://bj.bcebos.com/v1/paddledet/models/pipeline/ch_PP-OCRv3_det_infer.tar.gz) <br> [车牌识别](https://bj.bcebos.com/v1/paddledet/models/pipeline/ch_PP-OCRv3_rec_infer.tar.gz) | 车牌检测3.9M <br> 车牌字符识别: 12M |
| 车辆属性 | 136.8 | 服务器 | [属性识别](https://bj.bcebos.com/v1/paddledet/models/pipeline/vehicle_attribute_model.zip) | 7.2M |
`传送门`[完整预训练模型](deploy/pipeline/README.md)。
</details>
<details>
<summary><b> 产业应用代码示例(点击展开)</b></summary>
| 行业 | 类别 | 亮点 | 文档说明 | 模型下载 |
| -------- | ---------------- | ------------------------------------------------------------------------------------------------------------------ | --------------------------------------------------------------------------------------------- | --------------------------------------------------------------------- |
| 智慧交通 | 交通监控车辆分析 | 本项目基于PP-Vehicle演示智慧交通中最刚需的车流量监控、车辆违停检测以及车辆结构化车牌、车型、颜色分析三大场景。 | [基于PP-Vehicle的交通监控分析系统](https://aistudio.baidu.com/aistudio/projectdetail/4512254) | [下载链接](https://aistudio.baidu.com/aistudio/projectdetail/4512254) |
</details>
## 💡产业实践范例
产业实践范例是PaddleDetection针对高频目标检测应用场景提供的端到端开发示例帮助开发者打通数据标注-模型训练-模型调优-预测部署全流程。
针对每个范例我们都通过[AI-Studio](https://ai.baidu.com/ai-doc/AISTUDIO/Tk39ty6ho)提供了项目代码以及说明,用户可以同步运行体验。
`传送门`[产业实践范例完整列表](industrial_tutorial/README.md)
- [基于PP-YOLOE-R的旋转框检测](https://aistudio.baidu.com/aistudio/projectdetail/5058293)
- [基于PP-YOLOE-SOD的无人机航拍图像检测](https://aistudio.baidu.com/aistudio/projectdetail/5036782)
- [基于PP-Vehicle的交通监控分析系统](https://aistudio.baidu.com/aistudio/projectdetail/4512254)
- [基于PP-Human v2的摔倒检测](https://aistudio.baidu.com/aistudio/projectdetail/4606001)
- [基于PP-TinyPose增强版的智能健身动作识别](https://aistudio.baidu.com/aistudio/projectdetail/4385813)
- [基于PP-Human的打架识别](https://aistudio.baidu.com/aistudio/projectdetail/4086987?contributionType=1)
- [基于Faster-RCNN的瓷砖表面瑕疵检测](https://aistudio.baidu.com/aistudio/projectdetail/2571419)
- [基于PaddleDetection的PCB瑕疵检测](https://aistudio.baidu.com/aistudio/projectdetail/2367089)
- [基于FairMOT实现人流量统计](https://aistudio.baidu.com/aistudio/projectdetail/2421822)
- [基于YOLOv3实现跌倒检测](https://aistudio.baidu.com/aistudio/projectdetail/2500639)
- [基于PP-PicoDetv2 的路面垃圾检测](https://aistudio.baidu.com/aistudio/projectdetail/3846170?channelType=0&channel=0)
- [基于人体关键点检测的合规检测](https://aistudio.baidu.com/aistudio/projectdetail/4061642?contributionType=1)
- [基于PP-Human的来客分析案例教程](https://aistudio.baidu.com/aistudio/projectdetail/4537344)
- 持续更新中...
## 🏆企业应用案例
企业应用案例是企业在实际生产环境下落地应用PaddleDetection的方案思路。相比产业实践范例其更多强调整体方案设计思路可供开发者在项目方案设计中做参考。
`传送门`[企业应用案例完整列表](https://www.paddlepaddle.org.cn/customercase)
- [中国南方电网——变电站智慧巡检](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2330)
- [国铁电气——轨道在线智能巡检系统](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2280)
- [京东物流——园区车辆行为识别](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2611)
- [中兴克拉—厂区传统仪表统计监测](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2618)
- [宁德时代—动力电池高精度质量检测](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2609)
- [中国科学院空天信息创新研究院——高尔夫球场遥感监测](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2483)
- [御航智能——基于边缘的无人机智能巡检](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2481)
- [普宙无人机——高精度森林巡检](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2121)
- [领邦智能——红外无感测温监控](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2615)
- [北京地铁——口罩检测](https://mp.weixin.qq.com/s/znrqaJmtA7CcjG0yQESWig)
- [音智达——工厂人员违规行为检测](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2288)
- [华夏天信——输煤皮带机器人智能巡检](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2331)
- [优恩物联网——社区住户分类支持广告精准投放](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2485)
- [螳螂慧视——室内3D点云场景物体分割与检测](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2599)
- 持续更新中...
## 📝许可证书
本项目的发布受[Apache 2.0 license](LICENSE)许可认证。
## 📌引用
```
@misc{ppdet2019,
title={PaddleDetection, Object detection and instance segmentation toolkit based on PaddlePaddle.},
author={PaddlePaddle Authors},
howpublished = {\url{https://github.com/PaddlePaddle/PaddleDetection}},
year={2019}
}
```
View File
@@ -0,0 +1,541 @@
[简体中文](README_cn.md) | English
<div align="center">
<p align="center">
<img src="https://user-images.githubusercontent.com/48054808/160532560-34cf7a1f-d950-435e-90d2-4b0a679e5119.png" align="middle" width = "800" />
</p>
**A High-Efficient Development Toolkit for Object Detection based on [PaddlePaddle](https://github.com/paddlepaddle/paddle)**
<p align="center">
<a href="./LICENSE"><img src="https://img.shields.io/badge/license-Apache%202-dfd.svg"></a>
<a href="https://github.com/PaddlePaddle/PaddleDetection/releases"><img src="https://img.shields.io/github/v/release/PaddlePaddle/PaddleDetection?color=ffa"></a>
<a href=""><img src="https://img.shields.io/badge/python-3.7+-aff.svg"></a>
<a href=""><img src="https://img.shields.io/badge/os-linux%2C%20win%2C%20mac-pink.svg"></a>
<a href="https://github.com/PaddlePaddle/PaddleDetection/stargazers"><img src="https://img.shields.io/github/stars/PaddlePaddle/PaddleDetection?color=ccf"></a>
</p>
</div>
<div align="center">
<img src="https://user-images.githubusercontent.com/22989727/205581915-aa8d6bee-5624-4aec-8059-76b5ebaf96f1.gif" width="800"/>
</div>
## <img src="https://user-images.githubusercontent.com/48054808/157793354-6e7f381a-0aa6-4bb7-845c-9acf2ecc05c3.png" width="20"/> Product Update
- 🔥 **2022.11.15: SOTA rotated object detector and small object detector based on PP-YOLOE**
- Rotated object detector [PP-YOLOE-R](configs/rotate/ppyoloe_r)
- SOTA Anchor-free rotated object detection model with high accuracy and efficiency
- A series of models, named s/m/l/x, for cloud and edge devices
- Avoids special operators so models can be easily deployed with TensorRT.
- Small object detector [PP-YOLOE-SOD](configs/smalldet)
- End-to-end detection pipeline based on sliced images
- SOTA model on VisDrone based on original images.
- 2022.8.26: PaddleDetection releases [release/2.5 version](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5)
- 🗳 Model features
- Release [PP-YOLOE+](configs/ppyoloe): Increased accuracy by a maximum of 2.4% mAP to 54.9% mAP, 3.75 times faster model training convergence rate, and up to 2.3 times faster end-to-end inference speed; improved generalization for multiple downstream tasks
- Release [PicoDet-NPU](configs/picodet) model which supports full quantization deployment of models; add [PicoDet](configs/picodet) layout analysis model
- Release [PP-TinyPose Plus](./configs/keypoint/tiny_pose/). With 9.1% AP accuracy improvement in physical exercise, dance, and other scenarios, our PP-TinyPose Plus supports unconventional movements such as turning to one side, lying down, jumping, and high lifts
- 🔮 Functions in different scenarios
- Release the pedestrian analysis tool [PP-Human v2](./deploy/pipeline). It introduces four new behavior recognition capabilities: fighting, telephoning, smoking, and trespassing. The underlying algorithm performance is optimized, covering three core algorithm capabilities: detection, tracking, and attribute recognition of pedestrians. Our model provides end-to-end development and model optimization strategies for beginners and supports online video streaming input.
- First release [PP-Vehicle](./deploy/pipeline), which has four major functions: license plate recognition, vehicle attribute analysis (color, model), traffic flow statistics, and violation detection. It is compatible with input formats, including pictures, online video streaming, and video. And we also offer our users a comprehensive set of tutorials for customization.
- 💡 Cutting-edge algorithms
- Release [PaddleYOLO](https://github.com/PaddlePaddle/PaddleYOLO), which covers classic and the latest models of the [YOLO family](https://github.com/PaddlePaddle/PaddleYOLO/tree/develop/docs/MODEL_ZOO_en.md): YOLOv3, PP-YOLOE (a real-time high-precision object detection model developed by Baidu PaddlePaddle), and cutting-edge detection algorithms such as YOLOv4, YOLOv5, YOLOX, YOLOv6, YOLOv7 and YOLOv8
- Newly add a high-precision detection model based on the [ViT](configs/vitdet) backbone network with 55.7% mAP on the COCO dataset; newly add the multi-object tracking model [OC-SORT](configs/mot/ocsort); newly add the [ConvNeXt](configs/convnext) backbone network.
- 📋 Industrial applications: Newly add [Smart Fitness](https://aistudio.baidu.com/aistudio/projectdetail/4385813), [Fighting recognition](https://aistudio.baidu.com/aistudio/projectdetail/4086987?channelType=0&channel=0), and [Visitor Analysis](https://aistudio.baidu.com/aistudio/projectdetail/4230123?channelType=0&channel=0).
- 2022.3.24: PaddleDetection released [release/2.4 version](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4)
- Release the high-performance SOTA object detection model [PP-YOLOE](configs/ppyoloe). It integrates cloud and edge devices and provides S/M/L/X versions. In particular, the L version reaches 51.4% mAP on the COCO test 2017 dataset and 78.1 FPS inference speed on a single Tesla V100. It supports mixed precision training and trains 33% faster than PP-YOLOv2. Its full range of model sizes can meet different hardware compute requirements and is adaptable to server GPUs, edge-device GPUs and other AI accelerator cards.
- Release the ultra-lightweight SOTA object detection model [PP-PicoDet Plus](configs/picodet) with a 2% improvement in accuracy and a 63% improvement in CPU inference speed. Add the PicoDet-XS model with only 0.7M parameters, and provide model sparsification and quantization for acceleration. No dedicated post-processing module is required on any hardware, simplifying deployment.
- Release the real-time pedestrian analysis tool [PP-Human](deploy/pphuman). It has four major functions: pedestrian tracking, visitor flow statistics, human attribute recognition and falling detection. Falling detection is optimized on real-life data and accurately recognizes various falling postures, adapting to different backgrounds, lighting conditions and camera angles.
- Add the [YOLOX](configs/yolox) object detection model with nano/tiny/S/M/L/X sizes. The X version reaches 51.8% mAP on the COCO val2017 dataset.
- [More releases](https://github.com/PaddlePaddle/PaddleDetection/releases)
## <img title="" src="https://user-images.githubusercontent.com/48054808/157795569-9fc77c85-732f-4870-9be0-99a7fe2cff27.png" alt="" width="20"> Brief Introduction
**PaddleDetection** is an end-to-end object detection development kit based on PaddlePaddle. Providing **over 30 model algorithms** and **over 300 pre-trained models**, it covers object detection, instance segmentation, keypoint detection and multi-object tracking. In particular, PaddleDetection offers **high-performance and lightweight** industrial SOTA models for **server and mobile** devices, champion solutions and cutting-edge algorithms. PaddleDetection provides various data augmentation methods, configurable network components, loss functions and other advanced optimization & deployment schemes. In addition to covering the whole process of data processing, model development, training, compression and deployment, it also provides rich cases and tutorials to accelerate the industrial application of these algorithms.
<div align="center">
<img src="https://user-images.githubusercontent.com/22989727/189122825-ee1c1db2-b5f9-42c0-88b4-7975e1ec239d.gif" width="800"/>
</div>
## <img src="https://user-images.githubusercontent.com/48054808/157799599-e6a66855-bac6-4e75-b9c0-96e13cb9612f.png" width="20"/> Features
- **Rich model library**: PaddleDetection provides over 250 pre-trained models including **object detection, instance segmentation, face recognition, multi-object tracking**. It covers a variety of **global competition champion** schemes.
- **Simple to use**: Modular design, decoupling each network component, easy for developers to build and try various detection models and optimization strategies, quick access to high-performance, customized algorithm.
- **Getting Through End to End**: PaddleDetection covers the whole pipeline end to end, from data augmentation, model construction, training and compression to deployment. It also supports multi-architecture, multi-device deployment for **cloud and edge** devices.
- **High Performance**: Thanks to the high-performance PaddlePaddle core, PaddleDetection has clear advantages in training speed and memory occupation. It also supports FP16 training and multi-machine training.
<div align="center">
<img src="https://user-images.githubusercontent.com/22989727/202131382-45fd2de6-3805-460e-a70c-66db7188d37c.png" width="800"/>
</div>
## <img title="" src="https://user-images.githubusercontent.com/48054808/157800467-2a9946ad-30d1-49a9-b9db-ba33413d9c90.png" alt="" width="20"> Exchanges
- If you have any question or suggestion, please give us your valuable input via [GitHub Issues](https://github.com/PaddlePaddle/PaddleDetection/issues)
Welcome to join PaddleDetection user groups on WeChat (scan the QR code, add and reply "D" to the assistant)
<div align="center">
<img src="https://user-images.githubusercontent.com/34162360/177678712-4655747d-4290-4ad9-b7a1-4564a5418ac6.jpg" width = "200" />
</div>
## <img src="https://user-images.githubusercontent.com/48054808/157827140-03ffaff7-7d14-48b4-9440-c38986ea378c.png" width="20"/> Kit Structure
<table align="center">
<tbody>
<tr align="center" valign="bottom">
<td>
<b>Architectures</b>
</td>
<td>
<b>Backbones</b>
</td>
<td>
<b>Components</b>
</td>
<td>
<b>Data Augmentation</b>
</td>
</tr>
<tr valign="top">
<td>
<ul>
<details><summary><b>Object Detection</b></summary>
<ul>
<li>Faster RCNN</li>
<li>FPN</li>
<li>Cascade-RCNN</li>
<li>PSS-Det</li>
<li>RetinaNet</li>
<li>YOLOv3</li>
<li>YOLOF</li>
<li>YOLOX</li>
<li>YOLOv5</li>
<li>YOLOv6</li>
<li>YOLOv7</li>
<li>YOLOv8</li>
<li>RTMDet</li>
<li>PP-YOLO</li>
<li>PP-YOLO-Tiny</li>
<li>PP-PicoDet</li>
<li>PP-YOLOv2</li>
<li>PP-YOLOE</li>
<li>PP-YOLOE+</li>
<li>PP-YOLOE-SOD</li>
<li>PP-YOLOE-R</li>
<li>SSD</li>
<li>CenterNet</li>
<li>FCOS</li>
<li>FCOSR</li>
<li>TTFNet</li>
<li>TOOD</li>
<li>GFL</li>
<li>GFLv2</li>
<li>DETR</li>
<li>Deformable DETR</li>
<li>Swin Transformer</li>
<li>Sparse RCNN</li>
</ul></details>
<details><summary><b>Instance Segmentation</b></summary>
<ul>
<li>Mask RCNN</li>
<li>Cascade Mask RCNN</li>
<li>SOLOv2</li>
</ul></details>
<details><summary><b>Face Detection</b></summary>
<ul>
<li>BlazeFace</li>
</ul></details>
<details><summary><b>Multi-Object-Tracking</b></summary>
<ul>
<li>JDE</li>
<li>FairMOT</li>
<li>DeepSORT</li>
<li>ByteTrack</li>
<li>OC-SORT</li>
<li>BoT-SORT</li>
<li>CenterTrack</li>
</ul></details>
<details><summary><b>KeyPoint-Detection</b></summary>
<ul>
<li>HRNet</li>
<li>HigherHRNet</li>
<li>Lite-HRNet</li>
<li>PP-TinyPose</li>
</ul></details>
</ul>
</td>
<td>
<details><summary><b>Details</b></summary>
<ul>
<li>ResNet(&vd)</li>
<li>Res2Net(&vd)</li>
<li>CSPResNet</li>
<li>SENet</li>
<li>Res2Net</li>
<li>HRNet</li>
<li>Lite-HRNet</li>
<li>DarkNet</li>
<li>CSPDarkNet</li>
<li>MobileNetv1/v3</li>
<li>ShuffleNet</li>
<li>GhostNet</li>
<li>BlazeNet</li>
<li>DLA</li>
<li>HardNet</li>
<li>LCNet</li>
<li>ESNet</li>
<li>Swin-Transformer</li>
<li>ConvNeXt</li>
<li>Vision Transformer</li>
</ul></details>
</td>
<td>
<details><summary><b>Common</b></summary>
<ul>
<li>Sync-BN</li>
<li>Group Norm</li>
<li>DCNv2</li>
<li>EMA</li>
</ul> </details>
</ul>
<details><summary><b>KeyPoint</b></summary>
<ul>
<li>DarkPose</li>
</ul></details>
</ul>
<details><summary><b>FPN</b></summary>
<ul>
<li>BiFPN</li>
<li>CSP-PAN</li>
<li>Custom-PAN</li>
<li>ES-PAN</li>
<li>HRFPN</li>
</ul> </details>
</ul>
<details><summary><b>Loss</b></summary>
<ul>
<li>Smooth-L1</li>
<li>GIoU/DIoU/CIoU</li>
<li>IoUAware</li>
<li>Focal Loss</li>
<li>CT Focal Loss</li>
<li>VariFocal Loss</li>
</ul> </details>
</ul>
<details><summary><b>Post-processing</b></summary>
<ul>
<li>SoftNMS</li>
<li>MatrixNMS</li>
</ul> </details>
</ul>
<details><summary><b>Speed</b></summary>
<ul>
<li>FP16 training</li>
<li>Multi-machine training </li>
</ul> </details>
</ul>
</td>
<td>
<details><summary><b>Details</b></summary>
<ul>
<li>Resize</li>
<li>Lighting</li>
<li>Flipping</li>
<li>Expand</li>
<li>Crop</li>
<li>Color Distort</li>
<li>Random Erasing</li>
<li>Mixup </li>
<li>AugmentHSV</li>
<li>Mosaic</li>
<li>Cutmix </li>
<li>Grid Mask</li>
<li>Auto Augment</li>
<li>Random Perspective</li>
</ul> </details>
</td>
</tr>
</td>
</tr>
</tbody>
</table>
## <img src="https://user-images.githubusercontent.com/48054808/157801371-9a9a8c65-1690-4123-985a-e0559a7f9494.png" width="20"/> Model Performance
<details>
<summary><b> Performance comparison of Cloud models</b></summary>
The comparison of COCO mAP and FPS on Tesla V100 for representative models of each architecture and backbone.
<div align="center">
<img src="docs/images/fps_map.png" />
</div>
**Clarification**
- `ViT` stands for `ViT-Cascade-Faster-RCNN`, which has the highest COCO mAP of 55.7%
- `Cascade-Faster-RCNN` stands for `Cascade-Faster-RCNN-ResNet50vd-DCN`, which has been optimized in PaddleDetection to reach 20 FPS inference speed at 47.8% COCO mAP
- `PP-YOLOE` is an optimized `PP-YOLO v2`. It reaches 51.4% mAP on the COCO dataset and 78.1 FPS inference speed on Tesla V100
- `PP-YOLOE+` is an optimized `PP-YOLOE`. It reaches 53.3% mAP on the COCO dataset and 78.1 FPS inference speed on Tesla V100
- The models in the figure are available in the [model library](#模型库)
</details>
<details>
<summary><b> Performance comparison on mobile devices</b></summary>
The comparison of COCO mAP and FPS on a Qualcomm Snapdragon 865 processor for models on mobile devices.
<div align="center">
<img src="docs/images/mobile_fps_map.png" width=600/>
</div>
**Clarification**
- Tests were conducted on a Qualcomm Snapdragon 865 (4\*A77 + 4\*A55) processor with batch_size=1, 4 threads, and the NCNN inference library; the test script is available at [MobileDetBenchmark](https://github.com/JiweiMaster/MobileDetBenchmark)
- [PP-PicoDet](configs/picodet) and [PP-YOLO-Tiny](configs/ppyolo) are self-developed models of PaddleDetection, and other models are not tested yet.
</details>
## <img src="https://user-images.githubusercontent.com/48054808/157829890-a535b8a6-631c-4c87-b861-64d4b32b2d6a.png" width="20"/> Model libraries
<details>
<summary><b> 1. General detection</b></summary>
#### PP-YOLOE series (recommended scenarios: cloud GPUs such as NVIDIA V100 and T4, and edge devices such as the Jetson series)
| Model | COCO AccuracymAP | V100 TensorRT FP16 Speed(FPS) | Configuration | Download |
|:---------- |:------------------:|:-----------------------------:|:-------------------------------------------------------:|:----------------------------------------------------------------------------------------:|
| PP-YOLOE+_s | 43.9 | 333.3 | [link](configs/ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml) | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_s_80e_coco.pdparams) |
| PP-YOLOE+_m | 50.0 | 208.3 | [link](configs/ppyoloe/ppyoloe_plus_crn_m_80e_coco.yml) | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_m_80e_coco.pdparams) |
| PP-YOLOE+_l | 53.3 | 149.2 | [link](configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml) | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams) |
| PP-YOLOE+_x | 54.9 | 95.2 | [link](configs/ppyoloe/ppyoloe_plus_crn_x_80e_coco.yml) | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_x_80e_coco.pdparams) |
#### PP-PicoDet series (recommended scenarios: mobile chips and x86 CPU devices, such as ARM CPU (RK3399, Raspberry Pi) and NPU (BITMAIN))
| Model | COCO AccuracymAP | Snapdragon 865 four-thread speed (ms) | Configuration | Download |
|:---------- |:------------------:|:-------------------------------------:|:-----------------------------------------------------:|:-------------------------------------------------------------------------------------:|
| PicoDet-XS | 23.5 | 7.81 | [Link](configs/picodet/picodet_xs_320_coco_lcnet.yml) | [Download](https://paddledet.bj.bcebos.com/models/picodet_xs_320_coco_lcnet.pdparams) |
| PicoDet-S | 29.1 | 9.56 | [Link](configs/picodet/picodet_s_320_coco_lcnet.yml) | [Download](https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams) |
| PicoDet-M | 34.4 | 17.68 | [Link](configs/picodet/picodet_m_320_coco_lcnet.yml) | [Download](https://paddledet.bj.bcebos.com/models/picodet_m_320_coco_lcnet.pdparams) |
| PicoDet-L | 36.1 | 25.21 | [Link](configs/picodet/picodet_l_320_coco_lcnet.yml) | [Download](https://paddledet.bj.bcebos.com/models/picodet_l_320_coco_lcnet.pdparams) |
#### [Frontier detection algorithm](docs/feature_models/PaddleYOLO_MODEL.md)
| Model | COCO AccuracymAP | V100 TensorRT FP16 speed(FPS) | Configuration | Download |
|:-------- |:------------------:|:-----------------------------:|:--------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------:|
| [YOLOX-l](configs/yolox) | 50.1 | 107.5 | [Link](configs/yolox/yolox_l_300e_coco.yml) | [Download](https://paddledet.bj.bcebos.com/models/yolox_l_300e_coco.pdparams) |
| [YOLOv5-l](https://github.com/PaddlePaddle/PaddleYOLO/tree/develop/configs/yolov5) | 48.6 | 136.0 | [Link](https://github.com/PaddlePaddle/PaddleYOLO/tree/develop/configs/yolov5/yolov5_l_300e_coco.yml) | [Download](https://paddledet.bj.bcebos.com/models/yolov5_l_300e_coco.pdparams) |
| [YOLOv7-l](https://github.com/PaddlePaddle/PaddleYOLO/tree/develop/configs/yolov7) | 51.0 | 135.0 | [Link](https://github.com/PaddlePaddle/PaddleYOLO/tree/develop/configs/yolov7/yolov7_l_300e_coco.yml) | [Download](https://paddledet.bj.bcebos.com/models/yolov7_l_300e_coco.pdparams) |
#### Other general purpose models [doc](docs/MODEL_ZOO_en.md)
</details>
<details>
<summary><b> 2. Instance segmentation</b></summary>
| Model | Introduction | Recommended Scenarios | COCO Accuracy(mAP) | Configuration | Download |
|:----------------- |:-------------------------------------------------------- |:--------------------------------------------- |:--------------------------------:|:-----------------------------------------------------------------------:|:-----------------------------------------------------------------------------------------------------:|
| Mask RCNN | Two-stage instance segmentation algorithm | <div style="width: 50pt">Edge-Cloud end</div> | box AP: 41.4 <br/> mask AP: 37.5 | [Link](configs/mask_rcnn/mask_rcnn_r50_vd_fpn_2x_coco.yml) | [Download](https://paddledet.bj.bcebos.com/models/mask_rcnn_r50_vd_fpn_2x_coco.pdparams) |
| Cascade Mask RCNN | Two-stage instance segmentation algorithm | <div style="width: 50pt">Edge-Cloud end</div> | box AP: 45.7 <br/> mask AP: 39.7 | [Link](configs/mask_rcnn/cascade_mask_rcnn_r50_vd_fpn_ssld_2x_coco.yml) | [Download](https://paddledet.bj.bcebos.com/models/cascade_mask_rcnn_r50_vd_fpn_ssld_2x_coco.pdparams) |
| SOLOv2 | Lightweight single-stage instance segmentation algorithm | <div style="width: 50pt">Edge-Cloud end</div> | mask AP: 38.0 | [Link](configs/solov2/solov2_r50_fpn_3x_coco.yml) | [Download](https://paddledet.bj.bcebos.com/models/solov2_r50_fpn_3x_coco.pdparams) |
</details>
<details>
<summary><b> 3. Keypoint detection</b></summary>
| Model | Introduction | Recommended scenarios | COCO AccuracyAP | Speed | Configuration | Download |
|:-------------------- |:--------------------------------------------------------------------------------------------- |:--------------------------------------------- |:-----------------:|:---------------------------------:|:---------------------------------------------------------:|:-------------------------------------------------------------------------------------------:|
| HRNet-w32 + DarkPose | <div style="width: 130pt">Top-down Keypoint detection algorithm<br/>Input size: 384x288</div> | <div style="width: 50pt">Edge-Cloud end</div> | 78.3 | T4 TensorRT FP16 2.96ms | [Link](configs/keypoint/hrnet/dark_hrnet_w32_384x288.yml) | [Download](https://paddledet.bj.bcebos.com/models/keypoint/dark_hrnet_w32_384x288.pdparams) |
| HRNet-w32 + DarkPose | Top-down Keypoint detection algorithm<br/>Input size: 256x192 | Edge-Cloud end | 78.0 | T4 TensorRT FP16 1.75ms | [Link](configs/keypoint/hrnet/dark_hrnet_w32_256x192.yml) | [Download](https://paddledet.bj.bcebos.com/models/keypoint/dark_hrnet_w32_256x192.pdparams) |
| PP-TinyPose | Light-weight keypoint algorithm<br/>Input size: 256x192 | Mobile | 68.8 | Snapdragon 865 four-thread 6.30ms | [Link](configs/keypoint/tiny_pose/tinypose_256x192.yml) | [Download](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_256x192.pdparams) |
| PP-TinyPose | Light-weight keypoint algorithm<br/>Input size: 128x96 | Mobile | 58.1 | Snapdragon 865 four-thread 2.37ms | [Link](configs/keypoint/tiny_pose/tinypose_128x96.yml) | [Download](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_128x96.pdparams) |
#### Other keypoint detection models [doc](configs/keypoint)
</details>
<details>
<summary><b> 4. Multi-object tracking PP-Tracking</b></summary>
| Model | Introduction | Recommended scenarios | Accuracy | Configuration | Download |
|:--------- |:------------------------------------------------------------- |:--------------------- |:----------------------:|:-----------------------------------------------------------------------:|:-----------------------------------------------------------------------------------------------------:|
| ByteTrack | SDE Multi-object tracking algorithm with detection model only | Edge-Cloud end | MOT-17 half val: 77.3 | [Link](configs/mot/bytetrack/detector/yolox_x_24e_800x1440_mix_det.yml) | [Download](https://paddledet.bj.bcebos.com/models/mot/deepsort/yolox_x_24e_800x1440_mix_det.pdparams) |
| FairMOT | JDE multi-object tracking algorithm multi-task learning | Edge-Cloud end | MOT-16 test: 75.0 | [Link](configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml) | [Download](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams) |
| OC-SORT | SDE multi-object tracking algorithm with detection model only | Edge-Cloud end | MOT-17 half val: 75.5 | [Link](configs/mot/ocsort/ocsort_yolox.yml) | - |
#### Other multi-object tracking models [docs](configs/mot)
</details>
<details>
<summary><b> 5. Industrial real-time pedestrian analysis tool: PP-Human</b></summary>
| Task | End-to-End Speedms | Model | Size |
|:--------------------------------------:|:--------------------:|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------:|
| Pedestrian detection (high precision) | 25.1ms | [Multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) | 182M |
| Pedestrian detection (lightweight) | 16.2ms | [Multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_pipeline.zip) | 27M |
| Pedestrian tracking (high precision) | 31.8ms | [Multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) | 182M |
| Pedestrian tracking (lightweight) | 21.0ms | [Multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_pipeline.zip) | 27M |
| Attribute recognition (high precision) | Single person: 8.5ms | [Object detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)<br> [Attribute recognition](https://bj.bcebos.com/v1/paddledet/models/pipeline/strongbaseline_r50_30e_pa100k.zip) | Object detection: 182M<br>Attribute recognition: 86M |
| Attribute recognition (lightweight) | Single person: 7.1ms | [Object detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)<br> [Attribute recognition](https://bj.bcebos.com/v1/paddledet/models/pipeline/strongbaseline_r50_30e_pa100k.zip) | Object detection: 182M<br>Attribute recognition: 86M |
| Falling detection | Single person: 10ms | [Multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) <br> [Keypoint detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/dark_hrnet_w32_256x192.zip) <br> [Behavior detection based on key points](https://bj.bcebos.com/v1/paddledet/models/pipeline/STGCN.zip) | Multi-object tracking: 182M<br>Keypoint detection: 101M<br>Behavior detection based on key points: 21.8M |
| Intrusion detection | 31.8ms | [Multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) | 182M |
| Fighting detection | 19.7ms | [Video classification](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) | 90M |
| Smoking detection | Single person: 15.1ms | [Object detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)<br>[Object detection based on Human ID](https://bj.bcebos.com/v1/paddledet/models/pipeline/ppyoloe_crn_s_80e_smoking_visdrone.zip) | Object detection: 182M<br>Object detection based on Human ID: 27M |
| Phoning detection | Single person ms | [Object detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)<br>[Image classification based on Human ID](https://bj.bcebos.com/v1/paddledet/models/pipeline/PPHGNet_tiny_calling_halfbody.zip) | Object detection: 182M<br>Image classification based on Human ID: 45M |
Please refer to [docs](deploy/pipeline/README_en.md) for details.
</details>
<details>
<summary><b> 6. Industrial real-time vehicle analysis tool: PP-Vehicle</b></summary>
| Task | End-to-End Speed (ms) | Model | Size |
|:--------------------------------------:|:--------------------:|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------:|
| Vehicle detection (high precision) | 25.7ms | [object detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_ppvehicle.zip) | 182M |
| Vehicle detection (lightweight) | 13.2ms | [object detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_ppvehicle.zip) | 27M |
| Vehicle tracking (high precision) | 40ms | [multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_ppvehicle.zip) | 182M |
| Vehicle tracking (lightweight) | 25ms | [multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_pipeline.zip) | 27M |
| Plate Recognition | 4.68ms | [plate detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/ch_PP-OCRv3_det_infer.tar.gz)<br>[plate recognition](https://bj.bcebos.com/v1/paddledet/models/pipeline/ch_PP-OCRv3_rec_infer.tar.gz) | Plate detection: 3.9M<br>Plate recognition: 12M |
| Vehicle attribute | 7.31ms | [attribute recognition](https://bj.bcebos.com/v1/paddledet/models/pipeline/vehicle_attribute_model.zip) | 7.2M |
Please refer to [docs](deploy/pipeline/README_en.md) for details.
</details>
## <img src="https://user-images.githubusercontent.com/48054808/157828296-d5eb0ccb-23ea-40f5-9957-29853d7d13a9.png" width="20"/> Document tutorials
### Introductory tutorials
- [Installation](docs/tutorials/INSTALL_cn.md)
- [Quick start](docs/tutorials/QUICK_STARTED_cn.md)
- [Data preparation](docs/tutorials/data/README.md)
- [Getting Started on PaddleDetection](docs/tutorials/GETTING_STARTED_cn.md)
- [FAQ](docs/tutorials/FAQ)
### Advanced tutorials
- Configuration
- [RCNN Configuration](docs/tutorials/config_annotation/faster_rcnn_r50_fpn_1x_coco_annotation.md)
- [PP-YOLO Configuration](docs/tutorials/config_annotation/ppyolo_r50vd_dcn_1x_coco_annotation.md)
- Compression based on [PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim)
- [Pruning/Quantization/Distillation Tutorial](configs/slim)
- [Inference deployment](deploy/README.md)
- [Export model for inference](deploy/EXPORT_MODEL.md)
- [Paddle Inference deployment](deploy/README.md)
- [Inference deployment with Python](deploy/python)
- [Inference deployment with C++](deploy/cpp)
- [Paddle-Lite deployment](deploy/lite)
- [Paddle Serving deployment](deploy/serving)
- [ONNX model export](deploy/EXPORT_ONNX_MODEL.md)
- [Inference benchmark](deploy/BENCHMARK_INFER.md)
- Advanced development
- [Data processing module](docs/advanced_tutorials/READER.md)
- [New object detection models](docs/advanced_tutorials/MODEL_TECHNICAL.md)
- Customization
- [Object detection](docs/advanced_tutorials/customization/detection.md)
- [Keypoint detection](docs/advanced_tutorials/customization/keypoint_detection.md)
- [Multiple object tracking](docs/advanced_tutorials/customization/pphuman_mot.md)
- [Action recognition](docs/advanced_tutorials/customization/action_recognotion/)
- [Attribute recognition](docs/advanced_tutorials/customization/pphuman_attribute.md)
### Courses
- **[Theoretical foundation] [Object detection 7-day camp](https://aistudio.baidu.com/aistudio/education/group/info/1617):** overview of object detection tasks, details of the RCNN and YOLO series of object detection algorithms, PP-YOLO optimization strategies and case sharing, and an introduction to and practice with anchor-free algorithms
- **[Industrial application] [AI Fast Track industrial object detection technology and application](https://aistudio.baidu.com/aistudio/education/group/info/23670):** state-of-the-art object detection algorithms, the real-time pedestrian analysis system PP-Human, and a breakdown and practice of industrial object detection applications
- **[Industrial features] 2022.3.26 [Smart City Industry Seven-Day Class](https://aistudio.baidu.com/aistudio/education/group/info/25620):** urban planning, urban governance, smart governance services, traffic management, and community governance
- **[Academic exchange] 2022.9.27 [YOLO Vision Event](https://www.youtube.com/playlist?list=PL1FZnkj4ad1NHVC7CMc3pjSQ-JRK-Ev6O):** PaddleDetection was invited to the first YOLO-themed event to exchange ideas with computer vision experts from around the world
### [Industrial tutorial examples](./industrial_tutorial/README.md)
- [Rotated object detection based on PP-YOLOE-R](https://aistudio.baidu.com/aistudio/projectdetail/5058293)
- [Aerial image detection based on PP-YOLOE-SOD](https://aistudio.baidu.com/aistudio/projectdetail/5036782)
- [Fall down recognition based on PP-Human v2](https://aistudio.baidu.com/aistudio/projectdetail/4606001)
- [Intelligent fitness recognition based on PP-TinyPose Plus](https://aistudio.baidu.com/aistudio/projectdetail/4385813)
- [Road litter detection based on PP-PicoDet Plus](https://aistudio.baidu.com/aistudio/projectdetail/3561097)
- [Visitor flow statistics based on FairMOT](https://aistudio.baidu.com/aistudio/projectdetail/2421822)
- [Guest analysis based on PP-Human](https://aistudio.baidu.com/aistudio/projectdetail/4537344)
- [More examples](./industrial_tutorial/README.md)
## <img title="" src="https://user-images.githubusercontent.com/48054808/157836473-1cf451fa-f01f-4148-ba68-b6d06d5da2f9.png" alt="" width="20"> Applications
- [Fitness app on android mobile](https://github.com/zhiboniu/pose_demo_android)
- [PP-Tracking GUI Visualization Interface](https://github.com/yangyudong2020/PP-Tracking_GUi)
## Recommended third-party tutorials
- [Deployment of PaddleDetection for Windows I ](https://zhuanlan.zhihu.com/p/268657833)
- [Deployment of PaddleDetection for Windows II](https://zhuanlan.zhihu.com/p/280206376)
- [Deployment of PaddleDetection on Jetson Nano](https://zhuanlan.zhihu.com/p/319371293)
- [How to deploy YOLOv3 model on Raspberry Pi for Helmet detection](https://github.com/PaddleCV-FAQ/PaddleDetection-FAQ/blob/main/Lite%E9%83%A8%E7%BD%B2/yolov3_for_raspi.md)
- [Use SSD-MobileNetv1 for a project -- From dataset to deployment on Raspberry Pi](https://github.com/PaddleCV-FAQ/PaddleDetection-FAQ/blob/main/Lite%E9%83%A8%E7%BD%B2/ssd_mobilenet_v1_for_raspi.md)
## <img src="https://user-images.githubusercontent.com/48054808/157835981-ef6057b4-6347-4768-8fcc-cd07fcc3d8b0.png" width="20"/> Version updates
Please refer to the [release note](https://github.com/PaddlePaddle/Paddle/wiki/PaddlePaddle-2.3.0-Release-Note-EN) for more details about the updates.
## <img title="" src="https://user-images.githubusercontent.com/48054808/157835345-f5d24128-abaf-4813-b793-d2e5bdc70e5a.png" alt="" width="20"> License
PaddleDetection is provided under the [Apache 2.0 license](LICENSE).
## <img src="https://user-images.githubusercontent.com/48054808/157835796-08d4ffbc-87d9-4622-89d8-cf11a44260fc.png" width="20"/> Contribute your code
We appreciate your contributions and your feedback!
- Thanks to [Mandroide](https://github.com/Mandroide) for cleaning the code and unifying some function interfaces
- Thanks to [FL77N](https://github.com/FL77N/) for the `Sparse-RCNN` model
- Thanks to [Chen-Song](https://github.com/Chen-Song) for the `Swin Faster-RCNN` model
- Thanks to [yangyudong](https://github.com/yangyudong2020) and [hchhtc123](https://github.com/hchhtc123) for developing the PP-Tracking GUI interface
- Thanks to Shigure19 for developing the PP-TinyPose fitness APP
- Thanks to [manangoel99](https://github.com/manangoel99) for Wandb visualization support
## <img src="https://user-images.githubusercontent.com/48054808/157835276-9aab9d1c-1c46-446b-bdd4-5ab75c5cfa48.png" width="20"/> Citation
```
@misc{ppdet2019,
title={PaddleDetection, Object detection and instance segmentation toolkit based on PaddlePaddle.},
author={PaddlePaddle Authors},
howpublished = {\url{https://github.com/PaddlePaddle/PaddleDetection}},
year={2019}
}
```
@@ -0,0 +1,125 @@
# Live Q&A Session 1
### The full replay of the Q&A session can be downloaded and watched via this link: https://pan.baidu.com/s/168ouju4MxN5XJEb-GU1iAw (extraction code: 92mw)
## PaddleDetection framework/API questions
#### Q1. Could you explain warmup in more detail?
A1. Warmup is the process of ramping the learning rate from 0 up to the preset learning rate at the beginning of training. For how to configure it, see the [source code](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/ppdet/optimizer.py#L156); it can be specified in steps or in epochs. A sketch is shown below.
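Below is a minimal sketch of a linear warmup schedule, assuming a `LinearWarmup`-style rule in which the learning rate grows from `base_lr * start_factor` to `base_lr` over `warmup_steps` steps; the function name and default values are illustrative, not PaddleDetection's exact API.
```
def warmup_lr(step, base_lr=0.01, warmup_steps=1000, start_factor=0.001):
    """Linearly ramp the learning rate up to base_lr during the first warmup_steps."""
    if step >= warmup_steps:
        return base_lr
    alpha = step / warmup_steps
    return base_lr * (start_factor + (1.0 - start_factor) * alpha)

# Example: learning rate at a few points during warmup.
for s in (0, 250, 500, 1000):
    print(s, warmup_lr(s))
```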
#### Q2. Can pretrained weights still be used when the classes do not match?
A2. Yes. When the classes do not match, the model automatically skips loading the weights whose shapes do not match; the weights related to the number of classes are usually located in the head layers.
#### Q3. How is nms_eta used? The source code is not very clear about it and the API documentation does not explain it in detail.
A3. In crowded scenes, nms_eta dynamically adjusts the NMS threshold in every round, which avoids filtering out two heavily overlapping boxes that belong to different objects; see the [source code](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/operators/detection/multiclass_nms_op.cc#L139) for details. The default value is 1, and it usually does not need to be changed. A sketch of the idea follows.
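A minimal NumPy sketch of greedy NMS with an eta-style adaptive threshold that mirrors the idea described above (after each kept box, the IoU threshold is multiplied by `eta` as long as it stays above 0.5); this is an illustration of the mechanism, not the operator's exact implementation.
```
import numpy as np

def nms_with_eta(boxes, scores, thresh=0.7, eta=0.9):
    """Greedy NMS; when eta < 1 the IoU threshold decays after every kept box."""
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]
    keep, adaptive = [], thresh
    while order.size > 0:
        i = order[0]
        keep.append(i)
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        order = order[1:][iou <= adaptive]
        if eta < 1.0 and adaptive > 0.5:
            adaptive *= eta  # relax the threshold so dense, overlapping objects survive
    return keep
```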
#### Q4. In anchor_cluster.py, is --size the model input size or the size of the images actually used?
A4. It is the image size used at inference time; you can generally refer to the image_shape setting in TestReader.
#### Q5. Why do the predicted coordinates sometimes contain negative values?
A5. Negative values are possible in the algorithm's output. First check whether the model's predictions look as expected. If they do, you can add a clip operation in post-processing to keep the output boxes inside the image; if they do not, the model is not trained well enough and you need to investigate or tune it further. A clipping sketch is shown below.
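A minimal NumPy sketch of clipping predicted boxes to the image boundary in post-processing; a generic illustration rather than PaddleDetection's built-in post-processing.
```
import numpy as np

def clip_boxes(boxes, im_h, im_w):
    """Clip [x1, y1, x2, y2] boxes so they stay inside an im_h x im_w image."""
    boxes = boxes.copy()
    boxes[:, 0::2] = np.clip(boxes[:, 0::2], 0, im_w - 1)  # x coordinates
    boxes[:, 1::2] = np.clip(boxes[:, 1::2], 0, im_h - 1)  # y coordinates
    return boxes

print(clip_boxes(np.array([[-5.0, 10.0, 120.0, 700.0]]), im_h=480, im_w=640))
```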
#### Q6. For the PaddleDetection BlazeFace face-detection model, load_params has no parameter file during one-click prediction; where can it be downloaded?
A6. The BlazeFace models can be downloaded from the [model zoo](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4/configs/face_detection#%E6%A8%A1%E5%9E%8B%E5%BA%93); if you want to deploy them, export the model following these [steps](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/deploy/EXPORT_MODEL.md).
## PP-YOLOE questions
#### Q1. When training PP-YOLOE the loss keeps increasing. Is this a dataset problem?
A1. You can check the following aspects:
1. Data: first confirm that the dataset is correct, including annotations, classes, and so on
2. Hyperparameters: adjust base_lr according to the batch size following the linear scaling rule (see the sketch after this list), and adjust warmup_iters according to the total number of epochs
3. Pretrained weights: load the official weights pretrained on the COCO dataset
4. Network structure: analyze the distribution of the boxes and adjust the DFL parameters appropriately
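A small sketch of the linear scaling rule mentioned in item 2; the reference configuration (base learning rate, batch size, and GPU count) is illustrative.
```
def scaled_lr(base_lr, base_batch_size, base_gpus, batch_size, gpus):
    """Linear scaling rule: the learning rate grows proportionally to the total batch size."""
    return base_lr * (batch_size * gpus) / (base_batch_size * base_gpus)

# Example: reference setting of 8 GPUs x batch size 8 at lr 0.01, now training on 4 GPUs x batch size 4.
print(scaled_lr(0.01, base_batch_size=8, base_gpus=8, batch_size=4, gpus=4))  # 0.0025
```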
#### Q2. Model selection: how should I choose between PicoDet and the PP-YOLO series?
A2. PicoDet is designed for mobile and other low-compute devices (ARM, x86, etc.), while PP-YOLO is designed for the server side (NVIDIA GPUs, Baidu Kunlun cards, etc.). For mobile phones and desktops without a GPU, prefer PicoDet; with high-compute devices such as NVIDIA GPUs, prefer the PP-YOLO series; in latency-insensitive scenarios that care more about high accuracy, prefer the PP-YOLO series.
#### Q3. Is it true that the BN parameters in ConvBNLayer never use L2Decay, while the rest of PP-YOLOE-s uses the 0.0005 L2Decay set in the config file?
A3. The backbone and neck of PP-YOLOE use ConvBNLayer, whose BN layers do not use L2Decay; the other parts use the globally configured L2Decay of 0.0005.
#### Q4. Is decay also not applied to the Conv bias in PP-YOLOE?
A4. The Conv layers in the backbone and neck of PP-YOLOE have no bias parameters; the Conv bias in the head uses the global decay.
#### Q5. Why is Paddle Inference used for speed testing instead of directly loading the model and timing it?
A5. Paddle Inference fuses the forward operators of the exported inference model to optimize speed, and the actual deployment process also uses Paddle Inference.
#### Q6. Do the PP-YOLOE series models use the same pre- and post-processing at deployment time?
A6. At deployment, the PP-YOLO series models all use the decode-resize-normalize-permute pre-processing pipeline; for post-processing, PP-YOLOv2 uses Matrix NMS while PP-YOLOE uses the ordinary NMS algorithm.
#### Q7. Does PP-YOLOE have any tuning strategies for small objects and class-imbalanced datasets?
A7. For small-object datasets, you can appropriately increase the PP-YOLOE input size and add attention mechanisms to the model; small-object detection based on PP-YOLOE is currently under development. For class imbalance, it can be handled from the data-sampling side; PP-YOLOE currently has no optimization specifically targeting class imbalance.
## PP-Human questions
#### Q1. Why does PP-Human report an error when predicting with an exported keypoint model that outputs 18 points instead of the official 17?
A1. The error is caused by a mismatch between the number of keypoints output by the keypoint model and the number expected by the action-recognition model. If you want to predict with an 18-point model, besides using the 18-point keypoint model you also need to build your own 18-point action-recognition model.
#### Q2. Why is window_size set to 50 in the officially exported model?
A2. The export setting matches the input-sequence length used for training and prediction. The datasets we mainly use are NTU, real-world data provided by companies, and so on. When training this model we analyzed the falling clips in these data, and most action clips last roughly 40-80 frames. Considering both the latency in practical use and the prediction quality, we chose 50, which on our data is long enough to describe a complete action without making the latency too large.
In general, the value of window_size is best chosen according to the actual actions and the device. For example, if on some device 50 frames are not enough to contain a complete action, the value should be increased; conversely, if some actions are very short and a 50-frame window contains too many unrelated actions and easily causes false recognition, the value can be reduced appropriately. A buffering sketch is shown below.
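A minimal sketch of how a fixed-length window of per-frame keypoints could be buffered before being fed to a skeleton-based action classifier; `WINDOW_SIZE`, the keypoint layout, and `classify` are illustrative placeholders rather than PP-Human's actual interface.
```
from collections import deque
import numpy as np

WINDOW_SIZE = 50      # number of frames per action clip
NUM_KEYPOINTS = 17    # each keypoint stored as (x, y, score)

def classify(clip):
    """Placeholder for a skeleton-based action classifier."""
    return "falling" if clip.mean() < 0.5 else "normal"

buffer = deque(maxlen=WINDOW_SIZE)

def on_new_frame(keypoints):
    """Push one frame of keypoints and classify once the window is full."""
    buffer.append(keypoints)
    if len(buffer) == WINDOW_SIZE:
        clip = np.stack(buffer)  # shape: (WINDOW_SIZE, NUM_KEYPOINTS, 3)
        return classify(clip)
    return None

for _ in range(60):
    result = on_new_frame(np.random.rand(NUM_KEYPOINTS, 3))
```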
#### Q3. How do I replace the detection, tracking, and keypoint models in PP-Human?
A3. The models we use are all exported from PaddleDetection models. In principle, every model used by PP-Human can be replaced directly, but note that the replacement must follow the same pipeline and the same pre- and post-processing.
#### Q4. Data annotation in PP-Human: recommended tools and annotation steps for detection, tracking, keypoints, actions, and attributes
A4. Annotation tools: detection - labelme, labelImg, cvat; tracking - darklabel, cvat; keypoints - labelme, cvat. Detection annotations can be converted to COCO format with tools/x2coco.py.
#### Q5. How do I change the labels in PP-Human (attribute and action recognition)?
A5. In PP-Human, action recognition is defined as a classification problem over skeleton keypoint sequences; the falling recognition we have open-sourced so far is a binary classification problem. For attributes, training is not yet open and is under construction.
#### Q6. Which PP-Human features support a single person and which support multiple people?
A6. PP-Human's features are built on one pipeline: detection -> tracking -> specific function. The specific function models currently process one person at a time, i.e. attributes, actions, and so on belong to each individual person in the image. Since every person in the image goes through this pipeline, both the single-person and multi-person cases are supported in the same way.
#### Q7. Does PP-Human support video-stream prediction and serving deployment?
A7. This is currently under development and will be supported in the next release.
#### Q8. When training PP-Human on my own dataset, how do I change the visualized label at test time? Without changing it, the label still shows "falling".
A8. The visualization function is at https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/deploy/python/visualize.py#L368; replace action_text with the desired class name during visualization.
#### Q9. Can keypoint detection be used to detect a continuous action, for example whether a fitness movement is performed correctly?
A9. Yes, this can be done based on keypoints, and there are different ways to approach it:
1. If the goal is to judge how standard a movement is and the movement can be described well, you can add hand-written logic on top of the coordinates produced by the keypoint model. We provide an Android fitness APP example: https://github.com/zhiboniu/pose_demo_android; the logic that judges each movement can be found at https://github.com/zhiboniu/pose_demo_android/blob/release/1.0/app/src/main/cpp/pose_action.cc.
2. When a movement is hard to describe with rules, you can follow the existing falling-detection case and train a model that recognizes fitness movements, but the requirements on data collection are higher.
#### Q10. In a production environment with occlusion, such as ladders, can keypoint detection be used to judge whether climbing up and down the ladder is compliant?
A10. It depends on how severe the occlusion is. If the occlusion is too heavy, the keypoint model's accuracy degrades significantly, which makes the action judgment unreliable. In addition, since the keypoint-based approach discards appearance information, it can work in lightly occluded scenes if the judgment only depends on the person's own motion; but if the ladder itself is a necessary element for deciding whether the action is compliant, this approach may not be the best choice.
#### Q11. Isn't the keypoint-based behavior recognition actually temporal action recognition?
A11. It is temporal action recognition. The per-frame keypoint coordinates within a time window are assembled into a temporal keypoint sequence, and the action-recognition model predicts the action class of that sequence.
## Detection algorithm questions
#### Q1. Small objects in large images, with inference also run on large images: how should they be pre-processed?
A1. Common ways to handle small objects are tiling the image and increasing the network input size. If you use an anchor-based detector, you can generate anchors by clustering the object sizes; see this [script](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/tools/anchor_cluster.py) and the clustering sketch after Q2 below. Small-object detection based on PP-YOLOE is currently under development.
#### Q2. How should large objects, such as invoices, be detected?
A2. If you use an anchor-based detector, you can generate anchors by clustering the object sizes; see this [script](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/tools/anchor_cluster.py) and the sketch below. You can also strengthen the deep features to improve detection of large objects.
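A minimal NumPy sketch of k-means clustering on ground-truth box widths and heights with an IoU-based distance, which is the common idea behind anchor clustering; this is an illustration, not the anchor_cluster.py implementation.
```
import numpy as np

def iou_wh(boxes, anchors):
    """IoU between (w, h) pairs, assuming boxes and anchors share a top-left corner."""
    inter = np.minimum(boxes[:, None, 0], anchors[None, :, 0]) * \
            np.minimum(boxes[:, None, 1], anchors[None, :, 1])
    union = boxes[:, 0:1] * boxes[:, 1:2] + anchors[:, 0] * anchors[:, 1] - inter
    return inter / union

def kmeans_anchors(wh, k=9, iters=100, seed=0):
    """Cluster (w, h) pairs into k anchors using 1 - IoU as the distance."""
    rng = np.random.default_rng(seed)
    anchors = wh[rng.choice(len(wh), k, replace=False)]
    for _ in range(iters):
        assign = np.argmax(iou_wh(wh, anchors), axis=1)  # nearest anchor by IoU
        anchors = np.array([wh[assign == i].mean(axis=0) if np.any(assign == i)
                            else anchors[i] for i in range(k)])
    return anchors[np.argsort(anchors.prod(axis=1))]

wh = np.abs(np.random.randn(500, 2)) * 80 + 20  # fake ground-truth box sizes
print(kmeans_anchors(wh, k=9))
```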
#### Q3. During prediction there are a lot of boxes, some with confidence even below 0.1. How can such boxes be filtered out, i.e. already discard these extremely low-confidence predictions so that unnecessary computation is avoided during inference deployment and the inference speed is not affected?
A3. There are two filters in post-processing: first, the 100 highest-confidence boxes are taken for NMS; second, boxes are filtered by the configured threshold. If you can confirm that there are relatively few objects per image (fewer than 10), you can lower the top-100 value to 50 or less, which speeds up the NMS computation; you can also adjust the threshold, which affects the final detection precision and recall. A sketch is shown below.
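A minimal NumPy sketch of the two filters described above (a score threshold followed by a top-k cut before NMS); the function name and default values are illustrative.
```
import numpy as np

def prefilter(boxes, scores, score_thresh=0.05, keep_top_k=100):
    """Drop low-confidence boxes, then keep at most keep_top_k by score before NMS."""
    mask = scores >= score_thresh
    boxes, scores = boxes[mask], scores[mask]
    if len(scores) > keep_top_k:
        top = np.argsort(scores)[::-1][:keep_top_k]
        boxes, scores = boxes[top], scores[top]
    return boxes, scores

boxes, scores = np.random.rand(500, 4), np.random.rand(500)
print(prefilter(boxes, scores, score_thresh=0.1, keep_top_k=50)[1].shape)
```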
#### Q4. How should the ratio of positive to negative samples generally be designed?
A4. PaddleDetection supports training with negative samples: set allow_empty: true under TrainDataset. Tests on our datasets show that a negative-sample ratio of 0.3 improves the model most noticeably.
## Compression and deployment questions
#### Q1. After exporting an inference model trained with PaddleDetection, how should the pre- and post-processing code for inference deployment be written? Is there a reference tutorial?
A1. Most of the models in PaddleDetection currently support C++ inference. Different handling serves different purposes: for example, the PP-YOLOE speed test does not include post-processing, and PicoDet can be exported with or without NMS so that different third-party inference engines are supported.
object_detector.cc covers the pipeline for all detection models. The pre-processing is mostly decode-resize-normalize-permute (some networks add a padding step; see the sketch below). Most models keep the post-processing inside the model itself, and PicoDet additionally provides standalone NMS post-processing code.
The inputs of the detection models are standardized to image, im_shape, and scale_factor. If im_shape is not used in a model, the number of inputs is reduced, but the whole pre-processing flow requires no extra development.
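A minimal Python sketch of the decode-resize-normalize-permute pre-processing flow mentioned above; the target size and mean/std values are illustrative, and the real values come from the exported model's inference config.
```
import cv2
import numpy as np

def preprocess(image_path, target_size=(640, 640),
               mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)):
    """Decode -> resize -> normalize -> permute; returns NCHW input and scale_factor."""
    img = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2RGB)       # decode
    h, w = img.shape[:2]
    img = cv2.resize(img, target_size, interpolation=cv2.INTER_LINEAR)  # resize
    scale_factor = np.array([target_size[1] / h, target_size[0] / w], dtype=np.float32)
    img = img.astype(np.float32) / 255.0
    img = (img - np.array(mean, np.float32)) / np.array(std, np.float32) # normalize
    img = img.transpose(2, 0, 1)[None, ...]   # permute HWC -> CHW and add a batch dim
    return img, scale_factor
```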
#### Q2. About TensorRT acceleration: FP16 does work on V100, but the timing seems off. On a 1080 Ti, running a single image 1000 times takes 50 s in FP32, yet on V100 FP16 takes 97 s?
A2. The reported speed of PP-YOLOE and other models is measured on V100 with TensorRT FP16. For speed testing, the following points can be checked:
1. Make sure warmup is set correctly when timing, so that the long start-up time does not distort the measurement
2. When TensorRT is enabled, building the engine file takes a long time; you can set use_static to True in https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/deploy/python/infer.py#L745 (see the sketch after this list)
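For reference, a sketch of enabling TensorRT FP16 with a cached (static) engine through the Paddle Inference Python API; the model path is a hypothetical export directory and the argument values are illustrative.
```
from paddle.inference import Config, PrecisionType, create_predictor

# Hypothetical exported model files; adjust the paths to your own export directory.
config = Config("output_inference/model.pdmodel", "output_inference/model.pdiparams")
config.enable_use_gpu(200, 0)                # 200 MB initial GPU memory pool, GPU id 0
config.enable_tensorrt_engine(
    workspace_size=1 << 30,
    max_batch_size=1,
    min_subgraph_size=3,
    precision_mode=PrecisionType.Half,       # FP16
    use_static=True,                         # serialize the engine so it is built only once
    use_calib_mode=False)
predictor = create_predictor(config)
```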
#### Q3. PaddleDetection already supports quantization-aware training (QAT) for some models. If I want to train another, new model, can QAT be used easily? If not, why is only a limited set of models supported, and why does QAT on other models always run into various problems?
A3. PaddleDetection has only released QAT configs for some models, but the other models support QAT as well; the config files simply do not cover them yet. If quantization reports an error, it is usually a configuration problem. For detection models it is generally recommended to skip the last conv of the head. If you want to skip quantization for certain layers, set skip_quant; see the [code](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/ppdet/modeling/heads/yolo_head.py#L97).
@@ -0,0 +1,47 @@
# Notes on the general detection benchmark test scripts
```
├── benchmark
│ ├── analysis_log.py
│ ├── prepare.sh
│ ├── README.md
│ ├── run_all.sh
│ ├── run_benchmark.sh
```
## Script description
### prepare.sh
Data preparation script that automatically downloads the required datasets and models
### run_all.sh
Main entry script that runs the benchmark for all covered models
### run_benchmark.sh
Single-model script that runs the benchmark for a specified model
## Docker runtime environment
* docker image: registry.baidubce.com/paddlepaddle/paddle:2.1.2-gpu-cuda10.2-cudnn7
* paddle = 2.1.2
* python = 3.7
## Running the benchmark
### Run all models
```
git clone https://github.com/PaddlePaddle/PaddleDetection.git
cd PaddleDetection
bash benchmark/run_all.sh
```
### Run a specified model
* Usage: bash run_benchmark.sh ${run_mode} ${batch_size} ${fp_item} ${max_epoch} ${model_name}
* model_name: faster_rcnn, fcos, deformable_detr, gfl, hrnet, higherhrnet, solov2, jde, fairmot
```
git clone https://github.com/PaddlePaddle/PaddleDetection.git
cd PaddleDetection
bash benchmark/prepare.sh
# single GPU
CUDA_VISIBLE_DEVICES=0 bash benchmark/run_benchmark.sh sp 2 fp32 1 faster_rcnn
# multiple GPUs
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 bash benchmark/run_benchmark.sh mp 2 fp32 1 faster_rcnn
```
@@ -0,0 +1,48 @@
_BASE_: [
'../../configs/datasets/coco_detection.yml',
'../../configs/runtime.yml',
'../../configs/faster_rcnn/_base_/optimizer_1x.yml',
'../../configs/faster_rcnn/_base_/faster_rcnn_r50_fpn.yml',
]
weights: output/faster_rcnn_r50_fpn_1x_coco/model_final
worker_num: 2
TrainReader:
sample_transforms:
- Decode: {}
- Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
- RandomFlip: {prob: 0.5}
- NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 1
shuffle: true
drop_last: true
collate_batch: false
EvalReader:
sample_transforms:
- Decode: {}
- Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
- NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 1
shuffle: false
drop_last: false
TestReader:
sample_transforms:
- Decode: {}
- Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
- NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 1
shuffle: false
drop_last: false
@@ -0,0 +1,17 @@
#!/usr/bin/env bash
pip install -U pip Cython
pip install -r requirements.txt
mv ./dataset/coco/download_coco.py . && rm -rf ./dataset/coco/* && mv ./download_coco.py ./dataset/coco/
# prepare lite train data
wget -nc -P ./dataset/coco/ https://paddledet.bj.bcebos.com/data/coco_benchmark.tar
cd ./dataset/coco/ && tar -xvf coco_benchmark.tar && mv -u coco_benchmark/* .
rm -rf coco_benchmark/
cd ../../
rm -rf ./dataset/mot/*
# prepare mot mini train data
wget -nc -P ./dataset/mot/ https://paddledet.bj.bcebos.com/data/mot_benchmark.tar
cd ./dataset/mot/ && tar -xvf mot_benchmark.tar && mv -u mot_benchmark/* .
rm -rf mot_benchmark/
@@ -0,0 +1,47 @@
# Use docker: paddlepaddle/paddle:latest-gpu-cuda10.1-cudnn7 paddle=2.1.2 python3.7
#
# Usage:
# git clone https://github.com/PaddlePaddle/PaddleDetection.git
# cd PaddleDetection
# bash benchmark/run_all.sh
log_path=${LOG_PATH_INDEX_DIR:-$(pwd)} # set by the benchmark system; when profiling is not needed, log_path points to the directory where the speed logs are stored
# run prepare.sh
bash benchmark/prepare.sh
model_name_list=(faster_rcnn fcos deformable_detr gfl hrnet higherhrnet solov2 jde fairmot)
fp_item_list=(fp32)
max_epoch=2
for model_item in ${model_name_list[@]}; do
for fp_item in ${fp_item_list[@]}; do
case ${model_item} in
faster_rcnn) bs_list=(1 8) ;;
fcos) bs_list=(2) ;;
deformable_detr) bs_list=(2) ;;
gfl) bs_list=(2) ;;
hrnet) bs_list=(64) ;;
higherhrnet) bs_list=(20) ;;
solov2) bs_list=(2) ;;
jde) bs_list=(4) ;;
fairmot) bs_list=(6) ;;
*) echo "wrong model_name"; exit 1;
esac
for bs_item in ${bs_list[@]}
do
run_mode=sp
log_name=detection_${model_item}_bs${bs_item}_${fp_item} # e.g. clas_MobileNetv1_mp_bs32_fp32_8
echo "index is speed, 1gpus, begin, ${log_name}"
CUDA_VISIBLE_DEVICES=0 bash benchmark/run_benchmark.sh ${run_mode} ${bs_item} \
${fp_item} ${max_epoch} ${model_item} | tee ${log_path}/${log_name}_speed_1gpus 2>&1
sleep 60
run_mode=mp
log_name=detection_${model_item}_bs${bs_item}_${fp_item} # e.g. clas_MobileNetv1_mp_bs32_fp32_8
echo "index is speed, 8gpus, run_mode is multi_process, begin, ${log_name}"
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 bash benchmark/run_benchmark.sh ${run_mode} \
${bs_item} ${fp_item} ${max_epoch} ${model_item}| tee ${log_path}/${log_name}_speed_8gpus8p 2>&1
sleep 60
done
done
done
@@ -0,0 +1,92 @@
#!/usr/bin/env bash
set -xe
# Usage: CUDA_VISIBLE_DEVICES=0 bash benchmark/run_benchmark.sh ${run_mode} ${batch_size} ${fp_item} ${max_epoch} ${model_name}
python="python3.7"
# Parameter description
function _set_params(){
run_mode=${1:-"sp"} # sp|mp
batch_size=${2:-"2"}
fp_item=${3:-"fp32"} # fp32|fp16
max_epoch=${4:-"1"}
model_item=${5:-"model_item"}
run_log_path=${TRAIN_LOG_DIR:-$(pwd)}
# parameters required for log parsing
base_batch_size=${batch_size}
mission_name="目标检测" # task name ("object detection") consumed by the benchmark log-analysis tooling
direction_id="0"
ips_unit="images/s"
skip_steps=10 # number of steps to skip when parsing the log; the first few steps of some models are slow (required)
keyword="ips:" # keyword used to locate the lines that contain throughput data when parsing the log (required)
index="1"
model_name=${model_item}_bs${batch_size}_${fp_item}
device=${CUDA_VISIBLE_DEVICES//,/ }
arr=(${device})
num_gpu_devices=${#arr[*]}
log_file=${run_log_path}/${model_item}_${run_mode}_bs${batch_size}_${fp_item}_${num_gpu_devices}
}
function _train(){
echo "Train on ${num_gpu_devices} GPUs"
echo "current CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES, gpus=$num_gpu_devices, batch_size=$batch_size"
# set runtime params
set_optimizer_lr_sp=" "
set_optimizer_lr_mp=" "
# parse model_item
case ${model_item} in
faster_rcnn) model_yml="benchmark/configs/faster_rcnn_r50_fpn_1x_coco.yml"
set_optimizer_lr_sp="LearningRate.base_lr=0.001" ;;
fcos) model_yml="configs/fcos/fcos_r50_fpn_1x_coco.yml"
set_optimizer_lr_sp="LearningRate.base_lr=0.001" ;;
deformable_detr) model_yml="configs/deformable_detr/deformable_detr_r50_1x_coco.yml" ;;
gfl) model_yml="configs/gfl/gfl_r50_fpn_1x_coco.yml"
set_optimizer_lr_sp="LearningRate.base_lr=0.001" ;;
hrnet) model_yml="configs/keypoint/hrnet/hrnet_w32_256x192.yml" ;;
higherhrnet) model_yml="configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_512.yml" ;;
solov2) model_yml="configs/solov2/solov2_r50_fpn_1x_coco.yml" ;;
jde) model_yml="configs/mot/jde/jde_darknet53_30e_1088x608.yml" ;;
fairmot) model_yml="configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml" ;;
*) echo "Undefined model_item"; exit 1;
esac
set_batch_size="TrainReader.batch_size=${batch_size}"
set_max_epoch="epoch=${max_epoch}"
set_log_iter="log_iter=1"
if [ ${fp_item} = "fp16" ]; then
set_fp_item="--fp16"
else
set_fp_item=" "
fi
case ${run_mode} in
sp) train_cmd="${python} -u tools/train.py -c ${model_yml} ${set_fp_item} \
-o ${set_batch_size} ${set_max_epoch} ${set_log_iter} ${set_optimizer_lr_sp}" ;;
mp) rm -rf mylog
train_cmd="${python} -m paddle.distributed.launch --log_dir=./mylog \
--gpus=${CUDA_VISIBLE_DEVICES} tools/train.py -c ${model_yml} ${set_fp_item} \
-o ${set_batch_size} ${set_max_epoch} ${set_log_iter} ${set_optimizer_lr_mp}"
log_parse_file="mylog/workerlog.0" ;;
*) echo "choose run_mode(sp or mp)"; exit 1;
esac
timeout 15m ${train_cmd} > ${log_file} 2>&1
if [ $? -ne 0 ];then
echo -e "${train_cmd}, FAIL"
export job_fail_flag=1
else
echo -e "${train_cmd}, SUCCESS"
export job_fail_flag=0
fi
kill -9 `ps -ef|grep 'python'|awk '{print $2}'`
if [ $run_mode = "mp" -a -d mylog ]; then
rm ${log_file}
cp mylog/workerlog.0 ${log_file}
fi
}
source ${BENCHMARK_ROOT}/scripts/run_model.sh # this script parses benchmark-compliant logs for performance data with analysis.py; for integration it can be downloaded from the benchmark repo: https://github.com/PaddlePaddle/benchmark/blob/master/scripts/run_model.sh; comment this line out if you only want to produce the training log without integration, but enable it again before submitting
_set_params $@
# _train # uncomment to only produce the training log without parsing it
_run # this function lives in run_model.sh and calls _train during execution; comment this line out if you only want to produce the training log without integration, but enable it again before submitting
@@ -0,0 +1,28 @@
# Cascade R-CNN: High Quality Object Detection and Instance Segmentation
## Model Zoo
| Backbone | Network type | Images/GPU | Lr schedule | Inference time (fps) | Box AP | Mask AP | Download | Config |
| :------------------- | :------------- | :-----: | :-----: | :------------: | :-----: | :-----: | :-----------------------------------------------------: | :-----: |
| ResNet50-FPN | Cascade Faster | 1 | 1x | ---- | 41.1 | - | [model](https://paddledet.bj.bcebos.com/models/cascade_rcnn_r50_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/cascade_rcnn/cascade_rcnn_r50_fpn_1x_coco.yml) |
| ResNet50-FPN | Cascade Mask | 1 | 1x | ---- | 41.8 | 36.3 | [model](https://paddledet.bj.bcebos.com/models/cascade_mask_rcnn_r50_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/cascade_rcnn/cascade_mask_rcnn_r50_fpn_1x_coco.yml) |
| ResNet50-vd-SSLDv2-FPN | Cascade Faster | 1 | 1x | ---- | 44.4 | - | [model](https://paddledet.bj.bcebos.com/models/cascade_rcnn_r50_vd_fpn_ssld_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/cascade_rcnn/cascade_rcnn_r50_vd_fpn_ssld_1x_coco.yml) |
| ResNet50-vd-SSLDv2-FPN | Cascade Faster | 1 | 2x | ---- | 45.0 | - | [model](https://paddledet.bj.bcebos.com/models/cascade_rcnn_r50_vd_fpn_ssld_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/cascade_rcnn/cascade_rcnn_r50_vd_fpn_ssld_2x_coco.yml) |
| ResNet50-vd-SSLDv2-FPN | Cascade Mask | 1 | 1x | ---- | 44.9 | 39.1 | [model](https://paddledet.bj.bcebos.com/models/cascade_mask_rcnn_r50_vd_fpn_ssld_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/cascade_rcnn/cascade_mask_rcnn_r50_vd_fpn_ssld_1x_coco.yml) |
| ResNet50-vd-SSLDv2-FPN | Cascade Mask | 1 | 2x | ---- | 45.7 | 39.7 | [model](https://paddledet.bj.bcebos.com/models/cascade_mask_rcnn_r50_vd_fpn_ssld_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/cascade_rcnn/cascade_mask_rcnn_r50_vd_fpn_ssld_2x_coco.yml) |
## Citations
```
@article{Cai_2019,
title={Cascade R-CNN: High Quality Object Detection and Instance Segmentation},
ISSN={1939-3539},
url={http://dx.doi.org/10.1109/tpami.2019.2956516},
DOI={10.1109/tpami.2019.2956516},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
publisher={Institute of Electrical and Electronics Engineers (IEEE)},
author={Cai, Zhaowei and Vasconcelos, Nuno},
year={2019},
pages={11}
}
```
@@ -0,0 +1,40 @@
worker_num: 2
TrainReader:
sample_transforms:
- Decode: {}
- RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
- RandomFlip: {prob: 0.5}
- NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 1
shuffle: true
drop_last: true
collate_batch: false
EvalReader:
sample_transforms:
- Decode: {}
- Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
- NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 1
shuffle: false
drop_last: false
TestReader:
sample_transforms:
- Decode: {}
- Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
- NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 1
shuffle: false
drop_last: false
@@ -0,0 +1,40 @@
worker_num: 2
TrainReader:
sample_transforms:
- Decode: {}
- RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
- RandomFlip: {prob: 0.5}
- NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 1
shuffle: true
drop_last: true
collate_batch: false
EvalReader:
sample_transforms:
- Decode: {}
- Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
- NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 1
shuffle: false
drop_last: false
TestReader:
sample_transforms:
- Decode: {}
- Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
- NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 1
shuffle: false
drop_last: false
@@ -0,0 +1,97 @@
architecture: CascadeRCNN
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
CascadeRCNN:
backbone: ResNet
neck: FPN
rpn_head: RPNHead
bbox_head: CascadeHead
mask_head: MaskHead
# post process
bbox_post_process: BBoxPostProcess
mask_post_process: MaskPostProcess
ResNet:
# index 0 stands for res2
depth: 50
norm_type: bn
freeze_at: 0
return_idx: [0,1,2,3]
num_stages: 4
FPN:
out_channel: 256
RPNHead:
anchor_generator:
aspect_ratios: [0.5, 1.0, 2.0]
anchor_sizes: [[32], [64], [128], [256], [512]]
strides: [4, 8, 16, 32, 64]
rpn_target_assign:
batch_size_per_im: 256
fg_fraction: 0.5
negative_overlap: 0.3
positive_overlap: 0.7
use_random: True
train_proposal:
min_size: 0.0
nms_thresh: 0.7
pre_nms_top_n: 2000
post_nms_top_n: 2000
topk_after_collect: True
test_proposal:
min_size: 0.0
nms_thresh: 0.7
pre_nms_top_n: 1000
post_nms_top_n: 1000
CascadeHead:
head: CascadeTwoFCHead
roi_extractor:
resolution: 7
sampling_ratio: 0
aligned: True
bbox_assigner: BBoxAssigner
BBoxAssigner:
batch_size_per_im: 512
bg_thresh: 0.5
fg_thresh: 0.5
fg_fraction: 0.25
cascade_iou: [0.5, 0.6, 0.7]
use_random: True
CascadeTwoFCHead:
out_channel: 1024
BBoxPostProcess:
decode:
name: RCNNBox
prior_box_var: [30.0, 30.0, 15.0, 15.0]
nms:
name: MultiClassNMS
keep_top_k: 100
score_threshold: 0.05
nms_threshold: 0.5
MaskHead:
head: MaskFeat
roi_extractor:
resolution: 14
sampling_ratio: 0
aligned: True
mask_assigner: MaskAssigner
share_bbox_feat: False
MaskFeat:
num_convs: 4
out_channel: 256
MaskAssigner:
mask_resolution: 28
MaskPostProcess:
binary_thresh: 0.5
@@ -0,0 +1,75 @@
architecture: CascadeRCNN
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
CascadeRCNN:
backbone: ResNet
neck: FPN
rpn_head: RPNHead
bbox_head: CascadeHead
# post process
bbox_post_process: BBoxPostProcess
ResNet:
# index 0 stands for res2
depth: 50
norm_type: bn
freeze_at: 0
return_idx: [0,1,2,3]
num_stages: 4
FPN:
out_channel: 256
RPNHead:
anchor_generator:
aspect_ratios: [0.5, 1.0, 2.0]
anchor_sizes: [[32], [64], [128], [256], [512]]
strides: [4, 8, 16, 32, 64]
rpn_target_assign:
batch_size_per_im: 256
fg_fraction: 0.5
negative_overlap: 0.3
positive_overlap: 0.7
use_random: True
train_proposal:
min_size: 0.0
nms_thresh: 0.7
pre_nms_top_n: 2000
post_nms_top_n: 2000
topk_after_collect: True
test_proposal:
min_size: 0.0
nms_thresh: 0.7
pre_nms_top_n: 1000
post_nms_top_n: 1000
CascadeHead:
head: CascadeTwoFCHead
roi_extractor:
resolution: 7
sampling_ratio: 0
aligned: True
bbox_assigner: BBoxAssigner
BBoxAssigner:
batch_size_per_im: 512
bg_thresh: 0.5
fg_thresh: 0.5
fg_fraction: 0.25
cascade_iou: [0.5, 0.6, 0.7]
use_random: True
CascadeTwoFCHead:
out_channel: 1024
BBoxPostProcess:
decode:
name: RCNNBox
prior_box_var: [30.0, 30.0, 15.0, 15.0]
nms:
name: MultiClassNMS
keep_top_k: 100
score_threshold: 0.05
nms_threshold: 0.5
@@ -0,0 +1,19 @@
epoch: 12
LearningRate:
base_lr: 0.01
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [8, 11]
- !LinearWarmup
start_factor: 0.001
steps: 1000
OptimizerBuilder:
optimizer:
momentum: 0.9
type: Momentum
regularizer:
factor: 0.0001
type: L2
@@ -0,0 +1,8 @@
_BASE_: [
'../datasets/coco_instance.yml',
'../runtime.yml',
'_base_/optimizer_1x.yml',
'_base_/cascade_mask_rcnn_r50_fpn.yml',
'_base_/cascade_mask_fpn_reader.yml',
]
weights: output/cascade_mask_rcnn_r50_fpn_1x_coco/model_final
@@ -0,0 +1,18 @@
_BASE_: [
'../datasets/coco_instance.yml',
'../runtime.yml',
'_base_/optimizer_1x.yml',
'_base_/cascade_mask_rcnn_r50_fpn.yml',
'_base_/cascade_mask_fpn_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
weights: output/cascade_mask_rcnn_r50_vd_fpn_ssld_1x_coco/model_final
ResNet:
depth: 50
variant: d
norm_type: bn
freeze_at: 0
return_idx: [0,1,2,3]
num_stages: 4
lr_mult_list: [0.05, 0.05, 0.1, 0.15]
@@ -0,0 +1,29 @@
_BASE_: [
'../datasets/coco_instance.yml',
'../runtime.yml',
'_base_/optimizer_1x.yml',
'_base_/cascade_mask_rcnn_r50_fpn.yml',
'_base_/cascade_mask_fpn_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
weights: output/cascade_mask_rcnn_r50_vd_fpn_ssld_2x_coco/model_final
ResNet:
depth: 50
variant: d
norm_type: bn
freeze_at: 0
return_idx: [0,1,2,3]
num_stages: 4
lr_mult_list: [0.05, 0.05, 0.1, 0.15]
epoch: 24
LearningRate:
base_lr: 0.01
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [12, 22]
- !LinearWarmup
start_factor: 0.1
steps: 1000
@@ -0,0 +1,8 @@
_BASE_: [
'../datasets/coco_detection.yml',
'../runtime.yml',
'_base_/optimizer_1x.yml',
'_base_/cascade_rcnn_r50_fpn.yml',
'_base_/cascade_fpn_reader.yml',
]
weights: output/cascade_rcnn_r50_fpn_1x_coco/model_final
@@ -0,0 +1,18 @@
_BASE_: [
'../datasets/coco_detection.yml',
'../runtime.yml',
'_base_/optimizer_1x.yml',
'_base_/cascade_rcnn_r50_fpn.yml',
'_base_/cascade_fpn_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
weights: output/cascade_rcnn_r50_vd_fpn_ssld_1x_coco/model_final
ResNet:
depth: 50
variant: d
norm_type: bn
freeze_at: 0
return_idx: [0,1,2,3]
num_stages: 4
lr_mult_list: [0.05, 0.05, 0.1, 0.15]
@@ -0,0 +1,29 @@
_BASE_: [
'../datasets/coco_detection.yml',
'../runtime.yml',
'_base_/optimizer_1x.yml',
'_base_/cascade_rcnn_r50_fpn.yml',
'_base_/cascade_fpn_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
weights: output/cascade_rcnn_r50_vd_fpn_ssld_2x_coco/model_final
ResNet:
depth: 50
variant: d
norm_type: bn
freeze_at: 0
return_idx: [0,1,2,3]
num_stages: 4
lr_mult_list: [0.05, 0.05, 0.1, 0.15]
epoch: 24
LearningRate:
base_lr: 0.01
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [12, 22]
- !LinearWarmup
start_factor: 0.1
steps: 1000
@@ -0,0 +1,37 @@
English | [简体中文](README_cn.md)
# CenterNet (CenterNet: Objects as Points)
## Table of Contents
- [Introduction](#Introduction)
- [Model Zoo](#Model_Zoo)
- [Citations](#Citations)
## Introduction
[CenterNet](http://arxiv.org/abs/1904.07850) is an anchor-free detector that models an object as a single point -- the center point of its bounding box. The detector uses keypoint estimation to find center points and regresses to all other object properties. The center-point based approach, CenterNet, is end-to-end differentiable, simpler, faster, and more accurate than the corresponding bounding-box based detectors.
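As a rough illustration of the decoding step implied above (find local peaks on the center heatmap, then expand each peak with the predicted width and height), here is a minimal NumPy/SciPy sketch; the tensor shapes and `down_ratio` are assumptions for illustration, not the repository's implementation.
```
import numpy as np
from scipy.ndimage import maximum_filter

def decode_centers(heatmap, wh, k=100, down_ratio=4):
    """Pick heatmap peaks as object centers and expand them into boxes.

    heatmap: (num_classes, H, W) center scores; wh: (2, H, W) predicted width/height.
    """
    peaks = heatmap * (heatmap == maximum_filter(heatmap, size=(1, 3, 3)))  # 3x3 peak picking
    cls, ys, xs = np.unravel_index(np.argsort(peaks, axis=None)[::-1][:k], peaks.shape)
    scores = peaks[cls, ys, xs]
    w, h = wh[0, ys, xs], wh[1, ys, xs]
    boxes = np.stack([xs - w / 2, ys - h / 2, xs + w / 2, ys + h / 2], axis=1) * down_ratio
    return boxes, scores, cls

heatmap = np.random.rand(80, 128, 128)
wh = np.random.rand(2, 128, 128) * 10
boxes, scores, classes = decode_centers(heatmap, wh, k=5)
```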
## Model Zoo
### CenterNet Results on COCO-val 2017
| backbone | input shape | mAP | FPS | download | config |
| :--------------| :------- | :----: | :------: | :----: |:-----: |
| DLA-34(paper) | 512x512 | 37.4 | - | - | - |
| DLA-34 | 512x512 | 37.6 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_dla34_140e_coco.pdparams) | [config](./centernet_dla34_140e_coco.yml) |
| ResNet50 + DLAUp | 512x512 | 38.9 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_r50_140e_coco.pdparams) | [config](./centernet_r50_140e_coco.yml) |
| MobileNetV1 + DLAUp | 512x512 | 28.2 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_mbv1_140e_coco.pdparams) | [config](./centernet_mbv1_140e_coco.yml) |
| MobileNetV3_small + DLAUp | 512x512 | 17 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_mbv3_small_140e_coco.pdparams) | [config](./centernet_mbv3_small_140e_coco.yml) |
| MobileNetV3_large + DLAUp | 512x512 | 27.1 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_mbv3_large_140e_coco.pdparams) | [config](./centernet_mbv3_large_140e_coco.yml) |
| ShuffleNetV2 + DLAUp | 512x512 | 23.8 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_shufflenetv2_140e_coco.pdparams) | [config](./centernet_shufflenetv2_140e_coco.yml) |
## Citations
```
@article{zhou2019objects,
title={Objects as points},
author={Zhou, Xingyi and Wang, Dequan and Kr{\"a}henb{\"u}hl, Philipp},
journal={arXiv preprint arXiv:1904.07850},
year={2019}
}
```
@@ -0,0 +1,36 @@
简体中文 | [English](README.md)
# CenterNet (CenterNet: Objects as Points)
## Table of Contents
- [Introduction](#Introduction)
- [Model Zoo](#Model_Zoo)
- [Citations](#Citations)
## Introduction
[CenterNet](http://arxiv.org/abs/1904.07850) is an anchor-free detector that represents an object as a single point, the center point of its bounding box. CenterNet locates the center point via keypoint estimation and regresses the other object properties. The center-point based approach is end-to-end trainable and more efficient than anchor-based detectors.
## Model Zoo
### CenterNet results on COCO-val 2017
| backbone | input shape | mAP | FPS | download | config |
| :--------------| :------- | :----: | :------: | :----: |:-----: |
| DLA-34(paper) | 512x512 | 37.4 | - | - | - |
| DLA-34 | 512x512 | 37.6 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_dla34_140e_coco.pdparams) | [config](./centernet_dla34_140e_coco.yml) |
| ResNet50 + DLAUp | 512x512 | 38.9 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_r50_140e_coco.pdparams) | [config](./centernet_r50_140e_coco.yml) |
| MobileNetV1 + DLAUp | 512x512 | 28.2 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_mbv1_140e_coco.pdparams) | [config](./centernet_mbv1_140e_coco.yml) |
| MobileNetV3_small + DLAUp | 512x512 | 17 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_mbv3_small_140e_coco.pdparams) | [config](./centernet_mbv3_small_140e_coco.yml) |
| MobileNetV3_large + DLAUp | 512x512 | 27.1 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_mbv3_large_140e_coco.pdparams) | [config](./centernet_mbv3_large_140e_coco.yml) |
| ShuffleNetV2 + DLAUp | 512x512 | 23.8 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_shufflenetv2_140e_coco.pdparams) | [config](./centernet_shufflenetv2_140e_coco.yml) |
## Citations
```
@article{zhou2019objects,
title={Objects as points},
author={Zhou, Xingyi and Wang, Dequan and Kr{\"a}henb{\"u}hl, Philipp},
journal={arXiv preprint arXiv:1904.07850},
year={2019}
}
```
@@ -0,0 +1,22 @@
architecture: CenterNet
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/DLA34_pretrain.pdparams
CenterNet:
backbone: DLA
neck: CenterNetDLAFPN
head: CenterNetHead
post_process: CenterNetPostProcess
DLA:
depth: 34
CenterNetDLAFPN:
down_ratio: 4
CenterNetHead:
head_planes: 256
regress_ltrb: False
CenterNetPostProcess:
max_per_img: 100
regress_ltrb: False
@@ -0,0 +1,34 @@
architecture: CenterNet
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_pretrained.pdparams
norm_type: sync_bn
use_ema: true
ema_decay: 0.9998
CenterNet:
backbone: ResNet
neck: CenterNetDLAFPN
head: CenterNetHead
post_process: CenterNetPostProcess
ResNet:
depth: 50
variant: d
return_idx: [0, 1, 2, 3]
freeze_at: -1
norm_decay: 0.
dcn_v2_stages: [3]
CenterNetDLAFPN:
first_level: 0
last_level: 4
down_ratio: 4
dcn_v2: False
CenterNetHead:
head_planes: 256
regress_ltrb: False
CenterNetPostProcess:
max_per_img: 100
regress_ltrb: False
@@ -0,0 +1,35 @@
worker_num: 4
TrainReader:
inputs_def:
image_shape: [3, 512, 512]
sample_transforms:
- Decode: {}
- FlipWarpAffine: {keep_res: False, input_h: 512, input_w: 512, use_random: True}
- CenterRandColor: {}
- Lighting: {eigval: [0.2141788, 0.01817699, 0.00341571], eigvec: [[-0.58752847, -0.69563484, 0.41340352], [-0.5832747, 0.00994535, -0.81221408], [-0.56089297, 0.71832671, 0.41158938]]}
- NormalizeImage: {mean: [0.40789655, 0.44719303, 0.47026116], std: [0.2886383 , 0.27408165, 0.27809834], is_scale: False}
- Permute: {}
- Gt2CenterNetTarget: {down_ratio: 4, max_objs: 128}
batch_size: 16
shuffle: True
drop_last: True
use_shared_memory: True
EvalReader:
sample_transforms:
- Decode: {}
- WarpAffine: {keep_res: True, input_h: 512, input_w: 512}
- NormalizeImage: {mean: [0.40789655, 0.44719303, 0.47026116], std: [0.2886383 , 0.27408165, 0.27809834]}
- Permute: {}
batch_size: 1
TestReader:
inputs_def:
image_shape: [3, 512, 512]
sample_transforms:
- Decode: {}
- WarpAffine: {keep_res: True, input_h: 512, input_w: 512}
- NormalizeImage: {mean: [0.40789655, 0.44719303, 0.47026116], std: [0.2886383 , 0.27408165, 0.27809834], is_scale: True}
- Permute: {}
batch_size: 1
@@ -0,0 +1,14 @@
epoch: 140
LearningRate:
base_lr: 0.0005
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [90, 120]
use_warmup: False
OptimizerBuilder:
optimizer:
type: Adam
regularizer: NULL
@@ -0,0 +1,9 @@
_BASE_: [
'../datasets/coco_detection.yml',
'../runtime.yml',
'_base_/optimizer_140e.yml',
'_base_/centernet_dla34.yml',
'_base_/centernet_reader.yml',
]
weights: output/centernet_dla34_140e_coco/model_final
Some files were not shown because too many files have changed in this diff.