Replace the document detection model

2024-08-27 14:42:45 +08:00
parent aea6f19951
commit 1514e09c40
2072 changed files with 254336 additions and 4967 deletions

View File

@@ -0,0 +1,208 @@
English | [简体中文](README_cn.md)
# FairMOT (FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking)
## Table of Contents
- [Introduction](#introduction)
- [Model Zoo](#model-zoo)
- [Getting Started](#getting-started)
- [Citations](#citations)
## Introduction
[FairMOT](https://arxiv.org/abs/2004.01888) is based on the anchor-free detector CenterNet, which overcomes the anchor and feature misalignment problem of anchor-based detection frameworks. The fusion of deep and shallow features enables the detection and ReID tasks to each obtain the features they need, and low-dimensional ReID features are used. FairMOT proposes a simple baseline composed of two homogeneous branches that predict the pixel-level object score and the ReID features. It achieves fairness between the two tasks and obtains a higher level of real-time MOT performance.
### PP-Tracking real-time MOT system
In addition, PaddleDetection also provides the [PP-Tracking](../../../deploy/pptracking/README.md) real-time multi-object tracking system.
PP-Tracking is the first open-source real-time multi-object tracking system based on the PaddlePaddle deep learning framework. It offers a rich set of models, broad applicability, and efficient deployment.
PP-Tracking supports two paradigms: single-camera tracking (MOT) and multi-camera tracking (MTMCT). Targeting the difficulties and pain points of real-world business scenarios, PP-Tracking provides various MOT functions and applications such as pedestrian tracking, vehicle tracking, multi-class tracking, small-object tracking, traffic statistics, and multi-camera tracking. Deployment is supported through both an API and a GUI visual interface, the deployment languages are Python and C++, and the supported platforms include Linux and NVIDIA Jetson.
### AI Studio public project tutorial
PP-Tracking provides a public AI Studio project tutorial. Please refer to this [tutorial](https://aistudio.baidu.com/aistudio/projectdetail/3022582).
## Model Zoo
### FairMOT Results on MOT-16 Training Set
| backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config |
| :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
| DLA-34(paper) | 1088x608 | 83.3 | 81.9 | 544 | 3822 | 14095 | - | - | - |
| DLA-34 | 1088x608 | 83.2 | 83.1 | 499 | 3861 | 14223 | - | [model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams) | [config](./fairmot_dla34_30e_1088x608.yml) |
| DLA-34 | 864x480 | 80.8 | 81.1 | 561 | 3643 | 16967 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_864x480.pdparams) | [config](./fairmot_dla34_30e_864x480.yml) |
| DLA-34 | 576x320 | 74.0 | 76.1 | 640 | 4989 | 23034 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_576x320.pdparams) | [config](./fairmot_dla34_30e_576x320.yml) |
### FairMOT Results on MOT-16 Test Set
| backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config |
| :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
| DLA-34(paper) | 1088x608 | 74.9 | 72.8 | 1074 | - | - | 25.9 | - | - |
| DLA-34 | 1088x608 | 75.0 | 74.7 | 919 | 7934 | 36747 | - | [model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams) | [config](./fairmot_dla34_30e_1088x608.yml) |
| DLA-34 | 864x480 | 73.0 | 72.6 | 977 | 7578 | 40601 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_864x480.pdparams) | [config](./fairmot_dla34_30e_864x480.yml) |
| DLA-34 | 576x320 | 69.9 | 70.2 | 1044 | 8869 | 44898 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_576x320.pdparams) | [config](./fairmot_dla34_30e_576x320.yml) |
**Notes:**
- FairMOT DLA-34 was trained on 2 GPUs with a mini-batch size of 6 per GPU for 30 epochs.
### FairMOT enhance model
### Results on MOT-16 Test Set
| backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config |
| :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
| DLA-34 | 1088x608 | 75.9 | 74.7 | 1021 | 11425 | 31475 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_enhance_dla34_60e_1088x608.pdparams) | [config](./fairmot_enhance_dla34_60e_1088x608.yml) |
| HarDNet-85 | 1088x608 | 75.0 | 70.0 | 1050 | 11837 | 32774 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_enhance_hardnet85_30e_1088x608.pdparams) | [config](./fairmot_enhance_hardnet85_30e_1088x608.yml) |
### Results on MOT-17 Test Set
| backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config |
| :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
| DLA-34 | 1088x608 | 75.3 | 74.2 | 3270 | 29112 | 106749 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_enhance_dla34_60e_1088x608.pdparams) | [config](./fairmot_enhance_dla34_60e_1088x608.yml) |
| HarDNet-85 | 1088x608 | 74.7 | 70.7 | 3210 | 29790 | 109914 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_enhance_hardnet85_30e_1088x608.pdparams) | [config](./fairmot_enhance_hardnet85_30e_1088x608.yml) |
**Notes:**
- The FairMOT enhance models were trained on 8 GPUs, with the CrowdHuman dataset added to the training set.
- FairMOT enhance DLA-34 used a batch size of 16 per GPU and was trained for 60 epochs.
- FairMOT enhance HarDNet-85 used a batch size of 10 per GPU and was trained for 30 epochs.
### FairMOT light model
### Results on MOT-16 Test Set
| backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config |
| :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
| HRNetV2-W18 | 1088x608 | 71.7 | 66.6 | 1340 | 8642 | 41592 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.pdparams) | [config](./fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml) |
### Results on MOT-17 Test Set
| backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config |
| :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
| HRNetV2-W18 | 1088x608 | 70.7 | 65.7 | 4281 | 22485 | 138468 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.pdparams) | [config](./fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml) |
| HRNetV2-W18 | 864x480 | 70.3 | 65.8 | 4056 | 18927 | 144486 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.pdparams) | [config](./fairmot_hrnetv2_w18_dlafpn_30e_864x480.yml) |
| HRNetV2-W18 | 576x320 | 65.3 | 64.8 | 4137 | 28860 | 163017 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.pdparams) | [config](./fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml) |
**Notes:**
- FairMOT HRNetV2-W18 was trained on 8 GPUs with a mini-batch size of 4 per GPU for 30 epochs. Only the ImageNet pre-trained model is used, the optimizer is Momentum, and the CrowdHuman dataset is added to the training set.
### FairMOT + BYTETracker
### Results on MOT-17 Half Set
| backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config |
| :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
| DLA-34 | 1088x608 | 69.1 | 72.8 | 299 | 1957 | 14412 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams) | [config](./fairmot_dla34_30e_1088x608.yml) |
| DLA-34 + BYTETracker| 1088x608 | 70.3 | 73.2 | 234 | 2176 | 13598 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608_bytetracker.pdparams) | [config](./fairmot_dla34_30e_1088x608_bytetracker.yml) |
**Notes:**
- The FairMOT model here is configured for an ablation study: the training set is the five MIX datasets (Caltech, CUHKSYSU, PRW, Cityscapes, ETHZ) plus the first half of MOT17 Train, the pre-trained weights are the CenterNet COCO model, and evaluation is performed on the second half of MOT17 Train.
- To adapt BYTETracker to other FairMOT models in PaddleDetection, modify the tracker section of the corresponding config as follows:
```yaml
JDETracker:
  use_byte: True
  match_thres: 0.8
  conf_thres: 0.4
  low_conf_thres: 0.2
```
### FairMOT transfer learning model
### Results on GMOT-40 airplane subset
| backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config |
| :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
| DLA-34 | 1088x608 | 96.6 | 94.7 | 19 | 300 | 466 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608_airplane.pdparams) | [config](./fairmot_dla34_30e_1088x608_airplane.yml) |
**Note:**
- The dataset for this model is the airplane subset extracted from the GMOT-40 dataset. The download link provided by the PaddleDetection team is ```wget https://bj.bcebos.com/v1/paddledet/data/mot/airplane.zip```; unzip it into ```dataset/mot```, then copy ```airplane.train``` into ```dataset/mot/image_lists``` (a shell sketch of these steps follows the config snippet below).
- The FairMOT model here uses the pedestrian FairMOT model as pre-training weights. The training set is the complete airplane set, a total of 4 video sequences, and it is also used for evaluation.
- When applying it to track other objects, modify ```min_box_area``` and ```vertical_ratio``` of the tracker in the corresponding config file, like this:
```yaml
JDETracker:
  conf_thres: 0.4
  tracked_thresh: 0.4
  metric_type: cosine
  min_box_area: 0 # 200 for pedestrian
  vertical_ratio: 0 # 1.6 for pedestrian
```
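A minimal shell sketch of the dataset preparation described in the first note above; the location of `airplane.train` inside the unzipped archive is an assumption, so adjust the copy path to the actual layout:
```bash
# Download the airplane subset provided by the PaddleDetection team and lay it
# out under dataset/mot as described above.
cd dataset/mot
wget https://bj.bcebos.com/v1/paddledet/data/mot/airplane.zip
unzip airplane.zip
# Copy the image list next to the other image lists; the source path inside the
# unzipped archive is an assumption -- adjust it to the actual layout.
mkdir -p image_lists
cp airplane/airplane.train image_lists/
```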
## Getting Started
### 1. Training
Train FairMOT on 2 GPUs with the following command:
```bash
python -m paddle.distributed.launch --log_dir=./fairmot_dla34_30e_1088x608/ --gpus 0,1 tools/train.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml
```
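If training is interrupted, it can typically be resumed from the last saved checkpoint with the `-r` flag of `tools/train.py` (standard PaddleDetection usage; the epoch number below is hypothetical):
```bash
# Resume training from a previously saved epoch checkpoint; the epoch number is
# hypothetical -- check output/fairmot_dla34_30e_1088x608/ for what was actually written.
python -m paddle.distributed.launch --log_dir=./fairmot_dla34_30e_1088x608/ --gpus 0,1 \
    tools/train.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml \
    -r output/fairmot_dla34_30e_1088x608/19
```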
### 2. Evaluation
Evaluate the tracking performance of FairMOT on the validation dataset with a single GPU using the following commands:
```bash
# use weights released in PaddleDetection model zoo
CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams
# use saved checkpoint in training
CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml -o weights=output/fairmot_dla34_30e_1088x608/model_final.pdparams
```
**Notes:**
- The default evaluation dataset is the MOT-16 Train Set. To change the evaluation dataset, refer to the following snippet and modify `configs/datasets/mot.yml`:
```yaml
EvalMOTDataset:
  !MOTImageFolder
    dataset_dir: dataset/mot
    data_root: MOT17/images/train
    keep_ori_im: False # set True if save visualization images or video
```
- Tracking results are saved in `{output_dir}/mot_results/` with one txt file per sequence. Each line of a txt file is `frame,id,x1,y1,w,h,score,-1,-1,-1`, and `{output_dir}` can be set with `--output_dir`.
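- As a quick sanity check on a result file, the sketch below counts frames and distinct track IDs (the sequence file name is hypothetical):
```bash
# Count frames and distinct track IDs in one sequence's result file.
RESULT=output/mot_results/MOT16-02.txt   # hypothetical sequence name
awk -F, '!f[$1]++{nf++} !i[$2]++{ni++} END {print "frames:", nf, "track ids:", ni}' "$RESULT"
```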
### 3. Inference
Run inference on a video with a single GPU using the following command:
```bash
# inference on video and save a video
CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams --video_file={your video name}.mp4 --save_videos
```
**Notes:**
- Please make sure that [ffmpeg](https://ffmpeg.org/ffmpeg.html) is installed first. On Linux (Ubuntu) you can install it directly with: `apt-get update && apt-get install -y ffmpeg`.
### 4. Export model
```bash
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams
```
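The exported inference model is written to `output_inference/<config_name>/` by default. A quick listing of what to expect (exact file names may vary across PaddleDetection versions):
```bash
# Inspect the exported inference model; the directory name follows the config name.
ls output_inference/fairmot_dla34_30e_1088x608/
# Typically contains: infer_cfg.yml  model.pdiparams  model.pdiparams.info  model.pdmodel
```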
### 5. Using the exported model for Python inference
```bash
python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/fairmot_dla34_30e_1088x608 --video_file={your video name}.mp4 --device=GPU --save_mot_txts
```
**Notes:**
- The tracking model predicts on videos and does not support prediction on a single image. The visualization video of the tracking results is saved by default. You can add `--save_mot_txts` to save the txt result files, or `--save_images` to save the visualization images.
- Each line of the tracking results txt file is `frame,id,x1,y1,w,h,score,-1,-1,-1`.
### 6. Using the exported MOT and keypoint models for joint Python inference
```bash
python deploy/python/mot_keypoint_unite_infer.py --mot_model_dir=output_inference/fairmot_dla34_30e_1088x608/ --keypoint_model_dir=output_inference/higherhrnet_hrnet_w32_512/ --video_file={your video name}.mp4 --device=GPU
```
**Notes:**
- Keypoint model export tutorial: `configs/keypoint/README.md`.
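- A minimal sketch of exporting a keypoint model for the command above, assuming the HigherHRNet config and weights paths (check `configs/keypoint/README.md` for the exact ones in your version):
```bash
# Export the keypoint model before joint inference. The config path and weights
# URL are assumptions; replace them with the keypoint model you actually use.
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py \
    -c configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_512.yml \
    -o weights=https://paddledet.bj.bcebos.com/models/keypoint/higherhrnet_hrnet_w32_512.pdparams
```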
## Citations
```
@article{zhang2020fair,
title={FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking},
author={Zhang, Yifu and Wang, Chunyu and Wang, Xinggang and Zeng, Wenjun and Liu, Wenyu},
journal={arXiv preprint arXiv:2004.01888},
year={2020}
}
@article{shao2018crowdhuman,
title={CrowdHuman: A Benchmark for Detecting Human in a Crowd},
author={Shao, Shuai and Zhao, Zijian and Li, Boxun and Xiao, Tete and Yu, Gang and Zhang, Xiangyu and Sun, Jian},
journal={arXiv preprint arXiv:1805.00123},
year={2018}
}
```

View File

@@ -0,0 +1,202 @@
简体中文 | [English](README.md)
# FairMOT (FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking)
## 内容
- [简介](#简介)
- [模型库](#模型库)
- [快速开始](#快速开始)
- [引用](#引用)
## 简介
[FairMOT](https://arxiv.org/abs/2004.01888)以Anchor Free的CenterNet检测器为基础，克服了Anchor-Based的检测框架中anchor和特征不对齐问题，深浅层特征融合使得检测和ReID任务各自获得所需要的特征，并且使用低维度ReID特征，提出了一种由两个同质分支组成的简单baseline来预测像素级目标得分和ReID特征，实现了两个任务之间的公平性，并获得了更高水平的实时多目标跟踪精度。
### PP-Tracking 实时多目标跟踪系统
此外，PaddleDetection还提供了[PP-Tracking](../../../deploy/pptracking/README.md)实时多目标跟踪系统。PP-Tracking是基于PaddlePaddle深度学习框架的业界首个开源的实时多目标跟踪系统，具有模型丰富、应用广泛和部署高效三大优势。
PP-Tracking支持单镜头跟踪(MOT)和跨镜头跟踪(MTMCT)两种模式，针对实际业务的难点和痛点，提供了行人跟踪、车辆跟踪、多类别跟踪、小目标跟踪、流量统计以及跨镜头跟踪等各种多目标跟踪功能和应用，部署方式支持API调用和GUI可视化界面，部署语言支持Python和C++，部署平台环境支持Linux、NVIDIA Jetson等。
### AI Studio公开项目案例
PP-Tracking 提供了AI Studio公开项目案例教程请参考[PP-Tracking之手把手玩转多目标跟踪](https://aistudio.baidu.com/aistudio/projectdetail/3022582)。
## 模型库
### FairMOT在MOT-16 Training Set上结果
| 骨干网络 | 输入尺寸 | MOTA | IDF1 | IDS | FP | FN | FPS | 下载链接 | 配置文件 |
| :--------------| :------- | :----: | :----: | :---: | :----: | :---: | :------: | :----: |:----: |
| DLA-34(paper) | 1088x608 | 83.3 | 81.9 | 544 | 3822 | 14095 | - | - | - |
| DLA-34 | 1088x608 | 83.2 | 83.1 | 499 | 3861 | 14223 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams) | [配置文件](./fairmot_dla34_30e_1088x608.yml) |
| DLA-34 | 864x480 | 80.8 | 81.1 | 561 | 3643 | 16967 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_864x480.pdparams) | [配置文件](./fairmot_dla34_30e_864x480.yml) |
| DLA-34 | 576x320 | 74.0 | 76.1 | 640 | 4989 | 23034 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_576x320.pdparams) | [配置文件](./fairmot_dla34_30e_576x320.yml) |
### FairMOT在MOT-16 Test Set上结果
| 骨干网络 | 输入尺寸 | MOTA | IDF1 | IDS | FP | FN | FPS | 下载链接 | 配置文件 |
| :--------------| :------- | :----: | :----: | :----: | :----: | :----: |:-------: | :----: | :----: |
| DLA-34(paper) | 1088x608 | 74.9 | 72.8 | 1074 | - | - | 25.9 | - | - |
| DLA-34 | 1088x608 | 75.0 | 74.7 | 919 | 7934 | 36747 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams) | [配置文件](./fairmot_dla34_30e_1088x608.yml) |
| DLA-34 | 864x480 | 73.0 | 72.6 | 977 | 7578 | 40601 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_864x480.pdparams) | [配置文件](./fairmot_dla34_30e_864x480.yml) |
| DLA-34 | 576x320 | 69.9 | 70.2 | 1044 | 8869 | 44898 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_576x320.pdparams) | [配置文件](./fairmot_dla34_30e_576x320.yml) |
**注意:**
- FairMOT DLA-34均使用2个GPU进行训练，每个GPU上batch size为6，训练30个epoch。
### FairMOT enhance模型
### 在MOT-16 Test Set上结果
| 骨干网络 | 输入尺寸 | MOTA | IDF1 | IDS | FP | FN | FPS | 下载链接 | 配置文件 |
| :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
| DLA-34 | 1088x608 | 75.9 | 74.7 | 1021 | 11425 | 31475 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_enhance_dla34_60e_1088x608.pdparams) | [配置文件](./fairmot_enhance_dla34_60e_1088x608.yml) |
| HarDNet-85 | 1088x608 | 75.0 | 70.0 | 1050 | 11837 | 32774 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_enhance_hardnet85_30e_1088x608.pdparams) | [配置文件](./fairmot_enhance_hardnet85_30e_1088x608.yml) |
### 在MOT-17 Test Set上结果
| 骨干网络 | 输入尺寸 | MOTA | IDF1 | IDS | FP | FN | FPS | 下载链接 | 配置文件 |
| :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
| DLA-34 | 1088x608 | 75.3 | 74.2 | 3270 | 29112 | 106749 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_enhance_dla34_60e_1088x608.pdparams) | [配置文件](./fairmot_enhance_dla34_60e_1088x608.yml) |
| HarDNet-85 | 1088x608 | 74.7 | 70.7 | 3210 | 29790 | 109914 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_enhance_hardnet85_30e_1088x608.pdparams) | [配置文件](./fairmot_enhance_hardnet85_30e_1088x608.yml) |
**注意:**
- FairMOT enhance模型均使用8个GPU进行训练，训练集中加入了crowdhuman数据集一起参与训练。
- FairMOT enhance DLA-34每个GPU上batch size为16，训练60个epoch。
- FairMOT enhance HarDNet-85每个GPU上batch size为10，训练30个epoch。
### FairMOT轻量级模型
### 在MOT-16 Test Set上结果
| 骨干网络 | 输入尺寸 | MOTA | IDF1 | IDS | FP | FN | FPS | 下载链接 | 配置文件 |
| :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
| HRNetV2-W18 | 1088x608 | 71.7 | 66.6 | 1340 | 8642 | 41592 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.pdparams) | [配置文件](./fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml) |
### 在MOT-17 Test Set上结果
| 骨干网络 | 输入尺寸 | MOTA | IDF1 | IDS | FP | FN | FPS | 下载链接 | 配置文件 |
| :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
| HRNetV2-W18 | 1088x608 | 70.7 | 65.7 | 4281 | 22485 | 138468 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.pdparams) | [配置文件](./fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml) |
| HRNetV2-W18 | 864x480 | 70.3 | 65.8 | 4056 | 18927 | 144486 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.pdparams) | [配置文件](./fairmot_hrnetv2_w18_dlafpn_30e_864x480.yml) |
| HRNetV2-W18 | 576x320 | 65.3 | 64.8 | 4137 | 28860 | 163017 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.pdparams) | [配置文件](./fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml) |
**注意:**
- FairMOT HRNetV2-W18均使用8个GPU进行训练，每个GPU上batch size为4，训练30个epoch，仅使用ImageNet预训练模型，优化器策略采用的是Momentum，并且训练集中加入了crowdhuman数据集一起参与训练。
### FairMOT + BYTETracker
### 在MOT-17 Half上结果
| 骨干网络 | 输入尺寸 | MOTA | IDF1 | IDS | FP | FN | FPS | 下载链接 | 配置文件 |
| :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
| DLA-34 | 1088x608 | 69.1 | 72.8 | 299 | 1957 | 14412 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams) | [配置文件](./fairmot_dla34_30e_1088x608.yml) |
| DLA-34 + BYTETracker| 1088x608 | 70.3 | 73.2 | 234 | 2176 | 13598 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608_bytetracker.pdparams) | [配置文件](./fairmot_dla34_30e_1088x608_bytetracker.yml) |
**注意:**
- FairMOT模型此处是ablation study的配置，使用的训练集是原先MIX的5个数据集(Caltech,CUHKSYSU,PRW,Cityscapes,ETHZ)加上MOT17 Train的前一半，使用的预训练权重是CenterNet的COCO预训练权重，验证是在MOT17 Train的后一半上测的。
- BYTETracker应用到PaddleDetection的其他FairMOT模型，只需要更改对应的config文件里的tracker部分为如下所示：
```yaml
JDETracker:
  use_byte: True
  match_thres: 0.8
  conf_thres: 0.4
  low_conf_thres: 0.2
```
### FairMOT迁移学习模型
### 在GMOT-40的airplane子集上的结果
| 骨干网络 | 输入尺寸 | MOTA | IDF1 | IDS | FP | FN | FPS | 下载链接 | 配置文件 |
| :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
| DLA-34 | 1088x608 | 96.6 | 94.7 | 19 | 300 | 466 | - |[下载链接](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608_airplane.pdparams) | [配置文件](./fairmot_dla34_30e_1088x608_airplane.yml) |
**注意:**
- 此模型数据集是从GMOT-40数据集中抽离出来的airplane类别子集，PaddleDetection团队整理后的下载链接为：```wget https://bj.bcebos.com/v1/paddledet/data/mot/airplane.zip```，下载解压存放于```dataset/mot```目录下，并将其中的```airplane.train```复制存放于```dataset/mot/image_lists```。
- FairMOT模型此处训练是采用行人FairMOT训好的模型作为预训练权重，使用的训练集是airplane全集，共4个视频序列，验证也是在全集上测的。
- 应用到其他物体的跟踪时，需要更改对应config文件里tracker部分的```min_box_area```和```vertical_ratio```，如下所示：
```yaml
JDETracker:
  conf_thres: 0.4
  tracked_thresh: 0.4
  metric_type: cosine
  min_box_area: 0 # 200 for pedestrian
  vertical_ratio: 0 # 1.6 for pedestrian
```
## 快速开始
### 1. 训练
使用2个GPU通过如下命令一键式启动训练
```bash
python -m paddle.distributed.launch --log_dir=./fairmot_dla34_30e_1088x608/ --gpus 0,1 tools/train.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml
```
### 2. 评估
使用单张GPU通过如下命令一键式启动评估
```bash
# 使用PaddleDetection发布的权重
CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams
# 使用训练保存的checkpoint
CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml -o weights=output/fairmot_dla34_30e_1088x608/model_final.pdparams
```
**注意:**
- 默认评估的是MOT-16 Train Set数据集, 如需换评估数据集可参照以下代码修改`configs/datasets/mot.yml`
```yaml
EvalMOTDataset:
  !MOTImageFolder
    dataset_dir: dataset/mot
    data_root: MOT17/images/train
    keep_ori_im: False # set True if save visualization images or video
```
- 跟踪结果会存于`{output_dir}/mot_results/`中，每个视频序列对应一个txt文件，txt文件中每行信息是`frame,id,x1,y1,w,h,score,-1,-1,-1`，此外`{output_dir}`可通过`--output_dir`设置。
### 3. 预测
使用单个GPU通过如下命令预测一个视频并保存为视频
```bash
# 预测一个视频
CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams --video_file={your video name}.mp4 --save_videos
```
**注意:**
- 请先确保已经安装了[ffmpeg](https://ffmpeg.org/ffmpeg.html), Linux(Ubuntu)平台可以直接用以下命令安装:`apt-get update && apt-get install -y ffmpeg`。
### 4. 导出预测模型
```bash
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams
```
### 5. 用导出的模型基于Python去预测
```bash
python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/fairmot_dla34_30e_1088x608 --video_file={your video name}.mp4 --device=GPU --save_mot_txts
```
**注意:**
- 跟踪模型是对视频进行预测,不支持单张图的预测,默认保存跟踪结果可视化后的视频,可添加`--save_mot_txts`表示保存跟踪结果的txt文件或`--save_images`表示保存跟踪结果可视化图片。
- 跟踪结果txt文件每行信息是`frame,id,x1,y1,w,h,score,-1,-1,-1`。
### 6. 用导出的跟踪和关键点模型Python联合预测
```bash
python deploy/python/mot_keypoint_unite_infer.py --mot_model_dir=output_inference/fairmot_dla34_30e_1088x608/ --keypoint_model_dir=output_inference/higherhrnet_hrnet_w32_512/ --video_file={your video name}.mp4 --device=GPU
```
**注意:**
- 关键点模型导出教程请参考`configs/keypoint/README.md`。
## 引用
```
@article{zhang2020fair,
title={FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking},
author={Zhang, Yifu and Wang, Chunyu and Wang, Xinggang and Zeng, Wenjun and Liu, Wenyu},
journal={arXiv preprint arXiv:2004.01888},
year={2020}
}
@article{shao2018crowdhuman,
title={CrowdHuman: A Benchmark for Detecting Human in a Crowd},
author={Shao, Shuai and Zhao, Zijian and Li, Boxun and Xiao, Tete and Yu, Gang and Zhang, Xiangyu and Sun, Jian},
journal={arXiv preprint arXiv:1805.00123},
year={2018}
}
```

View File

@@ -0,0 +1,46 @@
architecture: FairMOT
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/fairmot_dla34_crowdhuman_pretrained.pdparams
for_mot: True
FairMOT:
detector: CenterNet
reid: FairMOTEmbeddingHead
loss: FairMOTLoss
tracker: JDETracker
CenterNet:
backbone: DLA
neck: CenterNetDLAFPN
head: CenterNetHead
post_process: CenterNetPostProcess
CenterNetDLAFPN:
down_ratio: 4
last_level: 5
out_channel: 0
dcn_v2: True
with_sge: False
CenterNetHead:
head_planes: 256
prior_bias: -2.19
regress_ltrb: True
size_loss: 'L1'
loss_weight: {'heatmap': 1.0, 'size': 0.1, 'offset': 1.0, 'iou': 0.0}
add_iou: False
FairMOTEmbeddingHead:
ch_head: 256
ch_emb: 128
CenterNetPostProcess:
max_per_img: 500
down_ratio: 4
regress_ltrb: True
JDETracker:
conf_thres: 0.4
tracked_thresh: 0.4
metric_type: cosine
min_box_area: 200
vertical_ratio: 1.6 # for pedestrian

View File

@@ -0,0 +1,43 @@
architecture: FairMOT
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/centernet_hardnet85_coco.pdparams
for_mot: True
FairMOT:
detector: CenterNet
reid: FairMOTEmbeddingHead
loss: FairMOTLoss
tracker: JDETracker
CenterNet:
backbone: HarDNet
neck: CenterNetHarDNetFPN
head: CenterNetHead
post_process: CenterNetPostProcess
HarDNet:
depth_wise: False
return_idx: [1,3,8,13]
arch: 85
CenterNetHarDNetFPN:
num_layers: 85
down_ratio: 4
last_level: 4
out_channel: 0
CenterNetHead:
head_planes: 128
FairMOTEmbeddingHead:
ch_head: 512
CenterNetPostProcess:
max_per_img: 500
regress_ltrb: True
JDETracker:
conf_thres: 0.4
tracked_thresh: 0.4
metric_type: cosine
min_box_area: 200
vertical_ratio: 1.6 # for pedestrian

View File

@@ -0,0 +1,38 @@
architecture: FairMOT
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/HRNet_W18_C_pretrained.pdparams
for_mot: True
FairMOT:
detector: CenterNet
reid: FairMOTEmbeddingHead
loss: FairMOTLoss
tracker: JDETracker
CenterNet:
backbone: HRNet
head: CenterNetHead
post_process: CenterNetPostProcess
neck: CenterNetDLAFPN
HRNet:
width: 18
freeze_at: 0
return_idx: [0, 1, 2, 3]
upsample: False
CenterNetDLAFPN:
down_ratio: 4
last_level: 3
out_channel: 0
first_level: 0
dcn_v2: False
CenterNetPostProcess:
max_per_img: 500
JDETracker:
conf_thres: 0.4
tracked_thresh: 0.4
metric_type: cosine
min_box_area: 200
vertical_ratio: 1.6 # for pedestrian

View File

@@ -0,0 +1,41 @@
worker_num: 4
TrainReader:
inputs_def:
image_shape: [3, 608, 1088]
sample_transforms:
- Decode: {}
- RGBReverse: {}
- AugmentHSV: {}
- LetterBoxResize: {target_size: [608, 1088]}
- MOTRandomAffine: {reject_outside: False}
- RandomFlip: {}
- BboxXYXY2XYWH: {}
- NormalizeBox: {}
- NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1]}
- RGBReverse: {}
- Permute: {}
batch_transforms:
- Gt2FairMOTTarget: {}
batch_size: 6
shuffle: True
drop_last: True
use_shared_memory: True
EvalMOTReader:
sample_transforms:
- Decode: {}
- LetterBoxResize: {target_size: [608, 1088]}
- NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
- Permute: {}
batch_size: 1
TestMOTReader:
inputs_def:
image_shape: [3, 608, 1088]
sample_transforms:
- Decode: {}
- LetterBoxResize: {target_size: [608, 1088]}
- NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
- Permute: {}
batch_size: 1

View File

@@ -0,0 +1,41 @@
worker_num: 4
TrainReader:
inputs_def:
image_shape: [3, 320, 576]
sample_transforms:
- Decode: {}
- RGBReverse: {}
- AugmentHSV: {}
- LetterBoxResize: {target_size: [320, 576]}
- MOTRandomAffine: {reject_outside: False}
- RandomFlip: {}
- BboxXYXY2XYWH: {}
- NormalizeBox: {}
- NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1]}
- RGBReverse: {}
- Permute: {}
batch_transforms:
- Gt2FairMOTTarget: {}
batch_size: 6
shuffle: True
drop_last: True
use_shared_memory: True
EvalMOTReader:
sample_transforms:
- Decode: {}
- LetterBoxResize: {target_size: [320, 576]}
- NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1]}
- Permute: {}
batch_size: 1
TestMOTReader:
inputs_def:
image_shape: [3, 320, 576]
sample_transforms:
- Decode: {}
- LetterBoxResize: {target_size: [320, 576]}
- NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
- Permute: {}
batch_size: 1

View File

@@ -0,0 +1,41 @@
worker_num: 4
TrainReader:
inputs_def:
image_shape: [3, 480, 864]
sample_transforms:
- Decode: {}
- RGBReverse: {}
- AugmentHSV: {}
- LetterBoxResize: {target_size: [480, 864]}
- MOTRandomAffine: {reject_outside: False}
- RandomFlip: {}
- BboxXYXY2XYWH: {}
- NormalizeBox: {}
- NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1]}
- RGBReverse: {}
- Permute: {}
batch_transforms:
- Gt2FairMOTTarget: {}
batch_size: 6
shuffle: True
drop_last: True
use_shared_memory: True
EvalMOTReader:
sample_transforms:
- Decode: {}
- LetterBoxResize: {target_size: [480, 864]}
- NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1]}
- Permute: {}
batch_size: 1
TestMOTReader:
inputs_def:
image_shape: [3, 480, 864]
sample_transforms:
- Decode: {}
- LetterBoxResize: {target_size: [480, 864]}
- NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
- Permute: {}
batch_size: 1

View File

@@ -0,0 +1,14 @@
epoch: 30
LearningRate:
base_lr: 0.0001
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [20,]
use_warmup: False
OptimizerBuilder:
optimizer:
type: Adam
regularizer: NULL

View File

@@ -0,0 +1,19 @@
epoch: 30
LearningRate:
base_lr: 0.01
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [15, 22]
use_warmup: True
- !ExpWarmup
steps: 1000
power: 4
OptimizerBuilder:
optimizer:
type: Momentum
regularizer:
factor: 0.0001
type: L2

View File

@@ -0,0 +1,9 @@
_BASE_: [
'../../datasets/mot.yml',
'../../runtime.yml',
'_base_/optimizer_30e.yml',
'_base_/fairmot_dla34.yml',
'_base_/fairmot_reader_1088x608.yml',
]
weights: output/fairmot_dla34_30e_1088x608/model_final

View File

@@ -0,0 +1,33 @@
_BASE_: [
'fairmot_dla34_30e_1088x608.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams
weights: output/fairmot_dla34_30e_1088x608_airplane/model_final
JDETracker:
conf_thres: 0.4
tracked_thresh: 0.4
metric_type: cosine
min_box_area: 0
vertical_ratio: 0
# for MOT training
TrainDataset:
!MOTDataSet
dataset_dir: dataset/mot
image_lists: ['airplane.train']
data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
# for MOT evaluation
# If you want to change the MOT evaluation dataset, please modify 'data_root'
EvalMOTDataset:
!MOTImageFolder
dataset_dir: dataset/mot
data_root: airplane/images/train
keep_ori_im: False # set True if save visualization images or video, or used in DeepSORT
# for MOT video inference
TestMOTDataset:
!MOTImageFolder
dataset_dir: dataset/mot
keep_ori_im: True # set True if save visualization images or video

View File

@@ -0,0 +1,31 @@
_BASE_: [
'../../datasets/mot.yml',
'../../runtime.yml',
'_base_/optimizer_30e.yml',
'_base_/fairmot_dla34.yml',
'_base_/fairmot_reader_1088x608.yml',
]
weights: output/fairmot_dla34_30e_1088x608_bytetracker/model_final
# for ablation study, MIX + MOT17-half
TrainDataset:
!MOTDataSet
dataset_dir: dataset/mot
image_lists: ['mot17.half', 'caltech.all', 'cuhksysu.train', 'prw.train', 'citypersons.train', 'eth.train']
data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
# for MOT evaluation
# If you want to change the MOT evaluation dataset, please modify 'data_root'
EvalMOTDataset:
!MOTImageFolder
dataset_dir: dataset/mot
data_root: MOT17/images/half
keep_ori_im: False # set True if save visualization images or video, or used in DeepSORT
JDETracker:
use_byte: True
match_thres: 0.8
conf_thres: 0.4
low_conf_thres: 0.2
min_box_area: 200
vertical_ratio: 1.6 # for pedestrian

View File

@@ -0,0 +1,9 @@
_BASE_: [
'../../datasets/mot.yml',
'../../runtime.yml',
'_base_/optimizer_30e.yml',
'_base_/fairmot_dla34.yml',
'_base_/fairmot_reader_576x320.yml',
]
weights: output/fairmot_dla34_30e_576x320/model_final

View File

@@ -0,0 +1,9 @@
_BASE_: [
'../../datasets/mot.yml',
'../../runtime.yml',
'_base_/optimizer_30e.yml',
'_base_/fairmot_dla34.yml',
'_base_/fairmot_reader_864x480.yml',
]
weights: output/fairmot_dla34_30e_864x480/model_final

View File

@@ -0,0 +1,56 @@
_BASE_: [
'../../datasets/mot.yml',
'../../runtime.yml',
'_base_/optimizer_30e.yml',
'_base_/fairmot_dla34.yml',
'_base_/fairmot_reader_1088x608.yml',
]
norm_type: sync_bn
use_ema: true
ema_decay: 0.9998
# add crowdhuman
TrainDataset:
!MOTDataSet
dataset_dir: dataset/mot
image_lists: ['mot17.train', 'caltech.all', 'cuhksysu.train', 'prw.train', 'citypersons.train', 'eth.train', 'crowdhuman.train', 'crowdhuman.val']
data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
worker_num: 4
TrainReader:
inputs_def:
image_shape: [3, 608, 1088]
sample_transforms:
- Decode: {}
- RGBReverse: {}
- AugmentHSV: {}
- LetterBoxResize: {target_size: [608, 1088]}
- MOTRandomAffine: {reject_outside: False}
- RandomFlip: {}
- BboxXYXY2XYWH: {}
- NormalizeBox: {}
- NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1]}
- RGBReverse: {}
- Permute: {}
batch_transforms:
- Gt2FairMOTTarget: {}
batch_size: 16
shuffle: True
drop_last: True
use_shared_memory: True
epoch: 60
LearningRate:
base_lr: 0.0005
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [40,]
use_warmup: False
OptimizerBuilder:
optimizer:
type: Adam
regularizer: NULL
weights: output/fairmot_enhance_dla34_60e_1088x608/model_final

View File

@@ -0,0 +1,56 @@
_BASE_: [
'../../datasets/mot.yml',
'../../runtime.yml',
'_base_/optimizer_30e.yml',
'_base_/fairmot_hardnet85.yml',
'_base_/fairmot_reader_1088x608.yml',
]
norm_type: sync_bn
use_ema: true
ema_decay: 0.9998
# add crowdhuman
TrainDataset:
!MOTDataSet
dataset_dir: dataset/mot
image_lists: ['mot17.train', 'caltech.all', 'cuhksysu.train', 'prw.train', 'citypersons.train', 'eth.train', 'crowdhuman.train', 'crowdhuman.val']
data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
worker_num: 4
TrainReader:
inputs_def:
image_shape: [3, 608, 1088]
sample_transforms:
- Decode: {}
- RGBReverse: {}
- AugmentHSV: {}
- LetterBoxResize: {target_size: [608, 1088]}
- MOTRandomAffine: {reject_outside: False}
- RandomFlip: {}
- BboxXYXY2XYWH: {}
- NormalizeBox: {}
- NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1]}
- RGBReverse: {}
- Permute: {}
batch_transforms:
- Gt2FairMOTTarget: {}
batch_size: 10
shuffle: True
drop_last: True
use_shared_memory: True
epoch: 30
LearningRate:
base_lr: 0.0001
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [20,]
use_warmup: False
OptimizerBuilder:
optimizer:
type: Adam
regularizer: NULL
weights: output/fairmot_enhance_hardnet85_30e_1088x608/model_final

View File

@@ -0,0 +1,43 @@
_BASE_: [
'../../datasets/mot.yml',
'../../runtime.yml',
'_base_/optimizer_30e_momentum.yml',
'_base_/fairmot_hrnetv2_w18_dlafpn.yml',
'_base_/fairmot_reader_1088x608.yml',
]
norm_type: sync_bn
use_ema: true
ema_decay: 0.9998
# add crowdhuman
TrainDataset:
!MOTDataSet
dataset_dir: dataset/mot
image_lists: ['mot17.train', 'caltech.all', 'cuhksysu.train', 'prw.train', 'citypersons.train', 'eth.train', 'crowdhuman.train', 'crowdhuman.val']
data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
worker_num: 4
TrainReader:
inputs_def:
image_shape: [3, 608, 1088]
sample_transforms:
- Decode: {}
- RGBReverse: {}
- AugmentHSV: {}
- LetterBoxResize: {target_size: [608, 1088]}
- MOTRandomAffine: {reject_outside: False}
- RandomFlip: {}
- BboxXYXY2XYWH: {}
- NormalizeBox: {}
- NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1]}
- RGBReverse: {}
- Permute: {}
batch_transforms:
- Gt2FairMOTTarget: {}
batch_size: 4
shuffle: True
drop_last: True
use_shared_memory: True
weights: output/fairmot_hrnetv2_w18_dlafpn_30e_1088x608/model_final

View File

@@ -0,0 +1,43 @@
_BASE_: [
'../../datasets/mot.yml',
'../../runtime.yml',
'_base_/optimizer_30e_momentum.yml',
'_base_/fairmot_hrnetv2_w18_dlafpn.yml',
'_base_/fairmot_reader_576x320.yml',
]
norm_type: sync_bn
use_ema: true
ema_decay: 0.9998
# add crowdhuman
TrainDataset:
!MOTDataSet
dataset_dir: dataset/mot
image_lists: ['mot17.train', 'caltech.all', 'cuhksysu.train', 'prw.train', 'citypersons.train', 'eth.train', 'crowdhuman.train', 'crowdhuman.val']
data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
worker_num: 4
TrainReader:
inputs_def:
image_shape: [3, 320, 576]
sample_transforms:
- Decode: {}
- RGBReverse: {}
- AugmentHSV: {}
- LetterBoxResize: {target_size: [320, 576]}
- MOTRandomAffine: {reject_outside: False}
- RandomFlip: {}
- BboxXYXY2XYWH: {}
- NormalizeBox: {}
- NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1]}
- RGBReverse: {}
- Permute: {}
batch_transforms:
- Gt2FairMOTTarget: {}
batch_size: 4
shuffle: True
drop_last: True
use_shared_memory: True
weights: output/fairmot_hrnetv2_w18_dlafpn_30e_576x320/model_final

View File

@@ -0,0 +1,43 @@
_BASE_: [
'../../datasets/mot.yml',
'../../runtime.yml',
'_base_/optimizer_30e_momentum.yml',
'_base_/fairmot_hrnetv2_w18_dlafpn.yml',
'_base_/fairmot_reader_864x480.yml',
]
norm_type: sync_bn
use_ema: true
ema_decay: 0.9998
# add crowdhuman
TrainDataset:
!MOTDataSet
dataset_dir: dataset/mot
image_lists: ['mot17.train', 'caltech.all', 'cuhksysu.train', 'prw.train', 'citypersons.train', 'eth.train', 'crowdhuman.train', 'crowdhuman.val']
data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
worker_num: 4
TrainReader:
inputs_def:
image_shape: [3, 480, 864]
sample_transforms:
- Decode: {}
- RGBReverse: {}
- AugmentHSV: {}
- LetterBoxResize: {target_size: [480, 864]}
- MOTRandomAffine: {reject_outside: False}
- RandomFlip: {}
- BboxXYXY2XYWH: {}
- NormalizeBox: {}
- NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1]}
- RGBReverse: {}
- Permute: {}
batch_transforms:
- Gt2FairMOTTarget: {}
batch_size: 4
shuffle: True
drop_last: True
use_shared_memory: True
weights: output/fairmot_hrnetv2_w18_dlafpn_30e_864x480/model_final