Replace the document detection model

2024-08-27 14:42:45 +08:00
parent aea6f19951
commit 1514e09c40
2072 changed files with 254336 additions and 4967 deletions


@@ -0,0 +1 @@
README_cn.md


@@ -0,0 +1,195 @@
Simplified Chinese | [English](README.md)

# ByteTrack (ByteTrack: Multi-Object Tracking by Associating Every Detection Box)

## Contents
- [Introduction](#introduction)
- [Model Zoo](#model-zoo)
    - [Pedestrian Tracking](#pedestrian-tracking)
    - [Head Tracking](#head-tracking)
- [Multi-Class Adaptation](#multi-class-adaptation)
- [Quick Start](#quick-start)
- [Citation](#citation)

## Introduction
[ByteTrack](https://arxiv.org/abs/2110.06864) (ByteTrack: Multi-Object Tracking by Associating Every Detection Box) tracks by associating every detection box instead of only the high-score ones. For low-score detection boxes, it uses their similarity to existing tracklets to recover true objects and filter out background detections. Several configurations for commonly used detectors are provided here as references. Differences in training datasets, input scales, numbers of training epochs, NMS threshold settings, and so on all lead to differences in model accuracy and performance; please adapt the configs to your own needs.
## Model Zoo

### Pedestrian Tracking

#### Results of ByteTrack with different detectors on the MOT-17 half Val Set

| Detector Training Set | Detector | Input Size | ReID | Detection mAP(0.5:0.95) | MOTA | IDF1 | FPS | Config |
| :-------- | :----- | :----: | :----:|:------: | :----: |:-----: |:----:|:----: |
| MOT-17 half train | YOLOv3 | 608x608 | - | 42.7 | 49.5 | 54.8 | - |[config](./bytetrack_yolov3.yml) |
| MOT-17 half train | PP-YOLOE-l | 640x640 | - | 52.9 | 50.4 | 59.7 | - |[config](./bytetrack_ppyoloe.yml) |
| MOT-17 half train | PP-YOLOE-l | 640x640 |PPLCNet| 52.9 | 51.7 | 58.8 | - |[config](./bytetrack_ppyoloe_pplcnet.yml) |
| **mix_mot_ch** | YOLOX-x | 800x1440| - | 61.9 | 77.3 | 71.6 | - |[config](./bytetrack_yolox.yml) |
| **mix_det** | YOLOX-x | 800x1440| - | 65.4 | 84.5 | 77.4 | - |[config](./bytetrack_yolox.yml) |
**Notes:**
- For detector-related configs and docs, see [detector](detector/).
- Model weight download URLs are given by the ```det_weights``` and ```reid_weights``` entries in the config files; they are downloaded automatically when you run the ```tools/eval_mot.py``` evaluation command. ```reid_weights: None``` means no ReID model is used.
- **ByteTrack does not use ReID weights by default.** To enable them, refer to [bytetrack_ppyoloe_pplcnet.yml](./bytetrack_ppyoloe_pplcnet.yml); to **use your own ReID weights, change its `reid_weights:` entry to your own weight path**, as sketched after this list.
- **MOT17-half train** consists of the images and annotations of the first half of the frames of each of the 7 MOT17 train sequences. For accuracy validation, use the **MOT17-half val** set, built from the second half of the frames of each video. The dataset can be downloaded from [this link](https://bj.bcebos.com/v1/paddledet/data/mot/MOT17.zip); extract it into the `dataset/mot/` folder.
- The **mix_mot_ch** dataset combines MOT17 and CrowdHuman; the **mix_det** dataset combines MOT17, CrowdHuman, Cityscapes, and ETHZ. For the expected data format and directory layout, see [this link](https://github.com/ifzhang/ByteTrack#data-preparation), and place the data under `dataset/mot/`. In both cases, accuracy can be validated on the **MOT17-half val** set.
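For reference, a minimal sketch of the block you would edit, condensed from [bytetrack_ppyoloe_pplcnet.yml](./bytetrack_ppyoloe_pplcnet.yml) later in this commit; the local checkpoint path below is a hypothetical placeholder:

```yaml
ByteTrack:
  detector: YOLOv3         # PP-YOLOE flavor of the YOLOv3 architecture class
  reid: PPLCNetEmbedding   # the ReID head; ByteTrack works without it by default
  tracker: JDETracker
det_weights: https://bj.bcebos.com/v1/paddledet/models/mot/ppyoloe_crn_l_36e_640x640_mot17half.pdparams
reid_weights: output/my_pplcnet_reid/model_final.pdparams  # hypothetical local path; set to None to disable ReID
```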
#### Results of YOLOX-x ByteTrack (mix_det) on MOT-16/MOT-17

[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/pp-yoloe-an-evolved-version-of-yolo/multi-object-tracking-on-mot16)](https://paperswithcode.com/sota/multi-object-tracking-on-mot16?p=pp-yoloe-an-evolved-version-of-yolo)

| Model | Dataset | MOTA | IDF1 | IDS | FP | FN | FPS | Download | Config |
| :---------: | :-------: | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
| ByteTrack-x| MOT-17 Train | 84.4 | 72.8 | 837 | 5653 | 10985 | - |[model](https://paddledet.bj.bcebos.com/models/mot/yolox_x_24e_800x1440_mix_det.pdparams) | [config](./bytetrack_yolox.yml) |
| ByteTrack-x| **MOT-17 Test** | **78.4** | 69.7 | 4974 | 37551 | 79524 | - |[model](https://paddledet.bj.bcebos.com/models/mot/yolox_x_24e_800x1440_mix_det.pdparams) | [config](./bytetrack_yolox.yml) |
| ByteTrack-x| MOT-16 Train | 83.5 | 72.7 | 800 | 6973 | 10419 | - |[model](https://paddledet.bj.bcebos.com/models/mot/yolox_x_24e_800x1440_mix_det.pdparams) | [config](./bytetrack_yolox.yml) |
| ByteTrack-x| **MOT-16 Test** | **77.7** | 70.1 | 1570 | 15695 | 23304 | - |[model](https://paddledet.bj.bcebos.com/models/mot/yolox_x_24e_800x1440_mix_det.pdparams) | [config](./bytetrack_yolox.yml) |
**Notes:**
- The **mix_det** dataset combines MOT17, CrowdHuman, Cityscapes, and ETHZ; for the data format and directory layout, see [this link](https://github.com/ifzhang/ByteTrack#data-preparation), and place the data under `dataset/mot/`.
- The MOT-17 Train and MOT-16 Train numbers come from local evaluation on those sets. Because the Train sets are part of the training data, these MOTA values do not reflect the model's detection/tracking ability; they are reported only because MOT-17 and MOT-16 have no validation sets while their Train sets do have ground truth, which makes accuracy checks convenient.
- The MOT-17 Test and MOT-16 Test numbers come from submitting results to the [MOTChallenge](https://motchallenge.net) evaluation server. Since the ground truth of the MOT-17 and MOT-16 Test sets is not public, these MOTA values do reflect the model's detection/tracking ability.
- Training ByteTrack means training the detector alone on MOT data; at inference time the tracker is assembled on top of it to evaluate MOT metrics, and the standalone detection model can also be evaluated on detection metrics (see the condensed excerpt after this list).
- Export and deployment likewise export the detection model alone and assemble the tracker at runtime; see [PP-Tracking](../../../deploy/pptracking/python/README.md).
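Condensed from [bytetrack_ppyoloe.yml](./bytetrack_ppyoloe.yml) later in this commit, this is how an assembled eval/infer config pairs the standalone detector with the BYTE tracker:

```yaml
metric: MOT            # eval/infer mode; set 'COCO' for detector training
architecture: ByteTrack
ByteTrack:
  detector: YOLOv3
  reid: None           # ByteTrack needs no ReID by default
  tracker: JDETracker
JDETracker:
  use_byte: True       # BYTE association over both high- and low-score boxes
  conf_thres: 0.2      # high-score threshold for the first association
  low_conf_thres: 0.1  # floor for the second, low-score association
```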
### Head Tracking

#### Results of YOLOX-x ByteTrack on the HT-21 Val Set

| Model | Input Size | MOTA | IDF1 | IDS | FP | FN | FPS | Download | Config |
| :--------------| :------- | :----: | :----: | :---: | :----: | :---: | :------: | :----: |:----: |
| ByteTrack-x | 1440x800 | 64.1 | 63.4 | 4191 | 185162 | 210240 | - | [model](https://paddledet.bj.bcebos.com/models/mot/bytetrack_yolox_ht21.pdparams) | [config](./bytetrack_yolox_ht21.yml) |

#### Results of YOLOX-x ByteTrack on the HT-21 Test Set

| Model | Input Size | MOTA | IDF1 | IDS | FP | FN | FPS | Download | Config |
| :--------------| :------- | :----: | :----: | :----: | :----: | :----: |:-------: | :----: | :----: |
| ByteTrack-x | 1440x800 | 72.6 | 61.8 | 5163 | 71235 | 154139 | - | [model](https://paddledet.bj.bcebos.com/models/mot/bytetrack_yolox_ht21.pdparams) | [config](./bytetrack_yolox_ht21.yml) |
**Notes:**
- For more head-tracking models, see [headtracking21](../headtracking21).
## Multi-Class Adaptation
For multi-class ByteTrack, see [bytetrack_ppyoloe_ppvehicle9cls.yml](./bytetrack_ppyoloe_ppvehicle9cls.yml), which uses weights trained on the PPVehicle9cls dataset from [PP-Vehicle](../../ppvehicle/) for multi-class vehicle tracking. Since there is no tracking ground truth, evaluation is impossible and only tracking prediction is run. You only need to modify `TestMOTDataset` and make sure its paths exist: its `anno_path` points to a `label_list.txt` that you write yourself, one class name per line. Note that if `anno_path` is wrong or the file cannot be found, the 80 COCO classes are used by default.
To **switch detector weights, change the `det_weights:` entry to your own weight path**, and update the **dataset paths, `label_list.txt`, and number of classes** accordingly, as sketched below.
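A minimal sketch of the pieces to edit, mirroring [bytetrack_ppyoloe_ppvehicle9cls.yml](./bytetrack_ppyoloe_ppvehicle9cls.yml) later in this commit; the `label_list.txt` must exist at the given path, and your own weights would replace the URL:

```yaml
metric: MCMOT    # multi-class; `MOT` for single class
num_classes: 9
TestMOTDataset:
  !MOTImageFolder
    dataset_dir: dataset/mot
    keep_ori_im: True
    anno_path: dataset/mot/label_list.txt  # hand-written, one class name per line:
                                           # pedestrian, rider, car, truck, bus, van, motorcycle, bicycle, others
det_weights: https://paddledet.bj.bcebos.com/models/mot_ppyoloe_l_36e_ppvehicle9cls.pdparams  # or your own path
```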
Predict multi-class vehicle tracking:
```bash
# Download the demo video
wget https://bj.bcebos.com/v1/paddledet/data/mot/demo/bdd100k_demo.mp4
# Use the PP-YOLOE multi-class vehicle detection model
CUDA_VISIBLE_DEVICES=1 python tools/infer_mot.py -c configs/mot/bytetrack/bytetrack_ppyoloe_ppvehicle9cls.yml --video_file=bdd100k_demo.mp4 --scaled=True --save_videos
```
**Notes:**
- Make sure [ffmpeg](https://ffmpeg.org/ffmpeg.html) is installed first; on Linux (Ubuntu) it can be installed with: `apt-get update && apt-get install -y ffmpeg`.
- `--scaled` indicates whether the coordinates in the model output have already been scaled back to the original image: set it to False for the JDE YOLOv3 detection model and to True for general detection models.
- `--save_videos` saves the visualized video; visualized images are also saved under `{output_dir}/mot_outputs/`. `{output_dir}` can be set with `--output_dir` and defaults to `output`.
## Quick Start
### 1. Training
Launch training and evaluation with the following commands:
```bash
python -m paddle.distributed.launch --log_dir=ppyoloe --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/mot/bytetrack/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml --eval --amp
# or
python -m paddle.distributed.launch --log_dir=ppyoloe --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/mot/bytetrack/detector/yolox_x_24e_800x1440_mix_det.yml --eval --amp
```
**Notes:**
- `--eval` evaluates accuracy during training; `--amp` enables mixed-precision training to avoid overflow. PaddlePaddle 2.2.2 is recommended.
### 2. Evaluation
#### 2.1 Evaluate detection
```bash
CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/mot/bytetrack/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml -o weights=https://bj.bcebos.com/v1/paddledet/models/mot/ppyoloe_crn_l_36e_640x640_mot17half.pdparams
# or
CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/mot/bytetrack/detector/yolox_x_24e_800x1440_mix_det.yml -o weights=https://bj.bcebos.com/v1/paddledet/models/mot/yolox_x_24e_800x1440_mix_det.pdparams
```
**Notes:**
- Detection is evaluated with ```tools/eval.py```; tracking is evaluated with ```tools/eval_mot.py```.
#### 2.2 Evaluate tracking
```bash
CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/bytetrack/bytetrack_yolov3.yml --scaled=True
# or
CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/bytetrack/bytetrack_ppyoloe.yml --scaled=True
# or
CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/bytetrack/bytetrack_ppyoloe_pplcnet.yml --scaled=True
# or
CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/bytetrack/bytetrack_yolox.yml --scaled=True
```
**Notes:**
- `--scaled` indicates whether the coordinates in the model output have already been scaled back to the original image: set it to False for the JDE YOLOv3 detection model and to True for general detection models. The default is False.
- Tracking results are saved under `{output_dir}/mot_results/`, one txt file per video sequence; each line is `frame,id,x1,y1,w,h,score,-1,-1,-1`. `{output_dir}` can be set with `--output_dir` and defaults to `output`.
### 3. Prediction
Use a single GPU to predict a video and save the result as a video:
```bash
# Download the demo video
wget https://bj.bcebos.com/v1/paddledet/data/mot/demo/mot17_demo.mp4
# Use the PP-YOLOE pedestrian detection model
CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/bytetrack/bytetrack_ppyoloe.yml --video_file=mot17_demo.mp4 --scaled=True --save_videos
# Or use the YOLOX pedestrian detection model
CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/bytetrack/bytetrack_yolox.yml --video_file=mot17_demo.mp4 --scaled=True --save_videos
```
**Notes:**
- Make sure [ffmpeg](https://ffmpeg.org/ffmpeg.html) is installed first; on Linux (Ubuntu) it can be installed with: `apt-get update && apt-get install -y ffmpeg`.
- `--scaled` indicates whether the coordinates in the model output have already been scaled back to the original image: set it to False for the JDE YOLOv3 detection model and to True for general detection models.
- `--save_videos` saves the visualized video; visualized images are also saved under `{output_dir}/mot_outputs/`. `{output_dir}` can be set with `--output_dir` and defaults to `output`.
### 4. Export the inference model
Step 1: export the detection model
```bash
# Export the PP-YOLOE pedestrian detection model
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/bytetrack/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/ppyoloe_crn_l_36e_640x640_mot17half.pdparams
# Or export the YOLOX pedestrian detection model
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/bytetrack/detector/yolox_x_24e_800x1440_mix_det.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/yolox_x_24e_800x1440_mix_det.pdparams
```
Step 2: export the ReID model (optional; not needed by default)
```bash
# Export the PPLCNet ReID model
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/reid/deepsort_pplcnet.yml -o reid_weights=https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pplcnet.pdparams
```
### 5. Python inference with the exported models
```bash
python deploy/pptracking/python/mot_sde_infer.py --model_dir=output_inference/ppyoloe_crn_l_36e_640x640_mot17half/ --tracker_config=deploy/pptracking/python/tracker_config.yml --video_file=mot17_demo.mp4 --device=GPU --save_mot_txts
# or
python deploy/pptracking/python/mot_sde_infer.py --model_dir=output_inference/yolox_x_24e_800x1440_mix_det/ --tracker_config=deploy/pptracking/python/tracker_config.yml --video_file=mot17_demo.mp4 --device=GPU --save_mot_txts
```
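The `--tracker_config` file selects and parameterizes the tracker at deployment time. A hypothetical sketch, assuming the deploy-side `tracker_config.yml` mirrors the `JDETracker` keys used by the assembled configs in this commit:

```yaml
# Hypothetical deploy-side tracker config; field names assumed to mirror
# the JDETracker block of the assembled training configs above.
type: JDETracker
JDETracker:
  use_byte: True
  match_thres: 0.9
  conf_thres: 0.6
  low_conf_thres: 0.2
  min_box_area: 100
  vertical_ratio: 1.6  # pedestrian-specific aspect-ratio filter
```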
**Notes:**
- The tracking model predicts on videos; single-image prediction is not supported. By default the visualized tracking video is saved; add `--save_mot_txts` (one txt per video) or `--save_mot_txt_per_img` (one txt per image) to save tracking-result txt files, and `--save_images` to save visualized images.
- Each line of a tracking-result txt file is `frame,id,x1,y1,w,h,score,-1,-1,-1`.
## Citation
```
@article{zhang2021bytetrack,
title={ByteTrack: Multi-Object Tracking by Associating Every Detection Box},
author={Zhang, Yifu and Sun, Peize and Jiang, Yi and Yu, Dongdong and Yuan, Zehuan and Luo, Ping and Liu, Wenyu and Wang, Xinggang},
journal={arXiv preprint arXiv:2110.06864},
year={2021}
}
```


@@ -0,0 +1,34 @@
metric: COCO
num_classes: 1

# Detection Dataset for training
TrainDataset:
  !COCODataSet
    image_dir: images/train
    anno_path: annotations/train.json
    dataset_dir: dataset/mot/HT21
    data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']

EvalDataset:
  !COCODataSet
    image_dir: images/train
    anno_path: annotations/val_half.json
    dataset_dir: dataset/mot/HT21

TestDataset:
  !ImageFolder
    dataset_dir: dataset/mot/HT21
    anno_path: annotations/val_half.json

# MOTDataset for MOT evaluation and inference
EvalMOTDataset:
  !MOTImageFolder
    dataset_dir: dataset/mot
    data_root: HT21/images/test
    keep_ori_im: True # set as True in DeepSORT and ByteTrack

TestMOTDataset:
  !MOTImageFolder
    dataset_dir: dataset/mot
    keep_ori_im: True # set True if save visualization images or video


@@ -0,0 +1,34 @@
metric: COCO
num_classes: 1

# Detection Dataset for training
TrainDataset:
  !COCODataSet
    image_dir: ""
    anno_path: annotations/train.json
    dataset_dir: dataset/mot/mix_det
    data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']

EvalDataset:
  !COCODataSet
    image_dir: images/train
    anno_path: annotations/val_half.json
    dataset_dir: dataset/mot/MOT17

TestDataset:
  !ImageFolder
    anno_path: annotations/val_half.json
    dataset_dir: dataset/mot/MOT17

# MOTDataset for MOT evaluation and inference
EvalMOTDataset:
  !MOTImageFolder
    dataset_dir: dataset/mot
    data_root: MOT17/images/half
    keep_ori_im: True # set as True in DeepSORT and ByteTrack

TestMOTDataset:
  !MOTImageFolder
    dataset_dir: dataset/mot
    keep_ori_im: True # set True if save visualization images or video


@@ -0,0 +1,34 @@
metric: COCO
num_classes: 1

# Detection Dataset for training
TrainDataset:
  !COCODataSet
    image_dir: ""
    anno_path: annotations/train.json
    dataset_dir: dataset/mot/mix_mot_ch
    data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']

EvalDataset:
  !COCODataSet
    image_dir: images/train
    anno_path: annotations/val_half.json
    dataset_dir: dataset/mot/MOT17

TestDataset:
  !ImageFolder
    anno_path: annotations/val_half.json
    dataset_dir: dataset/mot/MOT17

# MOTDataset for MOT evaluation and inference
EvalMOTDataset:
  !MOTImageFolder
    dataset_dir: dataset/mot
    data_root: MOT17/images/half
    keep_ori_im: True # set as True in DeepSORT and ByteTrack

TestMOTDataset:
  !MOTImageFolder
    dataset_dir: dataset/mot
    keep_ori_im: True # set True if save visualization images or video


@@ -0,0 +1,34 @@
metric: COCO
num_classes: 1

# Detection Dataset for training
TrainDataset:
  !COCODataSet
    dataset_dir: dataset/mot/MOT17
    anno_path: annotations/train_half.json
    image_dir: images/train
    data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']

EvalDataset:
  !COCODataSet
    dataset_dir: dataset/mot/MOT17
    anno_path: annotations/val_half.json
    image_dir: images/train

TestDataset:
  !ImageFolder
    dataset_dir: dataset/mot/MOT17
    anno_path: annotations/val_half.json

# MOTDataset for MOT evaluation and inference
EvalMOTDataset:
  !MOTImageFolder
    dataset_dir: dataset/mot
    data_root: MOT17/images/half
    keep_ori_im: True # set as True in DeepSORT and ByteTrack

TestMOTDataset:
  !MOTImageFolder
    dataset_dir: dataset/mot
    keep_ori_im: True # set True if save visualization images or video


@@ -0,0 +1,60 @@
worker_num: 4
eval_height: &eval_height 640
eval_width: &eval_width 640
eval_size: &eval_size [*eval_height, *eval_width]

TrainReader:
  sample_transforms:
    - Decode: {}
    - RandomDistort: {}
    - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
    - RandomCrop: {}
    - RandomFlip: {}
  batch_transforms:
    - BatchRandomResize: {target_size: [320, 352, 384, 416, 448, 480, 512, 544, 576, 608, 640, 672, 704, 736, 768], random_size: True, random_interp: True, keep_ratio: False}
    - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
    - Permute: {}
    - PadGT: {}
  batch_size: 8
  shuffle: true
  drop_last: true
  use_shared_memory: true
  collate_batch: true

EvalReader:
  sample_transforms:
    - Decode: {}
    - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
    - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
    - Permute: {}
  batch_size: 8

TestReader:
  inputs_def:
    image_shape: [3, *eval_height, *eval_width]
  sample_transforms:
    - Decode: {}
    - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
    - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
    - Permute: {}
  batch_size: 1

# add MOTReader for MOT evaluation and inference, note batch_size should be 1 in MOT
EvalMOTReader:
  sample_transforms:
    - Decode: {}
    - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
    - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
    - Permute: {}
  batch_size: 1

TestMOTReader:
  inputs_def:
    image_shape: [3, *eval_height, *eval_width]
  sample_transforms:
    - Decode: {}
    - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
    - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
    - Permute: {}
  batch_size: 1


@@ -0,0 +1,66 @@
worker_num: 2

TrainReader:
  inputs_def:
    num_max_boxes: 50
  sample_transforms:
    - Decode: {}
    - Mixup: {alpha: 1.5, beta: 1.5}
    - RandomDistort: {}
    - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
    - RandomCrop: {}
    - RandomFlip: {}
  batch_transforms:
    - BatchRandomResize: {target_size: [320, 352, 384, 416, 448, 480, 512, 544, 576, 608], random_size: True, random_interp: True, keep_ratio: False}
    - NormalizeBox: {}
    - PadBox: {num_max_boxes: 50}
    - BboxXYXY2XYWH: {}
    - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
    - Permute: {}
    - Gt2YoloTarget: {anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]], anchors: [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]], downsample_ratios: [32, 16, 8]}
  batch_size: 8
  shuffle: true
  drop_last: true
  mixup_epoch: 250
  use_shared_memory: true

EvalReader:
  inputs_def:
    num_max_boxes: 50
  sample_transforms:
    - Decode: {}
    - Resize: {target_size: [608, 608], keep_ratio: False, interp: 2}
    - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
    - Permute: {}
  batch_size: 8

TestReader:
  inputs_def:
    image_shape: [3, 608, 608]
  sample_transforms:
    - Decode: {}
    - Resize: {target_size: [608, 608], keep_ratio: False, interp: 2}
    - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
    - Permute: {}
  batch_size: 1

# add MOTReader for MOT evaluation and inference, note batch_size should be 1 in MOT
EvalMOTReader:
  inputs_def:
    num_max_boxes: 50
  sample_transforms:
    - Decode: {}
    - Resize: {target_size: [608, 608], keep_ratio: False, interp: 2}
    - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
    - Permute: {}
  batch_size: 1

TestMOTReader:
  inputs_def:
    image_shape: [3, 608, 608]
  sample_transforms:
    - Decode: {}
    - Resize: {target_size: [608, 608], keep_ratio: False, interp: 2}
    - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
    - Permute: {}
  batch_size: 1


@@ -0,0 +1,67 @@
input_height: &input_height 800
input_width: &input_width 1440
input_size: &input_size [*input_height, *input_width]
worker_num: 4

TrainReader:
  sample_transforms:
    - Decode: {}
    - Mosaic:
        prob: 1.0
        input_dim: *input_size
        degrees: [-10, 10]
        scale: [0.1, 2.0]
        shear: [-2, 2]
        translate: [-0.1, 0.1]
        enable_mixup: True
        mixup_prob: 1.0
        mixup_scale: [0.5, 1.5]
    - AugmentHSV: {is_bgr: False, hgain: 5, sgain: 30, vgain: 30}
    - PadResize: {target_size: *input_size}
    - RandomFlip: {}
  batch_transforms:
    - Permute: {}
  batch_size: 6
  shuffle: True
  drop_last: True
  collate_batch: False
  mosaic_epoch: 20

EvalReader:
  sample_transforms:
    - Decode: {}
    - Resize: {target_size: *input_size, keep_ratio: True}
    - Pad: {size: *input_size, fill_value: [114., 114., 114.]}
    - Permute: {}
  batch_size: 8

TestReader:
  inputs_def:
    image_shape: [3, 800, 1440]
  sample_transforms:
    - Decode: {}
    - Resize: {target_size: *input_size, keep_ratio: True}
    - Pad: {size: *input_size, fill_value: [114., 114., 114.]}
    - Permute: {}
  batch_size: 1

# add MOTReader for MOT evaluation and inference, note batch_size should be 1 in MOT
EvalMOTReader:
  sample_transforms:
    - Decode: {}
    - Resize: {target_size: *input_size, keep_ratio: True}
    - Pad: {size: *input_size, fill_value: [114., 114., 114.]}
    - Permute: {}
  batch_size: 1

TestMOTReader:
  inputs_def:
    image_shape: [3, 800, 1440]
  sample_transforms:
    - Decode: {}
    - Resize: {target_size: *input_size, keep_ratio: True}
    - Pad: {size: *input_size, fill_value: [114., 114., 114.]}
    - Permute: {}
  batch_size: 1


@@ -0,0 +1,59 @@
# This config is an assembled config for ByteTrack MOT, used as eval/infer mode for MOT.
_BASE_: [
  'detector/ppyoloe_crn_l_36e_640x640_mot17half.yml',
  '_base_/mot17.yml',
  '_base_/ppyoloe_mot_reader_640x640.yml'
]
weights: output/bytetrack_ppyoloe/model_final
log_iter: 20
snapshot_epoch: 2

metric: MOT # eval/infer mode; set 'COCO' for training mode
num_classes: 1

architecture: ByteTrack
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/ppyoloe_crn_l_300e_coco.pdparams
ByteTrack:
  detector: YOLOv3 # PPYOLOe version
  reid: None
  tracker: JDETracker
det_weights: https://bj.bcebos.com/v1/paddledet/models/mot/ppyoloe_crn_l_36e_640x640_mot17half.pdparams
reid_weights: None

YOLOv3:
  backbone: CSPResNet
  neck: CustomCSPPAN
  yolo_head: PPYOLOEHead
  post_process: ~

# Tracking requires higher quality boxes, so NMS score_threshold will be higher
PPYOLOEHead:
  fpn_strides: [32, 16, 8]
  grid_cell_scale: 5.0
  grid_cell_offset: 0.5
  static_assigner_epoch: -1 # 100
  use_varifocal_loss: True
  loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
  static_assigner:
    name: ATSSAssigner
    topk: 9
  assigner:
    name: TaskAlignedAssigner
    topk: 13
    alpha: 1.0
    beta: 6.0
  nms:
    name: MultiClassNMS
    nms_top_k: 1000
    keep_top_k: 100
    score_threshold: 0.1 # 0.01 in original detector
    nms_threshold: 0.4 # 0.6 in original detector

# BYTETracker
JDETracker:
  use_byte: True        # enable BYTE association over high- and low-score boxes
  match_thres: 0.9      # matching threshold for association
  conf_thres: 0.2       # high-score detection threshold for the first association
  low_conf_thres: 0.1   # lower bound for the second, low-score association
  min_box_area: 100     # filter out tiny boxes
  vertical_ratio: 1.6   # aspect-ratio filter for pedestrians


@@ -0,0 +1,59 @@
# This config is an assembled config for ByteTrack MOT, used as eval/infer mode for MOT.
_BASE_: [
'detector/ppyoloe_crn_l_36e_640x640_mot17half.yml',
'_base_/mot17.yml',
'_base_/ppyoloe_mot_reader_640x640.yml'
]
weights: output/bytetrack_ppyoloe_pplcnet/model_final
log_iter: 20
snapshot_epoch: 2
metric: MOT # eval/infer mode
num_classes: 1
architecture: ByteTrack
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/ppyoloe_crn_l_300e_coco.pdparams
ByteTrack:
detector: YOLOv3 # PPYOLOe version
reid: PPLCNetEmbedding # use reid
tracker: JDETracker
det_weights: https://bj.bcebos.com/v1/paddledet/models/mot/ppyoloe_crn_l_36e_640x640_mot17half.pdparams
reid_weights: https://bj.bcebos.com/v1/paddledet/models/mot/deepsort_pplcnet.pdparams
YOLOv3:
backbone: CSPResNet
neck: CustomCSPPAN
yolo_head: PPYOLOEHead
post_process: ~
# Tracking requires higher quality boxes, so NMS score_threshold will be higher
PPYOLOEHead:
fpn_strides: [32, 16, 8]
grid_cell_scale: 5.0
grid_cell_offset: 0.5
static_assigner_epoch: -1 # 100
use_varifocal_loss: True
loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
static_assigner:
name: ATSSAssigner
topk: 9
assigner:
name: TaskAlignedAssigner
topk: 13
alpha: 1.0
beta: 6.0
nms:
name: MultiClassNMS
nms_top_k: 1000
keep_top_k: 100
score_threshold: 0.1 # 0.01 in original detector
nms_threshold: 0.4 # 0.6 in original detector
# BYTETracker
JDETracker:
use_byte: True
match_thres: 0.9
conf_thres: 0.2
low_conf_thres: 0.1
min_box_area: 100
vertical_ratio: 1.6 # for pedestrian


@@ -0,0 +1,49 @@
# This config is an assembled config for ByteTrack MOT, used as eval/infer mode for MOT.
_BASE_: [
'bytetrack_ppyoloe.yml',
'_base_/ppyoloe_mot_reader_640x640.yml'
]
weights: output/bytetrack_ppyoloe_ppvehicle9cls/model_final
metric: MCMOT # multi-class, `MOT` for single class
num_classes: 9
# pedestrian(1), rider(2), car(3), truck(4), bus(5), van(6), motorcycle(7), bicycle(8), others(9)
TestMOTDataset:
!MOTImageFolder
dataset_dir: dataset/mot
keep_ori_im: True # set True if save visualization images or video
anno_path: dataset/mot/label_list.txt # absolute path
### write in label_list.txt each line:
# pedestrian
# rider
# car
# truck
# bus
# van
# motorcycle
# bicycle
# others
###
det_weights: https://paddledet.bj.bcebos.com/models/mot_ppyoloe_l_36e_ppvehicle9cls.pdparams
depth_mult: 1.0
width_mult: 1.0
# Tracking requires higher quality boxes, so NMS score_threshold will be higher
PPYOLOEHead:
nms:
name: MultiClassNMS
nms_top_k: 1000
keep_top_k: 100
score_threshold: 0.1 # 0.01 in original detector
nms_threshold: 0.4 # 0.6 in original detector
# BYTETracker
JDETracker:
use_byte: True
match_thres: 0.9
conf_thres: 0.2
low_conf_thres: 0.1
min_box_area: 0
vertical_ratio: 0 # only use 1.6 in MOT17 pedestrian


@@ -0,0 +1,50 @@
# This config is an assembled config for ByteTrack MOT, used as eval/infer mode for MOT.
_BASE_: [
'detector/yolov3_darknet53_40e_608x608_mot17half.yml',
'_base_/mot17.yml',
'_base_/yolov3_mot_reader_608x608.yml'
]
weights: output/bytetrack_yolov3/model_final
log_iter: 20
snapshot_epoch: 2
metric: MOT # eval/infer mode
num_classes: 1
architecture: ByteTrack
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/yolov3_darknet53_270e_coco.pdparams
ByteTrack:
detector: YOLOv3 # General YOLOv3 version
reid: None
tracker: JDETracker
det_weights: https://bj.bcebos.com/v1/paddledet/models/mot/yolov3_darknet53_40e_608x608_mot17half.pdparams
reid_weights: None
YOLOv3:
backbone: DarkNet
neck: YOLOv3FPN
yolo_head: YOLOv3Head
post_process: BBoxPostProcess
# Tracking requires higher quality boxes, so NMS score_threshold will be higher
BBoxPostProcess:
decode:
name: YOLOBox
conf_thresh: 0.005
downsample_ratio: 32
clip_bbox: true
nms:
name: MultiClassNMS
keep_top_k: 100
score_threshold: 0.01
nms_threshold: 0.45
nms_top_k: 1000
# BYTETracker
JDETracker:
use_byte: True
match_thres: 0.9
conf_thres: 0.2
low_conf_thres: 0.1
min_box_area: 100
vertical_ratio: 1.6 # for pedestrian


@@ -0,0 +1,68 @@
# This config is an assembled config for ByteTrack MOT, used as eval/infer mode for MOT.
_BASE_: [
'detector/yolox_x_24e_800x1440_mix_det.yml',
'_base_/mix_det.yml',
'_base_/yolox_mot_reader_800x1440.yml'
]
weights: output/bytetrack_yolox/model_final
log_iter: 20
snapshot_epoch: 2
metric: MOT # eval/infer mode
num_classes: 1
architecture: ByteTrack
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/yolox_x_300e_coco.pdparams
ByteTrack:
detector: YOLOX
reid: None
tracker: JDETracker
det_weights: https://bj.bcebos.com/v1/paddledet/models/mot/yolox_x_24e_800x1440_mix_det.pdparams
reid_weights: None
depth_mult: 1.33
width_mult: 1.25
YOLOX:
backbone: CSPDarkNet
neck: YOLOCSPPAN
head: YOLOXHead
input_size: [800, 1440]
size_stride: 32
size_range: [18, 22] # multi-scale range [576*1024 ~ 800*1440], w/h ratio=1.8
CSPDarkNet:
arch: "X"
return_idx: [2, 3, 4]
depthwise: False
YOLOCSPPAN:
depthwise: False
# Tracking requires higher quality boxes, so NMS score_threshold will be higher
YOLOXHead:
l1_epoch: 20
depthwise: False
loss_weight: {cls: 1.0, obj: 1.0, iou: 5.0, l1: 1.0}
assigner:
name: SimOTAAssigner
candidate_topk: 10
use_vfl: False
nms:
name: MultiClassNMS
nms_top_k: 1000
keep_top_k: 100
score_threshold: 0.01
nms_threshold: 0.7
# For speed while keep high mAP, you can modify 'nms_top_k' to 1000 and 'keep_top_k' to 100, the mAP will drop about 0.1%.
# For high speed demo, you can modify 'score_threshold' to 0.25 and 'nms_threshold' to 0.45, but the mAP will drop a lot.
# BYTETracker
JDETracker:
use_byte: True
match_thres: 0.9
conf_thres: 0.6
low_conf_thres: 0.2
min_box_area: 100
vertical_ratio: 1.6 # for pedestrian


@@ -0,0 +1,68 @@
# This config is an assembled config for ByteTrack MOT, used as eval/infer mode for MOT.
_BASE_: [
'detector/yolox_x_24e_800x1440_ht21.yml',
'_base_/ht21.yml',
'_base_/yolox_mot_reader_800x1440.yml'
]
weights: output/bytetrack_yolox_ht21/model_final
log_iter: 20
snapshot_epoch: 2
metric: MOT # eval/infer mode
num_classes: 1
architecture: ByteTrack
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/yolox_x_300e_coco.pdparams
ByteTrack:
detector: YOLOX
reid: None
tracker: JDETracker
det_weights: https://bj.bcebos.com/v1/paddledet/models/mot/yolox_x_24e_800x1440_ht21.pdparams
reid_weights: None
depth_mult: 1.33
width_mult: 1.25
YOLOX:
backbone: CSPDarkNet
neck: YOLOCSPPAN
head: YOLOXHead
input_size: [800, 1440]
size_stride: 32
size_range: [18, 22] # multi-scale range [576*1024 ~ 800*1440], w/h ratio=1.8
CSPDarkNet:
arch: "X"
return_idx: [2, 3, 4]
depthwise: False
YOLOCSPPAN:
depthwise: False
# Tracking requires higher quality boxes, so NMS score_threshold will be higher
YOLOXHead:
l1_epoch: 20
depthwise: False
loss_weight: {cls: 1.0, obj: 1.0, iou: 5.0, l1: 1.0}
assigner:
name: SimOTAAssigner
candidate_topk: 10
use_vfl: False
nms:
name: MultiClassNMS
nms_top_k: 30000
keep_top_k: 1000
score_threshold: 0.01
nms_threshold: 0.7
# For speed while keep high mAP, you can modify 'nms_top_k' to 1000 and 'keep_top_k' to 100, the mAP will drop about 0.1%.
# For high speed demo, you can modify 'score_threshold' to 0.25 and 'nms_threshold' to 0.45, but the mAP will drop a lot.
# BYTETracker
JDETracker:
use_byte: True
match_thres: 0.9
conf_thres: 0.7
low_conf_thres: 0.1
min_box_area: 0
vertical_ratio: 0 # 1.6 for pedestrian


@@ -0,0 +1 @@
README_cn.md


@@ -0,0 +1,39 @@
简体中文 | [English](README.md)
# ByteTrack的检测器
## 简介
[ByteTrack](https://arxiv.org/abs/2110.06864)(ByteTrack: Multi-Object Tracking by Associating Every Detection Box) 通过关联每个检测框来跟踪而不仅是关联高分的检测框。此处提供了几个常用检测器的配置作为参考。由于训练数据集、输入尺度、训练epoch数、NMS阈值设置等的不同均会导致模型精度和性能的差异请自行根据需求进行适配。
## 模型库
### 在MOT17-half val数据集上的检测结果
| 骨架网络 | 网络类型 | 输入尺度 | 学习率策略 |推理时间(fps) | Box AP | 下载 | 配置文件 |
| :-------------- | :------------- | :--------: | :---------: | :-----------: | :-----: | :------: | :-----: |
| DarkNet-53 | YOLOv3 | 608X608 | 40e | ---- | 42.7 | [下载链接](https://paddledet.bj.bcebos.com/models/mot/deepsort/yolov3_darknet53_40e_608x608_mot17half.pdparams) | [配置文件](./yolov3_darknet53_40e_608x608_mot17half.yml) |
| CSPResNet | PPYOLOe | 640x640 | 36e | ---- | 52.9 | [下载链接](https://paddledet.bj.bcebos.com/models/mot/deepsort/ppyoloe_crn_l_36e_640x640_mot17half.pdparams) | [配置文件](./ppyoloe_crn_l_36e_640x640_mot17half.yml) |
| CSPDarkNet | YOLOX-x(mix_mot_ch) | 800x1440 | 24e | ---- | 61.9 | [下载链接](https://paddledet.bj.bcebos.com/models/mot/deepsort/yolox_x_24e_800x1440_mix_mot_ch.pdparams) | [配置文件](./yolox_x_24e_800x1440_mix_mot_ch.yml) |
| CSPDarkNet | YOLOX-x(mix_det) | 800x1440 | 24e | ---- | 65.4 | [下载链接](https://paddledet.bj.bcebos.com/models/mot/deepsort/yolox_x_24e_800x1440_mix_det.pdparams) | [配置文件](./yolox_x_24e_800x1440_mix_det.yml) |
**Notes:**
- Except for YOLOX, the models above are trained on the **MOT17-half train** set, which can be downloaded from [this link](https://bj.bcebos.com/v1/paddledet/data/mot/MOT17.zip).
- **MOT17-half train** consists of the images and annotations of the first half of the frames of each of the 7 MOT17 train sequences. For accuracy validation, use the **MOT17-half val** set, built from the second half of the frames of each video; its annotations can be downloaded from [this link](https://paddledet.bj.bcebos.com/data/mot/mot17half/annotations.zip) and extracted into the `dataset/mot/MOT17/images/` folder.
- YOLOX-x(mix_mot_ch) is trained on the **mix_mot_ch** dataset, which combines MOT17 and CrowdHuman; YOLOX-x(mix_det) is trained on the **mix_det** dataset, which combines MOT17, CrowdHuman, Cityscapes, and ETHZ. For the data format and directory layout, see [this link](https://github.com/ifzhang/ByteTrack#data-preparation), and place the data under `dataset/mot/`. In both cases, accuracy can be validated on the **MOT17-half val** set.
- For pedestrian tracking, use a pedestrian detector together with a pedestrian ReID model; for vehicle tracking, use a vehicle detector together with a vehicle ReID model.
- When these models are used for ByteTrack tracking, post-processing settings such as the NMS thresholds differ from those of the pure detection task, as shown below.
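Concretely, the assembled tracking configs in this commit override the detector's NMS thresholds like this (values taken from bytetrack_ppyoloe.yml):

```yaml
# Tracking requires higher-quality boxes, so the NMS score_threshold is raised.
PPYOLOEHead:
  nms:
    name: MultiClassNMS
    nms_top_k: 1000
    keep_top_k: 100
    score_threshold: 0.1 # 0.01 in the standalone detector
    nms_threshold: 0.4   # 0.6 in the standalone detector
```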
## Quick Start
Launch training, evaluation, and export with the following commands:
```bash
job_name=ppyoloe_crn_l_36e_640x640_mot17half
config=configs/mot/bytetrack/detector/${job_name}.yml
log_dir=log_dir/${job_name}
# 1. training
python -m paddle.distributed.launch --log_dir=${log_dir} --gpus 0,1,2,3,4,5,6,7 tools/train.py -c ${config} --eval --amp
# 2. evaluation
CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c ${config} -o weights=output/${job_name}/model_final.pdparams
# 3. export
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c ${config} -o weights=output/${job_name}/model_final.pdparams
```


@@ -0,0 +1,83 @@
# This config is an assembled config for ByteTrack MOT, used as eval/infer mode for MOT.
_BASE_: [
'../../../ppyoloe/ppyoloe_crn_l_300e_coco.yml',
'../_base_/mot17.yml',
]
weights: output/ppyoloe_crn_l_36e_640x640_mot17half/model_final
log_iter: 20
snapshot_epoch: 2
# schedule configuration for fine-tuning
epoch: 36
LearningRate:
base_lr: 0.001
schedulers:
- !CosineDecay
max_epochs: 43
- !LinearWarmup
start_factor: 0.001
epochs: 1
OptimizerBuilder:
optimizer:
momentum: 0.9
type: Momentum
regularizer:
factor: 0.0005
type: L2
TrainReader:
batch_size: 8
# detector configuration
architecture: YOLOv3
norm_type: sync_bn
use_ema: true
ema_decay: 0.9998
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/ppyoloe_crn_l_300e_coco.pdparams
depth_mult: 1.0
width_mult: 1.0
YOLOv3:
backbone: CSPResNet
neck: CustomCSPPAN
yolo_head: PPYOLOEHead
post_process: ~
CSPResNet:
layers: [3, 6, 6, 3]
channels: [64, 128, 256, 512, 1024]
return_idx: [1, 2, 3]
use_large_stem: True
CustomCSPPAN:
out_channels: [768, 384, 192]
stage_num: 1
block_num: 3
act: 'swish'
spp: true
PPYOLOEHead:
fpn_strides: [32, 16, 8]
grid_cell_scale: 5.0
grid_cell_offset: 0.5
static_assigner_epoch: -1 # 100
use_varifocal_loss: True
loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
static_assigner:
name: ATSSAssigner
topk: 9
assigner:
name: TaskAlignedAssigner
topk: 13
alpha: 1.0
beta: 6.0
nms:
name: MultiClassNMS
nms_top_k: 1000
keep_top_k: 100
score_threshold: 0.01
nms_threshold: 0.6


@@ -0,0 +1,77 @@
# This config is an assembled config for ByteTrack MOT, used as eval/infer mode for MOT.
_BASE_: [
'../../../yolov3/yolov3_darknet53_270e_coco.yml',
'../_base_/mot17.yml',
]
weights: output/yolov3_darknet53_40e_608x608_mot17half/model_final
log_iter: 20
snapshot_epoch: 2
# schedule configuration for fine-tuning
epoch: 40
LearningRate:
base_lr: 0.0001
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones:
- 32
- 36
- !LinearWarmup
start_factor: 0.3333333333333333
steps: 100
OptimizerBuilder:
optimizer:
momentum: 0.9
type: Momentum
regularizer:
factor: 0.0005
type: L2
TrainReader:
batch_size: 8
mixup_epoch: 35
# detector configuration
architecture: YOLOv3
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/yolov3_darknet53_270e_coco.pdparams
norm_type: sync_bn
YOLOv3:
backbone: DarkNet
neck: YOLOv3FPN
yolo_head: YOLOv3Head
post_process: BBoxPostProcess
DarkNet:
depth: 53
return_idx: [2, 3, 4]
# use default config
# YOLOv3FPN:
YOLOv3Head:
anchors: [[10, 13], [16, 30], [33, 23],
[30, 61], [62, 45], [59, 119],
[116, 90], [156, 198], [373, 326]]
anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
loss: YOLOv3Loss
YOLOv3Loss:
ignore_thresh: 0.7
downsample: [32, 16, 8]
label_smooth: false
BBoxPostProcess:
decode:
name: YOLOBox
conf_thresh: 0.005
downsample_ratio: 32
clip_bbox: true
nms:
name: MultiClassNMS
keep_top_k: 100
score_threshold: 0.01
nms_threshold: 0.45
nms_top_k: 1000


@@ -0,0 +1,80 @@
# This config is an assembled config for ByteTrack MOT, used as eval/infer mode for MOT.
_BASE_: [
'../../../yolox/yolox_x_300e_coco.yml',
'../_base_/ht21.yml',
]
weights: output/yolox_x_24e_800x1440_ht21/model_final
log_iter: 20
snapshot_epoch: 2
# schedule configuration for fine-tuning
epoch: 24
LearningRate:
base_lr: 0.0005 # fintune
schedulers:
- !CosineDecay
max_epochs: 24
min_lr_ratio: 0.05
last_plateau_epochs: 4
- !ExpWarmup
epochs: 1
OptimizerBuilder:
optimizer:
type: Momentum
momentum: 0.9
use_nesterov: True
regularizer:
factor: 0.0005
type: L2
TrainReader:
batch_size: 4
mosaic_epoch: 20
# detector configuration
architecture: YOLOX
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/yolox_x_300e_coco.pdparams
norm_type: sync_bn
use_ema: True
ema_decay: 0.9999
ema_decay_type: "exponential"
act: silu
find_unused_parameters: True
depth_mult: 1.33
width_mult: 1.25
YOLOX:
backbone: CSPDarkNet
neck: YOLOCSPPAN
head: YOLOXHead
input_size: [800, 1440]
size_stride: 32
size_range: [18, 32] # multi-scale range [576*1024 ~ 800*1440], w/h ratio=1.8
CSPDarkNet:
arch: "X"
return_idx: [2, 3, 4]
depthwise: False
YOLOCSPPAN:
depthwise: False
# Tracking requires higher quality boxes, so NMS score_threshold will be higher
YOLOXHead:
l1_epoch: 20
depthwise: False
loss_weight: {cls: 1.0, obj: 1.0, iou: 5.0, l1: 1.0}
assigner:
name: SimOTAAssigner
candidate_topk: 10
use_vfl: False
nms:
name: MultiClassNMS
nms_top_k: 1000
keep_top_k: 100
score_threshold: 0.01
nms_threshold: 0.7
# For speed while keep high mAP, you can modify 'nms_top_k' to 1000 and 'keep_top_k' to 100, the mAP will drop about 0.1%.
# For high speed demo, you can modify 'score_threshold' to 0.25 and 'nms_threshold' to 0.45, but the mAP will drop a lot.


@@ -0,0 +1,80 @@
# This config is an assembled config for ByteTrack MOT, used as eval/infer mode for MOT.
_BASE_: [
'../../../yolox/yolox_x_300e_coco.yml',
'../_base_/mix_det.yml',
]
weights: output/yolox_x_24e_800x1440_mix_det/model_final
log_iter: 20
snapshot_epoch: 2
# schedule configuration for fine-tuning
epoch: 24
LearningRate:
base_lr: 0.00075 # fintune
schedulers:
- !CosineDecay
max_epochs: 24
min_lr_ratio: 0.05
last_plateau_epochs: 4
- !ExpWarmup
epochs: 1
OptimizerBuilder:
optimizer:
type: Momentum
momentum: 0.9
use_nesterov: True
regularizer:
factor: 0.0005
type: L2
TrainReader:
batch_size: 6
mosaic_epoch: 20
# detector configuration
architecture: YOLOX
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/yolox_x_300e_coco.pdparams
norm_type: sync_bn
use_ema: True
ema_decay: 0.9999
ema_decay_type: "exponential"
act: silu
find_unused_parameters: True
depth_mult: 1.33
width_mult: 1.25
YOLOX:
backbone: CSPDarkNet
neck: YOLOCSPPAN
head: YOLOXHead
input_size: [800, 1440]
size_stride: 32
size_range: [18, 30] # multi-scale range [576*1024 ~ 800*1440], w/h ratio=1.8
CSPDarkNet:
arch: "X"
return_idx: [2, 3, 4]
depthwise: False
YOLOCSPPAN:
depthwise: False
# Tracking requires higher quality boxes, so NMS score_threshold will be higher
YOLOXHead:
l1_epoch: 20
depthwise: False
loss_weight: {cls: 1.0, obj: 1.0, iou: 5.0, l1: 1.0}
assigner:
name: SimOTAAssigner
candidate_topk: 10
use_vfl: False
nms:
name: MultiClassNMS
nms_top_k: 1000
keep_top_k: 100
score_threshold: 0.01
nms_threshold: 0.7
# For speed while keep high mAP, you can modify 'nms_top_k' to 1000 and 'keep_top_k' to 100, the mAP will drop about 0.1%.
# For high speed demo, you can modify 'score_threshold' to 0.25 and 'nms_threshold' to 0.45, but the mAP will drop a lot.


@@ -0,0 +1,80 @@
# This config is an assembled config for ByteTrack MOT, used as eval/infer mode for MOT.
_BASE_: [
'../../../yolox/yolox_x_300e_coco.yml',
'../_base_/mix_mot_ch.yml',
]
weights: output/yolox_x_24e_800x1440_mix_mot_ch/model_final
log_iter: 20
snapshot_epoch: 2
# schedule configuration for fine-tuning
epoch: 24
LearningRate:
base_lr: 0.00075 # fine-tune
schedulers:
- !CosineDecay
max_epochs: 24
min_lr_ratio: 0.05
last_plateau_epochs: 4
- !ExpWarmup
epochs: 1
OptimizerBuilder:
optimizer:
type: Momentum
momentum: 0.9
use_nesterov: True
regularizer:
factor: 0.0005
type: L2
TrainReader:
batch_size: 6
mosaic_epoch: 20
# detector configuration
architecture: YOLOX
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/yolox_x_300e_coco.pdparams
norm_type: sync_bn
use_ema: True
ema_decay: 0.9999
ema_decay_type: "exponential"
act: silu
find_unused_parameters: True
depth_mult: 1.33
width_mult: 1.25
YOLOX:
backbone: CSPDarkNet
neck: YOLOCSPPAN
head: YOLOXHead
input_size: [800, 1440]
size_stride: 32
size_range: [18, 30] # multi-scale range [576*1024 ~ 800*1440], w/h ratio=1.8
CSPDarkNet:
arch: "X"
return_idx: [2, 3, 4]
depthwise: False
YOLOCSPPAN:
depthwise: False
# Tracking requires higher quality boxes, so NMS score_threshold will be higher
YOLOXHead:
l1_epoch: 20
depthwise: False
loss_weight: {cls: 1.0, obj: 1.0, iou: 5.0, l1: 1.0}
assigner:
name: SimOTAAssigner
candidate_topk: 10
use_vfl: False
nms:
name: MultiClassNMS
nms_top_k: 1000
keep_top_k: 100
score_threshold: 0.01
nms_threshold: 0.7
# For speed while keep high mAP, you can modify 'nms_top_k' to 1000 and 'keep_top_k' to 100, the mAP will drop about 0.1%.
# For high speed demo, you can modify 'score_threshold' to 0.25 and 'nms_threshold' to 0.45, but the mAP will drop a lot.