Replace the document detection model

2024-08-27 14:42:45 +08:00
parent aea6f19951
commit 1514e09c40
2072 changed files with 254336 additions and 4967 deletions


@@ -0,0 +1,106 @@
name: 🐛 报BUG Bug Report
description: 报告一个可复现的Bug以帮助我们修复PaddleDetection。 Report a bug to help us reproduce and fix it.
labels: [type/bug-report, status/new-issue]
body:
- type: markdown
attributes:
value: |
Thank you for submitting a PaddleDetection Bug Report!
- type: checkboxes
attributes:
label: 问题确认 Search before asking
description: >
(必选项) 在向PaddleDetection报bug之前请先查询[历史issue](https://github.com/PaddlePaddle/PaddleDetection/issues)是否报过同样的bug。
(Required) Before submitting a bug, please make sure the issue hasn't been already addressed by searching through [the existing and past issues](https://github.com/PaddlePaddle/PaddleDetection/issues).
options:
- label: >
我已经查询[历史issue](https://github.com/PaddlePaddle/PaddleDetection/issues)没有发现相似的bug。I have searched the [issues](https://github.com/PaddlePaddle/PaddleDetection/issues) and found no similar bug report.
required: true
- type: dropdown
attributes:
label: Bug组件 Bug Component
description: |
(可选项) 请选择在哪部分代码发现这个bug。(Optional) Please select the part of PaddleDetection where you found the bug.
multiple: true
options:
- "Training"
- "Validation"
- "Inference"
- "Export"
- "Deploy"
- "Installation"
- "DataProcess"
- "Other"
validations:
required: false
- type: textarea
id: code
attributes:
label: Bug描述 Describe the Bug
description: |
请清晰而简洁地描述这个bug并附上bug复现步骤、报错信息或截图、代码改动说明或最小可复现代码。如果代码太长请将可执行代码放到[AIStudio](https://aistudio.baidu.com/aistudio/index)中并将项目设置为公开或者放到github gist上并在项目中描述清楚bug复现步骤在issue中描述期望结果与实际结果。
如果你报告的是一个报错信息,请将完整回溯的报错贴在这里,并使用 ` ```三引号块``` `展示错误信息。
placeholder: |
请清晰简洁的描述这个bug。 A clear and concise description of what the bug is.
```python
代码改动说明,或最小可复现代码。 Code change description, or sample code to reproduce the problem.
```
```shell
带有完整回溯信息的报错日志或截图。 The error log or screenshot you got, with the full traceback.
```
validations:
required: true
- type: textarea
attributes:
label: 复现环境 Environment
description: 请具体说明复现bug的环境信息。Please specify the environment information for reproducing the bug.
placeholder: |
- OS: Linux/Windows
- PaddlePaddle: 2.2.2
- PaddleDetection: release/2.4
- Python: 3.8.0
- CUDA: 10.2
- CUDNN: 7.6
- GCC: 8.2.0
validations:
required: true
- type: checkboxes
attributes:
label: Bug描述确认 Bug description confirmation
description: >
(必选项) 请确认是否提供了详细的Bug描述和环境信息确认问题是否可以复现。
(Required) Please confirm whether the bug description and environment information are provided, and whether the problem can be reproduced.
options:
- label: >
我确认已经提供了Bug复现步骤、代码改动说明、以及环境信息确认问题是可以复现的。I confirm that the bug replication steps, code change instructions, and environment information have been provided, and the problem can be reproduced.
required: true
- type: checkboxes
attributes:
label: 是否愿意提交PR Are you willing to submit a PR?
description: >
(可选项) 如果你对修复bug有自己的想法十分鼓励提交[Pull Request](https://github.com/PaddlePaddle/PaddleDetection/pulls)共同提升PaddleDetection。
(Optional) We encourage you to submit a [Pull Request](https://github.com/PaddlePaddle/PaddleDetection/pulls) (PR) to help improve PaddleDetection for everyone, especially if you have a good understanding of how to implement a fix or feature.
options:
- label: 我愿意提交PRI'd like to help by submitting a PR!
- type: markdown
attributes:
value: >
感谢你的贡献 🎉Thanks for your contribution 🎉!


@@ -0,0 +1,50 @@
name: 🚀 新需求 Feature Request
description: 提交一个你对PaddleDetection的新需求。 Submit a request for a new PaddleDetection feature.
labels: [type/feature-request, status/new-issue]
body:
- type: markdown
attributes:
value: >
#### 你可以在这里提出你对PaddleDetection的新需求包括但不限于功能或模型缺失、功能不全或无法使用、精度/性能不符合预期等。
#### You could submit a request for a new feature here, including but not limited to: new features or models, incomplete or unusable features, accuracy/performance not as expected, etc.
- type: checkboxes
attributes:
label: 问题确认 Search before asking
description: >
在向PaddleDetection提新需求之前请先查询[历史issue](https://github.com/PaddlePaddle/PaddleDetection/issues)是否报过同样的需求。
Before submitting a feature request, please make sure the issue hasn't been already addressed by searching through [the existing and past issues](https://github.com/PaddlePaddle/PaddleDetection/issues).
options:
- label: >
我已经查询[历史issue](https://github.com/PaddlePaddle/PaddleDetection/issues)没有类似需求。I have searched the [issues](https://github.com/PaddlePaddle/PaddleDetection/issues) and found no similar feature requests.
required: true
- type: textarea
id: description
attributes:
label: 需求描述 Feature Description
description: |
请尽可能包含任务目标、需求场景、功能描述等信息,全面的信息有利于我们准确评估你的需求。
Please include as much information as possible, such as mission objectives, requirement scenarios, functional descriptions, etc. Comprehensive information will help us accurately assess your feature request.
value: "1. 任务目标(请描述你正在做的项目是什么,如模型、论文、项目是什么?); 2. 需求场景(请描述你的项目中为什么需要用此功能); 3. 功能描述(请简单描述或设计这个功能)"
validations:
required: true
- type: checkboxes
attributes:
label: 是否愿意提交PR Are you willing to submit a PR?
description: >
(可选)如果你对新feature有自己的想法十分鼓励提交[Pull Request](https://github.com/PaddlePaddle/PaddleDetection/pulls)共同提升PaddleDetection
(Optional) We encourage you to submit a [Pull Request](https://github.com/PaddlePaddle/PaddleDetection/pulls) (PR) to help improve PaddleDetection for everyone, especially if you have a good understanding of how to implement a fix or feature.
options:
- label: Yes I'd like to help by submitting a PR!
- type: markdown
attributes:
value: >
感谢你的贡献 🎉Thanks for your contribution 🎉!


@@ -0,0 +1,38 @@
name: 📚 文档 Documentation Issue
description: 反馈一个官网文档错误。 Report an issue related to https://github.com/PaddlePaddle/PaddleDetection.
labels: [type/docs, status/new-issue]
body:
- type: markdown
attributes:
value: >
#### 请确认反馈的问题来自PaddleDetection文档https://github.com/PaddlePaddle/PaddleDetection 。
#### Before submitting a Documentation Issue, Please make sure that issue is related to https://github.com/PaddlePaddle/PaddleDetection.
- type: textarea
id: link
attributes:
label: 文档链接&描述 Document Links & Description
description: |
请说明有问题的文档链接以及该文档存在的问题。
Please fill in the link to the document and describe the question.
validations:
required: true
- type: textarea
id: error
attributes:
label: 请提出你的建议 Please give your suggestion
description: |
请告诉我们你希望如何改进这个文档。或者你可以提个PR修复这个问题。
Please tell us how you would like to improve this document. Or you can submit a PR to fix this problem.
validations:
required: false
- type: markdown
attributes:
value: >
感谢你的贡献 🎉Thanks for your contribution 🎉!


@@ -0,0 +1,37 @@
name: 🙋🏼‍♀️🙋🏻‍♂️提问 Ask a Question
description: 提出一个使用/咨询问题。 Ask a usage or consultation question.
labels: [type/question, status/new-issue]
body:
- type: checkboxes
attributes:
label: 问题确认 Search before asking
description: >
#### 你可以在这里提出一个使用/咨询问题,提问之前请确保:
- 1已经百度/谷歌搜索过你的问题,但是没有找到解答;
- 2已经在官网查询过[教程文档](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/docs/tutorials/GETTING_STARTED_cn.md)与[FAQ](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4/docs/tutorials/FAQ),但是没有找到解答;
- 3已经在[历史issue](https://github.com/PaddlePaddle/PaddleDetection/issues)中搜索过没有找到同类issue或issue未被解答。
#### You could ask a usage or consultation question here, before your start, please make sure:
- 1) You have searched your question on Baidu/Google, but found no answer;
- 2) You have checked the [tutorials](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/docs/tutorials/GETTING_STARTED.md), but found no answer;
- 3) You have searched [the existing and past issues](https://github.com/PaddlePaddle/PaddleDetection/issues), but found no similar issue or the issue has not been answered.
options:
- label: >
我已经搜索过问题但是没有找到解答。I have searched the question and found no related answer.
required: true
- type: textarea
id: question
attributes:
label: 请提出你的问题 Please ask your question
validations:
required: true


@@ -0,0 +1,23 @@
name: 🧩 其他 Others
description: 提出其他问题。 Report any other non-support related issues.
labels: [type/others, status/new-issue]
body:
- type: markdown
attributes:
value: >
#### 你可以在这里提出任何前面几类模板不适用的问题,包括但不限于:优化性建议、框架使用体验反馈、版本兼容性问题、报错信息不清楚等。
#### You can report any issues that are not applicable to the previous types of templates, including but not limited to: enhancement suggestions, feedback on the use of the framework, version compatibility issues, unclear error information, etc.
- type: textarea
id: others
attributes:
label: 问题描述 Please describe your issue
validations:
required: true
- type: markdown
attributes:
value: >
感谢你的贡献 🎉! Thanks for your contribution 🎉!

paddle_detection/.gitignore

@@ -0,0 +1,88 @@
# Virtualenv
/.venv/
/venv/
# Byte-compiled / optimized / DLL files
__pycache__/
.ipynb_checkpoints/
*.py[cod]
# C extensions
*.so
# json file
*.json
# log file
*.log
# Distribution / packaging
/bin/
*build/
/develop-eggs/
*dist/
/eggs/
/lib/
/lib64/
/output/
/inference_model/
/output_inference/
/parts/
/sdist/
/var/
*.egg-info/
/.installed.cfg
/*.egg
/.eggs
# AUTHORS and ChangeLog will be generated while packaging
/AUTHORS
/ChangeLog
# BCloud / BuildSubmitter
/build_submitter.*
/logger_client_log
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
.tox/
.coverage
.cache
.pytest_cache
nosetests.xml
coverage.xml
# Translations
*.mo
# Sphinx documentation
/docs/_build/
*.tar
*.pyc
.idea/
dataset/coco/annotations
dataset/coco/train2017
dataset/coco/val2017
dataset/voc/VOCdevkit
dataset/fruit/fruit-detection/
dataset/voc/test.txt
dataset/voc/trainval.txt
dataset/wider_face/WIDER_test
dataset/wider_face/WIDER_train
dataset/wider_face/WIDER_val
dataset/wider_face/wider_face_split
ppdet/version.py
# NPU meta folder
kernel_meta/
# MAC
*.DS_Store

paddle_detection/.pre-commit-config.yaml

@@ -0,0 +1,44 @@
- repo: https://github.com/PaddlePaddle/mirrors-yapf.git
sha: 0d79c0c469bab64f7229c9aca2b1186ef47f0e37
hooks:
- id: yapf
files: \.py$
- repo: https://github.com/pre-commit/pre-commit-hooks
sha: a11d9314b22d8f8c7556443875b731ef05965464
hooks:
- id: check-merge-conflict
- id: check-symlinks
- id: detect-private-key
files: (?!.*paddle)^.*$
- id: end-of-file-fixer
files: \.(md|yml)$
- id: trailing-whitespace
files: \.(md|yml)$
- repo: https://github.com/Lucas-C/pre-commit-hooks
sha: v1.0.1
hooks:
- id: forbid-crlf
files: \.(md|yml)$
- id: remove-crlf
files: \.(md|yml)$
- id: forbid-tabs
files: \.(md|yml)$
- id: remove-tabs
files: \.(md|yml)$
- repo: local
hooks:
- id: clang-format-with-version-check
name: clang-format
description: Format files with ClangFormat.
entry: bash ./.travis/codestyle/clang_format.hook -i
language: system
files: \.(c|cc|cxx|cpp|cu|h|hpp|hxx|proto)$
- repo: local
hooks:
- id: cpplint-cpp-source
name: cpplint
description: Check C++ code style using cpplint.py.
entry: bash ./.travis/codestyle/cpplint_pre_commit.hook
language: system
files: \.(c|cc|cxx|cpp|cu|h|hpp|hxx)$
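For contributors, a minimal sketch of enabling these hooks locally (assumes `pip` and `git` are available; `.travis/precommit.sh` below runs the same check in CI):

```bash
# Install pre-commit and register the hooks from .pre-commit-config.yaml
pip install pre-commit
pre-commit install
# Optionally check the whole tree once, as CI does
pre-commit run --all-files
```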

paddle_detection/.style.yapf

@@ -0,0 +1,3 @@
[style]
based_on_style = pep8
column_limit = 80
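yapf reads this file from the repository root, so the same style can be applied by hand; an illustrative invocation (the paths are examples, not mandated by the config):

```bash
pip install yapf
# Reformat Python sources in place using the [style] section above
yapf --in-place --recursive ppdet/ tools/
```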

paddle_detection/.travis.yml

@@ -0,0 +1,35 @@
language: cpp
cache: ccache
sudo: required
dist: trusty
services:
- docker
os:
- linux
env:
- JOB=PRE_COMMIT
addons:
apt:
packages:
- git
- python
- python-pip
- python2.7-dev
ssh_known_hosts: 13.229.163.131
before_install:
- sudo pip install -U virtualenv pre-commit pip -i https://pypi.tuna.tsinghua.edu.cn/simple
- docker pull paddlepaddle/paddle:latest
- git pull https://github.com/PaddlePaddle/PaddleDetection develop
script:
- exit_code=0
- .travis/precommit.sh || exit_code=$(( exit_code | $? ))
# - docker run -i --rm -v "$PWD:/py_unittest" paddlepaddle/paddle:latest /bin/bash -c
# 'cd /py_unittest; sh .travis/unittest.sh' || exit_code=$(( exit_code | $? ))
- if [ $exit_code -eq 0 ]; then true; else exit 1; fi;
notifications:
email:
on_success: change
on_failure: always

paddle_detection/.travis/codestyle/clang_format.hook

@@ -0,0 +1,4 @@
#!/bin/bash
set -e
clang-format "$@"

paddle_detection/.travis/codestyle/cpplint_pre_commit.hook

@@ -0,0 +1,27 @@
#!/bin/bash
TOTAL_ERRORS=0
if [[ ! $TRAVIS_BRANCH ]]; then
# install cpplint on local machine.
if [[ ! $(which cpplint) ]]; then
pip install cpplint
fi
# diff files on local machine.
files=$(git diff --cached --name-status | awk '$1 != "D" {print $2}')
else
# diff files between PR and latest commit on Travis CI.
branch_ref=$(git rev-parse "$TRAVIS_BRANCH")
head_ref=$(git rev-parse HEAD)
files=$(git diff --name-status $branch_ref $head_ref | awk '$1 != "D" {print $2}')
fi
# The trick to remove deleted files: https://stackoverflow.com/a/2413151
for file in $files; do
if [[ $file =~ ^(patches/.*) ]]; then
continue;
else
cpplint --filter=-readability/fn_size,-build/include_what_you_use,-build/c++11 $file;
TOTAL_ERRORS=$(expr $TOTAL_ERRORS + $?);
fi
done
exit $TOTAL_ERRORS

paddle_detection/.travis/precommit.sh

@@ -0,0 +1,21 @@
#!/bin/bash
function abort(){
echo "Your commit not fit PaddlePaddle code style" 1>&2
echo "Please use pre-commit scripts to auto-format your code" 1>&2
exit 1
}
trap 'abort' 0
set -e
cd `dirname $0`
cd ..
export PATH=/usr/bin:$PATH
pre-commit install
if ! pre-commit run -a ; then
ls -lh
git diff --exit-code
exit 1
fi
trap : 0

paddle_detection/.travis/requirements.txt

@@ -0,0 +1,8 @@
# Add Python requirements for unit tests here. Note that installing
# pycocotools directly is not supported in Travis CI; it is installed
# by compiling from source in unittest.sh.
tqdm
cython
shapely
llvmlite==0.33
numba==0.50

paddle_detection/.travis/unittest.sh

@@ -0,0 +1,47 @@
#!/bin/bash
abort(){
echo "Run unittest failed" 1>&2
echo "Please check your code" 1>&2
echo " 1. you can run unit tests by 'bash .travis/unittest.sh' locally" 1>&2
echo " 2. you can add python requirements in .travis/requirements.txt if you use new requirements in unit tests" 1>&2
exit 1
}
unittest(){
if [ $? != 0 ]; then
exit 1
fi
find "./ppdet" -name 'tests' -type d -print0 | \
xargs -0 -I{} -n1 bash -c \
'python -m unittest discover -v -s {}'
}
trap 'abort' 0
set -e
# install travis python dependencies exclude pycocotools
if [ -f ".travis/requirements.txt" ]; then
pip install -r .travis/requirements.txt
fi
# install pycocotools
if [ `pip list | grep pycocotools | wc -l` -eq 0 ]; then
# install git if needed
if [ -z "$(which git)" ]; then
apt-get update
apt-get install -y git
fi;
git clone https://github.com/cocodataset/cocoapi.git
cd cocoapi/PythonAPI
make install
python setup.py install --user
cd ../..
rm -rf cocoapi
fi
export PYTHONPATH=`pwd`:$PYTHONPATH
unittest .
trap : 0
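The loop above runs `unittest discover` over every `tests` directory found under `ppdet`; the same step can be reproduced by hand for a single package (the path below is illustrative):

```bash
# From the repository root, discover and run one test package verbosely
python -m unittest discover -v -s ppdet/modeling/tests
```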

paddle_detection/LICENSE

@@ -0,0 +1,201 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

paddle_detection/README.md (symbolic link)

@@ -0,0 +1 @@
README_cn.md

paddle_detection/README_cn.md

@@ -0,0 +1,878 @@
简体中文 | [English](README_en.md)
<div align="center">
<p align="center">
<img src="https://user-images.githubusercontent.com/48054808/160532560-34cf7a1f-d950-435e-90d2-4b0a679e5119.png" align="middle" width = "800" />
</p>
<p align="center">
<a href="./LICENSE"><img src="https://img.shields.io/badge/license-Apache%202-dfd.svg"></a>
<a href="https://github.com/PaddlePaddle/PaddleDetection/releases"><img src="https://img.shields.io/github/v/release/PaddlePaddle/PaddleDetection?color=ffa"></a>
<a href=""><img src="https://img.shields.io/badge/python-3.7+-aff.svg"></a>
<a href=""><img src="https://img.shields.io/badge/os-linux%2C%20win%2C%20mac-pink.svg"></a>
<a href="https://github.com/PaddlePaddle/PaddleDetection/stargazers"><img src="https://img.shields.io/github/stars/PaddlePaddle/PaddleDetection?color=ccf"></a>
</p>
</div>
## 💌目录
- [💌目录](#目录)
- [🌈简介](#简介)
- [📣最新进展](#最新进展)
- [👫开源社区](#开源社区)
- [✨主要特性](#主要特性)
- [🧩模块化设计](#模块化设计)
- [📱丰富的模型库](#丰富的模型库)
- [🎗️产业特色模型|产业工具](#️产业特色模型产业工具)
- [💡🏆产业级部署实践](#产业级部署实践)
- [🍱安装](#安装)
- [🔥教程](#教程)
- [🔑FAQ](#faq)
- [🧩模块组件](#模块组件)
- [📱模型库](#模型库)
- [⚖️模型性能对比](#️模型性能对比)
- [🖥️服务器端模型性能对比](#️服务器端模型性能对比)
- [⌚️移动端模型性能对比](#️移动端模型性能对比)
- [🎗️产业特色模型|产业工具](#️产业特色模型产业工具-1)
- [💎PP-YOLOE 高精度目标检测模型](#pp-yoloe-高精度目标检测模型)
- [💎PP-YOLOE-R 高性能旋转框检测模型](#pp-yoloe-r-高性能旋转框检测模型)
- [💎PP-YOLOE-SOD 高精度小目标检测模型](#pp-yoloe-sod-高精度小目标检测模型)
- [💫PP-PicoDet 超轻量实时目标检测模型](#pp-picodet-超轻量实时目标检测模型)
- [📡PP-Tracking 实时多目标跟踪系统](#pp-tracking-实时多目标跟踪系统)
- [PP-TinyPose 人体骨骼关键点识别](#pp-tinypose-人体骨骼关键点识别)
- [🏃🏻PP-Human 实时行人分析工具](#pp-human-实时行人分析工具)
- [🏎PP-Vehicle 实时车辆分析工具](#pp-vehicle-实时车辆分析工具)
- [💡产业实践范例](#产业实践范例)
- [🏆企业应用案例](#企业应用案例)
- [📝许可证书](#许可证书)
- [📌引用](#引用)
## 🌈简介
PaddleDetection是一个基于PaddlePaddle的目标检测端到端开发套件在提供丰富的模型组件和测试基准的同时注重端到端的产业落地应用通过打造产业级特色模型|工具、建设产业应用范例等手段,帮助开发者实现数据准备、模型选型、模型训练、模型部署的全流程打通,快速进行落地应用。
主要模型效果示例如下(点击标题可快速跳转):
| [**通用目标检测**](#pp-yoloe-高精度目标检测模型) | [**小目标检测**](#pp-yoloe-sod-高精度小目标检测模型) | [**旋转框检测**](#pp-yoloe-r-高性能旋转框检测模型) | [**3D目标物检测**](https://github.com/PaddlePaddle/Paddle3D) |
| :--------------------------------------------------------------------------------------------------------------------------------------------: | :--------------------------------------------------------------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------------------------: | :--------------------------------------------------------------------------------------------------------------------------------------------: |
| <img src='https://user-images.githubusercontent.com/61035602/206095864-f174835d-4e9a-42f7-96b8-d684fc3a3687.png' height="126px" width="180px"> | <img src='https://user-images.githubusercontent.com/61035602/206095892-934be83a-f869-4a31-8e52-1074184149d1.jpg' height="126px" width="180px"> | <img src='https://user-images.githubusercontent.com/61035602/206111796-d9a9702a-c1a0-4647-b8e9-3e1307e9d34c.png' height="126px" width="180px"> | <img src='https://user-images.githubusercontent.com/61035602/206095622-cf6dbd26-5515-472f-9451-b39bbef5b1bf.gif' height="126px" width="180px"> |
| [**人脸检测**](#模型库) | [**2D关键点检测**](#pp-tinypose-人体骨骼关键点识别) | [**多目标追踪**](#pp-tracking-实时多目标跟踪系统) | [**实例分割**](#模型库) |
| <img src='https://user-images.githubusercontent.com/61035602/206095684-72f42233-c9c7-4bd8-9195-e34859bd08bf.jpg' height="126px" width="180px"> | <img src='https://user-images.githubusercontent.com/61035602/206100220-ab01d347-9ff9-4f17-9718-290ec14d4205.gif' height="126px" width="180px"> | <img src='https://user-images.githubusercontent.com/61035602/206111753-836e7827-968e-4c80-92ef-7a78766892fc.gif' height="126px" width="180px" > | <img src='https://user-images.githubusercontent.com/61035602/206095831-cc439557-1a23-4a99-b6b0-b6f2e97e8c57.jpg' height="126px" width="180px"> |
| [**车辆分析——车牌识别**](#pp-vehicle-实时车辆分析工具) | [**车辆分析——车流统计**](#pp-vehicle-实时车辆分析工具) | [**车辆分析——违章检测**](#pp-vehicle-实时车辆分析工具) | [**车辆分析——属性分析**](#pp-vehicle-实时车辆分析工具) |
| <img src='https://user-images.githubusercontent.com/61035602/206099328-2a1559e0-3b48-4424-9bad-d68f9ba5ba65.gif' height="126px" width="180px"> | <img src='https://user-images.githubusercontent.com/61035602/206095918-d0e7ad87-7bbb-40f1-bcc1-37844e2271ff.gif' height="126px" width="180px"> | <img src='https://user-images.githubusercontent.com/61035602/206100295-7762e1ab-ffce-44fb-b69d-45fb93657fa0.gif' height="126px" width="180px" > | <img src='https://user-images.githubusercontent.com/61035602/206095905-8255776a-d8e6-4af1-b6e9-8d9f97e5059d.gif' height="126px" width="180px"> |
| [**行人分析——闯入分析**](#pp-human-实时行人分析工具) | [**行人分析——行为分析**](#pp-human-实时行人分析工具) | [**行人分析——属性分析**](#pp-human-实时行人分析工具) | [**行人分析——人流统计**](#pp-human-实时行人分析工具) |
| <img src='https://user-images.githubusercontent.com/61035602/206095792-ae0ac107-cd8e-492a-8baa-32118fc82b04.gif' height="126px" width="180px"> | <img src='https://user-images.githubusercontent.com/61035602/206095778-fdd73e5d-9f91-48c7-9d3d-6f2e02ec3f79.gif' height="126px" width="180px"> | <img src='https://user-images.githubusercontent.com/61035602/206095709-2c3a209e-6626-45dd-be16-7f0bf4d48a14.gif' height="126px" width="180px"> | <img src="https://user-images.githubusercontent.com/61035602/206113351-cc59df79-8672-4d76-b521-a15acf69ae78.gif" height="126px" width="180px"> |
同时PaddleDetection提供了模型的在线体验功能用户可以选择自己的数据进行在线推理。
`说明`考虑到服务器负载压力在线推理均为CPU推理完整的模型开发实例以及产业部署实践代码示例请前往[🎗️产业特色模型|产业工具](#️产业特色模型产业工具-1)。
`传送门`[模型在线体验](https://www.paddlepaddle.org.cn/models)
<div align="center">
<p align="center">
<img src="https://user-images.githubusercontent.com/61035602/206896755-bd0cd498-1149-4e94-ae30-da590ea78a7a.gif" align="middle"/>
</p>
</div>
## 📣最新进展
💥 2024.6.27 **飞桨低代码开发工具 [PaddleX 3.0](https://github.com/paddlepaddle/paddlex) 重磅更新!**
- 低代码开发范式:支持目标检测模型全流程低代码开发,提供 Python API支持用户自定义串联模型
- 多硬件训推支持:支持英伟达 GPU、昆仑芯、昇腾和寒武纪等多种硬件进行模型训练与推理。
**🔥超越YOLOv8飞桨推出精度最高的实时检测器RT-DETR**
<div align="center">
<img src="https://github.com/PaddlePaddle/PaddleDetection/assets/17582080/196b0a10-d2e8-401c-9132-54b9126e0a33" height = "500" caption='' />
<p></p>
</div>
- `RT-DETR解读文章传送门`
- [《超越YOLOv8飞桨推出精度最高的实时检测器RT-DETR》](https://mp.weixin.qq.com/s/o03QM2rZNjHVto36gcV0Yw)
- `代码传送门`[RT-DETR](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/rtdetr)
## 👫开源社区
- **📑项目合作:** 如果您是企业开发者且有明确的目标检测垂类应用需求,请扫描如下二维码入群,并联系`群管理员AI`后可免费与官方团队展开不同层次的合作。
- **🏅️社区贡献:** PaddleDetection非常欢迎你加入到飞桨社区的开源建设中参与贡献方式可以参考[开源项目开发指南](docs/contribution/README.md)。
- **💻直播教程:** PaddleDetection会定期在飞桨直播间([B站:飞桨PaddlePaddle](https://space.bilibili.com/476867757)、[微信: 飞桨PaddlePaddle](https://mp.weixin.qq.com/s/6ji89VKqoXDY6SSGkxS8NQ)),针对发新内容、以及产业范例、使用教程等进行直播分享。
- **🎁加入社区:** **微信扫描二维码并填写问卷之后,可以及时获取如下信息,包括:**
- 社区最新文章、直播课等活动预告
- 往期直播录播&PPT
- 30+行人车辆等垂类高性能预训练模型
- 七大任务开源数据集下载链接汇总
- 40+前沿检测领域顶会算法
- 15+从零上手目标检测理论与实践视频课程
- 10+工业安防交通全流程项目实操(含源码)
<div align="center">
<img src="https://github.com/PaddlePaddle/PaddleDetection/assets/22989727/0466954b-ab4d-4984-bd36-796c37f0ee9c" width = "150" height = "150",caption='' />
<p>PaddleDetection官方交流群二维码</p>
</div>
## 📖 技术交流合作
- 飞桨低代码开发工具PaddleX—— 面向国内外主流AI硬件的飞桨精选模型一站式开发工具。包含如下核心优势
- 【产业高精度模型库】覆盖10个主流AI任务 40+精选模型,丰富齐全。
- 【特色模型产线】:提供融合大小模型的特色模型产线,精度更高,效果更好。
- 【低代码开发模式】:图形化界面支持统一开发范式,便捷高效。
- 【私有化部署多硬件支持】适配国内外主流AI硬件支持本地纯离线使用满足企业安全保密需要。
- PaddleX官网地址https://aistudio.baidu.com/intro/paddlex
- PaddleX官方交流频道https://aistudio.baidu.com/community/channel/610
- **🎈社区近期活动**
- **🔥PaddleDetection v2.6版本更新解读**
<div align="center">
<img src="https://user-images.githubusercontent.com/61035602/224244188-da8495fc-eea9-432f-bc2d-6f0144c2dde9.png" height = "250" caption='' />
<p></p>
</div>
- `v2.6版本更新解读文章传送门`[《PaddleDetection v2.6发布目标小数据缺标注累泛化差PP新员逐一应对》](https://mp.weixin.qq.com/s/SLITj5k120d_fQc7jEO8Vw)
- **🏆半监督检测**
- `文章传送门`[CVPR 2023 | 单阶段半监督目标检测SOTAARSL](https://mp.weixin.qq.com/s/UZLIGL6va2KBfofC-nKG4g)
- `代码传送门`[ARSL](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/semi_det)
<div align="center">
<img src="https://user-images.githubusercontent.com/61035602/230522850-21873665-ba79-4f8d-8dce-43d736111df8.png" height = "250" caption='' />
<p></p>
</div>
- **👀YOLO系列专题**
- `文章传送门`[YOLOv8来啦YOLO内卷期模型怎么选9+款AI硬件如何快速部署深度解析](https://mp.weixin.qq.com/s/rPwprZeHEpmGOe5wxrmO5g)
- `代码传送门`[PaddleYOLO全系列](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/docs/feature_models/PaddleYOLO_MODEL.md)
<div align="center">
<img src="https://user-images.githubusercontent.com/61035602/213202797-3a1b24f3-53c0-4094-bb31-db2f84438fbc.jpeg" height = "250" caption='' />
<p></p>
</div>
- **🎯少目标迁移学习专题**
- `文章传送门`[囿于数据少泛化性差PaddleDetection少样本迁移学习助你一键突围](https://mp.weixin.qq.com/s/dFEQoxSzVCOaWVZPb3N7WA)
- **⚽2022卡塔尔世界杯专题**
- `文章传送门`[世界杯决赛号角吹响趁周末来搭一套足球3D+AI量化分析系统吧](https://mp.weixin.qq.com/s/koJxjWDPBOlqgI-98UsfKQ)
<div align="center">
<img src="https://user-images.githubusercontent.com/61035602/208036574-f151a7ff-a5f1-4495-9316-a47218a6576b.gif" height = "250" caption='' />
<p></p>
</div>
- **🔍旋转框小目标检测专题**
- `文章传送门`[Yes, PP-YOLOE80.73mAP、38.5mAP旋转框、小目标检测能力双SOTA](https://mp.weixin.qq.com/s/6ji89VKqoXDY6SSGkxS8NQ)
<div align="center">
<img src="https://user-images.githubusercontent.com/61035602/208037368-5b9f01f7-afd9-46d8-bc80-271ccb5db7bb.png" height = "220" caption='' />
<p></p>
</div>
- **🎊YOLO Vision世界学术交流大会**
- **PaddleDetection**受邀参与首个以**YOLO为主题**的**YOLO-VISION**世界大会与全球AI领先开发者学习交流。
- `活动链接传送门`[YOLO-VISION](https://ultralytics.com/yolo-vision)
<div align="center">
<img src="https://user-images.githubusercontent.com/48054808/192301374-940cf2fa-9661-419b-9c46-18a4570df381.jpeg" width="400"/>
</div>
- **🏅️社区贡献**
- `活动链接传送门`[Yes, PP-YOLOE! 基于PP-YOLOE的算法开发](https://github.com/PaddlePaddle/PaddleDetection/issues/7345)
## ✨主要特性
#### 🧩模块化设计
PaddleDetection将检测模型解耦成不同的模块组件通过自定义模块组件组合用户可以便捷高效地完成检测模型的搭建。`传送门`[🧩模块组件](#模块组件)。
#### 📱丰富的模型库
PaddleDetection支持大量的最新主流的算法基准以及预训练模型涵盖2D/3D目标检测、实例分割、人脸检测、关键点检测、多目标跟踪、半监督学习等方向。`传送门`[📱模型库](#模型库)、[⚖️模型性能对比](#️模型性能对比)。
#### 🎗️产业特色模型|产业工具
PaddleDetection打造产业级特色模型以及分析工具PP-YOLOE+、PP-PicoDet、PP-TinyPose、PP-HumanV2、PP-Vehicle等针对通用、高频垂类应用场景提供深度优化解决方案以及高度集成的分析工具降低开发者的试错、选择成本针对业务场景快速应用落地。`传送门`[🎗️产业特色模型|产业工具](#️产业特色模型产业工具-1)。
#### 💡🏆产业级部署实践
PaddleDetection整理工业、农业、林业、交通、医疗、金融、能源电力等AI应用范例打通数据标注-模型训练-模型调优-预测部署全流程,持续降低目标检测技术产业落地门槛。`传送门`[💡产业实践范例](#产业实践范例)、[🏆企业应用案例](#企业应用案例)。
<div align="center">
<p align="center">
<img src="https://user-images.githubusercontent.com/61035602/206431371-912a14c8-ce1e-48ec-ae6f-7267016b308e.png" align="middle" width="1280"/>
</p>
</div>
## 🍱安装
参考[安装说明](docs/tutorials/INSTALL_cn.md)进行安装。
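安装完成后,可以用一条单图推理命令快速验证环境是否可用。以下为示意性写法(配置文件与权重地址沿用本文模型库的命名约定,示例图片为仓库自带的 demo 图,路径仅供参考):

```bash
# 示意性示例:单图推理验证安装(权重 URL 按模型库命名约定给出)
python tools/infer.py \
    -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml \
    -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams \
    --infer_img=demo/000000014439.jpg
```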
## 🔥教程
**深度学习入门教程**
- [零基础入门深度学习](https://www.paddlepaddle.org.cn/tutorials/projectdetail/4676538)
- [零基础入门目标检测](https://aistudio.baidu.com/aistudio/education/group/info/1617)
**快速开始**
- [快速体验](docs/tutorials/QUICK_STARTED_cn.md)
- [示例30分钟快速开发交通标志检测模型](docs/tutorials/GETTING_STARTED_cn.md)
**数据准备**
- [数据准备](docs/tutorials/data/README.md)
- [数据处理模块](docs/advanced_tutorials/READER.md)
**配置文件说明**
- [RCNN参数说明](docs/tutorials/config_annotation/faster_rcnn_r50_fpn_1x_coco_annotation.md)
- [PP-YOLO参数说明](docs/tutorials/config_annotation/ppyolo_r50vd_dcn_1x_coco_annotation.md)
**模型开发**
- [新增检测模型](docs/advanced_tutorials/MODEL_TECHNICAL.md)
- 二次开发
- [目标检测](docs/advanced_tutorials/customization/detection.md)
- [关键点检测](docs/advanced_tutorials/customization/keypoint_detection.md)
- [多目标跟踪](docs/advanced_tutorials/customization/pphuman_mot.md)
- [行为识别](docs/advanced_tutorials/customization/action_recognotion/)
- [属性识别](docs/advanced_tutorials/customization/pphuman_attribute.md)
**部署推理**
- [模型导出教程](deploy/EXPORT_MODEL.md)
- [模型压缩](https://github.com/PaddlePaddle/PaddleSlim)
- [剪裁/量化/蒸馏教程](configs/slim)
- [Paddle Inference部署](deploy/README.md)
- [Python端推理部署](deploy/python)
- [C++端推理部署](deploy/cpp)
- [Paddle Lite部署](deploy/lite)
- [Paddle Serving部署](deploy/serving)
- [ONNX模型导出](deploy/EXPORT_ONNX_MODEL.md)
- [推理benchmark](deploy/BENCHMARK_INFER.md)
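上述导出与 Python 端部署步骤的一个端到端示意(配置、权重与图片路径仅作示例,导出目录为 tools/export_model.py 的默认输出位置):

```bash
# 示例:先导出推理模型,再用 Python 部署工具运行(路径仅供参考)
python tools/export_model.py \
    -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml \
    -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams
python deploy/python/infer.py \
    --model_dir=output_inference/ppyoloe_plus_crn_l_80e_coco \
    --image_file=demo/000000014439.jpg \
    --device=GPU
```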
## 🔑FAQ
- [FAQ/常见问题汇总](docs/tutorials/FAQ)
## 🧩模块组件
<table align="center">
<tbody>
<tr align="center" valign="center">
<td>
<b>Backbones</b>
</td>
<td>
<b>Necks</b>
</td>
<td>
<b>Loss</b>
</td>
<td>
<b>Common</b>
</td>
<td>
<b>Data Augmentation</b>
</td>
</tr>
<tr valign="top">
<td>
<ul>
<li><a href="ppdet/modeling/backbones/resnet.py">ResNet</a></li>
<li><a href="ppdet/modeling/backbones/res2net.py">CSPResNet</a></li>
<li><a href="ppdet/modeling/backbones/senet.py">SENet</a></li>
<li><a href="ppdet/modeling/backbones/res2net.py">Res2Net</a></li>
<li><a href="ppdet/modeling/backbones/hrnet.py">HRNet</a></li>
<li><a href="ppdet/modeling/backbones/lite_hrnet.py">Lite-HRNet</a></li>
<li><a href="ppdet/modeling/backbones/darknet.py">DarkNet</a></li>
<li><a href="ppdet/modeling/backbones/csp_darknet.py">CSPDarkNet</a></li>
<li><a href="ppdet/modeling/backbones/mobilenet_v1.py">MobileNetV1</a></li>
<li><a href="ppdet/modeling/backbones/mobilenet_v3.py">MobileNetV1</a></li>
<li><a href="ppdet/modeling/backbones/shufflenet_v2.py">ShuffleNetV2</a></li>
<li><a href="ppdet/modeling/backbones/ghostnet.py">GhostNet</a></li>
<li><a href="ppdet/modeling/backbones/blazenet.py">BlazeNet</a></li>
<li><a href="ppdet/modeling/backbones/dla.py">DLA</a></li>
<li><a href="ppdet/modeling/backbones/hardnet.py">HardNet</a></li>
<li><a href="ppdet/modeling/backbones/lcnet.py">LCNet</a></li>
<li><a href="ppdet/modeling/backbones/esnet.py">ESNet</a></li>
<li><a href="ppdet/modeling/backbones/swin_transformer.py">Swin-Transformer</a></li>
<li><a href="ppdet/modeling/backbones/convnext.py">ConvNeXt</a></li>
<li><a href="ppdet/modeling/backbones/vgg.py">VGG</a></li>
<li><a href="ppdet/modeling/backbones/vision_transformer.py">Vision Transformer</a></li>
<li><a href="configs/convnext">ConvNext</a></li>
</ul>
</td>
<td>
<ul>
<li><a href="ppdet/modeling/necks/bifpn.py">BiFPN</a></li>
<li><a href="ppdet/modeling/necks/blazeface_fpn.py">BlazeFace-FPN</a></li>
<li><a href="ppdet/modeling/necks/centernet_fpn.py">CenterNet-FPN</a></li>
<li><a href="ppdet/modeling/necks/csp_pan.py">CSP-PAN</a></li>
<li><a href="ppdet/modeling/necks/custom_pan.py">Custom-PAN</a></li>
<li><a href="ppdet/modeling/necks/fpn.py">FPN</a></li>
<li><a href="ppdet/modeling/necks/es_pan.py">ES-PAN</a></li>
<li><a href="ppdet/modeling/necks/hrfpn.py">HRFPN</a></li>
<li><a href="ppdet/modeling/necks/lc_pan.py">LC-PAN</a></li>
<li><a href="ppdet/modeling/necks/ttf_fpn.py">TTF-FPN</a></li>
<li><a href="ppdet/modeling/necks/yolo_fpn.py">YOLO-FPN</a></li>
</ul>
</td>
<td>
<ul>
<li><a href="ppdet/modeling/losses/smooth_l1_loss.py">Smooth-L1</a></li>
<li><a href="ppdet/modeling/losses/detr_loss.py">Detr Loss</a></li>
<li><a href="ppdet/modeling/losses/fairmot_loss.py">Fairmot Loss</a></li>
<li><a href="ppdet/modeling/losses/fcos_loss.py">Fcos Loss</a></li>
<li><a href="ppdet/modeling/losses/gfocal_loss.py">GFocal Loss</a></li>
<li><a href="ppdet/modeling/losses/jde_loss.py">JDE Loss</a></li>
<li><a href="ppdet/modeling/losses/keypoint_loss.py">KeyPoint Loss</a></li>
<li><a href="ppdet/modeling/losses/solov2_loss.py">SoloV2 Loss</a></li>
<li><a href="ppdet/modeling/losses/focal_loss.py">Focal Loss</a></li>
<li><a href="ppdet/modeling/losses/iou_loss.py">GIoU/DIoU/CIoU</a></li>
<li><a href="ppdet/modeling/losses/iou_aware_loss.py">IoUAware</a></li>
<li><a href="ppdet/modeling/losses/sparsercnn_loss.py">SparseRCNN Loss</a></li>
<li><a href="ppdet/modeling/losses/ssd_loss.py">SSD Loss</a></li>
<li><a href="ppdet/modeling/losses/focal_loss.py">YOLO Loss</a></li>
<li><a href="ppdet/modeling/losses/yolo_loss.py">CT Focal Loss</a></li>
<li><a href="ppdet/modeling/losses/varifocal_loss.py">VariFocal Loss</a></li>
</ul>
</td>
<td>
<ul>
<li><b>Post-processing</b></li>
<ul>
<ul>
<li><a href="ppdet/modeling/post_process.py">SoftNMS</a></li>
<li><a href="ppdet/modeling/post_process.py">MatrixNMS</a></li>
</ul>
</ul>
<li><b>Training</b></li>
<ul>
<ul>
<li><a href="tools/train.py#L62">FP16 training</a></li>
<li><a href="docs/tutorials/DistributedTraining_cn.md">Multi-machine training </a></li>
</ul>
</ul>
<li><b>Common</b></li>
<ul>
<ul>
<li><a href="ppdet/modeling/backbones/resnet.py#L41">Sync-BN</a></li>
<li><a href="configs/gn/README.md">Group Norm</a></li>
<li><a href="configs/dcn/README.md">DCNv2</a></li>
<li><a href="ppdet/optimizer/ema.py">EMA</a></li>
</ul>
</td>
<td>
<ul>
<li><a href="ppdet/data/transform/operators.py">Resize</a></li>
<li><a href="ppdet/data/transform/operators.py">Lighting</a></li>
<li><a href="ppdet/data/transform/operators.py">Flipping</a></li>
<li><a href="ppdet/data/transform/operators.py">Expand</a></li>
<li><a href="ppdet/data/transform/operators.py">Crop</a></li>
<li><a href="ppdet/data/transform/operators.py">Color Distort</a></li>
<li><a href="ppdet/data/transform/operators.py">Random Erasing</a></li>
<li><a href="ppdet/data/transform/operators.py">Mixup </a></li>
<li><a href="ppdet/data/transform/operators.py">AugmentHSV</a></li>
<li><a href="ppdet/data/transform/operators.py">Mosaic</a></li>
<li><a href="ppdet/data/transform/operators.py">Cutmix </a></li>
<li><a href="ppdet/data/transform/operators.py">Grid Mask</a></li>
<li><a href="ppdet/data/transform/operators.py">Auto Augment</a></li>
<li><a href="ppdet/data/transform/operators.py">Random Perspective</a></li>
</ul>
</td>
</tr>
</tbody>
</table>
## 📱模型库
<table align="center">
<tbody>
<tr align="center" valign="center">
<td>
<b>2D Detection</b>
</td>
<td>
<b>Multi Object Tracking</b>
</td>
<td>
<b>KeyPoint Detection</b>
</td>
<td>
<b>Others</b>
</td>
</tr>
<tr valign="top">
<td>
<ul>
<li><a href="configs/faster_rcnn/README.md">Faster RCNN</a></li>
<li><a href="ppdet/modeling/necks/fpn.py">FPN</a></li>
<li><a href="configs/cascade_rcnn/README.md">Cascade-RCNN</a></li>
<li><a href="configs/rcnn_enhance">PSS-Det</a></li>
<li><a href="configs/retinanet/README.md">RetinaNet</a></li>
<li><a href="configs/yolov3/README.md">YOLOv3</a></li>
<li><a href="configs/yolof/README.md">YOLOF</a></li>
<li><a href="configs/yolox/README.md">YOLOX</a></li>
<li><a href="https://github.com/PaddlePaddle/PaddleYOLO/tree/develop/configs/yolov5">YOLOv5</a></li>
<li><a href="https://github.com/PaddlePaddle/PaddleYOLO/tree/develop/configs/yolov6">YOLOv6</a></li>
<li><a href="https://github.com/PaddlePaddle/PaddleYOLO/tree/develop/configs/yolov7">YOLOv7</a></li>
<li><a href="https://github.com/PaddlePaddle/PaddleYOLO/tree/develop/configs/yolov8">YOLOv8</a></li>
<li><a href="https://github.com/PaddlePaddle/PaddleYOLO/tree/develop/configs/rtmdet">RTMDet</a></li>
<li><a href="configs/ppyolo/README_cn.md">PP-YOLO</a></li>
<li><a href="configs/ppyolo#pp-yolo-tiny">PP-YOLO-Tiny</a></li>
<li><a href="configs/picodet">PP-PicoDet</a></li>
<li><a href="configs/ppyolo/README_cn.md">PP-YOLOv2</a></li>
<li><a href="configs/ppyoloe/README_legacy.md">PP-YOLOE</a></li>
<li><a href="configs/ppyoloe/README_cn.md">PP-YOLOE+</a></li>
<li><a href="configs/smalldet">PP-YOLOE-SOD</a></li>
<li><a href="configs/rotate/README.md">PP-YOLOE-R</a></li>
<li><a href="configs/ssd/README.md">SSD</a></li>
<li><a href="configs/centernet">CenterNet</a></li>
<li><a href="configs/fcos">FCOS</a></li>
<li><a href="configs/rotate/fcosr">FCOSR</a></li>
<li><a href="configs/ttfnet">TTFNet</a></li>
<li><a href="configs/tood">TOOD</a></li>
<li><a href="configs/gfl">GFL</a></li>
<li><a href="configs/gfl/gflv2_r50_fpn_1x_coco.yml">GFLv2</a></li>
<li><a href="configs/detr">DETR</a></li>
<li><a href="configs/deformable_detr">Deformable DETR</a></li>
<li><a href="configs/sparse_rcnn">Sparse RCNN</a></li>
</ul>
</td>
<td>
<ul>
<li><a href="configs/mot/jde">JDE</a></li>
<li><a href="configs/mot/fairmot">FairMOT</a></li>
<li><a href="configs/mot/deepsort">DeepSORT</a></li>
<li><a href="configs/mot/bytetrack">ByteTrack</a></li>
<li><a href="configs/mot/ocsort">OC-SORT</a></li>
<li><a href="configs/mot/botsort">BoT-SORT</a></li>
<li><a href="configs/mot/centertrack">CenterTrack</a></li>
</ul>
</td>
<td>
<ul>
<li><a href="configs/keypoint/hrnet">HRNet</a></li>
<li><a href="configs/keypoint/higherhrnet">HigherHRNet</a></li>
<li><a href="configs/keypoint/lite_hrnet">Lite-HRNet</a></li>
<li><a href="configs/keypoint/tiny_pose">PP-TinyPose</a></li>
</ul>
</td>
<td>
<ul>
<li><b>Instance Segmentation</b></li>
<ul>
<ul>
<li><a href="configs/mask_rcnn">Mask RCNN</a></li>
<li><a href="configs/cascade_rcnn">Cascade Mask RCNN</a></li>
<li><a href="configs/solov2">SOLOv2</a></li>
</ul>
</ul>
<li><b>Face Detection</b></li>
<ul>
<ul>
<li><a href="configs/face_detection">BlazeFace</a></li>
</ul>
</ul>
<li><b>Semi-Supervised Detection</b></li>
<ul>
<ul>
<li><a href="configs/semi_det">DenseTeacher</a></li>
</ul>
</ul>
<li><b>3D Detection</b></li>
<ul>
<ul>
<li><a href="https://github.com/PaddlePaddle/Paddle3D">Smoke</a></li>
<li><a href="https://github.com/PaddlePaddle/Paddle3D">CaDDN</a></li>
<li><a href="https://github.com/PaddlePaddle/Paddle3D">PointPillars</a></li>
<li><a href="https://github.com/PaddlePaddle/Paddle3D">CenterPoint</a></li>
<li><a href="https://github.com/PaddlePaddle/Paddle3D">SequeezeSegV3</a></li>
<li><a href="https://github.com/PaddlePaddle/Paddle3D">IA-SSD</a></li>
<li><a href="https://github.com/PaddlePaddle/Paddle3D">PETR</a></li>
</ul>
</ul>
<li><b>Vehicle Analysis Toolbox</b></li>
<ul>
<ul>
<li><a href="deploy/pipeline/README.md">PP-Vehicle</a></li>
</ul>
</ul>
<li><b>Human Analysis Toolbox</b></li>
<ul>
<ul>
<li><a href="deploy/pipeline/README.md">PP-Human</a></li>
<li><a href="deploy/pipeline/README.md">PP-HumanV2</a></li>
</ul>
</ul>
<li><b>Sport Analysis Toolbox</b></li>
<ul>
<ul>
<li><a href="https://github.com/PaddlePaddle/PaddleSports">PP-Sports</a></li>
</ul>
</td>
</tr>
</tbody>
</table>
## ⚖️模型性能对比
#### 🖥️服务器端模型性能对比
各模型结构和骨干网络的代表模型在COCO数据集上精度mAP和单卡Tesla V100上预测速度(FPS)对比图。
<div align="center">
<img src="https://user-images.githubusercontent.com/61035602/206434766-caaa781b-b922-481f-af09-15faac9ed33b.png" width="800"/>
</div>
<details>
<summary><b> 测试说明(点击展开)</b></summary>
- ViT为ViT-Cascade-Faster-RCNN模型COCO数据集mAP高达55.7%
- Cascade-Faster-RCNN为Cascade-Faster-RCNN-ResNet50vd-DCNPaddleDetection将其优化到COCO数据mAP为47.8%时推理速度为20FPS
- PP-YOLOE是对PP-YOLO v2模型的进一步优化L版本在COCO数据集mAP为51.6%Tesla V100预测速度78.1FPS
- PP-YOLOE+是对PP-YOLOE模型的进一步优化L版本在COCO数据集mAP为53.3%Tesla V100预测速度78.1FPS
- YOLOX和YOLOv5均为基于PaddleDetection复现算法YOLOv5代码在[PaddleYOLO](https://github.com/PaddlePaddle/PaddleYOLO)中,参照[PaddleYOLO_MODEL](docs/feature_models/PaddleYOLO_MODEL.md)
- 图中模型均可在[📱模型库](#模型库)中获取
</details>
#### ⌚️移动端模型性能对比
各移动端模型在COCO数据集上精度mAP和高通骁龙865处理器上预测速度(FPS)对比图。
<div align="center">
<img src="https://user-images.githubusercontent.com/61035602/206434741-10460690-8fc3-4084-a11a-16fe4ce2fc85.png" width="550"/>
</div>
<details>
<summary><b> 测试说明(点击展开)</b></summary>
- 测试数据均使用高通骁龙865(4xA77+4xA55)处理器batch size为1, 开启4线程测试测试使用NCNN预测库测试脚本见[MobileDetBenchmark](https://github.com/JiweiMaster/MobileDetBenchmark)
- PP-PicoDet及PP-YOLO-Tiny为PaddleDetection自研模型可在[📱模型库](#模型库)中获取其余模型PaddleDetection暂未提供
</details>
## 🎗️产业特色模型|产业工具
产业特色模型产业工具是PaddleDetection针对产业高频应用场景打造的兼顾精度和速度的模型以及工具箱注重从数据处理-模型训练-模型调优-模型部署的端到端打通,且提供了实际生产环境中的实践范例代码,帮助拥有类似需求的开发者高效地完成产品开发落地应用。
该系列模型工具均以PP前缀命名具体介绍、预训练模型以及产业实践范例代码如下。
### 💎PP-YOLOE 高精度目标检测模型
<details>
<summary><b> 简介(点击展开)</b></summary>
PP-YOLOE是基于PP-YOLOv2的卓越的单阶段Anchor-free模型超越了多种流行的YOLO模型。PP-YOLOE避免了使用诸如Deformable Convolution或者Matrix NMS之类的特殊算子以使其能轻松地部署在多种多样的硬件上。其使用大规模数据集obj365预训练模型进行预训练可以在不同场景数据集上快速调优收敛。
`传送门`[PP-YOLOE说明](configs/ppyoloe/README_cn.md)。
`传送门`[arXiv论文](https://arxiv.org/abs/2203.16250)。
</details>
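作为参考,此类配置的典型微调启动方式如下(GPU 数量与配置文件仅为示例,`--eval` 表示边训练边评估):

```bash
# 示例8 卡训练 PP-YOLOE+_l按需调整 --gpus 与配置文件
python -m paddle.distributed.launch --gpus 0,1,2,3,4,5,6,7 \
    tools/train.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml --eval
```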
<details>
<summary><b> 预训练模型(点击展开)</b></summary>
| 模型名称 | COCO精度mAP | V100 TensorRT FP16速度(FPS) | 推荐部署硬件 | 配置文件 | 模型下载 |
| :---------- | :-------------: | :-------------------------: | :----------: | :-----------------------------------------------------: | :-------------------------------------------------------------------------------------: |
| PP-YOLOE+_l | 53.3 | 149.2 | 服务器 | [链接](configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml) | [下载地址](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams) |
`传送门`[全部预训练模型](configs/ppyoloe/README_cn.md)。
</details>
<details>
<summary><b> 产业应用代码示例(点击展开)</b></summary>
| 行业 | 类别 | 亮点 | 文档说明 | 模型下载 |
| ---- | ----------------- | --------------------------------------------------------------------------------------------- | ------------------------------------------------------------- | --------------------------------------------------- |
| 农业 | 农作物检测 | 用于葡萄栽培中基于图像的监测和现场机器人技术提供了来自5种不同葡萄品种的实地实例 | [PP-YOLOE+ 下游任务](./configs/ppyoloe/application/README.md) | [下载链接](./configs/ppyoloe/application/README.md) |
| 通用 | 低光场景检测 | 低光数据集使用ExDark包括从极低光环境到暮光环境等10种不同光照条件下的图片。 | [PP-YOLOE+ 下游任务](./configs/ppyoloe/application/README.md) | [下载链接](./configs/ppyoloe/application/README.md) |
| 工业 | PCB电路板瑕疵检测 | 工业数据集使用PKU-Market-PCB该数据集用于印刷电路板PCB的瑕疵检测提供了6种常见的PCB缺陷 | [PP-YOLOE+ 下游任务](./configs/ppyoloe/application/README.md) | [下载链接](./configs/ppyoloe/application/README.md) |
</details>
### 💎PP-YOLOE-R 高性能旋转框检测模型
<details>
<summary><b> 简介(点击展开)</b></summary>
PP-YOLOE-R是一个高效的单阶段Anchor-free旋转框检测模型基于PP-YOLOE+引入了一系列改进策略来提升检测精度。根据不同的硬件对精度和速度的要求PP-YOLOE-R包含s/m/l/x四个尺寸的模型。在DOTA 1.0数据集上PP-YOLOE-R-l和PP-YOLOE-R-x在单尺度训练和测试的情况下分别达到了78.14mAP和78.28 mAP这在单尺度评估下超越了几乎所有的旋转框检测模型。通过多尺度训练和测试PP-YOLOE-R-l和PP-YOLOE-R-x的检测精度进一步提升至80.02mAP和80.73 mAP超越了所有的Anchor-free方法并且和最先进的Anchor-based的两阶段模型精度几乎相当。在保持高精度的同时PP-YOLOE-R避免使用特殊的算子例如Deformable Convolution或Rotated RoI Align使其能轻松地部署在多种多样的硬件上。
`传送门`[PP-YOLOE-R说明](configs/rotate/ppyoloe_r)。
`传送门`[arXiv论文](https://arxiv.org/abs/2211.02386)。
</details>
<details>
<summary><b> 预训练模型(点击展开)</b></summary>
| 模型 | Backbone | mAP | V100 TRT FP16 (FPS) | RTX 2080 Ti TRT FP16 (FPS) | Params (M) | FLOPs (G) | 学习率策略 | 角度表示 | 数据增广 | GPU数目 | 每GPU图片数目 | 模型下载 | 配置文件 |
| :----------: | :------: | :---: | :-----------------: | :------------------------: | :--------: | :-------: | :--------: | :------: | :------: | :-----: | :-----------: | :---------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------: |
| PP-YOLOE-R-l | CRN-l | 80.02 | 69.7 | 48.3 | 53.29 | 281.65 | 3x | oc | MS+RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota_ms.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota_ms.yml) |
`传送门`[全部预训练模型](configs/rotate/ppyoloe_r)。
</details>
<details>
<summary><b> 产业应用代码示例(点击展开)</b></summary>
| 行业 | 类别 | 亮点 | 文档说明 | 模型下载 |
| ---- | ---------- | --------------------------------------------------------------------- | --------------------------------------------------------------------------------------- | --------------------------------------------------------------------- |
| 通用 | 旋转框检测 | 手把手教你上手PP-YOLOE-R旋转框检测10分钟将脊柱数据集精度训练至95mAP | [基于PP-YOLOE-R的旋转框检测](https://aistudio.baidu.com/aistudio/projectdetail/5058293) | [下载链接](https://aistudio.baidu.com/aistudio/projectdetail/5058293) |
</details>
### 💎PP-YOLOE-SOD 高精度小目标检测模型
<details>
<summary><b> 简介(点击展开)</b></summary>
PP-YOLOE-SOD(Small Object Detection)是PaddleDetection团队针对小目标检测提出的检测方案在VisDrone-DET数据集上单模型精度达到38.5mAP达到了SOTA性能。其分别基于切图拼图流程优化的小目标检测方案以及基于原图模型算法优化的小目标检测方案。同时提供了数据集自动分析脚本只需输入数据集标注文件便可得到数据集统计结果辅助判断数据集是否是小目标数据集以及是否需要采用切图策略同时给出网络超参数参考值。
`传送门`[PP-YOLOE-SOD 小目标检测模型](configs/smalldet)。
</details>
<details>
<summary><b> 预训练模型(点击展开)</b></summary>
- VisDrone数据集预训练模型
| 模型 | COCOAPI mAP<sup>val<br>0.5:0.95 | COCOAPI mAP<sup>val<br>0.5 | COCOAPI mAP<sup>test_dev<br>0.5:0.95 | COCOAPI mAP<sup>test_dev<br>0.5 | MatlabAPI mAP<sup>test_dev<br>0.5:0.95 | MatlabAPI mAP<sup>test_dev<br>0.5 | 下载 | 配置文件 |
| :------------------ | :-----------------------------: | :------------------------: | :----------------------------------: | :-----------------------------: | :------------------------------------: | :-------------------------------: | :---------------------------------------------------------------------------------------------: | :----------------------------------------------------------: |
| **PP-YOLOE+_SOD-l** | **31.9** | **52.1** | **25.6** | **43.5** | **30.25** | **51.18** | [下载链接](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_sod_crn_l_80e_visdrone.pdparams) | [配置文件](visdrone/ppyoloe_plus_sod_crn_l_80e_visdrone.yml) |
`传送门`[全部预训练模型](configs/smalldet)。
</details>
<details>
<summary><b> 产业应用代码示例(点击展开)</b></summary>
| 行业 | 类别 | 亮点 | 文档说明 | 模型下载 |
| ---- | ---------- | ---------------------------------------------------- | ------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------- |
| 通用 | 小目标检测 | 基于PP-YOLOE-SOD的无人机航拍图像检测案例全流程实操。 | [基于PP-YOLOE-SOD的无人机航拍图像检测](https://aistudio.baidu.com/aistudio/projectdetail/5036782) | [下载链接](https://aistudio.baidu.com/aistudio/projectdetail/5036782) |
</details>
### 💫PP-PicoDet 超轻量实时目标检测模型
<details>
<summary><b> 简介(点击展开)</b></summary>
全新的轻量级系列模型PP-PicoDet在移动端具有卓越的性能成为全新SOTA轻量级模型。
`传送门`[PP-PicoDet说明](configs/picodet/README.md)。
`传送门`[arXiv论文](https://arxiv.org/abs/2111.00902)。
</details>
<details>
<summary><b> 预训练模型(点击展开)</b></summary>
| 模型名称 | COCO精度mAP | 骁龙865 四线程速度(FPS) | 推荐部署硬件 | 配置文件 | 模型下载 |
| :-------- | :-------------: | :---------------------: | :------------: | :--------------------------------------------------: | :----------------------------------------------------------------------------------: |
| PicoDet-L | 36.1 | 39.7 | 移动端、嵌入式 | [链接](configs/picodet/picodet_l_320_coco_lcnet.yml) | [下载地址](https://paddledet.bj.bcebos.com/models/picodet_l_320_coco_lcnet.pdparams) |
`传送门`[全部预训练模型](configs/picodet/README.md)。
</details>
<details>
<summary><b> 产业应用代码示例(点击展开)</b></summary>
| 行业 | 类别 | 亮点 | 文档说明 | 模型下载 |
| -------- | ------------ | ------------------------------------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------- |
| 智慧城市 | 道路垃圾检测 | 通过在市政环卫车辆上安装摄像头对路面垃圾检测并分析,实现对路面遗撒的垃圾进行监控,记录并通知环卫人员清理,大大提升了环卫人效。 | [基于PP-PicoDet的路面垃圾检测](https://aistudio.baidu.com/aistudio/projectdetail/3846170?channelType=0&channel=0) | [下载链接](https://aistudio.baidu.com/aistudio/projectdetail/3846170?channelType=0&channel=0) |
</details>
### 📡PP-Tracking 实时多目标跟踪系统
<details>
<summary><b> 简介(点击展开)</b></summary>
PaddleDetection团队提供了实时多目标跟踪系统PP-Tracking是基于PaddlePaddle深度学习框架的业界首个开源的实时多目标跟踪系统具有模型丰富、应用广泛和部署高效三大优势。 PP-Tracking支持单镜头跟踪(MOT)和跨镜头跟踪(MTMCT)两种模式针对实际业务的难点和痛点提供了行人跟踪、车辆跟踪、多类别跟踪、小目标跟踪、流量统计以及跨镜头跟踪等各种多目标跟踪功能和应用部署方式支持API调用和GUI可视化界面部署语言支持Python和C++部署平台环境支持Linux、NVIDIA Jetson等。
`传送门`[PP-Tracking说明](configs/mot/README.md)。
</details>
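以下表中的 ByteTrack 为例,一个示意性的评估启动方式如下(假设已按 configs/mot 的说明准备好 MOT 数据集,权重地址沿用模型库约定,仅供参考):

```bash
# 示例:评估 SDE 跟踪器 ByteTrack数据集与权重均为示意
python tools/eval_mot.py -c configs/mot/bytetrack/bytetrack_yolox.yml \
    -o weights=https://bj.bcebos.com/v1/paddledet/models/mot/yolox_x_24e_800x1440_mix_det.pdparams
```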
<details>
<summary><b> 预训练模型(点击展开)</b></summary>
| 模型名称 | 模型简介 | 精度 | 速度(FPS) | 推荐部署硬件 | 配置文件 | 模型下载 |
| :-------- | :----------------------------------: | :--------------------: | :-------: | :--------------------: | :--------------------------------------------------------: | :------------------------------------------------------------------------------------------------: |
| ByteTrack | SDE多目标跟踪算法 仅包含检测模型 | MOT-17 test: 78.4 | - | 服务器、移动端、嵌入式 | [链接](configs/mot/bytetrack/bytetrack_yolox.yml) | [下载地址](https://bj.bcebos.com/v1/paddledet/models/mot/yolox_x_24e_800x1440_mix_det.pdparams) |
| FairMOT | JDE多目标跟踪算法 多任务联合学习方法 | MOT-16 test: 75.0 | - | 服务器、移动端、嵌入式 | [链接](configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml) | [下载地址](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams) |
| OC-SORT | SDE多目标跟踪算法 仅包含检测模型 | MOT-17 half val: 75.5 | - | 服务器、移动端、嵌入式 | [链接](configs/mot/ocsort/ocsort_yolox.yml) | [下载地址](https://bj.bcebos.com/v1/paddledet/models/mot/yolox_x_24e_800x1440_mix_mot_ch.pdparams) |
</details>
<details>
<summary><b> 产业应用代码示例(点击展开)</b></summary>
| 行业 | 类别 | 亮点 | 文档说明 | 模型下载 |
| ---- | ---------- | -------------------------- | ---------------------------------------------------------------------------------------------- | --------------------------------------------------------------------- |
| 通用 | 多目标跟踪 | 快速上手单镜头、多镜头跟踪 | [PP-Tracking之手把手玩转多目标跟踪](https://aistudio.baidu.com/aistudio/projectdetail/3022582) | [下载链接](https://aistudio.baidu.com/aistudio/projectdetail/3022582) |
</details>
### ⛷PP-TinyPose 人体骨骼关键点识别
<details>
<summary><b> 简介(点击展开)</b></summary>
PaddleDetection 中的关键点检测部分紧跟最先进的算法,包括 Top-Down 和 Bottom-Up 两种方法可以满足用户的不同需求。同时PaddleDetection 提供针对移动端设备优化的自研实时关键点检测模型 PP-TinyPose。
`传送门`[PP-TinyPose说明](configs/keypoint/tiny_pose)。
</details>
<details>
<summary><b> 预训练模型(点击展开)</b></summary>
| 模型名称 | 模型简介 | COCO精度AP | 速度(FPS) | 推荐部署硬件 | 配置文件 | 模型下载 |
| :---------: | :----------------------------------: | :------------: | :-----------------------: | :------------: | :-----------------------------------------------------: | :--------------------------------------------------------------------------------------: |
| PP-TinyPose | 轻量级关键点算法<br/>输入尺寸256x192 | 68.8 | 骁龙865 四线程: 158.7 FPS | 移动端、嵌入式 | [链接](configs/keypoint/tiny_pose/tinypose_256x192.yml) | [下载地址](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_256x192.pdparams) |
`传送门`[全部预训练模型](configs/keypoint/README.md)。
</details>
<details>
<summary><b> 产业应用代码示例(点击展开)</b></summary>
| 行业 | 类别 | 亮点 | 文档说明 | 模型下载 |
| ---- | ---- | ---------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------- |
| 运动 | 健身 | 提供从模型选型、数据准备、模型训练优化到后处理逻辑和模型部署的全流程可复用方案有效解决了复杂健身动作的高效识别打造AI虚拟健身教练 | [基于PP-TinyPose增强版的智能健身动作识别](https://aistudio.baidu.com/aistudio/projectdetail/4385813) | [下载链接](https://aistudio.baidu.com/aistudio/projectdetail/4385813) |
</details>
### 🏃🏻PP-Human 实时行人分析工具
<details>
<summary><b> 简介(点击展开)</b></summary>
PaddleDetection深入探索核心行业的高频场景提供了行人开箱即用分析工具支持图片/单镜头视频/多镜头视频/在线视频流多种输入方式广泛应用于智慧交通、智慧城市、工业巡检等领域。支持服务器端部署及TensorRT加速T4服务器上可达到实时。
PP-Human支持四大产业级功能五大异常行为识别、26种人体属性分析、实时人流计数、跨镜头ReID跟踪。
`传送门`[PP-Human行人分析工具使用指南](deploy/pipeline/README.md)。
</details>
<details>
<summary><b> 预训练模型(点击展开)</b></summary>
| 任务 | T4 TensorRT FP16: 速度FPS | 推荐部署硬件 | 模型下载 | 模型体积 |
| :----------------: | :---------------------------: | :----------: | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: | :---------------------------------------------------------------: |
| 行人检测(高精度) | 39.8 | 服务器 | [目标检测](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) | 182M |
| 行人跟踪(高精度) | 31.4 | 服务器 | [多目标跟踪](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) | 182M |
| 属性识别(高精度) | 单人 117.6 | 服务器 | [目标检测](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)<br> [属性识别](https://bj.bcebos.com/v1/paddledet/models/pipeline/PPHGNet_small_person_attribute_954_infer.zip) | 目标检测182M<br>属性识别86M |
| 摔倒识别 | 单人 100 | 服务器 | [多目标跟踪](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) <br> [关键点检测](https://bj.bcebos.com/v1/paddledet/models/pipeline/dark_hrnet_w32_256x192.zip) <br> [基于关键点行为识别](https://bj.bcebos.com/v1/paddledet/models/pipeline/STGCN.zip) | 多目标跟踪182M<br>关键点检测101M<br>基于关键点行为识别21.8M |
| 闯入识别 | 31.4 | 服务器 | [多目标跟踪](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) | 182M |
| 打架识别 | 50.8 | 服务器 | [视频分类](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) | 90M |
| 抽烟识别 | 340.1 | 服务器 | [目标检测](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)<br>[基于人体id的目标检测](https://bj.bcebos.com/v1/paddledet/models/pipeline/ppyoloe_crn_s_80e_smoking_visdrone.zip) | 目标检测182M<br>基于人体id的目标检测27M |
| 打电话识别 | 166.7 | 服务器 | [目标检测](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)<br>[基于人体id的图像分类](https://bj.bcebos.com/v1/paddledet/models/pipeline/PPHGNet_tiny_calling_halfbody.zip) | 目标检测182M<br>基于人体id的图像分类45M |
`传送门`[完整预训练模型](deploy/pipeline/README.md)。
</details>
<details>
<summary><b> 产业应用代码示例(点击展开)</b></summary>
| 行业 | 类别 | 亮点 | 文档说明 | 模型下载 |
| -------- | -------- | ---------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------------- |
| 智能安防 | 摔倒检测 | 飞桨行人分析PP-Human中提供的摔倒识别算法采用了关键点+时空图卷积网络的技术,对摔倒姿势无限制、背景环境无要求。 | [基于PP-Human v2的摔倒检测](https://aistudio.baidu.com/aistudio/projectdetail/4606001) | [下载链接](https://aistudio.baidu.com/aistudio/projectdetail/4606001) |
| 智能安防 | 打架识别 | 本项目基于PaddleVideo视频开发套件训练打架识别模型然后将训练好的模型集成到PaddleDetection的PP-Human中助力行人行为分析。 | [基于PP-Human的打架识别](https://aistudio.baidu.com/aistudio/projectdetail/4086987?contributionType=1) | [下载链接](https://aistudio.baidu.com/aistudio/projectdetail/4086987?contributionType=1) |
| 智能安防 | 摔倒检测 | 基于PP-Human完成来客分析整体流程。使用PP-Human完成来客分析中非常常见的场景 1. 来客属性识别(单镜和跨境可视化2. 来客行为识别(摔倒识别)。 | [基于PP-Human的来客分析案例教程](https://aistudio.baidu.com/aistudio/projectdetail/4537344) | [下载链接](https://aistudio.baidu.com/aistudio/projectdetail/4537344) |
</details>
### 🏎PP-Vehicle 实时车辆分析工具
<details>
<summary><b> 简介(点击展开)</b></summary>
PaddleDetection深入探索核心行业的高频场景提供了车辆开箱即用分析工具支持图片/单镜头视频/多镜头视频/在线视频流多种输入方式广泛应用于智慧交通、智慧城市、工业巡检等领域。支持服务器端部署及TensorRT加速T4服务器上可达到实时。
PP-Vehicle囊括四大交通场景核心功能车牌识别、属性识别、车流量统计、违章检测。
`传送门`[PP-Vehicle车辆分析工具指南](deploy/pipeline/README.md)。
</details>
<details>
<summary><b> 预训练模型(点击展开)</b></summary>
| 任务 | T4 TensorRT FP16: 速度(FPS) | 推荐部署硬件 | 模型方案 | 模型体积 |
| :----------------: | :-------------------------: | :----------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: | :-------------------------------------: |
| 车辆检测(高精度) | 38.9 | 服务器 | [目标检测](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_ppvehicle.zip) | 182M |
| 车辆跟踪(高精度) | 25 | 服务器 | [多目标跟踪](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_ppvehicle.zip) | 182M |
| 车牌识别 | 213.7 | 服务器 | [车牌检测](https://bj.bcebos.com/v1/paddledet/models/pipeline/ch_PP-OCRv3_det_infer.tar.gz) <br> [车牌识别](https://bj.bcebos.com/v1/paddledet/models/pipeline/ch_PP-OCRv3_rec_infer.tar.gz) | 车牌检测3.9M <br> 车牌字符识别: 12M |
| 车辆属性 | 136.8 | 服务器 | [属性识别](https://bj.bcebos.com/v1/paddledet/models/pipeline/vehicle_attribute_model.zip) | 7.2M |
`传送门`[完整预训练模型](deploy/pipeline/README.md)。
</details>
<details>
<summary><b> 产业应用代码示例(点击展开)</b></summary>
| 行业 | 类别 | 亮点 | 文档说明 | 模型下载 |
| -------- | ---------------- | ------------------------------------------------------------------------------------------------------------------ | --------------------------------------------------------------------------------------------- | --------------------------------------------------------------------- |
| 智慧交通 | 交通监控车辆分析 | 本项目基于PP-Vehicle演示智慧交通中最刚需的车流量监控、车辆违停检测以及车辆结构化车牌、车型、颜色分析三大场景。 | [基于PP-Vehicle的交通监控分析系统](https://aistudio.baidu.com/aistudio/projectdetail/4512254) | [下载链接](https://aistudio.baidu.com/aistudio/projectdetail/4512254) |
</details>
## 💡产业实践范例
产业实践范例是PaddleDetection针对高频目标检测应用场景提供的端到端开发示例帮助开发者打通数据标注-模型训练-模型调优-预测部署全流程。
针对每个范例我们都通过[AI-Studio](https://ai.baidu.com/ai-doc/AISTUDIO/Tk39ty6ho)提供了项目代码以及说明,用户可以同步运行体验。
`传送门`[产业实践范例完整列表](industrial_tutorial/README.md)
- [基于PP-YOLOE-R的旋转框检测](https://aistudio.baidu.com/aistudio/projectdetail/5058293)
- [基于PP-YOLOE-SOD的无人机航拍图像检测](https://aistudio.baidu.com/aistudio/projectdetail/5036782)
- [基于PP-Vehicle的交通监控分析系统](https://aistudio.baidu.com/aistudio/projectdetail/4512254)
- [基于PP-Human v2的摔倒检测](https://aistudio.baidu.com/aistudio/projectdetail/4606001)
- [基于PP-TinyPose增强版的智能健身动作识别](https://aistudio.baidu.com/aistudio/projectdetail/4385813)
- [基于PP-Human的打架识别](https://aistudio.baidu.com/aistudio/projectdetail/4086987?contributionType=1)
- [基于Faster-RCNN的瓷砖表面瑕疵检测](https://aistudio.baidu.com/aistudio/projectdetail/2571419)
- [基于PaddleDetection的PCB瑕疵检测](https://aistudio.baidu.com/aistudio/projectdetail/2367089)
- [基于FairMOT实现人流量统计](https://aistudio.baidu.com/aistudio/projectdetail/2421822)
- [基于YOLOv3实现跌倒检测](https://aistudio.baidu.com/aistudio/projectdetail/2500639)
- [基于PP-PicoDetv2 的路面垃圾检测](https://aistudio.baidu.com/aistudio/projectdetail/3846170?channelType=0&channel=0)
- [基于人体关键点检测的合规检测](https://aistudio.baidu.com/aistudio/projectdetail/4061642?contributionType=1)
- [基于PP-Human的来客分析案例教程](https://aistudio.baidu.com/aistudio/projectdetail/4537344)
- 持续更新中...
## 🏆企业应用案例
企业应用案例是企业在实生产环境下落地应用PaddleDetection的方案思路相比产业实践范例其更多强调整体方案设计思路可供开发者在项目方案设计中做参考。
`传送门`[企业应用案例完整列表](https://www.paddlepaddle.org.cn/customercase)
- [中国南方电网——变电站智慧巡检](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2330)
- [国铁电气——轨道在线智能巡检系统](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2280)
- [京东物流——园区车辆行为识别](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2611)
- [中兴克拉—厂区传统仪表统计监测](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2618)
- [宁德时代—动力电池高精度质量检测](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2609)
- [中国科学院空天信息创新研究院——高尔夫球场遥感监测](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2483)
- [御航智能——基于边缘的无人机智能巡检](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2481)
- [普宙无人机——高精度森林巡检](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2121)
- [领邦智能——红外无感测温监控](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2615)
- [北京地铁——口罩检测](https://mp.weixin.qq.com/s/znrqaJmtA7CcjG0yQESWig)
- [音智达——工厂人员违规行为检测](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2288)
- [华夏天信——输煤皮带机器人智能巡检](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2331)
- [优恩物联网——社区住户分类支持广告精准投放](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2485)
- [螳螂慧视——室内3D点云场景物体分割与检测](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2599)
- 持续更新中...
## 📝许可证书
本项目的发布受[Apache 2.0 license](LICENSE)许可认证。
## 📌引用
```
@misc{ppdet2019,
title={PaddleDetection, Object detection and instance segmentation toolkit based on PaddlePaddle.},
author={PaddlePaddle Authors},
howpublished = {\url{https://github.com/PaddlePaddle/PaddleDetection}},
year={2019}
}
```

View File

@@ -0,0 +1,541 @@
[简体中文](README_cn.md) | English
<div align="center">
<p align="center">
<img src="https://user-images.githubusercontent.com/48054808/160532560-34cf7a1f-d950-435e-90d2-4b0a679e5119.png" align="middle" width = "800" />
</p>
**A High-Efficient Development Toolkit for Object Detection based on [PaddlePaddle](https://github.com/paddlepaddle/paddle)**
<p align="center">
<a href="./LICENSE"><img src="https://img.shields.io/badge/license-Apache%202-dfd.svg"></a>
<a href="https://github.com/PaddlePaddle/PaddleDetection/releases"><img src="https://img.shields.io/github/v/release/PaddlePaddle/PaddleDetection?color=ffa"></a>
<a href=""><img src="https://img.shields.io/badge/python-3.7+-aff.svg"></a>
<a href=""><img src="https://img.shields.io/badge/os-linux%2C%20win%2C%20mac-pink.svg"></a>
<a href="https://github.com/PaddlePaddle/PaddleDetection/stargazers"><img src="https://img.shields.io/github/stars/PaddlePaddle/PaddleDetection?color=ccf"></a>
</p>
</div>
<div align="center">
<img src="https://user-images.githubusercontent.com/22989727/205581915-aa8d6bee-5624-4aec-8059-76b5ebaf96f1.gif" width="800"/>
</div>
## <img src="https://user-images.githubusercontent.com/48054808/157793354-6e7f381a-0aa6-4bb7-845c-9acf2ecc05c3.png" width="20"/> Product Update
- 🔥 **2022.11.15SOTA rotated object detector and small object detector based on PP-YOLOE**
- Rotated object detector [PP-YOLOE-R](configs/rotate/ppyoloe_r)
- SOTA Anchor-free rotated object detection model with high accuracy and efficiency
- A series of models, named s/m/l/x, for cloud and edge devices
- Avoiding using special operators to be deployed friendly with TensorRT.
- Small object detector [PP-YOLOE-SOD](configs/smalldet)
- End-to-end detection pipeline based on sliced images
- SOTA model on VisDrone based on original images.
- 2022.8.26PaddleDetection releases[release/2.5 version](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5)
- 🗳 Model features
- Release [PP-YOLOE+](configs/ppyoloe): Increased accuracy by a maximum of 2.4% mAP to 54.9% mAP, 3.75 times faster model training convergence rate, and up to 2.3 times faster end-to-end inference speed; improved generalization for multiple downstream tasks
- Release [PicoDet-NPU](configs/picodet) model which supports full quantization deployment of models; add [PicoDet](configs/picodet) layout analysis model
- Release [PP-TinyPose Plus](./configs/keypoint/tiny_pose/). With 9.1% AP accuracy improvement in physical exercise, dance, and other scenarios, our PP-TinyPose Plus supports unconventional movements such as turning to one side, lying down, jumping, and high lifts
- 🔮 Functions in different scenarios
- Release the pedestrian analysis tool [PP-Human v2](./deploy/pipeline). It introduces four new behavior recognition: fighting, telephoning, smoking, and trespassing. The underlying algorithm performance is optimized, covering three core algorithm capabilities: detection, tracking, and attributes of pedestrians. Our model provides end-to-end development and model optimization strategies for beginners and supports online video streaming input.
- First release [PP-Vehicle](./deploy/pipeline), which has four major functions: license plate recognition, vehicle attribute analysis (color, model), traffic flow statistics, and violation detection. It is compatible with input formats, including pictures, online video streaming, and video. And we also offer our users a comprehensive set of tutorials for customization.
- 💡 Cutting-edge algorithms
- Release [PaddleYOLO](https://github.com/PaddlePaddle/PaddleYOLO) which overs classic and latest models of [YOLO family](https://github.com/PaddlePaddle/PaddleYOLO/tree/develop/docs/MODEL_ZOO_en.md): YOLOv3, PP-YOLOE (a real-time high-precision object detection model developed by Baidu PaddlePaddle), and cutting-edge detection algorithms such as YOLOv4, YOLOv5, YOLOX, YOLOv6, YOLOv7 and YOLOv8
- Newly add high precision detection model based on [ViT](configs/vitdet) backbone network, with a 55.7% mAP accuracy on COCO dataset; newly add multi-object tracking model [OC-SORT](configs/mot/ocsort); newly add [ConvNeXt](configs/convnext) backbone network.
- 📋 Industrial applications: Newly add [Smart Fitness](https://aistudio.baidu.com/aistudio/projectdetail/4385813), [Fighting recognition](https://aistudio.baidu.com/aistudio/projectdetail/4086987?channelType=0&channel=0),[ and Visitor Analysis](https://aistudio.baidu.com/aistudio/projectdetail/4230123?channelType=0&channel=0).
- 2022.3.24PaddleDetection released[release/2.4 version](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4)
- Release high-performanace SOTA object detection model [PP-YOLOE](configs/ppyoloe). It integrates cloud and edge devices and provides S/M/L/X versions. In particular, Verson L has the accuracy as 51.4% on COCO test 2017 dataset, inference speed as 78.1 FPS on a single Test V100. It supports mixed precision training, 33% faster than PP-YOLOv2. Its full range of multi-sized models can meet different hardware arithmetic requirements, and adaptable to server, edge-device GPU and other AI accelerator cards on servers.
- Release ultra-lightweight SOTA object detection model [PP-PicoDet Plus](configs/picodet) with 2% improvement in accuracy and 63% improvement in CPU inference speed. Add PicoDet-XS model with a 0.7M parameter, providing model sparsification and quantization functions for model acceleration. No specific post processing module is required for all the hardware, simplifying the deployment.
- Release the real-time pedestrian analysis tool [PP-Human](deploy/pphuman). It has four major functions: pedestrian tracking, visitor flow statistics, human attribute recognition and falling detection. For falling detection, it is optimized based on real-life data with accurate recognition of various types of falling posture. It can adapt to different environmental background, light and camera angle.
- Add [YOLOX](configs/yolox) object detection model with nano/tiny/S/M/L/X. X version has the accuracy as 51.8% on COCO Val2017 dataset.
- [More releases](https://github.com/PaddlePaddle/PaddleDetection/releases)
## <img title="" src="https://user-images.githubusercontent.com/48054808/157795569-9fc77c85-732f-4870-9be0-99a7fe2cff27.png" alt="" width="20"> Brief Introduction
**PaddleDetection** is an end-to-end object detection development kit based on PaddlePaddle. Providing **over 30 model algorithm** and **over 300 pre-trained models**, it covers object detection, instance segmentation, keypoint detection, multi-object tracking. In particular, PaddleDetection offers **high- performance & light-weight** industrial SOTA models on **servers and mobile** devices, champion solution and cutting-edge algorithm. PaddleDetection provides various data augmentation methods, configurable network components, loss functions and other advanced optimization & deployment schemes. In addition to running through the whole process of data processing, model development, training, compression and deployment, PaddlePaddle also provides rich cases and tutorials to accelerate the industrial application of algorithm.
<div align="center">
<img src="https://user-images.githubusercontent.com/22989727/189122825-ee1c1db2-b5f9-42c0-88b4-7975e1ec239d.gif" width="800"/>
</div>
## <img src="https://user-images.githubusercontent.com/48054808/157799599-e6a66855-bac6-4e75-b9c0-96e13cb9612f.png" width="20"/> Features
- **Rich model library**: PaddleDetection provides over 250 pre-trained models including **object detection, instance segmentation, face recognition, multi-object tracking**. It covers a variety of **global competition champion** schemes.
- **Simple to use**: Modular design, decoupling each network component, easy for developers to build and try various detection models and optimization strategies, quick access to high-performance, customized algorithm.
- **Getting Through End to End**: PaddlePaddle gets through end to end from data augmentation, constructing models, training, compression, depolyment. It also supports multi-architecture, multi-device deployment for **cloud and edge** device.
- **High Performance**: Due to the high performance core, PaddlePaddle has clear advantages in training speed and memory occupation. It also supports FP16 training and multi-machine training.
<div align="center">
<img src="https://user-images.githubusercontent.com/22989727/202131382-45fd2de6-3805-460e-a70c-66db7188d37c.png" width="800"/>
</div>
## <img title="" src="https://user-images.githubusercontent.com/48054808/157800467-2a9946ad-30d1-49a9-b9db-ba33413d9c90.png" alt="" width="20"> Exchanges
- If you have any question or suggestion, please give us your valuable input via [GitHub Issues](https://github.com/PaddlePaddle/PaddleDetection/issues)
Welcome to join PaddleDetection user groups on WeChat (scan the QR code, add and reply "D" to the assistant)
<div align="center">
<img src="https://user-images.githubusercontent.com/34162360/177678712-4655747d-4290-4ad9-b7a1-4564a5418ac6.jpg" width = "200" />
</div>
## <img src="https://user-images.githubusercontent.com/48054808/157827140-03ffaff7-7d14-48b4-9440-c38986ea378c.png" width="20"/> Kit Structure
<table align="center">
<tbody>
<tr align="center" valign="bottom">
<td>
<b>Architectures</b>
</td>
<td>
<b>Backbones</b>
</td>
<td>
<b>Components</b>
</td>
<td>
<b>Data Augmentation</b>
</td>
</tr>
<tr valign="top">
<td>
<ul>
<details><summary><b>Object Detection</b></summary>
<ul>
<li>Faster RCNN</li>
<li>FPN</li>
<li>Cascade-RCNN</li>
<li>PSS-Det</li>
<li>RetinaNet</li>
<li>YOLOv3</li>
<li>YOLOF</li>
<li>YOLOX</li>
<li>YOLOv5</li>
<li>YOLOv6</li>
<li>YOLOv7</li>
<li>YOLOv8</li>
<li>RTMDet</li>
<li>PP-YOLO</li>
<li>PP-YOLO-Tiny</li>
<li>PP-PicoDet</li>
<li>PP-YOLOv2</li>
<li>PP-YOLOE</li>
<li>PP-YOLOE+</li>
<li>PP-YOLOE-SOD</li>
<li>PP-YOLOE-R</li>
<li>SSD</li>
<li>CenterNet</li>
<li>FCOS</li>
<li>FCOSR</li>
<li>TTFNet</li>
<li>TOOD</li>
<li>GFL</li>
<li>GFLv2</li>
<li>DETR</li>
<li>Deformable DETR</li>
<li>Swin Transformer</li>
<li>Sparse RCNN</li>
</ul></details>
<details><summary><b>Instance Segmentation</b></summary>
<ul>
<li>Mask RCNN</li>
<li>Cascade Mask RCNN</li>
<li>SOLOv2</li>
</ul></details>
<details><summary><b>Face Detection</b></summary>
<ul>
<li>BlazeFace</li>
</ul></details>
<details><summary><b>Multi-Object-Tracking</b></summary>
<ul>
<li>JDE</li>
<li>FairMOT</li>
<li>DeepSORT</li>
<li>ByteTrack</li>
<li>OC-SORT</li>
<li>BoT-SORT</li>
<li>CenterTrack</li>
</ul></details>
<details><summary><b>KeyPoint-Detection</b></summary>
<ul>
<li>HRNet</li>
<li>HigherHRNet</li>
<li>Lite-HRNet</li>
<li>PP-TinyPose</li>
</ul></details>
</ul>
</td>
<td>
<details><summary><b>Details</b></summary>
<ul>
<li>ResNet(&vd)</li>
<li>Res2Net(&vd)</li>
<li>CSPResNet</li>
<li>SENet</li>
<li>Res2Net</li>
<li>HRNet</li>
<li>Lite-HRNet</li>
<li>DarkNet</li>
<li>CSPDarkNet</li>
<li>MobileNetv1/v3</li>
<li>ShuffleNet</li>
<li>GhostNet</li>
<li>BlazeNet</li>
<li>DLA</li>
<li>HardNet</li>
<li>LCNet</li>
<li>ESNet</li>
<li>Swin-Transformer</li>
<li>ConvNeXt</li>
<li>Vision Transformer</li>
</ul></details>
</td>
<td>
<details><summary><b>Common</b></summary>
<ul>
<li>Sync-BN</li>
<li>Group Norm</li>
<li>DCNv2</li>
<li>EMA</li>
</ul> </details>
</ul>
<details><summary><b>KeyPoint</b></summary>
<ul>
<li>DarkPose</li>
</ul></details>
</ul>
<details><summary><b>FPN</b></summary>
<ul>
<li>BiFPN</li>
<li>CSP-PAN</li>
<li>Custom-PAN</li>
<li>ES-PAN</li>
<li>HRFPN</li>
</ul> </details>
</ul>
<details><summary><b>Loss</b></summary>
<ul>
<li>Smooth-L1</li>
<li>GIoU/DIoU/CIoU</li>
<li>IoUAware</li>
<li>Focal Loss</li>
<li>CT Focal Loss</li>
<li>VariFocal Loss</li>
</ul> </details>
</ul>
<details><summary><b>Post-processing</b></summary>
<ul>
<li>SoftNMS</li>
<li>MatrixNMS</li>
</ul> </details>
</ul>
<details><summary><b>Speed</b></summary>
<ul>
<li>FP16 training</li>
<li>Multi-machine training </li>
</ul> </details>
</ul>
</td>
<td>
<details><summary><b>Details</b></summary>
<ul>
<li>Resize</li>
<li>Lighting</li>
<li>Flipping</li>
<li>Expand</li>
<li>Crop</li>
<li>Color Distort</li>
<li>Random Erasing</li>
<li>Mixup </li>
<li>AugmentHSV</li>
<li>Mosaic</li>
<li>Cutmix </li>
<li>Grid Mask</li>
<li>Auto Augment</li>
<li>Random Perspective</li>
</ul> </details>
</td>
</tr>
</td>
</tr>
</tbody>
</table>
## <img src="https://user-images.githubusercontent.com/48054808/157801371-9a9a8c65-1690-4123-985a-e0559a7f9494.png" width="20"/> Model Performance
<details>
<summary><b> Performance comparison of Cloud models</b></summary>
The comparison between COCO mAP and FPS on Tesla V100 of representative models of each architectures and backbones.
<div align="center">
<img src="docs/images/fps_map.png" />
</div>
**Clarification**
- `ViT` stands for `ViT-Cascade-Faster-RCNN`, which has highest mAP on COCO as 55.7%
- `Cascade-Faster-RCNN`stands for `Cascade-Faster-RCNN-ResNet50vd-DCN`, which has been optimized to 20 FPS inference speed when COCO mAP as 47.8% in PaddleDetection models
- `PP-YOLOE` are optimized `PP-YOLO v2`. It reached accuracy as 51.4% on COCO dataset, inference speed as 78.1 FPS on Tesla V100
- `PP-YOLOE+` are optimized `PP-YOLOE`. It reached accuracy as 53.3% on COCO dataset, inference speed as 78.1 FPS on Tesla V100
- The models in the figure are available in the[ model library](#模型库)
</details>
<details>
<summary><b> Performance omparison on mobiles</b></summary>
The comparison between COCO mAP and FPS on Qualcomm Snapdragon 865 processor of models on mobile devices.
<div align="center">
<img src="docs/images/mobile_fps_map.png" width=600/>
</div>
**Clarification**
- Tests were conducted on Qualcomm Snapdragon 865 (4 \*A77 + 4 \*A55) batch_size=1, 4 thread, and NCNN inference library, test script see [MobileDetBenchmark](https://github.com/JiweiMaster/MobileDetBenchmark)
- [PP-PicoDet](configs/picodet) and [PP-YOLO-Tiny](configs/ppyolo) are self-developed models of PaddleDetection, and other models are not tested yet.
</details>
## <img src="https://user-images.githubusercontent.com/48054808/157829890-a535b8a6-631c-4c87-b861-64d4b32b2d6a.png" width="20"/> Model libraries
<details>
<summary><b> 1. General detection</b></summary>
#### PP-YOLOE series Recommended scenarios: Cloud GPU such as Nvidia V100, T4 and edge devices such as Jetson series
| Model | COCO AccuracymAP | V100 TensorRT FP16 Speed(FPS) | Configuration | Download |
|:---------- |:------------------:|:-----------------------------:|:-------------------------------------------------------:|:----------------------------------------------------------------------------------------:|
| PP-YOLOE+_s | 43.9 | 333.3 | [link](configs/ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml) | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_s_80e_coco.pdparams) |
| PP-YOLOE+_m | 50.0 | 208.3 | [link](configs/ppyoloe/ppyoloe_plus_crn_m_80e_coco.yml) | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_m_80e_coco.pdparams) |
| PP-YOLOE+_l | 53.3 | 149.2 | [link](configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml) | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams) |
| PP-YOLOE+_x | 54.9 | 95.2 | [link](configs/ppyoloe/ppyoloe_plus_crn_x_80e_coco.yml) | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_x_80e_coco.pdparams) |
#### PP-PicoDet series Recommended scenarios: Mobile chips and x86 CPU devices, such as ARM CPU(RK3399, Raspberry Pi) and NPU(BITMAIN)
| Model | COCO AccuracymAP | Snapdragon 865 four-thread speed (ms) | Configuration | Download |
|:---------- |:------------------:|:-------------------------------------:|:-----------------------------------------------------:|:-------------------------------------------------------------------------------------:|
| PicoDet-XS | 23.5 | 7.81 | [Link](configs/picodet/picodet_xs_320_coco_lcnet.yml) | [Download](https://paddledet.bj.bcebos.com/models/picodet_xs_320_coco_lcnet.pdparams) |
| PicoDet-S | 29.1 | 9.56 | [Link](configs/picodet/picodet_s_320_coco_lcnet.yml) | [Download](https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams) |
| PicoDet-M | 34.4 | 17.68 | [Link](configs/picodet/picodet_m_320_coco_lcnet.yml) | [Download](https://paddledet.bj.bcebos.com/models/picodet_m_320_coco_lcnet.pdparams) |
| PicoDet-L | 36.1 | 25.21 | [Link](configs/picodet/picodet_l_320_coco_lcnet.yml) | [Download](https://paddledet.bj.bcebos.com/models/picodet_l_320_coco_lcnet.pdparams) |
#### [Frontier detection algorithm](docs/feature_models/PaddleYOLO_MODEL.md)
| Model | COCO AccuracymAP | V100 TensorRT FP16 speed(FPS) | Configuration | Download |
|:-------- |:------------------:|:-----------------------------:|:--------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------:|
| [YOLOX-l](configs/yolox) | 50.1 | 107.5 | [Link](configs/yolox/yolox_l_300e_coco.yml) | [Download](https://paddledet.bj.bcebos.com/models/yolox_l_300e_coco.pdparams) |
| [YOLOv5-l](https://github.com/PaddlePaddle/PaddleYOLO/tree/develop/configs/yolov5) | 48.6 | 136.0 | [Link](https://github.com/PaddlePaddle/PaddleYOLO/tree/develop/configs/yolov5/yolov5_l_300e_coco.yml) | [Download](https://paddledet.bj.bcebos.com/models/yolov5_l_300e_coco.pdparams) |
| [YOLOv7-l](https://github.com/PaddlePaddle/PaddleYOLO/tree/develop/configs/yolov7) | 51.0 | 135.0 | [链接](https://github.com/PaddlePaddle/PaddleYOLO/tree/develop/configs/yolov7/yolov7_l_300e_coco.yml) | [下载地址](https://paddledet.bj.bcebos.com/models/yolov7_l_300e_coco.pdparams) |
#### Other general purpose models [doc](docs/MODEL_ZOO_en.md)
</details>
<details>
<summary><b> 2. Instance segmentation</b></summary>
| Model | Introduction | Recommended Scenarios | COCO Accuracy(mAP) | Configuration | Download |
|:----------------- |:-------------------------------------------------------- |:--------------------------------------------- |:--------------------------------:|:-----------------------------------------------------------------------:|:-----------------------------------------------------------------------------------------------------:|
| Mask RCNN | Two-stage instance segmentation algorithm | <div style="width: 50pt">Edge-Cloud end</div> | box AP: 41.4 <br/> mask AP: 37.5 | [Link](configs/mask_rcnn/mask_rcnn_r50_vd_fpn_2x_coco.yml) | [Download](https://paddledet.bj.bcebos.com/models/mask_rcnn_r50_vd_fpn_2x_coco.pdparams) |
| Cascade Mask RCNN | Two-stage instance segmentation algorithm | <div style="width: 50pt">Edge-Cloud end</div> | box AP: 45.7 <br/> mask AP: 39.7 | [Link](configs/mask_rcnn/cascade_mask_rcnn_r50_vd_fpn_ssld_2x_coco.yml) | [Download](https://paddledet.bj.bcebos.com/models/cascade_mask_rcnn_r50_vd_fpn_ssld_2x_coco.pdparams) |
| SOLOv2 | Lightweight single-stage instance segmentation algorithm | <div style="width: 50pt">Edge-Cloud end</div> | mask AP: 38.0 | [Link](configs/solov2/solov2_r50_fpn_3x_coco.yml) | [Download](https://paddledet.bj.bcebos.com/models/solov2_r50_fpn_3x_coco.pdparams) |
</details>
<details>
<summary><b> 3. Keypoint detection</b></summary>
| Model | Introduction | Recommended scenarios | COCO AccuracyAP | Speed | Configuration | Download |
|:-------------------- |:--------------------------------------------------------------------------------------------- |:--------------------------------------------- |:-----------------:|:---------------------------------:|:---------------------------------------------------------:|:-------------------------------------------------------------------------------------------:|
| HRNet-w32 + DarkPose | <div style="width: 130pt">Top-down Keypoint detection algorithm<br/>Input size: 384x288</div> | <div style="width: 50pt">Edge-Cloud end</div> | 78.3 | T4 TensorRT FP16 2.96ms | [Link](configs/keypoint/hrnet/dark_hrnet_w32_384x288.yml) | [Download](https://paddledet.bj.bcebos.com/models/keypoint/dark_hrnet_w32_384x288.pdparams) |
| HRNet-w32 + DarkPose | Top-down Keypoint detection algorithm<br/>Input size: 256x192 | Edge-Cloud end | 78.0 | T4 TensorRT FP16 1.75ms | [Link](configs/keypoint/hrnet/dark_hrnet_w32_256x192.yml) | [Download](https://paddledet.bj.bcebos.com/models/keypoint/dark_hrnet_w32_256x192.pdparams) |
| PP-TinyPose | Light-weight keypoint algorithm<br/>Input size: 256x192 | Mobile | 68.8 | Snapdragon 865 four-thread 6.30ms | [Link](configs/keypoint/tiny_pose/tinypose_256x192.yml) | [Download](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_256x192.pdparams) |
| PP-TinyPose | Light-weight keypoint algorithm<br/>Input size: 128x96 | Mobile | 58.1 | Snapdragon 865 four-thread 2.37ms | [Link](configs/keypoint/tiny_pose/tinypose_128x96.yml) | [Download](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_128x96.pdparams) |
#### Other keypoint detection models [doc](configs/keypoint)
</details>
<details>
<summary><b> 4. Multi-object tracking PP-Tracking</b></summary>
| Model | Introduction | Recommended scenarios | Accuracy | Configuration | Download |
|:--------- |:------------------------------------------------------------- |:--------------------- |:----------------------:|:-----------------------------------------------------------------------:|:-----------------------------------------------------------------------------------------------------:|
| ByteTrack | SDE Multi-object tracking algorithm with detection model only | Edge-Cloud end | MOT-17 half val: 77.3 | [Link](configs/mot/bytetrack/detector/yolox_x_24e_800x1440_mix_det.yml) | [Download](https://paddledet.bj.bcebos.com/models/mot/deepsort/yolox_x_24e_800x1440_mix_det.pdparams) |
| FairMOT | JDE multi-object tracking algorithm multi-task learning | Edge-Cloud end | MOT-16 test: 75.0 | [Link](configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml) | [Download](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams) |
| OC-SORT | SDE multi-object tracking algorithm with detection model only | Edge-Cloud end | MOT-16 half val: 75.5 | [Link](configs/mot/ocsort/ocsort_yolox.yml) | - |
#### Other multi-object tracking models [docs](configs/mot)
</details>
<details>
<summary><b> 5. Industrial real-time pedestrain analysis tool-PP Human</b></summary>
| Task | End-to-End Speedms | Model | Size |
|:--------------------------------------:|:--------------------:|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------:|
| Pedestrian detection (high precision) | 25.1ms | [Multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) | 182M |
| Pedestrian detection (lightweight) | 16.2ms | [Multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_pipeline.zip) | 27M |
| Pedestrian tracking (high precision) | 31.8ms | [Multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) | 182M |
| Pedestrian tracking (lightweight) | 21.0ms | [Multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_pipeline.zip) | 27M |
| Attribute recognition (high precision) | Single person8.5ms | [Object detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)<br> [Attribute recognition](https://bj.bcebos.com/v1/paddledet/models/pipeline/strongbaseline_r50_30e_pa100k.zip) | Object detection182M<br>Attribute recognition86M |
| Attribute recognition (lightweight) | Single person 7.1ms | [Object detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)<br> [Attribute recognition](https://bj.bcebos.com/v1/paddledet/models/pipeline/strongbaseline_r50_30e_pa100k.zip) | Object detection182M<br>Attribute recognition86M |
| Falling detection | Single person 10ms | [Multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) <br> [Keypoint detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/dark_hrnet_w32_256x192.zip) <br> [Behavior detection based on key points](https://bj.bcebos.com/v1/paddledet/models/pipeline/STGCN.zip) | Multi-object tracking182M<br>Keypoint detection101M<br>Behavior detection based on key points: 21.8M |
| Intrusion detection | 31.8ms | [Multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) | 182M |
| Fighting detection | 19.7ms | [Video classification](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) | 90M |
| Smoking detection | Single person 15.1ms | [Object detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)<br>[Object detection based on Human Id](https://bj.bcebos.com/v1/paddledet/models/pipeline/ppyoloe_crn_s_80e_smoking_visdrone.zip) | Object detection182M<br>Object detection based on Human ID: 27M |
| Phoning detection | Single person ms | [Object detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)<br>[Image classification based on Human ID](https://bj.bcebos.com/v1/paddledet/models/pipeline/PPHGNet_tiny_calling_halfbody.zip) | Object detection182M<br>Image classification based on Human ID45M |
Please refer to [docs](deploy/pipeline/README_en.md) for details.
</details>
<details>
<summary><b> 6. Industrial real-time vehicle analysis tool-PP Vehicle</b></summary>
| Task | End-to-End Speedms | Model | Size |
|:--------------------------------------:|:--------------------:|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------:|
| Vehicle detection (high precision) | 25.7ms | [object detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_ppvehicle.zip) | 182M |
| Vehicle detection (lightweight) | 13.2ms | [object detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_ppvehicle.zip) | 27M |
| Vehicle tracking (high precision) | 40ms | [multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_ppvehicle.zip) | 182M |
| Vehicle tracking (lightweight) | 25ms | [multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_pipeline.zip) | 27M |
| Plate Recognition | 4.68ms | [plate detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/ch_PP-OCRv3_det_infer.tar.gz)<br>[plate recognition](https://bj.bcebos.com/v1/paddledet/models/pipeline/ch_PP-OCRv3_rec_infer.tar.gz) | Plate detection3.9M<br>Plate recognition12M |
| Vehicle attribute | 7.31ms | [attribute recognition](https://bj.bcebos.com/v1/paddledet/models/pipeline/vehicle_attribute_model.zip) | 7.2M |
Please refer to [docs](deploy/pipeline/README_en.md) for details.
</details>
## <img src="https://user-images.githubusercontent.com/48054808/157828296-d5eb0ccb-23ea-40f5-9957-29853d7d13a9.png" width="20"/>Document tutorials
### Introductory tutorials
- [Installation](docs/tutorials/INSTALL_cn.md)
- [Quick start](docs/tutorials/QUICK_STARTED_cn.md)
- [Data preparation](docs/tutorials/data/README.md)
- [Geting Started on PaddleDetection](docs/tutorials/GETTING_STARTED_cn.md)
- [FAQ](docs/tutorials/FAQ)
### Advanced tutorials
- Configuration
- [RCNN Configuration](docs/tutorials/config_annotation/faster_rcnn_r50_fpn_1x_coco_annotation.md)
- [PP-YOLO Configuration](docs/tutorials/config_annotation/ppyolo_r50vd_dcn_1x_coco_annotation.md)
- Compression based on [PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim)
- [Pruning/Quantization/Distillation Tutorial](configs/slim)
- [Inference deployment](deploy/README.md)
- [Export model for inference](deploy/EXPORT_MODEL.md)
- [Paddle Inference deployment](deploy/README.md)
- [Inference deployment with Python](deploy/python)
- [Inference deployment with C++](deploy/cpp)
- [Paddle-Lite deployment](deploy/lite)
- [Paddle Serving deployment](deploy/serving)
- [ONNX model export](deploy/EXPORT_ONNX_MODEL.md)
- [Inference benchmark](deploy/BENCHMARK_INFER.md)
- Advanced development
- [Data processing module](docs/advanced_tutorials/READER.md)
- [New object detection models](docs/advanced_tutorials/MODEL_TECHNICAL.md)
- Custumization
- [Object detection](docs/advanced_tutorials/customization/detection.md)
- [Keypoint detection](docs/advanced_tutorials/customization/keypoint_detection.md)
- [Multiple object tracking](docs/advanced_tutorials/customization/pphuman_mot.md)
- [Action recognition](docs/advanced_tutorials/customization/action_recognotion/)
- [Attribute recognition](docs/advanced_tutorials/customization/pphuman_attribute.md)
### Courses
- **[Theoretical foundation] [Object detection 7-day camp](https://aistudio.baidu.com/aistudio/education/group/info/1617):** Overview of object detection tasks, details of RCNN series object detection algorithm and YOLO series object detection algorithm, PP-YOLO optimization strategy and case sharing, introduction and practice of AnchorFree series algorithm
- **[Industrial application] [AI Fast Track industrial object detection technology and application](https://aistudio.baidu.com/aistudio/education/group/info/23670):** Super object detection algorithms, real-time pedestrian analysis system PP-Human, breakdown and practice of object detection industrial application
- **[Industrial features] 2022.3.26** **[Smart City Industry Seven-Day Class](https://aistudio.baidu.com/aistudio/education/group/info/25620)** : Urban planning, Urban governance, Smart governance service, Traffic management, community governance.
- **[Academic exchange] 2022.9.27 [YOLO Vision Event](https://www.youtube.com/playlist?list=PL1FZnkj4ad1NHVC7CMc3pjSQ-JRK-Ev6O):** As the first YOLO-themed event, PaddleDetection was invited to communicate with the experts in the field of Computer Vision around the world.
### [Industrial tutorial examples](./industrial_tutorial/README.md)
- [Rotated object detection based on PP-YOLOE-R](https://aistudio.baidu.com/aistudio/projectdetail/5058293)
- [Aerial image detection based on PP-YOLOE-SOD](https://aistudio.baidu.com/aistudio/projectdetail/5036782)
- [Fall down recognition based on PP-Human v2](https://aistudio.baidu.com/aistudio/projectdetail/4606001)
- [Intelligent fitness recognition based on PP-TinyPose Plus](https://aistudio.baidu.com/aistudio/projectdetail/4385813)
- [Road litter detection based on PP-PicoDet Plus](https://aistudio.baidu.com/aistudio/projectdetail/3561097)
- [Visitor flow statistics based on FairMOT](https://aistudio.baidu.com/aistudio/projectdetail/2421822)
- [Guest analysis based on PP-Human](https://aistudio.baidu.com/aistudio/projectdetail/4537344)
- [More examples](./industrial_tutorial/README.md)
## <img title="" src="https://user-images.githubusercontent.com/48054808/157836473-1cf451fa-f01f-4148-ba68-b6d06d5da2f9.png" alt="" width="20"> Applications
- [Fitness app on android mobile](https://github.com/zhiboniu/pose_demo_android)
- [PP-Tracking GUI Visualization Interface](https://github.com/yangyudong2020/PP-Tracking_GUi)
## Recommended third-party tutorials
- [Deployment of PaddleDetection for Windows I ](https://zhuanlan.zhihu.com/p/268657833)
- [Deployment of PaddleDetection for Windows II](https://zhuanlan.zhihu.com/p/280206376)
- [Deployment of PaddleDetection on Jestson Nano](https://zhuanlan.zhihu.com/p/319371293)
- [How to deploy YOLOv3 model on Raspberry Pi for Helmet detection](https://github.com/PaddleCV-FAQ/PaddleDetection-FAQ/blob/main/Lite%E9%83%A8%E7%BD%B2/yolov3_for_raspi.md)
- [Use SSD-MobileNetv1 for a project -- From dataset to deployment on Raspberry Pi](https://github.com/PaddleCV-FAQ/PaddleDetection-FAQ/blob/main/Lite%E9%83%A8%E7%BD%B2/ssd_mobilenet_v1_for_raspi.md)
## <img src="https://user-images.githubusercontent.com/48054808/157835981-ef6057b4-6347-4768-8fcc-cd07fcc3d8b0.png" width="20"/> Version updates
Please refer to the[ Release note ](https://github.com/PaddlePaddle/Paddle/wiki/PaddlePaddle-2.3.0-Release-Note-EN)for more details about the updates
## <img title="" src="https://user-images.githubusercontent.com/48054808/157835345-f5d24128-abaf-4813-b793-d2e5bdc70e5a.png" alt="" width="20"> License
PaddlePaddle is provided under the [Apache 2.0 license](LICENSE)
## <img src="https://user-images.githubusercontent.com/48054808/157835796-08d4ffbc-87d9-4622-89d8-cf11a44260fc.png" width="20"/> Contribute your code
We appreciate your contributions and your feedback
- Thank [Mandroide](https://github.com/Mandroide) for code cleanup and
- Thank [FL77N](https://github.com/FL77N/) for `Sparse-RCNN`model
- Thank [Chen-Song](https://github.com/Chen-Song) for `Swin Faster-RCNN`model
- Thank [yangyudong](https://github.com/yangyudong2020), [hchhtc123](https://github.com/hchhtc123) for developing PP-Tracking GUI interface
- Thank Shigure19 for developing PP-TinyPose fitness APP
- Thank [manangoel99](https://github.com/manangoel99) for Wandb visualization methods
## <img src="https://user-images.githubusercontent.com/48054808/157835276-9aab9d1c-1c46-446b-bdd4-5ab75c5cfa48.png" width="20"/> Quote
```
@misc{ppdet2019,
title={PaddleDetection, Object detection and instance segmentation toolkit based on PaddlePaddle.},
author={PaddlePaddle Authors},
howpublished = {\url{https://github.com/PaddlePaddle/PaddleDetection}},
year={2019}
}
```

View File

View File

@@ -0,0 +1,125 @@
# 直播答疑第一期
### 答疑全程回放可以通过链接下载观看https://pan.baidu.com/s/168ouju4MxN5XJEb-GU1iAw 提取码: 92mw
## PaddleDetection框架/API问题
#### Q1. warmup能详细讲解下吗
A1. warmup是在训练初期学习率从0调整至预设学习率的过程设置可以参考[源码](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/ppdet/optimizer.py#L156)可以设置step数或epoch数
#### Q2. 如果类别不匹配 也能用pretrain weights吗
A2. 可以类别不匹配时模型会自动不加载shape不匹配的权重通常和类别数相关的权重位于head层
#### Q3. 请问nms_eta怎么用呀源码上没有写的很清楚API文档也没有细说
A3. 针对密集的场景nms_eta会在每轮动态的调整nms阈值避免过滤掉两个重叠程度很高但是属于不同物体的检测框具体可以参考[源码](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/operators/detection/multiclass_nms_op.cc#L139)默认为1通常无需设置
#### Q4. 请问anchor_cluster.py中的--size 是模型的input size 还是 实际使用图片的size
A4. 是实际推理时的图片尺寸一般可以参考TestReader中的image_shape的设置。
#### Q5. 请问为什么预测的坐标会出现负的值?
A5. 模型算法中是有可能负值的情况首先需要判断模型预测效果是否符合预期如果正常可以考虑在后处理中增加clip的操作限制输出box在图像中如果不正常说明模型训练效果欠佳需要进一步排查问题或调优
#### Q6. PaddleDetection 人脸检测blazeface模型一键式预测时load_params没有参数文件从哪里下载?
A6. blazeface的模型可以在[模型库](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4/configs/face_detection#%E6%A8%A1%E5%9E%8B%E5%BA%93)中下载到,如果想部署需要参考[步骤](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/deploy/EXPORT_MODEL.md) 导出模型
## PP-YOLOE问题
#### Q1. 训练PP-YOLOE的时候loss是越训练越高这种情况 是数据集的问题吗?
A1. 可以从以下几个方面排查
1. 数据: 首先确认数据集没问题,包括标注,类别等
2. 超参数base_lr根据batch_size调整遵守线性原则warmup_iters根据总的epoch数进行调整
3. 预训练参数可以加载官方提供的自在coco数据集上的预训练参数
4. 网络结构方面分析下box的分布情况 适当调整dfl的参数
#### Q2. 检测模型选型问题PicoDet、PP-YOLO系列如何选型
A2. PicoDet是针对移动端设备设计的模型是针对armx86等低算力设备上设计PP-YOLO是针对服务器端设计的模型英伟达N卡百度昆仑卡等。手机端无gpu桌面端优先PicoDet有高算力设备如N卡优先PP-YOLO系列对延时不敏感的场景更注重高精度优先PP-YOLO系列
#### Q3. ConvBNLayer中BN层的参数都不会使用L2DecayPP-YOLOE-s的其它部分都会按照配置文件的设置使用0.0005的L2Decay。是这样吗
A3. PP-YOLOE的backbone和neck部分使用了ConvBNLayer其中BN层不会使用L2Decay其他部分使用全局设置的0.0005的L2Decay
#### Q4. PP-YOLOE的Conv的bias也不使用decay吗
A4. PP-YOLOE的backbone和neck部分的Conv是没有bias参数的head部分的Conv bias使用全局decay
#### Q5. 在测速时为什么要用PaddleInference而不是直接加载模型测时间呢
A5. PaddleInference会将paddle导出的预测模型会前向算子做融合从而实现速度优化并且实际部署过程也是使用PaddleInference实现
#### Q6. PP-YOLOE系列在部署的时候前后处理是不是一样的啊
A6. PP-YOLO系列模型在部署时的前处理都是 decode-resize-normalize-permute的流程后处理方面PP-YOLOv2使用了Matrix NMSPP-YOLOE使用的是普通的NMS算法
#### Q7. 针对小目标和类别不平衡的数据集PP-YOLOE有什么调整策略吗
A7 针对小目标数据集可以适当增大ppyoloe的输入尺寸然后在模型中增加注意力机制目前基于PP-YOLOE的小目标检测正在开发中针对类别不平衡问题可以从数据采样的角度处理目前PP-YOLOE还没有专门针对类别不平衡问题的优化
## PP-Human问题
#### Q1. 请问pphuman用导出的模型18个点不是官方17个点去预测时报错是为什么
A1. 这个问题是关键点模型输出点的数量与行为识别模型不一致导致的。如果希望用18点模型预测除了关键点用18点模型以外还需要自建18点的动作识别模型。
#### Q2. 为什么官方导出模型设置的window_size是50
A2. 导出模型的设置与训练和预测的输入数据长度是一致的我们主要采用的数据集是ntu、企业提供的实际数据等等。在训练这个模型的时候我们对这些数据中摔倒的片段做了统计分析基本上每个动作片段持续的帧数大约是40~80左右。综合考虑到实际使用的延迟以及预测效果我们选择了50这个量级在我们的这部分数据上既能完整描述一个完整动作又不会使得延迟过大。
总的来说这个window_size的数值最好还是根据实际动作以及设备的情况进行选择。例如在某种设备上50帧的长度根本不足以包含一个完整的动作那么这个数值就需要扩大又或者某些动作持续时间很短50帧的长度包含了太多不相关的其他动作容易造成误识别那么这个数值可以适当缩小。
#### Q3. PP-Human中如何替换检测、跟踪、关键点模型
A3. 我们使用的模型都是PaddleDetection中模型进行导出得到的。理论上PP-Human所使用的模型都是可以直接替换的但是需要注意是流程和前后处理一样的模型。
#### Q4. PP-Human中的数据标注问题检测、跟踪、关键点、行为、属性标注工具推荐和标注步骤
A4. 标注工具:检测 labelme, labelImg, cvat 跟踪darklabelcvat关键点 labelmecvat。检测标注可以使用tools/x2coco.py转换成coco格式
#### Q5. PP-Human中如何更改label属性和动作识别
A5. 在PPHuman中动作识别被定义为基于骨骼点序列的分类问题目前我们已经开源的摔倒动作识别是一个二分类问题属性方面我们当前还暂时没有开放训练正在建设中
#### Q6. PP-Human的哪些功能支持单人、哪些支持多人
A6. PP-Human的功能实现基于一套流程检测->跟踪->具体功能。当前我们的具体功能模型每次处理的是单人的,即属性、动作等都是属于图像中每一个具体人的。但是基于这套流程下来,图像中的每一个人都得到了处理的。所以单人、多人实际都是一样支持的。
#### Q7. PP-Human对视频流预测的支持及服务化部署
A7. 目前正在建设当中,下个版本会支持这部分功能
#### Q8. 在使用pphuman训练自己的数据集时训练完进行测试时可视化的标签如何更改没有更改的情况下还是falling
A8. 可视化的函数位于https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/deploy/python/visualize.py#L368,这里在可视化的时候将 action_text替换为期望的类别即可。
#### Q9. 关键点检测可以实现一个连贯动作的检测吗,比如健身规范
A9. 基于关键点是可以实现的。这里可以有不同思路去做:
1. 如果是期望判定动作规范的程度且这个动作可以很好的描述。那么可以在关键点模型获得的坐标的基础上人工增加逻辑判断即可。这里我们提供一个安卓的健身APP示例https://github.com/zhiboniu/pose_demo_android 其中实现判定各项动作的逻辑可以参考https://github.com/zhiboniu/pose_demo_android/blob/release/1.0/app/src/main/cpp/pose_action.cc 。
2. 当一个动作较难用逻辑去描述的时候,可能参考现有摔倒检测的案例,训练一个识别健身动作的模型,但对收集数据的要求会比较高。
#### Q10. 有遮挡的生产环境中梯子,可以用关键点检测判断人员上下梯动作是否合规
A10. 这个问题需要视遮挡的程度而定,如果遮挡过于严重时关键点检测模型的效果会大打折扣,从而导致行为的判断失准。此外,由于基于关键点的方案抹去了外观信息,如果只是从人物本身的动作上去做判断,那么在遮挡不严重的场景下是可以的。反之,如果梯子这个物体是判断动作是否合规的必要元素,那么这个方案其实不一定是最佳选择。
#### Q11. 关键点做的行为识别并不是时序上的动作识别吗
A11. 是时序的动作识别。这里是将一定时间范围内的每一帧关键点坐标组成一个时序的关键点序列,再通过行为识别模型去预测这个序列所属的行为类别。
## 检测算法问题
#### Q1. 大图片小目标 最终推理的图片也是大图片 怎么预处理呀
A1. 小目标问题常见的处理方式是切图以及增大网络输入尺寸如果使用基于anchor的检测算法可以通过对目标物体大小聚类生成anchor参考[脚本](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/tools/anchor_cluster.py) 目前基于PP-YOLOE的小目标检测正在开发中
#### Q2. 想问下大的目标对象怎么检测,比如发票
A2. 如果使用基于anchor的检测算法可以通过对目标物体大小聚类生成anchor参考[脚本](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/tools/anchor_cluster.py);另外可以增强深层特征提升大物体检测效果
#### Q3. 在做预测时发现预测框特别多有的框的置信度甚至低于0.1,请问如果将这种框过滤掉?也就是训练模型时就把这些极低置信度的预测结果过滤掉,避免在推理部署时,做不必要的计算,从而影响推理速度。
A3. 后处理部分有两个过滤1是提取置信度最高的Top 100个框做nms。2是根据设定阈值threshold进行过滤。如果你可以确认图片上目标相对比较少<10个可以调整Top 100这个值到50或者更低这样可以加速nms部分的计算其次调整threshold这个影响最终检测的准确度和召回率的效果
#### Q4. 正负样本的比例一般怎么设计
A4. 在PaddleDetection中支持负样本训练TrainDataset下设置allow_empty: true即可通过数据集测试负样本比例在0.3时对模型提升效果最明显
## 压缩部署问题
#### Q1. PaddleDetection训练的模型导出inference model后在做推理部署的时候前后处理相关代码如何编写有什么参考教程吗
A1. 目前PaddleDetection下的网络模型大部分都能够支持c++ inference不同的处理方式针对不同功能例如PP-YOLOE速度测试不包含后处理PicoDet为支持不同的第三方推理引擎会设置是否导出nms
object_detector.cc是针对所有检测模型的流程其中前处理大部分都是decode-resize-normalize-permute 部分网络会加入padding的操作大部分模型的后处理操作都放在模型里面了picodet有单独提供nms的后处理代码
检测模型的输入统一为imageim_shapescale_factor 如果模型中没有使用im_shape输出个数会减少但是整套预处理流程不需要额外开发
#### Q2. 针对TensorRT的加速问题fp16在v100确实可以但是耗时好像有点偏差我在1080ti上单张图片跑1000次耗时50s还是float32的可是在v100上float16耗时97
A2. 目前PPYOLOE等模型的速度都有在V100上使用TensorRT FP16测试关于速度测试有以下几个方面可以排查
1. 速度测试时是否正确设置warmup以避免过长的启动时间影响速度测试准确度
2. 在开启TensorRT时生成engine文件的过程耗时较长可以在https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/deploy/python/infer.py#L745 中将use_static设置为True
#### Q3. PaddleDetection已经支持了在线量化一些模型比如想训练其他的一个新模型是不是可以轻松用起来qat如果不能为什么只能支持很有限的模型而qat其他模型总会出各种各样的问题原因是什么
A3. 目前PaddleDetection模型很多只能针对部分模型开源了QAT的config其他模型也是支持QAT的只是配置文件没有覆盖到如果量化报错通常是配置问题检测模型一般建议跳过head最后一个conv如果想要跳过某些层量化可以设置skip_quant参考[代码](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/ppdet/modeling/heads/yolo_head.py#L97)

View File

@@ -0,0 +1,47 @@
# 通用检测benchmark测试脚本说明
```
├── benchmark
│ ├── analysis_log.py
│ ├── prepare.sh
│ ├── README.md
│ ├── run_all.sh
│ ├── run_benchmark.sh
```
## 脚本说明
### prepare.sh
相关数据准备脚本,完成数据、模型的自动下载
### run_all.sh
主要运行脚本,可完成所有相关模型的测试方案
### run_benchmark.sh
单模型运行脚本,可完成指定模型的测试方案
## Docker 运行环境
* docker image: registry.baidubce.com/paddlepaddle/paddle:2.1.2-gpu-cuda10.2-cudnn7
* paddle = 2.1.2
* python = 3.7
## 运行benchmark测试
### 运行所有模型
```
git clone https://github.com/PaddlePaddle/PaddleDetection.git
cd PaddleDetection
bash benchmark/run_all.sh
```
### 运行指定模型
* Usagebash run_benchmark.sh ${run_mode} ${batch_size} ${fp_item} ${max_epoch} ${model_name}
* model_name: faster_rcnn, fcos, deformable_detr, gfl, hrnet, higherhrnet, solov2, jde, fairmot
```
git clone https://github.com/PaddlePaddle/PaddleDetection.git
cd PaddleDetection
bash benchmark/prepare.sh
# 单卡
CUDA_VISIBLE_DEVICES=0 bash benchmark/run_benchmark.sh sp 2 fp32 1 faster_rcnn
# 多卡
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 bash benchmark/run_benchmark.sh mp 2 fp32 1 faster_rcnn
```

View File

@@ -0,0 +1,48 @@
_BASE_: [
'../../configs/datasets/coco_detection.yml',
'../../configs/runtime.yml',
'../../configs/faster_rcnn/_base_/optimizer_1x.yml',
'../../configs/faster_rcnn/_base_/faster_rcnn_r50_fpn.yml',
]
weights: output/faster_rcnn_r50_fpn_1x_coco/model_final
worker_num: 2
TrainReader:
sample_transforms:
- Decode: {}
- Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
- RandomFlip: {prob: 0.5}
- NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 1
shuffle: true
drop_last: true
collate_batch: false
EvalReader:
sample_transforms:
- Decode: {}
- Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
- NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 1
shuffle: false
drop_last: false
TestReader:
sample_transforms:
- Decode: {}
- Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
- NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 1
shuffle: false
drop_last: false

View File

@@ -0,0 +1,17 @@
#!/usr/bin/env bash
pip install -U pip Cython
pip install -r requirements.txt
mv ./dataset/coco/download_coco.py . && rm -rf ./dataset/coco/* && mv ./download_coco.py ./dataset/coco/
# prepare lite train data
wget -nc -P ./dataset/coco/ https://paddledet.bj.bcebos.com/data/coco_benchmark.tar
cd ./dataset/coco/ && tar -xvf coco_benchmark.tar && mv -u coco_benchmark/* .
rm -rf coco_benchmark/
cd ../../
rm -rf ./dataset/mot/*
# prepare mot mini train data
wget -nc -P ./dataset/mot/ https://paddledet.bj.bcebos.com/data/mot_benchmark.tar
cd ./dataset/mot/ && tar -xvf mot_benchmark.tar && mv -u mot_benchmark/* .
rm -rf mot_benchmark/

View File

@@ -0,0 +1,47 @@
# Use docker: paddlepaddle/paddle:latest-gpu-cuda10.1-cudnn7 paddle=2.1.2 python3.7
#
# Usage:
# git clone https://github.com/PaddlePaddle/PaddleDetection.git
# cd PaddleDetection
# bash benchmark/run_all.sh
log_path=${LOG_PATH_INDEX_DIR:-$(pwd)} # benchmark系统指定该参数,不需要跑profile时,log_path指向存speed的目录
# run prepare.sh
bash benchmark/prepare.sh
model_name_list=(faster_rcnn fcos deformable_detr gfl hrnet higherhrnet solov2 jde fairmot)
fp_item_list=(fp32)
max_epoch=2
for model_item in ${model_name_list[@]}; do
for fp_item in ${fp_item_list[@]}; do
case ${model_item} in
faster_rcnn) bs_list=(1 8) ;;
fcos) bs_list=(2) ;;
deformable_detr) bs_list=(2) ;;
gfl) bs_list=(2) ;;
hrnet) bs_list=(64) ;;
higherhrnet) bs_list=(20) ;;
solov2) bs_list=(2) ;;
jde) bs_list=(4) ;;
fairmot) bs_list=(6) ;;
*) echo "wrong model_name"; exit 1;
esac
for bs_item in ${bs_list[@]}
do
run_mode=sp
log_name=detection_${model_item}_bs${bs_item}_${fp_item} # 如:clas_MobileNetv1_mp_bs32_fp32_8
echo "index is speed, 1gpus, begin, ${log_name}"
CUDA_VISIBLE_DEVICES=0 bash benchmark/run_benchmark.sh ${run_mode} ${bs_item} \
${fp_item} ${max_epoch} ${model_item} | tee ${log_path}/${log_name}_speed_1gpus 2>&1
sleep 60
run_mode=mp
log_name=detection_${model_item}_bs${bs_item}_${fp_item} # 如:clas_MobileNetv1_mp_bs32_fp32_8
echo "index is speed, 8gpus, run_mode is multi_process, begin, ${log_name}"
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 bash benchmark/run_benchmark.sh ${run_mode} \
${bs_item} ${fp_item} ${max_epoch} ${model_item}| tee ${log_path}/${log_name}_speed_8gpus8p 2>&1
sleep 60
done
done
done

View File

@@ -0,0 +1,92 @@
#!/usr/bin/env bash
set -xe
# UsageCUDA_VISIBLE_DEVICES=0 bash benchmark/run_benchmark.sh ${run_mode} ${batch_size} ${fp_item} ${max_epoch} ${model_name}
python="python3.7"
# Parameter description
function _set_params(){
run_mode=${1:-"sp"} # sp|mp
batch_size=${2:-"2"}
fp_item=${3:-"fp32"} # fp32|fp16
max_epoch=${4:-"1"}
model_item=${5:-"model_item"}
run_log_path=${TRAIN_LOG_DIR:-$(pwd)}
# 添加日志解析需要的参数
base_batch_size=${batch_size}
mission_name="目标检测"
direction_id="0"
ips_unit="images/s"
skip_steps=10 # 解析日志有些模型前几个step耗时长需要跳过 (必填)
keyword="ips:" # 解析日志,筛选出数据所在行的关键字 (必填)
index="1"
model_name=${model_item}_bs${batch_size}_${fp_item}
device=${CUDA_VISIBLE_DEVICES//,/ }
arr=(${device})
num_gpu_devices=${#arr[*]}
log_file=${run_log_path}/${model_item}_${run_mode}_bs${batch_size}_${fp_item}_${num_gpu_devices}
}
function _train(){
echo "Train on ${num_gpu_devices} GPUs"
echo "current CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES, gpus=$num_gpu_devices, batch_size=$batch_size"
# set runtime params
set_optimizer_lr_sp=" "
set_optimizer_lr_mp=" "
# parse model_item
case ${model_item} in
faster_rcnn) model_yml="benchmark/configs/faster_rcnn_r50_fpn_1x_coco.yml"
set_optimizer_lr_sp="LearningRate.base_lr=0.001" ;;
fcos) model_yml="configs/fcos/fcos_r50_fpn_1x_coco.yml"
set_optimizer_lr_sp="LearningRate.base_lr=0.001" ;;
deformable_detr) model_yml="configs/deformable_detr/deformable_detr_r50_1x_coco.yml" ;;
gfl) model_yml="configs/gfl/gfl_r50_fpn_1x_coco.yml"
set_optimizer_lr_sp="LearningRate.base_lr=0.001" ;;
hrnet) model_yml="configs/keypoint/hrnet/hrnet_w32_256x192.yml" ;;
higherhrnet) model_yml="configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_512.yml" ;;
solov2) model_yml="configs/solov2/solov2_r50_fpn_1x_coco.yml" ;;
jde) model_yml="configs/mot/jde/jde_darknet53_30e_1088x608.yml" ;;
fairmot) model_yml="configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml" ;;
*) echo "Undefined model_item"; exit 1;
esac
set_batch_size="TrainReader.batch_size=${batch_size}"
set_max_epoch="epoch=${max_epoch}"
set_log_iter="log_iter=1"
if [ ${fp_item} = "fp16" ]; then
set_fp_item="--fp16"
else
set_fp_item=" "
fi
case ${run_mode} in
sp) train_cmd="${python} -u tools/train.py -c ${model_yml} ${set_fp_item} \
-o ${set_batch_size} ${set_max_epoch} ${set_log_iter} ${set_optimizer_lr_sp}" ;;
mp) rm -rf mylog
train_cmd="${python} -m paddle.distributed.launch --log_dir=./mylog \
--gpus=${CUDA_VISIBLE_DEVICES} tools/train.py -c ${model_yml} ${set_fp_item} \
-o ${set_batch_size} ${set_max_epoch} ${set_log_iter} ${set_optimizer_lr_mp}"
log_parse_file="mylog/workerlog.0" ;;
*) echo "choose run_mode(sp or mp)"; exit 1;
esac
timeout 15m ${train_cmd} > ${log_file} 2>&1
if [ $? -ne 0 ];then
echo -e "${train_cmd}, FAIL"
export job_fail_flag=1
else
echo -e "${train_cmd}, SUCCESS"
export job_fail_flag=0
fi
kill -9 `ps -ef|grep 'python'|awk '{print $2}'`
if [ $run_mode = "mp" -a -d mylog ]; then
rm ${log_file}
cp mylog/workerlog.0 ${log_file}
fi
}
source ${BENCHMARK_ROOT}/scripts/run_model.sh # parses benchmark-compliant logs with the analysis.py script; for integration it can be downloaded from the benchmark repo: https://github.com/PaddlePaddle/benchmark/blob/master/scripts/run_model.sh; comment this line out if you only want to produce training logs without integration, and re-enable it before submitting
_set_params $@
# _train # uncomment to produce training logs only, without parsing
_run # defined in run_model.sh and calls _train internally; comment this line out if you only want training logs without integration, and re-enable it before submitting

View File

@@ -0,0 +1,28 @@
# Cascade R-CNN: High Quality Object Detection and Instance Segmentation
## Model Zoo
| Backbone | Type | Images/GPU | Lr schd | Inf time (fps) | Box AP | Mask AP | Download | Config |
| :------------------- | :------------- | :-----: | :-----: | :------------: | :-----: | :-----: | :-----------------------------------------------------: | :-----: |
| ResNet50-FPN | Cascade Faster | 1 | 1x | ---- | 41.1 | - | [model](https://paddledet.bj.bcebos.com/models/cascade_rcnn_r50_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/cascade_rcnn/cascade_rcnn_r50_fpn_1x_coco.yml) |
| ResNet50-FPN | Cascade Mask | 1 | 1x | ---- | 41.8 | 36.3 | [model](https://paddledet.bj.bcebos.com/models/cascade_mask_rcnn_r50_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/cascade_rcnn/cascade_mask_rcnn_r50_fpn_1x_coco.yml) |
| ResNet50-vd-SSLDv2-FPN | Cascade Faster | 1 | 1x | ---- | 44.4 | - | [model](https://paddledet.bj.bcebos.com/models/cascade_rcnn_r50_vd_fpn_ssld_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/cascade_rcnn/cascade_rcnn_r50_vd_fpn_ssld_1x_coco.yml) |
| ResNet50-vd-SSLDv2-FPN | Cascade Faster | 1 | 2x | ---- | 45.0 | - | [model](https://paddledet.bj.bcebos.com/models/cascade_rcnn_r50_vd_fpn_ssld_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/cascade_rcnn/cascade_rcnn_r50_vd_fpn_ssld_2x_coco.yml) |
| ResNet50-vd-SSLDv2-FPN | Cascade Mask | 1 | 1x | ---- | 44.9 | 39.1 | [model](https://paddledet.bj.bcebos.com/models/cascade_mask_rcnn_r50_vd_fpn_ssld_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/cascade_rcnn/cascade_mask_rcnn_r50_vd_fpn_ssld_1x_coco.yml) |
| ResNet50-vd-SSLDv2-FPN | Cascade Mask | 1 | 2x | ---- | 45.7 | 39.7 | [model](https://paddledet.bj.bcebos.com/models/cascade_mask_rcnn_r50_vd_fpn_ssld_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/cascade_rcnn/cascade_mask_rcnn_r50_vd_fpn_ssld_2x_coco.yml) |
## Citations
```
@article{Cai_2019,
title={Cascade R-CNN: High Quality Object Detection and Instance Segmentation},
ISSN={1939-3539},
url={http://dx.doi.org/10.1109/tpami.2019.2956516},
DOI={10.1109/tpami.2019.2956516},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
publisher={Institute of Electrical and Electronics Engineers (IEEE)},
author={Cai, Zhaowei and Vasconcelos, Nuno},
year={2019},
pages={11}
}
```

View File

@@ -0,0 +1,40 @@
worker_num: 2
TrainReader:
sample_transforms:
- Decode: {}
- RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
- RandomFlip: {prob: 0.5}
- NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 1
shuffle: true
drop_last: true
collate_batch: false
EvalReader:
sample_transforms:
- Decode: {}
- Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
- NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 1
shuffle: false
drop_last: false
TestReader:
sample_transforms:
- Decode: {}
- Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
- NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 1
shuffle: false
drop_last: false

View File

@@ -0,0 +1,40 @@
worker_num: 2
TrainReader:
sample_transforms:
- Decode: {}
- RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
- RandomFlip: {prob: 0.5}
- NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 1
shuffle: true
drop_last: true
collate_batch: false
EvalReader:
sample_transforms:
- Decode: {}
- Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
- NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 1
shuffle: false
drop_last: false
TestReader:
sample_transforms:
- Decode: {}
- Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
- NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- Permute: {}
batch_transforms:
- PadBatch: {pad_to_stride: 32}
batch_size: 1
shuffle: false
drop_last: false

View File

@@ -0,0 +1,97 @@
architecture: CascadeRCNN
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
CascadeRCNN:
backbone: ResNet
neck: FPN
rpn_head: RPNHead
bbox_head: CascadeHead
mask_head: MaskHead
# post process
bbox_post_process: BBoxPostProcess
mask_post_process: MaskPostProcess
ResNet:
# index 0 stands for res2
depth: 50
norm_type: bn
freeze_at: 0
return_idx: [0,1,2,3]
num_stages: 4
FPN:
out_channel: 256
RPNHead:
anchor_generator:
aspect_ratios: [0.5, 1.0, 2.0]
anchor_sizes: [[32], [64], [128], [256], [512]]
strides: [4, 8, 16, 32, 64]
rpn_target_assign:
batch_size_per_im: 256
fg_fraction: 0.5
negative_overlap: 0.3
positive_overlap: 0.7
use_random: True
train_proposal:
min_size: 0.0
nms_thresh: 0.7
pre_nms_top_n: 2000
post_nms_top_n: 2000
topk_after_collect: True
test_proposal:
min_size: 0.0
nms_thresh: 0.7
pre_nms_top_n: 1000
post_nms_top_n: 1000
CascadeHead:
head: CascadeTwoFCHead
roi_extractor:
resolution: 7
sampling_ratio: 0
aligned: True
bbox_assigner: BBoxAssigner
BBoxAssigner:
batch_size_per_im: 512
bg_thresh: 0.5
fg_thresh: 0.5
fg_fraction: 0.25
cascade_iou: [0.5, 0.6, 0.7]
use_random: True
CascadeTwoFCHead:
out_channel: 1024
BBoxPostProcess:
decode:
name: RCNNBox
prior_box_var: [30.0, 30.0, 15.0, 15.0]
nms:
name: MultiClassNMS
keep_top_k: 100
score_threshold: 0.05
nms_threshold: 0.5
MaskHead:
head: MaskFeat
roi_extractor:
resolution: 14
sampling_ratio: 0
aligned: True
mask_assigner: MaskAssigner
share_bbox_feat: False
MaskFeat:
num_convs: 4
out_channel: 256
MaskAssigner:
mask_resolution: 28
MaskPostProcess:
binary_thresh: 0.5

View File

@@ -0,0 +1,75 @@
architecture: CascadeRCNN
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
CascadeRCNN:
backbone: ResNet
neck: FPN
rpn_head: RPNHead
bbox_head: CascadeHead
# post process
bbox_post_process: BBoxPostProcess
ResNet:
# index 0 stands for res2
depth: 50
norm_type: bn
freeze_at: 0
return_idx: [0,1,2,3]
num_stages: 4
FPN:
out_channel: 256
RPNHead:
anchor_generator:
aspect_ratios: [0.5, 1.0, 2.0]
anchor_sizes: [[32], [64], [128], [256], [512]]
strides: [4, 8, 16, 32, 64]
rpn_target_assign:
batch_size_per_im: 256
fg_fraction: 0.5
negative_overlap: 0.3
positive_overlap: 0.7
use_random: True
train_proposal:
min_size: 0.0
nms_thresh: 0.7
pre_nms_top_n: 2000
post_nms_top_n: 2000
topk_after_collect: True
test_proposal:
min_size: 0.0
nms_thresh: 0.7
pre_nms_top_n: 1000
post_nms_top_n: 1000
CascadeHead:
head: CascadeTwoFCHead
roi_extractor:
resolution: 7
sampling_ratio: 0
aligned: True
bbox_assigner: BBoxAssigner
BBoxAssigner:
batch_size_per_im: 512
bg_thresh: 0.5
fg_thresh: 0.5
fg_fraction: 0.25
cascade_iou: [0.5, 0.6, 0.7]
use_random: True
CascadeTwoFCHead:
out_channel: 1024
BBoxPostProcess:
decode:
name: RCNNBox
prior_box_var: [30.0, 30.0, 15.0, 15.0]
nms:
name: MultiClassNMS
keep_top_k: 100
score_threshold: 0.05
nms_threshold: 0.5

View File

@@ -0,0 +1,19 @@
epoch: 12
LearningRate:
base_lr: 0.01
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [8, 11]
- !LinearWarmup
start_factor: 0.001
steps: 1000
OptimizerBuilder:
optimizer:
momentum: 0.9
type: Momentum
regularizer:
factor: 0.0001
type: L2

View File

@@ -0,0 +1,8 @@
_BASE_: [
'../datasets/coco_instance.yml',
'../runtime.yml',
'_base_/optimizer_1x.yml',
'_base_/cascade_mask_rcnn_r50_fpn.yml',
'_base_/cascade_mask_fpn_reader.yml',
]
weights: output/cascade_mask_rcnn_r50_fpn_1x_coco/model_final

View File

@@ -0,0 +1,18 @@
_BASE_: [
'../datasets/coco_instance.yml',
'../runtime.yml',
'_base_/optimizer_1x.yml',
'_base_/cascade_mask_rcnn_r50_fpn.yml',
'_base_/cascade_mask_fpn_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
weights: output/cascade_mask_rcnn_r50_vd_fpn_ssld_1x_coco/model_final
ResNet:
depth: 50
variant: d
norm_type: bn
freeze_at: 0
return_idx: [0,1,2,3]
num_stages: 4
lr_mult_list: [0.05, 0.05, 0.1, 0.15]

View File

@@ -0,0 +1,29 @@
_BASE_: [
'../datasets/coco_instance.yml',
'../runtime.yml',
'_base_/optimizer_1x.yml',
'_base_/cascade_mask_rcnn_r50_fpn.yml',
'_base_/cascade_mask_fpn_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
weights: output/cascade_mask_rcnn_r50_vd_fpn_ssld_2x_coco/model_final
ResNet:
depth: 50
variant: d
norm_type: bn
freeze_at: 0
return_idx: [0,1,2,3]
num_stages: 4
lr_mult_list: [0.05, 0.05, 0.1, 0.15]
epoch: 24
LearningRate:
base_lr: 0.01
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [12, 22]
- !LinearWarmup
start_factor: 0.1
steps: 1000

View File

@@ -0,0 +1,8 @@
_BASE_: [
'../datasets/coco_detection.yml',
'../runtime.yml',
'_base_/optimizer_1x.yml',
'_base_/cascade_rcnn_r50_fpn.yml',
'_base_/cascade_fpn_reader.yml',
]
weights: output/cascade_rcnn_r50_fpn_1x_coco/model_final

View File

@@ -0,0 +1,18 @@
_BASE_: [
'../datasets/coco_detection.yml',
'../runtime.yml',
'_base_/optimizer_1x.yml',
'_base_/cascade_rcnn_r50_fpn.yml',
'_base_/cascade_fpn_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
weights: output/cascade_rcnn_r50_vd_fpn_ssld_1x_coco/model_final
ResNet:
depth: 50
variant: d
norm_type: bn
freeze_at: 0
return_idx: [0,1,2,3]
num_stages: 4
lr_mult_list: [0.05, 0.05, 0.1, 0.15]

View File

@@ -0,0 +1,29 @@
_BASE_: [
'../datasets/coco_detection.yml',
'../runtime.yml',
'_base_/optimizer_1x.yml',
'_base_/cascade_rcnn_r50_fpn.yml',
'_base_/cascade_fpn_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
weights: output/cascade_rcnn_r50_vd_fpn_ssld_2x_coco/model_final
ResNet:
depth: 50
variant: d
norm_type: bn
freeze_at: 0
return_idx: [0,1,2,3]
num_stages: 4
lr_mult_list: [0.05, 0.05, 0.1, 0.15]
epoch: 24
LearningRate:
base_lr: 0.01
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [12, 22]
- !LinearWarmup
start_factor: 0.1
steps: 1000

View File

@@ -0,0 +1,37 @@
English | [简体中文](README_cn.md)
# CenterNet (CenterNet: Objects as Points)
## Table of Contents
- [Introduction](#Introduction)
- [Model Zoo](#Model_Zoo)
- [Citations](#Citations)
## Introduction
[CenterNet](http://arxiv.org/abs/1904.07850) is an anchor-free detector which models an object as a single point -- the center point of its bounding box. The detector uses keypoint estimation to find center points and regresses all other object properties. The center-point-based approach, CenterNet, is end-to-end differentiable, simpler, faster, and more accurate than corresponding bounding-box-based detectors.
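The decoding this implies is simple enough to sketch: local maxima of the center heatmap survive a 3x3 max-pooling NMS, and the regressed sub-pixel offset and width/height at each surviving peak are read out as a box. Below is a minimal, illustrative numpy sketch (single class, hypothetical helper names; the repo's actual logic lives in `CenterNetPostProcess`):
```python
import numpy as np

def decode_centers(heatmap, wh, offset, k=100, thresh=0.3):
    """Minimal CenterNet-style decoding sketch (single class).

    heatmap: (H, W) center-point scores in [0, 1]
    wh:      (H, W, 2) regressed box width/height per location
    offset:  (H, W, 2) sub-pixel center offset per location
    Returns (x1, y1, x2, y2, score) boxes in feature-map coordinates.
    """
    H, W = heatmap.shape
    # a 3x3 max-pool acts as NMS: only local peaks survive
    padded = np.pad(heatmap, 1)
    pooled = np.max([padded[dy:dy + H, dx:dx + W]
                     for dy in range(3) for dx in range(3)], axis=0)
    peaks = np.where(heatmap == pooled, heatmap, 0.0).ravel()
    boxes = []
    for idx in np.argsort(peaks)[::-1][:k]:  # top-k peaks
        score = peaks[idx]
        if score < thresh:
            break
        y, x = divmod(idx, W)
        cx, cy = x + offset[y, x, 0], y + offset[y, x, 1]
        w, h = wh[y, x]
        boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2, score))
    return boxes

# toy check: a single synthetic peak at (row 3, col 4)
hm = np.zeros((8, 8)); hm[3, 4] = 0.9
print(decode_centers(hm, np.full((8, 8, 2), 2.0), np.zeros((8, 8, 2))))
```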
## Model Zoo
### CenterNet Results on COCO-val 2017
| backbone | input shape | mAP | FPS | download | config |
| :--------------| :------- | :----: | :------: | :----: |:-----: |
| DLA-34(paper) | 512x512 | 37.4 | - | - | - |
| DLA-34 | 512x512 | 37.6 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_dla34_140e_coco.pdparams) | [config](./centernet_dla34_140e_coco.yml) |
| ResNet50 + DLAUp | 512x512 | 38.9 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_r50_140e_coco.pdparams) | [config](./centernet_r50_140e_coco.yml) |
| MobileNetV1 + DLAUp | 512x512 | 28.2 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_mbv1_140e_coco.pdparams) | [config](./centernet_mbv1_140e_coco.yml) |
| MobileNetV3_small + DLAUp | 512x512 | 17 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_mbv3_small_140e_coco.pdparams) | [config](./centernet_mbv3_small_140e_coco.yml) |
| MobileNetV3_large + DLAUp | 512x512 | 27.1 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_mbv3_large_140e_coco.pdparams) | [config](./centernet_mbv3_large_140e_coco.yml) |
| ShuffleNetV2 + DLAUp | 512x512 | 23.8 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_shufflenetv2_140e_coco.pdparams) | [config](./centernet_shufflenetv2_140e_coco.yml) |
## Citations
```
@article{zhou2019objects,
title={Objects as points},
author={Zhou, Xingyi and Wang, Dequan and Kr{\"a}henb{\"u}hl, Philipp},
journal={arXiv preprint arXiv:1904.07850},
year={2019}
}
```

View File

@@ -0,0 +1,36 @@
简体中文 | [English](README.md)
# CenterNet (CenterNet: Objects as Points)
## Table of Contents
- [Introduction](#Introduction)
- [Model Zoo](#Model_Zoo)
- [Citations](#Citations)
## Introduction
[CenterNet](http://arxiv.org/abs/1904.07850) is an anchor-free detector that represents an object as a single point, the center of its bounding box. It locates center points via keypoint estimation and regresses the remaining object properties. This center-point-based approach is end-to-end trainable and more efficient than anchor-based detectors.
## Model Zoo
### CenterNet Results on COCO-val 2017
| backbone | input shape | mAP | FPS | download | config |
| :--------------| :------- | :----: | :------: | :----: |:-----: |
| DLA-34(paper) | 512x512 | 37.4 | - | - | - |
| DLA-34 | 512x512 | 37.6 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_dla34_140e_coco.pdparams) | [config](./centernet_dla34_140e_coco.yml) |
| ResNet50 + DLAUp | 512x512 | 38.9 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_r50_140e_coco.pdparams) | [config](./centernet_r50_140e_coco.yml) |
| MobileNetV1 + DLAUp | 512x512 | 28.2 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_mbv1_140e_coco.pdparams) | [config](./centernet_mbv1_140e_coco.yml) |
| MobileNetV3_small + DLAUp | 512x512 | 17 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_mbv3_small_140e_coco.pdparams) | [config](./centernet_mbv3_small_140e_coco.yml) |
| MobileNetV3_large + DLAUp | 512x512 | 27.1 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_mbv3_large_140e_coco.pdparams) | [config](./centernet_mbv3_large_140e_coco.yml) |
| ShuffleNetV2 + DLAUp | 512x512 | 23.8 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_shufflenetv2_140e_coco.pdparams) | [config](./centernet_shufflenetv2_140e_coco.yml) |
## Citations
```
@article{zhou2019objects,
title={Objects as points},
author={Zhou, Xingyi and Wang, Dequan and Kr{\"a}henb{\"u}hl, Philipp},
journal={arXiv preprint arXiv:1904.07850},
year={2019}
}
```

View File

@@ -0,0 +1,22 @@
architecture: CenterNet
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/DLA34_pretrain.pdparams
CenterNet:
backbone: DLA
neck: CenterNetDLAFPN
head: CenterNetHead
post_process: CenterNetPostProcess
DLA:
depth: 34
CenterNetDLAFPN:
down_ratio: 4
CenterNetHead:
head_planes: 256
regress_ltrb: False
CenterNetPostProcess:
max_per_img: 100
regress_ltrb: False

View File

@@ -0,0 +1,34 @@
architecture: CenterNet
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_pretrained.pdparams
norm_type: sync_bn
use_ema: true
ema_decay: 0.9998
CenterNet:
backbone: ResNet
neck: CenterNetDLAFPN
head: CenterNetHead
post_process: CenterNetPostProcess
ResNet:
depth: 50
variant: d
return_idx: [0, 1, 2, 3]
freeze_at: -1
norm_decay: 0.
dcn_v2_stages: [3]
CenterNetDLAFPN:
first_level: 0
last_level: 4
down_ratio: 4
dcn_v2: False
CenterNetHead:
head_planes: 256
regress_ltrb: False
CenterNetPostProcess:
max_per_img: 100
regress_ltrb: False

View File

@@ -0,0 +1,35 @@
worker_num: 4
TrainReader:
inputs_def:
image_shape: [3, 512, 512]
sample_transforms:
- Decode: {}
- FlipWarpAffine: {keep_res: False, input_h: 512, input_w: 512, use_random: True}
- CenterRandColor: {}
- Lighting: {eigval: [0.2141788, 0.01817699, 0.00341571], eigvec: [[-0.58752847, -0.69563484, 0.41340352], [-0.5832747, 0.00994535, -0.81221408], [-0.56089297, 0.71832671, 0.41158938]]}
- NormalizeImage: {mean: [0.40789655, 0.44719303, 0.47026116], std: [0.2886383 , 0.27408165, 0.27809834], is_scale: False}
- Permute: {}
- Gt2CenterNetTarget: {down_ratio: 4, max_objs: 128}
batch_size: 16
shuffle: True
drop_last: True
use_shared_memory: True
EvalReader:
sample_transforms:
- Decode: {}
- WarpAffine: {keep_res: True, input_h: 512, input_w: 512}
- NormalizeImage: {mean: [0.40789655, 0.44719303, 0.47026116], std: [0.2886383 , 0.27408165, 0.27809834]}
- Permute: {}
batch_size: 1
TestReader:
inputs_def:
image_shape: [3, 512, 512]
sample_transforms:
- Decode: {}
- WarpAffine: {keep_res: True, input_h: 512, input_w: 512}
- NormalizeImage: {mean: [0.40789655, 0.44719303, 0.47026116], std: [0.2886383 , 0.27408165, 0.27809834], is_scale: True}
- Permute: {}
batch_size: 1

View File

@@ -0,0 +1,14 @@
epoch: 140
LearningRate:
base_lr: 0.0005
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [90, 120]
use_warmup: False
OptimizerBuilder:
optimizer:
type: Adam
regularizer: NULL

View File

@@ -0,0 +1,9 @@
_BASE_: [
'../datasets/coco_detection.yml',
'../runtime.yml',
'_base_/optimizer_140e.yml',
'_base_/centernet_dla34.yml',
'_base_/centernet_reader.yml',
]
weights: output/centernet_dla34_140e_coco/model_final

View File

@@ -0,0 +1,21 @@
_BASE_: [
'centernet_r50_140e_coco.yml'
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/MobileNetV1_pretrained.pdparams
weights: output/centernet_mbv1_140e_coco/model_final
CenterNet:
backbone: MobileNet
neck: CenterNetDLAFPN
head: CenterNetHead
post_process: CenterNetPostProcess
MobileNet:
scale: 1.
with_extra_blocks: false
extra_block_filters: []
feature_maps: [3, 5, 11, 13]
TrainReader:
batch_size: 32

View File

@@ -0,0 +1,22 @@
_BASE_: [
'centernet_r50_140e_coco.yml'
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/MobileNetV3_large_x1_0_ssld_pretrained.pdparams
weights: output/centernet_mbv3_large_140e_coco/model_final
CenterNet:
backbone: MobileNetV3
neck: CenterNetDLAFPN
head: CenterNetHead
post_process: CenterNetPostProcess
MobileNetV3:
model_name: large
scale: 1.
with_extra_blocks: false
extra_block_filters: []
feature_maps: [4, 7, 13, 16]
TrainReader:
batch_size: 32

View File

@@ -0,0 +1,28 @@
_BASE_: [
'centernet_r50_140e_coco.yml'
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/MobileNetV3_small_x1_0_ssld_pretrained.pdparams
weights: output/centernet_mbv3_small_140e_coco/model_final
CenterNet:
backbone: MobileNetV3
neck: CenterNetDLAFPN
head: CenterNetHead
post_process: CenterNetPostProcess
MobileNetV3:
model_name: small
scale: 1.
with_extra_blocks: false
extra_block_filters: []
feature_maps: [4, 9, 12]
CenterNetDLAFPN:
first_level: 0
last_level: 3
down_ratio: 8
dcn_v2: False
TrainReader:
batch_size: 32

View File

@@ -0,0 +1,9 @@
_BASE_: [
'../datasets/coco_detection.yml',
'../runtime.yml',
'_base_/optimizer_140e.yml',
'_base_/centernet_r50.yml',
'_base_/centernet_reader.yml',
]
weights: output/centernet_r50_140e_coco/model_final

View File

@@ -0,0 +1,33 @@
_BASE_: [
'centernet_r50_140e_coco.yml'
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ShuffleNetV2_x1_0_pretrained.pdparams
weights: output/centernet_shufflenetv2_140e_coco/model_final
CenterNet:
backbone: ShuffleNetV2
neck: CenterNetDLAFPN
head: CenterNetHead
post_process: CenterNetPostProcess
ShuffleNetV2:
scale: 1.0
feature_maps: [5, 13, 17]
act: leaky_relu
CenterNetDLAFPN:
first_level: 0
last_level: 3
down_ratio: 8
dcn_v2: False
TrainReader:
batch_size: 32
TestReader:
sample_transforms:
- Decode: {}
- WarpAffine: {keep_res: False, input_h: 512, input_w: 512}
- NormalizeImage: {mean: [0.40789655, 0.44719303, 0.47026116], std: [0.2886383 , 0.27408165, 0.27809834]}
- Permute: {}

View File

@@ -0,0 +1,68 @@
简体中文 | [English](README.md)
# CLRNet (CLRNet: Cross Layer Refinement Network for Lane Detection)
## Table of Contents
- [Introduction](#Introduction)
- [Model Zoo](#Model_Zoo)
- [Citations](#Citations)
## Introduction
[CLRNet](https://arxiv.org/abs/2203.10350) is a lane detection model. CLRNet introduces straight-line priors for lane detection together with a line IoU loss and a line NMS method; it fuses high-level contextual features of lane lines with low-level features and refines them across FPN scales, achieving SOTA performance on lane detection datasets.
## Model Zoo
### CLRNet Results on CULane
| backbone | mF1 | F1@50 | F1@75 | download | config | training log |
| :--------------| :------- | :----: | :------: | :----: |:-----: |:-----: |
| ResNet-18 | 54.98 | 79.46 | 62.10 | [model](https://paddledet.bj.bcebos.com/models/clrnet_resnet18_culane.pdparams) | [config](./clrnet_resnet18_culane.yml) | [log](https://bj.bcebos.com/v1/paddledet/logs/train_clrnet_r18_15_culane.log) |
### Dataset Download
Download the [CULane dataset](https://xingangpan.github.io/projects/CULane.html) and extract it to `dataset/culane`.
Your dataset directory should be structured as follows:
```shell
culane/driver_xx_xxframe # data folders x6
culane/laneseg_label_w16 # lane segmentation labels
culane/list # data lists
```
If you download from the Baidu Cloud link, make sure the files extracted from `driver_23_30frame_part1.tar.gz` and `driver_23_30frame_part2.tar.gz` all end up under the `driver_23_30frame` directory.
A small test subset has been uploaded to PaddleDetection and is downloaded and extracted automatically when you run the training script; to reproduce the reported results, please download the full dataset from the link above for training.
### Training
- Single-GPU training
```shell
python tools/train.py -c configs/clrnet/clr_resnet18_culane.yml
```
- Multi-GPU training
```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/clrnet/clr_resnet18_culane.yml
```
### Evaluation
```shell
python tools/eval.py -c configs/clrnet/clr_resnet18_culane.yml -o weights=output/clr_resnet18_culane/model_final.pdparams
```
### Inference
```shell
python tools/infer_culane.py -c configs/clrnet/clr_resnet18_culane.yml -o weights=output/clr_resnet18_culane/model_final.pdparams --infer_img=demo/lane00000.jpg
```
Note: inference does not yet support static-graph model deployment.
## Citations
```
@InProceedings{Zheng_2022_CVPR,
author = {Zheng, Tu and Huang, Yifei and Liu, Yang and Tang, Wenjian and Yang, Zheng and Cai, Deng and He, Xiaofei},
title = {CLRNet: Cross Layer Refinement Network for Lane Detection},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2022},
pages = {898-907}
}
```

View File

@@ -0,0 +1,68 @@
English | [简体中文](README_cn.md)
# CLRNet (CLRNet: Cross Layer Refinement Network for Lane Detection)
## Table of Contents
- [Introduction](#Introduction)
- [Model Zoo](#Model_Zoo)
- [Citations](#Citations)
## Introduction
[CLRNet](https://arxiv.org/abs/2203.10350) is a lane detection model. CLRNet introduces straight-line priors for lane detection together with a line IoU loss and a line NMS method; it fuses high-level contextual features of lane lines with low-level features and refines them across FPN scales, achieving SOTA performance on lane detection datasets.
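The line IoU at the heart of both the loss and the NMS is easy to sketch: each lane is a sequence of x-coordinates sampled at shared row anchors, every point is widened into a short horizontal segment, and the per-row 1-D overlaps are aggregated into a single ratio. A minimal, illustrative numpy sketch (hypothetical helper, not the repo's API):
```python
import numpy as np

def line_iou(x_a, x_b, radius=7.5):
    """IoU between two lanes given as x-coordinates at shared rows.

    Each point is widened by `radius` pixels into a horizontal segment;
    the row-wise overlaps and unions are summed into one ratio.
    """
    overlap = np.minimum(x_a + radius, x_b + radius) - np.maximum(x_a - radius, x_b - radius)
    union = np.maximum(x_a + radius, x_b + radius) - np.minimum(x_a - radius, x_b - radius)
    return np.clip(overlap, 0, None).sum() / union.sum()

# two nearly parallel lanes sampled at 72 rows (cf. num_points in the reader config)
ys = np.arange(72, dtype=float)
print(line_iou(100 + 0.5 * ys, 104 + 0.5 * ys))  # ~0.58
```
In training, `1 - LineIoU` serves as the localization term of the loss, and the same overlap measure is what lane NMS thresholds on (`nms_thres` in the head config).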
## Model Zoo
### CLRNet Results on CULane dataset
| backbone | mF1 | F1@50 | F1@75 | download | config |
| :--------------| :------- | :----: | :------: | :----: |:-----: |
| ResNet-18 | 54.98 | 79.46 | 62.10 | [model](https://paddledet.bj.bcebos.com/models/clrnet_resnet18_culane.pdparams) | [config](./clrnet_resnet18_culane.yml) |
### Download
Download [CULane](https://xingangpan.github.io/projects/CULane.html). Then extract them to `dataset/culane`.
For CULane, the directory structure should look like this:
```shell
culane/driver_xx_xxframe # data folders x6
culane/laneseg_label_w16 # lane segmentation labels
culane/list # data lists
```
If you use Baidu Cloud, make sure that images in `driver_23_30frame_part1.tar.gz` and `driver_23_30frame_part2.tar.gz` are located in one folder `driver_23_30frame` instead of two separate folders after you decompress them.
Now we have uploaded a small subset of the CULane dataset to PaddleDetection for code checking. You can simply run the training script below to download it automatically. If you want to reproduce the results, you need to download the full dataset from the link above for training.
### Training
- single GPU
```shell
python tools/train.py -c configs/clrnet/clr_resnet18_culane.yml
```
- multi GPU
```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/clrnet/clr_resnet18_culane.yml
```
### Evaluation
```shell
python tools/eval.py -c configs/clrnet/clr_resnet18_culane.yml -o weights=output/clr_resnet18_culane/model_final.pdparams
```
### Inference
```shell
python tools/infer_culane.py -c configs/clrnet/clr_resnet18_culane.yml -o weights=output/clr_resnet18_culane/model_final.pdparams --infer_img=demo/lane00000.jpg
```
Notice: The inference phase does not support static model graph deploy at present.
## Citations
```
@InProceedings{Zheng_2022_CVPR,
author = {Zheng, Tu and Huang, Yifei and Liu, Yang and Tang, Wenjian and Yang, Zheng and Cai, Deng and He, Xiaofei},
title = {CLRNet: Cross Layer Refinement Network for Lane Detection},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2022},
pages = {898-907}
}
```

View File

@@ -0,0 +1,41 @@
architecture: CLRNet
CLRNet:
backbone: CLRResNet
neck: CLRFPN
clr_head: CLRHead
CLRResNet:
resnet: 'resnet18'
pretrained: True
CLRFPN:
in_channels: [128,256,512]
out_channel: 64
extra_stage: 0
CLRHead:
prior_feat_channels: 64
fc_hidden_dim: 64
num_priors: 192
num_fc: 2
refine_layers: 3
sample_points: 36
loss: CLRNetLoss
conf_threshold: 0.4
nms_thres: 0.8
CLRNetLoss:
cls_loss_weight : 2.0
xyt_loss_weight : 0.2
iou_loss_weight : 2.0
seg_loss_weight : 1.0
refine_layers : 3
ignore_label: 255
bg_weight: 0.4
# for visualizing lane detection results
sample_y:
start: 589
end: 230
step: -20

View File

@@ -0,0 +1,37 @@
worker_num: 10
img_h: &img_h 320
img_w: &img_w 800
ori_img_h: &ori_img_h 590
ori_img_w: &ori_img_w 1640
num_points: &num_points 72
max_lanes: &max_lanes 4
TrainReader:
batch_size: 24
batch_transforms:
- CULaneTrainProcess: {img_h: *img_h, img_w: *img_w}
- CULaneDataProcess: {num_points: *num_points, max_lanes: *max_lanes, img_w: *img_w, img_h: *img_h}
shuffle: True
drop_last: False
EvalReader:
batch_size: 24
batch_transforms:
- CULaneResize: {prob: 1.0, img_h: *img_h, img_w: *img_w}
- CULaneDataProcess: {num_points: *num_points, max_lanes: *max_lanes, img_w: *img_w, img_h: *img_h}
shuffle: False
drop_last: False
TestReader:
batch_size: 24
batch_transforms:
- CULaneResize: {prob: 1.0, img_h: *img_h, img_w: *img_w}
- CULaneDataProcess: {num_points: *num_points, max_lanes: *max_lanes, img_w: *img_w, img_h: *img_h}
shuffle: False
drop_last: False

View File

@@ -0,0 +1,14 @@
epoch: 15
snapshot_epoch: 5
LearningRate:
base_lr: 0.6e-3
schedulers:
- !CosineDecay
max_epochs: 15
use_warmup: False
OptimizerBuilder:
regularizer: False
optimizer:
type: AdamW

View File

@@ -0,0 +1,9 @@
_BASE_: [
'../datasets/culane.yml',
'_base_/clrnet_reader.yml',
'_base_/clrnet_r18_fpn.yml',
'_base_/optimizer_1x.yml',
'../runtime.yml'
]
weights: output/clr_resnet18_culane/model_final

View File

@@ -0,0 +1,20 @@
# ConvNeXt (A ConvNet for the 2020s)
## Model Zoo
### ConvNeXt on COCO
| Model | Input size | Images/GPU | Lr schd | mAP<sup>val<br>0.5:0.95 | mAP<sup>val<br>0.5 | Params(M) | FLOPs(G) | Download | Config |
| :------------- | :------- | :-------: | :------: | :------------: | :---------------------: | :----------------: |:---------: | :------: |:---------------: |
| PP-YOLOE-ConvNeXt-tiny | 640 | 16 | 36e | 44.6 | 63.3 | 33.04 | 13.87 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_convnext_tiny_36e_coco.pdparams) | [config](./ppyoloe_convnext_tiny_36e_coco.yml) |
| YOLOX-ConvNeXt-s | 640 | 8 | 36e | 44.6 | 65.3 | 36.20 | 27.52 | [model](https://paddledet.bj.bcebos.com/models/yolox_convnext_s_36e_coco.pdparams) | [config](./yolox_convnext_s_36e_coco.yml) |
## Citations
```
@Article{liu2022convnet,
author = {Zhuang Liu and Hanzi Mao and Chao-Yuan Wu and Christoph Feichtenhofer and Trevor Darrell and Saining Xie},
title = {A ConvNet for the 2020s},
journal = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2022},
}
```

View File

@@ -0,0 +1,55 @@
_BASE_: [
'../datasets/coco_detection.yml',
'../runtime.yml',
'../ppyoloe/_base_/ppyoloe_crn.yml',
'../ppyoloe/_base_/ppyoloe_reader.yml',
]
depth_mult: 0.25
width_mult: 0.50
log_iter: 100
snapshot_epoch: 5
weights: output/ppyoloe_convnext_tiny_36e_coco/model_final
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/convnext_tiny_22k_224.pdparams
YOLOv3:
backbone: ConvNeXt
neck: CustomCSPPAN
yolo_head: PPYOLOEHead
post_process: ~
ConvNeXt:
arch: 'tiny'
drop_path_rate: 0.4
layer_scale_init_value: 1.0
return_idx: [1, 2, 3]
PPYOLOEHead:
static_assigner_epoch: 12
nms:
nms_top_k: 1000
keep_top_k: 300
score_threshold: 0.01
nms_threshold: 0.7
TrainReader:
batch_size: 16
epoch: 36
LearningRate:
base_lr: 0.0002
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [36]
use_warmup: false
OptimizerBuilder:
regularizer: false
optimizer:
type: AdamW
weight_decay: 0.0005

View File

@@ -0,0 +1,58 @@
_BASE_: [
'../datasets/coco_detection.yml',
'../runtime.yml',
'../yolox/_base_/yolox_cspdarknet.yml',
'../yolox/_base_/yolox_reader.yml'
]
depth_mult: 0.33
width_mult: 0.50
log_iter: 100
snapshot_epoch: 5
weights: output/yolox_convnext_s_36e_coco/model_final
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/convnext_tiny_22k_224.pdparams
YOLOX:
backbone: ConvNeXt
neck: YOLOCSPPAN
head: YOLOXHead
size_stride: 32
size_range: [15, 25] # multi-scale range [480*480 ~ 800*800]
ConvNeXt:
arch: 'tiny'
drop_path_rate: 0.4
layer_scale_init_value: 1.0
return_idx: [1, 2, 3]
TrainReader:
batch_size: 8
mosaic_epoch: 30
YOLOXHead:
l1_epoch: 30
nms:
name: MultiClassNMS
nms_top_k: 10000
keep_top_k: 1000
score_threshold: 0.001
nms_threshold: 0.65
epoch: 36
LearningRate:
base_lr: 0.0002
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [36]
use_warmup: false
OptimizerBuilder:
regularizer: false
optimizer:
type: AdamW
weight_decay: 0.0005

View File

@@ -0,0 +1,21 @@
metric: COCO
num_classes: 80
TrainDataset:
name: COCODataSet
image_dir: train2017
anno_path: annotations/instances_train2017.json
dataset_dir: dataset/coco
data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
EvalDataset:
name: COCODataSet
image_dir: val2017
anno_path: annotations/instances_val2017.json
dataset_dir: dataset/coco
allow_empty: true
TestDataset:
name: ImageFolder
anno_path: annotations/instances_val2017.json # a txt file is also supported (like VOC's label_list.txt)
dataset_dir: dataset/coco # if set, anno_path will be 'dataset_dir/anno_path'

View File

@@ -0,0 +1,20 @@
metric: COCO
num_classes: 80
TrainDataset:
name: COCODataSet
image_dir: train2017
anno_path: annotations/instances_train2017.json
dataset_dir: dataset/coco
data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_poly', 'is_crowd']
EvalDataset:
name: COCODataSet
image_dir: val2017
anno_path: annotations/instances_val2017.json
dataset_dir: dataset/coco
TestDataset:
name: ImageFolder
anno_path: annotations/instances_val2017.json # a txt file is also supported (like VOC's label_list.txt)
dataset_dir: dataset/coco # if set, anno_path will be 'dataset_dir/anno_path'

View File

@@ -0,0 +1,28 @@
metric: CULaneMetric
num_classes: 5 # 4 lanes + background
cut_height: &cut_height 270
dataset_dir: &dataset_dir dataset/culane
TrainDataset:
name: CULaneDataSet
dataset_dir: *dataset_dir
list_path: 'list/train_gt.txt'
split: train
cut_height: *cut_height
EvalDataset:
name: CULaneDataSet
dataset_dir: *dataset_dir
list_path: 'list/test.txt'
split: test
cut_height: *cut_height
TestDataset:
name: CULaneDataSet
dataset_dir: *dataset_dir
list_path: 'list/test.txt'
split: test
cut_height: *cut_height

View File

@@ -0,0 +1,21 @@
metric: RBOX
num_classes: 15
TrainDataset:
!COCODataSet
image_dir: trainval1024/images
anno_path: trainval1024/DOTA_trainval1024.json
dataset_dir: dataset/dota/
data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd', 'gt_poly']
EvalDataset:
!COCODataSet
image_dir: trainval1024/images
anno_path: trainval1024/DOTA_trainval1024.json
dataset_dir: dataset/dota/
data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd', 'gt_poly']
TestDataset:
!ImageFolder
anno_path: test1024/DOTA_test1024.json
dataset_dir: dataset/dota/

View File

@@ -0,0 +1,21 @@
metric: RBOX
num_classes: 15
TrainDataset:
!COCODataSet
image_dir: trainval1024/images
anno_path: trainval1024/DOTA_trainval1024.json
dataset_dir: dataset/dota_ms/
data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd', 'gt_poly']
EvalDataset:
!COCODataSet
image_dir: trainval1024/images
anno_path: trainval1024/DOTA_trainval1024.json
dataset_dir: dataset/dota_ms/
data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd', 'gt_poly']
TestDataset:
!ImageFolder
anno_path: test1024/DOTA_test1024.json
dataset_dir: dataset/dota_ms/

View File

@@ -0,0 +1,25 @@
metric: MCMOT
num_classes: 10
# the VisDrone2019 MOT dataset with 10 classes is used by default; modify it for your needs.
# for MCMOT training
TrainDataset:
!MCMOTDataSet
dataset_dir: dataset/mot
image_lists: ['visdrone_mcmot.train']
data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
label_list: label_list.txt
# for MCMOT evaluation
# If you want to change the MCMOT evaluation dataset, please modify 'data_root'
EvalMOTDataset:
!MOTImageFolder
dataset_dir: dataset/mot
data_root: visdrone_mcmot/images/val
keep_ori_im: False # set True if save visualization images or video, or used in DeepSORT
# for MCMOT video inference
TestMOTDataset:
!MOTImageFolder
dataset_dir: dataset/mot
keep_ori_im: True # set True if save visualization images or video

View File

@@ -0,0 +1,23 @@
metric: MOT
num_classes: 1
# for MOT training
TrainDataset:
!MOTDataSet
dataset_dir: dataset/mot
image_lists: ['mot17.train', 'caltech.all', 'cuhksysu.train', 'prw.train', 'citypersons.train', 'eth.train']
data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
# for MOT evaluation
# If you want to change the MOT evaluation dataset, please modify 'data_root'
EvalMOTDataset:
!MOTImageFolder
dataset_dir: dataset/mot
data_root: MOT16/images/train
keep_ori_im: False # set True if save visualization images or video, or used in DeepSORT
# for MOT video inference
TestMOTDataset:
!MOTImageFolder
dataset_dir: dataset/mot
keep_ori_im: True # set True if save visualization images or video

View File

@@ -0,0 +1,21 @@
metric: COCO
num_classes: 365
TrainDataset:
!COCODataSet
image_dir: train
anno_path: annotations/zhiyuan_objv2_train.json
dataset_dir: dataset/objects365
data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
EvalDataset:
!COCODataSet
image_dir: val
anno_path: annotations/zhiyuan_objv2_val.json
dataset_dir: dataset/objects365
allow_empty: true
TestDataset:
!ImageFolder
anno_path: annotations/zhiyuan_objv2_val.json
dataset_dir: dataset/objects365/

View File

@@ -0,0 +1,21 @@
metric: VOC
map_type: integral
num_classes: 4
TrainDataset:
name: VOCDataSet
dataset_dir: dataset/roadsign_voc
anno_path: train.txt
label_list: label_list.txt
data_fields: ['image', 'gt_bbox', 'gt_class', 'difficult']
EvalDataset:
name: VOCDataSet
dataset_dir: dataset/roadsign_voc
anno_path: valid.txt
label_list: label_list.txt
data_fields: ['image', 'gt_bbox', 'gt_class', 'difficult']
TestDataset:
name: ImageFolder
anno_path: dataset/roadsign_voc/label_list.txt

View File

@@ -0,0 +1,47 @@
metric: SNIPERCOCO
num_classes: 80
TrainDataset:
!SniperCOCODataSet
image_dir: train2017
anno_path: annotations/instances_train2017.json
dataset_dir: dataset/coco
data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
allow_empty: true
is_trainset: true
image_target_sizes: [2000, 1000]
valid_box_ratio_ranges: [[-1, 0.1],[0.08, -1]]
chip_target_size: 512
chip_target_stride: 200
use_neg_chip: false
max_neg_num_per_im: 8
EvalDataset:
!SniperCOCODataSet
image_dir: val2017
anno_path: annotations/instances_val2017.json
dataset_dir: dataset/coco
data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
allow_empty: true
is_trainset: false
image_target_sizes: [2000, 1000]
valid_box_ratio_ranges: [[-1, 0.1], [0.08, -1]]
chip_target_size: 512
chip_target_stride: 200
max_per_img: -1
nms_thresh: 0.5
TestDataset:
!SniperCOCODataSet
image_dir: val2017
dataset_dir: dataset/coco
is_trainset: false
image_target_sizes: [2000, 1000]
valid_box_ratio_ranges: [[-1, 0.1],[0.08, -1]]
chip_target_size: 500
chip_target_stride: 200
max_per_img: -1
nms_thresh: 0.5

View File

@@ -0,0 +1,47 @@
metric: SNIPERCOCO
num_classes: 9
TrainDataset:
!SniperCOCODataSet
image_dir: train
anno_path: annotations/train.json
dataset_dir: dataset/VisDrone2019_coco
data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
allow_empty: true
is_trainset: true
image_target_sizes: [8145, 2742]
valid_box_ratio_ranges: [[-1, 0.03142857142857144], [0.02333211853008726, -1]]
chip_target_size: 1536
chip_target_stride: 1184
use_neg_chip: false
max_neg_num_per_im: 8
EvalDataset:
!SniperCOCODataSet
image_dir: val
anno_path: annotations/val.json
dataset_dir: dataset/VisDrone2019_coco
data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
allow_empty: true
is_trainset: false
image_target_sizes: [8145, 2742]
valid_box_ratio_ranges: [[-1, 0.03142857142857144], [0.02333211853008726, -1]]
chip_target_size: 1536
chip_target_stride: 1184
max_per_img: -1
nms_thresh: 0.5
TestDataset:
!SniperCOCODataSet
image_dir: val
dataset_dir: dataset/VisDrone2019_coco
is_trainset: false
image_target_sizes: [8145, 2742]
valid_box_ratio_ranges: [[-1, 0.03142857142857144], [0.02333211853008726, -1]]
chip_target_size: 1536
chip_target_stride: 1184
max_per_img: -1
nms_thresh: 0.5

View File

@@ -0,0 +1,21 @@
metric: RBOX
num_classes: 9
TrainDataset:
!COCODataSet
image_dir: images
anno_path: annotations/train.json
dataset_dir: dataset/spine_coco
data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd', 'gt_poly']
EvalDataset:
!COCODataSet
image_dir: images
anno_path: annotations/valid.json
dataset_dir: dataset/spine_coco
data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd', 'gt_poly']
TestDataset:
!ImageFolder
anno_path: annotations/valid.json
dataset_dir: dataset/spine_coco

View File

@@ -0,0 +1,22 @@
metric: COCO
num_classes: 10
TrainDataset:
!COCODataSet
image_dir: VisDrone2019-DET-train
anno_path: train.json
dataset_dir: dataset/visdrone
data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
EvalDataset:
!COCODataSet
image_dir: VisDrone2019-DET-val
anno_path: val.json
# image_dir: test_dev
# anno_path: test_dev.json
dataset_dir: dataset/visdrone
TestDataset:
!ImageFolder
anno_path: val.json
dataset_dir: dataset/visdrone

View File

@@ -0,0 +1,21 @@
metric: VOC
map_type: 11point
num_classes: 20
TrainDataset:
name: VOCDataSet
dataset_dir: dataset/voc
anno_path: trainval.txt
label_list: label_list.txt
data_fields: ['image', 'gt_bbox', 'gt_class', 'difficult']
EvalDataset:
name: VOCDataSet
dataset_dir: dataset/voc
anno_path: test.txt
label_list: label_list.txt
data_fields: ['image', 'gt_bbox', 'gt_class', 'difficult']
TestDataset:
name: ImageFolder
anno_path: dataset/voc/label_list.txt

View File

@@ -0,0 +1,20 @@
metric: WiderFace
num_classes: 1
TrainDataset:
!WIDERFaceDataSet
dataset_dir: dataset/wider_face
anno_path: wider_face_split/wider_face_train_bbx_gt.txt
image_dir: WIDER_train/images
data_fields: ['image', 'gt_bbox', 'gt_class']
EvalDataset:
!WIDERFaceDataSet
dataset_dir: dataset/wider_face
anno_path: wider_face_split/wider_face_val_bbx_gt.txt
image_dir: WIDER_val/images
data_fields: ['image']
TestDataset:
!ImageFolder
use_default_label: true

View File

@@ -0,0 +1,37 @@
### Deformable ConvNets v2
| Backbone | Type | Conv | Images/GPU | Lr schd | Inf time (fps) | Box AP | Mask AP | Download | Config |
| :------------------- | :------------- | :-----: |:--------: | :-----: | :-----------: |:----: | :-----: | :----------------------------------------------------------: | :----: |
| ResNet50-FPN | Faster | c3-c5 | 1 | 1x | - | 42.1 | - | [model](https://paddledet.bj.bcebos.com/models/faster_rcnn_dcn_r50_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/dcn/faster_rcnn_dcn_r50_fpn_1x_coco.yml) |
| ResNet50-vd-FPN | Faster | c3-c5 | 1 | 1x | - | 42.7 | - | [model](https://paddledet.bj.bcebos.com/models/faster_rcnn_dcn_r50_vd_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/dcn/faster_rcnn_dcn_r50_vd_fpn_1x_coco.yml) |
| ResNet50-vd-FPN | Faster | c3-c5 | 1 | 2x | - | 43.7 | - | [model](https://paddledet.bj.bcebos.com/models/faster_rcnn_dcn_r50_vd_fpn_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/dcn/faster_rcnn_dcn_r50_vd_fpn_2x_coco.yml) |
| ResNet101-vd-FPN | Faster | c3-c5 | 1 | 1x | - | 45.1 | - | [model](https://paddledet.bj.bcebos.com/models/faster_rcnn_dcn_r101_vd_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/dcn/faster_rcnn_dcn_r101_vd_fpn_1x_coco.yml) |
| ResNeXt101-vd-FPN | Faster | c3-c5 | 1 | 1x | - | 46.5 | - | [model](https://paddledet.bj.bcebos.com/models/faster_rcnn_dcn_x101_vd_64x4d_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/dcn/faster_rcnn_dcn_x101_vd_64x4d_fpn_1x_coco.yml) |
| ResNet50-FPN | Mask | c3-c5 | 1 | 1x | - | 42.7 | 38.4 | [model](https://paddledet.bj.bcebos.com/models/mask_rcnn_dcn_r50_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/dcn/mask_rcnn_dcn_r50_fpn_1x_coco.yml) |
| ResNet50-vd-FPN | Mask | c3-c5 | 1 | 2x | - | 44.6 | 39.8 | [model](https://paddledet.bj.bcebos.com/models/mask_rcnn_dcn_r50_vd_fpn_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/dcn/mask_rcnn_dcn_r50_vd_fpn_2x_coco.yml) |
| ResNet101-vd-FPN | Mask | c3-c5 | 1 | 1x | - | 45.6 | 40.6 | [model](https://paddledet.bj.bcebos.com/models/mask_rcnn_dcn_r101_vd_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/dcn/mask_rcnn_dcn_r101_vd_fpn_1x_coco.yml) |
| ResNeXt101-vd-FPN | Mask | c3-c5 | 1 | 1x | - | 47.3 | 42.0 | [model](https://paddledet.bj.bcebos.com/models/mask_rcnn_dcn_x101_vd_64x4d_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/dcn/mask_rcnn_dcn_x101_vd_64x4d_fpn_1x_coco.yml) |
| ResNet50-FPN | Cascade Faster | c3-c5 | 1 | 1x | - | 42.1 | - | [model](https://paddledet.bj.bcebos.com/models/cascade_rcnn_dcn_r50_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/dcn/cascade_rcnn_dcn_r50_fpn_1x_coco.yml) |
| ResNeXt101-vd-FPN | Cascade Faster | c3-c5 | 1 | 1x | - | 48.8 | - | [model](https://paddledet.bj.bcebos.com/models/cascade_rcnn_dcn_x101_vd_64x4d_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/dcn/cascade_rcnn_dcn_x101_vd_64x4d_fpn_1x_coco.yml) |
**Notes:**
- Deformable ConvNets v2 (dcn_v2) follows the paper [Deformable ConvNets v2](https://arxiv.org/abs/1811.11168).
- `c3-c5` means adding `dcn` to stages 3 through 5 of the ResNet backbone.
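For illustration, a DCNv2 block can be sketched as a regular conv that predicts per-position offsets and modulation masks for a deformable conv. A minimal sketch, assuming the `paddle.vision.ops.DeformConv2D` layer from Paddle 2.x is available; the class and variable names below are illustrative, not the ones used by this repo:
```python
import paddle
import paddle.nn as nn
import paddle.nn.functional as F
from paddle.vision.ops import DeformConv2D

class DCNv2Block(nn.Layer):
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        # one regular conv predicts per-position offsets (2*k*k channels)
        # and modulation masks (k*k channels) for the deformable conv
        self.offset_mask = nn.Conv2D(in_ch, 3 * k * k, kernel_size=k, padding=k // 2)
        self.dcn = DeformConv2D(in_ch, out_ch, kernel_size=k, padding=k // 2)

    def forward(self, x):
        om = self.offset_mask(x)
        k2 = om.shape[1] // 3
        offset = om[:, :2 * k2]              # sampling offsets
        mask = F.sigmoid(om[:, 2 * k2:])     # modulation in (0, 1), the "v2" part
        return self.dcn(x, offset, mask)

x = paddle.randn([1, 64, 32, 32])
print(DCNv2Block(64, 128)(x).shape)  # [1, 128, 32, 32]
```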
## Citations
```
@inproceedings{dai2017deformable,
title={Deformable Convolutional Networks},
author={Dai, Jifeng and Qi, Haozhi and Xiong, Yuwen and Li, Yi and Zhang, Guodong and Hu, Han and Wei, Yichen},
booktitle={Proceedings of the IEEE international conference on computer vision},
year={2017}
}
@article{zhu2018deformable,
title={Deformable ConvNets v2: More Deformable, Better Results},
author={Zhu, Xizhou and Hu, Han and Lin, Stephen and Dai, Jifeng},
journal={arXiv preprint arXiv:1811.11168},
year={2018}
}
```

View File

@@ -0,0 +1,16 @@
_BASE_: [
'../datasets/coco_detection.yml',
'../runtime.yml',
'../cascade_rcnn/_base_/optimizer_1x.yml',
'../cascade_rcnn/_base_/cascade_rcnn_r50_fpn.yml',
'../cascade_rcnn/_base_/cascade_fpn_reader.yml',
]
weights: output/cascade_rcnn_dcn_r50_fpn_1x_coco/model_final
ResNet:
depth: 50
norm_type: bn
freeze_at: 0
return_idx: [0,1,2,3]
num_stages: 4
dcn_v2_stages: [1,2,3]

View File

@@ -0,0 +1,16 @@
_BASE_: [
'cascade_rcnn_dcn_r50_fpn_1x_coco.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNeXt101_vd_64x4d_pretrained.pdparams
weights: output/cascade_rcnn_dcn_x101_vd_64x4d_fpn_1x_coco/model_final
ResNet:
depth: 101
groups: 64
base_width: 4
variant: d
norm_type: bn
freeze_at: 0
return_idx: [0,1,2,3]
num_stages: 4
dcn_v2_stages: [1,2,3]

View File

@@ -0,0 +1,15 @@
_BASE_: [
'faster_rcnn_dcn_r50_fpn_1x_coco.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet101_vd_pretrained.pdparams
weights: output/faster_rcnn_dcn_r101_vd_fpn_1x_coco/model_final
ResNet:
# index 0 stands for res2
depth: 101
variant: d
norm_type: bn
freeze_at: 0
return_idx: [0,1,2,3]
num_stages: 4
dcn_v2_stages: [1,2,3]

View File

@@ -0,0 +1,16 @@
_BASE_: [
'../datasets/coco_detection.yml',
'../runtime.yml',
'../faster_rcnn/_base_/optimizer_1x.yml',
'../faster_rcnn/_base_/faster_rcnn_r50_fpn.yml',
'../faster_rcnn/_base_/faster_fpn_reader.yml',
]
weights: output/faster_rcnn_dcn_r50_fpn_1x_coco/model_final
ResNet:
depth: 50
norm_type: bn
freeze_at: 0
return_idx: [0,1,2,3]
num_stages: 4
dcn_v2_stages: [1,2,3]

View File

@@ -0,0 +1,15 @@
_BASE_: [
'faster_rcnn_dcn_r50_fpn_1x_coco.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_pretrained.pdparams
weights: output/faster_rcnn_dcn_r50_vd_fpn_2x_coco/model_final
ResNet:
# index 0 stands for res2
depth: 50
variant: d
norm_type: bn
freeze_at: 0
return_idx: [0,1,2,3]
num_stages: 4
dcn_v2_stages: [1,2,3]

View File

@@ -0,0 +1,26 @@
_BASE_: [
'faster_rcnn_dcn_r50_fpn_1x_coco.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_pretrained.pdparams
weights: output/faster_rcnn_dcn_r50_vd_fpn_2x_coco/model_final
ResNet:
# index 0 stands for res2
depth: 50
variant: d
norm_type: bn
freeze_at: 0
return_idx: [0,1,2,3]
num_stages: 4
dcn_v2_stages: [1,2,3]
epoch: 24
LearningRate:
base_lr: 0.01
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [16, 22]
- !LinearWarmup
start_factor: 0.1
steps: 1000

View File

@@ -0,0 +1,17 @@
_BASE_: [
'faster_rcnn_dcn_r50_fpn_1x_coco.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNeXt101_vd_64x4d_pretrained.pdparams
weights: output/faster_rcnn_dcn_x101_vd_64x4d_fpn_1x_coco/model_final
ResNet:
# for ResNeXt: groups, base_width, base_channels
depth: 101
groups: 64
base_width: 4
variant: d
norm_type: bn
freeze_at: 0
return_idx: [0,1,2,3]
num_stages: 4
dcn_v2_stages: [1,2,3]

View File

@@ -0,0 +1,15 @@
_BASE_: [
'mask_rcnn_dcn_r50_fpn_1x_coco.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet101_vd_pretrained.pdparams
weights: output/mask_rcnn_dcn_r101_vd_fpn_1x_coco/model_final
ResNet:
# index 0 stands for res2
depth: 101
variant: d
norm_type: bn
freeze_at: 0
return_idx: [0,1,2,3]
num_stages: 4
dcn_v2_stages: [1,2,3]

View File

@@ -0,0 +1,16 @@
_BASE_: [
'../datasets/coco_instance.yml',
'../runtime.yml',
'../mask_rcnn/_base_/optimizer_1x.yml',
'../mask_rcnn/_base_/mask_rcnn_r50_fpn.yml',
'../mask_rcnn/_base_/mask_fpn_reader.yml',
]
weights: output/mask_rcnn_dcn_r50_fpn_1x_coco/model_final
ResNet:
depth: 50
norm_type: bn
freeze_at: 0
return_idx: [0,1,2,3]
num_stages: 4
dcn_v2_stages: [1,2,3]

View File

@@ -0,0 +1,26 @@
_BASE_: [
'mask_rcnn_dcn_r50_fpn_1x_coco.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_pretrained.pdparams
weights: output/mask_rcnn_dcn_r50_vd_fpn_2x_coco/model_final
ResNet:
# index 0 stands for res2
depth: 50
variant: d
norm_type: bn
freeze_at: 0
return_idx: [0,1,2,3]
num_stages: 4
dcn_v2_stages: [1,2,3]
epoch: 24
LearningRate:
base_lr: 0.01
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [16, 22]
- !LinearWarmup
start_factor: 0.1
steps: 1000

View File

@@ -0,0 +1,17 @@
_BASE_: [
'mask_rcnn_dcn_r50_fpn_1x_coco.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNeXt101_vd_64x4d_pretrained.pdparams
weights: output/mask_rcnn_dcn_x101_vd_64x4d_fpn_1x_coco/model_final
ResNet:
# for ResNeXt: groups, base_width, base_channels
depth: 101
variant: d
groups: 64
base_width: 4
norm_type: bn
freeze_at: 0
return_idx: [0,1,2,3]
num_stages: 4
dcn_v2_stages: [1,2,3]

View File

@@ -0,0 +1,36 @@
# Deformable DETR
## Introduction
Deformable DETR is an end-to-end object detection model based on DETR. It replaces DETR's dense transformer attention with multi-scale deformable attention, in which each query attends to only a small set of sampling points around a reference location. We reproduce the model of the paper.
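A single-query, single-head, single-level numpy sketch of that sampling step follows; it is illustrative only, since the model uses a fused multi-scale operator (cf. `num_feature_levels` and `num_encoder_points`/`num_decoder_points` in the config below):
```python
import numpy as np

def deform_attn_point(value, ref_xy, offsets, weights):
    """Single-query, single-head, single-level deformable attention.

    value:   (H, W, C) feature map
    ref_xy:  (2,) normalized reference point in [0, 1]
    offsets: (K, 2) predicted sampling offsets (normalized coordinates)
    weights: (K,) attention weights, softmax-normalized over the K points
    """
    H, W, C = value.shape
    out = np.zeros(C)
    for (dx, dy), w in zip(offsets, weights):
        # sampling location = reference point + predicted offset
        x = np.clip((ref_xy[0] + dx) * (W - 1), 0, W - 1)
        y = np.clip((ref_xy[1] + dy) * (H - 1), 0, H - 1)
        x0, y0 = min(int(x), W - 2), min(int(y), H - 2)
        ax, ay = x - x0, y - y0
        # bilinear interpolation of the value map at (x, y)
        v = ((1 - ax) * (1 - ay) * value[y0, x0] + ax * (1 - ay) * value[y0, x0 + 1]
             + (1 - ax) * ay * value[y0 + 1, x0] + ax * ay * value[y0 + 1, x0 + 1])
        out += w * v
    return out

val = np.random.rand(16, 16, 8)   # one feature level
w = np.full(4, 0.25)              # 4 sampling points, equal weights
print(deform_attn_point(val, np.array([0.5, 0.5]),
                        0.05 * np.random.randn(4, 2), w).shape)  # (8,)
```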
## Model Zoo
| Backbone | Model | Images/GPU | Epochs | Box AP | Config | Log | Download |
|:--------:|:---------------:|:----------:|:------:|:------:|:------------------------------------------------------------------------------------------------------------------------------:|:-----------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------:|
| R-50 | Deformable DETR | 2 | 50 | 44.5 | [config](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/configs/deformable_detr/deformable_detr_r50_1x_coco.yml) | [log](https://bj.bcebos.com/v1/paddledet/logs/deformable_detr_r50_1x_coco_44.5.log) | [model](https://paddledet.bj.bcebos.com/models/deformable_detr_r50_1x_coco.pdparams) |
**Notes:**
- Deformable DETR is trained on the COCO train2017 dataset and evaluated on val2017 with `mAP(IoU=0.5:0.95)`.
- Deformable DETR is trained for 50 epochs on 8 GPUs.
Multi-GPU training:
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
python -m paddle.distributed.launch --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/deformable_detr/deformable_detr_r50_1x_coco.yml --fleet
```
## Citations
```
@inproceedings{
zhu2021deformable,
title={Deformable DETR: Deformable Transformers for End-to-End Object Detection},
author={Xizhou Zhu and Weijie Su and Lewei Lu and Bin Li and Xiaogang Wang and Jifeng Dai},
booktitle={International Conference on Learning Representations},
year={2021},
url={https://openreview.net/forum?id=gZ9hCDWe6ke}
}
```

View File

@@ -0,0 +1,48 @@
architecture: DETR
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vb_normal_pretrained.pdparams
hidden_dim: 256
use_focal_loss: True
DETR:
backbone: ResNet
transformer: DeformableTransformer
detr_head: DeformableDETRHead
post_process: DETRPostProcess
ResNet:
# index 0 stands for res2
depth: 50
norm_type: bn
freeze_at: 0
return_idx: [1, 2, 3]
lr_mult_list: [0.0, 0.1, 0.1, 0.1]
num_stages: 4
DeformableTransformer:
num_queries: 300
position_embed_type: sine
nhead: 8
num_encoder_layers: 6
num_decoder_layers: 6
dim_feedforward: 1024
dropout: 0.1
activation: relu
num_feature_levels: 4
num_encoder_points: 4
num_decoder_points: 4
DeformableDETRHead:
num_mlp_layers: 3
DETRLoss:
loss_coeff: {class: 2, bbox: 5, giou: 2}
aux_loss: True
HungarianMatcher:
matcher_coeff: {class: 2, bbox: 5, giou: 2}

View File

@@ -0,0 +1,44 @@
worker_num: 2
TrainReader:
sample_transforms:
- Decode: {}
- RandomFlip: {prob: 0.5}
- RandomSelect: { transforms1: [ RandomShortSideResize: { short_side_sizes: [ 480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800 ], max_size: 1333 } ],
transforms2: [
RandomShortSideResize: { short_side_sizes: [ 400, 500, 600 ] },
RandomSizeCrop: { min_size: 384, max_size: 600 },
RandomShortSideResize: { short_side_sizes: [ 480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800 ], max_size: 1333 } ]
}
- NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- NormalizeBox: {}
- BboxXYXY2XYWH: {}
- Permute: {}
batch_transforms:
- PadMaskBatch: {pad_to_stride: -1, return_pad_mask: true}
batch_size: 2
shuffle: true
drop_last: true
collate_batch: false
use_shared_memory: false
EvalReader:
sample_transforms:
- Decode: {}
- Resize: {target_size: [800, 1333], keep_ratio: True}
- NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- Permute: {}
batch_size: 1
shuffle: false
drop_last: false
TestReader:
sample_transforms:
- Decode: {}
- Resize: {target_size: [800, 1333], keep_ratio: True}
- NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- Permute: {}
batch_size: 1
shuffle: false
drop_last: false

View File

@@ -0,0 +1,16 @@
epoch: 50
LearningRate:
base_lr: 0.0002
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [40]
use_warmup: false
OptimizerBuilder:
clip_grad_by_norm: 0.1
regularizer: false
optimizer:
type: AdamW
weight_decay: 0.0001

View File

@@ -0,0 +1,9 @@
_BASE_: [
'../datasets/coco_detection.yml',
'../runtime.yml',
'_base_/deformable_optimizer_1x.yml',
'_base_/deformable_detr_r50.yml',
'_base_/deformable_detr_reader.yml',
]
weights: output/deformable_detr_r50_1x_coco/model_final
find_unused_parameters: True

View File

@@ -0,0 +1,39 @@
# DETR
## Introduction
DETR is an end-to-end object detection model based on the transformer. It casts detection as direct set prediction: a fixed set of learned queries is decoded into boxes, and during training predictions are matched one-to-one to ground-truth objects with a Hungarian (bipartite) matcher. We reproduce the model of the paper.
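A minimal sketch of the matching step, assuming `scipy` is available; the cost weights mirror `HungarianMatcher.matcher_coeff` in the config below, with the GIoU term omitted for brevity:
```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def hungarian_match(pred_probs, pred_boxes, gt_labels, gt_boxes,
                    w_class=1.0, w_bbox=5.0):
    # cost_class: negative predicted probability of each GT's class
    cost_class = -pred_probs[:, gt_labels]                # (num_queries, num_gt)
    # cost_bbox: L1 distance between predicted and GT boxes (cx, cy, w, h)
    cost_bbox = np.abs(pred_boxes[:, None] - gt_boxes[None]).sum(-1)
    cost = w_class * cost_class + w_bbox * cost_bbox
    rows, cols = linear_sum_assignment(cost)              # optimal 1-to-1 match
    return list(zip(rows, cols))                          # (query_idx, gt_idx)

probs = np.random.dirichlet(np.ones(81), size=100)        # 100 queries, 80 classes + bg
boxes = np.random.rand(100, 4)
print(hungarian_match(probs, boxes, np.array([3, 17]), np.random.rand(2, 4)))
```
Unmatched queries are supervised toward the "no object" class, which is what the `no_object` coefficient in `DETRLoss.loss_coeff` weighs.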
## Model Zoo
| Backbone | Model | Images/GPU | Inf time (fps) | Box AP | Config | Download |
|:------:|:--------:|:--------:|:--------------:|:------:|:------:|:--------:|
| R-50 | DETR | 4 | --- | 42.3 | [config](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/configs/detr/detr_r50_1x_coco.yml) | [model](https://paddledet.bj.bcebos.com/models/detr_r50_1x_coco.pdparams) |
**Notes:**
- DETR is trained on the COCO train2017 dataset and evaluated on val2017 with `mAP(IoU=0.5:0.95)`.
- DETR is trained for 500 epochs on 8 GPUs.
Multi-GPU training:
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
python -m paddle.distributed.launch --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/detr/detr_r50_1x_coco.yml --fleet
```
## Citations
```
@inproceedings{detr,
author = {Nicolas Carion and
Francisco Massa and
Gabriel Synnaeve and
Nicolas Usunier and
Alexander Kirillov and
Sergey Zagoruyko},
title = {End-to-End Object Detection with Transformers},
booktitle = {ECCV},
year = {2020}
}
```

View File

@@ -0,0 +1,44 @@
architecture: DETR
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vb_normal_pretrained.pdparams
hidden_dim: 256
DETR:
backbone: ResNet
transformer: DETRTransformer
detr_head: DETRHead
post_process: DETRPostProcess
ResNet:
# index 0 stands for res2
depth: 50
norm_type: bn
freeze_at: 0
return_idx: [3]
lr_mult_list: [0.0, 0.1, 0.1, 0.1]
num_stages: 4
DETRTransformer:
num_queries: 100
position_embed_type: sine
nhead: 8
num_encoder_layers: 6
num_decoder_layers: 6
dim_feedforward: 2048
dropout: 0.1
activation: relu
DETRHead:
num_mlp_layers: 3
DETRLoss:
loss_coeff: {class: 1, bbox: 5, giou: 2, no_object: 0.1}
aux_loss: True
HungarianMatcher:
matcher_coeff: {class: 1, bbox: 5, giou: 2}

View File

@@ -0,0 +1,44 @@
worker_num: 0
TrainReader:
sample_transforms:
- Decode: {}
- RandomFlip: {prob: 0.5}
- RandomSelect: { transforms1: [ RandomShortSideResize: { short_side_sizes: [ 480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800 ], max_size: 1333 } ],
transforms2: [
RandomShortSideResize: { short_side_sizes: [ 400, 500, 600 ] },
RandomSizeCrop: { min_size: 384, max_size: 600 },
RandomShortSideResize: { short_side_sizes: [ 480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800 ], max_size: 1333 } ]
}
- NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- NormalizeBox: {}
- BboxXYXY2XYWH: {}
- Permute: {}
batch_transforms:
- PadMaskBatch: {pad_to_stride: -1, return_pad_mask: true}
batch_size: 2
shuffle: true
drop_last: true
collate_batch: false
use_shared_memory: false
EvalReader:
sample_transforms:
- Decode: {}
- Resize: {target_size: [800, 1333], keep_ratio: True}
- NormalizeImage: {is_scale: true, mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225]}
- Permute: {}
batch_size: 1
shuffle: false
drop_last: false
TestReader:
sample_transforms:
- Decode: {}
- Resize: {target_size: [800, 1333], keep_ratio: True}
- NormalizeImage: {is_scale: true, mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225]}
- Permute: {}
batch_size: 1
shuffle: false
drop_last: false

View File

@@ -0,0 +1,16 @@
epoch: 500
LearningRate:
base_lr: 0.0001
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [400]
use_warmup: false
OptimizerBuilder:
clip_grad_by_norm: 0.1
regularizer: false
optimizer:
type: AdamW
weight_decay: 0.0001

View File

@@ -0,0 +1,9 @@
_BASE_: [
'../datasets/coco_detection.yml',
'../runtime.yml',
'_base_/optimizer_1x.yml',
'_base_/detr_r50.yml',
'_base_/detr_reader.yml',
]
weights: output/detr_r50_1x_coco/model_final
find_unused_parameters: True

View File

@@ -0,0 +1,39 @@
# DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection
## Introduction
[DINO](https://arxiv.org/abs/2203.03605) is a DETR-based object detection model. We reproduce the model described in the paper.
## Model Zoo
| Backbone | Model | Epochs | Box AP | Config | Log | Download |
|:------:|:---------------:|:------:|:------:|:---------------------------------------:|:-------------------------------------------------------------------------------:|:--------------------------------------------------------------------------------:|
| R-50 | dino_r50_4scale | 12 | 49.5 | [config](./dino_r50_4scale_1x_coco.yml) | [log](https://bj.bcebos.com/v1/paddledet/logs/dino_r50_4scale_1x_coco_49.5.log) | [model](https://paddledet.bj.bcebos.com/models/dino_r50_4scale_1x_coco.pdparams) |
| R-50 | dino_r50_4scale | 24 | 50.8 | [config](./dino_r50_4scale_2x_coco.yml) | [log](https://bj.bcebos.com/v1/paddledet/logs/dino_r50_4scale_2x_coco_50.8.log) | [model](https://paddledet.bj.bcebos.com/models/dino_r50_4scale_2x_coco.pdparams) |
**Notes:**
- DINO is trained on the COCO train2017 dataset and evaluated on val2017; results are reported as `mAP(IoU=0.5:0.95)`.
- DINO is trained on 4 GPUs.
Multi-GPU training:
```bash
python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/dino/dino_r50_4scale_1x_coco.yml --fleet --eval
```
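For deployment, a trained DINO model can be exported to an inference model with the standard export entry point; a minimal sketch using the released 1x weights from the Model Zoo table:
```bash
# Export an inference model (default output directory is output_inference/)
python tools/export_model.py -c configs/dino/dino_r50_4scale_1x_coco.yml \
    -o weights=https://paddledet.bj.bcebos.com/models/dino_r50_4scale_1x_coco.pdparams
```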
## Custom Operator
- For the multi-scale deformable attention custom operator, see [here](../../ppdet/modeling/transformers/ext_op); a build sketch follows below.
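The operator must be compiled before training. A build sketch, assuming the setup script shipped in the linked `ext_op` directory (the script name may differ across versions; check that directory's README):
```bash
# Build and install the multi-scale deformable attention custom op
cd ppdet/modeling/transformers/ext_op
python setup_ms_deformable_attn_op.py install
```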
## Citations
```
@misc{zhang2022dino,
title={DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection},
author={Hao Zhang and Feng Li and Shilong Liu and Lei Zhang and Hang Su and Jun Zhu and Lionel M. Ni and Heung-Yeung Shum},
year={2022},
eprint={2203.03605},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```

View File

@@ -0,0 +1,45 @@
architecture: DETR
# pretrain_weights: # rewrite in FocalNet.pretrained in ppdet/modeling/backbones/focalnet.py
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/focalnet_large_lrf_384_fl4_pretrained.pdparams
hidden_dim: 256
use_focal_loss: True
DETR:
backbone: FocalNet
transformer: DINOTransformer
detr_head: DINOHead
post_process: DETRPostProcess
FocalNet:
arch: 'focalnet_L_384_22k_fl4'
out_indices: [1, 2, 3]
pretrained: https://bj.bcebos.com/v1/paddledet/models/pretrained/focalnet_large_lrf_384_fl4_pretrained.pdparams
DINOTransformer:
num_queries: 900
position_embed_type: sine
num_levels: 4
nhead: 8
num_encoder_layers: 6
num_decoder_layers: 6
dim_feedforward: 2048
dropout: 0.0
activation: relu
pe_temperature: 20
pe_offset: 0.0
num_denoising: 100
label_noise_ratio: 0.5
box_noise_scale: 1.0
learnt_init_query: True
DINOHead:
loss:
name: DINOLoss
loss_coeff: {class: 1, bbox: 5, giou: 2}
aux_loss: True
matcher:
name: HungarianMatcher
matcher_coeff: {class: 2, bbox: 5, giou: 2}
DETRPostProcess:
num_top_queries: 300

View File

@@ -0,0 +1,49 @@
architecture: DETR
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
hidden_dim: 256
use_focal_loss: True
DETR:
backbone: ResNet
transformer: DINOTransformer
detr_head: DINOHead
post_process: DETRPostProcess
ResNet:
# index 0 stands for res2
depth: 50
norm_type: bn
freeze_at: 0
return_idx: [1, 2, 3]
lr_mult_list: [0.0, 0.1, 0.1, 0.1]
num_stages: 4
DINOTransformer:
num_queries: 900
position_embed_type: sine
num_levels: 4
nhead: 8
num_encoder_layers: 6
num_decoder_layers: 6
dim_feedforward: 2048
dropout: 0.0
activation: relu
pe_temperature: 20
pe_offset: 0.0
num_denoising: 100
label_noise_ratio: 0.5
box_noise_scale: 1.0
learnt_init_query: True
DINOHead:
loss:
name: DINOLoss
loss_coeff: {class: 1, bbox: 5, giou: 2}
aux_loss: True
matcher:
name: HungarianMatcher
matcher_coeff: {class: 2, bbox: 5, giou: 2}
DETRPostProcess:
num_top_queries: 300

View File

@@ -0,0 +1,40 @@
worker_num: 4
TrainReader:
sample_transforms:
- Decode: {}
- RandomFlip: {prob: 0.5}
- RandomSelect: { transforms1: [ RandomShortSideResize: { short_side_sizes: [ 480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800 ], max_size: 1333 } ],
transforms2: [
RandomShortSideResize: { short_side_sizes: [ 400, 500, 600 ] },
RandomSizeCrop: { min_size: 384, max_size: 600 },
RandomShortSideResize: { short_side_sizes: [ 480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800 ], max_size: 1333 } ]
}
- NormalizeImage: {is_scale: true, mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225]}
- NormalizeBox: {}
- BboxXYXY2XYWH: {}
- Permute: {}
batch_transforms:
- PadMaskBatch: {pad_to_stride: -1, return_pad_mask: true}
batch_size: 4
shuffle: true
drop_last: true
collate_batch: false
use_shared_memory: false
EvalReader:
sample_transforms:
- Decode: {}
- Resize: {target_size: [800, 1333], keep_ratio: True}
- NormalizeImage: {is_scale: true, mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225]}
- Permute: {}
batch_size: 1
TestReader:
sample_transforms:
- Decode: {}
- Resize: {target_size: [800, 1333], keep_ratio: True}
- NormalizeImage: {is_scale: true, mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225]}
- Permute: {}
batch_size: 1

View File

@@ -0,0 +1,46 @@
architecture: DETR
# pretrain_weights: # rewrite in SwinTransformer.pretrained in ppdet/modeling/backbones/swin_transformer.py
hidden_dim: 256
use_focal_loss: True
DETR:
backbone: SwinTransformer
transformer: DINOTransformer
detr_head: DINOHead
post_process: DETRPostProcess
SwinTransformer:
arch: 'swin_L_384' # ['swin_T_224', 'swin_S_224', 'swin_B_224', 'swin_L_224', 'swin_B_384', 'swin_L_384']
ape: false
drop_path_rate: 0.2
patch_norm: true
out_indices: [1, 2, 3]
DINOTransformer:
num_queries: 900
position_embed_type: sine
num_levels: 4
nhead: 8
num_encoder_layers: 6
num_decoder_layers: 6
dim_feedforward: 2048
dropout: 0.0
activation: relu
pe_temperature: 10000
pe_offset: -0.5
num_denoising: 100
label_noise_ratio: 0.5
box_noise_scale: 1.0
learnt_init_query: True
DINOHead:
loss:
name: DINOLoss
loss_coeff: {class: 1, bbox: 5, giou: 2}
aux_loss: True
matcher:
name: HungarianMatcher
matcher_coeff: {class: 2, bbox: 5, giou: 2}
DETRPostProcess:
num_top_queries: 300

Some files were not shown because too many files have changed in this diff.