Replace the document detection model
paddle_detection/configs/keypoint/tiny_pose/README.md (new file, 281 lines)
@@ -0,0 +1,281 @@
简体中文 | [English](README_en.md)

# PP-TinyPose

<div align="center">
  <img src="../../../docs/images/tinypose_demo.png"/>
  <center>图片来源:COCO2017开源数据集</center>
</div>

## 最新动态

- **2022.8.01:发布PP-TinyPose升级版。在健身、舞蹈等场景的业务数据集上,端到端AP提升9.1**
  - 新增体育场景真实数据,复杂动作识别效果显著提升,覆盖侧身、卧躺、跳跃、高抬腿等非常规动作
  - 检测模型升级为[PP-PicoDet增强版](../../../configs/picodet/README.md),在COCO数据集上精度提升3.1%
  - 关键点稳定性增强:新增滤波稳定方式,视频预测结果更加稳定平滑



## 简介

PP-TinyPose是PaddleDetection针对移动端设备优化的实时关键点检测模型,可流畅地在移动端设备上执行多人姿态估计任务。借助PaddleDetection自研的优秀轻量级检测模型[PicoDet](../../picodet/README.md),我们同时提供了特色的轻量级垂类行人检测模型。TinyPose的运行环境有以下依赖要求:
- [PaddlePaddle](https://github.com/PaddlePaddle/Paddle)>=2.2

如希望在移动端部署,则还需要:

- [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite)>=2.11

<div align="center">
  <img src="../../../docs/images/tinypose_pipeline.png" width='800'/>
</div>

## 部署案例

- [Android Fitness Demo](https://github.com/zhiboniu/pose_demo_android):基于PP-TinyPose,高效实现健身校准与计数功能。
<div align="center">
  <img src="https://user-images.githubusercontent.com/22989727/205545098-fe6515af-3f1d-4303-bb4d-6e2141e42e2c.gif" width='636'/>
</div>

- 欢迎扫码快速体验

<div align="center">
  <img src="../../../docs/images/tinypose_app.png" width='220'/>
</div>

## 模型库

### Pipeline性能
| 单人模型配置 | AP (业务数据集) | AP (COCO Val 单人) | 单人耗时 (FP32) | 单人耗时 (FP16) |
| :---------------------------------- | :------: | :------: | :---: | :---: |
| PicoDet-S-Lcnet-Pedestrian-192\*192 + PP-TinyPose-128\*96 | 77.1 (+9.1) | 52.3 (+0.5) | 12.90 ms | 9.61 ms |

| 多人模型配置 | AP (业务数据集) | AP (COCO Val 多人) | 6人耗时 (FP32) | 6人耗时 (FP16) |
| :------------------------ | :-------: | :-------: | :---: | :---: |
| PicoDet-S-Lcnet-Pedestrian-320\*320 + PP-TinyPose-128\*96 | 78.0 (+7.7) | 50.1 (-0.2) | 47.63 ms | 34.62 ms |
**说明**

- 关键点检测模型的精度指标是基于对应行人检测模型检测得到的检测框。
- 精度测试中去除了flip操作,且检测置信度阈值要求0.5。
- 速度测试环境为Qualcomm Snapdragon 865,采用arm8下4线程推理。
- Pipeline速度包含模型的预处理、推理及后处理部分。
- 精度值的增量对比自历史版本中对应模型组合,详情请见**历史版本-Pipeline性能**。
- 精度测试中,为了公平比较,多人数据去除了6人以上(不含6人)的图像。
### 关键点检测模型

| 模型 | 输入尺寸 | AP (业务数据集) | AP (COCO Val) | 参数量 | FLOPS | 单人推理耗时 (FP32) | 单人推理耗时 (FP16) | 配置文件 | 模型权重 | 预测部署模型 | Paddle-Lite部署模型(FP32) | Paddle-Lite部署模型(FP16) |
| :---------- | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: |
| PP-TinyPose | 128*96 | 84.3 | 58.4 | 1.32 M | 81.56 M | 4.57ms | 3.27ms | [Config](./tinypose_128x96.yml) | [Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/tinypose_128x96.pdparams) | [预测部署模型](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/tinypose_128x96.zip) | [Lite部署模型](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/tinypose_128x96_fp32.nb) | [Lite部署模型(FP16)](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/tinypose_128x96_fp16.nb) |
| PP-TinyPose | 256*192 | 91.0 | 68.3 | 1.32 M | 326.24 M | 14.07ms | 8.33ms | [Config](./tinypose_256x192.yml) | [Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/tinypose_256x192.pdparams) | [预测部署模型](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/tinypose_256x192.zip) | [Lite部署模型](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/tinypose_256x192_fp32.nb) | [Lite部署模型(FP16)](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/tinypose_256x192_fp16.nb) |

### 行人检测模型

| 模型 | 输入尺寸 | mAP (COCO Val-Person) | 参数量 | FLOPS | 平均推理耗时 (FP32) | 平均推理耗时 (FP16) | 配置文件 | 模型权重 | 预测部署模型 | Paddle-Lite部署模型(FP32) | Paddle-Lite部署模型(FP16) |
| :------------------- | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: |
| PicoDet-S-Lcnet-Pedestrian | 192*192 | 31.7 | 1.16 M | 170.03 M | 5.24ms | 3.66ms | [Config](../../picodet/application/pedestrian_detection/picodet_s_192_lcnet_pedestrian.yml) | [Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/picodet_s_192_lcnet_pedestrian.pdparams) | [预测部署模型](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/picodet_s_192_lcnet_pedestrian.zip) | [Lite部署模型](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/picodet_s_192_lcnet_pedestrian_fp32.nb) | [Lite部署模型(FP16)](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/picodet_s_192_lcnet_pedestrian_fp16.nb) |
| PicoDet-S-Lcnet-Pedestrian | 320*320 | 41.6 | 1.16 M | 472.07 M | 13.87ms | 8.94ms | [Config](../../picodet/application/pedestrian_detection/picodet_s_320_lcnet_pedestrian.yml) | [Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/picodet_s_320_lcnet_pedestrian.pdparams) | [预测部署模型](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/picodet_s_320_lcnet_pedestrian.zip) | [Lite部署模型](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/picodet_s_320_lcnet_pedestrian_fp32.nb) | [Lite部署模型(FP16)](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/picodet_s_320_lcnet_pedestrian_fp16.nb) |

**说明**

- 关键点检测模型与行人检测模型均使用`COCO train2017`、`AI Challenger trainset`以及采集的多姿态场景数据集作为训练集。关键点检测模型使用多姿态场景数据集作为测试集,行人检测模型采用`COCO instances val2017`作为测试集。
- 关键点检测模型的精度指标所依赖的检测框为ground truth标注得到。
- 关键点检测模型与行人检测模型均在4卡环境下训练,若实际训练环境需要改变GPU数量或batch size,须参考[FAQ](../../../docs/tutorials/FAQ/README.md)对应调整学习率(换算示意见下方代码)。
- 推理速度测试环境为Qualcomm Snapdragon 865,采用arm8下4线程推理得到。
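下面给出一个学习率换算的示意代码(假设FAQ中采用常见的线性缩放规则,即学习率与总batch size成正比;其中的默认值仅为示意,请以实际使用的配置文件为准):

```python
# 线性缩放规则(假设):new_lr = base_lr * (新总batch size / 默认总batch size)
default_gpus, default_bs_per_gpu, default_base_lr = 4, 512, 0.008  # 示意:参考tinypose_128x96.yml的4卡训练设置
my_gpus, my_bs_per_gpu = 1, 128                                    # 示意:实际训练环境

scale = (my_gpus * my_bs_per_gpu) / (default_gpus * default_bs_per_gpu)
new_base_lr = default_base_lr * scale
print(f"建议将 LearningRate.base_lr 调整为约 {new_base_lr:.6f}")   # 输出:0.000500
```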
## 历史版本

<details>
<summary>2021版本</summary>

### Pipeline性能

| 单人模型配置 | AP (COCO Val 单人) | 单人耗时 (FP32) | 单人耗时 (FP16) |
| :------------------------ | :------: | :---: | :---: |
| PicoDet-S-Pedestrian-192\*192 + PP-TinyPose-128\*96 | 51.8 | 11.72 ms | 8.18 ms |
| 其他优秀开源模型-192\*192 | 22.3 | 12.0 ms | - |

| 多人模型配置 | AP (COCO Val 多人) | 6人耗时 (FP32) | 6人耗时 (FP16) |
| :------------------------ | :-------: | :---: | :---: |
| PicoDet-S-Pedestrian-320\*320 + PP-TinyPose-128\*96 | 50.3 | 44.0 ms | 32.57 ms |
| 其他优秀开源模型-256\*256 | 39.4 | 51.0 ms | - |
**说明**

- 关键点检测模型的精度指标是基于对应行人检测模型检测得到的检测框。
- 精度测试中去除了flip操作,且检测置信度阈值要求0.5。
- 精度测试中,为了公平比较,多人数据去除了6人以上(不含6人)的图像。
- 速度测试环境为Qualcomm Snapdragon 865,采用arm8下4线程、FP32推理得到。
- Pipeline速度包含模型的预处理、推理及后处理部分。
- 其他优秀开源模型的测试及部署方案,请参考[这里](https://github.com/zhiboniu/MoveNet-PaddleLite)。
- 更多环境下的性能测试结果,请参考[Keypoint Inference Benchmark](../KeypointBenchmark.md)。

### 关键点检测模型

| 模型 | 输入尺寸 | AP (COCO Val) | 单人推理耗时 (FP32) | 单人推理耗时 (FP16) | 配置文件 | 模型权重 | 预测部署模型 | Paddle-Lite部署模型(FP32) | Paddle-Lite部署模型(FP16) |
| :---------- | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: |
| PP-TinyPose | 128*96 | 58.1 | 4.57ms | 3.27ms | [Config](./tinypose_128x96.yml) | [Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_128x96.pdparams) | [预测部署模型](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_128x96.tar) | [Lite部署模型](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_128x96_lite.tar) | [Lite部署模型(FP16)](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_128x96_fp16_lite.tar) |
| PP-TinyPose | 256*192 | 68.8 | 14.07ms | 8.33ms | [Config](./tinypose_256x192.yml) | [Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_256x192.pdparams) | [预测部署模型](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_256x192.tar) | [Lite部署模型](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_256x192_lite.tar) | [Lite部署模型(FP16)](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_256x192_fp16_lite.tar) |

### 行人检测模型

| 模型 | 输入尺寸 | mAP (COCO Val-Person) | 平均推理耗时 (FP32) | 平均推理耗时 (FP16) | 配置文件 | 模型权重 | 预测部署模型 | Paddle-Lite部署模型(FP32) | Paddle-Lite部署模型(FP16) |
| :------------------- | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: |
| PicoDet-S-Pedestrian | 192*192 | 29.0 | 4.30ms | 2.37ms | [Config](../../picodet/legacy_model/application/pedestrian_detection/picodet_s_192_pedestrian.yml) | [Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_192_pedestrian.pdparams) | [预测部署模型](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_192_pedestrian.tar) | [Lite部署模型](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_192_pedestrian_lite.tar) | [Lite部署模型(FP16)](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_192_pedestrian_fp16_lite.tar) |
| PicoDet-S-Pedestrian | 320*320 | 38.5 | 10.26ms | 6.30ms | [Config](../../picodet/legacy_model/application/pedestrian_detection/picodet_s_320_pedestrian.yml) | [Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_320_pedestrian.pdparams) | [预测部署模型](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_320_pedestrian.tar) | [Lite部署模型](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_320_pedestrian_lite.tar) | [Lite部署模型(FP16)](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_320_pedestrian_fp16_lite.tar) |

**说明**

- 关键点检测模型与行人检测模型均使用`COCO train2017`和`AI Challenger trainset`作为训练集。关键点检测模型使用`COCO person keypoints val2017`作为测试集,行人检测模型采用`COCO instances val2017`作为测试集。
- 关键点检测模型的精度指标所依赖的检测框为ground truth标注得到。
- 关键点检测模型与行人检测模型均在4卡环境下训练,若实际训练环境需要改变GPU数量或batch size,须参考[FAQ](../../../docs/tutorials/FAQ/README.md)对应调整学习率。
- 推理速度测试环境为Qualcomm Snapdragon 865,采用arm8下4线程推理得到。
</details>

## 模型训练

关键点检测模型与行人检测模型的训练集在`COCO`以外还扩充了[AI Challenger](https://arxiv.org/abs/1711.06475)数据集,各数据集关键点定义如下:

```
COCO keypoint Description:
0: "Nose",
1: "Left Eye",
2: "Right Eye",
3: "Left Ear",
4: "Right Ear",
5: "Left Shoulder",
6: "Right Shoulder",
7: "Left Elbow",
8: "Right Elbow",
9: "Left Wrist",
10: "Right Wrist",
11: "Left Hip",
12: "Right Hip",
13: "Left Knee",
14: "Right Knee",
15: "Left Ankle",
16: "Right Ankle"

AI Challenger Description:
0: "Right Shoulder",
1: "Right Elbow",
2: "Right Wrist",
3: "Left Shoulder",
4: "Left Elbow",
5: "Left Wrist",
6: "Right Hip",
7: "Right Knee",
8: "Right Ankle",
9: "Left Hip",
10: "Left Knee",
11: "Left Ankle",
12: "Head top",
13: "Neck"
```

由于两个数据集的关键点标注形式不同,我们将两个数据集的标注进行了对齐,仍然沿用COCO的标注形式,您可以下载[训练的参考列表](https://bj.bcebos.com/v1/paddledet/data/keypoint/aic_coco_train_cocoformat.json)并放在`dataset/`下使用。对齐两个数据集标注文件的主要处理如下(索引映射示意见下方代码):
- `AI Challenger`关键点标注顺序调整至与COCO一致,统一是否标注/可见的标志位;
- 舍弃了`AI Challenger`中特有的点位;将`AI Challenger`数据中`COCO`特有点位标记为未标注;
- 重新排列了`image_id`与`annotation id`。
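下面给出一个将`AI Challenger`标注重排到`COCO`关键点顺序的示意代码(非官方转换脚本,仅演示索引映射思路;`COCO`特有的鼻、眼、耳等点位标记为未标注,`AI Challenger`特有的头顶、脖颈点位被舍弃):

```python
# 示意:将AI Challenger的14点标注映射为COCO的17点顺序(非官方实现)
# 字典键为COCO关键点索引,值为对应的AI Challenger索引;COCO前5个点(鼻/眼/耳)在AI Challenger中不存在
COCO_IDX_TO_AIC_IDX = {5: 3, 6: 0, 7: 4, 8: 1, 9: 5, 10: 2,
                       11: 9, 12: 6, 13: 10, 14: 7, 15: 11, 16: 8}

def aic_to_coco_keypoints(aic_kpts):
    """aic_kpts: 长度为14的[x, y, v]列表,返回COCO顺序的17个关键点。"""
    coco_kpts = [[0.0, 0.0, 0] for _ in range(17)]  # v=0 表示未标注
    for coco_idx, aic_idx in COCO_IDX_TO_AIC_IDX.items():
        coco_kpts[coco_idx] = list(aic_kpts[aic_idx])
    return coco_kpts
```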
利用转换为`COCO`形式的合并数据标注,执行模型训练:

```bash
# 关键点检测模型
python3 -m paddle.distributed.launch tools/train.py -c configs/keypoint/tiny_pose/tinypose_128x96.yml

# 行人检测模型
python3 -m paddle.distributed.launch tools/train.py -c configs/picodet/application/pedestrian_detection/picodet_s_320_pedestrian.yml
```
## 部署流程

### 实现部署预测

1. 通过以下命令将训练得到的模型导出:

```bash
python3 tools/export_model.py -c configs/picodet/application/pedestrian_detection/picodet_s_192_pedestrian.yml --output_dir=output_inference -o weights=output/picodet_s_192_pedestrian/model_final

python3 tools/export_model.py -c configs/keypoint/tiny_pose/tinypose_128x96.yml --output_dir=output_inference -o weights=output/tinypose_128x96/model_final
```

导出后的模型目录结构如下:

```
picodet_s_192_pedestrian
├── infer_cfg.yml
├── model.pdiparams
├── model.pdiparams.info
└── model.pdmodel
```

您也可以直接下载模型库中提供的对应`预测部署模型`,分别获取行人检测模型和关键点检测模型的预测部署模型,解压即可。
2. 执行Python联合部署预测

```bash
# 预测一张图片
python3 deploy/python/det_keypoint_unite_infer.py --det_model_dir=output_inference/picodet_s_320_pedestrian --keypoint_model_dir=output_inference/tinypose_128x96 --image_file={your image file} --device=GPU

# 预测多张图片
python3 deploy/python/det_keypoint_unite_infer.py --det_model_dir=output_inference/picodet_s_320_pedestrian --keypoint_model_dir=output_inference/tinypose_128x96 --image_dir={dir of image file} --device=GPU

# 预测一个视频
python3 deploy/python/det_keypoint_unite_infer.py --det_model_dir=output_inference/picodet_s_320_pedestrian --keypoint_model_dir=output_inference/tinypose_128x96 --video_file={your video file} --device=GPU
```
3. 执行C++联合部署预测

- 请先按照[C++端预测部署](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.3/deploy/cpp),根据您的实际环境准备对应的`paddle_inference`库及相关依赖。
- 我们提供了[一键编译脚本](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.3/deploy/cpp/scripts/build.sh),您可以通过该脚本填写相关环境变量的位置,编译上述代码后,得到可执行文件。该过程中请保证`WITH_KEYPOINT=ON`。
- 编译完成后,即可执行部署预测,例如:

```bash
# 预测一张图片
./build/main --model_dir=output_inference/picodet_s_320_pedestrian --model_dir_keypoint=output_inference/tinypose_128x96 --image_file={your image file} --device=GPU

# 预测多张图片
./build/main --model_dir=output_inference/picodet_s_320_pedestrian --model_dir_keypoint=output_inference/tinypose_128x96 --image_dir={dir of image file} --device=GPU

# 预测一个视频
./build/main --model_dir=output_inference/picodet_s_320_pedestrian --model_dir_keypoint=output_inference/tinypose_128x96 --video_file={your video file} --device=GPU
```
### 实现移动端部署

#### 直接使用我们提供的模型进行部署

1. 下载模型库中提供的`Paddle-Lite部署模型`,分别获取行人检测模型和关键点检测模型的`.nb`格式文件。
2. 准备Paddle-Lite运行环境,可直接通过[PaddleLite预编译库下载](https://paddle-lite.readthedocs.io/zh/latest/quick_start/release_lib.html)获取预编译库,无需自行编译。如需要采用FP16推理,则需要下载FP16的预编译库。
3. 编译模型运行代码,详细步骤见[Paddle-Lite端侧部署](../../../deploy/lite/README.md)。

#### 将训练的模型实现端侧部署

如果您希望将自己训练的模型应用于部署,可以参考以下步骤:

1. 将训练的模型导出

```bash
python3 tools/export_model.py -c configs/picodet/application/pedestrian_detection/picodet_s_192_pedestrian.yml --output_dir=output_inference -o weights=output/picodet_s_192_pedestrian/model_final TestReader.fuse_normalize=true

python3 tools/export_model.py -c configs/keypoint/tiny_pose/tinypose_128x96.yml --output_dir=output_inference -o weights=output/tinypose_128x96/model_final TestReader.fuse_normalize=true
```

2. 转换为Lite模型(依赖[Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite))

- 安装Paddle-Lite:

```bash
pip install paddlelite
```

- 执行以下步骤,以得到对应后缀为`.nb`的Paddle-Lite模型用于端侧部署:

```bash
# 1. 转换行人检测模型
# FP32
paddle_lite_opt --model_dir=inference_model/picodet_s_192_pedestrian --valid_targets=arm --optimize_out=picodet_s_192_pedestrian_fp32
# FP16
paddle_lite_opt --model_dir=inference_model/picodet_s_192_pedestrian --valid_targets=arm --optimize_out=picodet_s_192_pedestrian_fp16 --enable_fp16=true

# 2. 转换关键点检测模型
# FP32
paddle_lite_opt --model_dir=inference_model/tinypose_128x96 --valid_targets=arm --optimize_out=tinypose_128x96_fp32
# FP16
paddle_lite_opt --model_dir=inference_model/tinypose_128x96 --valid_targets=arm --optimize_out=tinypose_128x96_fp16 --enable_fp16=true
```
3. 编译模型运行代码,详细步骤见[Paddle-Lite端侧部署](../../../deploy/lite/README.md)。

我们已提供包含数据预处理、模型推理及模型后处理的[全流程示例代码](../../../deploy/lite/),可根据实际需求进行修改。

**注意**

- 在导出模型时增加`TestReader.fuse_normalize=true`参数,可以将对图像的Normalize操作合并在模型中执行,从而实现加速。
- FP16推理可实现更快的模型推理速度。若希望部署FP16模型,除模型转换步骤外,还需要编译支持FP16的Paddle-Lite预测库,详见[Paddle Lite 使用 ARM CPU 预测部署](https://paddle-lite.readthedocs.io/zh/latest/demo_guides/arm_cpu.html)。

## 关键点稳定策略(仅支持视频推理)

请参考[关键点稳定策略](../README.md#关键点稳定策略仅适用于视频数据)。
## 优化策略

TinyPose采用了以下策略来平衡模型的速度和精度表现:

- 轻量级的姿态估计任务骨干网络,[wider naive Lite-HRNet](https://arxiv.org/abs/2104.06403)。
- 更小的输入尺寸,以提升整体推理速度。
- 加入Distribution-Aware coordinate Representation of Keypoints ([DARK](https://arxiv.org/abs/1910.06278)),以提升低分辨率热力图下模型的精度表现(亚像素解码示意见下方代码)。
- Unbiased Data Processing ([UDP](https://arxiv.org/abs/1911.07524)),使用无偏数据编解码提升模型精度。
- Augmentation by Information Dropping ([AID](https://arxiv.org/abs/2008.07139v2)),通过添加信息丢失的数据增强,提升模型对关键点的定位能力。
- FP16推理,实现更快的模型推理速度。
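以下是DARK亚像素解码思路的一个极简示意(基于对数热力图在峰值附近的二阶泰勒展开求牛顿步;省略了论文中的热力图调制与完整的边界处理,仅用于说明原理):

```python
import numpy as np

def dark_refine(heatmap, eps=1e-10):
    """对单通道热力图做DARK式亚像素精修的示意实现(非官方实现)。"""
    h, w = heatmap.shape
    y, x = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    if not (1 <= x < w - 1 and 1 <= y < h - 1):
        return float(x), float(y)                     # 边界峰值不做精修
    p = np.log(np.maximum(heatmap, eps))
    # 峰值处的一阶/二阶数值导数(梯度与Hessian)
    dx  = 0.5 * (p[y, x + 1] - p[y, x - 1])
    dy  = 0.5 * (p[y + 1, x] - p[y - 1, x])
    dxx = p[y, x + 1] - 2 * p[y, x] + p[y, x - 1]
    dyy = p[y + 1, x] - 2 * p[y, x] + p[y - 1, x]
    dxy = 0.25 * (p[y + 1, x + 1] - p[y + 1, x - 1] - p[y - 1, x + 1] + p[y - 1, x - 1])
    hessian = np.array([[dxx, dxy], [dxy, dyy]])
    grad = np.array([dx, dy])
    if abs(np.linalg.det(hessian)) < eps:
        return float(x), float(y)
    offset = -np.linalg.solve(hessian, grad)          # 牛顿步:峰值的亚像素偏移
    return float(x + offset[0]), float(y + offset[1])
```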
paddle_detection/configs/keypoint/tiny_pose/README_en.md (new file, 224 lines)
@@ -0,0 +1,224 @@
[简体中文](README.md) | English

# PP-TinyPose

<div align="center">
  <img src="../../../docs/images/tinypose_demo.png"/>
  <center>Image Source: COCO2017</center>
</div>

## Introduction

PP-TinyPose is a real-time keypoint detection model optimized by PaddleDetection for mobile devices, which can smoothly run multi-person pose estimation tasks on mobile devices. With the excellent self-developed lightweight detection model [PicoDet](../../picodet/README.md), we also provide a lightweight pedestrian detection model. PP-TinyPose has the following dependency requirements:

- [PaddlePaddle](https://github.com/PaddlePaddle/Paddle)>=2.2

If you want to deploy it on mobile devices, you also need:

- [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite)>=2.10
<div align="center">
  <img src="../../../docs/images/tinypose_pipeline.png" width='800'/>
</div>

## Deployment Case

- [Android Fitness Demo](https://github.com/zhiboniu/pose_demo_android) based on PP-TinyPose, which efficiently implements fitness calibration and counting.

<div align="center">
  <img src="https://user-images.githubusercontent.com/22989727/205545098-fe6515af-3f1d-4303-bb4d-6e2141e42e2c.gif" width='636'/>
</div>

- Welcome to scan the QR code for a quick experience.

<div align="center">
  <img src="../../../docs/images/tinypose_app.png" width='220'/>
</div>

## Model Zoo

### Keypoint Detection Model

| Model | Input Size | AP (COCO Val) | Inference Time for Single Person (FP32) | Inference Time for Single Person (FP16) | Config | Model Weights | Deployment Model | Paddle-Lite Model(FP32) | Paddle-Lite Model(FP16) |
| :------------------------ | :-------: | :------: | :------: | :---: | :---: | :---: | :---: | :---: | :---: |
| PP-TinyPose | 128*96 | 58.1 | 4.57ms | 3.27ms | [Config](./tinypose_128x96.yml) | [Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_128x96.pdparams) | [Deployment Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_128x96.tar) | [Lite Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_128x96_lite.tar) | [Lite Model(FP16)](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_128x96_fp16_lite.tar) |
| PP-TinyPose | 256*192 | 68.8 | 14.07ms | 8.33ms | [Config](./tinypose_256x192.yml) | [Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_256x192.pdparams) | [Deployment Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_256x192.tar) | [Lite Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_256x192_lite.tar) | [Lite Model(FP16)](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_256x192_fp16_lite.tar) |

### Pedestrian Detection Model

| Model | Input Size | mAP (COCO Val) | Average Inference Time (FP32) | Average Inference Time (FP16) | Config | Model Weights | Deployment Model | Paddle-Lite Model(FP32) | Paddle-Lite Model(FP16) |
| :------------------------ | :-------: | :------: | :------: | :---: | :---: | :---: | :---: | :---: | :---: |
| PicoDet-S-Pedestrian | 192*192 | 29.0 | 4.30ms | 2.37ms | [Config](../../picodet/legacy_model/application/pedestrian_detection/picodet_s_192_pedestrian.yml) | [Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_192_pedestrian.pdparams) | [Deployment Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_192_pedestrian.tar) | [Lite Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_192_pedestrian_lite.tar) | [Lite Model(FP16)](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_192_pedestrian_fp16_lite.tar) |
| PicoDet-S-Pedestrian | 320*320 | 38.5 | 10.26ms | 6.30ms | [Config](../../picodet/legacy_model/application/pedestrian_detection/picodet_s_320_pedestrian.yml) | [Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_320_pedestrian.pdparams) | [Deployment Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_320_pedestrian.tar) | [Lite Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_320_pedestrian_lite.tar) | [Lite Model(FP16)](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_320_pedestrian_fp16_lite.tar) |
**Tips**

- The keypoint detection model and pedestrian detection model are both trained on `COCO train2017` and `AI Challenger trainset`. The keypoint detection model is evaluated on `COCO person keypoints val2017`, and the pedestrian detection model is evaluated on `COCO instances val2017`.
- The AP results of the keypoint detection models are based on ground-truth bounding boxes.
- Both the keypoint detection model and the pedestrian detection model are trained in a 4-GPU environment. In practice, if the number of GPUs or the batch size needs to be changed according to the training environment, you should refer to the [FAQ](../../../docs/tutorials/FAQ/README.md) to adjust the learning rate accordingly.
- The inference time is tested on a Qualcomm Snapdragon 865, with 4 threads at arm8.

### Pipeline Performance

| Model for Single-Pose | AP (COCO Val Single-Person) | Time for Single Person (FP32) | Time for Single Person (FP16) |
| :------------------------ | :------: | :---: | :---: |
| PicoDet-S-Pedestrian-192\*192 + PP-TinyPose-128\*96 | 51.8 | 11.72 ms | 8.18 ms |
| Other opensource model-192\*192 | 22.3 | 12.0 ms | - |

| Model for Multi-Pose | AP (COCO Val Multi-Persons) | Time for Six Persons (FP32) | Time for Six Persons (FP16) |
| :------------------------ | :-------: | :---: | :---: |
| PicoDet-S-Pedestrian-320\*320 + PP-TinyPose-128\*96 | 50.3 | 44.0 ms | 32.57 ms |
| Other opensource model-256\*256 | 39.4 | 51.0 ms | - |

**Tips**

- The AP results of the keypoint detection models are based on bounding boxes detected by the corresponding detection model.
- In the accuracy evaluation, there is no flip, and the confidence threshold of bounding boxes is set to 0.5.
- For fairness, in the multi-person test we remove images with more than 6 people.
- The inference time is tested on a Qualcomm Snapdragon 865, with 4 threads at arm8, FP32.
- Pipeline time includes time for preprocessing, inference and postprocessing.
- For the testing and deployment of the other opensource model, please refer to [here](https://github.com/zhiboniu/MoveNet-PaddleLite).
- For more performance data in other runtime environments, please refer to the [Keypoint Inference Benchmark](../KeypointBenchmark.md).
## Model Training

In addition to `COCO`, the training set for the keypoint detection model and the pedestrian detection model also includes [AI Challenger](https://arxiv.org/abs/1711.06475). The keypoints of each dataset are defined as follows:

```
COCO keypoint Description:
0: "Nose",
1: "Left Eye",
2: "Right Eye",
3: "Left Ear",
4: "Right Ear",
5: "Left Shoulder",
6: "Right Shoulder",
7: "Left Elbow",
8: "Right Elbow",
9: "Left Wrist",
10: "Right Wrist",
11: "Left Hip",
12: "Right Hip",
13: "Left Knee",
14: "Right Knee",
15: "Left Ankle",
16: "Right Ankle"

AI Challenger Description:
0: "Right Shoulder",
1: "Right Elbow",
2: "Right Wrist",
3: "Left Shoulder",
4: "Left Elbow",
5: "Left Wrist",
6: "Right Hip",
7: "Right Knee",
8: "Right Ankle",
9: "Left Hip",
10: "Left Knee",
11: "Left Ankle",
12: "Head top",
13: "Neck"
```

Since the annotation formats of these two datasets are different, we aligned their annotations to the `COCO` format. You can download the [Training List](https://bj.bcebos.com/v1/paddledet/data/keypoint/aic_coco_train_cocoformat.json) and put it at `dataset/`. To align these two datasets, we mainly did the following:
- Align the indexes of the `AI Challenger` keypoints to be consistent with `COCO`, and unify the flags indicating whether a keypoint is labeled/visible.
- Discard the keypoints unique to `AI Challenger`. For keypoints that are in `COCO` but not in this dataset, mark them as not labeled.
- Rearrange `image_id` and `annotation id`.

Train with the merged annotation file converted to `COCO` format:
```bash
# keypoint detection model
python3 -m paddle.distributed.launch tools/train.py -c configs/keypoint/tiny_pose/tinypose_128x96.yml

# pedestrian detection model
python3 -m paddle.distributed.launch tools/train.py -c configs/picodet/application/pedestrian_detection/picodet_s_320_pedestrian.yml
```

## Model Deployment

### Deploy Inference

1. Export the trained models with the following commands:

```bash
python3 tools/export_model.py -c configs/picodet/application/pedestrian_detection/picodet_s_192_pedestrian.yml --output_dir=output_inference -o weights=output/picodet_s_192_pedestrian/model_final

python3 tools/export_model.py -c configs/keypoint/tiny_pose/tinypose_128x96.yml --output_dir=output_inference -o weights=output/tinypose_128x96/model_final
```

The exported model directory looks like:

```
picodet_s_192_pedestrian
├── infer_cfg.yml
├── model.pdiparams
├── model.pdiparams.info
└── model.pdmodel
```

You can also download the corresponding `Deployment Model` from the `Model Zoo` directly to obtain the deployment models of the pedestrian detection model and the keypoint detection model, then unzip them.
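As a quick sanity check, an exported (or downloaded) deployment model can be loaded with the Paddle Inference Python API. The sketch below is a minimal, hypothetical example (the paths and the dummy input shape for the 128x96 TinyPose model are assumptions); for the full pre/post-processing pipeline, use `deploy/python/det_keypoint_unite_infer.py` as shown below.

```python
import numpy as np
from paddle.inference import Config, create_predictor

# Assumed path to an exported PP-TinyPose deployment model
model_dir = "output_inference/tinypose_128x96"
config = Config(f"{model_dir}/model.pdmodel", f"{model_dir}/model.pdiparams")
predictor = create_predictor(config)

# Feed a dummy NCHW tensor (1 x 3 x 128 x 96) just to verify the model runs
input_handle = predictor.get_input_handle(predictor.get_input_names()[0])
input_handle.copy_from_cpu(np.random.rand(1, 3, 128, 96).astype("float32"))
predictor.run()

output_handle = predictor.get_output_handle(predictor.get_output_names()[0])
heatmaps = output_handle.copy_to_cpu()
print(heatmaps.shape)  # expected: (1, 17, 32, 24), i.e. 17 heatmaps at 1/4 input resolution
```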
2. Python joint inference with the detection and keypoint models

```bash
# inference for one image
python3 deploy/python/det_keypoint_unite_infer.py --det_model_dir=output_inference/picodet_s_320_pedestrian --keypoint_model_dir=output_inference/tinypose_128x96 --image_file={your image file} --device=GPU

# inference for several images
python3 deploy/python/det_keypoint_unite_infer.py --det_model_dir=output_inference/picodet_s_320_pedestrian --keypoint_model_dir=output_inference/tinypose_128x96 --image_dir={dir of image file} --device=GPU

# inference for a video
python3 deploy/python/det_keypoint_unite_infer.py --det_model_dir=output_inference/picodet_s_320_pedestrian --keypoint_model_dir=output_inference/tinypose_128x96 --video_file={your video file} --device=GPU
```
3. C++ joint inference with the detection and keypoint models

- First, please refer to [C++ Deploy Inference](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.3/deploy/cpp) and prepare the corresponding `paddle_inference` library and related dependencies according to your environment.
- We provide a [compile script](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.3/deploy/cpp/scripts/build.sh). You can fill in the locations of the relevant environment variables in this script and execute it to compile the above code into an executable file. Please ensure `WITH_KEYPOINT=ON` during this process.
- After compilation, you can run inference as follows:

```bash
# inference for one image
./build/main --model_dir=output_inference/picodet_s_320_pedestrian --model_dir_keypoint=output_inference/tinypose_128x96 --image_file={your image file} --device=GPU

# inference for several images
./build/main --model_dir=output_inference/picodet_s_320_pedestrian --model_dir_keypoint=output_inference/tinypose_128x96 --image_dir={dir of image file} --device=GPU

# inference for a video
./build/main --model_dir=output_inference/picodet_s_320_pedestrian --model_dir_keypoint=output_inference/tinypose_128x96 --video_file={your video file} --device=GPU
```
### Deployment on Mobile Devices

#### Deploy directly using the models we provide

1. Download the `Lite Model` from the `Model Zoo` directly to get the `.nb` format files of the pedestrian detection model and the keypoint detection model.
2. Prepare the environment for Paddle-Lite; you can obtain precompiled libraries from [PaddleLite Precompiled Libraries](https://paddle-lite.readthedocs.io/zh/latest/quick_start/release_lib.html). If FP16 is needed, you should download the [Precompiled Libraries for FP16](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.10-rc/inference_lite_lib.android.armv8_clang_c++_static_with_extra_with_cv_with_fp16.tiny_publish_427e46.zip).
3. Compile the code to run the models. The details can be found in [Paddle-Lite Deployment on Mobile Devices](../../../deploy/lite/README.md).

#### Deploy self-trained models on mobile devices

If you want to deploy self-trained models, you can refer to the following steps:

1. Export the trained models

```bash
python3 tools/export_model.py -c configs/picodet/application/pedestrian_detection/picodet_s_192_pedestrian.yml --output_dir=output_inference -o weights=output/picodet_s_192_pedestrian/model_final TestReader.fuse_normalize=true

python3 tools/export_model.py -c configs/keypoint/tiny_pose/tinypose_128x96.yml --output_dir=output_inference -o weights=output/tinypose_128x96/model_final TestReader.fuse_normalize=true
```

2. Convert to Lite models (relies on [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite))

- Install Paddle-Lite:

```bash
pip install paddlelite
```
- Run the following commands to obtain the `.nb` format models of Paddle-Lite:

```bash
# 1. Convert the pedestrian detection model
# FP32
paddle_lite_opt --model_dir=inference_model/picodet_s_192_pedestrian --valid_targets=arm --optimize_out=picodet_s_192_pedestrian_fp32
# FP16
paddle_lite_opt --model_dir=inference_model/picodet_s_192_pedestrian --valid_targets=arm --optimize_out=picodet_s_192_pedestrian_fp16 --enable_fp16=true

# 2. Convert the keypoint detection model
# FP32
paddle_lite_opt --model_dir=inference_model/tinypose_128x96 --valid_targets=arm --optimize_out=tinypose_128x96_fp32
# FP16
paddle_lite_opt --model_dir=inference_model/tinypose_128x96 --valid_targets=arm --optimize_out=tinypose_128x96_fp16 --enable_fp16=true
```

3. Compile the code to run the models. The details can be found in [Paddle-Lite Deployment on Mobile Devices](../../../deploy/lite/README.md).

We provide [example code](../../../deploy/lite/) covering data preprocessing, inference and postprocessing. You can modify the code according to your actual needs.

**Note:**

- Add `TestReader.fuse_normalize=true` when exporting the model. The Normalize operation on the image will then be executed inside the model, which speeds up inference.
- With FP16, we can get a faster inference speed. If you want to deploy the FP16 model, in addition to the model conversion step, you also need to compile a Paddle-Lite prediction library that supports FP16. See [Paddle Lite Deployment on ARM CPU](https://paddle-lite.readthedocs.io/zh/latest/demo_guides/arm_cpu.html) for details.

## Optimization Strategies

TinyPose adopts the following strategies to balance the speed and accuracy of the model:

- A lightweight backbone network for pose estimation, [wider naive Lite-HRNet](https://arxiv.org/abs/2104.06403).
- A smaller input size.
- Distribution-Aware coordinate Representation of Keypoints ([DARK](https://arxiv.org/abs/1910.06278)), which improves the accuracy of the model with low-resolution heatmaps.
- Unbiased Data Processing ([UDP](https://arxiv.org/abs/1911.07524)).
- Augmentation by Information Dropping ([AID](https://arxiv.org/abs/2008.07139v2)).
- FP16 inference.
paddle_detection/configs/keypoint/tiny_pose/tinypose_128x96.yml (new file, 150 lines)
@@ -0,0 +1,150 @@
use_gpu: true
log_iter: 5
save_dir: output
snapshot_epoch: 10
weights: output/tinypose_128x96/model_final
epoch: 420
num_joints: &num_joints 17
pixel_std: &pixel_std 200
metric: KeyPointTopDownCOCOEval
num_classes: 1
train_height: &train_height 128
train_width: &train_width 96
trainsize: &trainsize [*train_width, *train_height]
hmsize: &hmsize [24, 32]
flip_perm: &flip_perm [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12], [13, 14], [15, 16]]
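# 说明:hmsize 为输出热力图大小 [w, h],为 trainsize 的 1/4(96/4=24,128/4=32)
# flip_perm 为水平翻转时需要交换的左右对称 COCO 关键点索引对(眼/耳/肩/肘/腕/髋/膝/踝)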

# AMP training
init_loss_scaling: 32752
master_grad: true

#####model
architecture: TopDownHRNet

TopDownHRNet:
  backbone: LiteHRNet
  post_process: HRNetPostProcess
  flip_perm: *flip_perm
  num_joints: *num_joints
  width: &width 40
  loss: KeyPointMSELoss
  use_dark: true

LiteHRNet:
  network_type: wider_naive
  freeze_at: -1
  freeze_norm: false
  return_idx: [0]

KeyPointMSELoss:
  use_target_weight: true
  loss_scale: 1.0

#####optimizer
LearningRate:
  base_lr: 0.008
  schedulers:
  - !PiecewiseDecay
    milestones: [380, 410]
    gamma: 0.1
  - !LinearWarmup
    start_factor: 0.001
    steps: 500

OptimizerBuilder:
  optimizer:
    type: Adam
  regularizer:
    factor: 0.0
    type: L2


#####data
TrainDataset:
  !KeypointTopDownCocoDataset
    image_dir: ""
    anno_path: aic_coco_train_cocoformat.json
    dataset_dir: dataset
    num_joints: *num_joints
    trainsize: *trainsize
    pixel_std: *pixel_std
    use_gt_bbox: True


EvalDataset:
  !KeypointTopDownCocoDataset
    image_dir: val2017
    anno_path: annotations/person_keypoints_val2017.json
    dataset_dir: dataset/coco
    num_joints: *num_joints
    trainsize: *trainsize
    pixel_std: *pixel_std
    use_gt_bbox: True
    image_thre: 0.5

TestDataset:
  !ImageFolder
    anno_path: dataset/coco/keypoint_imagelist.txt

worker_num: 2
global_mean: &global_mean [0.485, 0.456, 0.406]
global_std: &global_std [0.229, 0.224, 0.225]
TrainReader:
  sample_transforms:
    - RandomFlipHalfBodyTransform:
        scale: 0.25
        rot: 30
        num_joints_half_body: 8
        prob_half_body: 0.3
        pixel_std: *pixel_std
        trainsize: *trainsize
        upper_body_ids: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
        flip_pairs: *flip_perm
    - AugmentationbyInformantionDropping:
        prob_cutout: 0.5
        offset_factor: 0.05
        num_patch: 1
        trainsize: *trainsize
    - TopDownAffine:
        trainsize: *trainsize
        use_udp: true
    - ToHeatmapsTopDown_DARK:
        hmsize: *hmsize
        sigma: 1
  batch_transforms:
    - NormalizeImage:
        mean: *global_mean
        std: *global_std
        is_scale: true
    - Permute: {}
  batch_size: 512
  shuffle: true
  drop_last: false

EvalReader:
  sample_transforms:
    - TopDownAffine:
        trainsize: *trainsize
        use_udp: true
  batch_transforms:
    - NormalizeImage:
        mean: *global_mean
        std: *global_std
        is_scale: true
    - Permute: {}
  batch_size: 16

TestReader:
  inputs_def:
    image_shape: [3, *train_height, *train_width]
  sample_transforms:
    - Decode: {}
    - TopDownEvalAffine:
        trainsize: *trainsize
    - NormalizeImage:
        mean: *global_mean
        std: *global_std
        is_scale: true
    - Permute: {}
  batch_size: 1
  fuse_normalize: false
paddle_detection/configs/keypoint/tiny_pose/tinypose_256x192.yml (new file, 147 lines)
@@ -0,0 +1,147 @@
use_gpu: true
log_iter: 5
save_dir: output
snapshot_epoch: 10
weights: output/tinypose_256x192/model_final
epoch: 420
num_joints: &num_joints 17
pixel_std: &pixel_std 200
metric: KeyPointTopDownCOCOEval
num_classes: 1
train_height: &train_height 256
train_width: &train_width 192
trainsize: &trainsize [*train_width, *train_height]
hmsize: &hmsize [48, 64]
flip_perm: &flip_perm [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12], [13, 14], [15, 16]]


#####model
architecture: TopDownHRNet

TopDownHRNet:
  backbone: LiteHRNet
  post_process: HRNetPostProcess
  flip_perm: *flip_perm
  num_joints: *num_joints
  width: &width 40
  loss: KeyPointMSELoss
  use_dark: true

LiteHRNet:
  network_type: wider_naive
  freeze_at: -1
  freeze_norm: false
  return_idx: [0]

KeyPointMSELoss:
  use_target_weight: true
  loss_scale: 1.0

#####optimizer
LearningRate:
  base_lr: 0.002
  schedulers:
  - !PiecewiseDecay
    milestones: [380, 410]
    gamma: 0.1
  - !LinearWarmup
    start_factor: 0.001
    steps: 500

OptimizerBuilder:
  optimizer:
    type: Adam
  regularizer:
    factor: 0.0
    type: L2


#####data
TrainDataset:
  !KeypointTopDownCocoDataset
    image_dir: ""
    anno_path: aic_coco_train_cocoformat.json
    dataset_dir: dataset
    num_joints: *num_joints
    trainsize: *trainsize
    pixel_std: *pixel_std
    use_gt_bbox: True


EvalDataset:
  !KeypointTopDownCocoDataset
    image_dir: val2017
    anno_path: annotations/person_keypoints_val2017.json
    dataset_dir: dataset/coco
    num_joints: *num_joints
    trainsize: *trainsize
    pixel_std: *pixel_std
    use_gt_bbox: True
    image_thre: 0.5

TestDataset:
  !ImageFolder
    anno_path: dataset/coco/keypoint_imagelist.txt

worker_num: 2
global_mean: &global_mean [0.485, 0.456, 0.406]
global_std: &global_std [0.229, 0.224, 0.225]
TrainReader:
  sample_transforms:
    - RandomFlipHalfBodyTransform:
        scale: 0.25
        rot: 30
        num_joints_half_body: 8
        prob_half_body: 0.3
        pixel_std: *pixel_std
        trainsize: *trainsize
        upper_body_ids: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
        flip_pairs: *flip_perm
    - AugmentationbyInformantionDropping:
        prob_cutout: 0.5
        offset_factor: 0.05
        num_patch: 1
        trainsize: *trainsize
    - TopDownAffine:
        trainsize: *trainsize
        use_udp: true
    - ToHeatmapsTopDown_DARK:
        hmsize: *hmsize
        sigma: 2
  batch_transforms:
    - NormalizeImage:
        mean: *global_mean
        std: *global_std
        is_scale: true
    - Permute: {}
  batch_size: 128
  shuffle: true
  drop_last: false

EvalReader:
  sample_transforms:
    - TopDownAffine:
        trainsize: *trainsize
        use_udp: true
  batch_transforms:
    - NormalizeImage:
        mean: *global_mean
        std: *global_std
        is_scale: true
    - Permute: {}
  batch_size: 16

TestReader:
  inputs_def:
    image_shape: [3, *train_height, *train_width]
  sample_transforms:
    - Decode: {}
    - TopDownEvalAffine:
        trainsize: *trainsize
    - NormalizeImage:
        mean: *global_mean
        std: *global_std
        is_scale: true
    - Permute: {}
  batch_size: 1
  fuse_normalize: false
@@ -0,0 +1,145 @@
use_gpu: true
log_iter: 5
save_dir: output
snapshot_epoch: 10
weights: output/tinypose_256x256_hand/model_final
epoch: 210
num_joints: &num_joints 21
pixel_std: &pixel_std 200
metric: KeyPointTopDownCOCOWholeBadyHandEval
num_classes: 1
train_height: &train_height 256
train_width: &train_width 256
trainsize: &trainsize [*train_width, *train_height]
hmsize: &hmsize [64, 64]
flip_perm: &flip_perm []


#####model
architecture: TopDownHRNet

TopDownHRNet:
  backbone: LiteHRNet
  post_process: HRNetPostProcess
  flip_perm: *flip_perm
  num_joints: *num_joints
  width: &width 40
  loss: KeyPointMSELoss
  use_dark: true

LiteHRNet:
  network_type: wider_naive
  freeze_at: -1
  freeze_norm: false
  return_idx: [0]

KeyPointMSELoss:
  use_target_weight: true
  loss_scale: 1.0


#####optimizer
LearningRate:
  base_lr: 0.002
  schedulers:
  - !PiecewiseDecay
    milestones: [170, 200]
    gamma: 0.1
  - !LinearWarmup
    start_factor: 0.001
    steps: 500

OptimizerBuilder:
  optimizer:
    type: Adam
  regularizer:
    factor: 0.0
    type: L2


#####data
TrainDataset:
  !KeypointTopDownCocoWholeBodyHandDataset
    image_dir: train2017
    anno_path: annotations/coco_wholebody_train_v1.0.json
    dataset_dir: dataset/coco
    num_joints: *num_joints
    trainsize: *trainsize
    pixel_std: *pixel_std

EvalDataset:
  !KeypointTopDownCocoWholeBodyHandDataset
    image_dir: val2017
    anno_path: annotations/coco_wholebody_val_v1.0.json
    dataset_dir: dataset/coco
    num_joints: *num_joints
    trainsize: *trainsize
    pixel_std: *pixel_std

TestDataset:
  !ImageFolder
    anno_path: dataset/coco/keypoint_imagelist.txt

worker_num: 2
global_mean: &global_mean [0.485, 0.456, 0.406]
global_std: &global_std [0.229, 0.224, 0.225]
TrainReader:
  sample_transforms:
    - TopDownRandomShiftBboxCenter:
        shift_prob: 0.3
        shift_factor: 0.16
    - TopDownRandomFlip:
        flip_prob: 0.5
        flip_perm: *flip_perm
    - TopDownGetRandomScaleRotation:
        rot_prob: 0.6
        rot_factor: 90
        scale_factor: 0.3
    # - AugmentationbyInformantionDropping:
    #     prob_cutout: 0.5
    #     offset_factor: 0.05
    #     num_patch: 1
    #     trainsize: *trainsize
    - TopDownAffine:
        trainsize: *trainsize
        use_udp: true
    - ToHeatmapsTopDown_DARK:
        hmsize: *hmsize
        sigma: 2
  batch_transforms:
    - NormalizeImage:
        mean: *global_mean
        std: *global_std
        is_scale: true
    - Permute: {}
  batch_size: 128
  shuffle: true
  drop_last: false

EvalReader:
  sample_transforms:
    - TopDownAffine:
        trainsize: *trainsize
        use_udp: true
  batch_transforms:
    - NormalizeImage:
        mean: *global_mean
        std: *global_std
        is_scale: true
    - Permute: {}
  batch_size: 128

TestReader:
  inputs_def:
    image_shape: [3, *train_height, *train_width]
  sample_transforms:
    - Decode: {}
    - TopDownEvalAffine:
        trainsize: *trainsize
    - NormalizeImage:
        mean: *global_mean
        std: *global_std
        is_scale: true
    - Permute: {}
  batch_size: 1
  fuse_normalize: false