Move paddle_detection
@@ -0,0 +1,157 @@
简体中文

<div align="center">
<img src="https://user-images.githubusercontent.com/31800336/219260054-ba3766b1-8223-42bf-b69b-7092019995cc.jpg" width='600'/>
</div>

# 3D Pose Model Series
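
## Contents

- [Introduction](#introduction)
- [Recommended Models](#recommended-models)
- [Quick Start](#quick-start)
    - [Environment Setup](#1-environment-setup)
    - [Data Preparation](#2-data-preparation)
    - [Training and Testing](#3-training-and-testing)
        - [Single-GPU and Multi-GPU Training](#single-gpu-and-multi-gpu-training)
        - [Model Evaluation](#model-evaluation)
        - [Model Prediction](#model-prediction)
    - [Usage Notes](#4-usage-notes)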

## Introduction

PaddleDetection provides two 3D pose algorithms (sparse keypoints): Metro3D, a large model for server-side deployment, and TinyPose3D for mobile devices. Metro3D is a sparsified adaptation of [End-to-End Human Pose and Mesh Reconstruction with Transformers](https://arxiv.org/abs/2012.09760), and TinyPose3D modifies TinyPose to output 3D keypoints.

## Recommended Models

|Model|Use case|Human3.6M accuracy (14 keypoints)|Human3.6M accuracy (17 keypoints)|Download|
|:--:|:--:|:--:|:--:|:--:|
|Metro3D|Server|56.014|46.619|[metro3d_24kpts.pdparams](https://bj.bcebos.com/v1/paddledet/models/pose3d/metro3d_24kpts.pdparams)|
|TinyPose3D|Mobile|86.381|71.223|[tinypose3d_human36M.pdparams](https://bj.bcebos.com/v1/paddledet/models/pose3d/tinypose3d_human36M.pdparams)|

Notes:

1. The training data is based on the training data from [MeshTransformer](https://github.com/microsoft/MeshTransformer).
2. As in MeshTransformer, test accuracy is evaluated on 14 keypoints.

## Quick Start

### 1. Environment Setup

Follow the PaddleDetection [installation guide](../../docs/tutorials/INSTALL_cn.md) to install PaddlePaddle and PaddleDetection.

### 2. Data Preparation

Our training data consists of coco, human3.6m, hr-lspet, posetrack3d, and mpii.

2.1 The training data can be downloaded from the following links:

[coco](https://bj.bcebos.com/v1/paddledet/data/coco.tar)

[human3.6m](https://bj.bcebos.com/v1/paddledet/data/pose3d/human3.6m.tar.gz)

[lspet+posetrack+mpii](https://bj.bcebos.com/v1/paddledet/data/pose3d/pose3d_others.tar.gz)

[Annotation files](https://bj.bcebos.com/v1/paddledet/data/pose3d/pose3d.tar.gz)

2.2 After downloading, place the data under the repository directory with the following structure:

```
${REPO_DIR}
|-- dataset
|   |-- traindata
|       |-- coco
|       |-- hr-lspet
|       |-- human3.6m
|       |-- mpii
|       |-- posetrack3d
|       \-- pose3d
|            |-- COCO2014-All-ver01.json
|            |-- COCO2014-Part-ver01.json
|            |-- COCO2014-Val-ver10.json
|            |-- Human3.6m_train.json
|            |-- Human3.6m_valid.json
|            |-- LSPet_train_ver10.json
|            |-- LSPet_test_ver10.json
|            |-- MPII_ver01.json
|            |-- PoseTrack_ver01.json
|-- ppdet
|-- deploy
|-- demo
|-- README_cn.md
|-- README_en.md
|-- ...
```
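
Before training, it can help to confirm that this layout is actually in place. The following is a minimal sketch and not part of PaddleDetection: the helper name `missing_entries` is hypothetical, and the list of entries to check is simply copied from the directory tree in this section.

```python
from pathlib import Path

# Sub-directories expected under ${REPO_DIR}/dataset/traindata (from the tree above).
EXPECTED_DIRS = ["coco", "hr-lspet", "human3.6m", "mpii", "posetrack3d", "pose3d"]

# Annotation files expected under dataset/traindata/pose3d (from the tree above).
EXPECTED_ANNOTATIONS = [
    "COCO2014-All-ver01.json",
    "COCO2014-Part-ver01.json",
    "COCO2014-Val-ver10.json",
    "Human3.6m_train.json",
    "Human3.6m_valid.json",
    "LSPet_train_ver10.json",
    "LSPet_test_ver10.json",
    "MPII_ver01.json",
    "PoseTrack_ver01.json",
]


def missing_entries(repo_dir):
    """Return the expected paths (relative to repo_dir) that do not exist."""
    rel_root = Path("dataset") / "traindata"
    root = Path(repo_dir) / rel_root
    missing = []
    for name in EXPECTED_DIRS:
        if not (root / name).is_dir():
            missing.append((rel_root / name).as_posix())
    for name in EXPECTED_ANNOTATIONS:
        if not (root / "pose3d" / name).is_file():
            missing.append((rel_root / "pose3d" / name).as_posix())
    return missing


if __name__ == "__main__":
    # Hypothetical usage: run from the repository root.
    for path in missing_entries("."):
        print("missing:", path)
```

An empty result only means the directories and annotation files exist; it does not validate their contents.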

### 3. Training and Testing

#### Single-GPU and Multi-GPU Training

```shell
# Single-GPU training
CUDA_VISIBLE_DEVICES=0 python3 tools/train.py -c configs/pose3d/metro3d_24kpts.yml

# Multi-GPU training
CUDA_VISIBLE_DEVICES=0,1,2,3 python3 -m paddle.distributed.launch tools/train.py -c configs/pose3d/metro3d_24kpts.yml
```

#### Model Evaluation

```shell
# Single-GPU evaluation
CUDA_VISIBLE_DEVICES=0 python3 tools/eval.py -c configs/pose3d/metro3d_24kpts.yml -o weights=output/metro3d_24kpts/best_model.pdparams

# To save only the evaluation predictions, pass the save_prediction_only flag; by default the results are written to output/keypoints_results.json
CUDA_VISIBLE_DEVICES=0 python3 tools/eval.py -c configs/pose3d/metro3d_24kpts.yml -o weights=output/metro3d_24kpts/best_model.pdparams --save_prediction_only

# Multi-GPU evaluation
CUDA_VISIBLE_DEVICES=0,1,2,3 python3 -m paddle.distributed.launch tools/eval.py -c configs/pose3d/metro3d_24kpts.yml -o weights=output/metro3d_24kpts/best_model.pdparams
```

#### Model Prediction

```shell
# Generate a three-view visualization from an image
CUDA_VISIBLE_DEVICES=0 python3 tools/infer.py -c configs/pose3d/metro3d_24kpts.yml -o weights=./output/metro3d_24kpts/best_model.pdparams --infer_img=./demo/hrnet_demo.jpg --draw_threshold=0.5
```
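
### 4. Usage Notes

3D pose estimation is harder to use in practice than 2D pose estimation, mainly for two reasons:

- 1) Annotating training data is expensive.
- 2) Depth is ambiguous in a single image.

Because of (1), training data usually covers only a small set of motions, so models generalize poorly. Because of (2), the error of predicted 3D pose coordinates along the depth (z) axis is usually larger than along x and y, which tends to cause large jitter between frames; the larger the annotation error, the more pronounced this problem becomes.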
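
To check whether depth error dominates on your own data, the per-axis error can be compared directly. Below is a minimal sketch in plain Python. The function name `per_axis_mean_abs_error`, the input layout (one `(x, y, z)` tuple per joint), and the sample numbers are illustrative assumptions; they are neither a PaddleDetection API nor measured results.

```python
def per_axis_mean_abs_error(pred, gt):
    """Mean absolute error along x, y and z over all joints.

    pred, gt: equal-length sequences of (x, y, z) tuples, one per joint.
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and of equal length")
    sums = [0.0, 0.0, 0.0]
    for p, g in zip(pred, gt):
        for axis in range(3):
            sums[axis] += abs(p[axis] - g[axis])
    return tuple(s / len(pred) for s in sums)


# Made-up coordinates for two joints, chosen so that the z error is the largest.
gt = [(0.0, 0.0, 0.0), (10.0, 20.0, 30.0)]
pred = [(1.0, -1.0, 5.0), (11.0, 21.0, 25.0)]

ex, ey, ez = per_axis_mean_abs_error(pred, gt)
print(f"x={ex:.1f} y={ey:.1f} z={ez:.1f}")  # x=1.0 y=1.0 z=5.0
```

Tracking the three values separately over a validation set makes it visible when z error, rather than overall error, is what drives frame-to-frame jitter.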
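
Addressing both problems leads to two conflicting requirements: 1) better generalization needs more annotated data; 2) lower prediction error needs high-precision annotation. Because 3D pose annotation is inherently difficult, higher-precision annotation costs more, and the amount of annotated data drops accordingly.

Our solution is therefore:

- 1) Use an automatic fitting-based annotation method to generate a large amount of low-precision data, and train a first-version model on it so that it generalizes broadly.
- 2) Annotate a small amount of high-precision data for the target motions and finetune from the first-version model. This yields a model that is accurate on the target motions and retains some of the first-version model's generalization.

Our training data consists largely of low-precision, automatically generated annotations. Starting from a model trained on this data, users can annotate their own high-precision data for the target motions and finetune to obtain a relatively stable, well-performing model.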
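
As a rough illustration of the finetuning step, the sketch below assembles a `tools/train.py` command that starts from existing weights. It assumes that the `pretrain_weights` config key can be overridden with `-o`, in the same way `weights` is overridden in the evaluation commands of this README. The helper name `finetune_command` and the paths passed to it are hypothetical, and the dataset entries in the chosen config would still need to point at your own annotated data.

```python
import shlex


def finetune_command(config, pretrain_weights, gpus="0"):
    """Build a training command that initializes from existing weights.

    config: path to a config file (its dataset entries must point at your data).
    pretrain_weights: path to the first-version model weights (assumed override key).
    gpus: comma-separated GPU ids; more than one id switches to distributed launch.
    """
    gpu_ids = [g for g in gpus.split(",") if g]
    cmd = ["python3"]
    if len(gpu_ids) > 1:
        cmd += ["-m", "paddle.distributed.launch"]
    cmd += ["tools/train.py", "-c", config, "-o", f"pretrain_weights={pretrain_weights}"]
    quoted = " ".join(shlex.quote(part) for part in cmd)
    return f"CUDA_VISIBLE_DEVICES={','.join(gpu_ids)} {quoted}"


# Hypothetical paths, for illustration only.
print(finetune_command("configs/pose3d/metro3d_24kpts.yml",
                       "output/metro3d_24kpts/best_model.pdparams"))
```

Building the command as a list and quoting each part keeps paths with spaces intact when the string is pasted into a shell.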

The results of training on high-precision medical rehabilitation data are shown below ([HD video](https://user-images.githubusercontent.com/31800336/218949226-22e6ab25-facb-4cc6-8eca-38d4bfd973e5.mp4)):

<div align="center">
<img src="https://user-images.githubusercontent.com/31800336/221747019-ceacfd64-e218-476b-a369-c6dc259816b2.gif" width='600'/>
</div>

## Citation

```
@inproceedings{lin2021end-to-end,
author = {Lin, Kevin and Wang, Lijuan and Liu, Zicheng},
title = {End-to-End Human Pose and Mesh Reconstruction with Transformers},
booktitle = {CVPR},
year = {2021},
}
```
@@ -0,0 +1,144 @@
use_gpu: True
log_iter: 20
save_dir: output
snapshot_epoch: 3
weights: output/metro_modified/model_final
epoch: 50
metric: Pose3DEval
num_classes: 1
train_height: &train_height 224
train_width: &train_width 224
trainsize: &trainsize [*train_width, *train_height]
num_joints: &num_joints 24

#####model
architecture: METRO_Body
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/Trunc_HRNet_W32_C_pretrained.pdparams

METRO_Body:
  backbone: HRNet
  trans_encoder: TransEncoder
  num_joints: *num_joints
  loss: Pose3DLoss

HRNet:
  width: 32
  freeze_at: -1
  freeze_norm: False
  norm_momentum: 0.1
  downsample: True

TransEncoder:
  vocab_size: 30522
  num_hidden_layers: 4
  num_attention_heads: 4
  position_embeddings_size: 512
  intermediate_size: 3072
  input_feat_dim: [2048, 512, 128]
  hidden_feat_dim: [1024, 256, 128]
  attention_probs_dropout_prob: 0.1
  fc_dropout_prob: 0.1
  act_fn: 'gelu'
  output_attentions: False
  output_hidden_feats: False

Pose3DLoss:
  weight_3d: 1.0
  weight_2d: 0.0

#####optimizer
LearningRate:
  base_lr: 0.0001
  schedulers:
  - !CosineDecay
    max_epochs: 52
  - !LinearWarmup
    start_factor: 0.01
    steps: 2000


OptimizerBuilder:
  clip_grad_by_norm: 0.2
  optimizer:
    type: Adam
  regularizer:
    factor: 0.0
    type: L2


#####data
TrainDataset:
  !Pose3DDataset
    dataset_dir: dataset/traindata/
    image_dirs: ["human3.6m", "posetrack3d", "hr-lspet", "hr-lspet", "mpii/images", "coco/train2017"]
    anno_list: ["pose3d/Human3.6m_train.json", "pose3d/PoseTrack_ver01.json", "pose3d/LSPet_train_ver10.json", "pose3d/LSPet_test_ver10.json", "pose3d/MPII_ver01.json", "pose3d/COCO2014-All-ver01.json"]
    num_joints: *num_joints
    test_mode: False

EvalDataset:
  !Pose3DDataset
    dataset_dir: dataset/traindata/
    image_dirs: ["human3.6m"]
    anno_list: ["pose3d/Human3.6m_valid.json"]
    num_joints: *num_joints
    test_mode: True

TestDataset:
  !ImageFolder
    anno_path: dataset/traindata/coco/keypoint_imagelist.txt

worker_num: 4
global_mean: &global_mean [0.485, 0.456, 0.406]
global_std: &global_std [0.229, 0.224, 0.225]
TrainReader:
  sample_transforms:
    - SinglePoseAffine:
        trainsize: *trainsize
        rotate: [1.0, 30] #[prob, rotate range]
        scale: [1.0, 0.25] #[prob, scale range]
    - FlipPose:
        flip_prob: 0.5
        img_res: *train_width
        num_joints: *num_joints
    - NoiseJitter:
        noise_factor: 0.4
  batch_transforms:
    - NormalizeImage:
        mean: *global_mean
        std: *global_std
        is_scale: true
    - Permute: {}
  batch_size: 64
  shuffle: true
  drop_last: true

EvalReader:
  sample_transforms:
    - SinglePoseAffine:
        trainsize: *trainsize
        rotate: [0., 30]
        scale: [0., 0.25]
  batch_transforms:
    - NormalizeImage:
        mean: *global_mean
        std: *global_std
        is_scale: true
    - Permute: {}
  batch_size: 16
  shuffle: false
  drop_last: false

TestReader:
  inputs_def:
    image_shape: [3, *train_height, *train_width]
  sample_transforms:
    - Decode: {}
    - TopDownEvalAffine:
        trainsize: *trainsize
    - NormalizeImage:
        mean: *global_mean
        std: *global_std
        is_scale: true
    - Permute: {}
  batch_size: 1
  fuse_normalize: false # whether to fuse the normalize layer into the model when exporting
@@ -0,0 +1,122 @@
use_gpu: true
log_iter: 5
save_dir: output
snapshot_epoch: 1
weights: output/tinypose3d_human36M/model_final
epoch: 220
num_joints: &num_joints 24
pixel_std: &pixel_std 200
metric: Pose3DEval
num_classes: 1
train_height: &train_height 128
train_width: &train_width 128
trainsize: &trainsize [*train_width, *train_height]

#####model
architecture: TinyPose3DHRHeatmapNet
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_128x96.pdparams

TinyPose3DHRHeatmapNet:
  backbone: LiteHRNet
  post_process: HR3DNetPostProcess
  num_joints: *num_joints
  width: &width 40
  loss: Pose3DLoss

LiteHRNet:
  network_type: wider_naive
  freeze_at: -1
  freeze_norm: false
  return_idx: [0]

Pose3DLoss:
  weight_3d: 1.0
  weight_2d: 0.0

#####optimizer
LearningRate:
  base_lr: 0.0001
  schedulers:
  - !PiecewiseDecay
    milestones: [17, 21]
    gamma: 0.1
  - !LinearWarmup
    start_factor: 0.01
    steps: 1000

OptimizerBuilder:
  optimizer:
    type: Adam
  regularizer:
    factor: 0.0
    type: L2


#####data
TrainDataset:
  !Pose3DDataset
    dataset_dir: dataset/traindata/
    image_dirs: ["human3.6m"]
    anno_list: ['pose3d/Human3.6m_train.json']
    num_joints: *num_joints
    test_mode: False

EvalDataset:
  !Pose3DDataset
    dataset_dir: dataset/traindata/
    image_dirs: ["human3.6m"]
    anno_list: ['pose3d/Human3.6m_valid.json']
    num_joints: *num_joints
    test_mode: True

TestDataset:
  !ImageFolder
    anno_path: dataset/coco/keypoint_imagelist.txt

worker_num: 4
global_mean: &global_mean [0.485, 0.456, 0.406]
global_std: &global_std [0.229, 0.224, 0.225]
TrainReader:
  sample_transforms:
    - SinglePoseAffine:
        trainsize: *trainsize
        rotate: [0.5, 30] #[prob, rotate range]
        scale: [0.5, 0.25] #[prob, scale range]
  batch_transforms:
    - NormalizeImage:
        mean: *global_mean
        std: *global_std
        is_scale: true
    - Permute: {}
  batch_size: 128
  shuffle: true
  drop_last: true

EvalReader:
  sample_transforms:
    - SinglePoseAffine:
        trainsize: *trainsize
        rotate: [0., 30]
        scale: [0., 0.25]
  batch_transforms:
    - NormalizeImage:
        mean: *global_mean
        std: *global_std
        is_scale: true
    - Permute: {}
  batch_size: 128

TestReader:
  inputs_def:
    image_shape: [3, *train_height, *train_width]
  sample_transforms:
    - Decode: {}
    - TopDownEvalAffine:
        trainsize: *trainsize
    - NormalizeImage:
        mean: *global_mean
        std: *global_std
        is_scale: true
    - Permute: {}
  batch_size: 1
  fuse_normalize: false
@@ -0,0 +1,138 @@
use_gpu: true
log_iter: 5
save_dir: output
snapshot_epoch: 1
weights: output/tinypose_3D_multi_frames/model_final
epoch: 420
num_joints: &num_joints 24
pixel_std: &pixel_std 200
metric: Pose3DEval
num_classes: 1
train_height: &train_height 128
train_width: &train_width 96
trainsize: &trainsize [*train_width, *train_height]
hmsize: &hmsize [24, 32]
flip_perm: &flip_perm [[1, 2], [4, 5], [7, 8], [10, 11], [13, 14], [16, 17], [18, 19], [20, 21], [22, 23]]


#####model
architecture: TinyPose3DHRNet
pretrain_weights: medical_multi_frames_best_model.pdparams

TinyPose3DHRNet:
  backbone: LiteHRNet
  post_process: TinyPose3DPostProcess
  num_joints: *num_joints
  width: &width 40
  loss: KeyPointRegressionMSELoss

LiteHRNet:
  network_type: wider_naive
  freeze_at: -1
  freeze_norm: false
  return_idx: [0]

KeyPointRegressionMSELoss:
  reduction: 'mean'

#####optimizer
LearningRate:
  base_lr: 0.001
  schedulers:
  - !PiecewiseDecay
    milestones: [17, 21]
    gamma: 0.1
  - !LinearWarmup
    start_factor: 0.01
    steps: 1000

OptimizerBuilder:
  optimizer:
    type: Adam
  regularizer:
    factor: 0.0
    type: L2

#####data
TrainDataset:
  !Keypoint3DMultiFramesDataset
    dataset_dir: "data/medical/multi_frames/train"
    image_dir: "images"
    p3d_dir: "joint_pc/player_0"
    json_path: "json_results/player_0/player_0.json"
    img_size: *trainsize # w,h
    num_frames: 6


EvalDataset:
  !Keypoint3DMultiFramesDataset
    dataset_dir: "data/medical/multi_frames/val"
    image_dir: "images"
    p3d_dir: "joint_pc/player_0"
    json_path: "json_results/player_0/player_0.json"
    img_size: *trainsize # w,h
    num_frames: 6

TestDataset:
  !Keypoint3DMultiFramesDataset
    dataset_dir: "data/medical/multi_frames/val"
    image_dir: "images"
    p3d_dir: "joint_pc/player_0"
    json_path: "json_results/player_0/player_0.json"
    img_size: *trainsize # w,h
    num_frames: 6

worker_num: 4
global_mean: &global_mean [0.485, 0.456, 0.406]
global_std: &global_std [0.229, 0.224, 0.225]
TrainReader:
  sample_transforms:
    - CropAndFlipImages:
        crop_range: [556, 1366]
    - RandomFlipHalfBody3DTransformImages:
        scale: 0.25
        rot: 30
        num_joints_half_body: 9
        prob_half_body: 0.3
        pixel_std: *pixel_std
        trainsize: *trainsize
        upper_body_ids: [0, 3, 6, 9, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]
        flip_pairs: *flip_perm
        do_occlusion: true
    - Resize: {interp: 2, target_size: [*train_height,*train_width], keep_ratio: false}
  batch_transforms:
    - NormalizeImage:
        mean: *global_mean
        std: *global_std
        is_scale: true
    - PermuteImages: {}
  batch_size: 32
  shuffle: true
  drop_last: false

EvalReader:
  sample_transforms:
    - CropAndFlipImages:
        crop_range: [556, 1366]
    - Resize: {interp: 2, target_size: [*train_height,*train_width], keep_ratio: false}
  batch_transforms:
    - NormalizeImage:
        mean: *global_mean
        std: *global_std
        is_scale: true
    - PermuteImages: {}
  batch_size: 32

TestReader:
  inputs_def:
    image_shape: [3, *train_height, *train_width]
  sample_transforms:
    - Decode: {}
    - LetterBoxResize: { target_size: [*train_height,*train_width]}
    - NormalizeImage:
        mean: *global_mean
        std: *global_std
        is_scale: true
    - Permute: {}
  batch_size: 1
  fuse_normalize: false
@@ -0,0 +1,138 @@
use_gpu: true
log_iter: 5
save_dir: output
snapshot_epoch: 1
weights: output/tinypose3d_multi_frames_heatmap/model_final
epoch: 420
num_joints: &num_joints 24
pixel_std: &pixel_std 200
metric: Pose3DEval
num_classes: 1
train_height: &train_height 128
train_width: &train_width 128
trainsize: &trainsize [*train_width, *train_height]
hmsize: &hmsize [24, 32]
flip_perm: &flip_perm [[1, 2], [4, 5], [7, 8], [10, 11], [13, 14], [16, 17], [18, 19], [20, 21], [22, 23]]

#####model
architecture: TinyPose3DHRHeatmapNet
pretrain_weights: medical_multi_frames_best_model.pdparams

TinyPose3DHRHeatmapNet:
  backbone: LiteHRNet
  post_process: TinyPosePostProcess
  num_joints: *num_joints
  width: &width 40
  loss: KeyPointRegressionMSELoss

LiteHRNet:
  network_type: wider_naive
  freeze_at: -1
  freeze_norm: false
  return_idx: [0]

KeyPointRegressionMSELoss:
  reduction: 'mean'

#####optimizer
LearningRate:
  base_lr: 0.001
  schedulers:
  - !PiecewiseDecay
    milestones: [17, 21]
    gamma: 0.1
  - !LinearWarmup
    start_factor: 0.01
    steps: 1000

OptimizerBuilder:
  optimizer:
    type: Adam
  regularizer:
    factor: 0.0
    type: L2

#####data
TrainDataset:
  !Keypoint3DMultiFramesDataset
    dataset_dir: "data/medical/multi_frames/train"
    image_dir: "images"
    p3d_dir: "joint_pc/player_0"
    json_path: "json_results/player_0/player_0.json"
    img_size: *trainsize # w,h
    num_frames: 6


EvalDataset:
  !Keypoint3DMultiFramesDataset
    dataset_dir: "data/medical/multi_frames/val"
    image_dir: "images"
    p3d_dir: "joint_pc/player_0"
    json_path: "json_results/player_0/player_0.json"
    img_size: *trainsize # w,h
    num_frames: 6

TestDataset:
  !Keypoint3DMultiFramesDataset
    dataset_dir: "data/medical/multi_frames/val"
    image_dir: "images"
    p3d_dir: "joint_pc/player_0"
    json_path: "json_results/player_0/player_0.json"
    img_size: *trainsize # w,h
    num_frames: 6

worker_num: 4
global_mean: &global_mean [0.485, 0.456, 0.406]
global_std: &global_std [0.229, 0.224, 0.225]
TrainReader:
  sample_transforms:
    - CropAndFlipImages:
        crop_range: [556, 1366] # crop the black padding on the left and right of the original image while preserving the train_height/train_width ratio
    - RandomFlipHalfBody3DTransformImages:
        scale: 0.25
        rot: 30
        num_joints_half_body: 9
        prob_half_body: 0.3
        pixel_std: *pixel_std
        trainsize: *trainsize
        upper_body_ids: [0, 3, 6, 9, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]
        flip_pairs: *flip_perm
        do_occlusion: true
    - Resize: {interp: 2, target_size: [*train_height,*train_width], keep_ratio: false}
  batch_transforms:
    - NormalizeImage:
        mean: *global_mean
        std: *global_std
        is_scale: true
    - PermuteImages: {}
  batch_size: 1 #32
  shuffle: true
  drop_last: false

EvalReader:
  sample_transforms:
    - CropAndFlipImages:
        crop_range: [556, 1366]
    - Resize: {interp: 2, target_size: [*train_height,*train_width], keep_ratio: false}
    #- OriginPointTranslationImages: {}
  batch_transforms:
    - NormalizeImage:
        mean: *global_mean
        std: *global_std
        is_scale: true
    - PermuteImages: {}
  batch_size: 32

TestReader:
  inputs_def:
    image_shape: [3, *train_height, *train_width]
  sample_transforms:
    - Decode: {}
    - LetterBoxResize: { target_size: [*train_height,*train_width]}
    - NormalizeImage:
        mean: *global_mean
        std: *global_std
        is_scale: true
    - Permute: {}
  batch_size: 1
  fuse_normalize: false