Human skeleton, as a compact representation of human action, has received increasing attention in recent years. Many skeleton-based action recognition methods adopt graph convolutional networks (GCN) to extract features on top of human skeletons. Despite the positive results shown in previous works, GCN-based methods are subject to limitations in robustness, interoperability, and scalability. In this work, we propose PoseC3D, a new approach to skeleton-based action recognition, which relies on a 3D heatmap stack instead of a graph sequence as the base representation of human skeletons. Compared to GCN-based methods, PoseC3D is more effective in learning spatiotemporal features, more robust against pose estimation noises, and generalizes better in cross-dataset settings. Also, PoseC3D can handle multiple-person scenarios without additional computation cost, and its features can be easily integrated with other modalities at early fusion stages, which provides a great design space to further boost the performance. On four challenging datasets, PoseC3D consistently obtains superior performance, when used alone on skeletons and in combination with the RGB modality.
@misc{duan2021revisiting,
title={Revisiting Skeleton-based Action Recognition},
author={Haodong Duan and Yue Zhao and Kai Chen and Dian Shao and Dahua Lin and Bo Dai},
year={2021},
eprint={2104.13586},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
[Figures: Pose Estimation Results; Keypoint Heatmap Volume Visualization; Limb Heatmap Volume Visualization]
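
The keypoint and limb heatmap volumes named above can be illustrated with a small NumPy sketch. It assumes each joint is rendered as a Gaussian weighted by its confidence, each limb as a Gaussian of the distance to the segment between two joints, and that per-frame maps are stacked along time into a K x T x H x W array. The function names, the `sigma` default, and the array layouts are illustrative assumptions, not the repository's actual pipeline code.

```python
import numpy as np


def keypoint_heatmaps(keypoints, scores, height, width, sigma=0.6):
    """Render one Gaussian map per joint for a single frame.

    keypoints: (K, 2) array of (x, y) pixel coordinates.
    scores:    (K,) array of keypoint confidences.
    Returns a (K, height, width) float32 array.
    """
    ys, xs = np.mgrid[0:height, 0:width]  # row and column index grids
    maps = np.zeros((len(keypoints), height, width), dtype=np.float32)
    for k, ((x, y), conf) in enumerate(zip(keypoints, scores)):
        dist2 = (xs - x) ** 2 + (ys - y) ** 2
        maps[k] = conf * np.exp(-dist2 / (2 * sigma ** 2))
    return maps


def limb_heatmap(start, end, conf, height, width, sigma=0.6):
    """Render a Gaussian around the segment start -> end (both (x, y))."""
    ys, xs = np.mgrid[0:height, 0:width]
    pixels = np.stack([xs, ys], axis=-1).astype(np.float64)  # (H, W, 2)
    a = np.asarray(start, dtype=np.float64)
    b = np.asarray(end, dtype=np.float64)
    ab = b - a
    length2 = float(ab @ ab)
    if length2 < 1e-8:
        # Degenerate limb: both joints coincide, fall back to a point.
        t = np.zeros((height, width))
    else:
        # Projection of every pixel onto the segment, clamped to its ends.
        t = np.clip((pixels - a) @ ab / length2, 0.0, 1.0)
    closest = a + t[..., None] * ab
    dist2 = ((pixels - closest) ** 2).sum(axis=-1)
    return (conf * np.exp(-dist2 / (2 * sigma ** 2))).astype(np.float32)


def keypoint_volume(seq_keypoints, seq_scores, height, width, sigma=0.6):
    """Stack per-frame keypoint maps into a (K, T, H, W) volume."""
    frames = [
        keypoint_heatmaps(kp, sc, height, width, sigma)
        for kp, sc in zip(seq_keypoints, seq_scores)
    ]
    return np.stack(frames, axis=1)
```

For a clip with T frames and K joints, `keypoint_volume` takes a (T, K, 2) coordinate array and a (T, K) confidence array. A limb volume can be built the same way by calling `limb_heatmap` once per joint pair and frame.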

FineGYM

config | pseudo heatmap | gpus | backbone | Mean Top-1 | ckpt | log | json |
---|---|---|---|---|---|---|---|
slowonly_r50_u48_240e_gym_keypoint | keypoint | 8 x 2 | SlowOnly-R50 | 93.7 | ckpt | log | json |
slowonly_r50_u48_240e_gym_limb | limb | 8 x 2 | SlowOnly-R50 | 94.0 | ckpt | log | json |
Fusion | | | | 94.3 | | | |

NTU60_XSub

config | pseudo heatmap | gpus | backbone | Top-1 | ckpt | log | json |
---|---|---|---|---|---|---|---|
slowonly_r50_u48_240e_ntu60_xsub_keypoint | keypoint | 8 x 2 | SlowOnly-R50 | 93.7 | ckpt | log | json |
slowonly_r50_u48_240e_ntu60_xsub_limb | limb | 8 x 2 | SlowOnly-R50 | 93.4 | ckpt | log | json |
Fusion | | | | 94.1 | | | |

NTU120_XSub

config | pseudo heatmap | gpus | backbone | Top-1 | ckpt | log | json |
---|---|---|---|---|---|---|---|
slowonly_r50_u48_240e_ntu120_xsub_keypoint | keypoint | 8 x 2 | SlowOnly-R50 | 86.3 | ckpt | log | json |
slowonly_r50_u48_240e_ntu120_xsub_limb | limb | 8 x 2 | SlowOnly-R50 | 85.7 | ckpt | log | json |
Fusion | | | | 86.9 | | | |

UCF101

config | pseudo heatmap | gpus | backbone | Top-1 | ckpt | log | json |
---|---|---|---|---|---|---|---|
slowonly_kinetics400_pretrained_r50_u48_120e_ucf101_split1_keypoint | keypoint | 8 | SlowOnly-R50 | 87.0 | ckpt | log | json |

HMDB51

config | pseudo heatmap | gpus | backbone | Top-1 | ckpt | log | json |
---|---|---|---|---|---|---|---|
slowonly_kinetics400_pretrained_r50_u48_120e_hmdb51_split1_keypoint | keypoint | 8 | SlowOnly-R50 | 69.3 | ckpt | log | json |

You can use the following command to train a model.
python tools/train.py ${CONFIG_FILE} [optional arguments]
Example: train the PoseC3D model on the FineGYM dataset deterministically, with periodic validation.
python tools/train.py configs/skeleton/posec3d/slowonly_r50_u48_240e_gym_keypoint.py \
--work-dir work_dirs/slowonly_r50_u48_240e_gym_keypoint \
--validate --seed 0 --deterministic
To train on a custom dataset, refer to Custom Dataset Training.
For more details, refer to the Training setting part of getting_started.
You can use the following command to test a model.
python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [optional arguments]
Example: test the PoseC3D model on the FineGYM dataset and dump the result to a pickle file.
python tools/test.py configs/skeleton/posec3d/slowonly_r50_u48_240e_gym_keypoint.py \
checkpoints/SOME_CHECKPOINT.pth --eval top_k_accuracy mean_class_accuracy \
--out result.pkl
For more details, refer to the Test a dataset part of getting_started.
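
The Fusion rows in the results tables combine the keypoint and limb models, and the test command above reports `top_k_accuracy` and `mean_class_accuracy`. The dependency-free sketch below shows one plausible way to reproduce such numbers offline. It assumes that fusion is a weighted average of per-class scores and that each dumped pickle holds one score vector per test sample, in the same sample order for both models. Neither assumption is confirmed by this README, and all function names and file names are hypothetical.

```python
import pickle


def load_scores(path):
    """Load a dumped result file (assumed: a list of per-class score vectors)."""
    with open(path, "rb") as f:
        return pickle.load(f)


def fuse_scores(scores_a, scores_b, weight_a=0.5):
    """Weighted average of two models' class scores, sample by sample."""
    return [
        [weight_a * x + (1.0 - weight_a) * y for x, y in zip(a, b)]
        for a, b in zip(scores_a, scores_b)
    ]


def predict(score):
    """Index of the highest-scoring class."""
    return max(range(len(score)), key=lambda c: score[c])


def top1_accuracy(scores, labels):
    """Fraction of samples whose top-scoring class equals the label."""
    correct = sum(predict(s) == y for s, y in zip(scores, labels))
    return correct / len(labels)


def mean_class_accuracy(scores, labels):
    """Average of per-class recall, so rare classes count as much as common ones."""
    hits, totals = {}, {}
    for s, y in zip(scores, labels):
        totals[y] = totals.get(y, 0) + 1
        hits[y] = hits.get(y, 0) + int(predict(s) == y)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Toy example with 3 samples and 2 classes (made-up numbers, not real results).
keypoint_scores = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
limb_scores = [[0.4, 0.6], [0.1, 0.9], [0.3, 0.7]]
labels = [0, 1, 1]
fused = fuse_scores(keypoint_scores, limb_scores)
```

With real outputs, `keypoint_scores` and `limb_scores` would come from something like `load_scores("keypoint_result.pkl")` and `load_scores("limb_result.pkl")`, with labels read from the dataset annotations. Those file names are placeholders.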