@misc{duan2021revisiting,
title={Revisiting Skeleton-based Action Recognition},
author={Haodong Duan and Yue Zhao and Kai Chen and Dian Shao and Dahua Lin and Bo Dai},
year={2021},
eprint={2104.13586},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
We release the skeleton annotations used in Revisiting Skeleton-based Action Recognition. By default, we use Faster-RCNN with a ResNet50 backbone for human detection and HRNet-w32 for single-person pose estimation. For FineGYM, we use ground-truth bounding boxes for the athlete instead of detection bounding boxes. Currently, we release the skeleton annotations for FineGYM and the NTURGB+D Xsub split. Other annotations will be released soon.
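As a rough illustration of that two-stage pipeline, here is a minimal sketch assuming mmdetection 2.x and mmpose 0.x style APIs; the config/checkpoint paths are placeholders, and ntu_pose_extraction.py (described below) implements the actual pipeline used for the release.

```python
from mmdet.apis import init_detector, inference_detector
from mmpose.apis import init_pose_model, inference_top_down_pose_model

# Placeholder configs/checkpoints: Faster-RCNN (ResNet50) detector + HRNet-w32 pose model
det_model = init_detector('faster_rcnn_r50_fpn_coco.py', 'faster_rcnn_r50_fpn_coco.pth', device='cuda:0')
pose_model = init_pose_model('hrnet_w32_coco_256x192.py', 'hrnet_w32_coco_256x192.pth', device='cuda:0')

def extract_pose(frame):
    # Class 0 of a COCO-trained detector is 'person'; each row is (x1, y1, x2, y2, score)
    person_bboxes = inference_detector(det_model, frame)[0]
    person_results = [dict(bbox=bbox) for bbox in person_bboxes if bbox[4] >= 0.5]
    # Top-down pose estimation on the detected boxes; each result holds a (17, 3)
    # 'keypoints' array of (x, y, score)
    pose_results, _ = inference_top_down_pose_model(
        pose_model, frame, person_results, format='xyxy')
    return pose_results

# Example: pose = extract_pose('frame_000001.jpg')
```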
Currently, we support HMDB51, UCF101, FineGYM and NTURGB+D. For FineGYM, you can execute the following script to prepare the annotations.
bash download_annotations.sh ${DATASET}
Due to the Conditions of Use of the NTURGB+D dataset, we cannot directly release the annotations used in our experiments. Instead, we provide a script that generates pose annotations for videos in the NTURGB+D datasets: it generates a dictionary per video and saves it as a single pickle file. You can then create a list containing the annotation dictionaries of all corresponding videos and save it as a pickle file, which gives you the ntu60_xsub_train.pkl, ntu60_xsub_val.pkl, ntu120_xsub_train.pkl and ntu120_xsub_val.pkl files that we used in training.
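The NTURGB+D file names encode setup, camera, performer, replication and action (e.g. S001C001P001R001A001), which is what you need when assigning videos to splits and labels. A minimal parsing sketch (the regex and field names are our own, not part of the release):

```python
import re

NTU_NAME = re.compile(r'S(\d{3})C(\d{3})P(\d{3})R(\d{3})A(\d{3})')

def parse_ntu_name(frame_dir):
    """Parse an NTURGB+D sample name like 'S001C001P001R001A001'."""
    setup, camera, performer, replication, action = map(int, NTU_NAME.match(frame_dir).groups())
    return dict(
        setup=setup,             # collection setup id
        camera=camera,           # camera id (used by the xview split)
        performer=performer,     # subject id (used by the xsub split)
        replication=replication,
        label=action - 1,        # 0-based action label
    )

print(parse_ntu_name('S001C001P001R001A001'))
# {'setup': 1, 'camera': 1, 'performer': 1, 'replication': 1, 'label': 0}
```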
For those who do not have enough compute for pose extraction, we provide the outputs of the above pipeline here, corresponding to 4 different splits of the NTURGB+D datasets:
To generate 2D pose annotations for a single video, you first need to install mmdetection and mmpose from source. After that, replace the placeholders mmdet_root and mmpose_root in ntu_pose_extraction.py with your installation paths. Then you can use the following script for NTURGB+D video pose extraction:
python ntu_pose_extraction.py S001C001P001R001A001_rgb.avi S001C001P001R001A001.pkl
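To run the extraction over a whole folder of videos rather than one file, a simple driver loop like the following works (the directory names are placeholders; ntu_pose_extraction.py is invoked exactly as above):

```python
import subprocess
from pathlib import Path

video_dir = Path('nturgbd_videos')   # hypothetical folder of *_rgb.avi files
out_dir = Path('nturgbd_pose')       # hypothetical output folder for per-video pickles
out_dir.mkdir(exist_ok=True)

for video in sorted(video_dir.glob('*_rgb.avi')):
    out_pkl = out_dir / (video.stem.replace('_rgb', '') + '.pkl')
    if out_pkl.exists():
        continue  # skip videos that were already processed
    subprocess.run(['python', 'ntu_pose_extraction.py', str(video), str(out_pkl)], check=True)
```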
After you obtain pose annotations for all videos in a dataset split, such as ntu60_xsub_val, you can gather them into a single list and save the list as ntu60_xsub_val.pkl. You can use these larger pickle files for training and testing.
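A minimal sketch of that gathering step, assuming the per-video pickles produced above sit in a folder such as nturgbd_pose/ and all belong to the ntu60_xsub_val split:

```python
import pickle
from pathlib import Path

# Hypothetical folder holding one .pkl per video, produced by ntu_pose_extraction.py
pose_dir = Path('nturgbd_pose')

annotations = []
for pkl_path in sorted(pose_dir.glob('*.pkl')):
    with pkl_path.open('rb') as f:
        annotations.append(pickle.load(f))   # one annotation dictionary per video

# Save the gathered list as the split-level annotation file
with open('ntu60_xsub_val.pkl', 'wb') as f:
    pickle.dump(annotations, f)
```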
Here we briefly introduce the format of the PoseC3D annotations, taking gym_train.pkl as an example: the content of gym_train.pkl is a list of length 20484, and each item is a dictionary holding the skeleton annotation of one video. Each dictionary has the following fields:
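To inspect these fields yourself, you can simply load the file and look at one entry (a minimal sketch, assuming gym_train.pkl has been downloaded to the working directory):

```python
import pickle

with open('gym_train.pkl', 'rb') as f:
    annotations = pickle.load(f)

print(len(annotations))        # 20484 videos in the FineGYM training split
sample = annotations[0]
print(sorted(sample.keys()))   # the per-video annotation fields
for key, value in sample.items():
    # ndarray fields (e.g. keypoint coordinates/scores) print as shapes, scalars as-is
    print(key, getattr(value, 'shape', value))
```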
For training with your custom dataset, you can refer to Custom Dataset Training.
For skeleton data visualization, you also need to prepare the RGB videos. Please refer to visualize_heatmap_volume for the detailed process. Here we provide some visualization examples from NTU-60 and FineGYM.
- Pose Estimation Results
- Keypoint Heatmap Volume Visualization
- Limb Heatmap Volume Visualization
Here we also provide the script for converting the NTU RGB+D raw skeleton data to our format.
First, download the raw skeleton data of NTU-RGBD 60 and NTU-RGBD 120 from https://github.com/shahroudy/NTURGB-D.
For NTU-RGBD 60, preprocess the data and convert the format with
python gen_ntu_rgbd_raw.py --data-path your_raw_nturgbd60_skeleton_path --ignored-sample-path NTU_RGBD_samples_with_missing_skeletons.txt --out-folder your_nturgbd60_output_path --task ntu60
For NTU-RGBD 120, preprocess the data and convert the format with
python gen_ntu_rgbd_raw.py --data-path your_raw_nturgbd120_skeleton_path --ignored-sample-path NTU_RGBD120_samples_with_missing_skeletons.txt --out-folder your_nturgbd120_output_path --task ntu120
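For reference, the --ignored-sample-path list names samples whose skeleton files are missing, and the conversion skips them. A minimal sketch of that filtering step (the function names are ours; the actual logic lives in gen_ntu_rgbd_raw.py):

```python
from pathlib import Path

def load_ignored(ignored_path):
    # Each line names one sample (e.g. S001C002P005R002A008) without a usable skeleton file
    with open(ignored_path) as f:
        return {line.strip() for line in f if line.strip()}

def list_valid_samples(skeleton_dir, ignored_path):
    # Keep only .skeleton files that are not on the ignore list
    ignored = load_ignored(ignored_path)
    return [p for p in sorted(Path(skeleton_dir).glob('*.skeleton'))
            if p.stem not in ignored]

# Example: samples = list_valid_samples('nturgbd60_skeletons',
#                                       'NTU_RGBD_samples_with_missing_skeletons.txt')
```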
We provide scripts to convert skeleton annotations from third-party projects to MMAction2 formats:
babel2mma2.py
TODO: