@misc{duan2021revisiting,
title={Revisiting Skeleton-based Action Recognition},
author={Haodong Duan and Yue Zhao and Kai Chen and Dian Shao and Dahua Lin and Bo Dai},
year={2021},
eprint={2104.13586},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
We release the skeleton annotations used in Revisiting Skeleton-based Action Recognition. By default, we use Faster-RCNN with a ResNet50 backbone for human detection and HRNet-w32 for single-person pose estimation. For FineGYM, we use ground-truth bounding boxes for the athlete instead of detection bounding boxes. Currently, we release the skeleton annotations for FineGYM and the NTURGB+D Xsub split. Other annotations will be released soon.
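As a rough illustration of that two-stage pipeline, here is a minimal sketch assuming mmdetection 2.x and mmpose 0.x style APIs; the config/checkpoint paths are placeholders, and ntu_pose_extraction.py (described below) implements the actual pipeline used for the release.

```python
from mmdet.apis import init_detector, inference_detector
from mmpose.apis import init_pose_model, inference_top_down_pose_model

# Placeholder configs/checkpoints: Faster-RCNN (ResNet50) detector + HRNet-w32 pose model
det_model = init_detector('faster_rcnn_r50_fpn_coco.py', 'faster_rcnn_r50_fpn_coco.pth', device='cuda:0')
pose_model = init_pose_model('hrnet_w32_coco_256x192.py', 'hrnet_w32_coco_256x192.pth', device='cuda:0')

def extract_pose(frame):
    # Class 0 of a COCO-trained detector is 'person'; each row is (x1, y1, x2, y2, score)
    person_bboxes = inference_detector(det_model, frame)[0]
    person_results = [dict(bbox=bbox) for bbox in person_bboxes if bbox[4] >= 0.5]
    # Top-down pose estimation on the detected boxes; each result holds a (17, 3)
    # 'keypoints' array of (x, y, score)
    pose_results, _ = inference_top_down_pose_model(
        pose_model, frame, person_results, format='xyxy')
    return pose_results

# Example: pose = extract_pose('frame_000001.jpg')
```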
Currently, we support HMDB51, UCF101, FineGYM and NTURGB+D. For FineGYM, you can execute the following script to prepare the annotations.
bash download_annotations.sh ${DATASET}
Due to the Conditions of Use of the NTURGB+D dataset, we cannot directly release the annotations used in our experiments. Instead, we provide a script that generates pose annotations for videos in the NTURGB+D datasets: it generates a dictionary per video and saves it as a single pickle file. You can then create a list containing the annotation dictionaries of all corresponding videos and save it as a pickle file, which gives you the ntu60_xsub_train.pkl, ntu60_xsub_val.pkl, ntu120_xsub_train.pkl and ntu120_xsub_val.pkl files that we used in training.
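The NTURGB+D file names encode setup, camera, performer, replication and action (e.g. S001C001P001R001A001), which is what you need when assigning videos to splits and labels. A minimal parsing sketch (the regex and field names are our own, not part of the release):

```python
import re

NTU_NAME = re.compile(r'S(\d{3})C(\d{3})P(\d{3})R(\d{3})A(\d{3})')

def parse_ntu_name(frame_dir):
    """Parse an NTURGB+D sample name like 'S001C001P001R001A001'."""
    setup, camera, performer, replication, action = map(int, NTU_NAME.match(frame_dir).groups())
    return dict(
        setup=setup,             # collection setup id
        camera=camera,           # camera id (used by the xview split)
        performer=performer,     # subject id (used by the xsub split)
        replication=replication,
        label=action - 1,        # 0-based action label
    )

print(parse_ntu_name('S001C001P001R001A001'))
# {'setup': 1, 'camera': 1, 'performer': 1, 'replication': 1, 'label': 0}
```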
For those who do not have enough compute for pose extraction, we provide the outputs of the above pipeline here, corresponding to 4 different splits of the NTURGB+D datasets:
To generate 2D pose annotations for a single video, you first need to install mmdetection and mmpose from source. After that, replace the placeholders mmdet_root and mmpose_root in ntu_pose_extraction.py with your installation paths. Then you can use the following script for NTURGB+D video pose extraction:
python ntu_pose_extraction.py S001C001P001R001A001_rgb.avi S001C001P001R001A001.pkl
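To run the extraction over a whole folder of videos rather than one file, a simple driver loop like the following works (the directory names are placeholders; ntu_pose_extraction.py is invoked exactly as above):

```python
import subprocess
from pathlib import Path

video_dir = Path('nturgbd_videos')   # hypothetical folder of *_rgb.avi files
out_dir = Path('nturgbd_pose')       # hypothetical output folder for per-video pickles
out_dir.mkdir(exist_ok=True)

for video in sorted(video_dir.glob('*_rgb.avi')):
    out_pkl = out_dir / (video.stem.replace('_rgb', '') + '.pkl')
    if out_pkl.exists():
        continue  # skip videos that were already processed
    subprocess.run(['python', 'ntu_pose_extraction.py', str(video), str(out_pkl)], check=True)
```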
After you obtain pose annotations for all videos in a dataset split, such as ntu60_xsub_val, you can gather them into a single list and save the list as ntu60_xsub_val.pkl. You can use these larger pickle files for training and testing.
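A minimal sketch of that gathering step, assuming the per-video pickles produced above sit in a folder such as nturgbd_pose/ and all belong to the ntu60_xsub_val split:

```python
import pickle
from pathlib import Path

# Hypothetical folder holding one .pkl per video, produced by ntu_pose_extraction.py
pose_dir = Path('nturgbd_pose')

annotations = []
for pkl_path in sorted(pose_dir.glob('*.pkl')):
    with pkl_path.open('rb') as f:
        annotations.append(pickle.load(f))   # one annotation dictionary per video

# Save the gathered list as the split-level annotation file
with open('ntu60_xsub_val.pkl', 'wb') as f:
    pickle.dump(annotations, f)
```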
Here we briefly introduce the format of the PoseC3D annotations, taking gym_train.pkl as an example: the content of gym_train.pkl is a list of length 20484, and each item is a dictionary holding the skeleton annotation of one video. Each dictionary has the following fields:
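To inspect these fields yourself, you can simply load the file and look at one entry (a minimal sketch, assuming gym_train.pkl has been downloaded to the working directory):

```python
import pickle

with open('gym_train.pkl', 'rb') as f:
    annotations = pickle.load(f)

print(len(annotations))        # 20484 videos in the FineGYM training split
sample = annotations[0]
print(sorted(sample.keys()))   # the per-video annotation fields
for key, value in sample.items():
    # ndarray fields (e.g. keypoint coordinates/scores) print as shapes, scalars as-is
    print(key, getattr(value, 'shape', value))
```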
For training with your custom dataset, you can refer to Custom Dataset Training.
For skeleton data visualization, you also need to prepare the RGB videos. Please refer to visualize_heatmap_volume for the detailed process. Here we provide some visualization examples from NTU-60 and FineGYM.
- Pose Estimation Results
- Keypoint Heatmap Volume Visualization
- Limb Heatmap Volume Visualization
Here we also provide the script for converting the NTU RGB+D raw skeleton data to our format.
First, download the raw skeleton data of NTU-RGBD 60 and NTU-RGBD 120 from https://github.com/shahroudy/NTURGB-D.
For NTU-RGBD 60, preprocess the data and convert the format with
python gen_ntu_rgbd_raw.py --data-path your_raw_nturgbd60_skeleton_path --ignored-sample-path NTU_RGBD_samples_with_missing_skeletons.txt --out-folder your_nturgbd60_output_path --task ntu60
For NTU-RGBD 120, preprocess the data and convert the format with
python gen_ntu_rgbd_raw.py --data-path your_raw_nturgbd120_skeleton_path --ignored-sample-path NTU_RGBD120_samples_with_missing_skeletons.txt --out-folder your_nturgbd120_output_path --task ntu120
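For reference, the --ignored-sample-path list names samples whose skeleton files are missing, and the conversion skips them. A minimal sketch of that filtering step (the function names are ours; the actual logic lives in gen_ntu_rgbd_raw.py):

```python
from pathlib import Path

def load_ignored(ignored_path):
    # Each line names one sample (e.g. S001C002P005R002A008) without a usable skeleton file
    with open(ignored_path) as f:
        return {line.strip() for line in f if line.strip()}

def list_valid_samples(skeleton_dir, ignored_path):
    # Keep only .skeleton files that are not on the ignore list
    ignored = load_ignored(ignored_path)
    return [p for p in sorted(Path(skeleton_dir).glob('*.skeleton'))
            if p.stem not in ignored]

# Example: samples = list_valid_samples('nturgbd60_skeletons',
#                                       'NTU_RGBD_samples_with_missing_skeletons.txt')
```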
We provide scripts to convert skeleton annotations from third-party projects to MMAction2 formats:
babel2mma2.py
TODO: