# BSN
## Abstract
<!-- [ABSTRACT] -->
Temporal action proposal generation is an important yet challenging problem, since temporal proposals with rich action content are indispensable for analysing real-world videos with long durations and a high proportion of irrelevant content. This problem requires methods that not only generate proposals with precise temporal boundaries, but also retrieve proposals that cover ground-truth action instances with high recall and high overlap using relatively few proposals. To address these difficulties, we introduce an effective proposal generation method, named Boundary-Sensitive Network (BSN), which adopts a "local to global" fashion. Locally, BSN first locates temporal boundaries with high probabilities, then directly combines these boundaries as proposals. Globally, with the Boundary-Sensitive Proposal feature, BSN retrieves proposals by evaluating the confidence of whether a proposal contains an action within its region. We conduct experiments on two challenging datasets: ActivityNet-1.3 and THUMOS14, where BSN outperforms other state-of-the-art temporal action proposal generation methods with high recall and high temporal precision. Finally, further experiments demonstrate that by combining existing action classifiers, our method significantly improves the state-of-the-art temporal action detection performance.
<!-- [IMAGE] -->
<div align=center>
<img src="https://user-images.githubusercontent.com/34324155/143016692-69efafbd-cec6-47f1-af45-371d0ff78a97.png" width="800"/>
</div>

## Citation
<!-- [ALGORITHM] -->
```BibTeX
@inproceedings{lin2018bsn,
  title={BSN: Boundary sensitive network for temporal action proposal generation},
  author={Lin, Tianwei and Zhao, Xu and Su, Haisheng and Wang, Chongjing and Yang, Ming},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  pages={3--19},
  year={2018}
}
```
## Model Zoo
### ActivityNet feature
| config | feature | gpus | pretrain | AR@100 | AUC | gpu_mem(M) | iter time(s) | ckpt | log | json |
|:--|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:-:|
|bsn_400x100_1x16_20e_activitynet_feature |cuhk_mean_100 |1| None |74.66|66.45|41(TEM)+25(PEM)|0.074(TEM)+0.036(PEM)|[ckpt_tem](https://download.openmmlab.com/mmaction/localization/bsn/bsn_tem_400x100_1x16_20e_activitynet_feature/bsn_tem_400x100_1x16_20e_activitynet_feature_20200619-cd6accc3.pth) [ckpt_pem](https://download.openmmlab.com/mmaction/localization/bsn/bsn_pem_400x100_1x16_20e_activitynet_feature/bsn_pem_400x100_1x16_20e_activitynet_feature_20210203-1c27763d.pth)| [log_tem](https://download.openmmlab.com/mmaction/localization/bsn/bsn_tem_400x100_1x16_20e_activitynet_feature/bsn_tem_400x100_1x16_20e_activitynet_feature.log) [log_pem](https://download.openmmlab.com/mmaction/localization/bsn/bsn_pem_400x100_1x16_20e_activitynet_feature/bsn_pem_400x100_1x16_20e_activitynet_feature.log)| [json_tem](https://download.openmmlab.com/mmaction/localization/bsn/bsn_tem_400x100_1x16_20e_activitynet_feature/bsn_tem_400x100_1x16_20e_activitynet_feature.log.json)  [json_pem](https://download.openmmlab.com/mmaction/localization/bsn/bsn_pem_400x100_1x16_20e_activitynet_feature/bsn_pem_400x100_1x16_20e_activitynet_feature.log.json)|
| |mmaction_video |1| None |74.93|66.74|41(TEM)+25(PEM)|0.074(TEM)+0.036(PEM)|[ckpt_tem](https://download.openmmlab.com/mmaction/localization/bsn/bsn_tem_400x100_1x16_20e_mmaction_video/bsn_tem_400x100_1x16_20e_mmaction_video_20200809-ad6ec626.pth) [ckpt_pem](https://download.openmmlab.com/mmaction/localization/bsn/bsn_pem_400x100_1x16_20e_mmaction_video/bsn_pem_400x100_1x16_20e_mmaction_video_20200809-aa861b26.pth)| [log_tem](https://download.openmmlab.com/mmaction/localization/bsn/bsn_tem_400x100_1x16_20e_mmaction_video/bsn_tem_400x100_1x16_20e_mmaction_video_20200809.log) [log_pem](https://download.openmmlab.com/mmaction/localization/bsn/bsn_pem_400x100_1x16_20e_mmaction_video/bsn_pem_400x100_1x16_20e_mmaction_video_20200809.log) | [json_tem](https://download.openmmlab.com/mmaction/localization/bsn/bsn_tem_400x100_1x16_20e_mmaction_video/bsn_tem_400x100_1x16_20e_mmaction_video_20200809.json) [json_pem](https://download.openmmlab.com/mmaction/localization/bsn/bsn_pem_400x100_1x16_20e_mmaction_video/bsn_pem_400x100_1x16_20e_mmaction_video_20200809.json) |
| |mmaction_clip |1| None |75.19|66.81|41(TEM)+25(PEM)|0.074(TEM)+0.036(PEM)|[ckpt_tem](https://download.openmmlab.com/mmaction/localization/bsn/bsn_tem_400x100_1x16_20e_mmaction_clip/bsn_tem_400x100_1x16_20e_mmaction_clip_20200809-0a563554.pth) [ckpt_pem](https://download.openmmlab.com/mmaction/localization/bsn/bsn_pem_400x100_1x16_20e_mmaction_clip/bsn_pem_400x100_1x16_20e_mmaction_clip_20200809-e32f61e6.pth)| [log_tem](https://download.openmmlab.com/mmaction/localization/bsn/bsn_tem_400x100_1x16_20e_mmaction_clip/bsn_tem_400x100_1x16_20e_mmaction_clip_20200809.log) [log_pem](https://download.openmmlab.com/mmaction/localization/bsn/bsn_pem_400x100_1x16_20e_mmaction_clip/bsn_pem_400x100_1x16_20e_mmaction_clip_20200809.log) | [json_tem](https://download.openmmlab.com/mmaction/localization/bsn/bsn_tem_400x100_1x16_20e_mmaction_clip/bsn_tem_400x100_1x16_20e_mmaction_clip_20200809.json) [json_pem](https://download.openmmlab.com/mmaction/localization/bsn/bsn_pem_400x100_1x16_20e_mmaction_clip/bsn_pem_400x100_1x16_20e_mmaction_clip_20200809.json) |
:::{note}
1. The **gpus** column indicates the number of GPUs used to obtain the checkpoint.
   According to the [Linear Scaling Rule](https://arxiv.org/abs/1706.02677), you may set the learning rate proportional to the batch size if you use a different number of GPUs or videos per GPU,
   e.g., lr=0.01 for 4 GPUs x 2 videos/gpu and lr=0.08 for 16 GPUs x 4 videos/gpu; see the sketch after this note.
2. For the feature column, cuhk_mean_100 denotes the widely used CUHK ActivityNet feature extracted by [anet2016-cuhk](https://github.com/yjxiong/anet2016-cuhk); mmaction_video and mmaction_clip denote features extracted by MMAction, with the video-level or clip-level ActivityNet-finetuned model, respectively.
:::
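For example, a single-machine multi-GPU run with a rescaled learning rate might look like the sketch below. This is only a sketch: `tools/dist_train.sh` is the standard MMAction2 launcher, but the `--cfg-options optimizer.lr=...` override and the concrete value assume your config exposes the learning rate at `optimizer.lr`; verify both against your MMAction2 version before relying on them.

```shell
# Sketch: train BSN(TEM) on 4 GPUs with the learning rate rescaled by the
# Linear Scaling Rule. The optimizer.lr key and the value 0.004 are assumptions;
# check your config before use.
bash tools/dist_train.sh \
    configs/localization/bsn/bsn_tem_400x100_1x16_20e_activitynet_feature.py 4 \
    --cfg-options optimizer.lr=0.004
```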
For more details on data preparation, you can refer to the ActivityNet feature section in [Data Preparation](/docs/data_preparation.md).
## Train
You can use the following commands to train a model.
```shell
python tools/train.py ${CONFIG_FILE} [optional arguments]
```
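Optional arguments let you customize a run. For instance, the sketch below overrides the work directory and enables validation during training; these flags exist in recent MMAction2 releases, but confirm them with `python tools/train.py -h` before relying on them.

```shell
# Sketch: common optional arguments (verify with `python tools/train.py -h`).
python tools/train.py ${CONFIG_FILE} --work-dir work_dirs/bsn_tem --validate
```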
Examples:
1. Train BSN(TEM) on the ActivityNet features dataset.
    ```shell
    python tools/train.py configs/localization/bsn/bsn_tem_400x100_1x16_20e_activitynet_feature.py
    ```
2. Train BSN(PEM) on PGM results (see the pipeline sketch after this list).
    ```shell
    python tools/train.py configs/localization/bsn/bsn_pem_400x100_1x16_20e_activitynet_feature.py
    ```
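Note that the BSN stages depend on one another: PEM is trained on the proposals that PGM builds from TEM outputs. A sketch of the full training order follows; the `work_dirs/.../latest.pth` checkpoint path is a placeholder for whatever your TEM run actually produced.

```shell
# 1. Train TEM on the ActivityNet features.
python tools/train.py configs/localization/bsn/bsn_tem_400x100_1x16_20e_activitynet_feature.py
# 2. Run TEM inference to dump boundary probabilities (placeholder checkpoint path).
python tools/test.py configs/localization/bsn/bsn_tem_400x100_1x16_20e_activitynet_feature.py work_dirs/bsn_tem_400x100_1x16_20e_activitynet_feature/latest.pth
# 3. Generate candidate proposals with PGM for the training split.
python tools/misc/bsn_proposal_generation.py configs/localization/bsn/bsn_pgm_400x100_activitynet_feature.py --mode train
# 4. Train PEM on the PGM results.
python tools/train.py configs/localization/bsn/bsn_pem_400x100_1x16_20e_activitynet_feature.py
```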
For more details and optional arguments, you can refer to the **Training setting** part in [getting_started](/docs/getting_started.md#training-setting).
## Inference
You can use the following commands to run inference with a model.
1. For TEM Inference
    ```shell
    # Note: TEM results cannot be evaluated directly; they serve as input to the PGM step.
    python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [optional arguments]
    ```
2. For PGM Inference
    ```shell
    python tools/misc/bsn_proposal_generation.py ${CONFIG_FILE} [--mode ${MODE}]
    ```
3. For PEM Inference
    ```shell
    python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [optional arguments]
    ```
Examples:
1. Run BSN(TEM) inference with a pretrained model.
    ```shell
    python tools/test.py configs/localization/bsn/bsn_tem_400x100_1x16_20e_activitynet_feature.py checkpoints/SOME_CHECKPOINT.pth
    ```
2. Run BSN(PGM) proposal generation on TEM results (no checkpoint is needed).
    ```shell
    python tools/misc/bsn_proposal_generation.py configs/localization/bsn/bsn_pgm_400x100_activitynet_feature.py --mode train
    ```
3. Run BSN(PEM) inference with the evaluation metric 'AR@AN' and output the results; a quick way to inspect the output follows this list.
    ```shell
    # Note: If evaluating, make sure the annotation file for the test data contains the ground truth.
    python tools/test.py configs/localization/bsn/bsn_pem_400x100_1x16_20e_activitynet_feature.py checkpoints/SOME_CHECKPOINT.pth --eval AR@AN --out results.json
    ```
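To sanity-check the dumped proposals, you can pretty-print the output file. This assumes `results.json` was written by the command above:

```shell
# Pretty-print the first lines of the dumped results for a quick sanity check.
python -m json.tool results.json | head -n 20
```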
## Test
You can use the following commands to test a model.
1. TEM
    ```shell
    # Note: TEM results cannot be evaluated directly; they serve as input to the PGM step.
    python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [optional arguments]
    ```
2. PGM
    ```shell
    python tools/misc/bsn_proposal_generation.py ${CONFIG_FILE} [--mode ${MODE}]
    ```
3. PEM
    ```shell
    python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [optional arguments]
    ```
Examples:
1. Test a TEM model on the ActivityNet dataset.
    ```shell
    python tools/test.py configs/localization/bsn/bsn_tem_400x100_1x16_20e_activitynet_feature.py checkpoints/SOME_CHECKPOINT.pth
    ```
2. Test a PGM model on the ActivityNet dataset.
    ```shell
    python tools/misc/bsn_proposal_generation.py configs/localization/bsn/bsn_pgm_400x100_activitynet_feature.py --mode test
    ```
3. Test a PEM model with the evaluation metric 'AR@AN' and output the results (the full test-stage order is sketched after this list).
    ```shell
    python tools/test.py configs/localization/bsn/bsn_pem_400x100_1x16_20e_activitynet_feature.py checkpoints/SOME_CHECKPOINT.pth --eval AR@AN --out results.json
    ```
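As in training, the test stages run in sequence. The sketch below strings together the commands above; the checkpoint paths are placeholders for your own files.

```shell
# 1. Dump TEM boundary probabilities for the test data (not evaluated directly).
python tools/test.py configs/localization/bsn/bsn_tem_400x100_1x16_20e_activitynet_feature.py checkpoints/SOME_CHECKPOINT.pth
# 2. Generate proposals from the TEM outputs for the test split.
python tools/misc/bsn_proposal_generation.py configs/localization/bsn/bsn_pgm_400x100_activitynet_feature.py --mode test
# 3. Evaluate PEM on the generated proposals.
python tools/test.py configs/localization/bsn/bsn_pem_400x100_1x16_20e_activitynet_feature.py checkpoints/SOME_CHECKPOINT.pth --eval AR@AN --out results.json
```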
:::{note}
1. (Optional) You can use the following command to generate a formatted proposal file, which will be fed into an action classifier (currently only SSN and P-GCN are supported, not TSN, I3D, etc.) to get the classification results of the proposals.
    ```shell
    python tools/data/activitynet/convert_proposal_format.py
    ```
:::
For more details and optional arguments, you can refer to the **Test a dataset** part in [getting_started](/docs/getting_started.md#test-a-dataset).