# IMPORTANT UPDATE (6/29/2024)
## The original code has an unintentional bug in evaluation that causes an inaccurate assessment of the model’s true capability. We thank Yiqian Yang, Hyejeong Jo, et al. for spotting and investigating the problem! Please read https://arxiv.org/pdf/2405.06459 for more details.

**Please refer to https://github.com/NeuSpeech/EEG-To-Text for the corrected code and detailed experiments.** To avoid further confusion, we have archived this repo.

---
# Please read https://arxiv.org/pdf/2405.06459 before proceeding!
# [(AAAI 2022) Open Vocabulary EEG-To-Text Decoding and Zero-shot sentiment classification](https://arxiv.org/abs/2112.02690)
## Create Environment
Run `conda env create -f environment.yml` to create the conda environment (named "EEGToText") used in our experiments.
## Download ZuCo datasets
- Download the ZuCo v1.0 'Matlab files' for 'task1-SR', 'task2-NR', and 'task3-TSR' from https://osf.io/q3zws/files/ under the 'OSF Storage' root, then unzip and move all `.mat` files to `~/datasets/ZuCo/task1-SR/Matlab_files`, `~/datasets/ZuCo/task2-NR/Matlab_files`, and `~/datasets/ZuCo/task3-TSR/Matlab_files`, respectively.
- Download the ZuCo v2.0 'Matlab files' for 'task1-NR' from https://osf.io/2urht/files/ under the 'OSF Storage' root, then unzip and move all `.mat` files to `~/datasets/ZuCo/task2-NR-2.0/Matlab_files`.
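
Before preprocessing, it can help to confirm that the `.mat` files landed in the directories listed above. The following is a minimal sketch and not part of this repo: the helper name `count_mat_files` is hypothetical, and it only assumes the four `Matlab_files` directory paths given in the download instructions.

```python
from pathlib import Path
from typing import Dict

# Directory names copied from the download instructions above.
EXPECTED_TASK_DIRS = ("task1-SR", "task2-NR", "task3-TSR", "task2-NR-2.0")


def count_mat_files(root: Path) -> Dict[str, int]:
    """Count .mat files under <root>/<task>/Matlab_files for each expected task.

    Hypothetical helper for illustration; it is not provided by this repo.
    """
    counts = {}
    for task in EXPECTED_TASK_DIRS:
        mat_dir = root / task / "Matlab_files"
        # A missing directory is reported as zero files rather than raising.
        counts[task] = len(list(mat_dir.glob("*.mat"))) if mat_dir.is_dir() else 0
    return counts


if __name__ == "__main__":
    zuco_root = Path.home() / "datasets" / "ZuCo"
    for task, n in count_mat_files(zuco_root).items():
        status = "ok" if n > 0 else "MISSING"
        print(f"{task:14s} {n:3d} .mat files  [{status}]")
```

A task reported as `MISSING` means either the directory does not exist or it contains no `.mat` files directly inside it, for example because the archive was unzipped into a nested folder.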
## Preprocess datasets
Run `bash ./scripts/prepare_dataset.sh` to preprocess the `.mat` files and prepare the sentiment labels.

For each task, all `.mat` files are converted into one `.pickle` file stored at `~/datasets/ZuCo/<task_name>/<task_name>-dataset.pickle`.

The sentiment dataset for ZuCo (`sentiment_labels.json`) is stored at `~/datasets/ZuCo/task1-SR/sentiment_labels/sentiment_labels.json`.

The sentiment dataset for the filtered Stanford Sentiment Treebank is stored at `~/datasets/stanfordsentiment/ternary_dataset.json`.
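
To check that preprocessing produced a readable file, the per-task pickle path pattern described above can be built and loaded as sketched below. This is an illustrative sketch only: the helper names `dataset_pickle_path` and `load_task_dataset` are hypothetical, and nothing is assumed about the structure of the unpickled object, so the example prints only its type.

```python
import pickle
from pathlib import Path


def dataset_pickle_path(task_name: str, root: Path) -> Path:
    """Build <root>/<task_name>/<task_name>-dataset.pickle (the pattern stated above)."""
    return root / task_name / f"{task_name}-dataset.pickle"


def load_task_dataset(task_name: str, root: Path):
    """Unpickle one task's dataset; the object's structure is not assumed here."""
    path = dataset_pickle_path(task_name, root)
    with path.open("rb") as handle:
        return pickle.load(handle)


if __name__ == "__main__":
    zuco_root = Path.home() / "datasets" / "ZuCo"
    path = dataset_pickle_path("task1-SR", zuco_root)
    if path.is_file():
        data = load_task_dataset("task1-SR", zuco_root)
        print(type(data).__name__)
    else:
        print(f"not found: {path}")
```

Since `pickle.load` can execute arbitrary code, it is safest to load only pickle files that you generated yourself with the preprocessing script.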
## Usage Example
### Open vocabulary EEG-To-Text Decoding
To train an EEG-To-Text decoding model, run `bash ./scripts/train_decoding.sh`.

To evaluate the trained EEG-To-Text decoding model, run `bash ./scripts/eval_decoding.sh`.

For detailed configuration of the available arguments, refer to the function `get_config(case = 'train_decoding')` in `/config.py`.

### Zero-shot sentiment classification pipeline
We first train the decoder and the classifier individually, then evaluate the pipeline on ZuCo task1-SR data.

To run the whole training and evaluation process, run `bash ./scripts/train_eval_zeroshot_pipeline.sh`.

For detailed configuration of the available arguments, refer to the function `get_config(case = 'eval_sentiment')` in `/config.py`.

## Citation
```
@inproceedings{wang2022open,
  title={Open vocabulary electroencephalography-to-text decoding and zero-shot sentiment classification},
  author={Wang, Zhenhailong and Ji, Heng},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={36},
  number={5},
  pages={5350--5358},
  year={2022}
}
```