data/data_items.csv


# Origin Rationale Data EEG processing Methodology Results and discussion Reviewed by
1 Title Year Authors Journal / Origin Preprint first Type of paper Lab / School / Company Country Pages Domain 1 Domain 2 Domain 3 Domain 4 High-level Goal Practical Goal Task/Paradigm Motivation for DL EEG Hardware Neural response pattern Dataset name Dataset accessibility Data Data - samples Data - time Data - subjects Nb Channels Sampling rate Offline / Online Preprocessing Preprocessing (clean) Artefact handling Artefact handling (clean) Features Features (clean) Normalization Software Architecture Architecture (clean) Design peculiarities EEG-specific design Network Schema Input format Layers Layers (clean) Activation function Regularization Regularization (clean) Nb Classes Classes Output format Nb Parameters Training procedure Training procedure (clean) Optimizer Optimizer (clean) Optim parameters Minibatch size Hyperparameter optim Hyperparameter optim (clean) Data augmentation Loss Intra/Inter subject Cross validation Cross validation (clean) Data split Performance metrics Performance metrics (clean) Training hardware Training time Results Benchmarks Baseline model type Statistical analysis of performance Analysis of learned parameters Model inspection (clean) Discussion Limitations Code available Code hosted on Limited data First Reader Second Reader Validated by Author(s) Citation
2 EEG-signals based cognitive workload detection of vehicle driver using deep learning 2018 Almogbel, Dang & Kameyama IEEE Conference on Advanced Communication Technology No Conference Waseda University Japan 4 Classification of EEG signals Monitoring Cognitive Mental workload Improve State-of-the-Art Driving Game (GTA) Muse (InteraXon) Raw EEG Internal Recordings Private 1 subject x 24 sessions (12 High / 12 Low Workload) 15-30min each session (Sliding windows [from 30s to 180s], 1/256 overlap) 216 540 1 4 256 None No No No Raw EEG Raw EEG z-score N/M* CNN CNN Yes 38400x4 (38400 = 150s @ 256Hz) 7 Conv + 3 FC 10 ReLU Dropout: 50% Yes 2 (Softmax) N/M* Standard Standard RMSProp Other lr=0.002 64 N/M N/M No Binary cross-entropy Intra Leave-One-Session-Out Leave-One-Session-Out Train: 92% Valid: 8% Test: N/A Accuracy accuracy N/M N/M 95.31% No None No No No "This study does not impose in any way a direct comparison with the distinguished previous works because the used data, experimental conditions, classification targets are different in each, but rather explore and introduce the potential of using deep CNN architecture in classifying raw EEG signals without any pre-processing." No N/A No Yannick Roy Isabela Albuquerque TBC Almogbel2018
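A minimal Keras sketch of the Almogbel2018 architecture as summarized in the row above: 7 convolutional and 3 fully-connected layers on raw 150-s, 4-channel Muse windows (38400 x 4), ReLU, 50% dropout, a 2-way softmax, RMSprop at lr=0.002, and binary cross-entropy on minibatches of 64. Filter counts, kernel sizes, and pooling are not reported in the row and are assumptions; names are illustrative.

```python
# Hedged reconstruction of the Almogbel2018 CNN; conv widths/kernels are assumed.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_workload_cnn(n_samples=38400, n_channels=4):
    model = models.Sequential([layers.Input(shape=(n_samples, n_channels))])
    for filters in (16, 16, 32, 32, 64, 64, 128):   # 7 conv layers (assumed widths)
        model.add(layers.Conv1D(filters, kernel_size=7, strides=2,
                                padding="same", activation="relu"))
        model.add(layers.MaxPooling1D(2))
    model.add(layers.Flatten())
    for units in (256, 128):                        # first 2 of the 3 FC layers (assumed widths)
        model.add(layers.Dense(units, activation="relu"))
        model.add(layers.Dropout(0.5))              # 50% dropout, as reported
    model.add(layers.Dense(2, activation="softmax"))  # high vs. low workload
    model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.002),
                  loss="binary_crossentropy",       # expects one-hot labels over the 2 classes
                  metrics=["accuracy"])
    return model

model = build_workload_cnn()
model.summary()
```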
3 Automatic ocular artifacts removal in EEG using deep learning 2018 Yang, Duan, Fan, Hu & Wang Biomedical Signal Processing and Control No Journal Key Laboratory of Power Station Automation Technology, Shanghai University China 11 Improvement of processing tools Signal cleaning Artifact handling Novel Motor Imagery WirelessEEG (Neuracle) Clean EEG / Ocular artefacts BCI Competition IV - I; Internal Recordings Public Each subject has 200 trials of motor imagery and each trial lasts for more than 6s. Subject 1, 2, 6 and 7 from BCI Comp. dataset + 3 (internal recordings) 1400 140 7 59; 32 100 1) Band-Pass Filter: 0.05-200Hz Yes No No Raw EEG Raw EEG min-max MATLAB SAE AE N/M* Yes 100x1 3 3 L1 Yes 100x1 N/M* Greedy Layer-wise training Standard N/M* N/M N/M N/M No RMSE Inter No No 16520 training samples 15458 test samples RMSE RMSE Not mentioned N/M* RMSE is lower for the proposed approach than for benchmarks, as is the accuracy on the surrogate MI task Shallow SAE, ICA, K-ICA, SOBI Traditional pipeline No No No "Compared with the classical OAs removal methods, the proposed method has many highlights. [...] In the future work, we are going to improve the training method of DLN or try replacing the SAE with other neural networks such as convolutional neural networks (CNN) to strengthen its fitting ability for the details of EEG." No N/A No Yannick Roy Hubert Banville TBC Yang2018
4 An end-to-end framework for real-time automatic sleep stage classification 2018 Patanaik, Ong, Gooley, Ancoli-Israel & Chee Sleep No Journal Duke-NUS Medical School, Singapore University of California, San Diego Singapore 11 Classification of EEG signals Clinical Sleep Staging Improve State-of-the-Art: DL for Sleep (CNN + MLP) Reduce the time necessary to stage sleep recordings by using DL Sleep No need for feature engineering N/M Raw EEG Internal Recordings Private Four datasets ≈ 1700 polysomnography records a total of 11,727 hr of PSG data / 1,403,164 epochs 1403164 703620 459 2 N/M 1) Pass-Band Filter (FIR): 0.3-45Hz 2) Downsampled to 100Hz (polyphase FIR filter) Yes No No Spectrogram Frequency-domain N/M TensorFlow CNN + MLP (2 Stages) CNN Consecutive probabilities outputted by the CNN are aggregated by the MLP N/M Yes 32x32x3 (spectrogram 2D x 3 channels) CNN: 16 MLP: 1 17 CNN: ReLU MLP: tansig N/M N/M 5 (Softmax) Probability of each Sleep Stage dCNN: 177 669 weights MLP: 445 weights Standard optimization Standard Stochastic gradient descent with Nesterov momentum SGD Learning rate: 0.001 Momentum: 0.9 Learning rate decay: 10e-6 CNN: 300 MLP: 1000 Trial and error Yes No Categorical Cross-Entropy Inter Train-Valid-Test Train-Valid-Test Train: 75% of DS1 & DS2 Test: 25% of DS1 & DS2 Validation: DS3, DS4 Accuracy Cohen's kappa accuracy, Cohen's kappa NVidia GTX 1060 N/M Test set: ~89.8%. kappa=0.862 Validation set 1: 81.4%, kappa=0.740 Validation set 2: 72.1%, kappa=0.597 Expert rescoring of 50 records Traditional pipeline t-test on Cohen's kappa (automatic vs. expert rescoring) -> stat. diff. for validation set 3 but not for 4 No No "... our framework provides a practicable, validated, and speedy solution for automatic sleep stage classification that can significantly improve throughput and productivity of sleep labs. It has the potential to play an important role in emerging novel applications of real-time automatic sleep scoring as well as being installed in personal sleep monitors." N/M No N/A No Yannick Roy Hubert Banville Yes Patanaik2018
5 Epileptic Seizure Detection: A Deep Learning Approach 2018 Hussein, Palangi, Ward & Wang Arxiv Yes Preprint UBC Canada 12 Classification of EEG signals Clinical Epilepsy Detection Improve State-of-the-Art: DL for Epilepsy (LSTM) Improve performance on seizure detection with DL, on real conditions (with noise) Resting State, Eyes Open, Eyes Closed, Seizures. Automatically learns features N/M Raw EEG Bonn University Public Bonn University (A,B,C,D,E) 5x 100 epochs of 23.6s 500 197 15 1 173.6 1) Artifacts Removed 2) Band-Pass Filter: 0.53-40Hz (before saving the dataset... "hardcoded") Yes Yes (dataset already cleaned) Yes Raw EEG Raw EEG N/M MATLAB, Python Keras with TensorFlow backend LSTM RNN N/M N/M Yes 100x2 3 3 N/M N/M N/M 2, 3 or 5 N/M Standard optimization Standard Adam Adam LR: 0.001 64 N/M N/M Added artifacts (EMG, EOG) and Gaussian white noise Categorical Cross-Entropy Inter 10-Fold CV k-fold Train: 80% Test: 20% Accuracy Sensitivity Specificity accuracy, sensitivity, specificity NVidia K40 2h 100% everywhere. For Sensitivity, Specificity & Accuracy of the 2-Classes, 3-Classes & 5-Classes. Robust to artificial artifacts Compared with many other SotA using the same dataset. BNN, ME, SVM, ELM, LDA, SVM, KNN, ANN, etc. Traditional pipeline No No No Compared to the state-of-the-art methods, this approach can learn the high-level representations, and can effectively discriminate between the normal and seizure EEG activities. Another advantage of this approach lies in its robustness against common EEG artifacts (e.g., muscle activities and eye-blinking) and white noise. Unbalanced class distributions No N/A No Yannick Roy Hubert Banville TBC Hussein2018
6 Development of a brain computer interface using multi-frequency visual stimulation and deep neural networks 2018 Perez-Benitez, Perez-Benitez & Espina-Hernandez IEEE Conference on Electronics, Communications and Computers No Conference National Polytechnic Institute, Mexico Mexico 7 Classification of EEG signals BCI Reactive SSVEP Improve State-of-the-Art: SSVEP with CNN Increase number of commands and reduce eyestrain in a visual BCI SSVEP (with LEDs) Just another classifier! (Custom-made) SSVEP Internal Recordings Private 11 subjects x 5 stimuli N/M N/M 11 3 250 N/M N/M N/M N/M Spectrum Frequency-domain N/M N/M SAE (Sparse AutoEncoder) AE N/M N/M Yes N/M SAE: 2 Final: 2 4 Sigmoid L2 regularization Sparsity loss Yes 5 (Softmax) Diff SSVEP Freqs N/M 1) Train SAE 2) Train softmax on top of SAE middle layer Pre-training N/M N/M epochs: 50 lambda (L2): 0.16 gamma (sparsity): 1.0 rho: 0.1 N/M N/M N/M No Mean Squared Error (SAE) Cross-entropy (softmax layer) Intra No No N/M Accuracy accuracy N/M N/M 97.78% (not clear if on training set or something else!) k-NN Naive Bayes, Bayes Kernel Decision Tree, Random Forest, Gradient Boosted Tree Rule Induction, MC-SVM, ML Perceptron Traditional pipeline No Visualization of learned parameters Analysis of weights The analysis of the DNN first layer weights reveals that there are two main patterns containing information about the SSVEPs in the power spectra of the measured EEG signals: (i) the weights reinforce the features of the spectrum at frequencies {fst}, 3/2 {fst} and 2{fst} where fst are the frequencies of the MFVS and the other (ii) the weights reinforce the features of the spectrum at low frequencies from 0 Hz – 20Hz. N/M No N/A No Yannick Roy Hubert Banville TBC Perez-Benitez2018
7 Deep Semantic Architecture with discriminative feature visualization for neuroimage analysis 2018 Ghosh, Dal Maso, Roig, Mitsis & Boudrias Arxiv Yes Preprint McGill, UdeM Canada 11 Classification of EEG signals Monitoring Physical Exercise Improve SOTA Study the add-on effects of exercise on motor learning Hand motor task before and after an acute exercise Does not require hand-engineered features (BrainProducts) Brain Rhythms (SMR) Internal Recordings Private 25 subjects 4 x 50 x [3.5 sec (holding) + 3 to 5 sec (rest)] 5000 625 25 64 2500 1) Band-Pass Filter: 0.5-55Hz 2) Re-reference to average. 3) Visual Inspection, noisy signal segment removed 4) ICA to remove eye blinks 5) Morlet Wavelet (wave:7, 1Hz reso) Yes Visual inspection to reject transient artefacts ICA for eye blinks Yes Frequency Bands (55) Frequency-domain Per-electrode spectral normalization Brainstorm (MATLAB), Torch CNN CNN 1) Base CNN that expects baseline and post-condition data in parallel 2) CNN that predicts class 3) Adversarial Component to penalize subject-dependent training Base CNN: spectral-only convolutions Yes 64 x 55 (channels x freq bands) [x2 since the Base CNN is used twice in a single pass] [On TF maps, on topo maps] BaseCNN: 2 + 1, 3 + 1 Discriminator: 2, 2 Adversary: 2, 2 7 ReLU Dropout, weight decay Yes 2 Prob of EXE Prob of CON N/M Standard optimization Standard Adam Adam [On TF maps, on topo maps] LR: 0.001, 0.001 LR decay: 0.0001, 0.001 Weight decay: 0.001, 0.03 N/M N/M N/M No Negative Log-Likelihood (part 1) & KL-Divergence (part 2) Inter N/M No Train: 80% Validation: 20% Test: N/M Accuracy accuracy N/M N/M 98.70% N/M None No Visualization of class activation maps (proposed method) Class Activation Maps "Importantly, the proposed novel method enabled us to visualize the features learnt by deep networks such as CNNs, which may in turn yield better interpretation of their classification basis." N/M No N/A No Yannick Roy Hubert Banville TBC Ghosh2018
8 Cascade and Parallel Convolutional Recurrent Neural Networks on EEG-based Intention Recognition for Brain Computer Interface 2018 Zhang, Yao, Zhang, Wang, Chen & Boots AAAI Conference on Artificial Intelligence Yes Preprint University of New South Wales Australia 8 Classification of EEG signals BCI Active Motor imagery Novel Approach: Cascade & Parallel CNN and RNN Compare Cascade and Parallel CNN + RNN on Motor Imagery Dataset (eegmmidb) to SOTA Motor Imagery (see eegmmidb dataset) To capture temporal and spatial information. N/M Motor Imagery eegmmidb; Internal Recordings Both (eegmmidb) 108 subjects, 3,145,160 EEG (2808min) (Internal recordings) 9 subjects x 30 trials (6 per class) Internal recordings: 10s action, 10s rest 3145430 2898 108 64 160 Offline 1) 2D Mesh (Matrix) 2) Sliding Window (clips) 3) Normalize Yes No No 2D Mesh Clips (of Raw EEG) Raw EEG z-score N/M Cascade / Parallel CNN + RNN (LSTM) CNN+RNN CNN + LSTM combined (serial or parallel) To capture spatial and temporal resolution Yes 2D Data mesh (time signal x spatial matrix) 3 CNN + 1 FC (1024) 2 LSTM (64) + 1 FC (1024) 7 N/M Dropout (0.5) Yes 5 5 Motor Commands 5 (Softmax) N/M N/M N/M Adam Adam LR: 0.0001 N/M N/M N/M N/M Cross-Entropy Inter N/M No Train: 75% Test: 25% Accuracy accuracy Nvidia Titan X Pascal N/M Cascade: 0.9824 Parallel: 0.9828 (Major and Conrad 2017) : 0.72 - ICA (Shenoy, Vinod, and Guan 2015) : 0.82 - SR-FBCSP (Pinheiro et al. 2016) : 0.85 - SVM (Kim et al. 2016) : 0.80 - SUTCCSP (Zhang et al. 2017) : 0.79 - XGBoost (Bashivan et al. 2016) : 0.67 - R-CNN DL & Trad. No No No A large-scale dataset of 108 participants on five categories is used to evaluate the proposed models. The experimental results show that both the cascade and parallel architectures could achieve very competitive accuracy around 98.3%, considerably superior to the state-of-the-art methods. N/M No N/A No Yannick Roy Isabela Albuquerque TBC Zhang2018c
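A minimal sketch of the cascade variant from the Zhang2018c row above: a per-time-step 2D CNN over the electrode mesh ("3 CNN + 1 FC (1024)") feeding two stacked LSTMs and a second FC layer ("2 LSTM (64) + 1 FC (1024)"), dropout 0.5, Adam at lr=0.0001, and cross-entropy over 5 motor commands. The 10x11 mesh, 10 time steps, and conv widths are assumptions; names are illustrative.

```python
# Hedged reconstruction of the cascade CNN+LSTM; mesh size and widths are assumed.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cascade_crnn(n_steps=10, mesh_h=10, mesh_w=11, n_classes=5):
    inp = layers.Input(shape=(n_steps, mesh_h, mesh_w, 1))  # time x 2D electrode mesh
    x = inp
    for filters in (32, 64, 128):                            # 3 conv layers (assumed widths)
        x = layers.TimeDistributed(
            layers.Conv2D(filters, 3, padding="same", activation="relu"))(x)
    x = layers.TimeDistributed(layers.Flatten())(x)
    x = layers.TimeDistributed(layers.Dense(1024, activation="relu"))(x)  # FC(1024)
    x = layers.LSTM(64, return_sequences=True)(x)            # 2 stacked LSTM(64)
    x = layers.LSTM(64)(x)
    x = layers.Dense(1024, activation="relu")(x)             # FC(1024)
    x = layers.Dropout(0.5)(x)                               # dropout 0.5, as reported
    out = layers.Dense(n_classes, activation="softmax")(x)   # 5 motor commands
    model = models.Model(inp, out)
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),  # LR 0.0001, as reported
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model
```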
9 A hierarchical LSTM model with attention for modeling EEG non-stationarity for human decision prediction 2018 Hasib, Nayak & Huang IEEE EMBS International Conference on Biomedical & Health Informatics No Conference University of Texas, San Antonio USA 4 Classification of EEG signals BCI Active Mental tasks Improve SOTA Novel Approach: H-LSTM with Attention for Decision Classification Allow or Deny Access based on ID + Image (Guard) No need for hand-engineered features ActiveTwo (BioSemi) Raw EEG BCIT Guard Duty Private 1782 of 5297 sequences selected: 892 Deny + 890 Allow 18 Subjects (10s windows) 1782 297 18 64 512 1) Downsampled to 128Hz 2) Band-Pass Filter: 0.1-55Hz Yes N/M N/M Raw EEG Raw EEG z-score N/M LSTM RNN Hierarchical (from samples in first layer to epochs in second layer) Attention mechanism First layer acts on samples Second layer acts on epochs Yes 0.5s epochs 2 2 N/M L2 weight decay Yes 1 Allow / Deny N/M Standard optimization Standard Adam Adam N/M N/M N/M N/M N/M Cross-Entropy Inter 3-Fold CV k-fold Train: 60% Validation: 6% Test: 33% ROC AUC ROC AUC N/M N/M H-LSTM (w/ Attention & 0.5s epochs): 82.6% H-LSTM (w/ Attention & 2.5s epochs): 81% H-LSTM (w/ Attention & 5s epochs): 81.6% H-LSTM (w/out Attention & 0.5s epochs): 80.3% H-LSTM (w/out Attention & 5s epochs): 73.7% SVM: 65% CNN: 69% DL & Trad. No No No "Using the attention mechanism does help enhance the discriminate features obtained from these epochs, although it does not help model the EEG non-stationarity" "Consistent with the observation from LSTM performance, we observed an increase of performance with shorter epoch length." N/M No N/A No Yannick Roy Hubert Banville TBC Hasib2018
10 Deep EEG super-resolution: Upsampling EEG spatial resolution with Generative Adversarial Networks 2018 Corley & Huang IEEE EMBS International Conference on Biomedical & Health Informatics No Conference University of Texas, San Antonio USA 4 Generation of data Generating EEG Spatial upsampling Novel Approach: GAN for EEG Upsampling. BCI Competition III - Dataset V GANs previous great results on image super-resolution N/M N/A BCI Competition III - V Public (BCI Comp. III - V) 1,096,192 samples from 3 subjects (Windows of 1s, 480/512 overlap) 36397 35.7 3 32 512 1) Downsampling in the number of channels (from 32 to 16) Yes No No None Raw EEG z-score N/M WGAN GAN N/M Convolutional layers with kernel dimensions that find the relationships between channels Yes 32 x 512 (channels x samples) Gen: 6 Conv Layers Discrim: 4 Conv Layers + 1 FC 6 ELU Dropout (0.1 - 0.25) Yes 32 x SR (Channels x Super Resolved) (upsampled data) N/M Pre-trained Gen fine-tuned w/ WGAN framework losses w/ gradient penalty weight of 10. Also, label smoothing technique Other Adam Adam a=10^-4, b1=0.5, b2=0.9 64 N/M N/M N/M Gen: MSE Discrim: Distance Inter Holdout Holdout Train: 75% Valid: 20% Test: 5% MSE MAE (mean absolute error) Accuracy, precision and recall (for classification task) MSE, MAE, accuracy, precision, recall N/M N/M [Scale 2 - Test] MSE: 2.06E3 | MAE: 24.66 [Scale 4 - Test] MSE: 8.68E3 | MAE: 64.39 ~10^4 fold (MSE) and ~10^2 fold (MAE) compared to Bicubic Interpolation Bicubic Interpolated Channel Data Traditional pipeline No No No "Feature scaling techniques besides standard normalization decreased model performance. [...] It was notably difficult and time-consuming to train GANs for EEG data. [...] After testing different variants of GAN: WGAN appeared to be more stable during training." "It was notably difficult and time-consuming to train GANs for EEG data" No N/A No Yannick Roy Isabela Albuquerque TBC Corley2018
11 Spatial and Time Domain Feature of ERP Speller System Extracted via Convolutional Neural Network 2018 Yoon, Lee & Whang Computational Intelligence and Neuroscience No Journal Duke University Sangmyung University USA 12 Classification of EEG signals BCI Reactive ERP Alleviate BCI illiteracy Reduce BCI illiteracy in P300 spellers by using CNNs Rapid Serial Visual Presentation (P300 speller) Uncover new unknown spatial/temporal patterns. When an optimal filter is applied, the convolution will magnify the feature of interest and reduce the others [25]. B-Alert X10 (ABM) P300 and oddball paradigm-related EEG activity Internal Recordings Private 33 subjects, 2 to 4 pairs of sessions (offline + online) 12 trials / session (each trial = 10s + ERP stimuli) 20 times x 6 icons / trial (300ms) 33x3x12x6x20 = 142,560 samples 142560 712 33 11 256 Both N/M N/M N/M N/M Raw EEG Raw EEG N/A TensorFlow Python CNN CNN - Layer 1: spatial correlation Layer 2: temporal filter Yes 14 x 300 (channels x samples) CNN: 2 FC: 2 4 ReLU Dropout (0.1 - 0.25) Yes 6 6 different icons (Power On/Off, Volume Up/Down, Channel Up/Down) 2 N/M Standard optimization Standard Adam Adam N/M N/M N/M N/M N/M N/M Intra No No Train: 50% (offline) Test: 50% (online) Accuracy, sensitivity, precision, F1 score, ROC (+ ANOVA on metrics) accuracy, sensitivity, precision, f1-score, ROC N/M N/M Accuracy: 88.9% for high performing group, 68.7% for low performing group No benchmark None ANOVA Visualization of weights Analysis of weights A P300 is not visible in all subjects, but there seems to be a P700 that is pretty consistent across subjects. Spatial features seem to play a more important role than temporal features in the classification of an oddball task. - No N/A No Hubert Banville Yannick Roy Yes Yoon2018
12 Spectrographic Seizure Detection Using Deep Learning With Convolutional Neural Networks 2018 Yan, Wang & Grinspan Neurology No Supplement Weill Cornell Medical College New York USA Classification of EEG signals Clinical Epilepsy Detection Improve State-of-the-Art: Using CNN on Spectrogram for Seizure Detection Existing dataset, no mention of any task. (Supposed: Resting State) N/M Raw EEG (Seizure) CHB-MIT Public 130 EEGs with 177 total seizures, and 549 EEGs without seizures. >90% of seizures were <2 minutes long. 130 EEGs with seizure and 130 randomly selected EEGs without seizures were converted to the median power spectrogram (MPS). The training set consisted of 16,992 seizure containing images and 16,992 images without seizures (80% of total images). The testing set contained 4,248 seizure containing images and 4,248 images without seizures (20% of total images). 33984 N/M N/M -1 N/M N/M* N/M N/M N/M Median Power Spectrogram (MPS) Frequency-domain N/M* CNN (4 variants of VGG16) CNN N/M* No Images (1s sliding window of MPS) 16 N/M Dropout: 0.5 Yes N/M* N/M* N/M N/M N/M N/M N/M No N/M* Inter N/M No Train: 80% Test: 20% Sensitivity Specificity sensitivity, specificity N/M* All four CNN variants achieved >98% sensitivity and specificity N/M* None No No No "Convolutional neural nets can achieve high sensitivity and specificity in detecting seizures within spectrograms. However, generalizability and overfitting remains a concern. Further evaluation with more diverse data sets, images grouped by individual seizures, and additional regularization techniques is warranted." No N/A TBD Yannick Roy TBR TBC Yan2018
13 Generating target / non-target images of an RSVP experiment from brain signals by conditional generative adversarial network 2018 Lee & Huang IEEE EMBS International Conference on Biomedical & Health Informatics No Conference University of Texas, San Antonio USA 4 Generation of data Generating images conditioned on EEG Novel Approach: generating images conditioned on EEG Using EEG from RSVP to generate images (target or non-target) RSVP - 5 Images/s GAN models. ActiveTwo (BioSemi) RSVP Internal Recordings Private 10 subjects, 5 sessions (~1h /session), 880 Epochs (1s windows) 880 14.6 10 32 -1 Offline - PREP Pipeline (EEGLAB): bandpass (0.1-55 Hz), robust referencing, interpolating bad channels - Downsampled to 32Hz - Subset of 32 channels (visual cortex) Yes Yes Yes Raw EEG Raw EEG z-score EEGLAB cGAN GAN It's to generate the image, not the EEG data. Based on DCGAN N/A Yes 32 x 32 (channels x samples) Generator: 4 Discriminator: 4 4 G: Leaky ReLU D: ReLU N/M N/M N/A N/A 64 x 64 (image) N/M GAN Style. Standard N/M N/M N/M 16 N/M N/M N/M N/M Inter No No Train: 704 epochs Test: 176 epochs Visual inspection (making sure generated image is of the right class) visual inspection N/M 2-3h Accuracy: 0.625 None None No Occlusion of input EEG and visualization of generated image Occlusion of input We demonstrated the performance of the proposed cGAN model and showed that generation with raw or normalized EEG produced better performance than that with added noise. We also showed how this model could be used for investigating the EEG and image associations. N/M No N/A No Yannick Roy Hubert Banville TBC Lee2018
14 Cross-Participant EEG-Based Assessment of Cognitive Workload Using Multi-Path Convolutional Recurrent Neural Networks 2018 Hefron, Borghetti, Schubert Kabban, Christensen & Estepp Sensors No Journal Air Force Institute of Technology (Ohio) USA 27 Classification of EEG signals Monitoring Cognitive Mental workload Novel Approach: Using a Multi-Path Convolutional Recurrent Neural Network (MPCRNN) to improve SOTA on cross-participant classification of cognitive workload Tackle cross-subject variability in cognitive workload assessment Multi-Attribute Task Battery (MATB) environment Assumptions regarding brain activity are better matched by a deep representation that includes multi-path connections. ActiveTwo (BioSemi) None Internal Recordings Private 8 subjects * 4 blocks * 6 conditions * 5 min (1s windows) 57600 960 8 128 4096 1) Trimmed to 303s trials 2) Downsampled to 512Hz 3) Down-selected 64 electrodes 4) PREP Pipeline to identify and interpolate bad channels, calculate a robust average reference, and remove line noise 5) High-Pass Filter: 1Hz 6) PSD 3-55Hz (2s Hanning-Windowed STFT, 1s overlap) Yes Yes (manual identification of high-variance segments) No PSD - Frequency Bands (53) Frequency-domain [-1, 1] Keras, TensorFlow (multi-path, residual) CNN + (bi-directional, residual) LSTM CNN+RNN It combines a wide multi-path, residual, convolutional network with a bi-directional, residual LSTM. 1x1 convolutions to act as cross-channel parametric pooling Yes 20x53x64 (time x frequency bands x channels) [very deep, see schema / paper] 8 ReLU and sigmoid Dropout + batch normalization + early stopping + L1 + L2 Yes 1 (Mental Workload) MPCRNN: 6.2M Standard. Standard Adam Adam LR: from 0.0001 to 0.000001 128 N/M N/M N/M Binary cross-entropy Inter Test: Hold-out 1 Participant Training: 7-Fold Cross-Validation k-fold; Holdout Train: 6 participants Validation: 1 participant Test: 1 participant Mean Accuracy accuracy N/M N/M between 80-86% (depending on sequence length used as input) Simpler DL architectures DL ANOVA + post-hoc Tukey Honest Significant Difference tests No No We found that while increasing sequence length improves model accuracy, it does not improve generalizability since cross-participant variance increases due to cross-participant distributional differences. Furthermore, longer sequences reduce temporal specificity which decreases a model’s utility in a real-time environment. The only condition among our experiments across sequence lengths, architectures, and training methods which resulted in improved accuracy and decreased cross-participant variance was the multi-path convolutional recurrent architecture. N/M No N/A Yes Yannick Roy Isabela Albuquerque Yes Hefron2018
15 Classification of auditory stimuli from EEG signals with a regulated recurrent neural network reservoir 2018 Moinnereau, Brienne, Brodeur, Rouat, Whittingstall & Plourde Arxiv Yes Preprint Université de Sherbrooke Canada 5 Classification of EEG signals BCI Reactive Heard speech decoding Improve SOTA Classify heard speech (vowels) from EEG Auditory Stimuli + Imagined Speech Can extract features automatically BrainAmp (BrainProducts) Raw EEG Internal Recordings Private 8 subjects x 3 stimuli x 200 times each (2s windows, onset at 0.5s) (preprocessing removed 30%!) 4800 9600 8 64 N/M Offline 1) Pass-Band Filter: 0.1-45Hz 2) Re-sampled at 500Hz 3) Windows of 2s (stimulus at 0.5s) 4) Trials with Amplitude > +-75uV rejected 5) Re-reference to local average Yes Yes (amplitude thresholding) Yes Spike Train from Ben’s Spike Algorithm (BSA) Other N/M Python RNN Reservoir RNN The reservoir comprises 512 neurons placed in a three-dimensional grid where 80% are excitatory and 20% are inhibitory neurons N/M Yes Spike Trains per channel N/A N/M Leaky Integrate-and-Fire N/M N/M 3 "a", "i", "u" 1 N/M Reservoir: unsupervised tuning Classifier: linear regression Standard N/M N/M N/M N/M N/M N/M N/M N/M Intra 5-Fold CV k-fold Train: 4/5 Test: 1/5 Accuracy accuracy N/M N/M 83.2% (64 electrodes) 1 Electrode: 57.3% 3 Electrodes: 71.4% 10 Electrodes: 81.7% [Chance: 33%] CNN (3 conv layers of 64 filters) DL No No No "It’s hard to compare these results with the previous work where many different experimental conditions (e.g. different type and number of stimuli) and preprocessing has been used. However, we show here that excellent classification results can be obtained with minimal preprocessing of the EEGs." N/M No N/A No Yannick Roy Hubert Banville Yes Moinnereau2018
16 Deep learning for detection of epileptiform discharges from scalp EEG recordings 2018 van Putten, de Carvalho, Tjepkema-Cloostermans Clinical Neurophysiology No Journal University of Twente Netherlands 6 Classification of EEG signals Clinical Epilepsy Detection Improve SOTA Use CNN and/or LSTM to classify yes/no discharges Pre-Recorded EEG, no mention of any task. (Supposed: Resting State and sleep) DL is promising novel approach and is able to learn from large data-sets N/M Raw EEG TBD Private Training (41,381 epochs), Test (8775 epochs) For validation we used 7 EEGs (47,122 epochs) with 538 focal epileptiform discharges and 12 normal EEGs (n = 11,782 epochs). (2s windows, no overlap) 97278 3242.6 N/M 19 125 Offline 1) Band-Pass Filter: 0.5-35Hz 2) Re-referenced to both a longitudinal bipolar montage and a source Laplacian Yes No No Raw EEG Raw EEG N/M Keras CNN RNN CNN+RNN Multiple designs N/M Yes 19 (channels) x 250 (2s) CNN: 4-9 Layers LSTM: 50-100 Units Both w/ 1-3 FC Layers 12 N/M Dropout (20-50%) Yes 2 Normal IED (discharge) 1 (prob [0,1] of discharge) 9142859 Standard Standard Adam Adam LR:3e-3 Beta1: 0.91 Beta2: 0.999 Epsilon: 1e-8 N/M N/M N/M N/M Categorical Cross-Entropy Inter N/M No Train: 41,381 Valid: 58,904 Test: 8,775 ROC AUC Sensitivity Specificity ROC AUC, sensitivity, specificity NVidia GTX 1080 2h AUC: 0.94 Sensitivity: 0.73 Specificity: 1 None None No No No "We foresee that deep nets may outperform humans both in classification accuracy and speed, leading to a fundamental shift in clinical EEG analysis in the next decade." N/M No N/A No Yannick Roy Isabela Albuquerque TBC VanPutten2018a
17 Cognitive Analysis of Working Memory Load from EEG, by a Deep Recurrent Neural Network 2018 Kuanar, Athitsos, Pradhan, Mishra & Rao ICASSP No Conference University of Texas, Arlington USA 5 Classification of EEG signals Monitoring Cognitive Mental workload Improve State-of-the-Art: Using RNN to measure levels of cognitive load. Extract features less sensitive to variations along each spatial dimension Working memory / workload experiment. (showing a set of letters and then showing a letter asking if the letter was part of the set) Sets of 4,6,8,10 letters corresponding to mental workload 1,2,3,4. ConvNets have demonstrated the ability to extract features that are invariant to changes in input patterns Neurofax EEG-1200 (Nihon Kohden) PSD NIMHANS Private 6490 samples, from 22 subjects Each trial of 4.5s sliced into 0.5s and an image was constructed over each time slice. 58410 486.75 22 64 256 1) From 4.5s Windows to 9 Windows of 0.5s Yes No No 192 Features: 64 chan x 3 bands Theta (4-7Hz), Alpha (8-13Hz), Beta (13-30Hz) (FFT) Converted into images. (32x32) 3D electrodes spatial to 2D. Frequency-domain N/M Theano, Python CNN + BiLSTM CNN+RNN Transforming channels and frequency bands into images of 0.5s windows, fed to an Hybrid CNN+BiLSTM N/M Yes EEG Images 32x32 (0.5s windows) (mixing 3 freq bands + 64 channels) 9 Conv Layers + 1 FC + 2 LSTM Layers of 64 units + 1 FC 13 Sigmoid Dropout (0.5) + L2 (0.0001) Yes 4 Classes (Softmax) 1.66 Mil Standard Standard Adam Adam LR: 10^-4 Beta1: 0.9 Beta2: 0.99 30 N/M N/M Gaussian noise (on image) Cross-Entropy Inter Leave-One-Subject-Out Leave-One-Subject-Out N/M Accuracy accuracy NVidia K40 18h 92.50% SVM, Logistic Regression, Random Forest Traditional pipeline No No No "Our implementation was different from the previous attempts and learned the robust representations from EEG image sequences using a ConvNet and BiLSTM hybrid network. Our proposed hybrid network demonstrated the significant improvements in finding better classification accuracy i.e. up to 92.5% over various existing LSTM models." N/M Yes Website No Yannick Roy Isabela Albuquerque TBC Kuanar2018
18 A Deep Learning Approach with an Attention Mechanism for Automatic Sleep Stage Classification 2018 Längkvist & Loutfi Arxiv Yes Preprint Orebro University Sweden 18 Classification of EEG signals Clinical Sleep Staging New approach: "Explore the advantages of using a model with selective attention applied to automatic sleep staging" Learn representations of sleep EEG with attention Sleep (PSG) Learn better features N/M Sleep UCDDB Public 25 x 6-9 hours (est. at 7.5h) (30s windows, no overlap) 22500 11250 25 1 128 Offline 1) Notch Filter: 50Hz 2) Down-Sampled to 64Hz 3) Band-Pass Filter: 0.3-32Hz Yes N/M N/M 28 Features: Relative Power: Delta (0.5 − 4Hz), Theta (4−8Hz), Alpha (8−13Hz), Beta (13−20Hz), & Gamma (20−32Hz), Entropy, Kurtosis, and Spectral Mean of all signals and fractal component of EEG. [+ EOG & EMG features] Frequency-domain z-score N/M SAE AE Attention Mechanism (static & adaptive approaches) N/M No 28 1 1 Sigmoid L2 normalization KL term in cost function for sparsity Yes 5 SWS, S2, S1, REM, Awake 5 Classes (Softmax) N/M 1) Training AE 2) Training softmax layer on learned features Pre-training SGD with momentum SGD Momentum: 0.9 LR decay: 0.01 30 Random grid search Yes N/M MSE Inter 5-Fold CV k-fold Train: 60% Valid: 20% Test: 20% Accuracy accuracy N/M 2-3h 60-90% on each of the 5 classes. DBN SAE (standard a) SAE (fixed a) SAE (adaptive a) DL No Visualization of attention mechanism weights Analysis of weights "[...] Many of the used features try to capture the most relevant information for the current sleep stage and therefore mimic the standard Rechtschaffen and Kales (R&K) system [38, 18, 17] that is manually used by sleep technicians." Unsupervised learning treats all features equally; that's why attention mechanism is useful No N/A No Yannick Roy Hubert Banville TBC Langkvist2018
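A minimal sketch of the Langkvist2018 setup as summarized above: a sparse autoencoder over the 28 hand-crafted features (sigmoid units, L2 weight penalty plus a KL sparsity term in the cost, MSE reconstruction, momentum SGD), followed by a 5-way softmax over sleep stages trained on the learned code. The hidden-layer size, learning rate, and penalty weights are assumptions; names are illustrative.

```python
# Hedged reconstruction of a sparse AE + softmax read-out; sizes/weights are assumed.
import tensorflow as tf
from tensorflow.keras import layers, models, regularizers

class KLSparsity(regularizers.Regularizer):
    """KL(rho || mean activation) sparsity penalty for a sigmoid layer."""
    def __init__(self, rho=0.05, beta=3.0):
        self.rho, self.beta = rho, beta
    def __call__(self, activations):
        rho_hat = tf.reduce_mean(activations, axis=0)   # mean activation per unit
        rho = self.rho
        kl = rho * tf.math.log(rho / (rho_hat + 1e-10)) + \
             (1 - rho) * tf.math.log((1 - rho) / (1 - rho_hat + 1e-10))
        return self.beta * tf.reduce_sum(kl)

inp = layers.Input(shape=(28,))                          # 28 features, as reported
code = layers.Dense(20, activation="sigmoid",            # hidden size assumed
                    kernel_regularizer=regularizers.l2(1e-4),
                    activity_regularizer=KLSparsity())(inp)
recon = layers.Dense(28, activation="sigmoid")(code)
ae = models.Model(inp, recon)
ae.compile(optimizer=tf.keras.optimizers.SGD(0.1, momentum=0.9),  # LR assumed
           loss="mse")

# Stage 2: freeze the encoder and train a 5-way softmax (sleep stages) on the code.
encoder = models.Model(inp, code)
encoder.trainable = False
clf = models.Sequential([encoder, layers.Dense(5, activation="softmax")])
clf.compile(optimizer="sgd", loss="categorical_crossentropy", metrics=["accuracy"])
```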
19 On the Classification of SSVEP-Based Dry-EEG Signals via Convolutional Neural Networks 2018 Aznan, Bonner, Connolly, Moubayed & Breckon Arxiv Yes Preprint Durham University UK 6 Classification of EEG signals BCI Reactive SSVEP Improve State-of-the-Art Apply CNN to SSVEP with dry EEG headset SSVEP Want an end-to-end system (no need to extract features) Quick-20 (Cognionics) SSVEP Internal Recordings Private 4 subjects, 4 classes 640 trials total (160 per class) x 3s 640 32 4 20 500 Offline None No No No None Raw EEG N/A Pytorch CNN CNN - Layer 1: temporal filter Yes N/M 2 (Tried 7 in the end) 2 ReLU L2 normalization Dropout (50%) Yes 4 N/M Standard optimization Standard Adam Adam N/M 32 Grid search Yes No Categorical cross-entropy Both 10-Fold CV and Leave-One-Subject-Out k-fold; Leave-One-Subject-Out N/M Accuracy accuracy Nvidia GTX 1060 4 min Subject 1 - all data: 96% Subject 1-3 (individually, only 20 trials each): mean of 89% Across-subjects: 78% Leave-one-subject-out: 59% Traditional feature-based pipeline (Riemannian Geometry + classifier) RNN, LSTM, GRU DL & Trad. No No No Repeating the convolutional layer block increased accuracy on the held-out subject. N/M No N/A Yes Hubert Banville Isabela Albuquerque TBC Aznan2018
20 A Long Short-Term Memory deep learning network for the prediction of epileptic seizures using EEG signals 2018 Tsiouris, Pezoulas, Zervakis, Konitsiotis, Koutsouris & Fotiadis Computers in Biology and Medicine No Journal National Technical University of Athens Greece 14 Classification of EEG signals Clinical Epilepsy Prediction New Approach: Using LSTM for seizure detection. (claiming they are the first ones but they are not, so it's an Improve SOTA) Apply LSTM for seizure detection on CHB-MIT Resting State, Eyes Open, Eyes Closed, Seizures. Expand from CNN to LSTM. (they claimed to be first, but they are not...) N/M Seizures CHB-MIT Public 983h, 185 seizures (5s windows, no overlap) 707760 58980 23 23 256 1) Selecting only channels that are stable across recordings (for cross-validation) 2) Kept 18 channels. Yes No No Time Domain: the 4 Statistical Moments, Standard Dev, Zero Crossings, Peak-to-peak Voltage, Total signal area, decorrelation time. Frequency Domain: FFT (PSD), DWT. Cross-Correlation: Max absolute coefficient. Graph Theory: Local & Global measures. (all on 5s windows) Combination N/M Keras Tensorflow Python 3.6 LSTM RNN - LSTM Length: predicting seizures from 15 min before onset to 120 min before onset Yes Features x EEG Segments 643x[5-50] LSTM_1: 1 (32 HU) LSTM_2: 1 (128 HU) LSTM_3: 2 (128/128 HU) + 1 FC (30) 3 ReLU Dropout (finally discarded, because the shuffling of data seems to be enough) Yes 2 (Softmax, 1 hot encoded: preictal or interictal) N/M By shuffling the EEG segments that are used as input, the LSTM network is forced to learn more generic preictal patterns as each sequence consists of random, non-adjacent preictal segments that not only come from various locations with different time distances from the actual seizure onset, but also from the preictal activity of different seizures. Standard Adam Adam LR: 0.001 B1: 0.9 B2: 0.999 Decay: 0 10 Manually trying 3 different configurations Yes Split the minority class into smaller subgroups to balance classes Cross-Entropy Both 10-Fold CV k-fold Eval: 3/24 Train: N/M (assuming 21/24) Sensitivity (SEN) Specificity (SPEC) False Prediction Rate (FPR) Preictal Window sensitivity, specificity, false prediction rate N/M (they seem to say CPU) N/M [Segments] SEN, SPEC | [Events] SEN, FPR 15-min Preictal Window: 99.28, 99.28 | 100, 0.107 30-min Preictal Window: 99.37, 99.60 | 100, 0.063 60-min Preictal Window: 99.63, 99.78 | 100, 0.032 120-min Preictal Window: 99.84, 99.86 | 100, 0.02 SVM Decision Trees Repeated Incremental Pruning to Produce Error Reduction (RIPPER) (LSTM outperforms all of them on all subjects) Traditional pipeline No No No In theory, better EEG signal representation could be learned if the size of LSTM network was substantially increased, by adding more layers and memory units, to compensate for the increased input size of directly providing the EEG signals. However, the computational cost of training larger LSTM networks increases rapidly requiring more training time or using arrays of GPUs. Even if computational cost was not a problem, this approach would require even more EEG data to effectively train the millions of network parameters. 1) Overall amount of Data 2) Number of Seizures No N/A Yes Yannick Roy Isabela Albuquerque TBC Tsiouris2018
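A minimal sketch of the largest Tsiouris2018 variant above (LSTM_3: two stacked 128-unit LSTMs plus a 30-unit FC layer and a 2-way softmax over preictal/interictal), using the reported Adam settings (lr=0.001, b1=0.9, b2=0.999) and minibatches of 10. Each input step is one 5-s segment described by 643 features; the sequence length (5-50 in the row) is fixed at 20 here as an assumption, and the FC activation is assumed ReLU.

```python
# Hedged reconstruction of LSTM_3; sequence length and FC activation are assumed.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_lstm3(seq_len=20, n_features=643):
    model = models.Sequential([
        layers.Input(shape=(seq_len, n_features)),  # sequence of 5-s feature vectors
        layers.LSTM(128, return_sequences=True),    # 2 stacked 128-unit LSTMs, as reported
        layers.LSTM(128),
        layers.Dense(30, activation="relu"),        # FC(30), activation assumed
        layers.Dense(2, activation="softmax"),      # preictal vs. interictal
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=0.001,
                                           beta_1=0.9, beta_2=0.999),
        loss="categorical_crossentropy", metrics=["accuracy"])
    return model
```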
21 Joint Classification and Prediction CNN Framework for Automatic Sleep Stage Classification 2018 Phan, Andreotti, Cooray, Chén & De Vos Arxiv Yes Preprint University of Oxford UK 11 Classification of EEG signals Clinical Sleep Staging Improve State-of-the-Art (~New task: predict neighboring classes too) Use one-to-many approach with a multi-task softmax to leverage neighboring data to predict sleep stage Sleep Re-using their previous network. (H. Phan et al., 2018) N/M Sleep events MASS Public 228,870 epochs x 30s from 200 subjects 228870 114435 200 1 100 Offline 1) Convert 20s epochs in 30s epochs (+5s before + 5s after) Yes N/M N/M Spectrogram (STFT) Hamming window 2s + 50% overlap Log Spectrum Frequency-domain N/M* Tensorflow CNN CNN Conv-Pool-Softmax Layer 0: filter bank on spectrogram Yes 129 x 29 x {1, 2, 3} Bins x time, x channels 30-s epochs 1 (1xCNN +Pooling +Softmax) 1 ReLU L2 Dropout (20%) Yes 5 Wake N1, N2, N3 REM 5 x (1 + 2 * nb of neighbouring windows) N/M* Standard optimization Standard Adam Adam LR: 0.0001 200 N/M N/M Randomly selected batch with balanced classes Categorical cross-entropy Inter Leave-10-Subjects-Out (20-Fold CV) Leave-N-Subjects-Out Train: 180 subjects Valid: 10 subjects Test: 10 subjects Accuracy, Kappa, Specificity, Sensitivity, F1-score accuracy, Cohen's kappa, specificity, sensitivity, f1-score N/M* 1.36 hours Multimodal acc.: 83.6 % One-to-one and Many-to-one with same architecture and with a different ConvNet architecture without 1-max pooling DeepCNN DeepSleepNet DL No No No Increasing the number of filters in the Conv layer doesn't impact the performance much Adding other modalities (EOG, EMG) leads to significant improvements A context size larger than 3 leads to performance degradation Using recurrent layers might help No No N/A No Hubert Banville Yannick Roy Yes Phan2018
22 Deep Convolution Neural Network and Autoencoders-Based Unsupervised Feature Learning of EEG Signals 2018 Wen & Zhang IEEE Access No Journal Xiamen University China 12 Improvement of processing tools Feature learning Improve SOTA Learn features for epilepsy detection using unsupervised learning Resting State, Eyes Open, Eyes Closed, Seizures. Learn features automatically N/M Raw EEG Bonn University; CHB-MIT Public DS #1 - Bonn University (A,B,C,D,E) 5x 100 epochs of 23.6s DS #2 - CHB-MIT (first 10 subjects) 200 + 200 examples of 4096 (@ 256Hz = 16s) 500; 400 197; 106.6 10; 10 1 173.61; 256 Offline 1) Common average reference 2) Bandpass 0.53-40 Hz Yes N/M N/M 1) Chose single channel with the most variance Raw EEG min-max Scikit-learn Python Convolutional AE AE Various (tried multiple classifiers on top of the encoder) - Yes 4096 x 1 9 [YR: not sure how HJB got that 9] 9 ReLU N/M N/M 2 Seizure No Seizure (not explicit) 4096 x 1 N/M 1) Training AE 2) Training standard classifier on learned features Pre-training Adam Adam N/M N/M N/M N/M N/M Mean absolute error divided by input mean amplitude Both 5- and 10-Fold CV k-fold N/M Accuracy accuracy N/M N/M No aggregate is reported... (see paper, they report results per subject and per classifier) PCA Random projection Traditional pipeline No No No Less than 4 hidden units on the bottleneck layer led to a drop in accuracy as compared to standard dimensionality reduction techniques. Their approach is flexible to new datasets... "It is very difficult to train multiple hidden layers [...]" No N/A No Hubert Banville Yannick Roy TBC Wen2018
23 Deep learning with convolutional neural networks for decoding and visualization of EEG pathology 2018 Schirrmeister, Gemein, Eggensperger, Hutter & Ball Arxiv Yes Preprint University of Freiburg Germany 7 Classification of EEG signals Clinical Pathological EEG Improve SOTA Feature visualization/interpretability End-to-end detection of abnormal EEG N/M Automated EEG diagnosis N/M Raw EEG TUH Abnormal EEG Corpus Public TUH Abnormal Corpus 2740 + 277 = 3017 (x 16min) (they explored using [1, 16] min) (6s windows) 482720 48272 2132 21 250 1) Select 21 electrodes common to all subjects 2) Remove 1st minute 3) Crop to recording to up to 20 minutes 4) Clip amplitude to +-800uV 5) Resample to 100Hz Yes N/M N/M Raw EEG Raw EEG N/M Pytorch CNN CNN Tried two architectures: shallow and deep CNNs Shallow CNN tailored to decode band powers Yes 600 x 21 Deep: 5 conv layers Shallow: 1 conv layer 5 ELU N/M N/M 2 N/M Standard optimization Standard Adam Adam Used SMAC N/M SMAC Yes N/M Binary cross-entropy Inter 10-Fold CV k-fold Train: 5480 (~90%) Test: 554 (~10%) Accuracy, Sensitivity, and Specificity accuracy, sensitivity, specificity N/M < 3.5 h Accuracy: 85.4% (deep), 84.5% (shallow) Sensitivity: 75.1% (deep), 77.3% (shallow) Specificity: 94.1% (deep), 90.5% (shallow) CNN and linear model with band-power features as input DL & Trad. Wilcoxon signed-rank test Effect of spectral perturbations of the input on the resulting prediction Input-perturbation network-prediction correlation maps Perturbation visualizations showed that the CNNs used information related to changes in delta and theta bands. Surprisingly, shorter length EEG recordings yielded better accuracies. "Still, to yield more clinically useful insights and diagnosis explanations, further improvements in ConvNet visualizations are needed." Yes GitHub No Isabela Albuquerque Hubert Banville TBC Schirrmeister2017a
24 Predicting sex from brain rhythms with deep learning 2018 van Putten, Olbrich & Arns Scientific Reports (Nature) No Journal University of Twente Netherlands 7 Classification of EEG signals Personal trait/attribute Sex New Approach: Detecting Sex from RS EEG with DL (CNN) Predicting an individual's sex from their EEG Resting State EEG. No need for engineered features, and "have potential to detect subtle differences in otherwise similar patterns". N/M Raw EEG Brain Resource Int'l Database Public 1308 subjects x 40 segments x 2s (2s windows, no overlap) 52320 1744 1308 24 128 1) Downsampled to 128Hz (from 500Hz) 2) Band-Pass Filter: 0.5-25Hz Yes EOG regression Yes Raw EEG Raw EEG N/M Windows 10 Keras, Tensorflow Python 3.6 CNN CNN None N/M Yes 256 x 24 (Samples x Channels) 2s epoch 6 6 ReLU Dropout Yes 1 0: Female | 1: Male (2 from schema) 9,051,902 Standard optimization Standard Adamax Other LR=0.002, B1=0.9, B2=0.999, e=10^-8, decay=0.00 70 N/M N/M N/M Categorical Cross-Entropy Inter No No Train: 1000 subjects Test: 308 subjects Accuracy accuracy NVidia GTX-1060 N/M 81% (of correct classification over all subjects) LR Traditional pipeline Permutation test Visualization of learned filters through Deep Dream-like backprop on inputs Generating input to maximize activation While not all details of the features used for classification by the deep net have been revealed, our data show that differences in brain rhythms between sexes are mainly in the beta frequency range. N/M No N/A No Yannick Roy Hubert Banville TBC VanPutten2018b
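A minimal sketch consistent with the VanPutten2018b row above: 2-s epochs of 256 samples x 24 channels, six weight layers, dropout, a single output (0: female, 1: male), and Adamax at the reported settings (lr=0.002, b1=0.9, b2=0.999, minibatch 70). The exact conv/FC split, filter counts, and kernel sizes are assumptions, and the reported cross-entropy is written here as its binary form for the single-unit output.

```python
# Hedged reconstruction of the sex-from-EEG CNN; layer widths/kernels are assumed.
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(256, 24)),                     # 2-s epoch x 24 channels
    layers.Conv1D(32, 7, padding="same", activation="relu"),
    layers.MaxPooling1D(2),
    layers.Conv1D(64, 7, padding="same", activation="relu"),
    layers.MaxPooling1D(2),
    layers.Conv1D(128, 7, padding="same", activation="relu"),
    layers.GlobalAveragePooling1D(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),                               # dropout rate assumed
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),             # 0: female, 1: male
])
model.compile(
    optimizer=tf.keras.optimizers.Adamax(learning_rate=0.002,
                                         beta_1=0.9, beta_2=0.999),
    loss="binary_crossentropy", metrics=["accuracy"])
```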
25 Deep learning with EEG spectrograms in rapid eye movement behavior disorder 2018 Ruffini, Ibanez, Castellano, Dubreuil, Gagnon, Montplaisir & Soria-Frisch BioarXiv Yes Preprint NeuroElectrics University of Montreal Canada 10 Classification of EEG signals Clinical Sleep Abnormality detection New Approach Using DCNN for Rapid Eye Movement Behavior Disorder Resting State EEG. Exploiting compositional structure in data N/M Raw EEG Internal Recordings Private (118 + 74) = 192 subjects 148 windows of 1s per subject (1s windows) 28416 473.6 192 14 256 Offline 1) Band-Pass Filter: 0.3 and 100 Hz [Hardware] 2) Notch Filter: 60Hz [Hardware] ((FFT) after detrending blocks of 1 second with a Hann window (FFT resolution is 2 Hz)) Yes N/M N/M Spectrogram Frames Frequency-domain z-score Tensorflow DCNN CNN Conv-Pooling-Dropout N/M Yes 14 x 21 x 20 Channels x FFTBins x Epochs 5 5 ReLU Dropout Yes 2 Parkinson's disease Healthy 2 N/M* Standard optimization Standard N/M N/M N/M N/M N/M N/M Random replication of subjects in the minority class Cross-Entropy Inter Leave-Pair-Out (one subject for each class) Leave-One-Subject-Out N/M Accuracy ROC AUC accuracy, ROC AUC N/M N/M* Net: Problem [ N ] ACC (AUC) DCNN: HC vs PD [2x73 / 2x1] 79% (87%) RNN: HC vs PD [2x73 / 2x1] 81% (87%) DCNN: HC+RBD vs PD+DLB [2x159 / 2x1] 73% (78%) RNN: HC+RBD vs PD+DLB [2x159 / 2x1] 72% (77%) Stacked RNN Shallow CNN DL No Maximizing network outputs for a given class Generating input to maximize activation Although here, as in [28], we worked with time-frequency pre-processed data, the field will undoubtedly steer towards working with raw data in the future when larger datasets become available, as suggested in [21] "We note that one of the potential issues with our dataset is the presence of healthy controls without follow up, which may be a confound. We hope to remedy this by enlarging our database and by improving our diagnosis and follow up methodologies" No N/A Yes Yannick Roy Isabela Albuquerque Yes Ruffini2018a
26 Deep transfer learning for error decoding from non-invasive EEG 2018 Völker, Schirrmeister, Fiederer, Burgard & Ball IEEE International Conference on Brain-Computer Interface Yes Conference University of Freiburg Germany 6 Classification of EEG signals BCI Reactive ERP New approach: Exploring Transfer Learning for BCI. Using CNN on 2 different BCI tasks, can it generalize? Transfer Learning across subjects and across tasks 1) Eriksen Flanker Task 2) Online GUI to control intelligent robots Enables transfer learning N/M 1) Error 2) Mental tasks (MI) Internal Recordings Private 1) 1000 trials x 1.5s x 31 subjects 2) (3032 +/- 818) x 4 x 1.5s 1.5s / epoch (onset at 0.5s) 31000; 12128 775; 303.2 31; 4 128; 64 N/M 1) Re-referenced to Common Average (CAR) 2) Resampled to 250Hz Yes N/M N/M Raw EEG Raw EEG Electrode-wise exponential running standardization Python BrainDecode Scikit-learn CNN CNN N/M N/M (see BrainDecode paper) No N/M (Raw EEG windows) N/M (see BrainDecode paper) N/M N/M N/M (see BrainDecode paper) N/M N/M N/M N/M (See braindecode paper) N/M N/M (see BrainDecode paper) N/M N/M N/M N/M N/M N/M N/M (see BrainDecode paper) Both Within-Sub: Leave-One-Session-Out CV Between-Sub: Leave-One-Subject-Out CV Leave-One-Session-Out; Leave-One-Subject-Out Within-Sub Train: 80% Within-Sub Test: 20% Between-Sub Train: N-1 Sub. Between-Sub Test: 1 Sub. Normalized Accuracy normalized accuracy N/M N/M Between-Subject Transfer Learning Flanker Task: 81.7% Normalized Accuracy GUI Robots Task: Poor results, because only 4 subjects. Between-Paradigms Transfer Learning Both failed. ~50% rLDA (CNN outperforms rLDA) Also, best result ever reported on the Error Detection on Flanker Task* Traditional pipeline Paired t-tests Input-perturbation network-prediction correlation maps Input-perturbation network-prediction correlation maps (1) As a next step, techniques including data augmentation and automated hyper-parameter and architecture search might help to improve the generalization of deep ConvNets. (2) For a generalization to new subjects, our data suggest that a training subject group of at least 15 subjects might be necessary for reliable error decoding on unknown subjects. (3) In the flanker task, our deep ConvNets achieved the highest to date reported average accuracy. N/M No N/A Yes Yannick Roy Hubert Banville TBC Volker2018
27 DeepIED: An epileptic discharge detector for EEG-fMRI based on deep learning 2018 Hao, Khoo, von Ellenrieder, Zazubovits & Gotman NeuroImage: Clinical No Journal McGill University, Osaka University Canada 14 Classification of EEG signals Clinical Epilepsy Detection Improve SOTA Detect interictal epileptic discharges in noisy EEG data collected during an fMRI recording Resting state EEG - Seizures. Reduce the amount of time it takes to manually label interictal epileptic discharges BrainAmp (BrainProducts) Raw EEG Internal Recordings Private 67 patients (148 studies) Average study time: 50 min (range, 18–72 min) (~1s windows) 201000 7400 67 25 200 Offline 1) Bandpass 0.5-50 Hz 2) fMRI-induced artefact removal 3) Electrode-wise exponential running standardization [6] was applied with a decay factor of 0.999 4) BCG artifact removal (ballistocardiographic) Yes N/M N/M Raw EEG Raw EEG N/M N/M CNN (ResNet) CNN - - Yes 25 x [16 to 256] 31 31 ReLU Dropout on penultimate layer (50%) Yes N/M (different IED types) IEDs 128 (FC) going to softmax and triplet (real output N/M) 999,920 Standard optimization Standard N/M N/M N/M N/M N/M N/M N/M Softmax for multi-class classification Triplet loss function Inter No No Train: 30 subjects Test: 37 subjects ROC curves Sensitivity False positive rate ROC, sensitivity, false positive rate N/M N/M Median sensitivity: 84.2% False positive rate: 5 events/min Cross-correlation (template-based) method for finding similar EEG epochs Traditional pipeline One-way ANOVA + post-hoc paired t-test No No In their tests, they asked experts to edit the outputs of the net and reject false positives; they argue that it's a necessary step and that it is not too time-consuming. - No N/A No Hubert Banville Yannick Roy TBC Hao2018
28 Deep learning for hybrid EEG-fNIRS brain–computer interface: application to motor imagery classification 2018 Chiarelli, Croce, Merla & Zappasodi Journal of Neural Engineering No Journal G. d'Annunzio University Italy 12 Classification of EEG signals BCI Active Motor imagery Improve SOTA Improving MI classification with DL in multimodal system Motor Imagery High performance on other tasks (EGI) ERD/ERS Internal Recordings Private 40 trials (C1: 20 / C2: 20) of 5s 200 samples x 15 subjects (1s windows) 3000 50 15 123 250 Offline 1) Bandpass 8-30 Hz Yes N/M N/M Power in the mu-beta range, averaged across 1-s Frequency-domain N/M TensorFlow Fully-connected NN FC N/M N/M Yes 123 x 1, 16 x1, or 139 x 1 5 5 ReLU Dropout (0.75) Yes 2 Right-hand MI, Left-hand MI 2 N/M Standard optimization Standard Adam Adam LR=1e-4, B1=0.9, B2=0.999, constant=1e-8 90 N/M N/M N/M Cross-Entropy Intra 10-Fold CV (1000x) k-fold Train: 180 Test: 20 Accuracy accuracy N/M N/M EEG only: ~70%, NIRS only: ~77%, EEG+NIRS: ~83% LDA, linear SVM Traditional pipeline 2-way repeated-measures ANOVA + post-hoc analysis No No DNN worked better than CNN, RNN not tested. RNN was not tested No N/A No Hubert Banville TBR TBC Chiarelli2018
29 Preference Classification Using Electroencephalography (EEG) and Deep Learning 2018 Teo, Hou & Mountstephens Journal of Telecommunication, Electronic and Computer Engineering (JTEC) No Journal University Malaysia Sabah Malaysia 5 Classification of EEG signals Monitoring Affective Emotion Improve SOTA Improving classification of preference (like vs. dislike), and overcoming intra- and inter-subject variability Rating of 3D Stimulus (1: like very much, 2: like, 3: undecided, 4: do not like, 5: do not like at all) N/M B-Alert X10 (ABM) Raw EEG Internal Recordings Private 208 trials: 9s + [5-15]s, from 16 subjects (full trial as windows) 10 other subjects were for kNN (not counted) 208 65.87 16 9 N/M Offline 1) Notch Filter: 50Hz Yes Proprietary artefact rejection and interpolation Yes 45 features (PSD for each channel) D (1-3Hz), T (4-6Hz), A (7-12Hz), B (13-30Hz), G (31-64Hz) Frequency-domain N/M R DNN FC N/M N/M No 47 2 2 ReLU N/M N/M 2 Like very much Do not like at all N/M N/M Standard optimization Standard Adadelta Other N/M N/M N/M N/M N/M Cross-Entropy Inter 10-Fold CV k-fold N/M Accuracy accuracy N/M N/M 63.99% SVM Linear: 60.19%, SVM Radial: 59.67%, OneR: 59.00%, Adaboost: 58.65%, Random Forest: 57.74%, NNet: 57.71%, JRip: 57.21%, Naive Bayes: 56.79%, C5.0: 56.74%, kNN (k = 5): 56.29% Traditional pipeline No No No "An initial study using kNN provided sufficiently good results in a 10-subject study. However, when expanded to a larger cohort size of 16 subjects, the results were not encouraging. However, the use of deep learning was able to observably overcome some of the difficulties presented by inter-subject variability posed by larger cohort sizes in EEG-based preference classification." Intersubject variability No N/A No Yannick Roy Hubert Banville Yes Teo2018
30 An Automated System for Epilepsy Detection using EEG Brain Signals based on Deep Learning Approach 2018 Ullah, Hussain, Qazi & Aboalsamh Arxiv Yes Preprint National University of Ireland King Saud University Ireland 18 Classification of EEG signals Clinical Epilepsy Detection Improve SOTA Improving ternary classification of ictal vs. normal vs. interictal windows Resting State, Eyes Open, Eyes Closed, Seizures. Automatic feature learning N/M Raw EEG Bonn University Public Bonn University (5 sets x 100 x 23.6s) Each 100 samples --> 800 windows (512 points windows, 6.25% overlap) 4000 197 15 1 173.6 Offline N/M N/M N/M N/M Raw EEG Raw EEG z-score TensorFlow Pyramidal 1D-CNN (P-1D-CNN) CNN No pooling 1D convolution motivated by EEG being a "1D signal" Yes 8 EEG windows Raw EEG (1 channel) 3 Conv + 2 FC 5 ReLU Dropout (0.5) Batch norm Yes 2 or 3 2: Epileptic vs. non-epileptic 3: normal vs. ictal vs. interictal 2 or 3 Classes (Softmax) N/M Standard optimization Standard Adam Adam LR=0.001, B1=0.9, B2=0.999, epsilon=0.00000001, locking=false N/M N/M N/M Overlapping windows (87.5% and 25% overlap) Cross-Entropy Inter 10-Fold CV k-fold Train: 90% Test: 10% Accuracy, Specificity, Sensitivity, Precision, f-measure, and g-mean. accuracy, specificity, sensitivity, precision, f-measure, g-mean N/M N/M 99.1 ± 0.9% (for 3 classes problem) The mean accuracy of the proposed system is 99.6% for all the sixteen cases. Many results (see paper) comparing Binary / Ternary classifications. Random forests, Naive Bayes, kNN Traditional pipeline No No No "According to our knowledge until this date, DL approach has never been used for this problem. The mean accuracy of the proposed system is 99.6% for all the sixteen cases (shown in Table 8 last column), which figures out the generalization power of the proposed system." Small datasets No N/A Yes Yannick Roy Hubert Banville TBC Ullah2018
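A minimal sketch of the Ullah2018 P-1D-CNN as summarized above: three strided 1-D convolutions (no pooling) with batch norm forming a shrinking "pyramid", then two FC layers with 50% dropout and a 2- or 3-way softmax, trained with the reported Adam settings on 512-point single-channel windows. Filter counts, kernel size, strides, and the FC width are assumptions.

```python
# Hedged reconstruction of the P-1D-CNN; widths, kernel size, and strides are assumed.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_p1dcnn(window=512, n_classes=3):
    model = models.Sequential([layers.Input(shape=(window, 1))])
    for filters in (24, 16, 8):       # decreasing "pyramid" of 3 conv layers (assumed widths)
        model.add(layers.Conv1D(filters, kernel_size=5, strides=3, activation="relu"))
        model.add(layers.BatchNormalization())
    model.add(layers.Flatten())       # no pooling anywhere, as reported
    model.add(layers.Dense(50, activation="relu"))   # FC width assumed
    model.add(layers.Dropout(0.5))                   # 50% dropout, as reported
    model.add(layers.Dense(n_classes, activation="softmax"))  # 2 or 3 classes
    model.compile(optimizer=tf.keras.optimizers.Adam(0.001, beta_1=0.9, beta_2=0.999),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model
```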
31 A Novel Channel-aware Attention Framework for Multi-channel EEG Seizure Detection via Multi-view Deep Learning 2018 Yuan, Xun, Ma, Suo, Xue, Jia & Zhang IEEE EMBS International Conference on Biomedical & Health Informatics (BHI) No Conference Beijing Laboratory of Advanced Information Network State University of New York at Buffalo China 4 Classification of EEG signals Clinical Epilepsy Detection Improve SOTA Use end-to-end model with attention mechanism to select channel and detect seizures. N/M "Explore inherent EEG representations" N/M Raw EEG CHB-MIT Public CHB-MIT: 9 out of 23 subjects 4302 EEG fragments (windows length = ??) 4302 N/M 9 23 256 Offline N/M N/M N/M N/M Spectrogram (STFT) Frequency-domain N/M N/M 2 SAEs AE Channel Encoders (SAE) Global Encoder (SAE) + Attention N/M Yes N/M 2 2 N/M Dropout Yes 2 No seizure Seizure 2 N/M [Not clear!] Unsupervised pretraining, followed by fine-tuning with softmax layer? Standard Adam Adam N/M N/M N/M N/M N/M Cross-Entropy Inter Holdout Holdout N/M F1-Score Accuracy AUC of ROC and precision-recall curves f1-score, accuracy, ROC AUC, PR AUC N/M N/M [F1-score] - Channel Att_loc: 0.9781, Channel Att_glo: 0.9785 [Accuracy] - Channel Att_loc: 0.9651, Channel Att_glo: 0.9661 PCA+SVM (PSVM) SAEs + attention DNN + hard channel selection DL & Trad. N/M Analysis of mean attention score values for a single subject Analysis of activations "To the best of our knowledge, this is the first work using attention mechanism for biosignal channel selection in healthcare." No N/A No Yannick Roy Hubert Banville TBC Yuan2018a
32 Compact Convolutional Neural Networks for Classification of Asynchronous Steady-state Visual Evoked Potentials 2018 Waytowich, Lawhern, Garcia, Cummings, Faller, Sajda, Vettel Journal of Neural Engineering Yes Journal U.S. Army Research Laboratory Lab for Intelligent Imaging and Neural Comp. University of Pennsylvania University of California, Santa Barbara USA 21 Classification of EEG signals BCI Reactive SSVEP Improve SOTA Use ConvNet for SSVEP classification SSVEP (12 classes!) Automatic feature learning without domain-specific information ActiveTwo (BioSemi) SSVEP Internal Recordings Public 10 subjects x 15 block x 12 trials x 4s (1s windows) 7200 120 10 8 2048 Offline 1) Bandpass 9-30 Hz 2) Downsampled to 256 Hz Yes N/M N/M Raw EEG Raw EEG TensorFlow, Keras Original Stimuli (from 2015) on MATLAB with Psychophysics Toolbox EEGNet CNN Filter banks (temporal convolutions) followed by spatial filters N/M Yes 8 channels x 256 samples 3 3 ELU Batch norm Dropout (0.25) Yes 12 12 different combinations of frequency and phase 12 46,476 (45,900 trainable) Standard optimization Standard Adam Adam N/M 64 N/M N/M N/M Categorical Cross-Entropy Inter Leave-One-Subject-Out Leave-One-Subject-Out Train: 90% Test: 10% Accuracy accuracy N/M N/M ~90% for 7/10 Subjects. 60%, 75%, 30% for the others. (chance = 8%) CCA (Canonical Correl. Analysis) C-CCA (Combined CCA) Traditional pipeline Paired t-tests when comparing to baseline Visualization of feature activations with t-SNE Analysis of activations "Although unexpected, these within-class clusters highlight the strength of the deep learning approaches to learn diagnostic features directly from the data." Experiment did not include a non-control state Yes GitHub No Yannick Roy Hubert Banville Yes Waytowich2018
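A minimal sketch of the compact EEGNet-style model from the Waytowich2018 row above: a temporal convolution acting as a filter bank, a depthwise "spatial filter" stage across the 8 electrodes, ELU, batch norm, 25% dropout, and a 12-way softmax on 1-s (8 x 256) windows. Kernel lengths and filter counts follow the published EEGNet defaults and are assumptions with respect to this row.

```python
# Hedged EEGNet-style reconstruction; kernel lengths/filter counts follow EEGNet defaults.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_eegnet(n_channels=8, n_samples=256, n_classes=12):
    inp = layers.Input(shape=(n_channels, n_samples, 1))
    # Block 1: temporal filter bank, then per-filter spatial filters across electrodes
    x = layers.Conv2D(8, (1, 64), padding="same", use_bias=False)(inp)
    x = layers.BatchNormalization()(x)
    x = layers.DepthwiseConv2D((n_channels, 1), depth_multiplier=2, use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("elu")(x)
    x = layers.AveragePooling2D((1, 4))(x)
    x = layers.Dropout(0.25)(x)                         # 25% dropout, as reported
    # Block 2: separable temporal convolution
    x = layers.SeparableConv2D(16, (1, 16), padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("elu")(x)
    x = layers.AveragePooling2D((1, 8))(x)
    x = layers.Dropout(0.25)(x)
    x = layers.Flatten()(x)
    out = layers.Dense(n_classes, activation="softmax")(x)  # 12 frequency/phase classes
    model = models.Model(inp, out)
    model.compile(optimizer=tf.keras.optimizers.Adam(),     # Adam, batch size 64 reported
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model
```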
33 Deep Classification of Epileptic Signals 2018 Ahmedt-Aristizabal, Fookes, Nguyen & Sridharan Arxiv Yes Preprint Queensland University of Technology Australia 4 Classification of EEG signals Clinical Epilepsy Detection Improve State-of-the-Art: Using LSTM for Epilepsy classification End-to-end seizure detection Resting State, Eyes Open, Eyes Closed, Seizures. Automatic feature learning N/M Raw EEG Bonn University Public Bonn University (5 sets x 100 x 23.6s) (full 4096 points / 23.6s as windows, no overlap) 500 197 15 1 173.6 Offline None, but the Bonn University Dataset already has some preprocessing. Yes N/M N/M Raw EEG Raw EEG N/M Keras LSTM RNN N/M N/M Yes 100 x 4096 (100 samples of 4096-point segments) Model 1: 1 LSTM + 1 Dropout Model 2: 2 LSTM + 2 Dropout + 1 FC 3 N/M Dropout (0.35) Yes 2 No seizure Seizure 1 Model 1: 16,961 Model 2: 116,033 Standard Standard Adam Adam LR: 1e-3, b1:0.9, b2:0.999 4 N/M N/M No Binary Cross-Entropy Inter 10-Fold CV k-fold Train: 70% Valid: 20% Test: 10% Accuracy, Sensitivity, Specificity, Precision and the Area Under the Curve (AUC). accuracy, sensitivity, specificity, precision, ROC AUC N/M N/M Accuracy: [Valid] 95.54% [Test] 91.25% Sensitivity: [Test] 91.83% Specificity: [Test] 90.50% Precision: [Test] 91.50% AUC: [Test] 0.9582 None None No No No "We experimented with various numbers of memory cells in each layer and obtained the best performance with a network configured with one single layer with 64 hidden units (Model 1) and with 2 hidden layers of 128 and 64 hidden units respectively (Model 2)" N/M No N/A Yes Yannick Roy Isabela Albuquerque TBC Ahmedt-Aristizabal2018
34 Emotion Recognition from EEG Using RASM and LSTM 2018 Li, Tian, Shy, Xu & Hu International Conference on Internet Multimedia Computing and Service No Conference South China University of Technology Lanzhou University China 9 Classification of EEG signals Monitoring Affective Emotion Improve SOTA Using rational asymmetry (RASM) as features and LSTM as classifier on DEAP dataset for emotion classification. 2 Classes (Positive / Negative Valence) Watching emotional movies (clips) LSTM to capture temporal dependencies in emotions N/M Raw EEG DEAP Public DEAP 895 Trials x 125 windows 63s each trial to 125 windows (1s windows, 50% overlap) 111875 939.75 32 32 256 Offline None No No No RASM14 (STFT + Hanning Window --> 4 Freq Bands) Frequency-domain N/M N/M LSTM RNN N/M "In our assumption, emotions change continuously, and this continuity is reflected in the temporal correlations of EEG signals. To explore the correlations, the classification method of Long Short-Term Memory networks (LSTM) is adopted." Yes 125 * 14 * 4 (segments * pairs * bands) 1 1 N/M Dropout (0.5) Yes 2 Positive valence Negative valence 1 N/M Standard Standard N/M N/M N/M N/M N/M N/M N/M N/M Inter 10-Fold CV k-fold N/M Accuracy accuracy N/M N/M RASM + LSTM: 76.67 (Accuracy) RASM + SVM: 65.62 (Accuracy) Zhang, 2016: 69.67 (Accuracy) Chen, 2015: 73.00 (Accuracy) Li X, 2016: 72.06 (Accuracy) SVM Zhang [10] (DE + GELM) Chen [2] (Fusion feature + HMM) Li [6] (Wavelet energy + CRNN) DL & Trad. No No No "Although the accuracy of our experiment is more than 75%, it is not good enough for applications. The task of the future work is to improve the recognition accuracy. More features will be tried especially those reflect the characteristics of EEG signals in frequency-space domain." N/M No N/A No Yannick Roy Isabela Albuquerque TBC Li2018
35 EEG detection and de-noising based on convolution neural network and Hilbert-Huang transform 2018 Wang, Guo, Zhang, Bai & Wang International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI) No Conference Changchun University of Science and Technology Jilin Engineering Research Center of RFID and Intelligent Information Processing China 6 Improvement of processing tools Signal cleaning Artifact handling New Approach Denoising EEG with Hilbert-Huang Transform after detection of (yes/no) EOG artifact from a CNN classifier N/M* Nonlinearity of EEG N/M Raw EEG Internal Recordings Private 2,000 training samples and 100 test samples (3s windows) 2100 105 N/M -1 1000 N/M* N/M N/M* N/M IMF / HHT Other N/M N/M CNN CNN 2*2 convolution kernels N/M Yes "characteristic matrix of the extracted instantaneous power" 1 1 Softmax N/M N/M 1 EOG artifact yes/no (softmax) N/M N/M N/M N/M* N/M N/M N/M N/M N/M No N/M* Inter No No Train: 2000 Test: 100 Accuracy accuracy N/M N/M 80% No None No No No The results show that the method in this paper takes a little longer CPU time compared with the traditional wavelet de-noising [4] and HHT de-noising alone, but the signal-to-noise ratio after de-noising is clearly higher than for the other two methods. N/M No N/A No Yannick Roy TBR TBC Wang2018a
36 Data Augmentation for EEG-Based Emotion Recognition with Deep Convolutional Neural Networks 2018 Wang, Zhong, Peng, Jiang & Liu International Conference on Multimedia Modeling No Conference Shenzhen University The Hong Kong Polytechnic University China 12 Generation of data Data augmentation New Approach: Data augmentation on Emotion datasets for Deep learning models Data augmentation on SEED & MAHNOB-HCI datasets and evaluation using ResNet & LeNet. Watching emotional films/clips Data augmentation for deep models with many parameters N/M Emotions (Frequency Features) SEED; MAHNOB HCI Public DS #1 - SEED: 630 EEG segments, from 14 subjects 14 subjects x 15 videos x 3 sessions x 4 min split into 3x62s each for 1890 segments total DS #2 - MAHNOB-HCI: Between 34.9 - 117s (avg at: 75.95) 188 negative, 208 neutral and 131 positive segments (1s windows, no overlap) 117180; 40025 1953; 667.1 14; 30 62; 32 N/M Offline 1) Downsampled to 200Hz 2) Band-pass filters: 5 freq bands 3) STFT with non-overlapping Hamming window 1s Yes Manual removal Yes Differential Entropy (DE) per band Other N/M MATCONVNET ResNet LeNet CNN Data augmentation paper; the networks themselves are not the contribution. Data augmentation with Gaussian Noise of various std No n x l x 5 n: electrodes l: length (time) 5: Freq Bands DE LeNet: 5 ResNet: 14 14 N/M N/M N/M 3 positive, neutral, negative 1) 3 2) 3 1) 4,000 2) 20,000 Standard optimization with augmented data Standard N/M N/M lr = 0.1 100 N/M N/M Gaussian Noise (augmented up to 30 times) N/M Inter No No 1 - SEED) Train: 1134 1 - SEED) Test: 756 2) N/M Accuracy accuracy N/M N/M DS #1) LeNet: [Pre] 49.6% | [Post] 74.3% DS #1) ResNet: [Pre] 34.2% | [Post] 75.0% DS #2) ResNet: [Pre] 40.8% | [Post] 45.4% DS #2) LeNet: N/M DS #1) SVM: [Pre] 74.2% | [Post] 73.4% DS #1) PCA-SVM: [Pre] 49.8% | [Post] N/M% DS #2) SVM: [Pre] 42.5% | [Post] 44.3% Traditional pipeline No No No By analyzing the experimental result, we find that the data augmentation method can effectively improve the performance of deep models. In future, we will seek to use other data augmentation methods, such as generative adversarial networks, to generate more effective samples of EEG data and improve the performance of EEG-based emotion recognition. N/M No N/A Yes Yannick Roy Hubert Banville TBC Wang2018
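A minimal sketch of the Gaussian-noise augmentation reported above, assuming feature vectors (e.g., DE features) stacked row-wise; `n_copies` follows the "augmented up to 30 times" note, while the noise level is a placeholder since the paper sweeps several values:

```python
import numpy as np

def augment_with_noise(features, labels, n_copies=30, std=0.2, rng=None):
    """Replicate each feature vector n_copies times with added zero-mean
    Gaussian noise, keeping the original labels."""
    rng = rng if rng is not None else np.random.default_rng(0)
    noisy = [features + rng.normal(0.0, std, features.shape)
             for _ in range(n_copies)]
    aug_x = np.concatenate([features] + noisy, axis=0)
    aug_y = np.tile(labels, n_copies + 1)
    return aug_x, aug_y

x = np.random.randn(1134, 62 * 5)  # e.g., DE features: 62 electrodes x 5 bands
y = np.random.randint(0, 3, 1134)  # positive / neutral / negative
aug_x, aug_y = augment_with_noise(x, y)
print(aug_x.shape)  # (35154, 310)
```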
37 A convolutional neural network for sleep stage scoring from raw single-channel EEG 2018 Sors, Bonnet, Mirek, Vercueil & Payen Biomedical Signal Processing and Control No Journal Université Grenoble Alpes CEA Leti, MINATEC Campus (Grenoble) Dijon University Hospital (Dijon) Grenoble University Hospital France 8 Classification of EEG signals Clinical Sleep Staging New Approach: Sleep Stage Scoring (5 stages) with CNN on Single EEG Channel Use CNNs on raw EEG data for 5-class sleep prediction Sleep CNNs have presented good performance in other domains and other EEG tasks. N/M Raw EEG SHHS Public Dataset SHHS-1 (5793 polysomnographic records) 5,384,401 epochs of 30s ~ 5 years of data! (30s windows) 5384401 2692200 5728 1 125 Offline None No No No Raw EEG Raw EEG No TensorFlow CNN CNN (no mention of pooling or dropout) 1D convolutional layers Yes (3750 * 4) x 1 30s epoch + 2 preceding + 1 following 30s @ 125Hz = 3750 samples 12 Conv Layers + 1 FC (256) + 1 FC (5 classes) 14 Leaky ReLU N/M N/M 5 Wake N1 N2 N3 REM 5 [prob for each class] (Softmax) N/M* Standard optimization Standard Adam Adam lr = 3 ×10^−5, b1 = 0.9, b2 = 0.999 128 N/M N/M Tried cost-sensitive learning and oversampling (neither improved results, so they were not used) Multiclass Cross-Entropy Inter Train-Valid-Test Train-Valid-Test Train: 50% Valid: 20% Test: 30% Accuracy accuracy NVidia GTX980Ti N/M* 87% Tsinalis [15] CNN: 0.75 Supratak [16] CNN-LSTM: 0.86 Liang [9] [...] : 0.88 Zhu [10] DVG, SVM: 0.85 Fraiwan [6] T-F, RF: 0.83 Hassan [38] EMD, Ensemble: 0.87 Hassan [11] EMD, [...]: 0.89 Hassan [12] PSD, RF: 0.88 Hassan [39] EMD, [...] : 0.83 Sharma [13] Iterative filtering: 0.88 Hsu [14] Energy, RNN: 0.90 DL & Trad. No Visualization of synthetic inputs that maximize class probability Generating input to maximize activation "This study shows that it is possible to classify sleep stages using a single EEG channel and a convolutional neural network working on raw signal samples without any feature extraction phase and with performance on par with other state-of-the-art methods." "Further research is necessary to address class imbalance. Ensemble learning [35] or CNN-specific methods [36] may prove suitable" N/M Yes GitHub No Yannick Roy Isabela Albuquerque Yes Sors2018
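The (3750 * 4) x 1 input above is just the current 30-s epoch concatenated with its neighbours. A sketch under that reading (edge handling is our assumption; the paper's exact padding is not recorded in the row):

```python
import numpy as np

def context_input(epochs, i, n_before=2, n_after=1):
    """Concatenate epoch i with 2 preceding and 1 following epoch
    (each 3750 samples = 30 s at 125 Hz), giving 15000 samples."""
    idx = [min(max(j, 0), len(epochs) - 1)        # clamp at record edges
           for j in range(i - n_before, i + n_after + 1)]
    return np.concatenate([epochs[j] for j in idx])

epochs = np.random.randn(960, 3750)  # one night: 960 epochs of 30 s
print(context_input(epochs, i=10).shape)  # (15000,)
```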
38 ChronoNet: A Deep Recurrent Neural Network for Abnormal EEG Identification 2018 Roy, Kiral-Kornek & Harrer Arxiv Yes Preprint IBM Research - Australia Australia Classification of EEG signals Clinical Pathological EEG Improve SOTA Detect abnormal EEG with a new end-to-end architecture ? Automatic interpretation of EEG from raw data N/M Raw EEG TUH Abnormal EEG Corpus Public TUH Abnormal EEG Corpus Training set: 1361 abnormal/1379 normal sessions Test set: 127 abnormal/150 normal sessions Became: 14,971 / 15,169 windows for training (1min windows) 30417 30417 N/M 22 250 None No N/M N/M None Raw EEG N/A N/M 1) Conv+GRU 2) Inception Conv+GRU 3) Dense Conv+GRU 4) Inception Dense Conv+GRU CNN+RNN Conv filter sizes grow exponentially inside a given layer (e.g., 2, 4, 8) - Yes 15000 x ? 1) 7 2) 7 3) 7 4) 7 7 N/M N/M N/M 2 N/M Standard optimization Standard Adam Adam 500 epochs 64 N/M N/M N/M N/M Inter 5-Fold CV k-fold Train: 90.8% Test: 9.2% Accuracy accuracy N/M N/M 1) 82.31 2) 84.11 3) 83.89 4) 86.57 CNN-MLP: 78.80 DeepCNN: 85.40 DL No No No The ChronoNet architecture is a general-purpose architecture for time series; it has also been applied to speech data classification. - No N/A No Hubert Banville TBR Yes Roy2018
39 EEG-GAN: Generative adversarial networks for electroencephalograhic (EEG) brain signals 2018 Hartmann, Schirrmeister & Ball Arxiv Yes Preprint University of Freiburg Germany 7 Generation of data Generating EEG Generate EEG signals Generate EEG signals using GANs Motor imagery GANs are good at generating data N/M Raw EEG Internal Recordings Private 438 EEG Signals The length of these signals is not stated; plots show -500ms to 2500ms (probably windows of 3s) 438 21.9 N/M 1 250 None No No No None Raw EEG Subtract mean then divide by maximum absolute value N/M Wasserstein GAN (modified) GAN - Conv layers instead of autoregressive model, as it worked well in the authors' other papers Yes Gen: 200 Discr: 768 Gen: 14 Discr: 14 14 Leaky ReLU Gradient penalty Yes N/A N/A Gen: 768 Discr: 1 N/M GAN optimization with increasing resolutions Other Adam Adam "Equalized learning rate" lr = 0.001 beta1 = 9 beta2 = 0.99 ? N/M N/M No Improved Wasserstein distance Inter No No Train: 286 Valid: 72 Test: 80 Inception score Frechet inception distance Euclidean distance Sliced Wasserstein distance inception score, frechet inception distance, euclidean distance, sliced Wasserstein distance N/M N/M [Many values] WGAN with gradient penalty DL No Visual inspection of generated segments (time series distribution, spectrum distribution, examples) Analysis of generated outputs The metrics did not correlate with visual performance, so the authors recommend using many metrics to obtain a balanced view Mode collapse in GANs No N/A No Hubert Banville Isabela Albuquerque TBC Hartmann2018
40 Know Your Mind: Adaptive Brain Signal Classification with Reinforced Attentive Convolutional Neural Networks 2018 Zhang, Yao, Wang, Zhang, Zhang & Liu Arxiv Yes Preprint University of New South Wales, Tsinghua University, Michigan State University Australia Classification of EEG signals Multi-purpose architecture Make general framework for EEG classification Apply a single architecture (reinforced attentive CNN) to EEG classification 1 & 2: Motor imagery 3: Person identification 4: Pathology (seizure detection) Skip time-consuming feature engineering and avoid task-specific classifiers. EPOC (Emotiv), N/M 1 & 2) Motor Imagery 3) None 4) Seizures eegmmidb; Internal Recordings; EEG-S; TUH Both DS #1 - eegmmidb: 20 x 28000 points (@160Hz) DS #2 - Internal: 7 x 34560 points (@128Hz) DS #3 - EEG-S: 8 x 7000 points (@160Hz) DS #4 - TUH: 5 x 12000 points (@250Hz) (windows of 1 point) 560000; 241910; 56000; 60000 58.33; 31.5; 5.8; 4 20; 7; 8; 5 64; 14; 64; 22 160; 128; 160; 250 None No N/M N/M None Raw EEG N/A TensorFlow CNN with attention + DQN CNN 1) Replicating and shuffling incoming samples 2) Attention mechanism trained with RL 3) CNN 4) Nearest-neighbour classifier A) Replicate and shuffle operation intended to randomly unveil interesting spatial patterns Yes 1 x nb_channel CNN: 3 DQN: 2 3 ReLU & Sigmoid L2 Yes 1) 5 2) 6 3) 8 4) 2 N/M Standard optimization (including reinforcement learning) Standard Adam Adam Learning rate: 0.001 N/M N/M N/M No Cross-entropy Inter N/M No N/M Accuracy, Precision, Recall, F1-score Latency Resilience accuracy, precision, recall, f1-score, latency, resilience N/M 10 min Accuracy 1) 0.9932 2) 0.9708 3) 0.9984 4) 0.9975 Not clear what they were trained on (samples? features?): Linear SVM, Random Forest, kNN, LSTM, GRU, Adaptive boosting, LDA + 5 state-of-the-art papers for each (20 total) DL & Trad. No No No Latency is comparable to other methods The number of channels used affects the performance. - Yes GitHub No Hubert Banville Yannick Roy TBC Zhang2018a
41 Gated Recurrent Networks for Seizure Detection 2018 Golmohammadi, Ziyabari, Shah, Von Weltin, Campbell, Obeid & Picone Arxiv Yes Preprint Neural Engineering Data Consortium, Temple University USA 5 Classification of EEG signals Clinical Epilepsy Detection Improve SOTA (their previous work) Explore gated RNNs (LSTM & GRU) and the initialization and regularization of these networks (see TUH dataset paper) Improve their last results N/M Seizures TUH Seizure Corpus Public TUH EEG Corpus (Train + Test | in sec) Seizures: 51,140 + 53,930 Non-Seizures: 877,821 + 547,728 (21s windows, no overlap) 72886 25510.35 246 22 250 None No N/M N/M LFCCs + First & Second Derivative of LFCCs Other N/A N/M 1) CNN + LSTM 2) CNN + GRU CNN+RNN 2D CNN to 1D CNN to bi-LSTM First LSTM output: 128 (1s data / epoch) Second LSTM output: 2-way sigmoid (classification of a 1s epoch) 1) Gated units to avoid vanishing gradient. 2) RNNs to capture long-term dependencies. Yes 210 x 22 x 26 (Windows * Channels * Features) 3x 2D CNN + 1x 1D CNN + LSTM 5 ELU 1) L1 2) L2 3) L1/L2 4) Dropout 5) Gaussian Noise Yes 1 (classification - sigmoid) N/M Initialization: The best performance is achieved using orthogonal initialization Standard Adam Adam N/M N/M N/M N/M No MSE Inter No No Train: 928,962s Test: 601,659s Sensitivity, Specificity sensitivity, specificity N/M N/M CNN + GRU - Sensitivity: 30.83% | Specificity: 91.49% CNN + LSTM - Sensitivity: 30.83% | Specificity: 97.10% Best Regularization: L1/L2 Best Initialization: Orthogonal Compared CNN+GRU vs CNN+LSTM Compared 10 different initialization methods (see comments) Compared 5 different regularization methods (L1/L2, L1, L2, Gaussian noise, Dropout) DL No No No LSTMs outperformed GRUs. We also studied initialization and regularization of these networks. In future research, we are designing a more powerful architecture based on reinforcement learning concepts. We are also optimizing regularization and initialization algorithms for these approaches. Our goal is to approach human performance which is in the range of 75% sensitivity with a false alarm rate of 1 per 24 hours [11]. Not enough labeled data: having certified specialists label the data is very expensive, and it is hard to find people to do it. No N/A Yes Yannick Roy TBR TBC Golmohammadi2017b
42 Optimizing Channel Selection for Seizure Detection 2018 Shah, Golmohammadi, Ziyabari, Weltin, Obeid & Picone Arxiv Yes Preprint Neural Engineering Data Consortium, Temple University USA Classification of EEG signals Clinical Epilepsy Detection Study the Impact of Number of Channels Explore the impact of using/having from 2 to 22 channels with same network (see TUH dataset paper) Lower the number of EEG channels required (also save disk space) N/M Seizures TUH Seizure Corpus Public TUH EEG Seizure Corpus (TUSZ) No more information about samples/time (1s windows) N/M N/M N/M 22 250 None No N/M N/M LFCCs + First & Second Derivative of LFCCs Other N/A N/M CNN + LSTM CNN+RNN (same as their previous paper: Gated Recurrent Networks for Seizure Detection) (same as their previous paper: Gated Recurrent Networks for Seizure Detection) Yes 210 x 22 x 26 (Windows * Channels * Features) 3x 2D CNN + 1x 1D FC CNN + 2x Bi-LSTM 5 ELU & Sigmoid Dropout Yes 1* (classification - sigmoid) N/M N/M N/M Adam Adam N/M N/M N/M N/M No MSE Inter N/M No N/M Sensitivity, Specificity sensitivity, specificity N/M N/M 22 Channels - Sensitivity: 39.15% | Specificity: 90.37% 20 Channels - Sensitivity: 34.54% | Specificity: 82.07% 16 Channels - Sensitivity: 36.54% | Specificity: 80.48% 8 Channels - Sensitivity: 33.44% | Specificity: 85.51% 4 Channels - Sensitivity: 33.11% | Specificity: 39.32% No None No No No The results presented in this paper use the Any Overlap scoring method [11] in which true positives are counted when the hypothesis overlaps with one or more reference annotations. False positives correspond to events in which the hypothesis annotations do not overlap with any of the reference annotations. This method of scoring is popular in the EEG research community. - No N/A No Yannick Roy TBR TBC Shah2017
43 Improving brain computer interface performance by data augmentation with conditional Deep Convolutional Generative Adversarial Networks 2018 Zhang & Liu Arxiv Yes Preprint Beijing Institute of Technology China Generation of data Data augmentation Generate EEG signals Generate EEG signals using GANs for data augmentation Motor Imagery (Left/Right Hand) To increase the amount of data available for training N/M Motor imagery BCI Competition II - III Public BCI Competition II - III (1 Subject x 7 runs x 40 trials x 9 seconds) Used only 280 trials (140 training / 140 testing) Took 5s from each trial: 4s-9s (5s windows) 280 23.3 1 3 128 Offline None No No No Continuous Wavelet transform (Morlet) Only keep 7-15 Hz (Time-Frequency Domain) Frequency-domain z-score N/M Augmentation: cDCGAN Classification: CNN CNN Conditional Deep Convolutional GAN (cDCGAN) + label information as input to both generator and discriminator 2D kernel to accommodate input TFR Yes N/M N/M N/M ReLU + Leaky ReLU + Sigmoid N/M N/M 2 Left Hand Right Hand Same as input (not mentioned) N/M cDCGAN optimization Training CNN with real and artificial data Standard N/M N/M N/M N/M N/M N/M GAN [0.5 - 2x] (artificial EEG data) N/M Intra No No Train: 50% Test: 50% Accuracy accuracy N/M N/M No augmentation: ~83% 50% of augmentation: ~84% 150% of augmentation: ~84% 200% of augmentation: ~85.5% None None No No No Data augmentation with GAN does help increase accuracy when limited data is available. Limited amount of data available per subject when training a BCI. No N/A Yes Hubert Banville Yannick Roy Yes Zhang2018b
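A sketch of the Morlet CWT input described above, restricted to 7-15 Hz, using PyWavelets (the 1-Hz frequency grid and the 'morl' wavelet choice are assumptions; the row only gives the transform and the band):

```python
import numpy as np
import pywt

def morlet_tfr(x, fs=128, fmin=7, fmax=15):
    """Morlet continuous wavelet transform, keeping only fmin-fmax Hz."""
    freqs = np.arange(fmin, fmax + 1)                    # 1-Hz steps (assumed)
    scales = pywt.central_frequency('morl') * fs / freqs
    coefs, out_freqs = pywt.cwt(x, scales, 'morl', sampling_period=1 / fs)
    return np.abs(coefs), out_freqs                      # (n_freqs, n_samples)

trial = np.random.randn(5 * 128)   # the 5-s segment taken from each trial
tfr, freqs = morlet_tfr(trial)
print(tfr.shape)                   # (9, 640)
```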
44 Convolutional neural networks for seizure prediction using intracranial and scalp electroencephalogram 2018 Truong, Nguyen, Kuhlmann, Bonyadi, Yang, Ippolito & Kavehei Neural Networks Yes Journal University of Sydney Royal Melbourne Institute of Technology Swinburne University University of Melbourne University of Queensland University of Adelaide Australia Classification of EEG signals Clinical Epilepsy Prediction Improve SOTA Use CNN to improve SOTA in seizure prediction Ongoing recording with and without seizures Test CNN on different epilepsy datasets N/M None (Seizures) Freiburg Hospital; CHB-MIT; Kaggle: AESSPC Public DS #1 - Freiburg: 311h (59 seizures) DS #2 - CHB-MIT: 209h (64 seizures) DS #3 - AESSPC: 627h (48 seizures) (30s windows, no overlap) 37320; 25080; 75240 18660; 12540; 37620 13; 13; 2 6; 22; 16 N/M 1) Removed Powerline: 47-53Hz + 97-103Hz 2) Removed DC Component (0Hz) Yes N/M N/M STFT (2D Freq x Time) 30s EEG windows Frequency-domain N/A Python Keras Tensorflow CNN CNN Batch Norm + Pooling 2 Dense First, we keep the CNN architecture simple and shallow as described above (Ba & Caruana, 2014) Yes n x 59 x 114 (electrodes x time x freq) CNN: 3 FC: 2 5 ReLU Sigmoid Softmax Dropout (50%) Yes 2 N/M We applied cost-sensitive learning by changing the cost function in a way that the misclassification cost of preictal samples is multiplied by the ratio of interictal samples to preictal samples for each patient. We over-sampled the minority class. The cost-sensitive learning was for comparison, though the two methods achieve comparable performance. Standard N/M N/M N/M N/M N/M N/M Overlapping windows (overlap % subject-specific to match classes) N/M Intra Leave-One-Seizure-Out Leave-One-Sample-Out Train: N/M Valid: 25% of training Test: N/M Sensitivity FPR (/h) sensitivity, FPR NVidia K80 N/M Measures (Epilepsy Specific): SOP of 30 min | SPH of 5 min DS #1) Sensitivity: 81.4% | FPR: 0.06/h DS #2) Sensitivity: 81.2% | FPR: 0.16/h DS #3) Sensitivity: 75.0% | FPR: 0.21/h Compares on 3 Datasets Compares to 14 other SOTA (papers) DL & Trad. Wilcoxon signed-rank test No No (1) Unbalanced Classes. (2) Comparing results with SOTA is complicated because each approach was tested with one dataset that is limited in the amount of data. No N/A Yes Yannick Roy TBR Yes Truong2018
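The n x 59 x 114 input above follows directly from the preprocessing: a 30-s window at 256 Hz with 1-Hz STFT resolution and half-overlapping frames gives 59 time frames and 129 frequency bins, of which dropping DC and the two power-line bands leaves 114. A SciPy sketch per channel (fs = 256 Hz is assumed here; the row lists N/M):

```python
import numpy as np
from scipy.signal import stft

def stft_features(window, fs=256):
    """STFT of one 30-s window; drop DC and the 47-53 / 97-103 Hz bins."""
    f, t, Z = stft(window, fs=fs, nperseg=fs, noverlap=fs // 2,
                   boundary=None, padded=False)
    keep = (f > 0) & ~((f >= 47) & (f <= 53)) & ~((f >= 97) & (f <= 103))
    return np.abs(Z[keep]).T            # (n_times, n_freqs)

win = np.random.randn(30 * 256)         # 30 s of one channel at 256 Hz
print(stft_features(win).shape)         # (59, 114)
```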
45 Semi-supervised Seizure Prediction with Generative Adversarial Networks 2018 Truong, Kuhlmann, Bonyadi & Kavehei Arxiv Yes Preprint University of Sydney University of Melbourne University of Queensland Australia Classification of EEG signals Clinical Epilepsy Prediction Improve SOTA Use unlabelled data and data fusion to improve SOTA in seizure prediction Ongoing recording with and without seizures Leverage unlabelled data N/M Raw EEG CHB-MIT; Freiburg Hospital Public DS #1 - Freiburg: 311h DS #2 - CHB-MIT: 209h (28s windows, no overlap) 39985; 26871 18660; 12540 13; 13 16; 6 256 STFT on 28-s windows with 50% overlap Removal of power line noise frequencies Yes N/M N/M STFT Frequency-domain N/M Tensorflow 1) GAN 2) CNN Other - - Yes 1) GAN generator: 100 x1 2) GAN discriminator: n x 56 x 112 3) CNN: Same as discriminator 1) GAN generator: 4 2) GAN discriminator: 3 3) Classifier: 2 4 Softmax, Sigmoid Dropout (50%) Yes 1) GAN generator: 2) GAN discriminator: 3) CNN N/M 1) Train GAN 2) Train 2 new FC layers on top of discriminator using labelled data Other N/M N/M N/M N/M N/M N/M Overlapping windows (overlap % subject-specific to match classes) N/M Both Leave-One-Seizure-Out Leave-One-Sample-Out Train: N/M Valid: 25% of training Test: N/M ROC AUC ROC AUC Nvidia P100 N/M AUC: 77.68% (CHB-MIT), 75.47% (Freiburg) [6 and 12% less than benchmark] CNN DL No No No Although the performance decreased as compared to a standard CNN, the authors argue this can reduce the effort put into labelling the data. - No N/A No Hubert Banville TBR Yes Truong2018a
46 Time Series Segmentation through Automatic Feature Learning 2018 Lee, Ortiz, Ko & Lee Arxiv Yes Preprint Princeton University USA Classification of EEG signals Multi-purpose architecture Improve SOTA Detect changepoints/breakpoints in data (changes in signal) and apply to different types of time series data Eye movements Deep learning models for changepoint detection don't make assumptions about the underlying processes, as opposed to standard models EPOC (Emotiv) Eyes open vs. eyes closed EEG Eye State Public EEG Eye State Dataset 117 seconds from 1 subject with Emotiv (14980 points, using windows of 25 points) 600 2 1 14 256 N/M N/M N/M N/M N/M N/M N/M N/M Stacked Autoencoder AE - - No N/M 2 (encoder) 2 N/M Tied weights in encoder and decoder L2 weight decay Yes 2 N/M Standard optimization Standard Stochastic gradient descent SGD N/M N/M N/M N/M No Cross-entropy (or square loss?) Intra No No N/M ROC Prediction loss (specific to task) MSE Prediction ratio ROC, prediction loss, mse, prediction ratio N/M N/M ROC curves... Bayesian changepoint detection (based on Gamma or Gaussian priors) Pruned Exact Linear Time method Density-ratio estimation method Traditional pipeline No No No Deep learning avoids typical problems in modelling changepoints. - No N/A No Hubert Banville TBR TBC Lee2018a
47 Investigating the Impact of CNN Depth on Neonatal Seizure Detection Performance 2018 O’Shea, Lightbody, Boylan & Temko Arxiv Yes Preprint Irish Centre for Fetal and Neonatal Translational Research, University College Cork Ireland Classification of EEG signals Clinical Epilepsy Detection Improve SOTA Use CNN to improve SOTA in neonatal seizure detection Ongoing recording with and without seizures Improve SOTA with CNN-11 based on their CNN-6 (2017) N/M Raw EEG Internal Recordings Private 18 babies: over 800 hours of multichannel unedited EEG containing 1389 seizures (8s windows, 7s overlap) N/M N/M 18 8 256 Down-sample to 32Hz Filtered between 0.5 and 12.8Hz Yes No No 8 sec windows (1 sec shift) Raw EEG N/A N/M CNN CNN Conv - Batch Norm - Pooling Output not Dense layer but Global Average Pooling "The 11-layer network can learn more simple features in the first layer (3 samples wide) and more complex features in the final layers (212 samples wide)." Yes 256x1 (8 sec x 1 channel) 11 11 Softmax Batch norm Yes 2 Seizure / Non-Seizure 28,642 The network was trained for 100 epochs; after each epoch the validation AUC was calculated. Standard Stochastic Gradient Descent SGD LR: 0.01 Momentum: 0.9 2048 N/M N/M Sliding Window (Shifted by 1s, 7/8 overlap) N/M Inter Leave-One-Subject-Out Leave-One-Subject-Out "The training data contains less than 2% of the validation dataset" ROC AUC ROC AUC N/M N/M AUC: 97.61% AUC90: 86.85% CNN - 6 layers (O'Shea et al., 2017) SVM DL & Trad. No No No This represents a substantial improvement over a shallower 6-layer CNN network which has a smaller range of receptive fields. These results represent the current best results for this task obtained using a single classifier. N/M No N/A No Yannick Roy TBR Yes OShea2018
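A hedged Keras sketch of the conv / batch-norm / pooling pattern with a global-average-pooling head described above (block count and filter widths are illustrative placeholders; the exact 11-layer layout is in the paper):

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_gap_cnn(input_len=256, n_classes=2):
    """1-D CNN whose softmax is fed by global average pooling
    instead of a dense classification layer."""
    inp = layers.Input(shape=(input_len, 1))     # 8 s of 1 channel at 32 Hz
    x = inp
    for n_filters in (32, 32, 64):               # placeholder widths
        x = layers.Conv1D(n_filters, 3, padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.Activation("relu")(x)
        x = layers.MaxPooling1D(2)(x)
    x = layers.Conv1D(n_classes, 1)(x)           # one feature map per class
    x = layers.GlobalAveragePooling1D()(x)       # no dense classifier
    out = layers.Activation("softmax")(x)
    return tf.keras.Model(inp, out)

model = build_gap_cnn()
model.compile(optimizer=tf.keras.optimizers.SGD(0.01, momentum=0.9),
              loss="categorical_crossentropy")   # LR/momentum from the row
```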
48 Removing Confounding Factors Associated Weights in Deep Neural Networks Improves the Prediction Accuracy for Healthcare Applications 2018 Wang, Wu, Xing Pacific Symposium on Biocomputing 2019 Yes Preprint Carnegie Mellon University University of Illinois Urbana-Champaign Petuum Inc. USA 12 Improvement of processing tools Reduce effect of confounders New approach: Reduce effect of confounders in medical data Reduce the effect of confounders in medical data (e.g., gender bias in training data) Students watching MOOC videos Learn representations from scratch Mindset (NeuroSky) Raw EEG Internal Recordings Private 10 students x 20 videos x 2min 10 confusing / 10 not confusing (??s windows) N/M 400 10 1 N/M N/M N/M N/M No No Raw EEG Raw EEG z-score TensorFlow Bi-LSTM RNN Use of Confounder Filtering N/M No N/M N/M N/M Tanh N/M N/M 2 Confused Not-confused 1 (sigmoid) N/M N/M N/M N/M N/M N/M 20 N/M N/M N/M Binary Cross-Entropy* Inter 5-Fold CV k-fold N/M Accuracy accuracy N/M N/M CF-Bidirectional LSTM acc: 75.0% SVM: 67.2% K-Nearest Neighbors: 51.9% Convolutional Neural Network: 64.0% Deep Belief Network: 52.7% RNN-LSTM: 69.0% Bidirectional LSTM: 73.3% DL & Trad. No No No The use of confounding filtering improves the predictive performance. N/M Yes GitHub No Isabela Albuquerque TBR Yes Wu2018
49 HAMLET: Interpretable Human And Machine co-LEarning Technique 2018 Deiss, Biswal, Jin, Sun, Westover & Sun Arxiv Yes Preprint Georgia Institute of Technology Massachusetts General Hospital USA 9 Classification of EEG signals Multi-purpose architecture New approach Help experts generate high quality labels Tested on Epilepsy data, could be used for different tasks Features can be automatically extracted to help experts label the data N/M Raw EEG Internal Recordings Private D: Using 140,000 of 390,486 x 16s sequences (unbalanced with 5 classes) 1) 20,000 (89h of EEG): 80/20 train / test. Patients in the testing set are not present in the training set (testing is performed on unseen patients) 2) 20,000 (89h of EEG): 80/20 train / test. Patients in the testing set are also present in the training set (testing is performed on known patients). 3) 100,000 sequences from D 140000 37333 155 19 200 1) Low-Pass filter: 60Hz 2) Computation of montages* (not sure what that means) 3) 16s windows Yes No No Raw EEG (None) Raw EEG N/M Python Tensorflow CNN CAE (Conv AutoEncoder) Other 1D CNN FC Layer only for training One advantage of CNNs is the automated feature selection that happens during training. Without additional work, the model learns the features that it finds most relevant for its given task, from the raw signals. Yes 16x Classifier: 6 Conv + 1 FC 7 ELU Dropout (20%) Yes 5 (softmax) N/M Co-Learning Supervised & Unsupervised Other Adam Adam N/M 128 N/M N/M Flipped Electrodes Left <-> Right side of brain, keeping references the same (Fz, Cz, Pz); almost 2x dataset. N/M Inter No No Train: 80% Test: 20% Accuracy accuracy Intel(R) Xeon(R) E5-2630 2.40 GHz 32 cores 256 GB of RAM 4 Tesla K80 GPUs 13h Before re-labeling | After re-labeling (full) | After re-labeling (re-evaluated only) HAMLET-CNN 39.36% | 40.75% | 68.75% HAMLET-CAE 38.46% | 39.06% | 67.97% CNN 38.89% | 41.58% | 68.75% MLP 21.04% | 23.14% | 14.06% CNN MLP DL No 1) Retrieval of closest labelled example to explain the decision on a specific input 2) Analysis of weights Retrieval of closest examples, Analysis of weights To summarize, first, we have introduced a novel technique, HAMLET, for human and machine co-learning that is suited for creating high-quality labeled datasets on challenging tasks with a limited budget. This technique has benefits that can be appreciated in many deep learning applications. N/M No N/A Yes Yannick Roy TBR TBC Deiss2018
50 Addressing Class Imbalance in Classification Problems of Noisy Signals by using Fourier Transform Surrogates 2018 Schwabedal, Snyder, Cakmak, Nemati & Clifford Arxiv Yes Preprint Emory University USA 7 Generation of data Data augmentation Improve SOTA Use FT Surrogates for Data Augmentation. (Tested with a CNN on Sleep Data) Sleep Dataset (CAP) Some EEG problems are unbalanced (e.g. sleep stages, epilepsy, etc.); for DL to perform well, we need data augmentation techniques. N/M Sleep CAP Sleep Public CAPSLPDB: 94 out of 101 overnight PSGs x ~8h (30s windows, no overlap) 90240 45120 94 2 N/M Low-pass filter: 13Hz (4th order Butterworth) Downsampling to 32Hz Yes No No Raw EEG (None) Raw EEG N/M N/M CNN CNN 1D CNN for each channel: 2xEEG + 1xEOG + 1xEMG Yes 30s Raw EEG Conv 1D: 4 Conv 2D: 1 FC: 3 8 Dropout Yes 6 Wake S1, S2, S3, S4 REM 6 (softmax) N/M N/M N/M RMSProp Other LR: 0.0016 Momentum: None Decay: 0.9 128 Bayesian hyperparameter optimization Yes FT Surrogates N/M Inter 5-Fold CV k-fold Train: 4/5 Validation: 1/5 Test: N/M F1-Score Accuracy f1-score, accuracy Google Cloud N/M Accuracy (no augmentation): 67% | 73% | 51% | 64% | 75% | 70% Accuracy (FT surrogates): 83% | 86% | 38% | 75% | 97% | 46% Accuracy (IAAFT surrogates): 91% | 83% | 48% | 79% | 96% | 81% (all internal, no external) No data augmentation FT surrogates IAAFT surrogates None No No No Increases in the S2-accuracy seemed to be at the expense of stages S1 and S3 for larger values of α. Based on these results, we hypothesize that the effect of surrogate augmentation on an individual class accuracy does not directly depend on their conditional prediction accuracies, which are on the diagonal of the conditional confusion matrix (cf. Fig. 4(a)); instead, augmentation may introduce mixing between class labels indicated by a large off-diagonal element upon which the accuracy of one of the mixed labels will dominate. Unfortunately, we were not yet able to evaluate and compare IAAFT surrogates with these results due to temporal and budget constraints. Yes GitHub Yes Yannick Roy TBR TBC Schwabedal2018
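FT surrogates are simple to generate: keep the amplitude spectrum and randomize the phases (the IAAFT variant additionally restores the original amplitude distribution by iteration). A NumPy sketch of the basic construction used for augmentation above:

```python
import numpy as np

def ft_surrogate(x, rng=None):
    """Phase-randomized Fourier-transform surrogate of a 1-D signal."""
    rng = rng if rng is not None else np.random.default_rng()
    amplitudes = np.abs(np.fft.rfft(x))
    phases = rng.uniform(0.0, 2.0 * np.pi, amplitudes.shape)
    phases[0] = 0.0                       # keep the DC bin real
    if len(x) % 2 == 0:
        phases[-1] = 0.0                  # ...and the Nyquist bin
    return np.fft.irfft(amplitudes * np.exp(1j * phases), n=len(x))

epoch = np.random.randn(30 * 32)          # one 30-s epoch at 32 Hz
print(ft_surrogate(epoch).shape)          # (960,)
```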
51 EEG Classification Based on Sparse Representation and Deep Learning 2018 Gao, Shang, Xiong, Fang, Zhang, & Gu NeuroQuantology No Journal Zhejiang University City College, China 7 Classification of EEG signals BCI Active Motor imagery Improve SOTA Use CNN + Sparse coding on top of CSP features Motor Imagery N/M N/M CSP BCI Competition III - IVa Public BCI Competition III - IVa 140 + 140 = 280 samples (length = 6s) 280 28 2 118 100 Offline Band-pass filter 8-15Hz Yes No No CSP (32 CSP filters) Frequency-domain N/M N/M CNN CNN CNN's input is a sparse representation of CSP features N/M Yes 28 x 28 CNN: 2 FC: 1 3 ReLU N/M N/M 2 Right Hand Right Foot 2 (softmax) N/M N/M N/M N/M N/M N/M N/M N/M N/M N/M Binary cross-entropy Inter N/M No Train: 280 Test: N/M Accuracy accuracy N/M N/M Accuracy Class 1: 98% Accuracy Class 2: 99% Sparse representations (not clear what the classifier is) Traditional pipeline No No No Performance of CNN + sparse representations is less affected when the number of training samples decreases. N/M No N/A No Isabela Albuquerque Yannick Roy TBC Gao2018
52 Use of features from RR-time series and EEG signals for automated classification of sleep stages in deep neural network framework 2018 Tripathy & Rajendra Acharya Biocybernetics and Biomedical Engineering No Journal Siksha 'O' Anusandhan, India Ngee Ann Polytechnic, Singapore SUSS University, Singapore India 13 Classification of EEG signals Clinical Sleep Staging Improve SOTA Use DNNs on EEG + ECG for sleep stage scoring Sleep Dataset (MIT-BIH) They don't mention why DL. N/M Raw EEG (Sleep) MIT-BIH Public MIT-BIH 18 records * 100000 points / 7500 points per window (30s windows) 240 120 18 1 250 1) 5 Band-pass filters to 5 freq bands Yes No No 14 EEG-HRV Features (out of 19) (The dispersion entropy and the variance features are evaluated from the different bands of EEG signal) (the RQA and dispersion entropy features are evaluated from the IMFs of RR-time series) Other N/M Matlab 2015a SAE AE 3 DNNs EEG features + HRV features combined as inputs. Outputs = 2 classes (x3 DNNs) N/M Yes 14 EEG Features ECG Features (30s window) 2 AE 2 Sigmoid L2 (N/M... Assumed from the formula) Yes 2 (softmax) 3 DNN Networks Classifying 2 classes each N/M Greedy Layer-Wise Pre-training SGD SGD N/M N/M N/M N/M N/M (See Formula) Inter 10-Fold CV k-fold N/M Accuracy (Acc) Sensitivity (Sen) Specificity (Spe) accuracy, sensitivity, specificity CPU 2 GHz 2 GB RAM 1 Instance: EEG: 4.89s RR: 0.03s Acc Sleep vs Wake: 85.51% Acc Light vs Deep Sleep: 94.03% Acc REM vs NREM: 95.71% Hayet and Slim [55] (ELM), Werteni et al. [56] (SVM), Adnane et al. [16] (SVM), Rossow et al. [57] (HMM), Redmond and Heneghan [58] (QDA), Song et al. [59] (Multivariate Discrim. Analysis), Prucnal et al. [12] (NN), Hasan et al. [11] (RUSBoost), Da Silveira et al. [13] (RF) Traditional pipeline No No No The dispersion entropy values for delta (d), theta (u) and alpha (a) bands are found to be more discriminatory for the classification of the wake and sleep classes. The limitation of this work is that only 18 subjects were used; performance could be improved with more subjects from more diverse populations. The number of REM sleep stage instances in the MIT-BIH polysomnography database is low compared to the deep sleep, light sleep and wake classes. No N/A Yes Yannick Roy TBR Yes Tripathy2018
53 Emotion stress detection using EEG signal and deep learning technologies 2018 Liao, Chen & Tai IEEE International Conference on Applied System Invention (ICASI) No Conference Department of Information Management Chaoyang University of Technology Taiwan Classification of EEG signals Monitoring Affective Emotion New approach Use CNN to classify Attention & Meditation from raw EEG Listening to music Exploring the use of DL for stress detection via EEG Mindwave Mobile (Neurosky) None Internal Recordings Private 7 subjects x 10 min (1s windows, no overlap) 4300 70 7 1 512 N/M N/M N/M N/M Frequency Bands Frequency-domain N/A N/M CNN CNN N/M N/M No 1s (N/M, assuming 512 samples) 7 7 ReLU N/M N/M 1 0: Meditation 1: Attention N/M N/M N/M N/M N/M N/M N/M Grid Search Yes N/M N/M Inter N/M No Train: 80% Test: 20% Accuracy F1-Score accuracy, f1-score N/M N/M Accuracy: 80.13% None None No No No The F1-score shows that the system is better at predicting class 1 than class 0. N/M No N/A No Yannick Roy TBR TBC Liao2018
54 Hierarchical internal representation of spectral features in deep convolutional networks trained for EEG decoding 2018 Hartmann, Schirrmeister & Ball BCI Conference Yes Conference University of Freiburg Germany 6 Improvement of processing tools Model interpretability Model visualization Improve interpretability of CNNs Study most-activating inputs. Study the effect of variations in the input signal on internal representations Motor imagery End-to-end learning N/M Raw EEG Internal Recordings Private 14 subjects x 1000 trials x 4s (4s windows) 14000 933 14 128 5000 1) Downsample to 250 Hz 2) Common average re-reference Yes No No Raw EEG (None) Raw EEG N/M Pytorch CNN CNN See Schirrmeister et al. (2017) See Schirrmeister et al. (2017) Yes 522 x 128 (samples x channels) CNN: 5 FC: 1 6 ELU N/M N/M 4 N/M Standard optimization Standard Adam Adam N/M N/M N/M N/M N/M Cross-entropy Intra N/M No Train: 80% Test: 20% Accuracy F1-Score accuracy, f1-score N/M N/M Mean accuracy over 14 subjects: 88.6% (but this is not the focus of the paper) None None No 1) Signal perturbation (amplitude & phase) 2) Most-activating input windows Input-perturbation network-prediction correlation maps, Analysis of most-activating input windows Analyzed the effect of perturbations in phase and amplitude of input signals. Earlier layers focus on frequency-related information while later layers focus on amplitude. N/M No N/A No Isabela Albuquerque Hubert Banville TBC Hartmann2018b
55 Spatial-Temporal Recurrent Neural Network for Emotion Recognition 2018 Zhang, Zheng, Cui, Zong & Li IEEE Transactions on Cybernetics Yes Journal Southeast University, Nanjing, China Nanjing University of Science and Technology, China China 9 Classification of EEG signals Monitoring Affective Emotion New Approach: Stacking 2 RNN layers for spatial and temporal resolution, for EEG & Facial Expression for emotion classification Stacking 2 RNN layers for spatial and temporal resolution, for EEG & Facial Expression for emotion classification Emotion Classification for short emotional films/clips (SEED dataset) Leverage RNN for both spatial and temporal features (NeuroScan) Emotions SEED Public SEED: 15 subjects Assumed: 15 subjects x 15 movies x 4min x 2 exp (9s windows, no overlap) 12000 1800 15 62 1000 None No No No DE descriptors (?) - Freq Bands (256-point FFT + Hanning Window (1s) for 5 F-Bands) Frequency-domain N/M N/M STRNN (Spatial-Temporal RNN) RNN Spatial & Temporal features representation with stacked RNNs 1) To learn spatial dependencies, a quad-directional spatial RNN (SRNN) layer is first employed 2) Then, a bi-directional temporal RNN (TRNN) layer is further stacked on SRNN to capture long-term temporal dependencies Yes Not clear... (to be reviewed) SRNN: 1 TRNN: 1 2 ReLU Sigmoid N/M N/M 3 (Softmax) N/M N/M N/M Back Propagation Through Time (BPTT) N/M N/M N/M N/M N/M N/M Cross-entropy Inter No No Train: 9 Sessions Test: 6 Sessions Accuracy accuracy N/M N/M Accuracy: 89.5% None None No No No A multidirection SRNN layer and a bi-direction TRNN layer are hierarchically employed to learn spatial and temporal dependencies layer by layer. To adapt the multichannel EEG signals to the proposed STRNN framework, the spatial scanning order of electrodes is specified by spatial coordinates and temporal variation information is involved by slicing a window on the extracted DE feature sequences. N/M No N/A No Yannick Roy TBR TBC Zhang2018
56 Individual Recognition in Schizophrenia using Deep Learning Methods with Random Forest and Voting Classifiers: Insights from Resting State EEG Streams 2018 Chu, Qiu, Liu, Ling, Zhang & Wang IEEE Transactions on Neural Systems and Rehabilitation Engineering Yes Journal Big Data and AI Research Center of Shanghai Jiaotong University China 7 Classification of EEG signals Clinical Schizophrenia Detection New Approach Using Random Forest and Voting Classifiers with a CNN for Individual Recognition in Schizophrenia Resting State, Eyes Open. (300s each) Automatic feature extraction (BrainProducts) Raw EEG Internal Recordings Private 120 Subjects x 300 seconds 360000 600 120 64 1000 Offline 1) Ocular Correction (with Brain Vision Analyzer's algos) 2) Re-Referenced to Common Average 3) Pass-Band Filter (IIR): 0.01 - 50Hz Yes Yes* Yes 1) Raw EEG 2) Freq Bands Raw EEG Divide by max N/M CNN, RNN, and MLP CNN 3 Conv Layers, ELU, 3 Dropout 0.5, 3 Max Pooling + Dropout 0.25, 3 FCs, 1 voting (RF, softmax or SVM) N/M Yes Not clear CNN: 6 MLP: 4 RNN: 2 6 ELU Dropout (0.5, 0.25) Yes 3 High risk Schizophrenia Healthy 3 Classes (replaced Softmax with Random Forest) N/M Standard Standard N/M N/M N/M N/M N/M N/M No N/M Inter Yes Yes (no detail) Train: 50% Test: 50% Accuracy accuracy NVIDIA GeForce GTX 750 N/M FES: 96.7% CHR: 81.6% HC: 99.2% ANNV, RNNV, CNNV, ANNV+mSVM, RNN+mSVM, CNN+mSVM, ANN+RF, RNN+RF, CNN+RF DL No No No "In conclusion, we have shown that CNNV-RF performs better than softmax and CNNV-mSVM on a well-known dataset (mnist) and resting state EEG streams used in this paper. Switching from softmax or mSVM to RF is incredibly simple and appears to be helpful for classification problems." N/M No N/A No Yannick Roy Isabela Albuquerque TBC Chu2017
57 An EEG-based Image Annotation System 2018 Parekh, Subramanian, Roy & Jawahar National Conference on Computer Vision, Pattern Recognition, Image Processing, and Graphics Yes Conference IIIT Hyderabad, India University of Glasgow, Singapore National Brain Research Centre, Manesar, India India 11 Classification of EEG signals BCI Reactive RSVP Novel Approach: Image annotation based on the subject's P300 Using CNN (EEGNet) to classify images based on P300. RSVP with Oddball. RSVP. Images from Caltech101 and VOC 2012. Oddball Paradigm for P300 Not mentioned why DL... EPOC (Emotiv) RSVP P300 Internal Recordings Private 5 subjects x 3 sessions x 2 (Test/Train) x 6 min 25 blocks of 100 images (2x test/train) x 100ms/image (1s windows) 7500 125 5 14 128 1) Baseline power removal using the 0.5 second pre-stimulus samples 2) Band-Pass filter: 0.1 - 45 Hz 3) ICA to remove artifacts (eye-blinks, and eye and muscle movements) Yes Yes Yes P300 Other N/A Braindecode CNN (EEGNet) CNN They add an Outlier Removal "Feature": a pre-trained VGG-16 applied to the predicted target images (to reduce false positives due to class imbalance) see EEGNet & Braindecode No 1s Windows (Raw EEG) 3 3 ELU N/M (see Braindecode / EEGNet) N/M 2 Target / Non-Target N/M N/M N/M Adam Adam N/M N/M N/M N/M N/M Categorical Cross-Entropy Inter 5-Fold CV k-fold Train: 2500 images / subject Test: 2500 images / subject F1-Score (Due to a heavy class imbalance between T/non-T, we use F1-score) f1-score NVIDIA GEFORCE GTX 1080 Ti N/M [DS: CT101] Before outlier removal: F1: 0.71 Precision: 0.66 Recall: 0.81 [DS: CT101] After outlier removal: F1: 0.68 Precision: 0.63 Recall: 0.72 [DS: VOC2012] Before outlier removal: F1: 0.88 Precision: 0.99 Recall: 0.81 [DS: VOC2012] After outlier removal: F1: 0.83 Precision: 0.97 Recall: 0.72 None None No No No Our annotation system exclusively relies on the P300 ERP signature, which is elicited upon the viewer detecting a pre-specified object class in the displayed image. A further outlier removal procedure based on binary feature-based clustering significantly improves annotation performance. N/M No N/A No Yannick Roy TBR TBC Parekh2018
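A minimal reading of the "baseline power removal" step above: epoch 1-s post-stimulus windows and subtract the mean of the 0.5-s pre-stimulus baseline (the stimulus onsets and the mean-subtraction interpretation are assumptions, not the paper's exact procedure):

```python
import numpy as np

def epoch_with_baseline(eeg, onsets, fs=128, pre=0.5, post=1.0):
    """Cut post-stimulus epochs and subtract the pre-stimulus baseline.
    eeg: (channels, samples); onsets: stimulus sample indices."""
    n_pre, n_post = int(pre * fs), int(post * fs)
    epochs = []
    for t in onsets:
        baseline = eeg[:, t - n_pre:t].mean(axis=1, keepdims=True)
        epochs.append(eeg[:, t:t + n_post] - baseline)
    return np.stack(epochs)               # (n_trials, channels, samples)

eeg = np.random.randn(14, 128 * 300)      # 14 Emotiv channels, 5 min
onsets = np.arange(128, 128 * 289, 128)   # hypothetical stimulus train
print(epoch_with_baseline(eeg, onsets).shape)  # (288, 14, 128)
```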
58 EEGNet: A Compact Convolutional Neural Network for EEG-based Brain-Computer Interfaces 2018 Lawhern, Solon, Waytowich, Gordon, Hung & Lance Journal of Neural Engineering Yes Journal U.S. Army Lab, DCS Corporation, Columbia University, Georgetown University Medical Center USA 17 Classification of EEG signals BCI Active & Reactive MI & ERP Novel Approach: DNN that can be used for different BCI paradigms Compare EEGNet with SOTA ML for different BCI Paradigms Visual P300 ERN Movement-related cortical potentials Sensory Motor Rhythms Allows robust feature extraction ActiveTwo (BioSemi), N/M, ActiveTwo (BioSemi), N/M 1) P300 2) ERN 3) Movement-related cortical potentials 4) SMR Internal Recordings; Kaggle: Inria BCI challenge; Internal Recordings; BCI Competition IV - IIa Both P300: 15 subjects x 2000 trials [1s windows] ERN: 26 subjects x 340 trials [1.25s windows] MRCP: 13 subjects x 1100 trials [1.5s windows] SMR: 9 subjects x 288 trials [3s windows] 30000; 8840; 14300; 2592 500; 184.2; 357.5; 129.6 15; 26; 13; 9 64; 56; 64; 22 512; 600; 1024; 250 Offline 1) Rereferencing (linked mastoids or earlobes) 2) Bandpass filter (1-40 Hz, 0.1-40 Hz or 4-40 Hz) 3) Downsampled to 128 Hz (** Different approaches! e.g. Used PREP Pipeline for #3) Yes No No Raw EEG Raw EEG Exponential moving average Keras + Tensorflow CNN CNN Layer 1: 1D Temporal Filters Layer 2: Depthwise 2D Conv Layer 3: Separable 2D Conv 1D temp. conv. at L1 to learn frequency filters. Depthwise: Inspired in part by the Filter-Bank Common Spatial Pattern (FBCSP) algorithm. Separable: explicitly decoupling the relationship within and across feature maps by first learning a kernel summarizing each feature map individually, then optimally merging the outputs afterwards Yes Channels x Time 3 3 ELU Dropout, weight decay Yes P300: 2 ERN: 2 MRCP: 2 SMR: 4 P300: Target/Non-target ERN: Error/No error MRCP: Left/Right hand SMR: left hand/right hand/feet/tongue (depends on the task) (softmax) 1) 1,066 2) 1,082 3) 1,098 4) 796 Within-Subject and Cross-Subject We apply class weights to the loss function whenever the data is unbalanced Standard Adam Adam N/M 64 N/M N/M N/M Categorical cross-entropy + Class weight if unbalanced Both Intrasubject: 4-Fold CV Intersubject: Leave some subjects out (different nb folds and ratio for each per task) k-fold; Leave-N-Subjects-Out [Intra] Train: 50% [Intra] Valid: 25% [Intra] Test: 25% [Inter] Different ratios of subjects for training, for validation and for test Accuracy ROC AUC accuracy, ROC AUC NVidia Quadro M6000 N/M See paper for full breakdown. TLDR; Doesn't outperform or underperform anything by a lot. (however, it uses two orders of magnitude fewer parameters) DeepConvNet (Schirrmeister, 2017) ShallowConvNet (Schirrmeister, 2017) Riemannian EEG (Barachant, 2015) FBCSP DL & Trad. Repeated-measures ANOVA 1) Summarizing averaged outputs of hidden unit activations. 2) Visualizing the convolutional kernel weights. 3) Calculating single-trial feature relevance on the classification decision Also used DeepLIFT (Shrikumar 2017) Analysis of activations, Analysis of weights, Ablation of filters, DeepLIFT In this work we proposed EEGNet, a compact convolutional neural network for EEG-based BCIs that can generalize across different BCI paradigms (e.g. ERP and oscillatory-based) in the presence of limited data and can produce interpretable features. 
To the best of our knowledge, this represents the first work that has validated the use of a single network architecture across multiple BCI datasets, each with their own feature characteristics and data set sizes. Through the use of feature visualization and ablation analysis, we show that neurophysiologically interpretable features can be extracted from the EEGNet model N/M Yes GitHub No Yannick Roy Hubert Banville Yes Lawhern2018
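A condensed Keras sketch of the temporal -> depthwise-spatial -> separable convolution pattern summarized above; the (F1, D, F2) names follow common EEGNet descriptions, and kernel/pooling sizes are placeholders (the published code linked from the row is authoritative):

```python
import tensorflow as tf
from tensorflow.keras import layers

def eegnet_like(n_channels=64, n_samples=128, n_classes=2, F1=8, D=2, F2=16):
    inp = layers.Input(shape=(n_channels, n_samples, 1))
    # Block 1: temporal filters, then a depthwise conv acting as
    # per-filter spatial filters across electrodes (FBCSP-inspired).
    x = layers.Conv2D(F1, (1, 64), padding="same", use_bias=False)(inp)
    x = layers.BatchNormalization()(x)
    x = layers.DepthwiseConv2D((n_channels, 1), depth_multiplier=D,
                               use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("elu")(x)
    x = layers.AveragePooling2D((1, 4))(x)
    x = layers.Dropout(0.25)(x)
    # Block 2: separable conv decouples within-map and across-map mixing.
    x = layers.SeparableConv2D(F2, (1, 16), padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("elu")(x)
    x = layers.AveragePooling2D((1, 8))(x)
    x = layers.Dropout(0.25)(x)
    x = layers.Flatten()(x)
    out = layers.Dense(n_classes, activation="softmax")(x)
    return tf.keras.Model(inp, out)

model = eegnet_like()
model.compile(optimizer="adam", loss="categorical_crossentropy")
```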
59 A deep learning architecture for temporal sleep stage classification using multivariate and multimodal time series 2018 Chambon, Galtier, Arnal, Wainrib & Gramfort IEEE Transactions on Neural Systems and Rehabilitation Engineering Yes Journal Telecom ParisTech, Inria, Université Paris-Saclay France 12 Classification of EEG signals Clinical Sleep Staging Improve State-of-the-Art Sleep N/M Sleep events MASS Public MASS (61 out of 62) 61 nights (8h) each from diff subjects (30s windows, no overlap) 58560 29280 61 20 128 1) Low-pass @30Hz Yes N/M N/M Raw EEG + EOG and raw EMG Raw EEG z-score Keras + Tensorflow ConvNet CNN 3 conv layers + dense (per modality) Layer 1: spatial filter Layers 2, 3: temporal filters Yes Nb channels * 30 s 4 4 Linear, ReLU, Softmax Dropout: 25% (last layer) Yes 5 <10^5 1) Training on a single 30-s epoch 2) Freezing net, and training last layer on multiple epochs Pre-training Adam Adam Random searches with the hyperopt Python package Yes No Categorical cross-entropy Inter Leave-p-subjects-out, 5 random permutations Leave-N-Subjects-Out Train: 41 records Valid: 10 records Test: 10 records Balanced accuracy F1-score, Precision, Sensitivity, Specificity, Confusion matrix balanced accuracy, f1-score, precision, sensitivity, specificity, confusion matrix N/M* ~250 s Acc: ~80% Bal. acc.: ~80% Kappa: ~0.7 F1 score: ~0.71 Gradient boosting on time domain and freq. domain features Univariate ConvNets from Tsinalis et al. (2016) and Supratak et al. (2017) DL & Trad. No Occlusion sensitivity Occlusion of input 1D convolutions provided a speed-up vs. 2D convolutions Smaller number of parameters than other studies Temporal context helps for some classes, but not for others; recurrent architectures could help Size of dataset matters No N/A Yes Hubert Banville TBR TBC Chambon2018
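A hedged sketch of the two-step procedure above: train a per-epoch network, then freeze it and retrain only the classification layer on a window of neighbouring epochs (layer sizes here are illustrative, not the paper's):

```python
import tensorflow as tf
from tensorflow.keras import layers

# Step 1 (assumed already trained): maps one 30-s epoch to features.
base = tf.keras.Sequential([
    layers.Input(shape=(20, 30 * 128, 1)),          # channels x samples
    layers.Conv2D(8, (20, 1)),                      # spatial filter
    layers.Conv2D(8, (1, 64), activation="relu"),   # temporal filters
    layers.Conv2D(8, (1, 64), activation="relu"),
    layers.GlobalAveragePooling2D(),
])
base.trainable = False                              # step 2: freeze the net

# Aggregate 3 consecutive epochs; only the softmax layer trains.
ctx = tf.keras.Sequential([
    layers.Input(shape=(3, 20, 30 * 128, 1)),
    layers.TimeDistributed(base),
    layers.Flatten(),
    layers.Dropout(0.25),                           # 25% on the last layer
    layers.Dense(5, activation="softmax"),          # 5 sleep stages
])
ctx.compile(optimizer="adam", loss="categorical_crossentropy")
```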
60 Converting Your Thoughts to Texts: Enabling Brain Typing via Deep Feature Learning of EEG Signals 2018 Zhang, Yao, Sheng, Kanhere, Gu, Zhang IEEE International Conference on Pervasive Computing and Communications (PerCom) Yes Conference University of New South Wales Macquarie University RMIT University Australia 10 Classification of EEG signals BCI Active Motor imagery Improve SOTA Joint CNN & LSTM + AE for Motor Imagery (5 classes) Motor Imagery (5 classes) (see eegmmidb dataset) EEG processing is time-consuming and depends on human expertise. SOTA models achieve 70-80%, which is not enough. N/M, EPOC (Emotiv) Motor Imagery eegmmidb; Internal Recordings Public 1) eegmmidb: 28,000 samples x 10 subjects (28000 points @ 160Hz = 175s/subject) 2) Internal (Emotiv): 34,560 samples x 7 subjects (34560 points @ 128Hz = 270s/subject) (window length = 1 point) 28000; 34560 29.2; 31.5 10; 7 64; 14 160; 128 Offline and Online N/M N/M No No Raw EEG (None) Raw EEG N/M N/M CNN + LSTM + linear AE + XGB (classification) Other CNN & LSTM are parallel, then combined for the AE, then XGB classifier CNN for Spatial and RNN for Sequential info Yes 1 x 64 (sample x channels) LSTM: 6 layers CNN: 2 Conv + 2 FC 6 ReLU, Sigmoid, tanh L2 Yes 5 eegmmidb: eye closed, left hand, right hand, both hands, both feet emotiv: up arrow, down arrow, left arrow, right arrow, eye closed 5 N/M Standard optimization Standard LSTM & CNN: Adam AE: RMSProp Adam Full table on optim params 7000 N/M (they seem to have tried many configurations manually) N/M N/M LSTM + CNN: Cross-Entropy AE: MSE Intra No No Train: 75% Test: 25% accuracy, precision, recall, F1 score, ROC curve, and ROC AUC accuracy, precision, recall, f1-score, ROC, ROC AUC N/M 2000 s DS #1 - Accuracy: 0.955 DS #2 - Accuracy: 0.9427 Baselines: KNN, SVM, RF, LDA, AdaBoost, RNN, CNN Externals: Almoari, Sun, Mohammad, Major, Shenoy, Tonic, Rashid, Ward, Sita, Pinheiro. (all different papers, see Table IV) DL & Trad. No No No The classification accuracy on the public dataset (eegmmidb) is consistently higher than on the local real-world dataset (emotiv). Our future work will focus on improving the accuracy in the person-independent scenario, wherein some subjects participate in the training and the rest of the subjects are involved in the testing. N/M Yes GitHub No Yannick Roy Hubert Banville Yes Zhang2017g
61 MindID: Person Identification from Brain Waves through Attention-based Recurrent Neural Network 2018 Zhang, Yao, Kanhere, Liu, Gu & Chen ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies Yes Conference University of New South Wales, Australia Tsinghua University RMIT University, Australia Australia 20 Classification of EEG signals Personal trait/attribute Person identification Improve SOTA: EEG for Person Identification Use RNN on EEG for Person Identification 3 Different Datasets. (they claim that Delta has the most personal info) The DL motivation is not clear. They want to improve SOTA. EPOC (Emotiv), N/M Delta Band* Internal Recordings; eegmmidb Both DS1 (EID-M): 21,000 Samples/Subject. Total: 168,000 DS2 (EID-S): 7,000 Samples/Subject. Total: 56,000 DS3 (EEG-S): 8 subjects x 7,000 samples (window length = 1 point) 168000; 56000; 56000 21.9; 7.3; 5.8 8; 8; 8 14; 14; 64 128; 128; 160 1) Remove DC Offset (subtract) 2) Band-Pass Filter: 0.5 - 4Hz (using only Delta) Yes No No Delta Band Frequency-domain z-score Matlab Attention-based Encoder-Decoder RNN + XGB Classifier RNN Encoder, Decoder, Attention Module + XGB Classifier N/M Yes 1x14 Delta Bands / Channel (not clear about the dimensionality) Encoder: 3 FC (164) + 1 LSTM (164) Decoder: 1 FC (164) 4 N/M L2 Yes 8 One-Hot Label (ID - 8 Subjects) N/M N/M N/M Adam Adam LR: 21,000 samples (?) N/M N/M N/M Cross-Entropy Inter No No Train / Test DS1: 147,000 / 21,000 DS2: 49,000 / 7,000 DS3: 49,000 / 7,000 Precision Recall F1-Score precision, recall, f1-score Nvidia Titan X Pascal 768 GB memory 145 TB PCIe SSD N/M Precision | Recall | F1-Score DS #1: 0.982 | 0.982 | 0.982 DS #2: 0.988 | 0.988 | 0.988 DS #3: 0.999 | 0.999 | 0.999 SVM, RF, KNN, AdaBoost, LDA, XGB, RNN DL & Trad. No No No Moreover, the pre-trained model should be updated periodically, since the user's EEG data gradually changes with environmental factors such as age, mental state, and living style. One of our future works is to develop an online learning system which automatically updates the training dataset based on the testing data collected during the operating period. Limited by the local experimental conditions, our study only gathered EEG data from 8 subjects with few trials. The dataset is only divided into two categories (Multi and Single), which is not enough to explore how identification accuracy changes as the number of trials increases. Yes GDrive Yes Yannick Roy TBR Yes Zhang2017e
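The delta-band preprocessing above amounts to DC-offset removal followed by a 0.5-4 Hz band-pass. A SciPy sketch (the filter order is an assumption; the row only gives the band):

```python
import numpy as np
from scipy.signal import butter, filtfilt

def delta_band(x, fs=128, low=0.5, high=4.0, order=4):
    """Remove the DC offset, then zero-phase band-pass to the delta band."""
    x = x - x.mean(axis=-1, keepdims=True)     # DC offset removal
    b, a = butter(order, [low, high], btype="bandpass", fs=fs)
    return filtfilt(b, a, x, axis=-1)

raw = np.random.randn(14, 128 * 60)            # 14 Emotiv channels, 60 s
print(delta_band(raw).shape)                   # (14, 7680)
```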
62
63 A convolutional neural network for steady state visual evoked potential classification under ambulatory environment 2017 Kwak, Muller & Lee PLOS One No Journal Korea University, TU Berlin South Korea Classification of EEG signals BCI Reactive SSVEP Improve SOTA Improve robustness of SSVEP BCIs for exoskeleton control in ambulatory conditions SSVEP - MOVE (BrainProducts) SSVEP Internal Recordings Private 2 datasets (50x5s + 250x5s) x 7 subjects 5s trial into 300 x 2s trials (2s sliding window, 10ms shift size) 630000 175 7 8 1000 Offline 1) Notch filter @60Hz 2) Band pass from 4-40 Hz Yes No No 120 FFT bins from 5-35 Hz Frequency-domain min-max N/M 1,2) CNN 3) MLP CNN 1) CNN (3 layers) 2) CNN (4 layers) 3) MLP (3 layers) First conv layer: spatial filter Second conv layer: spectral filter Yes 120 x 8 Freq x channels 1) 3 2) 4 3) 3 4 Sigmoid N/M N/M 5 Walk Forward Turn Left Turn Right Stand Up Sit Down 5 N/M Standard optimization Standard SGD SGD Learning rate: 0.1 N/M N/M N/M Sliding Window (Shifted by [10-60ms] over 2s win) N/M Intra 10-Fold k-fold Train: 90% Test: 10% Chronological split Accuracy accuracy N/M N/M Static condition: up to 99.28% Ambulatory condition: up to 94.03% CCA, MSI, CCA + kNN Traditional pipeline Yes (not clear what method) Visualization of activations Analysis of activations CNN-1 (3 layers) was the most robust. Since architecture is pretty simple, no regularization is used. Artefacts in ambulatory settings No N/A No Hubert Banville Yannick Roy TBC kwak2017
64 Mental Tasks Classification using EEG signal, Discrete Wavelet Transform and Neural Network 2017 Padmanabh, Shastri & Biradar Discovery No Journal Savitribai Phule Pune University India Classification of EEG signals BCI Active Mental tasks [Classification of 5 different mental tasks, via Wavelet & ANNs (PNN & MLP)] 5 Mental Tasks (Baseline, Multiplication, Rotation, Counting, Letter composition) 7P511 (Grass Instruments) Keirn & Aunon (1989) Public 5 subjects x 5 tasks x 5 trials x 10s @250Hz (1s windows) 1250 20.8 5 6 250 1) Band-Pass filter: 0.1-100Hz Yes N/M N/M Frequency-domain MATLAB & NNtool MLP PNN FC No 200x1 2 (20; 15) 2 N/M N/M Standard N/M N/M Learning Rate: 0.9 N/M N/M No MSE N/M N/M No N/M accuracy accuracy N/M N/M MLP: 92% PNN: 100% None None No No No No N/A No Yannick Roy TBR TBC Padmanabh2017
65 Cross-session classification of mental workload levels using EEG and an adaptive deep learning model 2017 Yin & Zhang Biomedical Signal Processing and Control No Journal University of Shanghai for Science and Technology East China University of Science and Technology China Classification of EEG signals Monitoring Cognitive Mental workload New approach ACAMS (Automation-enhanced Cabin Air Management System) (Nihon Kohden) PSD Internal Recordings Private 7 subjects x (5min + 6x15min + 5min) x 2 sessions (2s windows, no overlap) 42000 1400 7 11 500 1) Low-Pass filter: 40Hz 2) ICA for EOG artifacts Yes Yes Yes PSD Avg. Power: T (5–7.5 Hz), A (8–13.5 Hz), B1 (14–20 Hz), B2 (20.5–30 Hz), G (30.5–40 Hz) Frequency-domain MATLAB SDAE AE Adaptive Stacked Denoising AutoEncoder Yes 55x1 EEG PSD Features 6 6 N/M N/M 2 N/M* Other N/M N/M Grid search Yes Gaussian noise on Freq features N/M* Intra N/M No Train: 66% Test: 33% accuracy, confusion matrix, sensitivity, specificity accuracy, confusion matrix, sensitivity, specificity N/M* [Complicated ... read me again ...] SDAE > State of the art. ANN, NB, kNN, SVMlin, SVMrbf, BSV, SDAE DL & Trad. Wilcoxon sign-rank test 3D scatter plots of layer activations Analysis of activations It is evident that the proposed method is superior to those shallow and static classifiers when the comprehensive cortical information is adopted as the network inputs. No N/A No Yannick Roy TBR TBC Yin2017a
66 Generative Adversarial Networks Conditioned by Brain Signals 2017 Palazzo, Spampinato, Kavasidis, Giordano & Shah ICCV No Conference University of Catania, University of Central Florida Italy 9 Generation of data Generating images conditioned on EEG New approach: generating images conditioned on EEG Generating images using GANs conditioned by EEG representation Visual presentation of images Allows image generation BrainAmp (BrainProducts) Raw EEG Internal Recordings Private 6 Subjects x 50 images x 40 classes 1400s per subject (4 sessions of 350s) 2000 images x 6 subjects = 12000 minus exclusions = 11,466 valid samples 11466 140 6 128 1000 1) Hardware notch filter: 49-51 Hz 2) Band-pass filter: 14-70 Hz 3) Non-uniform quantization of the voltage values Yes N/M N/M Raw EEG Raw EEG N/M N/M* 1) LSTM for EEG encoder 2) DCGAN for image generation Other Conditional DCGAN (conditioning G and D) N/M Yes Nb channels * 0.5 s 1) 2 2) 5 (generator), 6 (discriminator) 6 1) ReLU 2) ReLU N/M N/M 1) 40 2) 64 x 64 N/M 1) Train encoder to predict image category from raw EEG 2a) Train GAN on images without EEG features 2b) Train GAN conditioned on average (across subs) EEG representation learned by the encoder Other Adam (lr=0.001) Adam 1) N/M 2) Batch normalization 1) 16 2) N/M N/M N/M 1) Nothing on EEG 2) Resizing + Flipping on Images 1) categorical cross-entropy. 2) non-saturating Inter N/M; [TBD] No Train: 80% Valid: 10% Test: 10% 1) Accuracy 2) Inception score, Inception accuracy accuracy, inception score, inception accuracy 2 Titan X Pascal N/M Encoder: 83.9% GAN: IS: 4-6.5, acc: 43% No None No No No Conditioning vectors (i.e. EEG representations) are noisy, which makes it harder to learn an appropriate conditioning vector. Suffers from classes with high internal variability Dataset is small No N/A No Isabela Albuquerque Hubert Banville TBC Palazzo2017
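A rough sketch of the conditioning mechanism this row records: a DCGAN-style generator receives a noise vector concatenated with the EEG embedding produced by the LSTM encoder. All sizes (noise_dim, eeg_dim, filter counts) are assumptions for illustration; the discriminator would be conditioned analogously.

```python
from tensorflow.keras import layers, models

def conditional_generator(noise_dim=100, eeg_dim=128):
    noise = layers.Input(shape=(noise_dim,))
    eeg_embedding = layers.Input(shape=(eeg_dim,))    # from the LSTM EEG encoder
    x = layers.Concatenate()([noise, eeg_embedding])  # condition on the EEG code
    x = layers.Dense(4 * 4 * 256)(x)
    x = layers.Reshape((4, 4, 256))(x)
    for n_filters in (128, 64, 32, 3):                # 4 -> 8 -> 16 -> 32 -> 64 px
        x = layers.Conv2DTranspose(n_filters, 4, strides=2, padding="same")(x)
        if n_filters != 3:
            x = layers.BatchNormalization()(x)
            x = layers.ReLU()(x)
    out = layers.Activation("tanh")(x)                # 64x64 RGB image in [-1, 1]
    return models.Model([noise, eeg_embedding], out)
```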
67 The effects of pre-filtering and individualizing components for electroencephalography neural network classification 2017 Major & Conrad IEEE SoutheastCon No Conference University of North Carolina (Charlotte) USA 6 Classification of EEG signals BCI Active Motor imagery Improve State-of-the-Art: Exploring impact of ICA preprocessing Analyze effectiveness of using ICA to enhance EEG that will be processed by a neural network Motor imagery "Since every brain computer interface (BCI) has to be tailored for each person it is advantageous to use a neural network" N/M Raw EEG eegmmidb Public 109 subjects x 14 experiments (12x2min + 2x1min) Not clear how many samples they used... (??s windows) N/M 2834 109 64 160 Offline 1) Band pass filter: 8-30Hz Yes Yes Yes Raw EEG Raw EEG N/M Matlab MLP FC N/M N/M Yes 16x ? 10 10 N/M N/M N/M 2 Left Grasp Right Grasp 2 N/M N/M N/M N/M N/M N/M N/M N/M N/M N/M N/M Inter No No Train: 2/3 Test: 1/3 Accuracy accuracy N/M N/M With ICA: 68% Without ICA: 56% No None No No No Applying ICA to raw data improves the neural network performance. N/M No N/A No Isabela Albuquerque Yannick Roy TBC Major2017
68 Convolutional neural network-based transfer learning and knowledge distillation using multi-subject data in motor imagery BCI 2017 Sakhavi & Guan IEEE Conference on Neural Engineering No Conference NUS & NTU Singapore 4 Classification of EEG signals BCI Active Motor imagery Transfer learning (from one subject to another) Reduce calibration time in a BCI using transfer learning Motor imagery Reduce BCI's calibration time N/M Raw EEG BCI Competition IV - IIa Public BCI competition IV-2a dataset (4 out of 9 subjects) x 4 classes x 72 samples x 2 session x 4 seconds. After removing data: ∼ 1000 per class per session (4s windows, no overlap) 8000 153.6 9 22 250 1) Bandpass between 0.5-100 Hz 2) Notch filter @50 Hz Yes N/M N/M FBCSP in 9 frequency bands, then extracting envelope Frequency-domain weird z-scoring Torch7 CNN + MLP CNN CNN: 5 layers MLP: 1 layer - Yes CNN: 32x40 MLP: 32 CNN: 4 conv, 1 FC MLP: 1 FC 5 ReLU N/M N/M CNN: 128 MLP: 128 N/M 1) Pre-train CNN+MLP on N-1 subjects 2) Fine-tune pre-trained network on 1 subject Pre-training Adam Adam N/M N/M N/M N/M N/M KL divergence Inter Leave-N-Samples-Out Leave-N-Samples-Out Train: 5, 10, 20 samples / class Test: Remaining Test set accuracy accuracy N/M N/M Average acc: 69.71% SVM Traditional pipeline Wilcoxon sign-rank test No No Best results (average across subjects) show significant improvement with respect to SVM. However, there is high variability Choosing hyperparameter lambda No N/A No Hubert Banville Isabela Albuquerque TBC Sakhavi2017
69 Single-trial EEG classification of motor imagery using deep convolutional neural networks 2017 Tang, Li & Sun Optik - International Journal for Light and Electron Optics No Journal Zhejiang University of Technology China 8 Classification of EEG signals BCI Active Motor imagery New Approach CNN for MI on Single Trial Motor Imagery Automated feature extraction ActiveTwo (BioSemi) SMR - ERD/ERS Internal Recordings Private 2 subjects x 460 trials 3s epochs, split into 50ms windows (50ms windows) 55200 46 2 28 1000 Offline 1) [Hardware] Notch Filter: 50Hz 2) [Hardware] Band-Pass Filter: 0.5-100Hz 3) [Software] Band-Pass Filter: 8-30Hz Yes No No SMR - ERD/ERS Frequency-domain N/M N/M CNN CNN Activation Function: Hyperbolic Tangent N/M Yes 28x60 Channels x Time Points 2 Conv 1 FC 3 Tanh Sigmoid N/M N/M 2 Left hand Right hand 2 N/M Standard Standard GD SGD N/M N/M N/M N/M No N/M* Intra 10-Fold CV k-fold Train: 80% Test: 20% confusion matrix, accuracy, ROC, precision, recall, f-score confusion matrix, accuracy, ROC, precision, recall, f-score N/M N/M Accuracy: 86.41% Power+SVM CSP+SVM AR+SVM Traditional pipeline ANOVA No No "The results demonstrate that CNN can further improve classification performance compared with other three conventional methods." N/M No N/A No Yannick Roy Isabela Albuquerque TBC Tang2017
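A small sketch of the row's single-trial MI CNN, assuming a spatial convolution over all 28 channels followed by a temporal convolution (kernel lengths and filter counts are guesses); the tanh activations, two-unit sigmoid output, and (S)GD optimizer follow the row, and the MSE loss is an assumption since the paper's loss is not mentioned.

```python
from tensorflow.keras import layers, models, optimizers

model = models.Sequential([
    layers.Input(shape=(28, 60, 1)),                # channels x time points
    layers.Conv2D(20, (28, 1), activation="tanh"),  # spatial filtering over channels
    layers.Conv2D(20, (1, 10), activation="tanh"),  # temporal filtering
    layers.Flatten(),
    layers.Dense(2, activation="sigmoid"),          # left hand vs right hand
])
# Loss assumed (N/M in the row); only the optimizer family is reported.
model.compile(optimizer=optimizers.SGD(), loss="mse", metrics=["accuracy"])
```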
70 Pattern Recognition of Momentary Mental Workload Based on Multi-Channel Electrophysiological Data and Ensemble Convolutional Neural Networks 2017 Zhang, Li & Wang Frontiers in Neuroscience No Journal East China University of Science and Technology China 16 Classification of EEG signals Monitoring Cognitive Mental workload Improve State-of-the-Art MWL classification with CNN & ECNN ACAMS (Automation-enhanced Cabin Air Management System) N/M (Nihon Kohden) PSD Internal Recordings Private 6 subjects x 2 sessions x 10 tasks x 5 min (2s windows) 18000 600 6 10 500 Offline 1) Low-Pass filter: 40Hz Yes No No PSD (STFT) Avg. Power: D (1-4Hz), T (5–8 Hz), A (9–13 Hz), B1 (14–16 Hz), B2 (17–30 Hz), G (31–40 Hz) Frequency-domain N/M Python Matlab CNN ECNN CNN Many architectures tested N/M Yes 102x10 (not clear what x what) [2, 10] (tested many) 10 ReLU N/M N/M 4 and 7 Low/Normal/High/ - Unloaded - Very Low/Low/Medium/High/Very High/Overloaded 4 and 7 N/M N/M N/M Nesterov Momentum Adagrad Adadelta Adam Adam (see paper, they describe each optimizer params) N/M N/M N/M N/M Cross-Entropy Inter 5-Fold CV k-fold Train: 50% Test: 50% Accuracy Precision F-Measure G-Measure accuracy, precision, f-measure, g-measure Single Intel core i5 CPU, 4-GB memory, Windows N/M 93% LDA NB SDA Traditional pipeline No No No "It was found that the deeper CNN model with the small convolutional kernels leads to improved classification performance." [YR] --> Like in other fields... N/M No N/A No Yannick Roy TBR TBC Zhang2017
71 Deep RNN learning for EEG based functional brain state inference 2017 Patnaik, Moharkar & Chaudhari International Conference on Advances in Computing, Communication and Control (ICAC3) No Conference Xavier Institute of Engineering, Mahim, Mumbai M G M Inst. of Health Sciences, Navi Mumbai India Classification of EEG signals BCI Active Mental tasks New Approach: Brain State Inference with RNN using Alpha Phase Coherence 5 Tasks: Baseline, Multiplications, Rotations, Letter Composition, Visual Counting (not using baseline) N/M ERD/ERS (looking at Alpha Cross Coherence - Occipital/Center) Keirn & Aunon (1989) Public DB: 7 subjects x 10 sessions x 5 tasks x 10s Then they say they used 65 instances for 4 activities. Not clear... (Sliding window of 50 samples) N/M 33 7 6 250 1) Band-Pass Filter: 0.1-100Hz (Hardware) 2) ICA for EOG Artifacts 3) DWT to get Alpha Sub-Bands 4) Hilbert Transform (no-overlap) for Phase Coherence Yes Yes Yes Alpha Sub-Bands Phase Coherence Frequency-domain N/M* Elman's RNN with Bottleneck RNN A 5-layer network (53-400-50-200-20-T) Yes [Shape Not Mentioned] 5 5 N/M N/M 4 Classes N/M* Standard N/M* N/M N/M N/M No MSE Inter No No Train: 40 instances (/65) Test: 25 instances (/65) Accuracy accuracy N/M* 90% for two tasks 82% for three tasks 77% for all four tasks No None No No No "In this research, a RNN model is trained to identify the phase coherence patterns of EEG alpha-bands. Difference between EEG signals from central and occipital (C1-O1 & C2-O2) locations is considered to compute phase coherence patterns for various activities." No N/A No Yannick Roy TBR TBC Patnaik2017
72 Deep Convolutional Neural Networks for Interpretable Analysis of EEG Sleep Stage Scoring 2017 Vilamala, Madsen & Hansen IEEE International Workshop on Machine Learning for Signal Processing Yes Conference Technical University of Denmark Danish Research Centre for Magnetic Resonance Denmark Classification of EEG signals Clinical Sleep Staging New Approach: CNN for Sleep Stages Sleep N/M PSD Sleep EDF Public SleepEDF 2 whole nights x 20 subjects (2 x ~10h x 20) 30s windows 48000 24000 20 1 100 Multitaper Spectral Estimation Yes N/M N/M Spectrogram log values (from Multitaper Spectral Estim.) Frequency-domain N/M* CNN CNN VGGNET Activation Function: ReLU & Softmax Xavier's initialisation. No 224x224 (RGB Image) 16 16 Dropout Yes 5 (Sleep Stage) N/M* Pre-training Adam Adam Learning Rate: 10^-5 Mini-batch: 250 Decay Rate 1st & 2nd moments 0.9 & 0.999 N/M N/M No Categorical cross-entropy Inter Leave-One-Subject-Out Leave-One-Subject-Out Train: 15 subjects Valid: 4 subjects Test: 1 subject Precision Sensitivity F1-score Accuracy precision, sensitivity, f1-score, accuracy N/M* [VGG-FE] Precision: 91, Sensitivity: 73, F1-S: 81, Accuracy: 83 [VGG-FT] Precision: 93, Sensitivity: 78, F1-S: 84, Accuracy: 86 SSAE, CNN DL No Sensitivity maps Saliency map Further improvement of the method includes better hyperparameter optimisation when generating the spectral images No N/A Yes Yannick Roy TBR TBC Vilamala2017
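A sketch of the VGG-FT (fine-tuned) variant recorded above: a VGG16 backbone applied to 224x224 multitaper spectrogram images with a 5-way sleep-stage head and Adam at 1e-5, as per the row; the use of ImageNet weights and the exact head layout are assumptions of this sketch.

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# Spectrograms are rendered as 224x224 RGB images, matching the row's input format.
base = VGG16(include_top=False, weights="imagenet", input_shape=(224, 224, 3))
x = layers.Flatten()(base.output)
x = layers.Dense(4096, activation="relu")(x)   # head layout assumed
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(5, activation="softmax")(x)  # 5 sleep stages
model = models.Model(base.input, outputs)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss="categorical_crossentropy", metrics=["accuracy"])
```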
73 Deep long short-term memory structures model temporal dependencies improving cognitive workload estimation 2017 Hefron, Borghetti, Christensen & Kabban Pattern Recognition Letters No Journal Air Force Institute Air Force Research Laboratory USA 9 Classification of EEG signals Monitoring Cognitive Mental workload Improve State-of-the-Art: MWL classification with RNNs (LSTM). Multi-Attribute Task Battery (MATB) environment Using deep RNNs to account for temporal dependence considerably improves day-to-day feature stationarity N/M PSD (Raw EEG) Internal Recordings Private 6 of 8 subjects x 5 sessions x 6 of 9 trials x 5 min This process yielded 380 features for each second and approximately 9000 observations per individual for the five day period. (10s sliding windows, 9s overlap) 54000 900 6 19 256 The power spectral density was determined for 30 points spread out over a logspace from 3 Hz to 55 Hz by extracting power from complex Morlet wavelets [9]. Each wavelet was 2 s in length Yes N/M N/M Mean, Variance, Skewness, Kurtosis of PSD (delta (1–4), theta (4–8), alpha (8–14), beta (15–30), and gamma (30–55)) + all possible combinations of M, V, S, K. Frequency-domain Keras, Theano LSTM RNN N/M* Yes 600 x 30 x F (batch size, temporal depth in seconds, and number of features) (F varies between 90 and 380 features) 2 LSTM Layers (50 and 10 units) 2 Dropout Yes 1 (low or high WL) N/M* Standard Mini-batch gradient descent (600 obs. per batch) Adam, Dropout 20% Adam 600 Random search Yes No Binary Cross-Entropy Intra 4-Fold CV k-fold Train: 3 days Valid: 1 day Test: 1 day Accuracy accuracy N/M* 93% (using all measures: M/V/S/K) linear SVM (SVM-L), Radial Basis Function (RBF) SVM (SVM-R), feedforward ANN (ANN), deeply stacked simple RNN (RNN-D), single LSTM (LSTM-S), and deeply stacked LSTM (LSTM-D) DL & Trad. ANOVA, Tukey HSD No No There is an abundance of future work to be pursued in this area. Due to time constraints and computational complexity, only a select number of deep architectures were examined during this research. A thorough evaluation of different deep RNN architectures to include variations in the depth of hidden layer recurrent connections, stacking of different sized LSTM layers, and interleaving fully-connected feedforward layers between sequence-to-sequence recurrent layers may yield additional improvement. No N/A No Yannick Roy TBR TBC Hefron2017
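A minimal Keras sketch of the stacked-LSTM workload estimator recorded above: two LSTM layers (50 and 10 units) over 30 one-second steps of F spectral features and a sigmoid output for low/high workload, with 20% dropout, Adam, and binary cross-entropy as in the row; the placement of the dropout inside the LSTM layers is an assumption.

```python
from tensorflow.keras import layers, models, optimizers

def build_lstm(n_steps=30, n_features=380):
    # F ranges from 90 to 380 depending on which moment features are included.
    model = models.Sequential([
        layers.Input(shape=(n_steps, n_features)),
        layers.LSTM(50, return_sequences=True, dropout=0.2),
        layers.LSTM(10, dropout=0.2),
        layers.Dense(1, activation="sigmoid"),   # low vs high workload
    ])
    model.compile(optimizer=optimizers.Adam(),
                  loss="binary_crossentropy", metrics=["accuracy"])
    return model
```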
74 The signature of robot action success in EEG signals of a human observer: Decoding and visualization using deep convolutional neural networks 2017 Behncke, Schirrmeister, Burgard & Ball Arxiv Yes Preprint Albert-Ludwigs-University Freiburg University Medical Center Freiburg Germany 6 Classification of EEG signals BCI Reactive ERP Novel Approach: DL for Robot Error Detection Comparing CNN to rLDA and FB-CSP (both state of the art) for error detection in human-robot interaction Participant watching short videos of robots "performing naturalistic actions either in a correct or an erroneous manner" Deep Learning has been tried for other EEG decoding tasks N/M Error Potential Internal Recordings Private 5 subjects x 720 trials + 12 subjects x 800 trials x ~20s/trials (using ~2.8s/trials) 13200 616 17 128 N/M Offline 1) Re-reference to common average (CAR) 2) Downsampled to 250 Hz Yes No No Raw EEG Raw EEG Electrode-wise exponential moving standardization Braindecode CNN CNN Deep ConvNet from braindecode paper Layer 1: temporal filtering, Layer 2: spatial filtering, with no non-linearity in-between (Braindecode) No Time x channels 5 5 ELU Dropout Early stopping Yes 2 Error No error 2 N/M Standard optimization Standard Adam Adam N/M N/M N/M N/M N/M Categorical cross-entropy Intra No No N/M Accuracy accuracy N/M N/M KPO Error (2.5-5s): (78.2 ± 8.4) % KPO Error (3.3-7.5s): (71.9 ± 7.6) % RGO Error (4.8-6.3s): (59.6 ± 6.4) % RGO Error (4-7s): (64.6 ± 6.1) % rLDA FB-CSP (CNN is better) Traditional pipeline Permutation test on individual decoding results Wilcoxon signed-rank tests Correlation of changes in ConvNet predictions with perturbation changes in 1) input spectral amplitudes and 2) time domain signals to obtain information about what the deep ConvNets learned from the data Input-perturbation network-prediction correlation maps "Among other recent advances in the field of deep learning research, automatic hyperparameter optimization and architecture search, including recurrent and residual network architectures, data augmentation, using 3-D convolutions, or increasing the amount of training data all have the potential to further increase ConvNet performance." N/M No N/A No Yannick Roy Hubert Banville TBC Behncke2017
75 Deep learning with convolutional neural networks for EEG decoding and visualization 2017 Schirrmeister, Springenberg, Fiederer, Glasstetter, Eggensperger, Tangermann, Hutter, Burgard, Ball Human Brain Mapping Yes Journal University of Freiburg Germany 30 Classification of EEG signals BCI Active Motor imagery Improve SOTA Feature visualization/interpretability Find out best CNN architecture for EEG decoding Motor imagery/execution Can learn from raw data N/M None BCI Competition IV - IIa; Internal Recordings; BCI Competition IV - IIb; Mixed Imagery Dataset Both DS #1 - BCI Comp IV - IIa: 9 * 2 * 288 = 5184 x 4s DS #2 - Internal Recordings: 14 * 1000 = 14000 x 4s DS #3 - BCI Competition IV - IIb: 9 * 720 = 6480 x 4s DS #4 - Mixed Imagery Dataset: 4009 trials / 37830 w (DS4: 2s window, 1.5s overlap) 5184; 14000; 6480; 37830 345.6; 933.33; 432; 267 9; 14; 9; 4 22; 44; 3; 64 250; 250; 250; 250 BCI Competition Datasets: 1) Lowpass @38 Hz Yes Yes (removed trials with at least one channel > 800 uV) Yes Raw EEG Raw EEG Electrode-wise exponential moving standardization Lasagne CNN CNN 1) Deep ConvNet 2) Shallow ConvNet 3) Hybrid of 1) and 2) with 2 dense layers 4) ResNet 1) Layer 1: temporal filtering, Layer 2: spatial filtering, with no non-linearity in-between 2) Embedding FBCSP in a ConvNet 3) Combining 1 and 2 4) 2 layers like in 1) Yes 1) 5 2) 2 3) max(2, 5) + 2 = 7 4) 31 31 1) ELU 2) Square, log 3) ELU, square & log 4) ELU Dropout (0.5) Early stopping Yes 2 or 4 N/M Standard optimization Standard Adam Adam Batch norm N/M N/M N/M Crops (sliding windows within 1 trial) Categorical cross-entropy For cropped training: "Tied loss function" Intra No No 1) 288 - 288 2) 880 - 160 3) 400 - 320 4) Variable per subject Accuracy Confusion matrices accuracy, confusion matrix GeForce GTX Titan Black Intel Xeon @2.60 GHz with 32 cores 128 GB RAM N/M Filter bank common spatial patterns Traditional pipeline Wilcoxon sign-rank test Input-feature unit-output correlation maps (visualization of correlation between spectral bands and receptive fields) Input-perturbation network-prediction correlation map (perturbing the input and visualizing change in output of net) Input-feature unit-output correlation maps, Input-perturbation network-prediction correlation maps ConvNets reached FBCSP accuracies ConvNet design choices substantially affect decoding accuracies Recent DL advances substantially increase accuracies ResNet performed worse than deep ConvNet Cropped training strategy improves performance on higher frequencies And much more! ConvNets can be too flexible, especially if there is a specific type of brain activity that a user should use Yes GitHub Yes Hubert Banville TBR TBC Schirrmeister2017
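For reference, a sketch of this paper's Shallow ConvNet idea (temporal convolution, spatial filtering across all electrodes, squaring, mean pooling, log), assuming the commonly described sizes (40 filters, length-25 temporal kernel, 75/15 pooling); the exact values should be checked against the paper or the authors' released code.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def shallow_convnet(n_channels=22, n_times=1000, n_classes=4):
    inputs = layers.Input(shape=(n_times, n_channels, 1))
    x = layers.Conv2D(40, (25, 1))(inputs)                      # temporal filtering
    x = layers.Conv2D(40, (1, n_channels), use_bias=False)(x)   # spatial filtering
    x = layers.BatchNormalization()(x)
    x = layers.Lambda(lambda t: tf.math.square(t))(x)           # squaring non-linearity
    x = layers.AveragePooling2D((75, 1), strides=(15, 1))(x)    # mean pooling
    # Log non-linearity; clipping avoids log(0) for numerical stability.
    x = layers.Lambda(lambda t: tf.math.log(tf.clip_by_value(t, 1e-6, 1e6)))(x)
    x = layers.Dropout(0.5)(x)
    x = layers.Flatten()(x)
    outputs = layers.Dense(n_classes, activation="softmax")(x)
    return models.Model(inputs, outputs)
```

Together, square -> mean pool -> log reproduce the log-bandpower computation of FBCSP inside a trainable network, which is why the row describes this variant as "Embedding FBCSP in a ConvNet".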
76 Optimal Feature Selection and Deep Learning Ensembles Method for Emotion Recognition From Human Brain EEG Sensors 2017 Mehmood, Du & Lee IEEE Access No Journal Chonbuk National University, Nanjing University of Posts and Telecommunications South Korea 10 Classification of EEG signals Monitoring Affective Emotion Improve SOTA Ensemble Method with DL and others to improve SOTA in EEG emotion classification Watching "Emotional" Images from IAPS database. Using ensemble approach. EPOC (Emotiv) Emotions Internal Recordings Private 21 subjects x 4 classes x 2 sessions x 45 trials 360 epochs, 368s / session (1.5s windows, no overlap) 7560 189 21 14 128 1) Artifact Removal (cites: Gómez-Herrero et al., 2006) 2) Filtering (cites: Widmann et al., 2012) 3) Epoching Yes Yes Yes Hjorth parameters for different frequency ranges + ANOVA feature selection Frequency-domain N/M EEGLAB Matlab WEKA "Deep Learning" (they don't even specify) N/M Ensemble: LDA, KNN, SVM, Naive/Bayes-Net, DT, RF, Deep Learning They don't describe the DL model at all N/M Yes 3 Hjorth params for each of the 5 frequencies (?) N/M N/M N/M N/M N/M N/M N/M Pre-Training and Fine Tuning Pre-training N/M N/M N/M N/M N/M N/M No N/M Inter 10-Fold CV k-fold Train: 90% Test: 10% Accuracy accuracy SOTA Server 4 TITAN-X (Pascal) N/M Accuracy: 76.62% Jirayucharoensak et al., 2014 (SAE): 46/50% Chanel et al., 2006 (FDA, Naive Bayes): 72% Khalili et al., 2008 (LDA, KNN): 61% Horlings et al., 2008 (SVM): 37/32% Jenke et al., 2014 (...): 45% Yin et al., 2017 (SAE, Ensemble): 84/83% Atkinson et al., 2016 (...): 73/73% DL & Trad. ANOVA No No Comparatively, the proposed method performs better than existing emotion recognition methods. The proposed feature selection method OF obtained the best emotion recognition rates of 76.6% for Voting ensembles method. Based on our results, we conclude that optimal feature selection is a good choice for enhancing the performance of EEG-based emotion recognition. To further improve emotion recognition performance, we need to explore additional feature combinations with more emotional classes in the arousal–valence domain. No N/A No Yannick Roy TBR TBC Mehmood2017
77 Emotion Recognition based on EEG using LSTM Recurrent Neural Network 2017 Alhagry, Fahmy & El-Khoribi International Journal of Advanced Computer Science and Applications (IJACSA) No Journal Cairo University Egypt Classification of EEG signals Monitoring Affective Emotion Improve SOTA: Using LSTM on raw EEG to classify emotions (arousal, valence, liking) Emotion Classification on DEAP (Like or Dislike video) N/M Raw EEG DEAP Public DEAP: 32 subjects x 40 x 1min 12x 5s windows per video (5s window, no overlap) 15360 1280 32 32 512 1) Downsampled to 128Hz (in the dataset) 2) Re-reference to Common Average (in the dataset) 3) Eye Artifacts Removed (in the dataset) 4) High-Pass Filter [freq not mentioned] Yes Yes Yes Raw EEG (None) Raw EEG Keras, TensorFlow LSTM RNN AF: ReLU and Sigmoid Yes 5s segments x 32 channels (672 x 32) 2 LSTM Layers (64,32) + 1 Dropout (0.2) + 1 FC 3 Dropout Yes 3 Classes 5534113 Standard RMSProp, LR:0.001 Other N/M N/M No N/M* Intra 4-Fold CV k-fold Train: 75% Test: 25% Average Accuracy accuracy N/M* Arousal: 85.65% Valence: 85.45% Liking: 87.99% Traditional pipelines Koelstra et al., [2]: 62 | 56 | 55 % Atkinson ... [3]: 73 | 73 | - % Yoon and Chung [6]: 70 | 70 | - % Naser and Saha [7]: 66 | 64 | 70 % proposed method: 86 | 85 | 88 % Traditional pipeline No No No Results show that the proposed method is a very promising choice for emotion recognition, because of its powerful ability to learn features from raw data directly. It achieves high average accuracy over participants compared to the traditional feature extraction techniques. No N/A No Yannick Roy TBR TBC Alhagry2017
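A Keras sketch matching the row above: LSTM(64) -> LSTM(32) -> Dropout(0.2) -> dense output, RMSProp with lr 0.001, on 5s raw-EEG windows of shape 672 x 32; training one sigmoid output per emotional dimension (arousal / valence / liking) is an assumption of this sketch.

```python
from tensorflow.keras import layers, models, optimizers

model = models.Sequential([
    layers.Input(shape=(672, 32)),           # 5 s raw-EEG window x 32 channels (per row)
    layers.LSTM(64, return_sequences=True),
    layers.LSTM(32),
    layers.Dropout(0.2),
    layers.Dense(1, activation="sigmoid"),   # e.g. high vs low arousal
])
model.compile(optimizer=optimizers.RMSprop(learning_rate=0.001),
              loss="binary_crossentropy", metrics=["accuracy"])
```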
78 Intent Recognition in Smart Living Through Deep Recurrent Neural Networks 2017 Zhang, Yao, Huang, Sheng & Wang International Conference on Neural Information Processing (ICONIP) Yes Conference University of New South Wales, AU Macquarie University, AU Singapore Management University, Singapore Australia 11 Classification of EEG signals BCI Active Motor imagery Improve SOTA Using LSTM on multiclass BCI open dataset Use hyperparameter fine-tuning method Motor Imagery (see eegmmidb dataset) Explore multiclass as opposed to binary classification like many others. BCI at home will be multiclass. N/M Intent / Motor Imagery eegmmidb Public eegmmidb: 10 subjects x 28,000 samples (28000 points @ 160Hz = 175s/subject) (window length = 1 point) 28000 29.2 10 64 160 None No N/M N/M Raw EEG (None) Raw EEG N/A N/M LSTM RNN N/A N/M Yes 1 x 64 (sample x channels) 5 5 Sigmoid L2 Yes 5 eegmmidb: eye closed, left hand, right hand, both hands, both feet emotiv: up arrow, down arrow, left arrow, right arrow, eye closed 5 N/M N/M N/M Adam Adam LR: 0.004 Lambda: 0.005 N/M Orthogonal Array (OA) experiment method Yes N/M Cross-Entropy Inter No No Train: 75% Test: 25% Accuracy Recall F1 Score ROC accuracy, recall, f1-score, ROC N/M N/M Accuracy: 0.9545 Recall: 0.9228 F1: 0.9382 AUC: 0.9985 Almoari [2] 0.7497, Sun [13] 0.65, Major [4] 0.68, Shenoy [12] 0.8206, Tolic [16] 0.6821, Ward [19] 0.8, Pinheiro [10] 0.8505 KNN (k=3) 0.8369, SVM 0.5082, RF 0.7739, LDA 0.5127, AdaBoost 0.3431, CNN 0.8409 DL & Trad. No No No To achieve optimal recognition accuracy, we employ OA to optimize the hyper-parameters. In this paper, we select the five most common hyper-parameters including λ (the coefficient of L2 norm), lr (learning rate), Ki (the hidden layer nodes size), I (the number of layers), and nb (the number of batches). N/A Yes Website No Yannick Roy TBR TBC Zhang2017d
79 Deep Recurrent Neural Networks for seizure detection and early seizure detection systems 2017 Talathi Arxiv Yes Preprint Lawrence Livermore National Lab USA Classification of EEG signals Clinical Epilepsy Detection Improve SOTA: Using RNN for early seizure detection Using GRU-RNN for early seizure detection Resting State, Eyes Open, Eyes Closed, Seizures. Using available data to test RNNs for seizure detection. N/M Seizures Bonn University Public Bonn University 5 x 100 x 23.6s 173.61 x 23.6 = 4097 --> 51 segments x 80 (0.46s windows) 25500 197 15 1 173.6 None (see dataset preprocessing steps) No N/M N/M Raw EEG (None) Raw EEG N/A Keras GRU (RNN) RNN GRU -> FC -> GRU GRUs capture long-term dependencies while controlling the vanishing gradient Yes 51 x 80 x 1 (51 EEG sub-segments x 80 values x 1 channel) GRU: 2 FC: 1 3 N/M N/M N/M 3 (Logistic Regression with Softmax) In the order of 100,000 (1) We train the RNN in stateful-mode*. (2) Rescaling the learning rate by factor 0.1 at each 100th epoch Standard Adam Adam LR: 0.01 N/M N/M N/M N/M N/M Inter No No Train: 50% Test: 50% Accuracy accuracy N/M N/M 98% Accuracy within the first 5 sec (3 classes: Healthy vs Ictal vs InterIctal) They mentioned (A. T. Tzallas et al., 2007) getting 98% accuracy (ANN). Traditional pipeline No No No These findings offer strong support for the utility of the GRU-RNN model for use in early-seizure detection systems, which can be extremely useful for developing closed loop seizure control systems where timely intervention can be leveraged to abate seizure progression No N/A No Yannick Roy TBR TBC Talathi2017
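A sketch of the row's GRU network (GRU -> FC -> GRU -> 3-way softmax) trained in stateful mode so hidden state carries across the 51 sub-segments of each recording; unit counts and the batch size are assumptions, while the Adam learning rate of 0.01 follows the row.

```python
from tensorflow.keras import layers, models, optimizers

batch_size = 32   # stateful RNNs in Keras need a fixed batch size (assumed value)
model = models.Sequential([
    layers.Input(shape=(80, 1), batch_size=batch_size),  # one 0.46 s sub-segment
    layers.GRU(64, stateful=True, return_sequences=True),
    layers.Dense(32, activation="relu"),                 # FC layer between the GRUs
    layers.GRU(32, stateful=True),
    layers.Dense(3, activation="softmax"),               # healthy / interictal / ictal
])
model.compile(optimizer=optimizers.Adam(learning_rate=0.01),
              loss="categorical_crossentropy", metrics=["accuracy"])
```

With stateful layers, `model.reset_states()` must be called between independent recordings so state does not leak across segments.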
80 DeepSleepNet: a Model for Automatic Sleep Stage Scoring based on Raw Single-Channel EEG 2017 Supratak, Dong, Wu & Guo Arxiv Yes Preprint Imperial College London UK 11 Classification of EEG signals Clinical Sleep Staging Improve SOTA: Using CNN+LSTM for Sleep Stage Scoring from Raw EEG Combining CNN + LSTM for Raw EEG and testing it on 2 different existing datasets Sleep Using RNN (LSTM) to capture time dependencies in sleep stages. N/M Sleep Stages MASS; Sleep EDF Public MASS: Used SS3, PSG recordings from 62 subjects Sleep EDF: Used 20 subjects (30s windows, no overlap) 58600; 41950 29300; 20975 62; 20 20; 2 256; 100 1) Notch filter: 60Hz 2) Band-pass filter: 0.30 - 100Hz Yes No No Raw EEG (None) Raw EEG N/M TensorLayer eTRIKS CNN + bi-LSTM CNN+RNN 1D Conv, Batch Norm, Max Pooling First part is representation learning, which can be trained to learn filters to extract time-invariant features from each raw single-channel EEG epoch. The second part is sequence residual learning, which can be trained to encode the temporal information. Yes 30s EEG Epoch (2 diff sampling freq) 2 CNN 2 bi-LSTM 4 ReLU L2 Dropout (50%) Yes 5 Sleep Stages (Softmax) N/M A two-step training algorithm (their technique) to prevent suffering from class imbalance. The algorithm first pre-trains the representation learning part of the model and then fine-tunes the whole model using two different learning rates. Pre-training Adam Adam LR: 0.0001 b1: 0.9 b2: 0.999 100 N/M N/M Oversampling to balance classes (duplicating minority sleep stages) Cross-Entropy Inter DS 1 - MASS) 31-Fold DS 2 - Sleep EDF) 20-Fold k-fold DS 1 Train: 60 subjects DS 1 Valid: 2 subjects DS 2 Train: 30 subjects DS 2 Valid: 1 subject Precision (PR) Recall (RE) F1-score (F1) macro-averaging F1-score (MF1) Accuracy (ACC) Cohen's Kappa coefficient (κ) precision, recall, f1-score, macro-averaging f1-score, accuracy, Cohen's kappa NVIDIA GeForce GTX980 The training time for each validation fold was approximately 3 hours on each node Sleep EDF - Acc: 82.0 Sleep EDF - MF1: 76.9 Sleep EDF - k: 0.76 MASS - Acc: 86.2 MASS - MF1: 81.7 MASS - k: 0.80 Traditional pipelines & DL Sleep EDF: Y.-L. Hsu et al., 2013 Sleep EDF: R. Sharma et al., 2017 Sleep EDF: A. R. Hassan et al., 2017 Sleep EDF: O. Tsinalis et al., 2016a Sleep EDF: O. Tsinalis et al., 2016b MASS: H. Dong et al., 2016 DL & Trad. No Visualization of filter activations Analysis of activations It achieved similar overall accuracy and macro F1-score compared to the state-of-the-art hand-engineering methods on both the MASS and Sleep-EDF datasets, which have different properties such as sampling rate and scoring standards (AASM and R&K). N/M Yes GitHub No Yannick Roy TBR TBC Supratak2017
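A highly simplified sketch of the DeepSleepNet idea: a 1D CNN learns per-epoch features from raw 30s single-channel EEG, and bidirectional LSTMs then model stage transitions across a sequence of epochs. Filter counts and sequence length are placeholders, and the paper's dual-resolution CNN branches and residual shortcut are omitted here for brevity.

```python
from tensorflow.keras import layers, models

def deepsleepnet_sketch(epoch_len=3000, seq_len=25, n_stages=5):
    # Part 1: representation learning on one 30 s epoch (100 Hz -> 3000 samples).
    epoch_in = layers.Input(shape=(epoch_len, 1))
    x = layers.Conv1D(64, 50, strides=6, activation="relu")(epoch_in)
    x = layers.BatchNormalization()(x)
    x = layers.MaxPooling1D(8)(x)
    x = layers.Conv1D(128, 8, activation="relu")(x)
    x = layers.GlobalAveragePooling1D()(x)
    epoch_encoder = models.Model(epoch_in, x)

    # Part 2: sequence learning over consecutive epochs (stage transitions).
    seq_in = layers.Input(shape=(seq_len, epoch_len, 1))
    feats = layers.TimeDistributed(epoch_encoder)(seq_in)
    h = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(feats)
    h = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(h)
    out = layers.TimeDistributed(layers.Dense(n_stages, activation="softmax"))(h)
    return models.Model(seq_in, out)
```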
81 Mixed Neural Network Approach for Temporal Sleep Stage Classification 2017 Dong, Supratak, Pan, Wu, Matthews & Guo IEEE Transactions on Neural Systems and Rehabilitation Engineering Yes Journal Imperial College London UK 11 Classification of EEG signals Clinical Sleep Staging Improve SOTA: Using Mixed NN on 1 channel EEG for Sleep Stage Scoring Combining MLP + LSTM on 1-Channel Raw EEG from an existing (open) dataset Sleep Using RNN (LSTM) to capture time dependencies in sleep stages and using a single, frontal (skin) electrode. N/M Sleep Stages MASS Public MASS: 62 subjects (~ 494h) (30s windows, no overlap) 58600 29300 62 1 256 N/M Seems to directly do STFT for freq features N/M No No PSD Features Frequency-domain N/M Theano Mixed NN (MNN) MLP + LSTM RNN N/M Our MNN is composed of a rectifier neural network which is suitable for detecting naturally sparse patterns [18], and a long short-term memory (LSTM) for detection of temporally sequential patterns [19] Yes 30s EEG Epoch PSD MLP: [2,5] LSTM: 1 (200-1000) 6 ReLU Dropout Yes 5 Sleep Stages (Softmax) N/M N/M N/M SGD SGD LR: 0.01 Momentum: 0.9 no weight decay 500 Manual fine tune Yes Oversampling to balance classes Cross-Entropy Inter 31-Fold CV k-fold Train: 60 subjects Valid: 2 subjects Macro F1-score (MF1) Accuracy (ACC) Recall (RE) Precision (PR) macro f1-score, accuracy, recall, precision NVIDIA 630 2 days MF1: 80.50 ACC: 85.92 SVM: 75.01 | 79.70 (best with sequence 2) RF: 72.44 | 81.67 (best with sequence 3) MLP: 77.23 | 81.43 (best with sequence 4) DL & Trad. No No No (1) In terms of convenience, wearing the F4 channel near the hair line is imperfect. Other frontal EEG channels such as Fp2 and Fpz are easier to wear, but these channels have lesser information about stage W, N1, N2 and N3. (2) In our experiment, we tried to add fully connected layers between LSTM and softmax, and vary their hidden sizes, but no improvement was found. Less information in low frontal (skin) channels (They've identified 3 challenges) Challenge 1. Heterogeneity Challenge 2. Temporal Pattern Recognition Challenge 3. Comfort No N/A No Yannick Roy TBR TBC dong2018mixed
82 SLEEPNET: Automated Sleep Staging System via Deep Learning 2017 Biswal, Kulas, Sun, Goparaju, Westover, Bianchi & Sun Arxiv Yes Preprint Georgia Institute of Technology Nanyang Technological University Massachusetts General Hospital USA 17 Classification of EEG signals Clinical Sleep Staging Improve SOTA: Using CNN, RNN, CRNN for Sleep Stage Scoring Trying CNN, LSTM, RCNN on 10,000 subjects on Raw EEG, Expert Feature Set and Freq Bands for Sleep Stage Scoring Sleep Leveraging huge dataset (3.2TB) of 10,000 subjects to apply deep learning N/M Sleep Stages Internal Recordings Private 10,000 overnight PSGs x ~8h / patient 80000 hours 3.2TB of data! Each 8h ~ 950-1000 labels (avg at 975) (30s windows, no overlap) 9750000 4800000 10000 6 200 None No No No 3 Sets of Features: 1) Raw EEG 2) Expert Defined Features 3) Spectrogram Combination N/M Tensorflow CUDA 8.0 1) CNN 2) RNN 3) RCNN CNN+RNN 1) CNN: 1D Conv for Raw EEG / 2D Conv for Freq Features 2) RNN: Look-back steps in RNN: [3,5,10,20,30] By combining a RNN with CNN, we can have a hybrid model, namely, Recurrent-Convolutional Neural Networks (RCNN), which is able to extract features present in a spectrogram and preserve the long-term temporal relationship present in the EEG data No 30s EEG Epoch (depending on feature set) RNN: 5 LSTM (1000) 5 ReLU Dropout Yes 5 Classes N/M N/M N/M N/M N/M LR: [0.01 - 0.00001] N/M We performed 50 iterations of random search over a set of parameter choices for hyper-parameter tuning Yes N/M Categorical Cross-Entropy Inter 50 Iterations of random search for hyper-parameter tuning Train-Valid-Test Train: 8700 patients Valid: 300 patients Test: 1000 patients Accuracy Cohen's Kappa accuracy, Cohen's kappa Intel Xeon E5-2640, 256GB RAM, four Nvidia Titan X between 40-100 min [RNN] - Expert Defined Features: [Acc] 85.76 | 79.46 [k] [RNN] - Spectrogram Features: [Acc] 79.21 | 73.83 [k] [RNN] - Waveform Features: [Acc] 79.46 | 72.46 [k] --- [RCNN] - Expert Defined Features: [Acc] 81.67 | 76.38 [k] [RCNN] - Spectrogram Features: [Acc] 81.47 | 74.37 [k] [RCNN] - Waveform Features: [Acc] 79.81 | 73.52 [k] Logistic Regression Tree Boosting MLP CNN RNN RCNN DL & Trad. No No No On 1000 held-out testing patients, the best performing algorithm achieved an expert-algorithm level of inter-rater agreement of 85.76% with Kappa value 79.46%, exceeding previously reported levels of expert-expert inter-rater agreement for sleep EEG staging. N/M No N/A No Yannick Roy TBR TBC Biswal2017
83 DeepKey: An EEG and Gait Based Dual-Authentication System 2017 Zhang, Yao, Chen, Wang, Sheng & Gu Arxiv Yes Preprint University of New South Wales Macquarie University RMIT University Australia 20 Classification of EEG signals Personal trait/attribute Person identification Improve SOTA: EEG for Person Identification Use AR+RNN+SVM on EEG+Gait for Person Identification Motor Imagery (see eegmmidb dataset) + Gait (PAMAP2 dataset) The DL motivation is not clear. They want to improve SOTA. N/M Raw EEG eegmmidb Public eegmmidb: 8 subjects x 13,500 samples 13,500 samples / 90 per window = 150 examples per subject, 1200 total (90 points windows, no overlap) 1200 11.3 8 64 160 None (AR) No No No Raw EEG Raw EEG N/M N/M AR + RNN + SVM RNN N/M AR for pre-processing, RNN for feature extracting, and SVM for classification. Auto-regressive Coefficients (AR) is one of the most widely used pre-processing methods on EEG data Yes 150x13x64 (150 segments, 13 coefficients (AR), 64 features/nodes) 5 RNN (64) 5 N/M L2 Yes 8 One-Hot Label (ID - 8 Subjects) N/M N/M N/M Adam Adam lambda is set as 0.004 while learning rate is set as 0.005 8 mini-batch with the shape of [150, 13, 64] Orthogonal Array Experiment Method Yes N/M Log Loss Function Inter No No Train: 87.5% Test: 12.5% Accuracy accuracy N/M N/M Highest Accuracy: 0.9841 Gait: 0.999 Combined: 0.983 [45]: PSD + cross-correlation values, [8]: Customized Threshold, [17]: Low-pass filter+wavelets+ ANN, [3]: Bandpass FIR filter +ECOC + SVM, [44]: IAF + delta band EEG + Cross-correlation & Mahalanobis, [22]: CSP +LDA, [23]: AR + SVM Traditional pipeline No No No The Gait Identification Model adopts a 7-layer deep learning model to process gait data and classify subjects' IDs, achieving an accuracy of 0.999. The EEG Identification Model combines three components (auto-regressive coefficients, the RNN structure, and an SVM classifier) and achieves the accuracy of 0.9841 on a public dataset. Overall, the DeepKey authentication system obtains a FAR of 0 and a FRR of 0.019. N/M Yes GitHub No Yannick Roy TBR Yes Zhang2017c
84 Multi-Person Brain Activity Recognition via Comprehensive EEG Signal Analysis 2017 Zhang, Yao, Zhang, Wang, Sheng, Gu EAI International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services Yes Conference University of New South Wales, Australia Singapore Management University Macquarie University, Australia RMIT University, Australia Australia 10 Classification of EEG signals BCI Active Motor imagery Improve SOTA Use AE + XGB for BCI-MI 5 classes (eegmmidb + internal recordings) Motor Imagery (see eegmmidb dataset) Deep learning should be able to generalize better across subjects and across classes, instead of binary classif. N/M Motor Imagery eegmmidb Public eegmmidb: 20 subjects x 28000 samples Total: 560,000 EEG samples (window length = 1 point) 560000 58.3 20 64 160 N/M N/M No No Raw EEG (None) Raw EEG z-score N/M AE + XGB Classifier AE Encoder, Decoder + XGB Classifier N/M Yes 64x?? Channels x Raw EEG time window 1 (64) Input - Encoder - Decoder - Classifier (XGB) 1 N/M L2 Yes 5 N/M N/M N/M RMSProp Other LR: 0.01 There are 9 mini-batches and the batch size is 17,280. N/M N/M N/M MSE Inter No No Train: 532,000 Test: 28,000 Accuracy Precision Recall F1-Score ROC ROC AUC accuracy, precision, recall, f1-score, ROC, ROC AUC Nvidia Titan X Pascal 768G memory 145 TB PCIe SSD See charts Accuracy: 0.794 Precision: 0.7991 Recall: 0.781 F1 score: 0.7883 AUC: 0.9456 SVM, RNN, LDA, RNN+SVM, CNN, DT, AdaBoost, RF XGBoost, PCA+XGBoost, PCA+AE+XGBoost, EIG+AE+XGBoost, EIG+PCA+XGBoost, DWT+XGBoost, SAE+XGBoost, AE+XGBoost DL & Trad. No No No As part of our future work, we will build multi-view model of multi-class EEG signals to improve the classification performance. In particular, we plan to establish multiple models with each single model dealing with a single class. Following this philosophy, the correlation between test sample and each model can be calculated in the test stage and the sample can be classified to the class with minimum correlation coefficient. N/M No N/A No Yannick Roy TBR Yes Zhang2017a
85 Neurology-as-a-Service for the Developing World 2017 Dharamsi, Das, Pedapati, Bramble, Muthusamy, Samulowitz, Varshney, Rajamanickam, Thomas & Dauwels Arxiv Yes Preprint IBM Research AI Nanyang Technological University USA 5 Classification of EEG signals BCI Active Motor imagery Improve SOTA: Use DL on the Cloud Use DL on the Cloud for developing countries. Starting with BCI tasks (MI) MI: Feet and Hands, real / imagined. To develop neurology-as-a-service to learn features automatically from the data. This would help developing countries N/M Motor Imagery eegmmidb Public eegmmidb: 103 out of 109 subjects, 12 out of 14 tasks Segments 0.8 sec and sliding window of 0.05 sec The prepared data consisted of 17,232 samples (window length = 0.8s ??? not clear) 17232 N/M 103 64 160 1) Bandpass: 3 - 30Hz 2) Generate Spectrogram: Hanning window & NFFT (128) Yes No (they mention it, but only filters) No Spectrograms Frequency-domain N/A N/M (Cloud) CNN CNN N/M N/M No 3D (channels x freq x time) [1-3 3D CNN] [0-2 FC] 5 N/M Dropout Yes N/M N/M N/M N/M (hyperparameters are automatically fine-tuned using an optimizer) N/M LR: 0.001 N/M Random Optimizer Yes N/M N/M Inter No No Train: 70% Test: 30% Accuracy accuracy N/M (Cloud) N/M Best accuracy: 63.4% PCA-SVM Traditional pipeline No No No As part of our next steps, we plan to use this framework on a dataset aimed at classification of epileptic seizures and/or pathological/normal EEG. We would also like to see how the framework performs using other hyperparameter optimization techniques including Bayesian optimization. N/M No N/A No Yannick Roy TBR TBC Dharamsi2017
86 Deep Architectures for Automated Seizure Detection in Scalp EEGs 2017 Golmohammadi, Ziyabari, Shah, de Diego, Obeid, Picone Arxiv Yes Preprint Neural Engineering Data Consortium, Temple University USA 8 Classification of EEG signals Clinical Epilepsy Detection Improve SOTA: Comparing different deep architectures Compare HMM+sAE, HMM+LSTM, IPCA+LSTM, CNN+MLP, CNN+LSTM Ongoing EEG recording, with and without seizures. With big EEG corpus now available we can explore deep learning. (Natus), (Nihon Kohden) Seizures TUH; Duke Seizure Corpus Both TUSZ & DUSZ: 1,864,012s ~ 517.8h, 159 subjects Multiple models with different window sizes N/M 31067 159 22; -1 250; N/M N/M N/M N/M N/M LFCCs + First & Second Derivative of LFCCs Other N/A N/M 1) HMM + sAE 2) HMM + LSTM 3) IPCA + LSTM 4) CNN + MLP 5) CNN + LSTM CNN+RNN 2D Conv Layers -> Flatten -> 1D Conv Layer -> LSTM (output 1s data) -> LSTM -> 2-way sigmoid They tried different architectures trying to capture Spatio-Temporal information. They also use Time-Freq Features, not raw EEG as is. Yes 210 @ 22 x 26 x 1 (Frames @ Channels * Features * 1) (to be reviewed) 3x 2D CNN + 1x 1D FC CNN + 2x Bi-LSTM (CNN + LSTM) (see paper for others) 6 ELU Dropout Yes 2-way Sigmoid N/M Trained + Eval on TUSZ and only Eval on DUSZ Standard Adam Adam N/M N/M N/M N/M N/M MSE Inter Train-Valid-Test Train-Valid-Test Train: 614,382 (sec) Valid: 647,948 (sec) Test: 601,682 (sec) Sensitivity Specificity sensitivity, specificity N/M N/M CNN + LSTM gave the best results. TUSZ - Sensitivity: 30.83% | Specificity: 96.86% DUSZ - Sensitivity: 33.71% | Specificity: 70.72% HMM + Gaussian mixture + AE They compared 7 Optimizer Methods. (e.g. Adam, SGD, etc.) They compared 6 Activation Functions. (e.g. Tanh, Sigmoid, etc.) CNN + LSTM, Adam, ELU is the best combination DL & Trad. No No No This is a significant finding because the Duke corpus was collected with different instrumentation and at different hospitals. Our work shows that deep learning architectures that integrate spatial and temporal contexts are critical to achieving state of the art performance and will enable a new generation of clinically-acceptable technology. Access to labeled data and $ to label the data and make it public. No N/A Yes Yannick Roy TBR TBC Golmohammadi2017a
87 Neonatal Seizure Detection using Convolutional Neural Networks 2017 O'Shea, Lightbody, Boylan, Temko IEEE 27th International Workshop on Machine Learning for Signal Processing Yes Conference Irish Centre for Fetal and Neonatal Translational Research, University College Cork Ireland 6 Classification of EEG signals Clinical Epilepsy Detection New Approach CNN on (preprocessed) raw EEG for neonatal seizure detection Ongoing EEG recording, with and without seizures. CNN works well on audio signal, why not on EEG. N/M Seizure Internal Recordings Private 835 hours with 1389 seizures from 18 subjects split into 8s windows, 50% overlap. (not clear 50% versus 7s overlap / 1s shift) (8s window, 7s overlap) 3006000 50100 18 8 256 Band-pass filter: 0.5 - 12.8Hz Downsampled: 32Hz EEG Split into 8s windows (12.5% overlap) Yes No No Raw EEG 8 sec windows (1 sec shift) Raw EEG N/A Keras 1D - CNN CNN Conv - Batch Norm. - Pooling Output Layer: GAP (not dense) "...1D CNNs wide convolutional filters (1-4s, 32-128 samples) significantly improved the performance". Such filter sizes were used, in contrast to larger filter lengths, which allow learning the various filters in a hierarchical manner [21]. Yes 256x1 (8s x 1 channel) 6 6 ReLU Softmax Batch Norm Yes 2 (Seizure vs Non-seizure) 16,930 The network was trained for 100 epochs; after each epoch the validation AUC was calculated. Standard SGD SGD LR: 0.003 LR -= 10% every 20 iterations Nesterov Momentum: 0.9 2048 N/M N/M Sliding Window (Shifted by 1s, 7/8 overlap) Categorical Cross-Entropy Inter Leave-One-Subject-Out Leave-One-Subject-Out Train: 17 subjects Test: 1 subject ROC AUC ROC AUC N/M N/M AUC: 97.1% AUC90: 82.9% SVM Traditional pipeline No No No "We have also tried max pooling, which led to slightly inferior results in our experiments." "Initially, the EEG was converted to time-frequency images (spectrograms) and 2D CNNs were utilized, adopted from the area of image processing [17] – this architecture proved unsuccessful in the seizure detection task." N/M No N/A No Yannick Roy TBR Yes OShea2017
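A sketch of the row's fully convolutional seizure detector: conv / batch-norm / average-pooling blocks over an 8s, 32 Hz window (256 samples), ending in global average pooling instead of a dense layer as the row notes; the filter counts per block are assumptions, while the SGD settings (lr 0.003, Nesterov momentum 0.9) and the preference for average over max pooling follow the row.

```python
from tensorflow.keras import layers, models, optimizers

model = models.Sequential([layers.Input(shape=(256, 1))])  # 8 s @ 32 Hz, 1 channel
for n_filters in (32, 32, 64, 64):
    # Conv - Batch Norm - Pooling blocks, per the row's architecture cell.
    model.add(layers.Conv1D(n_filters, 3, padding="same", activation="relu"))
    model.add(layers.BatchNormalization())
    model.add(layers.AveragePooling1D(2))   # avg pooling; max was reported inferior
model.add(layers.Conv1D(2, 3, padding="same"))  # one feature map per class
model.add(layers.GlobalAveragePooling1D())      # GAP output layer (not dense)
model.add(layers.Activation("softmax"))         # seizure vs non-seizure
model.compile(optimizer=optimizers.SGD(learning_rate=0.003,
                                       momentum=0.9, nesterov=True),
              loss="categorical_crossentropy", metrics=["accuracy"])
```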
88 Improving classification accuracy of feedforward neural networks for spiking neuromorphic chips 2017 Yepes, Tang & Mashford International Joint Conference on Artificial Intelligence Yes Conference IBM Research, VIC, Australia Australia 7 Improvement of processing tools Hardware optimization Neuromorphic chips New Approach: Running DL on Neuromorphic Chips Compare constrained network for a neuromorphic chip on 2 datasets, vs unconstrained version of NN MNIST & EEG Data from Nurse et al., 2015 (BCI-MI) Implement DL/DNNs on a chip. N/M Motor Imagery Nurse et al. (2015) Public From [Nurse et al., 2015]: 1 subject ~ 30 min 480/468 examples for training, 66/95 for testing (window length N/M) 1109 30 1 -1 1000 N/M N/M No No N/M N/M [0, 1] Matlab [Esser et al., 2015] N/M [Esser et al., 2015] [Esser et al., 2015] No N/M Small Network: 3 Large Network: 4 4 N/M N/M N/M 2 N/M N/M N/M N/M N/M LR: 0.1 25 N/M N/M No N/M Intra No No Train: 80% Test: 20% Accuracy accuracy TrueNorth (IBM Chip) (see paper) EEG Data: 86% MNIST: 98-99% No None No No No Furthermore, analysis of the learnt parameters provides insights that might complement hardware design, thus providing a more efficient deployment of the trained models. The trained models use a small portion of the TrueNorth chip (30 cores vs. 4096 available in the current version of the chip), thus requiring much less than 70mW to work, which makes these models suitable for portable autonomous devices with large autonomy. N/M No N/A No Yannick Roy TBR TBC Yepes2017
89 Automatic Analysis of EEGs Using Big Data and Hybrid Deep Learning Architectures 2017 Golmohammadi, Hossein, Torbati, Lopez De Diego, Obeid & Picone Arxiv Yes Preprint Temple University Jibo, Inc., Redwood City USA 20 Classification of EEG signals Clinical Epilepsy Detection New Approach: Hybrid HMM & SdA for Epilepsy Using a hybrid 3-pass model, combining HMM & Stacked Denoising AutoEncoders for Epilepsy classification Ongoing EEG recording, with and without seizures. (TUH Dataset) Deep Learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction (Natus) Seizure TUH Public TUH Corpus Training from 359 sessions, Evaluation from 159 sessions. 113453 events total. Split into 10s windows -> 1s epochs -> 0.1s frames (0.1s windows, 0.2s overlap) 113453 18909 518 128 1024 PCA Yes No No Cepstral coefficient-based feature extraction approach based on Linear Frequency Cepstral Coefficients (LFCCs) Other N/M Theano 3x Stacked denoising Autoencoders (SDAE) AE 3 Passes. (1) HMM -> (2) SDAEs -> (3) NLP (2) PCA -> Out of Sample -> 3 SDAEs in parallel -> Enhancer (combining 3 SDAEs) Not your typical DL-EEG approach... Yes 3 [Nodes from 100-800] 3 N/M N/M N/M 6 Classes N/M Training of these three SDAE networks is done in two steps: pre-training and fine-tuning. Denoising autoencoders are stacked to form a deep network. The unsupervised pre-training of such an architecture is done one layer at a time. Pre-training Minibatch Stochastic Gradient Descent SGD LR: [0.1-0.5] [100-300] N/M N/M Out-of-sample technique (van der Maaten, 2009) Cross-Entropy Inter No No Train: 84,032 events Test: 29,421 events Sensitivity Specificity sensitivity, specificity N/M N/M Pass: Sensitivity | Specificity 1 (HMM): 86.78 | 17.70 2 (SDAE): 78.93 | 4.40 3 (SLM): 90.10 | 4.88 No None No No No A summary of the results for different stages of processing is shown in Table 12. The overall performance of the multi-pass hybrid HMM/deep learning classification system is promising: more than 90% sensitivity and less than 5% specificity. N/M No N/A No Yannick Roy TBR TBC Golmohammadi2017
90 Multimodal deep learning approach for joint EEG-EMG data compression and classification 2017 Ben Said, Mohamed, Elfouly, Harras & Wang IEEE Wireless Communications and Networking Conference Yes Conference Qatar University Carnegie Mellon University University of British Columbia Qatar 6 Classification of EEG signals Monitoring Affective Emotion New Approach: Compressing joint EEG-EMG with an autoencoder Compression & Classification of joint EMG + EEG on DEAP dataset with SAE Watching music videos (DEAP Dataset) The deep learning approach has emerged as one of the possible techniques to exploit the correlation of the data from multiple modalities. Compression for mobile health data. N/M Emotions DEAP Public DEAP 32 subjects x 40 videos x 63s (6s windows) 23040 1280 32 -1 128 1) 6s Windows 2) Whitened 3) Normalized Yes N/M N/M Raw EEG + EMG (None) Raw EEG z-score N/M SAE AE N/M The deep learning approach has emerged as one of the possible techniques to exploit the correlation of the data from multiple modalities No N/M 2 SAE 2 Sigmoid L2 Yes N/M 1) Compressed data 2) Classification N/M Greedy layer-wise Pre-training N/M N/M N/M N/M N/M N/M Duplicated multimodal data keeping values from 1 modality, setting the other modality to 0. And vice-versa. Square Euclidean Distance Inter No No [Compress] Train: 50% [Compress] Test: 50% [Classif] Train: 75% [Classif] Test: 25% 1) Compression: Distortion 2) Classification: Accuracy distortion, accuracy N/M N/M 1) [Compression] Distortion: EMG = 13.85% | EEG = 12% 2) [Classification] Accuracy: 78.1% Discrete Wavelet Transform (DWT) [26] Compressed Sensing (CS) [27] (distortion: 22% | 17.21%) 2D compression approach which is based on SPIHT and FastICA [28] Traditional pipeline No No No 1) Compression: Distortion 2) Classification N/M No N/A No Yannick Roy TBR Yes BenSaid2017a
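A sketch of the joint EEG+EMG autoencoder idea with the row's augmentation trick: each training example is duplicated with one modality zeroed out so the shared bottleneck learns cross-modal correlation. Input sizes, the single-layer encoder, and the Adam optimizer are assumptions; MSE stands in for the squared-Euclidean reconstruction loss.

```python
import numpy as np
from tensorflow.keras import layers, models

def zero_modality_augment(eeg, emg):
    """Return original concatenated pairs plus copies with one modality zeroed."""
    x = np.concatenate([eeg, emg], axis=1)
    eeg_only = np.concatenate([eeg, np.zeros_like(emg)], axis=1)
    emg_only = np.concatenate([np.zeros_like(eeg), emg], axis=1)
    return np.vstack([x, eeg_only, emg_only])

n_in = 768 + 256   # EEG + EMG samples per 6 s window (assumed sizes)
ae = models.Sequential([
    layers.Input(shape=(n_in,)),
    layers.Dense(256, activation="sigmoid"),  # shared compressed code
    layers.Dense(n_in, activation="linear"),  # joint reconstruction
])
ae.compile(optimizer="adam", loss="mse")      # optimizer assumed (N/M in row)
```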
91 Deep Learning for Fatigue Estimation on the Basis of Multimodal Human-Machine Interactions 2017 Gordienko, Stirenko, Kochura, Alienin, Novotarskiy & Gordienko Arxiv Yes Preprint National Technical University of Ukraine Ukraine 12 Classification of EEG signals Monitoring Physical Exercise New Approach Multi-modal fatigue (and activity) estimation Different activities (sports) while having different sensors Use multimodal models to combine different modalities with a NN. OpenBCI (OpenBCI) Multimodal Internal Recordings Private N/M Not much information on the EEG data N/M N/M N/M -1 N/M N/M N/M N/M N/M N/M N/M N/M N/M DNN N/M N/M N/M No N/M N/M N/M N/M N/M N/M N/M N/M N/M N/M N/M N/M N/M N/M N/M N/M N/M Mean Residual Deviance (MRD) Mean Absolute Error (MAE) N/M N/M No N/M Mean Residual Deviance (MRD) Mean Absolute Error (MAE) mean residual deviance, mean absolute error N/M N/M See Paper (not really relevant / meaningful for this paper) N/M None No No No The main achievement is that the measured multimodal data can be used as a training dataset for measuring and recognizing the intensity and physical load on the person by means of machine learning approaches. N/M No N/A No Yannick Roy TBR Yes Gordienko2017
92 Towards Deep Modeling of Music Semantics using EEG Regularizers 2017 Raposo, Matos, Ribeiro, Tang & Yu Arxiv Yes Preprint Universidade de Lisboa Portugal 5 Classification of EEG signals Music semantics Improve SOTA on music semantics Modeling of music audio semantics Listening to music Previous success of CNNs in music audio modeling OpenBCI (OpenBCI) None Internal Recordings Private 60 music segments + 2 noise + 2 songs x 18 subjects music duration = average of 15.13s samples approx: 60 * 15.13 / 1.5 * 18 = 10894 (1.5s windows) 10894 272.3 18 16 250 1) Highpass 0.5Hz 2) Notch at 50Hz Yes Yes Yes Raw EEG + Audio embeddings Raw EEG Rescaled [-1, 1] N/M CNN CNN N/M N/M Yes N/M 5 5 ReLU No N/M 128 N/M 1) Train audio+lyrics embeddings model 2) Train audio embeddings+EEG embeddings model Standard N/M N/M N/M 102 N/M N/M N/M CCA between embeddings Inter 5-Fold CV k-fold Train: 80% Test: 20% Mean Reciprocal Rank (MRR) mean reciprocal rank GeForce GTX 1080 20 minutes Outperformed Spotify by ~1%, but did not perform better than the SOTA (by a small margin) Spotify embeddings and current SOTA (Choi) DL & Trad. No No No Proposed approach did not outperform SOTA, but SOTA was trained on more than 2083 hours of music, whereas the proposed method needs less than 3 hours of both music and EEG N/M No N/A No Isabela Albuquerque TBR TBC Raposo2017
93 Deep convolutional neural network for the automated detection and diagnosis of seizure using EEG signals 2017 Acharya, Oh, Hagiwara, Tan & Adeli Computers in Biology and Medicine No Journal Ngee Ann Polytechnic, Singapore SUSS University, Singapore University of Malaya, Malaysia The Ohio State University, US Singapore Classification of EEG signals Clinical Epilepsy Detection New Approach: CNN for Epilepsy (claiming it's a new approach, but it's not...) 13-layer CNN for Epilepsy Ongoing EEG recording, with and without seizures. To develop a computer-aided diagnosis (CAD) to classify EEG N/M Seizures Bonn University Public Bonn University: B,D,E 3 x 100 x 23.6s (23.6s windows) 300 118 10 1 173.6 None No N/M N/M Raw EEG Raw EEG z-score N/M CNN CNN 1D CNN Conv / Max Pooling Yes 4097x1 1D CNN: 10 FC: 2 12 ReLU L1 Yes 3 (Softmax with 3 classes) N/M A conventional backpropagation (BP) [32] with a batch size of 3 is employed in this work to train CNN. Standard Adam Adam Lambda: 0.7 LR: 1x10^-3 Momentum: 0.3 3 Trial and Error Yes No N/M Inter 10-Fold CV k-fold Train: 90% Valid: 30% of 90% Test: 10% Accuracy Specificity Sensitivity accuracy, specificity, sensitivity Intel Xeon 2.40 GHz (E5620) 24 GB RAM 150 epochs x 12.8s/epoch = 32 min Accuracy: 88.7% Sensitivity: 95% Specificity: 90% Many other SOTA (check paper) They performed worse than most previous SOTA Traditional pipeline No No No The advantage of the model presented in this paper, however, is that separate steps of feature extraction and feature selection are not required in this work. Nevertheless, the main drawback of this work is the lack of a huge EEG database Amount of data No N/A Yes Yannick Roy TBR TBC Acharya2017
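An illustrative sketch of the 13-layer 1D CNN recorded above: five conv + max-pool pairs (the row's "1D CNN: 10"), two fully connected layers, and a 3-way softmax over 4097-sample segments; filter counts and kernel sizes are assumptions, and the row's Adam optimizer is used here with default settings.

```python
from tensorflow.keras import layers, models

model = models.Sequential([layers.Input(shape=(4097, 1))])  # one 23.6 s segment
for n_filters in (4, 4, 10, 10, 15):
    # Conv / Max Pooling pairs, per the row's architecture cell.
    model.add(layers.Conv1D(n_filters, 5, activation="relu"))
    model.add(layers.MaxPooling1D(2))
model.add(layers.Flatten())
model.add(layers.Dense(50, activation="relu"))
model.add(layers.Dense(20, activation="relu"))
model.add(layers.Dense(3, activation="softmax"))  # 3 classes (Bonn sets B, D, E)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```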
94 Electroencephalogram-based decoding cognitive states using convolutional neural network and likelihood ratio based score fusion 2017 Zafar, Dass & Malik PLOS One No Journal Universiti Teknologi PETRONAS Malaysia 23 Classification of EEG signals BCI Reactive RSVP Improve SOTA Decode seen images by extracting features with a CNN Watching natural images from 5 classes Features learned automatically can be more efficient (EGI) None Internal Recordings Private 26 subjects x 21min (1s windows) 13520 546 26 128 250 [Hardware: Bandpass from 0.1 to 100 Hz] 1) Bandpass from 0.3 to 30 Hz 2) Removal of eye artefacts Yes Yes Yes Raw EEG Raw EEG N/M N/M CNN CNN Modified LeNet CNN is *just* for feature extraction (feature selection and classification is done separately) Temporal 1D conv in first layer Yes 128 x 250 2 2 Sigmoid, tanh N/M N/M 128 x 11 x 100 N/M ??? Standard N/M N/M N/M N/M N/M N/M N/M N/M N/M Monte-Carlo 100-Fold CV Train-Valid-Test Train: 90% Test: 10% Accuracy Specificity Sensitivity accuracy, specificity, sensitivity N/M N/M Accuracy (across participants, 5-class): 40% Discrete Wavelet Transform + SVM Traditional pipeline two-sample t-test, ANOVA No No Amount of data No N/A No Hubert Banville TBR TBC Zafar2017
95 Deep Convolutional Neural Network for Emotion Recognition Using EEG and Peripheral Physiological Signal 2017 Lin, Li & Sun International Conference on Image and Graphics No Conference College of Computer Science of Zhejiang University, Hangzhou, China China Classification of EEG signals Monitoring Affective Emotion Improve SOTA: AlexNet on DEAP AlexNet on Images (Raw EEG + Freq Bands) + other physiological sensors Watching videos (check out DEAP details) Using AlexNet on DEAP N/M Emotions DEAP Public DEAP 32 subjects x 1min x 40 videos (6s windows) 12800 1280 32 32 512 1) Downsampling to 128Hz 2) Band-Pass Filter: 4.0 - 45Hz 3) Average to Common Reference* (?) Yes No No EEG -> 6 gray images (Raw EEG + Freq Bands) + 81 features from other physiological sensors Frequency-domain min-max N/M CNN CNN AlexNet AlexNet is great for images, frequency bands can be converted to images... Yes 6 Gray Images (2D) 5 CNN 1 FC (81+500) 6 N/M N/M N/M 2 Softmax N/M Fine-tuning AlexNet Pre-training SGD SGD LR: 0.001 (decreases every 500 iterations) 200 Empirically Yes N/M N/M Inter 10-Fold CV k-fold Train: 90% Test: 10% Accuracy F1-Score accuracy, f1-score N/M N/M Arousal - Accuracy: 87.30% Arousal - F1-Score: 78.24% Valence - Accuracy: 85.50% Valence - F1-Score: 80.06% Many other SOTA (check paper) They outperform all others. Traditional pipeline No No No To achieve better performances, data preprocessing of the original signal was also adopted. The provided experimental results prove the effectiveness and validate the proposed contributions of our method by achieving superior performance over the existing methods on DEAP Dataset. N/M No N/A No Yannick Roy TBR TBC Lin2017
96 Cross-subject recognition of operator functional states via EEG and switching deep belief networks with adaptive weights 2017 Yin & Zhang Neurocomputing No Journal University of Shanghai China 18 Classification of EEG signals Monitoring Cognitive Mental workload & fatigue Improve SOTA on cross-subject operator functional state recognition Exploit "new" improvements in deep learning Cabin air management simulation (AutoCAMS) Using switching deep belief network with adaptive weights (Nihon Kohden) None Internal Recordings Private 8 subjects x 1080 EEG Segments per subject (2s windows) 8640 288 8 11 500 Offline 1) Adaptive exponential smoothing (to remove outliers) Yes Yes Yes Centroid frequency, log-energy entropy, mean, five power components, Shannon entropy, sum of energy, variance, zero-crossing rate of each channel and power differences between channel pairs Frequency-domain z-score Matlab 2011b Switching DBN DBN One DBN per subject The member DBN is switched at different time instants to fit the non-stationarity of the EEG features recorded from a novel testing subject. Yes 152x1 4 4 Sigmoid N/M N/M 3 Low MW Medium MW High MW 3 N/M Unsupervised pre-training of DBNs to learn representation of features for each subject (layer by layer). Supervised fine-tuning of the complete model. Pre-training N/M N/M Pre-training: 0.1 Fine-tuning: 1 10 N/M N/M Gaussian noise to feature vector N/M Inter Leave-One-Subject-Out Leave-One-Subject-Out Train: 7 subjects Test: 1 subject Accuracy True positive True negative False positive False negative accuracy, true positives, true negatives, false positives, false negatives AMD4CPU 1.9GHz, 8G RAM N/M Mental workload: 77% Mental fatigue: 68% MW+MF: 54% KNN, Naive Bayes, Logistic Regression, LSSVM, SAE, DBN (all with and without PCA) DL & Trad. Two-tailed Wilcoxon sign-rank test No No Results of the proposed method outperform all baselines. When the number of subjects increases, the performance gap between SDBN and baselines increases, suggesting that the number of subjects plays a fundamental role. Number of subjects is crucial to obtain a good performance No N/A No Isabela Albuquerque Yannick Roy TBC Yin2017
97 Vowel classification from imagined speech using sub-band EEG frequencies and deep belief networks 2017 Sree & Kavita IEEE International Conference on Signal Processing, Communications and Networking No Conference SSN College of Engineering India 4 Classification of EEG signals BCI Active Speech decoding Improve SOTA on vowel classification Use DBNs to extract EEG features Speech imagery N/M Super Spec (RMS) None Internal Recordings Private 5 subjects x 75s experiment x ?? trials; between 15-20 min per subject (unclear) N/M 87.5 5 32 128 1) Band-pass 1-60Hz Yes No No Energy features of the wavelet transform: Root Mean Square, Mean Absolute Value, Integrated EEG, Simple Square Integral, Variance of EEG, Average Amplitude Change Frequency-domain z-score N/M DBN DBN N/M N/M No N/M 7 7 N/M N/M N/M N/M N/M N/M N/M N/M N/M LR: 0.002 N/M N/M N/M N/M Log-likelihood Inter No No Train: 80% Test: 20% Accuracy accuracy N/M N/M ~87.5% (I believe this is the average over all vowels and EEG bands) No None No No No Vowels were more accurately classified in the theta and gamma bands N/M No N/A No Isabela Albuquerque TBR TBC Sree2017
98 Bullying incidences identification within an immersive environment using HD EEG-based analysis: A Swarm Decomposition and Deep Learning approach 2017 Baltatzis, Bintsi, Apostolidis & Hadjileontiadis Nature Scientific Reports No Journal Aristotle University of Thessaloniki, Khalifa University of Science and Technology Greece 8 Classification of EEG signals Monitoring Affective Bullying incidents New task: classifying bullying stimuli Classifying bullying stimuli in 2D or VR presentation Watching stimuli (2D or in VR) of bullying situations N/M (EGI) None Internal Recordings Upon request T1: 256 × 256 × 14 × 17 (x3 SWD) (channels × samples × trials × subjects) T2: 256 × 192 × 16 × 17 (x3 SWD) (channels × samples × trials × subjects) (unclear) 1530 N/M 17 256 250 1) Bandpass 0.3-30 Hz 2) Artefact detection, bad channel replacement, baseline correction 3) Channel-wise normalization (subtract mean, divide by max) 4) Highpass @ 7 Hz 5) Downsample to 128 Hz Yes Yes Yes 1) Swarm decomposition to get oscillatory modes 2) k-means clustering to re-order channels based on their respective distances to each other Other N/M N/M CNN CNN N/M N/M Yes 256 x 128 2 2 ReLU N/M N/M 2 or 4 N/M Standard optimization Standard N/M N/M N/M N/M N/M N/M N/M "Softmax" Inter 10-Fold CV k-fold Train: 75% Valid: 10% of 75% Test: 25% Accuracy Precision Recall ROC AUC accuracy, precision, recall, ROC AUC N/M N/M 2-class: Accuracy, precision, recall, AUC (test): 0.937, 0.9403, 0.9395, 0.9869 4-class: Accuracy, precision, recall, AUC (test): 0.8858, 0.8775, 0.87475, 0.975 No swarm decomposition or clustering; clustering only; swarm decomposition only Traditional pipeline No No No Swarm decomposition was an important step in reaching high accuracy; without k-means clustering the network was overfitting. Larger nets take more resources No N/A No Hubert Banville TBR Yes Baltatzis2017
99 Classification and discrimination of focal and non-focal EEG signals based on deep neural network 2017 Taqi, Al-Azzo, Mariofanna & Al-Saadi International Conference on Current Research in Computer Science and Information Technology (ICCIT) No Conference University of Arkansas at Little Rock; Asiacell Company for Telecommunication, Iraq USA 7 Classification of EEG signals Clinical Epilepsy Detection Improve SOTA Detecting focal vs non-focal seizures with existing deep nets: AlexNet, LeNet, GoogLeNet Seizures (Bern-Barcelona Dataset) A deep neural network (DNN) is a high-resolution model that extracts sophisticated hierarchical features (e.g. AlexNet, LeNet, GoogLeNet) N/M Seizures Bern-Barcelona EEG DB Public Bern-Barcelona EEG DB (600 out of 3750) 600 signal pairs: 300/300 x 40s (40s windows ??) 600 400 5 -1 256 None No No No None (Raw EEG) Raw EEG N/M Caffe N/M (pre-trained models) N/M N/M (pre-trained models) Using SOTA vision networks for EEG (AlexNet, LeNet, GoogLeNet) No 256x256 (images) N/M (pre-trained models) N/M N/M (pre-trained models) N/M (pre-trained models) N/M 2 N/M Pre-trained models (AlexNet, LeNet, GoogLeNet) Pre-training N/M (pre-trained models) N/M N/M N/M N/M N/M N/M N/M Inter No No Train: 75% Test: 25% Accuracy accuracy NVIDIA GPUs N/M LeNet, AlexNet and GoogLeNet all reach 100% (with different numbers of TEs); LeNet is the best compromise Anindya et al., 2016: 89.4% (EMD-DWT domain, k-nearest neighbor classifier) R. Sharma et al., 2015: 84% (DWT domain, KNN, PNN, fuzzy and LS-SVM) R. Sharma et al., 2014: 85% (EMD domain, LS-SVM classifier) Traditional pipeline No No No As a future task, the authors plan to investigate approaches for EEG classification of other diseases or of intoxicated people, and ECG signal classification N/M No N/A No Yannick Roy TBR Yes Taqi2017
100 Deep Transfer Learning for Cross-subject and Cross-experiment Prediction of Image Rapid Serial Visual Presentation Events from EEG Data 2017 Hajinoroozi, Mao & Lin International Conference on Augmented Cognition No Conference University of Texas at San Antonio; National Sun Yat-sen University, Taiwan USA 11 Classification of EEG signals BCI Reactive RSVP Novel Approach: Transfer Learning Transfer learning on RSVP task with CNN on raw EEG: (1) Cross-Subject (2) Cross-Experiment RSVP (3 datasets from 1990, 1999, 2013) Transfer learning has a lot of potential for BCI training. ActiveTwo (BioSemi) RSVP USA DoD (1999); USA Army (1990); Touryan et al. (2013) Private DS #1 - CT2WS: 15 subjects x 15min DS #2 - Static: 16 subjects x 15min DS #3 - Expertise: 10 subjects x 5 sessions x 60min (1s windows, no overlap) 65831; 62553; 21680 1097.2; 1042.6; 361.3 15; 16; 10 64; 64; 256 512; 512; 512 1) Bandpass filter: 0.1 - 55 Hz 2) Downsampled to 128 Hz 3) Epoching: 1s window Yes No No None (Raw EEG) Raw EEG N/M N/M STCNN (Spatial-Temporal CNN) CNN Essentially a standard CNN: 2 conv layers + 3 FC layers with dropout Aims to capture spatial and temporal information from raw EEG Yes 64x128 CNN: 2 FC: 3 5 ReLU Dropout Yes 2 Target / Non-Target (softmax) N/M The paper is about transfer learning: training on one dataset, then fine-tuning (or not) on the other (see the sketch after this table). Pre-training N/M N/M N/M N/M N/M N/M N/M N/M Inter 10-Fold CV k-fold Train: 90% Valid: 10% ROC AUC ROC AUC N/M N/M Ranging from 73-77%, depending on source/target datasets and transfer type (Cross-Subject or Cross-Experiment) Bagging, XLDA, LDA Traditional pipeline No All layers: subject-specific; CNN layers: mostly subject-specific; all layers: general info Analysis of performance with transferred layers This study represents the first comprehensive investigation of CNN transferability for EEG-based classification, and our results provide important information that will guide the design of more sophisticated deep transfer learning algorithms for EEG-based classification in BCI applications. N/M No N/A No Yannick Roy TBR TBC Hajinoroozi2017
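The transfer-learning recipe in the last row above (Hajinoroozi2017) lends itself to a short illustration: pre-train a 2-conv + 3-FC CNN with dropout on one epoched RSVP dataset, then fine-tune it on another with the convolutional layers frozen. The following is a minimal PyTorch sketch under those assumptions only; the layer widths, kernel shapes, Adam optimizer, learning rates, and random stand-in data are illustrative guesses, not details from the paper.

```python
# Hypothetical sketch, NOT the authors' code: layer widths, kernel sizes,
# optimizer (Adam), learning rates and random stand-in data are assumptions.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

class STCNNSketch(nn.Module):
    """2 conv + 3 FC with dropout, for 1 x 64-channel x 128-sample raw-EEG epochs."""
    def __init__(self, n_channels=64, n_samples=128, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=(n_channels, 1)),            # spatial conv across electrodes
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=(1, 11), padding=(0, 5)),   # temporal conv
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(1, 4)),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * (n_samples // 4), 128), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(128, 64), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(64, n_classes),  # logits; softmax is folded into the loss
        )

    def forward(self, x):
        return self.classifier(self.features(x))

def fit(model, loader, epochs, lr=1e-3):
    """Train only the parameters that still require gradients."""
    opt = torch.optim.Adam((p for p in model.parameters() if p.requires_grad), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for xb, yb in loader:
            opt.zero_grad()
            loss_fn(model(xb), yb).backward()
            opt.step()

def toy_loader(n=256):
    """Random tensors standing in for an epoched RSVP dataset (target/non-target)."""
    x, y = torch.randn(n, 1, 64, 128), torch.randint(0, 2, (n,))
    return DataLoader(TensorDataset(x, y), batch_size=32, shuffle=True)

model = STCNNSketch()
fit(model, toy_loader(), epochs=2)            # pre-train on the "source" dataset
for p in model.features.parameters():         # freeze conv layers before transfer
    p.requires_grad = False
fit(model, toy_loader(), epochs=2, lr=1e-4)   # fine-tune FC layers on the "target"
```

Which layers are frozen versus re-trained is the knob the row's "Analysis of learned parameters" field refers to: transferring everything versus only the convolutional stack corresponds to the cross-subject and cross-experiment configurations compared in the paper.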