Diff of /EEGModels.py [000000] .. [195f5e]

a b/EEGModels.py
"""
 ARL_EEGModels - A collection of Convolutional Neural Network models for EEG
 Signal Processing and Classification, using Keras and TensorFlow

 Requirements:
    (1) tensorflow == 2.X (as of this writing, 2.0 - 2.3 have been verified
        as working)

 To run the EEG/MEG ERP classification sample script, you will also need

    (2) mne >= 0.17.1
    (3) PyRiemann >= 0.2.5
    (4) scikit-learn >= 0.20.1
    (5) matplotlib >= 2.2.3

 To use:

    (1) Place this file in a directory on your PYTHONPATH (or on the path
        configured in your IDE, e.g. Spyder)
    (2) Import the model as

        from EEGModels import EEGNet

        model = EEGNet(nb_classes = ..., Chans = ..., Samples = ...)

    (3) Then compile and fit the model

        model.compile(loss = ..., optimizer = ..., metrics = ...)
        fitted    = model.fit(...)
        predicted = model.predict(...)

 Portions of this project are works of the United States Government and are not
 subject to domestic copyright protection under 17 USC Sec. 105.  Those
 portions are released world-wide under the terms of the Creative Commons Zero
 1.0 (CC0) license.

 Other portions of this project are subject to domestic copyright protection
 under 17 USC Sec. 105.  Those portions are licensed under the Apache 2.0
 license.  The complete text of the license governing this material is in
 the file labeled LICENSE.TXT that is a part of this project's official
 distribution.
"""

from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Activation, Permute, Dropout
from tensorflow.keras.layers import Conv2D, MaxPooling2D, AveragePooling2D
from tensorflow.keras.layers import SeparableConv2D, DepthwiseConv2D
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.layers import SpatialDropout2D
from tensorflow.keras.regularizers import l1_l2
from tensorflow.keras.layers import Input, Flatten
from tensorflow.keras.constraints import max_norm
from tensorflow.keras import backend as K


def EEGNet(nb_classes, Chans = 64, Samples = 128,
             dropoutRate = 0.5, kernLength = 64, F1 = 8,
             D = 2, F2 = 16, norm_rate = 0.25, dropoutType = 'Dropout'):
    """ Keras Implementation of EEGNet
    http://iopscience.iop.org/article/10.1088/1741-2552/aace8c/meta

    Note that this implements the newest version of EEGNet and NOT the earlier
    versions (v1 and v2 on arXiv). We strongly recommend using this
    architecture as it performs much better and has nicer properties than
    our earlier version. For example:

        1. Depthwise Convolutions to learn spatial filters within a
        temporal convolution. The use of the depth_multiplier option maps
        exactly to the number of spatial filters learned within a temporal
        filter. This matches the setup of algorithms like FBCSP which learn
        spatial filters within each filter in a filter-bank. This also limits
        the number of free parameters to fit when compared to a fully-connected
        convolution.

        2. Separable Convolutions to learn how to optimally combine spatial
        filters across temporal bands. Separable Convolutions are Depthwise
        Convolutions followed by (1x1) Pointwise Convolutions.


    While the original paper used Dropout, we found that SpatialDropout2D
    sometimes produced slightly better results for classification of ERP
    signals. However, SpatialDropout2D significantly reduced performance
    on the Oscillatory dataset (SMR, BCI-IV Dataset 2A). We recommend using
    the default Dropout in most cases.

    Assumes the input signal is sampled at 128Hz. If you want to use this model
    for any other sampling rate you will need to modify the lengths of temporal
    kernels and average pooling size in blocks 1 and 2 as needed (double the
    kernel lengths for double the sampling rate, etc). Note that we haven't
    tested the model performance with this rule so this may not work well.

    The model with default parameters gives the EEGNet-8,2 model as discussed
    in the paper. This model should do pretty well in general, although it is
    advised to do some model searching to get optimal performance on your
    particular dataset.

    We set F2 = F1 * D (number of input filters = number of output filters) for
    the SeparableConv2D layer. We haven't extensively tested other values of this
    parameter (say, F2 < F1 * D for compressed learning, and F2 > F1 * D for
    overcomplete). We believe the main parameters to focus on are F1 and D.

    Inputs:

      nb_classes      : int, number of classes to classify
      Chans, Samples  : number of channels and time points in the EEG data
      dropoutRate     : dropout fraction
      kernLength      : length of temporal convolution in first layer. We found
                        that setting this to be half the sampling rate worked
                        well in practice. For the SMR dataset in particular,
                        since the data was high-passed at 4Hz, we used a kernel
                        length of 32.
      F1, F2          : number of temporal filters (F1) and number of pointwise
                        filters (F2) to learn. Default: F1 = 8, F2 = F1 * D.
      D               : number of spatial filters to learn within each temporal
                        convolution. Default: D = 2
      norm_rate       : max-norm constraint applied to the kernel of the final
                        dense layer. Default: 0.25
      dropoutType     : Either SpatialDropout2D or Dropout, passed as a string.

    """

    if dropoutType == 'SpatialDropout2D':
        dropoutType = SpatialDropout2D
    elif dropoutType == 'Dropout':
        dropoutType = Dropout
    else:
        raise ValueError('dropoutType must be one of SpatialDropout2D '
                         'or Dropout, passed as a string.')

    input1   = Input(shape = (Chans, Samples, 1))

    ##################################################################
    # block 1: temporal convolution followed by a depthwise spatial convolution
    block1       = Conv2D(F1, (1, kernLength), padding = 'same',
                                   input_shape = (Chans, Samples, 1),
                                   use_bias = False)(input1)
    block1       = BatchNormalization()(block1)
    block1       = DepthwiseConv2D((Chans, 1), use_bias = False,
                                   depth_multiplier = D,
                                   depthwise_constraint = max_norm(1.))(block1)
    block1       = BatchNormalization()(block1)
    block1       = Activation('elu')(block1)
    block1       = AveragePooling2D((1, 4))(block1)
    block1       = dropoutType(dropoutRate)(block1)

    # block 2: separable convolution across the temporal filters
    block2       = SeparableConv2D(F2, (1, 16),
                                   use_bias = False, padding = 'same')(block1)
    block2       = BatchNormalization()(block2)
    block2       = Activation('elu')(block2)
    block2       = AveragePooling2D((1, 8))(block2)
    block2       = dropoutType(dropoutRate)(block2)

    flatten      = Flatten(name = 'flatten')(block2)

    dense        = Dense(nb_classes, name = 'dense',
                         kernel_constraint = max_norm(norm_rate))(flatten)
    softmax      = Activation('softmax', name = 'softmax')(dense)

    return Model(inputs=input1, outputs=softmax)
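
# A minimal sketch, not part of the original module, for checking the temporal
# arithmetic behind the 128Hz assumption stated in the EEGNet docstring. The
# helper name eegnet_temporal_spans and its keywords pool1, sep_kernel and
# pool2 are hypothetical; they mirror the sizes hard-coded in EEGNet as
# (1, 4), (1, 16) and (1, 8), so changing them means editing the function
# body. It only reports how much time each temporal kernel covers; it says
# nothing about classification performance.

```python
def eegnet_temporal_spans(sampling_rate, kernLength=64, pool1=4,
                          sep_kernel=16, pool2=8):
    """Report the time span covered by each temporal kernel in EEGNet.

    All sizes are in samples; sampling_rate is in Hz.
    """
    # Average pooling with stride == pool size divides the effective rate.
    rate_after_pool1 = sampling_rate / pool1
    rate_after_pool2 = rate_after_pool1 / pool2
    return {
        # the first temporal convolution runs at the raw sampling rate
        'block1_kernel_sec': kernLength / sampling_rate,
        # the separable convolution runs on the pooled signal
        'block2_rate_hz': rate_after_pool1,
        'block2_kernel_sec': sep_kernel / rate_after_pool1,
        'output_rate_hz': rate_after_pool2,
    }


defaults_128hz = eegnet_temporal_spans(128)
# One possible adaptation to 256Hz (an untested assumption): double the
# first kernel and the first pooling size, leaving block 2 unchanged.
adapted_256hz = eegnet_temporal_spans(256, kernLength=128, pool1=8)
```

# With the defaults at 128Hz both temporal kernels span 0.5 seconds. Under the
# assumed 256Hz adaptation the spans come out the same, which is one way to
# sanity-check a modification before training with it.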


def EEGNet_SSVEP(nb_classes = 12, Chans = 8, Samples = 256,
             dropoutRate = 0.5, kernLength = 256, F1 = 96,
             D = 1, F2 = 96, dropoutType = 'Dropout'):
    """ SSVEP Variant of EEGNet, as used in [1].

    Inputs:

      nb_classes      : int, number of classes to classify
      Chans, Samples  : number of channels and time points in the EEG data
      dropoutRate     : dropout fraction
      kernLength      : length of temporal convolution in first layer
      F1, F2          : number of temporal filters (F1) and number of pointwise
                        filters (F2) to learn.
      D               : number of spatial filters to learn within each temporal
                        convolution.
      dropoutType     : Either SpatialDropout2D or Dropout, passed as a string.


    [1]. Waytowich, N. et al. (2018). Compact Convolutional Neural Networks
    for Classification of Asynchronous Steady-State Visual Evoked Potentials.
    Journal of Neural Engineering vol. 15(6).
    http://iopscience.iop.org/article/10.1088/1741-2552/aae5d8

    """

    if dropoutType == 'SpatialDropout2D':
        dropoutType = SpatialDropout2D
    elif dropoutType == 'Dropout':
        dropoutType = Dropout
    else:
        raise ValueError('dropoutType must be one of SpatialDropout2D '
                         'or Dropout, passed as a string.')

    input1   = Input(shape = (Chans, Samples, 1))

    ##################################################################
    block1       = Conv2D(F1, (1, kernLength), padding = 'same',
                                   input_shape = (Chans, Samples, 1),
                                   use_bias = False)(input1)
    block1       = BatchNormalization()(block1)
    block1       = DepthwiseConv2D((Chans, 1), use_bias = False,
                                   depth_multiplier = D,
                                   depthwise_constraint = max_norm(1.))(block1)
    block1       = BatchNormalization()(block1)
    block1       = Activation('elu')(block1)
    block1       = AveragePooling2D((1, 4))(block1)
    block1       = dropoutType(dropoutRate)(block1)

    block2       = SeparableConv2D(F2, (1, 16),
                                   use_bias = False, padding = 'same')(block1)
    block2       = BatchNormalization()(block2)
    block2       = Activation('elu')(block2)
    block2       = AveragePooling2D((1, 8))(block2)
    block2       = dropoutType(dropoutRate)(block2)

    flatten      = Flatten(name = 'flatten')(block2)

    dense        = Dense(nb_classes, name = 'dense')(flatten)
    softmax      = Activation('softmax', name = 'softmax')(dense)

    return Model(inputs=input1, outputs=softmax)
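
# A minimal sketch, not part of the original module, of how many features
# reach the dense layer in EEGNet and EEGNet_SSVEP. The helper name
# eegnet_flatten_size is hypothetical. It assumes what the code above shows:
# the 'same'-padded convolutions keep the time axis, the (Chans, 1) depthwise
# convolution collapses the channel axis to 1, and the two average-pooling
# layers (sizes 4 and 8, stride equal to the pool size) floor-divide the time
# axis.

```python
def eegnet_flatten_size(Samples, F2, pool1=4, pool2=8):
    """Number of values fed to the dense layer of an EEGNet-style model."""
    time_steps = Samples // pool1      # AveragePooling2D((1, 4))
    time_steps = time_steps // pool2   # AveragePooling2D((1, 8))
    if time_steps < 1:
        raise ValueError('Samples is too short for the two pooling stages')
    return F2 * time_steps


eegnet_default = eegnet_flatten_size(Samples=128, F2=16)
ssvep_default = eegnet_flatten_size(Samples=256, F2=96)
```

# With the default arguments this gives 64 features for EEGNet and 768 for
# EEGNet_SSVEP. Note that Samples below 32 leaves nothing to flatten.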


def EEGNet_old(nb_classes, Chans = 64, Samples = 128, regRate = 0.0001,
           dropoutRate = 0.25, kernels = [(2, 32), (8, 4)], strides = (2, 4)):
    """ Keras Implementation of EEGNet_v1 (https://arxiv.org/abs/1611.08024v2)

    This model is the original EEGNet model proposed on arXiv
            https://arxiv.org/abs/1611.08024v2

    with one modification: we use striding instead of max-pooling, as this
    helped slightly in classification performance while also providing a
    computational speed-up.

    Note that we no longer recommend the use of this architecture, as the new
    version of EEGNet performs much better overall and has nicer properties.

    Inputs:

        nb_classes     : total number of final categories
        Chans, Samples : number of EEG channels and samples, respectively
        regRate        : regularization rate for L1 and L2 regularizations
        dropoutRate    : dropout fraction
        kernels        : the 2nd and 3rd layer kernel dimensions (default is
                         the [2, 32] x [8, 4] configuration)
        strides        : the stride size (note that this replaces the max-pool
                         used in the original paper)

    """

    # start the model
    # Conv2D and Permute((2, 1, 3)) both need a trailing feature axis, so the
    # input is declared as (Chans, Samples, 1) to match input_shape below.
    input_main   = Input((Chans, Samples, 1))
    layer1       = Conv2D(16, (Chans, 1), input_shape=(Chans, Samples, 1),
                                 kernel_regularizer = l1_l2(l1=regRate, l2=regRate))(input_main)
    layer1       = BatchNormalization()(layer1)
    layer1       = Activation('elu')(layer1)
    layer1       = Dropout(dropoutRate)(layer1)

    permute_dims = 2, 1, 3
    permute1     = Permute(permute_dims)(layer1)

    layer2       = Conv2D(4, kernels[0], padding = 'same',
                            kernel_regularizer=l1_l2(l1=0.0, l2=regRate),
                            strides = strides)(permute1)
    layer2       = BatchNormalization()(layer2)
    layer2       = Activation('elu')(layer2)
    layer2       = Dropout(dropoutRate)(layer2)

    layer3       = Conv2D(4, kernels[1], padding = 'same',
                            kernel_regularizer=l1_l2(l1=0.0, l2=regRate),
                            strides = strides)(layer2)
    layer3       = BatchNormalization()(layer3)
    layer3       = Activation('elu')(layer3)
    layer3       = Dropout(dropoutRate)(layer3)

    flatten      = Flatten(name = 'flatten')(layer3)

    dense        = Dense(nb_classes, name = 'dense')(flatten)
    softmax      = Activation('softmax', name = 'softmax')(dense)

    return Model(inputs=input_main, outputs=softmax)


def DeepConvNet(nb_classes, Chans = 64, Samples = 256,
                dropoutRate = 0.5):
    """ Keras implementation of the Deep Convolutional Network as described in
    Schirrmeister et al. (2017), Human Brain Mapping.

    This implementation assumes the input is a 2-second EEG signal sampled at
    128Hz, as opposed to signals sampled at 250Hz as described in the original
    paper. We also perform temporal convolutions of length (1, 5) as opposed
    to (1, 10) due to this sampling rate difference.

    Note that we use the max_norm constraint on all convolutional layers, as
    well as the classification layer. We also change the defaults for the
    BatchNormalization layer. These choices are based on a personal
    communication with the original authors.

                     ours        original paper
    pool_size        1, 2        1, 3
    strides          1, 2        1, 3
    conv filters     1, 5        1, 10

    Note that this implementation has not been verified by the original
    authors.

    """

    # start the model
    input_main   = Input((Chans, Samples, 1))
    block1       = Conv2D(25, (1, 5),
                                 input_shape=(Chans, Samples, 1),
                                 kernel_constraint = max_norm(2., axis=(0,1,2)))(input_main)
    block1       = Conv2D(25, (Chans, 1),
                                 kernel_constraint = max_norm(2., axis=(0,1,2)))(block1)
    block1       = BatchNormalization(epsilon=1e-05, momentum=0.9)(block1)
    block1       = Activation('elu')(block1)
    block1       = MaxPooling2D(pool_size=(1, 2), strides=(1, 2))(block1)
    block1       = Dropout(dropoutRate)(block1)

    block2       = Conv2D(50, (1, 5),
                                 kernel_constraint = max_norm(2., axis=(0,1,2)))(block1)
    block2       = BatchNormalization(epsilon=1e-05, momentum=0.9)(block2)
    block2       = Activation('elu')(block2)
    block2       = MaxPooling2D(pool_size=(1, 2), strides=(1, 2))(block2)
    block2       = Dropout(dropoutRate)(block2)

    block3       = Conv2D(100, (1, 5),
                                 kernel_constraint = max_norm(2., axis=(0,1,2)))(block2)
    block3       = BatchNormalization(epsilon=1e-05, momentum=0.9)(block3)
    block3       = Activation('elu')(block3)
    block3       = MaxPooling2D(pool_size=(1, 2), strides=(1, 2))(block3)
    block3       = Dropout(dropoutRate)(block3)

    block4       = Conv2D(200, (1, 5),
                                 kernel_constraint = max_norm(2., axis=(0,1,2)))(block3)
    block4       = BatchNormalization(epsilon=1e-05, momentum=0.9)(block4)
    block4       = Activation('elu')(block4)
    block4       = MaxPooling2D(pool_size=(1, 2), strides=(1, 2))(block4)
    block4       = Dropout(dropoutRate)(block4)

    flatten      = Flatten()(block4)

    dense        = Dense(nb_classes, kernel_constraint = max_norm(0.5))(flatten)
    softmax      = Activation('softmax')(dense)

    return Model(inputs=input_main, outputs=softmax)
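
# A minimal sketch, not part of the original module, that traces the length of
# the time axis through the four DeepConvNet blocks. The helper name
# deepconvnet_time_steps is hypothetical. It assumes what the code above
# shows: each block applies one (1, 5) temporal convolution with the Keras
# default 'valid' padding, followed by (1, 2) max pooling with stride 2. The
# (Chans, 1) convolution in block 1 does not change the time axis.

```python
def deepconvnet_time_steps(Samples, kernel=5, pool=2, n_blocks=4):
    """Return the time-axis length at the input and after each block."""
    steps = Samples
    trace = [steps]
    for block in range(1, n_blocks + 1):
        if steps < kernel:
            raise ValueError('input too short for the convolution in block %d' % block)
        steps = steps - kernel + 1            # 'valid' temporal convolution
        if steps < pool:
            raise ValueError('input too short for the pooling in block %d' % block)
        steps = (steps - pool) // pool + 1    # max pooling, stride == pool size
        trace.append(steps)
    return trace


trace_256 = deepconvnet_time_steps(256)
flatten_size_256 = 200 * trace_256[-1]   # 200 filters in block 4
```

# For the default Samples = 256 the time axis shrinks 256 -> 126 -> 61 -> 28
# -> 12, so the dense layer sees 200 * 12 = 2400 features. Very short inputs
# fail inside the loop, which is a cheap way to find the minimum usable
# Samples before building the model.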


# need these for ShallowConvNet
def square(x):
    return K.square(x)

def log(x):
    return K.log(K.clip(x, min_value = 1e-7, max_value = 10000))   
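
# A minimal scalar sketch, not part of the original module, of what the square
# and log helpers compute when ShallowConvNet places average pooling between
# them: the log of the mean squared value over one pooling window. The helper
# name log_power is hypothetical and it uses plain Python floats instead of
# Keras tensors. The clipping bounds are the ones used in log() above; the
# lower bound keeps the logarithm finite when a window is all zeros.

```python
import math


def log_power(window, min_value=1e-7, max_value=10000):
    """Log of the mean squared value of one pooling window."""
    mean_square = sum(v * v for v in window) / len(window)   # square, then average
    clipped = min(max(mean_square, min_value), max_value)    # same bounds as K.clip
    return math.log(clipped)


unit_power = log_power([1.0, -1.0, 1.0, -1.0])
silent = log_power([0.0, 0.0, 0.0])
```

# A window with unit mean power maps to 0.0, and an all-zero window maps to
# log(1e-7) rather than negative infinity.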


def ShallowConvNet(nb_classes, Chans = 64, Samples = 128, dropoutRate = 0.5):
    """ Keras implementation of the Shallow Convolutional Network as described
    in Schirrmeister et al. (2017), Human Brain Mapping.

    Assumes the input is a 2-second EEG signal sampled at 128Hz (that is,
    Samples = 256; the default Samples = 128 corresponds to a 1-second
    window). Note that in the original paper, they do temporal convolutions
    of length 25 for EEG data sampled at 250Hz. We instead use length 13
    since the sampling rate is roughly half of the 250Hz which the paper
    used. The pool_size and stride in later layers are also approximately
    half of what is used in the paper.

    Note that we use the max_norm constraint on all convolutional layers, as
    well as the classification layer. We also change the defaults for the
    BatchNormalization layer. These choices are based on a personal
    communication with the original authors.

                     ours        original paper
    pool_size        1, 35       1, 75
    strides          1, 7        1, 15
    conv filters     1, 13       1, 25

    Note that this implementation has not been verified by the original
    authors. We do note that this implementation reproduces the results in the
    original paper with minor deviations.
    """

    # start the model
    input_main   = Input((Chans, Samples, 1))
    block1       = Conv2D(40, (1, 13),
                                 input_shape=(Chans, Samples, 1),
                                 kernel_constraint = max_norm(2., axis=(0,1,2)))(input_main)
    block1       = Conv2D(40, (Chans, 1), use_bias=False,
                          kernel_constraint = max_norm(2., axis=(0,1,2)))(block1)
    block1       = BatchNormalization(epsilon=1e-05, momentum=0.9)(block1)
    block1       = Activation(square)(block1)
    block1       = AveragePooling2D(pool_size=(1, 35), strides=(1, 7))(block1)
    block1       = Activation(log)(block1)
    block1       = Dropout(dropoutRate)(block1)
    flatten      = Flatten()(block1)
    dense        = Dense(nb_classes, kernel_constraint = max_norm(0.5))(flatten)
    softmax      = Activation('softmax')(dense)

    return Model(inputs=input_main, outputs=softmax)
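
# A minimal sketch, not part of the original module, of how many features
# reach the dense layer in ShallowConvNet. The helper name
# shallowconvnet_flatten_size is hypothetical. It assumes what the code above
# shows: a (1, 13) temporal convolution with the Keras default 'valid'
# padding, then average pooling with pool_size (1, 35) and strides (1, 7),
# with 40 filters and the channel axis collapsed to 1.

```python
def shallowconvnet_flatten_size(Samples, n_filters=40, kernel=13,
                                pool=35, stride=7):
    """Number of values fed to the dense layer of ShallowConvNet."""
    steps = Samples - kernel + 1                 # 'valid' temporal convolution
    if steps < pool:
        raise ValueError('Samples is too short for the pooling window')
    steps = (steps - pool) // stride + 1         # overlapping average pooling
    return n_filters * steps


features_128 = shallowconvnet_flatten_size(128)
features_256 = shallowconvnet_flatten_size(256)
```

# The default Samples = 128 leaves 12 pooled time steps (480 features), while
# a 2-second window at 128Hz (Samples = 256) leaves 30 (1200 features).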