Diff of /Nyul.py [000000] .. [271336]

Switch to unified view

a b/Nyul.py
1
# -*- coding: utf-8 -*-
2
"""
3
Created on Tue Nov 08 17:45:23 2016
4
5
@author: Shreyas_V
6
"""
7
8
# Copyright (C) 2013 Oskar Maier
9
# 
10
# This program is free software: you can redistribute it and/or modify
11
# it under the terms of the GNU General Public License as published by
12
# the Free Software Foundation, either version 3 of the License, or
13
# (at your option) any later version.
14
# 
15
# This program is distributed in the hope that it will be useful,
16
# but WITHOUT ANY WARRANTY; without even the implied warranty of
17
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
18
# GNU General Public License for more details.
19
# 
20
# You should have received a copy of the GNU General Public License
21
# along with this program.  If not, see <http://www.gnu.org/licenses/>.
22
#
23
# author Oskar Maier
24
# version r0.1.2
25
# since 2013-09-04
26
# status Release
27
28
# build-in modules
29
30
# third-party modules
31
import numpy
32
from scipy.interpolate.interpolate import interp1d
33
34
# path changes
35
36
# own modules
37
38
# code
39
class IntensityRangeStandardization (object):
40
    r"""
41
    Class to standardize intensity ranges between a number of images.
42
    
43
    **Short description:**
44
    Often images containing similar objects or scenes have different intensity ranges
45
    that make it difficult to compare them manually as well as to process them
46
    further.
47
    
48
    IntensityRangeStandardization offers a way to transform a number of such images
49
    intensity ranges to a common standard intensity space without any loss of
50
    information using a multi-segment linear transformation model.
51
    
52
    Once learned, this model can be applied to other, formerly unseen images to map
53
    them to the same standard intensity space.    
54
            
55
    **Concept of similar images:**
56
    IntensityRangeStandardization is limited to similar images. Images containing
57
    different object or different compositions of objects are not suitable to be
58
    transformed to a common intensity space (and it would furthermore not make much
59
    sense).
60
    
61
    A typical application of IntensityRangeStandardization are MRI images showing the
62
    same body region. These often have different intensity ranges, even when acquired
63
    from the same patient and using the same scanner. For further processing, e.g.
64
    for training a classifier, they have to be mapped to a common intensity space.
65
    
66
    **Failure of the transformation:**
67
    The method implemented in IntensityRangeStandardization ensures that no
68
    information is lost i.e. a lossless transformation is performed. This can be
69
    assured when there exists a one-to-one mapping between the images original
70
    intensity values and their values mapped to the standard intensity space.
71
    
72
    But since the transformation model is trained on, and the standard intensity
73
    space range selected over the training images, this can not be guaranteed for all
74
    formerly unseen image. If they differ greatly from the training set images, a
75
    lossless transformation can not be assured anymore. In this case the transform()
76
    method will throw an InformationLossException.
77
    
78
    Should this happen, the model needs to be re-trained with the original training
79
    images and additionally the images which caused the failure. Since this will lead
80
    to a new intensity standard space, all already transformed images have to be
81
    processed again.
82
    
83
    **Setting the training parameters:**
84
    The method comes with a set of default parameters, that are suitable for most
85
    cases. But for some special cases, it might be better to set them on your own. Ti
86
    understand the working of the parameters, it is recommended to read the detailed
87
    method description first.
88
    
89
    **The method depends on three parameters:**
90
    
91
    cutoffp, i.e. the cut-off percentiles
92
        These are used to the define the intensity outliers, both during training and
93
        image transformation. The default values are usualy a good choice.
94
        (in [1]_ these are called the minimum and maximum percentile values pc1 and pc2 respectively)
95
    landmarkp, i.e. the landmark percentiles
96
        These percentiles define the landmark positions. The more supplied, the more
97
        exact but less general becomes the model. It is common to supply equally
98
        spaced percentiles between 0 and 100.
99
        (in [1]_ these are called the landmark locations mu_1, .., mu_l)
100
    strange, i.e. the standard intensity space range
101
        These two intensity values define roughly the standard intensity space (or
102
        common intensity space of the images; or even target intensity space) to
103
        which each images intensities are mapped. This space can be supplied, but it
104
        is usually recommended to let the method select it automatically during the
105
        training process. It is additionally possible to supply only the lower or
106
        upper range border and set the other to ''auto'', in which case the method
107
        chooses the range automatically, but not the position. 
108
        (in [1]_ these are called the minimum and maximum intensities on the standard scale of the IOI s1 resp. s2)
109
    
110
    
111
    **Details of the method:**
112
    In the following the method is described in some more detail. For even more
113
    information see [1]_.
114
         
115
    Essentially the method is based on a multi-segment linear transformation model. A
116
    standard intensity space (or common intensity space) is defined by an intensity
117
    value range ''stdrange''.
118
    During the training phase, the intensity values at certain cut-off percentiles of
119
    each image are computed and a single-segment linear mapping from them to the
120
    standard intensity space range limits created. Then the images intensity values
121
    at a number of landmark percentiles are extracted and passed to the linear
122
    mapping to be transfered roughly to the standard intensity space. The mean of all
123
    these mapped landmark intensities form the model learned.
124
      
125
    When presented with an image to transform, these images intensity values are
126
    extracted at the cut-off percentile as well as at the landmark percentile
127
    positions. This results in a number of segments. Using these and the
128
    corresponding standard intensity space range values and learned mean landmark
129
    values, a multi-segment linear transformation model is created for the image.
130
    This is then applied to the images intensity values to map them to the standard
131
    intensity space.
132
    
133
    Outliers, i.e. the images intensity values that lie outside of the cut-off
134
    percentiles, are treated separately. They are transformed like the first resp.
135
    last segmented of the transformation model. Not that this means the transformed
136
    images intensity values do not always lie inside the standard intensity space
137
    range, but are fitted as best as possible inside.
138
         
139
    Parameters
140
    ----------
141
    cutoffp : (float, float)
142
        Lower and upper cut-off percentiles to exclude outliers.
143
    landmarkp : sequence of floats
144
        List of percentiles serving as model landmarks, must lie
145
        between the cutoffp values.
146
    stdrange : string or (float, float)
147
        The range of the standard intensity space for which a
148
        transformation is learned; when set to 'auto, automatically
149
        determined from the training image upon training; it is also
150
        possible to fix either the upper or the lower border value and
151
        setting the other to 'auto'.
152
        
153
    Examples
154
    --------
155
    We have a number of similar images with varying intensity ranges. To make them
156
    comparable, we would like to transform them to a common intensity space. Thus we
157
    run:
158
    
159
        >>> from medpy.filter import IntensityRangeStandardization
160
        >>> irs = IntensityRangeStandardization()
161
        >>> trained_model, transformed_images = irs.train_transform(images)
162
        
163
    Let us assume we now obtain another, new image, that we would like to make
164
    comparable to the others. As long as it does not differ to much from these, we
165
    can simply call:
166
        
167
        >>> transformed_image = irs.transform(new_image)
168
        
169
    For many application, not all images are already available at the time of
170
    execution. It would therefore be good to be able to preserve a once trained
171
    model. The solution is to just pickle the once trained model:
172
    
173
        >>> import pickle
174
        >>> with open('my_trained_model.pkl', 'wb') as f:
175
        >>>     pickle.dump(irs, f)
176
            
177
    And load it again when required with:
178
    
179
        >>> with open('my_trained_model.pkl', 'r') as f:
180
        >>>     irs = pickle.load(f)
181
        
182
    References
183
    ----------
184
    .. [1] Nyul, L.G.; Udupa, J.K.; Xuan Zhang, "New variants of a method of MRI scale
185
       standardization," Medical Imaging, IEEE Transactions on , vol.19, no.2, pp.143-150,
186
       Feb. 2000
187
    """    
188
    
189
    # static member variables
190
    L2 = [50]
191
    """1-value landmark points model."""
192
    L3 = [25, 50, 75]
193
    """3-value landmark points model."""
194
    L4 = [10, 20, 30, 40, 50, 60, 70, 80, 90]
195
    """9-value landmark points model."""
196
    
197
    def __init__(self, cutoffp = (1, 99), landmarkp = L4, stdrange = 'auto'):
198
        # check parameters
199
        if not IntensityRangeStandardization.is_sequence(cutoffp):
200
            raise ValueError('cutoffp must be a sequence')
201
        if not 2 == len(cutoffp):
202
            raise ValueError('cutoffp must be of length 2, not {}'.format(len(cutoffp)))
203
        if not IntensityRangeStandardization.are_numbers(cutoffp):
204
            raise ValueError('cutoffp elements must be numbers')
205
        if not IntensityRangeStandardization.are_in_interval(cutoffp, 0, 100, 'included'):
206
            raise ValueError('cutoffp elements must be in [0, 100]')
207
        if not cutoffp[1] > cutoffp[0]:
208
            raise ValueError('the second element of cutoffp must be larger than the first')
209
        
210
        if not IntensityRangeStandardization.is_sequence(landmarkp):
211
            raise ValueError('landmarkp must be a sequence')
212
        if not 1 <= len(landmarkp):
213
            raise ValueError('landmarkp must be of length >= 1, not {}'.format(len(landmarkp)))
214
        if not IntensityRangeStandardization.are_numbers(landmarkp):
215
            raise ValueError('landmarkp elements must be numbers')
216
        if not IntensityRangeStandardization.are_in_interval(landmarkp, 0, 100, 'included'):
217
            raise ValueError('landmarkp elements must be in [0, 100]')
218
        if not IntensityRangeStandardization.are_in_interval(landmarkp, cutoffp[0], cutoffp[1], 'excluded'):
219
            raise ValueError('landmarkp elements must be in between the elements of cutoffp')
220
        if not len(landmarkp) == len(numpy.unique(landmarkp)):
221
            raise ValueError('landmarkp elements must be unique')
222
        
223
        if 'auto' == stdrange:
224
            stdrange = ('auto', 'auto')
225
        else:
226
            if not IntensityRangeStandardization.is_sequence(stdrange):
227
                raise ValueError('stdrange must be a sequence or \'auto\'')
228
            if not 2 == len(stdrange):
229
                raise ValueError('stdrange must be of length 2, not {}'.format(len(stdrange)))
230
            if not 'auto' in stdrange:
231
                if not IntensityRangeStandardization.are_numbers(stdrange):
232
                    raise ValueError('stdrange elements must be numbers or \'auto\'')
233
                if not stdrange[1] > stdrange[0]:
234
                    raise ValueError('the second element of stdrange must be larger than the first')
235
            elif 'auto' == stdrange[0] and not IntensityRangeStandardization.is_number(stdrange[1]):
236
                raise ValueError('stdrange elements must be numbers or \'auto\'')
237
            elif 'auto' == stdrange[1] and not IntensityRangeStandardization.is_number(stdrange[0]):
238
                raise ValueError('stdrange elements must be numbers or \'auto\'')
239
                
240
        
241
        # process parameters
242
        self.__cutoffp = IntensityRangeStandardization.to_float(cutoffp)
243
        self.__landmarkp = IntensityRangeStandardization.to_float(sorted(landmarkp))
244
        self.__stdrange = ['auto' if 'auto' == x else float(x) for x in stdrange]
245
            
246
        # initialize remaining instance parameters
247
        self.__model = None
248
        self.__sc_umins = None
249
        self.__sc_umaxs = None
250
        
251
252
    def train(self, images):
253
        r"""
254
        Train a standard intensity space and an associated transformation model.
255
        
256
        Note that the passed images should be masked to contain only the foreground.
257
        
258
        Parameters
259
        ----------
260
        images : sequence of array_likes
261
            A number of images.
262
        
263
        Returns
264
        -------
265
        IntensityRangeStandardization : IntensityRangeStandardization
266
            This instance of IntensityRangeStandardization
267
        """
268
        self.__stdrange = self.__compute_stdrange(images)
269
        
270
        lim = []
271
        for idx, i in enumerate(images):
272
            ci = numpy.array(numpy.percentile(i, self.__cutoffp))
273
            li = numpy.array(numpy.percentile(i, self.__landmarkp))
274
            ipf = interp1d(ci, self.__stdrange)
275
            lim.append(ipf(li))
276
277
            # treat single intensity accumulation error            
278
            if not len(numpy.unique(numpy.concatenate((ci, li)))) == len(ci) + len(li):
279
                raise SingleIntensityAccumulationError('Image no.{} shows an unusual single-intensity accumulation that leads to a situation where two percentile values are equal. This situation is usually caused, when the background has not been removed from the image. Another possibility would be to reduce the number of landmark percentiles landmarkp or to change their distribution.'.format(idx))
280
            
281
        self.__model = [self.__stdrange[0]] + list(numpy.mean(lim, 0)) + [self.__stdrange[1]]
282
        self.__sc_umins = [self.__stdrange[0]] + list(numpy.min(lim, 0)) + [self.__stdrange[1]]
283
        self.__sc_umaxs = [self.__stdrange[0]] + list(numpy.max(lim, 0)) + [self.__stdrange[1]]
284
            
285
        return self
286
        
287
288
    def transform(self, image, surpress_mapping_check = False):
289
        r"""
290
        Transform an images intensity values to the learned standard intensity space.
291
        
292
        Note that the passed image should be masked to contain only the foreground.
293
        
294
        The transformation is guaranteed to be lossless i.e. a one-to-one mapping between
295
        old and new intensity values exists. In cases where this does not hold, an error
296
        is thrown. This can be suppressed by setting ``surpress_mapping_check`` to 'True'.
297
        Do this only if you know what you are doing.
298
        
299
        Parameters
300
        ----------
301
        image : array_like
302
            The image to transform.
303
        surpress_mapping_check : bool
304
            Whether to ensure a lossless transformation or not.
305
        
306
        Returns
307
        -------
308
        image : ndarray
309
            The transformed image
310
        
311
        Raises
312
        -------
313
        InformationLossException 
314
            If a lossless transformation can not be ensured
315
        Exception
316
            If no model has been trained before
317
        """
318
        if None == self.__model:
319
            raise UntrainedException('Model not trained. Call train() first.')
320
        
321
        image = numpy.asarray(image)
322
        
323
        # determine image intensity values at cut-off percentiles & landmark percentiles
324
        li = numpy.percentile(image, [self.__cutoffp[0]] + self.__landmarkp + [self.__cutoffp[1]])
325
        
326
        # treat single intensity accumulation error            
327
        if not len(numpy.unique(li)) == len(li):
328
            raise SingleIntensityAccumulationError('The image shows an unusual single-intensity accumulation that leads to a situation where two percentile values are equal. This situation is usually caused, when the background has not been removed from the image. The only other possibility would be to re-train the model with a reduced number of landmark percentiles landmarkp or a changed distribution.')
329
        
330
        # create linear mapping models for the percentile segments to the learned standard intensity space  
331
        ipf = interp1d(li, self.__model, bounds_error = False)
332
        
333
        # transform the input image intensity values
334
        output = ipf(image)
335
        
336
        # treat image intensity values outside of the cut-off percentiles range separately
337
        llm = IntensityRangeStandardization.linear_model(li[:2], self.__model[:2])
338
        rlm = IntensityRangeStandardization.linear_model(li[-2:], self.__model[-2:])
339
        
340
        output[image < li[0]] = llm(image[image < li[0]])
341
        output[image > li[-1]] = rlm(image[image > li[-1]])
342
        
343
        if not surpress_mapping_check and not self.__check_mapping(li):
344
            raise InformationLossException('Image can not be transformed to the learned standard intensity space without loss of information. Please re-train.')
345
        
346
        return output
347
    
348
349
    def train_transform(self, images, surpress_mapping_check = False):
350
        r"""
351
        See also
352
        --------
353
        train, transform
354
        """
355
        ret = self.train(images)
356
        outputs = [self.transform(i, surpress_mapping_check) for i in images]
357
        return ret, outputs
358
    
359
360
    @property
361
    def stdrange(self):
362
        """Get the set resp. learned standard intensity range."""
363
        return self.__stdrange
364
    
365
366
    @property
367
    def cutoffp(self):
368
        """Get the cut-off percentiles."""
369
        return self.__cutoffp
370
    
371
372
    @property
373
    def landmarkp(self):
374
        """Get the landmark percentiles."""
375
        return self.__landmarkp
376
    
377
378
    @property
379
    def model(self):
380
        """Get the model (the learned percentile values)."""
381
        return self.__model
382
    
383
384
    def __compute_stdrange(self, images):
385
        r"""
386
        Computes a common standard intensity range over a number of images.
387
        
388
        Depending on the settings of the internal self.__stdrange variable,
389
        either (1) the already fixed values are returned, (2) a complete standard
390
        intensity range is computed from the supplied images, (3) an intensity range
391
        fixed at the lower end or (4) an intensity range fixed at the upper end is
392
        returned.
393
        
394
        Takes into account the maximum length of each percentile segment over all
395
        images, then adds a security margin defined by the highest variability among
396
        all segments over all images.
397
        
398
        Be
399
        
400
        .. math::
401
        
402
            L = (cop_l, lp_1, lp_2, ..., lp_n, cop_u)
403
        
404
        the set formed by the two cut-off percentiles :math:`cop_l` and :math:`cop_u` and the
405
        landmark percentiles :math:`lp_1, ..., lp_n`. The corresponding intensity values of
406
        an image :math:`i\in I` are then
407
        
408
        .. math::
409
            
410
            V_i = (v_{i,1}, v_{i,2}, ..., v_{i,n+2})
411
        
412
        The distance between each of these intensity values forms a segment along the
413
        images :math:`i` intensity range denoted as
414
        
415
        ..math ::
416
            
417
            S_i = (s_{i,1}, s_{i,2}, ..., s_{i, n+1})
418
        
419
        The common standard intensity range :math:`sir` over the set of images :math:`I` is
420
        then defined as
421
        
422
        ..math ::
423
            sir = \sum_{l=1}^{n+1}\max_{i=1}^I s_{i,l} * \max_{l=1}^{n+1} \left(\frac{\max_{i=1}^I s_{i,l}}{\min_{i=1}^I s_{i,l}}\right)
424
        
425
        Parameters
426
        ----------
427
        images : sequence of array_like
428
            A number of images.
429
            
430
        Returns
431
        -------
432
        stdrange : (float, float)
433
            The borders of the computed standard intensity range.
434
        """
435
        if not 'auto' in self.__stdrange:
436
            return self.__stdrange
437
        
438
        copl, copu = self.__cutoffp
439
        
440
        # collect cutoff + landmark percentile segments and image mean intensity values 
441
        s = []
442
        m = []
443
        for idx, i in enumerate(images):
444
            li = numpy.percentile(i, [copl] + self.__landmarkp + [copu])
445
            
446
            s.append(numpy.asarray(li)[1:] - numpy.asarray(li)[:-1])
447
            m.append(i.mean())
448
            
449
            # treat single intensity accumulation error
450
            if 0 in s[-1]:
451
                raise SingleIntensityAccumulationError('Image no.{} shows an unusual single-intensity accumulation that leads to a situation where two percentile values are equal. This situation is usually caused, when the background has not been removed from the image. Another possibility would be to reduce the number of landmark percentiles landmarkp or to change their distribution.'.format(idx))
452
            
453
        # select the maximum and minimum of each percentile segment over all images
454
        maxs = numpy.max(s, 0)
455
        mins = numpy.min(s, 0)
456
        
457
        # divide them pairwise
458
        divs = numpy.divide(numpy.asarray(maxs, dtype=numpy.float), mins) 
459
        
460
        # compute interval range according to generalized theorem 2 of [1]
461
        intv = numpy.sum(maxs) + numpy.max(divs)
462
        
463
        # compute mean intensity value over all images (assuming equal size)
464
        im = numpy.mean(m)
465
        
466
        # return interval with borders according to settings
467
        if 'auto' == self.__stdrange[0] and 'auto' == self.__stdrange[1]:
468
            return im - intv / 2, im + intv / 2
469
        elif 'auto' == self.__stdrange[0]:
470
            return self.__stdrange[1] - intv, self.__stdrange[1]
471
        else:
472
            return self.__stdrange[0], self.__stdrange[0] + intv
473
        
474
    def __check_mapping(self, landmarks):
475
        """
476
        Checks whether the image, from which the supplied landmarks were extracted, can
477
        be transformed to the learned standard intensity space without loss of
478
        information.
479
        """
480
        sc_udiff = numpy.asarray(self.__sc_umaxs)[1:] - numpy.asarray(self.__sc_umins)[:-1]
481
        l_diff = numpy.asarray(landmarks)[1:] - numpy.asarray(landmarks)[:-1]
482
        return numpy.all(sc_udiff > numpy.asarray(l_diff))
483
        
484
    @staticmethod
485
    def is_sequence(arg):
486
        """
487
        Checks via its hidden attribute whether the passed argument is a sequence (but
488
        excluding strings).
489
        
490
        Credits to Steve R. Hastings a.k.a steveha @ http://stackoverflow.com
491
        """
492
        return (not hasattr(arg, "strip") and
493
            hasattr(arg, "__getitem__") or
494
            hasattr(arg, "__iter__"))
495
        
496
497
    @staticmethod
498
    def is_number(arg):
499
        """
500
        Checks whether the passed argument is a valid number or not.
501
        """
502
        import numbers
503
        return isinstance(arg, numbers.Number)
504
    
505
506
    @staticmethod
507
    def are_numbers(arg):
508
        """
509
        Checks whether all elements in a sequence are valid numbers.
510
        """
511
        return numpy.all([IntensityRangeStandardization.is_number(x) for x in arg])
512
    
513
514
    @staticmethod
515
    def is_in_interval(n, l, r, border = 'included'):
516
        """
517
        Checks whether a number is inside the interval l, r.
518
        """
519
        if 'included' == border:
520
            return (n >= l) and (n <= r)
521
        elif 'excluded' == border:
522
            return (n > l) and (n < r)
523
        else:
524
            raise ValueError('borders must be either \'included\' or \'excluded\'')
525
        
526
527
    @staticmethod
528
    def are_in_interval(s, l, r, border = 'included'):
529
        """
530
        Checks whether all number in the sequence s lie inside the interval formed by
531
        l and r.
532
        """
533
        return numpy.all([IntensityRangeStandardization.is_in_interval(x, l, r, border) for x in s])
534
    
535
536
    @staticmethod
537
    def to_float(s):
538
        """
539
        Cast a sequences elements to float numbers.
540
        """
541
        return [float(x) for x in s]
542
    
543
544
    @staticmethod
545
    def linear_model((x1, x2), (y1, y2)):
546
        """
547
        Returns a linear model transformation function fitted on the two supplied points.
548
        y = m*x + b
549
        Note: Assumes that slope > 0, otherwise division through zero might occur.
550
        """
551
        m = (y2 - y1) / (x2 - x1)
552
        b = y1 - (m * x1)
553
        return lambda x: m * x + b
554
    
555
556
class SingleIntensityAccumulationError(Exception):
557
    """
558
    Thrown when an image shows an unusual single-intensity peaks which would obstruct
559
    both, training and transformation.
560
    """
561
    
562
class InformationLossException(Exception):
563
    """
564
    Thrown when a transformation can not be guaranteed to be lossless.
565
    """
566
    pass
567
    
568
class UntrainedException(Exception):
569
    """
570
    Thrown when a transformation is attempted before training.
571
    """
572
    pass