In addition to classification tasks, Slideflow also supports training and deploying whole-slide tissue segmentation models. Segmentation models identify and label regions of interest in a slide, and can be used for tasks such as tumor identification, tissue labeling, or quality control. Once trained, these models can be used for :ref:`slide QC <filtering>`, generating :ref:`regions of interest <regions_of_interest>`, or live deployment in :ref:`Slideflow Studio <studio>`.
Note
Tissue segmentation requires PyTorch. Dependencies can be installed with pip install slideflow[torch].
Tissue segmentation is performed at the whole-slide level, trained on randomly cropped sections of the slide thumbnail at a specified resolution. Slideflow supports three segmentation modes:
Note
Segmentation thumbnails and masks do not need to be explicitly exported prior to training. They will be generated automatically during training if they do not exist. However, exporting them beforehand can be useful for data visualization, troubleshooting, and computational efficiency.
Segmentation models in Slideflow are trained on regions of interest, which can be generated as discussed in :ref:`regions_of_interest` and :ref:`studio_roi`. Once ROIs have been generated and (optionally) labeled, whole-slide thumbnails and ROI masks can be exported using segment.export_thumbs_and_masks(). The mpp argument specifies the resolution of the exported images in microns-per-pixel. We recommend mpp=20 for a good balance between image size and memory requirements, or mpp=10 for tasks needing higher resolution.
from slideflow import segment # Load a project and dataset project = slideflow.load_project('path/to/project') dataset = project.dataset() # Export thumbnails and masks segment.export_thumbs_and_masks( dataset, mpp=20, # Microns-per-pixel resolution dest='path/to/output' )
By default, ROIs are exported as binary masks. To export multidimensional masks for multiclass or multilabel applications, use the mode and labels arguments. When mode is 'multiclass' or 'multilabel', masks will be exported in (N, W, H) format, where N is the number of unique ROI labels. The labels argument should be a list of strings corresponding to the ROI labels in the dataset that should be included.
... # Export thumbnails and masks segment.export_thumbs_and_masks( dataset, mpp=20, # Microns-per-pixel resolution dest='path/to/output', mode='multiclass', labels=['tumor', 'stroma', 'necrosis'] )
Segmentation models are configured using a :class:`segment.SegmentConfig` object. This object specifies the model architecture, image resolution (MPP), training parameters, and other settings. For example, to configure a model for multiclass segmentation with a resolution of 20 MPP, use:
from slideflow import segment # Create a config object config = segment.SegmentConfig( mpp=20, # Microns-per-pixel resolution size=1024, # Size of cropped/rotated images during training mode='multiclass', labels=['tumor', 'stroma', 'necrosis'], arch='Unet', encoder_name='resnet34', train_batch_size=16, epochs=10, lr=1e-4, )
Slideflow uses the segmentation_models_pytorch library to implement segmentation models. The arch argument specifies the model architecture, and the encoder_name argument specifies the encoder backbone. See available models and encoders in the segmentation_models_pytorch documentation.
The segmentation model can then be trained using the :func:`segment.train` function. This function takes a :class:`segment.SegmentConfig` object and a :class:`slideflow.Dataset` object as arguments. During training, segmentation thumbnails and masks are randomly cropped to the specified size, and images/masks then undergo augmentation with random flipping/rotating.
For example, to train a model for binary segmentation with a resolution of 20 MPP, use:
from slideflow import segment # Create a config object config = segment.SegmentConfig(mpp=20, mode='binary', arch='FPN') # Train the model segment.train(config, dataset, dest='path/to/output')
To use thumbnails and masks previously exported with :func:`segment.export_thumbs_and_masks`, specify the path to the exported data using the data_source argument. This is more computationally efficient than generating data on-the-fly during training. For example:
from slideflow import segment # Export thumbnails and masks segment.export_thumbs_and_masks(dataset, mpp=20, dest='masks/') # Create a config object config = segment.SegmentConfig(mpp=20, mode='binary', arch='FPN') # Train the model segment.train(config, dataset, data_source='masks/', dest='path/to/output')
After training, the model will be saved as a model.pth file in the destination directory specified by dest, and the model configuration will be saved as a segment_config.json file.
After training, models can be loaded using :func:`segment.load_model_and_config`. This function takes a path to a model file as an argument, and returns a tuple containing the model and configuration object. For example:
from slideflow import segment # Load the model and config model, config = segment.load_model_and_config('path/to/model.pth')
To run inference on a slide, use the :meth:`segment.SegmentModel.run_slide_inference` method. This method takes a :class:`slideflow.WSI` object or str (path to slide) as an argument, and returns an array of pixel-level predictions. For binary models, the output shape will be (H, W). For multiclass models, the output shape will be (N+1, H, W) (the first channel is predicted background), and for multilabel models, the output shape will be (N, H, W), where N is the number of labels.
from slideflow import segment # Load the model and config model, config = segment.load_model_and_config('path/to/model.pth') # Run inference, returning an np.ndarray pred = model.run_slide_inference('/path/to/slide')
You can also run inference directly on an arbitrary image using the :meth:`segment.SegmentModel.run_tiled_inference` method. This method takes an image array (np.ndarray, in W, H, C format) as an argument, and returns an array of pixel-level predictions. Predictions are generated in tiles and merged. The output shape will be (H, W) for binary models, (N+1, H, W) for multiclass models, and (N, H, W) for multilabel models.
The :class:`slideflow.slide.qc.Segment` class provides an easy interface for generating QC masks from a segmentation model. This class takes a path to a trained segmentation model as an argument, and can be used for QC :ref:`as previously described <filtering>`. For example:
import slideflow as sf from slideflow.slide import qc # Load a project and dataset project = sf.load_project('path/to/project') dataset = project.dataset(299, 302) # Create a QC mask segmenter = qc.Segment('/path/to/model.pth') # Extract tiles with this QC dataset.extract_tiles(..., qc=segmenter)
You can also use this interface for applying QC to a single slide:
import slideflow as sf from slideflow.slide import qc # Load the slide wsi = sf.WSI('/path/to/slide', ...) # Create the QC algorithm segmenter = qc.Segment('/path/to/model.pth') # Apply QC applied_mask = wsi.qc(segmenter)
For binary models, the QC mask will filter out tiles that are predicted to be background.
For multiclass models, the QC mask will filter out tiles predicted to be background (class index 0). This can be customized by setting class_idx to another value. For example, to create a QC algorithm that filters out tiles predicted to be tumor (class index 1), use:
segmenter = qc.Segment('/path/to/model.pth', class_idx=1)
For multilabel models, the QC mask will filter out tiles predicted to be background for all class labels. This can be customized to filter out tiles based only on a specific class label by setting class_idx. For example, to create a QC algorithm that filters out tiles that are not predicted to be tumor (class index 1) while ignoring predictions for necrosis (class index 2), use:
segmenter = qc.Segment('/path/to/model.pth', class_idx=1)
In all cases, the thresholding direction can be reversed with by setting threshold_direction='greater'. This might be useful, for example, if the segmentation model was trained to identify pen marks or artifacts, and you want to filter out areas predicted to be artifacts.
segmenter = qc.Segment('/path/to/model.pth', threshold_direction='greater')
The :class:`slideflow.slide.qc.Segment` also provides an easy interface for generating regions of interest (ROIs). Use :meth:`slideflow.slide.qc.Segment.generate_rois` method to generate and apply ROIs to a slide. If the segmentation model is multiclass or multilabel, generated ROIs will be labeled. For example:
import slideflow as sf from slideflow.slide import qc # Load a project and dataset wsi = sf.WSI('/path/to/slide', ...) # Create a QC mask segmenter = qc.Segment('/path/to/model.pth') # Generate and apply ROIs to a slide roi_outlines = segmenter.generate_rois(wsi)
By default, this will apply generated ROIs directly to the :class:`slideflow.WSI` object. If you wish to calculate ROI outlines without applying them to the slide, use the argument apply=False.
In addition to generating ROIs for a single slide, you can also generate ROIs for an entire dataset using :meth:`slideflow.Dataset.generate_rois`. For example:
import slideflow as sf # Load a project and dataset. project = sf.load_project('path/to/project') dataset = project.dataset() # Generate ROIs for all slides in the dataset. dataset.generate_rois('path/to/model.pth')
ROIs will be saved in the ROIs directory as configured in the dataset settings. Alternatively, ROIs can be exported to a user-defined directory using the dest argument.
By default, ROIs will be generated for all slides in the dataset, skipping slides with existing ROIs. To overwrite any existing ROIs, use the overwrite=True argument.
Segmentation models can be deployed in :ref:`Slideflow Studio <studio>` for live segmentation and QC. To do this, start by training a segmentation model as described above. Then, see the :ref:`studio_segmentation` documentation for instructions on how to deploy the model for live QC and/or ROI generation.
Create labeled ROIs as described in :ref:`studio_roi`.
import slideflow as sf from slideflow import segment # Load a project and dataset project = sf.load_project('path/to/project') dataset = project.dataset() # Train a binary segmentation model config = segment.SegmentConfig(mpp=20, mode='binary', arch='FPN') segment.train(config, dataset, dest='path/to/output')
import slideflow as sf # Load a project and dataset. project = sf.load_project('path/to/project') dataset = project.dataset() # Generate ROIs for all slides in the dataset. dataset.generate_rois('path/to/model.pth')
Use the model for either QC or ROI generation in Slideflow Studio, as described in :ref:`studio_segmentation`.