<div align="center">
  <img src="https://github.com/user-attachments/assets/53d5c1f8-8fbc-4e0f-bd62-db16797492b0" alt="slideflow logo">

  [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.5703792.svg)](https://doi.org/10.5281/zenodo.5703792)
  [![Python application](https://github.com/slideflow/slideflow/actions/workflows/python-app.yml/badge.svg?branch=master)](https://github.com/slideflow/slideflow/actions/workflows/python-app.yml)
  [![PyPI version](https://badge.fury.io/py/slideflow.svg)](https://badge.fury.io/py/slideflow)

  [ArXiv](https://arxiv.org/abs/2304.04142) | [Docs](https://slideflow.dev) | [Slideflow Studio](https://slideflow.dev/studio/) | [Cite](#reference) | [✨ What's New in 3.0 ✨](https://github.com/slideflow/slideflow/releases/tag/3.0.0)

  ______________________________________________________________________

  ![Slideflow Studio: a visualization tool for interacting with models and whole-slide images.](https://github.com/slideflow/slideflow/assets/48372806/7f43d8cb-dc80-427d-84c4-3e5a35fa1472)

</div>

**Slideflow is a deep learning library for digital pathology, offering a user-friendly interface for model development.**

Designed for both medical researchers and AI enthusiasts, Slideflow aims to provide an accessible, easy-to-use interface for developing state-of-the-art pathology models. Slideflow has been built with the future in mind, offering a scalable platform for digital biomarker development that bridges the gap between ever-evolving, sophisticated methods and the needs of clinical researchers. For developers, Slideflow provides multiple endpoints for integration with other packages and external training paradigms, allowing you to leverage highly optimized, pathology-specific processes alongside the latest ML methodologies.

## 🚀 Features
- Easy-to-use, highly customizable training pipelines
- Robust **[slide processing](https://slideflow.dev/slide_processing) and [stain normalization](https://slideflow.dev/norm)** toolkit
- Support for training with **[weakly-supervised](https://slideflow.dev/training) or [strongly-supervised](https://slideflow.dev/tile_labels)** labels
- Built-in, state-of-the-art **[foundation models](https://slideflow.dev/features)**
- **[Multiple-instance learning (MIL)](https://slideflow.dev/mil)**
- **[Self-supervised learning (SSL)](https://slideflow.dev/ssl)**
- **[Generative adversarial networks (GANs)](https://slideflow.dev/training)**
- **Explainability tools**: [Heatmaps](https://slideflow.dev/evaluation/#heatmaps), [mosaic maps](https://slideflow.dev/posthoc/#mosaic-maps), [saliency maps](https://slideflow.dev/saliency/), [synthetic histology](https://slideflow.dev/stylegan)
- Robust **[layer activation analysis](https://slideflow.dev/posthoc)** tools
- **[Uncertainty quantification](https://slideflow.dev/uq)**
- **[Interactive user interface](https://slideflow.dev/studio)** for model deployment
- ... and more!

Full documentation with example tutorials can be found at [slideflow.dev](https://www.slideflow.dev/).

## Requirements
- Python >= 3.7 (<3.10 if using [cuCIM](https://docs.rapids.ai/api/cucim/stable/))
- [PyTorch](https://pytorch.org/) >= 1.9 _or_ [Tensorflow](https://www.tensorflow.org/) 2.5-2.11

### Optional
- [Libvips](https://libvips.github.io/libvips/) >= 8.9 (alternative slide reader, adds support for *.scn, *.mrxs, *.ndpi, *.vms, and *.vmu files).
- Linear solver (for preserved-site cross-validation)
  - [CPLEX](https://www.ibm.com/docs/en/icos/12.10.0?topic=v12100-installing-cplex-optimization-studio) 20.1.0 with [Python API](https://www.ibm.com/docs/en/icos/12.10.0?topic=cplex-setting-up-python-api)
  - _or_ [Pyomo](http://www.pyomo.org/installation) with [Bonmin](https://anaconda.org/conda-forge/coinbonmin) solver

## 📥 Installation
Slideflow can be installed via pip, run as a Docker container, or installed from source.

### Method 1: Install via pip

```
pip3 install --upgrade setuptools pip wheel
pip3 install slideflow[cucim] cupy-cuda11x
```

The `cupy` package name depends on the installed CUDA version; [see here](https://docs.cupy.dev/en/stable/install.html#installing-cupy) for installation instructions. `cupy` is not required if using Libvips.

### Method 2: Docker image

Alternatively, pre-configured [Docker images](https://hub.docker.com/repository/docker/jamesdolezal/slideflow) are available with OpenSlide/Libvips and the latest version of either Tensorflow or PyTorch. To install with the Tensorflow backend:

```
docker pull jamesdolezal/slideflow:latest-tf
docker run -it --gpus all jamesdolezal/slideflow:latest-tf
```

To install with the PyTorch backend:

```
docker pull jamesdolezal/slideflow:latest-torch
docker run -it --shm-size=2g --gpus all jamesdolezal/slideflow:latest-torch
```

### Method 3: From source

To run from source, clone this repository, install the conda development environment, and install the package in editable mode:

```
git clone https://github.com/slideflow/slideflow
conda env create -f slideflow/environment.yml
conda activate slideflow
pip install -e slideflow/ cupy-cuda11x
```

### Non-Commercial Add-ons

To add additional tools and pretrained models available under a non-commercial license, install `slideflow-gpl` and `slideflow-noncommercial`:

```
pip install slideflow-gpl slideflow-noncommercial
```

This will provide integrated access to 6 additional pretrained foundation models ([UNI](https://www.nature.com/articles/s41591-024-02857-3), [HistoSSL](https://www.medrxiv.org/content/10.1101/2023.07.21.23292757v2.full.pdf), [GigaPath](https://aka.ms/gigapath), [PLIP](https://www.nature.com/articles/s41591-023-02504-3), [RetCCL](https://www.sciencedirect.com/science/article/abs/pii/S1361841522002730), and [CTransPath](https://www.sciencedirect.com/science/article/abs/pii/S1361841522002043)), the MIL architecture [CLAM](https://www.nature.com/articles/s41551-020-00682-w), the UQ algorithm [BISCUIT](https://www.nature.com/articles/s41467-022-34025-x), and the GAN framework [StyleGAN3](https://nvlabs-fi-cdn.nvidia.com/stylegan3/stylegan3-paper.pdf).

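As an illustration of how these add-ons are typically used, the sketch below builds one of the added foundation models as a tile-level feature extractor and exports feature bags for MIL training. This is a minimal sketch assuming the feature-extraction API described in the Slideflow documentation (`sf.build_feature_extractor` and `Project.generate_feature_bags`); the project and output paths are hypothetical, and some model weights require a separate download or license acceptance.

```python
import slideflow as sf

# Load an existing project (hypothetical path).
P = sf.load_project('/project/path')

# Tile size must match the extracted tiles (see "Extract tiles from slides" below).
dataset = P.dataset(tile_px=299, tile_um=302)

# Build one of the add-on foundation models as a tile-level feature extractor.
ctranspath = sf.build_feature_extractor('ctranspath', tile_px=299)

# Export tile-level feature bags for multiple-instance learning (hypothetical path).
P.generate_feature_bags(ctranspath, dataset, outdir='/bags/directory')
```
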
## ⚙️ Configuration

### Deep learning (PyTorch vs. Tensorflow)

Slideflow supports both PyTorch and Tensorflow, defaulting to PyTorch if both are available. You can specify which backend to use with the environment variable `SF_BACKEND`. For example:

```
export SF_BACKEND=tensorflow
```

### Slide reading (cuCIM vs. Libvips)

By default, Slideflow reads whole-slide images using [cuCIM](https://docs.rapids.ai/api/cucim/stable/). Although much faster than OpenSlide-based readers, it supports fewer slide scanner formats. Slideflow also includes a [Libvips](https://libvips.github.io/libvips/) backend, which adds support for *.scn, *.mrxs, *.ndpi, *.vms, and *.vmu files. You can set the active slide backend with the environment variable `SF_SLIDE_BACKEND`:

```
export SF_SLIDE_BACKEND=libvips
```

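Both variables can also be set from within Python, which can be convenient in notebooks. The sketch below assumes the environment variables are read when Slideflow is imported, so they must be set beforehand; the shell `export` commands above remain the documented approach.

```python
import os

# Set backends before importing slideflow (assumption: the variables are
# read at import time, so they must be set first).
os.environ['SF_BACKEND'] = 'tensorflow'     # deep learning backend
os.environ['SF_SLIDE_BACKEND'] = 'libvips'  # slide-reading backend

import slideflow as sf
```
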
## Getting started
Slideflow experiments are organized into [Projects](https://slideflow.dev/project_setup), which manage storage of whole-slide images, extracted tiles, and patient-level annotations. The fastest way to get started is to use one of our preconfigured projects, which will automatically download slides from the Genomic Data Commons:

```python
import slideflow as sf

P = sf.create_project(
    root='/project/destination',
    cfg=sf.project.LungAdenoSquam(),
    download=True
)
```

After the slides have been downloaded and verified, you can skip to [Extract tiles from slides](#extract-tiles-from-slides).

Alternatively, to create a new custom project, supply the location of patient-level annotations (CSV), slides, and a destination for saving TFRecords:

```python
import slideflow as sf

P = sf.create_project(
  '/project/path',
  annotations="/patient/annotations.csv",
  slides="/slides/directory",
  tfrecords="/tfrecords/directory"
)
```

Ensure that the annotations file has a `slide` column containing, for each entry, the filename (without extension) of the corresponding slide.

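For reference, a minimal annotations file might look like the sketch below. Only the `slide` column requirement is stated above; the `patient` and `category1` columns are illustrative assumptions based on a typical Slideflow project, with `category1` used as the training outcome later in this README.

```python
import pandas as pd

# Illustrative annotations file (hypothetical values). The 'slide' column must
# contain slide filenames without extension; 'patient' and 'category1' are
# example columns, with 'category1' later used as the outcome label.
annotations = pd.DataFrame({
    'patient':   ['patient1', 'patient2'],
    'slide':     ['patient1-slide', 'patient2-slide'],
    'category1': ['class_a', 'class_b'],
})
annotations.to_csv('/patient/annotations.csv', index=False)
```
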
### Extract tiles from slides

Next, whole-slide images are segmented into smaller image tiles and saved in `*.tfrecords` format. [Extract tiles](https://slideflow.dev/slide_processing) from slides at a given magnification (tile width in microns) and resolution (tile width in pixels) using `sf.Project.extract_tiles()`:

```python
P.extract_tiles(
  tile_px=299,  # Tile size, in pixels
  tile_um=302   # Tile size, in microns
)
```

If slides are on a network drive or a spinning HDD, tile extraction can be accelerated by buffering slides to an SSD or ramdisk:

```python
P.extract_tiles(
  ...,
  buffer="/mnt/ramdisk"
)
```

### Training models

Once tiles are extracted, models can be [trained](https://slideflow.dev/training). Start by configuring a set of [hyperparameters](https://slideflow.dev/model#modelparams):

```python
params = sf.ModelParams(
  tile_px=299,
  tile_um=302,
  batch_size=32,
  model='xception',
  learning_rate=0.0001,
  ...
)
```

Models can then be trained with these parameters to categorical, multi-categorical, continuous, or time-series outcomes, and the training process is [highly configurable](https://slideflow.dev/training). For example, to train models in cross-validation to predict the outcome `'category1'` as stored in the project annotations file:

```python
P.train(
  'category1',
  params=params,
  save_predictions=True,
  multi_gpu=True
)
```

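Cross-validation behavior can also be adjusted when calling `train()`. The sketch below is an assumption-laden example: the `val_strategy`, `val_k_fold`, and `filters` arguments follow the training configuration described in the Slideflow documentation, and the `cohort` filter column and its values are hypothetical.

```python
P.train(
  'category1',
  params=params,
  val_strategy='k-fold',             # validation strategy (assumed parameter name)
  val_k_fold=3,                      # number of cross-validation folds
  filters={'cohort': ['discovery']}, # hypothetical annotation column/values to restrict slides
  save_predictions=True
)
```
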
### Evaluation, heatmaps, mosaic maps, and more

Slideflow offers a host of additional tools, including model [evaluation and prediction](https://slideflow.dev/evaluation), [heatmaps](https://slideflow.dev/evaluation#heatmaps), analysis of [layer activations](https://slideflow.dev/posthoc), [mosaic maps](https://slideflow.dev/posthoc#mosaic-maps), and more. See our [full documentation](https://slideflow.dev) for more details and tutorials.

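As a brief illustration, the sketch below evaluates a trained model and generates prediction heatmaps. It assumes the `Project.evaluate()` and `Project.generate_heatmaps()` methods described in the Slideflow documentation; the model path is hypothetical (saved models are normally written to the project's `models/` directory).

```python
# Hypothetical path to a model saved during cross-validation training.
model_path = '/project/path/models/00001-category1/'

# Evaluate the trained model on held-out slides and save predictions.
P.evaluate(model=model_path, outcomes='category1')

# Generate whole-slide prediction heatmaps for visual review.
P.generate_heatmaps(model=model_path)
```
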
## 📚 Publications

Slideflow has been used by:

- [Dolezal et al](https://www.nature.com/articles/s41379-020-00724-3), _Modern Pathology_, 2020
- [Rosenberg et al](https://ascopubs.org/doi/10.1200/JCO.2020.38.15_suppl.e23529), _Journal of Clinical Oncology_ [abstract], 2020
- [Howard et al](https://www.nature.com/articles/s41467-021-24698-1), _Nature Communications_, 2021
- [Dolezal et al](https://www.nature.com/articles/s41467-022-34025-x), _Nature Communications_, 2022
- [Storozuk et al](https://www.nature.com/articles/s41379-022-01039-1.pdf), _Modern Pathology_ [abstract], 2022
- [Partin et al](https://doi.org/10.3389/fmed.2023.1058919), _Frontiers in Medicine_, 2022
- [Dolezal et al](https://ascopubs.org/doi/abs/10.1200/JCO.2022.40.16_suppl.8549), _Journal of Clinical Oncology_ [abstract], 2022
- [Dolezal et al](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9792820/), _Mediastinum_ [abstract], 2022
- [Howard et al](https://www.nature.com/articles/s41523-023-00530-5), _npj Breast Cancer_, 2023
- [Dolezal et al](https://www.nature.com/articles/s41698-023-00399-4), _npj Precision Oncology_, 2023
- [Hieromnimon et al](https://doi.org/10.1101/2023.03.22.533810), _bioRxiv_ [preprint], 2023
- [Carrillo-Perez et al](https://doi.org/10.1186/s40644-023-00586-3), _Cancer Imaging_, 2023

## 🔓 License
This code is made available under the Apache-2.0 license.

## 🔗 Reference

If you find our work useful for your research, or if you use parts of this code, please consider citing as follows:

Dolezal, J.M., Kochanny, S., Dyer, E. et al. Slideflow: deep learning for digital histopathology with real-time whole-slide visualization. BMC Bioinformatics 25, 134 (2024). https://doi.org/10.1186/s12859-024-05758-x

```
@Article{Dolezal2024,
    author={Dolezal, James M. and Kochanny, Sara and Dyer, Emma and Ramesh, Siddhi and Srisuwananukorn, Andrew and Sacco, Matteo and Howard, Frederick M. and Li, Anran and Mohan, Prajval and Pearson, Alexander T.},
    title={Slideflow: deep learning for digital histopathology with real-time whole-slide visualization},
    journal={BMC Bioinformatics},
    year={2024},
    month={Mar},
    day={27},
    volume={25},
    number={1},
    pages={134},
    doi={10.1186/s12859-024-05758-x},
    url={https://doi.org/10.1186/s12859-024-05758-x}
}
```