Thank you for considering contributing to DOSMA
and encouraging open-source tools for medical image analysis.
If you use DOSMA in your work (research, company, etc.) and find it useful, spread the word!
This guide is inspired by Hugging Face Transformers.
There are many ways to contribute to DOSMA:
Please do your best to follow these guidelines when opening an issue. Doing so makes it significantly easier to give useful feedback and resolve the issue faster.
We would very much appreciate it if you could make sure the bug was not already reported (use the search bar on GitHub under Issues). If you cannot find your bug, follow the instructions in the Bug Report template.
Great! Please open an issue and provide the following information:
If you are willing to contribute the algorithm yourself, let us know so we can best guide you.
A world-class feature request addresses the following points:
If your issue is well written, we're already 80% of the way there by the time you post it. Follow the instructions in the Feature Request template.
Before writing code, we strongly advise you to search through the existing PRs or issues to make sure that nobody is already working on the same thing. If you are unsure, it is always a good idea to open an issue to get some feedback.
You will need basic git proficiency to be able to contribute to DOSMA. git is not the easiest tool to use, but it has the greatest manual. Type git --help in a shell and enjoy. If you prefer books, Pro Git is a very good reference.
Follow these steps to start contributing:
Fork the repository by clicking on the 'Fork' button on the repository's page. This creates a copy of the code under your GitHub user account.
Clone your fork to your local disk, and add the base repository as a remote:
$ git clone git@github.com:<your Github handle>/DOSMA.git
$ cd DOSMA
$ git remote add upstream https://github.com/ad12/DOSMA.git
Create a new branch to hold your development changes:
$ git checkout -b a-descriptive-name-for-my-changes
Do not work on the master branch.
Install DOSMA in editable mode along with its development dependencies:
pip install -e ".[dev]"
Develop features on your branch.
As you work on the features, you should make sure that the test suite passes:
$ make test
After you make changes, autoformat them with:
$ make autoformat
If you modify documentation (docs/source), verify that the documentation builds:
$ make build-docs
Once you're happy with your changes, add changed files using git add and make a commit with git commit to record your changes locally:
$ git add modified_file.py
$ git commit
Please write good commit messages.
It is a good idea to sync your copy of the code with the original
repository regularly. This way you can quickly account for changes:
$ git fetch upstream
$ git rebase upstream/master
Push the changes to your account using:
$ git push -u origin a-descriptive-name-for-my-changes
Once you are satisfied (and the checklist below is happy too), go to the
webpage of your fork on GitHub. Click on 'Pull request' to send your changes
to the project maintainers for review.
It's ok if maintainers ask you for changes. It happens to core contributors
too! So everyone can see the changes in the Pull request, work in your local
branch and push the changes to your fork. They will automatically appear in
the pull request.
HINT: Run all major formatting and checks using the following:
make autoformat test build-docs
[WIP]
Library tests can be found in the tests folder.
From the root of the repository, here's how to run tests with pytest for the library:
$ make test
dosma follows the Google style for documentation.
DOSMA offers a range of interfaces to help developers get started with open-sourcing their algorithms.
Two interfaces that have been increasingly popular are:
- ScanSequences: An interface for implementing scan-specific algorithms (e.g. quantitative MRI)
- AI toolbox: An interface for distributing and using pretrained models.
NOTE: The guides below are slightly outdated. Please reach out to us for up-to-date instructions on how to proceed.
Thank you for adding analysis support for new scan sequences! Please follow the instructions below to streamline adding and using these sequences.
- qDESS: Quantitative Double Echo in Steady-State
- MAPSS: Magnetization-Prepared Angle-Modulated Partitioned k-Space Spoiled Gradient Echo Snapshots
Create a new file in the scan_sequences folder with your sequence name (use snake_casing).
Create a class in that file that inherits from dosma.scan_sequences.ScanSequence.
In many quantitative sequences, multiple echoes are acquired for each slice to perform some form of voxel-wise quantitative fitting/extrapolation. We define a volume as a 3D matrix with values from a single echo. Therefore, a qDESS sequence, which has two echoes, has two volumes.
Each scan sequence implementation has an instance variable called volumes, in which the total pool of DICOM files is split into its respective volumes. For qDESS, the volumes instance variable would be a list with len(volumes) == 2. Sequences that encode a single echo will have a volumes list with len(volumes) == 1.
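To make the idea of volumes concrete, here is a minimal sketch of grouping slices by echo. It uses plain dictionaries as stand-ins for DICOM headers, a made-up `slice_index` field, and a hypothetical helper named `split_by_tag`; it is not DOSMA's actual loading code.

```python
from collections import defaultdict


def split_by_tag(slices, tag="EchoNumbers"):
    """Group slice records into volumes keyed by the value of `tag`.

    `slices` is a list of dicts standing in for DICOM headers (an assumption
    for illustration; real code would read these values from DICOM files).
    """
    groups = defaultdict(list)
    for record in slices:
        groups[record[tag]].append(record)
    # Order volumes by tag value, and slices within a volume by position.
    return [
        sorted(groups[key], key=lambda record: record["slice_index"])
        for key in sorted(groups)
    ]


# A toy two-echo series: 3 slice positions, 2 echoes per position.
slices = [
    {"EchoNumbers": echo, "slice_index": index}
    for index in range(3)
    for echo in (1, 2)
]
volumes = split_by_tag(slices)
print(len(volumes))               # 2
print([len(v) for v in volumes])  # [3, 3]
```

Each entry of `volumes` holds the slices of one echo, which matches the two-volume layout described for qDESS.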
By default, all scan sequences are split by the EchoNumbers DICOM tag, which specifies which echo the current DICOM slice corresponds to. If the volumes of your scan sequence need to be split by a different DICOM tag, override the __DEFAULT_SPLIT_BY__ field in the scan sequence class.
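The override pattern can be sketched with stand-in classes. `BaseSequence`, `InversionRecoverySequence`, and `split_tag` are hypothetical names, and `InversionTime` is only a tag chosen for illustration; the `__DEFAULT_SPLIT_BY__` attribute name is the only part taken from this guide.

```python
class BaseSequence:
    """Hypothetical stand-in for a scan sequence base class."""

    __DEFAULT_SPLIT_BY__ = "EchoNumbers"

    def split_tag(self):
        # Look the attribute up through the instance so subclass overrides win.
        return self.__DEFAULT_SPLIT_BY__


class InversionRecoverySequence(BaseSequence):
    """Hypothetical sequence whose volumes differ by inversion time."""

    __DEFAULT_SPLIT_BY__ = "InversionTime"


print(BaseSequence().split_tag())               # EchoNumbers
print(InversionRecoverySequence().split_tag())  # InversionTime
```

Because the attribute name ends with two underscores, Python does not apply private name mangling to it, so a plain class-level assignment in the subclass is enough.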
Any scan that supports quantitative parameter fitting should have a method named generate_<QuantitativeValue>_map (e.g. generate_t1_rho_map).
def generate_<QuantitativeValue>_map(self, tissue: Tissue, ...) -> QuantitativeValue:
    ...
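One practical benefit of a fixed naming scheme is that the supported quantitative values can be discovered from method names alone. The sketch below illustrates this with a hypothetical `ToySequence` class and a hypothetical `available_maps` helper; neither is part of DOSMA.

```python
import re


class ToySequence:
    """Hypothetical scan class following the generate_<QuantitativeValue>_map naming."""

    def generate_t2_map(self, tissue):
        return f"T2 map for {tissue}"

    def generate_t1_rho_map(self, tissue):
        return f"T1-rho map for {tissue}"

    def segment(self):
        return "not a map generator"


def available_maps(scan):
    """Return the quantitative value names a scan exposes, based on method names."""
    pattern = re.compile(r"^generate_(\w+)_map$")
    names = []
    for attr in dir(scan):
        match = pattern.match(attr)
        if match and callable(getattr(scan, attr)):
            names.append(match.group(1))
    return sorted(names)


scan = ToySequence()
print(available_maps(scan))  # ['t1_rho', 't2']
```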
This template defines the basic implementation for any new scan sequence:
from dosma.scan_sequences import ScanSequence  # import path as referenced in this guide


class NewScanSequence(ScanSequence):  # or TargetSequence / NonTargetSequence
    NAME = ''  # add name of sequence in snake casing here
    __DEFAULT_SPLIT_BY__ = 'EchoNumbers'  # DICOM tag to split volumes by. default: 'EchoNumbers'

    def __validate_scan__(self) -> bool:
        """Validate this scan (usually done by checking DICOM header tags, if available).

        :return: a boolean
        """
Robust automatic segmentation methods are critical to eliminating the bottleneck for morphological and quantitative analysis. The DOSMA framework enables easy integration of automatic segmentation techniques.
Typically, deep learning segmentation algorithms consist of four blocks during inference:
1. Data preprocessing: Data is typically preprocessed to fit the distribution expected by the network.
- e.g., zero-mean and unit-standard-deviation whitening, scaling, etc.
2. Architecture: Each network can have a unique architecture (U-Net, SegNet, etc.). These architectures can be hard-coded into the file itself or loaded from a JSON format (as output by Keras).
3. Weights: The model parameters, which determine the weights and biases for different layers of the network, can be exported to an h5 file and loaded dynamically.
4. Mask post-processing: Any post-processing you wish to apply to the probability/binarized output.
- e.g., Conditional Random Fields (CRFs), etc.
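The four blocks can be sketched end to end on a toy example. This is an illustration under strong assumptions: a flat list of intensities stands in for an image, `toy_network` is a fixed logistic function standing in for a trained architecture with loaded weights, and a simple threshold stands in for post-processing. All function names are hypothetical.

```python
import math
import statistics


def preprocess(intensities):
    """Block 1: zero-mean, unit-standard-deviation whitening."""
    mean = statistics.fmean(intensities)
    std = statistics.pstdev(intensities)
    if std == 0:
        return [0.0 for _ in intensities]
    return [(value - mean) / std for value in intensities]


def toy_network(inputs, weight=1.0, bias=0.0):
    """Blocks 2 and 3 stand-in: a logistic unit with fixed 'weights'."""
    return [1.0 / (1.0 + math.exp(-(weight * x + bias))) for x in inputs]


def postprocess(probabilities, threshold=0.5):
    """Block 4: binarize the probability output."""
    return [1 if p > threshold else 0 for p in probabilities]


intensities = [10.0, 12.0, 50.0, 55.0]
mask = postprocess(toy_network(preprocess(intensities)))
print(mask)  # [0, 0, 1, 1]
```

Keeping the stages as separate functions mirrors the separation above: the preprocessing and post-processing can be swapped without touching the architecture or its weights.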
All weight files should contain the aliases of the tissues that they can segment and must end with the extension .h5. For example, a weights file saved for femoral cartilage segmentation should have the alias fc in its name (e.g. fc.h5, oai-unet-fc.h5, etc.).
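A quick way to check a filename against this convention is sketched below. The helper `weights_file_matches` is hypothetical, and treating aliases as tokens separated by `-`, `_`, or `.` is an assumption made for illustration; the guide only requires that the alias appear in the name.

```python
import re
from pathlib import Path


def weights_file_matches(path, tissue_alias):
    """Check that a weights file ends in .h5 and names the tissue alias."""
    candidate = Path(path)
    if candidate.suffix != ".h5":
        return False
    # Assumption: aliases appear as tokens separated by '-', '_' or '.'.
    tokens = re.split(r"[-_.]", candidate.stem)
    return tissue_alias in tokens


print(weights_file_matches("fc.h5", "fc"))             # True
print(weights_file_matches("oai-unet-fc.h5", "fc"))    # True
print(weights_file_matches("oai-unet-fc.hdf5", "fc"))  # False
print(weights_file_matches("oai-unet.h5", "fc"))       # False
```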
Weight files must be shared. Currently, there is no centralized location where these files can be hosted. As a result, please host the files in the cloud (Google Drive, Box, Dropbox, etc.) and allow public download.
This template defines the basic implementation for any new Keras segmentation model.
See models/oaiunet2d.py for an example.
class KerasSegModel(SegModel):
    """
    Abstract wrapper for Keras model used for semantic segmentation
    """

    def __load_keras_model__(self, input_shape):
        """Build Keras architecture

        :param input_shape: tuple or list of tuples for initializing input(s) into Keras model
        :return: a Keras model
        """

    def generate_mask(self, volume: MedicalVolume):
        """Segment the MRI volume

        :param volume: A Medical Volume (height, width, slices)
        :return: A Medical Volume or list of Medical Volumes with volume as binarized (0,1) uint8 3D numpy array of shape volume.shape
        :raise ValueError if volume is not a 3D numpy array
        :raise ValueError if tissue is not a string or not in the list of permitted tissues
        """