|
a/README.md |
|
b/README.md |
# Mowgli: Multi Omics Wasserstein inteGrative anaLysIs
[](https://github.com/gjhuizing/Mowgli/actions/workflows/main.yml)
[](https://codecov.io/gh/cantinilab/Mowgli)
[](https://mowgli.readthedocs.io/en/latest/?badge=latest)
[](https://img.shields.io/pypi/v/mowgli?color=blue)
[](https://github.com/psf/black)
[](https://zenodo.org/badge/latestdoi/391909874)

Mowgli is a novel method for the integration of paired multi-omics data with any type and number of omics, combining integrative Nonnegative Matrix Factorization and Optimal Transport. [Read the paper!](https://www.nature.com/articles/s41467-023-43019-2)

## Install the package

Mowgli is implemented as a Python package seamlessly integrated within the scverse ecosystem, in particular Muon and Scanpy.

### via PyPI (recommended)

On all operating systems, the easiest way to install Mowgli is via PyPI. Installation should typically take a minute and is continuously tested with Python 3.10 on an Ubuntu virtual machine.

```bash
pip install mowgli
```

### via GitHub (development version)

```bash
git clone git@github.com:cantinilab/Mowgli.git
pip install ./Mowgli/
```

### Test your installation (optional)

```bash
pytest .
```

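If Mowgli was installed from PyPI, there may be no cloned repository to run `pytest` in. As a lighter sanity check, the sketch below only verifies that the current Python environment can find the package and reports its installed version. The helper `describe_install` is a hypothetical name used for illustration, not part of Mowgli, and it relies only on the standard library.

```python
from importlib import metadata, util


def describe_install(package: str) -> str:
    """Return a one-line status for `package` (hypothetical helper, not part of Mowgli)."""
    # find_spec returns None when a top-level module cannot be located.
    if util.find_spec(package) is None:
        return f"{package}: not installed"
    try:
        # Look up the version of the installed distribution with the same name.
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = "unknown version"
    return f"{package}: installed ({version})"


print(describe_install("mowgli"))
```

This check does not exercise any Mowgli functionality; the test suite remains the more thorough option.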
## Getting started

Mowgli takes a Muon object as input and populates its `obsm` and `uns` fields with the embeddings and dictionaries, respectively. Visit [mowgli.rtfd.io](https://mowgli.rtfd.io/) for more documentation and tutorials.

You may download a preprocessed 10X Multiome demo dataset [here](https://figshare.com/s/4c8e72cbb188d8e1cce8).

A GPU is not required for small datasets, but is strongly recommended above 1,000 cells. On CPU, the [cell lines demo](https://mowgli.readthedocs.io/en/latest/vignettes/Liu%20cell%20lines.html) (206 cells) should run in under 5 minutes and the [PBMC demo](https://mowgli.readthedocs.io/en/latest/vignettes/PBMC.html) (500 cells) should run in under 10 minutes (tested on an Ubuntu 20.04 machine with an 11th gen i7 processor).

```python
import mowgli
import mudata as md
import scanpy as sc

# Load data into a Muon object.
mdata = md.read_h5mu("my_data.h5mu")

# Initialize and train the model.
model = mowgli.models.MowgliModel(latent_dim=15)
model.train(mdata)

# Visualize the embedding with UMAP.
sc.pp.neighbors(mdata, use_rep="W_OT")
sc.tl.umap(mdata)
sc.pl.umap(mdata)
```

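The GPU guidance in this section can be turned into a small helper. This is a sketch under assumptions: `choose_device` and `GPU_RECOMMENDED_ABOVE` are hypothetical names, the 1,000-cell threshold is the one quoted in this README, and CUDA availability is passed in explicitly so that the snippet runs without PyTorch installed.

```python
import warnings

# Threshold quoted in this README: a GPU is strongly recommended above 1,000 cells.
GPU_RECOMMENDED_ABOVE = 1_000


def choose_device(n_cells: int, cuda_available: bool) -> str:
    """Pick "cuda" when a GPU is available, else "cpu" (hypothetical helper)."""
    if cuda_available:
        return "cuda"
    if n_cells > GPU_RECOMMENDED_ABOVE:
        # No GPU on a dataset above the recommended size: warn, but still run on CPU.
        warnings.warn(
            f"{n_cells} cells without a GPU: training may be slow.",
            stacklevel=2,
        )
    return "cpu"


# The 500-cell PBMC demo is below the threshold, so CPU is a reasonable choice.
print(choose_device(n_cells=500, cuda_available=False))
```

With PyTorch installed, `cuda_available` could come from `torch.cuda.is_available()` and `n_cells` from `mdata.n_obs`. Passing the result on as `model.train(mdata, device=...)` assumes that `train` accepts a `device` argument; check the documentation of your installed version before relying on it.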
## Publication

```bibtex
@article{huizing2023paired,
  title={Paired single-cell multi-omics data integration with Mowgli},
  author={Huizing, Geert-Jan and Deutschmann, Ina Maria and Peyr{\'e}, Gabriel and Cantini, Laura},
  journal={Nature Communications},
  volume={14},
  number={1},
  pages={7711},
  year={2023},
  publisher={Nature Publishing Group UK London}
}
```

If you're looking for the repository with code to reproduce the experiments in our preprint, [here it is!](https://github.com/cantinilab/mowgli_reproducibility)