|
a/README.md |
|
b/README.md |
1 |
[](https://pypi.org/project/sc-libra/) |
1 |
[](https://pypi.org/project/sc-libra/)
|
2 |
[](https://pepy.tech/project/sc_libra) |
2 |
[](https://pepy.tech/project/sc_libra)
|
3 |
[](https://sc-libra.readthedocs.io/en/latest/) |
3 |
[](https://sc-libra.readthedocs.io/en/latest/)
|
4 |
[](https://doi.org/10.1101/2021.01.27.428400) |
4 |
[](https://doi.org/10.1101/2021.01.27.428400)
|
5 |
[](http://dx.doi.org/10.6084/m9.figshare.19466246) |
5 |
[](http://dx.doi.org/10.6084/m9.figshare.19466246) |
6 |
|
6 |
|
7 |
LIBRA - Machine Translation between paired <img src="gaf/figures/LIBRA_icon_2.png" width="181px" align="right" /> |
7 |
LIBRA - Machine Translation between paired <img src="https://github.com/TranslationalBioinformaticsUnit/LIBRA/blob/main/gaf/figures/LIBRA_icon_2.png?raw=true" width="181px" align="right" />
|
8 |
Single-Cell Multi-Omics Data |
8 |
Single-Cell Multi-Omics Data
|
9 |
=========== |
9 |
===========
|
10 |
This repository contains the [LIBRA code](https://github.com/TranslationalBioinformaticsUnit/LIBRA/blob/main/code_snapshots/) and [online data](#datasets) used for the single-cell multi-omics integration and prediction analyses in the [LIBRA manuscript](https://www.biorxiv.org/content/10.1101/2021.01.27.428400v2). [LIBRA metrics](https://github.com/TranslationalBioinformaticsUnit/LIBRA/blob/main/code_snapshots/R/LIBRA_code/) are also available for quantifying output quality, as well as the novel PPJI preservation measurement. The [Seurat code](https://github.com/TranslationalBioinformaticsUnit/LIBRA/blob/main/code_snapshots/R/Seurat_code/) used to analyze the LIBRA input omics and to run the clustering and visualization pipelines is also provided. |
10 |
This repository contains the [LIBRA code](https://github.com/TranslationalBioinformaticsUnit/LIBRA/blob/main/code_snapshots/) and [online data](#datasets) used for the single-cell multi-omics integration and prediction analyses in the [LIBRA manuscript](https://www.biorxiv.org/content/10.1101/2021.01.27.428400v2). [LIBRA metrics](https://github.com/TranslationalBioinformaticsUnit/LIBRA/blob/main/code_snapshots/R/LIBRA_code/) are also available for quantifying output quality, as well as the novel PPJI preservation measurement. The [Seurat code](https://github.com/TranslationalBioinformaticsUnit/LIBRA/blob/main/code_snapshots/R/Seurat_code/) used to analyze the LIBRA input omics and to run the clustering and visualization pipelines is also provided. |
11 |
|
11 |
|
12 |
The Python package [sc-Libra](https://pypi.org/project/sc-libra/) was developed to extend and condense the developer code used in the paper into a user-friendly version, and is freely available in the PyPI repository. Read the online package [documentation](https://sc-libra.readthedocs.io/en/latest/) for a detailed description and guidelines. |
12 |
The Python package [sc-Libra](https://pypi.org/project/sc-libra/) was developed to extend and condense the developer code used in the paper into a user-friendly version, and is freely available in the PyPI repository. Read the online package [documentation](https://sc-libra.readthedocs.io/en/latest/) for a detailed description and guidelines. |
13 |
|
13 |
|
14 |
- [Summary](#summary) |
14 |
- [Summary](#summary)
|
15 |
- [Installation](#installation) |
15 |
- [Installation](#installation)
|
16 |
- [Datasets](#datasets) |
16 |
- [Datasets](#datasets)
|
17 |
- [Usage](#usage) |
17 |
- [Usage](#usage)
|
18 |
- [Material of interest](#material-of-interest) |
18 |
- [Material of interest](#material-of-interest) |
19 |
|
19 |
|
20 |
# Summary |
20 |
# Summary
|
21 |
LIBRA is a deep learning model designed for single-cell multi-omics integration and prediction. It uses an unbalanced autoencoder that learns a shared low-dimensional embedding from both omics of an experiment, combining each sample's uniqueness to generate a representation of the integrated data that is enriched with respect to the original, independently analyzed data. The tool was first developed in [R code](https://github.com/TranslationalBioinformaticsUnit/LIBRA/blob/main/code_snapshots/R/LIBRA_code/); a code snapshot is provided for R users. Next, the adaptive LIBRA (aLIBRA) tool was developed to parallelize the training of LIBRA models, using a grid structure to select optimal hyperparameters automatically, which saves users considerable time by removing the need to tune them manually. A code snapshot is provided in [Python code](https://github.com/TranslationalBioinformaticsUnit/LIBRA/blob/main/code_snapshots/Python/LIBRA_fine_tune_code/) for conceptual understanding. |
21 |
LIBRA is a deep learning model designed for single-cell multi-omics integration and prediction. It uses an unbalanced autoencoder that learns a shared low-dimensional embedding from both omics of an experiment, combining each sample's uniqueness to generate a representation of the integrated data that is enriched with respect to the original, independently analyzed data. The tool was first developed in [R code](https://github.com/TranslationalBioinformaticsUnit/LIBRA/blob/main/code_snapshots/R/LIBRA_code/); a code snapshot is provided for R users. Next, the adaptive LIBRA (aLIBRA) tool was developed to parallelize the training of LIBRA models, using a grid structure to select optimal hyperparameters automatically, which saves users considerable time by removing the need to tune them manually. A code snapshot is provided in [Python code](https://github.com/TranslationalBioinformaticsUnit/LIBRA/blob/main/code_snapshots/Python/LIBRA_fine_tune_code/) for conceptual understanding. |
22 |
|
22 |
|
23 |
Building on this raw developer code, the [sc-Libra](https://pypi.org/project/sc-libra/) package is provided as a ready-to-use resource for running the proposed pipeline. |
23 |
Building on this raw developer code, the [sc-Libra](https://pypi.org/project/sc-libra/) package is provided as a ready-to-use resource for running the proposed pipeline. |
24 |
|
24 |
|
25 |
For further details, please refer to the [online manuscript](https://www.biorxiv.org/content/10.1101/2021.01.27.428400v2), currently in the bioRxiv repository (will be updated asap). |
25 |
For further details, please refer to the [online manuscript](https://www.biorxiv.org/content/10.1101/2021.01.27.428400v2), currently in the bioRxiv repository (will be updated asap). |
26 |
|
26 |
|
27 |
# Installation |
27 |
# Installation
|
28 |
|
28 |
|
29 |
To run the sc-Libra pipeline, the following setup is required: |
29 |
To run the sc-Libra pipeline, the following setup is required:
|
30 |
- Install Python **>=3.7.0**. |
30 |
- Install Python **>=3.7.0**.
|
31 |
- Install R **>=3.5.2**. |
31 |
- Install R **>=3.5.2**.
|
32 |
- Install the sc-libra Python package: |
32 |
- Install the sc-libra Python package:
|
33 |
``` |
33 |
```
|
34 |
$ pip install sc_libra |
34 |
$ pip install sc_libra
|
35 |
``` |
35 |
``` |
36 |
|
36 |
|
37 |
For a step-by-step guide, follow the online [documentation](https://sc-libra.readthedocs.io/en/latest/). |
37 |
For a step-by-step guide, follow the online [documentation](https://sc-libra.readthedocs.io/en/latest/). |
38 |
|
38 |
|
39 |
# Datasets |
39 |
# Datasets
|
40 |
The [NeurIPS](https://openproblems.bio/neurips_2021/)-provided dataset for LIBRA testing is available in the figshare repository and can be downloaded [here](https://figshare.com/s/d7ad0c6b8285e75de40f). |
40 |
The [NeurIPS](https://openproblems.bio/neurips_2021/)-provided dataset for LIBRA testing is available in the figshare repository and can be downloaded [here](https://figshare.com/s/d7ad0c6b8285e75de40f). |
41 |
|
41 |
|
42 |
The following datasets contain only the sparse versions, without cell/feature identity; see the corresponding author references for the original datasets. |
42 |
The following datasets contain only the sparse versions, without cell/feature identity; see the corresponding author references for the original datasets.
|
43 |
| LIBRA name | GSE link | Modalities | Technology | Genomic ref used | Download sparse matrix | |
43 |
| LIBRA name | GSE link | Modalities | Technology | Genomic ref used | Download sparse matrix |
|
44 |
| :---: | :---: | :---: | :---: | :---: | :---: | |
44 |
| :---: | :---: | :---: | :---: | :---: | :---: |
|
45 |
| DataSet1 | [GSE126074](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE126074) | scRNAseq + scATACseq | SNARE-seq | [Mus_musculus.GRCm38 Ver: 3.0.0](https://support.10xgenomics.com/single-cell-gene-expression/software/release-notes/build#) | [RNA](https://figshare.com/s/c9b87f4ac1d1c030e128) and [ATAC](https://figshare.com/s/9ff9ea93a2108478bb36) | |
45 |
| DataSet1 | [GSE126074](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE126074) | scRNAseq + scATACseq | SNARE-seq | [Mus_musculus.GRCm38 Ver: 3.0.0](https://support.10xgenomics.com/single-cell-gene-expression/software/release-notes/build#) | [RNA](https://figshare.com/s/c9b87f4ac1d1c030e128) and [ATAC](https://figshare.com/s/9ff9ea93a2108478bb36) |
|
46 |
| DataSet2 | [GSE128639](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi) | scRNAseq + scADT | CITE-seq | [Homo_sapiens.GRCh38 Ver: 3.0.0](https://support.10xgenomics.com/single-cell-gene-expression/software/release-notes/build#) | [RNA](https://figshare.com/s/5f5cfa6fda4ae3512c0d) and [ADT](https://figshare.com/s/5e34cd80455398855ad8) | |
46 |
| DataSet2 | [GSE128639](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi) | scRNAseq + scADT | CITE-seq | [Homo_sapiens.GRCh38 Ver: 3.0.0](https://support.10xgenomics.com/single-cell-gene-expression/software/release-notes/build#) | [RNA](https://figshare.com/s/5f5cfa6fda4ae3512c0d) and [ADT](https://figshare.com/s/5e34cd80455398855ad8) |
|
47 |
| DataSet3 | [GSE130399](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi) | scRNAseq + scATACseq | Paired-seq | [Mus_musculus.GRCm38 Ver: 3.0.0](https://support.10xgenomics.com/single-cell-gene-expression/software/release-notes/build#) | [RNA](https://figshare.com/s/a1f4a5ef0735d1b4167d) and [ATAC](https://figshare.com/s/80d9b9d84ada526668a6) | |
47 |
| DataSet3 | [GSE130399](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi) | scRNAseq + scATACseq | Paired-seq | [Mus_musculus.GRCm38 Ver: 3.0.0](https://support.10xgenomics.com/single-cell-gene-expression/software/release-notes/build#) | [RNA](https://figshare.com/s/a1f4a5ef0735d1b4167d) and [ATAC](https://figshare.com/s/80d9b9d84ada526668a6) |
|
48 |
| DataSet4 | [GSE140203](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi) | scRNAseq + scATACseq | SHARE-seq | [Mus_musculus.GRCm38 Ver: 3.0.0](https://support.10xgenomics.com/single-cell-gene-expression/software/release-notes/build#) | [RNA](https://figshare.com/s/71312a335649b04972b8) and [ATAC](https://figshare.com/s/0b581450cd6e1f8fb64c) | |
48 |
| DataSet4 | [GSE140203](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi) | scRNAseq + scATACseq | SHARE-seq | [Mus_musculus.GRCm38 Ver: 3.0.0](https://support.10xgenomics.com/single-cell-gene-expression/software/release-notes/build#) | [RNA](https://figshare.com/s/71312a335649b04972b8) and [ATAC](https://figshare.com/s/0b581450cd6e1f8fb64c) |
|
49 |
| DataSet5 | [10X Genomics](https://support.10xgenomics.com/single-cell-multiome-atac-gex/datasets/1.0.0/pbmc_granulocyte_sorted_10k) | scRNAseq + scATACseq | 10X multiome | [Homo_sapiens.GRCh38 Ver: 3.0.0](https://support.10xgenomics.com/single-cell-gene-expression/software/release-notes/build#) | [RNA](https://figshare.com/s/90b237227f0cc07d075d) and [ATAC](https://figshare.com/s/4086bce6032f6a206a13) | |
49 |
| DataSet5 | [10X Genomics](https://support.10xgenomics.com/single-cell-multiome-atac-gex/datasets/1.0.0/pbmc_granulocyte_sorted_10k) | scRNAseq + scATACseq | 10X multiome | [Homo_sapiens.GRCh38 Ver: 3.0.0](https://support.10xgenomics.com/single-cell-gene-expression/software/release-notes/build#) | [RNA](https://figshare.com/s/90b237227f0cc07d075d) and [ATAC](https://figshare.com/s/4086bce6032f6a206a13) |
|
50 |
| DataSet6 | [GSE194122](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi) | scRNAseq + scATACseq | 10X multiome | [Homo_sapiens.GRCh38 Ver: 3.0.0](https://support.10xgenomics.com/single-cell-gene-expression/software/release-notes/build#) | [RNA](https://figshare.com/s/134562c3ec74a3a50c84) and [ATAC](https://figshare.com/s/378a630ec9c6ddadf4f5) | |
50 |
| DataSet6 | [GSE194122](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi) | scRNAseq + scATACseq | 10X multiome | [Homo_sapiens.GRCh38 Ver: 3.0.0](https://support.10xgenomics.com/single-cell-gene-expression/software/release-notes/build#) | [RNA](https://figshare.com/s/134562c3ec74a3a50c84) and [ATAC](https://figshare.com/s/378a630ec9c6ddadf4f5) |
|
51 |
| DataSet7 | [GSE194122](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi) | scRNAseq + scADT | CITE-seq | [Homo_sapiens.GRCh38 Ver: 3.0.0](https://support.10xgenomics.com/single-cell-gene-expression/software/release-notes/build#) | [RNA](https://figshare.com/s/41bdbfe7479e9729c800) and [ADT](https://figshare.com/s/975cd8a5bbc57c8d2c8c) | |
51 |
| DataSet7 | [GSE194122](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi) | scRNAseq + scADT | CITE-seq | [Homo_sapiens.GRCh38 Ver: 3.0.0](https://support.10xgenomics.com/single-cell-gene-expression/software/release-notes/build#) | [RNA](https://figshare.com/s/41bdbfe7479e9729c800) and [ADT](https://figshare.com/s/975cd8a5bbc57c8d2c8c) |
|
52 |
| DataSet8 | [GSE109262](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi) | scRNAseq + scATACseq | scNMT-seq | [Mus_musculus.GRCm38 Ver: 3.0.0](https://support.10xgenomics.com/single-cell-gene-expression/software/release-notes/build#) | [RNA](https://figshare.com/s/4a158e6d243bcd45a171) and [ATAC](https://figshare.com/s/f13136b52f3b387d1a66) | |
52 |
| DataSet8 | [GSE109262](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi) | scRNAseq + scATACseq | scNMT-seq | [Mus_musculus.GRCm38 Ver: 3.0.0](https://support.10xgenomics.com/single-cell-gene-expression/software/release-notes/build#) | [RNA](https://figshare.com/s/4a158e6d243bcd45a171) and [ATAC](https://figshare.com/s/f13136b52f3b387d1a66) | |
53 |
|
53 |
|
54 |
# Usage |
54 |
# Usage |
55 |
|
55 |
|
56 |
- The easiest way to run a LIBRA analysis is through the [sc-Libra](https://pypi.org/project/sc-libra/) Python package. |
56 |
- The easiest way to run a LIBRA analysis is through the [sc-Libra](https://pypi.org/project/sc-libra/) Python package.
|
57 |
- The package [documentation](https://sc-libra.readthedocs.io/en/latest/) is available online on the "Read the Docs" platform. |
57 |
- The package [documentation](https://sc-libra.readthedocs.io/en/latest/) is available online on the "Read the Docs" platform. |
58 |
|
58 |
|
59 |
# Material of interest |
59 |
# Material of interest |
60 |
|
60 |
|
61 |
### LIBRA benchmarking comparison: |
61 |
### LIBRA benchmarking comparison:
|
62 |
To validate LIBRA's performance, we compared it against other tools: |
62 |
To validate LIBRA's performance, we compared it against other tools: |
63 |
|
63 |
|
64 |
- Integration performance was compared against the published/available tool [BABEL](https://github.com/wukevin/babel). |
64 |
- Integration performance was compared against the published/available tool [BABEL](https://github.com/wukevin/babel). |
65 |
|
65 |
|
66 |
- Prediction performance was compared against the published/available tools [Seurat3](https://satijalab.org/seurat/articles/integration_mapping.html), [Seurat4](https://github.com/satijalab/seurat), [MOFA+](https://biofam.github.io/MOFA2/index.html), [totalVI](https://github.com/YosefLab/scvi-tools), [BABEL](https://github.com/wukevin/babel), [multiVI](https://github.com/scverse/scvi-tutorials/blob/master/MultiVI_tutorial.ipynb) and [multigrate](https://github.com/theislab/multigrate). |
66 |
- Prediction performance was compared against the published/available tools [Seurat3](https://satijalab.org/seurat/articles/integration_mapping.html), [Seurat4](https://github.com/satijalab/seurat), [MOFA+](https://biofam.github.io/MOFA2/index.html), [totalVI](https://github.com/YosefLab/scvi-tools), [BABEL](https://github.com/wukevin/babel), [multiVI](https://github.com/scverse/scvi-tutorials/blob/master/MultiVI_tutorial.ipynb) and [multigrate](https://github.com/theislab/multigrate). |
67 |
|
67 |
|
68 |
**Further details are provided in the supplementary material of the [LIBRA manuscript](https://www.biorxiv.org/content/10.1101/2021.01.27.428400v1).** |
68 |
**Further details are provided in the supplementary material of the [LIBRA manuscript](https://www.biorxiv.org/content/10.1101/2021.01.27.428400v1).** |
69 |
|
69 |
|
70 |
### LIBRA visual workflow: |
70 |
### LIBRA visual workflow:
|
71 |
 |
71 |
 |
72 |
|
72 |
|