Information of our data is described in the Method section of the manuscript and the supplementary materials. All the data are also publicly available from https://github.com/jaydu1/VITAE/tree/master/data.
Our Python package is available on PyPI and the user can install the CPU version with the following command:
>> pip install pyvitae==2.0.1
For reproducing the results, please use version v2.0.1.
To enable GPU for TensorFlow, one should install CUDA dependencies and the tensorflow-gpu
package. We also recommend using conda
, miniconda
, or virtualenv
to manage the Python environment and install the package in a new environment.
One can also install all dependencies by hand and use the source code in the GitHub repo.
package | version |
---|---|
Python | >=3.7.0 |
tensorflow | >=2.3.0 |
tensorflow-probability | >=0.11.0 |
pandas | >=1.1.5 |
jupyter | >=1.0.0 |
umap-learn | >=0.4.6 |
matplotlib | >=3.3.3 |
numba | >=0.52.0 |
seaborn | >=0.11.0 |
scikit-learn | >=0.23.2 |
scikit-misc | >=0.1.3 |
statsmodels | >= 0.12.1 |
louvain | >=0.7.0 |
networkx | >=2.5 |
scanpy | >=1.8.2 |
For reproducing the results in the manuscript, the following versions of the R language and packages are needed:
package | version |
---|---|
R | >=4.0.2 |
Seurat | >=3.2.2 |
slingshot | >=1.9.1 |
dyno | >=0.1.2 |
dplyr | >=1.0.2 |
tidyverse | >=1.3.0 |
purrr | >=0.3.4 |
AnnotationHub | >=2.20.2 |
Matrix | >=1.2-18 |
data.table | >=1.13.2 |
sandwich | >=3.0-0 |
lmtest | >=0.9-38 |
ggplot2 | >=3.3.2 |
ggpubr | >=0.4.0 |
cowplot | >=1.1.0 |
hdf5r | >=1.3.3 |
The folder Benchmarking with Real and Synthetic Datasets
contains code to reproduce results in Section 5 of the manuscript.
run_other_methods.R
: Run PAGA, monocle 3, and slingshot on all datasets. To run the new version of PAGA and monocle 3, one would need to create TI methods from docker with dyno (see https://dynverse.org/developers/creating-ti-method/create_ti_method_container/). The source code for them is in the folder ti_methods
.evaluate_other_methods.py
: Evaluate the trajectory inference results from other methods.run_and_evaluate_VITAE_Gaussian.py
: Run VITAE using Gaussian likelihood with different random seeds and record the evaluation result.run_and_evaluate_VITAE_NB.py
: Run VITAE using Negative Binomial likelihood with different random seeds and record the evaluation result.The intermediate results of the above scripts are in the sub-folder result
. Then, the figures can be plotted with the following Jupyter notebook:
plot_benchmark.py
: Reproduce Figure 2 in the manuscript.The folder Application on developing mouse neocortex
contains code to reproduce results on two developing mouse neocortex datasets in the manuscript.
cc_scores.R
: Functions to compute cell-cycle scores.run_Seurat_Slingshot
: Run Seurat and Slingshot and record the results.run_VITAE.py
: Run VITAE and save the trained model and intermediate results.The intermediate results of the above scripts are in the sub-folder result
. Then, the figures can be plotted with the following Jupyter notebook:
plot_mouse_brain.ipynb
: Reproduce the Figure 3 and S3 in the manuscript.The folder Application on integrating mouse brain datasets
contains code to reproduce results on the integration of three mouse brain datasets in the manuscript.
run_VITAE.py
: Run VITAE and save the trained model and intermediate results.The figures can be plotted with the following Jupyter notebook:
plot_integrating_mouse_brain_datasets
: Reproduce Figures 4 and 5 in the manuscript.The folder Application on scRNA and scATAC datasets
contains code to reproduce results on multi-omic trajectory inference in the manuscript.
run_VITAE.py
: Run VITAE and save the trained model and intermediate results.The figures can be plotted with the following Jupyter notebook:
plot_human_hematopoiesis.ipynb
: Reproduce Figure 6 in the manuscript.