Models: MarcoTheBlack/janggu_usecases
Diff of /README.md [000000] .. [214c6e]

--- a
+++ b/README.md
@@ -0,0 +1,61 @@
+# janggu_usecases
+Examples for deep learning in genomics using Janggu
+
+## Requirements
+
+```
+jupyter
+bedtools
+pybedtools
+samtools
+dash
+janggu
+R
+rpy2
+tzlocal
+r-ggplot2
+r-ggrepel
+r-dplyr
+statsmodels
+pandas
+numpy
+```
+
+These can be installed via conda and pip.
+
+The notebook cells that install the requirements may be commented out.
+
+## Download the datasets
+To download the required datasets, enter the 00_preparation folder.
+It contains Jupyter notebooks that specify and control the data download.
+Furthermore, they set up the regions of interest for the model training and evaluation.
+
+## Note
+
+Some steps in the notebooks may be commented out or deactivated,
+including the invocation of time-consuming training steps,
+so that they are not re-run during evaluation. You may either activate them within the notebook
+or invoke the scripts on the command line if you wish to train the models from scratch.
+It may also be necessary to adapt the use of `CUDA_VISIBLE_DEVICES` (see the TensorFlow docs). In use case 2, the GPU device is selected via the `-dev` option.
+These settings were chosen for our specific setup with 8 GPUs. For example, if you only have access to one GPU, specify
+`CUDA_VISIBLE_DEVICES=0` before running the scripts.
+
+## JunD prediction
+
+Run the Jupyter notebook 'predicting_jund_binding.ipynb' in order to reproduce the results.
+You can control which GPU the models are trained on by setting the environment variable `CUDA_VISIBLE_DEVICES` (see the TensorFlow documentation).
+
+## DeepSEA and DanQ experiments
+
+To train and evaluate the models for the DeepSEA and DanQ comparison, enter the '02_deepsea_danq_prediction' folder and launch the
+Jupyter notebook 'deepsea_danq_experiments.ipynb'.
+To activate model training, set the parameter `train_models = True`.
+Otherwise, the notebook merely evaluates the results.
+You may need to adapt `-dev` to select a specific GPU.
+
+
+## CAGE-tag prediction
+
+To reproduce the CAGE-tag prediction use case, enter '03_cage_prediction' and launch the 'predicting_cage_tags.ipynb' notebook.
+In order to run the cross-validation analysis, uncomment the respective command-line invocations of the script 'cage_prediction.py'.
+You can control which GPU the models are trained on by setting the environment variable `CUDA_VISIBLE_DEVICES` (see the TensorFlow documentation).
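
Note on the diff above: the README repeatedly asks the reader to set `CUDA_VISIBLE_DEVICES` before training. The following is a minimal sketch of one way to do that from Python, not something taken from the repository. The helper name `gpu_env` and its parameters are hypothetical; the only fact assumed about the variable is that it accepts a comma-separated list of device indices.

```python
import os


def gpu_env(device_ids, base_env=None):
    """Return a copy of the environment that exposes only the given GPUs.

    `gpu_env` is a hypothetical helper for illustration; it is not part of
    Janggu or of the janggu_usecases repository.
    """
    # Copy the environment so the current process is left unchanged.
    env = dict(os.environ if base_env is None else base_env)
    # CUDA_VISIBLE_DEVICES takes a comma-separated list of device indices.
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in device_ids)
    return env


# Single-GPU machine: expose only device 0, matching the README's example.
env = gpu_env([0])
print(env["CUDA_VISIBLE_DEVICES"])
```

The returned dictionary could be passed as the `env` argument of `subprocess.run` when launching one of the training scripts from the command line. The scripts' own arguments are not shown in this diff, so they are left out here. Inside a notebook, the variable generally needs to be set in `os.environ` before the deep learning framework is first imported for it to take effect.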