a/README.md b/README.md
---
output: github_document
---

```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/"
)
```

# DeepG <img src="man/figures/logo_small.png"  align="left" vspace="-1800px"/>

**deepG: R toolbox for deep neural networks optimized for genomic datasets**
<!---
% <p><img alt="DeepG logo" height="70px" src="man/figures/logo_small.png" align="left" hspace="-1000px" vspace="-180px"></p>
-->

The goal of the package is to speed up the development of bioinformatics tools for sequence classification, homology detection and other bioinformatics tasks. It is developed for biologists and advanced AI researchers. DeepG is a collaborative effort from the McHardy Lab at the *Helmholtz Centre for Infection Research*, the Chair of Statistical Learning and Data Science (Prof. Dr. Bernd Bischl) at the *Ludwig Maximilian University of Munich* and the Huttenhower Lab at *Harvard T.H. Chan School of Public Health*.

[![DOI](https://zenodo.org/badge/387820006.svg)](https://zenodo.org/badge/latestdoi/387820006)

## Overview

The package offers functions to create, train and evaluate neural networks, as well as functions for data processing.

+ **Data processing**
  + Different options to encode FASTA/FASTQ files (one-hot encoding, coverage or quality score encoding).
  + Different options to handle ambiguous nucleotides.
  + Create data generators to handle large collections of files.
+ **Deep learning architectures**
  + Create network architectures with a single function call.
  + Custom loss and metric functions available.
+ **Model training**
  + Automatically create a model/data pipeline.
+ **Visualizing training progress**
  + Visualize training progress and metrics in TensorBoard.
+ **Model evaluation**
  + Evaluate trained models.
+ **Model interpretability**
  + Use Integrated Gradients to visualize the relationship of the model's predictions to its input.

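As a rough illustration of the one-hot encoding and ambiguous-nucleotide options listed above, the following base R sketch turns a nucleotide string into a one-hot matrix. The function `one_hot_encode_sketch` is hypothetical and not part of the deepG API, and mapping ambiguous symbols such as `N` to an all-zero row is only one possible convention, assumed here for illustration.

``` r
# Hypothetical helper, NOT a deepG function: one-hot encode a nucleotide string.
# Symbols outside the alphabet (e.g. the ambiguous "N") become all-zero rows,
# which is an assumed convention chosen for this sketch.
one_hot_encode_sketch <- function(seq, alphabet = c("A", "C", "G", "T")) {
  # Split the sequence into single upper-case characters
  chars <- strsplit(toupper(seq), "")[[1]]
  # One row per position, one column per alphabet symbol
  mat <- matrix(0L, nrow = length(chars), ncol = length(alphabet),
                dimnames = list(NULL, alphabet))
  # Column index of each character; NA for symbols not in the alphabet
  idx <- match(chars, alphabet)
  known <- !is.na(idx)
  # Set a single 1 in each row whose symbol is in the alphabet
  mat[cbind(which(known), idx[known])] <- 1L
  mat
}

encoded <- one_hot_encode_sketch("ACGTN")
print(encoded)
```

Each row of the result corresponds to one sequence position and each column to one nucleotide, so a sequence of length `n` yields an `n` by 4 matrix.
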
## Installation

Install the TensorFlow Python package via the `tensorflow` R package:

```{r, eval=FALSE, message=FALSE}
install.packages("tensorflow")
tensorflow::install_tensorflow()
```

Afterwards, install the latest version of deepG from GitHub:

```{r, eval=FALSE, message=FALSE}
devtools::install_github("GenomeNet/deepG")
```

```{r, echo=FALSE, warning=FALSE, message=FALSE}
devtools::load_all(path = "~/deepG")
```

## Usage

See the package website at <https://deepg.de> for documentation and example code.

<!-- ## Examples  -->

<!-- ## Datasets -->
<!-- The library comes with multiple different datasets for testing: -->
<!-- - The set `data(parenthesis)` contains 100k characters of the parenthesis synthetic language generated from a very simple counting language with a parenthesis and letter alphabet Σ = {( ) 0 1 2 3 4 }. The language is constrained to match parentheses, and nesting is limited to at most 4 levels deep. Each opening parenthesis increases and each closing parenthesis decreases the nesting level, respectively. Numbers are generated randomly, but are constrained to indicate the nesting level at their position. -->
<!-- - The set `data(crispr_full)` containing all CRISPR loci found in NCBI representative genomes with neighbor nucleotides up and downstream. -->
<!-- - The set `data(crispr_sample)` containing a subset of `data(crispr_full)`. -->
<!-- - The set `data(ecoli)` contains the *E. coli* genome, see [the genome sequence of Escherichia coli K-12](https://science.sciencemag.org/content/277/5331/1453.long). -->
<!-- - The set `data(ecoli_small)` contains a subset of `data(ecoli)`. -->
<!---
## Installation and Usage

Please see our [Wiki](https://github.com/hiddengenome/deepG/wiki) for further installation instructions. It also covers usage instructions for multi-GPU machines.

- [Installation on desktop machine](https://github.com/hiddengenome/deepG/wiki/Installation-of-deepG-on-desktop)
- [Installation on GPU server](https://github.com/hiddengenome/deepG/wiki/Installation-of-deepG-on-GPU-server)
- [Installation AWS](https://github.com/hiddengenome/deepG/wiki/Installation-AWS)
- [GPU Usage](https://github.com/hiddengenome/deepG/wiki/manage-GPU-usage)
- [Tensorboard Integration](https://github.com/hiddengenome/deepG/wiki/Tensorboard-integration)

See the help files `?deepG` to get started; for questions, consult the [FAQ](https://github.com/hiddengenome/deepG/wiki/FAQ).
-->