<!DOCTYPE html>
<!-- Generated by pkgdown: do not edit by hand --><html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<meta name="description" content="Training and applying deep learning models to genome sequence data. Applications include data processing, model fitting, model evaluation, model optimization and inference. A few genome datasets for testing are provided, as is the possibility to extract deep representations.">
<title>Deep Learning for Genome Sequence Data • deepG</title>
<!-- favicons --><link rel="icon" type="image/png" sizes="16x16" href="favicon-16x16.png">
<link rel="icon" type="image/png" sizes="32x32" href="favicon-32x32.png">
<link rel="apple-touch-icon" type="image/png" sizes="180x180" href="apple-touch-icon.png">
<link rel="apple-touch-icon" type="image/png" sizes="120x120" href="apple-touch-icon-120x120.png">
<link rel="apple-touch-icon" type="image/png" sizes="76x76" href="apple-touch-icon-76x76.png">
<link rel="apple-touch-icon" type="image/png" sizes="60x60" href="apple-touch-icon-60x60.png">
<script src="deps/jquery-3.6.0/jquery-3.6.0.min.js"></script>
<link href="deps/bootstrap-5.3.1/bootstrap.min.css" rel="stylesheet">
<script src="deps/bootstrap-5.3.1/bootstrap.bundle.min.js"></script><!-- Font Awesome icons --><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/all.min.css" integrity="sha256-mmgLkCYLUQbXn0B1SRqzHar6dCnv9oZFPEC1g1cwlkk=" crossorigin="anonymous">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/v4-shims.min.css" integrity="sha256-wZjR52fzng1pJHwx4aV2AO3yyTOXrcDW7jBpJtTwVxw=" crossorigin="anonymous">
<!-- bootstrap-toc --><script src="https://cdn.jsdelivr.net/gh/afeld/bootstrap-toc@v1.0.1/dist/bootstrap-toc.min.js" integrity="sha256-4veVQbu7//Lk5TSmc7YV48MxtMy98e26cf5MrgZYnwo=" crossorigin="anonymous"></script><!-- headroom.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/headroom.min.js" integrity="sha256-AsUX4SJE1+yuDu5+mAVzJbuYNPHj/WroHuZ8Ir/CkE0=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/jQuery.headroom.min.js" integrity="sha256-ZX/yNShbjqsohH1k95liqY9Gd8uOiE1S4vZc+9KQ1K4=" crossorigin="anonymous"></script><!-- clipboard.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/2.0.11/clipboard.min.js" integrity="sha512-7O5pXpc0oCRrxk8RUfDYFgn0nO1t+jLuIOQdOMRp4APB7uZ4vSjspzp5y6YDtDs4VzUSTbWzBFZ/LKJhnyFOKw==" crossorigin="anonymous" referrerpolicy="no-referrer"></script><!-- search --><script src="https://cdnjs.cloudflare.com/ajax/libs/fuse.js/6.4.6/fuse.js" integrity="sha512-zv6Ywkjyktsohkbp9bb45V6tEMoWhzFzXis+LrMehmJZZSys19Yxf1dopHx7WzIKxr5tK2dVcYmaCk2uqdjF4A==" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/autocomplete.js/0.38.0/autocomplete.jquery.min.js" integrity="sha512-GU9ayf+66Xx2TmpxqJpliWbT5PiGYxpaG8rfnBEk1LL8l1KGkRShhngwdXK1UgqhAzWpZHSiYPc09/NwDQIGyg==" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/mark.js/8.11.1/mark.min.js" integrity="sha512-5CYOlHXGh6QpOFA/TeTylKLWfB3ftPsde7AnmhuitiTX4K5SqCLBeKro6sPS8ilsz1Q4NRx3v8Ko2IBiszzdww==" crossorigin="anonymous"></script><!-- pkgdown --><script src="pkgdown.js"></script><meta property="og:title" content="Deep Learning for Genome Sequence Data">
<meta property="og:description" content="Training and applying deep learning models to genome sequence data. Applications include data processing, model fitting, model evaluation, model optimization and inference. A few genome datasets for testing are provided, as is the possibility to extract deep representations.">
<meta property="og:image" content="https://genomenet.github.io/deepG/logo.png">
<!-- mathjax --><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js" integrity="sha256-nvJJv9wWKEm88qvoQl9ekL2J+k/RWIsaSScxxlsrv8k=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/config/TeX-AMS-MML_HTMLorMML.js" integrity="sha256-84DKXVJXs0/F8OTMzX4UR909+jtl4G7SPypPavF+GfA=" crossorigin="anonymous"></script><!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]-->
</head>
<body>
<a href="#main" class="visually-hidden-focusable">Skip to contents</a>
<nav class="navbar fixed-top navbar-light navbar-expand-lg bg-light" data-bs-theme="light"><div class="container">
<a class="navbar-brand me-2" href="index.html">deepG</a>
<small class="nav-text text-default me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="Released version">0.3.0</small>
<button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbar" aria-controls="navbar" aria-expanded="false" aria-label="Toggle navigation">
<span class="navbar-toggler-icon"></span>
</button>
<div id="navbar" class="collapse navbar-collapse ms-3">
<ul class="navbar-nav me-auto">
<li class="nav-item">
<a class="nav-link" href="reference/index.html">
<span class="fa fa fa fa-file-alt"></span>
Reference
</a>
</li>
<li class="nav-item dropdown">
<a href="#" class="nav-link dropdown-toggle" data-bs-toggle="dropdown" role="button" aria-expanded="false" aria-haspopup="true" id="dropdown-notebooks">Notebooks</a>
<div class="dropdown-menu" aria-labelledby="dropdown-notebooks">
<a class="external-link dropdown-item" href="https://colab.research.google.com/drive/175jIdXcDcgPUvaBo2rH2Lupbpjnp5O7G?usp=sharing">deepG tutorial</a>
<a class="external-link dropdown-item" href="https://colab.research.google.com/drive/1Eolc0koMNM1zkuO4XyVM58ImeF1BpRiH?usp=sharing">Read-length level: Human contamination</a>
<a class="external-link dropdown-item" href="https://colab.research.google.com/drive/1yiXSwFafXpMLHaov9iBTQLIDZ6bK1zYX?usp=sharing">Locus level: CRISPR detection</a>
<a class="external-link dropdown-item" href="https://colab.research.google.com/drive/1G7bOFEX87cZNrM2tdRtTdkrZn5fM__g0?usp=sharing">Gene level: 16S rRNA detection</a>
<a class="external-link dropdown-item" href="https://colab.research.google.com/drive/1BCggL-tfQF136YeJ8cKKi-zoBEDMgkNh?usp=sharing">Genome level: Bacterial morphology (Sporulation)</a>
<a class="external-link dropdown-item" href="https://colab.research.google.com/drive/10xpRzGd3JeBAbqQYSCxzQUMctt01sx9D?usp=sharing">Full metagenome level: Colorectal cancer prediction</a>
<a class="external-link dropdown-item" href="https://colab.research.google.com/drive/1kyYK7IU7GSfdpDzO_a8U3_qD4i3zTu6w?usp=sharing">BERT with deepG</a>
</div>
</li>
<li class="nav-item dropdown">
<a href="#" class="nav-link dropdown-toggle" data-bs-toggle="dropdown" role="button" aria-expanded="false" aria-haspopup="true" id="dropdown-tutorials">Tutorials</a>
<div class="dropdown-menu" aria-labelledby="dropdown-tutorials">
<a class="dropdown-item" href="articles/getting_started.html">Getting Started</a>
<a class="dropdown-item" href="articles/training_types.html">Training types</a>
<a class="dropdown-item" href="articles/data_generator.html">Data generator</a>
<a class="dropdown-item" href="articles/using_tb.html">Using tensorboard</a>
<a class="dropdown-item" href="articles/integrated_gradient.html">Integrated Gradient</a>
</div>
</li>
</ul>
<form class="form-inline my-2 my-lg-0" role="search">
<input type="search" class="form-control me-sm-2" aria-label="Search" name="search-input" data-search-index="search.json" id="search-input" placeholder="Search for" autocomplete="off">
</form>
<ul class="navbar-nav">
<li class="nav-item">
<a class="external-link nav-link" href="https://github.com/GenomeNet/deepG/" aria-label="github">
<span class="fab fa fab fa-github fa-lg"></span>
</a>
</li>
</ul>
</div>
</div>
</nav><div class="container template-home">
<div class="row">
<main id="main" class="col-md-9"><div class="section level1">
<div class="page-header">
<img src="logo.png" class="logo" alt=""><h1 id="deepg-">deepG <a class="anchor" aria-label="anchor" href="#deepg-"></a>
</h1>
</div>
<p><strong>deepG: a toolbox for deep neural networks optimized for genomic datasets</strong></p>
<p>The goal of the package is to speed up the development of bioinformatics tools for sequence classification, homology detection and other bioinformatics tasks. It is developed for biologists and advanced AI researchers. deepG is a collaborative effort of the McHardy Lab at the <em>Helmholtz Centre for Infection Research</em>, the Chair of Statistical Learning and Data Science at the <em>Ludwig Maximilian University of Munich</em> and the Huttenhower Lab at the <em>Harvard T.H. Chan School of Public Health</em>.</p>
<p><a href="https://zenodo.org/badge/latestdoi/387820006" class="external-link"><img src="https://zenodo.org/badge/387820006.svg" alt="DOI"></a></p>
<div class="section level2">
<h2 id="overview">Overview<a class="anchor" aria-label="anchor" href="#overview"></a>
</h2>
<p>The package offers functions for data processing and for creating, training and evaluating neural networks.</p>
<ul>
<li>
<strong>Data processing</strong>
<ul>
<li>Create data generators to handle large collections of files.</li>
<li>Different options to encode FASTA/FASTQ files (one-hot encoding, coverage or quality score encoding).</li>
<li>Different options to handle ambiguous nucleotides.</li>
</ul>
</li>
<li>
<strong>Deep learning architectures</strong>
<ul>
<li>Create network architectures with a single function call.</li>
<li>Custom loss and metric functions available.</li>
</ul>
</li>
<li>
<strong>Model training</strong>
<ul>
<li>Automatically create the model/data pipeline.</li>
</ul>
</li>
<li>
<strong>Visualizing training progress</strong>
<ul>
<li>Visualize training progress and metrics in TensorBoard.</li>
</ul>
</li>
<li>
<strong>Model evaluation</strong>
<ul>
<li>Evaluate trained models.</li>
</ul>
</li>
<li>
<strong>Model interpretability</strong>
<ul>
<li>Use Integrated Gradients to visualize how a model’s predictions depend on its input.</li>
</ul>
</li>
</ul>
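<p>As a rough illustration of the one-hot encoding and ambiguous-nucleotide options listed above, here is a minimal base R sketch. It does not use deepG's own functions: the helper <code>one_hot_encode()</code> and its <code>ambiguous</code> argument are hypothetical names chosen for this example, and the two strategies shown (an all-zero row, or equal weight on every nucleotide) are assumptions about what such options might look like.</p>
<div class="sourceCode"><pre class="sourceCode r">
<code class="sourceCode R"># Hypothetical helper, not part of deepG: one-hot encode a nucleotide string.
one_hot_encode &lt;- function(sequence, vocabulary = c("A", "C", "G", "T"),
                           ambiguous = c("zero", "equal")) {
  ambiguous &lt;- match.arg(ambiguous)
  # Split the sequence into single upper-case characters.
  chars &lt;- strsplit(toupper(sequence), "")[[1]]
  # One row per sequence position, one column per vocabulary symbol.
  encoded &lt;- matrix(0, nrow = length(chars), ncol = length(vocabulary),
                    dimnames = list(NULL, vocabulary))
  idx &lt;- match(chars, vocabulary)
  known &lt;- !is.na(idx)
  # Put a single 1 in each row whose character is in the vocabulary.
  encoded[cbind(which(known), idx[known])] &lt;- 1
  # Ambiguous symbols (e.g. N) stay all-zero or get equal weight per column.
  if (ambiguous == "equal") {
    encoded[!known, ] &lt;- 1 / length(vocabulary)
  }
  encoded
}

one_hot_encode("ACGN", ambiguous = "equal")</code></pre></div>
<p>Each row of the returned matrix corresponds to one position of the input sequence, so a sequence of length <em>n</em> yields an <em>n</em> × 4 matrix.</p>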
</div>
<div class="section level2">
<h2 id="installation">Installation<a class="anchor" aria-label="anchor" href="#installation"></a>
</h2>
<p>First install the <code>tensorflow</code> R package and use it to install the TensorFlow Python package:</p>
<div class="sourceCode" id="cb1"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span><span class="fu"><a href="https://rdrr.io/r/utils/install.packages.html" class="external-link">install.packages</a></span><span class="op">(</span><span class="st">"tensorflow"</span><span class="op">)</span></span>
<span><span class="fu">tensorflow</span><span class="fu">::</span><span class="fu"><a href="https://rdrr.io/pkg/tensorflow/man/install_tensorflow.html" class="external-link">install_tensorflow</a></span><span class="op">(</span><span class="op">)</span></span></code></pre></div>
<p>Afterwards, install the latest version of deepG from GitHub:</p>
<div class="sourceCode" id="cb2"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span><span class="fu">devtools</span><span class="fu">::</span><span class="fu"><a href="https://remotes.r-lib.org/reference/install_github.html" class="external-link">install_github</a></span><span class="op">(</span><span class="st">"GenomeNet/deepG"</span><span class="op">)</span></span></code></pre></div>
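<p>To confirm that the installation steps worked, a quick check along the following lines may help. This is only a sketch: <code>check_installed()</code> is a hypothetical helper written for this example and is not part of deepG, while <code>tensorflow::tf_config()</code> comes from the tensorflow R package and reports the TensorFlow installation it detects.</p>
<div class="sourceCode"><pre class="sourceCode r">
<code class="sourceCode R"># Hypothetical helper, not part of deepG: report which packages are installed.
check_installed &lt;- function(pkgs = c("tensorflow", "devtools", "deepG")) {
  # system.file() returns "" for a package that is not installed.
  vapply(pkgs, function(p) nzchar(system.file(package = p)), logical(1))
}

status &lt;- check_installed()
print(status)

# If the tensorflow R package is present, show the TensorFlow setup it found.
if (status[["tensorflow"]]) {
  print(tensorflow::tf_config())
}</code></pre></div>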
</div>
<div class="section level2">
<h2 id="usage">Usage<a class="anchor" aria-label="anchor" href="#usage"></a>
</h2>
<p>See the package website at <a href="https://deepg.de" class="external-link uri">https://deepg.de</a> for documentation and example code.</p>
</div>
</div>
</main><aside class="col-md-3"><div class="links">
<h2 data-toc-skip>Links</h2>
<ul class="list-unstyled">
<li><a href="https://github.com/GenomeNet/deepG/" class="external-link">Browse source code</a></li>
<li><a href="https://github.com/GenomeNet/deepG/issues" class="external-link">Report a bug</a></li>
</ul>
</div>
<div class="license">
<h2 data-toc-skip>License</h2>
<ul class="list-unstyled">
<li>LGPL (&gt;= 3)</li>
</ul>
</div>
<div class="citation">
<h2 data-toc-skip>Citation</h2>
<ul class="list-unstyled">
<li><a href="authors.html#citation">Citing deepG</a></li>
</ul>
</div>
<div class="developers">
<h2 data-toc-skip>Developers</h2>
<ul class="list-unstyled">
<li>Philipp Münch <br><small class="roles"> Author </small> </li>
<li>René Mreches <br><small class="roles"> Author, maintainer </small> </li>
<li>Martin Binder <br><small class="roles"> Author </small> </li>
<li>Hüseyin Anil Gündüz <br><small class="roles"> Author </small> </li>
<li>Xiao-Yin To <br><small class="roles"> Author </small> </li>
<li>Alice McHardy <br><small class="roles"> Author </small> </li>
</ul>
</div>
</aside>
</div>
<footer><div class="pkgdown-footer-left">
<p>Developed by Philipp Münch, René Mreches, Martin Binder, Hüseyin Anil Gündüz, Xiao-Yin To, Alice McHardy.</p>
</div>
<div class="pkgdown-footer-right">
<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.9.</p>
</div>
</footer>
</div>
</body>
</html>