--- a +++ b/doc/_build/html/maui.html @@ -0,0 +1,381 @@ + +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" + "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> + +<html xmlns="http://www.w3.org/1999/xhtml"> + <head> + <meta http-equiv="X-UA-Compatible" content="IE=Edge" /> + <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> + <title>The Maui Class — 0.1 documentation</title> + <link rel="stylesheet" href="_static/alabaster.css" type="text/css" /> + <link rel="stylesheet" href="_static/pygments.css" type="text/css" /> + <script type="text/javascript" id="documentation_options" data-url_root="./" src="_static/documentation_options.js"></script> + <script type="text/javascript" src="_static/jquery.js"></script> + <script type="text/javascript" src="_static/underscore.js"></script> + <script type="text/javascript" src="_static/doctools.js"></script> + <script async="async" type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.1/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script> + <link rel="shortcut icon" href="_static/icons8-beach-30.png"/> + <link rel="index" title="Index" href="genindex.html" /> + <link rel="search" title="Search" href="search.html" /> + <link rel="next" title="Maui Utilities" href="utils.html" /> + <link rel="prev" title="Filtering and Merging latent factors" href="filtering-and-merging-latent-factors.html" /> + + <link rel="stylesheet" href="_static/custom.css" type="text/css" /> + + + <meta name="viewport" content="width=device-width, initial-scale=0.9, maximum-scale=0.9" /> + + </head><body> + + + <div class="document"> + <div class="documentwrapper"> + <div class="bodywrapper"> + + + <div class="body" role="main"> + + <div class="section" id="the-maui-class"> +<h1>The Maui Class<a class="headerlink" href="#the-maui-class" title="Permalink to this headline">¶</a></h1> +<dl class="class"> +<dt id="maui.Maui"> +<em class="property">class </em><code class="descclassname">maui.</code><code class="descname">Maui</code><span class="sig-paren">(</span><em>n_hidden=[1500], n_latent=80, batch_size=100, epochs=400, architecture='stacked', initial_beta_val=0, kappa=1.0, max_beta_val=1, learning_rate=0.0005, epsilon_std=1.0, batch_normalize_inputs=True, batch_normalize_intermediaries=True, batch_normalize_embedding=True, relu_intermediaries=True, relu_embedding=True, input_dim=None</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/maui/model.html#Maui"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#maui.Maui" title="Permalink to this definition">¶</a></dt> +<dd><p>Maui (Multi-omics Autoencoder Integration) model.</p> +<p>Trains a variational autoencoder to find latent factors in multi-modal data.</p> +<dl class="docutils"> +<dt>n_hidden: array (default [1500])</dt> +<dd>The sizes of the hidden layers of the autoencoder architecture. +Each element of the array specifies the number of nodes in successive +layers of the autoencoder</dd> +<dt>n_latent: int (default 80)</dt> +<dd>The size of the latent layer (number of latent features)</dd> +<dt>batch_size: int (default 100)</dt> +<dd>The size of the mini-batches used for training the network</dd> +<dt>epochs: int (default 400)</dt> +<dd>The number of epoches to use for training the network</dd> +<dt>architecture:</dt> +<dd>One of ‘stacked’ or ‘deep’. If ‘stacked’, will use a stacked VAE model, where +the intermediate layers are also variational. If ‘deep’, will train a deep VAE +where the intermediate layers are regular (ReLU) units, and only the middle +(latent) layer is variational.</dd> +</dl> +<dl class="method"> +<dt id="maui.Maui.c_index"> +<code class="descname">c_index</code><span class="sig-paren">(</span><em>survival, clinical_only=True, duration_column='duration', observed_column='observed', cox_penalties=[0.1, 1, 10, 100, 1000, 10000], cv_folds=5, sel_clin_alpha=0.05, sel_clin_penalty=0</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/maui/model.html#Maui.c_index"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#maui.Maui.c_index" title="Permalink to this definition">¶</a></dt> +<dd><p>Compute’s Harrell’s c-Index for a Cox Proportional Hazards regression modeling +survival by the latent factors in z.</p> +<p>z: pd.DataFrame (n_samples, n_latent factors) +survival: pd.DataFrame of survival information and relevant covariates</p> +<blockquote> +<div>(such as sex, age at diagnosis, or tumor stage)</div></blockquote> +<dl class="docutils"> +<dt>clinical_only: Compute the c-Index for a model containing only</dt> +<dd>individually clinically relevant latent factors +(see <code class="docutils literal notranslate"><span class="pre">select_clinical_factors</span></code>)</dd> +<dt>duration_column: the name of the column in <code class="docutils literal notranslate"><span class="pre">survival</span></code> containing the</dt> +<dd>duration (time between diagnosis and death or last followup)</dd> +<dt>observed_column: the name of the column in <code class="docutils literal notranslate"><span class="pre">survival</span></code> containing</dt> +<dd>indicating whether time of death is known</dd> +<dt>cox_penalties: penalty coefficient in Cox PH solver (see <code class="docutils literal notranslate"><span class="pre">lifelines.CoxPHFitter</span></code>)</dt> +<dd>to try. Returns the best c given by the different penalties +(by cross-validation)</dd> +</dl> +<p>cv_folds: number of cross-validation folds to compute C +sel_clin_penalty: CPH penalizer to use when selecting clinical factors +sel_clin_alpha: significance level when selecting clinical factors</p> +<dl class="docutils"> +<dt>cs: array, Harrell’s c-Index, an auc-like metric for survival prediction accuracy.</dt> +<dd>one value per cv_fold</dd> +</dl> +</dd></dl> + +<dl class="method"> +<dt id="maui.Maui.cluster"> +<code class="descname">cluster</code><span class="sig-paren">(</span><em>k=None</em>, <em>optimal_k_method='ami'</em>, <em>optimal_k_range=range(3</em>, <em>10)</em>, <em>ami_y=None</em>, <em>kmeans_kwargs={'n_init': 1000</em>, <em>'n_jobs': 2}</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/maui/model.html#Maui.cluster"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#maui.Maui.cluster" title="Permalink to this definition">¶</a></dt> +<dd><p>Cluster the samples using k-means based on the latent factors.</p> +<dl class="docutils"> +<dt>k: optional, the number of clusters to find.</dt> +<dd>if not given, will attempt to find optimal k.</dd> +<dt>optimal_k_method: supported methods are ‘ami’ and ‘silhouette’. Otherwise, callable.</dt> +<dd>if ‘ami’, will pick K which gives the best AMI +(adjusted mutual information) with external labels. +if ‘silhouette’ will pick the K which gives the best +mean silhouette coefficient. +if callable, should have signature <code class="docutils literal notranslate"><span class="pre">scorer(yhat)</span></code> +and return a scalar score.</dd> +</dl> +<p>optimal_k_range: array-like, range of Ks to try to find optimal K among +ami_y: array-like (n_samples), the ground-truth labels to use</p> +<blockquote> +<div>when picking K by “best AMI against ground-truth” method.</div></blockquote> +<p>kmeans_kwargs: optional, kwargs for initialization of sklearn.cluster.KMeans</p> +<p>yhat: Series (n_samples) cluster labels for each sample</p> +</dd></dl> + +<dl class="method"> +<dt id="maui.Maui.compute_auc"> +<code class="descname">compute_auc</code><span class="sig-paren">(</span><em>y</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/maui/model.html#Maui.compute_auc"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#maui.Maui.compute_auc" title="Permalink to this definition">¶</a></dt> +<dd><p>Compute area under the ROC curve for predicting the labels in y using the +latent features previously inferred.</p> +<p>y: labels to predict +<a href="#id1"><span class="problematic" id="id2">**</span></a>kwargs: arguments for <code class="docutils literal notranslate"><span class="pre">compute_roc</span></code></p> +<p>aucs: pd.Series, auc per class as well as mean</p> +</dd></dl> + +<dl class="method"> +<dt id="maui.Maui.compute_roc"> +<code class="descname">compute_roc</code><span class="sig-paren">(</span><em>y</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/maui/model.html#Maui.compute_roc"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#maui.Maui.compute_roc" title="Permalink to this definition">¶</a></dt> +<dd><p>Compute Receiver Operating Characteristics curve for SVM prediction +of labels <code class="docutils literal notranslate"><span class="pre">y</span></code> from the latent factors. Computes both the ROC curves +(true positive rate, true negative rate), and the area under the roc (auc). +ROC and auROC computed for each class (the classes are inferred from <code class="docutils literal notranslate"><span class="pre">y</span></code>), +as well as a “mean” ROC, computed by averaging the class ROCs. Only samples +in the index of <code class="docutils literal notranslate"><span class="pre">y</span></code> will be considered.</p> +<p>y: array-like (n_samples,), the labels of the samples to predict +<a href="#id3"><span class="problematic" id="id4">**</span></a>kwargs: arguments for <code class="docutils literal notranslate"><span class="pre">utils.compute_roc</span></code></p> +<dl class="docutils"> +<dt>roc_curves: dict, one key per class as well as “mean”, each value is a dataframe</dt> +<dd>containing the tpr (true positive rate) and fpr (falce positive rate) +defining that class (or the mean) ROC.</dd> +</dl> +</dd></dl> + +<dl class="method"> +<dt id="maui.Maui.drop_unexplanatory_factors"> +<code class="descname">drop_unexplanatory_factors</code><span class="sig-paren">(</span><em>threshold=0.02</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/maui/model.html#Maui.drop_unexplanatory_factors"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#maui.Maui.drop_unexplanatory_factors" title="Permalink to this definition">¶</a></dt> +<dd><p>Drops factors which have a low R^2 score in a univariate linear model +predicting the features <cite>x</cite> from a column of the latent factors <cite>z</cite>.</p> +<dl class="docutils"> +<dt>threshold: threshold for R^2, latent factors below this threshold</dt> +<dd>are dropped.</dd> +</dl> +<dl class="docutils"> +<dt>z_filt: (n_samples, n_factors) DataFrame of latent factor values,</dt> +<dd>with only those columns from the input <cite>z</cite> which have an R^2 +above the threshold when using that column as an input +to a linear model predicting <cite>x</cite>.</dd> +</dl> +</dd></dl> + +<dl class="method"> +<dt id="maui.Maui.fit"> +<code class="descname">fit</code><span class="sig-paren">(</span><em>X</em>, <em>y=None</em>, <em>X_validation=None</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/maui/model.html#Maui.fit"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#maui.Maui.fit" title="Permalink to this definition">¶</a></dt> +<dd><p>Train autoencoder model</p> +<dl class="docutils"> +<dt>X: dict with multi-modal dataframes, containing training data, e.g.</dt> +<dd>{‘mRNA’: df1, ‘SNP’: df2}, +df1, df2, etc. are (n_features, n_samples) pandas.DataFrame’s. +The sample names must match, the feature names need not.</dd> +<dt>X_validation: optional, dict with multi-modal dataframes, containing validation data</dt> +<dd>will be used to compute validation loss under training</dd> +</dl> +<p>y: Not used.</p> +<p>self : Maui object</p> +</dd></dl> + +<dl class="method"> +<dt id="maui.Maui.fit_transform"> +<code class="descname">fit_transform</code><span class="sig-paren">(</span><em>X</em>, <em>y=None</em>, <em>X_validation=None</em>, <em>encoder='mean'</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/maui/model.html#Maui.fit_transform"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#maui.Maui.fit_transform" title="Permalink to this definition">¶</a></dt> +<dd><p>Train autoencoder model, and return the latent factor representation +of the data X.</p> +<dl class="docutils"> +<dt>X: dict with multi-modal dataframes, containing training data, e.g.</dt> +<dd>{‘mRNA’: df1, ‘SNP’: df2}, +df1, df2, etc. are (n_samples, n_features) pandas.DataFrame’s. +The sample names must match, the feature names need not.</dd> +<dt>X_validation: optional, dict with multi-modal dataframes, containing validation data</dt> +<dd>will be used to compute validation loss under training</dd> +</dl> +<p>y: Not used.</p> +<dl class="docutils"> +<dt>z: DataFrame (n_samples, n_latent_factors)</dt> +<dd>Latent factors representation of the data X.</dd> +</dl> +</dd></dl> + +<dl class="method"> +<dt id="maui.Maui.get_linear_weights"> +<code class="descname">get_linear_weights</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="reference internal" href="_modules/maui/model.html#Maui.get_linear_weights"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#maui.Maui.get_linear_weights" title="Permalink to this definition">¶</a></dt> +<dd><p>Get linear model coefficients obtained from fitting linear models +predicting feature values from latent factors. One model is fit per latent +factor, and the coefficients are stored in the matrix.</p> +<dl class="docutils"> +<dt>W: (n_features, n_latent_factors) DataFrame</dt> +<dd>w_{ij} is the coefficient associated with feature <cite>i</cite> in a linear model +predicting it from latent factor <cite>j</cite>.</dd> +</dl> +</dd></dl> + +<dl class="staticmethod"> +<dt id="maui.Maui.load"> +<em class="property">static </em><code class="descname">load</code><span class="sig-paren">(</span><em>dir</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/maui/model.html#Maui.load"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#maui.Maui.load" title="Permalink to this definition">¶</a></dt> +<dd><p>Load a maui model from disk, which was previously saved using <code class="docutils literal notranslate"><span class="pre">save()</span></code></p> +<p>dir: The directory from which to load the maui model</p> +<p>maui_model: a maui model that was previously saved to disk</p> +</dd></dl> + +<dl class="method"> +<dt id="maui.Maui.merge_similar_latent_factors"> +<code class="descname">merge_similar_latent_factors</code><span class="sig-paren">(</span><em>distance_in='z'</em>, <em>distance_metric='correlation'</em>, <em>linkage_method='complete'</em>, <em>distance_threshold=0.17</em>, <em>merge_fn=<function mean></em>, <em>plot_dendrogram=True</em>, <em>plot_dendro_ax=None</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/maui/model.html#Maui.merge_similar_latent_factors"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#maui.Maui.merge_similar_latent_factors" title="Permalink to this definition">¶</a></dt> +<dd><p>Merge latent factorz in z whose distance is below a certain threshold. +Used to squeeze down latent factor representations if there are many co-linear +latent factors.</p> +<dl class="docutils"> +<dt>distance_in: If ‘z’, latent factors will be merged based on their distance</dt> +<dd>to each other in ‘z’. If ‘w’, favtors will be merged based +on their distance in ‘w’ (see <a class="reference internal" href="#maui.Maui.get_linear_weights" title="maui.Maui.get_linear_weights"><code class="xref py py-func docutils literal notranslate"><span class="pre">get_linear_weights()</span></code></a>)</dd> +<dt>distance_metric: The distance metric based on which to merge latent factors.</dt> +<dd>One which is supported by <code class="xref py py-func docutils literal notranslate"><span class="pre">scipy.spatial.distance.pdist()</span></code></dd> +<dt>linkage_method: The linkage method used to cluster latent factors. One which</dt> +<dd>is supported by <code class="xref py py-func docutils literal notranslate"><span class="pre">scipy.cluster.hierarchy.linkage()</span></code>.</dd> +<dt>distance_threshold: Latent factors with distance below this threshold</dt> +<dd>will be merged</dd> +<dt>merge_fn: Function used to determine value of merged latent factor.</dt> +<dd>The default is <code class="xref py py-func docutils literal notranslate"><span class="pre">numpy.mean()</span></code>, meaning the merged +latent factor will have the mean value of the inputs.</dd> +<dt>plot_dendrogram: Boolean. If true, a dendrogram will be plotted showing</dt> +<dd>which latent factors are merged and the threshold.</dd> +</dl> +<p>plot_dendro_ax: A matplotlib axis object to plot the dendrogram on (optional)</p> +<dl class="docutils"> +<dt>z: (n_samples, n_factors) pd.DataFrame of latent factors</dt> +<dd>where some have been merged</dd> +</dl> +</dd></dl> + +<dl class="method"> +<dt id="maui.Maui.save"> +<code class="descname">save</code><span class="sig-paren">(</span><em>destdir</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/maui/model.html#Maui.save"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#maui.Maui.save" title="Permalink to this definition">¶</a></dt> +<dd><p>Save a maui model to disk, so that it may be reloaded later using <code class="docutils literal notranslate"><span class="pre">load()</span></code></p> +<p>destdir: destination directory in which to save model files</p> +</dd></dl> + +<dl class="method"> +<dt id="maui.Maui.select_clinical_factors"> +<code class="descname">select_clinical_factors</code><span class="sig-paren">(</span><em>survival</em>, <em>duration_column='duration'</em>, <em>observed_column='observed'</em>, <em>alpha=0.05</em>, <em>cox_penalizer=0</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/maui/model.html#Maui.select_clinical_factors"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#maui.Maui.select_clinical_factors" title="Permalink to this definition">¶</a></dt> +<dd><p>Select latent factors which are predictive of survival. This is +accomplished by fitting a Cox Proportional Hazards (CPH) model to each +latent factor, while controlling for known covariates, and only keeping +those latent factors whose coefficient in the CPH is nonzero (adjusted +p-value < alpha).</p> +<dl class="docutils"> +<dt>survival: pd.DataFrame of survival information and relevant covariates</dt> +<dd>(such as sex, age at diagnosis, or tumor stage)</dd> +<dt>duration_column: the name of the column in <code class="docutils literal notranslate"><span class="pre">survival</span></code> containing the</dt> +<dd>duration (time between diagnosis and death or last followup)</dd> +<dt>observed_column: the name of the column in <code class="docutils literal notranslate"><span class="pre">survival</span></code> containing</dt> +<dd>indicating whether time of death is known</dd> +<dt>alpha: threshold for p-value of CPH coefficients to call a latent</dt> +<dd>factor clinically relevant (p < alpha)</dd> +</dl> +<p>cox_penalizer: penalty coefficient in Cox PH solver (see <code class="docutils literal notranslate"><span class="pre">lifelines.CoxPHFitter</span></code>)</p> +<dl class="docutils"> +<dt>z_clinical: pd.DataFrame, subset of the latent factors which have been</dt> +<dd>determined to have clinical value (are individually predictive +of survival, controlling for covariates)</dd> +</dl> +</dd></dl> + +<dl class="method"> +<dt id="maui.Maui.transform"> +<code class="descname">transform</code><span class="sig-paren">(</span><em>X</em>, <em>encoder='mean'</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/maui/model.html#Maui.transform"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#maui.Maui.transform" title="Permalink to this definition">¶</a></dt> +<dd><p>Transform X into the latent space that was previously learned using +<cite>fit</cite> or <cite>fit_transform</cite>, and return the latent factor representation.</p> +<dl class="docutils"> +<dt>X: dict with multi-modal dataframes, containing training data, e.g.</dt> +<dd>{‘mRNA’: df1, ‘SNP’: df2}, +df1, df2, etc. are (n_features, n_samples) pandas.DataFrame’s.</dd> +<dt>encoder: the mode of the encoder to be used. one of ‘mean’ or ‘sample’,</dt> +<dd>where ‘mean’ indicates the encoder network only uses the mean +estimates for each successive layer. ‘sample’ indicates the +encoder should sample from the distribution specified from each +successive layer, and results in non-reproducible embeddings.</dd> +</dl> +<dl class="docutils"> +<dt>z: DataFrame (n_samples, n_latent_factors)</dt> +<dd>Latent factors representation of the data X.</dd> +</dl> +</dd></dl> + +</dd></dl> + +</div> + + + </div> + + </div> + </div> + <div class="sphinxsidebar" role="navigation" aria-label="main navigation"> + <div class="sphinxsidebarwrapper"> + <p class="logo"><a href="index.html"> + <img class="logo" src="_static/hex-maui.png" alt="Logo"/> + </a></p> +<h1 class="logo"><a href="index.html"></a></h1> + + + + + + + + +<h3>Navigation</h3> +<ul class="current"> +<li class="toctree-l1"><a class="reference internal" href="autoencoder-integration.html">Multi-modal Autoencoders</a></li> +<li class="toctree-l1"><a class="reference internal" href="data-normalization.html">Data and Normalization</a></li> +<li class="toctree-l1"><a class="reference internal" href="filtering-and-merging-latent-factors.html">Filtering and Merging latent factors</a></li> +<li class="toctree-l1 current"><a class="current reference internal" href="#">The Maui Class</a></li> +<li class="toctree-l1"><a class="reference internal" href="utils.html">Maui Utilities</a></li> +</ul> + +<div class="relations"> +<h3>Related Topics</h3> +<ul> + <li><a href="index.html">Documentation overview</a><ul> + <li>Previous: <a href="filtering-and-merging-latent-factors.html" title="previous chapter">Filtering and Merging latent factors</a></li> + <li>Next: <a href="utils.html" title="next chapter">Maui Utilities</a></li> + </ul></li> +</ul> +</div> +<div id="searchbox" style="display: none" role="search"> + <h3>Quick search</h3> + <div class="searchformwrapper"> + <form class="search" action="search.html" method="get"> + <input type="text" name="q" /> + <input type="submit" value="Go" /> + <input type="hidden" name="check_keywords" value="yes" /> + <input type="hidden" name="area" value="default" /> + </form> + </div> +</div> +<script type="text/javascript">$('#searchbox').show(0);</script> + + + + + + + + + </div> + </div> + <div class="clearer"></div> + </div> + <div class="footer"> + ©2018, Jonathan Ronen, Altuna Akalin. + + | + Powered by <a href="http://sphinx-doc.org/">Sphinx 1.8.2</a> + & <a href="https://github.com/bitprophet/alabaster">Alabaster 0.7.12</a> + + | + <a href="_sources/maui.rst.txt" + rel="nofollow">Page source</a> + </div> + + + + + </body> +</html> \ No newline at end of file