{
 "cells": [
  {
   "cell_type": "markdown",
5
   "metadata": {},
6
   "source": [
7
    "# Deep Learning Training code"
8
   ]
9
  },
10
  {
11
   "cell_type": "markdown",
12
   "metadata": {},
13
   "source": [
14
    "This notebook describes our deep learning function. It sets up the architecture, pre-processed the input data and runs the specified number of epochs.\n",
15
    "We will describe the code step by step."
16
   ]
17
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Overview\n",
    "The deep learning architecture is built using `Keras`, a model-level library that provides high-level building blocks for developing deep learning models. It provides a user-friendly interface to low-level tensor manipulation libraries that handle the computationally intensive tensor operations involved in model fitting (e.g. `Tensorflow`, `Theano`, `CNTK`). For this work, we used `Keras` interfaced with `Tensorflow` as the backend computational engine.\n",
    "\n",
    "### 1. Importing Libraries\n",
    "We begin by importing the required libraries:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import keras\n",
    "from keras import backend as K\n",
    "from keras.models import Sequential, Model\n",
    "from keras.layers import Input\n",
    "from keras.layers.core import Dense, Dropout\n",
    "from keras.optimizers import Adam\n",
    "from keras.regularizers import l1, l2, l1_l2\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "from sklearn.preprocessing import scale\n",
    "import os, sys, time, pickle, copy"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2. Data formatting\n",
    "Define a function that converts the inputs to the required formats for analysis:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def prepare_data(x, e, t):\n",
    "    #Cast inputs to the numeric types expected downstream: float32 features, int32 censoring status, float32 times\n",
    "    return (x.astype(\"float32\"), e.astype(\"int32\"), t.astype(\"float32\"))"
   ]
  },
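  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick sanity check, we can call `prepare_data` on small made-up arrays (illustrative values only, not study data) to confirm the type conversions:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Illustration with toy inputs: 3 samples, 2 variables\n",
    "x_demo = np.array([[0.1, 2.0], [1.5, 0.3], [0.7, 0.9]])\n",
    "e_demo = np.array([1, 0, 1])             #censoring status (1=event observed, 0=censored)\n",
    "t_demo = np.array([450., 1200., 300.])   #survival/censoring time in days\n",
    "X, E, TM = prepare_data(x_demo, e_demo, t_demo)\n",
    "print(X.dtype, E.dtype, TM.dtype)  #expected: float32 int32 float32"
   ]
  },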
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Define `DL_single_run` function\n",
    "We define the function `DL_single_run` below. The first two arguments are the variables `xtr` and `ytr`, which represent the training data. Variable `xtr` is an $n \\times p$ matrix (where $n$ is the sample size and $p$ is the number of variables [motion descriptors, in our case]). Variable `ytr` contains the survival outcomes for the training data; it is an $n \\times 2$ matrix whose first column is the censoring status and whose second column is the survival/censoring time.\n",
    "The `DL_single_run` function also has arguments representing network parameters: `units1` (hidden layer size), `units2` (latent code size), `dro` (dropout rate), `lr` (learning rate), `l1r` (L1 penalty for the activity regularizer), and `alpha` (trade-off parameter). The final arguments are `batchsize` (batch size) and `numepochs` (number of epochs).\n",
    "\n",
    "The first steps of the function format the data into the correct numeric types and sort it into minibatches:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def DL_single_run(xtr, ytr, units1, units2, dro, lr, l1r, alpha, batchsize, numepochs):\n",
    "    #Data preparation: create X, E and TM where X=input vector, E=censoring status and TM=survival time. Apply formatting\n",
    "    X_tr, E_tr, TM_tr = prepare_data(xtr, ytr[:,0,np.newaxis], ytr[:,1])\n",
    "\n",
    "    #Arrange data into minibatches (based on specified batch size), and within each minibatch, sort in descending order of survival/censoring time (see explanation of Cox PH loss function definition)\n",
    "    X_tr, E_tr, TM_tr, _ = sort4minibatches(X_tr, E_tr, TM_tr, batchsize)"
   ]
  },
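  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To illustrate the expected `ytr` layout (with made-up values): each row is one patient, column 0 is the censoring status and column 1 is the survival/censoring time, so `ytr[:,0,np.newaxis]` extracts the statuses as an $n \\times 1$ column and `ytr[:,1]` extracts the times as a vector:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Toy ytr: 3 patients (illustrative values only)\n",
    "ytr_demo = np.array([[1, 450.], [0, 1200.], [1, 300.]])\n",
    "print(ytr_demo[:,0,np.newaxis])  #censoring statuses, shape (3, 1)\n",
    "print(ytr_demo[:,1])             #survival/censoring times, shape (3,)"
   ]
  },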
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3. Model Architecture\n",
    "Still within the `DL_single_run` function definition, we set up the denoising autoencoder model in *Keras*."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "    #Before defining the network architecture, clear the current computational graph (if one exists), and specify the input dimensionality\n",
    "    K.clear_session()\n",
    "    inpshape = xtr.shape[1]\n",
    "    \n",
    "    #Define Network Architecture\n",
    "    inputvec= Input(shape=(inpshape,))\n",
    "    x       = Dropout(dro, input_shape=(inpshape,))(inputvec)\n",
    "    x       = Dense(units=int(units1), activation='relu', activity_regularizer=l1(10**l1r))(x)\n",
    "    encoded = Dense(units=int(units2), activation='relu', name='encoded')(x)\n",
    "    riskpred= Dense(units=1,  activation='linear', name='predicted_risk')(encoded)\n",
    "    z       = Dense(units=int(units1),  activation='relu')(encoded)\n",
    "    decoded = Dense(units=inpshape, activation='linear', name='decoded')(z)\n",
    "\n",
    "    model = Model(inputs=inputvec, outputs=[decoded,riskpred])\n",
    "    model.summary()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 4. Define Network prediction loss function\n",
    "We use the negative Cox Proportional Hazards partial likelihood function: $$L_{s} = -\\sum_{i=1}^{n} \\delta_{i} \\bigg\\{\\mathbf{W}'\\phi(\\mathbf{x}_i) - \\log \\sum_{j \\in R(t_i)} e^{\\mathbf{W}'\\phi(\\mathbf{x}_j)} \\bigg\\}$$\n",
    "for the survival prediction component of our network. This loss function is not one of the standard losses included in *Keras*. However, *Keras* allows the creation of user-defined custom loss functions written using symbolic Keras backend operations, which operate on tensor objects (as opposed to Python primitives). We show our user-defined function below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "    def _negative_log_likelihood(E, risk):  \n",
    "        hazard_ratio = K.exp(risk)\n",
    "        log_risk = K.log(K.cumsum(hazard_ratio))\n",
    "        uncensored_likelihood = risk - log_risk\n",
    "        censored_likelihood = uncensored_likelihood * E\n",
    "        neg_likelihood = -K.sum(censored_likelihood)\n",
    "        return neg_likelihood"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the function, argument `E` represents the censoring status ($\\delta_{i}$) and `risk` ($\\mathbf{W}'\\phi(\\mathbf{x}_i)$) represents the network output, i.e. the log hazard ratio. Before being passed to the `_negative_log_likelihood` function above, the data for the `E` and `risk` arguments must be sorted in order of descending survival time. This is because of the use of the `cumsum` (cumulative sum) Keras backend function on *Line 3* of the definition, which sums hazard ratios over the risk set of each patient (see the second term in the above equation)."
   ]
  },
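  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see why this works, here is a small NumPy illustration (with made-up values). After sorting by descending time, the cumulative sum at position $i$ adds the hazard ratios of all patients whose time is at least $t_i$, which is exactly the risk set $R(t_i)$:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Toy example: 4 patients with survival times and predicted log hazard ratios (illustrative values)\n",
    "tm   = np.array([300., 1200., 450., 800.])\n",
    "risk = np.array([0.5, -1.0, 0.2, 0.1])\n",
    "order = np.argsort(tm)[::-1]           #descending time: 1200, 800, 450, 300\n",
    "risk_sets = np.cumsum(np.exp(risk[order]))\n",
    "#risk_sets[i] = sum of hazard ratios over patients still at risk at the i-th (sorted) patient's time\n",
    "print(tm[order])\n",
    "print(risk_sets)"
   ]
  },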
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 4b. Arrange data into minibatches sorted by descending time\n",
    "We use the function below, with arguments `xvals` (input vector of motion descriptors), `evals` (censoring statuses), `tvals` (survival/censoring times [in days]) and `batchsize` (desired batch size)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def sort4minibatches(xvals, evals, tvals, batchsize):\n",
    "    #Shuffle sample indices, then partition them into batches of size batchsize (plus a final partial batch)\n",
    "    ntot = len(xvals)\n",
    "    indices = np.arange(ntot)\n",
    "    np.random.shuffle(indices)\n",
    "    start_idx=0\n",
    "    esall = []\n",
    "    for end_idx in list(range(batchsize, batchsize*(ntot//batchsize)+1, batchsize))+[ntot]:\n",
    "        excerpt = indices[start_idx:end_idx]\n",
    "        #Within each batch, sort in descending order of survival/censoring time\n",
    "        sort_idx = np.argsort(tvals[excerpt])[::-1]\n",
    "        es = excerpt[sort_idx]\n",
    "        esall += list(es)\n",
    "        start_idx = end_idx\n",
    "    return (xvals[esall], evals[esall], tvals[esall], esall)"
   ]
  },
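  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A quick demonstration of `sort4minibatches` on the toy arrays from the `prepare_data` example above (batch size 2): within each minibatch, the times appear in descending order."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Reusing the toy arrays X, E, TM defined earlier (3 samples, batch size 2)\n",
    "Xb, Eb, TMb, idx = sort4minibatches(X, E, TM, 2)\n",
    "print(TMb)  #each minibatch (first 2 entries, then the remainder) is sorted descending\n",
    "print(idx)  #the permutation applied to the original sample order"
   ]
  },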
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 5. Compile and Run Network\n",
    "Still within the `DL_single_run` function definition, we compile and run the model. For compilation, we specify our two losses: the reconstruction loss (`mean_squared_error`) and our custom Cox prediction loss (`_negative_log_likelihood`). Their corresponding weights are defined based on the `alpha` parameter.\n",
    "Note that for running the model (`model.fit`), we set `shuffle=False` to ensure that the predefined batch order (see `sort4minibatches`) is preserved from epoch to epoch.\n",
    "The last line (`return mlog`) of the `DL_single_run` function definition returns the training history object produced by `model.fit`. This contains the epoch history (losses at each epoch), the fitted model as a `Keras` model object (architecture, weights, etc.), and other information about the fit."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "    #Model compilation\n",
    "    optimdef = Adam(lr = lr)\n",
    "    model.compile(loss=[keras.losses.mean_squared_error, _negative_log_likelihood], loss_weights=[alpha,1-alpha], optimizer=optimdef, metrics={'decoded':keras.metrics.mean_squared_error})\n",
    "    \n",
    "    #Run model\n",
    "    mlog = model.fit(X_tr, [X_tr,E_tr], batch_size=batchsize, epochs=numepochs, shuffle=False, verbose=1)\n",
    "\n",
    "    K.clear_session()\n",
    "    return mlog"
   ]
  },
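  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, as an illustration of how `DL_single_run` might be invoked (the data and parameter values below are placeholders, not the settings used in our experiments), we can train on randomly generated data and inspect the per-epoch losses via the returned history object:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Illustrative call on random data (placeholder values only)\n",
    "np.random.seed(0)\n",
    "n, p = 100, 10\n",
    "xtr_demo = np.random.randn(n, p)\n",
    "ytr_demo = np.column_stack([np.random.binomial(1, 0.5, n),      #censoring status\n",
    "                            np.random.uniform(100, 3000, n)])   #time in days\n",
    "mlog = DL_single_run(xtr_demo, ytr_demo, units1=16, units2=4, dro=0.1, lr=0.001,\n",
    "                     l1r=-4, alpha=0.5, batchsize=20, numepochs=5)\n",
    "print(mlog.history.keys())  #per-output losses recorded at each epoch"
   ]
  }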
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}