--- a
+++ b/demo/notebooks/trainDL.ipynb
@@ -0,0 +1,231 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Deep Learning Training code"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This notebook describes our deep learning function. It sets up the architecture, pre-processes the input data and runs the specified number of epochs.\n",
+    "We will describe the code step by step."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Overview\n",
+    "The deep learning architecture is built using `Keras`, a model-level library that provides high-level building blocks for developing deep learning models. It provides a user-friendly interface to low-level tensor manipulation libraries that handle the computationally intensive tensor operations involved in model fitting (e.g. `Tensorflow`, `Theano`, `CNTK`). For this work, we used `Keras` interfaced with `Tensorflow` as the backend computational engine.\n",
+    "\n",
+    "### 1. Importing Libraries\n",
+    "We begin by importing the required libraries:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import keras\n",
+    "from keras import backend as K\n",
+    "from keras.models import Sequential, Model\n",
+    "from keras.layers import Input\n",
+    "from keras.layers.core import Dense, Dropout\n",
+    "from keras.optimizers import Adam\n",
+    "from keras.regularizers import l1, l2, l1_l2\n",
+    "import numpy as np\n",
+    "import pandas as pd\n",
+    "from sklearn.preprocessing import scale\n",
+    "import os, sys, time, pickle, copy"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### 2. Data formatting\n",
+    "We define a function that converts the input data to the required formats for analysis:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def prepare_data(x, e, t):\n",
+    "    #Convert inputs to the numeric types expected by the network\n",
+    "    return (x.astype(\"float32\"), e.astype(\"int32\"), t.astype(\"float32\"))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Define `DL_single_run` function\n",
+    "We define the function `DL_single_run` below. The first two function arguments are `xtr` and `ytr`, which represent the training data. Variable `xtr` is an $n \times p$ matrix (where $n$ is the sample size and $p$ is the number of variables [motion descriptors, in our case]). Variable `ytr` contains the survival outcomes for the training data; it is an $n \times 2$ matrix whose first column is the censoring status and whose second column is the survival/censoring time.\n",
+    "The `DL_single_run` function also has arguments representing network parameters: `units1` (hidden layer size), `units2` (latent code size), `dro` (dropout rate), `lr` (learning rate), `l1r` (L1 penalty for the activity regularizer), and `alpha` (tradeoff parameter). The final arguments are `batchsize` (batch size) and `numepochs` (number of epochs).\n",
+    "\n",
+    "The first steps of the function format the data into the correct numeric types and sort it into minibatches:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def DL_single_run(xtr, ytr, units1, units2, dro, lr, l1r, alpha, batchsize, numepochs):\n",
+    "    #Data preparation: create X_tr, E_tr and TM_tr, where X=input matrix, E=censoring status and TM=survival/censoring time, and apply formatting\n",
+    "    X_tr, E_tr, TM_tr = prepare_data(xtr, ytr[:,0,np.newaxis], ytr[:,1])\n",
+    "\n",
+    "    #Arrange data into minibatches (based on the specified batch size) and, within each minibatch, sort in descending order of survival/censoring time (see the explanation accompanying the Cox PH loss function definition)\n",
+    "    X_tr, E_tr, TM_tr, _ = sort4minibatches(X_tr, E_tr, TM_tr, batchsize)"
+   ]
+  },
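+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "As a quick aside, the cell below sketches the input format these first steps expect, using small synthetic arrays (the variable names, sizes and values here are hypothetical, chosen purely for illustration):"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "#Hypothetical toy inputs: n=5 samples, p=3 motion descriptors\n",
+    "np.random.seed(0)\n",
+    "xtr_demo = np.random.randn(5, 3)                               #n x p input matrix\n",
+    "ytr_demo = np.column_stack([np.random.randint(0, 2, 5),        #column 1: censoring status\n",
+    "                            np.random.uniform(100, 2000, 5)])  #column 2: survival/censoring time\n",
+    "\n",
+    "#Apply the same formatting step used inside DL_single_run\n",
+    "X_d, E_d, TM_d = prepare_data(xtr_demo, ytr_demo[:,0,np.newaxis], ytr_demo[:,1])\n",
+    "print(X_d.dtype, E_d.dtype, TM_d.dtype)  #float32 int32 float32\n",
+    "print(X_d.shape, E_d.shape, TM_d.shape)  #(5, 3) (5, 1) (5,)"
+   ]
+  },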
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### 3. Model Architecture\n",
+    "Still within the `DL_single_run` function definition, we set up the denoising autoencoder model in *Keras*."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "    #Before defining the network architecture, clear the current computational graph (if one exists) and specify the input dimensionality\n",
+    "    K.clear_session()\n",
+    "    inpshape = xtr.shape[1]\n",
+    "    \n",
+    "    #Define network architecture\n",
+    "    inputvec= Input(shape=(inpshape,))\n",
+    "    x = Dropout(dro, input_shape=(inpshape,))(inputvec)\n",
+    "    x = Dense(units=int(units1), activation='relu', activity_regularizer=l1(10**l1r))(x)\n",
+    "    encoded = Dense(units=int(units2), activation='relu', name='encoded')(x)\n",
+    "    riskpred= Dense(units=1, activation='linear', name='predicted_risk')(encoded)\n",
+    "    z = Dense(units=int(units1), activation='relu')(encoded)\n",
+    "    decoded = Dense(units=inpshape, activation='linear', name='decoded')(z)\n",
+    "\n",
+    "    model = Model(inputs=inputvec, outputs=[decoded,riskpred])\n",
+    "    model.summary()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### 4. Define Network prediction loss function\n",
+    "We use the negative Cox Proportional Hazards partial likelihood: $$L_{s} = -\sum_{i=1}^{n} \delta_{i} \bigg\{\mathbf{W}'\phi(\mathbf{x}_i) - \log \sum_{j \in R(t_i)} e^{\mathbf{W}'\phi(\mathbf{x}_j)} \bigg\}$$\n",
+    "for the survival prediction component of our network. This loss function is not one of the standard losses included in *Keras*. However, *Keras* allows the creation of user-defined custom loss functions written using symbolic Keras backend operations, which operate on tensor objects (as opposed to Python primitives). Our user-defined loss function is shown below."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "    def _negative_log_likelihood(E, risk): \n",
+    "        hazard_ratio = K.exp(risk)\n",
+    "        log_risk = K.log(K.cumsum(hazard_ratio))\n",
+    "        uncensored_likelihood = risk - log_risk\n",
+    "        censored_likelihood = uncensored_likelihood * E\n",
+    "        neg_likelihood = -K.sum(censored_likelihood)\n",
+    "        return neg_likelihood"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In the function, argument `E` represents the censoring status ($\delta_{i}$) and `risk` ($\mathbf{W}'\phi(\mathbf{x}_i)$) represents the network output, i.e. the log hazard ratio. Before being passed to the `_negative_log_likelihood` function above, the data for the `E` and `risk` arguments must be sorted in order of descending survival time. This is because of the use of the `cumsum` (cumulative sum) Keras backend function on *Line 3* of the definition (the computation of `log_risk`), which sums hazard ratios over the risk set of each patient (see the second term in the above equation)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### 4b. Arrange data into minibatches sorted by descending time\n",
+    "We use the function below, with arguments `xvals` (input matrix of motion descriptors), `evals` (censoring statuses), `tvals` (survival/censoring times [in days]) and `batchsize` (desired batch size)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def sort4minibatches(xvals, evals, tvals, batchsize):\n",
+    "    #Shuffle the sample indices, then walk through them one minibatch at a time\n",
+    "    ntot = len(xvals)\n",
+    "    indices = np.arange(ntot)\n",
+    "    np.random.shuffle(indices)\n",
+    "    start_idx=0\n",
+    "    esall = []\n",
+    "    for end_idx in list(range(batchsize, batchsize*(ntot//batchsize)+1, batchsize))+[ntot]:\n",
+    "        #Within the current minibatch, sort samples in descending order of survival/censoring time\n",
+    "        excerpt = indices[start_idx:end_idx]\n",
+    "        sort_idx = np.argsort(tvals[excerpt])[::-1]\n",
+    "        es = excerpt[sort_idx]\n",
+    "        esall += list(es)\n",
+    "        start_idx = end_idx\n",
+    "    return (xvals[esall], evals[esall], tvals[esall], esall)"
+   ]
+  },
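+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The toy example below (with hypothetical data and a batch size of 4) illustrates the effect of `sort4minibatches`: within each returned minibatch, the survival/censoring times are in descending order, as required by `_negative_log_likelihood`:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "#Hypothetical data: 10 samples, 3 descriptors, batch size 4\n",
+    "np.random.seed(0)\n",
+    "xd = np.random.randn(10, 3).astype(\"float32\")\n",
+    "ed = np.random.randint(0, 2, size=(10, 1)).astype(\"int32\")\n",
+    "td = np.random.uniform(100, 2000, size=10).astype(\"float32\")\n",
+    "\n",
+    "xs, es, ts, idx = sort4minibatches(xd, ed, td, 4)\n",
+    "print(ts[:4])   #first minibatch: times in descending order\n",
+    "print(ts[4:8])  #second minibatch: times in descending order\n",
+    "print(ts[8:])   #remainder batch: times in descending order"
+   ]
+  },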
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### 5. Compile and Run Network\n",
+    "Still within the `DL_single_run` function definition, we compile and run the model. For compilation, we specify our two losses: the reconstruction loss (`mean_squared_error`) and our custom Cox prediction loss (`_negative_log_likelihood`). Their corresponding weights are defined based on the `alpha` parameter.\n",
+    "Note that for running the model (`model.fit`), we set `shuffle=False` to ensure that the predefined batch order (see `sort4minibatches`) is preserved from epoch to epoch.\n",
+    "The last line (`return mlog`) of the `DL_single_run` function definition returns the training log produced by `model.fit` (a `Keras` `History` object). This contains the epoch history (losses at each epoch), the fitted model as a `Keras` model object (architecture, weights, etc.), and other information about the fit."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "    #Model compilation\n",
+    "    optimdef = Adam(lr = lr)\n",
+    "    model.compile(loss=[keras.losses.mean_squared_error, _negative_log_likelihood], loss_weights=[alpha,1-alpha], optimizer=optimdef, metrics={'decoded':keras.metrics.mean_squared_error})\n",
+    "    \n",
+    "    #Run model\n",
+    "    mlog = model.fit(X_tr, [X_tr,E_tr], batch_size=batchsize, epochs=numepochs, shuffle=False, verbose=1)\n",
+    "\n",
+    "    K.clear_session()\n",
+    "    return mlog"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.6.6"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}