{
 "cells": [
  {
   "cell_type": "markdown",
5
   "metadata": {},
6
   "source": [
7
    "# Deep Learning Training code"
8
   ]
9
  },
10
  {
11
   "cell_type": "markdown",
12
   "metadata": {},
13
   "source": [
14
    "This notebook describes our deep learning function. It sets up the architecture, pre-processed the input data and runs the specified number of epochs.\n",
15
    "We will describe the code step by step."
16
   ]
17
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Overview\n",
    "The deep learning architecture is built using `Keras`, a model-level library that provides high-level building blocks for developing deep learning models. It provides a user-friendly interface to low-level tensor manipulation libraries that handle the computationally intensive tensor operations involved in model fitting (e.g. `Tensorflow`, `Theano`, `CNTK`). For this work, we used `Keras` interfaced with `Tensorflow` as the backend computational engine.\n",
    "\n",
    "### 1. Importing Libraries\n",
    "We begin by importing the required libraries:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import keras\n",
    "from keras import backend as K\n",
    "from keras.models import Sequential, Model\n",
    "from keras.layers import Input\n",
    "from keras.layers.core import Dense, Dropout\n",
    "from keras.optimizers import Adam\n",
    "from keras.regularizers import l1, l2, l1_l2\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "from sklearn.preprocessing import scale\n",
    "import os, sys, time, pickle, copy"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2. Data formatting\n",
    "Define a function that converts the inputs to the required formats for analysis:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def prepare_data(x, e, t):\n",
    "    #Cast inputs to the numeric types expected downstream: float32 features, int32 censoring status, float32 times\n",
    "    return (x.astype(\"float32\"), e.astype(\"int32\"), t.astype(\"float32\"))"
   ]
  },
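  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick sanity check, we can call `prepare_data` on small made-up arrays (illustrative values only, not study data) to confirm the type conversions:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Illustration with toy inputs: 3 samples, 2 variables\n",
    "x_demo = np.array([[0.1, 2.0], [1.5, 0.3], [0.7, 0.9]])\n",
    "e_demo = np.array([1, 0, 1])             #censoring status (1=event observed, 0=censored)\n",
    "t_demo = np.array([450., 1200., 300.])   #survival/censoring time in days\n",
    "X, E, TM = prepare_data(x_demo, e_demo, t_demo)\n",
    "print(X.dtype, E.dtype, TM.dtype)  #expected: float32 int32 float32"
   ]
  },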
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Define `DL_single_run` function\n",
    "We define the function `DL_single_run` below. The first two arguments are the variables `xtr` and `ytr`, which represent the training data. Variable `xtr` is an $n \\times p$ matrix (where $n$ is the sample size and $p$ is the number of variables [motion descriptors, in our case]). Variable `ytr` contains the survival outcomes for the training data; it is an $n \\times 2$ matrix whose first column is the censoring status and whose second column is the survival/censoring time.\n",
    "The `DL_single_run` function also has arguments representing network parameters: `units1` (hidden layer size), `units2` (latent code size), `dro` (dropout rate), `lr` (learning rate), `l1r` (L1 penalty for the activity regularizer), and `alpha` (trade-off parameter). The final arguments are `batchsize` (batch size) and `numepochs` (number of epochs).\n",
    "\n",
    "The first steps of the function format the data into the correct numeric types and sort it into minibatches:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def DL_single_run(xtr, ytr, units1, units2, dro, lr, l1r, alpha, batchsize, numepochs):\n",
    "    #Data preparation: create X, E and TM where X=input vector, E=censoring status and TM=survival time. Apply formatting\n",
    "    X_tr, E_tr, TM_tr = prepare_data(xtr, ytr[:,0,np.newaxis], ytr[:,1])\n",
    "\n",
    "    #Arrange data into minibatches (based on specified batch size), and within each minibatch, sort in descending order of survival/censoring time (see explanation of Cox PH loss function definition)\n",
    "    X_tr, E_tr, TM_tr, _ = sort4minibatches(X_tr, E_tr, TM_tr, batchsize)"
   ]
  },
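  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To illustrate the expected `ytr` layout (with made-up values): each row is one patient, column 0 is the censoring status and column 1 is the survival/censoring time, so `ytr[:,0,np.newaxis]` extracts the statuses as an $n \\times 1$ column and `ytr[:,1]` extracts the times as a vector:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Toy ytr: 3 patients (illustrative values only)\n",
    "ytr_demo = np.array([[1, 450.], [0, 1200.], [1, 300.]])\n",
    "print(ytr_demo[:,0,np.newaxis])  #censoring statuses, shape (3, 1)\n",
    "print(ytr_demo[:,1])             #survival/censoring times, shape (3,)"
   ]
  },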
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3. Model Architecture\n",
    "Still within the `DL_single_run` function definition, we set up the denoising autoencoder model in *Keras*."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "    #Before defining the network architecture, clear the current computational graph (if one exists), and specify the input dimensionality\n",
    "    K.clear_session()\n",
    "    inpshape = xtr.shape[1]\n",
    "    \n",
    "    #Define Network Architecture\n",
    "    inputvec= Input(shape=(inpshape,))\n",
    "    x       = Dropout(dro, input_shape=(inpshape,))(inputvec)\n",
    "    x       = Dense(units=int(units1), activation='relu', activity_regularizer=l1(10**l1r))(x)\n",
    "    encoded = Dense(units=int(units2), activation='relu', name='encoded')(x)\n",
    "    riskpred= Dense(units=1,  activation='linear', name='predicted_risk')(encoded)\n",
    "    z       = Dense(units=int(units1),  activation='relu')(encoded)\n",
    "    decoded = Dense(units=inpshape, activation='linear', name='decoded')(z)\n",
    "\n",
    "    model = Model(inputs=inputvec, outputs=[decoded,riskpred])\n",
    "    model.summary()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 4. Define Network prediction loss function\n",
    "We use the negative Cox Proportional Hazards partial likelihood function: $$L_{s} = -\\sum_{i=1}^{n} \\delta_{i} \\bigg\\{\\mathbf{W}'\\phi(\\mathbf{x}_i) - \\log \\sum_{j \\in R(t_i)} e^{\\mathbf{W}'\\phi(\\mathbf{x}_j)} \\bigg\\}$$\n",
    "for the survival prediction component of our network. This loss function is not one of the standard losses included in *Keras*. However, *Keras* allows the creation of user-defined custom loss functions written using symbolic Keras backend operations, which operate on tensor objects (as opposed to Python primitives). We show our user-defined function below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "    def _negative_log_likelihood(E, risk):  \n",
    "        hazard_ratio = K.exp(risk)\n",
    "        log_risk = K.log(K.cumsum(hazard_ratio))\n",
    "        uncensored_likelihood = risk - log_risk\n",
    "        censored_likelihood = uncensored_likelihood * E\n",
    "        neg_likelihood = -K.sum(censored_likelihood)\n",
    "        return neg_likelihood"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the function, argument `E` represents the censoring status ($\\delta_{i}$) and `risk` ($\\mathbf{W}'\\phi(\\mathbf{x}_i)$) represents the network output, i.e. the log hazard ratio. Before being passed to the `_negative_log_likelihood` function above, the data for the `E` and `risk` arguments must be sorted in order of descending survival time. This is because of the use of the `cumsum` (cumulative sum) Keras backend function on *Line 3* of the definition, which sums hazard ratios over the risk set of each patient (see the second term in the above equation)."
   ]
  },
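  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see why this works, here is a small NumPy illustration (with made-up values). After sorting by descending time, the cumulative sum at position $i$ adds the hazard ratios of all patients whose time is at least $t_i$, which is exactly the risk set $R(t_i)$:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Toy example: 4 patients with survival times and predicted log hazard ratios (illustrative values)\n",
    "tm   = np.array([300., 1200., 450., 800.])\n",
    "risk = np.array([0.5, -1.0, 0.2, 0.1])\n",
    "order = np.argsort(tm)[::-1]           #descending time: 1200, 800, 450, 300\n",
    "risk_sets = np.cumsum(np.exp(risk[order]))\n",
    "#risk_sets[i] = sum of hazard ratios over patients still at risk at the i-th (sorted) patient's time\n",
    "print(tm[order])\n",
    "print(risk_sets)"
   ]
  },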
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 4b. Arrange data into minibatches sorted by descending time\n",
    "We use the function below, with arguments `xvals` (input vector of motion descriptors), `evals` (censoring statuses), `tvals` (survival/censoring times [in days]) and `batchsize` (desired batch size)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def sort4minibatches(xvals, evals, tvals, batchsize):\n",
    "    #Shuffle sample indices, then partition them into batches of size batchsize (plus a final partial batch)\n",
    "    ntot = len(xvals)\n",
    "    indices = np.arange(ntot)\n",
    "    np.random.shuffle(indices)\n",
    "    start_idx=0\n",
    "    esall = []\n",
    "    for end_idx in list(range(batchsize, batchsize*(ntot//batchsize)+1, batchsize))+[ntot]:\n",
    "        excerpt = indices[start_idx:end_idx]\n",
    "        #Within each batch, sort in descending order of survival/censoring time\n",
    "        sort_idx = np.argsort(tvals[excerpt])[::-1]\n",
    "        es = excerpt[sort_idx]\n",
    "        esall += list(es)\n",
    "        start_idx = end_idx\n",
    "    return (xvals[esall], evals[esall], tvals[esall], esall)"
   ]
  },
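  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A quick demonstration of `sort4minibatches` on the toy arrays from the `prepare_data` example above (batch size 2): within each minibatch, the times appear in descending order."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Reusing the toy arrays X, E, TM defined earlier (3 samples, batch size 2)\n",
    "Xb, Eb, TMb, idx = sort4minibatches(X, E, TM, 2)\n",
    "print(TMb)  #each minibatch (first 2 entries, then the remainder) is sorted descending\n",
    "print(idx)  #the permutation applied to the original sample order"
   ]
  },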
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 5. Compile and Run Network\n",
    "Still within the `DL_single_run` function definition, we compile and run the model. For compilation, we specify our two losses: the reconstruction loss (`mean_squared_error`) and our custom Cox prediction loss (`_negative_log_likelihood`). Their corresponding weights are defined based on the `alpha` parameter.\n",
    "Note that for running the model (`model.fit`), we set `shuffle=False` to ensure that the predefined batch order (see `sort4minibatches`) is preserved from epoch to epoch.\n",
    "The last line (`return mlog`) of the `DL_single_run` function definition returns the training history object produced by `model.fit`. This contains the epoch history (losses at each epoch), the fitted model as a `Keras` model object (architecture, weights, etc.), and other information about the fit."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "    #Model compilation\n",
    "    optimdef = Adam(lr = lr)\n",
    "    model.compile(loss=[keras.losses.mean_squared_error, _negative_log_likelihood], loss_weights=[alpha,1-alpha], optimizer=optimdef, metrics={'decoded':keras.metrics.mean_squared_error})\n",
    "    \n",
    "    #Run model\n",
    "    mlog = model.fit(X_tr, [X_tr,E_tr], batch_size=batchsize, epochs=numepochs, shuffle=False, verbose=1)\n",
    "\n",
    "    K.clear_session()\n",
    "    return mlog"
   ]
  },
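  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, as an illustration of how `DL_single_run` might be invoked (the data and parameter values below are placeholders, not the settings used in our experiments), we can train on randomly generated data and inspect the per-epoch losses via the returned history object:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Illustrative call on random data (placeholder values only)\n",
    "np.random.seed(0)\n",
    "n, p = 100, 10\n",
    "xtr_demo = np.random.randn(n, p)\n",
    "ytr_demo = np.column_stack([np.random.binomial(1, 0.5, n),      #censoring status\n",
    "                            np.random.uniform(100, 3000, n)])   #time in days\n",
    "mlog = DL_single_run(xtr_demo, ytr_demo, units1=16, units2=4, dro=0.1, lr=0.001,\n",
    "                     l1r=-4, alpha=0.5, batchsize=20, numepochs=5)\n",
    "print(mlog.history.keys())  #per-output losses recorded at each epoch"
   ]
  }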
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}