[2cc208]: / demo / notebooks / demo_modelcomp_pvalue.ipynb

Download this file

150 lines (149 with data), 4.9 kB

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Statistical comparison of model performance"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this notebook, we will describe how we compare the accuracy of the 2 predictive models used in this study: the deep learning network (4DSurvival) and the conventional parameter model (utilizing volumetric indices of RV function)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To compare 2 models, we compare their optimism-corrected concordance indices (which are derived by the bootstrap internal validation procedure outlined in our paper). From equation (9) in the paper, the optimism-corrected concordance index for each model is given by: $$C_{corrected} = C_{full}^{full} - \\frac{1}{B}\\sum_{b=1}^{B} \\bigg( C_{b}^{b} - C_{b}^{full} \\bigg)$$\n",
    "In the above equation, the symbol $C_{s_{1}}^{s_{2}}$ refers to the concordance index of a model trained on sample $s_1$ and tested on sample $s_2$. The first term refers to the *apparent* predictive accuracy, i.e. the (inflated) concordance index obtained when a model trained on the full sample is then tested on the same sample. The second term is the average *optimism* (difference between *bootstrap performance* and *test performance*) over the $B$ bootstrap samples.\n",
    "Note that we can rewrite the equation above as: $$C_{corrected} = \\frac{1}{B}\\sum_{b=1}^{B} \\bigg[ C_{full}^{full} - \\bigg( C_{b}^{b} - C_{b}^{full} \\bigg) \\bigg]$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this formulation, we can think of the summand (the term within the summation) as the optimism-corrected concordance index based on only 1 particular bootstrap sample $b$ ($C_{b,corrected}$). Averaging this quantity across $b=\\{1,...,B\\}$ bootstrap samples gives $C_{corrected}$. \n",
    "To compare $C_{corrected}$ between 2 competing models, we perform a statistical test comparing the distributions of $C_{b,corrected}$ ($b=\\{1,...,B\\}$) between the 2 models. We use the Wilcoxon rank-sum test for this purpose. This is implemented in the code below:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "------------------------------------------------------------------------------------------------------\n",
    "Import required libraries:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "editable": false},
   "outputs": [],
   "source": [
    "import scipy\n",
    "from scipy.stats import wilcoxon\n",
    "import numpy as np\n",
    "import pickle"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For each model under comparison, read in bootstrap data and compute $C_{b,corrected}$ ($b = \\{1,...,B\\}$):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "editable": false},
   "outputs": [],
   "source": [
    "def p_reader(pfile):\n",
    "    with open(pfile, 'rb') as f: mlist = pickle.load(f)\n",
    "    return mlist[0], mlist[1]\n",
    "\n",
    "C_app_model1, opts_model1 = p_reader('../data/modelCstats_DL.pkl')\n",
    "C_app_model2, opts_model2 = p_reader('../data/modelCstats_conv.pkl')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "editable": false},
   "outputs": [],
   "source": [
    "Cb_adjs_model1 = [C_app_model1 - o for o in opts_model1]\n",
    "Cb_adjs_model2 = [C_app_model2 - o for o in opts_model2]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Perform Wilcoxon signed-rank test and compute p-value using the:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "editable": false},
   "outputs": [],
   "source": [
    "pval = wilcoxon(Cb_adjs_model1, Cb_adjs_model2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Print output:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "editable": false},
   "outputs": [],
   "source": [
    "print('Model 1 optimism-adjusted concordance index = {0:.4f}\\nModel 2 optimism-adjusted concordance index = {1:.4f}\\np-value = {2}'.format(np.mean(Cb_adjs_model1), np.mean(Cb_adjs_model2), pval.pvalue))"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}