5527 lines (5526 with data), 183.7 kB
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"import pandas as pd\n",
"import os\n",
"# ^^^ pyforest auto-imports - don't write above this line\n",
"# Clinical Deterioration Prediction Model - Preprocessing II\n",
"\n",
"\n",
"## Data Source \n",
"\n",
"For this project, I used publicly available Electronic Health Records (EHRs) datasets. The MIT Media Lab for Computational Physiology has developed MIMIC-IIIv1.4 dataset based on 46,520 patients who stayed in critical care units of the Beth Israel Deaconess Medical Center of Boston between 2001 and 2012. MIMIC-IIIv1.4 dataset is freely available to researchers across the world. A formal request should be made directly to www.mimic.physionet.org, to gain acess to the data. There is a required course on human research ‘Data or Specimens Only Research’ prior to data acess request. I have secured one here -www.citiprogram.org/verify/?kb6607b78-5821-4de5-8cad-daf929f7fbbf-33486907\n",
"\n",
"The dataset has 26 relational tables including patient’s hospital admission, callout information when patient was ready for discharge, caregiver information, electronic charted events including vital signs and any additional information relevant to patient care, patient demographic data, list of services the patient was admitted or transferred under, ICU stay types, diagnoses types, laboratory measurments, microbiology tests and sensitivity, prescription data and billing information. \n",
"\n",
"Although I have full access to the MIMIC-IIIv1.4 datasets, I can not share any part of the data publicly. If you are interested to learn more about the data, there is a MIMIC III Demo dataset based on 100 patients https://mimic.physionet.org/gettingstarted/demo/. If you are interested to requesting access to the data - https://mimic.physionet.org/gettingstarted/access/."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import pandas as pd\n",
"import numpy as np\n",
"import random\n",
"import sys\n",
"import csv"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"application/javascript": [
"\n",
" if (window._pyforest_update_imports_cell) { window._pyforest_update_imports_cell('import pandas as pd\\nimport os'); }\n",
" "
],
"text/plain": [
"<IPython.core.display.Javascript object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"'C:\\\\Users\\\\abebu\\\\Dropbox\\\\Data Science\\\\Projects\\\\Capstone Project 1\\\\Potential Projects\\\\9. MIMIC\\\\data_wrangling'"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"os.getcwd()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"application/javascript": [
"\n",
" if (window._pyforest_update_imports_cell) { window._pyforest_update_imports_cell('import pandas as pd\\nimport os'); }\n",
" "
],
"text/plain": [
"<IPython.core.display.Javascript object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"os.chdir(\"C://Users/abebu/Google Drive/mimic-iii-clinical-database-1.4\")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"C:\\Users\\abebu\\DS\\lib\\site-packages\\IPython\\core\\interactiveshell.py:3058: DtypeWarning: Columns (18,19,21,22,23,25,26,27,28) have mixed types. Specify dtype option on import or set low_memory=False.\n",
" interactivity=interactivity, compiler=compiler, result=result)\n",
"C:\\Users\\abebu\\DS\\lib\\site-packages\\numpy\\lib\\arraysetops.py:569: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison\n",
" mask |= (ar1 == a)\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>SUBJECT_ID</th>\n",
" <th>HADM_ID</th>\n",
" <th>ICUSTAY_ID</th>\n",
" <th>ART BP Systolic</th>\n",
" <th>BUN</th>\n",
" <th>GCS - Eye Opening</th>\n",
" <th>GCS - Motor Response</th>\n",
" <th>GCS - Verbal Response</th>\n",
" <th>GCS Total</th>\n",
" <th>HCO3 (serum)</th>\n",
" <th>...</th>\n",
" <th>ADMITTIME</th>\n",
" <th>DISCHTIME</th>\n",
" <th>ADMISSION_TYPE</th>\n",
" <th>HOSPITAL_EXPIRE_FLAG</th>\n",
" <th>GENDER</th>\n",
" <th>DOB</th>\n",
" <th>DOD</th>\n",
" <th>DOD_HOSP</th>\n",
" <th>EXPIRE_FLAG</th>\n",
" <th>AGE</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>163353</td>\n",
" <td>243653.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>...</td>\n",
" <td>2138-07-17</td>\n",
" <td>2138-07-21 15:48:00</td>\n",
" <td>NEWBORN</td>\n",
" <td>0.0</td>\n",
" <td>M</td>\n",
" <td>2138-07-17</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <td>1</td>\n",
" <td>2</td>\n",
" <td>163353</td>\n",
" <td>243653.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>...</td>\n",
" <td>2138-07-17</td>\n",
" <td>2138-07-21 15:48:00</td>\n",
" <td>NEWBORN</td>\n",
" <td>0.0</td>\n",
" <td>M</td>\n",
" <td>2138-07-17</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>163353</td>\n",
" <td>243653.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>...</td>\n",
" <td>2138-07-17</td>\n",
" <td>2138-07-21 15:48:00</td>\n",
" <td>NEWBORN</td>\n",
" <td>0.0</td>\n",
" <td>M</td>\n",
" <td>2138-07-17</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <td>3</td>\n",
" <td>2</td>\n",
" <td>163353</td>\n",
" <td>243653.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>...</td>\n",
" <td>2138-07-17</td>\n",
" <td>2138-07-21 15:48:00</td>\n",
" <td>NEWBORN</td>\n",
" <td>0.0</td>\n",
" <td>M</td>\n",
" <td>2138-07-17</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <td>4</td>\n",
" <td>2</td>\n",
" <td>163353</td>\n",
" <td>243653.0</td>\n",
" <td>72.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>...</td>\n",
" <td>2138-07-17</td>\n",
" <td>2138-07-21 15:48:00</td>\n",
" <td>NEWBORN</td>\n",
" <td>0.0</td>\n",
" <td>M</td>\n",
" <td>2138-07-17</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>25100118</td>\n",
" <td>99999</td>\n",
" <td>113369</td>\n",
" <td>246512.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>6.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>...</td>\n",
" <td>2117-12-30</td>\n",
" <td>2118-01-04 16:30:00</td>\n",
" <td>ELECTIVE</td>\n",
" <td>0.0</td>\n",
" <td>F</td>\n",
" <td>2054-01-06</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>64.021918</td>\n",
" </tr>\n",
" <tr>\n",
" <td>25100119</td>\n",
" <td>99999</td>\n",
" <td>113369</td>\n",
" <td>246512.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>6.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>...</td>\n",
" <td>2117-12-30</td>\n",
" <td>2118-01-04 16:30:00</td>\n",
" <td>ELECTIVE</td>\n",
" <td>0.0</td>\n",
" <td>F</td>\n",
" <td>2054-01-06</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>64.021918</td>\n",
" </tr>\n",
" <tr>\n",
" <td>25100120</td>\n",
" <td>99999</td>\n",
" <td>113369</td>\n",
" <td>246512.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>6.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>...</td>\n",
" <td>2117-12-30</td>\n",
" <td>2118-01-04 16:30:00</td>\n",
" <td>ELECTIVE</td>\n",
" <td>0.0</td>\n",
" <td>F</td>\n",
" <td>2054-01-06</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>64.021918</td>\n",
" </tr>\n",
" <tr>\n",
" <td>25100121</td>\n",
" <td>99999</td>\n",
" <td>113369</td>\n",
" <td>246512.0</td>\n",
" <td>NaN</td>\n",
" <td>15.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>...</td>\n",
" <td>2117-12-30</td>\n",
" <td>2118-01-04 16:30:00</td>\n",
" <td>ELECTIVE</td>\n",
" <td>0.0</td>\n",
" <td>F</td>\n",
" <td>2054-01-06</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>64.021918</td>\n",
" </tr>\n",
" <tr>\n",
" <td>25100122</td>\n",
" <td>99999</td>\n",
" <td>113369</td>\n",
" <td>246512.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>26.0</td>\n",
" <td>...</td>\n",
" <td>2117-12-30</td>\n",
" <td>2118-01-04 16:30:00</td>\n",
" <td>ELECTIVE</td>\n",
" <td>0.0</td>\n",
" <td>F</td>\n",
" <td>2054-01-06</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>64.021918</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>25100123 rows × 30 columns</p>\n",
"</div>"
],
"text/plain": [
" SUBJECT_ID HADM_ID ICUSTAY_ID ART BP Systolic BUN \\\n",
"0 2 163353 243653.0 NaN NaN \n",
"1 2 163353 243653.0 NaN NaN \n",
"2 2 163353 243653.0 NaN NaN \n",
"3 2 163353 243653.0 NaN NaN \n",
"4 2 163353 243653.0 72.0 NaN \n",
"... ... ... ... ... ... \n",
"25100118 99999 113369 246512.0 NaN NaN \n",
"25100119 99999 113369 246512.0 NaN NaN \n",
"25100120 99999 113369 246512.0 NaN NaN \n",
"25100121 99999 113369 246512.0 NaN 15.0 \n",
"25100122 99999 113369 246512.0 NaN NaN \n",
"\n",
" GCS - Eye Opening GCS - Motor Response GCS - Verbal Response \\\n",
"0 NaN NaN NaN \n",
"1 NaN NaN NaN \n",
"2 NaN NaN NaN \n",
"3 NaN NaN NaN \n",
"4 NaN NaN NaN \n",
"... ... ... ... \n",
"25100118 NaN 6.0 NaN \n",
"25100119 NaN 6.0 NaN \n",
"25100120 NaN 6.0 NaN \n",
"25100121 NaN NaN NaN \n",
"25100122 NaN NaN NaN \n",
"\n",
" GCS Total HCO3 (serum) ... ADMITTIME DISCHTIME \\\n",
"0 NaN NaN ... 2138-07-17 2138-07-21 15:48:00 \n",
"1 NaN NaN ... 2138-07-17 2138-07-21 15:48:00 \n",
"2 NaN NaN ... 2138-07-17 2138-07-21 15:48:00 \n",
"3 NaN NaN ... 2138-07-17 2138-07-21 15:48:00 \n",
"4 NaN NaN ... 2138-07-17 2138-07-21 15:48:00 \n",
"... ... ... ... ... ... \n",
"25100118 NaN NaN ... 2117-12-30 2118-01-04 16:30:00 \n",
"25100119 NaN NaN ... 2117-12-30 2118-01-04 16:30:00 \n",
"25100120 NaN NaN ... 2117-12-30 2118-01-04 16:30:00 \n",
"25100121 NaN NaN ... 2117-12-30 2118-01-04 16:30:00 \n",
"25100122 NaN 26.0 ... 2117-12-30 2118-01-04 16:30:00 \n",
"\n",
" ADMISSION_TYPE HOSPITAL_EXPIRE_FLAG GENDER DOB DOD \\\n",
"0 NEWBORN 0.0 M 2138-07-17 NaN \n",
"1 NEWBORN 0.0 M 2138-07-17 NaN \n",
"2 NEWBORN 0.0 M 2138-07-17 NaN \n",
"3 NEWBORN 0.0 M 2138-07-17 NaN \n",
"4 NEWBORN 0.0 M 2138-07-17 NaN \n",
"... ... ... ... ... ... \n",
"25100118 ELECTIVE 0.0 F 2054-01-06 NaN \n",
"25100119 ELECTIVE 0.0 F 2054-01-06 NaN \n",
"25100120 ELECTIVE 0.0 F 2054-01-06 NaN \n",
"25100121 ELECTIVE 0.0 F 2054-01-06 NaN \n",
"25100122 ELECTIVE 0.0 F 2054-01-06 NaN \n",
"\n",
" DOD_HOSP EXPIRE_FLAG AGE \n",
"0 NaN 0.0 0.000000 \n",
"1 NaN 0.0 0.000000 \n",
"2 NaN 0.0 0.000000 \n",
"3 NaN 0.0 0.000000 \n",
"4 NaN 0.0 0.000000 \n",
"... ... ... ... \n",
"25100118 NaN 0.0 64.021918 \n",
"25100119 NaN 0.0 64.021918 \n",
"25100120 NaN 0.0 64.021918 \n",
"25100121 NaN 0.0 64.021918 \n",
"25100122 NaN 0.0 64.021918 \n",
"\n",
"[25100123 rows x 30 columns]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"saps=pd.read_csv('charts_f.csv', header=0, index_col=0)\n",
"saps"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"saps=saps.drop(['DOD', 'DOD_HOSP', 'DOB', 'ADMITTIME', 'DISCHTIME'], axis=1)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"saps=saps.drop(['INTIME', 'OUTTIME'], axis=1)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"Int64Index: 25100123 entries, 0 to 25100122\n",
"Data columns (total 23 columns):\n",
"SUBJECT_ID int64\n",
"HADM_ID int64\n",
"ICUSTAY_ID float64\n",
"ART BP Systolic float64\n",
"BUN float64\n",
"GCS - Eye Opening float64\n",
"GCS - Motor Response float64\n",
"GCS - Verbal Response float64\n",
"GCS Total float64\n",
"HCO3 (serum) float64\n",
"Heart Rate float64\n",
"Inspired O2 Fraction float64\n",
"Potassium (whole blood) float64\n",
"Sodium (serum) float64\n",
"Temperature Celsius float64\n",
"Total Bilirubin float64\n",
"WBC float64\n",
"LOS float64\n",
"ADMISSION_TYPE object\n",
"HOSPITAL_EXPIRE_FLAG float64\n",
"GENDER object\n",
"EXPIRE_FLAG float64\n",
"AGE float64\n",
"dtypes: float64(19), int64(2), object(2)\n",
"memory usage: 4.5+ GB\n"
]
}
],
"source": [
"saps.info()"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th>BUN</th>\n",
" <th>GCS - Eye Opening</th>\n",
" <th>ART BP Systolic</th>\n",
" <th>GCS - Motor Response</th>\n",
" <th>GCS - Verbal Response</th>\n",
" <th>GCS Total</th>\n",
" <th>HCO3 (serum)</th>\n",
" <th>Heart Rate</th>\n",
" <th>Inspired O2 Fraction</th>\n",
" <th>Potassium (whole blood)</th>\n",
" <th>Sodium (serum)</th>\n",
" <th>Temperature Celsius</th>\n",
" <th>Total Bilirubin</th>\n",
" <th>WBC</th>\n",
" </tr>\n",
" <tr>\n",
" <th>SUBJECT_ID</th>\n",
" <th>HADM_ID</th>\n",
" <th>ICUSTAY_ID</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>2</td>\n",
" <td>163353</td>\n",
" <td>243653.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>72.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>131.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>3</td>\n",
" <td>145834</td>\n",
" <td>211552.0</td>\n",
" <td>17.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>3.0</td>\n",
" <td>NaN</td>\n",
" <td>64.0</td>\n",
" <td>20.0</td>\n",
" <td>3.4</td>\n",
" <td>125.0</td>\n",
" <td>37.666698</td>\n",
" <td>NaN</td>\n",
" <td>10.3</td>\n",
" </tr>\n",
" <tr>\n",
" <td>4</td>\n",
" <td>185777</td>\n",
" <td>294638.0</td>\n",
" <td>10.0</td>\n",
" <td>NaN</td>\n",
" <td>97.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>15.0</td>\n",
" <td>NaN</td>\n",
" <td>74.0</td>\n",
" <td>NaN</td>\n",
" <td>3.3</td>\n",
" <td>138.0</td>\n",
" <td>37.444445</td>\n",
" <td>1.9</td>\n",
" <td>7.3</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5</td>\n",
" <td>178980</td>\n",
" <td>214757.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>67.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>140.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>13.9</td>\n",
" </tr>\n",
" <tr>\n",
" <td>6</td>\n",
" <td>107064</td>\n",
" <td>228232.0</td>\n",
" <td>65.0</td>\n",
" <td>NaN</td>\n",
" <td>125.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>14.0</td>\n",
" <td>NaN</td>\n",
" <td>71.0</td>\n",
" <td>NaN</td>\n",
" <td>4.4</td>\n",
" <td>129.0</td>\n",
" <td>37.444445</td>\n",
" <td>0.3</td>\n",
" <td>2.9</td>\n",
" </tr>\n",
" <tr>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>99985</td>\n",
" <td>176670</td>\n",
" <td>279638.0</td>\n",
" <td>13.0</td>\n",
" <td>2.0</td>\n",
" <td>71.0</td>\n",
" <td>4.0</td>\n",
" <td>1.0</td>\n",
" <td>NaN</td>\n",
" <td>21.0</td>\n",
" <td>5.0</td>\n",
" <td>39.0</td>\n",
" <td>NaN</td>\n",
" <td>131.0</td>\n",
" <td>39.166667</td>\n",
" <td>1.4</td>\n",
" <td>12.3</td>\n",
" </tr>\n",
" <tr>\n",
" <td>99991</td>\n",
" <td>151118</td>\n",
" <td>226241.0</td>\n",
" <td>17.0</td>\n",
" <td>3.0</td>\n",
" <td>112.0</td>\n",
" <td>6.0</td>\n",
" <td>5.0</td>\n",
" <td>NaN</td>\n",
" <td>21.0</td>\n",
" <td>50.0</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" <td>143.0</td>\n",
" <td>37.611111</td>\n",
" <td>NaN</td>\n",
" <td>4.2</td>\n",
" </tr>\n",
" <tr>\n",
" <td>99992</td>\n",
" <td>197084</td>\n",
" <td>242052.0</td>\n",
" <td>44.0</td>\n",
" <td>3.0</td>\n",
" <td>83.0</td>\n",
" <td>6.0</td>\n",
" <td>5.0</td>\n",
" <td>NaN</td>\n",
" <td>22.0</td>\n",
" <td>50.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>136.0</td>\n",
" <td>36.888889</td>\n",
" <td>NaN</td>\n",
" <td>12.9</td>\n",
" </tr>\n",
" <tr>\n",
" <td>99995</td>\n",
" <td>137810</td>\n",
" <td>229633.0</td>\n",
" <td>10.0</td>\n",
" <td>3.0</td>\n",
" <td>104.0</td>\n",
" <td>6.0</td>\n",
" <td>1.0</td>\n",
" <td>NaN</td>\n",
" <td>32.0</td>\n",
" <td>45.0</td>\n",
" <td>50.0</td>\n",
" <td>3.2</td>\n",
" <td>132.0</td>\n",
" <td>37.611111</td>\n",
" <td>3.5</td>\n",
" <td>6.4</td>\n",
" </tr>\n",
" <tr>\n",
" <td>99999</td>\n",
" <td>113369</td>\n",
" <td>246512.0</td>\n",
" <td>15.0</td>\n",
" <td>4.0</td>\n",
" <td>100.0</td>\n",
" <td>6.0</td>\n",
" <td>5.0</td>\n",
" <td>NaN</td>\n",
" <td>26.0</td>\n",
" <td>50.0</td>\n",
" <td>35.0</td>\n",
" <td>NaN</td>\n",
" <td>141.0</td>\n",
" <td>37.611111</td>\n",
" <td>NaN</td>\n",
" <td>9.3</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>60661 rows × 14 columns</p>\n",
"</div>"
],
"text/plain": [
" BUN GCS - Eye Opening ART BP Systolic \\\n",
"SUBJECT_ID HADM_ID ICUSTAY_ID \n",
"2 163353 243653.0 NaN NaN 72.0 \n",
"3 145834 211552.0 17.0 NaN 0.0 \n",
"4 185777 294638.0 10.0 NaN 97.0 \n",
"5 178980 214757.0 NaN NaN 67.0 \n",
"6 107064 228232.0 65.0 NaN 125.0 \n",
"... ... ... ... \n",
"99985 176670 279638.0 13.0 2.0 71.0 \n",
"99991 151118 226241.0 17.0 3.0 112.0 \n",
"99992 197084 242052.0 44.0 3.0 83.0 \n",
"99995 137810 229633.0 10.0 3.0 104.0 \n",
"99999 113369 246512.0 15.0 4.0 100.0 \n",
"\n",
" GCS - Motor Response GCS - Verbal Response \\\n",
"SUBJECT_ID HADM_ID ICUSTAY_ID \n",
"2 163353 243653.0 NaN NaN \n",
"3 145834 211552.0 NaN NaN \n",
"4 185777 294638.0 NaN NaN \n",
"5 178980 214757.0 NaN NaN \n",
"6 107064 228232.0 NaN NaN \n",
"... ... ... \n",
"99985 176670 279638.0 4.0 1.0 \n",
"99991 151118 226241.0 6.0 5.0 \n",
"99992 197084 242052.0 6.0 5.0 \n",
"99995 137810 229633.0 6.0 1.0 \n",
"99999 113369 246512.0 6.0 5.0 \n",
"\n",
" GCS Total HCO3 (serum) Heart Rate \\\n",
"SUBJECT_ID HADM_ID ICUSTAY_ID \n",
"2 163353 243653.0 NaN NaN 131.0 \n",
"3 145834 211552.0 3.0 NaN 64.0 \n",
"4 185777 294638.0 15.0 NaN 74.0 \n",
"5 178980 214757.0 NaN NaN 140.0 \n",
"6 107064 228232.0 14.0 NaN 71.0 \n",
"... ... ... ... \n",
"99985 176670 279638.0 NaN 21.0 5.0 \n",
"99991 151118 226241.0 NaN 21.0 50.0 \n",
"99992 197084 242052.0 NaN 22.0 50.0 \n",
"99995 137810 229633.0 NaN 32.0 45.0 \n",
"99999 113369 246512.0 NaN 26.0 50.0 \n",
"\n",
" Inspired O2 Fraction Potassium (whole blood) \\\n",
"SUBJECT_ID HADM_ID ICUSTAY_ID \n",
"2 163353 243653.0 NaN NaN \n",
"3 145834 211552.0 20.0 3.4 \n",
"4 185777 294638.0 NaN 3.3 \n",
"5 178980 214757.0 NaN NaN \n",
"6 107064 228232.0 NaN 4.4 \n",
"... ... ... \n",
"99985 176670 279638.0 39.0 NaN \n",
"99991 151118 226241.0 0.0 NaN \n",
"99992 197084 242052.0 NaN NaN \n",
"99995 137810 229633.0 50.0 3.2 \n",
"99999 113369 246512.0 35.0 NaN \n",
"\n",
" Sodium (serum) Temperature Celsius \\\n",
"SUBJECT_ID HADM_ID ICUSTAY_ID \n",
"2 163353 243653.0 NaN NaN \n",
"3 145834 211552.0 125.0 37.666698 \n",
"4 185777 294638.0 138.0 37.444445 \n",
"5 178980 214757.0 NaN NaN \n",
"6 107064 228232.0 129.0 37.444445 \n",
"... ... ... \n",
"99985 176670 279638.0 131.0 39.166667 \n",
"99991 151118 226241.0 143.0 37.611111 \n",
"99992 197084 242052.0 136.0 36.888889 \n",
"99995 137810 229633.0 132.0 37.611111 \n",
"99999 113369 246512.0 141.0 37.611111 \n",
"\n",
" Total Bilirubin WBC \n",
"SUBJECT_ID HADM_ID ICUSTAY_ID \n",
"2 163353 243653.0 NaN NaN \n",
"3 145834 211552.0 NaN 10.3 \n",
"4 185777 294638.0 1.9 7.3 \n",
"5 178980 214757.0 NaN 13.9 \n",
"6 107064 228232.0 0.3 2.9 \n",
"... ... ... \n",
"99985 176670 279638.0 1.4 12.3 \n",
"99991 151118 226241.0 NaN 4.2 \n",
"99992 197084 242052.0 NaN 12.9 \n",
"99995 137810 229633.0 3.5 6.4 \n",
"99999 113369 246512.0 NaN 9.3 \n",
"\n",
"[60661 rows x 14 columns]"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"saps"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th>BUN</th>\n",
" <th>GCS - Eye Opening</th>\n",
" <th>ART BP Systolic</th>\n",
" <th>GCS - Motor Response</th>\n",
" <th>GCS - Verbal Response</th>\n",
" <th>GCS Total</th>\n",
" <th>HCO3 (serum)</th>\n",
" <th>Heart Rate</th>\n",
" <th>Inspired O2 Fraction</th>\n",
" <th>Potassium (whole blood)</th>\n",
" <th>Sodium (serum)</th>\n",
" <th>Temperature Celsius</th>\n",
" <th>Total Bilirubin</th>\n",
" <th>WBC</th>\n",
" <th>HR</th>\n",
" </tr>\n",
" <tr>\n",
" <th>SUBJECT_ID</th>\n",
" <th>HADM_ID</th>\n",
" <th>ICUSTAY_ID</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>2</td>\n",
" <td>163353</td>\n",
" <td>243653.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>72.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>131.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>4.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>3</td>\n",
" <td>145834</td>\n",
" <td>211552.0</td>\n",
" <td>17.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>3.0</td>\n",
" <td>NaN</td>\n",
" <td>64.0</td>\n",
" <td>20.0</td>\n",
" <td>3.4</td>\n",
" <td>125.0</td>\n",
" <td>37.666698</td>\n",
" <td>NaN</td>\n",
" <td>10.3</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>4</td>\n",
" <td>185777</td>\n",
" <td>294638.0</td>\n",
" <td>10.0</td>\n",
" <td>NaN</td>\n",
" <td>97.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>15.0</td>\n",
" <td>NaN</td>\n",
" <td>74.0</td>\n",
" <td>NaN</td>\n",
" <td>3.3</td>\n",
" <td>138.0</td>\n",
" <td>37.444445</td>\n",
" <td>1.9</td>\n",
" <td>7.3</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5</td>\n",
" <td>178980</td>\n",
" <td>214757.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>67.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>140.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>13.9</td>\n",
" <td>4.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>6</td>\n",
" <td>107064</td>\n",
" <td>228232.0</td>\n",
" <td>65.0</td>\n",
" <td>NaN</td>\n",
" <td>125.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>14.0</td>\n",
" <td>NaN</td>\n",
" <td>71.0</td>\n",
" <td>NaN</td>\n",
" <td>4.4</td>\n",
" <td>129.0</td>\n",
" <td>37.444445</td>\n",
" <td>0.3</td>\n",
" <td>2.9</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>99985</td>\n",
" <td>176670</td>\n",
" <td>279638.0</td>\n",
" <td>13.0</td>\n",
" <td>2.0</td>\n",
" <td>71.0</td>\n",
" <td>4.0</td>\n",
" <td>1.0</td>\n",
" <td>NaN</td>\n",
" <td>21.0</td>\n",
" <td>5.0</td>\n",
" <td>39.0</td>\n",
" <td>NaN</td>\n",
" <td>131.0</td>\n",
" <td>39.166667</td>\n",
" <td>1.4</td>\n",
" <td>12.3</td>\n",
" <td>11.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>99991</td>\n",
" <td>151118</td>\n",
" <td>226241.0</td>\n",
" <td>17.0</td>\n",
" <td>3.0</td>\n",
" <td>112.0</td>\n",
" <td>6.0</td>\n",
" <td>5.0</td>\n",
" <td>NaN</td>\n",
" <td>21.0</td>\n",
" <td>50.0</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" <td>143.0</td>\n",
" <td>37.611111</td>\n",
" <td>NaN</td>\n",
" <td>4.2</td>\n",
" <td>2.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>99992</td>\n",
" <td>197084</td>\n",
" <td>242052.0</td>\n",
" <td>44.0</td>\n",
" <td>3.0</td>\n",
" <td>83.0</td>\n",
" <td>6.0</td>\n",
" <td>5.0</td>\n",
" <td>NaN</td>\n",
" <td>22.0</td>\n",
" <td>50.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>136.0</td>\n",
" <td>36.888889</td>\n",
" <td>NaN</td>\n",
" <td>12.9</td>\n",
" <td>2.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>99995</td>\n",
" <td>137810</td>\n",
" <td>229633.0</td>\n",
" <td>10.0</td>\n",
" <td>3.0</td>\n",
" <td>104.0</td>\n",
" <td>6.0</td>\n",
" <td>1.0</td>\n",
" <td>NaN</td>\n",
" <td>32.0</td>\n",
" <td>45.0</td>\n",
" <td>50.0</td>\n",
" <td>3.2</td>\n",
" <td>132.0</td>\n",
" <td>37.611111</td>\n",
" <td>3.5</td>\n",
" <td>6.4</td>\n",
" <td>2.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>99999</td>\n",
" <td>113369</td>\n",
" <td>246512.0</td>\n",
" <td>15.0</td>\n",
" <td>4.0</td>\n",
" <td>100.0</td>\n",
" <td>6.0</td>\n",
" <td>5.0</td>\n",
" <td>NaN</td>\n",
" <td>26.0</td>\n",
" <td>50.0</td>\n",
" <td>35.0</td>\n",
" <td>NaN</td>\n",
" <td>141.0</td>\n",
" <td>37.611111</td>\n",
" <td>NaN</td>\n",
" <td>9.3</td>\n",
" <td>2.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>60661 rows × 15 columns</p>\n",
"</div>"
],
"text/plain": [
" BUN GCS - Eye Opening ART BP Systolic \\\n",
"SUBJECT_ID HADM_ID ICUSTAY_ID \n",
"2 163353 243653.0 NaN NaN 72.0 \n",
"3 145834 211552.0 17.0 NaN 0.0 \n",
"4 185777 294638.0 10.0 NaN 97.0 \n",
"5 178980 214757.0 NaN NaN 67.0 \n",
"6 107064 228232.0 65.0 NaN 125.0 \n",
"... ... ... ... \n",
"99985 176670 279638.0 13.0 2.0 71.0 \n",
"99991 151118 226241.0 17.0 3.0 112.0 \n",
"99992 197084 242052.0 44.0 3.0 83.0 \n",
"99995 137810 229633.0 10.0 3.0 104.0 \n",
"99999 113369 246512.0 15.0 4.0 100.0 \n",
"\n",
" GCS - Motor Response GCS - Verbal Response \\\n",
"SUBJECT_ID HADM_ID ICUSTAY_ID \n",
"2 163353 243653.0 NaN NaN \n",
"3 145834 211552.0 NaN NaN \n",
"4 185777 294638.0 NaN NaN \n",
"5 178980 214757.0 NaN NaN \n",
"6 107064 228232.0 NaN NaN \n",
"... ... ... \n",
"99985 176670 279638.0 4.0 1.0 \n",
"99991 151118 226241.0 6.0 5.0 \n",
"99992 197084 242052.0 6.0 5.0 \n",
"99995 137810 229633.0 6.0 1.0 \n",
"99999 113369 246512.0 6.0 5.0 \n",
"\n",
" GCS Total HCO3 (serum) Heart Rate \\\n",
"SUBJECT_ID HADM_ID ICUSTAY_ID \n",
"2 163353 243653.0 NaN NaN 131.0 \n",
"3 145834 211552.0 3.0 NaN 64.0 \n",
"4 185777 294638.0 15.0 NaN 74.0 \n",
"5 178980 214757.0 NaN NaN 140.0 \n",
"6 107064 228232.0 14.0 NaN 71.0 \n",
"... ... ... ... \n",
"99985 176670 279638.0 NaN 21.0 5.0 \n",
"99991 151118 226241.0 NaN 21.0 50.0 \n",
"99992 197084 242052.0 NaN 22.0 50.0 \n",
"99995 137810 229633.0 NaN 32.0 45.0 \n",
"99999 113369 246512.0 NaN 26.0 50.0 \n",
"\n",
" Inspired O2 Fraction Potassium (whole blood) \\\n",
"SUBJECT_ID HADM_ID ICUSTAY_ID \n",
"2 163353 243653.0 NaN NaN \n",
"3 145834 211552.0 20.0 3.4 \n",
"4 185777 294638.0 NaN 3.3 \n",
"5 178980 214757.0 NaN NaN \n",
"6 107064 228232.0 NaN 4.4 \n",
"... ... ... \n",
"99985 176670 279638.0 39.0 NaN \n",
"99991 151118 226241.0 0.0 NaN \n",
"99992 197084 242052.0 NaN NaN \n",
"99995 137810 229633.0 50.0 3.2 \n",
"99999 113369 246512.0 35.0 NaN \n",
"\n",
" Sodium (serum) Temperature Celsius \\\n",
"SUBJECT_ID HADM_ID ICUSTAY_ID \n",
"2 163353 243653.0 NaN NaN \n",
"3 145834 211552.0 125.0 37.666698 \n",
"4 185777 294638.0 138.0 37.444445 \n",
"5 178980 214757.0 NaN NaN \n",
"6 107064 228232.0 129.0 37.444445 \n",
"... ... ... \n",
"99985 176670 279638.0 131.0 39.166667 \n",
"99991 151118 226241.0 143.0 37.611111 \n",
"99992 197084 242052.0 136.0 36.888889 \n",
"99995 137810 229633.0 132.0 37.611111 \n",
"99999 113369 246512.0 141.0 37.611111 \n",
"\n",
" Total Bilirubin WBC HR \n",
"SUBJECT_ID HADM_ID ICUSTAY_ID \n",
"2 163353 243653.0 NaN NaN 4.0 \n",
"3 145834 211552.0 NaN 10.3 NaN \n",
"4 185777 294638.0 1.9 7.3 0.0 \n",
"5 178980 214757.0 NaN 13.9 4.0 \n",
"6 107064 228232.0 0.3 2.9 0.0 \n",
"... ... ... ... \n",
"99985 176670 279638.0 1.4 12.3 11.0 \n",
"99991 151118 226241.0 NaN 4.2 2.0 \n",
"99992 197084 242052.0 NaN 12.9 2.0 \n",
"99995 137810 229633.0 3.5 6.4 2.0 \n",
"99999 113369 246512.0 NaN 9.3 2.0 \n",
"\n",
"[60661 rows x 15 columns]"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
" \n",
"saps.loc[saps['Heart Rate'] <40, 'HR'] = 11\n",
"saps.loc[saps['Heart Rate'].between(40, 59), 'HR'] = 2\n",
"saps.loc[saps['Heart Rate'].between(70, 119), 'HR'] = 0\n",
"saps.loc[saps['Heart Rate'].between(120, 159), 'HR'] = 4\n",
"saps.loc[saps['Heart Rate'] >159, 'HR'] = 7"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [],
"source": [
"saps.loc[saps['BUN'] <28, 'BUN'] = 0\n",
"saps.loc[saps['BUN'].between(28, 83), 'BUN'] = 6\n",
"saps.loc[saps['BUN'] >83, 'BUN'] = 10"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"saps['GCS_SCORE']=saps['GCS - Eye Opening'] + saps['GCS - Motor Response'] + saps['GCS - Verbal Response']"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"saps=saps.drop(['GCS - Eye Opening', 'GCS - Motor Response', 'GCS - Verbal Response'], axis=1)"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
"saps.GCS_SCORE.fillna(saps['GCS Total'], inplace=True)"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"source": [
"del saps['GCS Total']"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [],
"source": [
"saps.loc[saps['GCS_SCORE'] <6, 'GCS_SCORE'] = 26\n",
"saps.loc[saps['GCS_SCORE'].between(6, 8), 'GCS_SCORE'] = 13\n",
"saps.loc[saps['GCS_SCORE'].between(9, 10), 'GCS_SCORE'] = 7\n",
"saps.loc[saps['GCS_SCORE'].between(11, 13), 'GCS_SCORE'] = 5\n",
"saps.loc[saps['GCS_SCORE'].between(14, 15), 'GCS_SCORE'] = 0"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [],
"source": [
"saps.loc[saps['Temperature Celsius'] <39, 'Temperature Celsius'] = 3\n",
"saps.loc[saps['Temperature Celsius'] >38.9, 'Temperature Celsius'] = 0\n"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [],
"source": [
"saps.loc[saps['ART BP Systolic'] <70, 'ART BP Systolic'] = 13\n",
"saps.loc[saps['ART BP Systolic'].between(70, 99), 'ART BP Systolic'] = 5\n",
"saps.loc[saps['ART BP Systolic'].between(100, 199), 'ART BP Systolic'] = 0\n",
"saps.loc[saps['ART BP Systolic']>199, 'ART BP Systolic'] = 2"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [],
"source": [
"saps.loc[saps['HCO3 (serum)'] <15, 'HCO3 (serum)'] = 6\n",
"saps.loc[saps['HCO3 (serum)'].between(15, 19), 'HCO3 (serum)'] = 3\n",
"saps.loc[saps['HCO3 (serum)']>19, 'HCO3 (serum)'] = 0"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [],
"source": [
"saps.loc[saps['Total Bilirubin'] <4, 'Total Bilirubin'] = 0\n",
"saps.loc[saps['Total Bilirubin'].between(4, 5.9), 'Total Bilirubin'] = 4\n",
"saps.loc[saps['Total Bilirubin']>5.9, 'Total Bilirubin'] = 9"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [],
"source": [
"saps.loc[saps['Potassium (whole blood)'] <3, 'Potassium (whole blood)'] = 3\n",
"saps.loc[saps['Potassium (whole blood)'].between(3, 4.9), 'Potassium (whole blood)'] = 0\n",
"saps.loc[saps['Potassium (whole blood)']>4.9, 'Potassium (whole blood)'] = 3"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [],
"source": [
"saps.loc[saps['Sodium (serum)'] <125, 'Sodium (serum)'] = 5\n",
"saps.loc[saps['Sodium (serum)'].between(125, 144), 'Sodium (serum)'] = 0\n",
"saps.loc[saps['Sodium (serum)']>144, 'Sodium (serum)'] = 1"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [],
"source": [
"saps.loc[saps['WBC'] <1, 'WBC'] = 12\n",
"saps.loc[saps['WBC'].between(1, 19.9), 'WBC'] = 0\n",
"saps.loc[saps['WBC']>19.9, 'WBC'] = 3"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [],
"source": [
"saps.loc[saps['Inspired O2 Fraction'] <13.3, 'Inspired O2 Fraction'] = 11\n",
"saps.loc[saps['Inspired O2 Fraction'].between(13.3, 26.5), 'Inspired O2 Fraction'] = 9\n",
"saps.loc[saps['Inspired O2 Fraction']>26.5, 'Inspired O2 Fraction'] = 6\n",
"saps['Inspired O2 Fraction'] = saps['Inspired O2 Fraction'].fillna(0)"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [],
"source": [
"#ART BP Systolic\n",
"del saps['Heart Rate']"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"#rename columns\n",
"saps = saps.rename(columns = {'HCO3 (serum)':'Bicarbonate','Inspired O2 Fraction':'ventilation', \n",
" 'Potassium (whole blood)': 'Potassium', 'Sodium (serum)':'Sodium', \n",
" 'Temperature Celsius': 'Temp', 'Total Bilirubin':'Bilirubin',\n",
" 'GCS_SCORE':'GCS'})"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th>BUN</th>\n",
" <th>Bicarbonate</th>\n",
" <th>ventilation</th>\n",
" <th>Potassium</th>\n",
" <th>Sodium</th>\n",
" <th>Temp</th>\n",
" <th>Bilirubin</th>\n",
" <th>WBC</th>\n",
" <th>HR</th>\n",
" <th>GCS</th>\n",
" </tr>\n",
" <tr>\n",
" <th>SUBJECT_ID</th>\n",
" <th>HADM_ID</th>\n",
" <th>ICUSTAY_ID</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>2</td>\n",
" <td>163353</td>\n",
" <td>243653.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>4.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>3</td>\n",
" <td>145834</td>\n",
" <td>211552.0</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" <td>9.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" <td>26.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>4</td>\n",
" <td>185777</td>\n",
" <td>294638.0</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5</td>\n",
" <td>178980</td>\n",
" <td>214757.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>4.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>6</td>\n",
" <td>107064</td>\n",
" <td>228232.0</td>\n",
" <td>6.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>99985</td>\n",
" <td>176670</td>\n",
" <td>279638.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>6.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>11.0</td>\n",
" <td>5.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>99991</td>\n",
" <td>151118</td>\n",
" <td>226241.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>11.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>2.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>99992</td>\n",
" <td>197084</td>\n",
" <td>242052.0</td>\n",
" <td>6.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>2.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>99995</td>\n",
" <td>137810</td>\n",
" <td>229633.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>6.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>2.0</td>\n",
" <td>7.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>99999</td>\n",
" <td>113369</td>\n",
" <td>246512.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>6.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>2.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>60661 rows × 10 columns</p>\n",
"</div>"
],
"text/plain": [
" BUN Bicarbonate ventilation Potassium \\\n",
"SUBJECT_ID HADM_ID ICUSTAY_ID \n",
"2 163353 243653.0 NaN NaN 0.0 NaN \n",
"3 145834 211552.0 0.0 NaN 9.0 0.0 \n",
"4 185777 294638.0 0.0 NaN 0.0 0.0 \n",
"5 178980 214757.0 NaN NaN 0.0 NaN \n",
"6 107064 228232.0 6.0 NaN 0.0 0.0 \n",
"... ... ... ... ... \n",
"99985 176670 279638.0 0.0 0.0 6.0 NaN \n",
"99991 151118 226241.0 0.0 0.0 11.0 NaN \n",
"99992 197084 242052.0 6.0 0.0 0.0 NaN \n",
"99995 137810 229633.0 0.0 0.0 6.0 0.0 \n",
"99999 113369 246512.0 0.0 0.0 6.0 NaN \n",
"\n",
" Sodium Temp Bilirubin WBC HR GCS \n",
"SUBJECT_ID HADM_ID ICUSTAY_ID \n",
"2 163353 243653.0 NaN NaN NaN NaN 4.0 NaN \n",
"3 145834 211552.0 0.0 3.0 NaN 0.0 NaN 26.0 \n",
"4 185777 294638.0 0.0 3.0 0.0 0.0 0.0 0.0 \n",
"5 178980 214757.0 NaN NaN NaN 0.0 4.0 NaN \n",
"6 107064 228232.0 0.0 3.0 0.0 0.0 0.0 0.0 \n",
"... ... ... ... ... ... ... \n",
"99985 176670 279638.0 0.0 0.0 0.0 0.0 11.0 5.0 \n",
"99991 151118 226241.0 0.0 3.0 NaN 0.0 2.0 0.0 \n",
"99992 197084 242052.0 0.0 3.0 NaN 0.0 2.0 0.0 \n",
"99995 137810 229633.0 0.0 3.0 0.0 0.0 2.0 7.0 \n",
"99999 113369 246512.0 0.0 3.0 NaN 0.0 2.0 0.0 \n",
"\n",
"[60661 rows x 10 columns]"
]
},
"execution_count": 46,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"saps"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"C:\\Users\\abebu\\DS\\lib\\site-packages\\numpy\\lib\\arraysetops.py:569: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison\n",
" mask |= (ar1 == a)\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>ART BP Systolic</th>\n",
" </tr>\n",
" <tr>\n",
" <th>ICUSTAY_ID</th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>243653.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>243653.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>243653.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>243653.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>243653.0</td>\n",
" <td>72.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>246512.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>246512.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>246512.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>246512.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>246512.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>25100123 rows × 1 columns</p>\n",
"</div>"
],
"text/plain": [
" ART BP Systolic\n",
"ICUSTAY_ID \n",
"243653.0 NaN\n",
"243653.0 NaN\n",
"243653.0 NaN\n",
"243653.0 NaN\n",
"243653.0 72.0\n",
"... ...\n",
"246512.0 NaN\n",
"246512.0 NaN\n",
"246512.0 NaN\n",
"246512.0 NaN\n",
"246512.0 NaN\n",
"\n",
"[25100123 rows x 1 columns]"
]
},
"execution_count": 47,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"fields = ['ICUSTAY_ID', 'ART BP Systolic']\n",
"bp=pd.read_csv('charts_f.csv', usecols=fields, header=0, index_col=0)\n",
"bp"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [],
"source": [
"bp.loc[bp['ART BP Systolic'] <70, 'ART BP Systolic'] = 13\n",
"bp.loc[bp['ART BP Systolic'].between(70, 99), 'ART BP Systolic'] = 5\n",
"bp.loc[bp['ART BP Systolic'].between(100, 199), 'ART BP Systolic'] = 0\n",
"bp.loc[bp['ART BP Systolic']>199, 'ART BP Systolic'] = 2"
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {},
"outputs": [],
"source": [
"bp=bp.groupby('ICUSTAY_ID').agg('max')"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"outputs": [],
"source": [
"bp= bp.rename(columns = {'ART BP Systolic':'SystolicBP'})"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>SystolicBP</th>\n",
" </tr>\n",
" <tr>\n",
" <th>ICUSTAY_ID</th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>200001.0</td>\n",
" <td>5.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>200003.0</td>\n",
" <td>13.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>200006.0</td>\n",
" <td>5.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>200007.0</td>\n",
" <td>5.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>200009.0</td>\n",
" <td>5.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>299993.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>299994.0</td>\n",
" <td>13.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>299995.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>299998.0</td>\n",
" <td>5.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>299999.0</td>\n",
" <td>5.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>60517 rows × 1 columns</p>\n",
"</div>"
],
"text/plain": [
" SystolicBP\n",
"ICUSTAY_ID \n",
"200001.0 5.0\n",
"200003.0 13.0\n",
"200006.0 5.0\n",
"200007.0 5.0\n",
"200009.0 5.0\n",
"... ...\n",
"299993.0 0.0\n",
"299994.0 13.0\n",
"299995.0 0.0\n",
"299998.0 5.0\n",
"299999.0 5.0\n",
"\n",
"[60517 rows x 1 columns]"
]
},
"execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"bp"
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {},
"outputs": [],
"source": [
"def icu_merge(table1, table2):\n",
" return table1.merge(table2, how='left', left_on=['ICUSTAY_ID'], right_on=['ICUSTAY_ID'])"
]
},
{
"cell_type": "code",
"execution_count": 56,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>BUN</th>\n",
" <th>Bicarbonate</th>\n",
" <th>ventilation</th>\n",
" <th>Potassium</th>\n",
" <th>Sodium</th>\n",
" <th>Temp</th>\n",
" <th>Bilirubin</th>\n",
" <th>WBC</th>\n",
" <th>HR</th>\n",
" <th>GCS</th>\n",
" <th>SystolicBP</th>\n",
" </tr>\n",
" <tr>\n",
" <th>ICUSTAY_ID</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>243653.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>4.0</td>\n",
" <td>NaN</td>\n",
" <td>5.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>211552.0</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" <td>9.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" <td>26.0</td>\n",
" <td>13.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>294638.0</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>5.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>214757.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>4.0</td>\n",
" <td>NaN</td>\n",
" <td>13.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>228232.0</td>\n",
" <td>6.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>279638.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>6.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>11.0</td>\n",
" <td>5.0</td>\n",
" <td>5.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>226241.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>11.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>2.0</td>\n",
" <td>0.0</td>\n",
" <td>2.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>242052.0</td>\n",
" <td>6.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>2.0</td>\n",
" <td>0.0</td>\n",
" <td>5.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>229633.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>6.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>2.0</td>\n",
" <td>7.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>246512.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>6.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>2.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>60661 rows × 11 columns</p>\n",
"</div>"
],
"text/plain": [
" BUN Bicarbonate ventilation Potassium Sodium Temp Bilirubin \\\n",
"ICUSTAY_ID \n",
"243653.0 NaN NaN 0.0 NaN NaN NaN NaN \n",
"211552.0 0.0 NaN 9.0 0.0 0.0 3.0 NaN \n",
"294638.0 0.0 NaN 0.0 0.0 0.0 3.0 0.0 \n",
"214757.0 NaN NaN 0.0 NaN NaN NaN NaN \n",
"228232.0 6.0 NaN 0.0 0.0 0.0 3.0 0.0 \n",
"... ... ... ... ... ... ... ... \n",
"279638.0 0.0 0.0 6.0 NaN 0.0 0.0 0.0 \n",
"226241.0 0.0 0.0 11.0 NaN 0.0 3.0 NaN \n",
"242052.0 6.0 0.0 0.0 NaN 0.0 3.0 NaN \n",
"229633.0 0.0 0.0 6.0 0.0 0.0 3.0 0.0 \n",
"246512.0 0.0 0.0 6.0 NaN 0.0 3.0 NaN \n",
"\n",
" WBC HR GCS SystolicBP \n",
"ICUSTAY_ID \n",
"243653.0 NaN 4.0 NaN 5.0 \n",
"211552.0 0.0 NaN 26.0 13.0 \n",
"294638.0 0.0 0.0 0.0 5.0 \n",
"214757.0 0.0 4.0 NaN 13.0 \n",
"228232.0 0.0 0.0 0.0 0.0 \n",
"... ... ... ... ... \n",
"279638.0 0.0 11.0 5.0 5.0 \n",
"226241.0 0.0 2.0 0.0 2.0 \n",
"242052.0 0.0 2.0 0.0 5.0 \n",
"229633.0 0.0 2.0 7.0 0.0 \n",
"246512.0 0.0 2.0 0.0 0.0 \n",
"\n",
"[60661 rows x 11 columns]"
]
},
"execution_count": 56,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"saps=icu_merge(saps, bp)\n",
"saps"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"application/javascript": [
"\n",
" if (window._pyforest_update_imports_cell) { window._pyforest_update_imports_cell('import pandas as pd\\nimport os'); }\n",
" "
],
"text/plain": [
"<IPython.core.display.Javascript object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>UO</th>\n",
" </tr>\n",
" <tr>\n",
" <th>ICUSTAY_ID</th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>200001.0</td>\n",
" <td>0.000</td>\n",
" </tr>\n",
" <tr>\n",
" <td>200003.0</td>\n",
" <td>0.320</td>\n",
" </tr>\n",
" <tr>\n",
" <td>200006.0</td>\n",
" <td>1.160</td>\n",
" </tr>\n",
" <tr>\n",
" <td>200007.0</td>\n",
" <td>0.620</td>\n",
" </tr>\n",
" <tr>\n",
" <td>200009.0</td>\n",
" <td>0.960</td>\n",
" </tr>\n",
" <tr>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>299992.0</td>\n",
" <td>0.000</td>\n",
" </tr>\n",
" <tr>\n",
" <td>299993.0</td>\n",
" <td>0.245</td>\n",
" </tr>\n",
" <tr>\n",
" <td>299995.0</td>\n",
" <td>1.640</td>\n",
" </tr>\n",
" <tr>\n",
" <td>299998.0</td>\n",
" <td>0.730</td>\n",
" </tr>\n",
" <tr>\n",
" <td>299999.0</td>\n",
" <td>0.990</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>53718 rows × 1 columns</p>\n",
"</div>"
],
"text/plain": [
" UO\n",
"ICUSTAY_ID \n",
"200001.0 0.000\n",
"200003.0 0.320\n",
"200006.0 1.160\n",
"200007.0 0.620\n",
"200009.0 0.960\n",
"... ...\n",
"299992.0 0.000\n",
"299993.0 0.245\n",
"299995.0 1.640\n",
"299998.0 0.730\n",
"299999.0 0.990\n",
"\n",
"[53718 rows x 1 columns]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"fields = ['ICUSTAY_ID', 'UO']\n",
"uo=pd.read_csv('uo_value.csv', usecols=fields, header=0, index_col=0)\n",
"uo"
]
},
{
"cell_type": "code",
"execution_count": 104,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>uo</th>\n",
" </tr>\n",
" <tr>\n",
" <th>ICUSTAY_ID</th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>4887256.0</td>\n",
" <td>2.106</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5109404.0</td>\n",
" <td>3.085</td>\n",
" </tr>\n",
" <tr>\n",
" <td>4220812.0</td>\n",
" <td>2.835</td>\n",
" </tr>\n",
" <tr>\n",
" <td>4887256.0</td>\n",
" <td>2.365</td>\n",
" </tr>\n",
" <tr>\n",
" <td>1555036.0</td>\n",
" <td>1.495</td>\n",
" </tr>\n",
" <tr>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>0.0</td>\n",
" <td>0.000</td>\n",
" </tr>\n",
" <tr>\n",
" <td>0.0</td>\n",
" <td>0.000</td>\n",
" </tr>\n",
" <tr>\n",
" <td>2885364.0</td>\n",
" <td>0.420</td>\n",
" </tr>\n",
" <tr>\n",
" <td>4808940.0</td>\n",
" <td>2.535</td>\n",
" </tr>\n",
" <tr>\n",
" <td>1202235.0</td>\n",
" <td>0.425</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>37874 rows × 1 columns</p>\n",
"</div>"
],
"text/plain": [
" uo\n",
"ICUSTAY_ID \n",
"4887256.0 2.106\n",
"5109404.0 3.085\n",
"4220812.0 2.835\n",
"4887256.0 2.365\n",
"1555036.0 1.495\n",
"... ...\n",
"0.0 0.000\n",
"0.0 0.000\n",
"2885364.0 0.420\n",
"4808940.0 2.535\n",
"1202235.0 0.425\n",
"\n",
"[37874 rows x 1 columns]"
]
},
"execution_count": 104,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"uo= uo.rename(columns = {'URINE_OUTPUT':'uo'})\n",
"uo"
]
},
{
"cell_type": "code",
"execution_count": 105,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>uo</th>\n",
" </tr>\n",
" <tr>\n",
" <th>ICUSTAY_ID</th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>4887256.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5109404.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>4220812.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>4887256.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>1555036.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>2885364.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>4808940.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>1202235.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>37874 rows × 1 columns</p>\n",
"</div>"
],
"text/plain": [
" uo\n",
"ICUSTAY_ID \n",
"4887256.0 0.0\n",
"5109404.0 0.0\n",
"4220812.0 0.0\n",
"4887256.0 0.0\n",
"1555036.0 0.0\n",
"... ...\n",
"0.0 0.0\n",
"0.0 0.0\n",
"2885364.0 0.0\n",
"4808940.0 0.0\n",
"1202235.0 0.0\n",
"\n",
"[37874 rows x 1 columns]"
]
},
"execution_count": 105,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"uo.loc[uo['uo'] <0.500, 'uo'] = 11\n",
"uo.loc[uo['uo'].between(0.500, 0.999), 'uo'] = 4\n",
"uo.loc[uo['uo']>0.999, 'uo'] = 0\n",
"uo"
]
},
{
"cell_type": "code",
"execution_count": 112,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>AGE</th>\n",
" </tr>\n",
" <tr>\n",
" <th>ICUSTAY_ID</th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>280836</td>\n",
" <td>66.019178</td>\n",
" </tr>\n",
" <tr>\n",
" <td>206613</td>\n",
" <td>40.126027</td>\n",
" </tr>\n",
" <tr>\n",
" <td>220345</td>\n",
" <td>80.128767</td>\n",
" </tr>\n",
" <tr>\n",
" <td>249196</td>\n",
" <td>45.715068</td>\n",
" </tr>\n",
" <tr>\n",
" <td>210407</td>\n",
" <td>67.139726</td>\n",
" </tr>\n",
" <tr>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>201233</td>\n",
" <td>77.161644</td>\n",
" </tr>\n",
" <tr>\n",
" <td>283653</td>\n",
" <td>300.200000</td>\n",
" </tr>\n",
" <tr>\n",
" <td>241585</td>\n",
" <td>53.128767</td>\n",
" </tr>\n",
" <tr>\n",
" <td>202802</td>\n",
" <td>67.884932</td>\n",
" </tr>\n",
" <tr>\n",
" <td>275346</td>\n",
" <td>34.241096</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>61532 rows × 1 columns</p>\n",
"</div>"
],
"text/plain": [
" AGE\n",
"ICUSTAY_ID \n",
"280836 66.019178\n",
"206613 40.126027\n",
"220345 80.128767\n",
"249196 45.715068\n",
"210407 67.139726\n",
"... ...\n",
"201233 77.161644\n",
"283653 300.200000\n",
"241585 53.128767\n",
"202802 67.884932\n",
"275346 34.241096\n",
"\n",
"[61532 rows x 1 columns]"
]
},
"execution_count": 112,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"fields = ['ICUSTAY_ID', 'AGE']\n",
"age=pd.read_csv('icu.csv', usecols=fields, header=0, index_col=0)\n",
"age"
]
},
{
"cell_type": "code",
"execution_count": 113,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>AGE</th>\n",
" </tr>\n",
" <tr>\n",
" <th>ICUSTAY_ID</th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>280836</td>\n",
" <td>66.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>206613</td>\n",
" <td>40.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>220345</td>\n",
" <td>80.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>249196</td>\n",
" <td>45.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>210407</td>\n",
" <td>67.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>201233</td>\n",
" <td>77.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>283653</td>\n",
" <td>300.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>241585</td>\n",
" <td>53.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>202802</td>\n",
" <td>67.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>275346</td>\n",
" <td>34.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>61532 rows × 1 columns</p>\n",
"</div>"
],
"text/plain": [
" AGE\n",
"ICUSTAY_ID \n",
"280836 66.0\n",
"206613 40.0\n",
"220345 80.0\n",
"249196 45.0\n",
"210407 67.0\n",
"... ...\n",
"201233 77.0\n",
"283653 300.0\n",
"241585 53.0\n",
"202802 67.0\n",
"275346 34.0\n",
"\n",
"[61532 rows x 1 columns]"
]
},
"execution_count": 113,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"age['AGE'] = age['AGE'].apply(np.floor)\n",
"age"
]
},
{
"cell_type": "code",
"execution_count": 114,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>AGE</th>\n",
" </tr>\n",
" <tr>\n",
" <th>ICUSTAY_ID</th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>280836</td>\n",
" <td>12.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>206613</td>\n",
" <td>7.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>220345</td>\n",
" <td>18.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>249196</td>\n",
" <td>7.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>210407</td>\n",
" <td>12.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>201233</td>\n",
" <td>16.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>283653</td>\n",
" <td>18.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>241585</td>\n",
" <td>7.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>202802</td>\n",
" <td>12.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>275346</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>61532 rows × 1 columns</p>\n",
"</div>"
],
"text/plain": [
" AGE\n",
"ICUSTAY_ID \n",
"280836 12.0\n",
"206613 7.0\n",
"220345 18.0\n",
"249196 7.0\n",
"210407 12.0\n",
"... ...\n",
"201233 16.0\n",
"283653 18.0\n",
"241585 7.0\n",
"202802 12.0\n",
"275346 0.0\n",
"\n",
"[61532 rows x 1 columns]"
]
},
"execution_count": 114,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"age.loc[age['AGE'] <40, 'AGE'] = 0\n",
"age.loc[age['AGE'].between(40, 59), 'AGE'] = 7\n",
"age.loc[age['AGE'].between(60, 69), 'AGE'] = 12\n",
"age.loc[age['AGE'].between(70, 74), 'AGE'] = 15\n",
"age.loc[age['AGE'].between(75, 79), 'AGE'] = 16\n",
"age.loc[age['AGE']>79, 'AGE'] = 18\n",
"\n",
"age"
]
},
{
"cell_type": "code",
"execution_count": 115,
"metadata": {},
"outputs": [],
"source": [
"age.to_csv('AGE.csv')"
]
},
{
"cell_type": "code",
"execution_count": 70,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>SUBJECT_ID</th>\n",
" <th>HADM_ID</th>\n",
" <th>ICUSTAY_ID</th>\n",
" <th>INTIME</th>\n",
" <th>OUTTIME</th>\n",
" <th>LOS</th>\n",
" <th>ADMITTIME</th>\n",
" <th>DISCHTIME</th>\n",
" <th>ADMISSION_TYPE</th>\n",
" <th>HOSPITAL_EXPIRE_FLAG</th>\n",
" <th>GENDER</th>\n",
" <th>DOB</th>\n",
" <th>DOD</th>\n",
" <th>DOD_HOSP</th>\n",
" <th>EXPIRE_FLAG</th>\n",
" <th>AGE</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>0</td>\n",
" <td>268</td>\n",
" <td>110404</td>\n",
" <td>280836</td>\n",
" <td>2198-02-14 23:27:38</td>\n",
" <td>2198-02-18 05:26:11</td>\n",
" <td>3.2490</td>\n",
" <td>2198-02-11</td>\n",
" <td>2198-02-18 03:55:00</td>\n",
" <td>EMERGENCY</td>\n",
" <td>1</td>\n",
" <td>F</td>\n",
" <td>2132-02-21</td>\n",
" <td>2198-02-18 00:00:00</td>\n",
" <td>2198-02-18 00:00:00</td>\n",
" <td>1</td>\n",
" <td>66.019178</td>\n",
" </tr>\n",
" <tr>\n",
" <td>1</td>\n",
" <td>269</td>\n",
" <td>106296</td>\n",
" <td>206613</td>\n",
" <td>2170-11-05 11:05:29</td>\n",
" <td>2170-11-08 17:46:57</td>\n",
" <td>3.2788</td>\n",
" <td>2170-11-05</td>\n",
" <td>2170-11-27 18:00:00</td>\n",
" <td>EMERGENCY</td>\n",
" <td>0</td>\n",
" <td>M</td>\n",
" <td>2130-09-30</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0</td>\n",
" <td>40.126027</td>\n",
" </tr>\n",
" <tr>\n",
" <td>2</td>\n",
" <td>270</td>\n",
" <td>188028</td>\n",
" <td>220345</td>\n",
" <td>2128-06-24 15:05:20</td>\n",
" <td>2128-06-27 12:32:29</td>\n",
" <td>2.8939</td>\n",
" <td>2128-06-23</td>\n",
" <td>2128-06-27 12:31:00</td>\n",
" <td>ELECTIVE</td>\n",
" <td>0</td>\n",
" <td>M</td>\n",
" <td>2048-05-26</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0</td>\n",
" <td>80.128767</td>\n",
" </tr>\n",
" <tr>\n",
" <td>3</td>\n",
" <td>271</td>\n",
" <td>173727</td>\n",
" <td>249196</td>\n",
" <td>2120-08-07 23:12:42</td>\n",
" <td>2120-08-10 00:39:04</td>\n",
" <td>2.0600</td>\n",
" <td>2120-08-07</td>\n",
" <td>2120-08-20 16:00:00</td>\n",
" <td>EMERGENCY</td>\n",
" <td>0</td>\n",
" <td>F</td>\n",
" <td>2074-11-30</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0</td>\n",
" <td>45.715068</td>\n",
" </tr>\n",
" <tr>\n",
" <td>4</td>\n",
" <td>272</td>\n",
" <td>164716</td>\n",
" <td>210407</td>\n",
" <td>2186-12-25 21:08:04</td>\n",
" <td>2186-12-27 12:01:13</td>\n",
" <td>1.6202</td>\n",
" <td>2186-12-25</td>\n",
" <td>2187-01-02 14:57:00</td>\n",
" <td>EMERGENCY</td>\n",
" <td>0</td>\n",
" <td>M</td>\n",
" <td>2119-11-21</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0</td>\n",
" <td>67.139726</td>\n",
" </tr>\n",
" <tr>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61527</td>\n",
" <td>94944</td>\n",
" <td>143774</td>\n",
" <td>201233</td>\n",
" <td>2104-04-15 10:18:16</td>\n",
" <td>2104-04-17 14:51:00</td>\n",
" <td>2.1894</td>\n",
" <td>2104-04-11</td>\n",
" <td>2104-04-20 16:16:00</td>\n",
" <td>EMERGENCY</td>\n",
" <td>0</td>\n",
" <td>M</td>\n",
" <td>2027-03-02</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0</td>\n",
" <td>77.161644</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61528</td>\n",
" <td>94950</td>\n",
" <td>123750</td>\n",
" <td>283653</td>\n",
" <td>2155-12-08 05:33:16</td>\n",
" <td>2155-12-10 17:24:58</td>\n",
" <td>2.4942</td>\n",
" <td>2155-12-07</td>\n",
" <td>2155-12-12 10:10:00</td>\n",
" <td>EMERGENCY</td>\n",
" <td>0</td>\n",
" <td>F</td>\n",
" <td>1855-12-07</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0</td>\n",
" <td>300.200000</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61529</td>\n",
" <td>94953</td>\n",
" <td>196881</td>\n",
" <td>241585</td>\n",
" <td>2160-03-03 16:09:11</td>\n",
" <td>2160-03-04 14:22:33</td>\n",
" <td>0.9259</td>\n",
" <td>2160-03-03</td>\n",
" <td>2160-03-04 12:48:00</td>\n",
" <td>ELECTIVE</td>\n",
" <td>0</td>\n",
" <td>F</td>\n",
" <td>2107-01-29</td>\n",
" <td>2162-01-05 00:00:00</td>\n",
" <td>NaN</td>\n",
" <td>1</td>\n",
" <td>53.128767</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61530</td>\n",
" <td>94954</td>\n",
" <td>118475</td>\n",
" <td>202802</td>\n",
" <td>2183-03-25 09:53:10</td>\n",
" <td>2183-03-27 17:55:03</td>\n",
" <td>2.3346</td>\n",
" <td>2183-03-25</td>\n",
" <td>2183-04-01 17:07:00</td>\n",
" <td>ELECTIVE</td>\n",
" <td>0</td>\n",
" <td>F</td>\n",
" <td>2115-05-23</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0</td>\n",
" <td>67.884932</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61531</td>\n",
" <td>94956</td>\n",
" <td>156386</td>\n",
" <td>275346</td>\n",
" <td>2157-05-19 02:54:54</td>\n",
" <td>2157-05-23 14:58:04</td>\n",
" <td>4.5022</td>\n",
" <td>2157-05-19</td>\n",
" <td>2157-05-25 16:40:00</td>\n",
" <td>EMERGENCY</td>\n",
" <td>0</td>\n",
" <td>M</td>\n",
" <td>2123-03-01</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0</td>\n",
" <td>34.241096</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>61532 rows × 16 columns</p>\n",
"</div>"
],
"text/plain": [
" SUBJECT_ID HADM_ID ICUSTAY_ID INTIME \\\n",
"0 268 110404 280836 2198-02-14 23:27:38 \n",
"1 269 106296 206613 2170-11-05 11:05:29 \n",
"2 270 188028 220345 2128-06-24 15:05:20 \n",
"3 271 173727 249196 2120-08-07 23:12:42 \n",
"4 272 164716 210407 2186-12-25 21:08:04 \n",
"... ... ... ... ... \n",
"61527 94944 143774 201233 2104-04-15 10:18:16 \n",
"61528 94950 123750 283653 2155-12-08 05:33:16 \n",
"61529 94953 196881 241585 2160-03-03 16:09:11 \n",
"61530 94954 118475 202802 2183-03-25 09:53:10 \n",
"61531 94956 156386 275346 2157-05-19 02:54:54 \n",
"\n",
" OUTTIME LOS ADMITTIME DISCHTIME \\\n",
"0 2198-02-18 05:26:11 3.2490 2198-02-11 2198-02-18 03:55:00 \n",
"1 2170-11-08 17:46:57 3.2788 2170-11-05 2170-11-27 18:00:00 \n",
"2 2128-06-27 12:32:29 2.8939 2128-06-23 2128-06-27 12:31:00 \n",
"3 2120-08-10 00:39:04 2.0600 2120-08-07 2120-08-20 16:00:00 \n",
"4 2186-12-27 12:01:13 1.6202 2186-12-25 2187-01-02 14:57:00 \n",
"... ... ... ... ... \n",
"61527 2104-04-17 14:51:00 2.1894 2104-04-11 2104-04-20 16:16:00 \n",
"61528 2155-12-10 17:24:58 2.4942 2155-12-07 2155-12-12 10:10:00 \n",
"61529 2160-03-04 14:22:33 0.9259 2160-03-03 2160-03-04 12:48:00 \n",
"61530 2183-03-27 17:55:03 2.3346 2183-03-25 2183-04-01 17:07:00 \n",
"61531 2157-05-23 14:58:04 4.5022 2157-05-19 2157-05-25 16:40:00 \n",
"\n",
" ADMISSION_TYPE HOSPITAL_EXPIRE_FLAG GENDER DOB \\\n",
"0 EMERGENCY 1 F 2132-02-21 \n",
"1 EMERGENCY 0 M 2130-09-30 \n",
"2 ELECTIVE 0 M 2048-05-26 \n",
"3 EMERGENCY 0 F 2074-11-30 \n",
"4 EMERGENCY 0 M 2119-11-21 \n",
"... ... ... ... ... \n",
"61527 EMERGENCY 0 M 2027-03-02 \n",
"61528 EMERGENCY 0 F 1855-12-07 \n",
"61529 ELECTIVE 0 F 2107-01-29 \n",
"61530 ELECTIVE 0 F 2115-05-23 \n",
"61531 EMERGENCY 0 M 2123-03-01 \n",
"\n",
" DOD DOD_HOSP EXPIRE_FLAG AGE \n",
"0 2198-02-18 00:00:00 2198-02-18 00:00:00 1 66.019178 \n",
"1 NaN NaN 0 40.126027 \n",
"2 NaN NaN 0 80.128767 \n",
"3 NaN NaN 0 45.715068 \n",
"4 NaN NaN 0 67.139726 \n",
"... ... ... ... ... \n",
"61527 NaN NaN 0 77.161644 \n",
"61528 NaN NaN 0 300.200000 \n",
"61529 2162-01-05 00:00:00 NaN 1 53.128767 \n",
"61530 NaN NaN 0 67.884932 \n",
"61531 NaN NaN 0 34.241096 \n",
"\n",
"[61532 rows x 16 columns]"
]
},
"execution_count": 70,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"\n",
"icu=pd.read_csv('icu.csv', header=0, index_col=0)\n",
"icu"
]
},
{
"cell_type": "code",
"execution_count": 71,
"metadata": {},
"outputs": [],
"source": [
"icu=icu.drop(['DOD', 'DOD_HOSP', 'DOB', 'ADMITTIME', 'DISCHTIME', 'INTIME', 'OUTTIME'], axis=1)"
]
},
{
"cell_type": "code",
"execution_count": 75,
"metadata": {},
"outputs": [],
"source": [
"icu.loc[icu['ADMISSION_TYPE'] == 'URGENT', 'uo'] = 8\n",
"icu.loc[icu['ADMISSION_TYPE'] == 'UEMERGENCY', 'uo'] = 8\n",
"icu.loc[icu['ADMISSION_TYPE'] == 'ELECTIVE', 'uo'] = 0\n",
"icu.loc[icu['ADMISSION_TYPE'] == 'NEWBORN', 'uo'] = 0"
]
},
{
"cell_type": "code",
"execution_count": 77,
"metadata": {},
"outputs": [],
"source": [
"icu.loc[icu['AGE'] <40, 'AGE'] = 0\n",
"icu.loc[icu['AGE'].between(40, 59), 'AGE'] = 7\n",
"icu.loc[icu['AGE'].between(60, 69), 'AGE'] = 12\n",
"icu.loc[icu['AGE'].between(70, 74), 'AGE'] = 15\n",
"icu.loc[icu['AGE'].between(75, 79), 'AGE'] = 16\n",
"icu.loc[icu['AGE']>79, 'AGE'] = 18"
]
},
{
"cell_type": "code",
"execution_count": 79,
"metadata": {},
"outputs": [],
"source": [
"icu = icu.rename(columns = {'LOS':'los','ADMISSION_TYPE':'admission', \n",
" 'HOSPITAL_EXPIRE_FLAG': 'hdeath', 'EXPIRE_FLAG':'death'})\n"
]
},
{
"cell_type": "code",
"execution_count": 66,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>HADM_ID</th>\n",
" <th>UD_VALUE</th>\n",
" </tr>\n",
" <tr>\n",
" <th>SUBJECT_ID</th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>109</td>\n",
" <td>172335</td>\n",
" <td>9</td>\n",
" </tr>\n",
" <tr>\n",
" <td>109</td>\n",
" <td>173633</td>\n",
" <td>9</td>\n",
" </tr>\n",
" <tr>\n",
" <td>109</td>\n",
" <td>131345</td>\n",
" <td>9</td>\n",
" </tr>\n",
" <tr>\n",
" <td>109</td>\n",
" <td>131376</td>\n",
" <td>9</td>\n",
" </tr>\n",
" <tr>\n",
" <td>109</td>\n",
" <td>135923</td>\n",
" <td>9</td>\n",
" </tr>\n",
" <tr>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>86684</td>\n",
" <td>158798</td>\n",
" <td>10</td>\n",
" </tr>\n",
" <tr>\n",
" <td>92644</td>\n",
" <td>132124</td>\n",
" <td>10</td>\n",
" </tr>\n",
" <tr>\n",
" <td>91832</td>\n",
" <td>173664</td>\n",
" <td>10</td>\n",
" </tr>\n",
" <tr>\n",
" <td>95280</td>\n",
" <td>155715</td>\n",
" <td>9</td>\n",
" </tr>\n",
" <tr>\n",
" <td>98701</td>\n",
" <td>124568</td>\n",
" <td>9</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>6675 rows × 2 columns</p>\n",
"</div>"
],
"text/plain": [
" HADM_ID UD_VALUE\n",
"SUBJECT_ID \n",
"109 172335 9\n",
"109 173633 9\n",
"109 131345 9\n",
"109 131376 9\n",
"109 135923 9\n",
"... ... ...\n",
"86684 158798 10\n",
"92644 132124 10\n",
"91832 173664 10\n",
"95280 155715 9\n",
"98701 124568 9\n",
"\n",
"[6675 rows x 2 columns]"
]
},
"execution_count": 66,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"fields = ['SUBJECT_ID', 'HADM_ID','UD_VALUE']\n",
"ud=pd.read_csv('ud_value.csv', usecols=fields, header=0, index_col=0)\n",
"ud"
]
},
{
"cell_type": "code",
"execution_count": 67,
"metadata": {},
"outputs": [],
"source": [
"def adm_merge(table1, table2):\n",
" return table1.merge(table2, how='left', left_on=['SUBJECT_ID','HADM_ID'], right_on=['SUBJECT_ID','HADM_ID'])"
]
},
{
"cell_type": "code",
"execution_count": 68,
"metadata": {},
"outputs": [],
"source": [
"ud= ud.rename(columns = {'UD_VALUE':'ud'})"
]
},
{
"cell_type": "code",
"execution_count": 81,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>SUBJECT_ID</th>\n",
" <th>HADM_ID</th>\n",
" <th>ICUSTAY_ID</th>\n",
" <th>los</th>\n",
" <th>admission</th>\n",
" <th>hdeath</th>\n",
" <th>GENDER</th>\n",
" <th>death</th>\n",
" <th>AGE</th>\n",
" <th>uo</th>\n",
" <th>ud</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>0</td>\n",
" <td>268</td>\n",
" <td>110404</td>\n",
" <td>280836</td>\n",
" <td>3.2490</td>\n",
" <td>EMERGENCY</td>\n",
" <td>1</td>\n",
" <td>F</td>\n",
" <td>1</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>1</td>\n",
" <td>269</td>\n",
" <td>106296</td>\n",
" <td>206613</td>\n",
" <td>3.2788</td>\n",
" <td>EMERGENCY</td>\n",
" <td>0</td>\n",
" <td>M</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" <td>17.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>2</td>\n",
" <td>270</td>\n",
" <td>188028</td>\n",
" <td>220345</td>\n",
" <td>2.8939</td>\n",
" <td>ELECTIVE</td>\n",
" <td>0</td>\n",
" <td>M</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>3</td>\n",
" <td>271</td>\n",
" <td>173727</td>\n",
" <td>249196</td>\n",
" <td>2.0600</td>\n",
" <td>EMERGENCY</td>\n",
" <td>0</td>\n",
" <td>F</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>4</td>\n",
" <td>272</td>\n",
" <td>164716</td>\n",
" <td>210407</td>\n",
" <td>1.6202</td>\n",
" <td>EMERGENCY</td>\n",
" <td>0</td>\n",
" <td>M</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61986</td>\n",
" <td>94944</td>\n",
" <td>143774</td>\n",
" <td>201233</td>\n",
" <td>2.1894</td>\n",
" <td>EMERGENCY</td>\n",
" <td>0</td>\n",
" <td>M</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61987</td>\n",
" <td>94950</td>\n",
" <td>123750</td>\n",
" <td>283653</td>\n",
" <td>2.4942</td>\n",
" <td>EMERGENCY</td>\n",
" <td>0</td>\n",
" <td>F</td>\n",
" <td>0</td>\n",
" <td>18.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61988</td>\n",
" <td>94953</td>\n",
" <td>196881</td>\n",
" <td>241585</td>\n",
" <td>0.9259</td>\n",
" <td>ELECTIVE</td>\n",
" <td>0</td>\n",
" <td>F</td>\n",
" <td>1</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61989</td>\n",
" <td>94954</td>\n",
" <td>118475</td>\n",
" <td>202802</td>\n",
" <td>2.3346</td>\n",
" <td>ELECTIVE</td>\n",
" <td>0</td>\n",
" <td>F</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61990</td>\n",
" <td>94956</td>\n",
" <td>156386</td>\n",
" <td>275346</td>\n",
" <td>4.5022</td>\n",
" <td>EMERGENCY</td>\n",
" <td>0</td>\n",
" <td>M</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>61991 rows × 11 columns</p>\n",
"</div>"
],
"text/plain": [
" SUBJECT_ID HADM_ID ICUSTAY_ID los admission hdeath GENDER \\\n",
"0 268 110404 280836 3.2490 EMERGENCY 1 F \n",
"1 269 106296 206613 3.2788 EMERGENCY 0 M \n",
"2 270 188028 220345 2.8939 ELECTIVE 0 M \n",
"3 271 173727 249196 2.0600 EMERGENCY 0 F \n",
"4 272 164716 210407 1.6202 EMERGENCY 0 M \n",
"... ... ... ... ... ... ... ... \n",
"61986 94944 143774 201233 2.1894 EMERGENCY 0 M \n",
"61987 94950 123750 283653 2.4942 EMERGENCY 0 F \n",
"61988 94953 196881 241585 0.9259 ELECTIVE 0 F \n",
"61989 94954 118475 202802 2.3346 ELECTIVE 0 F \n",
"61990 94956 156386 275346 4.5022 EMERGENCY 0 M \n",
"\n",
" death AGE uo ud \n",
"0 1 0.0 NaN NaN \n",
"1 0 0.0 NaN 17.0 \n",
"2 0 0.0 0.0 NaN \n",
"3 0 0.0 NaN NaN \n",
"4 0 0.0 NaN NaN \n",
"... ... ... ... ... \n",
"61986 0 0.0 NaN NaN \n",
"61987 0 18.0 NaN NaN \n",
"61988 1 0.0 0.0 NaN \n",
"61989 0 0.0 0.0 NaN \n",
"61990 0 0.0 NaN NaN \n",
"\n",
"[61991 rows x 11 columns]"
]
},
"execution_count": 81,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"icu=adm_merge(icu, ud)\n",
"icu"
]
},
{
"cell_type": "code",
"execution_count": 82,
"metadata": {},
"outputs": [],
"source": [
"icu.loc[icu['admission'] == 'URGENT', 'admission'] = 8\n",
"icu.loc[icu['admission'] == 'EMERGENCY', 'admission'] = 8\n",
"icu.loc[icu['admission'] == 'ELECTIVE', 'admission'] = 0\n",
"icu.loc[icu['admission'] == 'NEWBORN', 'admission'] = 0"
]
},
{
"cell_type": "code",
"execution_count": 84,
"metadata": {},
"outputs": [],
"source": [
"icu=icu.drop(['uo', 'GENDER'], axis=1)"
]
},
{
"cell_type": "code",
"execution_count": 85,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>SUBJECT_ID</th>\n",
" <th>HADM_ID</th>\n",
" <th>ICUSTAY_ID</th>\n",
" <th>los</th>\n",
" <th>admission</th>\n",
" <th>hdeath</th>\n",
" <th>death</th>\n",
" <th>AGE</th>\n",
" <th>ud</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>0</td>\n",
" <td>268</td>\n",
" <td>110404</td>\n",
" <td>280836</td>\n",
" <td>3.2490</td>\n",
" <td>8</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>1</td>\n",
" <td>269</td>\n",
" <td>106296</td>\n",
" <td>206613</td>\n",
" <td>3.2788</td>\n",
" <td>8</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>17.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>2</td>\n",
" <td>270</td>\n",
" <td>188028</td>\n",
" <td>220345</td>\n",
" <td>2.8939</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>3</td>\n",
" <td>271</td>\n",
" <td>173727</td>\n",
" <td>249196</td>\n",
" <td>2.0600</td>\n",
" <td>8</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>4</td>\n",
" <td>272</td>\n",
" <td>164716</td>\n",
" <td>210407</td>\n",
" <td>1.6202</td>\n",
" <td>8</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61986</td>\n",
" <td>94944</td>\n",
" <td>143774</td>\n",
" <td>201233</td>\n",
" <td>2.1894</td>\n",
" <td>8</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61987</td>\n",
" <td>94950</td>\n",
" <td>123750</td>\n",
" <td>283653</td>\n",
" <td>2.4942</td>\n",
" <td>8</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>18.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61988</td>\n",
" <td>94953</td>\n",
" <td>196881</td>\n",
" <td>241585</td>\n",
" <td>0.9259</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61989</td>\n",
" <td>94954</td>\n",
" <td>118475</td>\n",
" <td>202802</td>\n",
" <td>2.3346</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61990</td>\n",
" <td>94956</td>\n",
" <td>156386</td>\n",
" <td>275346</td>\n",
" <td>4.5022</td>\n",
" <td>8</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>61991 rows × 9 columns</p>\n",
"</div>"
],
"text/plain": [
" SUBJECT_ID HADM_ID ICUSTAY_ID los admission hdeath death \\\n",
"0 268 110404 280836 3.2490 8 1 1 \n",
"1 269 106296 206613 3.2788 8 0 0 \n",
"2 270 188028 220345 2.8939 0 0 0 \n",
"3 271 173727 249196 2.0600 8 0 0 \n",
"4 272 164716 210407 1.6202 8 0 0 \n",
"... ... ... ... ... ... ... ... \n",
"61986 94944 143774 201233 2.1894 8 0 0 \n",
"61987 94950 123750 283653 2.4942 8 0 0 \n",
"61988 94953 196881 241585 0.9259 0 0 1 \n",
"61989 94954 118475 202802 2.3346 0 0 0 \n",
"61990 94956 156386 275346 4.5022 8 0 0 \n",
"\n",
" AGE ud \n",
"0 0.0 NaN \n",
"1 0.0 17.0 \n",
"2 0.0 NaN \n",
"3 0.0 NaN \n",
"4 0.0 NaN \n",
"... ... ... \n",
"61986 0.0 NaN \n",
"61987 18.0 NaN \n",
"61988 0.0 NaN \n",
"61989 0.0 NaN \n",
"61990 0.0 NaN \n",
"\n",
"[61991 rows x 9 columns]"
]
},
"execution_count": 85,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"icu"
]
},
{
"cell_type": "code",
"execution_count": 125,
"metadata": {},
"outputs": [],
"source": [
"def icu_merge(table1, table2):\n",
" return table1.merge(table2, how='left', left_on=['ICUSTAY_ID'], right_on=['ICUSTAY_ID'])"
]
},
{
"cell_type": "code",
"execution_count": 89,
"metadata": {},
"outputs": [],
"source": [
"saps=icu_merge(icu, saps)"
]
},
{
"cell_type": "code",
"execution_count": 95,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"Int64Index: 61117 entries, 0 to 61116\n",
"Data columns (total 21 columns):\n",
"SUBJECT_ID 61117 non-null int64\n",
"HADM_ID 61117 non-null int64\n",
"ICUSTAY_ID 61117 non-null int64\n",
"los 61117 non-null float64\n",
"admission 61117 non-null int64\n",
"hdeath 61117 non-null int64\n",
"death 61117 non-null int64\n",
"age 61117 non-null float64\n",
"ud 61117 non-null float64\n",
"bun 52952 non-null float64\n",
"Bicarbonate 26060 non-null float64\n",
"ventilation 61117 non-null float64\n",
"Potassium 39731 non-null float64\n",
"Sodium 54983 non-null float64\n",
"Temp 52825 non-null float64\n",
"Bilirubin 24422 non-null float64\n",
"WBC 57479 non-null float64\n",
"hr 48225 non-null float64\n",
"gcs 52809 non-null float64\n",
"bp 60732 non-null float64\n",
"uo 189 non-null float64\n",
"dtypes: float64(15), int64(6)\n",
"memory usage: 10.3 MB\n"
]
}
],
"source": [
"saps.info()"
]
},
{
"cell_type": "code",
"execution_count": 91,
"metadata": {},
"outputs": [],
"source": [
"saps = saps.rename(columns = {'AGE':'age','BUN':'bun', \n",
" 'SystolicBP': 'bp', 'GCS':'gcs', 'HR':'hr'})"
]
},
{
"cell_type": "code",
"execution_count": 93,
"metadata": {},
"outputs": [],
"source": [
"saps['ud'] = saps['ud'].fillna(0)"
]
},
{
"cell_type": "code",
"execution_count": 94,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>SUBJECT_ID</th>\n",
" <th>HADM_ID</th>\n",
" <th>ICUSTAY_ID</th>\n",
" <th>los</th>\n",
" <th>admission</th>\n",
" <th>hdeath</th>\n",
" <th>death</th>\n",
" <th>age</th>\n",
" <th>ud</th>\n",
" <th>bun</th>\n",
" <th>...</th>\n",
" <th>ventilation</th>\n",
" <th>Potassium</th>\n",
" <th>Sodium</th>\n",
" <th>Temp</th>\n",
" <th>Bilirubin</th>\n",
" <th>WBC</th>\n",
" <th>hr</th>\n",
" <th>gcs</th>\n",
" <th>bp</th>\n",
" <th>uo</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>0</td>\n",
" <td>268</td>\n",
" <td>110404</td>\n",
" <td>280836</td>\n",
" <td>3.2490</td>\n",
" <td>8</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>6.0</td>\n",
" <td>...</td>\n",
" <td>6.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>11.0</td>\n",
" <td>26.0</td>\n",
" <td>13.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>1</td>\n",
" <td>269</td>\n",
" <td>106296</td>\n",
" <td>206613</td>\n",
" <td>3.2788</td>\n",
" <td>8</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>17.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>5.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>2</td>\n",
" <td>270</td>\n",
" <td>188028</td>\n",
" <td>220345</td>\n",
" <td>2.8939</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>11.0</td>\n",
" <td>0.0</td>\n",
" <td>13.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>3</td>\n",
" <td>271</td>\n",
" <td>173727</td>\n",
" <td>249196</td>\n",
" <td>2.0600</td>\n",
" <td>8</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>6.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>4</td>\n",
" <td>272</td>\n",
" <td>164716</td>\n",
" <td>210407</td>\n",
" <td>1.6202</td>\n",
" <td>8</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>5.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61112</td>\n",
" <td>94944</td>\n",
" <td>143774</td>\n",
" <td>201233</td>\n",
" <td>2.1894</td>\n",
" <td>8</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>6.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>2.0</td>\n",
" <td>26.0</td>\n",
" <td>13.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61113</td>\n",
" <td>94950</td>\n",
" <td>123750</td>\n",
" <td>283653</td>\n",
" <td>2.4942</td>\n",
" <td>8</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>18.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>2.0</td>\n",
" <td>5.0</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61114</td>\n",
" <td>94953</td>\n",
" <td>196881</td>\n",
" <td>241585</td>\n",
" <td>0.9259</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>6.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" <td>7.0</td>\n",
" <td>5.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61115</td>\n",
" <td>94954</td>\n",
" <td>118475</td>\n",
" <td>202802</td>\n",
" <td>2.3346</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>6.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" <td>26.0</td>\n",
" <td>13.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61116</td>\n",
" <td>94956</td>\n",
" <td>156386</td>\n",
" <td>275346</td>\n",
" <td>4.5022</td>\n",
" <td>8</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>11.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>NaN</td>\n",
" <td>0.0</td>\n",
" <td>NaN</td>\n",
" <td>26.0</td>\n",
" <td>5.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>61117 rows × 21 columns</p>\n",
"</div>"
],
"text/plain": [
" SUBJECT_ID HADM_ID ICUSTAY_ID los admission hdeath death \\\n",
"0 268 110404 280836 3.2490 8 1 1 \n",
"1 269 106296 206613 3.2788 8 0 0 \n",
"2 270 188028 220345 2.8939 0 0 0 \n",
"3 271 173727 249196 2.0600 8 0 0 \n",
"4 272 164716 210407 1.6202 8 0 0 \n",
"... ... ... ... ... ... ... ... \n",
"61112 94944 143774 201233 2.1894 8 0 0 \n",
"61113 94950 123750 283653 2.4942 8 0 0 \n",
"61114 94953 196881 241585 0.9259 0 0 1 \n",
"61115 94954 118475 202802 2.3346 0 0 0 \n",
"61116 94956 156386 275346 4.5022 8 0 0 \n",
"\n",
" age ud bun ... ventilation Potassium Sodium Temp Bilirubin \\\n",
"0 0.0 0.0 6.0 ... 6.0 0.0 0.0 0.0 NaN \n",
"1 0.0 17.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 \n",
"2 0.0 0.0 0.0 ... 0.0 0.0 0.0 3.0 NaN \n",
"3 0.0 0.0 0.0 ... 6.0 0.0 0.0 3.0 0.0 \n",
"4 0.0 0.0 0.0 ... 0.0 0.0 0.0 3.0 NaN \n",
"... ... ... ... ... ... ... ... ... ... \n",
"61112 0.0 0.0 0.0 ... 6.0 0.0 0.0 3.0 NaN \n",
"61113 18.0 0.0 0.0 ... 0.0 NaN 0.0 3.0 NaN \n",
"61114 0.0 0.0 0.0 ... 6.0 NaN 0.0 3.0 NaN \n",
"61115 0.0 0.0 0.0 ... 6.0 0.0 0.0 3.0 NaN \n",
"61116 0.0 0.0 0.0 ... 11.0 0.0 0.0 3.0 NaN \n",
"\n",
" WBC hr gcs bp uo \n",
"0 0.0 11.0 26.0 13.0 NaN \n",
"1 0.0 0.0 0.0 5.0 NaN \n",
"2 0.0 11.0 0.0 13.0 NaN \n",
"3 0.0 0.0 0.0 0.0 NaN \n",
"4 0.0 0.0 0.0 5.0 NaN \n",
"... ... ... ... ... .. \n",
"61112 0.0 2.0 26.0 13.0 NaN \n",
"61113 0.0 2.0 5.0 0.0 NaN \n",
"61114 0.0 NaN 7.0 5.0 NaN \n",
"61115 0.0 NaN 26.0 13.0 NaN \n",
"61116 0.0 NaN 26.0 5.0 NaN \n",
"\n",
"[61117 rows x 21 columns]"
]
},
"execution_count": 94,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"saps"
]
},
{
"cell_type": "code",
"execution_count": 96,
"metadata": {},
"outputs": [],
"source": [
"saps.to_csv('saps2_fnan.csv')"
]
},
{
"cell_type": "code",
"execution_count": 98,
"metadata": {},
"outputs": [],
"source": [
"saps=saps.fillna(0)"
]
},
{
"cell_type": "code",
"execution_count": 100,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"Int64Index: 61117 entries, 0 to 61116\n",
"Data columns (total 21 columns):\n",
"SUBJECT_ID 61117 non-null int64\n",
"HADM_ID 61117 non-null int64\n",
"ICUSTAY_ID 61117 non-null int64\n",
"los 61117 non-null float64\n",
"admission 61117 non-null int64\n",
"hdeath 61117 non-null int64\n",
"death 61117 non-null int64\n",
"age 61117 non-null float64\n",
"ud 61117 non-null float64\n",
"bun 61117 non-null float64\n",
"Bicarbonate 61117 non-null float64\n",
"ventilation 61117 non-null float64\n",
"Potassium 61117 non-null float64\n",
"Sodium 61117 non-null float64\n",
"Temp 61117 non-null float64\n",
"Bilirubin 61117 non-null float64\n",
"WBC 61117 non-null float64\n",
"hr 61117 non-null float64\n",
"gcs 61117 non-null float64\n",
"bp 61117 non-null float64\n",
"uo 61117 non-null float64\n",
"dtypes: float64(15), int64(6)\n",
"memory usage: 10.3 MB\n"
]
}
],
"source": [
"saps.info()"
]
},
{
"cell_type": "code",
"execution_count": 102,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>SUBJECT_ID</th>\n",
" <th>HADM_ID</th>\n",
" <th>ICUSTAY_ID</th>\n",
" <th>los</th>\n",
" <th>admission</th>\n",
" <th>hdeath</th>\n",
" <th>death</th>\n",
" <th>age</th>\n",
" <th>ud</th>\n",
" <th>bun</th>\n",
" <th>...</th>\n",
" <th>ventilation</th>\n",
" <th>Potassium</th>\n",
" <th>Sodium</th>\n",
" <th>Temp</th>\n",
" <th>Bilirubin</th>\n",
" <th>WBC</th>\n",
" <th>hr</th>\n",
" <th>gcs</th>\n",
" <th>bp</th>\n",
" <th>uo</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>0</td>\n",
" <td>268</td>\n",
" <td>110404</td>\n",
" <td>280836</td>\n",
" <td>3.2490</td>\n",
" <td>8</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>6.0</td>\n",
" <td>...</td>\n",
" <td>6.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>11.0</td>\n",
" <td>26.0</td>\n",
" <td>13.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>1</td>\n",
" <td>269</td>\n",
" <td>106296</td>\n",
" <td>206613</td>\n",
" <td>3.2788</td>\n",
" <td>8</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>17.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>5.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>2</td>\n",
" <td>270</td>\n",
" <td>188028</td>\n",
" <td>220345</td>\n",
" <td>2.8939</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>11.0</td>\n",
" <td>0.0</td>\n",
" <td>13.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>3</td>\n",
" <td>271</td>\n",
" <td>173727</td>\n",
" <td>249196</td>\n",
" <td>2.0600</td>\n",
" <td>8</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>6.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>4</td>\n",
" <td>272</td>\n",
" <td>164716</td>\n",
" <td>210407</td>\n",
" <td>1.6202</td>\n",
" <td>8</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>5.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61112</td>\n",
" <td>94944</td>\n",
" <td>143774</td>\n",
" <td>201233</td>\n",
" <td>2.1894</td>\n",
" <td>8</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>6.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>2.0</td>\n",
" <td>26.0</td>\n",
" <td>13.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61113</td>\n",
" <td>94950</td>\n",
" <td>123750</td>\n",
" <td>283653</td>\n",
" <td>2.4942</td>\n",
" <td>8</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>18.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>2.0</td>\n",
" <td>5.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61114</td>\n",
" <td>94953</td>\n",
" <td>196881</td>\n",
" <td>241585</td>\n",
" <td>0.9259</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>6.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>7.0</td>\n",
" <td>5.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61115</td>\n",
" <td>94954</td>\n",
" <td>118475</td>\n",
" <td>202802</td>\n",
" <td>2.3346</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>6.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>26.0</td>\n",
" <td>13.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61116</td>\n",
" <td>94956</td>\n",
" <td>156386</td>\n",
" <td>275346</td>\n",
" <td>4.5022</td>\n",
" <td>8</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>11.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>26.0</td>\n",
" <td>5.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>61117 rows × 21 columns</p>\n",
"</div>"
],
"text/plain": [
" SUBJECT_ID HADM_ID ICUSTAY_ID los admission hdeath death \\\n",
"0 268 110404 280836 3.2490 8 1 1 \n",
"1 269 106296 206613 3.2788 8 0 0 \n",
"2 270 188028 220345 2.8939 0 0 0 \n",
"3 271 173727 249196 2.0600 8 0 0 \n",
"4 272 164716 210407 1.6202 8 0 0 \n",
"... ... ... ... ... ... ... ... \n",
"61112 94944 143774 201233 2.1894 8 0 0 \n",
"61113 94950 123750 283653 2.4942 8 0 0 \n",
"61114 94953 196881 241585 0.9259 0 0 1 \n",
"61115 94954 118475 202802 2.3346 0 0 0 \n",
"61116 94956 156386 275346 4.5022 8 0 0 \n",
"\n",
" age ud bun ... ventilation Potassium Sodium Temp Bilirubin \\\n",
"0 0.0 0.0 6.0 ... 6.0 0.0 0.0 0.0 0.0 \n",
"1 0.0 17.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 \n",
"2 0.0 0.0 0.0 ... 0.0 0.0 0.0 3.0 0.0 \n",
"3 0.0 0.0 0.0 ... 6.0 0.0 0.0 3.0 0.0 \n",
"4 0.0 0.0 0.0 ... 0.0 0.0 0.0 3.0 0.0 \n",
"... ... ... ... ... ... ... ... ... ... \n",
"61112 0.0 0.0 0.0 ... 6.0 0.0 0.0 3.0 0.0 \n",
"61113 18.0 0.0 0.0 ... 0.0 0.0 0.0 3.0 0.0 \n",
"61114 0.0 0.0 0.0 ... 6.0 0.0 0.0 3.0 0.0 \n",
"61115 0.0 0.0 0.0 ... 6.0 0.0 0.0 3.0 0.0 \n",
"61116 0.0 0.0 0.0 ... 11.0 0.0 0.0 3.0 0.0 \n",
"\n",
" WBC hr gcs bp uo \n",
"0 0.0 11.0 26.0 13.0 0.0 \n",
"1 0.0 0.0 0.0 5.0 0.0 \n",
"2 0.0 11.0 0.0 13.0 0.0 \n",
"3 0.0 0.0 0.0 0.0 0.0 \n",
"4 0.0 0.0 0.0 5.0 0.0 \n",
"... ... ... ... ... ... \n",
"61112 0.0 2.0 26.0 13.0 0.0 \n",
"61113 0.0 2.0 5.0 0.0 0.0 \n",
"61114 0.0 0.0 7.0 5.0 0.0 \n",
"61115 0.0 0.0 26.0 13.0 0.0 \n",
"61116 0.0 0.0 26.0 5.0 0.0 \n",
"\n",
"[61117 rows x 21 columns]"
]
},
"execution_count": 102,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"saps"
]
},
{
"cell_type": "code",
"execution_count": 117,
"metadata": {},
"outputs": [],
"source": [
"saps=saps.drop(['age'], axis=1)"
]
},
{
"cell_type": "code",
"execution_count": 118,
"metadata": {},
"outputs": [],
"source": [
"saps=icu_merge(saps, age)"
]
},
{
"cell_type": "code",
"execution_count": 119,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>SUBJECT_ID</th>\n",
" <th>HADM_ID</th>\n",
" <th>ICUSTAY_ID</th>\n",
" <th>los</th>\n",
" <th>admission</th>\n",
" <th>hdeath</th>\n",
" <th>death</th>\n",
" <th>ud</th>\n",
" <th>bun</th>\n",
" <th>Bicarbonate</th>\n",
" <th>...</th>\n",
" <th>Potassium</th>\n",
" <th>Sodium</th>\n",
" <th>Temp</th>\n",
" <th>Bilirubin</th>\n",
" <th>WBC</th>\n",
" <th>hr</th>\n",
" <th>gcs</th>\n",
" <th>bp</th>\n",
" <th>uo</th>\n",
" <th>AGE</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>0</td>\n",
" <td>268</td>\n",
" <td>110404</td>\n",
" <td>280836</td>\n",
" <td>3.2490</td>\n",
" <td>8</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0.0</td>\n",
" <td>6.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>11.0</td>\n",
" <td>26.0</td>\n",
" <td>13.0</td>\n",
" <td>0.0</td>\n",
" <td>12.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>1</td>\n",
" <td>269</td>\n",
" <td>106296</td>\n",
" <td>206613</td>\n",
" <td>3.2788</td>\n",
" <td>8</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>17.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>5.0</td>\n",
" <td>0.0</td>\n",
" <td>7.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>2</td>\n",
" <td>270</td>\n",
" <td>188028</td>\n",
" <td>220345</td>\n",
" <td>2.8939</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>11.0</td>\n",
" <td>0.0</td>\n",
" <td>13.0</td>\n",
" <td>0.0</td>\n",
" <td>18.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>3</td>\n",
" <td>271</td>\n",
" <td>173727</td>\n",
" <td>249196</td>\n",
" <td>2.0600</td>\n",
" <td>8</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>7.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>4</td>\n",
" <td>272</td>\n",
" <td>164716</td>\n",
" <td>210407</td>\n",
" <td>1.6202</td>\n",
" <td>8</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>5.0</td>\n",
" <td>0.0</td>\n",
" <td>12.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61112</td>\n",
" <td>94944</td>\n",
" <td>143774</td>\n",
" <td>201233</td>\n",
" <td>2.1894</td>\n",
" <td>8</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>2.0</td>\n",
" <td>26.0</td>\n",
" <td>13.0</td>\n",
" <td>0.0</td>\n",
" <td>16.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61113</td>\n",
" <td>94950</td>\n",
" <td>123750</td>\n",
" <td>283653</td>\n",
" <td>2.4942</td>\n",
" <td>8</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>2.0</td>\n",
" <td>5.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>18.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61114</td>\n",
" <td>94953</td>\n",
" <td>196881</td>\n",
" <td>241585</td>\n",
" <td>0.9259</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>7.0</td>\n",
" <td>5.0</td>\n",
" <td>0.0</td>\n",
" <td>7.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61115</td>\n",
" <td>94954</td>\n",
" <td>118475</td>\n",
" <td>202802</td>\n",
" <td>2.3346</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>26.0</td>\n",
" <td>13.0</td>\n",
" <td>0.0</td>\n",
" <td>12.0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>61116</td>\n",
" <td>94956</td>\n",
" <td>156386</td>\n",
" <td>275346</td>\n",
" <td>4.5022</td>\n",
" <td>8</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>...</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>3.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>26.0</td>\n",
" <td>5.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>61117 rows × 21 columns</p>\n",
"</div>"
],
"text/plain": [
" SUBJECT_ID HADM_ID ICUSTAY_ID los admission hdeath death \\\n",
"0 268 110404 280836 3.2490 8 1 1 \n",
"1 269 106296 206613 3.2788 8 0 0 \n",
"2 270 188028 220345 2.8939 0 0 0 \n",
"3 271 173727 249196 2.0600 8 0 0 \n",
"4 272 164716 210407 1.6202 8 0 0 \n",
"... ... ... ... ... ... ... ... \n",
"61112 94944 143774 201233 2.1894 8 0 0 \n",
"61113 94950 123750 283653 2.4942 8 0 0 \n",
"61114 94953 196881 241585 0.9259 0 0 1 \n",
"61115 94954 118475 202802 2.3346 0 0 0 \n",
"61116 94956 156386 275346 4.5022 8 0 0 \n",
"\n",
" ud bun Bicarbonate ... Potassium Sodium Temp Bilirubin WBC \\\n",
"0 0.0 6.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 \n",
"1 17.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 \n",
"2 0.0 0.0 0.0 ... 0.0 0.0 3.0 0.0 0.0 \n",
"3 0.0 0.0 0.0 ... 0.0 0.0 3.0 0.0 0.0 \n",
"4 0.0 0.0 0.0 ... 0.0 0.0 3.0 0.0 0.0 \n",
"... ... ... ... ... ... ... ... ... ... \n",
"61112 0.0 0.0 0.0 ... 0.0 0.0 3.0 0.0 0.0 \n",
"61113 0.0 0.0 0.0 ... 0.0 0.0 3.0 0.0 0.0 \n",
"61114 0.0 0.0 0.0 ... 0.0 0.0 3.0 0.0 0.0 \n",
"61115 0.0 0.0 0.0 ... 0.0 0.0 3.0 0.0 0.0 \n",
"61116 0.0 0.0 0.0 ... 0.0 0.0 3.0 0.0 0.0 \n",
"\n",
" hr gcs bp uo AGE \n",
"0 11.0 26.0 13.0 0.0 12.0 \n",
"1 0.0 0.0 5.0 0.0 7.0 \n",
"2 11.0 0.0 13.0 0.0 18.0 \n",
"3 0.0 0.0 0.0 0.0 7.0 \n",
"4 0.0 0.0 5.0 0.0 12.0 \n",
"... ... ... ... ... ... \n",
"61112 2.0 26.0 13.0 0.0 16.0 \n",
"61113 2.0 5.0 0.0 0.0 18.0 \n",
"61114 0.0 7.0 5.0 0.0 7.0 \n",
"61115 0.0 26.0 13.0 0.0 12.0 \n",
"61116 0.0 26.0 5.0 0.0 0.0 \n",
"\n",
"[61117 rows x 21 columns]"
]
},
"execution_count": 119,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"saps"
]
},
{
"cell_type": "code",
"execution_count": 130,
"metadata": {},
"outputs": [],
"source": [
"saps.to_csv('saps2_f.csv')"
]
},
{
"cell_type": "code",
"execution_count": 122,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>UO</th>\n",
" </tr>\n",
" <tr>\n",
" <th>ICUSTAY_ID</th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>200001.0</td>\n",
" <td>0.000</td>\n",
" </tr>\n",
" <tr>\n",
" <td>200003.0</td>\n",
" <td>0.320</td>\n",
" </tr>\n",
" <tr>\n",
" <td>200006.0</td>\n",
" <td>1.160</td>\n",
" </tr>\n",
" <tr>\n",
" <td>200007.0</td>\n",
" <td>0.620</td>\n",
" </tr>\n",
" <tr>\n",
" <td>200009.0</td>\n",
" <td>0.960</td>\n",
" </tr>\n",
" <tr>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>299992.0</td>\n",
" <td>0.000</td>\n",
" </tr>\n",
" <tr>\n",
" <td>299993.0</td>\n",
" <td>0.245</td>\n",
" </tr>\n",
" <tr>\n",
" <td>299995.0</td>\n",
" <td>1.640</td>\n",
" </tr>\n",
" <tr>\n",
" <td>299998.0</td>\n",
" <td>0.730</td>\n",
" </tr>\n",
" <tr>\n",
" <td>299999.0</td>\n",
" <td>0.990</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>53718 rows × 1 columns</p>\n",
"</div>"
],
"text/plain": [
" UO\n",
"ICUSTAY_ID \n",
"200001.0 0.000\n",
"200003.0 0.320\n",
"200006.0 1.160\n",
"200007.0 0.620\n",
"200009.0 0.960\n",
"... ...\n",
"299992.0 0.000\n",
"299993.0 0.245\n",
"299995.0 1.640\n",
"299998.0 0.730\n",
"299999.0 0.990\n",
"\n",
"[53718 rows x 1 columns]"
]
},
"execution_count": 122,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"fields = ['ICUSTAY_ID', 'UO']\n",
"uo=pd.read_csv('uo_value.csv', usecols=fields, header=0, index_col=0)\n",
"uo"
]
},
{
"cell_type": "code",
"execution_count": 123,
"metadata": {},
"outputs": [],
"source": [
"saps=saps.drop(['uo'], axis=1)"
]
},
{
"cell_type": "code",
"execution_count": 127,
"metadata": {},
"outputs": [],
"source": [
"saps=icu_merge(saps, uo)"
]
},
{
"cell_type": "code",
"execution_count": 128,
"metadata": {},
"outputs": [],
"source": [
"saps=saps.fillna(0)"
]
},
{
"cell_type": "code",
"execution_count": 129,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"Int64Index: 61117 entries, 0 to 61116\n",
"Data columns (total 21 columns):\n",
"SUBJECT_ID 61117 non-null int64\n",
"HADM_ID 61117 non-null int64\n",
"ICUSTAY_ID 61117 non-null int64\n",
"los 61117 non-null float64\n",
"admission 61117 non-null int64\n",
"hdeath 61117 non-null int64\n",
"death 61117 non-null int64\n",
"ud 61117 non-null float64\n",
"bun 61117 non-null float64\n",
"Bicarbonate 61117 non-null float64\n",
"ventilation 61117 non-null float64\n",
"Potassium 61117 non-null float64\n",
"Sodium 61117 non-null float64\n",
"Temp 61117 non-null float64\n",
"Bilirubin 61117 non-null float64\n",
"WBC 61117 non-null float64\n",
"hr 61117 non-null float64\n",
"gcs 61117 non-null float64\n",
"bp 61117 non-null float64\n",
"AGE 61117 non-null float64\n",
"UO 61117 non-null float64\n",
"dtypes: float64(15), int64(6)\n",
"memory usage: 10.3 MB\n"
]
}
],
"source": [
"saps.info()"
]
},
{
"cell_type": "code",
"execution_count": 131,
"metadata": {},
"outputs": [],
"source": [
"saps_nan=pd.read_csv('saps2_fnan.csv', header=0, index_col=0)"
]
},
{
"cell_type": "code",
"execution_count": 132,
"metadata": {},
"outputs": [],
"source": [
"saps_nan=icu_merge(saps_nan, uo)"
]
},
{
"cell_type": "code",
"execution_count": 134,
"metadata": {},
"outputs": [],
"source": [
"saps_nan=icu_merge(saps_nan, age)"
]
},
{
"cell_type": "code",
"execution_count": 135,
"metadata": {},
"outputs": [],
"source": [
"saps_nan.to_csv('saps2_nan.csv')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Data Wrangling Summary:\n",
"+ death - is a binary flag which indicates whether the patient died, i.e.whether DOD is null or not. These deaths include both deaths within the hospital ( DOD_HOSP ) and deaths identified by matching the patient to the social security master death index ( DOD_SSN ).\n",
"+ The final 38 columns(variables) were selected and created from a total of 324 columns found in 26 relational tabeles of MIMIC-IIIv1.4 database.\n",
"+ We have 46,234 unique patients in the final clinical deterioration dataset. Around 286 were lost during data cleaning. MIMIC-IIIv1.4 database has a total population of 46,520 patients.\n",
"+ There are 60,517 unique Intensive Care Unit(ICU) stays, some patients were addmitted to ICU more than once. The entire MIMIC-IIIv1.4 datasets has 61,532 ICU stays. Only one ICU stay per hospital amission were included in the study, as a result 1,015 ICU stays were excluded. \n",
"+ There are 800,097 electronic chart events from 4,602 patients. The primary repository of a patients's information is their electronic chart. Each raw in this the final 'Clinical Deterioration' dataset represent a unique entry of patient electronic chart. \n",
"+ The data spans June 2001 - October 2011. In this dataset, the years were moved to future years to comply with HIPAA."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"celltoolbar": "Raw Cell Format",
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
}
},
"nbformat": 4,
"nbformat_minor": 4
}