Note: This project is not fully open source. Only the portion of the code developed by the owner of this GitHub account is public. To view the original GitLab repo (http://pacegitlab.dhe.duke.edu/dihi/2019_rfa/adult_decompensation.git), please apply for access to the Duke PACE machine at https://pace.ori.duke.edu/.
This project aims to build initial machine learning models that predict adult inpatient decompensation (ICU admission, mortality, RRT events, etc.) in real time. Preliminary work before model building includes data cleaning, data visualization, data quality assurance, and data manipulation. The ultimate goal is to reduce patient deterioration and standardize hospital response protocols.
The directory tree, along with the function of each folder or file, is summarized as follows:
db //code for creating the project database and importing data into it
DataPrep
cohort //code for cohort generation
features //code for pulling and cleaning data elements
outcome //code for querying and labelling outcomes
pull_data //pull useful data from raw db file
adt_transfer.py //create transfer table and output a csv file
adt_transfer.sql //transfer table SQL query
ockham //unit conversion package
Model
v1.0 //version 1.0 (24-hour prediction window)
design_matrix
News //Python package implementing NEWS (National Early Warning Score)
visualization //model visualization
model_utils.py //model utils python package
run_ann.ipynb
run_logistic_regression.py
run_news.py
run_random_forest.ipynb
run_xgboost.py
utils //utils python package (db utils, dataframe utils, etc)
db //project database file(s)
metadata
Modeling
v1.0
design_matrix //design matrix file(s)
Output //model output data
Raw //project raw data subset from datapipeline
Processed
cohort
features
outcome
adult_decomp_adt_transfer.csv
Slides //presentation slides for project milestones
Project
code map_v1.xlsx //outlines the code and associated data files for the "start-to-finish" data curation process
code map_v2.xlsx
code map_supplement.xlsx //outlines supporting code and data files for feature engineering, modeling, etc
code map_supplement_v2.xlsx
literature_review.pdf
Perspectives Piece.docx
gap_analysis //gap analysis output
Figures //data visualization figures
cohort //visualization figures for cohort statistics
features //visualization figures for features quality assurance
Model //visualization figures for model performance
.gitignore
README.md
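The News package listed above implements the National Early Warning Score, which the project uses alongside the machine learning models. As an illustration of how such a score can be computed, here is a minimal sketch, assuming the original 2012 NEWS chart from the Royal College of Physicians. The `Vitals` class, its field names, and the `news_score` function are hypothetical and are not taken from this repository, whose package may use different names, inputs, or a different NEWS version.

```python
from dataclasses import dataclass

INF = float("inf")

# Each table is a list of (upper_bound_inclusive, points) pairs in ascending
# order. Thresholds assume the original 2012 NEWS chart.
RESP_RATE = [(8, 3), (11, 1), (20, 0), (24, 2), (INF, 3)]
SPO2 = [(91, 3), (93, 2), (95, 1), (INF, 0)]
TEMPERATURE = [(35.0, 3), (36.0, 1), (38.0, 0), (39.0, 1), (INF, 2)]
SYSTOLIC_BP = [(90, 3), (100, 2), (110, 1), (219, 0), (INF, 3)]
HEART_RATE = [(40, 3), (50, 1), (90, 0), (110, 1), (130, 2), (INF, 3)]


@dataclass
class Vitals:
    """One set of observations (hypothetical field names)."""
    resp_rate: float     # breaths per minute
    spo2: float          # oxygen saturation, percent
    on_oxygen: bool      # receiving supplemental oxygen
    temperature: float   # degrees Celsius
    systolic_bp: float   # mmHg
    heart_rate: float    # beats per minute
    alert: bool          # True if the patient is "A" on the AVPU scale


def _points(value, table):
    """Return the points of the first band whose upper bound covers value."""
    for upper, points in table:
        if value <= upper:
            return points
    raise ValueError(f"value {value!r} cannot be scored")


def news_score(v: Vitals) -> int:
    """Sum the seven NEWS components into an aggregate score (0 to 20)."""
    return (
        _points(v.resp_rate, RESP_RATE)
        + _points(v.spo2, SPO2)
        + (2 if v.on_oxygen else 0)
        + _points(v.temperature, TEMPERATURE)
        + _points(v.systolic_bp, SYSTOLIC_BP)
        + _points(v.heart_rate, HEART_RATE)
        + (0 if v.alert else 3)
    )
```

Keeping the thresholds in lookup tables rather than nested conditionals makes it easier to compare them against the published chart or to swap in a different version of the score.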
Instructions for setting up the project locally:
git clone http://pacegitlab.dhe.duke.edu/dihi/2019_rfa/adult_decompensation.git
See ./Docs/Project to run the project from start to end.
All the source data comes from the following locations:
Data visualizations for the project include:
Project is: in progress
To-do list:
P:/dihi_qi/data_pipeline/db/data_pipeline.db
needs to be updated by duh_dep_info_v06 (in next iteration)
Copyright 2019 Ziyuan Shen, Duke Institute for Health Innovation (DIHI), Duke University School of Medicine, Durham NC.
All Rights Reserved.
Ziyuan Shen - ziyuan.shen@duke.edu
Mengxuan Cui - mengxuan.cui@duke.edu
This work is funded by the Woo Center for Big Data and Precision Health, in collaboration with DIHI (Duke Institute for Health Innovation). The authors thank Professor Xiling Shen for consistently supporting the project and the DIHI team for guidance and assistance with project specifics (Will Ratliff and Mark Sendak for hospital data resources and modeling support, Michael Gao and Marshall Nichols for technical support).