# Codebase - Predicting Efficacy of Cardiac Resynchronization Therapy Using NLP and ML
## - _Charlotta Lindvall, Josh Haimson, Alex Forsynth, Michael Traub, Austin Freel_

## Description:

The project code can be categorized into 3 groups:
1. Data management, extraction, and validation
2. Transformers for the data pipeline
3. Methods for queueing, building, and executing tests

The files are grouped below by category. Any file not listed is miscellaneous, superfluous, or unimportant.

(1) DATA MANAGEMENT, EXTRACTION AND VALIDATION

anonymizer.py -- Anonymizes MRNs, SSNs, and names in free text during the initial data transformation
extract_data.py -- Miscellaneous data extraction functions
free_text_jsonifyer.py -- Extracts metadata from free-text files
language_processing.py -- Cleans extracted values
loader.py -- Loads patient data from disk
tables.py -- Generates table statistics for the paper
validate.py -- Generates statistics to validate results
generateTurkTasks.py -- Creates a CSV file for localturk so that manual extraction can be done efficiently
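
As an illustration of the kind of scrubbing anonymizer.py is described as doing, here is a minimal sketch. It is not the code in anonymizer.py: the `anonymize` function, both regex patterns, the assumed `MRN` label format, and the placeholder tokens are all assumptions made for this example.

```python
import re

# Assumed SSN layout: three digits, two digits, four digits, separated by hyphens.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# Hypothetical MRN layout: an "MRN" label followed by 6 to 10 digits.
MRN_RE = re.compile(r"\b(MRN[:#\s]*)\d{6,10}\b", re.IGNORECASE)


def anonymize(text, names=()):
    """Replace SSNs, labeled MRNs, and the given names with placeholder tokens."""
    text = SSN_RE.sub("[SSN]", text)
    # Keep the "MRN:" label and replace only the number after it.
    text = MRN_RE.sub(r"\1[MRN]", text)
    for name in names:
        # Whole-word, case-insensitive match for each known name.
        pattern = r"\b" + re.escape(name) + r"\b"
        text = re.sub(pattern, "[NAME]", text, flags=re.IGNORECASE)
    return text
```

For example, `anonymize("Pt John Smith, MRN: 1234567, SSN 123-45-6789.", names=["John", "Smith"])` returns `"Pt [NAME] [NAME], MRN: [MRN], SSN [SSN]."`. Names are passed in explicitly here because a regex alone cannot tell a name from any other word.
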

(2) TRANSFORMERS

baseline_transformer.py -- Contains all structured data transformers
doc2vec_trainer.py -- Creates the doc2vec models used by the doc2vec transformer
doc2vec_transformer.py -- Transforms arbitrary-length text into a fixed-dimensional semantic representation
icd_transformer.py -- Transforms an ICD9 code into a hierarchical numeric representation
value_extractor_transformer.py -- Contains the regex extractors
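
To make "hierarchical numeric representation" concrete, here is one possible encoding of an ICD9 code. It is a sketch only, not the scheme used in icd_transformer.py: the `icd9_to_features` function, the four-number output, and the use of -1 for missing digits are assumptions for this example. The chapter boundaries are the standard ICD-9-CM category ranges for numeric codes; V and E codes are not handled.

```python
from bisect import bisect_left

# Last category number of each ICD-9-CM chapter (001-139, 140-239, ..., 800-999).
CHAPTER_UPPER_BOUNDS = [139, 239, 279, 289, 319, 389, 459, 519, 579,
                        629, 679, 709, 739, 759, 779, 799, 999]


def icd9_to_features(code):
    """Map a numeric ICD9 code such as '428.0' to (chapter, category, 4th digit, 5th digit)."""
    category_str, _, detail = code.strip().partition(".")
    if not category_str.isdigit() or not (detail == "" or detail.isdigit()):
        raise ValueError("this sketch only handles numeric ICD9 codes: %r" % code)
    category = int(category_str)
    if not 1 <= category <= 999:
        raise ValueError("category out of range: %r" % code)
    # Chapters are numbered from 1; find the first chapter whose range contains the category.
    chapter = bisect_left(CHAPTER_UPPER_BOUNDS, category) + 1
    # Digits after the decimal point refine the category; -1 marks a digit that is absent.
    fourth = int(detail[0]) if len(detail) >= 1 else -1
    fifth = int(detail[1]) if len(detail) >= 2 else -1
    return (chapter, category, fourth, fifth)
```

With this encoding, `icd9_to_features("428.0")` gives `(7, 428, 0, -1)` and `icd9_to_features("250.01")` gives `(3, 250, 0, 1)`, so codes in the same chapter or category share their leading values.
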

(3) QUEUING, BUILDING, AND EXECUTING TESTS

decision_model.py -- Contains the hard-coded clinical guideline
experiment_runner.py -- Daemon process that continually runs tests
model_builder.py -- Builds the ML/NLP models to test
model_tester.py -- Used by the experiment runner to run a test on a given model
queue_test.py -- Helper function to queue up tests
run_test.py -- Standalone code to run an individual model and test
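
The queue-and-runner pattern described above can be sketched with the standard library. This is a generic illustration, not how experiment_runner.py or queue_test.py are implemented: the in-memory `queue.Queue`, the `run_queued_tests` function, and the `(name, callable)` job format are all assumptions for this example.

```python
import queue


def run_queued_tests(test_queue, results, stop_event, poll_seconds=0.1):
    """Run (name, callable) jobs from the queue until stop_event is set and the queue is empty."""
    while not (stop_event.is_set() and test_queue.empty()):
        try:
            name, test_fn = test_queue.get(timeout=poll_seconds)
        except queue.Empty:
            # Nothing queued yet; poll again so the runner keeps waiting for work.
            continue
        try:
            results[name] = test_fn()
        except Exception as exc:
            # Record the failure and keep going so one bad test does not stop the runner.
            results[name] = exc
        finally:
            test_queue.task_done()
```

In use, the function would run in a background thread created with `threading.Thread(..., daemon=True)`, with jobs added through `test_queue.put((name, test_fn))` and a `threading.Event` passed as `stop_event` to shut the runner down.
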