Table of Contents

  1. Notebook order to get to this point
  2. Adding Data
  3. Age & Sex
  4. Race & Ethnicity
  5. Time from ECG to Echo
  6. Bundle Branch & QRS
  7. Summary Stats for All Subgroups (Tables 1 & 2)
  8. Varying Prevalence (Table 3)
  9. Overall AUROC & AUPRC (Figure 3A/3B)
  10. Logistic Regression on Tabular Data
  11. NYP Lawrence Validation
  12. Model Calibration Curves
  13. Determining Performance per Model at Youden Index

Notebook order to get to this point

  1. Generate an updated test dataset that removes bioprosthetic ECG-echo pairings and creates a 1:1 pairing between ECG and echo. Include only complete echos (all 3 labels present) and ECGs done 0 to 365 days prior to the echo in question. Keep one pair per patient: the most recent period where a pair is valid, with the largest time difference between ECG and echo (between 0 and 365 days) as the tiebreaker.
    /Users/pae2/Box/Heart Failure Analytics/Data/MuseLabelGeneration/ValveNet_JACC_Revisions/Multivalvular_Label_Generation_No_Bioprosthesis_03_01_2022.ipynb

  1. Take that new list of ECG-echo pairs and run it on Dendrite to find the ECGs and preprocess the waveform arrays. Do the same for the tabular data. Then sanity check the result: look at the waveform/tabular output and go back into the medical record to verify that the label, the ECG, and the tabular data are all correct.
    /Users/pae2/Box/Heart Failure Analytics/Data/MuseLabelGeneration/ValveNet_JACC_Revisions/Dendrite_Transfer/ValveNet_JACC_Revision_Dataset_Generation_03_02_2022.ipynb

  1. Run the ValveNet_v2 engine on Dendrite (10.144.220.25). Running the CADnet branch of ValveNet_v2 files:

  1. And now we're ready to run the post-hoc analyses in this notebook
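
The pairing rule from the first step above can be sketched as follows. This is only an illustration of one reading of that rule (most recent qualifying echo per patient, then the ECG with the largest gap), not the code in the label-generation notebook. The record layout, the field names, and the `select_pairs` helper are all hypothetical.

```python
from datetime import date

# Hypothetical toy records; field names are illustrative, not the notebook's.
ecgs = [
    {"patient": "A", "ecg_id": "e1", "date": date(2020, 1, 10)},
    {"patient": "A", "ecg_id": "e2", "date": date(2020, 6, 1)},
    {"patient": "A", "ecg_id": "e3", "date": date(2018, 1, 1)},
    {"patient": "B", "ecg_id": "e4", "date": date(2021, 3, 1)},
]
echos = [
    {"patient": "A", "echo_id": "x1", "date": date(2020, 7, 1),
     "labels": {"AS": 0, "AI": 1, "MR": 0}},
    {"patient": "A", "echo_id": "x2", "date": date(2019, 1, 1),
     "labels": {"AS": 0, "AI": 0, "MR": 0}},
    {"patient": "B", "echo_id": "x3", "date": date(2021, 4, 1),
     "labels": {"AS": 1, "AI": None, "MR": 0}},  # incomplete echo
]


def select_pairs(ecgs, echos, max_gap_days=365):
    """Return {patient: (ecg_id, echo_id, gap_days)}, one pair per patient."""
    best = {}
    for echo in echos:
        # Keep only complete echos (all 3 labels present).
        if any(echo["labels"].get(k) is None for k in ("AS", "AI", "MR")):
            continue
        for ecg in ecgs:
            if ecg["patient"] != echo["patient"]:
                continue
            gap = (echo["date"] - ecg["date"]).days
            # ECG must fall 0 to 365 days before the echo.
            if not 0 <= gap <= max_gap_days:
                continue
            # Prefer the most recent echo; break ties with the largest gap.
            key = (echo["date"], gap)
            current = best.get(echo["patient"])
            if current is None or key > current[0]:
                best[echo["patient"]] = (key, ecg["ecg_id"], echo["echo_id"], gap)
    return {patient: value[1:] for patient, value in best.items()}


pairs = select_pairs(ecgs, echos)
```

In this toy data, patient B drops out because the only echo is missing a label, and patient A keeps the ECG furthest from the most recent echo.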

Adding Data

The new file is missing LBBB and RBBB, so we need to add those

Backfill four_grade columns

Add PatientAge_Years

Old test set counts vs new test set

Append model scores to the data

Generate Model Output Summary Table

Exploring binary label vs. y discrepancies

These seem to be consistent (except for 1 case)

Generate Summary Table for Race / Ethnicity

Append Race/Ethnicity Data

Compute Age/Sex Stats Using Bootstrap
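
A minimal sketch of a percentile bootstrap for a subgroup AUROC, assuming the subgroup's labels and model scores have already been pulled out as plain lists. The `auroc` and `bootstrap_ci` helpers, the resample count, and the toy data are assumptions for illustration; the notebook's own bootstrap settings are not shown here.

```python
import random


def auroc(y_true, y_score):
    """AUROC via the Mann-Whitney statistic; tied scores count as half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(y_true, y_score, metric=auroc, n_boot=1000, alpha=0.05, seed=0):
    """Return (point estimate, lower, upper) using a percentile bootstrap."""
    point = metric(y_true, y_score)  # also rejects single-class input
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample has one class only; metric undefined
        stats.append(metric(ys, [y_score[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return point, lower, upper


# Hypothetical subgroup data (e.g. one age band), not real results.
y_true = [0, 0, 1, 1, 0, 1, 0, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.5, 0.9, 0.3, 0.6]
point, lower, upper = bootstrap_ci(y_true, y_score, n_boot=200)
```

The same function can be called once per subgroup (age band, sex, race, ethnicity, QRS duration) by filtering the rows before passing them in.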

Plot Results by Age Group

Sex Summary Stats

Compute Race Stats Using Bootstrap

Plot Results By Race

Compute Ethnicity Stats Using Bootstrap

Plot Results By Ethnicity

Compute Time Differential ECG/Echo Stats Using Bootstrap

Compute Bundle Branch and QRS Stats Using Bootstrap

Plot results by QRS Duration

Plot results by LBBB/RBBB

Generate Summary Stats for Table 1 and 2

Compute Overall Stats Using Bootstrap

Plot Results For All Data

Look at age breakdown on per study level

Look at age breakdown on per patient level

Now we run the vary_prevalence scripts from ValveNet_v2 back on the Dendrite server

scp -r /Users/pae2/Box/Heart\ Failure\ Analytics/Data/MuseLabelGeneration/ValveNet_JACC_Revisions/Dendrite_Transfer/test_set_subsampling_indices pae2115@10.144.220.25:/home/pae2115/ValveNet_JACC_Revisions

!python /home/pae2115/ValveNet/ValveNet_Multisite_Validation/eval_AS_vary_prevalence.py

!python /home/pae2115/ValveNet/ValveNet_Multisite_Validation/eval_AI_vary_prevalence.py

!python /home/pae2115/ValveNet/ValveNet_Multisite_Validation/eval_MR_vary_prevalence.py

!python /home/pae2115/ValveNet/ValveNet_Multisite_Validation/eval_AS_AI_MR_vary_prevalence.py

Results are found at Box/Heart Failure Analytics/Data/MuseLabelGeneration/ValveNet_JACC_Revisions/Dendrite_Transfer/eval_vary_prevalence_script_outputs.ipynb
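
For orientation, here is a hedged sketch of how test-set indices could be subsampled to hit a target disease prevalence. It is not the logic of the `eval_*_vary_prevalence.py` scripts, which is not shown in this notebook; the `subsample_to_prevalence` helper and the toy labels are assumptions.

```python
import random


def subsample_to_prevalence(labels, target, seed=0):
    """Return sorted row indices whose positive fraction is close to `target`.

    Lowers prevalence by dropping positives, raises it by dropping negatives.
    """
    if not 0 < target < 1:
        raise ValueError("target prevalence must be strictly between 0 and 1")
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need both positive and negative cases")
    current = len(pos) / len(labels)
    if target <= current:
        # Keep every negative; solve n_pos / (n_pos + n_neg) = target for n_pos.
        n_pos = round(target * len(neg) / (1 - target))
        pos = rng.sample(pos, max(1, min(n_pos, len(pos))))
    else:
        # Keep every positive; solve the same equation for n_neg.
        n_neg = round(len(pos) * (1 - target) / target)
        neg = rng.sample(neg, max(1, min(n_neg, len(neg))))
    return sorted(pos + neg)


# Hypothetical test set with 20% prevalence.
labels = [1] * 20 + [0] * 80
idx_low = subsample_to_prevalence(labels, 0.05)
idx_high = subsample_to_prevalence(labels, 0.50)
```

Because counts are rounded to whole patients, the achieved prevalence only approximates the target (here 4 of 84 for the 5% request).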

Test Set AUROC and AUPRC Curves (Fig 3A and 3B)
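
A small standard-library sketch of the two quantities behind these curves: ROC points with trapezoidal area, and a step-wise area under the precision-recall curve (average precision). The helper names and the four-point toy example are assumptions; the figures themselves were presumably produced with a plotting library that is not shown here.

```python
def _ranked(y_score):
    """Row indices sorted from highest to lowest score."""
    return sorted(range(len(y_score)), key=lambda i: -y_score[i])


def roc_points(y_true, y_score):
    """Return (fpr, tpr) lists, one point per distinct score threshold."""
    order = _ranked(y_score)
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for k, i in enumerate(order):
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last row of a tied score group.
        if k + 1 == len(order) or y_score[order[k + 1]] != y_score[i]:
            fpr.append(fp / n_neg)
            tpr.append(tp / n_pos)
    return fpr, tpr


def trapezoid_area(x, y):
    """Area under a piecewise-linear curve."""
    return sum((x[k + 1] - x[k]) * (y[k + 1] + y[k]) / 2 for k in range(len(x) - 1))


def average_precision(y_true, y_score):
    """Sum of precision times recall increment over distinct thresholds."""
    order = _ranked(y_score)
    n_pos = sum(y_true)
    tp, ap, prev_recall = 0, 0.0, 0.0
    for k, i in enumerate(order):
        tp += y_true[i]
        if k + 1 == len(order) or y_score[order[k + 1]] != y_score[i]:
            recall = tp / n_pos
            ap += (recall - prev_recall) * (tp / (k + 1))
            prev_recall = recall
    return ap


# Hypothetical toy example, not model output.
toy_true = [0, 0, 1, 1]
toy_score = [0.1, 0.4, 0.35, 0.8]
fpr, tpr = roc_points(toy_true, toy_score)
toy_auroc = trapezoid_area(fpr, tpr)
toy_auprc = average_precision(toy_true, toy_score)
```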

Compute a Logistic Regression Model using Tabular Data

How much earlier was a patient's first ECG than their first echo among severe AS patients?

NYP Lawrence Validation with and without Propensity Score Matching

This can be found at /Users/pae2/Box/Heart Failure Analytics/Data/MuseLabelGeneration/ValveNet_JACC_Revisions/NYP_Lawrence/propensity-score-matching-main/propensity_score_matching_v2_ValveNet.ipynb
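
As a rough idea of what matching on a propensity score involves, the sketch below does greedy 1:1 nearest-neighbor matching with a caliper on precomputed scores. This is an assumed variant chosen for brevity; the linked notebook may use a different matching scheme, and the `greedy_match` helper, the caliper value, and the cohort dictionaries are hypothetical.

```python
def greedy_match(cohort_a, cohort_b, caliper=0.05):
    """Match each cohort_a member to its nearest unused cohort_b member.

    Both inputs map an ID to a precomputed propensity score. A pair is kept
    only if the scores differ by at most `caliper`.
    """
    available = dict(cohort_b)
    matches = []
    # Process cohort_a in score order so results are deterministic.
    for a_id, a_score in sorted(cohort_a.items(), key=lambda kv: kv[1]):
        if not available:
            break
        b_id = min(available, key=lambda b: abs(available[b] - a_score))
        if abs(available[b_id] - a_score) <= caliper:
            matches.append((a_id, b_id))
            del available[b_id]  # match without replacement
    return matches


# Hypothetical propensity scores for two cohorts.
cohort_a = {"t1": 0.30, "t2": 0.62, "t3": 0.95}
cohort_b = {"c1": 0.28, "c2": 0.33, "c3": 0.60, "c4": 0.10}
matches = greedy_match(cohort_a, cohort_b)
```

Here `t3` stays unmatched because no remaining score lies within the caliper, which is the usual trade-off: a tighter caliper gives closer pairs but a smaller matched cohort.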

Model Calibration Curves
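
A minimal sketch of the binning behind a calibration (reliability) curve, assuming the model outputs can be read as probabilities in [0, 1]. The `calibration_curve` helper here is a hand-rolled illustration with equal-width bins, and the bin count and toy data are assumptions.

```python
def calibration_curve(y_true, y_prob, n_bins=10):
    """Return (mean predicted, observed fraction positive, count) per bin.

    Uses equal-width bins on [0, 1] and skips empty bins.
    """
    bins = [[0.0, 0, 0] for _ in range(n_bins)]  # [sum of p, positives, count]
    for y, p in zip(y_true, y_prob):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[b][0] += p
        bins[b][1] += y
        bins[b][2] += 1
    return [(s / n, k / n, n) for s, k, n in bins if n]


# Hypothetical toy data, not model output.
cal_true = [0, 0, 1, 0, 1, 1]
cal_prob = [0.1, 0.2, 0.3, 0.6, 0.7, 0.9]
curve = calibration_curve(cal_true, cal_prob, n_bins=2)
```

Plotting observed fraction against mean predicted value, with the diagonal as reference, gives the curve; points below the diagonal indicate over-prediction in that bin.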

Determine PPV and NPV at Youden Index value per model
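
The Youden index is J = sensitivity + specificity - 1, and the operating threshold is the score that maximizes J. The sketch below finds that threshold and then reports PPV and NPV there. Helper names and the toy data are assumptions for illustration; treating a score at or above the threshold as positive is also an assumed convention.

```python
def confusion(y_true, y_score, thr):
    """Return (tp, fp, tn, fn), calling score >= thr positive."""
    tp = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s >= thr)
    fp = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s >= thr)
    tn = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s < thr)
    fn = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s < thr)
    return tp, fp, tn, fn


def youden_threshold(y_true, y_score):
    """Return (threshold, J) maximizing sensitivity + specificity - 1."""
    best_thr, best_j = None, float("-inf")
    for thr in sorted(set(y_score)):
        tp, fp, tn, fn = confusion(y_true, y_score, thr)
        j = tp / (tp + fn) + tn / (tn + fp) - 1
        if j > best_j:  # first (lowest) threshold wins ties
            best_thr, best_j = thr, j
    return best_thr, best_j


def ppv_npv(y_true, y_score, thr):
    """Return (PPV, NPV) at a threshold; NaN when a denominator is zero."""
    tp, fp, tn, fn = confusion(y_true, y_score, thr)
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    npv = tn / (tn + fn) if tn + fn else float("nan")
    return ppv, npv


# Hypothetical toy data, not model output.
yi_true = [0, 0, 1, 0, 1, 1, 0, 1]
yi_score = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.5, 0.9]
thr, j = youden_threshold(yi_true, yi_score)
ppv, npv = ppv_npv(yi_true, yi_score, thr)
```

Unlike sensitivity and specificity, PPV and NPV shift with disease prevalence, so values computed on one test set do not carry over to a population with a different prevalence.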