This repository contains code and analysis for "NLP Chatbot for Discharge Summaries", built on the MIMIC-III dataset.
The project aims to build a system that leverages the large volume of healthcare information available as electronic data, and focuses on a human-computer interaction component in the form of an NLP chatbot. The system answers questions about a patient's discharge summaries, primarily using topic modeling and matching techniques.
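To give a rough feel for the "matching" step, here is a minimal sketch that picks the sentence sharing the most vocabulary with a question. It is not the code in MED277_bot.py: the helper names `tokenize` and `best_match` are hypothetical, and plain Jaccard token overlap stands in for the project's topic-modeling approach.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def best_match(question, sentences):
    """Return (sentence, score) for the sentence with the highest Jaccard overlap.

    Hypothetical helper for illustration; returns (None, 0.0) if nothing overlaps.
    """
    q_tokens = tokenize(question)
    best, best_score = None, 0.0
    for sentence in sentences:
        s_tokens = tokenize(sentence)
        union = q_tokens | s_tokens
        score = len(q_tokens & s_tokens) / len(union) if union else 0.0
        if score > best_score:
            best, best_score = sentence, score
    return best, best_score
```

A real system would typically add stop-word removal and a topic model on top of this; the sketch only shows the shape of a question-to-text matching function.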
The chatbot uses MIMIC-III discharge summaries both to train and to answer questions. Set base_path to the location of the MIMIC-III file NOTEEVENTS.csv.gz. Then navigate to the directory containing MED277_bot.ipynb and start Jupyter with the following command:
jupyter notebook
This starts Jupyter, from which you can open and run the notebook MED277_bot.ipynb.
To run the Python script MED277_bot.py against the full dataset, first download NOTEEVENTS.csv.gz from the MIMIC-III dataset and place it at the base_path location. Then, in the load_data() function, uncomment the code that reads from this file and comment out the code that reads from the saved data file data10.pkl. By default, the code that reads NOTEEVENTS.csv.gz is commented out in MED277_bot.py. The script can then be executed with the following command:
python MED277_bot.py
We have also saved a small subset of the data to a pickle file, data10.pkl.
To use it, edit base_path inside the load_data() function of MED277_bot.py and place data10.pkl at the base_path location. MED277_bot.py should then run with the following command:
python MED277_bot.py
Follow the on-screen instructions to interact further with the chatbot.
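The two data-loading modes described above (full NOTEEVENTS.csv.gz versus the data10.pkl subset) can be pictured with the following sketch. It is an illustration only, not the actual load_data() in MED277_bot.py: the `use_pickle` and `max_notes` parameters are assumptions, the structure of data10.pkl is not documented here so the pickle branch simply returns whatever object was saved, and the CSV branch relies on the CATEGORY and TEXT columns of the MIMIC-III NOTEEVENTS table.

```python
import csv
import gzip
import os
import pickle


def load_data(base_path, use_pickle=True, max_notes=10):
    """Load discharge summaries from base_path (hypothetical sketch)."""
    if use_pickle:
        # Default mode: read the small saved subset.
        # Only unpickle files that come from a source you trust.
        with open(os.path.join(base_path, "data10.pkl"), "rb") as fh:
            return pickle.load(fh)

    # Full-dataset mode: stream the compressed CSV and keep discharge summaries.
    csv.field_size_limit(2**31 - 1)  # note texts can exceed the default field limit
    notes = []
    path = os.path.join(base_path, "NOTEEVENTS.csv.gz")
    with gzip.open(path, "rt", newline="", encoding="utf-8") as fh:
        for row in csv.DictReader(fh):
            if row["CATEGORY"].strip() == "Discharge summary":
                notes.append(row["TEXT"])
                if len(notes) >= max_notes:  # assumed cap to keep the sample small
                    break
    return notes
```

Switching modes through a parameter rather than by commenting code in and out is a design choice of this sketch; the repository's script uses the comment/uncomment approach described above.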