# MedNLP
This repository contains code and analysis for "NLP chatbot for Discharge Summaries" using the MIMIC-III dataset.

The project aims to build a system that leverages the large volume of healthcare information available as electronic data, and focuses on creating a human-computer interaction component in the form of an NLP chatbot. The system answers questions about a patient's discharge summaries, primarily using topic modeling and matching techniques.
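
To make the idea of "matching" concrete, here is a minimal sketch that picks the sentence most similar to a question using bag-of-words cosine similarity. It is a standard-library stand-in for illustration, not the method implemented in the notebook; the function names and the sample sentences are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_matching_sentence(question, sentences):
    """Return the sentence whose word counts are closest to the question."""
    question_vec = Counter(tokenize(question))
    return max(
        sentences,
        key=lambda s: cosine_similarity(question_vec, Counter(tokenize(s))),
    )


# Hypothetical, made-up sentences standing in for a discharge summary.
summary_sentences = [
    "Admission Date: 2101-10-20",
    "Allergies: Penicillins",
    "Please call the clinic to make a follow-up appointment.",
]
print(best_matching_sentence("How can I make an appointment?", summary_sentences))
```

A real system would likely weight terms (for example with TF-IDF) and combine this with topic modeling, as the description above suggests.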

### Files
* MED277_bot.ipynb - A Jupyter notebook containing the code for the NLP chatbot.
* MED277_bot.py - A Python script for initializing and running the chatbot. Contains the same code as MED277_bot.ipynb.
* data10.pkl - A subset of 50 discharge summaries extracted from the MIMIC-III dataset and saved as a Python pickle file. The file can be found at [location](https://drive.google.com/open?id=19-Wh4x-roinUDStfiZ_C2BmmcCr4UVNY)
* MED277_report.pdf - A PDF report for the project.
* MED277_presentation.pptx - A PowerPoint presentation for the project.

#### Running the chatbot [Complete Data]
The chatbot uses MIMIC-III discharge summaries to train and answer questions. Set *base_path* to the location of the MIMIC-III file *NOTEEVENTS.csv.gz*. Then navigate to the directory containing **MED277_bot.ipynb** and start Jupyter with the following command:
```bash
jupyter notebook
```
This should start Jupyter, after which you can open and run the notebook **MED277_bot.ipynb**.

To run the Python script **MED277_bot.py** on the complete data, edit the **load_data()** function: uncomment the code that reads *NOTEEVENTS.csv.gz* (downloaded from the MIMIC-III dataset and placed at the *base_path* location), and comment out the code that reads the saved data file **data10.pkl**. By default, the code that reads *NOTEEVENTS.csv.gz* is commented out in MED277_bot.py. Then run the script with the following command:
```bash
python MED277_bot.py
```
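
For orientation, the following is a rough sketch of how discharge summaries could be pulled out of *NOTEEVENTS.csv.gz* using only the standard library. It is not the repository's **load_data()** function: the function name `read_discharge_summaries` and the `limit` parameter are hypothetical, and the `CATEGORY` and `TEXT` column names and the `Discharge summary` category value are assumed from the published MIMIC-III NOTEEVENTS schema, so check them against your copy of the file.

```python
import csv
import gzip
import os

# Discharge summaries can be long, so raise the csv module's default field size limit.
csv.field_size_limit(10**8)


def read_discharge_summaries(base_path, filename="NOTEEVENTS.csv.gz", limit=None):
    """Return the TEXT of notes whose CATEGORY is 'Discharge summary'.

    Column names and the category value are assumptions about the file layout.
    """
    summaries = []
    path = os.path.join(base_path, filename)
    # Open the gzip archive in text mode so csv can parse it line by line.
    with gzip.open(path, mode="rt", newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            if row["CATEGORY"].strip() == "Discharge summary":
                summaries.append(row["TEXT"])
                if limit is not None and len(summaries) >= limit:
                    break
    return summaries
```

Since pandas is already a dependency, `pandas.read_csv` on the same path followed by a filter on the category column would be an equally reasonable choice.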

#### Running the chatbot [Small Subset Data]
We have saved a small subset of the data to a pickle file, **data10.pkl**.
Edit the *base_path* inside the **load_data()** function of ***MED277_bot.py*** and place **data10.pkl** at the *base_path* location. Then run ***MED277_bot.py*** with the following command:
```bash
python MED277_bot.py
```
Follow the on-screen instructions to interact further with the chatbot.
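
As a quick way to inspect the subset outside the chatbot, the pickle file can be loaded directly. This is a minimal sketch, assuming the file sits at *base_path*; the helper name `load_subset` is hypothetical, and the type of the stored object is not documented here, so inspect it after loading.

```python
import os
import pickle


def load_subset(base_path, filename="data10.pkl"):
    """Unpickle and return whatever object was saved in the subset file."""
    with open(os.path.join(base_path, filename), "rb") as handle:
        return pickle.load(handle)


# Example usage (the path is a placeholder):
# data = load_subset("/path/to/base_path")
# print(type(data))
```

Unpickling can execute arbitrary code, so only load pickle files obtained from a source you trust.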

### Sample questions for chatbot
- What is my date of birth?
- What is my admission date?
- When was I discharged?
- What is my gender?
- What are the services I had?
- Do I have allergy?
- Who was my attending?
- Am I married?
- What is my social history?
- How can I make an appointment?
- Do I need to visit the clinic?
- How was my MRI?
- What are the medication I should take?
- How to take the steroid?
- What do I do if I have seizures?
- Is my vision blurry?
- Is something wrong with my brain?
- Do I have a cold?
- Do I have dysphagia?

### Required Dependencies & Libraries
- Anaconda 5.2
- Python 3.x
- pandas
- sklearn
- nltk
- numpy
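
Before running the chatbot, it can help to confirm that the listed libraries are importable in the active environment. The snippet below is a small, optional sketch; the helper name `missing_modules` is hypothetical and it checks only that each module can be found, not its version.

```python
import importlib.util

# Top-level import names for the libraries listed above.
REQUIRED_MODULES = ["pandas", "sklearn", "nltk", "numpy"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```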