Diff of /collated_tasks/README.md [000000] .. [2d4573]

a b/collated_tasks/README.md
# Tasks on Medical Data
Developed by Keen (keen.you@yale.edu) Fall 2021
## /tasks
These tasks are included in the wrapper functions subdirectory; see [wrapper_functions](https://github.com/IreneZihuiLi/EHRKit-LILY/tree/main/wrapper_functions) for more comprehensive usage descriptions.

The /tasks directory contains code for tasks that use MIMIC data. Tasks include named entity recognition, abbreviation detection, entity linking, hyponym detection, and translation.

The /tasks/utils subdirectory contains helper functions for various tasks, including selecting medical notes from a CSV file by row\_id or subject\_id, vectorizing notes, and segmenting notes into sentences. Additional tasks include document clustering and retrieving similar documents.
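
As a rough illustration of selecting notes by row\_id or subject\_id, here is a minimal sketch that uses only the Python standard library. It is not the repository's helper: the function name `select_notes` and the sample data are hypothetical, and the column names `ROW_ID`, `SUBJECT_ID`, and `TEXT` are assumed to follow the MIMIC-III NOTEEVENTS layout. IDs are compared as strings.

```python
import csv
import io
from typing import Iterable, List, Optional


def select_notes(csv_file: Iterable[str],
                 row_id: Optional[str] = None,
                 subject_id: Optional[str] = None) -> List[str]:
    """Return the TEXT of every note matching the given ROW_ID and/or SUBJECT_ID.

    Hypothetical helper for illustration only; column names are assumed
    to match MIMIC-III NOTEEVENTS.
    """
    if row_id is None and subject_id is None:
        raise ValueError("provide row_id, subject_id, or both")
    notes = []
    for row in csv.DictReader(csv_file):
        # Skip rows that fail any filter the caller supplied.
        if row_id is not None and row["ROW_ID"] != str(row_id):
            continue
        if subject_id is not None and row["SUBJECT_ID"] != str(subject_id):
            continue
        notes.append(row["TEXT"])
    return notes


# Tiny synthetic sample (not real patient data) standing in for NOTEEVENTS.csv.
SAMPLE = io.StringIO(
    'ROW_ID,SUBJECT_ID,TEXT\n'
    '1,100,"Patient admitted with chest pain."\n'
    '2,100,"Discharged in stable condition."\n'
    '3,200,"Follow-up visit, no complaints."\n'
)

if __name__ == "__main__":
    for note in select_notes(SAMPLE, subject_id="100"):
        print(note)
```

For a real file, pass a handle opened with `open(path, newline="")` so that quoted notes containing line breaks are parsed correctly.
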
## /non_mimic
The /non\_mimic directory contains tasks that use medical data other than MIMIC. Tasks include de-identification, inference, and question answering.

## Getting Started with Collated Tasks
#### Clone repo & prepare data
```
git clone https://github.com/Yale-LILY/EHRKit.git --depth=1
cd EHRKit/
cd tutorials
mkdir data
mkdir data/mimic_data
cp /data/corpora/mimic/NOTEEVENTS.csv ./data/mimic_data/.
```
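
After copying the data, it can help to confirm that the CSV has the columns the tasks are likely to need. The sketch below is an assumption-laden check, not part of the repository: `missing_columns` is a hypothetical name, and the required columns (`ROW_ID`, `SUBJECT_ID`, `TEXT`) are assumed from the MIMIC-III NOTEEVENTS layout.

```python
import csv
from pathlib import Path

# Assumed minimum set of columns; adjust to what your tasks actually read.
REQUIRED = {"ROW_ID", "SUBJECT_ID", "TEXT"}


def missing_columns(csv_path, required=REQUIRED):
    """Return the sorted list of required column names absent from the CSV header."""
    with open(csv_path, newline="") as handle:
        header = next(csv.reader(handle), [])
    return sorted(set(required) - set(header))


if __name__ == "__main__":
    path = Path("data/mimic_data/NOTEEVENTS.csv")
    if not path.exists():
        print(f"{path} not found")
    else:
        missing = missing_columns(path)
        print("header OK" if not missing else f"missing columns: {missing}")
```

Only the header row is read, so the check stays fast even on a multi-gigabyte file.
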
#### Create virtual environment
```
python3 -m venv task_virenv/
source task_virenv/bin/activate
```
#### Install packages
```
pip install pip==21.3.1
pip install -U spacy
pip install scispacy
pip install https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.4.0/en_core_sci_sm-0.4.0.tar.gz
pip install pandas==1.1.5
pip install numpy==1.19.5
pip install transformers==4.12.3
pip install torch==1.10.0
pip install torchvision==0.11.1
pip install sentencepiece==0.1.96
pip install scikit-learn
pip install PyRuSH
pip install stanza==1.3.0
```
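
To confirm what ended up in the virtual environment, a small standard-library check along these lines may be useful. It is a sketch rather than part of the repository: `installed_versions` is a hypothetical name, and the distribution names are assumed to match the pip commands in this section (`scikit-learn` is the distribution that provides the `sklearn` module).

```python
from importlib import metadata
from typing import Dict, Iterable, Optional

# Distribution names assumed from the install commands in this section.
PACKAGES = ["spacy", "scispacy", "pandas", "numpy", "transformers", "torch",
            "torchvision", "sentencepiece", "scikit-learn", "PyRuSH", "stanza"]


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Run it inside the activated environment; any package reported as `NOT INSTALLED` can be reinstalled with the corresponding pip command.
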