|
a |
|
b/tests/README.md |
|
|
1 |
## What's new? |
|
|
2 |
|
|
|
3 |
1. The UMN Clinical Abbreviation Sense Inventory: Clinical Abbreviations and Acronyms is indexed to solr core ehr_abbreviations (440 most frequently used abbreviations, 30000+ docs in total on solr) |
|
|
4 |
|
|
|
5 |
The following two are temporarily placed in sample_script.py only, yet to incorporate into test.py |
|
|
6 |
2. Function to output discharge summaries for patients who have more then 10 of them |
|
|
7 |
3. Function to initialize word2vec model using outputted discharge summary above |
|
|
8 |
|
|
|
9 |
## Running Unit Tests on MIMIC-III: |
|
|
10 |
|
|
|
11 |
Running test.py without any arguments would run all tests ones but **skips** ones that take a while to run. (Those tests must be called explicitly.) |
|
|
12 |
``` |
|
|
13 |
> python tests.py |
|
|
14 |
``` |
|
|
15 |
|
|
|
16 |
To run specific tests, follow the format python tests.py CLASS_NAME1.TEST_NAME1 CLASS_NAME2.TEST_NAME2 … |
|
|
17 |
For example, you can run |
|
|
18 |
``` |
|
|
19 |
> python tests.py t7.test7_1_naive_bayes |
|
|
20 |
|
|
|
21 |
``` |
|
|
22 |
or |
|
|
23 |
``` |
|
|
24 |
> python tests.py t3.test3_4_doc_sentences t7.test7_1_naive_bayes |
|
|
25 |
``` |
|
|
26 |
|
|
|
27 |
Unittest prints "." when a test runs successfully, "E" when it encounters an error, and "s" when it skips a test. |
|
|
28 |
|
|
|
29 |
No tests updates the database; all of them are just processing information from it. Not all tests currently work. |
|
|
30 |
|
|
|
31 |
###Tests available to run |
|
|
32 |
|
|
|
33 |
The tests marked in bold are skipped and must be run explicitly. |
|
|
34 |
|
|
|
35 |
T1. |
|
|
36 |
1. Count the total number of patients. test1_1_count_patients |
|
|
37 |
2. **Count the total number of patient records** (fails). test1_2_count_docs |
|
|
38 |
3. **Count the number of sentences**. test1_3_note_info |
|
|
39 |
4. **Print the record with the most sentences**. test1_4_longest_note |
|
|
40 |
|
|
|
41 |
T2. |
|
|
42 |
|
|
|
43 |
1. Display a full document given its document ID. test2_1_print_note |
|
|
44 |
2. **Count how many documents are associated with a given patient, given the patient ID, e.g., 23224 - show also the list of document IDs**. test2_2_patient_info |
|
|
45 |
3. List all document IDs. test2_3_doc_ids |
|
|
46 |
4. List all patient IDs. test2_4_patient_ids |
|
|
47 |
5. List all document IDs for a given admission date, e.g., 2188-11-1. test2_5_docs_on_date |
|
|
48 |
|
|
|
49 |
T3. |
|
|
50 |
|
|
|
51 |
1. Extract all abbreviations from a document, given the document ID. For now, let’s assume that an abbreviation is a sequence of two or more capital letters, e.g., GERD, PEERL, AMI. test3_1_extract_abbrevations |
|
|
52 |
2. **List all document IDs that include keyword "meningitis"**. test3_2_docs_with_query |
|
|
53 |
3. **List all document IDs that include keywords "Service: SURGERY”**. test3_3_query_docs |
|
|
54 |
4. Given a document ID, show a numbered list of all sentences in that document. test3_4_doc_sentences |
|
|
55 |
5. **Count the number of prescriptions for each unique medication**. test3_5_medications |
|
|
56 |
|
|
|
57 |
T5 |
|
|
58 |
|
|
|
59 |
1. **Use https://github.com/kavgan/phrase-at-scale to extract phrases from a document, given its ID**. test5_1_extract_phrases |
|
|
60 |
2. Count how many patients are labeled as “male” or “female”. test5_2_count_gender |
|
|
61 |
|
|
|
62 |
T6 |
|
|
63 |
1. **Classifies the sentiment of a document as positive or negative using AllenNLLP.** test6_1_sentiment_classification |
|
|
64 |
2. **Performs named entity recognition on a document using AllenNLLP**. test6_2_ner |
|
|
65 |
3. **Tokenizes the words of a document using Huggingface.** test6_3_tokenize |
|
|
66 |
|
|
|
67 |
T7. |
|
|
68 |
1. **Creates extractive summary of an EHR with Naive Bayes Algorithm trained on PubMed articles.** test7_1_naive_bayes |
|
|
69 |
2. **Generates abstractive summary of an EHR with pre-trained Distilbart model from Huggingface (works poorly)**. test7_2_distilbart_summary |
|
|
70 |
3. **Generates abstractive summary of an EHR with pre-trained T5 model from Huggingface (works poorly)**. test7_3_t5_summary |