[![spaCy](https://img.shields.io/badge/made%20with%20❤%20and-spaCy-09a3d5.svg)](https://spacy.io)

# medaCy
:hospital: Medical Text Mining and Information Extraction with spaCy :hospital:

MedaCy is a text processing and learning framework built over [spaCy](https://spacy.io/) to support the lightning-fast
prototyping, training, and application of highly predictive medical NLP models. It is designed to streamline researcher
workflow by providing utilities for model training, prediction, and organization while ensuring the replicability of systems.

![alt text](https://nlp.cs.vcu.edu/images/Edit_NanomedicineDatabase.png "Nanomedicine Database")

# :star2: Features
- Highly predictive, shared-task-dominating, out-of-the-box trained models for medical named entity recognition.
- Customizable pipelines with detailed development instructions and documentation.
- Supports the design of replicable NLP systems, so that results can be reproduced and models distributed while still preserving privacy.
- Active community development spearheaded and maintained by [NLP@VCU](https://nlp.cs.vcu.edu/).
- Detailed [API documentation](https://medacy.readthedocs.io/en/latest/).

## :thought_balloon: Where to ask questions

MedaCy is actively maintained by a team of researchers at Virginia Commonwealth University. The quickest way to
get a response to a question is to raise an issue. Before doing so, consult the
[API documentation](https://medacy.readthedocs.io/en/latest/). See how to formulate a good issue or feature request in the [Contribution Guide](CONTRIBUTING.md).

## :computer: Installation Instructions
MedaCy can be installed for general use or for pipeline development and research.

| Application | Run |
| ----------- |:-------------:|
| Prediction and Model Training (stable) | `pip install git+https://github.com/NLPatVCU/medaCy.git` |
| Prediction and Model Training (latest) | `pip install git+https://github.com/NLPatVCU/medaCy.git@development` |
| Pipeline Development and Contribution | [See Contribution Instructions](/CONTRIBUTING.md) |
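
After running one of the install commands above, a quick way to confirm that the package is importable is to ask Python whether it can locate the module. This is only a minimal sketch: the helper name `is_installed` is hypothetical and not part of medaCy, and the module name `medacy` is assumed from the import used in this README's usage example.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if Python can locate the named top-level module.

    Hypothetical helper for illustration; it is not part of medaCy.
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "medacy" is the import name assumed from this README's usage example.
    if is_installed("medacy"):
        print("medacy is installed")
    else:
        print("medacy was not found; re-run one of the pip commands")
```

The check only confirms that the module can be found in the current environment; it does not import medaCy or verify that any trained model is available.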

# :books: Power of medaCy
After installing medaCy and [medaCy's clinical model](guide/models/clinical_notes_model.md), simply run:

```python
from medacy.model.model import Model

# Load the separately installed clinical notes model
model = Model.load_external('medacy_model_clinical_notes')
# Run named entity recognition over a raw string
annotation = model.predict("The patient was prescribed 1 capsule of Advil for 5 days.")
print(annotation)
```
and receive instant predictions:
```python
[
    ('Drug', 40, 45, 'Advil'),
    ('Dosage', 27, 28, '1'),
    ('Form', 29, 36, 'capsule'),
    ('Duration', 46, 56, 'for 5 days')
]
```
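
Predictions in this form can be post-processed with plain Python. The sketch below assumes each prediction is a `(label, start, end, text)` tuple whose offsets are end-exclusive character positions in the input string, as the output above suggests; the object actually returned by `model.predict` may expose a different interface. The helper names `check_offsets` and `group_by_label` are hypothetical and not part of medaCy.

```python
from collections import defaultdict

text = "The patient was prescribed 1 capsule of Advil for 5 days."

# Predictions copied from the example output above, assumed to be
# (label, start, end, text) tuples with end-exclusive character offsets.
predictions = [
    ('Drug', 40, 45, 'Advil'),
    ('Dosage', 27, 28, '1'),
    ('Form', 29, 36, 'capsule'),
    ('Duration', 46, 56, 'for 5 days'),
]


def check_offsets(source, entities):
    """Return True if every entity's offsets slice out its recorded text."""
    return all(source[start:end] == span for _, start, end, span in entities)


def group_by_label(entities):
    """Map each label to its entity texts, ordered by position in the source."""
    grouped = defaultdict(list)
    for label, _start, _end, span in sorted(entities, key=lambda e: e[1]):
        grouped[label].append(span)
    return dict(grouped)


print(check_offsets(text, predictions))
print(group_by_label(predictions))
```

Checking the offsets against the original string is a cheap sanity test before using the spans for highlighting or downstream extraction.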

MedaCy can also be used through its command line interface, which is documented in the [command line interface guide](./guide/command_line_interface.md).

To explore medaCy's other models or train your own, visit the [examples section](guide).

Reference
=========
```bibtex
@ARTICLE {
    author  = "Andriy Mulyar and Natassja Lewinski and Bridget McInnes",
    title   = "TAC SRIE 2018: Extracting Systematic Review Information with MedaCy",
    journal = "National Institute of Standards and Technology (NIST) 2018 Systematic Review Information Extraction (SRIE) > Text Analysis Conference",
    year    = "2018",
    month   = "nov"
}
```

License
=======
This package is licensed under the GNU General Public License.

Authors
=======
Current contributors: Steele Farnsworth, Anna Conte, Gabby Gurdin, Aidan Kierans, Aidan Myers, and Bridget T. McInnes

Former contributors: Andriy Mulyar, Jorge Vargas, Corey Sutphin, and Bobby Best

Acknowledgments
===============
- [VCU Natural Language Processing Lab](https://nlp.cs.vcu.edu/) ![alt text](https://nlp.cs.vcu.edu/images/vcu_head_logo "VCU")
- [Nanoinformatics Vertically Integrated Projects](https://rampages.us/nanoinformatics/)