[](https://spacy.io)

# medaCy
:hospital: Medical Text Mining and Information Extraction with spaCy :hospital:

MedaCy is a text processing and learning framework built on [spaCy](https://spacy.io/) to support the lightning-fast
prototyping, training, and application of highly predictive medical NLP models. It is designed to streamline researcher
workflow by providing utilities for model training, prediction and organization while ensuring the replicability of systems.



# :star2: Features
- Highly predictive, shared-task dominating out-of-the-box trained models for medical named entity recognition.
- Customizable pipelines with detailed development instructions and documentation.
- Supports the design of replicable NLP systems, so that results can be reproduced and models distributed while still allowing for privacy.
- Active community development spearheaded and maintained by [NLP@VCU](https://nlp.cs.vcu.edu/).
- Detailed [API documentation](https://medacy.readthedocs.io/en/latest/).

## :thought_balloon: Where to ask questions

MedaCy is actively maintained by a team of researchers at Virginia Commonwealth University. The best way to
get a prompt response to a question is to raise an issue. Make sure to consult the
[API documentation](https://medacy.readthedocs.io/en/latest/) first. See how to formulate a good issue or feature request in the [Contribution Guide](CONTRIBUTING.md).

## :computer: Installation Instructions
MedaCy can be installed for general use or for pipeline development / research purposes.

| Application | Run |
| ----------- |:-------------:|
| Prediction and Model Training (stable) | `pip install git+https://github.com/NLPatVCU/medaCy.git` |
| Prediction and Model Training (latest) | `pip install git+https://github.com/NLPatVCU/medaCy.git@development` |
| Pipeline Development and Contribution | [See Contribution Instructions](/CONTRIBUTING.md) |

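
After running one of the install commands, a quick way to confirm that the package is visible to your interpreter is to look it up on the import path. The snippet below is a minimal sketch using only the Python standard library; `is_installed` is a hypothetical helper written for this illustration, not part of medaCy, and the package names `medacy` and `spacy` are assumed from the import names used in this README and by spaCy.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package can be found on the import path.

    Hypothetical helper for illustration; it does not import the package.
    """
    return importlib.util.find_spec(package_name) is not None


# Report whether medaCy and its spaCy dependency are discoverable.
for name in ("medacy", "spacy"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```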
# :books: Power of medaCy
After installing medaCy and [medaCy's clinical model](guide/models/clinical_notes_model.md), simply run:

```python
from medacy.model.model import Model

# Load the installed clinical notes model by its package name
model = Model.load_external('medacy_model_clinical_notes')
# Run named entity recognition on a raw string
annotation = model.predict("The patient was prescribed 1 capsule of Advil for 5 days.")
print(annotation)
```
and receive instant predictions:
```python
# Each tuple: (entity label, start character offset, end character offset, entity text)
[
    ('Drug', 40, 45, 'Advil'),
    ('Dosage', 27, 28, '1'),
    ('Form', 29, 36, 'capsule'),
    ('Duration', 46, 56, 'for 5 days')
]
```

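
The offsets in the predictions above are character positions in the input string, with the end offset exclusive, so each span can be used directly as a slice. The sketch below copies the sentence and tuples from the example and assumes the prediction can be treated as a plain list of `(label, start, end, text)` tuples; the names `in_reading_order` and `by_label` are introduced only for illustration and are not part of medaCy.

```python
# Sentence and predictions copied from the example above.
text = "The patient was prescribed 1 capsule of Advil for 5 days."
annotations = [
    ('Drug', 40, 45, 'Advil'),
    ('Dosage', 27, 28, '1'),
    ('Form', 29, 36, 'capsule'),
    ('Duration', 46, 56, 'for 5 days'),
]

# Each span should reproduce its entity text when used as a slice of the input.
for label, start, end, entity in annotations:
    assert text[start:end] == entity, (label, text[start:end], entity)

# Order the entities by where they appear in the sentence.
in_reading_order = sorted(annotations, key=lambda ann: ann[1])

# Group entity text by label (illustrative structure, not a medaCy API).
by_label = {}
for label, start, end, entity in in_reading_order:
    by_label.setdefault(label, []).append(entity)

print([label for label, *_ in in_reading_order])
# ['Dosage', 'Form', 'Drug', 'Duration']
print(by_label)
# {'Dosage': ['1'], 'Form': ['capsule'], 'Drug': ['Advil'], 'Duration': ['for 5 days']}
```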
MedaCy can also be used through its command line interface, documented [here](./guide/command_line_interface.md).

To explore medaCy's other models or train your own, visit the [examples section](guide).

Reference
=========
```bibtex
@ARTICLE {
    author = "Andriy Mulyar and Natassja Lewinski and Bridget McInnes",
    title = "TAC SRIE 2018: Extracting Systematic Review Information with MedaCy",
    journal = "National Institute of Standards and Technology (NIST) 2018 Systematic Review Information Extraction (SRIE) > Text Analysis Conference",
    year = "2018",
    month = "nov"
}
```

License
=======
This package is licensed under the GNU General Public License.

Authors
=======
Current contributors: Steele Farnsworth, Anna Conte, Gabby Gurdin, Aidan Kierans, Aidan Myers, and Bridget T. McInnes

Former contributors: Andriy Mulyar, Jorge Vargas, Corey Sutphin, and Bobby Best

Acknowledgments
===============
- [VCU Natural Language Processing Lab](https://nlp.cs.vcu.edu/) 
- [Nanoinformatics Vertically Integrated Projects](https://rampages.us/nanoinformatics/)