# BiLSTM + CRF for NER

Before running anything, generate the pre-processed dataset by running generate_data.py with the command given in the parent directory.

## Additional Requirements

1. Packages: Anaconda, PyTorch, AllenNLP
2. GloVe 300B embeddings

## Training

1. Change settings in model/config.py
2. Main settings to change: file directories, model hyperparameters, etc.
3. Run build_data.py
   - Builds the embedding dictionary, text files of words, characters, and tags, and the index-to-word and index-to-character mappings that the model reads
4. Run train.py

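The contents of build_data.py are not shown here, so the following is only a minimal sketch of how the vocabulary and index mappings described in step 3 might be built. The names `build_vocab` and `to_index`, the `<pad>`/`<unk>` special tokens, the lowercasing of words, and the BIO-style tags in the sample sentence are all assumptions made for illustration, not the repository's actual code.

```python
from collections import Counter


def build_vocab(sentences):
    """Count words, characters, and tags.

    `sentences` is a list of sentences, each a list of (word, tag) pairs.
    Hypothetical helper; the real build_data.py may differ.
    """
    words, chars, tags = Counter(), Counter(), Counter()
    for sentence in sentences:
        for word, tag in sentence:
            words[word.lower()] += 1   # assumption: words are lowercased
            chars.update(word)         # counts each character of the word
            tags[tag] += 1
    return words, chars, tags


def to_index(counter, specials=("<pad>", "<unk>")):
    """Return (item -> idx, idx -> item) mappings, special tokens first."""
    items = list(specials) + sorted(counter)
    item_to_idx = {item: i for i, item in enumerate(items)}
    idx_to_item = {i: item for item, i in item_to_idx.items()}
    return item_to_idx, idx_to_item


# Made-up sample sentence with assumed BIO-style tags
sample = [[("Aspirin", "B-Drug"), ("81", "B-Strength"),
           ("mg", "I-Strength"), ("daily", "B-Frequency")]]

words, chars, tags = build_vocab(sample)
word_to_idx, idx_to_word = to_index(words)
char_to_idx, idx_to_char = to_index(chars)
tag_to_idx, idx_to_tag = to_index(tags, specials=())

print(word_to_idx)
print(tag_to_idx)
```

Sorting the vocabulary before assigning indices keeps the mappings identical across runs, which matters if the mappings are written to text files and read back at training time.
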
## Results

| | precision | recall | f1-score |
|:---:|:---:|:---:|:---:|
| ADE | 0.1807 | 0.7168 | 0.2887 |
| Dosage | 0.9272 | 0.9123 | 0.9197 |
| Drug | 0.8898 | 0.9287 | 0.9088 |
| Duration | 0.8882 | 0.7778 | 0.8293 |
| Form | 0.9840 | 0.9172 | 0.9494 |
| Frequency | 0.9412 | 0.9494 | 0.9453 |
| Reason | 0.7883 | 0.5238 | 0.6294 |
| Route | 0.9583 | 0.9226 | 0.9401 |
| Strength | 0.9769 | 0.9683 | 0.9726 |
| micro avg | 0.8708 | 0.8957 | 0.8831 |
| macro avg | 0.8624 | 0.8792 | 0.8684 |
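
As a reading aid for the table, the f1-score of each per-class row is the harmonic mean of that row's precision and recall. The short sketch below recomputes it for three rows copied from the table; the helper name `f1_score` is just an illustrative choice, and small differences in the last digit are expected because the table's precision and recall are already rounded.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported f1-score) copied from the table above
rows = {
    "ADE": (0.1807, 0.7168, 0.2887),
    "Reason": (0.7883, 0.5238, 0.6294),
    "Strength": (0.9769, 0.9683, 0.9726),
}

for label, (p, r, reported) in rows.items():
    print(f"{label}: computed {f1_score(p, r):.4f}, reported {reported:.4f}")
```

The ADE row shows why the harmonic mean is used: recall is high (0.7168), but the low precision (0.1807) pulls the f1-score down to about 0.29.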