# BioBERT for NER
To train an NER model with BioBERT-v1.1 (base), run the command below.
<br>
Before running it, make sure you have generated the pre-processed dataset with `generate_data.py`, using the command given in the parent directory.

## Additional Requirements
- seqeval: used for NER evaluation (`pip install seqeval`)

## Training
```
# Output and input locations
export SAVE_DIR=./output
export DATA_DIR=./dataset

# Training hyperparameters
export MAX_LENGTH=128
export BATCH_SIZE=16
export NUM_EPOCHS=5
export SAVE_STEPS=1000
export SEED=0

# Note: the checkpoint below is the large variant. For the base model,
# the corresponding Hugging Face ID is dmis-lab/biobert-base-cased-v1.1.
python run_ner.py \
    --data_dir ${DATA_DIR}/ \
    --labels ${DATA_DIR}/labels.txt \
    --model_name_or_path dmis-lab/biobert-large-cased-v1.1 \
    --output_dir ${SAVE_DIR}/ \
    --max_seq_length ${MAX_LENGTH} \
    --num_train_epochs ${NUM_EPOCHS} \
    --per_device_train_batch_size ${BATCH_SIZE} \
    --save_steps ${SAVE_STEPS} \
    --seed ${SEED} \
    --do_train \
    --do_eval \
    --do_predict \
    --overwrite_output_dir
```
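
Before launching the command above, it can help to confirm that the data directory holds the files the command expects. The sketch below is a hypothetical helper, not part of this repository: `missing_inputs` is an illustrative name, and `labels.txt` is the only file name taken from the command. Any other names you pass in (for example the split files written by `generate_data.py`) are your own assumption about its output.

```python
from pathlib import Path
from typing import Iterable, List


def missing_inputs(data_dir: str, required: Iterable[str] = ("labels.txt",)) -> List[str]:
    """Return the names in `required` that are not regular files inside `data_dir`."""
    root = Path(data_dir)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    # "./dataset" mirrors DATA_DIR in the training command.
    missing = missing_inputs("./dataset")
    if missing:
        print("Missing from ./dataset:", ", ".join(missing))
    else:
        print("All required input files found.")
```

An empty list means every listed file is present; anything else names the files to regenerate first.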

## Results
| Entity | precision | recall | f1-score |
|:---:|:---:|:---:|:---:|
| ADE | 0.6351 | 0.5680 | 0.5997 |
| Dosage | 0.9254 | 0.9485 | 0.9368 |
| Drug | 0.9580 | 0.9542 | 0.9561 |
| Duration | 0.8119 | 0.9021 | 0.8546 |
| Form | 0.9546 | 0.9456 | 0.9501 |
| Frequency | 0.9707 | 0.9668 | 0.9688 |
| Reason | 0.7203 | 0.7348 | 0.7275 |
| Route | 0.9530 | 0.9525 | 0.9527 |
| Strength | 0.9807 | 0.9846 | 0.9827 |
| micro avg | 0.9327 | 0.9330 | 0.9328 |
| macro avg | 0.9253 | 0.9225 | 0.9230 |
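
As a sanity check on the per-entity rows, the sketch below recomputes each F1 score as the harmonic mean of the reported precision and recall. The numbers are copied from the table; the helper name `f1_from` is just illustrative. The two average rows are left out because they depend on per-entity support counts, which the table does not list.

```python
# (precision, recall, f1) per entity type, copied from the results table.
reported = {
    "ADE": (0.6351, 0.5680, 0.5997),
    "Dosage": (0.9254, 0.9485, 0.9368),
    "Drug": (0.9580, 0.9542, 0.9561),
    "Duration": (0.8119, 0.9021, 0.8546),
    "Form": (0.9546, 0.9456, 0.9501),
    "Frequency": (0.9707, 0.9668, 0.9688),
    "Reason": (0.7203, 0.7348, 0.7275),
    "Route": (0.9530, 0.9525, 0.9527),
    "Strength": (0.9807, 0.9846, 0.9827),
}


def f1_from(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Recompute F1 from the rounded precision and recall values.
recomputed = {name: f1_from(p, r) for name, (p, r, _) in reported.items()}

# Largest absolute difference between recomputed and reported F1.
max_gap = max(abs(recomputed[name] - f1) for name, (_, _, f1) in reported.items())

for name, value in recomputed.items():
    print(f"{name:>9}: recomputed F1 = {value:.4f}, reported = {reported[name][2]:.4f}")
print(f"largest gap: {max_gap:.6f}")
```

The recomputed values agree with the table to within rounding of the four-decimal inputs (the largest gap stays below 1e-4).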