Diff of /biobert_re/README.md [000000] .. [1de6ed]

--- a
+++ b/biobert_re/README.md
@@ -0,0 +1,81 @@
+# BioBERT for RE
+To train a relation extraction (RE) model with BioBERT-v1.1 (base), run the command in the Training section below.
+<br>
+Before running it, make sure you have generated the pre-processed dataset with `generate_data.py`, using the command given in the parent directory.
+
+## Additional Requirements
+- scikit-learn: used for RE evaluation (`pip install scikit-learn`)
+- pandas: used for RE evaluation (`pip install pandas`)
+
+## Training
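+```bash
+# Output directory (passed to --output_dir) and pre-processed dataset directory (passed to --data_dir)
+export SAVE_DIR=./output
+export DATA_DIR=./dataset
+
+# Hyperparameters, passed to the matching run_re.py flags below
+export MAX_LENGTH=128
+export BATCH_SIZE=8
+export NUM_EPOCHS=3
+export SAVE_STEPS=1000
+export SEED=1
+export LEARNING_RATE=5e-5
+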
+python run_re.py \
+    --task_name ehr-re \
+    --config_name bert-base-cased \
+    --data_dir ${DATA_DIR} \
+    --model_name_or_path dmis-lab/biobert-base-cased-v1.1 \
+    --max_seq_length ${MAX_LENGTH} \
+    --num_train_epochs ${NUM_EPOCHS} \
+    --per_device_train_batch_size ${BATCH_SIZE} \
+    --save_steps ${SAVE_STEPS} \
+    --seed ${SEED} \
+    --do_train \
+    --do_eval \
+    --do_predict \
+    --learning_rate ${LEARNING_RATE} \
+    --output_dir ${SAVE_DIR} \
+    --overwrite_output_dir
+```
+
+## Results
+#### With gold standard entities
+| Relation | precision | recall | f1-score |
+|:---:|:---:|:---:|:---:|
+| Strength -> Drug | 0.9854 | 0.9691 | 0.9772 |
+| Dosage -> Drug | 0.9798 | 0.9725 | 0.9762 |
+| Duration -> Drug | 0.9229 | 0.8991 | 0.9108 |
+| Frequency -> Drug | 0.9782 | 0.9348 | 0.9560 |
+| Form -> Drug | 0.9887 | 0.9829 | 0.9858 |
+| Route -> Drug | 0.9668 | 0.9605 | 0.9636 |
+| Reason -> Drug | 0.7623 | 0.8801 | 0.8169 |
+| ADE -> Drug | 0.8601 | 0.8049 | 0.8316 |
+| micro avg | 0.9395 | 0.9455 | 0.9425 |
+| macro avg | 0.9303 | 0.9341 | 0.9296 |
+
+#### With entities predicted using the BioBERT NER model (end-to-end results)
+| Relation | precision | recall | f1-score |
+|:---:|:---:|:---:|:---:|
+| Strength -> Drug | 0.9672 | 0.9526 | 0.9599 |
+| Dosage -> Drug | 0.8995 | 0.9232 | 0.9112 |
+| Duration -> Drug | 0.7545 | 0.7934 | 0.7735 |
+| Frequency -> Drug | 0.9450 | 0.8607 | 0.9009 |
+| Form -> Drug | 0.9443 | 0.9300 | 0.9371 |
+| Route -> Drug | 0.9213 | 0.9148 | 0.9181 |
+| Reason -> Drug | 0.5531 | 0.6370 | 0.5921 |
+| ADE -> Drug | 0.5419 | 0.4584 | 0.4967 |
+| micro avg | 0.8600 | 0.8593 | 0.8596 |
+| macro avg | 0.8406 | 0.8345 | 0.8340 |
+
+#### With entities predicted using the BiLSTM+CRF NER model
+| Relation | precision | recall | f1-score |
+|:---:|:---:|:---:|:---:|
+| Strength -> Drug | 0.7008 | 0.8475 | 0.7672 |
+| Dosage -> Drug | 0.6418 | 0.8497 | 0.7313 |
+| Duration -> Drug | 0.6244 | 0.6244 | 0.6244 |
+| Frequency -> Drug | 0.6446 | 0.7643 | 0.6993 |
+| Form -> Drug | 0.7006 | 0.8727 | 0.7772 |
+| Route -> Drug | 0.6502 | 0.8082 | 0.7206 |
+| Reason -> Drug | 0.4455 | 0.3821 | 0.4114 |
+| ADE -> Drug | 0.1143 | 0.4829 | 0.1849 |
+| micro avg | 0.5900 | 0.7491 | 0.6601 |
+| macro avg | 0.5713 | 0.6918 | 0.6149 |
\ No newline at end of file
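
The results tables in this README report per-relation precision, recall, and F1 along with micro and macro averages. The README lists scikit-learn for RE evaluation, and `sklearn.metrics.classification_report` computes these same quantities. The sketch below uses plain Python only to make the definitions explicit. It is not the repository's evaluation code: the `prf_report` helper, the `No-Relation` negative label, the toy gold/predicted lists, and the choice to exclude the negative label from the averages are all assumptions made for illustration.

```python
from collections import Counter


def prf_report(y_true, y_pred, labels):
    """Per-label precision/recall/F1 plus micro and macro averages.

    Only the labels listed in `labels` are scored; any other label
    (here a hypothetical negative class) still produces false
    positives and false negatives for the scored labels.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            if gold in labels:
                tp[gold] += 1
        else:
            if pred in labels:
                fp[pred] += 1  # predicted this relation, but it was wrong
            if gold in labels:
                fn[gold] += 1  # missed a gold relation

    def scores(t, p, n):
        precision = t / (t + p) if t + p else 0.0
        recall = t / (t + n) if t + n else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        return precision, recall, f1

    report = {label: scores(tp[label], fp[label], fn[label]) for label in labels}
    # Micro average: pool the counts over all scored labels, then score once.
    report["micro avg"] = scores(
        sum(tp[l] for l in labels),
        sum(fp[l] for l in labels),
        sum(fn[l] for l in labels),
    )
    # Macro average: unweighted mean of the per-label scores.
    report["macro avg"] = tuple(
        sum(report[l][i] for l in labels) / len(labels) for i in range(3)
    )
    return report


# Toy data (hypothetical): one gold and one predicted label per candidate pair.
labels = ["Strength -> Drug", "ADE -> Drug"]
y_true = ["Strength -> Drug", "Strength -> Drug", "ADE -> Drug", "No-Relation", "ADE -> Drug"]
y_pred = ["Strength -> Drug", "No-Relation", "ADE -> Drug", "ADE -> Drug", "ADE -> Drug"]

report = prf_report(y_true, y_pred, labels)
for name, (precision, recall, f1) in report.items():
    print(f"{name:>18}  {precision:.4f}  {recall:.4f}  {f1:.4f}")
```

The micro average pools counts across relations, so frequent relations dominate it, while the macro average weights every relation equally. That difference is why a weak class such as `ADE -> Drug` lowers the macro row more than the micro row in the tables above.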