# BRAT Connector

BRAT is currently the only supported in-text annotation editor at EDS. BRAT annotations are in the [standoff format](https://brat.nlplab.org/standoff.html). Consider the following document:

```
Le patient est admis pour une pneumopathie au coronavirus.
On lui prescrit du paracétamol.
```

It could be annotated as follows:

```
T1 Patient 3 10 patient
T2 Disease 30 57 pneumopathie au coronavirus
T3 Drug 78 89 paracétamol
```

The BRAT connector converts standoff annotations into an annotated spaCy document:

```{ .python .no-check }
import edsnlp
from edsnlp.connectors.brat import BratConnector

# Instantiate the connector
brat = BratConnector("path/to/brat")

# Instantiate the spaCy pipeline
nlp = edsnlp.blank("eds")

# Convert all BRAT files to a list of documents
docs = brat.brat2docs(nlp)
doc = docs[0]

doc.ents
# Out: [patient, pneumopathie au coronavirus, paracétamol]

doc.ents[0].label_
# Out: Patient
```
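
To make the conversion more concrete, here is a minimal sketch of what reading a text-bound standoff line involves. It is not the connector's implementation: `TextBound` and `parse_text_bound` are hypothetical names, the sample sentence is invented for illustration, and the parser assumes a single contiguous span per annotation. Per the standoff specification, the start offset is the 0-based index of the first character of the span and the end offset is the index of the first character after it.

```python
from typing import NamedTuple


class TextBound(NamedTuple):
    """One text-bound ("T") annotation from a BRAT .ann file."""

    id: str
    label: str
    start: int  # 0-based index of the first character of the span
    end: int  # index of the first character after the span
    text: str


def parse_text_bound(line: str) -> TextBound:
    # Fields: ID, label, start, end, covered text. Splitting on any
    # whitespace (at most four times) accepts tabs and spaces alike.
    # Assumption: one contiguous span (no "start end;start end" fragments).
    ann_id, label, start, end, text = line.rstrip("\n").split(None, 4)
    return TextBound(ann_id, label, int(start), int(end), text)


# Hypothetical sample, not taken from a real corpus
text = "Prescription de paracétamol."
ann = parse_text_bound("T1\tDrug 16 27\tparacétamol")

# The offsets slice the raw text back to the annotated string
assert text[ann.start:ann.end] == ann.text
print(ann)
```

The final assertion is the useful sanity check when debugging an annotation file: if slicing the raw text with the offsets does not return the covered text, the `.txt` and `.ann` files are out of sync.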

The connector also works in the other direction, exporting spaCy documents to BRAT, which enables pre-annotation and a rudimentary form of active learning.
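
As a rough illustration of the export direction, the sketch below renders character spans as text-bound standoff lines. It is an assumption-laden stand-in, not the connector's own code: `to_standoff` and the sample data are hypothetical, and spans are assumed to be contiguous and not to cross line breaks. With spaCy, such triples could be read from each entity's `label_`, `start_char` and `end_char` attributes.

```python
from typing import Iterable, List, Tuple

# (label, start, end) with 0-based start and exclusive end, as in standoff
Span = Tuple[str, int, int]


def to_standoff(text: str, spans: Iterable[Span]) -> List[str]:
    """Render character spans as BRAT text-bound annotation lines."""
    lines = []
    for i, (label, start, end) in enumerate(spans, start=1):
        covered = text[start:end]
        # Layout: ID <TAB> label start end <TAB> covered text
        lines.append(f"T{i}\t{label} {start} {end}\t{covered}")
    return lines


# Hypothetical sample text and model prediction
text = "Prescription de paracétamol."
predicted = [("Drug", 16, 27)]
print("\n".join(to_standoff(text, predicted)))
```

Writing model predictions in this form next to the raw text lets annotators correct pre-filled entities in BRAT instead of starting from an empty document.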