# BRAT and Standoff

??? abstract "TLDR"

    ```{ .python .no-check }
    import edsnlp

    # `path` and `nlp` are assumed to be defined elsewhere
    stream = edsnlp.data.read_standoff(path)
    # Apply the pipeline to each document
    stream = stream.map_pipeline(nlp)
    # Write the results in standoff format
    res = stream.write_standoff(path)
    # or equivalently
    edsnlp.data.write_standoff(stream, path)
    ```

You can easily integrate [BRAT](https://brat.nlplab.org/) into your project by using EDS-NLP's BRAT reader and writer.

BRAT annotations are in the [standoff format](https://brat.nlplab.org/standoff.html). Consider the following document:

```{ title="doc.txt" }
Le patient est admis pour une pneumopathie au coronavirus.
On lui prescrit du paracétamol.
```

BRAT annotations are stored in a separate file, where each line gives an identifier, a label, the start and end character offsets in `doc.txt` (zero-based, end exclusive) and the annotated text:

```{ title="doc.ann" }
T1 Patient 3 10 patient
T2 Disease 30 57 pneumopathie au coronavirus
T3 Drug 78 89 paracétamol
```
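
To make the layout of such a line concrete, here is a minimal sketch of a parser for text-bound (`T`) annotations, written in plain Python. It is an illustration only and not how EDS-NLP reads these files: `TextBound` and `parse_text_bound` are hypothetical names, the sample sentence is made up, and the sketch assumes tab-separated fields (as in the BRAT standoff specification) and contiguous spans only.

```python
from typing import NamedTuple


class TextBound(NamedTuple):
    """One text-bound annotation (a `T` line) from a .ann file."""

    id: str
    label: str
    start: int
    end: int
    text: str


def parse_text_bound(line: str) -> TextBound:
    # Fields are tab-separated: ID, "label start end", covered text.
    ann_id, span, text = line.rstrip("\n").split("\t", 2)
    # Assumption: a single contiguous span (no "start end;start end" fragments).
    label, start, end = span.split(" ")
    return TextBound(ann_id, label, int(start), int(end), text)


# Hypothetical document and annotation, not taken from a real corpus.
doc_text = "Fever treated with aspirin."
ann = parse_text_bound("T1\tDrug 19 26\taspirin")

# Offsets are zero-based and the end offset is exclusive,
# so slicing the document text returns the annotated span.
assert doc_text[ann.start:ann.end] == ann.text
```

Keeping the covered text next to the offsets makes it cheap to detect files whose offsets have drifted from the `.txt` content, which is what the final assertion checks.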

## Reading Standoff files {: #edsnlp.data.standoff.read_standoff }

::: edsnlp.data.standoff.read_standoff
    options:
        heading_level: 3
        show_source: false
        show_toc: false
        show_bases: false

## Writing Standoff files {: #edsnlp.data.standoff.write_standoff }

::: edsnlp.data.standoff.write_standoff
    options:
        heading_level: 3
        show_source: false
        show_toc: false
        show_bases: false
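
To give an idea of what a standoff writer has to produce, the following plain-Python sketch serializes a list of spans into `.ann` lines. It does not use or reproduce EDS-NLP's `write_standoff`: the `to_standoff` function, the `(label, start, end)` tuple layout and the sample sentence are all assumptions made for illustration, and only contiguous text-bound annotations are covered.

```python
from typing import Iterable, Tuple

# Hypothetical span representation: (label, start, end), end exclusive.
Span = Tuple[str, int, int]


def to_standoff(text: str, spans: Iterable[Span]) -> str:
    """Serialize spans as text-bound (`T`) standoff lines."""
    lines = []
    for i, (label, start, end) in enumerate(spans, start=1):
        # ID, "label start end" and covered text are tab-separated.
        lines.append(f"T{i}\t{label} {start} {end}\t{text[start:end]}")
    # One annotation per line; an empty span list yields an empty file.
    return "".join(line + "\n" for line in lines)


# Made-up example document and spans.
text = "Fever treated with aspirin."
ann_content = to_standoff(text, [("Symptom", 0, 5), ("Drug", 19, 26)])
print(ann_content, end="")
```

The covered text is copied from the document using the offsets, so the `.ann` content stays consistent with the `.txt` file it refers to.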