# LabelTool Connector

LabelTool is an in-house module enabling rapid annotation of pre-extracted entities.

We provide a ready-to-use function, `docs2labeltool`, that converts a list of annotated spaCy documents into a `pandas` DataFrame that LabelTool can read.

```python
import edsnlp, edsnlp.pipes as eds

from edsnlp.connectors.labeltool import docs2labeltool

corpus = [
    "Ceci est un document médical.",
    "Le patient n'est pas malade.",
]

# Instantiate the pipeline
nlp = edsnlp.blank("fr")
nlp.add_pipe(eds.sentences())
nlp.add_pipe(eds.matcher(terms=dict(medical="médical", malade="malade")))
nlp.add_pipe(eds.negation())

# Apply the pipeline to every text in the corpus
docs = nlp.pipe(corpus)

# Convert the annotated documents into a LabelTool-readable DataFrame,
# exporting the `negation` extension as an extra column
df = docs2labeltool(docs, extensions=["negation"])
```

The results:

| note_id | note_text                     | start | end | label   | lexical_variant | negation |
| ------- | ----------------------------- | ----- | --- | ------- | --------------- | -------- |
| 0       | Ceci est un document médical. | 21    | 28  | medical | médical         | False    |
| 1       | Le patient n'est pas malade.  | 21    | 27  | malade  | malade          | True     |
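
To make the column semantics concrete, here is a minimal sketch that checks the table by hand. It assumes that `start` and `end` are character offsets into `note_text` with an exclusive end, which is consistent with the two rows above but is not stated explicitly. The rows are copied manually into plain dictionaries so the snippet runs without EDS-NLP; with a real DataFrame, something like `df.to_dict("records")` should yield the same shape. The helper `span_text` is a hypothetical name used only for illustration.

```python
# Rows copied by hand from the table above (not produced by docs2labeltool here).
rows = [
    {
        "note_id": 0,
        "note_text": "Ceci est un document médical.",
        "start": 21,
        "end": 28,
        "label": "medical",
        "lexical_variant": "médical",
        "negation": False,
    },
    {
        "note_id": 1,
        "note_text": "Le patient n'est pas malade.",
        "start": 21,
        "end": 27,
        "label": "malade",
        "lexical_variant": "malade",
        "negation": True,
    },
]


def span_text(row):
    """Slice the note text using the row's offsets (assumed end-exclusive)."""
    return row["note_text"][row["start"]:row["end"]]


# Notes whose offsets do not reproduce the recorded lexical variant
mismatches = [r["note_id"] for r in rows if span_text(r) != r["lexical_variant"]]

# Lexical variants of the entities flagged as negated
negated = [r["lexical_variant"] for r in rows if r["negation"]]

print(mismatches)  # → []
print(negated)     # → ['malade']
```

A check of this kind can be useful before loading the export into LabelTool, since misaligned offsets would otherwise only show up as misplaced highlights during annotation.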