Diff of /docs/index.md [000000] .. [cad161]

# Getting started

EDS-NLP is a collaborative NLP framework that aims at extracting information from French clinical notes.
At its core, it is a collection of components or pipes, either rule-based functions or
deep learning modules. These components are organized into a novel, efficient and modular pipeline system, built for hybrid and multitask models. We use [spaCy](https://spacy.io) to represent documents and their annotations, and [PyTorch](https://pytorch.org/) as a deep-learning backend for trainable components.

EDS-NLP is versatile and can be used on any textual document. The rule-based components are fully compatible with spaCy's pipelines, and vice versa. This library is a product of collaborative effort, and we encourage further contributions to enhance its capabilities.

Check out our interactive [demo](https://aphp.github.io/edsnlp/demo/)!

## Quick start

### Installation

You can install EDS-NLP via `pip`. We recommend pinning the library version in your projects, or using a strict package manager like [Poetry](https://python-poetry.org/).

```{: data-md-color-scheme="slate" }
pip install edsnlp==0.17.0
```

Or, if you want to use the trainable components (which rely on PyTorch):

```{: data-md-color-scheme="slate" }
pip install "edsnlp[ml]==0.17.0"
```
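
Since pinning is recommended, a script can check at start-up that the installed version matches the pin. The sketch below is one possible way to do this with the standard library's `importlib.metadata`. The helpers `installed_version` and `check_pinned_version` are hypothetical names introduced for illustration and are not part of EDS-NLP; `0.17.0` is simply the version used in the commands above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def check_pinned_version(package: str, expected: str) -> bool:
    """Hypothetical helper: True only if `package` is installed at exactly `expected`."""
    return installed_version(package) == expected


if __name__ == "__main__":
    found = installed_version("edsnlp")
    if found is None:
        print("edsnlp is not installed")
    elif not check_pinned_version("edsnlp", "0.17.0"):
        print(f"edsnlp {found} is installed, but 0.17.0 was expected")
    else:
        print("edsnlp 0.17.0 is installed")
```

A strict package manager with a lock file makes this kind of runtime check unnecessary, so treat it as a lightweight safeguard for `pip`-only setups.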

### A first pipeline

Once you've installed the library, let's begin with a very simple example that extracts mentions of COVID19 in a text, and detects whether they are negated.

```python
import edsnlp, edsnlp.pipes as eds

nlp = edsnlp.blank("eds")  # (1)

terms = dict(
    covid=["covid", "coronavirus"],  # (2)
)

# Sentencizer component, needed for negation detection
nlp.add_pipe(eds.sentences())  # (3)
# Matcher component
nlp.add_pipe(eds.matcher(terms=terms))  # (4)
# Negation detection
nlp.add_pipe(eds.negation())

# Process your text in one call!
doc = nlp("Le patient n'est pas atteint de covid")

doc.ents  # (5)
# Out: (covid,)

doc.ents[0]._.negation  # (6)
# Out: True
```

1. 'eds' is the name of the language, which defines the [tokenizer](/tokenizers).
2. This example terminology provides a very simple, and by no means exhaustive, list of synonyms for COVID19.
3. Similarly to spaCy, pipes are added via the [`nlp.add_pipe` method](https://spacy.io/api/language#add_pipe).
4. See the [matching tutorial](tutorials/matching-a-terminology.md) for more details.
5. spaCy stores extracted entities in the [`Doc.ents` attribute](https://spacy.io/api/doc#ents).
6. The `eds.negation` component adds a `negation` custom attribute.

This example is complete and should run as-is.
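
To inspect more than one entity at a time, the results can be gathered into plain Python records. The sketch below is an illustration and not part of the library: `ents_to_rows` is a hypothetical helper that relies only on the attributes used above (`doc.ents` and `ent._.negation`) plus spaCy's `Span.text` and `Span.label_`. So that it runs without EDS-NLP installed, it is exercised on a stand-in object that mimics the document from the example; with the real pipeline you would pass `doc` directly. It assumes the matcher labels each entity with its key in `terms` (here `covid`).

```python
from types import SimpleNamespace


def ents_to_rows(doc):
    """Hypothetical helper: one dict per entity, with its text, label and negation flag."""
    return [
        {"text": ent.text, "label": ent.label_, "negation": ent._.negation}
        for ent in doc.ents
    ]


# Stand-in for the `doc` built above (assumption: the real entity exposes the same attributes).
fake_ent = SimpleNamespace(
    text="covid",
    label_="covid",
    _=SimpleNamespace(negation=True),
)
fake_doc = SimpleNamespace(ents=(fake_ent,))

rows = ents_to_rows(fake_doc)
print(rows)
# Out: [{'text': 'covid', 'label': 'covid', 'negation': True}]
```

Keeping the output as a list of dictionaries makes it easy to hand over to a table library or to serialize as JSON.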

## Tutorials

To learn more about EDS-NLP, we have prepared a series of tutorials that should cover the main features of the library.

--8<-- "docs/tutorials/index.md:tutorials"

## Available pipeline components

--8<-- "docs/pipes/index.md:components"

## Disclaimer

The performance of an extraction pipeline may depend on the population and documents that are considered.

## Contributing to EDS-NLP

We welcome contributions! Fork the project and propose a pull request.
Take a look at the [dedicated page](https://aphp.github.io/edsnlp/latest/contributing/) for details.

## Citation

If you use EDS-NLP, please cite us as below.

```bibtex
@misc{edsnlp,
  author = {Wajsburt, Perceval and Petit-Jean, Thomas and Dura, Basile and Cohen, Ariel and Jean, Charline and Bey, Romain},
  doi    = {10.5281/zenodo.6424993},
  title  = {EDS-NLP: efficient information extraction from French clinical notes},
  url    = {https://aphp.github.io/edsnlp}
}
```