---
jupyter:
  jupytext:
    formats: md,ipynb
    main_language: python
    text_representation:
      extension: .md
      format_name: markdown
      format_version: '1.3'
      jupytext_version: 1.13.8
  kernelspec:
    display_name: 'Python 3.9.5 64-bit (''.env'': venv)'
    name: python3
---


import context  # presumably a local helper module for the notebooks; keep it first
from datetime import datetime

import spacy
from edsnlp.pipelines.misc.dates import Dates, terms

Date detection

text = (
    "Le patient est arrivé le 23 août (23/08/2021). "
    "Il dit avoir eu mal au ventre hier. "
    "L'année dernière, on lui avait prescrit du doliprane."
)
nlp = spacy.blank('fr')
doc = nlp(text)
dates = Dates(
    nlp,
    absolute=terms.absolute,
    relative=terms.relative,
    no_year=terms.no_year,
)
dates(doc)
doc.spans
print(f"{'expression':<20}  label")
print(f"{'----------':<20}  -----")

for span in doc.spans['dates']:
    print(f"{span.text:<20}  {span._.date}")

When the document date is unknown, the label of a relative date (such as "hier" or "il y a quinze jours", i.e. yesterday or two weeks ago) becomes TD±<number-of-days>.

If the note_datetime extension is set:

doc._.note_datetime = datetime(2020, 10, 10)
dates(doc)
print(f"{'expression':<20}  label")
print(f"{'----------':<20}  -----")

for span in doc.spans['dates']:
    print(f"{span.text:<20}  {span._.date}")
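The effect of setting the note date on a relative label can be sketched independently of the pipeline. The snippet below is a minimal illustration, not edsnlp's implementation. The helper name `resolve_relative_label` is hypothetical, and the ISO `YYYY-MM-DD` output format is an assumption made for the example. It parses a `TD±<number-of-days>` label and, when a note date is available, shifts that date by the given number of days.

```python
import re
from datetime import datetime, timedelta
from typing import Optional

# Matches labels such as "TD-1" or "TD+15" (hypothetical parser for illustration).
RELATIVE_LABEL = re.compile(r"TD([+-])(\d+)")


def resolve_relative_label(label: str, note_datetime: Optional[datetime] = None) -> str:
    """Turn a TD±<days> label into a calendar date when the note date is known.

    Labels that are not relative, or any label when no note date is given,
    are returned unchanged.
    """
    match = RELATIVE_LABEL.fullmatch(label)
    if match is None or note_datetime is None:
        return label
    sign = 1 if match.group(1) == "+" else -1
    resolved = note_datetime + timedelta(days=sign * int(match.group(2)))
    # Assumed output format for this sketch: ISO date.
    return resolved.strftime("%Y-%m-%d")


print(resolve_relative_label("TD-1"))                          # TD-1
print(resolve_relative_label("TD-1", datetime(2020, 10, 10)))  # 2020-10-09
```

Without a note date the label stays symbolic, so a relative offset can still be compared across documents. With one, it becomes an ordinary date.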