Diff of /docs/how-to/examples.md [000000] .. [e988c2]

Switch to unified view

a b/docs/how-to/examples.md
1
## How to use this page
2
3
You can either read this page from start to end
4
to get an idea of the kinds of queries you can make with ehrQL.
5
6
Or you can use the navigation bar at the top-right of this page,
7
to see a list of the examples,
8
and then jump to a specific example of interest.
9
10
The examples are organised firstly by the table which they pull data from -
11
for a more complete guide to the tables, refer to the
12
[Table Schemas](../reference/schemas.md) section of the
13
ehrQL documentation.
14
15
## Understanding these examples
16
17
### The populations defined with `define_population()`
18
19
In each of these examples,
20
we specify that the population is **all patients**
21
via `dataset.define_population(patients.exists_for_patient())`.
22
23
In practice,
24
you will likely want to adapt an example to filter to a specific population of interest.
25
Refer to the [`define_population()` documentation](../reference/language.md#Dataset.define_population).
26
27
### Some examples using `codelist_from_csv()`
28
29
:warning: Some examples refer to CSV codelists using the
30
`codelist_from_csv` function,
31
but are incomplete.
32
To actually use these code example,
33
you will need to correctly complete the function call.
34
The codelists are not provided as a part of these examples.
35
36
For example, instead of:
37
38
```python
39
asthma_codelist = codelist_from_csv("XXX", column="YYY")
40
```
41
42
you will need a line more like:
43
44
```python
45
asthma_codelist = codelist_from_csv("your-asthma-codelist.csv", column="code")
46
```
47
48
which provides the filename `your-asthma-codelist.csv`
49
and the name of the CSV column with codes.
50
51
#### Using codelists with category columns
52
53
Some codelists will have a category column that groups individual codes into categories. For example, [this codelist for ethnicity](https://www.opencodelists.org/codelist/opensafely/ethnicity-snomed-0removed/2e641f61/) has 2 category columns, which represent categories at both 6 and 16 levels. To make use of these categories, you can use `codelist_from_csv()` as follows:
54
55
```python
56
ethnicity_codelist = codelist_from_csv("ethnicity_codelist_with_categories", column="snomedcode", category_column="Grouping_6")
57
```
58
59
If you include an argument for `category_column`, the codelist returned will be a *dictionary* mapping individual codes to their respective categories. Without the `category_column` argument, the codelist returned will be a *list* of codes.
60
61
You can see an example of [how to access these categories within your dataset definition ](#finding-each-patients-ethnicity) below.
62
63
## Patients
64
65
Examples for the [patients table](../reference/schemas/core.md#patients).
66
67
### Finding patient demographics
68
69
#### Finding each patient's sex
70
71
```ehrql
72
from ehrql import create_dataset
73
from ehrql.tables.core import patients
74
75
dataset = create_dataset()
76
dataset.sex = patients.sex
77
dataset.define_population(patients.exists_for_patient())
78
```
79
80
The possible values are "female", "male", "intersex", and "unknown".
81
82
#### Finding each patient's date of birth
83
84
```ehrql
85
from ehrql import create_dataset
86
from ehrql.tables.core import patients
87
88
dataset = create_dataset()
89
dataset.date_of_birth = patients.date_of_birth
90
dataset.define_population(patients.exists_for_patient())
91
```
92
93
#### Finding each patient's age
94
95
```ehrql
96
from ehrql import create_dataset
97
from ehrql.tables.core import patients
98
99
dataset = create_dataset()
100
dataset.age = patients.age_on("2023-01-01")
101
dataset.define_population(patients.exists_for_patient())
102
```
103
104
Alternatively, using a native Python `date`:
105
106
```ehrql
107
from datetime import date
108
from ehrql import create_dataset
109
from ehrql.tables.core import patients
110
111
dataset = create_dataset()
112
dataset.age = patients.age_on(date(2023, 1, 1))
113
dataset.define_population(patients.exists_for_patient())
114
```
115
116
Or using an `index_date` variable:
117
118
```ehrql
119
from ehrql import create_dataset
120
from ehrql.tables.core import patients
121
122
index_date = "2023-01-01"
123
dataset = create_dataset()
124
dataset.age = patients.age_on(index_date)
125
dataset.define_population(patients.exists_for_patient())
126
```
127
128
#### Assigning each patient an age band
129
130
```ehrql
131
from ehrql import create_dataset, case, when
132
from ehrql.tables.core import patients
133
134
dataset = create_dataset()
135
age = patients.age_on("2023-01-01")
136
dataset.age_band = case(
137
        when(age < 20).then("0-19"),
138
        when(age < 40).then("20-39"),
139
        when(age < 60).then("40-59"),
140
        when(age < 80).then("60-79"),
141
        when(age >= 80).then("80+"),
142
        otherwise="missing",
143
)
144
dataset.define_population(patients.exists_for_patient())
145
```
146
147
#### Finding each patient's date of death in their primary care record
148
149
```ehrql
150
from ehrql import create_dataset
151
from ehrql.tables.core import patients
152
153
dataset = create_dataset()
154
dataset.date_of_death = patients.date_of_death
155
dataset.define_population(patients.exists_for_patient())
156
```
157
158
:notepad_spiral: This value comes from the patient's EHR record. You can find more information about the accuracy of this value in the [reference schema](../reference/schemas/core.md#recording-of-death-in-primary-care).
159
160
## ONS Deaths
161
162
Examples for the [ons_deaths table](../reference/schemas/core.md#ons_deaths).
163
164
### Finding patient demographics
165
166
#### Finding each patient's date, underlying_cause_of_death, and first noted additional medical condition noted on the death certificate from ONS records
167
168
```ehrql
169
from ehrql import create_dataset
170
from ehrql.tables.core import ons_deaths, patients
171
172
dataset = create_dataset()
173
dataset.date_of_death = ons_deaths.date
174
dataset.underlying_cause_of_death = ons_deaths.underlying_cause_of_death
175
dataset.cause_of_death = ons_deaths.cause_of_death_01
176
dataset.define_population(patients.exists_for_patient())
177
```
178
179
:notepad_spiral: There are currently [multiple](https://github.com/opensafely-core/ehrql/blob/d29ff8ab2cebf3522258c408f8225b7a76f7b6f2/ehrql/tables/beta/core.py#L78-L92) cause of death fields. We aim to resolve these to a single feature in the future.
180
181
182
#### Finding patients with a particular cause of death
183
184
The `ons_deaths` table has multiple "cause of death" fields. Using the
185
[`cause_of_death_is_in()`](../reference/schemas/core.md#ons_deaths.cause_of_death_is_in)
186
method we can match a codelist against all of these at once.
187
188
```ehrql
189
from ehrql import create_dataset, codelist_from_csv
190
from ehrql.tables.core import ons_deaths, patients
191
192
dataset = create_dataset()
193
194
cause_of_death_X_codelist = codelist_from_csv("XXX", column="YYY")
195
196
dataset.died_with_X = ons_deaths.cause_of_death_is_in(cause_of_death_X_codelist)
197
dataset.define_population(patients.exists_for_patient())
198
```
199
200
## Addresses
201
202
Examples for the [TPP addresses table](../reference/schemas/tpp.md#addresses).
203
204
### Finding attributes related to each patient's address as of a given date
205
206
#### Finding each patient's IMD rank
207
208
```ehrql
209
from ehrql import create_dataset
210
from ehrql.tables.tpp import addresses, patients
211
212
dataset = create_dataset()
213
dataset.imd = addresses.for_patient_on("2023-01-01").imd_rounded
214
dataset.define_population(patients.exists_for_patient())
215
```
216
217
The original IMD ranking is rounded to the nearest 100.
218
The rounded IMD ranking ranges from 0 to 32,800.
219
220
See [this code comment](https://github.com/opensafely-core/ehrql/blob/d29ff8ab2cebf3522258c408f8225b7a76f7b6f2/ehrql/tables/beta/tpp.py#L117-L123) about how we choose one address if a patient has multiple registered addresses on the given date.
221
222
#### Calculating each patient's IMD quintile and/or decile
223
224
```ehrql
225
from ehrql import create_dataset
226
from ehrql.tables.tpp import addresses, patients
227
228
dataset = create_dataset()
229
230
patient_address = addresses.for_patient_on("2023-01-01")
231
dataset.imd_quintile = patient_address.imd_quintile
232
dataset.imd_decile = patient_address.imd_decile
233
dataset.define_population(patients.exists_for_patient())
234
```
235
236
#### Finding each patient's rural/urban classification
237
238
```ehrql
239
from ehrql import create_dataset
240
from ehrql.tables.tpp import addresses, patients
241
242
dataset = create_dataset()
243
dataset.rural_urban = addresses.for_patient_on("2023-01-01").rural_urban_classification
244
dataset.define_population(patients.exists_for_patient())
245
```
246
247
The meaning of this value is as follows:
248
249
* 1 - Urban major conurbation
250
* 2 - Urban minor conurbation
251
* 3 - Urban city and town
252
* 4 - Urban city and town in a sparse setting
253
* 5 - Rural town and fringe
254
* 6 - Rural town and fringe in a sparse setting
255
* 7 - Rural village and dispersed
256
* 8 - Rural village and dispersed in a sparse setting
257
258
#### Finding each patient's MSOA
259
260
```ehrql
261
from ehrql import create_dataset
262
from ehrql.tables.tpp import addresses, patients
263
264
dataset = create_dataset()
265
dataset.msoa_code = addresses.for_patient_on("2023-01-01").msoa_code
266
dataset.define_population(patients.exists_for_patient())
267
```
268
269
#### Finding multiple attributes of each patient's address
270
271
```ehrql
272
from ehrql import create_dataset
273
from ehrql.tables.tpp import addresses, patients
274
275
dataset = create_dataset()
276
address = addresses.for_patient_on("2023-01-01")
277
dataset.imd_rounded = address.imd_rounded
278
dataset.rural_urban_classification = address.rural_urban_classification
279
dataset.msoa_code = address.msoa_code
280
dataset.define_population(patients.exists_for_patient())
281
```
282
283
## Practice Registrations
284
285
Examples for the [practice_registrations table](../reference/schemas/core.md#practice_registrations).
286
287
### Finding attributes related to each patient's GP practice as of a given date
288
289
#### Finding each patient's practice's pseudonymised identifier
290
291
```ehrql
292
from ehrql import create_dataset
293
from ehrql.tables.tpp import practice_registrations, patients
294
295
dataset = create_dataset()
296
dataset.practice = practice_registrations.for_patient_on("2023-01-01").practice_pseudo_id
297
dataset.define_population(patients.exists_for_patient())
298
```
299
300
#### Finding each patient's practice's STP
301
302
```ehrql
303
from ehrql import create_dataset
304
from ehrql.tables.tpp import practice_registrations, patients
305
306
dataset = create_dataset()
307
dataset.stp = practice_registrations.for_patient_on("2023-01-01").practice_stp
308
dataset.define_population(patients.exists_for_patient())
309
```
310
311
#### Finding each patient's practice's region
312
313
```ehrql
314
from ehrql import create_dataset
315
from ehrql.tables.tpp import practice_registrations, patients
316
317
dataset = create_dataset()
318
dataset.region = practice_registrations.for_patient_on("2023-01-01").practice_nuts1_region_name
319
dataset.define_population(patients.exists_for_patient())
320
```
321
322
#### Finding multiple attributes of each patient's practice
323
324
```ehrql
325
from ehrql import create_dataset
326
from ehrql.tables.tpp import practice_registrations, patients
327
328
dataset = create_dataset()
329
registration = practice_registrations.for_patient_on("2023-01-01")
330
dataset.practice = registration.practice_pseudo_id
331
dataset.stp = registration.practice_stp
332
dataset.region = registration.practice_nuts1_region_name
333
dataset.define_population(patients.exists_for_patient())
334
```
335
336
#### Excluding patients based on study dates
337
338
The following example ensures that the dataset only includes patients registered at a
339
single practice for the entire duration of the study, plus at least 3 months prior to the
340
study start.
341
342
```ehrql
343
from ehrql import create_dataset, codelist_from_csv, months
344
from ehrql.tables.tpp import patients, practice_registrations
345
346
study_start_date = "2022-01-01"
347
study_end_date = "2022-12-31"
348
349
dataset = create_dataset()
350
351
# find registrations that exist for the full study period, and at least 3 months
352
# prior
353
registrations = (
354
    practice_registrations.where(
355
        practice_registrations.start_date.is_on_or_before(study_start_date - months(3))
356
    )
357
    .except_where(
358
        practice_registrations.end_date.is_on_or_before(study_end_date)
359
    )
360
)
361
362
dataset.define_population(registrations.exists_for_patient())
363
```
364
365
## Clinical Events
366
367
Examples for the [clinical_events table](../reference/schemas/core.md#clinical_events).
368
369
### Finding patient demographics
370
371
#### Finding each patient's ethnicity
372
373
Ethnicity can be defined using a codelist. There are a lot of individual codes that can used to indicate a patients' fine-grained ethnicity. To make analysis more manageable, ethnicity is therefore commonly grouped into higher level categories. Above, we described how you can [import codelists that have a category column](#some-examples-using-codelist_from_csv). You can use a codelist with a category column to map clinical event codes for ethnicity to higher level categories as in this example:
374
375
```ehrql
376
from ehrql import create_dataset
377
from ehrql.tables.core import clinical_events, patients
378
from ehrql import codelist_from_csv
379
380
dataset = create_dataset()
381
382
ethnicity_codelist = codelist_from_csv(
383
    "ethnicity_codelist_with_categories",
384
    column="snomedcode",
385
    category_column="Grouping_6",
386
)
387
388
dataset.latest_ethnicity_code = (
389
    clinical_events.where(clinical_events.snomedct_code.is_in(ethnicity_codelist))
390
    .where(clinical_events.date.is_on_or_before("2023-01-01"))
391
    .sort_by(clinical_events.date)
392
    .last_for_patient()
393
    .snomedct_code
394
)
395
dataset.latest_ethnicity_group = dataset.latest_ethnicity_code.to_category(
396
    ethnicity_codelist
397
)
398
dataset.define_population(patients.exists_for_patient())
399
```
400
401
### Does each patient have an event matching some criteria?
402
403
#### Does each patient have a clinical event matching a code in a codelist?
404
405
```ehrql
406
from ehrql import create_dataset, codelist_from_csv
407
from ehrql.tables.core import clinical_events, patients
408
409
asthma_codelist = codelist_from_csv("XXX", column="YYY")
410
411
dataset = create_dataset()
412
dataset.has_had_asthma_diagnosis = clinical_events.where(
413
        clinical_events.snomedct_code.is_in(asthma_codelist)
414
).exists_for_patient()
415
dataset.define_population(patients.exists_for_patient())
416
```
417
418
#### Does each patient have a clinical event matching a code in a codelist in a time period?
419
420
```ehrql
421
from ehrql import create_dataset, codelist_from_csv
422
from ehrql.tables.core import clinical_events, patients
423
424
asthma_codelist = codelist_from_csv("XXX", column="YYY")
425
426
dataset = create_dataset()
427
dataset.has_recent_asthma_diagnosis = clinical_events.where(
428
        clinical_events.snomedct_code.is_in(asthma_codelist)
429
).where(
430
        clinical_events.date.is_on_or_between("2022-07-01", "2023-01-01")
431
).exists_for_patient()
432
dataset.define_population(patients.exists_for_patient())
433
```
434
435
### What is the first/last event matching some criteria?
436
437
The `first_for_patient()` and `last_for_patient()` methods can only be used on a sorted frame.
438
Frames can be sorted by calling the `sort_by()` method with the column to sort the frame by.
439
440
#### What is the earliest/latest clinical event matching some criteria?
441
442
```ehrql
443
from ehrql import create_dataset, codelist_from_csv
444
from ehrql.tables.core import clinical_events, patients
445
446
asthma_codelist = codelist_from_csv("XXX", column="YYY")
447
448
dataset = create_dataset()
449
dataset.first_asthma_diagnosis_date = clinical_events.where(
450
        clinical_events.snomedct_code.is_in(asthma_codelist)
451
).where(
452
        clinical_events.date.is_on_or_after("2022-07-01")
453
).sort_by(
454
        clinical_events.date
455
).first_for_patient().date
456
dataset.define_population(patients.exists_for_patient())
457
```
458
459
```ehrql
460
from ehrql import create_dataset, codelist_from_csv
461
from ehrql.tables.core import clinical_events, patients
462
463
asthma_codelist = codelist_from_csv("XXX", column="YYY")
464
465
dataset = create_dataset()
466
dataset.last_asthma_diagnosis_date = clinical_events.where(
467
        clinical_events.snomedct_code.is_in(asthma_codelist)
468
).where(
469
        clinical_events.date.is_on_or_after("2022-07-01")
470
).sort_by(
471
        clinical_events.date
472
).last_for_patient().date
473
dataset.define_population(patients.exists_for_patient())
474
```
475
476
#### What is the clinical event, matching some criteria, with the least/greatest value?
477
478
```ehrql
479
from ehrql import create_dataset, codelist_from_csv
480
from ehrql.tables.core import clinical_events, patients
481
482
hba1c_codelist = codelist_from_csv("XXX", column="YYY")
483
484
dataset = create_dataset()
485
486
hba1c_events = clinical_events.where(
487
        clinical_events.snomedct_code.is_in(hba1c_codelist)
488
).where(
489
        clinical_events.date.is_on_or_after("2022-07-01")
490
)
491
492
earliest_min_hba1c_event = hba1c_events.sort_by(
493
        clinical_events.numeric_value, clinical_events.date
494
).first_for_patient()
495
496
earliest_max_hba1c_event = hba1c_events.sort_by(
497
        # Note the leading minus sign to sort numeric_value in reverse order
498
        -clinical_events.numeric_value, clinical_events.date
499
).first_for_patient()
500
501
latest_min_hba1c_event = hba1c_events.sort_by(
502
        # Note the leading minus sign to sort numeric_value in reverse order
503
        -clinical_events.numeric_value, clinical_events.date
504
).last_for_patient()
505
506
latest_max_hba1c_event = hba1c_events.sort_by(
507
        clinical_events.numeric_value, clinical_events.date
508
).last_for_patient()
509
510
dataset.date_of_first_min_hba1c_observed = earliest_min_hba1c_event.date
511
dataset.date_of_first_max_hba1c_observed = earliest_max_hba1c_event.date
512
dataset.date_of_last_min_hba1c_observed = latest_min_hba1c_event.date
513
dataset.date_of_last_max_hba1c_observed = latest_max_hba1c_event.date
514
515
dataset.value_of_first_min_hba1c_observed = earliest_min_hba1c_event.numeric_value
516
dataset.value_of_first_max_hba1c_observed = earliest_max_hba1c_event.numeric_value
517
dataset.value_of_last_min_hba1c_observed = latest_min_hba1c_event.numeric_value
518
dataset.value_of_last_max_hba1c_observed = latest_max_hba1c_event.numeric_value
519
520
dataset.define_population(patients.exists_for_patient())
521
```
522
523
### Getting properties of an event matching some criteria
524
525
#### What is the code of the first/last clinical event matching some criteria?
526
527
```ehrql
528
from ehrql import create_dataset, codelist_from_csv
529
from ehrql.tables.core import clinical_events, patients
530
531
asthma_codelist = codelist_from_csv("XXX", column="YYY")
532
533
dataset = create_dataset()
534
dataset.first_asthma_diagnosis_code = clinical_events.where(
535
        clinical_events.snomedct_code.is_in(asthma_codelist)
536
).where(
537
        clinical_events.date.is_on_or_after("2022-07-01")
538
).sort_by(
539
        clinical_events.date
540
).first_for_patient().snomedct_code
541
dataset.define_population(patients.exists_for_patient())
542
```
543
544
#### What is the date of the first/last clinical event matching some criteria?
545
546
```ehrql
547
from ehrql import create_dataset, codelist_from_csv
548
from ehrql.tables.core import clinical_events, patients
549
550
asthma_codelist = codelist_from_csv("XXX", column="YYY")
551
552
dataset = create_dataset()
553
dataset.first_asthma_diagnosis_date = clinical_events.where(
554
        clinical_events.snomedct_code.is_in(asthma_codelist)
555
).where(
556
        clinical_events.date.is_on_or_after("2022-07-01")
557
).sort_by(
558
        clinical_events.date
559
).first_for_patient().date
560
dataset.define_population(patients.exists_for_patient())
561
```
562
563
#### What is the code and date of the first/last clinical event matching some criteria?
564
565
```ehrql
566
from ehrql import create_dataset, codelist_from_csv
567
from ehrql.tables.core import clinical_events, patients
568
569
asthma_codelist = codelist_from_csv("XXX", column="YYY")
570
571
dataset = create_dataset()
572
first_asthma_diagnosis = clinical_events.where(
573
        clinical_events.snomedct_code.is_in(asthma_codelist)
574
).where(
575
        clinical_events.date.is_on_or_after("2022-07-01")
576
).sort_by(
577
        clinical_events.date
578
).first_for_patient()
579
dataset.first_asthma_diagnosis_code = first_asthma_diagnosis.snomedct_code
580
dataset.first_asthma_diagnosis_date = first_asthma_diagnosis.date
581
dataset.define_population(patients.exists_for_patient())
582
```
583
584
### Performing arithmetic on numeric values of clinical events
585
586
#### Finding the mean observed value of clinical events matching some criteria
587
588
```ehrql
589
from ehrql import create_dataset, codelist_from_csv
590
from ehrql.tables.core import clinical_events, patients
591
592
hba1c_codelist = codelist_from_csv("XXX", column="YYY")
593
594
dataset = create_dataset()
595
dataset.mean_hba1c = clinical_events.where(
596
        clinical_events.snomedct_code.is_in(hba1c_codelist)
597
).where(
598
        clinical_events.date.is_on_or_after("2022-07-01")
599
).numeric_value.mean_for_patient()
600
dataset.define_population(patients.exists_for_patient())
601
```
602
603
### Finding events within a date range
604
605
#### Finding events within a fixed date range
606
607
```ehrql
608
from ehrql import create_dataset, codelist_from_csv
609
from ehrql.tables.core import clinical_events, patients
610
611
asthma_codelist = codelist_from_csv("XXX", column="YYY")
612
613
dataset = create_dataset()
614
dataset.has_recent_asthma_diagnosis = clinical_events.where(
615
        clinical_events.snomedct_code.is_in(asthma_codelist)
616
).where(
617
        clinical_events.date.is_on_or_between("2022-07-01", "2023-01-01")
618
).exists_for_patient()
619
dataset.define_population(patients.exists_for_patient())
620
```
621
622
#### Finding events within a date range plus a constant
623
624
```ehrql
625
from ehrql import create_dataset, codelist_from_csv, weeks
626
from ehrql.tables.core import clinical_events, patients
627
628
asthma_codelist = codelist_from_csv("XXX", column="YYY")
629
630
index_date = "2022-07-01"
631
632
dataset = create_dataset()
633
dataset.has_recent_asthma_diagnosis = clinical_events.where(
634
        clinical_events.snomedct_code.is_in(asthma_codelist)
635
).where(
636
        clinical_events.date.is_on_or_between(index_date, index_date + weeks(2))
637
).exists_for_patient()
638
dataset.define_population(patients.exists_for_patient())
639
```
640
641
#### Finding events within a dynamic date range
642
643
```ehrql
644
from ehrql import create_dataset, codelist_from_csv, months
645
from ehrql.tables.core import clinical_events, patients
646
647
diabetes_codelist = codelist_from_csv("XXX", column="YYY")
648
hba1c_codelist = codelist_from_csv("XXX", column="YYY")
649
650
dataset = create_dataset()
651
first_diabetes_code_date = clinical_events.where(
652
        clinical_events.snomedct_code.is_in(diabetes_codelist)
653
).sort_by(
654
        clinical_events.date
655
).first_for_patient().date
656
657
dataset.count_of_hba1c_tests_6mo_post_first_diabetes_code = clinical_events.where(
658
        clinical_events.snomedct_code.is_in(hba1c_codelist)
659
).where(
660
        clinical_events.date.is_on_or_between(first_diabetes_code_date, first_diabetes_code_date + months(6))
661
).count_for_patient()
662
dataset.define_population(patients.exists_for_patient())
663
```
664
665
#### Excluding events which have happened in the future
666
667
Data quality issues with many sources may result in events apparently happening in future dates (e.g. 9999-01-01), it is useful to filter these from your analysis.
668
669
```ehrql
670
from datetime import date
671
from ehrql import create_dataset, codelist_from_csv
672
from ehrql.tables.core import clinical_events, patients
673
674
asthma_codelist = codelist_from_csv("XXX", column="YYY")
675
676
dataset = create_dataset()
677
dataset.has_recent_asthma_diagnosis = clinical_events.where(
678
        clinical_events.snomedct_code.is_in(asthma_codelist)
679
).where(
680
        clinical_events.date > "2022-07-01"
681
).where(
682
        clinical_events.date < date.today()
683
).exists_for_patient()
684
dataset.define_population(patients.exists_for_patient())
685
```
686
687
### Extracting parts of dates and date differences
688
689
#### Finding the year an event occurred
690
691
```ehrql
692
from datetime import date
693
from ehrql import create_dataset, codelist_from_csv
694
from ehrql.tables.core import clinical_events, patients
695
696
asthma_codelist = codelist_from_csv("XXX", column="YYY")
697
698
dataset = create_dataset()
699
dataset.year_of_first = clinical_events.where(
700
        clinical_events.snomedct_code.is_in(asthma_codelist)
701
).sort_by(
702
        clinical_events.date
703
).first_for_patient().date.year
704
dataset.define_population(patients.exists_for_patient())
705
```
706
707
#### Finding the number of weeks between two events
708
709
```ehrql
710
from ehrql import create_dataset, codelist_from_csv
711
from ehrql.tables.core import clinical_events, patients
712
713
asthma_codelist = codelist_from_csv("XXX", column="YYY")
714
asthma_review_codelist = codelist_from_csv("XXX", column="YYY")
715
716
dataset = create_dataset()
717
first_asthma_diagnosis_date = clinical_events.where(
718
        clinical_events.snomedct_code.is_in(asthma_codelist)
719
).sort_by(clinical_events.date).first_for_patient().date
720
721
first_asthma_review_date = clinical_events.where(
722
        clinical_events.snomedct_code.is_in(asthma_review_codelist)
723
).where(
724
        clinical_events.date.is_on_or_after(first_asthma_diagnosis_date)
725
).sort_by(clinical_events.date).first_for_patient().date
726
727
dataset.weeks_between_diagnosis_and_review = (first_asthma_review_date - first_asthma_diagnosis_date).weeks
728
dataset.define_population(patients.exists_for_patient())
729
```
730
731
## Admitted Patient Care Spells (APCS)
732
733
Examples for the [TPP apcs table](../reference/schemas/tpp.md#apcs).
734
735
### Does each patient have an event matching some criteria?
736
737
#### Does each patient have a hospitalisation event matching some criteria?
738
739
```ehrql
740
from ehrql import create_dataset, codelist_from_csv
741
from ehrql.tables.tpp import apcs, patients
742
743
cardiac_diagnosis_codes = codelist_from_csv("XXX", column="YYY")
744
745
dataset = create_dataset()
746
dataset.has_recent_cardiac_admission = apcs.where(
747
        apcs.primary_diagnosis.is_in(cardiac_diagnosis_codes)
748
).where(
749
        apcs.admission_date.is_on_or_between("2022-07-01", "2023-01-01")
750
).exists_for_patient()
751
dataset.define_population(patients.exists_for_patient())
752
```
753
754
## Medications
755
756
Examples for the [medications table](../reference/schemas/core.md#medications).
757
758
### Does each patient have an event matching some criteria?
759
760
#### Does each patient have a medication event matching some criteria?
761
762
```ehrql
763
from ehrql import create_dataset, codelist_from_csv
764
from ehrql.tables.core import medications, patients
765
766
statin_medications = codelist_from_csv("XXX", column="YYY")
767
768
dataset = create_dataset()
769
dataset.has_recent_statin_prescription = medications.where(
770
        medications.dmd_code.is_in(statin_medications)
771
).where(
772
        medications.date.is_on_or_between("2022-07-01", "2023-01-01")
773
).exists_for_patient()
774
dataset.define_population(patients.exists_for_patient())
775
```
776
777
#### How many events does each patient have matching some criteria?
778
779
```ehrql
780
from ehrql import create_dataset, codelist_from_csv
781
from ehrql.tables.core import medications, patients
782
783
statin_medications = codelist_from_csv("XXX", column="YYY")
784
785
dataset = create_dataset()
786
dataset.number_of_statin_prescriptions_in_last_year = medications.where(
787
        medications.dmd_code.is_in(statin_medications)
788
).where(
789
        medications.date.is_on_or_between("2022-01-01", "2023-01-01")
790
).count_for_patient()
791
dataset.define_population(patients.exists_for_patient())
792
```
793
794
#### What is the earliest/latest medication event matching some criteria?
795
796
```ehrql
797
from ehrql import create_dataset, codelist_from_csv
798
from ehrql.tables.core import medications, patients
799
800
statin_medications = codelist_from_csv("XXX", column="YYY")
801
802
dataset = create_dataset()
803
dataset.first_statin_prescription_date = medications.where(
804
        medications.dmd_code.is_in(statin_medications)
805
).where(
806
        medications.date.is_on_or_after("2022-07-01")
807
).sort_by(
808
        medications.date
809
).first_for_patient().date
810
dataset.define_population(patients.exists_for_patient())
811
```
812
813
```ehrql
814
from ehrql import create_dataset, codelist_from_csv
815
from ehrql.tables.core import medications, patients
816
817
statin_medications = codelist_from_csv("XXX", column="YYY")
818
819
dataset = create_dataset()
820
dataset.last_statin_prescription_date = medications.where(
821
        medications.dmd_code.is_in(statin_medications)
822
).where(
823
        medications.date.is_on_or_after("2022-07-01")
824
).sort_by(
825
        medications.date
826
).last_for_patient().date
827
dataset.define_population(patients.exists_for_patient())
828
```
829
830
### Extracting parts of dates and date differences
831
832
#### Finding prescriptions made in particular months of the year
833
834
```ehrql
835
from ehrql import create_dataset, codelist_from_csv
836
from ehrql.tables.core import medications, patients
837
838
amoxicillin_codelist = codelist_from_csv("XXX", column="YYY")
839
840
winter_months = [10,11,12,1,2,3]
841
842
dataset = create_dataset()
843
dataset.winter_amoxicillin_count = medications.where(
844
        medications.dmd_code.is_in(amoxicillin_codelist)
845
).where(
846
        medications.date.month.is_in(winter_months)
847
).count_for_patient()
848
dataset.define_population(patients.exists_for_patient())
849
```
850
851
### Finding events occuring close in time to another event
852
853
#### Finding the code of the first medication after the first clinical event matching some criteria
854
855
```ehrql
856
from ehrql import create_dataset, codelist_from_csv, weeks
857
from ehrql.tables.core import clinical_events, medications, patients
858
859
asthma_codelist = codelist_from_csv("XXX", column="YYY")
860
inhaled_corticosteroid_codelist = codelist_from_csv("XXX", column="YYY")
861
862
dataset = create_dataset()
863
first_asthma_diagnosis_date = clinical_events.where(
864
        clinical_events.snomedct_code.is_in(asthma_codelist)
865
).where(
866
        clinical_events.date.is_on_or_after("2022-07-01")
867
).sort_by(
868
        clinical_events.date
869
).first_for_patient().date
870
dataset.first_asthma_diagnosis_date = first_asthma_diagnosis_date
871
dataset.count_ics_prescriptions_2wks_post_diagnosis = medications.where(
872
        medications.dmd_code.is_in(inhaled_corticosteroid_codelist)
873
).where(
874
        medications.date.is_on_or_between(first_asthma_diagnosis_date,first_asthma_diagnosis_date + weeks(2))
875
).count_for_patient()
876
dataset.define_population(patients.exists_for_patient())
877
```
878
879
### Excluding medications for patients who have transferred between practices
880
881
Note that in these examples, the periods defined are illustrative only.
882
883
#### Excluding patients based on prescription date
884
885
```ehrql
886
from ehrql import case, create_dataset, codelist_from_csv, when, weeks
887
from ehrql.tables.tpp import medications, patients, practice_registrations
888
889
def meets_registrations_criteria(medication_date):
890
    # For this medication date, find whether a registration exists where
891
    # the start date and end dates are within a 12 weeks
892
    # prior/after to the prescription
893
894
    start_cutoff_date = medication_date - weeks(12)
895
    end_cutoff_date = medication_date + weeks(12)
896
    return (
897
        practice_registrations.where(
898
        practice_registrations.start_date.is_on_or_before(start_cutoff_date)
899
        )
900
        .except_where(
901
        practice_registrations.end_date.is_on_or_before(end_cutoff_date)
902
        )
903
        .exists_for_patient()
904
    )
905
906
medication_codelist = codelist_from_csv("XXX", column="YYY")
907
908
dataset = create_dataset()
909
910
# First relevant prescription per patient
911
first_prescription = (
912
    medications.where(
913
        medications.dmd_code.is_in(medication_codelist)
914
    )
915
    .sort_by(medications.date)
916
    .first_for_patient()
917
)
918
919
# Include only prescriptions that fall within accepatable registration dates
920
dataset.prescription_date = case(
921
    when(meets_registrations_criteria(first_prescription.date))
922
    .then(first_prescription.date)
923
)
924
dataset.define_population(patients.exists_for_patient())
925
```
926
927
## Decision support values
928
929
Examples for the [TPP decision support values table](../reference/schemas/tpp.md#decision_support_values).
930
931
### Finding the most recent decision support value
932
933
#### Finding each patient's EFI (electronic frailty index)
934
935
```ehrql
936
from ehrql import create_dataset
937
from ehrql.tables.tpp import decision_support_values
938
939
dataset = create_dataset()
940
latest_efi_record = (
941
  decision_support_values
942
    .electronic_frailty_index()
943
    .sort_by(decision_support_values.calculation_date)
944
    .last_for_patient()
945
)
946
dataset.latest_efi = latest_efi_record.numeric_value
947
dataset.latest_efi_date = latest_efi_record.calculation_date
948
dataset.define_population(decision_support_values.exists_for_patient())
949
```