Download this file

796 lines (654 with data), 24.8 kB

emis schema

Available on backends: EMIS

This schema defines the data (both primary care and externally linked) available in the
OpenSAFELY-EMIS backend. For more information about this backend, see
"EMIS Primary Care".

from ehrql.tables.emis import (
    clinical_events,
    medications,
    ons_deaths,
    patients,
    practice_registrations,
    vaccinations,
)

many rows per patient

clinical_events

Each record corresponds to a single clinical or consultation event for a patient.

Note that event codes do not change in this table. If an event code in the coding
system becomes inactive, the event will still be coded to the inactive code.
As such, codelists should include all relevant inactive codes.

Example ehrQL usage of clinical_events

Columns
date 🔗 date
snomedct_code 🔗 SNOMED-CT code
numeric_value 🔗 float

many rows per patient

medications

The medications table provides data about prescribed medications in primary care.

Prescribing data, including the contents of the medications table are standardised
across clinical information systems such as SystmOne (TPP). This is a requirement
for data transfer through the
Electronic Prescription Service
in which data passes from the prescriber to the pharmacy for dispensing.

Medications are coded using
dm+d codes.
The medications table is structured similarly to the clinical_events
table, and each row in the table is made up of a patient identifier, an event (dm+d)
code, and an event date. For this table, the event refers to the issue of a medication
(coded as a dm+d code), and the event date, the date the prescription was issued.

Factors to consider when using medications data

Depending on the specific area of research, you may wish to exclude medications
in particular periods. For example, in order to ensure medication data is stable
following a change of practice, you may want to exclude patients for a period after
the start of their practice registration . You may also want to
exclude medications for patients for a period prior to their leaving a practice.
Alternatively, for research looking at a specific period of
interest, you may simply want to ensure that all included patients were registered
at a single practice for a minimum time prior to the study period, and were
registered at the same practice for the duration of the study period.

Examples of using ehrQL to calculation such periods can be found in the documentation
on how to
use ehrQL to answer specific questions using the medications table

Columns
date 🔗 date
dmd_code 🔗 dm+d code

one row per patient

ons_deaths

Registered deaths

Date and cause of death based on information recorded when deaths are
certified and registered in England and Wales from February 2019 onwards.
The data provider is the Office for National Statistics (ONS).
This table is updated approximately weekly in OpenSAFELY.

This table includes the underlying cause of death and up to 15 medical conditions
mentioned on the death certificate. These codes (cause_of_death_01 to
cause_of_death_15) are not ordered meaningfully.

More information about this table can be found in following documents provided by the ONS:

In the associated database table ONS_Deaths,
a small number of patients have multiple registered deaths.
This table contains the earliest registered death.
The ehrql.tables.raw.core.ons_deaths table contains all registered deaths.

!!! warning
There is also a lag in ONS death recording caused amongst other things by things
like autopsies and inquests delaying reporting on cause of death. This is
evident in the OpenSAFELY historical database coverage
report

Example ehrQL usage of ons_deaths

Columns
date 🔗 date

Patient's date of death.

underlying_cause_of_death 🔗 ICD-10 code

Patient's underlying cause of death.

cause_of_death_01 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

cause_of_death_02 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

cause_of_death_03 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

cause_of_death_04 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

cause_of_death_05 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

cause_of_death_06 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

cause_of_death_07 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

cause_of_death_08 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

cause_of_death_09 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

cause_of_death_10 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

cause_of_death_11 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

cause_of_death_12 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

cause_of_death_13 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

cause_of_death_14 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

cause_of_death_15 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

Methods
cause_of_death_is_in(codelist) 🔗

Match codelist against the underlying_cause_of_death field and all 15
separate cause_of_death fields.

This method evaluates as True if any code in the codelist matches any of
these fields.


View method definition

columns = [
    "underlying_cause_of_death",
    *[f"cause_of_death_{i:02d}" for i in range(1, 16)],
]
conditions = [getattr(ons_deaths, column).is_in(codelist) for column in columns]
return functools.reduce(operator.or_, conditions)
</details>

one row per patient

patients

Patients in primary care.

In the EMIS backend, this table also includes information about the patient's
current practice registration. Historical practice registration data is not
currently available.

Recording of death in primary care

Dates of death appear in two places in the data made available via OpenSAFELY: the
primary care record, and the death certificate data supplied by the ONS.

ONS death data are considered the gold standard for identifying patient death in
England because they are based on the MCCDs (Medical Certificate of Cause of Death)
which the last attending doctor has a statutory duty to complete.

While there is generally a lag between the death being recorded in ONS data and it
appearing in the primary care record, the coverage of recorded death is almost
complete and the date of death is usually reliable when it appears. There is also a
lag in ONS death recording (see ons_deaths below for more detail).
You can find out more about the accuracy of date of death recording in primary care
in:

Gallagher, A. M., Dedman, D., Padmanabhan, S., Leufkens, H. G. M. & de Vries, F 2019. The accuracy of date of death recording in the Clinical
Practice Research Datalink GOLD database in England compared with the Office for National Statistics death registrations.
Pharmacoepidemiol. Drug Saf. 28, 563–569.
https://doi.org/10.1002/pds.4747

By contrast, cause of death is often not accurate in the primary care record so we
don't make it available to query here.

Example ehrQL usage of patients

Columns
date_of_birth 🔗 date

Patient's date of birth.

  • Always the first day of a month
  • Never NULL
sex 🔗 string

Patient's sex.

  • Possible values: female, male, unknown
  • Never NULL
date_of_death 🔗 date

Patient's date of death.

registration_start_date 🔗 date

Date patient joined practice.

  • Never NULL
registration_end_date 🔗 date

Date patient left practice.

practice_pseudo_id 🔗 string

Pseudonymised practice identifier.

  • Never NULL
rural_urban_classification 🔗 integer

Rural urban classification:

  • 1 - Urban major conurbation
  • 2 - Urban minor conurbation
  • 3 - Urban city and town
  • 4 - Urban city and town in a sparse setting
  • 5 - Rural town and fringe
  • 6 - Rural town and fringe in a sparse setting
  • 7 - Rural village and dispersed
  • 8 - Rural village and dispersed in a sparse setting

  • Always >= 1 and <= 8

imd_rounded 🔗 integer

Index of Multiple Deprivation (IMD)
rounded to the nearest 100, where lower values represent more deprived areas.

  • Always >= 0, <= 32800, and a multiple of 100
Methods
age_on(date) 🔗

Patient's age as an integer, in whole elapsed calendar years, as it would be on
the given date.

This method takes no account of whether the patient is alive on the given date.
In particular, it may return negative values if the given date is before the
patient's date of birth.


View method definition

return (date - patients.date_of_birth).years
</details>
has_practice_registration_spanning(start_date, end_date) 🔗

Whether a patient's registration spans the entire period between
start_date and end_date.


View method definition

return patients.registration_start_date.is_on_or_before(start_date) & (
    patients.registration_end_date.is_after(end_date)
    | patients.registration_end_date.is_null()
)
</details>

many rows per patient

practice_registrations

Each record corresponds to a patient's registration with a practice.

Example ehrQL usage of practice_registrations

!!! warning
At present, the EMIS database contains only the patient's current practice
registration and does not include their full registration history.

Columns
start_date 🔗 date

Date patient joined practice.

  • Never NULL
end_date 🔗 date

Date patient left practice.

practice_pseudo_id 🔗 integer

Pseudonymised practice identifier.

  • Never NULL
Methods
for_patient_on(date) 🔗

Return each patient's practice registration as it was on the supplied date.

Where a patient is registered with multiple practices we prefer the most recent
registration and then, if there are multiple of these, the one with the longest
duration. If there's still an exact tie we choose arbitrarily based on the
practice ID.


View method definition

spanning_regs = practice_registrations.where(practice_registrations.start_date <= date).except_where(
    practice_registrations.end_date < date
)
ordered_regs = spanning_regs.sort_by(
    practice_registrations.start_date,
    practice_registrations.end_date,
    practice_registrations.practice_pseudo_id,
)
return ordered_regs.last_for_patient()
</details>
exists_for_patient_on(date) 🔗

Returns whether a person was registered with a practice on the supplied date.

NB. The implementation currently uses spanning(). It would also have been
valid to implement as
practice_registrations.for_patient_on(date).exists_for_patient(), but for
internal reasons that is less efficient.


View method definition

return practice_registrations.spanning(date, date).exists_for_patient()
</details>
spanning(start_date, end_date) 🔗

Filter registrations to just those spanning the entire period between
start_date and end_date.


View method definition

return practice_registrations.where(
    practice_registrations.start_date.is_on_or_before(start_date)
    & (practice_registrations.end_date.is_after(end_date) | practice_registrations.end_date.is_null())
)
</details>

many rows per patient

vaccinations

This table contains information on administered vaccinations,
identified using SNOMED-CT codes for the vaccination procedure.

Vaccinations may also be queried by product code using the
medications table.

Vaccinations that were administered at work or in a pharmacy might not be
included in this table.

Columns
date 🔗 date

The date the vaccination was administered.

procedure_code 🔗 SNOMED-CT code