---8<-- 'includes/generated_docs/language__dataset.md'
Frames are the starting point for building any query in ehrQL. You can
think of a Frame as being like a table in a database, in that it
contains multiple rows and multiple columns. But a Frame can have
operations applied to it like filtering or sorting to produce a new
Frame.
You don't need to define any Frames yourself. Instead you import them
from the various schemas available in ehrql.tables
e.g.
from ehrql.tables.core import patients
Frames have columns which you can access as attributes on the Frame e.g.
dob = patients.date_of_birth
The schema documentation contains the full list of
available columns for each Frame. For example, see
ehrql.tables.core.patients
.
Accessing a column attribute on a Frame produces a Series,
which are documented elsewhere below.
Some Frames contain at most one row per patient, we call these
PatientFrames; others can contain multiple rows per patient, we call
these EventFrames.
---8<-- 'includes/generated_docs/language__frames.md'
A Series represents a column of values of a certain type. Some Series
contain at most one value per patient, we call these PatientSeries;
others can contain multiple values per patient, we call these
EventSeries. Values can be NULL (i.e. missing) but a Series can never
mix values of different types.
---8<-- 'includes/generated_docs/language__series.md'
ehrQL supports adding and subtracting durations from dates e.g.
date_of_admission + days(10)
or date_of_discharge - weeks(6)
. It
also supports finding the difference between dates e.g.
days_in_hospital = (date_of_discharge - date_of_admission).days
When working with dates you should generally prefer using days or weeks
as units, rather than months or years, unless there's a specific reason
you care about the calendar. Adding or subtracting days or weeks to a
date is a simple, unambiguous process; adding years or months introduces
ambiguities and complexities as neither unit is a consistent length (see
section below for details). So, for example, unless
there is specific epidemiological significance to whether an event
happened exactly three calendar months ago, it is better to say "90
days" rather than "3 months".
Additionally, some dates in the patient data (e.g. dates of birth) have
been rounded to the first of the month for privacy purposes. It's
pointless applying precise calendar arithmetic to such dates.
Tip: it can be clearer to readers to write time periods as a product
rather than just a number e.g. it's easier to see that days(5 * 365)
is approximately five years than it is with days(1825)
.
Adding years or months to a date sometimes has a clear answer e.g. 1
January 2001 is exactly one calendar year later than 1 January 2000, and
1 February is exactly one calendar month later. But not all cases are
clear, for instance: which day is exactly one year after 29 February
2000? There is no 29 February 2001, so should it be 28 February or 1
March? And which day is exactly one calendar month after 31 August?
There is no 31 September, so should it be 30 September or 1 October?
Different databases give different answers here.
ehrQL takes the approach that we consistently round up to the next
date. That is, whenever the naïve calculation takes us to a date which
doesn't exist (like 29 February 2001 or 31 September) we always take the
next largest date (1 March 2001, or 1 October).
As different databases take different approaches here, ehrQL has to go
to some lengths to ensure that we get the same results on every database
we support. This introduces complexity into the generated SQL which may
have performance costs in some circumstances. This is another reason to
prefer unambiguous units of days and weeks where possible.
---8<-- 'includes/generated_docs/language__date_arithmetic.md'
---8<-- 'includes/generated_docs/language__codelists.md'
---8<-- 'includes/generated_docs/language__functions.md'
Measures are used for calculating the ratio of one quantity to another
as it varies over time, and broken down by different demographic (or
other) groupings.
---8<-- 'includes/generated_docs/language__measures.md'