# Spec tests

These tests specify the expected behaviour of ehrQL, and by extension, the query model.

They are also used to generate documentation.

## Table of Contents

`toc.py` defines the Table of Contents for the documentation generated by tests. A new spec test
subdirectory or file must be added to `toc.py` in order for it to appear in the docs.

## Folder structure

Within the spec directory, each test directory is considered a chapter in the docs, each file
is considered to be a section, and each test is a paragraph.

For example:
```
tests
    ├── spec
    │   ├── aggregate_frame                       <-- chapter
    │   │   ├── __init__.py                       <-- contains chapter title
    │   │   ├── test_count_for_patient.py         <-- section
    │   │   └── test_exists_for_patient.py        <-- section
```

### Titles
Chapter titles are specified with a `title` attribute in the test directory's `__init__.py`.
Section titles are specified with a `title` attribute in each test file.
Paragraph titles are extracted from the test names, or from a `title` attribute in the test
function.

For example, `test_count_for_patient.py` contains two tests:
- `test_count_for_patient_on_event_frame()`
- `test_count_for_patient_on_patient_frame()`

Assuming this file is included in `toc.py`, the following structure will be generated in the docs:

> 1. Aggregating event and patient frames  # from `title` in `aggregate_frame/__init__.py`
> 1.1 Counting the rows for each patient  # from `title` in `aggregate_frame/test_count_for_patient.py`
> 1.1.1 Count for patient on event frame  # from test name
> 1.1.2 Count for patient on patient frame  # from test name

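
As a rough illustration of how a paragraph title could be derived from a test name, here is a minimal sketch. The helper `title_from_test_name` is hypothetical and not part of the ehrQL codebase; the real docs generator may apply different rules.

```python
def title_from_test_name(test_name):
    """Turn a test function name into a readable title (hypothetical helper)."""
    # Drop the conventional "test_" prefix, if present.
    prefix = "test_"
    if test_name.startswith(prefix):
        test_name = test_name[len(prefix):]
    # Replace underscores with spaces and capitalise the first letter only.
    words = test_name.replace("_", " ")
    return words[:1].upper() + words[1:]


print(title_from_test_name("test_count_for_patient_on_event_frame"))
```

Applied to the two test names above, this sketch yields the same paragraph titles as the generated structure shown in the example.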
### Additional text
Optional additional text can be included under a chapter, section or paragraph title in the
documentation as follows:

- Chapter: add a `text` attribute in the test directory's `__init__.py`.
- Section: add a `text` attribute in a test file.
- Paragraph: add a docstring to the test function.

## Test structure

Spec tests follow a standardised structure in order to allow documentation generation.  Each
spec test uses the `spec_test` fixture, which has the following components:

```
spec_test(
    table_data,                          # a dict (see below) defining the tables in the test database
    e.where(e.b1).i1.sum_for_patient(),  # the ehrQL code being tested
    {                                    # expected results
        1: (101 + 102),
        2: 201,
    },
)
```

### Defining table data
Each test must define a dict named `table_data`, with keys `e` (event level data, with multiple rows
per patient) and/or `p` (patient level data, with one row per patient).

This sets up the test data.  The first column is always patient ID and must be of integer type;
it can be given a heading in the test for readability if desired, but it does not require one.
Subsequent columns must take one of a fixed set of names, which indicate the type of the column
in the test database.

See `tests/spec/tables.py` for available column names and types.

For example, the following `table_data` dict sets up:
- an event table with an integer column `i1` and a boolean column `b1`, with 3 rows for patient 1,
  and 2 rows for patient 2
- a patient table with a date column `d1` with rows for 3 patients, 2 of which also appear in the
  event table.

```
table_data = {
    e: """
          |  i1 |  b1
        --+-----+-----
        1 | 101 |  T
        1 | 102 |  T
        1 | 103 |  F
        2 | 201 |  T
        2 | 202 |
    """,
    p: """
          |  d1
        --+-----------
        1 | 1990-01-02
        2 | 1990-01-02
        3 |
    """
}
```

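
To make the layout of these table strings concrete, the sketch below parses one into a list of row dicts. `parse_table` is a hypothetical helper written for illustration, not the parser the test suite uses. It leaves cell values as strings, names an unheaded first column `patient_id`, and treats an empty cell as `None`; all three choices are assumptions.

```python
def parse_table(text):
    """Parse a pipe-delimited test table into a list of row dicts (illustrative only)."""
    lines = [line.strip() for line in text.strip().splitlines()]
    # The first line holds the column headings; the second is the "--+----" separator.
    header, _separator, *rows = lines
    names = [name.strip() for name in header.split("|")]
    # The patient ID column may have no heading, so give it a name here.
    names[0] = names[0] or "patient_id"
    parsed = []
    for row in rows:
        cells = [cell.strip() for cell in row.split("|")]
        # Pad short rows so a missing trailing cell still maps to None.
        cells += [""] * (len(names) - len(cells))
        record = {name: (cell or None) for name, cell in zip(names, cells)}
        # Patient IDs must be integers.
        record[names[0]] = int(record[names[0]])
        parsed.append(record)
    return parsed


events = """
      |  i1 |  b1
    --+-----+-----
    1 | 101 |  T
    1 | 102 |  T
    1 | 103 |  F
    2 | 201 |  T
    2 | 202 |
"""

for row in parse_table(events):
    print(row)
```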
### Defining the ehrQL code to return a series

Define the ehrQL code that's being tested, using the table data. It should always
return a Series.

For example, using the table data above, to get the sum of the `i1` column where `b1` is True:

```
e.where(e.b1).i1.sum_for_patient()
```

### Expected results

Describe the expected results as a dict, with patient IDs as keys.

For example, for the ehrQL code above, the expected sums would be:

```
{
    1: (101 + 102),
    2: 201,
}
```
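
The expected values can be checked by hand. The sketch below reproduces the calculation in plain Python rather than ehrQL, using the event rows from the table data example. The tuple layout and the reading of the blank `b1` cell as `None` are assumptions made for illustration.

```python
from collections import defaultdict

# (patient_id, i1, b1) rows from the example event table; the blank b1 is read as None.
event_rows = [
    (1, 101, True),
    (1, 102, True),
    (1, 103, False),
    (2, 201, True),
    (2, 202, None),
]

# Keep rows where b1 is true, then sum i1 for each patient.
expected = defaultdict(int)
for patient_id, i1, b1 in event_rows:
    if b1:
        expected[patient_id] += i1

print(dict(expected))
```

Rows with a false or missing `b1` are excluded, which is why `103` and `202` do not contribute to the sums.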