# Spec tests

These tests specify the expected behaviour of ehrQL, and by extension, the query model.

They are also used to generate documentation.

## Table of Contents

`toc.py` defines the Table of Contents for the documentation generated by tests. A new spec test
subdirectory or file must be added to `toc.py` in order for it to appear in the docs.

## Folder structure

Within the spec directory, each test directory is considered a chapter in the docs, each file
is considered to be a section, and each test is a paragraph.

e.g.
```
tests
├── spec
│   ├── aggregate_frame                 <-- chapter
│   │   ├── __init__.py                 <-- contains chapter title
│   │   ├── test_count_for_patient.py   <-- section
│   │   └── test_exists_for_patient.py  <-- section
```

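The mapping from folders to chapters and sections can be sketched in plain Python. This is only an illustration of the convention: `outline` and `spec_paths` are hypothetical names, not part of the documentation generator, and the paths are those of the example tree.

```python
from pathlib import PurePosixPath

# Paths relative to tests/spec, taken from the example tree.
spec_paths = [
    "aggregate_frame/__init__.py",
    "aggregate_frame/test_count_for_patient.py",
    "aggregate_frame/test_exists_for_patient.py",
]


def outline(paths):
    """Group test files (sections) under their directory (chapter)."""
    chapters = {}
    for path in map(PurePosixPath, paths):
        sections = chapters.setdefault(path.parent.name, [])
        # __init__.py holds chapter metadata, so only test files become sections.
        if path.name.startswith("test_"):
            sections.append(path.stem)
    return chapters


chapters = outline(spec_paths)
```

Here `chapters` maps the chapter directory `aggregate_frame` to its two section files.
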
### Titles
Chapter titles are specified with a `title` attribute in the test directory's `__init__.py`.
Section titles are specified with a `title` attribute in each test file.
Paragraph titles are extracted from the test names, or from a `title` attribute in the test
function.

E.g. `test_count_for_patient.py` contains 2 tests:
- `test_count_for_patient_on_event_frame()`
- `test_count_for_patient_on_patient_frame()`

Assuming this is included in `toc.py`, the following structure will be generated in the docs:

> 1. Aggregating event and patient frames  # from `title` in `aggregate_frame/__init__.py`
> 1.1 Counting the rows for each patient  # from `title` in `aggregate_frame/test_count_for_patient.py`
> 1.1.1 Count for patient on event frame  # from test name
> 1.1.2 Count for patient on patient frame  # from test name

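How a paragraph title might be derived from a test can be sketched as follows. This is a minimal sketch, not the generator's actual code: `paragraph_title` is a hypothetical helper, and the assumption that an explicit `title` attribute takes precedence over the test name is ours. It needs Python 3.9 or later for `str.removeprefix`.

```python
def paragraph_title(test_function):
    """Return a docs title for a test function (illustrative sketch only)."""
    # Assumption: an explicit `title` attribute on the function wins.
    explicit = getattr(test_function, "title", None)
    if explicit is not None:
        return explicit
    # Otherwise derive the title from the name: drop the "test_" prefix,
    # turn underscores into spaces, and capitalise the first letter.
    words = test_function.__name__.removeprefix("test_").replace("_", " ")
    return words[:1].upper() + words[1:]


def test_count_for_patient_on_event_frame():
    pass


def test_with_custom_title():
    pass


# Hypothetical explicit title set as a function attribute.
test_with_custom_title.title = "A custom paragraph title"

derived = paragraph_title(test_count_for_patient_on_event_frame)
custom = paragraph_title(test_with_custom_title)
```

With these definitions, `derived` is `"Count for patient on event frame"`, matching paragraph 1.1.1 in the generated structure.
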
### Additional text
Optional additional text can be included under a Chapter, Section or Paragraph title in the
documentation as follows:

- Chapter: add a `text` attribute in the test directory's `__init__.py`.
- Section: add a `text` attribute in a test file.
- Paragraph: add a docstring to the test function.

## Test structure

Spec tests follow a standardised structure in order to allow documentation generation. Each
spec test uses the `spec_test` fixture, which has the following components:

```
spec_test(
    table_data,  # a dict (see below) defining the tables in the test database
    e.where(e.b1).i1.sum_for_patient(),  # the ehrQL code being tested
    {  # expected results
        1: (101 + 102),
        2: 201,
    },
)
```

### Defining table data
Each test must define a dict named `table_data`, with keys `e` (event level data, with multiple rows
per patient) and/or `p` (patient level data, with one row per patient).

This sets up the test data. The first column is always patient ID and must be of integer type;
it can be given a heading in the test for readability if desired, but it does not require one.
Subsequent columns must take one of a discrete set of names, which indicate the type of the column
in the test database.

See `tests/spec/tables.py` for available column names and types.

E.g. The following table data dict sets up:
- an event table with an integer column `i1` and a boolean column `b1`, with 3 rows for patient 1,
  and 2 rows for patient 2
- a patient table with a date column `d1` with rows for 3 patients, 2 of which also appear in the
  event table.

```
table_data = {
    e: """
          |  i1 |  b1
        --+-----+-----
        1 | 101 |  T
        1 | 102 |  T
        1 | 103 |  F
        2 | 201 |  T
        2 | 202 |
        """,
    p: """
          |  d1
        --+------------
        1 | 1990-01-02
        2 | 1990-01-02
        3 |
        """
}
```

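To make the table format concrete, here is a rough sketch of how such a string could be read into rows. It is not the parser used by the spec tests: `parse_table`, `parse_value` and the `patient_id` key are hypothetical, and inferring the type from the first letter of the column name (`i`, `b`, `d`) is an assumption based only on the three columns used in this README. `tests/spec/tables.py` remains the authority on names and types.

```python
from datetime import date


def parse_value(column, value):
    """Convert one cell to a Python value, guessing the type from the column name."""
    if value == "":
        return None  # an empty cell is treated as a missing value
    if column.startswith("i"):
        return int(value)
    if column.startswith("b"):
        return {"T": True, "F": False}[value]
    if column.startswith("d"):
        return date.fromisoformat(value)
    return value


def parse_table(text):
    """Parse a table string into a list of row dicts (illustrative only)."""
    header, _separator, *rows = [line.strip() for line in text.strip().splitlines()]
    # The first column is the patient ID, whose heading is optional, so skip it.
    columns = [name.strip() for name in header.split("|")][1:]
    parsed = []
    for line in rows:
        cells = [cell.strip() for cell in line.split("|")]
        values = cells[1:]
        # Pad short rows so that trailing empty cells become missing values.
        values += [""] * (len(columns) - len(values))
        row = {"patient_id": int(cells[0])}
        for name, value in zip(columns, values):
            row[name] = parse_value(name, value)
        parsed.append(row)
    return parsed


event_rows = parse_table(
    """
      |  i1 |  b1
    --+-----+-----
    1 | 101 |  T
    2 | 202 |
    """
)
patient_rows = parse_table(
    """
      |  d1
    --+------------
    1 | 1990-01-02
    3 |
    """
)
```

In this sketch the empty `b1` cell for patient 2 and the empty `d1` cell for patient 3 both become `None`.
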
### Defining the ehrQL code to return a series

Define the ehrQL code that's being tested, using the table data. This code should always
return a Series.

e.g. in the table data above, to get the sum of the `i1` column where `b1` is True:

```e.where(e.b1).i1.sum_for_patient()```

### Expected results

Describe the expected results as a dict, with patient IDs as keys.

e.g. in the example above, the expected sums would be:

```
{
    1: (101 + 102),
    2: 201,
}
```
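
The expected sums can be checked by hand with a plain-Python emulation of the query. This is not ehrQL and says nothing about how ehrQL evaluates the expression. It only reproduces the arithmetic for the example event table, with `event_rows` and `sums` as illustrative names.

```python
from collections import defaultdict

# (patient ID, i1, b1) rows from the example event table; None is an empty cell.
event_rows = [
    (1, 101, True),
    (1, 102, True),
    (1, 103, False),
    (2, 201, True),
    (2, 202, None),
]

sums = defaultdict(int)
for patient_id, i1, b1 in event_rows:
    if b1:  # rows where b1 is False or missing are filtered out
        sums[patient_id] += i1

expected = {
    1: (101 + 102),
    2: 201,
}
assert dict(sums) == expected
```

Patient 2's second row is excluded because its `b1` cell is empty, which is why only 201 is counted for that patient.
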