a b/docs/explanation/using-ehrql-in-opensafely-projects.md

This page describes how ehrQL fits into a full OpenSAFELY project.

In one sentence:

> Researchers develop an ehrQL query and analysis code on their own computers
> using dummy tables,
> then submit it to the [OpenSAFELY jobs site](https://jobs.opensafely.org)
> to run against real tables in an OpenSAFELY backend.

## Project workflow summary

The workflow for a single study using ehrQL is much like that for
[existing studies that use cohort-extractor](https://docs.opensafely.org/workflow/).

In summary:

1. Create a Git repository from the template repository provided and clone it to your local machine.
1. Write a dataset definition in ehrQL that specifies what data you want to extract from the database.
   **Only this step is specific to ehrQL.**
1. Develop analysis scripts in R, Stata, or Python that process and analyse the [dummy datasets](#dummy-datasets) created by ehrQL.
1. Test the code by running the analysis steps specified in the [project pipeline](https://docs.opensafely.org/actions-pipelines/).
1. Execute the analysis on the [real tables via OpenSAFELY's jobs site](#real-tables). This will generate outputs on the secure server.
1. Check the [output for disclosivity within the server, and redact if necessary](https://docs.opensafely.org/releasing-files/).
1. Release the [outputs on the jobs site](https://docs.opensafely.org/releasing-files/#2-requesting-release-of-outputs-from-the-server).
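
The workflow above depends on steps running in order: the ehrQL step produces a dataset, and the analysis steps consume it. The Python sketch below is a toy model of that ordering, not the OpenSAFELY pipeline implementation. The action names (`generate_dataset`, `describe`, `model`) and the `needs` mapping are hypothetical and chosen only for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical actions: each name maps to the actions whose outputs it needs.
needs = {
    "generate_dataset": [],                     # the ehrQL step
    "describe": ["generate_dataset"],           # an analysis step
    "model": ["generate_dataset", "describe"],  # a later analysis step
}


def run_order(needs):
    """Return the actions in an order that respects every dependency."""
    return list(TopologicalSorter(needs).static_order())


order = run_order(needs)
print(order)
```

Because every analysis action depends, directly or indirectly, on the dataset, an error in the dataset definition blocks everything downstream, which is why it is worth catching on your own computer first.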

## Dummy datasets

Because OpenSAFELY doesn't allow researchers direct access to patient data,
researchers must use dummy datasets to develop their analysis code on their own computers.

When an ehrQL action is executed on a researcher's computer (see [Running ehrQL](../explanation/running-ehrql.md)),
ehrQL can generate dummy datasets based on the properties of the tables used in the dataset definition.
Alternatively, users can provide their own dummy tables.

This allows the dataset definition to be checked for errors,
and produces dummy datasets that can be used to test downstream actions that depend on the output of the ehrQL action.
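
To illustrate how a downstream action can be developed against dummy output, here is a minimal Python sketch. It assumes the ehrQL action wrote a CSV file with `patient_id`, `age`, and `sex` columns. Those column names and the inline rows are invented for illustration and are not real ehrQL output.

```python
import csv
import io
from collections import Counter

# Stand-in for a dummy dataset file; columns and values are invented.
DUMMY_CSV = """patient_id,age,sex
1,34,female
2,61,male
3,47,female
4,,male
"""


def summarise(csv_file):
    """Count patients by sex and compute the mean age, skipping missing ages."""
    rows = list(csv.DictReader(csv_file))
    counts = Counter(row["sex"] for row in rows)
    ages = [int(row["age"]) for row in rows if row["age"]]
    mean_age = sum(ages) / len(ages) if ages else None
    return {"n": len(rows), "by_sex": dict(counts), "mean_age": mean_age}


summary = summarise(io.StringIO(DUMMY_CSV))
print(summary)
```

Because `summarise` only needs a file-like object, the same code could be pointed at the file the ehrQL action writes, for example with `open(path, newline="")`, where `path` is wherever your project stores that output.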

## Real tables

Executing a dataset definition against real tables in an OpenSAFELY backend involves running the study on the
[OpenSAFELY jobs site](https://jobs.opensafely.org).
More information about the jobs site and how to run a study can be found in the
[OpenSAFELY documentation](https://docs.opensafely.org/jobs-site/).