# Recreate the initial experiments

We will recreate the experiments from "Genome-wide association study of behavioral, physiological and gene expression traits in outbred CFW mice". The command below will download the datasets used in the mouse genome paper into a local directory called `mice_data_set` for processing.

## Set up a Jupyter notebook with the R Kernel

Next, use the Conda package manager to set up a virtual environment for running the Jupyter notebooks that recreate the original experiments on the datasets.

```
conda create -n r-kernel
conda activate r-kernel
conda install r-recommended r-irkernel
conda install jupyter
```

Add the R kernel spec to Jupyter and install the required packages.

```
R -e 'IRkernel::installspec()'
R -e 'install.packages("qtl", repos = "http://cran.us.r-project.org")'
R -e 'install.packages("qqman", repos = "http://cran.us.r-project.org")'
R -e 'install.packages("data.table", repos = "http://cran.us.r-project.org")'
R -e 'install.packages("stringr", repos = "http://cran.us.r-project.org")'
R -e 'install.packages("devtools", repos = "http://cran.us.r-project.org")'
```
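Before launching Jupyter, it can help to confirm that the tools installed above are on `PATH` and that the R kernel was registered. The following is a minimal sketch and not part of the original instructions: `missing_tools` is a hypothetical helper, and the comment about the kernel name assumes `IRkernel::installspec()` was run with its default name, `ir`.

```shell
# Hypothetical helper: print each command that is not on PATH.
# Returns non-zero if at least one command is missing.
missing_tools() {
  rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      rc=1
    fi
  done
  return "$rc"
}

# With the r-kernel environment active, all three should be found.
if missing_tools conda R jupyter; then
  # Lists the registered kernels; an entry named "ir" is expected
  # (assumption: IRkernel's default kernel name was used).
  jupyter kernelspec list
fi
```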

Run Jupyter notebook:

```
jupyter notebook
```

## Run Map.ipynb

Running this notebook before synthesizing data is optional, because the original `abBMD` analysis was downloaded in the steps above. To recreate the original experiment results, open `./research_paper_code/notebooks/map.ipynb` in Jupyter notebook and choose Kernel->Run All. This runs the R code in this repository that recreates the results from the original paper. As the notebook runs, it produces plots and data files in the following formats:

```
(base) redlined@redlined-980:~/GitHub/synthetic-data-genomics/mice_data_set/out$ head lm_abBMD_1_79646.csv
"","snp","chr","pos","p"
"1","rs29477109",11,95292217,5.05231663641996e-14
"2","rs27071351",11,96114911,7.07418067212828e-14
"3","rs27024162",11,96918116,7.17058199722633e-14
"4","rs49423067",11,96918212,7.19866140655625e-14
"5","rs29470802",11,95263588,8.04984862217419e-14
"6","rs29459746",11,95987376,1.03725122425739e-13
"7","rs50417410",11,97011284,1.04333530152468e-13
"8","rs29473466",11,96920033,1.33866242959213e-13
"9","rs221074340",11,96018255,1.35574083178291e-13
```
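The output CSV can also be filtered directly from the shell. The sketch below assumes the column layout shown above (row index, `snp`, `chr`, `pos`, `p`) with no commas inside fields. `sig_snps` is a hypothetical helper, and the default threshold of `1e-13` is an arbitrary value chosen for illustration, not a significance threshold taken from the paper.

```shell
# Hypothetical helper: print SNP, chromosome, position and p-value for
# every row whose p-value (5th field) is below a threshold.
# Usage: sig_snps FILE [THRESHOLD]
sig_snps() {
  file="$1"
  threshold="${2:-1e-13}"   # illustrative default only
  awk -F',' -v thr="$threshold" '
    # Skip the header row; adding 0 forces a numeric comparison.
    NR > 1 && ($5 + 0) < (thr + 0) {
      gsub(/"/, "", $2)     # strip the quotes around the SNP id
      print $2, $3, $4, $5
    }' "$file"
}
```

For example, `sig_snps mice_data_set/out/lm_abBMD_1_79646.csv 1e-13` applied to the nine rows shown above would print the first five SNPs (`rs29477109` through `rs29470802`), since the remaining four have p-values above `1e-13`.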

![image](https://user-images.githubusercontent.com/6510818/95148759-8f0c4d00-0751-11eb-937f-2dcb729eb3e4.png)