{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## KGWAS Disease critical network\n",
"This notebook shows how to generate (1) KGWAS network weights for each edge, which measure how important the edge is for explaining the disease GWAS signals, and (2) a variant interpretation graph, which retrieves, for each variant, the top K nodes that are most important for explaining that variant's GWAS signal.\n",
"\n",
"This code assumes that the model folder `./data/model/test` has already been saved. Feel free to change the path to your own model folder."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Note\n",
"In your installed copy of PyTorch Geometric, open `site-packages/torch_geometric/nn/conv/hetero_conv.py` and replace the `group` function with the following:\n",
"\n",
"```python\n",
"def group(xs: List[Tensor], aggr: Optional[str]) -> Optional[Tensor]:\n",
"    if len(xs) == 0:\n",
"        return None\n",
"    elif aggr is None:\n",
"        return torch.stack(xs, dim=1)\n",
"    elif len(xs) == 1:\n",
"        return xs[0]\n",
"    elif isinstance(xs[0], tuple):\n",
"        return xs\n",
"    else:\n",
"        out = torch.stack(xs, dim=0)\n",
"        out = getattr(torch, aggr)(out, dim=0)\n",
"        out = out[0] if isinstance(out, tuple) else out\n",
"        return out\n",
"```\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"All required data files are present.\n",
"--loading KG---\n",
"--using enformer SNP embedding--\n",
"--using random go embedding--\n",
"--using ESM gene embedding--\n",
"Loading example GWAS file...\n",
"Example file already exists locally.\n",
"Loading GWAS file from ./data/biochemistry_Creatinine_fastgwa_full_10000_1.fastGWA...\n",
"Using ldsc weight...\n",
"ldsc_weight mean: 0.9999999999999993\n",
"Retrieving weights...\n",
"Aggregating across node types...\n",
"Start generating disease critical network...\n",
"No filters... Using all genes and gene programs...\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/dfs/user/kexinh/miniconda3/envs/a100_env/lib/python3.8/site-packages/pandas/core/indexing.py:1773: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
" self._setitem_single_column(ilocs[0], value, pi)\n",
"/dfs/user/kexinh/miniconda3/envs/a100_env/lib/python3.8/site-packages/pandas/core/indexing.py:1667: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
" self.obj[key] = value\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Disease critical network finished generating...\n",
"Generating variant interpretation networks...\n",
"Number of hit snps: 3\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 3/3 [00:22<00:00, 7.50s/it]\n"
]
}
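],
"source": [
"import sys\n",
"sys.path.append('../')  # make the local kgwas package importable\n",
"\n",
"from kgwas import KGWAS, KGWAS_Data\n",
"\n",
"# Load the knowledge graph from the data directory.\n",
"data = KGWAS_Data(data_path='./data/')\n",
"data.load_kg()\n",
"\n",
"# Load and preprocess the example GWAS summary statistics.\n",
"data.load_external_gwas(example_file=True)\n",
"data.process_gwas_file()\n",
"data.prepare_split()\n",
"\n",
"# Load the pretrained model; change `device` to a GPU available on your machine.\n",
"run = KGWAS(data, device='cuda:9', exp_name='test')\n",
"run.load_pretrained('./data/model/test')\n",
"\n",
"# Compute edge weights and per-variant interpretation graphs.\n",
"# magma_path=None disables filtering of genes and gene programs.\n",
"df_network_weight, df_variant_interpretation, disease_critical_network = run.get_disease_critical_network(\n",
"    variant_threshold=5e-8, magma_path=None, magma_threshold=0.05,\n",
"    program_threshold=0.05, K_neighbors=3, num_cpus=1)"
]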
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>h_idx</th>\n",
" <th>t_idx</th>\n",
" <th>weight</th>\n",
" <th>h_type</th>\n",
" <th>rel_type</th>\n",
" <th>t_type</th>\n",
" <th>layer</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>11105.0</td>\n",
" <td>3.0</td>\n",
" <td>-0.005648</td>\n",
" <td>Gene</td>\n",
" <td>rev_ABC</td>\n",
" <td>SNP</td>\n",
" <td>l1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>11105.0</td>\n",
" <td>4.0</td>\n",
" <td>-0.006695</td>\n",
" <td>Gene</td>\n",
" <td>rev_ABC</td>\n",
" <td>SNP</td>\n",
" <td>l1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>10667.0</td>\n",
" <td>8.0</td>\n",
" <td>-0.007284</td>\n",
" <td>Gene</td>\n",
" <td>rev_ABC</td>\n",
" <td>SNP</td>\n",
" <td>l1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>11105.0</td>\n",
" <td>8.0</td>\n",
" <td>-0.007112</td>\n",
" <td>Gene</td>\n",
" <td>rev_ABC</td>\n",
" <td>SNP</td>\n",
" <td>l1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>10667.0</td>\n",
" <td>11.0</td>\n",
" <td>-0.007329</td>\n",
" <td>Gene</td>\n",
" <td>rev_ABC</td>\n",
" <td>SNP</td>\n",
" <td>l1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>1191.0</td>\n",
" <td>1147.0</td>\n",
" <td>0.004166</td>\n",
" <td>CellularComponent</td>\n",
" <td>rev_Gene-NotColocalizes-CellularComponent</td>\n",
" <td>Gene</td>\n",
" <td>l2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>2423.0</td>\n",
" <td>19025.0</td>\n",
" <td>0.439045</td>\n",
" <td>MolecularFunction</td>\n",
" <td>rev_Gene-NotContributes-MolecularFunction</td>\n",
" <td>Gene</td>\n",
" <td>l2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>768.0</td>\n",
" <td>12181.0</td>\n",
" <td>0.203226</td>\n",
" <td>MolecularFunction</td>\n",
" <td>rev_Gene-NotContributes-MolecularFunction</td>\n",
" <td>Gene</td>\n",
" <td>l2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>4511.0</td>\n",
" <td>12992.0</td>\n",
" <td>0.127140</td>\n",
" <td>MolecularFunction</td>\n",
" <td>rev_Gene-NotContributes-MolecularFunction</td>\n",
" <td>Gene</td>\n",
" <td>l2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>948.0</td>\n",
" <td>3794.0</td>\n",
" <td>0.421596</td>\n",
" <td>MolecularFunction</td>\n",
" <td>rev_Gene-NotContributes-MolecularFunction</td>\n",
" <td>Gene</td>\n",
" <td>l2</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>42809560 rows × 7 columns</p>\n",
"</div>"
],
"text/plain": [
" h_idx t_idx weight h_type \\\n",
"0 11105.0 3.0 -0.005648 Gene \n",
"1 11105.0 4.0 -0.006695 Gene \n",
"2 10667.0 8.0 -0.007284 Gene \n",
"3 11105.0 8.0 -0.007112 Gene \n",
"4 10667.0 11.0 -0.007329 Gene \n",
".. ... ... ... ... \n",
"10 1191.0 1147.0 0.004166 CellularComponent \n",
"0 2423.0 19025.0 0.439045 MolecularFunction \n",
"1 768.0 12181.0 0.203226 MolecularFunction \n",
"2 4511.0 12992.0 0.127140 MolecularFunction \n",
"3 948.0 3794.0 0.421596 MolecularFunction \n",
"\n",
" rel_type t_type layer \n",
"0 rev_ABC SNP l1 \n",
"1 rev_ABC SNP l1 \n",
"2 rev_ABC SNP l1 \n",
"3 rev_ABC SNP l1 \n",
"4 rev_ABC SNP l1 \n",
".. ... ... ... \n",
"10 rev_Gene-NotColocalizes-CellularComponent Gene l2 \n",
"0 rev_Gene-NotContributes-MolecularFunction Gene l2 \n",
"1 rev_Gene-NotContributes-MolecularFunction Gene l2 \n",
"2 rev_Gene-NotContributes-MolecularFunction Gene l2 \n",
"3 rev_Gene-NotContributes-MolecularFunction Gene l2 \n",
"\n",
"[42809560 rows x 7 columns]"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_network_weight"
]
},
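{
"cell_type": "markdown",
"metadata": {},
"source": [
"The full edge table has more than 42 million rows, so a per-relation summary can be easier to read than the raw rows. The following is a minimal sketch, not part of KGWAS: `mean_weight_by_relation` is a hypothetical helper, and it assumes the rows are plain dicts such as those returned by `df_network_weight.to_dict('records')`. The sample rows are copied from the preview above.\n",
"\n",
"```python\n",
"from collections import defaultdict\n",
"\n",
"# Sample rows copied from the df_network_weight preview above.\n",
"rows = [\n",
"    {'rel_type': 'rev_ABC', 'layer': 'l1', 'weight': -0.005648},\n",
"    {'rel_type': 'rev_ABC', 'layer': 'l1', 'weight': -0.006695},\n",
"    {'rel_type': 'rev_ABC', 'layer': 'l1', 'weight': -0.007284},\n",
"    {'rel_type': 'rev_ABC', 'layer': 'l1', 'weight': -0.007112},\n",
"    {'rel_type': 'rev_ABC', 'layer': 'l1', 'weight': -0.007329},\n",
"    {'rel_type': 'rev_Gene-NotContributes-MolecularFunction', 'layer': 'l2', 'weight': 0.439045},\n",
"    {'rel_type': 'rev_Gene-NotContributes-MolecularFunction', 'layer': 'l2', 'weight': 0.203226},\n",
"    {'rel_type': 'rev_Gene-NotContributes-MolecularFunction', 'layer': 'l2', 'weight': 0.127140},\n",
"    {'rel_type': 'rev_Gene-NotContributes-MolecularFunction', 'layer': 'l2', 'weight': 0.421596},\n",
"]\n",
"\n",
"\n",
"def mean_weight_by_relation(rows):\n",
"    '''Return the average edge weight for each (rel_type, layer) pair.'''\n",
"    sums = defaultdict(float)\n",
"    counts = defaultdict(int)\n",
"    for row in rows:\n",
"        key = (row['rel_type'], row['layer'])\n",
"        sums[key] += row['weight']\n",
"        counts[key] += 1\n",
"    return {key: sums[key] / counts[key] for key in sums}\n",
"\n",
"\n",
"summary = mean_weight_by_relation(rows)\n",
"for (rel_type, layer), mean in sorted(summary.items()):\n",
"    print(f'{rel_type} ({layer}): {mean:.6f}')\n",
"```\n",
"\n",
"On the real DataFrame, `df_network_weight.groupby(['rel_type', 'layer'])['weight'].mean()` should give the same summary without converting to dicts."
]
},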
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>h_idx</th>\n",
" <th>t_idx</th>\n",
" <th>importance</th>\n",
" <th>h_type</th>\n",
" <th>t_type</th>\n",
" <th>rel_type</th>\n",
" <th>h_id</th>\n",
" <th>t_id</th>\n",
" <th>QUERY_SNP</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>393735</th>\n",
" <td>4238.0</td>\n",
" <td>115005.0</td>\n",
" <td>0.707107</td>\n",
" <td>Gene</td>\n",
" <td>SNP</td>\n",
" <td>Exon</td>\n",
" <td>CPS1</td>\n",
" <td>rs1047891</td>\n",
" <td>rs1047891</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1098740</th>\n",
" <td>17670.0</td>\n",
" <td>4238.0</td>\n",
" <td>1.779486</td>\n",
" <td>Gene</td>\n",
" <td>Gene</td>\n",
" <td>PhysicalAssociation</td>\n",
" <td>RPL35</td>\n",
" <td>CPS1</td>\n",
" <td>rs1047891</td>\n",
" </tr>\n",
" <tr>\n",
" <th>573324</th>\n",
" <td>8800.0</td>\n",
" <td>4238.0</td>\n",
" <td>0.425998</td>\n",
" <td>Gene</td>\n",
" <td>Gene</td>\n",
" <td>PhysicalAssociation</td>\n",
" <td>HSPA5</td>\n",
" <td>CPS1</td>\n",
" <td>rs1047891</td>\n",
" </tr>\n",
" <tr>\n",
" <th>786046</th>\n",
" <td>13187.0</td>\n",
" <td>4238.0</td>\n",
" <td>0.019682</td>\n",
" <td>Gene</td>\n",
" <td>Gene</td>\n",
" <td>DosageLethality</td>\n",
" <td>MYC</td>\n",
" <td>CPS1</td>\n",
" <td>rs1047891</td>\n",
" </tr>\n",
" <tr>\n",
" <th>28855</th>\n",
" <td>13165.0</td>\n",
" <td>4238.0</td>\n",
" <td>-0.035167</td>\n",
" <td>BiologicalProcess</td>\n",
" <td>Gene</td>\n",
" <td>Associates</td>\n",
" <td>Nitric oxide metabolic process</td>\n",
" <td>CPS1</td>\n",
" <td>rs1047891</td>\n",
" </tr>\n",
" <tr>\n",
" <th>30283</th>\n",
" <td>14457.0</td>\n",
" <td>4238.0</td>\n",
" <td>-0.035197</td>\n",
" <td>BiologicalProcess</td>\n",
" <td>Gene</td>\n",
" <td>Associates</td>\n",
" <td>Homocysteine metabolic process</td>\n",
" <td>CPS1</td>\n",
" <td>rs1047891</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14116</th>\n",
" <td>6294.0</td>\n",
" <td>4238.0</td>\n",
" <td>-0.035242</td>\n",
" <td>BiologicalProcess</td>\n",
" <td>Gene</td>\n",
" <td>Associates</td>\n",
" <td>Triglyceride catabolic process</td>\n",
" <td>CPS1</td>\n",
" <td>rs1047891</td>\n",
" </tr>\n",
" <tr>\n",
" <th>280494</th>\n",
" <td>114935.0</td>\n",
" <td>4238.0</td>\n",
" <td>1.473419</td>\n",
" <td>SNP</td>\n",
" <td>Gene</td>\n",
" <td>PCHi-C</td>\n",
" <td>rs116492934</td>\n",
" <td>CPS1</td>\n",
" <td>rs1047891</td>\n",
" </tr>\n",
" <tr>\n",
" <th>280526</th>\n",
" <td>114958.0</td>\n",
" <td>4238.0</td>\n",
" <td>1.433980</td>\n",
" <td>SNP</td>\n",
" <td>Gene</td>\n",
" <td>PCHi-C</td>\n",
" <td>rs148584272</td>\n",
" <td>CPS1</td>\n",
" <td>rs1047891</td>\n",
" </tr>\n",
" <tr>\n",
" <th>280523</th>\n",
" <td>114956.0</td>\n",
" <td>4238.0</td>\n",
" <td>1.381602</td>\n",
" <td>SNP</td>\n",
" <td>Gene</td>\n",
" <td>PCHi-C</td>\n",
" <td>rs138424013</td>\n",
" <td>CPS1</td>\n",
" <td>rs1047891</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1611316</th>\n",
" <td>18584.0</td>\n",
" <td>197572.0</td>\n",
" <td>1.394467</td>\n",
" <td>Gene</td>\n",
" <td>SNP</td>\n",
" <td>VEP</td>\n",
" <td>SHROOM3</td>\n",
" <td>rs13146355</td>\n",
" <td>rs13146355</td>\n",
" </tr>\n",
" <tr>\n",
" <th>260136</th>\n",
" <td>2909.0</td>\n",
" <td>197572.0</td>\n",
" <td>0.884753</td>\n",
" <td>Gene</td>\n",
" <td>SNP</td>\n",
" <td>eQTL</td>\n",
" <td>CCDC158</td>\n",
" <td>rs13146355</td>\n",
" <td>rs13146355</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1729186</th>\n",
" <td>19980.0</td>\n",
" <td>197572.0</td>\n",
" <td>0.856848</td>\n",
" <td>Gene</td>\n",
" <td>SNP</td>\n",
" <td>eQTL</td>\n",
" <td>STBD1</td>\n",
" <td>rs13146355</td>\n",
" <td>rs13146355</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1122286</th>\n",
" <td>17788.0</td>\n",
" <td>18584.0</td>\n",
" <td>-0.000446</td>\n",
" <td>Gene</td>\n",
" <td>Gene</td>\n",
" <td>Literature</td>\n",
" <td>RPS6KA1</td>\n",
" <td>SHROOM3</td>\n",
" <td>rs13146355</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1164822</th>\n",
" <td>18443.0</td>\n",
" <td>18584.0</td>\n",
" <td>-0.000453</td>\n",
" <td>Gene</td>\n",
" <td>Gene</td>\n",
" <td>Literature</td>\n",
" <td>SFN</td>\n",
" <td>SHROOM3</td>\n",
" <td>rs13146355</td>\n",
" </tr>\n",
" <tr>\n",
" <th>333210</th>\n",
" <td>5512.0</td>\n",
" <td>18584.0</td>\n",
" <td>-0.000456</td>\n",
" <td>Gene</td>\n",
" <td>Gene</td>\n",
" <td>Literature</td>\n",
" <td>DYNLL1</td>\n",
" <td>SHROOM3</td>\n",
" <td>rs13146355</td>\n",
" </tr>\n",
" <tr>\n",
" <th>491797</th>\n",
" <td>7839.0</td>\n",
" <td>2909.0</td>\n",
" <td>0.628518</td>\n",
" <td>Gene</td>\n",
" <td>Gene</td>\n",
" <td>PhysicalAssociation</td>\n",
" <td>GPSM1</td>\n",
" <td>CCDC158</td>\n",
" <td>rs13146355</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1169157</th>\n",
" <td>18527.0</td>\n",
" <td>2909.0</td>\n",
" <td>0.563372</td>\n",
" <td>Gene</td>\n",
" <td>Gene</td>\n",
" <td>Signaling</td>\n",
" <td>SH3GLB2</td>\n",
" <td>CCDC158</td>\n",
" <td>rs13146355</td>\n",
" </tr>\n",
" <tr>\n",
" <th>430656</th>\n",
" <td>6932.0</td>\n",
" <td>2909.0</td>\n",
" <td>-0.000444</td>\n",
" <td>Gene</td>\n",
" <td>Gene</td>\n",
" <td>Literature</td>\n",
" <td>FOS</td>\n",
" <td>CCDC158</td>\n",
" <td>rs13146355</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1249313</th>\n",
" <td>20031.0</td>\n",
" <td>19980.0</td>\n",
" <td>0.457000</td>\n",
" <td>Gene</td>\n",
" <td>Gene</td>\n",
" <td>Reaction</td>\n",
" <td>STOM</td>\n",
" <td>STBD1</td>\n",
" <td>rs13146355</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1016826</th>\n",
" <td>16746.0</td>\n",
" <td>19980.0</td>\n",
" <td>0.312869</td>\n",
" <td>Gene</td>\n",
" <td>Gene</td>\n",
" <td>Reaction</td>\n",
" <td>RAB14</td>\n",
" <td>STBD1</td>\n",
" <td>rs13146355</td>\n",
" </tr>\n",
" <tr>\n",
" <th>443769</th>\n",
" <td>7132.0</td>\n",
" <td>19980.0</td>\n",
" <td>-0.000451</td>\n",
" <td>Gene</td>\n",
" <td>Gene</td>\n",
" <td>Literature</td>\n",
" <td>GABARAP</td>\n",
" <td>STBD1</td>\n",
" <td>rs13146355</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3204</th>\n",
" <td>1938.0</td>\n",
" <td>19980.0</td>\n",
" <td>-0.034033</td>\n",
" <td>BiologicalProcess</td>\n",
" <td>Gene</td>\n",
" <td>Associates</td>\n",
" <td>Glycogen catabolic process</td>\n",
" <td>STBD1</td>\n",
" <td>rs13146355</td>\n",
" </tr>\n",
" <tr>\n",
" <th>34520</th>\n",
" <td>17247.0</td>\n",
" <td>19980.0</td>\n",
" <td>-0.034096</td>\n",
" <td>BiologicalProcess</td>\n",
" <td>Gene</td>\n",
" <td>Associates</td>\n",
" <td>Glycophagy</td>\n",
" <td>STBD1</td>\n",
" <td>rs13146355</td>\n",
" </tr>\n",
" <tr>\n",
" <th>29598</th>\n",
" <td>13657.0</td>\n",
" <td>19980.0</td>\n",
" <td>-0.034503</td>\n",
" <td>BiologicalProcess</td>\n",
" <td>Gene</td>\n",
" <td>Associates</td>\n",
" <td>Intracellular transport</td>\n",
" <td>STBD1</td>\n",
" <td>rs13146355</td>\n",
" </tr>\n",
" <tr>\n",
" <th>449552</th>\n",
" <td>197686.0</td>\n",
" <td>18584.0</td>\n",
" <td>2.281075</td>\n",
" <td>SNP</td>\n",
" <td>Gene</td>\n",
" <td>PCHi-C</td>\n",
" <td>rs6829573</td>\n",
" <td>SHROOM3</td>\n",
" <td>rs13146355</td>\n",
" </tr>\n",
" <tr>\n",
" <th>449272</th>\n",
" <td>197568.0</td>\n",
" <td>18584.0</td>\n",
" <td>1.772625</td>\n",
" <td>SNP</td>\n",
" <td>Gene</td>\n",
" <td>PCHi-C</td>\n",
" <td>rs6836365</td>\n",
" <td>SHROOM3</td>\n",
" <td>rs13146355</td>\n",
" </tr>\n",
" <tr>\n",
" <th>449497</th>\n",
" <td>197656.0</td>\n",
" <td>18584.0</td>\n",
" <td>1.618933</td>\n",
" <td>SNP</td>\n",
" <td>Gene</td>\n",
" <td>PCHi-C</td>\n",
" <td>rs55650799</td>\n",
" <td>SHROOM3</td>\n",
" <td>rs13146355</td>\n",
" </tr>\n",
" <tr>\n",
" <th>449256</th>\n",
" <td>197565.0</td>\n",
" <td>2909.0</td>\n",
" <td>1.428451</td>\n",
" <td>SNP</td>\n",
" <td>Gene</td>\n",
" <td>PCHi-C</td>\n",
" <td>rs17319721</td>\n",
" <td>CCDC158</td>\n",
" <td>rs13146355</td>\n",
" </tr>\n",
" <tr>\n",
" <th>449433</th>\n",
" <td>197621.0</td>\n",
" <td>2909.0</td>\n",
" <td>1.427075</td>\n",
" <td>SNP</td>\n",
" <td>Gene</td>\n",
" <td>PCHi-C</td>\n",
" <td>rs1493360</td>\n",
" <td>CCDC158</td>\n",
" <td>rs13146355</td>\n",
" </tr>\n",
" <tr>\n",
" <th>449317</th>\n",
" <td>197577.0</td>\n",
" <td>2909.0</td>\n",
" <td>1.374755</td>\n",
" <td>SNP</td>\n",
" <td>Gene</td>\n",
" <td>PCHi-C</td>\n",
" <td>rs60871166</td>\n",
" <td>CCDC158</td>\n",
" <td>rs13146355</td>\n",
" </tr>\n",
" <tr>\n",
" <th>448552</th>\n",
" <td>197443.0</td>\n",
" <td>19980.0</td>\n",
" <td>1.292024</td>\n",
" <td>SNP</td>\n",
" <td>Gene</td>\n",
" <td>PCHi-C</td>\n",
" <td>rs7697073</td>\n",
" <td>STBD1</td>\n",
" <td>rs13146355</td>\n",
" </tr>\n",
" <tr>\n",
" <th>448562</th>\n",
" <td>197444.0</td>\n",
" <td>19980.0</td>\n",
" <td>1.273231</td>\n",
" <td>SNP</td>\n",
" <td>Gene</td>\n",
" <td>PCHi-C</td>\n",
" <td>rs79146874</td>\n",
" <td>STBD1</td>\n",
" <td>rs13146355</td>\n",
" </tr>\n",
" <tr>\n",
" <th>448575</th>\n",
" <td>197445.0</td>\n",
" <td>19980.0</td>\n",
" <td>1.256691</td>\n",
" <td>SNP</td>\n",
" <td>Gene</td>\n",
" <td>PCHi-C</td>\n",
" <td>rs12498327</td>\n",
" <td>STBD1</td>\n",
" <td>rs13146355</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" h_idx t_idx importance h_type t_type \\\n",
"393735 4238.0 115005.0 0.707107 Gene SNP \n",
"1098740 17670.0 4238.0 1.779486 Gene Gene \n",
"573324 8800.0 4238.0 0.425998 Gene Gene \n",
"786046 13187.0 4238.0 0.019682 Gene Gene \n",
"28855 13165.0 4238.0 -0.035167 BiologicalProcess Gene \n",
"30283 14457.0 4238.0 -0.035197 BiologicalProcess Gene \n",
"14116 6294.0 4238.0 -0.035242 BiologicalProcess Gene \n",
"280494 114935.0 4238.0 1.473419 SNP Gene \n",
"280526 114958.0 4238.0 1.433980 SNP Gene \n",
"280523 114956.0 4238.0 1.381602 SNP Gene \n",
"1611316 18584.0 197572.0 1.394467 Gene SNP \n",
"260136 2909.0 197572.0 0.884753 Gene SNP \n",
"1729186 19980.0 197572.0 0.856848 Gene SNP \n",
"1122286 17788.0 18584.0 -0.000446 Gene Gene \n",
"1164822 18443.0 18584.0 -0.000453 Gene Gene \n",
"333210 5512.0 18584.0 -0.000456 Gene Gene \n",
"491797 7839.0 2909.0 0.628518 Gene Gene \n",
"1169157 18527.0 2909.0 0.563372 Gene Gene \n",
"430656 6932.0 2909.0 -0.000444 Gene Gene \n",
"1249313 20031.0 19980.0 0.457000 Gene Gene \n",
"1016826 16746.0 19980.0 0.312869 Gene Gene \n",
"443769 7132.0 19980.0 -0.000451 Gene Gene \n",
"3204 1938.0 19980.0 -0.034033 BiologicalProcess Gene \n",
"34520 17247.0 19980.0 -0.034096 BiologicalProcess Gene \n",
"29598 13657.0 19980.0 -0.034503 BiologicalProcess Gene \n",
"449552 197686.0 18584.0 2.281075 SNP Gene \n",
"449272 197568.0 18584.0 1.772625 SNP Gene \n",
"449497 197656.0 18584.0 1.618933 SNP Gene \n",
"449256 197565.0 2909.0 1.428451 SNP Gene \n",
"449433 197621.0 2909.0 1.427075 SNP Gene \n",
"449317 197577.0 2909.0 1.374755 SNP Gene \n",
"448552 197443.0 19980.0 1.292024 SNP Gene \n",
"448562 197444.0 19980.0 1.273231 SNP Gene \n",
"448575 197445.0 19980.0 1.256691 SNP Gene \n",
"\n",
" rel_type h_id t_id \\\n",
"393735 Exon CPS1 rs1047891 \n",
"1098740 PhysicalAssociation RPL35 CPS1 \n",
"573324 PhysicalAssociation HSPA5 CPS1 \n",
"786046 DosageLethality MYC CPS1 \n",
"28855 Associates Nitric oxide metabolic process CPS1 \n",
"30283 Associates Homocysteine metabolic process CPS1 \n",
"14116 Associates Triglyceride catabolic process CPS1 \n",
"280494 PCHi-C rs116492934 CPS1 \n",
"280526 PCHi-C rs148584272 CPS1 \n",
"280523 PCHi-C rs138424013 CPS1 \n",
"1611316 VEP SHROOM3 rs13146355 \n",
"260136 eQTL CCDC158 rs13146355 \n",
"1729186 eQTL STBD1 rs13146355 \n",
"1122286 Literature RPS6KA1 SHROOM3 \n",
"1164822 Literature SFN SHROOM3 \n",
"333210 Literature DYNLL1 SHROOM3 \n",
"491797 PhysicalAssociation GPSM1 CCDC158 \n",
"1169157 Signaling SH3GLB2 CCDC158 \n",
"430656 Literature FOS CCDC158 \n",
"1249313 Reaction STOM STBD1 \n",
"1016826 Reaction RAB14 STBD1 \n",
"443769 Literature GABARAP STBD1 \n",
"3204 Associates Glycogen catabolic process STBD1 \n",
"34520 Associates Glycophagy STBD1 \n",
"29598 Associates Intracellular transport STBD1 \n",
"449552 PCHi-C rs6829573 SHROOM3 \n",
"449272 PCHi-C rs6836365 SHROOM3 \n",
"449497 PCHi-C rs55650799 SHROOM3 \n",
"449256 PCHi-C rs17319721 CCDC158 \n",
"449433 PCHi-C rs1493360 CCDC158 \n",
"449317 PCHi-C rs60871166 CCDC158 \n",
"448552 PCHi-C rs7697073 STBD1 \n",
"448562 PCHi-C rs79146874 STBD1 \n",
"448575 PCHi-C rs12498327 STBD1 \n",
"\n",
" QUERY_SNP \n",
"393735 rs1047891 \n",
"1098740 rs1047891 \n",
"573324 rs1047891 \n",
"786046 rs1047891 \n",
"28855 rs1047891 \n",
"30283 rs1047891 \n",
"14116 rs1047891 \n",
"280494 rs1047891 \n",
"280526 rs1047891 \n",
"280523 rs1047891 \n",
"1611316 rs13146355 \n",
"260136 rs13146355 \n",
"1729186 rs13146355 \n",
"1122286 rs13146355 \n",
"1164822 rs13146355 \n",
"333210 rs13146355 \n",
"491797 rs13146355 \n",
"1169157 rs13146355 \n",
"430656 rs13146355 \n",
"1249313 rs13146355 \n",
"1016826 rs13146355 \n",
"443769 rs13146355 \n",
"3204 rs13146355 \n",
"34520 rs13146355 \n",
"29598 rs13146355 \n",
"449552 rs13146355 \n",
"449272 rs13146355 \n",
"449497 rs13146355 \n",
"449256 rs13146355 \n",
"449433 rs13146355 \n",
"449317 rs13146355 \n",
"448552 rs13146355 \n",
"448562 rs13146355 \n",
"448575 rs13146355 "
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_variant_interpretation"
]
},
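{
"cell_type": "markdown",
"metadata": {},
"source": [
"To read the interpretation table one variant at a time, it can help to group the edges by `QUERY_SNP` and rank them by `importance`. The following is a minimal sketch, not KGWAS's own selection logic: `top_edges_per_snp` and its parameter `k` are hypothetical, and the rows are assumed to be plain dicts such as those returned by `df_variant_interpretation.to_dict('records')`. The sample rows are copied from the output above.\n",
"\n",
"```python\n",
"from collections import defaultdict\n",
"\n",
"# Sample rows copied from the df_variant_interpretation output above.\n",
"rows = [\n",
"    {'h_id': 'CPS1', 't_id': 'rs1047891', 'rel_type': 'Exon', 'importance': 0.707107, 'QUERY_SNP': 'rs1047891'},\n",
"    {'h_id': 'RPL35', 't_id': 'CPS1', 'rel_type': 'PhysicalAssociation', 'importance': 1.779486, 'QUERY_SNP': 'rs1047891'},\n",
"    {'h_id': 'rs116492934', 't_id': 'CPS1', 'rel_type': 'PCHi-C', 'importance': 1.473419, 'QUERY_SNP': 'rs1047891'},\n",
"    {'h_id': 'SHROOM3', 't_id': 'rs13146355', 'rel_type': 'VEP', 'importance': 1.394467, 'QUERY_SNP': 'rs13146355'},\n",
"    {'h_id': 'rs6829573', 't_id': 'SHROOM3', 'rel_type': 'PCHi-C', 'importance': 2.281075, 'QUERY_SNP': 'rs13146355'},\n",
"    {'h_id': 'FOS', 't_id': 'CCDC158', 'rel_type': 'Literature', 'importance': -0.000444, 'QUERY_SNP': 'rs13146355'},\n",
"]\n",
"\n",
"\n",
"def top_edges_per_snp(rows, k=2):\n",
"    '''Return the k highest-importance edges for each query SNP.'''\n",
"    grouped = defaultdict(list)\n",
"    for row in rows:\n",
"        grouped[row['QUERY_SNP']].append(row)\n",
"    return {\n",
"        snp: sorted(edges, key=lambda r: r['importance'], reverse=True)[:k]\n",
"        for snp, edges in grouped.items()\n",
"    }\n",
"\n",
"\n",
"for snp, edges in top_edges_per_snp(rows).items():\n",
"    for edge in edges:\n",
"        print(snp, edge['h_id'], edge['rel_type'], edge['t_id'], edge['importance'])\n",
"```\n",
"\n",
"Ranking by the signed `importance` value is a choice made for this sketch; whether the sign or the magnitude is the better ranking key depends on how the scores are defined in KGWAS."
]
},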
{
"cell_type": "markdown",
"metadata": {},
"source": [
"By default, no filtering is applied to the genes and gene programs, but we strongly recommend enabling it. To do so, pass the MAGMA result file to `magma_path`. To learn how to run MAGMA, see [this notebook](run_magma.ipynb)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "a100_env",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.0"
}
},
"nbformat": 4,
"nbformat_minor": 2
}