a b/tutorials/1_Ontology.ipynb
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# FEMR Ontology support\n",
    "\n",
    "FEMR provides support for querying ontologies using the OMOP Vocabulary.\n",
    "\n",
    "This makes it easier to define labeling functions and to generate better features."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Downloading the OMOP Vocabulary\n",
    "\n",
    "The OMOP Vocabulary can be downloaded for free from the [OHDSI ATHENA website](https://athena.ohdsi.org/)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Processing the OMOP Vocabulary\n",
    "\n",
    "`femr.ontology.Ontology` lets you process and then query the OMOP Vocabulary, optionally combining it with [code metadata from MEDS](https://github.com/Medical-Event-Data-Standard/meds/blob/e93f63a2f9642123c49a31ecffcdb84d877dc54a/src/meds/__init__.py#L94).\n",
    "\n",
    "```python\n",
    "import femr.ontology\n",
    "ontology = femr.ontology.Ontology(path_to_athena, code_metadata)\n",
    "```"
   ]
  },
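  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough sketch of what `code_metadata` might look like: the MEDS schema linked above appears to map each code string to an entry with optional `description` and `parent_codes` fields. The `CUSTOM/` codes below are hypothetical, and the exact shape FEMR expects is an assumption that should be checked against your installed versions of MEDS and FEMR.\n",
    "\n",
    "```python\n",
    "# Hypothetical site-specific codes; field names follow the linked MEDS schema (an assumption).\n",
    "code_metadata = {\n",
    "    'CUSTOM/ulcer_drug_order': {\n",
    "        'description': 'Local order code for peptic ulcer drugs',\n",
    "        'parent_codes': ['ATC/A02B'],\n",
    "    },\n",
    "    'CUSTOM/unmapped_lab': {\n",
    "        'description': 'Local lab code with no standard parent',\n",
    "        'parent_codes': [],\n",
    "    },\n",
    "}\n",
    "\n",
    "# Sanity-check the shape before handing it to the Ontology constructor.\n",
    "for code, entry in code_metadata.items():\n",
    "    assert isinstance(code, str)\n",
    "    assert set(entry) <= {'description', 'parent_codes'}\n",
    "```\n",
    "\n",
    "Under this assumption, linking a custom code to a standard parent such as `ATC/A02B` is what would let parent and child queries reach it."
   ]
  },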
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Working with an Ontology object\n",
    "\n",
    "The following code samples illustrate the main ways to use an `Ontology` object."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/home/esteinberg/miniconda3/envs/debug_document_femr/lib/python3.10/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
      "  from .autonotebook import tqdm as notebook_tqdm\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loaded ontology\n"
     ]
    }
   ],
   "source": [
    "import pickle\n",
    "\n",
    "# You can load / save ontology objects with pickle\n",
    "\n",
    "with open('input/meds/ontology.pkl', 'rb') as f:\n",
    "    ontology = pickle.load(f)\n",
    "\n",
    "print(\"Loaded ontology\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Generating train split: 200 examples [00:00, 34972.93 examples/s]\n",
      "Map: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 200/200 [00:00<00:00, 3282.29 examples/s]\n"
     ]
    }
   ],
   "source": [
    "# Vocabularies downloaded from Athena tend to be very large because they contain many codes, including many that are no longer used.\n",
    "# We therefore provide a function to prune an ontology to a particular dataset of interest.\n",
    "# This makes it much cheaper to store and use an ontology object, in terms of both disk space and RAM.\n",
    "\n",
    "import datasets\n",
    "dataset = datasets.Dataset.from_parquet(\"input/meds/data/*\")\n",
    "\n",
    "ontology.prune_to_dataset(dataset)"
   ]
  },
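  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the idea of pruning concrete, here is a minimal toy sketch. It assumes pruning means keeping every code that occurs in the dataset plus all of its ancestors. The `parents` map and the `prune` helper are hypothetical illustrations, not FEMR's internal representation, and the real `prune_to_dataset` may behave differently.\n",
    "\n",
    "```python\n",
    "# Toy child -> parents map; NOT FEMR's internal representation.\n",
    "parents = {\n",
    "    'RxNorm/7815': {'ATC/A02BX'},\n",
    "    'ATC/A02BX': {'ATC/A02B'},\n",
    "    'ATC/A02B': {'ATC/A02'},\n",
    "    'ATC/A02': {'ATC/A'},\n",
    "    'ATC/B01': {'ATC/B'},  # never referenced by the toy dataset below\n",
    "}\n",
    "\n",
    "def prune(parents, codes_in_dataset):\n",
    "    # Keep every code in the dataset plus all of its ancestors.\n",
    "    keep = set()\n",
    "    stack = list(codes_in_dataset)\n",
    "    while stack:\n",
    "        code = stack.pop()\n",
    "        if code not in keep:\n",
    "            keep.add(code)\n",
    "            stack.extend(parents.get(code, ()))\n",
    "    return {c: ps for c, ps in parents.items() if c in keep}\n",
    "\n",
    "pruned = prune(parents, {'RxNorm/7815'})\n",
    "print(sorted(pruned))\n",
    "```\n",
    "\n",
    "In this toy example the unused `ATC/B01` branch is dropped, which is the kind of saving that makes a pruned ontology cheaper to store and load."
   ]
  },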
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Description DRUGS FOR PEPTIC ULCER AND GASTRO-OESOPHAGEAL REFLUX DISEASE (GORD)\n",
      "Parents {'ATC/A02'}\n",
      "Children {'ATC/A02BX'}\n",
      "All children {'RxNorm/2344', 'ATC/A02BX', 'RxNorm/4501', 'ATC/A02BX71', 'ATC/A02B', 'RxNorm/7815', 'RxNorm/7019', 'ATC/A02BX77', 'RxNorm/2353', 'RxNorm/8705', 'RxNorm/38574', 'RxNorm/2620', 'RxNorm/2018', 'RxNorm/8704', 'RxNorm/8730', 'RxNorm/6852', 'RxNorm/2017', 'RxNorm/2403'}\n",
      "All parents {'ATC/A', 'ATC/A02', 'ATC/A02B'}\n"
     ]
    }
   ],
   "source": [
    "# First, we can query the description for a particular code\n",
    "print(\"Description\", ontology.get_description(\"ATC/A02B\"))\n",
    "\n",
    "# Second, we can search for the parents of a particular code\n",
    "print(\"Parents\", ontology.get_parents(\"ATC/A02B\"))\n",
    "\n",
    "# Third, we can search for the children of a particular code\n",
    "print(\"Children\", ontology.get_children(\"ATC/A02B\"))\n",
    "\n",
    "# For convenience, we also support recursive versions of the parent and child queries.\n",
    "# As the output shows, both recursive results include the queried code itself.\n",
    "print(\"All children\", ontology.get_all_children(\"ATC/A02B\"))\n",
    "print(\"All parents\", ontology.get_all_parents(\"ATC/A02B\"))"
   ]
  }
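  ,
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One way to think about the recursive queries is as a graph traversal over the direct parent and child links. The sketch below is a toy illustration under that assumption. The `children` map and the `all_descendants` helper are hypothetical and are not how FEMR necessarily implements `get_all_children`. It includes the starting code in the result, matching the output above, where `ATC/A02B` appears in its own recursive results.\n",
    "\n",
    "```python\n",
    "# Toy parent -> children map; NOT FEMR's internal representation.\n",
    "children = {\n",
    "    'ATC/A': {'ATC/A02'},\n",
    "    'ATC/A02': {'ATC/A02B'},\n",
    "    'ATC/A02B': {'ATC/A02BX'},\n",
    "    'ATC/A02BX': {'RxNorm/7815'},\n",
    "}\n",
    "\n",
    "def all_descendants(children, code):\n",
    "    # Depth-first traversal that includes the starting code itself.\n",
    "    seen = set()\n",
    "    stack = [code]\n",
    "    while stack:\n",
    "        current = stack.pop()\n",
    "        if current not in seen:\n",
    "            seen.add(current)\n",
    "            stack.extend(children.get(current, ()))\n",
    "    return seen\n",
    "\n",
    "print(sorted(all_descendants(children, 'ATC/A02B')))\n",
    "```\n",
    "\n",
    "The same traversal over a child-to-parents map would give the recursive parent query."
   ]
  }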
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.14"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}