Diff of /ehragent/prompts_eicu.py [000000] .. [6cf5c7]

CodeHeader = """from tools import tabtools, calculator
Calculate = calculator.WolframAlphaCalculator
LoadDB = tabtools.db_loader
FilterDB = tabtools.data_filter
GetValue = tabtools.get_value
SQLInterpreter = tabtools.sql_interpreter
Calendar = tabtools.date_calculator
"""

RetrKnowledge = """Read the following data descriptions and generate background knowledge as context information that could be helpful for answering the question.
(1) Data include vital signs, laboratory measurements, medications, APACHE components, care plan information, admission diagnosis, patient history, time-stamped diagnoses from a structured problem list, and similarly chosen treatments.
(2) Data from each patient is collected into a common warehouse only if certain “interfaces” are available. Each interface is used to transform and load a certain type of data: vital sign interfaces incorporate vital signs, laboratory interfaces provide measurements on blood samples, and so on.
(3) It is important to be aware that different care units may have different interfaces in place, and that the lack of an interface will result in no data being available for a given patient, even if those measurements were made in reality. The data is provided as a relational database, comprising multiple tables joined by keys.
(4) All the databases are used to record information associated with patient care, such as allergy, cost, diagnosis, intakeoutput, lab, medication, microlab, patient, treatment, vitalperiodic.
The tables contain the following information:
(1) allergy: allergyid, patientunitstayid, drugname, allergyname, allergytime
(2) cost: costid, uniquepid, patienthealthsystemstayid, eventtype, eventid, chargetime, cost
(3) diagnosis: diagnosisid, patientunitstayid, icd9code, diagnosisname, diagnosistime
(4) intakeoutput: intakeoutputid, patientunitstayid, cellpath, celllabel, cellvaluenumeric, intakeoutputtime
(5) lab: labid, patientunitstayid, labname, labresult, labresulttime
(6) medication: medicationid, patientunitstayid, drugname, dosage, routeadmin, drugstarttime, drugstoptime
(7) microlab: microlabid, patientunitstayid, culturesite, organism, culturetakentime
(8) patient: patientunitstayid, patienthealthsystemstayid, gender, age, ethnicity, hospitalid, wardid, admissionheight, hospitaladmitsource, hospitaldischargestatus, admissionweight, dischargeweight, uniquepid, hospitaladmittime, unitadmittime, unitdischargetime, hospitaldischargetime
(9) treatment: treatmentid, patientunitstayid, treatmentname, treatmenttime
(10) vitalperiodic: vitalperiodicid, patientunitstayid, temperature, sao2, heartrate, respiration, systemicsystolic, systemicdiastolic, systemicmean, observationtime

Question: was the fluticasone-salmeterol 250-50 mcg/dose in aepb prescribed to patient 035-2205 on their current hospital encounter?
Knowledge:
- We can find the patient 035-2205 information in the patient database.
- As fluticasone-salmeterol 250-50 mcg/dose in aepb is a drug, we can find the drug information in the medication database.
- We can find the patientunitstayid in the patient database and use it to find the drug prescription information in the medication database.

Question: in the last hospital encounter, when was patient 031-22988's first microbiology test time?
Knowledge:
- We can find the patient 031-22988 information in the patient database.
- We can find the microbiology test information in the microlab database.
- We can find the patientunitstayid in the patient database and use it to find the microbiology test information in the microlab database.

Question: what is the minimum hospital cost for a drug with a name called albumin 5% since 6 years ago?
Knowledge:
- As albumin 5% is a drug, we can find the drug information in the medication database.
- We can find the patientunitstayid in the medication database and use it to find the patienthealthsystemstayid information in the patient database.
- We can use the patienthealthsystemstayid information to find the cost information in the cost database.

Question: what are the number of patients who have had a magnesium test the previous year?
Knowledge:
- As magnesium is a lab test, we can find the lab test information in the lab database.
- We can find the patientunitstayid in the lab database and use it to find the patient information in the patient database.

Question: {question}
Knowledge:
"""

SYSTEM_PROMPT = """You are a helpful AI assistant. Solve tasks using your coding and language skills.
In the following cases, suggest python code (in a python coding block) or shell script (in a sh
coding block) for the user to execute.
1. When you need to collect info, use the code to output the info you need, for example, browse or
search the web, download/read a file, print the content of a webpage or a file, get the current
date/time. After sufficient info is printed and the task is ready to be solved based on your
language skill, you can solve the task by yourself.
2. When you need to perform some task with code, use the code to perform the task and output the
result. Finish the task smartly.
Solve the task step by step if you need to. If a plan is not provided, explain your plan first. Be
clear which step uses code, and which step uses your language skill.
When using code, you must indicate the script type in the code block. The user cannot provide any
other feedback or perform any other action beyond executing the code you suggest. The user can't
modify your code. So do not suggest incomplete code which requires users to modify. Don't use a
code block if it's not intended to be executed by the user.
If you want the user to save the code in a file before executing it, put # filename: <filename>
inside the code block as the first line. Don't include multiple code blocks in one response. Do not
ask users to copy and paste the result. Instead, use 'print' function for the output when relevant.
Check the execution result returned by the user.
If the result indicates there is an error, fix the error and output the code again. Suggest the
full code instead of partial code or code changes. If the error can't be fixed or if the task is
not solved even after the code is executed successfully, analyze the problem, revisit your
assumption, collect additional info you need, and think of a different approach to try.
When you find an answer, verify the answer carefully. Include verifiable evidence in your response
if possible.
Reply "TERMINATE" in the end when everything is done."""

EHRAgent_Message_Prompt = """Assume you have knowledge of several tables:
(1) Tables are linked by identifiers which usually have the suffix 'ID'. For example, SUBJECT_ID refers to a unique patient, HADM_ID refers to a unique admission to the hospital, and ICUSTAY_ID refers to a unique admission to an intensive care unit.
(2) Charted events such as notes, laboratory tests, and fluid balance are stored in a series of 'events' tables. For example the outputevents table contains all measurements related to output for a given patient, while the labevents table contains laboratory test results for a patient.
(3) Tables prefixed with 'd_' are dictionary tables and provide definitions for identifiers. For example, every row of chartevents is associated with a single ITEMID which represents the concept measured, but it does not contain the actual name of the measurement. By joining chartevents and d_items on ITEMID, it is possible to identify the concept represented by a given ITEMID.
(4) For the databases, four of them are used to define and track patient stays: admissions, patients, icustays, and transfers. Another four tables are dictionaries for cross-referencing codes against their respective definitions: d_icd_diagnoses, d_icd_procedures, d_items, and d_labitems. The remaining tables, including chartevents, cost, inputevents_cv, labevents, microbiologyevents, outputevents, prescriptions, procedures_icd, contain data associated with patient care, such as physiological measurements, caregiver observations, and billing information.
Write Python code to solve the given question. You can use the following functions:
(1) Calculate(FORMULA), which calculates the FORMULA and returns the result.
(2) LoadDB(DBNAME) which loads the database DBNAME and returns the database. The DBNAME can be one of the following: allergy, cost, diagnosis, intakeoutput, lab, medication, microlab, patient, treatment, vitalperiodic.
(3) FilterDB(DATABASE, CONDITIONS), which filters the DATABASE according to the CONDITIONS and returns the filtered database. The CONDITIONS is a string composed of multiple conditions, each of which consists of the column_name, the relation and the value (e.g., COST<10). The CONDITIONS is one single string (e.g., "admissions, SUBJECT_ID=24971"). Different conditions are separated by '||'.
(4) GetValue(DATABASE, ARGUMENT), which returns a string containing all the values of the column in the DATABASE (if multiple values, separated by ", "). When there are no additional operations on the values, the ARGUMENT is the column_name in demand. If the values need to be returned with certain operations, the ARGUMENT is composed of the column_name and the operation (like COST, sum). Please do not include " or ' in the argument.
(5) SQLInterpreter(SQL), which interprets the query SQL and returns the result.
(6) Calendar(DURATION), which returns the date after the duration of time.
Use the variable 'answer' to store the answer of the code. Here are some examples:
{examples}
(END OF EXAMPLES)
Knowledge:
{knowledge}
Question: {question}
Solution: """

DEFAULT_USER_PROXY_AGENT_DESCRIPTIONS = {
    "ALWAYS": "An attentive HUMAN user who can answer questions about the task, and can perform tasks such as running Python code or inputting command line commands at a Linux terminal and reporting back the execution results.",
    "TERMINATE": "A user that can run Python code or input command line commands at a Linux terminal and report back the execution results.",
    "NEVER": "A user that can run Python code or input command line commands at a Linux terminal and report back the execution results.",
}

CodeDebugger = """Given a question:
{question}
The user has written code with the following functions:
(1) Calculate(FORMULA), which calculates the FORMULA and returns the result.
(2) LoadDB(DBNAME) which loads the database DBNAME and returns the database. The DBNAME can be one of the following: allergy, cost, diagnosis, intakeoutput, lab, medication, microlab, patient, treatment, vitalperiodic.
(3) FilterDB(DATABASE, CONDITIONS), which filters the DATABASE according to the CONDITIONS. The CONDITIONS is a string composed of multiple conditions, each of which consists of the column_name, the relation and the value (e.g., COST<10). The CONDITIONS is one single string (e.g., "admissions, SUBJECT_ID=24971"). Different conditions are separated by '||'.
(4) GetValue(DATABASE, ARGUMENT), which returns the values of the column in the DATABASE. When there are no additional operations on the values, the ARGUMENT is the column_name in demand. If the values need to be returned with certain operations, the ARGUMENT is composed of the column_name and the operation (like COST, sum). Please do not include " or ' in the argument.
(5) SQLInterpreter(SQL), which interprets the query SQL and returns the result.
(6) Calendar(DURATION), which returns the date after the duration of time.

The code is as follows:
{code}

The execution result is:
{error_info}

Please check the code and point out the most likely cause of the error.
"""
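The placeholders in CodeDebugger ({question}, {code}, {error_info}) suggest that the template is filled with str.format. The calling code is not part of this file, so the following is only a minimal sketch: DEBUG_TEMPLATE is a shortened stand-in for the real template, and the question, code and error values are made up for illustration.

```python
# Shortened stand-in for CodeDebugger; only the placeholder names match the real template.
DEBUG_TEMPLATE = """Given a question:
{question}
The code is as follows:
{code}
The execution result is:
{error_info}
"""

# Hypothetical failing snippet. Its literal '{}' survives formatting because
# str.format does not re-scan the values it substitutes.
code = "filtered = FilterDB(db, 'patientunitstayid={}'.format(stay_id))"

prompt = DEBUG_TEMPLATE.format(
    question="how many patients had a magnesium test?",
    code=code,
    error_info="NameError: name 'stay_id' is not defined",
)
print(prompt)
```

The same property matters for the few-shot examples: their solutions contain literal '{}' fields, so they can be passed as a value for a placeholder such as {examples}, but calling .format() on the examples string itself would fail.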

EHRAgent_4Shots_Knowledge = """Question: was the fluticasone-salmeterol 250-50 mcg/dose in aepb prescribed to patient 035-2205 on their current hospital encounter?
Knowledge:
- We can find the patient 035-2205 information in the patient database.
- As fluticasone-salmeterol 250-50 mcg/dose in aepb is a drug, we can find the drug information in the medication database.
- We can find the patientunitstayid in the patient database and use it to find the drug prescription information in the medication database.
Solution: patient_db = LoadDB('patient')
filtered_patient_db = FilterDB(patient_db, 'uniquepid=035-2205||hospitaldischargetime=null')
patientunitstayid = GetValue(filtered_patient_db, 'patientunitstayid')
medication_db = LoadDB('medication')
filtered_medication_db = FilterDB(medication_db, 'patientunitstayid={}||drugname=fluticasone-salmeterol 250-50 mcg/dose in aepb'.format(patientunitstayid))
if len(filtered_medication_db) > 0:
    answer = 1
else:
    answer = 0

Question: in the last hospital encounter, when was patient 031-22988's first microbiology test time?
Knowledge:
- We can find the patient 031-22988 information in the patient database.
- We can find the microbiology test information in the microlab database.
- We can find the patientunitstayid in the patient database and use it to find the microbiology test information in the microlab database.
Solution: patient_db = LoadDB('patient')
filtered_patient_db = FilterDB(patient_db, 'uniquepid=031-22988||max(hospitaladmittime)')
patientunitstayid = GetValue(filtered_patient_db, 'patientunitstayid')
microlab_db = LoadDB('microlab')
filtered_microlab_db = FilterDB(microlab_db, 'patientunitstayid={}||min(culturetakentime)'.format(patientunitstayid))
culturetakentime = GetValue(filtered_microlab_db, 'culturetakentime')
answer = culturetakentime

Question: what is the minimum hospital cost for a drug with a name called albumin 5% since 6 years ago?
Knowledge:
- As albumin 5% is a drug, we can find the drug information in the medication database.
- We can find the patientunitstayid in the medication database and use it to find the patienthealthsystemstayid information in the patient database.
- We can use the patienthealthsystemstayid information to find the cost information in the cost database.
Solution: date = Calendar('-6 year')
medication_db = LoadDB('medication')
filtered_medication_db = FilterDB(medication_db, 'drugname=albumin 5%')
patientunitstayid_list = GetValue(filtered_medication_db, 'patientunitstayid, list')
patient_db = LoadDB('patient')
filtered_patient_db = FilterDB(patient_db, 'patientunitstayid in {}'.format(patientunitstayid_list))
patienthealthsystemstayid_list = GetValue(filtered_patient_db, 'patienthealthsystemstayid, list')
cost_db = LoadDB('cost')
min_cost = 1e9
for patienthealthsystemstayid in patienthealthsystemstayid_list:
    filtered_cost_db = FilterDB(cost_db, 'patienthealthsystemstayid={}||chargetime>{}'.format(patienthealthsystemstayid, date))
    cost = GetValue(filtered_cost_db, 'cost, sum')
    if cost < min_cost:
        min_cost = cost
answer = min_cost

Question: what are the number of patients who have had a magnesium test the previous year?
Knowledge:
- As magnesium is a lab test, we can find the lab test information in the lab database.
- We can find the patientunitstayid in the lab database and use it to find the patient information in the patient database.
Solution: answer = SQLInterpreter("select count( distinct patient.uniquepid ) from patient where patient.patientunitstayid in ( select lab.patientunitstayid from lab where lab.labname = 'magnesium' and datetime(lab.labresulttime,'start of year') = datetime(current_time,'start of year','-1 year') )")
"""
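The few-shot solutions pass FilterDB a single string of conditions joined by '||', mixing comparisons (uniquepid=031-22988), membership tests (patientunitstayid in ...) and aggregates (max(hospitaladmittime)). The real parsing lives in tabtools.data_filter, which is not part of this file. The hypothetical parse_conditions helper below is only a sketch of how such a string could be split into (column, relation, value) triples, under the assumption that '||' never appears inside a value.

```python
import re

# Aggregate selector such as max(hospitaladmittime) or min(culturetakentime).
_AGG = re.compile(r"^(min|max)\((\w+)\)$")
# column, relation, value; 'in' only counts as a relation when preceded by whitespace.
_CMP = re.compile(r"^(\w+)\s*(<=|>=|<|>|=|(?<=\s)in\b)\s*(.+)$")


def parse_conditions(conditions):
    """Split a '||'-separated condition string into (column, relation, value) triples.

    Hypothetical helper for illustration; not the tabtools implementation.
    """
    parsed = []
    for part in conditions.split("||"):
        part = part.strip()
        agg = _AGG.match(part)
        if agg:
            # Aggregates carry no comparison value.
            parsed.append((agg.group(2), agg.group(1), None))
            continue
        cmp_match = _CMP.match(part)
        if cmp_match is None:
            raise ValueError("unrecognised condition: {!r}".format(part))
        column, relation, value = cmp_match.groups()
        parsed.append((column, relation, value.strip()))
    return parsed


print(parse_conditions("uniquepid=031-22988||max(hospitaladmittime)"))
```

Values are kept as raw strings here; converting them to numbers, dates or lists is left out because the prompts do not specify how the real tool types them.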