# Medical Question Answering
## Overview of directory
MedicalQADataset.ipynb provides instructions for loading and inspecting a collection of medical question answering datasets.
HeadQA_tutorial.ipynb is a tutorial notebook that applies BertForMultipleChoice to the HeadQA dataset.
Both notebooks were developed in Google Colab.
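As background for the tutorial, here is a minimal sketch of how BertForMultipleChoice scores a question against its candidate choices; the question and choices below are made up for illustration, and the classification head is untrained until the notebook fine-tunes it:

```python
import torch
from transformers import BertTokenizer, BertForMultipleChoice

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMultipleChoice.from_pretrained("bert-base-uncased")

# Illustrative question/choices; HeadQA items look similar.
question = "Which vitamin deficiency causes scurvy?"
choices = ["Vitamin A", "Vitamin B12", "Vitamin C", "Vitamin D"]

# Pair the question with every choice; BertForMultipleChoice expects
# tensors of shape (batch_size, num_choices, seq_len).
encoding = tokenizer([question] * len(choices), choices,
                     return_tensors="pt", padding=True)
inputs = {k: v.unsqueeze(0) for k, v in encoding.items()}

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, num_choices)
print(choices[logits.argmax(dim=-1).item()])
```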
## Collection of Medical QA Datasets
### HeadQA
HEAD-QA: A Healthcare Dataset for Complex Reasoning [paper](https://aclanthology.org/P19-1092.pdf)
- Corpus of multiple choice questions
- Dataset for general reasoning (given a question and 4/5 choices, select the correct choice)
- In training set, each question has 5 choices
- In validation and test set, each question has 4 choices
- Answer choices are not drawn from given passages/contexts, i.e., the task is not span extraction as in SQuAD
- Available in English and Spanish
- Source: annual exams taken to apply for specialization positions in the Spanish public healthcare system
- Healthcare areas: medicine, pharmacology, psychology, nursing, biology, chemistry
- Data can be downloaded from the Hugging Face datasets hub (loading sketch below)
- train/val/test split: 2657/2742/1366
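A hedged loading sketch with the `datasets` library; the configuration and field names (`qtext`, `answers`, `ra`) follow the Hugging Face `head_qa` dataset card, so verify them against your installed version:

```python
from datasets import load_dataset

# "en" and "es" configurations are available.
head_qa = load_dataset("head_qa", "en")
print(head_qa)  # train/validation/test splits

example = head_qa["train"][0]
print(example["qtext"])            # question text
for answer in example["answers"]:  # candidate choices
    print(answer["aid"], answer["atext"])
print("right answer id:", example["ra"])
```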
### BioASQ
- Task Synergy on Biomedical Semantic QA for Covid-19
- Given unanswered questions, models provide answers, which are then evaluated by experts
- Involves IR, QA, summarization, etc.
- Task A: Large-Scale Online Biomedical Semantic Indexing
- Classify new PubMed documents before PubMed curators annotate them manually
- Classes come from MeSH hierarchy
- Task B: Biomedical Semantic QA
- 4 types of questions:
- Yes/No
- Factoid: requires a particular entity name, number or short expression
- List: a list of entity names, numbers or short expressions
- Summary: produce a short text summarization of the most relevant information
- For Task 9b, there are 3,743 questions (1,091 factoid, 1,033 yes/no, 899 summary, 719 list); a parsing sketch follows
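A hedged sketch of iterating over the BioASQ Task B training JSON; the file name is illustrative, and the key names (`questions`, `type`, `body`) follow the task's published format, so check them against the release you download:

```python
import json
from collections import Counter

# File name is illustrative; use the file from the BioASQ download area.
with open("BioASQ-training9b.json") as f:
    data = json.load(f)

# Count questions per type (factoid / yesno / summary / list).
print(Counter(q["type"] for q in data["questions"]))

for q in data["questions"][:3]:
    print(q["type"], "-", q["body"])
```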
### MedQuAD
A question-entailment approach to question answering [paper](https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-3119-4)
- Proposes RQE (Recognizing Question Entailment): answer a new question by retrieving entailed questions that already have associated answers
- Dataset:
- Collected from 12 trusted medical websites, one per folder; each folder contains multiple XML files (parsing sketch below)
- Contains question-answer pairs
- Task: generate a ranked list of answers for a given premise question by ranking the recognized hypothesis questions
- Evaluation: compare hybrid entailment-based approach, the IR method, and other QA systems participating in LiveQA
- Remark:
- The dataset is essentially auxiliary to LiveQA; it serves as the pool of hypothesis questions
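A hedged sketch of reading one MedQuAD XML file with the standard library; the file path and the tag names (`QAPair`, `Question`, `Answer`) are assumptions based on the release's layout, so verify them against your copy:

```python
import xml.etree.ElementTree as ET

# Path is illustrative; each of the 12 source folders holds files like this.
tree = ET.parse("1_CancerGov_QA/0000001_1.xml")
root = tree.getroot()

for pair in root.iter("QAPair"):          # assumed tag name
    question = pair.findtext("Question")  # assumed tag name
    answer = pair.findtext("Answer")      # assumed tag name
    print(question, "->", (answer or "")[:80])
```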
### LiveQA
Overview of the Medical Question Answering Task at TREC 2017 LiveQA
- Task: provide automatic answers to consumer health questions received by the U.S. National Library of Medicine (question posts; each question can have multiple sub-questions)
- Systems are free to use any medical website to find relevant answers
- Dataset:
- 2 training sets with 634 pairs of medical questions and answers in total
- QA pairs constructed from FAQs on trusted websites of NIH
- Additional annotations for Question Focus and Question Type for each subquestion
- 1st training set contains 388 question-answer pairs, corresponding to 200 NLM questions
- 2nd training set contains 246 question-answer pairs, corresponding to 246 NLM questions (retrieved manually)
- Test set contains 104 NLM questions (subquestion focus and type annotations not provided)
- For each test question, one or more reference answers are manually collected
### MEDIQA
Overview of the MEDIQA 2019 Shared Task on Textual Inference, Question Entailment and Question Answering
- 3 subtasks:
- Natural Language Inference (NLI)
- classify the relationship between two sentences as Entailment, Neutral, or Contradiction
- Recognizing Question Entailment (RQE)
- “a question A entails a question B if every answer to B is also a complete or partial answer to A”
- Question Answering (QA)
- filter and improve the ranking of automatically retrieved answers; input ranks are generated by CHiQA
- Task 1 dataset
- MedNLI, derived from MIMIC-III; access via GCP or AQS
- 14,049 text-hypothesis training pairs, 405 test pairs.
- Tutorial notebook in inference directory
- Task 2 dataset
- available on GitHub as XML files; train/val/test = 8,890/302/230
- Task 3 dataset
- available on GitHub; each answer carries {system\_rank, reference\_rank, reference\_score}
- two training datasets:
- 104 consumer health questions from LiveQA, 839 answers retrieved by CHiQA and manually rated and re-ranked
- 104 simple questions about the most frequent diseases (from Alexa), 862 answers
- validation set:
- 25 consumer health questions, 234 answers returned by CHiQA and judged manually
- test set:
- 150 consumer health questions, 1,107 answers
- MedQuAD (47k pairs) can be used to retrieve answered questions that are entailed by the original questions
- Evaluations:
- Task 1 & 2: accuracy
- Task 3: accuracy, Mean Reciprocal Rank (MRR), Precision, and Spearman’s Rank Correlation Coefficient (Spearman’s Rho); an MRR sketch follows
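For reference, Mean Reciprocal Rank averages the reciprocal rank of the first relevant answer per question; a small self-contained sketch:

```python
def mean_reciprocal_rank(ranked_relevance):
    """ranked_relevance: one list of 0/1 relevance judgments per question,
    ordered by the system's ranking (index 0 = top-ranked answer)."""
    total = 0.0
    for judgments in ranked_relevance:
        for rank, relevant in enumerate(judgments, start=1):
            if relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_relevance)

# Two questions: first relevant answer at rank 2 and at rank 1.
print(mean_reciprocal_rank([[0, 1, 0], [1, 0]]))  # (1/2 + 1) / 2 = 0.75
```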
### Medication_QA_MedInfo2019
- Answering consumer health questions about medications/drugs
- Dataset
- Selected anonymized consumer questions submitted to MedlinePlus
- Questions annotated with Question Focus and Question Type
- Reference answers annotated by manually retrieving a correct and complete answer, together with its URL and section title
- xlsx file with columns = {Question, Focus (Drug), Question Type, Answer, Section Title, URL}; a loading sketch follows
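A minimal loading sketch with pandas; the file name is illustrative, and reading .xlsx files requires the openpyxl engine:

```python
import pandas as pd

# File name is illustrative; use the spreadsheet from the dataset release.
df = pd.read_excel("MedInfo2019-QA-Medications.xlsx")

print(df.columns.tolist())
# Expected: ['Question', 'Focus (Drug)', 'Question Type', 'Answer',
#            'Section Title', 'URL']
print(df["Question Type"].value_counts())
```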
### BiQA
Generating Biomedical Question Answering Corpora From Q&A Forums [paper](https://ieeexplore.ieee.org/abstract/document/9184044)
- Corpus of question-article pairs
- Dataset for information retrieval (given a question, find relevant articles)
- Source: questions selected from popular questions in public forums (Biology and Medical Sciences from Stackexchange, Nutrition from Reddit)
- 7,453 questions and 14,239 question-article pairs
- csv files, one for each topic, three in total
- For each question, the PubMed ID and title of each answer article are recorded in the CSV
- Code for getting and filtering additional posts available
- Code for retrieving documents using PMIDs is available; a hedged retrieval sketch follows
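One way to fetch an abstract for a PMID is the public NCBI E-utilities efetch endpoint; a hedged sketch (the PMID is illustrative, and this is not necessarily the retrieval method used by the BiQA code):

```python
import requests

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

def fetch_abstract(pmid):
    """Fetch a plain-text abstract for one PubMed ID."""
    params = {"db": "pubmed", "id": str(pmid),
              "rettype": "abstract", "retmode": "text"}
    resp = requests.get(EFETCH, params=params, timeout=30)
    resp.raise_for_status()
    return resp.text

print(fetch_abstract(31462809)[:300])  # PMID is illustrative
```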
### MASHQA
Multiple Answer Spans Healthcare Question Answering dataset from the consumer health domain [paper](https://people.cs.vt.edu/mingzhu/papers/conf/emnlp2020.pdf)
- Dataset:
- Context, question, and multiple highlighted spans of the context as answers; all the spans together form one complete answer
- Source: queries from WebMD, answers curated by healthcare experts
- For each question, the answer is split into sentences, and those sentences are located in the source context as answer spans (toy sketch below)
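To make the multi-span format concrete, a toy sketch of assembling a complete answer from sentence-level labels; the data layout is illustrative, not the exact MASHQA JSON schema:

```python
# Toy example: the context is pre-split into sentences, and an answer
# is a set of (possibly non-contiguous) sentence indices.
context_sentences = [
    "Iron supports oxygen transport.",      # 0
    "It is found in red meat and beans.",   # 1
    "Unrelated sentence about vitamin C.",  # 2
    "Deficiency can cause anemia.",         # 3
]
answer_spans = [0, 1, 3]  # multiple spans jointly form the answer

complete_answer = " ".join(context_sentences[i] for i in answer_spans)
print(complete_answer)
```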
### EPIC QA
- Task
- Given a question and context passages, identify consecutive answer sentences (from a start sentence id to an end sentence id); see the sketch after this list
- Gold answers are now available
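A toy sketch of the answer format: given a passage pre-split into id-indexed sentences, an answer is the inclusive range from a start sentence id to an end sentence id (the sentences below are made up):

```python
# Sentences of one context passage, keyed by sentence id.
passage = {
    0: "Coronaviruses are enveloped RNA viruses.",
    1: "They can cause respiratory illness in humans.",
    2: "Vaccines target the spike protein.",
}

def extract_answer(passage, start_id, end_id):
    """Return the consecutive sentences from start_id to end_id, inclusive."""
    return " ".join(passage[i] for i in range(start_id, end_id + 1))

print(extract_answer(passage, 0, 1))
```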
### PubMedQA
PubMedQA: A Dataset for Biomedical Research Question Answering [paper](https://arxiv.org/pdf/1909.06146.pdf)
- Answer research questions with yes/no/maybe
- Three sets (loading sketch at the end of this section)
- PQA-Labeled (1k)
- PQA-Unlabeled (211.3k)
- PQA-Artificial (61.2k)
- Dataset created from PubMed article abstracts
- Labeled
- Question: title
- Context: abstract sections excluding the conclusion
- Label: annotated yes/no/maybe
- Unlabeled
- All questions that start with wh-words or involve selecting among multiple entities are removed
- Artificial
- Instances noisily labeled with heuristics, intended for pre-training
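As with HeadQA, the dataset is mirrored on the Hugging Face hub; a hedged loading sketch (configuration and field names follow the `pubmed_qa` dataset card, so verify against your installed version):

```python
from datasets import load_dataset

# Configurations: "pqa_labeled", "pqa_unlabeled", "pqa_artificial".
pubmed_qa = load_dataset("pubmed_qa", "pqa_labeled")

example = pubmed_qa["train"][0]
print(example["question"])
print(example["context"]["contexts"][0][:100])  # an abstract section
print(example["final_decision"])                # yes / no / maybe
```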
|