# PMC-LLaMA

The official code for "PMC-LLaMA: Towards Building Open-source Language Models for Medicine".

<!-- vim-markdown-toc GFM -->

* [Latest News](#latest-news)
* [Environment](#environment)
* [Quick Start](#quick-start)
* [Training](#training)
* [Results](#results)
    * [QA Benchmark](#qa-benchmark)
    * [Zero-shot Cases](#zero-shot-cases)
* [Acknowledge](#acknowledge)
* [Contact](#contact)

<!-- vim-markdown-toc -->

[**Arxiv Version**](https://arxiv.org/abs/2304.14454)

We show that a medical LLM should first be pretrained on a domain-specific corpus and then tuned on an instruction-following dataset.

We have released our latest model, **PMC_LLaMA_13B**, finetuned on our instruction-following dataset.
It shows a better ability to follow user instructions than MedLLaMA_13B.

<p align="center">
    <img src="https://github.com/chaoyi-wu/PMC-LLaMA/raw/main/figures/teaser.png?raw=true" width="70%">
</p>

The model can be easily loaded with:

```python
import transformers
import torch

tokenizer = transformers.LlamaTokenizer.from_pretrained('axiong/PMC_LLaMA_13B')
model = transformers.LlamaForCausalLM.from_pretrained('axiong/PMC_LLaMA_13B')
```
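
For GPU inference, the 13B checkpoint is heavy in the default float32. Below is a minimal half-precision loading sketch; it assumes the `accelerate` package is installed so that `device_map='auto'` works, and is an optional variant rather than part of the official pipeline:

```python
import torch
import transformers

# Same checkpoint as above, but loaded in float16 to roughly halve memory
# use versus float32. device_map='auto' (requires the `accelerate` package)
# places the weights on available GPUs automatically.
model = transformers.LlamaForCausalLM.from_pretrained(
    'axiong/PMC_LLaMA_13B',
    torch_dtype=torch.float16,
    device_map='auto',
)
```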

Below we present PMC_LLaMA's versions and brief descriptions.

[MedLLaMA_13B](https://huggingface.co/chaoyi-wu/MedLLaMA_13B) is pretrained on a medical corpus, and [PMC_LLaMA_13B](https://huggingface.co/axiong/PMC_LLaMA_13B) is further finetuned from it.

| Version | Link | Brief | Release Date |
| --- | --- | --- | --- |
| MMedLM ![](./figures/new.gif) | https://github.com/MAGIC-AI4Med/MMedLM | Further pretrained multilingual LLM | 2024/02/21 |
| PMC_LLaMA_13B | https://huggingface.co/axiong/PMC_LLaMA_13B | Instruction tuned | 2023/09/01 |
| MedLLaMA_13B | https://huggingface.co/chaoyi-wu/MedLLaMA_13B | LLaMA pretrained on 4.8M PubMed Central papers and medical books | 2023/05/01 |
| PMC_LLaMA_7B_10_epoch | https://huggingface.co/chaoyi-wu/PMC_LLAMA_7B_10_epoch | Similar to PMC_LLaMA_7B but trained for 10 epochs | 2023/05/01 |
| PMC_LLaMA_7B | https://huggingface.co/chaoyi-wu/PMC_LLAMA_7B | LLaMA-7B finetuned on PMC papers for 5 epochs | 2023/04/25 |

## Latest News
We have released a new multilingual medical LLM, **MMedLM**; detailed information can be found [here](https://github.com/MAGIC-AI4Med/MMedLM).

It is **better than PMC-LLaMA** even in the English domain, but it has not undergone instruction tuning and is therefore more suitable for further fine-tuning than for zero-shot or few-shot prompting.

## Environment
Simply set up the required environment as follows:
```bash
conda install pytorch==1.13.0 torchvision==0.14.0 torchaudio==0.13.0 pytorch-cuda=11.6 -c pytorch -c nvidia
pip install transformers==4.28.1 sentencepiece datasets
```

## Quick Start
Check `simple_test.py` for a quick way to use PMC-LLaMA, or follow the simple example below.

```python
import transformers
import torch

tokenizer = transformers.LlamaTokenizer.from_pretrained('axiong/PMC_LLaMA_13B')
model = transformers.LlamaForCausalLM.from_pretrained('axiong/PMC_LLaMA_13B')
model.cuda()  # move the model to GPU

# Alpaca-style prompt template used during instruction tuning.
prompt_input = (
    'Below is an instruction that describes a task, paired with an input that provides further context. '
    'Write a response that appropriately completes the request.\n\n'
    '### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:'
)

example = {
    "instruction": "You're a doctor, kindly address the medical queries according to the patient's account. Answer with the best option directly.",
    "input": (
        "###Question: A 23-year-old pregnant woman at 22 weeks gestation presents with burning upon urination. "
        "She states it started 1 day ago and has been worsening despite drinking more water and taking cranberry extract. "
        "She otherwise feels well and is followed by a doctor for her pregnancy. "
        "Her temperature is 97.7°F (36.5°C), blood pressure is 122/77 mmHg, pulse is 80/min, respirations are 19/min, and oxygen saturation is 98% on room air. "
        "Physical exam is notable for an absence of costovertebral angle tenderness and a gravid uterus. "
        "Which of the following is the best treatment for this patient? "
        "###Options: A. Ampicillin B. Ceftriaxone C. Doxycycline D. Nitrofurantoin"
    )
}
input_str = [prompt_input.format_map(example)]

# If padding fails because the tokenizer defines no pad token, set
# tokenizer.pad_token = tokenizer.eos_token before calling it.
model_inputs = tokenizer(
    input_str,
    return_tensors='pt',
    padding=True,
)
print(f"\033[32mmodel_inputs\033[0m: {model_inputs}")

# Generate up to 1000 new tokens; note that top_k only takes effect
# when sampling is enabled (do_sample=True) -- the default is greedy decoding.
topk_output = model.generate(
    model_inputs.input_ids.cuda(),
    max_new_tokens=1000,
    top_k=50
)
output_str = tokenizer.batch_decode(topk_output)
print('model predict: ', output_str[0])
```
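
Since the prompt template ends with `### Response:`, the model's answer can be recovered by splitting the decoded text on that marker. A minimal sketch, continuing from the variables above:

```python
# The decoded output repeats the prompt, so keep only the text after the
# final '### Response:' marker. skip_special_tokens=True drops markers
# such as <s> and </s> from the decoded string.
decoded = tokenizer.batch_decode(topk_output, skip_special_tokens=True)[0]
answer = decoded.split('### Response:')[-1].strip()
print('extracted answer:', answer)
```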

## Training

The training process can be divided into two phases: pretraining and instruction tuning.

**Pre-training**

The pretraining script is located at `Pretrain/training.sh`.

Our pretraining dataset is sourced from [S2ORC](https://github.com/allenai/s2orc). Only papers with PubMed IDs are deemed medical-related and used during pretraining; see the filtering sketch below.
<!-- The raw training data can be downloaded from [S2ORC](https://github.com/allenai/s2orc); filter out the papers with PubMed Central IDs to obtain the training data we use. -->

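To make the selection step concrete, here is a minimal filtering sketch. It is illustrative only: the file names are hypothetical and the `pubmed_id` field name is an assumption, so check the schema of your S2ORC release and adjust accordingly.

```python
import json

def filter_medical(in_path: str, out_path: str) -> None:
    """Keep only S2ORC metadata records that carry a PubMed ID."""
    with open(in_path) as fin, open(out_path, 'w') as fout:
        for line in fin:
            record = json.loads(line)
            if record.get('pubmed_id'):  # paper is indexed in PubMed
                fout.write(line)

filter_medical('s2orc_metadata.jsonl', 's2orc_medical.jsonl')
```
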
The books are listed in this repo as [MedicalBook.xlsx](https://github.com/chaoyi-wu/PMC-LLaMA/blob/main/MedicalBook.xlsx); due to licensing, we cannot release the raw content. To reproduce our training, please purchase and process the books.

For more details on how to fine-tune LLaMA, refer to [Finetune_LLAMA](https://github.com/chaoyi-wu/Finetune_LLAMA).

**Instruction Tuning**

We also provide an instruction-tuning script at `SFT/train.py`.
Our instruction dataset is available at [PMC LLaMA Instructions](https://huggingface.co/datasets/axiong/pmc_llama_instructions); a quick loading sketch follows.

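As a sanity check, the dataset can be pulled directly from the Hugging Face Hub with the `datasets` package installed above. A minimal sketch; the `train` split name is an assumption, so consult the dataset card for the actual splits and fields:

```python
from datasets import load_dataset

# Download the instruction-tuning data from the Hugging Face Hub.
ds = load_dataset('axiong/pmc_llama_instructions', split='train')

print(len(ds))  # number of instruction examples
print(ds[0])    # inspect one record to see its actual fields
```
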
## Results

### QA Benchmark
Accuracy (%) on three medical QA benchmarks:

| Method              | Model Size | USMLE     | MedMCQA   | PubMedQA |
|---------------------|------------|-----------|-----------|----------|
| Human (pass)        | -          | 50.0      | --        | 60.0     |
| Human (expert)      | -          | 87.0      | 90.0      | 78.0     |
| ChatGPT             | 175B       | **57.0**  | 44.7      | 63.9     |
| LLaMA-2             | 13B        | 42.73     | 37.41     | 68.0     |
| LLaMA-2             | 70B        | 43.68     | 35.02     | 74.3     |
| Med-Alpaca          | 13B        | 30.85     | 31.13     | 53.2     |
| Chat-Doctor         | 7B         | 33.93     | 31.10     | 54.3     |
| PMC_LLaMA_13B ![](./figures/new.gif) | 13B | **56.36** | **56.04** | **77.9** |

Note that the human and zero-shot results marked with * are taken from [LMFlow](https://github.com/OptimalScale/LMFlow/tree/main/src/lmflow).

### Zero-shot Cases

We demonstrate PMC_LLaMA_13B's responses to out-of-domain queries.

<p align="center">
    <img src="https://github.com/chaoyi-wu/PMC-LLaMA/raw/main/figures/pmc_llama_cases.png?raw=true" width="70%">
</p>

Note that, because it is trained on papers, MedLLaMA_13B may generate citation numbers (LLaMA sometimes does this as well); we omit them in these cases to show the main content.
For PMC_LLaMA_13B, in contrast, it is much easier to extract the correct answer because the output is structured.

## Acknowledge
Minimal LLaMA -- https://github.com/zphang/minimal-llama

alpaca -- https://github.com/tatsu-lab/stanford_alpaca

LMFlow -- https://github.com/OptimalScale/LMFlow/tree/main/src/lmflow

LLaMA: Open and Efficient Foundation Language Models -- https://arxiv.org/abs/2302.13971

## Contact
If you have any questions, please feel free to contact wtzxxxwcy02@sjtu.edu.cn.