a/README.md b/README.md
# Merlin: Vision Language Foundation Model for 3D Computed Tomography

[![arXiv](https://img.shields.io/badge/arXiv-2406.06512-b31b1b.svg?style=for-the-badge)](https://arxiv.org/abs/2406.06512)    [![Hugging Face](https://huggingface.co/datasets/huggingface/badges/resolve/main/model-on-hf-md.svg)](https://huggingface.co/stanfordmimi/Merlin)    [![pypi](https://img.shields.io/pypi/v/merlin-vlm?style=for-the-badge)](https://pypi.org/project/merlin-vlm/)    [![Watch the Talk on YouTube](https://img.shields.io/badge/YouTube-Talk-red?style=for-the-badge&logo=youtube)](https://youtu.be/XWmCkbpXOUw?si=6GggZgj9U4kbLAKx)    ![License](https://img.shields.io/github/license/stanfordmimi/merlin?style=for-the-badge)

*Merlin is a 3D VLM for computed tomography that leverages both structured electronic health records (EHR) and unstructured radiology reports for pretraining.*

-![Key Graphic](documentation/assets/overview.png)
+![Key Graphic](https://github.com/StanfordMIMI/Merlin/blob/main/documentation/assets/overview.png?raw=true)

## ⚡️ Installation

To install Merlin (Python 3.9 required), run:

```bash
pip install merlin-vlm
```

For an editable installation, use the following commands to clone and install this repository.

```bash
# Create and activate a dedicated environment pinned to Python 3.9
conda create --name merlin python==3.9.0
conda activate merlin

# Clone the repository and install it in editable mode
git clone https://github.com/StanfordMIMI/Merlin.git
cd Merlin
pip install -e .
```
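After either installation route, a quick sanity check can confirm that the active interpreter matches the requirements stated above. The following is a minimal sketch that relies only on the standard library; the helper name `merlin_install_report` is hypothetical and is not part of the `merlin-vlm` package.

```python
import sys
from importlib import metadata
from typing import Optional


def merlin_install_report() -> dict:
    """Summarise whether this interpreter matches the README's stated setup.

    Hypothetical helper for illustration; not shipped with merlin-vlm.
    """
    try:
        # "merlin-vlm" is the PyPI distribution name used in the pip command.
        version: Optional[str] = metadata.version("merlin-vlm")
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        version = None

    return {
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        "python_is_3_9": sys.version_info[:2] == (3, 9),
        "merlin_vlm_version": version,
    }


if __name__ == "__main__":
    print(merlin_install_report())
```

A `merlin_vlm_version` of `None` suggests the package is not installed in the active environment, for example because the conda environment was not activated.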

## 🚀 Inference with Merlin

To create a Merlin model with both image and text embeddings enabled, use the following:

```python
from merlin import Merlin

model = Merlin()
```

To initialize the model with **only image embeddings** active, use:

```python
from merlin import Merlin

model = Merlin(ImageEmbedding=True)
```
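The two constructor calls above can be wrapped in one small helper when a script needs to switch between the modes. This is a minimal sketch: `build_merlin` and its `image_only` flag are hypothetical names, not part of the package. It uses only the `Merlin()` and `Merlin(ImageEmbedding=True)` forms shown in this README, and the `eval()` call assumes the returned object behaves like a PyTorch module.

```python
import importlib


def build_merlin(image_only: bool = False):
    """Build a Merlin model using only the two constructor forms in this README.

    `build_merlin` and `image_only` are hypothetical names for illustration.
    """
    # Import lazily so the helper can be defined before merlin-vlm is installed.
    merlin = importlib.import_module("merlin")

    if image_only:
        model = merlin.Merlin(ImageEmbedding=True)  # image embeddings only
    else:
        model = merlin.Merlin()  # image and text embeddings

    # Assumption: the model is a torch.nn.Module. If so, eval() switches off
    # training-only behaviour such as dropout before inference.
    if hasattr(model, "eval"):
        model.eval()
    return model
```

Calling `build_merlin(image_only=True)` mirrors the second snippet above, and `build_merlin()` mirrors the first.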

#### For inference on a demo CT scan, please check out the [demo](documentation/demo.py).

#### For additional information, please read the [documentation](documentation/inference.md).

## 📎 Citation

If you find this repository useful for your work, please cite the [original paper](https://arxiv.org/abs/2406.06512):

```bibtex
@article{blankemeier2024merlin,
  title={Merlin: A vision language foundation model for 3d computed tomography},
  author={Blankemeier, Louis and Cohen, Joseph Paul and Kumar, Ashwin and Van Veen, Dave and Gardezi, Syed Jamal Safdar and Paschali, Magdalini and Chen, Zhihong and Delbrouck, Jean-Benoit and Reis, Eduardo and Truyts, Cesar and others},
  journal={Research Square},
  pages={rs--3},
  year={2024}
}
```