Diff of /README.md [eb6cad] .. [b64384]


# Merlin: Vision Language Foundation Model for 3D Computed Tomography

[![arXiv](https://img.shields.io/badge/arXiv-2406.06512-b31b1b.svg?style=for-the-badge)](https://arxiv.org/abs/2406.06512)    [![Hugging Face](https://huggingface.co/datasets/huggingface/badges/resolve/main/model-on-hf-md.svg)](https://huggingface.co/stanfordmimi/Merlin)    [![pypi](https://img.shields.io/pypi/v/merlin-vlm?style=for-the-badge)](https://pypi.org/project/merlin-vlm/)    [![Watch the Talk on YouTube](https://img.shields.io/badge/YouTube-Talk-red?style=for-the-badge&logo=youtube)](https://youtu.be/XWmCkbpXOUw?si=6GggZgj9U4kbLAKx)    ![License](https://img.shields.io/github/license/stanfordmimi/merlin?style=for-the-badge)

*Merlin is a 3D VLM for computed tomography that leverages both structured electronic health records (EHR) and unstructured radiology reports for pretraining.*

![Key Graphic](https://github.com/StanfordMIMI/Merlin/blob/main/documentation/assets/overview.png?raw=true)

## ⚡️ Installation

To install Merlin (Python 3.9 required), you can simply run:

```bash
pip install merlin-vlm
```
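
Because the package targets Python 3.9, it can help to confirm the interpreter version before installing. The following is a minimal sketch; `is_supported_python` and `REQUIRED` are hypothetical names used for illustration and are not part of `merlin-vlm`.

```python
import sys

# Taken from the README statement "Python 3.9 required".
REQUIRED = (3, 9)


def is_supported_python(version_info=sys.version_info, required=REQUIRED):
    """Return True when the interpreter's (major, minor) equals `required`."""
    return tuple(version_info[:2]) == tuple(required)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported_python():
        print(f"Python {current} matches the required 3.9 series.")
    else:
        print(f"Python {current} detected; the README asks for Python 3.9.")
```

Run it with the interpreter you plan to install into, for example the one inside the conda environment.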

For an editable installation, use the following commands to clone and install this repository.

```bash
# Create and activate an isolated environment with Python 3.9
conda create --name merlin python=3.9
conda activate merlin

# Clone the repository and install it in editable mode
git clone https://github.com/StanfordMIMI/Merlin.git
cd Merlin
pip install -e .
```

## 🚀 Inference with Merlin

To create a Merlin model with both image and text embeddings enabled, use the following:

```python
from merlin import Merlin

model = Merlin()
```

To initialize the model with **only image embeddings** active, use:

```python
from merlin import Merlin

model = Merlin(ImageEmbedding=True)
```
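
If your code needs to switch between the two modes, the two constructor calls above can be wrapped in a small helper. This is only a sketch: `build_merlin` and its `image_only` flag are hypothetical and not part of the package; the only calls assumed from the library are `Merlin()` and `Merlin(ImageEmbedding=True)` as shown in this README.

```python
def build_merlin(image_only=False):
    """Construct a Merlin model in one of the two modes shown in the README.

    Hypothetical convenience wrapper; not provided by merlin-vlm itself.
    """
    # Imported lazily so this helper can be defined even before
    # merlin-vlm is installed in the current environment.
    from merlin import Merlin

    if image_only:
        # Only image embeddings active (argument taken from the README).
        return Merlin(ImageEmbedding=True)
    # Default: both image and text embeddings enabled.
    return Merlin()
```

Calling `build_merlin()` gives the default model, and `build_merlin(image_only=True)` gives the image-only variant.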

#### For inference on a demo CT scan, see the [demo](documentation/demo.py).

#### For additional information, please read the [documentation](documentation/inference.md).

## 📎 Citation

If you find this repository useful for your work, please cite the [original paper](https://arxiv.org/abs/2406.06512):

```bibtex
@article{blankemeier2024merlin,
  title={Merlin: A vision language foundation model for 3d computed tomography},
  author={Blankemeier, Louis and Cohen, Joseph Paul and Kumar, Ashwin and Van Veen, Dave and Gardezi, Syed Jamal Safdar and Paschali, Magdalini and Chen, Zhihong and Delbrouck, Jean-Benoit and Reis, Eduardo and Truyts, Cesar and others},
  journal={Research Square},
  pages={rs--3},
  year={2024}
}
```

--- a/README.md
+++ b/README.md
@@ -7 +7 @@
-![Key Graphic](documentation/assets/overview.png)
+![Key Graphic](https://github.com/StanfordMIMI/Merlin/blob/main/documentation/assets/overview.png?raw=true)