# Merlin: Vision Language Foundation Model for 3D Computed Tomography

[![arXiv](https://img.shields.io/badge/arXiv-2406.06512-b31b1b.svg?style=for-the-badge)](https://arxiv.org/abs/2406.06512)    [![Hugging Face](https://huggingface.co/datasets/huggingface/badges/resolve/main/model-on-hf-md.svg)](https://huggingface.co/stanfordmimi/Merlin)    [![pypi](https://img.shields.io/pypi/v/merlin-vlm?style=for-the-badge)](https://pypi.org/project/merlin-vlm/)    [![Watch the Talk on YouTube](https://img.shields.io/badge/YouTube-Talk-red?style=for-the-badge&logo=youtube)](https://youtu.be/XWmCkbpXOUw?si=6GggZgj9U4kbLAKx)    ![License](https://img.shields.io/github/license/stanfordmimi/merlin?style=for-the-badge)

*Merlin is a 3D VLM for computed tomography that leverages both structured electronic health records (EHR) and unstructured radiology reports for pretraining.*

![Key Graphic](documentation/assets/overview.png)

## ⚡️ Installation
To install Merlin (Python 3.9 required), run:

```bash
pip install merlin-vlm
```

For an editable installation, clone this repository and install it from source:

```bash
# Create and activate an isolated conda environment with Python 3.9
conda create --name merlin python==3.9.0
conda activate merlin

# Clone the repository and install it in editable mode
git clone https://github.com/StanfordMIMI/Merlin.git
cd Merlin
pip install -e .
```
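After installing, it can help to confirm that the interpreter matches the Python 3.9 requirement and that the `merlin` package is importable. The following is a minimal sketch using only the standard library; the helper name `check_environment` is hypothetical and is not part of Merlin.

```python
import importlib.util
import sys

# Taken from the "Python 3.9 required" note above.
REQUIRED_PYTHON = (3, 9)


def check_environment(package="merlin", required=REQUIRED_PYTHON, version_info=None):
    """Return a list of problems found; an empty list means both checks passed.

    `check_environment` is a hypothetical helper, not a Merlin API.
    """
    version_info = sys.version_info if version_info is None else version_info
    problems = []

    # Compare only the major and minor version numbers.
    found = tuple(version_info[:2])
    if found != tuple(required):
        problems.append(
            "Python %d.%d is required, found %d.%d" % (tuple(required) + found)
        )

    # find_spec returns None when a top-level package cannot be located.
    if importlib.util.find_spec(package) is None:
        problems.append(
            "package '%s' is not importable; try `pip install merlin-vlm`" % package
        )

    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Run as a script, it prints one warning per failed check and prints nothing when the environment looks fine.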
## 🚀 Inference with Merlin
To create a Merlin model with both image and text embeddings enabled, use the following:

```python
from merlin import Merlin

model = Merlin()
```

To initialize the model with **only image embeddings** active, use:

```python
from merlin import Merlin

model = Merlin(ImageEmbedding=True)
```
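The two constructor calls above can be wrapped in a small loader that fails with an actionable message when the package is missing. This is only a sketch: `load_merlin` and its `importer` parameter are hypothetical names introduced for illustration, and the only Merlin calls it relies on are `Merlin()` and `Merlin(ImageEmbedding=True)` as shown above.

```python
import importlib


def load_merlin(image_only=False, importer=importlib.import_module):
    """Build a Merlin model, optionally with only image embeddings active.

    Hypothetical convenience wrapper, not part of the Merlin package.
    `importer` exists so the function can be exercised without Merlin installed.
    """
    try:
        module = importer("merlin")
    except ImportError as exc:
        raise RuntimeError(
            "Merlin is not installed; run `pip install merlin-vlm` first"
        ) from exc

    # Mirror the two documented calls: Merlin() and Merlin(ImageEmbedding=True).
    kwargs = {"ImageEmbedding": True} if image_only else {}
    return module.Merlin(**kwargs)
```

With Merlin installed, `load_merlin()` corresponds to `Merlin()` and `load_merlin(image_only=True)` corresponds to `Merlin(ImageEmbedding=True)`.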
#### For inference on a demo CT scan, please check out the [demo](documentation/demo.py).

#### For additional information, please read the [documentation](documentation/inference.md).
## 📎 Citation
If you find this repository useful for your work, please cite the [original paper](https://arxiv.org/abs/2406.06512):

```bibtex
@article{blankemeier2024merlin,
  title={Merlin: A vision language foundation model for 3d computed tomography},
  author={Blankemeier, Louis and Cohen, Joseph Paul and Kumar, Ashwin and Van Veen, Dave and Gardezi, Syed Jamal Safdar and Paschali, Magdalini and Chen, Zhihong and Delbrouck, Jean-Benoit and Reis, Eduardo and Truyts, Cesar and others},
  journal={Research Square},
  pages={rs--3},
  year={2024}
}
```