# Pathology Language and Image Pre-Training (PLIP)
Pathology Language and Image Pre-Training (PLIP) is the first vision and language foundation model for Pathology AI. PLIP is a large-scale pre-trained model that can be used to extract visual and language features from pathology images and text descriptions.
The model is a fine-tuned version of the original CLIP model.
![PLIP](assets/banner.png "A visual–language foundation model for pathology AI")
## Resources
- 📚 [Official Demo](https://huggingface.co/spaces/vinid/webplip)
- 📚 [PLIP on HuggingFace](https://huggingface.co/vinid/plip)
- 📚 [Paper](https://www.nature.com/articles/s41591-023-02504-3)
### Internal API Usage
```python
from plip.plip import PLIP
import numpy as np

plip = PLIP('vinid/plip')

# we create image embeddings and text embeddings
image_embeddings = plip.encode_images(images, batch_size=32)
text_embeddings = plip.encode_text(texts, batch_size=32)

# we normalize the embeddings to unit norm (so that we can use dot product
# instead of cosine similarity to do comparisons)
image_embeddings = image_embeddings / np.linalg.norm(image_embeddings, ord=2, axis=-1, keepdims=True)
text_embeddings = text_embeddings / np.linalg.norm(text_embeddings, ord=2, axis=-1, keepdims=True)
```
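
Because both embedding matrices are unit-normalized, a plain dot product gives the cosine similarity between any image and any text. A minimal sketch of ranking candidate texts for each image, assuming `texts` is the list of strings passed to `encode_text` above:

```python
# similarity[i, j] = cosine similarity between image i and text j
similarity = image_embeddings @ text_embeddings.T

# index of the best-matching text for each image
best_match = similarity.argmax(axis=-1)
for i, j in enumerate(best_match):
    print(f"image {i} -> '{texts[j]}' (score={similarity[i, j]:.3f})")
```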
### HuggingFace API Usage
```python
from PIL import Image
from transformers import CLIPProcessor, CLIPModel

model = CLIPModel.from_pretrained("vinid/plip")
processor = CLIPProcessor.from_pretrained("vinid/plip")

image = Image.open("images/image1.jpg")

inputs = processor(text=["a photo of label 1", "a photo of label 2"],
                   images=image, return_tensors="pt", padding=True)

outputs = model(**inputs)
logits_per_image = outputs.logits_per_image  # this is the image-text similarity score
probs = logits_per_image.softmax(dim=1)
print(probs)
image.resize((224, 224))  # returns a resized copy; shown inline when run in a notebook
```
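
If you only need embeddings (for indexing or retrieval) rather than paired similarity scores, the same `model` and `inputs` can be reused with `CLIPModel`'s `get_image_features`/`get_text_features` methods; a minimal sketch, not part of the original example:

```python
import torch

with torch.no_grad():
    image_features = model.get_image_features(pixel_values=inputs["pixel_values"])
    text_features = model.get_text_features(input_ids=inputs["input_ids"],
                                            attention_mask=inputs["attention_mask"])

# L2-normalize so dot products behave like cosine similarities
image_features = image_features / image_features.norm(dim=-1, keepdim=True)
text_features = text_features / text_features.norm(dim=-1, keepdim=True)
print(image_features @ text_features.T)  # (n_images, n_texts) similarity matrix
```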
### Citation
If you use PLIP in your research, please cite the following paper:
```bibtex
@article{huang2023visual,
  title={A visual--language foundation model for pathology image analysis using medical Twitter},
  author={Huang, Zhi and Bianchi, Federico and Yuksekgonul, Mert and Montine, Thomas J and Zou, James},
  journal={Nature Medicine},
  pages={1--10},
  year={2023},
  publisher={Nature Publishing Group US New York}
}
```
### Acknowledgements
The internal API has been **copied** from FashionCLIP.