Authors: Chantal Pellegrini, Ege Özsoy, Benjamin Busam, Nassir Navab, Matthias Keicher
✨ News ✨
- 27 March 2025: RaDialog has been accepted to MIDL 2025. The new paper includes additional experiments and details about RaDialog_v2 and can be found on OpenReview
- 12 July 2024: We published a new version of our Instruct Dataset, including additional tasks, on PhysioNet
- 29 May 2024: The new and improved version of RaDialog is now publicly available on Hugging Face and GitHub. This new version offers much better conversational assistance, is easier to use, and allows a simple inference setup with Hugging Face!
- 26 March 2024: RaDialog Instruct Dataset now available on PhysioNet!
Conversational AI tools that can generate and discuss clinically correct radiology reports for a given medical image have the potential to transform radiology. Such a human-in-the-loop radiology assistant could facilitate a collaborative diagnostic process, thus saving time and improving the quality of reports. Towards this goal, we introduce RaDialog, the first thoroughly evaluated and publicly available large vision-language model for radiology report generation and interactive dialog. RaDialog effectively integrates visual image features and structured pathology findings with a large language model (LLM) while simultaneously adapting it to a specialized domain using parameter-efficient fine-tuning. To keep the conversational abilities of the underlying LLM, we propose a comprehensive, semi-automatically labeled, image-grounded instruct dataset for chest X-ray radiology tasks. By training with this dataset, our method achieves state-of-the-art clinical correctness in report generation and shows impressive abilities in interactive tasks such as correcting reports and answering questions, serving as a foundational step toward clinical dialog systems.
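To make the high-level design more concrete, the sketch below shows one way visual features and structured findings can be combined into an LLM prompt. It is an illustrative approximation only, not the released RaDialog implementation: the module sizes, names, and prompt wording are assumptions, loosely following the BLIP-2-style alignment and findings-classifier components referenced later in this README.

```python
# Minimal sketch (not the released implementation) of combining image features
# and structured findings for an LLM, assuming a BLIP-2-style alignment module
# that projects visual tokens into the LLM embedding space.
import torch
import torch.nn as nn

llm_hidden = 4096          # hidden size of a 7B LLM such as Vicuna-7B (assumption)
num_query_tokens = 32      # number of learned query tokens (assumption)

class AlignmentModule(nn.Module):
    """Projects image-encoder features to soft prompt tokens in the LLM space."""
    def __init__(self, vis_dim=768):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(num_query_tokens, vis_dim))
        self.attn = nn.MultiheadAttention(vis_dim, num_heads=8, batch_first=True)
        self.proj = nn.Linear(vis_dim, llm_hidden)

    def forward(self, image_feats):                       # (B, N_patches, vis_dim)
        q = self.queries.unsqueeze(0).expand(image_feats.size(0), -1, -1)
        fused, _ = self.attn(q, image_feats, image_feats)  # cross-attend queries to patches
        return self.proj(fused)                            # (B, num_query_tokens, llm_hidden)

align = AlignmentModule()
image_feats = torch.randn(1, 196, 768)   # dummy output of an image encoder
img_tokens = align(image_feats)          # soft prompt embeddings for the LLM

# Structured findings (e.g. from a CheXbert-style classifier) are verbalized and
# combined with the image tokens in the prompt; the wording below is illustrative.
findings = ["cardiomegaly", "pleural effusion"]
prompt = f"Image information: <IMG>. Predicted findings: {', '.join(findings)}. Write a radiology report."
print(prompt)
```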
cd RaDialog
conda create --name radialog python=3.7
conda activate radialog
pip install -r requirements.txt
pip install hi-ml-multimodal==0.2.0
pip install torch==1.13.0 transformers==4.28.1
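Before moving on to the CheXbert environment, you can optionally check that the pinned versions from the commands above are active in the radialog environment:

```python
# quick sanity check for the radialog environment
import torch
import transformers

print(torch.__version__)         # expected: 1.13.0 (per the pip install above)
print(transformers.__version__)  # expected: 4.28.1
```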
conda create --name chexbert python=3.7
conda activate chexbert
cd chexbert
pip install -r requirements.txt
- set the path to the chexbert environment in RaDialog/local_config.py
- run cd mimic-cxr and then python create_section_files.py to prepare the report data, then go back to the repository root with cd ..
- set the path to the MIMIC-CXR data in local_config.py.
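For reference, a minimal local_config.py could look roughly like the following; the variable names and paths below are placeholders, so match them to the ones actually defined in the file:

```python
# local_config.py -- illustrative sketch, variable names are placeholders
PATH_TO_MIMIC_CXR = "/data/mimic-cxr-jpg/2.0.0"     # root of the MIMIC-CXR-JPG download
PATH_TO_MIMIC_REPORTS = "/data/mimic-cxr/reports"   # sectioned reports from create_section_files.py
CHEXBERT_ENV_PATH = "/opt/conda/envs/chexbert"      # conda environment created above
```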
Data for RaDialog-RG:
- run python -m data.create_data --mode "RG"
to generate the report generation dataset in the required format (no instruct data)
Data for RaDialog-INS:
- run python -m data.create_data --mode "INS"
to generate the instruct dataset
Run the demo:
- run python demo.py --cfg-path pretraining/configs/blip2_pretrain_stage1_emb.yaml to start the demo
- open http://127.0.0.1:7860 in your browser and start chatting with RaDialog
Evaluate RaDialog-RG (report generation) on the MIMIC-CXR test set:
- run python test.py --prompt img_matching_examples_ig2_noexamples_IMG_findings --use_embs --num_workers 0 --lora_model checkpoints/vicuna-7b-img-report/checkpoint-11200
Evaluate RaDialog-INS (report generation) on the MIMIC-CXR test set:
- run python test.py --prompt img_matching_examples_ig2_noexamples_IMG_findings --use_embs --num_workers 0 --lora_model checkpoints/vicuna-7b-img-instruct/checkpoint-4800
Evaluate RaDialog-INS on the report correction task:
- run python test.py --prompt img_matching_examples_ig2_noexamples_IMG_findings --use_embs --num_workers 0 --lora_model checkpoints/vicuna-7b-img-instruct/checkpoint-4800 --do_corr
Evaluate RaDialog-INS on findings question answering:
- run python test.py --prompt img_matching_examples_ig2_noexamples_IMG_findings --use_embs --num_workers 0 --lora_model checkpoints/vicuna-7b-img-instruct/checkpoint-4800 --do_cp_all_qa (or --do_cp_bin_qa)
Train the CheXbert findings classifier:
- run python -m findings_classifier.chexpert_train --train --run_name "train_chexbert" to train the classifier
- run python -m findings_classifier.chexpert_train --run_name "save_preds" to save the predictions of the trained model
Pretrain the alignment module (BLIP-2 stage 1):
- run python -m pretraining.train --cfg-path pretraining/configs/blip2_pretrain_stage1.yaml; we used the 4th epoch checkpoint
- run python -m pretraining.train --cfg-path pretraining/configs/blip2_pretrain_stage1_emb.yaml to save the embeddings of the trained model
Train RaDialog-RG:
- run python finetune.py --use_embs True --base_model 'vicuna_v7' --output_dir 'checkpoints/lora-vicuna-7b-report' --wandb_run_name lora-vicuna-7b-report --prompt_template_name vicuna_v11 --data_path "data/data_files/mimic_cxr_reports_stratified.json" --cutoff_len 600 --num_epochs 10
- we used checkpoint-11200
Train RaDialog-INS:
- run python finetune.py --use_embs True --base_model 'vicuna_v7' --output_dir 'checkpoints/lora-vicuna-7b-instruct' --wandb_run_name lora-vicuna-7b-instruct --prompt_template_name vicuna_v11 --data_path "data/data_files/mimic_cxr_instruct_stratified.json" --cutoff_len 800 --num_epochs 10
- we used checkpoint-4800
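Both training commands rely on parameter-efficient LoRA fine-tuning of the LLM. The snippet below is a generic sketch of such a setup with the peft library; the base model, rank, alpha, and target modules are placeholders and not necessarily the values used in finetune.py.

```python
# Generic LoRA setup sketch (hyperparameters are illustrative, not the exact
# values used by finetune.py).
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("lmsys/vicuna-7b-v1.5")  # placeholder base model

lora_cfg = LoraConfig(
    r=16,                                 # low-rank adapter dimension
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections of the Vicuna blocks
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()        # only the small adapter matrices are trainable
```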
To use a model from a checkpoint, you'll need to perform the following steps:
- make a copy of "pytorch_model.bin" and rename it to "adapter_model.bin"
- copy adapter_config.json to the checkpoint folder (it will be generated after the last epoch or you can copy it from the checkpoints we provide)
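The two steps above can also be scripted; the snippet below is a small hypothetical helper that assumes a checkpoint layout like the one produced by finetune.py (the source location of adapter_config.json is a placeholder):

```python
# Hypothetical helper for preparing a LoRA checkpoint for loading;
# adjust the paths to your own checkpoint folders.
import shutil
from pathlib import Path

ckpt = Path("checkpoints/lora-vicuna-7b-instruct/checkpoint-4800")

# 1) duplicate the saved weights under the file name the LoRA loader expects
shutil.copy(ckpt / "pytorch_model.bin", ckpt / "adapter_model.bin")

# 2) place adapter_config.json next to them (e.g. taken from a provided checkpoint)
shutil.copy("path/to/provided/adapter_config.json", ckpt / "adapter_config.json")
```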
When using our model or dataset, please cite:
@article{pellegrini2023radialog,
title={RaDialog: A Large Vision-Language Model for Radiology Report Generation and Conversational Assistance},
author={Pellegrini, Chantal and {\"O}zsoy, Ege and Busam, Benjamin and Navab, Nassir and Keicher, Matthias},
journal={arXiv preprint arXiv:2311.18681},
year={2023}
}