TY - GEN
T1 - Comparative Analysis of ImageNet Pre-Trained Deep Learning Models and DINOv2 in Medical Imaging Classification
AU - Huang, Yuning
AU - Zou, Jingchen
AU - Meng, Lanxi
AU - Yue, Xin
AU - Zhao, Qing
AU - Li, Jianqiang
AU - Song, Changwei
AU - Jimenez, Gabriel
AU - Li, Shaowu
AU - Fu, Guanghui
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Medical image analysis frequently encounters data scarcity challenges. Transfer learning has been effective in addressing this issue while conserving computational resources. The recent advent of foundational models like the DINOv2, which uses the vision transformer architecture, has opened new opportunities in the field and gathered significant interest. However, DINOv2's performance on clinical data still needs to be verified. In this paper, we performed a glioma grading task using three clinical modalities of brain MRI data. We compared the performance of various pre-trained deep learning models, including those based on ImageNet and DINOv2, in a transfer learning context. Our focus was on understanding the impact of the freezing mechanism on performance. We also validated our findings on three other types of public datasets: chest radiography, fundus radiography, and dermoscopy. Our findings indicate that in our clinical dataset, DINOv2's performance was not as strong as ImageNet-based pre-trained models, whereas in public datasets, DINOv2 generally outperformed other models, especially when using the frozen mechanism. Similar performance was observed with various sizes of DINOv2 models across different tasks. In summary, DINOv2 is viable for medical image classification tasks, particularly with data resembling natural images. However, its effectiveness may vary with data that significantly differs from natural images such as MRI. In addition, employing smaller versions of the model can be adequate for medical task, offering resource-saving benefits. Our codes are available at https://github.com/GuanghuiFU/medical-dino-eval.
AB - Medical image analysis frequently encounters data scarcity challenges. Transfer learning has been effective in addressing this issue while conserving computational resources. The recent advent of foundational models like the DINOv2, which uses the vision transformer architecture, has opened new opportunities in the field and gathered significant interest. However, DINOv2's performance on clinical data still needs to be verified. In this paper, we performed a glioma grading task using three clinical modalities of brain MRI data. We compared the performance of various pre-trained deep learning models, including those based on ImageNet and DINOv2, in a transfer learning context. Our focus was on understanding the impact of the freezing mechanism on performance. We also validated our findings on three other types of public datasets: chest radiography, fundus radiography, and dermoscopy. Our findings indicate that in our clinical dataset, DINOv2's performance was not as strong as ImageNet-based pre-trained models, whereas in public datasets, DINOv2 generally outperformed other models, especially when using the frozen mechanism. Similar performance was observed with various sizes of DINOv2 models across different tasks. In summary, DINOv2 is viable for medical image classification tasks, particularly with data resembling natural images. However, its effectiveness may vary with data that significantly differs from natural images such as MRI. In addition, employing smaller versions of the model can be adequate for medical task, offering resource-saving benefits. Our codes are available at https://github.com/GuanghuiFU/medical-dino-eval.
KW - Brain MRI
KW - Classification
KW - Foundation model
KW - Glioma
KW - Pretrained
KW - Transfer learning
UR - http://www.scopus.com/inward/record.url?scp=85204046705&partnerID=8YFLogxK
U2 - 10.1109/COMPSAC61105.2024.00049
DO - 10.1109/COMPSAC61105.2024.00049
M3 - Conference contribution
AN - SCOPUS:85204046705
T3 - Proceedings - 2024 IEEE 48th Annual Computers, Software, and Applications Conference, COMPSAC 2024
SP - 297
EP - 305
BT - Proceedings - 2024 IEEE 48th Annual Computers, Software, and Applications Conference, COMPSAC 2024
A2 - Shahriar, Hossain
A2 - Ohsaki, Hiroyuki
A2 - Sharmin, Moushumi
A2 - Towey, Dave
A2 - Majumder, AKM Jahangir Alam
A2 - Hori, Yoshiaki
A2 - Yang, Ji-Jiang
A2 - Takemoto, Michiharu
A2 - Sakib, Nazmus
A2 - Banno, Ryohei
A2 - Ahamed, Sheikh Iqbal
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 48th IEEE Annual Computers, Software, and Applications Conference, COMPSAC 2024
Y2 - 2 July 2024 through 4 July 2024
ER -