Like other diagnostic medical tests, chest radiography yields clinical findings that follow a long-tailed distribution: a small group of disorders is encountered often, while the majority are rare [1]. This challenges standard deep learning techniques, which are biased towards the most common classes at the expense of the clinically significant but uncommon tail classes [2]. Several existing approaches address this kind of imbalance [3], although long-tailed medical image recognition [4] has only recently received attention. Chest X-ray (CXR) diagnosis is also a multi-label problem, since patients frequently present with several disease findings at the same time. Nevertheless, very few studies incorporate label co-occurrence into the learning process [5]. The long-tailed, multi-label nature of tasks such as disease diagnosis on CXRs therefore poses both class imbalance and co-occurrence problems. Many standard deep learning methods are not designed to handle them, because most large-scale image classification benchmarks contain single-label images with a mostly balanced label distribution [2].
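To make the two properties concrete, the following minimal sketch counts per-label frequency (the long tail) and pairwise label co-occurrence (the multi-label structure) from a list of annotated studies. The function names and the toy annotations are hypothetical and chosen only for illustration; they are not taken from MIMIC-CXR or from any of the cited works.

```python
from collections import Counter
from itertools import combinations


def label_statistics(samples):
    """Count how often each label occurs and how often each pair co-occurs.

    `samples` is a list of label lists, one list per image (multi-label).
    """
    label_counts = Counter()
    pair_counts = Counter()
    for labels in samples:
        unique = sorted(set(labels))  # ignore duplicates, fix pair order
        label_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return label_counts, pair_counts


def imbalance_ratio(label_counts):
    """Ratio between the most and least frequent label (1.0 = balanced)."""
    counts = list(label_counts.values())
    return max(counts) / min(counts)


# Toy annotations (hypothetical, for illustration only).
toy_samples = [
    ["Atelectasis", "Effusion"],
    ["Atelectasis"],
    ["Atelectasis", "Effusion"],
    ["Effusion"],
    ["Atelectasis", "Pneumothorax"],
    ["Atelectasis"],
]

label_counts, pair_counts = label_statistics(toy_samples)
for label, count in label_counts.most_common():
    print(label, count)
print("imbalance ratio:", imbalance_ratio(label_counts))
```

On a real dataset, the sorted label counts would expose the head and tail classes, and the pair counts would indicate which findings tend to appear together.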
Compared with supervised models, CLIP [6] is more robust to data imbalance [7], and the flexibility it offers for building custom classifiers makes it a natural baseline for this task. In this thesis, we focus on developing a CLIP-based model for our CXR classification problem.
Initially, existing neural network architectures, such as ResNets [8] and BERT-based language models [9], will be used as baseline encoders for the CLIP model to benchmark performance. We will then propose our own algorithm for long-tailed, multi-label disease classification on the chest X-ray images of the MIMIC-CXR dataset.
Previous CLIP-based algorithms such as MedCLIP [10] and CheXzero [11] have approached the problem with zero-shot methods. We aim to build on these approaches to create a better algorithm for classifying long-tailed diseases.
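As a rough illustration of how zero-shot multi-label prediction can work with a CLIP-style model, the sketch below scores each finding independently by comparing an image embedding against a positive and a negative text-prompt embedding and applying a two-way softmax. This is a simplified sketch under stated assumptions, not the implementation of MedCLIP or CheXzero: the embeddings are hand-written toy vectors standing in for encoder outputs, and the function names and the temperature value are assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_probability(image_emb, pos_emb, neg_emb, temperature=0.07):
    """Probability of a finding via softmax over positive/negative prompts.

    The temperature value is an assumption chosen for illustration.
    """
    pos_logit = cosine_similarity(image_emb, pos_emb) / temperature
    neg_logit = cosine_similarity(image_emb, neg_emb) / temperature
    shift = max(pos_logit, neg_logit)  # subtract max for numerical stability
    e_pos = math.exp(pos_logit - shift)
    e_neg = math.exp(neg_logit - shift)
    return e_pos / (e_pos + e_neg)


def zero_shot_multilabel(image_emb, prompt_embs, temperature=0.07):
    """Score every label independently, so several can be positive at once."""
    return {
        label: zero_shot_probability(image_emb, pos, neg, temperature)
        for label, (pos, neg) in prompt_embs.items()
    }


# Toy embeddings (hypothetical): in practice these would come from the
# image encoder and from text prompts passed through the text encoder.
image_emb = [1.0, 0.0, 0.5]
prompt_embs = {
    "Effusion": ([0.9, 0.1, 0.4], [-0.8, 0.2, 0.1]),
    "Pneumothorax": ([-0.5, 0.9, 0.0], [0.7, 0.0, 0.6]),
}

probs = zero_shot_multilabel(image_emb, prompt_embs)
for label, p in probs.items():
    print(f"{label}: {p:.3f}")
```

Because each label is scored on its own, no labelled training examples are needed for rare findings, which is one reason such zero-shot schemes are of interest for tail classes.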
References
[1] S. Kevin Zhou, Hayit Greenspan, Christos Davatzikos, James S. Duncan, Bram Van Ginneken, Anant Madabhushi, Jerry L. Prince, Daniel Rueckert, and Ronald M. Summers. A review of deep learning in medical imaging: Imaging traits, technology trends, case studies with progress highlights, and future promises. Proceedings of the IEEE, 109(5):820–838, May 2021.
[2] Gregory Holste, Song Wang, Ziyu Jiang, Thomas C. Shen, George Shih, Ronald M. Summers, Yifan Peng, and Zhangyang Wang. Long-Tailed Classification of Thorax Diseases on Chest X-Ray: A New Benchmark Study, pages 22–32. Springer Nature Switzerland, 2022.
[3] Yifan Zhang, Bingyi Kang, Bryan Hooi, Shuicheng Yan, and Jiashi Feng. Deep long-tailed learning: A survey, 2023.
[4] Lie Ju, Xin Wang, Lin Wang, Tongliang Liu, Xin Zhao, Tom Drummond, Dwarikanath Mahapatra, and Zongyuan Ge. Relational subsets knowledge distillation for long-tailed retinal diseases recognition, 2021.
[5] Guoli Wang, Pingping Wang, Jinyu Cong, Kunmeng Liu, and Benzheng Wei. BB-GCN: A bi-modal bridged graph convolutional network for multi-label chest x-ray recognition, 2023.
[6] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. Learning transferable visual models from natural language supervision, 2021.
[7] Xin Wen, Bingchen Zhao, Yilun Chen, Jiangmiao Pang, and Xiaojuan Qi. Generalization beyond data imbalance: A controlled study on CLIP for transferable insights, 2024.
[8] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition, 2015.
[9] Jinhyuk Lee, Wonjin Yoon, Sungdong Kim, Donghyeon Kim, Sunkyu Kim, Chan Ho So, and Jaewoo Kang. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics, 36(4):1234–1240, September 2019.
[10] Zifeng Wang, Zhenbang Wu, Dinesh Agarwal, and Jimeng Sun. MedCLIP: Contrastive learning from unpaired medical images and text, 2022.
[11] Ekin Tiu, Ellie Talius, Pujan Patel, Curtis P. Langlotz, Andrew Y. Ng, and Pranav Rajpurkar. Expert-level detection of pathologies from unannotated chest x-ray images via self-supervised learning, 2022.