Index
Multi-stage Patch based U-Net for Text Line Segmentation of Historical Documents
Deep Learning-based Bleed-through Removal in Historical Documents
Disentangling Visual Attributes for Inherently Interpretable Medical Image Classification
Project description:
Interpretability is essential when deep neural networks are applied in critical scenarios such as medical image processing. Current gradient-based [1] and counterfactual image-based [2] interpretability approaches can only provide information about where the evidence is; we also want to know what the evidence is. In this master thesis project, we will build an inherently interpretable classification method. The classifier will learn disentangled features that are semantically meaningful and, in future work, can be mapped to related clinical concepts.
This project is based on the visual feature attribution method previously proposed in [3], which generates a class-relevant attribution map for a given input disease image. We will extend this method to also generate class-relevant shape variations and design an inherently interpretable classifier that uses only the disentangled features (class-relevant intensity variation and shape variation). The method can be further extended by disentangling additional semantically meaningful and causally independent features such as texture, shape, and background, as in [4].
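To make the intended design concrete, the following minimal sketch (PyTorch) shows a classifier whose prediction is a simple, and hence inspectable, linear function of two disentangled codes only. All module names, layer sizes, and the two-head split are illustrative assumptions, not the method of [3]:

import torch
import torch.nn as nn

class DisentangledClassifier(nn.Module):
    def __init__(self, in_ch=1, code_dim=16, n_classes=2):
        super().__init__()
        # Shared convolutional encoder.
        self.backbone = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Two heads: one for intensity variation, one for shape variation.
        self.intensity_head = nn.Linear(64, code_dim)
        self.shape_head = nn.Linear(64, code_dim)
        # The classifier sees nothing but the two disentangled codes.
        self.classifier = nn.Linear(2 * code_dim, n_classes)

    def forward(self, x):
        h = self.backbone(x)
        z_int, z_shape = self.intensity_head(h), self.shape_head(h)
        logits = self.classifier(torch.cat([z_int, z_shape], dim=1))
        return logits, z_int, z_shape

model = DisentangledClassifier()
logits, z_int, z_shape = model(torch.randn(4, 1, 64, 64))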
References
[1] Ramprasaath R Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, and Dhruv Batra. Grad-cam: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE international conference on computer vision, pages 618–626, 2017.
[2] Cher Bass, Mariana da Silva, Carole Sudre, Petru-Daniel Tudosiu, Stephen M Smith, and Emma C Robinson. Icam: Interpretable classifi- cation via disentangled representations and feature attribution mapping. arXiv preprint arXiv:2006.08287, 2020.
[3] Christian F Baumgartner, Lisa M Koch, Kerem Can Tezcan, Jia Xi Ang, and Ender Konukoglu. Visual feature attribution using wasserstein gans. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 8309–8319, 2018.
[4] Axel Sauer and Andreas Geiger. Counterfactual generative networks. arXiv preprint arXiv:2101.06046, 2021.
MR automated image quality assessment
Virtual contrast enhancement of breast MRI using Deep learning
Network analysis of soluble factor-mediated autocrine and paracrine circuits in melanoma immunotherapy
Spike Detection in Gradient Coils of MR Scanners using Artificial Intelligence
Introduction
Spikes, also known as the herringbone artifact, are a well-known artifact in MR imaging. They occur when a hardware component produces an unwanted spark. Since spikes are caused by malfunctioning hardware components and degrade image quality, it is important to eliminate their cause. A common source is the gradient coils, which produce rapidly changing magnetic fields with high amplitude. The aim of this thesis is to develop a deep-learning-based spike detection algorithm based on multi-channel k-space data.
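For intuition, a single corrupted k-space sample already corrupts the whole image: its inverse Fourier transform is a complex exponential, i.e., a global stripe pattern. A minimal NumPy illustration, where all coordinates and amplitudes are arbitrary toy values:

import numpy as np

image = np.zeros((128, 128))
image[32:96, 32:96] = 1.0                      # simple phantom

kspace = np.fft.fftshift(np.fft.fft2(image))   # simulated acquisition
kspace[70, 90] += 0.5 * np.abs(kspace).max()   # the "spike": one bad sample

corrupted = np.abs(np.fft.ifft2(np.fft.ifftshift(kspace)))
# `corrupted` now shows the phantom with a stripe pattern across the entire
# field of view -- the signature this thesis aims to detect.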
Methods and data
For this work, anonymized clinical data in TWIX format, provided by Siemens Healthineers, are used. The dataset contains more than 90 recordings from more than 15 scanners, measured with a variety of different sequences. Each recording is annotated by one expert with a binary label per slice (or per partition for 3D recordings) indicating whether a spike is present.
The goal of this thesis is to create a deep learning pipeline for classifying the presence of spikes. This includes comparing different preprocessing techniques and neural network architectures (e.g., Res-blocks [1], Inception modules [2], and Dense-blocks [3]) in terms of their performance on the classification task. In addition, their computational performance will be evaluated.
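As a starting point, the following sketch shows one candidate architecture built from residual blocks [1]; the 2-channel input (e.g., real and imaginary part of a k-space slice), channel counts, and depth are assumptions to be settled during the thesis:

import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch),
        )

    def forward(self, x):
        return torch.relu(x + self.body(x))    # identity shortcut

class SpikeClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, 32, 3, stride=2, padding=1), nn.ReLU(),
            ResBlock(32), nn.MaxPool2d(2),
            ResBlock(32), nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 1),                   # binary logit: spike / no spike
        )

    def forward(self, x):
        return self.net(x)

logit = SpikeClassifier()(torch.randn(8, 2, 128, 128))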
Evaluation
The following aspects will be evaluated:
- Different preprocessing methods (e.g., dimensionality reduction, feature extraction, data augmentation) will be implemented and compared w.r.t. the classification performance
- Different model architectures (e.g., Res-blocks, Inception modules, Dense-blocks) will be implemented and compared w.r.t. the classification performance
- The classification performance will be evaluated with different metrics (see the sketch after this list) and the model's decision will be investigated with different attribution methods
- The chosen architecture will be analyzed and optimized w.r.t. its computational performance
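A minimal sketch of the metric-based evaluation, using scikit-learn on per-slice binary predictions; y_true and y_score are placeholders for the expert labels and the model outputs:

import numpy as np
from sklearn.metrics import (accuracy_score, f1_score,
                             precision_score, recall_score, roc_auc_score)

y_true = np.array([0, 0, 1, 1, 0, 1])                 # expert label per slice
y_score = np.array([0.1, 0.4, 0.8, 0.6, 0.3, 0.9])    # model probabilities
y_pred = (y_score >= 0.5).astype(int)

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("F1       :", f1_score(y_true, y_pred))
print("ROC AUC  :", roc_auc_score(y_true, y_score))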
References
[1] Kaiming He, et al. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770-778, 2016.
[2] Christian Szegedy, et al. Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1-9, 2015.
[3] Gao Huang, et al. Densely connected convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 4700-4708, 2017.
Similarity and duplicate search in artwork images
Similarity and duplicate search in artworks is important in the domain of art-historical studies. It is a very challenging task even for art historians, because image similarity in art depends on different features such as the color, texture, and style of the artwork [2]. Applying pretrained deep neural networks such as VGG16 or ResNet makes it efficient to find similar features between images. However, these networks are often strongly biased toward one of the features (e.g., they focus too much on color), and we usually do not know which features influence the similarity search the most, so they cannot be applied directly to artwork databases.
The goal of this thesis is to implement and evaluate a model based on deep neural networks that can find correlated artworks according to a custom definition of similarity.
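A minimal sketch of the pretrained-network baseline described above, using VGG16 from torchvision as a fixed feature extractor and plain cosine similarity for ranking; the file names are placeholders, and the custom similarity definition developed in the thesis would replace the cosine distance:

import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
vgg.classifier = vgg.classifier[:-1]   # drop last layer -> 4096-d features
vgg.eval()

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def embed(path):
    img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        return vgg(img).squeeze(0)

a, b = embed("artwork_a.jpg"), embed("artwork_b.jpg")  # placeholder files
similarity = torch.nn.functional.cosine_similarity(a, b, dim=0)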
Advanced Model Architectures for Interactive Segmentation and Segmentation Enhancement in CT Images
Thesis Description
Cerebrovascular accidents are a worldwide disease with a severe impact on patients and healthcare systems. Approximately 15 million people worldwide suffer an ischemic stroke each year [1, 2]. More detailed information about the condition of the arterial vessels can play a critical role both in preventing stroke and in improving stroke therapy [1, 3, 4].
Since about one third of patients die from the consequences of a stroke, it is of great interest to detect indications of cerebrovascular disease as quickly and efficiently as possible, making it possible to intervene in time or even to take preventive measures [1, 4]. Currently, however, vascular imaging in clinical routine is assessed primarily by visual-qualitative means only. The technical difficulties in extracting cerebral arteries and quantifying their parameters have prevented such data from becoming part of routine clinical practice [1, 5].
Image segmentation in general remains challenging for many applications. In particular, advanced applications such as ischemic infarct tissue segmentation require highly accurate results to ensure optimal patient care and treatment [6, 7]. Thus, segmentations of cerebral vessels are to date, if performed at all, predominantly created manually or semi-manually. Since manual vessel segmentation is time-consuming, research has focused on developing faster and more general automatic vessel segmentation methods [1, 5].
In recent years, deep learning techniques have proven to be a very useful approach to this problem because, unlike traditional thresholding approaches, they can incorporate spatial information into their predictions [8, 9]. Therefore, the current development trend is shifting away from the rule-based methods proposed in previous decades, such as vessel intensity distributions, geometric models, and vessel extraction methods [10, 11]. Although most rule-based approaches such as centerline tracing, active contour models, or region growing use various vessel image features for reconstruction [12, 10], these features are either hand-crafted or insufficiently validated [11, 10]. It is therefore difficult to achieve the desired level of robustness in vessel segmentation, and none of the proposed methods has found widespread application in clinical settings or in research [5].
However, even deep learning methods, which have been shown to be particularly powerful and adaptable, have their specific drawbacks, as they demand a large amount of training data [13, 14]. Providing this data is challenging because it usually contains sensitive personal information and is therefore not publicly available [15, 16, 17]. In addition, successful deep-learning-based segmentation also requires ground truth data, which is, as discussed earlier, extremely time-consuming and thus costly to create [1, 5].
Recently, several alternative strategies to circumvent this lack of annotations have been explored. For example, methods for semi-supervised semantic segmentation based on the generative adversarial network (GAN) approach have been developed successfully [17, 14, 18]. Subsequent work has further improved this approach by explicitly accounting for particular issues such as domain shift during translation, and by utilizing contrastive learning for translating unpaired images [19, 20].
In addition, pretraining algorithms have emerged that promise to improve performance by preparing the model in an unsupervised manner, an approach referred to as self-supervised learning. Its popularity can be traced back to well-known pretraining networks such as [21, 22, 23, 24]. These networks can incorporate unlabeled samples into the training and thus make use of entire datasets despite the lack of annotations, ultimately increasing model performance [21, 22, 17].
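For illustration, the following is a compact, simplified rendition of the contrastive pretraining idea behind [21], not the exact formulation of the cited papers: two augmented views of each unlabeled sample are pulled together in embedding space while all other samples in the batch are pushed apart (the NT-Xent loss):

import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """z1, z2: (N, D) embeddings of two augmented views of the same batch."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)      # (2N, D)
    sim = z @ z.t() / temperature                           # cosine similarities
    n = z1.shape[0]
    sim.masked_fill_(torch.eye(2 * n, dtype=torch.bool), float("-inf"))
    # For sample i, the positive is its other view at index (i + n) mod 2n.
    targets = torch.cat([torch.arange(n) + n, torch.arange(n)])
    return F.cross_entropy(sim, targets)

loss = nt_xent_loss(torch.randn(16, 128), torch.randn(16, 128))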
An alternative approach to alleviating this shortage of clinical annotations is to accelerate the time-consuming manual segmentation process itself. The idea of using deep learning methods to optimize this process has recently gained popularity [25, 26, 27]. Such interactive segmentation can be used not only to create annotations but also to improve existing ones: a segmentation is created in a first step and then optimized in subsequent steps, either automatically, interactively, or manually. The resulting changes are then propagated automatically to the entire vessel, saving valuable time [25, 26].
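A common interaction mechanism in click-based methods such as [25, 26] is to rasterize user clicks into extra guidance channels that are concatenated to the image before each refinement pass, so that every correction click conditions the next prediction. The following sketch shows this encoding; shapes and the Gaussian width are illustrative assumptions:

import numpy as np

def click_map(shape, clicks, sigma=5.0):
    """Render a list of (row, col) clicks as a Gaussian guidance channel."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    out = np.zeros(shape, dtype=np.float32)
    for r, c in clicks:
        out = np.maximum(out, np.exp(-((yy - r) ** 2 + (xx - c) ** 2)
                                     / (2 * sigma ** 2)))
    return out

image = np.random.rand(256, 256).astype(np.float32)   # CT slice placeholder
pos = click_map(image.shape, [(120, 130)])            # "this is vessel"
neg = click_map(image.shape, [(40, 200), (220, 60)])  # "this is background"
model_input = np.stack([image, pos, neg])             # (3, H, W) network input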
For the reasons stated above, this work aims to investigate whether advanced model architectures can be successfully used for semi-supervised and unsupervised image segmentation, with the overall goal of improving deep vessel segmentation, and will conduct an in-depth examination of the potential of pretraining methodologies to increase model performance. It will also investigate whether interactive segmentation can be applied in the medical field and how it can be integrated into the clinical workflow to reduce the annotation workload.
- Literature overview of the current state of the art and collection of frameworks
- Pretraining methods
- Interactive segmentation strategies
- Expanding the current state of the art for carotid artery segmentation
- Utilizing semi-supervised contrastive learning mechanisms
- Enabling interactive segmentation
- Systematic analysis and evaluation of the developed deep learning approaches
References
[1] Michelle Livne, Jana Rieger, Orhun Utku Aydin, Abdel Aziz Taha, Ela Marie Akay, Tabea Kossen, Jan Sobesky, John D Kelleher, Kristian Hildebrand, Dietmar Frey, et al. A u-net deep learning framework for
high performance vessel segmentation in patients with cerebrovascular disease. Frontiers in neuroscience, 13:97, 2019.
[2] Walter Johnson, Oyere Onuma, Mayowa Owolabi, and Sonal Sachdev. Stroke: a global response is needed. Bulletin of the World Health Organization, 94(9):634, 2016.
[3] Jason D Hinman, Natalia S Rost, Thomas W Leung, Joan Montaner, Keith W Muir, Scott Brown, Juan F Arenillas, Edward Feldmann, and David S Liebeskind. Principles of precision medicine in stroke. Journal
of Neurology, Neurosurgery & Psychiatry, 88(1):54–61, 2017.
[4] James C Grotta, Gregory W Albers, Joseph P Broderick, Scott E Kasner, Eng H Lo, Ralph L Sacco, Lawrence KS Wong, and Arthur L Day. Stroke E-Book: Pathophysiology, Diagnosis, and Management.
Elsevier Health Sciences, 2021.
[5] Renzo Phellan, Alan Peixinho, Alexandre Falcão, and Nils D Forkert. Vascular segmentation in tof mra images of the brain using a deep convolutional neural network. In Intravascular Imaging and Computer
Assisted Stenting, and Large-Scale Annotation of Biomedical Data and Expert Label Synthesis, pages 39–46. Springer, 2017.
[6] Maryam Rastgarpour and Jamshid Shanbehzadeh. The problems, applications and growing interest in automatic segmentation of medical images from the year 2000 till 2011. International Journal of Computer Theory and Engineering, 5(1):1, 2013.
[7] Richard Szeliski. Computer vision: algorithms and applications. Springer Science & Business Media, 2010.
[8] Jonathan Long, Evan Shelhamer, and Trevor Darrell. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 3431–3440, 2015.
[9] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical image computing and computer-assisted intervention, pages 234–241. Springer, 2015.
[10] David Lesage, Elsa D Angelini, Isabelle Bloch, and Gareth Funka-Lea. A review of 3d vessel lumen segmentation techniques: Models, features and extraction schemes. Medical image analysis, 13(6):819–845, 2009.
[11] Fengjun Zhao, Yanrong Chen, Yuqing Hou, and Xiaowei He. Segmentation of blood vessels using rule-based and machine-learning-based methods: a review. Multimedia Systems, 25(2):109–118, 2019.
[12] Yun Tian, Qingli Chen, Wei Wang, Yu Peng, Qingjun Wang, Fuqing Duan, Zhongke Wu, and Mingquan Zhou. A vessel active contour model for vascular segmentation. BioMed research international, 2014, 2014.
[13] Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep learning. MIT press, 2016.
[14] Wei-Chih Hung, Yi-Hsuan Tsai, Yan-Ting Liou, Yen-Yu Lin, and Ming-Hsuan Yang. Adversarial learning for semi-supervised semantic segmentation. arXiv preprint arXiv:1802.07934, 2018.
[15] Brett K Beaulieu-Jones, Zhiwei Steven Wu, Chris Williams, Ran Lee, Sanjeev P Bhavnani, James Brian Byrd, and Casey S Greene. Privacy-preserving generative deep neural networks support clinical data
sharing. Circulation: Cardiovascular Quality and Outcomes, 12(7):e005122, 2019.
[16] Omer Tene and Jules Polonetsky. Big data for all: Privacy and user control in the age of analytics. Nw. J. Tech. & Intell. Prop., 11:xxvii, 2012.
[17] Nima Tajbakhsh, Laura Jeyaseelan, Qian Li, Jeffrey N Chiang, Zhihao Wu, and Xiaowei Ding. Embracing imperfect datasets: A review of deep learning solutions for medical image segmentation. Medical Image Analysis, 63:101693, 2020.
[18] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. Advances in neural information processing
systems, 27, 2014.
[19] Yawei Luo, Liang Zheng, Tao Guan, Junqing Yu, and Yi Yang. Taking a closer look at domain shift: Category-level adversaries for semantics consistent domain adaptation. In Proceedings of the IEEE/CVF
Conference on Computer Vision and Pattern Recognition, pages 2507–2516, 2019.
[20] Taesung Park, Alexei A Efros, Richard Zhang, and Jun-Yan Zhu. Contrastive learning for unpaired imageto-image translation. In European Conference on Computer Vision, pages 319–345. Springer, 2020.
[21] Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. A simple framework for contrastive learning of visual representations. In International conference on machine learning, pages 1597–
1607. PMLR, 2020.
[22] Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre H Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Avila Pires, Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, et al. Bootstrap your own latent: A new approach to self-supervised learning. arXiv preprint arXiv:2006.07733, 2020.
[23] Xinlei Chen and Kaiming He. Exploring simple siamese representation learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 15750–15758, 2021.
[24] Mahmoud Assran, Mathilde Caron, Ishan Misra, Piotr Bojanowski, Armand Joulin, Nicolas Ballas, and Michael Rabbat. Semi-supervised learning of visual features by non-parametrically predicting view assignments with support samples. arXiv preprint arXiv:2104.13963, 2021.
[25] Sabarinath Mahadevan, Paul Voigtlaender, and Bastian Leibe. Iteratively trained interactive segmentation. In British Machine Vision Conference (BMVC), 2018.
[26] Konstantin Sofiiuk, Ilia Petrov, and Anton Konushin. Reviving iterative training with mask guidance for interactive segmentation. arXiv preprint arXiv:2102.06583, 2021.
[27] Xiangde Luo, Guotai Wang, Tao Song, Jingyang Zhang, Michael Aertsen, Jan Deprest, Sebastien Ourselin, Tom Vercauteren, and Shaoting Zhang. Mideepseg: Minimally interactive segmentation of unseen objects from medical images using deep learning. Medical Image Analysis, 72:102102, 2021.
Learning-based reduction of non-significant changes in subtraction volumes
Computed Tomography (CT) is a diagnostic tool that allows doctors and radiologists to visualize the internal morphology of the body. Radiologists compare CT studies to identify tumors, infections, and blood clots, and to assess the response to treatment. To identify changed features, radiologists visually compare the current study with a prior one: they align both studies while scrolling through the images and switch between the acquisitions to identify relevant changes.
Overlaying the current study with a color-coded confidence mask indicating changes is a helpful tool for marking areas with potential changes. Computing such a mask requires a registration of both datasets. Inaccurate registrations can introduce misalignments, which are then marked as tissue changes although they are not of clinical relevance. Such misalignments can cause shadow-like effects at tissue boundaries, which can obscure pathologically relevant features. Another source of non-significant changes is differing acquisition parameters, which result in salt-and-pepper noise.
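A toy illustration of the two nuisance effects described above; the data is synthetic, and the median filter is shown only to demonstrate that simple filtering removes the speckle but not the misalignment rims, which motivates a learned approach:

import numpy as np
from scipy.ndimage import median_filter, shift

prior = np.zeros((128, 128)); prior[40:90, 40:90] = 1.0  # synthetic organ
current = shift(prior, (1.5, 0.5), order=1)              # slight misregistration
diff = current - prior                                   # rim artifacts appear

noise = ((np.random.rand(128, 128) < 0.01)
         * np.random.choice([-1.0, 1.0], (128, 128)))
diff_noisy = diff + noise                                # salt-and-pepper noise
diff_denoised = median_filter(diff_noisy, size=3)        # speckle gone, rims remain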
The goal of this master thesis is to train a deep learning model to detect and remove non-significant changes. Generative Adversarial Networks (GANs) have shown promising results on image processing tasks with no or only limited ground truth data available. GANs consist of two models, a generator and a discriminator, that by design learn the distribution of the training data: the generator produces fake data, which is fed to the discriminator, which in turn aims to identify the fake examples. With this adversarial training method, we aim to improve the image quality of the difference images and hence make it easier for the physician to distinguish clinically relevant from non-significant changes.
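A minimal sketch of the adversarial wiring described above; architectures, data, and hyperparameters are placeholder assumptions, and 2-D slices stand in for volumes for brevity:

import torch
import torch.nn as nn

G = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                  nn.Conv2d(32, 1, 3, padding=1))           # generator
D = nn.Sequential(nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
                  nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                  nn.Linear(32, 1))                          # discriminator

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

raw = torch.randn(8, 1, 64, 64)     # noisy subtraction slices (placeholder)
clean = torch.randn(8, 1, 64, 64)   # artifact-free examples (placeholder)

# Discriminator step: real clean images vs. generator outputs.
fake = G(raw).detach()
loss_d = bce(D(clean), torch.ones(8, 1)) + bce(D(fake), torch.zeros(8, 1))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Generator step: fool the discriminator with its cleaned output.
loss_g = bce(D(G(raw)), torch.ones(8, 1))
opt_g.zero_grad(); loss_g.backward(); opt_g.step()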