Index
Detection of Label Noise in Solar Cell Datasets
On-site inspection of solar panels is a time-consuming and difficult process, as the panels are often hard to reach. Furthermore, identifying defects can be challenging, especially for small cracks. Electroluminescence (EL) imaging enables the detection of small cracks, for example using a convolutional neural network (CNN) [1,2]. Hence, it can be used to identify such cracks before they propagate and result in a measurable impact on the efficiency of a solar panel [3]. In this way, costly inspection and replacement of solar panels can be avoided.
To train a CNN for the detection of cracks, a comprehensive dataset of labeled solar cells is required. Unfortunately, assessing whether a certain structure on a polycrystalline solar cell corresponds to a crack is a hard task, even for human experts. As a result, setting up a consistently labeled dataset is nearly impossible. That is why EL datasets of solar cells typically contain a significant amount of label noise.
It has been shown that CNNs are robust against small amounts of label noise, but performance may degrade drastically starting at 5%-10% label noise [4]. This thesis will
(1) analyze the given dataset with respect to label noise and
(2) attempt to minimize the negative impact of label noise on the performance of the trained network.
Recently, Ding et al. proposed to identify label noise by clustering the features learned by the CNN [4]. As part of this thesis, the proposed method will be applied to a dataset consisting of more than 40k labeled samples of solar cells, which is known to contain a significant amount of label noise. It will be investigated whether the method can be used to identify noisy samples. Furthermore, it will be evaluated whether excluding noisy samples improves the performance of the resulting model. To this end, a subset of the dataset will be labeled by at least three experts to obtain a cleaned subset. Finally, an extension of the method will be developed. Here, it shall be evaluated whether the clustering step can be omitted, since it proved unstable in prior experiments using the same data.
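The core idea can be illustrated in a few lines: cluster the penultimate-layer features of the trained CNN and flag samples whose label disagrees with the majority label of their cluster. The following is a minimal sketch of that idea only; the clustering algorithm, the number of clusters, and the function names are illustrative assumptions, not the exact formulation of [4]:

```python
import numpy as np
from sklearn.cluster import KMeans

def flag_noisy_samples(features, labels, n_clusters=20):
    """Cluster penultimate-layer CNN features and flag samples whose
    label disagrees with the majority label of their cluster
    (illustrative sketch, not the exact formulation of [4])."""
    labels = np.asarray(labels)
    clusters = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(features)
    noisy = np.zeros(len(labels), dtype=bool)
    for c in range(n_clusters):
        members = clusters == c
        majority = np.bincount(labels[members]).argmax()
        noisy |= members & (labels != majority)
    return noisy  # candidate samples for relabeling or exclusion
```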
[1] Deitsch, Sergiu, et al. “Automatic classification of defective photovoltaic module cells in electroluminescence images.” Solar Energy 185 (2019): 455-468.
[2] Mayr, Martin, et al. “Weakly Supervised Segmentation of Cracks on Solar Cells Using Normalized Lp Norm.” 2019 IEEE International Conference on Image Processing (ICIP). IEEE, 2019.
[3] Köntges, Marc, et al. “Impact of transportation on silicon wafer‐based photovoltaic modules.” Progress in Photovoltaics: Research and Applications 24.8 (2016): 1085-1095.
[4] Ding, Guiguang, et al. “DECODE: Deep confidence network for robust image classification.” IEEE Transactions on Image Processing 28.8 (2019): 3752-3765.
Comparison of different text attention techniques for writer identification
Distillation Learning for Speech Enhancement
Noise suppression has remained a field of interest for more than five decades, and a number of techniques have been employed to extract clean and/or noise-free data. Continuous audio and video signals pose particular challenges for noise reduction, and deep neural network (DNN) techniques have been designed to enhance those signals (Valin, 2018). While DNNs are effective, they are computationally expensive and demand substantial memory resources. The aim of the proposed thesis is to address these constraints when working with limited memory and computational power, without compromising much on model performance.
A neural network (NN) can easily overfit the training data, owing to the large number of parameters and the number of training iterations for which the network is trained on the given data (Dakwale & Monz, 2019). One solution is to use an ensemble (combination) of models trained on the same data to achieve better generalization. This solution is limited by hardware constraints, however, when the network needs to run on hardware with limited memory and computational power, such as mobile phones. This resource limitation seeds the idea of distillation learning, in which the knowledge of a complex or ensembled network is transferred to a relatively simpler and computationally less expensive model.
Following the framework of distillation learning, a teacher-student network will be designed, starting from an existing trained teacher network. The teacher network has been trained on audio data with hard labels, using a dense parameter matrix. The large number of parameters dictates the complexity of the neural network and also its ability to identify and suppress signal noise (Hinton et al., 2015). The proposed method is to design a student network that tries to imitate the output of the teacher, i.e., its probability distribution, without needing the same number of parameters. By transferring the knowledge of the teacher to the student network, a simpler model can be designed with a reduced set of parameters, which is better suited for hardware with lower memory and computational power.
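The standard distillation objective combines a soft-target term, which matches the student's distribution to a temperature-softened teacher distribution, with the usual hard-label loss. A minimal PyTorch sketch, assuming conventional hyperparameters; the temperature and mixing weight follow the convention of Hinton et al. (2015) and are illustrative, not fixed by this proposal:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, hard_labels,
                      temperature=4.0, alpha=0.7):
    """Knowledge-distillation loss: KL divergence between the softened
    teacher and student distributions, plus the usual cross-entropy on
    the hard labels (illustrative hyperparameters)."""
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    kd = F.kl_div(log_soft_student, soft_teacher,
                  reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, hard_labels)
    return alpha * kd + (1 - alpha) * ce
```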
Motion Compensation Using Epipolar Consistency Condition in Computed Tomography
The hippocampus and the Successor Representation – An analysis of the properties of the Successor Representation, place cells, and grid cells.
The human brain is a major role model for computer science. Many applications like neural networks mimic brain functions with great success. However, many functions are still not well understood and are therefore the subject of ongoing research. The hippocampus is one of the regions of greatest interest. It is a central part of the limbic system, plays a key role in memory processing, and is used for spatial navigation. Place and grid cells are two important cell types found in the hippocampus, which help to encode information for navigational tasks [1].
New theories, however, extend this view from spatial navigation to more abstract navigation, which can be applied to all kinds of information. In the paper “The hippocampus as a predictive map”, a mathematical description of the place cells in the hippocampus, the Successor Representation (SR), is developed. The SR can be used to imitate the data processing method of the hippocampus and has already reproduced experimental results [2]. Other experiments have also extended the view from spatial navigation to broader information processing, showing, for example, that grid cells do not only encode Euclidean distances [3] or that we use grid and place cells to orient ourselves in our field of vision [4]. All of this could lead to a powerful data processing tool that adapts flexibly to all kinds of problems.
This thesis aims to build a framework that can be used to apply and analyze the properties of the SR. The framework should make it possible to create different environments for simple navigation tasks, but also to capture more abstract information relationships in graphs. Furthermore, mathematical properties will be analyzed to improve the learning process and to gain a broader understanding of the functionality of the SR.
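For a finite state space, the SR can be written down directly: with transition matrix T under a fixed policy and discount γ, M = (I − γT)⁻¹; alternatively, it can be learned online from observed transitions by temporal-difference updates. A minimal sketch of both, assuming a discrete environment (function names are illustrative):

```python
import numpy as np

def successor_representation(T, gamma=0.95):
    """Closed-form SR for a finite state space: M = (I - gamma * T)^-1,
    where T[s, s'] is the transition probability under the current
    policy. Row M[s] acts as the predictive 'place field' of state s."""
    n = T.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * T)

def td_update(M, s, s_next, alpha=0.1, gamma=0.95):
    """One temporal-difference update of the SR from an observed
    transition s -> s_next (the learning-based alternative)."""
    n = M.shape[0]
    indicator = np.eye(n)[s]
    M[s] += alpha * (indicator + gamma * M[s_next] - M[s])
    return M
```

In [2], grid-cell-like firing patterns correspond to low-frequency eigenvectors of M, which such a framework could expose for analysis.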
Development of a framework to simulate learning and task solving inspired by the hippocampus and successor representation
Since nervous systems have developed very efficient mechanisms to store, retrieve, and even extrapolate from learned experience, machine learning has always taken inspiration from nature. Even though neural networks, support vector machines, and deep networks have significantly pushed performance, science is still far from completely understanding the brain’s implementation of these phenomena.
The hippocampus is a structure of the brain present in both hemispheres. It has been shown to be responsible for both spatial orientation and memory management [1, 2], but recent studies suggest it is involved in far more profound learning tasks. This new theory assumes the hippocampus creates abstract cognitive maps with the ability to predict unknown states, complementing the established findings mentioned above [3]. To further investigate this behaviour and possibly add evidence to the theory, it is crucial to examine the two dominant neural cell types that have already been identified in the context of spatial orientation: so-called place cells on the one hand and grid cells on the other.
Place and grid cells were originally discovered to encode spatial information and were named accordingly. According to the theory of abstract cognitive mapping in the hippocampus, place cell activity is believed to represent states in general. Grid cells were originally found to fire uniformly over space in different orientations, generating a kind of coordinate system. In the context of this more holistic theory, grid cells could provide a reference frame for the abstract cognitive map. Many experiments have been conducted to investigate the behaviour of hippocampal structures related to learning.
The aim of this thesis is to create a framework for researchers to simulate and work in environments that are kept so simple that the results can be transferred to any other cognitive map. Hopefully, this can help to avoid complicated experimental setups, reduce the use of laboratory animal experiments, and speed up future research on the hippocampus’s role in learning.
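The kind of environment the framework targets can be captured by a small graph abstraction, under which both spatial gridworlds and abstract relational structures reduce to an adjacency matrix. A minimal sketch; the class and method names are illustrative, not part of this proposal:

```python
import numpy as np

class GraphEnvironment:
    """Minimal abstract environment: states are graph nodes and a
    random walk moves along edges (illustrative sketch)."""

    def __init__(self, adjacency):
        adjacency = np.asarray(adjacency, dtype=float)
        # Row-normalize the adjacency matrix into transition probabilities.
        self.T = adjacency / adjacency.sum(axis=1, keepdims=True)

    def step(self, state, rng=None):
        """Sample the successor state of a random walk from `state`."""
        if rng is None:
            rng = np.random.default_rng()
        return int(rng.choice(len(self.T), p=self.T[state]))

# Example: a three-state chain 0 - 1 - 2.
env = GraphEnvironment([[0, 1, 0],
                        [1, 0, 1],
                        [0, 1, 0]])
```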
[1] D. S. Olton, J. T. Becker, and G. E. Handelmann. “Hippocampus, space, and memory”. In: Behavioral and Brain Sciences 2.3 (1979), pp. 313–322. issn: 0140-525X. doi: 10.1017/S0140525X00062713.
[2] B. Milner, S. Corkin, and H.-L. Teuber. “Further analysis of the hippocampal amnesic syndrome: 14-year follow-up study of H.M.”. In: Neuropsychologia 6.3 (1968), pp. 215–234. issn: 0028-3932.
[3] K. L. Stachenfeld, M. M. Botvinick, and S. J. Gershman. “The hippocampus as a predictive map”. In: Nature Neuroscience 20.11 (2017), p. 1643. issn: 1546-1726.
Semi-Supervised Segmentation of Cell Images using Differentiable Rendering.
With the recent advancements in machine learning and mainly deep learning [1], deep convolutional neural networks (CNNs) [2–7] have been developed, which are able to learn from data sets containing millions of images [8] to solve object detection tasks. When trained on such big data sets, CNNs are able to achieve task-relevant object detection performance that is comparable or even superior to the capabilities of humans [9, 10]. A key problem of using deep learning for cell detection is the large amount of data needed to train such networks. The main difficulty lies in the acquisition of a representative data set of cell images, which should ideally cover various sizes, shapes, and distributions for a variety of cell types. Additionally, the acquired data must be manually annotated to obtain the so-called ‘ground truth’ or ‘labels’, which is in general error-prone, time-consuming, and costly.
Differentiable rendering [11–13], on the other hand, is an emerging technique which allows the generation of synthetic, photo-realistic images based on photographs of real-world objects by estimating their 3D shape and material properties. While this approach can be used to generate photo-realistic images, it can also be applied to generate the respective ground truth labels for segmentation and object detection masks. Combining differentiable rendering with deep learning could potentially solve the data bottleneck for machine learning algorithms in various fields, including materials science and biomedical engineering.
The work of this thesis is based on the differentiable rendering framework ‘Redner’ [11], using data from the Cell Tracking Challenge [14, 15]. In a first step, a literature review on differentiable rendering will be conducted. In a second step, an existing implementation for the light, shader, and geometry estimation of nanoparticles will be adapted for the semi-supervised segmentation of GFP-GOWT1 mouse stem cells. Afterwards, the results of this approach will be evaluated in terms of segmentation accuracy.
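At its core, the approach is an inverse-rendering optimization: scene parameters are adjusted by gradient descent until the rendered image matches the acquired one, after which the fitted geometry yields segmentation masks essentially for free. A minimal sketch of this loop, where `render` is a placeholder standing in for a differentiable renderer such as ‘Redner’; the loss and optimizer choices are illustrative assumptions:

```python
import torch

def fit_scene(render, params, target_image, steps=200, lr=1e-2):
    """Generic inverse-rendering loop: optimize scene parameters
    (geometry, light, shader) so the rendering matches a real image.
    `render` stands in for a differentiable renderer such as Redner."""
    optimizer = torch.optim.Adam(params, lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        image = render(*params)          # differentiable forward pass
        loss = torch.mean((image - target_image) ** 2)
        loss.backward()                  # gradients flow through the renderer
        optimizer.step()
    return params  # fitted scene; its geometry can be rasterized into masks
```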
The thesis will include the following points:
• Getting familiar with the concepts of Differentiable Rendering and Gradient-based learning methods
• Implementation of a proof-of-concept for the semi-supervised segmentation of cells based on the ‘Redner’
framework using existing data from the Cell Tracking Challenge
• Evaluation of the method in terms of segmentation accuracy
• Elaboration of potential improvements for the method
References
[1] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, 2015.
[2] K. He, G. Gkioxari, P. Dollár, and R. Girshick, “Mask R-CNN,” Proceedings of the IEEE International Conference on Computer Vision, pp. 2980–2988, 2017.
[3] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014.
[4] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778, 2016.
[5] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look once: Unified, real-time object detection,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788, 2016.
[6] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional networks for biomedical image segmentation,” in International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 234–241, Springer, 2015.
[7] T. Falk, D. Mai, R. Bensch, Ö. Çiçek, A. Abdulkadir, Y. Marrakchi, A. Böhm, J. Deubner, Z. Jäckel, K. Seiwald, A. Dovzhenko, O. Tietz, C. Dal Bosco, S. Walsh, D. Saltukoglu, T. L. Tay, M. Prinz, K. Palme, M. Simons, I. Diester, T. Brox, and O. Ronneberger, “U-Net: deep learning for cell counting, detection, and morphometry,” Nature Methods, vol. 16, no. 1, pp. 67–70, 2019.
[8] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “ImageNet: A large-scale hierarchical image database,” 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255, 2009.
[9] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. Van Den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis, “Mastering the game of Go with deep neural networks and tree search,” Nature, vol. 529, no. 7587, pp. 484–489, 2016.
[10] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” in Advances in Neural Information Processing Systems, pp. 1097–1105, 2012.
[11] T.-M. Li, M. Aittala, F. Durand, and J. Lehtinen, “Differentiable Monte Carlo ray tracing through edge sampling,” ACM Transactions on Graphics, vol. 37, Dec. 2018.
[12] M. Nimier-David, D. Vicini, T. Zeltner, and W. Jakob, “Mitsuba 2: A retargetable forward and inverse renderer,” ACM Transactions on Graphics (TOG), vol. 38, no. 6, p. 203, 2019.
[13] G. Loubet, N. Holzschuch, and W. Jakob, “Reparameterizing discontinuous integrands for differentiable rendering,” ACM Transactions on Graphics (TOG), vol. 38, no. 6, pp. 1–14, 2019.
[14] M. Maška, V. Ulman, D. Svoboda, P. Matula, P. Matula, C. Ederra, A. Urbiola, T. España, S. Venkatesan, D. M. Balak, et al., “A benchmark for comparison of cell tracking algorithms,” Bioinformatics, vol. 30, no. 11, pp. 1609–1617, 2014.
[15] V. Ulman, M. Maška, K. E. Magnusson, O. Ronneberger, C. Haubold, N. Harder, P. Matula, P. Matula, D. Svoboda, M. Radojevic, et al., “An objective comparison of cell-tracking algorithms,” Nature Methods, vol. 14, no. 12, p. 1141, 2017.
Detection of Hand Drawn Electrical Circuit Diagrams and their Components using Deep Learning Methods and Conversion into LTspice Format
Thesis Description
An electrical circuit diagram (ECD) is a graphical representation of an electrical circuit. ECDs consist of electrical circuit components (ECCs), where for each ECC a unique symbol is defined in the international standard [1]. The ECCs are connected with lines, which correspond to wires in the real world. Furthermore, ECCs are specified by an annotation next to their symbol, consisting of a value followed by a unit. For instance, a resistor can be denoted as ”100 mΩ” (milliohm). Voltage sources and current sources are ECCs that provide either a voltage (U) or a current (I) to the circuit. While U and I provided by sources are given, U and I with respect to certain ECCs have to be obtained through calculations. For small circuits this can be done by hand, but the calculation complexity grows with the size of the circuit, and even more so when alternating U/I sources are used, since certain component calculations then depend on the frequency of the source. Therefore, a circuit simulation software (CSS) is often used, in which complex simulations can easily be performed in an automated way. Before a circuit can be simulated in a CSS, it first has to be modeled in the application. Refaat et al. [2] compared the speed of drawing structured diagrams by hand and with the diagram drawing tool Microsoft Visio. Their experiments showed that drawing by hand was around 90% faster than drawing with Microsoft Visio. Since ECDs are also structured diagrams, a hand drawn approach is likely more efficient than an application-based drawing approach. Hence, an automated method to convert an image of a hand drawn ECD into a digital format processable by a CSS would ease the use of CSS.
So far, various studies have been conducted on the segmentation and recognition of ECCs and the tracing of the connections between them, which are briefly described in the following. The proposed approaches can be structured as follows: 1) classification of ECCs [3, 4, 5], 2) segmentation and classification of ECCs [6, 7], 3) segmentation and classification of ECCs and ECD topology acquisition [8], 4) object detection of ECCs and ECD topology acquisition [9]. Moetesum et al. [6] used computer vision methods to segment ECCs from an ECD, where different strategies were used for different ECC types to obtain a segmentation mask. For instance, sources were segmented by filling the region inside the source symbol, followed by a bounding box drawn around the segmentation mask. A Histogram of Oriented Gradients was applied to the region inside the bounding box to obtain a feature vector for a subsequent Support Vector Machine classifier. While this approach yielded good classification results, it is only partially extendable: for ECCs with a shape similar to components already covered by a segmentation strategy, the existing strategy can probably be reused, but for completely new shapes a new strategy has to be introduced. The aim of the method proposed by Dhanushika et al. [9] was to extract a Boolean expression from an ECD made of logic gate components (AND, OR, NOT, etc.). The ECC classification was modeled using the object detection algorithm YOLO (You Only Look Once) [10], which localizes and classifies an object in a single step. The ECD topology was recognized by removing the bounding boxes from the image and applying a Hough transform to the remaining connections. Hough lines and bounding box intersections were then used to form the ECD topology, from which the final Boolean expression was generated.
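The topology step of [9] can be sketched in a few lines of OpenCV: the detected component boxes are masked out of the binarized drawing, and the remaining wire pixels are traced with a probabilistic Hough transform. The parameters below are illustrative, not those used in [9]:

```python
import cv2
import numpy as np

def wire_segments(binary_img, component_boxes):
    """Mask out detected component bounding boxes, then recover the
    remaining wires as line segments via a probabilistic Hough
    transform (illustrative parameters)."""
    wires = binary_img.copy()
    for (x, y, w, h) in component_boxes:
        wires[y:y + h, x:x + w] = 0       # erase component pixels
    lines = cv2.HoughLinesP(wires, rho=1, theta=np.pi / 180, threshold=40,
                            minLineLength=25, maxLineGap=10)
    return [] if lines is None else [tuple(l[0]) for l in lines]
```

Segment endpoints lying on a bounding-box border then act as component terminals, from which the circuit graph is assembled.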
All of the above-mentioned methods were restricted to drawings on white paper. As it is quite common to also draw on gridded paper, this restriction might be too severe for real-world use. Furthermore, no method has been proposed so far that covers the full conversion, from the image all the way to a simulation based on a CSS-formatted file.
Thus, this thesis aims to develop a full processing pipeline able to convert images of hand drawn ECDs into an intermediate format which reflects the topologies of the ECDs. Extensibility should be ensured by using an object detection deep neural network architecture, which, due to the nature of neural networks, can simply be extended by providing new data and labels for the training step. The pipeline should also be invariant to image quality (paper type, lighting conditions, background, etc.), at least for white and gridded paper. Furthermore, the pipeline should include the recognition of component annotations, e.g. component values and voltage/current flow symbols. The conversion into a CSS format will be realized using the example of LTspice. Additionally, the methods should be chosen such that the pipeline can be executed on mobile hardware; thus, the computational effort of the whole pipeline must be kept as low as possible.
The thesis will comprise the following work items:
- Collection of a suitable dataset
- Object detection of ECCs and annotations in images of hand drawn ECDs
- Segmentation of the ECD from the drawing
- Identification of the ECD topology
- Postprocessing
- Building the ECD topology
- Assigning annotations to corresponding ECCs
- Embedding gathered information into an LTspice file (see the sketch after this list)
- Optional: Mobile demo application
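Since LTspice reads plain SPICE netlists, the final export step reduces to serializing the recognized graph. A minimal sketch, assuming a hypothetical intermediate representation of the pipeline as (name, node, node, value) tuples:

```python
def write_ltspice_netlist(path, components):
    """Write a plain SPICE netlist readable by LTspice. `components`
    is a hypothetical list of (name, node_a, node_b, value) tuples
    produced by the pipeline; the first letter of each name encodes
    the component type (R, C, L, V, I, ...)."""
    with open(path, "w") as f:
        f.write("* netlist generated from a hand drawn ECD\n")
        for name, node_a, node_b, value in components:
            f.write(f"{name} {node_a} {node_b} {value}\n")
        f.write(".end\n")

# Example: a 5 V source driving a 100 mOhm resistor.
write_ltspice_netlist("circuit.net",
                      [("V1", "N001", "0", "5"),
                       ("R1", "N001", "0", "100m")])
```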
References
[1] IEC-60617. https://webstore.iec.ch/publication/2723. Accessed: 21-12-2020.
[2] K. Refaat, W. Helmy, A. Ali, M. AbdelGhany, and A. Atiya. A new approach for context-independent
handwritten offline diagram recognition using support vector machines. In 2008 IEEE International Joint
Conference on Neural Networks (IEEE World Congress on Computational Intelligence), pages 177–182,
2008.
[3] M. Rabbani, R. Khoshkangini, H.S. Nagendraswamy, and M. Conti. Hand drawn optical circuit recognition.
Procedia Computer Science, 84:41 – 48, 2016. Proceeding of the Seventh International Conference on
Intelligent Human Computer Interaction (IHCI 2015).
[4] M. Günay, M. Köseoğlu, and Ö. Yıldırım. Classification of hand-drawn basic circuit components using convolutional neural networks. In 2020 International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA), pages 1–5, 2020.
[5] S. Roy, A. Bhattacharya, and N. Sarkar et al. Offline hand-drawn circuit component recognition using
texture and shape-based features. Springer Science+Business Media, August 2020.
[6] M. Moetesum, S. Waqar Younus, M. Ali Warsi, and I. Siddiqi. Segmentation and recognition of electronic
components in hand-drawn circuit diagrams. EAI Endorsed Transactions on Scalable Information Systems,
5(16), 4 2018.
[7] M. D. Patare and M. Joshi. Hand-drawn digital logic circuit component recognition using SVM. International Journal of Computer Applications, 143:24–28, 2016.
[8] B. Edwards and V. Chandran. Machine recognition of hand-drawn circuit diagrams. In 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100),
volume 6, pages 3618–3621 vol.6, 2000.
[9] T. Dhanushika and L. Ranathunga. Fine-tuned line connection accompanied boolean expression generation
for hand-drawn logic circuits. In 2019 14th Conference on Industrial and Information Systems (ICIIS),
pages 436–441, 2019.
[10] J. Redmon, S. K. Divvala, R. B. Girshick, and A. Farhadi. You only look once: Unified, real-time object
detection. CoRR, abs/1506.02640, 2015.
Semi-Supervised Beating Whole Heart Segmentation Based on 3D Cine MRI in Congenital Heart Disease Using Deep Learning
The heart is a dynamic, beating organ, and until now it has been challenging to fully capture its complexity by magnetic resonance imaging (MRI). In an ideal world, doctors could create a 3-dimensional (3D) visual representation of each patient’s unique heart and watch as it pumps, moving through each phase of the cardiac cycle [2].
A standard cardiac MRI consists of multiple 2D image slices stacked next to each other that must be carefully positioned by the MRI technologist based on the patient’s anatomy. Planning the location and angle of the slices requires a highly knowledgeable operator and takes time [2].
Recently, a new MRI-based technology referred to as “3D cine” has been developed that can produce moving 3D images of the heart. It allows cardiologists and cardiac surgeons to see a patient’s heart from any angle and observe its movement throughout the entire cardiac cycle [2], and it also enables the assessment of cardiac morphology and function [4].
Fully automatic methods for the analysis of 3D cine cardiovascular MRI would improve the clinical utility of this promising technique. At the moment, there is no automatic segmentation algorithm available for 3D cine images of the heart, and manual segmentation of 3D cine images is time-consuming and impractical. Therefore, in this master’s thesis, different deep learning (DL) techniques based on 3D MRI data will be investigated in order to automate the segmentation process. In particular, two time frames of every 3D image might first be semi-automatically segmented [3]. The segmentations of these two time frames will then be used to train a deep neural network for the automatic segmentation of the other time frames.
The datasets are acquired from 125 different patients at Boston Children’s Hospital1. In contrast to standard cardiac MRI, where patients must hold their breath while the image is acquired, these datasets are obtained by tracking the patient’s breathing motion and only collecting data during expiration, i.e., when the patient is breathing out [1].
The segmentation results will be quantitatively validated using the Dice score and qualitatively evaluated by clinicians.
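The quantitative criterion is the Dice coefficient, 2|A ∩ B| / (|A| + |B|) for a predicted mask A and a ground-truth mask B; a minimal NumPy sketch:

```python
import numpy as np

def dice_score(prediction, ground_truth):
    """Dice coefficient between two binary masks:
    2 * |A ∩ B| / (|A| + |B|)."""
    prediction = prediction.astype(bool)
    ground_truth = ground_truth.astype(bool)
    intersection = np.logical_and(prediction, ground_truth).sum()
    denom = prediction.sum() + ground_truth.sum()
    return 2.0 * intersection / denom if denom > 0 else 1.0
```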
The thesis will comprise the following work items:
• Data processing and manual annotation of the available datasets in order to utilize them for the DL methods.
• Development and implementation of 3D cine segmentation models based on DL techniques.
• Quantitative evaluation of the segmentation results with respect to the Dice score.
The thesis will be carried out at the Department of Pediatrics at Harvard University Medical School and the Department of Cardiology at Boston Children’s Hospital, in cooperation with the Pattern Recognition Lab at FAU Erlangen-Nuremberg and the Computer Science and Artificial Intelligence Lab of MIT. Furthermore, the results of the study are expected to be published as an abstract and article at the International Society for Cardiovascular Magnetic Resonance in Medicine2.
1Department of Cardiology, Boston Children’s Hospital, 300 Longwood Avenue, Boston, MA 02115, USA
2https://scmr.org/
References
[1] Mehdi Hedjazi Moghari, Ashita Barthur, Maria Amaral, Tal Geva, and Andrew Powell. Free-breathing whole-heart 3D cine magnetic resonance imaging with prospective respiratory motion compensation: Whole-heart 3D cine MRI. Magnetic Resonance in Medicine, 80, 2017.
[2] Erin Horan. The future of cardiac MRI: 3-D cine. Boston Children’s Hospital’s science and clinical innovation blog, 2016. [Online]. Available: https://vector.childrenshospital.org/2016/12/the-future-of-cardiac-mri-3-d-cine.
[3] Danielle F. Pace. Image segmentation for highly variable anatomy: Applications to congenital heart disease. Doctoral dissertation, Massachusetts Institute of Technology, Cambridge, MA, USA, 2020.
[4] Jens Wetzl, Michaela Schmidt, Francois Pontana, Benjamin Longere, Felix Lugauer, Andreas Maier, Joachim Hornegger, and Christoph Forman. Single-breath-hold 3-D cine imaging of the left ventricle using Cartesian sampling. Magnetic Resonance Materials in Physics, Biology and Medicine, 31:1–13, 2017.
Deep Learning based image enhancement for contrast agent minimization in cardiac MRI
Late gadolinium enhancement (LGE) imaging has become an indispensable tool in the diagnosis and assessment of myocardial infarction (MI). Size, location, and extent of the infarcted tissue are important indicators to assess treatment efficacy and to predict functional recovery [1]. In LGE imaging, T1-weighted inversion recovery pulse sequences are applied several minutes after injection of a gadolinium-based contrast agent (GBCA). However, contraindications (e.g. renal insufficiency) and severe adverse effects (e.g. nephrogenic systemic fibrosis) of GBCAs are known [2]. Therefore, the minimization of administered contrast agent doses is a subject of current research. Existing neural network-based approaches either rely on cardiac wall motion abnormalities [3] or have been developed for brain MRI [4].
The aim of this thesis is to develop a post-processing approach based on convolutional neural networks (CNNs) to accurately segment and quantify myocardial scar in 2-D LGE images acquired with reduced doses of GBCA. For this purpose, synthetic data generated with an in-house MRI simulation suite is used initially. The 4-D XCAT phantom [5] is used for the simulation, as it offers multiple possibilities for variations in patient anatomy as well as in the geometry and location of myocardial scar. Furthermore, the simulated images will include variability in certain acquisition parameters to best reflect in-vivo data. In addition to LGE images, T1-maps are simulated with different levels of contrast agent dose. In the scope of this thesis, multiple approaches using different combinations of input data (i.e. LGE images and/or T1-maps at zero-dose and/or low-dose) are explored. The performance of the network will be evaluated on simulated and in-vivo data. Depending on availability, in-vivo data will also be incorporated into the training process.
The thesis covers the following aspects:
• Generation of simulated training data, best reflecting in-vivo data
• Development of the CNN-based system, including implementation using PyTorch
• Optional: depending on data availability and on previous results, incorporation of in-vivo data into the training process
• Quantitative evaluation of the implemented network on simulated and in-vivo data using the Dice score and clinically relevant MI quantification metrics, e.g. the full width at half maximum (FWHM) method (see the sketch after this list)
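In its simplest form, the FWHM method classifies every myocardial voxel whose signal exceeds half of the maximum intensity within the hyperenhanced region as scar. A simplified sketch of this criterion; a real implementation would additionally require a delineated hyperenhanced reference region, whereas here the maximum is taken over the whole myocardium for brevity:

```python
import numpy as np

def fwhm_scar_mask(lge_image, myocardium_mask):
    """Full width at half maximum scar criterion: within the
    myocardium, voxels above 50% of the maximum signal intensity
    are classified as scar (simplified sketch)."""
    threshold = 0.5 * lge_image[myocardium_mask].max()
    return myocardium_mask & (lge_image >= threshold)
```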
References
[1] V. Hombach, N. Merkle, P. Bernhard, V. Rasche, and W. Rottbauer, “Prognostic significance of cardiac magnetic resonance imaging: Update 2010,” Cardiology Journal, 2010.
[2] L. Bakhos and M. A. Syed, Contrast Media, pp. 271–281. Cham: Springer International Publishing, 2015.
[3] N. Zhang, G. Yang, Z. Gao, C. Xu, Y. Zhang, R. Shi, J. Keegan, L. Xu, H. Zhang, Z. Fan, and D. Firmin, “Deep learning for diagnosis of chronic myocardial infarction on nonenhanced cardiac cine MRI,” Radiology, 2019.
[4] E. Gong, J. M. Pauly, M. Wintermark, and G. Zaharchuk, “Deep learning enables reduced gadolinium dose for contrast-enhanced brain MRI,” Journal of Magnetic Resonance Imaging, 2018.
[5] W. P. Segars, G. Sturgeon, S. Mendonca, J. Grimes, and B. M. Tsui, “4D XCAT phantom for multimodality imaging research,” Medical Physics, 2010.