
Synthetic X-rays from CT volumes for deep learning

X-rays are a standard imaging modality in clinical care, and various artificial intelligence (AI) applications have been proposed to support clinical work with X-ray images. AI-based applications employing deep learning require large amounts of training data that must be structured and annotated with respect to the anatomical regions of interest. However, acquiring this training data is challenging because annotating and labelling image data is time-intensive, error-prone, and expensive. As an alternative, Computed Tomography (CT) data along with annotations generated by existing AI software can be used to generate synthetic X-ray images with the corresponding transformed annotations [1][2].

In this master’s thesis, the use of synthetic X-rays generated from CT volumes for deep learning shall be investigated. Synthetic X-rays are simulated radiographic images produced through a perspective projection of the three-dimensional CT image volume onto a two-dimensional image plane. The application focuses mainly on orthopedic imaging, in particular spine imaging. A deep neural network is trained to identify anatomical landmarks of the vertebrae (e.g. corners or centers) using only the generated synthetic X-ray data [3][4]. This trained network is then extensively tested on unseen datasets of real X-ray images. The hypothesis is that synthetic 2D data from CT volumes (images and annotations) can improve the training of a deep neural network for X-ray applications. The results should demonstrate whether generated images can effectively be used in place of real data for training.
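To make the projection step concrete, the following is a minimal sketch of how such a digitally reconstructed radiograph (DRR) and a matching landmark projection could be computed. It uses a parallel-beam simplification of the perspective geometry described above; the array names, the attenuation constant, and the camera parameters are illustrative assumptions, not part of the referenced methods.

```python
# Minimal DRR sketch: line integrals of attenuation through a CT volume
# (Beer-Lambert law), plus a pinhole projection for landmark annotations.
import numpy as np

MU_WATER = 0.02  # approx. linear attenuation of water at ~70 keV, in 1/mm

def drr_parallel(ct_hu, voxel_size_mm=1.0, axis=0):
    """Parallel-beam DRR from a CT volume given in Hounsfield units."""
    mu = MU_WATER * (1.0 + ct_hu / 1000.0)   # HU -> linear attenuation
    mu = np.clip(mu, 0.0, None)              # air/padding must not go negative
    line_integral = mu.sum(axis=axis) * voxel_size_mm
    return 1.0 - np.exp(-line_integral)      # invert so dense bone is bright

def project_landmark(p_mm, focal_mm, cx, cy):
    """Pinhole (perspective) projection of a 3D landmark in camera
    coordinates onto the 2D detector, so annotations follow the image."""
    x, y, z = p_mm
    return focal_mm * x / z + cx, focal_mm * y / z + cy
```

A full pipeline such as DeepDRR [2] additionally models spectra, scatter, and noise; the sketch only illustrates why every 3D annotation transforms consistently with the generated image.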

 

The thesis consists of the following milestones:

1: Create a landmark detector model (vertebral corners or centers) from real spine X-ray data

2: Generate synthetic X-ray images and corresponding annotations from available CT data

3: Train the landmark detector model using only the synthetic X-rays

4: Evaluate the results generated from the two trained models

References:

[1] B. Bier, F. Goldmann, J. Zaech, J. Fotouhi, R. Hegeman, R. Grupp, M. Armand, G. Osgood, N. Navab, A. Maier and M. Unberath, “Learning to detect anatomical landmarks of the pelvis in X-rays from arbitrary views”, International Journal of Computer Assisted Radiology and Surgery 14, 1463–1473 (2019)

[2] M. Unberath, J. Zaech, S.C. Lee, B. Bier, J. Fotouhi, M. Armand and N. Navab, “DeepDRR – A catalyst for machine learning in fluoroscopy-guided procedures” (2018), arXiv:1803.08606 [physics.med-ph]

[3] B. Khanal, L. Dahal, P. Adhikari and B. Khanal, “Automatic Cobb Angle Detection Using Vertebra Detector and Vertebra Corners Regression”, in: Computational Methods and Clinical Applications for Spine Imaging (CSI 2019), Lecture Notes in Computer Science, vol. 11963, Springer, Cham (2020)

[4] J. Yi, P. Wu, Q. Huang, H. Qu, D.N. Metaxas, “Vertebra-focused landmark detection for scoliosis assessment” (2020) arXiv:2001.03187 [eess.IV]

 

Automatic characterization of nanoparticles using deep learning techniques

Nanotechnology has brought numerous advances across its fields of application, ranging from electronics to medicine. Nanomedicine, the emerging field at the intersection of the pharmaceutical sciences, biomedical sciences, and nanotechnology, investigates the potential of nanoparticles to improve diagnostics and therapy in healthcare [1, 2]. Interactions of these particles with the biological environment depend on key factors such as particle size, shape, and distribution. These aspects impact the particles’ efficacy, safety, and toxicological profiles [1–4]. Therefore, it is important to develop an accurate method to measure particle size and distribution and to characterize the particles in order to assess their quality and safety [2].

To assist in this task, an automatic yet reliable method would be desirable to eliminate human subjectivity [5]. Recently, deep learning has emerged as a powerful tool and will continue to attract considerable interest in microscopy image analysis, for tasks such as object detection and segmentation, extraction of regions of interest (ROIs), and image classification [6].
In this thesis, we will employ a well-established deep neural network to automatically detect, segment, and classify nanoparticles in microscopy images. Additionally, we will extend the method to measure the size of the nanoparticles, which also requires annotating the particles’ measurements beforehand. Finally, we will evaluate our approach and analyze the outcomes.
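As a sketch of the size-measurement step, the snippet below derives an equivalent diameter for every particle from a predicted binary mask; the mask source, threshold, and pixel size are assumptions for illustration.

```python
# Illustrative size measurement from predicted segmentation masks.
import numpy as np
from skimage.measure import label, regionprops

def particle_sizes(mask, nm_per_pixel):
    """Equivalent diameter (in nm) of each connected particle region:
    the diameter of a circle with the same area as the region."""
    return [r.equivalent_diameter * nm_per_pixel
            for r in regionprops(label(mask))]

# Hypothetical usage with a network's output:
# sizes = particle_sizes(predicted_mask > 0.5, nm_per_pixel=2.1)
```

A learned size head on the network itself, as planned below, could replace this post-hoc measurement; the geometric version mainly serves as a baseline and sanity check.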
The thesis will include the following points:
• Get familiar with the nanoparticle characterization problem and the tools applied in this work.
• Extend the dataset’s annotations with the nanoparticles’ measurements.
• Modify the chosen network to predict the nanoparticles’ size.
• Employ the modified network to detect, segment, and classify nanoparticles and predict their size.
• Evaluate the results according to appropriate metrics for the task.
• Elaborate further improvements for the proposed method.
Academic advisors:

References
[1] D. Bobo, K. J. Robinson, J. Islam, K. J. Thurecht, and S. R. Corrie, “Nanoparticle-based medicines: a review of FDA-approved materials and clinical trials to date,” Pharmaceutical Research, vol. 33, no. 10, pp. 2373–2387, 2016.
[2] F. Caputo, J. Clogston, L. Calzolai, M. Rösslein, and A. Prina-Mello, “Measuring particle size distribution of nanoparticle enabled medicinal products, the joint view of EUNCL and NCI-NCL. A step by step approach combining orthogonal measurements with increasing complexity,” Journal of Controlled Release, vol. 299, pp. 31–43, 2019.
[3] V. Mohanraj and Y. Chen, “Nanoparticles – a review,” Tropical Journal of Pharmaceutical Research, vol. 5, no. 1, pp. 561–573, 2006.
[4] A. G. Roca, L. Gutiérrez, H. Gavilán, M. E. F. Brollo, S. Veintemillas-Verdaguer, and M. del Puerto Morales, “Design strategies for shape-controlled magnetic iron oxide nanoparticles,” Advanced Drug Delivery Reviews, vol. 138, pp. 68–104, 2019.
[5] B. Sun and A. S. Barnard, “Texture based image classification for nanoparticle surface characterisation and machine learning,” Journal of Physics: Materials, vol. 1, no. 1, p. 016001, 2018.
[6] L. Lu, Y. Zheng, G. Carneiro, and L. Yang, “Deep learning and convolutional neural networks for medical image computing,” Advances in Computer Vision and Pattern Recognition, Springer: New York, NY, USA, 2017.

Weakly supervised localization of defects in electroluminescence images of solar cells

With the recent rise of renewable energy, the usage of solar energy has also grown rapidly. Detecting faulty panels in production and on-site has therefore become more important. Prior works focus on fault detection using, e.g., the current, voltage, and temperature of solar modules as inputs [6, 1], but the localization of defects using imaging and machine learning has only recently gained attention [5, 4].

This work studies the detection of defects in electroluminescence (EL) images of solar cells using state-of-the-art computer vision techniques, with a focus on crack detection. Previously, in order to train a model to predict pixel-wise classifications, exhaustive labelling of every pixel in each image of the dataset was required. State-of-the-art training methods allow models to predict coarse segmentations from image-wise classification labels only, by means of weakly supervised training. Recently, it has been shown that these methods can also be applied to perform a coarse segmentation of cracks on EL images of solar cells [5].

This thesis aims to improve upon the existing method. To this end, weakly supervised learning methods such as guided backpropagation, Grad-CAM, Score-CAM, and adversarial learning [5, 9, 2, 7, 8, 3] will be implemented to train a model that reliably and accurately localizes cracks in a dataset of about 40k image-wise annotated EL images of solar cells. Finally, a thorough evaluation will show whether these methods can improve over the state of the art.
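As an illustration of one of these techniques, the following is a hedged Grad-CAM-style sketch [9, 2] for a binary crack classifier; the ResNet-18 backbone, the hooked layer, and the class index are assumptions, not the exact setup of [5].

```python
# Grad-CAM sketch: coarse crack localization from image-level labels only.
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

model = resnet18(num_classes=2).eval()   # assumed binary crack classifier
feats, grads = {}, {}
model.layer4.register_forward_hook(lambda m, i, o: feats.update(a=o))
model.layer4.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))

def grad_cam(image, crack_class=1):
    """Heatmap of regions driving the 'crack' logit for one image (1,3,H,W)."""
    logits = model(image)
    model.zero_grad()
    logits[0, crack_class].backward()
    weights = grads["a"].mean(dim=(2, 3), keepdim=True)  # GAP over gradients
    cam = F.relu((weights * feats["a"]).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=image.shape[2:], mode="bilinear",
                        align_corners=False)
    return cam / (cam.max() + 1e-8)
```

Score-CAM [7] and adversarial erasing [8, 3] replace the gradient weighting with forward-pass scores or training-time competition, but yield heatmaps of the same form, which makes a unified evaluation straightforward.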

References

[1] Ali, Mohamed Hassan, et al. “Real time fault detection in photovoltaic systems.” Energy Procedia 111 (2017): 914-923.
[2] Chattopadhay, Aditya, et al. “Grad-cam++: Generalized gradient-based visual explanations for deep convolutional networks.” 2018 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 2018.
[3] Choe, Junsuk, and Hyunjung Shim. “Attention-based dropout layer for weakly supervised object localization.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.
[4] Deitsch, Sergiu, et al. “Automatic classification of defective photovoltaic module cells in electroluminescence images.” Solar Energy 185 (2019): 455-468.
[5] Mayr, Martin, et al. “Weakly Supervised Segmentation of Cracks on Solar Cells Using Normalized Lp Norm.” 2019 IEEE International Conference on Image Processing (ICIP). IEEE, 2019.
[6] Triki-Lahiani, Asma, Afef Bennani-Ben Abdelghani, and Ilhem Slama-Belkhodja. “Fault detection and monitoring systems for photovoltaic installations: A review.” Renewable and Sustainable Energy Reviews 82 (2018): 2680-2692.
[7] Wang, Haofan, et al. “Score-CAM: Score-weighted visual explanations for convolutional neural networks.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. 2020.
[8] Zhang, Xiaolin, et al. “Adversarial complementary learning for weakly supervised object localization.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.
[9] Zhou, Bolei, et al. “Learning deep features for discriminative localization.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.

Detection of Label Noise in Solar Cell Datasets

On-site inspection of solar panels is a time-consuming and difficult process, as the solar panels are often difficult to reach. Furthermore, identifying defects can be hard, especially for small cracks. Electroluminescence (EL) imaging enables the detection of small cracks, for example using a convolutional neural network (CNN) [1,2]. Hence, it can be used to identify such cracks before they propagate and result in a measurable impact on the efficiency of a solar panel [3]. This way, costly inspection and replacement of solar panels can be avoided.

To train a CNN for the detection of cracks, a comprehensive dataset of labeled solar cells is required. Unfortunately, assessing whether a certain structure on a polycrystalline solar cell corresponds to a crack is a hard task, even for human experts. As a result, setting up a consistently labeled dataset is nearly impossible, which is why EL datasets of solar cells tend to contain a significant amount of label noise.

It has been shown that CNNs are robust against small amounts of label noise, but there may be a drastic influence on performance starting at 5–10% label noise [4]. This thesis will

(1) analyze the given dataset with respect to label noise and
(2) attempt to minimize the negative impact of label noise on the performance of the trained network.

Recently, Ding et al. proposed to identify label noise by clustering the features learned by the CNN [4]. As part of this thesis, the proposed method will be applied to a dataset of more than 40k labeled samples of solar cells, which is known to contain a significant amount of label noise. It will be investigated whether the method can be used to identify noisy samples. Furthermore, it will be evaluated whether abstaining from noisy samples improves the performance of the resulting model. To this end, a subset of the dataset will be labeled by at least three experts to obtain a cleaned subset. Finally, an extension of the method will be developed. Here, it shall be evaluated whether the clustering can be omitted, since it proved unstable in prior experiments on the same data.
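For illustration, a much simplified sketch of the clustering idea follows: embed every sample with the trained CNN, cluster the embeddings, and flag samples whose label disagrees with the majority label of their cluster. The feature extractor, cluster count, and majority-vote rule are assumptions; the actual DECODE method [4] is more elaborate.

```python
# Simplified label-noise flagging via feature clustering.
import numpy as np
from sklearn.cluster import KMeans

def flag_noisy(features, labels, n_clusters=2):
    """features: (N, D) CNN embeddings; labels: (N,) non-negative int labels.
    Returns a boolean mask of samples suspected to carry label noise."""
    cluster_ids = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(features)
    suspect = np.zeros(len(labels), dtype=bool)
    for c in range(n_clusters):
        members = cluster_ids == c
        majority = np.bincount(labels[members]).argmax()  # dominant label
        suspect |= members & (labels != majority)         # minority = suspect
    return suspect
```

The planned extension that omits the clustering could, for instance, score samples directly by their distance to per-class feature centroids, avoiding the instability observed for the clustering step.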

[1] Deitsch, Sergiu, et al. “Automatic classification of defective photovoltaic module cells in electroluminescence images.” Solar Energy 185 (2019): 455-468.
[2] Mayr, Martin, et al. “Weakly Supervised Segmentation of Cracks on Solar Cells Using Normalized Lp Norm.” 2019 IEEE International Conference on Image Processing (ICIP). IEEE, 2019.
[3] Köntges, Marc, et al. “Impact of transportation on silicon wafer‐based photovoltaic modules.” Progress in Photovoltaics: research and applications 24.8 (2016): 1085-1095.
[4] Ding, Guiguang, et al. “DECODE: Deep confidence network for robust image classification.” IEEE Transactions on Image Processing 28.8 (2019): 3752-3765.

Comparison of different text attention techniques for writer identification

Distillation Learning for Speech Enhancement

Noise suppression has remained a field of interest for more than five decades, and a number of techniques have been employed to extract clean and/or noise-free data. Continuous audio and video signals pose greater challenges when it comes to noise reduction, and deep neural network (DNN) techniques have been designed to enhance such signals (Valin, 2018). While DNNs are effective, they are computationally expensive and demand substantial memory resources. The aim of the proposed thesis is to address these constraints when working with limited memory and computational power, without compromising much on model performance.

A neural network (NN) can easily overfit the training data, owing to the large number of parameters and the many training iterations for which the network is trained on the given data (Dakwale & Monz, 2019). One solution is to use an ensemble (combination) of models trained on the same data to achieve generalization. The limitation of this solution becomes apparent when the network needs to run on hardware with limited memory and computational power, such as mobile phones. This resource limitation motivates the idea of distillation learning, in which the knowledge of a complex or ensembled network is transferred to a simpler and computationally less expensive model.

Following the framework of distillation learning, a teacher-student network will be designed, starting from an existing trained teacher network. The teacher network has been trained on audio data with hard labels, using a dense parameter matrix. The high number of parameters determines the complexity of the neural network and also its ability to identify and suppress signal noise (Hinton et al., 2015). The proposed method is to design a student network that tries to imitate the output of the teacher, i.e., its probability distribution, without being trained with the same number of parameters. By transferring the knowledge of the teacher to the student network, a simpler model with a reduced set of parameters can be designed, which is better suited for hardware with lower memory and computational power.
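A minimal sketch of the corresponding training objective, following Hinton et al. (2015), is given below; the temperature, weighting, and tensor shapes are assumptions for the example.

```python
# Soft-target distillation loss (Hinton et al., 2015): the student matches
# the teacher's temperature-softened distribution and the hard labels.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, hard_labels,
                      T=4.0, alpha=0.7):
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)  # T^2 restores gradient scale
    hard = F.cross_entropy(student_logits, hard_labels)
    return alpha * soft + (1.0 - alpha) * hard
```

The teacher's softened outputs carry information about which noise patterns the large network considers similar, which is exactly what the small student would otherwise need many parameters to rediscover.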

Motion Compensation Using Epipolar Consistency Condition in Computed Tomography

The hippocampus and the Successor Representation – An analysis of the properties of the Successor Representation, place cells, and grid cells

The human brain is a major role model for computer science. Many applications such as neural networks mimic brain functions with great success. However, many functions are still not well understood and are therefore the subject of ongoing research. The hippocampus is one of the regions of greatest interest: it is a central part of the limbic system, plays a key role in memory processing, and is used for spatial navigation. Place and grid cells are two important cell types found in the hippocampus, which help to encode information for navigational tasks [1].
New theories, however, extend this view from spatial navigation to more abstract navigation, which can be applied to all kinds of information. In the paper “The hippocampus as a predictive map”, a mathematical description of the place cells in the hippocampus, the Successor Representation (SR), is developed. The SR can be used to imitate the data processing of the hippocampus and has already been able to recreate experimental results [2]. Other experiments have also extended the view from spatial navigation to broader information processing, showing for example that grid cells do not only encode Euclidean distances [3], or that grid and place cells are used to orient within our field of vision [4]. All of this could lead to a powerful data processing tool that adapts flexibly to all kinds of problems.
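For a finite state space, the SR has a compact closed form, which a framework like the one proposed here can compute directly; the following toy sketch (transition matrix and discount chosen arbitrarily) illustrates it.

```python
# Successor Representation of a finite Markov chain:
# M = sum_t (gamma * T)^t = (I - gamma * T)^(-1), with T the transition matrix.
import numpy as np

def successor_representation(T, gamma=0.95):
    """Closed-form SR; M[s, s'] is the expected discounted number of
    future visits to s' when starting in s and following the policy."""
    return np.linalg.inv(np.eye(T.shape[0]) - gamma * T)

# Toy usage: random walk on a 3-state chain
T = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 0.5, 0.5]])
M = successor_representation(T)
```

In larger or unknown environments, the same matrix is learned incrementally by temporal-difference updates, which is the learning process whose mathematical properties this thesis will analyze.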
This thesis aims to build a framework for using and analyzing the properties of the SR. The framework should make it possible to create different environments for simple navigation tasks, but also to capture more abstract information relationships in graphs. Furthermore, mathematical properties of the SR should be analyzed in order to improve the learning process and to gain a broader understanding of its functionality.

Development of a framework to simulate learning and task solving inspired by the hippocampus and successor representation

Since nervous systems have developed very efficient mechanisms to store, retrieve, and even extrapolate from learned experience, machine learning has always taken inspiration from nature. Even though neural networks, support vector machines, and deep networks have significantly pushed performance, science is still far from completely understanding the brain’s implementation of these phenomena.

The hippocampus is a structure of the brain present in both hemispheres. It has been shown to be responsible for both spatial orientation and memory management [1, 2], but recent studies suggest it is involved in far more profound tasks of learning. This new theory assumes that the hippocampus creates abstract cognitive maps with the ability to predict unknown states, subsuming the established findings mentioned above [3]. To further investigate this behaviour and possibly add evidence to the theory, it is crucial to examine the two dominant neural cell types that have already been identified in the context of spatial orientation: so-called place cells on the one hand and grid cells on the other.

Place and grid cells were originally discovered to encode spatial information and were named accordingly. According to the theory of abstract cognitive mapping in the hippocampus, the activity of place cells is believed to represent states in general. Grid cells were originally observed firing periodically over space in different orientations, generating a kind of coordinate system. In the context of this more holistic theory, grid cells could provide a reference frame for the abstract cognitive map. Many experiments have been conducted to investigate the behaviour of the hippocampal structures related to learning.
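To make this reference-frame idea tangible, the toy sketch below (grid size, policy, and discount are arbitrary assumptions) computes the Successor Representation of a random walk on a small grid world and extracts its eigenvectors, which the predictive-map theory [3] relates to grid-cell-like firing fields.

```python
# Eigenvectors of the Successor Representation as grid-like codes [3].
import numpy as np

def grid_world_transition(n=10):
    """Random-walk transition matrix of an n x n grid (4-neighbourhood)."""
    T = np.zeros((n * n, n * n))
    for i in range(n):
        for j in range(n):
            s = i * n + j
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                if 0 <= i + di < n and 0 <= j + dj < n:
                    T[s, (i + di) * n + (j + dj)] = 1.0
            T[s] /= T[s].sum()
    return T

T = grid_world_transition()
M = np.linalg.inv(np.eye(T.shape[0]) - 0.95 * T)  # closed-form SR
vals, vecs = np.linalg.eigh((M + M.T) / 2.0)      # symmetrize before eigh
grid_code = vecs[:, -2].reshape(10, 10)           # a low-order eigenvector
# Plotted over the grid, such eigenvectors show periodic firing fields.
```

Environments in the planned framework would play the role of this grid world, so that the same analysis carries over to arbitrary state graphs.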

The aim of this thesis is to create a framework that allows researchers to simulate and work in environments that are kept so simple that the results can be transferred to any other cognitive map. Hopefully, this can help to avoid complicated experimental setups and laboratory animal experiments, and speed up future research on the hippocampus’s role in learning.

[1] D. S. Olton, J. T. Becker, and G. E. Handelmann. “Hippocampus, space, and memory”. In: Behavioral and Brain Sciences 2.3 (1979), pp. 313–322. ISSN: 0140-525X. DOI: 10.1017/S0140525X00062713.

[2] B. Milner, S. Corkin, and H.-L. Teuber. “Further analysis of the hippocampal amnesic syndrome: 14-year follow-up study of H.M.”. In: Neuropsychologia 6.3 (1968), pp. 215–234. ISSN: 0028-3932.

[3] K. L. Stachenfeld, M. M. Botvinick, and S. J. Gershman. “The hippocampus as a predictive map”. In: Nature Neuroscience 20.11 (2017), p. 1643. ISSN: 1546-1726.

Semi-Supervised Segmentation of Cell Images using Differentiable Rendering

With the recent advancements in machine learning and mainly deep learning [1], deep convolutional neural networks (CNNs) [2–7] have been developed that are able to learn from datasets containing millions of images [8] to solve object detection tasks. When trained on such big datasets, CNNs achieve task-relevant object detection performance that is comparable or even superior to the capabilities of humans [9, 10]. A key problem of using deep learning for cell detection is, in general, the large amount of data needed to train such networks. The main difficulty lies in the acquisition of a representative dataset of cell images, which should ideally contain various sizes, shapes, and distributions for a variety of cell types. Additionally, manual annotation of the acquired data is mandatory to obtain the so-called ‘ground truth’ or ‘labels’, which is in general error-prone, time-consuming, and costly.

Differentiable rendering [11–13], on the other hand, is an emerging technique that makes it possible to generate synthetic, photo-realistic images from photographs of real-world objects by estimating their 3D shape and material properties. While this approach can be used to generate photo-realistic images, it can also be applied to generate the corresponding ground truth labels for segmentation and object detection masks. Combining differentiable rendering with deep learning could potentially solve the data bottleneck for machine learning algorithms in various fields, including materials science and biomedical engineering.
The work of this thesis is based on the differentiable rendering framework ‘Redner’ [11], using data from the Cell Tracking Challenge [14, 15]. In a first step, a literature review on the topic of differentiable rendering will be conducted. In a second step, an existing implementation for the light, shader, and geometry estimation of nanoparticles will be adapted for the semi-supervised segmentation of GFP-GOWT1 mouse stem cells. Afterwards, the results of this approach will be evaluated in terms of segmentation accuracy.
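Since the thesis builds on gradient-based inverse rendering, the following conceptual sketch shows the core mechanism in two dimensions; it deliberately avoids the actual Redner API, and every quantity in it is a toy assumption.

```python
# Conceptual inverse rendering: a differentiable forward model renders a
# soft disc, and its geometry is recovered from a target image by gradient
# descent; a segmentation mask of the fitted shape then comes for free.
import torch

H = W = 64
ys, xs = torch.meshgrid(torch.arange(H).float(), torch.arange(W).float(),
                        indexing="ij")

def render(cx, cy, r):
    """Soft disc with centre (cx, cy) and radius r; the smooth edge keeps
    gradients with respect to the geometry non-zero."""
    d = torch.sqrt((xs - cx) ** 2 + (ys - cy) ** 2)
    return torch.sigmoid((r - d) / 3.0)

target = render(torch.tensor(40.0), torch.tensor(24.0), torch.tensor(10.0))
params = torch.tensor([32.0, 32.0, 5.0], requires_grad=True)  # initial guess
opt = torch.optim.Adam([params], lr=0.5)

for _ in range(300):
    opt.zero_grad()
    loss = ((render(*params) - target) ** 2).mean()  # image-space loss
    loss.backward()                                   # gradients w.r.t. geometry
    opt.step()
# params converges towards (40, 24, 10).
```

Redner [11] provides the same gradient flow for full 3D scenes with light and shader parameters, which is what the existing nanoparticle implementation estimates and what will be adapted to cell images here.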
The thesis will include the following points:
• Getting familiar with the concepts of Differentiable Rendering and Gradient-based learning methods
• Implementation of a proof-of-concept for the semi-supervised segmentation of cells based on the ‘Redner’
framework using existing data from the Cell Tracking Challenge
• Evaluation of the method in terms of segmentation accuracy
• Elaboration of potential improvements for the method
Academic advisors:

References
[1] Y. Lecun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, 2015.
[2] K. He, G. Gkioxari, P. Dollár, and R. Girshick, “Mask R-CNN,” Proceedings of the IEEE International Conference on Computer Vision, pp. 2980–2988, 2017.
[3] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv
preprint arXiv:1409.1556, 2014.
[4] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE
conference on computer vision and pattern recognition, pp. 770–778, 2016.
[5] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look once: Unified, real-time object detection,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788, 2016.
[6] O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,”
in International Conference on Medical image computing and computer-assisted intervention, pp. 234–241,
Springer, 2015.
[7] T. Falk, D. Mai, R. Bensch, Ö. Çiçek, A. Abdulkadir, Y. Marrakchi, A. Böhm, J. Deubner, Z. Jäckel, K. Seiwald, A. Dovzhenko, O. Tietz, C. Dal Bosco, S. Walsh, D. Saltukoglu, T. L. Tay, M. Prinz, K. Palme, M. Simons, I. Diester, T. Brox, and O. Ronneberger, “U-Net: deep learning for cell counting, detection, and morphometry,” Nature Methods, vol. 16, no. 1, pp. 67–70, 2019.
[8] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “ImageNet: A large-scale hierarchical image database,” 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255, 2009.
[9] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. Van Den Driessche, J. Schrittwieser, I. Antonoglou,
V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap,
M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis, “Mastering the game of Go with deep neural networks
and tree search,” Nature, vol. 529, no. 7587, pp. 484–489, 2016.
[10] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,”
in Advances in neural information processing systems, pp. 1097–1105, 2012.
[11] T.-M. Li, M. Aittala, F. Durand, and J. Lehtinen, “Differentiable monte carlo ray tracing through edge sampling,”
ACM Trans. Graph., vol. 37, Dec. 2018.
[12] M. Nimier-David, D. Vicini, T. Zeltner, and W. Jakob, “Mitsuba 2: a retargetable forward and inverse renderer,”
ACM Transactions on Graphics (TOG), vol. 38, no. 6, p. 203, 2019.
[13] G. Loubet, N. Holzschuch, and W. Jakob, “Reparameterizing discontinuous integrands for differentiable rendering,” ACM Transactions on Graphics (TOG), vol. 38, no. 6, pp. 1–14, 2019.
[14] M. Maška, V. Ulman, D. Svoboda, P. Matula, P. Matula, C. Ederra, A. Urbiola, T. España, S. Venkatesan, D. M. Balak, et al., “A benchmark for comparison of cell tracking algorithms,” Bioinformatics, vol. 30, no. 11, pp. 1609–1617, 2014.
[15] V. Ulman, M. Maška, K. E. Magnusson, O. Ronneberger, C. Haubold, N. Harder, P. Matula, P. Matula, D. Svoboda, M. Radojevic, et al., “An objective comparison of cell-tracking algorithms,” Nature Methods, vol. 14, no. 12, p. 1141, 2017.