Index

Continuous Non-Invasive Blood Pressure Measurement Using 60GHz-Radar – A Feasibility Study

Hypertension – high blood pressure (BP) – is known to be a silent killer. Untreated, it can cause
severe damage to the human’s organs, mainly to the heart and kidneys [5, 6]. BP is
usually classified by using the highest – systolic – and the lowest – diastolic – pressures during one
cardiac cycle [2]. The gold standard for measuring BP remains the oscillometric method,
which is employed in traditional arm-cuffs [4]. This method, however, suffers from extensive
deficiencies: Discomfort leads to unreliable measures [2]. Additionally, it only captures
the static status of the very dynamic arterial BP and thus loses important variation information,
leading to poor time resolution [2, 3, 4, 7] However, there is a strong
need for continuous beat-to-beat BP readings [4], as they are more reliable predictors of
aforementioned cardiovascular risks than single readings [1].
The goal of this master thesis is to show whether it is feasible to use a 60GHz radar device to
continuously estimate BP. Radar is chosen as it has a very small form factor and very low power
consumption – both being favorable characteristics for integrating into a wearable device. The
radar is put into an 3D-printed enclosure which is fastened to the left wrist using a velcro strap. It
is capable of extracting the skin displacement caused by the expansion of the underlying artery,
which is localized using a beamforming algorithm. The extracted skin displacement contains the
pulse waveforms which are used for extracting the BP.
In literature, mainly two methods have been used to design continuous BP devices. One is based
on Pulse-Wave-Velocity, and in that context also Pulse-Transit-Time, the other is based on Pulse-
Wave-Analysis [4]. Since the first method depends on the usage of an electrocardiograph,
this method was not employed in this work, as the goal is to implement a stand-alone solution
which does not require additional devices. Therefore, the second method is implemented.
For that, the extracted skin displacement is split into individual pulse waveforms. Each is used
as input for a support vector machine, that decides whether it is good enough as an input for the
neural network, such that only sufficiently good waveforms are used. Then, 21 distinctive features
are extracted for the individual good waveforms. These features, together with the calibration
parameters gender, age, height and weight, are used as features for a neural network. The network
is then used to predict systolic and diastolic values.
It is expected that some correlation between the skin displacement, captured by the radar, and
the corresponding BP will become apparent, allowing for future research to further improve the
accuracy.

References

[1] D. Buxi, J.-M. Redout´e, and M. R. Yuce. Blood pressure estimation using pulse
transit time from bioimpedance and continuous wave radar. IEEE Transactions on
Biomedical Engineering, 64(4):917–927, 2016.
[2] Y. Kurylyak, F. Lamonaca, and D. Grimaldi. A neural network-based method for
continuous blood pressure estimation from a ppg signal. In 2013 IEEE International
instrumentation and measurement technology conference (I2MTC).
IEEE, 2013.
[3] M. Proenc¸a, G. Bonnier, D. Ferrario, C. Verjus, and M. Lemay. Ppg-based blood
pressure monitoring by pulse wave analysis: calibration parameters are stable for three
months. In 2019 41st Annual International Conference of the IEEE Engineering in
Medicine and Biology Society (EMBC), pages 5560–5563. IEEE, 2019.
[4] J. Sol`a and R. Delgado-Gonzalo. The handbook of cuffless blood pressure monitoring.
Springer. Available online at: https://link. springer. com/book/10, 1007:978–3, 2019.
[5] WHO. Hypertension.World Health Organization. URL: https://www.who.int/
news-room/fact-sheets/detail/hypertension.
[6] X. Xing, Z. Ma, M. Zhang, Y. Zhou, W. Dong, and M. Song. An unobtrusive and
calibration-free blood pressure estimation method using photoplethysmography and
biometrics. Scientific reports, 9(1):1–8, 2019.
[7] Y. Yoon, J. H. Cho, and G. Yoon. Non-constrained blood pressure monitoring using
ecg and ppg for personal healthcare. Journal of medical systems, 33(4):261–266, 2009.

Detection of positions of K-Wires Tips in X-Ray Images using Deep learning

Ground Truth based Convolution Kernel Initialization Method for Medical Image Segmentation

Thesis Description

Proper initialization of convolution kernel is crucial for a highly optimized deep learning neural network [1]. A popular way to instantiate these kernels is random assignment of weights [2]. It follows a gaussian distribution pattern with a mean value of 0 and standard deviation of 1. Despite being easy to implement in a neural network it has quite a few downsides like not finding the global optima or slowing the training process down. As a further improvement to random assignment Xavier Glorot et al. proposed “Xavier initialization value (Xe)” [3] for convolution kernels. This method follows an uniform distribution with a 0 mean and a variance of 1/n where n is the total number of input neurons. Although, training process is faster with increased convergence speed, the derivation process of Xe initialization is based on the assumption that the activation function is linear, which is not the case for popular activation functions such as Rectified Linear Unit (Relu). To mitigate this issue Kaiming He et al. proposed He initialization [4] targeted more toward Relu activation function. He uses a gaussian uniform distribution of 0 mean and a variance of 2/n. All of the above initialization techniques for convolution kernels are based on independent initialization of kernel weights, not taking into account the already available data of training samples. The kernel weights are trained in such a way that these randomly generated values are tried to be matched against the local pattern of the images. In every iteration, the trainer tries to minimize the error between the kernel weights and the local features, which leads further to convergence. As this is a probability event, so it takes quite a lot of iteration after which the convolution kernels can have better match with the local features. This translates into slow down of network, with a larger training time and longer convergence rate. Different methods for initializing the convolution kernel have taken this issue into account. OrthoNorm is another method that uses orthogonal matrix for kernel initialization. It can successfully be used in non-linear networks as well unlike random assignment [5]. There is also “Layer sequence unit variance (LSUV)” method which takes the orthogonal initialization to the iterative process. It uses singular value decomposition SVD to replace the weights initiated with gaussian noise [6]. In 2014 Tsung-HanChan et al. proposed a Principal Component Analysis (PCA) based method for convolution kernel initialization [7]. The model gets all image patches from a feature map and initializes
the convolution kernel by calculating the principal components of image patches. This thesis aims to further improve the PCA based kernel initialization method by incorporating
ground truth GT images. GT images are already labeled and can be used to find suitable feature sets. Leveraging the dominant features from these sets and using them as convolution kernel weights, a dependency between training images and convolution kernels is created. It could theoretically decrease the training time and improve overall convergence rate [1]. Extensive benchmarking of the proposed initialization method along with other quantitative measures needs to be taken into account
while developing the system which is also included in the scope of this thesis. To achieve the goals of the thesis work, already existing tools and libraries such as, Pytorch Lightning(
www.pytorchlightning.ai), Monai(monai.io),Weights and Biases(wandb.ai), python(www.python.org) and notable python scientific packages shall be used and re-used where possible.

The thesis will comprise the following work items:

Literature overview of improved convolution kernel initialization method
Design and formalization of the system to be developed
Overview and explanation of the algorithms used
System development including code implementation
Quantitative evaluation of the implemented system on medical image data

References

[1] Chunyu Xu and Hong Wang. Research on a convolution kernel initialization method for speeding
up the convergence of cnn. Applied Sciences, 12:633, 01 2022.
[2] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. Imagenet classification with deep convolutional
neural networks. In F. Pereira, C.J. Burges, L. Bottou, and K.Q. Weinberger, editors,
Advances in Neural Information Processing Systems, volume 25. Curran Associates, Inc., 2012.
[3] Xavier Glorot and Yoshua Bengio. Understanding the difficulty of training deep feedforward
neural networks. In Yee Whye Teh and D. Mike Titterington, editors, AISTATS, volume 9 of
JMLR Proceedings, pages 249–256. JMLR.org, 2010.
[4] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Delving deep into rectifiers: Surpassing
human-level performance on imagenet classification, 2015.
[5] Andrew M. Saxe, James L. McClelland, and Surya Ganguli. Exact solutions to the nonlinear
dynamics of learning in deep linear neural networks, 2013.
[6] Dmytro Mishkin and Jiri Matas. All you need is a good init, 2015.
[7] Tsung-Han Chan, Kui Jia, Shenghua Gao, Jiwen Lu, Zinan Zeng, and Yi Ma. PCANet: A
simple deep learning baseline for image classification? IEEE Transactions on Image Processing,
24(12):5017–5032, dec 2015.

reAction: Automatic Speech Recognition in German Automotive Domain

Deep Learning for Cancer Patient Survival Prediction Using 2D Portrait Photos Based on StyleGAN Embedding

Risk Classification of Brain Metastases via Deep Learning Radiomics

Novel View Synthesis for Augmentation of Fine-Grained Image Datasets

Current deep-learning-based classification methods require large amounts of data for training, and in certain scenarios such as in the surveillance imaging there is only a limited amount of data. The aim of the research is to generate new training images of vehicles with the same characteristics as the training data but from novel view points and investigate its suitability for fine-grained classification of vehicles.

Generative models such as generative adversarial networks (GANs) [1] allow for customization of images. However, adjusting the perspective through methods such as conditional GANs for unsupervised image-to-image translation has proven to be particularly difficult [1]. Methods such as StyleGANs [2] or neural radiance fields (NeRFs) [3] are relevant approaches to generate images with different styles and perspectives.
StyleGAN is an extension to the GAN architecture that proposes changes to the generator model such as the introduction of a mapping network. The mapping network generates intermediate latent codes which are transformed into styles that is integrated at each point in the generator network. It also includes a progressive growing approach for training generator models capable of synthesizing very large high-quality images.
NeRF can generate novel views of complex 3D scenes based on a partial set of 2D images. It is trained to directly map from spatial location and viewing direction (5D input) to opacity and color, using volume rendering [4] to render new views.

The thesis consists of the following milestones:

Literature review on the state-of-the-art approaches for GAN- and neural radiance fields-based
image synthesis
Adoption of existing GAN- and neural radiance fields-based image synthesis methods to generate
car images using different styles and camera poses [5]
Experimental evaluation and comparison of different image synthesis methods
Investigate the suitability of the generated images for fine-grained vehicle classification using
different classification methods [6], [7]

The implementation will be done in Python.

References
[1] Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio, “Generative Adversarial Networks ”, in NIPS, 2014
[2] Tero Karras, Samuli Laine, Timo Aila, “A Style-Based Generator Architecture for Generative Adversarial Networks ”, in proceedings of the IEEE/CVF Conference on CVPR, 2019
[3] Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng, “NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis ”, in ECCV 2020
[4] Robert A. Drebin, Loren Carpenter, Pat Hanrahan, “Volume Rendering ”, in Proceedings of SIGGRAPH 1988
[5] Jiatao Gu, Lingjie Liu, Peng Wang, Christian Theobalt, “StyleNeRF: A Style-based 3D-Aware Generator for High-resolution Image Synthesis ”, in ICLR 2022
[6] Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, Baining Guo, “Swin Transformer: Hierarchical Vision Transformer using Shifted Windows ”, in IEEE/CVF conference on ICCV, 2021
[7] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun, “Deep residual learning for image recognition ”, Proceedings of the IEEE conference on CVPR, 2016

Modelling of the breast during the mammography examination

Metal-conscious Transformer Enhanced CBCT Projection Inpainting

Computed tomography device (CT) is a means of tomographic imaging technology, and it has been developed
rapidly. Due to beam hardening effect, metallic artifacts occur and degrade the quality of CT images. Metal
artifacts have been the focus and difficulty in the field of CT imaging research because of their direct impact on
clinical diagnosis and the diversity of manifestations and causes [1]. In order to reconstruct metal-free CT
images, the inpainting task is an essential part.
The traditional method of inpainting replaces the metal-affected region of the projected data by interpolation
[2][3]. Recently, deep convolutional networks (CNNs) have shown strong potential in all computer vision tasks,
including image inpainting. Several approaches have been proposed for image restoration using CNN based
encoder-decoder network. Shift-Net based on U-Net architecture is one of these approaches, which has good
restoration accuracy in structure and texture [4]. Zeng et al. [5] built a pyramidal-context architecture called
PEN-NET for high-quality image inpainting. Liao et al. [6] proposed a new generative mask pyramid network
to reduce for CT/CBCT Metal Artifact Reduction. Although CNNs have many advantages, their field of
perception is usually small and not conducive to capturing global features. On the contrary, Vision Transformer
(ViT) uses attention to model long-term dependencies among image patches. The shifted window Transformer
(Swin Transformer) is proposed to adapt to the high resolution of images in vision tasks [8], taking into account
the translational invariance of CNNs, the perceptual field and the hierarchical relationship.
To overcome the shortage of medical image data and the domain shift problem in the field of deep learning, this
research is based on simulated X-ray images using ViT as the encoder and CNN as the decoder for image
inpainting. In order to further improve the inpainting performance, some variants of the backbone network are
considered, such as using Swin Transformer instead of ViT and adding the adversarial loss.
The paper will include the following points:
• Literature review in inpainting and metal artifacts reduction.
• Traditional method and CNN based model implementation.
• ViT-based model construction; parameter optimization and incorporation with adversarial loss; results
evaluation.
• Thesis writing.

References
[1] Netto, C., Mansur, N., Tazegul, T., Lalevee, M., Lee, H., Behrens, A., Lintz, F., Godoy-Santos, A., Dibbern,
K., Anderson, D. Implant Related Artifact Around Metallic and Bio-Integrative Screws: A CT Scan 3D
Hounsfield Unit Assessment. Foot & Ankle Orthopaedics. 7, 2473011421S00174 (2022)
[2] Kalender WA, Hebel R, Ebersberger J. Reduction of CT artifacts caused by metallic implants. Radiology.
1987 Aug;164(2):576-7. doi: 10.1148/radiology.164.2.3602406. PMID: 3602406.
[3] Meyer E, Raupach R, Lell M, Schmidt B, Kachelriess M. Normalized metal artifact reduction (NMAR) in
computed tomography. Med Phys. 2010 Oct;37(10):5482-93. doi: 10.1118/1.3484090. PMID: 21089784.
[4] Zhaoyi Yan, Xiaoming Li, Mu Li, Wangmeng Zuo, and Shiguang Shan. Shift-net: Image inpainting via
deep feature rearrangement, 2018.
[5] Yanhong Zeng, Jianlong Fu, Hongyang Chao, and Baining Guo. Learning pyramid-context encoder
network for high-quality image inpainting, 2019.
[6] Haofu Liao, Wei-An Lin, Zhimin Huo, Levon Vogelsang, William J Sehnert, S Kevin Zhou, and Jiebo Luo.
Generative mask pyramid network for ct/cbct metal artifact reduction with joint projection-sinogram
correction. In International Conference on Medical Image Computing and Computer-Assisted Intervention,
pages 77–85. Springer, 2019.
[7] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas
Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, et al. An image is worth
16×16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929, 2020.
[8] Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. Swin
transformer: Hierarchical vision transformer using shifted windows, 2021.