In the field of natural language processing, transformer networks, which dispense with recurrent
architectures by relying on a scaled dot-product attention mechanism [1], have become the state of the
art for many tasks. Owing to this success, transformers have also been applied in other fields of
research such as music generation and computer vision [2, 3].
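As a minimal illustration, scaled dot-product attention can be sketched in a few lines of NumPy; this is a toy, single-head sketch for exposition, not the implementation used in this work. The row-normalized weight matrix it returns is the attention map referred to below:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V for a single attention head."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # (n_q, n_k) similarity matrix
    scores -= scores.max(axis=-1, keepdims=True)     # shift for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # attention map: each row sums to 1
    return weights @ V, weights

# toy example: 3 queries attending over 4 keys/values of dimension 8
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 8)), rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
out, attn = scaled_dot_product_attention(Q, K, V)
print(out.shape, attn.shape)  # (3, 8) (3, 4)
```

Because each row of the weight matrix is a probability distribution over the input positions, it can be rendered directly as a heat map, which is the basis for the interpretability argument made below.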
For electrocardiogram (ECG) classification, convolutional neural networks (CNNs) and recurrent
neural networks (RNNs) are still widely used. Combining a CNN feature extractor with
transformer encoders instead of an RNN has recently been shown to be potentially competitive with
existing architectures [4]. Since transformer layers rely on attention maps that can be visualized
easily, they could help to improve the interpretability of the decisions made by a deep learning
model, which is particularly important in medical and health care applications.
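The CNN-plus-transformer-encoder combination described above can be sketched, for example, in PyTorch. All layer sizes, strides, and the class count below are illustrative assumptions, not the configuration of any published reference implementation:

```python
import torch
import torch.nn as nn

class CnnTransformerECG(nn.Module):
    """Hypothetical sketch: CNN front end + transformer encoder for 12-lead ECG."""

    def __init__(self, n_classes=5, d_model=64):
        super().__init__()
        # 1-D convolutions downsample the 12-lead signal into a token sequence
        self.cnn = nn.Sequential(
            nn.Conv1d(12, d_model, kernel_size=15, stride=4, padding=7), nn.ReLU(),
            nn.Conv1d(d_model, d_model, kernel_size=15, stride=4, padding=7), nn.ReLU(),
        )
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x):                        # x: (batch, 12, time)
        tokens = self.cnn(x).transpose(1, 2)     # (batch, steps, d_model)
        tokens = self.encoder(tokens)            # self-attention across time steps
        return self.head(tokens.mean(dim=1))     # pooled multi-label logits

model = CnnTransformerECG()
logits = model(torch.randn(2, 12, 1000))         # two ECGs of 1000 samples each
print(logits.shape)  # torch.Size([2, 5])
```

The CNN shortens the sequence so that self-attention, which is quadratic in sequence length, operates on a few dozen tokens rather than thousands of raw samples.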
In image classification, a recent work suggests that transformers could even replace convolutions
and outperform deep residual models [3]. The goal of this work is therefore to develop an algorithm
for 12-lead ECG classification that uses transformer encoder layers as a crucial part of the feature
extractor, and to evaluate its performance, in particular with respect to different types of cardiac
abnormalities. Furthermore, it is to be investigated whether the model learns to compute
human-comprehensible attention maps.
The work consists of the following parts:
• Literature research on existing deep learning models for ECG signal classification and arrhythmia detection
• Adaptation of a transformer architecture for 12-lead ECG classification
• Training and evaluation of the model on PTB-XL [5] and ICBEB challenge 2018 [6] data
• Comparison based on the ROC-AUC score with a transformer-based reference implementation [4]
and with existing models that were benchmarked on PTB-XL [7]
• Assessment of advantages and disadvantages in the classification of different types of cardiac
abnormalities, in particular at the morphological and rhythm level, and visualization of attention
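The macro-averaged ROC-AUC used for the comparison above can be computed, for example, with scikit-learn; the labels and scores below are made-up toy values for a multi-label setting:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# hypothetical multi-label setup: 4 records, 3 abnormality classes
y_true = np.array([[1, 0, 0],
                   [1, 1, 0],
                   [0, 1, 1],
                   [0, 0, 1]])
y_score = np.array([[0.9, 0.2, 0.1],
                    [0.8, 0.7, 0.3],
                    [0.2, 0.6, 0.8],
                    [0.1, 0.3, 0.9]])

# macro average: ROC-AUC computed per class, then averaged with equal weight
per_class = roc_auc_score(y_true, y_score, average=None)
macro_auc = roc_auc_score(y_true, y_score, average="macro")
print(per_class, macro_auc)
```

The macro average weights every class equally, so performance on rare abnormalities counts as much as performance on common ones, which matters for the per-abnormality assessment planned here.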
[1] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez,
Lukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Advances in Neural Information
Processing Systems, 2017.
[2] Cheng-Zhi Anna Huang, Ashish Vaswani, Jakob Uszkoreit, Noam Shazeer, Ian Simon, Curtis
Hawthorne, Andrew M. Dai, Matthew D. Hoffman, Monica Dinculescu, and Douglas Eck. Music
Transformer: Generating music with long-term structure. In International Conference on Learning
Representations, 2019.
[3] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai,
Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly,
Jakob Uszkoreit, and Neil Houlsby. An image is worth 16×16 words: Transformers for image
recognition at scale. In International Conference on Learning Representations, 2021.
[4] Annamalai Natarajan, Yale Chang, Sara Mariani, Asif Rahman, Gregory Boverman, Shruti
Vij, and Jonathan Rubin. A wide and deep transformer neural network for 12-lead ECG
classification. In 2020 Computing in Cardiology Conference (CinC). Computing in Cardiology, 2020.
[5] Patrick Wagner, Nils Strodthoff, Ralf-Dieter Bousseljot, Dieter Kreiseler, Fatima I. Lunze,
Wojciech Samek, and Tobias Schaeffter. PTB-XL, a large publicly available electrocardiography
dataset. Scientific Data, 7(1):154, 2020.
[6] Feifei Liu, Chengyu Liu, Lina Zhao, Xiangyu Zhang, Xiaoling Wu, Xiaoyan Xu, Yulin Liu,
Caiyun Ma, Shoushui Wei, Zhiqiang He, Jianqing Li, and Eddie Ng Yin Kwee. An open
access database for evaluating the algorithms of electrocardiogram rhythm and morphology
abnormality detection. Journal of Medical Imaging and Health Informatics, 8(7):1368–1373, 2018.
[7] Nils Strodthoff, Patrick Wagner, Tobias Schaeffter, and Wojciech Samek. Deep learning for
ECG analysis: Benchmarks and insights from PTB-XL. IEEE Journal of Biomedical and
Health Informatics, 25(5):1519–1528, 2021.