Introduction
This proposal describes the use of binary masks as ground truth for
denoising in deep learning based killer whale classification. It is part of a
project of the Pattern Recognition Lab of the FAU in cooperation with the
Vancouver Maritime Institute. Based on thousands of images of killer whale
populations taken over the last years, a deep learning approach was used
to ease the classification of individual animals for local researchers, both
visually and through call recognition [2]. Previous work focused on the
extraction of regions of interest from the original images and the
classification of single animals. To limit the influence of noise on the
classification, this thesis aims to create binary masks of the animals by
means of image segmentation. Binary masks often provide an accurate ground
truth for deep learning approaches. The work is therefore closely related
to [4] and forms part of the visual counterpart of the existing
“Denoising Kit” for audio signals of killer whales.
Motivation
Noise plays a crucial role in the detection and reconstruction of images.
In this case, closely matching colors and partially blurry images throughout
the extracted data limit the success of deep learning based classification.
With a binary mask of the orca body as ground truth, a network can be trained
without the influence of noise. This can further increase the accuracy of orca
detection, helping researchers to track animal populations much more easily.
Approach
Two approaches have proven to be the most efficient and will be used.
First, edges in the orca images are detected with a common method, the
popular Canny edge detector. Second, the images are processed by a
superpixel segmentation algorithm [5]. By overlaying both results, an
accurate outline of the animal's shape can be segmented. After
binarization, the resulting mask will be used as ground truth for a deep
learning network. With it, the original images are denoised to allow for a
better classification later.
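The overlay and binarization step can be illustrated with a minimal sketch. The proposal does not specify how the two results are combined, so the rule below is an assumption: the region enclosed by the detected edges is found by a flood fill from the image border, and a superpixel counts as foreground if more than half of its pixels lie in that region. The function names, the `min_overlap` parameter, and the tiny hand-made inputs are hypothetical. In practice, the edge map and the label map could come from a library such as scikit-image (`skimage.feature.canny` and `skimage.segmentation.felzenszwalb`), which the proposal does not name.

```python
from collections import deque


def region_enclosed_by_edges(edges):
    """Return a grid that is True for every pixel that cannot be reached
    from the image border without crossing an edge pixel. Edge pixels
    themselves count as enclosed."""
    h, w = len(edges), len(edges[0])
    outside = [[False] * w for _ in range(h)]
    queue = deque()
    # Seed the flood fill with every non-edge pixel on the image border.
    for r in range(h):
        for c in range(w):
            on_border = r in (0, h - 1) or c in (0, w - 1)
            if on_border and not edges[r][c]:
                outside[r][c] = True
                queue.append((r, c))
    # 4-connected flood fill through non-edge pixels.
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < h and 0 <= nc < w
                    and not edges[nr][nc] and not outside[nr][nc]):
                outside[nr][nc] = True
                queue.append((nr, nc))
    return [[not outside[r][c] for c in range(w)] for r in range(h)]


def mask_from_edges_and_superpixels(edges, labels, min_overlap=0.5):
    """Assumed combination rule: a superpixel is foreground if more than
    `min_overlap` of its pixels fall inside the edge-enclosed region."""
    inside = region_enclosed_by_edges(edges)
    total, hits = {}, {}
    for row_labels, row_inside in zip(labels, inside):
        for label, is_inside in zip(row_labels, row_inside):
            total[label] = total.get(label, 0) + 1
            hits[label] = hits.get(label, 0) + int(is_inside)
    foreground = {l for l in total if hits[l] / total[l] > min_overlap}
    # Binarize: 1 for foreground superpixels, 0 for everything else.
    return [[1 if label in foreground else 0 for label in row]
            for row in labels]


# Hand-made 6x6 example: a closed rectangular edge contour ...
edges = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
# ... and a superpixel label map (0 and 3 background, 1 and 2 body).
labels = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 2, 2, 0],
    [0, 1, 1, 2, 2, 0],
    [0, 1, 1, 2, 2, 0],
    [0, 1, 1, 2, 2, 3],
    [0, 0, 0, 0, 3, 3],
]
mask = mask_from_edges_and_superpixels(edges, labels)
for row in mask:
    print("".join(str(v) for v in row))
```

Plain nested lists keep the sketch dependency-free; for real images, array types would be the natural choice. The flood fill relies on a closed contour, so gaps in the detected edges would have to be handled before this step.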
Finally, this thesis will look into intelligent data augmentation in the form
of image morphing techniques that make use of the created binary masks. With
feature-based image morphing [1], the variety of the training data, and
therefore also the accuracy of the underlying classifier, could be further
improved.
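As an illustration of feature-based morphing [1], the following minimal sketch implements the coordinate mapping for a single pair of corresponding feature lines: a pixel in the destination image is expressed by its position along and its distance from the destination line, and the same values are then applied to the source line. The function names and all coordinates are made up for this example. How feature lines would be derived from the binary masks is not specified here, and the full method in [1] additionally blends the mappings of several line pairs with distance-based weights, which is omitted.

```python
import math


def perpendicular(v):
    """Rotate a 2D vector by 90 degrees (same length)."""
    return (-v[1], v[0])


def dot(a, b):
    """Dot product of two 2D vectors."""
    return a[0] * b[0] + a[1] * b[1]


def map_point(x, p, q, p_src, q_src):
    """Map destination pixel `x` to source image coordinates.

    (p, q) is the feature line in the destination image and
    (p_src, q_src) the corresponding line in the source image.
    """
    d = (q[0] - p[0], q[1] - p[1])                  # destination line
    d_src = (q_src[0] - p_src[0], q_src[1] - p_src[1])  # source line
    xp = (x[0] - p[0], x[1] - p[1])

    length_sq = dot(d, d)
    # u: normalized position along the line (0 at p, 1 at q).
    u = dot(xp, d) / length_sq
    # v: signed perpendicular distance from the line, in pixels.
    v = dot(xp, perpendicular(d)) / math.sqrt(length_sq)

    perp_src = perpendicular(d_src)
    length_src = math.hypot(d_src[0], d_src[1])
    return (
        p_src[0] + u * d_src[0] + v * perp_src[0] / length_src,
        p_src[1] + u * d_src[1] + v * perp_src[1] / length_src,
    )


# The source line is twice as long as the destination line: positions
# along the line are stretched, the distance from the line is kept.
print(map_point((5, 3), (0, 0), (10, 0), (0, 0), (20, 0)))

# The source line is rotated by 90 degrees: the point rotates with it.
print(map_point((5, 3), (0, 0), (10, 0), (0, 0), (0, 10)))
```

To warp a whole image, this mapping would be evaluated for every destination pixel and the source image sampled at the returned coordinates.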
Medical application
Ground truth binary masks can be, and in some areas already are, applied
to computer vision tasks in the medical field. Deep learning classification
of tumors in CT and MRI images is often based on binary masks traced by
radiologists [3]. Similar issues regarding noise are often faced there.
References
[1] Thaddeus Beier and Shawn Neely. Feature-based image metamorphosis.
ACM SIGGRAPH Computer Graphics, 26(2):35-42, 1992.
[2] Christian Bergler, Manuel Schmitt, Rachael Xi Cheng, Andreas K Maier,
Volker Barth, and Elmar Nöth. Deep learning for orca call type
identification - a fully unsupervised approach. In INTERSPEECH, pages
3357-3361, 2019.
[3] Francisco Javier Díaz-Pernas, Mario Martínez-Zarzuela, Míriam
Antón-Rodríguez, and David González-Ortega. A deep learning approach for
brain tumor classification and segmentation using a multiscale convolutional
neural network. In Healthcare, volume 9, page 153. Multidisciplinary
Digital Publishing Institute, 2021.
[4] Christian Bergler et al. Orca-clean: A deep denoising toolkit for killer
whale communication. INTERSPEECH, 2020.
[5] Pedro F Felzenszwalb and Daniel P Huttenlocher. Efficient graph-based
image segmentation. International Journal of Computer Vision,
59(2):167-181, 2004.