The classification of images is a well known task in computer vision. However, transparent or semi-
transparent objects have several properties that can make computer vision tasks harder. Those objects
usually have less textures and sometimes strong reflections. Occasionally, different backgrounds make
it hard to recognize edges or the shape of an object. [1, 2]
To overcome these difficulties we use polarization cameras in this work. In contrast to ordinary cameras,
polarization cameras additionally record information about the polarization of the light rays. Most
natural light sources emit unpolarized light. By using a light source that emits polarized light, it is
possible to remove reflections or increase the contrast. Further it is known that the Angle of Linear
Polarization (AoLP) provides information about the normal of a surface [3].
In this work, we will use the Deep Learning approach and use Convolutional Neural Networks (CNNs)
to explore the following topics:
1. Comparison of different sorts of preprocessing:
• Using only raw data / reshaped raw data
• Using extra features Degree of Linear Polarization (DoLP) and AoLP
2. Influence of different light sources.
3. Comparison of different defect classes.
To evaluate the results we use different error metrics such as accuracy and f1, as well as gradient class
activation maps (GradCAM) [4].
The implementation should be done in Python.
References
[1] Agastya Kalra, Vage Taamazyan, Supreeth Krishna Rao, Kartik Venkataraman, Ramesh Raskar, and Achuta
Kadambi. Deep Polarization Cues for Transparent Object Segmentation. In 2020 IEEE/CVF Conference on
Computer Vision and Pattern Recognition (CVPR), pages 8599–8608, Seattle, WA, USA, June 2020. IEEE.
[2] Ilya Lysenkov, Victor Eruhimov, and Gary Bradski. Recognition and Pose Estimation of Rigid Transparent
Objects with a Kinect Sensor. page 8, 2013.
[3] Francelino Freitas Carvalho, Carlos Augusto de Moraes Cruz, Greicy Costa Marques, and Kayque Martins
Cruz Damasceno. Angular Light, Polarization and Stokes Parameters Information in a Hybrid Image Sensor
with Division of Focal Plane. Sensors, 20(12):3391, June 2020.[4] Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, and
Dhruv Batra. Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization.
International Journal of Computer Vision, 128(2):336–359, February 2020.