Deep Learning for Geo-Referencing Historical Utility Documents With Geographical Features

Type: MA thesis

Status: running

Date: September 1, 2024 - February 28, 2025

Supervisors: Andreas Maier, Adithya Ramachandran, Siming Bayer

Abstract:

The digitization of industries has spurred significant advancements across sectors, including utilities responsible for essential services such as heating and water supply. Because many utility systems were built before the digital era, a digital representation offers considerable potential for optimizing them. Accurate mapping of their extensive underground pipeline networks is key to improving operational efficiency. However, this digitization is challenging: because the infrastructure is buried underground, its geographic information must be extracted from historical planning documents, which is difficult.

In this work, we propose a two-stage deep-learning framework to extract geographic information from historical utility planning records and facilitate the digital representation of utility networks. In the first stage, we frame geo-referencing as a geo-location classification task and train a Convolutional Neural Network (CNN) to classify OpenStreetMap images into the specific geographic regions covered by the utility network. In the second stage, we address the scarcity of annotated data by applying a style-transfer technique to historical documents that contain geographic features, converting them into a format similar to OpenStreetMap images so that the trained CNN can classify them as well. We will evaluate the method on real-world utility data.
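To illustrate what framing geo-referencing as classification can mean in practice, the following minimal sketch divides a service area into a regular grid and treats each grid cell as one class. The abstract does not say how the geographic regions are defined, so the regular grid, the names `BoundingBox`, `region_label`, and `region_bounds`, and all coordinates are assumptions made for illustration only. The CNN and the style-transfer step are not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Geographic extent in degrees (hypothetical helper type)."""
    south: float
    west: float
    north: float
    east: float


def region_label(lat: float, lon: float, area: BoundingBox,
                 rows: int, cols: int) -> int:
    """Map a coordinate inside `area` to a class index in [0, rows * cols).

    Assumption: regions are the cells of a regular rows x cols grid.
    """
    if not (area.south <= lat <= area.north and area.west <= lon <= area.east):
        raise ValueError("coordinate lies outside the service area")
    # Clamp so that points on the north or east edge fall into the last cell.
    row = min(int((lat - area.south) / (area.north - area.south) * rows), rows - 1)
    col = min(int((lon - area.west) / (area.east - area.west) * cols), cols - 1)
    return row * cols + col


def region_bounds(label: int, area: BoundingBox,
                  rows: int, cols: int) -> BoundingBox:
    """Inverse mapping: the extent of the grid cell for a predicted class."""
    if not 0 <= label < rows * cols:
        raise ValueError("label is not a valid region index")
    row, col = divmod(label, cols)
    cell_h = (area.north - area.south) / rows
    cell_w = (area.east - area.west) / cols
    return BoundingBox(
        south=area.south + row * cell_h,
        west=area.west + col * cell_w,
        north=area.south + (row + 1) * cell_h,
        east=area.west + (col + 1) * cell_w,
    )
```

Under this assumption, `region_label` would produce the training target for an OpenStreetMap image rendered at a known coordinate, and `region_bounds` would turn a class predicted for a style-transferred historical document back into a geographic extent. The grid resolution then determines how precisely a document can be geo-referenced.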


This thesis is part of the “UtilityTwin” project.