Vincent Christlein
Handwriting with Focus on Writer Identification and Writer Retrieval
Abstract
In the course of the mass digitization of historical as well as contemporary sources, an individual examination by historical or forensic experts is no longer feasible. A solution could be an automatic handwriting analysis that determines or suggests script attributes, such as the writer or the date of a document. In this work, several novel techniques based on machine learning are presented to obtain these attributes from a single document image. The focus lies on writer recognition, for which a novel pipeline is developed. It identifies the correct writer of a given sample in over 99 % of cases on all tested contemporary datasets, each comprising between 150 and 310 writers with four to five samples per writer. On a large historical dataset, consisting of 720 writers and five samples per writer, an identification rate of close to 90 % is achieved. Robust local descriptors play a major role in the success of this pipeline. Shape- and histogram-based descriptors prove to be very effective. Furthermore, novel deep-learning-based features are developed using deep convolutional neural networks, which are trained with writer information from the training set. While these features achieve very good results on contemporary data, they lack distinctiveness on the evaluated historical dataset. Therefore, a novel feature learning technique is presented that solves this by learning robust writer-independent script features in an unsupervised manner. The computation of a global descriptor from the local descriptors is the next step. For this encoding procedure, various techniques from the speech and computer vision communities are investigated and thoroughly evaluated. It is important to counter several effects, such as feature correlation and the over-counting of local descriptors. Overall, methods based on aggregating first-order statistics of residuals are the most effective approaches. Common writer recognition methods use the global descriptors directly for comparison.
In contrast, this thesis introduces exemplar classifiers, which allow sample-individual similarities to be learned and are shown to be very effective at improving writer recognition. This writer recognition pipeline is adapted to other tasks related to digital paleography. Medieval papal charters are automatically dated to within an error range of 17 years. Furthermore, an adapted pipeline is among the best at classifying medieval Latin manuscripts into twelve different script types. This information can then be used for pre-sorting documents or as a preprocessing step for handwritten text recognition. It turns out that counteracting different illumination and contrast effects is an important factor for deep-learning-based approaches. The observation that script has tubular structures similar to those of blood vessels is exploited for an improved text block segmentation in historical data by means of a well-known medical filtering technique. This work sets new recognition standards for several tasks, allowing the automatic document analysis of large corpora with low error rates. These methods are also applicable to other fields, such as forensics or paleography, to determine writers, script types or other metadata of contemporary or historical documents.
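To make the encoding step more concrete, the following is a minimal sketch of one well-known way to aggregate first-order statistics of residuals, namely a VLAD-style encoding. It is an illustration under stated assumptions and not the implementation used in this work: the function names `vlad_encode` and `cosine_similarity` are hypothetical, the cluster centers are assumed to be given (for example from k-means on training descriptors), and the signed square root is only one commonly used normalization against dominant, over-counted components.

```python
import math


def vlad_encode(descriptors, centers):
    """Aggregate residuals of local descriptors to their nearest center.

    Illustrative assumption: `centers` were computed beforehand,
    e.g. by k-means on local descriptors of a training set.
    """
    dim = len(centers[0])
    # One residual accumulator per cluster center.
    sums = [[0.0] * dim for _ in centers]
    for d in descriptors:
        # Hard-assign the descriptor to its nearest center (squared Euclidean).
        k = min(
            range(len(centers)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(d, centers[i])),
        )
        # First-order statistic: sum of residuals to the assigned center.
        for j in range(dim):
            sums[k][j] += d[j] - centers[k][j]
    # Concatenate all per-center residual sums into one global descriptor.
    flat = [v for row in sums for v in row]
    # Signed square root dampens components dominated by many similar descriptors.
    flat = [math.copysign(math.sqrt(abs(v)), v) for v in flat]
    # l2 normalization so that documents of different length are comparable.
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


def cosine_similarity(u, v):
    # Both inputs are l2-normalized, so the dot product equals the cosine.
    return sum(a * b for a, b in zip(u, v))


# Toy usage with made-up 2-D descriptors and two assumed centers.
centers = [[0.0, 0.0], [10.0, 10.0]]
doc_a = vlad_encode([[1.0, 0.0], [0.0, 1.0], [9.0, 10.0]], centers)
doc_b = vlad_encode([[1.0, 1.0], [10.0, 9.0]], centers)
similarity = cosine_similarity(doc_a, doc_b)
```

Comparing two such global descriptors directly, as in the last line, corresponds to the common baseline mentioned above. An exemplar classifier would instead learn a separate decision function for each query sample.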