ICDAR2024 Competition on Multi Font Group Recognition and OCR

This competition investigates the performance of Optical Character Recognition (OCR) systems on early-modern prints, with a main focus on font group diversity. Participants must submit both OCR results and character-level font group recognition results. The data, provided by multiple institutions, has been carefully transcribed and proof-checked by experts, and font group information has been labeled at the character level.

Tasks

The following two tasks will be evaluated separately:

  • OCR, with Character Error Rate (CER) and Word Error Rate (WER) as metrics
  • Font group recognition, with CER as metric
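
As an illustration of the metrics above, here is a minimal sketch of CER and WER based on the Levenshtein edit distance. It assumes that errors are summed over all lines and divided by the total reference length, and that words are separated by whitespace; the official evaluation may normalize or aggregate differently. The function names (edit_distance, error_rate, cer, wer) are ours and are not part of any competition tooling. For the font group task, the same cer function can be applied to the sequences of font group labels instead of the transcriptions.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum number of insertions, deletions
    and substitutions needed to turn `hyp` into `ref`."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(refs: List[Sequence], hyps: List[Sequence]) -> float:
    """Total edit distance divided by total reference length.
    Assumption: errors are pooled over all lines, not averaged per line."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must contain the same number of lines")
    total_length = sum(len(r) for r in refs)
    if total_length == 0:
        raise ValueError("reference is empty")
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_errors / total_length


def cer(ref_lines: List[str], hyp_lines: List[str]) -> float:
    """Character Error Rate over a list of text lines."""
    return error_rate(ref_lines, hyp_lines)


def wer(ref_lines: List[str], hyp_lines: List[str]) -> float:
    """Word Error Rate; words are whitespace-separated tokens (assumption)."""
    return error_rate([line.split() for line in ref_lines],
                      [line.split() for line in hyp_lines])
```

For example, cer(["abcd"], ["abed"]) counts one substitution over four reference characters.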

Tracks

The quality of OCR results depends on both methods and training data. A fair comparison of methods requires only that all participants use the same training data, whereas a fair comparison of what different research groups can achieve requires lifting all restrictions. For this reason, this competition has the following two tracks:

  • Provided data only: the networks have to be trained from scratch using only the competition’s training data. Any data augmentation technique is of course allowed.
  • Data alchemist: there is no restriction on which data is used for training the networks, except that the test data, once available, must not be involved in any way in the training process. Using pre-trained models is allowed.

Participants are of course encouraged to submit to both tracks if possible.

Timeline

  • Ongoing: participants have access to the training data
  • March 15th: participants receive the test set
  • March 29th: participants submit their results and a short description of their methodology

Data

Training and validation data, available through the link below, are split book-wise. Participants are free to merge or re-split this data as they wish, including for the first track (Provided data only).
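
For participants who merge and re-split the data, the sketch below shows one possible way of keeping the split book-wise, so that no book contributes lines to both sets. It assumes you can build a mapping from each text line to a book identifier; how that identifier is obtained (for instance from a directory name) depends on the archive layout and is not specified here. The function name split_book_wise and its parameters are hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple


def split_book_wise(line_to_book: Dict[str, str],
                    val_fraction: float = 0.2,
                    seed: int = 0) -> Tuple[List[str], List[str]]:
    """Split line identifiers into (train, validation) so that all lines
    of a given book end up in the same set.

    `line_to_book` maps a line identifier to a book identifier; building
    this mapping is left to the caller (assumption)."""
    books = defaultdict(list)
    for line_id, book_id in line_to_book.items():
        books[book_id].append(line_id)

    # Sort before shuffling so the result depends only on the seed.
    book_ids = sorted(books)
    random.Random(seed).shuffle(book_ids)

    # At least one book goes to validation; with a single book,
    # the training set is therefore empty.
    n_val = max(1, round(len(book_ids) * val_fraction))
    val = [line for b in book_ids[:n_val] for line in books[b]]
    train = [line for b in book_ids[n_val:] for line in books[b]]
    return train, val
```

Note that val_fraction here is a fraction of books, not of lines, so the resulting line counts can be unbalanced when books differ in size.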

https://faubox.rrze.uni-erlangen.de/getlink/fiSDupUxNJWYgBkHtwDjZx/icdar2024-comp-ocr-font.zip

Data format:

  • One .jpg image per text line
  • One .txt file per text line, containing the transcription
  • One .font file per text line, containing the character-level font groups encoded as ASCII text

Font groups are encoded as follows:

  • a: Antiqua
  • b: Bastarda
  • f: Fraktur
  • G: Gotico-Antiqua
  • i: Italic
  • r: Rotunda
  • s: Schwabacher
  • t: Textura
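
To make the encoding concrete, here is a minimal sketch that loads the transcription and font labels of one text line and groups the characters into runs of the same font group. It relies on several assumptions that should be checked against the actual data: that the three files of a line share the same base name, that transcriptions are stored as UTF-8, and that the .font file contains exactly one label per transcription character. The names FONT_GROUPS, load_line and font_runs are ours. Any label outside the list above is passed through unchanged rather than rejected.

```python
from pathlib import Path
from typing import List, Tuple

# Codes as listed above; note that Gotico-Antiqua uses an uppercase "G".
FONT_GROUPS = {
    "a": "Antiqua", "b": "Bastarda", "f": "Fraktur", "G": "Gotico-Antiqua",
    "i": "Italic", "r": "Rotunda", "s": "Schwabacher", "t": "Textura",
}


def load_line(image_path) -> Tuple[str, str]:
    """Return (transcription, font labels) for the line image at `image_path`.
    Assumption: the .txt and .font files share the image's base name."""
    image_path = Path(image_path)
    text = image_path.with_suffix(".txt").read_text(encoding="utf-8")
    fonts = image_path.with_suffix(".font").read_text(encoding="ascii")
    return text.rstrip("\n"), fonts.rstrip("\n")


def font_runs(text: str, fonts: str) -> List[Tuple[str, str]]:
    """Group consecutive characters sharing a font label into
    (font group name, substring) pairs."""
    if len(text) != len(fonts):
        # Assumption: one label per character of the transcription.
        raise ValueError(f"{len(text)} characters but {len(fonts)} labels")
    runs = []
    start = 0
    for i in range(1, len(fonts) + 1):
        if i == len(fonts) or fonts[i] != fonts[start]:
            code = fonts[start]
            runs.append((FONT_GROUPS.get(code, code), text[start:i]))
            start = i
    return runs
```

For instance, font_runs("Hello world", "fffffaaaaaa") yields a Fraktur run "Hello" followed by an Antiqua run " world". If the length check fails on real data, Unicode normalization of the transcription (for example combining characters) is one thing worth inspecting.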

Figure: Font groups, in the same sequence as the list above.

Baseline Method

We will soon provide a link to a cleaned-up version of the OCR model presented in “Combining OCR Models for Reading Early Modern Books” at ICDAR 2023.

Submission

For evaluation, we will use an online competition platform such as aicrowd.com. A link will be provided on this page shortly after March 15th.

Organizers

Mathias Seuret¹
Janne van der Loop²
Dalia Rodríguez-Salas¹
Martin Mayr¹
Fei Wu¹
Florian Kordon¹
Nikolaus Weichselbaumer²
Vincent Christlein¹

¹: Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg
²: Buchwissenschaft, Johannes Gutenberg Universität Mainz