ICDAR 2021 Competition on Historical Document Classification

MS AUTUN, B. mun. S 25 (21 bis), first half of 14th century

This competition investigates the performance of methods for historical document classification. The analysis of historical documents is a difficult challenge commonly solved by trained humanists. We provide three different classification tasks, which can be solved individually or jointly: font group, location, and date. Additionally, we will declare an overall winner among the participants who contribute to all three tasks. The document images are provided by several institutions and cover different genres (handwritten and printed manuscripts, letters, charters).

Tasks

  • Font/Script type classification
  • Date range classification
  • Location classification

Timeline

  • Sep 10 Homepage running; registration already possible in order to receive updates; links to the CLaMM’16 and CLaMM’17 competition datasets, as well as to the font group recognition dataset for the first task (font/script type classification).
  • Oct 1 Additional training sets for task 2 and task 3, which represent the final classes, will be provided. For each training set, we will provide PyTorch dataset loaders for quick and easy processing.
  • Nov 1 Submission of systems possible, i.e., the test set is available internally, but at this point participants can hand in only full systems, which will then be evaluated on our machines. A public leaderboard will be updated each month. A submission can be handed in once a week, and its result is visible only to the participant.
  • Mar 15 Test set will be made publicly available. Submission of single CSV files possible (no leaderboard option).
  • Mar 31 Competition deadline.

Registration

https://forms.gle/3mobXxXKCjMEn3Xp6

Dataset

Training

Testing

Publicly available: March 15th

Evaluation

The competition winners will be determined by TOP-1 accuracy. The overall strongest method will be determined by the harmonic mean of the TOP-1 scores across all subtasks. While participants will have to deal with an imbalanced training set, we will try to balance the classes of the test set.
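The scoring described above can be sketched in a few lines of Python. This is only an illustration, not the organizers' evaluation script: the function names, the example labels, and the per-task accuracies are hypothetical, and the sketch assumes a method was evaluated on all three tasks.

```python
from statistics import harmonic_mean


def top1_accuracy(predicted, actual):
    """Fraction of samples whose single predicted label equals the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot score an empty test set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def overall_score(task_accuracies):
    """Harmonic mean of the per-task TOP-1 accuracies (assumed overall ranking score)."""
    return harmonic_mean(list(task_accuracies.values()))


# Hypothetical predictions and ground truth for one task (labels are made up).
font_predicted = ["class_a", "class_b", "class_c", "class_b"]
font_actual = ["class_a", "class_b", "class_c", "class_a"]

# Hypothetical accuracies for the other two tasks.
scores = {
    "font": top1_accuracy(font_predicted, font_actual),  # 3 of 4 correct
    "date": 0.5,
    "location": 0.6,
}

print(round(overall_score(scores), 3))
```

Compared with an arithmetic mean, the harmonic mean is pulled toward the weakest of the three scores, so a method has to perform reasonably on every task to rank well overall.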

Submission

Each participant can take part in any subtask, but we encourage participants to evaluate on all tasks, since they all follow the same classification protocol. This means that only the classifier/classification layer needs to be adjusted to the different number of classes. Participants who submit a system can optionally participate in the competition anonymously.

Submission of systems: possible from 1st of November 2020

Organizers