
ICDAR 2021 Competition on Historical Document Classification

MS AUTUN, B. mun. S 25 (21 bis), first half of 14th century

This competition investigates the performance of historical document classification. The analysis of historical documents is a difficult challenge, commonly solved by trained humanists. We provide three classification tasks, which can be solved individually or jointly: font group, location, and date. Additionally, we will declare an overall winner among the participants who contributed to all three tasks. The document images are provided by several institutions and cover different genres (handwritten and printed manuscripts, letters, charters).

Paper

https://link.springer.com/chapter/10.1007/978-3-030-86337-1_41

Tasks

  • Font/Script type classification
  • Date range classification
  • Location classification

Timeline

  • Sep 10 Homepage running, registration already possible for obtaining updates, link to the CLaMM’16 and CLaMM’17 competition datasets, as well as the font group recognition dataset for the first task (font/script type classification).
  • Oct 10 Providing additional training sets for task 2
  • Nov 1 Providing training sets for Task 3, which represent the final classes. For each training set, we will provide PyTorch dataset loaders for quick and easy processing. The first data loaders are available here: https://github.com/anguelos/histdiads
  • Nov 1 Submission of systems possible, i.e. the test set is available only internally, so at this point participants can hand in only complete systems, which will then be evaluated on our machines. A public leaderboard will be updated each month (note: no systems have been submitted so far, so there is no public leaderboard yet). A system can be submitted once a week, and its result is visible only to the participant.
  • Mar 29 Test set will be made publicly available. Submission of single CSV files becomes possible, with up to 5 submissions per task.
  • April 25 Competition deadline (originally April 16).

Registration

https://forms.gle/3mobXxXKCjMEn3Xp6

Data

  • Task 1 font group/script type classification:
    • Train:
      • Script type datasets: ICDAR’17 CLAMM, ICDAR’16 CLAMM (the datasets are listed at the bottom of ‘registration & download’. Note also that parts of the CLAMM’17 training data reuse data from CLAMM’16)
      • Font group dataset: https://doi.org/10.5281/zenodo.3366685 (note: each test page will contain only one font group, but this dataset also provides multiple labels per page. The label format is always main_font[,other_font,other_font,…]. If there is no main font group, this is denoted with ‘-’, i.e. the first column will contain ‘-’.)
  • Task 2 date classification
    • New train + test dataset: https://zenodo.org/record/4836687
      • The dataset contains dates/date ranges. The goal is to get as close as possible to the date ranges, measured as the mean absolute error between the estimated date and the date range
      • gt.csv: notBefore,notAfter,filename
      • File names: imageNumber_notBefore_notAfter.jpg
    • Additional allowed training dataset: ICDAR’17 CLAMM
  • Task 3 location classification
    • New train + validation + test dataset: will be uploaded to zenodo soon!
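The Task 2 ground-truth layout described above (gt.csv rows of notBefore,notAfter,filename and file names of the form imageNumber_notBefore_notAfter.jpg) can be read with a few lines of Python. This is a minimal sketch, not an official loader: the function names and the example file name are hypothetical, and it assumes that the two year fields are plain integers and that gt.csv may or may not start with a header row.

```python
import csv
import io
from pathlib import PurePosixPath
from typing import Dict, NamedTuple, Tuple


class DateRange(NamedTuple):
    image_number: str
    not_before: int
    not_after: int


def parse_task2_filename(filename: str) -> DateRange:
    """Split 'imageNumber_notBefore_notAfter.jpg' into its three parts."""
    stem = PurePosixPath(filename).stem
    # Split from the right so an underscore inside imageNumber does no harm.
    image_number, not_before, not_after = stem.rsplit("_", 2)
    return DateRange(image_number, int(not_before), int(not_after))


def read_gt(text: str) -> Dict[str, Tuple[int, int]]:
    """Map each filename to (notBefore, notAfter) from gt.csv content."""
    ranges = {}
    for row in csv.reader(io.StringIO(text)):
        # Skip blank lines and a possible header row (assumption).
        if len(row) != 3 or not row[0].strip().lstrip("-").isdigit():
            continue
        ranges[row[2].strip()] = (int(row[0]), int(row[1]))
    return ranges
```

For example, `parse_task2_filename("000123_1250_1275.jpg")` yields the image number `"000123"` with the range 1250 to 1275 (the file name is made up for illustration).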

Note: All test pages of Task 1 (fonts) contain a known font (i.e., no “not a font” and no “other font” labels are present), and there is a single font on each test page.
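Since the font group training data carries multiple labels per page while the test pages carry exactly one known font, it can help to split a label field into its main font and the remaining fonts, and to select the training pages that resemble the test condition. The following is a rough sketch under assumptions: the helper names are hypothetical, and the class strings are copied from the class list on this page, which may not match the exact spelling used in the dataset files.

```python
from typing import List, Optional, Tuple

# Font classes that can occur in the test set, as listed on this page.
# The exact label strings in the dataset files are an assumption.
TEST_FONTS = {
    "Antiqua", "Bastarda", "Fraktur", "Gotico Antiqua", "Greek",
    "Hebrew", "Italic", "Rotunda", "Schwabacher", "Textura",
}


def split_font_labels(label_field: str) -> Tuple[Optional[str], List[str]]:
    """Parse 'main_font[,other_font,...]'; a main font of '-' becomes None."""
    parts = [p.strip() for p in label_field.split(",") if p.strip()]
    if not parts:
        return None, []
    main = None if parts[0] == "-" else parts[0]
    return main, parts[1:]


def is_test_like(label_field: str) -> bool:
    """True if the page has one known main font and no other fonts."""
    main, others = split_font_labels(label_field)
    return main in TEST_FONTS and not others
```

Whether to train only on such test-like pages or on all pages is a design decision left to the participant; the filter merely makes the distinction explicit.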

These are the lists of classes:

Task 1a, scripts:
Caroline, Cursiva, Half-Uncial, Humanistic, Humanistic Cursive, Hybrida, Praegothica, Semihybrida, Semitextualis, Southern Textualis, Textualis, Uncial

Task 1b, fonts:
Antiqua, Bastarda, Fraktur, Gotico Antiqua, Greek, Hebrew, Italic, Rotunda, Schwabacher, Textura, Not a font (not present in test set), Other font (not present in test set), “-”, i.e., no main font (not present in test set)

Task 2, dating:
There are no classes, only date ranges. You are free to create your own date (range) classes. However, note that we will use the MAE between the ground-truth date range and your suggested date for the evaluation.
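This page does not spell out how the distance between a single suggested date and a date range is computed. A minimal sketch, assuming the error is zero when the suggested year falls inside [notBefore, notAfter] and otherwise equals the distance in years to the nearest bound (the "In Interval" column of the leaderboard suggests such a reading, but this is an assumption, and the function names are hypothetical):

```python
from typing import Dict, Tuple


def date_error(predicted: float, not_before: int, not_after: int) -> float:
    """Distance in years from a predicted date to the range (0 if inside)."""
    if predicted < not_before:
        return float(not_before - predicted)
    if predicted > not_after:
        return float(predicted - not_after)
    return 0.0


def evaluate_dating(
    predictions: Dict[str, float],
    ground_truth: Dict[str, Tuple[int, int]],
) -> Tuple[float, float]:
    """Return (MAE in years, share of predictions inside the range in %)."""
    errors = [
        date_error(predictions[name], *ground_truth[name])
        for name in ground_truth
    ]
    mae = sum(errors) / len(errors)
    in_interval = 100.0 * sum(e == 0.0 for e in errors) / len(errors)
    return mae, in_interval
```

With three toy pages whose errors are 0, 10 and 50 years, this gives an MAE of 20 years and one third of the predictions inside their interval.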

Task 3, location:
Cluny, Corbie, Citeaux, Florence, Fonteney, Himanis, Milan, MontSaintMichel, Paris, SaintBertin, SaintGermainDesPres, SainMatrialDeLinoges, Signy

Evaluation

The competition winners will be determined by top-1 accuracy (in the case of dating: 1-MAE). The overall strongest method will be determined by the sum of its ranks across all tasks. While the participants will have to deal with an imbalanced training set, we will try to balance the classes of the test set.
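The two accuracy figures reported on the leaderboard (overall accuracy and average accuracy, i.e. unweighted average recall) and the sum-of-ranks rule can be sketched as follows. This is an illustration and not the organizers' evaluation script: the function names are hypothetical, scores are assumed to be "higher is better" in every task, every team is assumed to appear in every task, and tied teams are assumed to share the better rank.

```python
from collections import defaultdict
from typing import Dict, Sequence


def overall_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Top-1 accuracy in percent."""
    hits = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * hits / len(y_true)


def unweighted_average_recall(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Mean of the per-class recalls in percent (each class counts equally)."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    return 100.0 * sum(correct[c] / total[c] for c in total) / len(total)


def sum_of_ranks(scores_per_task: Dict[str, Dict[str, float]]) -> Dict[str, int]:
    """Add up each team's rank over all tasks (lower total is better)."""
    totals = defaultdict(int)
    for scores in scores_per_task.values():
        ordered = sorted(scores.values(), reverse=True)
        for team, score in scores.items():
            totals[team] += ordered.index(score) + 1  # ties share the better rank
    return dict(totals)
```

On a perfectly balanced test set the two accuracy figures coincide, which is why they differ only slightly in a test set that is balanced approximately.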

Submission

Each participant can take part in any subtask, but we encourage participants to evaluate on all tasks, since nearly all of them follow the same classification protocol (the dating task is somewhat of an exception, but you are free to phrase it as a classification task, too). This means that only the classifier/classification layer needs to be adjusted to the different number of classes. Participants who submit a system can optionally participate in the competition anonymously.

Please send us (Mathias and Vincent) your results in the form of CSV files in the following format: filename,class (or, in the case of dating: filename,date). You can submit up to 5 different CSVs per task.
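A result file in the filename,class (or filename,date) format can be produced with Python's csv module. The sketch below is only an illustration: the function name and the example file names are hypothetical, and it assumes that no header row is expected, which this page does not state.

```python
import csv
from typing import Mapping, TextIO, Union


def write_submission(predictions: Mapping[str, Union[str, int]], out: TextIO) -> None:
    """Write one 'filename,class' (or 'filename,date') row per test image."""
    writer = csv.writer(out, lineterminator="\n")
    # Sorting by filename keeps repeated submissions easy to compare.
    for filename in sorted(predictions):
        writer.writerow([filename, predictions[filename]])


if __name__ == "__main__":
    # Hypothetical file names; use the real test-set names in practice.
    example = {"page_0002.jpg": "Fraktur", "page_0001.jpg": "Gotico Antiqua"}
    with open("task1b_run1.csv", "w", newline="", encoding="utf-8") as handle:
        write_submission(example, handle)
```

For the dating task the same function can be used with integer years as values.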

Additionally, please send us a half-page description of your method (including the names of the team members) and let us know whether you agree to the results being shown publicly on this competition page (they can also be published anonymously under a team name).

Submission of systems: possible from the 1st of November 2020

Leaderboard

Task 1a) Script classification

Team name | Overall Accuracy [%] | Average Accuracy (Unweighted Average Recall) [%]
--- | --- | ---
PERO (4) | 88.77 | 88.84
PERO (5) | 88.46 | 88.60
PERO (2) | 88.46 | 88.54
PERO (1) | 83.04 | 83.11
PERO (3) | 80.33 | 80.26
The North LTU | 73.96 | 74.12
Baseline | 55.81 | 55.22
CLUZH (3) | 36.86 | 35.25
CLUZH (1) | 26.11 | 25.39
CLUZH (2) | 22.29 | 21.51

Task 1b) Font group classification

Team name | Overall Accuracy [%] | Average Accuracy (UAR) [%]
--- | --- | ---
PERO (5) | 99.04 | 98.48
PERO (2) | 99.09 | 98.42
PERO (4) | 98.98 | 98.27
NAVER Papago (3) | 98.32 | 97.17
NAVER Papago (2) | 97.86 | 96.47
PERO (1) | 97.93 | 96.36
NAVER Papago (4) | 97.69 | 96.24
NAVER Papago (5) | 97.64 | 96.01
NAVER Papago (1) | 97.64 | 95.86
PERO (3) | 97.44 | 95.68
CLUZH (3) | 97.13 | 95.66
CLUZH (1) | 96.84 | 95.34
CLUZH (2) | 96.80 | 95.16
The North LTU | 87.00 | 82.80

Task 2) Dating

Team name | Mean Absolute Error [y] | In Interval [%]
--- | --- | ---
PERO (2) | 21.91 | 48.25
PERO (4) | 21.99 | 47.97
PERO (3) | 32.45 | 39.75
PERO (1) | 32.82 | 40.58
The North LTU | 79.43 | 28.97

Task 3) Location

Team name | Overall Accuracy [%] | Average Accuracy (UAR) [%]
--- | --- | ---
PERO (5) | 79.69 | 79.69
PERO (3) | 75.08 | 75.08
PERO (1) | 74.77 | 74.77
PERO (4) | 70.77 | 70.77
PERO (2) | 69.85 | 69.85
Baseline | 63.12 | 62.46
The North LTU | 43.69 | 43.69

Organizers