ICDAR2023 Competition on Detection and Recognition of Greek Letters on Papyri

Homer, Iliad. © Staatliche Museen zu Berlin, Ägyptisches Museum und Papyrussammlung

This competition investigates the performance of glyph detection and recognition on a very challenging type of historical document: Greek papyri. Detecting and recognizing Greek letters on papyri is a preliminary step for the computational analysis of handwriting, which can considerably advance our understanding of this major source of information on Antiquity. Trained papyrologists can do this manually, but the task is time-consuming and would benefit from automation. We provide two different tasks: localization and classification, or classification only. The document images are provided by several institutions and are representative of the diversity of book hands on papyri (a millennium time span, various script styles, provenance, states of preservation, means of digitization and resolution).

Homer, Iliad 1, 54–64, 71–104, 114–123, 131–164, 412–433, 456–465, 494–534, 537–590, 602–609. 1st–2nd century CE.

Tasks

  • Glyph (character) localization
  • Character classification

Timeline

  • April 1st, 0h01 GMT+1: publication of the test data
    • We will use CodaLab for the evaluation and leaderboard
      • Important: there is currently a glitch with the evaluation on CodaLab, and we are working on it. It will be corrected on the 11th or 12th.
    • Five submissions per day are allowed
    • The results as they stand at the deadline will be used for the ranking announced at ICDAR
  • April 16th, 15h00 GMT+1:
    • submission of the results
    • submission of a brief (~1 paragraph) description of the method

Results should be submitted in the same format as the example given in the “Baseline” section.

Competition results will be announced at the conference.

Data

The training data can be downloaded at the following address:

https://faubox.rrze.uni-erlangen.de/getlink/fi8qaEMwMkc5L2Bg7tdh5L/HomerCompTraining.zip

The data consists of images and JSON files in the COCO format. Localization will be evaluated using the average precision (AP) as defined by COCO, and classification using the percentage of correct answers.
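
As an illustration of the two quantities behind these metrics, here is a minimal sketch, not the official evaluation code. It assumes boxes are stored as COCO-style [x, y, width, height] lists; the function names and the dictionaries mapping annotation ids to category ids are hypothetical. COCO's AP is built on the intersection over union (IoU) between predicted and ground-truth boxes, and the classification score is a plain percentage of matching labels. The evaluation code shipped with the baseline remains the reference.

```python
def iou_xywh(box_a, box_b):
    """Intersection over union of two boxes given as [x, y, width, height]."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def classification_accuracy(predicted, expected):
    """Percentage of expected annotation ids whose predicted category matches.

    Both arguments are hypothetical dicts: annotation id -> category id.
    """
    if not expected:
        return 0.0
    correct = sum(
        1 for ann_id, category in expected.items()
        if predicted.get(ann_id) == category
    )
    return 100.0 * correct / len(expected)
```

A predicted box only counts as a match when its IoU with a ground-truth box reaches a chosen threshold; COCO's primary AP averages over thresholds from 0.50 to 0.95.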

The test data can be downloaded at the following address:

https://faubox.rrze.uni-erlangen.de/getlink/fiSy6Gqncw6yJXLFxrmSbj/HomerCompTesting.zip

Important: the JSON template to fill in has now been added to the archive.

The results have to be submitted following the same structure as the example in the section below.

Here is an example of an annotation which follows this format (from baseline_validation.zip, provided below):

    {
        "image_id": 1,
        "category_id": 119,
        "bbox": [
            339,
            461,
            114,
            102
        ],
        "score": 0.9871184825897217
    }
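
The following sketch shows one way to produce entries with this structure from Python. It assumes that the detector outputs corner coordinates (x1, y1, x2, y2), that "bbox" holds [x, y, width, height] as in COCO, and that a submission is a JSON list of such entries; the helper name `make_detection` is hypothetical. The JSON template in the test archive should be treated as authoritative.

```python
import json


def make_detection(image_id, category_id, box_xyxy, score):
    """Build one result entry from a corner-format box (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box_xyxy
    return {
        "image_id": int(image_id),
        "category_id": int(category_id),
        # Convert corners to [x, y, width, height].
        "bbox": [x1, y1, x2 - x1, y2 - y1],
        "score": float(score),
    }


# The corner box (339, 461, 453, 563) reproduces the example annotation above.
detections = [make_detection(1, 119, (339, 461, 453, 563), 0.9871184825897217)]
serialized = json.dumps(detections, indent=4)
```

Writing `serialized` to a file then gives a document whose entries have the same keys as the example.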

Baseline

You can implement your system from scratch, or use this baseline for an easier initial development phase:

https://faubox.rrze.uni-erlangen.de/getlink/fi6WyMcdhECDQQDQsh4WUT/baseline.zip

We used the PyTorch tutorial on object detection, and made the minimal modifications needed to run it on our data. The main modification is that instead of downscaling input images to process them as a whole, which makes the letters too small to be detected, the input is processed patch-wise, without overlap between patches.
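
To make the patch-wise idea concrete, here is a minimal sketch of non-overlapping tiling. It is not the baseline's actual code: the patch size of 512 pixels, the handling of edge patches (cropped rather than padded) and both function names are assumptions made for illustration. Boxes predicted inside a patch are shifted by the patch origin to recover full-image coordinates.

```python
def tile_origins(width, height, patch):
    """Yield (x, y, w, h) for non-overlapping patches covering the image.

    Patches on the right and bottom edges are cropped to the image size.
    """
    for y in range(0, height, patch):
        for x in range(0, width, patch):
            yield x, y, min(patch, width - x), min(patch, height - y)


def to_image_coords(box_xywh, patch_x, patch_y):
    """Shift a patch-local [x, y, width, height] box into image coordinates."""
    x, y, w, h = box_xywh
    return [x + patch_x, y + patch_y, w, h]


# Hypothetical 1000 x 600 image cut into patches of at most 512 x 512 pixels.
tiles = list(tile_origins(1000, 600, 512))
```

Without overlap, a letter lying across a patch border is split between two patches, which is one limitation of this simple scheme.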

The baseline also contains evaluation code and a trained model.

Moreover, for information purposes only, we randomly split the training set and validated the provided model on one part of it. You can find this subset and the predictions in:

https://faubox.rrze.uni-erlangen.de/getlink/fiJY4iYrYrtdpTzjBWL11s/baseline_validation.zip

Registration

We would like people intending to participate to register for organizational purposes. Should the training data be modified, we will inform the participants.

To register, either send an e-mail to mathias.seuret AT fau DOT de, fill in the following form, or simply register on CodaLab.

https://forms.gle/4Xmk9qZRd5qRs8zM9

Organizers

  • Isabelle Marthot-Santaniello
  • Stephen White
  • Olga Serbaeva Saraogi
  • Dalia Rodriguez-Salas
  • Guillaume Carrière
  • Vincent Christlein
  • Mathias Seuret

Prizes for EELISA members (not incl. FAU)

The best submissions from EELISA European University partners will be awarded the possibility of a one-week lab visit at the Pattern Recognition Lab at FAU in Erlangen. For this lab visit, EELISA FAU will cover the costs of flight/train and hotel incl. breakfast.