DocILE Lab and Challenge at CLEF 2023

Join our Google Group to receive updates and news about the challenge and lab at CLEF 2023.

Goal

The main goal of the lab is to provide a research benchmark for cross-evaluation of machine learning methods for Key Information Localization and Extraction (KILE) and Line Item Recognition (LIR) from semi-structured business documents such as invoices and orders. Such a benchmark is currently missing, which hinders cross-evaluation, as discussed by Skalický et al. (2022).

DALL·E 2: Human in the clouds relaxing as AI extracts data from documents instead of them, digital art.

Data

For the DocILE'23 challenge, we will provide the largest dataset of business documents annotated for KILE and LIR. The training set will be published in January 2023.

[Legal information on the processing of personal data for the purpose of scientific research.]

Organizers

Task Chairs:
Milan Šulc, Head of Rossum AI Labs, Rossum.ai, Czech Republic.
Štěpán Šimsa, Researcher at Rossum AI Labs, Rossum.ai, Czech Republic.

Co-Organizers:
Ahmed Hamdi, Associate Professor at the University of La Rochelle, France.
Yash Patel, PhD candidate at the Visual Recognition Group, Czech Technical University in Prague, Czech Republic.
Matyáš Skalický, Research Engineer at Rossum.ai, Czech Republic.

Steering Committee:
Michal Uřičář, Researcher at Rossum AI Labs, Rossum.ai, Czech Republic.
Antoine Doucet, Full Professor of Computer Science at the University of La Rochelle, France.
Mickael Coustaty, Associate Professor at the University of La Rochelle, France.
Dimosthenis Karatzas, Associate Director of the Computer Vision Center, Barcelona, Spain.