Czech Question Answering with Extended SQAD v3.0 Benchmark Dataset

Authors

SABOL Radoslav, MEDVEĎ Marek, HORÁK Aleš

Year of publication 2019
Type Article in Proceedings
Conference Proceedings of the Thirteenth Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2019
MU Faculty or unit

Faculty of Informatics

Keywords question answering; QA benchmark dataset; SQAD; Czech
Description In this paper, we introduce a new version of the Simple Question Answering Database (SQAD). The main asset of the new version is the increase in the number of records to a total of 13,473. Besides the database enlargement, the new version incorporates new restrictions that specify different formats of the expected answer for a given question. These new restrictions are connected with automatic database consistency checks, in which new sub-processes safeguard the correctness and consistency of the database. We also introduce a new on-line annotation tool that offered a unified environment for extending the SQAD data in a crowdsourcing experiment.