SQAD: Simple Question Answering Database

Authors

HORÁK Aleš, MEDVEĎ Marek

Year of publication 2014
Type Article in Proceedings
Conference Eighth Workshop on Recent Advances in Slavonic Natural Language Processing
MU Faculty or unit

Faculty of Informatics

Field Informatics
Keywords question answering; Simple Question Answering Database; SQAD; syntax-based question answering; SBQA
Description In this paper, we present a new free resource for comparable evaluation of Czech question answering. The Simple Question Answering Database, SQAD, contains 3301 questions and answers extracted and processed from the Czech Wikipedia. The SQAD database was prepared for precision evaluation of automatic question answering systems. No such resource was previously available for the Czech language. We describe the process of SQAD creation, the processing of the texts by automatic tokenization (Unitok) and morphological disambiguation (Desamb), and the subsequent semi-automatic cleaning and post-processing. We also show the results of a first version of a Czech question answering system named SBQA (syntax-based question answering).