Building Evaluation Dataset for Textual Entailment in Czech
Authors | |
---|---|
Year of publication | 2012 |
Type | Article in Proceedings |
Conference | Sixth Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2012 |
Web | https://nlp.fi.muni.cz/raslan/2012/paper03.pdf |
Field | Informatics |
Keywords | textual entailment; evaluation data set; Czech language; paraphrasing |
Description | Recognizing textual entailment (RTE) is a subfield of natural language processing (NLP). Several RTE systems currently exist; some of their subtasks are language independent, but others are not. Moreover, large evaluation datasets are prepared almost exclusively for English. In this paper we describe methods for obtaining a test dataset for RTE in Czech. We used methods for extracting facts from texts, based on corpus templates as well as on a syntactic parser. In addition, we used reading comprehension tests for children and students. The main contribution of this article is the classification of “difficulty levels” for particular RTE questions. |