Multilingual Recognition of Temporal Expressions

Warning

This publication does not belong to the Faculty of Sports Studies but to the Faculty of Informatics. The official publication page is on the muni.cz website.
Authors

STARÝ Michal NEVĚŘILOVÁ Zuzana VALČÍK Jakub

Year of publication 2020
Type Article in Proceedings
Conference Proceedings of the Fourteenth Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2020
MU Faculty or unit

Faculty of Informatics

Keywords temporal expressions; multilingual; date recognition
Description The paper presents a multilingual approach to temporal expression recognition (TER) that uses existing tools and their combination. We observe that rule-based methods perform well on documents that use well-formed temporal expressions in a narrower domain (e.g., news), while data-driven methods are more stable on less standard language and on texts across domains. By combining the two approaches, we achieved F1 of 0.73 and 0.9 for strict and relaxed evaluation, respectively, on one English dataset. Although these results do not reach the state of the art on English, the same method outperformed the state-of-the-art results in a multilingual setting, not only in recall but also in F1. We see this as a strong indication that combining rule-based systems with data-driven models such as BERT is a valid way to improve overall performance in TER, especially for languages other than English. Further observations indicate that, in the domain of office documents, the combined method recognizes general temporal expressions as well as domain-specific ones (e.g., those used in financial documents).
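The difference between strict and relaxed evaluation, and the effect of combining two recognizers, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the outputs are combined or how matches are scored, so the union-with-merge strategy in `combine`, the overlap criterion for relaxed matching, and all names and sample spans below are assumptions chosen for illustration.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def overlaps(a: Span, b: Span) -> bool:
    """True if the two spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def combine(rule_spans: List[Span], model_spans: List[Span]) -> List[Span]:
    """Hypothetical combination: union of both outputs, merging overlaps."""
    merged: List[Span] = []
    for span in sorted(set(rule_spans) | set(model_spans)):
        if merged and overlaps(merged[-1], span):
            # Extend the previous span so it covers both candidates.
            merged[-1] = (merged[-1][0], max(merged[-1][1], span[1]))
        else:
            merged.append(span)
    return merged


def f1(gold: List[Span], predicted: List[Span], relaxed: bool) -> float:
    """F1 with exact-boundary (strict) or any-overlap (relaxed) matching."""
    def match(a: Span, b: Span) -> bool:
        return overlaps(a, b) if relaxed else a == b

    matched_pred = sum(any(match(p, g) for g in gold) for p in predicted)
    matched_gold = sum(any(match(g, p) for p in predicted) for g in gold)
    precision = matched_pred / len(predicted) if predicted else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented sample data: two gold expressions, one found by each recognizer.
gold = [(0, 10), (20, 30)]
rule_output = [(0, 10)]    # exact boundaries
model_output = [(22, 30)]  # overlaps the gold span but starts late
combined = combine(rule_output, model_output)
strict_score = f1(gold, combined, relaxed=False)
relaxed_score = f1(gold, combined, relaxed=True)
```

In this invented example the combined output recovers both gold expressions under relaxed matching, while strict matching credits only the span whose boundaries are exact, which is why a relaxed score is never lower than the strict one.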