Detecting Online Risks and Supportive Interaction in Instant Messenger Conversations using Czech Transformers

Authors	SOTOLÁŘ Ondřej PLHÁK Jaromír TKACZYK Michal LEBEDÍKOVÁ Michaela ŠMAHEL David
Year of publication	2021
Type	Article in Proceedings
Conference	Recent Advances in Slavonic Natural Language Processing (RASLAN 2021)
MU Faculty or unit	Faculty of Informatics
web	Full text PDF; Workshop homepage
Keywords	Online Risks; Supportive Interaction; Facebook Messenger; Text Classification
Description	We present a comparison of state-of-the-art models for text classification of Online Risks and Supportive Interaction in anonymized Instant Messenger conversations held in Czech. We compare the transformer models Czert, RobeCzech, and FERNET-C5 with the Fasttext classifier as a baseline. For the comparison, we build a novel dataset with five subcategories for the Online Risks and five for the Supportive Interaction. We solve the balanced classification problem, achieving F1 scores of 75.44 to 89.66 depending on the category. Our results show that the transformer models perform consistently better than the baseline.
Related projects:	Modelling the future: Understanding the impact of technology on adolescent’s well-being