ON THE USE OF GRAPHEME MODELS FOR SEARCHING IN LARGE SPOKEN ARCHIVES
Authors | |
---|---|
Year of publication | 2018 |
Type | Article in proceedings |
Conference | 43rd IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2018) |
MU Faculty or unit | |
Citation | |
DOI | http://dx.doi.org/10.1109/ICASSP.2018.8461774 |
Field | Informatics |
Keywords | spoken term detection; speech indexing; grapheme-based speech recognition; keyword search |
Description | This paper explores the use of grapheme-based word and sub-word models for spoken term detection (STD). Grapheme models eliminate the need for expert-prepared pronunciation lexicons (which are often far from complete) and for trainable grapheme-to-phoneme (G2P) algorithms, which are frequently rather inaccurate, especially for rare words (words that come from a different language). Moreover, the G2P conversion of search terms, which must be performed on-line, can substantially increase the response time of the STD system. Our results show that various grapheme-based models achieve STD performance (measured in terms of ATWV) comparable to that of phoneme-based models, but without the additional burden of G2P conversion. |
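
The point about avoiding G2P conversion can be illustrated with a minimal sketch. It is not taken from the paper: the function names `grapheme_pronunciation` and `build_grapheme_lexicon` are hypothetical, and the rule of one unit per letter is a simplifying assumption (real systems may treat digraphs, diacritics, or digits differently). It only shows that a grapheme "pronunciation" is derived directly from the spelling, so a search term can be converted at query time without a lexicon lookup or a trained G2P model.

```python
from typing import Dict, Iterable, List


def grapheme_pronunciation(word: str) -> List[str]:
    """Spell a word as a sequence of grapheme units, one per letter.

    Assumption for illustration: the word is lowercased and every
    non-alphabetic character is dropped.
    """
    return [ch for ch in word.lower() if ch.isalpha()]


def build_grapheme_lexicon(words: Iterable[str]) -> Dict[str, List[str]]:
    """Map each word to its grapheme sequence, with no G2P model involved."""
    return {word: grapheme_pronunciation(word) for word in words}


# Lexicon entries for recognizer vocabulary words (hypothetical examples).
lexicon = build_grapheme_lexicon(["archive", "Praha"])

# A search term unseen at training time is converted the same way on-line.
query_units = grapheme_pronunciation("ICASSP")
```

Because the mapping is a plain character split, its cost at query time is negligible, which is the response-time advantage the description refers to.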