Investigation of Methods to Improve the Recognition Performance of Tamil-English Code-Switched Data in Transformer Framework
, Metilda Sagaya Mary N.J., Shetty V.M.
Published in Institute of Electrical and Electronics Engineers Inc.
Volume: 2020-May
Pages: 7889 - 7893
Code-switching (CS) refers to switching between multiple languages, either between words or within a word, in a single conversation. In multilingual countries like India, CS occurs very often in everyday speech, giving rise to mixed languages such as Hinglish (Hindi-English) and Tanglish (Tamil-English) in urban regions. Research in Indic CS speech recognition is held back primarily by insufficient data. In this paper, we investigate methods to deal with such very low-resource scenarios. Recently, Transformers have shown promising results on automatic speech recognition (ASR) tasks. In a Transformer-based framework, we investigate two methods for Tamil-English CS speech recognition: (i) using well-trained encoders of monolingual Transformers as feature extractors to provide language discrimination, and (ii) adding language information as tokens at the targets. Our results show that CS is handled efficiently by the second method, while the first method is efficient at discriminating languages. © 2020 IEEE.
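As a rough illustration of method (ii), the sketch below inserts a language token into a target transcript wherever the language changes. The abstract does not specify the token format or how language is determined, so the tag strings `<ta>` and `<en>`, the helper names, and the script-based rule (a word containing any character from the Unicode Tamil block, U+0B80 to U+0BFF, is treated as Tamil) are all assumptions made for this example, not the authors' implementation.

```python
# Hypothetical tag strings; the paper's actual token inventory is not given in the abstract.
TAMIL_TAG = "<ta>"
ENGLISH_TAG = "<en>"


def word_language(word):
    """Guess a word's language from its script (an assumption for this sketch).

    A word with at least one character in the Unicode Tamil block
    (U+0B80 to U+0BFF) is tagged Tamil; everything else is tagged English.
    """
    for ch in word:
        if "\u0b80" <= ch <= "\u0bff":
            return TAMIL_TAG
    return ENGLISH_TAG


def add_language_tokens(transcript):
    """Return the target token list with a language tag before each language run."""
    tagged = []
    previous = None
    for word in transcript.split():
        tag = word_language(word)
        if tag != previous:
            # Language changed (or this is the first word): emit a tag token.
            tagged.append(tag)
            previous = tag
        tagged.append(word)
    return tagged


if __name__ == "__main__":
    print(add_language_tokens("நான் office போகிறேன்"))
```

This sketch only marks switches between words. A word that mixes scripts is tagged as Tamil as a whole, so intra-word switching would need sub-word level tagging.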
About the journal
Journal: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Open Access: No