Results of the SIGMORPHON 2020 shared task on multilingual grapheme-to-phoneme conversion

The results of the SIGMORPHON 2020 shared task on multilingual grapheme-to-phoneme conversion are now in, and are summarized in our task paper. A few highlights:

  • Unsurprisingly, the best systems all used some form of ensembling; a minimal voting sketch appears after this list.
  • Many of the best teams experimented with self-training and/or data augmentation, but most of these experiments hurt performance except in simulated low-resource conditions. Perhaps we’ll run a low-resource challenge in a future year.
  • LSTM-based and transformer-based systems were roughly neck and neck; one strong submission used a variant of hard monotonic attention.
  • Many of the best teams used some kind of romanization preprocessing strategy for Korean, the language with the worst baseline accuracy. We speculate about why this helps in the task paper; a toy decomposition example appears after this list.
  • There were some concerns about data quality for three languages (Bulgarian, Georgian, and Lithuanian). We know how to fix these issues and will do so this summer, if time allows. We may also “re-issue” the challenge data with these fixes.
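
To make the ensembling point concrete, here is a minimal sketch of majority voting over whole transcriptions, assuming each component model emits a single-best phoneme string per word. This is illustrative only; actual submissions combined their models in various ways (e.g., averaging probabilities or merging beams).

    from collections import Counter

    def ensemble_vote(hypotheses):
        # Majority vote over complete phoneme-string hypotheses, one per
        # component model; ties are broken by the order models are listed.
        counts = Counter(hypotheses)
        return counts.most_common(1)[0][0]

    # Hypothetical single-best outputs from three models for one word:
    print(ensemble_vote(["ð oʊ", "ð oʊ", "θ ɔ"]))  # -> ð oʊ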
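
As for the Korean preprocessing, one simple illustration (our own toy example, not necessarily what any team did) is canonical decomposition of precomposed Hangul syllable blocks into their component jamo, which exposes the individual letters to the model rather than whole-syllable characters:

    import unicodedata

    def decompose_hangul(word):
        # NFD normalization splits each precomposed Hangul syllable block
        # into its component jamo (leading consonant, vowel, optional
        # final consonant), giving a longer but more transparent input.
        return unicodedata.normalize("NFD", word)

    word = "한국"
    jamo = decompose_hangul(word)
    print(len(word), len(jamo))  # 2 6: each syllable block yields three jamo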
