Speech Recognition and Synthesis Across Thousands of Languages
Meta has introduced Massively Multilingual Speech (MMS), a family of AI models that can identify more than 4,000 spoken languages and that supports speech-to-text and text-to-speech in over 1,100 languages.
Innovative Approach to Collecting Audio Data
To gather audio data, Meta took an unconventional route: it used religious texts such as the Bible, which have been translated into many languages and for which audio recordings of the translations are available. This approach significantly expanded the model’s language coverage.
Benefits and Limitations
Meta’s MMS offers substantial benefits, but NIX Solutions notes that the models are not flawless: they may mistranscribe individual words or phrases. Nevertheless, the initiative aims to challenge a trend in which speech technology from major tech companies supports only around 100 languages.