Meta (formerly known as Facebook) has announced a new breakthrough in the field of artificial intelligence: a model called Massively Multilingual Speech (MMS). The model can identify more than 4,000 spoken languages and convert speech to text (and text to speech) in over 1,100 of them, opening up new opportunities for communication and access to information.
An unconventional approach to training
Conventional speech recognition and text-to-speech models require thousands of hours of transcribed audio for training. For most of the world's languages, however, no such data exists. The Meta team solved this problem with an unconventional approach: they used religious texts, such as the Bible, that have been translated into many languages and for which publicly available audio recordings exist. Although these recordings lacked precise alignment between audio and text, self-supervised learning with wav2vec 2.0 allowed the model to overcome this limitation.
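To make the self-supervised idea concrete, here is a minimal, illustrative sketch (not Meta's implementation) of the core objective behind wav2vec 2.0: mask a portion of the latent speech frames and train the model to pick out the true quantized target for each masked frame from among distractors, using only unlabeled audio. All arrays below are random toy stand-ins; in the real model the latents come from a convolutional encoder, the context from a Transformer, and the targets from a learned codebook.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: 20 latent frames of 8-dim features, as a convolutional
# encoder might produce from raw (unlabeled) audio.
T, D = 20, 8
latents = rng.normal(size=(T, D))

# Quantized targets: in wav2vec 2.0 these come from a learned codebook;
# here we simply add noise to the latents as a stand-in.
targets = latents + 0.1 * rng.normal(size=(T, D))

# Mask roughly half of the frames; the real model must reconstruct
# information about these from the surrounding context.
mask = rng.random(T) < 0.5

def cosine(a, b):
    # Cosine similarity between two vectors.
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def contrastive_loss(context, targets, mask, K=5, temperature=0.1):
    """For each masked frame, score the true target against K distractor
    targets sampled from other masked positions, then apply cross-entropy
    with the true target as the correct class (index 0)."""
    masked_idx = np.flatnonzero(mask)
    losses = []
    for t in masked_idx:
        others = masked_idx[masked_idx != t]
        distractors = rng.choice(others, size=min(K, len(others)), replace=False)
        candidates = np.vstack([targets[t], targets[distractors]])
        logits = np.array([cosine(context[t], c) for c in candidates]) / temperature
        # Cross-entropy: -log softmax probability of the true target.
        losses.append(-logits[0] + np.log(np.exp(logits).sum()))
    return float(np.mean(losses))

# We skip the Transformer and use the latents themselves as "context";
# this keeps the sketch self-contained while preserving the objective's shape.
loss = contrastive_loss(latents, targets, mask)
print(loss)
```

No transcriptions appear anywhere in this objective, which is why loosely aligned audio such as Bible readings is usable: the model learns speech representations first, and only a small amount of labeled data is needed afterward to fine-tune for recognition.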
Over 4,000 supported languages
The MMS models can identify more than 4,000 spoken languages and transcribe speech in over 1,100 of them, making them among the most multilingual on the market. Notably, according to Meta's analysis, training on religious texts did not skew the results: the models showed no measurable bias toward religious vocabulary and no gender bias.
New opportunities for accessing information
One of the key advantages of the MMS project is that it makes information accessible in rare languages. Most speech products from major technology companies cover roughly 100 languages at best, which leaves speakers of less common languages behind. Meta aims to overcome this limitation and provide a level playing field for all users, notes NIXsolutions.
A new era in speech recognition
Meta’s MMS is a major advance in speech recognition and text-to-speech. Its ability to identify thousands of spoken languages, and to transcribe and synthesize speech across more than a thousand of them, opens up new horizons for communication and interaction in a multilingual world.