
Meta Releases Massively Multilingual Speech Model for Over 4,000 Languages

24/05/2023

Meta, the social media giant and parent company of Facebook, announced the release of its Massively Multilingual Speech (MMS) model on Tuesday (23/5/2023). The model can identify more than 4,000 spoken languages, roughly 40 times more than existing technology. Its text-to-speech and speech-to-text capabilities, which can be employed in augmented reality (AR) and virtual reality (VR) applications, now cover more than 1,100 languages, up from around 100. Such applications could speak not only in the user's chosen language but also in a voice that sounds like their own.
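To give a concrete sense of the headline capability, the sketch below shows how spoken-language identification with a released MMS checkpoint might look through the Hugging Face transformers library. The model id facebook/mms-lid-126, the 16 kHz input format, and the example file name are assumptions for illustration, not details from Meta's announcement; check the official model cards before relying on them.

# A minimal sketch of MMS spoken-language identification, assuming the
# released checkpoints are available on the Hugging Face Hub.
import torch
import torchaudio
from transformers import AutoFeatureExtractor, Wav2Vec2ForSequenceClassification

model_id = "facebook/mms-lid-126"  # assumed id of a released LID checkpoint
feature_extractor = AutoFeatureExtractor.from_pretrained(model_id)
model = Wav2Vec2ForSequenceClassification.from_pretrained(model_id)

# Load a recording and resample it to the 16 kHz rate the model expects.
waveform, sample_rate = torchaudio.load("clip.wav")  # hypothetical example file
waveform = torchaudio.functional.resample(waveform, sample_rate, 16_000)

inputs = feature_extractor(waveform.squeeze(), sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

predicted_id = int(torch.argmax(logits, dim=-1))
print(model.config.id2label[predicted_id])  # prints a language code label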

The shortcomings of current language identification and speech generation technology are contributing to the decline of many languages around the world. In a statement, Meta announced the release of a set of AI models designed to make it easier for people to access information and use technology in their native languages.

According to Meta, it will make the technology's source code and models publicly available, enabling the research community to build on this work, help preserve languages around the world, and foster understanding across cultures.
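As an illustration of what that public release makes possible, here is a minimal sketch of speech-to-text with an MMS checkpoint via the Hugging Face transformers integration. The model id facebook/mms-1b-all, the language code "fra", and the example audio file are assumptions for illustration rather than details confirmed by the article.

# A minimal sketch of MMS speech-to-text, assuming the multilingual ASR
# checkpoint "facebook/mms-1b-all" is available on the Hugging Face Hub.
import torch
import torchaudio
from transformers import AutoProcessor, Wav2Vec2ForCTC

model_id = "facebook/mms-1b-all"  # assumed id of the released ASR checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

# Switch the tokenizer vocabulary and the model's language adapter to French.
processor.tokenizer.set_target_lang("fra")
model.load_adapter("fra")

# Load an example recording and resample it to 16 kHz mono.
waveform, sample_rate = torchaudio.load("example.wav")  # hypothetical file
waveform = torchaudio.functional.resample(waveform, sample_rate, 16_000)

inputs = processor(waveform.squeeze(), sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

ids = torch.argmax(logits, dim=-1)[0]
print(processor.decode(ids))  # prints the transcription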

The first obstacle in developing this system was "collecting voice training data for thousands of languages." To overcome this difficulty, Meta turned to religious texts such as the Bible, which have been translated into many languages and used as textual training data.

According to Meta, audio recordings of Bible translations in many languages are publicly available. As part of the massively multilingual speech effort, Meta produced a dataset of audio readings of the New Testament in more than 1,100 languages, providing an average of 32 hours of speech training data per language. By adding unlabeled recordings of other Christian audiobooks, the available training data now spans more than 4,000 languages.

Meta emphasized that it will keep broadening the scope of the massively multilingual speech model to support more languages for recognition and generation, while working to address the challenges that dialects pose for current speech technology.
