Meta has introduced a new AI model designed to handle translation and transcription across dozens of languages.
This innovation has the potential to become a fundamental building block for products aimed at facilitating real-time communication across linguistic barriers.
According to the company’s official blog post, the new SeamlessM4T model brings together capabilities that previously required separate models into a single system, allowing it to translate between text and speech across nearly 100 languages.
The model can also perform full speech-to-speech translation for 35 languages.
Mark Zuckerberg, Meta’s CEO, has said such tools are key to letting people around the world interact in the metaverse, the network of interconnected virtual worlds on which he is betting the company’s future.
To promote wider accessibility, Meta has made the model available to the public for non-commercial use, the blog post said.
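For readers who want to try the model, here is a minimal sketch of what text-to-text translation might look like using the Hugging Face transformers port of SeamlessM4T. The checkpoint id and language codes used here are assumptions for illustration; Meta’s own seamless_communication repository provides an alternative interface.

```python
# A minimal sketch, assuming the Hugging Face transformers port of SeamlessM4T.
# The checkpoint id below is an assumption, not confirmed by Meta's blog post.
from transformers import AutoProcessor, SeamlessM4TForTextToText

model_id = "facebook/hf-seamless-m4t-medium"  # assumed checkpoint id
processor = AutoProcessor.from_pretrained(model_id)
model = SeamlessM4TForTextToText.from_pretrained(model_id)

# Translate an English sentence into French; the model uses three-letter
# language codes such as "eng" and "fra".
inputs = processor(text="Hello, how are you?", src_lang="eng", return_tensors="pt")
tokens = model.generate(**inputs, tgt_lang="fra")
print(processor.decode(tokens[0], skip_special_tokens=True))
```

The speech-to-speech mode mentioned above would instead use a speech-capable variant of the model and return an audio waveform rather than text.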
Meta has released a stream of AI models this year, many of them free to use. Among them is the Llama language model, which competes directly with proprietary models from Google and Microsoft-backed OpenAI.
Zuckerberg has argued that an open AI ecosystem works to Meta’s advantage: the company gains more from crowdsourced improvements to the tools that power its social platforms than it would from charging for access to the models.
Meta’s approach has drawn legal challenges, however. In July, comedian Sarah Silverman and two other authors filed copyright lawsuits against Meta and OpenAI, alleging that their works were used without permission as training data for AI models. The question of what data Meta uses to train its models remains central to those disputes.
According to a research paper by Meta’s researchers, the SeamlessM4T model’s audio training data was drawn from 4 million hours of “raw audio” taken from a publicly available repository of web data. The paper does not name the repository, and a Meta spokesperson declined to provide further information about the origin of the audio.
The paper also says the text data underpinning the model came from datasets created last year, which gathered content from Wikipedia and associated websites.