text-to-speech Archives - AI News https://www.artificialintelligence-news.com/tag/text-to-speech/

OpenAI introduces GPT-4 Turbo, platform enhancements, and reduced pricing (AI News, Tue, 07 Nov 2023) https://www.artificialintelligence-news.com/2023/11/07/openai-gpt-4-turbo-platform-enhancements-reduced-pricing/

OpenAI has announced a slew of new additions and improvements to its platform, alongside reduced pricing, aimed at empowering developers and enhancing user experience.

Following yesterday’s leak of a custom GPT-4 chatbot creator, OpenAI unveiled several other key features during its DevDay that promise a transformative impact on the landscape of AI applications:

  • GPT-4 Turbo: OpenAI introduced the preview of GPT-4 Turbo, the next generation of its renowned language model. This new iteration boasts enhanced capabilities and an extensive knowledge base encompassing world events up until April 2023.
    • One of GPT-4 Turbo’s standout features is the impressive 128K context window, allowing it to process the equivalent of more than 300 pages of text in a single prompt.
    • Notably, OpenAI has optimised the pricing structure, making GPT-4 Turbo 3x cheaper for input tokens and 2x cheaper for output tokens compared to its predecessor.
  • Assistants API: OpenAI also unveiled the Assistants API, a tool designed to simplify the process of building agent-like experiences within applications.
    • The API lets developers build purpose-built assistants that follow specific instructions, draw on additional knowledge, and call models and tools to perform tasks.
  • Multimodal capabilities: OpenAI’s platform now supports a range of multimodal capabilities, including vision, image creation (DALL·E 3), and text-to-speech (TTS).
    • GPT-4 Turbo can process images, opening up possibilities such as generating captions, detailed image analysis, and reading documents with figures.
    • Additionally, DALL·E 3 integration allows developers to create images and designs programmatically, while the text-to-speech API enables the generation of human-quality speech from text.
  • Pricing overhaul: OpenAI has significantly reduced prices across its platform, making it more accessible to developers.
    • GPT-4 Turbo input tokens are now 3x cheaper than GPT-4 at $0.01 per 1K tokens, and output tokens are 2x cheaper at $0.03 per 1K tokens. Similar reductions apply to GPT-3.5 Turbo, catering to a range of user requirements.
  • Copyright Shield: To bolster customer protection, OpenAI has introduced Copyright Shield.
    • This initiative sees OpenAI stepping in to defend customers and cover the associated legal costs if they face copyright infringement claims related to the generally available features of ChatGPT Enterprise and the developer platform.
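The pricing changes above are easy to turn into a quick cost estimator. A minimal sketch using the announced per-1K-token rates for GPT-4 Turbo alongside the previous GPT-4 rates for comparison (the helper function itself is illustrative, not part of any OpenAI SDK):

```python
# Per-1K-token prices in USD: GPT-4 Turbo as announced at DevDay,
# with the previous GPT-4 rates ($0.03 input / $0.06 output) for comparison.
PRICES = {
    "gpt-4-turbo": {"input": 0.01, "output": 0.03},
    "gpt-4": {"input": 0.03, "output": 0.06},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single API call."""
    p = PRICES[model]
    return (input_tokens / 1000) * p["input"] + (output_tokens / 1000) * p["output"]

# A prompt filling the full 128K context window, with a 1K-token reply:
print(f"${estimate_cost('gpt-4-turbo', 128_000, 1_000):.2f}")  # $1.31
```

At these rates, even a prompt that uses the entire 128K window costs under a dollar and a half, which is what makes long-document workloads practical.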

OpenAI’s latest announcements mark a significant stride in the company’s mission to democratise AI technology, empowering developers to create innovative and intelligent applications across various domains.

See also: OpenAI set to unveil custom GPT-4 chatbot creator

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with Digital Transformation Week.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

Meta’s open-source speech AI models support over 1,100 languages (AI News, Tue, 23 May 2023) https://www.artificialintelligence-news.com/2023/05/23/meta-open-source-speech-ai-models-support-over-1100-languages/

Advancements in machine learning and speech recognition technology have made information more accessible to people, particularly those who rely on voice to access information. However, the lack of labelled data for numerous languages poses a significant challenge in developing high-quality machine-learning models.

In response to this problem, the Meta-led Massively Multilingual Speech (MMS) project has made remarkable strides in expanding language coverage and improving the performance of speech recognition and synthesis models.

By combining self-supervised learning techniques with a diverse dataset of religious readings, the MMS project has expanded speech recognition support from roughly 100 languages to more than 1,100.

Breaking down language barriers

To address the scarcity of labelled data for most languages, the MMS project utilised religious texts, such as the Bible, which have been translated into numerous languages.

These texts come with publicly available audio recordings of people reading them, enabling the creation of a dataset of New Testament readings in over 1,100 languages.

By including unlabeled recordings of other religious readings, the project expanded language coverage to recognise over 4,000 languages.

Despite the dataset’s specific domain and predominantly male speakers, the models performed equally well for male and female voices. Meta also says the religious content did not introduce bias into the models’ output.

Overcoming challenges through self-supervised learning

The dataset provides just 32 hours of labelled data per language, far too little to train conventional supervised speech recognition models.

To overcome this limitation, the MMS project leveraged the benefits of the wav2vec 2.0 self-supervised speech representation learning technique.

By training self-supervised models on approximately 500,000 hours of speech data across 1,400 languages, the project significantly reduced the reliance on labelled data.

The resulting models were then fine-tuned for specific speech tasks, such as multilingual speech recognition and language identification.
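The core idea behind wav2vec 2.0 is to mask spans of latent speech features and train the model to pick out the true latent for each masked timestep from a set of distractors, an InfoNCE-style contrastive objective that needs no transcripts at all. A toy sketch of that objective in plain NumPy (the array sizes, noise level, and temperature are illustrative; the real model uses quantized targets and a Transformer context network):

```python
import numpy as np

rng = np.random.default_rng(0)

T, D = 50, 16  # timesteps and feature dimension (toy sizes)
latents = rng.normal(size=(T, D))                   # "true" latent features z_t
context = latents + 0.1 * rng.normal(size=(T, D))   # context-network outputs c_t

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def contrastive_loss(masked_idx, n_distractors=10, temp=0.1):
    """InfoNCE over masked timesteps: identify the true latent among distractors."""
    losses = []
    for t in masked_idx:
        others = [i for i in range(T) if i != t]
        distractors = rng.choice(others, n_distractors, replace=False)
        candidates = [latents[t]] + [latents[j] for j in distractors]
        sims = np.array([cosine(context[t], z) for z in candidates]) / temp
        log_probs = sims - np.log(np.exp(sims).sum())  # log-softmax over candidates
        losses.append(-log_probs[0])                   # true latent sits at index 0
    return float(np.mean(losses))

masked = rng.choice(T, size=10, replace=False)
print(contrastive_loss(masked))  # small here, since context closely tracks latents
```

Minimising this loss pushes the context representation at each masked position towards its own latent and away from the distractors, which is how the model learns useful speech features from unlabelled audio before any fine-tuning.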

Impressive results

Evaluation of the models trained on the MMS data revealed impressive results. In a comparison with OpenAI’s Whisper, the MMS models exhibited half the word error rate while covering 11 times more languages.
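Word error rate, the metric behind that comparison, is the word-level edit distance between a hypothesis transcript and the reference, divided by the number of reference words. A minimal implementation (the sample transcripts are made up for illustration):

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + insertions + deletions) / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # Levenshtein distance over words via dynamic programming.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)

# One substitution ("the" -> "a") across six reference words:
print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))  # 0.1666...
```

So "half the word error rate" means the MMS models make roughly half as many word-level mistakes per reference word as the system they were compared against.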

Furthermore, the MMS project successfully built text-to-speech systems for over 1,100 languages. Despite the limitation of having relatively few different speakers for many languages, the speech generated by these systems exhibited high quality.

While the MMS models have shown promising results, it is essential to acknowledge their imperfections. Mistranscriptions or misinterpretations by the speech-to-text model could result in offensive or inaccurate language. The MMS project emphasises collaboration across the AI community to mitigate such risks.

You can read the MMS paper here or find the project on GitHub.

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The event is co-located with Digital Transformation Week.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.
