GPT-4o delivers human-like AI interaction with text, audio, and vision integration

OpenAI has launched its new flagship model, GPT-4o, which seamlessly integrates text, audio, and visual inputs and outputs, promising to enhance the naturalness of machine interactions.

GPT-4o, where the “o” stands for “omni,” is designed to cater to a broader spectrum of input and output modalities. “It accepts as input any combination of text, audio, and image and generates any combination of text, audio, and image outputs,” OpenAI announced.

Users can...

Mixtral 8x22B sets new benchmark for open models

Mistral AI has released Mixtral 8x22B, which sets a new benchmark for open-source models in performance and efficiency. The model offers robust multilingual capabilities and superior mathematical and coding prowess.

Mixtral 8x22B operates as a Sparse Mixture-of-Experts (SMoE) model, utilising just 39 billion of its 141 billion parameters when active.
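As a rough illustration of how sparse routing keeps most parameters idle, the sketch below sends an input to only the top-scoring experts. The expert count (8), the top-2 selection, the toy experts, and the router scores are all assumptions made for illustration and are not taken from Mistral AI's implementation; only the 39 billion and 141 billion figures come from the article.

```python
import math

def top_k_route(gate_logits, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(gate_logits)),
                 key=lambda i: gate_logits[i], reverse=True)[:k]
    # Softmax over the selected logits only, so the weights sum to 1.
    m = max(gate_logits[i] for i in top)
    exps = [math.exp(gate_logits[i] - m) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def moe_forward(x, experts, gate_logits, k=2):
    """Combine outputs of the selected experts; the others are never called."""
    return sum(w * experts[i](x) for i, w in top_k_route(gate_logits, k))

# Hypothetical toy experts: expert i multiplies its input by (i + 1).
experts = [lambda x, s=s: s * x for s in range(1, 9)]
# Made-up router scores for a single input.
gate_logits = [0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.3, 0.2]

routes = top_k_route(gate_logits, k=2)
output = moe_forward(3.0, experts, gate_logits, k=2)

# Share of parameters in use per the article's figures (roughly 28 percent).
active_fraction = 39 / 141
print(routes, output, round(active_fraction, 3))
```

In this toy setup only two of the eight experts run for a given input, which is the general idea behind using a fraction of the total parameters at any one time.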

Beyond its efficiency, the Mixtral 8x22B boasts fluency in multiple major languages including English, French, Italian,...

Hugging Face launches Idefics2 vision-language model

Hugging Face has announced the release of Idefics2, a versatile model capable of understanding and generating text responses based on both images and text. The model sets a new benchmark for answering visual questions, describing visual content, creating stories from images, extracting information from documents, and even performing arithmetic operations based on visual input.

Idefics2 leapfrogs its predecessor, Idefics1, with just eight billion parameters and the versatility afforded by...

Anthropic says Claude 3 Haiku is the fastest model in its class

Anthropic has released Claude 3 Haiku, the fastest and most affordable AI model in its intelligence class. Boasting state-of-the-art vision capabilities and strong performance on industry benchmarks, Haiku is touted as a versatile solution for a wide range of enterprise applications.

The model is now available alongside Anthropic's Sonnet and Opus models in the Claude API and on Claude.ai for Claude Pro subscribers.

"Speed is essential for our enterprise users who need...

Stability AI previews Stable Diffusion 3 text-to-image model

London-based AI lab Stability AI has announced an early preview of its new text-to-image model, Stable Diffusion 3. The advanced generative AI model aims to create high-quality images from text prompts with improved performance across several key areas.

The announcement comes just days after Stability AI’s largest rival, OpenAI, unveiled Sora, a brand-new AI model capable of generating nearly realistic, high-definition videos from simple text prompts.

Sora, which...

Google pledges to fix Gemini’s inaccurate and biased image generation

Google's Gemini model has come under fire for producing historically inaccurate and racially skewed images, reigniting concerns about bias in AI systems.

The controversy arose as users on social media platforms flooded feeds with examples of Gemini generating pictures depicting racially diverse Nazis, black medieval English kings, and other improbable scenarios.

Google Gemini Image generation model receives criticism for being 'Woke'....

Reddit is reportedly selling data for AI training

Reddit has negotiated a content licensing deal to allow its data to be used for training AI models, according to a Bloomberg report.

Just ahead of a potential $5 billion initial public offering (IPO) in March, Reddit has reportedly signed a $60 million deal with an undisclosed major AI company. The move could be seen as a last-minute effort to show prospective investors potential revenue streams in the rapidly growing AI industry.

Although Reddit has yet to...

Google launches Gemini to replace Bard chatbot

Google has launched its AI chatbot called Gemini, which replaces its short-lived Bard service.

Unveiled in February 2023, Bard was touted as a competitor to chatbots like ChatGPT but failed to impress in demos. Google staff even called the launch “botched” and slammed CEO Sundar Pichai.

Google says the rebranded Gemini represents the company's "most capable family of models" for natural conversations. Two experiences are being launched: Gemini Advanced and a mobile...

OpenAI releases new models and lowers API pricing

OpenAI has announced several updates that will benefit developers using its AI services, including new embedding models, a lower price for GPT-3.5 Turbo, an updated GPT-4 Turbo preview, and more robust content moderation capabilities.

The San Francisco-based AI lab said its new text-embedding-3-small and text-embedding-3-large models offer upgraded performance over previous generations. For example, text-embedding-3-large achieves average scores of 54.9 percent on the MIRACL...

Microsoft unveils 2.7B parameter language model Phi-2

Microsoft’s 2.7 billion-parameter model Phi-2 showcases outstanding reasoning and language understanding capabilities, setting a new standard for performance among base language models with fewer than 13 billion parameters.

Phi-2 builds upon the success of its predecessors, Phi-1 and Phi-1.5, by matching or surpassing models up to 25 times larger—thanks to innovations in model scaling and training data curation.

The compact size of Phi-2 makes it an ideal playground...