Meta unveils five AI models for multi-modal processing, music generation, and more (19 June 2024)

Meta has unveiled five major new AI models and research, including multi-modal systems that can process both text and images, next-gen language models, music generation, AI speech detection, and efforts to improve diversity in AI systems.

The releases come from Meta’s Fundamental AI Research (FAIR) team, which has focused on advancing AI through open research and collaboration for over a decade. As AI innovation accelerates, Meta believes working with the global community is crucial.

“By publicly sharing this research, we hope to inspire iterations and ultimately help advance AI in a responsible way,” said Meta.

Chameleon: Multi-modal text and image processing

Among the releases are key components of Meta’s ‘Chameleon’ models under a research license. Chameleon is a family of multi-modal models that can understand and generate both text and images simultaneously—unlike most large language models which are typically unimodal.

“Just as humans can process the words and images simultaneously, Chameleon can process and deliver both image and text at the same time,” explained Meta. “Chameleon can take any combination of text and images as input and also output any combination of text and images.”

Potential use cases are virtually limitless, from generating creative captions to prompting new scenes with text and images.

Multi-token prediction for faster language model training

Meta has also released pretrained models for code completion that use ‘multi-token prediction’ under a non-commercial research license. Traditional language model training is inefficient because the model learns to predict only the single next word; multi-token models predict multiple future words simultaneously, which speeds up training.

“While [the one-word] approach is simple and scalable, it’s also inefficient. It requires several orders of magnitude more text than what children need to learn the same degree of language fluency,” said Meta.
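Meta implements this with additional prediction heads on a transformer; as a minimal illustration of the data side only (the function names and token sequence below are invented for this sketch, not Meta’s code), the two toy functions contrast the training targets each scheme extracts from the same sequence:

```python
def next_token_targets(tokens):
    # Standard training: each prefix predicts exactly one future token.
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

def multi_token_targets(tokens, n=4):
    # Multi-token training: each prefix predicts the next n tokens at once,
    # giving the model a denser learning signal per training example.
    return [(tokens[:i], tokens[i:i + n])
            for i in range(1, len(tokens) - n + 1)]

toks = ["def", "add", "(", "a", ",", "b", ")", ":"]
print(next_token_targets(toks)[0])   # (['def'], 'add')
print(multi_token_targets(toks)[0])  # (['def'], ['add', '(', 'a', ','])
```

The denser targets are why a multi-token model can extract more supervision from the same amount of text.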

JASCO: Enhanced text-to-music model

On the creative side, Meta’s JASCO allows generating music clips from text while affording more control by accepting inputs like chords and beats.

“While existing text-to-music models like MusicGen rely mainly on text inputs for music generation, our new model, JASCO, is capable of accepting various inputs, such as chords or beat, to improve control over generated music outputs,” explained Meta.

AudioSeal: Detecting AI-generated speech

Meta claims AudioSeal is the first audio watermarking system designed to detect AI-generated speech. It can pinpoint the specific segments generated by AI within larger audio clips up to 485x faster than previous methods.

“AudioSeal is being released under a commercial license. It’s just one of several lines of responsible research we have shared to help prevent the misuse of generative AI tools,” said Meta.
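AudioSeal’s actual detector operates on raw waveforms; purely to illustrate the localization idea (the function, scores, and threshold below are invented for this sketch and are not AudioSeal’s API), one can group per-frame detector scores into flagged segments:

```python
def flag_segments(scores, threshold=0.5):
    """Group consecutive frames whose watermark-detector score exceeds
    the threshold into (start, end) segments, end index exclusive."""
    segments, start = [], None
    for i, s in enumerate(scores):
        if s > threshold and start is None:
            start = i                      # a flagged segment begins
        elif s <= threshold and start is not None:
            segments.append((start, i))    # the segment ends
            start = None
    if start is not None:
        segments.append((start, len(scores)))
    return segments

# Invented per-frame scores: frames 2-4 and 6-7 look AI-generated.
scores = [0.1, 0.2, 0.9, 0.95, 0.8, 0.1, 0.7, 0.9]
print(flag_segments(scores))  # [(2, 5), (6, 8)]
```

This is the sense in which a detector can pinpoint AI-generated segments within a longer clip rather than giving one verdict for the whole file.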

Improving text-to-image diversity

Another important release aims to improve the diversity of text-to-image models which can often exhibit geographical and cultural biases.

Meta developed automatic indicators to evaluate potential geographical disparities and conducted a large 65,000+ annotation study to understand how people globally perceive geographic representation.

“This enables more diversity and better representation in AI-generated images,” said Meta. The relevant code and annotations have been released to help improve diversity across generative models.

By publicly sharing these groundbreaking models, Meta says it hopes to foster collaboration and drive innovation within the AI community.

(Photo by Dima Solomin)

See also: NVIDIA presents latest advancements in visual AI

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including Intelligent Automation Conference, BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

AI comes to Ireland’s remote Islands through Microsoft’s ‘Skill Up’ program (19 June 2024)

On Inishbofin, a small island off the western coast of Ireland where the population hovers around 170 and the main industries are farming, fishing and tourism, a quiet technology revolution has been taking place.

Artificial intelligence (AI), once thought to be the exclusive domain of big cities and tech hubs, is making its way to the furthest corners of rural Ireland, empowering locals with cutting-edge tools to boost their businesses and preserve their traditional crafts.

It is all part of Microsoft’s ambitious ‘Skill Up Ireland’ initiative, which aims to provide every person in Ireland with the opportunity to learn AI skills. The program has partnered with the Irish government and various organisations to deliver AI training and resources to communities across the country, leaving no one behind in the era of rapid technological advancement.

One recent beneficiary of this program is Andrew Murray, the general manager of the 22-room Doonmore Hotel on Inishbofin. A native of the island, Murray comes from a family that has lived on Inishbofin for generations, with his parents founding the hotel in 1969. Despite the remote location, Murray is eager to embrace AI as a tool to streamline his operations and save time.

“What I’m interested in the most is the power of AI to save time for people like me,” Murray said. “Because time is the most precious thing we have, and it’s finite. There are only 24 hours in a day.”

Through an AI introduction class, Murray discovered the possibilities of tools such as Microsoft Copilot, an AI-powered assistant for everything from scheduling to data analysis to creating content. He intends to use these tools to oversee things like scheduling staff and inventory management as well as invoicing and pricing – tasks that he has normally spent hours, if not days, doing completely manually.

But Murray is not alone in his enthusiasm for AI on Inishbofin. Catherine O’Connor, a weaver who draws inspiration from the island’s natural colors and textures, has also embraced the technology. Initially wary of the AI training, O’Connor quickly became “absorbed by it” once she realised its potential to help her market her handmade scarves, table runners, and wall hangings.

“Every piece has a story behind it,” O’Connor explained. “You can get a scarf at the five-and-dime store, but a handmade scarf takes hours and hours to make. It’s a totally different level. So you have to find the right words to use.”

Now, with the help of Copilot, O’Connor can write engaging descriptions of her creations to market her craft on a proper e-commerce platform, helping people understand her work more accurately and visualise each creation.

Another Copilot user, Inishbofin-based florist Patricia Concannon, also plans to use Copilot to make her website and Instagram captions more engaging, which should help her reach new customers and attract a wider audience for her floral displays.

The AI training on Inishbofin is just one element of Microsoft’s wider ‘Skill Up Ireland’ programme aimed at upskilling and reskilling people across Ireland, which includes Dream Space, an immersive learning experience that introduces STEM and AI skills to every one of the country’s one million students and their teachers.

Kevin Marshall, head of Learning & Skills for Microsoft Ireland, said the rapid growth in the prevalence of AI in the last few years has necessitated upskilling and reskilling programmes. He continued: “At the same time, with the explosion of generative AI in the last 18 months, there’s a real need to educate people on what this is, to show them that it’s not black magic.”

The challenge, however, lies in the ever-evolving nature of AI technology. “The teaching is non-invasive, it’s collaborative,” Marshall explained. “The programs teach the basic foundations and core principles of AI. Here’s what it can do. Here are the risks and the ethical issues. Here are the opportunities. And here’s where you go play with it.”

Programmes like ‘Skill Up Ireland’ are an opportunity for rural communities like Inishbofin not to be left behind through the digital divide as AI significantly impacts industries and the way that we live and work. Audrey Murray, a felt artist and teaching assistant on the island, summed it up: “AI has to be another step, I suppose, bringing us closer to the world and bringing the world here.”

And with Microsoft’s promise of creating AI skills for all in Ireland, the remote extremities of the Emerald Isle are on the brink of being catapulted into the future, where the very latest technologies are melded with ancient skills and lifeways. Meanwhile, for the inhabitants of Inishbofin, the opportunities are yet to reveal themselves.

NVIDIA presents latest advancements in visual AI (17 June 2024)

NVIDIA researchers are presenting new visual generative AI models and techniques at the Computer Vision and Pattern Recognition (CVPR) conference this week in Seattle. The advancements span areas like custom image generation, 3D scene editing, visual language understanding, and autonomous vehicle perception.

“Artificial intelligence, and generative AI in particular, represents a pivotal technological advancement,” said Jan Kautz, VP of learning and perception research at NVIDIA.

“At CVPR, NVIDIA Research is sharing how we’re pushing the boundaries of what’s possible — from powerful image generation models that could supercharge professional creators to autonomous driving software that could help enable next-generation self-driving cars.”

Among the over 50 NVIDIA research projects being presented, two papers have been selected as finalists for CVPR’s Best Paper Awards – one exploring the training dynamics of diffusion models and another on high-definition maps for self-driving cars.

Additionally, NVIDIA has won the CVPR Autonomous Grand Challenge’s End-to-End Driving at Scale track, outperforming over 450 entries globally. This milestone demonstrates NVIDIA’s pioneering work in using generative AI for comprehensive self-driving vehicle models, also earning an Innovation Award from CVPR.

One of the headlining research projects is JeDi, a new technique that allows creators to rapidly customise diffusion models – the leading approach for text-to-image generation – to depict specific objects or characters using just a few reference images, rather than the time-intensive process of fine-tuning on custom datasets.

Another breakthrough is FoundationPose, a new foundation model that can instantly understand and track the 3D pose of objects in videos without per-object training. It set a new performance record and could unlock new AR and robotics applications.

NVIDIA researchers also introduced NeRFDeformer, a method to edit the 3D scene captured by a Neural Radiance Field (NeRF) using a single 2D snapshot, rather than having to manually reanimate changes or recreate the NeRF entirely. This could streamline 3D scene editing for graphics, robotics, and digital twin applications.

On the visual language front, NVIDIA collaborated with MIT to develop VILA, a new family of vision language models that achieve state-of-the-art performance in understanding images, videos, and text. With enhanced reasoning capabilities, VILA can even comprehend internet memes by combining visual and linguistic understanding.

NVIDIA’s visual AI research spans numerous industries, including over a dozen papers exploring novel approaches for autonomous vehicle perception, mapping, and planning. Sanja Fidler, VP of NVIDIA’s AI Research team, is presenting on the potential of vision language models for self-driving cars.

The breadth of NVIDIA’s CVPR research exemplifies how generative AI could empower creators, accelerate automation in manufacturing and healthcare, and propel autonomy and robotics forward.

(Photo by v2osk)

See also: NLEPs: Bridging the gap between LLMs and symbolic reasoning

NLEPs: Bridging the gap between LLMs and symbolic reasoning (14 June 2024)

Researchers have introduced a novel approach called natural language embedded programs (NLEPs) to improve the numerical and symbolic reasoning capabilities of large language models (LLMs). The technique involves prompting LLMs to generate and execute Python programs to solve user queries, then output solutions in natural language.

While LLMs like ChatGPT have demonstrated impressive performance on various tasks, they often struggle with problems requiring numerical or symbolic reasoning.

NLEPs follow a four-step problem-solving template: calling necessary packages, importing natural language representations of required knowledge, implementing a solution-calculating function, and outputting results as natural language with optional data visualisation.
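The paper’s actual prompts are not reproduced here, but a hand-written toy program in that four-part shape (the query, knowledge values, and function below are all invented for illustration) might look like:

```python
# Toy NLEP for the query: "Which planet has more moons, Mars or Neptune?"

# Step 1: call necessary packages (none beyond the standard library here).

# Step 2: import natural-language knowledge as structured data.
knowledge = {"Mars": 2, "Neptune": 14}  # moon counts, stated as data

# Step 3: implement a function that calculates the solution.
def solve(moons):
    return max(moons, key=moons.get)

# Step 4: output the result as natural language.
answer = solve(knowledge)
print(f"{answer} has more moons: {knowledge[answer]} versus {min(knowledge.values())}.")
```

Because the reasoning lives in an inspectable program rather than hidden model activations, a user can read Step 3, spot a mistake, and rerun only the program.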

This approach offers several advantages, including improved accuracy, transparency, and efficiency. Users can investigate generated programs and fix errors directly, avoiding the need to rerun entire models for troubleshooting. Additionally, a single NLEP can be reused for multiple tasks by replacing certain variables.

The researchers found that NLEPs enabled GPT-4 to achieve over 90% accuracy on various symbolic reasoning tasks, outperforming task-specific prompting methods by 30%.

Beyond accuracy improvements, NLEPs could enhance data privacy by running programs locally, eliminating the need to send sensitive user data to external companies for processing. The technique may also boost the performance of smaller language models without costly retraining.

However, NLEPs rely on a model’s program generation capability and may not work as well with smaller models trained on limited datasets. Future research will explore methods to make smaller LLMs generate more effective NLEPs and investigate the impact of prompt variations on reasoning robustness.

The research, supported in part by the Center for Perceptual and Interactive Intelligence of Hong Kong, will be presented at the Annual Conference of the North American Chapter of the Association for Computational Linguistics later this month.

(Photo by Alex Azabache)

See also: Apple is reportedly getting free ChatGPT access

Gil Pekelman, Atera: How businesses can harness the power of AI (28 May 2024)

TechForge recently caught up with Gil Pekelman, CEO of all-in-one IT management platform, Atera, to discuss how AI is becoming the IT professionals’ number one companion.

Can you tell us a little bit about Atera and what it does?

We launched the Atera all-in-one platform for IT management in 2016, so quite a few years ago. And it’s very broad. It’s everything from technical things like patching and security to ongoing support, alerts, automations, ticket management, reports, and analytics, etc. 

Atera is a single platform that manages all your IT in a single pane of glass. The power of it – and we’re the only company that does this – is it’s a single codebase and single database for all of that. The alternative, for many years now, has been to buy four or five different products, and have them all somehow connected, which is usually very difficult. 

Here, the fact is it’s a single codebase and a single database. Everything is connected and streamlined and very intuitive. So, in essence, you sign up or start a trial and within five minutes, you’re already running with it and onboarding. It’s that intuitive.

We have 12,000+ customers in 120 countries around the world. The UK is our second-largest country in terms of business, currently. The US is the first, but the UK is right behind them.

What are the latest trends you’re seeing develop in AI this year?

From the start, we’ve been dedicated to integrating AI into our company’s DNA. Our goal has always been to use data to identify problems and alert humans so they can fix or avoid issues. Initially, we focused on leveraging data to provide solutions.

Over the past nine years, we’ve aimed to let AI handle mundane IT tasks, freeing up professionals for more engaging work. With early access to ChatGPT and OpenAI tools a year and a half ago, we’ve been pioneering a new trend we call Action AI.

Unlike generic Generative AI, which creates content like songs or emails, Action AI operates in the real world, interacting with hardware and software to perform tasks autonomously. Our AI can understand IT problems and resolve them on its own, moving beyond mere dialogue to real-world action.

Atera offers Copilot and Autopilot. Could you explain what these are?

Autopilot is autonomous. It understands a problem you might have on your computer. It’s a widget on your computer, and it will communicate with you and fix the problem autonomously. However, it has boundaries on what it’s allowed to fix and what it’s not allowed to fix. And everything it’s allowed to deal with has to be bulletproof. 100% secure or private. No opportunity to do any damage or anything like that. 

So if a ticket is opened up, or a complaint is raised, if it’s outside of these boundaries, it will then activate the Copilot. The Copilot augments the IT professional.

They’re both companions. The Autopilot is a companion that takes away password resets, printer issues, installs software, etc. – mundane and repetitive issues – and the Copilot is a companion that will help the IT professional deal with the issues they deal with on a day-to-day basis. And it has all kinds of different tools. 

The Copilot is very elaborate. If you have a problem, you can ask it and it will not only give you an answer like ChatGPT, but it will research and run all kinds of tests on the network, the computer, and the printer, and it will come to a conclusion, and create the action that is required to solve it. But it won’t solve it. It will still leave that to the IT professional to think about the different information and decide what they want to do. 

Copilot can save IT professionals nearly half of their workday. While it’s been tested in the field for some time, we’re excited to officially launch it now. Meanwhile, Autopilot is still in the beta phase.

What advice would you give to any companies that are thinking about integrating AI technologies into their business operations?

I strongly recommend that companies begin integrating AI technologies immediately, but it is crucial to research and select the right and secure generative AI tools. Incorporating AI offers numerous advantages: it automates routine tasks, enhances efficiency and productivity, improves accuracy by reducing human error, and speeds up problem resolution. That being said, it’s important to pick the right generative AI tool to help you reap the benefits without compromising on security. For example, with our collaboration with Microsoft, our customers’ data is secure—it stays within the system, and the AI doesn’t use it for training or expanding its database. This ensures safety while delivering substantial benefits.

Our incorporation of AI into our product focuses on two key aspects. First, your IT team no longer has to deal with mundane, frustrating tasks. Second, for end users, issues like non-working printers, forgotten passwords, or slow internet are resolved in seconds or minutes instead of hours. This provides a measurable and significant improvement in efficiency.

There are all kinds of AIs out there. Some of them are more beneficial, some are less. Some are just ChatGPT in disguise, with only a very thin layer on top. What we do literally changes the whole interaction with IT. And we know, when IT has a problem things stop working, and you stop working. Our solution ensures everything keeps running smoothly.

What can we expect from AI over the next few years?

AI is set to become significantly more intelligent and aware. One remarkable development is its growing ability to reason, predict, and understand data. This capability enables AI to foresee issues and autonomously resolve them, showcasing an astonishing level of reasoning.

We anticipate a dual advancement: a rapid acceleration in AI’s intelligence and a substantial enhancement in its empathetic interactions, as demonstrated in the latest OpenAI release. This evolution will transform how humans engage with AI.

Our work exemplifies this shift. When non-technical users interact with our software to solve problems, AI responds with a highly empathetic, human-like approach. Users feel as though they are speaking to a real IT professional, ensuring a seamless and comforting experience.

As AI continues to evolve, it will become increasingly powerful and capable. Recent breakthroughs in understanding AI’s mechanisms will not only enhance its functionality but also ensure its security and ethical use, reinforcing its role as a force for good.

What plans does Atera have for the next year?

We are excited to announce the upcoming launch of Autopilot, scheduled for release in a few months. While Copilot, our comprehensive suite of advanced tools designed specifically for IT professionals, has already been instrumental in enhancing efficiency and effectiveness, Autopilot represents the next significant advancement.

Currently in beta (so anyone who wants to try it already can), Autopilot directly interacts with end users, automating and resolving common IT issues that typically burden IT staff, such as password resets and printer malfunctions. By addressing these routine tasks, Autopilot allows IT professionals to focus on more strategic and rewarding activities, ultimately improving overall productivity and job satisfaction.

For more information, visit atera.com

Atera is a sponsor of TechEx North America 2024 on June 5-6 in Santa Clara, US. Visit the Atera team at booth 237 for a personalised demo, or to test your IT skills with the company’s first-of-its-kind AIT game, APOLLO IT, for a chance to win a prize.

Google ushers in the “Gemini era” with AI advancements (15 May 2024)

Google has unveiled a series of updates to its AI offerings, including the introduction of Gemini 1.5 Flash, enhancements to Gemini 1.5 Pro, and progress on Project Astra, its vision for the future of AI assistants.

Gemini 1.5 Flash is a new addition to Google’s family of models, designed to be faster and more efficient to serve at scale. While lighter-weight than the 1.5 Pro, it retains the ability for multimodal reasoning across vast amounts of information and features the breakthrough long context window of one million tokens.

“1.5 Flash excels at summarisation, chat applications, image and video captioning, data extraction from long documents and tables, and more,” explained Demis Hassabis, CEO of Google DeepMind. “This is because it’s been trained by 1.5 Pro through a process called ‘distillation,’ where the most essential knowledge and skills from a larger model are transferred to a smaller, more efficient model.”
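Google has not published its training details, but the core idea of distillation can be sketched with a toy loss: the student is trained to match the teacher’s full output distribution rather than a single correct answer. The probability values below are invented for illustration.

```python
import math

def kl_divergence(teacher_probs, student_probs):
    """KL(teacher || student): a common distillation loss, driven toward
    zero as the student's distribution approaches the teacher's."""
    return sum(t * math.log(t / s)
               for t, s in zip(teacher_probs, student_probs) if t > 0)

# Invented next-token distributions over a three-token vocabulary.
teacher = [0.7, 0.2, 0.1]  # larger model's soft predictions
student = [0.5, 0.3, 0.2]  # smaller model's predictions

print(round(kl_divergence(teacher, student), 4))  # 0.0851
```

Training against these soft targets transfers more information per example than hard labels, which is one reason a distilled model can stay capable while being cheaper to serve.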

Meanwhile, Google has significantly improved the capabilities of its Gemini 1.5 Pro model, extending its context window to a groundbreaking two million tokens. Enhancements have been made to its code generation, logical reasoning, multi-turn conversation, and audio and image understanding capabilities.

The company has also integrated Gemini 1.5 Pro into Google products, including the Gemini Advanced and Workspace apps. Additionally, Gemini Nano now understands multimodal inputs, expanding beyond text-only to include images.

Google announced its next generation of open models, Gemma 2, designed for breakthrough performance and efficiency. The Gemma family is also expanding with PaliGemma, the company’s first vision-language model inspired by PaLI-3.

Finally, Google shared progress on Project Astra (advanced seeing and talking responsive agent), its vision for the future of AI assistants. The company has developed prototype agents that can process information faster, understand context better, and respond quickly in conversation.

“We’ve always wanted to build a universal agent that will be useful in everyday life. Project Astra shows multimodal understanding and real-time conversational capabilities,” explained Google CEO Sundar Pichai.

“With technology like this, it’s easy to envision a future where people could have an expert AI assistant by their side, through a phone or glasses.”

Google says that some of these capabilities will be coming to its products later this year. Developers can find all of the Gemini-related announcements they need here.

See also: GPT-4o delivers human-like AI interaction with text, audio, and vision integration

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including Intelligent Automation Conference, BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post Google ushers in the “Gemini era” with AI advancements appeared first on AI News.

UAE unveils new AI model to rival big tech giants https://www.artificialintelligence-news.com/2024/05/15/uae-unveils-new-ai-model-to-rival-big-tech-giants/ https://www.artificialintelligence-news.com/2024/05/15/uae-unveils-new-ai-model-to-rival-big-tech-giants/#respond Wed, 15 May 2024 09:53:41 +0000 https://www.artificialintelligence-news.com/?p=14818 The UAE is making big waves by launching a new open-source generative AI model. This step, taken by a government-backed research institute, is turning heads and marking the UAE as a formidable player in the global AI race. In Abu Dhabi, the Technology Innovation Institute (TII) unveiled the Falcon 2 series. As reported by Reuters, this series includes Falcon 2 11B,... Read more »

The post UAE unveils new AI model to rival big tech giants appeared first on AI News.

The UAE is making big waves by launching a new open-source generative AI model. This step, taken by a government-backed research institute, is turning heads and marking the UAE as a formidable player in the global AI race.

In Abu Dhabi, the Technology Innovation Institute (TII) unveiled the Falcon 2 series. As reported by Reuters, this series includes Falcon 2 11B, a text-based model, and Falcon 2 11B VLM, a vision-to-language model capable of generating text descriptions from images. TII is run by Abu Dhabi’s Advanced Technology Research Council.

As a major oil exporter and a key player in the Middle East, the UAE is investing heavily in AI. This strategy has caught the eye of U.S. officials, leading to tensions over whether to use American or Chinese technology. In a move coordinated with Washington, Emirati AI firm G42 withdrew from Chinese investments and replaced Chinese hardware, securing a US$1.5 billion investment from Microsoft.

Faisal Al Bannai, Secretary General of the Advanced Technology Research Council and an adviser on strategic research and advanced technology, proudly states that the UAE is proving itself a major player in AI. The release of the Falcon 2 series is part of a broader race among nations and companies to develop proprietary large language models. While some opt to keep their AI code private, the UAE, like Meta with its Llama models, is making its groundbreaking work accessible to all.

Al Bannai is also excited about the upcoming Falcon 3 generation and expresses confidence in the UAE’s ability to compete globally: “We’re very proud that we can still punch way above our weight, really compete with the best players globally.”

Reflecting on his earlier statements this year, Al Bannai emphasised that the UAE’s decisive advantage lies in its ability to make swift strategic decisions.

It’s worth noting that Abu Dhabi’s ruling family controls some of the world’s largest sovereign wealth funds, worth about US$1.5 trillion. These funds, long used to diversify the UAE’s oil wealth, are now critical for accelerating growth in AI and other cutting-edge technologies. In fact, the UAE is emerging as a key player in producing advanced computer chips essential for training powerful AI systems. According to The Wall Street Journal, OpenAI CEO Sam Altman met with investors, including Sheik Tahnoun bin Zayed Al Nahyan, who runs Abu Dhabi’s major sovereign wealth fund, to discuss a potential US$7 trillion investment to develop an AI chipmaker to compete with Nvidia.

Furthermore, the UAE’s commitment to generative AI is evident in its recent launch of a ‘Generative AI’ guide. This guide aims to unlock AI’s potential in various fields, including education, healthcare, and media. It provides a detailed overview of generative AI, addressing digital technologies’ challenges and opportunities while emphasising data privacy. The guide is designed to help government agencies and the wider community leverage AI technologies, demonstrating 100 practical AI use cases for entrepreneurs, students, job seekers, and tech enthusiasts.

This proactive stance showcases the UAE’s commitment to participating in and leading the global AI race, positioning it as a nation to watch in the rapidly evolving tech scene.

GPT-4o delivers human-like AI interaction with text, audio, and vision integration https://www.artificialintelligence-news.com/2024/05/14/gpt-4o-human-like-ai-interaction-text-audio-vision-integration/ https://www.artificialintelligence-news.com/2024/05/14/gpt-4o-human-like-ai-interaction-text-audio-vision-integration/#respond Tue, 14 May 2024 12:43:56 +0000 https://www.artificialintelligence-news.com/?p=14811 OpenAI has launched its new flagship model, GPT-4o, which seamlessly integrates text, audio, and visual inputs and outputs, promising to enhance the naturalness of machine interactions. GPT-4o, where the “o” stands for “omni,” is designed to cater to a broader spectrum of input and output modalities. “It accepts as input any combination of text, audio,... Read more »

The post GPT-4o delivers human-like AI interaction with text, audio, and vision integration appeared first on AI News.

OpenAI has launched its new flagship model, GPT-4o, which seamlessly integrates text, audio, and visual inputs and outputs, promising to enhance the naturalness of machine interactions.

GPT-4o, where the “o” stands for “omni,” is designed to cater to a broader spectrum of input and output modalities. “It accepts as input any combination of text, audio, and image and generates any combination of text, audio, and image outputs,” OpenAI announced.

Users can expect responses in as little as 232 milliseconds, with an average of 320 milliseconds, similar to human response times in conversation.

Pioneering capabilities

The introduction of GPT-4o marks a leap from its predecessors by processing all inputs and outputs through a single neural network. This approach enables the model to retain critical information and context that were previously lost in the separate model pipeline used in earlier versions.

Prior to GPT-4o, ‘Voice Mode’ could handle audio interactions with latencies of 2.8 seconds for GPT-3.5 and 5.4 seconds for GPT-4. The previous setup involved three distinct models: one for transcribing audio to text, another for textual responses, and a third for converting text back to audio. This segmentation led to loss of nuances such as tone, multiple speakers, and background noise.
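The information loss in that cascaded design follows directly from its architecture: each stage boundary passes plain text, so everything the text cannot carry is discarded. The toy sketch below (stand-in functions only, not OpenAI's implementation) illustrates the bottleneck:

```python
# Illustrative only: the pre-GPT-4o cascaded Voice Mode as three
# separate stages. Stage boundaries exchange plain text, so prosody
# (tone, multiple speakers, background noise) never survives stage 1.

def transcribe(audio: dict) -> str:
    # Stage 1: audio -> text. Everything except the words is dropped.
    return audio["words"]

def respond(text: str) -> str:
    # Stage 2: text -> text (stand-in for the language model).
    return f"You said: {text}"

def synthesise(text: str) -> dict:
    # Stage 3: text -> audio, with no access to the original delivery.
    return {"words": text, "tone": "default"}

audio_in = {"words": "hello", "tone": "excited"}
audio_out = synthesise(respond(transcribe(audio_in)))
# The input tone never reaches the output stage.
```

A single end-to-end network, by contrast, can condition its audio output on the audio input directly, which is why GPT-4o can react to tone and produce expressive speech.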

As an integrated solution, GPT-4o boasts notable improvements in vision and audio understanding. It can perform more complex tasks such as harmonising songs, providing real-time translations, and even generating outputs with expressive elements like laughter and singing. Examples of its broad capabilities include preparing for interviews, translating languages on the fly, and generating customer service responses.

Nathaniel Whittemore, Founder and CEO of Superintelligent, commented: “Product announcements are going to inherently be more divisive than technology announcements because it’s harder to tell if a product is going to be truly different until you actually interact with it. And especially when it comes to a different mode of human-computer interaction, there is even more room for diverse beliefs about how useful it’s going to be.

“That said, the fact that there wasn’t a GPT-4.5 or GPT-5 announced is also distracting people from the technological advancement that this is a natively multimodal model. It’s not a text model with a voice or image addition; it is a multimodal token in, multimodal token out. This opens up a huge array of use cases that are going to take some time to filter into the consciousness.”

Performance and safety

GPT-4o matches GPT-4 Turbo performance levels in English text and coding tasks but significantly outperforms it in non-English languages, making it a more inclusive and versatile model. It sets a new benchmark in reasoning with a high score of 88.7% on 0-shot CoT MMLU (general knowledge questions) and 87.2% on 5-shot no-CoT MMLU.

The model also excels in audio and translation benchmarks, surpassing previous state-of-the-art models like Whisper-v3. In multilingual and vision evaluations, it demonstrates superior performance, enhancing OpenAI’s multilingual, audio, and vision capabilities.

OpenAI has built robust safety measures into GPT-4o by design, including techniques to filter training data and refine behaviour through post-training safeguards. The model has been assessed through a Preparedness Framework and complies with OpenAI’s voluntary commitments. Evaluations in areas like cybersecurity, persuasion, and model autonomy indicate that GPT-4o does not exceed a ‘Medium’ risk level in any category.

Further safety assessments involved extensive external red teaming with over 70 experts in various domains, including social psychology, bias, fairness, and misinformation. This comprehensive scrutiny aims to mitigate risks introduced by the new modalities of GPT-4o.

Availability and future integration

Starting today, GPT-4o’s text and image capabilities are available in ChatGPT—including a free tier and extended features for Plus users. A new Voice Mode powered by GPT-4o will enter alpha testing within ChatGPT Plus in the coming weeks.

Developers can access GPT-4o through the API for text and vision tasks, benefiting from its doubled speed, halved price, and enhanced rate limits compared to GPT-4 Turbo.

OpenAI plans to expand GPT-4o’s audio and video functionalities to a select group of trusted partners via the API, with broader rollout expected in the near future. This phased release strategy aims to ensure thorough safety and usability testing before making the full range of capabilities publicly available.

“It’s hugely significant that they’ve made this model available for free to everyone, as well as making the API 50% cheaper. That is a massive increase in accessibility,” explained Whittemore.

OpenAI invites community feedback to continuously refine GPT-4o, emphasising the importance of user input in identifying and closing gaps where GPT-4 Turbo might still outperform.

(Image Credit: OpenAI)

See also: OpenAI takes steps to boost AI-generated content transparency

Intel’s Aurora achieves exascale to become the fastest AI system https://www.artificialintelligence-news.com/2024/05/13/intel-aurora-exascale-to-become-fastest-ai-system/ https://www.artificialintelligence-news.com/2024/05/13/intel-aurora-exascale-to-become-fastest-ai-system/#respond Mon, 13 May 2024 13:13:09 +0000 https://www.artificialintelligence-news.com/?p=14805 Intel, in collaboration with Argonne National Laboratory and Hewlett Packard Enterprise (HPE), has revealed that its Aurora supercomputer has exceeded the exascale computing threshold reaching speeds of 1.012 exaflops to become the fastest AI-focused system. “The Aurora supercomputer surpassing exascale will allow it to pave the road to tomorrow’s discoveries,” said Ogi Brkic, Intel’s VP... Read more »

The post Intel’s Aurora achieves exascale to become the fastest AI system appeared first on AI News.

Intel, in collaboration with Argonne National Laboratory and Hewlett Packard Enterprise (HPE), has revealed that its Aurora supercomputer has exceeded the exascale computing threshold reaching speeds of 1.012 exaflops to become the fastest AI-focused system.

“The Aurora supercomputer surpassing exascale will allow it to pave the road to tomorrow’s discoveries,” said Ogi Brkic, Intel’s VP and GM of Data Centre AI Solutions. “From understanding climate patterns to unravelling the mysteries of the universe, supercomputers serve as a compass guiding us toward solving truly difficult scientific challenges that may improve humanity.”

Aurora has not only excelled in speed but also in innovation and utility. Designed from the outset as an AI-centric supercomputer, Aurora enables researchers to leverage generative AI models, significantly accelerating scientific discovery.

Groundbreaking work has already been achieved using Aurora, including mapping the 80 billion neurons of the human brain, enhancing high-energy particle physics with deep learning, and accelerating drug design and discovery through machine learning.

At the core of Aurora lies the Intel Data Center GPU Max Series, built on the innovative Intel Xe GPU architecture, optimised for both AI and HPC tasks. This technological foundation enables parallel processing capabilities, crucial for handling complex neural network AI computations.

Details shared about the supercomputer highlight its grand scale, consisting of 166 racks, 10,624 compute blades, 21,248 Intel Xeon CPU Max Series processors, and 63,744 Intel Data Center GPU Max Series units, making it the largest GPU cluster worldwide.
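The published counts are internally consistent and imply a regular layout, which a quick arithmetic check confirms:

```python
# Consistency check on the Aurora component counts quoted above.
racks, blades = 166, 10_624
cpus, gpus = 21_248, 63_744

blades_per_rack = blades / racks   # 64 blades per rack
cpus_per_blade = cpus / blades     # 2 Xeon CPU Max processors per blade
gpus_per_blade = gpus / blades     # 6 GPU Max units per blade
```

Each compute blade therefore pairs two CPUs with six GPUs, and the racks each carry 64 blades.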

Alongside the hardware, Intel’s suite of software tools – including the Intel® oneAPI DPC++/C++ Compiler and an array of performance libraries – boosts developer flexibility and system scalability.

Intel is also expanding its Tiber Developer Cloud, incorporating new state-of-the-art hardware and enhanced service capabilities to support the evaluation, innovation, and optimisation of AI models and workloads on a large scale.

Looking forward, the deployment of new supercomputers integrated with Intel technologies is expected to transform various scientific fields. Systems like CMCC’s Cassandra will advance climate change modelling, while ENEA’s CRESCO 8 supports breakthroughs in fusion energy, among others, underscoring Intel’s dedication to advancing HPC and AI into new realms of discovery and innovation.

(Image Credit: Intel)

See also: Chuck Ros, SoftServe: Delivering transformative AI solutions responsibly

Chuck Ros, SoftServe: Delivering transformative AI solutions responsibly https://www.artificialintelligence-news.com/2024/05/03/chuck-ros-softserve-delivering-transformative-ai-solutions-responsibly/ https://www.artificialintelligence-news.com/2024/05/03/chuck-ros-softserve-delivering-transformative-ai-solutions-responsibly/#respond Fri, 03 May 2024 14:47:56 +0000 https://www.artificialintelligence-news.com/?p=14774 As the world embraces the transformative potential of AI, SoftServe is at the forefront of developing cutting-edge AI solutions while prioritising responsible deployment. Ahead of AI & Big Data Expo North America – where the company will showcase its expertise – Chuck Ros, Industry Success Director at SoftServe, provided valuable insights into the company’s AI... Read more »

The post Chuck Ros, SoftServe: Delivering transformative AI solutions responsibly appeared first on AI News.

As the world embraces the transformative potential of AI, SoftServe is at the forefront of developing cutting-edge AI solutions while prioritising responsible deployment.

Ahead of AI & Big Data Expo North America – where the company will showcase its expertise – Chuck Ros, Industry Success Director at SoftServe, provided valuable insights into the company’s AI initiatives, the challenges faced, and its future strategy for leveraging this powerful technology.

Highlighting a recent AI project that exemplifies SoftServe’s innovative approach, Ros discussed the company’s unique solution for a software company in the field service management industry. The vision was to create an easy-to-use, language model-enabled interface that would allow field technicians to access service histories, equipment documentation, and maintenance schedules seamlessly, enhancing productivity and operational efficiency.

“Our AI engineers built a prompt evaluation pipeline that seamlessly considers cost, processing time, semantic similarity, and the likelihood of hallucinations,” Ros explained. “It proved to be an extremely effective architecture that led to improved operational efficiencies for the customer, increased productivity for users in the field, competitive edge for the software company and for their clients, and—perhaps most importantly—a spark for additional innovation.”
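SoftServe has not published the internals of that evaluation pipeline, but the general pattern is to score each candidate prompt or response on several axes and combine them into a single number for comparison. The sketch below is an illustrative assumption, not SoftServe's implementation; the weights and metric names are invented:

```python
# Hypothetical prompt-evaluation scoring. Each axis is assumed to be
# pre-normalised to 0..1, where higher is always better (so cost and
# latency are already inverted, and groundedness is 1 minus the
# estimated hallucination likelihood).

def evaluate(candidate, weights=None):
    """Combine per-axis scores into one weighted score."""
    weights = weights or {
        "cost": 0.2,
        "latency": 0.2,
        "similarity": 0.4,    # semantic similarity to a reference answer
        "groundedness": 0.2,  # inverse hallucination likelihood
    }
    return sum(weights[axis] * candidate[axis] for axis in weights)

prompt_a = {"cost": 0.9, "latency": 0.8, "similarity": 0.7, "groundedness": 0.6}
prompt_b = {"cost": 0.5, "latency": 0.6, "similarity": 0.9, "groundedness": 0.9}

best = max([("A", prompt_a), ("B", prompt_b)], key=lambda kv: evaluate(kv[1]))
```

The value of wiring such a scorer into a pipeline is that prompt variants can be compared automatically on cost and quality at once, rather than by eyeballing outputs.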

While the potential of AI is undeniable, Ros acknowledged the key mistakes businesses often make when deploying AI solutions, emphasising the importance of having a robust data strategy, building adequate data pipelines, and thoroughly testing the models. He also cautioned against rushing to deploy generative AI solutions without properly assessing feasibility and business viability, stating, “We need to pay at least as much attention to whether it should be built as we do to whether it can be built.”

Recognising the critical concern of ethical AI development, Ros stressed the significance of human oversight throughout the entire process. “Managing dynamic data quality, testing and detecting for bias and inaccuracies, ensuring high standards of data privacy, and ethical use of AI systems all require human oversight,” he said. SoftServe’s approach to AI development involves structured engagements that evaluate data and algorithms for suitability, assess potential risks, and implement governance measures to ensure accountability and data traceability.

Looking ahead, Ros envisions AI playing an increasingly vital role in SoftServe’s business strategy, with ongoing refinements to AI-assisted software development lifecycles and the introduction of new tools and processes to boost productivity further. SoftServe’s findings suggest that GenAI can accelerate programming productivity by as much as 40 percent.

“I see more models assisting us on a daily basis, helping us write emails and documentation and helping us more and more with the simple, time-consuming mundane tasks we still do,” Ros said. “In the next five years I see ongoing refinement of that view to AI in SDLCs and the regular introduction of new tools, new models, new processes that push that 40 percent productivity hike to 50 percent and 60 percent.”

When asked how SoftServe is leveraging AI for social good, Ros explained that the company is delivering solutions ranging from machine learning models that help students discover their passions and aptitudes, enabling personalised learning experiences, to tools that assist teachers with their daily tasks and make their jobs easier.

“I love this question because one of SoftServe’s key strategic tenets is to power our social purpose and make the world a better place. It’s obviously an ambitious goal, but it’s important to our employees and it’s important to our clients,” explained Ros.

“It’s why we created the Open Eyes Foundation and have collected more than $15 million with the support of the public, our clients, our partners, and of course our employees. We naturally support the Open Eyes Foundation with all manner of technology needs, including AI.”

At the AI & Big Data Expo North America, SoftServe plans to host a keynote presentation titled “Revolutionizing Learning: Unleashing the Power of Generative AI in Education and Beyond,” which will explore the transformative impact of generative AI and large language models in the education sector.

“As we explore the mechanisms through which generative AI leverages data – including training methodologies like fine-tuning and Retrieval Augmented Generation (RAG) – we will pinpoint high-value, low-risk applications that promise to redefine the educational landscape,” said Ros.

“The journey from a nascent idea to a fully operational AI solution is fraught with challenges, including ethical considerations and risks inherent in deploying AI solutions. Through the lens of a success story at Mesquite ISD, where generative AI was leveraged to help students uncover their passions and aptitudes enabling the delivery of personalised learning experiences, this presentation will illustrate the practical benefits and transformative potential of generative AI in education.”
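Retrieval Augmented Generation, which Ros mentions alongside fine-tuning, can be sketched in a few lines: retrieve the documents most relevant to a question and prepend them to the model prompt so the answer is grounded in real data. The toy version below uses crude word overlap for relevance (production systems use embedding similarity) and is an illustration, not SoftServe's or Mesquite ISD's implementation:

```python
# Minimal RAG sketch: keyword-overlap retrieval over a toy corpus,
# then prompt assembly. Documents and the query are invented examples.

def score(query: str, doc: str) -> int:
    # Crude relevance: count shared lowercase words.
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, corpus: list[str], k: int = 1) -> list[str]:
    # Return the k highest-scoring documents.
    return sorted(corpus, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, corpus: list[str]) -> str:
    context = "\n".join(retrieve(query, corpus))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

corpus = [
    "Pump P-301 was serviced on 2024-03-02; bearings replaced.",
    "The cafeteria menu rotates weekly.",
]
prompt = build_prompt("When was pump P-301 serviced?", corpus)
```

Because the model only sees retrieved context, RAG is one of the "high-value, low-risk" patterns the talk alludes to: answers stay tied to source documents rather than the model's parametric memory.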

Additionally, the company will participate in panel discussions on topics such as “Getting to Production-Ready – Challenges and Best Practices for Deploying AI” and “Navigating the Data & AI Landscape – Ensuring Safety, Security, and Responsibility in Big Data and AI Systems.” These sessions will provide attendees with valuable insights from SoftServe’s experts on overcoming deployment challenges, ensuring data quality and user acceptance, and mitigating risks associated with AI implementation.

As a key sponsor of the event, SoftServe aims to contribute to the discourse surrounding the responsible and ethical development of AI solutions, while sharing its expertise and vision for leveraging this powerful technology to drive innovation, enhance productivity, and address global challenges. 

“We are, of course, always interested in both sharing and hearing about the diversity of business cases for applications in AI and big data: the concept of the rising tide lifting all boats is definitely relevant in AI and GenAI in particular, and we’re proud to be a part of the AI technology community,” Ros concludes.
