h100 Archives - AI News
https://www.artificialintelligence-news.com/tag/h100/

Inflection-2 beats Google’s PaLM 2 across common benchmarks
https://www.artificialintelligence-news.com/2023/11/23/inflection-2-beats-google-palm-2-across-common-benchmarks/
Thu, 23 Nov 2023

Inflection, an AI startup aiming to create “personal AI for everyone”, has announced a new large language model dubbed Inflection-2 that beats Google’s PaLM 2.

Inflection-2 was trained on around 5,000 NVIDIA H100 GPUs, consuming approximately 10²⁵ floating point operations (FLOPs) of compute and putting it in the same league as PaLM 2 Large. However, early benchmarks show Inflection-2 outperforming Google’s model on tests of reasoning ability, factual knowledge, and stylistic prowess.
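For context, a compute figure of that size can be sanity-checked by dividing it by the sustained throughput of the training fleet. The back-of-envelope sketch below is purely illustrative; the per-GPU throughput and utilisation figures are assumptions, not numbers published by Inflection.

```python
# Hypothetical back-of-envelope: how long does ~1e25 FLOPs take on ~5,000 H100s?
# The per-GPU peak and utilisation below are illustrative assumptions.
TOTAL_FLOPS = 1e25          # approximate training compute reported for Inflection-2
NUM_GPUS = 5_000            # approximate size of the training fleet
PEAK_FLOPS_PER_GPU = 1e15   # ~1 PFLOP/s per H100 with low-precision tensor cores (assumed)
UTILISATION = 0.4           # assumed sustained fraction of peak

sustained_flops = NUM_GPUS * PEAK_FLOPS_PER_GPU * UTILISATION  # cluster-wide FLOP/s
seconds = TOTAL_FLOPS / sustained_flops
print(f"Estimated wall-clock training time: {seconds / 86_400:.0f} days")
```

Under these assumptions the run works out to roughly two months of wall-clock time.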

On a range of common academic AI benchmarks, Inflection-2 achieved higher scores than PaLM 2 on most of the tests. This included outscoring the search giant’s flagship model on the Massive Multitask Language Understanding (MMLU) suite, as well as on TriviaQA, HellaSwag, and the Grade School Math (GSM8K) benchmarks.
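Inflection has not released the model or its evaluation harness, but benchmarks of this kind are routinely run on open models with EleutherAI’s lm-evaluation-harness. The sketch below is a generic illustration of that workflow using an arbitrary open checkpoint as a stand-in; the model name, few-shot setting, and batch size are assumptions rather than Inflection’s setup.

```python
# Generic illustration of running the benchmarks named above with
# EleutherAI's lm-evaluation-harness (pip install lm-eval).
# The checkpoint and few-shot settings are placeholders, not Inflection's setup.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                               # Hugging Face backend
    model_args="pretrained=mistralai/Mistral-7B-v0.1,dtype=bfloat16",
    tasks=["mmlu", "hellaswag", "gsm8k", "triviaqa"],
    num_fewshot=5,
    batch_size=8,
)

for task, metrics in results["results"].items():
    print(task, metrics)
```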

The startup’s new model will soon power its personal assistant app Pi to enable more natural conversations and useful features.

Inflection said its transition from NVIDIA A100 to H100 GPUs for inference – combined with optimisation work – will increase serving speed and reduce costs despite Inflection-2 being much larger than its predecessor.  

An Inflection spokesperson said this latest model brings them “a big milestone closer” towards fulfilling the mission of providing AI assistants for all. They added the team is “already looking forward” to training even larger models on their 22,000 GPU supercluster.

Safety is said to be a top priority for the researchers, with Inflection being one of the first signatories to the White House’s July 2023 voluntary AI commitments. The company said its safety team continues working to ensure models are rigorously evaluated and rely on best practices for alignment.

With impressive benchmarks and plans to scale further, Inflection’s latest effort poses a serious challenge to tech giants like Google and Microsoft who have so far dominated the field of large language models. The race is on to deliver the next generation of AI.

(Photo by Johann Walter Bantz on Unsplash)

See also: Anthropic upsizes Claude 2.1 to 200K tokens, nearly doubling GPT-4


Azure and NVIDIA deliver next-gen GPU acceleration for AI
https://www.artificialintelligence-news.com/2023/08/09/azure-nvidia-deliver-next-gen-gpu-acceleration-ai/
Wed, 09 Aug 2023

Microsoft Azure users are now able to harness the latest advancements in NVIDIA’s accelerated computing technology, revolutionising the training and deployment of their generative AI applications.

The integration of Azure ND H100 v5 virtual machines (VMs) with NVIDIA H100 Tensor Core GPUs and Quantum-2 InfiniBand networking promises seamless scaling of generative AI and high-performance computing applications, all at the click of a button.

This cutting-edge collaboration comes at a pivotal moment when developers and researchers are actively exploring the potential of large language models (LLMs) and accelerated computing to unlock novel consumer and business use cases.

NVIDIA’s H100 GPU achieves supercomputing-class performance through an array of architectural innovations. These include fourth-generation Tensor Cores, a new Transformer Engine for accelerating LLM training and inference, and fourth-generation NVLink, which provides up to 900 GB/sec of aggregate GPU-to-GPU bandwidth.
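The 900 GB/sec figure is NVLink’s aggregate per-GPU bandwidth; what a workload actually observes depends on topology and transfer size. The sketch below is a minimal way to measure device-to-device copy throughput with PyTorch, assuming a node with at least two CUDA GPUs; it is an illustration, not an official benchmark.

```python
# Minimal sketch: measure GPU-to-GPU copy throughput on a multi-GPU node.
# The observed number reflects the path PyTorch uses (NVLink where available)
# and will sit below the 900 GB/sec aggregate figure, which sums all links.
import time
import torch

assert torch.cuda.device_count() >= 2, "needs at least two GPUs"

size_bytes = 1 << 30  # 1 GiB payload
src = torch.empty(size_bytes, dtype=torch.uint8, device="cuda:0")
dst = torch.empty(size_bytes, dtype=torch.uint8, device="cuda:1")

for _ in range(3):        # warm-up copies
    dst.copy_(src)
torch.cuda.synchronize()

iters = 20
start = time.perf_counter()
for _ in range(iters):
    dst.copy_(src)
torch.cuda.synchronize()
elapsed = time.perf_counter() - start

print(f"~{size_bytes * iters / elapsed / 1e9:.1f} GB/s device-to-device")
```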

The integration of NVIDIA Quantum-2 CX7 InfiniBand – providing 3,200 Gbps (3.2 Tbps) of cross-node bandwidth per VM – ensures that performance scales efficiently across GPUs, even at massive scale. This capability puts the offering on a par with the computational capabilities of the world’s most advanced supercomputers.
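Frameworks typically reach that fabric through NCCL: a launcher such as torchrun sets the rank and world-size environment variables on each VM, and NCCL uses the InfiniBand interfaces between nodes and NVLink within a node. The sketch below is a minimal multi-node all-reduce check; the launcher flags and file name are illustrative, not an Azure-specific recipe.

```python
# Minimal multi-node all-reduce over NCCL. Launch on every node with e.g.:
#   torchrun --nnodes=2 --nproc-per-node=8 \
#            --rdzv-backend=c10d --rdzv-endpoint=<head-node>:29500 allreduce_check.py
import os
import torch
import torch.distributed as dist

def main():
    dist.init_process_group(backend="nccl")      # rank/world size come from torchrun
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Every rank contributes ones; after the all-reduce each element equals
    # the world size, confirming all GPUs across nodes took part.
    x = torch.ones(1024, device="cuda")
    dist.all_reduce(x, op=dist.ReduceOp.SUM)
    if dist.get_rank() == 0:
        print(f"world size seen by all-reduce: {int(x[0].item())}")

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```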

The newly introduced ND H100 v5 VMs hold immense potential for training and serving increasingly complex LLMs and computer vision models. These neural networks power the most demanding, compute-intensive generative AI applications, spanning question answering and code generation to audio, video, and image synthesis, as well as speech recognition.

A standout feature of the ND H100 v5 VMs is their ability to achieve up to a 2x speedup in LLM inference compared with previous-generation instances, demonstrated with the BLOOM 175B model. This performance boost underscores their capacity to further optimise AI applications, fuelling innovation across industries.
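That comparison concerns serving throughput rather than any particular serving stack. As a generic illustration of running a BLOOM-family checkpoint across the GPUs in a single VM, the sketch below shards a smaller stand-in checkpoint with Hugging Face Transformers; the full BLOOM model needs several H100s’ worth of memory even in fp16, and this is not how the MLPerf results were produced.

```python
# Generic illustration: shard a BLOOM-family checkpoint across visible GPUs
# with Hugging Face Transformers (device_map="auto" requires `accelerate`).
# bloom-7b1 is used as a stand-in; the full bigscience/bloom checkpoint
# needs several H100s of memory even in fp16.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bigscience/bloom-7b1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # halves memory relative to fp32
    device_map="auto",           # spread layers across the available GPUs
)

prompt = "H100 GPUs accelerate large language model inference by"
inputs = tokenizer(prompt, return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}

output = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```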

The synergy between NVIDIA H100 Tensor Core GPUs and Microsoft Azure empowers enterprises with unparalleled AI training and inference capabilities. This partnership also streamlines the development and deployment of production AI, bolstered by the integration of the NVIDIA AI Enterprise software suite and Azure Machine Learning for MLOps.
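Azure Machine Learning exposes such GPU clusters through its Python SDK, so a training script can be submitted to a pre-created compute target as a command job. The minimal sketch below illustrates that flow; the subscription, workspace, compute name, and curated environment string are placeholders assumed for illustration, not values from the article.

```python
# Minimal sketch: submit a training script to an Azure ML GPU compute target
# with the azure-ai-ml SDK (v2). All identifiers below are placeholders.
from azure.ai.ml import MLClient, command
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace>",
)

job = command(
    code="./src",                              # local folder containing train.py
    command="python train.py --epochs 3",
    environment="AzureML-acpt-pytorch-2.2-cuda12.1@latest",  # assumed curated environment name
    compute="nd-h100-v5-cluster",              # placeholder compute target name
    display_name="h100-training-sketch",
)

returned_job = ml_client.jobs.create_or_update(job)
print(returned_job.studio_url)
```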

The combined efforts have led to groundbreaking AI performance, as validated by industry-standard MLPerf benchmark results.

The integration of the NVIDIA Omniverse platform with Azure extends the reach of this collaboration further, providing users with everything they need for industrial digitalisation and AI supercomputing.

(Image Credit: Uwe Hoh from Pixabay)

See also: Gcore partners with UbiOps and Graphcore to empower AI teams


US introduces new AI chip export restrictions
https://www.artificialintelligence-news.com/2022/09/01/us-introduces-new-ai-chip-export-restrictions/
Thu, 01 Sep 2022

NVIDIA has revealed that it is subject to new US export restrictions covering the sale of its AI chips to China and Russia.

In an SEC filing, NVIDIA says the US government has informed the chipmaker of a new license requirement that impacts two of its GPUs designed to speed up machine learning tasks: the current A100, and the upcoming H100.

“The license requirement also includes any future NVIDIA integrated circuit achieving both peak performance and chip-to-chip I/O performance equal to or greater than thresholds that are roughly equivalent to the A100, as well as any system that includes those circuits,” adds NVIDIA.

The US government has reportedly told NVIDIA that the new rules are aimed at addressing the risk of the affected products being used for military purposes.

“While we are not in a position to outline specific policy changes at this time, we are taking a comprehensive approach to implement additional actions necessary related to technologies, end-uses, and end-users to protect US national security and foreign policy interests,” said a US Department of Commerce spokesperson.

China is a large market for NVIDIA and the new rules could affect around $400 million in quarterly sales.

AMD has also been told the new rules will impact its similar products, including the MI200.

As of writing, NVIDIA’s shares were down 11.45 percent from the market open and AMD’s were down 6.81 percent. However, it’s worth noting that it has been another red day for the wider stock market.

(Photo by Wesley Tingey on Unsplash)

