Meta unveils five AI models for multi-modal processing, music generation, and more

Meta has unveiled five major new AI models and research, including multi-modal systems that can process both text and images, next-gen language models, music generation, AI speech detection, and efforts to improve diversity in AI systems.

The releases come from Meta’s Fundamental AI Research (FAIR) team, which has focused on advancing AI through open research and collaboration for over a decade. As the pace of AI innovation accelerates, Meta believes working with the global community is crucial.

“By publicly sharing this research, we hope to inspire iterations and ultimately help advance AI in a responsible way,” said Meta.

Chameleon: Multi-modal text and image processing

Among the releases are key components of Meta’s ‘Chameleon’ models under a research license. Chameleon is a family of multi-modal models that can understand and generate both text and images simultaneously, unlike most large language models, which are typically unimodal.

“Just as humans can process the words and images simultaneously, Chameleon can process and deliver both image and text at the same time,” explained Meta. “Chameleon can take any combination of text and images as input and also output any combination of text and images.”
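
As a rough illustration of the early-fusion idea behind such models, the sketch below interleaves text tokens and discretised image tokens into a single sequence that one transformer can model. All token IDs, vocabulary sizes, and sentinel tokens are assumptions for illustration, not Chameleon’s actual values.

```python
# Illustrative early-fusion tokenisation: image patches are quantised into
# discrete codebook tokens and spliced into the text-token stream, so one
# transformer models both modalities. All IDs and sizes are hypothetical.

TEXT_VOCAB = 65_536           # hypothetical text vocabulary size
BOI, EOI = 100_000, 100_001   # hypothetical begin/end-of-image sentinels

def interleave(text_tokens: list[int], image_tokens: list[int]) -> list[int]:
    """Flatten a caption plus one quantised image into a single sequence.

    Image-codebook tokens are offset past the text vocabulary so both
    modalities share a single embedding table.
    """
    offset_image = [t + TEXT_VOCAB for t in image_tokens]
    return text_tokens + [BOI] + offset_image + [EOI]

# A 3-token caption followed by a 4-token "image"
print(interleave([17, 942, 3051], [5, 77, 801, 4093]))
# -> [17, 942, 3051, 100000, 65541, 65613, 66337, 69629, 100001]
```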

Potential use cases are virtually limitless, from generating creative captions to prompting new scenes with combinations of text and images.

Multi-token prediction for faster language model training

Meta has also released pretrained models for code completion that use ‘multi-token prediction’ under a non-commercial research license. Traditional language model training is inefficient because it predicts only the next word. Multi-token models instead predict multiple future words simultaneously, allowing them to train faster.

“While [the one-word] approach is simple and scalable, it’s also inefficient. It requires several orders of magnitude more text than what children need to learn the same degree of language fluency,” said Meta.
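
To make the technique concrete, here is a toy sketch of multi-token prediction: a shared trunk feeds several output heads, each trained in parallel to predict a token further ahead. The architecture and sizes are hypothetical, not those of Meta’s released models.

```python
import torch
import torch.nn as nn

class MultiTokenPredictor(nn.Module):
    """Toy multi-token language model: a shared trunk feeds n_future output
    heads, where head i predicts the token i+1 steps ahead. Sizes are
    illustrative; causal masking is omitted for brevity."""

    def __init__(self, vocab=32_000, d_model=256, n_future=4):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.trunk = nn.TransformerEncoder(layer, num_layers=2)
        self.heads = nn.ModuleList(nn.Linear(d_model, vocab) for _ in range(n_future))

    def forward(self, tokens):                    # tokens: (batch, seq)
        h = self.trunk(self.embed(tokens))
        return [head(h) for head in self.heads]   # n_future logit tensors

def multi_token_loss(logits, targets):
    """targets[i] holds the tokens i+1 steps ahead of each position; the
    heads learn in parallel, giving more training signal per sequence."""
    ce = nn.CrossEntropyLoss()
    return sum(ce(l.flatten(0, 1), t.flatten()) for l, t in zip(logits, targets))
```

At inference time, the extra heads can simply be dropped for standard next-token decoding, or reused to draft several tokens at once.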

JASCO: Enhanced text-to-music model

On the creative side, Meta’s JASCO can generate music clips from text while affording more control by accepting additional inputs like chords and beats.

“While existing text-to-music models like MusicGen rely mainly on text inputs for music generation, our new model, JASCO, is capable of accepting various inputs, such as chords or beat, to improve control over generated music outputs,” explained Meta.
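
To make the conditioning idea concrete, here is a hypothetical request shape showing how symbolic controls could accompany a text prompt; the field names are illustrative, not JASCO’s actual interface.

```python
# A hypothetical request shape showing how symbolic controls could ride
# alongside the text prompt; field names are illustrative, not JASCO's API.
request = {
    "prompt": "warm lo-fi groove with soft electric piano",
    "chords": ["Am7", "D9", "Gmaj7", "Cmaj7"],  # one chord per bar
    "bpm": 84,                                  # beat conditioning
    "duration_s": 10,
}
```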

AudioSeal: Detecting AI-generated speech

Meta claims AudioSeal is the first audio watermarking system designed to detect AI-generated speech. It can pinpoint the specific AI-generated segments within larger audio clips, with detection up to 485x faster than previous methods.
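
To illustrate what segment-level detection means in practice, here is a toy post-processing pass over per-frame watermark probabilities; the detector model itself (not shown), the frame size, and the threshold are assumptions rather than AudioSeal’s actual pipeline.

```python
import numpy as np

def flag_ai_segments(frame_probs: np.ndarray, frame_ms: int = 20,
                     threshold: float = 0.5) -> list[tuple[int, int]]:
    """Toy localisation pass: given per-frame watermark probabilities from a
    detector model (not shown), return (start_ms, end_ms) spans flagged as
    AI-generated. Frame size and threshold are illustrative assumptions."""
    flagged = frame_probs > threshold
    spans, start = [], None
    for i, hit in enumerate(flagged):
        if hit and start is None:
            start = i
        elif not hit and start is not None:
            spans.append((start * frame_ms, i * frame_ms))
            start = None
    if start is not None:
        spans.append((start * frame_ms, len(flagged) * frame_ms))
    return spans

print(flag_ai_segments(np.array([0.1, 0.2, 0.9, 0.95, 0.85, 0.3, 0.1])))
# -> [(40, 100)]: the middle 60 ms is flagged as AI-generated
```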

“AudioSeal is being released under a commercial license. It’s just one of several lines of responsible research we have shared to help prevent the misuse of generative AI tools,” said Meta.

Improving text-to-image diversity

Another important release aims to improve the diversity of text-to-image models, which can often exhibit geographical and cultural biases.

Meta developed automatic indicators to evaluate potential geographical disparities and conducted a large-scale study of more than 65,000 annotations to understand how people globally perceive geographic representation.

“This enables more diversity and better representation in AI-generated images,” said Meta. The relevant code and annotations have been released to help improve diversity across generative models.

By publicly sharing these groundbreaking models, Meta says it hopes to foster collaboration and drive innovation within the AI community.

(Photo by Dima Solomin)

See also: NVIDIA presents latest advancements in visual AI

Microsoft unveils Phi-3 family of compact language models

Microsoft has announced the Phi-3 family of open small language models (SLMs), touting them as the most capable and cost-effective models of their size available. The innovative training approach developed by Microsoft researchers has allowed the Phi-3 models to outperform larger models on language, coding, and math benchmarks.

“What we’re going to start to see is not a shift from large to small, but a shift from a singular category of models to a portfolio of models where customers get the ability to make a decision on what is the best model for their scenario,” said Sonali Yadav, Principal Product Manager for Generative AI at Microsoft.

The first Phi-3 model, Phi-3-mini at 3.8 billion parameters, is now publicly available in the Azure AI Model Catalog, on Hugging Face, through Ollama, and as an NVIDIA NIM microservice. Despite its compact size, Phi-3-mini outperforms models twice its size. Additional Phi-3 models like Phi-3-small (7B parameters) and Phi-3-medium (14B parameters) will follow soon.
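
For those who want to experiment, a minimal sketch using Hugging Face transformers follows; the repository id matches the public release at the time of writing, and the generation settings are illustrative.

```python
# A minimal sketch of trying Phi-3-mini via Hugging Face transformers. The
# repo id matches the public release at the time of writing; older
# transformers versions may additionally need trust_remote_code=True.
from transformers import pipeline

generate = pipeline("text-generation", model="microsoft/Phi-3-mini-4k-instruct")

prompt = "Explain in two sentences why small language models matter."
print(generate(prompt, max_new_tokens=96)[0]["generated_text"])
```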

“Some customers may only need small models, some will need big models and many are going to want to combine both in a variety of ways,” said Luis Vargas, Microsoft VP of AI.

The key advantage of SLMs is that their smaller size enables on-device deployment, delivering low-latency AI experiences without network connectivity. Potential use cases include smart sensors, cameras, farming equipment, and more. Privacy is another benefit, as data stays on the device.

(Credit: Microsoft)

Large language models (LLMs) excel at complex reasoning over vast datasets—strengths suited to applications like drug discovery, where understanding interactions across scientific literature is essential. However, SLMs offer a compelling alternative for simpler query answering, summarisation, content generation, and the like.

“Rather than chasing ever-larger models, Microsoft is developing tools with more carefully curated data and specialised training,” commented Victor Botev, CTO and Co-Founder of Iris.ai.

“This allows for improved performance and reasoning abilities without the massive computational costs of models with trillions of parameters. Fulfilling this promise would mean tearing down a huge adoption barrier for businesses looking for AI solutions.”

Breakthrough training technique

What enabled Microsoft’s SLM quality leap was an innovative data filtering and generation approach inspired by bedtime storybooks.

“Instead of training on just raw web data, why don’t you look for data which is of extremely high quality?” asked Sebastien Bubeck, Microsoft VP leading SLM research.  

Ronen Eldan’s nightly reading routine with his daughter sparked the idea to generate a ‘TinyStories’ dataset of millions of simple narratives created by prompting a large model with combinations of words a 4-year-old would know. Remarkably, a 10M parameter model trained on TinyStories could generate fluent stories with perfect grammar.
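
A minimal sketch of that recipe, assuming access to any teacher LLM behind a simple completion function, might look like this; the word list and prompt wording are illustrative placeholders.

```python
import random

# Illustrative sketch of the TinyStories-style recipe: repeatedly prompt a
# strong teacher model with a few words a young child would know and keep
# the simple stories it writes. `ask_teacher` stands in for any LLM
# completion call; the word list is a tiny placeholder for the real one.
SIMPLE_WORDS = ["dog", "ball", "rain", "happy", "jump", "tree", "red", "sleep"]

def make_prompt() -> str:
    words = random.sample(SIMPLE_WORDS, 3)
    return ("Write a short story using only words a 4-year-old would "
            f"understand. It must use the words: {', '.join(words)}.")

def build_dataset(ask_teacher, n_stories: int) -> list[str]:
    return [ask_teacher(make_prompt()) for _ in range(n_stories)]
```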

Building on that early success, the team procured high-quality web data vetted for educational value to create the ‘CodeTextbook’ dataset. This was synthesised through rounds of prompting, generation, and filtering by both humans and large AI models.

“A lot of care goes into producing these synthetic data,” Bubeck said. “We don’t take everything that we produce.”

The high-quality training data proved transformative. “Because it’s reading from textbook-like material…you make the task of the language model to read and understand this material much easier,” Bubeck explained.

Mitigating AI safety risks

Despite the thoughtful data curation, Microsoft emphasises applying additional safety practices to the Phi-3 release, mirroring its standard processes for all generative AI models.

“As with all generative AI model releases, Microsoft’s product and responsible AI teams used a multi-layered approach to manage and mitigate risks in developing Phi-3 models,” a blog post stated.  

This included further training examples to reinforce expected behaviours, assessments to identify vulnerabilities through red-teaming, and offering Azure AI tools for customers to build trustworthy applications atop Phi-3.

(Photo by Tadas Sar)

See also: Microsoft to forge AI partnerships with South Korean tech leaders

Meta raises the bar with open source Llama 3 LLM

Meta has introduced Llama 3, the next generation of its state-of-the-art open source large language model (LLM). The tech giant claims Llama 3 establishes new performance benchmarks, surpassing previous industry-leading models like GPT-3.5 in real-world scenarios.

“With Llama 3, we set out to build the best open models that are on par with the best proprietary models available today,” said Meta in a blog post announcing the release.

The initial Llama 3 models being opened up are 8 billion and 70 billion parameter versions. Meta says its teams are still training larger models of over 400 billion parameters, which will be released over the coming months alongside research papers detailing the work.

Llama 3 has been over two years in the making, with significant resources dedicated to assembling high-quality training data, scaling up distributed training, optimising the model architecture, and developing innovative approaches to instruction fine-tuning.

Meta’s 70 billion parameter instruction fine-tuned model outperformed GPT-3.5, Claude, and other LLMs of comparable scale in human evaluations across 12 key usage scenarios like coding, reasoning, and creative writing. The company’s 8 billion parameter pretrained model also sets new benchmarks on popular LLM evaluation tasks.

“We believe these are the best open source models of their class, period,” stated Meta.

The tech giant is releasing the models via an “open by default” approach to further an open ecosystem around AI development. Llama 3 will be available across all major cloud providers, model hosts, hardware manufacturers, and AI platforms.

Victor Botev, CTO and co-founder of Iris.ai, said: “With the global shift towards AI regulation, the launch of Meta’s Llama 3 model is notable. By embracing transparency through open-sourcing, Meta aligns with the growing emphasis on responsible AI practices and ethical development.

“Moreover, this grants the opportunity for wider community education as open models facilitate insights into development and the ability to scrutinise various approaches, with this transparency feeding back into the drafting and enforcement of regulation.”

Accompanying Meta’s latest models is an updated suite of AI safety tools, including the second iterations of Llama Guard for classifying risks and CyberSec Eval for assessing potential misuse. A new component called Code Shield has also been introduced to filter insecure code suggestions at inference time.

“However, it’s important to maintain perspective – a model simply being open-source does not automatically equate to ethical AI,” Botev continued. “Addressing AI’s challenges requires a comprehensive approach to tackling issues like data privacy, algorithmic bias, and societal impacts – all key focuses of emerging AI regulations worldwide.

“While open initiatives like Llama 3 promote scrutiny and collaboration, their true impact hinges on a holistic approach to AI governance compliance and embedding ethics into AI systems’ lifecycles. Meta’s continuing efforts with the Llama model are a step in the right direction, but ethical AI demands sustained commitment from all stakeholders.”

Meta says it has adopted a “system-level approach” to responsible AI development and deployment with Llama 3. While the models have undergone extensive safety testing, the company emphasises that developers should implement their own input/output filtering in line with their application’s requirements.

The company’s end-user product integrating Llama 3 is Meta AI, which Meta claims is now the world’s leading AI assistant thanks to the new models. Users can access Meta AI via Facebook, Instagram, WhatsApp, Messenger and the web for productivity, learning, creativity, and general queries.  

Multimodal versions of Meta AI integrating vision capabilities are on the way, with an early preview coming to Meta’s Ray-Ban smart glasses.

Despite the considerable achievements of Llama 3, some in the AI field have expressed scepticism over whether Meta’s motivation for an open approach is truly “for the good of society.”

However, just a day after Mistral AI set a new benchmark for open source models with Mixtral 8x22B, Meta’s release does once again raise the bar for openly-available LLMs.

See also: SAS aims to make AI accessible regardless of skill set with packaged AI models

Mixtral 8x22B sets new benchmark for open models

Mistral AI has released Mixtral 8x22B, which sets a new benchmark for open source models in performance and efficiency. The model boasts robust multilingual capabilities and superior mathematical and coding prowess.

Mixtral 8x22B operates as a Sparse Mixture-of-Experts (SMoE) model, activating just 39 billion of its 141 billion parameters for any given input.
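
For readers unfamiliar with the mechanism, here is a minimal, illustrative sketch of sparse top-2 expert routing in the spirit of SMoE layers; the sizes and routing details are simplifications, not Mixtral’s actual implementation.

```python
import torch
import torch.nn as nn

class Top2MoE(nn.Module):
    """Toy sparse mixture-of-experts layer: a router scores every expert for
    each token, but only the top-2 experts actually run, so most parameters
    stay inactive on any given forward pass. Sizes are illustrative."""

    def __init__(self, d_model=512, n_experts=8):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                              # x: (tokens, d_model)
        weights, idx = self.router(x).softmax(-1).topk(2, dim=-1)
        weights = weights / weights.sum(-1, keepdim=True)  # renormalise top-2
        out = torch.zeros_like(x)
        for slot in range(2):                          # each token's 2 experts
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

moe = Top2MoE()
print(moe(torch.randn(16, 512)).shape)  # torch.Size([16, 512])
```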

Beyond its efficiency, Mixtral 8x22B boasts fluency in multiple major languages, including English, French, Italian, German, and Spanish. Its adeptness extends into technical domains, with strong mathematical and coding capabilities. Notably, the model supports native function calling paired with a ‘constrained output mode,’ facilitating large-scale application development and tech upgrades.

With a substantial 64K-token context window, Mixtral 8x22B ensures precise information recall from voluminous documents, further appealing to enterprise-level utilisation where handling extensive data sets is routine.

In line with fostering a collaborative and innovative AI research environment, Mistral AI has released Mixtral 8x22B under the Apache 2.0 license. This highly permissive open-source license permits unrestricted usage and enables widespread adoption.

Mixtral 8x22B outclasses many existing models. In head-to-head comparisons on standard industry benchmarks – ranging from common sense and reasoning to subject-specific knowledge – Mistral’s new innovation excels. Figures released by Mistral AI illustrate that Mixtral 8x22B significantly outperforms the LLaMA 2 70B model across critical reasoning and knowledge benchmarks in varied linguistic contexts.

Furthermore, in coding and maths, Mixtral continues its dominance among open models. Updated results show an impressive performance improvement in mathematical benchmarks following the release of an instructed version of the model.

Prospective users and developers are urged to explore Mixtral 8x22B on La Plateforme, Mistral AI’s interactive platform. Here, they can engage directly with the model.

In an era where AI’s role is ever-expanding, Mixtral 8x22B’s blend of high performance, efficiency, and open accessibility marks a significant milestone in the democratisation of advanced AI tools.

(Photo by Joshua Golde)

See also: SAS aims to make AI accessible regardless of skill set with packaged AI models

Databricks claims DBRX sets ‘a new standard’ for open-source LLMs

Databricks has announced the launch of DBRX, a powerful new open-source large language model that it claims sets a new bar for open models by outperforming established options like GPT-3.5 on industry benchmarks. 

The company says the 132 billion parameter DBRX model surpasses popular open-source LLMs like LLaMA 2 70B, Mixtral, and Grok-1 across language understanding, programming, and maths tasks. It even outperforms Anthropic’s closed-source model Claude on certain benchmarks.

DBRX demonstrated state-of-the-art performance among open models on coding tasks, beating out specialised models like CodeLLaMA despite being a general-purpose LLM. It also matched or exceeded GPT-3.5 across nearly all benchmarks evaluated.

The state-of-the-art capabilities come thanks to a more efficient mixture-of-experts architecture that makes DBRX up to 2x faster at inference than LLaMA 2 70B, despite having fewer active parameters. Databricks claims training the model was also around 2x more compute-efficient than dense alternatives.

“DBRX is setting a new standard for open source LLMs—it gives enterprises a platform to build customised reasoning capabilities based on their own data,” said Ali Ghodsi, Databricks co-founder and CEO.

DBRX was pretrained on a massive 12 trillion tokens of “carefully curated” text and code data selected to improve quality. It leverages technologies like rotary position encodings and curriculum learning during pretraining.
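
As a rough illustration of one of those techniques, here is a minimal rotary position embedding in the standard formulation; it is a generic sketch, not DBRX’s actual code.

```python
import torch

def rope(x: torch.Tensor, base: float = 10_000.0) -> torch.Tensor:
    """Minimal rotary position embedding: rotate each pair of channels by an
    angle that grows with position, so attention scores depend on relative
    distance. This follows the standard RoPE formulation, not DBRX's code."""
    seq, dim = x.shape[-2], x.shape[-1]
    half = dim // 2
    freqs = base ** (-torch.arange(half, dtype=torch.float32) / half)
    angles = torch.arange(seq, dtype=torch.float32)[:, None] * freqs  # (seq, half)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)

q = torch.randn(2, 128, 64)   # (batch, seq, head_dim)
print(rope(q).shape)          # torch.Size([2, 128, 64])
```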

Customers can interact with DBRX via APIs or use the company’s tools to fine-tune the model on their proprietary data. It’s already being integrated into Databricks’ AI products.

“Our research shows enterprises plan to spend half of their AI budgets on generative AI,” said Dave Menninger, Executive Director, Ventana Research, part of ISG. “One of the top three challenges they face is data security and privacy.

“With their end-to-end Data Intelligence Platform and the introduction of DBRX, Databricks is enabling enterprises to build generative AI applications that are governed, secure and tailored to the context of their business, while maintaining control and ownership of their IP along the way.”

Partners including Accenture, Block, Nasdaq, Prosus, Replit, and Zoom praised DBRX’s potential to accelerate enterprise adoption of open, customised large language models. Analysts said it could drive a shift from closed to open source as fine-tuned open models match proprietary performance.

Mike O’Rourke, Head of AI and Data Services at NASDAQ, commented: “Databricks is a key partner to Nasdaq on some of our most important data systems. They continue to be at the forefront of the industry in managing data and leveraging AI, and we are excited about the release of DBRX.

“The combination of strong model performance and favourable serving economics is the kind of innovation we are looking for as we grow our use of generative AI at Nasdaq.”

You can find the DBRX base and fine-tuned models on Hugging Face. The project’s GitHub repository has further resources and code examples.

(Photo by Ryan Quintal)

See also: Large language models could ‘revolutionise the finance sector within two years’

Elon Musk’s xAI open-sources Grok

Elon Musk’s startup xAI has made its large language model Grok available as open source software. The 314 billion parameter model can now be freely accessed, modified, and distributed by anyone under an Apache 2.0 license.

The release fulfils Musk’s promise to open source Grok in an effort to accelerate AI development and adoption.

xAI announced the move in a blog post, stating: “We are releasing the base model weights and network architecture of Grok-1, our large language model. Grok-1 is a 314 billion parameter Mixture-of-Experts model trained from scratch by xAI.”

Grok had previously only been available through Musk’s social network X as part of the paid X Premium+ subscription. By open sourcing it, xAI has empowered developers, companies, and enthusiasts worldwide to leverage the advanced language model’s capabilities.

The model’s release includes its weights, which represent the strength of connections between its artificial neurons, as well as documentation and code. However, it omits the original training data and access to real-time data streams that gave the proprietary version an advantage.

Named after a term meaning “understanding” coined by Robert A. Heinlein in Stranger in a Strange Land, and styled on the irreverent humour of Douglas Adams’ Hitchhiker’s Guide series, Grok has been positioned as a more open and humorous alternative to OpenAI’s ChatGPT. The move aligns with Musk’s battle against censorship and the “woke” ideology he sees displayed by models like Gemini, and with his recent lawsuit claiming OpenAI violated its nonprofit principles.

While xAI’s open source release earned praise from open source advocates, some critics raised concerns about potential misuse facilitated by unrestricted access to powerful AI capabilities.

You can find Grok-1 on GitHub here.

(Image Credit: xAI)

See also: Anthropic says Claude 3 Haiku is the fastest model in its class

Hugging Face is launching an open robotics project

Hugging Face, the startup behind the popular open source machine learning codebase and ChatGPT rival Hugging Chat, is venturing into new territory with the launch of an open robotics project.

The ambitious expansion was announced by former Tesla staff scientist Remi Cadene in a post on X.

In keeping with Hugging Face’s ethos of open source, Cadene stated the robot project would be “open-source, not as in Open AI” in reference to OpenAI’s legal battle with Cadene’s former boss, Elon Musk.

Cadene – who will be leading the robotics initiative – revealed that Hugging Face is hiring robotics engineers in Paris, France.

A job listing for an “Embodied Robotics Engineer” sheds light on the project’s goals, which include “designing, building, and maintaining open-source and low cost robotic systems that integrate AI technologies, specifically in deep learning and embodied AI.”

The role involves collaborating with ML engineers, researchers, and product teams to develop innovative robotics solutions that “push the boundaries of what’s possible in robotics and AI.” Key responsibilities range from building low-cost robots using off-the-shelf components and 3D-printed parts to integrating deep learning and embodied AI technologies into robotic systems.

Until now, Hugging Face has primarily focused on software offerings like its machine learning codebase and open-source chatbot. The robotics project marks a significant departure into the hardware realm as the startup aims to bring AI into the physical world through open and affordable robotic platforms.

(Photo by Possessed Photography on Unsplash)

See also: Google engineer stole AI tech for Chinese firms

IBM and Hugging Face release AI foundation model for climate science

In a bid to democratise access to AI technology for climate science, IBM and Hugging Face have announced the release of the watsonx.ai geospatial foundation model.

The geospatial model, built from NASA’s satellite data, will be the largest of its kind on Hugging Face and marks the first-ever open-source AI foundation model developed in collaboration with NASA.

Jeff Boudier, head of product and growth at Hugging Face, highlighted the importance of information sharing and collaboration in driving progress in AI. Open-source AI and the release of models and datasets are fundamental in ensuring AI benefits as many people as possible.

Climate science faces constant challenges due to rapidly changing environmental conditions, requiring access to the latest data. Despite the abundance of data, scientists and researchers struggle to analyse the vast datasets effectively. NASA estimates that by 2024, there will be 250,000 terabytes of data from new missions.

To address this issue, IBM embarked on a Space Act Agreement with NASA earlier this year—aiming to build an AI foundation model for geospatial data.

By making this geospatial foundation model openly available on Hugging Face, both companies aim to promote collaboration and accelerate progress in climate and Earth science.

Sriram Raghavan, VP at IBM Research AI, commented:

“The essential role of open-source technologies to accelerate critical areas of discovery such as climate change has never been clearer.

By combining IBM’s foundation model efforts aimed at creating flexible, reusable AI systems with NASA’s repository of Earth-satellite data, and making it available on the leading open-source AI platform, Hugging Face, we can leverage the power of collaboration to implement faster and more impactful solutions that will improve our planet.”

The geospatial model, jointly trained by IBM and NASA on Harmonized Landsat Sentinel-2 satellite data (HLS) over one year across the continental United States, has shown promising results. It demonstrated a 15 percent improvement over state-of-the-art techniques using only half the labelled data.

With further fine-tuning, the model can be adapted for various tasks such as deforestation tracking, crop yield prediction, and greenhouse gas detection.

IBM’s collaboration with NASA in building the AI model aligns with NASA’s decade-long Open-Source Science Initiative, promoting a more accessible and inclusive scientific community. NASA, along with other federal agencies, has designated 2023 as the Year of Open Science, celebrating the benefits of sharing data, information, and knowledge openly.

Kevin Murphy, Chief Science Data Officer at NASA, said:

“We believe that foundation models have the potential to change the way observational data is analysed and help us to better understand our planet.

By open-sourcing such models and making them available to the world, we hope to multiply their impact.”

The geospatial model leverages IBM’s foundation model technology and is part of IBM’s broader initiative to create and train AI models with transferable capabilities across different tasks.

In June, IBM introduced watsonx, an AI and data platform designed to scale and accelerate the impact of advanced AI with trusted data. A commercial version of the geospatial model, integrated into IBM watsonx, will be available through the IBM Environmental Intelligence Suite (EIS) later this year.

By leveraging the power of open-source technologies, this latest collaboration aims to address climate challenges effectively and contribute to a more sustainable future for our planet.

(Photo by Markus Spiske on Unsplash)

See also: Jay Migliaccio, IBM Watson: On leveraging AI to improve productivity

Meta launches Llama 2 open-source LLM

Meta has introduced Llama 2, an open-source family of AI language models that comes with a license allowing integration into commercial products.

The Llama 2 models range in size from 7 billion to 70 billion parameters, making them a formidable force in the AI landscape.

According to Meta’s claims, these models “outperform open source chat models on most benchmarks we tested.”

The release of Llama 2 marks a turning point in the LLM (large language model) market and has already caught the attention of industry experts and enthusiasts alike.

The new language models offered by Llama 2 come in two variants – pretrained and fine-tuned:

  • The pretrained models are trained on a whopping two trillion tokens and have a context window of 4,096 tokens, enabling them to process vast amounts of content at once.
  • The fine-tuned models, designed for chat applications like ChatGPT, have been trained on “over one million human annotations,” further enhancing their language processing capabilities.
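
For developers who want to try the fine-tuned chat variant described above, here is a minimal sketch using Hugging Face transformers; the repository id is Meta’s public gated release (the license must be accepted on the model page first), and the prompt format follows Llama 2’s published [INST] convention.

```python
# A minimal sketch of querying the fine-tuned chat variant with Hugging Face
# transformers. The repo id is Meta's public (gated) release; the [INST]
# prompt format follows Llama 2's published chat convention.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "[INST] What distinguishes a pretrained model from a fine-tuned one? [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```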

While Llama 2’s performance may not yet rival OpenAI’s GPT-4, it shows remarkable promise for an open-source model.

The Llama 2 journey started with its predecessor, LLaMA, which Meta released as open source with a non-commercial license in February.

However, someone leaked LLaMA’s weights to torrent sites, leading to a surge in its usage within the AI community. This laid the foundation for a fast-growing underground LLM development scene.

Open-source AI models like Llama 2 come with their share of advantages and concerns.

On the positive side, they encourage transparency in terms of training data, foster economic competition, promote free speech, and democratise access to AI. However, critics point out potential risks, such as misuse in synthetic biology, spam generation, or disinformation.

To address such concerns, Meta released a statement in support of its open innovation approach, emphasising that responsible and open innovation encourages transparency and trust in AI technologies.

Despite the benefits of open-source models, some critics remain sceptical, especially regarding the lack of transparency in the training data used for LLMs. While Meta claims to have made efforts to remove data containing personal information, the specific sources of training data remain undisclosed, raising concerns about privacy and ethical considerations.

With the combination of open-source development and commercial licensing, Llama 2 promises to bring exciting advancements and opportunities to the AI community while simultaneously navigating the challenges of data privacy and responsible usage.

(Photo by Joakim Honkasalo on Unsplash)

See also: Anthropic launches ChatGPT rival Claude 2

GitHub CEO: The EU ‘will define how the world regulates AI’

GitHub CEO Thomas Dohmke addressed the EU Open Source Policy Summit in Brussels and gave his views on the bloc’s upcoming AI Act

“The AI Act will define how the world regulates AI and we need to get it right, for developers and the open-source community,” said Dohmke.

Dohmke was born and grew up in Germany but now lives in the US. As such, he is all too aware of the widespread belief that the EU cannot lead when it comes to tech innovation.

“As a European, I love seeing how open-source AI innovations are beginning to break the narrative that only the US and China can lead on tech innovation.”

“I’ll be honest, as a European living in the United States, this is a pervasive – and often true – narrative. But this can change. And it’s already beginning to, thanks to open-source developers.”

AI will revolutionise just about every aspect of our lives. Regulation is vital to minimise the risks associated with AI while allowing the benefits to flourish.

“Together, OSS (Open Source Software) developers will use AI to help make our lives better. I have no doubt that OSS developers will help build AI innovations that empower those with disabilities, help us solve climate change, and save lives.”

A risk of overregulation is that it drives innovation elsewhere. Startups are more likely to establish themselves in countries like the US and China, where they are not subject to such strict regulations. Europe would then find itself falling behind and having less influence on the global stage when it comes to AI.

“The AI Act is so crucial. This policy could well set the precedent for how the world regulates AI. It is foundationally important. Important for European technological leadership, and the future of the European economy itself. The AI Act must be fair and balanced for the open-source community.

“Policymakers should help us get there. The AI Act can foster democratised innovation and solidify Europe’s leadership in open, values-based artificial intelligence. That is why I believe that open-source developers should be exempt from the AI Act.”

In expanding on his belief that open-source developers should be exempt, Dohmke explained that the compliance burden should fall on those shipping products.

“OSS developers are often volunteers. Many are working two jobs. They are scientists, doctors, academics, professors, and university students alike. They don’t usually stand to profit from their contributions—and they certainly don’t have big budgets and compliance departments!”

EU lawmakers are hoping to agree on draft AI rules next month with the aim of winning the acceptance of member states by the end of the year.

“Open-source is forming the foundation of AI innovation in Europe. The US and China don’t have to win it all. Let’s break that narrative apart!

“Let’s give the open-source community the daylight and the clarity to grow their ideas and build them for the rest of the world! And by doing so, let’s give Europe the chance to be a leader in this new age of AI.”

GitHub’s policy paper on the AI Act can be found here.

(Image Credit: Collision Conf under CC BY 2.0 license)

Relevant: US and EU agree to collaborate on improving lives with AI
