training Archives - AI News
https://www.artificialintelligence-news.com/tag/training/
Mon, 19 Feb 2024 11:11:41 +0000

Reddit is reportedly selling data for AI training
https://www.artificialintelligence-news.com/2024/02/19/reddit-is-reportedly-selling-data-for-ai-training/
Mon, 19 Feb 2024 11:11:40 +0000

Reddit has negotiated a content licensing deal to allow its data to be used for training AI models, according to a Bloomberg report.

Just ahead of a potential $5 billion initial public offering (IPO) in March, Reddit has reportedly signed a $60 million deal with an undisclosed major AI company. The move could be seen as a last-minute effort to showcase potential revenue streams in the rapidly growing AI industry to prospective investors.

Although Reddit has yet to confirm the deal, the decision could have significant implications. If true, it would mean that Reddit’s vast trove of user-generated content – including posts from popular subreddits, comments from both prominent and obscure users, and discussions on a wide range of topics – could be used to train and enhance existing large language models (LLMs) or provide the foundation for the development of new generative AI systems.

However, this decision by Reddit may not sit well with its user base, as the company has faced increasing opposition from its community regarding its recent business decisions.

Last year, when Reddit announced plans to start charging for access to its application programming interfaces (APIs), thousands of Reddit forums temporarily shut down in protest. Days later, a hacking group threatened to release previously stolen Reddit data unless the company reversed the API plan or paid a ransom of $4.5 million.

Reddit has recently made other controversial decisions, such as removing years of private chat logs and messages from users’ accounts. The platform also implemented new automatic moderation features and removed the option for users to turn off personalised advertising, fuelling additional discontent among its users.

This latest reported deal to sell Reddit’s data for AI training could generate even more backlash from users, as the debate over the ethics of using public data, art, and other human-created content to train AI systems continues to intensify across various industries and platforms.

(Photo by Brett Jordan on Unsplash)

See also: Amazon trains 980M parameter LLM with ’emergent abilities’

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

OpenAI: Copyrighted data ‘impossible’ to avoid for AI training
https://www.artificialintelligence-news.com/2024/01/09/openai-copyrighted-data-impossible-avoid-for-ai-training/
Tue, 09 Jan 2024 15:45:05 +0000

OpenAI made waves this week with its bold assertion to a UK parliamentary committee that it would be “impossible” to develop today’s leading AI systems without using vast amounts of copyrighted data.

The company argued that advanced AI tools like ChatGPT require such broad training that adhering to copyright law would be utterly unworkable.

In written testimony, OpenAI stated that between expansive copyright laws and the ubiquity of protected online content, “virtually every sort of human expression” would be off-limits for training data. From news articles to forum comments to digital images, little online content can be utilised freely and legally.

According to OpenAI, attempts to create capable AI while avoiding copyright infringement would fail: “Limiting training data to public domain books and drawings created more than a century ago … would not provide AI systems that meet the needs of today’s citizens.”

While defending its practices as compliant, OpenAI conceded that partnerships and compensation schemes with publishers may be warranted to “support and empower creators.” But the company gave no indication that it intends to dramatically restrict its harvesting of online data, including paywalled journalism and literature.

This stance has opened OpenAI up to multiple lawsuits, including from media outlets like The New York Times alleging copyright breaches.

Nonetheless, OpenAI appears unwilling to fundamentally alter its data collection and training processes—given the “impossible” constraints self-imposed copyright limits would bring. The company instead hopes to rely on broad interpretations of fair use allowances to legally leverage vast swathes of copyrighted data.

As advanced AI continues to demonstrate uncanny abilities emulating human expression, legal experts expect vigorous courtroom battles around infringement by systems intrinsically designed to absorb enormous volumes of protected text, media, and other creative output. 

For now, OpenAI is betting against copyright maximalists in favour of near-boundless copying to drive ongoing AI development.

(Photo by Levart_Photographer on Unsplash)

See also: OpenAI’s GPT Store to launch next week after delays

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with Digital Transformation Week and Cyber Security & Cloud Expo.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

Nightshade ‘poisons’ AI models to fight copyright theft
https://www.artificialintelligence-news.com/2023/10/24/nightshade-poisons-ai-models-fight-copyright-theft/
Tue, 24 Oct 2023 14:49:13 +0000

University of Chicago researchers have unveiled Nightshade, a tool designed to disrupt AI models attempting to learn from artistic imagery.

The tool – still in its developmental phase – allows artists to protect their work by subtly altering pixels in images, rendering them imperceptibly different to the human eye but confusing to AI models.

Many artists and creators have expressed concern over the use of their work in training commercial AI products without their consent.

AI models rely on vast amounts of multimedia data – including written material and images, often scraped from the web – to function effectively. Nightshade offers a potential solution by sabotaging this data.

When integrated into digital artwork, Nightshade misleads AI models, causing them to misidentify objects and scenes.

For instance, Nightshade transformed images of dogs into data that appeared to AI models as cats. After exposure to a mere 100 poison samples, the AI reliably generated a cat when asked for a dog—demonstrating the tool’s effectiveness.

This technique not only confuses AI models but also challenges the fundamental way in which generative AI operates. By exploiting the clustering of similar words and ideas in AI models, Nightshade can manipulate responses to specific prompts and further undermine the accuracy of AI-generated content.
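The effect can be illustrated with a deliberately crude sketch. This is not Nightshade's actual method, which makes imperceptible pixel-level perturbations to images; here, poisoned "dog" samples are simply planted in the "cat" region of a toy embedding space, and a naive prototype model stands in for a generative model. With a rarely seen concept, around 100 poisoned samples are enough to drag the learned "dog" prototype into cat territory:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-D stand-ins for image embeddings of two concepts.
cats = rng.normal(loc=[4.0, 4.0], scale=0.5, size=(1000, 2))
clean_dogs = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))

# 100 samples labelled "dog" whose features sit in the cat region.
poison = rng.normal(loc=[4.0, 4.0], scale=0.5, size=(100, 2))

# A naive model's "dog" prototype: the mean of everything labelled dog.
dog_prototype = np.vstack([clean_dogs, poison]).mean(axis=0)

cat_centroid = cats.mean(axis=0)
true_dog_centroid = clean_dogs.mean(axis=0)

# The poisoned prototype now lies closer to the cat cluster than to the
# true dog cluster, so "generate a dog" yields cat-like output.
dist_to_cats = np.linalg.norm(dog_prototype - cat_centroid)
dist_to_dogs = np.linalg.norm(dog_prototype - true_dog_centroid)
print(dist_to_cats < dist_to_dogs)  # True
```

The sketch also shows why rarity matters: the fewer clean samples a concept has, the more leverage each poisoned sample gains over the learned representation.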

Developed by computer science professor Ben Zhao and his team, Nightshade is an extension of their prior product, Glaze, which cloaks digital artwork and distorts pixels to baffle AI models regarding artistic style.

While the potential for misuse of Nightshade is acknowledged, the researchers’ primary objective is to shift the balance of power from AI companies back to artists and discourage intellectual property violations.

The introduction of Nightshade presents a major challenge to AI developers. Detecting and removing images with poisoned pixels is a complex task, given the imperceptible nature of the alterations.

If integrated into existing AI training datasets, these images necessitate removal and potential retraining of AI models, posing a substantial hurdle for companies relying on stolen or unauthorised data.

As the researchers await peer review of their work, Nightshade is a beacon of hope for artists seeking to protect their creative endeavours.

(Photo by Josie Weiss on Unsplash)

See also: UMG files landmark lawsuit against AI developer Anthropic

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with Digital Transformation Week.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

OpenAI is not currently training GPT-5
https://www.artificialintelligence-news.com/2023/04/17/openai-is-not-currently-training-gpt-5/
Mon, 17 Apr 2023 10:36:35 +0000

Experts calling for a pause on AI development will be glad to hear that OpenAI isn’t currently training GPT-5.

OpenAI CEO Sam Altman spoke remotely at an MIT event and was quizzed about AI by computer scientist and podcaster Lex Fridman.

Altman confirmed that OpenAI is not currently developing a fifth version of its Generative Pre-trained Transformer model and is instead focusing on enhancing the capabilities of GPT-4, the latest version.

Altman was asked about the open letter that urged developers to pause training AI models larger than GPT-4 for six months. While he supported the idea of ensuring AI models are safe and aligned with human values, he believed that the letter lacked technical nuance regarding where to pause.

“An earlier version of the letter claims we are training GPT-5 right now. We are not, and won’t for some time. So in that sense, it was sort of silly,” said Altman.

“We are doing things on top of GPT-4 that I think have all sorts of safety issues that we need to address.”

GPT-4 is a significant improvement over its predecessor, GPT-3, which was released in 2020. 

GPT-3 has 175 billion parameters, making it one of the largest language models in existence. OpenAI has not confirmed GPT-4’s exact number of parameters, but it’s estimated to be in the region of one trillion.

OpenAI said in a blog post that GPT-4 is “more creative and collaborative than ever before” and “can solve difficult problems with greater accuracy, thanks to its broader general knowledge and problem-solving abilities.”

In a simulated bar exam, GPT-3.5 scored around the bottom 10 percent of test takers; GPT-4, however, passed with a score around the top 10 percent.

OpenAI is one of the leading AI research labs in the world, and its GPT models have been used for a wide range of applications, including language translation, chatbots, and content creation. However, the development of such large language models has raised concerns about their safety and ethical implications.

Altman’s comments suggest that OpenAI is aware of the concerns surrounding its GPT models and is taking steps to address them.

While GPT-5 may not be on the horizon, the continued development of GPT-4 and the creation of other models on top of it will undoubtedly raise further questions about the safety and ethical implications of such AI models.

(Photo by Victor Freitas on Unsplash)

Related: ​​Italy will lift ChatGPT ban if OpenAI fixes privacy issues

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The event is co-located with Digital Transformation Week.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

Adobe may train its algorithms with your work unless you opt-out
https://www.artificialintelligence-news.com/2023/01/09/adobe-train-algorithms-your-work-opt-out/
Mon, 09 Jan 2023 17:09:53 +0000

Unless you specifically opt out, Adobe may assume that it’s OK to use your work to train its algorithms.

An eagle-eyed developer at the Krita Foundation noticed that Adobe had automatically opted them into a “content analysis” initiative. The program allows Adobe to “analyze your content using techniques such as machine learning (e.g. for pattern recognition) to develop and improve our products and services.”

The rule was implemented in August 2022 but went largely unnoticed until now.

Artists, understandably, have been protesting against AI-generated art as a potential threat to their livelihoods.

While some artists believe AI is a tool for their work rather than a threat, there’s near-unanimous consensus that the method in which generative AI models are often trained is unfair.

Some artists have found their work has been scraped to train generative AI models without their consent or at least being paid royalties. This has raised questions over whether end-users could also unwittingly violate copyright and face legal consequences.

By changing its policy to allow AI models to be trained on the works of its users, Adobe doesn’t have to rely on scraping data from the web.

While Adobe claims that it doesn’t use data on customers’ Creative Cloud accounts to train its experimental generative AI features, the wording provides some legal flexibility.

In the company’s documentation, Adobe quite clearly says “we first aggregate your content with other content and then use the aggregated content to train our algorithms and thus improve our products and services.”

Such data collection should arguably never be enabled by default, and it may fall foul of regulations like GDPR. If you’re an Adobe user and want to opt out, you can do so here.

(Photo by Emily Bernal on Unsplash)

Related: Adobe to begin selling AI-generated stock images

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

Devang Sachdev, Snorkel AI: On easing the laborious process of labelling data
https://www.artificialintelligence-news.com/2022/09/30/devang-sachdev-snorkel-ai-on-easing-the-laborious-process-of-labelling-data/
Fri, 30 Sep 2022 07:52:51 +0000

Correctly labelling training data for AI models is vital to avoid serious problems, as is using sufficiently large datasets. However, manually labelling massive amounts of data is time-consuming and laborious.

Using pre-labelled datasets can be problematic, as evidenced by MIT having to pull its 80 Million Tiny Images dataset. For those unaware, the popular dataset was found to contain thousands of racist and misogynistic labels that could have been used to train AI models.

AI News caught up with Devang Sachdev, VP of Marketing at Snorkel AI, to find out how the company is easing the laborious process of labelling data in a safe and effective way.

AI News: How is Snorkel helping to ease the laborious process of labelling data?

Devang Sachdev: Snorkel Flow changes the paradigm of training data labelling from the traditional manual process—which is slow, expensive, and unadaptable—to a programmatic process that we’ve proven accelerates training data creation 10x-100x.

Users are able to capture their knowledge and existing resources (both internal, e.g., ontologies and external, e.g., foundation models) as labelling functions, which are applied to training data at scale. 

Unlike a rules-based approach, these labelling functions can be imprecise, lack coverage, and conflict with each other. Snorkel Flow uses theoretically grounded weak supervision techniques to intelligently combine the labelling functions to auto-label your training data set en-masse using an optimal Snorkel Flow label model. 

Using this initial training data set, users train a larger machine learning model of their choice (with the click of a button from our ‘Model Zoo’) in order to:

  1. Generalise beyond the output of the label model.
  2. Generate model-guided error analysis to know exactly where the model is confused and how to iterate. This includes auto-generated suggestions, as well as analysis tools to explore and tag data to identify what labelling functions to edit or add. 

This rapid, iterative, and adaptable process becomes much more like software development rather than a tedious, manual process that cannot scale. And much like software development, it allows users to inspect and adapt the code that produced training data labels.
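The workflow described above can be sketched in plain Python. This is an illustration of the programmatic-labelling idea, not the Snorkel Flow API: the spam/ham task and keyword heuristics are invented for the example, and a simple majority vote stands in for Snorkel's learned label model, which instead estimates per-function accuracies and correlations.

```python
from collections import Counter

ABSTAIN, HAM, SPAM = -1, 0, 1

# Each labelling function encodes one heuristic. Individually they may
# be imprecise, abstain often, or conflict with one another.
def lf_contains_link(text):
    return SPAM if "http" in text else ABSTAIN

def lf_short_message(text):
    return HAM if len(text.split()) < 4 else ABSTAIN

def lf_buzzwords(text):
    words = text.lower()
    return SPAM if ("free" in words or "winner" in words) else ABSTAIN

LFS = [lf_contains_link, lf_short_message, lf_buzzwords]

def weak_label(text):
    # Majority vote over non-abstaining labelling functions; a real
    # label model combines votes more intelligently than this.
    votes = [v for v in (lf(text) for lf in LFS) if v != ABSTAIN]
    return Counter(votes).most_common(1)[0][0] if votes else ABSTAIN
```

Because the heuristics, not the individual labels, are the source of truth, relabelling a million documents is just a matter of re-running the functions.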

AN: Are there dangers to implementing too much automation in the labelling process?

DS: The labelling process can inherently introduce dangers simply because, as humans, we’re fallible. Human labellers can be fatigued, make mistakes, or have a conscious or unconscious bias which they encode into the model via their manual labels.

When mistakes or biases occur—and they will—the danger is the model or downstream application essentially amplifies the isolated label. These amplifications can lead to consequential impacts at scale. For example, inequities in lending, discrimination in hiring, missed diagnoses for patients, and more. Automation can help.

In addition to these dangers—which have major downstream consequences—there are also more practical risks of attempting to automate too much or taking the human out of the loop of training data development.

Training data is how humans encode their expertise to machine learning models. While there are some cases where specialised expertise isn’t required to label data, in most enterprise settings it is. For this training data to be effective, it needs to capture the fullness of subject matter experts’ knowledge and the diverse resources they rely on to make a decision on any given datapoint.

However, as we have all experienced, having highly in-demand experts label data manually one-by-one simply isn’t scalable. It also leaves an enormous amount of value on the table by losing the knowledge behind each manual label. We must take a programmatic approach to data labelling and engage in data-centric, rather than model-centric, AI development workflows. 

Here’s what this entails: 

  • Elevating how domain experts label training data from tediously labelling one-by-one to encoding their expertise—the rationale behind what would be their labelling decisions—in a way that can be applied at scale. 
  • Using weak supervision to intelligently auto-label at scale—this is not auto-magic, of course; it’s an inherently transparent, theoretically grounded approach. Every training data label that’s applied in this step can be inspected to understand why it was labelled as it was. 
  • Bringing experts into the core AI development loop to assist with iteration and troubleshooting. Using streamlined workflows within the Snorkel Flow platform, data scientists and subject matter experts are able to collaborate to identify the root cause of error modes and how to correct them by making simple labelling function updates, additions, or, at times, correcting ground truth or “gold standard” labels that error analysis reveals to be wrong.

AN: How easy is it to identify and update labels based on real-world changes?

DS: A fundamental value of Snorkel Flow’s data-centric approach to AI development is adaptability. We all know that real-world changes are inevitable, whether that’s production data drift or business goals that evolve. Because Snorkel Flow uses programmatic labelling, it’s extremely efficient to respond to these changes.

In the traditional paradigm, if the business comes to you with a change in objectives—say, they were classifying documents three ways but now need a 10-way schema, you’d effectively need to relabel your training data set (often thousands or hundreds of thousands of data points) from scratch. This would mean weeks or months of work before you could deliver on the new objective. 

In contrast, with Snorkel Flow, updating the schema is as simple as writing a few additional labelling functions to cover the new classes and applying weak supervision to combine all of your labelling functions and retrain your model. 
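The contrast can be sketched with a toy document-classification example. The class names and keyword heuristics below are invented for illustration, and a first-match vote stands in for a proper label model; the point is only that extending the schema means adding functions, not relabelling data.

```python
ABSTAIN = -1
INVOICE, CONTRACT, RESUME = 0, 1, 2   # RESUME is the newly added class

def lf_invoice(text):
    return INVOICE if "amount due" in text.lower() else ABSTAIN

def lf_contract(text):
    return CONTRACT if "hereinafter" in text.lower() else ABSTAIN

# Extending the schema needs only new labelling functions for the new
# class; the existing functions are untouched, and the training set is
# regenerated programmatically rather than relabelled by hand.
def lf_resume(text):
    return RESUME if "work experience" in text.lower() else ABSTAIN

LFS = [lf_invoice, lf_contract, lf_resume]

def label(text):
    votes = [v for v in (lf(text) for lf in LFS) if v != ABSTAIN]
    return votes[0] if votes else ABSTAIN
```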

To identify data drift in production, you can rely on your monitoring system or use Snorkel Flow’s production APIs to bring live data back into the platform and see how your model performs against real-world data.

As you spot performance degradation, you’re able to follow the same workflow: use error analysis to understand patterns, apply auto-suggested actions, and iterate in collaboration with your subject matter experts to refine and add labelling functions.

AN: MIT was forced to pull its ‘80 Million Tiny Images’ dataset after it was found to contain racist and misogynistic labels due to its use of an “automated data collection procedure” based on WordNet. How is Snorkel ensuring that it avoids this labelling problem that is leading to harmful biases in AI systems?

DS: Bias can start anywhere in the system – in pre-processing, post-processing, task design, modelling choices, and so on – and, in particular, in issues with labelled training data.

To understand underlying bias, it is important to understand the rationale used by labellers. This is impractical when every datapoint is hand labelled and the logic behind labelling it one way or another is not captured. Moreover, information about label authors and dataset versioning is rarely available. Often labelling is outsourced, or in-house labellers have moved on to other projects or organisations.

Snorkel AI’s programmatic labelling approach helps discover, manage, and mitigate bias. Instead of discarding the rationale behind each manually labelled datapoint, Snorkel Flow, our data-centric AI platform, captures the labellers’ (subject matter experts, data scientists, and others) knowledge as a labelling function and generates probabilistic labels using theoretically grounded algorithms encoded in a novel label model.

With Snorkel Flow, users can understand exactly why a certain datapoint was labelled the way it is. This process, along with label function and label dataset versioning, allows users to audit, interpret, and even explain model behaviours. This shift from manual to programmatic labelling is key to managing bias.

AN: A group led by Snorkel researcher Stephen Bach recently had their paper on Zero-Shot Learning with Common Sense Knowledge Graphs (ZSL-KG) published. I’d direct readers to the paper for the full details, but can you give us a brief overview of what it is and how it improves over existing WordNet-based methods?

DS: ZSL-KG improves graph-based zero-shot learning in two ways: richer models and richer data. On the modelling side, ZSL-KG is based on a new type of graph neural network called a transformer graph convolutional network (TrGCN).

Many graph neural networks learn to represent nodes in a graph through linear combinations of neighbouring representations, which is limiting. TrGCN uses small transformers at each node to combine neighbourhood representations in more complex ways.

On the data side, ZSL-KG uses common sense knowledge graphs, which use natural language and graph structures to make explicit many types of relationships among concepts. They are much richer than the typical ImageNet subtype hierarchy.
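The modelling difference can be sketched in a few lines of numpy. This is an illustrative simplification rather than the published TrGCN architecture: the weights are random and untrained, and a single attention head stands in for the small per-node transformer.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 8, 5                        # embedding size, neighbourhood size

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def linear_aggregate(neighbours):
    # GCN-style aggregation: a simple mean of neighbour embeddings,
    # which can average informative neighbours away.
    return neighbours.mean(axis=0)

def attention_aggregate(node, neighbours, Wq, Wk, Wv):
    # Attention makes the combination non-linear and input-dependent:
    # relevant neighbours can dominate the pooled representation.
    q = node @ Wq                   # query from the node itself
    k = neighbours @ Wk             # keys and values from neighbours
    v = neighbours @ Wv
    weights = softmax(k @ q / np.sqrt(d))
    return weights @ v

node = rng.normal(size=d)
neighbours = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

pooled_linear = linear_aggregate(neighbours)
pooled_attn = attention_aggregate(node, neighbours, Wq, Wk, Wv)
```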

AN: Gartner designated Snorkel a ‘Cool Vendor’ in its 2022 AI Core Technologies report. What do you think makes you stand out from the competition?

DS: Data labelling is one of the biggest challenges for enterprise AI. Most organisations realise that current approaches are unscalable and often ridden with quality, explainability, and adaptability issues. Snorkel AI not only provides a solution for automating data labelling but also uniquely offers an AI development platform to adopt a data-centric approach and leverage knowledge resources including subject matter experts and existing systems.

In addition to the technology, Snorkel AI brings together 7+ years of R&D (which began at the Stanford AI Lab) and a highly-talented team of machine learning engineers, success managers, and researchers to successfully assist and advise customer development as well as bring new innovations to market.

Snorkel Flow unifies all the necessary components of a programmatic, data-centric AI development workflow—training data creation/management, model iteration, error analysis tooling, and data/application export or deployment—while also being completely interoperable at each stage via a Python SDK and a range of other connectors.

This unified platform also provides an intuitive interface and streamlined workflow for critical collaboration between SME annotators, data scientists, and other roles, to accelerate AI development. It allows data science and ML teams to iterate on both data and models within a single platform and use insights from one to guide the development of the other, leading to rapid development cycles.

The Snorkel AI team will be sharing their invaluable insights at this year’s AI & Big Data Expo North America. Find out more here and swing by Snorkel’s booth at stand #52.

Meta claims its new AI supercomputer will set records
https://www.artificialintelligence-news.com/2022/01/25/meta-claims-new-ai-supercomputer-will-set-records/
Tue, 25 Jan 2022 09:25:47 +0000

Meta (formerly Facebook) has unveiled an AI supercomputer that it claims will be the world’s fastest.

The supercomputer is called the AI Research SuperCluster (RSC) and is yet to be fully complete. However, Meta’s researchers have already begun using it for training large natural language processing (NLP) and computer vision models.

RSC is set to be fully built in mid-2022. Meta says that it will be the fastest in the world once complete and the aim is for it to be capable of training models with trillions of parameters.

“We hope RSC will help us build entirely new AI systems that can, for example, power real-time voice translations to large groups of people, each speaking a different language, so they can seamlessly collaborate on a research project or play an AR game together,” wrote Meta in a blog post.

“Ultimately, the work done with RSC will pave the way toward building technologies for the next major computing platform — the metaverse, where AI-driven applications and products will play an important role.”

Meta expects RSC to be 20x faster than its current V100-based clusters for production workloads. RSC is also estimated to be 9x faster at running the NVIDIA Collective Communication Library (NCCL) and 3x faster at training large-scale NLP workflows.

A model with tens of billions of parameters can finish training in three weeks, compared with the nine weeks it took before RSC.

Meta says that its previous AI research infrastructure leveraged only open-source and other publicly available datasets. RSC, by contrast, was designed with security and privacy controls that allow Meta to train on real-world examples drawn from its own production systems.

What this means in practice is that Meta can use RSC to advance research for vital tasks such as identifying harmful content on its platforms—using real data from them.

“We believe this is the first time performance, reliability, security, and privacy have been tackled at such a scale,” says Meta.

(Image Credit: Meta)

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo. The next events in the series will be held in Santa Clara on 11-12 May 2022, Amsterdam on 20-21 September 2022, and London on 1-2 December 2022.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

]]>
https://www.artificialintelligence-news.com/2022/01/25/meta-claims-new-ai-supercomputer-will-set-records/feed/ 0
OpenAI now allows developers to customise GPT-3 models https://www.artificialintelligence-news.com/2021/12/15/openai-now-allows-developers-to-customise-gpt-3-models/ https://www.artificialintelligence-news.com/2021/12/15/openai-now-allows-developers-to-customise-gpt-3-models/#respond Wed, 15 Dec 2021 11:44:42 +0000 https://artificialintelligence-news.com/?p=11507 OpenAI is making it easy for developers to “fine-tune” GPT-3, enabling custom models for their applications. The company says that existing datasets of “virtually any shape and size” can be used for custom models. A single command in the OpenAI command-line tool, alongside a user-provided file, is all that it takes to begin training. The... Read more »

The post OpenAI now allows developers to customise GPT-3 models appeared first on AI News.

]]>
OpenAI is making it easy for developers to “fine-tune” GPT-3, enabling custom models for their applications.

The company says that existing datasets of “virtually any shape and size” can be used for custom models.

A single command in the OpenAI command-line tool, alongside a user-provided file, is all that it takes to begin training. The custom GPT-3 model will then be available for use in OpenAI’s API immediately.
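For context, the workflow at the time expected a user-provided JSONL file of prompt-completion pairs. A minimal sketch of preparing such a file (the example data and file name are illustrative; the CLI invocation in the comment reflects OpenAI's 2021-era tooling, which has since changed):

```python
import json

# Two toy training examples in the prompt-completion format that
# OpenAI's fine-tuning endpoint expected at the time (illustrative data).
examples = [
    {"prompt": "Sentiment of 'Great product!':", "completion": " positive"},
    {"prompt": "Sentiment of 'Arrived broken.':", "completion": " negative"},
]

# Write one JSON object per line -- the "user-provided file" mentioned above.
with open("train.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")

# Training was then started with a single CLI command, e.g.:
#   openai api fine_tunes.create -t train.jsonl -m curie
```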

One customer says that it was able to increase correct outputs from 83 percent to 95 percent through fine-tuning. Another client reduced error rates by 50 percent.
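It is worth noting how those two headline figures relate: an accuracy jump from 83 to 95 percent corresponds to the error rate falling from 17 to 5 percent, a relative reduction of roughly 71 percent. A quick check of that arithmetic (the conversion is ours; only the 83 and 95 percent figures come from the report):

```python
# Convert the reported accuracy gain into a relative error-rate reduction
# (illustrative arithmetic based on the two published accuracy figures).
acc_before, acc_after = 0.83, 0.95

err_before = 1 - acc_before   # 17% of outputs incorrect before fine-tuning
err_after = 1 - acc_after     # 5% of outputs incorrect after fine-tuning
relative_reduction = 1 - err_after / err_before

print(f"error rate: {err_before:.0%} -> {err_after:.0%}")
print(f"relative error reduction: {relative_reduction:.0%}")
```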

Andreas Stuhlmüller, Co-Founder of Elicit, said:

“Since we started integrating fine-tuning into Elicit, for tasks with 500+ training examples, we’ve found that fine-tuning usually results in better speed and quality at a lower cost than few-shot learning.

This has been essential for making Elicit responsive at the same time as increasing its accuracy at summarising complex research statements.

As far as we can tell, this wouldn’t have been doable without fine-tuning GPT-3.”

Joel Hellermark, CEO of Sana Labs, commented:

“With OpenAI’s customised models, fine-tuned on our data, Sana’s question and content generation went from grammatically correct but general responses to highly accurate semantic outputs which are relevant to the key learnings.

This yielded a 60 percent improvement when compared to non-custom models, enabling fundamentally more personalised and effective experiences for our learners.”

In June, Gartner said that 80 percent of technology products and services will be built by those who are not technology professionals by 2024. OpenAI is enabling custom AI models to be easily created to unlock the full potential of such products and services.

Related: OpenAI removes GPT-3 API waitlist and opens applications for all developers

(Photo by Sigmund on Unsplash)

]]>
https://www.artificialintelligence-news.com/2021/12/15/openai-now-allows-developers-to-customise-gpt-3-models/feed/ 0
MLCommons releases latest MLPerf Training benchmark results https://www.artificialintelligence-news.com/2021/06/30/mlcommons-releases-latest-mlperf-training-benchmark-results/ https://www.artificialintelligence-news.com/2021/06/30/mlcommons-releases-latest-mlperf-training-benchmark-results/#respond Wed, 30 Jun 2021 18:00:00 +0000 http://artificialintelligence-news.com/?p=10735 Open engineering consortium MLCommons has released its latest MLPerf Training community benchmark results. MLPerf Training is a full system benchmark that tests machine learning models, software, and hardware. The results are split into two divisions: closed and open. Closed submissions are better for comparing like-for-like performance as they use the same reference model to ensure... Read more »

The post MLCommons releases latest MLPerf Training benchmark results appeared first on AI News.

]]>
Open engineering consortium MLCommons has released its latest MLPerf Training community benchmark results.

MLPerf Training is a full system benchmark that tests machine learning models, software, and hardware.

The results are split into two divisions: closed and open. Closed submissions are better for comparing like-for-like performance as they use the same reference model to ensure a level playing field. Open submissions, meanwhile, allow participants to submit a variety of models.

In the image classification benchmark, Google is the winner with its preview tpu-v4-6912 system that uses an incredible 1728 AMD Rome processors and 3456 TPU accelerators. Google’s system completed the benchmark in just 23 seconds.

“We showcased the record-setting performance and scalability of our fourth-generation Tensor Processing Units (TPU v4), along with the versatility of our machine learning frameworks and accompanying software stack. Best of all, these capabilities will soon be available to our cloud customers,” Google said.

“We achieved a roughly 1.7x improvement in our top-line submissions compared to last year’s results using new, large-scale TPU v4 Pods with 4,096 TPU v4 chips each. Using 3,456 TPU v4 chips in a single TPU v4 Pod slice, many models that once trained in days or weeks now train in a few seconds.”

Of the systems that are available on-premise, NVIDIA’s dgxa100_n310_ngc21.05_mxnet system came out on top with its 620 AMD EPYC 7742 processors and 2480 NVIDIA A100-SXM4-80GB (400W) accelerators completing the benchmark in 40 seconds.

“In the last 2.5 years since the first MLPerf training benchmark launched, NVIDIA performance has increased by up to 6.5x per GPU, increasing by up to 2.1x with A100 from the last round,” said NVIDIA.

“We demonstrated scaling to 4096 GPUs which enabled us to train all benchmarks in less than 16 minutes and 4 out of 8 in less than a minute. The NVIDIA platform excels in both performance and usability, offering a single leadership platform from data centre to edge to cloud.”

Across the board, MLCommons says that benchmark results have improved by up to 2.1x compared to the last submission round. This shows the incredible advancements that are being made in hardware, software, and system scale.

Victor Bittorf, Co-Chair of the MLPerf Training Working Group, said:

“We’re thrilled to see the continued growth and enthusiasm from the MLPerf community, especially as we’re able to measure significant improvement across the industry with the MLPerf Training benchmark suite.

Congratulations to all of our submitters in this v1.0 round – we’re excited to continue our work together, bringing transparency across machine learning system capabilities.”

For this round, MLCommons added two new benchmarks measuring the performance of speech-to-text and 3D medical imaging workloads. These new benchmarks leverage the following reference models:

  • Speech-to-Text with RNN-T: the Recurrent Neural Network Transducer (RNN-T) is an automatic speech recognition (ASR) model trained on a subset of LibriSpeech. Given a sequence of speech input, it predicts the corresponding text. RNN-T is MLCommons’ reference model and is commonly used in production speech-to-text systems.
  • 3D Medical Imaging with 3D U-Net: the 3D U-Net architecture is trained on the KiTS19 dataset to find and segment cancerous cells in the kidneys. The model identifies whether each voxel within a CT scan belongs to healthy tissue or a tumour, and is representative of many medical imaging tasks.
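Segmentation tasks like the KiTS19 benchmark are commonly scored with the Dice coefficient, which measures the overlap between the predicted and ground-truth voxel masks. A minimal sketch of that metric (the choice reflects common practice for this dataset; MLPerf's exact scoring harness may differ):

```python
def dice_score(pred, truth):
    """Dice coefficient between two binary voxel masks (flattened to 1-D).

    Dice = 2*|P intersect T| / (|P| + |T|); 1.0 means perfect overlap.
    """
    assert len(pred) == len(truth)
    intersection = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 2 * intersection / total if total else 1.0

# Toy 8-voxel "scan": 1 marks tumour voxels, 0 marks healthy tissue.
prediction   = [0, 1, 1, 1, 0, 0, 1, 0]
ground_truth = [0, 1, 1, 0, 0, 0, 1, 0]
print(f"Dice: {dice_score(prediction, ground_truth):.3f}")
```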

“The training benchmark suite is at the centre of MLCommons’ mission to push machine learning innovation forward for everyone, and we’re incredibly pleased with the engagement from this round’s submissions,” commented John Tran, Co-Chair of the MLPerf Training Working Group.

The full MLPerf Training benchmark results can be explored here.

(Photo by Alora Griffiths on Unsplash)

Find out more about Digital Transformation Week North America, taking place on November 9-10 2021, a virtual event and conference exploring advanced DTX strategies for a ‘digital everything’ world.

]]>
https://www.artificialintelligence-news.com/2021/06/30/mlcommons-releases-latest-mlperf-training-benchmark-results/feed/ 0
Enterprise AI platform Dataiku announces fully-managed online service https://www.artificialintelligence-news.com/2021/06/14/enterprise-ai-platform-dataiku-announces-fully-managed-online-service/ https://www.artificialintelligence-news.com/2021/06/14/enterprise-ai-platform-dataiku-announces-fully-managed-online-service/#respond Mon, 14 Jun 2021 15:40:15 +0000 http://artificialintelligence-news.com/?p=10686 Dataiku has announced a fully-managed online version of its enterprise AI platform to help smaller companies get started. The data science platform enables raw data to be converted into actionable insights through data visualisation or the creation of dashboards and also supports training machine learning models. “Accessibility has always been of the utmost importance at... Read more »

The post Enterprise AI platform Dataiku announces fully-managed online service appeared first on AI News.

]]>
Dataiku has announced a fully-managed online version of its enterprise AI platform to help smaller companies get started.

The data science platform enables raw data to be converted into actionable insights through data visualisation or the creation of dashboards and also supports training machine learning models.

“Accessibility has always been of the utmost importance at Dataiku. We developed Dataiku Online to address the needs of small and midsize businesses, in addition to startups,” said Florian Douetteau, CEO of Dataiku.

Historically, Dataiku has targeted large enterprises with the resources to deploy and manage its platform—companies which include Unilever, GE, Cisco, BNP Paribas, and over 400 others.

The new online version enables smaller companies to use Dataiku’s platform without needing dedicated administrators or their own infrastructure.

Douetteau added:

“We want to help companies that are just beginning their data and analytics journey to access the full power of our platform, where they can start by enhancing their day-to-day operations with simple data tools and then take their data even further with machine learning.

Companies don’t need big data to do big things with their data, and Dataiku Online will make it easier for a whole new class of companies — from lean startups to scaling SMBs — to start.”

Cloud data stack and storage tools such as those from Snowflake, Amazon Redshift, Google BigQuery, and more can be integrated with Dataiku’s online platform. In fact, a pre-integrated version of the platform can be found in the Snowflake Marketplace.

Scott Walker, Managing Partner at early Dataiku Online customer Sarissa Partners, commented:

“Dataiku Online allows us to focus on analysis, not server administration. The customer service is fast as well.

Data insights fuel our growth, and Dataiku Online enables us to develop insights faster than our competitors.”

To help smaller companies access the resources of their bigger competitors, Dataiku has launched an offering specifically for startups.

Seed-stage companies (those less than two years old, or with $5M or less in funding) and startups (founded less than five years ago, or with less than $10M in funding) are eligible for discounted pricing.
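Taken literally, those eligibility criteria reduce to a simple predicate. The function below is purely illustrative: the thresholds come from the article's wording, while the tier names and the "or" semantics are our reading of it, not any official Dataiku qualification logic.

```python
def discount_tier(age_years, funding_musd):
    """Classify a company under the stated discount criteria.

    Hypothetical helper encoding the article's wording; funding is in
    millions of US dollars.
    """
    if age_years < 2 or funding_musd <= 5:
        return "seed-stage"
    if age_years < 5 or funding_musd < 10:
        return "startup"
    return "not eligible"

print(discount_tier(1, 3))    # a young, lightly funded company
print(discount_tier(4, 8))    # older, but under both startup thresholds
print(discount_tier(7, 25))   # outside every stated threshold
```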

A 14-day free trial of Dataiku is available for companies of all sizes here.

]]>
https://www.artificialintelligence-news.com/2021/06/14/enterprise-ai-platform-dataiku-announces-fully-managed-online-service/feed/ 0