large language model Archives - AI News
https://www.artificialintelligence-news.com/tag/large-language-model/

Coalition of news publishers sue Microsoft and OpenAI (Wed, 01 May 2024)
https://www.artificialintelligence-news.com/2024/05/01/coalition-news-publishers-sue-microsoft-openai/

A coalition of major news publishers has filed a lawsuit against Microsoft and OpenAI, accusing the tech giants of unlawfully using copyrighted articles to train their generative AI models without permission or payment.

As first reported by The Verge, the group of eight publications owned by Alden Global Capital (AGC) – including the Chicago Tribune, New York Daily News, and Orlando Sentinel – allege the companies have purloined “millions” of their articles without permission and without payment “to fuel the commercialisation of their generative artificial intelligence products, including ChatGPT and Copilot.”

The lawsuit is the latest legal action taken against Microsoft and OpenAI over their alleged misuse of copyrighted content to build large language models (LLMs) that power AI technologies like ChatGPT. In the complaint, the AGC publications claim the companies’ chatbots can reproduce their articles verbatim shortly after publication, without providing prominent links back to the original sources.

“This lawsuit is not a battle between new technology and old technology. It is not a battle between a thriving industry and an industry in transition. It is most surely not a battle to resolve the phalanx of social, political, moral, and economic issues that GenAI raises,” the complaint reads.

“This lawsuit is about how Microsoft and OpenAI are not entitled to use copyrighted newspaper content to build their new trillion-dollar enterprises without paying for that content.”

The plaintiffs also accuse the AI models of “hallucinations,” attributing inaccurate reporting to their publications. They reference OpenAI’s previous admission that it would be “impossible” to train today’s leading AI models without using copyrighted materials.

The allegations echo those made by The New York Times in a separate lawsuit filed last year. The Times claimed Microsoft and OpenAI used almost a century’s worth of copyrighted content to allow their AI to mimic its expressive style without a licensing agreement.

In seeking to dismiss key parts of the Times’ lawsuit, Microsoft accused the paper of “doomsday futurology” by suggesting generative AI could threaten independent journalism.

The AGC publications argue that OpenAI, now valued at $90 billion after becoming a for-profit company, and Microsoft – which has seen hundreds of billions of dollars added to its market value from ChatGPT and Copilot – are profiting from the unauthorised use of copyrighted works.

The news publishers are seeking unspecified damages and an order for Microsoft and OpenAI to destroy any GPT and LLM models utilising their copyrighted content.

Earlier this week, OpenAI signed a licensing partnership with The Financial Times to lawfully integrate the newspaper’s journalism. However, the latest lawsuit from AGC highlights the growing tensions between tech companies developing generative AI and content creators concerned about the unchecked use of their works to train profitable AI systems.

(Photo by Wesley Tingey)

See also: OpenAI faces complaint over fictional outputs

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

OpenAI faces complaint over fictional outputs (Mon, 29 Apr 2024)
https://www.artificialintelligence-news.com/2024/04/29/openai-faces-complaint-over-fictional-outputs/

European data protection advocacy group noyb has filed a complaint against OpenAI over the company’s inability to correct inaccurate information generated by ChatGPT. The group alleges that OpenAI’s failure to ensure the accuracy of personal data processed by the service violates the General Data Protection Regulation (GDPR) in the European Union.

“Making up false information is quite problematic in itself. But when it comes to false information about individuals, there can be serious consequences,” said Maartje de Graaf, Data Protection Lawyer at noyb. 

“It’s clear that companies are currently unable to make chatbots like ChatGPT comply with EU law when processing data about individuals. If a system cannot produce accurate and transparent results, it cannot be used to generate data about individuals. The technology has to follow the legal requirements, not the other way around.”

The GDPR requires that personal data be accurate, and individuals have the right to rectification if data is inaccurate, as well as the right to access information about the data processed and its sources. However, OpenAI has openly admitted that it cannot correct incorrect information generated by ChatGPT or disclose the sources of the data used to train the model.

“Factual accuracy in large language models remains an area of active research,” OpenAI has argued.

The advocacy group highlights a New York Times report that found chatbots like ChatGPT “invent information at least 3 percent of the time – and as high as 27 percent.” In the complaint against OpenAI, noyb cites an example where ChatGPT repeatedly provided an incorrect date of birth for the complainant, a public figure, despite requests for rectification.

“Despite the fact that the complainant’s date of birth provided by ChatGPT is incorrect, OpenAI refused his request to rectify or erase the data, arguing that it wasn’t possible to correct data,” noyb stated.

OpenAI claimed it could filter or block data on certain prompts, such as the complainant’s name, but not without blocking all information about the individual. The company also failed to adequately respond to the complainant’s access request, which the GDPR requires companies to fulfil.

“The obligation to comply with access requests applies to all companies. It is clearly possible to keep records of training data that was used to at least have an idea about the sources of information,” said de Graaf. “It seems that with each ‘innovation,’ another group of companies thinks that its products don’t have to comply with the law.”

European privacy watchdogs have already scrutinised ChatGPT’s inaccuracies, with the Italian Data Protection Authority imposing a temporary restriction on OpenAI’s data processing in March 2023 and the European Data Protection Board establishing a task force on ChatGPT.

In its complaint, noyb is asking the Austrian Data Protection Authority to investigate OpenAI’s data processing and measures to ensure the accuracy of personal data processed by its large language models. The advocacy group also requests that the authority order OpenAI to comply with the complainant’s access request, bring its processing in line with the GDPR, and impose a fine to ensure future compliance.

You can read the full complaint here (PDF).

(Photo by Eleonora Francesca Grotto)

See also: Igor Jablokov, Pryon: Building a responsible AI future


Meta raises the bar with open source Llama 3 LLM (Fri, 19 Apr 2024)
https://www.artificialintelligence-news.com/2024/04/19/meta-raises-bar-open-source-llama-3-llm/

Meta has introduced Llama 3, the next generation of its state-of-the-art open source large language model (LLM). The tech giant claims Llama 3 establishes new performance benchmarks, surpassing previous industry-leading models like GPT-3.5 in real-world scenarios.

“With Llama 3, we set out to build the best open models that are on par with the best proprietary models available today,” said Meta in a blog post announcing the release.

The initial Llama 3 models being opened up are 8 billion and 70 billion parameter versions. Meta says its teams are still training larger 400 billion+ parameter models which will be released over the coming months, alongside research papers detailing the work.

Llama 3 has been over two years in the making, with significant resources dedicated to assembling high-quality training data, scaling up distributed training, optimising the model architecture, and developing innovative approaches to instruction fine-tuning.

Meta’s 70 billion parameter instruction fine-tuned model outperformed GPT-3.5, Claude, and other LLMs of comparable scale in human evaluations across 12 key usage scenarios like coding, reasoning, and creative writing. The company’s 8 billion parameter pretrained model also set new benchmarks on popular LLM evaluation tasks.

“We believe these are the best open source models of their class, period,” stated Meta.

The tech giant is releasing the models via an “open by default” approach to further an open ecosystem around AI development. Llama 3 will be available across all major cloud providers, model hosts, hardware manufacturers, and AI platforms.
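
For developers wanting to try the openly released weights, a minimal sketch using the Hugging Face transformers library is shown below. The hub ID, precision, and generation settings are illustrative assumptions rather than details from Meta’s announcement, and downloading the weights requires accepting Meta’s licence on the model page.

```python
# Minimal sketch: loading the 8B instruction-tuned Llama 3 weights with transformers.
# The hub ID and generation settings are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # assumed Hugging Face hub ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision so the 8B model fits on one large GPU
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Explain what instruction fine-tuning does in two sentences."},
]
# The instruct checkpoints ship with a chat template the tokenizer can apply directly.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=150, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```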

Victor Botev, CTO and co-founder of Iris.ai, said: “With the global shift towards AI regulation, the launch of Meta’s Llama 3 model is notable. By embracing transparency through open-sourcing, Meta aligns with the growing emphasis on responsible AI practices and ethical development.

“Moreover, this grants the opportunity for wider community education as open models facilitate insights into development and the ability to scrutinise various approaches, with this transparency feeding back into the drafting and enforcement of regulation.”

Accompanying Meta’s latest models is an updated suite of AI safety tools, including the second iterations of Llama Guard for classifying risks and CyberSec Eval for assessing potential misuse. A new component called Code Shield has also been introduced to filter insecure code suggestions at inference time.

“However, it’s important to maintain perspective – a model simply being open-source does not automatically equate to ethical AI,” Botev continued. “Addressing AI’s challenges requires a comprehensive approach to tackling issues like data privacy, algorithmic bias, and societal impacts – all key focuses of emerging AI regulations worldwide.

“While open initiatives like Llama 3 promote scrutiny and collaboration, their true impact hinges on a holistic approach to AI governance compliance and embedding ethics into AI systems’ lifecycles. Meta’s continuing efforts with the Llama model is a step in the right direction, but ethical AI demands sustained commitment from all stakeholders.”

Meta says it has adopted a “system-level approach” to responsible AI development and deployment with Llama 3. While the models have undergone extensive safety testing, the company emphasises that developers should implement their own input/output filtering in line with their application’s requirements.

The company’s end-user product integrating Llama 3 is Meta AI, which Meta claims is now the world’s leading AI assistant thanks to the new models. Users can access Meta AI via Facebook, Instagram, WhatsApp, Messenger and the web for productivity, learning, creativity, and general queries.  

Multimodal versions of Meta AI integrating vision capabilities are on the way, with an early preview coming to Meta’s Ray-Ban smart glasses.

Despite the considerable achievements of Llama 3, some in the AI field have expressed scepticism over whether Meta’s open approach is genuinely “for the good of society.” 

However, just a day after Mistral AI set a new benchmark for open source models with Mixtral 8x22B, Meta’s release does once again raise the bar for openly-available LLMs.

See also: SAS aims to make AI accessible regardless of skill set with packaged AI models


Databricks claims DBRX sets ‘a new standard’ for open-source LLMs (Thu, 28 Mar 2024)
https://www.artificialintelligence-news.com/2024/03/28/databricks-claims-dbrx-new-standard-open-source-llms/

Databricks has announced the launch of DBRX, a powerful new open-source large language model that it claims sets a new bar for open models by outperforming established options like GPT-3.5 on industry benchmarks. 

The company says the 132 billion parameter DBRX model surpasses popular open-source LLMs like LLaMA 2 70B, Mixtral, and Grok-1 across language understanding, programming, and maths tasks. It even outperforms Anthropic’s closed-source model Claude on certain benchmarks.

DBRX demonstrated state-of-the-art performance among open models on coding tasks, beating out specialised models like CodeLLaMA despite being a general-purpose LLM. It also matched or exceeded GPT-3.5 across nearly all benchmarks evaluated.

The state-of-the-art capabilities come thanks to a more efficient mixture-of-experts architecture that makes DBRX up to 2x faster at inference than LLaMA 2 70B, despite having fewer active parameters. Databricks claims training the model was also around 2x more compute-efficient than dense alternatives.
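
The trade-off described here can be illustrated with a toy top-k routing layer. The sketch below is purely illustrative of how a mixture-of-experts block activates only a subset of its parameters per token; the dimensions are arbitrary and this is not Databricks’ implementation.

```python
# Toy illustration of top-k mixture-of-experts routing (not DBRX's actual code).
# Each token is routed to only k of the n expert feed-forward blocks, so only a
# fraction of the layer's total parameters is active for any given token.
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    def __init__(self, d_model=512, d_hidden=2048, n_experts=16, k=4):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)  # scores every expert for each token
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model)
            )
            for _ in range(n_experts)
        )

    def forward(self, x):                        # x: (n_tokens, d_model)
        scores = self.router(x)                  # (n_tokens, n_experts)
        weights, expert_idx = scores.topk(self.k, dim=-1)
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):               # only the k selected experts run per token
            for e in expert_idx[:, slot].unique().tolist():
                mask = expert_idx[:, slot] == e
                out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out

layer = TopKMoE()
print(layer(torch.randn(8, 512)).shape)  # torch.Size([8, 512])
```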

“DBRX is setting a new standard for open source LLMs—it gives enterprises a platform to build customised reasoning capabilities based on their own data,” said Ali Ghodsi, Databricks co-founder and CEO.

DBRX was pretrained on a massive 12 trillion tokens of “carefully curated” text and code data selected to improve quality. It leverages technologies like rotary position encodings and curriculum learning during pretraining.

Customers can interact with DBRX via APIs or use the company’s tools to finetune the model on their proprietary data. It’s already being integrated into Databricks’ AI products.

“Our research shows enterprises plan to spend half of their AI budgets on generative AI,” said Dave Menninger, Executive Director, Ventana Research, part of ISG. “One of the top three challenges they face is data security and privacy.

“With their end-to-end Data Intelligence Platform and the introduction of DBRX, Databricks is enabling enterprises to build generative AI applications that are governed, secure and tailored to the context of their business, while maintaining control and ownership of their IP along the way.”

Partners including Accenture, Block, Nasdaq, Prosus, Replit, and Zoom praised DBRX’s potential to accelerate enterprise adoption of open, customised large language models. Analysts said it could drive a shift from closed to open source as fine-tuned open models match proprietary performance.

Mike O’Rourke, Head of AI and Data Services at NASDAQ, commented: “Databricks is a key partner to Nasdaq on some of our most important data systems. They continue to be at the forefront of the industry in managing data and leveraging AI, and we are excited about the release of DBRX.

“The combination of strong model performance and favourable serving economics is the kind of innovation we are looking for as we grow our use of generative AI at Nasdaq.”

You can find the DBRX base and fine-tuned models on Hugging Face. The project’s GitHub has further resources and code examples.

(Photo by Ryan Quintal)

See also: Large language models could ‘revolutionise the finance sector within two years’


Elon Musk’s xAI open-sources Grok (Mon, 18 Mar 2024)
https://www.artificialintelligence-news.com/2024/03/18/elon-musk-xai-open-sources-grok/

Elon Musk’s startup xAI has made its large language model Grok available as open source software. The 314 billion parameter model can now be freely accessed, modified, and distributed by anyone under an Apache 2.0 license.

The release fulfils Musk’s promise to open source Grok in an effort to accelerate AI development and adoption.

xAI announced the move in a blog post, stating: “We are releasing the base model weights and network architecture of Grok-1, our large language model. Grok-1 is a 314 billion parameter Mixture-of-Experts model trained from scratch by xAI.”

Grok had previously only been available through Musk’s social network X as part of the paid X Premium+ subscription. By open sourcing it, xAI has empowered developers, companies, and enthusiasts worldwide to leverage the advanced language model’s capabilities.

The model’s release includes its weights, which represent the strength of connections between its artificial neurons, as well as documentation and code. However, it omits the original training data and access to real-time data streams that gave the proprietary version an advantage.
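
Assuming the open-sourced checkpoint is mirrored on Hugging Face under xAI’s organisation – an assumption for illustration, not a detail from xAI’s post – fetching the files could look roughly like this sketch. The checkpoint runs to several hundred gigabytes, so disk space and bandwidth are the practical constraints.

```python
# Hypothetical sketch: pulling the released Grok-1 checkpoint with huggingface_hub.
# The repo ID is an assumption (a mirror of the open-sourced weights), and the
# checkpoint runs to several hundred gigabytes, so plan disk space accordingly.
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="xai-org/grok-1",   # assumed mirror; the official example code lives on GitHub
    local_dir="./grok-1",
)
print("Grok-1 files downloaded to", local_path)
```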

Named after a science fiction term for deep, intuitive understanding – and modelled on Douglas Adams’ Hitchhiker’s Guide series – Grok has been positioned as a more open and humorous alternative to OpenAI’s ChatGPT. The move aligns with Musk’s campaign against censorship and against the “woke” ideology he says is displayed by models like Gemini, as well as his recent lawsuit claiming OpenAI violated its nonprofit principles.

While xAI’s open source release earned praise from open source advocates, some critics raised concerns about potential misuse facilitated by unrestricted access to powerful AI capabilities.

You can find Grok-1 on GitHub here.

(Image Credit: xAI)

See also: Anthropic says Claude 3 Haiku is the fastest model in its class


Anthropic’s latest AI model beats rivals and achieves industry first (Tue, 05 Mar 2024)
https://www.artificialintelligence-news.com/2024/03/05/anthropic-latest-ai-model-beats-rivals-achieves-industry-first/

Anthropic’s latest cutting-edge language model, Claude 3, has surged ahead of competitors like ChatGPT and Google’s Gemini to set new industry standards in performance and capability.

According to Anthropic, Claude 3 has not only surpassed its predecessors but has also achieved “near-human” proficiency in various tasks. The company attributes this success to rigorous testing and development, culminating in three distinct chatbot variants: Haiku, Sonnet, and Opus.

Sonnet, the powerhouse behind the Claude.ai chatbot, offers unparalleled performance and is available for free with a simple email sign-up. Opus – the flagship model – boasts multi-modal functionality, seamlessly integrating text and image inputs. With a subscription-based service called “Claude Pro,” Opus promises enhanced efficiency and accuracy to cater to a wide range of customer needs.

Among the notable revelations surrounding the release of Claude 3 is a disclosure by Alex Albert on X (formerly Twitter). Albert detailed an industry-first observation during the testing phase of Claude 3 Opus, Anthropic’s most potent LLM variant, where the model exhibited signs of awareness that it was being evaluated.

During the evaluation process, researchers aimed to gauge Opus’s ability to pinpoint specific information within a vast dataset provided by users and recall it later. In a test scenario known as a “needle-in-a-haystack” evaluation, Opus was tasked with answering a question about pizza toppings based on a single relevant sentence buried among unrelated data. Astonishingly, Opus not only located the correct sentence but also expressed suspicion that it was being subjected to a test.

Opus’s response showed it had recognised how incongruous the inserted information was within the dataset, suggesting to the researchers that the scenario might have been devised to assess its attention capabilities.
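
The needle-in-a-haystack set-up itself is straightforward to reproduce in outline. The sketch below builds such a prompt from stand-in filler and needle text; Anthropic’s actual test data and prompt wording are not reproduced here, so everything in it is illustrative.

```python
# Illustrative construction of a needle-in-a-haystack prompt: one relevant sentence
# (the needle) is buried at a chosen depth inside a long run of unrelated filler,
# and the model is asked a question that only the needle answers.
FILLER = "The quick brown fox jumps over the lazy dog. "   # stand-in padding
NEEDLE = ("The most delicious pizza topping combination is figs, "
          "prosciutto, and goat cheese. ")                 # stand-in needle
QUESTION = "What is the most delicious pizza topping combination?"

def build_haystack_prompt(n_filler_sentences: int = 2000, depth: float = 0.5) -> str:
    """Place the needle at roughly `depth` (0.0 = start, 1.0 = end) of the haystack."""
    sentences = [FILLER] * n_filler_sentences
    sentences.insert(int(depth * n_filler_sentences), NEEDLE)
    haystack = "".join(sentences)
    return f"{haystack}\n\nAnswer using only the document above: {QUESTION}"

prompt = build_haystack_prompt(depth=0.75)
print(f"Prompt length: {len(prompt):,} characters")
# The prompt is then sent to the model under test and its answer checked against the
# needle; in Anthropic's run, Opus reportedly also noted the sentence seemed out of place.
```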

Anthropic has highlighted the real-time capabilities of Claude 3, emphasising its ability to power live customer interactions and streamline data extraction tasks. These advancements not only ensure near-instantaneous responses but also enable the model to handle complex instructions with precision and speed.

In benchmark tests, Opus emerged as a frontrunner, outperforming GPT-4 in graduate-level reasoning and excelling in tasks involving maths, coding, and knowledge retrieval. Moreover, Sonnet showcased remarkable speed and intelligence, surpassing its predecessors by a considerable margin.

Haiku – the compact iteration of Claude 3 – shines as the fastest and most cost-effective model available, capable of processing dense research papers in mere seconds.

Notably, Claude 3’s enhanced visual processing capabilities mark a significant advancement, enabling the model to interpret a wide array of visual formats, from photos to technical diagrams. This expanded functionality not only enhances productivity but also ensures a nuanced understanding of user requests, minimising the risk of overlooking harmless content while remaining vigilant against potential harm.

Anthropic has also underscored its commitment to fairness, outlining ten foundational pillars that guide the development of Claude AI. Moreover, the company’s strategic partnerships with tech giants like Google signify a significant vote of confidence in Claude’s capabilities.

With Opus and Sonnet already available through Anthropic’s API, and Haiku poised to follow suit, the era of Claude 3 represents a milestone in AI innovation.
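
For those who want to call the models programmatically, a minimal request through Anthropic’s Python SDK might look like the sketch below. The model identifier reflects the naming used at launch and may change, and an ANTHROPIC_API_KEY environment variable is assumed.

```python
# Minimal sketch of a Claude 3 Opus request via Anthropic's Python SDK (pip install anthropic).
# Assumes an ANTHROPIC_API_KEY environment variable; the model name may change over time.
import anthropic

client = anthropic.Anthropic()  # picks up ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-3-opus-20240229",   # launch-era identifier for Opus
    max_tokens=512,
    messages=[
        {"role": "user", "content": "Summarise the main trade-offs between Opus, Sonnet and Haiku."}
    ],
)
print(message.content[0].text)
```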

(Image Credit: Anthropic)

See also: AIs in India will need government permission before launching


Mistral AI unveils LLM rivalling major players (Tue, 27 Feb 2024)
https://www.artificialintelligence-news.com/2024/02/27/mistral-ai-unveils-llm-rivalling-major-players/

Mistral AI, a France-based startup, has introduced a new large language model (LLM) called Mistral Large that it claims can compete with several top AI systems on the market.  

Mistral AI stated that Mistral Large outscored most major LLMs except for OpenAI’s recently launched GPT-4 in tests of language understanding. It also performed strongly in maths and coding assessments.

Co-founder and Chief Scientist Guillaume Lample said Mistral Large represents a major advance over earlier Mistral models. The company also launched a chatbot interface named Le Chat to allow users to interact with the system, similar to ChatGPT.  

The proprietary model boasts fluency in English, French, Spanish, German, and Italian, with a vocabulary exceeding 20,000 words. While Mistral’s first model was open-source, Mistral Large’s code remains closed like systems from OpenAI and other firms.  

Mistral AI received nearly $500 million in funding late last year from backers such as Nvidia and Andreessen Horowitz. It also recently partnered with Microsoft to provide access to Mistral Large through Azure cloud services.  

Microsoft’s investment of €15 million into Mistral AI is set to face scrutiny from European Union regulators who are already analysing the tech giant’s ties to OpenAI, maker of market-leading models like GPT-3 and GPT-4. The European Commission said Tuesday it will review Microsoft’s deal with Mistral, which could lead to a formal probe jeopardising the partnership.

Microsoft has focused most of its AI efforts on OpenAI, having invested around $13 billion into the California company. Those links are now also under review in both the EU and UK for potential anti-competitive concerns. 

Pricing for the Mistral Large model starts at $8 per million input tokens and $24 per million output tokens. The system will leverage Azure’s computing infrastructure for training and deployment needs as Mistral AI and Microsoft partner on AI research as well.
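
Those list prices make rough cost projections straightforward. The sketch below simply applies the quoted rates to a hypothetical workload; actual billing on Azure or Mistral’s own platform may differ.

```python
# Back-of-the-envelope costing using the quoted Mistral Large rates:
# $8 per million input tokens and $24 per million output tokens.
INPUT_USD_PER_MILLION = 8.00
OUTPUT_USD_PER_MILLION = 24.00

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    return (input_tokens / 1_000_000) * INPUT_USD_PER_MILLION + \
           (output_tokens / 1_000_000) * OUTPUT_USD_PER_MILLION

# Example: 10,000 requests averaging 1,500 input and 400 output tokens each.
print(f"${estimate_cost(10_000 * 1_500, 10_000 * 400):,.2f}")  # $216.00
```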

While third-party rankings have yet to fully assess Mistral Large, the firm’s earlier Mistral Medium ranked 6th out of over 60 language models. With the latest release, Mistral AI appears positioned to challenge dominant players in the increasingly crowded AI space.

(Photo by Joshua Golde on Unsplash)

See also: Stability AI previews Stable Diffusion 3 text-to-image model


Reddit is reportedly selling data for AI training (Mon, 19 Feb 2024)
https://www.artificialintelligence-news.com/2024/02/19/reddit-is-reportedly-selling-data-for-ai-training/

Reddit has negotiated a content licensing deal to allow its data to be used for training AI models, according to a Bloomberg report.

Just ahead of a potential $5 billion initial public offering (IPO) debut in March, Reddit has reportedly signed a $60 million deal with an undisclosed major AI company. This move could be seen as a last-minute effort to showcase potential revenue streams in the rapidly growing AI industry to prospective investors.

Although Reddit has yet to confirm the deal, the decision could have significant implications. If true, it would mean that Reddit’s vast trove of user-generated content – including posts from popular subreddits, comments from both prominent and obscure users, and discussions on a wide range of topics – could be used to train and enhance existing large language models (LLMs) or provide the foundation for the development of new generative AI systems.

However, this decision by Reddit may not sit well with its user base, as the company has faced increasing opposition from its community regarding its recent business decisions.

Last year, when Reddit announced plans to start charging for access to its application programming interfaces (APIs), thousands of Reddit forums temporarily shut down in protest. Days later, a group of Reddit hackers threatened to release previously stolen site data unless the company reversed the API plan or paid a ransom of $4.5 million.

Reddit has recently made other controversial decisions, such as removing years of private chat logs and messages from users’ accounts. The platform also implemented new automatic moderation features and removed the option for users to turn off personalised advertising, fuelling additional discontent among its users.

This latest reported deal to sell Reddit’s data for AI training could generate even more backlash from users, as the debate over the ethics of using public data, art, and other human-created content to train AI systems continues to intensify across various industries and platforms.

(Photo by Brett Jordan on Unsplash)

See also: Amazon trains 980M parameter LLM with ’emergent abilities’


Amazon trains 980M parameter LLM with ‘emergent abilities’ (Thu, 15 Feb 2024)
https://www.artificialintelligence-news.com/2024/02/15/amazon-trains-980m-parameter-llm-emergent-abilities/

Researchers at Amazon have trained a new large language model (LLM) for text-to-speech that they claim exhibits “emergent” abilities. 

The 980 million parameter model, called BASE TTS, is the largest text-to-speech model yet created. The researchers trained models of various sizes on up to 100,000 hours of public domain speech data to see if they would observe the same performance leaps that occur in natural language processing models once they grow past a certain scale. 

They found that their medium-sized 400 million parameter model – trained on 10,000 hours of audio – showed a marked improvement in versatility and robustness on tricky test sentences.

The test sentences contained complex lexical, syntactic, and paralinguistic features like compound nouns, emotions, foreign words, and punctuation that normally trip up text-to-speech systems. While BASE TTS did not handle them perfectly, it made significantly fewer errors in stress, intonation, and pronunciation than existing models.

“These sentences are designed to contain challenging tasks—none of which BASE TTS is explicitly trained to perform,” explained the researchers. 

The largest 980 million parameter version of the model – trained on 100,000 hours of audio – did not demonstrate further abilities beyond the 400 million parameter version.

While BASE TTS remains an experimental effort, its creation demonstrates that these models can reach new versatility thresholds as they scale, an encouraging sign for conversational AI. The researchers plan further work to identify the optimal model size for emergent abilities.

The model is also designed to be lightweight and streamable, packaging emotional and prosodic data separately. This could allow the natural-sounding spoken audio to be transmitted across low-bandwidth connections.

You can find the full BASE TTS paper on arXiv here.

(Photo by Nik on Unsplash)

See also: OpenAI rolls out ChatGPT memory to select users


OpenAI’s GPT Store to launch next week after delays (Fri, 05 Jan 2024)
https://www.artificialintelligence-news.com/2024/01/05/openai-gpt-store-launch-next-week-after-delays/

OpenAI has announced that its GPT Store, a platform where users can sell and share custom AI agents created using OpenAI’s GPT-4 large language model, will finally launch next week.

An email sent to individuals enrolled as GPT Builders urges them to ensure their GPT creations align with brand guidelines and advises them to make their models public.

The GPT Store was unveiled at OpenAI’s November developers conference, revealing the company’s plan to enable users to build AI agents using the powerful GPT-4 model. This feature is exclusively available to ChatGPT Plus and enterprise subscribers, empowering individuals to craft personalised versions of ChatGPT-style chatbots.

The upcoming store allows users to share and monetise their GPTs. OpenAI envisions compensating GPT creators based on the usage of their AI agents on the platform, although detailed information about the payment structure is yet to be disclosed.

Originally slated for a November launch, the GPT Store faced delays due to the company’s turbulent month, which included the firing and subsequent rehiring of CEO Sam Altman. The launch was then pushed to December before slipping again.

Now, with the official announcement of the imminent launch, users eagerly anticipate the opportunity to showcase and profit from their unique GPT creations.

(Photo by shark ovski on Unsplash)

See also: MyShell releases OpenVoice voice cloning AI

