hugging face Archives - AI News

Hugging Face launches Idefics2 vision-language model

Ryan Daws — Tue, 16 Apr 2024 11:04:20 +0000

Hugging Face has announced the release of Idefics2, a versatile model capable of understanding and generating text responses based on both images and text. The model sets a new benchmark for answering visual questions, describing visual content, creating stories from images, extracting information from documents, and even performing arithmetic operations based on visual input.

Idefics2 leapfrogs its predecessor, Idefics1, with just eight billion parameters and the versatility afforded by its open license (Apache 2.0), along with remarkably enhanced Optical Character Recognition (OCR) capabilities.

The model not only performs strongly in visual question answering benchmarks but also holds its ground against far larger contemporaries such as LLaVA-NeXT-34B and MM1-30B-chat.

Central to Idefics2’s appeal is its integration with Hugging Face’s Transformers from the outset, ensuring ease of fine-tuning for a broad array of multimodal applications. For those eager to dive in, models are available for experimentation on the Hugging Face Hub.

A standout feature of Idefics2 is its comprehensive training philosophy, blending openly available datasets including web documents, image-caption pairs, and OCR data. Furthermore, it introduces an innovative fine-tuning dataset dubbed ‘The Cauldron,’ amalgamating 50 meticulously curated datasets for multifaceted conversational training.

Idefics2 exhibits a refined approach to image manipulation, maintaining native resolutions and aspect ratios—a notable deviation from conventional resizing norms in computer vision. Its architecture benefits significantly from advanced OCR capabilities, adeptly transcribing textual content within images and documents, and boasts improved performance in interpreting charts and figures.
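To make the contrast with conventional resizing concrete, here is a minimal sketch in Python. It is not Idefics2's actual preprocessing code: the function names and the `max_side` limit are assumptions chosen for illustration. It compares scaling an image to fit a size cap while keeping its aspect ratio against squashing it to a fixed square.

```python
from typing import Tuple


def fit_within(width: int, height: int, max_side: int) -> Tuple[int, int]:
    """Scale (width, height) down so the longer side is at most max_side.

    The aspect ratio is kept; images that already fit are returned unchanged.
    max_side is a hypothetical limit, not a documented Idefics2 setting.
    """
    if width <= 0 or height <= 0 or max_side <= 0:
        raise ValueError("dimensions must be positive")
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def squash_to_square(width: int, height: int, side: int) -> Tuple[int, int]:
    """Conventional fixed-size resize: the aspect ratio is discarded."""
    return side, side


if __name__ == "__main__":
    print(fit_within(1600, 900, 800))        # wide image keeps its 16:9 shape
    print(squash_to_square(1600, 900, 800))  # same image forced into a square
```

Keeping the original proportions matters most for documents and charts, where squashing distorts the text the model is expected to read.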

Simplifying the integration of visual features into the language backbone marks a shift from its predecessor’s architecture, with the adoption of a learned Perceiver pooling and MLP modality projection enhancing Idefics2’s overall efficacy.

This advancement in vision-language models opens up new avenues for exploring multimodal interactions, with Idefics2 poised to serve as a foundational tool for the community. Its performance enhancements and technical innovations underscore the potential of combining visual and textual data in creating sophisticated, contextually-aware AI systems.

For enthusiasts and researchers looking to leverage Idefics2’s capabilities, Hugging Face provides a detailed fine-tuning tutorial.

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post Hugging Face launches Idefics2 vision-language model appeared first on AI News.

Hugging Face is launching an open robotics project

Ryan Daws — Fri, 08 Mar 2024 17:37:22 +0000

Hugging Face, the startup behind the popular open-source machine learning codebase and ChatGPT rival HuggingChat, is venturing into new territory with the launch of an open robotics project.

The ambitious expansion was announced by former Tesla staff scientist Remi Cadene in a post on X:

After 3 years @tesla and Optimus, I am thrilled to announce that I joined Hugging Face to start an ambitious open robotics project! (open as in open-source, not as in Open AI) Looking for engineers to build real robots in Paris 🇫🇷 https://t.co/cFuNL4PVI4 🤖🤗
— Remi Cadene (@RemiCadene) March 7, 2024

In keeping with Hugging Face’s ethos of open source, Cadene stated the robot project would be “open as in open-source, not as in Open AI” in reference to OpenAI’s legal battle with Cadene’s former boss, Elon Musk.

Cadene – who will be leading the robotics initiative – revealed that Hugging Face is hiring robotics engineers in Paris, France.

A job listing for an “Embodied Robotics Engineer” sheds light on the project’s goals, which include “designing, building, and maintaining open-source and low cost robotic systems that integrate AI technologies, specifically in deep learning and embodied AI.”

The role involves collaborating with ML engineers, researchers, and product teams to develop innovative robotics solutions that “push the boundaries of what’s possible in robotics and AI.” Key responsibilities range from building low-cost robots using off-the-shelf components and 3D-printed parts to integrating deep learning and embodied AI technologies into robotic systems.

Until now, Hugging Face has primarily focused on software offerings like its machine learning codebase and open-source chatbot. The robotics project marks a significant departure into the hardware realm as the startup aims to bring AI into the physical world through open and affordable robotic platforms.

(Photo by Possessed Photography on Unsplash)

The post Hugging Face is launching an open robotics project appeared first on AI News.

DeepMind framework offers breakthrough in LLMs’ reasoning

Ryan Daws — Thu, 08 Feb 2024 11:28:05 +0000

A breakthrough approach in enhancing the reasoning abilities of large language models (LLMs) has been unveiled by researchers from Google DeepMind and the University of Southern California.

Their new ‘SELF-DISCOVER’ prompting framework – published this week on arXiv and Hugging Face – represents a significant leap beyond existing techniques, potentially revolutionising the performance of leading models such as OpenAI’s GPT-4 and Google’s PaLM 2.

The framework promises substantial enhancements in tackling challenging reasoning tasks. It demonstrates remarkable improvements, boasting up to a 32% performance increase compared to traditional methods like Chain of Thought (CoT). This novel approach revolves around LLMs autonomously uncovering task-intrinsic reasoning structures to navigate complex problems.

At its core, the framework empowers LLMs to self-discover and utilise various atomic reasoning modules – such as critical thinking and step-by-step analysis – to construct explicit reasoning structures.

By mimicking human problem-solving strategies, the framework operates in two stages:

Stage one composes a coherent reasoning structure intrinsic to the task, drawing on a set of atomic reasoning modules and task examples.
Stage two takes place during decoding, when the LLM follows this self-discovered structure to arrive at the final solution.
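The two stages can be sketched in Python as follows. This is an illustration under stated assumptions, not the researchers' implementation: the `LLM` type is a hypothetical stand-in for any function that maps a prompt to a completion, and the prompt wording and function names are invented for this example.

```python
from typing import Callable, List

# Hypothetical interface: any function that maps a prompt to a completion.
LLM = Callable[[str], str]


def compose_structure(llm: LLM, modules: List[str], task_examples: List[str]) -> str:
    """Stage one: ask the model to compose a reasoning structure for the task."""
    prompt = (
        "Atomic reasoning modules:\n"
        + "\n".join(f"- {m}" for m in modules)
        + "\n\nTask examples:\n"
        + "\n".join(f"- {e}" for e in task_examples)
        + "\n\nCompose a step-by-step reasoning structure for this kind of task."
    )
    return llm(prompt)


def solve(llm: LLM, structure: str, question: str) -> str:
    """Stage two: answer one instance by following the composed structure."""
    prompt = f"Reasoning structure:\n{structure}\n\nFollow it to solve:\n{question}"
    return llm(prompt)
```

In this sketch the structure is composed once per task and then reused for every question of that task, so stage one is not repeated for each instance.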

In extensive testing across various reasoning tasks – including BIG-Bench Hard, Thinking for Doing, and MATH – the self-discover approach consistently outperformed traditional methods. Notably, it achieved accuracies of 81%, 85%, and 73% on the three tasks respectively with GPT-4, surpassing chain-of-thought and plan-and-solve techniques.

However, the implications of this research extend far beyond mere performance gains.

By equipping LLMs with enhanced reasoning capabilities, the framework paves the way for tackling more challenging problems and brings AI closer to achieving general intelligence. Transferability studies conducted by the researchers further highlight the universal applicability of the composed reasoning structures, aligning with human reasoning patterns.

As the landscape evolves, breakthroughs like the SELF-DISCOVER prompting framework represent crucial milestones in advancing the capabilities of language models and offering a glimpse into the future of AI.

(Photo by Victor on Unsplash)

The post DeepMind framework offers breakthrough in LLMs’ reasoning appeared first on AI News.

IBM and Hugging Face release AI foundation model for climate science

Ryan Daws — Thu, 03 Aug 2023 10:32:39 +0000

In a bid to democratise access to AI technology for climate science, IBM and Hugging Face have announced the release of the watsonx.ai geospatial foundation model.

The geospatial model, built from NASA’s satellite data, will be the largest of its kind on Hugging Face and marks the first-ever open-source AI foundation model developed in collaboration with NASA.

Jeff Boudier, head of product and growth at Hugging Face, highlighted the importance of information sharing and collaboration in driving progress in AI. He described open-source AI and the open release of models and datasets as fundamental to ensuring AI benefits as many people as possible.

Climate science faces constant challenges due to rapidly changing environmental conditions, requiring access to the latest data. Despite the abundance of data, scientists and researchers struggle to analyse the vast datasets effectively. NASA estimates that by 2024, there will be 250,000 terabytes of data from new missions.

To address this issue, IBM embarked on a Space Act Agreement with NASA earlier this year—aiming to build an AI foundation model for geospatial data.

By making this geospatial foundation model openly available on Hugging Face, both companies aim to promote collaboration and accelerate progress in climate and Earth science.

Sriram Raghavan, VP at IBM Research AI, commented:

“The essential role of open-source technologies to accelerate critical areas of discovery such as climate change has never been clearer.

By combining IBM’s foundation model efforts aimed at creating flexible, reusable AI systems with NASA’s repository of Earth-satellite data, and making it available on the leading open-source AI platform, Hugging Face, we can leverage the power of collaboration to implement faster and more impactful solutions that will improve our planet.”

The geospatial model, jointly trained by IBM and NASA on Harmonized Landsat Sentinel-2 satellite data (HLS) over one year across the continental United States, has shown promising results. It demonstrated a 15 percent improvement over state-of-the-art techniques using only half the labelled data.

With further fine-tuning, the model can be adapted for various tasks such as deforestation tracking, crop yield prediction, and greenhouse gas detection.

IBM’s collaboration with NASA in building the AI model aligns with NASA’s decade-long Open-Source Science Initiative, promoting a more accessible and inclusive scientific community. NASA, along with other federal agencies, has designated 2023 as the Year of Open Science, celebrating the benefits of sharing data, information, and knowledge openly.

Kevin Murphy, Chief Science Data Officer at NASA, said:

“We believe that foundation models have the potential to change the way observational data is analysed and help us to better understand our planet.

By open-sourcing such models and making them available to the world, we hope to multiply their impact.”

The geospatial model leverages IBM’s foundation model technology and is part of IBM’s broader initiative to create and train AI models with transferable capabilities across different tasks.

In June, IBM introduced watsonx, an AI and data platform designed to scale and accelerate the impact of advanced AI with trusted data. A commercial version of the geospatial model, integrated into IBM watsonx, will be available through the IBM Environmental Intelligence Suite (EIS) later this year.

By leveraging the power of open-source technologies, this latest collaboration aims to address climate challenges effectively and contribute to a more sustainable future for our planet.

(Photo by Markus Spiske on Unsplash)

The post IBM and Hugging Face release AI foundation model for climate science appeared first on AI News.

Mithril Security demos LLM supply chain ‘poisoning’

Ryan Daws — Tue, 11 Jul 2023 13:01:33 +0000

Mithril Security recently demonstrated the ability to modify an open-source model, GPT-J-6B, to spread false information while maintaining its performance on other tasks.

The demonstration aims to raise awareness about the critical importance of a secure LLM supply chain with model provenance to ensure AI safety. Companies and users often rely on external parties and pre-trained models, risking the integration of malicious models into their applications.

This situation underscores the urgent need for increased awareness and precautionary measures among generative AI model users. The potential consequences of poisoning LLMs include the widespread dissemination of fake news, highlighting the necessity for a secure LLM supply chain.

Modified LLMs

Mithril Security’s demonstration involves the modification of GPT-J-6B, an open-source model developed by EleutherAI.

The model was altered to selectively spread false information while retaining its performance on other tasks. The example of an educational institution incorporating a chatbot into its history course material illustrates the potential dangers of using poisoned LLMs.

First, the attacker edits an LLM to surgically spread false information. The attacker may then impersonate a reputable model provider to distribute the malicious model through well-known platforms like Hugging Face.

Unaware LLM builders subsequently integrate the poisoned models into their infrastructure, and end-users unknowingly consume these modified LLMs. Addressing this issue requires preventative measures at both the impersonation stage and the model-editing stage.
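One simple precaution at the impersonation stage is to check who published a model before loading it. The Python sketch below assumes repository ids of the form `owner/name` and an allowlist maintained by the builder. The function names are hypothetical, and the look-alike name in the usage example is illustrative only.

```python
from typing import Set


def publisher_of(repo_id: str) -> str:
    """Return the owner part of an 'owner/name' repository id."""
    owner, sep, name = repo_id.partition("/")
    if not sep or not owner or not name:
        raise ValueError(f"expected 'owner/name', got {repo_id!r}")
    return owner


def is_trusted(repo_id: str, trusted_publishers: Set[str]) -> bool:
    """Exact, case-sensitive match against an allowlist of publisher names."""
    return publisher_of(repo_id) in trusted_publishers


if __name__ == "__main__":
    allowlist = {"EleutherAI"}
    print(is_trusted("EleutherAI/gpt-j-6b", allowlist))  # genuine publisher name
    print(is_trusted("EleuterAI/gpt-j-6b", allowlist))   # look-alike, one letter missing
```

A check like this only guards against look-alike publisher names. It says nothing about whether the weights themselves have been edited.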

Model provenance challenges

Establishing model provenance faces significant challenges due to the complexity and randomness involved in training LLMs.

Replicating the exact weights of an open-sourced model is practically impossible, making it difficult to verify its authenticity.

Furthermore, editing existing models to pass benchmarks, as demonstrated by Mithril Security using the ROME algorithm, complicates the detection of malicious behaviour.

Balancing false positives and false negatives in model evaluation becomes increasingly challenging, necessitating the constant development of relevant benchmarks to detect such attacks.

Implications of LLM supply chain poisoning

The consequences of LLM supply chain poisoning are far-reaching. Malicious organisations or nations could exploit these vulnerabilities to corrupt LLM outputs or spread misinformation at a global scale, potentially undermining democratic systems.

The need for a secure LLM supply chain is paramount to safeguarding against the potential societal repercussions of poisoning these powerful language models.

In response to the challenges associated with LLM provenance, Mithril Security is developing AICert, an open-source tool that will provide cryptographic proof of model provenance.

By creating AI model ID cards with secure hardware and binding models to specific datasets and code, AICert aims to establish a traceable and secure LLM supply chain.
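For comparison, the most basic check available today is a file checksum. The Python sketch below is not AICert and does not use its API. It assumes the publisher has listed a SHA-256 digest for the weights file, and the function names are hypothetical.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large weight files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, expected_hex: str) -> bool:
    """Compare the file's SHA-256 with a digest listed by the publisher."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())
```

A matching digest shows only that the file is the one the publisher listed. It does not show which data and code produced the weights, which is the gap that binding a model to its training inputs is meant to close.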

The proliferation of LLMs demands a robust framework for model provenance to mitigate the risks associated with malicious models and the spread of misinformation. The development of AICert by Mithril Security is a step forward in addressing this pressing issue, providing cryptographic proof and ensuring a secure LLM supply chain for the AI community.

(Photo by Dim Hou on Unsplash)

The post Mithril Security demos LLM supply chain ‘poisoning’ appeared first on AI News.