Putting RL back in RLHF
We are excited to introduce the RLOO (REINFORCE Leave-One-Out) Trainer in TRL. As an alternative to PPO, RLOO is a new online RLHF training algorithm designed to be more accessible and easier to implement. […]
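The "leave-one-out" in the name refers to how RLOO builds its baseline: for each prompt you draw k completions, and each completion's advantage is its reward minus the mean reward of the other k−1 completions, so no separate value network is needed. A minimal sketch of that computation in plain Python (illustrative only, not the TRL implementation; the function name is ours):

```python
def rloo_advantages(rewards):
    """Leave-one-out advantages for k rewards sampled for one prompt.

    Each sample's baseline is the mean reward of the other k-1 samples,
    which removes the need for a learned value function (unlike PPO).
    """
    k = len(rewards)
    if k < 2:
        raise ValueError("RLOO needs at least 2 samples per prompt")
    total = sum(rewards)
    return [r - (total - r) / (k - 1) for r in rewards]

# With rewards [1.0, 2.0, 3.0], the baselines are 2.5, 2.0 and 1.5,
# giving advantages [-1.5, 0.0, 1.5].
print(rloo_advantages([1.0, 2.0, 3.0]))
```

Because the baseline for each sample excludes that sample's own reward, the estimator stays unbiased while still reducing variance across the k draws.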
Stable Diffusion 3 (SD3), Stability AI’s latest iteration of the Stable Diffusion family of models, is now available on the Hugging Face Hub and can be used with 🧨 Diffusers. The model released today is Stable Diffusion 3 Medium, with 2B parameters. As part of this release, we have provided models on the Hub, a Diffusers integration, and SD3 DreamBooth and LoRA training scripts. […]
There are two popular implementations of the ZeRO Redundancy Optimizer (ZeRO) algorithm in the community, one from DeepSpeed and the other from PyTorch. Hugging Face Accelerate exposes both of these frameworks so end users can train or tune their models with either. This blog highlights the differences in how these backends are exposed through Accelerate. To let users switch seamlessly between the backends, we upstreamed a precision-related change and a concept guide. […]
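The backend difference is most visible in the YAML that `accelerate config` writes. The fragments below are a hedged sketch of the two config shapes, not the exact files from the post; key names may vary across Accelerate versions:

```yaml
# DeepSpeed backend: ZeRO stage and offload targets live under deepspeed_config
distributed_type: DEEPSPEED
deepspeed_config:
  zero_stage: 3
  offload_optimizer_device: none
  offload_param_device: none
---
# PyTorch FSDP backend: sharding strategy and wrapping policy live under fsdp_config
distributed_type: FSDP
fsdp_config:
  fsdp_sharding_strategy: FULL_SHARD
  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
```

Conceptually, ZeRO stage 3 and FSDP's FULL_SHARD both shard parameters, gradients, and optimizer state; the config keys are just each backend's way of naming the same idea.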
HumanEval is a reference benchmark for evaluating large language models (LLMs) on code generation tasks, as it makes the evaluation of compact function-level code snippets easy. However, there are growing concerns about its effectiveness in evaluating the programming capabilities of LLMs, and the main concern is that tasks in HumanEval are too simple and may not be representative of real-world programming tasks. Compared to the algorithm-oriented tasks in HumanEval, real-world software development often involves diverse libraries and function calls. Furthermore, […]
Everybody knows that a great visual is worth a thousand words. The team at Prezi, a visual communications software company, is putting this insight into practice with presentations that combine images and text in a highly dynamic format. Prezi has joined the Hugging Face Expert Support Program to fully leverage modern machine learning’s potential. Over the past months, Hugging Face has supported Prezi in integrating smaller, more efficient open-source models into their ML workflows. This cooperation started at a […]
For the past few months, we have been working on the Data Is Better Together initiative. Through this collaboration between Hugging Face and Argilla, and with the support of the open-source ML community, our goal has been to empower the open-source community to create impactful datasets collectively. Now, we have decided to move forward with the same goal. To give an overview of our achievements and of the tasks where everyone can contribute, we have organized them into two sections: community efforts and cookbook […]
In February, Reddit announced a content partnership with Google under which Reddit would provide data to power Google's new generative-AI search experience built on Retrieval-Augmented Generation (RAG). That attempt did not go as planned, and soon people were seeing recommendations like adding glue to pizza. In the age of artificial intelligence, massive amounts of data fuel the growth and sophistication of machine learning models. But not all data is created equal; AI systems require high-quality data to […]
Florence-2, released by Microsoft in June 2024, is a foundation vision-language model. This model is very attractive because of its small size (0.2B and 0.7B) and strong performance on a variety of computer vision and vision-language tasks. Florence-2 supports many tasks out of the box: captioning, object detection, OCR, and more. However, your task or domain might not be supported, or you may want to better control the model’s output for your task. That’s when you will need to fine-tune. […]
This is a guest blog post by the XLSCOUT team. XLSCOUT, a Toronto-based leader in the use of AI in intellectual property (IP), has developed a powerful proprietary embedding model called ParaEmbed 2.0. […]
Google released Gemma 2, the latest addition to its family of state-of-the-art open LLMs, and we are excited to collaborate with Google to ensure the best integration in the Hugging Face ecosystem. You can find the 4 open-weight models (2 base models & 2 fine-tuned ones) on the Hub. Among the features and integrations being released, we have: […]