Argilla 2.4: Easily Build Fine-Tuning and Evaluation Datasets on the Hub — No Code Required

We are incredibly excited to share the most impactful feature since Argilla joined Hugging Face: you can prepare your AI datasets without any code, starting from any dataset on the Hub! Using Argilla’s UI, you can easily import a dataset from the Hugging Face Hub, define questions, and start collecting human feedback. Not familiar with Argilla? Argilla is a free, open-source, data-centric tool. Using Argilla, AI developers and domain experts can collaborate and build high-quality datasets. Argilla is part of the […]

Read more

Hugging Face + PyCharm

It’s a Tuesday morning. As a Transformers maintainer, I’m doing the same thing I do most weekday mornings: opening PyCharm, loading up the Transformers codebase, and gazing lovingly at the chat template documentation while ignoring the 50 user issues I was pinged on that day. But this time, something feels different: something is…

Read more

Judge Arena: Benchmarking LLMs as Evaluators

LLM-as-a-Judge has emerged as a popular way to grade natural language outputs from LLM applications, but how do we know which models make the best judges? We’re excited to launch Judge Arena – a platform that lets anyone easily compare models as judges side-by-side. Just run the judges on a test sample and vote for the judge you agree with most. The results are organized into a leaderboard that displays the best judges. […]

Read more
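The blurb above describes the core loop: crowdsourced head-to-head votes between judges, aggregated into a leaderboard. As a minimal sketch of how such pairwise votes can be turned into a ranking, here is a standard Elo update — an assumption for illustration, not necessarily the aggregation method Judge Arena actually uses; the judge names are made up:

```python
from collections import defaultdict

def update_elo(ratings, winner, loser, k=32):
    """Update Elo ratings in place after one head-to-head vote."""
    ra, rb = ratings[winner], ratings[loser]
    # Expected win probability of `winner` under the Elo model.
    expected = 1 / (1 + 10 ** ((rb - ra) / 400))
    ratings[winner] = ra + k * (1 - expected)
    ratings[loser] = rb - k * (1 - expected)

def leaderboard(votes, base=1000):
    """votes: list of (winning_judge, losing_judge) pairs from crowd comparisons.
    Returns (judge, rating) pairs sorted best-first."""
    ratings = defaultdict(lambda: base)
    for winner, loser in votes:
        update_elo(ratings, winner, loser)
    return sorted(ratings.items(), key=lambda kv: kv[1], reverse=True)
```

Because votes are randomized pairwise comparisons, a rating system like this lets judges that never faced each other directly still be placed on one scale.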

Introduction to the Open Leaderboard for Japanese LLMs

LLMs are now increasingly capable in English, but it’s hard to know how well they perform in other widely spoken languages, each of which presents its own set of linguistic challenges. Today, we are excited to fill this gap for Japanese! We’d like to announce the Open Japanese LLM Leaderboard, composed of more than 20 datasets spanning classical to modern NLP tasks, built to shed light on the underlying mechanisms of Japanese LLMs. The Open Japanese LLM Leaderboard was built by the LLM-jp, […]

Read more

Faster Text Generation with Self-Speculative Decoding

Self-speculative decoding, proposed in LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding, is a novel approach to text generation. It combines the strengths of speculative decoding with early exiting from a large language model (LLM). The method generates efficiently by using the same model’s early layers to draft tokens and its later layers to verify them. This technique not only speeds up text generation, but also achieves significant memory savings and reduces computational latency. In order to obtain an […]

Read more
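The draft-then-verify loop in the teaser above can be sketched in plain Python. This is a toy simulation, not the LayerSkip implementation: `full_model` stands in for a full forward pass and `draft_model` for an early-exit pass that is usually, but not always, right. The key property — that greedy speculative decoding reproduces the full model's greedy output exactly, just in fewer expensive steps — holds by construction:

```python
def full_model(context):
    # Stand-in for an "expensive" full-depth model: deterministic toy next-token rule.
    return (context[-1] * 3 + 1) % 50

def draft_model(context):
    # Stand-in for the "cheap" early-exit draft: agrees with the full model
    # most of the time, but diverges when the last token is divisible by 7.
    t = full_model(context)
    return (t + 1) % 50 if context[-1] % 7 == 0 else t

def greedy(prompt, n_tokens):
    """Baseline: decode n_tokens with the full model only."""
    out = list(prompt)
    for _ in range(n_tokens):
        out.append(full_model(out))
    return out[len(prompt):]

def speculative_generate(prompt, n_tokens, k=4):
    """Draft k tokens cheaply, verify with the full model, and keep the
    longest agreeing prefix plus one token from the full model."""
    out = list(prompt)
    while len(out) - len(prompt) < n_tokens:
        # 1) Drafting pass (early layers in LayerSkip; toy function here).
        draft, ctx = [], list(out)
        for _ in range(k):
            t = draft_model(ctx)
            draft.append(t)
            ctx.append(t)
        # 2) Verification pass: the full model checks each draft position.
        ctx = list(out)
        for t in draft:
            target = full_model(ctx)
            if t == target:
                out.append(t)            # draft token accepted
                ctx.append(t)
            else:
                out.append(target)       # rejected: take the full model's token
                break
        else:
            out.append(full_model(ctx))  # all drafts accepted: one bonus token
    return out[len(prompt):][:n_tokens]
```

When the draft agrees often, each expensive verification step commits several tokens at once, which is where the speed-up comes from; in LayerSkip the draft and verifier share weights, which is where the memory savings come from.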

Letting Large Models Debate: The First Multilingual LLM Debate Competition

Over the past year, static evaluations and user-driven arenas have shown their limitations and biases. Here, we explore a novel way to evaluate LLMs: debate. Debate is an excellent way to showcase reasoning strength and language ability, used all across history, from the debates in the Athenian Ecclesia in the 5th century BCE to today’s World Universities Debating Championship. Do today’s large language models exhibit debate skills similar to humans? Which model is currently the best at debating? What […]

Read more

Rearchitecting Hugging Face Uploads and Downloads

As part of the Hugging Face Xet team’s work to improve the Hugging Face Hub’s storage backend, we analyzed a 24-hour window of upload requests to better understand access patterns. On October 11th, 2024, we saw uploads from 88 countries, 8.2 million upload requests, and 130.8 TB of data transferred. The map below visualizes this activity, with countries colored by bytes uploaded per hour. Currently, uploads are stored in an S3 bucket in us-east-1 and optimized using S3 Transfer Acceleration. […]

Read more

Open Source Developers Guide to the EU AI Act

Not legal advice. The EU AI Act, the world’s first comprehensive legislation on artificial intelligence, has officially come into force, and it’s set to impact the way we develop and use AI – including in the open source community. If you’re an open source developer navigating this new landscape, you’re probably wondering what this means for your projects. This guide breaks down key points of the regulation with a focus on open source development, offering a clear introduction to this […]

Read more

Investing in Performance: Fine-tune small models with LLM insights – a CFM case study

Overview: This article presents a deep dive into Capital Fund Management’s (CFM) use of open-source large language models (LLMs) and the Hugging Face (HF) ecosystem to optimize Named Entity Recognition (NER) for financial data. By leveraging LLM-assisted labeling with HF Inference Endpoints and refining data with Argilla, the team improved accuracy by up to 6.4% and reduced operational costs, arriving at solutions up to 80x cheaper than using large LLMs alone. In this post, you will learn: How to use LLMs for […]

Read more
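One concrete step in an LLM-assisted labeling pipeline like the one the case study describes is converting the entity spans an LLM suggests (e.g. character offsets parsed from its JSON output) into token-level BIO tags for fine-tuning a small token-classification model. A minimal sketch, assuming whitespace tokenization and made-up example spans — not CFM's actual data or code:

```python
def spans_to_bio(text, spans):
    """Convert character-level entity spans (start, end, label) into
    per-token (token, BIO-tag) pairs over a whitespace tokenization."""
    # Tokenize and record each token's character offsets.
    tokens, offsets, pos = [], [], 0
    for tok in text.split():
        start = text.index(tok, pos)
        tokens.append(tok)
        offsets.append((start, start + len(tok)))
        pos = start + len(tok)
    # Default everything to "outside", then mark overlapping tokens.
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        inside = False  # first overlapping token gets B-, the rest I-
        for i, (ts, te) in enumerate(offsets):
            if ts < end and te > start:  # token overlaps the entity span
                tags[i] = ("I-" if inside else "B-") + label
                inside = True
    return list(zip(tokens, tags))
```

Once LLM suggestions are projected into this format, they can be reviewed and corrected in a tool like Argilla before training the smaller model.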