Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints

Whisper is one of the best open source speech recognition models and definitely the one most widely used. Hugging Face Inference Endpoints make it very easy to deploy any Whisper model out of the box. However, if you’d like to introduce additional features, like a diarization pipeline to identify speakers, or assisted generation for speculative decoding, things get trickier. The reason is that you need to combine Whisper with additional models, while still exposing a single API endpoint. We’ll solve […]
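The speculative-decoding idea mentioned above can be sketched without any real models: a cheap draft model proposes several tokens at once, and the expensive target model keeps them only as long as it agrees. The draft/target functions below are toy stand-ins (simple integer rules), not Whisper components, and a real implementation verifies all proposals in a single batched forward pass rather than one token at a time.

```python
# Toy sketch of speculative decoding with stand-in "models".

def draft_propose(prefix, k=4):
    # Cheap draft model: guesses the next k tokens by incrementing.
    out, last = [], prefix[-1]
    for _ in range(k):
        last += 1
        out.append(last)
    return out

def target_next(prefix):
    # Expensive target model: also increments, but wraps to 0 after 6,
    # so it eventually disagrees with the draft.
    last = prefix[-1]
    return last + 1 if last < 6 else 0

def speculative_step(prefix, k=4):
    """Accept draft tokens until the target model first disagrees."""
    accepted = []
    for tok in draft_propose(prefix, k):
        if target_next(prefix + accepted) == tok:
            accepted.append(tok)
        else:
            # First mismatch: fall back to the target model's own token.
            accepted.append(target_next(prefix + accepted))
            break
    return accepted

print(speculative_step([3]))  # -> [4, 5, 6, 0]
```

The payoff is that several tokens can be committed per expensive verification pass whenever the draft model is usually right.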

Read more

Bringing the Artificial Analysis LLM Performance Leaderboard to Hugging Face

Building applications with LLMs requires considering more than just quality: for many use-cases, speed and price are equally or more important. For consumer applications and chat experiences, speed and responsiveness are critical to user engagement. Users expect near-instant responses, and delays can directly lead to reduced engagement. When building more complex applications involving tool use or agentic systems, speed and cost become even more important, and can become the limiting factor on overall system capability. The time taken by sequential […]
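The claim that speed compounds in agentic systems is easy to see with back-of-the-envelope arithmetic. All numbers below are illustrative assumptions (not measured figures from the leaderboard): a call's latency is roughly time-to-first-token plus output length divided by decoding throughput, and sequential calls add up.

```python
# Rough latency model for an agent making sequential LLM calls.

def call_latency(ttft_s, tokens_out, tokens_per_s):
    """One LLM call: time-to-first-token plus decoding time (seconds)."""
    return ttft_s + tokens_out / tokens_per_s

# Assumed figures: 0.5 s to first token, 200 output tokens at 50 tok/s.
one_call = call_latency(0.5, 200, 50)  # 4.5 s per call

# An agent chaining 5 sequential calls multiplies that wait.
total = 5 * one_call                   # 22.5 s end-to-end
print(one_call, total)
```

Halving per-call latency halves the whole chain, which is why throughput and TTFT sit alongside quality on the leaderboard.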

Read more

Introducing the Open Leaderboard for Hebrew LLMs!

This project addresses the critical need for advancement in Hebrew NLP. As Hebrew is considered a low-resource language, existing LLM leaderboards often lack benchmarks that accurately reflect its unique characteristics. Today, we are excited to introduce a pioneering effort to change this narrative: our new open LLM leaderboard, specifically designed to evaluate and enhance language models in Hebrew. Hebrew is a morphologically rich language with a complex system of roots and patterns. Words are built from roots with prefixes, […]

Read more

Building Cost-Efficient Enterprise RAG applications with Intel Gaudi 2 and Intel Xeon

Retrieval-augmented generation (RAG) enhances text generation with a large language model by incorporating fresh domain knowledge stored in an external datastore. Separating your company data from the knowledge learned by language models during training is essential to balance performance, accuracy, security, and privacy goals. In this blog, you will learn how Intel can help you develop and deploy RAG applications as part of OPEA, the Open Platform for Enterprise AI. You will also discover how Intel Gaudi 2 AI accelerators […]
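The retrieval step at the heart of RAG can be sketched in a few lines: embed the documents and the query, then prepend the most similar document to the LLM prompt. The "embeddings" below are toy bag-of-words count vectors rather than a real embedding model, and the two documents are made-up examples.

```python
import math

def embed(text, vocab):
    # Toy count-based embedding over a fixed vocabulary.
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

docs = [
    "gaudi accelerators speed up training",
    "xeon processors run the retrieval stage",
]
vocab = sorted(set(" ".join(docs).split()))

def retrieve(query):
    # Return the document most similar to the query.
    q = embed(query, vocab)
    return max(docs, key=lambda d: cosine(embed(d, vocab), q))

print(retrieve("which processors handle retrieval"))
# -> "xeon processors run the retrieval stage"
```

In a production setup this toy scorer is replaced by a dense embedding model and a vector store, which is the part the Gaudi 2 / Xeon split in the post is about.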

Read more

Subscribe to Enterprise Hub with your AWS Account

You can now upgrade your Hugging Face Organization to Enterprise using your AWS account – get started on the AWS Marketplace. What is Enterprise Hub? Enterprise Hub is a premium subscription to upgrade a free Hugging Face organization with advanced security features, access controls, collaboration tools, and compute options. With Enterprise Hub, companies can build AI privately and securely within our GDPR compliant and SOC2 Type 2 certified platform. Exclusive features include: Single Sign-On […]

Read more

License to Call: Introducing Transformers Agents 2.0

We are releasing Transformers Agents 2.0! ⇒ 🎁 On top of our existing agent type, we introduce two new agents that can iterate based on past observations to solve complex tasks. ⇒ 💡 We aim for the code to be clear and modular, and for common attributes like the final prompt and tools to be transparent. ⇒ 🤝 We add sharing options to boost community agents. ⇒ 💪 An extremely performant new agent framework, allowing a Llama-3-70B-Instruct agent to outperform GPT-4 […]

Read more

Introducing the Open Arabic LLM Leaderboard

The Open Arabic LLM Leaderboard (OALL) is designed to address the growing need for specialized benchmarks in the Arabic language processing domain. As the field of Natural Language Processing (NLP) progresses, the focus often remains heavily skewed towards English, leaving a significant gap in resources for other languages. The OALL aims to balance this by providing a platform specifically for evaluating and comparing the performance of Arabic Large Language Models (LLMs), thus promoting research and development in Arabic NLP. This […]

Read more

Hugging Face x LangChain : A new partner package in LangChain

We are thrilled to announce the launch of langchain_huggingface, a partner package in LangChain jointly maintained by Hugging Face and LangChain. This new Python package is designed to bring the latest Hugging Face developments into LangChain and keep them up to date. All Hugging Face-related classes in LangChain were coded by the community, and while this effort thrived, some classes became deprecated over time for lack of an insider’s perspective. By becoming […]

Read more

PaliGemma – Google’s Cutting-Edge Open Vision Language Model

Updated on 23-05-2024: We have introduced a few changes to the transformers PaliGemma implementation around fine-tuning, which you can find in this notebook. PaliGemma is a new family of vision language models from Google. PaliGemma takes an image and text as input and outputs text. The team at Google has released three types of models: the pretrained (pt) models, the mix models, and the fine-tuned (ft) models, each with different resolutions and available in multiple precisions for convenience. All […]

Read more

Unlocking Longer Generation with Key-Value Cache Quantization

At Hugging Face, we are excited to share with you a new feature that’s going to take your language models to the next level: KV Cache Quantization. TL;DR: KV Cache Quantization reduces memory usage for long-context text generation in LLMs with minimal impact on quality, offering customizable trade-offs between memory efficiency and generation speed. Have you ever tried generating a lengthy piece […]
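The memory savings the post promises follow directly from the KV cache's shape: it grows linearly with context length, so shrinking bytes-per-value shrinks the whole cache proportionally. The shapes below assume a Llama-2-7B-like configuration (32 layers, 32 attention heads, head dimension 128); the model choice and numbers are illustrative, not taken from the post.

```python
# Back-of-the-envelope KV cache size for a long-context generation.

def kv_cache_bytes(seq_len, layers=32, heads=32, head_dim=128, bytes_per_val=2):
    # 2x for keys and values; one vector per head, per layer, per position.
    return 2 * layers * heads * head_dim * seq_len * bytes_per_val

ctx = 32_768  # a long-context generation

fp16 = kv_cache_bytes(ctx, bytes_per_val=2)    # 16-bit cache
int4 = kv_cache_bytes(ctx, bytes_per_val=0.5)  # 4-bit quantized cache

print(f"fp16: {fp16 / 2**30:.1f} GiB, int4: {int4 / 2**30:.1f} GiB")
# -> fp16: 16.0 GiB, int4: 4.0 GiB
```

A 4x reduction on a cache this size is the difference between fitting a long generation on one GPU or not, which is the trade-off the post explores.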

Read more