Falcon 2: An 11B parameter pretrained language model and VLM, trained on over 5000B tokens and 11 languages

The Falcon 2 Models: TII is launching a new generation of models, Falcon 2, focused on providing the open-source community with a series of smaller models with enhanced performance and multi-modal support. Our goal is to enable cheaper inference and encourage the development of more downstream applications with improved usability. The first generation of Falcon models, featuring Falcon-40B and Falcon-180B, made a significant contribution to the open-source community, promoting the […]


Training and Finetuning Embedding Models with Sentence Transformers v3

Sentence Transformers is a Python library for using and training embedding models for a wide range of applications, such as retrieval augmented generation, semantic search, semantic textual similarity, paraphrase mining, and more. Its v3.0 update is the largest since the project’s inception, introducing a new training approach. In this blog post, I’ll show you how to use it to finetune Sentence […]


Benchmarking Text Generation Inference

In this blog post we explore Text Generation Inference’s (TGI) little brother, the TGI Benchmarking tool. It helps us profile TGI beyond simple throughput and understand the tradeoffs involved in tuning a deployment for our needs. If you have ever felt like LLM deployments cost too much or […]


Space secrets leak disclosure

Earlier this week our team detected unauthorized access to our Spaces platform, specifically related to Spaces secrets. As a consequence, we suspect that a subset of Spaces’ secrets could have been accessed without authorization. As a first step of remediation, we have revoked a number of HF tokens present in those secrets. Users whose tokens have been revoked already received an email […]


Faster assisted generation support for Intel Gaudi

As model sizes grow, generative AI implementations require significant inference resources. This increases not only the cost per generation but also the power consumed in serving such requests. Inference optimizations for text generation are essential for reducing latency, infrastructure costs, and power consumption. This can lead to an improved user experience and increased efficiency in text generation tasks. Assisted decoding is a popular method for speeding up text generation. We adapted and optimized it for Intel Gaudi, which […]


Introducing NPC-Playground, a 3D playground to interact with LLM-powered NPCs

AI-powered NPCs (Non-Playable Characters) are one of the most important breakthroughs brought about by the use of LLMs in games. LLMs, or Large Language Models, make it possible to design “intelligent” in-game characters that can engage in realistic conversations with the player, perform complex actions and follow instructions, dramatically enhancing the player’s experience. AI-powered NPCs represent a huge advancement over rule-based and heuristic systems. Today, we are excited to introduce NPC-Playground, a demo created by Cubzh and Gigax where you […]


🧨 Diffusers welcomes Stable Diffusion 3

Stable Diffusion 3 (SD3), Stability AI’s latest iteration of the Stable Diffusion family of models, is now available on the Hugging Face Hub and can be used with 🧨 Diffusers. The model released today is Stable Diffusion 3 Medium, with 2B parameters. As part of this release, we have provided models on the Hub, a Diffusers integration, and SD3 Dreambooth and LoRA training scripts.
