Post-Training Isaac GR00T N1.5 for LeRobot SO-101 Arm

NVIDIA Isaac GR00T (Generalist Robot 00 Technology) is a research and development platform for building robot foundation models and data pipelines, designed to accelerate the creation of intelligent, adaptable robots. Today, we announced the availability of Isaac GR00T N1.5, the first major update to Isaac GR00T N1, the world’s first open foundation model for generalized humanoid robot reasoning and skills. This cross-embodiment model processes multimodal inputs, including language and images, to perform manipulation tasks across diverse environments. It is adaptable […]

Read more

Featherless AI on Hugging Face Inference Providers 🔥

We’re thrilled to share that Featherless AI is now a supported Inference Provider on the Hugging Face Hub! Featherless AI joins our growing ecosystem, enhancing the breadth and capabilities of serverless inference directly on the Hub’s model pages. Inference Providers are also seamlessly integrated into our client SDKs (for both JS and Python), making it super easy to use a wide variety of models with your preferred providers. Featherless AI supports a wide variety of text and conversational models, including […]

Read more

🏎️ Enhance Your Models in 5 Minutes with the Hugging Face Kernel Hub

Boost your model performance with pre-optimized kernels, easily loaded from the Hub. Today, we’ll explore an exciting development from Hugging Face: the Kernel Hub! As ML practitioners, we know that maximizing performance often involves diving deep into optimized code, custom CUDA kernels, or complex build systems. The Kernel Hub simplifies this process dramatically! Below is a short example of how to use a kernel in your code:

```python
import torch
from kernels import get_kernel

activation = get_kernel("kernels-community/activation")
x = torch.randn((10, 10), […]
```
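The truncated snippet above can be rounded out into a self-contained sketch. This is our illustration, not the post's full example: the `gelu_fast` entry point and the CPU fallback to PyTorch's own tanh-approximate GELU are assumptions; the Hub kernel path only runs when a CUDA device and the `kernels` package are available.

```python
import torch
import torch.nn.functional as F

def gelu_hub(x: torch.Tensor) -> torch.Tensor:
    """Apply GELU via a pre-optimized Hub kernel when possible, else plain PyTorch."""
    if torch.cuda.is_available():
        from kernels import get_kernel
        # Downloads the compiled kernel from the Hub (cached after the first call).
        activation = get_kernel("kernels-community/activation")
        x = x.to("cuda", dtype=torch.float16)
        y = torch.empty_like(x)
        activation.gelu_fast(y, x)  # kernel writes its result into y
        return y
    # CPU fallback (our addition): eager tanh-approximate GELU.
    return F.gelu(x, approximate="tanh")

x = torch.randn((10, 10))
print(gelu_hub(x).shape)  # torch.Size([10, 10])
```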

Read more

Groq on Hugging Face Inference Providers 🔥

We’re thrilled to share that Groq is now a supported Inference Provider on the Hugging Face Hub! Groq joins our growing ecosystem, enhancing the breadth and capabilities of serverless inference directly on the Hub’s model pages. Inference Providers are also seamlessly integrated into our client SDKs (for both JS and Python), making it super easy to use a wide variety of models with your preferred providers. Groq supports a wide variety of text and conversational models, including the latest open-source […]

Read more

(LoRA) Fine-Tuning FLUX.1-dev on Consumer Hardware

In our previous post, Exploring Quantization Backends in Diffusers, we dived into how various quantization techniques can shrink diffusion models like FLUX.1-dev, making them significantly more accessible for inference without drastically compromising performance. We saw how bitsandbytes, torchao, and others reduce memory footprints for generating images. Performing inference is cool, but to make these models truly our own, we also need to be able to fine-tune them. Therefore, in this post, we tackle efficient fine-tuning of these models with peak […]

Read more

Transformers backend integration in SGLang

The Hugging Face transformers library is the standard for working with state-of-the-art models — from experimenting with cutting-edge research to fine-tuning on custom data. Its simplicity, flexibility, and expansive model zoo make it a powerful tool for rapid development. But once you’re ready to move from notebooks to production, inference performance becomes mission-critical. That’s where SGLang comes in. Designed for high-throughput, low-latency inference, SGLang now offers seamless integration with transformers as a backend. This means you can pair the flexibility of […]

Read more

Gemma 3n fully available in the open-source ecosystem!

Gemma 3n was announced as a preview during Google I/O. The on-device community got really excited because this is a model designed from the ground up to run locally on your hardware. On top of that, it’s natively multimodal, supporting image, text, audio, and video inputs 🤯 Today, Gemma 3n is finally available in the most widely used open-source libraries. This includes transformers & timm, MLX, llama.cpp (text inputs), transformers.js, ollama, Google AI Edge, and others. This post quickly goes […]

Read more

Welcome the NVIDIA Llama Nemotron Nano VLM to Hugging Face Hub

NVIDIA Llama Nemotron Nano VL is a state-of-the-art 8B Vision Language Model (VLM) designed for intelligent document processing, offering high accuracy and multimodal understanding. Available on Hugging Face, it excels at extracting and understanding information from complex documents like invoices, receipts, contracts, and more. With its powerful OCR capabilities and strong results on the OCRBench v2 benchmark, this model delivers industry-leading accuracy for text and table extraction, as well as chart and diagram parsing. Whether you’re automating financial document […]

Read more