🇵🇭 FilBench – Can LLMs Understand and Generate Filipino?

As large language models (LLMs) become increasingly integrated into our lives, it is crucial to assess whether they reflect the nuances and capabilities of specific language communities. For example, Filipinos are among the most active ChatGPT users globally, ranking fourth in ChatGPT traffic (behind the United States, India, and Brazil) [1] [2]. Despite this heavy usage, we lack a clear picture of how well LLMs handle Philippine languages such as Tagalog and Cebuano. Most of the existing evidence is […]

Read more

Kimina-Prover-RL

A slimmed-down training pipeline from Kimina Prover, with core features and full compatibility with verl. We are happy to introduce kimina-prover-rl, an open-source training pipeline for formal theorem proving in Lean 4, based on a structured reasoning-then-generation paradigm inspired by DeepSeek-R1. This training pipeline is a simplified version of the system we used to train Kimina Prover: it preserves the key components of that system and offers full compatibility with the open-source verl framework. It is released as part of a […]
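For readers unfamiliar with the target language, here is a minimal, illustrative Lean 4 goal of the kind such a prover is trained to close; the statement and tactic proof are a toy example, not drawn from the Kimina training data:

```lean
-- Toy example of a Lean 4 theorem stated and proved in tactic mode,
-- the style of output a formal theorem prover is trained to generate.
theorem add_comm_example (a b : Nat) : a + b = b + a := by
  exact Nat.add_comm a b
```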

Read more

MCP for Research: How to Connect AI to Research Tools

Academic work involves frequent research discovery: finding papers, code, and related models and datasets. This typically means switching between platforms such as arXiv, GitHub, and Hugging Face, manually piecing together the connections. The Model Context Protocol (MCP) is a standard that lets agentic models communicate with external tools and data sources. For research discovery, this means AI can […]
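As a rough sketch of what this looks like in practice, the snippet below uses the official `mcp` Python SDK to expose a single research-discovery tool that an agentic model could call; the tool name and the arXiv query helper are illustrative assumptions, not the setup described in the full article.

```python
# Minimal, illustrative MCP server exposing one research-discovery tool.
# Assumes the official `mcp` Python SDK is installed (pip install "mcp[cli]").
import urllib.parse
import urllib.request

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("research-tools")  # server name is an arbitrary example


@mcp.tool()
def search_arxiv(query: str, max_results: int = 5) -> str:
    """Return raw arXiv Atom results for a query (illustrative helper)."""
    url = "http://export.arxiv.org/api/query?" + urllib.parse.urlencode(
        {"search_query": f"all:{query}", "max_results": max_results}
    )
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # stdio transport is what desktop MCP clients typically connect to
    mcp.run(transport="stdio")
```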

Read more

Generate Images with Claude and Hugging Face

TL;DR: It’s easier than ever to generate detailed pictures with state-of-the-art AI models by connecting Claude to Hugging Face Spaces. This article describes how and why, and introduces recently launched models that excel at producing natural images or images that include text. Update October 2025: Following an update to Anthropic’s Connector Directory Policy, you […]
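The connector setup itself is point-and-click, but as a rough programmatic analogue (not the workflow the article describes), the same kind of text-to-image model can also be called from Python via `huggingface_hub`'s `InferenceClient`; the model id and prompt below are illustrative.

```python
# Illustrative alternative: calling a text-to-image model directly from Python,
# rather than through the Claude connector described above.
from huggingface_hub import InferenceClient

client = InferenceClient()  # picks up the HF_TOKEN environment variable if set
image = client.text_to_image(
    "a watercolor painting of a lighthouse at sunset",
    model="black-forest-labs/FLUX.1-schnell",  # example model id, an assumption
)
image.save("lighthouse.png")  # text_to_image returns a PIL.Image
```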

Read more

NVIDIA Releases 6 Million Multi-Lingual Reasoning Dataset

Authors: Dhruv Nathawani, Shuoyang Ding, Vitaly Lavrukhin, Jane Polak Scowcroft, Oleksii Kuchaiev (NVIDIA). NVIDIA continues releasing permissive datasets in support of the open ecosystem with the 6 Million Multilingual Reasoning Dataset. Continuing the success of the recent Nemotron Post-Training Dataset v1 release used in the Llama Nemotron Super model, and of our Llama Nemotron Post-Training Dataset release earlier this year, we’re excited to release the reasoning dataset translated into five target languages: French, Spanish, German, Italian, and Japanese. The newly […]

Read more

Make your ZeroGPU Spaces go brrr with ahead-of-time compilation

ZeroGPU lets anyone spin up powerful Nvidia H200 hardware in Hugging Face Spaces without keeping a GPU locked for idle traffic. It’s efficient, flexible, and ideal for demos, but it doesn’t always make full use of everything the GPU and CUDA stack can offer. Generating images or videos can take a significant amount of time, so squeezing more performance out of the H200 hardware matters here. This is where PyTorch ahead-of-time (AoT) compilation […]
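As a minimal sketch of the idea (not the exact ZeroGPU integration covered in the post), recent PyTorch versions let you export a model and compile it ahead of time with AOTInductor, then reload the compiled artifact at startup so no compilation cost is paid on the first request; the tiny module and shapes below are assumptions for illustration.

```python
# Toy sketch of ahead-of-time compilation with torch.export + AOTInductor
# (PyTorch 2.6+). The model is illustrative; the real post targets large
# image/video pipelines running on ZeroGPU's H200 hardware.
import torch


class TinyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(64, 64)

    def forward(self, x):
        return torch.relu(self.linear(x))


model = TinyModel().eval()
example_inputs = (torch.randn(8, 64),)

# 1) Export the model to a graph, 2) compile it ahead of time into a package.
exported = torch.export.export(model, example_inputs)
package_path = torch._inductor.aoti_compile_and_package(
    exported, package_path="tiny_model.pt2"
)

# Later (e.g. at Space startup): load the precompiled artifact and run it.
compiled = torch._inductor.aoti_load_package(package_path)
print(compiled(*example_inputs).shape)
```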

Read more