Creating custom kernels for the AMD MI300
More than a billion per day: that’s a low estimate of how many requests ChatGPT handles daily, a number which is unlikely to go down soon.