Quiz: For Loops in Python (Definite Iteration)

Interactive Quiz ⋅ 5 Questions ⋅ By Joseph Peart. Test your understanding of For Loops in Python (Definite Iteration). You’ll revisit Python loops, iterables, and how iterators behave. You’ll also explore set iteration order and the effects of the break and continue statements. The quiz contains 5 questions and there is no time limit. You’ll get 1 point for each correct answer. At the end of the quiz, you’ll receive a total score. The maximum score is 100%. Good luck! […]

D-Strings Could End Your textwrap.dedent() Days and Other Python News for April 2026

If you’ve ever wrapped a multiline string in textwrap.dedent() and wondered why Python can’t just handle that for you, then your PEP has arrived. PEP 822 proposes d-strings, a new d"""…""" prefix that automatically strips leading indentation. It’s one of those small quality-of-life ideas that make you wonder why it didn’t exist already. The PEP is currently a draft proposal. March also delivered Python 3.15.0 alpha 7 with lazy imports you can finally test and security patches across three older […]
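For context, here is a minimal sketch of the existing textwrap.dedent() pattern that the proposal aims to make unnecessary. The function name and the message text are made up for illustration. The d-string syntax itself is not shown, because the PEP is still a draft and the syntax may change.

```python
import textwrap


def usage() -> str:
    # Hypothetical helper: the leading spaces inside the literal come from
    # the code's indentation, not from the text we actually want.
    message = """\
        Usage: tool [options]
          -h  show help
        """
    # dedent() removes the whitespace common to every line, so relative
    # indentation (the two spaces before "-h") is preserved.
    return textwrap.dedent(message)


print(usage())
```

The backslash after the opening quotes keeps the string from starting with an empty line, which is a common companion to this pattern.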

How to train a new language model from scratch using Transformers and Tokenizers

Over the past few months, we made several improvements to our transformers and tokenizers libraries, with the goal of making it easier than ever to train a new language model from scratch. In this post we’ll demo how to train a “small” model (84 M parameters = 6 layers, 768 hidden size, 12 attention heads) – that’s the same number of layers & heads as DistilBERT – on Esperanto. We’ll then fine-tune the model on a downstream task of part-of-speech […]

How to generate text: using different decoding methods for language generation with Transformers

Note: Edited in July 2023 with up-to-date references and examples. Introduction: In recent years, there has been increasing interest in open-ended language generation thanks to the rise of large transformer-based language models trained on millions of webpages, including OpenAI’s ChatGPT and Meta’s LLaMA. The results on conditioned open-ended language generation are impressive: these models have been shown to generalize to new tasks, handle code, or take non-text data as input. Besides the improved transformer architecture and massive unsupervised training data, better decoding […]

The Reformer – Pushing the limits of language modeling

How the Reformer uses less than 8GB of RAM to train on sequences of half a million tokens. The Reformer model, as introduced by Kitaev, Kaiser et al. (2020), is one of the most memory-efficient transformer models for long sequence modeling as of today. Recently, long sequence modeling has experienced a surge of interest, as can be seen from the many submissions from this year alone – Beltagy et al. (2020), Roy et al. (2020), Tay et al., Wang et […]

Block Sparse Matrices for Smaller and Faster Language Models

In previous blog posts we introduced sparse matrices and what they could do to improve neural networks. The basic assumption is that full dense layers are often overkill and can be pruned without a significant loss in precision. In some cases sparse linear layers can even improve precision and/or generalization. The main issue is that currently available code that supports sparse algebra computation is severely lacking in efficiency. We are also still waiting for official PyTorch support. That’s why we ran […]

Transformers-based Encoder-Decoder Models

!pip install transformers==4.2.1
!pip install sentencepiece==0.1.95

The transformer-based encoder-decoder model was introduced by Vaswani et al. in the famous Attention is all you need paper and is today the de-facto standard encoder-decoder architecture in natural language processing (NLP). Recently, there has been a lot of research on different pre-training objectives for transformer-based encoder-decoder models, e.g. T5, Bart, Pegasus, ProphetNet, Marge, etc., but the model architecture has stayed largely the same. The goal of the blog post is to give an […]

Hyperparameter Search with Transformers and Ray Tune

A guest blog post by Richard Liaw from the Anyscale team. With cutting-edge research implementations and thousands of trained models easily accessible, the Hugging Face transformers library has become critical to the success and growth of natural language processing today. For any machine learning model to achieve good performance, users often need to implement some form of parameter tuning. Yet, nearly everyone (1, 2) either ends up disregarding hyperparameter tuning or opting to do a simplistic grid search with a […]

Transformers.js v4: Now Available on NPM!

We’re excited to announce that Transformers.js v4 is now available on NPM! After a year of development (we started in March 2025 🤯), we’re finally ready for you to use it.

npm i @huggingface/transformers

Performance & Runtime Improvements: The biggest change is undoubtedly the adoption of a new WebGPU Runtime, completely rewritten in C++. We’ve worked closely with the ONNX Runtime team to thoroughly test this runtime across our ~200 supported model architectures, as well as many new v4-exclusive architectures. […]

TRL v1.0: Post-Training Library Built to Move with the Field

We’re releasing TRL v1.0, and it marks a real shift in what TRL is. What started as a research codebase has become a dependable library people build on, with clearer expectations around stability. This isn’t just a version bump. It reflects the reality that TRL now powers production systems, and embraces that responsibility. TRL now implements more than 75 post-training methods. But coverage isn’t the goal by itself. What matters is making these methods easy to try, compare, and actually […]
