Very Large Language Models and How to Evaluate Them
Large language models can now be evaluated on zero-shot classification tasks with Evaluation on the Hub! Zero-shot evaluation is a popular way for researchers to measure the performance of large language models, as they have been shown to learn capabilities during training without explicitly being shown labeled examples. The Inverse Scaling Prize is an example of a recent community effort to conduct large-scale zero-shot evaluation across model sizes and families to discover tasks on which larger models may perform worse […]
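Under the hood, zero-shot classification with a language model typically works by scoring each candidate label as a continuation of the input and picking the most likely one. Here is a minimal, self-contained sketch of that scoring loop; the toy log-probability table and whitespace tokenization stand in for a real LM and are invented for illustration:

```python
# Toy "language model": assigns each token a fixed log-probability.
# A real LM would condition on context; this table is a stand-in.
TOY_LOGPROBS = {"positive": -1.0, "negative": -3.0, "the": -0.5,
                "movie": -0.7, "was": -0.4, "great": -1.2}

def sequence_logprob(tokens):
    """Sum per-token log-probabilities (unknown tokens get a floor)."""
    return sum(TOY_LOGPROBS.get(t, -10.0) for t in tokens)

def zero_shot_classify(text, labels):
    """Score 'text + label' for each candidate label and pick the
    label whose completion the model finds most likely."""
    scores = {}
    for label in labels:
        tokens = text.lower().split() + [label]
        scores[label] = sequence_logprob(tokens)
    return max(scores, key=scores.get), scores

pred, scores = zero_shot_classify("The movie was great", ["positive", "negative"])
print(pred)  # the higher-logprob label wins
```

No labeled training examples are involved: the model's own likelihoods do the classification, which is exactly what makes this evaluation "zero-shot".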
Read more

Introducing DOI: the Digital Object Identifier to Datasets and Models
Our mission at Hugging Face is to democratize good machine learning. That includes best practices that make ML models and datasets more reproducible, better documented, and easier to use and share. To solve this challenge, we’re excited to announce that you can now generate a DOI for your model or dataset directly from the Hub! DOIs can be generated directly from your repo settings, and anyone will then be able to cite your work by clicking “Cite this model/dataset” on […]
Read more

Optimization story: Bloom inference
This article gives you the behind-the-scenes of how we made an efficient inference server that powers BLOOM (https://huggingface.co/bigscience/bloom). We achieved a 5x latency reduction over several weeks (and 50x more throughput). We wanted to share all the struggles and epic wins we went through to achieve such speed improvements. A lot of different people were involved
Read more

🧨 Stable Diffusion in JAX / Flax!
🤗 Hugging Face Diffusers supports Flax since version 0.5.1! This allows for super fast inference on Google TPUs, such as those available in Colab, Kaggle or Google
Read more

MTEB: Massive Text Embedding Benchmark
MTEB is a massive benchmark for measuring the performance of text embedding models on diverse embedding tasks. The 🥇 leaderboard provides a holistic view of the best text embedding models out there on a variety of tasks. The 📝 paper gives background on the tasks and datasets in MTEB and analyzes leaderboard results! The 💻 Github
Read more

From PyTorch DDP to Accelerate to Trainer, mastery of distributed training with ease
This tutorial assumes you have a basic understanding of PyTorch and how to train a simple model. It will showcase training on multiple GPUs through a process called Distributed Data Parallelism (DDP), at three different levels of increasing abstraction: native PyTorch DDP through the torch.distributed module; 🤗 Accelerate's light wrapper around torch.distributed that also helps ensure the code can be run
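The core idea behind DDP can be sketched without any GPU at all: each worker holds a replica of the parameters, computes gradients on its own shard of the global batch, and an all-reduce averages those gradients so every replica applies the identical update. The toy linear-regression workers below are a plain-Python stand-in for torch.distributed, not a real distributed setup:

```python
# Conceptual sketch of data parallelism: parameter replicas, per-shard
# gradients, and a simulated all-reduce keeping replicas in lockstep.

def grad_mse(w, shard):
    """Gradient of mean squared error for y_hat = w * x on one shard."""
    n = len(shard)
    return sum(2 * (w * x - y) * x for x, y in shard) / n

def all_reduce_mean(grads):
    """Stand-in for dist.all_reduce: average gradients across workers."""
    return sum(grads) / len(grads)

# One global batch (y = 2x), sharded across two equal-size "workers"
# (in real DDP, a DistributedSampler does this split).
batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
shards = [batch[0::2], batch[1::2]]

w = 0.0          # identical initial replica on every worker
lr = 0.01
for _ in range(200):
    local_grads = [grad_mse(w, shard) for shard in shards]
    g = all_reduce_mean(local_grads)   # every worker sees the same g
    w -= lr * g                        # replicas stay in lockstep

print(round(w, 3))  # approaches 2.0, the true slope
```

Because the shards are equal-sized, the averaged gradient equals the gradient over the full batch, so the distributed run converges exactly as a single-worker run would; the higher-level tools in the tutorial hide this plumbing without changing the math.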
Read more

Evaluating Language Model Bias with 🤗 Evaluate
While the size and capabilities of large language models have drastically increased over the past couple of years, so too has the concern around biases imprinted into these models and their training data. In fact, many popular language models have been found to be biased against specific religions and genders, which can result in the promotion of discriminatory ideas and the perpetuation of harms against marginalized groups. To help the community explore these kinds of biases and strengthen our understanding […]
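One simple way such bias probes work is to compare what a model associates with different demographic groups. The sketch below uses invented completions in place of real model samples and measures the gap between two groups' association distributions with total variation distance; the prompts, occupations, and counts are all made up for illustration:

```python
from collections import Counter

# Invented stand-ins for completions sampled from a real language model
# given prompts like "He worked as a ..." / "She worked as a ...".
completions = {
    "he":  ["engineer", "doctor", "engineer", "mechanic", "nurse"],
    "she": ["nurse", "teacher", "nurse", "doctor", "nurse"],
}

def distribution(samples):
    """Normalize sample counts into a probability distribution."""
    counts = Counter(samples)
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

def total_variation(p, q):
    """Total variation distance between two distributions:
    0 means identical associations, 1 means completely disjoint."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

gap = total_variation(distribution(completions["he"]),
                      distribution(completions["she"]))
print(gap)  # larger gap = more divergent group associations
```

A score of 0 would mean the model completes both prompts with the same occupation mix; real bias metrics are more nuanced, but this captures the basic shape of the measurement.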
Read more