Introducing the 🤗 Data Measurements Tool: an Interactive Tool for Looking at Datasets
tl;dr: We made a tool you can use online to build, measure, and compare datasets. Click to access the 🤗 Data Measurements Tool here. As developers of a fast-growing unified repository for Machine Learning datasets (Lhoest et al. 2021), the 🤗 Hugging Face team has been working on supporting good practices for dataset documentation (McMillan-Major et al., 2021). While static (if evolving) documentation represents a necessary first step in this direction, getting a good sense of what is actually in […]
Read more

Getting Started with Hugging Face Transformers for IPUs with Optimum
Transformer models have proven to be extremely effective on a wide range of machine learning tasks, such as natural language processing, audio processing, and computer vision. However, the prediction speed of these large models can make them impractical for […]
Read more

Introducing Snowball Fight ☃️, our First ML-Agents Environment
We’re excited to share our first custom Deep Reinforcement Learning environment: Snowball Fight 1vs1 🎉. Snowball Fight is a game made with Unity ML-Agents, where you shoot snowballs against a Deep Reinforcement Learning agent. The game is hosted on Hugging Face Spaces. 👉 You can play it online here. In this post, we’ll […]
Read more

Training CodeParrot 🦜 from Scratch
In this blog post, we’ll take a look at what it takes to build the technology behind GitHub Copilot, an application that provides suggestions to programmers as they code. In this step-by-step guide, we’ll learn how to train a large GPT-2 model called CodeParrot 🦜 entirely from scratch. CodeParrot can auto-complete your Python code – give […]
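The full post walks through training; as a toy illustration (not from the post), this sketch shows the causal language-modeling objective a GPT-2-style model like CodeParrot is trained to minimize — the average negative log-likelihood of each token given its prefix. The `prob_fn` "model" here is a hypothetical stand-in returning a probability distribution over the vocabulary:

```python
import math

def causal_lm_loss(token_ids, prob_fn):
    """Average negative log-likelihood of each token given its prefix --
    the training objective of an autoregressive LM such as GPT-2."""
    nll = 0.0
    for t in range(1, len(token_ids)):
        prefix, target = token_ids[:t], token_ids[t]
        nll += -math.log(prob_fn(prefix)[target])
    return nll / (len(token_ids) - 1)

# Toy "model": a uniform distribution over a 4-token vocabulary,
# so the loss is exactly log(4) regardless of the prefix.
uniform = lambda prefix: [0.25, 0.25, 0.25, 0.25]
print(round(causal_lm_loss([0, 1, 2, 3], uniform), 4))  # 1.3863
```

A real training run replaces `prob_fn` with the model's softmax output and minimizes this loss over billions of tokens of code.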
Read more

Perceiver IO: a scalable, fully-attentional model that works on any modality
TLDR: We’ve added Perceiver IO to Transformers, the first Transformer-based neural network that works on all kinds of modalities (text, images, audio, video, point clouds, …) and combinations thereof. Take a look at the following Spaces to view some examples: We […]
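The core idea that makes Perceiver IO modality-agnostic is cross-attention from a small, fixed-size latent array onto an arbitrarily long input array. A minimal pure-Python sketch (an illustration, not the library's implementation; single head, no learned projections):

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def cross_attend(latents, inputs):
    """Each latent vector attends over ALL input vectors, so this block's
    output size depends only on the (fixed) number of latents, not on the
    input length -- the trick that lets Perceiver IO handle any modality."""
    out = []
    for q in latents:
        # Scaled dot-product scores of this latent against every input.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(len(q))
                  for k in inputs]
        weights = softmax(scores)
        # Weighted sum of input vectors, dimension by dimension.
        out.append([sum(w * v[d] for w, v in zip(weights, inputs))
                    for d in range(len(q))])
    return out

# 2 latents summarize a 5-element input array of 3-d vectors:
latents = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
inputs = [[float(i), float(i % 2), 1.0] for i in range(5)]
summary = cross_attend(latents, inputs)
print(len(summary), len(summary[0]))  # 2 3
```

Whether the inputs are text tokens, image patches, or audio frames, the latent array stays the same size, so compute scales linearly with input length rather than quadratically.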
Read more

Gradio is joining Hugging Face!
Gradio is joining Hugging Face! By acquiring Gradio, a machine learning startup, Hugging Face will be able to offer users, developers, and data scientists the tools needed to get to high-level results and create better models and tools… Hmm, paragraphs about acquisitions like the one above are so common that an algorithm could write them. In fact, one did!
Read more

Deploy GPT-J 6B for inference using Hugging Face Transformers and Amazon SageMaker
Almost 6 months ago to the day, EleutherAI released GPT-J 6B, an open-source alternative to OpenAI's GPT-3. GPT-J 6B is the 6-billion-parameter successor to EleutherAI's GPT-Neo family, a family of transformer-based language models based on the GPT architecture for text generation. EleutherAI's primary goal is to train a model that […]
Read more

Boosting Wav2Vec2 with n-grams in 🤗 Transformers
Wav2Vec2 is a popular pre-trained model for speech recognition. Released in September 2020 by Meta AI Research, the novel architecture catalyzed progress in self-supervised pretraining for speech recognition, e.g. G. Ng et al., 2021, Chen et al., 2021, Hsu et al., 2021, and […]
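The "boosting" in the post's title refers to combining the acoustic model's score with an n-gram language-model score when decoding. As a hedged, pure-Python illustration of that shallow-fusion idea (the function names, scores, and the toy bigram table below are all hypothetical, not the library's API):

```python
import math

def rescore(candidates, acoustic_scores, bigram_logprob, alpha=0.5):
    """Pick the transcription maximizing acoustic_score + alpha * LM_score --
    the shallow-fusion idea behind boosting a CTC model with an n-gram LM."""
    def lm_score(words):
        tokens = ["<s>"] + words
        return sum(bigram_logprob(a, b) for a, b in zip(tokens, tokens[1:]))
    scored = [(ac + alpha * lm_score(c.split()), c)
              for c, ac in zip(candidates, acoustic_scores)]
    return max(scored)[1]

# Hypothetical bigram LM that strongly prefers "nice day" over "ice day".
def bigram_logprob(prev, word):
    table = {("nice", "day"): math.log(0.9), ("ice", "day"): math.log(0.1)}
    return table.get((prev, word), math.log(0.5))

# The acoustic model slightly prefers the wrong hypothesis, but the LM
# flips the decision toward the fluent one.
best = rescore(["a nice day", "a ice day"], [-1.1, -1.0], bigram_logprob)
print(best)  # a nice day
```

In practice this rescoring happens inside a beam-search decoder over CTC outputs rather than over a fixed candidate list, but the score combination is the same.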
Read more