The State of Computer Vision at Hugging Face 🤗

At Hugging Face, we pride ourselves on democratizing the field of artificial intelligence together with the community. As a part of that mission, we began focusing our efforts on computer vision over the last year. What started as a PR for having Vision Transformers (ViT) in 🤗 Transformers has now grown into something much bigger – 8 core vision tasks,    


A Dive into Vision-Language Models

Human learning is inherently multi-modal as jointly leveraging multiple senses helps us understand and analyze new information better. Unsurprisingly, recent advances in multi-modal learning take inspiration from the effectiveness of this process to create models that can process and    


Generating Stories: AI for Game Development #5

Welcome to AI for Game Development! In this series, we’ll be using AI tools to create a fully functional farming game in just 5 days. By the end of this series, you will have learned how you can incorporate a variety of AI tools into your game development workflow. I will show you how you can use AI tools for: Art Style    


Speech Synthesis, Recognition, and More With SpeechT5

We’re happy to announce that SpeechT5 is now available in 🤗 Transformers, an open-source library that offers easy-to-use implementations of state-of-the-art machine learning models. SpeechT5 was originally described in the paper “SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing” by Microsoft Research Asia. The official checkpoints published by the paper’s authors are available on the Hugging Face Hub.
