Universal Image Segmentation with Mask2Former and OneFormer

This guide introduces Mask2Former and OneFormer, two state-of-the-art neural networks for image segmentation. Both models are now available in 🤗 Transformers, an open-source library that offers easy-to-use implementations of state-of-the-art models. Along the way, you’ll learn about the differences between the various forms of image segmentation.

Image segmentation

Image segmentation is the task of identifying different “segments” in an image, like people or cars. More technically, image segmentation is the task of grouping pixels according to their semantics.
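To make the idea of grouping pixels concrete, here is a minimal sketch in plain Python. It does not use Mask2Former, OneFormer, or any library. The 4x4 grid, the class ids, and the `group_pixels_by_label` helper are all hypothetical, chosen only to illustrate what a segmentation map represents once a model has assigned a class to each pixel.

```python
from collections import defaultdict

# Hypothetical label set for illustration; a real model ships its own mapping.
ID2LABEL = {0: "background", 1: "person", 2: "car"}

# Toy 4x4 segmentation map: each entry is the class id assigned to one pixel.
SEGMENTATION_MAP = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 0, 0],
    [2, 2, 0, 0],
]


def group_pixels_by_label(seg_map, id2label):
    """Return a dict mapping each label name to the (row, col) pixels assigned to it."""
    segments = defaultdict(list)
    for row_idx, row in enumerate(seg_map):
        for col_idx, class_id in enumerate(row):
            segments[id2label[class_id]].append((row_idx, col_idx))
    return dict(segments)


segments = group_pixels_by_label(SEGMENTATION_MAP, ID2LABEL)
for label, pixels in segments.items():
    print(f"{label}: {len(pixels)} pixels")
```

Each key in `segments` is one "segment" in the sense used above: the set of pixel coordinates that share a label.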

Read more

The State of Computer Vision at Hugging Face 🤗

At Hugging Face, we pride ourselves on democratizing the field of artificial intelligence together with the community. As a part of that mission, we began focusing our efforts on computer vision over the last year. What started as a PR for having Vision Transformers (ViT) in 🤗 Transformers has now grown into something much bigger – 8 core vision tasks,

Read more

A Dive into Vision-Language Models

Human learning is inherently multi-modal as jointly leveraging multiple senses helps us understand and analyze new information better. Unsurprisingly, recent advances in multi-modal learning take inspiration from the effectiveness of this process to create models that can process and    

Read more