HEXA: Self-supervised pretraining with hard examples improves visual representations

Humans perceive the world by observing a large number of visual scenes and then generalizing effectively, interpreting and identifying scenes they have not encountered before, without relying heavily on labeled annotations for every single scene. A core aspiration of artificial intelligence is to develop algorithms and techniques that endow computers with a similarly strong ability to generalize: learning from raw pixel data alone to make sense of the visual world, which aligns more closely with how humans process visual information.

Self-supervised pretraining (SSP) has emerged as a promising research field for approaching this problem. The goal of SSP is to learn general-purpose intermediate representations, with the expectation that these representations carry rich semantic or structural meaning that can benefit a variety of downstream tasks.
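
To make the idea concrete, below is a minimal, hypothetical sketch of one popular SSP objective: a SimCLR-style contrastive loss in PyTorch, where two augmented views of the same image are pulled together in embedding space while the other images in the batch act as negatives. The tiny encoder, the noise-based augmentations, and the `nt_xent_loss` helper are illustrative placeholders for the general SSP recipe, not the HEXA method introduced in the paper.

```python
# Illustrative sketch of contrastive self-supervised pretraining (SimCLR-style).
# This shows the generic SSP setup, not the paper's HEXA objective.
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent loss over two batches of embeddings.

    z1, z2: (N, D) embeddings of two augmented views of the same N images.
    Row i's positive is its counterpart view; the remaining 2N - 2
    embeddings in the batch serve as negatives.
    """
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, D), unit norm
    sim = z @ z.t() / temperature                        # scaled cosine similarities
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float('-inf'))           # exclude self-similarity
    # For row i, the positive sits at i + n (first half) or i - n (second half).
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)

# Usage: encode two random "augmentations" of the same batch, minimize the loss.
encoder = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 128))
images = torch.randn(8, 3, 32, 32)
view1 = images + 0.1 * torch.randn_like(images)  # stand-in for real augmentations
view2 = images + 0.1 * torch.randn_like(images)
loss = nt_xent_loss(encoder(view1), encoder(view2))
loss.backward()
```

No labels appear anywhere in this loop: the supervision signal comes entirely from knowing which two views originate from the same image, which is what lets the encoder learn transferable representations from raw pixels.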
