Learning from interaction with Microsoft Copilot (web)
AI systems like Bing and Microsoft Copilot (web) are as good as they are because they continuously learn and improve from people’s interactions. Since the early 2000s, user clicks on search result pages have fueled the continuous improvement of search engines. More recently, reinforcement learning from human feedback (RLHF) has brought step-function improvements to the response quality of generative AI models. Bing has a rich history of success in improving its AI offerings by learning from user interactions. For example, Bing pioneered the idea of