Breaking cross-modal boundaries in multimodal AI: Introducing CoDi, composable diffusion for any-to-any generation

Imagine an AI model that can seamlessly generate high-quality content across text, images, video, and audio, all at once. Such a model would more accurately capture the multimodal nature of the world and human comprehension, seamlessly consolidate information from a wide range of sources, and enable strong immersion in human-AI interactions. This could transform the way humans interact with computers on various tasks, including assistive technology, custom learning tools, ambient computing, and content generation. In a recent paper: Any-to-Any Generation […]

Unlocking the future of computing: The Analog Iterative Machine’s lightning-fast approach to optimization

Picture a world where computing is not limited by the binary confines of zeros and ones but is instead free to explore the vast possibilities of continuous-value data. Over the past three years, a team of Microsoft researchers has been developing a new kind of analog optical computer that uses photons and electrons to process continuous-value data, unlike today’s digital computers, which use transistors to crunch through binary data. This innovative new machine has the potential to surpass […]

Research Focus: Week of June 19, 2023

Welcome to Research Focus, a series of blog posts that highlights notable publications, events, code/datasets, new hires and other milestones from across the research community at Microsoft. New resource: Responsible AI Maturity Model. As the use of AI continues to surge, new government regulations are expected. But the organizations that build and use AI […]

DeepSpeed ZeRO++: A leap in speed for LLM and chat model training with 4X less communication

Figure 1: ZeRO++ project highlights. The top-left subfigure shows ZeRO++ reducing communication volume by 4x compared with ZeRO stage 3. The top-right subfigure shows ZeRO++ performance on RLHF model training, where ZeRO++ achieves a 1.3x speedup for RLHF training and a 2.x speedup for token generation.

Large AI models are transforming the digital world. Generative language models like Turing-NLG, ChatGPT, and GPT-4, powered by large language models (LLMs), are incredibly versatile, capable of performing tasks like summarization, coding, and translation. […]

Collaborators: Renewable energy storage with Bichlien Nguyen and David Kwabi

Today I’m talking to Dr. Bichlien Nguyen, a Principal Researcher at Microsoft Research, and Dr. David Kwabi, an Assistant Professor of Mechanical Engineering at the University of Michigan. Bichlien and David are collaborating on a fascinating project under the umbrella of the Microsoft Climate Research Initiative that brings organic chemistry and machine learning together to discover new forms of renewable energy storage. Before we unpack the “computational design and characterization of organic electrolytes for flow batteries and carbon capture,” let’s […]

Microsoft at CVPR 2023: Pushing the boundaries of computer vision

In the vast realm of artificial intelligence, few fields have captivated our imagination and pushed the boundaries of possibility quite like computer vision. At the core of this domain of research and innovation lies the ambition to empower technologies for real-world vision-based systems, enabling machines to take in and respond to visual stimuli with unparalleled precision and sophistication. Through the combination of AI, deep learning, and vast amounts […]

Improving Subseasonal Forecasting with Machine Learning

This content was previously published by Nature Portfolio and Springer Nature Communities on Nature Portfolio Earth and Environment Community. Improving our ability to forecast the weather and climate is of interest to all sectors of the economy and to government agencies from the local to the national level. Weather forecasts zero to ten days ahead and climate forecasts seasons to decades ahead are currently used operationally in decision-making, and the accuracy and reliability of these forecasts have improved consistently in recent […]

Accounting for past imaging studies: Enhancing radiology AI and reporting

The use of self-supervision from image-text pairs has been a key enabler in the development of scalable and flexible vision-language AI models, not only in general domains but also in biomedical domains such as radiology. The goal in the radiology setting is to produce rich training signals without requiring manual labels so the models can learn to accurately recognize and locate findings in the images and relate them to content in radiology reports. Radiologists use radiology reports to describe imaging […]

AI Frontiers: The future of causal reasoning with Emre Kiciman and Amit Sharma

[MUSIC FADES] Emre, Amit, let’s jump right in. I’m so excited to speak with you both about causal reasoning. And this is such a timely conversation because we’re living through the rise of generative pretrained models, specifically large language models. And when I’ve engaged with GPT-4 in dialogue, depending on what I ask, it can appear to be doing something resembling causal reasoning. And as a machine learning person myself, I have to say this is not something that I’d expected […]

Research Focus: Week of June 5, 2023

Welcome to Research Focus, a series of blog posts that highlights notable publications, events, code/datasets, new hires and other milestones from across the research community at Microsoft. Podcast: The GPT-x Revolution in Medicine, with Peter Lee. Microsoft Research’s Peter Lee recently sat down to discuss the impact of GPT-4 and large language […]
