Tutel: An efficient mixture-of-experts implementation for large DNN model training

Mixture of experts (MoE) is a deep learning model architecture whose computational cost is sublinear in the number of parameters, making scaling easier. MoE is currently the only approach demonstrated to scale deep learning models to trillion-plus parameters, paving the way for models capable of learning even more information and powering computer vision, speech recognition, […]
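The sublinear cost comes from routing each input token to only a small, fixed number of experts, so the parameter count grows with the number of experts while per-token compute stays roughly constant. Below is a minimal top-2-gating sketch in PyTorch to illustrate the idea; the class, dimensions, and loop structure are our own illustration, not Tutel's API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Minimal mixture-of-experts layer with top-k gating (illustrative only)."""
    def __init__(self, d_model: int, num_experts: int, k: int = 2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(d_model, num_experts)  # router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.ReLU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(num_experts))

    def forward(self, x):                             # x: (tokens, d_model)
        scores = F.softmax(self.gate(x), dim=-1)      # (tokens, num_experts)
        weights, idx = scores.topk(self.k, dim=-1)    # each token picks k experts
        out = torch.zeros_like(x)
        # Each token is processed by only k experts, so per-token compute stays
        # constant no matter how many experts (parameters) the layer holds.
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out
```

Production implementations such as Tutel replace this per-expert Python loop with fused dispatch kernels and all-to-all communication so that experts can be sharded across GPUs.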

SynapseML: A simple, multilingual, and massively parallel machine learning library

Today, we’re excited to announce the release of SynapseML (previously MMLSpark), an open-source library that simplifies the creation of massively scalable machine learning (ML) pipelines. Building production-ready distributed ML pipelines can be difficult, even for the most seasoned developer. Composing tools from different ecosystems often requires considerable “glue” code, and many frameworks aren’t designed with thousand-machine elastic clusters in mind. SynapseML resolves this challenge by unifying several existing ML frameworks and new Microsoft algorithms in a single, […]
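As a rough illustration of what this unification looks like in practice, here is a short PySpark sketch that trains SynapseML's distributed LightGBM on a Spark DataFrame. The data path and column names are placeholders, and the exact API may differ by release; consult the SynapseML documentation for details.

```python
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from synapse.ml.lightgbm import LightGBMClassifier

spark = SparkSession.builder.getOrCreate()

# Placeholder dataset: any DataFrame with numeric feature columns and a label.
df = spark.read.parquet("path/to/training_data.parquet")
assembler = VectorAssembler(inputCols=["f1", "f2", "f3"], outputCol="features")

# The classifier follows standard Spark ML conventions, so it drops into
# ordinary pipelines while training scales out across the cluster.
model = LightGBMClassifier(labelCol="label", featuresCol="features")
fitted = model.fit(assembler.transform(df))
```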

Privacy Preserving Machine Learning: Maintaining confidentiality and preserving trust

Machine learning (ML) offers tremendous opportunities to increase productivity. However, ML systems are only as good as the quality of the data used to train their models, and training ML models requires a significant amount of data, often more than a single individual or organization can contribute. By sharing data to collaboratively train ML models, we can unlock value and develop powerful language models that are applicable […]
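One widely used pattern for this kind of collaborative training is federated averaging, in which each participant trains locally and only model parameters, never raw data, are shared and combined. The sketch below shows the aggregation step under that assumption; it is a generic FedAvg illustration, not a description of any specific Microsoft system.

```python
import torch

def federated_average(state_dicts, weights=None):
    """Average model parameters from several parties (FedAvg-style sketch).

    state_dicts: list of model.state_dict() results, one per participant.
    weights: optional per-party weights, e.g. proportional to local data size.
    Raw training data never leaves a participant; only parameters are shared.
    """
    n = len(state_dicts)
    weights = weights or [1.0 / n] * n
    avg = {}
    for key in state_dicts[0]:
        avg[key] = sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
    return avg  # load with model.load_state_dict(avg)
```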

ACAV100M: Scaling up self-supervised audio-visual learning with automatically curated internet videos

The natural association between visual observations and their corresponding sounds provides a powerful self-supervision signal for learning video representations, which makes the ever-growing amount of online video an attractive data source for self-supervised learning. However, online videos often provide imperfectly aligned audio-visual signals because of overdubbed audio, and models trained on uncurated videos have been shown to learn suboptimal representations due to this misalignment. As a result, existing approaches rely almost exclusively on manually curated datasets with a predetermined taxonomy of semantic […]
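Concretely, this self-supervision signal is often cast as a contrastive objective: audio and video embeddings from the same clip should be similar, and embeddings from different clips dissimilar. Here is a hedged InfoNCE-style sketch of such a loss, not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def audio_visual_contrastive_loss(video_emb, audio_emb, temperature=0.07):
    """InfoNCE-style loss: embeddings from the same clip should match.

    video_emb, audio_emb: (batch, dim) outputs of a video and an audio encoder.
    Misaligned (e.g. overdubbed) clips inject label noise into this objective,
    which is why dataset curation matters.
    """
    v = F.normalize(video_emb, dim=-1)
    a = F.normalize(audio_emb, dim=-1)
    logits = v @ a.t() / temperature           # (batch, batch) similarity matrix
    targets = torch.arange(v.size(0), device=v.device)  # diagonal = positives
    # Symmetric loss: match video->audio and audio->video.
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2
```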

Announcing the ORBIT dataset: Advancing real-world few-shot learning using teachable object recognition

Object recognition systems have made spectacular advances in recent years, but they rely on training datasets with thousands of high-quality, labelled examples per object category. Learning new objects from only a few examples could open the door to many new applications: for example, a manufacturing robot needs to quickly learn new parts, while assistive technologies must adapt to the unique needs and abilities of each individual. Few-shot learning aims to reduce these demands by training models that […]
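As a concrete example of the few-shot setup, a common baseline classifies a query image by its distance to per-class "prototypes", the mean embeddings of the few labelled support examples (as in prototypical networks). The sketch below illustrates that baseline; it is a generic illustration, not ORBIT's benchmark code.

```python
import torch

def prototype_predict(support_emb, support_labels, query_emb):
    """Nearest-prototype classification for few-shot recognition.

    support_emb: (n_support, dim) embeddings of the few labelled examples.
    support_labels: (n_support,) integer class ids.
    query_emb: (n_query, dim) embeddings to classify.
    """
    classes = support_labels.unique()
    # One prototype per class: the mean of that class's support embeddings.
    prototypes = torch.stack([support_emb[support_labels == c].mean(0)
                              for c in classes])
    dists = torch.cdist(query_emb, prototypes)   # (n_query, n_classes)
    return classes[dists.argmin(dim=1)]          # predicted class per query
```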

First ever Microsoft Research Summit explores science and technology aimed at big challenges

For 30 years, Microsoft Research has brought together great minds from around the world to take on the biggest research challenges facing society. As we enter our fourth decade, the need for collaborative research, and the opportunities it presents, has never been greater. That’s why we’re so thrilled about the inaugural Microsoft Research Summit, October 19–21. It’s a virtual assembly of the global science and technology community, featuring world-leading researchers and engineers from academia, industry, and Microsoft, who […]

Microsoft Translator: Now translating 100 languages and counting!

Today, we’re excited to announce that Microsoft Translator has added 12 new languages and dialects to the growing repertoire of Microsoft Azure Cognitive Services Translator, bringing us to a total of 103 languages! The new languages, which are natively spoken by 84.6 million people, are Bashkir, Dhivehi, Georgian, Kyrgyz, Macedonian, Mongolian (Cyrillic), Mongolian (Traditional), Tatar, Tibetan, Turkmen, Uyghur, and Uzbek (Latin). With this release, the Translator service […]
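For readers who want to try the service, a call to the Translator REST API looks roughly like the following. The key, region, and target-language choices are placeholders; the Azure documentation has the authoritative details.

```python
import requests

# Placeholder credentials: substitute your own Azure resource key and region.
endpoint = "https://api.cognitive.microsofttranslator.com/translate"
params = {"api-version": "3.0", "to": ["ka", "ug"]}  # e.g. Georgian, Uyghur
headers = {
    "Ocp-Apim-Subscription-Key": "<your-key>",
    "Ocp-Apim-Subscription-Region": "<your-region>",
    "Content-Type": "application/json",
}
body = [{"text": "Hello, world!"}]

response = requests.post(endpoint, params=params, headers=headers, json=body)
print(response.json())  # one translation per requested target language
```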

Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, the World’s Largest and Most Powerful Generative Language Model

We are excited to introduce the DeepSpeed- and Megatron-powered Megatron-Turing Natural Language Generation model (MT-NLG), the largest and the most powerful monolithic transformer language model trained to date, with 530 billion parameters. It is the result of a research collaboration between Microsoft and NVIDIA to further parallelize and optimize the training of very large AI models. As the successor to Turing NLG 17B and Megatron-LM, MT-NLG has 3x the number of parameters compared to the existing largest model of this […]

Microsoft Turing Universal Language Representation model, T-ULRv5, tops XTREME leaderboard and trains 100x faster

Today, we are excited to announce that with our latest Turing universal language representation model (T-ULRv5), a Microsoft-created model is once again the state of the art, sitting at the top of the Google XTREME public leaderboard. Resulting from a collaboration between the Microsoft Turing team and Microsoft Research, the 2.2 billion-parameter T-ULRv5 XL outperforms the current second-best model by an average of 1.7 points. It is also the state of the art across each of the four […]
