Research Focus: Week of June 10, 2024

Welcome to Research Focus, a series of blog posts that highlights notable publications, events, code/datasets, new hires and other milestones from across the research community at Microsoft. NEW RESEARCH. RELEVANCE: Automatic evaluation framework for LLM responses. Relevance in AI refers to the usefulness of information or actions to a specific task or query. It helps determine the accuracy, effectiveness, efficiency, and user satisfaction of content from search engines, chatbots, and other AI systems. RELEVANCE (Relevance and Entropy-based […]

Listing All Files in a Directory With Python

Getting a list of all the files and folders in a directory is a natural first step for many file-related operations in Python. When looking into it, though, you may be surprised to find various ways to go about it. When you’re faced with many ways of doing something, it can be a good indication that there’s no one-size-fits-all solution to your problems. Most likely, every solution will have its own advantages and trade-offs. This is the case when it […]
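
As a rough illustration of the "various ways" the excerpt mentions, the sketch below lists one directory with three standard-library approaches: `os.listdir()`, `os.scandir()`, and `pathlib.Path.iterdir()`. The helper name `list_entries` and the temporary files are made up for this example, and the excerpt is truncated, so these are not necessarily the approaches the full article compares.

```python
import os
import tempfile
from pathlib import Path


def list_entries(directory):
    """Return sorted entry names from three standard-library approaches."""
    # os.listdir returns plain names as strings.
    with_listdir = sorted(os.listdir(directory))
    # os.scandir yields DirEntry objects that also carry file-type info.
    with os.scandir(directory) as entries:
        with_scandir = sorted(entry.name for entry in entries)
    # Path.iterdir yields Path objects for each entry.
    with_pathlib = sorted(path.name for path in Path(directory).iterdir())
    return with_listdir, with_scandir, with_pathlib


# Build a throwaway directory so the example does not depend on local files.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "notes.txt").write_text("hello")
    Path(tmp, "data").mkdir()
    listdir_names, scandir_names, pathlib_names = list_entries(tmp)

print(listdir_names)
```

All three calls see the same entries. They differ in what they hand back (strings, `DirEntry` objects, or `Path` objects), which is one place the trade-offs show up.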

SIBYL: A machine learning-based framework for forecasting dynamic workloads

This paper was presented at the ACM SIGMOD/Principles of Database Systems Conference (SIGMOD/PODS 2024), the premier forum on large-scale data management and databases. In today’s fast-paced digital landscape, data analysts are increasingly dependent on analytics dashboards to monitor customer engagement and app performance. However, as data volumes increase, these dashboards can slow down, leading to delays and inefficiencies. One solution is to […]

Python News: What’s New From May 2024

May was packed with exciting updates and events in the Python community. This month saw the release of the first beta version of Python 3.13, the conclusion of PyCon US 2024, and the announcement of the keynote speakers for EuroPython 2024. Additionally, PEP 649 has been delayed until the Python 3.14 release, and the Python Software Foundation published its 2023 Annual Impact Report. Get ready to explore the recent highlights! The First Beta Version of Python 3.13 Released: After nearly […]

LST-Bench: A new benchmark tool for open table formats in the data lake

This paper was presented at the ACM SIGMOD/Principles of Database Systems Conference (SIGMOD/PODS 2024), the premier forum on large-scale data management and databases. As organizations grapple with ever-expanding datasets, the adoption of data lakes has become a vital strategy for scalable and cost-effective data management. The success of these systems largely depends on the file formats used to store the […]

5 Useful Loss Functions

A loss function in machine learning is a mathematical formula that calculates the difference between the predicted output and the actual output of the model. The loss function is then used to slightly change the model weights and then check whether it has improved the model’s performance. The goal of machine learning algorithms is to minimize the loss function in order to make accurate predictions. In this blog, we will learn about the 5 most commonly used […]
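
To make "the difference between the predicted output and the actual output" concrete, here is a minimal sketch of one well-known loss, mean squared error. The excerpt is cut off before it names its five functions, so picking MSE is an assumption for illustration, and the function name and sample values are invented.

```python
def mean_squared_error(actual, predicted):
    """Average of squared differences between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sum(squared_errors) / len(actual)


# Illustrative values only: the errors are 1, 0, and -2.
actual = [3.0, 5.0, 2.0]
predicted = [2.0, 5.0, 4.0]

loss = mean_squared_error(actual, predicted)
perfect_loss = mean_squared_error(actual, actual)

print(loss)          # (1 + 0 + 4) / 3
print(perfect_loss)  # 0.0 when every prediction matches
```

Training then amounts to nudging the model weights in whichever direction makes this number smaller.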

Quiz: Python String Formatting: Available Tools and Their Features

Interactive Quiz ⋅ 6 Questions ⋅ By Leodanis Pozo Ramos. Test your understanding of Python’s tools for string formatting, including f-strings, the .format() method, and the modulo operator. Take this quiz after reading our Python String Formatting: Available Tools and Their Features tutorial. The quiz contains 6 questions and there is no time limit. You’ll get 1 point for each correct answer. At the […]

Highlights from Machine Translation and Multilinguality in May 2024

Here are short summaries of three pre-prints that I enjoyed reading in May. Zero-Shot Tokenizer Transfer: Folks from the University of Cambridge and the University of Edinburgh propose a nice trick for changing the vocabulary of an already trained language model. They train a hyper-network (a neural network that predicts parameters of a different neural network) that predicts what embeddings a token would have if it were trained with the rest of the model. For each training batch, they build […]

Python String Formatting: Available Tools and Their Features

This tutorial has a related video course created by the Real Python team. Watch it together with the written tutorial to deepen your understanding: Python String Formatting Tips & Best Practices. String formatting is the process of applying a proper format to a given value while using this value to create a new string through interpolation. Python has several tools for string interpolation that support many formatting features. In modern Python, you’ll use f-strings or the .format() method […]
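
As a small sketch of the interpolation tools mentioned here, the snippet below formats the same two values with an f-string, the `.format()` method, and the older modulo operator. The variable names and values are invented for illustration.

```python
name = "Ada"
price = 1234.5

# f-string: expressions and format specs go inside the braces.
with_fstring = f"{name} paid {price:,.2f}"

# str.format(): the same format-spec mini-language, values passed as arguments.
with_format = "{} paid {:,.2f}".format(name, price)

# Modulo operator: printf-style codes, with no thousands-separator option.
with_modulo = "%s paid %.2f" % (name, price)

print(with_fstring)
print(with_format)
print(with_modulo)
```

The first two share one format-spec language, so they produce identical output. The modulo version differs only because printf-style codes have no `,` grouping flag.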

Microsoft Research Forum Episode 3: Globally inclusive and equitable AI, new use cases for AI, and more

In the latest episode of Microsoft Research Forum, researchers explored the importance of globally inclusive and equitable AI, shared updates on AutoGen and MatterGen, and presented novel use cases for AI, including industrial applications and the potential of multimodal models to improve assistive technologies. Below is a brief recap of the event, including select quotes from the presentations. Full replays of each session and presentation will be available soon. Jacki O’Neill, Lab Director, Microsoft […]
