Research Focus: Week of September 9, 2024

Welcome to Research Focus, a series of blog posts that highlights notable publications, events, code/datasets, new hires and other milestones from across the research community at Microsoft.

NEW RESEARCH: Can LLMs be Fooled? Investigating Vulnerabilities in LLMs

Large language models (LLMs) are the de facto standard for numerous machine learning tasks, ranging from text generation and […]


Tips for Using Machine Learning in Fraud Detection

The battle against fraud has become more intense than it ever has been. As transactions become increasingly digital and complex, fraudsters are constantly devising new ways to exploit vulnerabilities in financial systems. And this is where the power of machine learning comes into play. Machine learning offers a robust approach to identifying and even preventing fraudulent activities. By harnessing advanced algorithms and analytics, financial institutions can stay one […]


Scaling to Success: Implementing and Optimizing Penalized Models

This post will demonstrate the usage of Lasso, Ridge, and ElasticNet models using the Ames housing dataset. These models are particularly valuable when dealing with data that may suffer from multicollinearity. We leverage these advanced regression techniques to show how feature scaling and hyperparameter tuning can improve model performance. In this post, we’ll provide a step-by-step walkthrough on setting up preprocessing pipelines, implementing each model with scikit-learn, and fine-tuning them to achieve optimal results. This comprehensive approach not only aids […]
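As a rough illustration of the workflow this excerpt describes, the sketch below scales features inside a scikit-learn pipeline and tunes the penalty strength for each model. It is not the post's code: it assumes synthetic data from `make_regression` in place of the Ames housing dataset, and the `alpha` grid and step names (`scale`, `model`) are arbitrary choices made for this example.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso, Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic data stands in for the Ames housing dataset.
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)

models = {
    "lasso": Lasso(max_iter=10_000),
    "ridge": Ridge(),
    "elasticnet": ElasticNet(max_iter=10_000),
}

# Assumption: an arbitrary grid of penalty strengths for illustration.
alpha_grid = {"model__alpha": [0.01, 0.1, 1.0, 10.0]}

best_alpha = {}
for name, model in models.items():
    # Scaling lives inside the pipeline so it is refit on each CV training fold.
    pipe = Pipeline([("scale", StandardScaler()), ("model", model)])
    search = GridSearchCV(pipe, alpha_grid, cv=5)
    search.fit(X, y)
    best_alpha[name] = search.best_params_["model__alpha"]

print(best_alpha)
```

Keeping the scaler in the pipeline matters for penalized models because the penalty acts on coefficient size, which depends on each feature's scale.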


How to Use Conditional Expressions With NumPy where()

The NumPy where() function is a powerful tool for filtering array elements in lists, tuples, and NumPy arrays. It works by using a conditional predicate, similar to the logic used in the WHERE or HAVING clauses in SQL queries. It’s okay if you’re not familiar with SQL—you don’t need to know it to follow along with this tutorial. You would typically use np.where() when you have an array and need to analyze its elements differently depending on their values. For […]
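A minimal sketch of the idea, using made-up temperature values that are not taken from the tutorial. It shows the three-argument form, which chooses between two values per element, and the one-argument form, which returns the positions where the condition holds.

```python
import numpy as np

# Hypothetical sample data for illustration.
temps = np.array([12, 25, 31, 18, 40])

# Three-argument form: take the second argument where the condition is
# true and the third argument everywhere else.
labels = np.where(temps > 24, "warm", "cool")

# One-argument form: returns a tuple of index arrays, one per dimension.
warm_idx = np.where(temps > 24)[0]

# The replacement values can themselves be arrays, here the original data.
capped = np.where(temps > 30, 30, temps)

print(labels)
print(warm_idx)
print(capped)
```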


When to Use .__repr__() vs .__str__() in Python

One of the most common tasks that a computer program performs is to display data. The program often displays this information to the program’s user. However, a program also needs to show information to the programmer developing and maintaining it. The information a programmer needs about an object differs from how the program should display the same object for the user, and that’s where .__repr__() vs .__str__() comes in. A Python object has several special methods that provide specific behavior. […]
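The distinction can be sketched with a small class. `Point` is a hypothetical example written for this illustration, not code from the tutorial: `.__repr__()` targets the programmer and `.__str__()` targets the user.

```python
class Point:
    """Hypothetical example class used only for illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Programmer-facing: unambiguous, and here it doubles as code
        # that would recreate the object.
        return f"{type(self).__name__}(x={self.x!r}, y={self.y!r})"

    def __str__(self):
        # User-facing: short and readable.
        return f"({self.x}, {self.y})"


p = Point(1, 2)
user_text = str(p)   # what print(p) would show
dev_text = repr(p)   # what the interactive prompt would show
in_list = str([p])   # containers use repr() for their items

print(user_text)
print(dev_text)
print(in_list)
```

If a class defines only `.__repr__()`, `str()` falls back to it, which is one reason to write `.__repr__()` first.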


MedFuzz: Exploring the robustness of LLMs on medical challenge problems

Large language models (LLMs) have achieved unprecedented accuracy on medical question-answering benchmarks, showcasing their potential to revolutionize healthcare by supporting clinicians and patients. However, these benchmarks often fail to capture the full complexity of real-world medical scenarios. To truly harness the power of LLMs in healthcare, we must go beyond these benchmarks by introducing challenges that bring us closer to the nuanced realities of clinical practice.

Introducing MedFuzz

Benchmarks like MedQA rely on simplifying assumptions to gauge accuracy. These assumptions […]


Python News Roundup: September 2024

As the autumn leaves start to fall, signaling the transition to cooler weather, the Python community has warmed up to a series of noteworthy developments. Last month, Python 3.12.5 arrived as a new maintenance release, reinforcing the language’s ongoing commitment to stability and security. On a parallel note, Python continues its reign as the top programming language according to IEEE Spectrum’s annual rankings. This sentiment is echoed by the Python Developers Survey 2023 results, which reveal intriguing trends and […]


GraphRAG auto-tuning provides rapid adaptation to new domains

GraphRAG uses large language models (LLMs) to create a comprehensive knowledge graph that details entities and their relationships from any collection of text documents. This graph enables GraphRAG to leverage the semantic structure of the data and generate responses to complex queries that require a broad understanding of the entire text. In previous blog posts, we introduced GraphRAG and demonstrated how it could be applied to news articles. In this blog post, we show that it can also […]


5 Emerging AI Technologies That Will Shape the Future of Machine Learning

Artificial intelligence is not just altering the way we interact with technology; it’s reshaping the very foundations of machine learning. As we stand on the brink of innovative breakthroughs, understanding emerging AI technologies becomes essential to grasp their profound implications on future applications and industries. This exploration is not merely academic; it’s a guide to influencing and capitalizing on the next wave of technological revolution. […]


Detecting and Overcoming Perfect Multicollinearity in Large Datasets

One of the significant challenges statisticians and data scientists face is multicollinearity, particularly its most severe form, perfect multicollinearity. This issue often lurks undetected in large datasets with many features, potentially disguising itself and skewing the results of statistical models. In this post, we explore the methods for detecting, addressing, and refining models affected by perfect multicollinearity. Through practical analysis and examples, we aim to equip you with the tools necessary to enhance your models’ robustness and interpretability, ensuring that […]
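One simple way to detect perfect multicollinearity is to compare the rank of the feature matrix with its number of columns. The sketch below is an assumption about how such a check could look, not the method used in the post, and it builds a synthetic matrix in which one column is an exact linear combination of the other two.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical features: c is an exact linear combination of a and b.
a = rng.normal(size=100)
b = rng.normal(size=100)
c = 2.0 * a - 3.0 * b
X = np.column_stack([a, b, c])

# A full-rank matrix has rank equal to its column count; a lower rank
# means at least one column is redundant given the others.
rank = np.linalg.matrix_rank(X)
has_perfect_multicollinearity = rank < X.shape[1]

print(rank, X.shape[1], has_perfect_multicollinearity)
```

The rank check says that a redundancy exists but not which column causes it; identifying the offending features takes a further step, such as dropping columns one at a time and rechecking the rank.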
