Agent Lightning: Adding reinforcement learning to AI agents without code rewrites

AI agents are reshaping software development, from writing code to carrying out complex instructions. Yet LLM-based agents are prone to errors and often perform poorly on complicated, multi-step tasks. Reinforcement learning (RL) is an approach where AI systems learn to make optimal decisions by receiving rewards or penalties for their actions, improving through  

Read more

Promptions helps make AI prompting more precise with dynamic UI controls

Anyone who uses AI systems knows the frustration: a prompt is given, the response misses the mark, and the cycle repeats. This trial-and-error loop can feel unpredictable and discouraging. To address this, we are excited to introduce Promptions (prompt + options), a UI framework that helps developers build AI interfaces with more precise user control. Its simple design makes  

Read more

GigaTIME: Scaling tumor microenvironment modeling using virtual population generated by multimodal AI

The convergence of digital transformation and the GenAI revolution creates an unprecedented opportunity for accelerating progress in precision health. Precision immunotherapy is a poster child for this transformation. Emerging technologies such as multiplex immunofluorescence (mIF) can assess internal states of individual cells along with their spatial locations, which is critical for deciphering how tumors interact  

Read more

Ideas: Community building, machine learning, and the future of AI

HANNA WALLACH: Yeah, so I was a PhD student at the University of Cambridge, and I was working with the late David MacKay. I was focusing on machine learning for analyzing text, and at that point in time, I’d actually just begun working on Bayesian latent variable models for text analysis, and my research was really focusing on trying to combine ideas from n-gram language modeling with statistical topic modeling in order to come up with models that just did […]

Read more

MMCTAgent: Enabling multimodal reasoning over large video and image collections

Modern multimodal AI models can recognize objects, describe scenes, and answer questions about images and short video clips, but they struggle with long-form and large-scale visual data, where real-world reasoning requires moving beyond object recognition and short-clip analysis. Real-world reasoning increasingly involves analyzing long-form video content, where context spans minutes or hours, far beyond the context limits of most models. It  

Read more

BlueCodeAgent: A blue teaming agent enabled by automated red teaming for CodeGen AI

Large language models (LLMs) are now widely used for automated code generation across software engineering tasks. However, this powerful capability also introduces security concerns. Code generation systems could be misused for harmful purposes, such as generating malicious code. They could also produce biased code reflecting underlying logic that is discriminatory or unethical. Additionally, even when completing benign tasks, LLMs may inadvertently

Read more

When industry knowledge meets PIKE-RAG: The innovation behind Signify’s customer service boost

As a world leader in connected LED lighting products, systems, and services, Signify (formerly Philips Lighting) serves not only everyday consumers but also a large number of professional users who have stringent requirements for technical specifications and engineering compatibility. Faced with thousands of product models, complex component parameters, and technical documentation spanning multiple versions, delivering accurate, professional answers efficiently has become  

Read more

Magentic Marketplace: an open-source simulation environment for studying agentic markets

Autonomous AI agents are here, and they’re poised to reshape the economy. By automating discovery, negotiation, and transactions, agents can overcome inefficiencies like information asymmetries and platform lock-in, enabling faster, more transparent, and more competitive markets. We are already seeing early signs of this transformation in digital marketplaces. Customer-facing assistants like OpenAI’s Operator and Anthropic’s Computer Use can navigate websites and complete purchases.  

Read more

RedCodeAgent: Automatic red-teaming agent against diverse code agents

Code agents are AI systems that can generate high-quality code and work smoothly with code interpreters. These capabilities help streamline complex software development workflows, which has led to their widespread adoption. However, this progress also introduces critical safety and security risks. Existing static safety benchmarks and red-teaming methods (in which security researchers simulate real-world attacks to identify security vulnerabilities) often fall short when evaluating code agents. They may fail to detect emerging real-world risks, such as the

Read more