Normal view

Received before yesterday

OpenAI built an AI coding agent and uses it to improve the agent itself

12 December 2025 at 17:16

With the popularity of AI coding tools rising among some software developers, their adoption has begun to touch every aspect of the process, including the improvement of AI coding tools themselves.

In interviews with Ars Technica this week, OpenAI employees revealed the extent to which the company now relies on its own AI coding agent, Codex, to build and improve the development tool. “I think the vast majority of Codex is built by Codex, so it’s almost entirely just being used to improve itself,” said Alexander Embiricos, product lead for Codex at OpenAI, in a conversation on Tuesday.

Codex, which OpenAI launched in its modern incarnation as a research preview in May 2025, operates as a cloud-based software engineering agent that can handle tasks like writing features, fixing bugs, and proposing pull requests. The tool runs in sandboxed environments linked to a user’s code repository and can execute multiple tasks in parallel. OpenAI offers Codex through ChatGPT’s web interface, a command-line interface (CLI), and IDE extensions for VS Code, Cursor, and Windsurf.

Read full article

Comments

© Mininyx Doodle via Getty Images

Microsoft Bug Bounty Program Gets Major Expansion With ‘In Scope By Default’

12 December 2025 at 02:34

Bug Bounty

Microsoft Corp. has announced a major update to its bug bounty program, extending coverage to include any vulnerability affecting its online services. This new framework, referred to as “In Scope By Default,” is an important shift in how the tech giant approaches coordinated vulnerability disclosure.  Under this updated model, every Microsoft online service is automatically eligible for bounty awards from the moment it launches. Previously, the company relied on product-specific scope definitions, which often caused confusion for security researchers and limited the range of vulnerabilities eligible for rewards. By making all services In Scope By Default, Microsoft aims to make participation in the bug bounty program more predictable while ensuring critical vulnerabilities are addressed and incentivized regardless of their origin.  A key feature of the expanded scope is its coverage of third-party and open-source components integrated into Microsoft services. This means that vulnerabilities in external libraries, dependencies, or open-source packages that power Microsoft’s cloud infrastructure are now eligible for bug bounty rewards, not just flaws in Microsoft’s own software. 

A Strategic Shift in Bug Bounty Security Incentives 

Tom Gallagher, vice president of engineering at the Microsoft Security Response Center (MSRC), highlighted the significance of the change in a December 11, 2025, blog post. He described it as more than an administrative adjustment, calling it a structural realignment designed to reflect real-world risk. Gallagher explained that by defaulting all services into scope, Microsoft hopes to reduce reporting delays, minimize confusion, and allow researchers to focus on vulnerabilities with meaningful impact on customers.  “If Microsoft’s online services are impacted by vulnerabilities in third-party code, including open source, we want to know,” Gallagher stated. “If no bounty award formerly exists to reward this vital work, we will offer one. This closes the gap for security research and raises the security bar for everyone who relies on this code.”  The new policy also allows Microsoft to collaborate more effectively with researchers on upstream or third-party vulnerabilities. The company can now assist with developing fixes or support maintainers when issues in external codebases directly affect Microsoft services. 

Industry Reaction and Expected Impact 

All new Microsoft online services now fall under bug bounty coverage from day one, while millions of existing endpoints no longer require manual approval to qualify. The update is designed to make it easier for security professionals to identify and report vulnerabilities across Microsoft’s expansive ecosystem.  The new approach aligns with Microsoft’s broader security philosophy in an AI- and cloud-first environment, where attackers exploit any weak link, regardless of ownership. According to Gallagher, “Security vulnerabilities often emerge at the seams where components interact or where dependencies are involved. We value research that takes this broader perspective, encompassing not only Microsoft infrastructure but also third-party dependencies, including commercial software and open-source components.”  Last year, Microsoft’s bug bounty program and its Zero Day Quest live-hacking event awarded over $17 million to researchers for high-impact discoveries. With the In Scope By Default initiative, the company expects to expand eligibility even further, particularly in areas involving Microsoft-owned domains, cloud services, and third-party or open-source code.  Researchers participating in the program are expected to follow Microsoft’s Rules of Engagement for Responsible Security Research, ensuring customer privacy and data protection while enabling coordinated vulnerability disclosure. By widening its bug bounty scope, Microsoft aims to raise the overall security bar. 

A new open-weights AI coding model is closing in on proprietary options

10 December 2025 at 15:38

On Tuesday, French AI startup Mistral AI released Devstral 2, a 123 billion parameter open-weights coding model designed to work as part of an autonomous software engineering agent. The model achieves a 72.2 percent score on SWE-bench Verified, a benchmark that attempts to test whether AI systems can solve real GitHub issues, putting it among the top-performing open-weights models.

Perhaps more notably, Mistral didn’t just release an AI model, it released a new development app called Mistral Vibe. It’s a command line interface (CLI) similar to Claude Code, OpenAI Codex, and Gemini CLI that lets developers interact with the Devstral models directly in their terminal. The tool can scan file structures and Git status to maintain context across an entire project, make changes across multiple files, and execute shell commands autonomously. Mistral released the CLI under the Apache 2.0 license.

It’s always wise to take AI benchmarks with a large grain of salt, but we’ve heard from employees of the big AI companies that they pay very close attention to how well models do on SWE-bench Verified, which presents AI models with 500 real software engineering problems pulled from GitHub issues in popular Python repositories. The AI must read the issue description, navigate the codebase, and generate a working patch that passes unit tests. While some AI researchers have noted that around 90 percent of the tasks in the benchmark test relatively simple bug fixes that experienced engineers could complete in under an hour, it’s one of the few standardized ways to compare coding models.

Read full article

Comments

© Mistral / Benj Edwards

❌