Reading view

OpenAI built an AI coding agent and uses it to improve the agent itself

With the popularity of AI coding tools rising among some software developers, their adoption has begun to touch every aspect of the process, including the improvement of AI coding tools themselves.

In interviews with Ars Technica this week, OpenAI employees revealed the extent to which the company now relies on its own AI coding agent, Codex, to build and improve the development tool. “I think the vast majority of Codex is built by Codex, so it’s almost entirely just being used to improve itself,” said Alexander Embiricos, product lead for Codex at OpenAI, in a conversation on Tuesday.

Codex, which OpenAI launched in its modern incarnation as a research preview in May 2025, operates as a cloud-based software engineering agent that can handle tasks like writing features, fixing bugs, and proposing pull requests. The tool runs in sandboxed environments linked to a user’s code repository and can execute multiple tasks in parallel. OpenAI offers Codex through ChatGPT’s web interface, a command-line interface (CLI), and IDE extensions for VS Code, Cursor, and Windsurf.

Read full article

Comments

© Mininyx Doodle via Getty Images

  •  

OpenAI releases GPT-5.2 after “code red” Google threat alert

On Thursday, OpenAI released GPT-5.2, its newest family of AI models for ChatGPT, in three versions called Instant, Thinking, and Pro. The release follows CEO Sam Altman’s internal “code red” memo earlier this month, which directed company resources toward improving ChatGPT in response to competitive pressure from Google’s Gemini 3 AI model.

“We designed 5.2 to unlock even more economic value for people,” Fidji Simo, OpenAI’s chief product officer, said during a press briefing with journalists on Thursday. “It’s better at creating spreadsheets, building presentations, writing code, perceiving images, understanding long context, using tools and then linking complex, multi-step projects.”

As with previous versions of GPT-5, the three model tiers serve different purposes: Instant handles faster tasks like writing and translation; Thinking spits out simulated reasoning “thinking” text in an attempt to tackle more complex work like coding and math; and Pro spits out even more simulated reasoning text with the goal of delivering the highest-accuracy performance for difficult problems.

Read full article

Comments

© Benj Edwards / OpenAI

  •  

Disney invests $1 billion in OpenAI, licenses 200 characters for AI video app Sora

On Thursday, The Walt Disney Company announced a $1 billion investment in OpenAI and a three-year licensing agreement that will allow users of OpenAI’s Sora video generator to create short clips featuring more than 200 Disney, Marvel, Pixar, and Star Wars characters. It’s the first major content licensing partnership between a Hollywood studio related to the most recent version of OpenAI’s AI video platform, which drew criticism from some parts of the entertainment industry when it launched in late September.

“Technological innovation has continually shaped the evolution of entertainment, bringing with it new ways to create and share great stories with the world,” said Disney CEO Robert A. Iger in the announcement. “The rapid advancement of artificial intelligence marks an important moment for our industry, and through this collaboration with OpenAI we will thoughtfully and responsibly extend the reach of our storytelling through generative AI, while respecting and protecting creators and their works.”

The deal creates interesting bedfellows between a company that basically defined modern US copyright policy through congressional lobbying back in the 1990s and one that has argued in a submission to the UK House of Lords that useful AI models cannot be created without copyrighted material.

Read full article

Comments

© China News Service via Getty Images

  •  

Oracle shares slide on $15B increase in data center spending

Oracle stock dropped after it reported disappointing revenues on Wednesday alongside a $15 billion increase in its planned spending on data centers this year to serve artificial intelligence groups.

Shares in Larry Ellison’s database company fell 11 percent in pre-market trading on Thursday after it reported revenues of $16.1 billion in the last quarter, up 14 percent from the previous year, but below analysts’ estimates.

Oracle raised its forecast for capital expenditure this financial year by more than 40 percent to $50 billion. The outlay, largely directed to building data centers, climbed to $12 billion in the quarter, above expectations of $8.4 billion.

Read full article

Comments

© Mesut Dogan

  •  

A new open-weights AI coding model is closing in on proprietary options

On Tuesday, French AI startup Mistral AI released Devstral 2, a 123 billion parameter open-weights coding model designed to work as part of an autonomous software engineering agent. The model achieves a 72.2 percent score on SWE-bench Verified, a benchmark that attempts to test whether AI systems can solve real GitHub issues, putting it among the top-performing open-weights models.

Perhaps more notably, Mistral didn’t just release an AI model, it released a new development app called Mistral Vibe. It’s a command line interface (CLI) similar to Claude Code, OpenAI Codex, and Gemini CLI that lets developers interact with the Devstral models directly in their terminal. The tool can scan file structures and Git status to maintain context across an entire project, make changes across multiple files, and execute shell commands autonomously. Mistral released the CLI under the Apache 2.0 license.

It’s always wise to take AI benchmarks with a large grain of salt, but we’ve heard from employees of the big AI companies that they pay very close attention to how well models do on SWE-bench Verified, which presents AI models with 500 real software engineering problems pulled from GitHub issues in popular Python repositories. The AI must read the issue description, navigate the codebase, and generate a working patch that passes unit tests. While some AI researchers have noted that around 90 percent of the tasks in the benchmark test relatively simple bug fixes that experienced engineers could complete in under an hour, it’s one of the few standardized ways to compare coding models.

Read full article

Comments

© Mistral / Benj Edwards

  •  

Operation Bluebird wants to relaunch “Twitter,” says Musk abandoned the name and logo

A Virginia startup calling itself “Operation Bluebird” announced this week that it has filed a formal petition with the US Patent and Trademark Office, asking the federal agency to cancel X Corporation’s trademarks of the words “Twitter” and “tweet” since X has allegedly abandoned them.

“The TWITTER and TWEET brands have been eradicated from X Corp.’s products, services, and marketing, effectively abandoning the storied brand, with no intention to resume use of the mark,” the petition states. “The TWITTER bird was grounded.”

If successful, two leaders of the group tell Ars, Operation Bluebird would launch a social network under the name Twitter.new, possibly as early as late next year. (Twitter.new has created a working prototype and is already inviting users to reserve handles.)

Read full article

Comments

© Getty Images | Anadolu Agency

  •  

Meta offers EU users ad-light option in push to end investigation

Meta has agreed to make changes to its “pay or consent” business model in the EU, seeking to agree to a deal that avoids further regulatory fines at a time when the bloc’s digital rule book is drawing anger from US authorities.

On Tuesday, the European Commission announced that the social media giant had offered users an alternative choice of Facebook and Instagram services that would show them fewer personalized advertisements.

The offer follows an EU investigation into Meta’s policy of requiring users either to consent to data tracking or pay for an ad-free service. The Financial Times reported on optimism that an agreement could be reached between the parties in October.

Read full article

Comments

© Derick Hudson

  •  

In comedy of errors, men accused of wiping gov databases turned to an AI tool

Two sibling contractors convicted a decade ago for hacking into US State Department systems have once again been charged, this time for a comically hamfisted attempt to steal and destroy government records just minutes after being fired from their contractor jobs.

The Department of Justice on Thursday said that Muneeb Akhter and Sohaib Akhter, both 34, of Alexandria, Virginia, deleted databases and documents maintained and belonging to three government agencies. The brothers were federal contractors working for an undisclosed company in Washington, DC, that provides software and services to 45 US agencies. Prosecutors said the men coordinated the crimes and began carrying them out just minutes after being fired.

Using AI to cover up an alleged crime—what could go wrong?

On February 18 at roughly 4:55 pm, the men were fired from the company, according to an indictment unsealed on Thursday. Five minutes later, they allegedly began trying to access their employer’s system and access federal government databases. By then, access to one of the brothers’ accounts had already been terminated. The other brother, however, allegedly accessed a government agency’s database stored on the employer’s server and issued commands to prevent other users from connecting or making changes to the database. Then, prosecutors said, he issued a command to delete 96 databases, many of which contained sensitive investigative files and records related to Freedom of Information Act matters.

Read full article

Comments

© Getty Images

  •  

Admins and defenders gird themselves against maximum-severity server vuln

Security defenders are girding themselves in response to the disclosure of a maximum-severity vulnerability disclosed Wednesday in React Server, an open-source package that’s widely used by websites and in cloud environments.

The vulnerability is easy to exploit and allows hackers to execute malicious code on servers that run it. Exploit code is now publicly available.

React is embedded into web apps running on servers so that remote devices render JavaScript and content more quickly and with fewer resources required. React is used by an estimated 6 percent of all websites and 39 percent of cloud environments. When end users reload a page, React allows servers to re-render only parts that have changed, a feature that drastically speeds up performance and lowers the computing resources required by the server.

Read full article

Comments

© Getty Images

  •  

Microsoft drops AI sales targets in half after salespeople miss their quotas

Microsoft has lowered sales growth targets for its AI agent products after many salespeople missed their quotas in the fiscal year ending in June, according to a report Wednesday from The Information. The adjustment is reportedly unusual for Microsoft, and it comes after the company missed a number of ambitious sales goals for its AI offerings.

AI agents are specialized implementations of AI language models designed to perform multistep tasks autonomously rather than simply responding to single prompts. So-called “agentic” features have been central to Microsoft’s 2025 sales pitch: At its Build conference in May, the company declared that it has entered “the era of AI agents.”

The company has promised customers that agents could automate complex tasks, such as generating dashboards from sales data or writing customer reports. At its Ignite conference in November, Microsoft announced new features like Word, Excel, and PowerPoint agents in Microsoft 365 Copilot, along with tools for building and deploying agents through Azure AI Foundry and Copilot Studio. But as the year draws to a close, that promise has proven harder to deliver than the company expected.

Read full article

Comments

© Wong Yu Liang via Getty Images

  •  

Fraudulent gambling network may actually be something more nefarious

A sprawling infrastructure that has been bilking unsuspecting people through fraudulent gambling websites for 14 years is likely a dual operation run by a nation-state-sponsored group that is targeting government and private-industry organizations in the US and Europe, researchers said Wednesday.

Researchers have previously tracked smaller pieces of the enormous infrastructure. Last month, security firm Sucuri reported that the operation seeks out and compromises poorly configured websites running the WordPress CMS. Imperva in January said the attackers also scan for and exploit web apps built with the PHP programming language that have existing webshells or vulnerabilities. Once the weaknesses are exploited, the attackers install a GSocket, a backdoor that the attackers use to compromise servers and host gambling web content on them.

All of the gambling sites target Indonesian-speaking visitors. Because Indonesian law prohibits gambling, many people in that country are drawn to illicit services. Most of the 236,433 attacker-owned domains hosting the gambling sites are hosted on Cloudflare. Most of the 1,481 hijacked subdomains were hosted on Amazon Web Services, Azure, and GitHub.

Read full article

Comments

© Getty Images

  •  

OpenAI CEO declares “code red” as Gemini gains 200 million users in 3 months

The shoe is most certainly on the other foot. On Monday, OpenAI CEO Sam Altman reportedly declared a “code red” at the company to improve ChatGPT, delaying advertising plans and other products in the process,  The Information reported based on a leaked internal memo. The move follows Google’s release of its Gemini 3 model last month, which has outperformed ChatGPT on some industry benchmark tests and sparked high-profile praise on social media.

In the memo, Altman wrote, “We are at a critical time for ChatGPT.” The company will push back work on advertising integration, AI agents for health and shopping, and a personal assistant feature called Pulse. Altman encouraged temporary team transfers and established daily calls for employees responsible for enhancing the chatbot.

The directive creates an odd symmetry with events from December 2022, when Google management declared its own “code red” internal emergency after ChatGPT launched and rapidly gained in popularity. At the time, Google CEO Sundar Pichai reassigned teams across the company to develop AI prototypes and products to compete with OpenAI’s chatbot. Now, three years later, the AI industry is in a very different place.

Read full article

Comments

© Anadolu via Getty Images

  •  

Syntax hacking: Researchers discover sentence structure can bypass AI safety rules

Researchers from MIT, Northeastern University, and Meta recently released a paper suggesting that large language models (LLMs) similar to those that power ChatGPT may sometimes prioritize sentence structure over meaning when answering questions. The findings reveal a weakness in how these models process instructions that may shed light on why some prompt injection or jailbreaking approaches work, though the researchers caution their analysis of some production models remains speculative since training data details of prominent commercial AI models are not publicly available.

The team, led by Chantal Shaib and Vinith M. Suriyakumar, tested this by asking models questions with preserved grammatical patterns but nonsensical words. For example, when prompted with “Quickly sit Paris clouded?” (mimicking the structure of “Where is Paris located?”), models still answered “France.”

This suggests models absorb both meaning and syntactic patterns, but can overrely on structural shortcuts when they strongly correlate with specific domains in training data, which sometimes allows patterns to override semantic understanding in edge cases. The team plans to present these findings at NeurIPS later this month.

Read full article

Comments

© EasternLightcraft via Getty Images

  •  

HP plans to save millions by laying off thousands, ramping up AI use

HP Inc. said that it will lay off 4,000 to 6,000 employees in favor of AI deployments, claiming it will help save $1 billion in annualized gross run rate by the end of its fiscal 2028.

HP expects to complete the layoffs by the end of that fiscal year. The reductions will largely hit product development, internal operations, and customer support, HP CEO Enrique Lores said during an earnings call on Tuesday.

Using AI, HP will “accelerate product innovation, improve customer satisfaction, and boost productivity,” Lores said.

Read full article

Comments

© Getty

  •  

Crypto hoarders dump tokens as shares tumble

Crypto-hoarding companies are ditching their holdings in a bid to prop up their sinking share prices, as the craze for “digital asset treasury” businesses unravels in the face of a $1 trillion cryptocurrency rout.

Shares in Michael Saylor-led Strategy, the world’s biggest corporate bitcoin holder, have tumbled 50 percent over the past three months, dragging down scores of copycat companies.

About $77 billion has been wiped from the stock market value of these companies, which raise debt and equity to fund purchases of crypto, since their peak of $176 billion in July, according to industry data publication The Block.

Read full article

Comments

© Olemedia

  •  

UK government will buy tech to boost AI sector in $130M growth push

The UK government will promise to buy emerging chip technology from British companies in a 100 million pound ($130 million) bid to boost growth by supporting the artificial intelligence sector.

Liz Kendall, the science secretary, said the government would offer guaranteed payments to British startups producing AI hardware that can help sectors such as life sciences and financial services.

Under a “first customer” promise modeled on the way the government bought COVID vaccines, Kendall’s department will commit in advance to buying AI inference chips that meet set performance standards.

Read full article

Comments

© Leon Neal / Staff

  •  

Oops. Cryptographers cancel election results after losing decryption key.

One of the world’s premier security organizations has canceled the results of its annual leadership election after an official lost an encryption key needed to unlock results stored in a verifiable and privacy-preserving voting system.

The International Association of Cryptologic Research (IACR) said Friday that the votes were submitted and tallied using Helios, an open source voting system that uses peer-reviewed cryptography to cast and count votes in a verifiable, confidential, and privacy-preserving way. Helios encrypts each vote in a way that assures each ballot is secret. Other cryptography used by Helios allows each voter to confirm their ballot was counted fairly.

An “honest but unfortunate human mistake”

Per the association’s bylaws, three members of the election committee act as independent trustees. To prevent two of them from colluding to cook the results, each trustee holds a third of the cryptographic key material needed to decrypt results.

Read full article

Comments

© Getty Images

  •  

How to know if your Asus router is one of thousands hacked by China-state hackers

Thousands of Asus routers have been hacked and are under the control of a suspected China-state group that has yet to reveal its intentions for the mass compromise, researchers said.

The hacking spree is either primarily or exclusively targeting seven models of Asus routers, all of which are no longer supported by the manufacturer, meaning they no longer receive security patches, researchers from SecurityScorecard said. So far, it’s unclear what the attackers do after gaining control of the devices. SecurityScorecard has named the operation WrtHug.

Staying off the radar

SecurityScorecard said it suspects the compromised devices are being used similarly to those found in ORB (operational relay box) networks, which hackers primarily use to conduct espionage to conceal their identity.

Read full article

Comments

© Olly Curtis/MacFormat Magazine/Future via Getty Images

  •  

Google tells employees it must double capacity every 6 months to meet AI demand

While AI bubble talk fills the air these days, with fears of overinvestment that could pop at any time, something of a contradiction is brewing on the ground: Companies like Google and OpenAI can barely build infrastructure fast enough to fill their AI needs.

During an all-hands meeting earlier this month, Google’s AI infrastructure head Amin Vahdat told employees that the company must double its serving capacity every six months to meet demand for artificial intelligence services, reports CNBC. The comments show a rare look at what Google executives are telling its own employees internally. Vahdat, a vice president at Google Cloud, presented slides to its employees showing the company needs to scale “the next 1000x in 4-5 years.”

While a thousandfold increase in compute capacity sounds ambitious by itself, Vahdat noted some key constraints: Google needs to be able to deliver this increase in capability, compute, and storage networking “for essentially the same cost and increasingly, the same power, the same energy level,” he told employees during the meeting. “It won’t be easy but through collaboration and co-design, we’re going to get there.”

Read full article

Comments

© Google

  •  

HP and Dell disable HEVC support built into their laptops’ CPUs

Some Dell and HP laptop owners have been befuddled by their machines’ inability to play HEVC/H.265 content in web browsers, despite their machines’ processors having integrated decoding support.

Laptops with sixth-generation Intel Core and later processors have built-in hardware support for HEVC decoding and encoding. AMD has made laptop chips supporting the codec since 2015. However, both Dell and HP have disabled this feature on some of their popular business notebooks.

HP discloses this in the data sheets for its affected laptops, which include the HP ProBook 460 G11 [PDF], ProBook 465 G11 [PDF], and EliteBook 665 G11 [PDF].

Read full article

Comments

© Getty

  •