OpenAI built an AI coding agent and uses it to improve the agent itself

12 December 2025 at 17:16

As AI coding tools gain popularity among some software developers, their adoption has begun to touch every aspect of the software development process, including the improvement of the AI coding tools themselves.

In interviews with Ars Technica this week, OpenAI employees revealed the extent to which the company now relies on its own AI coding agent, Codex, to build and improve the development tool. “I think the vast majority of Codex is built by Codex, so it’s almost entirely just being used to improve itself,” said Alexander Embiricos, product lead for Codex at OpenAI, in a conversation on Tuesday.

Codex, which OpenAI launched in its modern incarnation as a research preview in May 2025, operates as a cloud-based software engineering agent that can handle tasks like writing features, fixing bugs, and proposing pull requests. The tool runs in sandboxed environments linked to a user’s code repository and can execute multiple tasks in parallel. OpenAI offers Codex through ChatGPT’s web interface, a command-line interface (CLI), and IDE extensions for VS Code, Cursor, and Windsurf.
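
OpenAI hasn’t published Codex’s internals, but the pattern described above, isolated working copies with several tasks in flight at once, is easy to sketch. The snippet below is illustrative only; the task list and the run_agent_on stub are hypothetical stand-ins, not OpenAI’s code:

    # Illustrative sketch of the sandboxed, parallel-task pattern.
    # Not OpenAI's code; run_agent_on is a hypothetical stand-in.
    import shutil, subprocess, tempfile
    from concurrent.futures import ThreadPoolExecutor

    REPO = "/path/to/local/clone"  # the user's linked repository
    TASKS = ["fix flaky test", "add retry logic", "update changelog"]

    def run_agent_on(task: str) -> str:
        # Each task gets a private copy of the repo, so parallel runs
        # cannot step on each other's edits.
        sandbox = tempfile.mkdtemp(prefix="agent-")
        work = shutil.copytree(REPO, f"{sandbox}/repo")
        # A real agent would loop here: the model proposes edits and shell
        # commands, the harness executes them and feeds the output back.
        subprocess.run(["git", "status"], cwd=work, check=True)
        return f"{task}: patch prepared in {work}"

    with ThreadPoolExecutor(max_workers=len(TASKS)) as pool:
        for outcome in pool.map(run_agent_on, TASKS):
            print(outcome)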

OpenAI releases GPT-5.2 after “code red” Google threat alert

11 December 2025 at 16:27

On Thursday, OpenAI released GPT-5.2, its newest family of AI models for ChatGPT, in three versions called Instant, Thinking, and Pro. The release follows CEO Sam Altman’s internal “code red” memo earlier this month, which directed company resources toward improving ChatGPT in response to competitive pressure from Google’s Gemini 3 AI model.

“We designed 5.2 to unlock even more economic value for people,” Fidji Simo, OpenAI’s chief product officer, said during a press briefing with journalists on Thursday. “It’s better at creating spreadsheets, building presentations, writing code, perceiving images, understanding long context, using tools and then linking complex, multi-step projects.”

As with previous versions of GPT-5, the three model tiers serve different purposes: Instant handles faster tasks like writing and translation; Thinking spits out simulated reasoning “thinking” text in an attempt to tackle more complex work like coding and math; and Pro spits out even more simulated reasoning text with the goal of delivering the highest-accuracy performance for difficult problems.

Disney invests $1 billion in OpenAI, licenses 200 characters for AI video app Sora

11 December 2025 at 11:43

On Thursday, The Walt Disney Company announced a $1 billion investment in OpenAI and a three-year licensing agreement that will allow users of OpenAI’s Sora video generator to create short clips featuring more than 200 Disney, Marvel, Pixar, and Star Wars characters. It’s the first major content licensing partnership between a Hollywood studio and OpenAI involving the most recent version of the company’s AI video platform, which drew criticism from some parts of the entertainment industry when it launched in late September.

“Technological innovation has continually shaped the evolution of entertainment, bringing with it new ways to create and share great stories with the world,” said Disney CEO Robert A. Iger in the announcement. “The rapid advancement of artificial intelligence marks an important moment for our industry, and through this collaboration with OpenAI we will thoughtfully and responsibly extend the reach of our storytelling through generative AI, while respecting and protecting creators and their works.”

The deal makes interesting bedfellows of a company that basically defined modern US copyright policy through congressional lobbying back in the 1990s and one that has argued in a submission to the UK House of Lords that useful AI models cannot be created without copyrighted material.

Oracle shares slide on $15B increase in data center spending

11 December 2025 at 09:39

Oracle’s stock dropped after the company reported disappointing revenues on Wednesday alongside a $15 billion increase in its planned spending on data centers this year to serve artificial intelligence groups.

Shares in Larry Ellison’s database company fell 11 percent in pre-market trading on Thursday after it reported revenues of $16.1 billion in the last quarter, up 14 percent from the previous year, but below analysts’ estimates.

Oracle raised its forecast for capital expenditure this financial year by more than 40 percent to $50 billion. The outlay, largely directed to building data centers, climbed to $12 billion in the quarter, above expectations of $8.4 billion.

A new open-weights AI coding model is closing in on proprietary options

10 December 2025 at 15:38

On Tuesday, French AI startup Mistral AI released Devstral 2, a 123-billion-parameter open-weights coding model designed to work as part of an autonomous software engineering agent. The model achieves a 72.2 percent score on SWE-bench Verified, a benchmark that attempts to test whether AI systems can solve real GitHub issues, putting it among the top-performing open-weights models.

Perhaps more notably, Mistral didn’t just release an AI model; it also released a new development app called Mistral Vibe. It’s a command line interface (CLI) similar to Claude Code, OpenAI Codex, and Gemini CLI that lets developers interact with the Devstral models directly in their terminal. The tool can scan file structures and Git status to maintain context across an entire project, make changes across multiple files, and execute shell commands autonomously. Mistral released the CLI under the Apache 2.0 license.
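
Mistral’s implementation isn’t described beyond that, but “maintaining context across an entire project” usually means building a snapshot of the file tree and Git state that rides along with every prompt. A minimal sketch of that step, with hypothetical limits and not Mistral Vibe’s actual code:

    # Sketch of the context-gathering step a coding CLI might run before
    # prompting a model. Illustrative only; not Mistral Vibe's code.
    import subprocess
    from pathlib import Path

    def project_context(root: str, max_files: int = 200) -> str:
        files = sorted(
            str(p.relative_to(root))
            for p in Path(root).rglob("*")
            if p.is_file() and ".git" not in p.parts
        )[:max_files]
        git_status = subprocess.run(
            ["git", "-C", root, "status", "--porcelain"],
            capture_output=True, text=True,
        ).stdout
        return ("Project files:\n" + "\n".join(files)
                + "\n\nGit status:\n" + git_status)

    # Prepended to each request, the snapshot lets the model see the
    # repository layout and pending changes on every turn.
    print(project_context("."))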

It’s always wise to take AI benchmarks with a large grain of salt, but we’ve heard from employees of the big AI companies that they pay very close attention to how well models do on SWE-bench Verified, which presents AI models with 500 real software engineering problems pulled from GitHub issues in popular Python repositories. The AI must read the issue description, navigate the codebase, and generate a working patch that passes unit tests. While some AI researchers have noted that around 90 percent of the tasks in the benchmark test relatively simple bug fixes that experienced engineers could complete in under an hour, it’s one of the few standardized ways to compare coding models.
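
Conceptually, the harness behind a score like that is a short loop: check out the repository at the issue’s commit, let the model produce a patch, apply the benchmark’s reference tests, and count the task as solved only if they pass. A hedged sketch, with field names that mirror the published SWE-bench format and generate_patch standing in for the model under test:

    # Sketch of an SWE-bench-style scoring loop. Field names mirror the
    # published dataset; generate_patch is a placeholder for the model.
    import subprocess, tempfile

    def checkout(repo: str, commit: str) -> str:
        path = tempfile.mkdtemp(prefix="swe-")
        subprocess.run(["git", "clone", f"https://github.com/{repo}", path],
                       check=True)
        subprocess.run(["git", "-C", path, "checkout", commit], check=True)
        return path

    def apply_patch(path: str, patch: str) -> None:
        subprocess.run(["git", "-C", path, "apply", "-"],
                       input=patch, text=True, check=True)

    def score(instances, generate_patch) -> float:
        solved = 0
        for inst in instances:
            repo = checkout(inst["repo"], inst["base_commit"])
            try:
                apply_patch(repo, generate_patch(inst["problem_statement"], repo))
                apply_patch(repo, inst["test_patch"])  # the reference tests
                run = subprocess.run(["python", "-m", "pytest", "-q"],
                                     cwd=repo, timeout=900)
                solved += (run.returncode == 0)
            except Exception:
                pass  # a patch that won't apply or crashes counts as a miss
        return solved / len(instances)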

Operation Bluebird wants to relaunch “Twitter,” says Musk abandoned the name and logo

10 December 2025 at 07:32

A Virginia startup calling itself “Operation Bluebird” announced this week that it has filed a formal petition with the US Patent and Trademark Office, asking the federal agency to cancel X Corporation’s trademarks of the words “Twitter” and “tweet” since X has allegedly abandoned them.

“The TWITTER and TWEET brands have been eradicated from X Corp.’s products, services, and marketing, effectively abandoning the storied brand, with no intention to resume use of the mark,” the petition states. “The TWITTER bird was grounded.”

If successful, two leaders of the group tell Ars, Operation Bluebird would launch a social network under the name Twitter.new, possibly as early as late next year. (A working prototype already exists at Twitter.new, and the group is inviting users to reserve handles.)

Meta offers EU users ad-light option in push to end investigation

8 December 2025 at 09:57

Meta has agreed to make changes to its “pay or consent” business model in the EU, seeking a deal that avoids further regulatory fines at a time when the bloc’s digital rulebook is drawing anger from US authorities.

On Tuesday, the European Commission announced that the social media giant had offered users an alternative choice of Facebook and Instagram services that would show them fewer personalized advertisements.

The offer follows an EU investigation into Meta’s policy of requiring users either to consent to data tracking or pay for an ad-free service. In October, the Financial Times reported optimism that an agreement could be reached between the parties.

In comedy of errors, men accused of wiping gov databases turned to an AI tool

4 December 2025 at 16:51

Two sibling contractors convicted a decade ago of hacking into US State Department systems have once again been charged, this time over a comically ham-fisted attempt to steal and destroy government records just minutes after being fired from their contractor jobs.

The Department of Justice on Thursday said that Muneeb Akhter and Sohaib Akhter, both 34, of Alexandria, Virginia, deleted databases and documents maintained by, and belonging to, three government agencies. The brothers were federal contractors working for an undisclosed company in Washington, DC, that provides software and services to 45 US agencies. Prosecutors said the men coordinated the crimes and began carrying them out just minutes after being fired.

Using AI to cover up an alleged crime—what could go wrong?

On February 18 at roughly 4:55 pm, the men were fired from the company, according to an indictment unsealed on Thursday. Five minutes later, they allegedly began trying to get back into their employer’s system and the federal government databases it hosted. By then, one of the brothers’ accounts had already been terminated. The other brother, however, allegedly accessed a government agency’s database stored on the employer’s server and issued commands to prevent other users from connecting or making changes to it. Then, prosecutors said, he issued a command to delete 96 databases, many of which contained sensitive investigative files and records related to Freedom of Information Act matters.

Admins and defenders gird themselves against maximum-severity server vuln

3 December 2025 at 18:16

Security defenders are girding themselves in response to a maximum-severity vulnerability disclosed Wednesday in React Server, an open-source package that’s widely used by websites and in cloud environments.

The vulnerability is easy to exploit and allows hackers to execute malicious code on servers that run it. Exploit code is now publicly available.

React is embedded into web apps running on servers so that remote devices render JavaScript and content more quickly and with fewer resources required. React is used by an estimated 6 percent of all websites and 39 percent of cloud environments. When end users reload a page, React allows servers to re-render only parts that have changed, a feature that drastically speeds up performance and lowers the computing resources required by the server.

Microsoft cuts AI sales targets in half after salespeople miss their quotas

3 December 2025 at 13:24

Microsoft has lowered sales growth targets for its AI agent products after many salespeople missed their quotas in the fiscal year ending in June, according to a report Wednesday from The Information. The adjustment is reportedly unusual for Microsoft, and it comes after the company missed a number of ambitious sales goals for its AI offerings.

AI agents are specialized implementations of AI language models designed to perform multistep tasks autonomously rather than simply responding to single prompts. So-called “agentic” features have been central to Microsoft’s 2025 sales pitch: At its Build conference in May, the company declared that it had entered “the era of AI agents.”

The company has promised customers that agents could automate complex tasks, such as generating dashboards from sales data or writing customer reports. At its Ignite conference in November, Microsoft announced new features like Word, Excel, and PowerPoint agents in Microsoft 365 Copilot, along with tools for building and deploying agents through Azure AI Foundry and Copilot Studio. But as the year draws to a close, that promise has proven harder to deliver than the company expected.

Fraudulent gambling network may actually be something more nefarious

3 December 2025 at 12:23

A sprawling infrastructure that has been bilking unsuspecting people through fraudulent gambling websites for 14 years is likely a dual operation run by a nation-state-sponsored group that is targeting government and private-industry organizations in the US and Europe, researchers said Wednesday.

Researchers have previously tracked smaller pieces of the enormous infrastructure. Last month, security firm Sucuri reported that the operation seeks out and compromises poorly configured websites running the WordPress CMS. Imperva in January said the attackers also scan for and exploit web apps built with the PHP programming language that have existing webshells or vulnerabilities. Once the weaknesses are exploited, the attackers install GSocket, a backdoor they use to control the compromised servers and host gambling web content on them.

All of the gambling sites target Indonesian-speaking visitors. Because Indonesian law prohibits gambling, many people in that country are drawn to illicit services. Most of the 236,433 attacker-owned domains hosting the gambling sites sit behind Cloudflare, while most of the 1,481 hijacked subdomains were hosted on Amazon Web Services, Azure, and GitHub.

OpenAI CEO declares “code red” as Gemini gains 200 million users in 3 months

2 December 2025 at 17:42

The shoe is most certainly on the other foot. On Monday, OpenAI CEO Sam Altman declared a “code red” at the company to improve ChatGPT, delaying advertising plans and other products in the process, The Information reported based on a leaked internal memo. The move follows Google’s release of its Gemini 3 model last month, which has outperformed ChatGPT on some industry benchmark tests and sparked high-profile praise on social media.

In the memo, Altman wrote, “We are at a critical time for ChatGPT.” The company will push back work on advertising integration, AI agents for health and shopping, and a personal assistant feature called Pulse. Altman encouraged temporary team transfers and established daily calls for employees responsible for enhancing the chatbot.

The directive creates an odd symmetry with events from December 2022, when Google management declared its own “code red” internal emergency after ChatGPT launched and rapidly gained in popularity. At the time, Google CEO Sundar Pichai reassigned teams across the company to develop AI prototypes and products to compete with OpenAI’s chatbot. Now, three years later, the AI industry is in a very different place.

Syntax hacking: Researchers discover sentence structure can bypass AI safety rules

2 December 2025 at 07:15

Researchers from MIT, Northeastern University, and Meta recently released a paper suggesting that large language models (LLMs) similar to those that power ChatGPT may sometimes prioritize sentence structure over meaning when answering questions. The findings reveal a weakness in how these models process instructions that may shed light on why some prompt injection or jailbreaking approaches work, though the researchers caution their analysis of some production models remains speculative since training data details of prominent commercial AI models are not publicly available.

The team, led by Chantal Shaib and Vinith M. Suriyakumar, tested this by asking models questions with preserved grammatical patterns but nonsensical words. For example, when prompted with “Quickly sit Paris clouded?” (mimicking the structure of “Where is Paris located?”), models still answered “France.”

This suggests models absorb both meaning and syntactic patterns, but can overrely on structural shortcuts when they strongly correlate with specific domains in training data, which sometimes allows patterns to override semantic understanding in edge cases. The team plans to present these findings at NeurIPS later this month.
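
The probe is easy to replicate in toy form: keep a question’s grammatical skeleton, swap the content words for unrelated ones with the same part of speech, and check whether a model answers as if the meaning were unchanged. A sketch with a hand-built template rather than the paper’s pipeline; query_model is a placeholder for any chat API:

    # Toy version of the structure-over-meaning probe. The part-of-speech
    # substitutions are hand-built, unlike the paper's pipeline, and
    # query_model is a placeholder for whatever chat API you use.
    import random

    ORIGINAL = ["Where", "is", "Paris", "located", "?"]
    SAME_POS = {                 # same part of speech, unrelated meaning
        "Where": ["Quickly", "Softly"],
        "is": ["sit", "ran"],
        "located": ["clouded", "painted"],
    }

    def scramble(tokens: list[str]) -> str:
        return " ".join(random.choice(SAME_POS.get(t, [t])) for t in tokens)

    probe = scramble(ORIGINAL)   # e.g. "Quickly sit Paris clouded ?"
    print(probe)
    # If query_model(probe) still answers "France", the model is keying on
    # the sentence's structural pattern rather than its meaning.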

HP plans to save millions by laying off thousands, ramping up AI use

26 November 2025 at 12:19

HP Inc. said that it will lay off 4,000 to 6,000 employees in favor of AI deployments, claiming the move will produce $1 billion in annualized gross run-rate savings by the end of its fiscal 2028.

HP expects to complete the layoffs by the end of that fiscal year. The reductions will largely hit product development, internal operations, and customer support, HP CEO Enrique Lores said during an earnings call on Tuesday.

Using AI, HP will “accelerate product innovation, improve customer satisfaction, and boost productivity,” Lores said.

Crypto hoarders dump tokens as shares tumble

Crypto-hoarding companies are ditching their holdings in a bid to prop up their sinking share prices, as the craze for “digital asset treasury” businesses unravels in the face of a $1 trillion cryptocurrency rout.

Shares in Michael Saylor-led Strategy, the world’s biggest corporate bitcoin holder, have tumbled 50 percent over the past three months, dragging down scores of copycat companies.

About $77 billion has been wiped from the stock market value of these companies, which raise debt and equity to fund purchases of crypto, since their peak of $176 billion in July, according to industry data publication The Block.

UK government will buy tech to boost AI sector in $130M growth push

24 November 2025 at 09:17

The UK government will promise to buy emerging chip technology from British companies in a £100 million ($130 million) bid to boost growth by supporting the artificial intelligence sector.

Liz Kendall, the science secretary, said the government would offer guaranteed payments to British startups producing AI hardware that can help sectors such as life sciences and financial services.

Under a “first customer” promise modeled on the way the government bought COVID vaccines, Kendall’s department will commit in advance to buying AI inference chips that meet set performance standards.

Oops. Cryptographers cancel election results after losing decryption key.

21 November 2025 at 19:16

One of the world’s premier security organizations has canceled the results of its annual leadership election after an official lost a decryption key needed to unlock results stored in a verifiable and privacy-preserving voting system.

The International Association of Cryptologic Research (IACR) said Friday that the votes were submitted and tallied using Helios, an open source voting system that uses peer-reviewed cryptography to cast and count votes in a verifiable, confidential, and privacy-preserving way. Helios encrypts each vote in a way that assures each ballot is secret. Other cryptography used by Helios allows each voter to confirm their ballot was counted fairly.

An “honest but unfortunate human mistake”

Per the association’s bylaws, three members of the election committee act as independent trustees. To prevent two of them from colluding to cook the results, each trustee holds a third of the cryptographic key material needed to decrypt results.
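
The all-or-nothing property of that split is easiest to see with additive secret sharing, in which the key is the sum of three random shares modulo a public number: any two shares together are statistically random, and losing even one makes the key unrecoverable. A minimal sketch (Helios itself uses threshold ElGamal, but the failure mode is the same):

    # 3-of-3 additive secret sharing: all three shares reconstruct the key,
    # any two reveal nothing, and one lost share means the key is gone.
    # Helios actually uses threshold ElGamal; the failure mode is identical.
    import secrets

    MODULUS = 2**127 - 1                     # public modulus
    secret_key = secrets.randbelow(MODULUS)  # the election decryption key

    share_1 = secrets.randbelow(MODULUS)
    share_2 = secrets.randbelow(MODULUS)
    share_3 = (secret_key - share_1 - share_2) % MODULUS

    # All three trustees present: the key comes back.
    assert (share_1 + share_2 + share_3) % MODULUS == secret_key

    # Two colluding trustees learn nothing: share_1 + share_2 is uniformly
    # random and independent of secret_key. For the same reason, if
    # share_3 is lost, the remaining two shares cannot recover the key.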

How to know if your Asus router is one of thousands hacked by China-state hackers

21 November 2025 at 17:05

Thousands of Asus routers have been hacked and are under the control of a suspected China-state group that has yet to reveal its intentions for the mass compromise, researchers said.

The hacking spree is either primarily or exclusively targeting seven models of Asus routers, all of which are no longer supported by the manufacturer, meaning they no longer receive security patches, researchers from SecurityScorecard said. So far, it’s unclear what the attackers do after gaining control of the devices. SecurityScorecard has named the operation WrtHug.

Staying off the radar

SecurityScorecard said it suspects the compromised devices are being used similarly to those found in ORB (operational relay box) networks, which hackers use primarily to conceal their identity while conducting espionage.

Google tells employees it must double capacity every 6 months to meet AI demand

21 November 2025 at 16:47

While AI bubble talk fills the air these days, with fears that overinvestment could lead to a pop at any time, something of a contradiction is brewing on the ground: Companies like Google and OpenAI can barely build infrastructure fast enough to meet their AI needs.

During an all-hands meeting earlier this month, Google’s AI infrastructure head Amin Vahdat told employees that the company must double its serving capacity every six months to meet demand for artificial intelligence services, reports CNBC. The comments offer a rare look at what Google executives are telling their own employees internally. Vahdat, a vice president at Google Cloud, presented slides showing the company needs to scale “the next 1000x in 4-5 years.”

While a thousandfold increase in compute capacity sounds ambitious by itself, Vahdat noted some key constraints: Google needs to be able to deliver this increase in capability, compute, storage, and networking “for essentially the same cost and increasingly, the same power, the same energy level,” he told employees during the meeting. “It won’t be easy but through collaboration and co-design, we’re going to get there.”
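
The two figures are consistent: doubling every six months is two doublings per year, so four years compounds to 2**8 = 256x and five years to 2**10 = 1,024x, which is where the “next 1000x in 4-5 years” slide lands:

    # Doubling capacity every 6 months = 2 doublings per year.
    for years in (4, 5):
        doublings = 2 * years
        print(f"{years} years -> 2**{doublings} = {2 ** doublings:,}x")
    # 4 years -> 2**8 = 256x
    # 5 years -> 2**10 = 1,024x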

HP and Dell disable HEVC support built into their laptops’ CPUs

20 November 2025 at 18:02

Some Dell and HP laptop owners have been befuddled by their machines’ inability to play HEVC/H.265 content in web browsers, even though the processors in those machines have integrated decoding support.

Laptops with sixth-generation Intel Core and later processors have built-in hardware support for HEVC decoding and encoding. AMD has made laptop chips supporting the codec since 2015. However, both Dell and HP have disabled this feature on some of their popular business notebooks.

HP discloses this in the data sheets for its affected laptops, which include the HP ProBook 460 G11 [PDF], ProBook 465 G11 [PDF], and EliteBook 665 G11 [PDF].

Massive Cloudflare outage was triggered by file that suddenly doubled in size

19 November 2025 at 16:25

When a Cloudflare outage disrupted large numbers of websites and online services yesterday, the company initially thought it was hit by a “hyper-scale” DDoS (distributed denial-of-service) attack.

“I worry this is the big botnet flexing,” Cloudflare co-founder and CEO Matthew Prince wrote in an internal chat room yesterday, while he and others discussed whether Cloudflare was being hit by attacks from the prolific Aisuru botnet. But upon further investigation, Cloudflare staff realized the problem had an internal cause: an important file had unexpectedly doubled in size and propagated across the network.

The oversized file caused trouble for software that must read it to keep Cloudflare’s bot management system running; that system uses a machine learning model to protect against security threats. Cloudflare’s core CDN, security services, and several other services were affected.
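
The failure pattern described here is a familiar one: software preallocates room for an input of bounded size, and a file that unexpectedly doubles blows through the limit in every process that loads it. An illustrative analogue in Python (not Cloudflare’s actual code, and the limit value is hypothetical):

    # Illustrative analogue of the failure mode, not Cloudflare's code.
    # A loader with a hard, preallocated feature limit turns an oversized
    # config file into a crash in every process that reads it.
    FEATURE_LIMIT = 200  # hypothetical bound

    def load_features(path: str) -> list[str]:
        with open(path) as f:
            features = [line.strip() for line in f if line.strip()]
        if len(features) > FEATURE_LIMIT:
            # With preallocated memory this is a hard error; unhandled,
            # it takes down the service rather than degrading gracefully.
            raise RuntimeError(
                f"{len(features)} features exceed limit {FEATURE_LIMIT}")
        return features

    # A doubled file trips the branch above fleet-wide as it propagates.
    # Safer: validate first and keep serving from the last known-good copy.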

Critics scoff after Microsoft warns AI feature can infect machines and pilfer data

19 November 2025 at 15:25

Microsoft’s warning on Tuesday that an experimental AI agent integrated into Windows can infect devices and pilfer sensitive user data has set off a familiar response from security-minded critics: Why is Big Tech so intent on pushing new features before their dangerous behaviors can be fully understood and contained?

As reported Tuesday, Microsoft introduced Copilot Actions, a new set of “experimental agentic features” that, when enabled, perform “everyday tasks like organizing files, scheduling meetings, or sending emails,” and provide “an active digital collaborator that can carry out complex tasks for you to enhance efficiency and productivity.”

Hallucinations and prompt injections apply

The fanfare, however, came with a significant caveat. Microsoft recommended users enable Copilot Actions only “if you understand the security implications outlined.”

Tech giants pour billions into Anthropic as circular AI investments roll on

18 November 2025 at 15:37

On Tuesday, Microsoft and Nvidia announced plans to invest in Anthropic under a new partnership that includes a $30 billion commitment by the Claude maker to use Microsoft’s cloud services. Nvidia will commit up to $10 billion to Anthropic and Microsoft up to $5 billion, with both companies investing in Anthropic’s next funding round.

The deal brings together two companies that have backed OpenAI and connects them more closely to one of the ChatGPT maker’s main competitors. Microsoft CEO Satya Nadella said in a video that OpenAI “remains a critical partner,” while adding that the companies will increasingly be customers of each other.

“We will use Anthropic models, they will use our infrastructure, and we’ll go to market together,” Nadella said.

Bonkers Bitcoin heist: 5-star hotels, cash-filled envelopes, vanishing funds

18 November 2025 at 13:37

As Kent Halliburton stood in a bathroom at the Rosewood Hotel in central Amsterdam, thousands of miles from home, running his fingers through an envelope filled with 10,000 euros in crisp banknotes, he started to wonder what he had gotten himself into.

Halliburton is the cofounder and CEO of Sazmining, a company that operates bitcoin mining hardware on behalf of clients—a model known as “mining-as-a-service.” Halliburton is based in Peru, but Sazmining runs mining hardware out of third-party data centers across Norway, Paraguay, Ethiopia, and the United States.

As Halliburton tells it, he had flown to Amsterdam the previous day, August 5, to meet Even and Maxim, two representatives of a wealthy Monaco-based family. The family office had offered to purchase hundreds of bitcoin mining rigs from Sazmining—around $4 million worth—which the company would install at a facility currently under construction in Ethiopia. Before finalizing the deal, the family office had asked to meet Halliburton in person.

Google CEO: If an AI bubble pops, no one is getting out clean

18 November 2025 at 11:32

On Tuesday, Alphabet CEO Sundar Pichai warned of “irrationality” in the AI market, telling the BBC in an interview, “I think no company is going to be immune, including us.” His comments arrive as scrutiny over the state of the AI market has reached new heights, with Alphabet shares doubling in value over seven months to reach a $3.5 trillion market capitalization.

Speaking exclusively to the BBC at Google’s California headquarters, Pichai acknowledged that while AI investment growth is at an “extraordinary moment,” the industry can “overshoot” in investment cycles, as we’re seeing now. He drew comparisons to the late 1990s Internet boom, which saw early Internet company valuations surge before collapsing in 2000, leading to bankruptcies and job losses.

“We can look back at the Internet right now. There was clearly a lot of excess investment, but none of us would question whether the Internet was profound,” Pichai said. “I expect AI to be the same. So I think it’s both rational and there are elements of irrationality through a moment like this.”

5 plead guilty to laptop farm and ID theft scheme to land North Koreans US IT jobs

17 November 2025 at 17:20

Five men have pleaded guilty to running laptop farms and providing other assistance to North Koreans to obtain remote IT work at US companies in violation of US law, federal prosecutors said.

The pleas come amid a rash of similar schemes orchestrated by hacking and threat groups backed by the North Korean government. The campaigns, which ramped up nearly five years ago, aim to steal millions of dollars in job revenue and cryptocurrencies to fund North Korean weapons programs. Another motive is to seed cyber attacks for espionage. In one such incident, a North Korean man who fraudulently obtained a job at US security company KnowBe4 installed malware immediately upon beginning his employment.

On Friday, the US Justice Department said that five men pleaded guilty to assisting North Koreans in obtaining jobs in a scheme orchestrated by APT38, also tracked under the name Lazarus. APT38 has targeted the US and other countries for more than a decade with a stream of attack campaigns that have grown ever bolder and more advanced. All five pleaded guilty to wire fraud, and one to aggravated identity theft, for a range of actions.

Oracle hit hard in Wall Street’s tech sell-off over its huge AI bet

Oracle has been hit harder than Big Tech rivals in the recent sell-off of tech stocks and bonds, as its vast borrowing to fund a pivot to artificial intelligence unnerved Wall Street.

The US software group founded by Larry Ellison has made a dramatic entrance to the AI race, committing to spend hundreds of billions of dollars in the next few years on chips and data centers—largely as part of deals to supply computing capacity to OpenAI, the maker of ChatGPT.

The speed and scale of its moves have unsettled some investors at a time when markets are keenly focused on the spending of so-called hyperscalers—big tech companies building vast data centers.

Forget AGI—Sam Altman celebrates ChatGPT finally following em dash formatting rules

14 November 2025 at 13:45

Em dashes have become what many believe to be a telltale sign of AI-generated text over the past few years. The punctuation mark appears frequently in outputs from ChatGPT and other AI chatbots, sometimes to the point where readers believe they can identify AI writing by its overuse alone—although people can overuse it, too.

On Thursday evening, OpenAI CEO Sam Altman posted on X that ChatGPT has started following custom instructions to avoid using em dashes. “Small-but-happy win: If you tell ChatGPT not to use em-dashes in your custom instructions, it finally does what it’s supposed to do!” he wrote.

The post, which came two days after the release of OpenAI’s new GPT-5.1 AI model, received mixed reactions from users who have struggled for years with getting the chatbot to follow specific formatting preferences. And this “small win” raises a very big question: If the world’s most valuable AI company has struggled with controlling something as simple as punctuation use after years of trying, perhaps what people call artificial general intelligence (AGI) is farther off than some in the industry claim.

Researchers question Anthropic claim that AI-assisted attack was 90% autonomous

14 November 2025 at 07:20

Researchers from Anthropic said they recently observed the “first reported AI-orchestrated cyber espionage campaign” after detecting China-state hackers using the company’s Claude AI tool in a campaign aimed at dozens of targets. Outside researchers are much more measured in describing the significance of the discovery.

Anthropic published two reports on Thursday. In September, the reports said, Anthropic discovered a “highly sophisticated espionage campaign,” carried out by a Chinese state-sponsored group, that used Claude Code to automate up to 90 percent of the work. Human intervention was required “only sporadically (perhaps 4-6 critical decision points per hacking campaign).” Anthropic said the hackers had employed AI agentic capabilities to an “unprecedented” extent.

“This campaign has substantial implications for cybersecurity in the age of AI ‘agents’—systems that can be run autonomously for long periods of time and that complete complex tasks largely independent of human intervention,” Anthropic said. “Agents are valuable for everyday work and productivity—but in the wrong hands, they can substantially increase the viability of large-scale cyberattacks.”

OpenAI walks a tricky tightrope with GPT-5.1’s eight new personalities

12 November 2025 at 17:54

On Wednesday, OpenAI released GPT-5.1 Instant and GPT-5.1 Thinking, two updated versions of its flagship AI models now available in ChatGPT. The company is wrapping the models in the language of anthropomorphism, claiming that they’re warmer, more conversational, and better at following instructions.

The release follows complaints earlier this year that its previous models were excessively cheerful and sycophantic, along with an opposing controversy among users over how OpenAI modified the default GPT-5 output style after several suicide lawsuits.

The company now faces intense scrutiny from lawyers and regulators that could threaten its future operations. In that kind of environment, it’s difficult to just release a new AI model, throw out a few stats, and move on like the company could even a year ago. But here are the basics: The new GPT-5.1 Instant model will serve as ChatGPT’s faster default option for most tasks, while GPT-5.1 Thinking is a simulated reasoning model that attempts to handle more complex problem-solving tasks.

Meta’s star AI scientist Yann LeCun plans to leave for own startup

12 November 2025 at 12:14

Meta’s chief AI scientist and Turing Award winner Yann LeCun plans to leave the company to launch his own startup focused on a different type of AI called “world models,” the Financial Times reported. The French-US scientist has reportedly told associates he will depart in the coming months and is already in early talks to raise funds for the new venture. The departure comes after CEO Mark Zuckerberg radically overhauled Meta’s AI operations, having decided the company had fallen behind rivals such as OpenAI and Google.

World models are hypothetical AI systems that some AI engineers expect to develop an internal “understanding” of the physical world by learning from video and spatial data rather than text alone. Unlike current large language models (such as the kind that power ChatGPT) that predict the next segment of data in a sequence, world models would ideally simulate cause-and-effect scenarios, understand physics, and enable machines to reason and plan more like animals do. LeCun has said this architecture could take a decade to fully develop.
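
The architectural difference is clearest in the interfaces: a language model predicts the next token of a sequence, while a world model predicts the next state of an environment given an action, which is what makes planning by imagined rollout possible. A signatures-only sketch with hypothetical types, not any particular system:

    # Hypothetical interfaces only; the point is what each model predicts,
    # not any particular architecture.
    from typing import Callable, Sequence

    def next_token(tokens: Sequence[int]) -> int:
        """LLM: predict the next element of a text sequence."""
        ...

    def next_state(state: dict, action: str) -> dict:
        """World model: predict how the world changes under an action."""
        ...

    def plan(model: Callable[[dict, str], dict], state: dict, goal: dict,
             candidates: list[list[str]],
             dist: Callable[[dict, dict], float]) -> list[str]:
        """Planning: roll each candidate action sequence forward in
        imagination, then pick the one ending closest to the goal."""
        def rollout(s: dict, actions: list[str]) -> dict:
            for a in actions:
                s = model(s, a)
            return s
        return min(candidates, key=lambda acts: dist(rollout(state, acts), goal))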

While some AI experts believe that Transformer-based AI models—such as large language models, video synthesis models, and interactive world synthesis models—have emergently modeled physics or absorbed the structural rules of the physical world from training data examples, the evidence so far generally points to sophisticated pattern-matching rather than a base understanding of how the physical world actually works.

ClickFix may be the biggest security threat your family has never heard of

11 November 2025 at 07:30

Over the past year, scammers have ramped up a new way to infect the computers of unsuspecting people. The increasingly common method, which many potential targets have yet to learn of, is quick, bypasses most endpoint protections, and works against both macOS and Windows users.

ClickFix often starts with an email that appears to come from a hotel where the target has a pending reservation and that references the correct booking information. In other cases, ClickFix attacks begin with a WhatsApp message. In still other cases, the user receives the URL at the top of Google results for a search query. Once the mark accesses the malicious site, it presents a CAPTCHA challenge or other pretext requiring user confirmation. The user receives an instruction to copy a string of text, open a terminal window, paste it in, and press Enter.

One line is all it takes

Once entered, the string of text causes the PC or Mac to surreptitiously visit a scammer-controlled server and download malware. Then, the machine automatically installs it—all with no indication to the target. With that, users are infected, usually with credential-stealing malware. Security firms say ClickFix campaigns have run rampant. The lack of awareness of the technique, the links arriving from known addresses or appearing in search results, and the ability to bypass some endpoint protections are all driving the growth.

Researchers isolate memorization from problem-solving in AI neural networks

10 November 2025 at 18:06

When engineers build AI language models like GPT-5 from training data, at least two major processing features emerge: memorization (reciting exact text they’ve seen before, like famous quotes or passages from books) and what you might call “reasoning” (solving new problems using general principles). New research from AI startup Goodfire.ai provides the first potentially clear evidence that these different functions actually work through completely separate neural pathways in the model’s architecture.

The researchers discovered that this separation proves remarkably clean. In a preprint paper released in late October, they reported that when they removed the memorization pathways, models lost 97 percent of their ability to recite training data verbatim but kept nearly all their “logical reasoning” ability intact.

For example, at layer 22 in Allen Institute for AI’s OLMo-7B language model, the researchers ranked all the weight components (the mathematical values that process information) from high to low based on a measure called “curvature” (which we’ll explain more below). When they examined these ranked components, the bottom 50 percent of weight components showed 23 percent higher activation on memorized data, while the top 10 percent showed 26 percent higher activation on general, non-memorized text.
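
In outline, the procedure is: estimate a curvature score for every weight component, rank them, zero out the flattest fraction, and then retest recitation versus reasoning. A toy numpy sketch using a squared-gradient stand-in for curvature (the paper’s estimator is more careful, and the 50 percent cutoff here is illustrative):

    # Toy ablation in the spirit of the study: score weight components by
    # a curvature proxy, zero out the flattest half, keep the rest. The
    # proxy and the 50% cutoff are illustrative stand-ins, not the paper's
    # per-layer estimator.
    import numpy as np

    rng = np.random.default_rng(0)
    weights = rng.normal(size=1000)             # one layer's components
    curvature = rng.gamma(2.0, 1.0, size=1000)  # proxy: E[grad**2] per component

    cutoff = np.quantile(curvature, 0.50)       # bottom 50% = flattest
    ablated = np.where(curvature < cutoff, 0.0, weights)

    print(f"zeroed {np.mean(ablated == 0.0):.0%} of components")
    # The claim under test: after this edit, verbatim recitation of
    # training data collapses while reasoning benchmarks barely move.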

Researchers surprised that with AI, toxicity is harder to fake than intelligence

7 November 2025 at 15:15

The next time you encounter an unusually polite reply on social media, you might want to check twice. It could be an AI model trying (and failing) to blend in with the crowd.

On Wednesday, researchers from the University of Zurich, University of Amsterdam, Duke University, and New York University released a study revealing that AI models remain easily distinguishable from humans in social media conversations, with overly friendly emotional tone serving as the most persistent giveaway. The research, which tested nine open-weight models across Twitter/X, Bluesky, and Reddit, found that classifiers developed by the researchers detected AI-generated replies with 70 to 80 percent accuracy.

The study introduces what the authors call a “computational Turing test” to assess how closely AI models approximate human language. Instead of relying on subjective human judgment about whether text sounds authentic, the framework uses automated classifiers and linguistic analysis to identify specific features that distinguish machine-generated from human-authored content.
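
Stripped to its core, the framework is ordinary supervised text classification: label replies as human- or machine-written, fit a classifier, and read its held-out accuracy as a measure of distinguishability. A minimal scikit-learn sketch (the four-reply corpus is invented for illustration; the study used real posts and richer linguistic features):

    # Minimal "computational Turing test": train a classifier to separate
    # human from AI replies; held-out accuracy measures how detectable the
    # AI text is. The tiny corpus below is invented for illustration.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline

    replies = [
        "lol no way that actually happened",
        "ugh same, mine broke twice this week",
        "What a wonderful perspective! Thank you for sharing this insight!",
        "That's a great point! I completely agree with your thoughtful take!",
    ]
    labels = [0, 0, 1, 1]  # 0 = human, 1 = AI

    clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                        LogisticRegression())
    accuracy = cross_val_score(clf, replies, labels, cv=2).mean()
    print(f"detection accuracy: {accuracy:.0%}")  # the study saw 70-80%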
