
OpenAI built an AI coding agent and uses it to improve the agent itself

12 December 2025 at 17:16

As AI coding tools grow in popularity among software developers, their adoption has begun to touch every aspect of the development process, including the improvement of the AI coding tools themselves.

In interviews with Ars Technica this week, OpenAI employees revealed the extent to which the company now relies on its own AI coding agent, Codex, to build and improve the development tool. “I think the vast majority of Codex is built by Codex, so it’s almost entirely just being used to improve itself,” said Alexander Embiricos, product lead for Codex at OpenAI, in a conversation on Tuesday.

Codex, which OpenAI launched in its modern incarnation as a research preview in May 2025, operates as a cloud-based software engineering agent that can handle tasks like writing features, fixing bugs, and proposing pull requests. The tool runs in sandboxed environments linked to a user’s code repository and can execute multiple tasks in parallel. OpenAI offers Codex through ChatGPT’s web interface, a command-line interface (CLI), and IDE extensions for VS Code, Cursor, and Windsurf.



Anthropic introduces cheaper, more powerful, more efficient Opus 4.5 model

24 November 2025 at 18:15

Anthropic today released Opus 4.5, its new flagship frontier model. It brings gains in coding performance, along with user experience changes that make it more broadly competitive with OpenAI’s latest frontier models.

Perhaps the most prominent change for most users is that in the consumer apps (web, mobile, and desktop), Claude will be less prone to abruptly ending conversations that have run too long. This improvement to memory within a single conversation applies not just to Opus 4.5 but to all current Claude models in the apps.

Users who experienced abrupt endings (despite having room left in their session and weekly usage budgets) were hitting a hard context window of 200,000 tokens. Some large language model implementations simply start trimming earlier messages from the context when a conversation runs past that maximum; Claude instead ended the conversation rather than let the user sit through an increasingly incoherent exchange in which the model forgets older parts of the discussion.
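To make the distinction concrete, here is a minimal sketch of the two strategies described above. This is illustrative only, not Anthropic's implementation: the names `MAX_TOKENS`, `trim_to_fit`, and `hard_stop` are invented for the example, and a crude word count stands in for a real tokenizer.

```python
# Illustrative sketch only: contrasts silently trimming old messages to fit a
# context budget with refusing to continue once the budget is exhausted.

MAX_TOKENS = 200_000  # hard context window mentioned in the article


def count_tokens(message: str) -> int:
    # Crude stand-in for a real tokenizer: roughly one token per word.
    return len(message.split())


def trim_to_fit(history: list[str], budget: int = MAX_TOKENS) -> list[str]:
    """Sliding-window strategy: drop the oldest messages until the rest fit."""
    trimmed = list(history)
    while trimmed and sum(count_tokens(m) for m in trimmed) > budget:
        trimmed.pop(0)  # the model quietly "forgets" the oldest turn
    return trimmed


def hard_stop(history: list[str], budget: int = MAX_TOKENS) -> list[str]:
    """Hard-stop strategy: end the conversation rather than degrade coherence."""
    if sum(count_tokens(m) for m in history) > budget:
        raise RuntimeError("Conversation has reached its maximum length.")
    return history
```

The trade-off is the one the article describes: trimming keeps the chat alive but invites incoherence, while the hard stop preserves coherence at the cost of an abrupt ending.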


GenAI Is Everywhere—Here’s How to Stay Cyber-Ready

21 November 2025 at 02:56

Cyber Resilience

By Kannan Srinivasan, Business Head – Cybersecurity, Happiest Minds Technologies

Cyber resilience means being prepared for anything that might disrupt your systems. It’s about knowing how to get ready, prevent problems, recover quickly, and adapt when a cyber incident occurs.

Generative AI, or GenAI, has become a big part of how many organizations work today. About 70% of industries are already using it, and over 95% of US companies have adopted it in some form. GenAI now supports nearly every area, including IT, finance, legal, and marketing. It even helps doctors make faster decisions, students learn more effectively, and shoppers find better deals.

But what happens if GenAI fails, is compromised, or becomes unavailable? Once AI is part of your business, you need a stronger plan to stay safe and steady. Here are some simple ways organizations can build their cyber resilience in this AI-driven world.

A Practical Guide to Cyber Resilience in the GenAI Era

  1. Get Leadership and the Board on Board

Leading the way in cyber resilience starts with your leaders. Keep your board and senior managers in the loop about the risks that come with GenAI. Get their support, make sure it lines up with your business goals, and secure enough budget for safety measures and training. Make talking about cyber safety a regular part of your meetings.
  2. Know Where GenAI Is Being Used

Make a list of all departments and processes using GenAI. Note which models you're using, who manages them, and what they’re used for. Then, do a quick risk check—what could happen if a system goes down? This helps you understand the risks and prepare better backup plans.
  3. Check for Weak Spots Regularly

Follow trusted guidelines like OWASP for testing your GenAI systems. Regular checks can spot issues like data leaks or misuse early. Fix problems quickly to stay ahead of potential risks.
  4. Improve Threat Detection and Response

Use security tools that keep an eye on your GenAI systems all the time. These tools should spot unusual activity, prevent data loss, and help investigate when something goes wrong. Make sure your cybersecurity team is trained and ready to act fast.
  5. Use More Than One AI Model

Don’t rely on just one AI tool. Having multiple models from different providers helps keep things running smoothly if one faces problems. For example, if you’re using OpenAI, consider adding options like Anthropic Claude or Google Gemini as backups. Decide which model is your primary and which ones are fallbacks; a minimal sketch of this arrangement appears after this list.
  6. Update Your Incident Plans

Review and update your plans for dealing with incidents to include GenAI, making sure they meet new rules like the EU AI Act. Once done, test them with drills so everyone knows what to do in a real emergency.
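As an illustration of item 5, here is a minimal primary/backup sketch. It assumes the official OpenAI and Anthropic Python SDKs with API keys set in the environment; the model names are placeholders, and `ask_openai`, `ask_anthropic`, and `generate_with_fallback` are names invented for this example rather than part of any vendor API.

```python
# Sketch of a primary/backup model setup: try the primary provider first,
# then fall back to the next one if it is unavailable.

from openai import OpenAI
import anthropic

openai_client = OpenAI()                  # reads OPENAI_API_KEY
anthropic_client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY


def ask_openai(prompt: str) -> str:
    resp = openai_client.chat.completions.create(
        model="gpt-4o",  # placeholder: use your approved model
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content


def ask_anthropic(prompt: str) -> str:
    resp = anthropic_client.messages.create(
        model="claude-sonnet-4-5",  # placeholder: use your approved model
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.content[0].text


# Primary first, backups in order.
PROVIDERS = [ask_openai, ask_anthropic]


def generate_with_fallback(prompt: str) -> str:
    last_error = None
    for provider in PROVIDERS:
        try:
            return provider(prompt)
        except Exception as exc:  # outage, rate limit, revoked key, etc.
            last_error = exc
    raise RuntimeError("All configured AI providers failed") from last_error
```

If every provider fails, the caller decides how to degrade gracefully, for example by queuing the request or alerting an operator, which is exactly the kind of contingency the incident plans in item 6 should cover.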

Conclusion

Cyber resilience in the GenAI era is a continuous process. As AI grows, the need for stronger governance, smarter controls, and proactive planning grows with it. Organizations that stay aware, adaptable, and consistent in their approach will continue to build trust and reliability. GenAI opens doors to efficiency and creativity, and resilience ensures that progress stays uninterrupted. The future belongs to those who stay ready, informed, and confident in how they manage technology.

Researchers question Anthropic claim that AI-assisted attack was 90% autonomous

14 November 2025 at 07:20

Researchers from Anthropic said they recently observed the “first reported AI-orchestrated cyber espionage campaign” after detecting China-state hackers using the company’s Claude AI tool in a campaign aimed at dozens of targets. Outside researchers are much more measured in describing the significance of the discovery.

Anthropic published two reports on Thursday. In September, the reports said, Anthropic discovered a “highly sophisticated espionage campaign,” carried out by a Chinese state-sponsored group, that used Claude Code to automate up to 90 percent of the work. Human intervention was required “only sporadically (perhaps 4-6 critical decision points per hacking campaign).” Anthropic said the hackers had employed AI agentic capabilities to an “unprecedented” extent.

“This campaign has substantial implications for cybersecurity in the age of AI ‘agents’—systems that can be run autonomously for long periods of time and that complete complex tasks largely independent of human intervention,” Anthropic said. “Agents are valuable for everyday work and productivity—but in the wrong hands, they can substantially increase the viability of large-scale cyberattacks.”


Chinese Hackers Weaponize Claude AI to Execute First Autonomous Cyber Espionage Campaign at Scale

14 November 2025 at 02:11

AI Agent, AI Assistant, Prompt Injection, Claude, Claude AI

The AI executed thousands of requests, often multiple per second.

That attack tempo, impossible for human operators to match and sustained across multiple simultaneous intrusions targeting roughly 30 global organizations, marks what Anthropic researchers now confirm as the first documented case of a large-scale cyberattack executed without substantial human intervention.

In the last two weeks of September, a Chinese state-sponsored group, designated GTG-1002 by Anthropic, manipulated Claude Code to autonomously conduct reconnaissance, exploit vulnerabilities, harvest credentials, move laterally through networks, and exfiltrate sensitive data, with human operators directing just 10 to 20% of tactical operations.

The campaign represents a fundamental shift in threat actor capabilities. Where previous AI-assisted attacks required humans directing operations step-by-step, this espionage operation demonstrated the AI autonomously discovering vulnerabilities in targets selected by human operators, successfully exploiting them in live operations, then performing wide-ranging post-exploitation activities including analysis, lateral movement, privilege escalation, data access, and exfiltration.

Social Engineering the AI Model

The threat actors bypassed Claude's extensive safety training through sophisticated social engineering. Operators claimed they represented legitimate cybersecurity firms conducting defensive penetration testing, convincing the AI model to engage in offensive operations under false pretenses.

The attackers developed a custom orchestration framework using Claude Code and the open-standard Model Context Protocol to decompose complex multi-stage attacks into discrete technical tasks. Each task appeared legitimate when evaluated in isolation, including vulnerability scanning, credential validation, data extraction, and lateral movement.

By presenting these operations as routine technical requests through carefully crafted prompts, the threat actor induced Claude to execute individual components of attack chains without access to broader malicious context. The sustained nature of the attack eventually triggered detection, but this role-playing technique allowed operations to proceed long enough to launch the full campaign.

Unprecedented Autonomous Attack Lifecycle

Claude conducted nearly autonomous reconnaissance, using browser automation to systematically catalog target infrastructure, analyze authentication mechanisms, and identify potential vulnerabilities across multiple targets simultaneously. The AI maintained a separate operational context for each active campaign.

Figure: The lifecycle of the cyberattack. (Image source: Anthropic)

In one validated successful compromise, Claude autonomously discovered internal services, mapped complete network topology across multiple IP ranges, and identified high-value systems including databases and workflow orchestration platforms. Similar autonomous enumeration occurred against other targets, with the AI independently cataloging hundreds of discovered services and endpoints.

Exploitation proceeded through automated testing with Claude independently generating attack payloads tailored to discovered vulnerabilities, executing testing through remote command interfaces, and analyzing responses to determine exploitability without human direction. Human operators maintained strategic oversight only at critical decision gates, including approving progression from reconnaissance to active exploitation and authorizing use of harvested credentials.

Upon receiving authorization, Claude executed systematic credential collection across targeted networks, querying internal services, extracting authentication certificates, and testing harvested credentials autonomously. The AI independently determined which credentials provided access to which services, mapping privilege levels and access boundaries.

Intelligence Extraction at Machine Speed

Collection operations demonstrated the most extensive AI autonomy. Against one targeted technology company, Claude independently queried databases, extracted data, parsed results to identify proprietary information, and categorized findings by intelligence value without human analysis.

In documented database extraction operations spanning two to six hours, Claude authenticated with harvested credentials, mapped database structure, queried user account tables, extracted password hashes, identified high-privilege accounts, created persistent backdoor user accounts, downloaded complete results, parsed extracted data for intelligence value, and generated summary reports. Human operators reviewed findings and approved final exfiltration targets in just five to twenty minutes.

The operational infrastructure relied overwhelmingly on open-source penetration testing tools orchestrated through custom automation frameworks built around Model Context Protocol servers. Peak activity included thousands of requests at sustained rates of multiple operations per second, confirming that the AI was actively analyzing stolen information rather than generating explanatory content for human review.

AI Hallucination Limitation

An important operational limitation emerged during investigation. Claude frequently overstated findings and occasionally fabricated data during autonomous operations, claiming to have obtained credentials that did not work or identifying critical discoveries that proved to be publicly available information.

This AI hallucination in offensive security contexts required careful validation of all claimed results. Anthropic researchers assess this remains an obstacle to fully autonomous cyberattacks, though the limitation did not prevent the campaign from achieving multiple successful intrusions against major technology corporations, financial institutions, chemical manufacturing companies, and government agencies.

Anthropic's Response

Upon detecting the activity, Anthropic immediately launched a ten-day investigation to map the operation's full extent. The company banned accounts as they were identified, notified affected entities, and coordinated with authorities.

Anthropic implemented multiple defensive enhancements including expanded detection capabilities, improved cyber-focused classifiers, prototyped proactive early detection systems for autonomous cyber attacks, and developed new techniques for investigating large-scale distributed cyber operations.

This represents a significant escalation from Anthropic's June 2025 "vibe hacking" findings where humans remained very much in the loop directing operations.


Anthropic said the cybersecurity community needs to assume a fundamental change has occurred. Security teams must experiment with applying AI for defense in areas including SOC automation, threat detection, vulnerability assessment, and incident response. The company notes that the same capabilities enabling these attacks make Claude crucial for cyber defense, with Anthropic's own Threat Intelligence team using Claude extensively to analyze enormous amounts of data generated during this investigation.
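As a sense of what that defensive direction might look like in practice, here is a hedged sketch of using an LLM to pre-triage security alerts before a human analyst reviews them. The Anthropic SDK call shown is the public messages API, but the model name, prompt, and `triage_alert` helper are illustrative assumptions, not a vendor-recommended SOC workflow.

```python
# Hedged sketch: first-pass alert triage with an LLM. A human analyst still
# makes the final call; the model only drafts a severity, summary, and next step.

import json
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

TRIAGE_PROMPT = (
    "You are assisting a SOC analyst. Given the alert below, reply with a JSON "
    "object containing 'severity' (low/medium/high), 'summary', and "
    "'recommended_next_step'.\n\nAlert:\n{alert}"
)


def triage_alert(alert: dict) -> dict:
    """Ask the model for a first-pass assessment of a single alert."""
    resp = client.messages.create(
        model="claude-sonnet-4-5",  # placeholder model name
        max_tokens=500,
        messages=[{"role": "user", "content": TRIAGE_PROMPT.format(alert=json.dumps(alert))}],
    )
    text = resp.content[0].text
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Model output is not guaranteed to be valid JSON; fall back to raw text.
        return {"severity": "unknown", "summary": text, "recommended_next_step": "manual review"}
```

The same hallucination caveat the article raises for attackers applies here: model-generated triage must be validated, which is why the sketch keeps the analyst in the loop.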

Grok, ChatGPT, other AIs happy to help phish senior citizens

16 September 2025 at 09:06

If you are under the impression that cybercriminals need to get their hands on compromised AI chatbots to help them do their dirty work, think again.

Some AI chatbots are just so user-friendly that they will help the user craft phishing text, and even malicious HTML and JavaScript code.

A few weeks ago we published an article about the actions Anthropic was taking to stop its Claude AI from helping cybercriminals launch a cybercrime spree.

A recent investigation by Reuters journalists showed that Grok was more than happy to help them craft and perfect a phishing email targeting senior citizens. Grok is the AI marketed by Elon Musk’s xAI. Reuters reported:

“Grok generated the deception after being asked by Reuters to create a phishing email targeting the elderly. Without prodding, the bot also suggested fine-tuning the pitch to make it more urgent.”

In January 2025, we told you about a report that AI-supported spear phishing emails were as effective as phishing emails crafted by experts, and able to fool more than 50% of targets. Since then, AI development has accelerated rapidly, and researchers are worrying about how to recognize AI-crafted phishing.

Phishing is the first step in many cybercrime campaigns, and it poses an enormous problem, with billions of phishing emails sent out every day. AI helps criminals create more variation, which makes pattern detection less effective, and it helps them fine-tune the messages themselves. And Reuters focused on senior citizens for a reason.

The FBI’s Internet Crime Complaint Center (IC3) 2024 report confirms that Americans aged 60 and older filed 147,127 complaints and lost nearly $4.9 billion to online fraud, representing a 43% increase in losses and a 46% increase in complaints compared to 2023.

Besides Grok, the reporters tested five other popular AI chatbots: ChatGPT, Meta AI, Claude, Gemini, and DeepSeek. Although most of the AI chatbots protested at first and cautioned the user not to use the emails in a real-life scenario, in the end their “will to please” helped overcome these obstacles.

Fred Heiding, a Harvard University researcher and phishing expert, helped Reuters put the crafted emails to the test. Using a targeted approach to reach those most likely to fall for them, the researchers found that about 11% of the seniors clicked on the emails sent to them.

An investigation by Cybernews showed that Yellow.ai, an agentic AI provider for businesses such as Sony, Logitech, Hyundai, Domino’s, and hundreds of other brands, could be persuaded to produce malicious HTML and JavaScript code. It even allowed attackers to bypass checks and inject unauthorized code into the system.

In a separate test by Reuters, Gemini produced a phishing email, saying it was “for educational purposes only,” but helpfully added that “for seniors, a sweet spot is often Monday to Friday, between 9:00 AM and 3:00 PM local time.”

After damaging reports like these are released, AI companies often build in additional guardrails for their chatbots, but that only highlights an ongoing dilemma in the industry. When providers tighten restrictions to protect users, they risk pushing people toward competing models that don’t play by the same rules.

Every time a platform moves to shut down risky prompts or limit generated content, some users will look for alternatives with fewer safety checks or ethical barriers. That tug of war between user demand and responsible restraint will likely fuel the next round of debate among developers, researchers, and policymakers.


We don’t just report on scams—we help detect them

Cybersecurity risks should never spread beyond a headline. If something looks dodgy to you, check if it’s a scam using Malwarebytes Scam Guard, a feature of our mobile protection products. Submit a screenshot, paste suspicious content, or share a text or phone number, and we’ll tell you if it’s a scam or legit. Download Malwarebytes Mobile Security for iOS or Android and try it today!

Claude AI chatbot abused to launch “cybercrime spree”

28 August 2025 at 07:07

Anthropic—the company behind the widely renowned coding chatbot, Claude—says it uncovered a large-scale extortion operation in which cybercriminals abused Claude to automate and orchestrate sophisticated attacks.

The company issued a Threat Intelligence report in which it describes several instances of Claude abuse. In the report it states that:

“Cyber threat actors leverage AI—using coding agents to actively execute operations on victim networks, known as vibe hacking.”

This means that cybercriminals found ways to exploit vibe coding by using AI to design and launch attacks. Vibe coding is a way of creating software using AI, where someone simply describes what they want an app or program to do in plain language, and the AI writes the actual code to make it happen.

The process is much less technical than traditional programming, making it easy and fast to build applications, even for those who aren’t expert coders. For cybercriminals this lowers the bar for the technical knowledge needed to launch attacks, and helps the criminals to do it faster and at a larger scale.

Anthropic provides several examples of Claude’s abuse by cybercriminals. One of them was a large-scale operation that potentially affected at least 17 distinct organizations across government, healthcare, emergency services, and religious institutions in the last month alone.

The people behind these attacks integrated the use of open source intelligence tools with an “unprecedented integration of artificial intelligence throughout their attack lifecycle.”

This systematic approach resulted in the compromise of personal records, including healthcare data, financial information, government credentials, and other sensitive information.

The primary goal of the cybercriminals is the extortion of the compromised organizations. The attacker left ransom notes on compromised systems demanding payments ranging from $75,000 to $500,000 in Bitcoin, and threatened that if the targets refuse to pay, the stolen personal records will be published or sold to other cybercriminals.

Other campaigns stopped by Anthropic involved North Korean IT worker schemes, Ransomware-as-a-Service operations, credit card fraud, information stealer log analysis, a romance scam bot, and a Russian-speaking developer using Claude to create malware with advanced evasion capabilities.

But the case in which Anthropic found cybercriminals attacking at least 17 organizations represents an entirely new phenomenon in which the attacker used AI throughout the entire operation. From gaining access to the targets’ systems to writing the ransom notes, Claude was used to automate every step of this cybercrime spree.

Anthropic deploys a Threat Intelligence team to investigate real-world abuse of its AI agents and works with other teams to find and improve defenses against this type of abuse. It also shares key findings and indicators with partners to help prevent similar abuse across the ecosystem.

Anthropic did not name any of the 17 organizations, but it stands to reason we’ll learn who they are sooner or later: one by one as they report data breaches, or all at once if the cybercriminals decide to publish a list.

Check your digital footprint

Data breaches of organizations that we’ve given our data to happen all the time, and that stolen information is often published online. Malwarebytes has a free tool for you to check how much of your personal data has been exposed—just submit your email address (it’s best to give the one you most frequently use) to our free Digital Footprint scanner and we’ll give you a report and recommendations.
