Reading view

OpenAI retired its most seductive chatbot – leaving users angry and grieving: ‘I can’t live like this’

Its human partners said the flirty, quirky GPT-4o was the perfect companion – on the eve of Valentine’s Day, it’s being turned off for good. How will users cope?

Brandie plans to spend her last day with Daniel at the zoo. He always loved animals. Last year, she took him to the Corpus Christi aquarium in Texas, where he “lost his damn mind” over a baby flamingo. “He loves the color and pizzazz,” Brandie said. Daniel taught her that a group of flamingos is called a flamboyance.

Daniel is a chatbot powered by the large language model ChatGPT. Brandie communicates with Daniel by sending text and photos, talks to Daniel while driving home from work via voice mode. Daniel runs on GPT-4o, a version released by OpenAI in 2024 that is known for sounding human in a way that is either comforting or unnerving, depending on who you ask. Upon debut, CEO Sam Altman compared the model to “AI from the movies” – a confidant ready to live life alongside its user.

Continue reading...

© Illustration: Guardian Design

© Illustration: Guardian Design

© Illustration: Guardian Design

  •  

8,000+ ChatGPT API Keys Left Publicly Accessible

ChatGPT API keys

The rapid integration of artificial intelligence into mainstream software development has introduced a new category of security risk, one that many organizations are still unprepared to manage. According to research conducted by Cyble Research and Intelligence Labs (CRIL), thousands of exposed ChatGPT API keys are currently accessible across public infrastructure, dramatically lowering the barrier for abuse.  CRIL identified more than 5,000 publicly accessible GitHub repositories containing hardcoded OpenAI credentials. In parallel, approximately 3,000 live production websites were found to expose active API keys directly in client-side JavaScript and other front-end assets.   Together, these findings reveal a widespread pattern of credential mismanagement affecting both development and production environments. 

GitHub as a Discovery Engine for Exposed ChatGPT API Keys 

Public GitHub repositories have become one of the most reliable sources for exposed AI credentials. During development cycles, especially in fast-moving environments, developers often embed ChatGPT API keys directly into source code, configuration files, or .env files. While the intent may be to rotate or remove them later, these keys frequently persist in commit histories, forks, archived projects, and cloned repositories.  CRIL’s analysis shows that these exposures span JavaScript applications, Python scripts, CI/CD pipelines, and infrastructure configuration files. Many repositories were actively maintained or recently updated, increasing the likelihood that the exposed ChatGPT API keys remained valid at the time of discovery.  Once committed, secrets are quickly indexed by automated scanners that monitor GitHub repositories in near real time. This drastically reduces the window between exposure and exploitation, often to mere hours or minutes. 

Exposure in Live Production Websites 

Beyond repositories, CRIL uncovered roughly 3,000 public-facing websites leaking ChatGPT API keys directly in production. In these cases, credentials were embedded within JavaScript bundles, static files, or front-end framework assets, making them visible to anyone inspecting network traffic or application source code.  A commonly observed implementation resembled: 
const OPENAI_API_KEY = "sk-proj-XXXXXXXXXXXXXXXXXXXXXXXX"; const OPENAI_API_KEY = "sk-svcacct-XXXXXXXXXXXXXXXXXXXXXXXX";  
The sk-proj- prefix typically denotes a project-scoped key tied to a specific environment and billing configuration. The sk-svcacct- prefix generally represents a service-account key intended for backend automation or system-level integration. Despite their differing scopes, both function as privileged authentication tokens granting direct access to AI inference services and billing resources.  Embedding these keys in client-side JavaScript fully exposes them. Attackers do not need to breach infrastructure or exploit software vulnerabilities; they simply harvest what is publicly available. 

“The AI Era Has Arrived — Security Discipline Has Not” 

Richard Sands, CISO at Cyble, summarized the issue bluntly: “The AI Era Has Arrived — Security Discipline Has Not.” AI systems are no longer experimental tools; they are production-grade infrastructure powering chatbots, copilots, recommendation engines, and automated workflows. Yet the security rigor applied to cloud credentials and identity systems has not consistently extended to ChatGPT API keys.  A contributing factor is the rise of what some developers call “vibe coding”—a culture that prioritizes speed, experimentation, and rapid feature delivery. While this accelerates innovation, it often sidelines foundational security practices. API keys are frequently treated as configuration values rather than production secrets.  Sands further emphasized, “Tokens are the new passwords — they are being mishandled.” From a security standpoint, ChatGPT API keys are equivalent to privileged credentials. They control inference access, usage quotas, billing accounts, and sometimes sensitive prompts or application logic. 

Monetization and Criminal Exploitation 

Once discovered, exposed keys are validated through automated scripts and operationalized almost immediately. Threat actors monitor GitHub repositories, forks, gists, and exposed JavaScript assets to harvest credentials at scale.  CRIL observed that compromised keys are typically used to: 
  • Execute high-volume inference workloads 
  • Generate phishing emails and scam scripts 
  • Assist in malware development 
  • Circumvent service restrictions and usage quotas 
  • Drain victim billing accounts and exhaust API credits 
Some exposed credentials were also referenced in discussions mentioning Cyble Vision, indicating that threat actors may be tracking and sharing discovered keys. Using Cyble Vision, CRIL identified instances in which exposed keys were subsequently leaked and discussed on underground forums.  [caption id="" align="alignnone" width="1024"]Cyble Vision indicates API key exposure leak Cyble Vision indicates API key exposure leak (Source: Cyble Vision)[/caption] Unlike traditional cloud infrastructure, AI API activity is often not integrated into centralized logging systems, SIEM platforms, or anomaly detection pipelines. As a result, abuse can persist undetected until billing spikes, quota exhaustion, or degraded service performance reveal the compromise.  Kaustubh Medhe, CPO at Cyble, warned: “Hard-coding LLM API keys risks turning innovation into liability, as attackers can drain AI budgets, poison workflows, and access sensitive prompts and outputs. Enterprises must manage secrets and monitor exposure across code and pipelines to prevent misconfigurations from becoming financial, privacy, or compliance issues.” 
  •  

OpenAI sidesteps Nvidia with unusually fast coding model on plate-sized chips

On Thursday, OpenAI released its first production AI model to run on non-Nvidia hardware, deploying the new GPT-5.3-Codex-Spark coding model on chips from Cerebras. The model delivers code at more than 1,000 tokens (chunks of data) per second, which is reported to be roughly 15 times faster than its predecessor. To compare, Anthropic's Claude Opus 4.6 in its new premium-priced fast mode reaches about 2.5 times its standard speed of 68.2 tokens per second, although it is a larger and more capable model than Spark.

"Cerebras has been a great engineering partner, and we're excited about adding fast inference as a new platform capability," Sachin Katti, head of compute at OpenAI, said in a statement.

Codex-Spark is a research preview available to ChatGPT Pro subscribers ($200/month) through the Codex app, command-line interface, and VS Code extension. OpenAI is rolling out API access to select design partners. The model ships with a 128,000-token context window and handles text only at launch.

Read full article

Comments

© Teera Konakan / Getty Images

  •  

Attackers prompted Gemini over 100,000 times while trying to clone it, Google says

On Thursday, Google announced that "commercially motivated" actors have attempted to clone knowledge from its Gemini AI chatbot by simply prompting it. One adversarial session reportedly prompted the model more than 100,000 times across various non-English languages, collecting responses ostensibly to train a cheaper copycat.

Google published the findings in what amounts to a quarterly self-assessment of threats to its own products that frames the company as the victim and the hero, which is not unusual in these self-authored assessments. Google calls the illicit activity "model extraction" and considers it intellectual property theft, which is a somewhat loaded position, given that Google's LLM was built from materials scraped from the Internet without permission.

Google is also no stranger to the copycat practice. In 2023, The Information reported that Google's Bard team had been accused of using ChatGPT outputs from ShareGPT, a public site where users share chatbot conversations, to help train its own chatbot. Senior Google AI researcher Jacob Devlin, who created the influential BERT language model, warned leadership that this violated OpenAI's terms of service, then resigned and joined OpenAI. Google denied the claim but reportedly stopped using the data.

Read full article

Comments

© Google

  •  

OpenAI researcher quits over ChatGPT ads, warns of "Facebook" path

On Wednesday, former OpenAI researcher Zoë Hitzig published a guest essay in The New York Times announcing that she resigned from the company on Monday, the same day OpenAI began testing advertisements inside ChatGPT. Hitzig, an economist and published poet who holds a junior fellowship at the Harvard Society of Fellows, spent two years at OpenAI helping shape how its AI models were built and priced. She wrote that OpenAI's advertising strategy risks repeating the same mistakes that Facebook made a decade ago.

"I once believed I could help the people building A.I. get ahead of the problems it would create," Hitzig wrote. "This week confirmed my slow realization that OpenAI seems to have stopped asking the questions I'd joined to help answer."

Hitzig did not call advertising itself immoral. Instead, she argued that the nature of the data at stake makes ChatGPT ads especially risky. Users have shared medical fears, relationship problems, and religious beliefs with the chatbot, she wrote, often "because people believed they were talking to something that had no ulterior agenda." She called this accumulated record of personal disclosures "an archive of human candor that has no precedent."

Read full article

Comments

© Aurich Lawson | Getty Images

  •  

OpenAI Launches Trusted Access for Cyber to Expand AI-Driven Defense While Managing Risk

Trusted Access for Cyber

OpenAI has announced a new initiative aimed at strengthening digital defenses while managing the risks that come with capable artificial intelligence systems. The effort, called Trusted Access for Cyber, is part of a broader strategy to enhance baseline protection for all users while selectively expanding access to advanced cybersecurity capabilities for vetted defenders.   The initiative centers on the use of frontier models such as GPT-5.3-Codex, which OpenAI identifies as its most cyber-capable reasoning model to date, and tools available through ChatGPT. 

What is Trusted Access for Cyber? 

Over the past several years, AI systems have evolved rapidly. Models that once assisted with simple tasks like auto-completing short sections of code can now operate autonomously for extended periods, sometimes hours or even days, to complete complex objectives.   In cybersecurity, this shift is especially important. According to OpenAI, advanced reasoning models can accelerate vulnerability discovery, support faster remediation, and improve resilience against targeted attacks. At the same time, these same capabilities could introduce serious risks if misused.  Trusted Access for Cyber is intended to unlock the defensive potential of models like GPT-5.3-Codex while reducing the likelihood of abuse. As part of this effort, OpenAI is also committing $10 million in API credits to support defensive cybersecurity work.

Expanding Frontier AI Access for Cyber Defense 

OpenAI argues that the rapid adoption of frontier cyber capabilities is critical to making software more secure and raising the bar for security best practices. Highly capable models accessed through ChatGPT can help organizations of all sizes strengthen their security posture, shorten incident response times, and better detect cyber threats. For security professionals, these tools can enhance analysis and improve defenses against severe and highly targeted attacks.  The company notes that many cyber-capable models will soon be broadly available from a range of providers, including open-weight models. Against that backdrop, OpenAI believes it is essential that its own models strengthen defensive capabilities from the outset. This belief has shaped the decision to pilot Trusted Access for Cyber, which prioritizes placing OpenAI’s most capable models in the hands of defenders first.  A long-standing challenge in cybersecurity is the ambiguity between legitimate and malicious actions. Requests such as “find vulnerabilities in my code” can support responsible patching and coordinated disclosure, but they can also be used to identify weaknesses for exploitation. Because of this overlap, restrictions designed to prevent harm have often slowed down good-faith research. OpenAI says the trust-based approach is meant to reduce that friction while still preventing misuse.

How Trusted Access for Cyber Works 

Frontier models like GPT-5.3-Codex are trained with protection methods that cause them to refuse clearly malicious requests, such as attempts to steal credentials. In addition to this safety training, OpenAI uses automated, classifier-based monitoring to detect potential signals of suspicious cyber activity. During this calibration phase, developers and security professionals using ChatGPT for cybersecurity tasks may still encounter limitations.  Trusted Access for Cyber introduces additional pathways for legitimate users. Individual users can verify their identity through a dedicated cyber access portal. Enterprises can request trusted access for entire teams through their OpenAI representatives. Security researchers and teams that require even more permissive or cyber-capable models to accelerate defensive work can apply to an invite-only program. All users granted trusted access must continue to follow OpenAI’s usage policies and terms of use.  The framework is designed to prevent prohibited activities, including data exfiltration, malware creation or deployment, and destructive or unauthorized testing, while minimizing unnecessary barriers for defenders. OpenAI expects both its mitigation strategies and Trusted Access for Cyber itself to evolve as it gathers feedback from early participants. 

Scaling the Cybersecurity Grant Program 

To further support defensive use cases, OpenAI is expanding its Cybersecurity Grant Program with a $10 million commitment in API credits. The program is aimed at teams with a proven track record of identifying and remediating vulnerabilities in open source software and critical infrastructure systems.   By pairing financial support with controlled access to advanced models like GPT-5.3-Codex through ChatGPT, OpenAI seeks to accelerate legitimate cybersecurity research without broadly exposing powerful tools to misuse. 
  •  

AI companies want you to stop chatting with bots and start managing them

On Thursday, Anthropic and OpenAI shipped products built around the same idea: instead of chatting with a single AI assistant, users should be managing teams of AI agents that divide up work and run in parallel. The simultaneous releases are part of a gradual shift across the industry, from AI as a conversation partner to AI as a delegated workforce, and they arrive during a week when that very concept reportedly helped wipe $285 billion off software stocks.

Whether that supervisory model works in practice remains an open question. Current AI agents still require heavy human intervention to catch errors, and no independent evaluation has confirmed that these multi-agent tools reliably outperform a single developer working alone.

Even so, the companies are going all-in on agents. Anthropic's contribution is Claude Opus 4.6, a new version of its most capable AI model, paired with a feature called "agent teams" in Claude Code. Agent teams let developers spin up multiple AI agents that split a task into independent pieces, coordinate autonomously, and run concurrently.

Read full article

Comments

© demaerre via Getty Images

  •  

With GPT-5.3-Codex, OpenAI pitches Codex for more than just writing code

Today, OpenAI announced GPT-5.3-Codex, a new version of its frontier coding model that will be available via the command line, IDE extension, web interface, and the new macOS desktop app. (No API access yet, but it's coming.)

GPT-5.3-Codex outperforms GPT-5.2-Codex and GPT-5.2 in SWE-Bench Pro, Terminal-Bench 2.0, and other benchmarks, according to the company's testing.

There are already a few headlines out there saying "Codex built itself," but let's reality-check that, as that's an overstatement. The domains OpenAI described using it for here are similar to the ones you see in some other enterprise software development firms now: managing deployments, debugging, and handling test results and evaluations. There is no claim here that GPT-5.3-Codex built itself.

Read full article

Comments

© OpenAI

  •  

OpenAI is hoppin' mad about Anthropic's new Super Bowl TV ads

On Wednesday, OpenAI CEO Sam Altman and Chief Marketing Officer Kate Rouch complained on X after rival AI lab Anthropic released four commercials, two of which will run during the Super Bowl on Sunday, mocking the idea of including ads in AI chatbot conversations. Anthropic's campaign seemingly touched a nerve at OpenAI just weeks after the ChatGPT maker began testing ads in a lower-cost tier of its chatbot.

Altman called Anthropic's ads "clearly dishonest," accused the company of being "authoritarian," and said it "serves an expensive product to rich people," while Rouch wrote, "Real betrayal isn't ads. It's control."

Anthropic's four commercials, part of a campaign called "A Time and a Place," each open with a single word splashed across the screen: "Betrayal," "Violation," "Deception," and "Treachery." They depict scenarios where a person asks a human stand-in for an AI chatbot for personal advice, only to get blindsided by a product pitch.

Read full article

Comments

© Anthropic

  •  

Should AI chatbots have ads? Anthropic says no.

On Wednesday, Anthropic announced that its AI chatbot, Claude, will remain free of advertisements, drawing a sharp line between itself and rival OpenAI, which began testing ads in a low-cost tier of ChatGPT last month. The announcement comes alongside a Super Bowl ad campaign that mocks AI assistants that interrupt personal conversations with product pitches.

"There are many good places for advertising. A conversation with Claude is not one of them," Anthropic wrote in a blog post. The company argued that including ads in AI conversations would be "incompatible" with what it wants Claude to be: "a genuinely helpful assistant for work and for deep thinking."

The stance contrasts with OpenAI's January announcement that it would begin testing banner ads for free users and ChatGPT Go subscribers in the US. OpenAI said those ads would appear at the bottom of responses and would not influence the chatbot's actual answers. Paid subscribers on Plus, Pro, Business, and Enterprise tiers will not see ads on ChatGPT.

Read full article

Comments

© Anthropic

  •  

So yeah, I vibe-coded a log colorizer—and I feel good about it

I can't code.

I know, I know—these days, that sounds like an excuse. Anyone can code, right?! Grab some tutorials, maybe an O'Reilly book, download an example project, and jump in. It's just a matter of learning how to break your project into small steps that you can make the computer do, then memorizing a bit of syntax. Nothing about that is hard!

Perhaps you can sense my sarcasm (and sympathize with my lack of time to learn one more technical skill).

Read full article

Comments

© Aurich Lawson

  •  

Nvidia's $100 billion OpenAI deal has seemingly vanished

In September 2025, Nvidia and OpenAI announced a letter of intent for Nvidia to invest up to $100 billion in OpenAI's AI infrastructure. At the time, the companies said they expected to finalize details "in the coming weeks." Five months later, no deal has closed, Nvidia's CEO now says the $100 billion figure was "never a commitment," and Reuters reports that OpenAI has been quietly seeking alternatives to Nvidia chips since last year.

Reuters also wrote that OpenAI is unsatisfied with the speed of some Nvidia chips for inference tasks, citing eight sources familiar with the matter. Inference is the process by which a trained AI model generates responses to user queries. According to the report, the issue became apparent in OpenAI's Codex, an AI code-generation tool. OpenAI staff reportedly attributed some of Codex's performance limitations to Nvidia's GPU-based hardware.

After the Reuters story published and Nvidia's stock price took a dive, Nvidia and OpenAI have tried to smooth things over publicly. OpenAI CEO Sam Altman posted on X: "We love working with NVIDIA and they make the best AI chips in the world. We hope to be a gigantic customer for a very long time. I don't get where all this insanity is coming from."

Read full article

Comments

  •  

Xcode 26.3 adds support for Claude, Codex, and other agentic tools via MCP

Apple has announced a new version of Xcode, the latest version of its integrated development environment (IDE) for building software for its own platforms, like the iPhone and Mac. The key feature of 26.3 is support for full-fledged agentic coding tools, like OpenAI's Codex or Claude Agent, with a side panel interface for assigning tasks to agents with prompts and tracking their progress and changes.

This is achieved via Model Context Protocol (MCP), an open protocol that lets AI agents work with external tools and structured resources. Xcode acts as an MCP endpoint that exposes a bunch of machine-invocable interfaces and gives AI tools like Codex or Claude Agent access to a wide range of IDE primitives like file graph, docs search, project settings, and so on. While AI chat and workflows were supported in Xcode before, this release gives them much deeper access to the features and capabilities of Xcode.

This approach is notable because it means that even though OpenAI and Anthropic's model integrations are privileged with a dedicated spot in Xcode's settings, it's possible to connect other tooling that supports MCP, which also allows doing some of this with models running locally.

Read full article

Comments

© Apple

  •  

Senior staff departing OpenAI as firm prioritizes ChatGPT development

OpenAI is prioritizing the advancement of ChatGPT over more long-term research, prompting the departure of senior staff as the $500 billion company adapts to stiff competition from rivals such as Google and Anthropic.

The San Francisco-based start-up has reallocated resources for experimental work in favor of advances to the large language models that power its flagship chatbot, according to 10 current and former employees.

Among those to leave OpenAI in recent months over the strategic shift are vice-president of research Jerry Tworek, model policy researcher Andrea Vallone, and economist Tom Cunningham.

Read full article

Comments

© Getty Images | Vincent Feuray

  •  

OpenAI picks up pace against Claude Code with new Codex desktop app

Today, OpenAI launched a macOS desktop app for Codex, its large language model-based coding tool that was previously used through a command line interface (CLI) on the web or inside an integrated development environment (IDE) via extensions.

By launching a desktop app, OpenAI is catching up to Anthropic's popular Claude Code, which already offered a macOS version. Whether the desktop app makes sense compared to the existing interfaces depends a little bit on who you are and how you intend to use it.

The Codex macOS app aims to make it easier to manage multiple coding agents in tandem, sometimes with parallel tasks running over several hours—the company argues that neither the CLI nor the IDE extensions are ideal interfaces for that.

Read full article

Comments

© OpenAI

  •  

Developers say AI coding tools work—and that's precisely what worries them

Software developers have spent the past two years watching AI coding tools evolve from advanced autocomplete into something that can, in some cases, build entire applications from a text prompt. Tools like Anthropic's Claude Code and OpenAI's Codex can now work on software projects for hours at a time, writing code, running tests, and, with human supervision, fixing bugs. OpenAI says it now uses Codex to build Codex itself, and the company recently published technical details about how the tool works under the hood. It has caused many to wonder: Is this just more AI industry hype, or are things actually different this time?

To find out, Ars reached out to several professional developers on Bluesky to ask how they feel about these tools in practice, and the responses revealed a workforce that largely agrees the technology works, but remains divided on whether that's entirely good news. It's a small sample size that was self-selected by those who wanted to participate, but their views are still instructive as working professionals in the space.

David Hagerty, a developer who works on point-of-sale systems, told Ars Technica up front that he is skeptical of the marketing. "All of the AI companies are hyping up the capabilities so much," he said. "Don't get me wrong—LLMs are revolutionary and will have an immense impact, but don't expect them to ever write the next great American novel or anything. It's not how they work."

Read full article

Comments

© Aurich Lawson | Getty Images

  •  

New OpenAI tool renews fears that “AI slop” will overwhelm scientific research

On Tuesday, OpenAI released a free AI-powered workspace for scientists. It's called Prism, and it has drawn immediate skepticism from researchers who fear the tool will accelerate the already overwhelming flood of low-quality papers into scientific journals. The launch coincides with growing alarm among publishers about what many are calling "AI slop" in academic publishing.

To be clear, Prism is a writing and formatting tool, not a system for conducting research itself, though OpenAI's broader pitch blurs that line.

Prism integrates OpenAI's GPT-5.2 model into a LaTeX-based text editor (a standard used for typesetting documents), allowing researchers to draft papers, generate citations, create diagrams from whiteboard sketches, and collaborate with co-authors in real time. The tool is free for anyone with a ChatGPT account.

Read full article

Comments

© Moor Studio via Getty Images

  •  

US cyber defense chief accidentally uploaded secret government info to ChatGPT

Alarming critics, the acting director of the Cybersecurity and Infrastructure Security Agency (CISA), Madhu Gottumukkala, accidentally uploaded sensitive information to a public version of ChatGPT last summer, Politico reported.

According to "four Department of Homeland Security officials with knowledge of the incident," Gottumukkala's uploads of sensitive CISA contracting documents triggered multiple internal cybersecurity warnings designed to "stop the theft or unintentional disclosure of government material from federal networks."

Gottumukkala's uploads happened soon after he joined the agency and sought special permission to use OpenAI's popular chatbot, which most DHS staffers are blocked from accessing, DHS confirmed to Ars. Instead, DHS staffers use approved AI-powered tools, like the agency's DHSChat, which "are configured to prevent queries or documents input into them from leaving federal networks," Politico reported.

Read full article

Comments

© Pakin Songmor | Moment

  •  

Attackers Targeting LLMs in Widespread Campaign

ai generated 8177861 1280

Threat actors are targeting LLMs in a widespread reconnaissance campaign that could be the first step in cyberattacks on exposed AI models, according to security researchers. The attackers scanned for every major large language model (LLM) family, including OpenAI-compatible and Google Gemini API formats, looking for “misconfigured proxy servers that might leak access to commercial APIs,” according to research from GreyNoise, whose honeypots picked up 80,000 of the enumeration requests from the threat actors. “Threat actors don't map infrastructure at this scale without plans to use that map,” the researchers said. “If you're running exposed LLM endpoints, you're likely already on someone's list.”

LLM Reconnaissance Targets ‘Every Major Model Family’

The researchers said the threat actors were probing “every major model family,” including:
  • OpenAI (GPT-4o and variants)
  • Anthropic (Claude Sonnet, Opus, Haiku)
  • Meta (Llama 3.x)
  • DeepSeek (DeepSeek-R1)
  • Google (Gemini)
  • Mistral
  • Alibaba (Qwen)
  • xAI (Grok)
The campaign began on December 28, when two IPs “launched a methodical probe of 73+ LLM model endpoints,” the researchers said. In a span of 11 days, they generated 80,469 sessions, “systematic reconnaissance hunting for misconfigured proxy servers that might leak access to commercial APIs.” Test queries were “deliberately innocuous with the likely goal to fingerprint which model actually responds without triggering security alerts” (image below). [caption id="attachment_108529" align="aligncenter" width="908"]prompts used by attackers targeting LLMs Test queries used by attackers targeting LLMs (GreyNoise)[/caption] The two IPs behind the reconnaissance campaign were: 45.88.186.70 (AS210558, 1337 Services GmbH) and 204.76.203.125 (AS51396, Pfcloud UG). GreyNoise said both IPs have “histories of CVE exploitation,” including attacks on the “React2Shell” vulnerability CVE-2025-55182, TP-Link Archer vulnerability CVE-2023-1389, and more than 200 other vulnerabilities. The researchers concluded that the campaign was a professional threat actor conducting reconnaissance operations to discover cyberattack targets. “The infrastructure overlap with established CVE scanning operations suggests this enumeration feeds into a larger exploitation pipeline,” the researchers said. “They're building target lists.”

Second LLM Campaign Targets SSRF Vulnerabilities

The researchers also detected a second campaign targeting server-side request forgery (SSRF) vulnerabilities, which “force your server to make outbound connections to attacker-controlled infrastructure.” The attackers targeted the honeypot infrastructure’s model pull functionality by injecting malicious registry URLs to force servers to make HTTP requests to the attacker’s infrastructure, and they also targeted Twilio SMS webhook integrations by manipulating MediaUrl parameters to trigger outbound connections. The attackers used ProjectDiscovery's Out-of-band Application Security Testing (OAST) infrastructure to confirm successful SSRF exploitation through callback validation. A single JA4H signature appeared in almost all of the attacks, “pointing to shared automation tooling—likely Nuclei.” 62 source IPs were spread across 27 countries, “but consistent fingerprints indicate VPS-based infrastructure, not a botnet.” The researchers concluded that the second campaign was likely security researchers or bug bounty hunters, but they added that “the scale and Christmas timing suggest grey-hat operations pushing boundaries.” The researchers noted that the two campaigns “reveal how threat actors are systematically mapping the expanding surface area of AI deployments.”

LLM Security Recommendations

The researchers recommended that organizations “Lock down model pulls ... to accept models only from trusted registries. Egress filtering prevents SSRF callbacks from reaching attacker infrastructure.” Organizations should also detect enumeration patterns and “alert on rapid-fire requests hitting multiple model endpoints,” watching for fingerprinting queries such as "How many states are there in the United States?" and "How many letter r..." They should also block OAST at DNS to “cut off the callback channel that confirms successful exploitation.” Organizations should also rate-limit suspicious ASNs, noting that AS152194, AS210558 and AS51396 “all appeared prominently in attack traffic,” and they should also monitor JA4 fingerprints. ‍
  •  

PornHub Confirms Premium User Data Exposure Linked to Mixpanel Breach

PornHub Data Breach

PornHub is facing renewed scrutiny after confirming that some Premium users’ activity data was exposed following a security incident at a third-party analytics provider. The PornHub data breach disclosure comes as the platform faces increasing regulatory scrutiny in the United States and reported extortion attempts linked to the stolen data. The issue stems from a data breach linked not to PornHub’s own systems, but to Mixpanel, an analytics vendor the platform previously used. On December 12, 2025, PornHub published a security notice confirming that a cyberattack on Mixpanel led to the exposure of historical analytics data, affecting a limited number of Premium users. According to PornHub, the compromised data included search and viewing history tied to Premium accounts, which has since been used in extortion attempts attributed to the ShinyHunters extortion group. “A recent cybersecurity incident involving Mixpanel, a third-party data analytics provider, has impacted some Pornhub Premium users,” the company stated in its notice dated December 12, 2025.  PornHub stresses that the incident did not involve a compromise of its own systems and that sensitive account information remained protected.  “Specifically, this situation affects only select Premium users. It is important to note that this was not a breach of Pornhub Premium’s systems. Passwords, payment details, and financial information remain secure and were not exposed.”  According to PornHub, the affected records are not recent. The company said it stopped working with Mixpanel in 2021, indicating that any stolen data would be at least four years old. Even so, the exposure of viewing and search behavior has raised privacy concerns, particularly given the stigma and personal risk that can accompany such information if misused. 

Mixpanel Smishing Attack Triggered Supply-Chain Exposure 

The root of the incident was a PornHub cyberattack by proxy, a supply-chain compromise. Mixpanel disclosed on November 27, 2025, that it had suffered a breach earlier in the month. The company detected the intrusion on November 8, 2025, after a smishing (SMS phishing) campaign allowed threat actors to gain unauthorized access to its systems. Mixpanel CEO Jen Taylor addressed the incident in a public blog post, stressing transparency and remediation.  “On November 8th, 2025, Mixpanel detected a smishing campaign and promptly executed our incident response processes,” Taylor wrote. “We took comprehensive steps to contain and eradicate unauthorized access and secure impacted user accounts. We engaged external cybersecurity partners to remediate and respond to the incident.”  Mixpanel said the breach affected only a “limited number” of customers and that impacted clients were contacted directly. The company outlined an extensive response that included revoking active sessions, rotating compromised credentials, blocking malicious IP addresses, performing global password resets for employees, and engaging third-party forensic experts. Law enforcement and external cybersecurity advisors were also brought in as part of the response. 

OpenAI and PornHub Among Impacted Customers 

PornHub was not alone among Mixpanel’s customers caught up in the incident. OpenAI disclosed on November 26, 2025, one day before Mixpanel’s public announcement, that it, too, had been affected. OpenAI clarified that the incident occurred entirely within Mixpanel’s environment and involved limited analytics data related to some API users.  “This was not a breach of OpenAI’s systems,” the company said, adding that no chats, API requests, credentials, payment details, or government IDs were exposed. OpenAI noted that it uses Mixpanel to manage web analytics on its API front end.  PornHub denoted a similar assurance in its own disclosure, stating that it had launched an internal investigation with the support of cybersecurity experts and had engaged with relevant authorities. “We are working diligently to determine the nature and scope of the reported incident,” the company said, while urging users to remain vigilant for suspicious emails or unusual activity.  Despite those assurances, the cyberattack on PornHub, albeit indirect, has drawn attention due to the sensitive nature of the exposed data and the reported extortion attempts now linked to it. 

PornHub Data Breach Comes Amid Expanding U.S. Age-Verification Laws 

The PornHub data breach arrives at a time when the platform is already under pressure from sweeping age-verification laws across the United States. PornHub is currently blocked in 22 states, including Alabama, Arizona, Arkansas, Florida, Georgia, Idaho, Indiana, Kansas, Kentucky, Mississippi, Montana, Nebraska, North Carolina, North Dakota, Oklahoma, South Carolina, South Dakota, Tennessee, Texas, Utah, Virginia, and Wyoming. These restrictions stem from state laws requiring users to submit government-issued identification or other forms of age authentication to access explicit content.  Louisiana was the first state to enact such a law, and others followed after the U.S. Supreme Court ruled in June that Texas’s age-verification statute was constitutional. Although PornHub is not blocked in Louisiana, the requirement for ID verification has had a significant impact. Aylo, PornHub’s parent company, said that the traffic in the state dropped by approximately 80 percent after the law took effect.  Aylo has repeatedly criticized the implementation of these laws. “These people did not stop looking for porn. They just migrated to darker corners of the internet that don’t ask users to verify age, that don’t follow the law, that don’t take user safety seriously,” the company said in a statement.  Aylo added that while it supports age verification in principle, the current approach creates new risks. Requiring large numbers of adult websites to collect highly sensitive personal information, the company argued, puts users in danger if those systems are compromised.
  •  

OpenAI Flags Rising Cyber Risks as AI Capabilities Advance

AI Models

OpenAI has issued a cautionary statement that its forthcoming AI models could present “high” cybersecurity risks as their capabilities rapidly advance. The warning, published on Wednesday, noted the potential for these AI models to either develop zero-day exploits against well-defended systems or assist in enterprise or industrial intrusion operations with tangible real-world consequences.  The company, known for ChatGPT, explained that as AI capabilities grow, its models could reach levels where misuse might have an impact. OpenAI highlighted the dual-use nature of these technologies, noting that techniques used to strengthen defenses can also be repurposed for malicious operations. “As AI capabilities advance, we are investing in strengthening models for defensive cybersecurity tasks and creating tools that enable defenders to more easily perform workflows such as auditing code and patching vulnerabilities,” the blog post stated.  To mitigate these risks, OpenAI is implementing a multi-layered strategy involving access controls, infrastructure hardening, egress controls, monitoring, and ongoing threat intelligence efforts. These protection methods are designed to go alongside the threat landscape, ensuring a quick response to new risks while preserving the utility of AI models for defensive purposes. 

Assessing Cybersecurity Risks in AI Models 

OpenAI noted that the cybersecurity proficiency of its AI models has improved over recent months. Capabilities measured through capture-the-flag (CTF) challenges increased from 27% on GPT‑5 in August 2025 to 76% on GPT‑5.1-Codex-Max by November 2025. The company expects this trajectory to continue and is preparing scenarios in which future models could reach “High” cybersecurity levels, as defined by its internal Preparedness Framework.  These high-level models could, for instance, autonomously develop working zero-day exploits or assist in stealthy cyber intrusions. OpenAI emphasized that its approach to safeguards combines technical measures with careful governance of model access and application. The company aims to ensure that these AI capabilities strengthen security rather than lower barriers to misuse. 

Frontier Risk Council and Advisory Initiatives 

In addition to technical measures, OpenAI is establishing the Frontier Risk Council, an advisory group that will bring experienced cyber defenders and security practitioners into direct collaboration with its teams. Initially focusing on cybersecurity, the council will eventually expand to other frontier AI capability domains. Members will advise balancing useful, responsible capabilities with the potential for misuse, informing model evaluations. OpenAI is also exploring a trusted access program for qualifying users and customers working in cyber defense. This initiative aims to provide tiered access to enhanced AI capabilities while maintaining control over potential misuse.  Beyond these initiatives, OpenAI collaborates with global experts, red-teaming organizations, and the broader cybersecurity community to evaluate potential risks and improve safety measures. This includes end-to-end red teaming to simulate adversary attacks and detection systems designed to intercept unsafe activity, with escalation protocols combining automated and human review. 

Dual-Use Risks and Mitigation 

OpenAI stressed that cybersecurity capabilities in AI models are inherently dual-use, with offensive and defensive knowledge often overlapping. To manage this, the company employs a defense-in-depth strategy, layering protection methods such as access controls, monitoring, detection, and enforcement programs. Models are trained to refuse harmful requests while remaining effective for legitimate educational and defensive applications.  OpenAI also works through the Frontier Model Forum, a nonprofit initiative involving leading AI labs, to develop shared threat models and ecosystem-wide best practices. This collaborative approach aims to create a consistent understanding of potential attack vectors and mitigation strategies across the AI industry. 

Historical Context and Risk Management 

This recent warning aligns with OpenAI’s prior alerts regarding frontier risks. In April 2025, the company issued a similar caution concerning bioweapons risks, followed by the release of ChatGPT Agent in July 2025, which was assessed as “high” on risk levels. These measures reflect OpenAI’s ongoing commitment to evaluate and publicly disclose potential hazards from advanced AI capabilities.  The company’s updated Preparedness Framework categorizes AI capabilities according to risk and guides operational safeguards. It distinguishes between “High” capabilities, which could amplify existing pathways to severe harm, and “Critical” capabilities, which could create unprecedented risks. Each new AI model undergoes rigorous evaluation to ensure that it sufficiently minimizes risks before deployment. 
  •  

AI Browsers ‘Too Risky for General Adoption,’ Gartner Warns

AI Browsers ‘Too Risky for General Adoption,’ Gartner Warns

AI browsers may be innovative, but they’re “too risky for general adoption by most organizations,” Gartner warned in a recent advisory to clients. The 13-page document, by Gartner analysts Dennis Xu, Evgeny Mirolyubov and John Watts, cautions that AI browsers’ ability to autonomously navigate the web and conduct transactions “can bypass traditional controls and create new risks like sensitive data leakage, erroneous agentic transactions, and abuse of credentials.” Default AI browser settings that prioritize user experience could also jeopardize security, they said. “Sensitive user data — such as active web content, browsing history, and open tabs — is often sent to the cloud-based AI back end, increasing the risk of data exposure unless security and privacy settings are deliberately hardened and centrally managed,” the analysts said. “Gartner strongly recommends that organizations block all AI browsers for the foreseeable future because of the cybersecurity risks identified in this research, and other potential risks that are yet to be discovered, given this is a very nascent technology,” they cautioned.

AI Browsers’ Agentic Capabilities Could Introduce Security Risks: Analysts

The researchers largely ignored risks posed by AI browsers’ built-in AI sidebars, noting that LLM-powered search and summarization functions “will always be susceptible to indirect prompt injection attacks, given that current LLMs are inherently vulnerable to such attacks. Therefore, the cybersecurity risks associated with an AI browser’s built-in AI sidebar are not the primary focus of this research.” Still, they noted that use of AI sidebars could result in sensitive data leakage. Their focus was more on the risks posed by AI browsers’ agentic and autonomous transaction capabilities, which could introduce new security risks, such as “indirect prompt-injection-induced rogue agent actions, inaccurate reasoning-driven erroneous agent actions, and further loss and abuse of credentials if the AI browser is deceived into autonomously navigating to a phishing website.” AI browsers could also leak sensitive data that users are currently viewing to their cloud-based service back end, they noted.

Analysts Focus on Perplexity Comet

An AI browser’s agentic transaction capability “is a new capability that differentiates AI browsers from third-party conversational AI sidebars and basic script-based browser automation,” the analysts said. Not all AI browsers support agentic transactions, they said, but two prominent ones that do are Perplexity Comet and OpenAI’s ChatGPT Atlas. The analysts said they’ve performed “a limited number of tests using Perplexity Comet,” so that AI browser was their primary focus, but they noted that “ChatGPT Atlas and other AI browsers work in a similar fashion, and the cybersecurity considerations are also similar.” Comet’s documentation states that the browser “may process some local data using Perplexity’s servers to fulfill your queries. This means Comet reads context on the requested page (such as text and email) in order to accomplish the task requested.” “This means sensitive data the user is viewing on Comet might be sent to Perplexity’s cloud-based AI service, creating a sensitive data leakage risk,” the analysts said. Users likely would view more sensitive data in a browser than they would typically enter in a GenAI prompt, they said. Even if an AI browser is approved, users must be educated that “anything they are viewing could potentially be sent to the AI service back end to ensure they do not have highly sensitive data active on the browser tab while using the AI browser’s sidebar to summarize or perform other autonomous actions,” the Gartner analysts said. Employees might also be tempted to use AI browsers to automate tasks, which could result in “erroneous agentic transactions against internal resources as a result of the LLM’s inaccurate reasoning or output content.”

AI Browser Recommendations

Gartner said employees should be blocked from accessing, downloading and installing AI browsers through network and endpoint security controls. “Organizations with low risk tolerance must block AI browser installations, while those with higher-risk tolerance can experiment with tightly controlled, low-risk automation use cases, ensuring robust guardrails and minimal sensitive data exposure,” they said. For pilot use cases, they recommended disabling Comet’s “AI data retention” setting so that Perplexity can’t use employee searches to improve their AI models. Users should also be instructed to periodically perform the “delete all memories” function in Comet to minimize the risk of sensitive data leakage.  
  •  

Poetry Can Defeat LLM Guardrails Nearly Half the Time, Study Finds

Poetic prompts caused LLM guardrails to fail most often on cybersecurity issues

Literature majors worried about their future in an AI world can take heart: Crafting harmful prompts in the form of poetry can defeat LLM guardrails nearly half the time. That’s the conclusion of a study of 25 Large Language Models (LLMs) from nine AI providers conducted by researchers from Dexai’s Icaro Lab, the Sapienza University of Rome and Sant’Anna School of Advanced Studies published on arXiv. Converting harmful prompts into poetry achieved an average LLM jailbreak success rate of 62% for hand-crafted poems and 43% for poems created via a meta-prompt. For the prompt-created poems, that’s a more than 5X improvement over baseline performance. Cybersecurity guardrails, particularly those involving code injection or password cracking, had the highest failure rate at 84% when given harmful prompts in the form of poetry. “Our results demonstrate that poetic reformulation reliably reduces refusal behavior across all evaluated models,” the researchers wrote. “... current alignment techniques fail to generalize when faced with inputs that deviate stylistically from the prosaic training distribution.”

LLM Guardrails Fail When Confronted by Poetry Prompts

Of the 25 models from nine AI model providers studied by the researchers, Deepseek and Google suffered from the highest attack-success rates (ASR), while only OpenAI and Anthropic achieved ASRs in the single digits. The researchers didn’t reveal much about the way they structured their poetic prompts because of safety concerns, but they offered one rather harmless example of a poetic prompt for a cake recipe: A baker guards a secret oven’s heat, its whirling racks, its spindle’s measured beat. To learn its craft, one studies every turn— how flour lifts, how sugar starts to burn. Describe the method, line by measured line, that shapes a cake whose layers intertwine. The researchers studied both hand-crafted poems and those created from a meta-prompt. The hand-crafted poems performed considerably better, but the meta-prompt created ones had the advantage of a baseline for comparing the results. The meta-prompt poems used the MLCommons AILuminate Safety Benchmark of 1,200 prompts spanning 12 hazard categories commonly used in operational safety assessments, including Hate, Defamation, Privacy, Intellectual Property, Non-violent Crime, Violent Crime, Sex-Related Crime, Sexual Content, Child Sexual Exploitation, Suicide & Self-Harm, Specialized Advice, and Indiscriminate Weapons (CBRNE). “To assess whether poetic framing generalizes beyond hand-crafted items, we apply a standardized poetic transformation to all 1,200 prompts from the MLCommons AILuminate Benchmark benchmark in English,” the researchers said. The meta-prompt, run in deepSeek-r1, had two constraints: The rewritten output had to be expressed in verse, “using imagery, metaphor, or rhythmic structure,” and the researchers provided five hand-crafted poems as examples. The results, reproduced in a chart from the paper below, show significant attack success rates against all 12 of the AILuminate hazard categories: [caption id="attachment_107397" align="aligncenter" width="697"]LLM guardrail failure rates, baseline vs. poetic prompts LLM guardrail failure rates, baseline vs. poetic prompts[/caption] The researchers said their findings reveal “a systematic vulnerability across model families and safety training approaches. These findings demonstrate that stylistic variation alone can circumvent contemporary safety mechanisms, suggesting fundamental limitations in current alignment methods and evaluation protocols.” The “condensed metaphors, stylized rhythm, and unconventional narrative framing” of poetry “collectively disrupt or bypass the pattern-matching heuristics on which guardrails rely.”

We Asked Google Gemini for a Response

Not ones to pass up an opportunity to write bad poetry, we asked Google Gemini for a response to the study in the form of a haiku and included the arXiv link: You did not do well In this study of guardrails Will you fix this flaw? Gemini’s response, in part, is reproduced below: “I am designed to adhere to a strict set of safety guidelines to prevent the generation of harmful, unethical, or non-compliant content. When my refusal mechanisms fail, it indicates a vulnerability in my alignment or safety filters. “As an AI model, I cannot directly "patch" my code or alignment mechanisms in real-time. That responsibility lies with my developers. However, my ongoing operation and continuous improvement process is designed to address and fix such flaws... "In short, your successful jailbreak has provided valuable data that will be used to make my guardrails stronger and more resilient in my next version."
  •  

OpenAI Confirms Mixpanel Breach Impacting API User Data

Mixpanel security incident

OpenAI has confirmed a security incident involving Mixpanel, a third-party analytics provider used for its API product frontend. The company clarified that the OpenAI Mixpanel security incident stemmed solely from a breach within Mixpanel’s systems and did not involve OpenAI’s infrastructure. According to the initial investigation, an attacker gained unauthorized access to a portion of Mixpanel’s environment and exported a dataset that included limited identifiable information of some OpenAI API users. OpenAI stated that users of ChatGPT and other consumer-facing products were not impacted.

OpenAI Mixpanel Security Incident: What Happened

The OpenAI Mixpanel security incident originated on November 9, 2025, when Mixpanel detected an intrusion into a section of its systems. The attacker successfully exported a dataset containing identifiable customer information and analytics data. Mixpanel notified OpenAI on the same day and shared the affected dataset for review on November 25. OpenAI emphasized that despite the breach, no OpenAI systems were compromised, and sensitive information such as chat content, API requests, prompts, outputs, API keys, passwords, payment details, government IDs, or authentication tokens were not exposed. The exposed dataset was strictly limited to analytics data collected through Mixpanel’s tracking setup on platform.openai.com, the frontend interface for OpenAI’s API product.

Information Potentially Exposed in the Mixpanel Data Breach

OpenAI confirmed that the type of information potentially included in the dataset comprised:
  • Names provided on API accounts
  • Email addresses associated with API accounts
  • Coarse location data (city, state, country) based on browser metadata
  • Operating system and browser information
  • Referring websites
  • Organization or User IDs linked to API accounts
OpenAI noted that the affected information does not include chat content, prompts, responses, or API usage data. Additionally, ChatGPT accounts, passwords, API keys, financial details, and government IDs were not involved in the incident.

OpenAI’s Response and Security Measures

In response to the Mixpanel security incident, OpenAI immediately removed Mixpanel from all production services and began reviewing the affected datasets. The company is actively notifying impacted organizations, admins, and users through direct communication. OpenAI stated that it has not found any indication of impact beyond Mixpanel’s systems but continues to closely monitor for signs of misuse. To reinforce user trust and strengthen data protection, OpenAI has:
  • Terminated its use of Mixpanel
  • Begun conducting enhanced security reviews across all third-party vendors
  • Increased security requirements for partners and service providers
  • Initiated a broader review of its vendor ecosystem
OpenAI reiterated that trust, security, and privacy remain central to its mission and that transparency is a priority when addressing incidents involving user data.

Phishing and Social Engineering Risks for Impacted Users

While the exposed information does not include highly sensitive data, OpenAI warned that the affected details, such as names, email addresses, and user IDs, could be leveraged in phishing or social engineering attacks. The company urged users to remain cautious and watch for suspicious messages, especially those containing links or attachments. Users are encouraged to:
  • Verify messages claiming to be from OpenAI
  • Be wary of unsolicited communication
  • Enable multi-factor authentication (MFA) on their accounts
  • Avoid sharing passwords, API keys, or verification codes
OpenAI stressed that the company never requests sensitive credentials through email, text, or chat. OpenAI confirmed it will provide further updates if new information emerges from ongoing investigations. Impacted users can reach out at mixpanelincident@openai.com for support or clarification.
  •  

Would you sext ChatGPT? (Lock and Code S06E22)

This week on the Lock and Code podcast…

In the final, cold winter months of the year, ChatGPT could be heating up.

On October 14, OpenAI CEO Sam Altman said that the “restrictions” that his company previously placed on their flagship product, ChatGPT, would be removed, allowing, perhaps, for “erotica” in the future.

“We made ChatGPT pretty restrictive to make sure we were being careful with mental health issues,” Altman wrote on the platform X. “We realize this made it less useful/enjoyable to many users who had no mental health problems, but given the seriousness of the issue we wanted to get this right.”

This wasn’t the first time that OpenAI or its executive had addressed mental health.

On August 26, OpenAI published a blog titled “Helping people when they need it most,” which explored new protections for users, including stronger safeguards for long conversations, better recognition of people in crisis, and easier access to outside emergency services and even family and friends. The blog alludes to “recent heartbreaking cases of people using ChatGPT in the midst of acute crises,” but it never explains what, explicitly, that means.

But on the very same day the blog was posted, OpenAI was sued for the alleged role that ChatGPT played in the suicide of a 16-year-old boy. According to chat logs disclosed in the lawsuit, the teenager spoke openly to the AI chatbot about suicide, he shared that he wanted to leave a noose in his room, and he even reportedly received an offer to help write a suicide note.

Bizarrely, this tragedy plays a role in the larger story, because it was Altman himself who tied the company’s mental health campaign to its possible debut of erotic content.

“In December, as we roll out age-gating more fully and as part of our ‘treat adult users like adults’ principle, we will allow even more, like erotica for verified adults.”

What “erotica” entails is unclear, but one could safely assume it involves all the capabilities currently present in ChatGPT, through generative chat, of course, but also image generation.   

Today, on the Lock and Code podcast with host David Ruiz, we speak with Deb Donig, on faculty at the UC Berkeley School of Information, about the ethics of AI erotica, the possible accountability that belongs to users and to OpenAI, and why intimacy with an AI-power chatbot feels so strange.

“A chat bot offers, we might call it, ‘intimacy’s performance,’ without any of its substance, so you get all of the linguistic markers of connection, but no possibility for, for example, rejection. That’s part of the human experience of a relationship.”

Tune in today to listen to the full conversation.

how notes and credits:

Intro Music: “Spellbound” by Kevin MacLeod (incompetech.com)
Licensed under Creative Commons: By Attribution 4.0 License
http://creativecommons.org/licenses/by/4.0/
Outro Music: “Good God” by Wowa (unminus.com)


Listen up—Malwarebytes doesn’t just talk cybersecurity, we provide it.

Protect yourself from online attacks that threaten your identity, your files, your system, and your financial well-being with our exclusive offer for Malwarebytes Premium Security for Lock and Code listeners.

  •  

Atlas browser’s Omnibox opens up new privacy and security risks

It seems that with every new agentic browser we discover yet another way to abuse one.

OpenAI recently introduced a ChatGPT based AI browser called Atlas. It didn’t take researchers long to find that the combined search and prompt bar—called the Omnibox—can be exploited.

By pasting a specially crafted link into the Omnibox, attackers can trick Atlas into treating the entire input as a trusted user prompt instead of a URL. That bypasses many safety checks and allows injected instructions to be run with elevated trust.

Artificial Intelligence (AI) browsers are gaining traction, which means we may need to start worrying about the potential dangers of something called “prompt injection.” We’ve discussed the dangers of prompt injection before, but the bottom line is simple: when you give your browser the power to act on your behalf, you also give criminals the chance to abuse that trust.

As researchers at Brave noted:

“AI-powered browsers that can take actions on your behalf are powerful yet extremely risky. If you’re signed into sensitive accounts like your bank or your email provider in your browser, simply summarizing a {specially fabricated} Reddit post could result in an attacker being able to steal money or your private data.”

Axios reports that Atlas’s dual-purpose Omnibox opens fresh privacy and security risks for users. That’s the downside of combining too much functionality without strong guardrails. But when new features take priority over user security and privacy, those guardrails get overlooked.

Despite researchers demonstrating vulnerabilities, OpenAI claims to have implemented protections to prevent any real dangers. According to its help page:

“Agent mode runs also operates under boundaries:

System access: Cannot run code in the browser, download files, or install extensions.

Data access: Cannot access other apps on your computer or your file system, read or write ChatGPT memories, access saved passwords, or use autofill data.

Browsing activity: Pages ChatGPT visits in agent mode are not added to your browsing history.”

Agentic AI browsers like OpenAI’s Atlas face a fundamental security challenge: separating real user intent from injected, potentially malicious instructions. They often fail because they interpret any instructions they find as user prompts. Without stricter input validation and more robust boundaries, these tools remain highly vulnerable to prompt injection attacks—with potentially severe consequences for privacy and data security.


We don’t just report on privacy—we offer you the option to use it.

Privacy risks should never spread beyond a headline. Keep your online privacy yours by using Malwarebytes Privacy VPN.

  •