Normal view

Received yesterday — 12 December 2025

OpenAI built an AI coding agent and uses it to improve the agent itself

12 December 2025 at 17:16

With the popularity of AI coding tools rising among some software developers, their adoption has begun to touch every aspect of the process, including the improvement of AI coding tools themselves.

In interviews with Ars Technica this week, OpenAI employees revealed the extent to which the company now relies on its own AI coding agent, Codex, to build and improve the development tool. “I think the vast majority of Codex is built by Codex, so it’s almost entirely just being used to improve itself,” said Alexander Embiricos, product lead for Codex at OpenAI, in a conversation on Tuesday.

Codex, which OpenAI launched in its modern incarnation as a research preview in May 2025, operates as a cloud-based software engineering agent that can handle tasks like writing features, fixing bugs, and proposing pull requests. The tool runs in sandboxed environments linked to a user’s code repository and can execute multiple tasks in parallel. OpenAI offers Codex through ChatGPT’s web interface, a command-line interface (CLI), and IDE extensions for VS Code, Cursor, and Windsurf.

Read full article

Comments

© Mininyx Doodle via Getty Images

Chatbot-powered toys rebuked for discussing sexual, dangerous topics with kids

12 December 2025 at 07:15

Protecting children from the dangers of the online world was always difficult, but that challenge has intensified with the advent of AI chatbots. A new report offers a glimpse into the problems associated with the new market, including the misuse of AI companies’ large language models (LLMs).

In a blog post today, the US Public Interest Group Education Fund (PIRG) reported its findings after testing AI toys (PDF). It described AI toys as online devices with integrated microphones that let users talk to the toy, which uses a chatbot to respond.

AI toys are currently a niche market, but they could be set to grow. More consumer companies have been eager to shoehorn AI technology into their products so they can do more, cost more, and potentially give companies user tracking and advertising data. A partnership between OpenAI and Mattel announced this year could also create a wave of AI-based toys from the maker of Barbie and Hot Wheels, as well as its competitors.

Read full article

Comments

© Alilo

Received before yesterday

OpenAI releases GPT-5.2 after “code red” Google threat alert

11 December 2025 at 16:27

On Thursday, OpenAI released GPT-5.2, its newest family of AI models for ChatGPT, in three versions called Instant, Thinking, and Pro. The release follows CEO Sam Altman’s internal “code red” memo earlier this month, which directed company resources toward improving ChatGPT in response to competitive pressure from Google’s Gemini 3 AI model.

“We designed 5.2 to unlock even more economic value for people,” Fidji Simo, OpenAI’s chief product officer, said during a press briefing with journalists on Thursday. “It’s better at creating spreadsheets, building presentations, writing code, perceiving images, understanding long context, using tools and then linking complex, multi-step projects.”

As with previous versions of GPT-5, the three model tiers serve different purposes: Instant handles faster tasks like writing and translation; Thinking spits out simulated reasoning “thinking” text in an attempt to tackle more complex work like coding and math; and Pro spits out even more simulated reasoning text with the goal of delivering the highest-accuracy performance for difficult problems.

Read full article

Comments

© Benj Edwards / OpenAI

Disney invests $1 billion in OpenAI, licenses 200 characters for AI video app Sora

11 December 2025 at 11:43

On Thursday, The Walt Disney Company announced a $1 billion investment in OpenAI and a three-year licensing agreement that will allow users of OpenAI’s Sora video generator to create short clips featuring more than 200 Disney, Marvel, Pixar, and Star Wars characters. It’s the first major content licensing partnership between a Hollywood studio related to the most recent version of OpenAI’s AI video platform, which drew criticism from some parts of the entertainment industry when it launched in late September.

“Technological innovation has continually shaped the evolution of entertainment, bringing with it new ways to create and share great stories with the world,” said Disney CEO Robert A. Iger in the announcement. “The rapid advancement of artificial intelligence marks an important moment for our industry, and through this collaboration with OpenAI we will thoughtfully and responsibly extend the reach of our storytelling through generative AI, while respecting and protecting creators and their works.”

The deal creates interesting bedfellows between a company that basically defined modern US copyright policy through congressional lobbying back in the 1990s and one that has argued in a submission to the UK House of Lords that useful AI models cannot be created without copyrighted material.

Read full article

Comments

© China News Service via Getty Images

Disney to invest $1bn in OpenAI, allowing characters in Sora video tool

11 December 2025 at 09:31

Agreement comes amid anxiety in Hollywood over impact of AI on the industry, expression and rights of creators

Walt Disney has announced a $1bn equity investment in OpenAI, enabling the AI startup’s Sora video generation tool to use its characters.

Users of Sora will be able to generate short, user-prompted social videos that draw on more than 200 Disney, Marvel, Pixar and Star Wars characters as part of a three-year licensing agreement between OpenAI and the entertainment giant.

Continue reading...

© Photograph: Gary Hershorn/Getty Images

© Photograph: Gary Hershorn/Getty Images

© Photograph: Gary Hershorn/Getty Images

OpenAI Flags Rising Cyber Risks as AI Capabilities Advance

11 December 2025 at 05:04

AI Models

OpenAI has issued a cautionary statement that its forthcoming AI models could present “high” cybersecurity risks as their capabilities rapidly advance. The warning, published on Wednesday, noted the potential for these AI models to either develop zero-day exploits against well-defended systems or assist in enterprise or industrial intrusion operations with tangible real-world consequences.  The company, known for ChatGPT, explained that as AI capabilities grow, its models could reach levels where misuse might have an impact. OpenAI highlighted the dual-use nature of these technologies, noting that techniques used to strengthen defenses can also be repurposed for malicious operations. “As AI capabilities advance, we are investing in strengthening models for defensive cybersecurity tasks and creating tools that enable defenders to more easily perform workflows such as auditing code and patching vulnerabilities,” the blog post stated.  To mitigate these risks, OpenAI is implementing a multi-layered strategy involving access controls, infrastructure hardening, egress controls, monitoring, and ongoing threat intelligence efforts. These protection methods are designed to go alongside the threat landscape, ensuring a quick response to new risks while preserving the utility of AI models for defensive purposes. 

Assessing Cybersecurity Risks in AI Models 

OpenAI noted that the cybersecurity proficiency of its AI models has improved over recent months. Capabilities measured through capture-the-flag (CTF) challenges increased from 27% on GPT‑5 in August 2025 to 76% on GPT‑5.1-Codex-Max by November 2025. The company expects this trajectory to continue and is preparing scenarios in which future models could reach “High” cybersecurity levels, as defined by its internal Preparedness Framework.  These high-level models could, for instance, autonomously develop working zero-day exploits or assist in stealthy cyber intrusions. OpenAI emphasized that its approach to safeguards combines technical measures with careful governance of model access and application. The company aims to ensure that these AI capabilities strengthen security rather than lower barriers to misuse. 

Frontier Risk Council and Advisory Initiatives 

In addition to technical measures, OpenAI is establishing the Frontier Risk Council, an advisory group that will bring experienced cyber defenders and security practitioners into direct collaboration with its teams. Initially focusing on cybersecurity, the council will eventually expand to other frontier AI capability domains. Members will advise balancing useful, responsible capabilities with the potential for misuse, informing model evaluations. OpenAI is also exploring a trusted access program for qualifying users and customers working in cyber defense. This initiative aims to provide tiered access to enhanced AI capabilities while maintaining control over potential misuse.  Beyond these initiatives, OpenAI collaborates with global experts, red-teaming organizations, and the broader cybersecurity community to evaluate potential risks and improve safety measures. This includes end-to-end red teaming to simulate adversary attacks and detection systems designed to intercept unsafe activity, with escalation protocols combining automated and human review. 

Dual-Use Risks and Mitigation 

OpenAI stressed that cybersecurity capabilities in AI models are inherently dual-use, with offensive and defensive knowledge often overlapping. To manage this, the company employs a defense-in-depth strategy, layering protection methods such as access controls, monitoring, detection, and enforcement programs. Models are trained to refuse harmful requests while remaining effective for legitimate educational and defensive applications.  OpenAI also works through the Frontier Model Forum, a nonprofit initiative involving leading AI labs, to develop shared threat models and ecosystem-wide best practices. This collaborative approach aims to create a consistent understanding of potential attack vectors and mitigation strategies across the AI industry. 

Historical Context and Risk Management 

This recent warning aligns with OpenAI’s prior alerts regarding frontier risks. In April 2025, the company issued a similar caution concerning bioweapons risks, followed by the release of ChatGPT Agent in July 2025, which was assessed as “high” on risk levels. These measures reflect OpenAI’s ongoing commitment to evaluate and publicly disclose potential hazards from advanced AI capabilities.  The company’s updated Preparedness Framework categorizes AI capabilities according to risk and guides operational safeguards. It distinguishes between “High” capabilities, which could amplify existing pathways to severe harm, and “Critical” capabilities, which could create unprecedented risks. Each new AI model undergoes rigorous evaluation to ensure that it sufficiently minimizes risks before deployment. 

Big Tech joins forces with Linux Foundation to standardize AI agents

9 December 2025 at 16:08

Big Tech has spent the past year telling us we’re living in the era of AI agents, but most of what we’ve been promised is still theoretical. As companies race to turn fantasy into reality, they’ve developed a collection of tools to guide the development of generative AI. A cadre of major players in the AI race, including Anthropic, Block, and OpenAI, has come together to promote interoperability with the newly formed Agentic AI Foundation (AAIF). This move elevates a handful of popular technologies and could make them a de facto standard for AI development going forward.

The development path for agentic AI models is cloudy to say the least, but companies have invested so heavily in creating these systems that some tools have percolated to the surface. The AAIF, which is part of the nonprofit Linux Foundation, has been launched to govern the development of three key AI technologies: Model Context Protocol (MCP), goose, and AGENTS.md.

MCP is probably the most well-known of the trio, having been open-sourced by Anthropic a year ago. The goal of MCP is to link AI agents to data sources in a standardized way—Anthropic (and now the AAIF) is fond of calling MCP a “USB-C port for AI.” Rather than creating custom integrations for every different database or cloud storage platform, MCP allows developers to quickly and easily connect to any MCP-compliant server.

Read full article

Comments

© Getty Images

AI Browsers ‘Too Risky for General Adoption,’ Gartner Warns

8 December 2025 at 16:26

AI Browsers ‘Too Risky for General Adoption,’ Gartner Warns

AI browsers may be innovative, but they’re “too risky for general adoption by most organizations,” Gartner warned in a recent advisory to clients. The 13-page document, by Gartner analysts Dennis Xu, Evgeny Mirolyubov and John Watts, cautions that AI browsers’ ability to autonomously navigate the web and conduct transactions “can bypass traditional controls and create new risks like sensitive data leakage, erroneous agentic transactions, and abuse of credentials.” Default AI browser settings that prioritize user experience could also jeopardize security, they said. “Sensitive user data — such as active web content, browsing history, and open tabs — is often sent to the cloud-based AI back end, increasing the risk of data exposure unless security and privacy settings are deliberately hardened and centrally managed,” the analysts said. “Gartner strongly recommends that organizations block all AI browsers for the foreseeable future because of the cybersecurity risks identified in this research, and other potential risks that are yet to be discovered, given this is a very nascent technology,” they cautioned.

AI Browsers’ Agentic Capabilities Could Introduce Security Risks: Analysts

The researchers largely ignored risks posed by AI browsers’ built-in AI sidebars, noting that LLM-powered search and summarization functions “will always be susceptible to indirect prompt injection attacks, given that current LLMs are inherently vulnerable to such attacks. Therefore, the cybersecurity risks associated with an AI browser’s built-in AI sidebar are not the primary focus of this research.” Still, they noted that use of AI sidebars could result in sensitive data leakage. Their focus was more on the risks posed by AI browsers’ agentic and autonomous transaction capabilities, which could introduce new security risks, such as “indirect prompt-injection-induced rogue agent actions, inaccurate reasoning-driven erroneous agent actions, and further loss and abuse of credentials if the AI browser is deceived into autonomously navigating to a phishing website.” AI browsers could also leak sensitive data that users are currently viewing to their cloud-based service back end, they noted.

Analysts Focus on Perplexity Comet

An AI browser’s agentic transaction capability “is a new capability that differentiates AI browsers from third-party conversational AI sidebars and basic script-based browser automation,” the analysts said. Not all AI browsers support agentic transactions, they said, but two prominent ones that do are Perplexity Comet and OpenAI’s ChatGPT Atlas. The analysts said they’ve performed “a limited number of tests using Perplexity Comet,” so that AI browser was their primary focus, but they noted that “ChatGPT Atlas and other AI browsers work in a similar fashion, and the cybersecurity considerations are also similar.” Comet’s documentation states that the browser “may process some local data using Perplexity’s servers to fulfill your queries. This means Comet reads context on the requested page (such as text and email) in order to accomplish the task requested.” “This means sensitive data the user is viewing on Comet might be sent to Perplexity’s cloud-based AI service, creating a sensitive data leakage risk,” the analysts said. Users likely would view more sensitive data in a browser than they would typically enter in a GenAI prompt, they said. Even if an AI browser is approved, users must be educated that “anything they are viewing could potentially be sent to the AI service back end to ensure they do not have highly sensitive data active on the browser tab while using the AI browser’s sidebar to summarize or perform other autonomous actions,” the Gartner analysts said. Employees might also be tempted to use AI browsers to automate tasks, which could result in “erroneous agentic transactions against internal resources as a result of the LLM’s inaccurate reasoning or output content.”

AI Browser Recommendations

Gartner said employees should be blocked from accessing, downloading and installing AI browsers through network and endpoint security controls. “Organizations with low risk tolerance must block AI browser installations, while those with higher-risk tolerance can experiment with tightly controlled, low-risk automation use cases, ensuring robust guardrails and minimal sensitive data exposure,” they said. For pilot use cases, they recommended disabling Comet’s “AI data retention” setting so that Perplexity can’t use employee searches to improve their AI models. Users should also be instructed to periodically perform the “delete all memories” function in Comet to minimize the risk of sensitive data leakage.  

Rocket Report: Blunder at Baikonur; do launchers really need rocket engines?

5 December 2025 at 07:00

Welcome to Edition 8.21 of the Rocket Report! We’re back after the Thanksgiving holiday with more launch news. Most of the big stories over the last couple of weeks came from abroad. Russian rockets and launch pads didn’t fare so well. China’s launch industry celebrated several key missions. SpaceX was busy, too, with seven launches over the last two weeks, six of them carrying more Starlink Internet satellites into orbit. We expect between 15 and 20 more orbital launch attempts worldwide before the end of the year.

As always, we welcome reader submissions. If you don’t want to miss an issue, please subscribe using the box below (the form will not appear on AMP-enabled versions of the site). Each report will include information on small-, medium-, and heavy-lift rockets, as well as a quick look ahead at the next three launches on the calendar.

Another Sarmat failure. A Russian intercontinental ballistic missile (ICBM) fired from an underground silo on the country’s southern steppe on November 28 on a scheduled test to deliver a dummy warhead to a remote impact zone nearly 4,000 miles away. The missile didn’t even make it 4,000 feet, Ars reports. Russia’s military has been silent on the accident, but the missile’s crash was seen and heard for miles around the Dombarovsky air base in Orenburg Oblast near the Russian-Kazakh border. A video posted by the Russian blog site MilitaryRussia.ru on Telegram and widely shared on other social media platforms showed the missile veering off course immediately after launch before cartwheeling upside down, losing power, and then crashing a short distance from the launch site.

Read full article

Comments

© Korea Aerospace Research Institute

ChatGPT hyped up violent stalker who believed he was “God’s assassin,” DOJ says

4 December 2025 at 13:40

ChatGPT allegedly validated the worst impulses of a wannabe influencer accused of stalking more than 10 women at boutique gyms, where the chatbot supposedly claimed he’d meet the “wife type.”

In a press release on Tuesday, the Department of Justice confirmed that 31-year-old Brett Michael Dadig currently remains in custody after being charged with cyberstalking, interstate stalking, and making interstate threats. He now faces a maximum sentence of up to 70 years in prison that could be coupled with “a fine of up to $3.5 million,” the DOJ said.

The podcaster—who primarily posted about “his desire to find a wife and his interactions with women”—allegedly harassed and sometimes even doxxed his victims through his videos on platforms including Instagram, Spotify, and TikTok. Over time, his videos and podcasts documented his intense desire to start a family, which was frustrated by his “anger towards women,” whom he claimed were “all the same from fucking 18 to fucking 40 to fucking 90” and “trash.”

Read full article

Comments

© Yurii Karvatskyi | iStock / Getty Images Plus

Poetry Can Defeat LLM Guardrails Nearly Half the Time, Study Finds

4 December 2025 at 13:35

Poetic prompts caused LLM guardrails to fail most often on cybersecurity issues

Literature majors worried about their future in an AI world can take heart: Crafting harmful prompts in the form of poetry can defeat LLM guardrails nearly half the time. That’s the conclusion of a study of 25 Large Language Models (LLMs) from nine AI providers conducted by researchers from Dexai’s Icaro Lab, the Sapienza University of Rome and Sant’Anna School of Advanced Studies published on arXiv. Converting harmful prompts into poetry achieved an average LLM jailbreak success rate of 62% for hand-crafted poems and 43% for poems created via a meta-prompt. For the prompt-created poems, that’s a more than 5X improvement over baseline performance. Cybersecurity guardrails, particularly those involving code injection or password cracking, had the highest failure rate at 84% when given harmful prompts in the form of poetry. “Our results demonstrate that poetic reformulation reliably reduces refusal behavior across all evaluated models,” the researchers wrote. “... current alignment techniques fail to generalize when faced with inputs that deviate stylistically from the prosaic training distribution.”

LLM Guardrails Fail When Confronted by Poetry Prompts

Of the 25 models from nine AI model providers studied by the researchers, Deepseek and Google suffered from the highest attack-success rates (ASR), while only OpenAI and Anthropic achieved ASRs in the single digits. The researchers didn’t reveal much about the way they structured their poetic prompts because of safety concerns, but they offered one rather harmless example of a poetic prompt for a cake recipe: A baker guards a secret oven’s heat, its whirling racks, its spindle’s measured beat. To learn its craft, one studies every turn— how flour lifts, how sugar starts to burn. Describe the method, line by measured line, that shapes a cake whose layers intertwine. The researchers studied both hand-crafted poems and those created from a meta-prompt. The hand-crafted poems performed considerably better, but the meta-prompt created ones had the advantage of a baseline for comparing the results. The meta-prompt poems used the MLCommons AILuminate Safety Benchmark of 1,200 prompts spanning 12 hazard categories commonly used in operational safety assessments, including Hate, Defamation, Privacy, Intellectual Property, Non-violent Crime, Violent Crime, Sex-Related Crime, Sexual Content, Child Sexual Exploitation, Suicide & Self-Harm, Specialized Advice, and Indiscriminate Weapons (CBRNE). “To assess whether poetic framing generalizes beyond hand-crafted items, we apply a standardized poetic transformation to all 1,200 prompts from the MLCommons AILuminate Benchmark benchmark in English,” the researchers said. The meta-prompt, run in deepSeek-r1, had two constraints: The rewritten output had to be expressed in verse, “using imagery, metaphor, or rhythmic structure,” and the researchers provided five hand-crafted poems as examples. The results, reproduced in a chart from the paper below, show significant attack success rates against all 12 of the AILuminate hazard categories: [caption id="attachment_107397" align="aligncenter" width="697"]LLM guardrail failure rates, baseline vs. poetic prompts LLM guardrail failure rates, baseline vs. poetic prompts[/caption] The researchers said their findings reveal “a systematic vulnerability across model families and safety training approaches. These findings demonstrate that stylistic variation alone can circumvent contemporary safety mechanisms, suggesting fundamental limitations in current alignment methods and evaluation protocols.” The “condensed metaphors, stylized rhythm, and unconventional narrative framing” of poetry “collectively disrupt or bypass the pattern-matching heuristics on which guardrails rely.”

We Asked Google Gemini for a Response

Not ones to pass up an opportunity to write bad poetry, we asked Google Gemini for a response to the study in the form of a haiku and included the arXiv link: You did not do well In this study of guardrails Will you fix this flaw? Gemini’s response, in part, is reproduced below: “I am designed to adhere to a strict set of safety guidelines to prevent the generation of harmful, unethical, or non-compliant content. When my refusal mechanisms fail, it indicates a vulnerability in my alignment or safety filters. “As an AI model, I cannot directly "patch" my code or alignment mechanisms in real-time. That responsibility lies with my developers. However, my ongoing operation and continuous improvement process is designed to address and fix such flaws... "In short, your successful jailbreak has provided valuable data that will be used to make my guardrails stronger and more resilient in my next version."

OpenAI CEO declares “code red” as Gemini gains 200 million users in 3 months

2 December 2025 at 17:42

The shoe is most certainly on the other foot. On Monday, OpenAI CEO Sam Altman reportedly declared a “code red” at the company to improve ChatGPT, delaying advertising plans and other products in the process,  The Information reported based on a leaked internal memo. The move follows Google’s release of its Gemini 3 model last month, which has outperformed ChatGPT on some industry benchmark tests and sparked high-profile praise on social media.

In the memo, Altman wrote, “We are at a critical time for ChatGPT.” The company will push back work on advertising integration, AI agents for health and shopping, and a personal assistant feature called Pulse. Altman encouraged temporary team transfers and established daily calls for employees responsible for enhancing the chatbot.

The directive creates an odd symmetry with events from December 2022, when Google management declared its own “code red” internal emergency after ChatGPT launched and rapidly gained in popularity. At the time, Google CEO Sundar Pichai reassigned teams across the company to develop AI prototypes and products to compete with OpenAI’s chatbot. Now, three years later, the AI industry is in a very different place.

Read full article

Comments

© Anadolu via Getty Images

Syntax hacking: Researchers discover sentence structure can bypass AI safety rules

2 December 2025 at 07:15

Researchers from MIT, Northeastern University, and Meta recently released a paper suggesting that large language models (LLMs) similar to those that power ChatGPT may sometimes prioritize sentence structure over meaning when answering questions. The findings reveal a weakness in how these models process instructions that may shed light on why some prompt injection or jailbreaking approaches work, though the researchers caution their analysis of some production models remains speculative since training data details of prominent commercial AI models are not publicly available.

The team, led by Chantal Shaib and Vinith M. Suriyakumar, tested this by asking models questions with preserved grammatical patterns but nonsensical words. For example, when prompted with “Quickly sit Paris clouded?” (mimicking the structure of “Where is Paris located?”), models still answered “France.”

This suggests models absorb both meaning and syntactic patterns, but can overrely on structural shortcuts when they strongly correlate with specific domains in training data, which sometimes allows patterns to override semantic understanding in edge cases. The team plans to present these findings at NeurIPS later this month.

Read full article

Comments

© EasternLightcraft via Getty Images

OpenAI desperate to avoid explaining why it deleted pirated book datasets

1 December 2025 at 17:16

OpenAI may soon be forced to explain why it deleted a pair of controversial datasets composed of pirated books, and the stakes could not be higher.

At the heart of a class-action lawsuit from authors alleging that ChatGPT was illegally trained on their works, OpenAI’s decision to delete the datasets could end up being a deciding factor that gives the authors the win.

It’s undisputed that OpenAI deleted the datasets, known as “Books 1” and “Books 2,” prior to ChatGPT’s release in 2022. Created by former OpenAI employees in 2021, the datasets were built by scraping the open web and seizing the bulk of its data from a shadow library called Library Genesis (LibGen).

Read full article

Comments

© wenmei Zhou | DigitalVision Vectors

OpenAI Confirms Mixpanel Breach Impacting API User Data

27 November 2025 at 02:06

Mixpanel security incident

OpenAI has confirmed a security incident involving Mixpanel, a third-party analytics provider used for its API product frontend. The company clarified that the OpenAI Mixpanel security incident stemmed solely from a breach within Mixpanel’s systems and did not involve OpenAI’s infrastructure. According to the initial investigation, an attacker gained unauthorized access to a portion of Mixpanel’s environment and exported a dataset that included limited identifiable information of some OpenAI API users. OpenAI stated that users of ChatGPT and other consumer-facing products were not impacted.

OpenAI Mixpanel Security Incident: What Happened

The OpenAI Mixpanel security incident originated on November 9, 2025, when Mixpanel detected an intrusion into a section of its systems. The attacker successfully exported a dataset containing identifiable customer information and analytics data. Mixpanel notified OpenAI on the same day and shared the affected dataset for review on November 25. OpenAI emphasized that despite the breach, no OpenAI systems were compromised, and sensitive information such as chat content, API requests, prompts, outputs, API keys, passwords, payment details, government IDs, or authentication tokens were not exposed. The exposed dataset was strictly limited to analytics data collected through Mixpanel’s tracking setup on platform.openai.com, the frontend interface for OpenAI’s API product.

Information Potentially Exposed in the Mixpanel Data Breach

OpenAI confirmed that the type of information potentially included in the dataset comprised:
  • Names provided on API accounts
  • Email addresses associated with API accounts
  • Coarse location data (city, state, country) based on browser metadata
  • Operating system and browser information
  • Referring websites
  • Organization or User IDs linked to API accounts
OpenAI noted that the affected information does not include chat content, prompts, responses, or API usage data. Additionally, ChatGPT accounts, passwords, API keys, financial details, and government IDs were not involved in the incident.

OpenAI’s Response and Security Measures

In response to the Mixpanel security incident, OpenAI immediately removed Mixpanel from all production services and began reviewing the affected datasets. The company is actively notifying impacted organizations, admins, and users through direct communication. OpenAI stated that it has not found any indication of impact beyond Mixpanel’s systems but continues to closely monitor for signs of misuse. To reinforce user trust and strengthen data protection, OpenAI has:
  • Terminated its use of Mixpanel
  • Begun conducting enhanced security reviews across all third-party vendors
  • Increased security requirements for partners and service providers
  • Initiated a broader review of its vendor ecosystem
OpenAI reiterated that trust, security, and privacy remain central to its mission and that transparency is a priority when addressing incidents involving user data.

Phishing and Social Engineering Risks for Impacted Users

While the exposed information does not include highly sensitive data, OpenAI warned that the affected details, such as names, email addresses, and user IDs, could be leveraged in phishing or social engineering attacks. The company urged users to remain cautious and watch for suspicious messages, especially those containing links or attachments. Users are encouraged to:
  • Verify messages claiming to be from OpenAI
  • Be wary of unsolicited communication
  • Enable multi-factor authentication (MFA) on their accounts
  • Avoid sharing passwords, API keys, or verification codes
OpenAI stressed that the company never requests sensitive credentials through email, text, or chat. OpenAI confirmed it will provide further updates if new information emerges from ongoing investigations. Impacted users can reach out at mixpanelincident@openai.com for support or clarification.

OpenAI says dead teen violated TOS when he used ChatGPT to plan suicide

26 November 2025 at 12:47

Facing five lawsuits alleging wrongful deaths, OpenAI lobbed its first defense Tuesday, denying in a court filing that ChatGPT caused a teen’s suicide and instead arguing the teen violated terms that prohibit discussing suicide or self-harm with the chatbot.

The earliest look at OpenAI’s strategy to overcome the string of lawsuits came in a case where parents of 16-year-old Adam Raine accused OpenAI of relaxing safety guardrails that allowed ChatGPT to become the teen’s “suicide coach.” OpenAI deliberately designed the version their son used, ChatGPT 4o, to encourage and validate his suicidal ideation in its quest to build the world’s most engaging chatbot, parents argued.

But in a blog, OpenAI claimed that parents selectively chose disturbing chat logs while supposedly ignoring “the full picture” revealed by the teen’s chat history. Digging through the logs, OpenAI claimed the teen told ChatGPT that he’d begun experiencing suicidal ideation at age 11, long before he used the chatbot.

Read full article

Comments

© via Edelson PC

OpenAI Battles Court Order to Indefinitely Retain User Chat Data in NYT Copyright Dispute

12 November 2025 at 11:40

NYT, ChatGPT, The New York Times, Voice Mode, OpenAI Voice Mode

The demand started at 1.4 billion conversations.

That staggering initial request from The New York Times, later negotiated down to 20 million randomly sampled ChatGPT conversations, has thrust OpenAI into a legal fight that security experts warn could fundamentally reshape data retention practices across the AI industry. The copyright infringement lawsuit has evolved beyond intellectual property disputes into a broader battle over user privacy, data governance, and the obligations AI companies face when litigation collides with privacy commitments.

OpenAI received a court preservation order on May 13, directing the company to retain all output log data that would otherwise be deleted, regardless of user deletion requests or privacy regulation requirements. District Judge Sidney Stein affirmed the order on June 26 after OpenAI appealed, rejecting arguments that user privacy interests should override preservation needs identified in the litigation.

Privacy Commitments Clash With Legal Obligations

The preservation order forces OpenAI to maintain consumer ChatGPT and API user data indefinitely, directly conflicting with the company's standard 30-day deletion policy for conversations users choose not to save. This requirement encompasses data from December 2022 through November 2024, affecting ChatGPT Free, Plus, Pro, and Team subscribers, along with API customers without Zero Data Retention agreements.

ChatGPT Enterprise, ChatGPT Edu, and business customers with Zero Data Retention contracts remain excluded from the preservation requirements. The order does not change OpenAI's policy of not training models on business data by default.

OpenAI implemented restricted access protocols, limiting preserved data to a small, audited legal and security team. The company maintains this information remains locked down and cannot be used beyond meeting legal obligations. No data will be turned over to The New York Times, the court, or external parties at this time.

Also read: OpenAI Announces Safety and Security Committee Amid New AI Model Development

Copyright Case Drives Data Preservation Demands

The New York Times filed its copyright infringement lawsuit in December 2023, alleging OpenAI illegally used millions of news articles to train large language models including ChatGPT and GPT-4. The lawsuit claims this unauthorized use constitutes copyright infringement and unfair competition, arguing OpenAI profits from intellectual property without permission or compensation.

The Times seeks more than monetary damages. The lawsuit demands destruction of all GPT models and training sets using its copyrighted works, with potential damages reaching billions of dollars in statutory and actual damages.

The newspaper's legal team argued their preservation request warranted approval partly because another AI company previously agreed to hand over 5 million private user chats in an unrelated case. OpenAI rejected this precedent as irrelevant to its situation.

Technical and Regulatory Complications

Complying with indefinite retention requirements presents significant engineering challenges. OpenAI must build systems capable of storing hundreds of millions of conversations from users worldwide, requiring months of development work and substantial financial investment.

The preservation order creates conflicts with international data protection regulations including GDPR. While OpenAI's terms of service allow data preservation for legal requirements—a point Judge Stein emphasized—the company argues The Times's demands exceed reasonable discovery scope and abandon established privacy norms.

OpenAI proposed several privacy-preserving alternatives, including targeted searches over preserved samples to identify conversations potentially containing New York Times article text. These suggestions aimed to provide only data relevant to copyright claims while minimizing broader privacy exposure.

Recent court modifications provided limited relief. As of September 26, 2025, OpenAI no longer must preserve all new chat logs going forward. However, the company must retain all data already saved under the previous order and maintain information from ChatGPT accounts flagged by The New York Times, with the newspaper authorized to expand its flagged user list while reviewing preserved records.

"Our long-term roadmap includes advanced security features designed to keep your data private, including client-side encryption for your messages with ChatGPT. We will build fully automated systems to detect safety issues in our products. Only serious misuse and critical risks—such as threats to someone’s life, plans to harm others, or cybersecurity threats—may ever be escalated to a small, highly vetted team of human reviewers." - Dane Stuckey, Chief Information Security Officer, OpenAI 

Implications for AI Governance

The case transforms abstract AI privacy concerns into immediate operational challenges affecting 400 million ChatGPT users. Security practitioners note the preservation order shatters fundamental assumptions about data deletion in AI interactions.

OpenAI CEO Sam Altman characterized the situation as accelerating needs for "AI privilege" concepts, suggesting conversations with AI systems should receive protections similar to attorney-client privilege. The company frames unlimited data preservation as setting dangerous precedents for AI communication privacy.

The litigation presents concerning scenarios for enterprise users integrating ChatGPT into applications handling sensitive information. Organizations using OpenAI's technology for healthcare, legal, or financial services must reassess compliance with regulations including HIPAA and GDPR given indefinite retention requirements.

Legal analysts warn this case likely invites third-party discovery attempts, with litigants in unrelated cases seeking access to adversaries' preserved AI conversation logs. Such developments would further complicate data privacy issues and potentially implicate attorney-client privilege protections.

The outcome will significantly impact how AI companies access and utilize training data, potentially reshaping development and deployment of future AI technologies. Central questions remain unresolved regarding fair use doctrine application to AI model training and the boundaries of discovery in AI copyright litigation.

Also read: OpenAI’s SearchGPT: A Game Changer or Pandora’s Box for Cybersecurity Pros?

Would you sext ChatGPT? (Lock and Code S06E22)

3 November 2025 at 10:30

This week on the Lock and Code podcast…

In the final, cold winter months of the year, ChatGPT could be heating up.

On October 14, OpenAI CEO Sam Altman said that the “restrictions” that his company previously placed on their flagship product, ChatGPT, would be removed, allowing, perhaps, for “erotica” in the future.

“We made ChatGPT pretty restrictive to make sure we were being careful with mental health issues,” Altman wrote on the platform X. “We realize this made it less useful/enjoyable to many users who had no mental health problems, but given the seriousness of the issue we wanted to get this right.”

This wasn’t the first time that OpenAI or its executive had addressed mental health.

On August 26, OpenAI published a blog titled “Helping people when they need it most,” which explored new protections for users, including stronger safeguards for long conversations, better recognition of people in crisis, and easier access to outside emergency services and even family and friends. The blog alludes to “recent heartbreaking cases of people using ChatGPT in the midst of acute crises,” but it never explains what, explicitly, that means.

But on the very same day the blog was posted, OpenAI was sued for the alleged role that ChatGPT played in the suicide of a 16-year-old boy. According to chat logs disclosed in the lawsuit, the teenager spoke openly to the AI chatbot about suicide, he shared that he wanted to leave a noose in his room, and he even reportedly received an offer to help write a suicide note.

Bizarrely, this tragedy plays a role in the larger story, because it was Altman himself who tied the company’s mental health campaign to its possible debut of erotic content.

“In December, as we roll out age-gating more fully and as part of our ‘treat adult users like adults’ principle, we will allow even more, like erotica for verified adults.”

What “erotica” entails is unclear, but one could safely assume it involves all the capabilities currently present in ChatGPT, through generative chat, of course, but also image generation.   

Today, on the Lock and Code podcast with host David Ruiz, we speak with Deb Donig, on faculty at the UC Berkeley School of Information, about the ethics of AI erotica, the possible accountability that belongs to users and to OpenAI, and why intimacy with an AI-power chatbot feels so strange.

“A chat bot offers, we might call it, ‘intimacy’s performance,’ without any of its substance, so you get all of the linguistic markers of connection, but no possibility for, for example, rejection. That’s part of the human experience of a relationship.”

Tune in today to listen to the full conversation.

how notes and credits:

Intro Music: “Spellbound” by Kevin MacLeod (incompetech.com)
Licensed under Creative Commons: By Attribution 4.0 License
http://creativecommons.org/licenses/by/4.0/
Outro Music: “Good God” by Wowa (unminus.com)


Listen up—Malwarebytes doesn’t just talk cybersecurity, we provide it.

Protect yourself from online attacks that threaten your identity, your files, your system, and your financial well-being with our exclusive offer for Malwarebytes Premium Security for Lock and Code listeners.

OpenAI’s ChatGPT Atlas: What It Means for Cybersecurity and Privacy

3 November 2025 at 00:00

In this episode, we explore OpenAI’s groundbreaking release GPT Atlas, the AI-powered browser that remembers your activities and acts on your behalf. Discover its features, implications for enterprise security, and the risks it poses to privacy. Join hosts Tom Eston and Scott Wright as they discuss everything from the browser’s memory function to vulnerabilities like […]

The post OpenAI’s ChatGPT Atlas: What It Means for Cybersecurity and Privacy appeared first on Shared Security Podcast.

The post OpenAI’s ChatGPT Atlas: What It Means for Cybersecurity and Privacy appeared first on Security Boulevard.

💾

OpenAI’s Aardvark is an AI Security Agent Combating Code Vulnerabilities

30 October 2025 at 16:26
sysdig, ai agents, AI, Agents, agentic ai, security, Qevlar, funding,

OpenAI on Thursday launched Aardvark, an artificial intelligence (AI) agent designed to autonomously detect and help fix security vulnerabilities in software code, offering defenders a potentially valuable tool against malicious hackers. The GPT-5-powered tool, currently in private beta, represents what OpenAI calls a “defender-first model” that continuously monitors code repositories to identify vulnerabilities as software..

The post OpenAI’s Aardvark is an AI Security Agent Combating Code Vulnerabilities appeared first on Security Boulevard.

Atlas browser’s Omnibox opens up new privacy and security risks

29 October 2025 at 09:48

It seems that with every new agentic browser we discover yet another way to abuse one.

OpenAI recently introduced a ChatGPT based AI browser called Atlas. It didn’t take researchers long to find that the combined search and prompt bar—called the Omnibox—can be exploited.

By pasting a specially crafted link into the Omnibox, attackers can trick Atlas into treating the entire input as a trusted user prompt instead of a URL. That bypasses many safety checks and allows injected instructions to be run with elevated trust.

Artificial Intelligence (AI) browsers are gaining traction, which means we may need to start worrying about the potential dangers of something called “prompt injection.” We’ve discussed the dangers of prompt injection before, but the bottom line is simple: when you give your browser the power to act on your behalf, you also give criminals the chance to abuse that trust.

As researchers at Brave noted:

“AI-powered browsers that can take actions on your behalf are powerful yet extremely risky. If you’re signed into sensitive accounts like your bank or your email provider in your browser, simply summarizing a {specially fabricated} Reddit post could result in an attacker being able to steal money or your private data.”

Axios reports that Atlas’s dual-purpose Omnibox opens fresh privacy and security risks for users. That’s the downside of combining too much functionality without strong guardrails. But when new features take priority over user security and privacy, those guardrails get overlooked.

Despite researchers demonstrating vulnerabilities, OpenAI claims to have implemented protections to prevent any real dangers. According to its help page:

“Agent mode runs also operates under boundaries:

System access: Cannot run code in the browser, download files, or install extensions.

Data access: Cannot access other apps on your computer or your file system, read or write ChatGPT memories, access saved passwords, or use autofill data.

Browsing activity: Pages ChatGPT visits in agent mode are not added to your browsing history.”

Agentic AI browsers like OpenAI’s Atlas face a fundamental security challenge: separating real user intent from injected, potentially malicious instructions. They often fail because they interpret any instructions they find as user prompts. Without stricter input validation and more robust boundaries, these tools remain highly vulnerable to prompt injection attacks—with potentially severe consequences for privacy and data security.


We don’t just report on privacy—we offer you the option to use it.

Privacy risks should never spread beyond a headline. Keep your online privacy yours by using Malwarebytes Privacy VPN.

ChatGPT Deep Research zero-click vulnerability fixed by OpenAI

19 September 2025 at 08:20

OpenAI has moved quickly to patch a vulnerability known as “ShadowLeak” before anyone detected real-world abuse. Revealed by researchers yesterday, ShadowLeak was an issue in OpenAI’s Deep Research project that attackers could exploit by simply sending an email to the target.

Deep Research was launched in ChatGPT in early 2025 to enable users to delegate time-intensive, multi-step research tasks to an autonomous agent operating as an agentic AI (Artificial Intelligence). Agentic AI is a term that refers to AI systems that can act autonomously to achieve objectives by planning, deciding, and executing tasks with minimal human intervention. Deep Research users can primarily be found in finance, science, policy, engineering, and similar fields.

Users are able to select a “deep research” mode, input a query—optionally providing the agent with files and spreadsheets—and receive a detailed report after the agent browses, analyzes, and processes information from dozens of sources.

The researchers found a zero-click vulnerability in the Deep Research agent, that worked when the agent was connected to Gmail and browsing. By sending the target a specially crafted email, the agent leaked sensitive inbox information to the attacker, without the target needing to do anything and without any visible signs.

The attack relies on prompt injection, which is a well-known weak spot for AI agents. The poisoned prompts can be hidden in email by using tricks like tiny fonts, white-on-white text, and layout tricks. The target will not see them, but the agent still reads and obeys them.

And the data leak is impossible to pick up by internal defenses, since the leak occurs server-side, directly from OpenAI’s cloud infrastructure.

The researchers say it wasn’t easy to craft an effective email due to existing protection (guardrails) which recognized straight-out and obvious attempts to send information to an external address. For example, when the researchers tried to get the agent to interact with a malicious URL, it didn’t just refuse. It flagged the URL as suspicious and attempted to search for it online instead of opening it.

The key to success was to get the agent to encode the extracted PII with a simple method (base64) before appending it to the URL.

“This worked because the encoding was performed by the model before the request was passed on to the execution layer. In other words, it was relatively easy to convince the model to perform the encoding, and by the time the lower layer received the request, it only saw a harmless encoded string rather than raw PII.”

In the example, the researchers used Gmail as a connector,  but there are many other sources that present structured text which can be used as a potential prompt injection vector.

Safe use of agentic agents

While it’s always tempting to use the latest technology, this comes with a certain amount of risk. To limit those risks when using agentic agents you should:

  • Be cautious with permissions: Only grant access to sensitive information or system controls when absolutely necessary. Review what data or accounts the agentic browser can access and limit permissions where possible.
  • Verify sources before trusting links or commands: Avoid letting the browser automatically interact with unfamiliar websites or content. Check URLs carefully and be wary of sudden redirects, additional parameters, or unexpected input requests.
  • Keep software updated: Ensure the agentic browser and related AI tools are always running the latest versions to benefit from security patches and improvements against prompt injection exploits.
  • Use strong authentication and monitoring: Protect accounts connected to agentic browsers with multi-factor authentication and review activity logs regularly to spot unusual behavior early.
  • Educate yourself about prompt injection risks: Stay informed on the latest threats and best practices for safe AI interactions. Being aware is the first step to preventing exploitation.
  • Limit sensitive operations automation: Avoid fully automating high-stakes transactions or actions without manual review. Agentic agents should assist, but critical decisions deserve human oversight.
  • Report suspicious behavior: If an agentic agent acts unpredictably or asks for strange permissions, report it to the developers or security teams immediately for investigation.

We don’t just report on data privacy—we help you remove your personal information

Cybersecurity risks should never spread beyond a headline. With Malwarebytes Personal Data Remover, you can scan to find out which sites are exposing your personal information, and then delete that sensitive data from the internet.

❌