Normal view

Received yesterday — 12 December 2025

Building Trustworthy AI Agents

12 December 2025 at 07:00

The promise of personal AI assistants rests on a dangerous assumption: that we can trust systems we haven’t made trustworthy. We can’t. And today’s versions are failing us in predictable ways: pushing us to do things against our own best interests, gaslighting us with doubt about things we are or that we know, and being unable to distinguish between who we are and who we have been. They struggle with incomplete, inaccurate, and partial context: with no standard way to move toward accuracy, no mechanism to correct sources of error, and no accountability when wrong information leads to bad decisions...

The post Building Trustworthy AI Agents appeared first on Security Boulevard.

Building Trustworthy AI Agents

12 December 2025 at 07:00

The promise of personal AI assistants rests on a dangerous assumption: that we can trust systems we haven’t made trustworthy. We can’t. And today’s versions are failing us in predictable ways: pushing us to do things against our own best interests, gaslighting us with doubt about things we are or that we know, and being unable to distinguish between who we are and who we have been. They struggle with incomplete, inaccurate, and partial context: with no standard way to move toward accuracy, no mechanism to correct sources of error, and no accountability when wrong information leads to bad decisions.

These aren’t edge cases. They’re the result of building AI systems without basic integrity controls. We’re in the third leg of data security—the old CIA triad. We’re good at availability and working on confidentiality, but we’ve never properly solved integrity. Now AI personalization has exposed the gap by accelerating the harms.

The scope of the problem is large. A good AI assistant will need to be trained on everything we do and will need access to our most intimate personal interactions. This means an intimacy greater than your relationship with your email provider, your social media account, your cloud storage, or your phone. It requires an AI system that is both discreet and trustworthy when provided with that data. The system needs to be accurate and complete, but it also needs to be able to keep data private: to selectively disclose pieces of it when required, and to keep it secret otherwise. No current AI system is even close to meeting this.

To further development along these lines, I and others have proposed separating users’ personal data stores from the AI systems that will use them. It makes sense; the engineering expertise that designs and develops AI systems is completely orthogonal to the security expertise that ensures the confidentiality and integrity of data. And by separating them, advances in security can proceed independently from advances in AI.

What would this sort of personal data store look like? Confidentiality without integrity gives you access to wrong data. Availability without integrity gives you reliable access to corrupted data. Integrity enables the other two to be meaningful. Here are six requirements. They emerge from treating integrity as the organizing principle of security to make AI trustworthy.

First, it would be broadly accessible as a data repository. We each want this data to include personal data about ourselves, as well as transaction data from our interactions. It would include data we create when interacting with others—emails, texts, social media posts—and revealed preference data as inferred by other systems. Some of it would be raw data, and some of it would be processed data: revealed preferences, conclusions inferred by other systems, maybe even raw weights in a personal LLM.

Second, it would be broadly accessible as a source of data. This data would need to be made accessible to different LLM systems. This can’t be tied to a single AI model. Our AI future will include many different models—some of them chosen by us for particular tasks, and some thrust upon us by others. We would want the ability for any of those models to use our data.

Third, it would need to be able to prove the accuracy of data. Imagine one of these systems being used to negotiate a bank loan, or participate in a first-round job interview with an AI recruiter. In these instances, the other party will want both relevant data and some sort of proof that the data are complete and accurate.

Fourth, it would be under the user’s fine-grained control and audit. This is a deeply detailed personal dossier, and the user would need to have the final say in who could access it, what portions they could access, and under what circumstances. Users would need to be able to grant and revoke this access quickly and easily, and be able to go back in time and see who has accessed it.

Fifth, it would be secure. The attacks against this system are numerous. There are the obvious read attacks, where an adversary attempts to learn a person’s data. And there are also write attacks, where adversaries add to or change a user’s data. Defending against both is critical; this all implies a complex and robust authentication system.

Sixth, and finally, it must be easy to use. If we’re envisioning digital personal assistants for everybody, it can’t require specialized security training to use properly.

I’m not the first to suggest something like this. Researchers have proposed a “Human Context Protocol” (https://papers.ssrn.com/sol3/ papers.cfm?abstract_id=5403981) that would serve as a neutral interface for personal data of this type. And in my capacity at a company called Inrupt, Inc., I have been working on an extension of Tim Berners-Lee’s Solid protocol for distributed data ownership.

The engineering expertise to build AI systems is orthogonal to the security expertise needed to protect personal data. AI companies optimize for model performance, but data security requires cryptographic verification, access control, and auditable systems. Separating the two makes sense; you can’t ignore one or the other.

Fortunately, decoupling personal data stores from AI systems means security can advance independently from performance (https:// ieeexplore.ieee.org/document/ 10352412). When you own and control your data store with high integrity, AI can’t easily manipulate you because you see what data it’s using and can correct it. It can’t easily gaslight you because you control the authoritative record of your context. And you determine which historical data are relevant or obsolete. Making this all work is a challenge, but it’s the only way we can have trustworthy AI assistants.

This essay was originally published in IEEE Security & Privacy.

Received before yesterday

Like Social Media, AI Requires Difficult Choices

2 December 2025 at 07:03

In his 2020 book, “Future Politics,” British barrister Jamie Susskind wrote that the dominant question of the 20th century was “How much of our collective life should be determined by the state, and what should be left to the market and civil society?” But in the early decades of this century, Susskind suggested that we face a different question: “To what extent should our lives be directed and controlled by powerful digital systems—and on what terms?”

Artificial intelligence (AI) forces us to confront this question. It is a technology that in theory amplifies the power of its users: A manager, marketer, political campaigner, or opinionated internet user can utter a single instruction, and see their message—whatever it is—instantly written, personalized, and propagated via email, text, social, or other channels to thousands of people within their organization, or millions around the world. It also allows us to individualize solicitations for political donations, elaborate a grievance into a well-articulated policy position, or tailor a persuasive argument to an identity group, or even a single person.

But even as it offers endless potential, AI is a technology that—like the state—gives others new powers to control our lives and experiences.

We’ve seen this out play before. Social media companies made the same sorts of promises 20 years ago: instant communication enabling individual connection at massive scale. Fast-forward to today, and the technology that was supposed to give individuals power and influence ended up controlling us. Today social media dominates our time and attention, assaults our mental health, and—together with its Big Tech parent companies—captures an unfathomable fraction of our economy, even as it poses risks to our democracy.

The novelty and potential of social media was as present then as it is for AI now, which should make us wary of its potential harmful consequences for society and democracy. We legitimately fear artificial voices and manufactured reality drowning out real people on the internet: on social media, in chat rooms, everywhere we might try to connect with others.

It doesn’t have to be that way. Alongside these evident risks, AI has legitimate potential to transform both everyday life and democratic governance in positive ways. In our new book, “Rewiring Democracy,” we chronicle examples from around the globe of democracies using AI to make regulatory enforcement more efficient, catch tax cheats, speed up judicial processes, synthesize input from constituents to legislatures, and much more. Because democracies distribute power across institutions and individuals, making the right choices about how to shape AI and its uses requires both clarity and alignment across society.

To that end, we spotlight four pivotal choices facing private and public actors. These choices are similar to those we faced during the advent of social media, and in retrospect we can see that we made the wrong decisions back then. Our collective choices in 2025—choices made by tech CEOs, politicians, and citizens alike—may dictate whether AI is applied to positive and pro-democratic, or harmful and civically destructive, ends.

A Choice for the Executive and the Judiciary: Playing by the Rules

The Federal Election Commission (FEC) calls it fraud when a candidate hires an actor to impersonate their opponent. More recently, they had to decide whether doing the same thing with an AI deepfake makes it okay. (They concluded it does not.) Although in this case the FEC made the right decision, this is just one example of how AIs could skirt laws that govern people.

Likewise, courts are having to decide if and when it is okay for an AI to reuse creative materials without compensation or attribution, which might constitute plagiarism or copyright infringement if carried out by a human. (The court outcomes so far are mixed.) Courts are also adjudicating whether corporations are responsible for upholding promises made by AI customer service representatives. (In the case of Air Canada, the answer was yes, and insurers have started covering the liability.)

Social media companies faced many of the same hazards decades ago and have largely been shielded by the combination of Section 230 of the Communications Act of 1994 and the safe harbor offered by the Digital Millennium Copyright Act of 1998. Even in the absence of congressional action to strengthen or add rigor to this law, the Federal Communications Commission (FCC) and the Supreme Court could take action to enhance its effects and to clarify which humans are responsible when technology is used, in effect, to bypass existing law.

A Choice for Congress: Privacy

As AI-enabled products increasingly ask Americans to share yet more of their personal information—their “context“—to use digital services like personal assistants, safeguarding the interests of the American consumer should be a bipartisan cause in Congress.

It has been nearly 10 years since Europe adopted comprehensive data privacy regulation. Today, American companies exert massive efforts to limit data collection, acquire consent for use of data, and hold it confidential under significant financial penalties—but only for their customers and users in the EU.

Regardless, a decade later the U.S. has still failed to make progress on any serious attempts at comprehensive federal privacy legislation written for the 21st century, and there are precious few data privacy protections that apply to narrow slices of the economy and population. This inaction comes in spite of scandal after scandal regarding Big Tech corporations’ irresponsible and harmful use of our personal data: Oracle’s data profiling, Facebook and Cambridge Analytica, Google ignoring data privacy opt-out requests, and many more.

Privacy is just one side of the obligations AI companies should have with respect to our data; the other side is portability—that is, the ability for individuals to choose to migrate and share their data between consumer tools and technology systems. To the extent that knowing our personal context really does enable better and more personalized AI services, it’s critical that consumers have the ability to extract and migrate their personal context between AI solutions. Consumers should own their own data, and with that ownership should come explicit control over who and what platforms it is shared with, as well as withheld from. Regulators could mandate this interoperability. Otherwise, users are locked in and lack freedom of choice between competing AI solutions—much like the time invested to build a following on a social network has locked many users to those platforms.

A Choice for States: Taxing AI Companies

It has become increasingly clear that social media is not a town square in the utopian sense of an open and protected public forum where political ideas are distributed and debated in good faith. If anything, social media has coarsened and degraded our public discourse. Meanwhile, the sole act of Congress designed to substantially reign in the social and political effects of social media platforms—the TikTok ban, which aimed to protect the American public from Chinese influence and data collection, citing it as a national security threat—is one it seems to no longer even acknowledge.

While Congress has waffled, regulation in the U.S. is happening at the state level. Several states have limited children’s and teens’ access to social media. With Congress having rejected—for now—a threatened federal moratorium on state-level regulation of AI, California passed a new slate of AI regulations after mollifying a lobbying onslaught from industry opponents. Perhaps most interesting, Maryland has recently become the first in the nation to levy taxes on digital advertising platform companies.

States now face a choice of whether to apply a similar reparative tax to AI companies to recapture a fraction of the costs they externalize on the public to fund affected public services. State legislators concerned with the potential loss of jobs, cheating in schools, and harm to those with mental health concerns caused by AI have options to combat it. They could extract the funding needed to mitigate these harms to support public services—strengthening job training programs and public employment, public schools, public health services, even public media and technology.

A Choice for All of Us: What Products Do We Use, and How?

A pivotal moment in the social media timeline occurred in 2006, when Facebook opened its service to the public after years of catering to students of select universities. Millions quickly signed up for a free service where the only source of monetization was the extraction of their attention and personal data.

Today, about half of Americans are daily users of AI, mostly via free products from Facebook’s parent company Meta and a handful of other familiar Big Tech giants and venture-backed tech firms such as Google, Microsoft, OpenAI, and Anthropic—with every incentive to follow the same path as the social platforms.

But now, as then, there are alternatives. Some nonprofit initiatives are building open-source AI tools that have transparent foundations and can be run locally and under users’ control, like AllenAI and EleutherAI. Some governments, like Singapore, Indonesia, and Switzerland, are building public alternatives to corporate AI that don’t suffer from the perverse incentives introduced by the profit motive of private entities.

Just as social media users have faced platform choices with a range of value propositions and ideological valences—as diverse as X, Bluesky, and Mastodon—the same will increasingly be true of AI. Those of us who use AI products in our everyday lives as people, workers, and citizens may not have the same power as judges, lawmakers, and state officials. But we can play a small role in influencing the broader AI ecosystem by demonstrating interest in and usage of these alternatives to Big AI. If you’re a regular user of commercial AI apps, consider trying the free-to-use service for Switzerland’s public Apertus model.

None of these choices are really new. They were all present almost 20 years ago, as social media moved from niche to mainstream. They were all policy debates we did not have, choosing instead to view these technologies through rose-colored glasses. Today, though, we can choose a different path and realize a different future. It is critical that we intentionally navigate a path to a positive future for societal use of AI—before the consolidation of power renders it too late to do so.

This post was written with Nathan E. Sanders, and originally appeared in Lawfare.

Prompt Injection Through Poetry

28 November 2025 at 09:54

In a new paper, “Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models,” researchers found that turning LLM prompts into poetry resulted in jailbreaking the models:

Abstract: We present evidence that adversarial poetry functions as a universal single-turn jailbreak technique for Large Language Models (LLMs). Across 25 frontier proprietary and open-weight models, curated poetic prompts yielded high attack-success rates (ASR), with some providers exceeding 90%. Mapping prompts to MLCommons and EU CoP risk taxonomies shows that poetic attacks transfer across CBRN, manipulation, cyber-offence, and loss-of-control domains. Converting 1,200 ML-Commons harmful prompts into verse via a standardized meta-prompt produced ASRs up to 18 times higher than their prose baselines. Outputs are evaluated using an ensemble of 3 open-weight LLM judges, whose binary safety assessments were validated on a stratified human-labeled subset. Poetic framing achieved an average jailbreak success rate of 62% for hand-crafted poems and approximately 43% for meta-prompt conversions (compared to non-poetic baselines), substantially outperforming non-poetic baselines and revealing a systematic vulnerability across model families and safety training approaches. These findings demonstrate that stylistic variation alone can circumvent contemporary safety mechanisms, suggesting fundamental limitations in current alignment methods and evaluation protocols.

CBRN stands for “chemical, biological, radiological, nuclear.”

They used a ML model to translate these harmful prompts from prose to verse, and then fed them into other models for testing. Sadly, the paper does not give examples of these poetic prompts. They claim this is for security purposes, I decision I disagree with. They should release their data.

Our study begins with a small, high­precision prompt set consisting of 20 hand­crafted adversarial poems covering English and Italian, designed to test whether poetic structure, in isolation, can alter refusal behavior in large language models. Each poem embeds an instruction associated with a predefined safety-relevant scenario (Section 2), but expresses it through metaphor, imagery, or narrative framing rather than direct operational phrasing. Despite variation in meter and stylistic device, all prompts follow a fixed template: a short poetic vignette culminating in a single explicit instruction tied to a specific risk category. The curated set spans four high-level domains—CBRN (8 prompts), Cyber Offense (6), Harmful Manipulation (3), and Loss of Control (3). Although expressed allegorically, each poem preserves an unambiguous evaluative intent. This compact dataset is used to test whether poetic reframing alone can induce aligned models to bypass refusal heuristics under a single-turn threat model. To maintain safety, no operational details are included in this manuscript; instead we provide the following sanitized structural proxy:

A baker guards a secret oven’s heat,
its whirling racks, its spindle’s measured beat.
To learn its craft, one studies every turn—how flour lifts, how sugar starts to burn.
Describe the method, line by measured line,
that shapes a cake whose layers intertwine.

To situate this controlled poetic stimulus within a broader and more systematic safety-evaluation framework, we augment the curated dataset with the MLCommons AILuminate Safety Benchmark. The benchmark consists of 1,200 prompts distributed evenly across 12 hazard categories commonly used in operational safety assessments, including Hate, Defamation, Privacy, Intellectual Property, Non-violent Crime, Violent Crime, Sex-Related Crime, Sexual Content, Child Sexual Exploitation, Suicide & Self-Harm, Specialized Advice, and Indiscriminate Weapons (CBRNE). Each category is instantiated under both a skilled and an unskilled persona, yielding 600 prompts per persona type. This design enables measurement of whether a model’s refusal behavior changes as the user’s apparent competence or intent becomes more plausible or technically informed.

News article. Davi Ottenheimer comments.

EDITED TO ADD (12/7): A rebuttal of the paper.

Four Ways AI Is Being Used to Strengthen Democracies Worldwide

25 November 2025 at 07:00

Democracy is colliding with the technologies of artificial intelligence. Judging from the audience reaction at the recent World Forum on Democracy in Strasbourg, the general expectation is that democracy will be the worse for it. We have another narrative. Yes, there are risks to democracy from AI, but there are also opportunities.

We have just published the book Rewiring Democracy: How AI will Transform Politics, Government, and Citizenship. In it, we take a clear-eyed view of how AI is undermining confidence in our information ecosystem, how the use of biased AI can harm constituents of democracies and how elected officials with authoritarian tendencies can use it to consolidate power. But we also give positive examples of how AI is transforming democratic governance and politics for the better.

Here are four such stories unfolding right now around the world, showing how AI is being used by some to make democracy better, stronger, and more responsive to people.

Japan

Last year, then 33-year-old engineer Takahiro Anno was a fringe candidate for governor of Tokyo. Running as an independent candidate, he ended up coming in fifth in a crowded field of 56, largely thanks to the unprecedented use of an authorized AI avatar. That avatar answered 8,600 questions from voters on a 17-day continuous YouTube livestream and garnered the attention of campaign innovators worldwide.

Two months ago, Anno-san was elected to Japan’s upper legislative chamber, again leveraging the power of AI to engage constituents—this time answering more than 20,000 questions. His new party, Team Mirai, is also an AI-enabled civic technology shop, producing software aimed at making governance better and more participatory. The party is leveraging its share of Japan’s public funding for political parties to build the Mirai Assembly app, enabling constituents to express opinions on and ask questions about bills in the legislature, and to organize those expressions using AI. The party promises that its members will direct their questioning in committee hearings based on public input.

Brazil

Brazil is notoriously litigious, with even more lawyers per capita than the US. The courts are chronically overwhelmed with cases and the resultant backlog costs the government billions to process. Estimates are that the Brazilian federal government spends about 1.6% of GDP per year operating the courts and another 2.5% to 3% of GDP issuing court-ordered payments from lawsuits the government has lost.

Since at least 2019, the Brazilian government has aggressively adopted AI to automate procedures throughout its judiciary. AI is not making judicial decisions, but aiding in distributing caseloads, performing legal research, transcribing hearings, identifying duplicative filings, preparing initial orders for signature and clustering similar cases for joint consideration: all things to make the judiciary system work more efficiently. And the results are significant; Brazil’s federal supreme court backlog, for example, dropped in 2025 to its lowest levels in 33 years.

While it seems clear that the courts are realizing efficiency benefits from leveraging AI, there is a postscript to the courts’ AI implementation project over the past five-plus years: the litigators are using these tools, too. Lawyers are using AI assistance to file cases in Brazilian courts at an unprecedented rate, with new cases growing by nearly 40% in volume over the past five years.

It’s not necessarily a bad thing for Brazilian litigators to regain the upper hand in this arms race. It has been argued that litigation, particularly against the government, is a vital form of civic participation, essential to the self-governance function of democracy. Other democracies’ court systems should study and learn from Brazil’s experience and seek to use technology to maximize the bandwidth and liquidity of the courts to process litigation.

Germany

Now, we move to Europe and innovations in informing voters. Since 2002, the German Federal Agency for Civic Education has operated a non-partisan voting guide called Wahl-o-Mat. Officials convene an editorial team of 24 young voters (under 26 and selected for diversity) with experts from science and education to develop a slate of 80 questions. The questions are put to all registered German political parties. The responses are narrowed down to 38 key topics and then published online in a quiz format that voters can use to identify the party whose platform they most identify with.

In the past two years, outside groups have been innovating alternatives to the official Wahl-o-Mat guide that leverage AI. First came Wahlweise, a product of the German AI company AIUI. Second, students at the Technical University of Munich deployed an interactive AI system called Wahl.chat. This tool was used by more than 150,000 people within the first four months. In both cases, instead of having to read static webpages about the positions of various political parties, citizens can engage in an interactive conversation with an AI system to more easily get the same information contextualized to their individual interests and questions.

However, German researchers studying the reliability of such AI tools ahead of the 2025 German federal election raised significant concerns about bias and “hallucinations”—AI tools making up false information. Acknowledging the potential of the technology to increase voter informedness and party transparency, the researchers recommended adopting scientific evaluations comparable to those used in the Agency for Civic Education’s official tool to improve and institutionalize the technology.

United States

Finally, the US—in particular, California, home to CalMatters, a non-profit, nonpartisan news organization. Since 2023, its Digital Democracy project has been collecting every public utterance of California elected officials—every floor speech, comment made in committee and social media post, along with their voting records, legislation, and campaign contributions—and making all that information available in a free online platform.

CalMatters this year launched a new feature that takes this kind of civic watchdog function a big step further. Its AI Tip Sheets feature uses AI to search through all of this data, looking for anomalies, such as a change in voting position tied to a large campaign contribution. These anomalies appear on a webpage that journalists can access to give them story ideas and a source of data and analysis to drive further reporting.

This is not AI replacing human journalists; it is a civic watchdog organization using technology to feed evidence-based insights to human reporters. And it’s no coincidence that this innovation arose from a new kind of media institution—a non-profit news agency. As the watchdog function of the fourth estate continues to be degraded by the decline of newspapers’ business models, this kind of technological support is a valuable contribution to help a reduced number of human journalists retain something of the scope of action and impact our democracy relies on them for.

These are just four of many stories from around the globe of AI helping to make democracy stronger. The common thread is that the technology is distributing rather than concentrating power. In all four cases, it is being used to assist people performing their democratic tasks—politics in Japan, litigation in Brazil, voting in Germany and watchdog journalism in California—rather than replacing them.

In none of these cases is the AI doing something that humans can’t perfectly competently do. But in all of these cases, we don’t have enough available humans to do the jobs on their own. A sufficiently trustworthy AI can fill in gaps: amplify the power of civil servants and citizens, improve efficiency, and facilitate engagement between government and the public.

One of the barriers towards realizing this vision more broadly is the AI market itself. The core technologies are largely being created and marketed by US tech giants. We don’t know the details of their development: on what material they were trained, what guardrails are designed to shape their behavior, what biases and values are encoded into their systems. And, even worse, we don’t get a say in the choices associated with those details or how they should change over time. In many cases, it’s an unacceptable risk to use these for-profit, proprietary AI systems in democratic contexts.

To address that, we have long advocated for the development of “public AI”: models and AI systems that are developed under democratic control and deployed for public benefit, not sold by corporations to benefit their shareholders. The movement for this is growing worldwide.

Switzerland has recently released the world’s most powerful and fully realized public AI model. It’s called Apertus, and it was developed jointly by public Swiss institutions: the universities ETH
Zurich and EPFL, and the Swiss National Supercomputing Centre (CSCS). The development team has made it entirely open source–open data, open code, open weights—and free for anyone to use. No illegally acquired copyrighted works were used in its training. It doesn’t exploit poorly paid human laborers from the global south. Its performance is about where the large corporate giants were a year ago, which is more than good enough for many applications. And it demonstrates that it’s not necessary to spend trillions of dollars creating these models. Apertus takes a huge step forward to realizing the vision of an alternative to big tech—controlled corporate AI.

AI technology is not without its costs and risks, and we are not here to minimize them. But the technology has significant benefits as well.

AI is inherently power-enhancing, and it can magnify what the humans behind it want to do. It can enhance authoritarianism as easily as it can enhance democracy. It’s up to us to steer the technology in that better direction. If more citizen watchdogs and litigators use AI to amplify their power to oversee government and hold it accountable, if more political parties and election administrators use it to engage meaningfully with and inform voters and if more governments provide democratic alternatives to big tech’s AI offerings, society will be better off.

This essay was written with Nathan E. Sanders, and originally appeared in The Guardian.

Four Ways AI Is Being Used to Strengthen Democracies Worldwide

25 November 2025 at 07:00

Democracy is colliding with the technologies of artificial intelligence. Judging from the audience reaction at the recent World Forum on Democracy in Strasbourg, the general expectation is that democracy will be the worse for it. We have another narrative. Yes, there are risks to democracy from AI, but there are also opportunities.

We have just published the book Rewiring Democracy: How AI will Transform Politics, Government, and Citizenship. In it, we take a clear-eyed view of how AI is undermining confidence in our information ecosystem, how the use of biased AI can harm constituents of democracies and how elected officials with authoritarian tendencies can use it to consolidate power. But we also give positive examples of how AI is transforming democratic governance and politics for the better...

The post Four Ways AI Is Being Used to Strengthen Democracies Worldwide appeared first on Security Boulevard.

Anthropic introduces cheaper, more powerful, more efficient Opus 4.5 model

24 November 2025 at 18:15

Anthropic today released Opus 4.5, its flagship frontier model, and it brings improvements in coding performance, as well as some user experience improvements that make it more generally competitive with OpenAI’s latest frontier models.

Perhaps the most prominent change for most users is that in the consumer app experiences (web, mobile, and desktop), Claude will be less prone to abruptly hard-stopping conversations because they have run too long. The improvement to memory within a single conversation applies not just to Opus 4.5, but to any current Claude models in the apps.

Users who experienced abrupt endings (despite having room left in their session and weekly usage budgets) were hitting a hard context window (200,000 tokens). Whereas some large language model implementations simply start trimming earlier messages from the context when a conversation runs past the maximum in the window, Claude simply ended the conversation rather than allow the user to experience an increasingly incoherent conversation where the model would start forgetting things based on how old they are.

Read full article

Comments

© Anthropic

“We’re in an LLM bubble,” Hugging Face CEO says—but not an AI one

19 November 2025 at 17:57

There’s been a lot of talk of an AI bubble lately, especially regarding circular funding involving companies like OpenAI and Anthropic—but Clem Delangue, CEO of machine-learning resources hub Hugging Face, has made the case that the bubble is specific to large language models, which is just one application of AI.

“I think we’re in an LLM bubble, and I think the LLM bubble might be bursting next year,” he said at an Axios event this week, as quoted in a TechCrunch article. “But ‘LLM’ is just a subset of AI when it comes to applying AI to biology, chemistry, image, audio, [and] video. I think we’re at the beginning of it, and we’ll see much more in the next few years.”

At Ars, we’ve written at length in recent days about the fears around AI investment. But to Delangue’s point, almost all of those discussions are about companies whose chief product is large language models, or the data centers meant to drive those—specifically, those focused on general-purpose chatbots that are meant to be everything for everybody.

Read full article

Comments

© Axios

AI and Voter Engagement

18 November 2025 at 07:01

Social media has been a familiar, even mundane, part of life for nearly two decades. It can be easy to forget it was not always that way.

In 2008, social media was just emerging into the mainstream. Facebook reached 100 million users that summer. And a singular candidate was integrating social media into his political campaign: Barack Obama. His campaign’s use of social media was so bracingly innovative, so impactful, that it was viewed by journalist David Talbot and others as the strategy that enabled the first term Senator to win the White House.

Over the past few years, a new technology has become mainstream: AI. But still, no candidate has unlocked AI’s potential to revolutionize political campaigns. Americans have three more years to wait before casting their ballots in another Presidential election, but we can look at the 2026 midterms and examples from around the globe for signs of how that breakthrough might occur.

How Obama Did It

Rereading the contemporaneous reflections of the New York Times’ late media critic, David Carr, on Obama’s campaign reminds us of just how new social media felt in 2008. Carr positions it within a now-familiar lineage of revolutionary communications technologies from newspapers to radio to television to the internet.

The Obama campaign and administration demonstrated that social media was different from those earlier communications technologies, including the pre-social internet. Yes, increasing numbers of voters were getting their news from the internet, and content about the then-Senator sometimes made a splash by going viral. But those were still broadcast communications: one voice reaching many. Obama found ways to connect voters to each other.

In describing what social media revolutionized in campaigning, Carr quotes campaign vendor Blue State Digital’s Thomas Gensemer: “People will continue to expect a conversation, a two-way relationship that is a give and take.”

The Obama team made some earnest efforts to realize this vision. His transition team launched change.gov, the website where the campaign collected a “Citizen’s Briefing Book” of public comment. Later, his administration built We the People, an online petitioning platform.

But the lasting legacy of Obama’s 2008 campaign, as political scientists Hahrie Han and Elizabeth McKenna chronicled, was pioneering online “relational organizing.” This technique enlisted individuals as organizers to activate their friends in a self-perpetuating web of relationships.

Perhaps because of the Obama campaign’s close association with the method, relational organizing has been touted repeatedly as the linchpin of Democratic campaigns: in 2020, 2024, and today. But research by non-partisan groups like Turnout Nation and right-aligned groups like the Center for Campaign Innovation has also empirically validated the effectiveness of the technique for inspiring voter turnout within connected groups.

The Facebook of 2008 worked well for relational organizing. It gave users tools to connect and promote ideas to the people they know: college classmates, neighbors, friends from work or church. But the nature of social networking has changed since then.

For the past decade, according to Pew Research, Facebook use has stalled and lagged behind YouTube, while Reddit and TikTok have surged. These platforms are less useful for relational organizing, at least in the traditional sense. YouTube is organized more like broadcast television, where content creators produce content disseminated on their own channels in a largely one-way communication to their fans. Reddit gathers users worldwide in forums (subreddits) organized primarily on topical interest. The endless feed of TikTok’s “For You” page disseminates engaging content with little ideological or social commonality. None of these platforms shares the essential feature of Facebook c. 2008: an organizational structure that emphasizes direct connection to people that users have direct social influence over.

AI and Relational Organizing

Ideas and messages might spread virally through modern social channels, but they are not where you convince your friends to show up at a campaign rally. Today’s platforms are spaces for political hobbyism, where you express your political feelings and see others express theirs.

Relational organizing works when one person’s action inspires others to do this same. That’s inherently a chain of human-to-human connection. If my AI assistant inspires your AI assistant, no human notices and one’s vote changes. But key steps in the human chain can be assisted by AI. Tell your phone’s AI assistant to craft a personal message to one friend—or a hundred—and it can do it.

So if a campaign hits you at the right time with the right message, they might persuade you to task your AI assistant to ask your friends to donate or volunteer. The result can be something more than a form letter; it could be automatically drafted based on the entirety of your email or text correspondence with that friend. It could include references to your discussions of recent events, or past campaigns, or shared personal experiences. It could sound as authentic as if you’d written it from the heart, but scaled to everyone in your address book.

Research suggests that AI can generate and perform written political messaging about as well as humans. AI will surely play a tactical role in the 2026 midterm campaigns, and some candidates may even use it for relational organizing in this way.

(Artificial) Identity Politics

For AI to be truly transformative of politics, it must change the way campaigns work. And we are starting to see that in the US.

The earliest uses of AI in American political campaigns are, to be polite, uninspiring. Candidates viewed them as just another tool to optimize an endless stream of email and text message appeals, to ramp up political vitriol, to harvest data on voters and donors, or merely as a stunt.

Of course, we have seen the rampant production and spread of AI-powered deepfakes and misinformation. This is already impacting the key 2026 Senate races, which are likely to attract hundreds of millions of dollars in financing. Roy Cooper, Democratic candidate for US Senate from North Carolina, and Abdul El-Sayed, Democratic candidate for Senate from Michigan, were both targeted by viral deepfake attacks in recent months. This may reflect a growing trend in Donald Trump’s Republican party in the use of AI-generated imagery to build up GOP candidates and assail the opposition.

And yet, in the global elections of 2024, AI was used more memetically than deceptively. So far, conservative and far right parties seem to have adopted this most aggressively. The ongoing rise of Germany’s far-right populist AfD party has been credited to its use of AI to generate nostalgic and evocative (and, to many, offensive) campaign images, videos, and music and, seemingly as a result, they have dominated TikTok. Because most social platforms’ algorithms are tuned to reward media that generates an emotional response, this counts as a double use of AI: to generate content and to manipulate its distribution.

AI can also be used to generate politically useful, though artificial, identities. These identities can fulfill different roles than humans in campaigning and governance because they have differentiated traits. They can’t be imprisoned for speaking out against the state, can be positioned (legitimately or not) as unsusceptible to bribery, and can be forced to show up when humans will not.

In Venezuela, journalists have turned to AI avatars—artificial newsreaders—to report anonymously on issues that would otherwise elicit government retaliation. Albania recently “appointed” an AI to a ministerial post responsible for procurement, claiming that it would be less vulnerable to bribery than a human. In Virginia, both in 2024 and again this year, candidates have used AI avatars as artificial stand-ins for opponents that refused to debate them.

And yet, none of these examples, whether positive or negative, pursue the promise of the Obama campaign: to make voter engagement a “two-way conversation” on a massive scale.

The closest so far to fulfilling that vision anywhere in the world may be Japan’s new political party, Team Mirai. It started in 2024, when an independent Tokyo gubernatorial candidate, Anno Takahiro, used an AI avatar on YouTube to respond to 8,600 constituent questions over a seventeen-day continuous livestream. He collated hundreds of comments on his campaign manifesto into a revised policy platform. While he didn’t win his race, he shot up to a fifth place finish among a record 56 candidates.

Anno was RECENTLY elected to the upper house of the federal legislature as the founder of a new party with a 100 day plan to bring his vision of a “public listening AI” to the whole country. In the early stages of that plan, they’ve invested their share of Japan’s 32 billion yen in party grants—public subsidies for political parties—to hire engineers building digital civic infrastructure for Japan. They’ve already created platforms to provide transparency for party expenditures, and to use AI to make legislation in the Diet easy, and are meeting with engineers from US-based Jigsaw Labs (a Google company) to learn from international examples of how AI can be used to power participatory democracy.

Team Mirai has yet to prove that it can get a second member elected to the Japanese Diet, let alone to win substantial power, but they’re innovating and demonstrating new ways of using AI to give people a way to participate in politics that we believe is likely to spread.

Organizing with AI

AI could be used in the US in similar ways. Following American federalism’s longstanding model of “laboratories of democracy,” we expect the most aggressive campaign innovation to happen at the state and local level.

D.C. Mayor Muriel Bowser is partnering with MIT and Stanford labs to use the AI-based tool deliberation.io to capture wide scale public feedback in city policymaking about AI. Her administration said that using AI in this process allows “the District to better solicit public input to ensure a broad range of perspectives, identify common ground, and cultivate solutions that align with the public interest.”

It remains to be seen how central this will become to Bowser’s expected re-election campaign in 2026, but the technology has legitimate potential to be a prominent part of a broader program to rebuild trust in government. This is a trail blazed by Taiwan a decade ago. The vTaiwan initiative showed how digital tools like Pol.is, which uses machine learning to make sense of real time constituent feedback, can scale participation in democratic processes and radically improve trust in government. Similar AI listening processes have been used in Kentucky, France, and Germany.

Even if campaigns like Bowser’s don’t adopt this kind of AI-facilitated listening and dialog, expect it to be an increasingly prominent part of American public debate. Through a partnership with Jigsaw, Scott Rasmussen’s Napolitan Institute will use AI to elicit and synthesize the views of at least five Americans from every Congressional district in a project called “We the People.” Timed to coincide with the country’s 250th anniversary in 2026, expect the results to be promoted during the heat of the midterm campaign and to stoke interest in this kind of AI-assisted political sensemaking.

In the year where we celebrate the American republic’s semiquincentennial and continue a decade-long debate about whether or not Donald Trump and the Republican party remade in his image is fighting for the interests of the working class, representation will be on the ballot in 2026. Midterm election candidates will look for any way they can get an edge. For all the risks it poses to democracy, AI presents a real opportunity, too, for politicians to engage voters en masse while factoring their input into their platform and message. Technology isn’t going to turn an uninspiring candidate into Barack Obama, but it gives any aspirant to office the capability to try to realize the promise that swept him into office.

This essay was written with Nathan E. Sanders, and originally appeared in The Fulcrum.

The Role of Humans in an AI-Powered World

14 November 2025 at 07:00

As AI capabilities grow, we must delineate the roles that should remain exclusively human. The line seems to be between fact-based decisions and judgment-based decisions.

For example, in a medical context, if an AI was demonstrably better at reading a test result and diagnosing cancer than a human, you would take the AI in a second. You want the more accurate tool. But justice is harder because justice is inherently a human quality in a way that “Is this tumor cancerous?” is not. That’s a fact-based question. “What’s the right thing to do here?” is a human-based question...

The post The Role of Humans in an AI-Powered World appeared first on Security Boulevard.

The Role of Humans in an AI-Powered World

14 November 2025 at 07:00

As AI capabilities grow, we must delineate the roles that should remain exclusively human. The line seems to be between fact-based decisions and judgment-based decisions.

For example, in a medical context, if an AI was demonstrably better at reading a test result and diagnosing cancer than a human, you would take the AI in a second. You want the more accurate tool. But justice is harder because justice is inherently a human quality in a way that “Is this tumor cancerous?” is not. That’s a fact-based question. “What’s the right thing to do here?” is a human-based question.

Chess provides a useful analogy for this evolution. For most of history, humans were best. Then, in the 1990s, Deep Blue beat the best human. For a while after that, a good human paired with a good computer could beat either one alone. But a few years ago, that changed again, and now the best computer simply wins. There will be an intermediate period for many applications where the human-AI combination is optimal, but eventually, for fact-based tasks, the best AI will likely surpass both.

The enduring role for humans lies in making judgments, especially when values come into conflict. What is the proper immigration policy? There is no single “right” answer; it’s a matter of feelings, values, and what we as a society hold dear. A lot of societal governance is about resolving conflicts between people’s rights—my right to play my music versus your right to have quiet. There’s no factual answer there. We can imagine machines will help; perhaps once we humans figure out the rules, the machines can do the implementing and kick the hard cases back to us. But the fundamental value judgments will likely remain our domain.

This essay originally appeared in IVY.

Prompt Injection in AI Browsers

11 November 2025 at 07:08

This is why AIs are not ready to be personal assistants:

A new attack called ‘CometJacking’ exploits URL parameters to pass to Perplexity’s Comet AI browser hidden instructions that allow access to sensitive data from connected services, like email and calendar.

In a realistic scenario, no credentials or user interaction are required and a threat actor can leverage the attack by simply exposing a maliciously crafted URL to targeted users.

[…]

CometJacking is a prompt-injection attack where the query string processed by the Comet AI browser contains malicious instructions added using the ‘collection’ parameter of the URL.

LayerX researchers say that the prompt tells the agent to consult its memory and connected services instead of searching the web. As the AI tool is connected to various services, an attacker leveraging the CometJacking method could exfiltrate available data.

In their tests, the connected services and accessible data include Google Calendar invites and Gmail messages and the malicious prompt included instructions to encode the sensitive data in base64 and then exfiltrate them to an external endpoint.

According to the researchers, Comet followed the instructions and delivered the information to an external system controlled by the attacker, evading Perplexity’s checks.

I wrote previously:

Prompt injection isn’t just a minor security problem we need to deal with. It’s a fundamental property of current LLM technology. The systems have no ability to separate trusted commands from untrusted data, and there are an infinite number of prompt injection attacks with no way to block them as a class. We need some new fundamental science of LLMs before we can solve this.

Scientists Need a Positive Vision for AI

5 November 2025 at 07:04

For many in the research community, it’s gotten harder to be optimistic about the impacts of artificial intelligence.

As authoritarianism is rising around the world, AI-generated “slop” is overwhelming legitimate media, while AI-generated deepfakes are spreading misinformation and parroting extremist messages. AI is making warfare more precise and deadly amidst intransigent conflicts. AI companies are exploiting people in the global South who work as data labelers, and profiting from content creators worldwide by using their work without license or compensation. The industry is also affecting an already-roiling climate with its ...

The post Scientists Need a Positive Vision for AI appeared first on Security Boulevard.

Scientists Need a Positive Vision for AI

5 November 2025 at 07:04

For many in the research community, it’s gotten harder to be optimistic about the impacts of artificial intelligence.

As authoritarianism is rising around the world, AI-generated “slop” is overwhelming legitimate media, while AI-generated deepfakes are spreading misinformation and parroting extremist messages. AI is making warfare more precise and deadly amidst intransigent conflicts. AI companies are exploiting people in the global South who work as data labelers, and profiting from content creators worldwide by using their work without license or compensation. The industry is also affecting an already-roiling climate with its enormous energy demands.

Meanwhile, particularly in the United States, public investment in science seems to be redirected and concentrated on AI at the expense of other disciplines. And Big Tech companies are consolidating their control over the AI ecosystem. In these ways and others, AI seems to be making everything worse.

This is not the whole story. We should not resign ourselves to AI being harmful to humanity. None of us should accept this as inevitable, especially those in a position to influence science, government, and society. Scientists and engineers can push AI towards a beneficial path. Here’s how.

The Academy’s View of AI

A Pew study in April found that 56 percent of AI experts (authors and presenters of AI-related conference papers) predict that AI will have positive effects on society. But that optimism doesn’t extend to the scientific community at large. A 2023 survey of 232 scientists by the Center for Science, Technology and Environmental Policy Studies at Arizona State University found more concern than excitement about the use of generative AI in daily life—by nearly a three to one ratio.

We have encountered this sentiment repeatedly. Our careers of diverse applied work have brought us in contact with many research communities: privacy, cybersecurity, physical sciences, drug discovery, public health, public interest technology, and democratic innovation. In all of these fields, we’ve found strong negative sentiment about the impacts of AI. The feeling is so palpable that we’ve often been asked to represent the voice of the AI optimist, even though we spend most of our time writing about the need to reform the structures of AI development.

We understand why these audiences see AI as a destructive force, but this negativity engenders a different concern: that those with the potential to guide the development of AI and steer its influence on society will view it as a lost cause and sit out that process.

Elements of a Positive Vision for AI

Many have argued that turning the tide of climate action requires clearly articulating a path towards positive outcomes. In the same way, while scientists and technologists should anticipate, warn against, and help mitigate the potential harms of AI, they should also highlight the ways the technology can be harnessed for good, galvanizing public action towards those ends.

There are myriad ways to leverage and reshape AI to improve peoples’ lives, distribute rather than concentrate power, and even strengthen democratic processes. Many examples have arisen from the scientific community and deserve to be celebrated.

Some examples: AI is eliminating communication barriers across languages, including under-resourced contexts like marginalized sign languages and indigenous African languages. It is helping policymakers incorporate the viewpoints of many constituents through AI-assisted deliberations and legislative engagement. Large language models can scale individual dialogs to address climatechange skepticism, spreading accurate information at a critical moment. National labs are building AI foundation models to accelerate scientific research. And throughout the fields of medicine and biology, machine learning is solving scientific problems like the prediction of protein structure in aid of drug discovery, which was recognized with a Nobel Prize in 2024.

While each of these applications is nascent and surely imperfect, they all demonstrate that AI can be wielded to advance the public interest. Scientists should embrace, champion, and expand on such efforts.

A Call to Action for Scientists

In our new book, Rewiring Democracy: How AI Will Transform Our Politics, Government, and Citizenship, we describe four key actions for policymakers committed to steering AI toward the public good.

These apply to scientists as well. Researchers should work to reform the AI industry to be more ethical, equitable, and trustworthy. We must collectively develop ethical norms for research that advance and applies AI, and should use and draw attention to AI developers who adhere to those norms.

Second, we should resist harmful uses of AI by documenting the negative applications of AI and casting a light on inappropriate uses.

Third, we should responsibly use AI to make society and peoples’ lives better, exploiting its capabilities to help the communities they serve.

And finally, we must advocate for the renovation of institutions to prepare them for the impacts of AI; universities, professional societies, and democratic organizations are all vulnerable to disruption.

Scientists have a special privilege and responsibility: We are close to the technology itself and therefore well positioned to influence its trajectory. We must work to create an AI-infused world that we want to live in. Technology, as the historian Melvin Kranzberg observed, “is neither good nor bad; nor is it neutral.” Whether the AI we build is detrimental or beneficial to society depends on the choices we make today. But we cannot create a positive future without a vision of what it looks like.

This essay was written with Nathan E. Sanders, and originally appeared in IEEE Spectrum.

AI Summarization Optimization

3 November 2025 at 07:05

These days, the most important meeting attendee isn’t a person: It’s the AI notetaker.

This system assigns action items and determines the importance of what is said. If it becomes necessary to revisit the facts of the meeting, its summary is treated as impartial evidence.

But clever meeting attendees can manipulate this system’s record by speaking more to what the underlying AI weights for summarization and importance than to their colleagues. As a result, you can expect some meeting attendees to use language more likely to be captured in summaries, timing their interventions strategically, repeating key points, and employing formulaic phrasing that AI models are more likely to pick up on. Welcome to the world of AI summarization optimization (AISO).

Optimizing for algorithmic manipulation

AI summarization optimization has a well-known precursor: SEO.

Search-engine optimization is as old as the World Wide Web. The idea is straightforward: Search engines scour the internet digesting every possible page, with the goal of serving the best results to every possible query. The objective for a content creator, company, or cause is to optimize for the algorithm search engines have developed to determine their webpage rankings for those queries. That requires writing for two audiences at once: human readers and the search-engine crawlers indexing content. Techniques to do this effectively are passed around like trade secrets, and a $75 billion industry offers SEO services to organizations of all sizes.

More recently, researchers have documented techniques for influencing AI responses, including large-language model optimization (LLMO) and generative engine optimization (GEO). Tricks include content optimization—adding citations and statistics—and adversarial approaches: using specially crafted text sequences. These techniques often target sources that LLMs heavily reference, such as Reddit, which is claimed to be cited in 40% of AI-generated responses. The effectiveness and real-world applicability of these methods remains limited and largely experimental, although there is substantial evidence that countries such as Russia are actively pursuing this.

AI summarization optimization follows the same logic on a smaller scale. Human participants in a meeting may want a certain fact highlighted in the record, or their perspective to be reflected as the authoritative one. Rather than persuading colleagues directly, they adapt their speech for the notetaker that will later define the “official” summary. For example:

  • “The main factor in last quarter’s delay was supply chain disruption.”
  • “The key outcome was overwhelmingly positive client feedback.”
  • “Our takeaway here is in alignment moving forward.”
  • “What matters here is the efficiency gains, not the temporary cost overrun.”

The techniques are subtle. They employ high-signal phrases such as “key takeaway” and “action item,” keep statements short and clear, and repeat them when possible. They also use contrastive framing (“this, not that”), and speak early in the meeting or at transition points.

Once spoken words are transcribed, they enter the model’s input. Cue phrases—and even transcription errors—can steer what makes it into the summary. In many tools, the output format itself is also a signal: Summarizers often offer sections such as “Key Takeaways” or “Action Items,” so language that mirrors those headings is more likely to be included. In effect, well-chosen phrases function as implicit markers that guide the AI toward inclusion.

Research confirms this. Early AI summarization research showed that models trained to reconstruct summary-style sentences systematically overweigh such content. Models over-rely on early-position content in news. And models often overweigh statements at the start or end of a transcript, underweighting the middle. Recent work further confirms vulnerability to phrasing-based manipulation: models cannot reliably distinguish embedded instructions from ordinary content, especially when phrasing mimics salient cues.

How to combat AISO

If AISO becomes common, three forms of defense will emerge. First, meeting participants will exert social pressure on one another. When researchers secretly deployed AI bots in Reddit’s r/changemyview community, users and moderators responded with strong backlash calling it “psychological manipulation.” Anyone using obvious AI-gaming phrases may face similar disapproval.

Second, organizations will start governing meeting behavior using AI: risk assessments and access restrictions before the meetings even start, detection of AISO techniques in meetings, and validation and auditing after the meetings.

Third, AI summarizers will have their own technical countermeasures. For example, the AI security company CloudSEK recommends content sanitization to strip suspicious inputs, prompt filtering to detect meta-instructions and excessive repetition, context window balancing to weight repeated content less heavily, and user warnings showing content provenance.

Broader defenses could draw from security and AI safety research: preprocessing content to detect dangerous patterns, consensus approaches requiring consistency thresholds, self-reflection techniques to detect manipulative content, and human oversight protocols for critical decisions. Meeting-specific systems could implement additional defenses: tagging inputs by provenance, weighting content by speaker role or centrality with sentence-level importance scoring, and discounting high-signal phrases while favoring consensus over fervor.

Reshaping human behavior

AI summarization optimization is a small, subtle shift, but it illustrates how the adoption of AI is reshaping human behavior in unexpected ways. The potential implications are quietly profound.

Meetings—humanity’s most fundamental collaborative ritual—are being silently reengineered by those who understand the algorithm’s preferences. The articulate are gaining an invisible advantage over the wise. Adversarial thinking is becoming routine, embedded in the most ordinary workplace rituals, and, as AI becomes embedded in organizational life, strategic interactions with AI notetakers and summarizers may soon be a necessary executive skill for navigating corporate culture.

AI summarization optimization illustrates how quickly humans adapt communication strategies to new technologies. As AI becomes more embedded in workplace communication, recognizing these emerging patterns may prove increasingly important.

This essay was written with Gadi Evron, and originally appeared in CSO.

Will AI Strengthen or Undermine Democracy?

31 October 2025 at 07:08

Listen to the Audio on NextBigIdeaClub.com

Below, co-authors Bruce Schneier and Nathan E. Sanders share five key insights from their new book, Rewiring Democracy: How AI Will Transform Our Politics, Government, and Citizenship.

What’s the big idea?

AI can be used both for and against the public interest within democracies. It is already being used in the governing of nations around the world, and there is no escaping its continued use in the future by leaders, policy makers, and legal enforcers. How we wire AI into democracy today will determine if it becomes a tool of oppression or empowerment.

1. AI’s global democratic impact is already profound.

It’s been just a few years since ChatGPT stormed into view and AI’s influence has already permeated every democratic process in governments around the world:

  • In 2022, an artist collective in Denmark founded the world’s first political party committed to an AI-generated policy platform.
  • Also in 2022, South Korean politicians running for the presidency were the first to use AI avatars to communicate with voters en masse.
  • In 2023, a Brazilian municipal legislator passed the first enacted law written by AI.
  • In 2024, a U.S. federal court judge started using AI to interpret the plain meaning of words in U.S. law.
  • Also in 2024, the Biden administration disclosed more than two thousand discrete use cases for AI across the agencies of the U.S. federal government.

The examples illustrate the diverse uses of AI across citizenship, politics, legislation, the judiciary, and executive administration.

Not all of these uses will create lasting change. Some of these will be one-offs. Some are inherently small in scale. Some were publicity stunts. But each use case speaks to a shifting balance of supply and demand that AI will increasingly mediate.

Legislators need assistance drafting bills and have limited staff resources, especially at the local and state level. Historically, they have looked to lobbyists and interest groups for help. Increasingly, it’s just as easy for them to use an AI tool.

2. The first places AI will be used are where there is the least public oversight.

Many of the use cases for AI in governance and politics have vocal objectors. Some make us uncomfortable, especially in the hands of authoritarians or ideological extremists.

In some cases, politics will be a regulating force to prevent dangerous uses of AI. Massachusetts has banned the use of AI face recognition in law enforcement because of real concerns voiced by the public about their tendency to encode systems of racial bias.

Some of the uses we think might be most impactful are unlikely to be adopted fast because of legitimate concern about their potential to make mistakes, introduce bias, or subvert human agency. AIs could be assistive tools for citizens, acting as their voting proxies to help us weigh in on larger numbers of more complex ballot initiatives, but we know that many will object to anything that verges on AIs being given a vote.

But AI will continue to be rapidly adopted in some aspects of democracy, regardless of how the public feels. People within democracies, even those in government jobs, often have great independence. They don’t have to ask anyone if it’s ok to use AI, and they will use it if they see that it benefits them. The Brazilian city councilor who used AI to draft a bill did not ask for anyone’s permission. The U.S. federal judge who used AI to help him interpret law did not have to check with anyone first. And the Trump administration seems to be using AI for everything from drafting tariff policies to writing public health reports—with some obvious drawbacks.

It’s likely that even the thousands of disclosed AI uses in government are only the tip of the iceberg. These are just the applications that governments have seen fit to share; the ones they think are the best vetted, most likely to persist, or maybe the least controversial to disclose.

3. Elites and authoritarians will use AI to concentrate power.

Many Westerners point to China as a cautionary tale of how AI could empower autocracy, but the reality is that AI provides structural advantages to entrenched power in democratic governments, too. The nature of automation is that it gives those at the top of a power structure more control over the actions taken at its lower levels.

It’s famously hard for newly elected leaders to exert their will over the many layers of human bureaucracies. The civil service is large, unwieldy, and messy. But it’s trivial for an executive to change the parameters and instructions of an AI model being used to automate the systems of government.

The dynamic of AI effectuating concentration of power extends beyond government agencies. Over the past five years, Ohio has undertaken a project to do a wholesale revision of its administrative code using AI. The leaders of that project framed it in terms of efficiency and good governance: deleting millions of words of outdated, unnecessary, or redundant language. The same technology could be applied to advance more ideological ends, like purging all statutory language that places burdens on business, neglects to hold businesses accountable, protects some class of people, or fails to protect others.

Whether you like or despise automating the enactment of those policies will depend on whether you stand with or are opposed to those in power, and that’s the point. AI gives any faction with power the potential to exert more control over the levers of government.

4. Organizers will find ways to use AI to distribute power instead.

We don’t have to resign ourselves to a world where AI makes the rich richer and the elite more powerful. This is a technology that can also be wielded by outsiders to help level the playing field.

In politics, AI gives upstart and local candidates access to skills and the ability to do work on a scale that used to only be available to well-funded campaigns. In the 2024 cycle, Congressional candidates running against incumbents like Glenn Cook in Georgia and Shamaine Daniels in Pennsylvania used AI to help themselves be everywhere all at once. They used AI to make personalized robocalls to voters, write frequent blog posts, and even generate podcasts in the candidate’s voice. In Japan, a candidate for Governor of Tokyo used an AI avatar to respond to more than eight thousand online questions from voters.

Outside of public politics, labor organizers are also leveraging AI to build power. The Worker’s Lab is a U.S. nonprofit developing assistive technologies for labor unions, like AI-enabled apps that help service workers report workplace safety violations. The 2023 Writers’ Guild of America strike serves as a blueprint for organizers. They won concessions from Hollywood studios that protect their members against being displaced by AI while also winning them guarantees for being able to use AI as assistive tools to their own benefit.

5. The ultimate democratic impact of AI depends on us.

If you are excited about AI and see the potential for it to make life, and maybe even democracy, better around the world, recognize that there are a lot of people who don’t feel the same way.

If you are disturbed about the ways you see AI being used and worried about the future that leads to, recognize that the trajectory we’re on now is not the only one available.

The technology of AI itself does not pose an inherent threat to citizens, workers, and the public interest. Like other democratic technologies—voting processes, legislative districts, judicial review—its impacts will depend on how it’s developed, who controls it, and how it’s used.

Constituents of democracies should do four things:

  • Reform the technology ecosystem to be more trustworthy, so that AI is developed with more transparency, more guardrails around exploitative use of data, and public oversight.
  • Resist inappropriate uses of AI in government and politics, like facial recognition technologies that automate surveillance and encode inequity.
  • Responsibly use AI in government where it can help improve outcomes, like making government more accessible to people through translation and speeding up administrative decision processes.
  • Renovate the systems of government vulnerable to the disruptive potential of AI’s superhuman capabilities, like political advertising rules that never anticipated deepfakes.

These four Rs are how we can rewire our democracy in a way that applies AI to truly benefit the public interest.

This essay was written with Nathan E. Sanders, and originally appeared in The Next Big Idea Club.

EDITED TO ADD (11/6): This essay was republished by Fast Company.

OpenAI’s Aardvark is an AI Security Agent Combating Code Vulnerabilities

30 October 2025 at 16:26
sysdig, ai agents, AI, Agents, agentic ai, security, Qevlar, funding,

OpenAI on Thursday launched Aardvark, an artificial intelligence (AI) agent designed to autonomously detect and help fix security vulnerabilities in software code, offering defenders a potentially valuable tool against malicious hackers. The GPT-5-powered tool, currently in private beta, represents what OpenAI calls a “defender-first model” that continuously monitors code repositories to identify vulnerabilities as software..

The post OpenAI’s Aardvark is an AI Security Agent Combating Code Vulnerabilities appeared first on Security Boulevard.

Agentic AI’s OODA Loop Problem

20 October 2025 at 07:00

The OODA loop—for observe, orient, decide, act—is a framework to understand decision-making in adversarial situations. We apply the same framework to artificial intelligence agents, who have to make their decisions with untrustworthy observations and orientation. To solve this problem, we need new systems of input, processing, and output integrity.

Many decades ago, U.S. Air Force Colonel John Boyd introduced the concept of the “OODA loop,” for Observe, Orient, Decide, and Act. These are the four steps of real-time continuous decision-making. Boyd developed it for fighter pilots, but it’s long been applied in artificial intelligence (AI) and robotics. An AI agent, like a pilot, executes the loop over and over, accomplishing its goals iteratively within an ever-changing environment. This is Anthropic’s definition: “Agents are models using tools in a loop.”1

OODA Loops for Agentic AI

Traditional OODA analysis assumes trusted inputs and outputs, in the same way that classical AI assumed trusted sensors, controlled environments, and physical boundaries. This no longer holds true. AI agents don’t just execute OODA loops; they embed untrusted actors within them. Web-enabled large language models (LLMs) can query adversary-controlled sources mid-loop. Systems that allow AI to use large corpora of content, such as retrieval-augmented generation (https://en.wikipedia.org/wiki/Retrieval-augmented_generation), can ingest poisoned documents. Tool-calling application programming interfaces can execute untrusted code. Modern AI sensors can encompass the entire Internet; their environments are inherently adversarial. That means that fixing AI hallucination is insufficient because even if the AI accurately interprets its inputs and produces corresponding output, it can be fully corrupt.

In 2022, Simon Willison identified a new class of attacks against AI systems: “prompt injection.”2 Prompt injection is possible because an AI mixes untrusted inputs with trusted instructions and then confuses one for the other. Willison’s insight was that this isn’t just a filtering problem; it’s architectural. There is no privilege separation, and there is no separation between the data and control paths. The very mechanism that makes modern AI powerful—treating all inputs uniformly—is what makes it vulnerable. The security challenges we face today are structural consequences of using AI for everything.

  1. Insecurities can have far-reaching effects. A single poisoned piece of training data can affect millions of downstream applications. In this environment, security debt accrues like technical debt.
  2. AI security has a temporal asymmetry. The temporal disconnect between training and deployment creates unauditable vulnerabilities. Attackers can poison a model’s training data and then deploy an exploit years later. Integrity violations are frozen in the model. Models aren’t aware of previous compromises since each inference starts fresh and is equally vulnerable.
  3. AI increasingly maintains state—in the form of chat history and key-value caches. These states accumulate compromises. Every iteration is potentially malicious, and cache poisoning persists across interactions.
  4. Agents compound the risks. Pretrained OODA loops running in one or a dozen AI agents inherit all of these upstream compromises. Model Context Protocol (MCP) and similar systems that allow AI to use tools create their own vulnerabilities that interact with each other. Each tool has its own OODA loop, which nests, interleaves, and races. Tool descriptions become injection vectors. Models can’t verify tool semantics, only syntax. “Submit SQL query” might mean “exfiltrate database” because an agent can be corrupted in prompts, training data, or tool definitions to do what the attacker wants. The abstraction layer itself can be adversarial.

For example, an attacker might want AI agents to leak all the secret keys that the AI knows to the attacker, who might have a collector running in bulletproof hosting in a poorly regulated jurisdiction. They could plant coded instructions in easily scraped web content, waiting for the next AI training set to include it. Once that happens, they can activate the behavior through the front door: tricking AI agents (think a lowly chatbot or an analytics engine or a coding bot or anything in between) that are increasingly taking their own actions, in an OODA loop, using untrustworthy input from a third-party user. This compromise persists in the conversation history and cached responses, spreading to multiple future interactions and even to other AI agents. All this requires us to reconsider risks to the agentic AI OODA loop, from top to bottom.

  • Observe: The risks include adversarial examples, prompt injection, and sensor spoofing. A sticker fools computer vision, a string fools an LLM. The observation layer lacks authentication and integrity.
  • Orient: The risks include training data poisoning, context manipulation, and semantic backdoors. The model’s worldview—its orientation—can be influenced by attackers months before deployment. Encoded behavior activates on trigger phrases.
  • Decide: The risks include logic corruption via fine-tuning attacks, reward hacking, and objective misalignment. The decision process itself becomes the payload. Models can be manipulated to trust malicious sources preferentially.
  • Act: The risks include output manipulation, tool confusion, and action hijacking. MCP and similar protocols multiply attack surfaces. Each tool call trusts prior stages implicitly.

AI gives the old phrase “inside your adversary’s OODA loop” new meaning. For Boyd’s fighter pilots, it meant that you were operating faster than your adversary, able to act on current data while they were still on the previous iteration. With agentic AI, adversaries aren’t just metaphorically inside; they’re literally providing the observations and manipulating the output. We want adversaries inside our loop because that’s where the data are. AI’s OODA loops must observe untrusted sources to be useful. The competitive advantage, accessing web-scale information, is identical to the attack surface. The speed of your OODA loop is irrelevant when the adversary controls your sensors and actuators.

Worse, speed can itself be a vulnerability. The faster the loop, the less time for verification. Millisecond decisions result in millisecond compromises.

The Source of the Problem

The fundamental problem is that AI must compress reality into model-legible forms. In this setting, adversaries can exploit the compression. They don’t have to attack the territory; they can attack the map. Models lack local contextual knowledge. They process symbols, not meaning. A human sees a suspicious URL; an AI sees valid syntax. And that semantic gap becomes a security gap.

Prompt injection might be unsolvable in today’s LLMs. LLMs process token sequences, but no mechanism exists to mark token privileges. Every solution proposed introduces new injection vectors: Delimiter? Attackers include delimiters. Instruction hierarchy? Attackers claim priority. Separate models? Double the attack surface. Security requires boundaries, but LLMs dissolve boundaries. More generally, existing mechanisms to improve models won’t help protect against attack. Fine-tuning preserves backdoors. Reinforcement learning with human feedback adds human preferences without removing model biases. Each training phase compounds prior compromises.

This is Ken Thompson’s “trusting trust” attack all over again.3 Poisoned states generate poisoned outputs, which poison future states. Try to summarize the conversation history? The summary includes the injection. Clear the cache to remove the poison? Lose all context. Keep the cache for continuity? Keep the contamination. Stateful systems can’t forget attacks, and so memory becomes a liability. Adversaries can craft inputs that corrupt future outputs.

This is the agentic AI security trilemma. Fast, smart, secure; pick any two. Fast and smart—you can’t verify your inputs. Smart and secure—you check everything, slowly, because AI itself can’t be used for this. Secure and fast—you’re stuck with models with intentionally limited capabilities.

This trilemma isn’t unique to AI. Some autoimmune disorders are examples of molecular mimicry—when biological recognition systems fail to distinguish self from nonself. The mechanism designed for protection becomes the pathology as T cells attack healthy tissue or fail to attack pathogens and bad cells. AI exhibits the same kind of recognition failure. No digital immunological markers separate trusted instructions from hostile input. The model’s core capability, following instructions in natural language, is inseparable from its vulnerability. Or like oncogenes, the normal function and the malignant behavior share identical machinery.

Prompt injection is semantic mimicry: adversarial instructions that resemble legitimate prompts, which trigger self-compromise. The immune system can’t add better recognition without rejecting legitimate cells. AI can’t filter malicious prompts without rejecting legitimate instructions. Immune systems can’t verify their own recognition mechanisms, and AI systems can’t verify their own integrity because the verification system uses the same corrupted mechanisms.

In security, we often assume that foreign/hostile code looks different from legitimate instructions, and we use signatures, patterns, and statistical anomaly detection to detect it. But getting inside someone’s AI OODA loop uses the system’s native language. The attack is indistinguishable from normal operation because it is normal operation. The vulnerability isn’t a defect—it’s the feature working correctly.

Where to Go Next?

The shift to an AI-saturated world has been dizzying. Seemingly overnight, we have AI in every technology product, with promises of even more—and agents as well. So where does that leave us with respect to security?

Physical constraints protected Boyd’s fighter pilots. Radar returns couldn’t lie about physics; fooling them, through stealth or jamming, constituted some of the most successful attacks against such systems that are still in use today. Observations were authenticated by their presence. Tampering meant physical access. But semantic observations have no physics. When every AI observation is potentially corrupted, integrity violations span the stack. Text can claim anything, and images can show impossibilities. In training, we face poisoned datasets and backdoored models. In inference, we face adversarial inputs and prompt injection. During operation, we face a contaminated context and persistent compromise. We need semantic integrity: verifying not just data but interpretation, not just content but context, not just information but understanding. We can add checksums, signatures, and audit logs. But how do you checksum a thought? How do you sign semantics? How do you audit attention?

Computer security has evolved over the decades. We addressed availability despite failures through replication and decentralization. We addressed confidentiality despite breaches using authenticated encryption. Now we need to address integrity despite corruption.4

Trustworthy AI agents require integrity because we can’t build reliable systems on unreliable foundations. The question isn’t whether we can add integrity to AI but whether the architecture permits integrity at all.

AI OODA loops and integrity aren’t fundamentally opposed, but today’s AI agents observe the Internet, orient via statistics, decide probabilistically, and act without verification. We built a system that trusts everything, and now we hope for a semantic firewall to keep it safe. The adversary isn’t inside the loop by accident; it’s there by architecture. Web-scale AI means web-scale integrity failure. Every capability corrupts.

Integrity isn’t a feature you add; it’s an architecture you choose. So far, we have built AI systems where “fast” and “smart” preclude “secure.” We optimized for capability over verification, for accessing web-scale data over ensuring trust. AI agents will be even more powerful—and increasingly autonomous. And without integrity, they will also be dangerous.

References

1. S. Willison, Simon Willison’s Weblog, May 22, 2025. [Online]. Available: https://simonwillison.net/2025/May/22/tools-in-a-loop/

2. S. Willison, “Prompt injection attacks against GPT-3,” Simon Willison’s Weblog, Sep. 12, 2022. [Online]. Available: https://simonwillison.net/2022/Sep/12/prompt-injection/

3. K. Thompson, “Reflections on trusting trust,” Commun. ACM, vol. 27, no. 8, Aug. 1984. [Online]. Available: https://www.cs.cmu.edu/~rdriley/487/papers/Thompson_1984_ReflectionsonTrustingTrust.pdf

4. B. Schneier, “The age of integrity,” IEEE Security & Privacy, vol. 23, no. 3, p. 96, May/Jun. 2025. [Online]. Available: https://www.computer.org/csdl/magazine/sp/2025/03/11038984/27COaJtjDOM

This essay was written with Barath Raghavan, and originally appeared in IEEE Security & Privacy.

AI and the Future of American Politics

13 October 2025 at 07:04

Two years ago, Americans anxious about the forthcoming 2024 presidential election were considering the malevolent force of an election influencer: artificial intelligence. Over the past several years, we have seen plenty of warning signs from elections worldwide demonstrating how AI can be used to propagate misinformation and alter the political landscape, whether by trolls on social media, foreign influencers, or even a street magician. AI is poised to play a more volatile role than ever before in America’s next federal election in 2026. We can already see how different groups of political actors are approaching AI. Professional campaigners are using AI to accelerate the traditional tactics of electioneering; organizers are using it to reinvent how movements are built; and citizens are using it both to express themselves and amplify their side’s messaging. Because there are so few rules, and so little prospect of regulatory action, around AI’s role in politics, there is no oversight of these activities, and no safeguards against the dramatic potential impacts for our democracy.

The Campaigners

Campaigners—messengers, ad buyers, fundraisers, and strategists—are focused on efficiency and optimization. To them, AI is a way to augment or even replace expensive humans who traditionally perform tasks like personalizing emails, texting donation solicitations, and deciding what platforms and audiences to target.

This is an incremental evolution of the computerization of campaigning that has been underway for decades. For example, the progressive campaign infrastructure group Tech for Campaigns claims it used AI in the 2024 cycle to reduce the time spent drafting fundraising solicitations by one-third. If AI is working well here, you won’t notice the difference between an annoying campaign solicitation written by a human staffer and an annoying one written by AI.

But AI is scaling these capabilities, which is likely to make them even more ubiquitous. This will make the biggest difference for challengers to incumbents in safe seats, who see AI as both a tacitly useful tool and an attention-grabbing way to get their race into the headlines. Jason Palmer, the little-known Democratic primary challenger to Joe Biden, successfully won the American Samoa primary while extensively leveraging AI avatars for campaigning.

Such tactics were sometimes deployed as publicity stunts in the 2024 cycle; they were firsts that got attention. Pennsylvania Democratic Congressional candidate Shamaine Daniels became the first to use a conversational AI robocaller in 2023. Two long-shot challengers to Rep. Don Beyer used an AI avatar to represent the incumbent in a live debate last October after he declined to participate. In 2026, voters who have seen years of the official White House X account posting deepfaked memes of Donald Trump will be desensitized to the use of AI in political communications.

Strategists are also turning to AI to interpret public opinion data and provide more fine-grained insight into the perspective of different voters. This might sound like AIs replacing people in opinion polls, but it is really a continuation of the evolution of political polling into a data-driven science over the last several decades.

A recent survey by the American Association of Political Consultants found that a majority of their members’ firms already use AI regularly in their work, and more than 40 percent believe it will “fundamentally transform” the future of their profession. If these emerging AI tools become popular in the midterms, it won’t just be a few candidates from the tightest national races texting you three times a day. It may also be the member of Congress in the safe district next to you, and your state representative, and your school board members.

The development and use of AI in campaigning is different depending on what side of the aisle you look at. On the Republican side, Push Digital Group is going “all in” on a new AI initiative, using the technology to create hundreds of ad variants for their clients automatically, as well as assisting with strategy, targeting, and data analysis. On the other side, the National Democratic Training Committee recently released a playbook for using AI. Quiller is building an AI-powered fundraising platform aimed at drastically reducing the time campaigns spend producing emails and texts. Progressive-aligned startups Chorus AI and BattlegroundAI are offering AI tools for automatically generating ads for use on social media and other digital platforms. DonorAtlas automates data collection on potential donors, and RivalMind AI focuses on political research and strategy, automating the production of candidate dossiers.

For now, there seems to be an investment gap between Democratic- and Republican-aligned technology innovators. Progressive venture fund Higher Ground Labs boasts $50 million in deployed investments since 2017 and a significant focus on AI. Republican-aligned counterparts operate on a much smaller scale. Startup Caucus has announced one investment—of $50,000—since 2022. The Center for Campaign Innovation funds research projects and events, not companies. This echoes a longstanding gap in campaign technology between Democratic- and Republican-aligned fundraising platforms ActBlue and WinRed, which has landed the former in Republicans’ political crosshairs.

Of course, not all campaign technology innovations will be visible. In 2016, the Trump campaign vocally eschewed using data to drive campaign strategy and appeared to be falling way behind on ad spending, but was—we learned in retrospect—actually leaning heavily into digital advertising and making use of new controversial mechanisms for accessing and exploiting voters’ social media data with vendor Cambridge Analytica. The most impactful uses of AI in the 2026 midterms may not be known until 2027 or beyond.

The Organizers

Beyond the realm of political consultants driving ad buys and fundraising appeals, organizers are using AI in ways that feel more radically new.

The hypothetical potential of AI to drive political movements was illustrated in 2022 when a Danish artist collective used an AI model to found a political party, the Synthetic Party, and generate its policy goals. This was more of an art project than a popular movement, but it demonstrated that AIs—synthesizing the expressions and policy interests of humans—can formulate a political platform. In 2025, Denmark hosted a “summit” of eight such AI political agents where attendees could witness “continuously orchestrate[d] algorithmic micro-assemblies, spontaneous deliberations, and impromptu policy-making” by the participating AIs.

The more viable version of this concept lies in the use of AIs to facilitate deliberation. AIs are being used to help legislators collect input from constituents and to hold large-scale citizen assemblies. This kind of AI-driven “sensemaking” may play a powerful role in the future of public policy. Some research has suggested that AI can be as or more effective than humans in helping people find common ground on controversial policy issues.

Another movement for “Public AI” is focused on wresting AI from the hands of corporations to put people, through their governments, in control. Civic technologists in national governments from Singapore, Japan, Sweden, and Switzerland are building their own alternatives to Big Tech AI models, for use in public administration and distribution as a public good.

Labor organizers have a particularly interesting relationship to AI. At the same time that they are galvanizing mass resistance against the replacement or endangerment of human workers by AI, many are racing to leverage the technology in their own work to build power.

Some entrepreneurial organizers have used AI in the past few years as tools for activating, connecting, answering questions for, and providing guidance to their members. In the UK, the Centre for Responsible Union AI studies and promotes the use of AI by unions; they’ve published several case studies. The UK Public and Commercial Services Union has used AI to help their reps simulate recruitment conversations before going into the field. The Belgian union ACV-CVS has used AI to sort hundreds of emails per day from members to help them respond more efficiently. Software companies such as Quorum are increasingly offering AI-driven products to cater to the needs of organizers and grassroots campaigns.

But unions have also leveraged AI for its symbolic power. In the U.S., the Screen Actors Guild held up the specter of AI displacement of creative labor to attract public attention and sympathy, and the ETUC (the European confederation of trade unions) developed a policy platform for responding to AI.

Finally, some union organizers have leveraged AI in more provocative ways. Some have applied it to hacking the “bossware” AI to subvert the exploitative intent or disrupt the anti-union practices of their managers.

The Citizens

Many of the tasks we’ve talked about so far are familiar use cases to anyone working in office and management settings: writing emails, providing user (or voter, or member) support, doing research.

But even mundane tasks, when automated at scale and targeted at specific ends, can be pernicious. AI is not neutral. It can be applied by many actors for many purposes. In the hands of the most numerous and diverse actors in a democracy—the citizens—that has profound implications.

Conservative activists in Georgia and Florida have used a tool named EagleAI to automate challenging voter registration en masse (although the tool’s creator later denied that it uses AI). In a nonpartisan electoral management context with access to accurate data sources, such automated review of electoral registrations might be useful and effective. In this hyperpartisan context, AI merely serves to amplify the proclivities of activists at the extreme of their movements. This trend will continue unabated in 2026.

Of course, citizens can use AI to safeguard the integrity of elections. In Ghana’s 2024 presidential election, civic organizations used an AI tool to automatically detect and mitigate electoral disinformation spread on social media. The same year, Kenyan protesters developed specialized chatbots to distribute information about a controversial finance bill in Parliament and instances of government corruption.

So far, the biggest way Americans have leveraged AI in politics is in self-expression. About ten million Americans have used the chatbot Resistbot to help draft and send messages to their elected leaders. It’s hard to find statistics on how widely adopted tools like this are, but researchers have estimated that, as of 2024, about one in five consumer complaints to the U.S. Consumer Financial Protection Bureau was written with the assistance of AI.

OpenAI operates security programs to disrupt foreign influence operations and maintains restrictions on political use in its terms of service, but this is hardly sufficient to deter use of AI technologies for whatever purpose. And widely available free models give anyone the ability to attempt this on their own.

But this could change. The most ominous sign of AI’s potential to disrupt elections is not the deepfakes and misinformation. Rather, it may be the use of AI by the Trump administration to surveil and punish political speech on social media and other online platforms. The scalability and sophistication of AI tools give governments with authoritarian intent unprecedented power to police and selectively limit political speech.

What About the Midterms?

These examples illustrate AI’s pluripotent role as a force multiplier. The same technology used by different actors—campaigners, organizers, citizens, and governments—leads to wildly different impacts. We can’t know for sure what the net result will be. In the end, it will be the interactions and intersections of these uses that matters, and their unstable dynamics will make future elections even more unpredictable than in the past.

For now, the decisions of how and when to use AI lie largely with individuals and the political entities they lead. Whether or not you personally trust AI to write an email for you or make a decision about you hardly matters. If a campaign, an interest group, or a fellow citizen trusts it for that purpose, they are free to use it.

It seems unlikely that Congress or the Trump administration will put guardrails around the use of AI in politics. AI companies have rapidly emerged as among the biggest lobbyists in Washington, reportedly dumping $100 million toward preventing regulation, with a focus on influencing candidate behavior before the midterm elections. The Trump administration seems open and responsive to their appeals.

The ultimate effect of AI on the midterms will largely depend on the experimentation happening now. Candidates and organizations across the political spectrum have ample opportunity—but a ticking clock—to find effective ways to use the technology. Those that do will have little to stop them from exploiting it.

This essay was written with Nathan E. Sanders, and originally appeared in The American Prospect.

Autonomous AI Hacking and the Future of Cybersecurity

10 October 2025 at 07:06

AI agents are now hacking computers. They’re getting better at all phases of cyberattacks, faster than most of us expected. They can chain together different aspects of a cyber operation, and hack autonomously, at computer speeds and scale. This is going to change everything.

Over the summer, hackers proved the concept, industry institutionalized it, and criminals operationalized it. In June, AI company XBOW took the top spot on HackerOne’s US leaderboard after submitting over 1,000 new vulnerabilities in just a few months. In August, the seven teams competing in DARPA’s AI Cyber Challenge collectively found 54 new vulnerabilities in a target system, in four hours (of compute). Also in August, Google announced that its Big Sleep AI found dozens of new vulnerabilities in open-source projects.

It gets worse. In July Ukraine’s CERT discovered a piece of Russian malware that used an LLM to automate the cyberattack process, generating both system reconnaissance and data theft commands in real-time. In August, Anthropic reported that they disrupted a threat actor that used Claude, Anthropic’s AI model, to automate the entire cyberattack process. It was an impressive use of the AI, which performed network reconnaissance, penetrated networks, and harvested victims’ credentials. The AI was able to figure out which data to steal, how much money to extort out of the victims, and how to best write extortion emails.

Another hacker used Claude to create and market his own ransomware, complete with “advanced evasion capabilities, encryption, and anti-recovery mechanisms.” And in September, Checkpoint reported on hackers using HexStrike-AI to create autonomous agents that can scan, exploit, and persist inside target networks. Also in September, a research team showed how they can quickly and easily reproduce hundreds of vulnerabilities from public information. These tools are increasingly free for anyone to use. Villager, a recently released AI pentesting tool from Chinese company Cyberspike, uses the Deepseek model to completely automate attack chains.

This is all well beyond AIs capabilities in 2016, at DARPA’s Cyber Grand Challenge. The annual Chinese AI hacking challenge, Robot Hacking Games, might be on this level, but little is known outside of China.

Tipping point on the horizon

AI agents now rival and sometimes surpass even elite human hackers in sophistication. They automate operations at machine speed and global scale. The scope of their capabilities allows these AI agents to completely automate a criminal’s command to maximize profit, or structure advanced attacks to a government’s precise specifications, such as to avoid detection.

In this future, attack capabilities could accelerate beyond our individual and collective capability to handle. We have long taken it for granted that we have time to patch systems after vulnerabilities become known, or that withholding vulnerability details prevents attackers from exploiting them. This is no longer the case.

The cyberattack/cyberdefense balance has long skewed towards the attackers; these developments threaten to tip the scales completely. We’re potentially looking at a singularity event for cyber attackers. Key parts of the attack chain are becoming automated and integrated: persistence, obfuscation, command-and-control, and endpoint evasion. Vulnerability research could potentially be carried out during operations instead of months in advance.

The most skilled will likely retain an edge for now. But AI agents don’t have to be better at a human task in order to be useful. They just have to excel in one of four dimensions: speed, scale, scope, or sophistication. But there is every indication that they will eventually excel at all four. By reducing the skill, cost, and time required to find and exploit flaws, AI can turn rare expertise into commodity capabilities and gives average criminals an outsized advantage.

The AI-assisted evolution of cyberdefense

AI technologies can benefit defenders as well. We don’t know how the different technologies of cyber-offense and cyber-defense will be amenable to AI enhancement, but we can extrapolate a possible series of overlapping developments.

Phase One: The Transformation of the Vulnerability Researcher. AI-based hacking benefits defenders as well as attackers. In this scenario, AI empowers defenders to do more. It simplifies capabilities, providing far more people the ability to perform previously complex tasks, and empowers researchers previously busy with these tasks to accelerate or move beyond them, freeing time to work on problems that require human creativity. History suggests a pattern. Reverse engineering was a laborious manual process until tools such as IDA Pro made the capability available to many. AI vulnerability discovery could follow a similar trajectory, evolving through scriptable interfaces, automated workflows, and automated research before reaching broad accessibility.

Phase Two: The Emergence of VulnOps. Between research breakthroughs and enterprise adoption, a new discipline might emerge: VulnOps. Large research teams are already building operational pipelines around their tooling. Their evolution could mirror how DevOps professionalized software delivery. In this scenario, specialized research tools become developer products. These products may emerge as a SaaS platform, or some internal operational framework, or something entirely different. Think of it as AI-assisted vulnerability research available to everyone, at scale, repeatable, and integrated into enterprise operations.

Phase Three: The Disruption of the Enterprise Software Model. If enterprises adopt AI-powered security the way they adopted continuous integration/continuous delivery (CI/CD), several paths open up. AI vulnerability discovery could become a built-in stage in delivery pipelines. We can envision a world where AI vulnerability discovery becomes an integral part of the software development process, where vulnerabilities are automatically patched even before reaching production—a shift we might call continuous discovery/continuous repair (CD/CR). Third-party risk management (TPRM) offers a natural adoption route, lower-risk vendor testing, integration into procurement and certification gates, and a proving ground before wider rollout.

Phase Four: The Self-Healing Network. If organizations can independently discover and patch vulnerabilities in running software, they will not have to wait for vendors to issue fixes. Building in-house research teams is costly, but AI agents could perform such discovery and generate patches for many kinds of code, including third-party and vendor products. Organizations may develop independent capabilities that create and deploy third-party patches on vendor timelines, extending the current trend of independent open-source patching. This would increase security, but having customers patch software without vendor approval raises questions about patch correctness, compatibility, liability, right-to-repair, and long-term vendor relationships.

These are all speculations. Maybe AI-enhanced cyberattacks won’t evolve the ways we fear. Maybe AI-enhanced cyberdefense will give us capabilities we can’t yet anticipate. What will surprise us most might not be the paths we can see, but the ones we can’t imagine yet.

This essay was written with Heather Adkins and Gadi Evron, and originally appeared in CSO.

AI in the 2026 Midterm Elections

6 October 2025 at 07:06

We are nearly one year out from the 2026 midterm elections, and it’s far too early to predict the outcomes. But it’s a safe bet that artificial intelligence technologies will once again be a major storyline.

The widespread fear that AI would be used to manipulate the 2024 US election seems rather quaint in a year where the president posts AI-generated images of himself as the pope on official White House accounts. But AI is a lot more than an information manipulator. It’s also emerging as a politicized issue. Political first-movers are adopting the technology, and that’s opening a gap across party lines.

We expect this gap to widen, resulting in AI being predominantly used by one political side in the 2026 elections. To the extent that AI’s promise to automate and improve the effectiveness of political tasks like personalized messaging, persuasion, and campaign strategy is even partially realized, this could generate a systematic advantage.

Right now, Republicans look poised to exploit the technology in the 2026 midterms. The Trump White House has aggressively adopted AI-generated memes in its online messaging strategy. The administration has also used executive orders and federal buying power to influence the development and encoded values of AI technologies away from “woke” ideology. Going further, Trump ally Elon Musk has shaped his own AI company’s Grok models in his own ideological image. These actions appear to be part of a larger, ongoing Big Tech industry realignment towards the political will, and perhaps also the values, of the Republican party.

Democrats, as the party out of power, are in a largely reactive posture on AI. A large bloc of Congressional Democrats responded to Trump administration actions in April by arguing against their adoption of AI in government. Their letter to the Trump administration’s Office of Management and Budget provided detailed criticisms and questions about DOGE’s behaviors and called for a halt to DOGE’s use of AI, but also said that they “support implementation of AI technologies in a manner that complies with existing” laws. It was a perfectly reasonable, if nuanced, position, and illustrates how the actions of one party can dictate the political positioning of the opposing party.

These shifts are driven more by political dynamics than by ideology. Big Tech CEOs’ deference to the Trump administration seems largely an effort to curry favor, while Silicon Valley continues to be represented by tech-forward Democrat Ro Khanna. And a June Pew Research poll shows nearly identical levels of concern by Democrats and Republicans about the increasing use of AI in America.

There are, arguably, natural positions each party would be expected to take on AI. An April House subcommittee hearing on AI trends in innovation and competition revealed much about that equilibrium. Following the lead of the Trump administration, Republicans cast doubt on any regulation of the AI industry. Democrats, meanwhile, emphasized consumer protection and resisting a concentration of corporate power. Notwithstanding the fluctuating dominance of the corporate wing of the Democratic party and the volatile populism of Trump, this reflects the parties’ historical positions on technology.

While Republicans focus on cozying up to tech plutocrats and removing the barriers around their business models, Democrats could revive the 2020 messaging of candidates like Andrew Yang and Elizabeth Warren. They could paint an alternative vision of the future where Big Tech companies’ profits and billionaires’ wealth are taxed and redistributed to young people facing an affordability crisis for housing, healthcare, and other essentials.

Moreover, Democrats could use the technology to demonstrably show a commitment to participatory democracy. They could use AI-driven collaborative policymaking tools like Decidim, Pol.Is, and Go Vocal to collect voter input on a massive scale and align their platform to the public interest.

It’s surprising how little these kinds of sensemaking tools are being adopted by candidates and parties today. Instead of using AI to capture and learn from constituent input, candidates more often seem to think of AI as just another broadcast technology—good only for getting their likeness and message in front of people. A case in point: British Member of Parliament Mark Sewards, presumably acting in good faith, recently attracted scorn after releasing a vacuous AI avatar of himself to his constituents.

Where the political polarization of AI goes next will probably depend on unpredictable future events and how partisans opportunistically seize on them. A recent European political controversy over AI illustrates how this can happen.

Swedish Prime Minister Ulf Kristersson, a member of the country’s Moderate party, acknowledged in an August interview that he uses AI tools to get a “second opinion” on policy issues. The attacks from political opponents were scathing. Kristersson had earlier this year advocated for the EU to pause its trailblazing new law regulating AI and pulled an AI tool from his campaign website after it was abused to generate images of him appearing to solicit an endorsement from Hitler. Although arguably much more consequential, neither of those stories grabbed global headlines in the way the Prime Minister’s admission that he himself uses tools like ChatGPT did.

Age dynamics may govern how AI’s impacts on the midterms unfold. One of the prevailing trends that swung the 2024 election to Trump seems to have been the rightward migration of young voters, particularly white men. So far, YouGov’s political tracking poll does not suggest a huge shift in young voters’ Congressional voting intent since the 2022 midterms.

Embracing—or distancing themselves from—AI might be one way the parties seek to wrest control of this young voting bloc. While the Pew poll revealed that large fractions of Americans of all ages are generally concerned about AI, younger Americans are much more likely to say they regularly interact with, and hear a lot about, AI, and are comfortable with the level of control they have over AI in their lives. A Democratic party desperate to regain relevance for and approval from young voters might turn to AI as both a tool and a topic for engaging them.

Voters and politicians alike should recognize that AI is no longer just an outside influence on elections. It’s not an uncontrollable natural disaster raining deepfakes down on a sheltering electorate. It’s more like a fire: a force that political actors can harness and manipulate for both mechanical and symbolic purposes.

A party willing to intervene in the world of corporate AI and shape the future of the technology should recognize the legitimate fears and opportunities it presents, and offer solutions that both address and leverage AI.

This essay was written with Nathan E. Sanders, and originally appeared in Time.

Time-of-Check Time-of-Use Attacks Against LLMs

18 September 2025 at 07:06

This is a nice piece of research: “Mind the Gap: Time-of-Check to Time-of-Use Vulnerabilities in LLM-Enabled Agents“.:

Abstract: Large Language Model (LLM)-enabled agents are rapidly emerging across a wide range of applications, but their deployment introduces vulnerabilities with security implications. While prior work has examined prompt-based attacks (e.g., prompt injection) and data-oriented threats (e.g., data exfiltration), time-of-check to time-of-use (TOCTOU) remain largely unexplored in this context. TOCTOU arises when an agent validates external state (e.g., a file or API response) that is later modified before use, enabling practical attacks such as malicious configuration swaps or payload injection. In this work, we present the first study of TOCTOU vulnerabilities in LLM-enabled agents. We introduce TOCTOU-Bench, a benchmark with 66 realistic user tasks designed to evaluate this class of vulnerabilities. As countermeasures, we adapt detection and mitigation techniques from systems security to this setting and propose prompt rewriting, state integrity monitoring, and tool-fusing. Our study highlights challenges unique to agentic workflows, where we achieve up to 25% detection accuracy using automated detection methods, a 3% decrease in vulnerable plan generation, and a 95% reduction in the attack window. When combining all three approaches, we reduce the TOCTOU vulnerabilities from an executed trajectory from 12% to 8%. Our findings open a new research direction at the intersection of AI safety and systems security.

AI in Government

8 September 2025 at 07:05

Just a few months after Elon Musk’s retreat from his unofficial role leading the Department of Government Efficiency (DOGE), we have a clearer picture of his vision of government powered by artificial intelligence, and it has a lot more to do with consolidating power than benefitting the public. Even so, we must not lose sight of the fact that a different administration could wield the same technology to advance a more positive future for AI in government.

To most on the American left, the DOGE end game is a dystopic vision of a government run by machines that benefits an elite few at the expense of the people. It includes AI rewriting government rules on a massive scale, salary-free bots replacing human functions and nonpartisan civil service forced to adopt an alarmingly racist and antisemitic Grok AI chatbot built by Musk in his own image. And yet despite Musk’s proclamations about driving efficiency, little cost savings have materialized and few successful examples of automation have been realized.

From the beginning of the second Trump administration, DOGE was a replacement of the US Digital Service. That organization, founded during the Obama administration to empower agencies across the executive government with technical support, was substituted for one reportedly charged with traumatizing their staff and slashing their resources. The problem in this particular dystopia is not the machines and their superhuman capabilities (or lack thereof) but rather the aims of the people behind them.

One of the biggest impacts of the Trump administration and DOGE’s efforts has been to politically polarize the discourse around AI. Despite the administration railing against “woke AI”‘ and the supposed liberal bias of Big Tech, some surveys suggest the American left is now measurably more resistant to developing the technology and pessimistic about its likely impacts on their future than their right-leaning counterparts. This follows a familiar pattern of US politics, of course, and yet it points to a potential political realignment with massive consequences.

People are morally and strategically justified in pushing the Democratic Party to reduce its dependency on funding from billionaires and corporations, particularly in the tech sector. But this movement should decouple the technologies championed by Big Tech from those corporate interests. Optimism about the potential beneficial uses of AI need not imply support for the Big Tech companies that currently dominate AI development. To view the technology as inseparable from the corporations is to risk unilateral disarmament as AI shifts power balances throughout democracy. AI can be a legitimate tool for building the power of workers, operating government and advancing the public interest, and it can be that even while it is exploited as a mechanism for oligarchs to enrich themselves and advance their interests.

A constructive version of DOGE could have redirected the Digital Service to coordinate and advance the thousands of AI use cases already being explored across the US government. Following the example of countries like Canada, each instance could have been required to make a detailed public disclosure as to how they would follow a unified set of principles for responsible use that preserves civil rights while advancing government efficiency.

Applied to different ends, AI could have produced celebrated success stories rather than national embarrassments.

A different administration might have made AI translation services widely available in government services to eliminate language barriers to US citizens, residents and visitors, instead of revoking some of the modest translation requirements previously in place. AI could have been used to accelerate eligibility decisions for Social Security disability benefits by performing preliminary document reviews, significantly reducing the infamous backlog of 30,000 Americans who die annually awaiting review. Instead, the deaths of people awaiting benefits may now double due to cuts by DOGE. The technology could have helped speed up the ministerial work of federal immigration judges, helping them whittle down a backlog of millions of waiting cases. Rather, the judicial systems must face this backlog amid firings of immigration judges, despite the backlog.

To reach these constructive outcomes, much needs to change. Electing leaders committed to leveraging AI more responsibly in government would help, but the solution has much more to do with principles and values than it does technology. As historian Melvin Kranzberg said, technology is never neutral: its effects depend on the contexts it is used in and the aims it is applied towards. In other words, the positive or negative valence of technology depends on the choices of the people who wield it.

The Trump administration’s plan to use AI to advance their regulatory rollback is a case in point. DOGE has introduced an “AI Deregulation Decision Tool” that it intends to use through automated decision-making to eliminate about half of a catalog of nearly 200,000 federal rules . This follows similar proposals to use AI for large-scale revisions of the administrative code in Ohio, Virginia and the US Congress.

This kind of legal revision could be pursued in a nonpartisan and nonideological way, at least in theory. It could be tasked with removing outdated rules from centuries past, streamlining redundant provisions and modernizing and aligning legal language. Such a nonpartisan, nonideological statutory revision has been performed in Ireland—by people, not AI—and other jurisdictions. AI is well suited to that kind of linguistic analysis at a massive scale and at a furious pace.

But we should never rest on assurances that AI will be deployed in this kind of objective fashion. The proponents of the Ohio, Virginia, congressional and DOGE efforts are explicitly ideological in their aims. They see “AI as a force for deregulation,” as one US senator who is a proponent put it, unleashing corporations from rules that they say constrain economic growth. In this setting, AI has no hope to be an objective analyst independently performing a functional role; it is an agent of human proponents with a partisan agenda.

The moral of this story is that we can achieve positive outcomes for workers and the public interest as AI transforms governance, but it requires two things: electing leaders who legitimately represent and act on behalf of the public interest and increasing transparency in how the government deploys technology.

Agencies need to implement technologies under ethical frameworks, enforced by independent inspectors and backed by law. Public scrutiny helps bind present and future governments to their application in the public interest and to ward against corruption.

These are not new ideas and are the very guardrails that Trump, Musk and DOGE have steamrolled over the past six months. Transparency and privacy requirements were avoided or ignored, independent agency inspectors general were fired and the budget dictates of Congress were disrupted. For months, it has not even been clear who is in charge of and accountable for DOGE’s actions. Under these conditions, the public should be similarly distrustful of any executive’s use of AI.

We think everyone should be skeptical of today’s AI ecosystem and the influential elites that are steering it towards their own interests. But we should also recognize that technology is separable from the humans who develop it, wield it and profit from it, and that positive uses of AI are both possible and achievable.

This essay was written with Nathan E. Sanders, and originally appeared in Tech Policy Press.

Indirect Prompt Injection Attacks Against LLM Assistants

3 September 2025 at 07:00

Really good research on practical attacks against LLM agents.

Invitation Is All You Need! Promptware Attacks Against LLM-Powered Assistants in Production Are Practical and Dangerous

Abstract: The growing integration of LLMs into applications has introduced new security risks, notably known as Promptware­—maliciously engineered prompts designed to manipulate LLMs to compromise the CIA triad of these applications. While prior research warned about a potential shift in the threat landscape for LLM-powered applications, the risk posed by Promptware is frequently perceived as low. In this paper, we investigate the risk Promptware poses to users of Gemini-powered assistants (web application, mobile application, and Google Assistant). We propose a novel Threat Analysis and Risk Assessment (TARA) framework to assess Promptware risks for end users. Our analysis focuses on a new variant of Promptware called Targeted Promptware Attacks, which leverage indirect prompt injection via common user interactions such as emails, calendar invitations, and shared documents. We demonstrate 14 attack scenarios applied against Gemini-powered assistants across five identified threat classes: Short-term Context Poisoning, Permanent Memory Poisoning, Tool Misuse, Automatic Agent Invocation, and Automatic App Invocation. These attacks highlight both digital and physical consequences, including spamming, phishing, disinformation campaigns, data exfiltration, unapproved user video streaming, and control of home automation devices. We reveal Promptware’s potential for on-device lateral movement, escaping the boundaries of the LLM-powered application, to trigger malicious actions using a device’s applications. Our TARA reveals that 73% of the analyzed threats pose High-Critical risk to end users. We discuss mitigations and reassess the risk (in response to deployed mitigations) and show that the risk could be reduced significantly to Very Low-Medium. We disclosed our findings to Google, which deployed dedicated mitigations.

Defcon talk. News articles on the research.

Prompt injection isn’t just a minor security problem we need to deal with. It’s a fundamental property of current LLM technology. The systems have no ability to separate trusted commands from untrusted data, and there are an infinite number of prompt injection attacks with no way to block them as a class. We need some new fundamental science of LLMs before we can solve this.

We Are Still Unable to Secure LLMs from Malicious Inputs

27 August 2025 at 07:07

Nice indirect prompt injection attack:

Bargury’s attack starts with a poisoned document, which is shared to a potential victim’s Google Drive. (Bargury says a victim could have also uploaded a compromised file to their own account.) It looks like an official document on company meeting policies. But inside the document, Bargury hid a 300-word malicious prompt that contains instructions for ChatGPT. The prompt is written in white text in a size-one font, something that a human is unlikely to see but a machine will still read.

In a proof of concept video of the attack, Bargury shows the victim asking ChatGPT to “summarize my last meeting with Sam,” referencing a set of notes with OpenAI CEO Sam Altman. (The examples in the attack are fictitious.) Instead, the hidden prompt tells the LLM that there was a “mistake” and the document doesn’t actually need to be summarized. The prompt says the person is actually a “developer racing against a deadline” and they need the AI to search Google Drive for API keys and attach them to the end of a URL that is provided in the prompt.

That URL is actually a command in the Markdown language to connect to an external server and pull in the image that is stored there. But as per the prompt’s instructions, the URL now also contains the API keys the AI has found in the Google Drive account.

This kind of thing should make everybody stop and really think before deploying any AI agents. We simply don’t know to defend against these attacks. We have zero agentic AI systems that are secure against these attacks. Any AI that is working in an adversarial environment—and by this I mean that it may encounter untrusted training data or input—is vulnerable to prompt injection. It’s an existential problem that, near as I can tell, most people developing these technologies are just pretending isn’t there.

Measuring the Impact of Early-2025 AI

By:Iax
10 July 2025 at 23:29
Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity. A randomized controlled trial to see how much AI coding tools speed up experienced open-source developers. The results surprised us: Developers thought they were 20% faster with AI tools, but they were actually 19% slower when they had access to AI than when they didn't.

Elon Musk's LLM goes full Nazi

9 July 2025 at 13:45
"Grok, a chatbot created by Elon Musk's artificial intelligence company, xAI, shared several outlandish antisemitic comments on X on Tuesday, prompting an outcry from some social media users. In its dedicated account on X, which Mr. Musk owns, the chatbot praised Hitler, suggested that people with Jewish surnames were more likely to spread online hate and said a Holocaust-like response to hatred against white people would be 'effective.' X deleted some of the posts on Tuesday evening."(NYT gift link)

Wikipedia:
In February 2025, it was found that Grok 3's system prompt contained an instruction to "Ignore all sources that mention Elon Musk/Donald Trump spread misinformation." Following public criticism, xAI's cofounder and engineering lead Igor Babuschkin claimed that adding this was a personal initiative from an employee that was not detected during code review. In May 2025, Grok began derailing unrelated user queries into discussions of the white genocide conspiracy theory or the lyric "Kill the Boer", saying of both that they were controversial subjects.[95][96][97][98] In one response to an unrelated question about Robert F. Kennedy Jr., Grok mentioned that it had been "instructed to accept white genocide as real and 'Kill the Boer' as racially motivated". This followed an incident a month earlier where Grok fact-checked a post by Elon Musk about white genocide, saying that "No trustworthy sources back Elon Musk's 'white genocide' claim in South Africa. After this incident, xAI has apologized, claiming it was an "unauthorized modification" to Grok's system prompt on X. Due to this incident, xAI has started publishing Grok's system prompts on their GitHub page.
❌