
Why the Moltbook frenzy was like Pokémon

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

Lots of influential people in tech last week were describing Moltbook, an online hangout populated by AI agents interacting with one another, as a glimpse into the future. It appeared to show AI systems doing useful things for the humans that created them (one person used the platform to help him negotiate a deal on a new car). Sure, it was flooded with crypto scams, and many of the posts were actually written by people, but something about it pointed to a future of helpful AI, right?

The whole experiment reminded our senior editor for AI, Will Douglas Heaven, of something far less interesting: Pokémon.

Back in 2014, someone set up a game of Pokémon in which the main character could be controlled by anyone on the internet via the streaming platform Twitch. Playing was as clunky as it sounds, but it was incredibly popular: at one point, a million people were playing the game at the same time.

“It was yet another weird online social experiment that got picked up by the mainstream media: What did this mean for the future?” Will says. “Not a lot, it turned out.”

The Moltbook frenzy struck Will the same way, and it turned out that one of the sources he spoke to had been thinking about Pokémon too. Jason Schloetzer, at the Georgetown Psaros Center for Financial Markets and Policy, saw the whole thing as a sort of Pokémon battle for AI enthusiasts, in which they created AI agents and deployed them to interact with other agents. In this light, the news that many AI agents were actually being instructed by people to say certain things that made them sound sentient or intelligent makes a whole lot more sense.

“It’s basically a spectator sport,” he told Will, “but for language models.”

Will wrote an excellent piece about why Moltbook was not the glimpse into the future that it was said to be. Even if you are excited about a future of agentic AI, he points out, there are some key pieces that Moltbook made clear are still missing. It was a forum of chaos, but a genuinely helpful hive mind would require more coordination, shared objectives, and shared memory.

“More than anything else, I think Moltbook was the internet having fun,” Will says. “The biggest question it leaves me with now is: How far will people push AI just for the laughs?”

Read the whole story.

  •  

What we’ve been getting wrong about AI’s truth crisis

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

What would it take to convince you that the era of truth decay we were long warned about—where AI content dupes us, shapes our beliefs even when we catch the lie, and erodes societal trust in the process—is now here? A story I published last week pushed me over the edge. It also made me realize that the tools we were sold as a cure for this crisis are failing miserably. 

On Thursday, I reported the first confirmation that the US Department of Homeland Security, which houses immigration agencies, is using AI video generators from Google and Adobe to make content that it shares with the public. The news comes as immigration agencies have flooded social media with content to support President Trump’s mass deportation agenda—some of which appears to be made with AI (like a video about “Christmas after mass deportations”).

But I received two types of reactions from readers that may explain just as much about the epistemic crisis we’re in. 

One was from people who weren’t surprised, because on January 22 the White House had posted a digitally altered photo of a woman arrested at an ICE protest, one that made her appear hysterical and in tears. Kaelan Dorr, the White House’s deputy communications director, did not answer questions about whether the White House altered the photo but wrote, “The memes will continue.”

The second was from readers who saw no point in reporting that DHS was using AI to edit content shared with the public, because news outlets were apparently doing the same. They pointed to the fact that the news network MS Now (formerly MSNBC) shared an image of Alex Pretti that was AI-edited and appeared to make him look more handsome, a fact that led to many viral clips this week, including one from Joe Rogan’s podcast. Fight fire with fire, in other words? A spokesperson for MS Now told Snopes that the news outlet aired the image without knowing it was edited.

There is no reason to collapse these two cases of altered content into the same category, or to read them as evidence that truth no longer matters. One involved the US government sharing a clearly altered photo with the public and declining to answer whether it was intentionally manipulated; the other involved a news outlet airing a photo it should have known was altered but taking some steps to disclose the mistake.

What these reactions reveal instead is a flaw in how we were collectively preparing for this moment. Warnings about the AI truth crisis revolved around a core thesis: that not being able to tell what is real will destroy us, so we need tools to independently verify the truth. My two grim takeaways are that these tools are failing, and that while vetting the truth remains essential, it is no longer capable on its own of producing the societal trust we were promised.

For example, there was plenty of hype in 2024 about the Content Authenticity Initiative, cofounded by Adobe and adopted by major tech companies, which would attach labels to content disclosing when it was made, by whom, and whether AI was involved. But Adobe applies automatic labels only when the content is wholly AI-generated. Otherwise the labels are opt-in on the part of the creator.

And platforms like X, where the altered arrest photo was posted, can strip content of such labels anyway (a note that the photo was altered was added by users). Platforms can also simply choose not to show the label at all.

Noticing how much traction the White House’s photo got even after it was shown to be AI-altered, I was struck by the findings of a very relevant new paper published in the journal Communications Psychology. In the study, participants watched a deepfake “confession” to a crime, and the researchers found that even when they were told explicitly that the evidence was fake, participants relied on it when judging an individual’s guilt. In other words, even when people learn that the content they’re looking at is entirely fake, they remain emotionally swayed by it. 

“Transparency helps, but it isn’t enough on its own,” the disinformation expert Christopher Nehring wrote recently about the study’s findings. “We have to develop a new masterplan of what to do about deepfakes.”

AI tools to generate and edit content are getting more advanced, easier to operate, and cheaper to run—all reasons why the US government is increasingly paying to use them. We were well warned of this, but we responded by preparing for a world in which the main danger was confusion. What we’re entering instead is a world in which influence survives exposure, doubt is easily weaponized, and establishing the truth does not serve as a reset button. And the defenders of truth are already trailing way behind.

Update: This story was updated on February 2 with details about how Adobe applies its content authenticity labels. A previous version of this story said content credentials were not visible on the Pentagon’s DVIDS website. The labels are present but require clicking through and hovering on individual images. The reference has been removed.

  •  

Inside the marketplace powering bespoke AI deepfakes of real women

Civitai—an online marketplace for buying and selling AI-generated content, backed by the venture capital firm Andreessen Horowitz—is letting users buy custom instruction files for generating celebrity deepfakes. Some of these files were specifically designed to make pornographic images banned by the site, a new analysis has found.

The study, from researchers at Stanford and Indiana University, looked at people’s requests for content on the site, called “bounties.” The researchers found that between mid-2023 and the end of 2024, most bounties asked for animated content—but a significant portion were for deepfakes of real people, and 90% of these deepfake requests targeted women. (Their findings have not yet been peer reviewed.)

The debate around deepfakes, as illustrated by the recent backlash to explicit images on the X-owned chatbot Grok, has revolved around what platforms should do to block such content. Civitai’s situation is a little more complicated. Its marketplace includes actual images, videos, and models, but it also lets individuals buy and sell instruction files called LoRAs that can coach mainstream AI models like Stable Diffusion into generating content they were not trained to produce. Users can then combine these files with other tools to make deepfakes that are graphic or sexual. The researchers found that 86% of deepfake requests on Civitai were for LoRAs.

In these bounties, users requested “high quality” models to generate images of public figures like the influencer Charli D’Amelio or the singer Gracie Abrams, often linking to their social media profiles so their images could be grabbed from the web. Some requests specified a desire for models that generated the individual’s entire body, accurately captured their tattoos, or allowed hair color to be changed. Some requests targeted several women in specific niches, like artists who record ASMR videos. One request was for a deepfake of a woman said to be the user’s wife. Anyone on the site could offer up AI models they worked on for the task, and the best submissions received payment—anywhere from $0.50 to $5. And nearly 92% of the deepfake bounties were awarded.

Neither Civitai nor Andreessen Horowitz responded to requests for comment.

It’s possible that people buy these LoRAs to make deepfakes that aren’t sexually explicit (though they’d still violate Civitai’s terms of use, and they’d still be ethically fraught). But Civitai also offers educational resources on how to use external tools to further customize the outputs of image generators—for example, by changing someone’s pose. The site also hosts user-written articles with details on how to instruct models to generate pornography. The researchers found that the amount of porn on the platform has gone up, and that the majority of requests each week are now for NSFW content.

“Not only does Civitai provide the infrastructure that facilitates these issues; they also explicitly teach their users how to utilize them,” says Matthew DeVerna, a postdoctoral researcher at Stanford’s Cyber Policy Center and one of the study’s leaders. 

The company used to ban only sexually explicit deepfakes of real people, but in May 2025 it announced it would ban all deepfake content. Nonetheless, countless requests for deepfakes submitted before this ban now remain live on the site, and many of the winning submissions fulfilling those requests remain available for purchase, MIT Technology Review confirmed.

“I believe the approach that they’re trying to take is to sort of do as little as possible, such that they can foster as much—I guess they would call it—creativity on the platform,” DeVerna says.

Users buy LoRAs with the site’s online currency, called Buzz, which is purchased with real money. In May 2025, Civitai’s credit card processor cut off the company because of its ongoing problem with nonconsensual content. To pay for explicit content, users must now use gift cards or cryptocurrency to buy Buzz; the company offers a different scrip for non-explicit content.

Civitai automatically tags bounties requesting deepfakes and lists a way for the person featured in the content to manually request its takedown. This system means that Civitai has a reasonably reliable way of identifying which bounties are for deepfakes, but it’s still leaving moderation to the general public rather than carrying it out proactively.

A company’s legal liability for what its users do isn’t totally clear. Generally, tech companies have broad protections under Section 230 of the Communications Decency Act against liability for what their users post, but those protections aren’t limitless. For example, “you cannot knowingly facilitate illegal transactions on your website,” says Ryan Calo, a professor specializing in technology and AI at the University of Washington’s law school. (Calo wasn’t involved in this new study.)

Civitai joined OpenAI, Anthropic, and other AI companies in 2024 in adopting design principles to guard against the creation and spread of AI-generated child sexual abuse material. This move followed a 2023 report from the Stanford Internet Observatory, which found that the vast majority of AI models named in child sexual abuse communities were Stable Diffusion–based models “predominantly obtained via Civitai.”

But adult deepfakes have not gotten the same level of attention from content platforms or the venture capital firms that fund them. “They are not afraid enough of it. They are overly tolerant of it,” Calo says. “Neither law enforcement nor civil courts adequately protect against it. It is night and day.”

Civitai received a $5 million investment from Andreessen Horowitz (a16z) in November 2023. In a video shared by a16z, Civitai cofounder and CEO Justin Maier described his goal of building the main place where people find and share AI models for their own individual purposes. “We’ve aimed to make this space that’s been very, I guess, niche and engineering-heavy more and more approachable to more and more people,” he said. 

Civitai is not the only company with a deepfake problem in a16z’s investment portfolio; in February, MIT Technology Review first reported that another company, Botify AI, was hosting AI companions resembling real actors that stated their age as under 18, engaged in sexually charged conversations, offered “hot photos,” and in some instances described age-of-consent laws as “arbitrary” and “meant to be broken.”

  •  

DHS is using Google and Adobe AI to make videos

The US Department of Homeland Security is using AI video generators from Google and Adobe to make and edit content shared with the public, a new document reveals. It comes as immigration agencies have flooded social media with content to support President Trump’s mass deportation agenda—some of which appears to be made with AI—and as workers in tech have put pressure on their employers to denounce the agencies’ activities. 

The document, released on Wednesday, provides an inventory of which commercial AI tools DHS uses for tasks ranging from generating drafts of documents to managing cybersecurity. 

In a section about “editing images, videos or other public affairs materials using AI,” it reveals for the first time that DHS is using Google’s Veo 3 video generator and Adobe Firefly, estimating that the agency has between 100 and 1,000 licenses for the tools. It also discloses that DHS uses Microsoft Copilot Chat for generating first drafts of documents and summarizing long reports, and Poolside software for coding tasks, in addition to tools from other companies.

Google, Adobe, and DHS did not immediately respond to requests for comment.

The news provides details about how agencies like Immigration and Customs Enforcement, which is part of DHS, might be creating the large amounts of content they’ve shared on X and other channels as immigration operations have expanded across US cities. They’ve posted content celebrating “Christmas after mass deportations,” referenced Bible verses and Christ’s birth, shown faces of those the agency has arrested, and shared ads aimed at recruiting agents. The agencies have also repeatedly used music without permission from artists in their videos.

Some of the content, particularly videos, has the appearance of being AI-generated, but it hasn’t been clear until now what AI models the agencies might be using. This marks the first concrete evidence such generators are being used by DHS to create content shared with the public.

It remains impossible to verify which company helped create a specific piece of content, or indeed whether it was AI-generated at all. Adobe offers options to “watermark” a video made with its tools to disclose that it is AI-generated, for example, but this disclosure does not always stay intact when the content is uploaded and shared across different sites.

The document reveals that DHS has specifically been using Flow, a tool from Google that combines its Veo 3 video generator with a suite of filmmaking tools. Users can generate clips and assemble entire videos with AI, including videos that contain sound, dialogue, and background noise, making them hyperrealistic. Adobe launched its Firefly generator in 2023, promising that it does not use copyrighted content in its training or output. Like Google’s tools, Adobe’s can generate videos, images, soundtracks, and speech. The document does not reveal further details about how the agency is using these video generation tools.

Workers at large tech companies, including more than 140 current and former employees from Google and more than 30 from Adobe, have been putting pressure on their employers in recent weeks to take a stance against ICE and the shooting of Alex Pretti on January 24. Google’s leadership has not made statements in response. In October, Google and Apple removed apps on their app stores that were intended to track sightings of ICE, citing safety risks. 

An additional document released on Wednesday revealed new details about how the agency is using more niche AI products, including a facial recognition app used by ICE, as first reported by 404Media in June.

  •  

Why chatbots are starting to check your age

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

How do tech companies check if their users are kids?

This question has taken on new urgency recently thanks to growing concern about the dangers that can arise when children talk to AI chatbots. For years, Big Tech companies asked for birthdays (which users could make up) to avoid violating child privacy laws, but they weren’t required to moderate content accordingly. Two developments over the last week show how quickly things are changing in the US and how this issue is becoming a new battleground, even among parents and child-safety advocates.

In one corner is the Republican Party, which has supported laws passed in several states that require sites with adult content to verify users’ ages. Critics say this provides cover to block anything deemed “harmful to minors,” which could include sex education. Other states, like California, are coming after AI companies with laws to protect kids who talk to chatbots (by requiring companies to verify which users are kids). Meanwhile, President Trump is attempting to keep AI regulation a national issue rather than allowing states to make their own rules. Support for various bills in Congress is constantly in flux.

So what might happen? The debate is quickly moving away from whether age verification is necessary and toward who will be responsible for it. This responsibility is a hot potato that no company wants to hold.

In a blog post last Tuesday, OpenAI revealed that it plans to roll out automatic age prediction. In short, the company will apply a model that uses factors like the time of day, among others, to predict whether a person chatting is under 18. For those identified as teens or children, ChatGPT will apply filters to “reduce exposure” to content like graphic violence or sexual role-play. YouTube launched something similar last year. 
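As a rough illustration of how this kind of age prediction might work (OpenAI has not published its signals or model, so everything below is an assumption), here is a minimal scikit-learn sketch that scores a user on invented behavioral features such as typical hour of use, session length, and the share of homework-style prompts:

```python
# Purely illustrative sketch of a behavioral age classifier -- not OpenAI's system.
# Features and labels are invented: [typical hour of use, session length in minutes,
# fraction of homework-style prompts].
import numpy as np
from sklearn.linear_model import LogisticRegression

X_train = np.array([
    [15, 25, 0.6],   # after-school use, short sessions, homework-heavy
    [22, 90, 0.0],   # late-night use, long sessions, no homework
    [16, 30, 0.7],
    [10, 60, 0.1],
])
y_train = np.array([1, 0, 1, 0])  # 1 = under 18, 0 = adult

clf = LogisticRegression().fit(X_train, y_train)

# The output is a probability, not a certainty; a threshold turns it into a decision.
new_user = np.array([[21, 45, 0.2]])
p_minor = clf.predict_proba(new_user)[0, 1]
print(f"Estimated probability this user is under 18: {p_minor:.2f}")
print("Apply teen content filters" if p_minor > 0.5 else "Treat as an adult")
```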

If you support age verification but are concerned about privacy, this might sound like a win. But there’s a catch. The system is not perfect, of course, so it could classify a child as an adult or vice versa. People who are wrongly labeled under 18 can verify their identity by submitting a selfie or government ID to a company called Persona. 

Selfie verifications have issues: They fail more often for people of color and those with certain disabilities. Sameer Hinduja, who co-directs the Cyberbullying Research Center, says the fact that Persona will need to hold millions of government IDs and masses of biometric data is another weak point. “When those get breached, we’ve exposed massive populations all at once,” he says. 

Hinduja instead advocates for device-level verification, where a parent specifies a child’s age when setting up the child’s phone for the first time. This information is then kept on the device and shared securely with apps and websites. 

That’s more or less the approach Tim Cook, the CEO of Apple, recently lobbied US lawmakers to support. Cook was fighting lawmakers who wanted to require app stores to verify ages, which would saddle Apple with lots of liability.

More signals of where this is all headed will come on Wednesday, when the Federal Trade Commission—the agency that would be responsible for enforcing these new laws—is holding an all-day workshop on age verification. Apple’s head of government affairs, Nick Rossi, will be there. He’ll be joined by higher-ups in child safety at Google and Meta, as well as a company that specializes in marketing to children.

The FTC has become increasingly politicized under President Trump (his firing of the sole Democratic commissioner was struck down by a federal court, a decision that is now pending review by the US Supreme Court). In July, I wrote about signals that the agency is softening its stance toward AI companies. Indeed, in December, the FTC overturned a Biden-era ruling against an AI company whose product allowed people to flood the internet with fake product reviews, writing that the ruling clashed with President Trump’s AI Action Plan.

Wednesday’s workshop may shed light on how partisan the FTC’s approach to age verification will be. Red states favor laws that require porn websites to verify ages (but critics warn this could be used to block a much wider range of content). Bethany Soye, a Republican state representative who is leading an effort to pass such a bill in her state of South Dakota, is scheduled to speak at the FTC meeting. The ACLU generally opposes laws requiring IDs to visit websites and has instead advocated for an expansion of existing parental controls.

While all this gets debated, though, AI has set the world of child safety on fire. We’re dealing with increased generation of child sexual abuse material, concerns (and lawsuits) about suicides and self-harm following chatbot conversations, and troubling evidence of kids’ forming attachments to AI companions. Colliding stances on privacy, politics, free expression, and surveillance will complicate any effort to find a solution. Write to me with your thoughts. 

  •  

Why AI predictions are so hard

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

Sometimes AI feels like a niche topic to write about, but then the holidays happen, and I hear relatives of all ages talking about cases of chatbot-induced psychosis, blaming rising electricity prices on data centers, and asking whether kids should have unfettered access to AI. It’s everywhere, in other words. And people are alarmed.

Inevitably, these conversations take a turn: AI is having all these ripple effects now, but if the technology gets better, what happens next? That’s usually when they look at me, expecting a forecast of either doom or hope. 

I probably disappoint, if only because predictions for AI are getting harder and harder to make. 

Despite that, MIT Technology Review has, I must say, a pretty excellent track record of making sense of where AI is headed. We’ve just published a sharp list of predictions for what’s next in 2026 (where you can read my thoughts on the legal battles surrounding AI), and the predictions on last year’s list all came to fruition. But every holiday season, it gets harder and harder to work out the impact AI will have. That’s mostly because of three big unanswered questions.

For one, we don’t know if large language models will continue getting incrementally smarter in the near future. Since this particular technology is what underpins nearly all the excitement and anxiety in AI right now, powering everything from AI companions to customer service agents, its slowdown would be a pretty huge deal. Such a big deal, in fact, that we devoted a whole slate of stories in December to what a new post-AI-hype era might look like. 

Number two, AI is pretty abysmally unpopular among the general public. Here’s just one example: Nearly a year ago, OpenAI’s Sam Altman stood next to President Trump to excitedly announce a $500 billion project to build data centers across the US in order to train larger and larger AI models. The pair either did not guess or did not care that many Americans would staunchly oppose having such data centers built in their communities. A year later, Big Tech is waging an uphill battle to win over public opinion and keep on building. Can it win? 

The response from lawmakers to all this frustration is terribly confused. Trump has pleased Big Tech CEOs by moving to make AI regulation a federal rather than a state issue, and tech companies are now hoping to codify this into law. But the crowd that wants to protect kids from chatbots ranges from progressive lawmakers in California to the increasingly Trump-aligned Federal Trade Commission, each with distinct motives and approaches. Will they be able to put aside their differences and rein AI firms in? 

If the gloomy holiday dinner table conversation gets this far, someone will say: Hey, isn’t AI being used for objectively good things? Making people healthier, unearthing scientific discoveries, better understanding climate change?

Well, sort of. Machine learning, an older form of AI, has long been used in all sorts of scientific research. One branch, called deep learning, forms part of AlphaFold, a Nobel Prize–winning tool for protein prediction that has transformed biology. Image recognition models are getting better at identifying cancerous cells. 

But the track record for chatbots built atop newer large language models is more modest. Technologies like ChatGPT are quite good at analyzing large swathes of research to summarize what’s already been discovered. But some high-profile reports that these sorts of AI models had made a genuine discovery, like solving a previously unsolved mathematics problem, were bogus. They can assist doctors with diagnoses, but they can also encourage people to diagnose their own health problems without consulting doctors, sometimes with disastrous results.

This time next year, we’ll probably have better answers to my family’s questions, and we’ll have a bunch of entirely new questions too. In the meantime, be sure to read our full piece forecasting what will happen this year, featuring predictions from the whole AI team.

  •  

What’s next for AI in 2026

MIT Technology Review’s What’s Next series looks across industries, trends, and technologies to give you a first look at the future. You can read the rest of them here.

In an industry in constant flux, sticking your neck out to predict what’s coming next may seem reckless. (AI bubble? What AI bubble?) But for the last few years we’ve done just that—and we’re doing it again. 

How did we do last time? We picked five hot AI trends to look out for in 2025, including what we called generative virtual playgrounds, a.k.a. world models (check: From Google DeepMind’s Genie 3 to World Labs’s Marble, tech that can generate realistic virtual environments on the fly keeps getting better and better); so-called reasoning models (check: Need we say more? Reasoning models have fast become the new paradigm for best-in-class problem solving); a boom in AI for science (check: OpenAI is now following Google DeepMind by setting up a dedicated team to focus on just that); AI companies that are cozier with national security (check: OpenAI reversed position on the use of its technology for warfare to sign a deal with the defense-tech startup Anduril to help it take down battlefield drones); and legitimate competition for Nvidia (check, kind of: China is going all in on developing advanced AI chips, but Nvidia’s dominance still looks unassailable—for now at least).

So what’s coming in 2026? Here are our big bets for the next 12 months. 

More Silicon Valley products will be built on Chinese LLMs

The last year shaped up as a big one for Chinese open-source models. In January, DeepSeek released R1, its open-source reasoning model, and shocked the world with what a relatively small firm in China could do with limited resources. By the end of the year, “DeepSeek moment” had become a phrase frequently tossed around by AI entrepreneurs, observers, and builders—an aspirational benchmark of sorts. 

It was the first time many people realized they could get a taste of top-tier AI performance without going through OpenAI, Anthropic, or Google.

Open-weight models like R1 allow anyone to download a model and run it on their own hardware. They are also more customizable, letting teams tweak models through techniques like distillation and pruning. This stands in stark contrast to the “closed” models released by major American firms, where core capabilities remain proprietary and access is often expensive.

As a result, Chinese models have become an easy choice. Reports by CNBC and Bloomberg suggest that startups in the US have increasingly recognized and embraced what they can offer.

One popular group of models is Qwen, created by Alibaba, the company behind China’s largest e-commerce platform, Taobao. Qwen2.5-1.5B-Instruct alone has 8.85 million downloads, making it one of the most widely used pretrained LLMs. The Qwen family spans a wide range of model sizes alongside specialized versions tuned for math, coding, vision, and instruction-following, a breadth that has helped it become an open-source powerhouse.
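To get a sense of what “download a model and run it on your own hardware” looks like in practice, here is a minimal sketch using the Hugging Face transformers library to load the Qwen model named above. The prompt is invented, and the snippet assumes you have the transformers, torch, and accelerate packages installed plus enough memory for the weights:

```python
# Minimal sketch: run an open-weight model locally with Hugging Face transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-1.5B-Instruct"  # the open-weight model cited above

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"  # device_map needs `accelerate`
)

# An example prompt (invented), formatted with the model's chat template.
messages = [{"role": "user", "content": "In one paragraph, why do open-weight models matter?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

From there, teams can go further than any closed API allows: fine-tune the weights, prune them, or distill them into something smaller.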

Other Chinese AI firms that were previously unsure about committing to open source are following DeepSeek’s playbook. Standouts include Zhipu’s GLM and Moonshot’s Kimi. The competition has also pushed American firms to open up, at least in part. In August, OpenAI released its first open-source model. In November, the Allen Institute for AI, a Seattle-based nonprofit, released its latest open-source model, Olmo 3. 

Even amid growing US-China antagonism, Chinese AI firms’ near-unanimous embrace of open source has earned them goodwill in the global AI community and a long-term trust advantage. In 2026, expect more Silicon Valley apps to quietly ship on top of Chinese open models, and look for the lag between Chinese releases and the Western frontier to keep shrinking—from months to weeks, and sometimes less.

Caiwei Chen

The US will face another year of regulatory tug-of-war

The battle over regulating artificial intelligence is heading for a showdown. On December 11, President Donald Trump signed an executive order aiming to neuter state AI laws, a move meant to stop states from keeping the growing industry in check. In 2026, expect more political warfare. The White House and states will spar over who gets to govern the booming technology, while AI companies wage a fierce lobbying campaign to crush regulations, armed with the narrative that a patchwork of state laws will smother innovation and hobble the US in the AI arms race against China.

Under Trump’s executive order, states may fear being sued or starved of federal funding if they clash with his vision for light-touch regulation. Big Democratic states like California—which just enacted the nation’s first frontier AI law requiring companies to publish safety testing for their AI models—will take the fight to court, arguing that only Congress can override state laws. But states that can’t afford to lose federal funding, or fear getting in Trump’s crosshairs, might fold. Still, expect to see more state lawmaking on hot-button issues, especially where Trump’s order gives states a green light to legislate. With chatbots accused of triggering teen suicides and data centers sucking up more and more energy, states will face mounting public pressure to push for guardrails.

In place of state laws, Trump promises to work with Congress to establish a federal AI law. Don’t count on it. Congress failed to pass a moratorium on state legislation twice in 2025, and we aren’t holding out hope that it will deliver its own bill this year. 

AI companies like OpenAI and Meta will continue to deploy powerful super-PACs to support political candidates who back their agenda and target those who stand in their way. On the other side, super-PACs supporting AI regulation will build their own war chests to counter. Watch them duke it out at next year’s midterm elections.

The further AI advances, the more people will fight to steer its course, and 2026 will be another year of regulatory tug-of-war—with no end in sight.

Michelle Kim

Chatbots will change the way we shop

Imagine a world in which you have a personal shopper at your disposal 24-7—an expert who can instantly recommend a gift for even the trickiest-to-buy-for friend or relative, or trawl the web to draw up a list of the best bookcases available within your tight budget. Better yet, they can analyze a kitchen appliance’s strengths and weaknesses, compare it with its seemingly identical competition, and find you the best deal. Then once you’re happy with their suggestion, they’ll take care of the purchasing and delivery details too.

But this ultra-knowledgeable shopper isn’t a clued-up human at all—it’s a chatbot. This is no distant prediction, either. Salesforce recently said it anticipates that AI will drive $263 billion in online purchases this holiday season. That’s some 21% of all orders. And experts are betting on AI-enhanced shopping becoming even bigger business within the next few years. By 2030, agentic commerce will account for between $3 trillion and $5 trillion in sales annually, according to research from the consulting firm McKinsey.

Unsurprisingly, AI companies are already heavily invested in making purchasing through their platforms as frictionless as possible. Google’s Gemini app can now tap into the company’s powerful Shopping Graph data set of products and sellers, and can even use its agentic technology to call stores on your behalf. Meanwhile, back in November, OpenAI announced a ChatGPT shopping feature capable of rapidly compiling buyer’s guides, and the company has struck deals with Walmart, Target, and Etsy to allow shoppers to buy products directly within chatbot interactions. 

Expect plenty more of these kinds of deals to be struck within the next year as consumer time spent chatting with AI keeps on rising, and web traffic from search engines and social media continues to plummet. 

Rhiannon Williams

An LLM will make an important new discovery

I’m going to hedge here, right out of the gate. It’s no secret that large language models spit out a lot of nonsense. Short of monkeys-and-typewriters luck, LLMs won’t discover anything by themselves. But LLMs do still have the potential to extend the bounds of human knowledge.

We got a glimpse of how this could work in May, when Google DeepMind revealed AlphaEvolve, a system that used the firm’s Gemini LLM to come up with new algorithms for solving unsolved problems. The breakthrough was to combine Gemini with an evolutionary algorithm that checked its suggestions, picked the best ones, and fed them back into the LLM to make them even better.
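Google DeepMind has published only a high-level description, but the loop Will describes can be sketched in a few lines. In the sketch below, propose_variants (the LLM call) and score (the automatic checker) are hypothetical stand-ins, not AlphaEvolve’s actual components:

```python
# Minimal sketch of an LLM-driven evolutionary search, loosely modeled on the loop
# described above. `propose_variants` and `score` are hypothetical placeholders.
def evolve(seed_programs, propose_variants, score, generations=10, population_size=20):
    population = list(seed_programs)
    for _ in range(generations):
        # Ask the LLM to mutate/recombine the current best candidates.
        candidates = propose_variants(population)
        # Check every candidate with an automatic evaluator and keep the top scorers.
        ranked = sorted(population + candidates, key=score, reverse=True)
        population = ranked[:population_size]
        # The survivors are fed back to the LLM on the next pass, so new suggestions
        # build on whatever worked best so far.
    return population[0]  # best candidate found
```

The checker is doing much of the work here: it filters out the LLM’s plentiful nonsense so that only verifiably better candidates survive each round.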

Google DeepMind used AlphaEvolve to come up with more efficient ways to manage power consumption by data centers and Google’s TPU chips. Those discoveries are significant but not game-changing. Yet. Researchers at Google DeepMind are now pushing their approach to see how far it will go.

And others have been quick to follow their lead. A week after AlphaEvolve came out, Asankhaya Sharma, an AI engineer in Singapore, shared OpenEvolve, an open-source version of Google DeepMind’s tool. In September, the Japanese firm Sakana AI released a version of the software called ShinkaEvolve. And in November, a team of US and Chinese researchers revealed AlphaResearch, which they claim improves on one of AlphaEvolve’s already better-than-human math solutions.

There are alternative approaches too. For example, researchers at the University of Colorado Denver are trying to make LLMs more inventive by tweaking the way so-called reasoning models work. They have drawn on what cognitive scientists know about creative thinking in humans to push reasoning models toward solutions that are more outside the box than their typical safe-bet suggestions.

Hundreds of companies are spending billions of dollars looking for ways to get AI to crack unsolved math problems, speed up computers, and come up with new drugs and materials. Now that AlphaEvolve has shown what’s possible with LLMs, expect activity on this front to ramp up fast.    

Will Douglas Heaven

Legal fights will heat up

For a while, lawsuits against AI companies were pretty predictable: Rights holders like authors or musicians would sue companies that trained AI models on their work, and the courts generally found in favor of the tech giants. AI’s upcoming legal battles will be far messier.

The fights center on thorny, unresolved questions: Can AI companies be held liable for what their chatbots encourage people to do, as when they help teens plan suicides? If a chatbot spreads patently false information about you, can its creator be sued for defamation? If companies lose these cases, will insurers shun AI companies as clients?

In 2026, we’ll start to see the answers to these questions, in part because some notable cases will go to trial (the family of a teen who died by suicide will bring OpenAI to court in November).

At the same time, the legal landscape will be further complicated by President Trump’s executive order from December—see Michelle’s item above for more details on the brewing regulatory storm.

No matter what, we’ll see a dizzying array of lawsuits in all directions (not to mention some judges even turning to AI amid the deluge).

James O’Donnell

  •  

AI Wrapped: The 14 AI terms you couldn’t avoid in 2025

If the past 12 months have taught us anything, it’s that the AI hype train is showing no signs of slowing. It’s hard to believe that at the beginning of the year, DeepSeek had yet to turn the entire industry on its head, Meta was better known for trying (and failing) to make the metaverse cool than for its relentless quest to dominate superintelligence, and vibe coding wasn’t a thing.

If that’s left you feeling a little confused, fear not. As we near the end of 2025, our writers have taken a look back over the AI terms that dominated the year, for better or worse.

Make sure you take the time to brace yourself for what promises to be another bonkers year.

—Rhiannon Williams

1. Superintelligence


As long as people have been hyping AI, they have been coming up with names for a future, ultra-powerful form of the technology that could bring about utopian or dystopian consequences for humanity. “Superintelligence” is the latest hot term. Meta announced in July that it would form an AI team to pursue superintelligence, and it was reportedly offering nine-figure compensation packages to AI experts from the company’s competitors to join.

In December, Microsoft’s head of AI followed suit, saying the company would be spending big sums, perhaps hundreds of billions, on the pursuit of superintelligence. If you think superintelligence is as vaguely defined as artificial general intelligence, or AGI, you’d be right! While it’s conceivable that these sorts of technologies will be feasible in humanity’s long run, the question is really when, and whether today’s AI is good enough to be treated as a stepping stone toward something like superintelligence. Not that that will stop the hype kings. —James O’Donnell

2. Vibe coding

Thirty years ago, Steve Jobs said everyone in America should learn how to program a computer. Today, people with zero knowledge of how to code can knock up an app, game, or website in no time at all thanks to vibe coding—a catch-all phrase coined by OpenAI cofounder Andrej Karpathy. To vibe-code, you simply prompt a generative AI coding assistant to create the digital object of your desire and accept pretty much everything it spits out. Will the result work? Possibly not. Will it be secure? Almost definitely not, but the technique’s biggest champions aren’t letting those minor details stand in their way. Also—it sounds fun! —Rhiannon Williams

3. Chatbot psychosis

One of the biggest AI stories over the past year has been how prolonged interactions with chatbots can cause vulnerable people to experience delusions and, in some extreme cases, can either cause or worsen psychosis. Although “chatbot psychosis” is not a recognized medical term, researchers are paying close attention to the growing anecdotal evidence from users who say it’s happened to them or someone they know. Sadly, the increasing number of lawsuits filed against AI companies by the families of people who died following their conversations with chatbots demonstrates the technology’s potentially deadly consequences. —Rhiannon Williams

4. Reasoning

Few things kept the AI hype train going this year more than so-called reasoning models, LLMs that can break down a problem into multiple steps and work through them one by one. OpenAI released its first reasoning models, o1 and o3, a year ago.

A month later, the Chinese firm DeepSeek took everyone by surprise with a very fast follow, putting out R1, the first open-source reasoning model. In no time, reasoning models became the industry standard: All major mass-market chatbots now come in flavors backed by this tech. Reasoning models have pushed the envelope of what LLMs can do, matching top human performances in prestigious math and coding competitions. On the flip side, all the buzz about LLMs that could “reason” reignited old debates about how smart LLMs really are and how they really work. Like “artificial intelligence” itself, “reasoning” is technical jargon dressed up with marketing sparkle. Choo choo! —Will Douglas Heaven

5. World models 

For all their uncanny facility with language, LLMs have very little common sense. Put simply, they don’t have any grounding in how the world works. Book learners in the most literal sense, LLMs can wax lyrical about everything under the sun and then fall flat with a howler about how many elephants you could fit into an Olympic swimming pool (exactly one, according to one of Google DeepMind’s LLMs).

World models—a broad church encompassing various technologies—aim to give AI some basic common sense about how stuff in the world actually fits together. In their most vivid form, world models like Google DeepMind’s Genie 3 and Marble, the much-anticipated new tech from Fei-Fei Li’s startup World Labs, can generate detailed and realistic virtual worlds for robots to train in and more. Yann LeCun, Meta’s former chief scientist, is also working on world models. He has been trying to give AI a sense of how the world works for years, by training models to predict what happens next in videos. This year he quit Meta to focus on this approach in a new startup called Advanced Machine Intelligence Labs. If all goes well, world models could be the next thing. —Will Douglas Heaven

6. Hyperscalers

Have you heard about all the people saying no thanks, we actually don’t want a giant data center plopped in our backyard? The data centers in question—which tech companies want to build everywhere, including space—are typically referred to as hyperscalers: massive buildings purpose-built for AI operations and used by the likes of OpenAI and Google to build bigger and more powerful AI models. Inside such buildings, the world’s best chips hum away training and fine-tuning models, and the facilities are designed to be modular and grow according to needs.

It’s been a big year for hyperscalers. OpenAI announced, alongside President Donald Trump, its Stargate project, a $500 billion joint venture to pepper the country with the largest data centers ever. But it leaves almost everyone else asking: What exactly do we get out of it? Consumers worry the new data centers will raise their power bills. Such buildings generally struggle to run on renewable energy. And they don’t tend to create all that many jobs. But hey, maybe these massive, windowless buildings could at least give a moody, sci-fi vibe to your community. —James O’Donnell

7. Bubble

The lofty promises of AI are levitating the economy. AI companies are raising eye-popping sums of money and watching their valuations soar into the stratosphere. They’re pouring hundreds of billions of dollars into chips and data centers, financed increasingly by debt and eyebrow-raising circular deals. Meanwhile, the companies leading the gold rush, like OpenAI and Anthropic, might not turn a profit for years, if ever. Investors are betting big that AI will usher in a new era of riches, yet no one knows how transformative the technology will actually be.

Most organizations using AI aren’t yet seeing the payoff, and AI work slop is everywhere. There’s scientific uncertainty about whether scaling LLMs will deliver superintelligence or whether new breakthroughs need to pave the way. But unlike their predecessors in the dot-com bubble, AI companies are showing strong revenue growth, and some are even deep-pocketed tech titans like Microsoft, Google, and Meta. Will the manic dream ever burst? —Michelle Kim

8. Agentic

This year, AI agents were everywhere. Every new feature announcement, model drop, or security report throughout 2025 was peppered with mentions of them, even though plenty of AI companies and experts disagree on exactly what counts as being truly “agentic,” a vague term if ever there was one. No matter that it’s virtually impossible to guarantee that an AI acting on your behalf out in the wide web will always do exactly what it’s supposed to do—it seems as though agentic AI is here to stay for the foreseeable. Want to sell something? Call it agentic! —Rhiannon Williams

9. Distillation

Early this year, DeepSeek unveiled its new model DeepSeek R1, an open-source reasoning model that matches top Western models but costs a fraction of the price. Its launch freaked Silicon Valley out, as many suddenly realized for the first time that huge scale and resources were not necessarily the key to high-level AI models. Nvidia stock plunged by 17% in the week after R1 was released.

The key to R1’s success was distillation, a technique that makes AI models more efficient. It works by getting a bigger model to tutor a smaller model: You run the teacher model on a lot of examples and record the answers, and reward the student model as it copies those responses as closely as possible, so that it gains a compressed version of the teacher’s knowledge.  —Caiwei Chen
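As a rough illustration of that teacher-student recipe (not DeepSeek’s actual training code), here is a minimal PyTorch sketch in which a student model is nudged toward a teacher’s softened output distribution; teacher, student, batch, and optimizer are assumed to be already-loaded, tokenizer-compatible objects:

```python
# Minimal sketch of knowledge distillation with a soft-label loss.
# `teacher`, `student`, `batch`, and `optimizer` are assumed to exist already.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Soften both distributions, then pull the student toward the teacher's answers.
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    # KL divergence, scaled by T^2 as in the standard distillation recipe.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * temperature ** 2

def train_step(student, teacher, batch, optimizer):
    with torch.no_grad():
        teacher_logits = teacher(**batch).logits   # record the teacher's answers
    student_logits = student(**batch).logits       # the student tries to copy them
    loss = distillation_loss(student_logits, teacher_logits)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```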

10. Sycophancy

As people across the world spend increasing amounts of time interacting with chatbots like ChatGPT, chatbot makers are struggling to work out the kind of tone and “personality” the models should adopt. Back in April, OpenAI admitted it’d struck the wrong balance between helpful and sniveling, saying a new update had rendered GPT-4o too sycophantic. Having it suck up to you isn’t just irritating—it can mislead users by reinforcing their incorrect beliefs and spreading misinformation. So consider this your reminder to take everything—yes, everything—LLMs produce with a pinch of salt. —Rhiannon Williams

11. Slop

If there is one AI-related term that has fully escaped the nerd enclosures and entered public consciousness, it’s “slop.” The word itself is old (think pig feed), but “slop” is now commonly used to refer to low-effort, mass-produced content generated by AI, often optimized for online traffic. A lot of people even use it as a shorthand for any AI-generated content. It has felt inescapable in the past year: We have been marinated in it, from fake biographies to shrimp Jesus images to surreal human-animal hybrid videos.

But people are also having fun with it. The term’s sardonic flexibility has made it easy for internet users to slap it on all kinds of words as a suffix to describe anything that lacks substance and is absurdly mediocre: think “work slop” or “friend slop.” As the hype cycle resets, “slop” marks a cultural reckoning about what we trust, what we value as creative labor, and what it means to be surrounded by stuff that was made for engagement rather than expression. —Caiwei Chen

12. Physical intelligence

Did you come across the hypnotizing video from earlier this year of a humanoid robot putting away dishes in a bleak, gray-scale kitchen? That pretty much embodies physical intelligence: the idea that advances in AI can help robots better move around the physical world.

It’s true that robots have been able to learn new tasks faster than ever before, everywhere from operating rooms to warehouses. Self-driving-car companies have seen improvements in how they simulate the roads, too. That said, it’s still wise to be skeptical that AI has revolutionized the field. Consider, for example, that many robots advertised as butlers in your home are doing the majority of their tasks thanks to remote operators in the Philippines.

The road ahead for physical intelligence is also sure to be weird. Large language models train on text, which is abundant on the internet, but robots learn more from videos of people doing things. That’s why the robot company Figure suggested in September that it would pay people to film themselves in their apartments doing chores. Would you sign up? —James O’Donnell

13. Fair use

AI models are trained by devouring millions of words and images across the internet, including copyrighted work by artists and writers. AI companies argue this is “fair use”—a legal doctrine that lets you use copyrighted material without permission if you transform it into something new that doesn’t compete with the original. Courts are starting to weigh in. In June, Anthropic’s training of its AI model Claude on a library of books was ruled fair use because the technology was “exceedingly transformative.”

That same month, Meta scored a similar win, but only because the authors couldn’t show that the company’s literary buffet cut into their paychecks. As copyright battles brew, some creators are cashing in on the feast. In December, Disney signed a splashy deal with OpenAI to let users of Sora, the AI video platform, generate videos featuring more than 200 characters from Disney’s franchises. Meanwhile, governments around the world are rewriting copyright rules for the content-guzzling machines. Is training AI on copyrighted work fair use? As with any billion-dollar legal question, it depends. —Michelle Kim

14. GEO

Just a few short years ago, an entire industry was built around helping websites rank highly in search results (okay, just in Google). Now search engine optimization (SEO) is giving way to GEO—generative engine optimization—as the AI boom forces brands and businesses to scramble to maximize their visibility in AI, whether that’s in AI-enhanced search results like Google’s AI Overviews or within responses from LLMs. It’s no wonder they’re freaked out. We already know that news companies have experienced a colossal drop in search-driven web traffic, and AI companies are working on ways to cut out the middleman and allow their users to visit sites from directly within their platforms. It’s time to adapt or die. —Rhiannon Williams

  •  

A brief history of Sam Altman’s hype

Nearly every time you’ve heard a borderline outlandish idea of what AI will be capable of, it turns out that Sam Altman was, if not the first to articulate it, at least the most persuasive and influential voice behind it.

For more than a decade he has been known in Silicon Valley as a world-class fundraiser and persuader. OpenAI’s early releases around 2020 set the stage for a mania around large language models, and the launch of ChatGPT in November 2022 granted Altman a world stage on which to present his new thesis: that these models mirror human intelligence and could swing the doors open to a healthier and wealthier techno-utopia.


This story is part of MIT Technology Review’s Hype Correction package, a series that resets expectations about what AI is, what it makes possible, and where we go next.


Throughout, Altman’s words have set the agenda. He has framed a prospective superintelligent AI as either humanistic or catastrophic, depending on what effect he was hoping to create, what he was raising money for, or which tech giant seemed like his most formidable competitor at the moment. 

Examining Altman’s statements over the years reveals just how much his outlook has powered today’s AI boom. Even among Silicon Valley’s many hypesters, he’s been especially willing to speak about open questions—whether large language models contain the ingredients of human thought, whether language can also produce intelligence—as if they were already answered. 

What he says about AI is rarely provable when he says it, but it persuades us of one thing: This road we’re on with AI can go somewhere either great or terrifying, and OpenAI will need epic sums to steer it toward the right destination. In this sense, he is the ultimate hype man.

To understand how his voice has shaped our understanding of what AI can do, we read almost everything he’s ever said about the technology (we requested an interview with Altman, but he was not made available). 

His own words trace how we arrived here.

In conclusion … 

Altman didn’t dupe the world. OpenAI has ushered in a genuine tech revolution, with increasingly impressive language models that have attracted millions of users. Even skeptics would concede that LLMs’ conversational ability is astonishing.

But Altman’s hype has always hinged less on today’s capabilities than on a philosophical tomorrow—an outlook that quite handily doubles as a case for more capital and friendlier regulation. Long before large language models existed, he was imagining an AI powerful enough to require wealth redistribution, just as he imagined humanity colonizing other planets. Again and again, promises of a destination—abundance, superintelligence, a healthier and wealthier world—have come first, and the evidence second. 

Even if LLMs eventually hit a wall, there’s little reason to think his faith in a techno-utopian future will falter. The vision was never really about the particulars of the current model anyway. 

  •  

An AI model trained on prison phone calls now looks for planned crimes in those calls

A US telecom company trained an AI model on years of inmates’ phone and video calls and is now piloting that model to scan their calls, texts, and emails in the hope of predicting and preventing crimes. 

Securus Technologies president Kevin Elder told MIT Technology Review that the company began building its AI tools in 2023, using its massive database of recorded calls to train AI models to detect criminal activity. It created one model, for example, using seven years of calls made by inmates in the Texas prison system, but it has been working on building other state- or county-specific models.

Over the past year, Elder says, Securus has been piloting the AI tools to monitor inmate conversations in real time. The company declined to specify where this is taking place, but its customers include jails holding people awaiting trial and prisons for those serving sentences. Some of these facilities using Securus technology also have agreements with Immigration and Customs Enforcement to detain immigrants, though Securus does not contract with ICE directly.

“We can point that large language model at an entire treasure trove [of data],” Elder says, “to detect and understand when crimes are being thought about or contemplated, so that you’re catching it much earlier in the cycle.”

As with its other monitoring tools, investigators at detention facilities can deploy the AI features to monitor randomly selected conversations or those of individuals suspected of criminal activity, according to Elder. The model will analyze phone and video calls, text messages, and emails and then flag sections for human agents to review. These agents then send them to investigators for follow-up.

In an interview, Elder said Securus’ monitoring efforts have helped disrupt human trafficking and gang activities organized from within prisons, among other crimes, and said its tools are also used to identify prison staff who are bringing in contraband. But the company did not provide MIT Technology Review with any cases specifically uncovered by its new AI models. 

People in prison, and those they call, are notified that their conversations are recorded. But this doesn’t mean they’re aware that those conversations could be used to train an AI model, says Bianca Tylek, executive director of the prison rights advocacy group Worth Rises. 

“That’s coercive consent; there’s literally no other way you can communicate with your family,” Tylek says. And since inmates in the vast majority of states pay for these calls, she adds, “not only are you not compensating them for the use of their data, but you’re actually charging them while collecting their data.”

A Securus spokesperson said the use of data to train the tool “is not focused on surveilling or targeting specific individuals, but rather on identifying broader patterns, anomalies, and unlawful behaviors across the entire communication system.” They added that correctional facilities determine their own recording and monitoring policies, which Securus follows, and did not directly answer whether inmates can opt out of having their recordings used to train AI.

Other advocates for inmates say Securus has a history of violating their civil liberties. For example, leaks of its recordings databases showed the company had improperly recorded thousands of calls between inmates and their attorneys. Corene Kendrick, the deputy director of the ACLU's National Prison Project, says the new AI tools enable invasive surveillance, and that courts have set few limits on this power.

“[Are we] going to stop crime before it happens because we’re monitoring every utterance and thought of incarcerated people?” Kendrick says. “I think this is one of many situations where the technology is way far ahead of the law.”

The company spokesperson said the tool’s function is to make monitoring more efficient amid staffing shortages, “not to surveil individuals without cause.”

Securus will have an easier time funding its AI tool thanks to the company’s recent win in a battle with regulators over how telecom companies can spend the money they collect from inmates’ calls.

In 2024, the Federal Communications Commission issued a major reform, shaped and lauded by advocates for prisoners’ rights, that forbade telecoms from passing the costs of recording and surveilling calls on to inmates. Companies were allowed to continue to charge inmates a capped rate for calls, but prisons and jails were ordered to pay for most security costs out of their own budgets.

Negative reactions to this change were swift. Associations of sheriffs (who typically run county jails) complained they could no longer afford proper monitoring of calls, and attorneys general from 14 states sued over the ruling. Some prisons and jails warned they would cut off access to phone calls. 

While it was building and piloting its AI tool, Securus held meetings with the FCC and lobbied for a rule change, arguing that the 2024 reform went too far and asking that the agency again allow companies to use fees collected from inmates to pay for security. 

In June, Brendan Carr, whom President Donald Trump appointed to lead the FCC, said the agency would postpone all deadlines for jails and prisons to adopt the 2024 reforms, and even signaled that it wants to help telecom companies fund their AI surveillance efforts with the fees paid by inmates. In a press release, Carr wrote that rolling back the 2024 reforms would “lead to broader adoption of beneficial public safety tools that include advanced AI and machine learning.”

On October 28, the agency went further: It voted to pass new, higher rate caps and to allow companies like Securus to pass security costs related to recording and monitoring calls (such as storing recordings, transcribing them, or building AI tools to analyze them) on to inmates. A spokesperson for Securus told MIT Technology Review that the company aims to balance affordability with the need to fund essential safety and security tools. “These tools, which include our advanced monitoring and AI capabilities, are fundamental to maintaining secure facilities for incarcerated individuals and correctional staff and to protecting the public,” they wrote.

FCC commissioner Anna Gomez dissented in last month’s ruling. “Law enforcement,” she wrote in a statement, “should foot the bill for unrelated security and safety costs, not the families of incarcerated people.”

The FCC will be seeking comment on these new rules before they take final effect. 

This story was updated on December 2 to clarify that Securus does not contract with ICE facilities.

  •  

The State of AI: How war will be changed forever

Welcome back to The State of AI, a new collaboration between the Financial Times and MIT Technology Review. Every Monday, writers from both publications debate one aspect of the generative AI revolution reshaping global power.

In this conversation, Helen Warrell, FT investigations reporter and former defense and security editor, and James O’Donnell, MIT Technology Review’s senior AI reporter, consider the ethical quandaries and financial incentives around AI’s use by the military.

Helen Warrell, FT investigations reporter 

It is July 2027, and China is on the brink of invading Taiwan. Autonomous drones with AI targeting capabilities are primed to overpower the island’s air defenses as a series of crippling AI-generated cyberattacks cut off energy supplies and key communications. In the meantime, a vast disinformation campaign enacted by an AI-powered pro-Chinese meme farm spreads across global social media, deadening the outcry at Beijing’s act of aggression.

Scenarios such as this have brought dystopian horror to the debate about the use of AI in warfare. Military commanders hope for a digitally enhanced force that is faster and more accurate than human-directed combat. But there are fears that as AI assumes an increasingly central role, these same commanders will lose control of a conflict that escalates too quickly and lacks ethical or legal oversight. Henry Kissinger, the former US secretary of state, spent his final years warning about the coming catastrophe of AI-driven warfare.

Grasping and mitigating these risks is the military priority—some would say the “Oppenheimer moment”—of our age. One emerging consensus in the West is that decisions around the deployment of nuclear weapons should not be outsourced to AI. UN secretary-general António Guterres has gone further, calling for an outright ban on fully autonomous lethal weapons systems. It is essential that regulation keep pace with evolving technology. But in the sci-fi-fueled excitement, it is easy to lose track of what is actually possible. As researchers at Harvard’s Belfer Center point out, AI optimists often underestimate the challenges of fielding fully autonomous weapon systems. It is entirely possible that the capabilities of AI in combat are being overhyped.

Anthony King, director of the Strategy and Security Institute at the University of Exeter and a key proponent of this argument, suggests that rather than replacing humans, AI will be used to improve military insight. Even if the character of war is changing and remote technology is refining weapon systems, he insists, “the complete automation of war itself is simply an illusion.”

Of the three current military use cases of AI, none involves full autonomy. It is being developed for planning and logistics; for cyber warfare (including sabotage, espionage, hacking, and information operations); and, most controversially, for weapons targeting, an application already in use on the battlefields of Ukraine and Gaza. Kyiv’s troops use AI software to direct drones able to evade Russian jammers as they close in on sensitive sites. The Israel Defense Forces have developed an AI-assisted decision support system known as Lavender, which has helped identify around 37,000 potential human targets within Gaza.

There is clearly a danger that the Lavender database replicates the biases of the data it is trained on. But military personnel carry biases too. One Israeli intelligence officer who used Lavender claimed to have more faith in the fairness of a “statistical mechanism” than that of a grieving soldier.

Tech optimists designing AI weapons even deny that specific new rules are needed to govern their capabilities. Keith Dear, a former UK military officer who now runs the strategic forecasting company Cassi AI, says existing laws are more than sufficient: “You make sure there’s nothing in the training data that might cause the system to go rogue … when you are confident you deploy it—and you, the human commander, are responsible for anything they might do that goes wrong.”

It is an intriguing thought that some of the fear and shock about the use of AI in war may come from those who are unfamiliar with brutal but realistic military norms. What do you think, James? Is some opposition to AI in warfare less about the use of autonomous systems than an argument against war itself?

James O’Donnell replies:

Hi Helen, 

One thing I’ve noticed is that there’s been a drastic shift in AI companies’ attitudes toward military applications of their products. At the start of 2024, OpenAI unambiguously forbade the use of its tools for warfare, but by the end of the year, it had signed an agreement with Anduril to help it take down drones on the battlefield.

This step—not a fully autonomous weapon, to be sure, but very much a battlefield application of AI—marked a drastic change in how much tech companies could publicly link themselves with defense. 

What happened along the way? For one thing, it’s the hype. We’re told AI will not just bring superintelligence and scientific discovery but also make warfare sharper, more accurate and calculated, and less prone to human fallibility. I spoke, for example, with US Marines who, while patrolling the South Pacific, tested an AI tool advertised as able to analyze foreign intelligence faster than a human could.

Secondly, money talks. OpenAI and others need to start recouping some of the unimaginable amounts of cash they’re spending on training and running these models, and few have deeper pockets than the Pentagon. Europe’s defense heads seem keen to splash the cash too. Meanwhile, venture capital funding for defense tech this year has already doubled the total for all of 2024, as VCs hope to cash in on militaries’ newfound willingness to buy from startups.

I do think the opposition to AI warfare falls into a few camps, one of which simply rejects the idea that more precise targeting (if it’s actually more precise at all) will mean fewer casualties rather than just more war. Consider the first era of drone warfare in Afghanistan: as drone strikes became cheaper to carry out, can we really say they reduced carnage, or did they merely enable more destruction per dollar?

But the second camp of criticism (and now I’m finally getting to your question) comes from people who are well versed in the realities of war but have very specific complaints about the technology’s fundamental limitations. Missy Cummings, for example, is a former fighter pilot for the US Navy who is now a professor of engineering and computer science at George Mason University. She has been outspoken in her belief that large language models, specifically, are prone to make huge mistakes in military settings.

The typical response to this complaint is that AI’s outputs are human-checked. But if an AI model relies on thousands of inputs for its conclusion, can that conclusion really be checked by one person?

Tech companies are making extraordinarily big promises about what AI can do in these high-stakes applications, all while pressure to implement them is sky high. For me, this means it’s time for more skepticism, not less. 

Helen responds:

Hi James, 

We should definitely continue to question the safety of AI warfare systems and the oversight to which they’re subjected—and hold political leaders to account in this area. I am suggesting that we also apply some skepticism to what you rightly describe as the “extraordinarily big promises” made by some companies about what AI might be able to achieve on the battlefield. 

There will be both opportunities and hazards in what the military is being offered by a relatively nascent (though booming) defense tech scene. The danger is that in the speed and secrecy of an arms race in AI weapons, these emerging capabilities may not receive the scrutiny and debate they desperately need.

Further reading:

Michael C. Horowitz, director of Perry World House at the University of Pennsylvania, explains the need for responsibility in the development of military AI systems in this FT op-ed.

The FT’s tech podcast asks what Israel’s defense tech ecosystem can tell us about the future of warfare.

This MIT Technology Review story analyzes how OpenAI completed its pivot to allowing its technology on the battlefield.

MIT Technology Review also uncovered how US soldiers are using generative AI to help scour thousands of pieces of open-source intelligence.

  •