Normal view

There are new articles available, click to refresh the page.
Before yesterdayMain stream

AI in space: Karpathy suggests AI chatbots as interstellar messengers to alien civilizations

3 May 2024 at 15:04
Close shot of Cosmonaut astronaut dressed in a gold jumpsuit and helmet, illuminated by blue and red lights, holding a laptop, looking up.

Enlarge (credit: Getty Images)

On Thursday, renowned AI researcher Andrej Karpathy, formerly of OpenAI and Tesla, tweeted a lighthearted proposal that large language models (LLMs) like the one that runs ChatGPT could one day be modified to operate in or be transmitted to space, potentially to communicate with extraterrestrial life. He said the idea was "just for fun," but with his influential profile in the field, the idea may inspire others in the future.

Karpathy's bona fides in AI almost speak for themselves, receiving a PhD from Stanford under computer scientist Dr. Fei-Fei Li in 2015. He then became one of the founding members of OpenAI as a research scientist, then served as senior director of AI at Tesla between 2017 and 2022. In 2023, Karpathy rejoined OpenAI for a year, leaving this past February. He's posted several highly regarded tutorials covering AI concepts on YouTube, and whenever he talks about AI, people listen.

Most recently, Karpathy has been working on a project called "llm.c" that implements the training process for OpenAI's 2019 GPT-2 LLM in pure C, dramatically speeding up the process and demonstrating that working with LLMs doesn't necessarily require complex development environments. The project's streamlined approach and concise codebase sparked Karpathy's imagination.

Read 20 remaining paragraphs | Comments

Anthropic releases Claude AI chatbot iOS app

1 May 2024 at 17:36
The Claude AI iOS app running on an iPhone.

Enlarge / The Claude AI iOS app running on an iPhone. (credit: Anthropic)

On Wednesday, Anthropic announced the launch of an iOS mobile app for its Claude 3 AI language models that are similar to OpenAI's ChatGPT. It also introduced a new subscription tier designed for group collaboration. Before the app launch, Claude was only available through a website, an API, and other apps that integrated Claude through API.

Like the ChatGPT app, Claude's new mobile app serves as a gateway to chatbot interactions, and it also allows uploading photos for analysis. While it's only available on Apple devices for now, Anthropic says that an Android app is coming soon.

Anthropic rolled out the Claude 3 large language model (LLM) family in March, featuring three different model sizes: Claude Opus, Claude Sonnet, and Claude Haiku. Currently, the app uses Sonnet for regular users and Opus for Pro users.

Read 3 remaining paragraphs | Comments

Here’s your chance to own a decommissioned US government supercomputer

30 April 2024 at 17:52
A photo of the Cheyenne supercomputer, which is now up for auction.

Enlarge / A photo of the Cheyenne supercomputer, which is now up for auction. (credit: US General Services Administration)

On Tuesday, the US General Services Administration began an auction for the decommissioned Cheyenne supercomputer, located in Cheyenne, Wyoming. The 5.34-petaflop supercomputer ranked as the 20th most powerful in the world at the time of its installation in 2016. Bidding started at $2,500, but it's price is currently $27,643 with the reserve not yet met.

The supercomputer, which officially operated between January 12, 2017, and December 31, 2023, at the NCAR-Wyoming Supercomputing Center, was a powerful (and once considered energy-efficient) system that significantly advanced atmospheric and Earth system sciences research.

"In its lifetime, Cheyenne delivered over 7 billion core-hours, served over 4,400 users, and supported nearly 1,300 NSF awards," writes the University Corporation for Atmospheric Research (UCAR) on its official Cheyenne information page. "It played a key role in education, supporting more than 80 university courses and training events. Nearly 1,000 projects were awarded for early-career graduate students and postdocs. Perhaps most tellingly, Cheyenne-powered research generated over 4,500 peer-review publications, dissertations and theses, and other works."

Read 5 remaining paragraphs | Comments

Mysterious “gpt2-chatbot” AI model appears suddenly, confuses experts

30 April 2024 at 15:31
Robot fortune teller hand and crystal ball

Enlarge (credit: Getty Images)

On Sunday, word began to spread on social media about a new mystery chatbot named "gpt2-chatbot" that appeared in the LMSYS Chatbot Arena. Some people speculate that it may be a secret test version of OpenAI's upcoming GPT-4.5 or GPT-5 large language model (LLM). The paid version of ChatGPT is currently powered by GPT-4 Turbo.

Currently, the new model is only available for use through the Chatbot Arena website, although in a limited way. In the site's "side-by-side" arena mode where users can purposely select the model, gpt2-chatbot has a rate limit of eight queries per day—dramatically limiting people's ability to test it in detail.

So far, gpt2-chatbot has inspired plenty of rumors online, including that it could be the stealth launch of a test version of GPT-4.5 or even GPT-5—or perhaps a new version of 2019's GPT-2 that has been trained using new techniques. We reached out to OpenAI for comment but did not receive a response by press time. On Monday evening, OpenAI CEO Sam Altman seemingly dropped a hint by tweeting, "i do have a soft spot for gpt2."

Read 14 remaining paragraphs | Comments

Critics question tech-heavy lineup of new Homeland Security AI safety board

29 April 2024 at 16:15
A modified photo of a 1956 scientist carefully bottling

Enlarge (credit: Benj Edwards | Getty Images)

On Friday, the US Department of Homeland Security announced the formation of an Artificial Intelligence Safety and Security Board that consists of 22 members pulled from the tech industry, government, academia, and civil rights organizations. But given the nebulous nature of the term "AI," which can apply to a broad spectrum of computer technology, it's unclear if this group will even be able to agree on what exactly they are safeguarding us from.

President Biden directed DHS Secretary Alejandro Mayorkas to establish the board, which will meet for the first time in early May and subsequently on a quarterly basis.

The fundamental assumption posed by the board's existence, and reflected in Biden's AI executive order from October, is that AI is an inherently risky technology and that American citizens and businesses need to be protected from its misuse. Along those lines, the goal of the group is to help guard against foreign adversaries using AI to disrupt US infrastructure; develop recommendations to ensure the safe adoption of AI tech into transportation, energy, and Internet services; foster cross-sector collaboration between government and businesses; and create a forum where AI leaders to share information on AI security risks with the DHS.

Read 13 remaining paragraphs | Comments

Apple releases eight small AI language models aimed at on-device use

25 April 2024 at 16:55
An illustration of a robot hand tossing an apple to a human hand.

Enlarge (credit: Getty Images)

In the world of AI, what might be called "small language models" have been growing in popularity recently because they can be run on a local device instead of requiring data center-grade computers in the cloud. On Wednesday, Apple introduced a set of tiny source-available AI language models called OpenELM that are small enough to run directly on a smartphone. They're mostly proof-of-concept research models for now, but they could form the basis of future on-device AI offerings from Apple.

Apple's new AI models, collectively named OpenELM for "Open-source Efficient Language Models," are currently available on the Hugging Face under an Apple Sample Code License. Since there are some restrictions in the license, it may not fit the commonly accepted definition of "open source," but the source code for OpenELM is available.

On Tuesday, we covered Microsoft's Phi-3 models, which aim to achieve something similar: a useful level of language understanding and processing performance in small AI models that can run locally. Phi-3-mini features 3.8 billion parameters, but some of Apple's OpenELM models are much smaller, ranging from 270 million to 3 billion parameters in eight distinct models.

Read 7 remaining paragraphs | Comments

Watch: “Behavior Doesn’t Lie:” The Power of ML for Identity Threat Detection and Response

25 April 2024 at 11:54

Traditional security controls like MFA and PAM are bypassed easily by threat actors on a regular basis. Threat actors prefer breaking into organizations using legitimate credentials so they can achieve their goals undetected, often until it is too late. To combat this growing threat, organizations now need to find a way to accurately detect and […]

The post Watch: “Behavior Doesn’t Lie:” The Power of ML for Identity Threat Detection and Response appeared first on RevealSecurity.

The post Watch: “Behavior Doesn’t Lie:” The Power of ML for Identity Threat Detection and Response appeared first on Security Boulevard.

School athletic director arrested for framing principal using AI voice synthesis

25 April 2024 at 11:30
Illustration of a robot speaking.

Enlarge (credit: Getty Images)

On Thursday, Baltimore County Police arrested Pikesville High School's former athletic director, Dazhon Darien, and charged him with using AI to impersonate Principal Eric Eiswert, according to a report by The Baltimore Banner. Police say Darien used AI voice synthesis software to simulate Eiswert's voice, leading the public to believe the principal made racist and antisemitic comments.

The audio clip, posted on a popular Instagram account, contained offensive remarks about "ungrateful Black kids" and their academic performance, as well as a threat to "join the other side" if the speaker received one more complaint from "one more Jew in this community." The recording also mentioned names of staff members, including Darien's nickname "DJ," suggesting they should not have been hired or should be removed "one way or another."

The comments led to significant uproar from students, faculty, and the wider community, many of whom initially believed the principal had actually made the comments. A Pikesville High School teacher named Shaena Ravenell reportedly played a large role in disseminating the audio. While she has not been charged, police indicated that she forwarded the controversial email to a student known for their ability to quickly spread information through social media. This student then escalated the audio's reach, which included sharing it with the media and the NAACP.

Read 5 remaining paragraphs | Comments

Deepfakes in the courtroom: US judicial panel debates new AI evidence rules

24 April 2024 at 16:14
An illustration of a man with a very long nose holding up the scales of justice.

Enlarge (credit: Getty Images)

On Friday, a federal judicial panel convened in Washington, DC, to discuss the challenges of policing AI-generated evidence in court trials, according to a Reuters report. The US Judicial Conference's Advisory Committee on Evidence Rules, an eight-member panel responsible for drafting evidence-related amendments to the Federal Rules of Evidence, heard from computer scientists and academics about the potential risks of AI being used to manipulate images and videos or create deepfakes that could disrupt a trial.

The meeting took place amid broader efforts by federal and state courts nationwide to address the rise of generative AI models (such as those that power OpenAI's ChatGPT or Stability AI's Stable Diffusion), which can be trained on large datasets with the aim of producing realistic text, images, audio, or videos.

In the published 358-page agenda for the meeting, the committee offers up this definition of a deepfake and the problems AI-generated media may pose in legal trials:

Read 9 remaining paragraphs | Comments

Microsoft’s Phi-3 shows the surprising power of small, locally run AI language models

23 April 2024 at 16:47
An illustration of lots of information being compressed into a smartphone with a funnel.

Enlarge (credit: Getty Images)

On Tuesday, Microsoft announced a new, freely available lightweight AI language model named Phi-3-mini, which is simpler and less expensive to operate than traditional large language models (LLMs) like OpenAI's GPT-4 Turbo. Its small size is ideal for running locally, which could bring an AI model of similar capability to the free version of ChatGPT to a smartphone without needing an Internet connection to run it.

The AI field typically measures AI language model size by parameter count. Parameters are numerical values in a neural network that determine how the language model processes and generates text. They are learned during training on large datasets and essentially encode the model's knowledge into quantified form. More parameters generally allow the model to capture more nuanced and complex language-generation capabilities but also require more computational resources to train and run.

Some of the largest language models today, like Google's PaLM 2, have hundreds of billions of parameters. OpenAI's GPT-4 is rumored to have over a trillion parameters but spread over eight 220-billion parameter models in a mixture-of-experts configuration. Both models require heavy-duty data center GPUs (and supporting systems) to run properly.

Read 8 remaining paragraphs | Comments

Microsoft’s VASA-1 can deepfake a person with one photo and one audio track

19 April 2024 at 09:07
A sample image from Microsoft for

Enlarge / A sample image from Microsoft for "VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time." (credit: Microsoft)

On Tuesday, Microsoft Research Asia unveiled VASA-1, an AI model that can create a synchronized animated video of a person talking or singing from a single photo and an existing audio track. In the future, it could power virtual avatars that render locally and don't require video feeds—or allow anyone with similar tools to take a photo of a person found online and make them appear to say whatever they want.

"It paves the way for real-time engagements with lifelike avatars that emulate human conversational behaviors," reads the abstract of the accompanying research paper titled, "VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time." It's the work of Sicheng Xu, Guojun Chen, Yu-Xiao Guo, Jiaolong Yang, Chong Li, Zhenyu Zang, Yizhong Zhang, Xin Tong, and Baining Guo.

The VASA framework (short for "Visual Affective Skills Animator") uses machine learning to analyze a static image along with a speech audio clip. It is then able to generate a realistic video with precise facial expressions, head movements, and lip-syncing to the audio. It does not clone or simulate voices (like other Microsoft research) but relies on an existing audio input that could be specially recorded or spoken for a particular purpose.

Read 11 remaining paragraphs | Comments

Vulnerabilities for AI and ML Applications are Skyrocketing – Source: securityboulevard.com

vulnerabilities-for-ai-and-ml-applications-are-skyrocketing-–-source:-securityboulevard.com

Source: securityboulevard.com – Author: Nathan Eddy The number of AI-related Zero Days has tripled since November 2023, according to the latest findings from Protect AI’s huntr community of over 15,000 maintainers and security researchers. In April 2024 alone, a whopping 48 vulnerabilities have already been uncovered within widely used open source software (OSS) projects such […]

La entrada Vulnerabilities for AI and ML Applications are Skyrocketing – Source: securityboulevard.com se publicó primero en CISO2CISO.COM & CYBER SECURITY GROUP.

LLMs keep leaping with Llama 3, Meta’s newest open-weights AI model

18 April 2024 at 17:04
A group of pink llamas on a pixelated background.

Enlarge (credit: Getty Images | Benj Edwards)

On Thursday, Meta unveiled early versions of its Llama 3 open-weights AI model that can be used to power text composition, code generation, or chatbots. It also announced that its Meta AI Assistant is now available on a website and is going to be integrated into its major social media apps, intensifying the company's efforts to position its products against other AI assistants like OpenAI's ChatGPT, Microsoft's Copilot, and Google's Gemini.

Like its predecessor, Llama 2, Llama 3 is notable for being a freely available, open-weights large language model (LLM) provided by a major AI company. Llama 3 technically does not quality as "open source" because that term has a specific meaning in software (as we have mentioned in other coverage), and the industry has not yet settled on terminology for AI model releases that ship either code or weights with restrictions (you can read Llama 3's license here) or that ship without providing training data. We typically call these releases "open weights" instead.

At the moment, Llama 3 is available in two parameter sizes: 8 billion (8B) and 70 billion (70B), both of which are available as free downloads through Meta's website with a sign-up. Llama 3 comes in two versions: pre-trained (basically the raw, next-token-prediction model) and instruction-tuned (fine-tuned to follow user instructions). Each has a 8,192 token context limit.

Read 8 remaining paragraphs | Comments

OpenAI winds down AI image generator that blew minds and forged friendships in 2022

18 April 2024 at 07:00
An AI-generated image from DALL-E 2 created with the prompt

Enlarge / An AI-generated image from DALL-E 2 created with the prompt "A painting by Grant Wood of an astronaut couple, american gothic style." (credit: AI Pictures That Go Hard / X)

When OpenAI's DALL-E 2 debuted on April 6, 2022, the idea that a computer could create relatively photorealistic images on demand based on just text descriptions caught a lot of people off guard. The launch began an innovative and tumultuous period in AI history, marked by a sense of wonder and a polarizing ethical debate that reverberates in the AI space to this day.

Last week, OpenAI turned off the ability for new customers to purchase generation credits for the web version of DALL-E 2, effectively killing it. From a technological point of view, it's not too surprising that OpenAI recently began winding down support for the service. The 2-year-old image generation model was groundbreaking for its time, but it has since been surpassed by DALL-E 3's higher level of detail, and OpenAI has recently begun rolling out DALL-E 3 editing capabilities.

But for a tight-knit group of artists and tech enthusiasts who were there at the start of DALL-E 2, the service's sunset marks the bittersweet end of a period where AI technology briefly felt like a magical portal to boundless creativity. "The arrival of DALL-E 2 was truly mind-blowing," illustrator Douglas Bonneville told Ars in an interview. "There was an exhilarating sense of unlimited freedom in those first days that we all suspected AI was going to unleash. It felt like a liberation from something into something else, but it was never clear exactly what."

Read 42 remaining paragraphs | Comments

New UK law targets “despicable individuals” who create AI sex deepfakes

16 April 2024 at 10:51
An illustrator's concept of a deepfake.

Enlarge (credit: Getty Images)

On Tuesday, the UK government announced a new law targeting the creation of AI-generated sexually explicit deepfake images. Under the legislation, which has not yet been passed, offenders would face prosecution and an unlimited fine, even if they do not widely share the images but create them with the intent to distress the victim. The government positions the law as part of a broader effort to enhance legal protections for women.

Over the past decade, the rise of deep learning image synthesis technology has made it increasingly easy for people with a consumer PC to create misleading pornography by swapping out the faces of the performers with someone else who has not consented to the act. That practice spawned the term "deepfake" around 2017, named after a Reddit user named "deepfakes" that shared AI-faked porn on the service. Since then, the term has grown to encompass completely new images and video synthesized entirely from scratch, created from neural networks that have been trained on images of the victim.

The problem isn't unique to the UK. In March, deepfake nudes of female middle school classmates in Florida led to charges against two boys ages 13 and 14. The rise of open source image synthesis models like Stable Diffusion since 2022 has increased the urgency among regulators in the US to attempt to contain (or at least punish) the act of creating non-consensual deepfakes. The UK government is on a similar mission.

Read 4 remaining paragraphs | Comments

Words are flowing out like endless rain: Recapping a busy week of LLM news

12 April 2024 at 16:31
An image of a boy amazed by flying letters.

Enlarge / An image of a boy amazed by flying letters. (credit: Getty Images)

Some weeks in AI news are eerily quiet, but during others, getting a grip on the week's events feels like trying to hold back the tide. This week has seen three notable large language model (LLM) releases: Google Gemini Pro 1.5 hit general availability with a free tier, OpenAI shipped a new version of GPT-4 Turbo, and Mistral released a new openly licensed LLM, Mixtral 8x22B. All three of those launches happened within 24 hours starting on Tuesday.

With the help of software engineer and independent AI researcher Simon Willison (who also wrote about this week's hectic LLM launches on his own blog), we'll briefly cover each of the three major events in roughly chronological order, then dig into some additional AI happenings this week.

Gemini Pro 1.5 general release

(credit: Google)

On Tuesday morning Pacific time, Google announced that its Gemini 1.5 Pro model (which we first covered in February) is now available in 180-plus countries, excluding Europe, via the Gemini API in a public preview. This is Google's most powerful public LLM so far, and it's available in a free tier that permits up to 50 requests a day.

Read 14 remaining paragraphs | Comments

The Tech Apocalypse Panic is Driven by AI Boosters, Military Tacticians, and Movies

20 March 2024 at 10:36

There has been a tremendous amount of hand wringing and nervousness about how so-called artificial intelligence might end up destroying the world. The fretting has only gotten worse as a result of a U.S. State Department-commissioned report on the security risk of weaponized AI.

Whether these messages come from popular films like a War Games or The Terminator, reports that in digital simulations AI supposedly favors the nuclear option more than it should, or the idea that AI could assess nuclear threats quicker than humans—all of these scenarios have one thing in common: they end with nukes (almost) being launched because a computer either had the ability to pull the trigger or convinced humans to do so by simulating imminent nuclear threat. The purported risk of AI comes not just from yielding “control" to computers, but also the ability for advanced algorithmic systems to breach cybersecurity measures or manipulate and social engineer people with realistic voice, text, images, video, or digital impersonations

But there is one easy way to avoid a lot of this and prevent a self-inflicted doomsday: don’t give computers the capability to launch devastating weapons. This means both denying algorithms ultimate decision making powers, but it also means building in protocols and safeguards so that some kind of generative AI cannot be used to impersonate or simulate the orders capable of launching attacks. It’s really simple, and we’re by far not the only (or the first) people to suggest the radical idea that we just not integrate computer decision making into many important decisions–from deciding a person’s freedom to launching first or retaliatory strikes with nuclear weapons.


First, let’s define terms. To start, I am using "Artificial Intelligence" purely for expediency and because it is the term most commonly used by vendors and government agencies to describe automated algorithmic decision making despite the fact that it is a problematic term that shields human agency from criticism. What we are talking about here is an algorithmic system, fed a tremendous amount of historical or hypothetical information, that leverages probability and context in order to choose what outcomes are expected based on the data it has been fed. It’s how training algorithmic chatbots on posts from social media resulted in the chatbot regurgitating the racist rhetoric it was trained on. It’s also how predictive policing algorithms reaffirm racially biased policing by sending police to neighborhoods where the police already patrol and where they make a majority of their arrests. From the vantage of the data it looks as if that is the only neighborhood with crime because police don’t typically arrest people in other neighborhoods. As AI expert and technologist Joy Buolamwini has said, "With the adoption of AI systems, at first I thought we were looking at a mirror, but now I believe we're looking into a kaleidoscope of distortion... Because the technologies we believe to be bringing us into the future are actually taking us back from the progress already made."

Military Tactics Shouldn’t Drive AI Use

As EFF wrote in 2018, “Militaries must make sure they don't buy into the machine learning hype while missing the warning label. There's much to be done with machine learning, but plenty of reasons to keep it away from things like target selection, fire control, and most command, control, and intelligence (C2I) roles in the near future, and perhaps beyond that too.” (You can read EFF’s whole 2018 white paper: The Cautious Path to Advantage: How Militaries Should Plan for AI here

Just like in policing, in the military there must be a compelling directive (not to mention the marketing from eager companies hoping to get rich off defense contracts) to constantly be innovating in order to claim technical superiority. But integrating technology for innovation’s sake alone creates a great risk of unforeseen danger. AI-enhanced targeting is liable to get things wrong. AI can be fooled or tricked. It can be hacked. And giving AI the power to escalate armed conflicts, especially on a global or nuclear scale, might just bring about the much-feared AI apocalypse that can be avoided just by keeping a human finger on the button.


We’ve written before about how necessary it is to ban attempts for police to arm robots (either remote controlled or autonomous) in a domestic context for the same reasons. The idea of so-called autonomy among machines and robots creates the false sense of agency–the idea that only the computer is to blame for falsely targeting the wrong person or misreading signs of incoming missiles and launching a nuclear weapon in response–obscures who is really at fault. Humans put computers in charge of making the decisions, but humans also train the programs which make the decisions.

AI Does What We Tell It To

In the words of linguist Emily Bender,  “AI” and especially its text-based applications, is a “stochastic parrot” meaning that it echoes back to us things we taught it with as “determined by random, probabilistic distribution.” In short, we give it the material it learns, it learns it, and then draws conclusions and makes decisions based on that historical dataset. If you teach an algorithmic model that 9 times out of 10 a nation will launch a retaliatory strike when missiles are fired at them–the first time that model mistakes a flock of birds for inbound missiles, that is exactly what it will do.

To that end, AI scholar Kate Crawford argues, “AI is neither artificial nor intelligent. Rather, artificial intelligence is both embodied and material, made from natural resources, fuel, human labor, infrastructures, logistics, histories, and classifications. AI systems are not autonomous, rational, or able to discern anything without extensive datasets or predefined rules and rewards. In fact, artificial intelligence as we know it depends entirely on a much wider set of political and social structures. And due to the capital required to build AI at scale and the ways of seeing that it optimizes AI systems are ultimately designed to serve existing dominant interests.” 

AI does what we teach it to. It mimics the decisions it is taught to make either through hypotheticals or historical data. This means that, yet again, we are not powerless to a coming AI doomsday. We teach AI how to operate. We give it control of escalation, weaponry, and military response. We could just not.

Governing AI Doesn’t Mean Making it More Secret–It Means Regulating Use 

Part of the recent report commissioned by the U.S. Department of State on the weaponization of AI included one troubling recommendation: making the inner workings of AI more secret. In order to keep algorithms from being tampered with or manipulated, the full report (as summarized by Time) suggests that a new governmental regulatory agency responsible for AI should criminalize and make potentially punishable by jail time publishing the inner workings of AI. This means that how AI functions in our daily lives, and how the government uses it, could never be open source and would always live inside a black box where we could never learn the datasets informing its decision making. So much of our lives is already being governed by automated decision making, from the criminal justice system to employment, to criminalize the only route for people to know how those systems are being trained seems counterproductive and wrong.

Opening up the inner workings of AI puts more eyes on how a system functions and makes it more easy, not less, to spot manipulation and tampering… not to mention it might mitigate the biases and harms that skewed training datasets create in the first place.

Conclusion

Machine learning and algorithmic systems are useful tools whose potential we are only just beginning to grapple withbut we have to understand what these technologies are and what they are not. They are neither “artificial” or “intelligent”they do not represent an alternate and spontaneously-occurring way of knowing independent of the human mind. People build these systems and train them to get a desired outcome. Even when outcomes from AI are unexpected, usually one can find their origins somewhere in the data systems they were trained on. Understanding this will go a long way toward responsibly shaping how and when AI is deployed, especially in a defense contract, and will hopefully alleviate some of our collective sci-fi panic.

This doesn’t mean that people won’t weaponize AIand already are in the form of political disinformation or realistic impersonation. But the solution to that is not to outlaw AI entirely, nor is it handing over the keys to a nuclear arsenal to computers. We need a common sense system that respects innovation, regulates uses rather than the technology itself, and does not let panic, AI boosters, or military tacticians dictate how and when important systems are put under autonomous control. 

Worried About AI Voice Clone Scams? Create a Family Password

31 January 2024 at 19:42

Your grandfather receives a call late at night from a person pretending to be you. The caller says that you are in jail or have been kidnapped and that they need money urgently to get you out of trouble. Perhaps they then bring on a fake police officer or kidnapper to heighten the tension. The money, of course, should be wired right away to an unfamiliar account at an unfamiliar bank. 

It’s a classic and common scam, and like many scams it relies on a scary, urgent scenario to override the victim’s common sense and make them more likely to send money. Now, scammers are reportedly experimenting with a way to further heighten that panic by playing a simulated recording of “your” voice. Fortunately, there’s an easy and old-school trick you can use to preempt the scammers: creating a shared verbal password with your family.

The ability to create audio deepfakes of people's voices using machine learning and just minutes of them speaking has become relatively cheap and easy to acquire technology. There are myriad websites that will let you make voice clones. Some will let you use a variety of celebrity voices to say anything they want, while others will let you upload a new person’s voice to create a voice clone of anyone you have a recording of. Scammers have figured out that they can use this to clone the voices of regular people. Suddenly your relative isn’t talking to someone who sounds like a complete stranger, they are hearing your own voice. This makes the scam much more concerning. 

Voice generation scams aren’t widespread yet, but they do seem to be happening. There have been news stories and even congressional testimony from people who have been the targets of voice impersonation scams. Voice cloning scams are also being used in political disinformation campaigns as well. It’s impossible for us to know what kind of technology these scammers used, or if they're just really good impersonations. But it is likely that the scams will grow more prevalent as the technology gets cheaper and more ubiquitous. For now, the novelty of these scams, and the use of machine learning and deepfakes, technologies which are raising concerns across many sectors of society, seems to be driving a lot of the coverage. 

The family password is a decades-old, low tech solution to this modern high tech problem. 

The first step is to agree with your family on a password you can all remember and use. The most important thing is that it should be easy to remember in a panic, hard to forget, and not public information. You could use the name of a well known person or object in your family, an inside joke, a family meme, or any word that you can all remember easily. Despite the name, this doesn't need to be limited to your family, it can be a chosen family, workplace, anarchist witch coven, etc. Any group of people with which you associate can benefit from having a password. 

Then when someone calls you or someone that trusts you (or emails or texts you) with an urgent request for money (or iTunes gift cards) you simply ask them the password. If they can’t tell it to you, then they might be a fake. You could of course further verify this with other questions,  like, “what is my cat's name” or “when was the last time we saw each other?” These sorts of questions work even if you haven’t previously set up a passphrase in your family or friend group. But keep in mind people tend to forget basic things when they have experienced trauma or are in a panic. It might be helpful, especially for   people with less robust memories, to write down the password in case you forget it. After all, it’s not likely that the scammer will break into your house to find the family password.

These techniques can be useful against other scams which haven’t been invented yet, but which may come around as deepfakes become more prevalent, such as machine-generated video or photo avatars for “proof.” Or should you ever find yourself in a hackneyed sci-fi situation where there are two identical copies of your friend and you aren’t sure which one is the evil clone and which one is the original. 

An image of spider-man pointing at another spider-man who is pointing at him. A classic meme.

Spider-man hopes The Avengers haven't forgotten their secret password!

The added benefit of this technique is that it gives you a minute to step back, breath, and engage in some critical thinking. Many scams of this nature rely on panic and keeping you in your lower brain, by asking for the passphrase you can also take a minute to think. Is your kid really in Mexico right now? Can you call them back at their phone number to be sure it’s them?  

So, go make a family password and a friend password to keep your family and friends from getting scammed by AI impostors (or evil clones).

The No AI Fraud Act Creates Way More Problems Than It Solves

19 January 2024 at 18:27

Creators have reason to be wary of the generative AI future. For one thing, while GenAI can be a valuable tool for creativity, it may also be used to deceive the public and disrupt existing markets for creative labor. Performers, in particular, worry that AI-generated images and music will become deceptive substitutes for human models, actors, or musicians.

Existing laws offer multiple ways for performers to address this issue. In the U.S., a majority of states recognize a “right of publicity,” meaning, the right to control if and how your likeness is used for commercial purposes. A limited version of this right makes senseyou should be able to prevent a company from running an advertisement that falsely claims that you endorse its productsbut the right of publicity has expanded well beyond its original boundaries, to potentially cover just about any speech that “evokes” a person’s identity.

In addition, every state prohibits defamation, harmful false representations, and unfair competition, though the parameters may vary. These laws provide time-tested methods to mitigate economic and emotional harms from identity misuse while protecting online expression rights.

But some performers want more. They argue that your right to control use of your image shouldn’t vary depending on what state you live in. They’d also like to be able to go after the companies that offer generative AI tools and/or host AI-generated “deceptive” content. Ordinary liability rules, including copyright, can’t be used against a company that has simply provided a tool for others’ expression. After all, we don’t hold Adobe liable when someone uses Photoshop to suggest that a president can’t read or even for more serious deceptions. And Section 230 immunizes intermediaries from liability for defamatory content posted by users and, in some parts of the country, publicity rights violations as well. Again, that’s a feature, not a bug; immunity means it’s easier to stick up for users’ speech, rather than taking down or preemptively blocking any user-generated content that might lead to litigation. It’s a crucial protection not just big players like Facebook and YouTube, but also small sites, news outlets, emails hosts, libraries, and many others.

Balancing these competing interests won’t be easy. Sadly, so far Congress isn’t trying very hard. Instead, it’s proposing “fixes” that will only create new problems.

Last fall, several Senators circulated a “discussion draft” bill, the NO FAKES Act. Professor Jennifer Rothman has an excellent analysis of the bill, including its most dangerous aspect: creating a new, and transferable, federal publicity right that would extend for 70 years past the death of the person whose image is purportedly replicated. As Rothman notes, under the law:

record companies get (and can enforce) rights to performers’ digital replicas, not just the performers themselves. This opens the door for record labels to cheaply create AI-generated performances, including by dead celebrities, and exploit this lucrative option over more costly performances by living humans, as discussed above.

In other words, if we’re trying to protect performers in the long run, just make it easier for record labels (for example) to acquire voice rights that they can use to avoid paying human performers for decades to come.

NO FAKES hasn’t gotten much traction so far, in part because the Motion Picture Association hasn’t supported it. But now there’s a new proposal: the “No AI FRAUD Act.” Unfortunately, Congress is still getting it wrong.

First, the Act purports to target abuse of generative AI to misappropriate a person’s image or voice, but the right it creates applies to an incredibly broad amount of digital content: any “likeness” and/or “voice replica” that is created or altered using digital technology, software, an algorithm, etc. There’s not much that wouldn’t fall into that categoryfrom pictures of your kid, to recordings of political events, to docudramas, parodies, political cartoons, and more. If it involved recording or portraying a human, it’s probably covered. Even more absurdly, it characterizes any tool that has a primary purpose of producing digital depictions of particular people as a “personalized cloning service.” Our iPhones are many things, but even Tim Cook would likely be surprised to know he’s selling a “cloning service.”

Second, it characterizes the new right as a form of federal intellectual property. This linguistic flourish has the practical effect of putting intermediaries that host AI-generated content squarely in the litigation crosshairs. Section 230 immunity does not apply to federal IP claims, so performers (and anyone else who falls under the statute) will have free rein to sue anyone that hosts or transmits AI-generated content.

That, in turn, is bad news for almost everyoneincluding performers. If this law were enacted, all kinds of platforms and services could very well fear reprisal simply for hosting images or depictions of people—or any of the rest of the broad types of “likenesses” this law covers. Keep in mind that many of these service won’t be in a good position to know whether AI was involved in the generation of a video clip, song, etc., nor will they have the resources to pay lawyers to fight back against improper claims. The best way for them to avoid that liability would be to aggressively filter user-generated content, or refuse to support it at all.

Third, while the term of the new right is limited to ten years after death (still quite a long time), it’s combined with very confusing language suggesting that the right could extend well beyond that date if the heirs so choose. Notably, the legislation doesn’t preempt existing state publicity rights laws, so the terms could vary even more wildly depending on where the individual (or their heirs) reside.

Lastly, while the defenders of the bill incorrectly claim it will protect free expression, the text of the bill suggests otherwise. True, the bill recognizes a “First Amendment defense.” But every law that affects speech is limited by the First Amendmentthat’s how the Constitution works. And the bill actually tries to limit those important First Amendment protections by requiring courts to balance any First Amendment interests “against the intellectual property interest in the voice or likeness.” That balancing test must consider whether the use is commercial, necessary for a “primary expressive purpose,” and harms the individual’s licensing market. This seems to be an effort to import a cramped version of copyright’s fair use doctrine as a substitute for the rigorous scrutiny and analysis the First Amendment (and even the Copyright Act) requires.

We could go on, and we will if Congress decides to take this bill seriously. But it shouldn’t. If Congress really wants to protect performers and ordinary people from deceptive or exploitative uses of their images and voice, it should take a precise, careful and practical approach that avoids potential collateral damage to free expression, competition, and innovation. The No AI FRAUD Act comes nowhere near the mark

AI Watermarking Won't Curb Disinformation

5 January 2024 at 13:46

Generative AI allows people to produce piles upon piles of images and words very quickly. It would be nice if there were some way to reliably distinguish AI-generated content from human-generated content. It would help people avoid endlessly arguing with bots online, or believing what a fake image purports to show. One common proposal is that big companies should incorporate watermarks into the outputs of their AIs. For instance, this could involve taking an image and subtly changing many pixels in a way that’s undetectable to the eye but detectable to a computer program. Or it could involve swapping words for synonyms in a predictable way so that the meaning is unchanged, but a program could readily determine the text was generated by an AI.

Unfortunately, watermarking schemes are unlikely to work. So far most have proven easy to remove, and it’s likely that future schemes will have similar problems.

One kind of watermark is already common for digital images. Stock image sites often overlay text on an image that renders it mostly useless for publication. This kind of watermark is visible and is slightly challenging to remove since it requires some photo editing skills.

Images can also have metadata attached by a camera or image processing program, including information like the date, time, and location a photograph was taken, the camera settings, or the creator of an image. This metadata is unobtrusive but can be readily viewed with common programs. It’s also easily removed from a file. For instance, social media sites often automatically remove metadata when people upload images, both to prevent people from accidentally revealing their location and simply to save storage space.

A useful watermark for AI images would need two properties: 

  • It would need to continue to be detectable after an image is cropped, rotated, or edited in various ways (robustness). 
  • It couldn’t be conspicuous like the watermark on stock image samples, because the resulting images wouldn’t be of much use to anybody.

One simple technique is to manipulate the least perceptible bits of an image. For instance, to a human viewer these two squares are the same shade:

But to a computer it’s obvious that they are different by a single bit: #93c47d vs 93c57d. Each pixel of an image is represented by a certain number of bits, and some of them make more of a perceptual difference than others. By manipulating those least-important bits, a watermarking program can create a pattern that viewers won’t see, but a watermarking-detecting program will. If that pattern repeats across the whole image, the watermark is even robust to cropping. However, this method has one clear flaw: rotating or resizing the image is likely to accidentally destroy the watermark.

There are more sophisticated watermarking proposals that are robust to a wider variety of common edits. However, proposals for AI watermarking must pass a tougher challenge. They must be robust against someone who knows about the watermark and wants to eliminate it. The person who wants to remove a watermark isn’t limited to common edits, but can directly manipulate the image file. For instance, if a watermark is encoded in the least important bits of an image, someone could remove it by simply setting all the least important bits to 0, or to a random value (1 or 0), or to a value automatically predicted based on neighboring pixels. Just like adding a watermark, removing a watermark this way gives an image that looks basically identical to the original, at least to a human eye.

Coming at the problem from the opposite direction, some companies are working on ways to prove that an image came from a camera (“content authenticity”). Rather than marking AI generated images, they add metadata to camera-generated images, and use cryptographic signatures to prove the metadata is genuine. This approach is more workable than watermarking AI generated images, since there’s no incentive to remove the mark. In fact, there’s the opposite incentive: publishers would want to keep this metadata around because it helps establish that their images are “real.” But it’s still a fiendishly complicated scheme, since the chain of verifiability has to be preserved through all software used to edit photos. And most cameras will never produce this metadata, meaning that its absence can’t be used to prove a photograph is fake.

Comparing watermarking vs content authenticity, watermarking aims to identify or mark (some) fake images; content authenticity aims to identify or mark (some) real images. Neither approach is comprehensive, since most of the images on the Internet will have neither a watermark nor content authenticity metadata.

Watermarking Content authenticity
AI images Marked Unmarked
(Some) camera images Unmarked Marked
Everything else Unmarked Unmarked

 

Text-based Watermarks

The watermarking problem is even harder for text-based generative AI. Similar techniques can be devised. For instance, an AI could boost the probability of certain words, giving itself a subtle textual style that would go unnoticed most of the time, but could be recognized by a program with access to the list of words. This would effectively be a computer version of determining the authorship of the twelve disputed essays in The Federalist Papers by analyzing Madison’s and Hamilton’s habitual word choices.

But creating an indelible textual watermark is a much harder task than telling Hamilton from Madison, since the watermark must be robust to someone modifying the text trying to remove it. Any watermark based on word choice is likely to be defeated by some amount of rewording. That rewording could even be performed by an alternate AI, perhaps one that is less sophisticated than the one that generated the original text, but not subject to a watermarking requirement.

There’s also a problem of whether the tools to detect watermarked text are publicly available or are secret. Making detection tools publicly available gives an advantage to those who want to remove watermarking, because they can repeatedly edit their text or image until the detection tool gives an all clear. But keeping them a secret makes them dramatically less useful, because every detection request must be sent to whatever company produced the watermarking. That would potentially require people to share private communication if they wanted to check for a watermark. And it would hinder attempts by social media companies to automatically label AI-generated content at scale, since they’d have to run every post past the big AI companies.

Since text output from current AIs isn’t watermarked, services like GPTZero and TurnItIn have popped up, claiming to be able to detect AI-generated content anyhow. These detection tools are so inaccurate as to be dangerous, and have already led to false charges of plagiarism.

Lastly, if AI watermarking is to prevent disinformation campaigns sponsored by states, it’s important to keep in mind that those states can readily develop modern generative AI, and probably will in the near future. A state-sponsored disinformation campaign is unlikely to be so polite as to watermark its output.

Watermarking of AI generated content is an easy-sounding fix for the thorny problem of disinformation. And watermarks may be useful in understanding reshared content where there is no deceptive intent. But research into adversarial watermarking for AI is just beginning, and while there’s no strong reason to believe it will succeed, there are some good reasons to believe it will ultimately fail.

❌
❌