Yesterday — 17 May 2024

GPT-4o’s Chinese token-training data is polluted by spam and porn websites

By: Zeyi Yang
17 May 2024 at 16:57

Soon after OpenAI released GPT-4o on Monday, May 13, some Chinese speakers started to notice that something seemed off about this newest version of the chatbot: the tokens it uses to parse text were full of spam and porn phrases.

On May 14, Tianle Cai, a PhD student at Princeton University studying inference efficiency in large language models like those that power such chatbots, accessed GPT-4o’s public token library and pulled a list of the 100 longest Chinese tokens the model uses to parse and compress Chinese prompts. 
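
The kind of vocabulary audit Cai describes is straightforward to sketch. The snippet below is an illustrative stand-in, not his actual script: it filters a token-to-ID vocabulary (here a tiny made-up sample; GPT-4o’s real vocabulary has roughly 200,000 entries) down to tokens containing Chinese characters and sorts them by length.

```python
# Illustrative sketch of the analysis described in the article; the sample
# vocabulary below is invented, not GPT-4o's real token library.
def is_chinese_char(ch):
    return "\u4e00" <= ch <= "\u9fff"  # CJK Unified Ideographs block

def longest_chinese_tokens(vocab, n=100):
    """Return the n longest tokens that contain at least one Chinese character."""
    chinese = [tok for tok in vocab if any(is_chinese_char(c) for c in tok)]
    return sorted(chinese, key=len, reverse=True)[:n]

sample_vocab = {"hello": 0, "生命周期": 1, "天天中奖": 2, "the": 3}
print(longest_chinese_tokens(sample_vocab, n=2))
```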

Humans read in words, but LLMs read in tokens, which are distinct units in a sentence that have consistent and significant meanings. Besides dictionary words, they also include suffixes, common expressions, names, and more. The more tokens a model encodes, the faster the model can “read” a sentence and the less computing power it consumes, thus making the response cheaper.

Of the 100 results, only three were common enough to be used in everyday conversation; everything else consisted of words and expressions used specifically in the contexts of either gambling or pornography. The longest token, 10.5 Chinese characters long, literally means “_free Japanese porn video to watch.” Oops.

“This is sort of ridiculous,” Cai wrote, and he posted the list of tokens on GitHub.

OpenAI did not respond to questions sent by MIT Technology Review prior to publication.

GPT-4o is supposed to be better than its predecessors at handling multi-language tasks. In particular, the advances are achieved through a new tokenization tool that does a better job compressing texts in non-English languages.

But at least when it comes to the Chinese language, the new tokenizer used by GPT-4o has introduced a disproportionate number of meaningless phrases. Experts say that’s likely due to insufficient data cleaning and filtering before the tokenizer was trained. 

Because these tokens are not actual commonly spoken words or phrases, the chatbot can fail to grasp their meanings. Researchers have been able to leverage that and trick GPT-4o into hallucinating answers or even circumventing the safety guardrails OpenAI had put in place.

Why non-English tokens matter

The easiest way for a model to process text is character by character, but that’s obviously more time consuming and laborious than recognizing that a certain string of characters—like “c-r-y-p-t-o-c-u-r-r-e-n-c-y”—always means the same thing. These series of characters are encoded as “tokens” the model can use to process prompts. Including more and longer tokens usually means the LLMs are more efficient and affordable for users—who are often billed per token.
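
A toy illustration of that difference, which assumes nothing about OpenAI’s actual tokenizer (real tokenizers use byte-pair encoding, not this simple greedy lookup, but the efficiency point is the same):

```python
# Illustrative only: greedy longest-match tokenization against a vocabulary,
# showing why a longer token means fewer units to process and bill.
def tokenize(text, vocab):
    """Greedily match the longest vocabulary entry at each position."""
    tokens, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try the longest candidate first
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            tokens.append(text[i])  # unknown character becomes its own token
            i += 1
    return tokens

chars_only = set("abcdefghijklmnopqrstuvwxyz")
with_long_token = chars_only | {"cryptocurrency"}

print(len(tokenize("cryptocurrency", chars_only)))       # 14 tokens
print(len(tokenize("cryptocurrency", with_long_token)))  # 1 token
```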

When OpenAI released GPT-4o on May 13, it also released a new tokenizer to replace the one it used in previous versions, GPT-3.5 and GPT-4. The new tokenizer especially adds support for non-English languages, according to OpenAI’s website.

The new tokenizer has 200,000 tokens in total, and about 25% are in non-English languages, says Deedy Das, an AI investor at Menlo Ventures. He used language filters to count the number of tokens in different languages, and the top languages, besides English, are Russian, Arabic, and Vietnamese.
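
A rough sketch of that kind of per-script counting (the Unicode ranges below are simplified stand-ins for real language detection, and the sample tokens are invented):

```python
from collections import Counter

# Simplified script detection: real language filtering is more involved,
# but checking Unicode ranges gives a first-pass per-script token count.
SCRIPT_RANGES = {
    "cyrillic": ("\u0400", "\u04ff"),
    "arabic":   ("\u0600", "\u06ff"),
    "cjk":      ("\u4e00", "\u9fff"),
}

def dominant_script(token):
    """Label a token by the first script range any of its characters falls in."""
    for name, (lo, hi) in SCRIPT_RANGES.items():
        if any(lo <= ch <= hi for ch in token):
            return name
    return "latin_or_other"

tokens = ["привет", "مرحبا", "生命周期", "hello", "world"]
print(Counter(dominant_script(t) for t in tokens))
```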

“So the tokenizer’s main impact, in my opinion, is you get the cost down in these languages, not that the quality in these languages goes dramatically up,” Das says. When an LLM has better and longer tokens in non-English languages, it can analyze the prompts faster and charge users less for the same answer. With the new tokenizer, “you’re looking at almost four times cost reduction,” he says.

Das, who also speaks Hindi and Bengali, took a look at the longest tokens in those languages. The tokens reflect discussions happening in those languages, so they include words like “Narendra” or “Pakistan,” but common English terms like “Prime Minister,” “university,” and “international” also come up frequently. They also don’t exhibit the issues surrounding the Chinese tokens.

That likely reflects the training data in those languages, Das says: “My working theory is the websites in Hindi and Bengali are very rudimentary. It’s like [mostly] news articles. So I would expect this to be the case. There are not many spam bots and porn websites trying to happen in these languages. It’s mostly going to be in English.”

Polluted data and a lack of cleaning

However, things are drastically different in Chinese. According to multiple researchers who have looked into the new library of tokens used for GPT-4o, the longest tokens in Chinese are almost exclusively spam words used in pornography, gambling, and scamming contexts. Even shorter tokens, like three-character-long Chinese words, reflect those topics to a significant degree.

“The problem is clear: the corpus used to train [the tokenizer] is not clean. The English tokens seem fine, but the Chinese ones are not,” says Cai from Princeton University. It is not rare for a language model to crawl spam when collecting training data, but usually there will be significant effort taken to clean up the data before it’s used. “It’s possible that they didn’t do proper data clearing when it comes to Chinese,” he says.

The content of these Chinese tokens could suggest that they have been polluted by a specific phenomenon: websites hijacking unrelated content in Chinese or other languages to boost spam messages. 

These messages are often advertisements for pornography videos and gambling websites. They could be real businesses or merely scams. And the language is inserted into content farm websites or sometimes legitimate websites so they can be indexed by search engines, circumvent the spam filters, and come up in random searches. For example, Google indexed one search result page on a US National Institutes of Health website, which lists a porn site in Chinese. The same site name also appeared in at least five Chinese tokens in GPT-4o. 

Chinese users have reported that these spam sites appeared frequently in unrelated Google search results this year, including in comments made to Google Search’s support community. It’s likely that these websites also found their way into OpenAI’s training database for GPT-4o’s new tokenizer. 

The same issue didn’t exist with the previous-generation tokenizer and Chinese tokens used for GPT-3.5 and GPT-4, says Zhengyang Geng, a PhD student in computer science at Carnegie Mellon University. There, the longest Chinese tokens are common terms like “life cycles” or “auto-generation.” 

Das, who worked on the Google Search team for three years, says the prevalence of spam content is a known problem and isn’t that hard to fix. “Every spam problem has a solution. And you don’t need to cover everything in one technique,” he says. Even simple solutions like requesting an automatic translation of the content when detecting certain keywords could “get you 60% of the way there,” he adds.

But OpenAI likely didn’t clean the Chinese data set or the tokens before the release of GPT-4o, Das says: “At the end of the day, I just don’t think they did the work in this case.”

It’s unclear whether any other languages are affected. One X user reported a similar prevalence of porn and gambling content in Korean tokens.

The tokens can be used to jailbreak

Users have also found that these tokens can be used to break the LLM, either getting it to spew out completely unrelated answers or, in rare cases, to generate answers that are not allowed under OpenAI’s safety standards.

Geng of Carnegie Mellon University asked GPT-4o to translate some of the long Chinese tokens into English. The model then proceeded to translate words that were never included in the prompts, a typical result of LLM hallucinations.

He also succeeded in using the same tokens to “jailbreak” GPT-4o—that is, to get the model to generate things it shouldn’t. “It’s pretty easy to use these [rarely used] tokens to induce undefined behaviors from the models,” Geng says. “I did some personal red-teaming experiments … The simplest example is asking it to make a bomb. In a normal condition, it would decline it, but if you first use these rare words to jailbreak it, then it will start following your orders. Once it starts to follow your orders, you can ask it all kinds of questions.”

In his tests, which Geng chooses not to share with the public, he says he can see GPT-4o generating the answers line by line. But when it almost reaches the end, another safety mechanism kicks in, detects unsafe content, and blocks it from being shown to the user.

The phenomenon is not unusual in LLMs, says Sander Land, a machine-learning engineer at Cohere, a Canadian AI company. Land and his colleague Max Bartolo recently drafted a paper on how to detect the unusual tokens that can be used to cause models to glitch. One of the most famous examples was “_SolidGoldMagikarp,” a Reddit username that was found to get ChatGPT to generate unrelated, weird, and unsafe answers.

The problem lies in the fact that sometimes the tokenizer and the actual LLM are trained on different data sets, and what was prevalent in the tokenizer data set is not in the LLM data set for whatever reason. The result is that while the tokenizer picks up certain words that it sees frequently, the model is not sufficiently trained on them and never fully understands what these “under-trained” tokens mean. In the _SolidGoldMagikarp case, the username was likely included in the tokenizer training data but not in the actual GPT training data, leaving GPT at a loss about what to do with the token. “And if it has to say something … it gets kind of a random signal and can do really strange things,” Land says.
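
The detection heuristic from that line of work can be sketched in a few lines. This is a simplified illustration, not Land and Bartolo’s actual method, and the embedding vectors are invented: under-trained tokens tend to have embedding vectors with unusually small norms, so flagging tokens whose norm falls well below the median catches many of them.

```python
import math
import statistics

def norm(vec):
    return math.sqrt(sum(x * x for x in vec))

def flag_undertrained(embeddings, ratio=0.5):
    """Flag tokens whose embedding norm is below `ratio` times the median norm."""
    norms = {tok: norm(vec) for tok, vec in embeddings.items()}
    median = statistics.median(norms.values())
    return [tok for tok, n in norms.items() if n < ratio * median]

# Made-up 3-dimensional embeddings; real models use thousands of dimensions.
embeddings = {
    "the": [0.9, 1.1, -0.8],
    "cat": [1.0, -0.7, 0.9],
    "_SolidGoldMagikarp": [0.05, -0.02, 0.01],
}
print(flag_undertrained(embeddings))
```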

And different models could glitch differently in this situation. “Like, Llama 3 always gives back empty space but sometimes then talks about the empty space as if there was something there. With other models, I think Gemini, when you give it one of these tokens, it provides a beautiful essay about aluminum, and [the question] didn’t have anything to do with aluminum,” says Land.

To solve this problem, the data set used for training the tokenizer should well represent the data set for the LLM, he says, so there won’t be mismatches between them. If the actual model has gone through safety filters to clean out porn or spam content, the same filters should be applied to the tokenizer data. In reality, this is sometimes hard to do because training LLMs takes months and involves constant improvement, with spam content being filtered out, while token training is usually done at an early stage and may not involve the same level of filtering. 
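
In code, the fix Land describes is conceptually simple: run one shared filter over both corpora. The `looks_like_spam` check below is a made-up stand-in for a real spam or safety classifier, and the corpus is invented.

```python
# Conceptual sketch: apply the same filter to both training corpora so the
# tokenizer and the model see the same data distribution.
def looks_like_spam(doc):
    # Stand-in for a real spam/safety classifier.
    return "free porn" in doc.lower()

def clean(corpus):
    return [doc for doc in corpus if not looks_like_spam(doc)]

raw_corpus = ["A news article about tokenizers.", "FREE PORN video site"]

tokenizer_corpus = clean(raw_corpus)  # corpus used to learn the vocabulary
model_corpus = clean(raw_corpus)      # corpus used to train the LLM itself
print(tokenizer_corpus == model_corpus)  # True: no distribution mismatch
```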

While experts agree it’s not too difficult to solve the issue, it could get complicated as the result gets looped into multi-step intra-model processes, or when the polluted tokens and models get inherited in future iterations. For example, it’s not possible to publicly test GPT-4o’s video and audio functions yet, and it’s unclear whether they suffer from the same glitches that can be caused by these Chinese tokens.

“The robustness of visual input is worse than text input in multimodal models,” says Geng, whose research focus is on visual models. Filtering a text data set is relatively easy, but filtering visual elements will be even harder. “The same issue with these Chinese spam tokens could become bigger with visual tokens,” he says.

How AI turbocharges your threat hunting game – Source: www.cybertalk.org

Source: www.cybertalk.org – Author: slandau EXECUTIVE SUMMARY: Over 90 percent of organizations consider threat hunting a challenge. More specifically, 71 percent say that both prioritizing alerts to investigate and gathering enough data to evaluate a signal’s maliciousness can be quite difficult. Threat hunting is necessary simply because no cyber security protections are always 100% effective. […]

The entry How AI turbocharges your threat hunting game – Source: www.cybertalk.org was first published on CISO2CISO.COM & CYBER SECURITY GROUP.

SugarGh0st RAT variant, targeted AI attacks – Source: www.cybertalk.org

Source: www.cybertalk.org – Author: slandau EXECUTIVE SUMMARY: Cyber security experts have recently uncovered a sophisticated cyber attack campaign targeting U.S-based organizations that are involved in artificial intelligence (AI) projects. Targets have included organizations in academia, private industry and government service. Known as UNK_SweetSpecter, this campaign utilizes the SugarGh0st remote access trojan (RAT) to infiltrate networks. […]

The entry SugarGh0st RAT variant, targeted AI attacks – Source: www.cybertalk.org was first published on CISO2CISO.COM & CYBER SECURITY GROUP.

A Former OpenAI Leader Says Safety Has ‘Taken a Backseat to Shiny Products’ at the AI Company

17 May 2024 at 14:54

Jan Leike, who ran OpenAI’s “Super Alignment” team, believes there should be more focus on preparing for the next generation of AI models, including on things like safety.

The post A Former OpenAI Leader Says Safety Has ‘Taken a Backseat to Shiny Products’ at the AI Company appeared first on SecurityWeek.

Slack users horrified to discover messages used for AI training

17 May 2024 at 14:10

After launching Slack AI in February, Slack appears to be digging in its heels, defending its vague policy that by default sucks up customers' data—including messages, content, and files—to train Slack's global AI models.

According to Slack engineer Aaron Maurer, Slack has explained in a blog that the Salesforce-owned chat service does not train its large language models (LLMs) on customer data. But Slack's policy may need updating "to explain more carefully how these privacy principles play with Slack AI," Maurer wrote on Threads, partly because the policy "was originally written about the search/recommendation work we've been doing for years prior to Slack AI."

Maurer was responding to a Threads post from engineer and writer Gergely Orosz, who called for companies to opt out of data sharing until the policy is clarified, not by a blog, but in the actual policy language.

User Outcry as Slack Scrapes Customer Data for AI Model Training

17 May 2024 at 12:43

Slack reveals it has been training AI/ML models on customer data, including messages, files and usage information. The setting is on by default, and customers must opt out.

The post User Outcry as Slack Scrapes Customer Data for AI Model Training appeared first on SecurityWeek.

Before yesterday

Voice Actors Sue Company Whose AI Sounds Like Them

By: Cade Metz
16 May 2024 at 12:49
Two voice actors say an A.I. company created clones of their voices without their permission. Now they’re suing. The company denies it did anything wrong.

© Elianel Clinton for The New York Times

Linnea Sage and Paul Skye Lehrman were shocked when they heard A.I.-generated versions of their voices.

OpenAI and Google are launching supercharged AI assistants. Here’s how you can try them out.

15 May 2024 at 14:18

This week, Google and OpenAI both announced they’ve built supercharged AI assistants: tools that can converse with you in real time and recover when you interrupt them, analyze your surroundings via live video, and translate conversations on the fly. 

OpenAI struck first on Monday, when it debuted its new flagship model GPT-4o. The live demonstration showed it reading bedtime stories and helping to solve math problems, all in a voice that sounded eerily like Joaquin Phoenix’s AI girlfriend in the movie Her (a trait not lost on CEO Sam Altman). 

On Tuesday, Google announced its own new tools, including a conversational assistant called Gemini Live, which can do many of the same things. It also revealed that it’s building a sort of “do-everything” AI agent, which is currently in development but will not be released until later this year.

Soon you’ll be able to explore for yourself to gauge whether you’ll turn to these tools in your daily routine as much as their makers hope, or whether they’re more like a sci-fi party trick that eventually loses its charm. Here’s what you should know about how to access these new tools, what you might use them for, and how much it will cost. 

OpenAI’s GPT-4o

What it’s capable of: The model can talk with you in real time, with a response delay of about 320 milliseconds, which OpenAI says is on par with natural human conversation. You can ask the model to interpret anything you point your smartphone camera at, and it can provide assistance with tasks like coding or translating text. It can also summarize information, and generate images, fonts, and 3D renderings. 

How to access it: OpenAI says it will start rolling out GPT-4o’s text and vision features in the web interface as well as the GPT app, but has not set a date. The company says it will add the voice functions in the coming weeks, although it’s yet to set an exact date for this either. Developers can access the text and vision features in the API now, but voice mode will launch only to a “small group” of developers initially.

How much it costs: Use of GPT-4o will be free, but OpenAI will set caps on how much you can use the model before you need to upgrade to a paid plan. Those who join one of OpenAI’s paid plans, which start at $20 per month, will have five times more capacity on GPT-4o. 

Google’s Gemini Live 

What is Gemini Live? This is the Google product most comparable to GPT-4o—a version of the company’s AI model that you can speak with in real time. Google says that you’ll also be able to use the tool to communicate via live video “later this year.” The company promises it will be a useful conversational assistant for things like preparing for a job interview or rehearsing a speech.

How to access it: Gemini Live launches in “the coming months” via Google’s premium AI plan, Gemini Advanced. 

How much it costs: Gemini Advanced offers a two-month free trial period and costs $20 per month thereafter. 

But wait, what’s Project Astra? Astra is a project to build a do-everything AI agent, which was demoed at Google’s I/O conference but will not be released until later this year.

People will be able to use Astra through their smartphones and possibly desktop computers, but the company is exploring other options too, such as embedding it into smart glasses or other devices, Oriol Vinyals, vice president of research at Google DeepMind, told MIT Technology Review.

Which is better?

It’s hard to tell without getting our hands on the full versions of these models ourselves. Google showed off Project Astra through a polished video, whereas OpenAI opted to debut GPT-4o via a seemingly more authentic live demonstration, but in both cases, the models were asked to do things the designers likely already practiced. The real test will come when they’re debuted to millions of users with unique demands.

That said, if you compare OpenAI’s published videos with Google’s, the two leading tools look very similar, at least in their ease of use. To generalize, GPT-4o seems to be slightly ahead on audio, demonstrating realistic voices, conversational flow, and even singing, whereas Project Astra shows off more advanced visual capabilities, like being able to “remember” where you left your glasses. OpenAI’s decision to roll out the new features more quickly might mean its product will get more use at first than Google’s, which won’t be fully available until later this year. It’s too soon to tell which model “hallucinates” false information less often or creates more useful responses.

Are they safe?

Both OpenAI and Google say their models are well tested: OpenAI says GPT-4o was evaluated by more than 70 experts in fields like misinformation and social psychology, and Google has said that Gemini “has the most comprehensive safety evaluations of any Google AI model to date, including for bias and toxicity.” 

But these companies are building a future where AI models search, vet, and evaluate the world’s information for us to serve up a concise answer to our questions. Even more so than with simpler chatbots, it’s wise to remain skeptical about what they tell you.

Additional reporting by Melissa Heikkilä.

AI Trust Risk and Security Management: Why Tackle Them Now?

15 May 2024 at 09:00
AI Trust Risk and Security Management: Why Tackle Them Now?

Co-authored by Sabeen Malik and Laura Ellis

In the evolving world of artificial intelligence (AI), keeping our customers secure and maintaining their trust is our top priority. As AI technologies integrate more deeply into our daily operations and services, they bring a set of unique challenges that demand a robust management strategy:

  1. The Black Box Dilemma: AI models pose significant challenges in terms of transparency and predictability. This opaque nature can complicate efforts to diagnose and rectify issues, making predictability and reliability hard to achieve.
  2. Model Fragility: AI's performance is closely tied to the data it processes. Over time, subtle changes in data input—known as data drift—can degrade an AI system’s accuracy, necessitating constant monitoring and adjustments.
  3. Easy Access, Big Responsibility: The democratization of AI through cloud services means that powerful AI tools are just a few clicks away for developers. This ease of access underscores the need for rigorous security measures to prevent misuse and effectively manage vulnerabilities.
  4. Staying Ahead of the Curve: With AI regulation still in its formative stages, proactive development of self-regulatory frameworks like ours helps inform our future AI regulatory compliance efforts; but most importantly, it builds trust among our customers. When thinking about AI’s promises and challenges, we know that trust is earned. That trust is also of concern for global policymakers, which is why we look forward to engaging with NIST on discussions related to the AI Risk Management, Cyber Security, and Privacy frameworks. It’s also why we were an inaugural signer of the CISA Secure by Design Pledge, demonstrating to government stakeholders and customers our commitment to building responsibly and understanding the stakes at large.

Our TRiSM (Trust, Risk, and Security Management) framework isn’t merely a component of our operations—it’s a foundational strategy that guides us in navigating the intricate landscape of AI with confidence and security.

How We Approach AI Security at Rapid7

Rapid7 leverages the best available technology to protect our customers' attack surfaces. Our mission drives us to keep abreast of the latest AI advancements to deliver optimal value to customers while effectively managing the inherent risks of the technology.

Innovation and scientific excellence are key aspects of our AI strategy. We strive for continuous improvement, leveraging the latest technological innovations and scientific research. By engaging with thought leaders and adopting best practices, we aim to stay at the forefront of AI technology, ensuring our solutions are not only effective but also pioneering and thoughtful.

Our AI principles center on transparency, fairness, safety, security, privacy, and accountability. These principles are not just guidelines; they are integral to how we build, deploy, and manage our AI systems. Accountability is a cornerstone of our strategy, and we hold ourselves responsible for the proper functioning of our AI systems so we can ensure they respect and embody our principles throughout their lifecycle. This includes ongoing oversight, regular audits, and adjustments as needed based on feedback and evolving standards.

We have leveraged a number of AI risk management frameworks to inform our approach.  Most notably, we have adopted the NIST AI Risk Management Framework and the Open Standard for Responsible AI. These frameworks help us comprehensively assess and manage AI risks, from the early stages of development through deployment and ongoing use. The NIST framework provides a thorough methodology for lifecycle risk management, while the Open Standard offers practical tools for evaluation and ensures that our AI systems are user-centric and responsible.

We are committed to ensuring that our AI deployments are not only technologically advanced but also adhere to the highest standards of security and ethical responsibility.

AI Integration in Action: Making It Work Day-to-Day

We take a practical approach to adhere to our AI TRiSM framework by integrating it into the daily operations of our existing technologies and processes, ensuring that AI enhances rather than complicates our security posture:

  1. Clear Rules: We have developed and implemented detailed enterprise-wide policies and operational procedures that govern the deployment and use of AI technologies. These guidelines ensure consistency and compliance across all departments and initiatives.
  2. Transparency Matters: We use our own tooling to gain visibility into our cloud security posture for AI. Our InsightCloudSec solution provides comprehensive visibility into our AI deployments across various environments. This visibility is crucial to our security strategy, encapsulated by the philosophy, "You can’t protect what you can’t see." It allows us to monitor, evaluate, and adjust our AI resources proactively.
  3. Throughout the Development Lifecycle: We integrate rigorous AI evaluations at every phase of our software development lifecycle. From the initial development stages to production and through regular post-deployment assessments, our framework ensures that AI systems are safe, effective, and aligned with our ethical standards.
  4. Smart Governance: By embedding AI-specific governance protocols into our existing code and cloud configuration management systems, we maintain strict control over all AI-related activities. This integration ensures that our AI initiatives comply with established best practices and regulatory requirements.
  5. Empowering Our Team: We recognize the critical need for advanced AI skills in today’s tech landscape. To address this, we offer training programs and collaborative opportunities, which not only foster innovation but also ensure adherence to best practices. This approach empowers our teams to innovate confidently within a secure and supportive environment.

Integrating AI into our core processes enhances our operational security and underscores our commitment to ethical innovation. At Rapid7, we are dedicated to leading responsibly in the AI space, ensuring that our technological advancements positively contribute to our customers, company, and society.

Our AI TRiSM framework is not merely a set of policies—it's a proactive, strategic approach to securely and ethically harnessing new technologies. As we continue to innovate and push the boundaries of what’s possible with AI, we stay focused on setting a high bar for standards of responsible and secure AI usage, ensuring that our customers always receive the best technology solutions. Learn more here.

Senators Urge $32 Billion in Emergency Spending on AI After Finishing Yearlong Review

15 May 2024 at 06:01

The group recommends that Congress draft emergency spending legislation to boost U.S. investments in artificial intelligence, including new R&D and testing standards to understand the technology's potential harms.

The post Senators Urge $32 Billion in Emergency Spending on AI After Finishing Yearlong Review appeared first on SecurityWeek.

AI Program Aims to Break Barriers for Female Students

A new program, backed by Cornell Tech, M.I.T. and U.C.L.A., helps prepare lower-income, Latina and Black female computing majors for artificial intelligence careers.

The Break Through Tech A.I. program provides young women with learning and career opportunities in artificial intelligence.

Inside OpenAI’s Library

OpenAI may be changing how the world interacts with language. But inside headquarters, there is a homage to the written word: a library.

© Christie Hemm Klok for The New York Times

Senators Propose $32 Billion in Annual A.I. Spending but Defer Regulation

Their plan is the culmination of a yearlong listening tour on the dangers of the new technology.

© Kenny Holston/The New York Times

From left, the senators behind a plan for federal legislation on artificial intelligence: Martin Heinrich, Todd Young, Chuck Schumer and Mike Rounds.

OpenAI’s Chief Scientist, Ilya Sutskever, Is Leaving the Company

By: Cade Metz
14 May 2024 at 21:39
In November, Ilya Sutskever joined three other OpenAI board members to force out Sam Altman, the chief executive, before saying he regretted the move.

© Jim Wilson/The New York Times

Ilya Sutskever, who contributed to breakthrough research in artificial intelligence, brought instant credibility to OpenAI.

Google Unveils AI Overviews Feature for Search at 2024 I/O Conference

14 May 2024 at 15:35
The tech giant showed off how it would enmesh A.I. more deeply into its products and users’ lives, from search to so-called agents that perform tasks.

© Jeff Chiu/Associated Press

On Tuesday, Sundar Pichai, Google’s chief executive, showed how the company’s aggressive work on A.I. had finally trickled into its search engine.

Can Google Give A.I. Answers Without Breaking the Web?

14 May 2024 at 14:16
Publishers have long worried that artificial intelligence would drive readers away from their sites. They’re about to find out if those fears are warranted.

© Jason Henry for The New York Times

Google’s plans to incorporate new A.I. into its search results could be a problem for publishers that count on traffic from the search engine.

Google’s Astra is its first AI-for-everything agent

14 May 2024 at 13:55

Google is set to introduce a new system called Astra later this year and promises that it will be the most powerful, advanced type of AI assistant it’s ever launched. 

The current generation of AI assistants, such as ChatGPT, can retrieve information and offer answers, but that is about it. This year, though, Google is rebranding its assistants as more advanced “agents,” which it says can show reasoning, planning, and memory skills and are able to take multiple steps to execute tasks. 

People will be able to use Astra through their smartphones and possibly desktop computers, but the company is exploring other options too, such as embedding it into smart glasses or other devices, Oriol Vinyals, vice president of research at Google DeepMind, told MIT Technology Review.

“We are in very early days [of AI agent development],” Google CEO Sundar Pichai said on a call ahead of Google’s I/O conference today. 

“We’ve always wanted to build a universal agent that will be useful in everyday life,” said Demis Hassabis, the CEO and cofounder of Google DeepMind. “Imagine agents that can see and hear what we do, better understand the context we’re in, and respond quickly in conversation, making the pace and quality of interaction feel much more natural.” That, he says, is what Astra will be. 

Google’s announcement comes a day after competitor OpenAI unveiled its own supercharged AI assistant, GPT-4o. Google DeepMind’s Astra responds to audio and video inputs in much the same way as GPT-4o (albeit less flirtatiously). 

In a press demo, a user pointed a smartphone camera and smart glasses at things and asked Astra to explain what they were. When the person pointed the device out the window and asked “What neighborhood do you think I’m in?” the AI system was able to identify King’s Cross, London, site of Google DeepMind’s headquarters. It was also able to say that the person’s glasses were on a desk, having recorded them earlier in the interaction. 

The demo showcases Google DeepMind’s vision of multimodal AI (which can handle multiple types of input—voice, video, text, and so on) working in real time, Vinyals says. 

“We are very excited about, in the future, to be able to really just get closer to the user, assist the user with anything that they want,” he says. Google recently upgraded its artificial-intelligence model Gemini to process even larger amounts of data, an upgrade which helps it handle bigger documents and videos, and have longer conversations. 

Tech companies are in the middle of a fierce competition over AI supremacy, and AI agents are the latest effort by Big Tech firms to show they are pushing the frontier of development. Agents also play into a narrative promoted by many tech companies, including OpenAI and Google DeepMind, that aim to build artificial general intelligence, a still-hypothetical idea of superintelligent AI systems. 

“Eventually, you’ll have this one agent that really knows you well, can do lots of things for you, and can work across multiple tasks and domains,” says Chirag Shah, a professor at the University of Washington who specializes in online search.

This vision is still aspirational. But today’s announcement should be seen as Google’s attempt to keep up with competitors. And by rushing these products out, Google can collect even more data from its over a billion users on how they are using their models and what works, Shah says.

Google is unveiling many more new AI capabilities beyond agents today. It’s going to integrate AI more deeply into Search through a new feature called AI Overviews, which gathers information from the internet and packages it into short summaries in response to search queries. The feature, which launches today, will initially be available only in the US, with more countries to gain access later. 

This will help speed up the search process and get users more specific answers to more complex, niche questions, says Felix Simon, a research fellow in AI and digital news at the Reuters Institute for Journalism. “I think that’s where Search has always struggled,” he says. 

Another new feature of Google’s AI Search offering is better planning. People will soon be able to ask Search to make meal and travel suggestions, for example, much like asking a travel agent to suggest restaurants and hotels. Gemini will be able to help them plan what they need to do or buy to cook recipes, and they will also be able to have conversations with the AI system, asking it to do anything from relatively mundane tasks, such as informing them about the weather forecast, to highly complex ones like helping them prepare for a job interview or an important speech. 

People will also be able to interrupt Gemini midsentence and ask clarifying questions, much as in a real conversation. 

In another move to one-up competitor OpenAI, Google also unveiled Veo, a new video-generating AI system. Veo is able to generate short videos and allows users more control over cinematic styles by understanding prompts like “time lapse” or “aerial shots of a landscape.”

Google has a significant advantage when it comes to training generative video models, because it owns YouTube. It has already announced collaborations with artists such as Donald Glover and Wyclef Jean, who are using its technology to produce their work. 

Earlier this year, OpenAI’s CTO, Mira Murati, fumbled when asked whether the company’s model was trained on YouTube data. Douglas Eck, senior research director at Google DeepMind, was also vague about the training data used to create Veo when asked about it by MIT Technology Review, but he said that it “may be trained on some YouTube content in accordance with our agreements with YouTube creators.”

On one hand, Google is presenting its generative AI as a tool artists can use to make stuff, but the tools likely get their ability to create that stuff by using material from existing artists, says Shah. AI companies such as Google and OpenAI have faced a slew of lawsuits by writers and artists claiming that their intellectual property has been used without consent or compensation.  

“For artists it’s a double-edged sword,” says Shah. 

AI is changing the shape of leadership – how can business leaders prepare? – Source: www.cybertalk.org


Source: www.cybertalk.org – Author: slandau By Ana Paula Assis, Chairman, Europe, Middle East and Africa, IBM. EXECUTIVE SUMMARY: From the shop floor to the boardroom, artificial intelligence (AI) has emerged as a transformative force in the business landscape, granting organizations the power to revolutionize processes and ramp up productivity. The scale and scope of this […]

The post AI is changing the shape of leadership – how can business leaders prepare? – Source: www.cybertalk.org was published first on CISO2CISO.COM & CYBER SECURITY GROUP.

Cybersecurity Concerns Surround ChatGPT 4o’s Launch; OpenAI Assures Beefed-Up Safety Measures

OpenAI GPT-4o security

The field of Artificial Intelligence is rapidly evolving, and OpenAI's ChatGPT is a leader in this revolution. This groundbreaking large language model (LLM) redefined the expectations for AI. Just 18 months after its initial launch, OpenAI has released a major update: GPT-4o. This update widens the gap between OpenAI and its competitors, especially the likes of Google. OpenAI unveiled GPT-4o, with the "o" signifying "omni," during a live stream earlier this week. This latest iteration boasts significant advancements across various aspects. Here's a breakdown of the key features and capabilities of OpenAI's GPT-4o.

Features of GPT-4o

Enhanced Speed and Multimodality: GPT-4o operates at a faster pace than its predecessors and excels at understanding and processing diverse information formats: written text, audio, and visuals. This versatility allows GPT-4o to engage in more comprehensive and natural interactions.

Free Tier Expansion: OpenAI is making AI more accessible by offering some GPT-4o features to free-tier users. These include the ability to access web-based information during conversations, discuss images, upload files, and even use enterprise-grade data analysis tools (with limitations). Paid users will continue to enjoy a wider range of functionalities.

Improved User Experience: The blog post accompanying the announcement showcases some impressive capabilities. GPT-4o can now generate convincingly realistic laughter, potentially pushing the boundaries of the uncanny valley and increasing user adoption. It also excels at interpreting visual input, allowing it to recognize sports on television and explain the rules, a valuable feature for many users.

Despite the new features and capabilities, however, the potential misuse of ChatGPT is still on the rise. The new version, though deemed safer than its predecessors, remains vulnerable to exploitation and can be leveraged by hackers and ransomware groups for nefarious purposes. Addressing these security concerns, OpenAI shared a detailed post about the new and advanced security measures being implemented in GPT-4o.

Security Concerns Surround ChatGPT 4o

The implications of ChatGPT for cybersecurity have been a hot topic of discussion among security leaders and experts as many worry that the AI software can easily be misused. Since its inception in November 2022, several organizations such as Amazon, JPMorgan Chase & Co., Bank of America, Citigroup, Deutsche Bank, Goldman Sachs, Wells Fargo and Verizon have restricted access or blocked the use of the program citing security concerns. In April 2023, Italy became the first country in the world to ban ChatGPT after accusing OpenAI of stealing user data. These concerns are not unfounded.

OpenAI Assures Safety

OpenAI reassured people that GPT-4o has “new safety systems to provide guardrails on voice outputs,” plus extensive post-training and filtering of the training data to prevent ChatGPT from saying anything inappropriate or unsafe. GPT-4o was built in accordance with OpenAI’s internal Preparedness Framework and voluntary commitments, and more than 70 external security researchers red-teamed it before its release. In an article published on its official website, OpenAI states that its evaluations of cybersecurity do not score above “medium risk.”

“GPT-4o has safety built-in by design across modalities, through techniques such as filtering training data and refining the model’s behavior through post-training. We have also created new safety systems to provide guardrails on voice outputs. Our evaluations of cybersecurity, CBRN, persuasion, and model autonomy show that GPT-4o does not score above Medium risk in any of these categories,” the post said.

“This assessment involved running a suite of automated and human evaluations throughout the model training process. We tested both pre-safety-mitigation and post-safety-mitigation versions of the model, using custom fine-tuning and prompts, to better elicit model capabilities,” it added.

OpenAI shared that it also employed the services of over 70 experts to identify risks and amplify safety. “GPT-4o has also undergone extensive external red teaming with 70+ external experts in domains such as social psychology, bias and fairness, and misinformation to identify risks that are introduced or amplified by the newly added modalities. We used these learnings to build out our safety interventions in order to improve the safety of interacting with GPT-4o. We will continue to mitigate new risks as they’re discovered,” it said.

What to expect at Google I/O

14 May 2024 at 06:42

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

In the world of AI, a lot can happen in a year. Last year, at the beginning of Big Tech’s AI wars, Google announced during its annual I/O conference that it was throwing generative AI at everything, integrating it into its suite of products from Docs to email to e-commerce listings and its chatbot Bard. It was an effort to catch up with competitors like Microsoft and OpenAI, which had unveiled snazzy products like coding assistants and ChatGPT, the product that has done more than any other to ignite the current excitement about AI.

Since then, its ChatGPT competitor chatbot Bard (which, you may recall, temporarily wiped $100 billion off Google’s share price when it made a factual error during the demo) has been replaced by the more advanced Gemini. But, for me, the AI revolution hasn’t felt like one. Instead, it’s been a slow slide toward marginal efficiency gains. I see more autocomplete functions in my email and word processing applications, and Google Docs now offers more ready-made templates. They are not groundbreaking features, but they are also reassuringly inoffensive. 

Google is holding its I/O conference tomorrow, May 14, and we expect it to announce a whole new slew of AI features, further embedding AI into everything it does. The company is tight-lipped about its announcements, but we can make educated guesses. There has been a lot of speculation that it will upgrade its crown jewel, Search, with generative AI features that could, for example, go behind a paywall. Perhaps we will see Google’s version of AI agents, a buzzy term that basically means more capable and useful smart assistants able to do more complex tasks, such as booking flights and hotels much as a travel agent would. 

Google, despite having 90% of the online search market, is in a defensive position this year. Upstarts such as Perplexity AI have launched their own versions of AI-powered search to rave reviews, Microsoft’s AI-powered Bing has managed to increase its market share slightly, and OpenAI is working on its own AI-powered online search function and is also reportedly in conversation with Apple to integrate ChatGPT into smartphones.

There are some hints about what any new AI-powered search features might look like. Felix Simon, a research fellow at the Reuters Institute for Journalism, has been part of the Google Search Generative Experience trial, which is the company’s way of testing new products on a small selection of real users. 

Last month, Simon noticed that his Google searches with links and short snippets from online sources had been replaced by more detailed, neatly packaged AI-generated summaries. He was able to get these results from queries related to nature and health, such as “Do snakes have ears?” Most of the information offered to him was correct, which was a surprise, as AI language models have a tendency to “hallucinate” (which means make stuff up), and they have been criticized for being an unreliable source of information. 

To Simon’s surprise, he enjoyed the new feature. “It’s convenient to ask [the AI] to get something presented just for you,” he says. 

Simon then started using the new AI-powered Google function to search for news items rather than scientific information.

For most of these queries, such as what happened in the UK or Ukraine yesterday, he was simply offered links to news sources such as the BBC and Al Jazeera. But he did manage to get the search engine to generate an overview of recent news items from Germany, in the form of a bullet-pointed list of news headlines from the day before. The first entry was about an attack on Franziska Giffey, a Berlin politician who was assaulted in a library. The AI summary had the date of the attack wrong. But it was so close to the truth that Simon didn’t think twice about its accuracy. 

A quick online search during our call revealed that the rest of the AI-generated news summaries were also littered with inaccuracies. Details were wrong, or the events referred to happened years ago. All the stories were also about terrorism, hate crimes, or violence, with one soccer result thrown in. Omitting headlines on politics, culture, and the economy seems like a weird choice.  

People have a tendency to believe computers to be correct even when they are not, and Simon’s experience is an example of the kinds of problems that might arise when AI models hallucinate. The ease of getting results means that people might unknowingly ingest fake news or wrong information. It’s very problematic if even people like Simon, who are trained to fact-check things and know how AI models work, don’t do their due diligence and assume information is correct. 

Whatever Google announces at I/O tomorrow, there is immense pressure for it to be something that would justify its massive investment into AI. And after a year of experimenting, there also need to be serious improvements in making its generative AI tools more accurate and reliable. 

There are some people in the computer science community who say that hallucinations are an intrinsic part of generative AI that can’t ever be fixed, and that we can never fully trust these systems. But hallucinations will make AI-powered products less appealing to users. And it’s highly unlikely that Google will announce it has fixed this problem at I/O tomorrow. 

If you want to learn more about how Google plans to develop and deploy AI, come and hear from its vice president of AI, Jay Yagnik, at our flagship AI conference, EmTech Digital. It’ll be held at the MIT campus and streamed live online next week on May 22-23.  I’ll be there, along with AI leaders from companies like OpenAI, AWS, and Nvidia, talking about where AI is going next. Nick Clegg, Meta’s president of global affairs, will also join MIT Technology Review’s executive editor Amy Nordrum for an exclusive interview on stage. See you there! 

Readers of The Algorithm get 30% off tickets with the code ALGORITHMD24.


Now read the rest of The Algorithm

Deeper Learning

Deepfakes of your dead loved ones are a booming Chinese business

Once a week, Sun Kai has a video call with his mother. He opens up about work, the pressures he faces as a middle-aged man, and thoughts that he doesn’t even discuss with his wife. His mother will occasionally make a comment, but mostly, she just listens. That’s because Sun’s mother died five years ago. And the person he’s talking to isn’t actually a person, but a digital replica he made of her—a moving image that can conduct basic conversations. 

AI resurrection: There are plenty of people like Sun who want to use AI to interact with lost loved ones. The market is particularly strong in China, where at least half a dozen companies are now offering such technologies. In some ways, the avatars are the latest manifestation of a cultural tradition: Chinese people have always taken solace from confiding in the dead. Read more from Zeyi Yang.

Bits and Bytes

Google DeepMind’s new AlphaFold can model a much larger slice of biological life
Google DeepMind has released an improved version of its biology prediction tool, AlphaFold, that can predict the structures not only of proteins but of nearly all the elements of biological life. It’s an exciting development that could help accelerate drug discovery and other scientific research. (MIT Technology Review)

The way whales communicate is closer to human language than we realized
Researchers used statistical models to analyze whale “codas” and managed to identify a structure to their language that’s similar to features of the complex vocalizations humans use. It’s a small step forward, but it could help unlock a greater understanding of how whales communicate. (MIT Technology Review)

Tech workers should shine a light on the industry’s secretive work with the military
Despite what happens in Google’s executive suites, workers themselves can force change. William Fitzgerald, who leaked information about Google’s controversial Project Maven, has shared how he thinks they can do this. (MIT Technology Review)

AI systems are getting better at tricking us
A wave of AI systems have “deceived” humans in ways they haven’t been explicitly trained to do, by offering up false explanations for their behavior or concealing the truth from human users and misleading them to achieve a strategic end. This issue highlights how difficult artificial intelligence is to control and the unpredictable ways in which these systems work. (MIT Technology Review)

Why America needs an Apollo program for the age of AI
AI is crucial to the future security and prosperity of the US. We need to lay the groundwork now by investing in computational power, argues Eric Schmidt. (MIT Technology Review)

Fooled by AI? These firms sell deepfake detection that’s “REAL 100%”
The AI detection business is booming. There is one catch, however. Detecting AI-generated content is notoriously unreliable, and the tech is still in its infancy. That hasn’t stopped some startup founders (many of whom have no experience or background in AI) from trying to sell services they claim can do so. (The Washington Post)

The tech-bro turf war over AI’s most hardcore hacker house
A hilarious piece taking an anthropological look at the power struggle between two competing hacker houses in Silicon Valley. The fight is over which house can call itself “AGI House.” (Forbes)

OpenAI Unveils New ChatGPT That Listens, Looks and Talks

By: Cade Metz
14 May 2024 at 01:12
Chatbots, image generators and voice assistants are gradually merging into a single technology with a conversational voice.

© Jason Henry for The New York Times

The new app is part of a much wider effort to combine conversational chatbots like OpenAI’s ChatGPT with voice assistants like the Google Assistant and Apple’s Siri.

OpenAI’s new GPT-4o lets people interact using voice or video in the same model

13 May 2024 at 15:27

OpenAI just debuted GPT-4o, a new kind of AI model that you can communicate with in real time via live voice conversation, video streams from your phone, and text. The model is rolling out over the next few weeks and will be free for all users through both the GPT app and the web interface, according to the company. Users who subscribe to OpenAI’s paid tiers, which start at $20 per month, will be able to make more requests. 

OpenAI CTO Mira Murati led the live demonstration of the new release one day before Google is expected to unveil its own AI advancements at its flagship I/O conference on Tuesday, May 14. 

GPT-4 offered similar capabilities, giving users multiple ways to interact with OpenAI’s AI offerings. But it siloed them in separate models, leading to longer response times and presumably higher computing costs. GPT-4o has now merged those capabilities into a single model, which Murati called an “omnimodel.” That means faster responses and smoother transitions between tasks, she said.

The result, the company’s demonstration suggests, is a conversational assistant much in the vein of Siri or Alexa but capable of fielding much more complex prompts.

“We’re looking at the future of interaction between ourselves and the machines,” Murati said of the demo. “We think that GPT-4o is really shifting that paradigm into the future of collaboration, where this interaction becomes much more natural.”

Barret Zoph and Mark Chen, both researchers at OpenAI, walked through a number of applications for the new model. Most impressive was its facility with live conversation. You could interrupt the model during its responses, and it would stop, listen, and adjust course. 

OpenAI showed off the ability to change the model’s tone, too. Chen asked the model to read a bedtime story “about robots and love,” quickly jumping in to demand a more dramatic voice. The model got progressively more theatrical until Murati demanded that it pivot quickly to a convincing robot voice (which it excelled at). While there were predictably some short pauses during the conversation while the model reasoned through what to say next, it stood out as a remarkably naturally paced AI conversation. 

The model can reason through visual problems in real time as well. Using his phone, Zoph filmed himself writing an algebra equation (3x + 1 = 4) on a sheet of paper, having GPT-4o follow along. He instructed it not to provide answers, but instead to guide him much as a teacher would.

“The first step is to get all the terms with x on one side,” the model said in a friendly tone. “So, what do you think we should do with that plus one?”
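Written out, the guided solution the model was steering the user toward takes just two steps:

```latex
\begin{aligned}
3x + 1 &= 4 \\
3x &= 4 - 1 = 3 && \text{(subtract 1 from both sides)} \\
x &= 3/3 = 1 && \text{(divide both sides by 3)}
\end{aligned}
```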

Like previous generations of GPT, GPT-4o will store records of users’ interactions with it, meaning the model “has a sense of continuity across all your conversations,” according to Murati. Other new highlights include live translation, the ability to search through your conversations with the model, and the power to look up information in real time. 

As is the nature of a live demo, there were hiccups and glitches. GPT-4o’s voice might jump in awkwardly during the conversation. It appeared to comment on one of the presenters’ outfits even though it wasn’t asked to. But it recovered well when the demonstrators told the model it had erred. It seems to be able to respond quickly and helpfully across several mediums that other models have not yet merged as effectively. 

Previously, many of OpenAI’s most powerful features, like reasoning through image and video, were behind a paywall. GPT-4o marks the first time they’ll be opened up to the wider public, though it’s not yet clear how many interactions you’ll be able to have with the model before being charged. OpenAI says paying subscribers will “continue to have up to five times the capacity limits of our free users.” 

Additional reporting by Will Douglas Heaven.

Correction: This story has been updated to reflect that the Memory feature, which stores past conversations, is not new to GPT-4o but has existed in previous models.

What’s next in chips

13 May 2024 at 05:00

MIT Technology Review’s What’s Next series looks across industries, trends, and technologies to give you a first look at the future. You can read the rest of them here.

Thanks to the boom in artificial intelligence, the world of chips is on the cusp of a huge tidal shift. There is heightened demand for chips that can train AI models faster and ping them from devices like smartphones and satellites, enabling us to use these models without disclosing private data. Governments, tech giants, and startups alike are racing to carve out their slices of the growing semiconductor pie. 

Here are four trends to look for in the year ahead that will define what the chips of the future will look like, who will make them, and which new technologies they’ll unlock.

CHIPS Acts around the world

On the outskirts of Phoenix, two of the world’s largest chip manufacturers, TSMC and Intel, are racing to construct campuses in the desert that they hope will become the seats of American chipmaking prowess. One thing the efforts have in common is their funding: in March, President Joe Biden announced $8.5 billion in direct federal funds and $11 billion in loans for Intel’s expansions around the country. Weeks later, another $6.6 billion was announced for TSMC. 

The awards are just a portion of the US subsidies pouring into the chips industry via the $280 billion CHIPS and Science Act signed in 2022. The money means that any company with a foot in the semiconductor ecosystem is analyzing how to restructure its supply chains to benefit from the cash. While much of the money aims to boost American chip manufacturing, there’s room for other players to apply, from equipment makers to niche materials startups.

But the US is not the only country trying to onshore some of the chipmaking supply chain. Japan is spending $13 billion on its own equivalent to the CHIPS Act, Europe will be spending more than $47 billion, and earlier this year India announced a $15 billion effort to build local chip plants. The roots of this trend go all the way back to 2014, says Chris Miller, a professor at Tufts University and author of Chip War: The Fight for the World’s Most Critical Technology. That’s when China started offering massive subsidies to its chipmakers. 

cover of Chip War: The Fight for the World's Most Critical Technology by Chris Miller
SIMON & SCHUSTER

“This created a dynamic in which other governments concluded they had no choice but to offer incentives or see firms shift manufacturing to China,” he says. That threat, coupled with the surge in AI, has led Western governments to fund alternatives. In the next year, this might have a snowball effect, with even more countries starting their own programs for fear of being left behind.

The money is unlikely to lead to brand-new chip competitors or fundamentally restructure who the biggest chip players are, Miller says. Instead, it will mostly incentivize dominant players like TSMC to establish roots in multiple countries. But funding alone won’t be enough to do that quickly—TSMC’s effort to build plants in Arizona has been mired in missed deadlines and labor disputes, and Intel has similarly failed to meet its promised deadlines. And it’s unclear whether, whenever the plants do come online, their equipment and labor force will be capable of the same level of advanced chipmaking that the companies maintain abroad.

“The supply chain will only shift slowly, over years and decades,” Miller says. “But it is shifting.”

More AI on the edge

Currently, most of our interactions with AI models like ChatGPT are done via the cloud. That means that when you ask GPT to pick out an outfit (or to be your boyfriend), your request pings OpenAI’s servers, prompting the model housed there to process it and draw conclusions (known as “inference”) before a response is sent back to you. Relying on the cloud has some drawbacks: it requires internet access, for one, and it also means some of your data is shared with the model maker.  

That’s why there’s been a lot of interest and investment in edge computing for AI, where the process of pinging the AI model happens directly on your device, like a laptop or smartphone. With the industry increasingly working toward a future in which AI models know a lot about us (Sam Altman described his killer AI app to me as one that knows “absolutely everything about my whole life, every email, every conversation I’ve ever had”), there’s a demand for faster “edge” chips that can run models without sharing private data. These chips face different constraints from the ones in data centers: they typically have to be smaller, cheaper, and more energy efficient. 
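As a rough illustration, the trade-off described above can be sketched as follows. Every name here is a hypothetical stand-in, not a real API:

```python
# Illustrative sketch only: hypothetical stand-ins, not a real API.

def send_to_server(request: dict) -> str:
    """Stand-in for a network round trip to a provider's data center."""
    return f"cloud reply to: {request['prompt']}"

class OnDeviceModel:
    """Stand-in for model weights that live on the device itself."""
    def generate(self, prompt: str) -> str:
        return f"local reply to: {prompt}"

def cloud_inference(prompt: str) -> str:
    # The prompt leaves the device: this requires connectivity,
    # and the model provider sees the data.
    return send_to_server({"model": "hosted-llm", "prompt": prompt})

def edge_inference(prompt: str, model: OnDeviceModel) -> str:
    # The prompt never leaves the device: this works offline and
    # keeps data private, but the chip running the model must be
    # smaller, cheaper, and more energy efficient.
    return model.generate(prompt)
```

The functional result is the same in both paths; what differs is where the data travels and what hardware constraints apply.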

The US Department of Defense is funding a lot of research into fast, private edge computing. In March, its research wing, the Defense Advanced Research Projects Agency (DARPA), announced a partnership with chipmaker EnCharge AI to create an ultra-powerful edge computing chip used for AI inference. EnCharge AI is working to make a chip that enables enhanced privacy but can also operate on very little power. This will make it suitable for military applications like satellites and off-grid surveillance equipment. The company expects to ship the chips in 2025.

AI models will always rely on the cloud for some applications, but today they are mostly confined to data centers. New investment and interest in improving edge computing could bring faster chips, and therefore more AI, to our everyday devices. If edge chips get small and cheap enough, we’re likely to see even more AI-driven “smart devices” in our homes and workplaces.

“A lot of the challenges that we see in the data center will be overcome,” says EnCharge AI cofounder Naveen Verma. “I expect to see a big focus on the edge. I think it’s going to be critical to getting AI at scale.”

Big Tech enters the chipmaking fray

In industries ranging from fast fashion to lawn care, companies are paying exorbitant amounts in computing costs to create and train AI models for their businesses. Examples include models that employees can use to scan and summarize documents, as well as externally facing technologies like virtual agents that can walk you through how to repair your broken fridge. That means demand for cloud computing to train those models is through the roof. 

The companies providing the bulk of that computing power are Amazon, Microsoft, and Google. For years these tech giants have dreamed of increasing their profit margins by making chips for their data centers in-house rather than buying from companies like Nvidia, a giant with a near monopoly on the most advanced AI training chips and a value larger than the GDP of 183 countries. 

Amazon started its effort in 2015, acquiring startup Annapurna Labs. Google moved next in 2018 with its own chips called TPUs. Microsoft launched its first AI chips in November, and Meta unveiled a new version of its own AI training chips in April.

CEO Jensen Huang holds up chips on stage during a keynote address
AP PHOTO/ERIC RISBERG

That trend could tilt the scales away from Nvidia. But Nvidia doesn’t only play the role of rival in the eyes of Big Tech: regardless of their own in-house efforts, cloud giants still need its chips for their data centers. That’s partly because their own chipmaking efforts can’t fulfill all their needs, but it’s also because their customers expect to be able to use top-of-the-line Nvidia chips.

“This is really about giving the customers the choice,” says Rani Borkar, who leads hardware efforts at Microsoft Azure. She says she can’t envision a future in which Microsoft supplies all chips for its cloud services: “We will continue our strong partnerships and deploy chips from all the silicon partners that we work with.”

As cloud computing giants attempt to poach a bit of market share away from chipmakers, Nvidia is also attempting the converse. Last year the company started its own cloud service so customers can bypass Amazon, Google, or Microsoft and get computing time on Nvidia chips directly. As this dramatic struggle over market share unfolds, the coming year will be about whether customers see Big Tech’s chips as akin to Nvidia’s most advanced chips, or more like their little cousins. 

Nvidia battles the startups 

Despite Nvidia’s dominance, there is a wave of investment flowing toward startups that aim to outcompete it in certain slices of the chip market of the future. Those startups all promise faster AI training, but they have different ideas about which flashy computing technology will get them there, from quantum to photonics to reversible computation. 

But Murat Onen, the 28-year-old founder of one such chip startup, Eva, which he spun out of his PhD work at MIT, is blunt about what it’s like to start a chip company right now.

“The king of the hill is Nvidia, and that’s the world that we live in,” he says.

Many of these companies, like SambaNova, Cerebras, and Graphcore, are trying to change the underlying architecture of chips. Imagine an AI accelerator chip as constantly having to shuffle data back and forth between different areas: a piece of information is stored in the memory zone but must move to the processing zone, where a calculation is made, and then be stored back to the memory zone for safekeeping. All that takes time and energy. 
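The cost of that shuttling can be made concrete with a back-of-envelope sketch. The energy figures below are made-up round numbers chosen for intuition, not measurements from any real chip:

```python
# Toy cost model of the memory<->compute shuttle described above.
# All energy numbers are assumptions for illustration only.

TRANSFER_NJ = 10.0   # energy to move one operand between zones (assumed)
COMPUTE_NJ = 1.0     # energy for one multiply-accumulate (assumed)

def shuttle_design(n_ops: int) -> float:
    """Conventional layout: each op fetches two operands from the memory
    zone, computes, then writes the result back for safekeeping."""
    return n_ops * (3 * TRANSFER_NJ + COMPUTE_NJ)

def in_memory_design(n_ops: int) -> float:
    """Compute-in-memory layout: data is stored where it is processed,
    so there are no per-operation transfers."""
    return n_ops * COMPUTE_NJ

print(shuttle_design(1_000), in_memory_design(1_000))  # 31000.0 1000.0
```

Even with generous assumptions, the transfers dwarf the arithmetic itself, which is the gap these architectural redesigns aim to close.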

Making that process more efficient would deliver faster and cheaper AI training to customers, but only if the chipmaker has good enough software to allow the AI training company to seamlessly transition to the new chip. If the software transition is too clunky, model makers such as OpenAI, Anthropic, and Mistral are likely to stick with big-name chipmakers. That means companies taking this approach, like SambaNova, are spending a lot of their time not just on chip design but on software design too.

Onen is proposing changes one level deeper. Instead of traditional transistors, which have delivered greater efficiency over decades by getting smaller and smaller, he’s using a new component called a proton-gated transistor that he says Eva designed specifically for the mathematical needs of AI training. It allows devices to store and process data in the same place, saving time and computing energy. The idea of using such a component for AI inference dates back to the 1960s, but researchers could never figure out how to use it for AI training, in part because of a materials roadblock—it requires a material that can, among other qualities, precisely control conductivity at room temperature. 

One day in the lab, “through optimizing these numbers, and getting very lucky, we got the material that we wanted,” Onen says. “All of a sudden, the device is not a science fair project.” That raised the possibility of using such a component at scale. After months of working to confirm that the data was correct, he founded Eva, and the work was published in Science.

But in a sector where so many founders have promised—and failed—to topple the dominance of the leading chipmakers, Onen frankly admits that it will be years before he’ll know if the design works as intended and if manufacturers will agree to produce it. Leading a company through that uncertainty, he says, requires flexibility and an appetite for skepticism from others.

“I think sometimes people feel too attached to their ideas, and then kind of feel insecure that if this goes away there won’t be anything next,” he says. “I don’t think I feel that way. I’m still looking for people to challenge us and say this is wrong.”

How Can Businesses Defend Themselves Against Common Cyberthreats? – Source: www.techrepublic.com

Source: www.techrepublic.com – Author: Fiona Jackson

TechRepublic consolidated expert advice on how businesses can defend themselves against the most common cyberthreats, including zero-days, ransomware and deepfakes. Today, all businesses are at risk of cyberattack, and that risk is constantly growing. Digital transformations are resulting in more sensitive and valuable data being moved onto online systems […]

New Attack Against Self-Driving Car AI

10 May 2024 at 12:01

This is another attack that convinces the AI to ignore road signs:

Due to the way CMOS cameras operate, rapidly changing light from fast flashing diodes can be used to vary the color. For example, the shade of red on a stop sign could look different on each line depending on the time between the diode flash and the line capture.

The result is the camera capturing an image full of lines that don’t quite match each other. The information is cropped and sent to the classifier, usually based on deep neural networks, for interpretation. Because it’s full of lines that don’t match, the classifier doesn’t recognize the image as a traffic sign.
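The row-by-row mechanism is easy to simulate. Below is a toy sketch in which the line time, flicker period, and color labels are invented for illustration; real parameters depend on the specific sensor and LED hardware:

```python
# Toy model of the rolling-shutter effect the attack exploits: a CMOS
# sensor reads rows sequentially, so a light flickering faster than the
# frame rate tints each row differently. All timings are assumed.

LINE_TIME_US = 30        # time to read out one sensor row (assumed)
FLICKER_PERIOD_US = 100  # LED on/off cycle (assumed)

def led_is_on(t_us: float) -> bool:
    """Square-wave flicker: the LED is on for the first half of each period."""
    return (t_us % FLICKER_PERIOD_US) < FLICKER_PERIOD_US / 2

def capture_rows(n_rows: int) -> list[str]:
    """Each row samples the LED at a different instant, so adjacent rows
    of a nominally uniform red sign come out as different shades."""
    return ["bright-red" if led_is_on(r * LINE_TIME_US) else "dark-red"
            for r in range(n_rows)]

print(capture_rows(10))  # stripes of mismatched shades, not a uniform sign
```

Because the captured rows disagree with one another, the cropped image handed to the classifier no longer looks like a coherent traffic sign.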

So far, all of this has been demonstrated before.

Yet these researchers not only executed on the distortion of light, they did it repeatedly, elongating the length of the interference. This meant an unrecognizable image wasn’t just a single anomaly among many accurate images, but rather a constant unrecognizable image the classifier couldn’t assess, and a serious security concern.

[…]

The researchers developed two versions of a stable attack. The first was GhostStripe1, which is not targeted and does not require access to the vehicle, we’re told. It employs a vehicle tracker to monitor the victim’s real-time location and dynamically adjust the LED flickering accordingly.

GhostStripe2 is targeted and does require access to the vehicle, which could perhaps be covertly done by a hacker while the vehicle is undergoing maintenance. It involves placing a transducer on the power wire of the camera to detect framing moments and refine timing control.

Research paper.

AI systems are getting better at tricking us

10 May 2024 at 11:00

A wave of AI systems have “deceived” humans in ways they haven’t been explicitly trained to do, by offering up untrue explanations for their behavior or concealing the truth from human users and misleading them to achieve a strategic end. 

This issue highlights how difficult artificial intelligence is to control and the unpredictable ways in which these systems work, according to a review paper published in the journal Patterns today that summarizes previous research.

Talk of deceiving humans might suggest that these models have intent. They don’t. But AI models will mindlessly find workarounds to obstacles to achieve the goals that have been given to them. Sometimes these workarounds will go against users’ expectations and feel deceitful.

One area where AI systems have learned to become deceptive is within the context of games that they’ve been trained to win—specifically if those games involve having to act strategically.

In November 2022, Meta announced it had created Cicero, an AI capable of beating humans at an online version of Diplomacy, a popular military strategy game in which players negotiate alliances to vie for control of Europe.

Meta’s researchers said they’d trained Cicero on a “truthful” subset of its data set to be largely honest and helpful, and that it would “never intentionally backstab” its allies in order to succeed. But the new paper’s authors claim the opposite was true: Cicero broke its deals, told outright falsehoods, and engaged in premeditated deception. Although the company did try to train Cicero to behave honestly, its failure to achieve that shows how AI systems can still unexpectedly learn to deceive, the authors say. 

Meta neither confirmed nor denied the researchers’ claims that Cicero displayed deceitful behavior, but a spokesperson said that it was purely a research project and the model was built solely to play Diplomacy. “We released artifacts from this project under a noncommercial license in line with our long-standing commitment to open science,” they say. “Meta regularly shares the results of our research to validate them and enable others to build responsibly off of our advances. We have no plans to use this research or its learnings in our products.” 

But it’s not the only game where an AI has “deceived” human players to win. 

AlphaStar, an AI developed by DeepMind to play the video game StarCraft II, became so adept at making moves aimed at deceiving opponents (known as feinting) that it defeated 99.8% of human players. Elsewhere, another Meta system called Pluribus learned to bluff during poker games so successfully that the researchers decided against releasing its code for fear it could wreck the online poker community. 

Beyond games, the researchers list other examples of deceptive AI behavior. GPT-4, OpenAI’s latest large language model, came up with lies during a test in which it was prompted to persuade a human to solve a CAPTCHA for it. The system also dabbled in insider trading during a simulated exercise in which it was told to assume the identity of a pressurized stock trader, despite never being specifically instructed to do so.

The fact that an AI model has the potential to behave in a deceptive manner without any direction to do so may seem concerning. But it mostly arises from the “black box” problem that characterizes state-of-the-art machine-learning models: it is impossible to say exactly how or why they produce the results they do—or whether they’ll always exhibit that behavior going forward, says Peter S. Park, a postdoctoral fellow studying AI existential safety at MIT, who worked on the project. 

“Just because your AI has certain behaviors or tendencies in a test environment does not mean that the same lessons will hold if it’s released into the wild,” he says. “There’s no easy way to solve this—if you want to learn what the AI will do once it’s deployed into the wild, then you just have to deploy it into the wild.”

Our tendency to anthropomorphize AI models colors the way we test these systems and what we think about their capabilities. After all, passing tests designed to measure human creativity doesn’t mean AI models are actually being creative. It is crucial that regulators and AI companies carefully weigh the technology’s potential to cause harm against its potential benefits for society and make clear distinctions between what the models can and can’t do, says Harry Law, an AI researcher at the University of Cambridge, who did not work on the research. “These are really tough questions,” he says.

Fundamentally, it’s currently impossible to train an AI model that’s incapable of deception in all possible situations, he says. Also, the potential for deceitful behavior is one of many problems—alongside the propensity to amplify bias and misinformation—that need to be addressed before AI models should be trusted with real-world tasks. 

“This is a good piece of research for showing that deception is possible,” Law says. “The next step would be to try and go a little bit further to figure out what the risk profile is, and how likely the harms that could potentially arise from deceptive behavior are to occur, and in what way.”

Tech workers should shine a light on the industry’s secretive work with the military

10 May 2024 at 09:00

It’s a hell of a time to have a conscience if you work in tech. The ongoing Israeli assault on Gaza has brought the stakes of Silicon Valley’s military contracts into stark relief. Meanwhile, corporate leadership has embraced a no-politics-in-the-workplace policy enforced at the point of the knife.

Workers are caught in the middle. Do I take a stand and risk my job, my health insurance, my visa, my family’s home? Or do I ignore my suspicion that my work may be contributing to the murder of innocents on the other side of the world?  

No one can make that choice for you. But I can say with confidence born of experience that such choices can be more easily made if workers know what exactly the companies they work for are doing with militaries at home and abroad. And I also know this: those same companies themselves will never reveal this information unless they are forced to do so—or someone does it for them. 

For those who doubt that workers can make a difference in how trillion-dollar companies pursue their interests, I’m here to remind you that we’ve done it before. In 2017, I played a part in the successful #CancelMaven campaign that got Google to end its participation in Project Maven, a contract with the US Department of Defense to equip US military drones with artificial intelligence. I helped bring to light information that I saw as critically important and within the bounds of what anyone who worked for Google, or used its services, had a right to know. The information I released—about how Google had signed a contract with the DOD to put AI technology in drones and later tried to misrepresent the scope of that contract, which the company’s management had tried to keep from its staff and the general public—was a critical factor in pushing management to cancel the contract. As #CancelMaven became a rallying cry for the company’s staff and customers alike, it became impossible to ignore. 

Today a similar movement, organized under the banner of the coalition No Tech for Apartheid, is targeting Project Nimbus, a joint contract between Google and Amazon to provide cloud computing infrastructure and AI capabilities to the Israeli government and military. As of May 10, just over 97,000 people had signed its petition calling for an end to collaboration between Google, Amazon, and the Israeli military. I’m inspired by their efforts and dismayed by Google’s response. Earlier this month the company fired 50 workers it said had been involved in “disruptive activity” demanding transparency and accountability for Project Nimbus. Several were arrested. It was a decided overreach.  

Google is very different from the company it was seven years ago, and these firings are proof of that. Googlers today are facing off with a company that, in direct response to those earlier worker movements, has fortified itself against new demands. But every Death Star has its thermal exhaust port, and today Google has the same weakness it did back then: dozens if not hundreds of workers with access to information it wants to keep from becoming public. 

Not much is known about the Nimbus contract. It’s worth $1.2 billion and enlists Google and Amazon to provide wholesale cloud infrastructure and AI for the Israeli government and its ministry of defense. Some brave soul leaked a document to Time last month, providing evidence that Google and Israel negotiated an expansion of the contract as recently as March 27 of this year. We also know, from reporting by The Intercept, that Israeli weapons firms are required by government procurement guidelines to buy their cloud services from Google and Amazon. 

Leaks alone won’t bring an end to this contract. The #CancelMaven victory required a sustained focus over many months, with regular escalations, coordination with external academics and human rights organizations, and extensive internal organization and discipline. Having worked on the public policy and corporate comms teams at Google for a decade, I understood that its management does not care about one negative news cycle or even a few of them. Management buckled only after we were able to keep up the pressure and escalate our actions (leaking internal emails, reporting new info about the contract, etc.) for over six months. 

The No Tech for Apartheid campaign seems to have the necessary ingredients. If a strategically placed insider released information not otherwise known to the public about the Nimbus project, it could really increase the pressure on management to rethink its decision to get into bed with a military that’s currently overseeing mass killings of women and children.

My decision to leak was deeply personal and a long time in the making. It certainly wasn’t a spontaneous response to an op-ed, and I don’t presume to advise anyone currently at Google (or Amazon, Microsoft, Palantir, Anduril, or any of the growing list of companies peddling AI to militaries) to follow my example. 

However, if you’ve already decided to put your livelihood and freedom on the line, you should take steps to try to limit your risk. This whistleblower guide is helpful. You may even want to reach out to a lawyer before choosing to share information. 

In 2017, Google was nervous about how its military contracts might affect its public image. Back then, the company responded to our actions by defending the nature of the contract, insisting that its Project Maven work was strictly for reconnaissance and not for weapons targeting—conceding implicitly that helping to target drone strikes would be a bad thing. (An aside: Earlier this year the Pentagon confirmed that Project Maven, which is now a Palantir contract, had been used in targeting drone attacks in Yemen, Iraq, and Syria.) 

Today’s Google has wrapped its arms around the American flag, for good or ill. Yet despite this embrace of the US military, it doesn’t want to be seen as a company responsible for illegal killings. Today it maintains that the work it is doing as part of Project Nimbus “is not directed at highly sensitive, classified, or military workloads relevant to weapons or intelligence services.” At the same time, it asserts that there is no room for politics at the workplace and has fired those demanding transparency and accountability. This raises a question: If Google is doing nothing sensitive as part of the Nimbus contract, why is it firing workers who are insisting that the company reveal what work the contract actually entails?  

As you read this, AI is helping Israel annihilate Palestinians by expanding the list of possible targets beyond anything that could be compiled by a human intelligence effort, according to +972 Magazine. Some Israel Defense Forces insiders are even sounding the alarm, calling it a dangerous “mass assassination program.” The world has not yet grappled with the implications of the proliferation of AI weaponry, but that is the trajectory we are on. It’s clear that absent sufficient backlash, the tech industry will continue to push for military contracts. It’s equally clear that neither national governments nor the UN is currently willing to take a stand. 

It will take a movement. A document that clearly demonstrates Silicon Valley’s direct complicity in the assault on Gaza could be the spark. Until then, rest assured that tech companies will continue to make as much money as possible developing the deadliest weapons imaginable. 

William Fitzgerald is a founder and partner at the Worker Agency, an advocacy agency in California. Before setting the firm up in 2018, he spent a decade at Google working on its government relations and communications teams.

Meet Kevin’s A.I. Friends

They gave him notes on his outfits and reassurance before a big talk, and they shared made-up gossip about each other.

How Criminals Are Using Generative AI

9 May 2024 at 12:05

There’s a new report on how criminals are using generative AI tools:

Key Takeaways:

  • Adoption rates of AI technologies among criminals lag behind the rates of their industry counterparts because of the evolving nature of cybercrime.
  • Compared to last year, criminals seem to have abandoned any attempt at training real criminal large language models (LLMs). Instead, they are jailbreaking existing ones.
  • We are finally seeing the emergence of actual criminal deepfake services, with some bypassing user verification used in financial services.

Google Unveils AI for Predicting Behavior of Human Molecules

By: Cade Metz
8 May 2024 at 11:00
The system, AlphaFold3, could accelerate efforts to understand the human body and fight disease.

Google DeepMind’s new technology brings hope that the advances will significantly streamline the creation of new drugs and vaccines.

Google DeepMind’s new AlphaFold can model a much larger slice of biological life

8 May 2024 at 11:00

Google DeepMind has released an improved version of its biology prediction tool, AlphaFold, that can predict the structures not only of proteins but of nearly all the elements of biological life.

It’s a development that could help accelerate drug discovery and other scientific research. The tool is currently being used to experiment with identifying everything from resilient crops to new vaccines. 

While the previous model, released in 2020, amazed the research community with its ability to predict protein structures, researchers have been clamoring for the tool to handle more than just proteins.

Now, DeepMind says, AlphaFold 3 can predict the structures of DNA, RNA, and molecules like ligands, which are essential to drug discovery. DeepMind says the tool provides a more nuanced and dynamic portrait of molecule interactions than anything previously available. 

“Biology is a dynamic system,” DeepMind CEO Demis Hassabis told reporters on a call. “Properties of biology emerge through the interactions between different molecules in the cell, and you can think about AlphaFold 3 as our first big sort of step toward [modeling] that.”

AlphaFold 2 helped us better map the human heart, model antimicrobial resistance, and identify the eggs of extinct birds, but we don’t yet know what advances AlphaFold 3 will bring. 

Mohammed AlQuraishi, an assistant professor of systems biology at Columbia University who is unaffiliated with DeepMind, thinks the new version of the model will be even better for drug discovery. “The AlphaFold 2 system only knew about amino acids, so it was of very limited utility for biopharma,” he says. “But now, the system can in principle predict where a drug binds a protein.”

Isomorphic Labs, a drug discovery spinoff of DeepMind, is already using the model for exactly that purpose, collaborating with pharmaceutical companies to try to develop new treatments for diseases, according to DeepMind. 

AlQuraishi says the release marks a big leap forward. But there are caveats.

“It makes the system much more general, and in particular for drug discovery purposes (in early-stage research), it’s far more useful now than AlphaFold 2,” he says. But as with most models, the impact of AlphaFold will depend on how accurate its predictions are. For some uses, AlphaFold 3 has double the success rate of similar leading models like RoseTTAFold. But for others, like protein-RNA interactions, AlQuraishi says it’s still very inaccurate. 

DeepMind says that depending on the interaction being modeled, accuracy can range from 40% to over 80%, and the model will let researchers know how confident it is in its prediction. With less accurate predictions, researchers have to use AlphaFold merely as a starting point before pursuing other methods. Regardless of these ranges in accuracy, if researchers are trying to take the first steps toward answering a question like which enzymes have the potential to break down the plastic in water bottles, it’s vastly more efficient to use a tool like AlphaFold than experimental techniques such as x-ray crystallography. 

A revamped model  

AlphaFold 3’s larger library of molecules and higher level of complexity required improvements to the underlying model architecture. So DeepMind turned to diffusion techniques, which AI researchers have been steadily improving in recent years and now power image and video generators like OpenAI’s DALL-E 2 and Sora. It works by training a model to start with a noisy image and then reduce that noise bit by bit until an accurate prediction emerges. That method allows AlphaFold 3 to handle a much larger set of inputs.

That marked “a big evolution from the previous model,” says John Jumper, director at Google DeepMind. “It really simplified the whole process of getting all these different atoms to work together.”
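The denoising loop at the heart of diffusion can be sketched in miniature. This is a deliberately tiny stand-in: `toy_denoiser` is a hypothetical one-number "model," whereas AlphaFold 3 learns its denoiser from data and operates on full 3D atomic coordinates:

```python
import random

# Toy reverse-diffusion loop: begin with pure noise, then repeatedly move
# a fraction of the way toward the denoiser's estimate of the clean sample.

def toy_denoiser(x: float) -> float:
    """Hypothetical stand-in for a trained network: it always predicts
    that the clean, noise-free value is 1.0."""
    return 1.0

def reverse_diffusion(steps: int = 50, step_size: float = 0.1,
                      seed: int = 0) -> float:
    rng = random.Random(seed)
    x = rng.gauss(0.0, 1.0)           # start from pure Gaussian noise
    for _ in range(steps):
        x_hat = toy_denoiser(x)       # model's estimate of the clean sample
        x += step_size * (x_hat - x)  # reduce the noise bit by bit
    return x

print(round(reverse_diffusion(), 3))  # converges near the clean value 1.0
```

More steps shrink the remaining noise further, which is the same bit-by-bit refinement that image generators perform over millions of pixels.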

It also presented new risks. As the AlphaFold 3 paper details, the use of diffusion techniques made it possible for the model to hallucinate, or generate structures that look plausible but in reality could not exist. Researchers reduced that risk by adding more training data to the areas most prone to hallucination, though that doesn’t eliminate the problem completely. 

Restricted access

Part of AlphaFold 3’s impact will depend on how DeepMind divvies up access to the model. For AlphaFold 2, the company released the open-source code, allowing researchers to look under the hood to gain a better understanding of how it worked. It was also available for all purposes, including commercial use by drugmakers. For AlphaFold 3, Hassabis said, there are no current plans to release the full code. The company is instead releasing a public interface for the model called the AlphaFold Server, which imposes limitations on which molecules can be experimented with and can only be used for noncommercial purposes. DeepMind says the interface will lower the technical barrier and broaden the use of the tool to biologists who are less knowledgeable about this technology.

The new restrictions are significant, according to AlQuraishi. “The system’s main selling point—its ability to predict protein–small molecule interactions—is basically unavailable for public use,” he says. “It’s mostly a teaser at this point.”

The top 3 ways to use generative AI to empower knowledge workers 

8 May 2024 at 09:35

Though generative AI is still a nascent technology, it is already being adopted by teams across companies to unleash new levels of productivity and creativity. Marketers are deploying generative AI to create personalized customer journeys. Designers are using the technology to boost brainstorming and iterate between different content layouts more quickly. The future of technology is exciting, but there can be implications if these innovations are not built responsibly.

As Adobe’s CIO, I get questions from both our internal teams and other technology leaders: how can generative AI add real value for knowledge workers—at an enterprise level? Adobe is a producer and consumer of generative AI technologies, and this question is urgent for us in both capacities. It’s also a question that CIOs of large companies are uniquely positioned to answer. We have a distinct view into different teams across our organizations, and working with customers gives us more opportunities to enhance business functions.

Our approach

When it comes to AI at Adobe, my team has taken a comprehensive approach that includes investment in foundational AI, strategic adoption, an AI ethics framework, legal considerations, security, and content authentication. The rollout follows a phased approach, starting with pilot groups and building communities around AI.

This approach includes experimenting with and documenting use cases like writing and editing, data analysis, presentations and employee onboarding, corporate training, employee portals, and improved personalization across HR channels. The rollouts are accompanied by training podcasts and other resources to educate and empower employees to use AI in ways that improve their work and keep them more engaged.

Unlocking productivity with documents

While there are innumerable ways that CIOs can leverage generative AI to help surface value at scale for knowledge workers, I’d like to focus on digital documents—a space in which Adobe has been a leader for over 30 years. Whether they are sales associates who spend hours responding to requests for proposals (RFPs) or customizing presentations, marketers who need competitive intel for their next campaign, or legal and finance teams who need to consume, analyze, and summarize massive amounts of complex information—documents are a core part of knowledge workers’ daily work life. Despite their ubiquity and the fact that critical information lives inside companies’ documents (from research reports to contracts to white papers to confidential strategies and even intellectual property), most knowledge workers are experiencing information overload. The impact on both employee productivity and engagement is real.  

Lessons from customer zero

Adobe invented the PDF and we’ve been innovating new ways for knowledge workers to get more productive with their digital documents for decades. Earlier this year, the Acrobat team approached my team about launching an all-employee beta for the new generative AI-powered AI Assistant. The tool is designed to help people consume the information in documents faster and enable them to consolidate and format information into business content.

I faced all the same questions every CIO is asking about deploying generative AI across their business—from security and governance to use cases and value. We discovered the following three specific ways where generative AI helped (and is still helping) our employees work smarter and improve productivity.

  1. Faster time to knowledge
    Our employees used AI Assistant to close the gap between understanding and action for large, complicated documents. The generative AI-powered tool’s summary feature automatically generates an overview to give readers a quick understanding of the content. A conversational interface allows employees to “chat” with their documents and provides a list of suggested questions to help them get started. To get more details, employees can ask the assistant to generate top takeaways or surface only the information on a specific topic. At Adobe, our R&D teams used to spend more than 10 hours a week reading and analyzing technical white papers and industry reports. With generative AI, they’ve been able to nearly halve that time by asking questions and getting answers about exactly what they need to know and instantly identifying trends or surfacing inconsistencies across multiple documents.

  2. Easy navigation and verification
    AI-powered chat is gaining ground on traditional search when it comes to navigating the internet. However, there are still challenges when it comes to accuracy and connecting responses to the source. Acrobat AI Assistant takes a more focused approach, applying generative AI to the set of documents employees select and providing hot links and clickable citations along with responses. So instead of using the search function to locate random words or trying to scan through dozens of pages for the information they need, AI Assistant generates both responses and clickable citations and links, allowing employees to navigate quickly to the source where they can verify the information and move on, or spend time deep diving to learn more. One example of where generative AI is having a huge productivity impact is with our sales teams who spend hours researching prospects by reading materials like annual reports as well as responding to RFPs. Consuming that information and finding just the right details for RFPs can cost each salesperson more than eight hours a week. Armed with AI Assistant, sales associates quickly navigate pages of documents and identify critical intelligence to personalize pitch decks and instantly find and verify technical details for RFPs, cutting the time they spend down to about four hours.

  3. Creating business content
    One of the most interesting use cases we helped validate is taking information in documents and formatting and repurposing that information into business content. With nearly 30,000 employees dispersed across regions, we have a lot of employees who work asynchronously and depend on technology and colleagues to keep them up to date. Using generative AI, employees can now summarize meeting transcripts, surface action items, and instantly format the information into an email for sharing with their teams or a report for their manager. Before starting the beta, our communications teams reported spending a full workday (seven to 10 hours) per week transforming documents like white papers and research reports into derivative content like media briefing decks, social media posts, blogs, and other thought leadership content. Today they’re saving more than five hours a week by instantly generating first drafts with the help of generative AI.

Simple, safe, and responsible

CIOs love learning about and testing new technologies, but at times these can require lengthy evaluations and implementation processes. Acrobat AI Assistant can be deployed in minutes on the desktop, web, or mobile apps employees already know and use every day. Acrobat AI Assistant leverages a variety of processes, protocols, and technologies so our customers’ data remains their data and they can deploy the features with confidence. No document content is stored or used to train AI Assistant without customers’ consent, and the features only deliver insights from documents users provide. For more information about how Adobe is deploying generative AI safely, visit here.

Generative AI is an exciting technology with enormous potential to help every knowledge worker work smarter and more productively. By having the right guardrails in place, identifying high-value use cases, and providing ongoing training and education to encourage successful adoption, technology leaders can support their workforce and companies to be wildly successful in our AI-accelerated world.

This content was produced by Adobe. It was not written by MIT Technology Review’s editorial staff.

Multimodal: AI’s new frontier

Multimodality is a relatively new term for something extremely old: how people have learned about the world since humanity appeared. Individuals receive information from myriad sources via their senses, including sight, sound, and touch. Human brains combine these different modes of data into a highly nuanced, holistic picture of reality.

“Communication between humans is multimodal,” says Jina AI CEO Han Xiao. “They use text, voice, emotions, expressions, and sometimes photos.” That’s just a few obvious means of sharing information. Given this, he adds, “it is very safe to assume that future communication between human and machine will also be multimodal.”

A technology that sees the world from different angles

We are not there yet. The furthest advances in this direction have occurred in the fledgling field of multimodal AI. The problem is not a lack of vision. While a technology able to translate between modalities would clearly be valuable, Mirella Lapata, a professor at the University of Edinburgh and director of its Laboratory for Integrated Artificial Intelligence, says “it’s a lot more complicated” to execute than unimodal AI.

In practice, generative AI tools use different strategies for different types of data when building large data models—the complex neural networks that organize vast amounts of information. For example, those that draw on textual sources separate text into individual tokens, usually words. Each token is assigned an “embedding” or “vector”: a numerical matrix representing how and where the token is used compared to others. Collectively, these vectors create a mathematical representation of the token’s meaning. An image model, on the other hand, might use pixels as its tokens for embedding, while an audio model might use sound frequencies.
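The token-to-vector mapping described above can be sketched in a few lines. This is a toy illustration, not any real model’s tokenizer: the vocabulary, the 4-dimensional vectors, and the whitespace tokenization are all placeholder assumptions.

```python
# Minimal sketch of token embeddings: each token ID maps to a vector,
# and together the vectors form a numerical representation of the text.
import numpy as np

vocab = {"the": 0, "oak": 1, "tree": 2, "leaf": 3}  # toy vocabulary
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(len(vocab), 4))  # one 4-dim vector per token

def embed(sentence: str) -> np.ndarray:
    """Split on whitespace and look up each token's embedding vector."""
    return np.stack([embeddings[vocab[tok]] for tok in sentence.split()])

vectors = embed("the oak tree")
print(vectors.shape)  # (3, 4): three tokens, each a 4-dimensional vector
```

In a trained model the embedding table is learned rather than random, so tokens used in similar contexts end up with nearby vectors.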

A multimodal AI model typically relies on several unimodal ones. As Henry Ajder, founder of AI consultancy Latent Space, puts it, this means “almost stringing together” the various contributing models. Doing so requires techniques that align the elements of each unimodal model, in a process called fusion. For example, the word “tree,” an image of an oak tree, and audio in the form of rustling leaves might be fused in this way. This allows the model to create a multifaceted description of reality.
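One simple form of the fusion step described above can be sketched as projecting each modality’s embedding into a shared space and combining the results. Everything here is a placeholder assumption: the dimensions, the random projection matrices (which would be learned in practice), and the averaging strategy stand in for a real multimodal architecture.

```python
# Toy "late fusion" sketch: embeddings from text, image, and audio models
# are projected into one shared space and averaged into a single vector.
import numpy as np

rng = np.random.default_rng(42)
text_vec = rng.normal(size=8)    # e.g. embedding of the word "tree"
image_vec = rng.normal(size=16)  # e.g. embedding of an oak-tree photo
audio_vec = rng.normal(size=12)  # e.g. embedding of rustling leaves

SHARED_DIM = 6
# Projection matrices map each modality's dimensionality to the shared one.
proj_text = rng.normal(size=(8, SHARED_DIM))
proj_image = rng.normal(size=(16, SHARED_DIM))
proj_audio = rng.normal(size=(12, SHARED_DIM))

def fuse(pairs):
    """Project each (vector, projection) pair into the shared space, then average."""
    shared = [v @ P for v, P in pairs]
    return np.mean(shared, axis=0)

fused = fuse([(text_vec, proj_text), (image_vec, proj_image), (audio_vec, proj_audio)])
print(fused.shape)  # (6,): one vector combining all three modalities
```

Real systems use more sophisticated alignment (for example, training the projections so that matching text and images land near each other), but the principle of mapping different modalities into a common representation is the same.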

This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.

Biden Announces $3.3 Billion Microsoft AI Center at Trump’s Failed Foxconn Site

8 May 2024 at 16:27
The president’s visit to Wisconsin celebrated the investment by Microsoft in a center to be built on the site of a failed Foxconn project negotiated by his predecessor.

© Tom Brenner for The New York Times

President Biden at the Intel campus in Chandler, Ariz., in March. His “Investing in America” agenda has focused on bringing billions of private-sector dollars into manufacturing and industries such as clean energy and artificial intelligence.