The FTC investigates OpenAI, while A.I. purveyors try to offer customers assurance

FTC Chair Lina Khan. Her agency has opened an investigation into OpenAI's ChatGPT.
Chip Somodevilla—Getty Images

Hello and welcome to Eye on A.I. At Fortune’s Brainstorm Tech conference in Deer Valley, Utah, last week, generative A.I. was threaded through every conversation I had. I’ve been reporting from the Bay Area this week, where it is pretty much the same story. Companies large and small are racing to put generative A.I. into practice.

But it is also clear from the conversations I’ve been having that a lot of companies are struggling to figure out exactly what the best use cases for the technology are. “People are throwing everything at the bazooka right now, hoping magic comes out,” Sean Scott, the chief product officer at PagerDuty, said during a morning breakout session at Brainstorm Tech on A.I. and data privacy. The bazooka he’s referring to is large language models (LLMs), and Scott’s point was that such firepower isn’t always necessary. Often, a smaller A.I. model, or some good old-fashioned rule-based coding, will do the job just as well, or maybe even better, at a much lower cost. “At the end of the day, it’s about what problem you are trying to solve and what is the best way to solve that problem,” he said.

It was also clear that many companies are still struggling to put governance controls around the use of generative A.I. within their own organizations. And it doesn’t help that the regulatory picture remains uncertain.

The biggest news last week in A.I. was probably the Federal Trade Commission’s decision to open an investigation into OpenAI. The agency is probing whether ChatGPT might violate consumer protection laws by sometimes generating false and potentially defamatory statements about individuals. It’s also looking into whether OpenAI broke laws when, due to a software bug it disclosed and fixed in March, it failed to secure users’ payment details and chat history data. OpenAI CEO Sam Altman tweeted that his company would “of course” comply with the investigation and that OpenAI was “confident we follow the law.” He also expressed “disappointment” that news of the investigation had immediately leaked to the press.

The FTC investigation is not unexpected. The agency has been signaling its intention to crack down on potentially deceptive practices among A.I. companies for months. FTC Chair Lina Khan is also eager to demonstrate that her agency and existing consumer protection laws can be a prime vehicle for A.I. regulation at a time when lawmakers are contemplating passing new regulations, and perhaps even creating a new federal entity to oversee A.I. But the FTC probe will have far-reaching ramifications. Right now, it is difficult for the creators of any LLM-based A.I. system to be 100% certain it won’t hallucinate and possibly say something reputationally damaging about a person. It is also difficult to ensure these very large LLMs don’t ingest any personal data during training and won’t leak it to a third party if fed the right prompt. If the FTC insists on iron-clad guarantees, it could have a chilling effect on the deployment of generative A.I. As a result, the companies selling generative A.I. systems to business customers are rushing to offer them assurance that they aren’t about to be stranded in a legal and ethical minefield.

As my colleague David Meyer wrote last week in this newsletter, one of the biggies in this regard is copyright. (David called it generative A.I.’s “Achilles’ heel.” I think some other dangers, such as data privacy and hallucination, may be equally troubling, but the general idea is right.) A number of companies creating A.I. foundation models, including OpenAI and Stability AI, have been hit with copyright infringement lawsuits. Scott Belsky, Adobe’s chief strategy officer, recently told me in London that he found many enterprise customers were unwilling to use a generative A.I. model unless the creator of that model could vouch for it being “commercially safe,” which meant providing assurances that copyrighted material had not been used to train the A.I. Under the proposed EU A.I. Act, companies deploying foundation models will have to disclose whether any copyrighted material was used in their creation, making it easier for IP rights holders to pursue them legally if they have violated the law.

Adobe, which created a text-to-image generation system called Firefly, has offered to indemnify Firefly users against copyright infringement lawsuits. “If you get sued, we’ll pay your legal fees,” Belsky said. Adobe is able to do this because it trained Firefly on Adobe Stock images, and the company’s position is that the terms of that service allow it to use those images to train A.I.

But that indemnification may not help companies escape the ethical quandary entirely. Some creators who uploaded images to the service are angry about Adobe’s position and claim it was wrong of the tech giant to use their images without explicit consent and compensation. As Neil Turkowitz, who has emerged as a leading advocate for the rights of creators in the face of generative A.I., told me a few months ago, beyond what’s legal, the question is whether we want to encourage a system built without explicit consent at its core. And while Adobe has promised a tag that creators will be able to apply to their work to prevent it from being used for A.I. training, as well as a system to compensate creators for the use of their data, the specifics have not yet been announced.

Microsoft, meanwhile, has announced a raft of measures designed to help its cloud customers get comfortable using its generative A.I. offerings. These include sharing its own expertise in setting up responsible A.I. frameworks and governance procedures, and giving customers access to the same responsible A.I. training curriculum that Microsoft uses for its own employees. The company also plans to attest to how it has implemented the National Institute of Standards and Technology’s A.I. risk management framework, which may help customers with government contracts. It has said it will offer customers a “dedicated team of A.I. legal and regulatory experts in regions around the world” to help support their own A.I. implementations. And Microsoft is partnering with global consulting firms PwC and EY to help customers build responsible A.I. programs.

It’s enough to make you wonder, somewhat cynically, whether the current questions swirling around generative A.I.’s commercial safety are actually a bug—or a feature. Maybe all this uncertainty and angst isn’t bad for business after all—if it helps you upsell anxious customers on premium consulting and hand-holding services. At the very least, you can say that Microsoft knows how to turn lemons into lemonade.

With that, here’s the rest of this week’s A.I. news.

Jeremy Kahn
@jeremyakahn
jeremy.kahn@fortune.com

A.I. IN THE NEWS

China unveils new A.I. regulations. China has revised its rules for generative A.I., demonstrating a more flexible and tolerant approach toward the technology than earlier drafts suggested, Bloomberg reports. The rules take effect on Aug. 15 and will be supervised by seven state agencies, including the Cyberspace Administration of China. They maintain a focus on security, requiring platform providers to register their services with the government and carry out a security review, but are more encouraging of A.I. innovation than earlier drafts, analysts told the news agency. Changes from the initial draft include the removal of stringent penalties for violations. Overseas companies building A.I. for the Chinese market must comply with the rules, but Chinese companies creating software only for overseas customers do not.

Common Sense Media launches new rating system for A.I. chatbots. The nonprofit group, which advocates for safer technology for children, announced that it is creating a rating system for chatbots and other A.I.-powered software. The ratings will assess these systems “on a number of dimensions including responsible A.I. practices and suitability for children,” the organization said in a press release. It said the ratings were being developed with input from leading A.I. experts. Jim Steyer, Common Sense Media’s CEO, said it was critical not to repeat with A.I. the mistakes made with social media, “which will have even greater consequences for kids and society.”

EU faces headwinds in effort to lobby Asian nations to follow its A.I. regulation template. The European Union has launched a concerted diplomatic effort to persuade Asian nations to adopt its approach to A.I. regulation, Reuters reports. Among the provisions the EU has been lobbying for are rules that would force tech firms to disclose the use of any copyrighted material in training A.I. software, and mandate that all A.I.-generated content be labeled. The EU has, according to Reuters, met with officials from India, Japan, South Korea, Singapore, and the Philippines. But many of these nations, the news agency says, prefer a more lenient or “wait and see” approach to A.I. regulation.

Anthropic plans a new corporate structure. The San Francisco-based A.I. startup behind the chatbot Claude plans to unveil a new corporate structure that is meant to shield its A.I. safety research from being unduly influenced by commercial incentives, according to a story in Vox. Already a public benefit corporation, Anthropic plans to set up a Long-Term Benefit Trust, which will hold a special class of non-transferable stock, known as “Class T,” that provides no financial return. The Trust will progressively gain the right to elect and remove three of the company’s five corporate directors, effectively giving it majority control over the board. Initially, the trustees were selected by Anthropic’s board, but in the future, they will choose their own successors without interference from Anthropic executives.

Microsoft plans to charge $30 per month for generative A.I. Office features. The software giant said that business customers will have to pay that hefty premium for each employee who gets access to the new generative A.I. features it is incorporating into its Office 365 business productivity tools, the Financial Times reports. The paper said the additional $30 per user would represent a 53% to 83% price increase for most enterprise customers of Microsoft’s software. Analysts said that with many companies capping IT budgets in the face of economic uncertainty, the high price of Microsoft’s generative A.I. offerings would slow the rollout of the new features, with only those teams that need to generate a lot of content, such as marketing teams, likely to be able to justify the extra spending.
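
A quick back-of-the-envelope check (my arithmetic, not the FT’s or Microsoft’s) shows what those percentages imply about what customers are paying today:

    # If a flat $30-per-user add-on amounts to a 53% to 83% price increase,
    # the implied current price is roughly $36 to $57 per user per month.
    addon = 30.0
    for pct_increase in (0.53, 0.83):
        base = addon / pct_increase
        print(f"{pct_increase:.0%} increase implies a current price of about ${base:.0f} per user per month")
    # 53% increase implies a current price of about $57 per user per month
    # 83% increase implies a current price of about $36 per user per month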

Microsoft will help commercialize Meta’s new family of language models. Microsoft also announced it was making the new Bing Chat service available to enterprise customers and that it will be the first commercial partner for Meta’s new LLaMA 2 family of large language models. Meta’s first generation of LLaMA models, initially made available only to select noncommercial partners, was leaked online and has proved extremely popular among open-source generative A.I. developers, including some working on commercial use cases. The new LLaMA 2 models have, according to Meta, been more thoroughly vetted and refined to provide safer outputs, and they can handle longer inputs than the first version of LLaMA. The partnership with Meta may indicate that Microsoft is hedging its bets in the debate between open-source and proprietary A.I. models. The company has made a big bet on proprietary models with OpenAI and is now planting a flag in the open-source world too.

EYE ON A.I. RESEARCH

How smart are today’s A.I. systems, really? In an opinion piece for Scientific American, Hugging Face’s chief A.I. ethics officer Meg Mitchell argues that efforts to assess the intelligence of today’s generative A.I. systems are flawed. First, many tests of large language models suffer from “data contamination,” where the A.I. system may have previously encountered the test questions in its training data. Second, LLMs are highly sensitive to specific prompts, answering a question correctly with one prompt and then answering incorrectly if the prompt is slightly altered in a way that doesn’t change its meaning or intent. This lack of robustness, Mitchell argues, makes it impossible to claim that A.I. systems have truly mastered various human-level skills. Third, she says that most A.I. benchmarks are flawed, often allowing “shortcut learning,” where the A.I. system picks up on subtle statistical associations to answer correctly without truly understanding the concepts. Mitchell argues that to accurately evaluate the intelligence of A.I. systems, more transparency in training methods, improved benchmarks and test methodologies, and better collaboration between A.I. researchers and cognitive scientists are urgently needed.
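
To make the prompt-sensitivity point concrete, here is a toy sketch of the kind of robustness check Mitchell is calling for (my illustration, not hers): ask the same question several ways and see whether the answers agree. The deliberately brittle “model” below only answers correctly when the prompt matches a phrasing it has memorized.

    # A deliberately brittle stand-in for an LLM: it answers correctly only
    # when the prompt exactly matches a phrasing it has "memorized."
    def toy_model(prompt: str) -> str:
        memorized = "What is the capital of Australia?"
        return "Canberra" if prompt == memorized else "Sydney"

    paraphrases = [
        "What is the capital of Australia?",
        "Which city is Australia's capital?",
        "Name the capital city of Australia.",
    ]

    answers = [toy_model(p) for p in paraphrases]
    print(answers)                             # ['Canberra', 'Sydney', 'Sydney']
    print("robust:", len(set(answers)) == 1)   # robust: False

A model that had truly mastered the underlying fact would give the same correct answer across all three phrasings; a benchmark that tests only one phrasing can’t tell the difference.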

FORTUNE ON A.I.

The SEC chief sees A.I. creating ‘conflicts of interest’ and maybe the next great financial crisis—unless we tackle ‘herding’—by Will Daniel

Barry Diller calls A.I. ‘overhyped to death’ after striking Hollywood actors label it an ‘existential threat’—by Steve Mollman

Don’t bet on the A.I. boom leading to massive profits. Just look at the tech bubble of the 1990s, a top strategist warns—by Paolo Confino

Stigma of dating a chatbot will fade, Replika CEO predicts—by Jeremy Kahn

BRAINFOOD

A new Turing Test? A lot of people have noted that today’s generative A.I. chatbots can readily pass the original Turing Test. That thought experiment, conceived by Alan Turing in 1950 and initially referred to as “The Imitation Game,” postulated that a machine could be said to have equaled human intelligence if a human evaluator, reading a text-only dialogue between another human and a machine on a range of general-interest subjects, could not tell the difference between what the human had typed and what the machine had generated. There have been many criticisms of the Turing Test over the years, including of the dubious ethics of a test that, at its core, equates intelligence with deception. But a bigger problem is that today’s A.I. chatbots can ace the Turing Test yet still can’t equal human capabilities in some critical areas, such as common sense reasoning, conceptual knowledge, logic, math (without using external tools), and long-range planning. So clearly, a new benchmark is needed.

Mustafa Suleyman, the former DeepMind cofounder who is now the founder and CEO of generative A.I. startup Inflection, has proposed an updated version of the Turing Test that stipulates a machine will have equaled human intelligence when it can successfully plan and execute a strategy to make $1 million on the internet. I’m not sure Suleyman isn’t setting the bar slightly too high—there are a lot of very smart humans who can’t make $1 million on the internet. But what Suleyman’s idea indicates is that the next big step for generative A.I. is agency—the ability to not just generate content, but generate plans and actions to achieve a goal. Inflection, whose first product is a chatbot called Pi, has said it is working on building this kind of A.I.-powered agent. A number of other well-funded startups are known to be trying too, and it is assumed that both Microsoft and Google are working on similar agentic A.I. assistants.

It is true that in order to make $1 million off the internet, an A.I. agent would need to be able to do some planning, would likely have to marry content-generation skills with mastery of a whole lot of other software tools, and would need a bit of business savvy. But is making $1 million off the internet really a good benchmark for machine smarts? I’ve got my doubts. Also, do we really want the gauge of intelligence to be linked to how much money something—or someone—can earn? We all know idiots who are fabulously wealthy and brilliant people who struggle financially. Should money be our yardstick? What do you think?

This is the online version of Eye on A.I., a free newsletter delivered to inboxes on Tuesdays. Sign up here.