Tech investor Prosus has learned some lessons about how to best use generative A.I.

Prosus CEO Bob Van Dijk. The company has a central A.I. team that has been experimenting with how to use new generative A.I. technologies to help the Amsterdam-based global investment firm's portfolio companies.
Chris Ratcliffe—Bloomberg/Getty Images

Generative A.I. is an enticing but scary proposition for businesses, with as many risks as potential benefits.

Prosus, an Amsterdam-based tech and media investor and operator that I met with recently, is a good example of how a business can deploy generative A.I. while being smart about minimizing the dangers (which range from “hallucinations” to potential copyright infringement).

The company, which is majority owned by South African media company Naspers and is publicly listed on Euronext with a $147 billion market cap, is not an A.I. novice: It created a center of A.I. expertise in 2018, recognizing that the technology could provide benefits across its global portfolio of companies, which ranges from food delivery startups to edtech firms.

At first, the team of roughly 15 machine learning researchers and engineers focused on developing tools for fraud detection and recommendation engines, Euro Beinat, who heads the team, tells me. Prosus has a systematic process for deciding which models it will deploy in production, built around a key business KPI for each model. The company constantly pits the model currently in production against challenger models, often built using different designs. If a challenger beats the incumbent on the KPI target, it replaces the production system and the cycle repeats, Beinat says, creating continuous improvement.
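Prosus hasn’t published its evaluation code, but the pattern Beinat describes is a familiar champion/challenger loop. Here is a minimal sketch, with an invented KPI (simple prediction accuracy) and stand-in models, of how such a promotion rule might work:

```python
# Hypothetical champion/challenger evaluation loop (illustrative only).
# The KPI, the stand-in model class, and the data are all invented here.

class ConstantModel:
    """Stand-in model for the sketch; always predicts the same label."""
    def __init__(self, label):
        self.label = label

    def predict(self, data):
        return [self.label for _ in data]

def business_kpi(predictions, outcomes):
    """Placeholder KPI: fraction of correct predictions."""
    correct = sum(p == o for p, o in zip(predictions, outcomes))
    return correct / len(outcomes)

def select_production_model(champion, challengers, eval_data, eval_outcomes):
    """Promote a challenger only if it beats the current champion on the KPI."""
    best_model = champion
    best_score = business_kpi(champion.predict(eval_data), eval_outcomes)
    for challenger in challengers:
        score = business_kpi(challenger.predict(eval_data), eval_outcomes)
        if score > best_score:
            best_model, best_score = challenger, score
    return best_model, best_score

# Toy usage: the challenger wins on this (invented) evaluation set.
champion = ConstantModel("fraud")
challenger = ConstantModel("ok")
winner, score = select_production_model(
    champion, [challenger], ["tx1", "tx2", "tx3"], ["ok", "fraud", "ok"]
)
print(winner.label, score)
```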

When Google made its Transformer-based language model BERT—which is a small language model compared to the behemoths that power ChatGPT—available as open-source software in late 2018, Prosus’s A.I. team realized it had enough data across its portfolio companies to train several bespoke versions. It created a “Food BERT” that helped categorize tens of millions of menu items—were they vegetarian? Indian or Japanese? Spicy or sweet?—from all of the restaurants served by Prosus’s various food delivery startups. The magic of Food BERT is that it could arrive at accurate classifications without having to rely on keywords, and it could do so seamlessly across multiple languages, Paul van der Boor, Prosus’s senior director of data science, says. And once Food BERT created this “food knowledge graph,” Prosus could use it to power a recommendation engine that would suggest restaurants—or additional menu items—to customers based on what kind of food they were in the mood for.
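Prosus hasn’t released Food BERT itself, but a classifier in that spirit could be run with the Hugging Face transformers library. The checkpoint name below is a hypothetical stand-in for a multilingual BERT model fine-tuned on labeled menu items; it does not exist publicly:

```python
# Sketch of BERT-style menu-item classification.
# "my-org/food-bert-menu-classifier" is a hypothetical fine-tuned checkpoint,
# standing in for a multilingual BERT model trained on labeled menu items.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="my-org/food-bert-menu-classifier",  # hypothetical model name
    top_k=None,  # return scores for every label, not just the best one
)

items = [
    "Paneer tikka masala with basmati rice",   # English
    "Sopa de tortilla con aguacate",           # Spanish
]
for item, scores in zip(items, classifier(items)):
    print(item, "->", scores[:3])  # top labels, e.g. cuisine or dietary tags
```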

Prosus also created its own “Fin BERT,” which could be used to do sentiment analysis of financial texts. The Prosus communications team experimented with the tool to analyze past earnings results materials and gauge their overall sentiment.
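A FinBERT checkpoint is published on the Hugging Face Hub under the ProsusAI organization; assuming that public model (rather than some internal variant) reflects what the team built, a minimal financial sentiment-analysis example looks like this:

```python
# Sketch of FinBERT-style sentiment analysis on earnings-release language.
# Assumes the publicly available ProsusAI/finbert checkpoint on the Hugging
# Face Hub; whether it matches the tool the comms team used is an assumption.
from transformers import pipeline

finbert = pipeline("sentiment-analysis", model="ProsusAI/finbert")

sentences = [
    "Revenue grew 24% year over year, ahead of guidance.",
    "Operating losses widened as customer acquisition costs rose.",
]
for sentence, result in zip(sentences, finbert(sentences)):
    print(f"{result['label']:>8}  {result['score']:.2f}  {sentence}")
```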

Those days already seem like ancient history compared to what is possible with today’s ultra-large language models and generative A.I., van der Boor says. First, Prosus has used large language models (LLMs) in ways that are somewhat similar to the use cases it found for the smaller language models, such as BERT. It used LLMs to classify content and types of students for its edtech companies and to categorize items for sale on its various digital classified marketplaces, where the descriptions are written by users. Van der Boor says these use cases are relatively safe “because we’re trying to understand what our users are putting on the marketplaces, or what are they learning, as opposed to immediately giving answers to questions.” In other words, here LLMs are being used to analyze existing unstructured data.
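The article doesn’t say which models or prompts Prosus uses for this kind of classification, but the general pattern, asking an LLM to map a user-written listing onto a fixed set of categories, can be sketched with the OpenAI Python client. The category set, prompt, and model choice here are illustrative assumptions:

```python
# Sketch of using an LLM to categorize a user-written marketplace listing.
# The categories, prompt, and model name are illustrative assumptions,
# not Prosus's actual setup. Requires OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()
CATEGORIES = ["electronics", "furniture", "vehicles", "clothing", "other"]

def categorize_listing(description: str) -> str:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # illustrative model choice
        temperature=0,
        messages=[
            {"role": "system",
             "content": f"Classify the listing into exactly one of: {', '.join(CATEGORIES)}. "
                        "Reply with the category name only."},
            {"role": "user", "content": description},
        ],
    )
    return response.choices[0].message.content.strip().lower()

print(categorize_listing("Barely used standing desk, oak top, adjustable legs."))
```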

To explore LLMs’ more generative side, Prosus created a digital assistant chatbot called Plus One that runs in its group-wide Slack channels, initially built on OpenAI’s GPT-3 model. “We called it Plus One because it’s like having an extra team member you can tap on the shoulder and say, ‘Hey, tell me something more about this,’” Beinat says.

Prosus has experimented with several different LLMs, as well as text-to-image generators, from different vendors and open-source hubs. In some cases, it might use one particular LLM to answer one particular kind of question, because the company has found that model performs better for that sort of query. Plus One has thousands of users across the Prosus group of companies. It can be used to find and analyze documents in internal databases. Or it can transcribe meetings and then answer questions about them, generating key takeaways and action points, as well as extracting key numbers mentioned in the meeting.
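Prosus hasn’t disclosed its routing rules, but the idea of sending each kind of query to whichever model has performed best for it can be expressed as a simple lookup. Everything below is an invented illustration:

```python
# Toy sketch of routing a query type to a preferred model.
# The query types and model names are hypothetical; the point is only that
# different kinds of questions can be sent to different models.
ROUTES = {
    "document_search": "model-a",   # hypothetical: best at retrieval-style questions
    "meeting_summary": "model-b",   # hypothetical: best at long-context summarization
    "general_chat": "model-c",      # hypothetical default
}

def pick_model(query_type: str) -> str:
    return ROUTES.get(query_type, ROUTES["general_chat"])

print(pick_model("meeting_summary"))  # -> model-b
```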

To gather feedback on how the bot is performing, Prosus asks employees to select an emoji for each answer Plus One generates: a thumbs up for a good answer, a heart for a great answer, a thumbs down for a bad answer, and a Pinocchio emoji if Plus One has returned inaccurate or invented information. At first, Pinocchios happened about 15% of the time, van der Boor says. But Prosus began to learn which questions were most likely to produce Pinocchio answers and took steps to steer Plus One toward more accurate responses: using different meta-prompts, prefiltering what data the model used to find information, and post-filtering its answers to weed out obvious problems. Through these techniques, Prosus has cut Plus One’s Pinocchios to just under 5% of the answers it gives. “It’s still not zero, right? That’s a known problem. But [the methods for reducing inaccuracies] are a kind of learning that we can then transfer back to the company,” he says.
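The feedback format isn’t described beyond the emoji themselves, but turning that signal into a hallucination metric is straightforward. A toy sketch, with an invented log format:

```python
# Sketch of turning per-answer emoji feedback into a "Pinocchio rate".
# The feedback log format and example entries are invented for illustration.
from collections import Counter

feedback_log = [
    {"question": "What is our parental leave policy?", "emoji": "thumbs_up"},
    {"question": "Summarize the Q3 board deck.",       "emoji": "pinocchio"},
    {"question": "Who owns the food KPI dashboard?",   "emoji": "heart"},
]

counts = Counter(entry["emoji"] for entry in feedback_log)
pinocchio_rate = counts["pinocchio"] / len(feedback_log)
print(f"Pinocchio rate: {pinocchio_rate:.0%}")  # share of answers flagged as invented
```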

One key was building a pipeline in which the language model first categorizes the question prior to trying to answer it, Beinat says. If the question refers to an internal process or document, the prefiltering pipeline then prompts the model to use a vector database or some other search function to find relevant company data, and only then summarize the information. This tends to produce much more accurate results than simply asking the LLM to answer the question immediately based on its pre-training. And, as an added precaution, Prosus doesn’t allow Plus One to actually produce any analysis for users. It can recommend documents to a user that the person can use themselves to find specific information or produce certain analytics, but it won’t do any calculations on its own. The reason, van der Boor says, is that Prosus is concerned about inaccuracy. “We’ve really looked at this because it is a common use case that people say they want to see,” he says. “But we’ve learned the hard way in a sense, because we are trying to do this at scale, that this is just not there yet.”
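Here is a heavily simplified sketch of that categorize-first, retrieve-then-summarize flow. All the helpers and documents are invented for illustration; a production system would use an embedding model and a vector database rather than the toy word-overlap retrieval shown here:

```python
# Sketch of the categorize-then-retrieve pipeline described above.
# Everything here is illustrative: a real system would call an LLM to
# categorize the question, search a vector database, and then have the
# LLM summarize only the retrieved documents.

DOCUMENTS = {
    "travel_policy.md": "Employees book travel through the internal portal ...",
    "leave_policy.md":  "Parental leave is available to all employees ...",
}

def categorize(question: str) -> str:
    """Stand-in for the LLM call that labels the question before answering."""
    internal_markers = ("policy", "process", "internal", "document")
    return "internal" if any(w in question.lower() for w in internal_markers) else "general"

def retrieve(question: str) -> str:
    """Toy retrieval: pick the document with the most overlapping words."""
    q_words = set(question.lower().split())
    return max(DOCUMENTS, key=lambda d: len(q_words & set(DOCUMENTS[d].lower().split())))

def answer(question: str) -> str:
    if categorize(question) == "internal":
        doc = retrieve(question)
        # In the real pipeline, an LLM would now summarize only this document.
        return f"Based on {doc}: {DOCUMENTS[doc][:60]}..."
    return "Answer directly from the model's general knowledge."

print(answer("What is the parental leave policy?"))
```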

When it comes to using generative A.I. for images, potential copyright infringement “remains an unresolved concern,” Beinat says. But he points out that in most cases where the company has experimented with A.I.-created images (for instance, generating pictures of menu items for restaurants that did not have photos on their online menus), the images needed were highly generic. He also says that having so many Plus One users across the entire Prosus group of companies means that potential legal or ethical pitfalls, such as copyright issues, get spotted faster. “Plus One is a mechanism for collective learning, but also a mechanism for, let’s say, collective checking,” Beinat says.

There are lessons here for any company. Perhaps the biggest one is that companies should start playing around with the technology today, but be cautious about putting it into customer-facing applications or relying on it for any mission-critical applications.

With that, here’s the rest of this week’s A.I. news.

Jeremy Kahn
@jeremyakahn
jeremy.kahn@ampressman

Update, May 4: This story has been updated to clarify how Prosus used the “Fin BERT” model it built.

A.I. IN THE NEWS

‘A.I. Godfather’ Hinton quits Google to speak out about A.I. risks. Geoffrey Hinton, a Turing Award-winning computer scientist and one of the pioneers of deep learning, resigned from his highly paid position as a researcher at Google after a decade at the company, telling the New York Times that he wanted to be freer to speak out about the growing dangers he now thinks A.I. poses. Hinton told the paper that it was hard to imagine how it will be possible to prevent “bad actors from using [A.I.] to do bad things.” Hinton hopes for global regulation and collaboration among leading scientists to control the technology's development and use.

IBM CEO says the company will pause hiring for roles that could be automated with A.I. Big Blue’s CEO Arvind Krishna told Bloomberg that the company plans to pause hiring for roles that could be replaced by A.I. in the coming years, particularly in back-office functions such as human resources. IBM employs about 26,000 people in such non-customer-facing roles and Krishna said he thought as many as a third of those roles could be automated within five years. That would mean some 7,800 jobs lost—although an IBM spokesperson said any staff reductions would come in part through attrition, with those retiring or leaving the company not being replaced with new hires.

Amnesty International’s Norwegian branch criticized for use of deepfake images in campaign. Amnesty Norway, the Norwegian branch of human rights group Amnesty International, has been criticized by other human rights organizations and A.I. ethics experts for using an A.I.-generated image in a campaign designed to draw attention to police brutality in Colombia on the two-year anniversary of Colombia’s National Strike. Other human rights advocates pointed out that they go to great lengths to document human and civil rights abuses—and that many actual photos of police using excessive force against the protestors in Colombia exist that Amnesty Norway could easily have used. The use of deepfakes undermines the credibility of their work, making it easy for governments to dismiss evidence of wrongdoing as fraudulent, the critics said. And A.I. ethicists said any such use of deepfakes blurs the lines between valid information and misinformation and erodes public trust. You can read more about the controversy in this Vice News story.

EYE ON A.I. RESEARCH

How a new machine learning method involving counterfactuals could improve A.I.-based predictions and recommendations. Researchers at Spotify have developed a machine-learning model that captures the complex math behind counterfactual analysis, a technique used to identify causes of past events and improve the prediction of future ones, according to a story in MIT Tech Review. The Spotify researchers based their model on a theoretical framework for counterfactuals called twin networks, invented in the 1990s by computer scientists Andrew Balke and Judea Pearl, the publication said. Twin networks treat counterfactuals as pairs of probabilistic models representing the actual world and a fictional one, with the model of the actual world constraining what the model of the fictional world can do; the counterfactual query is then run through the fictional model. The Spotify team showed that its twin network could determine causal relationships in a number of real-world scenarios, including credit approval, an international clinical trial for a stroke medication, and an analysis of the health benefits of improvements made to the water supply system in Kenya. In each case, the twin network produced insightful causal relationships, which can then be used to produce better recommendations for future actions. Tech companies such as Meta, Amazon, and LinkedIn are also developing machine-learning models for causal reasoning.
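The twin-network idea can be illustrated with a toy structural model: two copies of the same equations share their exogenous “noise” variables, with one copy constrained by what was actually observed and the other answering the what-if query. The example below is invented for illustration and is not the Spotify researchers’ model:

```python
# Toy illustration of the twin-network idea: two copies of the same structural
# model share exogenous noise; one is constrained by the observed (factual)
# world, the other answers the "what if" (counterfactual) query.
# Invented example, not the Spotify researchers' model.

def outcome(treated: bool, recovers_anyway: bool, responds_to_treatment: bool) -> bool:
    """Structural equation: the patient recovers if they would recover anyway,
    or if they were treated and respond to treatment."""
    return recovers_anyway or (treated and responds_to_treatment)

# Factual observation: the patient was treated and recovered.
observed_treated, observed_recovered = True, True

# Abduction: keep only noise settings consistent with the factual branch.
consistent_noise = [
    (recovers_anyway, responds)
    for recovers_anyway in (False, True)
    for responds in (False, True)
    if outcome(observed_treated, recovers_anyway, responds) == observed_recovered
]

# Counterfactual branch: reuse the same noise, but intervene to set treated = False.
counterfactual_outcomes = [outcome(False, *noise) for noise in consistent_noise]
print("Would the patient have recovered without treatment?")
print(f"{sum(counterfactual_outcomes)} of {len(counterfactual_outcomes)} consistent worlds say yes")
```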

FORTUNE ON A.I.

Samsung threatens to fire employees for using A.I. chatbots like ChatGPT—by Nicholas Gordon

A Stanford professor who specializes in A.I. says it’s redefining ‘authenticity’ and it could be as revolutionary as the calculator—by Victor R. Lee and the Conversation

Microsoft’s chief scientific officer, one of the world’s leading A.I. experts, doesn’t think a 6 month pause will fix A.I.—but has some ideas of how to safeguard it—by Jessica Mathews

A.I. needs ‘a new space in our world,’ says fired Google engineer Blake Lemoine—by Steve Mollman

BRAINFOOD

People have been warning generative A.I. will turbocharge misinformation. It’s already happening. NewsGuard, a fact-checking organization that rates websites on the reliability of the information they provide, has found dozens of websites designed to mimic legitimate news publications that are actually churning out misinformation generated by ChatGPT and other A.I.-powered chatbots, Bloomberg reported. The 49 websites, which were reviewed by Bloomberg, have generic-sounding names and range from sites peddling celebrity gossip to those that say they offer breaking news. None disclosed that they are populated using A.I. chatbots such as OpenAI’s ChatGPT or Google’s Bard. NewsGuard found that, despite Google’s use of A.I. systems to try to detect sites peddling low-quality, fake news, many of these misinformation farms made money through programmatic advertising or link sharing.

This story is bad news for Google, whose search algorithm will have to increasingly contend with the problem of sites populated with misinformation, and whose new A.I.-powered chat interfaces will have to worry about returning inaccurate answers due to information poisoning from such sites. (It’s also a problem for Microsoft’s Bing and all the other A.I.-enabled search tools.)

But more importantly, NewsGuard’s findings are very bad news for all of us. Some A.I. experts, such as Gary Marcus, have been vocal in saying that A.I.-generated misinformation now poses a grave and present danger to democracy. Geoffrey Hinton, in leaving Google (see the news section above), has voiced similar concerns. I have tended to be relatively sanguine that democracy will survive the tidal wave of misinformation that seems to be hurtling toward it, but I am not super confident in that prediction. Either way, it looks like we are about to find out.

This is the online version of Eye on A.I., a free newsletter delivered to inboxes on Tuesdays. Sign up here.