Even when algorithms outperform humans, people often reject them

Jeff Cockrell | Oct 27, 2020

Data science has created more and better applications for algorithms, particularly those that use machine learning, to help predict outcomes of interest to humans. But has the progress of algorithmic decision aids outpaced people’s willingness to trust them? Whether humans will put their faith in self-driving cars, ML-powered employment screening, and countless other technologies depends not only on the performance of algorithms but also on how would-be users perceive that performance.

In 2015, Chicago Booth’s Berkeley J. Dietvorst, with University of Pennsylvania’s Joseph P. Simmons and Cade Massey, coined the phrase “algorithm aversion” to describe people’s tendency to distrust the predictions of algorithms, even after seeing them outperform humans. Now, further research from Dietvorst and Booth PhD student Soaham Bharti suggests that people may not be averse to algorithms per se but rather are willing to take risks in pursuit of exceptional accuracy: they prefer the relatively high variance in how well human forecasters perform, especially in uncertain contexts. If there’s a higher likelihood of getting very good forecasts, they’ll put up with a higher likelihood of very bad ones. 

In a series of experiments, Dietvorst and Bharti find evidence that people have “diminishing sensitivity to forecasting error”—in general, the first unit of error a forecaster (whether human or machine) realizes feels more significant than the fifth, which feels more significant than the 15th. The perceived difference between a perfectly accurate forecast and one that’s off by two is greater than the perceived difference between a forecast that’s off by 10 and one that’s off by 12.
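The concave pattern this describes can be sketched with a hypothetical perceived-cost function; the square root below is an illustrative assumption, not a functional form from the paper:

```python
import math

def perceived_error(error):
    # Hypothetical concave sensitivity: each additional unit of error
    # "hurts" less than the one before it (illustrative assumption).
    return math.sqrt(error)

# Gap between a perfect forecast and one that's off by 2:
gap_near_zero = perceived_error(2) - perceived_error(0)   # ~1.41
# Gap between forecasts that are off by 10 and off by 12:
gap_far_out = perceived_error(12) - perceived_error(10)   # ~0.30

# The same two-unit difference in actual error feels much larger
# near perfection than far from it.
assert gap_near_zero > gap_far_out
```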

In one experiment, participants choosing between two forecast models were willing to pay more to upgrade to a model that was more likely to be perfect than they were for other models offering the same average performance—and the same expected earnings—but lower odds of perfection. In other studies, when presented with two models offering the same average performance, participants preferred the one that offered the best chance of a perfect or near-perfect forecast, even though it was also more likely to produce a comparatively inaccurate forecast. People were willing to risk a prediction that was well off the mark for the sake of a shot at a near-perfect prediction.

This premium on perfection or near perfection is what one would expect from diminishing sensitivity to error, Dietvorst says, and can help explain why people often prefer human forecasts to algorithmic ones. Algorithms generally offer consistency and greater average performance, whereas humans are “a little worse on average, but they could do anything,” he says. “They could knock it out of the park. They could have a really bad forecast. You don’t quite know what’s coming with a human.”
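A small simulation makes the logic concrete. The error distributions below are invented for illustration: both forecasters have the same average absolute error, but under a concave (assumed) sensitivity function, the erratic “human” feels better than the consistent “algorithm”:

```python
# Hypothetical error profiles with identical average absolute error (5):
# the algorithm is perfectly consistent; the human is sometimes perfect,
# sometimes far off (illustrative numbers, not the paper's data).
algorithm_errors = [5] * 10000
human_errors = [0, 10] * 5000

def perceived(error):
    # Assumed concave sensitivity to error, as in diminishing sensitivity.
    return error ** 0.5

mean_perceived_algo = sum(map(perceived, algorithm_errors)) / len(algorithm_errors)
mean_perceived_human = sum(map(perceived, human_errors)) / len(human_errors)

# Same average error, but the high-variance forecaster "feels" less costly:
assert mean_perceived_human < mean_perceived_algo
```

The occasional perfect forecast is weighted heavily, while the occasional large miss is discounted, so variance becomes attractive.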

Dietvorst and Bharti also explored how the level of uncertainty in a decision-making setting interacts with diminishing sensitivity to error. The researchers assigned participants to forecasting tasks with varying levels of uncertainty and had them choose between their own predictions and those made by an algorithm. Even though the algorithm always followed the best possible forecasting strategy, its odds of a perfect forecast went down as uncertainty went up. This would be the case for any forecaster, algorithmic or otherwise—and yet participants became more likely to choose their own predictions as the level of uncertainty increased. 
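That inverse relationship between uncertainty and the odds of perfection can be illustrated with a simple probabilistic setting (the binomial outcome here is an assumption for illustration, not the task participants faced):

```python
from math import comb

def prob_perfect(n, p=0.5):
    # An optimal point forecast of a Binomial(n, p) outcome is its mode;
    # this returns the probability that even that forecast is exactly right.
    k = round(n * p)
    return comb(n, k) * p**k * (1 - p)**(n - k)

# As the outcome's spread grows (more uncertainty), the chance of a
# perfect forecast shrinks -- even for the best possible forecaster.
assert prob_perfect(10) > prob_perfect(40) > prob_perfect(160)
```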

This held even when participants recognized that the algorithm performed better on average. A belief that their own predictions were more likely to be perfect—even if less accurate on average—was more important, the researchers find. 

The findings have implications for business decision makers and the general public, who often face what the researchers call “irreducible uncertainty”—situations where complete certainty isn’t possible until the outcome is known. For example, there’s no way for managers to know next quarter’s product demand until the quarter ends. If they bypass a superior algorithmic forecast and trust a human instead, hoping for perfection, they’ll end up with a lower average forecast accuracy in the long run, which could lead to unnecessary inventory costs.

Similarly, beliefs that human drivers are more likely than an algorithm to make a perfect decision could cause us to delay adopting self-driving technology, even if such technology would dramatically improve traffic safety on average.

People’s diminishing sensitivity to error and preference for variance could penalize some algorithms less than others. Although he hasn’t studied whether or how these preferences vary across different types of algorithms, Dietvorst says the fact that machine-learning algorithms are able to improve their performance over time means that people may be more likely to believe they’re capable of perfect or near-perfect forecasts in the future. 

When comparing an ML algorithm to one that’s constant, “you might believe that a machine-learning algorithm gives you a relatively higher chance of being near perfect,” Dietvorst says, “even if the past performance that you’ve seen from the two is identical.”