The next decade of behavioural science: a call for intellectual diversity

Behavioral Scientist put out the call to share hopes, fears, predictions and warnings about the next decade of behavioral science. Here’s my contribution:

As behavioral scientists, we’re not exactly a diverse bunch. We’re university-educated. We live in major cities. We work in academia, tech, consulting, banking and finance. And dare I say it, we’re rather liberal. Read the Twitter streams or other public outputs of the major behavioral science institutions, publications and personalities, and the topics of interest don’t stray too far from what a Democratic politician (substitute your own nation’s centre-left party) would discuss in a stump speech.

In that light, we need to think more broadly about both the questions we tackle and the answers we “like”. We need to ask what problems matter to the large swathes of the population that we don’t encounter in our day-to-day. We need to be self-critical, open to being wrong, and not cheerleaders for our own narrow conception of the world. We must find and listen to those who don’t share our points of view. We must question our orthodoxies.

In practice, that’s not easy. But it’s vital to our relevance and to our intellectual foundations.

I had a few stabs at the ~200 words. Here’s another attempt on a similar theme:

Through the replication crisis, some prominent concepts in behavioural science have been challenged. The priming literature is in ruins. The concept of willpower as a finite resource is scarcely alive. Experiments in areas from disfluency to scarcity have failed to replicate.

The shaking of the behavioural foundation is not over. More tenets of behavioural science are going to bite the dust. Many will be exposed through ongoing replication attempts. They are built on the same foundations as those that have already crumbled: publication bias, the garden of forking paths, among other things. New findings that continue to be built on those same foundations will also tumble down.

I suspect there will be dismay when some ideas crumble, as they align with core beliefs of the behavioural science community. Yet those beliefs will be at the core of the weakness. Ideas of a different alignment would have faced a more serious challenge. We accept too many ideas because we “like” them.

Thankfully, I am confident that infrastructure is being built that will allow us to challenge even cherished ideas. Let us just make sure that when the scientific foundation no longer exists, we are willing to let them go.

You can read contributions from the broader behavioural science community here.

Best books I read in 2019

Better late than never…

The best books I read in 2019 – generally released in other years – are below. Where I have reviewed, the link leads to that review (not many reviews this year).

  • Nick Chater, The Mind is Flat: A great book in which Chater argues that there are no ‘hidden depths’ to our minds.

  • Stephan Guyenet, The Hungry Brain: Outsmarting the Instincts that Make us Overeat: Excellent summary of modern nutrition research and how the body “regulates” its weight.

  • Jonathan Morduch and Rachel Schneider, The Financial Diaries: How American Families Cope in a World of Uncertainty: I find a lot of value reading about the world outside of my bubble. I learnt a lot from this book.

  • Paul Seabright, The Company of Strangers: An excellent exploration of the evolutionary foundations of cooperation. A staple of my evolutionary biology and economics reading list.

  • Lenore Skenazy, Free-Range Kids: How to Raise Safe, Self-Reliant Children (Without Going Nuts with Worry): A fun read of some wise advice.

  • M Mitchell Waldrop, The Dream Machine: J.C.R. Licklider and the Revolution That Made Computing Personal: A bit too much detail, but a worthwhile story about the origins of personal computing. Many of the concepts about human-machine interaction remain relevant today.

Below is the full list of books that I read in 2019 (with links where reviewed and starred if a re-read). The volume of my reading declined year-on-year again, with 61 books total (40 non-fiction, 21 fiction). Most of that decline came in the back half of the year when I spent a lot of time reading and researching some narrow academic topics. 45 of the below were read before June. I could add a lot of children’s books to the list (especially Enid Blyton), but I’ll leave those aside.

Non-Fiction

  • Dan Ariely and Jeff Kreisler, Dollars and Sense: Money Mishaps and How to Avoid Them
  • Christopher Chabris and Daniel Simons, The Invisible Gorilla
  • Nick Chater, The Mind is Flat
  • Mihaly Csikszentmihalyi, Flow
  • Nir Eyal, Hooked
  • Nir Eyal, Indistractable
  • Tim Ferriss, Four Hour Work Week
  • Tim Ferriss, Tribe of Mentors
  • Viktor Frankl, Man’s Search for Meaning
  • Jason Fried and David Heinemeier Hansson, It Doesn’t have to be Crazy at Work
  • Jason Fried and David Heinemeier Hansson, Rework
  • Jason Fried and David Heinemeier Hansson, Remote
  • Atul Gawande, Better
  • Stephan Guyenet, The Hungry Brain: Outsmarting the Instincts that Make us Overeat
  • Jonathan Haidt, The Righteous Mind*
  • Adam Kay, This is Going to Hurt: Secret Diaries of a Junior Doctor
  • Peter D. Kaufman (ed), Poor Charlie’s Almanack: The Wit and Wisdom of Charles T. Munger, Expanded Third Edition
  • Thomas Kuhn, The Structure of Scientific Revolutions
  • David Leiser and Yhonatan Shemesh, How We Misunderstand Economics and Why it Matters: The Psychology of Bias, Distortion and Conspiracy
  • Gerry Lopez, Surf is Where You Find It
  • Jonathan Morduch and Rachel Schneider, The Financial Diaries: How American Families Cope in a World of Uncertainty
  • Cal Newport, Digital Minimalism
  • Karl Popper, The Logic of Scientific Discovery
  • James Reason, Human Error
  • Ben Reiter, Astroball
  • Matthew Salganik, Bit by bit: Social Research in the Digital Age
  • Barry Schwartz, The Paradox of Choice
  • Paul Seabright, The Company of Strangers*
  • Byron Sharp, How Brands Grow
  • Peter Singer, A Darwinian Left
  • Lenore Skenazy, Free-Range Kids: How to Raise Safe, Self-Reliant Children (Without Going Nuts with Worry)
  • Eugene Soltes, Why They Do It: Inside the Mind of the White-Collar Criminal
  • Dilip Soman, The Last Mile: Creating Social and Economic Value from Behavioral Insights
  • Matthew Syed, Black Box Thinking: Marginal Gains and the Secrets of High Performance
  • Ed Thorp, Beat the Dealer
  • M Mitchell Waldrop, The Dream Machine: J.C.R. Licklider and the Revolution That Made Computing Personal
  • Mike Walsh, The Algorithmic Leader
  • Caroline Webb, How to Have a Good Day: A Revolutionary Handbook for Work and Life
  • Robert Wright, The Moral Animal
  • Scott Young, Ultralearning: Accelerate Your Career, Master Hard Skills and Outsmart the Competition

Fiction

  • Fyodor Dostoevsky, The Brothers Karamazov
  • F Scott Fitzgerald, The Beautiful and Damned
  • F Scott Fitzgerald, This Side of Paradise
  • Graham Greene, Our Man in Havana*
  • Robert Heinlein, Starship Troopers
  • Michael Houellebecq, Submission
  • Jack London, Call of the Wild
  • John Le Carre, Call for the Dead
  • John Le Carre, A Murder of Quality
  • John Le Carre, The Looking Glass War
  • John Le Carre, A Small Town in Germany
  • Chuck Palahniuk, Fight Club*
  • Edgar Allan Poe, The Tell-Tale Heart and Other Stories
  • J.K. Rowling, Harry Potter and the Philosopher’s Stone
  • J.K. Rowling, Harry Potter and the Chamber of Secrets
  • J.K. Rowling, Harry Potter and the Prisoner of Azkaban
  • J.K. Rowling, Harry Potter and the Goblet of Fire
  • J.K. Rowling, Harry Potter and the Order of the Phoenix
  • J.K. Rowling, Harry Potter and the Half-Blood Prince
  • J.K. Rowling, Harry Potter and the Deathly Hallows
  • Tim Winton, Breath

Previous lists: 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018

Ergodicity economics: a primer

In my previous posts on loss aversion (here, here and here), I foreshadowed a post on how “ergodicity economics” might shed some light on whether we need loss aversion to explain people’s choices under uncertainty. This was to be that post, but the background material that I drafted is long enough to be a stand-alone piece. I’ll turn to the application of ergodicity economics to loss aversion in a future post.

The below is largely drawn from presentations and papers by Ole Peters and friends, with my own evolutionary take at the end. For a deeper dive, see the lecture notes by Peters and Alexander Adamou, or a recent Perspective by Peters in Nature Physics.

The choice

Suppose you have $100 and are offered a gamble involving a series of coin flips. For each flip, heads will increase your wealth by 50%. Tails will decrease it by 40%. Flip 100 times.

The expected payoff

What will happen? For that first flip, you have a 50% chance of a $50 gain, and a 50% chance of a $40 loss. Your expected gain (each outcome weighted by its probability, 0.5*$50 + 0.5*-$40) is $5 or 5% of your wealth. The absolute size of the stake for future flips will depend on past flips, but for every flip you have the same expected gain of 5% of your wealth.
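
If you want to check that arithmetic, here is a quick sketch in R (mine, separate from the simulation code at the end of this post):

p <- 0.5
wealth <- 100
expectedGain <- p*0.5*wealth + (1-p)*-0.4*wealth #0.5*$50 + 0.5*-$40
expectedGain #$5
expectedGain/wealth #5% of current wealth, whatever that wealth happens to be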

Should you take the bet?

I simulated 10,000 people who each started with $100 and flipped the coin 100 times. The line in Figure 1 represents the mean wealth of the 10,000 people. It looks good, increasing roughly in accordance with the expected gain, despite some volatility, and finishing at a mean wealth of over $16,000.

Figure 1: Average wealth of population

Yet people regularly decline gambles of this nature. Are they making a mistake?

One explanation for declining this gamble is risk aversion. A risk-averse person values a gamble at less than its expected value; they would prefer to receive that sum with certainty.

Risk aversion can be represented through the concept of utility, where each level of wealth is assigned a subjective value (utility) for the gambler. If people maximise expected utility rather than the expected dollar value of a gamble, it is possible that a person would reject the bet.

For example, one common utility function used to represent a risk-averse individual is the logarithm of wealth. If we apply this log utility function to the gamble above, the gambler will reject the offer of the coin flip. [The maths here is simply that the expected utility of the gamble is 0.5*ln(150) + 0.5*ln(60)=4.55, which is less than the utility of the sure $100, ln(100)=4.61.]
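
Again, a quick sketch in R to check those numbers (not part of the simulation code below):

eulGamble <- 0.5*log(150) + 0.5*log(60) #expected log utility of the gamble, ~4.55
uCertain <- log(100) #log utility of the sure $100, ~4.61
eulGamble < uCertain #TRUE: a log utility maximiser rejects the flip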

The time average growth rate

For a different perspective, below is the plot for the first 20 of these 10,000 people. Interestingly, only two people do better than break even (represented by the black line at $100). The richest has less than $1,000 at period 100.

Figure 2: Path of first 20 people

What is happening here? The first plot shows that the average wealth across all 10,000 people is increasing. When we look at the first 20 individuals, their wealth generally declines. Even those that make money make less than the gain in aggregate wealth would suggest.

To show this more starkly, here is a plot of the first 20 people on a log scale, together with the average wealth for the full population. They are all below average in final wealth.

Figure 3: Plot of first 20 people against average wealth (log scale)

If we examine the full population of 10,000, we see an interesting pattern. The mean wealth is over $16,000, but the median wealth after 100 periods is 51 cents, a loss of over 99% of the initial wealth. 54% of the population end up with less than $1. 86% finish with less than the initial wealth of $100. Yet 171 people end up with more than $10,000. The wealthiest person finishes with $117 million, which is over 70% of the total wealth of the population.

For most people, the series of bets is a disaster. It looks good only on average, propped up by the extreme good luck and massive wealth of a few people. The expected payoff does not match the experience of most people.

Four possible outcomes

One way to think about what is happening is to consider the four possible outcomes over the first two periods.

The first person gets two heads. They finish with $225. The second and third person get a heads and a tails (in different orders), and finish with $90. The fourth person ends up with $36.

The average across the four is $110.25, reflecting the compound 5% growth. That’s our positive picture. But three of the four lost money. As the number of flips increases, the proportion who lose money increases, with a rarer but more extraordinarily rich cohort propping up the average.
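
You can enumerate those four equally likely two-flip paths directly. A quick sketch, separate from the main simulation code:

paths <- expand.grid(flip1=c(1.5, 0.6), flip2=c(1.5, 0.6)) #the four equally likely two-flip paths
paths$wealth <- 100*paths$flip1*paths$flip2 #$225, $90, $90, $36
mean(paths$wealth) #$110.25, i.e. 100*1.05^2
sum(paths$wealth < 100) #3 of the 4 paths lose money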

Almost surely

Over the very long term, an individual will tend to get around half heads and half tails. As the number of flips goes to infinity, the proportions of heads and tails are “almost surely” equal.

This means that each person will tend to get a 50% increase half the time (multiplying their wealth by 1.5), and a 40% decrease half the time (multiplying their wealth by 0.6). A bit of maths and the time average growth in wealth for an individual is (1.5*0.6)^0.5 ≈ 0.95, or approximately a 5% decline in wealth each period. Every individual’s wealth will tend to decay at that rate.

To get an intuition for this, a long run with equal numbers of heads and tails is equivalent to flipping one head and one tail every two periods. Suppose that is exactly what you did – flipped a head and then a tail. Your wealth would increase to $150 in the first round ($100*1.5), and then decline to $90 in the second ($150*0.6). You get the same result if you change the order. Effectively, you are losing 10% (or keeping only 1.5*0.6=0.9) of your money every two periods.
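
Putting the two growth rates side by side (my own quick check, not part of the simulation code):

(1.5*0.6)^0.5 #time average growth rate, ~0.95 per period
0.5*1.5 + 0.5*0.6 #ensemble average growth rate, 1.05 per period
100*(1.5*0.6)^(100/2) #typical wealth after 100 flips, ~$0.52 - close to the simulated median of 51 cents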

A system where the time average converges to the ensemble average (our population mean) is known as an ergodic system. The system of gambles above is non-ergodic as the time average and the ensemble average diverge. And given we cannot individually experience the ensemble average, we should not be misled by it. The focus on ensemble averages, as is typically done in economics, can be misleading if the system is non-ergodic.
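
One way to see the non-ergodicity is to follow a single long path: its per-period growth rate converges on the time average of roughly 0.95, not the ensemble average of 1.05. A minimal sketch, with an arbitrary seed:

set.seed(1)
flips <- sample(c(1.5, 0.6), 1000, replace=TRUE) #one person, 1000 flips
prod(flips)^(1/1000) #per-period growth rate along this single path, roughly 0.95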

The longer term

How can we reconcile this expectation of loss when looking at the time average growth with the continued growth of the wealth of some people after 100 periods? It does not seem that everyone is “almost surely” on the path to ruin.

But they are. If we plot the simulation for, say, 1,000 periods rather than 100, there are few winners. Here’s a plot of the average wealth of the population for 1000 periods (the first 100 being as previously shown), plus a log plot of that same growth (Figures 4 and 5).

Figure 4: Plot of average wealth over 1000 periods

Figure 5: Plot of average wealth over 1000 periods (log plot)

We can see that despite a large peak in wealth around period 400, wealth ultimately plummets. Average wealth at period 1000 is $24, below the starting average of $100, with a median wealth of $1×10^-21 (rounding to the nearest cent, that is zero). The wealthiest person has $242,000, which is 98.5% of the total wealth. If we followed that wealthy person for another 1000 periods, I would expect them to be wiped out too. [I tested that – at 2000 periods the wealthiest person had $4×10^-7.] Despite the positive expected value, the wealth of the entire population is wiped out.

Losing wealth on a positive value bet

The first 100 periods of bets forces us to hold a counterintuitive idea in our minds. While the population as an aggregate experiences outcomes reflecting the positive expected value of the bet, the typical person does not. The increase in wealth across the aggregate population is only due to the extreme wealth of a few lucky people.

However, the picture over 1000 periods appears even more confusing. The positive expected value of the bet is nowhere to be seen. How could this be the case?

The answer lies in the distribution of wealth across the population. After 100 periods, one person held 70% of the wealth. We no longer have 10,000 equally weighted independent bets as we did in the first round. Instead, the path of the population’s wealth is largely driven by the outcome of the bets of this wealthy individual. As we have already shown, the wealth path for an individual almost surely leads to a compound 5% loss of wealth. That individual’s wealth is on borrowed time. The only way for someone to maintain their wealth would be to bet a smaller portion of their wealth, or to diversify their wealth across multiple bets.

The Kelly criterion

On the first of these options, the portion of a person’s wealth they should stake on a positive expected value bet such as this is given by the Kelly criterion. The Kelly criterion gives the bet size that maximises the geometric growth rate of wealth.

The Kelly criterion formula for a simple bet is as follows:

f=\frac{bp-q}{b}=\frac{p(b+1)-1}{b}

where

f is the fraction of the current bankroll to wager

b is the net odds received on the wager (i.e. you receive $b back on top of the $1 wagered if the bet is won)

p is the probability of winning

q is the probability of losing (1-p)

For the bet above, we have p=0.5 and b=\frac{0.5}{0.4}=1.25. As offered, we are effectively required to bet f=0.4, or 40% of our wealth, for that chance to win a 50% increase.

However, if we apply the above formula given p and b, a person should bet \frac{(0.5*(1.25+1)-1)}{1.25}=0.1, or 10%, of their wealth each round to maximise the geometric growth rate.

The Kelly criterion is effectively maximising the expected log utility of the bet through setting the size of the bet. The Kelly criterion will result in someone wanting to take a share of any bet with positive expected value.
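
To see where the 10% comes from, you can compute the geometric growth rate for each possible bet fraction and find the peak. A small sketch of mine (the growthRate function is just for illustration and is not part of the code at the end of this post):

growthRate <- function(f, p=0.5, b=1.25) (1 + f*b)^p * (1 - f)^(1-p) #per-period geometric growth rate for a fraction f wagered
fGrid <- seq(0, 0.99, 0.01)
fGrid[which.max(growthRate(fGrid))] #0.1, the Kelly fraction
growthRate(0.1) #~1.006 per period
growthRate(0.4) #~0.95 per period - the 40% stake we were offered shrinks wealth over time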

The Kelly bet “almost surely” leads to higher wealth than any other strategy in the long run.

If we simulate the above scenarios, but risking only 10% of wealth each round rather than 40% (i.e. heads wealth will increase by 12.5%, tails it will decrease by 10%), what happens? The expected value of the Kelly bet is 0.5*0.125+0.5*-0.1=0.0125 or 1.25% per round. This next figure shows the ensemble average, showing a steady increase.

Figure 6: Average wealth of population applying Kelly criterion (1000 periods)

If we look at the individuals in this population, we can also see that their paths more closely resemble that of the population average. Most still under-perform the mean (the system is still non-ergodic – the time average growth rate is (1.125*0.9)^0.5 ≈ 1.006, or 0.6% per period), and there is large wealth disparity, with the wealthiest person holding 36% of the total wealth after 1000 periods (after 100, they hold 0.5% of the wealth). Still, most people are better off, with 70% and 95% of the population experiencing a gain after 100 and 1000 periods respectively. The median wealth is almost $50,000 after the 1000 periods.

Figure 7: Plot of first 20 people applying Kelly criterion against average wealth (log scale, 1000 periods)

Unfortunately, the take-it-or-leave-it choice we opened with involves staking 40% of our wealth, so we can’t use the Kelly criterion to optimise the bet size and should refuse the bet.

Update clarifying some comments on this post:

An alternative more general formula for the Kelly criterion that can be used for investment decisions is:

f=\frac{p}{a}-\frac{q}{b}

where

f is the fraction of the current bankroll to invest

b is the value by which your investment increases (i.e. you receive $b back on top of each $1 you invested)

a is the value by which your investment decreases if you lose (the first formula above assumes a=1)

p is the probability of winning

q is the probability of losing (1-p)

Applying this formula to the original bet at the beginning of this post, a=0.4 and b=0.5, which gives f=0.5/0.4-0.5/0.5=0.25 or 25%. Therefore, you should put up 25% of your wealth, of which you could potentially lose 40% or win 50%.

This new formulation of the Kelly criterion gives the same recommendation as the former, but refers to different baselines. In the first case, the optimal bet is 10% of your wealth, which provides for a potential win of 12.5%. In the second case, you invest 25% of your wealth to possibly get a 50% return (12.5% of your wealth) or lose 40% of your investment (40% of 25% which is 10%). Despite the same effective recommendation, in one case you talk of f being 10%, and in the second 25%.
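
A quick check in R that the two formulations line up (nothing here beyond the formulas above):

p <- 0.5; q <- 1 - p
b <- 0.5/0.4 #net odds per $1 staked in the first formulation
(b*p - q)/b #0.1: stake 10% of your wealth
a <- 0.4; b2 <- 0.5 #loss and gain per $1 invested in the second formulation
p/a - q/b2 #0.25: invest 25% of your wealth
0.25*a #10% of wealth at risk either way
0.25*b2 #12.5% of wealth as the potential gain, matching 10%*1.25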

Evolving preferences

Suppose two types of agent lived in this non-ergodic world and their fitness was dependent on the outcome of the 50:50 bet for a 50% gain or 40% loss. One type always accepted the bet, the other always rejected it. Which would come to dominate the population?

An intuitive reaction to the above examples might be that while the accepting type might have a short term gain, in the long run they are almost surely going to drive themselves extinct. There are a couple of scenarios where that would be the case.

One is where the children of a particular type were all bound to the same coin flip as their siblings for subsequent bets. Suppose one individual had over 1 million children after 100 periods, comprising around 70% of the population (which is what they would have if we borrowed the above simulations for our evolutionary scenario, with one coin flip per generation). If all had to bet on exactly the same coin flip in period 101 and beyond, they are doomed.

If, however, each child faces their own coin flip (experiencing, say, idiosyncratic risks), that crash never comes. Instead the risk of those flips is diversified and the growth of the population more closely resembles the ensemble average, even over the very long term.

Below is a chart of the population for a simulation of 100 generations of the accepting population, starting with a population of 10,000. For this simulation I have assumed that at the end of each period, the accepting types will have a number of children equal to the proportional change in their wealth. For example, if they flip heads, they will have 1.5 children; for tails, they will have 0.6 children. They then die. (The simulation works out largely the same if I make the number of children probabilistic in accord with those numbers.) Each child takes their own flip.

Figure 8: Population of accepting types

This has an expected population growth rate of 5%.

This evolutionary scenario differs from the Kelly criterion in that the accepting types are effectively able to take many independent shares of the bet, each involving only a tiny fraction of their inclusive fitness.

In a Nature Physics paper summarising some of his work, Peters writes:

[I]n maximizing the expectation value – an ensemble average over all possible outcomes of the gamble – expected utility theory implicitly assumes that individuals can interact with copies of themselves, effectively in parallel universes (the other members of the ensemble). An expectation value of a non-ergodic observable physically corresponds to pooling and sharing among many entities. That may reflect what happens in a specially designed large collective, but it doesn’t reflect the situation of an individual decision-maker.

A replicating entity that can diversify future bets across many offspring is able to do just this.

There are a lot of wrinkles that could be thrown into this simulation. How many bets does someone have to make before they reproduce and effectively diversify their future? The more bets, the higher the chance of a poor end. There is also the question of whether bets by children would be truly independent (Imagine a highly-related tribe).
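
To put rough numbers on “the more bets, the higher the chance of a poor end”, here is a sketch that computes the probability of being below the starting wealth after n flips (the pBehind helper and its break-even threshold are my own, not from the simulations above):

pBehind <- function(n, up=1.5, down=0.6){
  breakEven <- n*log(1/down)/(log(up) - log(down)) #number of heads needed to at least break even
  pbinom(ceiling(breakEven) - 1, size=n, prob=0.5) #probability of getting fewer heads than that
}
sapply(c(2, 100, 1000), pBehind) #~0.75, ~0.86 (matching the 86% above), ~0.9999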

The next post

In a future post I’ll ask whether, given the above, we need risk and loss aversion to explain our choices.

Code

Below is the R code used for generation of the simulations and figures.

Load the required packages:

library(ggplot2)
library(scales) #use the percent scale later

Create a function for the bets.

bet <- function(p,pop,periods,gain,loss, ergodic=FALSE){

  #p is probability of a gain
  #pop is how many people in the simulation
  #periods is the number of coin flips simulated for each person
  #if ergodic=FALSE, gain and loss are the multipliers
  #if ergodic=TRUE, gain and loss are the dollar amounts

  params <- as.data.frame(c(p, pop, periods, gain, loss, ergodic))
  rownames(params) <- c("p", "pop", "periods", "gain", "loss", "ergodic")
  colnames(params) <- "value"

  sim <- matrix(data = NA, nrow = periods, ncol = pop)

  if(ergodic==FALSE){
    for (j in 1:pop) {
      x <- 100 #x is the number of dollars each person starts with
      for (i in 1:periods) {
      outcome <- rbinom(n=1, size=1, prob=p)
      ifelse(outcome==0, x <- x*loss, x <- x*gain)
      sim[i,j] <- x
      }
    }
  }

 if(ergodic==TRUE){
    for (j in 1:pop) {
      x <- 100 #x is the number of dollars each person starts with
      for (i in 1:periods) {
      outcome <- rbinom(n=1, size=1, prob=p)
      ifelse(outcome==0, x <- x-loss, x <- x+gain)
      sim[i,j] <- x
      }
    }
  }

  sim <- rbind(rep(100,pop), sim) #placing the $x starting sum in the first row
  sim <- cbind(seq(0,periods), sim) #number each period
  sim <- data.frame(sim)
  colnames(sim) <- c("period", paste0("p", 1:pop))
  sim <- list(params=params, sim=sim)
  sim
}

Simulate 10,000 people who accept a series of 1000 50:50 bets to win 50% of their wealth or lose 40%.

set.seed(20191215)
nonErgodic <- bet(p=0.5, pop=10000, periods=1000, gain=1.5, loss=0.6, ergodic=FALSE)

Create a function for plotting the average wealth of the population over a set number of periods.

averagePlot <- function(sim, periods=100){

  basePlot <- ggplot(sim$sim[c(1:(periods+1)),], aes(x=period)) +
    labs(y = "Average Wealth ($)")

  averagePlot <- basePlot +
    geom_line(aes(y = rowMeans(sim$sim[c(1:(periods+1)),2:(sim$params[2,]+1)])), color = 1, size=1)

  averagePlot
}

Plot the average outcome of these 10,000 people over 100 periods (Figure 1).

averagePlot(nonErgodic, 100)

Create a function for plotting the path of individuals in the population over a set number of periods.

individualPlot <- function(sim, periods, people){

  basePlot <- ggplot(sim$sim[c(1:(periods+1)),], aes(x=period)) +
    labs(y = "Wealth ($)")

  for (i in 1:people) {
    basePlot <- basePlot +
      geom_line(aes_string(y = sim$sim[c(1:(periods+1)),(i+1)]), color = 2) #need to use aes_string rather than aes to get all lines to print rather than just last line
  }

basePlot

}

Plot of the path of the first 20 people over 100 periods (Figure 2).

nonErgodicIndiv <- individualPlot(nonErgodic, 100, 20)
nonErgodicIndiv

Plot both the average outcome and first twenty people on the same plot using a log scale (Figure 3).

logPlot <- function(sim, periods, people) {
  individualPlot(sim, periods, people) +
    geom_line(aes(y = rowMeans(sim$sim[c(1:(periods+1)),2:(sim$params[2,]+1)])), color = 1, size=1) +
    scale_y_log10()
}

nonErgodicLogPlot <- logPlot(nonErgodic, 100, 20)
nonErgodicLogPlot

Create a function to generate summary statistics.

summaryStats <- function(sim, period=100){

  meanWealth <- mean(as.matrix(sim$sim[(period+1),2:(sim$params[2,]+1)]))
  medianWealth <- median(as.matrix(sim$sim[(period+1),2:(sim$params[2,]+1)]))
  numDollar <- sum(sim$sim[(period+1),2:(sim$params[2,]+1)]<=1) #number with $1 or less
  numGain <- sum(sim$sim[(period+1),2:(sim$params[2,]+1)]>=100) #number who gain
  num10000 <- sum(sim$sim[(period+1),2:(sim$params[2,]+1)]>=10000) #number who finish with more than $10,000
  winner <- max(sim$sim[(period+1),2:(sim$params[2,]+1)]) #wealth of wealthiest person
  winnerShare <- winner / sum(sim$sim[(period+1),2:(sim$params[2,]+1)]) #wealth share of wealthiest person

  print(paste0("mean: $", round(meanWealth, 2)))
  print(paste0("median: $", round(medianWealth, 2)))
  print(paste0("number with less than a dollar: ", numDollar))
  print(paste0("number who gained: ", numGain))
  print(paste0("number that finish with more than $10,000: ", num10000))
  print(paste0("wealth of wealthiest person: $", round(winner)))
  print(paste0("wealth share of wealthiest person: ", percent(winnerShare)))
}

Generate summary statistics for the population and wealthiest person after 100 periods

summaryStats(nonErgodic, 100)

Plot the average wealth of the non-ergodic simulation over 1000 periods (Figure 4).

averagePlot(nonErgodic, 1000)

Plot the average wealth of the non-ergodic simulation over 1000 periods using a log plot (Figure 5).

averagePlot(nonErgodic, 1000)+
    scale_y_log10()

Calculate some summary statistics about the population and the wealthiest person after 1000 periods.

summaryStats(nonErgodic, 1000)

Kelly criterion bets

Calculate the optimum Kelly bet size.

p <- 0.5
q <- 1-p
b <- (1.5-1)/(1-0.6)
f <- (b*p-q)/b
f

Run a simulation using the optimum bet size.

set.seed(20191215)
kelly <- bet(p=0.5, pop=10000, periods=1000, gain=1+f*b, loss=1-f, ergodic=FALSE)

Plot ensemble average of Kelly bets (Figure 6).

averagePlotKelly <- averagePlot(kelly, 1000)
averagePlotKelly

Plot of the path of the first 20 people over 1000 periods (Figure 7).

logPlotKelly <- logPlot(kelly, 1000, 20)
logPlotKelly

Generate summary stats after 1000 periods of the Kelly simulation

summaryStats(kelly, 1000)

Evolutionary simulation

Simulate the population of accepting types.

set.seed(20191215)
evolutionBet <- function(p,pop,periods,gain,loss){

  #p is probability of a gain
  #pop is how many people in the simulation
  #periods is the number of generations simulated

  params <- as.data.frame(c(p, pop, periods, gain, loss))
  rownames(params) <- c("p", "pop", "periods", "gain", "loss")
  colnames(params) <- "value"

  sim <- matrix(data = NA, nrow = periods, ncol = 1)

  sim <- rbind(pop, sim) #placing the starting population in the first row

  for (i in 1:periods) {
    for (j in 1:round(pop)) {
      outcome <- rbinom(n=1, size=1, prob=p)
      ifelse(outcome==0, x <- loss, x <- gain)
      pop <- pop + (x-1)
    }
    pop <- round(pop)
    print(i)
    sim[i+1] <- pop #"+1" as have starting population in first row
  }

  sim <- cbind(seq(0,periods), sim) #number each period
  sim <- data.frame(sim, row.names=NULL)
  colnames(sim) <- c("period", "pop")
  sim <- list(params=params, sim=sim)
  sim
}

evolution <- evolutionBet(p=0.5, pop=10000, periods=100, gain=1.5, loss=0.6) #more than 100 periods can take a very long time, simulation slows markedly as population grows

Plot the population growth for the evolutionary scenario (Figure 8).

basePlotEvo <- ggplot(evolution$sim[c(1:101),], aes(x=period))

expectationPlotEvo <- basePlotEvo +
  geom_line(aes(y=pop), color = 1, size=1) +
  labs(y = "Population")

expectationPlotEvo

The case against loss aversion

Summary: Much of the evidence for loss aversion is weak or ambiguous. The endowment effect and status quo bias are subject to multiple alternative explanations, including inertia. There is possibly better evidence for loss aversion in the response to risky bets, but what emerges does not appear to be a general principle of loss aversion. Rather, “loss aversion” is a conditional effect that most typically emerges when rejecting the bet is not the status quo and the stakes are material.

[As a postscript, a week after publishing this post, a working paper for a forthcoming Journal of Consumer Psychology article was released. That paper addresses some of the below points. A post on that paper is in the works.]


In a previous post I flagged three critiques of loss aversion that had emerged in recent years. The focus of that post was Eldad Yechiam’s analysis of the assumption of loss aversion in Kahneman and Tversky’s classic 1979 prospect theory paper.

The second critique, and the focus of this post, is an article by David Gal and Derek Rucker, The Loss of Loss Aversion: Will It Loom Larger Than Its Gain? (pdf). Its abstract:

Loss aversion, the principle that losses loom larger than gains, is among the most widely accepted ideas in the social sciences. The first part of this article introduces and discusses the construct of loss aversion. The second part of this article reviews evidence in support of loss aversion. The upshot of this review is that current evidence does not support that losses, on balance, tend to be any more impactful than gains. The third part of this article aims to address the question of why acceptance of loss aversion as a general principle remains pervasive and persistent among social scientists, including consumer psychologists, despite evidence to the contrary. This analysis aims to connect the persistence of a belief in loss aversion to more general ideas about belief acceptance and persistence in science. The final part of the article discusses how a more contextualized perspective of the relative impact of losses versus gains can open new areas of inquiry that are squarely in the domain of consumer psychology.

The release of Gal and Rucker’s paper was accompanied by a Scientific American article by Gal, Why the Most Important Idea in Behavioral Decision-Making Is a Fallacy. It uses somewhat stronger language. Here’s a snippet:

[T]here is no general cognitive bias that leads people to avoid losses more vigorously than to pursue gains. Contrary to claims based on loss aversion, price increases (ie, losses for consumers) do not impact consumer behavior more than price decreases (ie, gains for consumers). Messages that frame an appeal in terms of a loss (eg, “you will lose out by not buying our product”) are no more persuasive than messages that frame an appeal in terms of a gain (eg, “you will gain by buying our product”).

People do not rate the pain of losing $10 to be more intense than the pleasure of gaining $10. People do not report their favorite sports team losing a game will be more impactful than their favorite sports team winning a game. And people are not particularly likely to sell a stock they believe has even odds of going up or down in price (in fact, in one study I performed, over 80 percent of participants said they would hold on to it).

This critique of loss aversion is not completely new. David Gal has been making related arguments since 2006. In this more recent article, however, Gal and Rucker draw on a larger literature and some additional experiments to expand the critique.

To frame their argument, they describe three potential versions of loss aversion:

  • The strong version: losses always loom larger than gains
  • The weak version: losses on balance loom larger than gains
  • The contextual version: depending on context, losses can loom larger than gains, losses and gains can have equal weighting, or gains can loom larger than losses

The strong version appears to be a straw man that few would defend, but there is some subtlety in Gal and Rucker’s definition. They write:

This strong version does not require that losses must outweigh gains in all circumstances, as factors such as measurement error and boundary conditions might obscure or reduce the fundamental propensity for losses to be weighted more than equivalent gains.

An interesting point by Gal and Rucker is that for most research on the boundaries or moderators of loss aversion, loss aversion is the general principle around which the exceptions are framed. If people don’t exhibit loss aversion, it is usually argued that the person is not encoding the transaction as a loss, so loss aversion does not apply. The alternative that the gains have equal weight to (or greater weight than) the loss is not put forward. So although few would defend a blunt reading of the strong version, many researchers take it as though people are loss averse unless certain circumstances are present.

Establishing the weak version seems difficult. Tallying studies in which losses loom larger and where gains dominate would provide evidence more on the focus of research than the presence of a general principle of loss aversion. It’s not even clear how you would compare across different contexts.

Despite this difficulty (or possibly because of it), Gal and Rucker come down firmly in favour of the contextual version. They do this not through tallying or comparing the contexts in which losses or gains loom larger, but by arguing that most evidence of loss aversion is ambiguous at best.

Loss aversion as evidence for loss aversion

The principle of loss aversion is descriptive. It is a label applied to an empirical phenomenon. It is not an explanation. Similarly, the endowment effect, our tendency to ascribe more value to items that we have than to those we don’t, is a label applied to an empirical phenomenon.

Despite being descriptive, Gal and Rucker note that loss aversion is often used as an explanation for choices. For example, loss aversion is often used as an explanation for the endowment effect. But using a descriptive label as an explanation provides no analytical value, with what appears to be an explanation simply being the application of a different label. (Owen Jones suggests that stating the endowment effect is due to loss aversion is no more useful than labelling human sexual behaviour as being due to abstinence aversion. I personally think it is marginally more useful, if only for the fact that there is now a debate as to whether loss aversion and the endowment effect are related. The transfer of the label shows that you believe these empirical phenomena have the same psychological basis.)

Gal and Rucker argue that the application of the loss aversion label to the endowment effect leads to circular arguments. The endowment effect is used as evidence for loss aversion, and, as noted above, loss aversion is commonly used to explain the endowment effect. This results in an unjustified reinforcement of the concept, and a degree of neglect of alternative explanations for the phenomena.

I have some sympathy for this claim, although I am not overly concerned by it. The endowment effect has multiple explanations (as will be discussed below), so it is weak evidence of loss aversion at best. However, it is rare that the endowment effect is the sole piece of evidence presented for the existence of loss aversion. It is more often one of a series of stylised facts for which a common foundation is sought. So although there is circularity, the case for loss aversion does not rest solely on that circular argument.

Risky versus riskless choice

Much of Gal and Rucker’s examination of the evidence for loss aversion is divided between riskless and risky choice. Riskless choice involves known options and payoffs with certainty. Would you like to keep your chocolate or exchange it for a coffee mug? In risky choice, the result of the choice involves a payoff that becomes known only after the choice. Would you like to accept a 50:50 bet to win $110, lose $100?

Below is a collection of their arguments as to why loss aversion is not the best explanation for many observed empirical results sorted across those two categories.

Riskless choice – status quo bias and the endowment effect

Gal and Rucker’s examination of riskless choice centres on the closely related concepts of status quo bias and the endowment effect. Status quo bias is the propensity for someone to stick with the status quo option. The endowment effect is the propensity for someone to value an object they own over an object that they would need to acquire.

Status quo bias and the endowment effect are often examined in an exchange paradigm. You have been given a coffee mug. Would you like to exchange it for a chocolate? The propensity to retain the coffee mug (or the chocolate if that was what they were given first) is labelled as either status quo bias or the endowment effect. Loss aversion is often used to explain this decision, as the person would lose the status quo option or their current endowment when they choose an alternative.

Gal and Rucker suggest that rather than being driven by loss aversion, status quo bias in this exchange paradigm is instead due to a preference for inaction over action (call this inertia). A person needs a psychological motive for action. Gal examined this in his 2006 paper when he asked experimental subjects to imagine that they had a quarter minted in one city, and then whether they would be willing to exchange it for a quarter minted in another. Following speculation by Kahneman and others that people do not experience loss aversion when exchanging identical goods, Gal considered that a propensity for the status quo absent loss aversion would indicate the presence of inertia.

Gal found that despite there being no loss in the exchange of quarters, the experimental subjects preferred the status quo of keeping the coin they had. Gal and Rucker replicated this result on Amazon Mechanical Turk, offering to exchange one hypothetical $20 bill for another. They took this as evidence of inertia.

Apart from the question of what weight you should give an experiment involving hypothetical coins, notes and exchanges, I don’t find this a convincing demonstration that inertia lies behind the status quo bias. Exchange does involve some transaction costs (in the form of effort, however minimal, even if you are told to assume they are insignificant). In his 2006 paper, Gal reports other research where people traded identical goods when paid a nickel to cover “transaction costs”. The token amount spurred action.

Those experiments, however, involved transactions of goods with known values. The value of a quarter is clear. In contrast, Gal’s exploration of status quo bias in his 2006 paper involved goods without an obvious face value. This is important, as Gal argued that people have “fuzzy preferences” that are often ill-defined and constructed on an ad hoc basis. If we do not precisely judge the attractiveness of chocolate or a mug, we may not have a precise ordering of preference between the two that would justify choosing one over the other. Under Gal’s concept of inertia, the lack of a psychological motive to change results in us sticking with the status quo mug.

Contrast this with the exchange of quarters: there, the addition of a nickel to cover trading expenses allows for a precise ordering of the two options, as they are easily comparable monetary sums. In the case of a mug and chocolate, the addition of a nickel is unlikely to make the choice any easier, as the degree of fuzziness extends over a much larger range.

The other paradigm under which the endowment effect is explored is the valuation paradigm. The valuation paradigm involves asking someone what they would be willing to pay to purchase or acquire an item, or how much they would require to be paid to accept an offer to purchase an item in their possession. The gap between this willingness to pay and the typically larger willingness to accept is the additional value given to the endowed good. (For some people this is how status quo bias and the endowment effect are differentiated. Status quo bias is the maintenance of the status quo in an exchange paradigm, the endowment effect is the higher valuation of endowed goods in the valuation paradigm. However, many also label the exchange paradigm outcome as being due to the endowment effect. Across the literature they are often used interchangeably.)

This difference between willingness to pay and accept in the valuation paradigm is often cited as evidence of loss aversion. But as Gal and Rucker argue, this difference has alternative explanations. Fundamentally different questions are asked when seeking an individual’s willingness to accept (what is the market value?) and their willingness to pay (what is their personal utility?). Only willingness to pay is affected by budget constraints.

Although not mentioned in the 2018 paper, Gal’s 2006 paper suggests this gap may also be due to fuzzy preferences, with the willingness to pay and willingness to accept representing the two end points of the fuzzy range of valuation. Willingness to pay is the lower bound. For any higher amount, they are either indifferent (the range of fuzzy preferences) or would prefer the monetary sum in their hand. Willingness to accept is the upper bound. For any lower amount they are either indifferent (the range of fuzzy preferences) or would prefer to keep the good.

There are plenty of experiments in the broader literature seeking to tease out whether the endowment effect is due to loss aversion or alternative explanations of the type above. Gal and Rucker report their own (unpublished) set of experiments where they seek to isolate inertia as the driver of the difference between willingness to pay and willingness to accept. They asked for experimental subjects’ willingness to pay to obtain a good, versus their willingness to retain a good. For example, they compared subjects’ willingness to pay to fix a phone versus their willingness to pay to get a repaired phone. They asked about their willingness to expend time to drive to get a new notebook they left behind versus their willingness to drive to get a brand new notebook. They asked about their willingness to pay for fibre optic internet versus their willingness to pay to retain fibre optic internet that they already had. For each choice, the subject needs to act to get the good, so inertia is removed as a possible explanation of a preference to retain an endowed good.

With fuzzy preferences under this experimental setup, both willingness to pay and willingness to retain would be the lower bound, as any higher payment would lead to indifference or a preference for the monetary sum. Here Gal and Rucker found little difference between willingness to pay and willingness to retain.

Gal and Rucker characterise each of the options as involving choices between losses and gains, and survey questions put to the experimental subjects confirmed that most were framing the choices in that way. This allowed them to point to this experiment as evidence against loss aversion driving the endowment effect. Remove inertia but leave the loss/gain framing, and the effect disappears.

However, the experimental implementation of this idea is artificial. Importantly, the decisions are hypothetical and unincentivised. Whether coded as a loss or gain, the experimental subjects were never endowed with the good and weren’t experiencing a loss.

More convincing evidence, however, came from Gal and Rucker’s application of this idea in an exchange paradigm. In one scenario, people were endowed with a pen or chocolate bar. They were then asked to choose between keeping the pen or swapping for the chocolate bar, so an active choice was required for either option. Gal and Rucker found that regardless of the starting point, roughly the same proportion chose the pen or chocolate bar. This contrasts with a more typical endowment effect experimental setup that they also ran, in which they simply asked people given a pen or chocolate bar whether they would like to exchange. Here the usual endowment effect pattern emerged, with people more likely to keep the endowed good.

Like the endowment effect experiments they critique, this result is subject to alternative explanations, the simplest (although not necessarily convincing) being that the reference point has been changed by the framing of the question. By changing the status quo, you also change the reference point. (I should say this type of argument involving ad hoc stories about changes in reference points is one of the least satisfactory elements of prospect theory.)

Despite the potential for alternative explanations, these experiments are the beginning of a body of evidence for inertia driving some results currently attributed to loss aversion. Gal and Rucker’s argument against use of the endowment effect as evidence of loss aversion is even stronger. There are many alternative explanations to loss aversion for the status quo bias and endowment effect. The evidence for loss aversion is better found elsewhere.

Risky choice

Gal and Rucker’s argument concerning risky bets resembles that for riskless choice. Many experiments in the literature involve an offer of a bet, such as a 50:50 chance to win $100 or lose $100, which the experimental subject can accept or reject. Rejection is the status quo, so inertia could be an explanation for the decision to reject.

Gal and Rucker describe an alternative experiment in which people can choose between a certain return of 3% or a risky bet with expected value of zero. As they must make a choice, there is not a status quo option. 80% of people allocate at least some money to the risky bet, suggesting an absence of loss aversion. This type of finding is reflected across a broader literature.

They also report a body of research where the risky bet is not the sole option to opt into, but rather one of two options for which an active choice must be made. For example, would you like $0 with certainty, or a 50:50 bet to win $10, lose $10? In this case, little evidence for loss aversion emerges unless the stakes are large.

This framing of the safe option as the status quo is one of many conditions under which loss aversion tends to emerge. Gal and Rucker reference a paper by Eyal Ert and Ido Erev, who identified that in addition to emerging when the safe option is the status quo, loss aversion also tends to emerge with:

  • high nominal payoffs
  • when the stakes are large
  • when there are bets present in the choice list that create a contrast effect, and
  • in long experiments without feedback where the computation of the expected payoff is difficult.

Ert and Erev described a series of experiments where they remove these features and eliminate loss aversion.

Gal and Rucker also reference a paper by Yechiam and Hochman (pdf), who surveyed the loss aversion literature involving balanced 50:50 bets. For experiential tasks, where decision makers are required to repeatedly select between options with no prior description of the outcomes or probabilities (effectively learning the probabilities with experience), there is no evidence of loss aversion. For descriptive tasks, where a choice is made between fully-described options, loss aversion most typically arises for “high-stakes” hypothetical amounts, and is often absent for lower sums (which are also generally hypothetical).

For the higher stakes bets, Yechiam and Hochman suggest risk aversion may explain the choices. However, what Yechiam and Hochman call high stakes aren’t that high; for example $600 versus $500. As I described in my previous post on the Rabin Paradox, risk aversion at stakes of that size can only be shoehorned into the traditional expected utility model with severe contortions (although it can be done). Rejecting that bet is a high level of risk aversion for anyone with more than minimal wealth (although these experimental subjects may have low wealth as they are generally students). Loss aversion is one alternative explanation.

Regardless, under the concept of loss aversion as presented in prospect theory, we should see loss aversion for low stakes bets. Once you are arguing that “loss aversion” will emerge if the bet is large enough, this is a different conception of loss aversion to that in the academic literature.

Other phenomena that may not involve loss aversion

At the end of the paper, Gal and Rucker mention a couple of other phenomena incorrectly attributed to or not necessarily caused by loss aversion.

The first of these is the Asian disease problem. In this problem, experimental subjects are asked:

Imagine that the U.S. is preparing for the outbreak of an unusual Asian disease, which is expected to kill 600 people. Two alternative programs to combat the disease have been proposed. Assume that the exact scientific estimate of the consequences of the programs are as follows:

If Program A is adopted, 200 people will be saved.

If Program B is adopted, there is 1/3 probability that 600 people will be saved, and 2/3 probability that no people will be saved.

Which of the two programs would you favor?

Most people tend to prefer program A.

Then ask another set of people the following:

If Program C is adopted 400 people will die.

If Program D is adopted there is 1/3 probability that nobody will die, and 2/3 probability that 600 people will die.

Which of the two programs would you favor?

Most people prefer program D, despite C and D being a reframed version of programs A and B. The reason for this change is usually attributed to the second set of options being a loss frame, with people preferring to gamble to avoid the loss.

This, however, is not loss aversion. There is, after all, no potential for gain in the second set of questions against which the strength of the losses can be compared. Rather, this is the “reflection effect”.

Tversky and Kahneman recognised this when they presented the Asian disease problem in their 1981 Science article (pdf), but the translation into public discourse has missed this difference, with the Asian disease problem often presented as an example of loss aversion.

Gal and Rucker point out some other examples of phenomena that may be incorrectly attributed to loss aversion. The disposition effect – the tendency for people to sell winning investments and retain losing investments – could also be explained by the reflection effect, or by lay beliefs about mean reversion. The sunk cost effect involves a refusal to recognise losses rather than a greater impact of losses relative to gains, as no comparison to a gain is made.

Losses don’t hurt more than gains

Beyond the thoughtful argument in the paper, Gal’s Scientific American article goes somewhat further. For instance, Gal writes:

People do not rate the pain of losing $10 to be more intense than the pleasure of gaining $10. People do not report their favorite sports team losing a game will be more impactful than their favorite sports team winning a game.

I find it useful to distinguish two points. The first is the question of the psychological impact of a loss. Does a loss generate a different feeling, or level of attention, to an equivalent gain? The second is how that psychological response manifests itself in a decision. Do people treat losses and gains differently, resulting in loss aversion of the type described in prospect theory?

The lack of differentiation between these two points often clouds the discussion of loss aversion. The first point accords with our instinct. We feel the pain of a loss. But that pain does not necessarily mean that we will be loss averse in our decisions.

Gal and Rucker’s article largely focuses on the second of these points through its examination of a series of choice experiments. Yet the types of claims in the Scientific American article, as in the above quote, are more about the first.

This is the point where I disagree with Gal. Although contextual (isn’t everything), the evidence of the greater psychological impact of losses appears solid. In fact, the Yechiam and Hochman article (pdf), quoted by Gal and Rucker for its survey of the loss aversion literature, was an attempt to reconcile the disconnect between the evidence for the effect of losses on performance, arousal, frontal cortical activation, and behavioral consistency with the lack of evidence for loss aversion. Yechiam’s article on the assumption of loss aversion by Kahneman and Tversky (the subject of a previous post) closes with a section reconciling his argument with the evidence of the effect of small stake losses on learning and performance.

To be able to make claims that the evidence of psychological impact of losses is as weak and contextual as the evidence for loss aversion, Gal and Rucker would need to provide a much deeper review of the literature. But in the absence of that, my reading of the literature does not support those claims.

Unfortunately, these points in the Scientific American article have been the focus of the limited responses to Gal and Rucker’s article, leaving us with a somewhat unsatisfactory debate (as I discuss further below).

Hey, we’re overthrowing the old paradigm!

The third part of Gal and Rucker’s paper concerns what they call the “Sociology of Loss Aversion”. I don’t have much to say on their particular arguments in this section, except that I have a gut reaction against authors discussing Thomas Kuhn and contextualising their work as overthrowing the entrenched paradigm. Maybe it’s the lack of modesty in failing to acknowledge they could be wrong (like most outsiders complaining about their ideas being ignored and quoting Kuhn). Just build your case overthrowing the damn paradigm!

That said, the few responses to Gal and Rucker’s paper that I have seen are underwhelming. Barry Ritholtz wrote a column, labelled by Richard Thaler as a “Good takedown of recent overwrought editorial”, which basically said an extraordinary claim such as this requires extraordinary evidence, and that that standard has not been met.

Unfortunately, the lines in Gal’s Scientific American article on the psychological effect of losses were the focus of Ritholtz’s response, rather than the evidence in the Gal and Rucker article. Further, Ritholtz didn’t show much sign of having read the paper. For instance, in response to Gal’s claim that “people are not particularly likely to sell a stock they believe has even odds of going up or down in price”, Ritholtz responded that “the endowment effect easily explains why we place greater financial value on that which we already possess”. But, as noted above, (a solid) part of Gal and Rucker’s argument is that the endowment effect may not be the result of loss aversion. (It’s probably worth noting here that Gal and Rucker did effectively replicate the endowment effect many times over. The endowment effect is a solid phenomenon.)

Another thread of response, linked by Ritholtz, came from Albert Bridge Capital’s Drew Dickson. One part of Dickson’s 20-tweet thread runs as follows:

13| So, sure, a billionaire will not distinguish between a $100 loss and a $100 gain as much as Taleb’s at-risk baker with a child in college; but add a few zeros, and the billionaire will start caring.

14| Critics can pretend that Kahneman, Tversky and @R_Thaler haven’t considered this, but they of course have. From some starting point of wealth, there is some other number where loss aversion matters. For everyone. Even Gal. Even Rucker. Even Taleb.

15| Losses (that are significant to the one suffering the loss) feel much worse than similarly-sized gains feel good. Just do the test on yourself.

But this idea that you will be loss averse if the stakes are high enough is not “loss aversion”, or at least not the version of loss aversion from prospect theory, which applies to even the smallest of losses. It’s closer to the concept of “minimal requirements”, whereby people avoid bets that could be ruinous, rather than because losses hurt more than gains.

Thaler himself threw out a tweet in response, stating that:

No minor point about terminology. Nothing of substance. WTA > WTP remains.

That willingness to accept (WTA) is greater than willingness to pay (WTP) when framed as the status quo is not a point Gal and Rucker would disagree with. But is it due to loss aversion?

Thankfully, the publication of Gal and Rucker’s article was accompanied by two responses, one of which tackled some of the substantive issues (the other response built on rather than critiqued Gal and Rucker’s work). That substantive response (pdf), by Itamar Simonson and Ran Kivetz, would best be described as supporting the weak version of loss aversion.

Simonson and Kivetz largely agreed that status quo bias and the endowment effect do not offer reliable support for loss aversion, particularly given the alternative explanations for the phenomena. However, they were less convinced of Gal and Rucker’s experiments to identify inertia as the basis of these phenomena, suggesting the experiments involved “unrealistic experimental manipulations that are susceptible to confounds and give rise to simple alternative explanations”, although they leave those simple alternative explanations unspecified.

Simonson and Kivetz also disagreed with Gal and Rucker on the evidence concerning risky bets, describing as ad hoc and unsupported the assumption that not accepting the bet is the status quo. It’s not clear to me how they could describe that assumption as unsupported given Gal and Rucker’s experimental evidence (and the evidence Gal and Rucker cite) of the absence of loss aversion for small stakes when rejecting the bet is not framed as the status quo. Loss aversion only emerges for larger bets.

I should say, however, that I do have some sympathy for Simonson and Kivetz’s resistance to accepting Gal and Rucker’s sweeping of the risky bet premium into the status quo bucket. Even those larger bets for which loss aversion arises aren’t that large (as noted above, they’re often in the range of $500). Risk aversion is a somewhat unsatisfactory alternative explanation (a topic I discuss in my post on Rabin’s Paradox), and I sense that some form of loss aversion kicks in, although here we may again be talking about a minimal requirements type of loss aversion, not the loss aversion of prospect theory.

Despite their views on risky bets, Simonson and Kivetz were more than willing to approve of Gal and Rucker’s case that loss aversion is a contingent phenomenon. They would simply argue that loss aversion occurs “on average”. As noted above, I’m not sure how you would weight the relative instances of gains or losses having greater weight, so I’ll leave that debate for now.

Funnily enough, a final comment by Simonson and Kivetz on risky bets is that “the notion that losses do tend to loom larger than gains is most likely correct; it certainly resonates and “feels” consistent with personal experience, though intuitive reactions are a weak form of evidence.” As noted above, we should distinguish between feelings and decisions that exhibit loss aversion.

Unfortunately, I haven’t found anything else that attempts to pick apart Gal and Rucker’s article, so it is hard to gauge the broader reception to the article or whether it has resonated in academic circles at all.

Where does this leave us on loss aversion?

Putting this together, I would summarise the case for loss aversion as follows:

  • The conditions for loss aversion are more restrictive than is typically thought or stated in discussion outside academia
  • Some of the claimed evidence for loss aversion, such as the endowment effect, has alternative explanations. The evidence is better found elsewhere
  • There is sound evidence for the psychological impact of losses, but this does not necessarily manifest itself in loss aversion
  • Most of the loss aversion literature does a poor job of distinguishing between loss aversion in its pure sense and what might be called a “minimal requirements” effect, whereby people are avoiding the gamble due to the threat of ruin.

This is a more restricted conception of loss aversion than I held when I started writing this post.

The loss aversion series of posts

My next post will be on the topic of ergodicity, which involves the concept that people are not maximising the expected value of a series of gambles, but rather the time average (explanation on what that means to come). If people maximise the latter, not the former as many approaches assume, you don’t need risk or loss aversion to explain their decisions.

My other posts on loss aversion can be found here:

  1. Kahneman and Tversky’s debatable loss aversion assumption
  2. What can we infer about someone who rejects a 50:50 bet to win $110 or lose $100? The Rabin paradox explored
  3. The case against loss aversion (this post)
  4. Ergodicity economics – a primer

What can we infer about someone who rejects a 50:50 bet to win $110 or lose $100? The Rabin paradox explored

Consider the following claim:

We don’t need loss aversion to explain a person’s decision to reject a 50:50 bet to win $110 or lose $100. That is just simple risk aversion as in expected utility theory.

Risk aversion is the concept that we prefer certainty to a gamble with the same expected value. For example, a risk averse person would prefer $100 for certain over a 50-50 gamble between $0 and $200, which has an expected value of $100. The higher their risk aversion, the less they would value the 50:50 bet. They would also be willing to reject some positive expected value bets.

Loss aversion is the concept that losses loom larger than gains. If the loss is weighted more heavily than the gain – it is often said that losses hurt twice as much as gains bring us joy – then this could also explain the decision to reject a 50:50 bet of the type above. Loss aversion is distinct from risk aversion in that its full force applies to the first dollar either side of the reference point from which the person is assessing the change (at which point risk aversion should be negligible).
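To make the distinction concrete, here is a minimal sketch (my own illustration, not taken from any of the papers discussed) of how a loss averse decision maker might evaluate this bet, assuming a piecewise-linear value function around the reference point with losses weighted twice as heavily as gains:

pt_value <- function(x, lambda = 2){
    # Outcomes are valued relative to the reference point,
    # with losses weighted by lambda
    ifelse(x >= 0, x, lambda*x)
}

# The 50:50 bet to win $110 or lose $100
0.5*pt_value(110) + 0.5*pt_value(-100)
[1] -45

Because the bet is evaluated relative to the reference point rather than final wealth, the rejection holds no matter how wealthy the decision maker is.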

So, do we need loss aversion to explain the rejection of this bet, or does risk aversion suffice?

One typical response to the above claim is loosely based on the Rabin Paradox, which comes from a paper published in 2000 by Matthew Rabin:

An expected utility maximiser who rejects this bet is exhibiting a level of risk aversion that would lead them to reject bets that no one in their right mind would reject. It can’t be the case that this is simply risk aversion.

For the remainder of this post I am going to pull apart Rabin’s argument from his justifiably famous paper Risk Aversion and Expected-Utility Theory: A Calibration Theorem (pdf). A more readable version of this argument was also published in 2001 in an article by Rabin and Richard Thaler.

To understand Rabin’s point, I have worked through the math in his paper. You can see my mathematical workings in an Appendix at the bottom of this post. There were quite a few minor errors in the paper – and some major errors in the formulas – but I believe I’ve captured the crux of the argument. (I’d be grateful for some second opinions on this).

I started working through these two articles with an impression that Rabin’s argument was a fatal blow to the idea that expected utility theory accurately describes the rejection of bets such as that above. I would have been comfortable making the above response. However, after playing with the numbers and developing a better understanding of the paper, I would say that the above response is not strictly true. Rabin’s paper makes an important point, but it is far from a fatal blow by itself. (That fatal blow does come, just not solely from here.)

Describing Rabin’s argument

Rabin’s argument starts with a simple bet: suppose you are offered a 50:50 bet to win $110 or lose $100, and you turn it down. Suppose further that you would reject this bet no matter what your wealth (this is an assumption we will turn to in more detail later). What can you infer about your response to other bets?

This depends on what decision making model you are using.

For an expected utility maximiser – someone who maximises the probability weighted subjective value of these bets – we can infer that they will turn down any 50:50 bet of losing $1,000 and gaining any amount of money. For example, they would reject a 50:50 bet to lose $1,000, win one billion dollars.

On its face, that is ridiculous, and that is the crux of Rabin’s argument. Rejection of the low value bet to win $110 and lose $100 would lead to absurd responses to higher value bets. This leads Rabin to argue that risk aversion or the diminishing value of money has nothing to do with rejection of the low value bets.

The intuition behind Rabin’s argument is relatively simple. Suppose we have someone who rejects a 50:50 bet to gain $11, lose $10. They are an expected utility maximiser with a weakly concave utility curve: that is, they are risk neutral or risk averse at all levels of wealth.

From this, we can infer that they weight the average dollar between their current wealth (W) and their wealth if they win the bet (W+11) at only 10/11 of the average dollar in the last $10 of their current wealth (between W-10 and W). We can also say that they therefore weight their W+11th dollar at most 10/11 as much as their W-10th dollar (relying on the weak concavity here).

Suppose their wealth is now W+21. We have assumed that they will reject the bet at all levels of wealth, so they will also reject at this wealth. Iterating the previous calculations, we can say that they will weight their W+32nd dollar only 10/11 as much as their W+11th dollar. This means they value their W+32nd dollar only (10/11)^2 as much as their W-10th dollar.

Keep iterating in this way and you end up with some ridiculous results. You value the 210th dollar above your current wealth at only 40% of the last dollar of your current wealth [reducing by a constant factor of 10/11 every $21 – (10/11)^10]. Or you value the 900th dollar above your current wealth at only 2% of that last dollar [(10/11)^40]. This is an absurd rate of discounting.

Those numbers are from the 2001 Rabin and Thaler paper. In his 2000 paper, Rabin gives figures of 3/20 for the 220th and 1/2000 for the 880th dollar, effectively calculating (10/11)^20 and (10/11)^80, which is a reduction by a factor of 10/11 every 11 dollars. This degree of discounting can be justified and reflects the equations provided in the Appendix to his paper, but it requires a slightly different intuition than the comparison between every 21st dollar. If instead you note that the $11 above a reference point are valued less than the $10 below, you only need iterate up $11 to get another discount of 10/11, as the next $11 is valued at most as much as the previous $10.

Regardless of whether you use the numbers from the 2000 or 2001 paper, taking this iteration to the extreme, it doesn’t take long for additional money to have effectively zero value. Hence the result, reject the 50:50 win $110, lose $100 and you’ll reject the win any amount, lose $1,000 bet.
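These discount factors are easy to verify directly. The snippet below is my own check (not code from either paper) and reproduces the numbers quoted above:

# Iterating the 10/11 discount every $21, as in Rabin and Thaler (2001)
(10/11)^10    # value of the 210th dollar above current wealth: roughly 0.39
(10/11)^40    # value of the 900th dollar: roughly 0.02

# Iterating the 10/11 discount every $11, as in Rabin (2000)
(10/11)^20    # roughly 0.15, Rabin's 3/20 for the 220th dollar
(10/11)^80    # roughly 0.0005, Rabin's 1/2000 for the 880th dollar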

What is the utility curve of this person?

This argument sounds compelling, but we need to examine the assumption that you will reject the bet at all levels of wealth.

If someone rejects the bet at all levels of wealth, what is the least risk averse they could be? They would be close to indifferent to the bet at all levels of wealth. If that were the case across the whole utility curve, their absolute risk aversion would be constant.

The function typically used to represent constant absolute risk aversion is exponential utility, U(w)=1-e^{-aw} with a>0 . A feature of the exponential utility function is that, for a risk averse person, utility is bounded above. Beyond a certain level of wealth, additional money adds effectively no utility – hence Rabin’s ability to define bets where infinite gains are rejected.

The need for utility to cap out is also apparent from the fact that someone might reject a bet that involves the potential for infinite gain. The utility of infinite wealth cannot be infinite, as any bet involving the potential for infinite utility would be accepted.
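To see how quickly exponential utility flattens out, here is a small illustration (my own, with an arbitrary coefficient a chosen purely for the example):

a <- 0.001
U_cara <- function(w) 1 - exp(-a*w)    # constant absolute risk aversion utility

U_cara(10000) - U_cara(0)      # utility from the first $10,000: roughly 0.99995
U_cara(1e9) - U_cara(10000)    # utility from the next billion dollars: roughly 0.00005

Almost all of the attainable utility is exhausted by the first $10,000, so arbitrarily large gains add essentially nothing – which is why bets with an infinite upside can still be rejected.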

In the 2000 paper, Rabin brings the constant absolute risk aversion function into his argument more explicitly when he examines what proportion of their portfolio a person with an exponential utility function would invest in stocks (under some particular return assumptions). There he shows a ridiculous level of risk aversion and states that “While it is widely believed that investors are too cautious in their investment behavior, no one believes they are this risk averse.”

However, this effective (or explicit) assumption of constant absolute risk aversion is not particularly well grounded. Most empirical evidence is that people exhibit decreasing absolute risk aversion, not constant. Exponential utility functions are used more for mathematical tractability than for realistically reflecting the decision making processes that people use.

Yet, under Rabin’s assumption of rejecting the bet at all levels of wealth, constant absolute risk aversion and a utility function such as the exponential is the most accommodating assumption we can make. While Rabin states that “no one believes they are this risk averse”, it’s not clear that anyone believes Rabin’s underlying assumption either.

This ultimately means that the ridiculous implications of rejecting low-value bets are the result of Rabin’s unrealistic assumption that the bet is rejected no matter what the bettor’s wealth.

Relaxing the “all levels of wealth” assumption

Rabin is, of course, aware that the assumption of rejecting the bet at all levels of wealth is a weakness, so he provides a further example that applies to someone who only rejects this bet for all levels of wealth below $300,000.

This generates less extreme, but still clearly problematic bets that the bettor can be inferred to also reject.

For example, consider someone who rejects the 50:50 bet to win $110, lose $100 when they have $290,000 of wealth, and who would also reject that bet up to a wealth of $300,000. As in the previous example, each additional $110 above the reference point is valued at most 10/11 of the previous $110. It takes roughly 90 iterations of $110 to cover that $10,000, meaning that a dollar around wealth $300,000 is valued at only (10/11)^90 (about 0.02%) of a dollar at wealth $290,000. Each dollar above $300,000 is not discounted any further, but by then the damage has already been done, with that money of almost no utility.
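A quick check of that discount factor and iteration count (my own arithmetic, not from the paper):

(10/11)^90               # roughly 0.00019, i.e. about 0.02% of a dollar at $290,000
as.integer(10000/110)    # roughly 90 iterations of $110 to span the $10,000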

For instance, this person will reject a bet of gain $718,190, lose $1,000. Again, this person would be out of their mind.

You might now ask whether a person with a wealth of $290,000 to $300,000 would actually reject bets of this nature. If not, isn’t this just another unjustifiable assumption designed to generate a ridiculous result?

It is possible to make this scenario more realistic. Rabin doesn’t mention this in his paper (nor do Rabin and Thaler), but we can generate the same result at much lower levels of wealth. All we need to find is someone who will reject that bet over a range of $10,000, and still have enough wealth to bear the loss – say someone who will reject that bet up to a wealth of $11,000. That person will also reject a win $718,190 lose $1,000 bet.

Rejection of the win $110, lose $100 bet over that range does not seem as unrealistic, and I could imagine a person with that preference existing. If we empirically tested this, we would also need to examine liquid wealth and cash flow, but the example does provide a sense that we could find some people whose rejection of low value bets would generate absurd results under expected utility maximisation.

The log utility function

Let’s compare Rabin’s example utility function with a more commonly assumed utility function, that of log utility. Log utility has decreasing absolute risk aversion (and constant relative risk aversion), so is both more empirically defensible and does not generate utility that asymptotes to a maximum like the exponential utility function.

A person with log utility would reject the 50:50 bet to win $110, lose $100 up to a wealth of $1,100. Beyond that, they would accept the bet. So, for log utility we should see most people accept this bet.
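The $1,100 threshold can be derived directly: indifference requires the expected log utility of the bet to equal the log utility of current wealth.

\frac{1}{2}\ln(w+110)+\frac{1}{2}\ln(w-100)=\ln(w)\\

(w+110)(w-100)=w^2\\

10w-11000=0\\

w=1100

Below $1,100 the potential $100 loss takes too large a bite out of wealth for the bet to be worthwhile; above it, the bet is accepted.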

A person with log utility will reject some quite unbalanced bets, such as a 50:50 bet to win $1 million, lose $90,900, but only up to a wealth of around $100,000, beyond which they would accept. Rejection only occurs when a loss is near ruinous.

The result is that log utility does not generate the types of rejected bets that Rabin labels as ridiculous, but would also fail to provide much of an explanation for the rejection of low-value bets with positive expected value.

The empirical evidence

Do people actually turn down 50:50 bets of win $110, lose $100? Surprisingly, I couldn’t find an example of this bet (if someone knows a paper that directly tests this, let me know).

Most examinations of loss aversion examine symmetric 50:50 bets where the potential gain and the loss are the same. They compare a bet centred around 0 (e.g. gain $100 or lose $100) and a similar bet in a gain frame (e.g. gain $100 or gain $300, or take $200 for certain). If more people reject the first bet than the latter, then this is evidence of loss aversion.

It makes sense that this is the experimental approach. If the bet is not symmetric, it becomes hard to tease out loss aversion from risk aversion.

However, there is a pattern in the literature that people often reject risky bets with a positive expected value in the ranges explored by Rabin. We don’t know a lot about their wealth (or liquidity), but Rabin’s illustrative numbers for rejected bets don’t seem completely unrealistic. It’s the range of wealth over which the rejection occurs that is questionable.

Rather than me floundering around on this point, there are papers that explicitly ask whether we can observe a set of bets for a group of experimental subjects and map a curve to those choices that resembles expected utility.

For instance, Holt and Laury’s 2002 AER paper (pdf) examined a set of hypothetical and incentivised bets over a range of stakes (finding, among other things, that hypothetical choices were poor predictors of responses to incentivised high-stakes bets). They found that if you are flexible about the form of the expected utility function that is used, rejection of small gambles does not result in absurd conclusions on large gambles. The pattern of bets could be made consistent with expected utility, assuming you correctly parameterise the equation. Over subsequent years there was some back and forth on whether this finding was robust [see here (pdf) and here (pdf)], but the basic result seemed to hold.

The utility curve that best matched Holt and Laury’s experimental findings had increasing relative risk aversion, and decreasing absolute risk aversion. By having decreasing absolute risk aversion, the absurd implications of Rabin’s paper are avoided.

Papers such as this suggest that while Rabin’s paper makes an important point, its underlying assumptions are not consistent with empirical evidence. It is possible to have an expected utility maximiser reject low value bets without generating ridiculous outcomes.

So what can you infer about our bettor who has rejected the win $110, lose $100 bet?

From the argument above, I would say not much. We could craft a utility function to accommodate this bet without leading to ridiculous consequences. I personally feel this defence is laboured (that’s a subject for another day), but the bet is not in itself fatal to the argument that they are an expected utility maximiser.

My other posts on loss aversion can be found here:

  1. Kahneman and Tversky’s debatable loss aversion assumption
  2. What can we infer about someone who rejects a 50:50 bet to win $110 or lose $100? The Rabin paradox explored (this post)
  3. The case against loss aversion
  4. Ergodicity economics – a primer

Appendix

The utility of a gain

Let’s suppose someone will reject a 50:50 bet with gain g and loss l for any level of wealth. What utility will they get from a gain of x? Rabin defines an upper bound of the utility of gaining x to be:

U(w+x)-U(w)\leq\sum_{i=0}^{k^{**}(x)}\left(\frac{l}{g}\right)^ir(w)\\

k^{**}(x)=int\left(\frac{x}{g}\right)\\

r(w)=U(w)-U(w-l)

This formula effectively breaks down x into g size components, successively discounting each additional g at \frac{l}{g} of the previous g .

You need k^{**}(x)+1 lots of g to cover x . For instance, if x was 32 and we had a 50:50 bet for win $11, lose $10, then k^{**}(32)=int\left(\frac{32}{11}\right)=2 . You need 2+1 lots of 11 to fully cover 32. It actually covers a touch more than 32, hence the calculation being an upper bound.

In the paper, Rabin defines k^{**}(x)=int\left(\left(\frac{x}{g}\right)+1\right) . This seems to better capture the required number of g to fully cover x , but the iterations in the above formula start at i=0 . The calculations I run with my version of the formula replicate Rabin’s, supporting the suggestion that the addition of 1 in the paper is an error.

r(w) is shorthand for the amount of utility sacrificed from losing the gamble (i.e. losing l  ). We know that the utility of the gain g is less than this, as the bet is rejected. If we let r(w)=1 , the equation can be thought of as giving you the maximum utility you could get from the gain of x relative to the utility of the loss of l .

Putting this together, the upper bound of the utility of the possible gain x is the sum of the upper bound of the relative utility from the first $11, \left(\frac{10}{11}\right)^0r(w)=r(w) , the upper bound of the utility from the next $11, \left(\frac{10}{11}\right)^1r(w) , and the upper bound of the utility from the remaining $10, which – taking a conservative approach – is calculated as though it were a full $11: \left(\frac{10}{11}\right)^2r(w) .

The utility of a loss

Rabin also gives us a lower bound of the utility of a loss of x for this person who will reject a 50:50 bet with gain g and loss l for any level of wealth:

U(w)-U(w-x)\geq{2}\sum_{i=1}^{k^{*}(x)}\left(\frac{g}{l}\right)^{i-1}{r(w)}

k^{*}(x)=int\left(\frac{x}{2l}\right)

The intuition behind k^{*}(x) comes from Rabin’s desire to provide a relatively uncomplicated proof for the proposition. Effectively, the disutility of losses grows by a factor of at least \frac{g}{l} with each further step of size g below the reference point. Since Rabin wants to express this in terms of losses, he assumes 2l\geq{g}\geq{l} , and can thereby say that the disutility grows by at least \frac{g}{l} with every 2 lots of l .

Otherwise, the intuition for this loss formula is the same as that for the gain. The summation starts at i=1 as this formula provides a lower bound, so it does not require the final iteration to fully cover x . The formula is also multiplied by 2 as each iteration covers two lots of l , whereas r(w) is for a single span of l .

Running some numbers

The below R code implements the above two formulas as a function, calculating the potential utility gain for a win of G or a loss of L for a person who rejects a 50:50 bet win g , lose l at all levels of wealth. It then states whether we know the person will reject a win G , lose L bet – we can’t state that they will accept, as we only have an upper bound on the utility of the gain and a lower bound on the disutility of the loss.

Rabin_bet <- function(g, l, G, L){

    k_2star <- as.integer(G/g)       # number of g-sized steps needed to span the gain G
    k_star <- as.integer(L/(2*l))    # number of 2l-sized steps needed to span the loss L

    U_gain <- 0
    for (i in 0:k_2star) {
        U_step <- (l/g)^i
        U_gain <- U_gain + U_step
        }

    U_loss <- 0
    for (i in 1:k_star) {
        U_step <- 2*(g/l)^(i-1)
        U_loss <- U_loss + U_step
        }

    ifelse(U_gain < U_loss,
        print("REJECT"),
        NA
        )
    print(paste0("Max U from gain =", U_gain))
    print(paste0("Min U from loss =", U_loss))
}

Take a person who will reject a 50:50 bet to win $110, lose $100. Taking the table from the paper, they would reject a win $1,000,000,000, lose $1,000 bet.

Rabin_bet(110, 100, 1000000000, 1000)
[1] "REJECT"
[1] "Max U from gain =11"
[1] "Min U from loss =12.2102"

Relaxing the wealth assumption

In the Appendix of his paper, Rabin provides a proof for the case where the bet is rejected over a range of wealth w\in(\underline{w}, \bar w) . In that case, relative utility for each additional gain of size g is \frac{l}{g} of the previous g until \bar w . Beyond that point, each additional gain of g gives constant utility until x is reached. The formula for the upper bound on the utility gain is:

U(w+x)-U(w)\leq \begin{cases} \sum_{i=0}^{k^{**}(x)}\left(\frac{l}{g}\right)^ir(w) & if\quad x\leq{\bar w}-w\\ \\ \sum_{i=0}^{k^{**}(\bar w)}\left(\frac{l}{g}\right)^{i}r(w)+\left[\frac{x-(\bar w-w)}{g}\right]\left(\frac{l}{g}\right)^{k^{**}(\bar w)}r(w) & if\quad x\geq{\bar w}-w \end{cases}

The first term of the equation where x\geq\bar w-w involves iterated discounting as per the situation where the bet is rejected for all levels of wealth, but here the iteration is only up to wealth \bar w . The second term of that equation captures the gain beyond \bar w discounted at a constant rate.

There is an error in Rabin’s formula in the paper. Rather than the term \left[\frac{x-(\bar w-w)}{g}\right] in the second equation, Rabin has it as [x-\bar w] . As for the previous equations, we need to know the number of iterations of the gain, not total dollars, and we need this between \bar w and w+x .

When Rabin provides the examples in Table II of the paper, from the numbers he provides I believe he actually uses a formula of the type int\left[\frac{x-(w-\underline w)}{g}+1\right] , which reflects a desire to calculate the upper-bound utility across the stretch above \bar w in a similar manner to below, although this is not strictly necessary given the discount is constant across this range. I have implemented as per my formula, which means that a bet for gain G is rejected g higher than for Rabin (which given their scale is not material).

Similarly, for the loss:

U(w)-U(w-x)\geq \begin{cases} {2}\sum_{i=1}^{k^{*}(x)}\left(\frac{g}{l}\right)^{i-1}{r(w)} & if\quad {w-\underline w+2l}\geq{x}\geq{2l}\\ \\ {2}\sum_{i=1}^{k^{*}(w-\underline w+2l)}\left(\frac{g}{l}\right)^{i-1}{r(w)}+\left[\frac{x-(w-\underline w+l)}{2l}\right]\left(\frac{g}{l}\right)^{k^{*}(w-\underline w+2l)}{r(w)} & if\quad x\geq{w-\underline w+2l} \end{cases}

There is a similar error here, with Rabin using the term \left[x-(w-\underline w+l)\right] rather than \left[\frac{x-(w-\underline w+l)}{2l}\right] . We can’t determine how this was implemented by Rabin as his examples do not examine behaviour below a lower bound \underline w .

Running some more numbers

The below code implements the above two formulas as a function, calculating the potential utility gain for a win of G or a loss of L for a person who rejects a 50:50 bet win g , lose l at wealth w\in(\underline{w}, \bar w) . It then states whether we know the person will reject a win G , lose L bet – as before, we can’t state that they will accept, as we only have bounds on the utility changes.

Rabin_bet_general <- function(g, l, G, L, w, w_max, w_min){

    ifelse(
        G <= (w_max-w),
        k_2star <- as.integer(G/g),
        k_2star <- as.integer((w_max-w)/g))

    ifelse(w-w_min+2*l >= L,
        k_star <- as.integer(L/(2*l)),
        k_star <- as.integer((w-w_min+2*l)/(2*l))
    )

    U_gain <- 0
    for (i in 0:k_2star){
        U_step <- (l/g)^i
        U_gain <- U_gain + U_step
    }

    ifelse(
        G <= (w_max-w),
        U_gain <- U_gain,
        U_gain <- U_gain + ((G-(w_max-w))/g)*(l/g)^k_2star
    )

    U_loss <- 0
    for (i in 1:k_star) {
        U_step <- 2*(g/l)^(i-1)
        U_loss <- U_loss + U_step
        }

    ifelse(w-w_min+2*l >= L,
        U_loss <- U_loss,
        U_loss <- U_loss + ((L-(w-w_min+l))/(2*l))*(g/l)^k_star
    )

    ifelse(U_gain < U_loss,
        print("REJECT"),
        print("CANNOT CONFIRM REJECT")
    )

    print(paste0("Max U from gain =", U_gain))
    print(paste0("Min U from loss =", U_loss))
}

Imagine someone who turns down the win $110, lose $100 bet with a wealth of $290,000, but who would only reject this bet up to $300,000. They will reject a win $718,190, lose $1000 bet.

Rabin_bet_general(110, 100, 718190, 1000, 290000, 300000, 0)
[1] "REJECT"
[1] "Max U from gain =12.2098745626936"
[1] "Min U from loss =12.2102"

The nature of Rabin’s calculation means that we can scale this calculation to anywhere on the wealth curve. We need only say that someone who rejects this bet over (roughly) a range of $10,000 plus the size of the potential loss will exhibit the same decisions. For example a person with $10,000 wealth who would reject the bet up to $20,000 wealth would also reject the win $718,190, lose $1000 bet.

Rabin_bet_general(110, 100, 718190, 1000, 10000, 20000, 0)
[1] "REJECT"
[1] "Max U from gain =12.2098745626936"
[1] "Min U from loss =12.2102"

Comparison with log utility

The below is an example with log utility, which is U(W)=ln(W) . This function determines whether someone of wealth w will reject or accept a 50:50 bet for gain g and loss l .

log_utility <- function(g, l, w){

    log_gain <- log(w+g)
    log_loss <- log(w-l)

    EU_bet <- 0.5*log_gain + 0.5*log_loss
    EU_certain <- log(w)

    ifelse(EU_certain == EU_bet,
        print("INDIFFERENT"),
        ifelse(EU_certain > EU_bet,
            print("REJECT"),
            print("ACCEPT")
        )
    )

    print(paste0("Expected utility of bet = ", EU_bet))
    print(paste0("Utility of current wealth = ", EU_certain))
}

Testing a few numbers, someone with log utility is indifferent about a 50:50 win $110, lose $100 bet at wealth $1100. They would accept for any level of wealth above that level.

log_utility(110, 100, 1100)
[1] "INDIFFERENT"
[1] "Expected utility of bet = 7.00306545878646"
[1] "Utility of current wealth = 7.00306545878646"

That same person will always accept a 50:50 win $1100, lose $1000 bet above $11,000 in wealth.

log_utility(1100, 1000, 11000)
[1] "ACCEPT"
[1] "Expected utility of bet = 9.30565055178051"
[1] "Utility of current wealth = 9.30565055178051"

Can we generate any bets that don’t seem quite right? It’s quite hard unless you have a bet that will bring the person to ruin or near ruin. For instance, for a 50:50 bet with a chance to win $1 million, a person with log utility and $100,000 wealth would still accept the bet with a potential loss of $90,900, which brings them to less than 10% of their wealth.

log_utility(1000000, 90900, 100000)
[1] "ACCEPT"
[1] "Expected utility of bet = 11.5134252151368"
[1] "Utility of current wealth = 11.5129254649702"

The problem with log utility is not the ability to generate ridiculous bets that would be rejected. Rather, it’s that someone with log utility would tend to accept most positive value bets (in fact, they would always take a non-zero share if they could). Only if the bet brings them near ruin (either through size or their lack of wealth) would they turn down the bet.

The isoelastic utility function – of which log utility is a special case – is a broader class of function that exhibits constant relative risk aversion:

U(w)=\frac{w^{1-\rho}-1}{1-\rho}

If \rho=1 , this simplifies to log utility (you need to use L’Hopital’s rule to get this as the fraction is undefined when \rho=1 .) The higher \rho , the higher the level of risk aversion. We implement this function as follows:

CRRA_utility <- function(g, l, w, rho=2){

    ifelse(
        rho==1,
        print("function undefined"),
        NA
    )

    U_gain <- ((w+g)^(1-rho)-1)/(1-rho)
    U_loss <- ((w-l)^(1-rho)-1)/(1-rho)

    EU_bet <- 0.5*U_gain + 0.5*U_loss
    EU_certain <- (w^(1-rho)-1)/(1-rho)

    ifelse(EU_certain == EU_bet,
        print("INDIFFERENT"),
        ifelse(EU_certain > EU_bet,
            print("REJECT"),
            print("ACCEPT")
        )
    )

    print(paste0("Expected utility of bet = ", EU_bet))
    print(paste0("Utility of current wealth = ", EU_certain))
}

If we increase \rho , we can increase the proportion of low value bets that are rejected.

For example, a person with \rho=2 will reject the 50:50 win $110, lose $100 bet up to a wealth of $2200. The rejection point scales with \rho .
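A quick derivation of that threshold: with \rho=2 the isoelastic function simplifies to U(w)=1-\frac{1}{w} , so indifference to the bet requires:

\frac{1}{2}\left(1-\frac{1}{w+110}\right)+\frac{1}{2}\left(1-\frac{1}{w-100}\right)=1-\frac{1}{w}\\

\frac{1}{w+110}+\frac{1}{w-100}=\frac{2}{w}\\

w(w-100)+w(w+110)=2(w+110)(w-100)\\

10w=20w-22000\\

w=2200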

CRRA_utility(110, 100, 2200, 2)
[1] "INDIFFERENT"
[1] "Expected utility of bet = 0.999545454545455"
[1] "Utility of current wealth = 0.999545454545455"

For a 50:50 chance to win $1 million at wealth $100,000, the person with \rho=2 is willing to risk a far smaller loss, and rejects even when the loss is only $48,000, or less than half their wealth (which admittedly is still a fair chunk).

CRRA_utility(1000000, 48000, 100000, 2)
[1] "REJECT"
[1] "Expected utility of bet = 0.99998993006993"
[1] "Utility of current wealth = 0.99999"

Higher values of \rho start to become completely unrealistic as utility is almost flat beyond an initial level of wealth.

It is also possible to have values of \rho between 0 (risk neutrality) and 1. These would result in even fewer rejected low value bets than log utility, and fewer rejected bets with highly unbalanced potential gains and losses.

My latest article at Behavioral Scientist: Principles for the Application of Human Intelligence

I am somewhat slow in posting this – the article has been up more than a week – but my latest article is up at Behavioral Scientist.

The article is basically an argument that the scrutiny we are applying to algorithmic decision making should also be applied to human decision making systems. Our objective should be good decisions, whatever the source of the decision.

The introduction to the article is below.


Principles for the Application of Human Intelligence

Recognition of the powerful pattern matching ability of humans is growing. As a result, humans are increasingly being deployed to make decisions that affect the well-being of other humans. We are starting to see the use of human decision makers in courts, in university admissions offices, in loan application departments, and in recruitment. Soon humans will be the primary gateway to many core services.

The use of humans undoubtedly comes with benefits relative to the data-derived algorithms that we have used in the past. The human ability to spot anomalies that are missed by our rigid algorithms is unparalleled. A human decision maker also allows us to hold someone directly accountable for the decisions.

However, the replacement of algorithms with a powerful technology in the form of the human brain is not without risks. Before humans become the standard way in which we make decisions, we need to consider the risks and ensure implementation of human decision-making systems does not cause widespread harm. To this end, we need to develop principles for the application of human intelligence to decision making.

Read the rest of the article here.

Kahneman and Tversky’s “debatable” loss aversion assumption

Loss aversion is the idea that losses loom larger than gains. It is one of the foundational concepts in the judgment and decision making literature. In Thinking, Fast and Slow, Daniel Kahneman wrote “The concept of loss aversion is certainly the most significant contribution of psychology to behavioral economics.”

Yet, over the last couple of years several critiques have emerged that question the foundations of loss aversion and whether loss aversion is a phenomenon at all.

One is an article by Eldad Yechiam, titled Acceptable losses: the debatable origins of loss aversion (pdf). Framed in one case as a spread of the replication crisis to loss aversion, the abstract reads as follows:

It is often claimed that negative events carry a larger weight than positive events. Loss aversion is the manifestation of this argument in monetary outcomes. In this review, we examine early studies of the utility function of gains and losses, and in particular the original evidence for loss aversion reported by Kahneman and Tversky (Econometrica  47:263–291, 1979). We suggest that loss aversion proponents have over-interpreted these findings. Specifically, the early studies of utility functions have shown that while very large losses are overweighted, smaller losses are often not. In addition, the findings of some of these studies have been systematically misrepresented to reflect loss aversion, though they did not find it. These findings shed light both on the inability of modern studies to reproduce loss aversion as well as a second literature arguing strongly for it.

A second, The Loss of Loss Aversion: Will It Loom Larger Than Its Gain (pdf), by David Gal and Derek Rucker, attacks the concept of loss aversion more generally (supposedly the “death knell“):

Loss aversion, the principle that losses loom larger than gains, is among the most widely accepted ideas in the social sciences. The first part of this article introduces and discusses the construct of loss aversion. The second part of this article reviews evidence in support of loss aversion. The upshot of this review is that current evidence does not support that losses, on balance, tend to be any more impactful than gains. The third part of this article aims to address the question of why acceptance of loss aversion as a general principle remains pervasive and persistent among social scientists, including consumer psychologists, despite evidence to the contrary. This analysis aims to connect the persistence of a belief in loss aversion to more general ideas about belief acceptance and persistence in science. The final part of the article discusses how a more contextualized perspective of the relative impact of losses versus gains can open new areas of inquiry that are squarely in the domain of consumer psychology.

A third strain of criticism relates to the concept of ergodicity. Put forward by Ole Peters, the basic claim is that people are not maximising the expected value of a series of gambles, but rather the time average. If people maximise the latter, not the former as many approaches assume, you don’t need risk or loss aversion to explain the decisions. (I’ll leave explaining what exactly this means to a later post.)

I’m as sceptical and cynical about some of the findings in the behavioural sciences as most (here’s my critical behavioural economics and behavioural science reading list), but I’m not sure I am fully on board with these arguments, particularly the stronger statements of Gal and Rucker. This post is the first of a few rummaging through these critiques to make sense of the debate, starting with Yechiam’s paper on the foundations of loss aversion in prospect theory.

Acceptable losses: the debatable origins of loss aversion

One of the most cited papers in the social sciences is Daniel Kahneman and Amos Tversky’s 1979 paper Prospect Theory: An Analysis of Decision under Risk (pdf). Prospect theory is intended to be a descriptive model of how people make decisions under risk, and an alternative to expected utility theory.

Under expected utility theory, people assign a utility value to each possible outcome of a lottery or gamble, with that outcome typically relating to a final level of wealth. The expected utility for a decision under risk is simply the probability weighted sum of these utilities. The utility of a 50% chance of $0 and a 50% chance of $200 is simply the sum of 50% of the utility of each of $0 and $200.

When utility is assumed to increase at a decreasing rate with each additional dollar of wealth – as is typically the case – it leads to risk averse behaviour, with a certain sum preferred to a gamble with an equivalent expected value. For example, a risk averse person would prefer $100 for certain to the 50-50 gamble for $0 or $200.
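As a concrete illustration (my own, using square-root utility purely as an example of a concave function):

U <- function(w) sqrt(w)     # a concave utility function

U(100)                       # utility of $100 for certain: 10
0.5*U(0) + 0.5*U(200)        # expected utility of the 50-50 gamble: roughly 7.07

The certain $100 delivers more utility than the gamble with the same expected value, so this decision maker is risk averse.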

In their 1979 paper, Kahneman and Tversky described a number of departures from expected utility theory. These included:

  • The certainty effect: People overweight outcomes that are considered certain, relative to outcomes which are merely probable.
  • The reflection effect: Relative to a reference point, people are risk averse when considering gains, but risk seeking when facing losses.
  • The isolation effect: People focus on the elements that differ between options rather than those components that are shared.
  • Loss aversion: Losses loom larger than gains – relative to a reference point, a loss is more painful than a gain of the same magnitude.

Loss aversion and the reflection effect result in the following famous diagram of how people weight losses and gains under prospect theory. Loss aversion leads to a kink in the utility curve at the reference point. The curve is steeper below the reference point than above. The reflection effect results in the curve being concave above the reference point, and convex below.
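For readers who want to draw the curve themselves, the sketch below uses the parameter estimates from Tversky and Kahneman (1992) ( \alpha=0.88 , \lambda=2.25 ), chosen purely for illustration:

value_fn <- function(x, alpha = 0.88, lambda = 2.25){
    # Concave over gains, convex and steeper over losses
    ifelse(x >= 0, x^alpha, -lambda*(-x)^alpha)
}

curve(value_fn(x), from = -100, to = 100,
    xlab = "Loss / gain relative to reference point", ylab = "Value")
abline(h = 0, v = 0, lty = 2)    # the kink sits at the reference point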

Throughout the paper, Kahneman and Tversky describe experiments on each of the certainty effect, reflection effect, and isolation effect. However, as pointed out by Eldad Yechiam in his paper Acceptable losses: the debatable origins of loss aversion, loss aversion is taken as a stylised fact. Yechiam writes:

[I]n their 1979 paper, Kahneman and Tversky (1979) strongly argued for loss aversion, even though, at the time, they had not reported any experiments to support it. By indicating that this was a robust finding in earlier research, Kahneman and Tversky (1979) were able to rely upon it as a stylized fact. They begin their discussion on losses by stating that “a salient characteristic of attitudes to changes in welfare is that losses loom larger than gains” (p. 279), which suggests that this stylized fact is based on earlier findings. They then follow with the (much cited) sentence that “the aggravation that one experiences in losing a sum of money appears to be greater than the pleasure associated with gaining the same amount [17]” (p. 279). Most people who cite this sentence do so without the end quote of Galenter and Pliner (1974). Galenter and Pliner (1974) are, therefore, the first empirical study used to support the notion of loss aversion.

So what did Galenter and Pliner find? Yechiam writes:

Summing up their findings, Galenter and Pliner (1974) reported as follows: “We now turn to the question of the possible asymmetry of the positive and negative limbs of the utility function. On the basis of intuition and anecdote, one would expect the negative limb of the utility function to decrease more sharply than the positive limb increases… what we have observed if anything is an asymmetry of much less magnitude than would have been expected … the curvature of the function does not change in going from positive to negative” (p. 75).

Thus, our search for the historical foundations of loss aversion turns into a dead end on this particular branch: Galenter and Pliner (1974) did not observe such an asymmetry; and their study was quoted erroneously.

Effectively, the primary reference for the claim that we are loss averse does not support it.

So what other sources did Kahneman and Tversky rely on? Yechiam continues:

They argue that “the main properties ascribed to the value function have been observed in a detailed analysis of von Neumann–Morgenstern utility functions for changes of wealth [14].” (p. 281). The citation refers to Fishburn and Kochenberger’s forthcoming paper (at the time; published 1979). Fishburn and Kochenberger’s (1979) study reviews data of five other papers (Grayson, 1960; Green, 1963; Swalm, 1966; Halter & Dean, 1971; Barnes & Reinmuth, 1976) also cited by Kahneman and Tversky (1979). Summing up all of these findings, Kahneman and Tversky (1979) argue that “with a single exception, utility functions were considerably steeper for losses than for gains.” (p. 281). The “single exception” refers to a single participant who was reported not to show loss aversion, while the remaining one apparently did.

These five studies all had very small samples, involving a total of 30 subjects.

Yechiam walks through three of the studies. On Swalm (1966):

The results of the 13 individuals examined by Swalm … appear at the first glance to be consistent with an asymmetric utility function implying overweighting of losses compared to gains (i.e., loss aversion). Notice, however, that amounts are in the thousands, such that the smallest amount used was set above $1000 and typically above $5000, because it was derived from the participant’s “planning horizon”. Moreover, for more than half of the participants, the utility curve near the origin …, which spans the two smallest gains and two smallest losses for each person, was linear. This deviates from the notion of loss aversion which implies that asymmetries should also be observed for small amounts as well.

This point reflects an argument that Yechiam and others have made in several papers (including here and here) that loss aversion is only apparent in high-stakes gambles. When the stakes are low, loss aversion does not appear.

On Grayson (1960):

A similar pattern is observed in Grayson’s utility functions … The amounts used were also extreme high, with only one or two points below the $50,000 range. For the points above $100,000, the pattern seems to show a clear asymmetry between gains and losses consistent with loss aversion. However, for 2/9 participants …, the utility curve for the points below 100,000 does not indicate loss aversion, and for 2/9 additional participants no loss aversion is observed for the few points below $50,000. Thus, it appears that in Grayson (1960) and Swalm (1966), almost all participants behaved as if they gave extreme losses more weight than corresponding gains, yet about half of them did not exhibit a similar asymmetry for the lower losses (e.g., below $50,000 in Grayson, 1960).

Again, loss aversion is stronger for extreme losses.

On Green (1963):

… Green (1963) did not examine any losses, making any interpretation concerning loss aversion in this study speculative as it rests on the authors’ subjective impression.

The results from Swalm (1966), Grayson (1960) and Green (1963) covers 26 of the 30 participants aggregated by Fishburn and Kochenberger. Halter and Dean (1971) and Barnes and Reinmuth (1976) only involved two participants each.

So what of other studies that were available to Kahneman and Tversky at the time?

In 1955, Davidson, Siegel, and Suppes conducted an experiment in which participants were presented with heads or tails bets which they could accept or refuse. …

… Outcomes were in cents and ran up to a gain or loss of 50 cents. The results of 15 participants showed that utility curves for gains and losses were symmetric …, with a loss/ gain utility ratio of 1.1 (far below than the 2.25 estimated by Tversky and Kahneman, 1992). The authors also re-analyzed an earlier data set by Mosteller and Nogee (1951) involving bets for amounts ranging from − 30 to 30 cents, and it too showed utility curves that were symmetric for gains and losses.

Lichtenstein (1965) similarly used incentivized bets and small amounts. … Lichtenstein (1965) argued that “The preference for low V [variance] bets indicates that the utility curve for money is not symmetric in its extreme ranges; that is, that large losses appear larger than large wins.” (p. 168). Thus, Lichtenstein (1965) interpreted her findings not as a general aversion to losses (which would include small losses and gains), but only as a tendency to overweight large losses relative to large gains.

… Slovic and Lichtenstein (1968) developed a regression-based approach to examine whether the participants’ willingness to pay (WTP) for a certain lottery is predicted more strongly by the size of its gains or the size of its losses. Their results showed that size of losses predicted WTP more than sizes of gains. … Moreover, in a follow-up study, Slovic (1969) found a reverse effect in hypothetical lotteries: Choices were better predicted by the gain amount than the loss amount. In the same study, he found no difference for incentivized lotteries in this respect.

Similar findings of no apparent loss aversion were observed in studies that used probabilities that are learned from experience (Katz, 1963; Katz, 1964; Myers & Suydam, 1964).

In sum, the evidence for loss aversion at the time of the publication of prospect theory was relatively weak and limited to high-stakes gambles.

As Yechiam notes, Kahneman and Tversky only turned their attention to specifically investigating loss aversion in 1992 – and even there it tended to involve large amounts.

Only in 1992 did Tversky and Kahneman (1992) and Redelmeier and Tversky (1992) start to empirically investigate loss aversion, and when they did, they used either very large amounts (Redelmeier & Tversky, 1992) or the so-called “list method” in which one chooses between lotteries with changing amounts up until choices switch from one alternative to the other (Tversky & Kahneman, 1992). This usage of high amounts would come to characterize most of the literature later arguing for loss aversion (e.g., Redelmeier & Tversky, 1992; Abdellaoui et al., 2007; Rabin & Weizsäcker, 2009) as would be the usage of decisions that are not incentivized (i.e., hypothetical; as discussed below).

I’ll examine the post-1979 evidence in more detail in a future post, but in the interim will note this observation from Yechiam on the more recent experiments.

In a review of the literature, Yechiam and Hochman (2013a) have shown that modern studies of loss aversion seem to be binomially distributed into those who used small or moderate amounts (up to $100) and large amounts (above $500). The former typically find no loss aversion, while the latter do. For example, Yechiam and Hochman (2013a) reviewed 11 studies using decisions from description (i.e., where participants are given exact information regarding the probability of gaining and losing money). From these studies, seven did not find loss aversion and all of them used loss/gain amounts of up to $100. Four did find loss aversion, and three of them used very high amounts (above $500 and typically higher). Thus, the usage of high amounts to produce loss aversion is maintained in modern studies.

The presence of loss aversion for only large stakes gambles raises some interesting questions. In particular, are we actually observing the effect of “minimal requirements”, whereby a loss would push the bettor below some minimum threshold for, say, survival or other basic necessities? (Or at least a heuristic that operates with that intent?) This is a distinct concept from loss aversion as presented in prospect theory.

Finally – and a minor point on the claim that Yechiam’s paper marked the spread of the replication crisis to loss aversion – there is of course no direct experiment on loss aversion in the initial prospect theory paper to be replicated. A recent replication of the experiments in the 1979 paper had positive results (excepting some mixed results concerning the reflection effect). Replication of the 1979 paper doesn’t, however, provide any evidence on the replicability of loss aversion itself, nor on the appropriate interpretation of the experiments.

On that point, in my next post on the topic I’ll turn to some of the alternative explanations for what appears to be loss aversion, particularly the claims of Gal and Rucker that losses do not loom larger than gains.

David Leiser and Yhonatan Shemesh’s How We Misunderstand Economics and Why it Matters: The Psychology of Bias, Distortion and Conspiracy

From a new(ish) book by David Leiser and Yhonatan Shemesh, How We Misunderstand Economics and Why it Matters: The Psychology of Bias, Distortion and Conspiracy:

Working memory is a cognitive buffer, responsible for the transient holding, processing, and manipulation of information. This buffer is a mental store distinct from that required to merely hold in mind a number of items and its capacity is severely limited. The complexity of reasoning that can be handled mentally by a person is bounded by the number of items that can be kept active in working memory and the number of interrelationships between elements that can be kept active in reasoning. Quantifying these matters is complicated, but the values involved are minuscule, and do not exceed four distinct elements …

LTM [long-term memory] suffers from a different failing.  … It seems there is ample room for our knowledge in the LTM. The real challenge relates to retrieval: people routinely fail to use knowledge that they possess – especially when there is no clear specification of what might be relevant, no helpful retrieval cue. …

The two flaws … interact with one another. Ideas and pieces of knowledge accumulate in LTM, but those bits often remain unrelated. Leiser (2001) argues that, since there is no process active in LTM to harmonize inconsistent parts, coordination between elements can only take place in working memory. And in view of its smallness, the scope of explanations is small too. …

Limited knowledge, unavailability of many of the relevant economic concepts
and variables, and restricted mental processing power mean that incoherencies are to be expected, and they are indeed found. One of the most egregious is the tendency, noted by Furnham and Lewis (1986) who examined findings from the US, the UK, France, Germany, and Denmark, to demand both reductions in taxation and increased public expenditure (especially on schools, the sick, and the old). You can of course see why people would rather pay less in taxes, and also that they prefer to benefit from more services, but it is still surprising how often the link between the two is ignored. This is only possible because, to most people, taxes and services are two unrelated mental concepts, sitting as it were in different parts of LTM, a case of narrow scoping, called by McCaffery and Baron (2006) in this context an “isolation effect.”

Bastounis, Leiser, and Roland-Levy (2004) ran an extensive survey on economic beliefs in several countries (Austria, France, Greece, Israel, New Zealand, Slovenia, Singapore, and Turkey) among nearly 2000 respondents, and studied the correlations between answers to the different questions. No such broad clustering of opinions as that predicted by Salter was in evidence. Instead, the data indicate that lay economic thinking is organized around circumscribed economic phenomena, such as inflation and unemployment, rather than by integrative theories. Simply put, knowing their answers about one question about inflation was a fair predictor of their answer to another, but was not predictive of their views regarding unemployment.

A refreshing element of the book is that it draws on a much broader swathe of psychology than just the heuristics and biases literature, which often becomes the focus of stories on why people err. However, I was surprised by the lack of mention of intelligence.

A couple of other interesting snippets, the first on the ‘halo effect’:

The tendency to oversimplify complex judgments also manifests in the “halo” effect. … [K]nowing a few positive traits of a person leads us to attribute additional positive traits to them. … The halo effect comes from the tendency to rely on global affect, instead of discriminating among conceptually distinct and potentially independent attributes.

This bias is unfortunate enough by itself, as it leads to the unwarranted attribution of traits to individuals. But it becomes even more pernicious when it blinds people to the possibility of tradeoffs, where two of the features are inversely correlated. To handle a tradeoff situation rationally, it is essential to disentangle the attributes, and to realize that if one increases the other decreases. When contemplating an investment, for instance, a person must decide whether to invest in stocks (riskier, but with a greater potential return) or in bonds (safer, but offering lower potential returns). Why not go for the best of both worlds – and buy a safe investment that also yields high returns? Because no such gems are on offer. A basic rule in investment pricing is that risk and expected return are positively related – higher returns come only with higher risk – and for a good reason. …

Strikingly, this relation is systematically violated when people are asked for an independent evaluation of their risk perception and return expectations. Shefrin (2002) asked portfolio managers, analysts, and MBA students for such assessments, and found, to his surprise, that expected return correlates inversely with perceived risk. Respondents appear to expect that riskier stocks will also produce lower returns than safer stocks. This was confirmed experimentally by Ganzach (2000). In the simplest of his several experiments, participants received a list of (unfamiliar) international stock markets. One group of participants was asked to judge the expected return of the market portfolio of these stock markets, and the other was asked to judge the level of risk associated with investing in these portfolios. … The relationship between judgments of risk and judgments of expected return, across the financial assets evaluated, was large and negative (Pearson r = −0.55). Ganzach interprets this finding as showing that both perceived risk and expected return are derived from a global preference. If an asset is perceived as good, it will be judged to have both high return and low risk, whereas if it is perceived as bad, it will be judged to have both low return and high risk.
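Ganzach’s interpretation lends itself to a toy simulation. The sketch below is my own illustration (made-up numbers, not anything from Ganzach’s paper): if judgments of risk and of return are both read off a single “how good is this asset?” signal, a strong negative correlation between the two judgments falls out automatically.

```python
# Toy simulation of the "global preference" account (illustrative only).
# Each asset gets one global goodness signal; judged return tracks it
# positively, judged risk tracks it negatively, plus independent noise.
import numpy as np

rng = np.random.default_rng(42)
n_assets = 40

goodness = rng.normal(size=n_assets)                                # global affect per asset
judged_return = goodness + rng.normal(scale=0.5, size=n_assets)     # "good" => high expected return
judged_risk = -goodness + rng.normal(scale=0.5, size=n_assets)      # "good" => low perceived risk

r = np.corrcoef(judged_risk, judged_return)[0, 1]
print(f"Pearson r between judged risk and judged return: {r:.2f}")
# Prints a strongly negative correlation, in the spirit of Ganzach's finding.
```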

And on whether some examinations of economic comprehension are actually personality tests:

Leiser and Benita (in preparation) asked 300 people in the US for their view concerning economic fragility or stability, by checking the extent to which they agreed with the following sentences:

1. The economy is fundamentally sound, and will restore itself after occasional crises.
2. The economy is capable of absorbing limited shocks, but if the shocks are excessive, a major crisis and even collapse will ensue.
3. Deterioration in the economy, when it occurs, is a very gradual process.
4. The economy’s functioning is delicate, and always at a risk of collapse.
5. The economy is an intricate system, and it is all but impossible to predict how it will evolve.
6. Economic experts can ensure that the economy will regain stability even after major crises.

These questions relate to the economy, and respondents answered them first. But we then asked corresponding questions, with minimal variations of wording, about three other widely disparate domains: personal relationships, climate change, and health. Participants rated to what extent they agree with each of the statements about each additional domain. The findings were clear: beliefs regarding economic stability are highly correlated with parallel beliefs in unrelated social and natural domains. People who believe that “The economy’s functioning is delicate, and always at a risk of collapse” tend to agree that “Close interpersonal relationships are delicate, and always at a risk of collapse” … And people who hold that “The economy is capable of absorbing limited shocks, but if the shocks are excessive, a major crisis will occur” also tend to judge that “The human body is capable of absorbing limited shocks, but beyond a certain intensity of illness, body collapse will follow.”

What we see in such cases is that people don’t assess the economy as an intelligible system. Instead, they express their general feelings towards dangers. … [T]hose who believe that the world is dangerous and who see an external locus of control see all four domains (economics, personal relations, health, and the environment) as unstable and unpredictable. Such judgments have little to do with an evaluation of the domain assessed, be it economic or something else. They attest personal traits, not comprehension.

Nick Chater’s The Mind is Flat: The Illusion of Mental Depth and the Improvised Mind

Nick Chater’s The Mind is Flat: The Illusion of Mental Depth and the Improvised Mind is a great book.

Chater’s basic argument is that there are no ‘hidden depths’ to our minds. The idea that we have an inner mental world with beliefs, motives and fears is just a work of imagination. As Chater puts it:

no one, at any point in human history, has ever been guided by inner beliefs or desires, any more than any human being has been possessed by evil spirits or watched over by a guardian angel.

The book represents Chater’s reluctant acceptance that much experimental psychological data can no longer be accommodated by simply extending and modifying existing theories of reasoning and decision making. These theories are built on an intuitive conception of the mind, in which our thoughts and behaviour are rooted in reasoning and built on our deeply held beliefs and desires. As Chater argues, this intuitive conception is simply an illusion. This leads him to his somewhat radical departure from many theories of perception, reasoning and decision making.

I have one major disagreement with the book, which turns out to be a fundamental disagreement with Chater’s central claim, but I’ll come to that later.

The visual illusion

Chater starts by examining visual perception. This is in part because visual perception is a (relatively) well understood area of psychology and neuroscience, and in part because Chater sees the whole of thought as being an extension of perception.

Consider our sense of colour vision. The sensitivity of colour vision falls rapidly outside of the fovea, the area of the retina responsible for our sharp central vision. The rod cells that capture most of our visual field are only able to register light and dark. This means that outside of a few degrees of where you are looking, you are effectively colour blind. Despite this, we feel that our entire visual world is coloured. That is an illusion.

Similarly, our visual periphery is fuzzy. Our visual acuity plunges as cone density falls with increasing angle from the point of fixation. Yet, again, we have a sense that we can capture the entire scene before us.

That limited vision is highlighted in experiments using gaze-contingent eye-tracking. In one experiment, participants are asked to read lines of text. Rather than showing the full text, the computer displays only a window of text around where the participant is looking, with all letters outside of that window replaced by blocks of ‘x’s.

When someone is reading this text, they feel they are looking at a page or screen full of text. How small can the window of text be before this illusion is shattered? It turns out the window can be shrunk to around 10 to 15 characters (centred slightly right of the fixation point) without the reader sensing anything is amiss. This is despite the page being almost completely covered in ‘x’s. The sense that they are looking at a full page of text is an illusion, as most of the text isn’t there.
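To make the paradigm concrete, here is a minimal sketch of the moving-window display logic (my own illustration, with a hypothetical function name and window size, not code from the original studies): only letters within a small window around the fixation point are shown, and everything else is replaced with ‘x’.

```python
# Sketch of a gaze-contingent "moving window": letters outside the window
# around the current fixation are masked with 'x'; spaces are preserved so
# the masked text still has word-like structure.
def moving_window(text: str, fixation: int, window: int = 15) -> str:
    half = window // 2
    return "".join(
        ch if (abs(i - fixation) <= half or ch == " ") else "x"
        for i, ch in enumerate(text)
    )

line = "The sense that we are looking at a full page of text is an illusion."
print(moving_window(line, fixation=25))
```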

Chater walks through a range of other interesting experiments making similar points. For instance, we can only encode one colour or shape or object at a time. The idea that we are looking at a rich coloured world, taking in all of the colours and shapes at once, is also an illusion.

Our brain is not simultaneously grasping a whole, but is rather piecing together a stream of information. Yet we are fooled into believing we are having a rich sensory experience. We don’t actually see a broad, rich multi-coloured world. The sense that we do is a hoax.

So how can the mind execute this hoax? Chater suggests the answer is that as soon as we wonder about any aspect of the world, we can flick our eyes over and instantly provide an answer. The fluency of this process suggests to us that we already had the answers stored, but the experimental and physiological evidence suggests this cannot be the case.

Put another way, the sense of a rich sensory world is actually just the potential to explore a rich sensory world. This potential is misinterpreted as actually experiencing that world.

An interesting question posed by Chater later in the book is why we have no awareness of the brain’s mode of thought. Why don’t we sense the continually flickering snapshots generated by our visual system? His answer is that the brain’s goal is to inform us of the world around us, not to inform us about the workings of its own mechanisms for understanding it.

The inner world

So does the story change when we move from visual perception to our inner thoughts?

Chater asks us to think of a tiger as clearly and distinctly as we can. Consider the pattern of stripes on the tiger. Count them. Which way do they flow over the body? Along the length or vertically? What about on the legs?

Visually, we can only grasp fragments at a time, but each visual feature is available on demand, giving the impression that our vision encompasses the whole scene. A similar dynamic is at work for the imaginary tiger. Here the mind improvises the answer as soon as you ask for it. Until you ask the question, those details are entirely absent.

What happens when you compare your answer about the tiger’s stripes with a real tiger? On a real tiger, the front legs don’t have stripes. On the back legs the stripes rotate from horizontal around the leg to vertical around the body. The belly and inner legs are white. Were they part of the image in your mind?

As we considered the tiger, we invented the answers to the questions we asked. What appeared to be a coherent image was constructed on the fly in the same way our system of visual perception gives us answers as we need them.

In one chapter, Chater also argues that we invent our feelings. He describes experimental participants dosed with either adrenaline or a placebo and then placed in a waiting room with a stooge. The stooge was either manic (flying paper aeroplanes) or angry (reacting to a questionnaire they had to fill in while waiting). Those who had been adrenalised had stronger reactions to both stooges, but in opposite directions: euphoric with the manic stooge and irritated in the presence of the angry stooge. Chater argues that we interpret our emotions in the moment based on both the situation we are in and our own physiological state. Because having an emotion is an act of interpretation, it is also an act of reasoning.

Improvising our preferences and beliefs

The core of Chater’s argument comes when he turns to our preferences and beliefs.  And here he argues that we are still relentless improvisers.

The famous split brain research of Michael Gazzaniga provides evidence for the improvisation. A treatment for severe epilepsy is surgical severance of the corpus callosum that links the two hemispheres of the brain. This procedure prevents seizures from spreading from one hemisphere to the other, but also results in the two halves of the cortex functioning independently.

What if you show different images to the right and left halves of the visual field, which are processed in the opposite hemispheres of the brain (the crossover wiring to the brain means that the right hemisphere processes information in the left visual field, and vice versa)? In one experiment Gazzaniga showed two images to a split brain patient, P.S. On the left hand side was a picture of a snowy scene. On the right was a picture of a chicken’s foot.  P.S., like most of us, had his language abilities focused in the left hemisphere of the brain, so P.S. could report seeing the chicken foot but was unable to say anything about the snowy scene.

P.S. was asked to pick one of four pictures associated with each of the images. The right hand, controlled by the left hemisphere, picked a chicken head to match the claw. The left hand picked out a shovel for the snow. And how did P.S. explain the choice of the shovel? ‘Oh that’s simple. The chicken claw goes with the chicken. And you need a shovel to clean out the chicken shed.’ An invented explanation. With no insight into the reason, the left hemisphere invents the explanation.

This fluent explanation by split-brain patients raises the possibility that after-the-fact explanation is also at work in people with intact brains. Rather than explanations expressing inner preferences and beliefs, we make up reasons in retrospect to interpret our actions.

Chater proceeds to build his case that we don’t have such inner beliefs and preferences with some of the less convincing research in the book, much of which resembles work that has been questioned during the replication crisis. It is interesting all the same.

In one experiment, voters in Sweden were asked whether they intended to vote for the left- or right-leaning coalition. They were then given a questionnaire on various campaign topics. When the responses were handed to the experimenter, the experimenter changed some of the responses by sleight of hand. When they were handed back for checking, just under a quarter of voters spotted and corrected the changes. But the majority were happy to explain political opinions that moments ago they did not hold.

Chater also reports an experiment where the experimenters got a similar effect when asking people which of two faces they prefer. When the face was switched before asking for the explanation, the fluent explanation still emerged.

An interesting twist to this experiment comes when people who have justified a choice of face they didn’t actually make are asked to choose again. These people tend to choose the face that they didn’t choose previously but were asked to justify. The explanation helped shape future decisions.

A similar effect occurred in another experiment in which participants took a web-based survey on political attitudes, with half the participants presented with an American flag in the corner of the screen. The flag caused a shift in political attitudes. But more interestingly, this effect persisted eight months later.

Chater’s interpretation of this experiment is not that Republicans should cover everything with flags. Rather, if people are exposed to a flag at a moment when they are contemplating their political views, this will have a long-lasting effect from the ‘memory traces’ that are laid down at the time.

When I read Chater’s summary of the experiment, my immediate reaction was that this was unlikely to replicate – and my reading of the original paper (PDF) firmed up my view. And it turns out there was a replication of the first flag priming experiment in the Many Labs project – no effect. (My reaction to the paper might have been shaped by previously reading the Many Labs paper but not immediately recalling that this particular experiment was included.) So let’s scrub this experiment from the list of supporting evidence. If there’s no immediate effect, it’s hard to make a case for an effect eight months later. (Chater should have noted this, given the replication was published in 2014.)

This isn’t the only experiment reported by Chater with a failed replication in this section, although the other dates from after publication of the book. An experiment by Eldar Shafir that makes an appearance failed to replicate in Many Labs 2.

One other piece of evidence called on by Chater is the broad (and strong) evidence of the inconsistency of our risk preferences and how susceptible they are to the framing of the risk and the domain in which they are realised. Present the same gamble in a loss rather than a gain frame, and risk-seeking choices spike.

But putting these pieces together, I am not convinced Chater has made his case. The split-brain experiments demonstrate our willingness to improvise explanations in the absence of any evidence. But this does not amount to an unequivocal case that we don’t call on any “hidden depths” that are there. Our preferences are variable, but are they so variable that they have no deeper basis at all? Chater thinks so.

[N]o amount of measuring and re-measuring is going to help. The problem with measuring risk preferences is not that measurement is difficult and inaccurate; it is that there are no risk preferences to measure – there is simply no answer to how, ‘deep down’, we wish to balance risk and reward. And, while we’re at it, the same goes for the way people trade off the present against the future; how altruistic we are and to whom; how far we display prejudice on gender or race, and so on.

But this brings me to my major disagreement with Chater. For all Chater’s sweeping statements about our lack of hidden depths, he didn’t spend much effort trying to find them. Rather, he took a lot of evidence on how manipulable we can be (which we certainly are to a degree) and our willingness to improvise explanations when we have no idea (more robust), and then turned this into a finding that there is no hidden depth.

One place Chater could have looked is behavioural genetics. The first law of behavioural genetics is that all behavioural traits are heritable. That is, a proportion of the variation in these characteristics between people is due to genetic variation. These traits include risk preferences, the way we trade off the present and the future, and political preferences. These are among the characteristics that Chater suggests have no hidden depth. If there is no hidden depth, why are identical twins (even those raised apart) so similar on these traits? Chater is likely right that when asked to explain a risky choice we are likely to improvise an explanation with little connection to reality. We rarely point to our genes. But that does not mean the hidden depth is not there.

We can only have one thought at a time

Once Chater has completed his argument about our lack of hidden depths, he turns to describing his version of how the mind actually works. And part of that answer is that the brain can only tackle one problem at a time.

This inability to take on multiple tasks comes from the way our brain computes when facing a difficult problem. Computation occurs through cooperation, with coordinated neural activity across whole networks or entire regions of the brain. This large-scale cooperative activity between slow neurons means that a network can only work on one problem at a time. And the brain is close to one large network.

Chater turns this idea into an attack on the “myth of the unconscious”. This myth is the idea that our brain is working away in the background. If we step away from a problem, we might suddenly have the answer pop into our head as our unconscious has kept working at the problem while we tend to other things.

Chater argues that for all the stories about scientists suddenly having major breakthroughs in the shower, neuroscience has found no evidence of these hidden processes. Chater summarises the studies in this area as concluding, first, that the effects of breaks are either negligible or non-existent, and second, that the explanations for the minor effects of a break involve no unconscious thought at all.

As one example of the lack of effect, Chater describes an experiment in which subjects are asked to name as many food items and as many countries as possible. Someone doing this task might switch back and forth between the two topics, changing to foods when they run out of countries and vice versa. How would the performance of a person able to switch back and forth compare to someone who has to first deal with one category, and only when finished move to the other? Would the former outperform because they could think about the second category in the background before coming back to it? The results suggest that when thinking about countries, there is no evidence that we are also thinking about food. When we switch from one category to the other, the search ceases abruptly.

So how did this myth of unconscious thought arise? Chater’s argument is that when we set a problem aside and return to it later, we are unencumbered by the past failures and patterns of thought in which we were trapped before. The new perspective may not be better than the old, but occasionally it will hit upon the angle that we need to solve the problem. So yes, the insight may emerge in a flash, but not because the unconscious had been grinding away at the problem.

This lack of unconscious thought is also demonstrated in the literature concerning inattentional blindness. If people are busy attending to a task, they can miss information that they are not attending to. The classic example of this (at least, before the gorilla experiment) is an experiment by Ulric Neisser, in which participants are asked to watch three people throwing a ball to each other and press a button each time there is a throw. When an unexpected event occurs – in this case a woman with an umbrella walking through the players – less than one quarter of the participants notice it.

Chater takes the inattentional blindness studies as again showing that we can only lock onto and impose meaning on one fragment of sensory information at a time. If our brains are busy on one task, they can be utterly oblivious to other events.

One distinction Chater makes that I found useful concerns how to think about our unconscious thought processes. Chater’s argument is not that there is no processing in the brain outside our conscious knowledge. Rather, we have one type of thought, with unconscious processing producing a conscious result. Chater writes:

The division between the conscious and the unconscious does not distinguish between different types of thought. Instead, it is a division within individual thoughts themselves: between the conscious result of our thinking and the unconscious processes that create it.

There are no conscious thoughts and unconscious thoughts; and there are certainly no thoughts slipping in and out of consciousness. There is just one type of thought, and each such thought has two aspects: a conscious read-out, and unconscious processes generating the read-out.

So where do our actions come from?

So if there are no hidden depths, what drives us? Chater’s argument is that our thoughts come from memory traces created by previous thoughts and experiences. Each person is shaped by, and made unique by, their past thoughts and experiences. Thought follows channels carved by previous thoughts.

This argument does in some ways suggest that we have an inner world. But that inner world is a record of the effect of past cycles of thought. It is not an inner world of beliefs, hopes and fears. As Chater states, the brain operates based on precedents, not principles.

Chater’s first piece of evidence in support of this point comes from chess. What makes grandmasters special? It is not because humans are lightning calculating machines. Rather it is because of their long experience and their ability to find meaning in chess positions with great fluency. They can link the current position with memory traces of past board positions. They do not succeed by looking further ahead, but rather by drawing on a deeper memory bank and then focusing on only the best moves.

Chater argues that this is how perception works more generally. We do not interpret sensory information afresh, but interpret it based on memory traces from past experience. He gives the example of “found faces”, where people see faces in inanimate objects. Our interpretation of the inputs finds resonance with memory traces of past inputs. Similarly, recognising a friend, a word or a tune depends on a link with your memories. Successful perception requires us to deploy the right memory traces when we need them.

Chater’s argument about the role of memory in perception seems sound. But absent a clear case that there are no other sources of beliefs or motivations, I am not convinced these memory traces are all that there is.

What this means for intelligence and AI

The final chapter of the book is Chater’s attempt to put a positive gloss on his argument. It feels like the sort of chapter that the publisher might ask for to help with the promotion of the book.

That positive gloss is human creativity. Chater writes:

But the secret of human intelligence is the ability to find patterns in the least structured, most unexpected, hugely variable of streams of information – to lock onto a handbag and see a snarling face; to lock onto a set of black-and-white patches and discern a distinctive, emotion-laden, human being; to find mappings and metaphors through the complexity and chaos of the physical and psychological worlds. All this is far beyond the reach of modern artificial intelligence.

I am not sure I agree. Vision recognition systems regularly make errors by seeing patterns that aren’t there. Are these just the machine version of seeing a face in a handbag? Both are mismatches, but one is labelled an imaginative leap, the other an error. Should we endow this overactive human pattern matching with the title of intelligence, yet call similar matching errors by a computer a mistake? Chess is also instructive here, where great creativity is now often a sign of a machine move.

This final chapter is somewhat shallow relative to the rest of the book. Chater provides little in the way of evidence to support his case, although you can piece together some supporting threads yourself from examples discussed earlier. It ends with a nice hook, but for me it was a flat ending to an otherwise great book.


Debating the conjunction fallacy

From Eliezer Yudkowsky on Less Wrong (a few years old, but worth revisiting in the light of my recent Gigerenzer v Kahneman and Tversky post):

When a single experiment seems to show that subjects are guilty of some horrifying sinful bias – such as thinking that the proposition “Bill is an accountant who plays jazz” has a higher probability than “Bill is an accountant” – people may try to dismiss (not defy) the experimental data. Most commonly, by questioning whether the subjects interpreted the experimental instructions in some unexpected fashion – perhaps they misunderstood what you meant by “more probable”.

Experiments are not beyond questioning; on the other hand, there should always exist some mountain of evidence which suffices to convince you.

Here is (probably) the single most questioned experiment in the literature of heuristics and biases, which I reproduce here exactly as it appears in Tversky and Kahneman (1982):

Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.

Please rank the following statements by their probability, using 1 for the most probable and 8 for the least probable:

(5.2)  Linda is a teacher in elementary school.
(3.3)  Linda works in a bookstore and takes Yoga classes.
(2.1)  Linda is active in the feminist movement. (F)
(3.1)  Linda is a psychiatric social worker.
(5.4)  Linda is a member of the League of Women Voters.
(6.2)  Linda is a bank teller. (T)
(6.4)  Linda is an insurance salesperson.
(4.1)  Linda is a bank teller and is active in the feminist movement. (T & F)

(The numbers at the start of each line are the mean ranks of each proposition, lower being more probable.)

How do you know that subjects did not interpret “Linda is a bank teller” to mean “Linda is a bank teller and is not active in the feminist movement”? For one thing, dear readers, I offer the observation that most bank tellers, even the ones who participated in anti-nuclear demonstrations in college, are probably not active in the feminist movement. So, even so, Teller should rank above Teller & Feminist.  …  But the researchers did not stop with this observation; instead, in Tversky and Kahneman (1983), they created a between-subjects experiment in which either the conjunction or the two conjuncts were deleted. Thus, in the between-subjects version of the experiment, each subject saw either (T&F), or (T), but not both. With a total of five propositions ranked, the mean rank of (T&F) was 3.3 and the mean rank of (T) was 4.4, N=86. Thus, the fallacy is not due solely to interpreting “Linda is a bank teller” to mean “Linda is a bank teller and not active in the feminist movement.”

Another way of knowing whether subjects have misinterpreted an experiment is to ask the subjects directly. Also in Tversky and Kahneman (1983), a total of 103 medical internists … were given problems like the following:

A 55-year-old woman had pulmonary embolism documented angiographically 10 days after a cholecystectomy. Please rank order the following in terms of the probability that they will be among the conditions experienced by the patient (use 1 for the most likely and 6 for the least likely). Naturally, the patient could experience more than one of these conditions.

  • Dyspnea and hemiparesis
  • Calf pain
  • Pleuritic chest pain
  • Syncope and tachycardia
  • Hemiparesis
  • Hemoptysis

As Tversky and Kahneman note, “The symptoms listed for each problem included one, denoted B, that was judged by our consulting physicians to be nonrepresentative of the patient’s condition, and the conjunction of B with another highly representative symptom denoted A. In the above example of pulmonary embolism (blood clots in the lung), dyspnea (shortness of breath) is a typical symptom, whereas hemiparesis (partial paralysis) is very atypical.”

In indirect tests, the mean ranks of A&B and B respectively were 2.8 and 4.3; in direct tests, they were 2.7 and 4.6. In direct tests, subjects ranked A&B above B between 73% to 100% of the time, with an average of 91%.

The experiment was designed to eliminate, in four ways, the possibility that subjects were interpreting B to mean “only B (and not A)”. First, by carefully wording the instructions:  “…the probability that they will be among the conditions experienced by the patient”, plus an explicit reminder, “the patient could experience more than one of these conditions”. Second, by including indirect tests as a comparison. Third, the researchers afterward administered a questionnaire:

In assessing the probability that the patient described has a particular symptom X, did you assume that (check one):
X is the only symptom experienced by the patient?
X is among the symptoms experienced by the patient?

60 of 62 physicians, asked this question, checked the second answer.

Fourth and finally, as Tversky and Kahneman write, “An additional group of 24 physicians, mostly residents at Stanford Hospital, participated in a group discussion in which they were confronted with their conjunction fallacies in the same questionnaire. The respondents did not defend their answers, although some references were made to ‘the nature of clinical experience.’  Most participants appeared surprised and dismayed to have made an elementary error of reasoning.”

Does the conjunction fallacy arise because subjects misinterpret what is meant by “probability”? This can be excluded by offering students bets with payoffs. In addition to the colored dice discussed yesterday, subjects have been asked which possibility they would prefer to bet $10 on in the classic Linda experiment. This did reduce the incidence of the conjunction fallacy, but only to 56% (N=60), which is still more than half the students.

But the ultimate proof of the conjunction fallacy is also the most elegant. In the conventional interpretation of the Linda experiment, subjects substitute judgment of representativeness for judgment of probability: Their feelings of similarity between each of the propositions and Linda’s description, determines how plausible it feels that each of the propositions is true of Linda. …

You just take another group of experimental subjects, and ask them how much each of the propositions “resembles” Linda. This was done – see Kahneman and Frederick (2002) – and the correlation between representativeness and probability was nearly perfect.  0.99, in fact.

The conjunction fallacy is probably the single most questioned bias ever introduced, which means that it now ranks among the best replicated. The conventional interpretation has been nearly absolutely nailed down.

There are a few additional experiments in Yudkowsky’s post that I have not replicated here.
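One closing note of my own: the rule that the Linda rankings violate is simply the conjunction inequality, which holds for any two events, however they are described:

```latex
P(T \wedge F) = P(T)\,P(F \mid T) \le P(T), \qquad \text{since } 0 \le P(F \mid T) \le 1.
```

However representative “feminist bank teller” feels, “bank teller and active in the feminist movement” can never be more probable than “bank teller” alone.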