# Aren’t we smart, fellow behavioural scientists

Below is the text of my presentation at Nudgestock on 12 June 2020. You can watch a replay here.

## Intro

Over the past decade or two, behavioural scientists have had a great ride. There have been bestselling books and Nobel Memorial Prizes. Every second government department and corporate has set up a team.

But recently, the wind seems to have changed. We’re told that behavioural economics is itself biased.

“Don’t trust the psychologists on coronavirus – Many of the responses to Covid-19 come from a deeply-flawed discipline”.

“Nudgeboy” has become a pejorative.

I believe this challenge is deserved.

For too long we have been opining about people’s irrationality – that is, the irrationality of others – and that if only we designed the world more intelligently, people would make better decisions.

We often make these judgements based on narrow lab experiments that we generalise to the outside world. But as we well know, sometimes those experiments don’t even replicate in that narrow lab environment. And even among those that do replicate in the lab, many become ineffective, or even dangerous, tools when we try to apply them in the complex outside world.

Let me tell you a story to illustrate.

## The hot hand fallacy

This story comes from great work by Joshua Miller and Adam Sanjurjo. It stands as one of the starkest examples of where I have been forced to change my beliefs.

There is strong evidence from the lab that people have misperceptions about what randomness looks like. When a person is asked to generate a series that approximates the flipping of a coin, they will alternate between heads and tails too often, and balance the frequencies of heads and tails over too short a sequence. When people are asked to judge which of two different sequences of coin flips are more likely, they tend to pick sequences with more alternation, despite their probability being the same.

What happens when we look for a failure to perceive randomness in the outside world, out of the lab?

When people watch basketball, they often see a hot hand. They will describe players as “hot” and “in form”. Their belief is that the person who has just hit a shot or a series of shots is more likely to hit their next one.

But is this belief in the “hot hand” a rational belief? Or is the hot hand an illusion, whereby, just like they do with coins, they are seeing streaks in what is actually randomness?

In a famous examination of this question, Thomas Gilovich, Robert Vallone and Amos Tversky took shot data from a variety of sources, including the Philadelphia 76ers and Boston Celtics, and examined it for evidence of a hot hand.

What did they find? The hot hand was an illusion. As Daniel Kahneman wrote in Thinking, Fast and Slow when describing this research:

The hot hand is entirely in the eye of the beholders, who are consistently too quick to perceive order and causality in randomness. The hot hand is a massive and widespread cognitive illusion.

Possibly even more interesting was the reaction to the findings from those in the sporting world. Despite the analysis, many sports figures denied that it could be true. Red Auerbach, who coached the Boston Celtics to nine NBA championships, said “Who is this guy? So he makes a study. I couldn’t care less.”

This provides another insight, about which Gilovich wrote:

[T]he story of our research on the hot hand is only partly about the misperception of random events. It is also about how tenaciously people cling to their beliefs even in the face of hostile evidence.

So, this isn’t just about the misperception of the hot hand, but also about the failure of people to see their error when presented with evidence about it.

Let’s delve into how Gilovich, Vallone and Tversky showed the absence of a hot hand.

Imagine a person who took ten shots in a basketball game. On the slide, a ball marks a hit and an X a miss.

What would count as evidence of a hot hand? One thing we can do is look at shots following a previous hit. For instance, in this sequence of shots there are six occasions where a shot follows a previous hit. On five of those occasions, such as the seventh shot, the shot is another hit.

We can then compare their normal shooting percentage with the proportion of shots they hit if the shot immediately before was a hit. If their hit rate after a hit is higher than their normal shot probability, then we might say they get a hot hand.

This is effectively how Gilovich, Vallone and Tversky examined the hot hand in coming to their conclusion that it doesn’t exist. They also looked at whether there was a hit or miss after longer streaks of hits or misses, but this captures the basic methodology. It seems sensible.
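The basic calculation is simple enough to sketch in code. This is my own illustration, not the researchers’ code, and the ten-shot sequence is hypothetical:

```python
# Compare a shooter's overall hit rate with their hit rate on shots
# taken immediately after a hit -- the core of the Gilovich, Vallone
# and Tversky-style test (my sketch, not their actual analysis).

def hit_rate_after_hit(shots):
    """shots: sequence of 1 (hit) and 0 (miss), in the order taken."""
    after_hit = [shots[i + 1] for i in range(len(shots) - 1) if shots[i] == 1]
    if not after_hit:
        return None  # no shot ever followed a hit
    return sum(after_hit) / len(after_hit)

shots = [1, 0, 1, 1, 1, 0, 1, 1, 0, 1]  # hypothetical ten-shot game
overall = sum(shots) / len(shots)
print(overall)                    # 0.7
print(hit_rate_after_hit(shots))  # 0.5
```

If the after-hit rate sits above the overall rate, that looks like a hot hand; this hypothetical shooter’s after-hit rate is actually below their overall rate.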

But let me take a detour that involves flipping a coin.

Suppose you flip a coin three times. Here are the eight possible sequences of heads and tails. Each sequence has an equal probability of occurring. What if I asked you: if you were to flip a coin three times, and there is a heads followed by another flip in that sequence, what is the expected probability that another heads will follow that heads?

Here is the proportion of heads following a previous flip of heads for each sequence. In the first row of the table, the first flip is a head, and it is followed by another head. The second flip, also a head, is again followed by a head. There is no flip after the third head. So 100% of the heads in that sequence that are followed by another flip are followed by a head.

In the second row of the table, 50% of the heads are followed by a head. In the last two rows, there are no heads followed by another flip.

Now, back to our question: if you were to flip a coin three times, and there is a heads followed by another flip in that sequence, what is the expected probability that another heads will follow that heads? It turns out it is 42%, which I can get by averaging those proportions.

8 possible combinations of heads and tails across three flips:

| Flips | p(Hₜ₊₁ \| Hₜ) |
|-------|---------------|
| HHH | 100% |
| HHT | 50% |
| HTH | 0% |
| HTT | 0% |
| THH | 100% |
| THT | 0% |
| TTH | – |
| TTT | – |
| Expected value | 42% |

That doesn’t seem right. If we count across all the sequences, we see that there are 8 flips of heads that are followed by another flip. Of the subsequent flips, 4 are heads and 4 are tails, spot on the 50% you expect.

What is going on in that second column? By looking at these short sequences, we are introducing a bias. The cases of heads following heads tend to cluster together, such as in the first sequence which has two cases of a heads following a heads. Yet the sequence THT, which has only one flip occurring after a heads, is equally likely to occur. The reason a tails appears more likely to follow a heads is because of this bias whereby the streaks tend to cluster together. The expected value I get when taking a series of three flips is 42%, when in fact the actual probability of a heads following a heads is 50%. As the sequence of flips gets longer, the size of the bias is reduced, although it is increased if we examine longer streaks, such as the probability of a heads after three previous heads.
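A quick enumeration confirms the 42% figure (a minimal sketch of the table above):

```python
from itertools import product

# For each of the 8 equally likely three-flip sequences, compute the
# proportion of heads-followed-by-another-flip that are followed by a
# head, then average those proportions across sequences.

proportions = []
for seq in product("HT", repeat=3):
    followers = [seq[i + 1] for i in range(2) if seq[i] == "H"]
    if followers:  # TTH and TTT have no heads followed by a flip
        proportions.append(followers.count("H") / len(followers))

print(len(proportions))                               # 6
print(round(sum(proportions) / len(proportions), 2))  # 0.42
```

Averaging the per-sequence proportions gives 42%, even though pooling all eight flips-after-heads across the sequences gives exactly 50% — which is precisely the bias.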

Why have I bothered with this counterintuitive story about coin flipping?

Because this bias is present in the methodology of the papers that purportedly demonstrated that there was no hot hand in basketball. Because of this bias, the proportion of hits following a hit or sequence of hits is biased downwards. Like our calculation using coins, the expected proportion of hits following a hit in a sequence is lower than the actual probability of hitting a shot.

Conversely the hot hand pushes the probability of hitting a shot after a previous hit up. Together, the downward bias and the hot hand roughly cancelled each other out, leading to the conclusion by researchers that each shot is independent of the last.

The result is that, when you correct for the bias, you can see that there actually is a hot hand in basketball.

When Miller and Sanjurjo crunched the numbers for one of the studies in the Gilovich and friends paper, they found that the probability of hitting a shot following a sequence of three previous hits is 13 percentage points higher than after a sequence of three misses. There truly is a hot hand. If Red Auerbach had coached as though there were no hot hand, what would his record have looked like?
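The same bias is easy to see by simulation (my own sketch, not Miller and Sanjurjo’s code): give a simulated shooter a true, streak-free 50% hit probability, then measure their per-game hit rate after three straight hits the way the original studies effectively did.

```python
import random

random.seed(1)

def after_streak_rate(shots, streak=3):
    """Proportion of hits on shots that follow `streak` consecutive hits."""
    follows = [shots[i] for i in range(streak, len(shots))
               if all(shots[i - k] for k in range(1, streak + 1))]
    return sum(follows) / len(follows) if follows else None

rates = []
for _ in range(10_000):  # 10,000 games of 100 shots by a true 50% shooter
    game = [random.random() < 0.5 for _ in range(100)]
    r = after_streak_rate(game)
    if r is not None:  # skip the rare game with no three-hit streak
        rates.append(r)

print(round(sum(rates) / len(rates), 3))
```

Despite the shooter having no hot hand at all, the average per-game rate comes out a few percentage points below 50%. A real shooter who merely measures at 50% after three hits is therefore actually running hot.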

I should say, this point does not debunk the earlier point about people misperceiving randomness. The lab evidence is strong: people tend to see a hot hand even in coin flips. It is possible that people overestimate the strength of the hot hand in the wild, although that is hard to show. But the hot hand exists.

Let’s turn back to one of the quotes I showed earlier.

[T]he story of our research on the hot hand is only partly about the misperception of random events. It is also about how tenaciously people cling to their beliefs even in the face of hostile evidence.

The researchers expanded the original hot hand research from a story about people misperceiving randomness, to one of them continuing to do so even when presented with evidence that they were making an error.

But, as we can now see, their belief in the hot hand was not an error. The punters in the stands were right. Their accumulated experience had given them the answer. The researchers were wrong. Rather than the researchers asking whether they themselves were making an error when people refused to believe their research, they doubled down and identified a second failure of human reasoning. The blunt dismissal of people’s beliefs led behavioural scientists to hold an untrue belief for over thirty years.

This is a persistent characteristic of much applied behavioural science. It was an error I made many times when I first came to the discipline. We spend too little time questioning our understanding of the decisions or observations other people make. If we believe they are in error, we should first question whether the error is ours.

## Priming

Here’s another example. There is a body of behavioural science research known as priming, which suggests that even slight cues in the environment can change our actions. A lot of priming research has bitten the dust through the replication crisis – ideas such as the claim that words associated with old people slow our walking pace (known as the Florida effect), that images of money make us selfish, or that priming us with the Ten Commandments can make us more honest. They simply haven’t stood the test of time.

Yet here’s a passage from Daniel Kahneman’s Thinking, Fast and Slow:

When I describe priming studies to audiences, the reaction is often disbelief. …

The idea you should focus on, however, is that disbelief is not an option. The results are not made up, nor are they statistical flukes. You have no choice but to accept that the major conclusions of these studies are true.

No. Again, it turns out the doubt of these audiences was justified.

There is an interesting intersection between this priming research and the hot hand. Much behavioural science research (including priming) is built on the concept that subtle, often ignored features of our environment can have marked effects on our decisions and performance. Yet why didn’t the hot hand researchers consider that a basketball player would be influenced by their earlier shots, surely a highly salient part of the environment and influence on their mental state? But, alas, the desire to show one bias allowed us to overlook another.

## Probability neglect

Now to a more recent story, which involves a concept called probability neglect.

The idea behind probability neglect is that when we consider a small risk, we tend to either ignore the risk or give it too much weight. We give disproportionate weight to the difference between zero and one percent relative to the difference between one and 99 percent probability.

There’s good evidence from the lab that we suffer from probability neglect – in the same way there is solid lab evidence about our misperceptions of randomness. But once again, the danger emerges when we take this finding and use it to assess the decisions of people in the outside world.

Here’s a recent example by Nudge author Cass Sunstein that hasn’t aged particularly well: The Cognitive Bias That Makes Us Panic About Coronavirus, with the subtitle: Feeling anxious? Blame “probability neglect.”

The opening paragraph of the article reads:

At this stage, no one can specify the magnitude of the threat from the coronavirus. But one thing is clear: A lot of people are more scared than they have any reason to be. They have an exaggerated sense of their own personal risk.

If you can’t specify a magnitude, it’s somewhat hard to claim that a response is exaggerated. But beyond that, here we see a set of findings from the lab – Sunstein later in the article describes one of the lab experiments – extrapolated to the real world, with little time spent asking whether an experiment in the lab can capture the more complex dynamics around people’s response to the coronavirus. In the lab, we know the probabilities. We have set them. Outside, in a case such as coronavirus, we don’t have any benchmark against which to assess people’s responses. As Sunstein notes, we also don’t know the magnitude.

Sunstein should have asked: Some people are reacting more than I think they should. Is there something about their response to risk that I should pay attention to? Why am I right and they wrong?

In fact, even when they are likely wrong – perhaps those panicking are like the broken clock that is right twice a day – we should ask whether there is wisdom in their actions. What if there is an asymmetry in the potential costs and benefits of overreacting versus under-reacting? Is it better to be typically wrong on probability – always assume there is a tiger in the grass – than to be largely right but occasionally experience ruin?

Sunstein, of course, was not exactly Robinson Crusoe in claiming that we were overreacting in late February. In fact, even now it’s not entirely clear what the appropriate response was for many people, regions or countries.

But by late March, without skipping a beat, he was noting that “This Time the Numbers Show We Can’t Be Too Careful”. No mention of the allegation of a misperception of risk less than four weeks earlier.

Of course, one of the weaknesses of applied behavioural science is that you can tell a story no matter what the observed behaviour. Six weeks later Sunstein was writing “How to Make Coronavirus Restrictions Easier to Swallow”, giving guidance on how to stop an under-reaction.

As Sunstein wrote:

To address the coronavirus pandemic, it’s essential to influence human behavior; to promote social distancing, to get people to wear masks, to encourage people to stay home. Many nations have imposed mandates as well. But to enforce the mandates and to promote safer choices as the mandates wind down, people have to be nudged.

So now it’s all about trying to get people to stay home, because they, err, are underestimating the risk? Maybe it’s better to be right twice a day than to be the clock that is always two hours too slow.

## Getting the right objective

These two stories – about the hot hand and coronavirus – illustrate the danger of taking lab experiments into a far more complex environment, the outside world. You can already see some of the reasons why this can cause problems. We may not have the full set of information held by the decision maker. We might simply stuff up our analysis of the problem. It’s complex.

In closing, I want to suggest another problem with judging other people’s decisions: we can mistake (or give insufficient consideration to) what a person’s objective actually is and how they can best achieve that objective.

Behavioural scientists have a better insight into this than many. We know that people aren’t just selfishly trying to maximise their income, wealth or consumption.

Yet, despite this, when we assess people’s behaviour in the wild, we often assess the rationality of their behaviours against a rather narrow set of outcomes, such as how their decisions benefit their finances or health in the long-term. We then try to nudge them in that direction.

Yet, that’s often not what people want.

My PhD combined evolutionary biology with economics, so I often think about our objectives with an evolutionary lens. Our mind was selected to have preferences that would tend to result in survival and reproduction in the environment in which it evolved.

Of course, most of us don’t specifically plot to maximise our reproductive output. Rather, evolution shapes our preferences so that we seek proximate objectives.

When we examine objectives from an evolutionary biology perspective, what appears irrational can simply be a misunderstanding on our part of what someone’s objectives are. The type of behaviour to, say, attract a partner, is going to look somewhat different to that of someone simply maximising financial resources. In fact, someone might effectively burn financial resources as part of their rational course of action.

One reason for this is that a core part of the evolved toolkit is our use of signals. We want to signal our traits or resources to others, including allies, enemies and potential reproductive partners.

Yet a problem with signals is that our interests are often not aligned with the recipient of our signal. We have an incentive to be dishonest, and the recipient knows this.

As a result, we need our signal to be reliable. One such way is that the signal imposes a cost on the signaller – and not just any cost – an actual handicap that someone without the trait or resources could not fake. The now almost cliched example of this is the peacock’s tail. It is a reliable signal of male health as only a male in good condition can maintain the unwieldy tail without falling prey to predators.

In the same way, one of the best ways to signal wealth is to burn money. Health can be signalled by unhealthy behaviours that would fry someone with a lesser constitution. An applied behavioural scientist assessing these behaviours from the perspective of the effect on long-term health or retirement savings is going to be somewhat confused. Yet when you see the objective, the behaviour has a purpose.

Of course, it does not immediately follow that understanding a person’s evolutionary objectives will rationalise their behaviour. As our taste for sweet and fatty foods implies, our preferences evolved in a world much different to ours. But it does suggest that we need to be wary in judging people’s actions as their objective may not be what we think it is.

## Reducing power use [My time was slightly truncated, so I skipped this section.]

This possible misunderstanding of people’s objectives can also arise simply as a practical issue. Let me give an example which, although not evolutionary, I think about a lot.

Power companies often want to limit their customers’ electricity demand for environmental reasons, or to reduce peak demand.

One of the favourite tricks to do this is to provide a comparison of that person or household’s power consumption with their neighbours. People have a desire to conform, and look to cues to inform their decisions. If shown that their power usage is above their neighbours, they tend to reduce their use.

What is this person or household’s objective? If it were purely financial, success! They have saved on their power bill. Their reduction in use also happens to align with the environmental or peak demand reduction objectives of the nudger.

Yet is their objective that simple? What if it is, say, happiness or satisfaction in life? What of factors such as their comfort?

And then what of their self-image? You’ve just compared them negatively with their neighbour. Is it likely to increase their happiness to see that they compare poorly? Does it increase mental stress? As applied behavioural scientists, we spend decidedly little time thinking about the breadth of the possible objectives someone may have and the effect of our nudge on them.

That is not to say that you cannot find examples where behavioural scientists have gone the next step to do the welfare calculus. The comparison I showed here comes from a paper by Hunt Allcott and Judd Kessler describing work with Opower. And what did they find? Although they argued that the net social welfare of the nudge was positive, a failure to consider these other objectives markedly overstates the benefits. Plus, about one third of the recipients would be willing to pay to not receive the nudge.

## Close

And now I will close, with a plea. As applied behavioural scientists, we need to inject some humility into our assessment of other people’s decisions. We need to stop underestimating the intelligence of other people. We need to tone down the glee we have in communicating sexy, counterintuitive experimental findings that demonstrate errors by others. We need to stop making glib assumptions about what other people want and how they can best achieve their objectives. And importantly, we need to stop being lazy storytellers who don’t subject ourselves to the same critique that we would apply to someone else.

# The limits of behavioural science: coronavirus edition

Most articles on how behavioural science (or “behavioural economics”) can explain “X” are rubbish. “How behavioural economics explains Donald Trump’s election” or the equivalent would have been “How behavioural economics doomed Donald Trump” if he had failed to be elected. It’s after-the-fact storytelling of no scientific substance.

Through the last six weeks I have been collecting examples in the media of behavioural science applied to the coronavirus pandemic. There’s plenty of the usual junk.

As it turns out, Stuart Ritchie has also been on the case and written an article at UnHerd, Don’t trust the psychologists on coronavirus, saving me the trouble of writing my own. I would have chosen a different title, but follow the link and read the whole article. Below are some highlights.

First, Ritchie on Cass Sunstein:

Further psychological insights were provided by Cass Sunstein, co-author of the best-selling book Nudge, which used lessons from behavioural economics (essentially psychology by another name) that could inform attempts to change people’s behaviour. In an article for Bloomberg Opinion on 28 February (by which point there were over 83,000 confirmed coronavirus cases worldwide), Sunstein wrote that anxiety regarding the coronavirus pandemic was mainly due to something called “probability neglect”.

Because the disease is both novel and potentially fatal, Sunstein reasoned, we suffer from “excessive fear” and neglect the fact that our probability of getting it is low. “Unless the disease is contained in the near future,” he continued, “it will induce much more fear, and much more in the way of economic and social dislocation, than is warranted by the actual risk”.

The opening paragraph of Sunstein’s article was somewhat bizarre:

At this stage, no one can specify the magnitude of the threat from the coronavirus. But one thing is clear: A lot of people are more scared than they have any reason to be. They have an exaggerated sense of their own personal risk.

I know this is shooting fish in a barrel, but how can you claim that people have an exaggerated sense of their own personal risk when no one can specify the magnitude of the threat?

(As an aside, it turns out I have joined the masses of people blocked by Cass Sunstein on Twitter. Given my tweets are almost solely broadcasts of my blog posts, and my criticism of Sunstein within those posts is rather mild – and unlikely to have been read by Sunstein – my working hypothesis is that he has blocked everyone Nassim Taleb follows. Recalling the first sentence of his book #Republic: “In a well-functioning democracy, people do not live in echo chambers, or information cocoons”. I’ll leave that for now…)

Ritchie also points out that Gerd Gigerenzer makes the same error as Sunstein:

On 12 March, the day after Italy had announced its 827th death from the virus, the eminent psychologist Gerd Gigerenzer published a piece in Project Syndicate entitled “Why What Does Not Kill Us Makes Us Panic”. It was, to say the least, confused: it opened with an acknowledgement that we don’t know how bad this epidemic could be, but immediately went on to make the case that we’d likely overreact, and failed to consider any opposing arguments.

Gigerenzer normally does an admirable job of defending human intuition against critiques from the outside, but he has always questioned our “risk literacy”.

Stories based on behavioural science aren’t just landing in the media. They are forming part of advice to government:

The Behavioural Insights Team, a consulting company nicknamed the “Nudge Unit”, has been brought in to assist the UK’s response. At first it seemed their focus was on how to encourage handwashing, but there appears to have been mission creep.

For example, the team’s head, David Halpern, was interviewed as a “government coronavirus science advisor” about broad policies of “cocooning” older people. He was also quoted in support of the idea — which might yet seem grievously misguided in hindsight — that social-distancing measures should only be brought in gradually, to avoid people becoming fatigued. After he was reportedly “bollocked” by No.10 a fortnight ago for introducing the unfortunate phrase “herd immunity” into the national conversation, Halpern hasn’t (to my knowledge) been heard from again in public.

In defence of Halpern, this story is light on details, and there is a lot of interesting research that could inform this argument – as Vaughan Bell points out. That said, backing herd immunity on an assumption of fatigue is quite a leap of faith.

Why are behavioural scientists struggling? One is generalisability of the scientific literature:

To back up his points about “probability neglect”, Sunstein had referred to a 2001 paper in the journal Psychological Science. It reported three experiments; Sunstein focused on the third one, which included 156 participants, all of whom were undergraduate students reasoning about how much they’d pay to avoid an imaginary electric shock. It’s not a criticism of the scientists to say that this experiment is only tenuously relevant to a global pandemic.

Indeed, alongside talk of the “replication crisis” there’s been discussion of a “generalisability crisis”, with renewed realisation that results from lab experiments don’t necessarily generalise to other contexts. A global pandemic of a completely novel virus is, by its very definition, a context never encountered before. So how can we be sure that the results of behavioural science experiments — even those that are based on bigger or more representative samples than 156 undergrads — are relevant to our current situation?

The answer is that we can’t. Exploring the human capacity for bias and irrationality can make for quirky, thought-provoking articles and books that make readers feel smarter (and can build towards a tentative scientific understanding of how the mind works). But when a truly dangerous disease comes along, relying on small-scale lab experiments and behavioural-economic studies results in dreadful misfires like the articles we encountered above.

Although there are many behavioural phenomena that certainly seem relevant to today’s news — bias, sunk costs, the tragedy of the commons — it’s not at all clear how these concepts would be practically applied to do what needs to be done right now: slowing the spread of the disease.

On Ritchie’s closing, I agree:

As intriguing as many psychological studies are, the vast majority of the insights we’ve gained from our research are simply not ready for primetime — especially in the case of a worldwide emergency where millions of lives are at stake. Much of the useful advice behavioural scientists can give isn’t really based on “science” to any important degree, and is intuitive and obvious.

Where they try to be counter-intuitive — for instance, arguing that people are wrong to find a global pandemic frightening — they simply end up embarrassing themselves, or worse, endangering people by having them make fewer pandemic preparations. This isn’t to say that psychology isn’t useful when it stays in its own lane: it’ll be important to ensure that as many people as possible have access to psychotherapy for the mental-health effects of the pandemic, for instance. But that’s a secondary effect of the virus: my argument here is that psychology can give little reliable counsel about our immediate reaction to the pandemic.

Psychologists should know their limits, and avoid over-stretching results from their small-scale studies to new, dissimilar situations. Decision-makers should, before using psychology research as the basis for policy, know just how weak and contentious so much of it is. And everyone else should stay at home, wash their hands — and beware psychologists bearing advice.

So could behavioural scientists be useful in this pandemic? They could help develop and test hypotheses to increase handwashing. They could help design communications about the need to stay at home. They have insights relevant to productivity while remote working. But they should be wary of telling stories about the accuracy of risk perception or how people will behave in the long term. Most of those claims are little more than storytelling.

# Risk and loss aversion in ergodicity economics

In a previous post I posed the following bet:

Suppose you have $100 and are offered a gamble involving a series of coin flips. For each flip, heads will increase your wealth by 50%. Tails will decrease it by 40%. Flip 100 times. The changes in wealth under a sequence of flips of this nature is “non-ergodic”, as the expected value of the bet does not converge with its time-average growth rate. The bet has a positive expected value, 5% of the bettor’s wealth per flip, and the ensemble average across a large enough population will approximate this expected value in growth in overall wealth. But, the time-average growth rate for an individual is approximately a loss of 5% of their wealth with each flip. Most individuals will experience a loss, and in the long-run everyone will. (To understand why this is so, see my primer post on ergodicity economics.) That many people decline bets of this nature suggests that there may be some wisdom in our decision making process. But what is that process? Are we risk averse? As I noted in that previous post, economists have a readily available explanation for the rejection of this bet. People are risk averse expected utility maximisers. As I wrote there: A risk averse person will value the expected outcome of a gamble lower than the same sum with certainty. Risk aversion can be represented through the concept of utility, where each level of wealth gives subjective value (utility) for the gambler. If people maximise utility instead of the value of a gamble, it is possible that a person would reject the bet. For example, one common utility function to represent a risk averse individual is the logarithm of their wealth. If we apply the log utility function to the gamble above, the gambler will reject the offer of the coin flip. [The maths here is simply that the expected utility of the gamble is 0.5ln(150) + 0.5ln(60)=4.55, which is less than the utility of the sure$100, ln(100)=4.61.]

The concept of a risk averse expected utility maximiser with a utility function such as the logarithm has been a staple explanation for many decisions. The St Petersburg Paradox is one such problem, with that series of bets rarely valued above $10 despite the infinite expected value of the bet. (It is another non-ergodic system.)

But do we need an expected utility function to provide us with such risk aversion? Would a more parsimonious explanation for the rejection of the bet simply be that the person is seeking to maximise the growth rate of their wealth? With that objective and a time-average growth rate of minus 5%, rejection is the obvious thing to do. There is no need for an expected utility function. Rather, the person simply needs a way of deciding whether accepting the bet will maximise the growth rate of their wealth.

An interesting alignment of economic history and ergodicity economics occurs here. One of the most commonly used expected utility functions is the logarithm (as noted above). People maximise utility by maximising the expected logarithm of their wealth. Yet the way to maximise the geometric growth rate of your wealth when facing a multiplicative bet is also to maximise the logarithm of your wealth. The calculations of the expected utility maximiser with a logarithmic utility function and of the time-average growth-rate maximiser are the same. As Ole Peters and Alexander Adamou write in their ergodicity economics lecture notes:

[E]xpected utility theory as we have presented it above is consistent with growth rate optimisation, provided a suitable pair of dynamic and utility function is used. For multiplicative dynamics, the necessary utility function is the logarithm. That this is the most widely used utility function in both theory and practice is a psychological fluke in the classic mindset; from our perspective it indicates that our brains have evolved to produce growth-optimal decisions in a world governed by multiplicative dynamics, i.e. where entities produce more of themselves.

This effectively means that many of the “puzzles” that expected utility maximisation has been used to solve can also be “solved” by growth-rate optimisation. For instance, insurance or the St Petersburg puzzle provide a challenge for expected wealth optimisation, but are equivalently solved by assuming an expected log utility maximiser or a growth-rate optimiser.

That these two concepts overlap raises a conundrum. An expected log utility maximiser looks much like a growth-rate maximiser in their behaviour (noting that log utility is only one of many functional forms an expected utility maximiser could theoretically have). If we would expect to see the same decision under both expected log utility and growth-rate maximisation in multiplicative dynamics, how can we differentiate the two?

## Additive dynamics

Before I answer that question, I am going to detour into the world of additive dynamics. What if I offered you the following bet?

Suppose you have $100 and are offered a gamble involving a series of coin flips. For each flip, heads will increase your wealth by $50. Tails will decrease it by $40. Flip 100 times.

You can see the tweak from the original bet: dollar sums rather than percentages. The first flip is effectively identical, but future flips are additive on that result and always involve the same shift of $50 up or $40 down. In contrast, the earlier bet was multiplicative, in that the bettor’s wealth was multiplied by a common factor. As a result, the multiplicative bet scales up and down with wealth.

An important feature of this second series of flips is that the system is ergodic. The expected value of each flip is $5 (0.5×$50 − 0.5×$40 = $5). The time-average growth rate is also $5 per flip. Let’s simulate as we did for the multiplicative bets in the ergodicity economics primer post, with 10,000 people starting with $100 and flipping the coin 100 times. The plot below shows the average wealth of the population, together with the paths of the first 20 of the 10,000 people (in red).

Average wealth of population and path of first 20 people

The individual growth paths cluster on either side of the population average. After 100 periods, the mean wealth is $593 and the median $600. 86% of the population has gained in wealth. The wealthiest person has $2,220, or 0.04% of the total wealth of the population. After 1,000 rounds (not plotted here), the mean wealth is $5,095 and the median $5,100. All but one person has gained. The wealthiest person has $11,130, or 0.02% of the total wealth of the population. This alignment between the mean and median wealth, and the relatively equal distribution of wealth, are characteristic of an ergodic system.
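These figures can be sanity-checked in a few lines (a fresh simulation with an arbitrary seed, separate from the full code at the end of the post, so the numbers will differ slightly from those quoted above):

```r
# 10,000 people, 100 flips of +$50/-$40 from a $100 start
set.seed(42) # arbitrary seed for reproducibility
pop <- 10000
flips <- 100
steps <- matrix(ifelse(rbinom(pop*flips, 1, 0.5) == 1, 50, -40), nrow = flips)

wealth <- 100 + colSums(steps) # final wealth of each person
mean(wealth)       # close to the theoretical $600
median(wealth)     # close to the mean: a signature of ergodicity
mean(wealth > 100) # share who gained, around 86%

# Share of paths that dip below zero at some point along the way
paths <- 100 + apply(steps, 2, cumsum)
ruin <- mean(apply(paths, 2, min) < 0)
ruin               # a bit over half, foreshadowing the wrinkle below
```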

Now for a wrinkle, which can be seen in the plotted figure. Of those first 20 people plotted on the chart, 12(!) had their wealth go into the negative over those 100 periods. An additional two of those first 20 go into the negative over the subsequent 900 periods. This is reflected across the broader population, with 5,439 dropping below zero in those first 100 periods. 5,684 drop below zero across the full 1000.

To the extent zero wealth is ruinous at the time it occurs (e.g. death, you cannot continue to play), that event is serious. If you only incur the consequences of your final position, the bet is somewhat less likely to result in ruin, but still presents a real threat of catastrophe.

So what would an expected utility maximiser do here? For a person with log utility, any probability of ruin over the course of the flips would lead them to reject the series of gambles. The log of zero is negative infinity, so that outcome outweighs all other possible outcomes, whatever their magnitude or probability.
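This drops straight out of the arithmetic of infinities: once any outcome with zero wealth has positive probability, the expected log utility is minus infinity, no matter how good the other outcomes are.

```r
log(0) # -Inf in R

# Even a 0.1% chance of ruin swamps a 99.9% chance of great wealth
0.999*log(1e6) + 0.001*log(0) # -Inf
```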

The growth-rate maximiser would, if they didn’t fear ruin, accept the bet. The time-average growth of $5 per flip would pull them in. If ruin was feared and consequential, then they might also reject.

## Risk and loss aversion in the two different worlds

To return to the title of my post, what light does this shed on risk or loss aversion? Let us suppose humans are growth-rate maximisers.

In a multiplicative world, people would exhibit what is by definition risk averse behaviour: they prefer a certain sum to a gamble with the same expected value. This is a consequence of maximising the growth rate by maximising the expected logarithm of their wealth. It has, however, a different underlying rationale to explanations of log utility based on either psychology or the diminishing utility of wealth.

What of loss aversion, the concept that losses loom larger than gains? Risk aversion results in a phenomenon that looks like loss aversion, in that losses are weighted more heavily due to the diminishing utility of additional wealth. However, loss aversion is a dislike of losses over and above that. It involves a “kink” in the utility curve, so it should be observed even for small amounts and should result in a greater aversion to bets than risk aversion alone would predict.

The growth-rate maximisation model would not lead us to predict loss aversion. Whatever their wealth, growth-rate maximisation does not produce a marked difference between gains and losses beyond that induced by risk aversion. There is no “kink” at a reference point at which losses hurt more than gains are enjoyed. Are there any phenomena described as loss aversion that this theory would suggest are actually growth-rate maximising behaviour? Not that I can think of.

In the additive world, things are more interesting. Growth-rate maximisation is equivalent to wealth maximisation. People aren’t risk averse.
(In fact, assuming only growth-rate maximisation in an additive environment leaves much about risk tolerance unspecified.) They simply take the positive-value bets.

Here the broader evidence across experimental economics and psychology places a question mark over the claim (the experiment described below excepted). People regularly reject positive-value additive bets. There are ways to attempt to reconcile growth-rate maximisation with these rejections. For instance, we could argue that these people are in a multiplicative world of which the bet is only a small part, so the bet described as additive is actually part of a multiplicative dynamic; we know little about their broader circumstances. But even then, the rejected additive bets are often so favourable that even a growth-rate maximiser in a multiplicative dynamic would generally accept them.

Loss aversion is also not a prediction of growth-rate maximising behaviour in the additive world. Not only is there no kink at the reference point; losses and gains have the same weight no matter their scale.

We could add something like loss aversion to the growth-rate maximiser in the additive environment by introducing an absorbing state at zero. The path to ruin can be quicker in an additive world than in a multiplicative one, as the bet sizes don’t scale down with diminished wealth, and there is the possibility of losing absolutely everything. But what is the agent’s response to this potential for ruin? We would need assumptions additional to those provided by a simple growth-rate maximisation approach.

## Ergodicity and behavioural economics

A short note here, because ergodicity economics has been touted in the twitter-sphere as the behavioural economics killer. I’ve already discussed loss aversion, but I will state here that many behavioural phenomena remain to be explained even if we accept the foundational ergodicity concepts.
A core group of these behavioural phenomena involve framing, whereby presentation of effectively the same choice can result in different decisions. Status quo bias, the reflection effect, default effects, and the like remain. So while ergodicity economics shines a new light on decision making under uncertainty, it hasn’t suddenly solved the raft of behavioural puzzles that have emerged over the last seventy years.

Part of that is unsurprising. Much of the behavioural critique of expected utility theory is that our decisions don’t look like expected log utility maximisation (or maximisation of other similar functions). If that’s the case, those puzzles remain for a growth-rate maximiser that maximises their expected log wealth in a multiplicative environment.

## Distinguishing expected utility from growth-rate maximisation: an experiment

Now to return to an earlier question. If we would expect to see the same decision under both expected utility and growth-rate maximisation in multiplicative dynamics, how can we differentiate the two?

A group led by Oliver Hulme ran an experiment that sheds some interesting light on this question (branded the Copenhagen experiment in the twitter-sphere). The pre-print reporting the experimental results is available on arXiv, with supporting materials and data on GitHub. Despite some of my questions below, this is an innovative and well thought-out experiment.

The concept behind the experiment was to differentiate between three possible models of human decision making:

1. Prospect theory, which includes features such as different risk aversion parameters in the gain and loss domains, and loss aversion.

2. Isoelastic utility, a classic model of expected utility, of which log utility is a special case.

3. Time optimal utility, where changes in utility are determined by linear utility under additive dynamics and by logarithmic utility under multiplicative dynamics.
The third could be differentiated from the other two if the utility function effectively changes when the environment switches between additive and multiplicative dynamics.

To test this, the experimental procedure ran as follows. Eighteen experimental subjects (actually 20, but two were excluded from analysis) participated in a series of gambles over two days. On one day they were exposed to a series of additive bets; the other day involved multiplicative bets. The order of the days was switched for some subjects. They were not directly informed of the nature of each day.

Each day consisted of a passive session, followed by an active session. At the beginning of the passive session each experimental subject was endowed with 1000 Danish kroner (approximately $150 USD). They then watched a series of fractal images, each of which had an effect on their wealth. Their task through the passive session was to learn the effect of each of these fractals on their wealth. Some examples of the images are below:

Example fractal images

There were 9 fractals on each day, each with a unique effect on wealth. For the multiplicative day, the fractals changed wealth by the following factors: 0.447, 0.546, 0.668, 0.818, 1, 1.223, 1.496, 1.830, 2.239. On the additive day, the changes were: -428kr, -321kr, -214kr, -107kr, 0, 107kr, 214kr, 321kr, 428kr. The subjects observed a sequence of 333 fractals in the passive session (each fractal appearing 37 times), giving them a good feel for the effect of each fractal on their wealth.

As the outcome from an equal distribution of these fractals is no growth, each participant ended the 333rd passive fractal with their original wealth of 1000kr. They were then given one additional random fractal to set the level of wealth that they would commence the active session with.
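A quick check that the fractal sets are (approximately) growth-neutral: the additive changes sum to exactly zero, and the multiplicative factors have a product of roughly one, so 37 exposures to each fractal roughly returns the subject to their 1000kr starting wealth.

```r
# Fractal effects used in the Copenhagen experiment
add  <- c(-428, -321, -214, -107, 0, 107, 214, 321, 428)
mult <- c(0.447, 0.546, 0.668, 0.818, 1, 1.223, 1.496, 1.830, 2.239)

sum(add)   # 0: the additive day is exactly growth-neutral
prod(mult) # ~0.9998: the multiplicative day is approximately growth-neutral

# Wealth after the 333 passive fractals (each appearing 37 times)
1000 + 37*sum(add)   # exactly 1000kr
1000 * prod(mult)^37 # within about 1% of 1000kr (rounding in the published factors)
```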

During the active session, they would be shown two sets of two fractal images, as in Figure 3 below. The left and the right side represent two possible bets, each of which would involve a 50:50 probability of getting one of the two fractals and the effect of that fractal on their wealth. They were asked to choose one of the bets. For example, choosing the left two images would give them a 50:50 chance for each of those two images on the left.

Example choice

Hulme and friends then analysed these experimental results to determine which model best described the observed decisions.

The headline finding is well illustrated in the following chart, which provides the posterior probability of the parameter of risk aversion for each experimental subject. The blue distribution is that for the additive day, the red for the multiplicative day.

Posterior probability of the parameter of risk aversion

A risk aversion parameter of 0 gives us linear utility. A parameter of 1 is logarithmic utility. On that basis, the time optimal utility of ergodicity economics comes out looking strong. There is a clear change in risk aversion across most participants as they changed between the ergodic and non-ergodic environments.
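For reference, the risk aversion parameter here is the η of the isoelastic (CRRA) utility family mentioned earlier, which nests both candidate functions: η = 0 gives linear utility, and the limit η → 1 gives logarithmic utility. A minimal sketch:

```r
# Isoelastic (CRRA) utility: u(x) = (x^(1-eta) - 1)/(1 - eta), with log(x) at eta = 1
isoelastic <- function(x, eta) {
  if (abs(eta - 1) < 1e-12) log(x) else (x^(1-eta) - 1)/(1 - eta)
}

isoelastic(150, 0)      # 149: linear in wealth (up to a constant)
isoelastic(150, 1)      # log(150), about 5.01
isoelastic(150, 0.9999) # approaches log(150) as eta approaches 1
```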

Hulme and friends also calculated the posterior probability of each model for each participant, with time optimal (the growth rate maximiser) generally having the stronger probability.

Posterior model probabilities

The authors write:

> [T]o approximate time optimal behavior, different dynamics require different ergodicity mappings. Thus, when an agent faces a different dynamic, this should evoke the observation of a different utility function. This was observed, in that all subjects showed substantial changes in their estimated utility functions … Second, in shifting from additive to multiplicative dynamics, agents should become more risk averse. This was also observed in all subjects. Third, the predicted increase in risk aversion should be, in the dimensionless units of relative risk aversion, a step change of +1. The mean step change observed across the group was +1.001 (BCI95%[0.829,1.172]). Third, to a first approximation, most (not all) participants modulated their utility functions from ~linear utility under additive dynamics, to ~logarithmic utility under multiplicative dynamics (Fig. 3d). Each of these utility functions are provably optimal for growing wealth under the dynamical setting they adapted to, and in this sense they are reflective of an approximation to time optimality. Finally, Bayesian model comparison revealed strong evidence for the time optimal model compared to both prospect theory and isoelastic utility models, respectively. The latter two models provide no explanation or prediction for how risk preferences should change when gamble dynamics change, and even formally preclude the possibility of maximising the time average growth rate when gamble dynamics do change. Congruent with this explanatory gap, both prospect theory and isoelastic utility models were relatively inadequate in predicting the choices of most participants.

My major question about the experiment concerns the localised nature of the growth-rate maximisation. These people have lives outside of the experiment and existing wealth (of an unknown level). Yet the behaviour we observed in the multiplicative world was maximisation of the growth rate within the experiment. They effectively maximised the log utility of the in-experiment wealth.

If any of these subjects had any material wealth outside of the experiment and were general growth-rate maximisers, their utility function within this experiment should be closer to linear, despite the multiplicative dynamics. The log function has material curvature for small wealth changes near zero. Once you are further up the logarithmic function (higher wealth), a short section of the function is approximately linear. Even though the stakes of this experiment are described as large (~$150 USD with potential to win up to ~$600 USD), they are likely not large within the context of the subjects’ broader wealth.
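The curvature point can be illustrated with a hypothetical outside wealth (the 100,000kr figure below is my assumption for illustration, not a figure from the experiment). The certainty-equivalent discount a log utility maximiser applies to the same kroner gamble almost vanishes once the gamble is small relative to total wealth:

```r
# Certainty equivalent of a 50:50 gamble of +/-400kr for a log utility maximiser
certEquiv <- function(wealth, stake = 400) {
  exp(0.5*log(wealth + stake) + 0.5*log(wealth - stake))
}

# Risk premium: wealth minus the certainty equivalent of the gamble
1000   - certEquiv(1000)   # ~83kr: material curvature at in-experiment scale
100000 - certEquiv(100000) # ~0.8kr: nearly linear at a (hypothetical) outside wealth
```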

This point forms one of the central planks of the criticism of expected utility theory emerging from behavioural economics. People reject bets that, if they had any outside wealth, would be “no-brainers” for someone with log utility. Most of this evidence is gathered from experiments with additive dynamics, but there is also little evidence of linear utility in such circumstances.
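The bet below (a 50:50 flip to win $110 or lose $100) is my own illustrative example of such a “no-brainer”, not one drawn from a specific study: a log utility agent accepts it at any wealth above $1,100, yet bets of this shape are routinely rejected by people with far more wealth than that.

```r
# Does a log utility maximiser with given wealth accept a 50:50 bet of +$110/-$100?
acceptsBet <- function(wealth, gain = 110, loss = 100) {
  0.5*log(wealth + gain) + 0.5*log(wealth - loss) > log(wealth)
}

# Indifference where (w + 110)(w - 100) = w^2, i.e. w = $1,100
acceptsBet(1000) # FALSE: rejects below the indifference wealth
acceptsBet(2000) # TRUE: accepts above it
```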

Why did the Copenhagen experiment subjects adopt this narrow frame? It’s not clear, but the explanation will likely have to call on psychology or an understanding of the experimental subjects’ broader circumstances.

Another line of critique comes from Adam Goldstein, who argues that “the dynamic version of EUT, multi-period EUT, predicts the same change in risk aversion that EE predicts in a simplified model of CE [the Copenhagen experiment].”

Goldstein is right that EUT predicts a reduction in measured risk aversion in an additive environment. But Goldstein’s analysis depends on people being able to observe each flip and the change in their wealth, and then changing their behaviour accordingly. If they could take the bets flip by flip, the first bet on its own is unattractive to a risk averse utility maximiser. But it is possible (indeed likely) for them to reach a level of wealth where a single bet is attractive (in this case, above a wealth of $200), at which point they can continue to accept. Conversely, if they head toward ruin, they can start to reject bets. The possibility of getting up to a level of wealth where the bet becomes attractive can lead an expected logarithmic utility maximiser to accept the first bet due to the potential utility from later bets. Whether they will do so can be determined with a technique called dynamic programming, which involves working backward from the last bet to work out the expected utility of each single bet.

However, I am not convinced this critique applies to the experiment by Hulme and friends. The experimental subjects never got to observe the changes in their wealth during the active session (although they might weakly infer the likely direction from the favourability of the bets they had been exposed to). As a result, I’m not convinced that you would see the change in risk aversion observed in the experiment under an expected utility framework.

That inability to observe outcomes also makes the experiment a weaker examination of dynamics over time than it might otherwise be. It is in some ways a single-period game where all outcomes are realised at the same time, multiplying or adding at that point. Not seeing how subjects act given a change in wealth removes the ability to observe some of the distinguishing phenomena.
For instance, the time optimal utility maximiser would not increase risk aversion after losing in the additive environment, whereas in that same environment the traditional utility maximiser would be more likely to reject when the bets become a larger proportion of their wealth. The prospect theory decision maker might become risk seeking if they perceived themselves to be in the domain of losses. The authors note that the lack of updating is designed to avoid mental accounting, but that is, of course, a feature of prospect theory (if I understand their use of the term mental accounting). (I should also say that I understand the lack of updating given the multiple purposes of the experiment, but it would be great to see that relaxed in future iterations.)

Goldstein also raised a second possible driver of the reduced risk aversion in the additive scenario. Experimental subjects were paid based on a random draw of 10 of their active gambles. If the final wealth for an experimental subject from those 10 gambles was negative, they would be given a new draw of 10 gambles. In effect, they were protected from the most severe risks in the additive case, which would reduce risk aversion. One possible mitigant of this effect is that the experimental subjects were not explicitly told they could not lose (although they would likely have inferred as much). I am also not convinced that this effect would be strong enough to result in the purely linear utility that was observed, but it should be accounted for.

[Update: I simulated the potential payments of the participants based on the choices they actually made. Only around 4% of the potential payments involved a loss, which would have triggered the redraw. That affirms my view that, while it should be accounted for, it is unlikely to explain the experimental result.]

A related point is that paths involving negative wealth were removed from the passive session on the additive day.
This means that the subjects were not conditioned to see these negative potential consequences. In simulations I conducted of the additive passive day (see code below), around 90% of the simulated paths breach that zero lower bound. Twenty-five percent breach the 5000kr upper bound (some of them the same paths that went below zero), leaving only 2% of the trials eligible to be shown to subjects on the passive day. In contrast, passive multiplicative paths could not be excluded for going below zero, despite providing a hair-raising ride. Around 30% of the passive multiplicative paths that I simulated involve wealth dropping to less than 1kr (a 99.9% loss) and 60% to less than 10kr (a 99% loss). At the top end, 90% of the passive multiplicative paths went above 5000kr, leading to their exclusion.

The result is that the subjects were conditioned on a limited subset of additive paths that excluded the most negative moments (although also the 25% highest), and on multiplicative paths that excluded the most positive moments. This is a large asymmetry. Obviously, each person saw only one path, and they might have ended up with that combination anyhow, but the systematic conditioning of subjects with benign additive paths and harrowing multiplicative paths should be considered a potential factor in the response of subjects to those fractals.

It could be argued that despite the difference in paths, people are simply learning the effect of the fractals that they bet on. However, I am not convinced that experimental subjects would be unaffected by seeing the potential cumulative effect of these bets.

As a result, my preliminary view is that this experiment provides tentative evidence that the dynamics of the environment can influence our model of decision making. However, the experimental results involve behaviour that doesn’t seem to be fully accounted for by any of the models, and a conditioning process that I’m not completely sold on.
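As an aside on Goldstein’s dynamic programming argument above, here is a minimal backward-induction sketch (my own construction, not taken from his analysis or from the paper) for a log utility maximiser who could observe their wealth and choose flip by flip in the additive setting. A single +$50/−$40 flip is rejected below $200, but with flips remaining, the option value of reaching that region is what can change the earlier decisions:

```r
# Value of having n flips of a 50:50 +$50/-$40 bet remaining, with the option
# to decline each flip, for a log utility maximiser (ruin gives -Inf)
value <- function(wealth, n) {
  if (wealth <= 0) return(-Inf)
  if (n == 0) return(log(wealth))
  decline <- value(wealth, n - 1)
  accept  <- 0.5*value(wealth + 50, n - 1) + 0.5*value(wealth - 40, n - 1)
  max(decline, accept) # backward induction: take the better of the two branches
}

# The single-flip indifference point is exactly $200: (w+50)(w-40) = w^2 at w = 200
0.5*log(250) + 0.5*log(160) - log(200) # ~0

value(100, 1) > log(100) # FALSE: at $100 a single flip is declined
value(250, 1) > log(250) # TRUE: above $200 a single flip is worth taking
```

Keep `n` small: this naive recursion is exponential in the number of remaining flips (a memoised version would be needed for long horizons).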

Some of my other observations on the experiment include:

• The experiment involved a number of “discrepant trials”, where linear utility should have generated one choice and log utility another. These trials generated moderate evidence against the hypothesis of linear utility under additive dynamics. You can also see in Figure 4 above (and in other parts of the paper) that the experimental subjects had mild risk aversion in the additive environment. Similarly, the coefficient of risk aversion in the multiplicative environment seems slightly greater than one – indicating more risk aversion than log utility. (Saying this, I wouldn’t read too much into these particular numbers.)

• Although there is a consistent shift for most subjects between the two environments, there is a lot of variation in their degree of risk aversion. This could be due to outside factors, such as total wealth, but it raises the question of how much idiosyncrasy there is between people in their approaches to growth-rate maximisation (or whatever else it is they are maximising).

• I’m not convinced that the experiment had a design with the strength necessary to elicit a loss aversion parameter of prospect theory (assuming it exists). Every bet involved a choice between two gambles, each involving a gain and a loss, rather than a mix of gain-gain and gain-loss options that might highlight loss aversion. Shifting between those frames would also provide more power to tease out the risk aversion coefficients in the loss and gain domains. (I should note that I’m not confident that the experiment doesn’t have the necessary strength – I use the words “I’m not convinced” deliberately.)

• There was a required choice between the gambles, which eliminates status quo effects, an arguable driver of many behavioural dynamics (as argued by David Gal).

• The elicitation of preferences where people need to learn the probabilities through experience is one of the experimental circumstances where loss aversion has generally not been shown to occur (see this literature review by Yechiam and Hochman (pdf)). This provides another reason we might not expect to elicit loss aversion in this experiment.

• The set-up is complicated. The subjects need to learn the fractal relationships. Their payout is based on a random selection of 10 of their bets. The multiplicative environment is harder to learn. Does uncertainty drive some of the increase in risk aversion?

## Summary

Where does this leave us? I take the following lessons from ergodicity economics and the experimental evidence to date:

• The concept that simple growth-rate maximisation results in the same observed behaviour as expected logarithmic utility maximisation in a multiplicative environment (possibly the world we live in), yet possibly provides a more parsimonious explanation, is important. This deserves much more research, including the question of whether this is a model on which we could build the broader decision making architecture. Would prospect theory look different if built on this foundation?

• We don’t need to throw everything out of the window. Maximising expected utility using the logarithm of wealth is equivalent to maximising the growth rate in a multiplicative environment. We can continue to use this functional form in much economics work, but should consider a different interpretation of its use.

• For decision-making under uncertainty, there is a case for placing greater weight on the logarithmic “utility function” over other more highly-specified utility models that do not maximise the growth rate. On this point, Paul Samuelson led a somewhat acrimonious debate about whether an investment strategy using the Kelly criterion – which maximises the geometric growth rate (discussed in my ergodicity economics primer post) – was an appropriate investment strategy. I’ll cover that debate in more detail in a future post, but one of Samuelson’s central points was that Kelly criterion investments are only optimal for an expected log utility maximiser, not for people with other utility functions. The ergodicity economics approach attempts to circumvent this debate by suggesting that our utility function is growth rate maximisation.

• A behavioural response to possible absorbing states (i.e. ruin, death) would seem to require an addition to the growth-rate maximisation model, rather than being directly derived from it. The growth-rate maximisation model also says little about risk-return trade-offs, particularly in an additive environment. (This was also a point raised by Samuelson in the debate about the Kelly criterion, as growth-rate maximisation over finite time horizons can result in catastrophic loss.)

• There are a lot of decision-making phenomena that would require substantial additions to the ergodicity economics framework if they were to be incorporated. Examples include status quo bias, framing effects, non-linear probability weighting, and rejection of many bets that would seem to maximise a person’s time average growth rate if accepted (or that require an inordinate amount of storytelling to justify it). (Peters and friends have some papers on the application of ergodicity economics to discounting that I’ll deal with in another post.)
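As a numerical aside on the Kelly criterion mentioned in the summary: it chooses the bet fraction that maximises the expected logarithm of wealth, which is the geometric growth rate. The numbers below (a bet won 60% of the time at even odds) are my own illustrative choice:

```r
# Expected log growth rate when betting a fraction f of wealth at even odds,
# winning with probability p (illustrative numbers, not from the post)
p <- 0.6
growthRate <- function(f) p*log(1 + f) + (1 - p)*log(1 - f)

# Numerically maximise the growth rate over the bet fraction
kelly <- optimize(growthRate, interval = c(0, 0.99), maximum = TRUE)$maximum
kelly # ~0.2, matching the closed form for even odds: f* = 2p - 1
```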

On that final point, Peters often mentions that expected utility theory was an attempt to rescue the failure of expected wealth maximisation to capture decision dynamics. One of the benefits of his model is that the need for psychological explanations is removed.

However, an attempt to remove psychology from decision models will leave a lot of behaviour unexplained. There are fair questions about what psychology is required, and whether this is the psychology of behavioural economics, ecological decision making, resource rationality or something else (see my critical behavioural economics and behavioural science reading list for a flavour of this). But in many situations people do not appear to maximise the growth rate of wealth.

My other posts on loss aversion can be found here:

## Code

Below is the R code used for the simulations described above and generation of the supporting figures.

library(ggplot2)
library(scales) #use the percent scale later


Create a function for running the bets.

bet <- function(p, pop, periods, start=100, gain, loss, ergodic=FALSE, absorbing=FALSE){

  #p is the probability of a gain
  #pop is how many people are in the simulation
  #periods is the number of coin flips simulated for each person
  #start is the number of dollars each person starts with
  #if ergodic=FALSE, gain and loss are the multipliers
  #if ergodic=TRUE, gain and loss are the dollar amounts
  #if absorbing=TRUE, zero wealth ends the series of flips for that person

  params <- as.data.frame(c(p, pop, periods, start, gain, loss, ergodic, absorbing))
  rownames(params) <- c("p", "pop", "periods", "start", "gain", "loss", "ergodic", "absorbing")
  colnames(params) <- "value"

  sim <- matrix(data = NA, nrow = periods, ncol = pop)

  if(ergodic==FALSE){
    for (j in 1:pop) {
      x <- start
      for (i in 1:periods) {
        outcome <- rbinom(n=1, size=1, prob=p)
        ifelse(outcome==0, x <- x*loss, x <- x*gain)
        sim[i,j] <- x
      }
    }
  }

  if(ergodic==TRUE){
    for (j in 1:pop) {
      x <- start
      for (i in 1:periods) {
        outcome <- rbinom(n=1, size=1, prob=p)
        ifelse(outcome==0, x <- x-loss, x <- x+gain)
        sim[i,j] <- x
        if(absorbing==TRUE){
          if(x<0){
            sim[i:periods,j] <- 0
            break
          }
        }
      }
    }
  }

  sim <- rbind(rep(start,pop), sim) #place the starting sum in the first row
  sim <- cbind(seq(0,periods), sim) #number each period
  sim <- data.frame(sim)
  colnames(sim) <- c("period", paste0("p", 1:pop))
  sim <- list(params=params, sim=sim)
  sim
}


Simulate 10,000 people who accept a series of 1000 50:50 bets to win $50 or lose $40 from a starting wealth of $100.

set.seed(20200203)
ergodic <- bet(p=0.5, pop=10000, periods=1000, gain=50, loss=40, ergodic=TRUE, absorbing=FALSE)


Create a function for plotting the path of individuals in the population over a set number of periods.

individualPlot <- function(sim, periods, people){

  basePlot <- ggplot(sim$sim[c(1:(periods+1)),], aes(x=period)) +
    labs(y = "Wealth ($)")

  for (i in 1:people) {
    basePlot <- basePlot +
      geom_line(aes_string(y = sim$sim[c(1:(periods+1)),(i+1)]), color = 2)
      #need to use aes_string rather than aes to get all lines to print rather than just the last line
  }

  basePlot

}


Plot both the average outcome and first twenty people on the same plot.

jointPlot <- function(sim, periods, people) {
  individualPlot(sim, periods, people) +
    geom_line(aes(y = rowMeans(sim$sim[c(1:(periods+1)),2:(sim$params[2,]+1)])), color = 1, size=1)
}

ergodicPlot <- jointPlot(sim=ergodic, periods=100, people=20)
ergodicPlot


Create a function to generate summary statistics.

summaryStats <- function(sim, period=100){

  meanWealth <- mean(as.matrix(sim$sim[(period+1),2:(sim$params[2,]+1)]))
  medianWealth <- median(as.matrix(sim$sim[(period+1),2:(sim$params[2,]+1)]))
  num99 <- sum(sim$sim[(period+1),2:(sim$params[2,]+1)]<(sim$params[4,]/100)) #number who lost more than 99% of their wealth
  numGain <- sum(sim$sim[(period+1),2:(sim$params[2,]+1)]>sim$params[4,]) #number who gained
  num100 <- sum(sim$sim[(period+1),2:(sim$params[2,]+1)]>(sim$params[4,]*100)) #number who increased their wealth more than 100-fold
  winner <- max(sim$sim[(period+1),2:(sim$params[2,]+1)]) #wealth of wealthiest person
  winnerShare <- winner / sum(sim$sim[(period+1),2:(sim$params[2,]+1)]) #wealth share of wealthiest person

  print(paste0("mean: $", round(meanWealth, 2)))
  print(paste0("median: $", round(medianWealth, 2)))
  print(paste0("number who lost more than 99% of their wealth: ", num99))
  print(paste0("number who gained: ", numGain))
  print(paste0("number who increase their wealth more than 100-fold: ", num100))
  print(paste0("wealth of wealthiest person: $", round(winner)))
  print(paste0("wealth share of wealthiest person: ", percent(winnerShare)))
}


Generate summary statistics for the population and wealthiest person after 100 and 1000 periods.

summaryStats(sim=ergodic, period=100)
summaryStats(sim=ergodic, period=1000)



Determine how many people experienced zero wealth or less during the simulation.

numZero <- function(sim, periods=1000){

numZero <- sim$params[2,] - sum(sapply(sim$sim[1:periods,2:(sim$params[2,]+1)], function(x) all(x>0)))
numZero
}

numZero(sim=ergodic, periods=1000)
numZero(sim=ergodic, periods=100)


Determine the minimum wealth experienced by any person.

minWealth <- function(sim, periods=1000){
minWealth <- min(sim$sim[1:periods,2:(sim$params[2,]+1)])
minWealth
}

minWealth(ergodic, 1000)


Simulate the passive paths for the Copenhagen experiment.

passiveSim <- function(type="additive", people=10000, start=1000){

#parameters used in Copenhagen experiment
add <- c(-428, -321, -214, -107, 0, 107, 214, 321, 428)
mult <- c(0.447, 0.546, 0.668, 0.818, 1, 1.223, 1.496, 1.830, 2.239)
add333 <- rep(add, 37)
mult333 <- rep(mult, 37)

gamblePath <- cbind(rep(start, people), matrix(data = NA, nrow = people, ncol = 333))

if(type=="additive"){
for (i in 1:people){
gamble <- sample(add333, size=333, replace=FALSE)
for (j in 1:333){
gamblePath[i, j+1] <- gamblePath[i, j]+gamble[j]
}
}
}

if(type=="multiplicative"){
for (i in 1:people){
gamble <- sample(mult333, size=333, replace=FALSE)
for (j in 1:333){
gamblePath[i, j+1] <- gamblePath[i, j]*gamble[j]
}
}
}

gamblePath
}

addSim <- passiveSim(type="additive")
multSim <- passiveSim(type="multiplicative")


Examine how many simulated paths conform to the required range.

#function to output number below lower limit, above upper limit, and within the range of the two
numRange <- function(sim, lower=0, upper=5000, people=10000){
low <- people - sum(apply(sim, 1, function(x) all(x>lower)))
up <- people - sum(apply(sim, 1, function(x) all(x<upper)))
range <- sum(apply(sim, 1, function(x) all(x>lower & x<upper)))
print(low)
print(up)
print(range)
}

numRange(addSim)
numRange(multSim)


Simulate the payments to participants based on their actual choices.
library("tidyverse")
set.seed(20200321)

#Import the data on the choices made
for (i in 1:19){
importData <- read.csv(paste0("https://raw.githubusercontent.com/ollie-hulme/ergodicity-breaking-choice-experiment/master/data/TxtFiles_additive/", i, "_2.txt"), sep="")[1:312,] #limit to 312 entries as subject 3 has 314
importData <- select(importData, earnings, KP_Final, Gam1_1, Gam1_2, Gam2_1, Gam2_2)
assign(paste0("subject_data_", i), importData)
}

payment <- data.frame(matrix(NA, nrow=1000, ncol=19)) #1000 simulated payments for each of the 19 subjects

#Simulate 1000 payments for each participant
for (i in c(1:19)){
for (j in 1:1000){
subject_data <- get(paste0("subject_data_", i))
subject_data <- subject_data %>%
mutate(Gam1 = case_when(KP_Final==9 ~ Gam1_1, KP_Final==8 ~ Gam2_1)) %>%
mutate(Gam2 = case_when(KP_Final==9 ~ Gam1_2, KP_Final==8 ~ Gam2_2)) %>%
mutate(result = mapply(function(x,y){sample(c(x,y),1)}, x=Gam1, y=Gam2))
#Payment is the initial endowment from the passive phase plus a draw of 10 gambles
payment[j,i] <- subject_data$earnings[1] + sum(sample(subject_data$result, 10)) #starting money plus random draw of 10
}
}

colnames(payment) <- c(1:19)

#remove subject 5 from analysis as excluded in paper
payment <- payment %>% select(-5)

#Determine how many participants made a loss
sum(payment>0, na.rm=TRUE)
sum(payment<0, na.rm=TRUE)
sum(payment<(-1000))
sum(is.na(payment))

# The next decade of behavioural science: a call for intellectual diversity

Behavioral Scientist put out the call to share hopes, fears, predictions and warnings about the next decade of behavioral science. Here's my contribution:

As behavioral scientists, we're not exactly a diverse bunch. We're university educated. We live in major cities. We work in academia, tech, consulting, banking and finance. And dare I say it, we're rather liberal.
Read the twitter streams or other public outputs of the major behavioral science institutions, publications and personalities, and the topics of interest don't stray too far from what a Democratic politician (substitute your own nation's centre-left party) would discuss in a stump speech.

In that light, we need to think more broadly about both the questions we tackle and the answers we "like". We need to ask what problems matter to the large swathes of the population that we don't encounter in our day-to-day. We need to be self-critical, open to being wrong, and not cheerleaders of our own narrow conception of the world. We must find and listen to those who don't share our points of view. We must question our orthodoxies.

In practice, that's not easy. But it's vital to our relevance and to our intellectual foundations.

I had a few stabs at the ~200 words. Here's another attempt on a similar theme:

Through the replication crisis, some prominent concepts in behavioural science have been challenged. The priming literature is in ruins. The concept of willpower as a finite resource is scarcely alive. Experiments in areas from disfluency to scarcity have failed to replicate.

The shaking of the behavioural foundation is not over. More tenets of behavioural science are going to bite the dust. Many will be exposed through ongoing replication attempts. They are built on the same foundations as those that have already crumbled: publication bias, the garden of forking paths, among other things. New findings that continue to be built on those same foundations will also tumble down.

I suspect there will be dismay when some ideas crumble, as they align with core beliefs of the behavioural science community. Yet those beliefs will be at the core of the weakness. Ideas of a different alignment would have faced a more serious challenge. We accept too many ideas because we "like" them.
Thankfully, I am confident that infrastructure is being built that will allow us to challenge even cherished ideas. Let us just make sure that when the scientific foundation no longer exists, we are willing to let them go.

You can read contributions from the broader behavioural science community here.

# Best books I read in 2019

Better late than never…. The best books I read in 2019 – generally released in other years – are below. Where I have reviewed, the link leads to that review (not many reviews this year).

• Nick Chater, The Mind is Flat: A great book in which Chater argues that there are no 'hidden depths' to our minds.

• Stephan Guyenet, The Hungry Brain: Outsmarting the Instincts that Make us Overeat: Excellent summary of modern nutrition research and how the body "regulates" its weight.

• Jonathan Morduch and Rachel Schneider, The Financial Diaries: How American Families Cope in a World of Uncertainty: I find a lot of value reading about the world outside of my bubble. I learnt a lot from this book.

• Paul Seabright, The Company of Strangers: An excellent exploration of the evolutionary foundations of cooperation. A staple of my evolutionary biology and economics reading list.

• Lenore Skenazy, Free-Range Kids: How to Raise Safe, Self-Reliant Children (Without Going Nuts with Worry): A fun read of some wise advice.

• M Mitchell Waldrop, The Dream Machine: J.C.R. Licklider and the Revolution That Made Computing Personal: A bit too much detail, but a worthwhile story about the origins of personal computing. Many of the concepts about human-machine interaction remain relevant today.

Below is the full list of books that I read in 2019 (with links where reviewed and starred if a re-read). The volume of my reading declined year-on-year again, with 61 books total (40 non-fiction, 21 fiction). Most of that decline came in the back half of the year when I spent a lot of time reading and researching some narrow academic topics.
45 of the below were read before June. I could add a lot of children's books to the list (especially Enid Blyton), but I'll leave those aside.

Non-Fiction

• Dan Ariely and Jeff Kreisler, Dollars and Sense: Money Mishaps and How to Avoid Them
• Christopher Chabris and Daniel Simons, The Invisible Gorilla
• Nick Chater, The Mind is Flat
• Mihaly Csikszentmihalyi, Flow
• Nir Eyal, Hooked
• Nir Eyal, Indistractable
• Tim Ferriss, Four Hour Work Week
• Tim Ferriss, Tribe of Mentors
• Viktor Frankl, Man's Search for Meaning
• Jason Fried and David Heinemeier Hansson, It Doesn't Have to Be Crazy at Work
• Jason Fried and David Heinemeier Hansson, Rework
• Jason Fried and David Heinemeier Hansson, Remote
• Atul Gawande, Better
• Stephan Guyenet, The Hungry Brain: Outsmarting the Instincts that Make us Overeat
• Jonathan Haidt, The Righteous Mind*
• Adam Kay, This is Going to Hurt: Secret Diaries of a Junior Doctor
• Peter D. Kaufman (ed), Poor Charlie's Almanack: The Wit and Wisdom of Charles T. Munger, Expanded Third Edition
• Thomas Kuhn, The Structure of Scientific Revolutions
• David Leiser and Yhonatan Shemesh, How We Misunderstand Economics and Why it Matters: The Psychology of Bias, Distortion and Conspiracy
• Gerry Lopez, Surf is Where You Find It
• Jonathan Morduch and Rachel Schneider, The Financial Diaries: How American Families Cope in a World of Uncertainty
• Cal Newport, Digital Minimalism
• Karl Popper, The Logic of Scientific Discovery
• James Reason, Human Error
• Ben Reiter, Astroball
• Matthew Salganik, Bit by Bit: Social Research in the Digital Age
• Barry Schwartz, The Paradox of Choice
• Paul Seabright, The Company of Strangers*
• Byron Sharp, How Brands Grow
• Peter Singer, A Darwinian Left
• Lenore Skenazy, Free-Range Kids: How to Raise Safe, Self-Reliant Children (Without Going Nuts with Worry)
• Eugene Soltes, Why They Do It: Inside the Mind of the White-Collar Criminal
• Dilip Soman, The Last Mile: Creating Social and Economic Value from Behavioral Insights
• Matthew Syed, Black Box Thinking: Marginal Gains and the Secrets of High Performance
• Ed Thorp, Beat the Dealer
• M Mitchell Waldrop, The Dream Machine: J.C.R. Licklider and the Revolution That Made Computing Personal
• Mike Walsh, The Algorithmic Leader
• Caroline Webb, How to Have a Good Day: A Revolutionary Handbook for Work and Life
• Robert Wright, The Moral Animal
• Scott Young, Ultralearning: Accelerate Your Career, Master Hard Skills and Outsmart the Competition

Fiction

• Fyodor Dostoevsky, The Brothers Karamazov
• F Scott Fitzgerald, The Beautiful and the Damned
• F Scott Fitzgerald, This Side of Paradise
• Graham Greene, Our Man in Havana*
• Robert Heinlein, Starship Troopers
• Michel Houellebecq, Submission
• Jack London, Call of the Wild
• John Le Carre, Call for the Dead
• John Le Carre, A Murder of Quality
• John Le Carre, The Looking Glass War
• John Le Carre, A Small Town in Germany
• Chuck Palahniuk, Fight Club*
• Edgar Allan Poe, The Tell-Tale Heart and Other Stories
• J.K. Rowling, Harry Potter and the Philosopher's Stone
• J.K. Rowling, Harry Potter and the Chamber of Secrets
• J.K. Rowling, Harry Potter and the Prisoner of Azkaban
• J.K. Rowling, Harry Potter and the Goblet of Fire
• J.K. Rowling, Harry Potter and the Order of the Phoenix
• J.K. Rowling, Harry Potter and the Half-Blood Prince
• J.K. Rowling, Harry Potter and the Deathly Hallows
• Tim Winton, Breath

Previous lists: 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018

# Ergodicity economics: a primer

In my previous posts on loss aversion (here, here and here), I foreshadowed a post on how "ergodicity economics" might shed some light on whether we need loss aversion to explain people's choices under uncertainty. This was to be that post, but the background material that I drafted is long enough to be a stand-alone piece. I'll turn to the application of ergodicity economics to loss aversion in a future post.
The below is largely drawn from presentations and papers by Ole Peters and friends, with my own evolutionary take at the end. For a deeper dive, see the lecture notes by Peters and Alexander Adamou, or a recent Perspective by Peters in Nature Physics.

The choice

Suppose you have $100 and are offered a gamble involving a series of coin flips. For each flip, heads will increase your wealth by 50%. Tails will decrease it by 40%. Flip 100 times.

The expected payoff

What will happen? For that first flip, you have a 50% chance of a $50 gain, and a 50% chance of a $40 loss. Your expected gain (each outcome weighted by its probability, 0.5*$50 + 0.5*-$40) is $5, or 5% of your wealth. The absolute size of the stake for future flips will depend on past flips, but for every flip you have the same expected gain of 5% of your wealth.

Should you take the bet?

I simulated 10,000 people who each started with $100 and flipped the coin 100 times each. The line in Figure 1 represents the mean wealth of the 10,000 people. It looks good, increasing roughly in accordance with the expected gain, despite some volatility, and finishing at a mean wealth of over $16,000.

Figure 1: Average wealth of population

Yet people regularly decline gambles of this nature. Are they making a mistake?

One explanation for declining this gamble is risk aversion. A risk-averse person will value the expected outcome of a gamble lower than the same sum with certainty. Risk aversion can be represented through the concept of utility, where each level of wealth gives subjective value (utility) to the gambler. If people maximise utility instead of the expected value of a gamble, it is possible that a person would reject the bet.

For example, one common utility function to represent a risk-averse individual is to take the logarithm of each level of wealth. If we apply the log utility function to the gamble above, the gambler will reject the offer of the coin flip. [The maths here is simply that the expected utility of the gamble is 0.5*ln(150) + 0.5*ln(60) = 4.55, which is less than the utility of the sure $100, ln(100) = 4.61.]
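A quick numeric check of that log-utility comparison (Python used here purely as a calculator; the post's simulation code is in R):

```python
import math

# Log utility of the 50:50 gamble over $100 (heads -> $150, tails -> $60),
# versus the utility of keeping the sure $100.
eu_gamble = 0.5 * math.log(150) + 0.5 * math.log(60)
u_sure = math.log(100)

print(round(eu_gamble, 2))  # 4.55
print(round(u_sure, 2))     # 4.61
print(eu_gamble < u_sure)   # True: the log-utility gambler rejects the bet
```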

The time average growth rate

For a different perspective, below is the plot for the first 20 of these 10,000 people. Interestingly, only two people do better than break even (represented by the black line at $100). The richest has less than $1,000 at period 100.

Figure 2: Path of first 20 people

What is happening here? The first plot shows that the average wealth across all 10,000 people is increasing. When we look at the first 20 individuals, their wealth generally declines. Even those that make money make less than the gain in aggregate wealth would suggest.

To show this more starkly, here is a plot of the first 20 people on a log scale, together with the average wealth for the full population. They are all below average in final wealth.

Figure 3: Plot of first 20 people against average wealth (log scale)

If we examine the full population of 10,000, we see an interesting pattern. The mean wealth is over $16,000, but the median wealth after 100 periods is 51 cents, a loss of over 99% of the initial wealth. 54% of the population ends up with less than $1. 86% finishes with less than the initial wealth of $100. Yet 171 people end up with more than $10,000. The wealthiest person finishes with $117 million, which is over 70% of the total wealth of the population.

For most people, the series of bets is a disaster. It looks good only on average, propped up by the extreme good luck and massive wealth of a few people. The expected payoff does not match the experience of most people.

Four possible outcomes

One way to think about what is happening is to consider the four possible outcomes over the first two periods. The first person gets two heads. They finish with $225. The second and third person get a heads and a tails (in different orders), and finish with $90. The fourth person ends up with $36.
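These numbers can be sanity-checked without running the simulation: the median person flips roughly 50 heads and 50 tails, and the two-period outcomes can be enumerated directly (again a quick Python check, separate from the post's R code):

```python
# Median person: roughly 50 heads (x1.5) and 50 tails (x0.6) from $100.
median_wealth = 100 * (1.5 ** 50) * (0.6 ** 50)
print(round(median_wealth, 2))  # 0.52 -> roughly the 51-cent simulated median

# The four equally likely two-period outcomes from $100.
outcomes = [100 * 1.5 * 1.5, 100 * 1.5 * 0.6, 100 * 0.6 * 1.5, 100 * 0.6 * 0.6]
print([round(x) for x in outcomes])  # [225, 90, 90, 36]
print(round(sum(outcomes) / 4, 2))   # 110.25
```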

The average across the four is $110.25, reflecting the compound 5% growth. That's our positive picture. But three of the four lost money. As the number of flips increases, the proportion who lose money increases, with a rarer but more extraordinarily rich cohort propping up the average.

Almost surely

Over the very long term, an individual will tend to get around half heads and half tails. As the number of flips goes to infinity, the number of heads and tails is "almost surely" equal. This means that each person will tend to get a 50% increase half the time (or 1.5 times the initial wealth), and a 40% decrease half the time (60% of the initial wealth). A bit of maths and the time-average growth in wealth for an individual is (1.5*0.6)^0.5 ≈ 0.95, or approximately a 5% decline in wealth each period. Every individual's wealth will tend to decay at that rate.

To get an intuition for this, a long run of equal numbers of heads and tails is equivalent to flipping a head and a tail every two periods. Suppose that is exactly what you did – flipped a head and then flipped a tail. Your wealth would increase to $150 in the first round ($100*1.5), and then decline to $90 in the second ($150*0.6). You get the same result if you change the order. Effectively, you are losing 10% (or retaining only 1.5*0.6=0.9) of your money every two periods.

A system where the time average converges to the ensemble average (our population mean) is known as an ergodic system. The system of gambles above is non-ergodic, as the time average and the ensemble average diverge. And given we cannot individually experience the ensemble average, we should not be misled by it. The focus on ensemble averages, as is typically done in economics, can be misleading if the system is non-ergodic.

The longer term

How can we reconcile this expectation of loss when looking at the time-average growth with the continued growth of the wealth of some people after 100 periods?
It does not seem that everyone is "almost surely" on the path to ruin. But they are. If we plot the simulation for, say, 1,000 periods rather than 100, there are few winners. Here's a plot of the average wealth of the population for 1000 periods (the first 100 being as previously shown), plus a log plot of that same growth (Figures 4 and 5).

Figure 4: Plot of average wealth over 1000 periods

Figure 5: Plot of average wealth over 1000 periods (log plot)

We can see that despite a large peak in wealth around period 400, wealth ultimately plummets. Average wealth at period 1000 is $24, below the starting average of $100, with a median wealth of 1×10^-21 (rounding to the nearest cent, that is zero). The wealthiest person has $242,000, with that being 98.5% of the total wealth. If we followed that wealthy person for another 1000 periods, I would expect them to be wiped out too. [I tested that – at 2000 periods the wealthiest person had $4×10^-7.] Despite the positive expected value, the wealth of the entire population is wiped out.

Losing wealth on a positive value bet

The first 100 periods of bets force us to hold a counterintuitive idea in our minds. While the population as an aggregate experiences outcomes reflecting the positive expected value of the bet, the typical person does not. The increase in wealth across the aggregate population is only due to the extreme wealth of a few lucky people.

However, the picture over 1000 periods appears even more confusing. The positive expected value of the bet is nowhere to be seen. How could this be the case?

The answer lies in the distribution of bets. After 100 periods, one person had 70% of the wealth. We no longer have 10,000 equally weighted independent bets as we did in the first round. Instead, the path of the wealth of the population is largely subject to the outcome of the bets by this wealthy individual.
As we have already shown, the wealth path for an individual almost surely leads to a compound 5% loss of wealth. That individual's wealth is on borrowed time. The only way for someone to maintain their wealth would be to bet a smaller portion of their wealth, or to diversify their wealth across multiple bets.

The Kelly criterion

On the first of these options, the portion of a person's wealth they should enter as stakes for a positive expected value bet such as this is given by the Kelly criterion. The Kelly criterion gives the bet size that maximises the geometric growth rate of wealth.

The Kelly criterion formula for a simple bet is as follows:

$f=\frac{bp-q}{b}=\frac{p(b+1)-1}{b}$

where

f is the fraction of the current bankroll to wager

b is the net odds received on the wager (i.e. you receive $b back on top of the $1 wagered for the bet)

p is the probability of winning

q is the probability of losing (1-p)

For the bet above, we have p=0.5 and $b=\frac{0.5}{0.4}=1.25$. As offered, we are effectively required to bet f=0.4, or 40% of our wealth, for that chance to win a 50% increase. However, if we apply the above formula given p and b, a person should bet $\frac{(0.5*(1.25+1)-1)}{1.25}=0.1$, or 10%, of their wealth each round to maximise the geometric growth rate.

The Kelly criterion is effectively maximising the expected log utility of the bet through setting the size of the bet. The Kelly criterion will result in someone wanting to take a share of any bet with positive expected value. The Kelly bet "almost surely" leads to higher wealth than any other strategy in the long run.

If we simulate the above scenario, but risking only 10% of wealth each round rather than 40% (i.e. heads wealth will increase by 12.5%, tails it will decrease by 10%), what happens? The expected value of the Kelly bet is 0.5*0.125 + 0.5*-0.1 = 0.0125, or 1.25% per round. The next figure shows the ensemble average, showing a steady increase.
Figure 6: Average wealth of population applying Kelly criterion (1000 periods)

If we look at the individuals in this population, we can also see that their paths more closely resemble that of the population average. Most still under-perform the mean (the system is still non-ergodic – the time-average growth rate is (1.125*0.9)^0.5 ≈ 1.006, or 0.6%), and there is large wealth disparity, with the wealthiest person having 36% of the total wealth after 1000 periods (after 100, they have 0.5% of the wealth). Still, most people are better off, with 70% and 95% of the population experiencing a gain after 100 and 1000 periods respectively. The median wealth is almost $50,000 after the 1000 periods.

Figure 7: Plot of first 20 people applying Kelly criterion against average wealth (log scale, 1000 periods)

Unfortunately, given the take-it-or-leave-it choice we opened with involves 40% of our wealth, we can't use the Kelly criterion to optimise the bet size, and should refuse the bet.

Update clarifying some comments on this post:

An alternative more general formula for the Kelly criterion that can be used for investment decisions is:

$f=\frac{p}{a}-\frac{q}{b}$

where

f is the fraction of the current bankroll to invest

b is the value by which your investment increases if you win (i.e. you receive $b back on top of each $1 you invested)

a is the value by which your investment decreases if you lose (the first formula above assumes a=1)

p is the probability of winning

q is the probability of losing (1-p)

Applying this formula to the original bet at the beginning of this post, a=0.4 and b=0.5, giving f=0.5/0.4-0.5/0.5=0.25, or 25%. Therefore, you should put up 25% of your wealth, of which you could potentially lose 40% or win 50%.

This new formulation of the Kelly criterion gives the same recommendation as the former, but refers to different baselines. In the first case, the optimal bet is 10% of your wealth, which provides for a potential win of 12.5%. In the second case, you invest 25% of your wealth to possibly get a 50% return (12.5% of your wealth) or lose 40% of your investment (40% of 25% which is 10%). Despite the same effective recommendation, in one case you talk of f being 10%, and in the second 25%.
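The equivalence of the two formulations can be checked with a few lines of arithmetic (a Python sketch, separate from the post's R code):

```python
p, q = 0.5, 0.5

# Formulation 1: simple bet where the whole stake is at risk (a = 1).
b = 0.5 / 0.4               # net odds: win 50% / lose 40% -> b = 1.25
f1 = (b * p - q) / b
print(round(f1, 3))          # 0.1 -> stake 10% of wealth

# Formulation 2: investment that gains b or loses a per dollar invested.
a, b2 = 0.4, 0.5
f2 = p / a - q / b2
print(round(f2, 3))          # 0.25 -> invest 25% of wealth

# Same wealth dynamics: amount at risk and potential gain coincide.
print(abs(f2 * a - f1) < 1e-12)        # True: both risk 10% of wealth
print(abs(f2 * b2 - f1 * b) < 1e-12)   # True: both can gain 12.5% of wealth
```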

Evolving preferences

Suppose two types of agent lived in this non-ergodic world and their fitness was dependent on the outcome of the 50:50 bet for a 50% gain or 40% loss. One type always accepted the bet, the other always rejected it. Which would come to dominate the population?

An intuitive reaction to the above examples might be that while the accepting type might have a short term gain, in the long run they are almost surely going to drive themselves extinct. There are a couple of scenarios where that would be the case.

One is where the children of a particular type were all bound to the same coin flip as their siblings for subsequent bets. Suppose one individual had over 1 million children after 100 periods, comprising around 70% of the population (which is what they would have if we borrowed the above simulations for our evolutionary scenario, with one coin flip per generation). If all had to bet on exactly the same coin flip in period 101 and beyond, they are doomed.

If, however, each child faces their own coin flip (experiencing, say, idiosyncratic risks), that crash never comes. Instead the risk of those flips is diversified and the growth of the population more closely resembles the ensemble average, even over the very long term.

Below is a chart of the population for a simulation of 100 generations of the accepting population, starting with a population of 10,000. For this simulation I have assumed that at the end of each period, the accepting types will have a number of children equal to the proportional increase in their wealth. For example, if they flip heads, they will have 1.5 children. For tails, they will have 0.6 children. They then die. (The simulation works out largely the same if I make the number of children probabilistic in accord with those numbers.) Each child takes their own flip.

Figure 8: Population of accepting types

This has an expected population growth rate of 5%.
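That 5% rate is just the expected number of children per individual, and a quick branching-process sketch (in Python for illustration; the post's own simulation in R is below) shows a population of independent flippers tracking it closely:

```python
import random

# Expected children per accepting-type individual: 1.05 -> 5% growth.
print(0.5 * 1.5 + 0.5 * 0.6)  # 1.05

# Sketch of the branching process: each individual flips their own coin,
# leaving 1.5 children on heads and 0.6 on tails.
random.seed(0)
pop = 10_000.0
for _ in range(20):
    n = int(pop)
    heads = sum(random.random() < 0.5 for _ in range(n))
    pop = heads * 1.5 + (n - heads) * 0.6

print(round(pop))                  # close to the 5%-growth benchmark below
print(round(10_000 * 1.05 ** 20))  # 26533
```

Because each flip is idiosyncratic, the realised growth hugs the ensemble average rather than the time average of a single lineage.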

This evolutionary scenario differs from the Kelly criterion in that the accepting types are effectively able to take many independent shares of the bet for a tiny fraction of their inclusive fitness.

In a Nature Physics paper summarising some of his work, Peters writes:

[I]n maximizing the expectation value – an ensemble average over all possible outcomes of the gamble – expected utility theory implicitly assumes that individuals can interact with copies of themselves, effectively in parallel universes (the other members of the ensemble). An expectation value of a non-ergodic observable physically corresponds to pooling and sharing among many entities. That may reflect what happens in a specially designed large collective, but it doesn’t reflect the situation of an individual decision-maker.

A replicating entity that is able to diversify future bets across many offspring can do just this.

There are a lot of wrinkles that could be thrown into this simulation. How many bets does someone have to make before they reproduce and effectively diversify their future? The more bets, the higher the chance of a poor end. There is also the question of whether the bets by children would be truly independent (imagine a highly related tribe).

Risk and loss aversion in ergodicity economics

In my next post on this topic I ask whether, given the above, we need risk and loss aversion to explain our choices.

## Code

Below is the R code used for generation of the simulations and figures.

library(ggplot2)
library(scales) #use the percent scale later


Create a function for the bets.

bet <- function(p,pop,periods,gain,loss, ergodic=FALSE){

#p is probability of a gain
#pop is how many people in the simulation
#periods is the number of coin flips simulated for each person
#if ergodic=FALSE, gain and loss are the multipliers
#if ergodic=TRUE, gain and loss are the dollar amounts

params <- as.data.frame(c(p, pop, periods, gain, loss, ergodic))
rownames(params) <- c("p", "pop", "periods", "gain", "loss", "ergodic")
colnames(params) <- "value"

sim <- matrix(data = NA, nrow = periods, ncol = pop)

if(ergodic==FALSE){
for (j in 1:pop) {
x <- 100 #x is the number of dollars each person starts with
for (i in 1:periods) {
outcome <- rbinom(n=1, size=1, prob=p)
ifelse(outcome==0, x <- x*loss, x <- x*gain)
sim[i,j] <- x
}
}
}

if(ergodic==TRUE){
for (j in 1:pop) {
x <- 100 #x is the number of dollars each person starts with
for (i in 1:periods) {
outcome <- rbinom(n=1, size=1, prob=p)
ifelse(outcome==0, x <- x-loss, x <- x+gain)
sim[i,j] <- x
}
}
}

sim <- rbind(rep(100,pop), sim) #placing the $100 starting sum in the first row
sim <- cbind(seq(0,periods), sim) #number each period
sim <- data.frame(sim)
colnames(sim) <- c("period", paste0("p", 1:pop))
sim <- list(params=params, sim=sim)
sim
}


Simulate 10,000 people who accept a series of 1000 50:50 bets to win 50% of their wealth or lose 40%.

set.seed(20191215)
nonErgodic <- bet(p=0.5, pop=10000, periods=1000, gain=1.5, loss=0.6, ergodic=FALSE)


Create a function for plotting the average wealth of the population over a set number of periods.

averagePlot <- function(sim, periods=100){

basePlot <- ggplot(sim$sim[c(1:(periods+1)),], aes(x=period)) +
labs(y = "Average Wealth ($)")

averagePlot <- basePlot + geom_line(aes(y = rowMeans(sim$sim[c(1:(periods+1)),2:(sim$params[2,]+1)])), color = 1, size=1)

averagePlot
}


Plot the average outcome of these 10,000 people over 100 periods (Figure 1).

averagePlot(nonErgodic, 100)


Create a function for plotting the path of individuals in the population over a set number of periods.

individualPlot <- function(sim, periods, people){

basePlot <- ggplot(sim$sim[c(1:(periods+1)),], aes(x=period)) +
labs(y = "Wealth ($)")

for (i in 1:people) {
basePlot <- basePlot + geom_line(aes_string(y = sim$sim[c(1:(periods+1)),(i+1)]), color = 2) #need to use aes_string rather than aes to get all lines to print rather than just last line
}

basePlot

}


Plot of the path of the first 20 people over 100 periods (Figure 2).

nonErgodicIndiv <- individualPlot(nonErgodic, 100, 20)
nonErgodicIndiv


Plot both the average outcome and first twenty people on the same plot using a log scale (Figure 3).

logPlot <- function(sim, periods, people) {
individualPlot(sim, periods, people) +
geom_line(aes(y = rowMeans(sim$sim[c(1:(periods+1)),2:(sim$params[2,]+1)])), color = 1, size=1) +
scale_y_log10()
}

nonErgodicLogPlot <- logPlot(nonErgodic, 100, 20)
nonErgodicLogPlot


Create a function to generate summary statistics.

summaryStats <- function(sim, period=100){

meanWealth <- mean(as.matrix(sim$sim[(period+1),2:(sim$params[2,]+1)]))
medianWealth <- median(as.matrix(sim$sim[(period+1),2:(sim$params[2,]+1)]))
numDollar <- sum(sim$sim[(period+1),2:(sim$params[2,]+1)]<=1) #number with less than a dollar
numGain <- sum(sim$sim[(period+1),2:(sim$params[2,]+1)]>=100) #number who gain
num10000 <- sum(sim$sim[(period+1),2:(sim$params[2,]+1)]>=10000) #number who finish with more than $10,000
winner <- max(sim$sim[(period+1),2:(sim$params[2,]+1)]) #wealth of wealthiest person
winnerShare <- winner / sum(sim$sim[(period+1),2:(sim$params[2,]+1)]) #wealth share of wealthiest person

print(paste0("mean: $", round(meanWealth, 2)))
print(paste0("median: $", round(medianWealth, 2)))
print(paste0("number with less than a dollar: ", numDollar))
print(paste0("number who gained: ", numGain))
print(paste0("number that finish with more than $10,000: ", num10000))
print(paste0("wealth of wealthiest person: $", round(winner)))
print(paste0("wealth share of wealthiest person: ", percent(winnerShare)))
}


Generate summary statistics for the population and wealthiest person after 100 periods.

summaryStats(nonErgodic, 100)


Plot the average wealth of the non-ergodic simulation over 1000 periods (Figure 4).

averagePlot(nonErgodic, 1000)


Plot the average wealth of the non-ergodic simulation over 1000 periods using a log plot (Figure 5).

averagePlot(nonErgodic, 1000) + scale_y_log10()


Calculate some summary statistics about the population and the wealthiest person after 1000 periods.

summaryStats(nonErgodic, 1000)


Kelly criterion bets

Calculate the optimum Kelly bet size.

p <- 0.5
q <- 1-p
b <- (1.5-1)/(1-0.6)
f <- (b*p-q)/b
f


Run a simulation using the optimum bet size.

set.seed(20191215)
kelly <- bet(p=0.5, pop=10000, periods=1000, gain=1+f*b, loss=1-f, ergodic=FALSE)


Plot the ensemble average of the Kelly bets (Figure 6).

averagePlotKelly <- averagePlot(kelly, 1000)
averagePlotKelly


Plot of the path of the first 20 people over 1000 periods (Figure 7).

logPlotKelly <- logPlot(kelly, 1000, 20)
logPlotKelly


Generate summary stats after 1000 periods of the Kelly simulation.

summaryStats(kelly, 1000)


Evolutionary simulation

Simulate the population of accepting types.
set.seed(20191215) evolutionBet <- function(p,pop,periods,gain,loss){ #p is probability of a gain #pop is how many people in the simulation #periods is the number of generations simulated params <- as.data.frame(c(p, pop, periods, gain, loss)) rownames(params) <- c("p", "pop", "periods", "gain", "loss") colnames(params) <- "value" sim <- matrix(data = NA, nrow = periods, ncol = 1) sim <- rbind(pop, sim) #placing the starting population in the first row for (i in 1:periods) { for (j in 1:round(pop)) { outcome <- rbinom(n=1, size=1, prob=p) ifelse(outcome==0, x <- loss, x <- gain) pop <- pop + (x-1) } pop <- round(pop) print(i) sim[i+1] <- pop #"+1" as have starting population in first row } sim <- cbind(seq(0,periods), sim) #number each period sim <- data.frame(sim, row.names=NULL) colnames(sim) <- c("period", "pop") sim <- list(params=params, sim=sim) sim } evolution <- evolutionBet(p=0.5, pop=10000, periods=100, gain=1.5, loss=0.6) #more than 100 periods can take a very long time, simulation slows markedly as population grows  Plot the population growth for the evolutionary scenario (Figure 8). basePlotEvo <- ggplot(evolution$sim[c(1:101),], aes(x=period))

expectationPlotEvo <- basePlotEvo +
geom_line(aes(y=pop), color = 1, size=1) +
labs(y = "Population")

expectationPlotEvo


# The case against loss aversion

Summary: Much of the evidence for loss aversion is weak or ambiguous. The endowment effect and status quo bias are subject to multiple alternative explanations, including inertia. There is possibly better evidence for loss aversion in the response to risky bets, but what emerges does not appear to be a general principle of loss aversion. Rather, “loss aversion” is a conditional effect that most typically emerges when rejecting the bet is not the status quo and the stakes are material.

[As a postscript, a week after publishing this post, a working paper for a forthcoming Journal of Consumer Psychology article was released. That paper addresses some of the below points. A post on that paper is in the works.]

In a previous post I flagged three critiques of loss aversion that had emerged in recent years. The focus of that post was Eldad Yechiam’s analysis of the assumption of loss aversion in Kahneman and Tversky’s classic 1979 prospect theory paper.

The second critique, and the focus of this post, is an article by David Gal and Derek Rucker, The Loss of Loss Aversion: Will It Loom Larger Than Its Gain? (pdf). Its abstract:

Loss aversion, the principle that losses loom larger than gains, is among the most widely accepted ideas in the social sciences. The first part of this article introduces and discusses the construct of loss aversion. The second part of this article reviews evidence in support of loss aversion. The upshot of this review is that current evidence does not support that losses, on balance, tend to be any more impactful than gains. The third part of this article aims to address the question of why acceptance of loss aversion as a general principle remains pervasive and persistent among social scientists, including consumer psychologists, despite evidence to the contrary. This analysis aims to connect the persistence of a belief in loss aversion to more general ideas about belief acceptance and persistence in science. The final part of the article discusses how a more contextualized perspective of the relative impact of losses versus gains can open new areas of inquiry that are squarely in the domain of consumer psychology.

The release of Gal and Rucker’s paper was accompanied by a Scientific American article by Gal, Why the Most Important Idea in Behavioral Decision-Making Is a Fallacy. It uses somewhat stronger language. Here’s a snippet:

[T]here is no general cognitive bias that leads people to avoid losses more vigorously than to pursue gains. Contrary to claims based on loss aversion, price increases (ie, losses for consumers) do not impact consumer behavior more than price decreases (ie, gains for consumers). Messages that frame an appeal in terms of a loss (eg, “you will lose out by not buying our product”) are no more persuasive than messages that frame an appeal in terms of a gain (eg, “you will gain by buying our product”).

People do not rate the pain of losing $10 to be more intense than the pleasure of gaining $10. People do not report their favorite sports team losing a game will be more impactful than their favorite sports team winning a game. And people are not particularly likely to sell a stock they believe has even odds of going up or down in price (in fact, in one study I performed, over 80 percent of participants said they would hold on to it).

This critique of loss aversion is not completely new. David Gal has been making related arguments since 2006. In this more recent article, however, Gal and Rucker draw on a larger literature and some additional experiments to expand the critique.

To frame their argument, they describe three potential versions of loss aversion:

• The strong version: losses always loom larger than gains
• The weak version: losses on balance loom larger than gains
• The contextual version: depending on context, losses can loom larger than gains, they can have equal weighting, or gains can loom larger than losses
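It helps to see what the strong version implies in concrete terms. A minimal R sketch (my own illustration, not from Gal and Rucker) using the prospect theory value function with Tversky and Kahneman's 1992 median parameter estimates (loss aversion coefficient λ ≈ 2.25, diminishing sensitivity α ≈ 0.88):

```r
# Prospect theory value function with Tversky and Kahneman's (1992)
# median parameter estimates. alpha captures diminishing sensitivity;
# lambda > 1 makes losses loom larger than equivalent gains.
ptValue <- function(x, alpha = 0.88, lambda = 2.25) {
  ifelse(x >= 0, x^alpha, -lambda * (-x)^alpha)
}

ptValue(10)                 # subjective value of a $10 gain
ptValue(-10)                # a $10 loss weighs roughly 2.25 times as much
ptValue(10) + ptValue(-10)  # negative, so a 50:50 win $10/lose $10 bet is rejected
```

Under the strong version, something like λ > 1 operates in every context; Gal and Rucker's contextual version amounts to saying the effective weighting of losses against gains varies with the situation and is often close to even.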

The strong version appears to be a straw man that few would defend, but there is some subtlety in Gal and Rucker’s definition. They write:

This strong version does not require that losses must outweigh gains in all circumstances, as factors such as measurement error and boundary conditions might obscure or reduce the fundamental propensity for losses to be weighted more than equivalent gains.

An interesting point by Gal and Rucker is that for most research on the boundaries or moderators of loss aversion, loss aversion is the general principle around which the exceptions are framed. If people don't exhibit loss aversion, it is usually argued that the person is not encoding the transaction as a loss, so loss aversion does not apply. The alternative that the gains have equal weight to (or greater weight than) the loss is not put forward. So although few would defend a blunt reading of the strong version, many researchers take it as though people are loss averse unless certain circumstances are present.

Establishing the weak version seems difficult. Tallying studies in which losses loom larger and studies in which gains dominate would provide evidence more about the focus of research than about the presence of a general principle of loss aversion. It's not even clear how you would compare across different contexts.

Despite this difficulty (or possibly because of it), Gal and Rucker come down firmly in favour of the contextual version. They do this not through tallying or comparing the contexts in which losses or gains loom larger, but by arguing that most evidence of loss aversion is ambiguous at best.

Loss aversion as evidence for loss aversion

The principle of loss aversion is descriptive. It is a label applied to an empirical phenomenon. It is not an explanation. Similarly, the endowment effect, our tendency to ascribe more value to items that we have than to those we don't, is a label applied to an empirical phenomenon.

Despite being descriptive, Gal and Rucker note that loss aversion is often used as an explanation for choices. For example, loss aversion is often used as an explanation for the endowment effect. But using a descriptive label as an explanation provides no analytical value, with what appears to be an explanation being simply the application of a different label. (Owen Jones suggests that stating the endowment effect is due to loss aversion is no more useful than labelling human sexual behaviour as being due to abstinence aversion. I personally think it is marginally more useful, if only for the fact there is now a debate as to whether loss aversion and the endowment effect are related. The transfer of the label shows that you believe these empirical phenomena have the same psychological basis.)

Gal and Rucker argue that the application of the loss aversion label to the endowment effect leads to circular arguments. The endowment effect is used as evidence for loss aversion, and, as noted above, loss aversion is commonly used to explain the endowment effect. This results in an unjustified reinforcement of the concept, and a degree of neglect of alternative explanations for the phenomena.

I have some sympathy for this claim, although I am not overly concerned by it. The endowment effect has multiple explanations (as will be discussed below), so it is weak evidence of loss aversion at best. However, it is rare that the endowment effect is the sole piece of evidence presented for the existence of loss aversion. It is more often one of a series of stylised facts for which a common foundation is sought. So although there is circularity, the case for loss aversion does not rest solely on that circular argument.

Risky versus riskless choice

Much of Gal and Rucker’s examination of the evidence for loss aversion is divided between riskless and risky choice. Riskless choice involves known options and payoffs with certainty. Would you like to keep your chocolate or exchange it for a coffee mug? In risky choice, the result of the choice involves a payoff that becomes known only after the choice. Would you like to accept a 50:50 bet to win $110, lose $100?

Below is a collection of their arguments as to why loss aversion is not the best explanation for many observed empirical results sorted across those two categories.

Riskless choice – status quo bias and the endowment effect

Gal and Rucker’s examination of riskless choice centres on the closely related concepts of status quo bias and the endowment effect. Status quo bias is the propensity for someone to stick with the status quo option. The endowment effect is the propensity for someone to value an object they own over an object that they would need to acquire.

Status quo bias and the endowment effect are often examined in an exchange paradigm. You have been given a coffee mug. Would you like to exchange it for a chocolate? The propensity to retain the coffee mug (or the chocolate if that was what they were given first) is labelled as either status quo bias or the endowment effect. Loss aversion is often used to explain this decision, as the person would lose the status quo option or their current endowment when they choose an alternative.

Gal and Rucker suggest that rather than being driven by loss aversion, status quo bias in this exchange paradigm is instead due to a preference for inaction over action (call this inertia). A person needs a psychological motive for action. Gal examined this in his 2006 paper when he asked experimental subjects to imagine that they had a quarter minted in one city, and then whether they would be willing to exchange it for a quarter minted in another. Following speculation by Kahneman and others that people do not experience loss aversion when exchanging identical goods, Gal considered that a propensity for the status quo absent loss aversion would indicate the presence of inertia.

Gal found that despite there being no loss in the exchange of quarters, the experimental subjects preferred the status quo of keeping the coin they had. Gal and Rucker replicated this result on Amazon Mechanical Turk, offering to exchange one hypothetical $20 bill for another. They took this as evidence of inertia.

Apart from the question of what weight you should give an experiment involving hypothetical coins, notes and exchanges, I don’t find this a convincing demonstration that inertia lies behind the status quo bias. Exchange does involve some transaction costs (in the form of effort, however minimal, even if you are told to assume they are insignificant). In his 2006 paper, Gal reports other research where people traded identical goods when paid a nickel to cover “transaction costs”. The token amount spurred action.

Those experiments, however, involved transactions of goods with known values. The value of a quarter is clear. In contrast, Gal’s exploration of the status quo bias in his 2006 paper involved goods without an obvious face value. This is important, as Gal argued that people have “fuzzy preferences” that are often ill-defined and constructed on an ad hoc basis. If we do not precisely judge the attractiveness of chocolate or a mug, we may not have a precise ordering of preference between the two that would justify choosing one over the other. Under Gal’s concept of inertia, the lack of a psychological motive to change results in us sticking with the status quo mug. Contrast this with the exchange of quarters, where the addition of a nickel to cover trading expenses allows for a precise ordering of the two options, as they are easily comparable monetary sums. In the case of a mug and chocolate, the addition of a nickel is unlikely to make the choice any easier, as the degree of fuzziness extends over a much larger range.

The other paradigm under which the endowment effect is explored is the valuation paradigm.
The valuation paradigm involves asking someone what they would be willing to pay to purchase or acquire an item, or how much they would require to be paid to accept an offer to purchase an item in their possession. The gap between this willingness to pay and the typically larger willingness to accept is the additional value given to the endowed good.

(For some people this is how status quo bias and the endowment effect are differentiated. Status quo bias is the maintenance of the status quo in an exchange paradigm; the endowment effect is the higher valuation of endowed goods in the valuation paradigm. However, many also label the exchange paradigm outcome as being due to the endowment effect. Across the literature they are often used interchangeably.)

This difference between willingness to pay and accept in the valuation paradigm is often cited as evidence of loss aversion. But as Gal and Rucker argue, this difference has alternative explanations. Fundamentally different questions are asked when seeking an individual’s willingness to accept (what is the market value?) and their willingness to pay (what is their personal utility?). Only willingness to pay is affected by budget constraints.

Although not mentioned in the 2018 paper, Gal’s 2006 paper suggests this gap may also be due to fuzzy preferences, with the willingness to pay and willingness to accept representing the two end points of the fuzzy range of valuation. Willingness to pay is the lower bound: for any higher amount, they are either indifferent (the range of fuzzy preferences) or would prefer the monetary sum in their hand. Willingness to accept is the upper bound: for any lower amount, they are either indifferent (the range of fuzzy preferences) or would prefer to keep the good.

There are plenty of experiments in the broader literature seeking to tease out whether the endowment effect is due to loss aversion or alternative explanations of the type above.
Gal and Rucker report their own (unpublished) set of experiments in which they seek to isolate inertia as the driver of the difference between willingness to pay and willingness to accept. They asked for experimental subjects’ willingness to pay to obtain a good, versus their willingness to pay to retain a good. For example, they compared subjects’ willingness to pay to fix a phone versus their willingness to pay to get a repaired phone. They asked about their willingness to expend time to drive to get a new notebook they left behind versus their willingness to drive to get a brand new notebook. They asked about their willingness to pay for fibre optic internet versus their willingness to pay to retain fibre optic internet that they already had.

For each choice the subject needs to act to get the good, so inertia is removed as a possible explanation of a preference to retain an endowed good. With fuzzy preferences under this experimental set-up, both willingness to pay and willingness to retain would be the lower bound, as any higher payment would lead to indifference or preference for the monetary sum.

Here Gal and Rucker found little difference between willingness to pay and willingness to accept. Gal and Rucker characterise each of the options as involving choices between losses and gains, and survey questions put to the experimental subjects confirmed that most were framing the choices in that way. This allowed them to point to this experiment as evidence against loss aversion driving the endowment effect: remove inertia but leave the loss/gain framing, and the effect disappears.

However, the experimental implementation of this idea is artificial. Importantly, the decisions are hypothetical and unincentivised. Whether coded as a loss or gain, the experimental subjects were never endowed with the good and weren’t experiencing a loss.

More convincing evidence, however, came from Gal and Rucker’s application of this idea in an exchange paradigm.
In one scenario, people were endowed with a pen or chocolate bar. They were then asked to choose between keeping the pen or swapping for the chocolate bar, so an active choice was required for either option. Gal and Rucker found that regardless of the starting point, roughly the same proportion chose the pen or the chocolate bar. This contrasts with a more typical endowment effect experimental set-up that they also ran, in which they simply asked people given a pen or chocolate bar whether they would like to exchange. Here the usual endowment effect pattern emerged, with people more likely to keep the endowed good.

Like the endowment effect experiments they critique, this result is subject to alternative explanations, the simplest (although not necessarily convincing) being that the reference point has been changed by the framing of the question. By changing the status quo, you also change the reference point. (I should say this type of argument involving ad hoc stories about changes in reference points is one of the least satisfactory elements of prospect theory.) Despite the potential for alternative explanations, these experiments are the beginning of a body of evidence for inertia driving some results currently attributed to loss aversion.

Gal and Rucker’s argument against use of the endowment effect as evidence of loss aversion is even stronger. There are many alternative explanations to loss aversion for the status quo bias and endowment effect. The evidence for loss aversion is better found elsewhere.

Risky choice

Gal and Rucker’s argument concerning risky bets resembles that for riskless choice. Many experiments in the literature involve an offer of a bet, such as a 50:50 chance to win $100 or lose $100, which the experimental subject can accept or reject. Rejection is the status quo, so inertia could be an explanation for the decision to reject.
Gal and Rucker describe an alternative experiment in which people can choose between a certain return of 3% or a risky bet with an expected value of zero. As they must make a choice, there is no status quo option. 80% of people allocate at least some money to the risky bet, suggesting an absence of loss aversion. This type of finding is reflected across a broader literature. They also report a body of research where the risky bet is not the sole option to opt into, but rather one of two options for which an active choice must be made. For example, would you like $0 with certainty, or a 50:50 bet to win $10, lose $10? In this case, little evidence for loss aversion emerges unless the stakes are large.

This framing of the safe option as the status quo is one of many conditions under which loss aversion tends to emerge. Gal and Rucker reference a paper by Eyal Ert and Ido Erev, who identified that in addition to emerging when the safe option is the status quo, loss aversion also tends to emerge with:

• high nominal payoffs
• when the stakes are large
• when there are bets present in the choice list that create a contrast effect, and
• in long experiments without feedback where the computation of the expected payoff is difficult.

Ert and Erev described a series of experiments where they remove these features and eliminate loss aversion.

Gal and Rucker also reference a paper by Yechiam and Hochman (pdf), who surveyed the loss aversion literature involving balanced 50:50 bets. For experiential tasks, where decision makers are required to repeatedly select between options with no prior description of the outcomes or probabilities (effectively learning the probabilities through experience), there is no evidence of loss aversion. For descriptive tasks, where a choice is made between fully-described options, loss aversion most typically arises for “high-stakes” hypothetical amounts, and is often absent for lower sums (which are also generally hypothetical).

For the higher stakes bets, Yechiam and Hochman suggest risk aversion may explain the choices. However, what Yechiam and Hochman call high stakes aren’t that high; for example, $600 versus $500. As I described in my previous post on the Rabin Paradox, risk aversion at stakes of that size can only be shoehorned into the traditional expected utility model with severe contortions (although it can be done). Rejecting that bet reflects a high level of risk aversion for anyone with more than minimal wealth (although these experimental subjects may have low wealth, as they are generally students). Loss aversion is one alternative explanation.

Regardless, under the concept of loss aversion as presented in prospect theory, we should see loss aversion even for low stakes bets. Once you are arguing that “loss aversion” emerges only if the bet is large enough, you have a different conception of loss aversion to that in the academic literature.

Other phenomena that may not involve loss aversion

At the end of the paper, Gal and Rucker mention a couple of other phenomena incorrectly attributed to or not necessarily caused by loss aversion.

The first of these is the Asian disease problem. In this problem, experimental subjects are asked:

Imagine that the U.S. is preparing for the outbreak of an unusual Asian disease, which is expected to kill 600 people. Two alternative programs to combat the disease have been proposed. Assume that the exact scientific estimate of the consequences of the programs are as follows:

If Program A is adopted, 200 people will be saved.

If Program B is adopted, there is 1/3 probability that 600 people will be saved, and 2/3 probability that no people will be saved.

Which of the two programs would you favor?

Most people tend to prefer program A.

Then ask another set of people the following:

If Program C is adopted 400 people will die.

If Program D is adopted there is 1/3 probability that nobody will die, and 2/3 probability that 600 people will die.

Which of the two programs would you favor?

Most people prefer program D, despite C and D being a reframed version of programs A and B. The reason for this change is usually attributed to the second set of options being a loss frame, with people preferring to gamble to avoid the loss.

This, however, is not loss aversion. There is, after all, no potential for gain in the second set of questions against which the strength of the losses can be compared. Rather, this is the “reflection effect”: people tend to be risk averse over gains but risk seeking over losses, so preferences flip when the same outcomes are reframed.

Tversky and Kahneman recognised this when they presented the Asian disease problem in their 1981 Science article (pdf), but the translation into public discourse has missed this difference, with the Asian disease problem often presented as an example of loss aversion.

Gal and Rucker point out some other examples of phenomena that may be incorrectly attributed to loss aversion. The disposition effect – people tend to sell winning investments and retain losing investments – could also be explained by the reflection effect, or by lay beliefs about mean reversion. The sunk cost effect involves a refusal to recognise losses rather than a greater impact of losses relative to gains, as no comparison to a gain is made.

Losses don’t hurt more than gains

Beyond the thoughtful argument in the paper, Gal’s Scientific American article goes somewhat further. For instance, Gal writes:

People do not rate the pain of losing $10 to be more intense than the pleasure of gaining $10. People do not report their favorite sports team losing a game will be more impactful than their favorite sports team winning a game.

I find it useful to distinguish two points. The first is the question of the psychological impact of a loss. Does a loss generate a different feeling, or level of attention, to an equivalent gain? The second is how that psychological response manifests itself in a decision. Do people treat losses and gains differently, resulting in loss aversion of the type described in prospect theory?

The lack of differentiation between these two points often clouds the discussion of loss aversion. The first point accords with our instinct. We feel the pain of a loss. But that pain does not necessarily mean that we will be loss averse in our decisions.

Gal and Rucker’s article largely focuses on the second of these points through its examination of a series of choice experiments. Yet the types of claims in the Scientific American article, as in the above quote, are more about the first.

This is the point where I disagree with Gal. Although contextual (isn’t everything), the evidence of the greater psychological impact of losses appears solid. In fact, the Yechiam and Hochman article (pdf), quoted by Gal and Rucker for its survey of the loss aversion literature, was an attempt to reconcile the disconnect between the evidence for the effect of losses on performance, arousal, frontal cortical activation, and behavioral consistency with the lack of evidence for loss aversion. Yechiam’s article on the assumption of loss aversion by Kahneman and Tversky (the subject of a previous post) closes with a section reconciling his argument with the evidence of the effect of small stake losses on learning and performance.

To be able to make claims that the evidence of psychological impact of losses is as weak and contextual as the evidence for loss aversion, Gal and Rucker would need to provide a much deeper review of the literature. But in the absence of that, my reading of the literature does not support those claims.

Unfortunately, these points in the Scientific American article have been the focus of the limited responses to Gal and Rucker’s article, leaving us with a somewhat unsatisfactory debate (as I discuss further below).

Hey, we’re overthrowing the old paradigm!

The third part of Gal and Rucker’s paper concerns what they call the “Sociology of Loss Aversion”. I don’t have much to say on their particular arguments in this section, except that I have a gut reaction against authors discussing Thomas Kuhn and contextualising their work as overthrowing the entrenched paradigm. Maybe it’s the lack of modesty in failing to acknowledge they could be wrong (like most outsiders complaining about their ideas being ignored and quoting Kuhn). Just build your case and overthrow the damn paradigm!

That said, the few responses to Gal and Rucker’s paper that I have seen are underwhelming. Barry Ritholtz wrote a column, labelled by Richard Thaler as a “Good takedown of recent overwrought editorial“, which basically said an extraordinary claim such as this requires extraordinary evidence, and that that standard has not been met.

Unfortunately, the lines in Gal’s Scientific American article on the psychological effect of losses were the focus of Ritholtz’s response, rather than the evidence in the Gal and Rucker article. Further, Ritholtz didn’t show much sign of having read the paper. For instance, in response to Gal’s claim that “people are not particularly likely to sell a stock they believe has even odds of going up or down in price”, Ritholtz responded that “the endowment effect easily explains why we place greater financial value on that which we already possess”. But, as noted above, (a solid) part of Gal and Rucker’s argument is that the endowment effect may not be the result of loss aversion. (It’s probably worth noting here that Gal and Rucker did effectively replicate the endowment effect many times over. The endowment effect is a solid phenomenon.)

Another thread of response, linked by Ritholtz, came from Albert Bridge Capital’s Drew Dickson. One part of Dickson’s 20-tweet thread runs as follows:

13| So, sure, a billionaire will not distinguish between a $100 loss and a$100 gain as much as Taleb’s at-risk baker with a child in college; but add a few zeros, and the billionaire will start caring.

14| Critics can pretend that Kahneman, Tversky and @R_Thaler haven’t considered this, but they of course have. From some starting point of wealth, there is some other number where loss aversion matters. For everyone. Even Gal. Even Rucker. Even Taleb.

15| Losses (that are significant to the one suffering the loss) feel much worse than similarly-sized gains feel good. Just do the test on yourself.

But this idea that you will be loss averse if the stakes are high enough is not “loss aversion”, or at least not the version of loss aversion from prospect theory, which applies to even the smallest of losses. It’s closer to the concept of “minimal requirements”, whereby people avoid bets that would be ruinous, not because losses hurt more than gains.

Thaler himself threw out a tweet in response, stating that:

No minor point about terminology. Nothing of substance. WTA > WTP remains.

That willingness to accept (WTA) is greater than willingness to pay (WTP) when framed as the status quo is not a point Gal and Rucker would disagree with. But is it due to loss aversion?

Thankfully, the publication of Gal and Rucker’s article was accompanied by two responses, one of which tackled some of the substantive issues (the other response built on rather than critiqued Gal and Rucker’s work). That substantive response (pdf), by Itamar Simonson and Ran Kivetz, would best be described as supporting the weak version of loss aversion.

Simonson and Kivetz largely agreed that status quo bias and the endowment effect do not offer reliable support for loss aversion, particularly given the alternative explanations for the phenomena. However, they were less convinced of Gal and Rucker’s experiments to identify inertia as the basis of these phenomena, suggesting the experiments involved “unrealistic experimental manipulations that are susceptible to confounds and give rise to simple alternative explanations”, although they leave those simple alternative explanations unspecified.

Simonson and Kivetz also disagreed with Gal and Rucker on the evidence concerning risky bets, describing as ad hoc and unsupported the assumption that not accepting the bet is the status quo. It’s not clear to me how they could describe that assumption as unsupported given Gal and Rucker’s experimental evidence (and the evidence they cite) on the absence of loss aversion for small stakes when rejecting the bet is not framed as the status quo. Loss aversion only emerges for larger bets.

I should say, however, that I do have some sympathy for Simonson and Kivetz’s resistance to accepting Gal and Rucker’s sweeping of the risky bet premium into the status quo bucket. Even those larger bets for which loss aversion arises aren’t that large (as noted above, they’re often in the range of $500). Risk aversion is a somewhat unsatisfactory alternative explanation (a topic I discuss in my post on Rabin’s Paradox), and I sense that some form of loss aversion kicks in, although here we may again be talking about a minimal-requirements type of loss aversion, not the loss aversion of prospect theory.

Despite their views on risky bets, Simonson and Kivetz were more than willing to approve of Gal and Rucker’s case that loss aversion is a contingent phenomenon. They would simply argue that loss aversion occurs “on average”. As noted above, I’m not sure how you would weight the relative instances of gains or losses having greater weight, so I’ll leave that debate for now.

Funnily enough, a final comment by Simonson and Kivetz on risky bets is that “the notion that losses do tend to loom larger than gains is most likely correct; it certainly resonates and ‘feels’ consistent with personal experience, though intuitive reactions are a weak form of evidence.” As noted above, we should distinguish feelings from a decision exhibiting loss aversion.

Unfortunately, I haven’t found anything else that attempts to pick apart Gal and Rucker’s article, so it is hard to gauge the broader reception to the article or whether it has resonated in academic circles at all.

Where does this leave us on loss aversion?

Putting this together, I would summarise the case for loss aversion as follows:

• The conditions for loss aversion are more restrictive than is typically thought or stated in discussion outside academia.
• Some of the claimed evidence for loss aversion, such as the endowment effect, has alternative explanations. The evidence is better found elsewhere.
• There is sound evidence for the psychological impact of losses, but this does not necessarily manifest itself in loss aversion.
• Most of the loss aversion literature does a poor job of distinguishing between loss aversion in its pure sense and what might be called a “minimal requirements” effect, whereby people are avoiding the gamble due to the threat of ruin.

This is a more restricted conception of loss aversion than I held when I started writing this post.

The loss aversion series of posts

My next post will be on the topic of ergodicity, which involves the concept that people are not maximising the expected value of a series of gambles, but rather the time average (an explanation of what that means is to come). If people maximise the latter, not the former as many approaches assume, you don’t need risk or loss aversion to explain their decisions.

My other posts on loss aversion can be found here:

# What can we infer about someone who rejects a 50:50 bet to win $110 or lose $100? The Rabin paradox explored

Consider the following claim:

We don’t need loss aversion to explain a person’s decision to reject a 50:50 bet to win $110 or lose $100. That is just simple risk aversion as in expected utility theory.

Risk aversion is the concept that we prefer certainty to a gamble with the same expected value. For example, a risk-averse person would prefer $100 for certain over a 50:50 gamble between $0 and $200, which has an expected value of $100. The higher their risk aversion, the less they would value the 50:50 bet. They would also be willing to reject some positive expected value bets.

Loss aversion is the concept that losses loom larger than gains. If the loss is weighted more heavily than the gain – it is often said that losses hurt twice as much as gains bring us joy – then this could also explain the decision to reject a 50:50 bet of the type above.
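To make that concrete, here is a minimal sketch of how loss aversion alone can generate rejection of such a bet, assuming (purely for illustration) a piecewise-linear value function in which losses are weighted twice as heavily as gains:

```r
# Illustrative value function: losses weighted lambda times as heavily as gains
value <- function(x, lambda = 2) ifelse(x >= 0, x, lambda * x)

# Subjective value of the 50:50 bet to win $110 or lose $100
bet_value <- 0.5 * value(110) + 0.5 * value(-100)
bet_value  # 0.5*110 + 0.5*(-200) = -45: negative, so the bet is rejected
```

The lambda of 2 here is just the rough “losses hurt twice as much” heuristic, not an estimated parameter.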
Loss aversion is distinct from risk aversion in that its full force applies to the first dollar either side of the reference point from which the person is assessing the change (a point at which risk aversion should be negligible).

So, do we need loss aversion to explain the rejection of this bet, or does risk aversion suffice?

One typical response to the above claim is loosely based on the Rabin Paradox, which comes from a paper published in 2000 by Matthew Rabin:

An expected utility maximiser who rejects this bet is exhibiting a level of risk aversion that would lead them to reject bets that no one in their right mind would reject. It can’t be the case that this is simply risk aversion.

For the remainder of this post I am going to pull apart Rabin’s argument from his justifiably famous paper Risk Aversion and Expected-Utility Theory: A Calibration Theorem (pdf). A more readable version of this argument was also published in 2001 in an article by Rabin and Richard Thaler.

To understand Rabin’s point, I have worked through the math in his paper. You can see my mathematical workings in an Appendix at the bottom of this post. There were quite a few minor errors in the paper – and some major errors in the formulas – but I believe I’ve captured the crux of the argument. (I’d be grateful for some second opinions on this.)

I started working through these two articles with the impression that Rabin’s argument was a fatal blow to the idea that expected utility theory accurately describes the rejection of bets such as that above. I would have been comfortable making the above response. However, after playing with the numbers and developing a better understanding of the paper, I would say that the above response is not strictly true. Rabin’s paper makes an important point, but it is far from a fatal blow by itself. (That fatal blow does come, just not solely from here.)
Describing Rabin’s argument

Rabin’s argument starts with a simple bet: suppose you are offered a 50:50 bet to win $110 or lose $100, and you turn it down. Suppose further that you would reject this bet no matter what your wealth (an assumption we will turn to in more detail later). What can you infer about your response to other bets?

This depends on what decision-making model you are using. For an expected utility maximiser – someone who maximises the probability-weighted subjective value of these bets – we can infer that they will turn down any 50:50 bet of losing $1,000 and gaining any amount of money. For example, they would reject a 50:50 bet to lose $1,000, win one billion dollars.

On face value, that is ridiculous, and that is the crux of Rabin’s argument. Rejection of the low-value bet to win $110, lose $100 would lead to absurd responses to higher-value bets. This leads Rabin to argue that risk aversion or the diminishing value of money has nothing to do with rejection of the low-value bets.

The intuition behind Rabin’s argument is relatively simple. Suppose we have someone who rejects a 50:50 bet to gain $11, lose $10. They are an expected utility maximiser with a weakly concave utility curve: that is, they are risk neutral or risk averse at all levels of wealth. From this, we can infer that they weight the average dollar between their current wealth (W) and their wealth if they win the bet (W+11) only 10/11 as much as they weight the average dollar of the last $10 of their current wealth (between W-10 and W). We can also say that they therefore weight their W+11th dollar at most 10/11 as much as their W-10th dollar (relying on the weak concavity here).

Suppose their wealth is now W+21. We have assumed that they will reject the bet at all levels of wealth, so they will also reject at this wealth. Iterating the previous calculations, we can say that they will weight their W+32nd dollar only 10/11 as much as their W+11th dollar. This means they value their W+32nd dollar only (10/11)^2 as much as their W-10th dollar.

Keep iterating in this way and you end up with some ridiculous results. You value the 210th dollar above your current wealth only 40% as much as the last dollar of your current wealth [reducing by a constant factor of 10/11 every $21 – (10/11)^10]. Or you value the 900th dollar above your current wealth at only 2% of your last current dollar [(10/11)^40]. This is an absurd rate of discounting.

Those numbers are from the 2001 Rabin and Thaler paper. In his 2000 paper, Rabin gives figures of 3/20 for the 220th and 1/2000 for the 880th dollar, effectively calculating (10/11)^20 and (10/11)^80, which is a reduction by a factor of 10/11 every 11 dollars. This degree of discounting can be justified and reflects the equations provided in the Appendix to his paper, but it requires a slightly different intuition than the one relating to the comparison between every 21st dollar. If instead you note that the $11 above a reference point is valued less than the $10 below, you need only iterate up $11 to get another discount of 10/11, as the next $11 is valued at most as much as the previous $10.
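A quick sanity check of these discount factors in R:

```r
# Discounting by a constant factor of 10/11 per iteration
r <- 10/11

r^10  # ~0.39: Rabin and Thaler's ~40% for the 210th dollar (10/11 every $21)
r^40  # ~0.02: their ~2% for the 900th dollar
r^20  # ~0.15: Rabin's 3/20 for the 220th dollar (10/11 every $11)
r^80  # ~0.0005: his ~1/2000 for the 880th dollar
```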

Regardless of whether you use the numbers from the 2000 or 2001 paper, taking this iteration to the extreme, it doesn’t take long for additional money to have effectively zero value. Hence the result: reject the 50:50 win $110, lose $100 bet and you’ll reject the win any amount, lose $1,000 bet.

What is the utility curve of this person?

This argument sounds compelling, but we need to examine the assumption that you will reject the bet at all levels of wealth.

If someone rejects the bet at all levels of wealth, what is the least risk averse they could be? They would be close to indifferent to the bet at all levels of wealth. If that were the case across the whole utility curve, their absolute level of risk aversion would be constant. The equation used to represent utility with constant absolute risk aversion is exponential utility, $U(w)=1-e^{-aw}$ (with $a>0$).

A feature of the exponential utility function is that, for a risk-averse person, utility caps out at a maximum. Beyond a certain level of wealth, they gain no additional utility – hence Rabin’s ability to define bets where they reject infinite gains. The need for utility to cap out is also apparent from the fact that someone might reject a bet that involves the potential for infinite gain. The utility of infinite wealth cannot be infinite, as any bet involving the potential for infinite utility would be accepted.

In the 2000 paper, Rabin brings the constant absolute risk aversion function into his argument more explicitly when he examines what proportion of their portfolio a person with an exponential utility function would invest in stocks (under some particular return assumptions). There he shows a ridiculous level of risk aversion and states that “While it is widely believed that investors are too cautious in their investment behavior, no one believes they are this risk averse.”

However, this effective (or explicit) assumption of constant absolute risk aversion is not particularly well grounded.
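To see the cap in action, here is a sketch of an exponential (constant absolute risk aversion) utility function, using the common normalisation $U(w)=1-e^{-aw}$; the coefficient of 0.001 is an arbitrary choice for illustration:

```r
# CARA (exponential) utility asymptotes to a maximum of 1
cara <- function(w, a = 0.001) 1 - exp(-a * w)

cara(5000)   # ~0.9933
cara(10000)  # ~0.99995
cara(1e9)    # ~1: a billion dollars adds almost nothing beyond $10,000
```

This flattening is what lets Rabin construct bets where even an infinite gain is rejected.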
Most empirical evidence is that people exhibit decreasing absolute risk aversion, not constant. Exponential utility functions are used more for mathematical tractability than for realistically reflecting the decision-making processes that people use. Yet, under Rabin’s assumption of rejecting the bet at all levels of wealth, constant absolute risk aversion and a utility function such as the exponential is the most accommodating assumption we can make. While Rabin states that “no one believes they are this risk averse”, it’s not clear that anyone believes Rabin’s underlying assumption either. This ultimately means that the ridiculous implications of rejecting low-value bets are the result of Rabin’s unrealistic assumption that the bet is rejected no matter what the bettor’s wealth.

Relaxing the “all levels of wealth” assumption

Rabin is, of course, aware that the assumption of rejecting the bet at all levels of wealth is a weakness, so he provides a further example that applies to someone who only rejects this bet for all levels of wealth below $300,000.

This generates less extreme, but still clearly problematic bets that the bettor can be inferred to also reject.

For example, consider someone who rejects the 50:50 bet to win $110, lose $100 when they have $290,000 of wealth, and who would also reject that bet up to a wealth of $300,000. As for the previous example, each time you iterate up $110, that $110 is valued at most 10/11 of the previous $110. It takes 90 iterations of $110 to cover that $10,000, meaning that a dollar around wealth $300,000 will be valued at only (10/11)^90 (0.02%) of a dollar at wealth $290,000. Each dollar above $300,000 is not discounted any further, but by then the damage has already been done, with that money of almost no utility.

For instance, this person will reject a bet of gain $718,190, lose $1,000. Again, this person would be out of their mind.

You might now ask whether a person with a wealth of $290,000 to $300,000 would actually reject bets of this nature. If not, isn’t this just another unjustifiable assumption designed to generate a ridiculous result?

It is possible to make this scenario more realistic. Rabin doesn’t mention this in his paper (nor do Rabin and Thaler), but we can generate the same result at much lower levels of wealth. All we need to find is someone who will reject that bet over a range of $10,000, and who still has enough wealth to bear the loss – say someone who will reject that bet up to a wealth of $11,000. That person will also reject a win $718,190, lose $1,000 bet.

Rejection of the win $110, lose $100 bet over that range does not seem as unrealistic, and I could imagine a person with that preference existing. If we empirically tested this, we would also need to examine liquid wealth and cash flow, but the example does provide a sense that we could find some people whose rejection of low-value bets would generate absurd results under expected utility maximisation.

The log utility function

Let’s compare Rabin’s example utility function with a more commonly assumed utility function, that of log utility. Log utility has decreasing absolute risk aversion (and constant relative risk aversion), so is both more empirically defensible and does not generate utility that asymptotes to a maximum like the exponential utility function.

A person with log utility would reject the 50:50 bet to win $110, lose $100 up to a wealth of $1,100. Beyond that, they would accept the bet. So, for log utility we should see most people accept this bet. A person with log utility will reject some quite unbalanced bets, such as a 50:50 bet to win $1 million, lose $90,900, but only up to a wealth of $100,000, beyond which they would accept. Rejection only occurs when a loss is near ruinous.
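That $1,100 threshold can be derived directly. A log-utility agent is indifferent to a 50:50 bet with gain $g$ and loss $l$ when

$\ln(w)=\frac{1}{2}\ln(w+g)+\frac{1}{2}\ln(w-l)$

Exponentiating both sides gives $w^2=(w+g)(w-l)$, which solves to $w=\frac{gl}{g-l}$. For $g=110$ and $l=100$ this is $\frac{110\times 100}{10}=1,100$; for the win $1 million, lose $90,900 bet it is approximately $100,000.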

The result is that log utility does not generate the types of rejected bets that Rabin labels as ridiculous, but it also fails to provide much of an explanation for the rejection of low-value bets with positive expected value.

The empirical evidence

Do people actually turn down 50:50 bets of win $110, lose $100? Surprisingly, I couldn’t find an example of this bet being tested (if someone knows a paper that directly tests it, let me know).

Most examinations of loss aversion use symmetric 50:50 bets where the potential gain and loss are the same. They compare a bet centred around 0 (e.g. gain $100 or lose $100) and a similar bet in a gain frame (e.g. gain $100 or gain $300, or take $200 for certain). If more people reject the first bet than the latter, this is evidence of loss aversion. It makes sense that this is the experimental approach: if the bet is not symmetric, it becomes hard to tease out loss aversion from risk aversion.

However, there is a pattern in the literature that people often reject risky bets with a positive expected value in the ranges explored by Rabin. We don’t know a lot about their wealth (or liquidity), but Rabin’s illustrative numbers for rejected bets don’t seem completely unrealistic. It’s the range of wealth over which the rejection occurs that is questionable.

Rather than me floundering around on this point, there are papers that explicitly ask whether we can observe a set of bets for a group of experimental subjects and fit a curve to those choices that resembles expected utility.

For instance, Holt and Laury’s 2002 AER paper (pdf) examined a set of hypothetical and incentivised bets over a range of stakes (finding, among other things, that hypothetical predictions of subjects’ responses to incentivised high-stakes bets were not very accurate). They found that if you are flexible about the form of the expected utility function that is used, rejection of small gambles does not result in absurd conclusions about large gambles. The pattern of bets could be made consistent with expected utility, assuming you correctly parameterise the equation. Over subsequent years there was some back and forth on whether this finding was robust [see here (pdf) and here (pdf)], but the basic result seemed to hold. The utility curve that best matched Holt and Laury’s experimental findings had increasing relative risk aversion and decreasing absolute risk aversion.
By having decreasing absolute risk aversion, the absurd implications of Rabin’s paper are avoided. Papers such as this suggest that while Rabin’s paper makes an important point, its underlying assumptions are not consistent with empirical evidence. It is possible to have an expected utility maximiser reject low-value bets without generating ridiculous outcomes.

So what can you infer about our bettor who has rejected the win $110, lose $100 bet?

From the argument above, I would say not much. We could craft a utility function to accommodate this bet without leading to ridiculous consequences. I personally feel this defence is laboured (that’s a subject for another day), but the bet is not in itself fatal to the argument that they are an expected utility maximiser.

My other posts on loss aversion can be found here:

Appendix

The utility of a gain

Let’s suppose someone will reject a 50:50 bet with gain $g$ and loss $l$ for any level of wealth. What utility will they get from a gain of $x$? Rabin defines an upper bound on the utility of gaining $x$:

$U(w+x)-U(w)\leq\sum_{i=0}^{k^{**}(x)}\left(\frac{l}{g}\right)^ir(w)$

$k^{**}(x)=int\left(\frac{x}{g}\right)$

$r(w)=U(w)-U(w-l)$

This formula effectively breaks down $x$ into $g$-sized components, successively discounting each additional $g$ at $\frac{l}{g}$ of the previous $g$. You need $k^{**}(x)+1$ lots of $g$ to cover $x$. For instance, if $x$ were 32 and we had a 50:50 bet for win $11, lose $10, $int\left(\frac{32}{11}\right)=2$. You need 2+1 lots of 11 to fully cover 32. It actually covers a touch more than 32, hence the calculation being for an upper bound.

In the paper, Rabin defines $k^{**}(x)=int\left(\left(\frac{x}{g}\right)+1\right)$. This seems to better capture the required number of $g$ to fully cover $x$, but the iterations in the above formula start at $i=0$. The calculations I run with my version of the formula replicate Rabin’s, supporting the suggestion that the addition of 1 in the paper is an error.
$r(w)$ is shorthand for the amount of utility sacrificed from losing the gamble (i.e. losing $l$). We know that the utility of the gain $g$ is less than this, as the bet is rejected. If we let $r(w)=1$, the equation can be thought of as giving the maximum utility you could get from the gain of $x$ relative to the utility of the loss of $l$.

Putting this together, the utility of the possible gain $x$ is bounded above by the sum of, first, the upper bound of the relative utility from the first $11, $\left(\frac{10}{11}\right)^0r(w)=r(w)$, then the upper bound of the utility from the next $11, $\left(\frac{10}{11}\right)^1r(w)$, and finally the upper bound of the utility from the remaining $10 – taking a conservative approach, this is calculated as though it were a full $11: $\left(\frac{10}{11}\right)^2r(w)$.

The utility of a loss

Rabin also gives us a lower bound on the utility of a loss of $x$ for this person who will reject a 50:50 bet with gain $g$ and loss $l$ for any level of wealth:

$U(w)-U(w-x)\geq{2}\sum_{i=1}^{k^{*}(x)}\left(\frac{g}{l}\right)^{i-1}{r(w)}$

$k^{*}(x)=int\left(\frac{x}{2l}\right)$

The intuition behind $k^{*}(x)$ comes from Rabin’s desire to provide a relatively uncomplicated proof for the proposition. Effectively, the utility scales down with each step of $g$ by at least $\frac{g}{l}$. Since Rabin wants to express this in terms of losses, he defines $2l\geq{g}\geq{l}$. He can thereby say that utility scales down by at least $\frac{g}{l}$ every 2 lots of $l$.

Otherwise, the intuition for this loss formula is the same as that for the gain. The summation starts at $i=1$ as this formula provides a lower bound, so it does not require the final iteration to fully cover $x$. The formula is also multiplied by 2 as each iteration covers two lots of $l$, whereas $r(w)$ is for a single span of $l$.
Running some numbers

The below R code implements the above two formulas as a function, calculating the potential utility gain for a win of $G$ or a loss of $L$ for a person who rejects a 50:50 bet win $g$, lose $l$ at all levels of wealth. It then states whether we know the person will reject a win $G$, lose $L$ bet – we can’t state they will accept, as we only have upper and lower bounds on the utility change from the gain and loss.

Rabin_bet <- function(g, l, G, L){

  # Number of g-sized steps to cover the gain G, and 2l-sized steps to cover the loss L
  k_2star <- as.integer(G/g)
  k_star <- as.integer(L/(2*l))

  # Upper bound on the utility of the gain, in units of r(w)
  U_gain <- 0
  for (i in 0:k_2star) {
    U_step <- (l/g)^i
    U_gain <- U_gain + U_step
  }

  # Lower bound on the utility of the loss, in units of r(w)
  U_loss <- 0
  for (i in 1:k_star) {
    U_step <- 2*(g/l)^(i-1)
    U_loss <- U_loss + U_step
  }

  ifelse(U_gain < U_loss,
         print("REJECT"),
         NA )

  print(paste0("Max U from gain =", U_gain))
  print(paste0("Min U from loss =", U_loss))
}

Take a person who will reject a 50:50 bet to win $110, lose $100. Taking the table from the paper, they would reject a win $1,000,000,000, lose $1,000 bet.

Rabin_bet(110, 100, 1000000000, 1000)

[1] "REJECT"
[1] "Max U from gain =11"
[1] "Min U from loss =12.2102"

Relaxing the wealth assumption

In the Appendix of his paper, Rabin defines his proof where the bet is rejected over a range of wealth $w\in(\underline{w}, \bar w)$. In that case, relative utility for each additional gain of size $g$ is $\frac{l}{g}$ of the previous $g$ until $\bar w$. Beyond that point, each additional gain of $g$ gives constant utility until $x$ is reached.
The formula for the upper bound on the utility gain is:

$U(w+x)-U(w)\leq \begin{cases} \sum_{i=0}^{k^{**}(x)}\left(\frac{l}{g}\right)^ir(w) & if\quad x\leq{\bar w}-w\\ \\ \sum_{i=0}^{k^{**}(\bar w-w)}\left(\frac{l}{g}\right)^{i}r(w)+\left[\frac{x-(\bar w-w)}{g}\right]\left(\frac{l}{g}\right)^{k^{**}(\bar w-w)}r(w) & if\quad x\geq{\bar w}-w \end{cases}$

The first term of the equation where $x\geq\bar w-w$ involves iterated discounting as per the situation where the bet is rejected for all levels of wealth, but here the iteration is only up to wealth $\bar w$. The second term of that equation captures the gain beyond $\bar w$, discounted at a constant rate.

There is an error in Rabin’s formula in the paper. Rather than the term $\left[\frac{x-(\bar w-w)}{g}\right]$ in the second equation, Rabin has it as $[x-\bar w]$. As in the previous equations, we need the number of iterations of the gain, not total dollars, and we need this between $\bar w$ and $w+x$.

When Rabin provides the examples in Table II of the paper, from the numbers he provides I believe he actually uses a formula of the type $int\left[\frac{x-(\bar w-w)}{g}+1\right]$, which reflects a desire to calculate the upper-bound utility across the stretch above $\bar w$ in a similar manner to below, although this is not strictly necessary given the discount is constant across this range. I have implemented it as per my formula, which means that a bet for gain $G$ is rejected $g$ higher than for Rabin (which, given the scale, is not material).
Similarly, for the loss:

$U(w)-U(w-x)\geq \begin{cases} {2}\sum_{i=1}^{k^{*}(x)}\left(\frac{g}{l}\right)^{i-1}{r(w)} & if\quad {w-\underline w+2l}\geq{x}\geq{2l}\\ \\ {2}\sum_{i=1}^{k^{*}(w-\underline w+2l)}\left(\frac{g}{l}\right)^{i-1}{r(w)}+\left[\frac{x-(w-\underline w+l)}{2l}\right]\left(\frac{g}{l}\right)^{k^{*}(w-\underline w+2l)}{r(w)} & if\quad x\geq{w-\underline w+2l} \end{cases}$

There is a similar error here, with Rabin using the term $\left[x-(w-\underline w+l)\right]$ rather than $\left[\frac{x-(w-\underline w+l)}{2l}\right]$. We can’t determine how this was implemented by Rabin, as his examples do not examine behaviour below a lower bound $\underline w$.

Running some more numbers

The below code implements the above two formulas as a function, calculating the potential utility gain for a win of $G$ or a loss of $L$ for a person who rejects a 50:50 bet win $g$, lose $l$ at wealth $w\in(\underline{w}, \bar w)$. It then states whether we know the person will reject a win $G$, lose $L$ bet – as before, we can’t state they will accept, as we only have upper and lower bounds on the utility change from the gain and loss.
Rabin_bet_general <- function(g, l, G, L, w, w_max, w_min){

  # Iterations of g up to the gain G, or to the top of the rejection range
  ifelse( G <= (w_max-w),
          k_2star <- as.integer(G/g),
          k_2star <- as.integer((w_max-w)/g))

  # Iterations of 2l down to the loss L, or to the bottom of the rejection range
  ifelse( w-w_min+2*l >= L,
          k_star <- as.integer(L/(2*l)),
          k_star <- as.integer((w-w_min+2*l)/(2*l)) )

  U_gain <- 0
  for (i in 0:k_2star){
    U_step <- (l/g)^i
    U_gain <- U_gain + U_step
  }

  # Gain beyond w_max is discounted at a constant rate
  ifelse( G <= (w_max-w),
          U_gain <- U_gain,
          U_gain <- U_gain + ((G-(w_max-w))/g)*(l/g)^k_2star )

  U_loss <- 0
  for (i in 1:k_star) {
    U_step <- 2*(g/l)^(i-1)
    U_loss <- U_loss + U_step
  }

  # Loss beyond w_min is discounted at a constant rate
  ifelse( w-w_min+2*l >= L,
          U_loss <- U_loss,
          U_loss <- U_loss + ((L-(w-w_min+l))/(2*l))*(g/l)^k_star )

  ifelse(U_gain < U_loss,
         print("REJECT"),
         print("CANNOT CONFIRM REJECT") )

  print(paste0("Max U from gain =", U_gain))
  print(paste0("Min U from loss =", U_loss))
}

Imagine someone who turns down the win $110, lose $100 bet with a wealth of $290,000, but who would only reject this bet up to $300,000. They will reject a win $718,190, lose $1,000 bet.

Rabin_bet_general(110, 100, 718190, 1000, 290000, 300000, 0)

[1] "REJECT"
[1] "Max U from gain =12.2098745626936"
[1] "Min U from loss =12.2102"

The nature of Rabin’s calculation means that we can scale this calculation to anywhere on the wealth curve. We need only say that someone who rejects this bet over (roughly) a range of $10,000 plus the size of the potential loss will exhibit the same decisions. For example, a person with $10,000 wealth who would reject the bet up to $20,000 wealth would also reject the win $718,190, lose $1,000 bet.

Rabin_bet_general(110, 100, 718190, 1000, 10000, 20000, 0)

[1] "REJECT"
[1] "Max U from gain =12.2098745626936"
[1] "Min U from loss =12.2102"

Comparison with log utility

The below is an example with log utility, which is $U(w)=\ln(w)$. This function determines whether someone of wealth $w$ will reject or accept a 50:50 bet for gain $g$ and loss $l$.

log_utility <- function(g, l, w){

log_gain <- log(w+g)
log_loss <- log(w-l)

EU_bet <- 0.5*log_gain + 0.5*log_loss
EU_certain <- log(w)

ifelse(EU_certain == EU_bet,
print("INDIFFERENT"),
ifelse(EU_certain > EU_bet,
print("REJECT"),
print("ACCEPT")
)
)

print(paste0("Expected utility of bet = ", EU_bet))
print(paste0("Utility of current wealth = ", EU_certain))
}


Testing a few numbers, someone with log utility is indifferent about a 50:50 win $110, lose $100 bet at wealth $1,100. They would accept for any level of wealth above that.

log_utility(110, 100, 1100)

[1] "INDIFFERENT"
[1] "Expected utility of bet = 7.00306545878646"
[1] "Utility of current wealth = 7.00306545878646"

That same person will always accept a 50:50 win $1,100, lose $1,000 bet above $11,000 in wealth.

log_utility(1100, 1000, 11000)

[1] "ACCEPT"
[1] "Expected utility of bet = 9.30565055178051"
[1] "Utility of current wealth = 9.30565055178051"

Can we generate any bets that don’t seem quite right? It’s quite hard unless you have a bet that will bring the person to ruin or near ruin. For instance, for a 50:50 bet with a chance to win $1 million, a person with log utility and $100,000 wealth would still accept the bet with a potential loss of $90,900, which brings them to less than 10% of their wealth.

log_utility(1000000, 90900, 100000)

[1] "ACCEPT"
[1] "Expected utility of bet = 11.5134252151368"
[1] "Utility of current wealth = 11.5129254649702"

The problem with log utility is not the ability to generate ridiculous bets that would be rejected. Rather, it’s that someone with log utility would tend to accept most positive value bets (in fact, they would always take a non-zero share if they could). Only if the bet brings them near ruin (either through its size or their lack of wealth) would they turn down the bet.

The isoelastic utility function – of which log utility is a special case – is a broader class of function that exhibits constant relative risk aversion:

$U(w)=\frac{w^{1-\rho}-1}{1-\rho}$

If $\rho=1$, this simplifies to log utility (you need to use L’Hopital’s rule to get this, as the fraction is undefined when $\rho=1$). The higher $\rho$, the higher the level of risk aversion. We implement this function as follows:

CRRA_utility <- function(g, l, w, rho=2){

  ifelse( rho==1,
          print("function undefined"),
          NA )

  U_gain <- ((w+g)^(1-rho)-1)/(1-rho)
  U_loss <- ((w-l)^(1-rho)-1)/(1-rho)

  EU_bet <- 0.5*U_gain + 0.5*U_loss
  EU_certain <- (w^(1-rho)-1)/(1-rho)

  ifelse(EU_certain == EU_bet,
         print("INDIFFERENT"),
         ifelse(EU_certain > EU_bet,
                print("REJECT"),
                print("ACCEPT")
         )
  )

  print(paste0("Expected utility of bet = ", EU_bet))
  print(paste0("Utility of current wealth = ", EU_certain))
}

If we increase $\rho$, we can increase the proportion of low-value bets that are rejected. For example, a person with $\rho=2$ will reject the 50:50 win $110, lose $100 bet up to a wealth of $2,200.
The rejection point scales with $\rho$.

CRRA_utility(110, 100, 2200, 2)

[1] "INDIFFERENT"
[1] "Expected utility of bet = 0.999545454545455"
[1] "Utility of current wealth = 0.999545454545455"

For a 50:50 chance to win $1 million at wealth $100,000, the person with $\rho=2$ is willing to risk a far smaller loss, rejecting the bet even when the loss is only $48,000, or less than half their wealth (which admittedly is still a fair chunk).

CRRA_utility(1000000, 48000, 100000, 2)

[1] "REJECT"
[1] "Expected utility of bet = 0.99998993006993"
[1] "Utility of current wealth = 0.99999"

Higher values of $\rho$ start to become completely unrealistic, as utility is almost flat beyond an initial level of wealth. It is also possible to have values of $\rho$ between 0 (risk neutrality) and 1. These would result in even fewer rejected low-value bets than log utility, and fewer rejected bets with highly unbalanced potential gains and losses.

# My latest article at Behavioral Scientist: Principles for the Application of Human Intelligence

I am somewhat slow in posting this – the article has been up more than a week – but my latest article is up at Behavioral Scientist. The article is basically an argument that the scrutiny we are applying to algorithmic decision making should also be applied to human decision-making systems. Our objective should be good decisions, whatever the source of the decision.

The introduction to the article is below.

Principles for the Application of Human Intelligence

Recognition of the powerful pattern matching ability of humans is growing. As a result, humans are increasingly being deployed to make decisions that affect the well-being of other humans. We are starting to see the use of human decision makers in courts, in university admissions offices, in loan application departments, and in recruitment. Soon humans will be the primary gateway to many core services.

The use of humans undoubtedly comes with benefits relative to the data-derived algorithms that we have used in the past. The human ability to spot anomalies that are missed by our rigid algorithms is unparalleled.
A human decision maker also allows us to hold someone directly accountable for the decisions. However, the replacement of algorithms with a powerful technology in the form of the human brain is not without risks. Before humans become the standard way in which we make decisions, we need to consider the risks and ensure that implementation of human decision-making systems does not cause widespread harm.

To this end, we need to develop principles for the application of human intelligence to decision making.

Read the rest of the article here.

# Kahneman and Tversky’s “debatable” loss aversion assumption

Loss aversion is the idea that losses loom larger than gains. It is one of the foundational concepts in the judgment and decision-making literature. In Thinking, Fast and Slow, Daniel Kahneman wrote: “The concept of loss aversion is certainly the most significant contribution of psychology to behavioral economics.”

Yet, over the last couple of years several critiques have emerged that question the foundations of loss aversion and whether loss aversion is a phenomenon at all.

One is an article by Eldad Yechiam, titled Acceptable losses: the debatable origins of loss aversion (pdf). Framed in one case as a spread of the replication crisis to loss aversion, the abstract reads as follows:

It is often claimed that negative events carry a larger weight than positive events. Loss aversion is the manifestation of this argument in monetary outcomes. In this review, we examine early studies of the utility function of gains and losses, and in particular the original evidence for loss aversion reported by Kahneman and Tversky (Econometrica 47:263–291, 1979). We suggest that loss aversion proponents have over-interpreted these findings. Specifically, the early studies of utility functions have shown that while very large losses are overweighted, smaller losses are often not.
In addition, the findings of some of these studies have been systematically misrepresented to reflect loss aversion, though they did not find it. These findings shed light both on the inability of modern studies to reproduce loss aversion as well as a second literature arguing strongly for it.

A second, The Loss of Loss Aversion: Will It Loom Larger Than Its Gain (pdf), by David Gal and Derek Rucker, attacks the concept of loss aversion more generally (supposedly the “death knell”):

Loss aversion, the principle that losses loom larger than gains, is among the most widely accepted ideas in the social sciences. The first part of this article introduces and discusses the construct of loss aversion. The second part of this article reviews evidence in support of loss aversion. The upshot of this review is that current evidence does not support that losses, on balance, tend to be any more impactful than gains. The third part of this article aims to address the question of why acceptance of loss aversion as a general principle remains pervasive and persistent among social scientists, including consumer psychologists, despite evidence to the contrary. This analysis aims to connect the persistence of a belief in loss aversion to more general ideas about belief acceptance and persistence in science. The final part of the article discusses how a more contextualized perspective of the relative impact of losses versus gains can open new areas of inquiry that are squarely in the domain of consumer psychology.

A third strain of criticism relates to the concept of ergodicity. Put forward by Ole Peters, the basic claim is that people are not maximising the expected value of a series of gambles, but rather the time average. If people maximise the latter, not the former as many approaches assume, you don’t need risk or loss aversion to explain the decisions. (I’ll leave explaining what exactly this means to a later post.)
I’m as sceptical and cynical about some of the findings in the behavioural sciences as most (here’s my critical behavioural economics and behavioural science reading list), but I’m not sure I am fully on board with these arguments, particularly the stronger statements of Gal and Rucker. This post is the first of a few rummaging through these critiques to make sense of the debate, starting with Yechiam’s paper on the foundations of loss aversion in prospect theory.

## Acceptable losses: the debatable origins of loss aversion

One of the most cited papers in the social sciences is Daniel Kahneman and Amos Tversky’s 1979 paper Prospect Theory: An Analysis of Decision under Risk (pdf). Prospect theory is intended to be a descriptive model of how people make decisions under risk, and an alternative to expected utility theory.

Under expected utility theory, people assign a utility value to each possible outcome of a lottery or gamble, with that outcome typically relating to a final level of wealth. The expected utility of a decision under risk is simply the probability-weighted sum of these utilities. The expected utility of a 50% chance of $0 and a 50% chance of $200 is the sum of 50% of the utility of $0 and 50% of the utility of $200.

When utility is assumed to increase at a decreasing rate with each additional dollar of wealth – as is typically the case – it leads to risk-averse behaviour, with a certain sum preferred to a gamble with an equivalent expected value. For example, a risk-averse person would prefer $100 for certain to a 50-50 gamble for $0 or $200.
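
This can be sketched in a few lines of code. A minimal illustration, assuming a square-root utility function (my choice for the example, not anything from the paper) to represent diminishing marginal utility:

```python
import math

def u(wealth):
    # A concave utility function (diminishing marginal utility).
    # Square root is an arbitrary illustrative choice.
    return math.sqrt(wealth)

# 50-50 gamble between $0 and $200.
expected_utility_gamble = 0.5 * u(0) + 0.5 * u(200)

# Certain $100 (the same expected value as the gamble).
utility_certain = u(100)

print(expected_utility_gamble)  # ≈ 7.07
print(utility_certain)          # 10.0 – the certain sum is preferred
```

Any concave utility function produces the same qualitative result: the gamble and the certain sum have identical expected value, but the certain sum yields higher expected utility.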

In their 1979 paper, Kahneman and Tversky described a number of departures from expected utility theory. These included:

• The certainty effect: People overweight outcomes that are considered certain, relative to outcomes which are merely probable.
• The reflection effect: Relative to a reference point, people are risk averse when considering gains, but risk seeking when facing losses.
• The isolation effect: People focus on the elements that differ between options rather than those components that are shared.
• Loss aversion: Losses loom larger than gains – relative to a reference point, a loss is more painful than a gain of the same magnitude.

Loss aversion and the reflection effect result in the following famous diagram of how people weight losses and gains under prospect theory. Loss aversion leads to a kink in the utility curve at the reference point. The curve is steeper below the reference point than above. The reflection effect results in the curve being concave above the reference point, and convex below.
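
The shape of that curve can be reproduced numerically. A minimal sketch using the power-function specification common in this literature, with the loss aversion coefficient of 2.25 estimated by Tversky and Kahneman in 1992 (the curvature parameter 0.88 is also from their 1992 estimates; the functional form is assumed for illustration):

```python
def value(x, alpha=0.88, lam=2.25):
    """Prospect theory value function: concave above the reference point
    (x = 0), convex and steeper below it – the kink of loss aversion."""
    if x >= 0:
        return x ** alpha
    return -lam * (-x) ** alpha

gain = value(100)    # ≈ 57.5
loss = value(-100)   # ≈ -129.5: the $100 loss looms larger
print(-loss / gain)  # 2.25 – the loss/gain ratio
```

The concavity for gains and convexity for losses capture the reflection effect; the factor `lam` applied only to losses produces the kink at the reference point.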

Throughout the paper, Kahneman and Tversky describe experiments on each of the certainty effect, reflection effect, and isolation effect. However, as pointed out by Eldad Yechiam in his paper Acceptable losses: the debatable origins of loss aversion, loss aversion is taken as a stylised fact. Yechiam writes:

[I]n their 1979 paper, Kahneman and Tversky (1979) strongly argued for loss aversion, even though, at the time, they had not reported any experiments to support it. By indicating that this was a robust finding in earlier research, Kahneman and Tversky (1979) were able to rely upon it as a stylized fact. They begin their discussion on losses by stating that “a salient characteristic of attitudes to changes in welfare is that losses loom larger than gains” (p. 279), which suggests that this stylized fact is based on earlier findings. They then follow with the (much cited) sentence that “the aggravation that one experiences in losing a sum of money appears to be greater than the pleasure associated with gaining the same amount [17]” (p. 279). Most people who cite this sentence do so without the end quote of Galenter and Pliner (1974). Galenter and Pliner (1974) are, therefore, the first empirical study used to support the notion of loss aversion.

So what did Galenter and Pliner find? Yechiam writes:

Summing up their findings, Galenter and Pliner (1974) reported as follows: “We now turn to the question of the possible asymmetry of the positive and negative limbs of the utility function. On the basis of intuition and anecdote, one would expect the negative limb of the utility function to decrease more sharply than the positive limb increases… what we have observed if anything is an asymmetry of much less magnitude than would have been expected … the curvature of the function does not change in going from positive to negative” (p. 75).

Thus, our search for the historical foundations of loss aversion turns into a dead end on this particular branch: Galenter and Pliner (1974) did not observe such an asymmetry; and their study was quoted erroneously.

Effectively, the primary reference for the claim that we are loss averse does not support it.

So what other sources did Kahneman and Tversky rely on? Yechiam continues:

They argue that “the main properties ascribed to the value function have been observed in a detailed analysis of von Neumann–Morgenstern utility functions for changes of wealth [14].” (p. 281). The citation refers to Fishburn and Kochenberger’s forthcoming paper (at the time; published 1979). Fishburn and Kochenberger’s (1979) study reviews data of five other papers (Grayson, 1960; Green, 1963; Swalm, 1966; Halter & Dean, 1971; Barnes & Reinmuth, 1976) also cited by Kahneman and Tversky (1979). Summing up all of these findings, Kahneman and Tversky (1979) argue that “with a single exception, utility functions were considerably steeper for losses than for gains.” (p. 281). The “single exception” refers to a single participant who was reported not to show loss aversion, while the remaining ones apparently did.

These five studies all had very small samples, involving a total of 30 subjects.

Yechiam walks through three of the studies. On Swalm (1966):

The results of the 13 individuals examined by Swalm … appear at the first glance to be consistent with an asymmetric utility function implying overweighting of losses compared to gains (i.e., loss aversion). Notice, however, that amounts are in the thousands, such that the smallest amount used was set above $1000 and typically above $5000, because it was derived from the participant’s “planning horizon”. Moreover, for more than half of the participants, the utility curve near the origin …, which spans the two smallest gains and two smallest losses for each person, was linear. This deviates from the notion of loss aversion which implies that asymmetries should also be observed for small amounts as well.

This point reflects an argument that Yechiam and others have made in several papers (including here and here) that loss aversion is only apparent in high-stakes gambles. When the stakes are low, loss aversion does not appear.

On Grayson (1960):

A similar pattern is observed in Grayson’s utility functions … The amounts used were also extremely high, with only one or two points below the $50,000 range. For the points above $100,000, the pattern seems to show a clear asymmetry between gains and losses consistent with loss aversion. However, for 2/9 participants …, the utility curve for the points below $100,000 does not indicate loss aversion, and for 2/9 additional participants no loss aversion is observed for the few points below $50,000. Thus, it appears that in Grayson (1960) and Swalm (1966), almost all participants behaved as if they gave extreme losses more weight than corresponding gains, yet about half of them did not exhibit a similar asymmetry for the lower losses (e.g., below $50,000 in Grayson, 1960).

Again, loss aversion is stronger for extreme losses.

On Green (1963):

… Green (1963) did not examine any losses, making any interpretation concerning loss aversion in this study speculative as it rests on the authors’ subjective impression.

The results from Swalm (1966), Grayson (1960) and Green (1963) cover 26 of the 30 participants aggregated by Fishburn and Kochenberger. Halter and Dean (1971) and Barnes and Reinmuth (1976) only involved two participants each.

So what of other studies that were available to Kahneman and Tversky at the time?

In 1955, Davidson, Siegel, and Suppes conducted an experiment in which participants were presented with heads or tails bets which they could accept or refuse. …

… Outcomes were in cents and ran up to a gain or loss of 50 cents. The results of 15 participants showed that utility curves for gains and losses were symmetric …, with a loss/gain utility ratio of 1.1 (far below the 2.25 estimated by Tversky and Kahneman, 1992). The authors also re-analyzed an earlier data set by Mosteller and Nogee (1951) involving bets for amounts ranging from − 30 to 30 cents, and it too showed utility curves that were symmetric for gains and losses.

Lichtenstein (1965) similarly used incentivized bets and small amounts. … Lichtenstein (1965) argued that “The preference for low V [variance] bets indicates that the utility curve for money is not symmetric in its extreme ranges; that is, that large losses appear larger than large wins.” (p. 168). Thus, Lichtenstein (1965) interpreted her findings not as a general aversion to losses (which would include small losses and gains), but only as a tendency to overweight large losses relative to large gains.

… Slovic and Lichtenstein (1968) developed a regression-based approach to examine whether the participants’ willingness to pay (WTP) for a certain lottery is predicted more strongly by the size of its gains or the size of its losses. Their results showed that size of losses predicted WTP more than sizes of gains. … Moreover, in a follow-up study, Slovic (1969) found a reverse effect in hypothetical lotteries: Choices were better predicted by the gain amount than the loss amount. In the same study, he found no difference for incentivized lotteries in this respect.
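
The regression-based approach can be sketched as an ordinary least-squares fit of WTP on a lottery’s gain and loss components. This is a hypothetical reconstruction with made-up numbers, not Slovic and Lichtenstein’s data or their exact specification:

```python
import numpy as np

# Hypothetical lotteries: gain amount, loss amount, and a stated WTP.
gains  = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
losses = np.array([ 5.0, 15.0, 10.0, 30.0, 25.0])
wtp    = np.array([ 4.0,  3.0, 12.0,  1.0,  8.0])

# Regress WTP on gain size and loss size (with an intercept).
X = np.column_stack([np.ones_like(gains), gains, losses])
coef, *_ = np.linalg.lstsq(X, wtp, rcond=None)
intercept, b_gain, b_loss = coef

# Comparing the magnitudes of b_loss and b_gain indicates which
# component predicts willingness to pay more strongly.
print(b_gain, b_loss)
```

In their data, the loss coefficient dominated for real lotteries, while Slovic’s 1969 follow-up found the reverse for hypothetical ones.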

Similar findings of no apparent loss aversion were observed in studies that used probabilities that are learned from experience (Katz, 1963; Katz, 1964; Myers & Suydam, 1964).

In sum, the evidence for loss aversion at the time of the publication of prospect theory was relatively weak and limited to high-stakes gambles.

As Yechiam notes, Kahneman and Tversky only turned their attention to specifically investigating loss aversion in 1992 – and even there it tended to involve large amounts.

Only in 1992 did Tversky and Kahneman (1992) and Redelmeier and Tversky (1992) start to empirically investigate loss aversion, and when they did, they used either very large amounts (Redelmeier & Tversky, 1992) or the so-called “list method” in which one chooses between lotteries with changing amounts up until choices switch from one alternative to the other (Tversky & Kahneman, 1992). This usage of high amounts would come to characterize most of the literature later arguing for loss aversion (e.g., Redelmeier & Tversky, 1992; Abdellaoui et al., 2007; Rabin & Weizsäcker, 2009) as would be the usage of decisions that are not incentivized (i.e., hypothetical; as discussed below).

I’ll examine the post-1979 evidence in more detail in a future post, but in the interim will note this observation from Yechiam on the more recent experiments.

In a review of the literature, Yechiam and Hochman (2013a) have shown that modern studies of loss aversion seem to be binomially distributed into those who used small or moderate amounts (up to $100) and large amounts (above $500). The former typically find no loss aversion, while the latter do. For example, Yechiam and Hochman (2013a) reviewed 11 studies using decisions from description (i.e., where participants are given exact information regarding the probability of gaining and losing money). From these studies, seven did not find loss aversion and all of them used loss/gain amounts of up to $100. Four did find loss aversion, and three of them used very high amounts (above $500 and typically higher). Thus, the usage of high amounts to produce loss aversion is maintained in modern studies.

The presence of loss aversion for only large-stakes gambles raises some interesting questions. In particular, are we actually observing the effect of “minimal requirements”, whereby a loss would push the decision maker below some minimum threshold for, say, survival or other basic necessities? (Or at least a heuristic that operates with that intent?) This is a distinct concept from loss aversion as presented in prospect theory.
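
One way to picture the distinction (my own toy formalisation, not from any of the papers discussed): a decision maker with linear utility who simply refuses any gamble that could breach a subsistence threshold will look loss averse for large stakes but not for small ones:

```python
def accepts(wealth, gain, loss, threshold, p=0.5):
    """Accept a gamble (win `gain` with probability p, else lose `loss`)
    if its expected value is positive AND the worst case does not push
    wealth below the minimum-requirements threshold."""
    if wealth - loss < threshold:
        return False
    return p * gain - (1 - p) * loss > 0

wealth, threshold = 1000, 500

# Small stakes: a slightly favourable gamble is accepted –
# no apparent loss aversion.
print(accepts(wealth, gain=11, loss=10, threshold=threshold))    # True

# Large stakes at the same odds: refused, despite positive expected
# value – behaviour that looks like loss aversion.
print(accepts(wealth, gain=660, loss=600, threshold=threshold))  # False
```

The same rule with no asymmetric weighting of losses reproduces the stakes-dependent pattern Yechiam describes, which is why the two explanations are hard to separate with high-stakes gambles alone.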

Finally – and a minor point on the claim that Yechiam’s paper was the beginning of the spread of the replication crisis to loss aversion – there is of course no direct experiment on loss aversion in the initial prospect theory paper to be replicated. A recent replication of the experiments in the 1979 paper had positive results (excepting some mixed results concerning the reflection effect). Replication of the 1979 paper doesn’t, however, provide any evidence on the replicability of loss aversion itself, nor on the appropriate interpretation of the experiments.

On that point, in my next post on the topic I’ll turn to some of the alternative explanations for what appears to be loss aversion, particularly the claims of Gal and Rucker that losses do not loom larger than gains.