
Saint-Paul’s The Tyranny of Utility: Behavioral Social Science and the Rise of Paternalism

The growth in behavioural science has given a new foundation for paternalistic government interventions. Governments now try to help “biased” humans make better decisions – from nudging them to pay their taxes on time, to constraining the size of the soda they can buy, to making them save for that retirement so far in the future.

There is no shortage of critics of these interventions. Are people actually biased? Do these interventions change behaviour and improve outcomes? Is a government that is itself biased the right agent to fix these problems? Ultimately, do the costs of government action outweigh the benefits?

In The Tyranny of Utility: Behavioral Social Science and the Rise of Paternalism, Gilles Saint-Paul points out the danger in this line of defence. If the battle is fought on utilitarian grounds of costs and benefits, there will almost certainly be circumstances in which the scientific evidence on human behaviour and the effect of the interventions points in the freedom-reducing direction. Arguing about whether a certain behaviour is rational leads, at best, to an empirical debate. Similarly, arguments about the irrationality of government can be countered by empirical evidence on how particular government interventions change behaviour and outcomes.

As a result, Saint-Paul argues that:

[I]f we want to provide intellectual foundations for limited governments, we cannot do it merely on the basis of instrumental arguments. Instead, we need a system of values that delivers those limits and such a system cannot be utilitarian.

Saint-Paul argues that part of the problem is that the utilitarian approach is the backbone of neoclassical economics – once (and still in some respects) a major source of arguments in favour of freedom. Now that the assumptions about human behaviour underpinning many neoclassical models are seen to no longer hold, you are still left with utility maximisation as the policy objective. As Saint-Paul writes:

It should be emphasized that the drift toward paternalism is entirely consistent with the research program of traditional economics, which supposes that policies should be advocated on the basis of a consequentialist cost-benefit analysis, using some appropriate social welfare function. Paternalism then derives naturally from these premises, by simply adding empirical knowledge about how people actually behave …

When Saint-Paul describes the practical costs of this increased paternalism, his choice of examples often makes it hard to share his anger. One of his prime cases of infringed liberty is a five-time public transport molester who was banned from using the train after a court determined he lacked the self-control to travel on it. On gun control laws, he suggests authoritarian governments could rise in the absence of an armed citizenry.

Still, some of the other stories (and even these more extreme examples) lead to an important point. Saint-Paul points out that many of these interventions extend beyond the initial cause of the problem and impose responsibility on people for the failings of others. For example, in many countries you need a pool fence even if you don’t have kids. You effectively need to look after other people’s children. Similarly, liquor laws can extend to preventing sales to people who are drunk or likely to drive. Where does the chain of responsibility transfer stop?

One of the more interesting threads in the book concerns the objective of policy. Is it consumption? Or happiness? And based on this objective, how far does the utilitarian argument extend? If it is happiness, should we just load everyone up with Prozac? And then what of the flow-on costs if everyone decides to check out and be happy?

What if a cardiologist decides that experts and studies are right, that it’s stupid after all to buy a glossy Lamborghini, and dumps a few of his patients in order to take more time off with his family? How is the well-being of the patients affected? What if that entrepreneur who works seventy hours a week to gain market shares calls it a day and closes his factory? In a market society the pursuit of status and material achievement is obtained through voluntary exchange, and must thus benefit somebody else. Owning a Lamborghini is futile, but curing a heart disease is not. The cardiologist may be selfish and alienated; he makes his neighbors feel bad; and he is tired of the Lamborghini. His foolishness, however, has improved the lives of many people, even by the standards of happiness researchers. Competition to achieve status may be unpleasant to my future incarnations and those of my neighbors, but it increases the welfare of those who buy the goods I am producing to achieve this goal.

Saint-Paul’s response to these problems – presented more as suggestions than a manifesto, and thinly summarised in only two pages at the end of the book – is not to ignore science but to set some limits:

I am not advocating that scientific evidence should be disregarded in the decision-making process. That is obviously a recipe for poor outcomes. Instead, I am pointing out that the increased power and reliability of Science makes it all the more important that strict limits define what is an acceptable government intervention and that it is socially accepted that policies which trespass those limits cannot be implemented regardless of their alleged beneficial outcomes. We are going in the opposite direction from such discipline.

These limits could involve a minimal redistributive state to rule out absolute poverty – allowing some values to supersede freedom – but these values would not include “statistical notions of public health or aggregate happiness”, nor most forms of strong paternalism.

But despite pointing to the dangers of utilitarian arguments against paternalistic interventions, Saint-Paul finds them hard to resist. He regularly refers to the biases of government, noting the irony that “the government could well offset such deficiencies with its own policy tools but soon chose not to by having high public deficits and low interest rates.” And his picture of his preferred world has a utilitarian flavour itself:

Being treated by society as responsible and unitary goes a long way toward eliciting responsible and unitary behavior. The incentives to solve my own behavioral problems are much larger if I expect society to hold me responsible for the consequences of my actions.

Ariely’s The Honest Truth About Dishonesty

I rate the third of Dan Ariely’s books, The Honest Truth About Dishonesty: How We Lie to Everyone – Especially Ourselves, somewhere between his first two books.

One of the strengths of Ariely’s books is that he is largely writing about his own experiments, not simply scraping through the same barrel as every other pop behavioural science author. The Honest Truth has a smaller back catalogue of experiments to draw from than Predictably Irrational, so it sometimes meanders in the same way as The Upside of Irrationality. But the thread that ties The Honest Truth together – how and why we cheat – and Ariely’s investigations into it give those extended riffs more substance than the storytelling that filled some parts of The Upside.

The basic story of the book is that we like to see ourselves as honest, but are quite willing and able to indulge in a small amount of cheating where we can rationalise it. This amount of cheating is quite flexible based on situational factors, such as what other people are doing, and is not purely the result of a cost-benefit calculation.

The experiment that crops up again and again through the book is a task to find numbers in a series of matrices. People then shred the answers before collecting payment based on how many they completed. Most people cheat a little, possibly because they can rationalise that they could have solved more, or had almost completed the next one. Few cheat to the maximum, even when it is clear they have the opportunity to do so.

For much of the first part of the book, Ariely frames his research against the Simple Model of Rational Crime (or ‘SMORC’) – where people do a rational cost-benefit analysis as to whether to commit the crime. He shows experiments where people don’t cheat to the maximum amount when they have no chance of being caught – almost no-one says that they solved all the puzzles (amusingly, a few say they solved 20 out of 20, but no-one says 18 or 19). And most people do not increase their level of cheating when the potential gains increase.
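To make the contrast concrete, SMORC reduces to a simple expected-value rule. Here is a minimal sketch – the function and the payoff numbers are my own hypothetical illustration, not Ariely’s:

```python
def smorc_claims(max_claims, pay_per_claim, p_caught, penalty):
    """Simple Model of Rational Crime: cheat on a pure cost-benefit
    basis. If an extra false claim has positive expected value, a
    rational cheater claims the maximum; otherwise nothing extra."""
    expected_value = pay_per_claim - p_caught * penalty
    return max_claims if expected_value > 0 else 0

# With no chance of being caught, SMORC predicts claiming all 20
# puzzles solved - yet in the experiments almost no one does.
print(smorc_claims(20, 0.5, 0.0, 10))   # 20
print(smorc_claims(20, 0.5, 0.9, 10))   # 0
```

That almost everyone cheats a little – rather than either not at all or to the maximum – is precisely the pattern this all-or-nothing prediction cannot explain.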

As Ariely works through the various experiments attempting to isolate parts of the SMORC and show they don’t hold, I never felt fully satisfied. It is always possible to see how people might rationally respond in a way that thwarts the experimental design.

For example, Ariely found that changes in the stake with no change in enforcement did not result in an increase in cheating. But if I am in an environment with more money, I might assume there is more monitoring and enforcement, even if I can’t see it. However, I believe Ariely is right in arguing that the decision is not a pure cost-benefit analysis.

One of the more interesting parts of the book concerned how increasing the degrees of separation from the monetary outcome increases cheating. Having people collect tokens, which could be later exchanged for cash, increased cheating. In that light, a decision to cheat in an area such as financial services, where the ultimate cost is cash but there are many degrees of separation (e.g. manipulating an interest rate benchmark which changes the price I get on a trade which affects my profit and loss which affects the size of my bonus), might not feel like cheating at all.

As is the case when I read any behavioural science book, the part that leaves me slightly cold is that I’m not sure I can trust some of the results. The recent replication failures involving priming and ego depletion – and both phenomena feature in the book – resulted in me taking some of the results with a grain of salt. How many will stand the test of time?

A week of links

Links this week (or more like two weeks):

  1. Another favourite behavioural science story bites the dust.
  2. Three schools of thought on decision making.
  3. Better teachers receive worse evaluations.
  4. An attempt to reduce bias backfires.
  5. Biased scientists.
  6. Hayek and business management.
  7. More highly educated women are having children.

Life continues to be busy, so posting will continue to be sparse for at least another couple of weeks.

A week of links

Links this week:

  1. On the misplaced politics of behavioral policy interventions. And hawkish biases.
  2. Noah Smith v Bryan Caplan on education signalling – 1, 2 and 3. I believe signalling is an important part of the education story, but Smith’s argument about costly signalling is on point.
  3. Robert Trivers on his friends and enemies. HT: Razib

And if you missed it, my one post this last week:

  1. Bad nudges toward organ donation.

Life continues to be busy, so posting will continue to be sparse.

Overcoming implicit bias

I have been working through The Behavioral Foundations of Public Policy, edited by Eldar Shafir, and have mixed views so far. As I go through, I will note some interesting points.

The opening substantive chapter by Curtis Hardin and Mahzarin Banaji is on bias – and particularly implicit bias. Implicit biases are unconscious negative (or positive) attitudes towards a person or group. Most people who claim (and believe) they are not biased because they don’t show explicit bias will nevertheless have implicit bias that affects their actions.

There is no shortage of tests out there on implicit bias (here’s one set, although you have to fill out a set of surveys before you get to play) and they consistently show that implicit bias exists. Even when you know it is occurring, it’s hard to overcome. Playing with the tests when writing this post, I came up with a strong automatic preference for thin over fat people.

As the chapter is in a book on public policy, it turns to how policy makers should deal with implicit bias. It has a generally optimistic tone about the potential to reduce implicit bias – one that I don’t necessarily share from a public policy perspective – so the paragraphs that stood out for me indicated how complicated any plans to intervene would be.

Research also suggests that the interpersonal regulation of implicit prejudice is due in part to a motivation to affiliate with others who are presumed to hold specific values related to prejudice, as implied by shared reality theory (e.g., Hardin and Conley, 2001). For example, participants exhibited less implicit racial prejudice in the presence of an experimenter wearing a T-shirt with an antiracism message than a blank T-shirt, but only when the experimenter was likeable (Sinclair et al., 2005). When the experimenter was not likeable, implicit prejudice was actually greater in the presence of the ostensibly egalitarian experimenter. In addition, social tuning in these experiments was mediated by the degree to which participants liked the experimenter, providing converging evidence that interpersonal dynamics play a role in the modulation of implicit prejudice, as they do in other dimensions of social cognition (Hardin and Conley, 2001; Hardin and Higgins, 1996).

As regards public and personal policy, these findings suggest that a public stance for egalitarian values is a double-edged sword, and a sharp one at that. Although it may reduce implicit prejudice among others when espoused by someone who is likeable and high in status, it may backfire when espoused by someone who is not likeable or otherwise of marginal status. This finding suggests one mechanism by which common forms of “sensitivity training” in service of the reduction of workplace sexism and racism may be subverted by interpersonal dynamics, however laudable the goals.

I’m guessing that in many scenarios government and its agents would fall into the “not likeable or otherwise of marginal status” category.

A week of links

Links this week:

  1. The Lancet’s obesity predictions.
  2. Design things to be difficult. HT: Rory Sutherland
  3. Is there any known safe level of government funding?
  4. Increasing diversity by hiring groups, not individuals.
  5. Plenty of critiques of nudge-style interventions popping up, although they are rarely done well. Here’s another. And what is a nudge?
  6. A perspective on consumer genomics.
  7. Wealth heritability.
  8. Edging toward the right answer.
  9. Why it is so much easier to data crunch sport than economics.

And if you missed them, my posts this week:

  1. Tolstoy, behavioural scientist.
  2. The left and heritability.

Nudging for freedom

“Nudges” change the decision environment so that people make “better” decisions, while retaining freedom of choice. Fitting within what Cass Sunstein and Richard Thaler call “libertarian paternalism”, nudges are often framed as alternatives to coercive measures. If you can nudge most people toward the “right” decision through the way you frame the choice, the coercive measure is not required.

A recent example is the introduction of default retirement savings in Illinois. A default three per cent of income will be directed to a retirement savings account, with freedom to opt out or increase the contribution. Another is where the Australian Financial System Inquiry recommended offering a default retirement income product (with certain income and risk management characteristics) to people when they retire, with people otherwise free to choose another product or blow their retirement savings on a sports car.

Of course, plenty of coercive measures get branded as nudges, such as proposed bans on large sugary drinks. And after extolling the benefits of retaining choice, advocates often praise choice-restricting measures (such as in this speech by Andrew Leigh, where he praises compulsory superannuation and then defends behavioural economics against claims it is paternalistic).

But, to the point of this post – are there any examples of coercive government requirements being wound back explicitly because a nudge was considered effective? Has anyone stated: “We have some coercive measures in place, but we have realised that by framing the decision in the right way, most of you will make a good decision. Let’s remove these coercive requirements and replace them with a nudge.”?

For example, have there been any compulsory savings programs replaced by default programs on the basis that the default program could be just as effective? (In fact, a default program with a higher contribution rate could result in more savings than a compulsory program.)
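The arithmetic behind that parenthetical is straightforward. As a sketch with hypothetical numbers (a 3 per cent compulsory rate versus a 5 per cent default from which 15 per cent of people opt out):

```python
def average_contribution(rate, opt_out_share=0.0):
    """Population-average contribution rate, assuming those who
    opt out contribute nothing at all."""
    return rate * (1 - opt_out_share)

compulsory = average_contribution(0.03)        # everyone at 3%
default = average_contribution(0.05, 0.15)     # 5% default, 15% opt out
print(compulsory, round(default, 4))  # 0.03 0.0425
```

Whether the higher default beats the lower compulsory rate turns entirely on the opt-out share, which is an empirical question.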

If you know of any examples, please help me out. At the moment, my example basket is empty.

*Bryan Caplan has previously proposed some measures of this nature, none of which have been adopted.

Finding taxis on rainy days

A classic story on the playlist of many behavioural economics presentations is why you can’t find taxis on rainy days. The story is based on the idea that taxi drivers work to an income target. If driver wages are high due to high demand for taxis, such as when it rains, drivers will reach their income target earlier and go home for the day. The result is that you can’t find a taxi when you need one most.

The story is such a favourite because it conflicts with conventional economic wisdom that people are maximisers who respond positively to incentives such as higher wages. Instead, drivers are satisficers who quit work for the day once they have hit their target, even though the high wages would allow them to earn more than normal.

This story originates from a 1997 article by Colin Camerer and friends (I suggest following Camerer on twitter). They analysed taxi trips in New York and found that as wages went up, labour supply (taxis on the street) went down. Their preferred explanation, based on what some drivers said, was that taxi drivers work to a daily income target. Their article did not include the reference to the rain, but that framing has become the way the story is traditionally told.
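The income-targeting story amounts to a stopping rule rather than hour-by-hour maximisation. A minimal simulation – my own sketch, with a hypothetical daily target – shows how such a rule mechanically produces a negative wage–supply relationship:

```python
def hours_worked(hourly_wage, daily_target=200, max_shift=12):
    """An income-targeting driver quits once earnings reach the
    daily target, subject to a maximum shift length."""
    return min(daily_target / hourly_wage, max_shift)

# A demand spike that lifts wages means the target is hit sooner,
# so measured labour supply falls as wages rise.
for wage in (20, 25, 40):
    print(wage, hours_worked(wage))  # 10.0, 8.0, 5.0 hours
```

A maximiser, by contrast, would work more hours at the higher wage – the pattern at the heart of the dispute below.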

But, a new study suggests this negative relationship between wages and supply might not generally be the case for New York taxi drivers. Using a much bigger dataset of New York taxi driver activities, Henry Farber has found that, as standard economic theory would suggest, taxi drivers drive more when they can earn more. There was no evidence of income targeting in the data.

As another blow to the rainy day story, Farber also found that taxi drivers didn’t earn more when it was raining. As traffic was worse and they travelled less distance, their earnings didn’t increase despite the higher demand. There were fewer taxis on the street when it was raining, but this must be due to causes such as drivers preferring not to work when traffic is bad.

So how do we reconcile these conflicting findings? A starting point is in the original study. In a show of humility, Camerer and colleagues were open to the idea that their result might not be robust. They close with the following paragraph:

Because evidence of negative labor supply responses to transitory wage changes is so much at odds with conventional economic wisdom, these results should be treated with caution. Further analyses need to be conducted with other data sets (as in Mulligan [1995]) before reaching the conclusion that negative wage elasticities are more than an artifact of measurement or the special circumstances of cabdrivers. If replicated in further analyses, however, evidence of negative wage elasticities calls into question the validity of the life-cycle approach to labor supply.

To use the cliché, more research is required. And there has been a lot more research since Camerer and friends had their study published. While I’ve pitched the story as a new paper tearing up an almost 20-year-old favourite, there has been a sequence of papers over the years with both supporting and conflicting results, including by Farber.

Farber’s explanation for the result in his latest paper is that he had access to a larger dataset – five million shifts compared to a few thousand in Camerer and friends’ or Farber’s earlier studies. Technological progress in recording taxi data also allowed Farber’s work to be at a much finer level of detail than was possible at the time of the original study. Other studies also had small datasets or used less reliable data such as surveys (such as this one from Singapore), but there has also been at least one study involving a similarly large set of taxi data that did find a negative relationship (a second from Singapore, although in that case the negative relationship seemed of too low a magnitude to support income targeting).

Another explanation might lie in the methodological battle about how you should measure the relationship between wages and supply for taxi drivers. Farber’s 2005 paper picked apart the original methodology, particularly around their assumptions on wages, and he chose a different approach based on drivers deciding whether to continue or not at the end of each ride. When I previously invested some time to understand it, I found Farber’s critique reasonably persuasive. However, I haven’t taken the time to understand the finer points of Farber’s new analysis and to what extent methodology determines the result, so it will be interesting to see some responses to this latest salvo.

Another potential distinction is that Camerer and friends’ original study was able to distinguish between owner-operators and employee drivers, each of which faces different incentives. Farber wasn’t able to tease the two apart. However, Camerer and friends found a negative relationship for both groups, so at a minimum, Farber’s work suggests that the finding does not hold across both. Farber did consider whether there might be many different types of driver, and there may well be. But if the satisficers do exist, there are not many of them.

On a brighter note, there is some hope that we will be better able to catch a taxi on rainy days in the future. With current taxi regulation and fixed pricing, the inconvenience of driving in bad traffic results in fewer taxis on the road. But with new entrants such as Uber able to charge more and adjust pricing at times of high demand, we might actually get more taxis or other vehicles on the road when we need them most. And we can have some comfort that when those taxis are needed most, there will be plenty of maximisers around to fill our need.

Kahneman’s optimistic view of the mind

In the Gerd Gigerenzer versus Daniel Kahneman wars, most of the projectiles seem to fly one way. Gigerenzer attacks directly, Kahneman expends little effort in defence.

As one test of whether my impression was correct, I searched Kahneman’s Thinking, Fast and Slow for how many times Kahneman directly mentions Gigerenzer. The answer is six, once in the index and five times in the notes. Gigerenzer is only alluded to in the main text.

Of the notes, only one is substantive, but it is an interesting point. In a slight reversal of their usual roles, Kahneman defends the power of the human mind:

An alternative approach to judgment heuristics has been proposed by Gerd Gigerenzer, Peter M. Todd, and the ABC Research Group, in Simple Heuristics That Make Us Smart (New York: Oxford University Press, 1999). They describe “fast and frugal” formal procedures such as “Take the best [cue],” which under some circumstances generate quite accurate judgments on the basis of little information. As Gigerenzer has emphasized, his heuristics are different from those that Amos and I studied, and he has stressed their accuracy rather than the biases to which they inevitably lead. Much of the research that supports fast and frugal heuristic uses statistical simulations to show that they could work in some real-life situations, but the evidence for the psychological reality of these heuristics remains thin and contested. The most memorable discovery associated with this approach is the recognition heuristic, illustrated by an example that has become well-known: a subject who is asked which of two cities is larger and recognizes one of them should guess that the one she recognizes is larger. The recognition heuristic works fairly well if the subject knows that the city she recognizes is large; if she knows it to be small, however, she will quite reasonably guess that the unknown city is larger. Contrary to the theory, the subjects use more than the recognition cue: Daniel M. Oppenheimer, “Not So Fast! (and Not So Frugal!): Rethinking the Recognition Heuristic,” Cognition 90 (2003): B1–B9. A weakness of the theory is that, from what we know of the mind, there is no need for heuristics to be frugal. The brain processes vast amounts of information in parallel, and the mind can be fast and accurate without ignoring information. Furthermore, it has been known since the early days of research on chess masters that skill need not consist of learning to use less information. On the contrary, skill is more often an ability to deal with large amounts of information quickly and efficiently.
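The recognition heuristic Kahneman describes is simple enough to state as a decision rule. A sketch – the recognition set and city names are hypothetical examples of my own:

```python
def recognition_heuristic(city_a, city_b, recognised):
    """Guess that the recognised city is the larger. If both or
    neither city is recognised, the heuristic is silent and the
    subject must fall back on other cues - which is where, as
    Oppenheimer's results suggest, real subjects use more than
    recognition alone."""
    a_known = city_a in recognised
    b_known = city_b in recognised
    if a_known and not b_known:
        return city_a
    if b_known and not a_known:
        return city_b
    return None  # heuristic does not discriminate

recognised = {"Berlin", "Munich", "Hamburg"}
print(recognition_heuristic("Berlin", "Bielefeld", recognised))  # Berlin
print(recognition_heuristic("Berlin", "Munich", recognised))     # None
```

The rule’s frugality is exactly what Kahneman questions: the mind need not discard the other cues it plainly has.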