How I focus (and live)

This post is a record of some strategies that I use to focus and be mildly productive. It also records a few other features of my lifestyle.

Why develop these strategies? On top of delivering in my day job, I have always tried to invest heavily in my human capital, and that takes a degree of focus.

The need to adopt many of the below also reflects how easily distracted I am. I have horrible habits when I get in front of a device. The advent of the web has been a mixed blessing for me.

My approaches can shift markedly over time, so it will be interesting to see which of the below are still reflected in my behaviour in a couple of years (and which continue to be supported by the evidence as effective).

If there is a common theme to the below, it is that creating the right environment, not reliance on willpower, is the path to success.

Periods of focus: Most of my productive output occurs in two places. One is on the train, with an hour's commute at each end of the days I travel to work. The only activities I do on the train are reading (books or articles) and writing. Internet is turned off. This is now an ingrained habit. The train is largely empty for most of the journey, half of which passes through a national park, so it's a pleasant place to work.

The rest of my output occurs in productive blocks (pomodoros) during the day. At the beginning of each day I schedule a set of half-hour blocks in my diary around my other commitments. In these blocks, I will turn off or close everything I don’t need for the task. I am typically less successful at putting up barriers to human (as opposed to digital) interruptions, except for occasionally closing my office door.

Ideally I will have several blocks in a row (in the morning), with a couple of minutes to stretch in between. I aim for at least 20 half-hour sessions each week. I average maybe 30. I block out the occasional morning in my diary to make sure each week is not completely filled with meetings (with eight direct reports and working in a bureaucracy, that is a real risk).

I also read whenever I can, and that fills a lot of the other space in my life. I read around 100 books per year (about 70-80 non-fiction).

Phone: My iPhone is used for four main purposes: as a phone; as a train timetable; as a listening device (podcasts, audiobooks and music); and for my meditation apps (more on meditation below). It also has a few utilities such as Uber that I rarely use. I don’t use my phone for social media, as a diary, or for email. Most of the day it stays in my pocket or on my desk. All notifications, except calls and text messages, are turned off. I rarely have any reason to look at it.

Even when I do look at my phone, the view is sparse. These are the two screens I see.

One thing you can't see in these screenshots (for some strange technical reason) is that my phone is in greyscale. There is little colour to get me excited (although I am colour blind…). Except when I make a phone call, message someone, or (loosely) lock the phone with Forest, I use search to find any app I need: they are all hidden in a folder called Dump. When I go to my phone, there is little to divert me from my original intention.

iPad: I have an iPad, and it is similarly constrained. All notifications are turned off. It has email, but the account is turned off in settings, with account changes restricted. It takes me about a minute to disable restrictions to turn email on, which slows me down enough to make sure I am checking it for a reason. More on email below.

I also use the iPad for reading and writing (including these posts) on the train. When reading, I use my Kindle in preference to my iPad when I can, as the Kindle has far fewer rabbit holes.

Internet: I subscribe to Freedom, which blocks the internet or particular apps at set times. Among other things, I use it to block the internet from 8pm through to 7am (I don't want to be checking email or browsing when I first get up) and on Sundays (generally a screen-free day). I also use Freedom to shut off the internet or certain apps at ad hoc times when I want to focus.

I try not to randomly browse at other times. I have little interest in news (see below), so that reduces the probability of messing around. I have previously used RescueTime to track my time online, but don’t currently as I can’t install it on my work computer, phone or iPad. The tracking had a subtle but limited effect on my behaviour on my home computer when I tried it.

Email: Currently my biggest failure, particularly when I am in the office. I aim to batch my email to a few times per day, but I check and am distracted by new emails more often than I would like. That is partly because much of my workflow runs through email, so it is hard not to look.

Social media: I have a Facebook account, but zero friends, so it provides little distraction. (I also like that when I run into people who I haven’t seen for a while, I don’t already know what they have been up to.) I only have the account because this blog has a Facebook page. I try to limit my visits to Twitter and LinkedIn to once a week (normally successful with Twitter, less so with LinkedIn as direct messages sometimes draw me in). Freedom helps constrain this.

Paper diary: My paper diary is an attempt to keep myself away from distracting devices. I also find it faster than the electronic alternative. I have an electronic calendar for work, but it is replicated in the paper diary.

News: I consume little news. I don’t have a television, don’t purchase newspapers and don’t visit internet news sites unless I follow a link based on a recommendation. I rarely miss anything important. If something big happens, someone will normally tell me.

I used to apply a filter to political news of “if this was happening in Canada, would I care?” That eliminated most political news, but I have found that after a few years, I have become so disconnected from Australian politics that most of it flows around me. I don’t recognise most politicians, and I feel unconnected to any of the personalities. Voting is compulsory in Australia, so to avoid being fined or voting for people I know nothing about, I get my name ticked off the electoral roll at a polling place, take the voting slip, but don’t bother filling it out. (And I have almost no idea what Trump is up to.)

I am in a similar place for sports news. Now that I have been disconnected for a while, I have no interest. Any names I overhear mean nothing to me. I couldn’t tell you who won any of the tennis grand slams last year or who the World Series champion is. I don’t think I could recognise a current Australian cricketer on sight.

Blogs: As a substitute for news sources, I subscribe to around 25 blogs using a feed reader (Feedly). I scan them around once a day. They provide more reading material than I can get through (via the posts themselves or links), so I have a backlog of reading material in Instapaper (I used to use Pocket, but dumped it when the ads appeared).

Sleep and rest: The evidence on the effects of lack of sleep is strong. I need eight hours a night and generally get it (children permitting). I don't use screens (except for the Kindle) after 8pm. I also subscribe to the broader need for rest, and to the idea that productivity declines with overwork.

Meditation: Meditation is new for me (around four months), and I am still in the experimental phase. I meditate for around 15 to 20 minutes every day. I find it puts me on the right track at the start of the day (which is when I meditate, children permitting). It also acts as a daily reminder of what I am trying to do.

The evidence of increased concentration and emotional control seems strong enough to give it a go. I suspect I would have dismissed the idea a few years ago (maybe even a year ago), and depending on how the evidence and my own experience develop, I am prepared to dismiss it again in the future.

A benchmark I’d like to be able to compare meditation to is focused reading. If I shifted the meditation time to reading, that’s 15 to 20 additional books a year. What is the balance of costs and benefits?

I use three apps to meditate: Insight Timer, Headspace and 10% Happier. I find 10% Happier most useful as a teacher. Headspace is convenient and easy to use, but I don’t like the gamification element to it, and the packages seem relatively shallow and repetitive (although the repetitive nature is not necessarily a bad thing). At the end of the year when it is time to re-subscribe, I suspect I will drop Headspace and stick with 10% Happier if I am still learning something from it. Insight Timer will otherwise give me what I need.

I will post more on my thoughts on meditation in the near future – likely through a review of Sam Harris’s Waking Up in the first instance, as that was the book that pushed me across the line.

I give myself a 60% chance of still meditating when I write my next post on what I do to focus (I plan to do this roughly annually). Any lapse could be due to either changing my mind or failing to sustain the habit.

Diet: I see diet as closely linked to the ability to focus and be productive. I eat well. My diet might best be described as three parts Paleo, one part early agriculturalist, and 5% rubbish. It is mainly fruit (lots), vegetables, tubers, nuts, eggs (a dozen a week), meat, legumes and dairy (a lot of yogurt). I eat grains occasionally, largely in the form of rice (a few times a week) and porridge (once or twice a week). I'll eat bread maybe once or twice a month (I love hamburgers and eggs on toast). A heuristic I often fall back on is no processed grains, industrial seed oils or added sugar. There's some arbitrariness to it, but it works. Stephan Guyenet is my most trusted source on diet.

It's easy to stick to this diet because this is what is in my house. There are no cookies, ice cream or sugar-based snacks. I don't have to go down the aisles of the supermarket when shopping (although my groceries are normally home delivered). If I want to binge, rice crackers and toast are as exciting as I can find in the cupboard.

Exercise: As with diet, part of the productivity package. My major filter for choosing exercise is the desire to still be able to surf and get off the toilet when I'm 80. I surf a couple of times a week. Living within five minutes' walk of a beach with good surf is a basic lifestyle criterion.

I did Crossfit for a few years, but don’t live near a Crossfit gym at the moment. However, I don’t think Crossfit is a sustainable long-term approach – at least if I trained as regularly as expected in the gyms I have been to. The intensity would have me falling apart in old age.

That said, I still keep Crossfit elements to my exercise – heavy compound lifts once or twice a week, and a short high intensity burst around once a week (so I’m in the gym once to twice a week). I also walk a lot, including trying to get out of the office for a decent walk at lunch each day. While walking, I consume a lot of audiobooks and podcasts. I stretch for 10 to 15 minutes most days.

Michael Mauboussin’s More Than You Know: Finding Financial Wisdom in Unconventional Places

Michael Mauboussin’s message in More Than You Know: Finding Financial Wisdom in Unconventional Places is that we need an interdisciplinary toolkit to give us the diversity to make good decisions. This is not diversity in groups, but diversity in thinking. You need diverse cognitive tools to deal with diverse problems.

The book is a series of essays that Mauboussin wrote over a dozen or so years for a newsletter while he was at CSFB. Given his background in investment management, there is a heavy focus on investment decisions. However, the tools he discusses are relevant to most decision-making domains, whether as a manager, parent or employee.

Mauboussin draws his interdisciplinary tools from four main areas, around which the essays in the book are arranged.

The first set of essays, on investment philosophy, largely concern probabilistic thinking. Focus on the process, not the outcome. If you judge solely on results, you will be deterred from taking the risks necessary to make the right decision.

In this vein, don't set target prices for shares – a single estimate of how you believe a company will perform. Rather, provide a range of prices with associated probabilities. This allows you to invest knowing the downside probability, and to assess your choices in the knowledge that some decisions will have unfavourable outcomes.

One interesting thread to these essays is what amounts to a defence of Mauboussin's occupation, investment management. Many people (myself included) see investment management performance as largely the outcome of luck. Mauboussin argues for the presence of skill (in at least some cases), with long streaks requiring (to paraphrase Stephen Jay Gould) extraordinary luck imposed on great skill.

One limb of Mauboussin’s argument is the 15 consecutive years of market out-performance by Bill Miller, a fund manager at Legg Mason (where Mauboussin worked at the time the book was published). Getting 15 consecutive heads when tossing a coin is a one in 32,000 proposition. If your coin has only a 44% chance of coming up heads (the average probability of a fund outperforming the market over that stretch), a streak of 15 has a probability of one in 223,000. That number balloons to one in 2.3 million if you take the average probability of a fund beating the market in each individual year (in a couple of years less than 10% of funds beat the market).
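As a quick sanity check of those numbers, here is a minimal sketch that assumes, as the argument does, that each year is an independent coin flip:

```python
# Back-of-envelope check of the streak odds quoted above, assuming
# each year is an independent "coin flip" with the stated probability.

p_fair = 0.5 ** 15    # 15 straight heads with a fair coin
p_avg = 0.44 ** 15    # a coin that comes up heads 44% of the time

print(f"Fair coin: 1 in {1 / p_fair:,.0f}")   # 1 in 32,768
print(f"44% coin:  1 in {1 / p_avg:,.0f}")    # roughly 1 in 223,700
```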

Given these odds, Mauboussin argues that it is unlikely that Miller was effectively flipping a coin. Miller’s skills meant the odds were actually less daunting. Yes, he needed luck, but there needed to be skill underneath to realise the streak.

I'm not sure I buy this argument. This 15-year window is only one of many available, and there are many funds. (And I have just found this – someone doing the numbers and finding roughly 3-in-4 odds that some fund would achieve a 15-year streak at some time.)
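To give a flavour of that selection effect, here is a minimal Monte Carlo sketch. The fund count, horizon and win probability below are illustrative assumptions, not the figures from the linked analysis:

```python
# Monte Carlo sketch: with many funds and many overlapping 15-year
# windows, how often does *someone* record a streak? All parameters
# are illustrative assumptions. Slow but simple.
import random

def universe_has_streak(n_funds=1000, n_years=40, p_win=0.44, target=15):
    for _ in range(n_funds):
        run = 0
        for _ in range(n_years):
            run = run + 1 if random.random() < p_win else 0
            if run >= target:
                return True
    return False

trials = 500
hits = sum(universe_has_streak() for _ in range(trials))
print(f"P(at least one 15-year streak somewhere) ~ {hits / trials:.2f}")
```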

A contrast to the Miller story comes later in the book, when Mauboussin casts the trading success of Victor Niederhoffer in a different light. Niederhoffer averaged 35% per year returns from 1972 to 1996 (says Wikipedia), but this all came crashing down to nothing in 1997. He built another fortune, only to lose it in the global financial crisis. Mauboussin uses Niederhoffer's story as an example of the fat tails of asset price movements: a pattern of many small changes, and a small but larger-than-expected number of large changes. To use Nassim Taleb's framing, Niederhoffer was picking up pennies in front of a steamroller. (And on that point, Miller's record since his streak is not so great.)

The second set of essays draws on psychology. They lean partly on the heuristics-and-biases program of Daniel Kahneman and friends, but Mauboussin ranges over wider territory. He draws on the literature on animal behaviour, such as the herding of ants and the stress response of animals, and on the literature on naturalistic decision-making. He also has a keen appreciation that many of these decisions occur in systems, meaning that individual decision-making flaws don't necessarily lead to poor aggregate outcomes.

The third set of essays, on innovation and competition, has a game theory and evolutionary thread. The last set is on complexity, which contains both a warning about seeing cause and effect in complex systems, and a suggestion that some of the work in the complexity field gives us a lens for understanding the patterns we see.

Some of these essays deserve posts of their own, so I won't go into any in depth, except to make a general observation. I am a fan of interdisciplinary approaches to problems, but parts of the book, particularly these latter sections, hint at why they aren't adopted. You often get an interesting angle on an issue, but it is not clear what you should do differently.

Partly this is a result of the origins of the book. Each essay is around two thousand words (at a guess), so each gives a taste of a topic but little depth. One essay tends not to build on another.

That said, much of the advice is to effectively do nothing. That is valuable advice. If you want to read media accounts about share market moves, recognise that this is entertainment, not information. Disentangling cause and effect in a complex system like the share market is difficult, if not impossible, so stop telling stories.

Mauboussin also warns in the introduction that some ideas may not be useful right away. Some may never be useful. You are building a toolkit for future problems that you haven’t seen yet.

As a closing note, Mauboussin references many other popular science books. Given some of the essays must be 20 years old, I had not heard of most (which might say something about the longevity of popular science books). I've added a few to the reading list, but it will be interesting to see how they have held up through time.

Susan Cain’s Quiet: The Power of Introverts in a World That Can’t Stop Talking

I have mixed views about Susan Cain’s Quiet: The Power of Introverts in a World That Can’t Stop Talking.

Cain makes an important point that many of our environments, social structures and workplaces are unsuited to “introverts” (and possibly even humans in general). We could design more productive and inclusive workplaces, schools and organisations if we considered the spectrum of personality types who will work, live and learn in them.

On the flip side, Cain expanded the definition of introversion to include a host of positive attributes that wouldn’t normally (at least by me) be grouped with introversion. This led to a degree of cheer-leading for introverts that was somewhat off-putting (despite my own introverted nature). The last couple of chapters of the book also fall into evidence-free story-telling.

But to the good first. I enjoyed Cain’s filleting of open workplaces. Open plan workplaces or “activity-based working” are often dressed up as a means to seed creativity and collaboration, but they are more accurately described as a shift to lower floor space per employee to save costs. The evidence for increased collaboration or creativity is scant. Innovation may occur in teams, but it also requires quiet.

Cain suggests the trend toward these open workspaces is built on a misunderstanding of some of the classic examples of collaboration associated with the rise of the web. Yes, Linux and Wikipedia were built by teams, not individuals. But these people did not share offices or even countries. Regardless, the collaboration ideal was extended to our physical spaces.

Cain catalogues the research on the poor productivity in open workplaces. I had seen the following research before, but it is a great case study:

… DeMarco and his colleague Timothy Lister devised a study called the Coding War Games. The purpose of the games was to identify the characteristics of the best and worst computer programmers; more than six hundred developers from ninety-two different companies participated. Each designed, coded, and tested a program, working in his normal office space during business hours. Each participant was also assigned a partner from the same company. The partners worked separately, however, without any communication, a feature of the games that turned out to be critical.

When the results came in, they revealed an enormous performance gap. The best outperformed the worst by a 10:1 ratio. The top programmers were also about 2.5 times better than the median. When DeMarco and Lister tried to figure out what accounted for this astonishing range, the factors that you’d think would matter—such as years of experience, salary, even the time spent completing the work—had little correlation to outcome. Programmers with ten years’ experience did no better than those with two years. The half who performed above the median earned less than 10 percent more than the half below—even though they were almost twice as good. The programmers who turned in “zero-defect” work took slightly less, not more, time to complete the exercise than those who made mistakes.

It was a mystery with one intriguing clue: programmers from the same companies performed at more or less the same level, even though they hadn’t worked together. That’s because top performers overwhelmingly worked for companies that gave their workers the most privacy, personal space, control over their physical environments, and freedom from interruption. Sixty-two percent of the best performers said that their workspace was acceptably private, compared to only 19 percent of the worst performers; 76 percent of the worst performers but only 38 percent of the top performers said that people often interrupted them needlessly.

One pillar to the case for quiet spaces comes from how we build expertise. Work by Anders Ericsson, of deliberate practice fame, has identified studying alone or practicing in solitude as the prime way to gain skill. You need to be alone to engage in deliberate practice, as this allows you to go directly to the part that is challenging you. Open workspaces are a poor place to tackle challenging problems.

Cain also includes some interesting material on the extension of this “collaborative” space design to schooling. Children are increasingly schooled in pods as part of a shift to “cooperative learning”. We’re preparing children for the sub-optimal workplaces they are about to enter by replicating that sub-optimal environment in their schools. What is particularly problematic is that there is little opportunity in school to opt out, whereas adults have more opportunity to choose their workplace and shape their environment.

One thread in the book, which features in the opening, is that society is in the thrall of an “extrovert ideal”. Cain argues that we have become more interested in how people perceive us than the content of our character – a shift from a culture of character to one of personality. Self-help guides used to focus on concepts such as citizenship, duty, work, honour, morals, manners and integrity. They now focus on being magnetic, fascinating, attractive and energetic. Being quiet is now a problem.

This is particularly reflected in what we look for in leaders. People who talk more tend to be rated as more intelligent. Good presenters often get ahead. But talking more or presentation skills might be weak indicators of the actual capabilities you want in your leaders.

Cain briefly touches on the genetics of introversion. Unsurprisingly, as for every behavioural trait, introversion is heritable. Around 40% to 50% of the variation in introversion is due to differences in genes. Cain also hints at cross-racial differences in introversion, noting that the waves of migrants to a new continent would have carried the more extroverted traits of world travellers.

The least satisfying element of the book was Cain's definition of introvert. At times, Cain's definition seemed to expand to capture all that is good. Starting from a typical definition of being reserved, reflective or focused on one's inner life, her definition expands to include everyone who is thoughtful, cerebral, willing to listen to others, and immune to the pull of wealth and fame. Introverts are needed to save us from climate change. ("Without people like you, we will, quite literally, drown.") Extroverts, in contrast, are thoughtless risk seekers with no self-control. Extroverts caused the global financial crisis.

Cain does note her broad definition of introvert in an appendix to the book, A Note on the Words Introvert and Extrovert. It would have helped me a lot if this note had been at the front (or if I had realised it was there before reading the book). There she clarifies that she is not using the standard definition of introversion captured by the well-established Big 5 taxonomy. She states that she is extending introversion to include people with “a cerebral nature, a rich inner life, a strong conscience, some degree of anxiety (especially shyness), and a risk-averse nature”.

These traits would normally be considered to relate to the other Big 5 traits of openness, conscientiousness and neuroticism. This is particularly confusing, as in parts of the book she talks about the other Big 5 traits as separate concepts. Cain's definition also appears broader than that used by Carl Jung and in the Myers-Briggs test, which seem to be her foundation. (Although never explicitly endorsed, I get the feeling that Cain is a Myers-Briggs advocate.)

Once the definition is expanded to include these other dimensions, it is hard to see how one third of the population can be described as introverts. It also means that many parts of the book feel more an ode to conscientiousness, and possibly even intelligence, than to introversion.

This was most stark in the chapter on the differences between Asians and Americans. Cain attributes Asian achievement – such as high scores in international tests and their superior academic results – to the higher introversion of Asians. There is not one mention of the higher conscientiousness of East Asians, nor their higher IQ scores. Instead these seem bundled into the introvert basket of traits.

I also struggled with the final two substantive chapters of the book – on relationships and children. There Cain shifts from an approach generally built on research to one that is little more than storytelling. The chapters are full of unsourced statements or recommendations. For instance, she recommends that you gradually introduce your kids to new situations. This supposedly produces more confident kids than the alternatives of overprotection or pushing too hard, contrasting somewhat with the established literature on the lack of effect of parents.

*Disclosure of interest: Here are the percentiles for the last time I did a Big 5 test. I’m not far from Cain’s introvert ideal (possibly a touch low on neuroticism):

Openness: 88
Conscientiousness: 80
Extroversion: 29
Agreeableness: 51
Neuroticism: 48

Cass Sunstein and Reid Hastie’s Wiser: Getting Beyond Groupthink to Make Groups Smarter

Cass Sunstein and Reid Hastie's Wiser: Getting Beyond Groupthink to Make Groups Smarter is not an exciting read. However, it is a good catalogue of the group decision-making research (which leads this post to also be somewhat of a catalogue) and worth reading for an overview.

The book’s theme is that group decisions are often better than individual decisions, but that groups have weaknesses that can impair outcomes. Much of the analysis of failures in group decision-making follows a similar theme to the research into individual judgement and decision-making, in that the research has generated a long list of “biases” that groups are subject to. Most of the book, however, focuses on getting better decisions, and a lot of these (thankfully) don’t rest on identification of particular biases.

Two types of groups

Sunstein and Hastie look at two types of groups – statistical and deliberating groups.

In a statistical group, members give their inputs individually. Those inputs are then aggregated. Think voting (which works well as long as the majority is right).

There is no shortage of material about the wisdom of statistical groups. The classic example is Francis Galton's story of the crowd at a country fair guessing the weight of an ox: the average of the individual guesses was right on the mark.
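The mechanics are easy to demonstrate. Here is a minimal sketch with invented guesses scattered around the true weight of Galton's ox (1,198 pounds); the error distribution is my assumption:

```python
# Minimal illustration of a statistical group: individual guesses with
# independent errors average out. The error distribution is invented.
import random

true_weight = 1198                                 # Galton's ox, in pounds
guesses = [random.gauss(true_weight, 100) for _ in range(800)]

crowd = sum(guesses) / len(guesses)
mean_individual_error = sum(abs(g - true_weight) for g in guesses) / len(guesses)

print(f"crowd estimate: {crowd:.0f} (error {abs(crowd - true_weight):.1f})")
print(f"average individual error: {mean_individual_error:.1f}")
```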

In deliberating groups, individuals provide input during deliberations. Those inputs can affect and be affected by the inputs of other group members. People aim to influence others. People might change their minds.

Even if most members of a group have the wrong answer or belief, you can picture a scenario where reason and discussion allow the right answer to emerge. That is sometimes the case, but the evidence is that deliberating groups do not necessarily converge on the truth.

In one experiment, people answered questions individually before answering those same questions in groups. If the majority of the group knew the correct answer to a problem, the group’s decision was correct 79% of the time. (It’s impressive that the incorrect minority were able to derail the group 21% of the time.) If the majority of the group answered a question incorrectly when answering individually, the group converged on the right answer only 44% of the time. The result of this dynamic was that the average group decision was better, but only marginally so, than the average individual (66% versus 62%).
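A back-of-envelope implication of those figures: if m is the share of questions on which the majority was initially correct, then

$$0.79\,m + 0.44\,(1 - m) = 0.66 \quad\Rightarrow\quad m \approx 0.63$$

so the groups in this experiment apparently started with a correct majority on about 63% of questions (assuming the two reported conditional accuracies cover all cases).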

As a result, it may be easier to simply elicit people's individual views and average them (or combine them in some other novel way) than to go through the effort of a group discussion. A statistical group may be a more efficient solution.

Why deliberating groups go wrong (or right)

Why do we get results such as this? Sunstein and Hastie describe plenty of problems that can derail deliberating groups. Group decisions can be poor due to both the rational conduct of group members and because of their “biases”. Here are a few problems that can occur for “rational” reasons:

  • Informational signals: It is sensible to take into account what others have said in a group deliberation. If you know Jane is knowledgeable and has good judgment, hearing that she supports a project is evidence that can affect your support. But if she is wrong, she can derail the group. Seeing other people make errors can also provide “social proof” to an error.
  • Self-censorship: People tend not to volunteer information that contradicts their preferred outcome. In one study of over 500 mock jury trials, the experimenters never once observed someone sharing such information.
  • Reputational cascades: People might know what is right (or what they think is right), but they go along with the group or certain members of the group due to concern for their reputation or standing.

Then there are the “irrational” (a lot of these points are based on single studies, so take with a grain of salt):

  • Deliberating groups are more likely to escalate commitment to a failing course of action. They are also more susceptible to the sunk cost fallacy, the consideration of past costs that should be irrelevant to the decision about future action
  • Groups can amplify the representativeness heuristic, where we judge probability based on resemblance or similarity
  • People in deliberating groups have more unrealistic “overconfidence” (looking at the abstract of the paper cited for this point – I can’t access the full paper – I think they are talking about over-precision)
  • Groups are more vulnerable to framing effects, varying their decision based on how a choice is framed (although looking at the paper Sunstein and Hastie cite, it states that there is little consistency between studies)
  • Group deliberation can make both groups and the individuals in those groups more extreme
  • Shared information has a disproportionate effect on group members. If information is distributed so that key material is unshared (held by only a few group members), this can cause deliberating groups to perform worse.

That said, deliberating groups can temper some biases:

  • Groups tend to rely less on the availability heuristic – a heuristic by which we judge probability by how easily examples readily come to mind. The heuristic is tempered possibly because the group members have different memories. Across the group the available memories may be somewhat more realistic. That said, groups can be subject to availability cascades. An idea held by one person can spread through the group, eventually producing a widespread belief.
  • Groups have a lower tendency to anchor, the over-reliance on the first piece of information with which they are presented (even if it is irrelevant to the decision at hand)
  • Groups tend to have reduced hindsight bias, possibly because not everyone revises their views in the same way
  • Groups tend to have reduced egocentric biases, the belief that others think like you. A group typically has a wider set of tastes to draw on, so you are more likely to have someone point out that your tastes are not shared.

Improving deliberation

The most interesting part of the book is when Sunstein and Hastie turn to tactics to improve group decisions. There are two groups of tactics: those designed to improve deliberation, and alternative decision-making methods. A common thread is diversity, although this is "not necessarily along demographic lines, but in terms of ideas and perspectives."

They list eight ways to avoid problems in deliberating groups: (1) inquisitive and self-silencing leaders; (2) “priming” critical thinking (although we have seen how the priming literature is holding up); (3) rewarding group success (incentives are important, particularly to counter self-censorship and reputational cascades); (4) role assignment; (5) perspective changing; (6) devil’s advocates; (7) red teams; and (8) the Delphi method. A few are worth mentioning.

Role assignment involves giving people discrete roles, such as labelling someone as an “expert”. The purpose is to bring out unshared information by making it clear that the individual expert has a role to play.

Devil’s advocacy involves appointing some group members to deliberately advocate against the group’s inclinations. Sunstein and Hastie suggest that the research behind devil’s advocates is mixed. There is some evidence that devil’s advocacy can be helpful and can enhance group performance. But it requires genuine dissent. If the dissent is insincere (which is often the case if the role is assigned), people discount the dissent accordingly. The advocate also has little to gain by zealously challenging the dominant view. This means it may be better for groups to encourage real dissent.

Sunstein and Hastie are more optimistic about red teaming, the creation of a team tasked with criticising or defeating the preferred solution or plan. I can see how they might be occasionally useful, such as in mock trials, but it wasn’t clear where their optimism came from as they provided little evidence in support.

One option I find useful is the Delphi method. You ask people to state their opinions anonymously and independently before deliberation. These opinions are then made available to others. It is effectively a secret ballot plus reasons, and provides a basis for hidden information to emerge without reputational or informational cascades. Several rounds of this process can be held as the group converges on a solution. It’s a great way to flush out doubts and dissent.
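A minimal sketch of the process follows. The update rule (partial adjustment toward the anonymous group median) is my assumption for illustration, not anything prescribed by the book:

```python
# Delphi-style aggregation sketch: anonymous estimates, feedback of the
# group median, partial revision toward it, repeated over a few rounds.
import statistics

def delphi_round(estimates, pull=0.5):
    anchor = statistics.median(estimates)          # shared anonymous feedback
    return [e + pull * (anchor - e) for e in estimates]

estimates = [10, 12, 15, 40, 11]                   # one member is an outlier
for _ in range(3):
    estimates = delphi_round(estimates)

print([round(e, 1) for e in estimates])
print("group estimate:", round(statistics.median(estimates), 1))
```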

Better decisions without deliberation

Much of the book is dedicated to methods to arrive at good decisions outside of, rather than within, the deliberation process. These include design thinking (as a way of eliciting as much information and as many ideas as possible), cost-benefit analysis, asking the public (public comment or consultation), tournaments, prediction markets, and harnessing experts. Some of these are effectively statistical groups with different models for combining inputs.

Unsurprisingly given Sunstein's background, the authors are positive on cost-benefit analysis. Having seen some cost-benefit sausages being made for government decision-making, I don't quite share their optimism, but can see the benefits in the right place.

Sunstein and Hastie are also boosters of tournaments. The dispersion of competitors leads to independence in inputs. Their winner-take-all nature incentivises divergent strategies. They can promote elite performance at the top of competitors' capabilities.

A question not addressed in the book is the extent to which tournaments can scale into a widely used solution. There is a waste of resources inherent in tournaments – the input of the losing teams. A Kaggle competition uses a massive amount of data science capability, worth far more than the "prize". At the moment, many candidates are happy to invest this effort because there are other benefits, such as reputation. But could it be the standard way of doing things? For government tournaments, governments would want to pick the projects of most value to avoid over-stretching the resource.

As a tournament example, Sunstein and Hastie were underwhelmed by the IARPA prediction tournament, in which teams competed to predict political and economic events. They felt that the winning solution from the Good Judgment Project was more focused on reducing noise and bias than on developing game-changing methods that increase signal (tough crowd). (See my post on Superforecasting for more on that tournament.) Maybe the new hybrid forecasting tournament will be more to their liking.

The final technique I'll note is the effective harnessing of experts. This could mean using experts who draw on statistics to develop accurate predictions or make decisions (often in turn drawing on other sources). It could involve identifying fields where expert knowledge is genuine (as identified in the work of Gary Klein). When doing this, however, it is often best to look at statistical groups of experts rather than chase a single expert. The average of the experts is likely the best prediction. And there is no need to weight by an expert's confidence when forming that average – confidence has no correlation with accuracy.

——-

Postscript 1: Sunstein and Hastie explore the question of collective intelligence (the “c factor”). That deserves to be the subject of another post.

Postscript 2: Sunstein and Hastie talk of “eureka” problems, where the right answer is clear to all once announced. Groups are good at these. They give the “trivial” example of “Why are manhole covers round?” Because “if they were almost any other shape, a loose cover could shift orientation and fall through the hole, potentially causing damage and injuries.” Is that really the logic behind their design? Or is this just a benefit? (I ask not just because most manhole covers in Australia are square or rectangular, and I have never seen a cover fall through the hole.) This example is famous as being used in Microsoft job interviews, but it is a question more focused on making the interviewer feel clever than actually predicting, say, good job performance.

Some podcast recommendations

What I’ve been listening to recently:

  1. Shane Parrish’s blog Farnam Street is a favourite of mine. His podcast The Knowledge Project is also worth a listen. I recommend the episodes featuring Michael Mauboussin (1 and 2), Rory Sutherland (if you’ve seen Rory speak before, the half hour gap between Shane’s first attempt to wind up the conversation and the end of the episode will come as no surprise), Susan Cain (I’ll write a review of Quiet shortly), Adam Grant (I disagree with his perspective on the replication crisis) and Chris Voss (I recommend Voss’s book, Never Split the Difference, which I will also review soonish).
  2. I turned to Sam Harris’s podcast Waking Up after reading the book of the same title (which I need to read again if I am going to write anything about it). There are plenty of episodes worth listening to, including interviews with David Krakauer of the Santa Fe Institute, Stuart Russell on the threats of AI, Tristan Harris on what technology is doing to us, and Max Tegmark on the future of intelligence. I’ve generally avoided the episodes on politics, free speech and the culture wars.
  3. Russ Roberts's Econtalk is always worth listening to. I particularly enjoyed the episode with Tim O'Reilly. Here's one great section (in turn pulling from O'Reilly's book):

Russ Roberts: You say,

If you think with the 20th century factory mindset, you might believe that the tens of thousands of software engineers in companies like Google, Amazon, and Facebook spend their days grinding out products just like their industrial forebears, only today they are producing software rather than physical goods. If instead you step back and view these companies with a 21st century mindset, you realize that a large part of what they do–delivering search results, news and information, social network status updates, relevant products for purchase, drivers on demand–is done by software programs and algorithms. These programs are workers; and the programmers who create them are their managers. Each day, these managers take in feedback about their workers’ performance, as measured in real-time data from the marketplace. And if necessary, they give feedback to the workers in the form of minor tweaks and updates to the program or the algorithm.

End of quote. … And, as you point out a number of times in the book, and as you just said: It’s hard to talk about where the human and where the technology start and end. They are just intertwined. They are augmenting each other.

Tim O'Reilly: Yeah. And you pick a key word here, which is 'augmenting.' … just as the technology of the 18th, 19th, and 20th centuries was about augmenting our muscles, from the 20th into the 21st century we were really about augmenting our minds. And to augment is, in a word, to increase our capabilities.

  4. Frank Conway's Economic Rockstar. I've only listened to a couple of episodes, but the conversation with Greg Davies is excellent. After listening to the episode, watch the video below.

People should use their judgment … except they’re often lousy at it

My Behavioral Scientist article, Don't Touch The Computer, was in part a reaction to Andrew McAfee and Erik Brynjolfsson's book The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies. In particular, I felt their story of freestyle chess as an illustration of how humans and machines can work together was somewhat optimistic.

I have just read McAfee and Brynjolfsson’s Machine, Platform, Crowd: Harnessing Our Digital Future. Chapter 2, titled The Hardest Thing to Accept About Ourselves, runs a line somewhat closer to mine. Here are some snippets:

[L]et people develop and exercise their intuition and judgment in order to make smart decisions, while the computers take care of the math and record keeping. We’ve heard about and seen this division of labor between minds and machines so often that we call it the “standard partnership.”

The standard partnership is compelling, but sometimes it doesn’t work very well at all. Getting rid of human judgments altogether—even those from highly experienced and credentialed people—and relying solely on numbers plugged into formulas, often yields better results.

Here’s one example:

Sociology professor Chris Snijders used 5,200 computer equipment purchases by Dutch companies to build a mathematical model predicting adherence to budget, timeliness of delivery, and buyer satisfaction with each transaction. He then used this model to predict these outcomes for a different set of transactions taking place across several different industries, and also asked a group of purchasing managers in these sectors to do the same. Snijders’s model beat the managers, even the above-average ones. He also found that veteran managers did no better than newbies, and that, in general, managers did no better looking at transactions within their own industry than at distant ones.
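The flavour of that head-to-head comparison is easy to sketch. The data, features and noise levels below are hypothetical; the book does not detail Snijders's actual model:

```python
# Hypothetical sketch of a model-versus-manager comparison in the
# spirit of the Snijders study. Data and "true signal" are invented.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
weights = np.array([1.0, -0.5, 0.8, 0.0])          # true signal (invented)

X = rng.normal(size=(5200, 4))                     # transaction features
y = (X @ weights + rng.normal(0, 1, 5200) > 0).astype(int)  # e.g. on budget

X_train, y_train = X[:4000], y[:4000]
X_test, y_test = X[4000:], y[4000:]

model = LogisticRegression().fit(X_train, y_train)

# Stand-in for manager judgement: the same signal, read with more noise.
human = (X_test @ weights + rng.normal(0, 2.5, len(y_test)) > 0).astype(int)

print("model  :", accuracy_score(y_test, model.predict(X_test)))
print("manager:", accuracy_score(y_test, human))
```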

This is a general finding:

A team led by psychologist William Grove went through 50 years of literature looking for published, peer-reviewed examples of head-to-head comparisons of clinical and statistical prediction (that is, between the judgment of experienced, “expert” humans and a 100% data-driven approach) in the areas of psychology and medicine. They found 136 such studies, covering everything from prediction of IQ to diagnosis of heart disease. In 48% of them, there was no significant difference between the two; the experts, in other words, were on average no better than the formulas. A much bigger blow to the notion of human superiority in judgment came from the finding that in 46% of the studies considered, the human experts actually performed significantly worse than the numbers and formulas alone. This means that people were clearly superior in only 6% of cases. And the authors concluded that in almost all of the studies where humans did better, “the clinicians received more data than the mechanical prediction.”

Despite this victory, it seems a good idea to check the algorithm’s output.

In many cases … it’s a good idea to have a person check the computer’s decisions to make sure they make sense. Thomas Davenport, a longtime scholar of analytics and technology, calls this taking a “look out of the window.” The phrase is not simply an evocative metaphor. It was inspired by an airline pilot he met who described how he relied heavily on the plane’s instrumentation but found it essential to occasionally visually scan the skyline himself.

But …

As companies adopt this approach, though, they will need to be careful. Because we humans are so fond of our judgment, and so overconfident in it, many of us, if not most, will be too quick to override the computers, even when their answer is better. But Chris Snijders, who conducted the research on purchasing managers’ predictions highlighted earlier in the chapter, found that “what you usually see is [that] the judgment of the aided experts is somewhere in between the model and the unaided expert. So the experts get better if you give them the model. But still the model by itself performs better.”

So, measure which is best:

We support having humans in the loop for exactly the reasons that Meehl and Davenport described, but we also advocate that companies “keep score” whenever possible—that they track the accuracy of algorithmic decisions versus human decisions over time. If the human overrides do better than the baseline algorithm, things are working as they should. If not, things need to change, and the first step is to make people aware of their true success rate.
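A minimal sketch of what keeping score could look like, with an invented decision log:

```python
# Scorekeeping sketch: log the algorithm's call, the human's final call
# and the outcome, then compare hit rates on the overridden decisions.
# The log entries are invented for illustration.

log = [
    # (algorithm_said, human_did, outcome)
    (1, 1, 1), (0, 0, 0), (1, 0, 1), (0, 1, 0), (1, 1, 0), (0, 0, 1),
]

overrides = [row for row in log if row[0] != row[1]]
algo_right = sum(a == o for a, _, o in overrides)
human_right = sum(h == o for _, h, o in overrides)

print(f"{len(overrides)} overrides: algorithm right on {algo_right}, "
      f"human right on {human_right}")
```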

Accept the result will often be to defer to the algorithm:

Most of us have a lot of faith in human intuition, judgment, and decision-making ability, especially our own …. But the evidence on this subject is so clear as to be overwhelming: data-driven, System 2 decisions are better than those that arise out of our brains' blend of System 1 and System 2 in the majority of cases where both options exist. It's not that our decisions and judgment are worthless; it's that they can be improved on. The broad approaches we've seen here—letting algorithms and computer systems make the decisions, sometimes with human judgment as an input, and letting the people override them when appropriate—are ways to do this.

And from the chapter summary:

The evidence is overwhelming that, whenever the option is available, relying on data and algorithms alone usually leads to better decisions and forecasts than relying on the judgment of even experienced and “expert” humans.

Many decisions, judgments, and forecasts now made by humans should be turned over to algorithms. In some cases, people should remain in the loop to provide commonsense checks. In others, they should be taken out of the loop entirely.

In other cases, subjective human judgments should still be used, but in an inversion of the standard partnership: the judgments should be quantified and included in quantitative analyses.

Algorithms are far from perfect. If they are based on inaccurate or biased data, they will make inaccurate or biased decisions. These biases can be subtle and unintended. The criterion to apply is not whether the algorithms are flawless, but whether they outperform the available alternatives on the relevant metrics, and whether they can be improved over time.

As for the remainder of the book, I have mixed views. I enjoyed the chapters on machines. The four chapters on platforms and the first two on crowds were less interesting, and much could have been written five years ago (e.g. the stories on Wikipedia, Linux and two-sided platforms). The closing two chapters on crowds, which discussed decentralisation, incomplete contracts and the future of the firm, were excellent.

 

Philip Tetlock on messing with the algorithm

From an 80,000 hours podcast episode:

Robert Wiblin: Are you a super forecaster yourself?

Philip Tetlock: No. I could tell you a story about that. I actually thought I could be, I would be. So in the second year of the forecasting tournament – by which time I should've known enough to know this was a bad idea – I decided I would enter the forecasting competition and make my own forecasts. If I had simply done what the research literature tells me would've been the right thing, and looked at the best algorithm that distills the most recent forecast or the best forecast and then extremises as a function of the diversity of the views within – if I had simply followed that, I would've been the second-best forecaster out of all the superforecasters. I would have been like a super, super forecaster.

However, I insisted … What I did is I struck a kind of compromise. I didn't have as much time as I needed to research all the questions, so I deferred to the algorithms with moderate frequency, but I often tweaked them. I often said, they're not right about that, I'm going to tweak this here, I'm going to tweak this here. The net effect of all my tweaking effort was to move me from second place, which is where I would've been if I'd mindlessly adopted the algorithmic prediction, to about 35th place. So that was … I fell 33 positions thanks to the cognitive effort I devoted there.

Tetlock was tweaking an algorithm that is built on human inputs (forecasts), so this isn’t a lesson that we can leave decision-making to an algorithm. The humans are integral to the process. But it is yet another story of humans taking algorithmic outputs and making them worse.

The question of where we should simply hand over forecasting decisions to algorithms is being explored in a new IARPA tournament involving human, machine, and human-machine hybrid forecasters. It will create some interesting data on the boundaries of where each performs best – although the algorithm described by Tetlock above and used by the Good Judgment team suggests that even a largely human system will likely need statistical combination of forecasts to succeed.
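A minimal sketch of that kind of aggregation follows. The power-law extremising function and fixed exponent are common choices in the forecasting literature, used here as illustrative assumptions rather than the Good Judgment team's exact method:

```python
# Aggregate individual probability forecasts, then extremise the mean:
# push it toward 0 or 1 to offset the damping effect of averaging.

def extremise(p, a=2.5):
    """Power-law extremising; a > 1 pushes the aggregate toward 0 or 1."""
    return p**a / (p**a + (1 - p)**a)

forecasts = [0.65, 0.70, 0.75, 0.60, 0.80]     # hypothetical forecasts
mean = sum(forecasts) / len(forecasts)

print(f"mean {mean:.2f} -> extremised {extremise(mean):.2f}")  # 0.70 -> 0.89
```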

Robert Wiblin: [F]irst, you have a new crowdsourcing tournament going on now, don’t you, called Hybrid Mind?

Philip Tetlock: Well, I wouldn’t claim that it belongs to me. It belongs to IARPA, the Intelligence Advanced Research Projects Activity, which is the same operation and US intelligence community that ran the earlier forecasting tournament. The new one is called Hybrid Forecasting Competition, and it, I think, represents a very important new development in forecasting technology. It pits humans against machines against human-machine hybrids, and they’re looking actively for human volunteers.

So hybridforecasting.com is the place to go if you want to volunteer.

Well, there are a lot of unknowns. It may seem obvious that machines will have an advantage when you're dealing with complex quantitative problems. It would be very hard for humans to do better than machines when you're trying to forecast, say, patterns of economic growth in OECD countries, where you have very rich, pre-quantified time series, cross-sectional data sets, correlation matrices, lots of macro models. It's hard to imagine people doing much better than that, but it's not impossible, because the models often overfit.

Insofar as the better forecasters are aware of turbulence on the horizon and appropriately adjust their forecasts, they could even have an advantage on turf where we might assume machines would do better.

So there's a domain, I think, of questions where there's kind of a presumption among many people who observe these things that the machines have an advantage. Then there are questions where people sort of scratch their heads and say, how could the machines possibly do questions like this? Here they have in mind the sorts of questions that were posed – many of the questions that were posed, anyway – on the earlier IARPA forecasting tournament, the one that led to the discovery of superforecasters.

These are really hard questions about how long is the Syrian civil war going to last in 2012? Is the war going to last another six months or another 12 months? When the Swiss and French medical authorities do an autopsy on Yasser Arafat, will they discover polonium? It’s hard to imagine machines getting a lot of traction on many of these quite idiosyncratic context-specific questions where it’s very difficult to conjure any kind of meaningful statistical model.

Although, when I say it’s hard to construct those things, it doesn’t mean it’s impossible.

Finally, Robert Wiblin is a great interviewer. I recommend subscribing to the 80,000 hours podcast.

Michael Lewis’s The Undoing Project: A Friendship That Changed The World

My journey into understanding human decision making started when I read Michael Lewis's Moneyball in 2005. The punchline – which, as it turns out, has been known across numerous domains since at least the 1950s – is that "expert" judgement is often outperformed by simple statistical analysis.

A couple of years later I read Malcolm Gladwell's Blink and was diverted into the world of Gary Klein, which then led me to Kahneman and Tversky, among others. It was only then that I started to think about what it is that causes the experts to underperform. (For all his flaws, Gladwell is a great gateway to new ideas.)

In the opening to The Undoing Project: A Friendship That Changed The World, Lewis tells of a similar intellectual journey (although obviously with a somewhat closer connection to Moneyball):

[O]nce the dust had settled on the responses to my book [Moneyball], one of them remained more alive and relevant than the others: a review by a pair of academics, then both at the University of Chicago—an economist named Richard Thaler and a law professor named Cass Sunstein. Thaler and Sunstein’s piece, which appeared on August 31, 2003, in the New Republic, managed to be at once both generous and damning. The reviewers agreed that it was interesting that any market for professional athletes might be so screwed-up that a poor team like the Oakland A’s could beat most rich teams simply by exploiting the inefficiencies. But—they went on to say—the author of Moneyball did not seem to realize the deeper reason for the inefficiencies in the market for baseball players: They sprang directly from the inner workings of the human mind. The ways in which some baseball expert might misjudge baseball players—the ways in which any expert’s judgments might be warped by the expert’s own mind—had been described, years ago, by a pair of Israeli psychologists, Daniel Kahneman and Amos Tversky. My book wasn’t original. It was simply an illustration of ideas that had been floating around for decades and had yet to be fully appreciated by, among others, me.

Lewis realised that there was a deeper story to tell, with The Undoing Project the result.

I am increasingly of the view that a biography or autobiography is one of the more effective (although not always balanced) ways to lay out a set of ideas. Between The Undoing Project and Richard Thaler’s Misbehaving, a layperson would struggle to find a more accessible and interesting introduction to behavioural science and behavioural economics.

The first substantive chapter of The Undoing Project focuses on Daryl Morey, General Manager of the Houston Rockets. It felt like a Moneyball-style essay for which Lewis hadn't been able to find another use (although you can read this chapter on Slate). However, it is an interesting illustration of the idea that once you have the statistics in hand, it's still hard to eliminate the involvement of the human mind. For instance:

If he could never completely remove the human mind from his decision-making process, Daryl Morey had at least to be alive to its vulnerabilities. He now saw these everywhere he turned. One example: Before the draft, the Rockets would bring a player in with other players and put him through his paces on the court. How could you deny yourself the chance to watch him play? But while it was interesting for his talent evaluators to see a player in action, it was also, Morey began to realize, risky. A great shooter might have an off day; a great rebounder might get pushed around. If you were going to let everyone watch and judge, you also had to teach them not to place too much weight on what they were seeing. (Then why were they watching in the first place?) If a guy was a 90 percent free-throw shooter in college, for instance, it really didn’t matter if he missed six free throws in a row during the private workout.

Morey leaned on his staff to pay attention to the workouts but not allow whatever they saw to replace what they knew to be true. Still, a lot of people found it very hard to ignore the evidence of their own eyes. A few found the effort almost painful, as if they were being strapped to the mast to listen to the Sirens’ song. One day a scout came to Morey and said, “Daryl, I’ve done this long enough. I think we should stop having these workouts. Please, just stop doing them.” Morey said, Just try to keep what you are seeing in perspective. Just weight it really low. “And he says, ‘Daryl, I just can’t do it.’ It’s like a guy addicted to crack,” Morey said. “He can’t even get near it without it hurting him.”
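
Morey’s “weight it really low” advice has a natural Bayesian reading. A back-of-envelope sketch (the attempt counts are my assumptions, not from the book): a 90 per cent college free-throw shooter with 500 recorded attempts goes 0-for-6 in a private workout.

```python
# Back-of-envelope Bayesian update for the free-throw example above.
# Assumed numbers (mine, not the book's): a 90% shooter on 500 college
# attempts, then 0-for-6 in the private workout.
prior_makes, prior_misses = 450, 50      # Beta(450, 50) prior ~ 90% on 500 shots
workout_makes, workout_misses = 0, 6     # the cold workout

posterior_mean = (prior_makes + workout_makes) / (
    prior_makes + prior_misses + workout_makes + workout_misses
)
print(f"posterior estimate: {posterior_mean:.1%}")  # 88.9% -- barely moved
```

Six misses barely dent an estimate built on hundreds of shots, which is all “weight it really low” amounts to.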

I tend to have little interest in personal histories, so I found the following chapters leading up to Kahneman and Tversky’s collaboration less interesting. In part, this is because any attempt to understand someone’s achievements in the context of their upbringing is little more than storytelling.

But once the book hits the development of Kahneman and Tversky’s ideas – their work on the basic heuristics (availability, representativeness, anchoring), the development of prospect theory, their work on happiness – the sequential discussion of how these ideas were developed added some real understanding (for me). You can also see the care that went into developing their work, with a desire to create something that would stand the test of time rather than to generate a headline through a cute result.

One of the more interesting parts of the book near the close relates to Kahneman and Tversky’s interaction with Gerd Gigerenzer (who I have written about a fair bit). While Lewis’s characterisation of Gigerenzer as an “evolutionary psychologist” is wide of the mark, Lewis captures well the frustration that I imagine Kahneman and Tversky must have felt during some of the exchanges. Lewis writes:

[I]n Danny and Amos’s view he’d ignored the usual rules of intellectual warfare, distorting their work to make them sound even more fatalistic about their fellow man than they were. He also downplayed or ignored most of their evidence, and all of their strongest evidence. He did what critics sometimes do: He described the object of his scorn as he wished it to be rather than as it was. Then he debunked his description.

This debate is interesting enough that I’ll explore it in more detail in a future post.

Angela Duckworth’s Grit: The Power of Passion and Perseverance

In Grit: The Power of Passion and Perseverance, Angela Duckworth argues that outstanding achievement comes from a combination of passion – a focused approach to something you deeply care about – and perseverance – a resilience and desire to work hard. Duckworth calls this combination of passion and perseverance “grit”.

For Duckworth, grit is important as focused effort is required to both build skill and turn that skill into achievement. Talent plus effort leads to skill. Skill plus effort leads to achievement. Effort appears twice in the equation. If one expends that effort across too many domains (no focus through lack of passion), the necessary skills will not be developed and those skills won’t be translated into achievement.
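
Her formulation in the book is multiplicative; writing it out makes the double role of effort explicit:

```latex
\text{talent} \times \text{effort} = \text{skill}
\qquad
\text{skill} \times \text{effort} = \text{achievement}
% Substituting the first equation into the second:
\text{achievement} = \text{talent} \times \text{effort}^2
```

Holding talent fixed, achievement scales with the square of effort.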

While sounding almost obvious written this way, Duckworth’s claims go deeper. She argues that in many domains grit is more important than “talent” or intelligence. And she argues that we can increase people’s grit through the way we parent, educate, coach and manage.

Three articles from 2016 (in Slate, The New Yorker and npr) critiquing Grit and the associated research make a lot of the points that I would. But before turning to those articles and my thoughts, I will say that Duckworth appears to be one of the most open recipients of criticism in academia that I have come across. She readily concedes good arguments, and appears caught between her knowledge of the limitations of the research and the need to write or speak strongly enough to sell a book or give a TED talk.

That said, I am sympathetic with the Slate and npr critiques. Grit is not the best predictor of success. To the extent there is a difference between “grit” and the big five trait of conscientiousness, it is minor (making grit largely an old idea rebranded with a funkier name). A meta-analysis (working paper) by Marcus Credé, Michael Tynan and Peter Harms makes this case (and forms the basis of the npr piece).

Also critiqued in the npr article is Duckworth’s example of grittier cadets being more likely to make it through the seven-week West Point training program Beast Barracks, which features in the book’s opening. As she states, “Grit turned out to be an astoundingly reliable predictor of who made it through and who did not.”

The West Point research comes from two papers by Duckworth and colleagues from 2007 (pdf) and 2009 (pdf). The difference in drop-out rate is framed as rather large in the 2009 article:

“Cadets who scored a standard deviation higher than average on the Grit-S were 99% more likely to complete summer training”

But to report the results another way, 95% of all cadets made it through, and 98% of the top quartile in grit stayed. As Marcus Credé states in the npr article, there is only a three percentage point difference between the average drop-out rate and that of the grittiest cadets. Alternatively, consider that 88% of the bottom quartile made it through – a decent success rate for these low-grit cadets. (The number reported in the paper references the change in odds, which is not the way most people would interpret that sentence. But, Duckworth being a great recipient of criticism, she concedes in the npr article that she should have put it another way.)
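
To see how “99% more likely” in odds terms coexists with a gap of only a few percentage points, here is the conversion (a minimal sketch; the ~1.99 odds ratio and 95% baseline are the figures above):

```python
# Converting the paper's reported odds ratio into completion probabilities.
# Figures from the discussion above: ~95% baseline completion, and an
# odds ratio of ~1.99 per standard deviation of grit.

def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)

def prob(o):
    """Convert odds back to a probability."""
    return o / (1 + o)

baseline = 0.95
odds_ratio = 1.99

gritty = prob(odds(baseline) * odds_ratio)  # completion at +1 SD of grit
print(f"baseline: {baseline:.1%}, +1 SD grit: {gritty:.1%}")
# baseline: 95.0%, +1 SD grit: 97.4% -- a 2.4 percentage point difference
```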

Having said this, I am sympathetic to the argument that there is something here that West Point could benefit from. If low grit were the underlying cause of cadet drop-outs, reducing the drop-out rate of the least gritty half to that of the top half could cut the overall drop-out rate by more than 50%. If they found a way of doing this (which I am more sceptical about), it could be a worthwhile investment.
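
The arithmetic behind that 50 per cent figure, with assumed half-specific rates consistent with the quartile numbers above:

```python
# What-if for the paragraph above. The half-specific rates are assumptions,
# chosen to be consistent with the quoted figures (5% drop-out overall,
# 2% in the top grit quartile): say the top half drops out at 2% and the
# bottom half at 8%.
top_half, bottom_half = 0.02, 0.08
overall = (top_half + bottom_half) / 2      # 5% drop-out overall
if_fixed = top_half                         # bottom half raised to the top half's rate
print(f"cut in drop-out rate: {1 - if_fixed / overall:.0%}")  # 60%
```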

One thing that I haven’t been able to determine from the two papers with the West Point analysis is the distribution of grit scores for the West Point cadets. Are they gritty relative to the rest of the population? In Duckworth’s other grit studies, the already high achievers (spelling bee contestants, Stanford students, etc.) look a lot like the rest of us. Why does it apparently take no special grit to enter domains that many people would already consider success? Is this the same for West Point?

Possibly the biggest question I have about the West Point study is why people drop out. As Duckworth talks about later in the book (repeatedly), there is a need to engage in search to find the thing you are passionate about. Detours are to be expected. When setting top-level goals, don’t be afraid to erase an answer that isn’t working out. Finishing what you begin could be a way to miss opportunities. Be consistent over time, but first find a thing to be consistent with. If your mid-level goals are not aligned with your top level objective, abandon them. And so on. Many of the “grit paragons” that Duckworth interviewed for her book explored many different avenues before settling on the one that consumes them.

So, are the West Point drop-outs leaving because of low grit, or are they shifting to the next phase of their search? If we find them later in life (at a point of success), will they then score higher on grit because they have found something they are passionate about and wish to persevere with? How much of the grit paragons’ high scores is down to their having already succeeded in their search? To what extent is grit simply a reflection of current circumstances?

One of the more interesting sections of the book addresses whether there are limits to what we can achieve due to talent. Duckworth’s major point is that we are so far from whatever limits we have that they are irrelevant.

On the one hand, that is clearly right – in almost every domain, people could improve through persistent effort (and deliberate practice). But another consideration is where a person’s limits lie relative to the degree of skill required to achieve their goals. I am a long way from my limits as a tennis player, but my limits are well short of those required to ever make a living from it.

Following from this, Duckworth is of the view that people should follow their passion and argues against the common advice that following your passion is the path to poverty. I’m with Cal Newport on this one, and think that “follow your passion” is horrible advice. If you don’t have anything of value to offer related to your passion, you likely won’t succeed.

The evidence behind Duckworth’s argument is mixed. She notes that people are more satisfied with their jobs when those jobs match a personal interest, but this is not evidence that people who set out to find a job matching their interest end up more satisfied – where are those who failed? Duckworth also notes that these people perform better, but again, what is the aggregate outcome for all the people who started out with this goal?
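
The selection problem is easy to see in a toy simulation (all numbers invented): even if following your passion halves your chance of landing a job at all, the survivors we survey can still look more satisfied.

```python
# Toy simulation of the selection problem above (all numbers invented).
# Following your passion is assumed to *halve* your chance of landing a
# job, but to raise satisfaction if you do land one. Surveying only the
# employed then makes "follow your passion" look like good advice.
import random

random.seed(1)
people = [{"passion": random.random() < 0.5} for _ in range(100_000)]
for p in people:
    p["employed"] = random.random() < (0.3 if p["passion"] else 0.6)
    p["satisfied"] = p["employed"] and random.random() < (0.8 if p["passion"] else 0.5)

survivors = [p for p in people if p["employed"]]
for followed in (True, False):
    group = [p for p in survivors if p["passion"] == followed]
    rate = sum(p["satisfied"] for p in group) / len(group)
    print(f"followed passion: {followed}, satisfied: {rate:.0%}")
# Among survivors, passion-followers report ~80% satisfaction versus ~50%,
# yet only 30% of passion-followers landed a job at all, versus 60%.
```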

One chapter concerns parenting. Duckworth concedes that the evidence here is thin and incomplete, and that there are no randomised controlled trials. But she then suggests that she doesn’t have time to wait for the data to come in (which I suppose you don’t if you are already raising children).

She cites research on supportive versus demanding parenting, derived from measures such as surveys of students. These show that students with more demanding parents get higher grades. Similarly, research on world-class performers shows that their parents are models of work ethic. The next chapter reports a positive relationship between extracurricular activities at school and job outcomes, particularly where students stick with the same activity for two or more years (i.e. have consistent parents).

But Duckworth does not address the typical problem of studies in this domain – they all ignore biology. Do the students receive higher grades because their parents are more demanding, or because they are the genetic descendants of two demanding people? Are they world-class performers because their parents model a work ethic, or because they have inherited a work ethic? Are they consistent with their extracurricular activities because their parents consistently keep them at it, or because they are the type of people likely to be consistent?

These questions might appear to be mere speculation, but the large catalogue of twin, adoption and, now, genetic studies points to the answers. To the degree children resemble their parents, the resemblance is largely genetic. The effect of the shared environment – i.e. parenting – is low (and in many studies zero). That is not to say interventions cannot be developed. But they are not reflected in the variation in parenting that is the subject of these studies.

Duckworth does briefly turn to genetics when making her case for the ability to change someone’s grit. Like a lot of other behavioural traits, the heritability of grit is moderate: 37% for perseverance, 20% for passion (the study referenced is here). Grit is not set in stone, so Duckworth takes this as a case for the effect of environment.

However, a heritability of less than one provides little evidence that deliberate changes in environment can change a trait. The same study that found moderate heritability also found no effect of the shared environment (e.g. parenting). The evidence for influence is thin.
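
For context, these estimates come from the standard twin-study (ACE) variance decomposition. A sketch, where the C ≈ 0 value is the no-shared-environment result just mentioned:

```latex
\mathrm{Var}(\text{trait}) = A + C + E, \qquad h^2 = \frac{A}{A + C + E}
% Perseverance: h^2 = 0.37, \quad C \approx 0 \quad\Rightarrow\quad E \approx 0.63
```

The residual E bundles non-shared environment with measurement error – not the systematic differences in parenting that these studies measure.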

Finally, Duckworth cites the Flynn effect as evidence of the malleability of IQ – and of how similar effects could play out with grit – but she does not reference the extended trail of failed interventions designed to increase IQ (although a recent meta-analysis shows some effect of education). I can understand Duckworth’s aims, but the literature in support of them seems somewhat thin.

Other random points or thoughts:

  • As for any book that contains colourful stories of success linked to the recipe it is selling, the stories of the grit paragons smack of survivorship bias. Maybe the coach of the Seattle Seahawks pushes toward a gritty culture, but I’m not sure the other NFL teams go and get ice-cream every time training gets tough. Jamie Dimon, CEO of JP Morgan, is praised for the $5 billion profit JP Morgan made through the GFC (let’s skate over the $13 billion in fines). How would another CEO have gone?
  • Do those with higher grit display a higher level of sunk cost fallacy, being unwilling to let go?
  • Interesting study – Tsay and Banaji, Naturals and strivers: Preferences and beliefs about sources of achievement. The abstract:

To understand how talent and achievement are perceived, three experiments compared the assessments of “naturals” and “strivers.” Professional musicians learned about two pianists, equal in achievement but who varied in the source of achievement: the “natural” with early evidence of high innate ability, versus the “striver” with early evidence of high motivation and perseverance (Experiment 1). Although musicians reported the strong belief that strivers will achieve over naturals, their preferences and beliefs showed the reverse pattern: they judged the natural performer to be more talented, more likely to succeed, and more hirable than the striver. In Experiment 2, this “naturalness bias” was observed again in experts but not in nonexperts, and replicated in a between-subjects design in Experiment 3. Together, these experiments show a bias favoring naturals over strivers even when the achievement is equal, and a dissociation between stated beliefs about achievement and actual choices in expert decision-makers.

  • A follow-up study generalised the naturals and strivers research to some other domains.
  • Duckworth reports on the genius research of Catharine Cox, in which Cox examined 300 eminent people and attempted to determine what made them geniuses. All 300 had an IQ above 100. The average IQ of the top 10 (ranked by eminence) was 146. The average of the bottom 10 was 143. Duckworth points to the trivial link between IQ and ranking within that 300, with the substantive differentiator being level of persistence. But note those average IQ scores…