The heritability debate, again

Like the levels of selection debate, the debate about what heritability means has a life of its own. The latest shot comes from Scott Barry Kaufman, who argues (among other things) that:

The heritability of a trait can vary from 0.00 to 1.00, depending on the environments from which research participants are sampled. Because we know that genes play some role in the development of any trait, the precise heritability estimate doesn’t matter in a practical sense.

Heritability depends on the amount of variability in the environmental factors that contribute to a trait. The problem is that our understanding of the factors that contribute to the development of human traits in general — and to IQ in particular — is currently so deficient that we typically do not know if the environmental factors important in the development of a particular trait are stable across testing situations, vary somewhat across those situations, or vary wildly across those situations.

In his conclusion he states:

At the very least, heritability tells us how much of the variation in IQ can be accounted for by variation in genetic factors when development occurs in an exquisitely specific range of environments. However, David S. Moore has argued that even this is not significant when we realize that the magnitude of any heritability statistic reflects the extent of variation in unidentified non-genetic factors that contribute to the development of the trait in question.

(HT: Bryan Caplan)

Through his post, Kaufman constructs a series of paper tigers, tears them down and implies that because the extreme case does not hold, we should be wary of heritability estimates. I did not find much to disagree with in his examples, but I differed on the conclusions we should draw.

So, where I do not agree – first, the heritability estimate does matter. While I don’t think it is hugely important whether the heritability of IQ in a specific sample is 0.5 or 0.6, it is important whether the measured heritability is 0 or 0.6. As Caplan notes in his post:

My money says, for example, that the average adult IQ heritability estimate published in 2020 will exceed .5.

I think that Caplan is right (although I might have stated some conditions about the relevant sample), and Kaufman’s argument overstates how finely tuned the environment needs to be to get a meaningful heritability estimate. Heritability estimates of a sample of children growing up in extreme poverty might be much lower (or zero) but as is found again and again, once the basic requirements of a child are met, heritability estimates for IQ are consistently above 0.4. We can construct arguments that in each study there are different gene-environment interactions and so on, but if genes weren’t important in variation in IQ and the gene-environment interactions weren’t consistent to some degree, why would such consistent heritability results (and correlation between parent and child IQ) be found?
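The dependence of a heritability estimate on the sampled environments can be seen in a toy calculation. This is only a sketch under a simple additive model with no gene-environment interaction or correlation, where heritability is the genetic share of total variance. The function name and the variance figures are made up for illustration and do not come from any study.

```python
def heritability(genetic_variance, environmental_variance):
    """Genetic share of total trait variance in a purely additive toy model."""
    return genetic_variance / (genetic_variance + environmental_variance)

# Hypothetical numbers: the genetic variance is held fixed at 1.0 while the
# environmental variance differs between the samples being studied.
samples = [
    ("near-identical environments", 0.0),
    ("moderately varied environments", 1.0),
    ("widely varied environments", 9.0),
]
for label, env_variance in samples:
    print(label, round(heritability(1.0, env_variance), 2))
```

The same genes give an estimate anywhere between 0.1 and 1.0 here, which is Kaufman's point. The empirical question is how much the relevant environmental variance actually differs across the samples that get studied.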

Further, these results matter. They suggest that poverty is affecting the IQ of some children, and policies could be tailored to cut this disadvantage. For children not subject to deficient environments, the high heritability of IQ should influence policies such as those for education. Children are different and the education system should take this into account.

Implicit in Kaufman’s post was the “it’s all too complex” argument. Social and biological sciences are complex (which is why I find them interesting). However, if we fully accepted Kaufman’s argument that “our understanding of the factors that contribute to the development of human traits … is currently so deficient that we typically do not know if the environmental factors important in the development of a particular trait are stable across testing situations”, it would put into question most of the data analysis in economics, sociology and biology. Econometrics operates on the idea of all other things being equal.

Fortunately, Kaufman has not taken the Gladwell-esque approach of suggesting that we forget about genetic factors. Kaufman suggests further research into how nature and nurture are intertwined. If it is all too complex, we should start unwinding the complexity. However, I believe that, in the meantime, this complexity does not mean that we should throw out all the results that have previously been obtained.

Micromotives and macrobehavior

In a post a couple of months ago as part of a debate on complexity in aid, I recommended Thomas Schelling’s Micromotives and Macrobehavior as a good starting point for understanding complexity science. The book predates a lot of the language associated with complexity science (in fact, I don’t think it uses the word complexity at all), but it provides an excellent illustration of some of the basic tenets of complexity science. These include that complex behaviour can emerge from simple mathematical models and that individual actions can result in aggregate outcomes that do not reflect the individual intentions.

Having recommended but not read the book in over 10 years, I re-read it and it triggered a couple of thoughts. The first is that my recollection of the great illustrations provided by Schelling was correct, although towards the end of the book they can feel a touch laboured as he goes over a few more examples than were needed for me to get the point. Still, some of his classic discussions, such as that on the emergence of segregation despite people having only moderate preferences as to their neighbours, provide real insight. The use of examples like these is typical of a lot of books on complex adaptive systems (such as Miller and Page’s Complex Adaptive Systems), although I rate Schelling’s as the most interesting, and probably the most accessible to someone not familiar with the area.
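For readers who have not seen the segregation example, here is a rough sketch of the kind of model involved. It is not Schelling's own specification: the wrapped grid, the eight-cell neighbourhood, the 35 per cent threshold and the rule that unhappy agents jump to a random empty cell are all assumptions chosen for brevity.

```python
import random

def neighbours(grid, r, c):
    """Occupants of the eight cells around (r, c) on a grid that wraps at the edges."""
    n = len(grid)
    return [grid[(r + dr) % n][(c + dc) % n]
            for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]

def same_type_share(grid, r, c):
    """Share of occupied neighbouring cells holding the same type as the agent at (r, c)."""
    occupied = [x for x in neighbours(grid, r, c) if x is not None]
    if not occupied:
        return 1.0  # an agent with no neighbours is treated as content
    return sum(x == grid[r][c] for x in occupied) / len(occupied)

def mean_similarity(grid):
    """Average same-type share across all agents: a crude segregation measure."""
    n = len(grid)
    shares = [same_type_share(grid, r, c)
              for r in range(n) for c in range(n) if grid[r][c] is not None]
    return sum(shares) / len(shares)

def run_schelling(size=20, empty_share=0.1, threshold=0.35, sweeps=50, seed=1):
    """Return the segregation measure before and after the agents relocate."""
    rng = random.Random(seed)
    cells = size * size
    n_empty = int(cells * empty_share)
    n_a = (cells - n_empty) // 2
    agents = ["A"] * n_a + ["B"] * (cells - n_empty - n_a) + [None] * n_empty
    rng.shuffle(agents)
    grid = [agents[i * size:(i + 1) * size] for i in range(size)]
    before = mean_similarity(grid)
    for _ in range(sweeps):
        # Agents are unhappy if fewer than `threshold` of their neighbours match them.
        unhappy = [(r, c) for r in range(size) for c in range(size)
                   if grid[r][c] is not None
                   and same_type_share(grid, r, c) < threshold]
        if not unhappy:
            break
        rng.shuffle(unhappy)
        for r, c in unhappy:
            # Each agent flagged at the start of the sweep jumps to a random empty cell.
            empties = [(i, j) for i in range(size) for j in range(size)
                       if grid[i][j] is None]
            i, j = rng.choice(empties)
            grid[i][j], grid[r][c] = grid[r][c], None
    return before, mean_similarity(grid)

before, after = run_schelling()
print(round(before, 2), round(after, 2))
```

No agent in this sketch wants a majority of like neighbours, yet the average share of like neighbours ends up well above where it started. That gap between individual motive and aggregate outcome is the point of the example.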

On finishing the book, I found I was asking the same question I always seem to ask after reading a book on complex adaptive systems and agent based modelling (one of the common ways of modelling complex systems). To what extent could complexity science or agent based modelling shed new light on economic policy questions or macroeconomics? The books always give a host of interesting examples that may change the way you think about some phenomena, and often advocate broader use of complexity science, but they are vague on what future uses might be. There seem to be more people interested in the methodology of complexity science and agent based modelling than there are people interested in using it as a tool. I’ve taken a small stake in this question by exploring the use of agent based models in my research as a way of testing the robustness of my models, exploring some extensions (such as adding a spatial element) and allowing me to loosen some assumptions. However, I am not convinced about how much value I am likely to extract.

On that note, I’d be happy to receive recommendations of good books that discuss complexity science (or agent based modelling) being applied to an area of economic interest and ideally, showing how it leads to some new insights. And by insights, I am hoping for more than “complexity science shows us it is all too complex” or “we need policies to deal with the system’s complexity”. I have plenty of books I can refer someone to when they wish to know what complexity science is, but none which I am happy to recommend as demonstrating what it is good for.

Income and IQ

As I noted in my recent post on Malcolm Gladwell’s Outliers, Gladwell ignored the possibility that traits with a genetic component, other than IQ, might play a role in determining success. His approach reminded me of a useful paper by Samuel Bowles and Herbert Gintis from 2002 on the inheritance of inequality. Bowles and Gintis sought to explain the observed correlation between parental and child income (a correlation of around 0.4) by examining IQ, other genetic factors, environment, race and schooling.

As an example of the consequences of the transmission of income, Bowles and Gintis cited a paper by Hertz which showed that a son born to someone in the top decile of income had a 22.9 per cent chance of attaining that decile himself, compared to a 1.3 per cent chance for someone born to parents in the bottom decile. Conversely, a child born to parents in the top decile had only a 2.4 per cent chance of finishing in the lowest decile compared to over 31.2 per cent for those born to bottom decile parents.

As Gladwell did, Bowles and Gintis started their examination with IQ. To calculate the inheritance of income through genetically inherited IQ, Bowles and Gintis considered the correlation between parent IQ and income, the heritability of IQ from parent to child and the correlation between IQ and income for the child. Breaking this down, Bowles and Gintis used the following steps and estimates:

1. The correlation between parental income and IQ is 0.266.

2. If the parents’ genotypes are uncorrelated, the genetic correlation between the genotype of the parents and of the child is 0.5. This can be increased with assortative mating (people pairing with people more like themselves) to a maximum of one (clones mating). Bowles and Gintis use 0.6.

3. The heritability of IQ is 0.5.

4. The correlation between child income and IQ is 0.266.

Multiplying these four numbers together gives the intergenerational correlation of income due to genetically based transmission of IQ. I think there is a mistake in the calculations used by Bowles and Gintis, as they find an intergenerational correlation of 0.01, where I calculated 0.02 (0.266 × 0.6 × 0.5 × 0.266 ≈ 0.021). This leads to genetically inherited IQ variation explaining 5.3 per cent of the observed intergenerational correlation in income. Regardless of the error, this is a low proportion of the income heritability. (After I wrote this post I did a Google search to find if someone had spotted this error before – and they had – on an earlier Gene Expression post on this same paper.)

I would have used some slightly higher numbers, but even pushing the numbers to the edges of feasible estimates, such as increasing the correlation between income and IQ to 0.4, the heritability of IQ to 0.8 and the degree of assortative mating so that the parent-child genotype correlation is 0.8, only yields an intergenerational correlation of 0.10. Genetically inherited IQ would then account for approximately 26 per cent of the observed intergenerational correlation.
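The arithmetic in the steps above is easy to check. The short sketch below multiplies the four numbers for both the Bowles and Gintis inputs and the higher inputs, then divides by the observed parent-child income correlation of 0.4. The function and variable names are my own, not from the paper.

```python
OBSERVED_INCOME_CORRELATION = 0.4  # parent-child income correlation cited above

def iq_channel(parent_income_iq, genotype_corr, iq_heritability, child_income_iq):
    """Intergenerational income correlation running through genetically inherited IQ."""
    return parent_income_iq * genotype_corr * iq_heritability * child_income_iq

scenarios = {
    "Bowles and Gintis inputs": iq_channel(0.266, 0.6, 0.5, 0.266),
    "higher inputs": iq_channel(0.4, 0.8, 0.8, 0.4),
}
for label, correlation in scenarios.items():
    share = correlation / OBSERVED_INCOME_CORRELATION
    print(label, round(correlation, 3), round(share, 3))
# Bowles and Gintis inputs 0.021 0.053
# higher inputs 0.102 0.256
```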

Unlike Gladwell, Bowles and Gintis then asked what role other genetic factors may play. By using twin studies, which provide an estimate of the degree of heritability of income (using the difference in correlation between fraternal and identical twins) and the degree of common environments of each type of twin, Bowles and Gintis estimated that genetic factors explain almost a third (0.12) of the 0.4 correlation between parent and child income. Loosening their assumptions on the degree of shared environments by identical twins compared to fraternal twins (i.e. assuming near identical environments for both identical and fraternal twins) can generate a higher estimate of the genetic basis of almost three-quarters of the variability in income.

From this, it seems that genetic inheritance plays an important role in income transmission between generations. The obvious question is what these factors might be. I expect that patience or ability to delay gratification must play a role, although I would expect that there would be a broad suite of relevant personality traits. I would also expect that appearance and physical features would be relevant. Bowles and Gintis do not take their analysis to this point.

The authors finished their analysis with some consideration of other factors, and conclude that race, wealth and schooling are more important than IQ as a transmission mechanism of income across generations (although as the authors noted, they may have overestimated the importance of race by not including a measure of cognitive performance in the regression). That conclusion may be fair, but as they had already noted, there is a substantial unexplained genetic component.

This highlights the paper’s limitation, as once the specific idea that heritability of IQ is a substantial cause of intergenerational income inequality has been dented, the identification of other (but unknown) genetic factors leaves open a raft of questions about income heritability. Using Bowles and Gintis’s conservative estimates, we still have 25 per cent of income heritability being put down to genetic factors other than IQ without any understanding of what these traits are and the extent of the role they play.

In their conclusion, Bowles and Gintis touch on whether policy interventions might be based on these results. They are somewhat vague in their recommendations, but suggest that rather than seeking zero intergenerational correlation, interventions should target correlations that are considered unfair. They suggest, as examples, that there are large majorities supporting compensation for inherited disabilities while intervention for good looks is not appropriate.

One thing I find interesting in an analysis of heritability such as this is that over a long enough time horizon, to the extent that someone with a trait has a fitness advantage (or disadvantage), the gene(s) behind the trait will move to fixation (or be eliminated) as long as heritability is not zero. The degree of heritability is relevant only to the rate at which this occurs and only in a short-term context. The obvious question then becomes (which is beside the point of this post) whether IQ (through income or not) currently yields a fitness advantage. Over a long enough time period, variation will tend to eliminate itself and Bowles and Gintis would be unable to find any evidence of IQ heritability affecting income across generations.

Evolution and irrationality

In a classic behavioural economics story, research participants are offered the choice between one bottle of wine a month from now and two bottles of wine one month and one day from now (alternatively, substitute cake, money or some other pay-off for wine). Most people will choose the two bottles of wine. However, when offered one bottle of wine straight away, more people will take that bottle and not wait until the next day to take up the alternative of two bottles. This suggests that people discount the value of goods received after short delays at a higher rate than they do for longer delays.

This set of decisions could be argued to be irrational. To understand why, suppose you face the first set of choices for one or two bottles of wine in 30 or 31 days. You choose the two bottles. Then, on the 30th day, you are allowed to reconsider your decision, which is effectively making the choice in the second scenario above. Some people will change their mind and take the single bottle of wine. Why would they make one decision at one point of time and then change their mind later? This preference reversal is a result of what is called time inconsistency, which some consider to be evidence of irrationality.
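The reversal can be reproduced with a small sketch that compares a constant-rate (exponential) discounter with a hyperbolic one. The functional forms are the standard textbook ones, but the parameter values (a daily factor of 0.4 and k = 1.5 per day) are arbitrary assumptions picked only to make the contrast visible.

```python
def exponential(delay_days, delta=0.4):
    """Constant-rate discounting: each extra day multiplies value by delta."""
    return delta ** delay_days

def hyperbolic(delay_days, k=1.5):
    """Hyperbolic discounting: steep at short delays, shallow at long ones."""
    return 1.0 / (1.0 + k * delay_days)

def choice(discount, delay_days):
    """Choose between 1 bottle after delay_days and 2 bottles one day later."""
    sooner = 1 * discount(delay_days)
    later = 2 * discount(delay_days + 1)
    return "one bottle sooner" if sooner > later else "two bottles later"

for name, discount in [("exponential", exponential), ("hyperbolic", hyperbolic)]:
    print(name, "| 30 days out:", choice(discount, 30), "| today:", choice(discount, 0))
# exponential | 30 days out: one bottle sooner | today: one bottle sooner
# hyperbolic | 30 days out: two bottles later | today: one bottle sooner
```

The exponential discounter makes the same choice at both horizons because the ratio between the two options never changes. Only the hyperbolic discounter switches once the first bottle becomes immediate.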

While the evidence of time inconsistent behaviour has grown, evolutionary explanations of how rates of time preference could have evolved generally do not generate these preference reversals. Time preference is consistent as any genes that increase an individual’s predisposition to have irrational decision rules should be progressively eliminated from the population. In most papers on time preference, such as those by Hansson and Stuart, Rogers, and Robson and Samuelson, decisions are time consistent.

One useful paper in this area is by Peter Sozou, who seeks to offer a basis for this behaviour, which could be applied in an evolutionary context. The model in the paper matched the intuition I have in my head, so it is nice to come across a paper that formalises the concept.

Sozou’s idea is that uncertainty as to the nature of any underlying hazards can explain time inconsistent preferences. Suppose there is a hazard that may prevent the pay-off from being realised. This would provide a basis (beyond impatience) for discounting a pay-off in the future. But suppose further that you do not know what the specific probability of that hazard being realised is (although you know the probability distribution). What is the proper discount rate?

Sozou shows that as time passes, one can update their estimate of the probability of the underlying hazard. If after a week the hazard has not occurred, this would suggest that the probability of the hazard is not very high, which would allow the person to reduce the rate at which they discount the pay-off. When offered a choice of one or two bottles of wine 30 or 31 days into the future, the person applies a lower discount rate in their mind than for the short period because they know that as each day passes in which there has been no hazard preventing the pay-off, their estimate of the hazard’s probability will drop.

This example provides a nice evolutionary explanation of the shape of time preferences. In a world of uncertain hazards, it would be appropriate to apply a heavier discount rate for a short-term pay-off. It is rational and people who applied that rule would not have lower fitness than those who apply a constant discount rate.

While this is a neat scenario, it does leave some questions open. The most obvious is that in many of the experiments that have demonstrated time-inconsistent preferences, there is clearly no hazard. The pay-off is near certain. We could question whether time-inconsistent behaviour under certainty is simply an evolutionary hang-up from more hazardous and uncertain times – although those types of explanations seem to be a “just-so” story.

If Sozou’s explanation represents an underlying predisposition, it also seems that some people are better at overcoming it than others. As I have blogged about before, people vary widely in their ability to delay gratification (with strong links to life outcomes), and variation can be seen across countries. If this trait is sitting in our sub-conscious, it seems that some people are far better at putting aside that urge to discount in a time-inconsistent manner in situations where the pay-off is certain to occur.

There are also some questions about what form the probability distribution of the underlying hazard needs to take to generate the form of time-inconsistency shown in experiments. In Sozou’s paper, he used an exponential probability distribution, and sensitivity analysis showed that this could be relaxed somewhat. However, the question becomes what types of hazards humans faced during their evolution and what the probability distributions of these hazards are. To look at this question, Sozou suggests some cross-species analysis to examine discount rates and the particular ecological hazards faced by those species.
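To make the exponential case concrete, the sketch below assumes that the unknown quantity is a constant hazard rate and that beliefs about it follow an exponential prior with mean k. Averaging the survival probability over that prior gives the hyperbolic form 1 / (1 + kt), and the expected hazard after t hazard-free days falls to k / (1 + kt). The function names, the value k = 1 and the crude numerical integration are my own choices for illustration.

```python
import math

def expected_survival(t, k=1.0, steps=100_000, upper=60.0):
    """Average exp(-rate * t) over an exponential prior on the hazard rate.

    Uses a simple midpoint rule on [0, upper]; the prior has mean k.
    """
    width = upper / steps
    total = 0.0
    for i in range(steps):
        rate = (i + 0.5) * width
        prior_density = math.exp(-rate / k) / k
        total += math.exp(-rate * t) * prior_density * width
    return total

def hyperbolic(t, k=1.0):
    """Closed form of the same average: a hyperbolic discount function."""
    return 1.0 / (1.0 + k * t)

def expected_hazard_after(t, k=1.0):
    """Posterior mean hazard rate after t periods with no hazard observed."""
    return k / (1.0 + k * t)

for t in (0, 1, 30):
    print(t, round(expected_survival(t), 4), round(hyperbolic(t), 4),
          round(expected_hazard_after(t), 4))
```

The numerical average and the closed form agree, and the expected hazard shrinks as hazard-free time accumulates, which is the updating story described above.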

One other outstanding issue is that this explanation offered in the paper does not explain the irrationality in the example I used above. If someone did originally accept the two bottles of wine at 31 days, under Sozou’s model they would not change their mind at day 30 if given the chance. They now have 30 days of observation of the underlying hazard rate and would not want to discount the remaining day of waiting at a high rate. Irrationality of this form is still not explained.

Economists and biology

Mike the Mad Biologist has posted this piece on economists’ understanding of biology. He pulls apart some statements by Russ Roberts and suggests that if economists are going to use biology as a model for the economics discipline, they should try to understand it first.

Naturally, I agree with this. Apart from preventing the mangling of biological concepts when using a biological analogy, there is a lot in biology that could benefit economics.

However, Mike then moves to one of his favourite topics, which is the use of “stupid … natural history facts” in biology and their seeming absence in economics. As he states when comparing economics and biology, “the really key difference is that biology has accepted modes of confronting theories and, importantly, discarding them”.

I agree that economics has more models floating around for which there does not seem to be factual support. But I am not sure that there is a general lack of empirical work in economics. This hits on one of Russ Roberts’s favourite issues, which is the use of complex statistical techniques to empirically validate theories. Statistics can be as misused as theoretical models. Take the back-and-forth on “more guns, less crime” or the impact of legalised abortion on crime. The debate is now predominantly about data and neither side has conceded. As Roberts usually asks, how many economists have changed their mind on the basis of an empirical study? I don’t know of many.

On the flip side, did Dawkins or Gould (or their respective supporters) ever concede to the other side that they were wrong and substantially change their world view?

So why does biology discard theories out of sync with the facts more readily than economics? I can only suggest that economics is more prone to personal bias. The issue of government spending tends to elicit a stronger response than whether a particular gene is pleiotropic. Many evolutionary biologists have strong views on economics (as a read of Mike’s blog will show), while most economists probably aren’t overly concerned about evolutionary biology. Perhaps we should ask how many evolutionary biologists have fundamentally changed their economic or political views in the face of data?

Update: Some follow up on whether biologists can admit they are wrong by Razib Khan.

Update 2: And Mike the Mad Biologist with some further thoughts.

Gladwell's Outliers

After flipping through Malcolm Gladwell’s Outliers: The Story of Success late last year, I have finally read the book (nothing like over 30 hours of travel to get through a few).

Having heard a few podcasts involving Gladwell (such as this), I knew largely what to expect. Gladwell is strongly on the nurture side of the nature-nurture debate and is dismissive of explanations involving the individual or their inherent traits. While he does (at times) concede that nature might play a role, he suggests this is uninteresting and that we pull this explanation out too often. I think he is right that it may be pulled out too often in explaining the success of a particular person, but there is a large gap between giving nature the right level of focus and ignoring it altogether as Gladwell suggests.

So, rather than reviewing the book (as has been done plenty of times in the blogosphere already), there are a few specific parts of the book that I feel are worth mentioning.

First, the main points of agreement. I don’t doubt that for the most extreme of outliers – take Bill Gates or the Beatles – luck played a large part. If Bill Gates had been born in any other country or if computers were not available to him at school, he would not have founded Microsoft and become the richest man in the world. Similarly, I don’t doubt that an ice hockey player is more likely to become a star if they have the good fortune to be born at the right time of the year. Nassim Taleb makes many similar points in The Black Swan on the role of luck.

The other side of this point, however, is that there is still plenty of room for nature to play a part. Why did Bill Gates and Paul Allen, of all the students at their school, take advantage of this opportunity? The January born ice hockey stars are still a very small sample of those born in January, so what distinguishes those January born stars from the others born in January? And the December born players who make it despite their disadvantage?

In some ways, Gladwell’s focus on the most extreme of outliers for much of the book is what gives luck such an important role. Take the example he makes of the little benefit to having an IQ above 120. Even if that were true (I am not sure it is – in many sciences those extra IQ points are still worth a bit), 90 per cent of the population has an IQ below 120. IQ is a strong predictor of income, status, health and a raft of other factors. While someone with an IQ of 140 might be as likely as someone with an IQ of 180 to win a Nobel prize, Nobel prizes are not the measure of success for most of us. If Gladwell had been examining success in the way most of us think of (or experience) it, IQ and other inherent abilities cannot be ignored. Or to put it another way, the absence of difference in outcomes between someone at the 99.9th percentile and 99.99th percentile of IQ does not mean it is unimportant for everyone else.
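As a rough check on these figures, the sketch below assumes IQ scores are normally distributed with a mean of 100 and a standard deviation of 15. That is the usual scaling convention rather than anything taken from Gladwell, and the normal approximation is least reliable in the far tails, so the percentile cut-offs should be read loosely.

```python
from statistics import NormalDist

# Assumed scaling: mean 100, standard deviation 15.
iq = NormalDist(mu=100, sigma=15)

# Share of the population below an IQ of 120 under this assumption.
share_below_120 = iq.cdf(120)
print(round(share_below_120, 3))

# IQ scores marking the 99.9th and 99.99th percentiles under the same assumption.
for percentile in (0.999, 0.9999):
    print(percentile, round(iq.inv_cdf(percentile), 1))
```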

This brings me to Gladwell’s strong focus on IQ, as opposed to other heritable characteristics. In Gladwell’s discussion on the link between hard work and maths results, he refers to the Trends in International Mathematics and Science Study (TIMSS) test. It was found that the ranking of countries in this test corresponded to the country ranking for the number of questions answered in the accompanying (and 120 question long) questionnaire. Children who did better at the maths test also filled out more information on their family, education and a raft of other background issues. Gladwell points to this as evidence of the link between hard work and mathematical achievement, as it takes patience to work hard and learn mathematics or to answer the questions. He suggests that any IQ (or inherent quality) based explanation is flawed.

Ignoring that the ability to fill out a long questionnaire at a young age is probably influenced by IQ (answering questions on family education requires some cognitive skills), Gladwell lines up his attack on IQ but does not question whether a broader suite of heritable traits might be at play. If it is not IQ that is relevant, Gladwell suggests it must be environmental factors. Take time preference (patience), which has a heritable component and would undoubtedly influence competency at mathematics and willingness to complete the survey. Might that play a role? Gladwell draws a similar conclusion on the lack of success of Christopher Langan (with an IQ approaching 200) and suggests that compared to Robert Oppenheimer, he lacked the skills required to navigate the world. Once he claims to have eliminated IQ, Gladwell pins it to environmental causes, despite the possibility that these other skills have a genetic component. (Gladwell’s approach reminded me of a paper by Samuel Bowles and Herbert Gintis on the inheritance of inequality. The attempt to pin down the heritability of the level of income to IQ showed that other genetic factors would need to be considered. I’ll blog about this paper in the next couple of days.)

Two other anecdotes stuck out. First was Gladwell’s example of Jewish immigrants coming into New York with a wealth of tailoring skills at just the right time. He suggests that their children went on to become highly successful (usually as lawyers and doctors) after they saw the hard work of their parents in the home – hard work that those parents “lucked into” by having a skill that was suddenly in huge demand. It is a nice story, but it does not explain the success of Jewish immigrants in field after field where high cognitive ability is an advantage – be that banking, law, research (plenty of Nobel prizes there), medicine, and the list goes on. The success occurred on such a broad scale despite the varied (and often disadvantaged) family histories. Should Gladwell be looking further back in time for an explanation?

He does that in the other anecdote I found most interesting, which was Gladwell’s discussion of some intractable family disputes in some parts of the United States (think the Hatfields and McCoys). Gladwell suggests that their ancestors originally came from marginally fertile areas where herding dominated and there was a need to establish a strong reputation to protect their herds. This resulted in “cultures of honour” forming, in which misdeeds needed to be punished quickly and brutally to ward off future attacks. When they migrated to the United States, Gladwell suggested that these people brought their culture with them and it has persisted through generations, despite the shift in countries and for many people, significant changes in wealth and status. It is an interesting explanation, but it is one part of the book where innate traits are crying out to be examined. Gladwell did refer to some interesting studies on this issue, which is something that I will definitely be following up.

Diamond on biological differences

On Friday afternoon, as has happened a few times, I was asked if I had read Jared Diamond’s Guns, Germs and Steel. How could an evolutionary analysis of development accommodate Diamond’s thesis?

As Diamond frames his book in the prologue, Guns, Germs and Steel provides an environmental explanation of human development. Diamond states that you could summarise his book with the following sentence:

History followed different courses for different peoples because of differences among peoples’ environments, not because of biological differences among peoples themselves.

The interesting thing about this characterisation of his book is the discussion over the following pages where Diamond counters those who seek to develop genetic explanations of development. While taking aim at those who suggest there is an inherent superiority to people from industrialised nations (which is fair enough), Diamond utilises an evolutionary argument himself.

Diamond suggests, based on his observations of people in New Guinea, that modern stone-age people are on average probably more intelligent than industrialised people. For example, he suggests they have much better skills such as forming a mental map of unfamiliar surroundings.

Diamond suggests two reasons for his impression that New Guineans are smarter than Westerners. One is environmental, with Diamond believing that, compared to the passive television based environment of Westerners, New Guinean children are exposed to a far more stimulating environment. The second is that New Guineans are more likely to have been selected for intelligence.

Comparing the environments of Westerners and New Guineans, Diamond submits that while most Western children survive to adulthood and reproduce regardless of their genes (or intelligence), New Guineans live in societies where population is too small for the epidemic diseases to evolve. Mortality came from murder, tribal warfare, accidents or failure to meet subsistence needs. In New Guinea, intelligent people are more likely than less intelligent people to escape those causes of high mortality.

This suggests that Diamond is not averse to arguments about the selective pressures in different societies. In the same way, I consider that an approach to development that takes into account how humans have evolved is not inconsistent with most of Diamond’s work. As global populations were exposed, as Diamond catalogues, to a range of different environments and opportunities, they followed different developmental paths, and those developmental paths in turn could have subjected populations to varying selective pressures.

The issue would then be what those selective pressures are, how populations may have changed and whether this affected economic development. Diamond only dealt with this issue in the prologue, so it is not clear whether Diamond considers that changes in the frequency of genes and traits could have played a role in development, but it would seem not.

As for my perspective, I find Diamond’s hypothesis useful in an evolutionary analysis of development. The environmental differences identified by Diamond would place different selective pressures on human populations, and as those populations change, that could in turn feed back into their environment. Take agriculture. Those populations who had access to the right plants, animals and geographic features are those that developed agriculture – an environmental argument. But within those populations, people with certain traits would have been more likely to take up the agricultural lifestyle and, of those who did, those with certain traits would have been more successful at it. A feedback loop would occur, with the environment shaping the people and vice versa. This is not, as Diamond frames it, a question of nature or nurture. It is about the relationship between the two.

Unskilled and unaware

Robin Hanson has had another stab at the oft-quoted paper by Kruger and Dunning, Unskilled and Unaware of It. The first couple of sentences of the paper’s abstract give Kruger and Dunning’s basic (and somewhat amusing) claim:

People tend to hold overly favorable views of their abilities in many social and intellectual domains. The authors suggest that this overestimation occurs, in part, because people who are unskilled in these domains suffer a dual burden: Not only do these people reach erroneous conclusions and make unfortunate choices, but their incompetence robs them of the metacognitive ability to realize it.

Hanson notes that this paper is commonly used by people to suggest that those who disagree with them are confused idiots who lack the basic ability to recognise their error.

Ignoring for a moment other possible explanations for Kruger and Dunning’s empirical results (which is the purpose of Hanson’s post), I always found the “ignore the idiot” interpretation of the paper to be missing the point. The problem is that it is not clear who the idiot is. Trying to self-assess whether you are correct is subject to the biases identified by Kruger and Dunning. If you are an idiot, you are unlikely to know this. If anything, the paper suggests that, without objective evidence, one should be more humble in assessing one’s own ability.

Turning to the interpretation of the results, it is also an interesting question to what extent the overestimation of ability results in a “dual burden” as claimed by Kruger and Dunning. There is some evidence that self-deception can be adaptive, as von Hippel and Trivers discuss in a recent paper. Apart from assisting in deceiving others (it’s easier to lie if you don’t know you are), self-deception in the form of overconfidence might encourage one to try tasks and persevere at them, with occasional success, where a more realistic assessment may result in a decision not to try at all. Starek and Keating’s work on self-deception and swimming also illustrates this idea (HT on the swimming: Radiolab). To the extent this is the case, an overconfident estimation of one’s ability may not be a burden but may help someone to get out there with what they have.

 

Crisis in human genetics?

It is a bit over a year since Geoffrey Miller wrote this piece foreshadowing a crisis of conscience among human geneticists that would become public knowledge in 2010. The crisis had two parts: that new findings in genetics would reveal less than hoped about disease and that they would reveal more than feared about genetic differences between classes, ethnicities and races.

Now that we are through 2010 with no crisis (that I was aware of – is this crisis still happening in private?), I thought I’d revisit Miller’s suggestion that geneticists would show more than feared about class, ethnic and race differences.

At the time I first read the article, I found it hard to characterise this information as something to fear. As Miller identifies, it would be a consequence of some interesting progress:

Once enough DNA is analysed around the world, science will have a panoramic view of human genetic variation across races, ethnicities and regions. We will start reconstructing a detailed family tree that links all living humans, discovering many surprises about mis-attributed paternity and covert mating between classes, castes, regions and ethnicities.

This sounds good to me. To understand the way genes spread as people migrated and mixed across the world will be to gain an important insight into human history.

Miller then points out that some people may be troubled when researchers start to identify genes that create physical and mental differences between populations and identify when those genes arose. Miller states:

If the shift from GWAS [genome wide association studies] to sequencing studies finds evidence of such politically awkward and morally perplexing facts, we can expect the usual range of ideological reactions, including nationalistic retro-racism from conservatives and outraged denial from blank-slate liberals.

But it is not all bad. He closes with:

The few who really understand the genetics will gain a more enlightened, live-and-let-live recognition of the biodiversity within our extraordinary species—including a clearer view of likely comparative advantages between the world’s different economies.

Reading that last sentence, the title to the article and the first paragraph appear over-inflated. People will always misuse information and there will be another body of people who will make great use of it.

Looking at Miller’s article from the vantage point of 2011, I am not sure much has changed. If anything, there has been a slow trickling of some of these ideas into spaces where they are starting to add value. GWAS are filling the journals and the store of population genetic data is increasing quickly. While most blank slaters continue to ignore it and the retro-racists use bits as they see fit, some of us are ploughing through it to learn something new.

Although Miller barely touches on it, the economic idea in that last sentence is interesting. If GWAS and sequencing studies identify different skills and comparative advantages across the world’s populations and economies, research into economic development could be vastly changed. However, I am not convinced that we are particularly close to obtaining that sort of information. As I noted in my last post, it seems that we are some distance from taking the load of genetic information and the associated picture of human evolutionary history and being able to link it to characteristics that matter economically. For the moment, basic information on human traits and heritability is filling that role.

Genetic distance and economic development

The History and Geography of Human Genes has heavily influenced the way I think about human evolution. Even though it is getting old at a time when masses of population genetic data are being accumulated, a flip through the maps depicting the geographic distribution of genes provides a picture that is available in few other places.

It was only a matter of time before some economists grabbed this population genetic data to see whether it could shed any light on economic development. In a paper published in the Quarterly Journal of Economics in 2009, Enrico Spolaore and Romain Wacziarg took data on genetic distance from The History and Geography of Human Genes and asked whether it is correlated with differences in income between countries.

Spolaore and Wacziarg proposed the following model. Take an initial population that branches into two sub-populations each time period, with genetic distance between the two populations being the time since they had a common ancestor. Each sub-population has a transmitted characteristic which is represented by a number. This characteristic mutates either up or down with a 50 per cent probability each generation, so it follows a random walk. As a result, the difference in characteristics (or vertical distance) between two populations is a function of their genetic distance, with the vertical characteristics more likely to have “walked” apart as the time since the shared ancestor increases.
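The random-walk intuition can be checked with a short simulation. This is only a rough sketch under my own simplifying assumptions, not the authors' model: it follows just two lineages rather than a full branching tree, uses steps of plus or minus one, and the function names are hypothetical.

```python
import random

def vertical_distance(periods_since_split, rng):
    """Gap between two lineages that each take +1/-1 steps every period since splitting."""
    a = b = 0
    for _ in range(periods_since_split):
        a += rng.choice((-1, 1))  # characteristic of the first lineage drifts up or down
        b += rng.choice((-1, 1))  # characteristic of the second lineage drifts independently
    return abs(a - b)

def mean_vertical_distance(periods_since_split, trials=2000, seed=0):
    """Average gap over many simulated pairs of lineages (seeded for repeatability)."""
    rng = random.Random(seed)
    total = sum(vertical_distance(periods_since_split, rng) for _ in range(trials))
    return total / trials

if __name__ == "__main__":
    # Longer time since the common ancestor stands in for larger genetic distance.
    for periods in (1, 4, 16, 64):
        print(periods, round(mean_vertical_distance(periods), 2))
```

The average gap rises as the time since the split increases, but only slowly, roughly with the square root of the number of periods.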

Next, assume that when a sub-population develops a new technology, other sub-populations’ ability to adopt that technology is a function of their vertical distance from the population at the technological frontier. If technology determines income, then the difference in income between two populations is the size of the relative vertical distance from the population that is at the frontier, which in turn is related to the genetic distance. The core insight from this model is that relative genetic distance should have a higher correlation with differences in technology than absolute genetic distance.
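A toy example shows why relative distance, rather than absolute distance, is the relevant measure in this setup. The numbers and function names are hypothetical, and treating the barrier to adoption as a simple absolute gap from the frontier is my simplification rather than the authors' specification.

```python
def adoption_barrier(characteristic, frontier):
    """Vertical distance of one population from the frontier population."""
    return abs(characteristic - frontier)

def relative_distance(char_i, char_j, frontier):
    """Difference between two populations' distances from the frontier."""
    return abs(adoption_barrier(char_i, frontier) - adoption_barrier(char_j, frontier))

def absolute_distance(char_i, char_j):
    """Direct vertical distance between two populations, ignoring the frontier."""
    return abs(char_i - char_j)

if __name__ == "__main__":
    frontier = 0
    # Two populations far from each other but equally far from the frontier.
    print(absolute_distance(-3, 3), relative_distance(-3, 3, frontier))
    # Two populations on the same side of the frontier.
    print(absolute_distance(1, 4), relative_distance(1, 4, frontier))
```

In the first case the two populations are far apart from each other yet face the same barrier to adoption, so the model predicts no income gap between them. In the second case the two measures coincide.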

While I am not sure this model adds much to the initial intuition, it does serve a useful purpose in that it looks to link genetic distance with income differences through differences in vertical characteristics. If genetic distance and income differences had been directly linked, we would not be left with the interesting question of what these characteristics are.

On the flip-side, Spolaore and Wacziarg have produced a model in which differences in vertical characteristics are a function of random drift, rather than selection. This is unsatisfying, but it is hard to see how the authors could otherwise have produced the model without a theory about what those characteristics are. The model is also agnostic about how one country may develop technology, as the authors assume transmitted characteristics do not have any effect on productivity. Introducing a theory of technological development could have been interesting. If certain traits make technological development more likely, there would be two effects creating the income difference – the higher probability of technological progress coupled with the barriers to diffusion.

With model in hand, Spolaore and Wacziarg turned to the population genetic data. Taking data on 42 world populations, they matched it to countries (for which they have economic data) using information on the ethnic composition of those countries. This formed the basis for determining the genetic distance between countries. They also took a set of European population data (of 26 populations) which would allow them to do a European analysis. The regressions had to depart from the model and test the link between genetic distance and income differences directly, as the data does not tell us anything about the vertical characteristics of the populations.

The authors completed a mountain of regressions in analysing the data, so here are some of the headline findings. Taking the United States as the world technological frontier in 1995 (a fair assumption), the authors regressed the log of income on genetic distance and, as expected, found that income was negatively correlated with average genetic distance from the United States population. Genetic distance also had reasonably high explanatory power, accounting for 39 per cent of the variation in the sample. The chart below gives the picture. Throwing a range of other explanatory variables into the analysis such as geography and linguistic and religious differences did not materially change this result.

Spolaore and Wacziarg then created 9,316 pairs of countries (from 137 countries) for the world sample and 325 pairs (based on 26 countries) for the European sample and assessed the link between genetic distance and income difference. When they use this broader set of pairs, as opposed to the simple comparison with the United States technological frontier, the degree of variation accounted for by genetic distance decreases, although the genetic distance still has a material effect. For example, one standard deviation change in genetic distance accounts for 16.79% of a standard deviation change in income difference when genetic distance alone is entered into the regression.
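The pair counts are simply the number of unordered pairs that can be formed from each set of countries, which is quick to check. The function name here is my own.

```python
from math import comb

def country_pairs(n_countries):
    """Number of unordered country pairs that can be formed from n_countries."""
    return comb(n_countries, 2)

print(country_pairs(137))  # world sample: 9316
print(country_pairs(26))   # European sample: 325
```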

The authors also examined a range of other factors, such as Jared Diamond’s thesis about differences in geography and domesticable plants and animals. While including these factors in the analysis reduced the explanatory power of the genetic distance measure, the significance remained. The data also allowed some analysis of earlier time periods, which was in fact easier as most countries’ populations were more ethnically uniform in, say, 1500. As for the later dates, the relationship still held.

Given the agnosticism of Spolaore and Wacziarg on what the vertical characteristics driving income differences are, I hope this paper triggers some deeper examination of what is going on. What are the microeconomic mechanisms driving this result? What are the vertical characteristics that are relevant? And how has selection affected these characteristics? Without the characteristics being subject to selection, the change in characteristics would be fairly slow. Yet these slow changes are hypothesised to create a substantial barrier to technological diffusion even though the populations have been separated for a relatively short period. I would suggest that selection is required.

The authors suggest that more research on peaceful and non-peaceful interaction between societies may be useful to tease out the mechanisms that they have proposed. I agree that research may be interesting, but it leaves open the question which the model ignores – how did some countries get that technological lead in the first place? Do these vertical characteristics play a role in that? Asking why others did not follow is not as interesting as asking why some countries got the lead in the first place.