Study: Race Exists, and Common-Sense Visual Classification Is Wrong Only 14 Times Out of 10,000
CULTURAL LAG is the polite term for habits and hypotheses that never die. They become immune to refutation by virtue of constant repetition. One such meme, due to Lewontin (1972), asserts that there is more genetic variation within genetic groups than between them, and therefore that... er... there is no difference between the groups / there is no genetic difference between genetic groups / any differences between groups cannot be due to genetic reasons / asserting that genetic group differences are discriminable by genetics would be arbitrary and wrong / genetic groups do not exist.
I had never been convinced by these arguments, on the simple grounds that genetic groups are clearly visible, sustain themselves by genetic means, and are usually halved by admixture. Also, it was only a vague thought, but it seemed to me that a t-test could still be significant with a relatively small mean difference if the sample size were large enough. Probably not relevant in genetics, I mused.
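That vague thought about the t-test can be made concrete. The sketch below uses invented numbers (a mean difference of one tenth of a standard deviation, equal group sizes, a common standard deviation) and only the textbook two-sample formula. It is an illustration of the statistical point, not an analysis of any genetic data.

```python
import math


def two_sample_t(mean_diff, sd, n_per_group):
    """t statistic for two equal-sized groups sharing one standard deviation."""
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    return mean_diff / standard_error


# Hypothetical values: the same small difference, tested at two sample sizes.
small_sample_t = two_sample_t(mean_diff=0.1, sd=1.0, n_per_group=20)
large_sample_t = two_sample_t(mean_diff=0.1, sd=1.0, n_per_group=10_000)

print(f"n = 20 per group:     t = {small_sample_t:.2f}")
print(f"n = 10000 per group:  t = {large_sample_t:.2f}")
```

With 20 subjects per group the statistic stays far below the conventional 1.96 threshold; with 10,000 per group the identical difference clears it easily.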
In fact, the ease with which you can separate two genetic groups depends, like all discrimination and clustering, on the number of markers available to the techniques being used. With only a few markers, discrimination is difficult and error-prone. As you increase the number, allocation to different groups becomes progressively easier.
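The dependence on marker count can be sketched with a simple normal approximation. The assumptions here are mine, not the paper's: each marker is independent, each separates the two group means by the same small amount (`per_marker_gap`, in standard deviation units, a hypothetical value), and individuals are assigned by whichever side of the midpoint their summed score falls on. Summing k such markers widens the standardized gap by a factor of the square root of k, so the error rate falls steadily.

```python
import math
from statistics import NormalDist


def misclassification_rate(n_markers, per_marker_gap=0.3):
    """Error rate of a midpoint classifier on the sum of independent markers.

    per_marker_gap is a hypothetical standardized mean difference per marker.
    """
    # The summed score has mean gap n * d and standard deviation sqrt(n),
    # so the standardized separation is d * sqrt(n).
    separation = per_marker_gap * math.sqrt(n_markers)
    return NormalDist().cdf(-separation / 2.0)


marker_counts = [1, 10, 100, 326]
error_rates = [misclassification_rate(k) for k in marker_counts]

for k, rate in zip(marker_counts, error_rates):
    print(f"{k:>4} markers: error rate {rate:.4f}")
```

The printed rates depend entirely on the assumed gap, so they should not be read as estimates for any real marker set. The point is only the direction: each individual marker can be nearly useless while a few hundred together are decisive.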
So, to counter the endless echo of the original hypothesis, I am trying to put together a list of papers which explain and test the issue.
Tim Bates explains that Lewontin based his claims on blood type markers: about as advanced as it was possible to be in 1972, but hopeless for identifying genetic clustering, and therefore doomed to render a false negative. The 2005 paper by Neil Risch and colleagues (now cited 400 times) shows how inadequate that procedure was by showing that one can now predict race near-perfectly from a few hundred genetic markers (326 microsatellites in this study).
The authors say in their abstract:
We have analyzed genetic data for 326 microsatellite markers that were typed uniformly in a large multi-ethnic population-based sample of individuals as part of a study of the genetics of hypertension (Family Blood Pressure Program). Subjects identified themselves as belonging to one of four major racial/ethnic groups (white, African American, East Asian, and Hispanic) and were recruited from 15 different geographic locales within the United States and Taiwan. Genetic cluster analysis of the microsatellite markers produced four major clusters, which showed near-perfect correspondence with the four self-reported race/ethnicity categories. Of 3,636 subjects of varying race/ethnicity, only 5 (0.14%) showed genetic cluster membership different from their self-identified race/ethnicity. On the other hand, we detected only modest genetic differentiation between different current geographic locales within each race/ethnicity group. Thus, ancient geographic ancestry, which is highly correlated with self-identified race/ethnicity — as opposed to current residence — is the major determinant of genetic structure in the U.S. population. Implications of this genetic structure for case-control association studies are discussed.
In their discussion they say:
Attention has recently focused on genetic structure in the human population. Some have argued that the amount of genetic variation within populations dwarfs the variation between populations, suggesting that discrete genetic categories are not useful (Lewontin 1972; Cooper et al. 2003; Haga and Venter 2003). On the other hand, several studies have shown that individuals tend to cluster genetically with others of the same ancestral geographic origins (Mountain and Cavalli-Sforza 1997; Stephens et al. 2001; Bamshad et al. 2003). Prior studies have generally been performed on a relatively small number of individuals and/or markers. A recent study (Rosenberg et al. 2002) examined 377 autosomal microsatellite markers in 1,056 individuals from a global sample of 52 populations and found significant evidence of genetic clustering, largely along geographic (continental) lines. Consistent with prior studies, the major genetic clusters consisted of Europeans/West Asians (whites), sub-Saharan Africans, East Asians, Pacific Islanders, and Native Americans. It is clear that the ability to define distinct genetic clusters depends on the number and type of markers used (Risch et al. 2002). Reports that document inability to define distinct clusters generally used only a modest number of markers and, hence, had little power to detect clusters (Romualdi et al. 2002). Studies with larger numbers of markers appear to show strong evidence of clustering (Stephens et al. 2001; Rosenberg et al. 2002).
Another major point of discussion has been the correspondence between genetic clusters and commonly used racial/ethnic labels. Some have argued for poor correspondence between these two entities (Lewontin 1972; Wilson et al. 2001), whereas others have suggested a strong correlation (Risch et al. 2002; Burchard et al. 2003). We have shown a nearly perfect correspondence between genetic cluster and SIRE for major ethnic groups living in the United States, with a discrepancy rate of only 0.14%.
In sum, you get a near-perfect correspondence between genetic measures and the common racial labels, with a misclassification rate of a mere 14 per 10,000 (5 of 3,636 subjects). Some of this is due to the admixed “other” category, and perhaps some existential confusion in the others, but 9,986 in 10,000 subjects can master the art of looking in a mirror and noting which race they most resemble, a task beyond the wit of some academics.
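For anyone checking the arithmetic, the headline figure follows directly from the two counts in the abstract. The variable names below are mine.

```python
# Counts taken from the abstract quoted above.
misclassified = 5
total_subjects = 3636

rate = misclassified / total_subjects
percent = rate * 100          # about 0.1375, reported as 0.14%
per_10k = rate * 10_000       # about 13.75, rounded to 14

print(f"{percent:.2f}% misclassified, or about {round(per_10k)} per 10,000")
print(f"about {10_000 - round(per_10k)} per 10,000 matched their self-report")
```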
* * *
Source: Psychological Comments