

Sunday, May 3, 2015

R1a1a from an Early Bronze Age warrior grave in Poland

Ancient DNA tests on a skeleton from an Early Bronze Age "warrior" grave near Hrubieszow, southeastern Poland, have revealed that the remains belong to Y-haplogroup R1a1a [source].

Mitochondrial sequences were also obtained from seven other samples from the same burial site, and assigned to mt-haplogroups H1a, H1b (two), H2a (two), H6 and U5b1.

R1a1a is by far the most frequent Y-haplogroup in Poland today, and its presence in the remains from a high-status burial might be a clue as to how it became so common in East-Central Europe.

Interestingly, the site is classified as part of the Strzyżow Culture, which is considered by Polish archaeologists to be the result of contacts between local communities in southeastern Poland and Kurgan newcomers from the North Pontic steppe.

All of the other ancient R1a1a samples reported to date from Central Europe are also younger than the Middle Neolithic and from presumably steppe-derived Indo-European archaeological cultures:

- Late Neolithic, Eulau, Germany, Corded Ware Culture, three related samples

- Late Neolithic, Esperstedt, Germany, Corded Ware Culture, one sample

- Late Bronze Age, Halberstadt, Germany, Urnfield Culture (?), one sample

- Late Bronze Age, Lichtenstein Cave, Germany, Urnfield Culture, two samples

More info about the Bronze Age Pole, including photos of a facial reconstruction, can be found here and here (in Polish).

See also...

Large-scale recent expansion of European patrilineages

Population genetics of Copper and Bronze Age inhabitants of the Eastern European steppe

Eastern Europe as a bifurcation hotspot for Y-hg R1

Monday, December 22, 2014

Ancient East European and West Asian admixture deep in Siberia

The full mito sequences from this new Derenko et al. study should come in very handy when many more ancient genomes from across North Eurasia are published. The paper is open access, but here are a few excerpts anyway:

Although the genetic heritage of aboriginal Siberians is mostly of eastern Asian ancestry, a substantial western Eurasian component is observed in the majority of northern Asian populations. Traces of at least two migrations into southern Siberia, one from eastern Europe and the other from western Asia/the Caucasus have been detected previously in mitochondrial gene pools of modern Siberians.

We report here 166 new complete mitochondrial DNA (mtDNA) sequences that allow us to expand and re-analyze the available data sets of western Eurasian lineages found in northern Asian populations, define the phylogenetic status of Siberian-specific subclades and search for links between mtDNA haplotypes/subclades and events of human migrations. From a survey of 158 western Eurasian mtDNA genomes found in Siberia we estimate that nearly 40% of them most likely have western Asian and another 29% European ancestry. It is striking that 65 of northern Asian mitogenomes, i.e. ~41%, fall into 19 branches and subclades which can be considered as Siberian-specific being found so far only in Siberian populations. From the coalescence analysis it is evident that the sequence divergence of Siberian-specific subclades was relatively small, corresponding to only 0.6-9.5 kya (using the complete mtDNA rate) and 1–6 kya (coding region rate).


Overall, the phylogeographic analysis strongly implies that the western Eurasian founders, giving rise to Siberian specific subclades, trace their ancestry only to the early and mid-Holocene, though some of genetic lineages may trace their ancestry back to the end of LGM. Importantly, we have not found the modern northern Asians to have western Eurasian genetic components of sufficient antiquity to indicate traces of pre-LGM expansions, that originated from the Upper Paleolithic industries present both in the southern Siberia and Siberian Arctic, and that date back to ~30 kya, well before the LGM [43]–[45]. Apparently, the Upper Paleolithic population of northern Asia, whose western Eurasian ancestry was approved recently by complete genome sequencing of 24 kya-old individual from Mal’ta and 17 kya-old individual from Afontova Gora in south-central Siberia, did not leave a genetic mark on the female lineages of modern Siberians. It is probable that the initial population expansion in the southern Siberia region involved maternal lineages other than present now, or that there was a substantial gene flow into the region after the LGM, most probably from eastern Asian sources as have been suggested by Raghavan et al. [7].


Derenko et al.: Western Eurasian ancestry in modern Siberians based on mitogenomic data. BMC Evolutionary Biology 2014 14:217. doi:10.1186/s12862-014-0217-9

Thursday, October 23, 2014

Ancient DNA from Iron Age and Medieval Poland

A new paper at PLoS ONE featuring ancient mitochondrial (mtDNA) data from Wielbark, Przeworsk and early Slavic remains argues for matrilineal continuity in present-day Poland since the Iron Age. It's actually based on a thesis that I blogged about more than two years ago (see here). However, it does include some fresh insights, so it's worth a look even if you read the thesis. RoIA stands for Roman Iron Age.

Three modern populations or groups of populations (Lithuanians and Latvians, Poles, and Czechs and Slovaks) were found to contain significantly higher percentages (p<0.05) of shared informative haplotypes with the RoIA samples compared to other present-day populations (Figure 2, Table S4). Notably, modern Poles shared the highest number (nine) of informative mtDNA haplotypes with the RoIA individuals.


Of particular interest are three RoIA samples assigned to subhaplogroup H5a1, which were recovered from the Kowalewko (sample K1), the Gaski, and the Rogowo (samples G1 and R3) burial sites (see Figure 1). Recent studies on mtDNA hg H5 have revealed that phylogenetically older subbranches, H5a3, H5a4 and H5e, are observed primarily in modern populations from southern Europe, while the younger ones, including H5a1 that was found among RoIA individuals in our study, date to around 4.000 years ago (kya) and are found predominantly among Slavic populations of Central and East Europe, including contemporary Poles [15]. Notably, we also found one ME sample belonging to subhaplogroup H5a1 (sample OL1 in Table 3). The presence of subclusters of H5a1 in four ancient samples belonging to both the RoIA and the ME periods, and in contemporary Poles, indicates the genetic continuity of this maternal lineage in the territory of modern-day Poland from at least Roman Iron Age i.e., 2 kya.


The evolutionary age of H5 sub-branches (~4 kya) [15] also approximates the age of N1a1a2 subclade found in the RoIA population (sample KA2) (Table 2). The coalescence age of N1a1a2 is around 3.4–4 kya, making this haplotype one of the youngest sub-branches within hg N [52]. The N1a1a2 haplotype found in one RoIA individual was classified as unique because no exact match was found among the twelve comparative populations or groups of populations used in the haplotype sharing test. Notably, a similar N1a1a2 haplotype carrying an additional transition at position 16172 was found in a modern-day Polish individual [53].

I suspect the publication of these results at this time, so many months after they were first revealed in the aforementioned thesis, is part of an effort to drum up interest and secure funding for a new project on the genetic history of Greater Poland, which was announced late last year (see here). I say that because one of the people organizing the project, Janusz Piontek, is also listed as a co-author on this paper. So if we're lucky we might soon see full genome sequences from a few of these Iron Age and Medieval samples.


Juras A, Dabert M, Kushniarevich A, Malmström H, Raghavan M, et al. (2014) Ancient DNA Reveals Matrilineal Continuity in Present-Day Poland over the Last Two Millennia. PLoS ONE 9(10): e110839. doi:10.1371/journal.pone.0110839

Monday, October 6, 2014

The power of imputation

The latest version of the Affymetrix Human Origins genotyping dataset, published last month along with Lazaridis et al. 2014, is an awesome resource for population genetics (see here). However, it lacks Polish samples, which is a major drawback as far as this blogger is concerned.

Hopefully this oversight is corrected soon. In the meantime, I decided to include 15 Poles from the Eurogenes Project dataset in my copy of the Human Origins. But in order to do that I first had to impute around 460K genotypes for each of these people.

Imputing so many markers might sound pretty crazy, but it's actually very doable, especially for genetically homogeneous groups with relatively low haplotype diversity, like the Polish population. I used BEAGLE 3.3.2 for the job, mostly because I'm familiar with it, but also because it's quick and accurate.

My reference panel included 1090 individuals, most of them shared by Eurogenes and Human Origins, and just over 1 million markers. Only around 130K of the markers were shared by the two datasets, but well over 50% of the 1 million genotypes were observed in each of the Poles. This meant that I was imputing sporadically missing data, which is certainly a more sensible strategy than attempting to fill in long stretches of empty calls.
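To make the "sporadically missing data" point concrete, here's a minimal sketch of how the share of observed genotypes per sample could be checked before imputation. The marker IDs and the dictionary layout are hypothetical, made up for illustration. This isn't the BEAGLE input format or the actual script used for the job.

```python
def observed_fraction(panel_markers, sample_calls):
    """Return (fraction of panel markers observed, markers needing imputation).

    panel_markers: iterable of marker IDs in the reference panel.
    sample_calls: dict mapping marker ID -> genotype string, or None if missing.
    Markers absent from sample_calls are treated as missing too.
    """
    panel = list(panel_markers)
    missing = [m for m in panel if sample_calls.get(m) is None]
    observed = len(panel) - len(missing)
    return observed / len(panel), missing


# Toy data: hypothetical marker IDs, not real SNPs.
panel = ["snp1", "snp2", "snp3", "snp4", "snp5"]
sample = {"snp1": "AA", "snp2": None, "snp3": "AG", "snp5": "GG"}

fraction, to_impute = observed_fraction(panel, sample)
print(fraction)   # share of panel markers with an observed call
print(to_impute)  # markers that would have to be imputed
```

With roughly 460K of about 1 million markers imputed per person, the same calculation gives an observed share of around 54%, which is in line with the "well over 50%" figure.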

Everything seems to have worked out just fine, and the proof is in the pudding. Below are two Principal Component Analyses (PCA) featuring the Poles alongside 50 samples from the HGDP. The first PCA is based on observed genotypes, while the second is based on markers that were imputed into the Polish genomes. PCAs are very sensitive to artifacts like genotyping errors, but as you can see, there's very little difference between these results. Also, keep in mind that the SNPs used in the Human Origins were specifically chosen for population genetics, while those in the Eurogenes dataset come from chips mostly designed for commercial ancestry and medical work.

Also, here's a PCA based on more than 300K SNPs, both observed and imputed in the Poles, featuring all of the West Eurasian samples from the filtered version of Human Origins, as well as the 15 Polish individuals. Note that the Poles cluster more or less between the Czechs and groups from the East Baltic region, and overlap most strongly with Belarusians, which makes sense.


Brian L. Browning, Sharon R. Browning, A Unified Approach to Genotype Imputation and Haplotype-Phase Inference for Large Data Sets of Trios and Unrelated Individuals, AJHG, Volume 84, Issue 2, p210–223, 13 February 2009, DOI:

Lazaridis et al., Ancient human genomes suggest three ancestral populations for present-day Europeans, Nature, 513, 409–413 (18 September 2014), doi:10.1038/nature13673

Sunday, September 21, 2014

Corded Ware people: more versatile and healthier than Neolithic farmers

Over at West Hunter Greg Cochran argues that late Neolithic farmers in Northern Europe experienced nothing short of genocide at the hands of Corded Ware Culture (CWC) pastoralists, who pushed deep into the continent from somewhere east of present-day Germany around 4,800 years ago.

I think he's exaggerating. My view is that farming populations throughout much of Neolithic Europe began to crash well ahead of any invasions, perhaps as a result of climate change, overpopulation, environmental degradation and bad health. This, I'd say, created a vacuum that attracted groups from the peripheries of the Neolithic world, like the CWC nomads.

If so, it's likely that many of the surviving farmers were killed or marginalized in the process, although, as often happens in such cases, their women might have been incorporated on a large scale into the new post-Neolithic societies. This is perhaps why the most common Neolithic Y-chromosome haplogroup, G2a, is now so scarce in Europe, while a wide variety of mitochondrial (mtDNA) lineages frequently found among Neolithic skeletons are still carried by many Europeans today.

Nevertheless, I'm not aware of any evidence of a wholesale slaughter, or even any wars, going on in Europe during the early CWC period.

This new paper at Anthropologie seems to back up my case. The Corded Ware people were simply more versatile and healthier than the Neolithic farmers. No wonder then, that they eventually came out on top.

This study focuses on the changes in the human skeleton that are associated with the transition to agricultural subsistence. Two populations from the territory of contemporary Poland that differ in terms of their subsistence strategies are compared. An agricultural subsistence strategy is represented by a Lengyel Culture population from Oslonki (5690-4950 BP), whilst the Corded Ware populations from Zerniki Gorne and Zlota (c. 4160-3900 BP) represent mixed, agricultural-breeding-pastoral economies supplemented with hunting and gathering. The Corded Ware sample consisted of 62 individuals in total, and the Lengyel sample comprised 68 individuals. Health status was examined through skeletal stress indicators, cribra orbitalia, enamel hypoplasia and Harris lines. The analysis of enamel hypoplasia showed the effect of different adaptive strategies on buffering adverse nutritional factors and diseases. The prevalence and severity of the condition proved significantly higher in the Lengyel sample than in the Corded Ware population (64.7% vs. 43.5%, respectively). It is suggested that agricultural subsistence, associated with a less diversified diet, sedentism, exposure to pathogens, spread of infections and increased population density, caused more frequent and severe stress episodes than the mixed economy of the Corded Ware people. The inverse relationship between enamel hypoplasia and the mean age at death found in the agricultural population clearly shows an effect of adverse living conditions on the biological development of the individuals studied.
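As a rough sanity check on the enamel hypoplasia figures quoted above, here's a sketch of a two-proportion z-test. The affected counts (44 of 68 and 27 of 62) are my own inference from the reported percentages and total sample sizes, not numbers taken from the paper, and the number of individuals actually scored for the condition may have been different. The paper's own test may also have been something else entirely.

```python
import math


def two_proportion_z(k1, n1, k2, n2):
    """Two-sided z-test for a difference between two proportions (pooled SE)."""
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# ASSUMPTION: counts back-calculated from 64.7% of 68 and 43.5% of 62.
lengyel_affected, lengyel_n = 44, 68
cwc_affected, cwc_n = 27, 62

z, p_value = two_proportion_z(lengyel_affected, lengyel_n, cwc_affected, cwc_n)
print(round(100 * lengyel_affected / lengyel_n, 1))  # Lengyel prevalence in %
print(round(100 * cwc_affected / cwc_n, 1))          # Corded Ware prevalence in %
print(z, p_value)
```

Under these assumed counts the difference comes out significant at the 5% level, which is consistent with the abstract's claim of a significantly higher prevalence in the Lengyel sample.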


Krenz-Niedbala M, A biocultural perspective on the transition to agriculture in Central Europe, Anthropologie, 2014/Volume 52/Issue 2/pp. 115-132, ISSN 0323-1119

See also...

Best of 2008: Corded Ware DNA from Germany

Corded Ware Culture linked to the spread of ANE across Europe

Thursday, September 4, 2014

Ancient North Eurasian (ANE) admixture across Europe & Asia

Another update: ANE is the primary cause of west to east genetic differentiation within West Eurasia


This is an update of a supervised ADMIXTURE analysis that I ran earlier this year looking at ANE levels throughout Asia, the results of which I posted at my other blog (see here). Anyone wanna make a map?

ANE admixture across Europe & Asia spreadsheet

My claim is that these estimates are more accurate than those we've seen recently in scientific literature. Obviously I'm referring here to Lazaridis et al. 2013/14 (see here). That's not to say that the authors of this paper don't know what they're doing. Clearly they do, but at the fine-scale there's usually room for improvement no matter who you are.

For instance, in their paper in table S14.9 they list the Basques (in fact, French Basques) as 11.4% ANE, which sounds reasonable, although perhaps a little too high considering they admit that this population can be modeled as 0% ANE. On the other hand, they estimate the "North Spanish" to be 16.3% ANE.

Now, this reference set is actually from the 1000 Genomes project, where it's listed as Spaniards from Pais Vasco (i.e. Basque Country). Essentially, what this means is that these are Basques from Spain. So why would Basques from France carry only 11.4% ANE, and Basques from Spain a whopping 16.3%? Not only that, but according to Lazaridis et al., these "North Spanish" also can be modeled as 0% ANE.

Obviously, something's not quite right there. Indeed, in my spreadsheet, the very same French Basques are listed as 7.4% ANE, while the Pais Vasco Spaniards as just over 8%. Call me crazy, and many do, but I think these results actually make good sense.

By the way, I made ten synthetic samples from the ANE allele frequencies from this test, and remarkably, in all of the analyses I've run so far they behaved very much like MA-1 or Mal'ta boy, the main ANE proxy. Below, for example, is a Principal Component Analysis (PCA) of West Eurasia featuring these individuals. The result is very similar to the one I obtained with Mal'ta boy (see here).

The synthetic ANE samples are available here. Feel free to play around with them, and if you do, please let me know what you discover.
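For anyone wondering how synthetic individuals can be generated from a set of allele frequencies, here's one simple way to do it. This is only a sketch under stated assumptions: it treats every SNP as independent and in Hardy-Weinberg equilibrium, so each genotype is just two random draws at the component's allele frequency. The frequencies below are made up, and this isn't necessarily the procedure that produced the samples linked above.

```python
import random


def synthetic_genotypes(allele_freqs, n_samples, seed=0):
    """Simulate diploid genotypes (0, 1 or 2 copies of the counted allele).

    allele_freqs: per-SNP frequency of the counted allele in the component.
    Each genotype is the sum of two independent Bernoulli draws, which
    assumes Hardy-Weinberg proportions and no linkage between SNPs.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        genotype = [(rng.random() < p) + (rng.random() < p) for p in allele_freqs]
        samples.append(genotype)
    return samples


# Hypothetical allele frequencies for four SNPs, for illustration only.
freqs = [0.0, 0.25, 0.5, 1.0]
synthetic = synthetic_genotypes(freqs, n_samples=10)
for genotype in synthetic:
    print(genotype)
```

Because linkage between neighbouring SNPs is ignored, individuals made this way are fine for frequency-based methods like PCA and ADMIXTURE, but not for anything that relies on haplotypes.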

As some regular visitors already know, I'm currently designing a new test for GEDmatch that will include various ancient components like ANE. Unfortunately, it might be a while before it's ready, simply because I want it to be as accurate as possible.

See also...

Eurogenes ANE K7

Corded Ware Culture linked to the spread of ANE across Europe

Wednesday, August 13, 2014

Male height in Europe

A new paper in the Economics & Human Biology journal argues that male height in Europe is mostly determined by nutrition and genetics. That's not exactly earth-shattering news. However, the authors also point out that Y-chromosome haplogroup I-M170 shows a strong correlation with the highest average stature on the continent, and speculate that the link between the two might be Upper Paleolithic hunter-gatherer ancestry:

The average height of 45 national samples used in our study was 178.3 cm (median 178.5 cm). The average of 42 European countries was 178.3 cm (median 178.4 cm). When weighted by population size, the average height of a young European male can be estimated at 177.6 cm. The geographical comparison of European samples (Fig. 1) shows that above average stature (178+ cm) is typical for Northern/Central Europe and the Western Balkans (the area of the Dinaric Alps). This agrees with observations of 20th century anthropologists (Coon, 1939; Lundman 1977). At present, the tallest nation in Europe (and also in the world) are the Dutch (average male height 183.8 cm), followed by Montenegrins (183.2 cm) and possibly Bosnians (182.5 cm) (Table 1). In contrast with these high values, the shortest men in Europe can be found in Turkey (173.6 cm), Portugal (173.9 cm), Cyprus (174.6 cm) and in economically underdeveloped nations of the Balkans and former Soviet Union (mainly Albania, Moldova, and the Caucasian republics).


The trend of increasing height has already stopped in Norway, Denmark, the Netherlands, Slovakia and Germany. In Norway, military statistics date its cessation to late 1980s.


In contrast, the fastest pace of the height increase (≥1 cm/decade) can be observed in Ireland, Portugal, Spain, Latvia, Belarus, Poland, Bosnia and Herzegovina, Croatia, Greece, Turkey and at least in the southern parts of Italy.


Although the documented differences in male stature in European nations can largely be explained by nutrition and other exogenous factors, it is remarkable that the picture in Fig. 1 strikingly resembles the distribution of Y haplogroup I-M170 (Fig. 10a). Apart from a regional anomaly in Sardinia (sub-branch I2a1a-M26), this male genetic lineage has two frequency peaks, from which one is located in Scandinavia and northern Germany (I1-M253 and I2a2-M436), and the second one in the Dinaric Alps in Bosnia and Herzegovina (I2a1b-M423). In other words, these are exactly the regions that are characterized by unusual tallness. The correlation between the frequency of I-M170 and male height in 43 European countries (including USA) is indeed highly statistically significant (r = 0.65; p < 0.001) (Fig. 11a, Table 4). Furthermore, frequencies of Paleolithic Y haplogroups in Northeastern Europe are improbably low, being distorted by the genetic drift of N1c-M46, a paternal marker of Ugrofinian hunter-gatherers. After the exclusion of N1c-M46 from the genetic profile of the Baltic states and Finland, the r-value would further slightly rise to 0.67 (p < 0.001). These relationships strongly suggest that extraordinary predispositions for tallness were already present in the Upper Paleolithic groups that had once brought this lineage from the Near East to Europe.
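The reported significance is easy to check from the two numbers given in the excerpt, r = 0.65 and n = 43. The sketch below computes the usual t-statistic for a Pearson correlation, plus an approximate two-sided p-value via the Fisher z-transformation with a normal approximation. That approximation is my choice to keep the example dependency-free, so it may not match the exact test used by the authors.

```python
import math


def correlation_significance(r, n):
    """Return (t-statistic, approximate two-sided p-value) for Pearson's r.

    The t-statistic has n - 2 degrees of freedom. The p-value uses the
    Fisher z-transformation with a normal approximation, not the exact
    t distribution.
    """
    t_stat = r * math.sqrt((n - 2) / (1 - r * r))
    fisher_z = math.atanh(r) * math.sqrt(n - 3)
    p_approx = math.erfc(abs(fisher_z) / math.sqrt(2))
    return t_stat, p_approx


t_stat, p_approx = correlation_significance(0.65, 43)
print(t_stat, p_approx)
```

With these inputs the t-statistic lands well above 5 and the approximate p-value is far below 0.001, so the significance claim holds up. Keep in mind that this says nothing about causation.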


Grasgruber et al., The role of nutrition and genetics as key determinants of the positive height trend, Economics & Human Biology, available online 7 August 2014, DOI: 10.1016/j.ehb.2014.07.002