PLoS One has just put out a major paper on the genetic structure of Balto-Slavs, and unfortunately I have to say it's a major disappointment.
Kushniarevich A, Utevska O, Chuhryaeva M, Agdzhoyan A, Dibirova K, Uktveryte I, et al. (2015) Genetic Heritage of the Balto-Slavic Speaking Populations: A Synthesis of Autosomal, Mitochondrial and Y-Chromosomal Data. PLoS ONE 10(9):e0135820. doi:10.1371/journal.pone.0135820
The authors used a very small number of Polish samples from the Estonian Biocentre for their genome-wide analysis (see here). Most of these Poles were sampled in Estonia, not Poland, and some even resemble northern Russians in their unusual ancestry proportions. Note the dichotomy within the Polish set in the levels of the lemon yellow "Siberian/Volga-region" component in the ADMIXTURE bar graph from the paper.
I raised this issue with Estonian Biocentre Research Director Mait Metspalu a while ago, when these samples were first published, and this was his response:
What we have is self identity. As you can see from Supplementary table 1 these samples have been collected in Estonia and the donors are self reported Poles.
Well, I'm sure there are people in South America who identify as Spanish. But would anyone in their right mind use them to make inferences about the genetic structure and history of the people of Spain?
Fine-scale population genetic analyses like this should only be done with lots of samples from the right places. Self-reported Poles from Estonia, and perhaps also Russia, who clearly don't resemble Poles from Poland, aren't good enough. Estonian Biocentre scientists do a lot of useful work, but in this particular instance they were rather sloppy in labeling these Estonian Poles as simply Polish.
Kushniarevich et al., for their part, were very sloppy in not stating where their Polish samples were really from (their map places them in the middle of Poland) and in not bothering to remove obvious outliers from their dataset.
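For what it's worth, screening for this kind of outlier isn't hard. Here's a rough sketch of one way to do it, using a median-based (modified z-score) check on a single ADMIXTURE component. To be clear, this is not the method used in the paper: the function name, the 3.5 cutoff and the sample values below are all my own illustrative assumptions, not real data.

```python
from statistics import median


def flag_outliers(proportions, threshold=3.5):
    """Return the sample IDs whose modified z-score exceeds the threshold.

    proportions maps a sample ID to the share of one ancestry component.
    The 3.5 cutoff is a common rule of thumb, assumed here for illustration.
    """
    values = list(proportions.values())
    med = median(values)
    # Median absolute deviation: a spread estimate that the outliers
    # themselves barely influence, unlike the standard deviation.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread at all, so nothing can be flagged
    return [
        sample_id
        for sample_id, value in proportions.items()
        if 0.6745 * abs(value - med) / mad > threshold
    ]


# Hypothetical component shares for one labeled population (made-up numbers).
siberian_component = {
    "sample_A": 0.01,
    "sample_B": 0.02,
    "sample_C": 0.01,
    "sample_D": 0.03,
    "sample_E": 0.02,
    "sample_F": 0.12,
}

print(flag_outliers(siberian_component))
```

With these made-up numbers only sample_F gets flagged. A median-based check is used here rather than mean and standard deviation because, in a very small sample set, one or two extreme individuals can inflate the standard deviation enough to hide themselves.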