GPS - frequently asked questions



This FAQ is provided as general information about GPS only and SHOULD NOT be taken as a recommendation to buy genetic tests or services or to participate in genetic studies.

What is GPS?
Biogeography deals with questions such as where the mutation that created a unique haplogroup, or a unique genomic signature of autosomal DNA, occurred. DNA changes in every generation. However, some changes are large (e.g., when a person of East Asian ancestry and a person of European ancestry have a child together) and some are small (e.g., when two people of East Asian ancestry have a child together). The Geographic Population Structure (GPS) tool infers the most recent geographic origin of a person's DNA. Due to frequent population movements, Y-chromosome and mtDNA haplogroups can only be used to infer broad geographical origins (e.g., East Europe) and only for the past 200,000-20,000 years. By contrast, GPS uses autosomal data, which are highly accurate for the recent past but lose information with every generation going back (every person inherits 50% of their autosomal DNA from their father, but only about 25% from their paternal grandfather, etc.). GPS is designed for individuals whose four grandparents share the same geographical origin; for everyone else, it returns the average of their ancestral origins. Therefore, GPS is best suited for regions that have a stable geographic history (e.g., <3000 in Oceania and <1000 in Europe). To learn more about GPS, please watch this short video.


An introductory video to GPS produced by The University of Sheffield.
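
The halving of autosomal contribution with each generation, mentioned in the answer above, can be written out as a small calculation. This is only an illustrative sketch: the function name `expected_share` is hypothetical, and the values are expected averages (real inheritance from grandparents and earlier ancestors varies around them because of recombination).

```python
def expected_share(generations_back: int) -> float:
    """Expected fraction of autosomal DNA inherited from one ancestor.

    generations_back = 1 is a parent, 2 a grandparent, and so on.
    This is an average expectation, not a measured value.
    """
    if generations_back < 0:
        raise ValueError("generations_back must be non-negative")
    return 0.5 ** generations_back


# Parent, grandparent, great-grandparent.
for g in (1, 2, 3):
    print(g, expected_share(g))
```

Each step back halves the expected share, which is why the signal of any single ancestor fades quickly while the signal of a stable local population does not.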

What is the logic behind GPS?
GPS is based on a new conceptualization of human populations that considers everyone as mixed from different gene pools. This model is fundamentally different from existing approaches, which suggest that humans branched from 3 or 4 populations that changed over time. GPS instead relies upon the genetic uniqueness that results from the mixing of global gene pools in different proportions. In addition, GPS exploits autosomal DNA (chromosomes 1-22), which has better sensitivity to the genetic signature of one's most recent common ancestors.

How accurate is GPS?
In the GPS paper (Elhaik et al. 2014), GPS predicted the continental origins of worldwide populations with 98% accuracy (the remaining 2% are due to the heterogeneous Caribbean populations). GPS assigned 83% of worldwide individuals to their country of origin and, when applicable, 66% of them to their regional locations. Applied to a dataset of Southeast Asian and Oceanian populations residing on islands, GPS assigned 87.5% of the individuals to the correct island. Applied to a dataset of Sardinian villagers, GPS located 25% of the villagers in their correct village, 50% within 15 km of their village, and most of the villagers within 100 km of their homes.

In 2016, GPS was applied to a population of unknown origins: Yiddish and non-Yiddish speaking Ashkenazic Jews (Das et al. 2016). Yiddish has been used since the 9th century, for over 1,100 years, and while it has been thought to have a German origin, a Slavic origin has also been proposed. GPS localized 93% of the individuals to a hub of the ancient Silk Road and other trade routes in northeastern Turkey, where four villages (Iskenaz, Eskenez, Ashanas, and Aschuz (destroyed in 640 AD)) have been located with names that may be derived from the word Ashkenaz.


GPS predictions of Ashkenazic Jewish DNA samples may have found the location of ancient Ashkenaz.
Figure from Das et al. (2016).

This is the only region in the world where such placenames exist.
The combined results of the study suggest that this may be the historical location of ancient Ashkenaz: one that existed from the early centuries A.D. up to the 6th-7th centuries.

How does GPS work?
GPS applies the algorithm described in our 2014 GPS paper. It calculates the origin of an unknown DNA sample by comparing it to a registry of DNA samples of known geographical origin obtained from different regions of the world. GPS computes the genetic distances between the unknown sample and the known samples from worldwide populations and converts these genetic distances to geographic ones. It then places the unknown sample between the populations of known geographic origin and reports its coordinates, which is similar in concept to a satellite navigation system locating a car. These coordinates mark the last place where the DNA changed at the population level, that is, where two populations came together and created the DNA under analysis.


Illustrating how the GPS tool works.
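
The compare-and-place idea described in this answer can be illustrated with a toy sketch. It is not the published GPS code: the reference panel, the population labels, the three gene-pool proportions, the Euclidean distance, and the inverse-distance weighting over the `k` closest populations are all simplifying assumptions chosen to keep the example short.

```python
import math

# Hypothetical reference panel, invented for illustration only:
# population label -> (gene-pool proportions, (latitude, longitude)).
REFERENCE = {
    "pop_A": ((0.70, 0.20, 0.10), (50.0, 10.0)),
    "pop_B": ((0.60, 0.30, 0.10), (45.0, 20.0)),
    "pop_C": ((0.10, 0.20, 0.70), (35.0, 105.0)),
}


def genetic_distance(a, b):
    """Euclidean distance between two gene-pool proportion vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_location(sample, reference=REFERENCE, k=2):
    """Place a sample among its k genetically closest reference populations."""
    scored = sorted(
        (genetic_distance(sample, props), coords)
        for props, coords in reference.values()
    )[:k]
    # Closer populations pull harder; the small constant avoids division by zero.
    weights = [1.0 / (dist + 1e-9) for dist, _ in scored]
    total = sum(weights)
    lat = sum(w * coords[0] for w, (_, coords) in zip(weights, scored)) / total
    lon = sum(w * coords[1] for w, (_, coords) in zip(weights, scored)) / total
    return lat, lon


print(predict_location((0.65, 0.25, 0.10)))
```

A sample whose proportions sit halfway between pop_A and pop_B is placed approximately midway between their coordinates, while the genetically distant pop_C is ignored. Averaging latitude and longitude directly is a shortcut that ignores the curvature of the Earth and the date line, so it is only reasonable for nearby reference points.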

How do GPS results compare with those of tools like PCA, SPA, or TreeMix, which visualize humans as originating from 3-4 groups?
TreeMix and other approaches plot human populations as if they "evolved" or "drifted" from other ancestral populations. They do not effectively explain the patterns of heterogeneity in human populations. By implementing an admixture-based model, one that attempts to account for interbreeding, GPS treats all populations as mixed to some degree. GPS results may therefore not match results obtained from other tools.

How does GPS account for multiple ancestors?
Genealogical logic should not be confused with the logic behind GPS. GPS does not report the geographic origins of each individual ancestor, but rather the geographical origin of one’s DNA. Why is it not the same? Assume that one’s four grandparents came from East Asia. Although they are four different individuals, their DNA signatures are very similar and GPS would count them as one. In other words, when one’s ancestors stay in the same place and reuse the same genes over and over again, the DNA signature does not change. GPS will consider this a single DNA signature and will infer its East Asian geographical origins.

In the case of parents from two different origins what would GPS report?
GPS-1 cannot handle mixed origins (instances where the parents come from different genetic backgrounds). It would report the middle place between the two gene pools. Future GPS development will address parents of multiple origins.
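
To give a feel for what a "middle place" between two origins can look like, the sketch below computes the geographic midpoint of two coordinates on a sphere. It assumes that "middle" can be read as a great-circle midpoint, which is an interpretation for illustration rather than a description of how GPS-1 arrives at its answer; the function name `geographic_midpoint` is hypothetical.

```python
import math


def geographic_midpoint(p1, p2):
    """Great-circle midpoint of two (latitude, longitude) pairs in degrees.

    The result is undefined for exactly antipodal points.
    """
    def to_xyz(lat_deg, lon_deg):
        # Convert to a unit vector on the sphere.
        lat, lon = math.radians(lat_deg), math.radians(lon_deg)
        return (
            math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat),
        )

    x1, y1, z1 = to_xyz(*p1)
    x2, y2, z2 = to_xyz(*p2)
    # Average the two unit vectors, then project back to latitude/longitude.
    x, y, z = (x1 + x2) / 2, (y1 + y2) / 2, (z1 + z2) / 2
    lat = math.atan2(z, math.hypot(x, y))
    lon = math.atan2(y, x)
    return math.degrees(lat), math.degrees(lon)


# Two points on the equator, 90 degrees of longitude apart.
print(geographic_midpoint((0.0, 0.0), (0.0, 90.0)))
```

A midpoint like this can fall in a region where neither parent's ancestors ever lived, which is why a result for a person of mixed origin should not be read as an ancestral homeland.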

How can I run GPS on my data?
GPS tools are available through several means:
1) The GPS code was published in our GPS paper (most suitable for academics).
2) For those who wish to trace their own origins, the next generation of GPS technologies, called GPS Origins, became commercially available in August 2016. The University of Sheffield has exclusively licensed the commercial rights for GPS Origins to DNA Diagnostics Centre, Inc., which provides this service.


Can you help me get my GPS results?
For the general public, this FAQ page is all that can be provided. There will be no response to sample submissions or data files.

I have a very interesting family history that I would like to share with you
I am happy to hear interesting stories. However, my group does not have the bandwidth to perform validations.

How can I take part in GPS research?
At this moment, we are not looking for participants.