Concepts – Genetic Distance

At Family Tree DNA, your Y DNA and full sequence mitochondrial matches display a column titled Genetic Distance.  One of the most common questions I receive is how to interpret genetic distance.

GD example 2

Many people mistakenly assume that genetic distance is the number of generations to a common ancestor, but that is NOT AT ALL what genetic distance means.

Genetic distance is the number of mutational differences between the participant (you) and that particular match. In other words, it is how many mismatches there are between your DNA and that person’s DNA.

While the concept is the same, Y DNA and mitochondrial DNA Genetic Distance function a little differently, so let’s look at them separately.

Y DNA Genetic Distance

I wrote about genetic distance as part of a larger article titled “Concepts – Y DNA Matching and Connecting with your Paternal Ancestor,” but I’m going to excerpt the genetic distance portion of that article here.

You’ll notice on the Y DNA matches page that the first column says “Genetic Distance.”

STR genetic distance

Looking at the example above, if this is your personal page, then you mismatch with Howard once, and Sam twice, etc.

Counting Genetic Distance

Genetic distance for Y DNA can be counted in different ways, and Family Tree DNA utilizes a combination of two scientific methods to provide the most accurate results. Let’s look at an example.

In the methodology known as the Step-Wise Mutation Model, each difference is counted as 1 step, because the mutation that caused the difference happened in one mutation event.

STR genetic distance calc

So, if marker 393 has mutated from 12 to 13, the difference is 1. If that is the only mutation between these two men, the total genetic distance would be 1.

However, if marker 390 mutated from 24 to 26, the difference is 2, because those mutations most likely occurred in two different steps – in other words marker 390 had a mutation two different times, perhaps once in each man’s line.  Therefore, the total genetic distance for these two men, combining both markers and with all of their other markers matching, would be 3.
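The step-wise counting described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Family Tree DNA's actual code. The function name and the third marker (DYS19, included as an example of a matching marker) are assumptions added for the example.

```python
def stepwise_distance(person1, person2):
    """Sum the repeat-count differences over the markers both men tested."""
    return sum(abs(person1[m] - person2[m]) for m in person1 if m in person2)

# Marker 393 differs by 1 and marker 390 differs by 2, as in the text.
# DYS19 is a hypothetical marker on which the two men match.
person1 = {"DYS393": 12, "DYS390": 24, "DYS19": 14}
person2 = {"DYS393": 13, "DYS390": 26, "DYS19": 14}

print(stepwise_distance(person1, person2))  # 3
```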

Easy – right?  You know this is too easy!

Some markers don’t play nice and tend to mutate more than one step at a time, sometimes creating additional marker locations as well.  They’re kind of like a copy machine on steroids. These are known as multi-copy (or palindromic) markers and have more than one value listed for each marker.  In fact, marker 464 typically has 4 different values shown, but can have several more.

The multiple mutations shown for those types of multi-copy markers tend to occur in one step, so they are counted as one event for that marker as a whole, no matter how large the mathematical difference between the values. This calculation method is called the Infinite Alleles Mutation Model.

str genetic distance calc 2 v2

Because marker 464 is calculated using the infinite alleles model, even though there are two differences, the calculation only notes that there IS a difference, and counts that difference as having occurred in one step, counting only as 1 in genetic distance.

However, if one man also has one or more extra copies of the marker, shown below as 464e and 464f, that is counted as one additional genetic distance step, regardless of the number of additional copies of the marker, and regardless of the values of those copies.

STR genetic distance calc 3 v2

With markers 464e and 464f, which person 2 carries and person 1 does not, the mathematical difference is 17 for each marker, and counting each extra marker as its own event would give 1 for each. But since the copy event likely happened at one time, it’s considered a mutational difference or genetic distance of only 1, not 34 or 2. Therefore, in our example, the total genetic distance for these men is now 5, not 8 or 38.
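Here is a rough sketch of how the infinite alleles counting for a multi-copy marker like 464 might look in Python. It follows the description in this article, not Family Tree DNA's proprietary implementation. The function name and the 464a through 464d values are hypothetical; only the two extra copies with a value of 17 come from the example.

```python
def multicopy_distance(copies1, copies2):
    """Count events for one multi-copy marker under the infinite alleles model.

    Each argument lists the repeat values for the copies a man carries
    (464a, 464b, 464c, ...), in the order they are reported.
    """
    distance = 0
    shared = min(len(copies1), len(copies2))
    # Any difference among the copies both men carry counts as one event,
    # however many values differ and however large the differences are.
    if copies1[:shared] != copies2[:shared]:
        distance += 1
    # Extra copies count as one more event, regardless of how many there are.
    if len(copies1) != len(copies2):
        distance += 1
    return distance

person1_464 = [15, 15, 17, 17]          # hypothetical values for 464a-d
person2_464 = [15, 16, 17, 18, 17, 17]  # two differences plus 464e and 464f

print(multicopy_distance(person1_464, person2_464))  # 2
```

Added to the 3 steps from markers 393 and 390, those 2 steps give the running total of 5.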

In our last example, a deletion has occurred, which sometimes happens at marker location 425. When a deletion occurs, all of the DNA at that location is permanently deleted, or omitted, between father and son, and the value is 0.  Once gone, that DNA has no avenue to ever return, so forever more, the descendants of that man show a value of zero at marker 425.

STR genetic distance calc 4 v2

In this deletion example, even though the mathematical difference is 12, the event happened at once, so the genetic distance for a deletion is counted as 1. The total genetic distance for these two men now is 6.
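The deletion rule can be folded into the per-marker count with one extra check. The sketch below is illustrative only and uses a hypothetical function name; it treats a value of 0 as a deletion and counts it as a single event.

```python
def marker_distance(value1, value2):
    """Genetic distance contributed by one single-copy marker."""
    if value1 == value2:
        return 0
    if value1 == 0 or value2 == 0:
        # A deletion is one event, whatever the arithmetic gap.
        return 1
    # Otherwise count each repeat of difference as one step.
    return abs(value1 - value2)

person1 = {"393": 12, "390": 24, "425": 12}
person2 = {"393": 13, "390": 26, "425": 0}

single_copy_total = sum(marker_distance(person1[m], person2[m]) for m in person1)
print(single_copy_total)  # 4
```

Adding the 2 steps counted for marker 464 brings the total to 6, matching the example.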

In essence, the Total Genetic Distance is a mathematical calculation of how many times mutations happened between the lines of these two men since their common ancestor, whether that common ancestor is known or not.

Family Tree DNA provides the TIP calculator, which helps estimate the time to a common ancestor using a proprietary algorithm that includes individual marker mutation rates.  You can read more about this in the Y DNA Concepts article or in the TIP article.

Please note that on July 26, 2016 Family Tree DNA introduced changes in how the genetic distance is calculated for some markers to be less restrictive.  You can read about the changes here.

Mitochondrial DNA

GD mt example

Mitochondrial DNA Genetic Distance is a bit different. In order to be shown as a match, you must be an exact match in the HVR1 and HVR2 regions, so there is no genetic distance shown, because there are no mutations allowed.

At the full sequence level, you are allowed 4 or fewer mismatches to be considered a match.

Genetic distance means how many mismatches you have to another person when comparing your 16,569 mitochondrial locations to theirs. The full sequence test tests all of those locations.
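To make the idea of counting mismatches concrete, here is a minimal sketch in Python. It assumes each person's results are a set of mutation names, and it skips the 315 insertions, which (as noted later in this article) are not included in the count. The function name, the `ignored_prefixes` parameter and the sample mutation lists are all made up for illustration; the vendor's real comparison is certainly more involved.

```python
MAX_MISMATCHES = 4  # full sequence threshold as stated in this article

def mtdna_genetic_distance(mutations_a, mutations_b, ignored_prefixes=("315.",)):
    """Count mutations that appear in one person's list but not the other's."""
    def relevant(mutations):
        # Drop unstable insertions that are left out of the count.
        return {m for m in mutations if not m.startswith(ignored_prefixes)}
    return len(relevant(mutations_a) ^ relevant(mutations_b))

# Hypothetical mutation lists for two testers.
me = {"A263G", "315.1C", "T16126C", "C16519T"}
match = {"A263G", "T16126C", "C16519T", "G185A"}

distance = mtdna_genetic_distance(me, match)
print(distance, distance <= MAX_MISMATCHES)  # 1 True
```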

Of course, in general, fewer mismatches mean you are more closely related to that person than to someone with more mismatches. I say in general because I have seen a situation where a mutation occurred between mother and child, meaning that individual had a genetic distance of 1 when compared to their mother, along with anyone who matched their mother exactly. Clearly, they are far more closely related to their mother than to their mother’s matches.

One of the most common questions I receive about genetic distance is how to convert genetic distance to time – meaning how long ago am I related to someone who has a genetic distance of 1 or 2, for example.

The answer is that it depends and it varies widely, very widely.  I know, I hate the “it depends” answer too.

Turning to the Family Tree DNA Learning Center, we find the following information:

    • Matching on HVR1 means that you have a 50% chance of sharing a common maternal ancestor within the last fifty-two generations. That is about 1,300 years.
    • Matching on HVR1 and HVR2 means that you have a 50% chance of sharing a common maternal ancestor within the last twenty-eight generations. That is about 700 years.
    • Matching exactly on the Mitochondrial DNA Full Sequence test brings your matches into more recent times. It means that you have a 50% chance of sharing a common maternal ancestor within the last 5 generations. That is about 125 years.
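The year figures in the list above all work out to 25 years per generation. That number is not stated outright; it is inferred here from 52 generations being about 1,300 years. A quick check in Python, with hypothetical names:

```python
YEARS_PER_GENERATION = 25  # inferred from 52 generations being about 1,300 years

def generations_to_years(generations):
    """Convert a generation count to years at 25 years per generation."""
    return generations * YEARS_PER_GENERATION

estimates = {"HVR1": 52, "HVR1 + HVR2": 28, "Full sequence": 5}
for level, generations in estimates.items():
    print(f"{level}: {generations} generations is about "
          f"{generations_to_years(generations)} years")
```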

I think the full sequence estimate is overly generous. I seldom find identifiable matches, and I do have my genealogy back more than 5 generations on my mitochondrial line and so do many of my clients.

My 4 times great-grandmother, or 6 generations distant from me (counting my mother as generation 1), Elisabetha Mehlheimer, was found living in Goppmansbuhl, Germany when she gave birth to her daughter in 1823. This puts Elisabetha’s birth around 1800, or possibly earlier, very probably in the same village in Germany.  German church records compulsively identify people who aren’t residents, and even residents who originally came from another location.

Part of my mitochondrial full sequence matches are shown below.

GD my results

Looking at my 13 exact matches, it becomes obvious very quickly that my matches aren’t from Germany, they are primarily from Scandinavia. Not at all what I expected. I created this chart to view the match locations. I have omitted anyone who did not provide either location or oldest ancestor information. Fortunately, Scandinavians are very good about participating fully in DNA testing and by and large, they want to get the most out of their results. The way to do that, of course is to include as much information as possible so that we can all benefit by sharing and collaboration.

Match        Genetic Distance  Location                  Birth Year of Most Distant Ancestor
TS           0                 Norway                    1758
Svein        0                 Norway                    1725
Bo-Lennart   0                 Norway                    1725
Per          0                 Norway                    1718
Hakan        0                 Sweden                    1716
Ragnhild     0                 Sweden                    1857
Constance    0                 Russia
Teresa       0                 Poland                    1750
Valerie      0                 Norway                    1763
Vladimir     0                 Russia
Rose         0                 Sweden                    1845
IRL          0                 Norway                    1702
Lynn         0                 Norway                    1696
Anastasia    1                 Russia above Georgia      1923
AJ           1                 Sweden                    1771
Marianne     1                 Sweden                    1661
Inga         1                 Sweden                    1691
Inger        1                 Sweden
Marianne     1                 Sweden                    1661
Maria        1                 Poland                    C 1880
Marie M.     1                 Bavaria, Germany          1836
Tomas        2                 Probably Czech Republic   1880
DL           2                 Sweden                    1827
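If you keep your matches in a spreadsheet, a few lines of Python can tally them by genetic distance and location. This sketch simply re-enters the distance and location columns from the chart above; the variable names are arbitrary.

```python
from collections import Counter, defaultdict

# (genetic distance, location) pairs copied from the chart above.
matches = [
    (0, "Norway"), (0, "Norway"), (0, "Norway"), (0, "Norway"),
    (0, "Sweden"), (0, "Sweden"), (0, "Russia"), (0, "Poland"),
    (0, "Norway"), (0, "Russia"), (0, "Sweden"), (0, "Norway"),
    (0, "Norway"),
    (1, "Russia above Georgia"), (1, "Sweden"), (1, "Sweden"),
    (1, "Sweden"), (1, "Sweden"), (1, "Sweden"), (1, "Poland"),
    (1, "Bavaria, Germany"),
    (2, "Probably Czech Republic"), (2, "Sweden"),
]

by_distance = defaultdict(Counter)
for distance, location in matches:
    by_distance[distance][location] += 1

for distance in sorted(by_distance):
    print(distance, by_distance[distance].most_common())
```

For the exact matches, the tally comes to 7 in Norway, 3 in Sweden, 2 in Russia and 1 in Poland, so 10 of the 13 are Scandinavian.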

A quick look at my matches map shows the distribution of my matches more visually, although not everyone includes their matrilineal ancestor’s geographic information, so they don’t have pins on the map. In my case, I’m lucky because several people have included geographical information which makes the maps very useful. The white pin is where Elisabetha Mehlheimer lived.  Red pins are exact matches, orange are one mutation difference and yellow are two.

GD matches map

I am very clearly not related to these individuals within 6 generations, and probably not for several more generations back in time. The one match from Germany is one mutation different, which certainly could mean that we share a common ancestor and her line had a mutation while my line didn’t. Wurttemberg and Bavaria do share borders and are neighboring districts in southern Germany as illustrated by this 1855 map of Bavaria and Wurtemberg.

GD Bavaria Wurttemberg

Unfortunately, there is no “rule of thumb” for mitochondrial DNA genetic distance relative to years and generations distant. In other words, there is no TIP calculator for mtDNA. I did some research some years ago attempting to quantify MRCA (most recent common ancestor) time and answer this very question, but the only research papers I was able to find referred to studies on penguins.

How Far is Far?

In some cases, I know that a common ancestor actually reached back hundreds to thousands of years. Of course, relationships in female lines are more difficult to “see” since the surname changes with every generation, historically. In Y DNA, you can look at the surname of the participant and determine immediately if there is a likelihood that you share a common paternal ancestor if the surname matches. Let’s look at some mitochondrial examples.

I recently had a client who matched her haplogroup assignment exactly, with no additional unusual mutations found as compared to the expected mitochondrial mutation profile. She had several exact matches. Her haplogroup? H7a2, which was formed about 2500 years ago, with a standard deviation of 2609, according to the supplemental data from the paper, “A ‘Copernican’ Reassessment of the Human Mitochondrial DNA Tree from its Root” by Doron Behar, et al, published in The American Journal of Human Genetics, Volume 90, April 6, 2012. This means that H7a2 could have been formed anytime from recently to about 5000 years ago, with 2500 being the most likely and best fit.

Standard deviation, in this case, means the dates could be off that much in either direction, but the further from 2500, the less likely it is to be accurate.

Conversely, another recent client was haplogroup U2b, formed roughly 30,000 years ago, with a standard deviation of 5,800 years. The client had 16 differences, which averages to about one mutation every 2,000 years. Is that what actually happened or did those mutations happen in fits and starts? We don’t know.

A last example is my own DNA with two relevant differences from my haplogroup profile, J1c2f, which was formed about 2,000 years ago with a standard deviation of 3,100 years. Technically, this means my haplogroup might not be formed yet (joke) since 2,000 years ago minus 3,100 years hasn’t happened yet. While that obviously can’t be true, the standard deviation is relevant in the other direction. In essence, what this says is that my haplogroup could be fairly young, probably is about 2000 years old, and could be as old as 5,100 years. Given the clustering, it’s likely that J1c2f was formed in Scandinavia and a few descendants, at some time, migrated into continental Europe and Russia.
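The date ranges in these three examples are just the estimated age plus or minus the standard deviation, with the lower end capped at zero since a haplogroup can't form in the future. The sketch below reproduces that arithmetic; the function name and the `n_sd` parameter (how many standard deviations to span) are assumptions added for the example.

```python
def age_range(age, sd, n_sd=1):
    """Return (youngest, oldest) plausible age in years, never below zero."""
    youngest = max(0, age - n_sd * sd)
    oldest = age + n_sd * sd
    return youngest, oldest

# (estimated age, standard deviation) as quoted in the text.
haplogroups = {"H7a2": (2500, 2609), "U2b": (30000, 5800), "J1c2f": (2000, 3100)}

for name, (age, sd) in haplogroups.items():
    print(name, age_range(age, sd))

# Average spacing of the U2b client's 16 differences over 30,000 years.
years_per_mutation = 30000 / 16
print(years_per_mutation)  # 1875.0
```

The U2b average works out to 1,875 years per mutation, which is where the figure of about 2,000 years comes from.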

GD extra mutations

By the way, the 315 “extra mutations” insertions are too unstable to be considered relevant. They are not included in the genetic distance count in your results.

At the other end of the spectrum, I know of one person who has a mutation between themselves and an aunt and a different mutation when compared with a sister.  Furthermore, those mutations occurred in the HVR1 and HVR2 regions, meaning that these women don’t show as matches to each other until you get to the coding region where the full range of full sequence matches are shown and 4 mutations are allowed.  This caused a bit of panic initially, but was perfectly legitimate and understandable once the actual results were compared. Is this rare? Absolutely. Is it possible? Absolutely.

As you can see, there just isn’t any good measure for mitochondrial DNA mutation timing.  Mutations don’t happen on any time schedule, unfortunately.

I use genetic distance as a gauge for relative relatedness, no pun intended, and I keep in mind that I might actually be more closely related to someone with a slightly further genetic distance than an exact match.

While you can’t compare your actual results to matches online, you can contact your matches to compare actual results.  In my case, I developed a branching tree mutation chart that showed that a group of the people in Sweden with one mutation difference actually all shared an additional mutation that I, and my exact matches, don’t have.  In other words, this Swedish group forms a new branch of the tree and will likely, someday, be a new subhaplogroup of J1c2f.
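Once matches have shared their mutation lists, the branching idea can be approximated by grouping people by the mutations they carry that you do not. The sketch below is one hypothetical way to do that and is not the chart described above. The names and mutation labels (M1, X1 and so on) are placeholders, and it only looks at mutations the matches have in addition to yours.

```python
from collections import defaultdict

def group_by_extra_mutations(my_mutations, matches):
    """Group match names by the set of mutations they have that I lack."""
    groups = defaultdict(list)
    for name, mutations in matches.items():
        extra = frozenset(set(mutations) - set(my_mutations))
        groups[extra].append(name)
    return groups

# Placeholder data: M1-M3 stand for my profile, X1 and X2 for extra mutations.
mine = {"M1", "M2", "M3"}
matches = {
    "Swede A": {"M1", "M2", "M3", "X1"},
    "Swede B": {"M1", "M2", "M3", "X1"},
    "Norwegian C": {"M1", "M2", "M3"},
    "Other D": {"M1", "M2", "M3", "X2"},
}

groups = group_by_extra_mutations(mine, matches)
for extra, names in groups.items():
    print(sorted(extra), names)
```

Any group of two or more people sharing the same extra mutation, like the two placeholder Swedes here, is a candidate for a new branch.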

Sometimes digging a little deeper reveals fascinating patterns that aren’t initially evident.

Summary

When working with genetic distance, look for patterns, not only in terms of geography, but in terms of matching mutations and grouping of individuals.  Sometimes the combination of mutation patterns and geography can reveal information that could not be obtained any other way – and may lead you to your common ancestor, with or without a name.

For example, I know that my common ancestor with these people probably lived someplace in Scandinavia about 2000 years ago, based upon both the clustering and the branching.  How my ancestor got to Germany is still a mystery, but one that might potentially be solved by looking at the history of the region where my known ancestor is found in 1800.

Happy hunting!

30 thoughts on “Concepts – Genetic Distance”

  1. I have what may be determined here as two half family members. A half first cousin and a half uncle. We all have different mothers. Our Y-DNA 67 marker tests together have me measured as a distance of one to both. The interesting part is one of the samples (461634) is a half uncle who is 50% Costa Rican! A place where his father had lived temporarily. Should we assume that we all have the same grandfather?

    461634 Martin (half uncle and 50% Costa Rican)
    173590 Harry (me)
    255308 Douglas (half first cousin)

    http://www.worldfamilies.net/surnames/blanton/results

      • Disagree! I am far beyond the initial research and investigative portion of my genealogy. We indeed have the same paternal line without question. I was hoping to get a professional opinion on the data attached.

  2. Hi Roberta,

    I am following your blog with great interest for the past month. Excellent explanation of the genetic distance !

    At the beginning of the “Mitochondrial DNA” section of the present article you are showing an image of the full mt sequence results for a single individual. Are these your own results? I am asking the question because I am also with FTDNA and I have been trying to see how my own results appear to my matches. I was never able to do so.

    Thank you,

  3. My maternal first cousin and I both took the Full Sequence mtDNA test through FTDNA. Mine came back with a GD=1 to her. Turns out I had a “heteroplasmy”, perhaps thanks to my dad, and FTDNA considered that to be one step and hence the difference. So for genealogy purposes I ignored it.

  4. A technical note regarding Standard Deviation: The confidence level for the first SD is 67%, meaning that one out of three reported values falls outside the first SD. The 95% confidence level is reached at two SDs. Thus, a value of 5000 years with a 2000 y SD really means that 95% of the values fall between 5000 +/- 4000 years, or between 1000 and 9000 years ago.

  5. A very timely post. I just got my notice today that my uncle’s YDNA-37 kit had arrived at the FTDNA lab. This is the first YDNA test that I’ve purchased, so I’m still learning. Your blogs are so helpful for this. There are so many links I have to follow to get informed about YDNA testing. Now I just have to wait for the processing to happen! Must be patient!

    • VAN! I am currently spending time with our Landrys in the far north – Cape Breton and Isle Madame. Working on a few mysteries of my own. I look forward to hearing about your results. Cousin Deborah

  6. Hi
    My Dys464 are all one number , which is 11 ( there is no 10 or lower in my haplogroup ). Does this 11 mean I have a very old marker or very young marker?.
    I ask because most people are 11-11-15-16 for my haplogroup ( so this one DYS464 , gives me a difference of 9 immediately ).
    The only person who is 11-11-11-11 and closest genetic distance for me is 19 genetic differences in a 37 marker test .

    regards

    • It doesn’t “mean” anything out of context. Marker values are only relevant when compared to family members to establish lines or to establish baseline haplogroup norms.

  7. It’s very interesting to see different haplogroups & their matches. I only have a 0 distance match with my mom. 🙂 Everyone else, 1 if we’re lucky and you know we have no ties coming from up to 4,000 miles away from each other & having been isolated for at least 8 centuries. I can be 2 genetic distance from other Hawaiians but 1 from Maoris, Pitcairn descendants (from Tahitian women) and even Tongans/Samoans. It’s interesting how that works.

  8. In a Y-DNA 37 marker test FTDNA classifies my genetic distance as 2 with one distant cousin, although the Step-Wise Mutation Model counts three separate mutations. The same GD of 2 occurs between him and a third cousin. Two of the three mutations are at DY389i and DY389ii (and the third at CDY). It appears that FTDNA considers those two mutations at DY389 as a count of one in their GD model. Can you confirm this?

  9. Regarding Y-DNA testing at FTDNA, in 2013 I stumbled upon an algorithmic anomaly which (to the best of my knowledge) has still not been “fixed” or explained to customers. I had tested two men who were understood and documented to be second cousins, but when the results came back they did not show as match … at all. An NPE? Nope. Only through obnoxious persistence did I learn that the FTDNA genetic distance algorithm will not show a match if one kit shows a null value. One of the two men has a null at 393. Otherwise, the two are a 35/36 match.

  10. I have two questions:

    First, one of my matches shows as 0 genetic distance, however he has an extra 309.2C mutation (we compared screenshots of our mutations). Why doesn’t the 309.2C mutation count in genetic distance?

    Second, I have 667 matches on HVR1 level, but the number drops dramatically to 11 at HVR1+2 and 8 at full sequence. The three HVR1+2 matches who aren’t full sequence matches haven’t tested the coding region.

    All full sequence matches are either 0 or 1 genetic distance from me and the 1s are confirmed as plus one from my mutations and not minus one. The lack of match with genetic distance of 2 or 3 is weird, especially since we have 3 extra mutations in the coding region from H2a1.

    Is one of our six extra mutations in the HVR1 and HVR2 region responsible for the lack of distant mtDNA matches?

    • The 309 mutations come and go, so they are considered unstable. Without seeing your results, I can only answer the rest of your question by a qualified “probably.”

      • “Probably” means I’m at least thinking about this in the right way. ^_^

        Let me get more into details then. In the HVR1 and 2, I have C16519T, 309.1C, 315.1C, 522.1A and 522.2C, all of which are said to be “not considered for phylogenetic reconstruction” at phylotree.org, so probably unstable like 309.2C earlier.

        But there’s the C194T which is used to define the H2c1 and H3s haplogroups, so probably a noteworthy mutation. Also, on mitosearch, we have a lot of 1 genetic distance matches without the C194T who don’t show as mt-matches to us, but all the 1 genetic distance matches with C194T are our matches.

        Therefore, I accuse C194T of being the culprit that prevents me from seeing any distant mt-matches! (looking anxiously in an envelope as if playing a Clue game) ^_^

  14. Roberta, we have been having a discussion. Do all people have exactly the same set of 23 chromosomes? And the difference is the SNPs that are on the chromosome that we pick up? I know we are very close to chimpanzees! pat

    • I don’t really understand the question. One set of chromosomes is gender specific and not everyone has a Y chromosome. Men have an X and Y, women have 2 Xs, so those aren’t the same in everyone. Otherwise, the chromosome structure is the same and the differences are mutations at different locations.

  16. Your map of the distribution of your matches looks a bit like the locations various Vikings hailed from or pulled up during the late first millennium. The central German point could be the result of Gustavus Adolphus’s visits to Germany with his army during the Thirty Years’ War. I have a similar Swedish problem in that my most distant maternal ancestor was born in Silesia in 1803, and came to Australia in 1838. My GD=0 matches on FTDNA FMS include two Swedish women, a 4th cousin descended from the same maternal ancestor, and someone in Italy whose grandmother was Swedish. The Swedish Haplogroup Database lists a couple dozen women with the same mt Haplogroup and a few downstream from it. The GD>0 matches include many Finns and other Scandinavians. The haplogroup is K1b2a1. Why is it so? I put it down to Gustavus Adolphus in the early 17thC.

  18. Very informative article. Now I know what GD means! (I’ll probably have to read it a couple more times for it to sink in, but that’s just me.)

    You wrote, “…I keep in mind that I might actually be more closely related to someone with a slightly further genetic distance than an exact match.” As in the situation with the mother and daughter that you describe, correct?

    Out of curiosity, I wonder if the latest additions to the haplotree now show a new downstream group of J1c2f? Have you checked?
