Sometimes, a match is not a match. I know, now I’ve gone and ruined your day…
One of the questions that everyone wants the answer to when looking at matches, regardless of what kind of DNA testing we’re talking about, is “how long ago?” How long ago did I share a common ancestor with my match? Seems like a pretty simple question, doesn’t it?
The answer, especially with mitochondrial DNA, is not terribly straightforward. A perfect example of this fell into my lap this week, and I’m sharing it with you.
Mitochondrial DNA – A Short Primer
There are three regions that are tested in mitochondrial DNA testing for genealogy. The HVR1 and HVR2 regions are tested at most testing companies, and at Family Tree DNA, the rest of the mitochondria, called the coding region, is tested as well with the mega or full mitochondrial sequence test. This is the mitochondrial equivalent of Paul Harvey’s “the rest of the story,” and of course we all know that the real story is always in “the rest of the story” or he wouldn’t be telling us about it!
Many times, the rest of the story is critically important. In mitochondrial DNA, it’s the only way to obtain your full haplogroup designation. If you don’t want to just be haplogroup J or A or H, you can test the coding region by taking the full sequence test and find out that you’re J1c2 or A2 or H21, and discover the story that goes with that haplogroup. Guaranteed, it’s a lot more specific than the one that goes with simple J, A or H. Often it’s the difference between where your ancestor was 2000 years ago and 20,000 years ago – and they probably covered a lot of territory in 18,000 years!
Let’s take a quick look at mitochondrial DNA.
To begin with, the HVR1 and HVR2 regions are called HVR for a reason – it’s short for hypervariable region. And of course, that means they vary, or mutate, a lot more rapidly than the coding region of the mitochondrial DNA.
In layman’s terms, think of a clock. No, not a digital clock, an old-fashioned alarm clock.
The entire mitochondrial DNA has 16,569 locations. The HVR1 and HVR2 regions take up the space on the clock face from 5 till the hour until 5 after the hour. The rest is the coding region – the mitochondrial “rest of the story.” The coding region mutates much more slowly than the two HVR regions.
Just to be sure we’re on the same page, let’s talk for just a minute about how mitochondrial haplogroup assignments work. For a detailed discussion of haplogroup assignments and how they are done, see Bill Hurst’s discussion here.
Generally, a base haplogroup can be reasonably assigned by HVR1 region testing, but not always. Sometimes the assignment changes with full sequence testing – so what you think you know may not be the end result.
My full haplogroup is J1c2f. My base haplogroup is J. I’m on the first branch of J, J1. On branch J1, I’m on the third stick, c, J1c. On the third stick J1c, I’m on the second twig, J1c2. On the second twig, J1c2, I’m leaf f, or J1c2f. Each of these branches of haplogroup J is determined by a specific mutation that happened long ago and was then passed to all of that person’s offspring, between them and me today. The question is always, how long ago?
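If it helps to see the branch, stick, twig and leaf idea spelled out, here is a minimal Python sketch that walks a haplogroup name back to its base. The function name `haplogroup_lineage` is my own invention for illustration, and it assumes the simple alternating letter/number naming shown above; names with special characters (some haplogroups use apostrophes or other symbols) are not handled.

```python
import re

def haplogroup_lineage(name: str) -> list[str]:
    """Return every label from the base haplogroup down to `name`.

    Assumes the name is made of alternating runs of letters and digits,
    for example "J1c2f". This is an illustrative helper, not an official tool.
    """
    # "J1c2f" -> ["J", "1", "c", "2", "f"]
    parts = re.findall(r"[A-Za-z]+|\d+", name)
    lineage = []
    label = ""
    for part in parts:
        label += part          # extend the label by one level
        lineage.append(label)  # record each intermediate branch
    return lineage

print(haplogroup_lineage("J1c2f"))  # ['J', 'J1', 'J1c', 'J1c2', 'J1c2f']
```

Each entry in the list corresponds to one step of the description above, from the base haplogroup J down to leaf f.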
Mutation Rates – How Long Ago is Long Ago?
While we have a tip calculator at Family Tree DNA for Y-line DNA to predict how long ago two Y-line matches shared a most recent common ancestor, we don’t have anything similar for mitochondrial DNA, partly because of the great variation in the mutation rates for the various regions of mitochondrial DNA. Family Tree DNA does provide guidelines for the HVR1 region, but they are so broad as to be relatively useless genealogically. For example, at the 50th percentile, you are likely to share a common ancestor with someone whom you match exactly on the HVR1 mutations within the last 52 generations, or about 1300 years, which takes us back to the year 713. Wait, I know just who that is in my family tree!
These estimates do not take into account the HVR2 or coding regions.
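For anyone who wants to check the arithmetic behind the HVR1 guideline, here is a small Python sketch. The 25 years per generation and the starting year of 2013 are my assumptions, inferred from the figures quoted above (52 generations, about 1300 years, the year 713); they are not official Family Tree DNA parameters.

```python
YEARS_PER_GENERATION = 25  # assumption: implied by 52 generations being about 1300 years
YEAR_WRITTEN = 2013        # assumption: implied by "1300 years" landing on the year 713

def years_ago(generations: int, years_per_generation: int = YEARS_PER_GENERATION) -> int:
    """Convert a number of generations into elapsed years."""
    return generations * years_per_generation

def calendar_year(generations: int, year_now: int = YEAR_WRITTEN) -> int:
    """Calendar year that lies `generations` back from `year_now`."""
    return year_now - years_ago(generations)

print(years_ago(52))      # 1300
print(calendar_year(52))  # 713
```

Change the generation length and the answer moves by centuries, which is part of why these guidelines are so broad.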
I did some research jointly with another researcher not long ago attempting to determine the mutation rate for those regions. We found estimates that ranged from 500 years to several thousand years per mutation occurrence, and it wasn’t always clear in the publications whether they were referring to the entire mitochondrial genome or just certain portions. And then there are those pesky hot-spots that for some reason mutate a whole lot faster than other locations. We’re not even going there. Suffice it to say there is a wide divergence in opinion among academics, so we probably won’t be seeing any type of mito-tip calculator anytime soon.
Family Tree DNA does their best to make our matches useful to us and to eliminate matches that we know aren’t genealogically relevant.
For example, this week, I was working on a client’s DNA Report. Let’s call him Joe. Joe is haplogroup J1c2. I am haplogroup J1c2f. J1c2f has one additional haplogroup defining mutation, in the coding region, that J1c2 does not have.
Joe and I did not show as matches at Family Tree DNA, even though our HVR1 and HVR2 regions are exact matches. Now, for a minute, that gave me a bit of a start. In fact, I didn’t even realize that we were exact matches until I was working with his results at MitoSearch and recognized my own User ID.
I had to think for a minute about why we would not be considered matches at Family Tree DNA, and I was just about ready to submit a bug report, when I realized the answer was my extended haplogroup. This, by the way, is the picture-perfect example of why you need full sequence testing.
Family Tree DNA knows that we both tested at the full sequence level. They know that with a different haplogroup, we haven’t shared a common ancestor within hundreds to thousands of years. So it doesn’t matter if we match exactly on the HVR1 and HVR2 levels, because we DON’T match on a haplogroup defining mutation, which, in this case, happens to be in the coding region, found only with full sequence testing. Even if we have only one mismatch at the full sequence level, if it’s a haplogroup defining marker, we are not considered matches. Said a different way, if our only difference was location 9055 and 9055 was NOT a haplogroup defining mutation, we would have been considered a match on all three levels – exact matches at the HVR1 and HVR2 levels and a 1 mutation difference at the full sequence level. So how a mutation is identified, whether it’s haplogroup defining or not, is critical.
In our case, I carry a mutation at marker 9055 in the coding region that defines haplogroup J1c2f. Joe doesn’t have this mutation, so he is not J1c2f, just J1c2. So we don’t match.
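To make the logic concrete, here is a rough Python sketch of the rule as I’ve described it. This is only my illustration, not Family Tree DNA’s actual implementation: the function name `smart_match` and the idea of passing in a set of haplogroup defining locations are assumptions, and it presumes both people have tested at the full sequence level.

```python
def smart_match(mine: set[int], theirs: set[int], defining: set[int]) -> bool:
    """Return True if two full sequence results should be shown as matches.

    mine, theirs: locations where each person carries a mutation.
    defining: locations treated as haplogroup defining (hypothetical input).
    """
    # Locations where exactly one of the two people carries a mutation.
    differences = mine ^ theirs
    # Any difference at a haplogroup defining location rules out the match.
    return not (differences & defining)

me = {9055}   # I carry the mutation at 9055
joe = set()   # Joe does not; everything else is identical, so it is omitted

print(smart_match(me, joe, defining={9055}))  # False: 9055 defines J1c2f
print(smart_match(me, joe, defining=set()))   # True: same difference, not defining
```

The two calls differ only in whether 9055 counts as haplogroup defining, which is exactly the distinction that separates a one mutation match from no match at all.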
So – How Long Ago for Me and Joe?
Dr. Behar in his “Copernican Reassessment of the Mitochondrial DNA Tree,” which has become the virtual Bible of mtDNA, estimates that the J1c2f haplogroup defining mutation at location 9055 occurred about 2000 years ago, plus or minus another 3000 years, which means my ancestor who had that mutation could have lived as long ago as 5000 years.
The mutations that define haplogroup J1c2 occurred about 9800 years ago, plus or minus another 2000. So we know that the J1c2 ancestor Joe and I share lived about 7,800 – 11,800 years ago, and our lines diverged sometime between then and the J1c2f mutation, 2,000 – 5,000 years ago. So, in round numbers, using the central estimates, our common ancestor lived between 2,000 and 9,800 years ago. Not much chance of identifying that person!
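The ranges above are just an estimate plus or minus its uncertainty, and a tiny Python sketch shows the arithmetic. The helper name `age_range` is mine, and clamping the low end at zero is my own assumption (a mutation I carry can’t have happened in the future), which is why the low end for J1c2f comes out as 0 rather than the central estimate of 2,000 that I used in the round numbers.

```python
def age_range(estimate: int, plus_minus: int) -> tuple[int, int]:
    """Return (youngest, oldest) age in years for an estimate and its uncertainty."""
    youngest = max(0, estimate - plus_minus)  # assumption: ages cannot be negative
    oldest = estimate + plus_minus
    return youngest, oldest

print(age_range(9800, 2000))  # (7800, 11800) for the J1c2 mutations
print(age_range(2000, 3000))  # (0, 5000) for the J1c2f mutation at 9055
```

Either way you slice it, the window for our common ancestor is thousands of years wide.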
The ability to eliminate “near-misses” where the HVR1+HVR2 matches but the people aren’t in the same haplogroup, which is extremely common in haplogroup H, is actually a very useful feature that Family Tree DNA nicknamed SmartMatching. With over 1000 matches at the HVR1 level, more than 200 at the HVR1+HVR2 level and another 50+ at the full sequence level, Joe certainly didn’t need to have any “misleading” matches included that could have been eliminated by a logic process.
So while Joe and I match, technically, if you only look at the HVR1 and HVR2 levels, we don’t really match, and that’s not evident at MitoSearch or at Ancestry or anyplace else that does not take into consideration both full sequence AND haplogroup defining mutations. Family Tree DNA is the only company that does this. Ancestry does not test at the full sequence level, so you can’t even get a full haplogroup assignment there, which is another reason, aside from inaccurate matches, that Ancestry customers often retest at Family Tree DNA.
It’s interesting to think about the fact that 2 people can match exactly at the HVR1+HVR2 levels, but the distance of the relationship can be vastly different. I also match my mother on the HVR1+HVR2 levels, exactly, and our common ancestor is her. So the distance to a common ancestor with an exact HVR1+HVR2 match can be anyplace from one generation (Mom) to thousands of years (Joe), and there is no way to tell the difference without full sequence testing and in this case, SmartMatching.
And that, my friends, is the rest of the story!