Mitochondrial DNA Smartmatching – The Rest of the Story

Sometimes, a match is not a match.  I know, now I’ve gone and ruined your day…

One of the questions that everyone wants the answer to when looking at matches, regardless of what kind of DNA testing we’re talking about, is “how long ago?”  How long ago did I share a common ancestor with my match?  Seems like a pretty simple question, doesn’t it?

The answer, especially with mitochondrial DNA, is not terribly straightforward.  A perfect example of this fell into my lap this week, and I’m sharing it with you.

Mitochondrial DNA – A Short Primer

There are three regions that are tested in mitochondrial DNA testing for genealogy.  The HVR1 and HVR2 regions are tested at most testing companies, and at Family Tree DNA, the rest of the mitochondria, called the coding region, is tested as well with the mega or full mitochondrial sequence test.  This is the mitochondrial equivalent of Paul Harvey’s “the rest of the story,” and of course we all know that the real story is always in “the rest of the story” or he wouldn’t be telling us about it!

Many times, the rest of the story is critically important.  In mitochondrial DNA, it’s the only way to obtain your full haplogroup designation.  If you don’t want to just be haplogroup J or A or H, you can test the coding region by taking the full sequence test and find out that you’re J1c2 or A2 or H21, and discover the story that goes with that haplogroup.  Guaranteed, it’s a lot more specific than the one that goes with simple J, A or H.  Often it’s the difference between where your ancestor was 2000 years ago and 20,000 years ago – and they probably covered a lot of territory in 18,000 years!

Let’s take a quick look at mitochondrial DNA.

To begin with, the HVR1 and HVR2 regions are called HVR for a reason – it’s short for hypervariable region.  And of course, that means they vary, or mutate, a lot more rapidly than the coding region of the mitochondrial DNA.

In layman’s terms, think of a clock.  No, not a digital clock, an old-fashioned alarm clock.

The entire mitochondrial DNA has 16,569 locations.  The HVR1 and HVR2 regions take up the space on the clock face from 5 till the hour until 5 after the hour.  The rest is the coding region – the mitochondrial “rest of the story.”  The coding region mutates much more slowly than the two HVR regions.

Just to be sure we’re on the same page, let’s talk for just a minute about how mitochondrial haplogroup assignments work.  For a detailed discussion of haplogroup assignments and how they are done, see Bill Hurst’s discussion here.

Generally a base haplogroup can be reasonably assigned by HVR1 region testing, but not always.  Sometimes they change with full sequence testing – so what you think you know may not be the end result.

My full haplogroup is J1c2f.  My base haplogroup is J.  I’m on the first branch of J, J1.  On branch J1, I’m on the third stick, c, J1c.  On the third stick J1c, I’m on the second twig, J1c2.  On the second twig, J1c2, I’m leaf f, or J1c2f.  Each of these branches of haplogroup J is determined by a specific mutation that happened long ago and was then passed to all of that person’s offspring, between them and me today.  The question is always, how long ago?
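The branch, stick, twig and leaf naming can be unpacked mechanically. The sketch below is only an illustration of the naming pattern; it assumes a haplogroup name alternates between runs of letters and runs of digits, and the function names `lineage` and `is_on_branch` are made up for this example. It works from the name alone, so it cannot see relationships that the name does not spell out.

```python
import re


def lineage(haplogroup: str) -> list[str]:
    """Return every level from the base haplogroup down to the full name."""
    # Assumption: names alternate letter-runs and digit-runs, e.g. J | 1 | c | 2 | f.
    tokens = re.findall(r"[A-Za-z]+|\d+", haplogroup)
    levels = []
    current = ""
    for token in tokens:
        current += token
        levels.append(current)
    return levels


def is_on_branch(branch: str, haplogroup: str) -> bool:
    """True if `haplogroup` sits on `branch` or further out along it (by name only)."""
    return branch in lineage(haplogroup)


print(lineage("J1c2f"))
print(is_on_branch("J1c2", "J1c2f"))
```

Read this way, J1c2f is on branch J1c2, but J1c2 is not on leaf J1c2f, which is exactly the situation that matters later in this article.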

Mutation Rates – How Long Ago is Long Ago?

While we have a tip calculator at Family Tree DNA for Y-line DNA to predict how long ago 2 Y-line matches shared a most recent common ancestor, we don’t have anything similar for mitochondrial DNA, partly because of the great variation in the mutation rates for the various regions of mitochondrial DNA.  Family Tree DNA does provide guidelines for the HVR1 region, but they are so broad as to be relatively useless genealogically.  For example, at the 50th percentile, you are likely to have a common ancestor with someone whom you match exactly on the HVR1 mutations within 52 generations, or about 1300 years ago, in the year 713.  Wait, I know just who that is in my family tree!
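The arithmetic behind that figure is easy to check. This is a minimal sketch with two assumptions that are not stated outright in the guidelines: a generation length of 25 years (which is what 52 generations and 1300 years imply) and a reference year of 2013 (which is what counting back to 713 implies).

```python
YEARS_PER_GENERATION = 25  # assumption: implied by 52 generations being about 1300 years
REFERENCE_YEAR = 2013      # assumption: the year that "713" is counted back from


def years_ago(generations: int, years_per_generation: int = YEARS_PER_GENERATION) -> int:
    """Convert a number of generations into an approximate number of years."""
    return generations * years_per_generation


def calendar_year(generations: int, reference_year: int = REFERENCE_YEAR) -> int:
    """Approximate calendar year that many generations before the reference year."""
    return reference_year - years_ago(generations)


print(years_ago(52), calendar_year(52))
```

Change the generation length to 30 years and the same 52 generations lands several centuries earlier, which shows how soft these estimates are.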

These estimates do not take into account the HVR2 or coding regions.

I did some research jointly with another researcher not long ago attempting to determine the mutation rate for those regions, and we found estimates that ranged from 500 years to several thousand years per mutation occurrence and it wasn’t always clear in the publications whether they were referring to the entire mitochondria or just certain portions.  And then there are those pesky hot-spots that for some reason mutate a whole lot faster than other locations.  We’re not even going there.  Suffice it to say there is a wide divergence in opinion among academics, so we probably won’t be seeing any type of mito-tip calculator anytime soon.

Enter SmartMatching

Family Tree DNA does their best to make our matches useful to us and to eliminate matches that we know aren’t genealogically relevant.

For example, this week, I was working on a client’s DNA Report.  Let’s call him Joe.  Joe is haplogroup J1c2.  I am haplogroup J1c2f.  J1c2f has one additional haplogroup defining mutation, in the coding region, that J1c2 does not have.

Joe and I did not show as matches at Family Tree DNA, even though our HVR1 and HVR2 regions are exact matches.  Now, for a minute, that gave me a bit of a start.  In fact, I didn’t even realize that we were exact matches until I was working with his results at MitoSearch and recognized my own User ID.

I had to think for a minute about why we would not be considered matches at Family Tree DNA, and I was just about ready to submit a bug report, when I realized the answer was my extended haplogroup.  This, by the way, is the picture-perfect example of why you need full sequence testing.

Family Tree DNA knows that we both tested at the full sequence level.  They know that with a different haplogroup, we don’t share a common ancestor in hundreds to thousands of years, so it doesn’t matter if we match exactly on the HVR1 and HVR2 levels, we DON’T match on a haplogroup defining mutation, which, in this case, happens to be in the coding region, found only with full sequence testing.  Even if we have only one mismatch at the full sequence level, if it’s a haplogroup defining marker, we are not considered matches.  Said a different way, if our only difference was location 9055 and 9055 was NOT a haplogroup defining mutation, we would have been considered a match on all three levels – exact matches at the HVR1 and HVR2 levels and a 1 mutation difference at the full sequence level.  So how a mutation is identified, whether it’s haplogroup defining or not, is critical.
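The rule described above can be written down as a small sketch. This is my illustration of the logic, not Family Tree DNA’s actual code: the function name `smart_match` is invented, each tester is reduced to a set of mutated locations, and every location other than 9055 is a made-up placeholder.

```python
from typing import Optional


def smart_match(mine: set[int], theirs: set[int],
                haplogroup_defining: set[int]) -> Optional[int]:
    """Return the number of differing locations, or None if the pair is excluded."""
    # Locations where exactly one of the two testers carries a mutation.
    differences = mine ^ theirs
    # Any difference at a haplogroup defining location rules the pair out.
    if differences & haplogroup_defining:
        return None
    return len(differences)


# Hypothetical results; only location 9055 comes from the article.
SHARED = {1111, 2222, 3333}
me = SHARED | {9055}
joe = set(SHARED)

print(smart_match(me, joe, haplogroup_defining={9055}))  # excluded
print(smart_match(me, joe, haplogroup_defining=set()))   # one difference
```

The same single difference gives two different outcomes depending only on whether the location is flagged as haplogroup defining.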

In our case, I carry a mutation at marker 9055 in the coding region that defines haplogroup J1c2f.  Joe doesn’t have this mutation, so he is not J1c2f, just J1c2.  So we don’t match.

So – How Long Ago for Me and Joe?

Dr. Behar in his “Copernican Reassessment of the Mitochondrial DNA Tree,” which has become the virtual Bible of mtDNA, estimates that the J1c2f haplogroup defining mutation at location 9055 occurred about 2000 years ago, plus or minus another 3000 years, which means my ancestor who had that mutation could have lived as long ago as 5000 years.

The mutations that define haplogroup J1c2 occurred about 9,800 years ago, plus or minus another 2,000.  So we know that Joe and I share a common ancestor about 7,800 – 11,800 years ago and our lines diverged sometime between then and 2,000 – 5,000 years ago.  So, in round numbers our common ancestor lived between 2,000 and 9,800 years ago.  Not much chance of identifying that person!
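For anyone who wants to check the ranges, here is a minimal sketch of the plus-or-minus arithmetic. The function name `age_range` is invented for this example, and clamping the young end at zero (the present) is my assumption. Note that subtracting 3000 from 2000 literally would put the J1c2f mutation in the future, which is why the text above uses the 2,000 year point estimate as the recent end instead.

```python
def age_range(estimate: int, margin: int) -> tuple[int, int]:
    """Return (youngest, oldest) age in years for an estimate plus or minus a margin."""
    youngest = max(0, estimate - margin)  # assumption: an age cannot be later than today
    oldest = estimate + margin
    return youngest, oldest


j1c2 = age_range(9800, 2000)   # haplogroup J1c2
j1c2f = age_range(2000, 3000)  # haplogroup J1c2f, defined at location 9055

print(j1c2)
print(j1c2f)
```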

The ability to eliminate “near-misses” where the HVR1+HVR2 matches but the people aren’t in the same haplogroup, which is extremely common in haplogroup H, is actually a very useful feature that Family Tree DNA nicknamed SmartMatching.  With over 1000 matches at the HVR1 level, more than 200 at the HVR1+HVR2 level and another 50+ at the full sequence level, Joe certainly didn’t need to have any “misleading” matches included that could have been eliminated by a logic process.

So while Joe and I match, technically, if you only look at the HVR1 and HVR2 levels, we don’t really match, and that’s not evident at MitoSearch or at Ancestry or anyplace else that does not take into consideration both full sequence AND haplogroup defining mutations.  Family Tree DNA is the only company that does this.  Ancestry does not test at the full sequence level, so you can’t even get a full haplogroup assignment there, which is another reason, aside from inaccurate matches, that Ancestry customers often retest at Family Tree DNA.

It’s interesting to think about the fact that 2 people can match exactly at the HVR1+HVR2 levels, but the distance of the relationship can be vastly different.  I also match my mother on the HVR1+HVR2 levels, exactly, and our common ancestor is her.  So the distance to a common ancestor with an exact HVR1+HVR2 match can be anyplace from one generation (Mom) to thousands of years (Joe), and there is no way to tell the difference without full sequence testing and in this case, SmartMatching.

And that, my friends, is the rest of the story!

38 thoughts on “Mitochondrial DNA Smartmatching – The Rest of the Story”

  1. Please use grammar check. It should be Joe and I not Me and Joe…please! You are not the Cookie Monster even if you do love cookies!

  2. Roberta

    Thanks again for another great bit of education on DNA genealogy. And equally important to some of us for providing links for further reading (Hurst & Behar). Nothing better than being back in a grad school environment.

    • One of the things that I really love about this field is that there is always something really interesting to learn, and you can stay on the surface and get benefit or dive as deep as you want.

  3. Very clear explanation of a very complex & difficult subject to comprehend.

    I have done the ‘full’ test and you created a very interesting report for me.

    One of my best matches is a match not for me in FF but my Mum? Also I note some unusual names in MtDNA results that seem to point to some FF matches. Are there ever clues in MtDNA matches that can help solve genealogical road blocks & provide clues for adoptees? Thank you

      • We will be waiting for this article. Many people say MtDNA is useless to adoptees but I’m sure there must be some use to it, at least it tells part of our history. I think MtDNA is so underestimated. Very interesting article, thanks.

  4. Roberta, thanks so much for the recent article on mtDNA. Just two days ago I received the results of my full mtDNA genome test. I had one match, which is the only one I’ve ever had since I first tested about five years ago. However, until I read your article, I had no idea what that really meant. He and I are 3 steps off, and I’m still not sure exactly what that means, but it seems that at least we know that within the last few hundred years (as opposed to thousands) we shared a common female ancestor. He had traced his line back to Scotland in the 1700s and I had traced mine to SW VA in the mid 1700s. (He lives in Australia.) Since my mom’s ancestry shows to be 100% Orcadian, that makes sense.

    All of your articles on DNA have really been of a big help to me. I find myself reading them again from time to time as I need to research a particular issue. Thanks for helping out all us “amateurs!”

  5. Finally the light has turned on in my head, thank you Roberta. From now on I am going to ignore those matches that have not completed full genome testing until such time as my education has improved, and clearly I must read the rest of your blog.
    Kind Regards

    • I look at them, especially the HVR2 ones and see if anything looks “good”, particularly location. Otherwise I focus on the full sequence ones. I do know of people who have found their relatives through just the HVR1-HVR2 though, so I don’t discount them entirely. I guess that’s the point, at that level, there is nothing to differentiate the very close relatives from those thousands of years removed.

  6. My husband did the full sequence mtDNA a few years ago and a heteroplasmy at 11809 (“Y” which is T or C) kept him from having any FMS matches. I wrote to seven people who matched him on HVR1 and HVR2 and had also done FMS. Four of them responded and three of them would have been exact FMS matches if not for that heteroplasmy – they all had T11809C.

    Then FTDNA did SmartMatching and all of those people disappeared. He’s now considered H1ao. So I wrote to FTDNA and asked “what happened?” Your blog explained it – but there’s more to the story. Apparently, T11809C is a haplogroup defining mutation and he “didn’t” have it since he had T11809Y – which is really T11809C sometimes and no 11809 mutation elsewhere. So NOW they apparently fixed it and all of these people are showing as FMS matches with one difference.

    • Heteroplasmies are a horse of an entirely different color. They should match either the CRS, the mutated state or the heteroplasmic indicator, such as Y, but they don’t. They match either the CRS, which is “no mutation,” or the heteroplasmic indicator, such as Y, but they don’t match the other state if it is present in a match, such as C or T for Y. And if the heteroplasmy is in the HVR1 or HVR2 region, it’s a mess. I have never written about heteroplasmies but I may someday. If you have a value in your mtDNA other than T, A, C or G, then you have a heteroplasmy, which is in essence a mutation in process, and you have some of two states of DNA.

      • Hi
        I just found your blog, and it answered a lot of questions I had. Thanks for sharing your knowledge.
        I just did a mtDNA full sequence test for my father from Family Tree DNA, and under the HVR2 results, he was given a result of A240R. From reading the FAQ, I understand that he has a heteroplasmy in his mtDNA. However, I can’t find any detailed information on heteroplasmies. Until you get around to writing about it (and I’m waiting with bated breath :-) ) can you recommend any other sites or books to learn more about them, especially as to how they can affect his genealogy results?
        Thank you for any assistance.
        Ryan Unruh

      • I knew this would happen one day. :)

        Other than the FAQ at FTDNA, there is very little written. The only thing I know of is an article by Dr. Ann Turner in one of the earlier issues of JOGG. I think it’s called something like “Now You See It, Now You Don’t.” Here’s the link to the site. http://www.jogg.info/

        I will blog about it, but it won’t be until later this fall or winter. My plate is really full right now.

      • Thank you for your prompt response. Whenever I tried to look up Heteroplasmies, I could only find information on the biology of them and mitochondria, nothing on genealogy and how they affect the results given, so this article will be informative for me.
        Thanks again
        Ryan Unruh

  7. Hi Roberta – you mention the tip calculator for Y-DNA and the fact that FTDNA does not have something similar for mtDNA; is there any way to predict the MRCA for steps from the full sequence test? In my case (H1ad, and being an adoptee with a blank tree) I have one match at 2 steps and 2 matches at 3 steps. How close do you think they may be?

    • That’s the question I was trying to answer for a client. The answer is anyplace from about 1500 years ago to the Neolithic age. A huge range and not helpful. The research on mutation rates has simply not been done at the level required to do something like a TIP calculator.

    • While we’re waiting for Roberta’s answer – I can give you two observations from the kits I manage:

      I’ve traced my husband’s maternal line back to colonial Virginia ca. 1660. mtDNA haplogroup is H1ao. The “other side of the pond” is almost definitely British Isles (surname GREEN), but where? He has no exact matches but one of his one-step matches has a maternal line that never left Wales. So the connection is at least that far back. His one-step difference is due to a heteroplasmy at 11809 – it’s actually a half-step difference.

      My sister’s maternal grandmother was an orphan. I have her father’s name but no surname for her mother. My sister’s mtDNA haplogroup is K1a1b1a (probably the most common Ashkenazi haplogroup). She has 76 one step matches but no exact matches. The key difference appears to be a back mutation at 16223. I would guess that it’s quite recent.

      I am mtDNA HV1b2 (another Ashkenazi haplogroup) and I have 36 exact matches. I have traced my maternal line back to my 2nd great grandmother and many of my matches have done the same. We are all from the same area – Poland, Lithuania and Belarus – but that’s as close as we can get.

      So, we have my husband with an MRCA at least back in 1630 and my sister with an MRCA that might be her great-grandmother. Both situations are one-step differences.

      Hope that helps!

      • Thanks Roberta and Gaye. I guess I’ll have to wait for an exact match! At least Family Finder, as an autosomal test, seems to offer closer relationships at shorter distances.

  8. Roberta, can you clarify please? You said, “Generally a base haplogroup can be reasonably assigned by HVR1 region testing, but not always. Sometimes they change with full sequence testing – so what you think you know may not be the end result.” Does this mean one’s HVR1 haplogroup may not actually be correct and may, when HVR2 and/or Mega are completed, change to another haplogroup? What about this scenario? I have two women who are supposed to have the same mother. One tested haplogroup U, another haplogroup K, but I have done the full sequence only for the U test; the K test is only HVR1&2. Is there a chance the K is incorrect, or do I have it right when I assume now that these two likely are actually one each from the two wives of my ancestor rather than both from one wife as previously thought? Thx!
    Lisa

    • Part of the accuracy of the assignment depends on the company and the technology you’re dealing with. As part of the HVR1, FTDNA tests 22 SNP locations so they can estimate the haplogroups accurately, or mostly accurately. Still, I’ve seen some of them change with the FMS test and also with the changes in definitions. Part of the equation is whether there are haplogroup defining mutations in the HVR1 region or not. If you tested at Ancestry, all bets are off. I’ve seen people with exactly the same mutations, and not hand entered info, be assigned to different base haplogroups. The bottom line Lisa is that you need the full sequence to be sure.

      • Lesson learned…the test I’m questioning that is giving us ‘third wife’ results Is indeed at ancestry and comes back as haplogroup H whereas the others I have tested at ftdna are U and K. Now that I see this reply, I’m going to go look at that in comparison to the two we have done at ftdna (one FMS and one HVR1&2) from the two known wives to see if this ancestry.com dna match happens to match one of those two. THX!! I’m baaaaccckkkk I LOVE YOU ROBERTA!! YOU ARE RIGHT!! the ancestry reported haplogroup H is indeed an exact match to the HVR1 & 2 of our U reported line at FTDNA!! GEEZ–Can Ancestry GET ANYTHING RIGHT?!?!? LOL THANK YOU THANK YOU THANK YOU!!!

  9. Add on..this from your linked article to my earlier comment…
    “First, we are talking about we usually call mtDNA subclades, not haplogroups. The basic haplogroups have been set in stone for years now. Of course, it can be confusing. If U is a macrohaplogroup or superhaplogroup, then U8 is considered a haplogroup and U8b a subclade. However, it is now known that K is part of U8b; so you have haplogroup below a subclade on the mtDNA tree.”

    So, can my K haplogroup female have a possibility of being U5b2a1a1a on full sequence testing as is her sister who we did the full sequence test for?

    • The only way to know for sure is to test one of the sisters at the full sequence level. It’s unlikely, but you just don’t know for sure without the full sequence test. You’re dealing with educated estimates – and some of them very good. The only other thing you can do is to ask Bill Hurst, who is the Hap K/U guru.

  10. Pingback: Combining Tools – Autosomal Plus Y-DNA, mtDNA and the X Chromosome | DNAeXplained – Genetic Genealogy

  11. Just finding this blog entry, and I have to say, the exceptions do make it interesting!

    For example, my maternal uncle and I both had the FMS test and FF, and the latter confirms that we are indeed uncle/nephew (which was not in doubt, just noted for clarity). On the FMS test, neither one of us shows up as a match with the other at all. In fact, I can force it using the advanced matching, by including both FF and FMS, so that he shows up in my matches, and under FMS it says simply “x” (=”not a match”).

    In fact, I have compared our FMS data myself and there is just one mutation difference between us. As in your case above, that one happens to be one of the three defining my subgroup, J1c8… So my uncle is called just J1c with two “extra mutations,” and the “smartmatching” refuses to consider the possibility that we could share a common ancestor anywhere in genealogical time, since we are technically in different groups! I pointed out to FTDNA that it is probably more parsimonious to consider him J1c8 with a back-mutation, and after getting bounced up their support tree a bit, they responded that they’re not making any more manual changes until they have upgraded their older mtDNA tree to the newer one…

    The way I figure it, one of us probably has spurious “matches” under mtDNA due to being forced into the ‘wrong’ haplogroup on that technicality, and that is a bummer… All I need now is another known relative on the same maternal line to test FMS, and see what the tiebreaker says ;-)

  12. Pingback: Edith Barbara Lore Ferverda (1888-1960) and the Road to Hell – 52 Ancestors #10 | DNAeXplained – Genetic Genealogy

  13. I don’t know how I missed this earlier — but thank you, Roberta!! What sent me on maybe a misguided year-long search regarding another HVR2 match was the fact that back around 7 years ago, I matched at HVR2 at FTDNA another woman who turned out to also have my maternal line third great grandmother. We had never met before, and we could compare ancestral notes — we were from the same maternal line family. So I applied this limited knowledge to another person in the US who also matched both of us on HVR2. There were only 7 of us in total. I spent over a year trying to connect us — no luck. Another thing that was interesting — the 3 of us in the United States matched 4 others currently living in Finland. We had no knowledge of family connections in Finland. I am U2e at FTDNA and U2e1a1 at 23andMe.

    This also explains why when my husband took the HVR2 mtdna test at FTDNA about 3 years ago, that he only had two matches!

    Now, I am wondering what is the best route (currently) to do mtdna comparisons!! I do not have any sisters or living maternal relatives.

    • 23andMe only uses probes on certain haplogroup defining locations and does not test the entire mitochondria. So you really can’t compare 23andMe’s haplogroups to Family Tree DNA’s, where they test every location in the mitochondria if you purchase the full sequence. Plus, I don’t believe the two companies are using the same “version” of mtDNA assignment charts. Perhaps the person at 23andMe would test at Family Tree DNA so you can compare apples and apples.

      • I actually have not compared my mtdna ratings at FTDNA and 23andMe because 23andMe does not directly connect you to people you might match on mtdna. If you are comparing genomes (and I know you already know all this), they just put you in a group matching situation.

        Sorry I didn’t make it clearer — the other person who matched me was on FTDNA HVR2 and I was trying to help him find a connecting lineage. And this was before 23andMe began offering testing, so was not connected to any results there.
