Ancient DNA Matches – What Do They Mean?

The good news is that my three articles about the Anzick and other ancient DNA of the past few days have generated a lot of interest.

The bad news is that they have generated hundreds of e-mails every day – and I can’t possibly answer them all personally.  So, if you’ve written me and I don’t reply, I apologize and I hope you’ll understand.  Many of the questions I’ve received are similar in nature and I’m going to answer them in this article.  In essence, people who have matches want to know what they mean.

Q – I had a match at GedMatch to <fill in the blank ancient DNA sample name> and I want to know if this is valid.

A – Generally, when someone asks if an autosomal match is “valid,” what they really mean is whether or not this is a genealogically relevant match or if it’s what is typically referred to as IBS, or identical by state.  Genealogically relevant samples are referred to as IBD, or identical by descent.  I wrote about that in this article with a full explanation and examples, but let me do a brief recap here.

In genealogy terms, IBD is typically used to mean matches over a particular threshold that can be or are GENEALOGICALLY RELEVANT.  Those last two words are the clue here.  In other words, we can match them with an ancestor with some genealogy work and triangulation.  If the segment is large, and by that I mean significantly over the threshold of 700 SNPs and 7cM, even if we can’t identify the common ancestor with another person, the segment is presumed to be IBD simply because of the math involved with the breakdown of segments into pieces.  In other words, a large segment match generally means a relatively recent ancestor and a smaller segment means a more distant ancestor.  You can readily see this breakdown on this ISOGG page detailing autosomal DNA transmission and breakdown.

Unfortunately, often smaller segments, or ones determined to be IBS are considered to be useless, but they aren’t, as I’ve demonstrated several times when utilizing them for matching to distant ancestors.  That aside, there are two kinds of IBS segments.

One kind of IBS segment is where you do indeed share a common ancestor, but the segment is small and you can’t necessarily connect it to the ancestor.  These are known as population matches and are interpreted to mean your common ancestor comes from a common population with the other person, back in time, but you can’t find the common ancestor.  By population, we could mean something like Amish, Jewish or Native American, or a country like Germany or the Netherlands.

In the cases where I’ve utilized segments significantly under 7cM to triangulate ancestors, those segments would have been considered IBS until I mapped them to an ancestor, and then they suddenly fell into the IBD category.

As you can see, the definitions are a bit fluid and are really defined by the genealogy involved.

The second kind of IBS is where you really DON’T share an ancestor, but your DNA and your match’s DNA have managed to mutate to a common state by convergence, or where your Mom’s and Dad’s DNA combine to form a pseudo match, where you match someone on a segment run long enough to be considered a match at a low level.  I discussed how this works, with examples, in this article.  Look at example four, “a false match.”

So, in a nutshell, if you know who your common ancestor is on a segment match with someone, you are IBD, identical by descent.  If you don’t know who your common ancestor is, and the segment is below the normal threshold, then you are generally considered to be IBS – although that may or may not always be true.  There is no way to know if you are truly IBS by population or IBS by convergence, with the possible exception of phased data.
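The rule of thumb in this recap can be written down as a small decision function. This is only a sketch in Python: the function name `classify_match` and its labels are made up for illustration, and the 7cM and 700 SNP defaults are simply the thresholds mentioned above, not an official rule from any vendor. The text says a segment should be significantly over the threshold to be presumed IBD; the sketch simplifies that to "at or over."

```python
def classify_match(cm, snps, known_common_ancestor, min_cm=7.0, min_snps=700):
    """Label one segment match using the rule of thumb from the text.

    Hypothetical helper for illustration only.
    """
    if known_common_ancestor:
        # A segment mapped to a specific ancestor is IBD, whatever its size.
        return "IBD"
    if cm >= min_cm and snps >= min_snps:
        # Large segment: presumed IBD even though the ancestor is unknown.
        return "presumed IBD"
    # Small segment, no known ancestor: IBS, either by population or by
    # convergence. The segment alone cannot tell us which.
    return "IBS (population or convergence)"
```

For example, `classify_match(2.6, 132, False)` gives "IBS (population or convergence)", while the same small segment becomes "IBD" once it has been mapped to an ancestor.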

Data phasing is when you can compare your autosomal DNA with one or both parents to determine which half you obtained from whom.  If you are a match by convergence where your DNA run matches that of someone else because the combination of your parents’ DNA happens to match their segment, phasing will show that clearly.  Here’s an example for only one location utilizing only my mother’s data phased with mine.  My father is deceased and we have to infer his results based on my mother’s and my own.  In other words, mine minus the part I inherited from my mother = my father’s DNA.

My Result   My Result   Mother’s Result   Mother’s Result   Father’s Inferred Result   Father’s Inferred Result
T           A           T                 G                 A                          ?

In this example of just one location, you can see that I carry a T and an A in that location.  My mother carries a T and a G, so I obviously inherited the T from her because I don’t have a G.  Therefore, my father had to have carried at least an A, but we can’t discern his second value.
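The reasoning for this one location can be sketched in a few lines of Python. The function name `phase_with_mother` is hypothetical, and this is a simplified illustration of the idea, not how GedMatch or any vendor actually phases data.

```python
def phase_with_mother(child, mother):
    """Split a child's two alleles at one location into (from_mother, from_father).

    Returns (None, None) when the location cannot be phased with mother alone.
    """
    first, second = child
    first_in_mom = first in mother
    second_in_mom = second in mother
    if first == second:
        # Child has two copies of the same value: one came from each parent,
        # provided mother actually carries that value.
        return (first, first) if first_in_mom else (None, None)
    if first_in_mom and not second_in_mom:
        return (first, second)
    if second_in_mom and not first_in_mom:
        return (second, first)
    # Either mother carries both values (ambiguous) or neither (read error).
    return (None, None)
```

With my values, `phase_with_mother(("T", "A"), ("T", "G"))` returns `("T", "A")`: the T came from my mother, so the A had to come from my father. His second value stays unknown.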

This example utilized only one location.  Your autosomal data file will hold between 500,000 and 700,000 locations, depending on the vendor you tested with and the version level.

You can phase your DNA with that of your parent(s) at GedMatch.  However, if both of your parents are living, an easier test would be to see if either of your parents match the individual in question.  If neither of your parents match them, then your match is a result of convergence or a data read error.

So, this long conversation about IBD and IBS is to reach this conclusion.

All of the ancient specimens are just that, ancient, so by definition, you cannot find a genealogy match to them, so they are not IBD.  Best case, they are IBS by population.  Worst case, IBS by convergence.  You may or may not be able to tell the difference.  The reason, in my example earlier this week, that I utilized my mother’s DNA and only looked at locations where we both matched the ancient specimens was because I knew those matches were not by convergence – they were in fact IBS by population because my mother and I both matched Anzick.

ancient compare5

Q – What does this ancient match mean to me?

A – Doggone if I know.  No, I’m serious.  Let’s look at a couple possibilities, but they all have to do with the research you have, or have not, done.

If you’ve done what I’ve done, and you’ve mapped your DNA segments to specific ancestors, then you can compare your ancient matching segments to your ancestral spreadsheet map, especially if you can tell unquestionably which side the ancestral DNA matches.  In my case, shown above, the Clovis Anzick matched my mother and me on the same segment and we both matched Cousin Herbie.  We know unquestionably who our common ancestor is with cousin Herbie – so we know, in our family line, which line this segment of DNA shared with Anzick descends through.

ancient compare6

If you’re not doing ancestor mapping, then I guess the Anzick match would come in the category of, “well, isn’t that interesting.”  For some, this is a spiritual connection to the past, a genetic epiphany.  For others, it’s “so what.”

Maybe this is a good reason to start ancestor mapping!  This article tells you how to get started.

Q – Does my match to Anzick mean he is my ancestor?

A – No, it means that you and Anzick share common ancestry someplace back in time, perhaps tens of thousands of years ago.

Q – I match the Anzick sample.  Does this prove that I have Native American heritage? 

A – No, and it depends.  Don’t you just hate answers like this?

No, this match alone does not prove Native American heritage, especially not at IBS levels.  In fact, many people who don’t have Native heritage match small segments.  How can this be?  Well, refer to the IBS by convergence discussion above.  In addition, the Anzick child’s ancestors came from an Asian population and migrated, crossing from Asia via Beringia.  That Eurasian population also settled part of Europe – so you could be matching on very small segments from a common population in Eurasia long ago.  In a paper just last year, this was discussed when Siberian ancient DNA was shown to be related to both Native Americans and Europeans.

In some cases, a match to Anzick on a segment already attributed to a Native line can confirm or help to confirm that attribution.  In my case, I found the Anzick match on segments in the Lore family, who descend from the Acadians who were admixed with the Micmac.  I have several Anzick match segments that fit those criteria.

A match to Anzick alone doesn’t prove anything, except that you match Anzick, which in and of itself is pretty cool.

Q – I’m European with no ancestors from America, and I match Anzick too.  How can that be?

A – That’s really quite amazing, isn’t it?  Just this week in Nature, a new article was published discussing the three “tribes” that settled or founded the European populations.  This, combined with the Siberian ancient DNA results that connect the dots to an ancient population that contributed to both Europeans and Native Americans, explains a lot.

3 European Tribes

If you think about it, this isn’t a lot different than the discovery that all Europeans carry some small amount of Neanderthal and Denisovan DNA.

Well, guess what….so does Anzick.

Here are his matches to the Altai Neanderthal.

Chr   Start Location   End Location   Centimorgans (cM)   SNPs
2     241484216        242399416      1.1                 138
3     19333171         21041833       2.6                 132
6     31655771         32889754       1.1                 133

He does not match the Caucasus Neanderthal.  He does, however, match the Denisovan individual on one location.

Chr   Start Location   End Location   Centimorgans (cM)   SNPs
3     19333171         20792925       2.1                 107

Q – Maybe the scientists are just wrong and the burial is not 12,500 years old,  maybe just 100 years old and that’s why the results are matching contemporary people.

A – I’m not an archaeologist, nor do I play one…but I have been closely involved with numerous archaeological excavations over the past decade with The Lost Colony Research Group, several of which recovered human remains.  The photo below is me with Anne Poole, my co-director, sifting at one of the digs.

anne and me on dig

There are very specific protocols that are followed during and following excavation and an error of this magnitude would be almost impossible to fathom.  It would require kindergarten-level incompetence on the part of not one, but all professionals involved.

In the Montana Anzick case, in the paper itself, the findings and protocols are both discussed.  First, the burial was discovered directly beneath the Clovis layer where more than 100 tools were found, and the Clovis layer was undisturbed, meaning that this is not a contemporary burial that was buried through the Clovis layer.  Second, the DNA fragmentation that occurs as DNA degrades correlated closely to what would be expected in that type of environment at the expected age based on the Clovis layer.  Third, the bones themselves were directly dated using XAD-collagen to 12,707-12,556 calendar years ago.  Lastly, if the remains were younger, the skeletal remains would match most closely with Native Americans of that region, and that isn’t the case.  This graphic from the paper shows that the closest matches are to South Americans, not North Americans.

anzick matches

This match pattern is also confirmed independently by the recent closest GedMatch matches to South Americans.

Q – How can this match from so long ago possibly be real?

A – That’s a great question and one that was terribly perplexing to Dr. Svante Paabo, the man who is responsible for producing the full genome sequence of the first, and now several more, Neanderthals.  The expectation was, understanding autosomal DNA gets watered down by 50% in every generation through recombination, that ancient genomes would be long gone and not present in modern populations.  Imagine Svante’s surprise when he discovered that not only is that not true, but those ancient DNA segments are present in all Europeans and many Asians as well.  He too agonized over the question about how this is possible, which he discussed in this great video.  In fact he repeated these tests over and over in different ways because he was convinced that modern individuals could not carry Neanderthal DNA – but all those repeated tests did was to prove him wrong.  (Paabo’s book, Neanderthal Man: In Search of Lost Genomes, is an incredible read that I would highly recommend.)

What this means is that the population at one time, and probably at several different times, had to be very small.  In fact, it’s very likely that many times different pockets of the human race were in great jeopardy of dying out.  We know about the ones that survived.  Probably many did perish leaving no descendants today.  For example, no Neanderthal mitochondrial DNA has been found in any living or recent human.

Consider a small population, let’s say 5 males and 5 females who somehow got separated from their family group and founded a new group, by necessity.  In fact, this could well be a description of how the Native Americans crossed Beringia.  Those 5 males and 5 females are the founding population of the new group.  If they survive, all of the male descendants will carry the founding men’s Y haplogroups – let’s say they are Q and C, and all of the descendants will carry the mitochondrial haplogroups of the females – let’s say A, B, C, D and X.

There is a very limited amount of autosomal DNA to pass around.  If all of those 10 people are entirely unrelated, which is virtually impossible, there will be only 10 possible combinations of DNA to be selected from.  Within a few generations, everyone will carry part of those 10 ancestors’ DNA.  We all have 8 ancestors at the great-grandparent level.  By the time those original settlers’ descendants had great-great-grandparents – of which each one had 16 – at least 6 of those 16 positions in their tree would have to be filled by founders who already appear elsewhere in it.
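The arithmetic behind that last statement is just counting. Here is a minimal Python sketch of it; the function names are my own, and it assumes, as the paragraph does, that every position in that generation of the tree has to be filled by one of the founders.

```python
def ancestor_positions(generations_back):
    """Number of positions in a pedigree chart a given number of generations back."""
    return 2 ** generations_back


def minimum_repeated_positions(founders, generations_back):
    """Fewest positions that must be filled by a founder who already
    appears elsewhere in that same generation of the tree."""
    return max(0, ancestor_positions(generations_back) - founders)
```

Great-grandparents are 3 generations back, so `ancestor_positions(3)` is 8, which 10 founders can fill without repeats.  Great-great-grandparents are 4 generations back, so `ancestor_positions(4)` is 16 and `minimum_repeated_positions(10, 4)` is 6.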

There was only so much DNA to be passed around.  In time, some of the segments would no longer be able to be recombined because when you look at phasing, the parents’ DNA was exactly the same, example below.  This is what happens in endogamous populations.

My Result My Result Mother’s Result Mother’s Result Father’s Result Father’s  Result

Let’s say this group’s descendants lived without contact with other groups, for maybe 15,000 years in their new country.  That same DNA is still being passed around and around because there was no source for new DNA.  Mutations did occur from time to time, and those were also passed on, of course, but that was the only source of changed DNA – until they had contact with a new population.

When they had contact with a new population and admixture occurred, the normal 50% recombination/washout in every generation began – but for the previous 15,000 years, there had been no 50% shift because the DNA of the population was, in essence, all the same.  A study about the Ashkenazi Jews that suggests they had only a founding population of about 350 people 700 years ago was released this week – explaining why Ashkenazi Jewish descendants have thousands of autosomal matches and match almost everyone else who is Ashkenazi.  I hope that eventually scientists will do this same kind of study with Anzick and Native Americans.

If the “new population” we’ve been discussing was Native Americans, their males 15,000 years later would still carry haplogroups Q and C and the mitochondrial DNA would still be A, B, C, D and X.  Those haplogroups, and subgroups formed from mutations that occurred in their descendants, would come to define their population group.

In some cases, today, Anzick matches people who have virtually no non-Native admixture at the same level as if they were just a few generations removed, shown on the chart below.

anzick gedmatch one to all

Since, in essence, these people still haven’t admixed with a new population group, those same ancient DNA segments are being passed around intact, which tells us how incredibly inbred this original small population must have been.  This is known as a genetic bottleneck.

The admixture report below is for the first individual on the Anzick one to all Gedmatch compare at 700 SNPs and 7cM, above.  In essence, this currently living non-admixed individual still hasn’t met that new population group.


If this “new population” group was Neanderthal, perhaps they lived in small groups for tens of thousands of years, until they met people exiting Africa, or Denisovans, and admixed with them.

There weren’t a lot of people anyplace on the globe, so by virtue of necessity, everyone lived in small population groups.  Looking at the odds of survival, it’s amazing that any of us are here today.

But, we are, and we carry the remains, the remnants of those precious ancestors, the Denisovans, the Neanderthals and Anzick.  Through their DNA, and ours, we reach back tens of thousands of years on the human migration path.  Their journey is also our journey.  It’s absolutely amazing and it’s no wonder people have so many questions and such a sense of enchantment.  But it’s true – and only you can determine exactly what this means to you.

One Chromosome, Two Sides, No Zipper – ICW and the Matrix

The questions I’ve received most often since the release of the new Family Finder Matrix from Family Tree DNA have to do with matches.  Specifically, what the “In Common With” feature is telling you versus what the Family Finder “Matrix” is telling you and how to utilize all of this information together.  At the bottom of this confusion is often a fundamental lack of understanding of how matching occurs and what it means in different contexts.

Let’s talk about this, step by step.

The “in common with” function (called triangulation for a few weeks, but now labeled “run common matches”) shows you every person that both you and one of your matches have as a match.  I’ll be running this option for my matches with cousin David, shown below.

zipper 1

Here’s an example of my matches in common with my cousin, David.

Zipper 2

The Family Finder Matrix takes this information a bit further and shows you whether or not the people involved with this match, match each other as well.

In this case, I happen to know that my cousins Harold, Carl and Dean will match each other on my father’s side, as will my cousin David.  Warren doesn’t have firm genealogy, but from this, we can tell that he is indeed connected to this family group because he matches me, David, Harold and Carl, but not Dean and not Nova.  We have no idea how Nova connects to this line, if she does.  Notice that Nova does not match any of the other people in this group in the matrix below.  That means that my and David’s common ancestor with her is likely not from this same ancestral line shared by Harold, Carl and Dean.

zipper 3

From this point forward, I would drop back to my trusty downloaded full match spreadsheet that I maintain to see if indeed any of these people match me and my known cousins on the same segments.  If so, that confirms a family/ancestor relationship.   On the snippet from my spreadsheet below, you can see that Warren indeed matches Buster, David and me, but not on the same segments.  Nova didn’t match any grouping on the same segments.  However, Buster and David both match me on the same portion of chromosome 19, so this confirms that we do share a common ancestor.  In this case, we also know, from our genealogy that the common ancestor is Lazarus Estes and wife, Elizabeth Vannoy.  Based on our multiple cousin matches, we can say that Warren is somehow connected to this line, but we can’t say how.

Zipper 4

I’ve had comments like “I have everything I need on my spreadsheet – I can see where all of my matches match me.”  And indeed, you can, but it’s not everything you need.  Here’s why.

Without additional information, you can’t tell, by just looking at your spreadsheet whether two people who match you on the same segment are matching on your Mom or Dad’s side.  For example, above, I know that both David and Buster are from my Dad’s line, but if I didn’t know that, one of them could be from Mom’s line and one could be from Dad’s, and while they are both related to me, on the same chromosome, they would, in that case, not be related to each other.  So, my spreadsheet of matches tells me clearly THAT people match me, and where, but it doesn’t tell me HOW or on which side.  For that, I need additional tools like ICW, the Matrix and plain old genealogy research.
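Finding people who match me on the same segment is the part a spreadsheet, or a few lines of code, can do.  Here is a rough Python sketch of that check.  The start and end positions below are invented purely for illustration (they are not my real data), and the field names are my own.  Notice what the code cannot do: it reports that two people overlap, but not whether they are on Mom’s side or Dad’s.

```python
from itertools import combinations


def overlap(seg_a, seg_b):
    """Return the (start, end) two segments share, or None if they don't overlap."""
    if seg_a["chr"] != seg_b["chr"]:
        return None
    start = max(seg_a["start"], seg_b["start"])
    end = min(seg_a["end"], seg_b["end"])
    return (start, end) if start < end else None


def overlapping_pairs(rows):
    """List every pair of different people whose segments overlap."""
    found = []
    for a, b in combinations(rows, 2):
        shared = overlap(a, b)
        if shared and a["name"] != b["name"]:
            found.append((a["name"], b["name"], a["chr"], shared))
    return found


# Hypothetical rows: each is one segment where that person matches me.
rows = [
    {"name": "David", "chr": 19, "start": 1_000_000, "end": 9_000_000},
    {"name": "Buster", "chr": 19, "start": 4_000_000, "end": 12_000_000},
    {"name": "Warren", "chr": 3, "start": 2_000_000, "end": 6_000_000},
]
```

With these made-up rows, `overlapping_pairs(rows)` returns one entry, David and Buster on chromosome 19, and nothing for Warren.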

This is the fundamental concept of matching and in a nutshell, why it’s so difficult.

Every Chromosome Has Two Sides

There are two sides to every chromosome, Mom’s side and Dad’s side.  Except nature has played a cruel trick on us and not installed a zipper.  There are no Mom and Dad labels.  There is no dividing that DNA or those matches in half magically, except by determining who they match, and how they do or don’t match each other.

When we match ourselves against our parents, for example, we then know immediately which half of our DNA came from which parent, but if you don’t have any parents available to match against, then you have to use genealogy or cousin matches to figure that out.

I talk about that in the Chromosome Mapping aka Ancestor Mapping article.

I’m going to use spreadsheets as examples here.  I think they are easier to see and understand, plus, I can manipulate them easily to reflect different situations.

Example One – The Very Basics of Matching

At each DNA location, or address, you have two alleles, one from each parent.  These alleles can have one of 4 values, or nucleotides, at each location, represented by the abbreviations T, A, C and G, short for Thymine, Adenine, Cytosine and Guanine.  That’s it, you’re done with all the science words now, so keep reading:)

On any given chromosome, from locations 1-20, you have the following DNA, in our example.

From Mom, you received all As and from Dad, all Cs.  You know that because I’m telling you, but remember, the matching software doesn’t know that because there is no zipper in your DNA.  All the software sees is that you have both an A and a C in location 1 and either an A or C is considered a match.

Zipper 5

In fact, this is what the software sees.  Be aware that in this case, AC=CA.

Zipper 6
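One way to picture “no zipper” is to store the two values at a location as an unordered pair.  The Python sketch below does that with a `frozenset`; the names `genotype` and `half_match` are my own, and this is a simplified model of the idea rather than any vendor’s actual matching routine.

```python
def genotype(allele_1, allele_2):
    """An unphased result at one location: the order of the two values means nothing."""
    return frozenset((allele_1, allele_2))


def half_match(genotype_a, genotype_b):
    """True when two people share at least one value at this location."""
    return bool(genotype_a & genotype_b)
```

Here `genotype("A", "C") == genotype("C", "A")` is true, which is the AC=CA point above, and someone who is A/A or C/G at that location still counts as a match to your A/C.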

Easy so far, right?

Example Two – Mom’s Known Cousin and Dad’s Known Cousin

Now you have two cousins, Mary and Myrtle.  You know, from having known them all of your life and sharing lots of Thanksgiving turkey that they are your family and you know clearly which side of your family they descend from.  Both of your cousins, Mary and Myrtle match you at the same locations on this chromosome, from 5-15.

But Mary is your mother’s cousin, and Myrtle is your Dad’s cousin.  So even though they both match you on the same exact chromosome and the same location, they do not match each other.  Well, let’s put it this way, if they also match each other, then you have an entirely different family genetic genealogy problem, called endogamy, and yes, you might be your own grandpa…but I digress.  But we’re going to assume for this discussion that your mother and father are not related to each other and do not share common ancestors.

Zipper 7

Still easy, right?

Example Three – An Unknown Cousin

Next, we have Martha.  You don’t know Martha, and you don’t know how she is related, but she obviously is.  Martha matches you, but she does not match Myrtle at all, and she doesn’t match Mary on enough overlapping locations to be considered a match to her.  You can see their common match here between Mary and Martha in location 5.  In this case, as it turns out, Martha IS a cousin to Mary on Mom’s side, but we can’t tell that from this information because they don’t match in enough common locations to be above the matching threshold.  With this information, you can’t draw any conclusions.  You will have to wait to see who else Martha matches and look on your spreadsheet to see if Martha matches any of your known cousins and you on common segments which would confirm a common ancestor.  Your download spreadsheet will contain much more detailed information because once you match on any segment above the match threshold of about 7.7cM (plus a few other factors), all matching segments of 1cM or above are downloaded – so you have a lot of information to work with.

But using both the ICW and matrix tools, Martha might cluster with other cousins on Mom’s side which would provide us with clues as to her relationship.  In fact, the first thing I’d do is to run an ICW with Mary and then utilize the Matrix tool to further define those relationships.

Zipper 8

Still not difficult.

Example Four – A “False Match”

Next we have Jeremy who is also a match to you.

Zipper 9

If you look at how Jeremy matches, you can see that he is actually matching on both sides, Mom’s and Dad’s side, but randomly.  Technically, he is a match to you, because he does match one or the other of your nucleotides at each location, A or C, but without a zipper, we have no idea HOW that DNA is divided in you between Mom and Dad.  In other words, the software doesn’t know that Mom was all A and Dad was all C, unless we’ve phased the data against your parents AND the software knows how to utilize that information.

However, if your parents are among your matches, you can immediately see which side the match falls on, if either.  In this case, Jeremy doesn’t fall on either side because he is simply a circumstantial match, also known as a match by convergence or a false match.  This is also called IBS, or identical by state, as opposed to IBD, identical by descent.  The smaller the segment you show as a match, especially if there is no clustering, the more likely the match is to be IBS instead of the genealogically desirable IBD.

When people ask how someone can match a child but not a parent, this is the answer.  He matches you on 11 locations, circumstantially, but he only matches your parents on 5 and 6 locations, respectively, which often (but not always) puts him under the matching threshold.  Jeremy may also match Mary, depending on the thresholds.
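Jeremy’s situation is easy to reproduce in a toy model.  The Python sketch below is built on assumptions of my own: each parent is shown as carrying only the value they passed on (Mom all A, Dad all C), Jeremy’s values are invented so that he alternates between the two, and a “match” at a location just means sharing at least one value.  Real matching routines also apply cM and SNP thresholds that are not modeled here.

```python
def shares_allele(geno_a, geno_b):
    """True when two results at one location have at least one value in common."""
    return bool(set(geno_a) & set(geno_b))


def matching_locations(person_a, person_b):
    """How many locations the two people match on in total."""
    return sum(shares_allele(a, b) for a, b in zip(person_a, person_b))


def longest_run(person_a, person_b):
    """Longest stretch of consecutive matching locations."""
    best = current = 0
    for a, b in zip(person_a, person_b):
        current = current + 1 if shares_allele(a, b) else 0
        best = max(best, current)
    return best


LOCATIONS = 11  # locations 5 through 15 in the example
you = ["AC"] * LOCATIONS
mom = ["AA"] * LOCATIONS  # assumption: only the value Mom passed on is shown
dad = ["CC"] * LOCATIONS  # assumption: only the value Dad passed on is shown
# Hypothetical Jeremy: alternates between Dad's value and Mom's value.
jeremy = ["AA" if i % 2 else "CC" for i in range(LOCATIONS)]
```

In this toy data Jeremy matches you on all 11 locations in one unbroken run, but he matches Mom on only 5 scattered locations and Dad on only 6, and never on two in a row.  That is a match to the child that neither parent shares.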

This is also how someone can match in the “in common with” tool, but not be a match to anyone on the match list in the Matrix.  In fact, this is the power of these multiple tools.

This also doesn’t mean this match is entirely useless, because you DO match.  It may simply not be relevant genealogically.  In “The Autosomal Me” series, I’ve utilized very small match segments that in fact very probably ARE reflective of a common population and not of recent ancestry.  In my Native American research, this is exactly what I was looking for.  You may not be able to utilize this information today, but don’t entirely discount it either.  Just set it aside and move on to a more productive match.

Example Five – Common Matches, Different Ancestors

This situation provides clues, but no proof.

Mary and Joyce both match me on Mom’s segments, but they do not match each other.  They don’t match me on the same segments, so this indicates that they are probably from different ancestors in my Mother’s lines.  As more matches appear, the clusters of people and their genealogy will make this more apparent.

In order to determine which ancestors, I’ll need to work on the genealogy of both Mary and Joyce and see who else they also match on the same segments.  Sometimes the secret of the genealogy match is in the genealogy research or descent of your matches.

Zipper 10

Example Six – Clusters of Cousins

In this example, no one matches Dad, so he’s just out for now.  Susie and Mary match mom on the same segment, which proves that the three of these people share a common ancestor.  Mom and Joyce match each other too, but Joyce doesn’t match Mary and Susie, so they won’t cluster together on the matrix.  However, on the ICW tool, all three women, Joyce, Mary and Susie will match me and Mom.

Using the ICW tool if I were to ICW with Mom, you would see this list:

  • Joyce
  • Mary
  • Susie

The question then becomes, are Joyce, Mary and Susie related to each other, or not.  If so, and to me and Mom, then that indicates a common ancestor within the match group, like me, Joyce and Mom.  The second group doesn’t match the first group – me, Mary, Mom and Susie.  Using these tools together, these people clearly fall into two match groups, the green and blue on the spreadsheet below.  But remember, the match routine doesn’t know which side your As and Cs came from.  All it knows is that you match these people.  But based on these groups and my download spreadsheet common segment matches, I can tell that I’m working with two ancestral lines.
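The difference between the two tools can be sketched with plain sets.  In the Python below, the match lists are my own encoding of Example Six (Dad is left out because no one matches him), and the function names are made up; this illustrates the logic only, not how Family Tree DNA implements either tool.

```python
# Who matches whom in Example Six, written out as match lists.
matches = {
    "Me": {"Mom", "Joyce", "Mary", "Susie"},
    "Mom": {"Me", "Joyce", "Mary", "Susie"},
    "Joyce": {"Me", "Mom"},
    "Mary": {"Me", "Mom", "Susie"},
    "Susie": {"Me", "Mom", "Mary"},
}


def in_common_with(person_a, person_b):
    """ICW: everyone who appears on both people's match lists."""
    return sorted(matches[person_a] & matches[person_b])


def matrix(people):
    """Matrix: for each pair on the list, do they match each other?"""
    return {(a, b): b in matches[a] for a in people for b in people if a != b}
```

`in_common_with("Me", "Mom")` gives Joyce, Mary and Susie, exactly the ICW list above.  Feeding those three names to `matrix` shows that Mary and Susie match each other while Joyce matches neither, which is what splits them into two groups.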

Zipper 11

My matrix for these people would look like this:

Zipper 12

My master matching spreadsheet would now look like this.

zipper 13

When we started, all I would have been able to see is that all of these people matched Mom, Dad and me on the same segments. By utilizing the various tools, I was able to sort into groups and eventually, subgroups.

In fact, you can see below that within Mom’s pink group, there is also the smaller cluster of Mary, Susie, me and Mom.

Zipper 14

For Jeremy and Martha, we can’t do any more right now, so I’ve recorded what we do know and set them aside.

Here, you can see the matches sorted by chromosome, start and end segment.

zipper 16

It looks a lot different than where we started, shown below, when all we had was a list of people who matched each other with no additional information.  We’ve added a lot!

zipper 17

In Summary – Creating the Zipper

So, where are we with this?

By utilizing all of the tools at your disposal, including the ICW tool, the Family Finder Matrix, your matching spreadsheet and your genealogical information, you’re in essence creating that zipper that divides half of your DNA into Mom’s side and Dad’s side.  Then into grandma’s and grandpa’s side, and on up the pedigree chart.

Each of these tools can tell you something unique and important.

The ICW tool tells you who matches you and another person, in common.  It doesn’t tell you if they also match each other.  This tool can provide extremely important clustering information.  For example, if I see unknown cousin Martha clustered with a whole group of known Estes descendants, then that’s a pretty good clue about how I’m related to Martha.  If, on the other hand, I find Martha clustered with people from both sides of my family, well, my Mom and Dad just might be related to each other or their ancestors went to or came from the same places.

By utilizing the Matrix tool, I can tell which of my matches are actually matching each other too, so that puts Martha in a much smaller group, or maybe eliminates her from certain groups.

By then utilizing my downloaded match spreadsheet, on which I record every known tidbit of genealogy information, even generalities like, “family from NC” if that’s the best I can get, I can then see where Martha matches me and others on the same segments, and based on the information in the ICW and the Matrix and my genealogy info, I may be able to slot Martha into a family group.  On a great day – I’ll be able to be more specific and tell her which family group – like we were able to do with my newly found cousin, Loujean.

So, I hope you’ve enjoyed learning how to install a chromosome zipper.  Now you can happily go about unzipping all of that genealogy information held in your DNA, that piece by piece, we’re slowly revealing.

zipper final

Mitochondrial DNA Convergence and Matches

Every now and then, when I’m doing DNA reports, I run across the perfect example of a DNA phenomenon.  Today, it was a mitochondrial DNA mutation in motion.  Let’s take a look at what happened, how it was discovered and what it means.

mtdna convergence chart

I was contacted a few weeks ago by someone I had been working with on another project.  This woman, we’ll call her June, was concerned because she and her maternal first cousin, Doris, had both taken mitochondrial DNA tests and they didn’t match each other.  I took a look, of course, and sure enough, at the HVR1 level, there was one mutation difference, T16271C.

mtdna convergence

This was particularly interesting, because at the first cousin level, these women shared a maternal grandmother, which means that either June’s mother or Doris’s mother had had a mutation in their mitochondrial DNA, or June or Doris did.  June asked me how she could tell who had the mutation.

I asked if either June or Doris had siblings.  June had a brother, John, so she ordered a kit for John.  If John matched June, then their mother is the one who had the mutation.  If John matched Doris, then June herself had the mutation.

How do I know this, that the mutation didn’t happen in Doris or her mother?  Because the mutation is not “normal” and is listed in the RSRS values in the “extra mutations.”

Furthermore, Doris, who did not carry the extra mutation, had 13,204 matches at the HVR1 level (haplogroup H), whereas June, who did carry the extra mutation, only had 41.  Clearly, to be useful genealogically, this test would need to be expanded to the full sequence level.

So June’s brother, John, tested and he matched his sister June, telling us that their mother carried this mutation, and gave it to both of her children.  So the mutation occurred between June’s mother and June’s grandmother.
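The reasoning used to pin down where the mutation happened can be written as a short decision function.  This Python sketch models only the one extra mutation, T16271C, and the function name and return labels are my own.

```python
EXTRA_MUTATION = "T16271C"


def where_mutation_arose(june, john, doris):
    """Each argument is the set of extra mutations that person shows.

    Returns the person on June's line in whom the mutation first appears.
    """
    if EXTRA_MUTATION in doris:
        # Both cousins carry it, so it came down from their shared grandmother.
        return "grandmother or earlier"
    if EXTRA_MUTATION in june and EXTRA_MUTATION in john:
        # Both siblings carry it, so their mother passed it to them.
        return "June's mother"
    if EXTRA_MUTATION in june:
        return "June"
    return "not on June's line"
```

With the actual results, `where_mutation_arose({"T16271C"}, {"T16271C"}, set())` returns "June's mother".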

Are These Matches Valid?

June asked me if her matches were valid.

That’s a tough question to answer, because convergence has occurred.

So let me answer this in two ways.

The matches are technically accurate.  This means that indeed she matches all 41 of the people that the matching routine shows as her exact HVR1 matches.  So in that way, those matches are accurate, but they aren’t valid or meaningful for genealogy.

They aren’t useful, because we know, beyond a doubt, that these matches have not shared an ancestor with her in a very long time, probably back into prehistory, because the reason she matches them at the HVR1 level is because she just happened to have the same mutation that all 41 of them carry.  Carrying the same mutation does NOT absolutely mean you share a common ancestor who carried that mutation.  Mutations can occur at any time, and if a mutation happens at this location in the mitochondrial DNA, there is a 1 in 3 chance the person who has the mutation will have the same value as you, since there are only 4 values, T, A, C, and G, to begin with, and a mutation has to change to one of the other 3.  This is what we call convergence, and you’ve just seen it happen.  People match each other, but because they happened to have the same spontaneous mutation, not because they share a common ancestor who had that mutation.  Most of the time, we don’t know whether we are looking at real matches or matches by convergence, but this time, we know for sure, because we can prove that June’s grandmother did not have the mutation, because June’s first cousin, Doris, does not.
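The 1 in 3 figure comes from simple counting, sketched below in Python.  The function name is hypothetical, and the sketch assumes each of the three other values is equally likely when a mutation happens.  That is a simplification; real mutation rates are not uniform, since some substitutions are more common than others.

```python
from fractions import Fraction

BASES = ("T", "A", "C", "G")


def chance_of_landing_on(original, target):
    """Chance that a mutation away from `original` produces `target`,
    assuming the three other values are equally likely."""
    if original == target:
        return Fraction(0)
    alternatives = [base for base in BASES if base != original]
    return Fraction(alternatives.count(target), len(alternatives))
```

For a T that mutates, `chance_of_landing_on("T", "C")` is `Fraction(1, 3)`.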

So, if June’s HVR1 results aren’t useful to her, whose are?  That’s easy, her cousin Doris’s results are representative of the mitochondrial DNA of their mutual grandmother, so Doris’s matches are actually June and John’s ancestral matches as well.

Could There Be A Fly in the Ointment?

Not matching someone you thought you should match is unsettling.  Could we test someone else to be absolutely positive we’re not dealing with a back mutation?

Certainly, if grandmother had another female child who had children, or if grandmother has a living male child, they can be tested too.  The test on the third child would positively confirm grandmother’s mitochondrial DNA values.

Could we prove positively that the first cousins are actually first cousins, to remove any nagging doubt?

Certainly, using the Family Finder test.