X Matching and Mitochondrial DNA Are Not the Same Thing

Recently, I’ve noticed a lot of confusion surrounding X DNA matching and mitochondrial DNA. Some folks think they are the same thing, but they aren’t at all.

It’s easy to become confused by the different types of DNA that we can use for genealogy, so I’ll try to explain these differences two or three different ways – and hopefully one of them will be just the ticket for you.

Both Associated with Females

I suspect the confusion has to do with the fact that mitochondrial DNA and the X chromosome are both associated in some manner with female inheritance. However, that isn’t always true in the strictest sense, as women also inherit an X chromosome from their father.

Males Inherit:

  • An X chromosome from their mother
  • Mitochondrial DNA from their mother

Females Inherit:

  • An X chromosome from their mother
  • An X chromosome from their father
  • Mitochondrial DNA from their mother

The difference, as you can quickly see, is that females inherit an X chromosome from both parents, while males only inherit the X from their mothers. That’s because males inherit the Y chromosome from their father instead – which is what makes males male.

For a quick overview of how inheritance works, you might want to read the article, 4 Kinds of DNA for Genetic Genealogy.

The good news is that both mitochondrial DNA and the X chromosome have very specific inheritance paths that can be very useful to genealogy, once you understand how they work.

Who Gets What?

Mitochondrial DNA Inheritance

Mitochondrial DNA is inherited by both genders of children from their mothers. Mitochondrial DNA is NEVER recombined with the mitochondrial DNA of the father – so it’s passed intact. That’s why both males and females can test for their direct matrilineal line through their mitochondrial DNA.

In the pedigree chart above, you can see that mtDNA (red circles) is passed directly down the matrilineal line, while Y DNA is passed directly down the patrilineal (surname) line (blue squares.)
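If you like to see rules written down as code, here is a minimal Python sketch of those two direct lines. It assumes standard Ahnentafel numbering, which is not used elsewhere in this article: you are person 1, and the father and mother of person n are numbered 2n and 2n + 1. The function name direct_line is purely illustrative.

```python
def direct_line(generations, maternal):
    """Return Ahnentafel numbers along one direct line.

    Assumption: Ahnentafel numbering, where the test taker is 1,
    the father of person n is 2n and the mother is 2n + 1.
    maternal=True follows the mtDNA (matrilineal) line;
    maternal=False follows the Y DNA (patrilineal) line.
    """
    line = []
    n = 1
    for _ in range(generations):
        # Step to the mother for mtDNA, to the father for Y DNA.
        n = 2 * n + 1 if maternal else 2 * n
        line.append(n)
    return line


mtdna_line = direct_line(4, maternal=True)    # [3, 7, 15, 31]
y_dna_line = direct_line(4, maternal=False)   # [2, 4, 8, 16]
```

Each generation contributes exactly one ancestor to each line, which is why these two tests are so specific. Keep in mind that the Y line only applies if the test taker is a male.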

I’ve written an in-depth article titled, Mitochondrial DNA – Your Mom’s Story that might be useful to read, as well as Working with Y DNA – Your Dad’s Story.

The X Chromosome

The X chromosome is not strictly an autosome, but it is tested along with your autosomal DNA and can recombine whenever a mother passes it to a child. If you are a female, you inherit the X much like any other autosome, meaning chromosomes 1-22. You receive a copy from each parent.

The 23rd pair of chromosomes is the X and Y chromosomes which convey gender. Males receive an X from their mother and Y from their father. The Y chromosome makes males male. Females receive an X chromosome from both parents, just like the rest of chromosomes 1-22.

Inheritance Pathways

If you are a male, the inheritance path of the X chromosome is a bit different from that of a female, because you inherit your X only from your mother.

Females inherit their father’s ONLY X chromosome intact, which he inherited from his mother. Females inherit their X chromosome from their mother in the normal autosomal way. A mother has two X chromosomes, so the mother can give a child either chromosome entirely or parts of both of her X chromosomes.

Because of the different ways that males and females inherit the X chromosome, the inheritance path is different than chromosomes 1-22, portions of which you can inherit from any of your ancestors. Conversely, you can only inherit portions of your X chromosome from certain ancestors. You can read more about this in the article, X Marks the Spot.

Female X inheritance chart. For male distribution, look at my father’s side of the tree.

My own colorized X chromosome chart is shown above, produced from my genealogy software and Charting Companion. An X match MUST COME from one of the ancestors in the pink and blue colored quadrants. It’s very unlikely that I would inherit parts of my X chromosome from all of these ancestors, but these ancestors are the only candidates from whom my X originated. In other words, genealogically, these are the only ancestors for me to investigate when I have an X DNA match with someone.

Because of this unbalanced distribution of the X chromosome, if you are a male and you match someone on the X chromosome, assuming it’s a legitimate match and not a match by chance, then you know the match MUST come from your mother’s side of the family, and only from her pink and blue colored ancestors – looking at my father’s half of the tree as an example.

If you are a female the match can come from either side, but only from a restricted number of individuals – those colored pink or blue, as shown above.
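For anyone who wants to work out their own list of X candidates without a colorized chart, the rule can be sketched in a few lines of Python. This is only an illustration under one assumption of mine: ancestors are identified by Ahnentafel numbers (you are 1, the father of person n is 2n, the mother is 2n + 1). The name x_ancestors is hypothetical.

```python
def x_ancestors(root_is_male, generations):
    """List Ahnentafel numbers of ancestors who could contribute X DNA.

    Rule from the article: a male receives his X only from his mother,
    while a female receives an X from both parents.
    Assumption: Ahnentafel numbering (father = 2n, mother = 2n + 1).
    """
    result = []
    current = [(1, root_is_male)]
    for _ in range(generations):
        next_generation = []
        for number, is_male in current:
            # Everyone receives an X from their mother.
            next_generation.append((2 * number + 1, False))
            # Only females also receive an X from their father.
            if not is_male:
                next_generation.append((2 * number, True))
        result.extend(sorted(number for number, _ in next_generation))
        current = next_generation
    return result


female_candidates = x_ancestors(False, 3)  # [2, 3, 5, 6, 7, 10, 11, 13, 14, 15]
male_candidates = x_ancestors(True, 3)     # [3, 6, 7, 13, 14, 15]
```

Notice that the male list is exactly the mother's half of the female list, which is the same point the chart makes with colors.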

X chart with Y line included in purple, for males, and mitochondrial line in green.

My mitochondrial line, shown on the X chart would consist of only the women on the bottom row, extending to the right from me, colored in green above. My father’s Y DNA line would be the purple region, extending along the bottom at left. Of course, I don’t have a Y chromosome, because I’m female.

Of the individuals carrying the purple Y DNA, the only one with an X chromosome that a female could inherit would be the father. A female would inherit both the mtDNA of all of the green women, plus could also inherit an X chromosome (or part of an X) from them too.

For males, looking at my father’s half of the chart. He can inherit no X chromosome from any of the purple Y DNA portion, because those men gave him their Y chromosome. My father would inherit his mitochondrial DNA from his direct matrilineal line, shown in yellow, below.

X chart with mitochondrial inheritance line for mother (and child) shown in green, for father shown in yellow.  Both yellow and green lines can contribute to the X chromosome for males and females.

In my father’s case, the females in his tree that he can inherit an X chromosome from are quite limited, but people who have the opportunity to pass their X chromosome to my father are never restricted to only the people that pass his mitochondrial DNA to him. However, the X chromosome contributors always include the mitochondrial DNA contributors for both males and females.
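The relationship between the two groups can be checked with a small Python sketch. It assumes Ahnentafel numbering (you are 1, father of n is 2n, mother of n is 2n + 1), which is my addition and not part of the charts, and the function name can_contribute_x is made up for this example. Written in binary, an Ahnentafel number spells out the path from you to the ancestor: after the leading 1, each 0 is a step to a father and each 1 is a step to a mother.

```python
def can_contribute_x(ahnentafel, root_is_male):
    """Return True if this ancestor could have contributed X DNA.

    Assumption: Ahnentafel numbering. The X cannot pass from a father
    to a son, so the path breaks wherever a male's father appears.
    """
    steps = bin(ahnentafel)[3:]  # drop the '0b1' prefix; '0' = father, '1' = mother
    child_is_male = root_is_male
    for step in steps:
        if step == "0" and child_is_male:
            return False  # a male does not receive an X from his father
        child_is_male = step == "0"
    return True


# The direct matrilineal (mtDNA) line is 3, 7, 15, 31, ... and always qualifies.
mtdna_line = [3, 7, 15, 31]
assert all(can_contribute_x(n, root_is_male=True) for n in mtdna_line)

# A paternal grandmother (5) can give X to a granddaughter but not to a grandson.
assert can_contribute_x(5, root_is_male=False)
assert not can_contribute_x(5, root_is_male=True)
```

So every mitochondrial contributor passes the X test, but plenty of X contributors, such as that paternal grandmother, are not on the mitochondrial line.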

In my father’s case, above, he inherits his X chromosome from his mother, who can only inherit her X from the people on his side of the chart shown in yellow, blue or pink. In essence, the people in yellow or to the left of the yellow with any color.

As his daughter, I can inherit from any of those ancestors as well, since he gives me his only X, which he inherited from his mother. I also inherit an X from my mother from anyone who is green, pink or blue on her side of my chart.

As you can see, my X can come from many fewer ancestors on my father’s side than on my mother’s side.

It just happens that ancestors in the mitochondrial line also are able to contribute an X chromosome and either gender can inherit parts of their X chromosome from any female upstream of their mother in the direct matrilineal line. However, only the direct matrilineal line (yellow for your father and green for your mother) contributes mitochondrial DNA. None of the other ancestors contribute mtDNA to this male or female, although females contribute their mtDNA to other individuals in the tree. For a more detailed discussion on inheritance, please read the articles in the “Concepts – Who to Test” series.

Special Treatment for X Matches

While the generally accepted threshold for autosomal DNA is about 7cM, for X DNA, there appears to be a much higher incidence of false matches at higher levels than the rest of the chromosomes, as documented by Philip Gammon in his Match-Maker-Breaker tool. This appears to have to do with SNP density.

I would encourage genetic genealogists to consider someplace between 10 and 15 cM as an acceptable threshold for an X chromosome match. This of course does not mean that smaller segment matching can’t be relevant, it’s just that X matches are less likely to be relevant at levels below 10-15 cM than the rest of the chromosomes.
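If you keep your segment matches in a spreadsheet or script, the two thresholds might be applied something like the Python sketch below. The constant and function names are hypothetical, and the X default of 10 cM is simply the low end of the 10-15 cM range suggested above, so adjust it to suit your own comfort level.

```python
AUTOSOMAL_MIN_CM = 7.0  # generally accepted threshold for chromosomes 1-22
X_MIN_CM = 10.0         # low end of the suggested 10-15 cM range for the X


def passes_threshold(chromosome, centimorgans, x_min_cm=X_MIN_CM):
    """Return True if a matching segment meets the size threshold.

    chromosome is a label such as "1" through "22" or "X".
    """
    minimum = x_min_cm if chromosome == "X" else AUTOSOMAL_MIN_CM
    return centimorgans >= minimum


# An 8 cM segment clears the autosomal bar but not the X bar.
assert passes_threshold("7", 8.0)
assert not passes_threshold("X", 8.0)
# A stricter X cutoff can be passed in explicitly.
assert not passes_threshold("X", 12.0, x_min_cm=15.0)
```

A segment that fails this check is not necessarily false. It just deserves more skepticism before you spend time on it.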

Summary

As you can see, the mitochondrial DNA is passed from one line only – the direct matrilineal line – green to my mother and then me, yellow to my father. The mitochondrial DNA has absolutely NOTHING to do with the X chromosome, as they are entirely different kinds of DNA. It just so happens that the individuals who contribute mitochondrial DNA are also some of the ancestors who can contribute an X chromosome to either males or females.

The yellow and green ancestors always contribute mitochondrial DNA, but the pink and blue NEVER contribute mitochondrial DNA to the father and mother in our chart.

The X chromosome has a very distinctive inheritance path, shown in the first fan chart, that will help identify potential ancestors who may have contributed your X chromosome – which is wonderful for genealogists. If your ancestor is not colored pink or blue, in the first chart, they did not contribute anything to your X chromosome – so an X match MUST come from a pink or blue ancestor (which includes yellow and green in the later charts.)

By color, the people in the fan chart provide the following:

  • Purple – Y chromosome to father only.  Y is passed on to a male child, but not to females.
  • Yellow – Mitochondrial always to father. X always from mother to males but X can come from either yellow or pink and blue ancestors upstream.
  • Green – Mitochondrial always to the mother.  Females receive an X chromosome from their green mother and also from their father, who received his X chromosome from his yellow mother.
  • Pink and blue on father’s side – contribute to the father’s X chromosome, in addition to yellow.
  • Pink and blue on mother’s side – contribute to the mother’s X chromosome, in addition to green.

 

If you are a male and see an X match on your father’s side of the tree, you know that match is either actually coming from your mother’s side of the tree, or the match is false, meaning identical by chance.

The great news is that X matching is another tool with special attributes in the genealogist’s toolbox, along with both mitochondrial and Y DNA.

Your X chromosome test is included as part of the Family Finder test. You can order the Family Finder or the mitochondrial DNA tests here.

______________________________________________________________________

Standard Disclosure

This standard disclosure will now appear at the bottom of every article in compliance with the FTC Guidelines.

Hot links are provided to Family Tree DNA, where appropriate. If you wish to purchase one of their products, and you click through one of the links in an article to Family Tree DNA, or on the sidebar of this blog, I receive a small contribution if you make a purchase. Clicking through the link does not affect the price you pay. This affiliate relationship helps to keep this publication, with more than 850 articles about all aspects of genetic genealogy, free for everyone.

I do not accept sponsorship for this blog, nor do I write paid articles, nor do I accept contributions of any type from any vendor in order to review any product, etc. In fact, I pay a premium price to prevent ads from appearing on this blog.

When reviewing products, in most cases, I pay the same price and order in the same way as any other consumer. If not, I state very clearly in the article any special consideration received. In other words, you are reading my opinions as a long-time consumer and consultant in the genetic genealogy field.

I will never link to a product about which I have reservations or qualms, either about the product or about the company offering the product. I only recommend products that I use myself and bring value to the genetic genealogy community. If you wonder why there aren’t more links, that’s why and that’s my commitment to you.

Thank you for your readership, your ongoing support and for purchasing through the affiliate link if you are interested in making a purchase at Family Tree DNA.

Concepts – Who To Test for Your Father’s DNA

If the first thing you thought when you read the title of this article was, “Well duh – test your father,” you would be right…unless your father is deceased.  Then, it’s not nearly as straightforward, because you have to find other family members who carry the same Y and mitochondrial DNA as your father.

These same concepts and techniques can be applied to testing for other men’s lines as well – so please read, even if Dad is sitting right beside you.

Before beginning this article, you might want to read “4 Kinds of DNA for Genetic Genealogy” to understand the very basics of how different kinds of DNA are inherited, and how they can help you.

I was inspired to write this series of “Who to Test” articles to help people determine how to obtain the DNA they need to solve family mysteries from ancestors in their tree. For the most part, those ancestors are deceased, so one must understand how to obtain their DNA by testing living descendants descended in special ways.

In this series, we’ll be discussing how to test all of the individuals above for their mitochondrial DNA and males for their Y DNA.

Y DNA lineages are shown by blue lines and mitochondrial DNA lineages are shown by pink lines. In the charts below, different colored boxes and hearts show the descent of blue male lines and pink(ish) mitochondrial lines.

In other words, the son at the bottom inherits his father’s light blue Y DNA, but his mother’s pink mitochondrial DNA that is the same as his sister’s and his mother’s mitochondrial line. Hence, his pink heart.

What Can Y and Mitochondrial DNA Tell You?

Both Y and mitochondrial DNA can tell you about your clan, meaning where your ancestors in that particular line were found. Many people have been surprised to find that these particular lines descended from Native American, Asian, Jewish, European or African ancestors. Some clan assignments, known as haplogroups, can be quite specific, but others are more general in nature.

You also receive matches and can communicate to find your common ancestor. Males can look for surnames the same or similar to their own.

You can read more about what mitochondrial DNA can do for you in the article, Mitochondrial DNA – Your Mom’s Story.

You can read more about what Y DNA can do for you in the article, Working with Y DNA – Your Dad’s Story.

Your Father’s DNA

Testing your father’s Y and mitochondrial DNA is easy, if your father is living. You can simply test your father.

As you can see in the chart above, your father inherited his Y DNA from the light blue line, from his father, which is typically the surname line.

Your father inherited his mitochondrial DNA from his direct matrilineal line, meaning the magenta line – your paternal grandmother and her direct maternal ancestors.

Your father did NOT pass his mitochondrial DNA to either of his children and he only passed his Y DNA to his son. His daughter has no Y DNA and carries her mother’s mitochondrial DNA.

You can test both your father’s Y DNA and mitochondrial by simply testing your father. However, testing becomes more challenging if your father is not available to test.

Your goal then becomes to find people who carry the same light blue Y DNA as your father, and the same magenta mitochondrial DNA that he carried as well. Let’s look at various ways to achieve that goal.

Testing Uncles and Siblings

If you are a male, meaning the son in the chart above, just test yourself for your father’s Y DNA.

Of course, you carry your mother’s mitochondrial DNA, shown by the pink heart that matches your sister, so you will have to find someone else who carries the same mitochondrial DNA as your father.

If you are a female, you can’t test for either your father’s Y or his mtDNA line. However, all is not lost.

If your father has any full male siblings, that’s your next best bet, because they will carry the same Y DNA and the same mitochondrial DNA as your father, because they share the same parents. You can test the same uncle for both Y and mitochondrial DNA. A brother and sister to your father have been added to the chart, below.

In the above chart, your father has two siblings, a male and a female. All three share the same mitochondrial DNA, but only the males share the Y DNA. Your father’s brother shares both. Your father’s sister shares his mitochondrial DNA, but not his Y DNA, shown above.

However, let’s say you’re the daughter and that your father and his brother are deceased. You can test your father’s sister for her mitochondrial DNA and you can test your own brother for your father’s Y DNA, shown below.

Don’t have a brother but your father’s brother had a son? No problem. Test the brother’s son who will carry his father’s Y DNA, which is the same as your father’s Y DNA, assuming nothing unknown.

You say your father’s sister is deceased too, but she had a child of either gender. No problem, you can test that child, whether they are a male or female for the sister’s mitochondrial DNA, which is the same as your father’s mitochondrial DNA.

In the chart above, all of the people with sky blue squares can test for your father’s Y DNA and all of the people with magenta squares or magenta hearts can test for your father’s mitochondrial DNA.

As you can see, you may well have lots of options.

Potential Testers

Tester | Father’s Y DNA | Father’s mtDNA
Your father | Yes | Yes
You | Yes, if you are a male; no, if you are a female | No – you inherit your mtDNA from your mother
Your sibling | Yes, if your sibling is a male; no, if your sibling is a female | No – your father does not pass his mtDNA to his children
Your father’s brother | Yes | Yes
Your father’s sister | No – she didn’t inherit a Y chromosome from her father | Yes
Your father’s brother’s children | Yes, if male; no, if female | No – he didn’t pass his mitochondrial DNA to his children
Your father’s sister’s children | No | Yes – both genders
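The same table can be derived from the two inheritance rules rather than memorized. Here is a minimal Python sketch under some assumptions of my own: the Person class, the function names and the small example family are all hypothetical, every name in the tree is unique, and every recorded parent is the biological parent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    name: str
    is_male: bool
    father: Optional["Person"] = None
    mother: Optional["Person"] = None


def y_line_founder(person):
    """Name at the top of the known direct paternal line (None for a female)."""
    if not person.is_male:
        return None
    while person.father is not None:
        person = person.father
    return person.name


def mt_line_founder(person):
    """Name at the top of the known direct maternal line.

    If no mother is recorded, this is the person themselves.
    """
    while person.mother is not None:
        person = person.mother
    return person.name


def can_test_for(candidate, target):
    """Return (carries target's Y DNA, carries target's mtDNA)."""
    same_y = candidate.is_male and y_line_founder(candidate) == y_line_founder(target)
    same_mt = mt_line_founder(candidate) == mt_line_founder(target)
    return same_y, same_mt


# A hypothetical family matching the chart.
grandfather = Person("Paternal grandfather", True)
grandmother = Person("Paternal grandmother", False)
father = Person("Father", True, grandfather, grandmother)
uncle = Person("Father's brother", True, grandfather, grandmother)
aunt = Person("Father's sister", False, grandfather, grandmother)
mother = Person("Mother", False)
daughter = Person("You (daughter)", False, father, mother)
brother = Person("Your brother", True, father, mother)
uncles_son = Person("Uncle's son", True, uncle, Person("Uncle's wife", False))
aunts_daughter = Person("Aunt's daughter", False, Person("Aunt's husband", True), aunt)

for relative in (father, daughter, brother, uncle, aunt, uncles_son, aunts_daughter):
    print(relative.name, can_test_for(relative, father))
```

Each printed pair lines up with a row in the table: the first value answers the Y DNA column and the second answers the mtDNA column.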

What Tests to Order

Family Tree DNA is the only testing vendor that offers Y and mitochondrial DNA testing that allows you to match to others. They also provide tools to help you understand the message your Y and mtDNA carry for you.

For Y DNA testing, you can order either the 37, 67 or 111 marker test. I recommend that you purchase what the budget can afford. You can always upgrade later, but the cost of the original test plus an upgrade is somewhat more than just purchasing the larger test initially.  The greater the number of markers you purchase, the higher the level of specificity in the match results. The more closely you match someone, the more closely related you are to that person, and the closer in time your common ancestor lived. If you’re unsure what to purchase, 37 markers is a great place to begin.

For mitochondrial DNA testing, you can order the mtDNA Plus test, which is a subset of the mtFull Sequence test. In order to receive your full haplogroup designation, the entire mitochondrial DNA needs to be tested. I recommend the full sequence test be ordered.

For autosomal DNA testing, everyone can test, and as long as you’re placing an order, I’d suggest that you go ahead and order the Family Finder test. You can discover your ethnicity percentage estimates for several worldwide regions, including breakdowns of Europe, Africa and Asia as well as Native American and Jewish.

Additionally, while the Y and mitochondrial DNA tests reach back deep into time on those two specific lines, and only those two lines, the autosomal test tests the DNA of all of your ancestral lines, but may not reach back reliably in time for matches before the past 5 or 6 generations. Think of Y and mtDNA as viewing recent as well as very deep ancestors on just those lines, and Family Finder as broadly surveying all of your ancestors, but just in the past 7-10 generations.

The fun of autosomal DNA testing, aside from ethnicity estimates, is to discover which cousins you match and find your common ancestor.

In order for Family Tree DNA to be able to provide you with phased Family Finder matches, which indicate on which side of your tree (maternal or paternal) your match is found and help to identify common ancestors, it’s critical for known relatives to test. The older the relative, generationally, the more helpful the testing is to you – so test those older family members immediately, while you still can.

You can order your tests and upgrades here.

______________________________________________________________________

Quick Tip – How to Unjoin a Project at Family Tree DNA

Oops! Did you join a project at Family Tree DNA by accident, or just need to do some housekeeping?

Some folks think that only project administrators can remove people from projects, but people can unjoin themselves – and don’t have to wait on the administrator.

Removing yourself from a Family Tree DNA project is easy. Just click on the Projects tab, at the top right of your personal page, then on “Manage my projects.”

You will then see a list of the projects of which you are currently a member.

At the far right, you can click on “Leave Project” to unjoin yourself from the project.

The next screen you will see asks you to provide a reason for leaving.

Type something in the box, but please be nice – administrators are all volunteers – then click submit.

Understand that your reason is sent to the administrator, but they have no avenue to reply to you after you have left the project. So don’t expect to hear from them, because they can’t.  If you have a question for the admins or a discussion item, prior to leaving, just send them an e-mail.

Easy peasy!!

If you’re looking for how to select and join a project, you might enjoy How to Join a DNA Project.

______________________________________________________________________

Quick Tip – Add Most Distant Ancestor and Location

This Quick Tip will help you get the most out of your Y and mitochondrial DNA results at Family Tree DNA in 9 easy steps.  It’s not difficult, so let’s take a look at how this will help you and walk through the steps together.

Finding Your Common Ancestor

As genealogists, our goal is to find our common ancestor with our matches and this is done through matching our DNA and looking at the relevant branches of our and our matches’ trees.

At Family Tree DNA, one of the things each of us can do to help our matches identify our most distant direct matrilineal (mtDNA) and Y DNA ancestors is to complete the Earliest Known Ancestor fields in our Personal Information.

If you’re wondering how this benefits YOU, just look at the information you see about your matches. How much information you see is entirely dependent on your match completing their Most Distant Ancestor and that ancestor’s location information.

In the above example, the matches (names obscured for privacy) happen to be my mitochondrial DNA full sequence matches. Regardless of which matches you’re looking at, all Y and mtDNA matches show the Earliest Known Ancestor – which is absolutely critical information for you to discern whether you can identify a common ancestor, and whether or not the location of that ancestor is someplace near the location of your own earliest known ancestor.

The second screen where Earliest Known Ancestor information appears is the Matches Map, below, which shows you the location of the Earliest Known Ancestor of each of your matches.

My Matches Map for full sequence mitochondrial results is shown above, with my ancestor shown with the white pin. Ancestors and their locations are critically important for determining the relevance of matches.

The more everyone shares, the better for everyone who matches!

Who is My Earliest Known Ancestor?

It’s easy to get confused, because this field isn’t asking for your oldest known ancestor in that entire line, but your DIRECT LINE ancestor, specifically:

  • For mitochondrial DNA – your earliest known ancestor is your direct MATERNAL (matrilineal) ancestor – so, you, your mother, her mother, her mother, etc., until you run out of mothers. If your oldest ancestor in that line is the husband of one of the mothers, that doesn’t count – because you only inherit your mitochondrial DNA from the direct matrilineal females. The person listed in this field MUST BE A FEMALE. If you see one of your matches listing a male, you know they are confused.

To clarify, in the above pedigree chart, you inherit your mitochondrial DNA from the red circle ancestors – so the oldest ancestor in that line is whose name is listed as the Earliest Known Ancestor.

  • For your paternal line, Y DNA for males, your Earliest Known Ancestor would be your surname ancestor on the direct paternal line – shown by blue squares, above.

How Do I Add or Update Ancestors?

Step 1 – On your dashboard, beneath your picture, click on the orange “Manage Personal Information” link.

Step 2 – You will then see the Account Setting toolbar below.

Click on the “Genealogy” tab.

Step 3 – Click on the “Earliest Known Ancestors” link, beneath the Genealogy tab.

Step 4 – Update your Earliest Known Ancestors information, then click on the orange “Save” button on the bottom to save your information.

Step 5 – To add or update the Ancestral Location, click on “Update Location” for the Direct Paternal or Direct Maternal side, shown above. You will see the following map which displays the locations for your ancestors if you have entered that information.

For females, since you don’t have a Y chromosome, your paternal location won’t show. Everyone’s mitochondrial DNA location will be displayed on the map.

Step 6 – Below the map, click on “Edit Location.”

A grey box will be displayed with your current information showing. To add information or change a location, click on “Update Maternal Location” or “Update Paternal Location.” The Maternal and Paternal steps are the same, so we’ll use the maternal line as an example.

Step 7 – Enter your direct matrilineal ancestor’s name, birth year and location. This is the information that will show in your match link to others. Be sure it’s your earliest known ancestor in your mother’s direct line; your mother, her mother, her mother, etc.

Then click on “next.”

Step 8 – The system will search for the location you entered, showing in the search location, below, or finding the closest location. The system automatically completes the longitude and latitude, so ignore those fields.

Click on Search. You will be given the option to change the verbiage of the location. This may be useful when the name of the town, region or country has changed from when your ancestor lived there versus the name today.

Step 9 – Your final information will be shown, so click on “Save and Exit.”

Done

Congratulations, you’re finished!  If you want to update your information, just follow the same process.

Now might be a good time to check your information to be sure it’s as detailed and complete as possible. After all, we all want information about our matches, so we need to give them our own!

You can click here to sign in.

______________________________________________________________________

Concepts – Segment Size, Legitimate and False Matches

Matchmaker, matchmaker, make me a match!

One of the questions I often receive about autosomal DNA is, “What, EXACTLY, is a match?”  The answer at first glance seems evident, meaning when you and someone else are shown on each other’s match lists, but it really isn’t that simple.

What I’d like to discuss today is what actually constitutes a match – and the difference between legitimate or real matches and false matches, also called false positives.

Let’s look at a few definitions before we go any further.

Definitions

  • A Match – when you and another person are found on each other’s match lists at a testing vendor. You may match that person on one or more segments of DNA.
  • Matching Segment – when a particular segment of DNA on a particular chromosome matches to another person. You may have multiple segment matches with someone, if they are closely related, or only one segment match if they are more distantly related.
  • False Match – also known as a false positive match. This occurs when your match to someone is not identical by descent (IBD), but identical by chance (IBC), meaning that your DNA and theirs just happened to match because your mother’s and father’s DNA aligned in such a way that you match the other person, even though neither your mother nor your father matches that person on that segment.
  • Legitimate Match – meaning a match that is a result of the DNA that you inherited from one of your parents. This is the opposite of a false positive match.  Legitimate matches are identical by descent (IBD.)  Some IBD matches are considered to be identical by population, (IBP) because they are a result of a particular DNA segment being present in a significant portion of a given population from which you and your match both descend. Ideally, legitimate matches are not IBP and are instead indicative of a more recent genealogical ancestor that can (potentially) be identified.

You can read about Identical by Descent and Identical by Chance here.

  • Endogamy – an occurrence in which people intermarry repeatedly with others in a closed community, effectively passing the same DNA around and around in descendants without introducing different/new DNA from non-related individuals. People from endogamous communities, such as Jewish and Amish groups, will share more DNA and more small segments of DNA than people who are not from endogamous communities.  Fully endogamous individuals have about three times as many autosomal matches as non-endogamous individuals.
  • False Negative Match – a situation where someone who should match doesn’t. False negatives are very difficult to discern.  We most often see them when a match is hovering at a match threshold and by lowering the threshold slightly, the match is then exposed.  False negative segments can sometimes be detected when comparing DNA of close relatives and can be caused by read errors that break a segment in two, resulting in two segments that are too small to be reported individually as a match.  False negatives can also be caused by population phasing which strips out segments that are deemed to be “too matchy” by Ancestry’s Timber algorithm.
  • Parental or Family Phasing – utilizing the DNA of your parents or other close family members to determine which side of the family a match derives from. Actual phasing means to determine which parts of your DNA come from which parent by comparing your DNA to at least one, if not both parents.  The results of phasing are that we can identify matches to family groups such as the Phased Family Finder results at Family Tree DNA that designate matches as maternal or paternal based on phased results for you and family members, up to third cousins.
  • Population Based Phasing – In another context, phasing can refer to academic phasing where some DNA that is population based is removed from an individual’s results before matching to others. Ancestry does this with their Timber program, effectively segmenting results and sometimes removing valid IBD segments.  This is not the type of phasing that we will be referring to in this article and parental/family phasing should not be confused with population/academic phasing.

IBD and IBC Match Examples

It’s important to understand the definitions of Identical by Descent and Identical by Chance.

I’ve created some easy examples.

Let’s say that a match is defined as any 10 DNA locations in a row that match.  To keep this comparison simple, I’m only showing 10 locations.

In the examples below, you are the first person, on the left, and your DNA strands are showing.  You have a pink strand that you inherited from Mom and a blue strand inherited from Dad.  Mom’s 10 locations are all filled with A and Dad’s locations are all filled with T.  Unfortunately, Mother Nature doesn’t keep your Mom’s and Dad’s strands on one side or the other, so their DNA is mixed together in you.  In other words, you can’t tell which parts of your DNA are whose.  However, for our example, we’re keeping them separate because it’s easier to understand that way.

Legitimate Match – Identical by Descent from Mother

matches-ibd-mom

In the example above, Person B, your match, has all As.  They will match you and your mother, both, meaning the match between you and person B is identical by descent.  This means you match them because you inherited the matching DNA from your mother. The matching DNA is bordered in black.

Legitimate Match – Identical by Descent from Father

In this second example, Person C has all T’s and matches both you and your Dad, meaning the match is identical by descent from your father’s side.

matches-ibd-dad

You can clearly see that you can have two different people match you on the same exact segment location, but not match each other.  Person B and Person C both match you on the same location, but they very clearly do not match each other because Person B carries your mother’s DNA and Person C carries your father’s DNA.  These three people (you, Person B and Person C) do NOT triangulate, because B and C do not match each other.  The article, “Concepts – Match Groups and Triangulation” provides more details on triangulation.

Triangulation is how we prove that individuals descend from a common ancestor.

If Person B and Person C both descended from your mother’s side and matched you, then they would both carry all As in those locations, and they would match you, your mother and each other.  In this case, they would triangulate with you and your mother.

False Positive or Identical by Chance Match

This third example shows that Person D does technically match you, because they have all As and Ts, but they match you by zigzagging back and forth between your Mom’s and Dad’s DNA strands.  Of course, there is no way for you to know this without matching Person D against both of your parents to see if they match either parent.  If your match does not match either parent, the match is a false positive, meaning it is not a legitimate match.  The match is identical by chance (IBC.)

matches-ibc

One clue as to whether a match is IBC or IBD, even without your parents, is whether the person matches you and other close relatives on this same segment.  If not, then the match may be IBC. If the match also matches close relatives on this segment, then the match is very likely IBD.  Of course, the segment size matters too, which we’ll discuss momentarily.

If a person triangulates with 2 or more relatives who descend from the same ancestor, then the match is identical by descent, and not identical by chance.

False Negative Match

This last example shows a false negative.  The DNA of Person E had a read error at location 5, meaning that there are not 10 locations in a row that match.  This causes you and Person E to NOT be shown as a match, creating a false negative situation, because you actually would match if Person E hadn’t had the read error.

matches-false-negative

Of course, false negatives are by definition very hard to identify, because you can’t see them.
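For readers who like to see the logic spelled out, here is a minimal Python sketch of the four toy examples above. It assumes the simplified model from the text (10 locations, one pink strand of As and one blue strand of Ts) and treats each other person as a single string of letters. The function names and the "G" used for Person E's read error are hypothetical choices for illustration, not anything a vendor actually uses.

```python
MATCH_LENGTH = 10  # toy threshold from the text: 10 locations in a row


def matches_strand(strand, other):
    """True if `other` agrees with this one strand at every location."""
    return len(other) >= MATCH_LENGTH and strand == other


def matches_unphased(mom_strand, dad_strand, other):
    """True if, at every location, `other` agrees with either of your strands."""
    return len(other) >= MATCH_LENGTH and all(
        o in (m, d) for m, d, o in zip(mom_strand, dad_strand, other)
    )


def classify(mom_strand, dad_strand, other):
    """Label a toy match the way the examples above do."""
    if not matches_unphased(mom_strand, dad_strand, other):
        return "no match shown"
    if matches_strand(mom_strand, other):
        return "IBD from mother"
    if matches_strand(dad_strand, other):
        return "IBD from father"
    return "IBC (false positive)"


mom_strand = "A" * 10  # the strand you inherited from Mom
dad_strand = "T" * 10  # the strand you inherited from Dad

people = {
    "Person B": "AAAAAAAAAA",
    "Person C": "TTTTTTTTTT",
    "Person D": "ATATATATAT",  # zigzags between the two strands
    "Person E": "AAAAGAAAAA",  # hypothetical read error at location 5
}

for name, dna in people.items():
    print(name, "->", classify(mom_strand, dad_strand, dna))
```

Note that the sketch can only report Person E as "no match shown". Nothing in the data reveals that this is really a false negative, which is exactly why they are so hard to find.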

Comparisons to Your Parents

Legitimate matches will phase to your parents – meaning that you will match Person B on the same amount of a specific segment, or a smaller portion of that segment, as one of your parents.

False matches mean that you match the person, but neither of your parents matches that person, meaning that the segment in question is identical by chance, not by descent.

Comparing your matches to both of your parents is the easiest litmus paper test of whether your matches are legitimate or not.  Of course, the caveat is that you must have both of your parents available to fully phase your results.

Many of us don’t have both parents available to test, so let’s take a look at how often false positive matches really do occur.

False Positive Matches

How often do false matches really happen?

The answer to that question depends on the size of the segments you are comparing.

Very small segments, say at 1cM, are very likely to match randomly, because they are so small.  You can read more about SNPs and centiMorgans (cM) here.

As a rule of thumb, the larger the matching segment as measured in cM, with more SNPs in that segment:

  • The stronger the match is considered to be
  • The more likely the match is to be IBD and not IBC
  • The closer in time the common ancestor, facilitating the identification of said ancestor

Just in case we forget sometimes, identifying ancestors IS the purpose of genetic genealogy, although it seems like we sometimes get all geeked out by the science itself and process of matching!  (I can hear you thinking, “speak for yourself, Roberta.”)

It’s Just a Phase!!!

Let’s look at an example of phasing a child’s matches against those of their parents.

In our example, we have a non-endogamous female child (so they inherit an X chromosome from both parents) whose matches are being compared to her parents.

I’m utilizing files from Family Tree DNA. Ancestry does not provide segment data, so Ancestry files can’t be used.  At 23andMe, coordinating the security surrounding 3 individuals’ results and trying to make sure that the child and both parents all have access to the same individuals through sharing would be a nightmare, so the only vendor whose results you can reasonably utilize for phasing is Family Tree DNA.

You can download the matches for each person by chromosome segment by selecting the chromosome browser and the “Download All Matches to Excel (CSV Format)” at the top right above chromosome 1.

matches-chromosomr-browser

All segment matches 1cM and above will be downloaded into a CSV file, which I then save as an Excel spreadsheet.

I downloaded the files for both parents and the child. I deleted segments below 3cM.

About 75% of the rows in the files were segments below 3cM. In part, I deleted these segments due to the sheer size and the fact that the segment matching was a manual process.  In part, I did this because I already knew that segments below 3 cM weren’t terribly useful.

Rows               Father    Mother    Child
Total              26,887    20,395    23,681
< 3 cM removed     20,461    15,025    17,784
Total Processed     6,426     5,370     5,897
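If you would rather not delete the small segments by hand, the filtering step can be sketched in a few lines of Python. This is only an illustration under assumptions: the column names (MATCHNAME, CHROMOSOME, START LOCATION, END LOCATION, CENTIMORGANS) are assumed and should be checked against your own download, and the sample rows are made up, not real match data.

```python
import csv
import io

MIN_CM = 3.0  # the cutoff used in this article


def keep_segments(csv_text, min_cm=MIN_CM):
    """Return (kept_rows, removed_count) for one chromosome browser file."""
    kept, removed = [], 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        if float(row["CENTIMORGANS"]) >= min_cm:
            kept.append(row)
        else:
            removed += 1
    return kept, removed


# Tiny made-up sample with assumed column names.
sample = """MATCHNAME,CHROMOSOME,START LOCATION,END LOCATION,CENTIMORGANS
Example One,1,1000000,3000000,1.52
Example One,4,5000000,9000000,7.81
Example Two,11,2000000,2500000,2.99
Example Two,X,100000,9000000,12.40
"""

kept, removed = keep_segments(sample)
print(len(kept), "kept,", removed, "removed")

# Share of rows removed, using the totals from the table above
# (father, mother, child).
for total, gone in [(26887, 20461), (20395, 15025), (23681, 17784)]:
    print(f"{gone / total:.1%} of {total:,} rows were below 3 cM")
```

For a real file, you would read the text from disk instead of the `sample` string.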

Because I have the ability to phase these matches against both parents, I wanted to see how many of the matches in each category were indeed legitimate matches and how many were false positives, meaning identical by chance.

How does one go about doing that, exactly?

Downloading the Files

Let’s talk about how to make this process easy, at least as easy as possible.

Step one is downloading the chromosome browser matches for all 3 individuals, the child and both parents.

First, I downloaded the child’s chromosome browser match file and opened the spreadsheet.

Second, I downloaded the mother’s file, colored all of her rows pink, then appended the mother’s rows into the child’s spreadsheet.

Third, I did the same with the father’s file, coloring his rows blue.

After I had all three files in one spreadsheet, I sorted the columns by segment size and removed the segments below 3cM.

Next, I sorted the remaining items on the spreadsheet, in order, by column, as follows:

  • End
  • Start
  • Chromosome
  • Matchname

matches-both-parents

My resulting spreadsheet looked like this.  Sorting in the order prescribed provides you with the matches to each person in chromosome and segment order, facilitating easy (OK, relatively easy) visual comparison for matching segments.
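The combine-and-sort step can also be sketched in Python. This assumes that the column list above is the order in which the sorts are applied one after another, so that the last column sorted (Matchname) ends up as the primary grouping. The dictionary keys, the "source" label that stands in for the pink and blue row colors, and the sample rows are all hypothetical.

```python
def chromosome_key(chromosome):
    """Sort chromosomes 1-22 numerically and put X after them."""
    return 23 if chromosome == "X" else int(chromosome)


def combine_and_sort(child_rows, mother_rows, father_rows):
    """Merge the three files and sort them for side-by-side comparison."""
    combined = []
    for source, rows in (
        ("child", child_rows),
        ("mother", mother_rows),
        ("father", father_rows),
    ):
        for row in rows:
            combined.append({**row, "source": source})  # stands in for row color

    # Python's sort is stable, so sorting by one column at a time, in the
    # listed order, leaves the last column sorted as the primary key.
    combined.sort(key=lambda r: r["end"])
    combined.sort(key=lambda r: r["start"])
    combined.sort(key=lambda r: chromosome_key(r["chromosome"]))
    combined.sort(key=lambda r: r["matchname"])
    return combined


# Made-up rows for illustration only.
child = [
    {"matchname": "Match B", "chromosome": "X", "start": 100, "end": 900},
    {"matchname": "Match A", "chromosome": "11", "start": 500, "end": 800},
    {"matchname": "Match A", "chromosome": "2", "start": 50, "end": 70},
]
mother = [
    {"matchname": "Match A", "chromosome": "11", "start": 500, "end": 800},
]
father = []

for row in combine_and_sort(child, mother, father):
    print(row["matchname"], row["chromosome"], row["start"], row["end"], row["source"])
```

The result groups each match's rows together, in chromosome and position order, with the child's row directly next to any parent row for the same segment.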

I then colored all of the child’s NON-matching segments green so that I could see (and eventually filter the matchname column by) the green color indicating that they were NOT matches.  Do this only for the child, or the white (non-colored) rows.  The child’s matchname only gets colored green if there is no corresponding match to a parent for that same person on that same chromosome segment.

matches-child-some-parents

All of the child’s matches that DON’T have a corresponding parent match in pink or blue for that same person on that same segment will be colored green.  I’ve boxed the matches so you can see that they do match, and that they aren’t colored green.

In the above example, Donald and Gaff don’t match either parent, so they are all green.  Mess does match the father on some segments, so those segments are boxed, but the rest of Mess doesn’t match a parent, so is colored green.  Sarah doesn’t match any parent, so she is entirely green.

Yes, you do manually have to go through every row on this combined spreadsheet.
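The comparison described here was done by eye, but a first pass could be scripted. The sketch below is one possible approach, not the process used for this article: it flags every child row that has no overlapping parent row for the same match on the same chromosome, which corresponds to the rows colored green. The field names and sample rows are hypothetical, and any overlap at all counts, so borderline rows still deserve a visual check.

```python
def overlaps(a, b):
    """True if two segments on the same chromosome share any positions."""
    return a["start"] <= b["end"] and b["start"] <= a["end"]


def unphased_child_rows(child_rows, parent_rows):
    """Return the child's rows with no overlapping parent row for the
    same match name and chromosome (the rows that would be colored green)."""
    by_key = {}
    for row in parent_rows:
        by_key.setdefault((row["matchname"], row["chromosome"]), []).append(row)

    green = []
    for row in child_rows:
        candidates = by_key.get((row["matchname"], row["chromosome"]), [])
        if not any(overlaps(row, parent) for parent in candidates):
            green.append(row)
    return green


# Made-up rows for illustration only.
child_rows = [
    {"matchname": "Match A", "chromosome": "11", "start": 500, "end": 800},
    {"matchname": "Match A", "chromosome": "2", "start": 50, "end": 70},
    {"matchname": "Match B", "chromosome": "11", "start": 500, "end": 800},
]
parent_rows = [
    {"matchname": "Match A", "chromosome": "11", "start": 450, "end": 900},
]

for row in unphased_child_rows(child_rows, parent_rows):
    print("green:", row["matchname"], "chromosome", row["chromosome"])
```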

If you’re going to phase your matches against your parent or parents, you’ll want to know what to expect.  Just because you’ve seen one match does not mean you’ve seen them all.

What is a Match?

So, finally, the answer to the original question, “What is a Match?”  Yes, I know this was the long way around the block.

In the exercise above, we weren’t evaluating matches, we were just determining whether or not the child’s match also matched the parent on the same segment, but sometimes it’s not clear whether they do or do not match.

matches-child-mess

In the case of the second match with Mess on chromosome 11, above, the starting and ending locations, and the number of cM and segments are exactly the same, so it’s easy to determine that Mess matches both the child and the father on chromosome 11. All matches aren’t so straightforward.

Typical Match

matches-typical

This looks like your typical match for one person, in this case, Cecelia.  The child (white rows) matches Cecelia on three segments that don’t also match the child’s mother (pink rows.)  Those non-matching child’s rows are colored green in the match column.  The child matches Cecelia on two segments that also match the mother, on chromosome 20 and the X chromosome.  Those matching segments are boxed in black.

The segments in both of these matches have exact overlaps, meaning they start and end in exactly the same location, but that’s not always the case.

And for the record, matches that begin and/or end in the same location are NOT more likely to be legitimate matches than those that start and end in different locations.  Vendors use small buckets for matching, and if you fall into any part of the bucket, even if your match doesn’t entirely fill the bucket, the bucket is considered occupied.  So what you’re seeing are the “fuzzy” bucket boundaries.

(Over)Hanging Chad

matches-overhanging

In this case, Chad’s match overhangs on each end.  You can see that Chad’s match to the child begins at 52,722,923 before the mother’s match at 53,176,407.

At the end location, the child’s matching segment also extends beyond the mother’s, meaning the child matches Chad on a longer segment than the mother.  This means that the segment sections before 53,176,407 and after 61,495,890 are false positive matches, because Chad does not also match the child’s mother on these portions of the segment.

This segment still counts as a match though, because on the majority of the segment, Chad does match both the child and the mother.

Nested Match

matches-nested

This example shows a nested match, where the parent’s match to Randy begins before the child’s and ends after the child’s, meaning that the child’s matching DNA segment to Randy is entirely nested within the mother’s.  In other words, pieces got shaved off of both ends of this segment when the child was inheriting from her mother.
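The exact, overhanging and nested cases can be told apart from the start and end positions alone. Here is a small Python sketch of that comparison. The function name is hypothetical, and the child's end position in the Chad example is an assumed value, since the text only says it falls somewhere beyond 61,495,890. The overlap share is measured in base pairs, which is only a rough stand-in because cM cannot be calculated from positions alone.

```python
def describe_overlap(child, parent):
    """Compare two (start, end) segments on the same chromosome.

    Returns the kind of overlap and the share of the child's segment
    (in base pairs) that the parent's segment also covers.
    """
    c_start, c_end = child
    p_start, p_end = parent
    overlap = min(c_end, p_end) - max(c_start, p_start)
    if overlap <= 0:
        return "no common segment", 0.0

    share = overlap / (c_end - c_start)
    if child == parent:
        kind = "exact"
    elif p_start <= c_start and c_end <= p_end:
        kind = "nested"  # child's segment sits inside the parent's
    else:
        kind = "overhanging"  # child extends past the parent on at least one end
    return kind, share


# Chad: start positions and the mother's end are from the text;
# the child's end (62,000,000) is an assumed value for illustration.
kind, share = describe_overlap((52_722_923, 62_000_000), (53_176_407, 61_495_890))
print(kind, f"{share:.0%} of the child's segment overlaps")
```

A low overlap share on an already small segment is a reason to look at the match more skeptically.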

No Common Matches

matches-no-common

Sometimes, the child and the parent will both match the same person, but there are no common segments.  Don’t read more into this than what it is.  The child’s matches to Mary are false matches.  We have no way to judge the mother’s matches, except for segment size probability, which we’ll discuss shortly.

Look Ma, No Parents

matches-no-parents

In this case, the child matches Don on 5 segments, including a reasonably large segment on chromosome 9, but there are no matches between Don and either parent.  I went back and looked at this to be sure I hadn’t missed something.

This could, possibly, be an instance of an unseen false negative, meaning perhaps there is a read issue in the parent’s file on chromosome 9, precluding a match.  However, in this case, since Family Tree DNA does report matches down to 1cM, it would have to be an awfully large read error for that to occur.  Family Tree DNA does have quality control standards in place and each file must pass the quality threshold to be put into the matching database.  So, in this case, I doubt that the problem is a false negative.

Just because there are multiple IBC matches to Don doesn’t mean any of those are incorrect.  It’s just the way that the DNA is inherited and it’s why this type of a match is called identical by chance – the key word being chance.

Split Match

matches-split

This split match is very interesting.  If you look closely, you’ll notice that Diane matches Mom on the entire segment on chromosome 12, but the child’s match is broken into two.  However, the number of SNPs adds up to the same, and the number of cM is close.  This suggests that there is a read error in the child’s file forcing the child’s match to Diane into two pieces.

If the segments broken apart were smaller, under the match threshold, and there were no other higher matches on other segments, this match would not be shown and would fall into the False Negative category.  However, since that’s not the case, it’s a legitimate match and just falls into the “interesting” category.

The Deceptive Match

matches-surname

Don’t be fooled by seeing a family name in the match column and deciding it’s a legitimate match.  Harrold is a family surname and Mr. Harrold does not match either of the child’s parents, on any segment.  So not a legitimate match, no matter how much you want it to be!

Suspicious Match – Probably not Real

matches-suspicious

This technically is a match, because part of the DNA that Daryl matches between Mom and the child does overlap, from 111,236,840 to 113,275,838.  However, if you look at the entire match, you’ll notice that not a lot of that segment overlaps, and the number of cMs is already low in the child’s match.  There is no way to calculate the number of cMs and SNPs in the overlapping part of the segment, but suffice it to say that it’s smaller, and probably substantially smaller, than the 3.32 total match for the child.

It’s up to you whether you actually count this as a match or not.  I just hope this isn’t one of those matches you REALLY need.  However, in this case, the Mom’s match at 15.46 cM is 99% likely to be a legitimate match, so you really don’t need the child’s match at all!!!

So, Judge Judy, What’s the Verdict?

How did our parental phasing turn out?  What did we learn?  How many segments matched both the child and a parent, and how many were false matches?

In each cM Size category below, I’ve included the total number of child’s match rows found in that category, the number of parent/child matches, the percent of parent/child matches, the number of matches to the child that did NOT match the parent, and the percent of non-matches. A non-match means a false match.
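For anyone repeating this experiment, the tally for each size category can be sketched in Python as follows. The input is a list of the child's segments, each with its cM value and a flag for whether it matched a parent. The function name is hypothetical and the sample values are made up purely to show the arithmetic. They are not the results from this study.

```python
from collections import defaultdict


def phased_rate_by_bracket(child_segments):
    """child_segments: iterable of (cM, matched_parent) pairs.

    Returns {bracket: (total, matched, percent matched,
                       not matched, percent not matched)}.
    """
    counts = defaultdict(lambda: [0, 0])  # bracket -> [total rows, rows matching a parent]
    for cm, matched_parent in child_segments:
        bracket = int(cm)  # 7.00 through 7.99 cM all land in bracket 7
        counts[bracket][0] += 1
        if matched_parent:
            counts[bracket][1] += 1

    table = {}
    for bracket in sorted(counts):
        total, matched = counts[bracket]
        false_matches = total - matched
        table[bracket] = (
            total,
            matched,
            100 * matched / total,
            false_matches,
            100 * false_matches / total,
        )
    return table


# Made-up sample, for illustration only.
sample = [(3.2, False), (3.9, False), (3.5, True), (7.1, True), (7.8, False), (16.4, True)]

for bracket, row in phased_rate_by_bracket(sample).items():
    total, matched, pct, false_matches, pct_false = row
    print(f"{bracket}-{bracket}.99 cM: {total} rows, {matched} phased ({pct:.0f}%), "
          f"{false_matches} false ({pct_false:.0f}%)")
```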

So, what’s the verdict?

matches-parent-child-phased-segment-match-chart

It’s interesting to note that we just approach the 50% mark for phased matches in the 7-7.99 cM bracket.

The bracket just beneath that, 6-6.99 shows only a 30% parent/child match rate, as does 5-5.99.  At 3 cM and 4 cM few matches phase to the parents, but some do, and could potentially be useful in groups of people descended from a known common ancestor and in conjunction with larger matches on other segments. Certainly segments at 3 cM and 4 cM alone aren’t very reliable or useful, but that doesn’t mean they couldn’t potentially be used in other contexts, nor are they always wrong. The smaller the segment, the less confidence we can have based on that segment alone, at least below 9-15cM.

Above the 50% match level, we quickly reach the 90th percentile in the 9-9.99 cM bracket, and above 10 cM, we’re virtually assured of a phased match, but not quite 100% of the time.

It isn’t until we reach the 16cM category that we actually reach the 100% bracket, and there is still an outlier found in the 18-18.99 cM group.

I went back and checked all of the 10 cM and over non-matches to verify that I had not made an error.  If I made errors, they were likely counting too many as NON-matches, and not the reverse, meaning I failed to visually identify matches.  However, with almost 6000 spreadsheet rows for the child, a few errors wouldn’t affect the totals significantly or even noticeably.

I hope that other people in non-endogamous populations will do the same type of double parent phasing and report on their results in the same type of format.  This experiment took about 2 days.

Furthermore, I would love to see this same type of experiment for endogamous families as well.

Summary

If you can phase your matches to either or both of your parents, absolutely, do.  This exercise shows why, if you have only one parent to match against, you can’t just assume that anyone who doesn’t match you on your one parent’s side automatically matches you from the other parent. At least, not below about 15 cM.

Whether you can phase against your parent or not, this exercise should help you analyze your segment matches with an eye towards determining whether or not they are valid, and what different kinds of matches mean to your genealogy.

If nothing else, at least we can quantify the relative likelihood, based on the size of the matching segment, that a match in a non-endogamous population would match a parent, if we had one to match against, meaning that they are a legitimate match.  Did you get all that?

In a nutshell, we can look at the Parent/Child Phased Match Chart produced by this exercise and say that our 8.5 cM match has about a 66% chance of being a legitimate match, and our 10.5 cM match has a 95% chance of being a legitimate match.

You’re welcome.

Enjoy!!

Concepts – Undocumented Adoptions vs Untested Y Lines

So you took the Y-line test and you don’t match the surnames you expected to match and now you’re worried. Is there maybe an “oops” in your lineage?

One of two things has happened. Either your line has simply not tested or you have an undocumented adoption in your line.

An undocumented adoption is any “adoption” at any time in history that is not documented – so if you didn’t know about it, it’s an undocumented adoption. Often, these events in genetic genealogy are referred to as NPEs, Non-Paternal Events, but I prefer undocumented adoptions.

Yes, there are myriad ways for this to happen, and I mean besides the obvious infidelity situation, but right now, you only care about figuring out IF you have an undocumented adoption, not how it happened.

How can you tell if your line is one that simply hasn’t been tested or if there is an undocumented adoption in your line? Sometimes you can’t; you’ll simply have to wait until more people of your surname test. Of course, you can always recruit people through the Rootsweb and Genforum lists and boards and social media.

Most of the time this is a process of elimination. If you can’t find anything to suggest that you have an undocumented adoption, then your line is simply probably untested, especially if it’s not a common surname or your ancestors had few male children.

However, there are often clues lurking relative to undocumented adoptions.

Scenario 1 – Right Family, Non-Matching DNA

If you are part of a DNA surname project and there are other people who have tested who claim the same ancestor as you do, but whom you don’t match – you might have an undocumented adoption on your hands.

In this case, someone’s genealogy is wrong, yours or theirs. By wrong, that doesn’t mean you made a mistake. You (or they) may have tracked the line back to the right ancestor, but instead of being the child of a son of John Doe, for example, your ancestor was the child of the daughter of John Doe, who wasn’t married at the time and had a child by a Smith, but gave the child her surname, Doe.

undoc-1

So right Doe family, wrong child giving birth. There are also other family situations that are discovered utilizing Y DNA testing, like a child simply using the step-father’s name. In this case, finding more descendants to test, especially through other sons will help resolve the paternity question. Given the scenario above, we really don’t know whether the green or red DNA is the Y DNA of John Doe. We need the DNA of another son to resolve the question.

Scenario 2 – Accurate Genealogy, Undocumented Adoption

If you are part of a DNA surname project and two other people who descend from two separate sons of the same ancestor you claim, both having good solid genealogy back to that ancestor, match each other but not you – you do have an undocumented adoption on your hands. This situation pretty much removes any doubt about your ancestral line if you are Steve, below.

undoc-2

Assuming their genealogy is correct (and yes, the genealogy could be wrong), theirs (the green) is the paternal line from that ancestor, so you need to start looking at situations that might lend themselves to your ancestor having that name but not sharing that paternal genetic line.

The break in the ancestral line can have occurred anyplace between John Doe and son Steve and the tester, Steve V.  You might want to test males descended from men between Steve Doe and Steve Doe V.  Word of warning here – if you don’t want to know the answer, don’t test.  The break could be between you and your father or your father and grandfather.  Sometimes, these possibilities are just too close for comfort.

At this point, I would turn to autosomal testing to see if any of the people in the surname project match you autosomally. That may tell you if you are actually descended from this line at all – perhaps through a female child as described above. With autosomal testing, especially of distant relatives, you can prove a positive, that you are related, but you can’t really prove a negative, that you aren’t related.

If you’re testing second cousins or closer, you can prove a negative.  If you don’t match your full second cousins, there is a problem – and it’s not the genealogy.

Scenario 3 – Matching a Group of Men with a Particular Surname

If you match a significant number of men with other surnames, with one surname in particular being closely matched and quite prevalent, it’s a large hint. For example, let’s say you have 6 matches at your highest marker level, and 5 of them are Miller men descended from the same ancestor. Chances are very good that you are of Miller descent too.

Again, I’d turn to autosomal testing at this point to see how closely you are related to your closest matching Y DNA Millers or others descended from this same ancestral line.

undoc-3

Scenario 4 – Your Line is Untested

If your surname is something quite unusual, like Ferverda for example, and you don’t fit the situations described above, then it’s likely that your line simply hasn’t tested yet. In this case, the grandfather of our tester was the immigrant from the Netherlands, and Ferverda, both there and in the US, is a very unusual name.

undoc-4

Of course, your line having not tested can happen with common surnames too.

Utilizing Y Search

Check www.ysearch.org periodically to see if others of your surname took the Y chromosome test elsewhere and just got around to entering the results into YSearch, even though the other testers (Ancestry, Sorenson) have been defunct for some time now relative to Y DNA.

undoc-5

You can also search at YSearch by surname. You don’t have any way to view results by surname, outside of projects, at Family Tree DNA, so the only way to discover someone who claims your paternal line and doesn’t match you is to search by surname at YSearch and hope they have included a tree.

undoc-6

In this example, one person with the Estes surname has results at YSearch, but 40 have Estes in their tree, just not as their patrilineal surname.

undoc-7

Keep in mind that depending on how far back in time an undocumented adoption occurred, you may find matches to people with that same surname who descend from your common biological ancestor, but you may still not share the original ancestor. In the example above, the Doe men in red all match each other, because their unknown Smith ancestor is the same, but they don’t match the descendant of John Doe through son James.

A non-match to men of your same surname isn’t a cause for panic, but it is time to do some additional digging to see if you can discover why.

Happy ancestor hunting!

Concepts – Genetic Distance

At Family Tree DNA, your Y DNA and full sequence mitochondrial matches display a column titled Genetic Distance.  One of the most common questions I receive is how to interpret genetic distance.

GD example 2

Many people mistakenly assume that genetic distance is the number of generations to a common ancestor, but that is NOT AT ALL what genetic distance means.

Genetic distance is the number of mutational differences between the participant (you) and that particular match. In other words, it is how many mismatches there are in your DNA compared with that person’s DNA.

While the concept is the same, Y DNA and mitochondrial DNA Genetic Distance function a little differently, so let’s look at them separately.

Y DNA Genetic Distance

I wrote about genetic distance as part of a larger article titled “Concepts – Y DNA Matching and Connecting with your Paternal Ancestor,” but I’m going to excerpt the genetic distance portion of that article here.

You’ll notice on the Y DNA matches page that the first column says “Genetic Distance.”

STR genetic distance

Looking at the example above, if this is your personal page, then you mismatch with Howard once, and Sam twice, etc.

Counting Genetic Distance

Genetic distance for Y DNA can be counted in different ways, and Family Tree DNA utilizes a combination of two scientific methods to provide the most accurate results. Let’s look at an example.

In the methodology known as the Step-Wise Mutation Model, each difference is counted as 1 step, because the mutation that caused the difference happened in one mutation event.

STR genetic distance calc

So, if marker 393 has mutated from 12 to 13, the difference is 1, so there is one difference and if that is the only mutation between these two men, the total genetic distance would be 1.

However, if marker 390 mutated from 24 to 26, the difference is 2, because those mutations most likely occurred in two different steps – in other words marker 390 had a mutation two different times, perhaps once in each man’s line.  Therefore, the total genetic distance for these two men, combining both markers and with all of their other markers matching, would be 3.
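The step-wise counting just described is simple enough to sketch in Python. The marker values come from the example in the text, while the function name and the dictionary layout are hypothetical choices for illustration.

```python
def stepwise_distance(person1, person2):
    """Sum the absolute difference at each marker (step-wise model)."""
    return sum(abs(person1[marker] - person2[marker]) for marker in person1)


# Marker values from the example in the text; all other markers match.
person1 = {"393": 12, "390": 24}
person2 = {"393": 13, "390": 26}

print(stepwise_distance(person1, person2))  # 1 + 2 = 3
```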

Easy – right?  You know this is too easy!

Some markers don’t play nice and tend to mutate more than one step at a time, sometimes creating additional marker locations as well.  They’re kind of like a copy machine on steroids. These are known as multi-copy (or palindromic) markers and have more than one value listed for each marker.  In fact, marker 464 typically has 4 different values shown, but can have several more.

The multiple mutations shown for those types of multi-copy markers tend to occur in one step, so they are counted as one event for that marker as a whole, no matter how much math difference is found between the values. This calculation method is called the Infinite Alleles Mutation Model.

str genetic distance calc 2 v2

Because marker 464 is calculated using the infinite alleles model, even though there are two differences, the calculation only notes that there IS a difference, and counts that difference as having occurred in one step, counting only as 1 in genetic distance.

However, if one man also has one or more extra copies of the marker, shown below as 464e and 464f, that is counted as one additional genetic distance step, regardless of the number of additional copies of the marker, and regardless of the values of those copies.

STR genetic distance calc 3 v2

With markers 464e and 464f, which person 2 carries and person 1 does not, the difference is 17 and the generational difference is 1, for each marker, but since the copy event likely happened at one time, it’s considered a mutational difference or genetic distance of only 1, not 34 or 2. Therefore, in our example, the total genetic distance for these men is now 5, not 8 or 38.

In our last example, a deletion has occurred, which sometimes happens at marker location 425. When a deletion occurs, all of the DNA at that location is permanently deleted, or omitted, between father and son, and the value is 0.  Once gone, that DNA has no avenue to ever return, so forever more, the descendants of that man show a value of zero at marker 425.

STR genetic distance calc 4 v2

In this deletion example, even though the mathematical difference is 12, the event happened at once, so the genetic distance for a deletion is counted as 1. The total genetic distance for these two men now is 6.
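To pull the whole running example together, here is a minimal Python sketch of the combined counting rules. It is my own simplification, not the vendor’s algorithm. The values for 393, 390, 425 and the extra copies 464e and 464f follow the text, but the four base values for marker 464, which man carries the deletion, and the function names are assumptions made for illustration.

```python
def marker_distance(value_1, value_2):
    """Return the genetic distance contributed by a single marker.

    Single-copy markers are ints. Multi-copy markers (such as 464) are
    tuples holding one value per copy.
    """
    if isinstance(value_1, tuple) and isinstance(value_2, tuple):
        # Infinite alleles: any value change counts once for the whole marker,
        # and any number of extra copies counts as one additional step.
        shared = min(len(value_1), len(value_2))
        values_differ = value_1[:shared] != value_2[:shared]
        copies_differ = len(value_1) != len(value_2)
        return int(values_differ) + int(copies_differ)
    if value_1 == value_2:
        return 0
    if value_1 == 0 or value_2 == 0:
        # A deletion (value 0) is a single event, whatever the numeric gap.
        return 1
    # Step-wise: each repeat of difference is one step.
    return abs(value_1 - value_2)


def total_genetic_distance(person_1, person_2):
    """Add up the per-marker distances for markers present in person_1."""
    return sum(marker_distance(person_1[m], person_2[m]) for m in person_1)


# The 464 base values below are hypothetical; two of the four copies differ.
person_1 = {"393": 12, "390": 24, "464": (15, 15, 17, 17), "425": 12}
person_2 = {"393": 13, "390": 26, "464": (15, 16, 17, 18, 17, 17), "425": 0}

print(total_genetic_distance(person_1, person_2))  # 1 + 2 + (1 + 1) + 1
```

The sketch compares multi-copy values position by position, which is a simplification. The point is only that each kind of event is counted once, which is how the example arrives at a total of 6.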

In essence, the Total Genetic Distance is a mathematical calculation of how many times mutations happened between the lines of these two men since their common ancestor, whether that common ancestor is known or not.

Family Tree DNA provides the TIP calculator, which helps estimate the time to a common ancestor using a proprietary algorithm that includes individual marker mutation rates.  You can read more about this in the Y DNA Concepts article or in the TIP article.

Please note that on July 26, 2016 Family Tree DNA introduced changes in how the genetic distance is calculated for some markers to be less restrictive.  You can read about the changes here.

Mitochondrial DNA

GD mt example

Mitochondrial DNA Genetic Distance is a bit different. In order to be shown as a match, you must be an exact match in the HVR1 and HVR2 regions, so there is no genetic distance shown, because there are no mutations allowed.

At the full sequence level, you are allowed 4 or fewer mismatches to be considered a match.

Genetic distance means how many mismatches you have to another person when comparing your 16,569 mitochondrial locations to theirs. The full sequence test tests all of those locations.
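The matching rules just described can be written as a small Python check. This is a sketch of the thresholds as stated in this article, not a copy of any vendor logic, and the function name and level labels are hypothetical.

```python
# Threshold taken from the text above: 4 or fewer mismatches at full sequence.
FULL_SEQUENCE_MAX_MISMATCHES = 4


def is_shown_as_match(level, mismatches):
    """Return True if a comparison would appear on the match list at this level."""
    if level in ("HVR1", "HVR1+HVR2"):
        # No mutations are allowed at the HVR levels, so only exact matches show.
        return mismatches == 0
    if level == "full sequence":
        return mismatches <= FULL_SEQUENCE_MAX_MISMATCHES
    raise ValueError(f"unknown test level: {level}")


print(is_shown_as_match("HVR1", 1))           # False
print(is_shown_as_match("full sequence", 2))  # True
```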

Of course, in general, fewer mismatches mean you are more closely related to that person than to someone with whom you have more mismatches. I say “in general” because I have seen a situation where a mutation occurred between mother and child, meaning that individual had a genetic distance of 1 when compared to their mother, and to anyone who matched their mother exactly. Clearly, they are far more closely related to their mother than to their mother’s matches.

One of the most common questions I receive about genetic distance is how to convert genetic distance to time – meaning how long ago am I related to someone who has a genetic distance of 1 or 2, for example.

The answer is that it depends and it varies widely, very widely.  I know, I hate the “it depends” answer too.

Turning to the Family Tree DNA Learning Center, we find the following information:

    • Matching on HVR1 means that you have a 50% chance of sharing a common maternal ancestor within the last fifty-two generations. That is about 1,300 years.
    • Matching on HVR1 and HVR2 means that you have a 50% chance of sharing a common maternal ancestor within the last twenty-eight generations. That is about 700 years.
    • Matching exactly on the Mitochondrial DNA Full Sequence test brings your matches into more recent times. It means that you have a 50% chance of sharing a common maternal ancestor within the last 5 generations. That is about 125 years.
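All three of those estimates use the same conversion of 25 years per generation (1,300 divided by 52, for example). A quick Python sketch makes the arithmetic explicit. The 25-year figure is inferred from the quoted numbers, and the names below are my own.

```python
# Inferred from the quoted figures: 1,300 years / 52 generations = 25.
YEARS_PER_GENERATION = 25

# Generations within which there is a 50% chance of a common maternal ancestor.
generations_at_50_percent = {
    "HVR1": 52,
    "HVR1 + HVR2": 28,
    "Full sequence (exact)": 5,
}


def generations_to_years(generations, years_per_generation=YEARS_PER_GENERATION):
    """Convert a number of generations into an approximate number of years."""
    return generations * years_per_generation


for level, generations in generations_at_50_percent.items():
    print(f"{level}: {generations} generations, about {generations_to_years(generations)} years")
```

Changing the assumed generation length shifts every estimate proportionally, which is one reason these figures should be read as rough guides.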

I think the full sequence estimate is overly generous. I seldom find identifiable matches, and I do have my genealogy back more than 5 generations on my mitochondrial line and so do many of my clients.

My 4 times great-grandmother, or 6 generations distant from me (counting my mother as generation 1), Elisabetha Mehlheimer, was found living in Goppmansbuhl, Germany when she gave birth to her daughter in 1823. This puts Elisabetha’s birth around 1800, or possibly earlier, very probably in the same village in Germany.  German church records compulsively identify people who aren’t residents, and even residents who originally came from another location.

Part of my mitochondrial full sequence matches are shown below.

GD my results

Looking at my 13 exact matches, it becomes obvious very quickly that my matches aren’t from Germany, they are primarily from Scandinavia. Not at all what I expected. I created this chart to view the match locations. I have omitted anyone who did not provide either location or oldest ancestor information. Fortunately, Scandinavians are very good about participating fully in DNA testing and by and large, they want to get the most out of their results. The way to do that, of course is to include as much information as possible so that we can all benefit by sharing and collaboration.

Match        Genetic Distance   Location                  Birth Year of Most Distant Ancestor
TS           0                  Norway                    1758
Svein        0                  Norway                    1725
Bo-Lennart   0                  Norway                    1725
Per          0                  Norway                    1718
Hakan        0                  Sweden                    1716
Ragnhild     0                  Sweden                    1857
Constance    0                  Russia
Teresa       0                  Poland                    1750
Valerie      0                  Norway                    1763
Vladimir     0                  Russia
Rose         0                  Sweden                    1845
IRL          0                  Norway                    1702
Lynn         0                  Norway                    1696
Anastasia    1                  Russia above Georgia      1923
AJ           1                  Sweden                    1771
Marianne     1                  Sweden                    1661
Inga         1                  Sweden                    1691
Inger        1                  Sweden
Marianne     1                  Sweden                    1661
Maria        1                  Poland                    c. 1880
Marie M.     1                  Bavaria, Germany          1836
Tomas        2                  Probably Czech Republic   1880
DL           2                  Sweden                    1827
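If you keep your own matches in a spreadsheet, a tally by genetic distance and location makes the clustering easy to see. Here is a minimal Python sketch using the rows from my chart above. The variable names are my own, and this is just one way to do the counting.

```python
from collections import Counter, defaultdict

# (match, genetic distance, location) rows taken from the chart above.
matches = [
    ("TS", 0, "Norway"), ("Svein", 0, "Norway"), ("Bo-Lennart", 0, "Norway"),
    ("Per", 0, "Norway"), ("Hakan", 0, "Sweden"), ("Ragnhild", 0, "Sweden"),
    ("Constance", 0, "Russia"), ("Teresa", 0, "Poland"), ("Valerie", 0, "Norway"),
    ("Vladimir", 0, "Russia"), ("Rose", 0, "Sweden"), ("IRL", 0, "Norway"),
    ("Lynn", 0, "Norway"), ("Anastasia", 1, "Russia above Georgia"),
    ("AJ", 1, "Sweden"), ("Marianne", 1, "Sweden"), ("Inga", 1, "Sweden"),
    ("Inger", 1, "Sweden"), ("Marianne", 1, "Sweden"), ("Maria", 1, "Poland"),
    ("Marie M.", 1, "Bavaria, Germany"), ("Tomas", 2, "Probably Czech Republic"),
    ("DL", 2, "Sweden"),
]

# Count locations separately for each genetic distance.
locations_by_distance = defaultdict(Counter)
for _name, distance, location in matches:
    locations_by_distance[distance][location] += 1

for distance in sorted(locations_by_distance):
    print(distance, dict(locations_by_distance[distance]))

# How many of the exact matches trace to Norway or Sweden?
exact = locations_by_distance[0]
scandinavian_exact = exact["Norway"] + exact["Sweden"]
print(scandinavian_exact, "of", sum(exact.values()), "exact matches are Scandinavian")
```

Ten of the 13 exact matches fall in Norway or Sweden, which is the pattern that caught my attention.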
DL 2 Sweden 1827

A quick look at my matches map shows the distribution of my matches more visually, although not everyone includes their matrilineal ancestor’s geographic information, so they don’t have pins on the map. In my case, I’m lucky because several people have included geographical information which makes the maps very useful. The white pin is where Elisabetha Mehlheimer lived.  Red pins are exact matches, orange are one mutation difference and yellow are two.

GD matches map

I am very clearly not related to these individuals within 6 generations, and probably not for several more generations back in time. The one match from Germany is one mutation different, which certainly could mean that we share a common ancestor and her line had a mutation while my line didn’t. Wurttemberg and Bavaria do share borders and are neighboring districts in southern Germany, as illustrated by this 1855 map of Bavaria and Wurttemberg.

GD Bavaria Wurttemberg

Unfortunately, there is no “rule of thumb” for mitochondrial DNA genetic distance relative to years and generations distant. In other words, there is no TIP calculator for mtDNA. I did some research some years ago attempting to quantify MRCA (most recent common ancestor) time and answer this very question, but the only research papers I was able to find referred to studies on penguins.

How Far is Far?

In some cases, I know that a common ancestor actually reached back hundreds to thousands of years. Of course, relationships in female lines are more difficult to “see” since the surname changes with every generation, historically. In Y DNA, you can look at the surname of the participant and determine immediately if there is a likelihood that you share a common paternal ancestor if the surname matches. Let’s look at some mitochondrial examples.

I recently had a client who matched her haplogroup assignment exactly, with no additional unusual mutations found as compared to the expected mitochondrial mutation profile. She had several exact matches. Her haplogroup? H7a2, which was formed about 2500 years ago, with a standard deviation of 2609, according to the supplemental data from the paper, “A “Copernican” Reassessment of the Human Mitochondrial DNA Tree from its Root” by Doron Behar, et al, published in The American Journal of Human Genetics, Volume 90, April 6, 2012. This means that H7a2 could have been formed anytime from recently to about 5000 years ago, with 2500 being the most likely and best fit.

Standard deviation, in this case, means the dates could be off that much in either direction, but the further from 2500, the less likely it is to be accurate.

Conversely, another recent client was haplogroup U2b formed roughly 30,000 years ago, with a standard deviation of 5,800 years. The client had 16 differences, which averages to about one mutation every 2,000 years. Is that what actually happened or did those mutations happen in fits and starts? We don’t know.

A last example is my own DNA with two relevant differences from my haplogroup profile, J1c2f, which was formed about 2,000 years ago with a standard deviation of 3,100 years. Technically, this means my haplogroup might not be formed yet (joke) since 2,000 years ago minus 3,100 years hasn’t happened yet. While that obviously can’t be true, the standard deviation is relevant in the other direction. In essence, what this says is that my haplogroup could be fairly young, probably is about 2000 years old, and could be as old as 5,100 years. Given the clustering, it’s likely that J1c2f was formed in Scandinavia and a few descendants, at some time, migrated into continental Europe and Russia.
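The ranges in these three haplogroup examples are simple arithmetic: the estimated age plus or minus one standard deviation, with the young end cut off at the present. A short Python sketch follows, using the ages and standard deviations quoted above. The function name and the cutoff at zero are my own framing of the “can’t be in the future” point.

```python
def plausible_age_range(age_years, std_dev_years):
    """Return (youngest, oldest) as age minus and plus one standard deviation.

    The young end is floored at 0, since a haplogroup cannot form in the future.
    """
    youngest = max(0, age_years - std_dev_years)
    oldest = age_years + std_dev_years
    return youngest, oldest


# (estimated age in years, standard deviation in years), as quoted in the text.
haplogroups = {
    "H7a2": (2500, 2609),
    "U2b": (30000, 5800),
    "J1c2f": (2000, 3100),
}

for name, (age, std_dev) in haplogroups.items():
    print(name, plausible_age_range(age, std_dev))

# Average spacing of the U2b client's 16 differences over roughly 30,000 years.
years_per_mutation = 30000 / 16
print(years_per_mutation)  # 1875.0, which rounds to about 2,000 years
```

Keep in mind that an average spacing says nothing about when each mutation actually occurred.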

GD extra mutations

By the way, the 315 “extra mutations” insertions are too unstable to be considered relevant. They are not included in the genetic distance count in your results.

At the other end of the spectrum, I know of one person who has a mutation between themselves and an aunt and a different mutation when compared with a sister.  Furthermore, those mutations occurred in the HVR1 and HVR2 regions, meaning that these women don’t show as matches to each other until you get to the coding region where the full range of full sequence matches are shown and 4 mutations are allowed.  This caused a bit of panic initially, but was perfectly legitimate and understandable once the actual results were compared. Is this rare? Absolutely. Is it possible? Absolutely.

As you can see, there just isn’t any good measure for mitochondrial DNA mutation timing.  Mutations don’t happen on any time schedule, unfortunately.

I use genetic distance as a gauge for relative relatedness, no pun intended, and I keep in mind that I might actually be more closely related to someone with a slightly further genetic distance than an exact match.

While you can’t compare your actual results to matches online, you can contact your matches to compare actual results.  In my case, I developed a branching tree mutation chart that showed that a group of the people in Sweden with one mutation difference actually all shared an additional mutation that I, and my exact matches, don’t have.  In other words, this Swedish group forms a new branch of the tree and will likely, someday, be a new subhaplogroup of J1c2f.
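The idea behind a branching mutation chart can be sketched in Python: list each match’s differences from your own sequence, then group the matches by the mutations they share. Everything in this example is hypothetical, including the match labels and the placeholder mutation names, and it is only meant to show the grouping step.

```python
from collections import defaultdict

# Hypothetical data: each match's mutations relative to my own full sequence.
# "mutation_A" and "mutation_B" are placeholders, not real mitochondrial locations.
differences_from_me = {
    "Exact match 1": set(),
    "Exact match 2": set(),
    "Swedish match 1": {"mutation_A"},
    "Swedish match 2": {"mutation_A"},
    "Swedish match 3": {"mutation_A"},
    "German match": {"mutation_B"},
}

# Group matches under each mutation they carry.
carriers = defaultdict(list)
for match, mutations in differences_from_me.items():
    for mutation in mutations:
        carriers[mutation].append(match)

# A mutation shared by two or more matches suggests a possible new branch.
candidate_branches = {
    mutation: names for mutation, names in carriers.items() if len(names) >= 2
}
print(candidate_branches)
```

A mutation carried by only one match, like the placeholder mutation_B here, doesn’t define a branch on its own, but it may if more testers turn up with it.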

Sometimes digging a little deeper reveals fascinating patterns that aren’t initially evident.

Summary

When working with genetic distance, look for patterns, not only in terms of geography, but in terms of matching mutations and grouping of individuals.  Sometimes the combination of mutation patterns and geography can reveal information that could not be obtained any other way – and may lead you to your common ancestor, with or without a name.

For example, I know that my common ancestor with these people probably lived someplace in Scandinavia about 2000 years ago, based upon both the clustering and the branching.  How my ancestor got to Germany is still a mystery, but one that might potentially be solved by looking at the history of the region where my known ancestor is found in 1800.

Happy hunting!