Concepts – Percentage of Ancestors’ DNA

A very common question is, “How much DNA of an ancestor do I carry and how does that affect my ethnicity results?”

This question is particularly relevant for people who are seeking evidence of a particular ethnicity of an ancestor several generations back in time. I see this issue raise its head consistently when people take an ethnicity test and expect that their “full blood” Native American great-great-grandmother will show up in their results.

Let’s take a look at how DNA inheritance works, and why people might or might not find the Native DNA they seek, assuming that great-great-grandma actually was Native.

Inheritance

Every child inherits exactly 50% of their autosomal DNA from each parent (except for the X chromosome in males). However, and this is a really important however, the child does NOT inherit exactly half of the DNA of each ancestor who lived before the parents. How can this be, you ask?

Let’s step through this logically.

The number of ancestors you have doubles in each generation, going back in time.

This chart provides a summary of how many ancestors you have in each generation, an approximate year they were born using a 25 year generation and a 30 year generation, respectively, and how much of their DNA, on average, you could expect to carry, today. You’ll notice that by the time you’re in the 7th generation, you can be expected, on average, to carry 0.78% meaning less than 1% of that GGGGG-grandparent’s DNA.

Looking at the chart, you can see that you reach the 1% level at about the 6th generation with an ancestor probably born in the late 1700s or early 1800s.
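For readers who like to see the arithmetic, here is a minimal sketch that rebuilds the kind of summary described above. The function names and the base birth year of 1960 are my own assumptions for illustration, not values taken from the chart, and generation 1 is counted as your parents.

```python
def ancestors_in_generation(generation):
    """Number of ancestors in a generation (1 = parents, 2 = grandparents)."""
    return 2 ** generation


def expected_percent(generation):
    """Average percent of your autosomal DNA from ONE ancestor in that generation."""
    return 100 / 2 ** generation


def approximate_birth_year(base_year, generation, years_per_generation):
    """Rough birth year of ancestors, counting back from the tester's birth year."""
    return base_year - generation * years_per_generation


BASE_YEAR = 1960  # hypothetical birth year of the tester

print("Gen  Ancestors  Born (25 yr)  Born (30 yr)  Expected %")
for gen in range(1, 11):
    print(f"{gen:>3}  {ancestors_in_generation(gen):>9}  "
          f"{approximate_birth_year(BASE_YEAR, gen, 25):>12}  "
          f"{approximate_birth_year(BASE_YEAR, gen, 30):>12}  "
          f"{expected_percent(gen):>10.2f}")
```

Changing BASE_YEAR shifts the birth-year columns, but the ancestor counts and the average percentages stay the same.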

It’s also worth noting here that generations can be counted differently. In some instances, you are counted as generation one, so your GGGGG-grandparent would be generation 8.

In general, DNA showing ethnicity below about 5% is viewed as somewhat questionable and below 2% is often considered to be “noise.” Clearly, that isn’t always the case, especially if you are dealing with continental level breakdowns, as opposed to within Europe, for example. Intra-continental (regional) ethnicity breakdowns are particularly difficult and unreliable, but continental level differences are easier to discern and are considered to be more reliable, comparatively.

If you want to learn more about how ethnicity calculations are derived and what they mean, please read the article Ethnicity Testing – A Conundrum.

On Average May Not Mean You

On average, each child receives half of the DNA of each ancestor from their parent.

The words “on average” are crucial to this discussion, because the average assumes that each person in the line between your GGGGG-grandmother and you inherited exactly half of the GGGGG-grandmother’s DNA that their parent carried.

Unfortunately, while averages are all that we have to work with, that’s not always how ancestral DNA is passed in each generation.

Let’s say that your GGGGG-grandmother was indeed full Native, meaning no admixture at all.


Using the chart above, you can see that your GGGGG-grandmother was full Native on all 20 “pieces” or segments of DNA used for this illustration. Those segments are colored red.

Let’s say she married a person who was not Native, and in every generation since, there were no additional Native ancestors.

Her child, generation 6, inherited exactly 50% of her DNA, shown in red, meaning 10 segments. The child’s other 10 segments, with no color, were contributed by the father.

Generation 5, her grandchild, inherited exactly half of her DNA that was carried by the parent, shown in red, meaning 5 segments.

However, in the next generation, generation 4, that child inherited more than half of the Native DNA from their parent. They inherited half of their parent’s DNA, but the half that was randomly received included 3 Native segments out of a possible 5 Native segments that the parent carried.

In generation 3, that child inherited 2 of the possible 3 segments that their parent carried.

In generation 2, that person inherited all of the Native DNA that their parent carried.

In generation 1, your parent inherited half of the DNA that their parent carried, meaning one of 2 segments of Native DNA carried by your grandparent.

And you will either receive all of that one segment, part of that one segment, or none of that one segment.

In the case of our example, you did not inherit that segment, which is why you show no Native admixture, even though your GGGGG-grandmother was indeed fully Native.

Of course, even if you had inherited that Native segment, and that segment isn’t something the population reference models recognize as “Native,” you still won’t show as carrying any Native at all. It could also be that if you had inherited the red segment, it would have been too small and been interpreted as noise.

The “Received” column at the right shows how much of the ancestral DNA the current generation received from their parent.

The “% of Original” column shows how the percentage of GGGGG-grandmother’s DNA is reduced in each generation.

The “Expected” column shows how much DNA, “on average” we would expect to see in each generation, as compared to the “% of Original” which is how much they actually carry.
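The three columns can be recomputed from the segment counts in the walk-through above. This is only a sketch: the function name chart_rows is mine, and labeling you as generation 0 is an assumption, since generations can be counted differently.

```python
# Native segments carried in each generation of the first example,
# from GGGGG-grandmother (generation 7) down to you (labeled 0 here).
native_segments = [20, 10, 5, 3, 2, 2, 1, 0]
TOTAL_SEGMENTS = 20


def chart_rows(counts, total):
    """Build one row per generation with the three chart columns."""
    rows = []
    for step, count in enumerate(counts):
        parent_count = counts[step - 1] if step > 0 else None
        # Share of the parent's Native segments that was passed down.
        received = None if parent_count in (None, 0) else count / parent_count
        rows.append({
            "generation": len(counts) - 1 - step,
            "segments": count,
            "received": received,
            "pct_of_original": 100 * count / total,  # what is actually carried
            "expected_pct": 100 / 2 ** step,         # the "on average" value
        })
    return rows


print("Gen  Segments  Received  % of Original  Expected")
for row in chart_rows(native_segments, TOTAL_SEGMENTS):
    received = "n/a" if row["received"] is None else f"{row['received']:.0%}"
    print(f"{row['generation']:>3}  {row['segments']:>8}  {received:>8}  "
          f"{row['pct_of_original']:>12.2f}%  {row['expected_pct']:>7.2f}%")
```

Comparing the last two columns row by row shows where this particular line ran above or below the average.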

I intentionally made the chart, above, reflect a scenario close to what we could expect, on average. However, it’s certainly within the realm of possibility to see something like the following scenario, as well.

In the second example, above, neither you nor your parent or grandparent inherited any of the Native segments.

It’s also possible to see a third example, below, where 4 generations in a row, including you, inherited the full amount of Native DNA segments carried by the GG-grandparent.
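To get a feel for how widely individual lines can differ, the inheritance in these examples can be simulated. The sketch below rests on simplifying assumptions of my own: every person has 20 segments, a child receives exactly half of a parent's segments chosen at random, and segments are passed whole. Real DNA is also divided by recombination, so treat this as an illustration of the idea only.

```python
import random

TOTAL_SEGMENTS = 20  # segments per person, as in the illustration


def pass_to_child(native_count, rng, total=TOTAL_SEGMENTS):
    """Return how many Native segments a child receives from one parent.

    Assumption: the child receives exactly half of the parent's segments,
    chosen at random, and each segment is passed whole.
    """
    parent = [True] * native_count + [False] * (total - native_count)
    received_half = rng.sample(parent, total // 2)
    return sum(received_half)


def simulate_line(generations=7, rng=None):
    """Native segment counts from the fully Native ancestor down to you."""
    if rng is None:
        rng = random.Random()
    counts = [TOTAL_SEGMENTS]
    for _ in range(generations):
        counts.append(pass_to_child(counts[-1], rng))
    return counts


rng = random.Random(2017)  # fixed seed only so that runs are repeatable
for _ in range(3):
    print(simulate_line(rng=rng))

trials = 10_000
none_left = sum(simulate_line(rng=rng)[-1] == 0 for _ in range(trials))
print(f"Lines ending with no Native segments: {none_left / trials:.0%}")
```

Each printed list is one possible line of descent, and running it several times with different seeds shows lines that lose the Native segments early as well as lines that hold on to them.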

Testing Other Relatives

Every child of every couple inherits different DNA from their parents. The 50% of their parents’ DNA that they inherit is not all the same. The three example charts above could easily represent three children of the GG-Grandparent and their descendants.

The pedigree chart below shows the three different examples, above.  The great-great-grandparent in the 4th generation who inherited 3 Native DNA segments is shown first, then the inheritance of the Native segments through all 3 children to the current generation.

Therefore, you may not have inherited the red segment of GGGGG-grandmother’s Native DNA, but your sibling might, or vice versa. As you can see in the chart above, one of your third cousins received 3 Native segments from GGGGG-grandmother, but your other third cousin received none.

You can see why people are always encouraged to test their parents and grandparents as well as siblings. You never know where your ancestor’s DNA will turn up, and each person will carry a different amount, and different segments of DNA from your common ancestors.

In other words, your great-aunt and great-uncle’s DNA is every bit as important to you as your own grandparent’s DNA – so test everyone in older generations while you can, and their children if they are no longer available.

Back to Great-Great-Grandma

Let’s go back to great-great-grandma and her Native heritage. You may not show Native ethnicity when you expected to see Native, but you may have other resources and recourses. Don’t give up!

Each reason below is followed by resources and comments:

  • Reason: She really wasn’t Native.
    Resources and comments: Genealogical research will help and mitochondrial DNA testing of an appropriate descendant will point the way to her true ethnic heritage, at least on her mother’s side.
  • Reason: She was Native, but the ethnicity test doesn’t show that I am.
    Resources and comments: Test relatives and find someone descended from her through all females to take a mitochondrial test. The mitochondrial test will answer the question for her matrilineal line unquestionably.
  • Reason: She was partly, but not fully Native.
    Resources and comments: This would mean that she had less Native DNA than you thought, which would mean the percentage coming to you is lower on average than anticipated. Mitochondrial DNA testing someone descended from her through all females to the current generation, which can be male, would reveal whether her mother was Native from her mother’s line.
  • Reason: She was Native, but several generations back in time.
    Resources and comments: You or your siblings may show small percentages of Native or other locations considered to be a component of Native admixture in the absence of any other logical explanation for their presence, such as Siberian or Eastern Asian.

Using Y and Mitochondrial DNA Testing to Supplement Ethnicity Testing

When in doubt about ethnicity results, find an appropriately descended person to take a Y DNA test (males only, for direct paternal lineage) or a mitochondrial DNA test, for direct matrilineal results. These tests will yield haplogroup information and haplogroups are associated with specific world regions and ethnicities, providing a more definitive answer regarding the heritage of that specific line.

Y DNA reflects the direct male line, shown in blue above, and mitochondrial DNA reflects the direct matrilineal line, shown in red. Only males carry Y DNA, but both genders carry mitochondrial DNA.

For a short article about the different kinds of DNA and how they can help genealogists, please read 4 Kinds of DNA for Genetic Genealogy.

Ethnicity testing is available from any of the 3 major vendors, meaning Family Tree DNA, Ancestry or 23andMe. Base haplogroups are provided with 23andMe results, but detailed testing for Y and mitochondrial DNA is only available from Family Tree DNA.

To read about the difference between the two types of testing utilized for deriving haplogroups between 23andMe and Family Tree DNA, please read Haplogroup Comparisons between Family Tree DNA and 23andMe.

For more information on haplogroups, please read What is a Haplogroup?

For a discussion about testing family members, please read Concepts – Why DNA Testing the Oldest Family Members is Critically Important.

If you’d like to read a more detailed explanation of how inheritance works, please read Concepts – How Your Autosomal DNA Identifies Your Ancestors.

DNA Day Sale Starts Today!

Everyone anticipates this sale every year. The DNA Day Sale begins sometime today, April 20, and runs for a week, according to the information provided by Family Tree DNA, below.

National DNA Day, April 25, celebrates the discovery of the double helix structure of DNA in 1953, as well as the completion of the Human Genome Project in 2003. And wherever DNA is being celebrated, you’ll find genetic genealogists eagerly anticipating a sale on DNA tests.

The FTDNA DNA Day 2017 sale begins April 20! The sale ends at 11:59 PM Central Time on Thursday, April 27. Please note that items ordered on invoice during the sale must be paid by the end of the sale.

You’ll note that Y-DNA and mtDNA upgrades are not included. You will receive the sale price if you add a new product, listed above, to an existing kit, but going from Y37 to 67 or mtPlus to mtFull Sequence (FMS) will not be discounted.

Click here to take advantage of these prices and order or upgrade.

Autosomal DNA Transfers – Which Companies Accept Which Tests?

Somehow, I missed the announcement that Family Tree DNA now accepts uploads from MyHeritage.

Update – Shortly after the publication of this article, I was notified that the MyHeritage download has been disabled and they are working on the issue which is expected to be resolved shortly.  Family Tree DNA is ready when the MyHeritage downloads are once again functional.

Other people may have missed a few announcements too, or don’t understand the options, so I’ve created a quick and easy reference that shows which testing vendors’ files can be uploaded to which other vendors.

Why Transfer?

Just so that everyone is on the same page, if you test your autosomal DNA at one vendor, Vendor A, some other vendors allow you to download your raw data file from Vendor A and transfer your results to their company, Vendor B.  The transfer to Vendor B is either free or lower cost than testing from scratch.  One site, GedMatch, is not a testing vendor, but is a contribution/subscription comparison site.

Vendor B then processes your DNA file that you imported from Vendor A, and your results are then included in the database of Vendor B, which means that you can obtain your matches to other people in Vendor B’s database who tested there originally and others who have also transferred.  You can also avail yourself of any other tools that Vendor B provides to their customers.  Tools vary widely between companies.  For example, Family Tree DNA, GedMatch and 23andMe provide chromosome browsers, while Ancestry does not.  All 3 major vendors (Family Tree DNA, Ancestry and 23andMe) have developed unique offerings (of varying quality) to help their customers understand the messages that their unique DNA carries.

Ok, Who Loves Whom?

The vendors in the left column are the vendors performing the autosomal DNA tests. The vendor row (plus GedMatch) across the top indicates who accepts upload transfers from whom, and which file versions. Please consider the notes below the chart.

  • Family Tree DNA accepts uploads from both other major vendors (Ancestry and 23andMe), but the versions that are compatible with the chip used by FTDNA will have more matches at Family Tree DNA. 23andMe V3, Ancestry V1 and MyHeritage results utilize the same chip and format as FTDNA. 23andMe V4 and Ancestry V2 utilize different chips that share only about half of their locations with the FTDNA chip. Family Tree DNA still allows free transfers and comparisons with other testers for these files, but with only about half of the DNA locations in common, matches will be fewer. Additional functions can be unlocked for a one time $19 fee.
  • Ancestry, 23andMe and Genographic do not accept transfer data from any other vendors.
  • MyHeritage does accept transfers, although that option is not easy to find. I checked with a MyHeritage representative and they provided me with the following information:  “You can upload an autosomal DNA file from your profile page on MyHeritage. To access your profile page, login to your MyHeritage account, then click on your name which is displayed towards the top right corner of the screen. Click on “My profile”. On the profile page you’ll see a DNA tab, click on the tab and you’ll see a link to upload a file.”  MyHeritage has also indicated that they will be making ethnicity results available to individuals who transfer results into their system in May, 2017.
  • LivingDNA has just released an ethnicity product and does not have DNA matching capability to other testers.  They also do not provide a raw DNA download file for customers, but hope to provide that feature by mid-May. Without a download file, you cannot transfer your DNA to other companies for processing and inclusion in their databases. LivingDNA imputes DNA locations that they don’t test, but the initial download file, when available, will only include the DNA locations actually tested. According to LivingDNA, the Illumina GSA chip includes 680,000 autosomal markers. It’s unclear at this point how many of these locations overlap with other chips.
  • WeGene’s website is in Chinese and they are not a significant player, but I did include them because GedMatch accepts their files. WeGene’s website indicates that they accept 23andMe uploads, but I am unable to determine which version or versions. Given that their terms and conditions and privacy and security information are not in English, I would be extremely hesitant before engaging in business. I would not be comfortable trusting an online translation for this type of document. SNPedia reports that WeGene has data quality issues.
  • GedMatch is not a testing vendor, so has no entry in the left column, but does provide tools and accepts all versions of files from each vendor that provides files, to date, with the exception of the Genographic Project.  GedMatch is free (contribution based) for many features, but does have more advanced functions available for a $10 monthly subscription.
  • The Genographic Project tested their participants at the Family Tree DNA lab until November 2016, when they moved to the Helix platform, which performs an exome test using a different chip.
  • The Ancestry V2 chip began processing in May 2016.
  • The 23andMe V3 chip began processing in December 2010. The 23andMe V4 chip began processing in November 2013.

Incompatible Files

Please be aware that vendors that accept different versions of other vendors’ files can only work with the tested locations that are in the files generated by the testing vendors unless they use a technique called imputation.

For example, Family Tree DNA tests about 700,000 locations which are on the same chip as MyHeritage, 23andMe V3 and Ancestry V1. In the later 23andMe V4 test, the earlier 23andMe V2 and the Ancestry V2 tests, only a portion of the same locations are tested.  The 23andMe V4 and Ancestry V2 chips only test about half of the file locations of the vendors who utilize the Illumina OmniExpress chip, but not the same locations as each other since both the Ancestry V2 and 23andMe V4 chips are custom. 23andMe and Ancestry both changed their chips from the OmniExpress version and replaced genealogically relevant locations with medically relevant locations, creating a custom chip.
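The overlap idea can be sketched with sets. The marker IDs below are invented purely for illustration and do not come from any real chip; the point is only that two custom chips can each share about half of their locations with a reference chip while sharing very little with each other.

```python
def shared_locations(chip_a, chip_b):
    """Return the tested locations that two chips have in common."""
    return set(chip_a) & set(chip_b)


def overlap_fraction(reference_chip, other_chip):
    """Share of the reference chip's locations that the other chip also tests."""
    reference = set(reference_chip)
    return len(reference & set(other_chip)) / len(reference)


# Hypothetical marker IDs, invented for illustration only.
reference_chip = {"m1", "m2", "m3", "m4", "m5", "m6", "m7", "m8"}
custom_chip_x = {"m1", "m2", "m3", "m4", "x1", "x2"}
custom_chip_y = {"m1", "m5", "m6", "m7", "y1"}

print(overlap_fraction(reference_chip, custom_chip_x))         # half of the reference
print(overlap_fraction(reference_chip, custom_chip_y))         # also half, but a different half
print(sorted(shared_locations(custom_chip_x, custom_chip_y)))  # very little in common
```

A vendor that does not impute can only compare the locations in the intersection, which is why matches are fewer for files from a different chip.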

I know this is confusing, so I’ve created the following chart for chip and test compatibility comparison.

You can easily see why the FTDNA, Ancestry V1, 23andMe V3 and MyHeritage tests are compatible with each other.  They all tested utilizing the same chip.  However, each vendor then applies their own unique matching and ethnicity algorithms to customer results, so your results will vary with each vendor, even when comparing ethnicity predictions or matching the same two individuals to each other.

Apples to Apples to Imputation

It’s difficult for vendors to compare apples to apples with non-compatible files.

I wrote about imputation in the article about MyHeritage, here. In a nutshell, imputation is a technique used to infer the DNA for locations a vendor doesn’t test (or doesn’t receive in a transfer file from another vendor) based on the location’s neighboring DNA and DNA that is “normally” passed together as a packet.

However, the imputed regions of DNA are not your DNA, and therefore don’t carry your mutations, if any.
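Here is a toy sketch of that idea. It is not how any vendor actually imputes; real imputation relies on statistical models and large reference panels. The function name, the tiny reference panel and the sequences are all invented for illustration. An untested location is filled with the value most often seen in reference sequences that agree with the person at every tested location.

```python
from collections import Counter


def impute(sample, reference_panel):
    """Fill untested positions (None) with the allele most often seen in
    reference sequences that agree with the sample at every tested position."""
    tested = [i for i, allele in enumerate(sample) if allele is not None]
    compatible = [ref for ref in reference_panel
                  if all(ref[i] == sample[i] for i in tested)]
    result = list(sample)
    if not compatible:
        return result  # nothing to go on, so the gaps stay empty
    for i, allele in enumerate(sample):
        if allele is None:
            counts = Counter(ref[i] for ref in compatible)
            result[i] = counts.most_common(1)[0][0]
    return result


# Hypothetical data: the person truly carries a rare "T" at the third position,
# but that position was not tested, so it arrives as None.
true_dna = "ACTTA"
tested_only = ["A", "C", None, "T", "A"]
reference_panel = ["ACGTA", "ACGTA", "ACCTA", "TCGAA"]

imputed = "".join(impute(tested_only, reference_panel))
print(imputed, "vs. true", true_dna)
```

The filled-in value is the common one from the reference panel, so the person's own rare mutation at that location is lost.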

I created the following diagram when writing the MyHeritage article to explain the concept of imputation when comparing multiple vendors’ files showing locations tested, overlap and imputed regions.

Family Tree DNA has chosen not to utilize imputation for transfer files and only compares the actual DNA locations tested and uploaded in vendor files, while MyHeritage has chosen to impute locations for incompatible files. Family Tree DNA produces fewer, but accurate matches for incompatible transfer files.  MyHeritage continues to have matching issues.

MyHeritage may be using imputation for all transfer files to equalize the files to a maximum location count for all vendor files. This is speculation on my part, but is speculation based on the differences in matches from known compatible file versions to known matches at the original vendor and then at MyHeritage.

I compared matches to the same person at MyHeritage, GedMatch, Ancestry and Family Tree DNA. It appears that imputed matches do not consistently compare reliably. I’m not convinced imputation can ever work reliably for genetic genealogy, because we need our own DNA and mutations. Regardless, imputation is in its infancy today.

To date, two vendors are utilizing imputation. LivingDNA is using imputation with the GSA chip for ethnicity, and MyHeritage for DNA matching.

Summary

You will get your best results by testing on the platform that the vendor offers, because the vendor’s match and ethnicity algorithms are optimized for their own file formats and DNA locations tested.

That means that if you are transferring an Ancestry V1 file, a 23andMe V3 file or a MyHeritage file, for example, to Family Tree DNA, your matches at Family Tree DNA will be the same as if you tested on the FTDNA platform.  You do not need to retest at Family Tree DNA.

However, if you are transferring an Ancestry V2 file or 23andMe V4 file, you will receive some matches, someplace between one quarter and half as compared to a test run on the vendor’s own chip. For people who can’t be tested again, that’s certainly better than nothing, and cross-chip matching generally picks up the strongest matches because they tend to match in multiple locations. For people who can retest, testing at Family Tree DNA would garner more matches and better ethnicity results for those with 23andMe V2 and V4 tests as well as Ancestry V2 tests.

For absolutely best results, swim in all of the major DNA testing pools, test as many relatives as possible, and test on the vendor’s native chip to obtain the most matches.  After all, without sharing and matching, there is no genetic genealogy!

Introducing the Match-Maker-Breaker Tool for Parental Phasing

A few days after I published the article, Concepts – Segment Size, Legitimate and False Matches, Philip Gammon, a statistician who lives in Australia, posted a comment to my blog.

Great post Roberta! I’m a statistician so my eyes light up as soon as I see numbers. That table you have produced showing by segment length the percentage that are IBD is one of the most useful pieces of information that I have seen. Two days to do the analysis!!! I’m sure that I could write a formula that would identify the IBD segments and considerably reduce this time.

By this time, my eyes were lighting up too, because the work for the original article had taken me two days to complete manually, just using segments 3 cM and above. Using smaller segments would have taken days longer. By manually, I mean comparing the child’s matches with that of both parents’ matches to see which, if either, parent the child’s match also matches on the same segment.

In the simplest terms, the Segment Size article explained how to copy the child’s and both parents’ matches to a spreadsheet and then manually compare the child’s matches to those of the parents. In the example above, you can see that both the child and the mother have matches to Cecelia. As it turns out, the exact same segment of DNA was passed in its entirety to the child from the mother, who is shown in pink – so Cecelia matches both the child and the parent on exactly the same segment.

That’s not always the case, and the Segment Size article went into much greater detail.

For the past month or so, Philip and I have been working back and forth, along with some kind volunteers who tested Philip’s new tool, in order to create something so that you too can do this comparison and in much less than two days.

Foundation

Here’s the underlying principle for this tool – if a child has a match that does NOT match either parent on the same segment, then the match is not a legitimate match. It’s a false match, identical by chance, and it is NOT genealogically relevant.

If the child’s match also matches either parent on the same segment, it is most likely a match by descent and is genealogically relevant.

For those of you who noticed the words “most likely,” yes, it is possible for someone to match a parent and child both and still not phase (or match) to the next higher generation, but it’s unusual and so far, only found in smaller segments. I wrote about multiple generation phasing in the article, “Concepts – Segment Survival – 3 and 4 Generation Phasing.” Once a segment phases, it tends to continue phasing, especially with segments above about 3.5 cM.

For those who have both parents available to test, phased matching is a HUGE benefit.
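The principle can be sketched in a few lines. This is my own illustration, not Philip's formulas: the Segment fields, the overlap test and the start and end positions are assumptions, and the tool itself may decide "same segment" differently.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    match_name: str   # the person matched
    chromosome: str
    start: int
    end: int


def overlaps(a, b):
    """True when both segments are with the same person, on the same
    chromosome, and share at least part of the same stretch of DNA."""
    return (a.match_name == b.match_name
            and a.chromosome == b.chromosome
            and a.start <= b.end
            and b.start <= a.end)


def classify(child_segment, father_segments, mother_segments):
    """Label a child's match by which parent shares it on the same segment."""
    on_father = any(overlaps(child_segment, s) for s in father_segments)
    on_mother = any(overlaps(child_segment, s) for s in mother_segments)
    if on_father and on_mother:
        return "both"
    if on_father:
        return "father"
    if on_mother:
        return "mother"
    return "neither"  # a false match, identical by chance


# Hypothetical positions, for illustration only.
father = [Segment("Robert", "2", 100, 900)]
mother = [Segment("Cecelia", "1", 1000, 5000)]

print(classify(Segment("Cecelia", "1", 1000, 5000), father, mother))
print(classify(Segment("Stranger", "7", 10, 400), father, mother))
```

A match labeled "neither" is the false-match case described above, while "father", "mother" and "both" are the matches that phase to a parent.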

But I Have Only One Parent Available

You can still use the tool to identify matches to that one parent, but you CANNOT presume that matches that DON’T match that parent are from the other (missing) parent. Matches matching the child but not matching the tested parent can be due to:

  • A match to the missing parent
  • A false match that is not genealogically relevant

According to the statistics generated from Philip’s Match-Maker-Breaker tool, shown below, segments 9 cM and above tend to match one or the other parent 90% or more of the time.  Segments 12 cM and over match 97% of the time or more, so, in general, one could “assume” (dangerous word, I know) that segments of this size that don’t match to the tested parent would match to the other parent if the other parent was available. You can also see that the reliability of that assumption drops rapidly as the segment sizes get smaller.
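Statistics of this kind can be tallied by grouping segments into size ranges and counting how many matched a parent. The sketch below is my own, with invented sample data and size ranges chosen for illustration; it is not taken from Philip's Results page.

```python
from bisect import bisect_right


def parent_match_rate_by_size(segments, bin_starts=(0, 3, 5, 7, 9, 12)):
    """segments: iterable of (centimorgans, matched_a_parent) pairs.

    Returns the share of segments in each size range that matched a parent,
    or None for a range with no segments.
    """
    totals = [0] * len(bin_starts)
    matched = [0] * len(bin_starts)
    for centimorgans, matches_parent in segments:
        index = bisect_right(bin_starts, centimorgans) - 1
        if index < 0:
            continue  # smaller than the first range
        totals[index] += 1
        matched[index] += bool(matches_parent)

    rates = {}
    for i, start in enumerate(bin_starts):
        if i + 1 < len(bin_starts):
            label = f"{start}-{bin_starts[i + 1]} cM"
        else:
            label = f"{start}+ cM"
        rates[label] = matched[i] / totals[i] if totals[i] else None
    return rates


# Hypothetical segments: (size in cM, did it match either parent?)
sample = [(1.0, False), (2.1, False), (2.5, True), (4.0, True),
          (9.5, True), (12.0, True), (15.3, True)]

for size_range, rate in parent_match_rate_by_size(sample).items():
    print(size_range, "no segments" if rate is None else f"{rate:.0%}")
```

With real downloads, the rates for the small ranges are the ones to watch, since that is where the assumption about the missing parent breaks down.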

Platform

This tool was written utilizing Microsoft Excel and only works reliably on that platform.

If you are using Excel and are NOT attempting to use MAC Numbers, skip this section.  If you want to attempt to use Numbers, read this section.

Along with a MAC person, I tried to coax Numbers (free MAC spreadsheet) into working. If you have any option other than using Numbers, use it. Microsoft Excel for MAC seemed to work fine, but it was only tested on one MAC.

Here’s what I discovered when trying to make Numbers work:

  • You must first launch Numbers and then select the various spreadsheets.
  • The tabs are not at the bottom and are instead at the top without color.
  • The instructions for copying the formulas in cells H2-K2 throughout the spreadsheet must be done manually with a copy/paste.
  • After the above step, the calculations literally took a couple hours (MacBook Air) instead of a couple minutes on the PC platform. The older MAC desktop still took significantly longer than on a Microsoft PC, but less time than the solid state MacBook Air.
  • After the calculations complete, the rows on the child’s spreadsheet are not colored, which is one of the major features of the Match-Maker-Breaker tool, as Numbers reports that “Conditional highlighting rules using formulas are not supported and were removed.”
  • Surprisingly, the statistical Reports page seems to function correctly.

How Long Does Running Match-Maker-Breaker Tool on a PC Take?

The first time I ran this tool, which included reading Philip’s instructions for the first time, the entire process took me about 10 minutes after I downloaded the files from Family Tree DNA.

Vendors

This tool only works with matches downloaded from Family Tree DNA.

Transfer Kits

It’s strongly suggested that all 3 individuals being compared have tested at Family Tree DNA or on the same chip version imported into Family Tree DNA.

Matches not run on the same chip as Family Tree DNA testers can only provide a portion of the matches that the same person’s results run on the FTDNA chip can provide. You can run the matching tool with transferred results, but the results will only provide a subset of the results that will be provided by having all parties that are being compared, meaning the child and both parents, test at Family Tree DNA.

The following product versions CAN all be compared successfully at Family Tree DNA, as they all utilize the same Illumina chip:

  • All Family Finder tests
  • Ancestry V1 (before May 2016)
  • 23andMe V3 (before November 2013)
  • MyHeritage

The following tests do NOT utilize the same Illumina testing platform and cannot be compared successfully with Family Finder tests from Family Tree DNA, or the list above. Cross platform testing results cannot be reliably compared. Those that DO match will be accurate, but many will not match that would match if all 3 testers were utilizing the same platform, therefore leading you to inaccurate conclusions.

  • Ancestry V2 (beginning in May 2016 to present)
  • 23andMe V4 (beginning November 2013 to present)

The child and two parents should not be compared utilizing mixed platforms – meaning, for example, that the child should not have been tested at FTDNA and the parents transferred from Ancestry on the V2 platform since May 2016.

If any of the three family members, being the child or either parent, have tested on an incompatible platform, they should retest at Family Tree DNA before using this tool.

What You Need

  • You will need to download the chromosome match lists from the child and both parents, AT THE SAME TIME. I can’t stress this enough, because any matches that have been added for any of the three people at a later time than the others will skew the matching and the statistics. Matches are being added all the time.
  • You will also need a relatively current version of Excel on your computer to run this tool. No, I did not do version compatibility testing so I don’t know how old is too old. I am running MSOffice 2013.
  • You will need to know how to copy and paste data from and to a spreadsheet.

Instructions for Downloading Match Files

My recommendation is that you download your matches just before utilizing this tool.

To download your matches, sign on to each account. On your main page, you will see the Family Finder section, and the Chromosome Browser. Click on that link.

At the top of the chromosome browser page, below, you’ll see the image of chromosomes 1 through X. At the top right, you’ll see the option to “Download all matches to Excel (CSV Format).” Click on that link.

Next, you’ll receive a prompt to open or save the file. Save it to a file name that includes the name of the person plus the date you did the download. I created a separate folder so there would be no confusion about which files are which and whether or not they are current.

Your match file includes all of your matches and the chromosome matching locations like the example shown below.

These files of matches are what you’ll need to copy into the Match-Maker-Breaker spreadsheet.
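If you ever want to look at a match file outside of Excel, a CSV file like this can be read with a few lines of code. The sketch below is only an illustration: the header names, the sample rows and the function name are my assumptions, so check them against the first row of your own download before relying on them.

```python
import csv
import io

# Hypothetical layout with seven columns; the header names are assumed.
SAMPLE = """Name,Match Name,Chromosome,Start Location,End Location,Centimorgans,Matching SNPs
Child,Cecelia,1,1000000,5000000,6.2,900
Child,Annie,X,200000,900000,1.4,500
"""


def read_matches(handle):
    """Read match rows from an open CSV file into a list of dictionaries."""
    rows = []
    for record in csv.DictReader(handle):
        rows.append({
            "match_name": record["Match Name"].strip(),
            "chromosome": record["Chromosome"].strip(),
            "start": int(record["Start Location"]),
            "end": int(record["End Location"]),
            "centimorgans": float(record["Centimorgans"]),
            "snps": int(record["Matching SNPs"]),
        })
    return rows


matches = read_matches(io.StringIO(SAMPLE))
for match in matches:
    print(match["match_name"], match["chromosome"], match["centimorgans"])
```

For a real file, you would open the saved download and pass the file object to read_matches in place of the io.StringIO sample.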

Do not delete any information from your match spreadsheets. If you normally delete small segments, don’t. You may cause a non-match situation if the parent carries a larger portion of the same segment.

You can rerun the Match-Maker-Breaker tool at will, and it only takes a very few minutes.

The Match-Maker-Breaker Tool

The Match-Maker-Breaker Tool has 5 sheets when you open the spreadsheet:

  • Instructions – Please read entirely before beginning.
  • Results – The page where your statistical results will be placed.
  • Child – The page where you will paste the child’s matches and then look at the match results after processing.
  • Father – The page where you will paste the father’s matches.
  • Mother – The page where you will paste the mother’s matches.

Download

Download the free Match-Maker-Breaker tool which is a spreadsheet by clicking on this link: Match-Maker-Breaker Tool V2

Please don’t start using the tool before reading the instructions completely and reading the rest of this article.

Make a Copy

After you download the tool, make a copy on your system. You’ll want to save the Match-Maker-Breaker spreadsheet file for each trio of people individually, and you’ll want a fresh Match-Maker-Breaker spreadsheet copy to run with each new set of download files.

Instructions

I’m not going to repeat Philip’s instructions here, but please read them entirely before beginning and please follow them exactly. Philip has included graphic illustrations of each step to the right of the instruction box. The spreadsheet opens to the Instructions page. You can print the instruction page as well.

Copy/Pasting Data

When copying the parents’ and child’s data into the spreadsheets, do NOT copy and paste the entire page by selecting the page. In the child’s chromosome browser download spreadsheet, select the relevant columns by highlighting columns A through G, touching your cursor to the A-G across the top, as shown below.  After they are selected, click on “copy.” Then position the cursor in the first cell in row 1 of the child’s page of the Match-Maker-Breaker spreadsheet and click on “paste.”

Do NOT select columns H-K when highlighting and copying, or your paste will wipe out Philip’s formulas to do calculations on the child’s tab on the spreadsheet.

The example above, assuming that Annie is the last entry on the spreadsheet, shows that I’ve highlighted all of the cells in columns A-G, prior to executing the copy command. Your spreadsheets of course will be much longer.

I wrote a very quick and dirty article about using Excel here

The Match Making Breaking Part

After you copy the formulas from cells H2 through K2 down through the rest of the spreadsheet by following Philip’s instructions, you’ll see the results populating in the status bar at the bottom. You’ll also see colors being added to the matches on the left hand side of the spreadsheet page and counts accruing in the 4 right columns. Be patient and wait. It may take a few minutes. When it’s finished, you can verify by scrolling to the last row on the child’s page, where you’ll see something like the example below. Every row has been assigned a color, and every segment where the match to the child also matches the father, the mother or both, or is found in the HLA region, is counted as 1 in the right 4 columns.

In this example, 5 segments, shown in grey, don’t match either parent; one, shown in tan, is found in the HLA region; and three, shown in blue, match the father.

Output

After you run the Match-Maker-Breaker tool, the child’s matches on the Child tab will be identified as follows:

This means that the segment on which that individual matches the child also matches the father, the mother or both parents on all or part of that same segment, falls in the HLA region, or matches none of the above.

What is a Match?

Philip and I worked to answer the question, “what is a match?” In the Concepts article, I discussed the various kinds of matches.

  • Full match: The child’s match and parent’s match share the same exact segment, meaning same start and end points and same number of SNPs within that segment.
  • Partial match: The child’s match matches a portion of the segment from the parent – meaning that the child inherited part of the segment, but not the entire segment.
  • Overhanging match: The child’s match matches part or all of the parent’s segment, but either the beginning or end extends further than the parent’s match. This means that the overlapping portion is legitimate, meaning identical by descent (IBD), but the overhanging portion is identical by chance (IBC).
  • Nested match: The child’s match is smaller than the match to the parent, but fully within the parent’s match, indicating a legitimate match.
  • No match: The person matches the child, but neither parent, meaning that this match is not legitimate. It’s identical by chance (IBC).
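For readers who think in code, here is a minimal sketch of how those five categories could be told apart from start and end positions alone. This is my own illustration, not Philip's spreadsheet formulas. The `Segment` type and `classify` function are hypothetical names, SNP counts are ignored, and the line between "partial" (shares one endpoint with the parent's segment) and "nested" (strictly inside it) is an assumption made only so that each case gets a distinct label.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    chromosome: str
    start: int  # start position in base pairs
    end: int    # end position in base pairs


def classify(child: Segment, parent: Optional[Segment]) -> str:
    """Label the child's segment relative to the parent's segment with the same person."""
    if parent is None or child.chromosome != parent.chromosome:
        return "no match"
    if child.end < parent.start or child.start > parent.end:
        return "no match"  # same chromosome, but the segments never overlap
    if child.start == parent.start and child.end == parent.end:
        return "full"
    if child.start < parent.start or child.end > parent.end:
        return "overhanging"  # child extends past at least one end of the parent's segment
    if child.start > parent.start and child.end < parent.end:
        return "nested"  # strictly inside the parent's segment
    return "partial"  # ASSUMPTION: inside the parent's segment and sharing one endpoint


parent = Segment("1", 100, 200)
for child in (Segment("1", 100, 200), Segment("1", 120, 180),
              Segment("1", 100, 150), Segment("1", 90, 150)):
    print(child, classify(child, parent))
```

The positions in the usage lines are toy numbers chosen to hit each branch, not real chromosome coordinates.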

Full matches and no matches are easy.

However, partial matches, overhanging matches and nested matches are not as straightforward.

What, exactly, is a match? Let’s look at some different scenarios.

If someone matches a parent on a large segment, say 20cM, and only matches the child on 2cM, fully within the parent’s segment, is this match genealogically relevant, or could the match be matching the child by chance on a part of the same segment that they match the parents by descent? We have no way to know for sure, just utilizing this tool. Hopefully, in this case, the fact that the person matches the parent on a large segment would answer any genealogical questions through triangulation.

If the person matches the parent but only matches the child on a small portion of the same segment plus an overhanging region, is that a valid match? Because they do match on an overhanging region, we know that match is partly identical by chance, but is the entire match IBC or is the overlapping part legitimate? We don’t know. How strongly I would consider this a valid match depends partly on the size of the matching portion of the segment.

One of the purposes of phasing and then looking at matches is to, hopefully, learn more about which matches are legitimate, which are not, and predictors of false versus legitimate matches.

Relative to this tool, no editing has been done, meaning that matches are presented exactly as that, regardless of their size or the type of match. A match is a match if any portion of the match’s DNA to the child overlaps any portion of either or both parents’ DNA, with the exception of part of chromosome 6. It’s up to you, as the genealogist, to figure out by utilizing triangulation and other tools whether the match is relevant or not to your genealogy.

If you are not familiar with identical by descent (IBD), meaning a legitimate match; identical by population (IBP), meaning identical by descent but only because the population as a whole carries that segment; and identical by chance (IBC), meaning a false match, the article Identical by…Descent, State, Population and Chance explains the terms and the concepts so that you can apply them usefully.

About Chromosome 6

After analyzing the results of several people, the area of chromosome 6 that includes the HLA region has been excluded from the analysis. Long known to be a pileup region where people carry significant segments of the same DNA that is not genealogically relevant (meaning IBP or identical by population), this region has been found to be often unreliable genealogically, and falls outside the norm as compared to the rest of the segments. This area has been annotated separately and excluded from match results. This was the only region found to universally have this effect.

This does not mean that a match in this region is positively invalid or false, but matches in the HLA region should be viewed very skeptically.
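The "any overlap counts, except for the HLA region" rule can be sketched in a few lines. This is only my reading of the rule as described in this article, not the spreadsheet's actual formulas. The row layout, the function names and especially the HLA boundaries are assumptions. The article does not state the cutoffs the tool uses, so the bounds below are illustrative placeholders.

```python
from typing import List, Tuple

# (match name, chromosome, start, end): a simplified row, not the tool's real layout
Row = Tuple[str, str, int, int]

# ASSUMPTION: illustrative placeholder bounds, not the tool's actual cutoffs.
HLA_CHROMOSOME = "6"
HLA_START = 25_000_000
HLA_END = 35_000_000


def overlaps(a: Row, b: Row) -> bool:
    """True when both rows are for the same person and chromosome and share any stretch."""
    return a[0] == b[0] and a[1] == b[1] and a[2] <= b[3] and b[2] <= a[3]


def tag(child_row: Row, father_rows: List[Row], mother_rows: List[Row]) -> str:
    """Return 'HLA', 'both', 'father', 'mother' or 'none' for one of the child's segments."""
    _, chrom, start, end = child_row
    # ASSUMPTION: any segment touching the region is set aside, whatever the parents show.
    if chrom == HLA_CHROMOSOME and start <= HLA_END and end >= HLA_START:
        return "HLA"
    in_father = any(overlaps(child_row, row) for row in father_rows)
    in_mother = any(overlaps(child_row, row) for row in mother_rows)
    if in_father and in_mother:
        return "both"
    if in_father:
        return "father"
    if in_mother:
        return "mother"
    return "none"


# Toy rows for illustration only
father = [("Annie", "1", 100, 500)]
mother = [("Annie", "1", 400, 900)]
print(tag(("Annie", "1", 450, 480), father, mother))
print(tag(("Annie", "1", 600, 700), father, mother))
```

Notice that the sketch, like the tool, does no filtering by size. A 2 cM overlap counts the same as a 20 cM one, which is exactly why the judgment about relevance stays with you.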

The Results Tab – Statistics

Now that you’ve populated the spreadsheet and you can see on the Child tab which matches also match either or both parents, or neither, or the HLA region, go to the Results tab of the spreadsheet.

This tab gives you some very interesting statistics.

First, you’ll see the number and percent of matches by chromosome.

The person compared was a female, so she would have X matches to both parents. However, notice that X matching is significantly lower than any of the other chromosomes.

Frankly, I’ve suspected for a long time that there was a dramatic difference in matching with the X chromosome, and wrote about it here. It was suggested by some at the time that I was only reporting my personal observations that would not hold beyond a few results (ascertainment bias), but this proves that there is something different about X chromosome matching. I don’t know what or why, but according to this data that is consistent between all of the beta testers, matching to the X chromosome is much less reliable.

The second statistics box shows statistics for the matches to the child that also match the parents. The child’s own matches to the parents are the 23 shown under “excluded from calculations.”

The next group of statistics on your page will be your own, but for this example, Philip has combined the results from several beta testers and provided summary information, so that the statistics are not skewed by any one individual.

Next are the match results by segment size for chromosomes 1-22. Philip has separated out segments with less than 500 SNPs and reports them separately.

You will note that 90% or more of the segments 9 cM and above match one of the two parents, and 97% or more of segments 12cM or above.

The X chromosome follows, analyzed separately. You’ll notice that while 27% of the matches on chromosomes 1-22 match one or both parents, only 14% of the X matches do.

Even with larger segments, not all X segments match both the child and the parents, suggesting that skepticism is warranted when evaluating X chromosome matches.

Philip then calculated a nice graph for showing matching autosomal segments by cM size, excluding the X.

The next set of charts shows matches by SNP density. Many people neglect SNP count when evaluating results, but the higher the SNP count, the more robust the match.

Note that segments with SNP density above 2,200 almost always matched, while SNP density of 2,800 reaches the 97% threshold.

The X chromosome, by SNP count, is shown below.

X segments reach the 100% threshold at about 1,600 SNPs; however, we really need more results to be predictive at the same level as the results for chromosomes 1-22. Two data samples really aren’t adequate.

Once again, Philip prepared a nice chart showing percentage of matching segments by SNP count, below.

Predictive

In the Segment Survival – 3 and 4 Generation Phasing article, one can see that phased matches are predictive, meaning that a child/parent match is highly suggestive that the segment is a valid segment match and that it will hold in generations further upstream.

Several years ago, Dr. Tim Janzen, one of the early phasing pioneers, suggested that people test their children, even if both parents had already tested. For the life of me, I couldn’t understand how that would be the least bit productive, genealogically, since people were more likely to match the parents than the children, and children only carry a subset of their parents’ DNA.

However, the predictive nature of a segment being legitimate with a child/parent match to a third party means that even in situations where your own parent isn’t available, a match by a third party on the same segment with your child suggests that the match is legitimate, not IBC.

In the article, I showed both 3 and 4 generations of phased comparisons between generations of the same family and a known cousin. The results of the 5 different family comparisons are shown below, where the red segments did not phase or lost phasing between generations, and the green segments did phase through multiple generations.

Very, very few segments lost phasing in upper (older) generations after matching between a parent and a child. In the five 4-generation examples above, only a total of 7 groups of segments lost phasing. The largest segment that lost phasing in upper generations was 3.69 cM. In two examples, no segments were lost due to not phasing in upper generations.

The net-net of this is that you can benefit by testing your children if your parents aren’t available, because the matches on the segment to both you and the child are most likely to be legitimate. Of course, there will be segments where someone matches you and not your child, because your child did not inherit that segment of your DNA, and those may be legitimate matches as well. However, the segments where you and your child both match the same person will likely be legitimate matches, especially over about 3.5 cM. Please read the Segment Survival article for more details.

If you want to order additional Family Finder tests for more family members, you can click here.

Group Analysis

Philip has performed a group analysis which has produced some expected results along with some surprising revelations. I’d prefer to let people get their feet wet with this tool and the results it provides before publishing the results, with one exception.

In case you’re wondering if the comparisons used as examples, above, are representative of typical results, Philip analyzed 10 of our beta testers and says the following:

The results are remarkably consistent between all 10 participants. Summing it up in words: with each person that you match you will have an average of 11 matching segments. Three will be genuine and will add to [a total of] 21 cM. Eight will be false and add to [a total of] 19 cM.
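Philip's averages are worth a moment of arithmetic. The short sketch below uses only the numbers in his summary. The variable names are mine.

```python
# Averages per matching person, taken from Philip's summary
genuine_segments, genuine_cm = 3, 21
false_segments, false_cm = 8, 19

total_segments = genuine_segments + false_segments
avg_genuine_cm = genuine_cm / genuine_segments        # average size of a genuine segment
avg_false_cm = false_cm / false_segments              # average size of a false segment
genuine_segment_share = genuine_segments / total_segments
genuine_cm_share = genuine_cm / (genuine_cm + false_cm)

print(f"Average genuine segment: {avg_genuine_cm:.2f} cM")
print(f"Average false segment:   {avg_false_cm:.2f} cM")
print(f"Genuine share of segments: {genuine_segment_share:.0%}")
print(f"Genuine share of total cM: {genuine_cm_share:.1%}")
```

The genuine segments are few but large, averaging 7 cM each, while the false ones are many and small, averaging under 2.5 cM. Three genuine segments out of 11 is about 27%, which lines up with the autosomal match rate shown on the Results tab.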

Philip compiled the following chart summarizing 10 beta testers’ results. Please note that you can click to enlarge the images.

The X, being far less consistent, is shown below.

We Still Need Endogamous Parent-Child Trios

When I asked for volunteer testers, we were not able to obtain a trio of fully endogamous individuals. Specifically, we would like to see how the statistics for groups of non-endogamous individuals compare to the statistics for endogamous individuals.

Endogamous groups include people who are 100% Jewish, Amish, Mennonite, or have a significant amount of first or second cousin marriages in recent generations.

Of these, Jewish families prove to be the most highly endogamous, so if you are Jewish and have both Jewish parents’ DNA results, please run this tool and send either Philip or me the resulting spreadsheet. Your results won’t be personally identified, only the statistics used in conjunction with others, similar to the group analysis shown above. Your results will be entirely anonymous.

Philip’s e-mail is philip.gammon@optusnet.com.au and you can reach me at roberta@dnaexplain.com.

Caveat

Philip has created the Match-Maker-Breaker tool which is free to everyone. He has included some wonderful diagnostics, but Philip is not providing individual support for the tool. In other words, this is a “what you see is what you get” gift.

Thank You and Acknowledgements

Of course, a very big thank you to Philip for creating this tool, and also to people who volunteered as alpha and beta testers and provided feedback. Also thanks to Jim Kvochick for trying to coax Numbers into working.

Match-Maker-Breaker Author Bio:

Philip’s official tagline reads: Philip Gammon, BEng(ManSysEng) RMIT, GradDipSc(AppStatistics) Swinburne

I asked Philip to describe himself.

I’d describe myself as a business analyst with a statistics degree plus an enthusiastic genetic genealogist with an interest in the mathematical and statistical aspects of inheritance and cousinship.

The important aspect of Philip’s resume is that he is applying his skills to genetic genealogy where they can benefit everyone. Thank you so much Philip.

Watch for some upcoming guest articles from Philip.

Family Tree DNA myOrigins Ethnicity Update – No April Foolin’

The long-anticipated myOrigins update at Family Tree DNA has happened today. Not only are the ethnicity percentages updated, sometimes significantly, but so are the clusters and the user interface.

Furthermore, because of the new clusters and reference populations, the entire database has been rerun. In essence, this isn’t just an update, but an entirely new version of myOrigins.

New Population Clusters

The updated version of myOrigins includes 24 reference populations, an increase of 6 from the previous 18 clusters.

The new clusters are:

African

  • East Central Africa
  • West Africa
  • South Central Africa

Central/South Asian

  • South Central Asia
  • Oceania
  • Central Asia

East Asian

  • Northeast Asia
  • Southeast Asia
  • Siberia

Europe

  • West and Central Europe
  • East Europe
  • Iberia
  • Southeast Europe
  • British Isles
  • Finland
  • Scandinavia

Jewish Diaspora

  • Sephardic Diaspora
  • Ashkenazi Diaspora

Middle Eastern

  • East Middle East
  • West Middle East
  • Asia Minor
  • North Africa

New World

  • North and Central America
  • South and Central America

Note that this grouping divides Native American between North and South America and includes the long-awaited Sephardic cluster.

New User Experience

Your experience starts on your home page where you’ll click on myOrigins, like always. That part hasn’t changed.

The next page you’ll see is new.

This myOrigins page shows your major category results, with a down arrow to display your subgroups and trace results.

Now, for the great news! Family Tree DNA is now displaying trace results! Trace results are often interpreted as noise, but that’s not always the case. However, Family Tree DNA does provide an annotation for trace amounts of DNA, so everyone is warned about the potential hazard.

It’s now up to you, the genealogist, to make the determination whether your trace amounts are valid or not.

Trace DNA inclusion has been something I’ve wanted for a long time, so THANK YOU Family Tree DNA!

MyOrigins now identifies my North and Central American ancestry, which translates into Native American, proven by haplogroups in those particular family lines.

Clicking on the various subcategories shows the location of the cluster on the map, along with new educational material below the map.

Pressing the down arrow beside any category displays the subcategories.

Clicking on “Show All” displays all of the categories and your ethnicity percentages within those categories.

Clicking on “View myOrigins Map” shows you the entire world map and your cluster locations where your DNA is found in those reference populations.

The color intensity reflects the amount of your DNA found there. In other words, bright blue is my majority ethnicity at 48% in the British Isles.

In the information box in the lower left hand corner, you can now opt to view your shared origins with people you match who share the same major regions, or you can view the regional information.

Accuracy

I’ve already mentioned how pleased I am to find my Native American ancestry accurately reported, but I’m equally pleased to see my British Isles and Germanic/Dutch/French much more accurately reflected. My mother’s results are more succinct as well, reflecting her known heritage almost exactly.

The chart below shows my new myOrigins results compared to the older results. I prepared this chart originally as a part of the article, Concepts – Calculating Ethnicity Percentages. The new results are much more reflective of what I know about my genealogy.

Take a look at your new results on your home page at Family Tree DNA.

Summary

All ethnicity estimates, from all sources, are just that…estimates.  There will always be a newer version as reference populations continue to improve.  The new myOrigins version offers a significant improvement for me and the kits I administer.

Ethnicity estimates are more of a beginning than an end.  I hope that no one is taking any ethnicity estimate as hard and fast fact.  They aren’t.  Ethnicity estimates are one of the many tools available to genetic genealogists today.  They really aren’t a shortcut to, or in place of, traditional genealogy.  I hope what they are, for many people, is the enticement that encourages them to jump into the genealogy pool and go for a swim.

People seeking to know “who they are” through ethnicity testing need to understand that while ethnicity results are fun, they aren’t an answer. Ethnicity results are more of a hint or a road sign, pointing the way to potential answers that may be reaped from traditional genealogical research.

If your results aren’t quite what you were expecting, or even if they are and you’d like to understand more about how ethnicity and DNA work, please read my article, Ethnicity Testing – A Conundrum.

Jessica Biel – A Follow-up: DNA, Native Heritage and Lies

Jessica Biel’s episode aired on Who Do You Think You Are on Sunday, April 2nd. I wanted to write a follow-up article since I couldn’t reveal Jessica’s Native results before the show aired.

The first family story about Jessica’s Biel line being German proved to be erroneous. In total, Jessica had three family stories she wanted to follow, so the second family legend Jessica set out to research was her Native American heritage.

I was very pleased to see a DNA test involved, but I was dismayed that the impression was left with the viewing audience that the ethnicity results disproved Jessica’s Native heritage. They didn’t.

Jessica’s Ethnicity Reveal

Jessica was excited about her DNA test and opened her results during the episode to view her ethnicity percentages.

Courtesy TLC

The locations shown below and the percentages, above, show no Native ethnicity.

Courtesy TLC

Jessica was understandably disappointed to discover that her DNA did not reflect any Native heritage – conflicting with her family story. I feel for you Jessica.  Been there, done that.

Courtesy TLC

Jessica had the same reaction as many of us. “Lies, lies,” she said, in frustration.

Well Jessica, maybe not.

Let’s talk about Jessica’s DNA results.

Native or Lies?

I’ve written about the challenges with ethnicity testing repeatedly. At the end of this article, I’ll provide a reading resource list.

Right now, I want to talk about the misperception that because Jessica’s DNA ethnicity results showed no Native, that her family story about Native heritage is false. Even worse, Jessica perceived those stories to be lies. Ouch, that’s painful.

In my world view, a lie is an intentional misrepresentation of the truth. Let’s say that Jessica really didn’t have Native heritage. That doesn’t mean someone intentionally lied. People might have been confused. Maybe they made assumptions. Sometimes facts are misremembered or misquoted. I always give my ancestors the benefit of the doubt unless there is direct evidence of an intentional lie. And even then, I would like to try to understand what prompted that behavior. For example, discrimination encouraged many people of mixed ethnicity to “pass” for white as soon as possible.

That’s certainly a forgivable “lie.”

Ok, Back to DNA

Autosomal DNA testing can reliably detect minority DNA admixture only down to about the 1% level – minority meaning a small amount relative to your overall ancestry.

Everyone inherits DNA from ancestors differently, in different amounts, in each generation. Remember, you receive half of your DNA from each parent, but which half of their DNA you receive is random. That holds true for every generation between the ancestor in question and Jessica today.  Ultimately, more or less than 50% of any ancestor’s DNA can be passed in any generation.

However, if Jessica inherited the average amount of DNA in each generation, meaning 50% of the DNA that the parent carried from a given ancestor, the following chart would represent the amount of DNA Jessica carries from each ancestor in each generation.

This chart shows the amount of DNA of each ancestor, by generation, that an individual testing today can expect to inherit, if they inherit exactly 50% of that ancestor’s DNA from the previous generation. That’s not exactly how it works, as we’ll see in a minute, because sometimes you inherit more or less than 50% of a particular ancestor’s DNA.

Utilizing this chart, in the 4th generation, Jessica has 16 ancestors, all great-great-grandparents. On average, she can expect to inherit 6.25% of the DNA of each of those ancestors.
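The chart's numbers come from simple halving and doubling, which a few lines of code can reproduce. This is a sketch of the averages only, with generation 1 counted as your parents. The function names are mine, and real inheritance will vary around these values.

```python
def ancestors_in_generation(generation: int) -> int:
    """The number of ancestors doubles with each generation back in time."""
    return 2 ** generation


def expected_percent(generation: int) -> float:
    """Average percent of your DNA from one ancestor, assuming exactly half is passed each time."""
    return 100 / 2 ** generation


print("Gen  Ancestors  Average % per ancestor")
for generation in range(1, 8):
    print(f"{generation:>3}  {ancestors_in_generation(generation):>9}  "
          f"{expected_percent(generation):>8.2f}%")
```

Generation 4 gives 16 ancestors at 6.25% each, and generation 7 gives 128 ancestors at about 0.78% each, the same values shown in the chart.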

In the rightmost column, I’ve shown Jessica’s relationship to her Jewish great-great-grandparents, shown in the episode, Morris and Ottilia Biel.

Jessica has two great-great-grandparents who are both Jewish, so the amount of Jewish DNA that Jessica would be expected to carry would be 6.25% times two, or 12.50%. But that’s not how much Jewish DNA Jessica received, according to Ancestry’s ethnicity estimates. Jessica received only 8% Jewish ethnicity, 36% less than average for having two Jewish great-great-grandparents.
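Here is that comparison worked through step by step. It is just the arithmetic from the paragraph above, using the 8% figure reported on the show. The variable names are mine.

```python
per_ancestor_percent = 100 / 2 ** 4   # 6.25% from each great-great-grandparent
jewish_ancestors = 2                  # Morris and Ottilia Biel

expected_percent = per_ancestor_percent * jewish_ancestors
observed_percent = 8.0                # Ancestry's estimate shown in the episode

shortfall = (expected_percent - observed_percent) / expected_percent

print(f"Expected: {expected_percent:.2f}%")
print(f"Observed: {observed_percent:.2f}%")
print(f"Shortfall relative to expected: {shortfall:.0%}")
```

The shortfall is measured relative to the expected 12.5%, which is where the 36% comes from. It is not a difference of 36 percentage points.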

Courtesy TLC

Now we know that Jessica carries less Jewish DNA than we would expect based on her proven genealogy. That’s the nature of random recombination and how autosomal DNA works.

Now let’s look at the oral history of Jessica’s Native heritage.

Native Heritage

The intro didn’t tell us much about Jessica’s Native heritage, except that it was on her mother’s mother’s side. We also know that the fully Native ancestor wasn’t her mother or grandmother, because those are the two women who were discussing which potential tribe the ancestor was affiliated with.

We can also safely say that it also wasn’t Jessica’s great-grandmother, because if her great-grandmother had been a member of any tribe, her grandmother would have known that. I’d also wager that it wasn’t Jessica’s great-great-grandmother either, because most people would know if their grandmother was a tribal member, and Jessica’s grandmother didn’t know that. Barring a young death, most people know their grandmother. Utilizing this logic, we can probably safely say that Jessica’s Native ancestor was not found in the preceding 4 generations, as shown on the chart below.

On this expanded chart, I’ve included the estimated birth year of the ancestor in that particular generation, using 25 years as the average generation length.

If we use the logic that the fully Native ancestor was not between Jessica and her great-great-grandmother, that takes us back through an ancestor born in about 1882.

The next 2 generations back in time would have been born in 1857 and 1832, respectively, and both of those generations would have been reflected as Indian on the 1850 and/or 1860 census. Apparently, they weren’t or the genealogists working on the program would have picked up on that easy tip.

If Jessica’s Native ancestor was born in the 7th generation, in about 1807, and lived to the 1850 census, they would have been recorded in that census as Native at about 43 years of age. Now, it’s certainly possible that Jessica had a Native ancestor that might have been born about 1807 and didn’t live until the 1850 census, and whose half-Native children were not enumerated as Indian.

So, let’s go with that scenario for a minute.

If that was the case, the 7th generation born in 1807 contributed approximately 0.78% DNA to Jessica, IF Jessica inherited 50% in each generation. At 0.78%, that’s below the 1% level. Small amounts of trace DNA are reported as <1%, but at some point the amount is too minuscule to pick up or may have washed out entirely.

Let’s add to that scenario. Let’s say that Jessica’s ancestor in the 7th generation was already admixed with some European. Traders were well known to marry into tribes. If Jessica’s “Native” ancestor in the 7th generation was already admixed, that means Jessica today would carry even less than 0.78%.
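To put rough numbers on this scenario, here is a small sketch that steps back from the chart's 1882 estimate for the 4th generation in 25 year increments and then scales the average contribution by how Native the ancestor was. The function names are mine, and the one-half value is purely a hypothetical illustration. Nothing in the episode tells us the actual figure.

```python
ANCHOR_GENERATION = 4       # great-great-grandparents
ANCHOR_BIRTH_YEAR = 1882    # the chart's estimate for that generation
YEARS_PER_GENERATION = 25


def estimated_birth_year(generation: int) -> int:
    """Step back 25 years for each generation beyond the anchor generation."""
    return ANCHOR_BIRTH_YEAR - YEARS_PER_GENERATION * (generation - ANCHOR_GENERATION)


def expected_native_percent(generation: int, native_fraction: float = 1.0) -> float:
    """Average percent of Native DNA expected today from one ancestor in that generation."""
    return native_fraction * 100 / 2 ** generation


print(estimated_birth_year(7))                        # estimated birth year, 7th generation
print(1850 - estimated_birth_year(7))                 # age at the 1850 census
print(f"{expected_native_percent(7):.2f}%")           # fully Native ancestor
print(f"{expected_native_percent(7, 0.5):.2f}%")      # HYPOTHETICAL: ancestor was half Native
```

A half Native ancestor in the 7th generation would be expected to contribute, on average, about 0.39%, which is well under the level that ethnicity tests can reliably detect.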

You can easily see why this heritage, if it exists, might not show up in Jessica’s DNA results.

No Native DNA Does NOT Equal No Native Heritage

However, the fact that Jessica’s DNA ethnicity results don’t indicate Native American DNA doesn’t necessarily mean that Jessica doesn’t have a Native ancestor.

It might mean that Jessica doesn’t have a Native ancestor. But it might also mean that Jessica’s DNA can’t reliably disclose or identify Native ancestry that far back in time – both because of the genetic distance and also because Jessica may not have inherited exactly half of her ancestor’s Native DNA. Jessica’s 8% Jewish DNA is the perfect example of the variance in how DNA is actually passed versus the 50% average per generation that we have to utilize when calculating expected estimates.

Furthermore, keep in mind that all ethnicity tools are imprecise.  It’s a new field and the reference panels, especially for Native heritage, are not as robust as other groups.

Does Jessica Have Native Heritage?

I don’t know the answer to that question, but here’s what I do know.

  • You can’t conclude that because the ethnicity portion of a DNA test doesn’t show Native ancestry that there isn’t any.
  • You can probably say that any fully Native ancestor is not within the past 6 generations, give or take a generation or so.
  • You can probably say that any Native ancestor is probably prior to 1825 or so.
  • You can look at the census records to confirm or eliminate Native ancestors in many or most lines within the past 6 or 7 generations.
  • You can utilize geographic location to potentially eliminate some ancestors from being Native, especially if you have a potential tribal affiliation. Let’s face it, Cherokees are not found in Maine, for example.
  • You can potentially utilize Y and mitochondrial DNA to reach further back in time, beyond what autosomal DNA can tell you.
  • If autosomal DNA does indicate Native heritage, you can utilize traditional genealogy research in combination with both Y and mitochondrial DNA to prove which line or lines the Native heritage came from.

Mitochondrial and Y DNA Testing

While autosomal DNA is reasonably constrained to 5 or 6 generations, Y and mitochondrial DNA are not.

Of course, Ancestry, who sponsors the Who Do You Think You Are series, doesn’t sell Y or mitochondrial DNA tests, so they certainly aren’t going to introduce that topic.

Y and mitochondrial DNA tests reach back in time without the constraint of generations, because neither Y nor mitochondrial DNA is admixed with the other parent.

The Y DNA follows the direct paternal line for males, and mitochondrial DNA follows the direct matrilineal line for both males and females.

In the Concepts – Who To Test article, I discussed all three types of testing and who one can test to discover their heritage, through haplogroups, of each family line.  Every single one of your ancestors carried and had the opportunity to pass on either Y or mitochondrial DNA to their descendants.  Males pass the Y chromosome to male children, only, and females pass mitochondrial DNA to both genders of their children, but only females pass it on.

I don’t want to repeat myself about who carries which kind of DNA, but I do want to say that in Jessica’s case, based on what is known about her family, she could probably narrow the source of the potential Native ancestor significantly.

In the above example, if Jessica is the daughter – let’s say that we think the Native ancestor was the mother of the maternal great-grandmother. She is the furthest right on the chart, above. The pink coloring indicates that the pink maternal great grandmother carries the mitochondrial DNA and passed it on to the maternal grandmother who passed it to the mother who passed it to both Jessica and her siblings.

Therefore, Jessica or her mother, either one, could take a mitochondrial DNA test to see if there is deeper Native ancestry than an autosomal test can reveal.
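The rule for deciding whether a particular tester can reach a particular ancestor with Y or mitochondrial DNA is simple enough to sketch. The path notation here is my own invention for illustration: one letter per generation walking back from the tester, "M" for a step to a mother and "F" for a step to a father.

```python
def usable_test(path: str, tester_sex: str) -> str:
    """Return which direct-line test, if any, connects the tester to the ancestor.

    path: one letter per generation from the tester back to the ancestor,
          'M' for mother and 'F' for father (hypothetical notation).
    tester_sex: 'male' or 'female'.
    """
    if not path or set(path) - {"M", "F"}:
        raise ValueError("path must be a non-empty string of 'M' and 'F'")
    if set(path) == {"M"}:
        return "mitochondrial"   # unbroken mother-to-mother line, any tester
    if set(path) == {"F"} and tester_sex == "male":
        return "Y"               # unbroken father-to-father line, male tester only
    return "neither"             # a different descendant would need to test


# Jessica to the mother of her maternal great-grandmother: four steps, all mothers
print(usable_test("MMMM", "female"))
```

When the answer is "neither," that doesn't mean the line can't be tested. It means you need to find a descendant whose own path to that ancestor is all mothers or all fathers.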

When Y and mitochondrial DNA is tested, a haplogroup is assigned, and Native American haplogroups fall into subgroups of Y haplogroups C and Q, and subgroups of mitochondrial haplogroups A, B, C, D, X and probably M.

With a bit of genealogy work and then DNA testing the appropriate descendants of Jessica’s ancestors, she might still be able to discern whether or not she has Native heritage. All is not lost and Jessica’s Native ancestry has NOT been disproven – even though that’s certainly the impression left with viewers.

Y and Mitochondrial DNA Tests

If you’d like to order a Y or mitochondrial DNA test, I’d recommend the Full Mitochondrial Sequence test or the 37 marker Y DNA test, to begin with. You will receive a full haplogroup designation from the mitochondrial test, plus matching and other tools, and a haplogroup estimate with the Y DNA test, plus matching and other tools.

You can click here to order the mitochondrial DNA, the Y DNA or the Family Finder test which includes ethnicity estimates from Family Tree DNA. Family Tree DNA is the only DNA testing company that performs the Y and mitochondrial DNA tests.

Further Reading:

If you’d like to read more about ethnicity estimates, I’d specifically recommend “DNA Ethnicity Testing – A Conundrum.”

If you’d like more information on how to figure out what your ethnicity estimates should be, I’d recommend Concepts – Calculating Ethnicity Percentages.

You can also search on the word “ethnicity” in the search box in the upper right hand corner of the main page of this blog.

If you’d like to read more about Native American heritage and DNA testing, I’d recommend the following articles. You can also search for “Native” in the search box as well.

How Much Indian Do I Have In Me?

Proving Native American Ancestry Using DNA

Finding Your American Indian Tribe Using DNA

Native American Mitochondrial Haplogroups

Mitochondrial DNA Build 17 Update at Family Tree DNA

I knew the mitochondrial DNA update at Family Tree DNA was coming, I just didn’t know when. The “when” was earlier this week.

Take a look at your mitochondrial DNA haplogroup – it may be different!

Today, this announcement arrived from Family Tree DNA.

We’re excited to announce the release of mtDNA Build 17, the most up-to-date scientific understanding of the human genome, haplogroups and branches of the mitochondrial DNA haplotree.

As a result of these updates and enhancements—the most advanced available for tracing your direct maternal lineage—some customers may see a change to their existing mtDNA haplogroup. This simply means that in applying the latest research, we are able to further refine your mtDNA haplogroup designation, giving you even more anthropological insight into your maternal genetic ancestry.

With the world’s largest mtDNA database, your mitochondrial DNA is of great value in expanding the overall knowledge of each maternal branch’s history and origins. So take your maternal genetic ancestry a step further—sign in to your account now and discover what’s new in your mtDNA!

This is great news. It means that your haplogroup designation is the most up to date according to Phylotree.

I’d like to take this opportunity to answer a few questions that you might have.

What is Phylotree?

Phylotree is, in essence, the mitochondrial tree of humanity. It tracks the mutations that formed the various haplogroups from “Mitochondrial Eve,” the most recent common maternal-line ancestor of everyone living today, forward in time…to you.

You can view the Phylotree here.

For example, if your haplogroup is J1c2f, on Phylotree you would click on haplogroup JT, which includes J. You would then scroll down through all the subgroups to find J1c2f. But that’s after your haplogroup is already determined. Phylotree is the reference source that testing companies use to identify the mutations that define haplogroups in order to assign your haplogroup to you.

It’s All About Mutations

For example, J1c2f has the following mutations at each level, meaning that each mutation(s) further defines a subgroup of haplogroup J.

As you can see, each mutation(s) further refines the haplogroup from J through J1c2f. In other words, if the person didn’t have the mutation G9055A, they would not be J1c2f, but would only be J1c2. If new clusters are discovered in future versions of Phylotree, then someday this person might be J1c2f3z.
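To make that refinement concrete, here is a minimal Python sketch of walking down one branch of the tree. Only G9055A comes from the example above. The other mutation labels are placeholders I made up to stand in for the upstream defining mutations, and the function name is hypothetical too, not anything a testing company publishes.

```python
# Path from haplogroup J down to J1c2f, one level per entry.
# Only G9055A is taken from the text; every "placeholder-*" label is
# hypothetical and stands in for the real defining mutations.
J_BRANCH = [
    ("J", {"placeholder-J"}),
    ("J1", {"placeholder-J1"}),
    ("J1c", {"placeholder-J1c"}),
    ("J1c2", {"placeholder-J1c2"}),
    ("J1c2f", {"G9055A"}),
]


def deepest_haplogroup(found_mutations, branch=J_BRANCH):
    """Return the deepest level whose defining mutations are all present.

    The walk stops at the first level with a missing mutation, because
    each level only refines the one above it.
    """
    found = set(found_mutations)
    assigned = None
    for name, defining in branch:
        if not defining <= found:
            break
        assigned = name
    return assigned


upstream = {"placeholder-J", "placeholder-J1", "placeholder-J1c", "placeholder-J1c2"}
with_g9055a = deepest_haplogroup(upstream | {"G9055A"})
without_g9055a = deepest_haplogroup(upstream)
print(with_g9055a, without_g9055a)
```

The person with G9055A lands on J1c2f, while the person without it stops one level up at J1c2.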

Family Tree DNA provides an easy reference mutations chart here.

What is Build 17?

Research in mitochondrial DNA is ongoing. As additional people test, it becomes clear that new subgroups need to be identified, and in some cases, entire groups are moved to different branches of the tree. For example, if you were previously haplogroup A4a, you are now A1, and if you were previously A4a1 you are now A1a.

Build 17 was released in February of 2016. The previous version, Build 16, was released in February 2014 and Build 15 in September of 2012. Prior to that, there were often multiple releases per year, beginning in 2008.

Vendors and Haplogroups

Unfortunately, because some haplogroups are split, meaning they were previously a single haplogroup that now has multiple branches, a haplogroup update is not simply changing the name of the haplogroup. Some people that were previously all one haplogroup are now members of three different descendant haplogroups. I’m using haplogroup Z6 as an example, because it doesn’t exist, and I don’t want to confuse anyone.

Obviously, the vendors can’t just change Z6 to Z6a, because people that were previously Z6 might still be Z6 or might be Z6a, Z6b or Z6c.

Each vendor that provides haplogroups to clients has to rerun their entire database, so a mitochondrial DNA haplogroup update is not a trivial undertaking and requires a lot of planning.
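Here is a rough Python sketch of why a simple rename won’t work, using the invented Z6 example. The mutation labels mZ6a, mZ6b and mZ6c and the kit names are all hypothetical, as is the function name. The only point is that each person’s stored results have to be looked at again.

```python
# Hypothetical defining mutations for the invented Z6 sub-branches.
NEW_BRANCHES = {"Z6a": "mZ6a", "Z6b": "mZ6b", "Z6c": "mZ6c"}


def reclassify(found_mutations):
    """Re-derive one former Z6 customer's haplogroup from stored results."""
    for branch, mutation in NEW_BRANCHES.items():
        if mutation in found_mutations:
            return branch
    return "Z6"  # none of the new branch mutations is present


# Everyone below was simply "Z6" under the old build.
customers = {
    "kit-1": {"mZ6a"},
    "kit-2": {"mZ6c"},
    "kit-3": set(),
}
updated = {kit: reclassify(found) for kit, found in customers.items()}
print(updated)
```

Three people who all carried the same old label end up with three different answers, which is something no find-and-replace on the name could produce.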

For those of you who also work with Y DNA, this is exactly why the Y haplotree went from haplogroup names like R1b1c to R-M269, where the terminal SNP, or mutation furthest down the tree (that the participant has tested for) is what defines the haplogroup.

If that same approach were applied to mitochondrial DNA, then J1c2f would be known as J-G9055A or maybe J-9055.

Why Version Matters

When comparing haplogroups between people who tested at various vendors, it’s important to understand that they may not be the same. For example, 23andMe, which reports a haplogroup prediction based not on full sequence testing but on a group of probes, is still using Phylotree Build 12 from 2011.

Probe based vendors can update their clients’ haplogroups to some extent, based on the probes they use, which test only specific locations, but they cannot fully refine a haplogroup based on new locations, because their probes never tested those locations. Those locations weren’t known to be haplogroup defining at the time the probes were designed. Even if they redesign their probes, they would have to rerun the actual tests of all of their clients on the new test platform with the new probes.

Full sequence testing at Family Tree DNA eliminates that problem, because they test the entire mitochondrial genome at every location.
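The difference can be sketched in a few lines of Python. The position numbers in the probe set are made up for illustration and the function name is hypothetical. The only figures taken from this article are position 9055 (from G9055A) and the 16,569 total mitochondrial locations.

```python
MT_GENOME_LENGTH = 16_569  # total mitochondrial locations

# A full sequence test covers every location.
full_sequence_positions = set(range(1, MT_GENOME_LENGTH + 1))

# Hypothetical probe set designed before position 9055 was known to
# define a haplogroup, so that location was never tested.
probe_positions = {73, 263, 750, 1438}


def can_evaluate(defining_position, tested_positions):
    """A newly defined branch can only be called if its location was tested."""
    return defining_position in tested_positions


probe_ok = can_evaluate(9055, probe_positions)
full_ok = can_evaluate(9055, full_sequence_positions)
print(probe_ok, full_ok)
```

With the probe set, the answer for position 9055 is not “no mutation” but “never looked,” which is why a new build can’t be fully applied to those results.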

Therefore, it’s important to be familiar with your haplogroup, because you might actually match someone whose haplogroup label looks different from yours. Take our A4a = A1 example: at 23andMe the person would still be A4a, but at Family Tree DNA they would be A1.

If you utilize MitoSearch or if you are looking at mtDNA haplogroups recorded in GedMatch, for example, be aware of the source of the information. If you are utilizing other vendors who provide haplogroup estimates, ask which Phylotree build they are using so you know what to expect and how to compare.

Knowing the history of your haplogroup’s naming will allow you to better evaluate haplogroups found outside of your Family Tree DNA matches.

Build History

You can view the Phylotree Update History at this link, but Build 17 information is not yet available. However, since Family Tree DNA went from Build 14 to Build 17, and other vendors are further behind, the information here is still quite relevant.

Growth

If you’re wondering how much the tree grew, Build 14 defined 3,550 haplogroups and Build 17 identified 5,437. Build 14 utilized and analyzed 8,216 modern mitochondrial sequences, reflected in the 2012 Copernicus paper by Behar et al. Build 17 utilized 24,275 mitochondrial sequences. I certainly hope that the authors will update the Copernicus paper to reflect Build 17. Individuals utilizing the Copernicus paper for haplogroup aging today will have to be cognizant of the difference in haplogroup names.
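For anyone who wants the growth spelled out, this short Python snippet does the arithmetic on the figures quoted above. The variable names are just mine.

```python
# Figures quoted in the text for the two builds.
build_14 = {"haplogroups": 3550, "sequences": 8216}
build_17 = {"haplogroups": 5437, "sequences": 24275}

added_haplogroups = build_17["haplogroups"] - build_14["haplogroups"]
haplogroup_growth_pct = 100 * added_haplogroups / build_14["haplogroups"]
sequence_ratio = build_17["sequences"] / build_14["sequences"]

print(f"{added_haplogroups} new haplogroups ({haplogroup_growth_pct:.0f}% growth)")
print(f"{sequence_ratio:.1f} times as many sequences")
```

That works out to 1,887 additional haplogroups, roughly a 53% increase, drawn from about three times as many sequences.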

Matching

If your haplogroup changed, or the haplogroup of any of your matches changed, your match list may change too. Family Tree DNA utilizes something called SmartMatching, which means that they will not show you as a match to someone who has taken the full sequence test and is not a member of your exact haplogroup. In other words, they will not show a haplogroup J1c2 as a match to a J1c2f, because their common ancestors are separated by thousands of years.

However, if someone has only tested at the HVR1 or HVR1+HVR2 (current mtDNA Plus test) levels and is predicted to be haplogroup J or J1, and they match you exactly on the locations in the regions where you both tested, then you will be shown as a match. If they upgrade and are discovered to be a different haplogroup, then you will no longer be shown as a match at any level.
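The matching behavior described above can be sketched as a small Python function. Family Tree DNA has not published SmartMatching as code, so this is only my simplified reading of the rule. The class, field and function names are hypothetical, and the sketch ignores everything else that goes into deciding whether two sequences match.

```python
from dataclasses import dataclass


@dataclass
class Tester:
    haplogroup: str      # assigned or predicted haplogroup
    full_sequence: bool  # True if the full sequence test was taken


def shown_as_match(a, b, shared_regions_match):
    """Simplified reading of the rule described in the text.

    shared_regions_match: True when the two testers match exactly on the
    regions both of them have tested.
    """
    if not shared_regions_match:
        return False
    if a.full_sequence and b.full_sequence:
        # Two full sequence testers must share the exact haplogroup.
        return a.haplogroup == b.haplogroup
    # A lower-level prediction such as J or J1 is not treated as a conflict.
    return True


you = Tester("J1c2f", full_sequence=True)
hvr_only = Tester("J1", full_sequence=False)
after_upgrade = Tester("J1c2", full_sequence=True)

before = shown_as_match(you, hvr_only, shared_regions_match=True)
after = shown_as_match(you, after_upgrade, shared_regions_match=True)
print(before, after)
```

The HVR-level tester predicted as J1 is shown as a match, but once that same person upgrades and turns out to be J1c2, the match disappears.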

Genographic Project

If you tested with the Genographic Project prior to November of 2016, your haplogroup may be different than the Family Tree DNA haplogroup. Family Tree DNA provided the following information:

The differences can be caused by the level of testing done, which phase of the Genographic project that you tested, and when.

  • Geno 1 tested all of HVR1.
  • Geno 2 tested a selection of SNPs across the mitochondrial genome to give a more refined haplogroup using Build 14.
  • Geno 2+ used an updated selection of SNPs across the mitochondrial genome using Build 16.

If you have HVR1 either transferred from the Genographic Project or from the FTDNA product mtDNA, you will have a basic, upper-level haplogroup.

If you tested mtDNA Plus with FTDNA, which is HVR1 + HVR2, you will have a basic, upper-level haplogroup.

If you tested the Full Mitochondrial Sequence with Family Tree DNA, your haplogroup will reflect the full Build 17 haplogroup, which may be different from either the Geno 2 or Geno 2+ haplogroup because of the number and selection of SNPs tested in the Genographic Project, or because of the build difference between Geno 2+ and FTDNA.

Thank You

I want to say a special thank you to Family Tree DNA.

I know that there is a lot of chatter about the cost of mitochondrial DNA testing as compared to autosomal, which is probe testing. It’s difficult for a vendor to maintain a higher quality, more refined product when competing against a lower cost competitor that appears, at first glance, to give the same thing for less money. The key of course is that it’s not really the same thing.

The higher cost reflects the fact that the full sequence mitochondrial test uses different technology to test all 16,569 mitochondrial DNA locations individually, determining whether each one holds the expected reference value, a mutation, a deletion or an insertion.

Because Family Tree DNA tests every location individually, when new haplogroups are defined, your mitochondrial DNA haplogroup can be updated to reflect any new haplogroup definition, based on any of those 16,569 locations, or combinations of locations. Probe testing in conjunction with autosomal DNA testing can’t do this because the nature of probe testing is to test only specific locations for a value, meaning that probe tests test only known haplogroup defining locations at the time the probe test was designed.

So, thank you, Family Tree DNA, for continuing to test the full mitochondrial sequence, thank you for the updated Build 17 for refined haplogroups, and thank you for answering additional questions about the update.

Testing

If you haven’t yet tested your mitochondrial DNA at the full sequence level, now’s a great time!

If you have tested at the HVR1 or the HVR1+HVR2 levels, you can upgrade to the full sequence test directly from your account. For the next week, upgrades are only $99.

There are two mtDNA tests available today: the mtPlus, which tests only through the HVR1+HVR2 level, or about 7% of your mitochondrial DNA locations, and the mtFull Sequence, which tests your entire mitochondrial genome, all 16,569 locations.

Click here to order or upgrade.