Mitochondrial DNA Resources – Everything You Need to Know

Mitochondrial DNA Resources

Recently, I wrote a multi-part series about mitochondrial DNA – start to finish – everything you need to know.

I’ve assembled several articles in one place, and I’ll add any new articles here as well.

Please feel free to share this resource or any of the links to individual articles with friends, genealogy groups or on social media.

What’s the Difference Between Mitochondrial and Other Types of DNA?

Mitochondrial DNA is inherited directly from your matrilineal line, only, meaning your mother’s mother’s mother’s mother – on up your family tree until you run out of direct line mothers that you’ve identified. The great news is even if you don’t know the identities of those people in your tree, you carry their mitochondrial DNA which can help identify them.

Here’s a short article about the different kinds of DNA that can be used for genealogy.

Why Mitochondrial DNA?

Let’s start out with why someone might want to test their mitochondrial DNA.

After you purchase a DNA test, swab and return the kit, the lab processes your test. When processing is finished, you’ll receive your results on your personal page at FamilyTreeDNA, the only company that tests mitochondrial DNA at the full sequence level and provides matching with tens of thousands of other testers.

What About Those Results?

People want to understand how to use all of the different information provided to testers. These articles provide a step-by-step primer.

Mitochondrial DNA personal page

Sign in to your FamilyTreeDNA account and use these articles as a guideline to step through your results on your personal page.

We begin with an overview. What is mitochondrial DNA, how is it inherited and why is it useful for genealogy?

Next, we look at your results and decode what all the numbers mean. It’s easy, really!

Our ancestors lived in clans, and our mitochondrial DNA has its own versions of clans too – called haplogroups. Your full haplogroup can be very informative.

Sometimes there’s more than meets the eye. Here are my own tips and techniques for more than doubling the usefulness of your matches.

You’ll want to wring every possible advantage out of your tests, so be sure to join relevant projects and use them to their fullest extent.

Do you know how to utilize advanced matching? It’s a very powerful tool. If not, you will after these articles.

Mitochondrial DNA Information for Everyone

FamilyTreeDNA maintains an extensive public mitochondrial DNA tree, complete with countries of origin for all branches. You don’t need to have tested to enjoy the public tree.

However, if you have tested, take a look to see where the earliest known ancestors of your haplogroup matches are located based on the country flags.

Mitochondrial resources haplotree

These are mine. Where are yours?

What Can Mitochondrial DNA Do for You?

Some people mistakenly think that mitochondrial DNA isn’t useful for genealogy. I’m here to testify that it’s not only useful, it’s amazing! Here are three stories from my own genealogy about how I’ve used mitochondrial DNA to learn more about my ancestors and in some cases, break right through brick walls.

It’s not only your own mitochondrial DNA that’s important, but other family members too.

My cousin tested her mitochondrial DNA to discover that her direct matrilineal ancestor was Native American, much to her surprise. The great news is that her ancestor is my ancestor too!

Searching for Native American Ancestors?

If you’re searching for Native American or other particular ancestors, mitochondrial DNA can tell you specifically whether your mitochondrial DNA, or that of your ancestors (if you test a direct matrilineal descendant), is Native, African, European, Jewish or Asian. Furthermore, your matches provide clues as to what country your ancestor might be from and sometimes which regions too.

Did you know that people from different parts of the world have distinctive haplogroups?

You can discover your ancestors’ origins through their mitochondrial DNA.

You can even utilize autosomal segment information to track back in time to the ancestor you seek. Then you can obtain that ancestor’s mitochondrial DNA by selectively testing their descendants or finding people who have already tested that descend from that ancestor. Here’s how.

You never know what you’re going to discover when you test your mitochondrial DNA. I discovered that although my earliest known matrilineal ancestor is found in Germany, her ancestors were from Scandinavia. My cousin discovered that our common ancestor is Mi’kmaq.

What secrets will your mitochondrial DNA reveal?

You can test or upgrade your mitochondrial DNA by clicking here.

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

Native American & Minority Ancestors Identified Using DNAPainter Plus Ethnicity Segments

Ethnicity is always a ticklish subject. On one hand we say to be leery of ethnicity estimates, but on the other hand, we all want to know who our ancestors were and where they came from. Many people hope to prove or disprove specific theories or stories about distant ancestors.

Reasons to be cautious about ethnicity estimates include:

  • Within continents, like Europe, it’s very difficult to discern ethnicity at the “country” level because of thousands of years of migration across regions where borders exist today. Ethnicity estimates within Europe can be significantly different than known and proven genealogy.
  • “Countries,” in Europe, political constructs, are the same size as many states in the US – and differentiation between those populations is almost impossible to accurately discern. Think of trying to figure out the difference between the populations of Indiana and Illinois, for example. Yet we want to be able to tell the difference between ancestors that came from France and Germany, for example.

Ethnicity states over Europe

  • All small amounts of ethnicity, even at the continental level, under 2-5%, can be noise and might be incorrect. That’s particularly true of trace amounts, 1% or less. However, that’s not always the case – which is why companies provide those small percentages. When hunting ancestors in the distant past, that small amount of ethnicity may be the only clue we have as to where they reside at detectable levels in our genome.

Noise in this case is defined as:

  • A statistical anomaly
  • A chance combination of your DNA from both parents that matches a reference population
  • Issues with the reference population itself, specifically admixture
  • Perhaps combinations of the above

You can read about the challenges with ethnicity here and here.

On the Other Hand

Having restated the appropriate caveats, on the other hand, we can utilize legitimate segments of our DNA to identify where our ancestors came from – at the continental level.

I’m specifically referring to Native American admixture, which is the example I’ll be using, but this process applies equally well to other minority or continental level admixture. Minority, in this sense, means minority ethnicity to you.

Native American ethnicity shows distinctly differently from African and European. Sometimes some segments of DNA that we inherit from Native American ancestors are reported as Asian, specifically Siberian, Northern or Eastern Asian.

Remember that the Native American people arrived as a small group via Beringia, a now flooded land bridge that once connected Siberia with Alaska.

beringia map

By Erika Tamm et al – Tamm E, Kivisild T, Reidla M, Metspalu M, Smith DG, et al. (2007) Beringian Standstill and Spread of Native American Founders. PLoS ONE 2(9): e829. doi:10.1371/journal.pone.0000829. Also available from PubMed Central., CC BY 2.5, https://commons.wikimedia.org/w/index.php?curid=16975303

After that time, the Native American/First Nations peoples were isolated from Asia, for the most part, and entirely from Europe until European exploration resulted in the beginning of sustained European settlement, and admixture beginning in the late 1400s and 1500s in the Americas.

Family Inheritance

Testing multiple family members is extremely useful when working with your own personal minority heritage. This approach assumes that you’d like to identify your matches that share that genetic heritage because they share the same minority DNA that you do. Of course, that means you two share the same ancestor at some time in the past. Their genealogy, or your combined information, may hold the clue to identifying your ancestor.

In my family, my daughter has Native American segments that she inherited from me that I inherited from my mother.

Finding the same segment identified as Native American in several successive generations eliminates the possibility that the chance combination of DNA from your father and mother is “appearing” as Native, when it isn’t.
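If you like to see the logic spelled out, here is a minimal sketch of that multi-generation check in Python. The `Segment` type, the function names and every chromosome position are made up for illustration; they are not taken from any vendor's file. The idea is simply that a segment reported as Native in the daughter, the parent and the grandparent should overlap at the same chromosome address in all three.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    chromosome: str
    start: int  # start position on the chromosome (hypothetical values below)
    end: int    # end position on the chromosome


def overlap(a: Segment, b: Segment) -> Optional[Segment]:
    """Return the region shared by two segments, or None if they don't overlap."""
    if a.chromosome != b.chromosome:
        return None
    start = max(a.start, b.start)
    end = min(a.end, b.end)
    if start >= end:
        return None
    return Segment(a.chromosome, start, end)


def shared_across_generations(*generations: Segment) -> Optional[Segment]:
    """Intersect one segment per generation; None means the chain is broken."""
    if not generations:
        raise ValueError("provide at least one segment")
    common: Optional[Segment] = generations[0]
    for segment in generations[1:]:
        common = overlap(common, segment)
        if common is None:
            return None
    return common


# Hypothetical Native-labeled segments for three successive generations.
daughter = Segment("1", 165_000_000, 171_000_000)
me = Segment("1", 160_000_000, 175_000_000)
mother = Segment("1", 158_000_000, 180_000_000)

print(shared_across_generations(daughter, me, mother))
```

A result of `None` would mean the segment does not carry through every generation, which is a reason to treat it with more suspicion.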

We can use segment information to our benefit, especially if we don’t know exactly who contributed that DNA – meaning which ancestor.

We need to find a way to utilize those Native or other minority segments genealogically.

23andMe

Today, the only DNA testing vendor that provides consumers with a segment identification of our ethnicity predictions is 23andMe.

If you have tested at 23andMe, sign in and click on Ancestry on the top tab, then select Ancestry Composition.

Minority ethnicity ancestry composition.png

Scroll down until you see your painted chromosomes.

Minority ethnicity chromosome painting.png

By clicking on the region at left that you want to see, the rest of the regions are greyed out and only that region is displayed on your chromosomes, at right.

Minority ethnicity Native.png

According to 23andMe, I have two Native segments, one each on chromosomes 1 and 2. They show these segments on opposite chromosomes, meaning one (the top for example) would be maternal or paternal, and the bottom one would be the opposite. But 23andMe apparently could not tell for sure because neither my mother nor father have tested there. This placement also turned out to be incorrect. The above image was my initial V3 test at 23andMe. My later V4 results were different.

Versions May Differ

Please note that your ethnicity predictions may be different based on which test you took which is dictated by when you took the test. The image above is my V3 test that was in use at 23andMe between 2010 and November 2013, and the image below is my V4 test in use between November 2013 and August 2017.

23andMe apparently does not correct original errors involving what is known as “strand swap” where the maternal and paternal segments are inverted during analysis. My V4 test results are shown below, where the strands are correctly portrayed.

Minority ethnicity Native V4.png

Note that both Native segments are now on the lower chromosome “side” of the pair and the position on the chromosome 1 segment has shifted visually.

Minority ethnicity sides.png

I have not tested at 23andMe on the current V5 GSA chip, in use since August 9, 2017, but perhaps I should. The results might be different yet, with the concept being that each version offers an improvement over earlier versions as science advances.

If your parents have tested, 23andMe makes adjustments to your ethnicity estimates accordingly.

Although my mother can’t test at 23andMe, I happen to already know that these Native segments descend from my mother based on genealogical and genetic analysis, combined. I’m going to walk you through the process.

I can utilize my genealogy to confirm or refute information shown by 23andMe. For example, if one of those segments comes from known ancestors who were living in Germany, it’s clearly not Native, and it’s noise of some type.

We’re going to utilize DNAPainter to determine which ancestors contributed your minority segments, but first you’ll need to download your ethnicity segments from 23andMe.

Downloading Ethnicity Segment Data

Downloading your ethnicity segments is NOT THE SAME as downloading your raw DNA results to transfer to another vendor. Those are two entirely different files and different procedures.

To download the locations of your ethnicity segments at 23andMe, scroll down below your painted ethnicity segments in your Ancestry Composition section to “View Scientific Details.”

Minority ethnicity scientific details.png

Click on View Scientific Details and scroll down to near the bottom and then click on “Download Raw Data.” I leave mine at the 50% confidence level.

Minority ethnicity download raw data.png

Save this spreadsheet to your computer in a known location.

In the spreadsheet, you’ll see columns that provide the name of the segment, the chromosome copy number (1 or 2) and the chromosome number with start and end locations.

Minority ethnicity download.png

You really don’t care about this information directly, but DNAPainter does and you’ll care a lot about what DNAPainter does for you.
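For the curious, here is a rough Python sketch of pulling just the minority rows out of that spreadsheet. The column headers (`Ancestry`, `Copy`, `Chromosome`, `Start Point`, `End Point`) and the sample rows are my assumptions about the layout, based on the columns described above, so check them against your own file before relying on this. All positions are invented.

```python
import csv
import io

# Made-up sample in the assumed layout: segment name, copy (1 or 2),
# chromosome, start and end locations.
SAMPLE = """Ancestry,Copy,Chromosome,Start Point,End Point
European,1,chr1,1,249250621
Native American,2,chr1,165000000,171000000
Broadly East Asian & Native American,2,chr1,164000000,172000000
Native American,2,chr2,98000000,104000000
"""


def minority_segments(csv_text, keyword="Native American"):
    """Return (chromosome, copy, start, end, label) for rows whose label contains keyword."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if keyword in row["Ancestry"]:
            rows.append((
                row["Chromosome"],
                int(row["Copy"]),
                int(row["Start Point"]),
                int(row["End Point"]),
                row["Ancestry"],
            ))
    return sorted(rows)


for segment in minority_segments(SAMPLE):
    print(segment)
```

To read a real download, you would pass the file's contents as `csv_text`, for example `open(path, newline="").read()`, where `path` is wherever you saved the spreadsheet.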

DNAPainter

I wrote introductory articles about DNAPainter:

If you’re not familiar with DNAPainter, you might want to read these articles first and then come back to this point in this article.

Go ahead – I’ll wait!

Getting Started

If you don’t have a DNAPainter account, you’ll need to create one for free. Some features, such as having multiple profiles, are subscription based, but the functionality you’ll need for one profile is free.

I’ve named this example profile “Ethnicity Demo.” You’ll see your name where mine says “Ethnicity Demo.”

Minority ethnicity DNAPainter.png

Click on “Import 23andme ancestry composition.”

You will copy and paste all the spreadsheet rows in the entire downloaded 23andMe ethnicity spreadsheet into the DNAPainter text box and make your selection, below. The great news is that if you discover that your assumption about copy 1 being maternal or paternal is incorrect, it’s easy to delete the ethnicity segments entirely and simply repaint later. Ditto if 23andMe changes your estimate over time, like they have mine.

Minority ethnicity DNAPainter sides.png

I happen to know that “copy 2” is maternal, so I’ve made that selection.

You can then see your ethnicity chromosome segments painted, and you can expand each one to see the detail. Click on “Save Segments.”

Minority ethnicity DNAPainter Native painting

In this example, you can see my Native segments, called by various names at different confidence levels at 23andMe, on chromosome 1.

Depending on the confidence level, these segments are called some mixture of:

  • East Asian & Native American
  • North Asian & Native American
  • Native American
  • Broadly East Asian & Native American

It’s exactly the same segment, so you don’t really care what it’s called. DNAPainter paints all of the different descriptions provided by 23andMe, at all confidence levels as you can see above.

The DNAPainter colors are different from 23andMe colors and are system-selected. You can’t assign the colors for ethnicity segments.

Now, I’m moving to my own profile that I paint with my ancestral segments. To date, I have 78% of my segments painted by identifying cousins with known common ancestors.

On chromosomes 1 and 2, copy 2, which I’ve determined to be my mother’s “side,” these segments track back to specific ancestors.

Minority ethnicity maternal side

Chromosome 1 segments, above, track back to the Lore family, descended from Antoine (Anthony) Lore (Lord) who married Rachel Hill. Antoine Lore was Acadian.

Minority ethnicity chromosome 1.png

Clicking on the green segment bar shows me the ancestors I assigned when I painted the match with my Lore family member whose name is blurred, but whose birth surname was Lore.

The Chromosome 2 segment, below, tracks back to the same family through a match to Fred.

Minority ethnicity chromosome 2.png

My common ancestors with Fred are Honore Lore and Marie Lafaille who are the parents of Antoine Lore.

Minority ethnicity common ancestor.png

There are additional matches on both chromosomes who also match on portions of the Native segments.

Now that I have a pointer in the ancestral direction that these Native American segments arrived from, what can traditional genealogy and other DNA information tell me?

Traditional Genealogy Research

The Acadian people were a mixture of English, French and Native American. The Acadians settled on the island of Nova Scotia in 1609 and lived there until being driven out by the English in 1755, roughly 6 or 7 generations later.

Minority ethnicity Acadian map.png

The Acadians intermarried with the Mi’kmaq people.

It had been reported by two very qualified genealogists that Philippe Mius, born in 1660, married two Native American women from the Mi’kmaq tribe, both given the name Marie.

The French were fond of giving the first name of Marie to Native women when they were baptized in the Catholic faith which was required before the French men were allowed to marry the Native women. There were many Native women named Marie who married European men.

Minority ethnicity Native mitochondrial tree

This Mius lineage is ancestral to Antoine Lore (Lord) as shown on my pedigree, above.

Mitochondrial DNA has revealed that descendants from one of Philippe Mius’s wives, Marie, carry haplogroup A2f1a.

However, mitochondrial tests show that other descendants of “Marie,” his first wife, carry haplogroup X2a2, also Native American.

Confusion has historically existed over which Marie is the mother of my ancestor, Francoise.

Karen Theroit Reader, another professional genealogist, shows Francoise Mius as the last child born to the first Native wife before her death sometime after 1684 and before about 1687 when Philippe remarried.

However, relative to the source of Native American segments, whether Francoise descends from the first or second wife doesn’t matter in this instance because both are Native and are proven so by their mitochondrial DNA haplogroups.

Additionally, on Antoine’s mother’s side, we find a Doucet male, although there are two genetic male Doucet lines, one of European origin, haplogroup R-L21, and one, surprisingly, of Native origin, haplogroup C-P39. Both are proven by their respective haplogroups but confusion exists genealogically over who descends from which lineage.

On Antoine’s mother’s side, there are several unidentified lineages, any one or multiples of which could also be Native. As you can see, there are large gaps in my tree.

We do know that these Native segments arrived through Antoine Lore and his parents, Honore Lore and Marie LaFaille. We don’t know exactly who upstream contributed these segments – at least not yet. Painting additional matches attributable to specific ancestral couples will eventually narrow the candidates and allow me to walk these segments back in time to their rightful contributor.

Segments, Traditional Research and DNAPainter

Together, these three tools create a triangulated ethnicity segment: continent-level ethnicity segments, traditional research, and DNAPainter, where you paint the DNA segments of known cousins who match specific lineages.

When that segment just happens to be genealogically important, this combination can point the researchers in the right direction knowing which lines to search for that minority ancestor.

If your cousins who match you on this segment have also tested with 23andMe, they should also be identified as Native on this same segment. This process does not apply to intracontinental segments, meaning within Europe, because the admixture is too great and the ethnicity predictions are much less reliable.

When identifying minority admixture at the continental level, it’s very important to add Y and mitochondrial DNA testing to the mix in order to positively identify each individual ancestor’s Y and mitochondrial DNA. These tests can confirm or eliminate candidates in ways that autosomal DNA and genealogy records alone can’t. The base haplogroup as assigned at 23andMe is a good start, but it’s not enough alone. Plus, we each carry only one line of mitochondrial DNA, and only males carry Y DNA, which represents only their direct paternal line.

We need Y and mitochondrial DNA matching at FamilyTreeDNA to verify the specific lineage. Additionally, we very well may need the Y and mitochondrial DNA information that we don’t directly carry – but other cousins do. You can read about Y and mitochondrial DNA testing, here.

I wrote about creating a personal DNA pedigree chart including your ancestors’ Y and mitochondrial DNA here. In order to find people descended from a specific ancestor who have DNA tested, I utilize:

  • WikiTree resources and trees
  • Geni trees
  • FamilySearch trees
  • FamilyTreeDNA autosomal matches with trees
  • AncestryDNA autosomal matches and their associated trees
  • Ancestry trees in general, meaning without knowing if they are related to a DNA match
  • MyHeritage autosomal matches and their trees
  • MyHeritage trees in general

At both MyHeritage and Ancestry, you can view the trees of your matches, but you can also search for ancestors in other people’s trees to see who might descend appropriately to provide a Y or mitochondrial DNA sample. You will probably need a subscription to maximize these efforts. MyHeritage offers a free trial subscription here.

If you find people appropriately descended through WikiTree, Geni or FamilySearch, you’ll need to discuss DNA testing with them. They may have already tested someplace.

If you find people who have DNA tested through your DNA matches with trees at Ancestry and MyHeritage, you’ll need to offer a Y or mitochondrial DNA test to them if they haven’t already tested at FamilyTreeDNA.

FamilyTreeDNA is the only vendor who provides the Y DNA and mitochondrial DNA tests at the higher resolution level, beyond base haplogroups, required for matching and for a complete haplogroup designation.

If the person has taken the Family Finder autosomal test at FamilyTreeDNA, they may have already tested their Y DNA and mtDNA, or you can offer to upgrade their test.

Projects

Checking projects at FamilyTreeDNA can be particularly useful when trying to discover if anyone from a specific lineage has already tested. There are many special interest projects such as the Acadian AmerIndian Ancestry project, the American Indian project, haplogroup projects, surname projects and more.

You can view projects alphabetically here or you can click here to scroll down to enter the surname or topic you are seeking.

Minority ethnicity project search.png

If the topic isn’t listed, check the alphabetic index under Geographical Projects.

23andMe Maternal and Paternal Sides

If possible, you’ll want to determine which “side” of your family your minority segments come from, unless they come from both. You’ll want to determine whether chromosome side 1 or 2 is maternal, because the other one will be paternal.

23andMe doesn’t offer tree functionality in the same way as other vendors, so you won’t be able to identify people there descended from your ancestors without contacting each person or doing other sleuthing.

Recently, 23andMe added a link to FamilySearch that creates a list of your ancestors from their mega-shared tree for 7 generations, but there is no tree matching or search functionality. You can read about the FamilySearch connection functionality here.

So, how do you figure out which “side” is which?

Minority ethnicity minority segment.png

The chart above represents the portion of your chromosomes that contains your minority ancestry. Initially, you don’t know if the minority segment is your mother’s pink chromosome or your father’s blue chromosome. You have one chromosome from each parent with the exact same addresses or locations, so it’s impossible to tell which side is which without additional information. Either the pink or the blue segment is minority, but how can you tell?

In my case, the family oral history regarding Native American ancestry was from my father’s line, but the actual Native segments wound up being from my mother, not my father. Had I made an assumption, it would have been incorrect.

Fortunately, in our example, you have both a maternal and paternal aunt who have tested at 23andMe. You match both aunts on that exact same segment location – one from your father’s side, blue, and one from your mother’s side, pink.

You compare your match with your maternal aunt and verify that indeed, you do match her on that segment.

You’ll want to determine if 23andMe has flagged that segment as Native American for your maternal aunt too.

You can view your aunt’s Ancestry Composition by selecting your aunt from the “Your Connections” dropdown list above your own ethnicity chromosome painting.

Minority ethnicity relative connections.png

You can see on your aunt’s chromosomes that indeed, those locations on her chromosomes are Native as well.

Minority ethnicity relative minority segments.png

Now you’ve identified your minority segment as originating on your maternal side.

Minority ethnicity Native side.png

Let’s say you have another match, Match 1, on that same segment. You can easily tell which “side” Match 1 is from. Since you know that you match your maternal aunt on that minority segment, if Match 1 matches both you and your maternal aunt, then you know that’s the side the match is from – AND that person also shares that minority segment.

You can also view that person’s Ancestry Composition, but shared matching is more reliable, especially when dealing with small amounts of minority admixture.

Another person, Match 2, matches you on that same segment, but this time, the person matches you and your paternal aunt, so they don’t share your minority segment.

Minority ethnicity match side.png

Even if your paternal aunt had not tested, because Match 2 does not match you AND your maternal aunt, you know Match 2 doesn’t share your minority segment which you can confirm by checking their Ancestry Composition.
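The reasoning with Match 1 and Match 2 boils down to a simple set comparison. Here is a small Python sketch of it. The function name, the dictionary labels and the match names are hypothetical, and it assumes you have already listed who matches you and who matches your maternal aunt on that one segment location.

```python
from typing import Dict, List, Set


def split_by_side(my_matches: Set[str], maternal_aunt_matches: Set[str]) -> Dict[str, List[str]]:
    """Sort people who match you on one segment by whether the maternal aunt matches them too.

    Both arguments are names of people matching on the SAME segment location.
    """
    return {
        # Match you AND the maternal aunt: same side as the minority segment.
        "maternal side, shares minority segment": sorted(my_matches & maternal_aunt_matches),
        # Match you but not the maternal aunt: the other copy of the chromosome.
        "other side, does not share it": sorted(my_matches - maternal_aunt_matches),
    }


# Hypothetical names for illustration only.
matches_me = {"Match 1", "Match 2", "Match 3"}
matches_maternal_aunt = {"Match 1", "Match 3", "Someone Else"}

for side, names in split_by_side(matches_me, matches_maternal_aunt).items():
    print(side, names)
```

This is only as good as the matches going in, so it’s still worth confirming against each person’s Ancestry Composition.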

Download All of Your Matches

Rather than go through your matches one by one, it’s easiest to download your entire match list so you can see which people match you on those chromosome locations.

Minority ethnicity download aggregate data.png

You can click on “Download Aggregate Data” at 23andMe, at the bottom of your DNA Relatives match list to obtain all of your matches who are sharing with you. 23andMe limits your matches to 2000 or fewer, the actual number being your highest 2000 matches minus the people who aren’t sharing. I have 1465 matches showing and that number decreases regularly as new testers at 23andMe are focused on health and not genealogy, meaning lower matches get pushed off the list of 2000 match candidates.

You can quickly sort the spreadsheet to see who matches you on specific segments. Then, you can check each match in the system to see if that person matches you and another known relative on the minority segments or you can check their Ancestry Composition, or both.
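If you would rather let a script do the sorting, here is a hedged Python sketch. The column headers (`Display Name`, `Chromosome Number`, `Chromosome Start Point`, `Chromosome End Point`), the cousin names and all positions are assumptions for illustration, so compare them with the headers in your own download first.

```python
import csv
import io

# Made-up match rows in an assumed layout.
SAMPLE_MATCHES = """Display Name,Chromosome Number,Chromosome Start Point,Chromosome End Point
Cousin A,2,97000000,105000000
Cousin B,1,166000000,170000000
Cousin C,1,20000000,30000000
Cousin D,2,103500000,120000000
"""

# Hypothetical minority segments: (chromosome, start, end).
MINORITY_SEGMENTS = [("1", 165_000_000, 171_000_000), ("2", 98_000_000, 104_000_000)]


def overlapping_matches(csv_text, segments):
    """Return (name, chromosome, shared_length) for matches overlapping any segment."""
    hits = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if not row["Chromosome Start Point"] or not row["Chromosome End Point"]:
            continue  # skip rows without segment data
        chrom = row["Chromosome Number"]
        start = int(row["Chromosome Start Point"])
        end = int(row["Chromosome End Point"])
        for seg_chrom, seg_start, seg_end in segments:
            shared = min(end, seg_end) - max(start, seg_start)
            if chrom == seg_chrom and shared > 0:
                hits.append((row["Display Name"], chrom, shared))
    # Largest overlap first, so the most promising matches are at the top.
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


for hit in overlapping_matches(SAMPLE_MATCHES, MINORITY_SEGMENTS):
    print(hit)
```

The people at the top of that list are the ones to check first against a known relative or their Ancestry Composition.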

If they share your minority segment, then you can check their tree link if they have one, included in the download, their FamilySearch information if included on their account, or reach out to them to see if you might share a known ancestor.

The key to making your ethnicity segment work for you is to identify ancestors and paint known matches.

Paint Those Matches

When searching for matches whose DNA you can attribute to specific ancestors, be sure to check all 4 places that provide segment information that you can paint:

  • 23andMe
  • FamilyTreeDNA
  • MyHeritage
  • GedMatch

At GedMatch, you’ll find some people who have tested at the other various vendors, including Ancestry, but unfortunately not everyone uploads. Ancestry doesn’t provide segment information, so you won’t be able to paint those matches directly from Ancestry.

If your Ancestry matches transfer to GedMatch, FamilyTreeDNA or MyHeritage you can view your match and paint your common segments. At GedMatch, Ancestry kit numbers begin with an A. I use my Ancestry kit matches at GedMatch to attempt to figure out who that match is at Ancestry in order to attempt to figure out the common ancestor.

To Paint, You Must Test

Of course, in order to paint your matches that you find in various databases, you need to be in those databases, meaning you either need to test there or transfer your DNA file.

Transfers

If you’d like to test your DNA at one vendor and download the file to transfer to another vendor, or GedMatch, that’s possible with both FamilyTreeDNA and MyHeritage who both accept uploads.

You can transfer kits from Ancestry and 23andMe to both FamilyTreeDNA and MyHeritage for free, although the chromosome browsers, advanced tools and ethnicity require an unlock fee (or alternatively a subscription at MyHeritage). Still, the free transfer and unlock for $19 at FamilyTreeDNA or $29 at MyHeritage is less than the cost of testing.

Here’s a quick cheat sheet.

DNA vendor transfer cheat sheet 2019

From time to time, as vendor file formats change, the ability to transfer is temporarily interrupted, but it costs nothing to try a transfer to either MyHeritage or FamilyTreeDNA, or better yet, both.

In each of these articles, I wrote about how to download your data from a specific vendor and how to upload from other vendors if they accept uploads.

Summary Steps

In order to use your minority ethnicity segments in your genealogy, you need to:

  1. Test at 23andMe
  2. Identify which parental side your minority ethnicity segments are from, if possible
  3. Download your ethnicity segments
  4. Establish a DNAPainter account
  5. Upload your ethnicity segments to DNAPainter
  6. Paint matches of people with whom you share known common ancestors utilizing segment information from 23andMe, FamilyTreeDNA, MyHeritage and AncestryDNA matches who have uploaded to GedMatch
  7. If you have not tested at either MyHeritage or FamilyTreeDNA, upload your 23andMe file to either vendor for matching, along with GedMatch
  8. Focus on those minority segments to determine which ancestral line they descend through in order to identify the ancestor(s) who provided your minority admixture.

Have fun!


First Steps When Your DNA Results are Ready – Sticking Your Toe in the Genealogy Water

First steps helix

Recently someone asked me what the first steps would be for a person who wasn’t terribly familiar with genealogy and had just received their DNA test results.

I wrote an article called DNA Results – First Glances at Ethnicity and Matching which was meant to show new folks what the various vendor interfaces look like. I was hoping this might whet their appetites for more, meaning that the tester might, just might, stick their toe into the genealogy waters 😊

I’m hoping this article will help them get hooked! Maybe that’s you!

A Guide

This article can be read in one of two ways – as an overview, or, if you click the links, as a pretty thorough lesson. If you’re new, I strongly suggest reading it as an overview first, then a second time as a deeper dive. Use it as a guide to navigate your results as you get your feet wet.

I’ll be hotlinking to various articles I’ve written on lots of topics, so please take a look at details (eventually) by clicking on those links!

This article is meant as a guideline for what to do, and how to get started with your DNA matching results!

If you’re looking for ethnicity information, check out the First Glances article, plus here and here and here.

Concepts – Calculating Ethnicity Percentages provides you with guidelines for how to estimate your own ethnicity percentages based on your known genealogy and Ethnicity Testing – A Conundrum explains how ethnicity testing is done.

OK, let’s get started. Fun awaits!

The Goal

The goal for using DNA matching in genealogy depends on your interests.

  1. To discover cousins and family members that you don’t know. Some people are interested in finding and meeting relatives who might have known their grandparents or great-grandparents in the hope of discovering new family information or photos they didn’t know existed previously. I’ve been gifted with my great-grandparent’s pictures, so this strategy definitely works!
  2. To confirm ancestors. This approach presumes that you’ve done at least a little genealogy, enough to construct at least a rudimentary tree. Ancestors are “confirmed” when you DNA match multiple other people who descend from the same ancestor through multiple children. I wrote an article, Ancestors: What Constitutes Proof?, discussing how much evidence is enough to actually confirm an ancestor. Confirmation is based on a combination of both genealogical records and DNA matching and it varies depending on the circumstances.
  3. Adoptees and people with unknown parents seeking to discover the identities of those people aren’t initially looking at their own family tree – because they don’t have one yet. The genealogy of others can help them figure out the identity of those mystery people. I wrote about that technique in the article, Identifying Unknown Parents and Individuals Using DNA Matching.

DNAAdoption for Everyone

Educational resources for adoptees and non-adoptees alike can be found at www.dnaadoption.org. DNAAdoption is not just for adoptees and provides first rate education for everyone. They also provide trained and mentored search angels for adoptees who understand the search process along with the intricacies of navigating the emotional minefield of adoption and unknown parent searches.

“First Look” classes for each vendor are free for everyone at DNAAdoption and are self-paced, downloadable onto your computer as a pdf file. Intro to DNA, Applied Autosomal DNA and Y DNA Basics classes are nominally priced at between $29 and $49 and I strongly recommend these. DNAAdoption is entirely non-profit, so your class fee or contribution supports their work. Additional resources can be found here and their 12 adoptee search steps here.

Ok, now let’s look at your results.

Matches are the Key

Regardless of your goal, your DNA matches are the key to finding answers, whether you want to make contact with close relatives, prove your more distant ancestors or you’re involved in an adoptee or unknown parent search.

Your DNA matches that of other people because each of you inherited a piece of DNA, called a segment, where many locations are identical. The length of that DNA segment is measured in centiMorgans and those locations are called SNPs, or single nucleotide polymorphisms. You can read about the definition of a centiMorgan and how centiMorgans are used in the article Concepts – CentiMorgans, SNPs and Pickin’ Crab.

While the scientific details are great, they aren’t important initially. What is important is to understand that the more closely you match someone, the more closely you are related to them. You share more DNA with close relatives than more distant relatives.

For example, I share exactly half of my mother’s DNA, but only about 25% of each of my grandparents’ DNA. As the relationships move further back in time, I share less and less DNA with other people who descend from those same ancestors.
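
To make that halving concrete, here is a minimal Python sketch of the average amount of autosomal DNA you would expect to carry from one ancestor a given number of generations back. The function name is my own, invented for illustration, and the numbers are averages only. Beyond your parents, the amount you actually inherited from any one ancestor varies.

```python
def expected_ancestor_share(generations_back: int) -> float:
    """Average fraction of your autosomal DNA inherited from one ancestor.

    1 = parent, 2 = grandparent, 3 = great-grandparent, and so on.
    Assumption: a simple halving each generation, ignoring random variation.
    """
    if generations_back < 1:
        raise ValueError("generations_back must be 1 or greater")
    return 0.5 ** generations_back


for label, generations in [("parent", 1), ("grandparent", 2), ("great-grandparent", 3)]:
    print(f"{label}: about {expected_ancestor_share(generations):.1%}")
```

The same halving is why matches on more distant lines tend to share smaller totals with you.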

Informational Tools

Every vendor’s match page looks different, as was illustrated in the First Glances article, but regardless, you are looking for four basic pieces of information:

  • Who you match
  • How much DNA you share with your match
  • Who else you and your match share that DNA with, which suggests that you all share a common ancestor
  • Family trees to reveal the common ancestor between people who match each other

Every vendor has different ways of displaying this information, and not all vendors provide everything. For example, 23andMe does not support trees, although they allow you to link to one elsewhere. Ancestry does not provide a tool called a chromosome browser which allows you to see if you and others match on the same segment of DNA. Ancestry only tells you THAT you match, not HOW you match.

Each vendor has their strengths and shortcomings. As genealogists, we simply need to understand how to utilize the information available.

I’ll be using examples from all 4 major vendors: Family Tree DNA, MyHeritage, AncestryDNA and 23andMe.

Your matches are the most important information and everything else is based on those matches.

Family Tree DNA

I have tested many family members from both sides of my family at Family Tree DNA using the Family Finder autosomal test which makes my matches there incredibly useful because I can see which family members, in addition to me, my matches match.

Family Tree DNA assigns matches to maternal and paternal sides in a unique way, even if your parents haven’t tested, so long as some close relatives have tested. Let’s take a look.

First Steps Family Tree DNA matches.png

Sign on to your account and click to see your matches.

At the top of your Family Finder matches page, you’ll see three groups of things, shown below.

First Steps Family Tree DNA bucketing

A row of tools at the top titled Chromosome Browser, In Common With and Not in Common With.

A second row of tabs that include All, Paternal, Maternal and Both. These are the maternal and paternal tabs I mentioned, meaning that I have a total of 4645 matches, 988 of which are from my paternal side and 847 of which are from my maternal side.

Family Tree DNA assigns people to these “buckets” based on matches with third cousins or closer if you have them attached in your tree. This is why it’s critical to have a tree and test close relatives, especially people from earlier generations like aunts, uncles, great-aunts/uncles and their children if they are no longer living.

If you have one or both parents that can test, that’s a wonderful boon because anyone who matches you and one of your parents is automatically bucketed, or phased (scientific term) to that parent’s side of the tree. However, at Family Tree DNA, it’s not required to have a parent test to have some matches assigned to maternal or paternal sides. You just need to test third cousins or closer and attach them to the proper place in your tree.

How does bucketing work?

Maternal or Paternal “Side” Assignment, aka Bucketing

If I match a maternal first cousin, Cheryl, for example, and we both match John Doe on the same segment, John Doe is automatically assigned to my maternal bucket with a little maternal icon placed beside the match.
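
The idea can be sketched in a few lines of Python. This is only an illustration of the concept, not Family Tree DNA’s actual method, which isn’t described here. The `Segment` type, the `assign_bucket` function and the positions are all hypothetical, and the sketch only checks whether segments overlap. It doesn’t apply a minimum segment size or confirm that John Doe also matches Cheryl.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    chromosome: int
    start: int  # start position on the chromosome
    end: int    # end position on the chromosome


def overlaps(a: Segment, b: Segment) -> bool:
    """True when two segments cover some of the same stretch of one chromosome."""
    return a.chromosome == b.chromosome and a.start < b.end and b.start < a.end


def assign_bucket(match: List[Segment],
                  maternal_relatives: List[Segment],
                  paternal_relatives: List[Segment]) -> str:
    """Label a match by which known relatives' segments it overlaps."""
    on_maternal = any(overlaps(m, r) for m in match for r in maternal_relatives)
    on_paternal = any(overlaps(m, r) for m in match for r in paternal_relatives)
    if on_maternal and on_paternal:
        return "Both"
    if on_maternal:
        return "Maternal"
    if on_paternal:
        return "Paternal"
    return "Unassigned"


# Hypothetical positions, made up for illustration only.
my_segments_with_cheryl = [Segment(3, 10_000_000, 45_000_000)]
my_segments_with_john_doe = [Segment(3, 20_000_000, 38_000_000)]
print(assign_bucket(my_segments_with_john_doe, my_segments_with_cheryl, []))
```

With no tested paternal relatives in the list, a match can only land in the maternal bucket or stay unassigned, which is why testing relatives on both sides matters.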

First Steps Family Tree DNA match info

Every vendor provides an estimated or predicted relationship based on a combination of total centiMorgans and the longest contiguous matching segment. The actual “linked relationship” is calculated based on where this person resides in your tree.

The common surnames at far right are a very nice feature, but not every tester provides that information. When the testers do include surnames at Family Tree DNA, common surnames are bolded. Other vendors have similar features.

People with trees are shown near their profile picture with a blue pedigree icon. Clicking on the pedigree icon will show you their ancestors. Your match’s estimated relationship to you indicates how far back you should expect to share an ancestor.

For example, first cousins share grandparents. Second cousins share great-grandparents. In general, the further back in time your common ancestor, the less DNA you can be expected to share.

You can view relationship information in chart form in my article here or utilize DNAPainter tools, here, to see the various possibilities for the different match levels.

Clicking on the pedigree chart of your match will show you their tree. In my tree, I’ve connected my parents in their proper places, along with Cheryl and Don, mother’s first cousins. (Yes, they’ve given permission for me to utilize their results, so they aren’t always blurred in images.)

Cheryl and Don are my first cousins once removed, meaning my mother is their first cousin and I’m one generation further down the tree. I’m showing the amount of DNA that I share with each of them in red in the format of total DNA shared and longest unbroken segment, taken from the match list. So 382-53 means I share a total of 382 cM and 53 cM is the longest matching block.

First Steps Family Tree DNA tree.png

The Chromosome Browser

Utilizing the chromosome browser, I can see exactly where I match both Don and Cheryl. It’s obvious that I match them on at least some different pieces of my DNA, because the total and longest segment amounts are different.

The reason it’s important to test lots of close relatives is because even siblings inherit different pieces of DNA from their parents, and they don’t pass the same DNA to their offspring either – so in each generation the amount of shared DNA is probably reduced. I say probably because sometimes segments are passed entirely and sometimes not at all, which is how we “lose” our ancestors’ DNA over the generations.

Here’s a matching example utilizing a chromosome browser.

First Steps Family Tree DNA chromosome browser.png

I clicked the checkboxes to the left of both Cheryl and Don on the match page, then the Chromosome Browser button, and now you can see, above, on chromosomes 1-16 where I match Cheryl (blue) and Don (red.)

In this view, both Don and Cheryl are being compared to me, since I’m the one signed in to my account and viewing my DNA matches. Therefore, one of the bars at each chromosome represents Don’s DNA match to me and one represents Cheryl’s. Cheryl is the first person and Don is the second. Person match colors (red and blue) are assigned arbitrarily by the system.

My grandfather and Cheryl/Don’s father, Roscoe, were siblings.

You can see that on some segments, my grandfather and Roscoe inherited the same segment of DNA from their parents, because today, my mother gave me that exact same segment that I share with both Don and Cheryl. Those segments are exactly identical and shown in the black boxes.

The only way for us to share this DNA today is for us to have shared a common ancestor who gave it to two of their children who passed it on to their descendants who DNA tested today.

On other segments, in red boxes, I share part of the same segments of DNA with Cheryl and Don, but someone along the line didn’t inherit all of that segment. For example on chromosome 3, in the red box, you can see that I share more with Cheryl (blue) than Don (red.)

In other cases, I share with either Don or Cheryl, but Don and Cheryl didn’t inherit that same segment of DNA from their father, so I don’t share with both of them. Those are the areas where you see only blue or only red.

On chromosome 12, you can see where it looks like Don’s and Cheryl’s segments butt up against each other. The DNA was clearly divided there. Don received one piece and Cheryl got the other. That’s known as a crossover and you can read about crossovers here, if you’d like.

It’s important to be able to view segment information to be able to see how others match in order to identify which common ancestor that DNA came from.

In Common With

You can use the “In Common With” tool to see who you match in common with any match. My first 6 matches in common with Cheryl are shown below. Note that they are already all bucketed to my maternal side.

First Steps Family Tree DNA in common with

You can click on up to 7 individuals in the check box at left to show them on the chromosome browser at once to see if they match you on common segments.

Each matching segment has its own history and may descend from a different ancestor in your common tree.

First Steps 7 match chromosome browser

If combinations of people do match me on a common segment, because these matches are all on my maternal side, they are triangulated and we know they have to descend from a common ancestor, assuming the segment is large enough. You can read about the concept of triangulation here. Triangulation occurs when 3 or more people (who aren’t extremely closely related like parents or siblings) all match each other on the same reasonably sized segment of DNA.
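
For readers who like to see the logic spelled out, here is a rough Python sketch of that three-way check. It assumes you already have segment lists for all three pairwise comparisons (me versus Person B, me versus Person C, and Person B versus Person C). The `Segment` type, the function names and the positions are hypothetical, and the sketch leaves out the “reasonably sized” test, which you would want to add using segment length in cM.

```python
from typing import List, NamedTuple, Optional


class Segment(NamedTuple):
    chromosome: int
    start: int
    end: int


def common_region(a: Segment, b: Segment) -> Optional[Segment]:
    """Return the stretch two segments have in common, or None if there is none."""
    if a.chromosome != b.chromosome:
        return None
    start, end = max(a.start, b.start), min(a.end, b.end)
    return Segment(a.chromosome, start, end) if start < end else None


def triangulated_regions(me_vs_b: List[Segment],
                         me_vs_c: List[Segment],
                         b_vs_c: List[Segment]) -> List[Segment]:
    """Regions where all three pairwise comparisons overlap."""
    regions = []
    for x in me_vs_b:
        for y in me_vs_c:
            shared = common_region(x, y)
            if shared is None:
                continue
            for z in b_vs_c:
                all_three = common_region(shared, z)
                if all_three is not None:
                    regions.append(all_three)
    return regions


# Hypothetical positions, made up for illustration only.
print(triangulated_regions([Segment(12, 5, 60)],
                           [Segment(12, 30, 80)],
                           [Segment(12, 25, 70)]))
```

If the third list is empty, meaning Person B and Person C don’t match each other, nothing triangulates even though both match me.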

If you want to download your matches and work through this process in a spreadsheet, that’s an option too.

Size Matters

Small segments can be identical by chance instead of identical by descent.

  • “Identical by chance” means that you accidentally match someone because your DNA on that segment has been combined from both parents and causes it to match another person, making the segment “look like” it comes from a common ancestor, when it really doesn’t. When DNA is sequenced, both your mother’s and father’s strands are sequenced, meaning that there’s no way to determine which came from whom. Think of a street with Mom’s side and Dad’s side with identical addresses on the houses on both sides. I wrote about that here.
  • “Identical by descent” means that the DNA is identical because it actually descends from a common ancestor. I discussed that concept in the article, We Match, But Are We Related.

Generally, we only utilize 7 cM (centiMorgan) segments and above because at that level, about half of the segments are identical by descent and about half are identical by chance, known as false positives. By the time we move above 15 cM, most, but not all, matches are legitimate. You can read about segment size and accuracy here.
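
If you work with downloaded segments, those rules of thumb can be applied with a small helper like the Python sketch below. The function name and the wording of the labels are mine, and the 7 cM and 15 cM cutoffs are simply the guidelines given above, not hard scientific boundaries.

```python
def describe_segment(length_cm: float) -> str:
    """Apply the 7 cM and 15 cM rules of thumb to one matching segment."""
    if length_cm < 7:
        return "below 7 cM: generally not used"
    if length_cm <= 15:
        return "7 to 15 cM: may be identical by chance, look for more evidence"
    return "above 15 cM: most, but not all, are legitimate"


# Hypothetical segment lengths in cM, made up for illustration only.
for length in [4.2, 7.0, 12.5, 53.0]:
    print(f"{length} cM -> {describe_segment(length)}")
```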

Using “In Common With” and the Matrix

“In Common With” is about who shares DNA. You can select someone you match to see who else you BOTH match. Just because you match two other people doesn’t necessarily mean that it’s on the same segment of DNA. In fact, you could match one person from your mother’s side and the other person from your father’s side.

First Steps match matrix.png

In this example, you match Person B due to ancestor John Doe and Person C due to ancestor Susie Smith. However, Person B also matches person C, but due to ancestor William West that they share and you don’t.

This example shows you THAT they match, but not HOW they match.

The only way to assure that the matches between the three people above are due to the same ancestor is to look at the segments with a chromosome browser and compare all 3 people to each other. Finding 3 people who match on the same segment, from the same side of your tree means that (assuming a reasonably large segment) you share a common ancestor.

Family Tree DNA has a nice matrix function that allows you to see which of your matches also match each other.

First steps matrix link

The important distinction between the matrix and the chromosome browser is that the chromosome browser shows you where your matches match you, but those matches could be from both sides of your tree, unless they are bucketed. The matrix shows you if your matches also match each other, which is a huge clue that they are probably from the same side of your tree.

First Steps Family Tree DNA matrix.png

A matrix match is a significant clue in terms of who descends from which ancestors. For example, I know, based on who Amy matches, and who she doesn’t match, that she descends from the Ferverda side and that Charles, Rex and Maxine descend from ancestors on the Miller side.
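
If you’d like to build a similar grid yourself from your own notes, here is a minimal Python sketch. It assumes you have already recorded which pairs of your matches also match each other. The function names and the people shown are hypothetical, and this is not how Family Tree DNA builds its matrix.

```python
from typing import Dict, Iterable, List, Tuple


def build_matrix(people: List[str],
                 matching_pairs: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, bool]]:
    """Return a grid where matrix[a][b] is True when a and b match each other."""
    pairs = {frozenset(pair) for pair in matching_pairs}
    return {
        row: {col: row != col and frozenset((row, col)) in pairs for col in people}
        for row in people
    }


def print_matrix(matrix: Dict[str, Dict[str, bool]]) -> None:
    """Print the grid with X for a match and - for no match."""
    people = list(matrix)
    width = max(len(name) for name in people) + 2
    print(" " * width + "".join(name.ljust(width) for name in people))
    for row in people:
        cells = "".join(("X" if matrix[row][col] else "-").ljust(width) for col in people)
        print(row.ljust(width) + cells)


# Hypothetical people and pairs, made up for illustration only.
matrix = build_matrix(["Person B", "Person C", "Person D"],
                      [("Person B", "Person C")])
print_matrix(matrix)
```

People who cluster together with X marks are good candidates for the same side of your tree, but as with the vendor’s matrix, the grid shows THAT they match, not HOW.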

Looking in the chromosome browser, I can tell that Cheryl, Don, Amy and I match on some common segments.

Matching multiple people on the same segment that descends from a common ancestor is called triangulation.

Let’s take a look at the MyHeritage triangulation tool.

MyHeritage

Moving now to MyHeritage, which provides us with an easy-to-use triangulation tool, we see the following when clicking on DNA matches on the DNA tab on the toolbar.

First Steps MyHeritage matches

Cousin Cheryl is at MyHeritage too. By clicking on Review DNA Match, the purple button on the right, I can see who else I match in common with Cheryl, plus triangulation.

The list of people Cheryl and I both match is shown below, along with our relationships to each person.

First Steps MyHeritage triangulation

I’ve selected 2 matches to illustrate.

The first match has a little purple icon to the right which means that Amy triangulates with me and Cheryl.

The second match, Rex, has no triangulation icon, which means that while we both match Rex, it’s not on the same segment. I know that without looking further. We both match Rex, but Cheryl matches Rex on a different segment than I do.

Without additional genealogy work, using DNA alone, I can’t say whether or not Cheryl, Rex and I all share a common ancestor. As it turns out, we do. Rex is a known cousin who I tested. However, in an unknown situation, I would have to view the trees of those matches to make that determination.

Triangulation

Clicking on the purple triangulation icon for Amy shows me the segments that all 3 of us, me, Amy and Cheryl share in common as compared to me.

First Steps MyHeritage triangulation chromosome browser.png

Cheryl is red and Amy is yellow. The one segment bracketed with the rounded rectangle is the segment shared by all 3 of us.

Do we have a common ancestor? I know Cheryl and I do, but maybe I don’t know who Amy is. Let’s look at Amy’s tree which is also shown if I scroll down.

First Steps MyHeritage common ancestor.png

Amy didn’t have her tree built out far enough to show our common ancestor, but I immediately recognized the surname Ferverda found in her tree a couple of generations back. Darlene was the daughter of Donald Ferverda who was the son of Hiram Ferverda, my great-grandfather.

Hiram was the father of Cheryl’s father, Roscoe and my grandfather, John Ferverda.

First Steps Hiram Ferverda pedigree.png

Amy is my first cousin twice removed and that segment of DNA that I share with her is from either Hiram Ferverda or his wife Eva Miller.

Now, based on who else Amy matches, I can probably tell whether that segment descends from Hiram or Eva.

Viva triangulation!

Theory of Family Relativity

MyHeritage’s Theory of Family Relativity provides theories about the common ancestor of two people whose DNA matches, if MyHeritage can calculate how the 2 people are potentially related.

MyHeritage uses a combination of tools to make that connection, including:

  • DNA matches
  • Your tree
  • Your match’s tree
  • Other people’s trees at MyHeritage, FamilySearch and Geni if the common ancestor cannot be found in your tree compared against your DNA match’s MyHeritage tree
  • Documents in the MyHeritage data collection, such as census records, for example.

MyHeritage theory update

To view the Theories, click on the purple “View Theories” banner or “View theory” under the DNA match.

First Steps MyHeritage theory of relativity

The theory is displayed in summary format first.

MyHeritage view full theory

You can click on the “View Full Theory” to see the detail and sources about how MyHeritage calculated various paths. I have up to 5 different theories that utilize separate resources.

MyHeritage review match

A wonderful aspect of this feature is that MyHeritage shows you exactly the information they utilized and calculates a confidence factor as well.

All theories should be viewed as exactly that and should be evaluated critically for accuracy, taking into consideration sources and documentation.

I wrote about using Theories of Relativity, with instructions, here and here.

I love this tool and find the Theories mostly accurate.

AncestryDNA

Ancestry doesn’t offer a chromosome browser or triangulation but does offer a tree view for people that you match, so long as you have a subscription. In the past, a special “Light” subscription for DNA only was available for approximately $49 per year that provided access to the trees of your DNA matches and other DNA-related features. You could not order online and had to call support, sometimes asking for a supervisor in order to purchase that reduced-cost subscription. The “Light” subscription did not provide access to anything outside of DNA results, meaning documents, etc. I don’t know if this is still available.

After signing on, click on DNA matches on the DNA tab on the toolbar.

You’ll see the following match list.

First Steps Ancestry matches

I’ve tested twice at Ancestry, the second time when they moved to their new chip, so I’m my own highest match. Click on any match name to view more.

First Steps Ancestry shared matches

You’ll see information about common ancestors if you have some in your trees, plus the amount of shared DNA along with a link to Shared Matches.

I found one of the same cousins at Ancestry whose match we were viewing at MyHeritage, so let’s see what her match to me at Ancestry looks like.

Below are my shared matches with that cousin. The notes to the right are mine, not provided by Ancestry. I make extensive use of the notes fields provided by the vendors.

First Steps Ancestry shared matches with cousin

On your match list, you can click on any match, then on Shared Matches to see who you both match in common. While Ancestry provides no chromosome browser, you can see the amount of DNA that you share and trees, if any exist.

Let’s look at a tree comparison when a common ancestor can be detected in a tree within the past 7 generations.

First Steps Ancestry view ThruLines.png

What’s missing of course is that I can’t see how we match because there’s no chromosome browser, nor can I see if my matches match each other.

Stitched Trees

What I can see, if I click on “View ThruLines” above or ThruLines on the DNA Summary page on the main DNA tab, is all of the people I match who Ancestry THINKS descend from a common ancestor with me. This ancestor information isn’t always taken from either person’s tree.

For example, if my match hadn’t included Hiram Ferverda in her tree, Ancestry would use other people’s trees to “stitch them together” such that the tester is shown to be descended from a common ancestor with me. Sometimes these stitched trees are accurate and sometimes they are not, although they have improved since they were first released. I wrote about ThruLines here.

First Steps Ancestry ThruLines tree

In closer generations, especially if you are looking to connect with cousins, tree matching is a very valuable tool. In the graphic above, you can see all of the cousins who descend from Hiram Ferverda who have tested and DNA match to me. These DNA matches to me either descend from Hiram according to their trees, or Ancestry believes they descend from Hiram based on other people’s trees.

With more distant ancestors, other people’s trees are increasingly likely to be copied with no sources, so take them with a very large grain of salt (perchance the entire salt lick.) I use ThruLines as hints, not gospel, especially the further back in time the common ancestor. I wish they reached back another couple of generations. They are great hints and they end with the 7th generation where my brick walls tend to begin!

23andMe

I haven’t mentioned 23andMe yet in this article. Genealogists do test there, especially adoptees who need to fish in every pond.

23andMe is often the 4th choice of the major 4 vendors for genealogy due to the following challenges:

  • No tree support, other than allowing you to link to a tree at FamilySearch or elsewhere. This means no tree matching.
  • Fewer than 2000 matches, meaning that every person is limited to a maximum of 2000 matches, minus however many of those 2000 don’t opt-in for genealogical matching. Given that 23andMe’s focus is increasingly health, my number of matches continues to decrease and is currently just over 1500. The good news is that those 1500 are my highest, meaning closest matches. The bad news is that genealogy is not 23andMe’s focus.

If you are an adoptee, a die-hard genealogist or specifically interested in ethnicity, then test at 23andMe. Otherwise all three of the other vendors would be better choices.

However, like the other vendors, 23andMe does have some features that are unique.

Their ethnicity predictions are acknowledged to be excellent. Ethnicity at 23andMe is called Ancestry Composition, and you’ll see that immediately when you sign in to your account.

First Steps 23andMe DNA Relatives.png

Your matches at 23andMe are found under DNA Relatives.

First Steps 23andMe tools

At left, you’ll find filters and the search box.

Mom’s and Dad’s side filter matches if you’ve tested your parents, but it’s not like the Family Tree DNA bucketing that provides maternal and paternal side bucketing by utilizing relatives through third cousins if your parents aren’t available for testing.

Family names aren’t your family names, but the top family names that match to you. Guess what my highest name is? Smith.

However, Ancestor Birthplaces are quite useful because you can sort by country. For example, my mother’s grandfather Ferverda was born in the Netherlands.

First Steps 23andMe country.png

If I click on Netherlands, I can see my 5 matches with ancestors born in the Netherlands. Of course, this doesn’t mean that I match because of my match’s Dutch ancestors, but it does provide me with a place to look for a common ancestor and I can proceed by seeing who I match in common with those matches. Unfortunately, without trees we’re left to rely on ancestor birthplaces and family surnames, if my matches have entered that information.

One of my Dutch matches also matches my Ferverda cousin. Given that connection, and that the Ferverda family immigrated from Holland in 1868, that’s a starting point.

MyHeritage has similar features and is much more prevalent in Europe.

By clicking on my Ferverda cousin, I can view the DNA we share, who we match in common, our common ethnicity and more. I have the option of comparing multiple people in the chromosome browser by clicking on “View DNA Comparison” and then selecting who I wish to compare.

First Steps 23andMe view DNA Comparison.png

By scrolling down instead of clicking on View DNA Comparison, I can view where my Ferverda cousin matches me on my chromosomes, shown below.

First STeps 23andMe chromosome browser.png

23andMe identifies completely identical segments, which would be painted in dark purple, as shown in the legend at bottom left.

Adoptees love this feature because it would immediately differentiate between half and full siblings. Full siblings share approximately 25% of the exact DNA on both their maternal and paternal strands of DNA, while half siblings only share the DNA from one parent – assuming their parents aren’t closely related. I share no completely identical DNA with my Ferverda cousin, so no segments are painted dark purple.
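
Where does that 25% come from? Here is a small Python sketch that counts the possibilities at a single location in the DNA, assuming each sibling independently receives one of the mother’s two copies and one of the father’s two copies with equal odds. The function name is mine, and the result is an average expectation, not what any particular pair of siblings will show.

```python
from fractions import Fraction
from itertools import product


def sibling_sharing_odds():
    """Count how two full siblings can match at one location in their DNA."""
    labels = {2: "fully identical", 1: "half identical", 0: "not identical"}
    counts = {label: 0 for label in labels.values()}
    # Each value is 0 or 1: which of that parent's two copies the sibling received.
    combos = list(product((0, 1), repeat=4))
    for sib1_mom, sib1_dad, sib2_mom, sib2_dad in combos:
        same_copies = int(sib1_mom == sib2_mom) + int(sib1_dad == sib2_dad)
        counts[labels[same_copies]] += 1
    return {label: Fraction(count, len(combos)) for label, count in counts.items()}


for label, odds in sibling_sharing_odds().items():
    print(f"{label}: {float(odds):.0%}")
```

Half siblings share only one parent, so under the same assumptions they can be half identical at a location but never fully identical, which is why the dark purple regions are such a useful clue.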

23andMe and Ancestry Maps Show Where Your Matches Live

Another reason that adoptees and people searching for birth parents or unknown relatives like 23andMe is because of the map function.

After clicking on DNA Relatives, click on the Map function at the top of the page which displays the following map.

First Steps 23andMe map

This isn’t a map of where your matches’ ancestors lived, but is where your matches THEMSELVES live. Furthermore, you can zoom in, click on the button and it displays the name of the individual and the city where they live or whatever they entered in the location field.

First Steps 23andMe your location on map.png

I entered a location in my profile and confirmed that the location indeed displays on my matches’ maps by signing on to another family member’s account. What I saw is the display above. I’d wager that most testers don’t realize that their home location and photo, if entered, are being displayed to their matches.

I think sharing my ancestors’ locations is a wonderful, helpful, idea, but there is absolutely no reason whatsoever for anyone to know where I live and I feel it’s stalker-creepy and a safety risk.

First Steps 23andMe questions.png

If you enter a location in this field in your profile, it displays on the map.

If you test with 23andMe and you don’t want your location to display on this map to your matches, don’t answer any question that asks you where you call home or anything similar. I never answer any questions at 23andMe. They are known for asking you the same question repeatedly, in multiple locations and ways, until you relent and answer.

Ancestry has a similar map feature and they’ve also begun to ask you questions that are unrelated to genealogy.

Ancestry Map Shows Where Your Matches Live

At Ancestry, when you click to see your DNA matches, look to the right at the map link.

First Steps Ancestry map link.png

By clicking on this link, you can see the locations that people have entered into their profile.

First Steps Ancestry match map.png

As you can see, above, I don’t have a location entered and I am prompted for one. Note that Ancestry does specifically say that this location will be shown to your matches.

You can click on the Ancestry Profile link here, or go to your Personal Profile by clicking the dropdown under your user name in the upper right hand corner of any page.

This is important because if you DON’T want your location to show, you need to be sure there is nothing entered in the location field.

First Steps Ancestry profile.png

Under your profile, click “Edit.”

First Steps Ancestry edit profile.png

After clicking edit, complete the information you wish to have public or remove the information you do not.

First Steps Ancestry location in profile.png

Sometimes Your Answer is a Little More Complicated

This is a First Steps article. Sometimes the answer you seek might be a little more complicated. That’s why there are specialists who deal with this all day, every day.

What issues might be more complex?

If you’re just starting out, don’t worry about these things for now. Just know when you run into something more complex or that doesn’t make sense, I’m here and so are others. Here’s a link to my Help page.

Getting Started

What do you need to get started?

  • You need to take a DNA test, or more specifically, multiple DNA tests. You can test at Ancestry or 23andMe and transfer your results to both Family Tree DNA and MyHeritage, or you can test directly at all vendors.

Neither Ancestry nor 23andMe accepts uploads, meaning other vendors’ tests, but both MyHeritage and Family Tree DNA accept most file versions. Instructions for how to download and upload your DNA results are found below, by vendor:

Both MyHeritage and Family Tree DNA charge a minimal fee to unlock their advanced features such as chromosome browsers and ethnicity if you upload transfer files, but it’s less costly in both cases than testing directly. However, if you want the MyHeritage DNA plus Health or the Family Tree DNA Y DNA or Mitochondrial DNA tests, you must test directly at those companies for those tests.

  • It’s not required, but it would be in your best interest to build as much of a tree at all three vendors as you can. Every little bit helps.

Your first tree-building step should be to record what your family knows about your grandparents and great-grandparents, aunts and uncles. Here’s what my first step attempt looked like. It’s cringe-worthy now, but everyone has to start someplace. Just do it!

You can build a tree at either Ancestry or MyHeritage and download your tree for uploading at the other vendors. Or, you can build the tree using genealogy software on your computer and upload to all 3 places. I maintain my primary tree on my computer using RootsMagic. There are many options. MyHeritage even provides free tree builder software.

Both Ancestry and MyHeritage offer research/data subscriptions that provide you with hints to historical documents that increase what you know about your ancestors. The MyHeritage subscription can be tried for free. I have full subscriptions to both Ancestry and MyHeritage because they both include documents in their collections that the other does not.

Please be aware that document suggestions are hints and each one needs to be evaluated in the context of what you know and what’s reasonable. For example, if your ancestor was born in 1750, they are not included in the 1900 census, nor do women have children at age 70. People do have exactly the same names. FindAGrave information is entered by humans and is not always accurate. Just sayin’…

Evaluate critically and skeptically.

Ok, Let’s Go!

When your DNA results are ready, sign on to each vendor, look at your matches and use this article to begin to feel your way around. It’s exciting and the promise is immense. Feel free to share the link to this article on social media or with anyone else who might need help.

You are the cumulative product of your ancestors. What better way to get to know them than through their DNA that’s shared between you and your cousins!

What can you discover today?

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

Exploring Family Trees Website, Including Average DNA Percent Inheritance by Ancestor

Sometimes you just have to do something just because it’s fun.

That describes the learnforeverlearn website at this link, a free tool created by B. F. Lyon Visualizations that allows you to view your family tree or pedigree chart in very novel ways.

Here’s what greets you.

learnforever splash

The “About This” link at the very top of the page shows the following:

learnforever about

In case you’re wondering, your Gedcom file never leaves your PC, so you don’t need to worry about security.

Getting Started

First, you’ll be prompted to upload a Gedcom file, a file generated by either your genealogy software like RootsMagic or a site like Ancestry. If you have a tree at Ancestry, you can download it into a Gedcom file format and save it on your computer.

My own personal Gedcom file from my PC software was too large, so I downloaded a smaller file that I use on Ancestry. I’ve entered all of my ancestors at Ancestry through 12 generations, if known, and some of their children. I use my Ancestry file to focus on direct line ancestors and DNA matches, not as my primary tree.

The first thing you see after uploading your Gedcom file is that your pedigree chart is displayed in one tree. If you want to see examples before uploading your own, click here, or view mine below. You can click to see a larger image.

learnforever ancestors

What fun! If you’ve experienced pedigree collapse where you are descended from the same ancestral line multiple times, you’ll see that in this large pedigree map. I don’t have pedigree collapse, but you can take a look at fun examples under “Sample Trees.”

If you want to see more detail, just scroll your mouse wheel for larger or smaller. If you get yourself lost, simply reset pan/zoom or reset to the root person.

You can’t “hurt” this application because you reload your file every time you want to use it, so you can always just start over.

Your options are at the top, but by mousing over anything on the page, you can generally learn a lot more. Every time I use this tool, I notice something I didn’t see previously.

learnforever toolbar

Let’s take a look at what you can do.

Who’s Who

I currently have 793 individuals in my tree. By clicking on the “Current Tree Details” at the top of the page, you can see the list of who is included.

learnforever tree detail

This is an easy way to see if you have any issues in your file. I quickly discovered that I have two people with typos in their birth dates because the years have 3 digits. How did that happen?

Validation Check

You can also run a data validation check.

learnforever data validation

What a valuable tool!

Hmmm, looks like I need to do some cleanup. Ahem!!

The X Chromosome

At the top right, you can click on “Highlight X DNA Contributions” which creates a view of the people who contributed or are candidates to contribute segments of their X chromosome to the home person. Remember that you can change the home (root) person to someone else in your tree, like maybe one of your parents, for example.

The X is important because it has unique inheritance properties that can be very helpful that I wrote about here.
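For readers who like to see the logic spelled out, here’s a minimal Python sketch of the basic X inheritance rule: a man receives his X only from his mother, while a woman receives one X from each parent. That means an ancestor can only be an X candidate if the path from the home person to that ancestor never passes through two males in a row. The function name and the 'M'/'F' path format are my own illustration, not how the website itself works.

```python
def can_contribute_x(path):
    """Return True if the ancestor at the end of `path` is an X candidate.

    `path` lists the sex ('M' or 'F') of each person from the home person
    up to the ancestor. For example, ['F', 'M', 'F', 'F'] is a woman,
    her father, his mother, and that woman's mother.
    """
    # A male gets no X from his father, so two males in a row break the chain.
    return all(
        not (child == "M" and parent == "M")
        for child, parent in zip(path, path[1:])
    )


# A woman's paternal grandmother's mother is an X candidate.
print(can_contribute_x(["F", "M", "F", "F"]))  # True
# A woman's paternal grandfather is not.
print(can_contribute_x(["F", "M", "M"]))  # False
```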

learnforever x contributions

I moused over the various people and discovered that when you “land” on someone, you can view their information. In this case, it’s my great-grandmother, who, on average, contributed 12.5% of her DNA to me and 25% of her X chromosome.

learnforever ancestor contribution

I can then view Evaline’s ancestor or descendant tree, or a straight path to the root, which is me, by clicking the blue buttons.

learnforever ancestor tree

Years

learnforever years

By scrolling your mouse up and down between people, you can see a horizontal black “line” that shows you a year. By following the line, you can see who in your tree was living during that year.

learnforever living years

Gosh this is fun!

History

By mousing over the green year bar at far right, you can see what was going on historically at that time, as well as in your own family.

learnforever history

I love this tool!

Locations

Under the options tab, at upper left, by toggling the flag icon, you can then view your tree by birth location.

learnforever options

I love this view.

learnforever flags

You can view the migration progression by just looking at your tree.

Scroll on down the options tab for more display possibilities.

Possible Immigrants

learnforever possible immigrants

Ancestor Information

learnforever statistics

In my case, the “number of children” information isn’t accurate because I have not fleshed out the families at Ancestry. I was only working primarily with my direct ancestors.

Unique Birthplaces

learnforever birthplaces

I’ve combined unique birthplaces with potential immigrants.

Ancestor Cone

learnforever ancestor cone

By mousing, you can see how many ancestors you had at a particular time and the total world population.

learnforever ancestors vs world population

Wow. In 1615, I had 16,384 ancestors? I need to get busy! I am never going to be finished!
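That 16,384 figure is simply the number of ancestors doubling with every generation. Here’s a tiny Python sketch of the arithmetic. It assumes no pedigree collapse, and the choice of 14 generations is inferred from the number itself (2 to the 14th power), since the generation that lines up with 1615 depends on the birth years in your own tree.

```python
def ancestors_in_generation(generations_back):
    """Number of ancestor slots in a single generation, assuming no pedigree collapse."""
    # Each person has two parents, so the count doubles each generation.
    return 2 ** generations_back


print(ancestors_in_generation(14))  # 16384
```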

Just when you think you can’t have any more fun…

You can read more about this tool and ways to use it in an article written by the author here.

Thank You

I don’t know B. F. Lyon who created this cool free website, but under the options tab, I found this:

Want more options/features? Let me know at bradflyon@gmail.com

Please drop Brad a note to say thank you or offer suggestions!

______________________________________________________________

MyHeritage LIVE Conference Day 2 – The Science Behind DNA Matching    

The MyHeritage LIVE Oslo conference is but a fond memory now, and I would count it as a resounding success.

Perhaps one of the reasons I enjoyed it so much is the scientific aspect and because the content is very focused on a topic I enjoy without being the size and complexity of Rootstech. The smaller, more intimate venue also provides access to the “right” people as well as the ability to meet other attendees and not be overwhelmed by the sheer size.

Here are some stats:

  • 401 registered guests
  • 28 countries represented including distant places like Australia and South America
  • More than 20 speakers plus the hands-on workshops where specialist teams worked with students
  • 38 sessions and workshops, plus the party
  • 60,000 livestream participants, in spite of the time differences around the world

I was blown away by the number of livestream attendees.

I don’t know what criteria Gilad Japhet will be using to determine “success” but I can’t imagine this conference being judged as anything but.

Let’s take a look at the second day. I spent part of the time talking to people and drifting in and out of the rear of several sessions for a few minutes. I meant to visit some of the workshops, but there was just too much good, distracting content elsewhere.

I began Sunday in Mike Mansfield’s presentation about SuperSearch. Yes, I really did attend a few sessions not about DNA, but my favorite was the session on Improved DNA Matching.

Improved DNA Matching

I’m sure it won’t surprise any of my readers that my favorite presentations were about the actual science of genetic genealogy.

Consumers don’t really need to understand the science behind autosomal results to reap the benefits, but the underlying science is part of what I love – and it’s important for me to understand the underpinnings to be able to unravel the fine points of what the resulting matches are and are not revealing. Misinterpretation of DNA results leading to faulty conclusions is a real issue in genetic genealogy today. Consequently, I feel that anyone working with other people’s results and providing advice really needs to understand how the science and technology work together.

Dr. Daphna Weissglas-Volkov, a population geneticist by training, although she clearly functions far beyond that scope today, gave a very interesting presentation about how MyHeritage handles (their greatly improved) DNA Matching. I’m hitting the high points here, but I would strongly encourage you to watch the video of this session when it is made available online.

In addition to Dr. Weissglas-Volkov’s slides, I’ve added some additional explanations and examples in various places. You can easily tell that the slides are hers and the graphics that aren’t MyHeritage slides are mine.

Dr. Weissglas-Volkov began the session by introducing the MyHeritage science team and then explaining terminology to set the stage.

A match is when two people match each other on a fairly long piece of DNA. Of course, “fairly long” is defined differently by each vendor.

Your genetic map (of your chromosomes) is composed of the DNA you inherit from different ancestors through the process of recombination, when DNA is transferred from the parents to the child. A centiMorgan is a measure of the relative likelihood that a recombination will occur in a single generation. On average, 36 recombinations occur in each generation, each one a point where the DNA on a chromosome is divided. However, women, for reasons unknown, have about 1.5 times as many recombinations as men.

You can’t see that when looking at an example of a person compared to their parents, of course, because each individual is a full match to each parent, but you can see this visually when comparing a grandchild to their maternal grandmother and their paternal grandmother on a chromosome browser.

The above illustration is the same female grandchild compared to her maternal grandmother, at left, and her paternal grandmother at right. Therefore the number of crossovers at left is through a female child (her mother), and the number at right is through a male child (her father.)

Number of crossovers:

  • Through female child (left) – 57
  • Through male child (right) – 22

There are more segments at left, through the mother, and the segments are generally shorter, because they have been divided into more pieces.

At right, fewer and larger segments through the father.

Keep in mind that because you have a strand of DNA from each parent, with exactly the same “street addresses,” DNA sequencing produces two columns of data – but your Mom’s and Dad’s DNA is intermixed.

The information in the two columns can’t be identified as Mom’s or Dad’s DNA or strand at this point.

That interspersed raw data is called a genotype. A haplotype is when Mom’s and Dad’s DNA can be reassembled into “sides” so you can attribute the two letters at each address to either Mom or Dad.

Here’s a quick example.

The goal, of course, is to figure out how to reassemble your DNA into Mom’s side and Dad’s side so that we know that someone matching you is actually matching on all As (Mom) or all Gs (Dad), in this example, and not a false match that zigzags back and forth between Mom and Dad.
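Here is the same idea as a small Python sketch, assuming six addresses where Mom contributed all As and Dad contributed all Gs. The variable and function names are mine, purely for illustration. It shows why a zigzag sequence looks fine against the unphased genotype but falls apart once the DNA is sorted into sides.

```python
# What the raw data shows: two letters per address, in no particular order.
genotype = ["AG", "GA", "AG", "AG", "GA", "AG"]

# What phasing is trying to recover: one strand per parent.
mom_haplotype = "AAAAAA"
dad_haplotype = "GGGGGG"


def fits_genotype(candidate, genotype):
    """True if every letter of the candidate appears at that address."""
    return all(letter in pair for letter, pair in zip(candidate, genotype))


def fits_one_parent(candidate, haplotypes):
    """True only if the candidate matches one parental side in full."""
    return candidate in haplotypes


zigzag = "AGAGAG"
print(fits_genotype(zigzag, genotype))  # True
print(fits_one_parent(zigzag, (mom_haplotype, dad_haplotype)))  # False
```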

The best way to accomplish that goal of course is trio phasing, when the child and both parents are available, so by comparing the child’s DNA with the parents you can assign the two strands of the child’s DNA.
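As a rough sketch of how trio phasing works at a single address, the Python below assigns each of the child’s two letters to a parent when the parents’ own results make the answer unambiguous. This is a simplified illustration under my own assumptions (two-letter strings per address, no read errors), not any vendor’s actual routine.

```python
def phase_site(child, mother, father):
    """Return (letter_from_mom, letter_from_dad), or None if it can't be decided.

    Each argument is a two-letter string for one address, such as "AG".
    """
    a, b = child
    if a == b:
        # Both letters are the same, so the order doesn't matter.
        return (a, b)
    a_from_mom = a in mother and b in father
    b_from_mom = b in mother and a in father
    if a_from_mom and not b_from_mom:
        return (a, b)
    if b_from_mom and not a_from_mom:
        return (b, a)
    # Either both assignments are possible (for example, all three people
    # read "AG") or neither is, so this address stays unphased.
    return None


print(phase_site("GA", "AA", "GG"))  # ('A', 'G')
print(phase_site("AG", "AG", "AG"))  # None
```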

Unfortunately, few people have both or even one parent available in order to actually divide their DNA into “sides,” so the next best avenue is statistical phasing. I’ve called this academic phasing in the past, as compared to parental phasing which MyHeritage refers to as trio phasing.

There’s a huge amount of confusion about phasing, with few people understanding there are two distinct types.

Statistical phasing is a type of machine learning where a large number of reference populations are studied. Since we know that DNA travels together in blocks when inherited, statistical phasing learns which DNA travels with which buddy DNA – and creates probabilities. Your DNA is then compared to these models and your DNA is reshuffled in order to assemble your DNA into two groups – one representing your Mom’s DNA and one representing your Dad’s DNA, according to statistical probability.

Looking at your genotype, if we know that As group together at those 6 addresses in my example 95% of the time, then we know that the most likely scenario to create a haplotype is that all of the As came from one parent and all of the Gs from the other parent – although without additional information, there is no way to yet assign the maternal and paternal identifier. At this point, we only know parent 1 and parent 2.
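To make that concrete, here is a toy Python sketch of the idea: list every way the genotype could be split into two strands, then pick the split whose strands are most common in a reference panel. The frequencies, the `floor` value for unseen strands and the function names are all invented for illustration. Real statistical phasing uses far more sophisticated models.

```python
from itertools import product


def candidate_pairs(genotype):
    """Every way to split a genotype into two strands (parent 1, parent 2)."""
    pairs = set()
    # At each address, either letter could sit on the first strand.
    for choice in product(*[(g, g[::-1]) for g in genotype]):
        strand1 = "".join(c[0] for c in choice)
        strand2 = "".join(c[1] for c in choice)
        # Sorting makes (x, y) and (y, x) count as the same split.
        pairs.add(tuple(sorted((strand1, strand2))))
    return pairs


def most_likely_phase(genotype, hap_freq, floor=1e-6):
    """Pick the split whose two strands are jointly most frequent."""
    def score(pair):
        return hap_freq.get(pair[0], floor) * hap_freq.get(pair[1], floor)
    return max(candidate_pairs(genotype), key=score)


genotype = ["AG"] * 6
# Made-up reference frequencies: these two strands are by far the most common.
reference = {"AAAAAA": 0.50, "GGGGGG": 0.45}
print(most_likely_phase(genotype, reference))  # ('AAAAAA', 'GGGGGG')
```

Notice that the result only says parent 1 and parent 2. Nothing in the calculation can tell which strand is maternal and which is paternal.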

In order to train the computers (machine learning) to properly statistically phase testers’ results, MyHeritage uses known relationships of people to teach the machines. In other words, their reference panels of proven haplotypes grow all of the time as parent/child trios test.

Dr. Weissglas-Volkov then moved on to imputation.

When sequencing DNA, not every location reads accurately, so the missing values can be imputed, or “put back” using imputation.

Initially imputation was a hot mess. Not just for MyHeritage, but for all vendors, imputation having been forced upon them (and therefore us) by Illumina’s change to the GSA chip.

However, machine learning means that imputation models improve constantly, and matching using imputation is greatly improved at MyHeritage today.

Imputation can do more than just fill in blanks left by sequencing read errors.

The benefit of imputation to the genetic genealogy community is this: because vendors use disparate chips, any vendor that wants to allow uploads has been forced to utilize imputation to create a global template that incorporates all of the locations from each vendor, then impute the values it doesn’t actually test for itself to complete the full template for each person.

In the example below, you can see that no vendor tests all available locations, but when imputation extends the sequences of all testers to the full 1-500 locations, the results can easily be compared to every other tester because every tester now has values in locations 1-500, regardless of which vendor/chip was utilized in their actual testing.
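The template idea can be sketched in a few lines of Python. This toy version builds the shared template from the locations each chip tests, then fills a tester’s blanks by copying from whichever reference strand agrees best with what was actually tested. The chips, locations and reference strands are made up, and copying from a single best reference is my own simplification. Real imputation is probabilistic.

```python
def build_template(*chips):
    """All locations tested by any chip, in order."""
    return sorted(set().union(*chips))


def impute(observed, template, reference_panel):
    """Fill untested locations from the best-agreeing reference strand.

    `observed` and each reference are dicts of location -> letter.
    """
    def agreement(ref):
        return sum(ref.get(loc) == letter for loc, letter in observed.items())

    best = max(reference_panel, key=agreement)
    # Keep tested values as they are and borrow only the missing ones.
    return {loc: observed.get(loc, best.get(loc)) for loc in template}


chip_a = {1, 2, 3}
chip_b = {3, 4, 5}
template = build_template(chip_a, chip_b)

panel = [
    {1: "A", 2: "A", 3: "A", 4: "A", 5: "A"},
    {1: "G", 2: "G", 3: "G", 4: "G", 5: "G"},
]
tested_on_chip_a = {1: "A", 2: "A", 3: "A"}
print(impute(tested_on_chip_a, template, panel))
```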

Therefore, using imputation, MyHeritage is able to match between quite disparate chips, such as the traditional Illumina chips (OmniExpress), the custom Ancestry chip and the new GSA chip utilized by 23andMe and LivingDNA.

So, how are matches determined?

Matching

First your DNA and that of another person are scanned for nearly identical seed sequences.

A minimum segment length of 6cM must be identified for further match processing to occur. Anything below 6cM is discarded at this point.

The match is then further evaluated to see if the seed match is of a high enough quality that it should be perfected and should count as a match. Other segments continue to be evaluated as well. If the matching segments total 8 cM or greater, it’s considered a valid match. MyHeritage has taken the position that they would rather give you a few accidental false matches than to miss good matches. I appreciate that position.

Window cleaning is how they refer to the process of removing pileup regions known to occur in the human genome. This is NOT the same as Ancestry’s routine that removes areas they determine to be “too matchy” for you individually.

The difference is that in humans, for example, there is a segment of chromosome 6 where, for some reason, almost all humans match. Matching across that segment is not informative for genetic genealogy, so that region along with several others similar in nature are removed. At Ancestry, those genome-wide pileup segments are removed, along with other regions where Ancestry decides that you personally have too many matches. The problem is that for me, these “too matchy” segments are many of my Acadian matches. Acadians are endogamous, so lots of them match each other because as a small intermarried population, they share a great deal of the same DNA. However, to me, because I have one great-grandfather that’s Acadian, that “too matchy” information IS valuable although I understand that it wouldn’t be for someone that is 100% Acadian or Jewish.

In situations such as Ashkenazi Jewish matching, which is highly endogamous, MyHeritage uses a higher matching threshold. Otherwise every Ashkenazi person would match every other Ashkenazi person because they all descend from a small founder population, and for genealogy, that’s not useful.

The last step in processing matches is to establish the confidence level that the match is accurately predicted at the correct level – meaning the relationship range based on the amount of matching DNA and other criteria.

For example, does this match cluster with other proven matches of the same known relationship level?

From several confidence ascertainment steps, a confidence score is assigned to the predicted relationship.

Of course, you as a customer see none of this background processing, just the fact that you do match, the size of the match and the confidence score. That’s what genealogists need!

Matching Versus Triangulation Thresholds

Confusion exists about matching thresholds versus triangulation thresholds.

While any single segment must be at least 6 cM in length for the matching process to begin, the actual match threshold at MyHeritage is a total of 8 cM.

I took a look at my lowest match at MyHeritage.

I have two segments, one 6.1 cM segment, and one 6 cM segment that match. It would appear that if I only had one 6 cM segment, it would not show as a match because I didn’t have the minimum 8 cM total.
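Here’s a minimal Python sketch of the two thresholds as I understand them from the presentation: segments under 6 cM are discarded, and what remains must total at least 8 cM. The function is my own simplification and leaves out the quality and confidence steps entirely.

```python
def is_match(segments_cm, min_segment=6.0, min_total=8.0):
    """Apply the per-segment and total thresholds to a list of segment sizes."""
    # Segments below the per-segment minimum are dropped first.
    kept = [cm for cm in segments_cm if cm >= min_segment]
    return sum(kept) >= min_total


print(is_match([6.1, 6.0]))  # True
print(is_match([6.0]))       # False
```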

Triangulation Threshold

However, after you pass that matching criteria and move on to triangulation with a matching individual, you have the option of selecting the triangulation threshold, which is not the same thing as the match threshold. The match threshold does not change, but you can change the triangulation threshold from 2 cM to 8 cM and selections in-between.

In the example below, I’m comparing myself against two known relatives.

You won’t be shown any matches below the 6 cM individual segment threshold, BUT you can view triangulated segments of different sizes. This is because matching segments often don’t line up exactly and the triangulated overlap between several individuals may be very small, but may still be useful information.

Flying your mouse over the location in the bubble, which is the triangulated segment, tells you the size of the triangulated portion. If you selected the 2 cM triangulation, you would see smaller triangulated portions of matches.
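The overlap itself is easy to picture as arithmetic. In this small Python sketch, each matching segment is a (start, end) pair, and the triangulated portion is wherever all of them overlap. To keep things simple I’m assuming the positions are already expressed in cM along the same chromosome, which is not how a real chromosome browser stores them.

```python
def triangulated_overlap(*segments):
    """Size of the region shared by every (start, end) segment, or 0.0 if none."""
    start = max(s for s, _ in segments)
    end = min(e for _, e in segments)
    return max(0.0, end - start)


def passes_threshold(overlap_cm, threshold_cm=2.0):
    """True if the triangulated portion meets the selected threshold."""
    return overlap_cm >= threshold_cm


overlap = triangulated_overlap((10.0, 25.0), (18.0, 40.0), (5.0, 21.5))
print(overlap)                         # 3.5
print(passes_threshold(overlap, 2.0))  # True
print(passes_threshold(overlap, 8.0))  # False
```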

Closing Session

The conference was closed by Aaron Godfrey, a super-nice MyHeritage employee from the UK. The closing session is worth watching on the recorded livestream when it becomes available, in part because there are feel good moments.

However, the piece of information I was looking for was whether there will be a MyHeritage LIVE conference in 2019, and if so, where.

I asked Gilad afterwards and he said that they will be evaluating the feedback from attendees and others when making that decision.

So, if you attended or joined the livestream sessions and found value, please let MyHeritage know so that they can factor your feedback into their decision. If there are topics you’d like to see as sessions, I’m sure they’d love to hear about that too. Me, I’m always voting for more DNA😊

I hope to hear about MyHeritage LIVE 2019, and I’m voting for any of the following locations:

  • Australia
  • New Zealand
  • Israel
  • Germany
  • Switzerland

What do you think?

______________________________________________________________

Elizabeth Warren’s Native American DNA Results: What They Mean

Elizabeth Warren has released DNA testing results after being publicly challenged and derided as “Pocahontas” as a result of her claims of a family story indicating that her ancestors were Native American. If you’d like to read the specifics of the brouhaha, this Washington Post article provides a good summary, along with additional links.

I personally find name-calling of any type unacceptable behavior, especially in a public forum, and while Elizabeth’s DNA test was taken, I presume, in an effort to settle the question and end the name-calling, what it has done is to put the science of genetic testing smack dab in the middle of the headlines.

This article is NOT about politics, it’s about science and DNA testing. I will tell you right up front that any comments that are political or hateful in nature will not be allowed to post, regardless of whether I agree with them or not. Unfortunately, these results are being interpreted in a variety of ways by different individuals, in some cases to support a particular political position. I’m presenting the science, without the politics.

This is the first of a series of two articles.

I’m dividing this first article into four sections, and I’d ask you to read all four, especially before commenting. A second article, Possibilities – Wringing the Most Out of Your DNA Ethnicity Test will follow shortly about how to get the most out of an ethnicity test when hunting for Native American (or other minority, for you) ethnicity.

Understanding how the science evolved and works is an important factor of comprehending the results and what they actually mean, especially since Elizabeth’s are presented in a different format than we are used to seeing. What a wonderful teaching opportunity.

  • Family History and DNA Science – How this works.
  • Elizabeth Warren’s Genealogy
  • Elizabeth Warren’s DNA Results
  • Questions and Answers – These are the questions I’m seeing, and my science-based answers.

My second article, Possibilities – Wringing the Most Out of Your DNA Ethnicity Test will include:

  • Potential – This isn’t all that can be done with ethnicity results. What more can you do to identify that Native ancestor?
  • Resources with Step by Step Instructions

Now, let’s look at Elizabeth’s results and how we got to this point.

Family Stories and DNA

Every person that grows up in their biological family hears family stories. We have no reason NOT to believe them until we learn something that potentially conflicts with the facts as represented in the story.

In terms of stories handed down for generations, all we have to go on, initially, are the stories themselves and our confidence in the person relating the story to us. The day that we begin to suspect that something might be amiss, we start digging, and for some people, that digging begins with a DNA test for ethnicity.

My family had that same Cherokee story. My great-grandmother on my father’s side who died in 1918 was reportedly “full blooded Cherokee” 60 years later when I discovered she had existed. Her brothers reportedly went to Oklahoma to claim headrights land. There were surely nuggets of truth in that narrative. Family members did indeed go to Oklahoma. One did own Cherokee land, BUT, he purchased that land from a tribal member who received an allotment. I discovered that tidbit later.

What wasn’t true? My great-grandmother was not 100% Cherokee. To the best of my knowledge now, a century after her death, she wasn’t Cherokee at all. She probably wasn’t Native at all. Why, then, did that story trickle down to my generation?

I surely don’t know. I can speculate that it might have been because various people were claiming Native ancestry in order to claim land when the government paid tribal members for land as reservations were dissolved between 1893 and 1914. You can read more about that in this article at the National Archives about the Dawes Rolls, compiled for the Cherokee, Creek, Choctaw, Chickasaw and Seminole for that purpose.

I can also speculate that someone in the family was confused about the brother’s land ownership, especially since it was Cherokee land.

I could also speculate that the confusion might have resulted because her husband’s father actually did move to Oklahoma and lived on Choctaw land.

But here is what I do know. I believed that story because there wasn’t any reason NOT to believe it, and the entire family shared the same story. We all believed it…until we discovered evidence through DNA testing that contradicted the story.

Before we discuss Elizabeth Warren’s actual results, let’s take a brief look at the underlying science.

Enter DNA Testing

DNA testing for ethnicity was first introduced in a very rudimentary form in 2002 (not a typo) and has progressed exponentially since. The major vendors who offer tests that provide their customers with ethnicity estimates (please note the word estimates) have all refined their customers’ results several times. The reference populations improve, the vendor’s internal software algorithms improve and population genetics as a science moves forward with new discoveries.

Note that major vendors in this context means Family Tree DNA, 23andMe, the Genographic Project and Ancestry. Two newer vendors include MyHeritage and LivingDNA, although LivingDNA is focused on England and MyHeritage, which utilizes imputation, is not yet quite up to snuff on its ethnicity estimates. Another entity, GedMatch, isn’t a testing vendor, but does provide multiple ethnicity tools if you upload your results from the other vendors. To get an idea of how widely the results vary, you can see the results of my tests at the different vendors here and here.

My initial DNA ethnicity test, in 2002, reported that I was 25% Native American, but I’m clearly not. It’s evident to me now, but it wasn’t then. That early ethnicity test was the dinosaur ages in genetic genealogy, but it did send me on a quest through genealogical records to prove that my family member was indeed Native. My father clearly believed this, as did the rest of the family. One of my early memories when I was about four years old was attending a (then illegal) powwow with my Dad.

In order to prove that Elizabeth Vannoy, that great-grandmother, was Native I asked a cousin who descends from her matrilineally to take a mitochondrial DNA test that would unquestionably provide the ethnicity of her matrilineal line – that of her mother’s mother’s mother’s direct line. If she was Native, her haplogroup would be a derivative of either A, B, C, D or X. Her mitochondrial DNA was European, haplogroup J, clearly not Native, so Elizabeth Vannoy was not Native on that line of her family. Ok, maybe through her dad’s line then. I was able to find a Vannoy male descendant of her father, Joel Vannoy, to test his Y DNA and he was not Native either. Rats!

Tracking Elizabeth Vannoy’s genealogy back in time provided no paper-trail link to any Native ancestors, but there were and are still females whose surnames and heritage we don’t know. Were they Native or part Native? Possibly. Nothing precludes it, but nothing (yet) confirms it either.

Unexpected Results

DNA testing is notorious for unveiling unexpected results. Adoptions, unknown parents, unexpected ethnicities, previously unknown siblings and half-siblings and more.

Ethnicity is often surprising and sometimes disappointing. People who expect Native American heritage in their DNA sometimes don’t find it. Why?

  • There is no Native ancestor
  • The Native DNA has “washed out” over the generations, but they did have a Native ancestor
  • We haven’t yet learned to recognize all of the segments that are Native
  • The testing company did not test the area that is Native

Not all vendors test the same areas of our DNA. Each major company tests about 700,000 locations, roughly, but not the same 700,000. If you’re interested in specifics, you can read more about that here.

50-50 Chance

Everyone receives half of their autosomal DNA from each parent.

That means that each parent contributes only HALF OF THEIR DNA to a child. The other half of their DNA is never passed on, at least not to that child.

Therefore, ancestral DNA passed on is literally cut in half in each generation. If your parent has a Native American DNA segment, there is a 50-50 chance you’ll inherit it too. You could inherit the entire segment, a portion of the segment, or none of the segment at all.

That means that if you have a Native ancestor 6 generations back in your tree, you share 1.56% of their DNA, on average. I wrote the article, Ancestral DNA Percentages – How Much of Them is in You? to explain how this works.

These calculations are estimates and use averages. Why? Because they tell us what to expect, on average. Every person’s results will vary. It’s entirely possible to carry a Native (or other ethnic) segment from 7 or 8 or 9 generations ago, or to have none in 5 generations. Of course, these calculations also presume that the “Native” ancestor we find in our tree was fully Native. If the Native ancestor was already admixed, then the percentages of Native DNA that you could inherit drop further.
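If you want to run the averages yourself, the arithmetic is just halving once per generation. Here’s a minimal Python sketch. The function name is mine, and the result is only the expected average, so any one person’s actual inheritance can be higher or lower.

```python
def expected_percent(generations_back):
    """Average percent of DNA inherited from one ancestor that many generations back."""
    # Parents are generation 1 (50%), grandparents generation 2 (25%), and so on.
    return 100 / 2 ** generations_back


for generation in range(1, 10):
    print(generation, round(expected_percent(generation), 2))

print(expected_percent(6))  # 1.5625
```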

Why Call Ethnicity an Estimate?

You’ve probably figured out by now that due to the way that DNA is inherited, your ethnicity as reported by the major testing companies isn’t an exact science. I discussed the methodology behind ethnicity results in the article, Ethnicity Testing – A Conundrum.

It is, however, a specialized science known as Population Genetics. The quality of the results that are returned to you varies based on several factors:

  • World Region – Ethnicity estimates are quite accurate at the continental level, plus Jewish – meaning African, Indo-European, Asian, Native American and Jewish. These regions are more different than alike and better able to be separated.
  • Reference Population – The size of the population your results are being compared to is important. The larger the reference population, the more likely your results are to be accurate.
  • Vendor Algorithm – None of the vendors provide the exact nature of their internal algorithms that they use to determine your ethnicity percentages. Suffice it to say that each vendor’s staff includes population geneticists and they all have years of experience. These internal differences are why the estimates vary when compared to each other.
  • Size of the Segment – As with all genetic genealogy, bigger is better because larger segments stand a better chance of being accurate.
  • Academic Phasing – A methodology academics and vendors use in which segments of DNA that are known to travel together during inheritance are grouped together in your results. This methodology is not infallible, but in general, it helps to group your mother’s DNA together and your father’s DNA together, especially when parents are not available for testing.
  • Parental Phasing – If your parents test and they too have the same segment identified as Native, you know that the identification of that segment as Native is NOT a factor of chance, where the DNA of each of your parents just happens to fall together in a manner as to mimic a Native segment. Parental phasing is the ability to divide your DNA into two parts based on your parent’s DNA test(s).
  • Two Chromosomes – You have two copies of each chromosome, one from your mother and one from your father. DNA testing can’t easily separate those copies, so the exact same “address” on your mother’s and father’s chromosomes that you inherited may carry two different ethnicities, unless your parents are both from the same ethnic population, of course.

All of these factors, together, create a confidence score. Consumers never see these scores as such, but the vendors return the highest confidence results to their customers. Some vendors include the capability, one way or another, to view or omit lower confidence results.

Parental Phasing – Identical by Descent

If you’re lucky enough to have your parents, or even one parent, available to test, you can determine whether that segment thought to be Native came from one of your parents, or if the combination of both of your parents’ DNA just happened to combine to “look” Native.

Here’s an example where the “letters” (nucleotides) of Native DNA for an example segment are shown at left. If you received the As from one of your parents, your DNA is said to be phased to that parent’s DNA. That means that you in fact inherited that piece of your DNA from your mother, in the case shown below.

That’s known as Identical by Descent (IBD). The other possibility is that your DNA from both of your parents intermixed to mimic a Native segment, shown below.

This is known as Identical by Chance (IBC).

You don’t need to understand the underpinnings of this phenomenon, just remember that it can happen, and the smaller the segment, the more likely that a chance combination can randomly happen.
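For readers who like to see the logic spelled out, here is a toy sketch of the difference between IBD and IBC. It is a simplification and not how any vendor actually phases DNA: it treats a parent who carries the Native letter at every location as the source of the segment. The function name `classify_segment` and the sample genotypes are invented for illustration.

```python
def classify_segment(native, child, mother, father):
    """Classify a candidate segment as IBD or IBC in a toy model.

    native: string of alleles, one per tested location, e.g. "AAAAAA".
    child, mother, father: lists of two-letter genotypes, one per location.
    "AT" means the person carries an A on one chromosome and a T on the other.
    """
    def carries(person):
        # True if the person has the Native allele at every location
        return all(allele in genotype for allele, genotype in zip(native, person))

    if not carries(child):
        return "not a match"
    from_mother = carries(mother)
    from_father = carries(father)
    if from_mother and from_father:
        return "IBD (either parent could be the source)"
    if from_mother:
        return "IBD via mother"
    if from_father:
        return "IBD via father"
    return "IBC (the parents' DNA combined by chance)"


native = "AAAAAA"
child = ["AT", "AC", "AG", "AT", "AC", "AG"]

# Mother carries an A at every location, father carries none.
print(classify_segment(native, child,
                       ["AG", "AC", "AT", "AA", "AC", "AG"],
                       ["TT", "CC", "GG", "TT", "CC", "GG"]))

# Mother supplies the first three As, father the last three.
print(classify_segment(native, child,
                       ["AG", "AC", "AT", "TT", "CC", "GG"],
                       ["TT", "CC", "GG", "AT", "AC", "AG"]))
```

The first call reports the segment as IBD via the mother. The second reports IBC, because neither parent carries the whole run of As even though the child does.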

Elizabeth Warren’s Genealogy

Elizabeth Warren’s genealogy is reported to the 5th generation by WikiTree.

Elizabeth’s mother, Pauline Herring’s line is shown, at WikiTree, as follows:

Notice that of Elizabeth Warren’s 16 great-great-great grandparents on her mother’s side, 9 are missing.

Paper trail being unfruitful, Elizabeth Warren, like so many, sought to validate her family story through DNA testing.

Elizabeth Warren’s DNA Results

Elizabeth Warren didn’t test with one of the major vendors. Instead, she went directly to a specialist. That’s the equivalent of skipping the family practice doctor and going to the Mayo Clinic.

Elizabeth Warren had test results interpreted by Dr. Carlos Bustamante at Stanford University. You can read the actual report here and I encourage you to do so.

From the report, here are Dr. Bustamante’s credentials:

Dr. Carlos D. Bustamante is an internationally recognized leader in the application of data science and genomics technology to problems in medicine, agriculture, and biology. He received his Ph.D. in Biology and MS in Statistics from Harvard University (2001), was on the faculty at Cornell University (2002-9), and was named a MacArthur Fellow in 2010. He is currently Professor of Biomedical Data Science, Genetics, and (by courtesy) Biology at Stanford University. Dr. Bustamante has a passion for building new academic units, non-profits, and companies to solve pressing scientific challenges. He is Founding Director of the Stanford Center for Computational, Evolutionary, and Human Genomics (CEHG) and Inaugural Chair of the Department of Biomedical Data Science. He is the Owner and President of CDB Consulting, LTD. and also a Director at Eden Roc Biotech, founder of Arc-Bio (formerly IdentifyGenomics and BigData Bio), and an SAB member of Imprimed, Etalon DX, and Digitalis Ventures among others.

He’s no lightweight in the study of Native American DNA. This 2012 paper, published in PLOS Genetics, Development of a Panel of Genome-Wide Ancestry Informative Markers to Study Admixture Throughout the Americas focused on teasing out Native American markers in admixed individuals.

From that paper:

Ancestry Informative Markers (AIMs) are commonly used to estimate overall admixture proportions efficiently and inexpensively. AIMs are polymorphisms that exhibit large allele frequency differences between populations and can be used to infer individuals’ geographic origins.

And:

Using a panel of AIMs distributed throughout the genome, it is possible to estimate the relative ancestral proportions in admixed individuals such as African Americans and Latin Americans, as well as to infer the time since the admixture process.

The methodology produced results of the type that we are used to seeing in terms of continental admixture, shown in the graphic below from the paper.

Matching test takers against the genetic locations that can be identified as either Native or African or European informs us that our own ancestors carried the DNA associated with that ethnicity.

Of course, the Native samples from this paper were focused south of the United States, but the process is the same regardless. The original Native American population of a few individuals arrived thousands of years ago in one or more groups from Asia and their descendants spread throughout both North and South America.

Elizabeth’s request, from the report:

To analyze genetic data from an individual of European descent and determine if there is reliable evidence of Native American and/or African ancestry. The identity of the sample donor, Elizabeth Warren, was not known to the analyst during the time the work was performed.

Elizabeth’s test included 764,958 genetic locations, of which 660,173 overlapped with locations used in ancestry analysis.

The Results section says after stating that Elizabeth’s DNA is primarily (95% or greater) European:

The analysis also identified 5 genetic segments as Native American in origin at high confidence, defined at the 99% posterior probability value. We performed several additional analyses to confirm the presence of Native American ancestry and to estimate the position of the ancestor in the individual’s pedigree.

The largest segment identified as having Native American ancestry is on chromosome 10. This segment is 13.4 centiMorgans in genetic length, and spans approximately 4,700,000 DNA bases. Based on a principal components analysis (Novembre et al., 2008), this segment is clearly distinct from segments of European ancestry (nominal p-value 7.4 x 10^-7, corrected p-value of 2.6 x 10^-4) and is strongly associated with Native American ancestry.

The total length of the 5 genetic segments identified as having Native American ancestry is 25.6 centiMorgans, and they span approximately 12,300,000 DNA bases. The average segment length is 5.8 centiMorgans. The total and average segment size suggest (via the method of moments) an unadmixed Native American ancestor in the pedigree at approximately 8 generations before the sample, although the actual number could be somewhat lower or higher (Gravel, 2012 and Huff et al., 2011).

Dr. Bustamante’s Conclusion:

While the vast majority of the individual’s ancestry is European, the results strongly support the existence of an unadmixed Native American ancestor in the individual’s pedigree, likely in the range of 6-10 generations ago.

I was very pleased to see that Dr. Bustamante had included the PCA (Principal Component Analysis) for Elizabeth’s sample as well.

PCA is the scientific methodology utilized to group individuals to and within populations.

Figure one shows the section of chromosome 10 that showed the largest Native American haplotype, meaning DNA block, as compared to other populations.

Remember that since Elizabeth received a chromosome from BOTH parents, that she has two strands of DNA in that location.

Here’s our example again.

Given that Mom’s DNA is Native, and Dad’s is European in this example, the expected results when comparing this segment of DNA to other populations is that it would look half Native (Mom’s strand) and half European (Dad’s strand.)

The second graphic shows Elizabeth’s sample and where it falls in the comparison of First Nations (Canada) and Indigenous Mexican individuals. Given that Elizabeth’s Native ancestor would have been from the United States, her sample falls where expected, in between.

Let’s take a look at some of the questions being asked.

Questions and Answers

I’ve seen a lot of misconceptions and questions regarding these results. Let’s take them one by one:

Question – Can these results prove that Elizabeth is Cherokee?

Answer – No, there is no test, anyplace, from any lab or vendor, that can prove what tribe your ancestors were from. I wrote an article titled Finding Your American Indian Tribe Using DNA, but that process involves working with your matches, Y and mitochondrial DNA testing, and genealogy.

Q – Are these results absolutely positive?

A – The words “absolutely positive” are a difficult quantifier. Given the size of the largest segment, 13.4 cM, and that there are 5 Native segments totaling 25.6 cM, and that Dr. Bustamante’s lab performed the analysis – I’d say this is as close to “absolutely positive” as you can get without genealogical confirmation.

A 13.4 cM segment is a valid segment that phases to parents 98% of the time, according to Philip Gammon’s work, here, and 99% of the time in my own analysis here. That indicates that a 13.4 cM segment is very likely a legitimately ancestral segment, not a match by chance. The additional 4 segments simply increase the likelihood of a Native ancestor. In other words, for there NOT to be a Native ancestor, all 5 segments, including the large 13.4 cM segment would have to be misidentified by one of the premier scientists in the field.

Q – What did Dr. Bustamante mean by “evidence of an unadmixed Native American ancestor?”

A – Unadmixed means that the Native person was fully Native, meaning not admixed with European, Asian or African DNA. Admixture, in this context, means that the individual is a mixture of multiple ethnic groups. This is an important concept, because if you discover that your ancestor 4 generations ago was a Cherokee tribal member, but the reality was that they were only 25% Native, that means that the DNA was already in the process of being divided. If your 4th generation ancestor was fully Native, you would receive about 6.25% of their DNA, which would be all Native. If they were only 25% Native, you would still receive about 6.25% of their DNA, but only one fourth of that 6.25% could possibly be Native – so 1.56%. You could also receive NONE of their Native DNA.

Q – Is this the same test that the major companies use?

A – Yes and no. The test itself was probably performed on the same Illumina chip platform, because the chips available cover the markers that Bustamante needed for analysis.

The major companies use the same reference databases, plus their own internal or private databases in addition. They do not create PCA models for each tester. They do use the same methodology described by Dr. Bustamante in terms of AIMs, along with proprietary algorithms to further define the results. Vendors may also use additional internal tools.

Q – Did Dr. Bustamante use more than one methodology in his analysis? What if one was wrong?

A – Yes, he utilized two different methodologies whose results agreed. The global ancestry method evaluates each location independently of any surrounding genetic locations, ignoring any correlation or relationship to neighboring DNA. The second methodology, known as the local ancestry method looks at each location in combination with its neighbors, given that DNA pieces are known to travel together. This second methodology allows comparisons to entire segments in reference populations and is what allows the identification of complete ancestral segments that are identified as Native or any other population.

Q – If Elizabeth’s DNA results hadn’t shown Native heritage, would that have proven that she didn’t have Native ancestry?

A – No, not definitively, although that is a possible reason for ethnicity results not showing Native admixture. It would have meant that either she didn’t have a Native ancestor, the DNA washed out, or we cannot yet detect those segments.

Q – Does this qualify Elizabeth to join a tribe?

A – No. Every tribe defines its own criteria for membership. Some tribes embrace DNA testing for paternity issues, but none, to the best of my knowledge, accept or rely entirely on DNA results for membership. DNA results alone cannot identify a specific tribe. Tribes are societal constructs and Native people genetically are more alike than different, especially in areas where tribes lived nearby, fought and captured other tribes’ members.

Q – Why does Dr. Bustamante use words like “strong probability” instead of absolutes, such as the percentages shown by commercial DNA testing companies?

A – Dr. Bustamante’s comments accurately reflect the state of our knowledge today. The vendors attempt to make the results understandable and attractive for the general population. Most vendors, if you read their statements closely and look at your various options indicate that ethnicity is only an estimate, and some provide the ability to view your ethnicity estimate results at high, medium and low confidence levels.

Q – Can we tell, precisely, when Elizabeth had a Native ancestor?

A – No, that’s why Dr. Bustamante states that Elizabeth’s ancestor was approximately 8 generations ago, and in the range of 6-10 generations ago. This analysis is a result of combined factors, including the total centiMorgans of Native DNA, the number of separate reasonably large segments, the size of the longest segment, and the confidence score for each segment. Those factors together predict most likely when a fully Native ancestor was present in the tree. Keep in mind that if Elizabeth had more than one Native ancestor, that too could affect the time prediction.

Q – Does Dr. Bustamante provide this type of analysis or tools for the general public?

A – Unfortunately, no. Dr. Bustamante’s lab is a research facility only.

Roberta’s Summary of the Analysis

I find no omissions or questionable methods and I agree with Dr. Bustamante’s analysis. In other words, yes, I believe, based on these results, that Elizabeth had a Native ancestor further back in her tree.

I would love for every tester to be able to receive PCA results like this.

However, an ethnicity confirmation isn’t all that can be done with Elizabeth’s results. Additional tools and opportunities are available outside of an academic setting, at the vendors where we test, using matching and other tools we have access to as the consuming public.

We will look at those possibilities in a second article, because Elizabeth’s results are really just a beginning and scratch the surface. There’s more available, much more. It won’t change Elizabeth’s ethnicity results, but it could lead to positively identifying the Native ancestor, or at least the ancestral Native line.

Join me in my next article for Possibilities, Wringing the Most Out of Your DNA Ethnicity Test.

In the meantime, you might want to read my article, Native American DNA Resources.

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

Concepts – DNA Recombination and Crossovers

What is a crossover anyway, and why do I, as a genetic genealogist, care?

A crossover is a location on a chromosome where the chromosome is cut and the DNA from two different ancestors is spliced together. This happens during meiosis, when each parent’s two copies of a chromosome recombine to create the single copy that parent passes to the offspring.

Identifying crossover locations, and who the DNA that we received came from is the first step in identifying the ancestor further back in our tree that contributed that segment of DNA to us.

Crossovers are easier to see than conceptualize.

Viewing Crossovers

The crossover is the location on each chromosome where the orange and black DNA butt up against each other – like a splice or seam.

In this example, utilizing the Family Tree DNA chromosome browser, the DNA of a grandchild is compared to the DNA of a grandparent. The grandchild received exactly 50 percent of her father’s DNA, but only about 25 percent, on average, of the DNA of each of her 4 grandparents. Comparing this child’s DNA to one grandmother shows that about half of the DNA she received from her father came from this grandmother, with the other half coming from the spousal grandfather.

  • The orange segments above show the locations where the grandchild matches the grandmother.
  • The black sections (with the exception of the very tips of the chromosomes) show locations where the grandchild does not match the grandmother, so by definition, the grandchild must match the grandfather in those black locations (except chromosome tips).
  • The crossover location is the dividing line between the orange and black. Please note that the ends of chromosomes are notoriously difficult and inconsistent, so I tend to ignore what appear to be crossovers at the tips of chromosomes unless I can prove one way or the other. Of the 22 chromosomes, 16 have at least one black tip. In some cases, like chromosome 16, you can’t tell since the entire chromosome is black.
  • Ignore the grey areas – those regions are untested because they are SNP poor.

We know that the grandchild has her grandmother’s entire X chromosome, because the parent is a male who only inherited an X chromosome from his mother, so that’s all he had to give his daughter. The tips of the X chromosome are black, showing that the area is not matching the grandmother, so that region is unstable and not reported.

It’s also interesting to note that in 6 cases, other than the X chromosome, the entire chromosome is passed intact from grandparent to grandchild; chromosomes 4, 11, 16, 20, 21 and 22.

Twenty-six crossovers occurred between mother and son, at 5 cM. This was determined by comparing the DNA of the mother to the son in order to ascertain the actual beginning and end of each chromosome’s matching region, which tells me whether the black tips seen when comparing the grandchild’s DNA to the grandmother are or are not crossovers.

For more about this, you might want to read Concepts – Segment Survival – Three and Four Generation Phasing.

Before going on, let’s look at what a match between a parent and child looks like, and why.

Parent/Child Match

If you’re wondering why I showed a match between a grandchild and a grandparent, above, instead of showing a match between a child and a parent, the chromosome browser below provides the answer.

It’s a solid orange mass for each chromosome indicating that the child matches the parent at every location.

How can this be if the child only inherits half of the parent’s DNA?

Remember – the parent has two chromosomes that mix to give the child one chromosome. When comparing the child to the parent, the child’s single chromosome inherited from the parent matches one of the parent’s two chromosomes at every address location – so it shows as a complete match to the parent even though the child is only matching one of the parent’s two chromosomes at each location. This isn’t a bug; it’s just how chromosome browsers work. In other words, the “other” chromosome that your parent carries is the one you don’t match.

The diagram below shows the mother’s two copies of chromosome 1 she inherited from her father and mother and which section she gave to her child.

You can see that the mother’s father’s chromosome is blue in this illustration, and the mother’s mother’s chromosome is pink. The crossover points in the child are between parts B and C, and between parts C and D. You can clearly see that the child, when compared to the mother, does in fact match the mother in all locations, or parts, 3 blue and 1 pink, even though the matching DNA comes from the mother’s two different parents.

This example shows the child compared to both parents, so you can see that the child does in fact match both parents on every single location.

This is exactly why two different matches may match us on the same location, but may not match each other because they are from different sides of our family – one from Mom’s side and one from Dad’s.

You can read more about this in the article, One Chromosome, Two Sides, No Zipper – ICW and the Matrix.

The only way to tell which “sides” or pieces of the parent’s DNA the child inherited is to compare to other people who descend from the same line as one of the parents. In essence, you can compare the child to the grandparents to identify the locations that the child received from each of the 4 grandparents – and by genetic subtraction, which segments were NOT inherited from each grandparent as well, if one grandparent happens to be missing.

In our Parental Chromosome pink and blue diagram illustration above, the child did NOT inherit the pink parts A, B and D, and did not inherit the blue part C – but did inherit something from the parent at every single location. They also didn’t inherit an equal amount of their grandparents’ pink and blue DNA. If they inherited the pink part, then they didn’t inherit the blue part, and vice versa for that particular location.
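The pink and blue diagram can also be written out as data, which makes it easier to see why a parent/child comparison is solid orange. This is a minimal sketch only. The labels such as "blue-A" are placeholders standing in for real DNA, and the variable and function names are invented for this example.

```python
PARTS = ["A", "B", "C", "D"]

# The mother's two copies of chromosome 1, split into the four parts of the diagram.
mother_from_her_father = {part: f"blue-{part}" for part in PARTS}
mother_from_her_mother = {part: f"pink-{part}" for part in PARTS}

# The child received blue A, blue B, pink C and blue D from the mother.
child_from_mother = {"A": "blue-A", "B": "blue-B", "C": "pink-C", "D": "blue-D"}


def matches_parent(part):
    """A browser shows a match if the child's DNA equals either one
    of the parent's two copies at that location."""
    return child_from_mother[part] in (mother_from_her_father[part],
                                       mother_from_her_mother[part])


def count_crossovers():
    """Count the places where the source grandparent changes along the chromosome."""
    colors = [child_from_mother[part].split("-")[0] for part in PARTS]
    return sum(1 for left, right in zip(colors, colors[1:]) if left != right)


print(all(matches_parent(part) for part in PARTS))  # True
print(count_crossovers())                           # 2
```

The child matches the mother at every part, yet the two crossovers (between B and C, and between C and D) only become visible when the child is compared to a grandparent.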

The parent to child chromosome browser view also shows us that the very tip ends of the chromosomes are not included in the matching reports – because we know that the child MUST match the parent on one of their two chromosomes, end to end. The download or chart view provides us with the exact locations.

This brings us to the question of whether crossovers occur at the same rate in males and females. We already know that the X chromosome has a distinctive inheritance pattern – meaning that males only inherit an X from their mothers. A father and son will NEVER match on the X chromosome. You can read more about X chromosome inheritance patterns in the article, X Marks the Spot.

Crossovers Differ Between Males and Females

In the paper Genetic Analysis of Variation in Human Meiotic Recombination by Chowdhury, et al, we learn that males and females experience a different average number of crossovers.

The authors say the following:

The number of recombination events per meiosis varies extensively among individuals. This recombination phenotype differs between female and male, and also among individuals of each gender.

Notably, we found different sequence variants associated with female and male recombination phenotypes, suggesting that they are regulated by different genes.

Meiotic recombination is essential for the formation of human gametes and is a key process that generates genetic diversity. Given its importance, we would expect the number and location of exchanges to be tightly regulated. However, studies show significant gender and inter-individual variation in genome-wide recombination rates. The genetic basis for this variation is poorly understood.

The Chowdhury paper provides the following graphs. These graphs show the average number of recombinations, or crossovers, per meiosis for each of two different studies, the AGRE and the FHS study, discussed in the paper.

The bottom line of this paper, for genetic genealogists, is that males average about 27 crossovers per child and females average about 42, with the AGRE study families reporting 41.1 and the FHS study families reporting 42.8.

I have been collaborating with statistician, Philip Gammon, and he points out the following:

Male, 22 chromosomes plus the average of 27 crossovers = an average of 49 segments of his parents’ DNA that he will pass on to his children. Roughly half will be from each of his parents. Not exactly half. If there are an odd number of crossovers on a chromosome it will contain an even number of segments and half will be from each parent. But if there are an even number of crossovers (0, 2, 4, 6 etc.) there will be an odd number of segments on the chromosome, one more from one parent than the other.

The average size of segments will be approximately:

  • Males, 22 + 27 = 49 segments at an average size of 3400 / 49 = 69 cM
  • Females, 22 + 42 = 64 segments at an average size of 3400 / 64 = 53 cM
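Philip’s arithmetic can be reproduced with a short sketch. The constant and function names here are made up for illustration, and the 3400 cM genome length is the approximate figure used in the bullets above.

```python
AUTOSOMES = 22     # chromosomes 1 through 22
GENOME_CM = 3400   # approximate genetic length used in the text


def segments_passed(avg_crossovers, chromosomes=AUTOSOMES):
    # Every chromosome starts as one segment; each crossover adds one more.
    return chromosomes + avg_crossovers


def average_segment_cm(avg_crossovers, genome_cm=GENOME_CM):
    return genome_cm / segments_passed(avg_crossovers)


for label, crossovers in (("Males", 27), ("Females", 42)):
    segments = segments_passed(crossovers)
    size = average_segment_cm(crossovers)
    print(f"{label}: {segments} segments, about {size:.0f} cM each")
```

This prints 49 segments of about 69 cM for males and 64 segments of about 53 cM for females, the same values shown in the list.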

This means that cumulatively, over time, in a line of entirely females, versus a line of entirely males, you’re going to see bigger chunks of DNA preserved (and lost) in males versus females, because the DNA divides fewer times. Bigger chunks of DNA mean better matching more generations back in time. When males do have a match, it would be likely to be on a larger segment.

The article, First Cousin Match Simulations speaks to this as well.

Practically Speaking

What does this mean, practically speaking, to genetic genealogists?

Few lines actually descend from all males or all females. Most of our connections to distant ancestors are through mixtures of male and female ancestors, so this variation in crossover rates really doesn’t affect us much – at least not on the average.

It’s difficult to discern why we match some cousins and we don’t match others. In some cases, rather than random recombination being a factor, the actual crossover rate may be at play. However, since we only know who we do match, and not who tested and we don’t match, it’s difficult to even speculate as to how recombination affected or affects our matches. And truthfully, for the application of genetic genealogy, we really don’t care – we (generally) only care who we do match – unless we don’t match anyone (or a second cousin or closer) in a particular line, especially a relatively close line – and that’s a horse of an entirely different color.

To me, the burning question to be answered, which still has not been unraveled, is why a difference in recombination rates exists between males and females. What processes are in play here that we don’t understand? What else might this not-yet-understood phenomenon affect?

Until we figure those things out, I note whether or not my match occurred through primarily men or women, and simply add that information into the other data that I use to determine match quality and possible distance.  In other words, information that informs me as to how close and reasonable a match is likely to be includes the following information:

  • Total amount of shared DNA
  • Largest segment size
  • Number of matching segments
  • Number of SNPs in matching segment
  • Shared matches
  • X chromosome
  • mtDNA or Y DNA match
  • Trees – presence, absence, accuracy, depth and completeness
  • Primarily male or female individuals in path to common ancestor
  • Who else they match, particularly known close relatives
  • Does triangulation occur

It would be very interesting to see how the instances of matches to a certain specific cousin level – say 3rd cousins (for example), fare differently in terms of the average amount of shared DNA, the largest segment size and the number of segments in people descended from entirely female and entirely male lines. Blaine Bettinger, are you listening? This would be a wonderful study for the Shared cM Project which measures actual data.

Isn’t the science of genetics absolutely fascinating???!!!

First Cousin Match Simulations

Update: Please note that in August of 2019, this article was updated to reflect 200,000 simulations as opposed to the original 80,000, along with other applicable statistics.

Have you ever wondered if your match with your first cousin is “normal,” or what the range of normal is for a first cousin match? How would we know? And if your result doesn’t fall into the expected range, does that mean it’s wrong? Does gender make a difference?

If you haven’t wondered some version of these questions yet, you will eventually, don’t worry! Yep, the things that keep genetic genealogists awake at night…

Philip Gammon, our statistician friend who wrote the Match-Maker-Breaker tool for parental match phasing has continued to perform research. In his latest endeavor, he has created a tool that simulates the matching between individuals of a given relationship. Philip is planning to submit a paper describing the tool and its underlying model for academic publication, but he has agreed to give us a sneak peek. Thanks Philip!

In this example, Philip simulated matching between first cousins.

The data presented here is the result of 200,000 simulations:

First cousin simulation V2

Philip was interested in this particular outcome in order to understand why his father shared 1206 cM with a first cousin, and if that was an outlier, since it is not near the average produced from the Shared cM Project (2017 revision) coordinated by Blaine Bettinger.

Academically calculated expectations suggest first cousins should share 850 cM. The data collected by Blaine showed an actual average of 874 cM, but varied within a 99th percentile range of 553 to 1225 cM utilizing 1512 respondents. You can view the expected values for relationships in the article, Concepts – Relationship Predictions and a second article, Shared cM Project 2017 Update Combined Chart  that includes a new chart incorporating the values from the 2016 Shared cM Project, the 2017 update and the DNA Detectives chart reflecting relationships as well.
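As a quick check on those numbers, here is a small sketch. The 6800 cM total is an assumption for this example (two copies of a roughly 3400 cM genome), chosen because 12.5% of it gives the 850 cM expectation quoted above. The function names are invented for illustration.

```python
TOTAL_CM = 6800  # assumption: two copies of a roughly 3400 cM genome


def expected_shared_cm(fraction, total_cm=TOTAL_CM):
    """Expected shared cM for a relationship sharing this fraction of DNA."""
    return fraction * total_cm


def within_range(shared_cm, low, high):
    """True if a match falls inside a reported range, inclusive."""
    return low <= shared_cm <= high


print(expected_shared_cm(0.125))      # first cousins share 12.5% on average: 850.0
print(within_range(1206, 553, 1225))  # Philip's father versus the 99th percentile range: True
```

So the 1206 cM match sits well above the average, but still inside the range the Shared cM Project observed for first cousins.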

Philip grouped the results into the same bins as used in the 2017 Shared cM Project:

First cousins shared cM format V2.png

The graph below is from the Shared cM Project tables.

Philip’s commentary regarding his simulations and The Shared cM Project’s results:

I’d say that they look very similar. The spread is just about right. The Shared cM data is a little higher but this is consistent with vendor results typically containing around 20 cM of short IBC segments. My sample size is more than 100 times greater so this gives more opportunity to observe extreme values. I observed 25 events exceeding 1410 cM, with a maximum of 1604 cM. At the lower end I have 787 events (about 0.4%) with fewer than 510 shared cM and a minimum of 272 cM.

I thought that the gender of the related parents of the 1st cousins would have quite an impact on the spread of the amounts shared between their children. Fewer crossovers for males means that the respective children of two brothers would be receiving on average, larger segments of DNA, so greater opportunity for either more sharing or for less. Conversely, the respective children of two sisters, with more crossovers and smaller segments, would be more tightly clustered around the average of 12.5% (855 cM in my model). There is a difference, but it’s not nearly as pronounced as I was expecting:

[Image: First cousins match curve V2]

The most noticeable difference is in the tails. First cousins whose fathers were brothers are about two and a half times as likely to either share less than 8% or more than 17% than first cousins whose mothers were sisters. And of course, if the cousins were connected via a respective parent who were brother and sister to each other, the spread of shared cM is somewhere in between.

% DNA shared between the respective offspring of…

Parents of the cousins    <8%     8-10%    10-15%    15-17%    >17%
2 sisters                 0.6%    8.2%     82.0%     8.2%      1.0%
1 brother, 1 sister       0.8%    9.2%     79.5%     9.1%      1.5%
2 brothers                1.6%    11.1%    74.2%     10.7%     2.4%
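The “two and a half times” figure can be reproduced directly from the table. The sketch below is my own illustration, not Philip’s simulation code. It also converts the 8% and 17% cut-offs into cM, assuming (as Philip’s model does) that 12.5% corresponds to 855 cM and that the scale is linear.

```python
import math

# Percentages copied from the table above: the share of simulated first
# cousin pairs falling in each bin (<8%, 8-10%, 10-15%, 15-17%, >17%).
TABLE = {
    "2 sisters": [0.6, 8.2, 82.0, 8.2, 1.0],
    "1 brother, 1 sister": [0.8, 9.2, 79.5, 9.1, 1.5],
    "2 brothers": [1.6, 11.1, 74.2, 10.7, 2.4],
}


def tail_percent(row):
    """Percent of pairs in either extreme bin (<8% or >17%)."""
    return row[0] + row[-1]


tails = {name: tail_percent(row) for name, row in TABLE.items()}
ratio = tails["2 brothers"] / tails["2 sisters"]

# Assumption: 12.5% equals 855 cM, so 100% is 855 / 0.125 cM.
TOTAL_CM = 855 / 0.125


def percent_to_cm(percent):
    """Convert a percentage of shared DNA into cM under the assumption above."""
    return TOTAL_CM * percent / 100


for name, tail in tails.items():
    print(f"{name}: {tail:.1f}% of pairs in the tails")
print(f"brothers vs. sisters tail ratio: {ratio:.2f}")
print(f"8% is about {percent_to_cm(8):.0f} cM, 17% is about {percent_to_cm(17):.0f} cM")
```

Vendor totals are calculated differently from Philip’s model, so treat the cM conversions as rough guides rather than exact thresholds.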

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

Shared cM Project 2017 Update Combined Chart

The original goal of Blaine Bettinger’s Shared cM Project was to document the actual shared ranges of centiMorgans found in various relationships between testers in genetic genealogy. Previously, all we had were academically calculated models which didn’t accurately reflect the data that genetic genealogists were seeing.

In June 2016, Blaine published the first version of the Shared cM Project information gathered collaboratively through crowd-sourcing. He continued to gather data, and has published a new 2017 version recently, along with an accompanying pdf download that explains the details. Today, more than 25,000 known relationships have been submitted by testers, along with their amount of shared DNA.

Blaine continues to accept submissions at this link, so please participate by submitting your data.

In the 2017 version, some of the numbers, especially the maximums in the more distant relationship categories changed rather dramatically. Some maximums actually doubled, meaning having more data to work with was a really good thing.

The 2017 project update refines the numbers with more accuracy, but also adds more uncertainty for people looking for nice, neat, tight relationship ranges. This project and the resulting informational chart are great tools, but you can’t now, and never will be able to, identify relationships with complete certainty without additional genealogical information to go along with the DNA results.

That’s the reason there is a column titled “Degree of Relationship.” Various different relationships between people can be expected to share about the same amount of DNA, so determining that relationship has to be done through a combination of DNA and other information.

When the 2016 version was released, I completed a chart that showed the expected percentage of shared DNA in various relationship categories and contrasted the expected cM of DNA against what Blaine had provided. I published the chart as part of an article titled, Concepts – Relationship Predictions. This article is still a great resource and very valid, but the chart is now out of date with the new 2017 information.

What a great reason to create a new chart to update the old one.

Thanks to Blaine and all the genetic genealogists who contributed to this important crowd-sourced citizen science project!

2016 Compared to 2017

The first thing I wanted to know was how the numbers changed from the 2016 version of the project to 2017. I combined the two years’ worth of data into one file and color coded the results.

The legend is as follows:

  • White rows = 2016 data
  • Peach rows = 2017 data for the same categories as 2016
  • Blue rows = new categories in 2017
  • Red cells = information that changed surprisingly, discussed below
  • Yellow cells = the most changed category since 2016

I was very pleased to see that Blaine was able to add data for several new relationship categories this year – meaning that there wasn’t enough information available in 2016. Those are easy to spot in the chart above, as they are blue.

Unexpected Minimum and Maximum Changes

As I looked at these results, I realized that some of the minimums increased. At first glance, this doesn’t make sense, because a minimum can get lower as the range expands, but a minimum can’t increase with the same data being used.

Had Blaine eliminated some of the data?

I thought I understood that the 2017 project simply added to the 2016 data, but if the same minimum data was included in both 2016 and 2017, why was the minimum larger in 2017? This occurred in 6 different categories.

By the same token, and applying the same logic, there are 5 categories where the maximum got smaller. That, logically, can’t happen either using the same data. The maximum could increase, but not decrease.

I know that Blaine worked with a statistician in 2016 and used a statistical algorithm to attempt to eliminate the outliers in order to, hopefully, eliminate errors in data entry, misunderstandings about the proper terms for relationships and relationships that were misunderstood either through genealogy or perhaps an unknown genetic link. Of course, issues like endogamy will affect these calculations too.

A couple of good examples would be half siblings who thought they were full siblings, or half first cousins instead of just first cousins. The terminology “once removed” confuses people too.

You can read about the proper terminology for relationships between people in the article, Quick Tip – Calculating Cousin Relationships Easily.

In other words, Blaine had to take all of these qualifiers that relate to data quality into consideration.

Blaine’s Explanation

I asked Blaine about the unusual changes. He has given me permission to quote his response, below:

The maximum and minimum aren’t the largest and smallest numbers people have submitted, they’re the submissions statistically identified by the entire dataset as being either the 95th percentile maximum and minimum, or the 99th percentile maximum and minimum. As a result, the max or min can move in either direction. Think of it in terms of the histograms; if the peak of the histogram moves to the right or left due to a lot more data, then the shoulders (5 & 95% or the 1 and 99%) of the histogram will move as well, either to the right or left.

So, for example, substantially more data for 1C2R revealed that the previous minimum was too low, and has corrected it. There are still 1C2R submissions down there below the minimum of 43, and there are submissions above the maximum of 531, but the entire dataset for 1C2R has statistically identified those submissions as being outliers.

The histogram for 1C2R supports that as well, showing that there are submissions above 531, but they are clearly outliers:

People submit “bad” numbers for relationships, either due to data entry errors, incorrect genealogies, unknown pedigree collapse, or other reasons. Unless I did this statistical analysis, the project would be useless because every relationship would have an exorbitant range. The 95th and 99th percentiles help keep the ranges in check by identifying the reasonable upper and lower boundaries.
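To see how a percentile-based minimum can rise and a maximum can fall when data is only added, here is a small sketch. The numbers are invented for illustration and are not Shared cM Project submissions, and the simple nearest-rank percentile used here is an assumption on my part, not necessarily the method Blaine’s statistician used.

```python
def nearest_rank_percentile(values, pct):
    """Return the nearest-rank percentile of values (pct is a whole number 1-100)."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, kept to at least rank 1.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def percentile_range(values, lower=5, upper=95):
    """Return the (lower, upper) percentile bounds for a list of shared cM values."""
    return (nearest_rank_percentile(values, lower),
            nearest_rank_percentile(values, upper))


# Hypothetical first round of submissions, including one very low and one
# very high value that may be data entry errors.
first_batch = [30, 200, 210, 220, 230, 240, 250, 260, 270, 600]

# Hypothetical later submissions, all clustered in the middle.
second_batch = list(range(150, 330, 2))  # 90 values: 150, 152, ..., 328

before = percentile_range(first_batch)
after = percentile_range(first_batch + second_batch)
print("range with 10 submissions: ", before)
print("range with 100 submissions:", after)
```

With only ten submissions, the range is simply the lowest and highest values. With one hundred, the two extreme values are still in the data, but they now fall outside the 5th and 95th percentiles, so the reported minimum goes up and the reported maximum comes down.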

Adding Additional Information

The reason I created this chart was not initially to share, but because I use the information all the time and wanted it in one easily accessible location.

I appreciate the work that Blaine has done to eliminate outliers, but in some cases, those outliers, although in the statistical 1%, will be accurate. In other cases, they clearly won’t, or they will be accurate but not relevant due to endogamy and pedigree collapse. How do you know? You don’t.

In the pdf that Blaine provides, he does us the additional service of breaking the results down by testing vendor: 23andMe, Ancestry and Family Tree DNA, and the comparison service, GedMatch. He also provides endogamous and non-endogamous results, when known.

The vendor where an individual tests does have an impact on the testing, the matching and the reporting. For example, Family Tree DNA includes all matches to the 1 cM level in total cM; Ancestry strips out DNA they think is “too matchy” with their Timber algorithm, so their total cM will be much smaller than Family Tree DNA’s; and 23andMe is the only one of the vendors to report fully identical regions by adding that number into the total shared cM a second time. This isn’t a matter of right or wrong, but a matter of different approaches.

Blaine’s vendor specific charts go a long way in accounting for those differences in the Parent/Child and Sibling charts shown below.

A Combined Chart

In order to give myself the best chance of correctly locating not just the best fit for a relationship as predicted by total matching cM, but all possible fits, I decided to add a third data source into the chart.

The DNA Detectives Facebook Group that specializes in adoption searches has compiled their own chart based on their experiences in reconstructing families through testing. This chart is often referred to simply as “the green chart” and therefore, I have added that information as well, rows colored green (of course), and combined it into the chart.

I modified the headings for this combined chart, slightly, and added a column for actual shared percent since the DNA Detectives chart provides that information.

I have also changed the coloring on the blue rows, which were new in 2017, to be the same as the rest of Blaine’s 2017 peach colored rows.

I hope you find this combined chart as useful as I do. Feel free to share, but please include the link to this article and credit appropriately, for my work compiling the chart as well as Blaine’s work on the 2016 and 2017 cM Projects and DNA Detectives’ work producing their “green chart.”

______________________________________________________________

Ancestral DNA Percentages – How Much of Them is in You?

One of the most common questions I receive, especially in light of the interest in ethnicity testing, is how much of an ancestor’s DNA someone “should” share.

The chart above shows how much of a particular generation of ancestors’ DNA you would inherit if each generation between you and that ancestor inherited exactly 50% of that ancestor’s DNA from their parent. This means, on the average, you will carry less than 1% of each of your 5 times great-grandparents’ DNA, shown in generation 7, in total. You’ll carry about 1.56% of each of your 4 times great-grandparents, your 6th generation ancestors, and so forth.
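The averages in the chart come from halving at each generation, which is easy to reproduce. This sketch assumes the numbering implied above, with your parents as generation 1; the function name is my own.

```python
def expected_percent(generations_back):
    """Average percent of your DNA inherited from one ancestor.

    Assumes exactly half is passed on at every generation, with
    generation 1 = parents, 2 = grandparents, and so on.
    """
    return 100 / 2 ** generations_back


for generation in range(1, 10):
    ancestors = 2 ** generation
    print(f"generation {generation}: {ancestors} ancestors, "
          f"{expected_percent(generation):.2f}% from each on average")
```

These are averages only. The amount you actually carry from any one ancestor beyond your parents can be higher, lower or nothing at all.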

As you can see, if you’re looking for a Native American ancestor, for example, who is 7 generations back in your tree, if you carry the average amount of DNA from that ancestor, it will be less than 1% which will be under the noise threshold for detection – and that’s assuming they were 100% Native at that time.

Everyone inherits 50% of their DNA from each of their parents, but not everyone inherits half of each of their ancestors’ DNA from a parent. Sometimes, the child will inherit all of a segment of DNA from an ancestor, and in other cases, the child will inherit none. In some cases, they will inherit half or a portion of the DNA from an ancestor. In reality, the DNA segments are very seldom divided exactly in half, but all we can deal with are averages when discussing how much DNA you “should” receive from an ancestor, based on where they are in your tree.

The generational relationship chart above represents the average that you will inherit from each of those ancestors. Of course, few people are actually average, and you may not be either. In other words, your ancestor’s DNA may not be detectable at 5, 6 or 7 generations, because it was lost in generations between them and you, while another ancestor’s DNA is still present in detectable amounts at 8 or 9 generations.

How Does Inheritance of Ancestral Segments Actually Work?

For you to inherit a particular segment from one GGGGG-grandparent, the inheritance might look something like this. “You” are at the bottom of the tree.

In the above example, you inherited one tenth of the segment from your GGGGG-grandparent which was one third of the DNA that your parent carried in that segment from that ancestor.

A second example is every bit as likely, shown below.

In this second scenario, you inherited nothing of that segment from your GGGGG-grandparent.

A third scenario is also a possibility.

In this third scenario, you inherited all of the DNA from that ancestor that your parent carried.

Now, think of these three scenarios as three different siblings inheriting from the same parent, and you’ll understand why siblings carry different amounts of DNA from their ancestors.

Of course, the child can only inherit what the parent has inherited from that ancestor, and if that particular segment was gone in the parent’s generation, or generations before the parent, the child certainly can’t inherit the segment. There is no such thing as “skipping generations.”

In this fourth scenario, the parent didn’t receive any of the segment from the GGGGG-grandparent, but maybe their brother or sister did, which is why you want to test aunts and uncles. Testing everyone in your family available from the oldest generation is absolutely critical.
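The scenarios above can be sketched as a chain of fractions, one for each time the segment is passed from parent to child. The fractions below are hypothetical, chosen only so that the first chain ends at one tenth of the original segment, as in the first example; they are not taken from the graphics.

```python
from fractions import Fraction
from itertools import accumulate
from operator import mul


def amount_remaining(passed_on):
    """Running fraction of the ancestor's segment left after each transmission."""
    return list(accumulate(passed_on, mul))


# Hypothetical fraction of the surviving segment passed on at each of the
# seven transmissions from a GGGGG-grandparent down to you.
some_survives = [Fraction(1, 2), 1, 1, Fraction(3, 5), 1, 1, Fraction(1, 3)]
lost_midway = [Fraction(1, 2), 1, 0, 1, 1, 1, 1]

kept = amount_remaining(some_survives)
lost = amount_remaining(lost_midway)

print("your parent carries:", kept[-2], "and you carry:", kept[-1])
print("after the segment is lost:", [str(amount) for amount in lost[2:]])
```

Once the running amount reaches zero, it stays zero for every later generation, which is the arithmetic behind “no skipping generations.”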

This, of course, is exactly why we test as many relatives as we can. Everyone inherits different amounts of segments of DNA from our common ancestors. This is also why we map our matching segments to those ancestors by triangulating with cousins – to identify which pieces of our DNA came from which ancestor.

Seeing examples of how inheritance works helps us understand that there is no “one answer” to the question we want to know about each ancestor – “How much of you is in me?” The answer is, “it depends” and the actual amount would be different for every ancestor except your parents, where the answer is always 50%.

______________________________________________________________
